Covid-19 Updates:
All courses scheduled at our lab will take place online in some form.
This includes:
Summer term 2021
- Lecture: Grundlagen Künstliche Intelligenz (registration via LSF), Thursday 10-12
- Lecture: Knowledge-Based Systems (registration via LSF), Wednesday 10-12
- Seminar: “Textanalytics”. This year's topic is “Ethics in NLP”. To participate, please register in the Moodle course "Seminar Text Analytics: Ethics in NLP SoSe 2021" (password: NLPEthics21).
- Practical project: "Real-Time Mobile NLP Analysis" (register in the Moodle course "Practical project Real-Time Mobile NLP Analysis SS21", link: https://moodle.uni-due.de/course/view.php?id=25970, password: Mobilenlp2021). Registration is closed.
- Bachelor and Master theses on request (please send mail to Prof. Torsten Zesch)
Winter term 2020/21
- Lecture: "Language Technology" (Wednesday 10-12, https://moodle.uni-due.de/course/view.php?id=23799)
- Lecture: "Natural language-based Human Computer Interaction" (Tuesday 10-12, https://moodle.uni-due.de/user/index.php?id=24419)
- Practical project: "Fremdsprachenlernen mit Alexa" (Foreign Language Learning with Alexa)
- Practical project: "Real-Time Mobile NLP Analysis" (register in the Moodle course "Praxisprojekt Real-Time Mobile NLP Analysis WS20/21", link: https://moodle.uni-due.de/course/view.php?id=22998, password: Mobilenlp123)
- Bachelor and Master theses on request (please send mail to Prof. Torsten Zesch)
For now, the courses will be given online. For lectures, we will follow an inverted classroom approach, where you get access to study material and we use the lecture time-slot for Q&A sessions.
Take care!
Our teaching philosophy is based on the belief that theory and practice always need to be combined. Thus, we offer a lot of hands-on courses with lots of programming, and even some lectures combining short theoretical parts with immediate practical exercises.
Our supervision style tries to find a trade-off between the freedom to pursue your own research goals and the necessary guidance and constraints.
Lecture: Foundations of Artificial Intelligence (GKI)
Foundations of Artificial Intelligence (GKI) / Grundlagen der Künstlichen Intelligenz (GKI)
Lecture on Bachelor level, usually offered in summer.
In Foundations of Artificial Intelligence, you will learn how computers can display intelligent behaviour.
After a brief philosophical introduction, we will focus on the concept of agents, which work autonomously to achieve certain goals. An example of such an agent is a robotic vacuum cleaner. We will discuss the algorithmic details of how information is processed and decisions are made. This includes finding information in graphs (uninformed search, informed search, local search) as well as how an agent deals with uncertainty and unavailable information.
We will furthermore provide an introduction to the field of machine learning, where the computer learns a task from data, without being explicitly programmed by a human how to do it. We will discuss algorithms for classification, clustering and regression, their applications and evaluation.
Contents:
- History of AI
- Definition of Intelligence
- Agents
  - Agent Architecture
  - Properties of Environments
- Search
  - Uninformed Search (BFS, DFS)
  - Informed Search (Greedy, A*)
  - Local Search (Genetic Algorithms)
- Uncertainty / Probabilistic Models
- Machine Learning
  - Classification (Naive Bayes, Decision Trees)
  - Clustering
  - Regression
  - Evaluation
- Applications of AI
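To give a flavour of the uninformed search topic listed above, here is a minimal Python sketch of breadth-first search (BFS). It is not taken from the lecture material: the function name `bfs_path` and the toy room graph, loosely inspired by the vacuum cleaner example, are assumptions made purely for illustration.

```python
from collections import deque


def bfs_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([start])   # nodes waiting to be expanded (FIFO queue)
    parent = {start: None}      # also serves as the set of visited nodes
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parent:
                parent[neighbour] = node
                frontier.append(neighbour)
    return None


# Hypothetical floor plan as adjacency lists (illustration only).
rooms = {
    "hall": ["kitchen", "living room"],
    "kitchen": ["pantry"],
    "living room": ["bedroom"],
    "bedroom": ["bathroom"],
    "pantry": [],
    "bathroom": [],
}

print(bfs_path(rooms, "hall", "bathroom"))
# ['hall', 'living room', 'bedroom', 'bathroom']
```

Because BFS expands nodes in order of their distance from the start, the first path it finds to the goal uses the fewest edges. Swapping the queue for a stack would turn the same skeleton into depth-first search.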
Lecture: Knowledge Based Systems (KBS)
Knowledge Based Systems (KBS)/Wissensbasierte Systeme
Lecture on Master level, usually offered in summer.
In this lecture, we will discuss how to transform information into a representation suited for computers and how to turn this information into knowledge.
We will start with the foundational question of what knowledge is and advance towards various ways of encoding it. We discuss how the degree of similarity between pieces of information can be determined to answer questions such as “is the color black similar to white?” by using a variety of human-curated resources.
In the subfield of machine learning we explore how the computer learns to classify objects in the real world either from human-prepared data or on its own. We will discuss several machine learning algorithms which are commonly used today and highlight properties that one should consider when choosing between algorithms.
Lecture: Natural Language Human-Computer Interaction (NL HCI)
Natural Language Human-Computer Interaction (NL HCI)/Natürlichsprachliche Mensch-Computer Interaktion
Lecture on Master level, usually offered in winter.
This lecture is not offered in winter 2019/2020.
In this lecture, we will work on a real-world application of natural language based human-computer interaction. The best-known applications in this field are probably personal assistants on smartphones, such as Siri, Cortana, or Ok Google.
We will look at the scientific foundations of such personal assistants, including:
- Language identification – How do I know whether a request is made in German, English or French?
- Question classification – How does the computer know what is being asked?
- Entity disambiguation – What entities are mentioned in the question?
To enrich theory with practical experience, we will build our own personal assistant system, which we are going to steadily improve over the semester.
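As an illustration of the language identification question above, the following Python sketch counts how many words of a request appear in a small list of frequent function words per language. This is only one simple baseline and not necessarily the approach taught in the lecture; the word lists and the function name `identify_language` are assumptions chosen for the example.

```python
import re

# Tiny hand-picked lists of frequent function words (illustrative only;
# a real system would use larger lists or character n-gram models).
STOPWORDS = {
    "German": {"der", "die", "das", "und", "ist", "nicht", "wie", "ich", "ein", "wird"},
    "English": {"the", "and", "is", "not", "how", "what", "a", "of", "will", "be"},
    "French": {"le", "la", "les", "et", "est", "pas", "quel", "un", "je", "il"},
}


def identify_language(text):
    """Return the language whose word list matches most tokens, or None."""
    tokens = re.findall(r"\w+", text.lower())
    scores = {
        language: sum(token in words for token in tokens)
        for language, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


for request in [
    "Wie wird das Wetter morgen?",
    "What is the weather like tomorrow?",
    "Quel temps fait-il demain ?",
]:
    print(request, "->", identify_language(request))
```

Short requests are the hard case for this kind of approach: a query with no function words at all yields no evidence, which is why the sketch returns `None` instead of guessing.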
Lecture: Language Technology
Language technology/Sprachtechnologie
Lecture on Bachelor level, usually offered in winter.
In this lecture, we provide an overview of the field of language technology. We start with examples from daily life familiar to most of you.
We will guide you through the various stages of how language, written or spoken, has to be processed so that computers can work with it. This starts with basic tasks such as detecting word or sentence boundaries, recognizing a word’s part of speech (e.g. noun, verb, etc.) or identifying the language of a text.
More advanced topics are spelling and grammar correction, extraction of keywords and word sense disambiguation. You will learn about the challenges behind seemingly trivial problems as well as algorithmic ways to tackle them.
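Sentence boundary detection is a good example of a seemingly trivial problem. The Python sketch below uses a deliberately naive rule (split after ".", "!" or "?" when whitespace follows); the function name `naive_sentence_split` and the sample text are made up for illustration and do not come from the lecture.

```python
import re


def naive_sentence_split(text):
    """Split text after '.', '!' or '?' whenever whitespace follows."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


text = "Tagging assigns a class, e.g. noun or verb, to each word. It is a basic task."
for sentence in naive_sentence_split(text):
    print(repr(sentence))
# 'Tagging assigns a class, e.g.'
# 'noun or verb, to each word.'
# 'It is a basic task.'
```

The rule wrongly breaks the first sentence after the abbreviation "e.g.", which shows why practical systems need abbreviation lists or learned models rather than punctuation alone.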
Practical Course: Text-Analysis Tools
In this practical class, you will work on a task from the domain of natural language processing (NLP). We will first learn how to use a professional NLP framework (DKPro). This includes processing data in your project using both existing components and your own.
The focus of this practical class is learning how to tackle a real-world NLP problem in DKPro. This includes both an analysis of the necessary steps and the implementation of your ideas in actual code within the framework.
Each participating student has to submit their own solution (code) for their chosen task at the end of the semester. Students are encouraged to exchange information and read through code examples, but bulk copy-paste from websites (other people’s solutions) is not accepted. We check for plagiarism!
Participating students should have solid knowledge of Java.
After the introduction to DKPro, you will choose a project. This can either be one from a list we provide or, if you have suitable ideas, a self-selected topic. Popular topics are, for instance, textual similarity or sentiment analysis.
Practical Course: Master Research Projects
Offered on demand
Here are two examples that show what a project might look like:
1.) Exploring opinionated Twitter conversations in the context of popular German TV shows:
In this project, three Master's students designed and implemented data collection, automatic topic and sentiment analysis, and interactive visualisation.
2.) Geolocated hate speech detection:
In this project, two Master's students designed and implemented a system to automatically detect and visualise geolocated tweets that contain hate speech.
Get in contact!
It is advantageous if you already have some NLP-related interests or project ideas. Otherwise we can help you to define a topic.
Seminar: Text-Analytics (Ethics in NLP)
Seminar: Text-Analytics (usually offered in summer)
Link to LSF Summer 2021
This seminar introduces students to scientific writing of research papers that conforms to the standards of international conference papers.
Topic in Summer Semester 2021: Ethics in NLP
Technology affects human lives directly, so besides the technical side of the topic, there are also various ethical aspects that have to be considered. For example, algorithms can be biased and discriminate against certain groups, or data collection may violate privacy.
The students will select a topic from a collection of prepared subject areas related to Natural Language Processing (NLP). During the seminar, students will go through the entire publication life cycle, which consists of literature research, formulation of a research hypothesis, implementation or line of argument to (dis-)prove the hypothesis, writing of the actual paper, and finally presentation of the results.
We will provide guidance for these individual steps and on how to structure and present information so that it conforms to scientific standards.
The task of the student is, on the one hand, to do the literature research and analysis needed to select relevant work and to generate a hypothesis grounded in this literature. On the other hand, the student has to organize their text in a way that people not directly involved in writing the paper can understand and follow the line of thought. To get a better feeling for how challenging it is to write down scientific results in an understandable fashion, students will peer-review each other's work: they exchange their current working versions and critique them for clarity, correctness/soundness, and meaningful comparisons.
At the end of the seminar, the students present their results and hand in their paper, which is graded by the same criteria that apply in peer review.
Each participant has to hand in their own paper; group work is not permitted.
Master and Bachelor Theses
We offer options to write Bachelor and Master theses for students from Computer Science and Komedia. Topics can be selected based on your preferences, but need to be in the area of natural language processing and should ideally relate to one of our main research topics.
Prerequisites
The practical work for your thesis will most likely include some amount of programming. At the lab, we use, for example, DKPro, a framework written in Java, for many NLP tasks, as well as Python and deep learning.
Finding a Topic
We can help you define a topic, but it usually helps if you already have some rough ideas of what you would like to work on. Please contact us early to allow some time for that process, and tell us a little about your background, such as:
- Have you already attended lectures or seminars from our group? Which ones?
- What is your programming experience? (Which languages and for how long?)
- What NLP topic(s) do you find interesting, what would you potentially like to work on?
Please fill out this questionnaire for finding a topic [Link].
If you already have a clear idea what you would like to work on, fill out this questionnaire for describing your topic [Link].
Formal Requirements, Registration and Submission
Please inform yourself early about the formal requirements, registration and submission procedures of your study program regarding your final thesis. The examination regulations of the different study programs can be found here.
Thesis Template
You are encouraged to write the report for your thesis in LaTeX. We strongly recommend that you use the following template:
Colloquium
Some days after the submission of your thesis, you have to present your work in a colloquium at the lab. You will have 15 minutes to present, followed by 15 minutes of questions about your thesis.
Past Thesis Topics
The following list of finished Master's and Bachelor's theses should give you an idea of the range of topics offered:
2020
- Identification of sources used in argumentative texts using text similarity measures, Marcel Brauer
- Comparative visualization of essays, Tim Ludwig
- Personalizing a handwriting recognition system, Dario van den Boom
- Transfer learning for German named entity recognition, Sherif Neamatalla
- Bootstrapping a conversational tutor by semi-automatically analyzing interaction data, Ankita Mandal
- Comparing the perception of Hate Speech between German and English, Keli Ebe Corine Dara-ahato
- Application of Siamese Network for Resume-Job Matching, Michael Suhendra
2019
- Analysis of an emotion classification transfer from tweets to restaurant reviews, Annika Österdiekhoff
- Comparing Chinese and German judgments of hate speech against women, Qi An
- English-Chinese cross-lingual scoring of short answer questions, Xuefeng Song
- Training and comparing Mandarin DeepSpeech speech recognition models, Kunhao Wang
- Chinese Short Answer Scoring, Haoshi Wang
- „Are you an adult?“ - Identifying user age while chatting, Moritz Kaiser
- Strike-through Text Identification from Handwritten Documents, Hiroshi Hamano
- Comparing the speech recognition quality of Amazon Alexa and Google Home, Dominik Thiel
- Adversarial Examples for Evaluating Automatic Content Scoring Systems, Yuning Ding
- Sentiment Analysis using Textual and Auditory Cues, Jan Henry van der Vegte
- Analysis and Comparison of Web-scraping Tools, Olga Filipova
- Evaluation des Einflusses von fehlenden Word Embeddings auf Semantische Textähnlichkeit, René Lehmann
- Analyzing the transfer of aspect-based sentiment extraction to hotel reviews, Jonas Philipp Meise
2018
- The influence of segmentation quality on Chinese stance detection, Yangxu Li
- Cross-lingual content scoring, Sebastian Stennmanns
- Automatic detection of irony and sarcasm in German online news articles, Nivelin Stoyanev & Radomir Georgiev
- Study on Register in Editorial Work Using Data-Mining Methods, Paavo Pohndorff
- Comparative evaluation of embeddings based on semantic similarity and relatedness, Jonas Müller
- Mapping human reactions to events in voice-controlled smart home environments, Jeanette Alice Schenkewitz
- Non-common sense baselines for a common sense reading comprehension task, Christian Haring
- An Ontology-based Approach for Stance Detection in Tweets on Catalan Independence, Thiago Encarnacao
- Comparing transfer learning approaches on the task of review rating prediction, Eduardo Goulart Rocha
- Vergleich von Verfahren zur Annotation von Wortverwandtschaft, Mara Ortmann
- Topic-sensitive methods for automatic spelling correction, Ruishen Liu
- Cross-task scoring of complex writing tasks using domain adaptation and task-independent features, Marie Bexte
- A Comparative Evaluation of German Grapheme-to-Phoneme Conversion Libraries, Rüdiger Fröhlich
- The influence of ngram configurations on classification results, Valentin Panagiev
2017
- Automatische Spracherkennung von transkribierten Tweets am Beispiel von Arabizi, Anas Zine al Dine
- Transferring Automated Essay Grading Models between Domains and Languages, Onur Sünme
- Comparing the Performance of Sentiment Analysis Services, Salah Beck
- Detecting Sarcasm in Tweets, Alexander Büchtle
- The influence of spelling errors on the performance of short-answer scoring systems, Yuning Ding
- Quantifying the polarity of names based on sentiment lexicons and word embeddings, Wen Bin Le
- The impact of language errors and the performance of native language identification, Yufei Mu
- Comparing the applicability of relation inference datasets for evaluating distributional methods, Marius Hamacher
2016
- Generating and Evaluating Part-of-Speech Tagger Models for Multiple Languages, Dominik Lawatsch
- Language Models for L2 Writing Assistance, Robin Schmidt
- The Impact of Negation on the Quality of Sentiment Analysis, Paavo Pohndorff
- The Influence of Gap Ambiguity and Language Proficiency on the Performance of Bundled Gap Exercises, Niklas Meyer
- Erkennung von im lateinischen Alphabet geschriebenen Dari Tweets, Habiel Abromand
- Improving Anaphora Resolution Through Corpus Mined Gender Information, Jan-Henry van der Vegte
2015
- Identifying Semantically Equivalent Twitter Messages, Marc Morgenbrod
- Erkennung der Sprache in kurzen "social Media" Texten, Rangeeth Paramananthan
- Entwicklung eines Frameworks zur Extraktion von Netzwerken aus Texten, Tobias Graeve
- Automatische Korrektur von Präpositionen und Artikeln, Behnaz Spieker
- Vergleich von Textähnlichkeitsmaßen anhand von Geschwindigkeit und Leistungsverhalten, Danijela Schlue
- Konfiguration und Verifikation von UIMA Pipelines, Patrick Wafo
- Vergleich von Topic und N-Gramm Modellen in Wortvorschlagssystemen, Lydia Penkert
- Predicting Cloze-Test Difficulty with Semantically Sensitive Language Models, Michael Wojatzki
- Automatische Bestimmung des Erstellungszeitpunktes von Textdokumenten, Tamara Feldkamp
- Comparing spell-checking tools with respect to the quality of automatic corrections, Sebastian Stenmanns
- Entwicklung eines deutschen Social Media Models für HeidelTime, Christian Aldenhoff
2014
- Identifying the native language of a text’s author, Jens Thiel
- The Influence of Smileys as a Feature in Opinion-based Text Classification, Martin Sokalla
- Vergleichende Evaluation von UIMA-Textanalysekomponenten, Onur Sünme
Interested in working with us?
We are constantly looking for motivated Bachelor and Master students from our university who are interested in working with us on NLP-related topics. Most of the jobs require some programming experience, preferably in Java or Python. To get an idea of what we are working on, you can find a list of our ongoing research here. Contact us so that we can find out together whether we currently have an interesting task for you!
We do not have any open PhD positions at the moment. However, if you have a research proposal in line with the topics at our lab and an idea of how to obtain your own funding, please talk to us!