Starting April 20th, all courses scheduled at our lab for this semester will take place in some form in digital space.
Winter term 2020/21:
- Lecture: "Language Technology" (Wednesday 10-12)
- Lecture: "Human Computer Interaction" (Tuesday 10-12)
- Praxis project: "Fremdsprachenlernen mit Alexa" (The course will be taught in German. If you want to take this course, please register in the Moodle course "Praxisprojekt Textanalysewerkzeuge WS20/21", https://moodle.uni-due.de/course/view.php?id=22959, password: Alexa21)
- Praxis project: "Real-Time Mobile NLP Analysis" (Registration is available in Moodle under the course name "Praxisprojekt Real-Time Mobile NLP Analysis WS20/21", link: https://moodle.uni-due.de/course/view.php?id=22998, password: Mobilenlp123)
- Bachelor and Master theses on request (please send mail to Prof. Torsten Zesch)
Summer term 2020:
- Lecture: Grundlagen Künstliche Intelligenz (registration via LSF)
- Lecture: Knowledge-Based Systems (registration via LSF)
- Seminar: “Textanalytics”. The topic this year is “Ethics in NLP”
- Master Research Project: “Real-time environment capturing and analysis with mobile devices”
To register click here (please send your queries here)
For now, the courses will be given online. For lectures, we will follow an inverted classroom approach, where you get access to study material and we use the lecture time-slot for Q&A sessions.
Our teaching philosophy is based on the belief that theory and practice always need to be combined. Thus, we offer many hands-on courses with plenty of programming, and even some lectures that combine short theoretical parts with immediate practical exercises.
Our supervision style tries to find a trade-off between the freedom to pursue your own research goals and the necessary guidance and constraints.
Lecture: Foundations of Artificial Intelligence (GKI)
Foundations of Artificial Intelligence (GKI) / Grundlagen der Künstlichen Intelligenz (GKI)
Lecture on Bachelor level, usually offered in summer.
In Foundations of Artificial Intelligence, you will learn how computers can display intelligent behaviour.
After a brief philosophical introduction, we will focus on the concept of agents, which work autonomously to achieve certain goals. An example of such an agent is a robotic vacuum cleaner. We will discuss the algorithmic details of how information is processed and decisions are made. This includes finding information in graphs (uninformed search, informed search, local search) as well as how an agent deals with uncertainty and unavailable information.
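To give a feeling for uninformed search, here is a minimal sketch of breadth-first search (BFS) in Python. It is an illustration only, not material from the lecture: the function name `bfs_path` and the floor plan `rooms` are hypothetical, chosen to match the vacuum-cleaner example.

```python
from collections import deque


def bfs_path(graph, start, goal):
    """Return a path with the fewest edges from start to goal, or None."""
    frontier = deque([start])   # nodes waiting to be expanded (FIFO queue)
    parent = {start: None}      # remembers how each node was first reached
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Walk back through the parents to reconstruct the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, []):
            if neighbour not in parent:   # visit every node at most once
                parent[neighbour] = node
                frontier.append(neighbour)
    return None


# Hypothetical floor plan for a robotic vacuum cleaner (assumption).
rooms = {
    "hall": ["kitchen", "living room"],
    "kitchen": ["hall", "pantry"],
    "living room": ["hall", "bedroom"],
    "bedroom": ["living room"],
    "pantry": ["kitchen"],
}

print(bfs_path(rooms, "hall", "bedroom"))
```

Because BFS expands nodes in the order in which they were discovered, the first path it finds to the goal uses the fewest edges. Informed search methods such as A* additionally use a heuristic to decide which node to expand next.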
We will furthermore provide an introduction to the field of machine learning, where the computer learns a task from data, without being explicitly programmed by a human how to do it. We will discuss algorithms for classification, clustering and regression, their applications and evaluation.
- History of AI
- Definition of Intelligence
- Agent Architecture
- Properties of Environments
- Uninformed Search (BFS, DFS)
- Informed Search (Greedy, A*)
- Local Search (Genetic Algorithms)
- Uncertainty / Probabilistic Models
- Machine Learning
- Classification (Naive Bayes, Decision Trees)
- Applications of AI
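As an illustration of the classification topic listed above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. The function names `train` and `classify` and the tiny training set are hypothetical and only serve as an example; this is not the lecture's reference implementation.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and words. examples is a list of (tokens, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log-probability for the tokens."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids zero probabilities
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data (assumption, for illustration only).
examples = [
    ("free prize click now".split(), "spam"),
    ("win free money now".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("lecture notes for monday".split(), "ham"),
]
model = train(examples)
print(classify(model, "free money".split()))
```

The classifier is called "naive" because it treats the words of a text as independent of each other given the label, which keeps training down to simple counting.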
Lecture: Knowledge Based Systems (KBS)
Knowledge Based Systems (KBS) / Wissensbasierte Systeme
Lecture on Master level, usually offered in summer.
In this lecture, we will discuss methods for transforming information into a representation suited for computers and for turning this information into knowledge.
We will start with the foundational question of what knowledge is and advance towards various ways of encoding it. We discuss how the degree of similarity between pieces of information can be determined to answer questions such as “is the color black similar to white?” by using a variety of human-curated resources.
In the subfield of machine learning we explore how the computer learns to classify objects in the real world either from human-prepared data or on its own. We will discuss several machine learning algorithms which are commonly used today and highlight properties that one should consider when choosing between algorithms.
Lecture: Natural Language Human-Computer Interaction (NL HCI)
Natural Language Human-Computer Interaction (NL HCI) / Natürlichsprachliche Mensch-Computer Interaktion
Lecture on Master level, usually offered in winter.
This lecture is not offered in winter 2019/2020.
In this lecture, we will work on a real-world application of natural-language-based human-computer interaction. The best-known applications in this field are probably personal assistants on smartphones such as Siri, Cortana, or Ok Google.
We will look at the scientific foundations of such personal assistants, including:
- Language identification – How do I know whether a request is made in German, English or French?
- Question classification – How does the computer know what is being asked?
- Entity disambiguation – What entities are mentioned in the question?
To enrich theory with practical experience, we will build our own personal assistant system, which we will steadily improve over the semester.
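To make the language identification question from the list above more concrete, here is a minimal sketch that counts common function words per language. The word lists and the function name `identify_language` are assumptions made for this illustration; real systems typically use larger models, for example based on character n-grams.

```python
import re

# Small hand-picked lists of frequent function words (assumption,
# far too small for real use).
STOPWORDS = {
    "German": {"der", "die", "das", "und", "ist", "nicht", "ich", "wie", "ein", "wird"},
    "English": {"the", "and", "is", "not", "what", "how", "of", "to", "will", "this"},
    "French": {"le", "la", "les", "et", "est", "pas", "je", "un", "que", "quel"},
}


def identify_language(text):
    """Return the language whose function words occur most often, or None."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())  # sequences of letters
    scores = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)  # ties go to the first language listed
    return best if scores[best] > 0 else None


for request in [
    "Wie wird das Wetter morgen?",
    "What is the weather like tomorrow?",
    "Quel est le temps demain ?",
]:
    print(request, "->", identify_language(request))
```

Very short requests may contain none of the listed words, in which case the sketch returns `None`. This is one reason why language identification for short inputs is harder than it first appears.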
Lecture: Language Technology
Lecture on Bachelor level, usually offered in winter.
In this lecture, we provide an overview of the field of language technology. We start with examples from daily life familiar to most of you.
We will guide you through the various stages of how language, written or spoken, has to be processed so that computers can work with it. This starts with basic tasks such as detecting word or sentence boundaries, recognizing a word’s part of speech (e.g. noun, verb, etc.) or identifying the language of a text.
More advanced topics are spelling and grammar correction, extraction of keywords and word sense disambiguation.
You will learn about the challenges behind seemingly trivial problems as well as algorithmic ways to tackle them.
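The basic tasks mentioned above, detecting sentence and word boundaries, can be sketched with two regular expressions. This is a deliberately naive illustration; the function names `split_sentences` and `tokenize` are hypothetical, and the rules are assumptions rather than the methods taught in the lecture.

```python
import re


def split_sentences(text):
    """Naive rule: a sentence ends at '.', '!' or '?' when it is followed
    by whitespace and an uppercase letter."""
    return re.split(r"(?<=[.!?])\s+(?=[A-ZÄÖÜ])", text.strip())


def tokenize(sentence):
    """Split a sentence into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


text = "Language technology is everywhere. It powers spell checkers! Do you use one?"
for sentence in split_sentences(text):
    print(tokenize(sentence))

# The rule breaks on abbreviations: "Dr." is treated as a sentence end.
print(split_sentences("Dr. Smith arrived."))
```

The last call shows why such a seemingly trivial problem is challenging: a period does not always mark the end of a sentence, so a simple rule wrongly splits after the abbreviation.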
Practical Course: Text-Analysis Tools
In this practical class, you will work on a task from the domain of natural language processing (NLP). We will first learn how to use a professional NLP framework (DKPro). This includes processing data in your project using both existing components and your own.
The focus of this practical class is learning how to tackle a real-world NLP problem in DKPro. This includes analyzing the necessary steps and implementing your ideas in actual code within the framework.
Each participating student has to provide their own solution (code) for their chosen task at the end of the semester. Students are encouraged to exchange information and read through code examples etc., but bulk copy-paste from websites (other people’s solutions) is not accepted. We check for plagiarism!
Participating students should have a thorough knowledge of Java.
After the introduction to DKPro you will choose a project. This project can either be one out of a list we provide or, if you have suitable ideas, a self-selected topic. Popular topics are, for instance, textual similarity or sentiment analysis.
Practical Course: Master Research Projects
Offered on demand
Here are two examples that show what a project might look like:
1.) Exploring opinionated Twitter conversations in the context of popular German TV shows:
In this project, three Master's students designed and implemented data collection, automatic topic and sentiment analysis, and interactive visualisation.
2.) Geolocated hate speech detection:
In this project, two Master's students designed and implemented a system to automatically detect and visualise geolocated tweets which contain
Get in contact!
It is advantageous if you already have some NLP-related interests or project ideas. Otherwise we can help you to define a topic.
Seminar: Text-Analytics (Ethics in NLP)
Seminar: Text-Analytics (usually offered in summer)
Link to LSF Summer 2020
This seminar introduces students to scientific writing of research papers that conforms to the standards of international conference papers.
Topic in Summer Semester 2020: Ethics in NLP
Technology affects human lives directly, so besides the technical side of the topic, there are also various ethical aspects that have to be considered. For example, algorithms can be biased and discriminate against certain groups, or data collection may interfere with privacy.
The students will select a topic from a collection of prepared subject areas that are related to Natural Language Processing (NLP). During the seminar, students will go through the entire publication life-cycle, which consists of literature research, formulation of a research hypothesis, implementation / line of argument to (dis-)prove the hypothesis, writing of the actual paper and, eventually, presentation of the results.
We will provide guidance for these individual steps and for how to structure and present information so that it conforms to scientific standards.
On the one hand, the student's task is to research and analyze the literature, select relevant work, and generate a hypothesis grounded in this literature. On the other hand, students have to organize their texts so that people not directly involved in writing the paper can understand and follow the line of thought. To get a better feeling for how challenging it is to write down scientific results in an understandable fashion, students will peer-review each other's work (students exchange their current working versions with fellow students and mutually critique them for clarity, correctness/soundness and meaningful comparison(s)).
At the end of this seminar, the students present their results and hand in their paper, which is graded according to the same criteria that apply in peer review.
Each participant has to hand in their own paper; group work is not permitted.
Master and Bachelor Theses
We offer options to write Bachelor and Master theses for students from Computer Science and Komedia. Topics can be selected based on your preferences, but need to be in the area of natural language processing and should ideally relate to one of our main research topics.
The practical work for your thesis will most likely include some amount of programming. At the lab, we use, for example, DKPro, a framework written in Java, for many NLP tasks, as well as Python and Deep Learning.
Finding a Topic
We can help you define a topic, but it usually helps if you already have some rough ideas of what you would like to work on. Please contact us early to allow some time for that process, and provide us with some background about yourself, such as:
- Have you already attended lectures or seminars from our group? Which ones?
- What is your programming experience? (Which languages, and for how long?)
- What NLP topic(s) do you find interesting, what would you potentially like to work on?
If you already have a clear idea of what you would like to work on, fill out our thesis questionnaire [Link].
Formal Requirements, Registration and Submission
Please inform yourself early about the formal requirements, registration and submission procedures of your study program regarding your final thesis. The examination regulations of the different study programs can be found here.
You are encouraged to write the report for your thesis in LaTeX. We strongly recommend that you use the following template:
Some days after the submission of your thesis, you have to present your work in a colloquium at the lab. You will have 15 minutes to present, followed by 15 minutes of questions about your thesis.
Past Thesis Topics
The following list of finished Master's and Bachelor's theses should give you an idea of the range of topics offered:
Identification of sources used in argumentative texts using text similarity measures
Comparative visualization of essays
Personalizing a handwriting recognition system
Dario van den Boom
Transfer learning for German named entity recognition
Bootstrapping a conversational tutor by semi-automatically analyzing interaction data
Comparing the perception of Hate Speech between German and English
Keli Ebe Corine Dara-ahato
Application of Siamese Network for Resume-Job Matching
Analysis of an emotion classification transfer from tweets to restaurant reviews
Comparing Chinese and German judgments of hate speech against women
English-Chinese cross-lingual scoring of short answer questions
Training and comparing Mandarin DeepSpeech speech recognition models
Chinese Short Answer Scoring
„Are you an adult?“ - Identifying user age while chatting
Strike-through Text Identification from Handwritten Documents
Comparing the speech recognition quality of Amazon Alexa and Google Home
Adversarial Examples for Evaluating Automatic Content Scoring Systems
Sentiment Analysis using Textual and Auditory Cues
Jan Henry van der Vegte
Analysis and Comparison of Web-scraping Tools
Evaluation des Einflusses von fehlenden Word Embeddings auf Semantische Textähnlichkeit
Analyzing the transfer of aspect-based sentiment extraction to hotel reviews
Jonas Philipp Meise
The influence of segmentation quality on Chinese stance detection
Cross-lingual content scoring
Automatic detection of irony and sarcasm in German online news articles
Nivelin Stoyanev & Radomir Georgiev
Study on Register in Editorial Work Using Data-Mining Methods
Comparative evaluation of embeddings based on semantic similarity and relatedness
Mapping human reactions to events in voice-controlled smart home environments
Jeanette Alice Schenkewitz
Non-common sense baselines for a common sense reading comprehension task
An Ontology-based Approach for Stance Detection in Tweets on Catalan Independence
Comparing transfer learning approaches on the task of review rating prediction
Eduardo Goulart Rocha
Vergleich von Verfahren zur Annotation von Wortverwandtschaft
Topic-sensitive methods for automatic spelling correction
Cross-task scoring of complex writing tasks using domain adaptation and task-independent features
A Comparative Evaluation of German Grapheme-to-Phoneme Conversion Libraries
The influence of ngram configurations on classification results
Automatische Spracherkennung von transkribierten Tweets am Beispiel von Arabizi
Anas Zine al Dine
Transferring Automated Essay Grading Models between Domains and Languages
Comparing the Performance of Sentiment Analysis Services
Detecting Sarcasm in Tweets
The influence of spelling errors on the performance of short-answer scoring systems
Quantifying the polarity of names based on sentiment lexicons and word embeddings
Wen Bin Le
The impact of language errors and the performance of native language identification
Comparing the applicability of relation inference datasets for evaluating distributional methods
Generating and Evaluating Part-of-Speech Tagger Models for Multiple Languages
Language Models for L2 Writing Assistance
The Impact of Negation on the Quality of Sentiment Analysis
The Influence of Gap Ambiguity and Language Proficiency on the Performance of Bundled Gap Exercises
Erkennung von im lateinischen Alphabet geschriebenen Dari Tweets
Improving Anaphora Resolution Through Corpus Mined Gender Information
Jan-Henry van der Vegte
Identifying Semantically Equivalent Twitter Messages
Erkennung der Sprache in kurzen "social Media" Texten
Entwicklung eines Frameworks zur Extraktion von Netzwerken aus Texten
Automatische Korrektur von Präpositionen und Artikeln
Vergleich von Textähnlichkeitsmaßen anhand von Geschwindigkeit und Leistungsverhalten
Konfiguration und Verifikation von UIMA Pipelines
Vergleich von Topic und N-Gramm Modellen in Wortvorschlagssystemen
Predicting Cloze-Test Difficulty with Semantically Sensitive Language Models
Automatische Bestimmung des Erstellungszeitpunktes von Textdokumenten
Comparing spell-checking tools with respect to the quality of automatic corrections
Entwicklung eines deutschen Social Media Models für HeidelTime
Identifying the native language of a text’s author
The Influence of Smileys as a Feature in Opinion-based Text Classification
Vergleichende Evaluation von UIMA-Textanalysekomponenten
Interested in working with us?
We are constantly looking for motivated Bachelor and Master students from our university who are interested in working with us on NLP-related topics. Most of the jobs require some programming experience, preferably in Java or Python. To get an idea of what we are working on, you can find a list of our ongoing research here. Contact us so that we can find out together if we currently have an interesting task for you!
We do not have any open PhD positions at the moment. However, if you have a research proposal in line with our topics at the lab and have an idea of how to get your own funding, please talk to us!