The Language Technology Lab carries out research in the field of Natural Language Processing.

We strongly believe that engineering is a key part of research in this field and that often a new insight is only to be found when re-implementing an approach. We are especially interested in analyzing and processing non-standard, error-prone language as found in social media and learner language.

Consequently, we mainly focus on two areas of specialization:

Educational NLP: Short answer scoring, Essay scoring, Vocabulary Acquisition, Spelling and grammar correction
Social Media Analysis: Robustness of tools, Domain adaptation, Large-scale semantic processing



  • User-Centered Social Media (DFG Research Training Group 2015-2020)

    User-Centred Social Media

    The Research Training Group “User-Centred Social Media” (UCSM) is an interdisciplinary Research Training Group (Graduiertenkolleg) at the Department of Computer Science and Applied Cognitive Science of the University of Duisburg-Essen. The programme is funded by the DFG and started on October 1, 2015.

    The emergence of Social Media marks a significant step in the application of information and communication technology with a profound impact on people, businesses, and society. Social Media constitute complex sociotechnical systems, encompassing potentially very large user groups, both in public and organizational contexts, and exhibiting features such as user-generated content, social interaction and awareness, and emergent functionality. While Social Media use is widespread and increasing, significant research gaps exist with respect to analyzing and understanding the characteristics and determinants of user behaviour, both at the individual and the collective level, as well as regarding the user-centered design of Social Media systems, aiming at empowering users to better appropriate, control and adapt systems for their individual goals. There is a growing demand in academia and in industry for scientifically trained experts that are knowledgeable both in the human-oriented and the technical aspects of Social Media.

    More information can be found at the User-Centred Social Media Homepage.

  • Argument-Based Decision Support for Recommender Systems (DFG SPP RATIO 2018-2020)

    Argument-Based Decision Support for Recommender Systems (ASSURE)

    ASSURE is a project within the SPP RATIO.
    Argumentative statements contained in user-generated texts such as online product reviews can significantly facilitate a user’s decision. Recommender systems aim at alleviating the user’s decision problem by suggesting items the user is likely interested in, but do not exploit the potential of reasoned arguments given for or against a certain item or its properties. The overall objective of the ASSURE project is to make use of arguments embedded in online reviews to significantly improve the quality and transparency of recommendations given by the system, and to provide users with a much higher level of interactive control over the recommendation process than is currently the case.

    The project aims at advancing the state of the art in several respects: Firstly, we will develop novel methods for extracting arguments from the typically informal texts found in user reviews. We will further enrich the arguments with annotations of how specific and how emotionally intense they are.

    Secondly, we will combine the extracted arguments and the additional annotations with user ratings and other item-related data in an integrated user and item model to improve the effectiveness of recommender algorithms. This model will also provide a basis for developing novel techniques through which users can interactively explore, filter, or weight different arguments, as well as other data, to control how recommendations are generated. Thirdly, we will develop methods for providing users with personalized, argument-based explanations of the items recommended. A further important outcome of the project will be a dataset of unprecedented quality and size that is annotated on different layers regarding argumentation. Such a dataset is a prerequisite for further research on argumentation in the context of recommending, and will be suited for use in shared tasks that form part of the priority program.

    More information can be found at the ASSURE Homepage.

  • Automatic Scoring of Free-Text Answers (TestDaF 2019-2020)

    The TestDaF Institute is one of the biggest providers of language proficiency testing for German as a foreign language. LTL collaborates with the TestDaF Institute on automatic and assisted scoring of free-form answers, more precisely answers given to listening comprehension prompts and learner essays. We explore ways to reduce the scoring workload of human graders and to ensure fast and consistent scoring of free-text answers.

  • Bildungsgerechtigkeit im Fokus II (BMBF 2016-2020)

    Within the project Bildungsgerechtigkeit im Fokus, we are part of subproject 2 (Teilprojekt 2): Blended Learning.

  • Sustainability of research software (2018 - 2020)

    Reproducibility of experiments is a key requirement of scientific work. With DKPro Core and DKPro Text Classification, we are working towards improved reproducibility of software experiments.
    We received three years of funding from the DFG to further improve DKPro Core and DKPro Text Classification as landmark tools fit for conducting scientific experiments.

  • Exploration of digital technologies in public employment services using the example of text mining (2019-2020)

    Employment is an essential part of participation in society. Public employment services play an important role here. They are often directly or indirectly involved in the initiation of new employment relationships. This is particularly the case where job seekers cannot find employment on their own.

    The planned project examines how the use of digital technologies changes work and organization in employment agencies and job centers. The study uses text mining as a concrete application example of digital technologies. In an interdisciplinary collaboration between sociology and computer science, changes in the work and organization of job placement depending on different scenarios of digitization are examined.

  • Hate Speech Research Overview

    At the Language Technology Lab, we have conducted multiple research projects on hate speech. Our research interests lie in automating hate speech detection and in how automated detection can be made better and more reliable. To this end, we have examined definitional and linguistic challenges and assessed how gender can play a role in hate speech detection. We have explored the development of monolingual and multilingual classification systems that can be used to identify and categorize offensive language on social media. We have developed a classification system to distinguish between free speech and language constituting a criminal offense. While our research focuses on the technical side of hate speech, we aim to take a more holistic approach by also taking social and legal aspects into consideration.

    Hate Speech Definitions

    This research examines how reliable hate speech annotations can be achieved, based on the first German hate speech corpus on refugees. Our results suggest that detailed instructions for the annotation could be more useful than treating hate speech as a binary yes-or-no decision.

    • B. Ross, Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis. In Proceedings of NLP4CMC III: 3rd Workshop on Natural Language Processing for Computer-Mediated Communication (Michael Beißwenger, Michael Wojatzki, Torsten Zesch, eds.), 2016.
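
    As an illustration of what measuring annotation reliability can look like in practice, the following minimal sketch computes Cohen's kappa for two annotators. This is one common agreement measure and not necessarily the one used in the cited paper; the labels and the toy annotations are invented for the example.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(a: Sequence[str], b: Sequence[str]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(a)
    # Observed agreement: share of items with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Expected agreement if both annotators labelled independently
    # according to their own label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[l] * count_b[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations ("hate" vs. "none") for four posts.
annotator_1 = ["hate", "hate", "none", "none"]
annotator_2 = ["hate", "none", "none", "none"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

    A value near 1 indicates strong agreement, while a value near 0 indicates agreement no better than chance.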

    Significance of Implicitness in Hate Speech

    The research explores whether implicitness affects the perception of hate speech. Our findings suggest that it is crucial to take implicitness into account when developing automated hate speech detection systems.

    Hate Speech towards Women

    This study explores whether there is a relationship between the perception of hate speech and gender by asking female and male subjects to judge 400 assertions targeting women. The objective of the research is to find out whether being part of the targeted group or personally agreeing with an assertion influences how hate speech is perceived.

    Hate Speech Detection Systems

    We participated in SemEval 2019 and made two contributions. The first was a system for detecting hate speech in multilingual posts, and the second addressed the identification and categorization of offensive language on social media.

    • Zhang, H., Wojatzki, M., Horsmann, T., & Zesch, T. (2019). ltl.uni-due at SemEval-2019 Task 5: Simple but Effective Lexico-Semantic Features for Detecting Hate Speech in Twitter. In SemEval 2019.

    • Aggarwal, P., Horsmann, T., Wojatzki, M., & Zesch, T. (2019). LTL-UDE at SemEval-2019 Task 6: BERT and Two-Vote Classification for Categorizing Offensiveness. In SemEval 2019.

    Classification of Criminal Offenses

    We have developed an automated classification system to determine which Twitter posts would constitute a criminal offense under German criminal law, using a data annotation schema that consists of a series of binary decisions. Our findings suggest that the majority of posts are morally offensive but do not constitute a criminal offense.

    • Zufall, F., Horsmann, T., & Zesch, T. (2019). From Legal to Technical Concept: Towards an Automated Classification of German Political Twitter Postings as Criminal Offenses. In NAACL.

  • Code and Data

    Fast or Accurate? A Comparative Evaluation of PoS Tagger Models

    Part-of-Speech tagging is an important preprocessing step for many applications in Natural Language Processing. This importance is reflected in the many PoS tagger implementations available today. Which one do you use? Are you sure it is the best choice for your needs?

    When choosing a PoS tagger, there are two properties that should influence your choice:
    Speed and Accuracy

    Big Data scenarios put more emphasis on speed than Digital Humanities, where speed is often of minor importance.

    Besides the well-known Stanford PoS tagger and the TreeTagger, there are many more alternatives. Each implementation often provides more than one model, so which is the best?

    Experimental Setting

    In total, we evaluated 27 models for 9 different PoS tagger implementations. The tagger implementations are listed below. We evaluated them on two languages, English and German.

    For English, we evaluated each tagger model on the following corpora: British National Corpus, Brown, Gimpel, MASC, and Switchboard. For German, we evaluated on Tüba-D/Z and Rehbein.
    We excluded the Wall Street Journal corpus for English and the Tiger and Negra corpora for German, as many models have been trained on them.

    For each language, we run every PoS tagger model on every corpus and additionally measure the runtime of the tagger. The measurement starts right before the tagger is called and ends right after it returns. The figure below shows the workflow of our experiment.
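
    The measurement described above can be sketched roughly as follows. This is a minimal illustration and not the evaluation code used in the study: the `Tagger` type, the `evaluate` function, and the toy tagger and corpus are all hypothetical names introduced for the example.

```python
import time
from typing import Callable, List, Sequence, Tuple

# Hypothetical interface: a tagger maps a token sequence to one tag per token.
Tagger = Callable[[Sequence[str]], List[str]]
Sentence = Tuple[Sequence[str], Sequence[str]]  # (tokens, gold tags)


def evaluate(tagger: Tagger, corpus: Sequence[Sentence]) -> Tuple[float, float]:
    """Return (token accuracy, seconds spent inside the tagger) for one corpus."""
    correct = 0
    total = 0
    elapsed = 0.0
    for tokens, gold in corpus:
        start = time.perf_counter()  # timer starts right before the call
        predicted = tagger(tokens)
        elapsed += time.perf_counter() - start  # and stops right after it
        correct += sum(p == g for p, g in zip(predicted, gold))
        total += len(gold)
    accuracy = correct / total if total else 0.0
    return accuracy, elapsed


def toy_tagger(tokens: Sequence[str]) -> List[str]:
    # Hypothetical stand-in for a real tagger: tags every token as a noun.
    return ["NN"] * len(tokens)


toy_corpus = [
    (["dogs", "bark"], ["NN", "VB"]),
    (["cats"], ["NN"]),
]
accuracy, seconds = evaluate(toy_tagger, toy_corpus)
```

    Timing only the tagger call keeps corpus reading and scoring out of the runtime figure, so that taggers are compared on tagging speed alone.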

    To overcome the differences in the tagsets of the various corpora, we harmonised the tags to a coarse-grained tagset consisting of eleven tags.
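
    Such a harmonisation step amounts to a lookup from each corpus-specific tag to a coarse tag. The sketch below shows the idea with a handful of Penn Treebank and STTS tags; the coarse labels (`NOUN`, `VERB`, `ADJ`, `DET`, `OTHER`) and the partial mappings are assumptions chosen for illustration and are not the eleven-tag set used in the study.

```python
from typing import Dict, List, Sequence

# Partial, illustrative mappings (assumed for this example).
PTB_TO_COARSE: Dict[str, str] = {
    "NN": "NOUN", "NNS": "NOUN", "NNP": "NOUN",
    "VB": "VERB", "VBD": "VERB", "VBZ": "VERB",
    "JJ": "ADJ",
    "DT": "DET",
}
STTS_TO_COARSE: Dict[str, str] = {
    "NN": "NOUN", "NE": "NOUN",
    "VVFIN": "VERB", "VAFIN": "VERB",
    "ADJA": "ADJ", "ADJD": "ADJ",
    "ART": "DET",
}


def harmonise(tags: Sequence[str], mapping: Dict[str, str],
              fallback: str = "OTHER") -> List[str]:
    """Map fine-grained tags to coarse tags; unknown tags get the fallback."""
    return [mapping.get(tag, fallback) for tag in tags]


# "The dogs barked" / "Die Hunde bellten" end up with the same coarse tags.
english = harmonise(["DT", "NNS", "VBD"], PTB_TO_COARSE)
german = harmonise(["ART", "NN", "VVFIN"], STTS_TO_COARSE)
```

    Applying the same mapping to both the gold tags and the tagger output makes accuracy comparable across corpora whose original tagsets differ.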


    The samples highlighted in red are the ones showing the best speed/accuracy combination. The surprising winner is the rule-based Hepple tagger.

    The most accurate German tagger is the TreeTagger. HunPos offers a reasonable trade-off between speed and accuracy. We currently do not have a rule-based tagger for German to test whether the results of the Hepple tagger transfer to German.

    How to cite us?
    Horsmann, Tobias; Erbs, Nicolai; Zesch, Torsten (2015): Fast or Accurate? – A Comparative Evaluation of PoS Tagging Models. Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL-2015), Essen, Germany.


  • Semi-automatic generation of reading comprehension questions (Stifterverband 2018-2019)

    The goal of this project is to improve the provision and integration of reading comprehension tests. We aim to motivate students to read the literature that is relevant to the lecture. To this end, different types of test items (free text, multiple choice, fill-in-the-blank, etc.) will be generated automatically by state-of-the-art language technology and curated by the teachers.

    These tasks will vary according to the curricula of our courses and can be directly accepted and used for evaluation. We will make the required software open source, and its modular design will allow straightforward integration into existing teaching processes.

    The project has great potential for transfer to other disciplines and teaching formats. After all, whenever source texts are available, reading comprehension tests can be generated for them, which also reduces the amount of manual work.

    For more information, please visit FELLOWSHIPS HOCHSCHULLEHRE - FELLOWS 2017

  • CAPE - Computer-assisted Programming Exercises (UDE 2018)

    In the lectures "Fundamental Artificial Intelligence" (about 250 students) and "Language Technology" (about 50 students), we will prepare programming tasks to improve the students' programming skills. Additionally, to lower the barrier to entry, we use a pre-configured system for the programming tasks.

    Link to the website coming soon.

  • INDUS - Individualized Language Learning (DFG 2014-2018)

    Indus Network

    Thanks to new developments in language technology, individualized language learning as a counterpart to standardized classes is now within reach. As a result, not only commonly spoken languages but also languages with a smaller number of native speakers can be learned. It is becoming apparent, however, that embedding those technologies into real learning environments gives rise to new questions, which can only be answered within the framework of interdisciplinary research.

    The INDUS-Network („Individualisiertes Sprachlernen” / „Personalized language learning”) unites experts in the fields of language technology, linguistics, educational research, psychology of learning, pedagogical psychology, language acquisition research, and didactics of language learning.

    Those experts work together on aspects of individualization, modeling of learners, and adjustments of teaching materials to different initial situations.

    More information can be found on the INDUS-Network Homepage.

  • German-Arab Transformation Partnership (DAAD 2016)