NLP

Can Demographic Factors Improve Text Classification? Revisiting Demographic Adaptation in the Age of Transformers

Demographic factors (e.g., gender or age) shape our language. Previous work showed that incorporating demographic factors consistently improved the performance of traditional NLP models across various tasks. In this work, we investigate whether …
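
A common way to inject such factors into a transformer classifier is to prepend demographic attribute tokens to the input text. The sketch below is illustrative only: the model, attribute scheme, and label count are assumptions, not the paper's exact setup.

```python
# Sketch: demographic adaptation via attribute tokens prepended to the input.
# Model, attribute scheme, and label count are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Register one special token per demographic attribute value.
attribute_tokens = ["[GENDER=F]", "[GENDER=M]", "[AGE<35]", "[AGE>=35]"]
tokenizer.add_special_tokens({"additional_special_tokens": attribute_tokens})
model.resize_token_embeddings(len(tokenizer))  # make room for the new tokens

def encode(text, gender, age):
    """Prefix the text with the author's demographic attributes."""
    prefix = f"[GENDER={gender}] " + ("[AGE<35] " if age < 35 else "[AGE>=35] ")
    return tokenizer(prefix + text, truncation=True, return_tensors="pt")

batch = encode("I loved this film!", gender="F", age=29)
logits = model(**batch).logits  # then fine-tune as usual on task labels
```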

INDOMITA

Innovative Demographically-aware Hate Speech Detection in Online Media in Italian

Know Your Audience: Do LLMs Adapt to Different Age and Education Levels?

Large language models (LLMs) offer a range of new possibilities, including adapting the text to different audiences and their reading needs. But how well do they adapt? We evaluate the readability of answers generated by four state-of-the-art LLMs …
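
One standard, model-free way to quantify whether an answer matches a target audience is a readability formula such as Flesch Reading Ease. A self-contained sketch, using a crude vowel-group syllable heuristic:

```python
import re

def count_syllables(word: str) -> int:
    # Crude heuristic: count contiguous vowel groups, minimum one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Higher = easier (90+ roughly 5th grade, below 30 roughly graduate level)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    n_syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (n_syllables / n_words)

print(flesch_reading_ease("The cat sat on the mat."))
print(flesch_reading_ease("Photosynthesis converts electromagnetic radiation into chemical energy."))
```

Comparing such scores across answers generated for different requested audiences gives a simple check on whether the model actually adapted.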

Beyond Digital 'Echo Chambers': The Role of Viewpoint Diversity in Political Discussion

Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming, siloed in so-called 'echo chambers' of exclusively like-minded discussants. Yet, to date we lack sufficient means to …

It's Not Just Hate: A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though, annotation decisions are governed by optimizing time or annotator agreement. We make a case for nuanced efforts in an interdisciplinary setting for …

Twitter-Demographer: A Flow-based Tool to Enrich Twitter Data

Twitter data have become essential to Natural Language Processing (NLP) and social science research, driving various scientific discoveries in recent years. However, textual data alone are often not enough to conduct studies: in particular, social …
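
The "flow-based" design can be pictured as a chain of components, each reading the fields it needs from a record and writing new ones. The class and component names below are hypothetical, illustrating the pattern rather than Twitter-Demographer's actual API:

```python
# Hypothetical sketch of a flow-based enrichment pipeline: each component
# consumes some fields of a tweet record and adds new ones.
# Names are illustrative, not Twitter-Demographer's actual API.
class Component:
    def infer(self, record: dict) -> dict:
        raise NotImplementedError

class LowercaseNameComponent(Component):
    def infer(self, record):
        record["name_normalized"] = record["user_name"].lower()
        return record

class PlaceholderGenderComponent(Component):
    def infer(self, record):
        # Stand-in for a real gender-inference model or lookup table.
        record["inferred_gender"] = "unknown"
        return record

def run_flow(records, components):
    for component in components:
        records = [component.infer(r) for r in records]
    return records

tweets = [{"user_name": "Alice", "text": "NLP is fun"}]
print(run_flow(tweets, [LowercaseNameComponent(), PlaceholderGenderComponent()]))
```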

Bridging Fairness and Environmental Sustainability in Natural Language Processing

Fairness and environmental impact are important research directions for the sustainable development of artificial intelligence. However, while each topic is an active research area in natural language processing (NLP), there is a surprising lack of …
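
On the environmental side, energy and CO2 accounting can be bolted onto any experiment with a tracker such as the codecarbon library. A minimal sketch, assuming codecarbon is installed and with a dummy workload standing in for real training:

```python
# Minimal sketch: wrap an experiment in a CO2 tracker (codecarbon library).
# The workload below is a dummy stand-in for model training or evaluation.
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="fairness-vs-footprint")
tracker.start()

total = sum(i * i for i in range(10_000_000))  # placeholder workload

emissions_kg = tracker.stop()  # estimated kg of CO2-equivalent
print(f"Estimated footprint: {emissions_kg:.6f} kg CO2eq")
```

Reporting such estimates alongside fairness metrics makes the trade-off between the two research goals visible.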

SocioProbe: What, When, and Where Language Models Learn about Sociodemographics

Pre-trained language models (PLMs) have outperformed other NLP models on a wide range of tasks. Seeking a more thorough understanding of their capabilities and inner workings, researchers have established the extent to which they capture …
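
The standard probing recipe behind questions like these: freeze the PLM, extract representations from a chosen layer, and train a lightweight classifier to predict the sociodemographic attribute. A minimal sketch, where the model, layer, and toy data are assumptions:

```python
# Minimal probing sketch: frozen PLM representations -> linear probe.
# Model choice, layer, and toy data are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
plm = AutoModel.from_pretrained("bert-base-uncased")
plm.eval()

def embed(texts, layer=8):
    """Mean-pooled hidden states from one layer of the frozen PLM."""
    with torch.no_grad():
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        hidden = plm(**batch, output_hidden_states=True).hidden_states[layer]
        mask = batch["attention_mask"].unsqueeze(-1).float()
        return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy data: texts labeled with a binary sociodemographic attribute.
texts = ["omg this is sooo cool!!", "One must consider the broader implications.",
         "lol no way", "The committee has reviewed the proposal."]
labels = [0, 1, 0, 1]

probe = LogisticRegression(max_iter=1000).fit(embed(texts), labels)
print("probe accuracy (layer 8):", probe.score(embed(texts), labels))
# Sweeping `layer` over all hidden states addresses the "where" question.
```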

Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages

Hate speech is a global phenomenon, but most hate speech datasets so far focus on English-language content. This hinders the development of more effective hate speech detection models in hundreds of languages spoken by billions across the world. More …
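
One data-efficient baseline in this setting is zero-shot cross-lingual transfer: train a classifier on English labels over multilingual sentence embeddings, then apply it unchanged to an under-resourced language. A sketch with toy data and an off-the-shelf encoder; a real setup would of course need a proper hate speech dataset:

```python
# Sketch: zero-shot cross-lingual transfer with a multilingual encoder.
# Toy data only; labels and examples are illustrative.
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# Train on English labels...
en_texts = ["I hate those people", "What a lovely day",
            "They should all disappear", "Great match last night"]
en_labels = [1, 0, 1, 0]  # 1 = hateful
clf = LogisticRegression().fit(encoder.encode(en_texts), en_labels)

# ...then predict on an under-resourced target language with no labels.
sw_texts = ["Leo ni siku nzuri sana"]  # Swahili: "Today is a very nice day"
print(clf.predict(encoder.encode(sw_texts)))
```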

Is It Worth the (Environmental) Cost? Limited Evidence for the Benefits of Diachronic Continuous Training

Language is constantly changing and evolving, so language models quickly become outdated, both factually and linguistically. Recent research proposes continuously updating our models with new data. Continuous training allows us to teach …
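
Diachronic continuous training in its simplest form is continued masked-language-model pretraining on newly collected text. A minimal sketch with Hugging Face's Trainer, where the corpus and hyperparameters are placeholders:

```python
# Minimal sketch: continued (diachronic) MLM pretraining on new text.
# The corpus and hyperparameters are placeholders.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

new_texts = [
    "Placeholder sentence from this month's news crawl.",
    "Another recent document the old checkpoint has never seen.",
]
train_dataset = [tokenizer(t, truncation=True, max_length=128) for t in new_texts]

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="continued-ckpt", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()  # each new time slice repeats this step from the last checkpoint
```

Each such update step has a compute (and thus environmental) cost, which is exactly the trade-off the paper weighs against the measured benefits.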