Publications

Social-Group-Agnostic Bias Mitigation via the Stereotype Content Model

Published in ACL, 2023

Existing methods typically rely on word pairs specific to certain social groups, limiting their effectiveness to one aspect of social identity. This approach becomes impractical and costly when addressing bias in lesser-known or unmarked social groups. Instead, we proposed leveraging the Stereotype Content Model (SCM), a framework from social psychology. The SCM categorizes stereotypes along two dimensions: warmth and competence. By adopting this social-group-agnostic perspective, we demonstrated comparable performance to group-specific debiasing methods while offering theoretical and practical advantages over existing techniques.
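
As a rough illustration of the group-agnostic idea, the sketch below estimates warmth and competence directions from SCM antonym pairs and projects them out of word embeddings, hard-debias style. The pair lists, the `emb` lookup, and the projection step are illustrative assumptions, not the paper's actual lexicon or procedure.

```python
import numpy as np

# Illustrative SCM antonym pairs; the paper's actual lexicon may differ.
WARMTH_PAIRS = [("warm", "cold"), ("friendly", "hostile"), ("kind", "cruel")]
COMPETENCE_PAIRS = [("competent", "incompetent"), ("capable", "incapable"),
                    ("skilled", "unskilled")]

def scm_direction(pairs, emb):
    """Estimate a bias direction as the dominant direction of variation
    among embedding differences of antonym pairs."""
    diffs = np.stack([emb[a] - emb[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
    d = vt[0]
    return d / np.linalg.norm(d)

def neutralize(vec, directions):
    """Remove the components of `vec` along each SCM direction
    (hard-debias-style projection)."""
    for d in directions:
        vec = vec - np.dot(vec, d) * d
    return vec

# emb: any word -> np.ndarray lookup (e.g., loaded GloVe vectors)
# warmth = scm_direction(WARMTH_PAIRS, emb)
# competence = scm_direction(COMPETENCE_PAIRS, emb)
# debiased = neutralize(emb["nurse"], [warmth, competence])
```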

Download here

The sins of the parents are to be laid upon the children: biased humans, biased data, biased models

Published in Perspectives on Psychological Science, 2023

Technological innovations have become a key driver of societal advancements. Nowhere is this more evident than in the field of machine learning (ML), which has developed algorithmic models that shape our decisions, behaviors, and outcomes. These tools have widespread use, in part, because they can synthesize massive amounts of data to make seemingly objective recommendations. Yet, in the past few years, the ML community has been raising the alarm about why we should be cautious in interpreting and using these models: they are created by humans, from data generated by humans, whose psychology allows for various biases that impact how the models are developed, trained, tested, and interpreted. As psychologists, we thus face a fork in the road. Down the first path, we can continue to use these models without examining and addressing these critical flaws, and rely on computer scientists to try to mitigate them. Down the second path, we can turn our expertise in bias toward this growing field, collaborating with computer scientists to mitigate the deleterious outcomes associated with these models. This paper serves to light the way down the second path by identifying how extant psychological research can help examine and mitigate bias in ML models.

Download here

The paucity of morality in everyday talk

Published in Scientific Reports, 2023

Given its centrality in scholarly and popular discourse, morality should be expected to figure prominently in everyday talk. We test this expectation by examining the frequency of moral content in three contexts, using three methods: (a) Participants’ subjective frequency estimates (N = 581); (b) Human content analysis of unobtrusively recorded in-person interactions (N = 542 participants; n = 50,961 observations); and (c) Computational content analysis of Facebook posts (N = 3822 participants; n = 111,886 observations). In their self-reports, participants estimated that 21.5% of their interactions touched on morality (Study 1), but objectively, only 4.7% of recorded conversational samples (Study 2) and 2.2% of Facebook posts (Study 3) contained moral content. Collectively, these findings suggest that morality may be far less prominent in everyday life than scholarly and popular discourse, and laypeople themselves, would suggest.

Download here

Introducing the Gab Hate Corpus: defining and applying hate-based rhetoric to social media posts at scale

Published in Language Resources and Evaluation, 2022

The Gab Hate Corpus (GHC) contains 27,665 posts from gab.com, annotated for "hate-based rhetoric" by three or more annotators. It includes hierarchical labels for dehumanizing and violent speech, targeted groups, and rhetorical framing. The GHC enhances existing hate speech datasets with a large, representative collection of richly annotated social media posts.
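
As a small usage sketch, the snippet below shows one common way to aggregate multiple annotators' judgments into gold labels via majority vote. The column names and threshold are assumptions for illustration; the released GHC has its own schema, and the paper reports its own aggregation choices.

```python
import pandas as pd

def majority_vote(annotations: pd.DataFrame) -> pd.Series:
    """annotations: one row per (post, annotator) with a binary 'hate' label.
    Returns one 0/1 gold label per post id."""
    return (
        annotations.groupby("id")["hate"]
        .mean()       # fraction of annotators who labeled the post hateful
        .ge(0.5)      # majority vote (ties count as hateful here)
        .astype(int)
    )
```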

Download here

The Moral Foundations Reddit Corpus

Published as an arXiv preprint, 2022

Moral framing and sentiment can affect a variety of online and offline behaviors, including donation, pro-environmental action, political engagement, and even participation in violent protests. Various computational methods in Natural Language Processing (NLP) have been used to detect moral sentiment from textual data, but achieving better performance on such subjective tasks requires large sets of hand-annotated training data. Previous corpora annotated for moral sentiment have proven valuable and have generated new insights both within NLP and across the social sciences, but have been limited to Twitter. To further our understanding of the role of moral rhetoric, we present the Moral Foundations Reddit Corpus, a collection of 16,123 Reddit comments curated from 12 distinct subreddits and hand-annotated by at least three trained annotators for 8 categories of moral sentiment (i.e., Care, Proportionality, Equality, Purity, Authority, Loyalty, Thin Morality, Implicit/Explicit Morality) based on the updated Moral Foundations Theory (MFT) framework. We use a range of methodologies to provide baseline moral-sentiment classification results for this new corpus, e.g., cross-domain classification and knowledge transfer.
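
For a sense of what a baseline on such a corpus looks like, here is a minimal multi-label classification sketch (TF-IDF features with one-vs-rest logistic regression). It is not the paper's actual baseline setup; the category list comes from the abstract, while everything else is an assumption.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline

# The 8 moral-sentiment categories named in the abstract.
CATEGORIES = ["Care", "Proportionality", "Equality", "Purity",
              "Authority", "Loyalty", "Thin Morality",
              "Implicit/Explicit Morality"]

# Simple multi-label baseline: one binary classifier per category.
model = make_pipeline(
    TfidfVectorizer(min_df=2),
    OneVsRestClassifier(LogisticRegression(max_iter=1000)),
)

# texts: list of comment strings
# labels: binary indicator matrix of shape (n_comments, 8)
# model.fit(texts, labels)
# preds = model.predict(["They deserve equal pay for equal work."])
```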

Download here

Improving counterfactual generation for fair hate speech detection

Published in ACL, 2021

Bias mitigation approaches reduce models' dependence on sensitive features of data, such as social group tokens (SGTs), resulting in equal predictions across the sensitive features. In hate speech detection, however, equalizing model predictions may ignore important differences among targeted social groups, as hate speech can contain stereotypical language specific to each SGT. Here, to take the specific language about each SGT into account, we rely on counterfactual fairness and equalize predictions among counterfactuals, generated by changing the SGTs. Our method evaluates the similarity in sentence likelihoods (via pre-trained language models) among counterfactuals, to treat SGTs equally only within interchangeable contexts. By applying logit pairing to equalize outcomes on the restricted set of counterfactuals for each instance, we improve fairness metrics while preserving model performance on hate speech detection.
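
A minimal sketch of the two ingredients described above follows: scoring counterfactual likelihoods with a pretrained language model to keep only interchangeable contexts, and a logit-pairing penalty over the retained counterfactuals. GPT-2, the SGT list, the string-replacement step, and the margin are all illustrative assumptions rather than the paper's exact setup.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

SGTS = ["women", "men", "muslims", "jews"]  # illustrative token list

def sentence_loglik(text: str) -> float:
    """Average per-token log-likelihood under the pretrained LM
    (the model's loss is mean negative log-likelihood)."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = lm(ids, labels=ids)
    return -out.loss.item()

def plausible_counterfactuals(text: str, sgt: str, margin: float = 0.5):
    """Swap the SGT (naive string replace, for illustration) and keep only
    swaps whose likelihood stays within `margin` of the original, i.e.
    contexts where the SGTs are interchangeable."""
    base = sentence_loglik(text)
    cands = [text.replace(sgt, other) for other in SGTS if other != sgt]
    return [c for c in cands if abs(sentence_loglik(c) - base) <= margin]

def logit_pairing_loss(logits, cf_logits):
    """Penalize squared divergence between the classifier's logits on the
    original instance (shape [C]) and on its retained counterfactuals
    (shape [K, C]); added to the task loss during training."""
    return torch.mean((logits.unsqueeze(0) - cf_logits) ** 2)
```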

Download here

Moral concerns are differentially observable in language

Published in Cognition, 2021

We examined the connection between language usage and moral concerns. We collected a large dataset of Facebook status updates from English-speaking participants, along with their responses on the Moral Foundations Questionnaire. Our findings indicate that individuals' moral concerns can be identified through their language usage, although the strength of this relationship varies across different moral dimensions.
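
The analysis pattern here, relating dictionary-based word rates in participants' posts to their questionnaire scores, can be sketched in a few lines. The lexicon and variable names below are purely illustrative; the paper's actual features and models differ.

```python
from scipy.stats import pearsonr

# Purely illustrative Care lexicon; not the paper's actual measure.
CARE_WORDS = {"care", "harm", "protect", "suffer", "hurt"}

def word_rate(text: str, lexicon: set) -> float:
    """Fraction of tokens in `text` that belong to `lexicon`."""
    toks = text.lower().split()
    return sum(t in lexicon for t in toks) / max(len(toks), 1)

# rates: per-participant Care word rate across their status updates
# mfq_care: per-participant MFQ Care subscale score
# r, p = pearsonr(rates, mfq_care)  # strength of the language-concern link
```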

Download here