Peer-reviewed manuscripts of research I have led or collaborated on are listed below and on Google Scholar.
Human Heuristics for AI-Generated Language Are Flawed.
Maurice Jakesch, Jeffrey Hancock, and Mor Naaman. (2023).
Proceedings of the National Academy of Sciences 120.11.
[Paper]
[Pre-print]
[Pre-registration]
[Abstract]
Human communication is increasingly intermixed with language generated by AI. Across chat, email, and social media, AI systems suggest words, complete sentences, or produce entire conversations. AI-generated language is often not identified as such but presented as language written by humans, raising concerns about novel forms of deception and manipulation. Here, we study how humans discern whether verbal self-presentations, one of the most personal and consequential forms of language, were generated by AI. In six experiments, participants (N = 4,600) were unable to detect self-presentations generated by state-of-the-art AI language models in professional, hospitality, and dating contexts. A computational analysis of language features shows that human judgments of AI-generated language are handicapped by intuitive but flawed heuristics such as associating first-person pronouns, use of contractions, or family topics with human-written language. We experimentally demonstrate that these heuristics make human judgment of AI-generated language predictable and manipulable, allowing AI systems to produce text perceived as “more human than human.” We discuss solutions, such as AI accents, to reduce the deceptive potential of language generated by AI, limiting the subversion of human intuition.
Co-Writing with Opinionated Language Models Affects Users' Views.
Maurice Jakesch, Advait Bhat, Daniel Buschek, Lior Zalmanson, and Mor Naaman. (2023).
Proceedings of the ACM CHI.
[Paper]
[Pre-print]
[Abstract]
If large language models like GPT-3 preferentially produce a particular point of view, they may influence people’s opinions on an unknown scale. This study investigates whether a language-model-powered writing assistant that generates some opinions more often than others impacts what users write, and what they think. In an online experiment, we asked participants (N=1,506) to write a post discussing whether social media is good for society. Treatment group participants used a language-model-powered writing assistant configured to argue that social media is good or bad for society. Participants then completed a social media attitude survey, and independent judges (N=500) evaluated the opinions expressed in their writing. Using the opinionated language model affected the opinions expressed in participants’ writing and shifted their opinions in the subsequent attitude survey. We discuss the wider implications of our results and argue that the opinions built into AI language technologies need to be monitored and engineered more carefully.
Can AI communication tools increase legislative responsiveness and trust in democratic institutions?
Sarah Kreps and Maurice Jakesch. (2023).
Government Information Quarterly 40.3: 101829.
[Paper]
[Abstract]
Smart replies, writing enhancements, and virtual assistants powered by artificial intelligence (AI) language technologies are becoming part of consumer products and everyday experiences. This study explores the opportunities and risks of using language-generating AI systems in politics to increase legislative responsiveness. Legislators receive a large volume of constituent communication and often cannot devote individual consideration and timely response to each. Here, AI language technologies may allow legislators to process constituent communication more efficiently. For example, AI writing tools can suggest reply snippets when a staffer responds to a common concern. However, legislative human-AI collaboration could reduce constituent trust or undermine the representative process. In two experiments, we compared constituents’ impressions of human-written legislative correspondence to correspondence partially or fully generated by GPT-3, a state-of-the-art language model. Our results suggest that legislative correspondence generated by AI with human oversight may be received favorably and increase constituent trust compared to generic auto-responses that busy legislators may employ. However, poorly performing AI language technologies may damage confidence in the legislator. Our findings highlight the potential and risks of introducing AI-mediated communication to the representation process. We discuss the importance of disclosure, transparency, and maintaining human-in-the-loop accountability for political deployments of AI language technologies.
Assessing the Effects and Risks of Large Language Models in AI-Mediated Communication.
Maurice Jakesch. (2023).
Cornell University ProQuest Dissertations Publishing.
[PDF]
[Abstract]
Large language models like GPT-3 are increasingly becoming part of human communication. Through writing suggestions, grammatical assistance, and machine translation, the models enable people to communicate more efficiently. Yet, we have a limited understanding of how integrating them into communication will change culture and society. For example, a language model that preferentially generates a particular view may influence people’s opinions when integrated into widely used applications. This dissertation empirically demonstrates that embedding large language models into human communication poses systemic societal risks. In a series of experiments, I show that humans cannot detect language produced by GPT-3, that using large language models in communication may undermine interpersonal trust, and that interactions with opinionated language models change users’ attitudes. I introduce the concept of AI-Mediated Communication (where AI technologies modify, augment, or generate what people say) to theorize how the use of large language models in communication presents a paradigm shift from previous forms of computer-mediated communication. I conclude by discussing how my findings highlight the need to manage the risks of AI technologies like large language models in ways that are more systematic, democratic, and empirically grounded.
Comparing Sentence-Level to Message-Level Suggestions in AI-Mediated Communication.
Liye Fu, Benjamin Newman, Maurice Jakesch, and Sarah Kreps. (2023).
Proceedings of the ACM CHI.
[Paper]
[Pre-print]
[Abstract]
Traditionally, writing assistance systems have focused on short or even single-word suggestions. Recently, large language models like GPT-3 have made it possible to generate significantly longer natural-sounding suggestions, offering more advanced assistance opportunities. This study explores the trade-offs between sentence- vs. message-level suggestions for AI-mediated communication. We recruited 120 participants to act as staffers from legislators’ offices who often need to respond to large volumes of constituent concerns. Participants were asked to reply to emails with different types of assistance. The results show that participants receiving message-level suggestions responded faster and were more satisfied with the experience, as they mainly edited the suggested drafts. In addition, the texts they wrote were evaluated as more helpful by others. In comparison, participants receiving sentence-level assistance retained a higher sense of agency, but took longer for the task as they needed to plan the flow of their responses and decide when to use suggestions. Our findings have implications for designing task-appropriate communication assistance systems.
Effects of Algorithmic Trend Promotion: Evidence from Coordinated Campaigns in Twitter's Trending Topics.
Joseph Schlessinger, Kiran Garimella, Maurice Jakesch, and Dean Eckles. (2023).
Proceedings of the AAAI ICWSM.
[Paper]
[PDF]
[Pre-print]
[Abstract]
In addition to more personalized content feeds, some leading social media platforms give a prominent role to content that is more widely popular. On Twitter, trending topics identify popular topics of conversation on the platform, thereby promoting popular content which users might not have otherwise seen through their network. Hence, trending topics potentially play important roles in influencing the topics users engage with on a particular day. Using two carefully constructed data sets from India and Turkey, we study the effects of a hashtag appearing on the trending topics page on the number of tweets produced with that hashtag. We specifically aim to answer the question: How many new tweets are generated because a hashtag is labeled as trending? We separate the effects of the trending topics page from network exposure and find there is a statistically significant, but modest, return to a hashtag being featured on trending topics. Analysis of the types of users impacted by trending topics shows that the feature helps less popular and new users to discover and spread content outside their network, which they otherwise might not have been able to do.
AHA!: Facilitating AI Impact Assessment by Generating Examples of Harms.
Zana Buçinca, Chau Minh Pham, Maurice Jakesch, Marco Tulio Ribeiro, Alexandra Olteanu, and Saleema Amershi. (2023).
arXiv preprint.
[Pre-print]
[Abstract]
While demands for change and accountability for harmful AI consequences mount, foreseeing the downstream effects of deploying AI systems remains a challenging task. We developed AHA! (Anticipating Harms of AI), a generative framework to assist AI practitioners and decision-makers in anticipating potential harms and unintended consequences of AI systems prior to development or deployment. Given an AI deployment scenario, AHA! generates descriptions of possible harms for different stakeholders. To do so, AHA! systematically considers the interplay between common problematic AI behaviors as well as their potential impacts on different stakeholders, and narrates these conditions through vignettes. These vignettes are then filled in with descriptions of possible harms by prompting crowd workers and large language models. By examining 4113 harms surfaced by AHA! for five different AI deployment scenarios, we found that AHA! generates meaningful examples of harms, with different problematic AI behaviors resulting in different types of harms. Prompting both crowds and a large language model with the vignettes resulted in more diverse examples of harms than those generated by either the crowd or the model alone. To gauge AHA!’s potential practical utility, we also conducted semi-structured interviews with responsible AI professionals (N=9). Participants found AHA!’s systematic approach to surfacing harms important for ethical reflection and discovered meaningful stakeholders and harms they believed they would not have thought of otherwise. Participants, however, differed in their opinions about whether AHA! should be used upfront or as a secondary check and noted that AHA! may shift harm anticipation from an ideation problem to a potentially demanding review problem. Drawing on our results, we discuss design implications of building tools to help practitioners envision possible harms.
How Different Groups Prioritize Ethical Values for Responsible AI.
Maurice Jakesch, Zana Buçinca, Saleema Amershi, and Alexandra Olteanu. (2022).
Proceedings of the ACM FAccT.
[Paper]
[Pre-print]
[Abstract]
Private companies, public sector organizations, and academic groups have outlined ethical values they consider important for responsible artificial intelligence technologies. While their recommendations converge on a set of central values, little is known about the values a more representative public would find important for the AI technologies they interact with and might be affected by. We conducted a survey examining how individuals perceive and prioritize responsible AI values across three groups: a representative sample of the US population (N=743), a sample of crowdworkers (N=755), and a sample of AI practitioners (N=175). Our results empirically confirm a common concern: AI practitioners’ value priorities differ from those of the general public. Compared to the US-representative sample, AI practitioners appear to consider responsible AI values as less important and emphasize a different set of values. In contrast, self-identified women and Black respondents found responsible AI values more important than other groups did. Surprisingly, more liberal-leaning participants, rather than participants reporting experiences with discrimination, were more likely to prioritize fairness than other groups. Our findings highlight the importance of paying attention to who gets to define responsible AI.
Belief in partisan news depends on favorable content more than a trusted source.
Maurice Jakesch, Mor Naaman, and Michael Macy. (2022).
Under review.
[Pre-print]
[Pre-registration]
[Abstract]
Surveys show that people trust news sources that support their political ideology, creating a feedback loop that sustains partisan disagreement about fact as well as opinion. However, most news sources do not publish sufficiently balanced content to disentangle the underlying dynamics: Do people believe partisan news because they trust the source or because the content favors their worldview? We experimentally isolated the effects of content and source on the credibility of partisan news. The results show that the credibility of partisan news depends on favorable content more than a trusted source. Unfavorable headlines were unlikely to be believed, but favorable headlines were readily believed even if attributed to mistrusted sources. When offered monetary incentives for correct evaluations, people were more likely to acknowledge the accuracy of unfavorable news. The findings suggest that interventions emphasizing accuracy may be more effective at mitigating alternative realities than efforts that promote source trust.
Trend Alert: A Cross-Platform Organization Manipulated Twitter Trends in the Indian General Election.
Maurice Jakesch, Kiran Garimella, Dean Eckles, and Mor Naaman. (2021).
Proceedings of the ACM CSCW.
[Paper]
[Pre-print]
[Abstract]
Political organizations worldwide keep innovating their use of social media technologies. In the 2019 Indian general election, organizers used a network of WhatsApp groups to manipulate Twitter trends through coordinated mass postings. We joined 600 WhatsApp groups that support the Bharatiya Janata Party, the right-wing party that won the general election, to investigate these campaigns. We found evidence of 75 hashtag manipulation campaigns in the form of mobilization messages with lists of pre-written tweets. Building on this evidence, we estimate the campaigns’ size, describe their organization and determine whether they succeeded in creating controlled social media narratives. Our findings show that the campaigns produced hundreds of nationwide Twitter trends throughout the election. Centrally controlled but voluntary in participation, this hybrid configuration of technologies and organizational strategies shows how profoundly online tools transform campaign politics. Trend alerts complicate the debates over the legitimate use of digital tools for political participation and may have provided a blueprint for participatory media manipulation by a party with popular support.
How Partisan Crowds Affect News Evaluation.
Maurice Jakesch, Moran Koren, and Mor Naaman. (2020).
Proceedings of the ACM TTO.
[Paper]
[PDF]
[Materials]
[Abstract]
Social influence is ubiquitous in politics and online social media. Here we explore how social signals from partisan crowds influence people’s evaluations of political news. For example, are liberals easily persuaded by a liberal crowd, while resisting the influence of conservative crowds? We designed a large-scale online experiment (N=1,000) to test how politically-annotated social signals affect participants’ opinions. In times rife with misinformation and polarization, our findings are optimistic: the mechanism of social influence works across political lines, that is, liberals are reliably influenced by majority-Republican crowds and vice versa. At the same time, we replicate findings showing that people are inclined to discard news claims that are inconsistent with their political views. Considering that people show negative reactions to politically dissonant news but not to social signals that oppose their views, we point to the possibility of depolarizing social rating systems.
AI-Mediated Communication: The Perception That Profile Text Was Written by AI Affects Trustworthiness.
Maurice Jakesch, Megan French, Xiao Ma, Jeffrey Hancock, and Mor Naaman. (2019).
Proceedings of the ACM CHI.
[Paper]
[Materials]
[Abstract]
We are entering an era of AI-Mediated Communication (AI-MC) where interpersonal communication is not only mediated by technology, but is optimized, augmented, or generated by artificial intelligence. Our study takes a first look at the potential impact of AI-MC on online self-presentation. In three experiments we test whether people find Airbnb hosts less trustworthy if they believe their profiles have been written by AI. We observe a new phenomenon that we term the Replicant Effect: Only when participants thought they saw a mixed set of AI- and human-written profiles, they mistrusted hosts whose profiles were labeled as or suspected to be written by AI. Our findings have implications for the design of systems that involve AI technologies in online self-presentation and chart a direction for future work that may upend or augment key aspects of Computer-Mediated Communication theory.
The Role of Source, Headline, and Expressive Responding in Political News Evaluation.
Maurice Jakesch, Moran Koren, Anna Evtushenko, and Mor Naaman. (2019).
Computation + Journalism Symposium.
[Paper]
[Materials]
[Abstract]
Studies have observed that readers are more likely to trust news sources that align with their own political leanings. We ask: is the higher reported trust in politically aligned news sources due to perceived institutional trustworthiness, or does it merely reflect a preference for the political claims aligned sources publish? Furthermore, do respondents report their actual beliefs about news, or do they choose to express their political commitments instead? We conducted a US-based experiment (N=400) using random association of news claims to news sources as well as financial incentives to robustly identify the main drivers of trust in news and to evaluate response bias. We observe a comparatively weak effect of source on news evaluation and find that response differences are largely due to the alignment of the respondents’ politics and the news claim. We also find significant evidence for expressive responding, in particular among right-leaning participants.