Abstract
A growing body of research on large language models (LLMs) has identified various biases, primarily in contexts where biases reflect societal patterns. This article focuses on a different source of bias in LLMs: government censorship. Comparing foundation models developed in China with those developed elsewhere, we find that China-originating models exhibit substantially higher rates of refusal to respond, shorter responses, and less accurate responses to a battery of 145 political questions. These disparities diminish for less-sensitive prompts, indicating that technological and market differences cannot fully explain the divergence. While all models refuse to respond more often to Chinese-language prompts than to English ones, these language differences are less pronounced than the disparities between China-originating and non-China-originating models. We caution that our study is observational and cross-sectional and does not establish a causal link between regulatory pressures and the censorship behaviors of China-originating LLMs. Nevertheless, these results suggest that government regulation requiring companies to restrict political content may be an important factor contributing to political bias in LLMs.

Significance Statement
China is an increasingly important contributor to the development of foundation large language models (LLMs), so understanding the political factors shaping these systems is critical. While prior research has focused on LLM biases that reflect societal patterns, this study reveals how state regulation can influence AI outputs. By comparing LLMs developed inside and outside China, we find significantly higher levels of censorship in China-originating models that cannot be explained by technological limitations or market preferences. Understanding how political censorship affects LLMs is essential for assessing the future of information access and the global influence of AI.