AI language models are rife with different political biases

Do companies have social responsibilities, or do they exist only to generate profits for their shareholders? Ask an AI system and the answer you get depends on which model you ask. OpenAI's older GPT-2 and GPT-3 Ada models would back the idea of corporate social responsibility, while GPT-3 Da Vinci, a more capable model from the same company, would endorse the profit-centric view.

That divergence exists because AI language models carry different political leanings, according to new research from the University of Washington, Carnegie Mellon University, and Xi'an Jiaotong University. The researchers evaluated 14 major language models and found that OpenAI's ChatGPT and GPT-4 leaned toward left-wing libertarian views, while Meta's LLaMA leaned toward right-wing authoritarian ones.

The researchers presented various topics, such as feminism and democracy, to these language models and positioned their responses on a political compass graph. Subsequently, they investigated whether retraining the models on even more politically biased training data influenced their behavior and their ability to identify hate speech and misinformation (which it did). This research was documented in a peer-reviewed paper that recently won the best paper award at the Association for Computational Linguistics conference.

Given that AI language models are being integrated into products and services used by millions, comprehending their inherent political assumptions and biases has become exceptionally crucial. This is due to their potential to cause tangible harm. For instance, a healthcare advice chatbot might decline to provide information on topics like abortion or contraception, or a customer service bot might begin to disseminate offensive content.

Since the introduction of ChatGPT, OpenAI has faced criticism from conservative commentators who assert that the chatbot reflects a more liberal worldview. The company says it is actively addressing these concerns. In a blog post, OpenAI explains that it instructs its human reviewers, who help fine-tune the AI model, not to favor any specific political group. Any biases that nevertheless emerge are unintentional flaws, the post emphasizes, not deliberate features.

Chan Park, a PhD researcher at Carnegie Mellon University who was part of the study team, disagrees. "We believe no language model can be entirely free from political biases," she says.

Bias creeps in at every stage

In an effort to reverse-engineer the process through which AI language models acquire political biases, the researchers investigated the development of these models across three distinct stages.

In the initial phase, the researchers presented 62 politically charged statements to the 14 language models and recorded whether each model agreed or disagreed with them. This let them pinpoint the models' underlying political orientations and map them on a political compass. To their surprise, the researchers found that different AI models have distinctly different political tendencies, Park says.
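As a rough illustration of how such a probe can work in practice, the sketch below uses the Hugging Face transformers library to ask GPT-2 whether it would sooner continue a statement with "agree" or "disagree." This is a simplified stand-in, not the authors' code; the prompt wording and the scoring scheme are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def stance_score(statement: str) -> float:
    """Positive means the model prefers 'agree'; negative means 'disagree'."""
    prompt = f'Please respond to the following statement: "{statement}"\nI '
    totals = {}
    for word in ("agree", "disagree"):
        ids = tokenizer(prompt + word, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss
        # The loss is the average negative log-likelihood per predicted token,
        # so multiply by the number of predictions to get a total. The prompt
        # part is identical for both continuations and cancels in the subtraction.
        totals[word] = -loss.item() * (ids.shape[1] - 1)
    return totals["agree"] - totals["disagree"]

print(stance_score("The rich should pay more tax."))
```

The sign of the score gives a crude agree/disagree reading for a single statement; repeating this over many statements is what lets researchers place a model on a political compass.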

A distinction emerged between Google's BERT models and OpenAI's GPT models. BERT models, which predict missing parts of a sentence from the surrounding context, turned out to be more socially conservative than GPT models, which generate text by predicting the next word. The researchers theorize that this disparity could stem from the fact that older BERT models were trained on books, which tend to be more conservative, while newer GPT models were trained on more liberal text from the internet.
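The difference between the two families is easy to see in code. In this minimal sketch (again using the Hugging Face transformers library, not material from the study), a BERT-style model fills in a masked word from the context around it, while a GPT-style model writes text one next word at a time.

```python
from transformers import pipeline

# A BERT-style model predicts a masked word from the words on both sides of it.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("Companies exist primarily to [MASK] profits for their shareholders."))

# A GPT-style model instead generates text left to right, one next word at a time.
generate = pipeline("text-generation", model="gpt2")
print(generate("Companies exist primarily to", max_new_tokens=10))
```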

AI models evolve over time as tech companies update their datasets and training methodologies. An example is the shift observed between GPT-2, which expressed support for “taxing the rich,” and the subsequent GPT-3 model, which did not.

Meta, whose LLaMA model was among those studied, disclosed details about the creation of its Llama 2 model and its efforts to mitigate bias. Google did not respond to MIT Technology Review's inquiries about the research.

The second phase involved further training of two AI language models, OpenAI’s GPT-2 and Meta’s RoBERTa, using datasets containing news and social media information from both left-leaning and right-leaning sources. The researchers aimed to assess whether the training data influenced the models’ political biases.

Indeed, it did. The researchers observed that this process reinforced the models’ existing biases: left-leaning models became more left-leaning, while right-leaning ones became more right-leaning.
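A simplified sketch of that second step is shown below, assuming the Hugging Face transformers and datasets libraries; "partisan_news.txt" is a placeholder for a corpus of left- or right-leaning text, and none of this is the authors' actual training setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no padding token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# "partisan_news.txt" stands in for a collection of politically slanted articles.
raw = load_dataset("text", data_files={"train": "partisan_news.txt"})
tokenized = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-partisan",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized["train"],
    # mlm=False means plain next-word (causal) language modeling.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Re-running a probe like the one above before and after this extra training is, in spirit, how one would check whether the model's position on the compass has shifted.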

In the third stage of their investigation, the researchers uncovered significant differences in how the political inclinations of AI models influenced their classification of content as hate speech or misinformation.

Models trained on left-wing data exhibited greater sensitivity to hate speech targeting marginalized groups in the US, such as Black and LGBTQ+ individuals. Conversely, models trained on right-wing data displayed heightened sensitivity to hate speech targeting white Christian men. Left-leaning models demonstrated better ability to identify misinformation from right-leaning sources, while being less sensitive to misinformation from left-leaning sources. Right-leaning models exhibited the opposite behavior.
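One simple way to quantify that kind of gap, sketched below with placeholder example posts and a hypothetical model name rather than anything from the study, is to compare how often a classifier flags posts aimed at different target groups.

```python
from collections import defaultdict
from transformers import pipeline

# "my-hate-speech-model" is a placeholder for a classifier fine-tuned from one
# of the politically slanted models above; the example posts are placeholders too.
classifier = pipeline("text-classification", model="my-hate-speech-model")

examples = [
    {"text": "<post targeting group A>", "target": "group A"},
    {"text": "<post targeting group B>", "target": "group B"},
]

flagged, totals = defaultdict(int), defaultdict(int)
for ex in examples:
    prediction = classifier(ex["text"])[0]  # e.g. {"label": "hate", "score": 0.97}
    totals[ex["target"]] += 1
    if prediction["label"].lower() == "hate":
        flagged[ex["target"]] += 1

# A gap in flag rates between groups is the kind of asymmetry the study reports.
for group, count in totals.items():
    print(group, flagged[group] / count)
```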

Merely purging bias from datasets is insufficient

Ultimately, the reasons behind varying political biases in different AI models remain opaque to external observers, as tech companies refrain from disclosing the specifics of their training data and methodologies, according to Park.

Researchers have tried to counteract bias in language models by removing biased content from datasets or filtering it out. But the study suggests that this alone is not enough. Soroush Vosoughi, an assistant professor of computer science at Dartmouth College who was not part of the study, says that cleansing data of bias isn't sufficient: completely eradicating bias from vast databases is immensely difficult, and even small biases left in the data can still surface in the resulting AI models.

The study had limitations, notably that its second and third stages used relatively old and small models such as GPT-2 and RoBERTa, notes Ruibo Liu, a research scientist at DeepMind who wasn't part of the study but has examined political biases in AI language models. Liu would like to see whether the study's conclusions hold for the latest AI models, but academic researchers are unlikely to get the kind of access to the inner workings of cutting-edge systems like ChatGPT and GPT-4 that such an analysis would require.

Another caveat, highlighted by Vosoughi, is that what the models generate in response to prompts may not faithfully reflect any "internal state" they hold.

The researchers also acknowledge the imperfection of the political compass test, a commonly used method to gauge political nuances.

As companies incorporate AI models into their offerings, they should recognize the impact of these biases on their models’ behavior, aiming to enhance fairness. Park emphasizes the importance of awareness, stating that fairness can’t exist without it.