NIXsolutions: Study Says Chatbots “Think” in English

Large language models (LLMs) are the backbone of many chatbots, and a new study suggests they may harbor an unexpected bias. While these models can seemingly understand and respond to queries in many languages, research reported in New Scientist indicates they may primarily process information in English.

Scientists from École Polytechnique Fédérale de Lausanne investigated how LLMs handle queries by studying Meta’s open-source Llama 2 model. By examining the model’s internal layers, which are responsible for different stages of processing, the researchers observed a concerning trend.

The English Subspace: A Potential Bottleneck for Multilingual Understanding

The researchers provided the model with queries in Chinese, French, German, and Russian. These queries involved tasks like repeating words, translating between non-English languages, and filling in sentence blanks. Regardless of the language used, the model’s processing path within its layers often funneled through a section the researchers called the “English subspace.”

This suggests that the LLM may internally translate non-English queries into English for processing before generating a response in the target language, raising concerns about potential limitations in handling languages with unique concepts or cultural nuances.
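The kind of layer-by-layer inspection described above resembles the "logit lens" technique: projecting each layer's intermediate hidden state through the model's unembedding matrix to see which vocabulary token that layer currently favors. Below is a minimal, self-contained sketch of the idea using a toy two-token vocabulary and hand-picked vectors (an illustration of the analysis method, not the study's actual code or Llama 2's weights):

```python
import numpy as np

# Toy vocabulary: the same concept as an "English" and a "French" token.
vocab = ["flower", "fleur"]

# Hypothetical unembedding matrix: each row is a token's output-embedding
# direction. Real transformers share one such matrix across all layers,
# which is what makes the logit-lens trick possible.
W_U = np.array([
    [1.0,  0.2, 0.0, -0.1],   # direction for "flower"
    [0.1, -0.3, 1.0,  0.2],   # direction for "fleur"
])

def logit_lens(hidden_state):
    """Project an intermediate hidden state through the unembedding
    matrix and return the token with the highest logit."""
    logits = W_U @ hidden_state
    return vocab[int(np.argmax(logits))]

# Simulated hidden states for a French prompt: the middle layer sits
# near the English token's direction, the final layer near the French one.
middle_layer = np.array([0.9, 0.1, 0.10, 0.0])
final_layer  = np.array([0.1, 0.0, 0.95, 0.1])

print(logit_lens(middle_layer))  # English-leaning intermediate state
print(logit_lens(final_layer))   # target-language final state
```

In the study's finding, intermediate layers behaved like `middle_layer` here, favoring English-aligned representations even though the prompt and final answer were in another language.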

The Risks of Anglocentric Biases in AI

Experts warn that an overreliance on English in LLM training could lead to bias and a loss of valuable cultural insights, notes NIXsolutions. Carissa Véliz from the University of Oxford expresses concern about losing “concepts and nuances” specific to certain languages. Additionally, Aliya Bhatia of the Center for Democracy and Technology highlights the risk of “culturally irrelevant hallucinations” and biased decision-making in areas like asylum applications.

The study underscores the need for more research on multilingual processing in LLMs and the potential pitfalls of language bias in AI development.