AI-powered neural machine translation (NMT) has the potential to unite us across miles and languages. But there’s a dark side: it can also unwittingly amplify societal biases by producing translations that reinforce negative stereotypes. Let’s take a closer look at how that happens and at the steps we’re taking to make sure the technological advances we use produce inclusive results that reflect the diverse world we live in.
Understanding Bias in AI Translation
Bias in neural machine translation reflects the societal prejudices embedded in the data these systems learn from. This bias manifests in various forms, affecting cultural understanding, gender representation, and racial sensitivity.
- Gender bias: Gender bias is the most common type of bias in AI translations because of the varied ways different languages indicate gender. This can lead NMT engines to change the gender of words to conform to harmful stereotypes. For example, Google Translate has been caught erasing female historians and presidents, and male nurses, from translated content by defaulting those roles to the stereotyped gender.
- Cultural Bias: Cultural bias in AI translation occurs when biases common in specific cultures impact translated output. For example, in one experiment, public health materials about anxiety disorders that were neutral in sentiment in English became measurably more negative when translated by NMT into Chinese, Hindi, and Spanish.
- Racial Bias: Racial bias can surface through the misrepresentation of racial groups in translations, perpetuating harmful stereotypes. More blatantly, racial slurs and other offensive terms can creep in, as WeChat discovered some years ago.
How Does Bias Occur in AI Translation?
As we stated, NMT engines learn from vast datasets that inevitably reflect societal biases. Since NMT (and AI in general) works by processing that data, sensing patterns, learning from them, and replicating them, these biases become part of their output unless steps are taken to prevent it.
Let’s take a closer look at gender bias in NMT to see how this happens. Languages vary in how they encode gender. For languages that don’t add gender signifiers to occupations or roles, AI must "decide" a gender when translating into languages that do.
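To see how this plays out, here is a deliberately simplified sketch, not how any real NMT engine is implemented, of what “deciding” a gender can look like: when the source gives no signal, a system trained on skewed data falls back on whichever gender it saw most often. The occupations and frequency counts below are invented for illustration.

```python
# Toy illustration of how an engine ends up "deciding" a gender the
# source never specified. The counts are invented; a real NMT system
# learns these associations implicitly from millions of sentence pairs.

TRAINING_FREQUENCIES = {
    # occupation: (times seen with "he", times seen with "she")
    "doctor": (9200, 1800),
    "nurse": (700, 8600),
    "engineer": (8900, 1100),
}

def guess_pronoun(occupation: str) -> str:
    """Pick whichever pronoun co-occurred with the occupation most often."""
    he_count, she_count = TRAINING_FREQUENCIES[occupation]
    return "he" if he_count >= she_count else "she"

for job in TRAINING_FREQUENCIES:
    # A gender-neutral source sentence forces the system to choose:
    print(f"[neutral source mentioning a {job}] -> {guess_pronoun(job)} is a {job}")
```

Real engines learn these associations statistically rather than from an explicit table, but the effect on the output is the same kind of stereotyped default.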
Human translators have long wrestled with this conundrum. Previously, in many gender-inflected languages, there was a standard convention of generalizing gender by defaulting to the masculine inflection of the word—a gender-biased practice.
As society has changed to become more inclusive, the legacy practice of making the default gender masculine has created a two-fold problem for AI. First, because NMT engines are trained using existing content produced over the past few decades, they often rely on biased datasets to fill in the gaps.
Second, as some gendered languages evolve to become more inclusive, introducing gender-neutral terms and forms, AI struggles to adapt. These changes are progressive, but they introduce complexities that even human translators navigate with uncertainty and caution.
Another source of bias in AI translations is called “bridging.” When there isn’t enough data to train an engine for a specific language pair, English-language data is used as a “bridge” between the two languages. When text goes from a gender-inflected language into a more neutral language like English, information about gender is lost, and the AI engine isn’t always able to reconstruct it accurately.
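Here is an equally simplified sketch of how bridging can erase gender. The two translate functions below are stand-ins for real MT calls, and the lookup tables are invented; the point is only that the English pivot carries no gender marker for the engine to recover.

```python
# Toy sketch of "bridging" (pivot translation) erasing gender.
# translate_es_to_en and translate_en_to_fr are hypothetical stand-ins
# for real MT calls; the lookup tables are invented for illustration only.

def translate_es_to_en(text: str) -> str:
    # Spanish marks gender on the noun; English mostly does not.
    return {"la doctora": "the doctor", "el doctor": "the doctor"}[text]

def translate_en_to_fr(text: str) -> str:
    # With no gender left in the English pivot, the engine falls back
    # on whatever default it learned from its training data.
    return {"the doctor": "le docteur"}[text]  # masculine by default

source = "la doctora"                 # explicitly feminine in Spanish
pivot = translate_es_to_en(source)    # "the doctor" -- gender erased
target = translate_en_to_fr(pivot)    # "le docteur" -- masculine guessed back in

print(source, "->", pivot, "->", target)
```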
The Implications of Bias in AI Translation
Bias in AI translation impacts everyone, from individuals to businesses to society as a whole. When businesses produce translations that don’t meet current standards of inclusivity, their reputations can suffer. Studies also show that biased AI algorithms can reinforce human biases in the real world, influencing people’s behavior even after they stop using the AI program. Lastly, and perhaps most damaging, biases in AI translation can affect people directly and negatively: if a doctor, for example, receives translated information that paints people from a specific culture or race as “frustrated” instead of “depressed” (an error Google Translate has been shown to make when translating to and from Chinese), their diagnosis and treatment plans may be affected.
How ULG Handles Bias in Translation
The problems with AI bias are real, but so are the benefits of NMT: a faster translation process that makes global communication at scale easier than ever before. And there’s good news: it is possible to tap into these benefits without propagating unfair discrimination and harmful stereotypes.
Recognizing the challenges posed by AI bias, ULG has developed a comprehensive, four-pronged strategy to ensure our translations are both accurate and inclusive.
Creating a Guideline for Inclusive or Unbiased Language
Our first step is to align with our customers on guidelines for translations that champion inclusive and unbiased language. These guidelines direct reviewers on what type of language to use and on how to handle situations where bias could enter the translation process. With these guidelines as a foundation, we make sure that every piece of content we produce or refine through NMT respects and reflects the diversity of human experience.
Cleaning the Training Data and Fine-Tuning the Engines
We also understand that AI is only as good as the data it learns from. When we fine-tune the MT engines we use to suit our customers, we have the opportunity to use data that’s already been reviewed and corrected for bias.
By cleaning our training datasets to make sure they’re using data that reflects modern, inclusive language, we “teach” the system how we want it to handle ambiguous situations so that it defaults to a more inclusive alternative. That’s also why we strongly recommend using our customers’ own translation memories to fine-tune the engines—it shows the AI translation engine how to handle common situations where bias often crops up.
Properly prepared, “clean” customer-specific data is the foundation of an AI translation system that produces fair and unbiased translations.
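As a rough illustration of what that screening can look like, and this is a hypothetical sketch rather than ULG’s production tooling, the snippet below flags translation-memory segments where the target silently defaulted to a masculine job title even though the English source was ambiguous, so a linguist can review or rebalance them before fine-tuning. The term list and segment format are assumptions made for this example.

```python
# Hypothetical sketch of screening translation-memory segments before
# fine-tuning: flag target segments that default to a masculine job title
# when the English source is ambiguous. The term list and the segment
# format are assumptions made for this example.

MASCULINE_DEFAULTS = {
    "doctor": "el doctor",    # could equally be "la doctora"
    "lawyer": "el abogado",   # could equally be "la abogada"
}

def flag_for_review(segments: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (source, target) pairs where a gender was silently assumed."""
    flagged = []
    for source, target in segments:
        for occupation, masculine_form in MASCULINE_DEFAULTS.items():
            if occupation in source.lower() and masculine_form in target.lower():
                flagged.append((source, target))
    return flagged

tm = [
    ("The doctor will call you.", "El doctor le llamará."),
    ("Your lawyer signed the form.", "Su abogada firmó el formulario."),
]
print(flag_for_review(tm))   # only the first pair is flagged for human review
```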
Human Post-editing
Technology has its limits, and human insight remains irreplaceable. Just as in the physical world, bias in translation is often subtle. AI algorithms struggle with subtlety; it takes a human to spot problems accurately and correct them.
ULG employs expert linguists to carefully review and correct the raw outputs of NMT. This is called post-editing, and it’s the key to maintaining quality with MT. Human oversight helps guarantee cultural and contextual relevance along with accuracy.
Automated Post-editing
To complement our human editors, ULG leverages advanced automatic post-editing tools designed to detect and correct biases in translation outputs.
These tools, powered by the latest in AI and machine learning, provide an additional safeguard against the perpetuation of stereotypes and biases.
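A minimal sketch of the idea, using a small hand-made glossary instead of a trained model (the real tools are ML-based and far more sophisticated), might flag output where a gendered source term comes back with the opposite gender:

```python
# Minimal sketch of an automated bias check on MT output: a simplified,
# glossary-based stand-in for the ML-driven tools described above.

GENDERED_PAIRS = {
    # source term: (expected feminine rendering, masculine rendering)
    "chairwoman": ("presidenta", "presidente"),
    "businesswoman": ("empresaria", "empresario"),
}

def check_gender_consistency(source: str, target: str) -> list[str]:
    """Warn when a feminine source term shows up only in masculine form."""
    warnings = []
    src, tgt = source.lower(), target.lower()
    for term, (feminine, masculine) in GENDERED_PAIRS.items():
        if term in src and masculine in tgt and feminine not in tgt:
            warnings.append(f"'{term}' appears to have changed gender: found '{masculine}'")
    return warnings

print(check_gender_consistency(
    "The chairwoman opened the meeting.",
    "El presidente abrió la reunión.",   # the engine flipped the gender
))
```

Segments flagged this way can be routed back to a human post-editor instead of being shipped as-is.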
By integrating these four strategies, ULG harnesses the speed and efficiency of NMT while actively working to eliminate bias, ensuring that global communication through translation is not only fast but also fair and inclusive. Translation has always been about making the world more connected, not just through language but through the understanding and respect that come with it.
Building an Inclusive Future with AI Translation
When implemented with a deep commitment to fairness, MT can bridge cultures, communities, and individuals worldwide. We believe in the transformative power of MT done right—where quality, inclusivity, and accuracy are the standard.
Are you ready to tap into the full potential of MT, free from bias and full of possibilities? Let United Language Group build your MT program. Contact our experts today to embark on a journey to more inclusive and effective multilingual communication.