Machine translation is getting better all the time, but the problem of gender bias remains. Read these ten questions and answers to learn what it is, why it happens, and what can be done about it.
Gender bias happens when you need to translate something from a language where it’s gender-neutral into a language where it’s going to be gender-specific. For example, any sentence with the English word doctor in it, such as I am a doctor or this is my doctor, will be affected because there are two words for doctor in most European languages, one for ‘male doctor’ and one for ‘female doctor’. A human translator will usually figure out from context which translation is needed. A machine translator, on the other hand, normally has no context, and assumes a gender arbitrarily: often it’s the male gender. This is why Google Translate, DeepL and others tend to translate doctor as ‘male doctor’.
The sentence with doctor is an example of something called the male default: the machine assumes the word doctor refers to a male doctor, and ignores the possibility that it may be referring to a woman. Many words that describe people by occupation, such as doctor and director and teacher and bus driver, are biased by the male default in machine translation because these jobs have traditionally been held by men more often than by women. A small minority of words are biased in the opposite direction, by the female default: words like nurse, cleaner, cook. In each case, the computer simply assumes that, because it has been like that in the past, it is like that on this occasion too.
Computer translators usually ‘learn’ from large databases of texts that have been translated before by humans. If they see that bus driver has been translated as ‘male bus driver’ more often than ‘female bus driver’, they will pick that up as a habit and then always translate bus driver that way (unless some clues in the context force a different interpretation). The same goes for nurse being translated as ‘female nurse’ more often than ‘male nurse’, and so on.
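To make that concrete, here is a minimal sketch, with entirely made-up counts, of how such a ‘habit’ forms: tally how a phrase was translated in the training data and adopt the most frequent option as the default. Real systems learn statistical or neural models from millions of sentence pairs rather than running a simple counter, so treat this only as an illustration.

```python
from collections import Counter

# Made-up counts of how human translators rendered "bus driver"
# in a toy training corpus (the numbers are purely illustrative).
observed_translations = Counter({
    "bus driver (male form)": 8,
    "bus driver (female form)": 2,
})

# The machine's 'habit' is simply whichever option it has seen most often.
default_translation, count = observed_translations.most_common(1)[0]
print(default_translation)  # -> bus driver (male form)
```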
Yes, more or less. Machine translators end up being biased to interpret words as male or female because they have learned from biased data which ultimately comes from us, humans. But that’s not the whole story. Most machine learning algorithms are designed to overgeneralize, to over-favour typicality. So, even if nurse (without contextual clues as to gender) is translated as ‘female nurse’ only 75% of the time in the training data, the machine will end up translating it as ‘female nurse’ 100% of the time. In other words, machine translators don’t just replicate pre-existing biases, they amplify them.
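Here is a rough sketch of that amplification effect, again with made-up numbers: if the female form accounts for 75% of the training data and the system always picks the single most likely option, every ambiguous sentence comes out in the female form.

```python
# Illustrative numbers only: share of each form of "nurse" in the training data.
training_share = {"female nurse": 0.75, "male nurse": 0.25}

# A system that always picks the single most likely option ("argmax") ...
most_likely = max(training_share, key=training_share.get)

# ... will output that option for every ambiguous input sentence.
outputs = [most_likely for _ in range(1000)]
female_share = outputs.count("female nurse") / len(outputs)

print(f"female form in the training data: {training_share['female nurse']:.0%}")  # 75%
print(f"female form in the machine's output: {female_share:.0%}")                 # 100%
```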
That depends on what you mean by ‘big’. Most people, when they type a sentence like I am a bus driver into a machine translator and ask for a translation into another language, have themselves in mind: they want to translate a statement about themselves. If the user is a woman and the translation is worded as if a man were saying it, then that’s not OK. The translation is wrong, where by ‘wrong’ we mean ‘different from what the user intended’.
Of course, if the user speaks the target language, then she should be able to spot and correct the error. But if she doesn’t, then she may not even know that she has been given the wrong translation, and may end up talking about herself as a man: embarrassing! Also, consider the wider social impact: machine-translated texts are everywhere these days, and if we’re constantly reading about male doctors and female nurses and almost never the other way around, then this might create the subconscious impression in people’s minds that certain professions are more suitable for one gender than the other. A society which believes in gender equality should not take that lying down.
Yes, it is of course true that all machine translators come with a certain margin of error, and that you should never put unconditional trust in the translations they give you. But that shouldn’t stop us from trying to make that margin of error as narrow as possible. As it turns out, the problem of gender bias is fixable: we can let the machine know which gender we’re talking about, and have it translate accordingly.
No, not exactly. Machines are actually pretty good at doing that already. If you have a sentence where there is some context to help with gender disambiguation, such as she is a doctor or he is a nurse, then most machine translators will pick up on that and translate the words correctly (doctor as ‘female doctor’, nurse as ‘male nurse’). The problem is that in some sentences there simply is no context to help with that, such as I am a doctor or this is a nurse. There are no clues in those sentences to tip you off one way or another whether the doctors and nurses in question are male or female.
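As a toy illustration of what ‘picking up on context’ can mean, the hypothetical rule below looks for a pronoun in the same sentence; real machine translators learn such patterns from data rather than from hand-written rules, so this is only a sketch of the idea.

```python
import re

def gender_clue(sentence):
    """Toy rule: look for a pronoun that gives the gender away.
    Purely illustrative; real systems learn this from data."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    if words & {"she", "her", "hers"}:
        return "female"
    if words & {"he", "him", "his"}:
        return "male"
    return None  # no clue in the sentence itself

print(gender_clue("She is a doctor"))  # -> female
print(gender_clue("He is a nurse"))    # -> male
print(gender_clue("I am a doctor"))    # -> None: genuinely ambiguous
```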
Correct, we can’t. No artificial intelligence, however smart, can ever guess what gender people are if there are no clues in the input available to the machine. A human translator, when they realize they don’t know someone’s gender and need it for the translation, will try to find out, for example by asking.
Well, actually, they can! That’s exactly the idea behind Fairslator. Fairslator isn’t the kind of machine translator that tries to guess the unguessable. Instead, Fairslator is a tool which detects the presence of a gender ambiguity and asks you which gender you want. When translating I am a doctor from English into a language where there are two gender-specific words for doctor, Fairslator will ask you which one you mean, and then alter the translation accordingly. The thing is, to really fix gender bias, we must liberate ourselves from the misconception that there is always an immediately obtainable translation for everything. Sometimes, we must ask follow-up questions to clarify what the intended meaning is, before we can produce a translation into another language. It’s time machine translators learned how to do that, and Fairslator is leading the way.
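To illustrate the ask-don’t-guess idea, here is a hypothetical sketch; it is not Fairslator’s actual code or API, and the word list and German forms are illustrative assumptions. It detects a gender-ambiguous word, asks a follow-up question, and only then chooses the gender-specific translation.

```python
# Hypothetical sketch; the word list and German target forms are assumptions
# made for this example, not Fairslator's real data or interface.
AMBIGUOUS = {
    "doctor": {"male": "Arzt", "female": "Ärztin"},
    "nurse": {"male": "Krankenpfleger", "female": "Krankenschwester"},
}

def translate_with_follow_up(sentence):
    for word in sentence.lower().rstrip(".!?").split():
        if word in AMBIGUOUS:
            # Instead of guessing, ask the user which gender is meant.
            answer = ""
            while answer not in AMBIGUOUS[word]:
                answer = input(f"Is the {word} male or female? ").strip().lower()
            return f"(translation built around '{AMBIGUOUS[word][answer]}')"
    return "(no gender ambiguity detected; translate as usual)"

# translate_with_follow_up("I am a doctor") asks its question first,
# then chooses between Arzt and Ärztin based on the answer.
```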