Mexico City — Google Translate, which has helped break down language barriers and connect communities around the world, has added 24 languages, bringing its total to 133. Among the new additions are three Indigenous languages spoken in South America: Quechua, Guaraní and Aymara.
The addition of the new languages to the tool potentially means that 300 million more people will be able to communicate in a language other than their own, according to the technology company.
“These are the first languages we added using Zero-Shot machine translation, where a machine learning model only sees monolingual text, meaning it learns to translate into another language without seeing an example,” Isaac Caswell, a research scientist at Google Translate, said at the presentation of the new feature.
“We are very happy to say that we are including for the first time in the history of Google Translate the indigenous languages of the Americas,” Caswell said.
Aymara is spoken by around two million people in Bolivia, Chile and Peru; Guaraní by around seven million people in Paraguay, Bolivia, Argentina and Brazil; and Quechua by around 10 million people in Peru, Bolivia, Ecuador and neighboring countries.
Caswell explained that two criteria were taken into account when choosing the languages: that they have a large number of speakers, and that those speakers live in regions typically underserved by technology.
However, many Indigenous languages of Latin America are still not available, such as Náhuatl, spoken by around seven million people in Mexico, and Mayan, with a similar number of speakers in Mexico, Belize, Guatemala and Honduras, to name just two of the dozens of languages spoken across the region.
Caswell cautioned, however, that the quality of translations in the newly added languages “still lags far behind” that of other supported languages, such as English, Spanish and German, and admitted that the models “will make mistakes and exhibit their own biases.” He added that Google only included languages for which its AI systems met a certain threshold of proficiency.
Other languages added include Mizo, spoken by around 800,000 people in the far northeast of India; Lingala, spoken by some 45 million people in Central Africa; and Krio, an English-based creole spoken in Sierra Leone.
“For each of these languages, we made an effort to find native speakers and talk to them, and for most of these languages, we actually found employees at Google who are happy to not only help speak, but also contribute technically to this project,” Caswell said.