Hapax Googlegomenon

I was reading a German article on Spiegel about China’s economy and thought the information could be interesting for a friend in Venezuela who usually reads only in Spanish. As I didn’t have time to translate the article myself, I decided to let Google’s MT engine translate it for him, with Spanish as the target language. As I mentioned in a previous post, Google’s engine seems to use English as an intermediate step when translating between other language pairs. In any case, for this sentence:

“Der Zuwachs liegt zwar knapp über dem selbst gesteckten Wachstumsziel der Regierung von 7,5 Prozent, allerdings hatte sie in der Vergangenheit immer sehr vorsichtige Vorgaben gemacht, die am Ende meist deutlich übertroffen worden waren.”

the engine produced this:

“Although the growth is just above the self-imposed growth target of 7.5 percent of the government, but they had done in the past always very careful guidelines that had been surpassed at the end usually.”

I only want to talk about one issue here: the “translation” of German 7,5 into Spanish 7.5. That is wrong: standard Spanish uses a comma as the decimal separator and a point as the thousands separator, so it should be 7,5. Other fractions did get the correct Spanish form: 7.7 -> 7,7, 7.8 -> 7,8.
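To make the convention concrete, here is a minimal Python sketch that formats a number the way described above, with a point for thousands and a comma for decimals. The function name `format_spanish` is my own invention for illustration, and the sketch assumes plain decimal values (no exponents); it says nothing about how Google's engine works internally.

```python
from decimal import Decimal

# Translation table that swaps the two separators in one pass.
SWAP_SEPARATORS = str.maketrans({",": ".", ".": ","})


def format_spanish(value: Decimal) -> str:
    """Format a number with '.' for thousands and ',' for decimals.

    Hypothetical helper: it follows the convention described in this
    post, not any official locale database.
    """
    # Python's ',' format option produces the English layout,
    # e.g. '1,234,567.5'.
    english = f"{value:,}"
    # Swapping both separators at once gives the Spanish layout.
    return english.translate(SWAP_SEPARATORS)


print(format_spanish(Decimal("7.5")))
print(format_spanish(Decimal("1234567.5")))
```

With this, the growth target comes out as 7,5, and a larger figure such as 1234567.5 comes out as 1.234.567,5.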

I then tried some other examples with numbers that Google’s engine is unlikely to have seen translations for (very long fractions), and repeated the experiment between English and Spanish.

The problem seems to be that Google’s MT treats certain numbers as numbers and applies the necessary transformations, but in other cases it falls back on prêt-à-porter forms.

Google MT is apparently using some sub-standard Spanish translations for training: there seems to be a Spanish training text from a translator who was influenced by English. The engine may have decided that this one example trumps everything else. I think the company could do better than this. It wouldn’t be hard to find a general solution for these cases.
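As a rough idea of what a general solution might look like, the following Python sketch treats every English-formatted number in a text as a number and swaps its separators, regardless of whether that particular figure was ever seen in training. The regular expression and the name `localize_numbers` are assumptions of mine, not anything Google uses; a real system would also have to deal with dates, version numbers and comma-separated lists, which this sketch only partly avoids.

```python
import re

# English-formatted numbers: grouped thousands with an optional
# fraction (1,234 or 1,234,567.89) or a plain decimal (7.5).
# The lookarounds skip things like version strings (1.2.3).
NUMBER = re.compile(
    r"(?<![\d.,])"
    r"(?:\d{1,3}(?:,\d{3})+(?:\.\d+)?|\d+\.\d+)"
    r"(?!\d|[.,]\d)"
)

# Translation table that swaps the two separators in one pass.
SWAP_SEPARATORS = str.maketrans({",": ".", ".": ","})


def localize_numbers(text: str) -> str:
    """Rewrite English-style numbers in `text` in the Spanish style.

    Hypothetical post-processing step, applied to the text after
    (or before) translation.
    """
    return NUMBER.sub(lambda m: m.group(0).translate(SWAP_SEPARATORS), text)


print(localize_numbers("the growth target of 7.5 percent"))
```

Because the rule looks only at the shape of the number, 7.5 is handled exactly like 7.7 and 7.8, and a sentence-final full stop or a comma followed by a space is left alone.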
