Thursday, September 1, 2011

Machine Translation Companies and the Closed Kimono

We saw at the time of the Iraq war what it means to be in the know and the games people play when they say they’re in the know. So in the lobby of the House of Commons before the Iraq-invasion vote you had whips saying to the MPs, ‘If you’d seen what I’ve seen, you’d know how to vote!’
--John le Carré

Open the Kimono [def.]: Share information. Reveal.
[syn.] Disclose your sexual organ to the wolf.
[usage note] The douchebag who said this probably also said Baxtrapolate.

The (by now ritualistic) debate surrounding machine translation (MT) at present tends to go a little like this:

A translator complains: “MT is crap. Look at the crud I get from Google Translate.”

As if conjured out of thin air via Ouija board, an MT salesman immediately appears. He nods comprehendingly and responds: “Ahhhh, but you only know freely available systems. They’re rubbish. If you could see how good the multi-million dollar programs used in-house by companies are! You would be VERY impressed, my little friend. VERY impressed. They provide much better results.”

Call me crazy, but that reminds me of the arguments surrounding WMDs in Iraq. The anti-war movement pointed out that Saddam’s government couldn’t produce enough baby milk to feed its own population. How could Iraq possibly be close to making biological super-weapons, much less a nuclear bomb?

Politicians in Britain and the United States would nod in sympathy and respond: “Ahhh, but if you saw the intelligence I have seen… you… would… s**t yourself! I swear to GOD you would faint and then go home and cry in your little pink-painted bedroom like the puny little girl you are!”

I’ll come right out and say it: I was really surprised by the absence of WMDs in Iraq. I was totally, 100%, A-1 sure that the Allies would find a trove of creepy crawly stuff in Iraq. Why? Not because of the candidness of Bush and Blair et al. Nope, I certainly didn’t trust them. The slightly “holier than thou” schtick of choirboys such as Blair always makes me slightly suspicious. But I believed in the presence of WMDs basically because 1) Saddam was such a cartoon villain, 2) he had been obsessed for decades with super-weapons, and 3) he led the only modern government that had actually used mustard gas against human beings. My belief in the WMDs wasn’t ideological. It just made sense to me.

Yep, I thought it was a “slam dunk,” as CIA head George Tenet sadly blurted out (and one of his colleagues not-so-sadly leaked to the press after the ceiling had been splattered with crud by the proverbial fan). I was sure some Guard unit from Mississippi would show up on TV a few weeks after the invasion in containment suits, walking around a nuclear facility and showing the press the goods. I remember watching the Colin Powell presentation at the U.N. and I recall being surprised at the lack of one single piece of incontrovertible evidence, the so-called smoking gun. But I wasn’t too disturbed. I was convinced beforehand that Saddam had something up his sleeve. And I am pretty convinced that the Bush administration was just as sure that they would find something awful. That’s why I don’t think senior officials technically lied to get the U.S. into a war. They just stretched the evidence of something that they already believed in. They were not liars to the extent they were also unconsciously taken in by their own fraud, which prompted them to pressure their own intelligence analysts to spit up some really poor data by pasting pieces of garbage together to make some sort of gruesome collage. In their haste to convince others of something they should have had second thoughts about, they hyped it up and stretched it to the point that it became unrecognizable.

So I wonder if something similar isn’t happening here. After all, the MT Crowd tends to behave a lot like a religious sect. And they are asking everyone else to believe in the power of a technology whose effect is not visible but whose success is key for their financial futures (hardly conducive to rational discourse).

It is worthwhile pointing out three very telling facts:

1.- There is no commercially available machine translation application for the consumer retail market. Therefore, there is no practical demonstration of the effectiveness of “tailored” MT (to call it something). If there were a market for it, don’t you think the capitalist system would already have invented it?

2.- There is no freely available downloadable application of these allegedly superior machine translation engines either.

3.- The people who make sweeping assertions about the power of walled-garden MT engines usually have unstated economic interests (whether as employees or investors or otherwise) in the companies that make this software.

So every time I hear an MT spokesperson confidently saying that private (as opposed to freely available) translation software is much better, I get a headache, I black out and I have this strange dream in which I am sitting behind Colin Powell as he addresses the U.N. Security Council. The general is shaking a little vial of fake anthrax while saying:

“Mr. Secretary General, there is ample and incontrovertible evidence that the Iraqi government is possessed of fully functional machine translation technology. According to well-placed sources, Iraq could launch an intercontinental missile that speaks 18 languages and is capable of striking continental Europe in less than 45 minutes.”

MirkoP said...

Based on my experience with several in-house systems, there is no major difference between these and Google. Yes, when trained on our own data, they do produce better output, but the difference is not earth-shattering (although the difference may vary depending on the language pair, especially if your in-house system includes more linguistic features). The statistical systems all use the exact same algorithms, and the statistical MT experts at Google are just top-notch. Still, depending on your situation, the difference in quality may be sufficient to justify going with your own in-house system, or there may be other reasons that will make you prefer having your own solution.

Now, I'm looking at this from a slightly different angle than you because I do find MT output generally pretty impressive these days, and our experiments have shown significant and consistent productivity increases when translators post-edit MT output rather than translating from scratch. Post-editing productivity seems to be about the same as "translating" fuzzy matches in the range of 85%-95%. Unfortunately, that doesn't automatically imply that translators enjoy working that way; some do prefer post-editing but the majority probably don't. That's something we (as in the translation "industry") have to address one way or another. Otherwise it just won't be a sustainable solution, not because the technology is not good enough but because we will lack good post-editors.


Karl Hansen said...

My (admittedly very limited!) experience with post-editing suggests that 1) in-house systems are not better than Google Translate, and 2) the claim that translators should be able to post-edit 10,000 words per day is wildly optimistic.

Post-editing can best be described as proofreading a translation from a very bad translator who knows how to spell but also uses very awkward language, introduces several grammatical errors in the text, and makes a number of translation errors, including omitting or adding the word "not".

I believe I can proofread about 10,000 words per day or even more on a good translation. But the worse the translation, the longer it takes to proofread it - I think this is common knowledge. On a translation downloaded from Google Translate, I would be able to "proofread" (or post-edit) max. 4,000 words per day if I also had to enter my own corrections, and the translation would then need to be checked by another proofreader!

Miguel E. Llorens Musso said...

Both interesting opinions, but I tend to agree a bit more with Karl. I think the idea that GoogleT output is comparable to 85%-95% fuzzy matches is profoundly mistaken. It simply doesn't stand up to even the most impartial scrutiny. People who are pumping out 10,000 words a day using GoogleT as a first draft are simply producing mediocre translations.