Tuesday, June 5, 2012

Attack of the Killer Cucumbers: More on the Spanish Debt Crisis and Lower Quality Translation


Barbarino: That thing about the Great French Fry Phantom?
Kotter: You mean the Irish Potato Famine?
—Welcome Back Kotter


The need for speed in financial markets and the deceptive cornucopia of free information create the sensation that everything is available immediately. A parallel phenomenon is occurring in stock trading. As more and more trades are initiated by algorithms at greater speed and greater volume, more and more market breakdowns are occurring. Although no one can say for certain what is happening, at least part of the problem seems to be that computer systems can sometimes be overwhelmed by the amount of data that humans are trying to push through them. What lies in the future is no mystery: more and more speed bumps are going to be put in place by regulators on algorithmic trading to prevent crazy fluctuations. We already have automatic stops in many stock markets when a stock rises or falls too much. The referees turn off the system, suspend the stock, open the engine, and take a look to see what is wrong with the machine.

In translation, such technical fixes are not available. Our capacity to generate the linguistic equivalent of crazy stock prices is limited only by our common sense (always scarce) and the cost of fast machine translation (essentially zero).

In the age of the Content Tsunami, there is still too little information of decent quality available for investors who are interested in a foreign situation. The Internet and machine translation, though, create the deadly illusion that a savvy investor can go beyond the tiny amount of analysis produced by the Financial Times and The Wall Street Journal. Voilà. If you’re an analyst in a tiny boutique investment firm with two years of high-school French and you dated a Mexican girl from Amarillo in college, maybe you can use Google Translate to do the gisting of a few Spanish reports by the Bank of Spain or to parse one of Prime Minister Rajoy’s depressing statements (Machine-Translated Investment Research and the Spanish Debt Crisis). After all, any tiny bit of information (whether accurate or not) is necessary to get ahead of the crowd.

As in many other instances of how the Internet supposedly closes the gap between the tiny boutique firm and JP Morgan, this is a mirage. The big investment bank has a group of 20 or 30 Spanish analysts who speak very good English and are able to provide verbal or written summaries of information that often isn’t even written down. Moreover, these analysts are part of the local old boys' networks that communicate a lot faster and more secretively than anything on the Internet. So when you see a blog such as ZeroHedge trying to beat the market using machine translation, you have to smile a little.

As I have noted, ZeroHedge is very much invested in the whole foul-mouthed, white-collar macho Wall Street ethos of the cynical tough guy fighting alone in a Darwinian world. With all of ZeroHedge’s gleeful references to regular investors as Muppets diving over the Facebook IPO cliff, you have to wonder how their positions fare when they are caught out by some central bank decision or some European bailout plan because they don’t have access to off-the-record conversations with this Greek minister or that Spanish lawmaker (or even something as pedestrian as decent translations). I am betting that many a bloody Muppet massacre occurs behind the scenes that no one writes about. Maybe some of them are due to cheapo translation. 

In the markets, as in poker, the savvy player knows how to spot the sucker. The saying goes that if you can’t spot him, the sucker is probably you. And if you are using Google Translate for your investment research, the sucker is definitely you.

Now, mind you, even half-responsible people who honestly promote the virtues of automation usually add the caveat, about three-fourths of the way into their PowerPoint presentation, that technology should not be used to handle messages in which nuance is important. In my opinion, investment is one of those fields in which nuance matters (although I always wonder: in how many linguistic messages is nuance not important?).

An investment thesis is not data, after all. It may be based on data, but it is mostly a linguistic and conceptual construct. Allow me to use a very concrete example. Paul Kedrosky is a venture capitalist based in California who writes a popular blog called Infectious Greed. He is a very smart and successful investor who is well-read and writes interesting and funny stuff. But even he is prone to what we might call a naïve application of Lower Quality Translation.  

You may recall that around late May of last year, an outbreak of E. coli was detected in a shipment of Spanish cucumbers shipped to Germany. Normally, this would have been a rather typical spat in which a few borders are closed, European agriculture ministers mutter passive-aggressive insults, and everything is amicably resolved in some summit in which rather more caviar than cucumber is consumed. However, given the sensitivity over the Spanish debt problem, the cucumber problem suddenly popped up in the financial press.

Kedrosky went rooting around Spanish newspapers to see if he could get ahead of the market:
Germany and much of Europe are blocking Spanish cucumber exports on fear of the agricultural product’s connection to the outbreak of a virulent and dangerous form of E. coli. The variant has caused multiple deaths, and worries are increasing, particularly in Germany. 
What are the consequences? From a Spanish paper this morning: 
Spanish agrictultural [sic] trade is 3.8 billion euros, and the cucumber is 10 percent of total exports.
Ninety percent of production is exported.
The source cited is ABC, the more conservative of Spain’s three main broadsheets. The link (which is gone from the Bloomberg archive version I hyperlinked above but which I retrieved from my Google Reader) pointed to a Google Translate version of the Spanish article (the original non-translated version is here). Did Kedrosky link to the MT version because he wanted to be helpful to the reader or because he used the translated version to write his blog post? I really can’t tell you for certain. But one small detail suggests that he might have relied on the machine to formulate an investment thesis.

This is where Kedrosky gets in trouble: “Spanish agrictultural trade is 3.8 billion euros, and the cucumber is 10 percent of total exports.” That is a little ambiguous. If you don’t know the first thing about Spain, is 3.8 billion euros a lot or a little? Moreover, does “10 percent of total exports” mean: A) “10 percent of all the stuff Spain exports” (i.e., a lot) or B) “10 percent of all agricultural exports” (i.e., still a lot, but considerably less than A)? The translation doesn’t really provide any firm answer. But look at the subheading. It states the following in the MT version: “90% of production is exported and cucumber sales abroad suppose 10% of total vegetable.” That is a mangled (Google Translate) version of this statement: “El 90% de la producción se exporta y las ventas de pepino en el exterior supon [sic] el 10% del total de legumbres y hortalizas.” Aha. So it is not 10% of all exports. Not even 10% of agricultural exports. It is 10% of exports of vegetables (!). But because the unambiguous sentence was mangled in the MT version, Kedrosky fixated on the sentence that was more badly written but better translated, and that sentence contained a fantastic claim. (Note the typo in the Spanish sub-headline and the brevity of the ABC item: this was obviously written at high speed in order to make some deadline or to put something up on the newspaper’s homepage; the figures may have been slapped together haphazardly at the last minute or may have been taken from outdated sources; a bilingual reader weighing all of this non-linguistic information might have warned a researcher to dig further.)

It was actually much ado about nothing. Spanish cucumber sales abroad account for only 10 percent of the country's total vegetable exports. That is only 380 million euros, which is a paltry 0.15% of total Spanish exports. That is far from a decisive tipping point in a trillion-euro crisis.
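For readers who want to check the arithmetic, here is a back-of-the-envelope sketch using only the figures quoted in this post. It treats the 3.8 billion euros as the base for the 10 percent, which is how the 380 million figure is reached, and it backs out the total-export figure implied by the 0.15% share. That implied total is an assumption derived from the post's own numbers, not a sourced statistic, and the variable names are purely illustrative.

```python
# Figures quoted in the post; nothing here comes from an outside source.
BASE_TRADE_EUR = 3.8e9            # "3.8 billion euros" cited from ABC
CUCUMBER_FRACTION = 0.10          # cucumbers as 10% of that base
SHARE_OF_TOTAL_EXPORTS = 0.0015   # the post's "paltry 0.15%"

# 10 percent of the 3.8 billion euro base
cucumber_exports_eur = BASE_TRADE_EUR * CUCUMBER_FRACTION

# Assumption: total Spanish exports implied by the 0.15% share above
implied_total_exports_eur = cucumber_exports_eur / SHARE_OF_TOTAL_EXPORTS

# How much the misreading ("10% of all exports") overstates the real share
overstatement_factor = CUCUMBER_FRACTION / SHARE_OF_TOTAL_EXPORTS

print(f"Cucumber exports: {cucumber_exports_eur / 1e6:.0f} million euros")
print(f"Implied total exports: {implied_total_exports_eur / 1e9:.0f} billion euros")
print(f"Overstatement: about {overstatement_factor:.0f}x")
```

On these numbers, reading the figure as 10 percent of all exports inflates the cucumber's weight in the Spanish economy by a factor of roughly 67.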

That little mistake marks the difference between being the investor hero who makes a winning cucumber call and being the blogger zero who raises the alarm about a cucumber-fueled financial panic.

To go from 10% of total exports by one of the largest economies in the world to a little over a tenth of one percent of total exports is nothing more than a little nuance. So then: is this use of Lower Quality Translation for gisting justified? Well, I guess it is justified if you get it right. But that is a mighty big “if.” The frequency with which amateur users mess up when using the technology (and please note that Kedrosky is a highly sophisticated observer of both technology and the markets) should highlight the fact that proselytizing in favor of cheap and quick translation can often be tantamount to placing razor-sharp blades in the hands of hyperactive, over-caffeinated chimpanzees.

Miguel Llorens is a freelance financial translator based in Madrid who works from Spanish into English. He specializes in equity research, economics, accounting, and investment strategy. To contact him, visit his website and write to the address listed there. Feel free to join his LinkedIn network or to follow him on Twitter.

2 comments:

John said...

"In translation, such technical fixes are not available." Well, not unless you count automated project management using expressIt's artificial intelligence algorithms. Don de Palma calls it transactional translation. Any thoughts about marketing high quality translation to an algorithm?

Miguel Llorens M. said...

Who would want to market human translations to their algorithm? It would be superfluous since, after all, "expressIt provides all the depth, nuance and quality of the world's best quality human translators, delivered with unprecedented speed resulting from powerful automation technology."