Tuesday, February 28, 2012

Seth Godin, Sturgeon’s Law, and the Content Tsunami

(Homer hits zombie on the head with a book.)
Lisa: Dad! Wait! Stop! That’s the last book in the world!
(They look at the cover. It is the memoirs of Arsenio Hall.)
Lisa (changing her mind): Ahh… knock yourself out…
(Homer crushes the zombie’s skull with the book.)
-- The Simpsons, “Treehouse of Horror XX”

Seth Godin writes in a blog post that, as a teenager, he read all 250 science-fiction books in his high school library (“From Asimov to Zelazny”). He reflects that reading the entire corpus of a single genre is impossible nowadays because of (say it with me) the Content Tsunami:
As the deluge of information grows and choices continue to widen (there's no way I could even attempt to cover science fiction from scratch today, for example), it's easy to forget the benefits of acquiring this sort of (mostly) complete understanding in a field.
Now, Godin is well worth reading, but he is a Web 2.0 hypester and a naïve technology millennialist if ever there was one. I have news for you, Seth. Even in the 1970s, reading 250 sci-fi books barely scratched the surface of the genre.

Godin’s comment is typical of how techies erroneously view the current changes in media: “People used to read, but the book is dying; we are being buried under a mountain of written material broadcast by the Web; everything is changing quickly; some day soon we will have Google implants in our frontal cortex; the keyboard will be relegated to the dustbin of history.” In this specific example, the belief is that, once upon a time (usually twenty, thirty, or forty years ago; fill in the blank), there was much less reading material and a solitary reader could obtain a first-hand view of all the literature in any given field.

That is just not the case. And it hasn’t been the case for a very long (looooong) time. Nowhere is this truer than in the realm of literature. Any reader in 1850 who wanted to read all the novels published in Great Britain up to then would have needed three or four lifetimes to do so. This challenge has always daunted literary criticism. You see, even the most erudite critic has not read the whole of even the first-rank literary classics. Many nineteenth- and twentieth-century specialists don’t have that much regard for the classics of previous centuries. I have personally ascertained that Old English specialists at leading universities do not have a lot of time to read the latest 800-page brick from Jonathan Franzen or Haruki Murakami. The humanities, like the sciences, are becoming increasingly compartmentalized. That is undeniable (and, incidentally, also one of the factors behind The Great Stagnation). However, my point is more wide-ranging than that.

My point is that even people specialized in relatively narrow periods (say, the British novel from the 1850s to 1914) are still facing a Content Tsunami. The Victorian scholar who wants to read all of the stuff vomited by 19th-century printing presses will still find a mountain to climb. Even the Victorian specialist really only bases her sweeping theses on a modest “sampling” of all the stuff produced in the Age of the Novel. The Content Tsunami has always been with us. That is why gifted readers such as Harold Bloom or Umberto Eco tower above the rest of us (not just for their acumen but also for their sheer ability to digest mountains of books). And even they can sometimes look a little amateurish when straying out of their fields of expertise. Personally, I’m not a big fan of Bloom, but he obviously has read everything he discusses. However, when he adds the odd Latin American author to his indigestible books on the Western Canon, it is easy to see how ill at ease he is outside his comfort zone. I mean, he probably read Vargas Llosa, but the level of enthusiasm simply is not there.

As an undergraduate assistant, I helped the philosophy department index purchases made for the library. They included many (many) tomes of Harvard University Press’s Loeb Classical Library. If you thought classical literature only produced Homer, Virgil, the Athenian dramatists and a few fragmentary poets (basically what I was taught in Literature of Latin and Greek Antiquity), boy, were you wrong! Those people in the pre-Christian era didn’t have much papyrus or abundant ink, but they sure had a lot of time on their hands! Reading the whole of what was written just in Western Europe before the fall of Rome or the birth of Charlemagne would take up a good chunk of your life.

Readers have always been tiny little wanderers upon a Himalaya of linguistic output. People who have not studied the humanities are astounded by the current masses of text produced by other human beings, but that sense of awe stretches back much farther (perhaps in oral cultures there was also a Content Tsunami; maybe somewhere there was a hunter-gatherer oppressed by the sheer amount of epic poetry he had to listen to). It is a sign of ignorance to think that this is a new phenomenon. Think about this for a moment: What we have inherited from Antiquity is merely the tip of the iceberg of what was actually written. This tip (all the Ancient writing still extant) is all that is left of many hundreds and hundreds of volumes that were lost down the ages in that great chain of transmission (and destruction) from Greece to Rome to early Islamic culture to Medieval Spain, up through the Italian Renaissance and beyond. A lot of stuff was considered too insignificant, erroneous, or blasphemous, and was either obliterated or recycled (see: palimpsests). This creates an instance of what economists call “survivorship bias.” Seeing only the works that made it through, the time-bound technologist thinks that readers in Antiquity didn’t have to exercise critical faculties in deciding what was worth reading.

The challenges that search engines seek to solve have always been with us. There is even an entirely new discipline, pioneered by literary critic Franco Moretti, that seeks to run massive corpora of novels through computers in order to map them out statistically. The hope is that this will provide insights that are unattainable by individual readers. As Moretti tells it:
“A canon of 200 novels, for instance, sounds very large for 19th-century Britain (and is much larger than the current one), but is still less than 1 per cent of the novels that were actually published: 20,000, 30,000, more, no one really knows -- and close reading won't help here, a novel a day every day of the year would take a century or so.”
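Moretti's back-of-the-envelope figures are easy to check. Here is a minimal Python sketch of the arithmetic. The function names are my own, and the totals of 20,000 and 30,000 are simply the two guesses he mentions, not established counts.

```python
DAYS_PER_YEAR = 365


def years_to_read(novels, per_day=1.0):
    """Years needed to read `novels` books at `per_day` books a day."""
    return novels / (per_day * DAYS_PER_YEAR)


def canon_share(canon_size, total_published):
    """Share of everything published that the canon covers, in percent."""
    return 100.0 * canon_size / total_published


# The two totals are Moretti's guesses; the canon size is his 200 novels.
for total in (20_000, 30_000):
    years = years_to_read(total)
    share = canon_share(200, total)
    print(f"{total:,} novels: {years:.1f} years at one a day; "
          f"a 200-novel canon is {share:.2f}% of the total")
# 20,000 novels: 54.8 years at one a day; a 200-novel canon is 1.00% of the total
# 30,000 novels: 82.2 years at one a day; a 200-novel canon is 0.67% of the total
```

On these numbers, his "century or so" corresponds to roughly 36,500 novels, which sits at the "more" end of his range.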
Obviously, there is a geometrical progression in the amount of written material. That is undeniable. But that progression (and the accompanying anxiety) has always been with us. To extrapolate from this geometrical progression to make trendy assumptions about business or culture is superficial. Seth Godin might have read 250 sci-fi books as a pimply high-school freshman but, even then, the corpus of fantasy and futurist fiction was far larger than that. Godin only swallowed a tiny sampling, impressive as it may be.

In fact, it was science fiction that gave rise to an indispensable analytical tool for discussing the Content Tsunami: Sturgeon’s Law. When someone remarked that “ninety percent of science fiction is crap,” the science-fiction writer Theodore Sturgeon famously retorted that “using the same standards that categorize 90% of science fiction as trash, crud, or crap, it can be argued that 90% of film, literature, consumer goods, etc. are crap.” Brilliant! Sturgeon's quip penetrates to the heart of any discussion about the mountains of text that surround us. It reminds us that Godin is wrong on two counts: 1) he is wrong about the past, in believing that he read most of the science-fiction corpus up to that time; and 2) he is wrong about the present, in believing that the daunting task of managing all the information that flows our way is qualitatively different from what has been the task of the literate person from Parmenides to the present.

The challenge is not how to “manage” the deluge of “information” that washes all over us. That is a phantom that only exists in the minds of people not trained in the art of critical thinking. The point of education is basically to provide us with the skills that guide us through the ocean of sentences produced by fellow human beings. Real value lies in identifying the 10% (or, IMHO, less than 10%) of the Content Tsunami that is worth reading (or translating). Real value lies in helping others discover the margarita in the midst of the refuse produced by the porcos. The belief that this task can be delegated to a search algorithm or a translation engine says more about you than about the real world.

Miguel Llorens is a freelance financial translator based in Madrid who works from Spanish into English. He specializes in equity research, economics, accounting, and investment strategy. He has worked as a translator for Goldman Sachs, the US Government's Open Source Center, and H.B.O. International. To contact him, visit his website and write to the address listed there. You can also join his LinkedIn network by visiting his profile, or follow him on Twitter.


Jordi Balcells Antón said...

I thought it unlikely that someone like Mr. Godin would have said that he could read most of sci-fi when he was in high school, so I had to go and check the reference. He does say something like that. Hmm. I am not a fan of his at all, I do not even read him, but he is supposed to be kind of a big shot in the intertubes. Hmm.

Anyway, he is right in one regard. It was easier to read the important books in one genre before ebooks than after their coming. With the Web 2.0 and all, everyone's a writer and can potentially reach millions. But those millions have a harder time picking out the good stuff. Thankfully, there is still only one novel getting a Hugo or a Nebula per year, though, so that keeps things under control, partially.

Anyway, algorithms are useful if they are properly thought out and are provided with enough data. I can compare my library with a like-minded friend's on Goodreads and see which books I might be missing. Amazon does a good job over time at suggesting stuff I might like, comparing my past purchases with other people's. Still, the best way to get me to read a book is to recommend it to me personally, but machines do help.

Miguel Llorens M. said...

That is where we differ in a generational sense. I am much less impressed by social media algorithms. I have been feeding information to Amazon for well over a decade and its recommendations still suck. It is the stupidest algorithm I have ever seen. I type Jean-Paul Sartre "Nausea" and I get a recommendation "maybe you would like Albert Camus's 'Myth of Sisyphus.'" Seriously, we needed half a billion dollars of research into computer science for that? That's the brave, new world of the future? Please...