Tuesday, January 24, 2012

my annual SOTU & word clouds rant

 Good to revisit: word clouds are the Fox News of linguistics (see here and here)

CompuPolitics or Wonkonomics?

Philip Resnik, computational linguist extraordinaire, has requested name candidates for a “new domain of activity” he described recently in a Language Log post.

But naming is not easy. It’s fun to mock bad names like Netflix’s dud Qwikster, but try coming up with a better one. Go on. I’ll wait …

Yeah, see. Not easy.

Money quote:
“…people are starting to suggest that volume and sentiment analysis on tweets … might produce useful information about people's viewpoints, or even predict the success of political campaigns. Indeed, it's been suggested that numbers derived from Twitter traffic might be better than polls, or at least better than pundits.”

So, what to call this pursuit? Resnik suggests CompuPolitics. But, I mean, really? It’s so 90s. Evoking CompuServe. It’s dead... stale... dry. The stench of moth-infested dust permeates my nostrils just reading it.

Resnik gave it a shot. He misfired, but I respect the effort. So what are we looking for in a name? An analogy could be drawn with Culturomics, the Google Ngrams-inspired NLP endeavor. But let's be honest, that's a gawd-awful name reeking of middle-aged literary professors amazed that Eliza talks back. Rather, a better analogy might be Freakonomics, one of the best academic coinages of the last few decades (maybe ever). It has captured the spirit of its practitioners in a way that is immediately obvious to the lay person*.

But just what is Resnik trying to name? His Language Log post seems to be referencing a new field of academic pursuit, like behavioral economics or bioinformatics. But it’s also deeply embedded in industry practices which evoke terms like big data and text analytics. Again, this suggests that Freakonomics is indeed an apt analogy, as it refers not just to the style of analysis, but also to a for-profit business model involving books, a NYT blog, and now a movie.

Let’s assume Resnik’s goal is to find a name for a business-savvy methodology. Let us recognize that many academic fields we currently take as meaningful evocations of deep thinking actually have ridiculous names (the prevalence of Latin and Greek terms and blends alone should make us all giggle).
  • Philosophy – blend of Greek philo- "loving" and sophia "knowledge" (‘love-knowledge’?? I mean, how stupid is that…)
  • Biology – blend of Greek bios "life" and -logia "to speak" (what? ‘life speaks’? huh? ‘the talking life’? dumb dumb dumb)
  • English – I mean, frik, really? Is this a real academic department? Next, you’ll tell me there’s a real Department of French and Romance Philology.
To coin a new term, we want some understanding of the themes we want the new term to evoke. Here’s a stab at what I think Resnik is looking for:

Technical Themes:
  1. Computational (oooh, “science”)
  2. Political (ugh, windbags)
  3. New (oooh, shiny)
  4. Search/discovery (ya mean like the googles?)
Emotional Themes:
  1. Pulse of opinion (I care about what other people care about)
  2. Gravitas (really smart people care about this)
  3. Better than polls or pundits (I hate them anyway, now I know why)
  4. Value (who doesn’t love a bargain)
  5. Honesty (fake tweets and Twitter-bots are not my friends)
Big Picture Themes:
  1. Rising above the noise (Needle? Check. Haystack? Check.)
  2. Making a difference (like Morgan Freeman in Lean on Me)
  3. Finding the truth (“the truth is out there”)
  4. This is the future, and it’s good (like Justin Bieber?)
Now, to find a single word or phrase that captures all of that at once… hmmmm. Here's a half-hearted attempt:

Name Candidates

Literal (typically the most clunky, clumsy, and goofy, worth avoiding)
  • CompuPolitics – blend of computer and politics
  • SentiMent – riff on the single word sentiment
  • Sentics – blend of sentiment and politics
  • Poliments – blend of politics and sentiment
  • PoliTude – blend of politics and attitude
  • Crowditude – blend of crowd and attitude
  • PoliInformatics – blend of politics and informatics à la bioinformatics
Associative
  • Flutter – à la “Twitter”, this evokes temporary changes in sentiment captured by the technology.
  • CanaryStats – evokes the canary-in-a-coal-mine analogy
  • Wonkonomics – fun, professional, evokes both George Stephanopoulos and Willy Wonka in the same breath
Homage
  • Freakoment – blend of Freakonomics and sentiment (I mean, fuck it, those guys are making bank, right?)
  • Groupthink – a little creepy, but ya know, Orwell knew language…
Clearly, I favor the awesomely evocative Wonkonomics. But you be the judge...

* Let us put aside the inevitable debate over whether or not Freakonomics is good science. It’s a great name.

A linguist asks some questions about word vectors

I have at best a passing familiarity with word vectors, strictly from a 30,000 foot view. I've never directly used them outside a handfu...