Friday, July 30, 2010

when good physicists go bad...

According to a recent article in The New Scientist, the ground-breaking physicist Murray Gell-Mann* has been spending his twilight years (he's now 80) trying to "trace the majority of human languages back to a common root." This tower of babel quest is roughly the Grand Unified Theory of linguistics. And just like the GUT of physics, it is about as fruitless. Many have tried, all have failed. Where Einstein (as well as many others) failed with GUT, Murray Gell-Mann is likely to fail in linguistics. It may be the case that all human languages are descended from a single progenitor (doubt it), but I see no reason to believe there exists evidence to prove it one way or the other. Language has simply not been preserved the way fossils have and language doesn't have the sort of physical reality that elementary particles have. So chasing that tail is going to end up exactly the way all tail-chasing adventures end.

*fyi, his ground-breaking work involved categorizing elementary particles.

Thursday, July 29, 2010

what's the singular of 'feces'?

The Economist's language blog takes a crack at an answer (pun quite intended).

metalinguistically unpossible

Nate Silver uses the amusing term "unpossible" in this post: It's Like Mathematically Unpossible for Republicans to Win the House, or Something. There are a few Urban Dictionary definitions, such as Even more impossible than impossible. Quite possibly the most impossiblest thing in the world. All three of the examples given seem to be pretty meta-linguistic (in the sense that the speakers are likely aware of the semantic contradictions within the word and are intentionally employing the contradictions for effect; call it Seinfeld Semantics).

Wednesday, July 28, 2010

verb valencies

A new online version of the 2004 book A Valency Dictionary of English has recently gone live. I haven't had a chance to play with it, but it looks like it has some good data about verb patterns. If you're into that kinda thing, I mean.

on linguistic competition

CoffeeTeaLinguistics posts an interesting thought experiment: No language requires another language to survive, or any other languages for that matter.

My response involved an extension of the isolate thought experiment culminating in my wondering what the implications are if the following two claims are true simultaneously:

1. No language requires another language to survive.
2. No language can remain stable for more than one generation.

Read more at CoffeeTeaLinguistics.

Tuesday, July 27, 2010

redundant acronym syndrome syndrome

Kottke faces his demons and admits he suffers from RAS Syndrome (redundant acronym syndrome syndrome). Of the three "linguistic explanations" given on the wiki page, I found #1 to be highly unlikely (it assumes a ridiculousnessly high level of meta-linguistic analysis on the part of the average speaker); #2 and #3 were roughly the same: Acronyms are treated as unanalyzed lexemes.

Monday, July 26, 2010

more on language death, ctd.

Razib continues his exploration of the interplay of linguistic diversity/homogeneity and socio-economic disparity/prosperity. In a response to a comment, he posted his "bullet points" on

1) a common language across large spaces fosters economies of scale. in other words, it fosters coordination and cooperative by reducing the friction in flow of information.

2) a common language also removes some of the raw material for intergroup conflict. conflict is zero sum, #1 is ideally non-zero sum.

Saturday, July 24, 2010

Czech Tongue Twister

Strč prst skrz krk

I'll yogurtize this as "Scott passed us koss cock"

HT: Czechly 

Cameron's English is "clipped"

The Economist's quasi-lingo-blog Johnson analyzes the speech of David Cameron. They discover that it is
  • casual
  • not sloppy
  • distinctive
  • clipped
  • concise
  • confident
  • leaves formal grammar struggling in his wake
  • micro-phrasal
I find the Economist's analysis sloppy, lacking distinction, whatever the opposite of "clipped" is, meandering, tedious, sniveling, macro-phrasal...sigh.

Friday, July 23, 2010

more experiments

Harvard's Games with Words site has a new game up!

Translation Party
Idea: type in a sentence in English. The site then queries Google Translate, translating into Japanese and then back again until it reaches "equilibrium," where the sentence you get out is the sentence you put in. Some sentences just never converge. Ten points to whoever finds the most interesting non-convergence.
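For the curious, the round trip described above is a fixed-point search, and a minimal sketch of it fits in a few lines of Python. Everything here is an assumption for illustration: the function name find_equilibrium, the round limit, and especially the two toy "translators," which are just lookup tables with made-up codes (X1, X2, X3) standing in for a real translation service. This is not how the Translation Party site is implemented.

```python
from typing import Callable, List, Tuple


def find_equilibrium(
    sentence: str,
    to_foreign: Callable[[str], str],
    to_english: Callable[[str], str],
    max_rounds: int = 20,
) -> Tuple[List[str], bool]:
    """Round-trip a sentence until the output equals the input.

    Returns every English version seen, plus a flag that is True only if
    a fixed point ("equilibrium") was reached within max_rounds.
    """
    history = [sentence]
    for _ in range(max_rounds):
        round_trip = to_english(to_foreign(history[-1]))
        if round_trip == history[-1]:
            return history, True
        history.append(round_trip)
    return history, False


# Hypothetical toy translators: word-for-word lookup tables, nothing more.
# "big" and "large" collapse to one code, while "hot" and "warm" swap places.
TO_CODE = {"big": "X1", "large": "X1", "hot": "X2", "warm": "X3"}
FROM_CODE = {"X1": "large", "X2": "warm", "X3": "hot"}


def toy_to_foreign(text: str) -> str:
    return " ".join(TO_CODE.get(word, word) for word in text.split())


def toy_to_english(text: str) -> str:
    return " ".join(FROM_CODE.get(word, word) for word in text.split())


if __name__ == "__main__":
    print(find_equilibrium("big dog", toy_to_foreign, toy_to_english))
    print(find_equilibrium("hot tea", toy_to_foreign, toy_to_english, max_rounds=6))
```

With these toy tables, "big dog" settles on "large dog" after one round, while "hot tea" flips between "hot tea" and "warm tea" forever. That second case is one way a sentence can fail to converge: the round trip falls into a cycle instead of a fixed point.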

Thursday, July 22, 2010

Andrew Sullivan is the Sarah Palin of science

Again and again, Andrew Sullivan impresses me with his utter incompetence at any and all things scientific-ee. For all his tirades against Sarah Palin, it's ironical that he can, in less than tongue-in-cheek manner, be accused of being the Sarah Palin of science. Yet again he posts an easily falsifiable scientific claim (regarding conversational analysis, or lack thereof...), fails to do even the most basic Google search, then comments on the false claim as if it were true. He would NEVER accept this kind of behavior from a political blogger, but he routinely engages in this himself when it comes to science.

It begins with Scott Adams, creator of Dilbert, posting this sentence on his own blog: "A conversation, like dancing, has some rules, although I've never seen them stated anywhere." Any first year linguist, anthropologist, English major, linebacker, Starbucks barista, etc. would see this statement and say, "hmmm, that seems wrong. I can't believe no one has ever studied conversation from a scientific standpoint. Let me Google around a bit and see what I can find..." Sullivan didn't do this. He just took a passage from another blogger, uncritically pasted it into his own rather large megaphone, then added his own misguided, largely wrong comment.

What that little bit of Googling would have given you, dear Scott and Sully, was the fact that there is a rather long history of conversation analysis within linguistics, sociology, anthropology, and now even computational linguistics.  There's a fucking Wiki page for fuck's sake!

And yes, people have been trying to write down the "rules" of conversation for a long time. They even have a name for them: turn taking. Though attempts at defining the "rules" of turn-taking have been fraught with problems, nonetheless scholars and scientists have been trying.

Here's a brief and incomplete but representative list of some freely available papers and resources on the science of conversation analysis:

  • A Computational Architecture for Conversation (Microsoft, pdf).  We describe representation, inference strategies, and control procedures employed in an automated conversation system named the Bayesian Receptionist. The prototype is focused on the domain of dialog about goals typically handled by receptionists at the front desks of buildings on the Microsoft corporate campus.
  • Turn taking in conversation is universal (Max Planck institute for Psycholinguistics): Do people take turns in natural conversation in the same basic way in all languages, or does the turn-taking system vary in each language? Many anthropologists have suggested the latter, but MPI-researchers have found empirical evidence for robust universals in human conversation. Their study appears in this week's Proceedings of the National Academy of Sciences.
  • Speaking while monitoring addressees for understanding (pdf, Stanford): Speakers monitor their own speech and, when they discover problems, make repairs. In the proposal examined here, speakers also monitor addressees for understanding and, when necessary, alter their utterances in progress. Addressees cooperate by displaying and signaling their understanding in progress.
  • Sequencing in Conversational Openings (UCLA): An attempt is made to ascertain rules for the sequencing of a limited part of natural conversation and to determine some properties and empirical consequences of the operation of those rules.
As for Sullivan's contribution ("I think of it as a friendly tennis match. There is no attempt to score a point or win a match"), this is more a function of his perception of a conversation than the reality. Conversations are always governed by goals and there is competition for the floor inherent in the interaction. It would be interesting if Sullivan would post a lengthy example of one of his friendly tennis match conversations and let a CA scholar have a go at analyzing the content. My guess is that we would find that Sully's friendly tennis match is more Serena v. Venus than he's willing to admit.

the psychological reality of truthiness?

New research out of U. Chicago looked at the effect of foreign accents on trust. The brief Flash Report Why don't we believe non-native speakers? (PDF; full citation below) found that "People judged trivia statements such as “Ants don't sleep” as less true when spoken by a non-native than a native speaker." There's a cline of truthiness because the researchers did the following: "Participants listened to each statement and indicated its veracity on a 14 cm line, with one pole labeled definitely false and the other definitely true. We measured the distance from the false pole in centimeters, so a higher number indicates a more truthful statement."

I found this to be an interesting design idea. Don't force people to make a clear decision about truth value. Frege and Russell be damned, haha! Have these researchers discovered the psychological reality of truthiness?

On a more serious note, they begin the article with a review of all the ways that processing fluency affects linguistic stimuli judgement. From the paper (reformatted for ease of reading):

Stimuli that are easier to process are perceived as
  • more familiar
  • more pleasant
  • visually clearer
  • longer and more recent
  • louder, less risky
  • more truthful
For example, people judge “Woes unite foes” as a more accurate description of the impact of troubles on adversaries than “Woes unite enemies,” because the rhyming of woes and foes increases processing fluency. Similarly, people judge the statement “Osorno is in Chile” as more true when the color of the font makes it easier to read.
Lev-Ari, S., & Keysar, B. (2010). Why don't we believe non-native speakers? The influence of accent on credibility. Journal of Experimental Social Psychology. DOI: 10.1016/j.jesp.2010.05.025

Wednesday, July 21, 2010

more on language death

Razib continues his thoughtful discussion of the interplay of linguistic diversity/homogeneity and socio-economic disparity/prosperity.

Money quote:
If you have a casual knowledge of history or geography you know that languages are fault-lines around which intergroup conflict emerges. But more concretely I’ll dig into the literature or do a statistical analysis. I’ll have to correct for the fact that Africa and South Asia are among the most linguistically diverse regions in the world, and they kind of really suck on Human Development Indices. And I do have to add that the arrow of causality here is complex; not only do I believe linguistic homogeneity fosters integration and economies of scale, but I believe political and economic development foster linguistic homogeneity. So it might be what economists might term a “virtual circle.” (emphasis in original)

I have a long history of discussing language death on this blog and my position can be summed up by this Q&A I had with myself:

Q: Is language death a separate phenomenon from language change?
A: In terms of linguistic effect, I suspect not

Q: Are there any favorable outcomes of language death?
A: I suspect, yes (Razib proposes one)

Q: How do current rates of language death compare with historical rates?
A: Nearly impossible to tell

Q: What is the role of linguists wrt language death?
A: One might ask: what is the role of mechanics wrt global warming?

Tuesday, July 20, 2010

on the evolving language of headlines

Gene Weingarten wrote up a nice rant about the evolution of headlines from the era when print headlines were meant to grab a reader's eye to the modern era where headlines are meant to juice SEO.

Money Quote:
Newspapers still have headlines, of course, but they don't seem to strive for greatness or to risk flopping anymore, because editors know that when the stories arrive on the Web, even the best headlines will be changed to something dull but utilitarian. That's because, on the Web, headlines aren't designed to catch readers' eyes. They are designed for "search engine optimization," meaning that readers who are looking for information about something will find the story, giving the newspaper a coveted "eyeball." Putting well-known names in headlines is considered shrewd, even if creativity suffers.

The temptation to end this post as he did was great...but modesty has won...this time...

Urdu is the most influential language IN THE WORLD!!!

The competition is over and Urdu has won!!! According to "Renowned Urdu scholar" Dr Farman Fatehpuri, "the status of a language should be decided in view of its influence and that Urdu was the most influential language in the world [...] Urdu has the distinction of having the phonetics and the letters that conform to it. Other languages are mostly devoid of it.”

Well, there you have it.

Monday, July 19, 2010

kids say the darnedest things

Too cute not to pass on...a FB comment posted by my sister Lori (who owns and runs her own pre-school):

One of the funniest things recently said by my preschooler Lilly (4 years old):

After repeating the importance of not unlocking the front door, her Mom said to her, “What is it that you do not understand?” and Lilly replied, “English.”

Stanford in the news (good and bad)

Several Stanford linguistics related items have been popping up here and there, none worthy of a post by itself, but taken as a whole, something weird is happening over there:
  • Stanford linguistics recently posted a search for not one but TWO tenure-track faculty positions. I've always had the impression that linguistics departments at elite universities don't hire all that often and it's quite rare to find two positions simultaneously. Not sure if they just have money to burn or if this is a special situation.
  • Mr. Verb linked to a study with the title BRITISH ACCENT NO LONGER SEXY, STUDY FINDS. The linked-to article makes a variety of claims: 1) the research was done "by the Department of linguistics and the Department of Psychology at Stanford University" (why psychology gets caps but linguistics doesn't is perhaps another question as well), 2) it's called The Comito Study, and 3) either Dr. Linda Masterson or Dr. Lisa Masterson, or possibly both, are involved. So far, I cannot find any reference to the study anywhere on Stanford's pages (or anywhere in the googlesphere save the original article), nor can I find either Dr. Linda or Dr. Lisa at Stanford (nor can I find any Masterson at all). UPDATE: I'm so used to seeing bad science reporting, I just assumed this was legit. A little follow-up shows this to be the work of the classic bat-boy publication The Weekly World News, a not-too-distant cousin to the Onion. Shame on me, haha.
  • While searching for Drs Linda and Lisa on Stanford's page, I discovered that Stanford has implemented some kind of algorithm for matching similar sounding names, so the search page asked me if I wanted to "Find last names that sound like my search term." One of the earliest, if not THEE earliest sound matching algorithms was Soundex, patented in 1918 and now freely available in a variety of implementations (since the patent has expired). However, there are far superior algorithms involving various minimum edit distance and bag o' sound phonological comparisons (I spent a brief time at IBM with a group working on this). I don't know how they've implemented their search, but it's a nifty tool to include in a search engine, imho.
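To make the two families of name-matching techniques mentioned above concrete, here is a minimal Python sketch of textbook American Soundex alongside a plain Levenshtein (minimum edit distance) function. The function names soundex and edit_distance are just illustrative choices, the Soundex variant assumes plain ASCII names, and none of this is a claim about how Stanford's search page actually works.

```python
# Soundex digit classes: letters that sound alike share a digit.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(name: str) -> str:
    """Return the 4-character American Soundex code (assumes ASCII names)."""
    letters = [ch for ch in name.lower() if ch.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break up a run of same-coded letters
        code = CODES.get(ch, "")  # vowels and y have no code
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets the run, so a repeat after it counts
    return (encoded + "000")[:4]  # pad or truncate to letter + 3 digits


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character edits turning a into b."""
    previous_row = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current_row = [i]
        for j, ch_b in enumerate(b, start=1):
            substitution = previous_row[j - 1] + (ch_a != ch_b)
            insertion = current_row[j - 1] + 1
            deletion = previous_row[j] + 1
            current_row.append(min(substitution, insertion, deletion))
        previous_row = current_row
    return previous_row[-1]


if __name__ == "__main__":
    print(soundex("Masterson"), soundex("Masterton"))  # M236 M236
    print(soundex("Robert"), soundex("Rupert"))        # R163 R163
    print(edit_distance("linda", "lisa"))              # 2
```

Soundex only answers "same bucket or not," which is why "Masterson" and "Masterton" collide. Edit distance gives a graded score, so it can rank near misses, at the cost of comparing the query against every candidate name.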

pullum bait

Here's an occasionally tongue-in-cheek Q&A from the Chicago Manual of Style Online. Personal fav:

Q. Can I use the first person?
A. Evidently.

And running a close second:
Q. “Between” vs. “among.” I’m going insane. I think the editor who changed my wording is just clueless or hasn’t given the issue enough thought. Please help. I’ve read the advice in CMOS, Garner’s Modern American Usage, Bernstein’s The Careful Writer, The Cambridge Grammar of the English Language, and a few other sources, but I can’t decide. Should I say “competition between companies” or “competition among companies”? They’re competing with each other, severally and individually. At least, that’s what I think. Or is “among” justified on the grounds that competition implies vague, intricate relationships? Do I need an economist to clear this usage question up? Are there right and wrong answers in this case? The phrase is “competition between/among companies is intensifying.”

A. It really doesn’t matter. The editor might well be clueless—it happens—but you are overthinking this.

HT: kottke

Friday, July 16, 2010

the upside of language death?

The bio-blogger Razib Khan steps into the murky waters of language death and proposes an hypothesis about how language death might have favorable outcomes for language evolution. Money quote: "very high linguistic diversity is not conducive to economic growth, social cooperation, and amity more generally scaled beyond the tribe."

As far as I can tell he has no evidence for this, but rather is drawing an analogy to cultural evolution à la Jared Diamond. The take-away seems to be: a little language diversity is good; a lot of language diversity is bad.

on withdraw

As happens to many people, a word I encounter all the time and consider perfectly normal will occasionally pop out at me and seem odd in some linguistically interesting way. Today, the word withdraw popped out at the ATM (along with the cash, hehe). It's the preposition that struck me as odd. I can still get the use of draw to mean take away (mostly thanks to poker), but what's with doing in that word? To withdraw does not mean draw with.

The preposition with is a tricky one that marks a wide variety of semantic roles. A brief set of examples should suffice to make the point (forgive my semantic role labels if they don't match your preferred terminology, just trying to make the point obvious):
  • Chris loaded the truck with hay.               hay = object*
  • Chris loaded the truck with a pitchfork.     pitchfork = instrument
  • Chris loaded the truck with Larry.            Larry  = co-agent
  • Chris loaded the truck with enthusiasm.  enthusiasm = manner
  • Chris loaded the truck with stripes.         stripes = modifier
In his big red syntactic theory book, one of my professors wrote a fairly involved analysis on why with is so versatile. But arguments as to why this is the case are not particularly relevant at the moment. I'm more interested in how with got there in the first place, not why the contemporary English grammar** allows it.

The Online Etymology Dictionary lists the following definition (sorry, no OED access): withdraw  early 13c., "to take back," from with "away" + drawen "to draw," possibly a loan-translation of L. retrahere "to retract." Sense of "to remove oneself" is recorded from c.1300. (emphasis added)

1200 is a long time ago, so the word has serious English street cred. But I found the definition of with as 'away', again, just odd until I followed up on the etymology of with:

with: O.E. wið "against, opposite, toward," a shortened form related to wiðer, from P.Gmc. *withro- "against" (cf. O.S. withar "against," O.N. viðr "against, with, toward, at," M.Du., Du. weder, Du. weer "again," Goth. wiþra "against, opposite"), from PIE *wi-tero-, lit. "more apart," from base *wi- "separation" (cf. Skt. vi, Avestan vi- "asunder," Skt. vitaram "further, farther," O.C.S. vutoru "other, second"). In M.E., sense shifted to denote association, combination, and union, partly by influence of O.N. vidh, and also perhaps by L. cum "with" (as in pugnare cum "fight with"). In this sense, it replaced O.E. mid "with," which survives only as a prefix (e.g. midwife). Original sense of "against, in opposition" is retained in compounds such as withhold, withdraw, withstand. (emphasis added).

So, to withdraw is to draw against an account, and that makes perfect sense. Thank you freely available online lingo-tools. It's a nice example of how dramatically a word can change its semantics. Virtually all contemporary uses of with involve the sense of together, not against. But there it is, in black and white (and a little bit of green).

*I think Propbank would use cargo as the role label for hay, I'm not sure, but I figured object was more obvious for lay readers. U. Illinois has a nifty online Semantic Role Labeler demo, if you want to play around with this kind of thing.

**Careful now, I'm using the term English grammar in a fairly technical, psycholinguisticee sense.

Thursday, July 15, 2010

Again With The Bad Science Reporting....

A nice post over at Thoughtomics debunks an all too typical example of bad science reporting run amok involving chickens and eggs and proteins... you see where this is going, right? sigh...

Money Quote:

I didn’t exactly hold mainstream science journalism in high esteem, but I’m amazed that science journalists continue ‘covering’ science stories in this way, even when readers are calling them out. While the trouble may have started with a misleading introduction and a quirky quote, it is the journalist’s responsibility to check facts and put a story into a context. Coverage like this does more harm than good for the public image of science reporting and scientists themselves.

Words For Canoe...

The Ottawa Citizen reports that Words for 'canoe' point to long-lost family ties. The story begins thusly:

An obscure language in Siberia has similarities to languages in North America, which might reshape history, writes Randy Boswell.

A new book by leading linguists has bolstered a controversial theory that the language of Canada's Dene Nation is rooted in an ancient Asian tongue spoken today by only a few hundred people in Western Siberia.

The landmark discovery, initially proposed two years ago by U.S. researcher Edward Vajda, represents the only known link between any Old World language and the hundreds of speech systems among First Nations in the Western Hemisphere (emphasis added).

It's a nice story about how hard-working linguist Edward Vajda discovered linguistic relationships between the Ket language of Siberia and Athapaskan languages of North America, a relationship that goes back maybe 13,000 years. From his web page, "The "Dene-Yeniseian Hypothesis" is gaining acceptance as the first demonstrated link between an Old World and a New World language family."

Having been trained at a school known for both typology and field linguistics, I have a lot of respect for the skills and talent a linguist like Vajda brings to the field (especially since I lack the patience to do this kind of work). And his enthusiasm is infectious. From the article:

He found that the few remaining Ket speakers in Russia and the Dene, Gwich'in and other Athapaskan speakers in North America used almost identical words for canoe and such component parts as the prow and cross-piece.

"Finally, here was the beginning of a system that struck me as beyond the realm of chance," Vajda wrote at the time. "At that moment, I think I realized how an archeologist must feel who peers inside a freshly opened Egyptian tomb and witnesses what no one has seen for thousands of years." (emphasis added).

Wednesday, July 14, 2010

"Former Hacker"?

The term former hacker is being bandied about quite a lot right now (see examples here). The term struck me as odd simply because I think of hacking as a skill set, not a job. You can be a former police officer or former mayor because those are jobs that can end. But once you have a skill, you tend to retain it forever (like riding a bike....). My hunch is that the term is meant to suggest that the individual no longer breaks into other people's networks just for fun anymore, even though they could.

Tuesday, July 13, 2010

what do you say to a linguist?

Here's a clever site by David R. MacIver that compiles stereotypical responses to the "What did you study" question. Two responses so far for linguistics. I'm sure we can add many more.

What do people say to Linguistics?
"Oh, my grammar is horrible. You must hate that..."
"So how many languages do you speak?"

Language Is A Battlefield

British author Roz Kaveney discusses Why trans is in but tranny is out - the language of transgender. Money quote:

Right now, trans is just about universally acceptable - though in recent years there was a fight over whether it should be an adjective or a prefix. A trans woman, the argument goes, is a woman who happens to be trans as she might be, say, blonde, but a transman is some special and distinct order of being.

For a while, it seemed as if some younger trans men were going to successfully reclaim 'tranny', at least as a 'smile when you say that' epithet, or a 'we can say that about ourselves; you can't' in-group word like 'queer'. It didn't take, though, partly because it had never stopped being used by would-be hip lad journalists to abuse not only actual trans people, but a list of 'weird' people seen as non-gender-conforming.

Monday, July 12, 2010

Implicit Language Policy

Ingrid, over at Language on the Move, tells the story of how difficult it was to get her university to accept the record of a non-English publication, then draws a smart conclusion about linguistic hegemony: No one ever made an explicit policy decision that research publications in languages other than English are less desirable than those in English. However, mundane bureaucratic practices – such as making record entry for a publication in a language other than English more difficult – conspire to have exactly that policy effect. In this way many decisions that seem to have nothing to do with language end up as implicit language policy decisions – the fact that English-language journals dominate the academic rankings is another example from academic publishing (emphasis added).

Sunday, July 11, 2010

Linguists DEBUNK: Does Obama Talk Like a Girl?

For shame, Atlantic Wire. You wildly mislead your readers with this ridiculous title: Linguists Debate: Does Obama Talk Like a Girl? This is flat wrong. Linguists ain't debating this at all. Linguists, as far as I can tell, are all in COMPLETE AGREEMENT on this topic. Obama does not talk like a girl. It's a ridiculous claim with ridiculous presuppositions and ridiculous implicatures. I don't know a single linguist who disagrees with or wishes to debate this at all.

The Atlantic Wire's roundup of the whole Parker-Krauthammer-Payack kerfuffle treats the delusional scribbles of political partisans on equal terms with the objective, thoughtful and empirically sound analysis of professionals. This is just wrong. For shame.

UPDATE: Just noticed that John Lawler makes exactly this point in the comments of The Atlantic Wire's page. Good for you John.

Thursday, July 8, 2010

Salad Too Grill

I can't make sense of this name (just a block from the White House). It seems to be attempting  a play on words of some sort, but I just can't get any coherent meaning. If it were "Grill Salad Too" at least I could imagine it means a grill plus salads, or a play on "Salad to grill" meaning, I dunno, grilled salads (yuck) but I just can't get either of those from "Salad Too Grill." Nonetheless, it has a good reputation on Yelp.

Wednesday, July 7, 2010

Online Dialect Survey

All you swinging North American English speakers, pucker up and get yer vocal folds hummin 'cause you're being called to service! Claire Bowern of Yale University's Linguistics Department has launched an online North American English Dialect Survey (HT Mr. Verb). Now quit yer bitchin' and yer belly achin' about the use of ain't, the sissy passive, and no problem and contribute something useful to the interwebs, yer voice!

Tuesday, July 6, 2010

No Problem? You're Welcome.

Over at Salon, Matt Zoller Seitz posted a fairly mundane rant in the peevologist tradition complaining about the decline in civility represented by the rise of no problem as a replacement for you're welcome in American courtesy interactions. What piqued my interest was not the rant itself (I'm growing tired of countering the peevologists, let them rant away, yawn) but rather the fact that we can easily fact-check his intuition that you're welcome is declining in use while no problem is rising thanks to the newly released Corpus of Historical American English from Mark Davies at BYU. As a caution, this corpus is not really suited to this question because it's not limited to spoken courtesy phrases*, which is what Seitz was specifically ranting about; nonetheless, it gives us a hint at the change in frequency of these two phrases.
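The usual way to compare a phrase across time slices of different sizes is to normalize raw counts to a rate, such as hits per million words. Here is a minimal Python sketch of that idea. The function name per_million and the dictionary of texts keyed by decade are assumptions for illustration, and the two tiny "texts" are invented placeholders, not corpus data. It is not a description of how the online corpus tool works internally.

```python
import re
from typing import Dict


def per_million(phrase: str, texts_by_decade: Dict[int, str]) -> Dict[int, float]:
    """Return occurrences of phrase per million words for each decade."""
    pattern = re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
    rates = {}
    for decade, text in sorted(texts_by_decade.items()):
        lowered = text.lower()
        total_words = len(lowered.split())  # crude whitespace tokenization
        hits = len(pattern.findall(lowered))
        rates[decade] = 1_000_000 * hits / total_words if total_words else 0.0
    return rates


# Invented placeholder texts, only here to show the shape of the input.
TOY_TEXTS = {
    1900: "you're welcome said the clerk and you're welcome again",
    2000: "no problem said the clerk",
}

if __name__ == "__main__":
    print(per_million("you're welcome", TOY_TEXTS))
    print(per_million("no problem", TOY_TEXTS))
```

Normalizing matters because decades in a historical corpus rarely contain the same number of words, so raw counts alone would mostly track corpus size.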

Using the freely available online tool, I plotted the frequency of you're welcome over the last 200 years:

Saturday, July 3, 2010

Powerful Minority: Catalonians

According to The Hollywood Reporter, "Parliamentarians in Spain's northeastern region of Catalonia have passed a controversial law requiring half of all commercial films to be dubbed* into the local language."

I'm generally not a fan of laws relating to language (I'm a linguistic libertarian of sorts), but I recognize that Catalan has benefited tremendously from a strong region where its speakers have money, power, and prestige (the real forces of linguistics, ultimately). According to the above report, Catalan accounts for 20% of Spain's film market (ticket sales?) but only 3% of films are dubbed* into Catalan.

As a proud capitalist pig, I just don't see why the Catalonia 20% market share isn't itself enough to drive film makers to produce the dubbing. The Hollywood Reporter claims it costs "€50,000 euros ($61,000) to dub." Okay, try adding €1 to the ticket price of dubbed films and see if Catalonians are willing to pay for this service. If they aren't willing to pay for it, why legislate it?  As I recall, several European countries already have differential pricing** for films so this is not a crazy suggestion.

*actually "dubbed or subtitled."
** where some are cheaper than others, unlike here in the States where all films are the same price...a ludicrous system, btw. See HERE for a nice discussion.

Friday, July 2, 2010

Robo-Linguists At Last!!

(image from evasee)

Finally, the tedious job of linguists has been replaced by robots! The site proudly trumpets the triumph of algorithm over the comparative method with the post title Computer program deciphers a dead language that mystified linguists, and proclaims the following:
The lost language of Ugaritic was last spoken 3,500 years ago. It survives on just a few tablets, and linguists could only translate it with years of hard work and plenty of luck. A computer deciphered it in hours.

However, just a brief scan of the original article (pdf) suggests that there's less here than meets the eye. The abstract begins thusly:

In this paper we propose a method for the automatic decipherment of lost languages. Given a non-parallel corpus in a known related language, our model produces both alphabetic mappings and translations of words into their corresponding cognates.

Producing an alphabetic mapping and a cognate set is nice, but "deciphering a dead language" it ain't.

(HT: Sérgio Bernardino via Twitter #linguistics)

Thursday, July 1, 2010


(Image from dlisted)

And a bravo for this one (which I first heard about on Wait, Wait, Don't Tell Me): Woman in sumo wrestler suit assaulted her ex-girlfriend in gay pub after she waved at man dressed as a Snickers bar (online version of story here, but with simpler headline).

Surprisingly, this is not a crashblossom. This headline, as far as I can tell, means exactly what you think it means on the first read. The linguistics question, which is completely obvious, of course, is why did the headline author feel that "Snickers bar" was the only NP that needed an article?

NLPers: How would you characterize your linguistics background?

That was the poll question my hero Professor Emily Bender posed on Twitter March 30th. 573 tweets later, a truly epic thread had been cre...