Tuesday, March 30, 2010

Teaching Phonetics

(screen grab from U. Iowa)

A collaboration of several departments at The University of Iowa (but not the Linguistics department, wtf!) has put up a really nice interactive/animated tool for demonstrating the articulatory anatomy of speech sounds in English, German and Spanish.

(HT srabivens on Twitter via #linguistics)

Monday, March 29, 2010

Doh! Nut Metaphors

Neal Whitman deconstructs the recent doughnut hole metaphor buzzing around the health care reform debate. Money quote:

... reading about the doughnut hole in the newspaper or hearing about it on the radio, I kept having a feeling I wasn't understanding something. It was when I called upon my real-world knowledge of doughnut structure that I finally realized it wasn't the issue itself that was troubling me, but the choice of metaphor. Figure 2 shows a typical donut. We can observe that it is a glazed, cake doughnut, without sprinkles. We can also see that the gap in coverage from Figure 1 corresponds not to the doughnut hole, but to the sweet, cakey goodness of the doughnut itself.

He suggests a castle moat as an alternative metaphor. And yes, he really does use two different spellings: doughnut and donut. This may reflect an underlying ambivalence on his part, but I suspect it's just a good case of legitimate spelling change that hasn't fixed upon a final form. I suspect we'll all use donut soon enough.

Monday, March 22, 2010

Still No 'moist'

A Twitter challenge to list "the ugliest words in the English language" at #uglish. I vote for uglish.

Sunday, March 21, 2010

Is There A Disfluency Gap?

Watching the health care debate on C-SPAN I find Nancy Pelosi's speaking style to be jarringly disfluent, at least as much so as George W. Bush's ever was (or Sarah Palin for that matter) yet I don't recall Pelosi being as criticized as they were. My hotel internet connection is not fast enough for me to YouTube around for examples of Pelosi speaking extemporaneously, but I suspect you can find these examples easily and I suspect you'll see what I mean.

Is this a partisan issue? Are Republicans more likely to be criticized for speech errors than Democrats?

The folks at Language Log have discussed the politics of speech errors many times (see THIS post which includes links to many others) and it's worth quoting Liberman: "Everyone commits speech errors...and anyone who makes a big deal about particular examples is either a fool or a hypocrite."

My gut reaction is that there are many fools and hypocrites reporting on our politicians ... surely I am the first to uncover this rare gem of insight.

NOTE: I make no political point by bringing this up other than to ask if there is a statistical difference between the likelihood that a Republican figure will be criticized for speech errors and the likelihood that a Democrat will be criticized for speech errors. My intuition is that there is a difference, and that difference leans towards Republicans being more likely to be criticized. I caution the reader against trying to infer my own political beliefs from this post.

Tuesday, March 16, 2010

A software blogger takes on the heady task of defining categories. It's not clear to me if this is prescriptivist poppycock, naive descriptive lexicography, or wishful thinking: The Difference Between A Developer, A Programmer And A Computer Scientist. I made my own foray into this world here: Computational Linguistics vs. NLP.

HT zelandiya (via #linguistics)

Sunday, March 14, 2010

Auto-detecting Language

Why doesn't Google's translation tool automatically detect the language I paste in? This is not a terribly difficult problem to solve computationally. I suspect that if they took a bag o' trigrams (of characters, that is) and compared them to a corpus using some kind of simple tf–idf weight, they'd get a pretty high degree of accuracy. Here are some distinctive trigrams from a page on Omniglot. Wanna guess the language based solely on these? I doubt it will be difficult. And I suspect that just one or two of these trigrams is distinctive enough to make an accurate guess.
  1. änn
  2. isk
  3. a_m
  4. är_
  5. föd
  6. och
  7. vär
UPDATE: thanks to the commenters for schooling me on this. In fact, Google DOES have a detect language function. I've been trying to find documentation on their methods but haven't had much luck. I did find this discussion of a different language detector that works rather differently than I proposed. Rather than compare trigrams of letters to language models, it looks up whole words in dictionaries. While I admit to the greater simplicity of this method, I think my idea is more betterer 'cause it's more linguisticy.
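For the curious, here is roughly what the bag o' trigrams idea could look like in Python. This is only a minimal sketch, not how Google or any real detector works: the one-sentence `SAMPLES` corpus, the function names, the smoothed idf formula, and the choice of cosine similarity for the comparison are all my own assumptions for illustration. A real system would need far more training text per language.

```python
import math
from collections import Counter

# Toy one-sentence "corpora" (an assumption for illustration only).
SAMPLES = {
    "swedish": "alla människor är födda fria och lika i värde och rättigheter",
    "english": "all human beings are born free and equal in dignity and rights",
    "german": "alle menschen sind frei und gleich an würde und rechten geboren",
}

def char_trigrams(text):
    """Count character trigrams, writing word boundaries as '_'."""
    padded = "_" + "_".join(text.lower().split()) + "_"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def build_profiles(samples):
    """Turn {language: text} into tf-idf weighted trigram profiles."""
    counts = {lang: char_trigrams(text) for lang, text in samples.items()}
    doc_freq = Counter()
    for tf in counts.values():
        doc_freq.update(tf.keys())  # one vote per language containing the trigram
    n_langs = len(counts)
    # Smoothed idf (an assumed variant): rarer trigrams get larger weights.
    idf = {tri: math.log((1 + n_langs) / (1 + df)) + 1
           for tri, df in doc_freq.items()}
    profiles = {lang: {tri: n * idf[tri] for tri, n in tf.items()}
                for lang, tf in counts.items()}
    return profiles, idf

def cosine(a, b):
    """Cosine similarity between two sparse weight dictionaries."""
    dot = sum(w * b.get(tri, 0.0) for tri, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def detect(text, profiles, idf):
    """Return the language whose profile is most similar to the text."""
    # Trigrams never seen in training get weight 0 and are ignored.
    query = {tri: n * idf.get(tri, 0.0)
             for tri, n in char_trigrams(text).items()}
    return max(profiles, key=lambda lang: cosine(query, profiles[lang]))

profiles, idf = build_profiles(SAMPLES)
print(detect("är och föd", profiles, idf))
```

Even with a corpus this tiny, a handful of trigrams like the ones listed above is enough to separate the three languages, which is the intuition behind the guess that one or two distinctive trigrams might suffice.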

Notes on my searching:
  1. Lots of programming language detecting tools.
  2. Several human language detecting tools, but few discussed methodology.

Some advice from Mankiw about prospective grad students choosing where to go. #8 struck me as critical for any and all students:

Don't be distressed if you did not get into your top choice.  What you do in graduate school (or college) is far more important than where you go.  Your personal drive matters more than ranking of the school you attend.

Wednesday, March 10, 2010

Is Language Death All That Bad?

John McWhorter echoes some of my previous musings ... I like this guy:

Yet the going idea among linguists and anthropologists is that we must keep as many languages alive as possible, and that the death of each one is another step on a treadmill toward humankind’s cultural oblivion. This accounted for the melancholy tone, for example, of the obituaries for the Eyak language of southern Alaska last year when its last speaker died.

That death did mean, to be sure, that no one will again use the word demexch, which refers to a soft spot in the ice where it is good to fish. Never again will we hear the word 'ał for an evergreen branch, a word whose final sound is a whistling past the sides of the tongue that sounds like wind passing through just such a branch. And behind this small death is a larger context. Linguistic death is proceeding more rapidly even than species attrition. According to one estimate, a hundred years from now the 6,000 languages in use today will likely dwindle to 600. The question, though, is whether this is a problem (emphasis added).

This guy needs to read my own most excellent ramblings:

Is language death a separate phenomenon from language change?

  • In terms of linguistic effect, I suspect not

Are there any favorable outcomes of language death?

  • I suspect, yes

How do current rates of language death compare with historical rates?

  • Nearly impossible to tell

What is the role of linguists wrt language death?

  • One might ask: what is the role of mechanics wrt global warming?

HT i09 via Twitter's #linguistics.

Monday, March 8, 2010

Hypercorrect Substitutions

I got my morning cuppa joe from a Green Beans Coffee shop at The Great Place on my first day of three weeks in wonderful central Texas. While sipping ... okay, fine ... gulping my java I noticed the sleeve had the following quote:

Myself and many of the Naval sailors I work with have all had your coffee and love it.

The linguist in me couldn't help but notice that this was a beautiful example of hypercorrection(1). I also couldn't help but wonder why the simple syntactic test of substitution isn't better understood by the average person. It's such a simple idea, any 6th grader could master it. The idea is that, when faced with a grammar choice you are unsure of, you simply ask yourself, what else could I put in its place and how does that help me make my choice? So, here we have a complex subject (i.e., a subject with two NPs):
  • X and Y have all had your coffee 
Where X refers to the speaker and Y = many of the Naval sailors I work with and the decision is what form of personal pronoun is appropriate for X. If we ask ourselves, what if this were a simple subject composed only of X, what form of the personal pronoun would we choose?
  • I have had my coffee 
  • Myself have had my coffee*
At this point, the decision is quite obvious, isn't it?

But wait! I'm no prescriptivist. Certainly I must have some descriptivist point to make, mustn't I?

Saturday, March 6, 2010

Thursday, March 4, 2010

Really! Really? Really.

Dear Netflix, is The Importance Of Being Earnest really a "Cerebral Drama"? Really! Really? Really. I'm not sure your recommender system ever actually read Oscar Wilde.

Pssst, for context on The Three "reallys" Construction see my comment on LL and repeated below:

The Three "reallys" Construction is strictly a spoken construction, as far as I can tell, so I can't do much of a search, but it's common in sitcoms and very commonly used in casual settings amongst friends when a person is faced with a situation that is (1) surprising, (2) intractable. The three "reallys" provide a cascaded enunciation of the cline from genuine surprise to complete defeatism (i.e., the person realizes there's nothing they can do about the situation).

really 1 = interjection like wow expressing internal surprise.
really 2 = interrogative, actually questioning the other person wrt the situation.
really 3 = expression of defeat (i.e., I give up).

Correct All Grammar Errors And Plagiarism!

I was stupid enough to click through to Huffington Post's colossally stupid and fundamentally mistaken Worst Grammar Mistakes Ever post (I refuse to link to it). Of course, the 11 items had virtually nothing to do with grammar (the vast majority were punctuation and spelling errors). I must agree with Zwicky's pessimism regarding National Grammar Day: "It seems to me that the day is especially unlikely to provide a receptive audience for what linguists have to say."

But what prompted this post was the ad at the bottom for Grammarly, a free online "proofreader and grammar coach" which promised to Correct All Grammar Errors And Plagiarism.

A bold claim, indeed. I doubt a team of ten trained linguists could felicitously make this claim. But the boldness does not stop there (it never does on the innerwebz). Click through to the online tool and wow, the bold claims just start stacking up like flapjacks at a Sunday fundraiser.

Just paste in your text and bam! you get

150+ Grammar Checks
   Get detailed error explanations.
Plagiarism Detection
   Find unoriginal text.
Text Enhancement
   Use better words.
Contextual Spell Check
   Spot misused words.

Dang! Them fancy computers, they sure is smart. Just for funnin, I pasted the text of Zwicky's NGD again post into the window and ran the check. Here's his report:

Not bad for a professor at one of the lesser linguistics departments.

(pssst, btw, did ya spot that odd little grey balloon at the top of the second screen shot? Yeah, me too. It says "click allow if presented with a browser security message." Suspect, no doubt. Nonetheless, I trusted Chrome to protect me and plowed ahead).

Grammar Myths Debunked

Motivated Grammar debunks ten grammar myths in honor of National Grammar Day. Additionally, he honors the better spirit of linguistics by linking to "two papers that really made me fall in love with the field." I thought that was a nice idea so I'll pick up his cue and link to a couple that got my own linguistics juices flowing back-in-the-day.


Wednesday, March 3, 2010

Oldest Example of Written English Discovered

No, not quite. The title of this post comes from a Digg link which linked to this article. The writing is dated at around 500 years old, which couldn't possibly be the "oldest example of written English", could it? The Huntington Library has the Ellesmere Chaucer, a manuscript c. 1405, so that's got it beat by 100 years already and I haven't even bothered to look for Old English manuscripts. The claim in the title is quite different from the claim in the original article which begins with this:

What is believed to be the first ever example of English written in a British church has been discovered. Problem is, no-one can read it.

This just means there's a lot of Latin written in English churches. The cool part is that they're crowdsourcing the interpretation.

If anyone thinks they can identify any further letters from the enhanced photographs, please contact us via the Salisbury Cathedral website. The basic questions of what exactly the words are and why the text was written on the cathedral wall remain unanswered. It would be wonderful for us to solve the mystery (link added).

Go on, give it a shot.

Looks like the original lyrics to Judas Priest's Better by You Better Than Me to me.

Tuesday, March 2, 2010

Qatar, Rhymes With Butter .. or Susan

UPDATE: Fivethirtyeight recently (12.02.2010) tweeted about ESPN broadcasters pronouncing Qatar and linked to this site with variations, neither of which rhymes with butter, hehe: howjsay.com.

I've noticed a lot of Americans pronouncing the country name Qatar as something like kutter* ([kʌɾɚ]). This is particularly true of US military personnel serving in Iraq who are regularly traveling through there, but I also just heard it on teevee by ESPN's Chris Fowler referencing a tennis tournament in Doha, Qatar. As a native speaker of American English, I don't think my default pronunciation assignment of the alphabetic string Q-a-t-a-r would be [kʌɾɚ]. If I were presented with the string of Romanized letters Q-a-t-a-r for the first time, I think my first attempt at a pronunciation would be somewhat closer to American English guitar, something like [kʌt'ɑːʳ]**. So why do so many Americans use this non-standard, may I say, deviant, pronunciation? First, I suspect that the soldiers and sports announcers flowing through the region have little confidence in their own default reading of the Romanized letters, so they willingly mimic whoever says the name first, and then, heck, that's how you say it. It's a nice example of follow-the-leader linguistics.

But why is the dominant American pronunciation of Qatar → [kʌɾɚ] to begin with? It's a nice "proximate cause" question. And my answer is???

How Many Linguists Are There? 5379.

Previously, I ranted, just a bit, about the suggestion that there are more linguists than languages. I guessed that, in fact, this may not be true. Thanks to the LSA update email that was just sent out, I was able to follow up a bit. That email referenced the results of the American Academy of Arts and Sciences’ survey of linguistics departments (pdf, it doesn't load every time, so repeated clicking might be warranted). Table LN1 (below) gives an estimated 1630 faculty members in linguistics departments across the United States. That strikes me as a fair base to start a back-of-the-napkin estimation of total linguists worldwide (I noted in my previous rant the problems with defining a linguist, but I'll take this survey as my authority for now).

Let the number games begin.  First, let's assume that this initial estimation is conservative. I'll throw in another 10% to make up for that. Let's assume there are about 1793 linguists in the US.  I think it's fair to assume there are about as many linguists in Europe (though you'd never know it by the poor rate at which American linguists cite Europeans, but that's another rant). So that's another 1793 for Europe. I'd wager that there are at best an equal number of linguists in the rest of the world as in either the States or Europe, so that's another 1793.

By this estimation, there are approximately 5379 linguists in the world (1793 x 3). That sounds about right to me. And if this is correct, then my original point stands: there are NOT more linguists than languages.
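For anyone who wants to check the napkin, the arithmetic can be spelled out in a few lines of Python. This is just a sketch of the guesses above, nothing more rigorous: the variable names are mine, and the 6,000 figure for living languages is an assumption borrowed from the McWhorter excerpt quoted elsewhere on this page, not from the survey.

```python
# Back-of-the-napkin estimate, using the figures from the post.
US_FACULTY = 1630        # Table LN1 estimate of US linguistics faculty
PADDING_PERCENT = 10     # the extra 10% thrown in for a conservative count
REGIONS = 3              # US, Europe, rest of the world (assumed equal)
LANGUAGES = 6000         # assumed: figure from the McWhorter quote

# Integer arithmetic avoids floating-point rounding surprises.
us_linguists = US_FACULTY * (100 + PADDING_PERCENT) // 100
world_linguists = us_linguists * REGIONS

print(us_linguists, world_linguists, world_linguists < LANGUAGES)
```

Change any of the guesses (say, a bigger rest-of-the-world share) and the total moves accordingly, but it takes a lot of generosity to get past the number of languages.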

NLPers: How would you characterize your linguistics background?

That was the poll question my hero Professor Emily Bender posed on Twitter March 30th. 573 tweets later, a truly epic thread had been cre...