Saturday, May 29, 2010

Causative Funny Business in Swedish

Finally saw the excellent Swedish film The Girl with the Dragon Tattoo (Män som hatar kvinnor). While I don't speak Swedish, I noted that the word mörda 'murder' was translated into English as killed. Since the cognate murder is clearly available, I had to wonder if there was some good reason for this choice. Is mörda less causative than murder?

I used Google to translate I accidentally murdered him into Swedish and was given Jag mördade av misstag honom. But when I translated that back into English, I got I accidentally killed him. Some causative funny business going on here methinks.

Friday, May 28, 2010

My German

Here's a curious bit of linguistics: American students refer to studying languages using a possessive phrase, but not other subjects:

a. I have to work on my German.
b. *I have to work on my math.
c. *I have to work on my biology.

Note that all three could easily include the word "skills" at the end, but only (a) is acceptable bare (to me, anyway). I wonder if this is related to the creative work metonymy construction that Pullum just posted about over at LL (it was that post which triggered my thinking on this). Being able to speak a language can be seen as a kind of creative work (i.e., the speaker is producing the language in a way they are not producing biology).

Wednesday, May 26, 2010

yeah right ctd.

Thanks to Twitter #linguistics, I discovered that Hebrew University grad student Oren Tsur will be in DC next week presenting a paper on automatic detection of sarcasm in product reviews (see here and here for reactions). I've posted on sarcasm before (see here and here) so I'm curious. The conference is the 4th Int'l AAAI Conference on Weblogs and Social Media at GW and it looks rather interesting (the first interesting thing to happen in Foggy Bottom since Watergate?). I might could take some PTO and check it out.

Tsur's work can be found here: A Great Catchy Name: Semi-Supervised Recognition of Sarcastic Sentences in Online Product Reviews (pdf).

FYI: while Tsur's work relies solely on written words, Joseph Tepperman et al. from USC work on sarcasm in voice recognition: “YEAH RIGHT”: SARCASM RECOGNITION FOR SPOKEN DIALOGUE SYSTEMS (pdf).

Tuesday, May 25, 2010

Psycholinguistics Experiments

The Portal for Psychological Experiments on Language has seven new experiments for May:
  • Image Caption Generation: In this experiment you will be presented with a news image, an article associated with the image, and a caption describing the image. Your task is to judge how well the caption describes the content of the image given the accompanying article and how grammatical the caption is. Some captions will seem appropriate to you, but others will not. You will make your judgement by choosing a rating from 1 (the caption is not appropriate) to 7 (the caption is appropriate). All captions were generated automatically by a computer program.
  • Human-robot Interaction: You will see a series of pictures of a scene with a robot standing at a table with some objects on it. You will hear the robot asking a question and a human answering it. Every time you will make a judgment about the robot's question. Between robot scenes you will check the correctness of simple calculations.
  • Sentence Reading: In this experiment, you will be shown a set of sentences which describe a situation. You will have to read the sentences carefully and answer the questions asked at the end of the sets. Each sentence will appear on a separate slide. You can move to the next slide by clicking anywhere on the slide. At no point will you be able to go back and revisit the contents of the previous slide (please do not use the 'Back' button of the browser as this will take you to the beginning of the experiment). On the slide containing the question, you will be given two options as possible answers and you need to select one of them. On selecting the answer, you will be presented with the next set.
  • Image Annotation: In this experiment you will be presented with a news image, an article associated with the image, and a set of keywords describing the image. Your task is to judge how well each of the keywords describes the content of the image given the accompanying article. Some keywords will seem appropriate to you, but others will not. You will make your judgement by choosing a rating from 1 (the keywords are not appropriate) to 7 (the words are appropriate). All keywords were generated automatically by a computer program.
  • Sentence Compression: In this experiment you will be asked to judge how well a given sentence compresses the meaning of another sentence. You will see a series of sentences together with their compressed versions. Some sentence compressions will seem perfectly OK to you, but others will not. All compressed versions were generated automatically by a computer program.
  • Story Generation: In this experiment you will be asked to read a set of short computer generated stories. Each story will be 5 sentences long and will contain only a couple of characters. After reading each story you will assess its quality along three dimensions: fluency, coherence and interest. For each dimension you will provide a rating on a scale from 1 to 5.
  • Referring expressions: The goal of this short survey is to collect your opinions about the most natural way to refer to objects in a conversation. Different people might do this in different ways, depending on how they interpret the context in which the dialogue takes place. We are interested in your opinion, given the context described below.
Enjoy!

Monday, May 24, 2010

The Politics of Publishing


Let's talk about class warfare in academia, shall we? I just read a nice little article on speech production from Cognition, and while I enjoyed it, I couldn't help but wonder how it got published because it was rather lightweight. To be fair, Cognition published it as a "Brief article" so it was meant to be short*; nonetheless, it had the feel of a grad student poster, not a publication. You might argue that this is the point of a "Brief article", but I will argue that similar content would likely not have been published had it not been recognizably associated with a well-known scholar. Despite the precautions of blind reviews, it is not uncommon for a linguistics reviewer to have a pretty good idea of who authored or co-authored a paper, simply because linguistics is a small field, and the sub-fields even smaller. Most scholars have easy-to-recognize methodologies, content areas, or styles that act almost as a scholarly fingerprint. I don't mean to be mean-spirited, and I hope this doesn't come across that way, but minus the second author's fingerprint, I don't see this paper getting accepted.

But first, let's look at the paper itself: A purple giraffe is faster than a purple elephant: Inconsistent phonology affects determiner selection in English (full citation below).

Friday, May 21, 2010

mixed modals

I found Ta-Nehisi Coates' use of had have awkward in the following sentence (referring to Rachel Maddow's recent interview with Rand Paul):

That interview would have went a lot better for Rand Paul if Maddow had have just thrown her notes in the air and accused him of being a bigot, and a covert member of the Klan. (emphasis added).

So, the construction is "X would have went a lot better if Y had have just verbbed." My position is that the tense and aspect of the VP in the embedded subjunctive (the if-clause) normally matches the VP in the main clause. So, my preference is for "X would have went a lot better if Y would have just verbbed."

This use of had reminds me of the use of past perfect for simple past in black English, in constructions like "He had told me to be here at six." (though this wiki page says nothing about it). But this is not simple past anyway. Coates' use of had in the embedded clause may be a function of his dialect; I don't know. He's from Baltimore, but I don't know which neighborhood. In a previous post, he talks about his language use as a child just a bit:

The fact is that while I read a ton, and got teased for it, I lived in the neighborhood and talked like people in the neighborhood. I was in gifted classes at school, but I didn't have the kind of parents who penalized for using a word like "irregardless." Moreover, I was, if not particularly cool, still really well liked.  My particular and specific black experience was that as long as you had some familiarity with the language, you pretty much were free to do whatever you wanted. (emphasis added).

Nonetheless, I'm no prescriptivist; I just thought it curious.

In Defense Of Science Blogging

Jason G. Goldman, a science blogger out of USC, posts a thoughtful defense of the emerging role of science blogging. His major points seem to be:

  1. Science journalism sucks (okay, "sucks" is my word), so science blogging is a potential, and superior, replacement
  2. Blogging is a form of public intellectualism
  3. There are real professional development opportunities
He makes other points as well, and he offers some good links to related posts.

Tuesday, May 18, 2010

How do you feel this bar?

English speaker walks into a bar in China hoping to practice his Chinese*. Chinese waiter walks up to the gweilo hoping to practice his English, and the game begins. A lingo-blogger takes on the heavy challenge of analyzing this linguistic power struggle in a post on Sinosplice. In classic linguistic fashion, he devises a rule:

John's Rule For Determining Language:
Given a conscious choice between a number of languages to use for interaction, speakers will naturally tend to choose the common language in which the poorer speaker’s level is highest.

John wiggles by stipulating that "there’s no strict right or wrong here" (all linguistic "rules" require that same stipulation, haha, so what the hell's the point of a rule!!). But John uses this rule to define a linguistic strategy: "if I want to improve my Chinese without all this strife, I need to find Chinese speakers with English worse than my Chinese."
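For the computationally inclined, John's rule reads like a maximin choice: for each language the two speakers share, take the weaker speaker's level, then pick the language where that level is highest. Here is a minimal Python sketch of that reading. The 0 to 10 proficiency scores, the function name choose_language, and the example speakers are all my own assumptions for illustration, not anything from John's post.

```python
def choose_language(speaker_a, speaker_b):
    """Pick the shared language in which the weaker speaker's level is highest.

    Each speaker is a dict mapping language name -> proficiency score
    (a hypothetical 0-10 scale). Returns None if no language is shared.
    """
    common = set(speaker_a) & set(speaker_b)
    if not common:
        return None
    # For each shared language, the "poorer speaker's level" is the minimum
    # of the two scores; the rule picks the language maximizing that minimum.
    # Sorting first makes ties resolve alphabetically.
    return max(sorted(common), key=lambda lang: min(speaker_a[lang], speaker_b[lang]))


# Hypothetical speakers: John's Chinese is a 6.
john = {"English": 10, "Chinese": 6}
waiter_weak_english = {"Chinese": 10, "English": 4}
waiter_strong_english = {"Chinese": 10, "English": 8}

print(choose_language(john, waiter_weak_english))    # Chinese
print(choose_language(john, waiter_strong_english))  # English
```

On these made-up numbers, the sketch also reproduces John's strategy: the conversation only lands in Chinese when the other speaker's English is worse than John's Chinese.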

While John invokes communication efficiency as his basis for this strategy, he misses a crucial factor: appropriateness. It's not really appropriate for a customer to use a waiter for language practice, and vice versa. Even though it's effective for language learning purposes, that's just not why bars exist. Once John as customer violates that norm, he's all but invited that waiter to do the same. At that point, all rules are off, and it becomes a linguistic jungle with each speaker fighting for survival.

Unfortunately, neither John nor I could find any academic research on this topic (I found tons on intercultural pragmatics, but nothing obviously on this particular situation). I suspect it's out there; it's just hard to find. What terms should I search for? Hmmm, it's an odd one, no doubt.

*The blogger did not specify what dialect, though Mandarin is likely.

A linguist asks some questions about word vectors

I have at best a passing familiarity with word vectors, strictly from a 30,000 foot view. I've never directly used them outside a handfu...