IBM's Watson fails third-grade geography test

World's smartest computer thinks Toronto is a U.S. city

IBM's Watson destroyed all humans in Tuesday's Jeopardy match, and mankind's fate hangs in the balance. It's only a matter of time before we are subjugated by a hopefully benevolent race of robots. I, for one, welcome our Skynet/Cylon overlords.

Yes, IBM's Watson racked up $35,734 in the first half of the much-anticipated Jeopardy matchup against the game show's two greatest human champions, who combined for a measly $15,200. The destruction of the human race is on course to finish tonight with the conclusion of Final Jeopardy. 

Speaking of Final Jeopardy... let's all take some solace in the fact that Watson seems to know as much about geography as your average schoolchild. Actually, that's probably not fair. Second- and third-graders have a greater grasp of geography than your average American adult.

But it's still puzzling why Watson would answer "What is Toronto???" (yes, "he" used multiple question marks) to a Jeopardy category titled "U.S. Cities." 

I've been to Toronto and it didn't feel that much different than Boston, although it is a very nice city indeed. But I had to use my passport to get there and answer some intrusive questions at Airport Customs, so I can confirm that it lies outside U.S. borders.

Watson's excessive use of question marks suggests that the supercomputer was highly confused by the Final Jeopardy question... err... answer. And so were Watson's creators. 

"Watson's developers were puzzled by his flub in the Final Jeopardy! segment," IBM marketing guy Steve Hamm writes in a Smarter Planet blog. "The category was US Cities, and the answer was: 'Its largest airport was named for a World War II hero; its second largest, for a World War II battle.' The two human contestants wrote 'What is Chicago?' for its O'Hare and Midway, but Watson's response was a lame 'What is Toronto???'"

"If this had been a normal clue, Watson would not have buzzed [in]," saving the machine from the embarrassment, notes Stephen Baker, who wrote a book about the machine's development.

Still, Hamm asks "How could the machine have been so wrong? David Ferrucci, the manager of the Watson project at IBM Research, explained during a viewing of the show on Monday morning that several things probably confused Watson. First, the category names on Jeopardy! are tricky. The answers often do not exactly fit the category.

"Watson, in his training phase, learned that categories only weakly suggest the kind of answer that is expected, and, therefore, the machine downgrades their significance. The way the language was parsed provided an advantage for the humans and a disadvantage for Watson, as well. 'What US city' wasn't in the question. If it had been, Watson would have given US cities much more weight as it searched for the answer.

"Adding to the confusion for Watson, there are cities named Toronto in the United States and the Toronto in Canada has an American League baseball team. It probably picked up those facts from the written material it has digested. Also, the machine didn't find much evidence to connect either city's airport to World War II. (Chicago was a very close second on Watson's list of possible answers.) So this is just one of those situations that's a snap for a reasonably knowledgeable human but a true brain teaser for the machine."

Ferrucci was somewhat encouraged by the mistake because "Watson knew it did not know that right answer with any confidence," reporting a confidence level of only about 30%.
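For the curious, the difference between a normal clue and Final Jeopardy can be sketched in a few lines of Python. This is only an illustration of the idea, not IBM's code: the `Candidate` class, the `respond` function and the 50% `BUZZ_THRESHOLD` are all made up for the example, and so is the confidence assigned to Chicago. The only number taken from the reporting is the roughly 30% confidence in Toronto.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    confidence: float  # estimated chance the answer is right, 0.0 to 1.0


# Hypothetical threshold, chosen only for illustration; not a figure from IBM.
BUZZ_THRESHOLD = 0.5


def respond(candidates, must_answer):
    """Return the top-ranked answer, or None if the machine stays silent."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    # On a normal clue the machine can decline to buzz when it is unsure.
    # In Final Jeopardy every contestant has to write something down.
    if must_answer or best.confidence >= BUZZ_THRESHOLD:
        return best.answer
    return None


# 0.30 is the roughly 30% confidence reported for Toronto;
# 0.28 for Chicago is an invented stand-in for "a very close second".
final_jeopardy = [Candidate("Toronto", 0.30), Candidate("Chicago", 0.28)]

print(respond(final_jeopardy, must_answer=False))  # None
print(respond(final_jeopardy, must_answer=True))   # Toronto
```

With any threshold above 30%, the machine keeps quiet on a normal clue and only blurts out "Toronto" when the rules force an answer.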

Watson was also smart enough to make a small bet and preserve his giant lead going into tonight's final showdown. Although Ken Jennings and Brad Rutter doubled or nearly doubled their scores in Final Jeopardy, they are still big underdogs heading into the finale.

If Watson gains another $35,000 tonight, the IBM computer's total will be more than any human has ever earned in a two-day tournament final. (Teen tournament winner Rachel Rothenberg apparently holds the record with $63,800.)

Watson's performance so far is an astounding achievement for both machine and the men and women who created him. With any luck, one quick tweak to the Watson algorithm will show him where Toronto is on a map. 

Follow Jon Brodkin on Twitter.

Copyright © 2011 IDG Communications, Inc.
