Network World

reCAPTCHA illustrates human ingenuity

How CAPTCHAs could be used for higher purposes
Security Strategies Alert By M. E. Kabay, Network World
September 16, 2008 12:28 AM ET

Mich Kabay takes a high-level view of security issues and provides resources to help safeguard your corporate and personal security.


The "Completely Automated Public Turing test to tell Computers and Humans Apart" (CAPTCHA) is the squiggly word that appears on Web sites to stop bots from sending spam and doing other vile deeds. In the Sept. 12 issue of Science magazine (Vol. 321, p. 1465), computer scientists Luis von Ahn, Benjamin Maurer, Colin McMillen, David Abraham and Manuel Blum from the Computer Science Department of Carnegie Mellon University in Pittsburgh report on an innovative application of CAPTCHAs: putting to useful work the more than 100 million applications of human intelligence spent decoding these symbols.

The first application involves supplementing machine intelligence applied to optical character recognition (OCR). Currently, there is an enormous worldwide effort to transcribe existing printed documents into digital form for increased availability and (one hopes) long-term storage, although the stability and usability of digital storage in the face of technological change are the subject of much concern.

The reCAPTCHA system "is used by more than 40,000 Web sites… and demonstrates that old print material can be transcribed, word by word, by having people solve CAPTCHAs throughout the World Wide Web. Whereas standard CAPTCHAs display images of random characters rendered by a computer, reCAPTCHA displays words taken from scanned texts."

Words that OCR programs have not been able to recognize are stored in a database and supplied at random as the first word of a two-word CAPTCHA; the second word is a regular computer-generated CAPTCHA that serves as a control. Both the image of the uncertain word and the control word are distorted to prevent machine recognition. If the user types the control word correctly, the user’s reading of the uncertain word is recorded as a possible transcription.
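The pairing logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the researchers' implementation: the names `Challenge`, `TranscriptionLog` and `check_response` are hypothetical, and the case-insensitive comparison is an assumption the article does not confirm.

```python
from dataclasses import dataclass, field


@dataclass
class Challenge:
    # Hypothetical identifier for a scanned word that OCR could not read.
    unknown_word_id: str
    # Known answer for the control word shown alongside it.
    control_answer: str


@dataclass
class TranscriptionLog:
    # Maps each unknown word ID to the list of guesses collected so far.
    guesses: dict = field(default_factory=dict)

    def record(self, word_id, guess):
        self.guesses.setdefault(word_id, []).append(guess)


def normalize(text):
    # Assumption: comparison ignores case and surrounding whitespace.
    return text.strip().lower()


def check_response(challenge, typed_unknown, typed_control, log):
    """Return True if the user passes the control word.

    The guess for the unknown word is recorded only when the control
    word is typed correctly, as the article describes.
    """
    if normalize(typed_control) != normalize(challenge.control_answer):
        return False
    log.record(challenge.unknown_word_id, normalize(typed_unknown))
    return True
```

The point of the design is that the system never needs to know the answer to the uncertain word: passing the control word is what earns the user's other answer a place in the log.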

"To account for human error in the digitization process, reCAPTCHA sends every suspicious word to multiple users, each time with a different random distortion" and paired with a different control word. The process includes additional controls in cases of discrepancies. Careful analysis of the results suggests accuracy higher than 99%, which is acceptable by industry standards and better than standard OCR. Furthermore, the researchers found that deciphering a two-word reCAPTCHA takes no longer (around 13 seconds) than deciphering a one-word CAPTCHA that uses gibberish.
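One simple way to combine answers from multiple users is an agreement threshold. The sketch below is a guess at the general shape only: the article does not give the actual rule, and both the function name `resolve_word` and the `min_votes` value of 3 are assumptions made for illustration.

```python
from collections import Counter


def resolve_word(guesses, min_votes=3):
    """Return the agreed transcription, or None if more users are needed.

    Assumed rule (not from the article): accept the most common guess
    once it has at least `min_votes` votes and strictly more votes
    than the runner-up.
    """
    if not guesses:
        return None
    counts = Counter(guesses)
    (best, best_votes), *rest = counts.most_common(2)
    runner_up_votes = rest[0][1] if rest else 0
    if best_votes >= min_votes and best_votes > runner_up_votes:
        return best
    # Too few votes or a tie: keep sending the word to more users.
    return None
```

For instance, `resolve_word(["upon", "upon", "open", "upon"])` returns `"upon"`, while a word with only two conflicting guesses stays unresolved and would be shown to further users.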

The authors emphasize that reCAPTCHA is "a proof of concept of a more general idea: ‘Wasted’ human processing power can be harnessed to solve problems that computers cannot yet solve. Some have referred to this idea as ‘human computation.’"

For example, a CAPTCHA-like system called ASIRRA asks users to distinguish between pictures of dogs and cats, and can include photos from local animal shelters to promote adoption of homeless critters. Another approach involves working challenging computational problems into computer games; the authors write, "People play these games and, as a result, collectively perform tasks that computers cannot yet perform. Inspired by this work, biologists have recently built Fold It (http://fold.it), a game in which people compete to determine the ideal structure of a given protein." You will find several other amusing games of this type at the Games With a Purpose (GWAP) site.

Anyone interested in installing reCAPTCHA on Web sites can find full documentation online.

I hope that readers will come up with innovative applications of these ideas to the security field. If you do, please drop me a line and I’ll be glad to work with you to publicize your ideas.

I’m already, ah, GWAPing in amazement at the human ingenuity displayed in this work and look forward to further displays of creativity.

M. E. Kabay, PhD, CISSP-ISSMP, specializes in security and operations management consulting services and teaching. He is Chief Technical Officer of Adaptive Cyber Security Instruments, Inc. and Associate Professor of Information Assurance in the School of Business and Management at Norwich University. Visit his Web site for white papers and course materials.


Comments (1)

reCAPTCHA
By pjbrockmann on September 16, 2008, 11:14 am
This is great, but it should be a game. With points and prizes and advertisements while playing. That way the guilt associated with playing endless HALO multiplayer...
