Open source software such as Lucene and Linux played a crucial role in IBM's Jeopardy computer
The software that helped IBM's Watson computer reign victorious on the Jeopardy game show in February could also help the financial industry assess risk more effectively, a pair of IBM executives stated on Monday at a high-performance computing conference.
"The financial community is always interested in reducing risk and being more assured of their decisions. The analytics behind Watson can help with some of that," said Jean Staten, IBM's director of cross-company Linux usage, in an interview with IDG.
Staten and Eddie Epstein, who led the technical team behind the IBM Watson computer, both spoke Monday at the High Performance Computing Linux Financial Markets conference, held in New York.
Watson's ability to judge its confidence in an answer it provides to a user query could help further speed the split-second decision-making process in the financial community, the two noted.
In February, IBM's Watson system beat two previous champions at answering cleverly worded trivia questions on the Jeopardy television game show.
In each episode, after host Alex Trebek asked the contestants a question, Watson's best three answers would be posted on the bottom of the screen, along with a self-assessed confidence rating, ranging from 0 percent to 100 percent, of how likely each answer was to be correct. The greater the confidence level, the more likely the given answer would actually be correct, according to the machine.
That ability to rank the probable validity of computer-generated answers could be valuable to the financial community, Staten noted. Using an organization's internally collected data, such a feature would also allow hedge fund managers, for instance, to quickly make more informed decisions about investments to make, or to avoid.
The Watson system, when asked a question, will seek to find as many probable answers for that question as it can, explained Epstein. Each answer it produces will contain a confidence level, or an assessment of whether the answer is the correct one, based on analysis of the source of the data, the structure of the question, and other factors.
This approach is "representative, really, of what we do in everyday life. When we make decisions, we have a level of confidence in these decisions," Staten said, adding that we rarely know with absolute certainty whether our answer to some question is the correct one.
In Jeopardy, Watson would submit an answer if it had a confidence rating of around 80 percent or higher. Within organizations, the threshold of what constitutes a correct or a reliable answer could be calculated by using a set of training data, Staten said.
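To illustrate how a threshold might be derived from training data, here is a minimal Python sketch. It is not IBM's method: the function names, the sample data, and the idea of picking the lowest cutoff that still meets a target precision are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Each training sample pairs a confidence score (0.0 to 1.0) with whether
# the answer turned out to be correct. The values used below are invented.
Sample = Tuple[float, bool]


def pick_threshold(samples: List[Sample], target_precision: float) -> Optional[float]:
    """Return the lowest cutoff whose answered set meets the target precision.

    Hypothetical helper: walks the samples from most to least confident and
    tracks the share of correct answers among everything at or above each
    candidate cutoff. Returns None if no cutoff reaches the target.
    """
    ranked = sorted(samples, key=lambda s: s[0], reverse=True)
    best = None
    correct = 0
    for i, (confidence, is_correct) in enumerate(ranked, start=1):
        correct += 1 if is_correct else 0
        # Only evaluate once all samples sharing this confidence are counted.
        if i < len(ranked) and ranked[i][0] == confidence:
            continue
        if correct / i >= target_precision:
            best = confidence
    return best


def should_answer(confidence: float, threshold: Optional[float]) -> bool:
    """Answer only when the confidence clears the calibrated threshold."""
    return threshold is not None and confidence >= threshold


training = [
    (0.95, True), (0.90, True), (0.85, True), (0.80, False),
    (0.70, True), (0.60, False), (0.40, False),
]
threshold = pick_threshold(training, target_precision=0.78)
print(threshold, should_answer(0.85, threshold), should_answer(0.50, threshold))
```

A stricter target precision pushes the cutoff higher, so the system answers less often but is wrong less often, which is the trade-off a risk-averse organization would need to tune.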
The Watson system used a variety of open source technologies to help generate answers, each with its own set of algorithms for deriving answers. The Apache Hadoop data processing framework was used for preprocessing the vast amounts of unstructured data that IBM had collected for the challenge. Apache UIMA (Unstructured Information Management Architecture) provided the framework for deploying search engines, such as Apache Lucene and the Indri inference engine, that would figure out the best answers from a variety of perspectives.
Another potentially appealing aspect of Watson is its speed, Staten said. The Watson system ran on more than 2,800 processors across two racks of Power7 servers, all of which were configured to run multiple searches in parallel. This ability to carry out many search tasks in parallel allowed the system to provide answers to Trebek within three seconds.
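The general pattern of running several searches in parallel and then ranking the pooled candidates by confidence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three searcher functions are hypothetical stand-ins that return invented scores, and keeping the highest score per candidate is a simplification, not a description of how Watson combined its evidence.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[str, float]  # (answer text, confidence from 0.0 to 1.0)
Searcher = Callable[[str], List[Candidate]]


# Hypothetical searchers. Real ones would query an index or a corpus;
# these return fixed, made-up candidates so the sketch is runnable.
def keyword_search(question: str) -> List[Candidate]:
    return [("alpha", 0.62), ("beta", 0.14)]


def passage_search(question: str) -> List[Candidate]:
    return [("alpha", 0.81), ("gamma", 0.10)]


def title_search(question: str) -> List[Candidate]:
    return [("alpha", 0.55)]


def answer(question: str, searchers: List[Searcher], top_n: int = 3) -> List[Candidate]:
    """Run every searcher concurrently, then rank the merged candidates."""
    with ThreadPoolExecutor(max_workers=len(searchers)) as pool:
        result_lists = list(pool.map(lambda search: search(question), searchers))

    # Assumption: keep the highest confidence seen for each candidate.
    best: Dict[str, float] = {}
    for results in result_lists:
        for candidate, confidence in results:
            best[candidate] = max(confidence, best.get(candidate, 0.0))

    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:top_n]


ranked = answer("a sample clue", [keyword_search, passage_search, title_search])
for candidate, confidence in ranked:
    print(f"{candidate}: {confidence:.0%}")
```

Because the searchers run side by side, the total wait is roughly that of the slowest one rather than the sum of all of them, which is why spreading the work across many processors helps meet a tight time limit.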
"Speed is more important than ever. We will only have more and more unstructured data that we have to deal with. This is not going away. The technology that Watson is based on provides the framework within which many different industries could benefit," Staten said.
Staten said that IBM is still determining how to bring the Watson technology to the financial services industry. It may not offer a Watson-branded solution per se, though bits of the technology could come embedded within future IBM systems.