Extracting meaningful knowledge from the ever-growing mass of information about big-picture topics such as biology, economics, astronomy, health or climate is a challenge beyond most human minds and computer programs.
But scientists at the Defense Advanced Research Projects Agency (DARPA) want to change that with a program called Big Mechanism, which they say could gather all existing data about a particular topic, keep it up to date and develop new conclusions or research directions.
"Having big data about complicated economic, biological, neural and climate systems isn't the same as understanding the dense webs of causes and effects (what we call the big mechanisms) in these systems," said Paul Cohen, DARPA program manager. "Unfortunately, what we know about big mechanisms is contained in enormous, fragmentary and sometimes contradictory literatures and databases, so no single human can understand a really complicated system in its entirety. Computers must help us."
The Big Mechanism program might bring about new ways to understand complicated systems, DARPA said. "Today's researchers read deeply but struggle to keep up with relentless streams of relevant publications. To stay current, a researcher must specialize, becoming expert in a small part of something much bigger. The vision for the Big Mechanism program is fundamentally different: Every publication would immediately become part of a public, computer-maintained, causal model of a complicated system (a big mechanism), and every aspect of a big mechanism would be tied to the data that supports it or contradicts it. To the extent that we can automate the construction of Big Mechanisms, we can change how science is done," DARPA said.
In a nutshell, the Big Mechanism program will develop technology to read research abstracts and papers, extract fragments of causal mechanisms, assemble those fragments into more complete causal models, and reason over the models to produce explanations.
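The read-extract-assemble-reason pipeline can be sketched in miniature. In the illustrative Python below, the sample abstracts, the regex pattern, and the protein names are all invented for demonstration; DARPA's actual extraction technology would be far more sophisticated than pattern matching:

```python
import re
from collections import defaultdict

# Hypothetical mini-corpus standing in for paper abstracts.
ABSTRACTS = [
    "Overexpression of RAS activates RAF in many tumor cell lines.",
    "RAF activates MEK, a well-characterized kinase.",
    "MEK activates ERK, driving proliferation.",
]

# Toy pattern for one kind of causal statement.
CAUSAL_PATTERN = re.compile(r"(\w+) activates (\w+)")

def extract_fragments(texts):
    """Read each abstract and pull out (cause, effect) fragments."""
    fragments = []
    for text in texts:
        fragments.extend(CAUSAL_PATTERN.findall(text))
    return fragments

def assemble_model(fragments):
    """Merge fragments into one causal graph: cause -> set of effects."""
    graph = defaultdict(set)
    for cause, effect in fragments:
        graph[cause].add(effect)
    return graph

def explain(graph, source, target, path=None):
    """Reason over the model: find a causal chain from source to target."""
    path = (path or []) + [source]
    if source == target:
        return path
    for effect in sorted(graph.get(source, ())):
        if effect not in path:
            result = explain(graph, effect, target, path)
            if result:
                return result
    return None
```

Running `explain(assemble_model(extract_fragments(ABSTRACTS)), "RAS", "ERK")` produces the causal chain `["RAS", "RAF", "MEK", "ERK"]`, a toy version of the "explanations" the program is after.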
DARPA said it will aim the Big Mechanism program at cancer research first, specifically cancer pathways: the molecular interactions that cause cells to become and remain cancerous.
From DARPA: The program has three primary technical areas. First, computers should read abstracts and papers in cancer biology to extract fragments of cancer pathways. Next, they should assemble these fragments into complete pathways of unprecedented scale and accuracy, and figure out how those pathways interact. Finally, computers should determine the causes and effects that might be manipulated, perhaps even to prevent or control cancer.
"The language of molecular biology and the cancer literature emphasizes mechanisms," Cohen said. "Papers describe how proteins affect the expression of other proteins, and how these effects have biological consequences. Computers should be able to identify causes and effects in cancer biology papers more easily than in, say, the literatures of sociology or economics."
Actually building the Big Mechanism system sounds complicated, as you might imagine. According to DARPA: "The Big Mechanism program will require new research and the integration of several research areas, particularly statistical and knowledge-based Natural Language Processing (NLP); curation and ontology; systems biology and mathematical biology; representation and reasoning; and quite possibly other areas such as visualization, simulation, and statistical foundations of very large causal networks.
"Machine reading researchers will need to develop deeper semantics to represent the causal and often kinetic models described in research papers. Deductive inference and qualitative simulation will probably not be sufficient to model the complicated dynamics of signaling pathways and will need to be augmented or replaced by probabilistic and quantitative models."
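To give a rough sense of what a quantitative model adds over purely qualitative reasoning, here is a toy kinetic sketch. The two-step cascade, the rate constants, and the Euler integration are all invented for illustration and are not drawn from the program:

```python
def simulate_cascade(steps=1000, dt=0.01, k_act=1.0, k_deg=0.5, signal=1.0):
    """Toy quantitative model of a two-step signaling cascade.

    An external signal activates protein A, and A in turn activates
    protein B; both are degraded at rate k_deg. Simple Euler integration
    of dA/dt = k_act*signal - k_deg*A and dB/dt = k_act*A - k_deg*B.
    A qualitative model could only say "signal up-regulates B"; the
    quantitative version predicts how much and how fast.
    """
    a = b = 0.0
    for _ in range(steps):
        a += dt * (k_act * signal - k_deg * a)
        b += dt * (k_act * a - k_deg * b)
    return a, b
```

With the default parameters, A settles near its steady-state level of `k_act * signal / k_deg = 2.0`, while B climbs toward 4.0, the kind of dose-and-timing detail that qualitative simulation cannot express.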
For a look at what exactly DARPA will be looking for, go here.