Analyzing software bugs to death
Professor's project relies on users: 'There's always more bugs than engineers.'
By Ryan Debeasi, Network World, 07/10/2006
Ben Liblit wants to get the bugs out of your software, but he really could use your help.
His Cooperative Bug Isolation (CBI) project, with the aid of a few thousand users who run "instrumented" versions of e-mail clients, music players and
other software, is designed to help developers pinpoint the causes of bugs, often down to a specific line of code.
Cooperation is key because "there's always more bugs than engineers," says the University of Wisconsin assistant professor.
The CBI instrumentation technology records bits of data about a program as it runs and keeps track of whether it ran successfully
or crashed. Once the program has ended, this information is sent to a server for analysis. Liblit has configured several freely
available versions of programs that send data to his server, and they are listed as bug isolation downloads on the project's
Web site. Developers can instrument other software and have the data sent to their own servers to be analyzed.
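To make the idea concrete, here is a minimal sketch of what one run's report might contain. It is not CBI's actual format or code: the `RunReport` class, the `run_once` function and the predicate label "divisor == 0" are hypothetical names chosen for illustration, and the sketch only builds the report rather than sending it anywhere.

```python
import json
from collections import Counter


class RunReport:
    """Hypothetical per-run record: predicate counters plus an outcome flag."""

    def __init__(self):
        self.counts = Counter()
        self.crashed = False

    def observe(self, site, holds):
        # Only count how often the predicate was true; the values
        # being tested are never stored in the report.
        if holds:
            self.counts[site] += 1

    def to_json(self):
        # What a client could hand to an analysis server once the run ends.
        return json.dumps(
            {"crashed": self.crashed, "counts": dict(self.counts)},
            sort_keys=True,
        )


def run_once(values):
    """Toy 'instrumented program': divides 10 by each input value."""
    report = RunReport()
    try:
        for v in values:
            report.observe("divisor == 0", v == 0)
            _ = 10 / v
    except ZeroDivisionError:
        report.crashed = True
    return report
```

A report like this holds counts and a crash flag rather than the program's data, which is in the spirit of the small but nonzero leakage Liblit describes.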
Passwords and other specific information are not captured, but Liblit acknowledges that by the nature of the software "there
is some information leakage. . . . It's very small, but nevertheless it is nonzero." Although no security flaws have been
found, an instrumented program theoretically could send sensitive data such as encryption keys to Liblit's server. The program
would have to be structured and instrumented in a very specific way to do this, and a developer who encountered this issue
could avoid it by not instrumenting encryption routines. Still, IT managers may be leery of a program that phones home with
even a little of their data.
Liblit says running a program a few thousand times gives him and other researchers enough data to analyze statistically, to
compensate for false positives and negatives, and to correlate the program's bugs with specific pieces of code. Users haven't supplied
enough data for him to use - the University of Wisconsin pegs the total number of runs at about 3,000 per month - so he finds
bugs by running instrumented programs in the lab.
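One simple way to picture that correlation step is sketched below. It is an illustrative heuristic only, not the statistical analysis the CBI researchers use: the function name `rank_predicates` and the input format are assumptions. Each run is reduced to a crash flag and the set of recorded conditions that were true at least once, and each condition is scored by how much more often runs crash when it is true than runs crash overall.

```python
from collections import defaultdict


def rank_predicates(runs):
    """Rank conditions by how strongly they are associated with crashes.

    runs: list of (crashed, true_predicates) pairs, where crashed is a
    bool and true_predicates is a set of condition labels (hypothetical
    input format for this sketch).
    """
    total = len(runs)
    if total == 0:
        return []

    # Fraction of all runs that crashed, used as the baseline.
    baseline = sum(1 for crashed, _ in runs if crashed) / total

    seen = defaultdict(int)    # runs in which the condition was true
    failed = defaultdict(int)  # crashing runs in which it was true
    for crashed, preds in runs:
        for p in preds:
            seen[p] += 1
            if crashed:
                failed[p] += 1

    # Score: crash rate when the condition is true, minus the baseline.
    scores = [(failed[p] / seen[p] - baseline, p) for p in seen]
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [(p, round(score, 3)) for score, p in scores]
```

With only a handful of runs the scores are mostly noise, which is why a few thousand runs are needed before the ranking means much.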
In one paper, Liblit and other researchers explain how they used this technique to track down a crashing bug in a Linux-based calculator
called bc. This bug occurs sporadically, which makes it "inherently difficult to fix," the paper says. After running the program
3,051 times and analyzing the data that Liblit's instrumentation provided, researchers traced the issue to one offending
line of code.
Developers have taken notice. "His work has helped me a lot with Rhythmbox - certainly he's found some important bugs," Colin Walters,
developer of the music player Rhythmbox, wrote on a Red Hat mailing list. ". . . It's a really well done project - they have it so users can
easily opt out . . . and even include a tray icon [to show that the instrumentation is collecting data]."
Unlike Microsoft's somewhat similar Dr. Watson utility or the TalkBack software used to document crashes in Mozilla Firefox,
Liblit's software records data from both successful and failed runs. In addition, it collects data while the program runs,
not only when it crashes. According to a 2004 post on a developer mailing list, this method lets it catch bugs that tools like
TalkBack might have difficulty analyzing. (For example, if memory is corrupted in the crash, after-the-fact analysis of that data
might not be very helpful.) "There are cases where the crash is so severe that it's impossible for [TalkBack] to kind of reconstruct
exactly what's going on," says Chris Hoffman, director
of special projects for Mozilla, "[but] generally, those have been very few." Hoffman adds that TalkBack's method of capturing
data makes it "lightweight and nonintrusive."