As a security researcher, I spend a lot of my time analyzing malware trends.  This includes tracking data on the numerous types, variants, vectors and growth rates.  Occasionally, the byproduct of research yields information more interesting than the data itself.  One frequently occurring example is the reporting of statistics.

One website hacked every five seconds.  This statistic is spreading (...yes, every other letter is a link to a different site...a new trend called "letter linking", which I've just started) like a virus, thanks to Sophos' latest semiannual Security Threat Report, which reveals its cybercrime findings for the first half of 2008.  At almost three times the rate stated in its previous 2007 report (one every 14 seconds), this is definitely a media-worthy number.  Indeed, it seems to have been reported, blogged, quoted and re-reported throughout the Internet.  However, I was raised to always question authority and statistics, and especially statistics reported by authority.  Therefore, I decided to look at Sophos' numbers.

In their previous threat report, they stated that they see 6,000 newly infected webpages every day.  With the assistance of my TI-30X, I too reached the same 1 in 14 second conclusion.  Of these infected sites, they determined that approximately 20% were hacker sites (intentionally malicious), while the remaining 80% were legitimate sites that had been compromised.

Their current report claims detection of 16,173 infected webpages every day.  Once again, checking the math (this time with an abacus) I came up with the same 1 in 5 second number.  This time, they found that about 90% of these sites were legitimate ones compromised by hackers.
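For anyone without a TI-30X or an abacus handy, here is a minimal Python sketch of the same check.  It only converts the two daily counts quoted above into an average interval; the function name is my own, not anything from the Sophos reports.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def seconds_per_event(events_per_day):
    """Average number of seconds between events, given a daily count."""
    return SECONDS_PER_DAY / events_per_day


previous = seconds_per_event(6_000)    # figure from the previous report
current = seconds_per_event(16_173)    # figure from the current report

print(f"previous report: one every {previous:.1f} s")   # 14.4 s
print(f"current report:  one every {current:.1f} s")    # 5.3 s
print(f"increase factor: {16_173 / 6_000:.2f}x")         # 2.70x
```

Both headline numbers hold up as averages, and the ratio of about 2.7 matches the "almost threefold" description.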

One of the problems with statistics that claim the occurrence of an incident for a given interval (e.g., 1 out of every 5 people, 1 every 5 seconds) is the common misperception that these events actually occur with this uniform consistency.  Applied here, that would mean one website being hacked precisely every 5 seconds, which is not literally the case.  While such a figure is effective at demonstrating a relative change, the real data points are usually randomly dispersed as individual and clustered events.
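To illustrate the point, here is a small simulation sketch.  It assumes (my assumption, not anything Sophos states) that infections arrive as a Poisson process, so the gaps between them are exponentially distributed around the same daily average of 16,173.  The function name, sample size and seed are arbitrary choices for the example.

```python
import random
import statistics

SECONDS_PER_DAY = 24 * 60 * 60
EVENTS_PER_DAY = 16_173  # daily count quoted from the current report


def simulate_gaps(events_per_day, n_events, seed=0):
    """Draw gaps (in seconds) between events, ASSUMING Poisson arrivals."""
    rng = random.Random(seed)
    rate = events_per_day / SECONDS_PER_DAY  # events per second
    return [rng.expovariate(rate) for _ in range(n_events)]


gaps = simulate_gaps(EVENTS_PER_DAY, 10_000)
mean_gap = statistics.mean(gaps)
share_under_1s = sum(g < 1 for g in gaps) / len(gaps)
share_over_15s = sum(g > 15 for g in gaps) / len(gaps)

print(f"mean gap:         {mean_gap:.2f} s")
print(f"gaps under 1 s:   {share_under_1s:.1%}")
print(f"gaps over 15 s:   {share_over_15s:.1%}")
```

Under this assumption the average gap still comes out near 5 seconds, yet a sizeable share of events land less than a second apart while others are separated by 15 seconds or more.  Real infection data is probably burstier still, since a single campaign can compromise many pages at once.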

A discussion of malware research with security analyst David Marshallick provided some very helpful information: "For any first hand investigation on malware trends and behavior, the obvious first step is collecting the samples.  There are a number of low-interaction, automated collection apps, like Nepenthes, honeyd, or mwcollect.  Download and play with these - you'll get a better understanding of the malware collection process.  Generally, the next step is a system level analysis of the samples in a sandboxed environment.  The analysis and disassembly give you the useful information for comparison against the existing database to identify unique samples."  Another collection solution, using Windows, is to turn off all of your PC security and just surf the internet, installing anything that asks for permission, downloading everything claiming to be a security necessity, and clicking on every dancing animal you can find.

It is important to keep in mind that any vendor, organization or individual can only report on the data they have acquired or can access.  Different vendors may have different data sets and, subsequently, different findings.  A recent update from Kaspersky reports that in July 2008, unique malware accounted for 20,704 of the unwanted programs found on users' computers.  Spread over the 31 days of July, this would indicate that new malware is turning up at a rate of roughly 1 every 2 minutes.  I found similar insights on the reporting of malware statistics at the Security Curve Weblog.

Now let's take a moment to look at internet growth statistics.  According to Netcraft, there are approximately 175 million websites on the Internet as of July 2008, which is roughly 20 million more than in their January 2008 report.  This means that during this 6 month period, 20 million new websites appeared, which translates into roughly 110,500 new sites every day.  Extrapolating these numbers even further, we find that about 4,604 new sites are created every hour, or 6.4 new websites every 5 seconds (Please, somebody check my math).  If the growth rate of the Internet (or websites) is greater than that of malware, does that indicate a relative decline in malware propagation rates?
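Since the math check was requested, here is a short Python sketch of it.  The post does not say how many days were used for the 6 month period; 181 days is my inference because it reproduces the 110,500 figure, and 182 days (January 1 to July 1 in leap year 2008) is shown alongside for comparison.  The function name is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def growth_rates(new_sites, days):
    """Return average new sites per day, per hour, and per 5 seconds."""
    per_day = new_sites / days
    per_hour = per_day / 24
    per_5s = per_day / SECONDS_PER_DAY * 5
    return per_day, per_hour, per_5s


# 181 days is an inferred divisor; 182 is Jan 1 to Jul 1, 2008.
for days in (181, 182):
    per_day, per_hour, per_5s = growth_rates(20_000_000, days)
    print(f"{days} days: {per_day:,.0f}/day, "
          f"{per_hour:,.0f}/hour, {per_5s:.1f} per 5 s")
```

With 181 days this prints 110,497 per day, 4,604 per hour and 6.4 per 5 seconds; with 182 days it prints 109,890 per day, 4,579 per hour and still 6.4 per 5 seconds.  Either way the figures in the post hold up as rounded averages.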

If there is any accuracy or truth in these numbers, they shed a different light on the findings of Sophos.  With the number of infection rate variables to consider (websites vs. webpages, URLs, domains, ISPs, security platforms, operating systems, etc.), I doubt we will ever have completely accurate numbers regarding anything related to Internet usage.  However, the use of trends, changes, and patterns will continue to provide helpful security metrics...just remember to keep them in context with their source and perform your own evaluation.

"Statistics are the greatest liars of them all" - Thomas Carlyle, 19th century essayist, satirist, philosopher, and statistics critic.

