Wayback Machine indexes 400 billionth page

Internet Archive announces achievement

The Internet Archive Wayback Machine, an indispensable chronicler of the Web for going on two decades now, late last week announced a major milestone.

From an Internet Archive blog post:

The Wayback Machine, a digital archive of the World Wide Web, has reached a landmark with 400 billion webpages indexed.  This makes it possible to surf the web as it looked anytime from late 1996 up until a few hours ago.

The post lists a number of historical highlights, including:

  • 2001 - The Wayback Machine is launched.  Woo hoo.
  • 2006 - Archive-It is launched, allowing libraries that subscribe to the service to create curated collections of valuable web content.
  • March 25, 2009 - The Internet Archive and Sun Microsystems launch a new datacenter that stores the whole web archive and serves the Wayback Machine.  This 3 Petabyte data center handled 500 requests per second from its home in a shipping container.
  • October 26, 2012 - Internet Archive makes 80 terabytes of archived web crawl data from 2011 available for researchers, to explore how others might be able to interact with or learn from this content.
  • October 2013 - New features for the Wayback Machine are launched, including the ability to see newly crawled content an hour after we get it, a "Save Page" feature so that anyone can archive a page on demand, and an effort to fix broken links on the web starting with WordPress.com and Wikipedia.org.

Not included in the timeline was any mention of a fire on Nov. 6 of last year that did more than $600,000 in damage to digitization equipment at the Internet Archive's scanning center in San Francisco.

(What online news sites looked like on 9/11)

The Wayback Machine has proven useful to me on a number of occasions, most memorably in assembling this collection of online news site images from Sept. 11, 2001, forever known as 9/11.

Four hundred billion is a lot of pages. In fact, the archive now serves up about 100 billion more pages than McDonald's has served hamburgers.
