Network World

Wikipedia gears up for explosion in digital media

Bigger upload limits made possible by new servers, storage
By Jon Brodkin, Network World
January 13, 2009 12:45 PM ET

Wikipedia is gearing up for an explosion in digital content with new servers and storage designed to handle larger photo and video uploads.

Video file sizes are quickly reaching the dozens and hundreds of megabytes, and the proliferation of high-megapixel cameras means even small photos can take up a few megabytes, says Brion Vibber, CTO at the Wikimedia Foundation, which operates Wikipedia. Until early 2008, the user-generated encyclopedia's primary media file server had just 2TB of total space, Vibber says.

"For a long time, we just did not have the capacity [to handle very large media files]," he says.

Wikipedia has since scaled up from 2TB to 24TB and now 48TB of storage for its primary media file server, and recently raised file upload limits from 20MB to 100MB. The amount of storage actually being used is about 5TB, but that will grow quickly, Vibber says.

For example, users are uploading public domain classical music, and some recordings can last a half-hour. Documentary films that are out of copyright are also being uploaded, and some users are struggling to keep files under the 100MB limit, according to Vibber.

Vibber's long-term goal is to let users upload feature-length, high-quality videos, but he says that, beyond capacity limits, there are challenges in getting files into the appropriate format and in physically moving large files.

"There's no limit, and there's no practical limit," Vibber says. "The limits will get bigger and bigger to where it will be relatively easy for someone who has a legitimate need to upload a two-hour video of good quality."

Wikipedia's new servers and storage were supplied by Sun, which provided donations and special pricing on Sun Fire x4500 and x4150 servers and StorageTek arrays. Wikipedia, which handles more than 10 billion page views a month, also uses Sun's open source MySQL as its primary database software.

Wikimedia operates a primary data center in Tampa, Fla., and caching centers in Amsterdam and South Korea, and has a total of 450 or so commodity servers from a mix of vendors, Vibber says.

The site's technology goals this year and next include improving usability, making it easier to edit articles and upload videos and photos, and integrating with media file sites like Flickr, he says.

The donations from Sun came in the midst of a big fundraising push by Wikimedia, which raised more than $6 million from 125,000 donors, allowing it to fund operations through the end of its fiscal year on June 30. Wikimedia is a nonprofit and the encyclopedia Web site is devoid of advertising, so the site relies on donations, grants and gifts. Wikipedia has a paid staff of 23 employees, about eight of whom work in IT.
