Network World

Outage hits Amazon S3 storage service

Amazon isolates problem, but some troubles persist
By Jon Brodkin , Network World , 02/15/2008

Amazon S3 went down Friday morning, causing numerous problems in Web applications that rely on the online storage service.

Graphics on Twitter and Tumblr were down as a result, one blogger states.

S3 users and an Amazon employee discussed the problem on a developer connection message board hosted by Amazon Web Services. The first poster on the thread reported an outage and “massive (500) internal server error” that began around 4:30 a.m. PST.

“We are seeing the same thing with our Web site, which hosts all videos on Amazon S3,” one poster wrote. “What is the ETA for getting this back up? Please keep us posted on the forums, since this is mission critical for some.”

An Amazon Web Services employee named “Kathrin” said “we’re investigating” just after 5 a.m. PST. At 7:17 a.m. PST, she said the issue was resolved and performance was “returning to normal levels.” But a further post by Kathrin at 9:09 a.m. PST acknowledged some lingering problems.

“This morning’s issue has been resolved and the system is continuing to recover,” she wrote. “However, we are currently seeing slightly elevated error rates for some customers and are actively working to resolve this. More information on that to follow as we have it. Also, we wanted to reiterate per our previous post that [we] will absolutely be posting technical information about what happened earlier this morning; our current priority of course is to ensure that the service recovers as quickly as possible and remains stable. We appreciate your patience while we do so.”

Online “on-demand” storage services are becoming increasingly popular, and Amazon is one of the leading providers. The outage this morning served as a reality check to some customers who say it’s good to have a backup plan in case the service fails as it did today.

“This is why you have to set up a fail-safe,” one poster on the Amazon Web Services thread wrote. “My new sites host over 25,000 images on Amazon and I wake up to notice major issues this morning. I switched over to using my local server and everything is back up. . . . I really need to set something up so it does this automatically. The S3 service is great but this just proves you can’t rely on it. This is a major issue especially since it’s been down for so long. Way to go Amazon.”
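The automatic switch-over that poster wishes for could look roughly like the Python sketch below. It is an illustration under stated assumptions, not anything Amazon or the poster published: the names fetch_with_fallback, primary_store and local_server are hypothetical, and the two sources are in-memory stand-ins rather than real S3 or Web server calls. The idea is only to try each storage source in order and serve the first copy that can be fetched.

```python
from typing import Callable, Sequence


class AllSourcesFailed(Exception):
    """Raised when none of the configured sources could return the object."""


def fetch_with_fallback(key: str, sources: Sequence[Callable[[str], bytes]]) -> bytes:
    """Try each source in order and return the first successful result."""
    errors = []
    for source in sources:
        try:
            return source(key)
        except Exception as exc:  # e.g. a 500 error from the primary store
            errors.append(exc)
    raise AllSourcesFailed(f"{key}: {len(errors)} source(s) failed")


# Hypothetical stand-ins for illustration only; a real site would issue
# HTTP requests to its storage service and to its own server here.
def primary_store(key: str) -> bytes:
    raise RuntimeError("500 Internal Server Error")


def local_server(key: str) -> bytes:
    return b"local copy of " + key.encode("utf-8")


image = fetch_with_fallback("logo.png", [primary_store, local_server])
```

Listing the online storage service first and the local server second keeps the hosted service as the normal path and falls back only when a request to it fails.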

Amazon released a statement Friday afternoon confirming the outage and saying the service was back to more than 99% of normal performance.

“One of our three geographic locations was unreachable for approximately two hours and was back to operating at over 99% of normal performance before 7 a.m. PST,” Amazon stated. “We’ve been operating this service for two years and we’re proud of our uptime track record. Any amount of downtime is unacceptable and we won’t be satisfied until it’s perfect.”

These types of “glitches” are inevitable, though, writes technology observer (and author of The Big Switch: Rewiring the World from Edison to Google) Nicholas Carr on his blog, noting his belief in “the utility mode of computing.”
