The term 'disruption' gets tossed about a lot -- too often -- in the technology industry. But it isn't always hype. Backed by nearly half a billion dollars in investment, CEO Scott Dietzen and Pure Storage are hard at work disrupting a big chunk of the enterprise storage market owned by the likes of NetApp and EMC. EMC is no stranger to disruption itself, having turned the tables on a previous generation of storage leaders.
Pure Storage is pure flash storage and it is aggressively driving its flash technology into Tier 1 storage applications in the data center. Forget flash being pricey. Dietzen, a veteran of multiple successful startups, says Pure Storage has driven the cost of flash below that of disk storage for such applications and he even promises customers a savings of $500,000 per storage appliance per year.
In this installment of the IDG Enterprise CEO Interview Series, Dietzen spoke with Chief Content Officer John Gallant about how Pure Storage is changing the economics of flash and how the company intends to elbow the current market leaders aside in the next-generation data center.
Scott, talk about how your work with earlier startups shaped your strategy at Pure Storage.
I was the CTO of WebLogic and then BEA Systems. I have a PhD in computer science, became an entrepreneur, have done four startups. The prior three were successful although they all ended in acquisition. Generally, I had the good fortune to land in big markets that were being fundamentally disrupted by innovative ideas and great teams. That's the recipe I played before. The recipe is a big market, a technology sea change of some variety, an innovative [strategy] and a world-class team. When you put those four ingredients together you can vastly reduce the risk to success.
Sounds like a good formula. Were any of the startups involved in storage?
No. I sold around storage for my whole career and have believed for quite some time that it's broken. You look at what it takes to roll these half-ton double Sub-Zero refrigerators of mechanical spindles into the data center and the amount of esoteric management and tuning they require and how profoundly slower they are. Storage consumes a lot of people's work. It consumes way too much cash. It's power hungry and it's a thousand times slower than anything else in the data center. It seemed like a good place to disrupt.
Describe the inception of Pure Storage.
Flash memory had already redefined consumer technology. It's what's in your mobile phone; it's what's in your laptop. What a lot of people don't know is the technology is used at scale in most of the large consumer Web data centers. Google, Facebook use a lot of flash technology. We have principals from places like Apple, where the original iPod work was done. iPods used to have a disk inside them and they switched over to flash. We have principals from the infrastructure team at Google. We came together to package up this flash revolution and make it available to enterprise customers. What we thought were the two keys to doing that were economics and compatibility. The consumer tech properties -- like Google, like Facebook -- they generate quite a bit of cash. They are less price-sensitive than a typical enterprise business is. Most customers' storage budgets aren't going up, so unless you can fit into those storage budgets you don't have a lot of opportunity.
Second, the Googles and the Facebooks of the world own their own software stack so they can make changes where they need to in order to accommodate a media swap from mechanical disk to flash. Most businesses run off-the-shelf software from Oracle, Microsoft, VMware, SAP and so on. We needed to be able to fit into form factors that would work with all of that existing investment and that was what Pure did that was unique. We came out with an all-flash product that cost less to buy than disk. We didn't have to go for a total cost of ownership case. We went off and beat the acquisition price point of the mechanical disk arrays and we offered all the qualities of service, all the software capabilities for protecting and managing data that the enterprises were counting on.
It fits -- but we can save customers on average $500,000 per year per appliance and that's a combination of things like power, cooling, rack space, administrative overhead.
One of the things customers don't like about storage is the way the big vendors force them to rebuy the same storage over and over every four or five years. We found a way to fix that so that you can continue to run Pure Storage without having to replace an array for 10 to 15 years if you choose.
The conventional wisdom is that flash is still more expensive and it'll be some period of time before that changes. So how do you deliver that at lower cost?
The principal thing is we reduced the amount of space you needed to store the same data. We used techniques like compression and deduplication. These are techniques that a company like Riverbed [uses] on your wide-area network. Nobody had ever applied these techniques to primary storage because they believed you couldn't do it fast enough to work in a storage discipline. We got some help from Moore's Law but we crafted algorithms that allowed us to dedupe and compress data in under a millisecond. On average, we achieved 5.5x data reduction, which means we require 5.5x less media than a traditional storage solution would. Breakeven with disk is currently around 3x to 4x, and that is against fast disk. These are the more expensive spindles that are used in Tier 1 and that is what we are replacing today. That is some $10 billion per year spend and we think that spend in its entirety goes to flash because of these innovations.
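The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, not Pure Storage's pricing model: the function names and the per-gigabyte prices are hypothetical, and only the 5.5x reduction ratio and the 3x to 4x breakeven range come from the interview.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per GB of data actually stored, after compression and dedupe."""
    if reduction_ratio <= 0:
        raise ValueError("reduction_ratio must be positive")
    return raw_cost_per_gb / reduction_ratio


def breakeven_ratio(flash_raw_cost_per_gb: float, disk_cost_per_gb: float) -> float:
    """Data reduction needed for flash to match disk on cost per stored GB."""
    return flash_raw_cost_per_gb / disk_cost_per_gb


# Hypothetical prices, chosen only so breakeven lands inside the quoted 3x-4x range.
DISK_COST_PER_GB = 1.00    # assumed Tier 1 disk price (arbitrary units)
FLASH_COST_PER_GB = 3.50   # assumed raw flash price (arbitrary units)
REDUCTION_RATIO = 5.5      # average data reduction quoted in the interview

needed = breakeven_ratio(FLASH_COST_PER_GB, DISK_COST_PER_GB)
flash_effective = effective_cost_per_gb(FLASH_COST_PER_GB, REDUCTION_RATIO)

print(f"Breakeven reduction ratio: {needed:.1f}x")
print(f"Effective flash cost per stored GB: {flash_effective:.2f}")
print(f"Flash cheaper than disk: {flash_effective < DISK_COST_PER_GB}")
```

Any reduction ratio above the breakeven figure puts the effective cost of flash below disk, which is the point the 5.5x average is meant to make.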
How did you get people to believe that?
The only way people believe it is if they do a proof of concept with us and they try it out in their data centers on their workloads. Seeing is believing. The enterprise buyers have gotten a lot more sophisticated. People never buy on PowerPoint anymore. You may get their attention but then they want you to take all the risk, front the cost for them to try the technology out in their data center on their workloads and, if you deliver on your promise, then you've got a good chance to win. The large majority of the time we fund one of these proof-of-concept systems, which we're doing continuously, they convert. People become customers because the reality is actually better than we even promise it's going to be.
That seems like a very expensive way to break into a market.
We've been very well funded. The business has raised $470 million over the course of six rounds of financing. That's basically headcount and capital cost to build the business, headcount because we need support in sales and SEs [sales engineers] ahead of those customers per system and then we need to fund the inventory for people to try it. As long as these conversion rates stay high, it's a no-brainer for us to fund it. Another key part of that equation is repeat business. Storage is land and expand: the customer likes their first one, they'll buy another. More than half of our customers within the first year buy another one and then they buy another one, and those subsequent purchases don't require POCs.
Part of this is a cost hurdle but part of this transition is also familiarity and a comfort-level hurdle. Is it difficult for people to get over the flash hurdle or do they immediately accept this once they get that cost saving in hand?
The cost savings is part of it. I think the other motivator is a business transformation. Every time we unplug one of these Sub-Zero refrigerators and replace it with one of our microwaves, we'll eliminate more than a year of latency every month. That's how much time the upstream applications are waiting for mechanical disk to seek and rotate.
I can probably make this tangible. When you do Google instant search, when you type and you see the results real time, that's flash. That's what flash does. When you use the information systems at your office, they probably don't feel like that. The main reason is mechanical disk is underneath them and so there's all of this buildup in time. The rest of the data center operates close to the speed of light and then the whole world stops for an eternity while that mechanical spindle tries to seek and rotate to get to where the data is.
By eliminating that year of latency every month, all of the information systems get profoundly more productive. You can go off and process 20 to 50 times the amount of data in the same period. If you're doing risk analytics, if you can look at 20 times the data in your calculations you can get a much more informed decision or you can pass it directly on in employee productivity. Doctors and nurses, when they're running information systems, just by moving them over to Pure, they've found 15% productivity increases. So less time interacting with technology, more time interacting with patients.
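To get a feel for what "a year of latency every month" implies, here is a rough back-of-the-envelope sketch. The per-I/O latencies are assumptions for illustration (about 5 ms for a mechanical disk access and 1 ms for flash); the interview gives no per-I/O figures, and the names below are hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600        # 31,536,000 seconds
SECONDS_PER_MONTH = SECONDS_PER_YEAR / 12


def ios_for_saved_latency(saved_seconds: float,
                          disk_latency_s: float,
                          flash_latency_s: float) -> float:
    """Number of I/Os needed to accumulate `saved_seconds` of avoided waiting."""
    saved_per_io = disk_latency_s - flash_latency_s
    if saved_per_io <= 0:
        raise ValueError("disk latency must exceed flash latency")
    return saved_seconds / saved_per_io


# Assumed latencies, not taken from the interview.
DISK_LATENCY_S = 0.005    # ~5 ms seek plus rotation
FLASH_LATENCY_S = 0.001   # ~1 ms

ios_per_month = ios_for_saved_latency(SECONDS_PER_YEAR, DISK_LATENCY_S, FLASH_LATENCY_S)
avg_iops = ios_per_month / SECONDS_PER_MONTH

print(f"I/Os per month to save one year of waiting: {ios_per_month:,.0f}")
print(f"Average sustained I/O rate: {avg_iops:,.0f} per second")
```

Under these assumptions, a sustained load of roughly 3,000 I/Os per second is enough to add up to a year of avoided waiting each month, a rate well within reach of a busy Tier 1 array.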
You mentioned the savings on average, $500K per year per appliance. I want to make sure I understand exactly how you measure that and how you proved it to people.
We actually have a total cost of ownership calculator that our salespeople will work through with a customer. We'll ask questions like: How much does power cost in your data center? How much do you have to spend for rack space? How much does cooling cost? Then we will tie in what their budget for storage refreshes is. The way storage is sold today, the vendor will come in and sell you three years of maintenance with your initial contract. The pricing generally looks pretty appealing at that point.
Three years in, your year-four maintenance bill shows up and it's dramatically higher maintenance for years four and five, so high that you're generally financially better off buying a new storage array. That's not customer friendly because you've got the logistics challenge of moving the new Sub-Zero refrigerators in and moving the old ones out. You've got to stage downtime to migrate the data from the old storage to the new. They add insult to injury by basically making you rebuy the same software. It's exactly the same software that's coming in on the new box that you're writing off going out on the old box. This is a huge hidden tax that's built in to the way that customers have to buy storage. They're effectively rebuying the same storage every four or five years.
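The kind of comparison such a calculator performs can be sketched as a sum of annual line items. This is only an illustrative sketch: the class, field names and dollar figures are hypothetical and do not describe Pure Storage's actual tool, which the interview does not detail beyond the questions it asks.

```python
from dataclasses import dataclass


@dataclass
class AnnualCosts:
    """Yearly cost line items for one storage system (all values hypothetical)."""
    power: float
    cooling: float
    rack_space: float
    admin: float
    refresh: float  # cost of periodic array re-buys, averaged per year

    def total(self) -> float:
        return self.power + self.cooling + self.rack_space + self.admin + self.refresh


def annual_savings(current: AnnualCosts, proposed: AnnualCosts) -> float:
    """Positive result means the proposed system costs less per year."""
    return current.total() - proposed.total()


# Made-up example inputs, in dollars per year.
disk_array = AnnualCosts(power=60_000, cooling=40_000, rack_space=50_000,
                         admin=150_000, refresh=250_000)
flash_array = AnnualCosts(power=6_000, cooling=4_000, rack_space=5_000,
                          admin=50_000, refresh=0)

print(f"Estimated annual savings: ${annual_savings(disk_array, flash_array):,.0f}")
```

Folding the refresh cycle into a per-year figure is what exposes the "hidden tax" of rebuying an array every four or five years.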
I was going to ask you about that because you made that comment before. How does it change with your systems?
We build into our maintenance an evergreen storage array. What that means is we promise we're never going to spike your maintenance in years [to come]. We're going to maintain flat or even declining maintenance cost. If the market moves down, which it generally does over time, we'll actually let customers dig into those savings and we include in our maintenance, which is comparable to what the big vendors charge, controller refreshes. We actually bring in new processor nodes once every three years so the customer can stay on the latest and greatest software without having to rebuy any new storage. What that means is a customer gets a very predictable OpEx and the only capital expenditure that they would have to keep running their storage is if they want to expand capacity, which we make very easy for them to do in a non-disruptive, very predictable way. We removed a lot of the overhead as well as a lot of the friction from the storage procurement process and save customers a lot of money in the process.
Have you got a great sense at this point of the reliability of flash?
It's dramatically better than mechanical systems, which is not surprising. Silicon, once it's initially burned in, really doesn't fail. We see failure rates below 0.05% annually, which is on the order of 100 times better than what is typical for mechanical spindles. It's 5% down to 0.05%, so over 100 times better reliability from the media.
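The ratio quoted here is easy to check. In the sketch below, the two annual failure rates come from the interview, while the fleet size of 1,000 devices and the function names are hypothetical, added only to show what the rates mean in practice.

```python
def reliability_improvement(disk_afr_pct: float, flash_afr_pct: float) -> float:
    """How many times lower the flash annual failure rate is than the disk rate."""
    if flash_afr_pct <= 0:
        raise ValueError("flash_afr_pct must be positive")
    return disk_afr_pct / flash_afr_pct


def expected_annual_failures(device_count: int, afr_pct: float) -> float:
    """Expected number of device failures per year for a fleet."""
    return device_count * afr_pct / 100.0


DISK_AFR_PCT = 5.0     # mechanical spindles, as quoted
FLASH_AFR_PCT = 0.05   # flash upper bound, as quoted ("below 0.05%")
FLEET_SIZE = 1000      # hypothetical number of devices

ratio = reliability_improvement(DISK_AFR_PCT, FLASH_AFR_PCT)
print(f"Improvement: about {ratio:.0f}x")
print(f"Expected disk failures per year:  {expected_annual_failures(FLEET_SIZE, DISK_AFR_PCT):.1f}")
print(f"Expected flash failures per year: {expected_annual_failures(FLEET_SIZE, FLASH_AFR_PCT):.1f}")
```

Because 0.05% is given as an upper bound for flash, the improvement over a 5% disk failure rate is at least 100x, consistent with "over 100 times better".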