Major market shifts in the database world don't happen often. When they do, they're massive, creating an impact that can last 10 to 20 years. When I entered the job market, it was right at the tail end of the last major shift, from the mainframe to client/server. Our CIO announced his (aggressive) vision of the client/server future, and I noticed three primary responses from my more seasoned colleagues. They ridiculed and ignored it; did a little nail biting, wondering what it all meant; or got on board as fast as they could.
The new databases of today have the potential to be every bit as disruptive as the client/server wave, and the human reactions I saw nearly 20 years ago haven't changed all that much. These disruptors bring change, but while it sounds cliche, the truth is that with this change comes tremendous opportunity.
Let's try to classify some of these changes into a few big buckets so we can get a feel for how they could affect the data center, as well as individuals from a career perspective.
For a whole lot of reasons that are beyond the scope of this article, modern applications are being asked to scale and perform at radically higher rates than ever before. Have you ever wondered what manages the data behind these applications?
Here's a hint: It's often not a relational database, and the reason is something called the CAP theorem.
The CAP theorem -- first articulated by Eric Brewer in 2000 -- essentially explains what the Rolling Stones knew long ago: "You can't always get what you want." CAP is an acronym for consistency, availability and partition tolerance (the ability to keep running when the network links between servers fail, which any scale-out design has to cope with). To drastically oversimplify, it says: "Pick any two, but you can't have all three."
Relational databases are fantastic at consistency and availability. But partition tolerance? Not so much. That's a problem, because what's often needed most in these modern applications is the ability to scale out across many database servers, even at the expense of something else. That something is transactional consistency. So what if your status doesn't immediately appear on all your friends' walls? Not a huge deal. But if that application is running slow because it can't scale -- that's a show stopper. This is where the NoSQL database comes into play.
There is a lot of debate over what to call these things, but I think NoSQL conveys the point that this is not a relational database, nor does it have things like the transactional consistency we take for granted in a relational model. (There's no reason that, in theory, they cannot use SQL, but the name caught on, so we're going to run with it.) What they can do is scale out like nobody's business without the complexities of data modeling and schema management, which most developers just love.
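To make the tradeoff concrete, here is a minimal Python sketch of the "status on your friends' walls" example. It is a toy, not how any particular NoSQL product works: the class ToyReplicatedStore, its methods and the key names are all hypothetical, invented for illustration. A write lands on one replica right away and reaches the others only later, so a read from another replica can briefly return stale data.

```python
from collections import deque


class ToyReplicatedStore:
    """Toy key-value store with lazy replication (hypothetical, for illustration only)."""

    def __init__(self, replica_count):
        # Each replica is just an in-memory dict standing in for a database server.
        self.replicas = [{} for _ in range(replica_count)]
        # Updates accepted by one replica but not yet applied to the others.
        self.pending = deque()

    def write(self, key, value, replica=0):
        # Accept the write on one replica immediately, without waiting for the rest.
        self.replicas[replica][key] = value
        for index in range(len(self.replicas)):
            if index != replica:
                self.pending.append((index, key, value))

    def read(self, key, replica):
        # Reads never wait on other replicas, so they may see old (or no) data.
        return self.replicas[replica].get(key)

    def sync(self):
        # Deliver every queued update; afterwards all replicas agree.
        while self.pending:
            index, key, value = self.pending.popleft()
            self.replicas[index][key] = value


store = ToyReplicatedStore(replica_count=3)
store.write("status:alice", "At the beach")

fresh = store.read("status:alice", replica=0)      # the replica that took the write
stale = store.read("status:alice", replica=2)      # a replica that has not caught up
store.sync()
converged = store.read("status:alice", replica=2)  # same replica after catching up
```

Here the write returns without coordinating with every server, which is what lets the system keep adding servers; the price is the window in which `stale` is still empty. A relational database would typically close that window by making the write wait.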
MapReduce and Hadoop
NoSQL databases strive to meet the scale demands for large-scale Web applications. However, they don't do much to address the business need for analyzing the resulting masses of data. One solution is to go back to the future and create a brute-force way of scaling and processing. Google's answer was MapReduce, which most of the world now knows in its open source form, Hadoop.
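For readers who have not seen the model, the following is a rough, single-machine Python sketch of the MapReduce idea applied to counting words. It is not Hadoop's API; the function names map_phase, shuffle, reduce_phase and word_count are made up for this illustration. In a real cluster, the map and reduce steps would run in parallel on many nodes.

```python
from collections import defaultdict


def map_phase(document):
    # Map: emit one (word, 1) pair for every word in the document.
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    # Shuffle: group all emitted values by key so each key can be reduced on its own.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    # Reduce: collapse the list of values for one key into a single result.
    return key, sum(values)


def word_count(documents):
    pairs = []
    for document in documents:  # on a cluster, each document could go to a different node
        pairs.extend(map_phase(document))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big clusters", "data everywhere"])
print(counts)
```

The brute-force appeal is that neither the map step nor the reduce step for one key needs to know about any other, so the work can be spread over as many machines as the data requires.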
Apache Hadoop can churn through petabytes of data across thousands of nodes running nothing more than commodity hardware. The reduction in storage costs alone can be astounding, but there are definitely tradeoffs. Setting up Hadoop is not trivial, and your army of people who know SQL will have to learn about things like Hive and Pig (I know… it is hard to write that with a straight face), which turn higher-level queries and scripts into MapReduce jobs to handle your business needs. While many see these tradeoffs as a step backwards, Hadoop does appear to be here to stay.
Specialized databases aren't new, but they are getting a lot of attention, and for good reason. Relational databases have become like the Borg, assimilating various technologies along the way to become all things to all applications. They're now big and complicated. Specialized database advocates argue that, rather than trying to solve every problem with one solution, it is better to build the right specific technology to solve a specific problem. Doing so, they say, dramatically decreases cost and increases performance.
The most common specialized database today focuses on solving the "data warehousing" problem: read-intensive queries running against massive amounts of data. Another specialization is occurring around online transaction processing systems that require very high transaction rates on relatively small data sets, and we expect to see more and more breakthroughs as in-memory and solid state disk technologies change the economics of delivering high I/O rates.
Your Career Amidst the Chaos
The database world is no longer the same old story of the top three vendors vying for a few percentage points of market share here and there. Big changes are on the horizon and, whether they come from new technology, off-shoring or outsourcing, the result is the same: career pressure.
I see this disruption most acutely in the role of the database administrator. I'm concerned for those who are so swamped with current workloads that it seems impossible to take the time to follow these trends. Equally troubling are those who approach these new technologies with condescending, dismissive attitudes.
To the seasoned database administrator, I would say that your experience can be invaluable if you are a team player willing to step out of your comfort zone, learn about these new disruptors and take initiative. Become an internal consultant to your company. Learn more about the business of IT. Understand the data and application, not just the database. Proactively test and report back on some of these new technologies, showing how and why they could affect your company.
For developers, this time is incredibly exciting. The ability to deliver high-volume, scalable applications that can reach untold audiences has never been greater, and it's amazing how much you can do right from home, for next to nothing in terms of cost. Start some projects for your family, school, church, etc. Dive in, learn and have fun with it. Developers have always led the way into brave new worlds, and the opportunities now are staggering.
Finally, to CIOs and data center owners, here is an opportunity to help your teams learn and grow. Reward innovation. Make time for anyone in your organization who indicates they want to take your company into the future. Don't be afraid to challenge your people, but make sure they are challenges they can conquer. Your reward will be off the charts because, at the end of the day, it's still about people. Fire them up, give them a chance to succeed and create the environment that drives them there. They will rise to the occasion.
Here's to all the excitement in our future as IT professionals! I wish you nothing but success in conquering the challenges of the next decade and beyond.