This column is available in a weekly newsletter called IT Best Practices.
When Mark Zuckerberg and his associates developed a little application called Facebook back in 2004, they had no idea how wildly popular it would become. Designed to replace a college directory, Facebook took off among college students and had 12 million regular users within two years. At Facebook's ten-year mark, the application has more than 1.23 billion users worldwide. More than half a billion people access the application every day.
Now picture yourself being the database administrator (DBA) who has to figure out how to accommodate such rapid growth. All those users and all their postings have to be organized and managed without impacting performance and availability of the application.
Facebook isn't the only company with a database scalability issue. Numerous startups with a cool application hope to become the next Facebook, Snapchat or eBay, attracting tens or hundreds of millions of users practically overnight. Moreover, extreme database growth isn't just for startups. Companies across all types of industries are outgrowing the scalability of their existing applications built on relational databases.
Simply moving from big iron in a datacenter to AWS or Rackspace in the cloud doesn't solve the problem. The fact is, relational databases can't scale efficiently to very large environments, and databases such as Oracle, DB2, SQL Server and MySQL weren't designed to run in virtualized or cloud environments. As the applications built on these databases grow exponentially, they have to support a growing number of reads and writes, and they eventually run into capacity and performance issues.
There are two traditional approaches to address this problem: scaling up by adding more hardware, and do-it-yourself sharding. Adding more hardware, even in the cloud, gets expensive. Sharding, which breaks the database into smaller pieces, usually means writing code into the application to manage those pieces.
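To make the do-it-yourself approach concrete, here is a minimal sketch of the kind of routing code an application ends up carrying when it shards by hand. Everything in it is an assumption made for illustration: the shard count, the function names, and the use of in-memory dictionaries as stand-ins for separate database instances. It is not taken from any particular product.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen for illustration

# Each dict stands in for a separate database instance.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(user_id: str) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def save_post(user_id: str, post: str) -> None:
    """Write a post to whichever shard owns this user."""
    shards[shard_for(user_id)].setdefault(user_id, []).append(post)

def load_posts(user_id: str) -> list:
    """Read a user's posts back from the shard that owns them."""
    return list(shards[shard_for(user_id)].get(user_id, []))
```

Every read and write has to pass through shard_for, and changing NUM_SHARDS later means moving existing data to new locations. That maintenance burden is what makes hand-rolled sharding costly.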
Now ScaleBase offers a third option to add database scalability and availability without adding more hardware or modifying the application code. ScaleBase takes a single database and separates it out into multiple distributed databases by utilizing fully automated sharding. The ScaleBase solution was purpose-built for the cloud and is available on Amazon's, IBM's and Rackspace's clouds. It leverages MySQL, the open source database from Oracle, and other readily deployable commodities.
Companies can migrate an existing application to ScaleBase’s distributed database without making any changes to the underlying application; i.e., no coding is necessary for the sharding. ScaleBase handles the database distribution in real time and in a fully automated fashion.
The following architecture slide illustrates how ScaleBase works.
The top of the illustration shows the application, which does not need to be modified in any way. The application interfaces with a ScaleBase MySQL database just as it normally would with any other relational database. ScaleBase adds transaction management capabilities that maintain the database's ACID (atomicity, consistency, isolation and durability) properties.
The MySQL data layer shown at the bottom can be on any cloud. This layer provides a horizontally scalable database that automatically grows by spinning up new instances of MySQL as needed and performing dynamic data optimization, almost like a load balancer. In the above illustration, the vertical orange MySQL boxes represent automated redundancy of the main databases.
The ScaleBase Analysis Genie shown at the left in the illustration is at the heart of how this solution works to provide dynamic growth. The Analysis Genie looks at the existing application and database, as well as all the queries and traffic, and performs an analysis that builds an optimal data-distribution policy customized for the app. In other words, the Analysis Genie figures out the best way to break out the database and distribute it to other instances. This recommendation is made before the database ever goes into production on ScaleBase so that a DBA can approve or tweak the custom-built data-distribution policy. ScaleBase views this plan as a living policy that can be adjusted as needed going forward.
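The general idea of deriving a distribution policy from observed queries can be sketched with a toy example. This is not ScaleBase's algorithm, which is not described here; the query log, the table and column names, and the propose_policy function are all hypothetical. The sketch simply picks, for each table, the column that appears most often in equality filters and proposes it as the shard key for a DBA to review.

```python
import re
from collections import Counter, defaultdict

# Hypothetical query log; table and column names are invented for illustration.
QUERY_LOG = [
    "SELECT * FROM posts WHERE user_id = 42",
    "SELECT * FROM posts WHERE user_id = 7 AND created > '2014-01-01'",
    "SELECT * FROM posts WHERE post_id = 9001",
    "SELECT * FROM users WHERE user_id = 42",
]

# Captures the table name and the text of the WHERE clause.
TABLE_AND_WHERE = re.compile(r"FROM\s+(\w+)\s+WHERE\s+(.*)", re.IGNORECASE)
# Captures column names used in equality comparisons (col = value).
EQUALITY_FILTER = re.compile(r"(\w+)\s*=\s*")

def propose_policy(queries):
    """Return {table: proposed shard key} based on equality-filter frequency."""
    counts = defaultdict(Counter)
    for query in queries:
        match = TABLE_AND_WHERE.search(query)
        if not match:
            continue
        table, where_clause = match.group(1), match.group(2)
        for column in EQUALITY_FILTER.findall(where_clause):
            counts[table][column] += 1
    return {table: c.most_common(1)[0][0] for table, c in counts.items()}
```

A real analysis would also have to weigh joins, write volume and data skew; the sketch only shows why query traffic is a useful input to the decision.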
Consider the case of a web-based gaming application. After a launch, a popular game can attract millions of simultaneous users worldwide within days. By building the sharding into the database level instead of the application level and allowing the data distribution to be based on analysis by the Analysis Genie, the gaming company can eliminate the risk of bottlenecks that would drive players away. ScaleBase would also help the company control costs and reduce its time-to-market.
There are basically two use cases for ScaleBase: migrating an existing application, and developing a new application from the start on ScaleBase. Customers that migrate an existing application from, say, Oracle or SQL Server, can reduce database license fees, the company says, while gaining the automatic scalability ScaleBase delivers.
Customers that develop new applications for web scale on ScaleBase can get those applications to market more quickly and cost-effectively. In fact, ScaleBase offers a free edition that is limited only by deployment size. A developer can try ScaleBase, put it into production, understand how it works, and then move to an unlimited version of the solution once the application hits scale.
There also is a startup edition for new companies that are just getting off the ground. The database capabilities are unlimited, so the startup can grow an application quickly, and it remains free of charge until the startup company hits a certain revenue or investment level.
In short, ScaleBase can take one big bursting database and split it into multiple database instances (or nodes) that are automatically managed at an enormously scalable level. Benchmarks show that ScaleBase scales at a near-linear rate, so when an application goes from 0 to 1.23 billion users in a few years' time, there are no growing pains with the database.