Why synchronized, replicated file service makes sense now

Server virtualization, more automated management, and the high bandwidth delivered by WAN Virtualization or Network-as-a-Service combine to make true LAN-speed file performance irresistible

Last time, we saw why WAN Optimization won over WAFS. This time we'll look at how distributed replicated file services compared, historically, with WAN Optimization as an alternative, as well as how such file services can change the nature of the enterprise WAN going forward, particularly in conjunction with WAN Virtualization and/or Network-as-a-Service as part of the Next-generation Enterprise WAN (NEW) architecture.

Wide Area File Services (WAFS) offered no meaningful advantages over WAN Optimization, while WAN Optimization offered more capability, including application speed-up beyond just file access. That let WAN Optimization easily defeat WAFS in an environment where, given the high cost of MPLS, most remote enterprise locations had little WAN bandwidth.

Distributed, replicated file services - as exemplified by Microsoft's DFS (Distributed File System) Replication - synchronize files across multiple servers. While they have a lot in common with WAFS, they have one major functional difference that offers two major advantages over WAN Optimization. With a replicated file service, when users access files that are managed/"stored" in a centralized location, they get true LAN-speed performance every time they access such files - not only the second and subsequent times a file is accessed from that location - because the files aren't merely cached at the remote location after being accessed, but are in fact "pushed" there and thus always available locally.
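To make the push-vs.-cache distinction concrete, here is a deliberately minimal sketch of a push-style replicator in Python. This is an illustration of the concept only - real services such as DFS Replication also handle deletes, conflict resolution, and partial-file deltas - and the function name and mtime-based staleness check are my own assumptions, not anything from an actual product:

```python
import shutil
from pathlib import Path

def push_replicate(hub: Path, replica: Path) -> list[str]:
    """Push every new or updated file under `hub` to `replica`,
    so that reads at the remote site are always served locally.

    Illustrative sketch only: production replication services also
    propagate deletes, resolve conflicts, and send block-level deltas.
    """
    pushed = []
    for src in hub.rglob("*"):
        if not src.is_file():
            continue
        dst = replica / src.relative_to(hub)
        # Copy only files that are missing or stale at the replica.
        if not dst.exists() or src.stat().st_mtime > dst.stat().st_mtime:
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 preserves the timestamp
            pushed.append(str(src.relative_to(hub)))
    return pushed
```

Because the copy happens proactively, before any user asks for the file, the first access at the branch is already a local read - the behavior that distinguishes replication from the on-demand caching WAFS relied on.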

The speed advantage the first time that any given user accesses a file can be very significant. In terms of user perception, and sometimes user productivity, that advantage can matter quite a lot.

The second advantage is that with a replicated file service, even files that are primarily created and "stored" at the local site can be automatically replicated to a centralized server, so the need to run backups over the enterprise WAN from remote locations can be eliminated entirely. Whatever backup mechanism is in place at the data center/headquarters simply works the same way, "for free," for data from remote sites as well, without taking literally hours and clogging WAN pipes with backup traffic.

There are other minor advantages as well (slightly faster performance, somewhat less WAN bandwidth consumed for file access, ability to access files when WAN connectivity is down), but they are dwarfed by the first two.

Yet despite these two big benefits, while there are indeed some deployments of distributed replicated file services for large WANs, I think it's fair to say that WAN Optimization has been far more widely deployed. Why is that?

I'd argue that the biggest part of the answer is the same set of reasons cited during our previous discussion about why WAN Optimization beat WAFS: support for more applications than just file service, in an environment with very little available bandwidth. With little bandwidth - typically between 512 Kbps and 2 Mbps - it's more important to be able to reduce the bandwidth consumed by other applications as well, and it's typically not sensible to be constantly "pushing" (pre-positioning) a large number of files from a centralized data store to many remote ones.

I think the other major reason distributed replicated file services were not widely adopted has to do with the cost, in terms of both capital expense and management expense, of the systems needed at remote locations to store the data there.

As regular readers of this column may have spotted by now, however, most of these obstacles have been addressed by other technologies that are part of the NEW architecture. Most importantly, WAN Virtualization and Network-as-a-Service have drastically reduced the cost of bandwidth at remote locations by making far cheaper, much higher-bandwidth Internet access connections business-quality, eliminating that critical bottleneck.

Server virtualization technology has made it much easier to deploy this kind of service at remote locations with a relatively minimal amount of hardware, and as importantly, without requiring a lot of IT expertise at these locations. Such a solution is not "zero server hardware" at the branch, but it can mean "(near) zero server management, with a little bit of hardware."

The capacity of hard disks is much greater than it was only a few years ago, of course, meaning the hardware to store data remotely becomes ever less expensive, and less expansive (i.e. smaller!) as well. Finally, the services themselves have gotten better (sending differential changes in files, rather than the whole file, for example), more automated and easier to use, again reducing the IT overhead burden.
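The "differential changes" idea mentioned above - sending only the blocks of a file that changed, rather than the whole file - can be sketched with a simple fixed-block hash comparison. This is a toy version under assumed names (`delta`, `apply_delta`, a fixed 4 KB block size); real protocols such as rsync's rolling checksum or Microsoft's Remote Differential Compression are considerably more sophisticated, handling insertions that shift block boundaries:

```python
import hashlib

BLOCK = 4096  # bytes per block; an assumption, real protocols vary

def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of the file."""
    return [hashlib.sha256(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def delta(old: bytes, new: bytes) -> dict[int, bytes]:
    """Return only the blocks of `new` that differ from `old`,
    keyed by block index - the data actually sent over the WAN."""
    old_h = block_hashes(old)
    changes = {}
    for i in range(0, len(new), BLOCK):
        idx = i // BLOCK
        chunk = new[i:i + BLOCK]
        if idx >= len(old_h) or hashlib.sha256(chunk).hexdigest() != old_h[idx]:
            changes[idx] = chunk
    return changes

def apply_delta(old: bytes, changes: dict[int, bytes], new_len: int) -> bytes:
    """Rebuild the new file at the receiver from the old copy
    plus the changed blocks."""
    out = []
    for idx in range((new_len + BLOCK - 1) // BLOCK):
        out.append(changes.get(idx, old[idx * BLOCK:(idx + 1) * BLOCK]))
    return b"".join(out)
```

If one block changes in a 100 MB file, only that block (plus the block hashes) crosses the WAN instead of 100 MB - which is why differential replication made pre-positioning files at branches far more bandwidth-affordable.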

So you can now get almost all of the traditional benefits of consolidation, and all the performance benefits of local servers, all at very low CapEx cost and very low OpEx people cost for management of the services and the data.

As noted last time, Microsoft's DFS (Distributed File System) Replication is by no means the only distributed, replicated file service out there, nor necessarily the most functional, nor the easiest to manage. In fact, there are many solutions in this space today, some of them cloud-based (Box.net, for example), some not. As a networking guy, I wouldn't begin to suggest that I know which ones will be the most successful. I'll say only that I think there will be a place both for public-cloud-based solutions and for solutions on private Intranets, given the security concerns some organizations will have about trusting some or all of their data to a cloud-based solution.

And WAN Optimization will remain a key piece of the NEW architecture as well, both for the advantages it brings to applications beyond file service, and because even those enterprises which most aggressively adopt distributed file services will likely still have some centralized files that must be accessed across long distances and that will not be practical to replicate.

But whether choosing a cloud-based solution or a purely private one, in conjunction with the other key technologies in the NEW architecture, the benefits in terms of application performance, user experience and the avoidance of backups over the WAN mean that distributed file services are likely to play a key role in enterprise WAN environments going forward.

A twenty-five year data networking veteran, Andy founded Talari Networks, a pioneer in WAN Virtualization technology, and served as its first CEO, and is now leading product management at Aryaka Networks. Andy is the author of an upcoming book on Next-generation Enterprise WANs.
