Getting scalability, high availability from SSL VPN wares

Dec 19, 2005 · 11 mins
Network Security, Networking, Security

Most vendors tested keep things up and running.

Our Clear Choice Test expert tells how to get scalability and high availability from SSL VPN wares.

Any network architect building a production SSL VPN service will want to make it as reliable and scalable as possible. We took a whirlwind pass through the high-availability and scalability capabilities of the 11 products in our test to get an idea of the spectrum of options.

F5 Networks was unable to participate in this portion of our test because of logistical problems, and because SonicWall and Check Point do not have built-in high availability, they also were not tested. However, we did evaluate the features of all three in this section.

We found several types of high-availability and scalability features in our testing. High availability included both active/passive (commonly called master/slave) and active/active support. To support scalability, a few products included internal load-balancing technology, but most required external load balancers. Products requiring load balancers generally supported scalability with some form of configuration sharing.

High availability is a simple idea: The service should survive loss of some component, typically the SSL VPN gateway itself. But high-availability measures present a few complications to SSL VPN deployments. It’s not rocket science, but it’s not easy to do right, either, as our testing showed. In high availability (and in scalability), one of the problems to be solved is configuration sharing, making sure that any change on one device is then propagated to all of the devices so that the configuration is completely consistent.

High availability is usually handled with a virtual IP address, one that represents the highly available service. The virtual IP address usually is shared across a pair of SSL VPN devices; one of the two is the active master responsible for all traffic to that address, and the other is the passive slave, in communication with the active device and ready to take over the virtual IP and traffic in case of failure of the master. In a single data center, high availability focuses on dealing with the failure of either a single device or connectivity to that device. You start by making sure that the virtual IP address is always available by building a capability to move the virtual IP from one system to another when there is a problem.
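As a rough illustration of the master/slave idea, the following Python sketch models a pair of gateways sharing one virtual IP. It is a toy simulation, not any vendor's implementation: the `Node` and `HaPair` classes, the `max_missed` threshold and the example address are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    alive: bool = True


class HaPair:
    """Active/passive pair sharing one virtual IP (all names hypothetical)."""

    def __init__(self, master, slave, virtual_ip, max_missed=3):
        self.active = master
        self.standby = slave
        self.virtual_ip = virtual_ip
        self.max_missed = max_missed  # assumed threshold before failover
        self.missed = 0

    def heartbeat(self):
        """Run one heartbeat interval and return the name of the VIP owner."""
        if self.active.alive:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed and self.standby.alive:
                # The standby claims the virtual IP; the old active becomes standby.
                self.active, self.standby = self.standby, self.active
                self.missed = 0
        return self.active.name


pair = HaPair(Node("gw-a"), Node("gw-b"), "192.0.2.10")
pair.active.alive = False  # simulate loss of the master
owners = [pair.heartbeat() for _ in range(3)]
print(owners)  # ['gw-a', 'gw-a', 'gw-b']
```

Waiting for several missed heartbeats before moving the address is a common way to avoid flapping on a single lost packet; the right threshold depends on the deployment.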

Chart: HA and scalability is not one size fits all

In the event of a failure, moving a virtual IP from one node to another is pretty basic stuff, as is the configuration sharing between the master and slave needed to make it work reliably.

What’s more difficult is making any failover seamless to the end user. As with any high-availability scenario, the problem lies in how to maintain the state of the end-user’s session in the event of a device failure. With SSL VPN devices, there are two main types of sessions – Web-based SSL VPN and network extension/port forwarding SSL VPN – with different high-availability issues to consider.

Web-based applications tunneling through an SSL VPN have two types of state associated with them that are important to high availability. All SSL VPN devices generate a cryptographic cookie – usually called a session ID or session cookie – that is used to reauthenticate a user’s browser every time it comes back for another Web page or image. Without that state information, the SSL VPN device would have no secure way to keep track of the user. This is why cross-site scripting attacks are so frightening for SSL VPN devices: If someone can steal your session cookie, he may be able to impersonate you until your session ends.

The second type of state information is the application cookies (for a shopping spree or, more commonly, an Outlook Web Access session) that are part of a user’s session. Web sites doing any kind of session-oriented application (such as a Web-based e-mail session) use cookies to keep things in order. For example, when you use Outlook Web Access, a cookie called “sessionid” is stored in your browser, so that as you go from page to page, the Outlook Web Access server knows who you are and what you have been doing. Secure SSL VPN boxes don’t actually pass those cookies directly to the end user’s system, because they might then be left lying around for the next user to use or abuse. Instead, the cookies are maintained on the SSL VPN device. Not all SSL VPN devices do that, but the most security-focused use this technique.

When a high-availability event occurs (such as an involuntary power cord attenuation), the user’s cryptographic cookie needs to be in place on the backup SSL VPN device for everything to work smoothly. In our testing, Aventail, Caymas, Fortinet, Juniper and Nokia all handle this correctly. With AEP, Array and Nortel, the user must reauthenticate to the SSL VPN after a high-availability event occurs. (Nortel says it is solving this problem in its 5.5 release, due in January.)
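To make the difference concrete, here is a minimal Python sketch of why the session cookie has to exist on the backup device before the failure. The `Gateway` class, its in-memory session table and the replicate-at-login behavior are assumptions for illustration; the products we tested do not necessarily work this way.

```python
import secrets


class Gateway:
    """Hypothetical SSL VPN gateway holding a session table in memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session cookie -> user name
        self.peer = None    # optional HA partner to replicate to

    def login(self, user):
        cookie = secrets.token_hex(16)  # random, hard-to-guess session ID
        self.sessions[cookie] = user
        if self.peer is not None:
            # Replicate at login time so the peer can honor the cookie later.
            self.peer.sessions[cookie] = user
        return cookie

    def handle_request(self, cookie):
        user = self.sessions.get(cookie)
        return f"200 OK for {user}" if user else "302 redirect to login"


master, backup = Gateway("master"), Gateway("backup")
master.peer = backup
cookie = master.login("alice")

# After a failover the backup answers for the virtual IP.
print(backup.handle_request(cookie))      # 200 OK for alice

# A gateway that never received the session forces a new login.
standalone = Gateway("no-replication")
print(standalone.handle_request(cookie))  # 302 redirect to login
```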

We haven’t mentioned TCP connection stability, because it’s not a very important issue in Web-based SSL VPNs. Each Web page, and often each graphic on each page, is a separate TCP connection, usually very short. This means that if a high-availability event occurs while a Web page is downloading, and if TCP connections are not preserved across devices, then the user would only notice this as a Web page requiring reloading or having an image that doesn’t show up – something they’re probably already accustomed to every once in a while, just because of the general perversity of the Internet.

Network-extension high availability

However, TCP connection stability is important in network extension and port forwarding SSL VPN operations. There are no cookies to worry about, but there is the problem of the upper application. For example, if a user is running a Terminal Services session across a network extension SSL VPN, the network extension connection can’t simply “go away” as it would effectively break the Terminal Services session.

There are two strategies SSL VPN vendors could use to handle this. One is pure TCP state sharing, in which the actual TCP connection used by the network extension client is shared across multiple devices. None of the products we tested seem to go this far. This is a very expensive way, in both bandwidth and processing power, to synchronize connectivity, because it means that state information has to be passed between devices every time a data packet goes to or from the client.

The second strategy is transparent reconnection, which takes advantage of the fact that you’re running IP over TCP (over SSL over IP, of course). Because IP is not supposed to be reliable, in general, the underlying TCP connection can go away and come back and anything sitting on top of IP will only see a delay or a few lost packets. Transparent reconnection says that when the SSL VPN device goes away, the client tries to reconnect without letting the upper-layer protocol know that something bad has happened. That’s also hard to do. For example, when you reconnect, you had better get exactly the same IP address.
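The reconnection logic can be sketched in a few lines of Python. This is a simulation with no real sockets, and `FakeGateway`, `Tunnel` and the tunnel address are hypothetical names invented for the example. It shows the two points made above: the caller of `send` never sees the failure, and the client insists on getting the same tunnel IP address back.

```python
class GatewayDown(Exception):
    pass


class FakeGateway:
    """Stand-in for an SSL VPN device; no real networking involved."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.received = []

    def connect(self, wanted_ip):
        if not self.up:
            raise GatewayDown(self.name)
        return wanted_ip  # this toy gateway always grants the requested address

    def send(self, packet):
        if not self.up:
            raise GatewayDown(self.name)
        self.received.append(packet)


class Tunnel:
    """Client side of a network-extension tunnel with transparent reconnection."""

    def __init__(self, gateways, tunnel_ip):
        self.gateways = gateways
        self.tunnel_ip = tunnel_ip
        self.current = None
        self.reconnects = 0

    def _connect(self):
        for gw in self.gateways:
            try:
                granted = gw.connect(self.tunnel_ip)
            except GatewayDown:
                continue
            if granted != self.tunnel_ip:
                raise RuntimeError("address changed; upper layers would break")
            self.current = gw
            return
        raise GatewayDown("no gateway reachable")

    def send(self, packet):
        if self.current is None:
            self._connect()
        try:
            self.current.send(packet)
        except GatewayDown:
            # Hide the failure from the caller: reconnect and resend once.
            self.reconnects += 1
            self._connect()
            self.current.send(packet)


a, b = FakeGateway("a"), FakeGateway("b")
tunnel = Tunnel([a, b], "10.8.0.5")
tunnel.send("pkt1")
a.up = False          # high-availability event
tunnel.send("pkt2")   # caller sees no error
print(a.received, b.received, tunnel.reconnects)  # ['pkt1'] ['pkt2'] 1
```

The sketch resends the packet that hit the failure. A real client could just as well drop it and let the upper-layer protocol retransmit, since IP makes no delivery promise.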

In our testing, only Aventail and Juniper used transparent reconnection successfully, continuing to transmit data across a network-extension connection when a high-availability event occurred.

There were some unusual hiccups we found in the high-availability arena. We were disappointed in Array for requiring us to manually replicate configuration changes across its cluster, and in Caymas for handling updates with a timed replication. That kind of poor engineering shook our confidence that either vendor knew what it was doing when it came to high availability. If these systems can’t replicate configuration automatically, how can they be trusted to replicate user state information?

AEP has a more unusual high-availability approach. In AEP high-availability configurations, you plug the master device into a power outlet that is under control of the slave device. The idea is that when the slave loses contact with the master, it actually cycles power on the (former) master. That’s an interesting approach. Unfortunately, AEP’s high availability is marred by the inability of a device to fail “back” automatically. This means that if the master simply reboots because of a crash, the (former) slave takes over, which is the right thing to do, but it will stay as the master forever, without the (former) master becoming a slave. Thus, you won’t have any high availability unless you manually go in and switch things around.

Scalability matters

Scalability is usually mentioned in the same breath as high availability because many of the same mechanisms apply. Scalability includes making the SSL VPN service accessible to more users and concurrent sessions than any single device could (or should) handle.

Vendors that provide master/slave high availability don’t automatically give you scalability, because half of your boxes are not doing anything – they’re sitting there as slaves. Many vendors call this active/passive high availability. To achieve scalability, boxes cannot sit idle.

To handle the scalability, then, a new function has to be introduced to the service: a load balancer. Three vendors, F5, Nokia and Aventail, include integrated load-balancer functionality in their SSL VPN devices. (Fortinet includes an internal load balancer in its all-purpose security appliance, but this does not apply to the SSL VPN service.) All other products tested in this specific test require an external load balancer to distribute the connections across different SSL VPN devices.
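As a sketch of what the load balancer has to do, the Python function below maps each client to one gateway by hashing its address, so that repeat requests from the same client land on the device that holds its session. The function name, the gateway names and the choice of source-address hashing are assumptions for illustration; real load balancers offer other persistence methods, and nothing here describes a specific product.

```python
import hashlib


def pick_gateway(client_ip, gateways):
    """Map a client address to one gateway, always the same one."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    index = int.from_bytes(digest[:4], "big") % len(gateways)
    return gateways[index]


gateways = ["vpn-1", "vpn-2", "vpn-3"]  # hypothetical device names
first = pick_gateway("198.51.100.7", gateways)
again = pick_gateway("198.51.100.7", gateways)
print(first == again)  # True: the client sticks to one device
```

A plain modulo hash reshuffles most clients when a gateway is added or removed, which is one reason session state sharing between devices still matters.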

Aventail’s scalability solution is limited, but extremely elegant. A maximum of two nodes can be combined to form both a scalable and highly available service. For many SSL VPN deployments, this is a great solution because it requires no additional external device and is fully integrated with the product.

F5 uses a different approach of a single device acting as a load balancer (although that device can really be a two-node master/slave high-availability pair) that redirects sessions to other devices – each of which could also be a master/slave high-availability pair. Nokia’s solution is similar to F5’s. Both Nokia and F5 offer greater scalability than Aventail because they can scale to more than two devices.

Vendors requiring external load balancers fall into two camps: those that share configuration, and those that don’t. With AEP, Caymas, Check Point, Fortinet and SonicWall, you can build scalable SSL VPN services, but if you have 10 elements in your scalability solution, you’ll need to make the same change all 10 times whenever you change the configuration. That’s obviously undesirable except in the most restricted of circumstances.

Array, F5, Juniper, Nokia and Nortel all allow you to build large clusters of systems that share configuration information. With Array, Juniper and Nortel, you need to handle the load balancing yourself, but at least you only have to make changes once.
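The difference between the two camps can be shown with a small Python model. The `Cluster` class, its version counter and the setting names are hypothetical; the point is only that a change entered once reaches every node, instead of being typed in 10 times.

```python
class Cluster:
    """Toy model of automatic configuration sharing across gateways."""

    def __init__(self, node_names):
        self.configs = {name: {} for name in node_names}
        self.version = 0

    def apply(self, key, value):
        """Make one change and push it to every node in the cluster."""
        self.version += 1
        for cfg in self.configs.values():
            cfg[key] = value
            cfg["_version"] = self.version

    def consistent(self):
        """True when every node holds an identical configuration."""
        configs = list(self.configs.values())
        return all(cfg == configs[0] for cfg in configs)


cluster = Cluster([f"vpn-{i}" for i in range(1, 11)])
cluster.apply("idle_timeout_minutes", 15)
print(cluster.consistent())  # True

# Manual sharing: change one node by hand and forget the other nine.
cluster.configs["vpn-1"]["idle_timeout_minutes"] = 30
print(cluster.consistent())  # False
```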

HA and scalability is not one size fits all
Most SSL VPNs now offer master/slave high availability, but failover results vary. Some require the user to reconnect and reauthenticate; others go the extra mile by transparently handling failures. Scalability generally requires an external load balancer, but a few devices have built-in capabilities to simplify deployment.
| Product | High availability style | Scalability style | Notes from high availability testing |
| --- | --- | --- | --- |
| AEP | Master/slave failover | External load balancer with manual configuration sharing | Users must reauthenticate after HA event; network extension connections broken |
| Array | Master/slave failover | External load balancer works in conjunction with cluster | Users must reauthenticate after HA event; network extension connections broken |
| Aventail | Active/active load balancing | Internal load balancer with automatic configuration sharing | HA event transparent to users |
| Caymas | Master/slave failover | External load balancer with manual configuration sharing | Users do not have to reauthenticate after HA event; network extension connections broken |
| Check Point | No HA features offered | External load balancer with manual configuration sharing | Not tested |
| F5 | Master/slave failover | Internal load balancer with automatic configuration sharing | Not tested |
| Fortinet | Master/slave failover | External load balancer with manual configuration sharing (existing internal load balancer does not work with SSL VPN) | Users do not have to reauthenticate after HA event; network extension connections broken |
| Juniper | Master/slave failover | External load balancer works in conjunction with cluster | HA event transparent to users |
| Nokia | Master/slave failover | Internal load balancer with automatic configuration sharing | Users do not have to reauthenticate after HA event; network extension connections broken |
| Nortel | Master/slave failover | External load balancer works in conjunction with cluster | Users must reauthenticate after HA event; network extension connections broken |
| SonicWall | No HA features offered | External load balancer with manual configuration sharing | Not tested |