Chapter 4: Common IPsec VPN Issues

Cisco Press

Page 8 of 9

IPsec's Influence on DiffServ and LLQ/CBWFQ

In this section, we will explore a Voice over IP (VoIP) deployment in a branch networking scenario. VoIP is delay-sensitive—that is to say that packets must be received in order with consistent delay (low jitter). As shown in Figure 4-11, two routers want to communicate with each other over a series of wide-area links of varying bandwidths. On the lower-speed links, packets can sometimes be dropped due to oversubscription of the available bandwidth. Therefore QoS is required to ensure that the voice (RTP) packets are not dropped when this occurs (other packets are dropped instead). For these reasons, QoS must be used for IP traffic over the Frame-Relay links.

Figure 4-11 IPsec and DiffServ in a VoIP Implementation

DiffServ is implemented in conjunction with LLQ/CBWFQ to deliver QoS for voice traffic to and from the branches. Because the company's security policy mandates confidentiality for voice traffic, IPsec VPNs have been configured between the enterprise headend router and all branch routers, posing several design considerations with the IPsec/DiffServ requirements:

  • If AH is used, changes to the IP header are not permitted (the AH MIC invalidates them on the receiving VPN endpoint). This prevents remarking on network devices between the phones. Therefore, RTP traffic must be marked accordingly prior to IPsec encapsulation (either on the routers or phones) if AH is used.

  • In both AH and ESP, if packets are received outside the antireplay window, they are dropped. Therefore, if traffic is delayed in queue due to QoS decisions, it could get dropped if it is received outside of the antireplay window at the opposite end of the IPsec VPN tunnel.

  • With ESP, the original IP header and QoS information are encapsulated in ESP and encrypted with the appropriate transform. This effectively renders the DiffServ bits needed for QoS unreadable by intermediate network nodes between the two IPsec VPN endpoints. Unless these bits are successfully copied to the outer IP header during ESP encapsulation, network nodes between the two IPsec VPN endpoints may not appropriately classify the IPsec-processed RTP packet.
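The antireplay behavior described in the second consideration above can be sketched in a few lines of Python. This is a minimal illustration only, not the IOS implementation: the class name, the method name, and the 64-packet window size are assumptions made for the example. It shows why a packet held too long in a QoS queue is discarded once newer sequence numbers have advanced the window.

```python
class AntiReplayWindow:
    """Illustrative sliding-window replay check (window size is an assumption)."""

    def __init__(self, size=64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence (highest - i) already seen

    def accept(self, seq):
        if seq > self.highest:
            # Newer packet: slide the window forward and mark it as seen.
            shift = seq - self.highest
            self.bitmap = ((self.bitmap << shift) | 1) & ((1 << self.size) - 1)
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False   # arrived outside the window (for example, delayed in a queue)
        if self.bitmap & (1 << offset):
            return False   # duplicate of a packet already accepted
        self.bitmap |= 1 << offset
        return True


window = AntiReplayWindow()
window.accept(1)
window.accept(100)           # newer traffic overtakes the delayed packet
print(window.accept(30))     # delayed packet is now 70 behind: dropped
print(window.accept(90))     # still inside the 64-packet window: accepted
```

In this sketch, a voice packet that is overtaken by more than the window size is dropped at the receiving endpoint even though it was never lost in transit.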

IPsec's Effect on IntServ and RSVP

In addition to the issues outlined for DiffServ and LLQ/CBWFQ, RSVP implementations with IPsec VPNs present further design issues to address. As mentioned previously, RSVP provides a signaling method to proactively provision resources between a given source and destination. RSVP does so by exchanging a series of RSVP PATH and RSVP RESV messages between the source and destination. If intermediate network nodes between the RSVP source and destination are unable to decipher the RSVP RESV messages, as would be the case if they were encrypted in an IPsec VPN, those nodes cannot use the RSVP RESV messages to dynamically reserve resources between source and destination (illustrated in Figure 4-12).

Figure 4-12 IPsec and RSVP Signaling Incompatibility

Therefore, to dynamically provision resources on intermediate nodes between a source and destination that require timely, ordered delivery of IP-based application traffic, RSVP signaling messages must be forwarded outside of the crypto path.

Solving Fragmentation Issues in IPsec VPNs

In IPsec VPN environments, it is critical to address MTU and fragmentation issues. Otherwise, the entire VPN is at risk of performance and operation issues. We will discuss the effect of fragmentation reassembly and MTU issues in this section, and provide solutions for proper IPsec design in environments in which MTU is likely to be exceeded, resulting in fragmentation.

The effect of fragment handling between encryption devices is largely focused on the encryption device that is performing the reassembly of the fragmented packet. Although most network devices and VPN endpoints available today can fragment encrypted packets in the crypto-switched fast path, the decrypting IPsec endpoint must reassemble all fragments in the chain before the packet can be decrypted. Figure 4-13 illustrates an IPsec VPN deployment in which packets are reassembled prior to decryption on the destination IPsec VPN gateway.

Figure 4-13 Fragmentation Handling Between Encryption Devices

This reassembly is done at the process level and greatly affects the performance of the VPN. In IPsec environments, every precaution should be taken to fragment packets before they are encrypted with IPsec so that administrators can be assured that both fragmentation and reassembly are being done on devices with the appropriate computational resources available.

Path MTU Discovery

IP PMTUD is a technology that is used to dynamically discover the maximum MTU size between two endpoints such that the originating device fragments packets to the lowest MTU of the path. As such, PMTUD prevents intermediate network devices from fragmenting packets and causing excessive CPU overhead on the receiving IPsec endpoint doing the reassembly. Consider the scenario described in Figure 4-14, in which Host_A wishes to open a TCP session to Server_B across a routed IP network using Routers A, B, and C.

Figure 4-14 PMTUD and IPsec

Administrators have enabled IP PMTUD on their workstations and servers such that fragmentation reassembly issues can be avoided on Router_B. Host_A executes PMTUD using the following process:

  1. Host_A creates an IP packet sized to the appropriate MTU of its locally attached segment and sets the DF bit before transmitting it to Server_B.

  2. Router_A receives the packet, notes that the DF bit is set. Router_A's serial link has an MTU of 1414. Because the DF bit is set and the packet from Host_A to Server_B exceeds the MTU of the serial interface, it is dropped.

  3. Router_A sends an ICMP Unreachable message back to Host_A, carrying the MTU (1414) of the next hop (the serial interface between Router_A and C).

  4. Host_A sends another ICMP message of 1414 bytes in length to Server_B with the DF bit set.

  5. Router_A receives the packet and forwards to Router_C. Router_C receives the packet, and notes that the DF bit is set. Because the DF bit is set and the packet is greater than the MTU of Router_C's link to Router_B (512), the packet is dropped.

  6. Router_C sends an ICMP Unreachable message back to Host_A, carrying the MTU (512) of the next hop (the serial interface between Router_C and B).

  7. Host_A sends another ICMP message of 512 bytes in length to Server_B with the DF bit set.

  8. The 512-byte ICMP message does not exceed the MTU of any individual link in the path. It is therefore successfully forwarded to Server_B. Server_B sends an ICMP Echo Response back to Host_A, indicating to Host_A that 512 is the MTU of the path.
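The probing loop in the steps above can be sketched as a short Python simulation. This is an illustration under stated assumptions rather than a model of any particular host stack: the function name is hypothetical, and the initial 1500-byte probe size is assumed, since the scenario says only that Host_A sizes its first packet to its local segment MTU.

```python
def discover_path_mtu(initial_size, link_mtus):
    """Simulate PMTUD: every probe has DF set, so an oversized probe is
    dropped and the ICMP Unreachable reports the next-hop MTU."""
    size = initial_size
    probes = [size]
    while True:
        for mtu in link_mtus:
            if size > mtu:
                size = mtu            # host resizes to the MTU carried in the ICMP Unreachable
                probes.append(size)
                break
        else:
            return size, probes       # probe crossed every link without being dropped


# Link MTUs from the scenario: Router_A to Router_C (1414), Router_C to Router_B (512).
path_mtu, probes = discover_path_mtu(1500, [1414, 512])
print(path_mtu)   # 512
print(probes)     # [1500, 1414, 512]
```

Each entry in the probe list corresponds to one transmission by Host_A, which makes the cost of discovery visible: one dropped packet and one ICMP round trip per constricting link.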


Note - The routers in the above scenario are used to illustrate the general operation of PMTUD. IPsec and IPsec+GRE tunnels use a slightly different configuration of PMTUD than the previous, known as "Tunnel Path MTU Discovery." The specific operation of fragment handling using PMTUD in IPsec and IPsec+GRE environments is discussed in greater detail later in this chapter.


IPsec in Cisco IOS can be configured to copy the DF bit value into the outer IP header of ESP-processed packets. As such, the ICMP traffic that PMTUD relies on to operate correctly does not have to be explicitly excluded from the crypto switching path.

There are several issues that must be addressed if PMTUD is to be part of one's design strategy to mitigate fragmentation reassembly issues in IPsec VPNs. In this section, we will briefly highlight some of the most common ones:

  • Permitting ICMP Unreachable Messages

  • Rate-Limiting ICMP Messages

  • PMTUD Not Supported on End-Hosts

  • Adjusting TCP Maximum Segment Size

  • Clearing the DF-Bit

Permitting ICMP Unreachable Messages

As we've discussed previously in our overview of the PMTUD protocol, PMTUD relies heavily on ICMP unreachable messages to communicate the MTU of segments back to the fragmenting host. It is very common for security devices, such as firewalls, to deny ICMP unreachable messages, as they are commonly used in malicious scanning techniques by hackers. As such, care must be taken to ensure that all paths between PMTUD-enabled endpoints allow ICMP unreachables to pass if PMTUD is to be the preferred method for fragmentation avoidance along the path.

Rate-Limiting ICMP Messages

Because PMTUD relies on the receipt of ICMP Unreachable replies within a given retransmission window on the originating host, care should be given to rate-limiting techniques applied to ICMP messages, as they could cause premature retransmission of ICMP messages in PMTUD environments. If received out of order, ICMP unreachable messages in a PMTUD environment could cause confusion on MTU settings for the originating PMTUD host. For example, ICMP Unreachable messages delayed in rate-limiting queues could signal an erroneous MTU setting on the originating PMTUD host if that Unreachable message is received after a valid ICMP Unreachable with the correct IP MTU.

PMTUD Not Supported on End-Hosts

If PMTUD is disabled on the hosts within a network, it is recommended that the network administrator take steps to ensure that packets are fragmented in the network at some other location before crypto processing occurs. This can be achieved through enabling IPsec Lookahead Fragmentation in Cisco IOS and setting the DF bit in IPsec-processed packets. With IPsec Lookahead Fragmentation, the IOS IPsec VPN endpoint will attempt to determine the encapsulated packet size before it is encrypted. If the encapsulated packet size is predetermined to be larger than the path MTU, it is fragmented before encryption. When the DF bit is set, the encrypting router will look for information in any ICMP unreachable message received for updates it needs to install to the Path MTU entry in its SADB. Alternately, if the VPN endpoint does not support functionality similar to IPsec Lookahead Fragmentation or explicit setting of the DF bit in outer IP headers, the MTU of the IPsec VPN tunnel can be manually defined to avoid fragmentation reassembly issues.

Adjusting TCP Maximum Segment Size

Hosts sending IP packets greater than the TCP Maximum Segment Size (MSS) are at risk of fragmentation. Strictly speaking, the TCP MSS is the maximum amount of data that a host is willing to accept in a single TCP/IP datagram. Hosts compare their TCP MSS buffer sizes with the MTU to determine the MSS for their transmissions. The result will be the lower of the TCP MSS buffer or the MTU less 40 bytes (an allocation for the IP header and TCP header, each 20 bytes in length). Once determined, each host communicates the selected value to the opposite host via the exchange described in Figure 4-15.

Figure 4-15 TCP MSS, IP MTU, and Fragmentation

The following order of events describes the sequence illustrated in Figure 4-15 above:

  1. Host_A has an MSS buffer of 20k and an MTU of 1500. It compares the MSS buffer with the MTU of the link, less a 40-byte allocation for IP and TCP header addition (1500 – 40 = 1460) and selects the lower value of 1460 (1460 < 20000) to send to Host_B.

  2. Host_B has a 16k MSS buffer and a 2048 interface MTU size. It does a similar comparison to Host_A's in step 1 and selects 2008 (2048 – 40). It then compares the received value from Host_A and selects the lower value of 1460 as its MSS value (1460 < 2008).

  3. Host_B signals its MSS of 2008 to Host_A.

  4. Host_A compares the received value of 2008 with its TCP MSS value derived in Step 1 and selects the lower of the two values, 1460, as its TCP MSS.

TCP MSS values use MTU values to help avoid fragmentation. In the example above, the MTU-derived values are selected, as they are smaller than the TCP MSS buffer values. However, if the MSS value were smaller than the MTU, Hosts A and B would select the MSS + 40 bytes as the maximum packet length for TCP traffic. Note that, in this case, the larger MTU value would still be used for UDP traffic.
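The comparison each host performs in Figure 4-15 reduces to two "take the lower value" operations, sketched below in Python. The function names are hypothetical, and the 16k buffer on Host_B is taken as 16000 bytes for illustration; any value above 2008 gives the same outcome.

```python
IP_TCP_HEADERS = 40  # 20-byte IP header + 20-byte TCP header


def advertised_mss(mss_buffer, interface_mtu):
    """MSS a host announces: the lower of its MSS buffer and (MTU - 40)."""
    return min(mss_buffer, interface_mtu - IP_TCP_HEADERS)


def send_mss(own_advertised, peer_advertised):
    """MSS a host uses when sending: the lower of its own value and the peer's."""
    return min(own_advertised, peer_advertised)


host_a = advertised_mss(mss_buffer=20000, interface_mtu=1500)   # 1460
host_b = advertised_mss(mss_buffer=16000, interface_mtu=2048)   # 2008

print(host_a, host_b)                 # 1460 2008
print(send_mss(host_b, host_a))       # Host_B sends with an MSS of 1460
print(send_mss(host_a, host_b))       # Host_A sends with an MSS of 1460
```

Both directions settle on 1460 bytes, so a full-sized segment plus its 40 bytes of headers exactly fills the smaller 1500-byte MTU.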

Clearing the DF-Bit

If the DF bit is cleared somewhere along the PMTUD path between source and destination, the network nodes along the path will fragment the ICMP PMTUD message rather than dropping it and replying with an ICMP unreachable. This will obviously break the operation of PMTUD. Most IP-enabled devices available today are capable of clearing the DF bit in an IP header.

Fragmentation Behavior on Cisco IOS VPN Endpoints

The overhead associated with IPsec and IPsec+GRE encapsulated IP packets can often lead to fragmentation, which is why PMTUD is, by default, enabled on IPsec VPN routers. However, the specifics of fragment handling and PMTUD differ slightly from non-VPN environments. In this section, we will discuss the handling of fragments in IPsec and IPsec+GRE tunnels and some additional solutions available for avoiding fragmentation in IPsec VPN environments.

IPsec VPNs use Tunnel Path MTU Discovery to interpret MTU information of ICMP Unreachable messages and update the Path MTU of the corresponding IPsec SA. The typical PMTUD operation and fragment handling of an IPsec VPN is illustrated in Figure 4-16.

Figure 4-16 Fragment Handling and PMTUD Operation with IPsec Tunnels

The following describes the operation illustrated in Figure 4-16:

  1. Host_A sends a 1500-byte (size of the local interface MTU) packet to Server_B.

  2. Router_A receives the packet sent in step 1 above and observes that the ESP-encapsulated packet size exceeds the MTU of the serial link to B. Because Host_A set the DF bit of the packet, Router_A drops the packet and sends an ICMP unreachable message containing the MTU size of 1442 (1500 bytes less 58 bytes of maximum ESP overhead) back to Host_A.

  3. Host_A receives the ICMP Unreachable message with the MTU information, and forwards another ICMP packet of 1442 bytes in length to Router_A. Router_A encapsulates the packet with ESP and forwards it across the VPN with the DF bit set in the outer header.

  4. Router_C receives the ICMP message from Router_A in Step 3 and notes that the packet exceeds the MTU of its serial interface to Router_B. Because the DF bit is set, Router_C drops the packet and forwards an ICMP Unreachable to Router_A with the MTU size of 1400 embedded.

  5. Router_A receives the ICMP Unreachable message from Router_C in Step 4. Router_A notes the MTU size of 1400 in the PMTU field of the SA that is established with Router_B. Router_A does not send a new ICMP message of 1400 in length, but instead this is handled by Host_A in step 6.

  6. Host_A retransmits an ICMP message of 1442 in length, as it never received an acknowledgement for the ICMP message sent in Step 3.

  7. Router_A compares the ESP-encapsulated packet size (1442+58) of the packet received in step 6 above with its path MTU (1400) and drops the packet. Router_A responds with an ICMP unreachable with the MTU of 1342 (1400 PMTU less ESP overhead of 58 bytes) embedded.

  8. Host_A sets its MTU to 1342 and forwards a new 1342-byte message to Server_B. The message and associated ESP overhead is now lower than the end-to-end path MTU, resulting in a successful transmission from Host_A to Server_B.
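The MTU values that Router_A reports to Host_A in the steps above follow from one subtraction, sketched here in Python. The function names are hypothetical, the 58-byte figure is the maximum ESP overhead quoted in step 2, and the downstream path MTU is taken to be 1400 bytes, the value used in the arithmetic of step 7.

```python
ESP_MAX_OVERHEAD = 58  # maximum ESP overhead quoted in the steps above


def host_mtu_for(path_mtu, overhead=ESP_MAX_OVERHEAD):
    """MTU the encrypting router reports to the host for a given path MTU."""
    return path_mtu - overhead


def fits(packet_len, path_mtu, overhead=ESP_MAX_OVERHEAD):
    """True if the packet still fits the path once ESP overhead is added."""
    return packet_len + overhead <= path_mtu


print(host_mtu_for(1500))   # 1442, reported in step 2
print(fits(1442, 1400))     # False: 1442 + 58 exceeds the 1400-byte path MTU
print(host_mtu_for(1400))   # 1342, reported in step 7
print(fits(1342, 1400))     # True: the transmission in step 8 succeeds
```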

As we've discussed in previous sections of this chapter, and in others, it is sometimes necessary to encapsulate certain traffic types in GRE prior to processing them with IPsec. Processing of multicast traffic, for example, is one instance in which one would seek to encapsulate the plain text traffic in GRE prior to encapsulating it in ESP. This is commonly referred to as IPsec+GRE. This process includes an additional 24 bytes of overhead, as the GRE header is applied in addition to the ESP or AH headers. More importantly, it adds additional steps to the Tunnel PMTUD operation while trying to avoid fragmentation. Figure 4-17 illustrates the fragment handling process using PMTUD in an IPsec+GRE scenario.

Figure 4-17 Fragment Handling and PMTUD Operation with IPsec+GRE Tunnels

The operation of PMTUD over an IPsec+GRE tunnel illustrated in Figure 4-17 is described by the following order of events:

  1. Host_A sends a 1500-byte packet with the DF bit set to Server_B.

  2. Router_A receives the packet and observes that the DF bit is set. GRE encapsulation occurs prior to ESP encapsulation in this scenario, so the GRE process on the router drops the packet, as the 1500-byte packet plus 24 bytes of GRE overhead exceeds the GRE tunnel MTU of 1500. Router_A sends an ICMP Unreachable back to Host_A with an embedded MTU value of 1476 (1500 less the GRE header length of 24).

  3. Host_A sends a 1476 byte packet with the DF bit set to Server_B.

  4. Router_A receives the packet, noting that the DF bit has been set. The router encapsulates the packet in GRE and then attempts to encapsulate it in ESP. The added ESP encapsulation pushes the packet size over the serial interface MTU of 1414, so Router_A drops the packet. ESP sends an ICMP error message to GRE indicating an MTU of 1376 bytes (1414 less max ESP header length of 38 bytes). GRE records this value as the new tunnel IP MTU.

  5. Host_A retransmits the 1476-byte packet from step 3, as no acknowledgement was received. Router_A drops this packet as it exceeds the tunnel IP MTU derived in step 4. Router_A responds with an ICMP Unreachable message carrying an MTU of 1352 (the 1376-byte value from step 4 less 24 bytes of GRE overhead).

  6. Host_A sends a new ICMP message of 1352 bytes in length to Server_B. Router_A encapsulates in GRE, and then encapsulates in ESP. The DF bit is copied to the outer IP header in the ESP packet before transmitting across the IPsec VPN.

  7. Router_C receives the packet, and notes that the DF bit is set. The size of the packet is now 1414, as GRE and ESP headers have been added to the original ICMP message sent in step 6. Router_C drops the packet, as it exceeds the MTU of the link to Router_B and has the DF bit set. Router_C sends an ICMP unreachable message to Router_A with the MTU of 1400.

  8. Router_A receives the ICMP unreachable message from Router_C in step 7 above and updates the PMTU field of its IPsec SA to Router_B with the 1400-byte value.

  9. Host_A retransmits the 1352-byte ICMP message from step 6 to Server_B, as no acknowledgement was received.

  10. Router_A receives the packet, and encapsulates it in GRE. Once ESP encapsulation is applied, the length of the packet exceeds the 1400-byte IPsec SA PMTU obtained from Router_C in step 8. ESP sends an ICMP message to GRE with an MTU of 1362 (1400 less 38 bytes max ESP header length). GRE updates its tunnel IP MTU with this value.

  11. Host_A retransmits the 1352-byte ICMP message from step 6 again, as no acknowledgement was received for the retransmission in step 9.

  12. Router_A receives the packet and drops it as it exceeds the new GRE tunnel IP MTU of 1362 and has the DF-bit set. Router_A forwards an ICMP Unreachable to Host_A with an MTU value of 1338 bytes (1362 GRE MTU less 24 bytes GRE overhead).

  13. Host_A receives the ICMP Unreachable message sent from Router_A in step 12, and sends a new 1338-byte ICMP message to Server_B with the DF bit set.

  14. Router_A receives the packet, encapsulates it in GRE, encapsulates it in ESP, sets the DF bit in the outer IP header, and forwards to Router_C. This time, Router_C forwards the ICMP message originated from Host_A to Router_B.

  15. Router_B decapsulates the ESP packet, then decapsulates the GRE packet, and finally forwards the original ICMP PMTUD message to Server_B.

  16. Server_B acknowledges the receipt of the message, confirming that Host_A is to use an MTU size of 1338 bytes for this path.
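The layered subtraction behind the IPsec+GRE exchange can be sketched in Python as follows. This is a worked illustration only: the function name is hypothetical, and it assumes the 24-byte GRE overhead from step 2 and the 38-byte maximum ESP header length from step 4 apply to every packet in the exchange.

```python
GRE_OVERHEAD = 24       # GRE header length quoted in step 2
ESP_MAX_OVERHEAD = 38   # maximum ESP header length quoted in step 4 (assumed throughout)


def host_mtu_for_gre_ipsec(link_mtu):
    """Largest host packet that fits link_mtu after GRE and then ESP are added."""
    esp_limit = link_mtu - ESP_MAX_OVERHEAD    # value ESP signals to GRE
    return esp_limit - GRE_OVERHEAD            # value GRE signals to the host


print(1500 - GRE_OVERHEAD)             # 1476: GRE tunnel limit reported in step 2
print(host_mtu_for_gre_ipsec(1414))    # 1352: limit imposed by the 1414-byte serial link
print(host_mtu_for_gre_ipsec(1400))    # 1338: limit imposed by the 1400-byte link at Router_C
```

Under these assumptions, a 1338-byte host packet grows to 1362 bytes after GRE and 1400 bytes after ESP, which is the MTU value confirmed by Server_B at the end of the exchange.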

Although the DF bit in PMTUD ICMP messages is always set so as to properly detect areas of fragmentation, ICMP Unreachable responses to these messages are sent with the DF bit set to 0. As such, it is important to note that ICMP PMTUD messages sent from source to destination will never be fragmented, but the responses to those messages could quite possibly be fragmented along the return path.

Solutions for Preventing Fragmentation

In previous sections, we've discussed the most common method for preventing fragmentation—Path MTU Discovery. However, as we have explored, the use of PMTUD is somewhat laborious for network devices to execute. Additionally, PMTUD may not be an option in networks that require the filtering of ICMP messages at various points within the network. As such, it is important to understand other ways in which fragmentation can be avoided when designing an IPsec VPN. We will discuss several techniques for mitigating IP fragmentation other than PMTUD in this section.

IPsec Prefragmentation

IPsec Prefragmentation is a Cisco IOS feature that enables an encrypting IPsec VPN endpoint to attempt fragmentation before encryption if the size of the encrypted packet and additional header information exceeds the MTU of the path between endpoints. PMTUD can be used to determine the path of the SA and the MTU of that path. Doing so in conjunction with IPsec Prefragmentation provides a very scalable and manageable method of increasing the overall performance of an IPsec VPN where fragmentation after encryption is a possibility. Example 4-34 illustrates how to configure IPsec crypto DF-bit overwrite with IPsec Lookahead Fragmentation such that the path MTU of the SADB will be dynamically determined using tunnel PMTUD, and large packets (those exceeding the Path MTU for that SA in the SADB) will be fragmented before encryption.

Example 4-34 Enabling IPsec Prefragmentation with PMTUD and Crypto DF-bit Rewrite

Router_A#configure terminal
Enter configuration commands, one per line. End with CNTL/Z.
Router_A(config)#crypto ipsec df-bit set
Router_A(config)#crypto ipsec fragmentation before-encryption
Router_A(config)#

Note - Although Router_A in Example 4-34 will attempt to fragment large packets before encrypting them, there are many configuration instances in which IPsec Lookahead Fragmentation and DF-bit overwrite are configured incorrectly. It is critical to understand the interdependencies of the DF-bit setting and the Lookahead Fragmentation setting when addressing IPsec fragmentation design considerations. For a full listing of DF-bit interoperability with IPsec Lookahead Fragmentation settings, please refer to the following URL on CCO:

http://www.cisco.com/en/US/partner/products/sw/iosswrel/ps1839/products_feature_guide09186a0080115533.html


Figure 4-18 illustrates a client server exchange that does not support PMTUD. Note that, even though there is no exchange of ICMP messages, the Path MTU is still discovered and updated in Router_A's IPsec SADB.

There are three key operations that enable this feature:

  • Lookahead Fragmentation: Before forwarding an IPsec packet, Router_A predetermines the encapsulated packet size, and compares it with the MTU in the SADB. If it is predetermined to exceed that MTU size, the packet is fragmented before it is encrypted.

  • Crypto DF-bit Rewrite: When PMTUD is not supported, it is important that Router_A be able to set the DF bit in the outer IP header of IPsec-encapsulated packets. This prevents fragmentation, and triggers ICMP unreachables needed to adjust the Path MTU in Router_A's SADB.

  • Processing of MTU Information in ICMP Unreachables: Router_A is capable of deciphering MTU information of ICMP unreachables (received when IPsec packets with DF=1 are dropped). It uses this information to dynamically update the path MTU in its SADB.

The exchange between Router_A and Router_B in Figure 4-18 illustrates how all three of these features work in concert to minimize the effect of postencryption fragmentation in IPsec VPN deployments where PMTUD is not enabled on the endstations:

Figure 4-18 IPsec Fragment Handling Without PMTUD-Enabled Endstations

  1. Host_A sends a 1500-byte data packet, destined for Server_B.

  2. Router_A receives the packet, and estimates the ESP encapsulated packet size before encrypting or forwarding the packet. Router_A compares the estimated encapsulated packet size with the Path MTU, and determines that the size is greater than the Path MTU and fragments the packet.

  3. Router_A applies the appropriate encapsulation to the fragments in the fragment chain. While doing so, it sets the DF bit of each encapsulated packet equal to 1.

  4. Router_C receives the packets from Router_A, compares them with the MTU of the Router_C to Router_B link, notes that DF=1, and drops larger packets accordingly. Router_C sends ICMP Unreachables for dropped packets to Router_A.

  5. Router_A receives the ICMP Unreachables from Router_C, and updates the MTU of its SADB accordingly.

  6. Host_A does not receive a reply to its original packet within the appropriate timeout window and therefore retransmits.

  7. Router_A performs Lookahead Fragmentation on the retransmitted packet, sizing the fragments to the new MTU in its SADB. It then sets the DF bit in each encrypted packet.

  8. The encrypted packets are now sized lower than any individual link MTU in the path (<1400 bytes), and are therefore received on Router_B. Router_B is now able to decrypt each fragment in the chain before they are reassembled, a process that is done in the fast switching path.
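The Lookahead Fragmentation decision in steps 2 and 7 can be sketched in Python. This is a simplified model, not the IOS algorithm: the function name is hypothetical, a plain 20-byte IP header is assumed, and the 58-byte overhead and 1400-byte path MTU are illustrative values borrowed from earlier in this section.

```python
IP_HEADER = 20  # assumes an IP header without options


def prefragment(packet_len, path_mtu, ipsec_overhead):
    """Return cleartext fragment lengths (IP header included) sized so that
    each fragment plus IPsec overhead fits within path_mtu."""
    limit = path_mtu - ipsec_overhead          # largest cleartext packet that fits once encapsulated
    if packet_len <= limit:
        return [packet_len]                    # no fragmentation needed
    chunk = (limit - IP_HEADER) // 8 * 8       # fragment payloads are multiples of 8 bytes
    if chunk <= 0:
        raise ValueError("path MTU too small for the given overhead")
    payload = packet_len - IP_HEADER
    fragments = []
    while payload > 0:
        take = min(chunk, payload)
        fragments.append(take + IP_HEADER)     # each fragment carries its own IP header
        payload -= take
    return fragments


frags = prefragment(1500, path_mtu=1400, ipsec_overhead=58)
print(frags)                          # [1340, 180]
print([f + 58 for f in frags])        # [1398, 238]: both fit the 1400-byte path MTU
```

Because every fragment is a complete IP packet before it is encrypted, the receiving endpoint can decrypt each one independently and leave reassembly to the destination host.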

Manual MTU Adjustment

We've discussed the many tools available within Cisco IOS to avoid fragmentation in IPsec VPNs without having to manually tune the MTU sizes within the network. However, the option still exists to increase MTU size between IPsec VPN endpoints such that the risk of receiving a packet larger than that MTU size is small. If one must tune MTU sizes to accommodate IPsec traffic between endpoints in a network, one should take the following disadvantages of this approach into consideration:

  • Scalability and Management: Remember that MTU sizes vary on a segment-by-segment basis. As such, it can become laborious for network administrators to consistently ensure that every segment's MTU is properly tuned. Network designers can anticipate the difficulty of manual MTU tuning to increase as the number of IPsec VPN connections and hosts scales upwards.

  • Serialization Delay: The MTU attribute exists to decrease serialization delay on networks. On segments that have artificially high MTU sizes, network administrators can expect increased delay as larger packets are serialized in queue. This adversely affects time- and delay-sensitive applications such as Voice and Video over IP.
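The serialization delay trade-off can be quantified with a one-line formula: the time to clock a packet onto a link is its size in bits divided by the link rate. The Python sketch below uses a hypothetical function name, and the 64-kbps link speed and the 4470-byte MTU are assumptions chosen purely for illustration.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time in milliseconds to place one packet on a link of the given bit rate."""
    return packet_bytes * 8 / link_bps * 1000


# Assumed 64-kbps link for illustration.
print(serialization_delay_ms(1500, 64_000))   # 187.5
print(serialization_delay_ms(4470, 64_000))   # roughly 559 ms for a packet sized to a larger MTU
```

On a slow link, a voice packet queued behind a single large data packet waits for the whole of that delay, which is why artificially high MTU values work against delay-sensitive traffic.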
