The Maximum Transmission Unit (MTU) is the largest number of bytes an individual datagram can have on a particular data communications link. When encapsulation, encryption, or overlay network protocols are used, the end-to-end effective MTU size is reduced. Some applications may not work well with the reduced MTU size and fail to perform Path MTU Discovery. In response, it would be desirable to increase the MTU size of the network links.
The Maximum Transmission Unit (MTU) is the largest possible frame size of a communications Protocol Data Unit (PDU) on an OSI Model Layer 2 data network. The size is governed by the physical properties of the communications media. Historical network media were slower and more prone to errors, so the MTU sizes were set smaller. For most Ethernet networks the MTU is set to 1500 bytes, and this size is used almost universally on access networks. Ethernet Version 2 networks have a standard frame size of 1518 bytes (including the 14-byte Ethernet II header and 4-byte Frame Check Sequence (FCS)). Other communications media types have different MTU sizes. For example, T3/DS3 (or E3) and SONET/SDH interfaces have an MTU size of 4470 bytes (4474 with header).
When one protocol's packets or frames get encapsulated within another protocol, there is an overall increase in the frame size. The encapsulation adds protocol header overhead, and thus the 1500-byte packets that systems send across the network cannot be delivered intact to the other side. The number of bytes of protocol overhead varies based on the encapsulation type. Following is a list of protocol and encapsulation overhead added to the frame.
- GRE (IP Protocol 47) (RFC 2784) adds 24 bytes (20 byte IPv4 header, 4 byte GRE header)
- 6in4 encapsulation (IP Protocol 41, RFC 4213) adds 20 bytes
- 4in6 encapsulation (e.g. DS-Lite RFC 6333) adds 40 bytes
- Each additional outer IPv4 header adds 20 bytes
- IPsec encryption performed by the DMVPN adds 73 bytes for ESP-AES-256 and ESP-SHA-HMAC overhead (overhead depends on transport or tunnel mode and the encryption/authentication algorithm and HMAC)
- MPLS adds 4 bytes for each label in the stack
- IEEE 802.1Q tag adds 4 bytes (Q-in-Q would add 8 bytes)
- VXLAN adds 50 bytes
- OTV adds 42 bytes
- LISP adds 36 bytes for IPv4 and 56 bytes for IPv6 encapsulation
- NVGRE adds 42 bytes
- STT adds 54 bytes
There are many other situations where protocol encapsulation occurs, so you must be aware of where it is happening along the transmission path. Although this may be difficult to detect, your network documentation should note where the MTU size is smaller than 1500 bytes.
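The overhead figures in the list above can be combined to estimate the effective MTU left for the inner packet. The following is a minimal sketch in Python, not a definitive calculator: the dictionary keys and the function name `effective_mtu` are made up for illustration, the byte counts are the ones quoted in the list, and the code assumes every overhead simply subtracts from the link MTU, which is a simplification.

```python
# Per-packet overhead in bytes, copied from the list above.
# The key names are hypothetical labels chosen for this sketch.
OVERHEAD_BYTES = {
    "gre": 24,         # 20-byte outer IPv4 header + 4-byte GRE header
    "6in4": 20,
    "4in6": 40,
    "ipsec_esp": 73,   # the ESP-AES-256 / ESP-SHA-HMAC figure quoted above
    "vxlan": 50,
    "lisp_ipv4": 36,
    "lisp_ipv6": 56,
}


def effective_mtu(link_mtu, encapsulations):
    """Return the largest inner packet that still fits after encapsulation."""
    total_overhead = sum(OVERHEAD_BYTES[name] for name in encapsulations)
    return link_mtu - total_overhead


print(effective_mtu(1500, ["gre"]))               # 1476
print(effective_mtu(1500, ["gre", "ipsec_esp"]))  # 1403
print(effective_mtu(1500, ["vxlan"]))             # 1450
```

Stacking GRE and IPsec on a 1500-byte link leaves roughly 1400 bytes for the original packet under these assumptions.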
Path MTU Discovery (PMTUD)
Routers are capable of fragmenting packets to cut them down to size so they fit into the smaller-MTU tunnels, but this is not optimal. When encapsulation increases the size of an incoming packet, the network device then sends the larger packet through the outgoing interface toward the destination. However, if the new total packet size exceeds the MTU of the outgoing interface, the network device may have to fragment the packet into two smaller packets before it can forward it. An IPv4 router will fragment and forward the packet if the Do-Not-Fragment (DF) bit is 0. If the DF bit is 1, the router instead drops the packet and sends the source an ICMP “fragmentation needed” error (Type 3, Code 4), the IPv4 equivalent of a “packet too big” message, to inform the source that it should use a smaller packet size. IPv6 routers never fragment on behalf of the source; they drop the packet and send back an ICMPv6 “packet too big” error message.
The primary problem with the MTU size being reduced across the network is that some applications may not be able to work well in this environment. Some nodes that send 1500 byte packets into the DMVPN and subsequently receive an ICMPv4 “packet too big” message from the router may choose to ignore this. These nodes are not performing Path MTU Discovery (PMTUD) as prescribed by IETF Internet RFC 1191 or RFC 1981 and are therefore relying on the IPv4 routers to perform this fragmentation on behalf of the source host. RFC 2923 also covers the topic of “TCP Problems with Path MTU Discovery”. If the application cannot function properly in this environment, there could be end-user impacts. Also, if there is a firewall in the middle of the communication path somewhere that is blocking the ICMP error messages, then that would definitely prevent PMTUD from operating properly.
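To make the PMTUD behavior concrete, here is a small simulation in Python. It is only a sketch under stated assumptions: the function name `discover_path_mtu` and the sample path are hypothetical, no real packets are sent, and each hop's MTU value stands in for the size reported in the ICMP error message.

```python
def discover_path_mtu(link_mtus, initial_size=1500):
    """Simulate a source probing a path with DF=1 packets.

    link_mtus lists the MTU of each hop in order. The first hop that
    cannot forward the packet "reports" its MTU back to the source,
    standing in for the ICMP error message.
    Returns (discovered_size, number_of_packets_sent).
    """
    size = initial_size
    attempts = 0
    while True:
        attempts += 1
        # Find the first hop whose MTU is smaller than the packet.
        blocker = next((mtu for mtu in link_mtus if mtu < size), None)
        if blocker is None:
            return size, attempts   # packet reached the destination
        size = blocker              # source lowers its estimate and retries


# Hypothetical path: LAN, GRE tunnel, IPsec tunnel, LAN
path = [1500, 1476, 1400, 1500]
print(discover_path_mtu(path))  # (1400, 3)
```

A host that ignores the error message never executes the "lower the estimate and retry" step, which is why its large packets keep getting dropped.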
One method to test and detect a reduced MTU size is to use a ping with a large packet size. Here are some examples of how to do this.
C:\Users\ScottHogg> ping -l 1500 192.168.10.1
On a Windows host you can also set the Do Not Fragment (DF) bit to 1 with the “-f” ping parameter.
C:\Users\ScottHogg> ping 192.168.10.1 -l 1500 -f
On Linux the command would be:
RedHat# ping -s 1500 -M do 192.168.10.1
On a Cisco IOS device the command would be:
Router1# ping 192.168.10.1 size 1500 df-bit
On a Cisco NX-OS device the command would be:
Switch7K# ping 192.168.10.1 packet-size 9216 c 10
On a Cisco IOS XR device the command would be:
RP/0/RP0/CPU0:Router1#ping 192.168.10.1 size 1500 donotfrag
On a JUNOS device the command would look like:
root@J4350-1# run ping 192.168.10.1 size 1500 do-not-fragment rapid
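When choosing a size for these tests, keep in mind that the Windows "-l" and Linux "-s" options set the ICMP payload size, so the IP packet on the wire is 28 bytes larger for IPv4 (20-byte IP header plus 8-byte ICMP header). Other platforms may count the size differently, so check their documentation. The Python sketch below, with the hypothetical helper name `ping_payload_for_mtu`, works out the payload that exactly fills a given MTU, assuming no IP options or extension headers.

```python
IPV4_HEADER = 20   # bytes, assuming no IP options
IPV6_HEADER = 40   # bytes, assuming no extension headers
ICMP_HEADER = 8    # ICMP and ICMPv6 echo headers are both 8 bytes


def ping_payload_for_mtu(mtu, ipv6=False):
    """Return the echo payload size that produces an IP packet of exactly mtu bytes."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - ICMP_HEADER


print(ping_payload_for_mtu(1500))             # 1472
print(ping_payload_for_mtu(1400))             # 1372
print(ping_payload_for_mtu(1500, ipv6=True))  # 1452
```

For example, a payload of 1472 bytes with the DF bit set should pass over a clean 1500-byte IPv4 path, while anything larger should fail.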
IPv4 routers fragment on behalf of a source node that sends a packet too large for the next link. Routers can fragment IPv4 packets unless the Do-Not-Fragment (DF) bit is set to 1 in the IPv4 header. If the DF bit is set to 0 (the default), the router splits a packet that is too large for the outgoing interface and sends the two fragments toward the destination. When the destination receives the two fragments, its protocol stack must reassemble them before processing the Protocol Data Unit (PDU). The danger arises when an application sends its packets with DF=1, ignores the ICMP “packet too big” messages, and does not perform PMTUD.
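The splitting step can be illustrated with a short Python sketch. It is a simplified model, not a router implementation: the function name `fragment_ipv4` is hypothetical, the IPv4 header is assumed to be 20 bytes with no options, and the only rule modeled is that every fragment except the last carries a multiple of 8 payload bytes, because the fragment offset field counts in 8-byte units.

```python
def fragment_ipv4(total_length, mtu, header_len=20):
    """Split an IPv4 packet into fragments that fit the given MTU.

    Returns a list of (offset_in_8_byte_units, fragment_total_length,
    more_fragments_flag) tuples.
    """
    if total_length <= mtu:
        return [(0, total_length, False)]   # fits, no fragmentation needed

    # Largest payload per fragment, rounded down to a multiple of 8 bytes.
    max_chunk = (mtu - header_len) // 8 * 8
    if max_chunk < 8:
        raise ValueError("MTU too small to carry any fragment payload")

    payload = total_length - header_len
    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(max_chunk, payload - offset)
        more = offset + chunk < payload     # MF flag set on all but the last
        fragments.append((offset // 8, header_len + chunk, more))
        offset += chunk
    return fragments


print(fragment_ipv4(1500, 1400))
# [(0, 1396, True), (172, 124, False)]
```

In this example a 1500-byte packet crossing a 1400-byte link becomes one 1396-byte fragment and one 124-byte fragment, so the second packet is mostly header overhead.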
All IPv6 networks must support an MTU size of 1280 bytes or greater (RFC 2460). This is because IPv6 routers do not fragment IPv6 packets on behalf of the source. IPv6 routers drop the packet and send back an ICMPv6 Type 2 (Packet Too Big) message to the source indicating the MTU of the link that could not carry it. It then falls on the source to perform the fragmentation itself and cache the new reduced MTU size for that destination so future packets use the correct MTU size.
The primary concern with having routers perform fragmentation on behalf of the source is the added CPU processing overhead on the router. If IPsec is being used, then the routers on both ends of the tunnel will need to handle the fragmentation and reassembly of the packets. If the routers are performing fragmentation on behalf of the source node, it may be desirable to have the fragmentation performed prior to encryption. This prevents the destination tunnel router from having to reassemble the fragments before performing the decryption. In other cases, we may want fragmentation to take place after encryption. If fragmentation takes place after encryption, then the destination tunnel router will need to perform reassembly before it can decrypt the packet, which can add CPU overhead. Therefore, it is advisable for most networks to fragment before encryption.
The following two Cisco IOS commands, shown here in interface configuration mode, control this behavior.
Router(config-if)# crypto ipsec fragmentation before-encryption
Router(config-if)# crypto ipsec fragmentation after-encryption
There is a good document from Cisco on the 7600 switches and how to resolve these issues, titled “Configuring IPSec VPN Fragmentation and MTU”.
MTU and MSS
Another method to handle the increase in packet size due to encapsulation, and the resulting fragmentation, is to use the TCP Maximum Segment Size (MSS) parameter. The MSS is the largest number of bytes of payload data that can be sent in a single TCP segment. Each side advertises its MSS in the SYN packets of the TCP 3-way handshake. The MSS is defined in RFC 879 for IPv4 and in RFC 2460 for IPv6. The MSS does not include the TCP header (20 bytes) or the IPv4 header (20 bytes) (the IPv6 header is 40 bytes).
In the cases where IPsec is being used, it is customary to set the MTU size on the tunnel interfaces to 1400 bytes and to set the TCP-MSS-adjust to 1360 bytes. This can be configured in a Cisco IOS device using these commands.
Router(config)# interface tunnel 4
Router(config-if)# ip tcp adjust-mss 1360
Router(config-if)# ip mtu 1400
For IPv6-enabled interfaces we can use the same type of functions, but the IPv6 header is 40 bytes instead of IPv4’s 20-byte header. We must also consider the 20-byte TCP header, which is the same size for IPv4 and IPv6.
Router(config)# interface tunnel 6
Router(config-if)# ipv6 tcp adjust-mss 1340
Router(config-if)# ipv6 mtu 1400
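The MSS values used in these two configurations follow directly from the header sizes. As a quick check, here is a minimal Python sketch; the function name `mss_for_mtu` is hypothetical, and it assumes headers without IP or TCP options.

```python
TCP_HEADER = 20    # bytes, assuming no TCP options
IPV4_HEADER = 20   # bytes, assuming no IP options
IPV6_HEADER = 40   # bytes, assuming no extension headers


def mss_for_mtu(mtu, ipv6=False):
    """Return the TCP MSS that keeps a full segment within the given MTU."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - TCP_HEADER


print(mss_for_mtu(1400))             # 1360, the IPv4 tunnel value above
print(mss_for_mtu(1400, ipv6=True))  # 1340, the IPv6 tunnel value above
print(mss_for_mtu(1500))             # 1460, plain Ethernet with IPv4
```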
This MSS option does not work for UDP applications. UDP is a connectionless protocol, so there is no handshake in which a segment size could be negotiated. For UDP applications that set DF=1 but do not perform PMTUD, one option may be to configure a policy that sets the DF bit back to zero.
Here is a good document from Cisco on this topic titled “Resolve IP Fragmentation, MTU, MSS, and PMTUD Issues with GRE and IPSEC”.
Compensate by Increasing the MTU Size
The primary issue with MTU size is that encapsulation takes place while the links between sites only support a 1500-byte MTU. Frequently, links between enterprise routers and the upstream ISP routers only support a 1500-byte MTU. This is also true on the links between CE routers and PE routers.
It would be highly desirable to be able to increase the MTU size over the WAN. If the MTU size could be increased throughout the path across the WAN, then the WAN interfaces of the routers could absorb the added encapsulation overhead. This would eliminate the need to reduce the MTU size on the tunnel interfaces and adjust the MSS, and it would relieve the routers of performing any fragmentation.
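To get a feel for how much larger the WAN MTU would need to be, the overhead can be added back on top of the 1500-byte inner packet. This Python sketch is illustrative only: the function names are hypothetical, the overhead values are the GRE and IPsec figures quoted earlier in this article, and the 9000-byte jumbo size is an assumed example rather than something a given provider necessarily offers.

```python
def required_wan_mtu(inner_mtu, overheads):
    """Return the link MTU needed to carry inner_mtu-byte packets unfragmented."""
    return inner_mtu + sum(overheads)


def wan_link_is_sufficient(wan_mtu, inner_mtu, overheads):
    """Return True if the WAN link can carry the encapsulated packet intact."""
    return wan_mtu >= required_wan_mtu(inner_mtu, overheads)


gre_plus_ipsec = [24, 73]   # GRE and IPsec overhead figures quoted earlier

print(required_wan_mtu(1500, gre_plus_ipsec))              # 1597
print(wan_link_is_sufficient(1500, 1500, gre_plus_ipsec))  # False
print(wan_link_is_sufficient(9000, 1500, gre_plus_ipsec))  # True
```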
It is important to be aware of issues related to the reduction in the end-to-end effective MTU size caused by encapsulation. This can occur with tunnels, overlay technologies, IPsec links, and new Layer-2 data center interconnect protocols. Detecting these issues, troubleshooting them, and correcting the problem can be difficult. If you are using encapsulation technologies, then you should consider increasing the MTU size if that is possible. You could ask your service provider if they support larger frame sizes within their network and on the link between their PE and your CE router. At the very least, you should document where in your network topology these MTU size problems may occur so you are ready to quickly troubleshoot any application issues that may result.