What do router networks and a preschool have in common? A lot more than you think. Read on for the answer.

To the average enterprise, “network” means “router network”. It’s not that there aren’t other things in the network, but that the whole of enterprise networking is about building IP connectivity. We’ve invented a bunch of terms to describe the elements of our IP networks, and it seems like we’re adding new ones every day. As we do, a growing number of enterprises are finding that they don’t know as much about their networks’ operation as they need to; they don’t have “observability”.

That’s a term that’s been defined by so many sources that the definitions are meaningless. Let’s cut through the hype and focus on a term that’s found within many of those definitions: the concept of trace. Trace implies a path, a relationship, and that’s what networks should be all about.

A network is made up of boxes, and network management and monitoring have tended to focus on the behavior of these boxes as an indicator of the state of the network. All boxes A-OK? Network OK. This same view is pervasive in application management; the sum of the state of the pieces equals the state of the whole. What IT ops people found was that this seemingly obvious approach missed the critical point of message flow. You have to trace how work moves through a series of components to understand how an application is working. The same, it turns out, goes for a network, because a network isn’t a box or even just a collection of boxes; it’s a cooperative.

Now for that opening question. Your network is a bit like a room full of preschoolers because it’s barely controlled disorder.
You can tell preschoolers what to do, organize group activities, and so forth, but inside each kid is a little self-gratifying gremlin that can run off and do something unexpected. And guess what? Almost all IP networks are collections of willful gremlins.

Individual routers discover routes to move traffic using adaptive behavior. Every router typically advertises which network destinations it can reach, and it receives advertisements from other routers and forwards them to every adjacent router. From this, routers pick the “best” route, and if something breaks or gets congested, they work out a new topology through something often called convergence. Is that new topology optimum? Think of preschoolers working out their own lesson plan.

Since routes are created from reachability data exchanged with partner devices, it takes time for changes to percolate through a router’s partners, their partners’ partners, and so forth, and for everyone to pick out what’s best. While this is going on, packets can take erratic routes or even hit a dead end. And when the process is finished, whether the result is truly optimum routes is an open question.

How do you know what routes your packets take? There’s a common IP utility, traceroute, that can tell you, and some router vendors have packet-tracing tools built into their management systems to help visualize routes within your network. There are also third-party tools from network-monitoring companies that will do the same thing. They’re particularly helpful in multi-vendor networks where a particular vendor’s tool might not work.

The thing to look for in a packet trace is a route that doesn’t seem to have any logic behind it, or one that keeps changing when there’s no visible device or network failure. Either of these conditions may be due to congestion, which can cause packet loss or delay.
To figure out what’s happening, you start with the packet trace end-to-end and follow it along, looking for devices or connections that are overloaded or subject to a high error rate.

Don’t expect to get a solid answer from the packet trace alone. It should show where your route seems to be going awry, but remember that every router gets reachability data from neighbors, so the fault may lie elsewhere. A complete route map, the output of those specialized tools for packet-trace visualization, is helpful here if you can get trace data from multiple network endpoints at the same time.

In this case, knowledge isn’t power, though, no matter what the old saw says. There’s a difference between just watching a network and running it, just like there’s a difference between watching a football game and calling the plays. Netops is about controlling and not just knowing. The starting point in traffic management is to examine your router policies to see whether you’re picking routes correctly, but sometimes even controlling routing policies won’t get your flows going along the routes you want. If that’s the case, you have a traffic-management issue to address. The best tools to add traffic-management capability are MPLS and SDN.

MPLS lets routers build routes by threading an explicit path from router to router. SDN eliminates the whole concept of adaptive routing and convergence by having a central controller maintain a global route map that it gives to each SDN switch, and that it updates in response to failures or congestion. If your network consists of a VPN service and a complicated LAN, SDN is likely the better option. If you actually have a complex router network, MPLS is likely the right choice. With either MPLS or SDN, you know where your flows are because you put them there.

There’s also the option of virtual networking, if neither MPLS nor SDN seems to fit your needs.
Almost all the major network vendors offer virtual networks that use a second routing layer, and by putting virtual-network routers at critical places you can create explicit routes for your traffic. Some SD-WAN products also support this. It may also be possible to use policy management to control how routes and route changes are calculated. Virtual networks are especially valuable if you have multiple paths between your data centers and remote sites or the cloud. You can use a virtual network to pick the best path, or to divide traffic across multiple options, like a VPN and the internet.

Don’t forget the control dimension of observability. A teen watching siblings play in the mud might be surprised when parents protest, “I thought I told you to watch them!” Well, the teen was doing just that! And that’s the weakness of observability. Make sure you can do something with your newfound flow and route knowledge, or your network may still end up behaving like a room full of preschoolers.
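
As a rough illustration of the earlier advice about watching for a route that keeps changing with no visible failure, here is a minimal Python sketch. It assumes you have already collected hop lists from repeated traces (from traceroute or a vendor tool; the collection step isn't shown), and the function names and sample addresses are made up for the example rather than taken from any particular product.

```python
from itertools import zip_longest
from typing import List, Optional, Sequence, Tuple

Path = Sequence[str]  # one trace: router addresses in hop order


def first_divergence(old: Path, new: Path) -> Optional[int]:
    """Return the 0-based hop index where two traces first differ, or None."""
    # zip_longest pads the shorter trace with None, so a longer or
    # shorter path also counts as a difference at the first extra hop.
    for i, (a, b) in enumerate(zip_longest(old, new)):
        if a != b:
            return i
    return None


def route_changes(samples: Sequence[Path]) -> List[Tuple[int, int]]:
    """List (sample_index, hop_index) for each trace that differs from the one before it."""
    changes = []
    for n in range(1, len(samples)):
        hop = first_divergence(samples[n - 1], samples[n])
        if hop is not None:
            changes.append((n, hop))
    return changes


# Hypothetical hop lists from three successive traces (documentation addresses).
samples = [
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],
    ["192.0.2.1", "198.51.100.7", "203.0.113.9"],
    ["192.0.2.1", "198.51.100.22", "203.0.113.9"],
]
print(route_changes(samples))  # prints [(2, 1)]
```

A change that keeps showing up at the same hop with no known failure is a hint about where to start looking, though as noted above, the real cause may sit with a neighbor that is feeding that router its reachability data.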