Readers have told me that they like blog posts with technical tips and tricks. So I asked SolarWinds to write an article about making the most out of NetFlow. The following is a guest post written by Denny LeCompte, SolarWinds VP of Product Management and Mav Turner, SolarWinds Product Manager. SolarWinds makes the popular Orion NetFlow Traffic Analyzer (NTA) that analyzes Cisco NetFlow, Juniper J-Flow, IPFIX, & sFlow data. Got more questions about NetFlow? Leave them as a comment and we'll see if we can get them answered for you.
This article offers some insight into taking your NetFlow skills to the next level, including some of the more important aspects such as templates and what you can do with them. It also explains how to dissect all of that data you are collecting and how to get on the right path if you want to go full guns a-blazin’ and create your very own NetFlow tool.
When Cisco introduced NetFlow v1 for its routers and switches, it was really onto something. By the time v5 came around, it set the stage to become a ubiquitous traffic monitoring solution, and it is a wonderful tool for collecting critical information on network traffic.
Best of all, NetFlow v5 can be enabled on most network devices, making it easy to deploy and configure across the network. And if a vendor isn’t using NetFlow, chances are they are using something similar called sFlow. So, you should have your bases covered. When deployed correctly, NetFlow provides you with a crystal ball of information that lets you know how your network’s bandwidth is being utilized.
Why would you want to analyze NetFlow and, more importantly, why would you want to dive deeper? Well, if you are experiencing a network slowdown, it could be a symptom of something more serious, like bandwidth hogs using YOUR network to torrent movies or host large personal files that are shared out to the world. You could be experiencing network configuration problems, security breaches or attacks, or a botnet … oh my!
First, let’s look at templates, what they are and why they matter to you!
Back when NetFlow v5 came out, it really changed how things were done. Where v1 was a novel idea, v5 really cemented NetFlow’s place in the network management world hierarchy. But with v9, the game evolved to a new level and brought with it the introduction of templates.
In v5, the flow data was sent in a fixed, raw format, and a packet capture shows you the data in each packet, just like you would expect it to. With NetFlow v9 and IPFIX, templates are used. Templates define the structure of the data you will be receiving, but what does that really mean to you? If you don’t receive the template, then you can’t see the data! In most cases, it’s not a problem, because most devices send the templates fairly regularly. However, the Cisco ASA sends the template only every 30 minutes by default.
Several customers have seen their NetFlow collectors stop processing flows from an ASA because the template timeout was so high. A quick way to remedy this is to change the template timeout rate to one minute:
flow-export template timeout-rate 1
For full configuration examples of NetFlow v9, see this fantastic guide from Cisco: http://www.cisco.com/en/US/docs/ios/12_0s/feature/guide/nfexpfv9.html.
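To make the template dependency concrete, here is a minimal Python sketch of what a collector has to do with v9 templates. It is not taken from any particular product: the function names, the `templates` dictionary, and the small subset of field types are assumptions chosen for illustration, and it ignores details such as options templates and per-exporter template scoping.

```python
import ipaddress
import struct

# A few NetFlow v9 field type numbers; this is only a small illustrative subset.
FIELD_NAMES = {1: "IN_BYTES", 2: "IN_PKTS", 7: "L4_SRC_PORT",
               8: "IPV4_SRC_ADDR", 11: "L4_DST_PORT", 12: "IPV4_DST_ADDR",
               27: "IPV6_SRC_ADDR", 28: "IPV6_DST_ADDR"}
ADDRESS_FIELDS = {8, 12, 27, 28}

# Hypothetical in-memory template cache: template_id -> [(field_type, length)]
templates = {}


def parse_template_flowset(body):
    """Store every template found in the body of a template FlowSet (ID 0)."""
    offset = 0
    while offset + 4 <= len(body):
        template_id, field_count = struct.unpack_from("!HH", body, offset)
        offset += 4
        fields = []
        for _ in range(field_count):
            fields.append(struct.unpack_from("!HH", body, offset))
            offset += 4
        templates[template_id] = fields


def parse_data_record(template_id, record):
    """Decode one data record, or return None if its template is unknown."""
    fields = templates.get(template_id)
    if fields is None:
        return None  # no template yet, so the bytes cannot be interpreted
    decoded, offset = {}, 0
    for field_type, length in fields:
        raw = record[offset:offset + length]
        name = FIELD_NAMES.get(field_type, f"field_{field_type}")
        if field_type in ADDRESS_FIELDS:
            decoded[name] = str(ipaddress.ip_address(raw))
        else:
            decoded[name] = int.from_bytes(raw, "big")
        offset += length
    return decoded
```

Calling `parse_data_record` before the matching template has been seen returns `None`, which is exactly the situation a collector is in while it waits out a long ASA template timeout.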
What about the other goodness from templates?
So now that you have to worry about the templates, what advantage do they provide you? The classic flow fields of the source IP, destination IP, source port, destination port, number of bytes, and number of packets give great information about basic network traffic, but there is obviously a lot of other interesting data flowing through your network devices, and you can soak all these tasty morsels up and use them!
Some of the more interesting NetFlow v9 fields exposed provide information about BGP, MPLS and IPv6. The NetFlow v5 flow record format only allows four bytes each for the source and destination addresses, so it is for IPv4 only. This clearly will not work with the 16 bytes needed for IPv6 addresses, so these fields were added for v9 and IPFIX. With BGP, you can include the origin or peer autonomous systems (AS). This is really helpful when you are trying to understand network performance issues and you use multiple tier-one providers. Use the AS to find out how your traffic is actually being routed to your peers. If you hear complaints about application or network performance, there could be an issue with one of your providers.
Without knowing how that specific traffic was routed (knowing, not assuming based on your policies), it is impossible to eliminate that variable without changing your routing for specific applications.
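For comparison, the fixed v5 record can be decoded without any template at all. The sketch below unpacks one 48-byte v5 flow record with Python's `struct` module; the function name and the dictionary keys are my own choices for illustration. It shows both the four-byte address fields and the two-byte source and destination AS fields that v5 carries.

```python
import ipaddress
import struct

# Fixed 48-byte NetFlow v5 flow record layout, network byte order.
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def parse_v5_record(data):
    """Decode a single 48-byte NetFlow v5 flow record into a dictionary."""
    (src, dst, nexthop, in_if, out_if, packets, octets, first, last,
     src_port, dst_port, _pad1, tcp_flags, proto, tos,
     src_as, dst_as, src_mask, dst_mask, _pad2) = V5_RECORD.unpack(data)
    return {
        "src": str(ipaddress.ip_address(src)),  # 4 bytes, so IPv4 only
        "dst": str(ipaddress.ip_address(dst)),
        "src_port": src_port,
        "dst_port": dst_port,
        "protocol": proto,
        "packets": packets,
        "bytes": octets,
        "src_as": src_as,  # origin or peer AS, depending on exporter config
        "dst_as": dst_as,
    }
```

Because the layout never changes, there is simply no room in it for a 16-byte IPv6 address, which is why v9 and IPFIX moved to templates.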
sFlow versus NetFlow: FIGHT!
Well, it’s not really a fight, but I couldn’t resist. For most users, this is simply a vendor restriction. If you have Cisco devices, you use NetFlow. If you have HP (or almost any other vendor), then you will use sFlow. While they are similar, there are a few differences that should be noted.
sFlow is sampled flow. What that means is that instead of sending all flows, sFlow will send one packet for every X packets, where X is the sample rate. This is usually configurable. For example, to configure sFlow on an HP ProCurve 5400, you would type:
sflow INSTANCE sampling PORT-LIST X
Here INSTANCE is the instance number from your destination statement, PORT-LIST is the set of ports to sample, and X is the sample rate. What many people don’t know is that Cisco supports sampled flows as well. This is particularly helpful for large switches or anything that has really high throughput, and it can help reduce CPU load.
Here are the steps:
- Create a sampler map
R(config)# flow-sampler-map netflowSampler
R(config-sampler)# mode random one-out-of 100
Here netflowSampler is simply a string that identifies the name of your sampler, and 100 tells the device how often to sample (1 packet out of every 100). Random means that the 1 out of 100 will not always be at the same offset, but is randomly selected out of each 100 packets to get a better representation of the traffic.
- Apply the sampler map to an interface
R(config)# ip cef
R(config)# interface gigabit 0/1
R(config-if)# flow-sampler netflowSampler
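If you sample, remember that whatever reads the data has to scale the counts back up. The Python sketch below is a rough model of random 1-out-of-N sampling as described above, not a description of how any device implements it internally; the function names and the seed parameter are assumptions for illustration.

```python
import random


def sample_packets(packet_sizes, rate, seed=None):
    """Pick one packet at a random offset from each group of `rate` packets."""
    rng = random.Random(seed)
    sampled = []
    for start in range(0, len(packet_sizes), rate):
        group = packet_sizes[start:start + rate]
        sampled.append(rng.choice(group))
    return sampled


def estimate_totals(sampled_sizes, rate):
    """Scale sampled packet and byte counts back up by the sample rate."""
    return {
        "packets": len(sampled_sizes) * rate,
        "bytes": sum(sampled_sizes) * rate,
    }
```

With a rate of 100, each sampled packet stands in for roughly 100 real ones, so the totals are estimates. They get better as traffic volume grows, and small or short-lived flows may be missed entirely.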
What are things you should look for in your fancy data?
OK, you’ve got a pile of data, but you want to know how to find that really interesting needle in the haystack. There is a great presentation from Masaryk University from FloCon 2011 about detecting botnets with NetFlow data. One of the great nuggets from this presentation was the idea of looking for DNS requests coming from inside your network to external DNS servers.
Most networks are set up so that their DNS servers will forward the request to an external DNS server for resolution if they do not know the address. Because of this, you should only see external DNS requests (UDP/TCP 53) from your internal DNS servers to specific external DNS servers. If you see any other hosts sending traffic to external DNS servers, it is worth following up to see what that DNS server is (there are legitimate reasons) and which clients are using it. It could be a command and control server for a botnet. It could also cause problems for average users. Instead of accessing the internal address, users of external DNS servers will likely resolve the public address and try to access that IP address, causing potential service outages.
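As a rough illustration of that DNS check, here is a small Python sketch that filters already-decoded flow records. The record format (dictionaries with `src`, `dst` and `dst_port` keys), the internal address range, and the approved resolver address are all hypothetical; substitute whatever your collector produces and whatever your network actually uses.

```python
import ipaddress

# Assumptions for illustration: adjust to your own address plan.
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8")]
APPROVED_RESOLVERS = {"10.0.0.53"}  # hypothetical internal DNS server


def is_internal(addr):
    """Return True if the address falls inside one of the internal networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in INTERNAL_NETS)


def suspicious_dns_flows(flows):
    """Return DNS flows that leave the network from a non-approved host."""
    return [
        f for f in flows
        if f["dst_port"] == 53
        and is_internal(f["src"])
        and not is_internal(f["dst"])
        and f["src"] not in APPROVED_RESOLVERS
    ]
```

Anything this returns is a lead to follow up on, not proof of a botnet. As noted above, there are legitimate reasons for a client to query an outside resolver.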
At a more general level, you should have a good idea of what traffic should be originating from inside your network and going out to the internet. Obviously, you will see a lot of HTTP/HTTPS, but think about what other protocols are really necessary for your employees to get their jobs done. If you see specific ports being used by multiple internal users, it is worth investigating.
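The same idea extends to other ports. The Python sketch below groups outbound flows by destination port and reports any port outside an expected list that is used by several internal hosts. The expected port list, the threshold, and the record keys are assumptions for illustration only.

```python
from collections import defaultdict

# Assumption: the ports your users are expected to need for outbound traffic.
EXPECTED_PORTS = {53, 80, 443}


def unexpected_ports(flows, min_hosts=2):
    """Map each unexpected destination port to the internal hosts using it."""
    hosts_by_port = defaultdict(set)
    for flow in flows:
        if flow["dst_port"] not in EXPECTED_PORTS:
            hosts_by_port[flow["dst_port"]].add(flow["src"])
    return {
        port: sorted(hosts)
        for port, hosts in hosts_by_port.items()
        if len(hosts) >= min_hosts
    }
```

Raising `min_hosts` cuts down on noise from one-off connections, while setting it to 1 shows every unexpected port in use.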
What if you really want to geek out and build your own NetFlow tool?
There is a CERT project that developed an awesome open source IPFIX library, libfixbuf, which you can use to build a collector. You can use the library to collect the data and then do whatever you want. Warning: this is only for advanced users, but it can give you unlimited (whatever you can build) flexibility for what data you consume and how you want to display it. And who knows, you could have the start to your very own company. SolarWinds started in a similar fashion …
For the project site, go here: http://tools.netsa.cert.org/fixbuf/libfixbuf/index.html.
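If you just want a feel for what a collector does before committing to libfixbuf, the bare-bones Python sketch below may help. It does not use libfixbuf at all: it is a plain UDP listener that reads the first four bytes of each export packet. The function names and the listening port (2055 is a common choice, but that is an assumption about your exporter configuration) are illustrative only.

```python
import socket
import struct


def parse_header(datagram):
    """Return (version, count) from the first four bytes of an export packet.

    For NetFlow v5 and v9 the second field is a record count. For IPFIX
    (version 10) the same two bytes hold the message length instead.
    """
    if len(datagram) < 4:
        raise ValueError("datagram too short to be a flow export")
    version, count = struct.unpack_from("!HH", datagram)
    return version, count


def listen(port=2055):
    """Print the version and count of every export packet received on `port`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("0.0.0.0", port))
    while True:
        datagram, (exporter, _) = sock.recvfrom(65535)
        version, count = parse_header(datagram)
        print(f"{exporter}: version {version}, count/length {count}")
```

Calling `listen()` and pointing a device's flow export at the machine should show which versions your exporters are actually sending, which is a useful first sanity check before writing any real decoding logic.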
Your device doesn’t support any flow standards, what do you do?
There are instances where specific platforms do not support any flow standard but you still need the data. Or even worse, you don’t manage the device so you can’t enable it (maybe your service provider takes care of this for you and they want to charge you big bucks to enable it). nProbe is a great open source project that will generate the flow data. The server you install nProbe on should have two interfaces, one to sniff the network and one to send the flows to your collector. The interface that is used to sniff the network should be connected to a switchport that is running in SPAN or duplicate mode. This will duplicate, or mirror, the traffic from one source to the port that your server is plugged into. The server then acts as a type of sniffer that generates flow data that any standard flow collector can recognize and interpret. The diagram below depicts how nProbe works:
Just like your standard NetFlow deployment, make sure you place your collectors at relevant points in the network. Identify where the source and destination networks are, the path that they take, and what information you are trying to collect.
Don’t crash your network
Remember, flow protocols send information about most of the traffic traversing your network. If your users are sending a lot of information, then NetFlow will send a lot of information. Think about the impact on your network that this extra traffic will have. For most LANs, this will be marginal. However, if you are sending all of that flow data across your WAN, this could defeat your whole purpose for using NetFlow. In some instances, I have seen network devices crash because NetFlow was enabled on every single port. Needless to say, crashing a core switch is never a good thing, and if you tell every interface to send out NetFlow data, this could very well happen. This brings back up the importance of planning out where to send NetFlow from (which points in the network). The best way to prevent any negative impacts of turning on NetFlow is to have a good plan, start small, and incrementally increase the data you are sending. Make sure you have a good baseline on the performance of your network equipment so you can spot any signs of stress early on.
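To put a rough number on that extra traffic, here is a back-of-the-envelope Python calculation for NetFlow v5 export. It relies on the v5 packet layout (a 24-byte header followed by up to 30 records of 48 bytes each) and assumes IPv4 and UDP headers with no options. The flows-per-second figure is something you would have to measure on your own network, and the function name is just for illustration.

```python
import math

V5_HEADER_BYTES = 24   # NetFlow v5 export packet header
V5_RECORD_BYTES = 48   # one v5 flow record
V5_MAX_RECORDS = 30    # maximum records per export packet
UDP_IP_OVERHEAD = 28   # 20-byte IPv4 header plus 8-byte UDP header


def export_bits_per_second(flows_per_second):
    """Estimate v5 export bandwidth, assuming export packets are sent full."""
    packets = math.ceil(flows_per_second / V5_MAX_RECORDS)
    payload = packets * V5_HEADER_BYTES + flows_per_second * V5_RECORD_BYTES
    return (payload + packets * UDP_IP_OVERHEAD) * 8
```

For example, `export_bits_per_second(3000)` works out to 1,193,600 bits per second, or about 1.2 Mbps. That is trivial on a LAN, but it is a meaningful slice of a small WAN link, which is why it pays to plan where you export from.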
We hope this article gave you some new insights on understanding and utilizing NetFlow to a larger degree. If you have any additional tips or tricks you want to share, we would love to hear about them in the comment section.