Chapter 1: Visualization

Addison-Wesley Professional

“I saw it with my own eyes!”

This sentence usually expresses certainty and conviction. It is a strong sentence. It is stronger than saying, “I heard it with my own ears.” Often, this sentence is interpreted as expressing the speaker’s conviction that she is privy to some truth. And we treat that conviction as authentic. It must have happened if she saw it. We want people to say this about the security data we analyze. We want them to look at a picture of our work product and have that experience. A picture says more than a thousand words. A visual representation of data can communicate a lot of detail in a way that is instantly accessible and meaningful.

More of the human brain is devoted to visual processing than to any other sense. It is the “broadband” access to understanding. This ability of the human mind to rapidly process visual input makes information visualization a useful and often necessary tool, enabling us to turn data into information and knowledge.

Images differ from the written or spoken word in many ways, and not only in the bandwidth of information they can transfer. A more interesting phenomenon is the critical faculty, or skepticism filter.1 When you listen to someone speak, or while you are reading these words, you are constantly asking yourself, “Is he telling the truth? Does this match up with my experience?” When you look at a picture, this skepticism filter does not seem to be there in the first moment. Do we trust a photograph? At first glance, we seem to. However, the closer we look, the more detail we see, the more we analyze the picture, and the more skeptical we get. What is happening?

  1. Barnett, E. A. Analytical Hypnotherapy: Principles and Practice (Glendale, CA: Westwood Publishing Company, 1989).

For the brain to process an image and understand its contents, it has to formulate sentences and words around the image. The image, and more specifically color, is put into sentences.2 The longer we look at an image, the more sentences the brain constructs. And the more sentences, the more reason we give our brain to apply the skepticism filter.

  2. A. Franklin et al., “From the Cover: Categorical perception of color is lateralized to the right hemisphere in infants, but to the left hemisphere in adults,” PNAS 105, 2008, 3221–3225.

What does this all have to do with visualization, you might wonder? When we visualize data, we have to make sure that the output is going to be as simple and clear as possible. We have to make sure that the viewer needs as few sentences as possible to interpret the graph. This not only decreases the time that someone needs to process and understand a visualization, it also minimizes the surface area for viewers to apply the skepticism filter. We want them to trust that the image correctly represents the data.

This chapter explores visualization, encourages you to visualize security data, and explains some of the fundamental principles that anybody who is trying to communicate information in a visual form should understand.

What Is Visualization?

The proverb says, “A picture is worth a thousand words.” Images are used to efficiently communicate information. An image can capture a sunset in all of its beauty. It would be impossible to capture the same impression in words. I like to say that

A picture is worth a thousand log records.

Instead of handing someone a log file that describes how an attack happened, you can use a picture, a visual representation of the log records. At one glance, the picture communicates the content of this log. Viewers can process the information in a fraction of the time that it would take them to read the original log.

Visualization, in the security sense, is therefore the process of generating a picture based on log records. It defines how the log records are mapped into a visual representation.
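To make the idea of a mapping concrete, here is a minimal sketch in Python. It assumes a made-up, firewall-style log format (timestamp, source, destination, action) and one possible mapping among many: each record becomes an edge from source to destination, and repeated records make the edge thicker. The sample records, the function names, and the choice of Graphviz DOT text as output are all illustrative assumptions, not a prescribed method.

```python
from collections import Counter

# Hypothetical records: "timestamp source_ip destination_ip action".
# Real log formats differ; this layout is assumed for illustration only.
SAMPLE_LOG = """\
2008-01-01T10:00:01 10.0.0.5 192.168.1.10 block
2008-01-01T10:00:02 10.0.0.5 192.168.1.11 block
2008-01-01T10:00:03 10.0.0.7 192.168.1.10 pass
2008-01-01T10:00:04 10.0.0.5 192.168.1.10 block
"""


def parse_record(line):
    """Split one log line into named fields."""
    timestamp, source, destination, action = line.split()
    return {"time": timestamp, "src": source, "dst": destination, "action": action}


def map_to_edges(records):
    """Map each record to a (source, destination) edge and count repeats."""
    return Counter((r["src"], r["dst"]) for r in records)


def to_dot(edges):
    """Render the edges as Graphviz DOT text; line width encodes the count."""
    lines = ["digraph log {"]
    for (src, dst), weight in sorted(edges.items()):
        lines.append(f'  "{src}" -> "{dst}" [penwidth={weight}];')
    lines.append("}")
    return "\n".join(lines)


records = [parse_record(line) for line in SAMPLE_LOG.splitlines() if line.strip()]
edges = map_to_edges(records)
dot_text = to_dot(edges)
print(dot_text)
```

The printed text can be handed to a graph-drawing tool that understands DOT. The point of the sketch is only that the picture is fully determined by the mapping: choosing different fields for the nodes, or a different visual property for the count, yields a different picture of the same log.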

Why Visualization?

Why should we be interested in visualization? Because the human visual system is a pattern seeker of enormous power and subtlety. The eye and the visual cortex of the brain form a massively parallel processor that provides the highest-bandwidth channel into human cognitive centers.

—Colin Ware, author of Information Visualization: Perception for Design

Visual representations of data enable us to communicate a large amount of information to our viewers. Too often, information is encoded in text. It is more difficult to immediately grasp the essence of something if it is just described in words. In fact, it is hard for the brain to process text. Pictures or images, on the other hand, can be processed extremely well. They can encode a wealth of information and are therefore well suited to communicate much larger amounts of data to a human. Pictures can use shape, color, size, relative positioning, and so on to encode information, contributing to increased bandwidth between the information and the consumer or viewer.

Many disciplines are facing an ever-growing amount of data that needs to be analyzed, processed, and communicated. We are in the middle of an information explosion. A large percentage of this information is stored or represented in textual form: databases, documents, websites, emails, and so forth. We need new ways to work with all this data. People who have to look at, browse, or understand the data need ways to display relevant information graphically to assist in understanding the data, analyzing it, and remembering parts of it. Browsing huge amounts of data is crucial for finding information and then exploring details of a result set. Interaction with the visualizations is one of the key elements in this process. Visualization offers more than expedited browsing: a visual representation, in contrast to a textual one, often helps us discover relationships well hidden in the wealth of data. Finding these relationships can be crucial.

A simple example of a mainstream visualization application is the Friend Wheel, a Facebook3 application that generates a visualization of all Facebook friends (see Figure 1-1). Each person who is a friend of mine on Facebook is arranged in a circle. Friends of mine who know each other are connected with a line. Instead of me having to explain in written form who my friends are and what the different groups are that they belong to, this visualization summarizes all the relations in a simple and easy-to-understand picture.

  3. Facebook is a social networking platform.


Figure 1-1 The Friend Wheel visualizes friend relationships on Facebook.
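The layout idea behind such a wheel can be sketched in a few lines of Python. This is not the Friend Wheel application's actual code, which is not shown here; it is a minimal illustration under the assumption that friends are spaced evenly around a circle and that each pair who know each other is drawn as a straight line. The names, the friendship pairs, and the function names are invented for the example.

```python
import math


def circle_layout(names, radius=100.0):
    """Place each name at an equal angular step around a circle."""
    step = 2 * math.pi / len(names)
    return {
        name: (radius * math.cos(i * step), radius * math.sin(i * step))
        for i, name in enumerate(names)
    }


def chords(positions, friendships):
    """Turn each friendship pair into a line segment between two points."""
    return [(positions[a], positions[b]) for a, b in friendships]


# Hypothetical friends and the pairs among them who know each other.
friends = ["Ann", "Ben", "Cal", "Dee"]
know_each_other = [("Ann", "Ben"), ("Ann", "Cal")]

positions = circle_layout(friends)
segments = chords(positions, know_each_other)
for (x1, y1), (x2, y2) in segments:
    print(f"line from ({x1:.1f}, {y1:.1f}) to ({x2:.1f}, {y2:.1f})")
```

Groups of friends show up in such a picture as dense bundles of lines between neighboring points, which is why the order of the names around the circle matters in practice. The sketch keeps the input order and does not attempt any grouping.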

There is a need for data visualization in many disciplines. The Friend Wheel is a simple example of how visualization has gone mainstream. The data explosion and resultant need for visualization affects computer security more than many other areas. Security analysts face an ever-increasing amount of data that needs to be analyzed and mastered. One of the areas responsible for the growth in data is the expanded scope of information that needs to be looked at by security people. It is not just network-based device logs anymore, such as the ones from firewalls and intrusion detection systems. Today, the entire stack needs to be analyzed: starting on the network layer, going all the way up to the applications, which are amazingly good at generating unmanageable amounts of data.

Visualization Benefits

If you have ever analyzed a large log file with tens of thousands of entries, you know how hard it is. A visual approach significantly facilitates the task (as compared to using text-based tools). Visualization offers a number of benefits over textual analysis of data. These benefits are based on people’s ability to process images efficiently. People can scan, recognize, and recall images rapidly. In addition, the human brain is an amazing pattern-recognition tool, and it can detect changes in size, color, shape, movement, and texture very efficiently. The following is a summary of visualization benefits:

  • Answers a question: Visualization enables you to create an image for each question you may have about a dataset. Instead of wading through textual data and trying to remember all the relationships between individual entries, you can use an image that conveys the data in a concise form.

  • Poses new questions: One interesting aspect of visual representations is that they cause the viewer to pose new questions. A human has the capability to look at a visual representation of data and see patterns. Often, these patterns are not anticipated at the time the visual is generated. What is this outlier over here? Why do these machines communicate with each other?

  • Explore and discover: By visualizing data, you have a new way of viewing and investigating data. A visual representation provides new insights into a given dataset. Different graphs and configurations highlight different properties in the dataset and help identify previously unknown information. If the properties and relationships were known up front, it would be possible to detect them without visualization. However, they have to be discovered first, and visual tools are best suited to do so. Interactive visualizations enable even richer investigations and help discover hidden properties of a dataset.

  • Support decisions: Visualization helps to analyze a large amount of data very quickly. Decisions can be based on a large amount of data because visualization has helped to distill it into something meaningful. More data also helps back up decisions. Situational awareness is a prime tool to help in decision support.

  • Communicate information: Graphical representations of data are more effective as a means of communication than textual log files. A story can be told more efficiently, and the time to understand a picture is a fraction of the time that it takes to understand the textual data. Images are great for telling a story. Try to put a comic into textual form. It just doesn’t do the trick.

  • Increase efficiency: Instead of wading through thousands of lines of textual log data, it is much more efficient to graph certain properties of the data to see trends and outliers. The time it takes to analyze the log files is drastically cut down. This frees up people’s time and allows them to think about the patterns and relationships found in the data. It also speeds up the detection of and response to new developments. Fewer people are needed to deal with more data.

  • Inspire: Images inspire. While visually analyzing some of the datasets for this book, I got inspired many times to try out a new visualization, a new approach of viewing the same data. Sometimes these inspirations are dead ends. A lot of times, however, they lead to new findings and help better understand the data at hand.
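As a small illustration of the efficiency point, the following Python sketch graphs one property of a log, the number of entries per source address, as a text bar chart. The data is invented, and the choice of property and the bar scaling are assumptions made for the example; any charting tool would serve the same purpose.

```python
from collections import Counter

# Hypothetical data: the source address of each log entry.
sources = ["10.0.0.5"] * 12 + ["10.0.0.7"] * 3 + ["10.0.0.9"] * 2


def bar_chart(counts, width=40):
    """Return one text row per key, with bar length scaled to the largest count."""
    peak = max(counts.values())
    rows = []
    for key, n in counts.most_common():
        bar = "#" * max(1, round(n / peak * width))
        rows.append(f"{key:<12} {bar} {n}")
    return rows


counts = Counter(sources)
chart = bar_chart(counts)
print("\n".join(chart))
```

Even in this crude form, the source that dominates the log stands out at a glance, which is much harder to see when scrolling through the individual entries.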

If data visualization has all of these benefits, we should explore what visualization can do for security.

Security Visualization

The field of security visualization is very young. To date, only a limited amount of work has been done in this area. Given the huge amount of data needed to analyze security problems, visualization seems to be the right approach:

  • The ever-growing amount of data collected in IT environments calls for new methods and tools to deal with it.

  • Event and log analysis is becoming one of the main tools for security analysts to investigate and comprehend the state of their networks, hosts, applications, and business processes. All these tasks deal with an amazing amount of data that needs to be analyzed.

  • Regulatory compliance requires regular log analysis. Analysts need better and more efficient tools for the task.

  • The crime landscape is shifting. Attacks are moving up the network stack. Network-based attacks are not the prime source of security problems anymore. The attacks today are moving into the application layer: Web 2.0, instant messenger attacks, fraud, information theft, and crime-ware are just some examples of new types of attacks that generate a load of data to be collected and analyzed. Beware! Applications are really chatty and generate a lot of data.

  • Today, the attacks that you really need to protect yourself from are targeted. You are not going to be a random victim. The attackers know who they are coming for. You need to be prepared, and you have to proactively analyze your log files. Attackers will not set off your alarms.

Because of the vast amount of log data that needs to be analyzed, classic security tools, such as firewalls and intrusion detection systems, have over time added reporting capabilities and dashboards that make use of charts and graphics. Most of the time, these displays are used to communicate information to the user. They are not interactive tools that support data exploration. In addition, most of these visual displays are fairly basic and, in most cases, an afterthought. Security products are not yet designed with visualization in mind. However, this situation is slowly improving. Companies are starting to realize that visualization is a competitive advantage for them and that user tasks are significantly simplified with visual aids.

The problem with these tools is that they are specialized. They visualize only the information collected or generated by that specific solution. We need to visualize information from multiple tools and for use-cases that are not supported by these tools. Novel methods are needed to conduct log and security data analysis.
