Proximity: Objects grouped together in close proximity are perceived as a unit. Based on the location, clusters and outliers can be identified.
Closure: Humans tend to perceive objects that are almost a closed form (such as an interrupted circle) as the full form. If you were to cover this line of text halfway, you would still be able to guess the words. This principle can be used to eliminate bounding boxes around graphs. A lot of charts do not need the bounding box; the human visual system “simulates” it implicitly.
Similarity: Be it color, shape, orientation, or size, we tend to group similar-looking elements together. We can use this principle to encode the same data dimensions across multiple displays. If you are using the color red to encode malicious IP addresses in all of your graphs, there is a connection that the visual system makes automatically.
Continuity: Elements that are aligned are perceived as a unit. Nobody would interpret every little line in a dashed line as its own data element. The individual lines make up a dashed line. We should remember this phenomenon when we draw tables of data. The grid lines are not necessary; just arranging the items is enough.
Enclosure: Enclosing data points with a bounding box, or putting them inside some shape, groups those elements together. We can use this principle to highlight data elements in our graphs.
Connection: Connecting elements groups them together. This principle is the basis for link graphs, which are a great way to display relationships in data.
Figure 1-7 Illustration of the six Gestalt principles. Each of the six images illustrates one of the Gestalt principles. They show how each of the principles can be used to highlight data, tie data together, and separate it.
A piece of advice for generating graphical displays is to emphasize exceptions. For example, use the color red to highlight important or exceptional areas in your graphs. By following this advice, you will refrain from overusing visual attributes that overload graphs. Stick to the basics, and make sure your graphs communicate what you want them to communicate.
Figure 1-8 This bar chart illustrates the principle of highlighting exceptions. The risk in the sales department is the highest, and this is the only bar that is colored.
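As a rough sketch of this idea, the following Python snippet picks one color per bar so that only the exceptional bar stands out. The function name `highlight_colors`, the color names, and the department risk scores are all assumptions made up for illustration; they are not taken from the figure's underlying data.

```python
def highlight_colors(values, highlight="red", neutral="lightgray"):
    """Return one color per value: `highlight` for the maximum, `neutral` otherwise.

    If several values tie for the maximum, all of them are highlighted.
    """
    if not values:
        return []
    peak = max(values)
    return [highlight if v == peak else neutral for v in values]


# Hypothetical risk scores per department (illustrative values only).
risk = {"Engineering": 3, "Finance": 4, "Sales": 9, "HR": 2}

colors = highlight_colors(list(risk.values()))
for department, color in zip(risk, colors):
    print(f"{department:12s} {color}")
```

A list like `colors` could then be handed to a plotting library, for example as the `color` argument of matplotlib's `pyplot.bar`, so that every bar except the exception stays neutral.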
A powerful method of showing and highlighting important data in a graph is to compare graphs. Instead of just showing the graph with the data to be analyzed, also show a graph that shows “normal” behavior or shows the same data, but from a different time (see Figure 1-9). The viewer can then compare the two graphs to immediately identify anomalies, exceptions, or simply differences.
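The comparison against a “normal” graph can also be made numerically before anything is drawn. The sketch below is one possible way to do that, not a prescribed method: it flags categories whose current value deviates from a baseline by more than a relative tolerance. The function `deviations`, the 50 percent tolerance, and the per-service counts are hypothetical.

```python
def deviations(baseline, current, tolerance=0.5):
    """Return categories whose current value differs from the baseline
    by more than `tolerance` (as a fraction of the baseline value).

    Categories with no baseline value but a nonzero current value are
    reported with a change of None, because no relative change is defined.
    """
    flagged = {}
    for key in sorted(set(baseline) | set(current)):
        base = baseline.get(key, 0)
        cur = current.get(key, 0)
        if base == 0:
            if cur != 0:
                flagged[key] = None  # category did not exist in the baseline
            continue
        change = (cur - base) / base
        if abs(change) > tolerance:
            flagged[key] = change
    return flagged


# Hypothetical connection counts per service (illustrative values only).
baseline = {"ssh": 100, "http": 400, "dns": 250}
current = {"ssh": 310, "http": 420, "dns": 240, "telnet": 15}

flagged = deviations(baseline, current)
print(flagged)
```

The flagged categories are the ones a viewer would be expected to spot when the baseline and current graphs are placed side by side.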
Graphs without legends or graphs without axis labels or units are not very useful. The only time when this is acceptable is when you want the viewer to qualitatively understand the data and the exact units of measure or the exact data is not important. Even in those cases, however, a little bit of text is needed to convey what data is visualized and what the viewer is looking at. In some cases, the annotations can come in the form of a figure caption or a text bubble in the graph (see Figure 1-10). Annotate as much as needed, but not more. You do not want the graphs to be overloaded with annotations that distract from the real data.
Figure 1-9 Two bar charts. The left chart shows normal behavior. The right side shows a graph of current data. Comparing the two graphs shows immediately that the current data does not look normal.
Figure 1-10 The left side bar chart does not contain any annotations. It is impossible for a user to know what the data represents. The right side uses axis labels, as well as text to annotate the outlier in the chart.
Whenever possible, make sure that your graphs show more than the fact that something is wrong or that there seems to be an “exception.” Give viewers a way to identify the root cause through the graph. This is not always possible in a single graph. In those cases, it might make sense to show a second graph that can be used to identify the root cause. This principle helps you use graphs to make decisions and act on findings (see Figure 1-11). Many visualizations are good at identifying interesting areas and outliers, but they do not help the viewer take action. Have you ever looked at a graph and asked yourself, “So what?” That is generally the reaction to graphs that do not show root causes.
Figure 1-11 This chart illustrates how causality can be shown in a chart. The number of servers failing per month is related to the temperature in the datacenter.
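When deciding which second data series is worth plotting next to the first, a quick correlation check can help. The following sketch computes the Pearson correlation coefficient between two series. The monthly temperature and failure values are invented for illustration and are not the data behind Figure 1-11. A high coefficient only suggests a relationship worth showing; it does not by itself establish a root cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Hypothetical monthly values (illustrative only).
temperature_c = [20, 21, 23, 27, 30, 24]   # average datacenter temperature
failed_servers = [1, 1, 2, 5, 7, 3]        # servers failing in the same month

r = pearson(temperature_c, failed_servers)
print(f"correlation: {r:.2f}")
```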
By applying all the previously discussed principles, you will generate not just visually pleasing graphs and data visualizations, but also ones that are simple to read and ones that communicate information effectively.
Information Seeking Mantra
In a paper from 1996,[9] Ben Shneiderman introduced the information seeking mantra that defines the best way to gain insight from data. Imagine you have a large amount of data that needs to be displayed. For others to understand the data, they need to understand the overall nature of the data: they need an overview. Based on the overview, the viewer then wants to explore areas of the data (i.e., the graph) that look interesting. The viewer might want to exclude certain data by applying filters. And finally, after some exploration, the viewer arrives at a part of the data that looks interesting. To completely understand this data, viewers need a way to see the original, underlying data. In other words, they need the details that make up the graph. With the original data and the insights into the data gained through the graphical representation, a viewer can then make an informed and contextual statement about the data analyzed.
[9] “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualization,” by Ben Shneiderman, IEEE Symposium on Visual Languages, 1996.
The information seeking mantra summarizes this process as follows:
Overview first, zoom and filter, then details on-demand.
We revisit the information seeking mantra in a later chapter, where I extend it to support some of the special needs we have in security visualization.
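The three steps of the mantra can be sketched on a handful of log records. This is only a minimal illustration under assumed names: the record fields (`src`, `dst_port`, `action`) and the functions `overview`, `zoom_and_filter`, and `details` are hypothetical and do not come from any particular tool.

```python
from collections import Counter

# Hypothetical firewall log records (illustrative values only).
records = [
    {"src": "10.0.0.5", "dst_port": 22, "action": "block"},
    {"src": "10.0.0.5", "dst_port": 22, "action": "block"},
    {"src": "10.0.0.7", "dst_port": 80, "action": "pass"},
    {"src": "10.0.0.9", "dst_port": 443, "action": "pass"},
    {"src": "10.0.0.5", "dst_port": 23, "action": "block"},
]


def overview(recs, field):
    """Overview: count how often each value of `field` occurs."""
    return Counter(r[field] for r in recs)


def zoom_and_filter(recs, **criteria):
    """Zoom and filter: keep only records matching every field=value pair."""
    return [r for r in recs if all(r.get(k) == v for k, v in criteria.items())]


def details(recs):
    """Details on demand: render the underlying records as text lines."""
    return [f'{r["src"]} -> port {r["dst_port"]} ({r["action"]})' for r in recs]


print(overview(records, "action"))                 # 1. overview first
blocked = zoom_and_filter(records, action="block") # 2. zoom and filter
print(overview(blocked, "dst_port"))
for line in details(blocked):                      # 3. details on demand
    print(line)
```

In an interactive tool, each step would typically drive a graph rather than printed output, but the order of operations stays the same.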
Applying visualization to the field of computer security requires knowledge of two different disciplines: security and visualization. Although most people who are trying to visualize security data understand the data itself and what it means, they do not necessarily understand visualization. This chapter is meant to help those people in particular acquire some knowledge of the field. It provided a short introduction to visualization principles and theories, touching on many of them, and should motivate you to learn more about the field. The principles covered here will nevertheless be enough to guide us through the rest of this book. They form a distilled set that is crucial for generating effective security visualizations.
This chapter first discussed generic visualization and then explained why visualization is an important aspect of data analysis, exploration, and reporting. The bulk of this chapter addressed graph design principles. The principles discussed are tailored toward an audience that has to apply visualization to practical computer security use-cases. This chapter ended with a discussion of the information seeking mantra, a principle that every visualization tool should follow.