1 Invitation to Data Visualization
1.1 Goals
In this lesson I will provide some examples of interesting and influential data visualizations.
In the task for this lesson, I will ask you to identify two visualizations you find interesting and provide a brief description and analysis of each.
1.2 Atmospheric carbon dioxide concentration
Climate change in the recent past and the coming century will be controlled by human-driven emission of carbon dioxide from fossil fuels into the atmosphere (and possibly by its sequestration). Regular measurements of the amount of carbon dioxide in the atmosphere (in parts per million) began in the 1950s. Subsequently, methods for analyzing gases trapped in ice were used to extend this record back about 800,000 years. There is a direct physical link between the atmospheric concentration of carbon dioxide and the loss of heat from Earth to space, resulting in a mechanistic link between increasing carbon dioxide concentration in the atmosphere and the mean temperature of the surface of the Earth. Visualizations of these data and the associated global mean temperature data have been extremely influential, forming the cornerstone of books, a documentary movie, and countless educational and environmental change movements.
Sample visualizations of atmospheric carbon dioxide are available from the institute that has been collecting these data for decades.
Estimates of global mean temperature over time are available from NASA.
Many other sites provide information on these data, usually in visual form, which is a testament to the importance of visualizations.
1.3 Human health and development
Hans Rosling was a physician who popularized the use of data visualizations to build understanding of human health and economic development over time and across countries. His public presentations illustrate his view of how dynamic charts can help us see the trajectory of global development, particularly the connections between health and economic development. I strongly encourage you to watch his presentations. He was especially well known for his efforts to dispel misunderstandings about differences across countries in health and human development. He popularized a style of scatterplot that combines colour, symbol size, and animation to show changes over time.
1.4 Weather
Many people are strongly interested in their local weather conditions. As a result of this strong interest and the complexity of the data, many visualizations have been developed. Forecasts, such as those produced by Environment Canada, and historical retrospectives, such as those produced by Weatherspark, are examples that leverage familiarity with the data, broad-scale human interest, and data-rich but not overly complicated displays. Two examples are shown below.
1.5 Journalism
In the past decade there has been a resurgence of interest in data visualizations, stimulated in part by journalists emphasizing visualizations in their publications. This example in the New York Times shows projected earnings for college graduates in a range of fields of study and is accompanied by notes and discussion questions. The New York Times has a series of educational materials on both visualizations and their stories.
1.6 Historically important visualizations
Many ideas in contemporary data visualizations can be traced back to the 19th century, as represented by several impactful examples. In 1869, Charles Minard produced a map of Napoleon’s Russian campaign of 1812. Florence Nightingale was a pioneering user of data visualizations to communicate messages about sanitation and public health, famously in a polar histogram showing causes of mortality of soldiers. Also in public health, John Snow mapped a cholera outbreak in London, visually linking deaths to a water source. All of these visualizations were great advances over the bills of mortality produced a few centuries earlier.
1.7 Stories
A common observation is that humans learn from stories. What is the role of data and its visualization in storytelling? A graph does not tell a story by itself, but a story can be woven from a combination of words and some data visualizations.
Wilke’s book makes an excellent argument in favour of storytelling with data, which he also presents in a video (starting at time 6:42). His essential elements of a story form an arc: an opening, a challenge, action, and a resolution, which results in an emotional reaction such as excitement, curiosity, or surprise. The principle is that the emotional response to the resolution of the challenge engages your audience and helps them retain your message.
It may seem that a graph is far removed from a story, but a pair of graphs, a dynamic graph, or even just an original graph and an updated graph can be used to tell one. For example, return to the carbon dioxide figures at the top of this section. Two years of data show a seasonal cycle in atmospheric carbon dioxide with a modest year-over-year trend. Suppose that was all you knew about carbon dioxide. It would be hard to know why there was a problem. Now look at the record since 1958. It’s now clear that there is a long-term increase and that the seasonal variation is small in comparison. If you find the 800,000-year record from ice cores, you will see even more context: current atmospheric carbon dioxide concentrations are outside the range of documented variability for the past 800,000 years.
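The contrast between the short and long records can be made concrete with a small calculation. The following is a minimal sketch in Python, not a method from the course materials: the function names `seasonal_vs_trend` and `synthetic_series` are hypothetical, and the series is synthetic, built from arbitrary made-up numbers (a straight-line trend plus a sine-wave seasonal cycle in arbitrary units). It is not real carbon dioxide data. The sketch compares the typical within-year range of a monthly series with the change in the annual mean between the first and last year.

```python
import math


def synthetic_series(n_years, trend_per_year=2.0, amplitude=3.0, start=100.0):
    """Build a made-up monthly series: linear trend plus a sine seasonal cycle.

    All parameter values are arbitrary illustration choices, not measurements.
    """
    return [
        start
        + trend_per_year * (month / 12)
        + amplitude * math.sin(2 * math.pi * month / 12)
        for month in range(n_years * 12)
    ]


def seasonal_vs_trend(values, period=12):
    """Return (mean within-year range, change in annual mean from first to last year)."""
    # Split the series into complete years of `period` observations each.
    years = [values[i:i + period] for i in range(0, len(values) - period + 1, period)]
    ranges = [max(year) - min(year) for year in years]
    means = [sum(year) / len(year) for year in years]
    mean_range = sum(ranges) / len(ranges)
    long_term_change = means[-1] - means[0]
    return mean_range, long_term_change


# Two years of synthetic data: the seasonal swing dominates the trend.
short_range, short_change = seasonal_vs_trend(synthetic_series(2))

# Sixty years of the same synthetic process: the trend dominates the seasonal swing.
long_range, long_change = seasonal_vs_trend(synthetic_series(60))

print(f"2 years:  seasonal range {short_range:.1f}, change in annual mean {short_change:.1f}")
print(f"60 years: seasonal range {long_range:.1f}, change in annual mean {long_change:.1f}")
```

The underlying process is identical in both cases. Only the length of the window changes, and with it the story the numbers appear to tell, which is the same effect as moving from the two-year graph to the full record.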
We will return to the theme of storytelling frequently in the course, particularly in assignments.
1.8 Further reading
- Kurt Vonnegut’s summary of story arcs
- Wilke, Fundamentals of Data Visualization Chapter 29