36 Accessibility, Bias, and Ethics
Data visualization is about representing data, which includes selecting, simplifying, and organizing it. It is an activity carried out by humans, on questions generated by humans and presented to humans, even when the underlying topic is the natural world. As a result, it is always important to think carefully about the human element. What steps can be taken to make our work accessible to as many people as possible? How may bias or discrimination enter into data collection, selection, analysis, and interpretation? What ethical issues arise in our work with data and in the process of research?
This course only touches on these topics long enough to alert you to their importance.
36.1 Accessibility
Data visualization is, as the name implies, the act of producing a product to be seen. This is a useful goal because our brains are very good at processing some kinds of visual information. Training in reading visualizations can greatly increase a person’s ability to extract information from them, so it is important to know your audience: students, the general public, people with well-developed quantitative skills, or domain experts for the data you are presenting. All of these factors are central to knowing whether a visualization is suitable and effective. Our target audience is university students.
Not everyone has the same visual abilities. Some people have vision that differs from the most common experience in some way: differences in colour perception, reduced acuity, and other differences, all the way to complete blindness. We should always keep these differences in mind when producing visualizations. To take the hardest challenge head on, what is the value in producing a visualization for someone who cannot see it? Data visualization is a process that uses the creator’s visual and technical skills to present features of a dataset, and those features can also be described in words. Any data visualization should be accompanied by a written description of the message to be conveyed. Ideally the visual and written aspects will complement each other. A visualization does not “say” anything by itself; a written interpretation is an essential part of the process.
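One small part of this, colour contrast, can be checked numerically. The following is a minimal Python sketch, not part of the original course materials, that computes the contrast ratio between two colours using the relative luminance formula from the W3C Web Content Accessibility Guidelines (WCAG 2.x). The choice of WCAG as a yardstick, the function names `relative_luminance` and `contrast_ratio`, and the example colours are all assumptions made for illustration.

```python
def relative_luminance(hex_colour: str) -> float:
    """Relative luminance of an sRGB colour such as '#0072B2' (WCAG 2.x formula)."""
    digits = hex_colour.lstrip("#")
    # Split 'RRGGBB' into three channels scaled to the range 0..1.
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding to get linear light values.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(colour_a: str, colour_b: str) -> float:
    """Contrast ratio between two colours, from 1 (identical) to 21 (black on white)."""
    lum_a = relative_luminance(colour_a)
    lum_b = relative_luminance(colour_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    background = "#FFFFFF"  # assumed plot background (white)
    # Hypothetical line colours chosen only to illustrate the check.
    for name, colour in {"black": "#000000", "yellow": "#FFFF00"}.items():
        ratio = contrast_ratio(colour, background)
        print(f"{name} on white: {ratio:.2f}:1")
```

A ratio close to 1:1 means the two colours are hard to tell apart by lightness alone, which matters for readers with reduced acuity or altered colour perception; WCAG suggests at least 3:1 for graphical elements. A check like this complements the written description and does not replace it.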
36.2 Data collection and analysis
Data collection and analysis are critical tools for understanding and interacting with the world. Data are used by academic researchers, governments, corporations, non-profit organizations, and citizens in complex and contrasting ways. All of these uses create opportunities for bias and discrimination. The links below give a few examples and stories elaborating on these challenges.
- Data encodes systematic racism from the MIT Technology Review, December 2020.
- Case studies in data ethics from O’Reilly publishers.
- Data and the COVID pandemic, opinion published in Patterns, July 2020.
- A business and marketing take on the ethics of data science.
- A student’s perspective on ethics in data science from 2018.
- A professional statement on ethical data science from the Royal Statistical Society and the Institute and Faculty of Actuaries.
The following resources are in the form of checklists or questions to think about when collecting, analyzing, and presenting data.
If you find discussions of these topics that are particularly thought-provoking or informative, please share them with me.
36.3 Further reading
- Data Science in a Box notes on ethics