Metadata

About the author and course instructor

These course notes were written by Andrew Irwin who is in the Department of Mathematics & Statistics at Dalhousie University in Halifax, Nova Scotia, Canada. Andrew has been at Dalhousie since 2018. Before that he was in the Department of Mathematics & Computer Science at Mount Allison University and before that in the Biology Department at the College of Staten Island at the City University of New York.

Andrew’s research applies mathematical and statistical models to problems in biological oceanography, particularly questions about phytoplankton biogeography, productivity, and their role in biogeochemical cycles. The prospect of large-scale systemic change over the next century plays a key motivating role in the desire to understand this important system more fully. Andrew co-directs the Marine Microbial Macroecology Lab at Dalhousie and more information about his research can be found there.

Thanks to the community

This course builds on the knowledge and generosity of many people.

The software described in this course is all open source: free to use, modify, and extend by all. I started to learn to use these tools as a graduate student with the commercial product S-PLUS. Participants in the open software development model have created an incredible set of resources for students to learn and use.

I have referenced three textbooks extensively in these notes. All are commercially available books, but the the authors and publishers have made these resources available online in a free to use format, which is wonderful for students and makes it easy for me to refer to specific sections, ideas, and skills demonstrated in these books.

Many people are actively engaged in making new resources for learning, supporting the creation of courses, and collecting data for interesting data visualization projects. Many of these are linked throughout these notes and used with much appreciation.

Posit (formerly Rstudio) is a public benefit corporation based in the USA that is committed to developing tools for the statistics and data science community. The contribution of the Rstudio software and many R packages has been a huge boost to the community of people using R.

R version and packages used

These documents were prepared using R version 4.3.1 (2023-06-16) on aarch64-apple-darwin20.

Here is a list of all packages used in any lesson:

broom, cansim, coefplot, dbplyr, dlookr, dplyr, DT, equatiomatic, fields, forcats, gapminder, geojsonio, GGally, gganimate, ggfortify, ggheatmap, ggmap, ggplot2, ggrepel, ggtext, ggthemes, glue, gratia, hablar, here, janitor, kableExtra, knitr, lattice, leaflet, lubridate, mapproj, maps, MASS, mgcv, naniar, nlme, osmdata, paletteer, palmerpenguins, patchwork, permute, plotly, pointblank, purrr, quantreg, ragg, RColorBrewer, readr, readxl, report, rnaturalearth, rnaturalearthdata, RSQLite, scales, sf, skimr, spam, SparseM, statebins, stringr, styler, svglite, tibble, tidyr, tidyverse, unglue, vegan, viridisLite

License

These course notes and supporting materials are all copyright by Andrew Irwin. Material linked or quoted from other sources are, naturally, owned by the original creators.

Original materials are licensed under a creative commons license that encourages non-commerical reuse and adaptation. For other uses, please contact the author.

Data Visualization course notes by Andrew Irwin is licensed under Attribution-NonCommercial 4.0 International