Reproducible reports

Andrew Irwin, a.irwin@dal.ca

2024-02-29

Plan

  • What is reproducibility?

  • Why should you care?

  • Specific role of

    • R markdown
    • git and GitHub
  • Skills to learn today

  • Communication, trust, and error detection

What makes work reproducible?

  • Clear documentation of all steps and tools used

  • Ease of reproducing the work

  • Consider the consequences of small changes to

    • data
    • visualization formatting
  • Consider the consequences of a change in team membership

Why do I care if my work is reproducible?

  • Helps “future you”

  • Lets you make changes, update data, fix errors easily

  • Improves communication

  • Increases value of your work

What is the role of R markdown?

Combines

  • explanation

  • R code

  • results

  • a final report

It is easy to check whether a report is complete: does it knit?

What is the role of git and GitHub?

  • Lets you save versions and track changes in code, data, and reports

  • Makes it easy to share with others

  • Facilitates teamwork

Detailed observations for this course

  • Always check that your document knits

  • Don’t use absolute paths to files on your computer (/Users/airwin/Documents/xxx)

    • Use RStudio projects and the here package instead
  • Carefully format text in R markdown reports

    • Headings with #
    • Bullet points with indented * or -
    • Italics and bold with * and ** around words
  • Control output from R with “chunk options”
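
A minimal sketch of a setup chunk that applies two of the points above: project-relative paths and chunk options. It assumes the here and knitr packages are installed. The folder name "data" and the file name "survey.csv" are hypothetical, chosen only for illustration, and the option values are one possible choice, not a course requirement.

```r
# Possible contents of a setup chunk near the top of an R markdown report.
library(here)

# Default chunk options for the whole report: hide the code and
# suppress messages and warnings, keeping only results and figures.
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Build a path relative to the project root instead of typing an
# absolute path. "data" and "survey.csv" are hypothetical names.
data_file <- here("data", "survey.csv")

# Read the file only if it is present, so this sketch runs anywhere.
if (file.exists(data_file)) {
  survey <- read.csv(data_file)
}
```

Because the path is built from the project root, the same report should knit on a teammate's computer without editing file locations. Individual chunks can still override the defaults in their own chunk header.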

Ten simple rules

  1. Track how results were produced
  2. Avoid manual manipulation of data
  3. Archive exact versions of software
  4. Use version control for all scripts
  5. Record intermediate results in standard formats
  6. Record random number seeds
  7. Store the raw data used to make plots
  8. Generate hierarchical analysis output, revealing layers of detail
  9. Connect explanatory text to underlying results
  10. Provide public access to scripts and results

Adapted from Sandve et al. (2013), doi:10.1371/journal.pcbi.1003285
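
Rules 3 and 6 can be illustrated with base R alone. This is a sketch rather than a complete workflow; the seed value 2024 is arbitrary and chosen only for the example.

```r
# Rule 6: fix the seed so the "random" draws are the same on every run.
set.seed(2024)
first_draw <- rnorm(3)

# Resetting the same seed reproduces exactly the same numbers.
set.seed(2024)
second_draw <- rnorm(3)

identical(first_draw, second_draw)

# Rule 3: record the R version and the loaded packages used for the report.
sessionInfo()
```

Placing sessionInfo() in the last chunk of a report keeps a record of the software versions alongside the results they produced.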

Further reading

The course notes are quite different from these slides. They emphasize practical skills and tips.

The course notes contain specific suggestions for using R markdown which will be explored in the task for this lesson.

Task

  • Practice the R markdown formatting tasks described in the course notes