Using Large Language Models to Learn

Andrew Irwin

2026-01-29

Goals

Learn effective ways to ask the computer for help
Identify which tasks are done well and which tasks are done poorly by LLMs
Ensure the computer helps you extend and expand your knowledge, rather than just replacing your brain
Develop strategies for determining if the solutions provided by the computer are correct

Think about course and learning goals

Learn to perform specific computing tasks
Develop confidence and fluency for the data visualization pipeline: examine data, summarize data, visualize data, interpret and describe your findings
Learn new tools, ideas, and techniques
Reflect on what you have learned and develop the skills to think about a wide range of data analysis and visualization tasks

What is an LLM?

A Large Language Model (LLM)
- is trained on massive amounts of text and code,
- generates text and code using statistical predictions,
- provides responses starting with a prompt or question.
LLMs are probabilistic, not deterministic.
LLMs predict the next word based on patterns; they don’t “know” facts like humans do.

LLM strengths

Getting started: Move from “blank page syndrome” to a working draft quickly.
Brainstorming visualization designs
Explaining statistical or computing concepts: Ask “Why does this work?” or “Explain like I’m a beginner.”
Tailored help with specific errors in your code.
Creating examples: Find specific R syntax for unfamiliar functions quickly.
Writing and debugging code

LLMs to try:

Asking questions

Use a phrase you might type for a Google search, or turn your phrase into a question:

Write code for a simple scatterplot based on the Palmer Penguins data using ggplot
What is tidy data?
Explain what the “aes” function in ggplot does.
What is the difference between staging, committing, and pushing to GitHub?
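
For comparison, a response to the first prompt might look something like the sketch below. It assumes the palmerpenguins and ggplot2 packages are installed; the choice of variables (flipper length, body mass, species) is just one illustration, and an LLM may well pick different ones.

```r
library(ggplot2)

# Scatterplot of body mass against flipper length, coloured by species.
# `palmerpenguins::penguins` is the data frame from the palmerpenguins package.
p <- palmerpenguins::penguins |>
  ggplot(aes(x = flipper_length_mm, y = body_mass_g, colour = species)) +
  geom_point(na.rm = TRUE) +  # drop rows with missing measurements quietly
  labs(x = "Flipper length (mm)", y = "Body mass (g)", colour = "Species")

p  # printing the object draws the plot
```

Try changing one aesthetic at a time and asking the LLM to explain what changed.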

Make notes

Write a journal to learn:

Write queries and code in a .qmd file
Summarize what you learned from each useful prompt and response
Generate new ideas for questions to answer later
Simplify LLM output to create the simplest example you can that captures new knowledge or skills

Explain code

Explain what %>% does in the code co2_1960 <- co2_raw %>% filter(year >= 1960)
What is the difference between =, ==, and <- ?
Explain and improve diamonds |> ggplot(aes(x = price, y = carat)) + geom_bin2d() + scale_x_log10() (Example from Lesson 8.)
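
You can check an explanation by running a small experiment yourself. The sketch below touches the first two prompts; `co2_raw` here is a tiny table with made-up values, standing in for whatever data you are really using.

```r
library(dplyr)

# `<-` assigns a value to a name.
# Inside tibble(), `=` gives names to the arguments (here, the columns).
co2_raw <- tibble(
  year = c(1958, 1959, 1960, 1961),
  co2  = c(315.2, 315.9, 316.9, 317.6)  # made-up values for illustration
)

# `==` tests for equality, one element at a time.
is_1960 <- co2_raw$year == 1960
is_1960
# FALSE FALSE  TRUE FALSE

# `%>%` passes the left-hand side as the first argument of the function on the right,
# so these two lines do the same thing.
co2_1960 <- co2_raw %>% filter(year >= 1960)
co2_1960_nopipe <- filter(co2_raw, year >= 1960)

identical(co2_1960, co2_1960_nopipe)
```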

Explain more complex code

From last class: Explain the following code line by line

diamonds |>
  mutate(price_per_carat = price / carat) |>
  group_by(color, clarity, cut) |>
  summarise(median_price_per_carat = median(price_per_carat),
            n = n(),
            .groups = "drop") |>
  arrange(-median_price_per_carat) |>
  group_by(cut) |>
  slice_head(n=2) |>
  arrange(-median_price_per_carat)
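
One way to test a line-by-line explanation is to split the pipeline into stages and look at each intermediate result. The sketch below does this for the code above; the names `by_group` and `top_two` are introduced here only for illustration.

```r
library(dplyr)
library(ggplot2)  # provides the `diamonds` data

# Stage 1: one row for each colour, clarity, and cut combination.
by_group <- diamonds |>
  mutate(price_per_carat = price / carat) |>
  group_by(color, clarity, cut) |>
  summarise(median_price_per_carat = median(price_per_carat),
            n = n(),
            .groups = "drop")

# Stage 2: sort from most to least expensive, then keep the first two rows for each cut.
top_two <- by_group |>
  arrange(-median_price_per_carat) |>
  group_by(cut) |>
  slice_head(n = 2)

# Check the claim "two rows per cut" directly.
rows_per_cut <- top_two |> ungroup() |> count(cut)
rows_per_cut
```

Print `by_group` and `top_two` and compare what you see with what the LLM said each step does.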

Generate code for you to think about

Ask the LLM to challenge you or give you some examples: Give me some interesting calculations to make summary tables with group_by, mutate and summarize.
Help me improve the following code to make a scatter plot: mpg |> ggplot() + geom_point(aes(displ, cty))

Tutor me

I’d like to reproduce Hans Rosling’s life expectancy vs GDP figure, without animation, using the tidyverse. I want to learn how to make the figure, so don’t give me the code or an answer. Instead, ask me questions to help me figure out what I need to know to solve this challenge.

I can get as far as: gapminder |> ggplot() + geom_point(aes(gdpPercap, lifeExp)). Can you help me with the size of the dots?
Keep going until you are happy with the result …
Write in your journal

Start to think about data analysis projects

Help me find and summarize temperature data from Halifax, NS using R and the tidyverse.
Write tidyverse code to compute the distribution of the number of days in a row with a daily high temperature below freezing in Halifax.
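
Before asking for code, it helps to know roughly what a solution could look like on a tiny example. The sketch below uses ten made-up daily highs, not real Halifax data, and relies on base R's `rle()` (run-length encoding) to find runs of consecutive days; an LLM might suggest a different approach.

```r
library(dplyr)

# Made-up daily high temperatures in degrees C; real data would replace this.
daily <- tibble(
  date = as.Date("2025-01-01") + 0:9,
  high = c(2, -1, -3, -2, 1, 0.5, -4, 3, -1, -2)
)

# rle() collapses consecutive equal values into runs and records each run's length.
runs <- rle(daily$high < 0)

# Keep only the runs of below-freezing days.
freezing_runs <- tibble(run_length = runs$lengths[runs$values])

# Distribution: how many runs are there of each length?
run_distribution <- freezing_runs |> count(run_length, name = "n_runs")
run_distribution
```

With data this small you can count the runs by eye (lengths 3, 1, and 2) and confirm the table agrees.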

Prompting Strategies

Provide Context: “I am an R student using the tidyverse…”
Be Specific: Instead of “Fix my code,” use “I have an error on line 4; explain why.”
Show Your Data: Use str() or head() so the AI understands your variables.
Chain of Thought: Ask the AI to “Explain your reasoning step-by-step.”
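
For "Show Your Data", a short summary is usually enough to paste into a prompt. The sketch below uses the `mpg` data from ggplot2 as a stand-in for your own data frame.

```r
library(ggplot2)  # provides the `mpg` data

# Variable names, types, and the first few values of each column.
str(mpg)

# The first three rows, as a small table you can paste into a prompt.
first_rows <- head(mpg, 3)
first_rows
```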

The “Learning” vs. “Doing” Trap

Danger

If you copy-paste without understanding, you haven’t learned.

The Solution: Read-Understand-Revise

Read the AI’s code thoroughly.
Understand the logic (ask the AI to explain specific functions).
Revise and adapt the code to your own purposes.

Limitations

Hallucinations: LLMs will sometimes use R packages or functions that do not exist.
Outdated Info: Suggestions may use outdated libraries or old syntax.
No Critical Thinking: LLMs don’t assess if a chart is misleading or suitable for your purpose.
Errors: Code will sometimes be wrong or non-functional.
Sycophancy: The LLM will compliment you, agree with you, and admit to errors, but this “decorative” text doesn’t mean anything.
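
A quick check for hallucinated packages is to ask R whether it can find them. In the sketch below, `notARealPackage` and `made_up_function` are names invented for illustration. Note that FALSE only means the package is not installed on your computer, so the next step is to look for it on CRAN.

```r
# Package names as they might appear in an LLM answer (the second is invented).
suggested <- c("stats", "notARealPackage")

# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
installed <- vapply(suggested, requireNamespace, logical(1), quietly = TRUE)
installed

# The same idea for a function: is it exported by the package?
has_median  <- "median" %in% getNamespaceExports("stats")
has_made_up <- "made_up_function" %in% getNamespaceExports("stats")
c(median = has_median, made_up_function = has_made_up)
```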

Testing LLM code

Perform manual calculation checks (count data, compute means) for a few examples to see if they match the output of the more complex code.
Make simplified versions of a visualization that you draw yourself
Compare notes with a friend: did they use the same approach? Did they get the same answer?
Think of alternative ways to get similar results and try both
Read the LLM code line by line and ensure you really understand each line. Caution: it’s easy to deceive yourself.
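
As an example of a manual calculation check, the sketch below compares a grouped summary (the kind of code an LLM might write) with the same numbers computed by hand for one group. The `mpg` data and the "compact" class are chosen only for illustration.

```r
library(dplyr)
library(ggplot2)  # provides the `mpg` data

# Summary table, as an LLM might write it.
summary_tbl <- mpg |>
  group_by(class) |>
  summarise(n = n(), mean_cty = mean(cty), .groups = "drop")

# Manual check for one group, using only basic subsetting and arithmetic.
compact_cty <- mpg$cty[mpg$class == "compact"]
manual_n    <- length(compact_cty)
manual_mean <- sum(compact_cty) / manual_n

# Compare the hand calculation with the matching row of the summary table.
check <- summary_tbl |> filter(class == "compact")
n_matches    <- check$n == manual_n
mean_matches <- isTRUE(all.equal(check$mean_cty, manual_mean))
c(n_matches = n_matches, mean_matches = mean_matches)
```

If either comparison is FALSE, one of the two calculations does not do what you think it does.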

Ethical Use & Integrity

Transparency: Disclose when an LLM was used for analysis or code generation in an acknowledgements or citations section of your work.
Originality: The story and interpretation of the data must be yours.
The Policy: LLMs are tools for assistance, not authorship.

You will need to demonstrate your knowledge of course material on a written test without computer assistance, so be sure to check your understanding.

Summary

LLMs can be very useful for: getting started, learning new methods, finding and fixing errors
LLM code can be more complex than what you are used to, creating barriers to understanding (try asking “Simplify this code:”)
LLM output can be completely or subtly wrong
Reflect on the skills you are developing in the course and ask yourself what you can do with that knowledge
Learning to use LLMs to develop your own knowledge and skills is challenging but potentially very rewarding
