17 GAM and LOESS smoothing
In this lesson I will show you how to create GAM and LOESS models and perform some basic tasks to interact with the R model objects that the functions create. In keeping with the goals of the course, we will primarily focus on using the models for visualization and not attempt a detailed statistical analysis of when and why you might use a particular model for inference. This means we will restrict our attention to some basic uses of these models using one predictor (x) and one response (y) variable.
17.1 Generalized Additive Models
Generalized additive models are a kind of linear regression, but instead of finding coefficients of predictor variables (e.g., intercepts, slopes), the model finds a “smooth” response function for each predictor. Typically this means that a piecewise cubic function (spline) is used to approximate the relationship between two variables. We can compute predicted values, confindence and prediction intervals, and show the smooth response function that arose from the model. We don’t provide a simple list of coefficients like we did with linear regression, because the spline curve is defined by many numbers which are not usually informative on their own.
Generalized additive models are most commonly used when there is no theoretical motivation for a functional relationship between the variables being studied. We’ll look at the Mauna Loa atmospheric CO2 concentration. These data increase year over year and have well-established interannual oscillations, but there is no clear function for either of these patterns.
We will start by using the last decade of data from Lesson 1.
Here is a plot of atmospheric CO2 concentration over time.
my_theme = theme_linedraw() + theme(text = element_text(size = 14))
p1 <- co2 |> ggplot(aes(decimal_date, co2_avg)) + geom_point() + my_theme +
labs(x = "Year (decimal)", y = "Atmospheric CO2 (ppm)")
p1
The next plot shows each observation, minus the annual mean, as a function of the time of year but not the actual year. Each year’s observations are overlapped.
p2 <- co2 |> ggplot(aes(year_fraction, co2_anomaly)) + geom_point() + my_theme +
labs(x = "Year fraction (decimal)", y = "Atmospheric CO2 anomaly (ppm)")
p2
Here are GAM fits to both of these pairs of data using the gam
function from the mgcv
package. There is a plot
function in the mgcv
package, but I’m using a ggplot function called draw
in the gratia
package.
g1 <- gam(co2_anomaly ~ s(year_fraction, bs="cs"), data = co2)
draw(g1, residuals=TRUE, rug=FALSE)
g2 <- gam(co2_avg ~ s(decimal_date, bs="cs"), data = co2)
draw(g2, residuals=TRUE, rug=FALSE)
Depending on your goal, you may find that the second plot is too smooth – the interannual oscillations are smoothed out completely. We can increase the number of “knots” (by setting k=25
) in the spline to capture the oscillation, but this is not the way the smoothing is normally used:
g3 <- gam(co2_avg ~ s(decimal_date, bs="cs", k = 25), data = co2)
draw(g3, residuals=TRUE, rug=FALSE)
There is a summary
function for GAM fits and the broom
functions glance
and tidy
give access to these data, but the interpretation of this output is beyond the scope of the course:
summary(g1)
Family: gaussian
Link function: identity
Formula:
co2_anomaly ~ s(year_fraction, bs = "cs")
Parametric coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.353e-15 2.673e-02 0 1
Approximate significance of smooth terms:
edf Ref.df F p-value
s(year_fraction) 8.296 9 530.7 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R-sq.(adj) = 0.976 Deviance explained = 97.8%
GCV = 0.092244 Scale est. = 0.085038 n = 119
glance(g1) |> kable()
df | logLik | AIC | BIC | deviance | df.residual | nobs |
---|---|---|---|---|---|---|
9.296045 | -17.36692 | 55.32592 | 83.9399 | 9.328996 | 109.704 | 119 |
tidy(g1) |> kable()
term | edf | ref.df | statistic | p.value |
---|---|---|---|---|
s(year_fraction) | 8.296045 | 9 | 530.7004 | 0 |
It is often useful to compute residuals and extract predicted values.
co2 |> add_residuals(g1) |> add_fitted(g1) |> kable() |> scroll_box(height = "200px")
year | month | decimal_date | co2_avg | year_fraction | co2_anomaly | .residual | .value |
---|---|---|---|---|---|---|---|
2011 | 1 | 2011.042 | 391.33 | 0.0417 | -0.3216667 | 0.4287558 | -0.7504224 |
2011 | 2 | 2011.125 | 391.86 | 0.1250 | 0.2083333 | 0.3028663 | -0.0945330 |
2011 | 3 | 2011.208 | 392.60 | 0.2083 | 0.9483333 | 0.1440665 | 0.8042668 |
2011 | 4 | 2011.292 | 393.25 | 0.2917 | 1.5983333 | -0.6010913 | 2.1994246 |
2011 | 5 | 2011.375 | 394.19 | 0.3750 | 2.5383333 | -0.5071835 | 3.0455169 |
2011 | 6 | 2011.458 | 393.74 | 0.4583 | 2.0883333 | -0.1757645 | 2.2640978 |
2011 | 7 | 2011.542 | 392.51 | 0.5417 | 0.8583333 | 0.3875896 | 0.4707438 |
2011 | 8 | 2011.625 | 390.13 | 0.6250 | -1.5216667 | 0.0541232 | -1.5757899 |
2011 | 9 | 2011.708 | 389.08 | 0.7083 | -2.5716667 | 0.3920423 | -2.9637089 |
2011 | 10 | 2011.792 | 388.99 | 0.7917 | -2.6616667 | 0.0151210 | -2.6767877 |
2011 | 11 | 2011.875 | 390.28 | 0.8750 | -1.3716667 | -0.2731113 | -1.0985553 |
2011 | 12 | 2011.958 | 391.86 | 0.9583 | 0.2083333 | -0.2091638 | 0.4174971 |
2012 | 1 | 2012.042 | 393.12 | 0.0417 | -0.7350000 | 0.0154224 | -0.7504224 |
2012 | 2 | 2012.125 | 393.86 | 0.1250 | 0.0050000 | 0.0995330 | -0.0945330 |
2012 | 3 | 2012.208 | 394.40 | 0.2083 | 0.5450000 | -0.2592668 | 0.8042668 |
2012 | 4 | 2012.292 | 396.18 | 0.2917 | 2.3250000 | 0.1255754 | 2.1994246 |
2012 | 5 | 2012.375 | 396.74 | 0.3750 | 2.8850000 | -0.1605169 | 3.0455169 |
2012 | 6 | 2012.458 | 395.71 | 0.4583 | 1.8550000 | -0.4090978 | 2.2640978 |
2012 | 7 | 2012.542 | 394.36 | 0.5417 | 0.5050000 | 0.0342562 | 0.4707438 |
2012 | 8 | 2012.625 | 392.39 | 0.6250 | -1.4650000 | 0.1107899 | -1.5757899 |
2012 | 9 | 2012.708 | 391.13 | 0.7083 | -2.7250000 | 0.2387089 | -2.9637089 |
2012 | 10 | 2012.792 | 391.05 | 0.7917 | -2.8050000 | -0.1282123 | -2.6767877 |
2012 | 11 | 2012.875 | 392.98 | 0.8750 | -0.8750000 | 0.2235553 | -1.0985553 |
2012 | 12 | 2012.958 | 394.34 | 0.9583 | 0.4850000 | 0.0675029 | 0.4174971 |
2013 | 1 | 2013.042 | 395.55 | 0.0417 | -0.9700000 | -0.2195776 | -0.7504224 |
2013 | 2 | 2013.125 | 396.80 | 0.1250 | 0.2800000 | 0.3745330 | -0.0945330 |
2013 | 3 | 2013.208 | 397.43 | 0.2083 | 0.9100000 | 0.1057332 | 0.8042668 |
2013 | 4 | 2013.292 | 398.41 | 0.2917 | 1.8900000 | -0.3094246 | 2.1994246 |
2013 | 5 | 2013.375 | 399.78 | 0.3750 | 3.2600000 | 0.2144831 | 3.0455169 |
2013 | 6 | 2013.458 | 398.60 | 0.4583 | 2.0800000 | -0.1840978 | 2.2640978 |
2013 | 7 | 2013.542 | 397.32 | 0.5417 | 0.8000000 | 0.3292562 | 0.4707438 |
2013 | 8 | 2013.625 | 395.20 | 0.6250 | -1.3200000 | 0.2557899 | -1.5757899 |
2013 | 9 | 2013.708 | 393.45 | 0.7083 | -3.0700000 | -0.1062911 | -2.9637089 |
2013 | 10 | 2013.792 | 393.70 | 0.7917 | -2.8200000 | -0.1432123 | -2.6767877 |
2013 | 11 | 2013.875 | 395.16 | 0.8750 | -1.3600000 | -0.2614447 | -1.0985553 |
2013 | 12 | 2013.958 | 396.84 | 0.9583 | 0.3200000 | -0.0974971 | 0.4174971 |
2014 | 1 | 2014.042 | 397.85 | 0.0417 | -0.7925000 | -0.0420776 | -0.7504224 |
2014 | 2 | 2014.125 | 398.01 | 0.1250 | -0.6325000 | -0.5379670 | -0.0945330 |
2014 | 3 | 2014.208 | 399.71 | 0.2083 | 1.0675000 | 0.2632332 | 0.8042668 |
2014 | 4 | 2014.292 | 401.33 | 0.2917 | 2.6875000 | 0.4880754 | 2.1994246 |
2014 | 5 | 2014.375 | 401.78 | 0.3750 | 3.1375000 | 0.0919831 | 3.0455169 |
2014 | 6 | 2014.458 | 401.25 | 0.4583 | 2.6075000 | 0.3434022 | 2.2640978 |
2014 | 7 | 2014.542 | 399.11 | 0.5417 | 0.4675000 | -0.0032438 | 0.4707438 |
2014 | 8 | 2014.625 | 397.03 | 0.6250 | -1.6125000 | -0.0367101 | -1.5757899 |
2014 | 9 | 2014.708 | 395.38 | 0.7083 | -3.2625000 | -0.2987911 | -2.9637089 |
2014 | 10 | 2014.792 | 396.07 | 0.7917 | -2.5725000 | 0.1042877 | -2.6767877 |
2014 | 11 | 2014.875 | 397.28 | 0.8750 | -1.3625000 | -0.2639447 | -1.0985553 |
2014 | 12 | 2014.958 | 398.91 | 0.9583 | 0.2675000 | -0.1499971 | 0.4174971 |
2015 | 1 | 2015.042 | 399.98 | 0.0417 | -0.8458333 | -0.0954109 | -0.7504224 |
2015 | 2 | 2015.125 | 400.35 | 0.1250 | -0.4758333 | -0.3813003 | -0.0945330 |
2015 | 3 | 2015.208 | 401.52 | 0.2083 | 0.6941667 | -0.1101001 | 0.8042668 |
2015 | 4 | 2015.292 | 403.15 | 0.2917 | 2.3241667 | 0.1247421 | 2.1994246 |
2015 | 5 | 2015.375 | 403.96 | 0.3750 | 3.1341667 | 0.0886498 | 3.0455169 |
2015 | 6 | 2015.458 | 402.80 | 0.4583 | 1.9741667 | -0.2899311 | 2.2640978 |
2015 | 7 | 2015.542 | 401.29 | 0.5417 | 0.4641667 | -0.0065771 | 0.4707438 |
2015 | 8 | 2015.625 | 398.93 | 0.6250 | -1.8958333 | -0.3200435 | -1.5757899 |
2015 | 9 | 2015.708 | 397.63 | 0.7083 | -3.1958333 | -0.2321244 | -2.9637089 |
2015 | 10 | 2015.792 | 398.29 | 0.7917 | -2.5358333 | 0.1409543 | -2.6767877 |
2015 | 11 | 2015.875 | 400.16 | 0.8750 | -0.6658333 | 0.4327220 | -1.0985553 |
2015 | 12 | 2015.958 | 401.85 | 0.9583 | 1.0241667 | 0.6066695 | 0.4174971 |
2016 | 1 | 2016.042 | 402.50 | 0.0417 | -1.7225000 | -0.9720776 | -0.7504224 |
2016 | 2 | 2016.125 | 404.07 | 0.1250 | -0.1525000 | -0.0579670 | -0.0945330 |
2016 | 3 | 2016.208 | 404.87 | 0.2083 | 0.6475000 | -0.1567668 | 0.8042668 |
2016 | 4 | 2016.292 | 407.42 | 0.2917 | 3.1975000 | 0.9980754 | 2.1994246 |
2016 | 5 | 2016.375 | 407.72 | 0.3750 | 3.4975000 | 0.4519831 | 3.0455169 |
2016 | 6 | 2016.458 | 406.81 | 0.4583 | 2.5875000 | 0.3234022 | 2.2640978 |
2016 | 7 | 2016.542 | 404.40 | 0.5417 | 0.1775000 | -0.2932438 | 0.4707438 |
2016 | 8 | 2016.625 | 402.26 | 0.6250 | -1.9625000 | -0.3867101 | -1.5757899 |
2016 | 9 | 2016.708 | 401.05 | 0.7083 | -3.1725000 | -0.2087911 | -2.9637089 |
2016 | 10 | 2016.792 | 401.60 | 0.7917 | -2.6225000 | 0.0542877 | -2.6767877 |
2016 | 11 | 2016.875 | 403.53 | 0.8750 | -0.6925000 | 0.4060553 | -1.0985553 |
2016 | 12 | 2016.958 | 404.44 | 0.9583 | 0.2175000 | -0.1999971 | 0.4174971 |
2017 | 1 | 2017.042 | 406.17 | 0.0417 | -0.3850000 | 0.3654224 | -0.7504224 |
2017 | 2 | 2017.125 | 406.47 | 0.1250 | -0.0850000 | 0.0095330 | -0.0945330 |
2017 | 3 | 2017.208 | 407.23 | 0.2083 | 0.6750000 | -0.1292668 | 0.8042668 |
2017 | 4 | 2017.292 | 409.03 | 0.2917 | 2.4750000 | 0.2755754 | 2.1994246 |
2017 | 5 | 2017.375 | 409.69 | 0.3750 | 3.1350000 | 0.0894831 | 3.0455169 |
2017 | 6 | 2017.458 | 408.89 | 0.4583 | 2.3350000 | 0.0709022 | 2.2640978 |
2017 | 7 | 2017.542 | 407.13 | 0.5417 | 0.5750000 | 0.1042562 | 0.4707438 |
2017 | 8 | 2017.625 | 405.12 | 0.6250 | -1.4350000 | 0.1407899 | -1.5757899 |
2017 | 9 | 2017.708 | 403.37 | 0.7083 | -3.1850000 | -0.2212911 | -2.9637089 |
2017 | 10 | 2017.792 | 403.63 | 0.7917 | -2.9250000 | -0.2482123 | -2.6767877 |
2017 | 11 | 2017.875 | 405.12 | 0.8750 | -1.4350000 | -0.3364447 | -1.0985553 |
2017 | 12 | 2017.958 | 406.81 | 0.9583 | 0.2550000 | -0.1624971 | 0.4174971 |
2018 | 1 | 2018.042 | 407.96 | 0.0417 | -0.5600000 | 0.1904224 | -0.7504224 |
2018 | 2 | 2018.125 | 408.32 | 0.1250 | -0.2000000 | -0.1054670 | -0.0945330 |
2018 | 3 | 2018.208 | 409.39 | 0.2083 | 0.8700000 | 0.0657332 | 0.8042668 |
2018 | 4 | 2018.292 | 410.25 | 0.2917 | 1.7300000 | -0.4694246 | 2.1994246 |
2018 | 5 | 2018.375 | 411.24 | 0.3750 | 2.7200000 | -0.3255169 | 3.0455169 |
2018 | 6 | 2018.458 | 410.79 | 0.4583 | 2.2700000 | 0.0059022 | 2.2640978 |
2018 | 7 | 2018.542 | 408.70 | 0.5417 | 0.1800000 | -0.2907438 | 0.4707438 |
2018 | 8 | 2018.625 | 406.97 | 0.6250 | -1.5500000 | 0.0257899 | -1.5757899 |
2018 | 9 | 2018.708 | 405.52 | 0.7083 | -3.0000000 | -0.0362911 | -2.9637089 |
2018 | 10 | 2018.792 | 406.00 | 0.7917 | -2.5200000 | 0.1567877 | -2.6767877 |
2018 | 11 | 2018.875 | 408.02 | 0.8750 | -0.5000000 | 0.5985553 | -1.0985553 |
2018 | 12 | 2018.958 | 409.08 | 0.9583 | 0.5600000 | 0.1425029 | 0.4174971 |
2019 | 1 | 2019.042 | 410.83 | 0.0417 | -0.6041667 | 0.1462558 | -0.7504224 |
2019 | 2 | 2019.125 | 411.75 | 0.1250 | 0.3158333 | 0.4103663 | -0.0945330 |
2019 | 3 | 2019.208 | 411.97 | 0.2083 | 0.5358333 | -0.2684335 | 0.8042668 |
2019 | 4 | 2019.292 | 413.33 | 0.2917 | 1.8958333 | -0.3035913 | 2.1994246 |
2019 | 5 | 2019.375 | 414.64 | 0.3750 | 3.2058333 | 0.1603165 | 3.0455169 |
2019 | 6 | 2019.458 | 413.93 | 0.4583 | 2.4958333 | 0.2317355 | 2.2640978 |
2019 | 7 | 2019.542 | 411.74 | 0.5417 | 0.3058333 | -0.1649104 | 0.4707438 |
2019 | 8 | 2019.625 | 409.95 | 0.6250 | -1.4841667 | 0.0916232 | -1.5757899 |
2019 | 9 | 2019.708 | 408.54 | 0.7083 | -2.8941667 | 0.0695423 | -2.9637089 |
2019 | 10 | 2019.792 | 408.52 | 0.7917 | -2.9141667 | -0.2373790 | -2.6767877 |
2019 | 11 | 2019.875 | 410.25 | 0.8750 | -1.1841667 | -0.0856113 | -1.0985553 |
2019 | 12 | 2019.958 | 411.76 | 0.9583 | 0.3258333 | -0.0916638 | 0.4174971 |
2020 | 1 | 2020.042 | 413.39 | 0.0417 | -0.6154545 | 0.1349679 | -0.7504224 |
2020 | 2 | 2020.125 | 414.11 | 0.1250 | 0.1045455 | 0.1990785 | -0.0945330 |
2020 | 3 | 2020.208 | 414.51 | 0.2083 | 0.5045455 | -0.2997213 | 0.8042668 |
2020 | 4 | 2020.292 | 416.21 | 0.2917 | 2.2045455 | 0.0051209 | 2.1994246 |
2020 | 5 | 2020.375 | 417.07 | 0.3750 | 3.0645455 | 0.0190286 | 3.0455169 |
2020 | 6 | 2020.458 | 416.38 | 0.4583 | 2.3745455 | 0.1104477 | 2.2640978 |
2020 | 7 | 2020.542 | 414.38 | 0.5417 | 0.3745455 | -0.0961983 | 0.4707438 |
2020 | 8 | 2020.625 | 412.55 | 0.6250 | -1.4554545 | 0.1203353 | -1.5757899 |
2020 | 9 | 2020.708 | 411.29 | 0.7083 | -2.7154545 | 0.2482544 | -2.9637089 |
2020 | 10 | 2020.792 | 411.28 | 0.7917 | -2.7254545 | -0.0486669 | -2.6767877 |
2020 | 11 | 2020.875 | 412.89 | 0.8750 | -1.1154545 | -0.0168992 | -1.0985553 |
If we generate new data (dates), we can plot predictions too. Since the model is a piecewise cubic function, extrapolations are often dramatically unreliable. The downward facing cubic in the last “bump” is simply continued, with comically bad results. Extrapolation of models, and especially smooths, is somewhere between risky and foolish!
new_data <- tibble(decimal_date = seq(2017, 2022, by = 0.05))
new_data |> add_fitted(g3) |>
ggplot(aes(decimal_date, .value)) +
geom_line(color = "blue", size = 2) +
geom_point(aes(y= co2_avg), data = co2) + my_theme + xlim(2017, 2022)
If you want confidence intervals on the fitted values, use the confint
function together with the name of the smooth you are extracting. Be aware that this function does not include the intercept (or grand mean) from the model, so the values are all centred on zero.
confint(g1, "s(year_fraction)", level = 0.95) |> kable() |> scroll_box(height = "200px")
smooth | type | by | year_fraction | est | se | crit | lower | upper |
---|---|---|---|---|---|---|---|---|
s(year_fraction) | CRS (shrink) | NA | 0.0417000 | -0.7504224 | 0.0911641 | 1.959964 | -0.9291007 | -0.5717441 |
s(year_fraction) | CRS (shrink) | NA | 0.0509586 | -0.6817731 | 0.0809576 | 1.959964 | -0.8404472 | -0.5230990 |
s(year_fraction) | CRS (shrink) | NA | 0.0602172 | -0.6128047 | 0.0733925 | 1.959964 | -0.7566513 | -0.4689581 |
s(year_fraction) | CRS (shrink) | NA | 0.0694758 | -0.5431982 | 0.0690818 | 1.959964 | -0.6785960 | -0.4078004 |
s(year_fraction) | CRS (shrink) | NA | 0.0787343 | -0.4726346 | 0.0681204 | 1.959964 | -0.6061481 | -0.3391211 |
s(year_fraction) | CRS (shrink) | NA | 0.0879929 | -0.4007948 | 0.0699211 | 1.959964 | -0.5378376 | -0.2637519 |
s(year_fraction) | CRS (shrink) | NA | 0.0972515 | -0.3273597 | 0.0734371 | 1.959964 | -0.4712938 | -0.1834257 |
s(year_fraction) | CRS (shrink) | NA | 0.1065101 | -0.2520105 | 0.0775478 | 1.959964 | -0.4040013 | -0.1000197 |
s(year_fraction) | CRS (shrink) | NA | 0.1157687 | -0.1744279 | 0.0812996 | 1.959964 | -0.3337721 | -0.0150836 |
s(year_fraction) | CRS (shrink) | NA | 0.1250273 | -0.0942929 | 0.0839688 | 1.959964 | -0.2588687 | 0.0702829 |
s(year_fraction) | CRS (shrink) | NA | 0.1342859 | -0.0112865 | 0.0850507 | 1.959964 | -0.1779829 | 0.1554099 |
s(year_fraction) | CRS (shrink) | NA | 0.1435444 | 0.0749103 | 0.0842570 | 1.959964 | -0.0902303 | 0.2400509 |
s(year_fraction) | CRS (shrink) | NA | 0.1528030 | 0.1646477 | 0.0816674 | 1.959964 | 0.0045826 | 0.3247128 |
s(year_fraction) | CRS (shrink) | NA | 0.1620616 | 0.2583992 | 0.0780514 | 1.959964 | 0.1054213 | 0.4113771 |
s(year_fraction) | CRS (shrink) | NA | 0.1713202 | 0.3566685 | 0.0744216 | 1.959964 | 0.2108048 | 0.5025323 |
s(year_fraction) | CRS (shrink) | NA | 0.1805788 | 0.4599596 | 0.0717896 | 1.959964 | 0.3192547 | 0.6006645 |
s(year_fraction) | CRS (shrink) | NA | 0.1898374 | 0.5687763 | 0.0709348 | 1.959964 | 0.4297468 | 0.7078059 |
s(year_fraction) | CRS (shrink) | NA | 0.1990960 | 0.6836225 | 0.0721337 | 1.959964 | 0.5422429 | 0.8250020 |
s(year_fraction) | CRS (shrink) | NA | 0.2083545 | 0.8050020 | 0.0750446 | 1.959964 | 0.6579174 | 0.9520866 |
s(year_fraction) | CRS (shrink) | NA | 0.2176131 | 0.9334187 | 0.0788581 | 1.959964 | 0.7788597 | 1.0879777 |
s(year_fraction) | CRS (shrink) | NA | 0.2268717 | 1.0693765 | 0.0825778 | 1.959964 | 0.9075269 | 1.2312261 |
s(year_fraction) | CRS (shrink) | NA | 0.2361303 | 1.2133792 | 0.0852429 | 1.959964 | 1.0463061 | 1.3804523 |
s(year_fraction) | CRS (shrink) | NA | 0.2453889 | 1.3659307 | 0.0860566 | 1.959964 | 1.1972628 | 1.5345985 |
s(year_fraction) | CRS (shrink) | NA | 0.2546475 | 1.5269277 | 0.0846482 | 1.959964 | 1.3610204 | 1.6928350 |
s(year_fraction) | CRS (shrink) | NA | 0.2639061 | 1.6938561 | 0.0816106 | 1.959964 | 1.5339022 | 1.8538101 |
s(year_fraction) | CRS (shrink) | NA | 0.2731646 | 1.8636034 | 0.0779236 | 1.959964 | 1.7108759 | 2.0163308 |
s(year_fraction) | CRS (shrink) | NA | 0.2824232 | 2.0330568 | 0.0746644 | 1.959964 | 1.8867173 | 2.1793963 |
s(year_fraction) | CRS (shrink) | NA | 0.2916818 | 2.1991039 | 0.0728081 | 1.959964 | 2.0564026 | 2.3418053 |
s(year_fraction) | CRS (shrink) | NA | 0.3009404 | 2.3586321 | 0.0729366 | 1.959964 | 2.2156789 | 2.5015852 |
s(year_fraction) | CRS (shrink) | NA | 0.3101990 | 2.5085287 | 0.0750029 | 1.959964 | 2.3615256 | 2.6555317 |
s(year_fraction) | CRS (shrink) | NA | 0.3194576 | 2.6456811 | 0.0783537 | 1.959964 | 2.4921106 | 2.7992516 |
s(year_fraction) | CRS (shrink) | NA | 0.3287162 | 2.7669769 | 0.0819889 | 1.959964 | 2.6062816 | 2.9276722 |
s(year_fraction) | CRS (shrink) | NA | 0.3379747 | 2.8693034 | 0.0848476 | 1.959964 | 2.7030052 | 3.0356015 |
s(year_fraction) | CRS (shrink) | NA | 0.3472333 | 2.9495480 | 0.0859986 | 1.959964 | 2.7809937 | 3.1181022 |
s(year_fraction) | CRS (shrink) | NA | 0.3564919 | 3.0052962 | 0.0849386 | 1.959964 | 2.8388196 | 3.1717728 |
s(year_fraction) | CRS (shrink) | NA | 0.3657505 | 3.0369263 | 0.0821713 | 1.959964 | 2.8758735 | 3.1979791 |
s(year_fraction) | CRS (shrink) | NA | 0.3750091 | 3.0455143 | 0.0786208 | 1.959964 | 2.8914204 | 3.1996083 |
s(year_fraction) | CRS (shrink) | NA | 0.3842677 | 3.0321367 | 0.0753298 | 1.959964 | 2.8844929 | 3.1797804 |
s(year_fraction) | CRS (shrink) | NA | 0.3935263 | 2.9978695 | 0.0732707 | 1.959964 | 2.8542615 | 3.1414775 |
s(year_fraction) | CRS (shrink) | NA | 0.4027848 | 2.9437891 | 0.0730697 | 1.959964 | 2.8005751 | 3.0870030 |
s(year_fraction) | CRS (shrink) | NA | 0.4120434 | 2.8709716 | 0.0747628 | 1.959964 | 2.7244393 | 3.0175039 |
s(year_fraction) | CRS (shrink) | NA | 0.4213020 | 2.7804934 | 0.0777794 | 1.959964 | 2.6280486 | 2.9329383 |
s(year_fraction) | CRS (shrink) | NA | 0.4305606 | 2.6734307 | 0.0811703 | 1.959964 | 2.5143399 | 2.8325215 |
s(year_fraction) | CRS (shrink) | NA | 0.4398192 | 2.5508597 | 0.0838920 | 1.959964 | 2.3864344 | 2.7152849 |
s(year_fraction) | CRS (shrink) | NA | 0.4490778 | 2.4138566 | 0.0850124 | 1.959964 | 2.2472353 | 2.5804779 |
s(year_fraction) | CRS (shrink) | NA | 0.4583364 | 2.2634823 | 0.0840211 | 1.959964 | 2.0988041 | 2.4281606 |
s(year_fraction) | CRS (shrink) | NA | 0.4675949 | 2.1007371 | 0.0814055 | 1.959964 | 1.9411853 | 2.2602890 |
s(year_fraction) | CRS (shrink) | NA | 0.4768535 | 1.9266061 | 0.0780703 | 1.959964 | 1.7735912 | 2.0796210 |
s(year_fraction) | CRS (shrink) | NA | 0.4861121 | 1.7420745 | 0.0750318 | 1.959964 | 1.5950148 | 1.8891341 |
s(year_fraction) | CRS (shrink) | NA | 0.4953707 | 1.5481273 | 0.0732232 | 1.959964 | 1.4046125 | 1.6916421 |
s(year_fraction) | CRS (shrink) | NA | 0.5046293 | 1.3457498 | 0.0732233 | 1.959964 | 1.2022348 | 1.4892649 |
s(year_fraction) | CRS (shrink) | NA | 0.5138879 | 1.1359272 | 0.0750321 | 1.959964 | 0.9888669 | 1.2829875 |
s(year_fraction) | CRS (shrink) | NA | 0.5231465 | 0.9196446 | 0.0780707 | 1.959964 | 0.7666287 | 1.0726604 |
s(year_fraction) | CRS (shrink) | NA | 0.5324051 | 0.6978871 | 0.0814060 | 1.959964 | 0.5383343 | 0.8574399 |
s(year_fraction) | CRS (shrink) | NA | 0.5416636 | 0.4716399 | 0.0840214 | 1.959964 | 0.3069609 | 0.6363189 |
s(year_fraction) | CRS (shrink) | NA | 0.5509222 | 0.2418882 | 0.0850126 | 1.959964 | 0.0752666 | 0.4085099 |
s(year_fraction) | CRS (shrink) | NA | 0.5601808 | 0.0096723 | 0.0838920 | 1.959964 | -0.1547530 | 0.1740976 |
s(year_fraction) | CRS (shrink) | NA | 0.5694394 | -0.2237447 | 0.0811702 | 1.959964 | -0.3828354 | -0.0646541 |
s(year_fraction) | CRS (shrink) | NA | 0.5786980 | -0.4570436 | 0.0777794 | 1.959964 | -0.6094884 | -0.3045988 |
s(year_fraction) | CRS (shrink) | NA | 0.5879566 | -0.6889047 | 0.0747631 | 1.959964 | -0.8354377 | -0.5423718 |
s(year_fraction) | CRS (shrink) | NA | 0.5972152 | -0.9180088 | 0.0730706 | 1.959964 | -1.0612245 | -0.7747931 |
s(year_fraction) | CRS (shrink) | NA | 0.6064737 | -1.1430364 | 0.0732722 | 1.959964 | -1.2866474 | -0.9994254 |
s(year_fraction) | CRS (shrink) | NA | 0.6157323 | -1.3626681 | 0.0753319 | 1.959964 | -1.5103159 | -1.2150204 |
s(year_fraction) | CRS (shrink) | NA | 0.6249909 | -1.5755845 | 0.0786231 | 1.959964 | -1.7296830 | -1.4214861 |
s(year_fraction) | CRS (shrink) | NA | 0.6342495 | -1.7804663 | 0.0821735 | 1.959964 | -1.9415234 | -1.6194091 |
s(year_fraction) | CRS (shrink) | NA | 0.6435081 | -1.9759939 | 0.0849404 | 1.959964 | -2.1424740 | -1.8095138 |
s(year_fraction) | CRS (shrink) | NA | 0.6527667 | -2.1608480 | 0.0859997 | 1.959964 | -2.3294043 | -1.9922916 |
s(year_fraction) | CRS (shrink) | NA | 0.6620253 | -2.3336718 | 0.0848479 | 1.959964 | -2.4999706 | -2.1673729 |
s(year_fraction) | CRS (shrink) | NA | 0.6712838 | -2.4929588 | 0.0819889 | 1.959964 | -2.6536541 | -2.3322636 |
s(year_fraction) | CRS (shrink) | NA | 0.6805424 | -2.6371655 | 0.0783541 | 1.959964 | -2.7907368 | -2.4835942 |
s(year_fraction) | CRS (shrink) | NA | 0.6898010 | -2.7647478 | 0.0750048 | 1.959964 | -2.9117546 | -2.6177411 |
s(year_fraction) | CRS (shrink) | NA | 0.6990596 | -2.8741622 | 0.0729410 | 1.959964 | -3.0171239 | -2.7312005 |
s(year_fraction) | CRS (shrink) | NA | 0.7083182 | -2.9638648 | 0.0728152 | 1.959964 | -3.1065800 | -2.8211495 |
s(year_fraction) | CRS (shrink) | NA | 0.7175768 | -3.0323118 | 0.0746736 | 1.959964 | -3.1786694 | -2.8859541 |
s(year_fraction) | CRS (shrink) | NA | 0.7268354 | -3.0779594 | 0.0779336 | 1.959964 | -3.2307065 | -2.9252124 |
s(year_fraction) | CRS (shrink) | NA | 0.7360939 | -3.0992640 | 0.0816197 | 1.959964 | -3.2592356 | -2.9392923 |
s(year_fraction) | CRS (shrink) | NA | 0.7453525 | -3.0946816 | 0.0846547 | 1.959964 | -3.2606017 | -2.9287615 |
s(year_fraction) | CRS (shrink) | NA | 0.7546111 | -3.0626686 | 0.0860598 | 1.959964 | -3.2313426 | -2.8939946 |
s(year_fraction) | CRS (shrink) | NA | 0.7638697 | -3.0024297 | 0.0852433 | 1.959964 | -3.1695036 | -2.8353559 |
s(year_fraction) | CRS (shrink) | NA | 0.7731283 | -2.9161857 | 0.0825785 | 1.959964 | -3.0780367 | -2.7543348 |
s(year_fraction) | CRS (shrink) | NA | 0.7823869 | -2.8069168 | 0.0788644 | 1.959964 | -2.9614882 | -2.6523454 |
s(year_fraction) | CRS (shrink) | NA | 0.7916455 | -2.6776030 | 0.0750628 | 1.959964 | -2.8247235 | -2.5304825 |
s(year_fraction) | CRS (shrink) | NA | 0.8009040 | -2.5312246 | 0.0721691 | 1.959964 | -2.6726735 | -2.3897757 |
s(year_fraction) | CRS (shrink) | NA | 0.8101626 | -2.3707617 | 0.0709880 | 1.959964 | -2.5098957 | -2.2316278 |
s(year_fraction) | CRS (shrink) | NA | 0.8194212 | -2.1991946 | 0.0718550 | 1.959964 | -2.3400278 | -2.0583613 |
s(year_fraction) | CRS (shrink) | NA | 0.8286798 | -2.0195033 | 0.0744885 | 1.959964 | -2.1654980 | -1.8735085 |
s(year_fraction) | CRS (shrink) | NA | 0.8379384 | -1.8346680 | 0.0781073 | 1.959964 | -1.9877556 | -1.6815804 |
s(year_fraction) | CRS (shrink) | NA | 0.8471970 | -1.6476689 | 0.0817030 | 1.959964 | -1.8078037 | -1.4875341 |
s(year_fraction) | CRS (shrink) | NA | 0.8564556 | -1.4614862 | 0.0842697 | 1.959964 | -1.6266517 | -1.2963207 |
s(year_fraction) | CRS (shrink) | NA | 0.8657141 | -1.2785516 | 0.0850508 | 1.959964 | -1.4452480 | -1.1118552 |
s(year_fraction) | CRS (shrink) | NA | 0.8749727 | -1.0990792 | 0.0839868 | 1.959964 | -1.2636903 | -0.9344681 |
s(year_fraction) | CRS (shrink) | NA | 0.8842313 | -0.9227229 | 0.0813945 | 1.959964 | -1.0822532 | -0.7631926 |
s(year_fraction) | CRS (shrink) | NA | 0.8934899 | -0.7491364 | 0.0778159 | 1.959964 | -0.9016528 | -0.5966200 |
s(year_fraction) | CRS (shrink) | NA | 0.9027485 | -0.5779735 | 0.0740203 | 1.959964 | -0.7230506 | -0.4328964 |
s(year_fraction) | CRS (shrink) | NA | 0.9120071 | -0.4088880 | 0.0710024 | 1.959964 | -0.5480501 | -0.2697259 |
s(year_fraction) | CRS (shrink) | NA | 0.9212657 | -0.2415336 | 0.0698892 | 1.959964 | -0.3785140 | -0.1045532 |
s(year_fraction) | CRS (shrink) | NA | 0.9305242 | -0.0755641 | 0.0716662 | 1.959964 | -0.2160273 | 0.0648991 |
s(year_fraction) | CRS (shrink) | NA | 0.9397828 | 0.0893667 | 0.0768070 | 1.959964 | -0.0611722 | 0.2399055 |
s(year_fraction) | CRS (shrink) | NA | 0.9490414 | 0.2536050 | 0.0851237 | 1.959964 | 0.0867656 | 0.4204444 |
s(year_fraction) | CRS (shrink) | NA | 0.9583000 | 0.4174971 | 0.0959746 | 1.959964 | 0.2293904 | 0.6056038 |
confint(g1, "s(year_fraction)", level = 0.95) |>
ggplot(aes(year_fraction)) +
geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.25) +
geom_line(aes(y = est)) + my_theme
You can also use lm
to fit splines; these are similar to GAMs but there are some important differences. The mgcv
package has a lot of features not available with lm
.
term | estimate | std.error | statistic | p.value |
---|---|---|---|---|
(Intercept) | 391.73 | 1.13 | 347.53 | 0.00 |
splines::bs(decimal_date, df = 5)1 | 0.64 | 2.16 | 0.30 | 0.77 |
splines::bs(decimal_date, df = 5)2 | 6.87 | 1.47 | 4.69 | 0.00 |
splines::bs(decimal_date, df = 5)3 | 14.49 | 1.98 | 7.31 | 0.00 |
splines::bs(decimal_date, df = 5)4 | 21.01 | 1.62 | 13.01 | 0.00 |
splines::bs(decimal_date, df = 5)5 | 22.14 | 1.64 | 13.50 | 0.00 |
glance(s1) |> kable(digits = 2)
r.squared | adj.r.squared | sigma | statistic | p.value | df | logLik | AIC | BIC | deviance | df.residual | nobs |
---|---|---|---|---|---|---|---|---|---|---|---|
0.91 | 0.9 | 2.31 | 219.14 | 0 | 5 | -265.4 | 544.8 | 564.26 | 602.91 | 113 | 119 |
We can use augment
to generate data to plot and combine it with the original data.
a1 <- augment(s1, data = co2, se_fit = TRUE, interval="prediction")
a1 |> ggplot(aes(decimal_date)) +
geom_ribbon(aes(ymin = .lower, ymax = .upper), alpha = 0.2) +
geom_line(aes(y= .fitted)) +
geom_point(aes(y = co2_avg)) +
my_theme
For a spline that closely traces the data, increase df
to 26 or more. This increases the numbers of knots or separate cubics used to approximate the data.
17.2 Locally Estimated Scatterplot Smoothing (LOESS)
LOESS smooths are constructed by making a large number of quadratic (or possibly linear) regression lines as a window moves along the x-axis. The degree of the fits and the width of the window (and other details) can be adjusted. The predictions from these many local regressions are then computed and plotted as a “smooth” of the data. We usually just imagine the line is drawn through the data in a way that allows it to track fluctuations without specifying a model.
l1 <- loess(co2_anomaly ~ year_fraction, data = co2)
As with the GAMs, the summary
function is not particularly easy to interpret and we will not explore the details, but you can see that this is a quadratic model with a “span” of 0.75, meaning that 75% of the data are used for each regression. (The weighting of each point in the regression varies as a function of x, resulting a continuous change in the predictions.)
summary(l1)
Call:
loess(formula = co2_anomaly ~ year_fraction, data = co2)
Number of Observations: 119
Equivalent Number of Parameters: 4.58
Residual Standard Error: 0.4647
Trace of smoother matrix: 5 (exact)
Control settings:
span : 0.75
degree : 2
family : gaussian
surface : interpolate cell = 0.2
normalize: TRUE
parametric: FALSE
drop.square: FALSE
We can use predict
, residuals
and augment
on these model objects as we did for lm
fits in the previous lesson.
augment(l1, se_fit = TRUE) |> kable() |> scroll_box(height = "200px")
co2_anomaly | year_fraction | .fitted | .se.fit | .resid |
---|---|---|---|---|
-0.3216667 | 0.0417 | -1.1874816 | 0.1278142 | 0.8658149 |
0.2083333 | 0.1250 | 0.3297748 | 0.0783764 | -0.1214414 |
0.9483333 | 0.2083 | 1.4858927 | 0.0756092 | -0.5375594 |
1.5983333 | 0.2917 | 2.2386031 | 0.0811237 | -0.6402698 |
2.5383333 | 0.3750 | 2.7453971 | 0.0876098 | -0.2070637 |
2.0883333 | 0.4583 | 2.1417191 | 0.0876044 | -0.0533857 |
0.8583333 | 0.5417 | 0.4302506 | 0.0876044 | 0.4280828 |
-1.5216667 | 0.6250 | -1.5283681 | 0.0876098 | 0.0067014 |
-2.5716667 | 0.7083 | -2.4561226 | 0.0811840 | -0.1155441 |
-2.6616667 | 0.7917 | -2.3873488 | 0.0757368 | -0.2743178 |
-1.3716667 | 0.8750 | -1.4172800 | 0.0803002 | 0.0456133 |
0.2083333 | 0.9583 | 0.4905675 | 0.1329762 | -0.2822341 |
-0.7350000 | 0.0417 | -1.1874816 | 0.1278142 | 0.4524816 |
0.0050000 | 0.1250 | 0.3297748 | 0.0783764 | -0.3247748 |
0.5450000 | 0.2083 | 1.4858927 | 0.0756092 | -0.9408927 |
2.3250000 | 0.2917 | 2.2386031 | 0.0811237 | 0.0863969 |
2.8850000 | 0.3750 | 2.7453971 | 0.0876098 | 0.1396029 |
1.8550000 | 0.4583 | 2.1417191 | 0.0876044 | -0.2867191 |
0.5050000 | 0.5417 | 0.4302506 | 0.0876044 | 0.0747494 |
-1.4650000 | 0.6250 | -1.5283681 | 0.0876098 | 0.0633681 |
-2.7250000 | 0.7083 | -2.4561226 | 0.0811840 | -0.2688774 |
-2.8050000 | 0.7917 | -2.3873488 | 0.0757368 | -0.4176512 |
-0.8750000 | 0.8750 | -1.4172800 | 0.0803002 | 0.5422800 |
0.4850000 | 0.9583 | 0.4905675 | 0.1329762 | -0.0055675 |
-0.9700000 | 0.0417 | -1.1874816 | 0.1278142 | 0.2174816 |
0.2800000 | 0.1250 | 0.3297748 | 0.0783764 | -0.0497748 |
0.9100000 | 0.2083 | 1.4858927 | 0.0756092 | -0.5758927 |
1.8900000 | 0.2917 | 2.2386031 | 0.0811237 | -0.3486031 |
3.2600000 | 0.3750 | 2.7453971 | 0.0876098 | 0.5146029 |
2.0800000 | 0.4583 | 2.1417191 | 0.0876044 | -0.0617191 |
0.8000000 | 0.5417 | 0.4302506 | 0.0876044 | 0.3697494 |
-1.3200000 | 0.6250 | -1.5283681 | 0.0876098 | 0.2083681 |
-3.0700000 | 0.7083 | -2.4561226 | 0.0811840 | -0.6138774 |
-2.8200000 | 0.7917 | -2.3873488 | 0.0757368 | -0.4326512 |
-1.3600000 | 0.8750 | -1.4172800 | 0.0803002 | 0.0572800 |
0.3200000 | 0.9583 | 0.4905675 | 0.1329762 | -0.1705675 |
-0.7925000 | 0.0417 | -1.1874816 | 0.1278142 | 0.3949816 |
-0.6325000 | 0.1250 | 0.3297748 | 0.0783764 | -0.9622748 |
1.0675000 | 0.2083 | 1.4858927 | 0.0756092 | -0.4183927 |
2.6875000 | 0.2917 | 2.2386031 | 0.0811237 | 0.4488969 |
3.1375000 | 0.3750 | 2.7453971 | 0.0876098 | 0.3921029 |
2.6075000 | 0.4583 | 2.1417191 | 0.0876044 | 0.4657809 |
0.4675000 | 0.5417 | 0.4302506 | 0.0876044 | 0.0372494 |
-1.6125000 | 0.6250 | -1.5283681 | 0.0876098 | -0.0841319 |
-3.2625000 | 0.7083 | -2.4561226 | 0.0811840 | -0.8063774 |
-2.5725000 | 0.7917 | -2.3873488 | 0.0757368 | -0.1851512 |
-1.3625000 | 0.8750 | -1.4172800 | 0.0803002 | 0.0547800 |
0.2675000 | 0.9583 | 0.4905675 | 0.1329762 | -0.2230675 |
-0.8458333 | 0.0417 | -1.1874816 | 0.1278142 | 0.3416483 |
-0.4758333 | 0.1250 | 0.3297748 | 0.0783764 | -0.8056081 |
0.6941667 | 0.2083 | 1.4858927 | 0.0756092 | -0.7917261 |
2.3241667 | 0.2917 | 2.2386031 | 0.0811237 | 0.0855636 |
3.1341667 | 0.3750 | 2.7453971 | 0.0876098 | 0.3887696 |
1.9741667 | 0.4583 | 2.1417191 | 0.0876044 | -0.1675524 |
0.4641667 | 0.5417 | 0.4302506 | 0.0876044 | 0.0339161 |
-1.8958333 | 0.6250 | -1.5283681 | 0.0876098 | -0.3674653 |
-3.1958333 | 0.7083 | -2.4561226 | 0.0811840 | -0.7397107 |
-2.5358333 | 0.7917 | -2.3873488 | 0.0757368 | -0.1484845 |
-0.6658333 | 0.8750 | -1.4172800 | 0.0803002 | 0.7514466 |
1.0241667 | 0.9583 | 0.4905675 | 0.1329762 | 0.5335992 |
-1.7225000 | 0.0417 | -1.1874816 | 0.1278142 | -0.5350184 |
-0.1525000 | 0.1250 | 0.3297748 | 0.0783764 | -0.4822748 |
0.6475000 | 0.2083 | 1.4858927 | 0.0756092 | -0.8383927 |
3.1975000 | 0.2917 | 2.2386031 | 0.0811237 | 0.9588969 |
3.4975000 | 0.3750 | 2.7453971 | 0.0876098 | 0.7521029 |
2.5875000 | 0.4583 | 2.1417191 | 0.0876044 | 0.4457809 |
0.1775000 | 0.5417 | 0.4302506 | 0.0876044 | -0.2527506 |
-1.9625000 | 0.6250 | -1.5283681 | 0.0876098 | -0.4341319 |
-3.1725000 | 0.7083 | -2.4561226 | 0.0811840 | -0.7163774 |
-2.6225000 | 0.7917 | -2.3873488 | 0.0757368 | -0.2351512 |
-0.6925000 | 0.8750 | -1.4172800 | 0.0803002 | 0.7247800 |
0.2175000 | 0.9583 | 0.4905675 | 0.1329762 | -0.2730675 |
-0.3850000 | 0.0417 | -1.1874816 | 0.1278142 | 0.8024816 |
-0.0850000 | 0.1250 | 0.3297748 | 0.0783764 | -0.4147748 |
0.6750000 | 0.2083 | 1.4858927 | 0.0756092 | -0.8108927 |
2.4750000 | 0.2917 | 2.2386031 | 0.0811237 | 0.2363969 |
3.1350000 | 0.3750 | 2.7453971 | 0.0876098 | 0.3896029 |
2.3350000 | 0.4583 | 2.1417191 | 0.0876044 | 0.1932809 |
0.5750000 | 0.5417 | 0.4302506 | 0.0876044 | 0.1447494 |
-1.4350000 | 0.6250 | -1.5283681 | 0.0876098 | 0.0933681 |
-3.1850000 | 0.7083 | -2.4561226 | 0.0811840 | -0.7288774 |
-2.9250000 | 0.7917 | -2.3873488 | 0.0757368 | -0.5376512 |
-1.4350000 | 0.8750 | -1.4172800 | 0.0803002 | -0.0177200 |
0.2550000 | 0.9583 | 0.4905675 | 0.1329762 | -0.2355675 |
-0.5600000 | 0.0417 | -1.1874816 | 0.1278142 | 0.6274816 |
-0.2000000 | 0.1250 | 0.3297748 | 0.0783764 | -0.5297748 |
0.8700000 | 0.2083 | 1.4858927 | 0.0756092 | -0.6158927 |
1.7300000 | 0.2917 | 2.2386031 | 0.0811237 | -0.5086031 |
2.7200000 | 0.3750 | 2.7453971 | 0.0876098 | -0.0253971 |
2.2700000 | 0.4583 | 2.1417191 | 0.0876044 | 0.1282809 |
0.1800000 | 0.5417 | 0.4302506 | 0.0876044 | -0.2502506 |
-1.5500000 | 0.6250 | -1.5283681 | 0.0876098 | -0.0216319 |
-3.0000000 | 0.7083 | -2.4561226 | 0.0811840 | -0.5438774 |
-2.5200000 | 0.7917 | -2.3873488 | 0.0757368 | -0.1326512 |
-0.5000000 | 0.8750 | -1.4172800 | 0.0803002 | 0.9172800 |
0.5600000 | 0.9583 | 0.4905675 | 0.1329762 | 0.0694325 |
-0.6041667 | 0.0417 | -1.1874816 | 0.1278142 | 0.5833149 |
0.3158333 | 0.1250 | 0.3297748 | 0.0783764 | -0.0139414 |
0.5358333 | 0.2083 | 1.4858927 | 0.0756092 | -0.9500594 |
1.8958333 | 0.2917 | 2.2386031 | 0.0811237 | -0.3427698 |
3.2058333 | 0.3750 | 2.7453971 | 0.0876098 | 0.4604363 |
2.4958333 | 0.4583 | 2.1417191 | 0.0876044 | 0.3541143 |
0.3058333 | 0.5417 | 0.4302506 | 0.0876044 | -0.1244172 |
-1.4841667 | 0.6250 | -1.5283681 | 0.0876098 | 0.0442014 |
-2.8941667 | 0.7083 | -2.4561226 | 0.0811840 | -0.4380441 |
-2.9141667 | 0.7917 | -2.3873488 | 0.0757368 | -0.5268178 |
-1.1841667 | 0.8750 | -1.4172800 | 0.0803002 | 0.2331133 |
0.3258333 | 0.9583 | 0.4905675 | 0.1329762 | -0.1647341 |
-0.6154545 | 0.0417 | -1.1874816 | 0.1278142 | 0.5720271 |
0.1045455 | 0.1250 | 0.3297748 | 0.0783764 | -0.2252293 |
0.5045455 | 0.2083 | 1.4858927 | 0.0756092 | -0.9813473 |
2.2045455 | 0.2917 | 2.2386031 | 0.0811237 | -0.0340576 |
3.0645455 | 0.3750 | 2.7453971 | 0.0876098 | 0.3191484 |
2.3745455 | 0.4583 | 2.1417191 | 0.0876044 | 0.2328264 |
0.3745455 | 0.5417 | 0.4302506 | 0.0876044 | -0.0557051 |
-1.4554545 | 0.6250 | -1.5283681 | 0.0876098 | 0.0729135 |
-2.7154545 | 0.7083 | -2.4561226 | 0.0811840 | -0.2593319 |
-2.7254545 | 0.7917 | -2.3873488 | 0.0757368 | -0.3381057 |
-1.1154545 | 0.8750 | -1.4172800 | 0.0803002 | 0.3018254 |
augment(l1, se_fit = TRUE) |>
ggplot(aes(x = year_fraction, y = .fitted)) +
geom_line() +
geom_ribbon(aes(ymin = .fitted - 2*.se.fit, ymax = .fitted + 2*.se.fit), alpha = 0.20) +
my_theme
This line is only drawn using 12 points, so we might want to generate a smoother prediction by creating a new set of dates. We also add the original data to the plot.
new_data = tibble(year_fraction = seq(min(co2$year_fraction),
max(co2$year_fraction),
length = 100))
augment(l1, newdata = new_data, se_fit = TRUE) |>
ggplot(aes(x = year_fraction, y = .fitted)) +
geom_line() +
geom_ribbon(aes(ymin = .fitted - 2*.se.fit,
ymax = .fitted + 2*.se.fit),
alpha = 0.20) +
geom_point(aes(y = co2_anomaly), data = co2) +
my_theme
The LOESS function will not predict outside of the range of data provided, so I had to select the range of new_data
above carefully to get the line to start and end near the first and last points.
Now we make two plots that contrast the results with different sized windows (span
) and degree of the local regression models.
l2 <- loess(co2_avg ~ decimal_date, data = co2,
degree = 1, span = 0.05)
l3 <- loess(co2_avg ~ decimal_date, data = co2,
degree = 1, span = 0.25)
new_data = tibble(decimal_date = seq(min(co2$decimal_date),
max(co2$decimal_date),
length = 300))
a2 <- augment(l2, newdata = new_data, se_fit = TRUE)
a3 <- augment(l3, newdata = new_data, se_fit = TRUE)
ggplot(a2, aes(x = decimal_date, y = .fitted)) +
geom_line(col="green") +
geom_ribbon(aes(ymin = .fitted - 2*.se.fit,
ymax = .fitted + 2*.se.fit),
alpha = 0.20, fill="green") +
geom_point(aes(y = co2_avg), data = co2) +
geom_line(data = a3,
col="blue") +
geom_ribbon(data = a3,
aes(ymin = .fitted - 2*.se.fit,
ymax = .fitted + 2*.se.fit),
alpha = 0.20, fill="blue") +
my_theme
LOESS fits are slow and take a lot of memory compared to other methods (both time and storage requirements increase like the square of the number of points), so they are usually only used for small data sets.
17.3 Packages used
In addition to tidyverse
, kableExtra
and gapminder
, in this lesson we used
broom
-
mgcv
for GAMs -
gratia
for plotting and extracting information from GAMs