This post documents my experience in the Tidy Time Series and Forecasting in R workshop at rstudio::conf 2020 that took place in San Francisco, CA during January 27-30, 2020.
Following what was presented at the two-day workshop, this third post (part two of day two) will be “everything that is forecasting”. The second day focused on how to use the tidyverts
packages in forecasting. Due to the length of the material covered on the second day, it has been split into two posts. The first part covered modeling algorithms. This post finishes the modeling algorithms and covers model diagnostics and forecast accuracy.
Setup
The following packages are required:
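# Attached packages (reconstructed from the session information at the end of this post)
library(dplyr)
library(tidyr)
library(purrr)
library(lubridate)
library(stringr)
library(tsibble)
library(tsibbledata)
library(feasts)
library(fable)
library(ggplot2)
library(kableExtra)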
The following data sets are used in this post:
- Monthly retail turnover for department stores in Victoria, Australia
- Yearly economic indicator values for Australia
victoria_dept_stores_tsbl <- aus_retail %>%
filter(
Industry == "Department stores",
State == "Victoria",
year(Month) >= 2000
) %>%
select(Month, Turnover)
victoria_dept_stores_tsbl
# A tsibble: 228 x 2 [1M]
Month Turnover
<mth> <dbl>
1 2000 Jan 228
2 2000 Feb 196.
3 2000 Mar 224.
4 2000 Apr 261.
5 2000 May 265.
6 2000 Jun 299.
7 2000 Jul 206.
8 2000 Aug 240.
9 2000 Sep 251.
10 2000 Oct 270.
# … with 218 more rows
aus_economy_tsbl <- global_economy %>%
filter(Country == "Australia") %>%
select(Year, Growth, CPI, Imports, Exports) %>%
drop_na() %>%
as_tsibble(index = Year)
aus_economy_tsbl
# A tsibble: 57 x 5 [1Y]
Year Growth CPI Imports Exports
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1961 2.49 8.14 15.0 12.4
2 1962 1.30 8.12 12.6 13.9
3 1963 6.21 8.17 13.8 13.0
4 1964 6.98 8.40 13.8 14.9
5 1965 5.98 8.69 15.3 13.2
6 1966 2.38 8.98 15.1 12.9
7 1967 6.30 9.29 13.9 12.9
8 1968 5.10 9.52 14.5 12.3
9 1969 7.04 9.83 13.3 12.0
10 1970 7.17 10.2 13.2 13.0
# … with 47 more rows
Complex Methods (continued)
Picking up from the previous post, the final complex method covered on day two is detailed below.
Dynamic Regression (RegARIMA)
The last algorithm introduced at the workshop was Dynamic Regression, or a linear regression with ARIMA errors. In ordinary linear regression, we do not want the residuals to be correlated. In Dynamic Regression, we allow the regression errors to be autocorrelated and model them with an ARIMA process. In fable, this is an extension of the ARIMA()
function where we simply add explanatory variables to the formula.
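In equation form (standard notation for a regression with ARIMA errors; this is from the general literature, not the workshop slides):

$$
y_t = \beta_0 + \beta_1 x_{1,t} + \dots + \beta_k x_{k,t} + \eta_t,
\qquad \eta_t \sim \text{ARIMA}(p, d, q)
$$

Unlike ordinary regression, where the error term is assumed to be white noise, here the error series is allowed to carry the remaining autocorrelation.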
For these examples, I will switch to the Australian economy data set. This data set has the annual percentage growth in GDP for Australia (the variable we will try to predict), along with other indicators, such as CPI, and Imports and Exports as percentages of GDP.
aus_economy_tsbl %>%
head(5) %>%
mutate(across(where(is.numeric), round, digits = 2)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Year | Growth | CPI | Imports | Exports |
---|---|---|---|---|
1961 | 2.49 | 8.14 | 15.03 | 12.40 |
1962 | 1.30 | 8.12 | 12.63 | 13.94 |
1963 | 6.21 | 8.17 | 13.83 | 13.01 |
1964 | 6.98 | 8.40 | 13.76 | 14.94 |
1965 | 5.98 | 8.69 | 15.27 | 13.22 |
The percentage growth in GDP for Australia from 1961 to 2017 looks like this:
aus_economy_tsbl %>%
autoplot(Growth) +
geom_hline(aes(yintercept = 0), linetype = "dashed") +
theme_minimal() +
scale_y_continuous(
labels = scales::label_percent(scale = 1, accuracy = 0.1)
) +
expand_limits(y = c(-8, 8)) +
labs(
title = "Annual Percentage Growth in GDP For Australia",
subtitle = "Sourced from tsibbledata package",
caption = "Original Source: The World Bank",
x = "",
y = "Growth in GDP"
)
As with basic ARIMA models, you can either specify the p, d, and q parameters or allow the function to select them for you.
aus_economy_dyn_reg_mbl <- aus_economy_tsbl %>%
filter(Year <= 2010) %>%
model(
regarima_110 = ARIMA(Growth ~ CPI + Imports + Exports + pdq(1,1,0)),
regarima_010 = ARIMA(Growth ~ CPI + Imports + Exports + pdq(0,1,0)),
regarima_002 = ARIMA(Growth ~ CPI + Imports + Exports + pdq(0,0,2)),
regarima_auto = ARIMA(Growth ~ CPI + Imports + Exports)
)
glimpse(aus_economy_dyn_reg_mbl)
Rows: 1
Columns: 4
$ regarima_110 <model> [LM w/ ARIMA(1,1,0) errors]
$ regarima_010 <model> [LM w/ ARIMA(0,1,0) errors]
$ regarima_002 <model> [LM w/ ARIMA(0,0,2) errors]
$ regarima_auto <model> [LM w/ ARIMA(0,0,0) errors]
Note that I have held out 2011 to 2017. The held-out years will be used for the forecast because the exogenous regressors (CPI, Imports, and Exports) need future values: they are inputs to the model, not the variable being forecasted. We can see the fitted model by using the report()
function.
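aus_economy_dyn_reg_mbl %>%
select(regarima_auto) %>%
report()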
Series: Growth
Model: LM w/ ARIMA(0,0,0) errors
Coefficients:
CPI Imports Exports
-0.0465 0.3810 -0.0500
s.e. 0.0109 0.1914 0.1998
sigma^2 estimated as 3.11: log likelihood=-97.76
AIC=203.53 AICc=204.42 BIC=211.18
The tidy()
function again works here.
aus_economy_dyn_reg_mbl %>%
select(regarima_auto) %>%
tidy() %>%
mutate(across(where(is.numeric), round, digits = 2)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
.model | term | estimate | std.error | statistic | p.value |
---|---|---|---|---|---|
regarima_auto | CPI | -0.05 | 0.01 | -4.28 | 0.00 |
regarima_auto | Imports | 0.38 | 0.19 | 1.99 | 0.05 |
regarima_auto | Exports | -0.05 | 0.20 | -0.25 | 0.80 |
We can now use the held-out data to forecast the Growth
variable from 2011 to 2017. Note that including the full data set in the autoplot()
function will plot the actual Growth
over the forecast horizon. This is useful for visually assessing how well the forecast performed before we get to the formal performance metrics.
aus_economy_tsbl %>%
filter(Year > 2010) %>%
forecast(object = aus_economy_dyn_reg_mbl) %>%
autoplot(aus_economy_tsbl) +
geom_hline(aes(yintercept = 0), linetype = "dashed") +
facet_wrap(~ .model, ncol = 1, scales = "free_y") +
theme_bw() +
scale_y_continuous(
labels = scales::label_percent(scale = 1, accuracy = 0.1)
) +
labs(
title = str_glue(
"Seven-Year Forecast of Annual Percentage Growth in GDP For Australia"
),
subtitle = str_glue(
"Models: four combinations of Growth = CPI + Imports + Exports with
ARIMA(1,1,0), ARIMA(0,1,0), ARIMA(0,0,2) and ARIMA(0,0,0) errors"
),
x = "",
y = "Growth in GDP"
)
Model Ensemble
After seven time-series model types (four baseline and three more complex) had been introduced, the workshop turned to ensembling: train multiple models, then take some weighted average of their outputs as the forecast. The fable
package lets us combine fitted models using basic arithmetic, and the forecast()
function is smart enough to interpret the ensemble; it will forecast with each component model, combine those forecasts, and adjust the final forecast appropriately.
victoria_dept_stores_ensemble_mbl <- victoria_dept_stores_tsbl %>%
model(
snaive = SNAIVE(Turnover),
ets = ETS(Turnover),
arima = ARIMA(Turnover)
) %>%
mutate(mixed = (snaive + ets + arima) / 3)
victoria_dept_stores_ensemble_mbl
# A mable: 1 x 4
snaive ets arima mixed
<model> <model> <model> <model>
1 <SNAIVE> <ETS(M,Ad,M)> <ARIMA(2,1,1)(1,1,2)[12]> <COMBINATION>
The mixed
model is of type <COMBINATION>
, indicating that it combines the three models. In this case each model is given equal weight, but you can choose a different weighting scheme (see the sketch after the forecast plots below). By using the report()
function, we can see the individual models that make up the ensemble and how they are added together. Note that, as of the publication date of this post, the tidy()
function cannot be used on an object of class "model_combination"
.
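victoria_dept_stores_ensemble_mbl %>%
select(mixed) %>%
report()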
Series: Turnover
Model: COMBINATION
Combination: (Turnover + Turnover + Turnover) * 0.333333333333333
=================================================================
Series: Turnover + Turnover + Turnover
Model: COMBINATION
Combination: Turnover + Turnover + Turnover
===========================================
Series: Turnover + Turnover
Model: COMBINATION
Combination: Turnover + Turnover
================================
Series: Turnover
Model: SNAIVE
sigma^2: 292.2446
Series: Turnover
Model: ETS(M,Ad,M)
Smoothing parameters:
alpha = 0.167336
beta = 0.0015252
gamma = 0.000111099
phi = 0.9799949
Initial states:
l[0] b[0] s[0] s[-1] s[-2] s[-3] s[-4] s[-5]
261.8962 2.122653 1.856338 1.102949 0.9556113 0.8697593 0.8442324 0.9481133
s[-6] s[-7] s[-8] s[-9] s[-10] s[-11]
0.9952447 0.9489134 0.9525782 0.9034134 0.7260604 0.896787
sigma^2: 0.0017
AIC AICc BIC
2459.671 2462.943 2521.399
Series: Turnover
Model: ARIMA(2,1,1)(1,1,2)[12]
Coefficients:
ar1 ar2 ma1 sar1 sma1 sma2
-0.2336 -0.1432 -0.7237 -0.8059 0.1900 -0.5586
s.e. 0.0939 0.0852 0.0712 0.2862 0.2785 0.1707
sigma^2 estimated as 193.8: log likelihood=-872.18
AIC=1758.35 AICc=1758.89 BIC=1781.94
By themselves, the three underlying models produce the following forecasts:
victoria_dept_stores_ensemble_mbl %>%
select(-mixed) %>%
forecast(h = "3 years") %>%
autoplot(victoria_dept_stores_tsbl) +
facet_wrap(~ .model, ncol = 1) +
expand_limits(y = 0) +
theme_bw() +
scale_y_continuous(
labels = scales::label_dollar(accuracy = 0.1, scale = 1e-3, suffix = "B")
) +
labs(
title = "Three-Year Forecast of Yearly Retail Turnover for Victoria",
subtitle = "Models: Seasonal Naive, ETS(M,Ad,M), ARIMA(2,1,1)(1,1,2)[12]",
x = "",
y = "Australian Dollars"
)
Taking a simple average of the three models, the ensemble produces the following forecast:
victoria_dept_stores_ensemble_mbl %>%
select(mixed) %>%
forecast(h = "3 years") %>%
autoplot(victoria_dept_stores_tsbl) +
theme_minimal() +
expand_limits(y = 0) +
scale_y_continuous(
labels = scales::label_dollar(accuracy = 0.1, scale = 1e-3, suffix = "B")
) +
labs(
title = "Three-Year Forecast of Yearly Retail Turnover for Victoria",
subtitle = str_glue("Ensemble model made of equal weights of:
Seasonal Naive, ETS(M,Ad,M), ARIMA(2,1,1)(1,1,2)[12]"),
x = "",
y = "Australian Dollars"
)
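Equal weights are just the default arithmetic used above. Because a mable column is an ordinary model object, a different weighting scheme only requires changing the arithmetic. A minimal sketch, where the 0.2/0.3/0.5 weights are hypothetical and chosen purely for illustration:

victoria_dept_stores_tsbl %>%
model(
snaive = SNAIVE(Turnover),
ets = ETS(Turnover),
arima = ARIMA(Turnover)
) %>%
# hypothetical unequal weights, for illustration only
mutate(weighted = 0.2 * snaive + 0.3 * ets + 0.5 * arima)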
Model Diagnostics
Building a model that produces a reasonable-looking forecast is good, but there are steps that must be taken to validate that model. In my case, all the models I build in my current role must be validated by our risk and compliance teams. It is important that I not only show that my model is statistically sound, but also justify the process I used to select it.
Information Criteria
A straightforward way to justify why a certain model was chosen over another is to use various information criteria, such as AIC, BIC, or AICc. While these measures are certainly convenient, there are some important restrictions to be aware of.
- Comparisons can only be done on models of the same type (e.g. you cannot compare the AIC of an ETS model to that of an ARIMA model)
- Comparisons for ARIMA models can only be done between those models with the same number of differences (e.g. you cannot compare the AIC of an ARIMA model with d=1 to that of an ARIMA model with d=0)
If the models in your mable do not violate the above restrictions, their information criteria can be calculated using the glance()
function.
victoria_dept_stores_tsbl %>%
model(
arima_111 = ARIMA(Turnover ~ pdq(1,1,1) + PDQ(0,0,0)),
arima_212 = ARIMA(Turnover ~ pdq(2,1,2) + PDQ(0,0,0)),
arima_012 = ARIMA(Turnover ~ pdq(0,1,2) + PDQ(0,0,0)),
arima_210 = ARIMA(Turnover ~ pdq(2,1,0) + PDQ(0,0,0))
) %>%
glance() %>%
select(.model, AIC, BIC, AICc) %>%
mutate(across(where(is.numeric), round, digits = 0)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
.model | AIC | BIC | AICc |
---|---|---|---|
arima_111 | 2736 | 2746 | 2736 |
arima_212 | 2685 | 2706 | 2686 |
arima_012 | 2732 | 2743 | 2732 |
arima_210 | 2800 | 2810 | 2800 |
Residual Diagnostics
We know how to compare models from the same modeling algorithm, but what about across different algorithms? We will discuss checking the accuracy of forecasts on out-of-sample data in the next section, but a good first step is to check the residuals. For a time-series model, there are two key assumptions that must be validated:
- The residuals are uncorrelated
- The residuals have mean zero
While not strictly necessary, the following properties are also very useful:
- The residuals have constant variance
- The residuals are normally distributed
Essentially, meeting all four of these conditions means your residuals are Gaussian white noise.
The first thing we can do to check these assumptions is simply plot the distribution of the residuals. The residuals from a fitted model can be extracted using the augment()
function. I will use the ARIMA models for these examples.
victoria_dept_stores_arima_mbl <- victoria_dept_stores_tsbl %>%
model(
arima_211_000 = ARIMA(Turnover ~ pdq(2,1,1) + PDQ(0,0,0)),
arima_012_002 = ARIMA(Turnover ~ pdq(0,1,2) + PDQ(0,0,2)),
arima_200_010 = ARIMA(Turnover ~ pdq(2,0,0) + PDQ(0,1,0)),
arima_auto = ARIMA(Turnover)
)
victoria_dept_stores_arima_mbl %>%
select(arima_auto) %>%
augment() %>%
ggplot() +
geom_histogram(aes(x = .resid), color = "white") +
theme_minimal() +
labs(
title = str_glue("Yearly Retail Turnover for Victoria:
Distribution of Residuals"),
subtitle = "Model: ARIMA(2,1,1)(1,1,2)[12]",
x = "Residual",
y = "Count"
)
A basic time-series plot of the residuals is also helpful for spotting any dramatic outliers and the observations at which they occurred.
victoria_dept_stores_arima_mbl %>%
select(arima_auto) %>%
augment() %>%
autoplot(.resid) +
theme_minimal() +
labs(
title = str_glue("Yearly Retail Turnover for Victoria:
Time-Series Residuals"),
subtitle = "Model: ARIMA(2,1,1)(1,1,2)[12]",
x = "Month",
y = "Residual"
)
We can get a bit more quantitative, rather than purely visual, by plotting the ACF of the residuals, which shows how much autocorrelation, if any, remains.
victoria_dept_stores_arima_mbl %>%
select(arima_auto) %>%
augment() %>%
ACF(.resid) %>%
autoplot() +
theme_minimal() +
labs(
title = "Yearly Retail Turnover for Victoria: ACF Plot",
subtitle = "Model: ARIMA(2,1,1)(1,1,2)[12]",
x = "Lag (Month)",
y = "ACF"
)
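As a shortcut, the feasts package also provides gg_tsresiduals(), which combines the time plot, ACF plot, and histogram of the residuals into a single figure. A minimal sketch, assuming a mable containing a single model column:

victoria_dept_stores_arima_mbl %>%
select(arima_auto) %>%
gg_tsresiduals()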
Based on the plots we have seen so far, the residuals look roughly normal, and the ACF shows no significant autocorrelation. To formalize these observations, we can run a Ljung-Box test. The null hypothesis is that the residual series is not autocorrelated; we want to fail to reject the null (i.e. have a p-value greater than 0.05). We can pass the ljung_box()
function into the features()
function to run this test. There are two parameters of the test we must define:
- Degrees of Freedom (dof): the number of parameters in the model (e.g. zero for Naive, p + q + P + Q for a seasonal ARIMA)
- Number of Lags Tested (lag): Dr. Hyndman suggests a value of 10 for non-seasonal data and two times the seasonal period for seasonal data (e.g. 2 * 12 = 24 for monthly data)
victoria_dept_stores_arima_mbl %>%
select(arima_auto) %>%
augment() %>%
# dof = p + q + P + Q = 6 for the ARIMA(2,1,1)(1,1,2)[12] model; lag = 2 * 12 for monthly data
features(.resid, ljung_box, dof = 2+1+1+2, lag = 2*12) %>%
mutate(across(where(is.numeric), round, digits = 2)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
.model | lb_stat | lb_pvalue |
---|---|---|
arima_auto | 11.53 | 0.87 |
Forecast Accuracy
The final step we can take is to forecast on out-of-sample data and check how accurate that forecast is. In general, what we want to do is:
- Separate some portion of the most recent data before fitting a model
- Train the model on the remaining data
- Calculate some accuracy metric using the training data
- Calculate some accuracy metric using the held-out data
- Compare the two accuracy metrics (we want them to be similar, showing that the model generalizes well to data it has not seen before)
Whereas the information criteria can only be used to compare models of the same type, comparing accuracy can be done across model types. Commonly used accuracy measures include:
- Mean Absolute Percentage Error (MAPE)
- Root Mean Square Error (RMSE)
- Mean Absolute Scaled Error (MASE)
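For reference, the standard definitions (with $y_t$ the actual value, $\hat{y}_t$ the forecast, $n$ forecasts being evaluated, $T$ training observations, and $m$ the seasonal period; these formulas are from the general literature, not the workshop slides):

$$
\text{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|
\qquad
\text{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(y_t - \hat{y}_t\right)^2}
\qquad
\text{MASE} = \frac{\frac{1}{n}\sum_{t=1}^{n}\left|y_t - \hat{y}_t\right|}{\frac{1}{T-m}\sum_{t=m+1}^{T}\left|y_t - y_{t-m}\right|}
$$

A MASE below one means the forecast errors are smaller, on average, than the in-sample errors of the (seasonal) naive method used in the denominator.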
The accuracy on the training data can be calculated by passing the model straight into the accuracy()
function.
victoria_dept_stores_train_arima_mbl <- victoria_dept_stores_tsbl %>%
filter(year(Month) <= 2015) %>%
model(arima = ARIMA(Turnover))
victoria_dept_stores_train_arima_mbl %>%
accuracy() %>%
select(.model, .type, MAPE, RMSE, MASE) %>%
mutate(across(where(is.numeric), round, digits = 2)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
.model | .type | MAPE | RMSE | MASE |
---|---|---|---|---|
arima | Training | 2.98 | 13.44 | 0.68 |
The accuracy on the held-out data takes an additional step. First, use the forecast()
function to forecast over the held-out period, and then pass the resulting fable
into the accuracy()
function, along with the full data set (which supplies the actual values for comparison).
victoria_dept_stores_tsbl %>%
filter(year(Month) > 2015) %>%
forecast(object = victoria_dept_stores_train_arima_mbl) %>%
accuracy(victoria_dept_stores_tsbl) %>%
select(.model, .type, MAPE, RMSE, MASE) %>%
mutate(across(where(is.numeric), round, digits = 2)) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
.model | .type | MAPE | RMSE | MASE |
---|---|---|---|---|
arima | Test | 3.93 | 16.93 | 0.94 |

Comparing the two tables, the test-set metrics are somewhat higher than the training metrics (e.g. MAPE of 3.93 versus 2.98) but remain in the same neighborhood, suggesting the model generalizes reasonably well to data it has not seen before.
Recap
The second day of the workshop focused on forecasting methods, model algorithms, validating those models, and producing and checking the accuracy of forecasts. This second post on the second day covered:
- Building complex models
- Validating model fit
- Determining forecast accuracy
More Information
A big thanks to Dr. Hyndman and the tidyverts
team for building these packages and putting together a fantastic workshop!
The best source of additional information on the tidyverts
ecosystem is the official website.
For more examples and explanations of the theoretical background of time-series forecasting, please see the 3rd edition of Forecasting: Principles and Practice, co-authored by Dr. Rob Hyndman. This book goes into depth on most of the topics covered in the workshop. During the workshop, Dr. Hyndman mentioned that a new edition of the book would be coming out with updated tidyverts
integration, but did not have a specific date available.
Any issues with the packages in the tidyverts
can be opened and tracked via their respective GitHub repositories.
Finally, the RStudio Community is a great place to ask for and receive help. There are several tidyverts
questions already posted, and at least one of the package authors is known to be an active member.
Notes
This post is based on a presentation that was given on the date listed. It may be updated from time to time to fix errors, detail new functions, and/or remove deprecated functions, so the packages and R version will likely be newer than what was available at the time.
The R session information used for this post:
R version 4.2.1 (2022-06-23)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS 14.1.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] kableExtra_1.3.4 ggplot2_3.4.0 fable_0.3.2 feasts_0.3.0
[5] fabletools_0.3.2 tsibbledata_0.4.1 tsibble_1.1.3 stringr_1.4.1
[9] lubridate_1.9.0 timechange_0.1.1 purrr_0.3.5 tidyr_1.2.1
[13] dplyr_1.0.10
loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 xfun_0.40 urca_1.3-3
[4] lattice_0.20-45 colorspace_2.0-3 vctrs_0.6.3
[7] generics_0.1.3 viridisLite_0.4.1 htmltools_0.5.3
[10] yaml_2.3.5 utf8_1.2.2 rlang_1.1.1
[13] pillar_1.8.1 glue_1.6.2 withr_2.5.0
[16] rappdirs_0.3.3 distributional_0.3.1 lifecycle_1.0.3
[19] progressr_0.11.0 munsell_0.5.0 anytime_0.3.9
[22] gtable_0.3.1 rvest_1.0.3 evaluate_0.16
[25] labeling_0.4.2 knitr_1.40 fastmap_1.1.0
[28] fansi_1.0.3 highr_0.9 Rcpp_1.0.9
[31] renv_0.16.0 scales_1.2.1 webshot_0.5.4
[34] jsonlite_1.8.0 systemfonts_1.0.4 farver_2.1.1
[37] digest_0.6.29 stringi_1.7.12 grid_4.2.1
[40] cli_3.6.1 tools_4.2.1 magrittr_2.0.3
[43] tibble_3.1.8 pkgconfig_2.0.3 ellipsis_0.3.2
[46] xml2_1.3.3 rmarkdown_2.16 svglite_2.1.0
[49] httr_1.4.4 rstudioapi_0.14 R6_2.5.1
[52] nlme_3.1-157 compiler_4.2.1