In statistics and econometrics, and in particular in time series analysis, an autoregressive integrated moving average (ARIMA) model is a generalization of an autoregressive moving average (ARMA) model. Both of these models are fitted to time series data, either to better understand the data or to forecast future points in the series. ARIMA models are applied in some cases where data show evidence of non-stationarity in the sense of the expected value (but not of the variance/autocovariance); an initial differencing step (corresponding to the "integrated" part of the model) can be applied one or more times to eliminate the non-stationarity of the mean function (i.e., the trend).[1] When seasonality is present in a time series, seasonal differencing can be applied to eliminate the seasonal component. Since the ARMA model, according to Wold's decomposition theorem,[2] [3] [4] is theoretically sufficient to describe a regular (a.k.a. purely nondeterministic) wide-sense stationary time series, we are motivated to make a non-stationary time series stationary, e.g., by differencing, before we can use the ARMA model.[5] Note that if the time series contains a predictable sub-process (a.k.a. a pure sine or complex-valued exponential process), the predictable component is treated as a non-zero-mean but periodic (i.e., seasonal) component in the ARIMA framework, so that it is eliminated by seasonal differencing.
The autoregressive (AR) part of ARIMA indicates that the evolving variable of interest is regressed on its own lagged (i.e., prior) values. The moving average (MA) part indicates that the regression error is a linear combination of error terms whose values occurred contemporaneously and at various times in the past.[6] The I (for "integrated") indicates that the data values have been replaced with the differences between their values and the previous values (and this differencing process may have been performed more than once). The purpose of each of these features is to make the model fit the data as well as possible.
Non-seasonal ARIMA models are generally denoted ARIMA(p,d,q) where parameters p, d, and q are non-negative integers: p is the order (number of time lags) of the autoregressive model, d is the degree of differencing (the number of times the data have had past values subtracted), and q is the order of the moving-average model. Seasonal ARIMA models are usually denoted ARIMA(p,d,q)(P,D,Q)_m, where m refers to the number of periods in each season, and the uppercase P, D, Q refer to the autoregressive, differencing, and moving average terms for the seasonal part of the ARIMA model.[7] [8]
When two out of the three terms are zero, the model may be referred to by the non-zero parameter, dropping "AR", "I" or "MA" from the acronym describing the model. For example, ARIMA(1,0,0) is AR(1), ARIMA(0,1,0) is I(1), and ARIMA(0,0,1) is MA(1).
ARIMA models can be estimated following the Box–Jenkins approach.
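As a minimal illustrative sketch (not from the original text), the following fits an ARIMA(p,d,q) model with the Python statsmodels library; the order (1, 1, 1) and the synthetic random-walk data are assumptions chosen for the example.

```python
# Minimal sketch: fitting an ARIMA(p, d, q) model with statsmodels.
# The order (1, 1, 1) and the synthetic data are assumed for illustration.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))   # a random walk, so d = 1 is plausible

model = ARIMA(y, order=(1, 1, 1))     # (p, d, q)
result = model.fit()
print(result.summary())               # estimated AR/MA coefficients, AIC, etc.
```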
Given time series data X_t, where t is an integer index and the X_t are real numbers, an ARMA(p', q) model is given by

X_t - \alpha_1 X_{t-1} - \cdots - \alpha_{p'} X_{t-p'} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q},

or equivalently by

\left(1 - \sum_{i=1}^{p'} \alpha_i L^i \right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i \right) \varepsilon_t,

where L is the lag operator, the \alpha_i are the parameters of the autoregressive part of the model, the \theta_i are the parameters of the moving average part, and the \varepsilon_t are error terms. The error terms \varepsilon_t are generally assumed to be independent, identically distributed variables sampled from a normal distribution with zero mean.
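As a hedged sketch of this definition, an ARMA process can be simulated with statsmodels; note that arma_generate_sample expects the lag polynomials themselves, so the signs of the AR parameters are negated relative to the \alpha_i above. The specific coefficients are assumptions for illustration.

```python
# Sketch: simulating an ARMA(2, 1) process. statsmodels expects the lag
# polynomials: ar = [1, -alpha_1, -alpha_2], ma = [1, theta_1].
import numpy as np
from statsmodels.tsa.arima_process import arma_generate_sample

ar = np.array([1.0, -0.6, 0.2])   # AR polynomial 1 - 0.6L + 0.2L^2
ma = np.array([1.0, 0.4])         # MA polynomial 1 + 0.4L

x = arma_generate_sample(ar, ma, nsample=500)  # one realization of X_t
```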
Assume now that the polynomial \left(1 - \sum_{i=1}^{p'} \alpha_i L^i\right) has a unit root (a factor (1 - L)) of multiplicity d. Then it can be rewritten as:

\left(1 - \sum_{i=1}^{p'} \alpha_i L^i \right) = \left(1 - \sum_{i=1}^{p'-d} \varphi_i L^i \right) \left(1 - L \right)^d.
An ARIMA(p,d,q) process expresses this polynomial factorisation property with p=p'−d, and is given by:
\left(1 - \sum_{i=1}^{p} \varphi_i L^i \right) (1 - L)^d X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i \right) \varepsilon_t
and thus can be thought of as a particular case of an ARMA(p+d, q) process whose autoregressive polynomial has d unit roots. (For this reason, no process that is accurately described by an ARIMA model with d > 0 is wide-sense stationary.)
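As a concrete worked example (added here for illustration, with coefficients chosen for convenience), consider an AR(2) process whose autoregressive polynomial contains one unit root:

\left(1 - 1.5L + 0.5L^2\right) X_t = \varepsilon_t.

Since 1 - 1.5L + 0.5L^2 = (1 - 0.5L)(1 - L), this is

\left(1 - 0.5L\right)(1 - L) X_t = \varepsilon_t,

i.e., an ARIMA(1,1,0) process with \varphi_1 = 0.5: here p' = 2 and d = 1, so p = p' - d = 1.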
The above can be generalized as follows.
\left(1 - \sum_{i=1}^{p} \varphi_i L^i \right) (1 - L)^d X_t = \delta + \left(1 + \sum_{i=1}^{q} \theta_i L^i \right) \varepsilon_t.
This defines an ARIMA(p,d,q) process with drift \frac{\delta}{1 - \sum \varphi_i}.
The explicit identification of the factorization of the autoregression polynomial into factors as above can be extended to other cases, firstly to apply to the moving average polynomial and secondly to include other special factors. For example, having a factor (1 - L^s) in the model is one way of including a non-stationary seasonality of period s into the model; this factor has the effect of re-expressing the data as changes from s periods ago. Another example is the factor \left(1 - \sqrt{3}L + L^2\right), which includes a (non-stationary) seasonality of period 12.
Identification and specification of appropriate factors in an ARIMA model can be an important step in modeling, as it can reduce the overall number of parameters to be estimated while imposing on the model types of behavior that logic and experience suggest should be there.
A stationary time series's properties do not depend on the time at which the series is observed. Specifically, for a wide-sense stationary time series, the mean and the variance/autocovariance remain constant over time. Differencing in statistics is a transformation applied to a non-stationary time series in order to make it stationary in the mean sense (i.e., to remove a non-constant trend); it does not address non-stationarity of the variance or autocovariance. Likewise, seasonal differencing is applied to a seasonal time series to remove the seasonal component. From the perspective of signal processing, especially Fourier spectral analysis, the trend is the low-frequency part of the spectrum of a non-stationary time series, while the season is the periodic-frequency part of that spectrum. Therefore, differencing acts as a high-pass (i.e., low-stop) filter and seasonal differencing as a comb filter, suppressing the low-frequency trend and the periodic-frequency season in the spectral domain (rather than directly in the time domain), respectively.
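The filter interpretation above can be checked numerically; the following is a hedged sketch (scipy and the season length m = 12 are assumptions for the example) of the frequency responses of the two filters.

```python
# Hedged sketch: frequency responses of the differencing filter (1 - L)
# and the seasonal-differencing filter (1 - L^m); m = 12 is assumed.
import numpy as np
from scipy.signal import freqz

w, h_diff = freqz([1.0, -1.0], worN=512)     # 1 - L: gain 0 at w = 0 (high-pass)

m = 12
b_seasonal = np.zeros(m + 1)
b_seasonal[0], b_seasonal[-1] = 1.0, -1.0    # coefficients of 1 - L^m
w, h_seasonal = freqz(b_seasonal, worN=512)  # comb: nulls at w = 2*pi*k/m

# |1 - e^{-jw}| = 2|sin(w/2)|, so low frequencies (the trend) are suppressed;
# the comb's nulls remove the seasonal (periodic) frequencies.
```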
To difference the data, the difference between consecutive observations is computed. Mathematically, this is shown as
y'_t = y_t - y_{t-1}
Differencing removes the changes in the level of a time series, eliminating trend and seasonality and consequently stabilizing the mean of the time series.
Sometimes it may be necessary to difference the data a second time to obtain a stationary time series, which is referred to as second-order differencing:
\begin{align} y''_t &= y'_t - y'_{t-1} \\ &= (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) \\ &= y_t - 2y_{t-1} + y_{t-2} \end{align}
Another method of differencing data is seasonal differencing, which involves computing the difference between an observation and the corresponding observation in the previous season, e.g., the same month of the previous year. This is shown as:
y'_t = y_t - y_{t-m}, where m is the duration of the season.
The differenced data are then used for the estimation of an ARMA model.
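A minimal sketch of the three differencing operations in numpy follows; the synthetic trend-plus-seasonal series and the season length m = 12 are assumptions for the example.

```python
# Sketch of first, second-order, and seasonal differencing with numpy.
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(120)
# Synthetic series with a linear trend and period-12 seasonality (assumed).
y = 0.5 * t + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(size=t.size)

first_diff = np.diff(y)          # y'_t  = y_t - y_{t-1}
second_diff = np.diff(y, n=2)    # y''_t = y_t - 2*y_{t-1} + y_{t-2}

m = 12                           # duration of the season
seasonal_diff = y[m:] - y[:-m]   # y_t - y_{t-m}
```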
Some well-known special cases arise naturally or are mathematically equivalent to other popular forecasting models. For example, an ARIMA(0,1,0) model, given by X_t = X_{t-1} + \varepsilon_t, is a random walk; an ARIMA(0,1,0) model with a constant, X_t = c + X_{t-1} + \varepsilon_t, is a random walk with drift; and an ARIMA(0,2,2) model, given by X_t = 2X_{t-1} - X_{t-2} + (\alpha + \beta - 2)\varepsilon_{t-1} + (1 - \alpha)\varepsilon_{t-2} + \varepsilon_t, is equivalent to Holt's linear method with additive errors (double exponential smoothing).
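As a hedged sketch of the random-walk-with-drift special case (the drift value c = 0.2, the sample size, and the use of statsmodels' trend parameterization are assumptions for illustration):

```python
# Sketch: ARIMA(0,1,0) with a constant is a random walk with drift.
# We simulate X_t = c + X_{t-1} + eps_t and recover the drift.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
c = 0.2                                    # assumed drift
x = np.cumsum(c + rng.normal(size=500))    # X_t = c + X_{t-1} + eps_t

print(np.diff(x).mean())                   # simple drift estimate, ~ c

# In statsmodels' parameterization, a linear-in-time trend ("t") in the
# levels becomes a constant (the drift) after one differencing.
result = ARIMA(x, order=(0, 1, 0), trend="t").fit()
print(result.params)
```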
The orders p and q can be determined using the sample autocorrelation function (ACF), the partial autocorrelation function (PACF), and/or the extended autocorrelation function (EACF) method.[10]
Alternative methods include information criteria such as the AIC and the BIC. To determine the order of a non-seasonal ARIMA model, a useful criterion is the Akaike information criterion (AIC), written as
\mathrm{AIC} = -2\log(L) + 2(p + q + k),
where L is the likelihood of the data, p is the order of the autoregressive part, and q is the order of the moving-average part. Here k indicates whether the ARIMA model has an intercept: if k = 1 there is an intercept (c ≠ 0), and if k = 0 there is none (c = 0).
The corrected AIC for ARIMA models can be written as
\mathrm{AICc} = \mathrm{AIC} + \frac{2(p+q+k)(p+q+k+1)}{T - p - q - k - 1},

where T is the number of observations used for estimation.
The Bayesian Information Criterion (BIC) can be written as
\mathrm{BIC} = \mathrm{AIC} + (\log(T) - 2)(p + q + k).
The objective is to minimize the AIC, AICc or BIC: among a range of candidate models, the one with the lowest criterion value is preferred. The AIC and the BIC serve different purposes. The AIC seeks the candidate model that best approximates the true data-generating process, while the BIC seeks to identify the true model among the candidates. The BIC approach is often criticized because, for complex real-life data, no candidate model is exactly true; it remains a useful method for selection, however, because it penalizes additional parameters more heavily than the AIC does.
AICc can only be used to compare ARIMA models with the same orders of differencing. For ARIMAs with different orders of differencing, RMSE can be used for model comparison.
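A hedged sketch of order selection by criterion minimization follows; the grid bounds, the fixed d = 1, and the synthetic data are assumptions for the example.

```python
# Sketch: choose (p, q) by minimizing AIC over a small grid, holding d
# fixed (criterion comparisons are only valid for a fixed d).
import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(size=300))          # synthetic series, d = 1 assumed

best = None
for p, q in itertools.product(range(3), range(3)):
    try:
        aic = ARIMA(y, order=(p, 1, q)).fit().aic
    except Exception:
        continue                             # skip orders that fail to fit
    if best is None or aic < best[0]:
        best = (aic, p, q)

print("lowest AIC, p, q:", best)
```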
The ARIMA model can be viewed as a "cascade" of two models. The first is non-stationary:
Y_t = (1 - L)^d X_t
while the second is wide-sense stationary:
\left(1 - \sum_{i=1}^{p} \varphi_i L^i \right) Y_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i \right) \varepsilon_t.
Now forecasts can be made for the process Y_t, using a generalization of the method of autoregressive forecasting.
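The cascade can also be carried out by hand, as in the following hedged sketch (the orders, the synthetic data, and the use of statsmodels are assumptions): difference X_t to obtain the stationary Y_t, fit an ARMA model to Y_t, forecast Y, then undo the differencing by cumulative summation.

```python
# Sketch of the "cascade" view: ARIMA = differencing + ARMA.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
x = np.cumsum(rng.normal(size=300))        # non-stationary X_t (d = 1)
y = np.diff(x)                             # Y_t = (1 - L) X_t, stationary

arma = ARIMA(y, order=(1, 0, 1)).fit()     # an ARMA(1, 1) on the differences
y_forecast = arma.forecast(steps=10)       # forecasts of Y_{T+1}, ..., Y_{T+10}
x_forecast = x[-1] + np.cumsum(y_forecast) # integrate back to the level of X_t
```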
The forecast intervals (confidence intervals for forecasts) for ARIMA models are based on assumptions that the residuals are uncorrelated and normally distributed. If either of these assumptions does not hold, then the forecast intervals may be incorrect. For this reason, researchers plot the ACF and histogram of the residuals to check the assumptions before producing forecast intervals.
The 95% forecast interval is

\hat{y}_{T+h \mid T} \pm 1.96 \sqrt{v_{T+h \mid T}},

where v_{T+h \mid T} is the variance of y_{T+h} \mid y_1, \ldots, y_T.
For h = 1, v_{T+h \mid T} = \hat{\sigma}^2 for all ARIMA models regardless of parameters and orders.
For ARIMA(0,0,q),

y_t = e_t + \sum_{i=1}^{q} \theta_i e_{t-i},

and

v_{T+h \mid T} = \hat{\sigma}^2 \left[1 + \sum_{i=1}^{h-1} \theta_i^2 \right], \quad \text{for } h = 2, 3, \ldots
In general, forecast intervals from ARIMA models will increase as the forecast horizon increases.
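This widening with the horizon can be seen in a hedged sketch using statsmodels (the model order and synthetic data are assumptions for the example):

```python
# Sketch: ARIMA forecast intervals. get_forecast returns point forecasts,
# and conf_int(alpha=0.05) the 95% interval bounds, which widen with h.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(size=300))

result = ARIMA(y, order=(1, 1, 1)).fit()
forecast = result.get_forecast(steps=12)
mean = forecast.predicted_mean       # point forecasts \hat{y}_{T+h|T}
ci = forecast.conf_int(alpha=0.05)   # approx. mean +/- 1.96 * std. error
print(ci[:3])                        # intervals for h = 1, 2, 3
```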
A number of variations on the ARIMA model are commonly employed. If multiple time series are used, then the X_t can be thought of as vectors and a VARIMA model may be appropriate.
Various packages that apply methodology like Box–Jenkins parameter optimization are available to find the right parameters for the ARIMA model. For example, the US Bureau of the Census distributes software for ARIMA fitting and forecasting.[13] [14] [15]