1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168
|
.. _arma_estimation:
ARMA process estimation
-----------------------
From the order :math:`(p,q)` or a range of orders :math:`(p,q) \in Ind_p \times Ind_q`,
where :math:`Ind_p = [p_1, p_2]`
and :math:`Ind_q = [q_1, q_2]`, the methods aims to find the *best* model
:math:`ARMA(p,q)` that fits the data and estimate the
corresponding coefficients. The *best* model is considered with
respect to the :math:`AIC_c` criteria (corrected Akaike Information
Criterion), defined by:
.. math::
AICc = -2\log L_w + 2(p + q + 1)\frac{m}{m - p - q - 2}
where :math:`m` is half the number of points of the time grid of the
process sample (if the data are a process sample) or in a block of the
time series (if the data are a time series).
Two other criteria are computed for each order :math:`(p,q)`:
- the AIC criterion:
.. math::
AIC = -2\log L_w + 2(p + q + 1)
- and the BIC criterion:
.. math::
BIC = -2\log L_w + 2(p + q + 1)\log(p + q + 1)
The *BIC* criterion leads to a model that gives a better prediction;
the *AIC* criterion selects the best model that fits the given data;
the :math:`AIC_c` criterion improves the previous one by penalizing a
too high order that would artificially fit to the data.
For each order :math:`(p,q)`, the estimation of the coefficients
:math:`(a_k)_{1\leq k\leq p}`, :math:`(b_k)_{1\leq k\leq q}` and the
variance :math:`\sigma^2` is done using the Whittle estimator which is
based on the maximization of the likelihood function in the frequency
domain.
The principle is detailed hereafter for the case of a time series: in
the case of a process sample, the estimator is similar except for the
periodogram which is computed differently.
We consider a time series associated to the time grid
:math:`(t_0, \hdots, t_{n-1})` and a particular order :math:`(p,q)`.
The spectral density function of the
:math:`ARMA(p,q)` process writes:
.. math::
:label: arma_spectrum
f(\lambda, \vect{\theta}, \sigma^2) = \frac{\sigma^2}{2 \pi} \frac{|1 + b_1 \exp(-i \lambda) + \ldots
+ b_q \exp(-i q \lambda)|^2}{|1 + a_1 \exp(-i \lambda) + \ldots + a_p \exp(-i p \lambda)|^2} =
\frac{\sigma^2}{2 \pi} g(\lambda, \vect{\theta})
where :math:`\vect{\theta} = (a_1, a_2,...,a_p,b_1,...,b_q)` and
:math:`\lambda` is the frequency value.
The Whittle log-likelihood writes:
.. math::
:label: arma_likelihood
\log L_w(\vect{\theta}, \sigma^2) = - \sum_{j=1}^{m} \log f(\lambda_j, \vect{\theta}, \sigma^2) - \frac{1}{2 \pi} \sum_{j=1}^{m} \frac{I(\lambda_j)}{f(\lambda_j, \vect{\theta}, \sigma^2)}
where:
- :math:`I` is the non parametric estimate of the spectral density,
expressed in the Fourier space (frequencies in :math:`[0,2\pi]`
instead of :math:`[-f_{max}, f_{max}]`). By default the Welch estimator is used.
- :math:`\lambda_j` is the :math:`j-th` Fourier frequency,
:math:`\lambda_j = \frac{2 \pi j}{n}`, :math:`j=1 \ldots m` with
:math:`m` the largest integer :math:`\leq \frac{n-1}{2}`.
We estimate the :math:`(p+q+1)` scalar coefficients by maximizing the
log-likelihood function. The corresponding equations lead to the
following relation:
.. math::
:label: optSigma
\sigma^{2*} = \frac{1}{m} \sum_{j=1}^{m} \frac{I(\lambda_j)}{g(\lambda_j, \vect{\theta}^{*})}
where :math:`\vect{\theta}^{*}` maximizes:
.. math::
:label: lik2
\log L_w(\vect{\theta}) = m (\log(2 \pi) - 1) - m \log\left( \frac{1}{m} \sum_{j=1}^{m} \frac{I(\lambda_j)}{g(\lambda_j, \vect{\theta})}\right) - \sum_{j=1}^{m} g(\lambda_j, \vect{\theta})
The Whitle estimation requires that:
- the determinant of the eigenvalues of the companion matrix
associated to the polynomial :math:`1 + a_1X + \dots + a_pX^p` are
outside the unit disc,, which guarantees the stationarity of the
process;
- the determinant of the eigenvalues of the companion matrix
associated to the polynomial :math:`1 + ba_1X + \dots + b_qX^q` are
outside the unit disc, which guarantees the invertibility of the
process.
Multivariate estimation
~~~~~~~~~~~~~~~~~~~~~~~
Let :math:`(t_i, \vect{X}_i)_{0\leq i \leq n-1}` be a multivariate
time series of dimension :math:`d` generated by an ARMA process
where :math:`(p,q)` are supposed to
be known. We assume that the white noise :math:`\varepsilon` is
distributed according to the normal distribution with zero mean and
with covariance matrix
:math:`\mat{\Sigma}_{\varepsilon} = \sigma^2\mat{Q}` where
:math:`|\mat{Q}| = 1` .
The normality of the white noise implies the normality of the process.
If we note :math:`\vect{W} = (\vect{X}_0, \hdots, \vect{X}_{n-1})`,
then :math:`\vect{W}` is normal with zero mean. Its covariance matrix
writes
:math:`\mathbb{E}(\vect{W}\vect{W}^t)= \sigma^2 \Sigma_{\vect{W}}`
which depends on the coefficients :math:`(\mat{A}_k, \mat{B}_l)` for
:math:`k = 1,\ldots,p` and :math:`l = 1,\ldots, q` and on the matrix
:math:`\mat{Q}`.
The likelihood of :math:`\vect{W}` writes:
.. math::
:label: likelihoodMV
L(\vect{\beta}, \sigma^2 | \vect{W}) = (2 \pi \sigma^2) ^{-\frac{d n}{2}} |\Sigma_{w}|^{-\frac{1}{2}} \exp\left(- (2\sigma^2)^{-1} \vect{W}^t \Sigma_{\vect{W}}^{-1} \vect{W} \right)
where
:math:`\vect{\beta} = (\mat{A}_{k}, \mat{B}_{l}, \mat{Q}),\ k = 1,\ldots,p`,
:math:`l = 1,\ldots, q` and where :math:`|.|` denotes the determinant.
The difficulty arises from the great size (:math:`dn \times dn`) of
:math:`\Sigma_{\vect{W}}` which is a dense matrix in the general case.
[mauricio1995]_ proposes an efficient algorithm to evaluate the likelihood
function. The main point is to use a change of variable that leads to a
block-diagonal sparse covariance matrix.
The multivariate Whittle estimation requires that:
- the determinant of the eigenvalues of the companion matrix
associated to the polynomial
:math:`\mat{I} + \mat{A}_1\mat{X} + \dots + \mat{A}_p\mat{X}^p` are
outside the unit disc, which guarantees the stationarity of the
process;
- the determinant of the eigenvalues of the companion matrix
associated to the polynomial
:math:`\mat{I} + \mat{B}_1\mat{X} + \dots + \mat{B}_q\mat{X}^q` are
outside the unit disc, which guarantees the invertibility of the
process.
.. topic:: API:
- See :class:`~openturns.WhittleFactory`
- See :class:`~openturns.WelchFactory`
- See :class:`~openturns.ARMA`
.. topic:: Examples:
- See :doc:`/auto_data_analysis/estimate_stochastic_processes/plot_estimate_arma`
- See :doc:`/auto_data_analysis/estimate_stochastic_processes/plot_estimate_multivariate_arma`
|