A vector autoregression (VAR) model is a multivariate time series model containing a system of n equations of n distinct, stationary response variables as linear functions of lagged responses and other terms. VAR models are also characterized by their degree p; each equation in a VAR(p) model contains p lags of all variables in the system.
VAR models belong to a class of multivariate linear time series models called vector autoregression moving average (VARMA) models. Although Econometrics Toolbox™ provides functionality to conduct a comprehensive analysis of a VAR(p) model (from model estimation to forecasting and simulation), the toolbox provides limited support for other models in the VARMA class.
In general, multivariate linear time series models are well suited for:
Modeling the movements of several stationary time series simultaneously.
Measuring the delayed effects among the response variables in the system.
Measuring the effects of exogenous series on variables in the system. For example, determine whether the presence of a recently imposed tariff significantly affects several econometric series.
Generating simultaneous forecasts of the response variables.
This table contains forms of multivariate linear time series models and describes their supported functionality in Econometrics Toolbox.
| Model | Abbreviation | Equation | Supported Functionality |
| --- | --- | --- | --- |
| Vector autoregression | VAR(p) | $$y_t = c + \sum_{j=1}^{p} \Phi_j y_{t-j} + \epsilon_t$$ | |
| Vector autoregression with a linear time trend | VAR(p) | $$y_t = c + \delta t + \sum_{j=1}^{p} \Phi_j y_{t-j} + \epsilon_t$$ | Represent the model by using a |
| Vector autoregression with exogenous series | VARX(p) | $$y_t = c + \delta t + \beta x_t + \sum_{j=1}^{p} \Phi_j y_{t-j} + \epsilon_t$$ | Represent the model by using a |
| Vector moving average | VMA(q) | $$y_t = c + \sum_{k=1}^{q} \Theta_k \epsilon_{t-k} + \epsilon_t$$ | |
| Vector autoregression moving average | VARMA(p, q) | $$y_t = c + \sum_{j=1}^{p} \Phi_j y_{t-j} + \sum_{k=1}^{q} \Theta_k \epsilon_{t-k} + \epsilon_t$$ | |
| Structural vector autoregression moving average | SVARMA(p, q) | $$\Phi_0 y_t = c + \sum_{j=1}^{p} \Phi_j y_{t-j} + \sum_{k=1}^{q} \Theta_k \epsilon_{t-k} + \Theta_0 \epsilon_t$$ | Same support as for VARMA models |
The following variables appear in the equations:
y_{t} is the n-by-1 vector of distinct response time series variables at time t.
c is an n-by-1 vector of constant offsets in each equation.
Φ_{j} is an n-by-n matrix of AR coefficients, where j = 1,...,p and Φ_{p} is not a matrix containing only zeros.
x_{t} is an m-by-1 vector of values corresponding to m exogenous variables or predictors. In addition to the lagged responses, exogenous variables are unmodeled inputs to the system. Each exogenous variable appears in all response equations by default.
β is an n-by-m matrix of regression coefficients. Row j contains the coefficients in the equation of response variable j, and column k contains the coefficients of exogenous variable k among all equations.
δ is an n-by-1 vector of linear time-trend values.
ε_{t} is an n-by-1 vector of random Gaussian innovations, each with a mean of 0 and collectively an n-by-n covariance matrix Σ. For t ≠ s, ε_{t} and ε_{s} are independent.
Θ_{k} is an n-by-n matrix of MA coefficients, where k = 1,...,q and Θ_{q} is not a matrix containing only zeros.
Φ_{0} and Θ_{0} are the AR and MA structural coefficients, respectively.
Generally, the time series y_{t} and x_{t} are observable because you have data representing the series. The values of c, δ, β, and the autoregressive matrices Φ_{j} are not always known. You typically want to fit these parameters to your data. See the estimate function for ways to estimate unknown parameters or to hold some of them fixed (set equality constraints) during estimation. The innovations ε_{t} are not observable in data, but they can be observable in simulations.
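To make the roles of the observable and unobservable terms concrete, the following minimal sketch simulates a bivariate VAR(1) process directly from its difference equation and keeps the innovations that generated it. It is plain Python rather than Econometrics Toolbox code. The coefficient values and the helper names `mat_vec` and `simulate_var1` are illustrative assumptions, and the innovation covariance is taken to be diagonal to keep the example short.

```python
import random

def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def simulate_var1(c, Phi1, sigma, y0, num_obs, seed=0):
    """Simulate y_t = c + Phi1 * y_{t-1} + eps_t for num_obs periods.

    sigma holds the innovation standard deviations (diagonal covariance,
    an assumption made only to keep this sketch short).
    Returns the simulated responses and the innovations that produced them.
    """
    rng = random.Random(seed)
    y_prev, responses, innovations = list(y0), [], []
    for _ in range(num_obs):
        eps = [rng.gauss(0.0, s) for s in sigma]
        ar_part = mat_vec(Phi1, y_prev)
        y_t = [ci + ar + e for ci, ar, e in zip(c, ar_part, eps)]
        responses.append(y_t)
        innovations.append(eps)
        y_prev = y_t
    return responses, innovations

c = [0.5, -0.2]              # constant offsets (n = 2), hypothetical values
Phi1 = [[0.5, 0.1],
        [0.2, 0.3]]          # AR coefficient matrix, hypothetical values
Y, E = simulate_var1(c, Phi1, sigma=[1.0, 0.5], y0=[0.0, 0.0], num_obs=200)
```

In a simulation both `Y` and `E` are available. With real data only the analogue of `Y` is observed, which is why the innovations must be inferred from a fitted model.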
In the preceding table, the models are represented in difference-equation notation. Lag operator notation is an equivalent and more succinct representation of the multivariate linear time series equations.
The lag operator L reduces the time index by one unit: Ly_{t} = y_{t–1}. The operator L^{j} reduces the time index by j units: L^{j}y_{t} = y_{t–j}.
In lag operator form, the equation for an SVARMAX(p, q) model is:
$$\left(\Phi_0 - \sum_{j=1}^{p} \Phi_j L^j\right) y_t = c + \beta x_t + \left(\Theta_0 + \sum_{k=1}^{q} \Theta_k L^k\right) \epsilon_t.$$
The equation is expressed more succinctly in this form:
$$\Phi (L){y}_{t}=c+\beta {x}_{t}+\Theta (L){\epsilon}_{t},$$
where
$$\Phi(L) = \Phi_0 - \sum_{j=1}^{p} \Phi_j L^j$$
and
$$\Theta (L)={\Theta}_{0}+{\displaystyle \sum _{k=1}^{q}{\Theta}_{k}{L}^{k}}.$$
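As an illustration of the lag operator notation, the sketch below applies a matrix lag polynomial to a series and checks that, for a noise-free reduced-form VAR(1) path, Φ(L)y_{t} recovers the constant c. This is plain Python, not toolbox code. The function name `apply_lag_polynomial` and all numeric values are assumptions chosen for the example, with Φ_{0} set to the identity and no MA or regression terms.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def apply_lag_polynomial(coeffs, series, t):
    """Evaluate (A_0 + A_1 L + ... + A_p L^p) y_t = sum_j A_j y_{t-j}.

    coeffs[j] is the matrix multiplying L^j, series[t] is the vector y_t,
    and t must be at least p so that every lag exists.
    """
    total = [0.0] * len(series[t])
    for j, A_j in enumerate(coeffs):
        term = mat_vec(A_j, series[t - j])      # L^j y_t = y_{t-j}
        total = [s + x for s, x in zip(total, term)]
    return total

# Reduced-form VAR(1): Phi(L) = I - Phi_1 L (hypothetical coefficients).
I2 = [[1.0, 0.0], [0.0, 1.0]]
Phi1 = [[0.5, 0.1], [0.2, 0.3]]
neg_Phi1 = [[-a for a in row] for row in Phi1]
c = [0.5, -0.2]

# Build a noise-free path from the difference equation y_t = c + Phi_1 y_{t-1}.
y = [[1.0, 2.0]]
for _ in range(5):
    y.append([ci + x for ci, x in zip(c, mat_vec(Phi1, y[-1]))])

# With zero innovations, Phi(L) y_t equals c for every t >= 1.
lhs = apply_lag_polynomial([I2, neg_Phi1], y, t=3)
```

Note the sign convention: the AR coefficient on L^{j} enters the polynomial as the negative of Φ_{j}, whereas the MA coefficients enter Θ(L) with a positive sign.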
A multivariate AR polynomial is stable if
$$\det\left(I_n - \Phi_1 z - \Phi_2 z^2 - \cdots - \Phi_p z^p\right) \ne 0 \text{ for } \left|z\right| \le 1.$$
With all innovations equal to zero, this condition implies that the VAR process converges to its unconditional mean, (I_{n} – Φ_{1} – ... – Φ_{p})^{–1}c, as t approaches infinity (for more details, see [1], Ch. 2).
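For the special case n = 2 and p = 1, the stability condition can be checked by hand, because det(I_{2} – Φ_{1}z) ≠ 0 for |z| ≤ 1 is equivalent to every eigenvalue of Φ_{1} having modulus less than 1. The sketch below, in plain Python with hypothetical function names and coefficient values, tests that condition and then iterates the noise-free recursion to show that it settles at the fixed point μ solving (I_{2} – Φ_{1})μ = c. For general p, the same idea applies to the companion matrix of the AR polynomial, which this sketch does not implement.

```python
import cmath

def eigenvalues_2x2(A):
    """Eigenvalues of a 2-by-2 matrix from its trace and determinant."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

def is_stable_var1(Phi1):
    """True if every eigenvalue of Phi1 lies strictly inside the unit circle."""
    return all(abs(lam) < 1 for lam in eigenvalues_2x2(Phi1))

def noise_free_limit(c, Phi1, steps=500):
    """Iterate y_t = c + Phi1 y_{t-1} with all innovations set to zero."""
    y = [0.0, 0.0]
    for _ in range(steps):
        y = [c[i] + Phi1[i][0] * y[0] + Phi1[i][1] * y[1] for i in range(2)]
    return y

def fixed_point(c, Phi1):
    """Solve (I - Phi1) mu = c for a 2-by-2 system by Cramer's rule."""
    a, b = 1 - Phi1[0][0], -Phi1[0][1]
    d, e = -Phi1[1][0], 1 - Phi1[1][1]
    det = a * e - b * d
    return [(c[0] * e - b * c[1]) / det, (a * c[1] - d * c[0]) / det]

c = [0.5, -0.2]                          # hypothetical constant
Phi_stable = [[0.5, 0.1], [0.2, 0.3]]    # eigenvalues inside the unit circle
Phi_unstable = [[1.1, 0.0], [0.0, 0.5]]  # one eigenvalue outside

stable = is_stable_var1(Phi_stable)
unstable = is_stable_var1(Phi_unstable)
limit = noise_free_limit(c, Phi_stable)
mu = fixed_point(c, Phi_stable)
```

For these values the fixed point works out to μ = (1, 0), which differs from c itself whenever Φ_{1} is nonzero.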
A multivariate MA polynomial is invertible if
$$\det\left(I_n + \Theta_1 z + \Theta_2 z^2 + \cdots + \Theta_q z^q\right) \ne 0 \text{ for } \left|z\right| \le 1.$$
This condition implies that the pure VAR representation of the VMA process is stable (for more details, see [1], Ch. 11).
A VARMA model is stable if its AR polynomial is stable. Similarly, a VARMA model is invertible if its MA polynomial is invertible.
Models with exogenous inputs (for example, VARMAX models) have no well-defined notion of stability or invertibility. An exogenous input can destabilize a model.
Incorporate feedback from exogenous predictors, or study their linear associations with the response series, by including a regression component in a multivariate linear time series model. In order of increasing complexity, applications that use such models include:
Modeling the effects of an intervention, which implies that the exogenous series is an indicator variable.
Modeling the contemporaneous linear associations between a subset of exogenous series and each response. Applications include CAPM analysis and studying the effects of prices of items on their demand. These applications are examples of seemingly unrelated regression (SUR). For more details, see Implement Seemingly Unrelated Regression and Estimate Capital Asset Pricing Model Using SUR.
Modeling the linear associations between contemporaneous and lagged exogenous series and the response as part of a distributed lag model. Applications include determining how a change in monetary growth affects real gross domestic product (GDP) and gross national income (GNI).
Any combination of SUR and the distributed lag model that includes the lagged effects of responses, also known as simultaneous equation models.
The general equation for a VARX(p) model is
$$y_t = c + \delta t + \beta x_t + \sum_{j=1}^{p} \Phi_j y_{t-j} + \epsilon_t$$
where
x_{t} is an m-by-1 vector of observations from m exogenous variables at time t. The vector x_{t} can contain lagged exogenous series.
β is an n-by-m matrix of regression coefficients. Row j of β contains the regression coefficients in the equation of response series j for all exogenous variables. Column k of β contains the regression coefficients among the response series equations for exogenous variable k. This equation shows the system with an expanded regression component:
$$\left[\begin{array}{c} y_{1,t} \\ y_{2,t} \\ \vdots \\ y_{n,t} \end{array}\right] = c + \delta t + \left[\begin{array}{c} x_{1,t}\beta(1,1) + \cdots + x_{m,t}\beta(1,m) \\ x_{1,t}\beta(2,1) + \cdots + x_{m,t}\beta(2,m) \\ \vdots \\ x_{1,t}\beta(n,1) + \cdots + x_{m,t}\beta(n,m) \end{array}\right] + \sum_{j=1}^{p} \Phi_j y_{t-j} + \epsilon_t.$$
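The row and column roles of β can be seen in a small numeric sketch. The plain-Python example below uses n = 2 responses and m = 3 exogenous variables, computes the regression component βx_{t}, and then evaluates the right-hand side of a VARX(1) equation without the innovation. All values and the function name `varx1_conditional_mean` are hypothetical.

```python
def mat_vec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def varx1_conditional_mean(c, delta, beta, Phi1, t, x_t, y_prev):
    """Return c + delta*t + beta*x_t + Phi1*y_{t-1}, that is, y_t before eps_t is added."""
    regression = mat_vec(beta, x_t)
    ar_part = mat_vec(Phi1, y_prev)
    return [ci + di * t + r + a
            for ci, di, r, a in zip(c, delta, regression, ar_part)]

# Row j of beta holds the coefficients in the equation of response j;
# column k holds the coefficients of exogenous variable k across equations.
beta = [[1.0, 0.0, -0.5],    # response 1 uses x_1 and x_3
        [0.0, 2.0,  0.0]]    # response 2 uses only x_2
x_t = [2.0, 1.0, 4.0]

regression_component = mat_vec(beta, x_t)   # [1*2 + 0*1 - 0.5*4, 0*2 + 2*1 + 0*4]

mean_t = varx1_conditional_mean(
    c=[0.5, -0.2], delta=[0.0, 0.01], beta=beta,
    Phi1=[[0.5, 0.1], [0.2, 0.3]], t=10, x_t=x_t, y_prev=[1.0, 0.0])
```

A zero entry in β removes the corresponding exogenous variable from that response equation, which is how a predictor can be associated with only some of the responses.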
This workflow describes how to analyze multivariate time series by using Econometrics Toolbox VAR model functionality. If you believe the response series are cointegrated, use VEC model functionality instead (see vecm).
Load, preprocess, and partition the data set. For more details, see Multivariate Time Series Data Formats.
Create a varm model object that characterizes a VAR model. A varm model object is a MATLAB® variable containing properties that describe the model, such as AR polynomial degree p, response dimensionality n, and coefficient values. varm must be able to infer n and p from your specifications; n and p are not estimable. You can update the lag structure of the AR polynomial after creating a VAR model, but you cannot change n.
varm enables you to create these types of models:
Fully specified model in which all parameters, including coefficients and the innovations covariance matrix, are numeric values. Create this type of model when economic theory specifies the values of all parameters in the model, or you want to experiment with parameter settings. After creating a fully specified model, you can pass the model to all object functions except estimate.
Model template in which n and p are known values, but all coefficients and the innovations covariance matrix are unknown, estimable parameters. Properties corresponding to estimable parameters are composed of NaN values. Pass a model template and data to estimate to obtain an estimated (fully specified) VAR model. Then, you can pass the estimated model to any other object function.
Partially specified model template in which some parameters are known, and others are unknown and estimable. If you pass a partially specified model and data to estimate, MATLAB treats the known parameter values as equality constraints during optimization, and estimates the unknown values. A partially specified model is well suited to these tasks:
Remove lags from the model by setting their coefficients to zero.
Associate a subset of predictors with a response variable by setting to zero the regression coefficients of predictors you do not want in the response equation.
For more details, see Create VAR Model.
For models with unknown, estimable parameters, fit the model to data. See Fitting Models to Data and estimate.
Find an appropriate AR polynomial degree by iterating steps 2 and 3. See Select Appropriate Lag Order.
Analyze the fitted model. This step can involve:
Determining whether response series Granger-cause other response series in the system (see gctest).
Calculating impulse responses, which are forecasts based on an assumed change in an input to a time series.
Forecasting from the VAR model by obtaining either minimum mean square error forecasts or Monte Carlo forecasts.
Comparing model forecasts to holdout data. For an example, see VAR Model Case Study.
Your application does not have to involve all the steps in this workflow, and you can iterate some of the steps. For example, you might not have any data, but want to simulate responses from a fully specified model.
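The overall shape of the workflow can be sketched outside the toolbox. The plain-Python example below is a stand-in for the steps above, not a reproduction of what varm, estimate, or the forecasting functions do internally: it generates artificial data from a known bivariate VAR(1) in place of a real data set, holds out the last observations, fits the model by equation-by-equation ordinary least squares (an assumed estimation method chosen for simplicity), and compares minimum mean square error forecasts to the holdout sample. All function names and parameter values are hypothetical.

```python
import math
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]          # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_var1_ols(Y):
    """Equation-by-equation OLS for y_t = c + Phi1 y_{t-1} + eps_t."""
    n = len(Y[0])
    X = [[1.0] + Y[t - 1] for t in range(1, len(Y))]      # regressors [1, y_{t-1}]
    k = n + 1
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    c_hat, Phi_hat = [], []
    for eq in range(n):
        Xty = [sum(row[i] * Y[t][eq] for t, row in enumerate(X, start=1))
               for i in range(k)]
        coef = solve_linear(XtX, Xty)
        c_hat.append(coef[0])
        Phi_hat.append(coef[1:])
    return c_hat, Phi_hat

def forecast_var1(c, Phi1, y_last, horizon):
    """Iterate the fitted model forward with future innovations set to zero."""
    path, y = [], list(y_last)
    for _ in range(horizon):
        y = [ci + sum(p * v for p, v in zip(row, y)) for ci, row in zip(c, Phi1)]
        path.append(y)
    return path

# Step 1: obtain data (simulated here from assumed parameters) and partition it.
rng = random.Random(1)
c_true, Phi_true = [0.5, -0.2], [[0.5, 0.1], [0.2, 0.3]]
Y = [[1.0, 0.0]]
for _ in range(2009):
    prev = Y[-1]
    Y.append([c_true[i] + Phi_true[i][0] * prev[0] + Phi_true[i][1] * prev[1]
              + rng.gauss(0.0, 1.0) for i in range(2)])
estimation, holdout = Y[:2000], Y[2000:]

# Steps 2 and 3: choose the structure (n = 2, p = 1) and fit the unknown parameters.
c_hat, Phi_hat = fit_var1_ols(estimation)

# Step 5: forecast over the holdout period and measure forecast error per series.
forecasts = forecast_var1(c_hat, Phi_hat, estimation[-1], horizon=len(holdout))
rmse = [math.sqrt(sum((f[i] - h[i]) ** 2 for f, h in zip(forecasts, holdout))
                  / len(holdout)) for i in range(2)]
```

Lag order selection (step 4) would repeat the fitting step for several values of p and compare the fits. The sketch omits it because the data-generating order is known here.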
[1] Lütkepohl, H. New Introduction to Multiple Time Series Analysis. Berlin: Springer, 2005.