The thing that impresses me about multivariate forecasting models is that you need forecasts for all the driver or predictor variables in order to develop an unconditional forecast of the target variable.
As a consequence, a univariate model can sometimes perform better for forecasting purposes.
The following Monte Carlo simulation in an Excel spreadsheet speaks to this issue, and also to the benefits of combining forecasts.
The multivariate forecast model is
y = b1x1+b2x2+b3x3.
The target or dependent variable is labeled y and the predictors are x1, x2, and x3. Typically, such models are developed by ordinary least squares (OLS) regression, with data structures of the following form.
Here we have 35 “observations” on y and the predictor variables x. The first 30 are in-sample; the last five, bolded in red, are “out-of-sample.”
In other words, we plan to estimate an OLS regression over the first 30 observations, and test this multivariate regression relationship in predicting the out-of-sample values of y.
The dependent or target variable y in the simulation is generated with the “true” coefficients listed in the light blue box at the top of the exhibit. The estimated coefficients, shown in the pink area, are developed by regression.
Each of the three predictor variables is generated as a simple random walk, as shown in the light brown area to the far left. The error processes for these random walks are i.i.d. normal errors with zero mean and standard deviation 10, written here as N(0,10), where the second argument is the standard deviation rather than the variance.
The error process labeled ε in the right hand panel is distributed i.i.d. as a normal error with zero mean and standard deviation equal to 3.
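For purposes of classification, then, we could say this simulation involves a cointegrated relationship between y and predictor variables which are I(1); in other words, variables which reduce to a white noise process with a single differencing. The relationship is cointegrated because y minus the true linear combination of the predictors is just the stationary error ε.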
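For readers without the spreadsheet, the setup described above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the “true” coefficients in `TRUE_B` are hypothetical (the actual values are in the exhibit), the random walks are assumed to start from their first shock, and the regression is assumed to have no intercept, in line with the model equation.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility

N_OBS, N_FIT = 35, 30                 # 35 observations, first 30 used for estimation
TRUE_B = np.array([1.0, 2.0, 3.0])    # hypothetical "true" coefficients (assumption)

# Three independent random walks: cumulative sums of N(0, sd=10) shocks.
x = np.cumsum(rng.normal(0.0, 10.0, size=(N_OBS, 3)), axis=0)

# Target: linear combination of the predictors plus i.i.d. N(0, sd=3) noise.
eps = rng.normal(0.0, 3.0, size=N_OBS)
y = x @ TRUE_B + eps

# OLS without an intercept over the in-sample window (assumption: no constant term).
b_hat, *_ = np.linalg.lstsq(x[:N_FIT], y[:N_FIT], rcond=None)
print("estimated coefficients:", b_hat)
```

Because the regression error is small relative to the variation in the random walks, the estimated coefficients usually land close to the values used to generate y.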
OK, so now we are ready to compare univariate versus multivariate forecasts in this context.
First, note that all the predictors are simple random walks. Thus, since y is a weighted sum of these random walks (with the true coefficients as weights) plus the error process ε, y is a random walk with noise.
This makes forecasting the predictors or x-variables easy. The forecasts for the x are always optimally the previously observed value of the particular x variable.
The situation is a little more complicated for the target variable y. But, because the variances of the random walk predictors overwhelm the variance of the error process ε, it turns out that the previous value of y is approximately the optimal forecast of the current value of y, too.
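A short derivation may help show why the previous value of y is nearly optimal. The notation is mine rather than the spreadsheet's: u_{i,t} denotes the N(0,10) shock to the i-th random walk, and b_i are the true coefficients.

```latex
\begin{aligned}
x_{i,t} &= x_{i,t-1} + u_{i,t}, \qquad y_t = \sum_{i=1}^{3} b_i x_{i,t} + \varepsilon_t \\
\Delta y_t &= \sum_{i=1}^{3} b_i u_{i,t} + \varepsilon_t - \varepsilon_{t-1} \\
\operatorname{Var}(\Delta y_t) &= 100 \sum_{i=1}^{3} b_i^2 + 2 \cdot 9 \\
\operatorname{Cov}(\Delta y_t, \Delta y_{t-1}) &= -9 \\
\rho_1 &= \frac{-9}{100 \sum_{i=1}^{3} b_i^2 + 18}
\end{aligned}
```

If the coefficients are of order one, the first-order autocorrelation of the differenced series is very close to zero, so Δy behaves almost like white noise and y almost like a pure random walk.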
Adopting these procedures for forecasting the predictors x and a univariate model for y, we can generate a Monte Carlo simulation in an Excel spreadsheet so as to compare the mean square error (MSE) of the multivariate and univariate forecasts.
I use a simple Visual Basic (VB) program to recalculate all the random values in the spreadsheet 10,000 times, recording the number of times the MSE of the univariate forecast is less than the MSE of the multivariate forecast model.
I also register the number of times a combination of the univariate and multivariate forecast models beats the MSE of the multivariate model.
So the univariate forecast beats the multivariate forecast just a little less than 50 percent of the time: 4,919 times out of 10,000.
A weighted average of the univariate and multivariate forecast, however, produces a lower MSE than the multivariate forecast a little more than 50 percent of the time: 5,025 times out of 10,000.
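The experiment can be approximated outside Excel with a short Python sketch. Several details are my assumptions, not taken from the spreadsheet: the coefficients `true_b` are hypothetical, the combination uses equal weights (`w=0.5`), the forecasts are one-step-ahead over the five out-of-sample periods, and the regression has no intercept. The function names are illustrative.

```python
import numpy as np

def one_replication(rng, true_b, n_obs=35, n_fit=30, w=0.5):
    """Return out-of-sample MSEs (multivariate, univariate, combined) for one draw."""
    # Random-walk predictors with N(0, sd=10) shocks; y adds N(0, sd=3) noise.
    x = np.cumsum(rng.normal(0.0, 10.0, size=(n_obs, len(true_b))), axis=0)
    y = x @ true_b + rng.normal(0.0, 3.0, size=n_obs)

    # OLS (no intercept, by assumption) on the first n_fit observations.
    b_hat, *_ = np.linalg.lstsq(x[:n_fit], y[:n_fit], rcond=None)

    actual = y[n_fit:]
    # One-step-ahead forecasts: each x is forecast by its previous value,
    # and the univariate forecast of y is the previous value of y.
    f_multi = x[n_fit - 1:n_obs - 1] @ b_hat
    f_uni = y[n_fit - 1:n_obs - 1]
    f_combo = w * f_uni + (1.0 - w) * f_multi  # equal weights are an assumption

    def mse(forecast):
        return float(np.mean((actual - forecast) ** 2))

    return mse(f_multi), mse(f_uni), mse(f_combo)

def run_simulation(n_reps=10_000, seed=0, true_b=(1.0, 2.0, 3.0)):
    """Count how often the univariate and combined forecasts beat the multivariate MSE."""
    rng = np.random.default_rng(seed)
    true_b = np.asarray(true_b, dtype=float)
    uni_wins = combo_wins = 0
    for _ in range(n_reps):
        m_multi, m_uni, m_combo = one_replication(rng, true_b)
        uni_wins += m_uni < m_multi
        combo_wins += m_combo < m_multi
    return int(uni_wins), int(combo_wins)

if __name__ == "__main__":
    uni_wins, combo_wins = run_simulation()
    print(f"univariate beats multivariate: {uni_wins} of 10000")
    print(f"combination beats multivariate: {combo_wins} of 10000")
```

The counts depend on the assumed coefficients, weights, and seed, so they need not match the spreadsheet figures; the sketch is meant only to make the comparison procedure concrete.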
So, to put some flesh on these abstractions, it’s possible to explain the performance of a product in a market based on consumer incomes, prices of competing products, and other factors. However, this multivariate model may not always do as well in forecasting as simply looking back over the product’s own sales history for seasonal and other patterns.
The multivariate model, then, is useful in strategic planning and in exploring what-if scenarios.
Supplementing this model with a time series univariate model may improve forecasts.
On relevant literature, I’ve found several papers, and recommend,