Weighted Ranking Procedure for Combining Univariate Time Series Models

This paper extends the standard approach to combining forecasts by proposing weights based on ranking models by their forecast accuracy measures. These weights are motivated by the problems, pointed out in this study, associated with Akaike weights, equal weights, and the forecast from the 'best' model selected by the minimum AICc value. According to a selection criterion, five models were fitted to simulated datasets of two different sample sizes, n = 25 and n = 200. The results revealed that the mean squared forecast error (MSFE) of the combined forecast from the proposed weights (the weighted ranking procedure) was lower than that of all other approaches investigated in this study. Furthermore, the three combined forecast approaches consistently outperformed the forecast from the 'best' model selected by the minimum AICc. Thus, we recommend the use of the weighted ranking procedure in combining models.


INTRODUCTION
In time series analysis, one major interest is forecasting the future values of a series from a 'best' model. That is, before forecasting, one faces the challenge of choosing the 'best' model among a variety of candidate models. The procedures for selecting the 'best' model have several difficulties. Usually, one has to go through a series of evaluations in order to arrive at the 'best' model. The obvious difficulty is that there is no objective guideline for the choice of the size of the various tests involved in the selection procedures. Thus, little or nothing is known about the errors associated with these procedures (i.e., after conducting a series of evaluations).
Our preliminary analysis and the available literature indicate that the model preferred by a test or information criterion does not necessarily perform better than other competing models in terms of prediction risk. In addition, one major drawback of model selection is its instability. Zou et al. (2004) argued that, with a small or moderate number of observations, models close to each other are usually hard to distinguish and their model selection criterion values are usually quite close to each other. Thus, a slight change in the data may result in the choice of a different model.
The unstable nature of model selection criteria may often inflate the variability of estimation or prediction. The instability of model selection has been recognized in statistics and the related literature (Breiman, 1996). Chatfield (2004) and Hoeting et al. (1999) have used the term 'model uncertainty' to capture the difficulty in identifying the correct model. To address this challenge, forecast combination was introduced (Bates and Granger, 1969; Clemen, 1989), and various methods have since been proposed. Thus, when there is substantial uncertainty in finding the 'best' model, an alternative method, such as a combined model, should be considered.
Most often, the following weighting schemes have been distinguished: equal weights, Akaike weights, optimized and constrained weights, and Bayesian weights. However, studies of these schemes often focus on weighted forecasting rather than on deriving a weighted model. Thus, this study does two things relating to combining models: (1) it evaluates the two most commonly used conventional weighting schemes and proposes a new weighting procedure based on ranking competing models by their forecast accuracy measure, and (2) it derives the weighted model based on the newly proposed weights.

Weights for combining forecast
In this section, we briefly discuss the Akaike weights, equal weights and the proposed weights, called weighted ranking (WR).

Akaike weights
The Akaike weights were proposed by Akaike (1974). The procedure for computing the Akaike weights is as follows:

a. Ranking alternative models
The AICc values of the entire set of models are rescaled such that the model with the minimum AICc has a value of 0. Thus, information criterion values can be rescaled as simple differences,

\[ \Delta_i = \mathrm{AICc}_i - \min \mathrm{AICc} \qquad (1) \]

b. Defining the Akaike weights
The simple transformation $\exp(-\Delta_i/2)$, for $i = 1, 2, \ldots, R$, results in the (discrete) likelihood of model $i$ given the data, $L(g_i \mid x)$. These are functions in the same sense that $L(\theta \mid x, g_i)$ is the likelihood of the parameters $\theta$, given the data ($x$) and the model ($g_i$). These likelihoods are very useful; for example, the evidence ratio for model $i$ versus model $j$ is

\[ \frac{L(g_i \mid x)}{L(g_j \mid x)} = \frac{\exp(-\Delta_i/2)}{\exp(-\Delta_j/2)} \qquad (2) \]

It is convenient to normalize these likelihoods such that they sum to 1, as

\[ w_i = \frac{\exp(-\Delta_i/2)}{\sum_{r=1}^{R} \exp(-\Delta_r/2)} \qquad (3) \]

where $w_i$ is the Akaike weight and $R$ is the number of models in the entire set (Burnham and Anderson, 2002). Our interest is in the Akaike weights.
These weights make use of the Akaike information criterion. However, as pointed out earlier, the performance of information criteria is unstable at times: they sometimes select the wrong model, and a model selected as 'best' does not necessarily forecast well. Thus, weights estimated from any of these criteria are likely to inherit this instability, which makes the results unstable or uncertain.
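The rescaling and normalization steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `akaike_weights` and the AICc values in the usage lines are hypothetical.

```python
import math


def akaike_weights(aicc_values):
    """Return Akaike weights for a list of AICc values (one per model)."""
    best = min(aicc_values)
    # Step a: rescale so that the model with the minimum AICc has delta = 0.
    deltas = [value - best for value in aicc_values]
    # Step b: relative likelihood of each model, exp(-delta / 2).
    likelihoods = [math.exp(-0.5 * delta) for delta in deltas]
    # Normalize so that the weights sum to 1.
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


# Hypothetical AICc values for two competing models.
weights = akaike_weights([100.0, 102.0])
print([round(w, 4) for w in weights])
```

Because the weights depend only on the differences in AICc, adding a constant to every AICc value leaves them unchanged.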

Equal weights
Using equal weights to combine forecasts is a straightforward approach compared to other weighting approaches. Here, the forecast values from the entire set of models are averaged using any kind of mean (e.g., trimmed, arithmetic, etc.).
One basic problem with this weighting approach is that the forecast performance of different models is not the same; some models outperform others. Thus, it is conceptually wrong to treat the forecast performance of all models as equal.

Proposed weights
To overcome the above problems, a weight whose estimation relies neither on information criteria nor on the assumption of equal forecast performance is desirable. We, therefore, propose a weight based on ranking the forecast or predictive performance of all competing models.
The basis of the weighted ranking procedure is that each competing model has the potential to predict the future values of a series relatively well, since the true model is unknown. Thus, we allow each model in the competing set to forecast. We then rank the models by their predictive performance: the model with the lowest forecast accuracy measure (i.e., the smallest forecast error) receives the highest rank, while the model with the highest forecast accuracy measure receives the lowest rank. The weighted ranking procedure is given below:
1. Fit a set of competing models to a dataset. Here, it is appropriate that the selected models be close to one another, meaning that the differences in the information criterion of the respective models should be small. Thus, the guideline for including a model in the set of competing models is that the information criterion difference between the model and the best model should be less than 4 (i.e., $\Delta_i < 4$).
2. Forecast with each model in the entire set based on the 'out-of-sample' or 'in-sample' data.
3. Compute the forecast accuracy measure (e.g., MSFE) of each model.
4. Rank the models in the entire set by their forecast accuracy measure. Thus, the model with the lowest forecast accuracy measure receives the highest rank.
5. Sum the ranks and divide each individual rank by the total of the ranks to get the corresponding model weight.
Thus, we can express the proposed weight as:

\[ w_i = \frac{\psi_i}{\sum_{k=1}^{s} \psi_k} \]

where $\psi_i$ is the rank of model $i$ according to its forecast accuracy measure, MSFE, and $\sum_{k=1}^{s} \psi_k$ is the sum of the ranks of the forecast accuracy measure, MSFE, over the entire set of models ($s$ = last model in the entire set).
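The ranking and normalization steps above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `rank_weights` and the MSFE values are hypothetical, and the handling of tied MSFE values (tied models share the average of their ranks) is an assumption, since the procedure does not say how ties are treated.

```python
def rank_weights(msfe_values):
    """Return weighted-ranking weights from a list of MSFE values.

    The model with the smallest MSFE receives the highest rank (s),
    the model with the largest MSFE receives rank 1, and each weight
    is the model's rank divided by the sum of all ranks.
    """
    s = len(msfe_values)
    # Order model indices from largest MSFE (worst) to smallest (best).
    order = sorted(range(s), key=lambda i: msfe_values[i], reverse=True)
    ranks = [0.0] * s
    pos = 0
    while pos < s:
        end = pos
        # Assumption: tied models share the average of their ranks.
        while end + 1 < s and msfe_values[order[end + 1]] == msfe_values[order[pos]]:
            end += 1
        shared_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    total = sum(ranks)
    return [rank / total for rank in ranks]


# Hypothetical MSFE values for three models.
print(rank_weights([2.0, 1.0, 3.0]))
```

With three models and no ties, the ranks are 1, 2 and 3, so the weights are 1/6, 2/6 and 3/6, with the largest weight going to the model with the smallest MSFE.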

Formula for combining forecast values
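For period $h + 1$, each model contributes its forecast value multiplied by its weight; for the last model in the set, for example:

Model $R$: $F_{h+1} \to F_{h+1,R} \times w_R$

In general, the weighted forecast value for each model is $F_{h+1,i} \times w_i$, $i = 1, 2, 3, \ldots, R$, where $i$ is the respective model. Thus, the overall forecast value for a particular period is:

\[ F_{h+1} = \sum_{i=1}^{R} F_{h+1,i} \, w_i \]

where $R$ is the number of models in the entire set.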
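As a small worked illustration of the combination formula, the sketch below forms the weighted sum of the individual forecasts for each period. The function name `combine_forecasts`, the forecast values and the weights are hypothetical.

```python
def combine_forecasts(model_forecasts, weights):
    """Combine per-model forecasts into one forecast per period.

    model_forecasts: one list of forecast values per model, all of equal length.
    weights: one weight per model (expected to sum to 1).
    """
    if len(model_forecasts) != len(weights):
        raise ValueError("need exactly one weight per model")
    n_periods = len(model_forecasts[0])
    if any(len(forecasts) != n_periods for forecasts in model_forecasts):
        raise ValueError("all models must forecast the same periods")
    # For each period, sum the weighted forecast values over the models.
    return [
        sum(w * forecasts[t] for forecasts, w in zip(model_forecasts, weights))
        for t in range(n_periods)
    ]


# Two hypothetical models forecasting two periods ahead.
combined = combine_forecasts([[10.0, 12.0], [20.0, 22.0]], [0.25, 0.75])
print(combined)
```

The same function covers all three weighting schemes discussed here, since equal weights, Akaike weights and weighted-ranking weights differ only in the `weights` argument.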

Combining ARIMA models
In the available literature, researchers are usually interested in combining forecast values rather than combining models. Thus, in this study we present an appropriate way of combining models.

Univariate ARMA (p,q) model
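A time series $\{x_t;\ t = 0, \pm 1, \pm 2, \ldots\}$ is ARMA ($p$,$q$) if it is stationary and

\[ x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q} \]

with $\phi_p \neq 0$ and $\theta_q \neq 0$, where $\varepsilon_t$ is white noise. The parameters $p$ and $q$ are called the autoregressive and the moving average orders, respectively.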

Weighted ARIMA model
Thus, the ARIMA (p,d,q) model in equation (6) can be combined with appropriate weights, where $\hat{\theta}_{j,h}$ denotes the estimator of $\theta_j$ based on model $g_h$.
Similarly, we can define the drift or intercept as in (8) and (9).
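One way to read the weighted parameter estimates is as a weighted average, over the models, of each coefficient's estimate. The sketch below illustrates that reading only. It assumes that a coefficient absent from a model contributes zero to the average, which the text does not state, and the function name `weighted_parameters`, the coefficient labels and all numbers are hypothetical.

```python
def weighted_parameters(estimates, weights):
    """Weighted average of coefficient estimates across models.

    estimates: one dict per model, mapping a coefficient name to its estimate.
    weights: one weight per model.
    Assumption: a coefficient missing from a model is treated as 0 there.
    """
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per model")
    # The combined model carries every coefficient that appears in any model.
    names = sorted({name for est in estimates for name in est})
    return {
        name: sum(w * est.get(name, 0.0) for est, w in zip(estimates, weights))
        for name in names
    }


# Two hypothetical models: one with an AR(1) term, one with AR(1) and MA(1) terms.
combined = weighted_parameters(
    [{"ar1": 0.5}, {"ar1": 0.3, "ma1": 0.2}],
    [0.6, 0.4],
)
print(combined)
```

Under this assumption, the combined model takes the largest autoregressive and moving average orders found in the set, with coefficients that differ from those of any single model.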

Simulation study for combining models
Data were simulated from a true model using R software. Several models were then fitted to the data. However, since the true model is known, we exclude it from the set of competing models.
Here, we consider all models as approximations of the true model. The justification is that, with real-world data, the true model of a dataset is not known. Thus, without the true model, the model with the minimum information criterion is considered the best model.
Datasets were generated for samples of N = 35 (considered a small sample) and N = 210 (considered a large sample). A 'seed' was set so that the same dataset was reproduced for the large sample size. However, for the purpose of cross-validation, 10 data points were removed from each sample. Thus, the sample sizes for model fitting were n = 25 and n = 200. The rationale behind these two different sample sizes is that we want to know how the proposed method performs as the sample size increases.
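The seeded simulation and the hold-out of the last 10 observations can be sketched as below. The study used R and does not state the true model in this section, so the ARIMA(1,1,0) generator, its coefficient, the seed value and the function names here are purely illustrative assumptions.

```python
import random


def simulate_arima_110(n_total, phi=0.5, seed=2024):
    """Simulate a hypothetical ARIMA(1,1,0) series: AR(1) differences, then cumulate."""
    rng = random.Random(seed)  # fixed seed so the same data can be reproduced
    series, level, diff = [], 0.0, 0.0
    for _ in range(n_total):
        diff = phi * diff + rng.gauss(0.0, 1.0)  # AR(1) on the first differences
        level += diff                            # integrate once (d = 1)
        series.append(level)
    return series


def holdout_split(series, n_holdout=10):
    """Split a series into a fitting sample and the last n_holdout points."""
    if not 0 < n_holdout < len(series):
        raise ValueError("n_holdout must be between 1 and len(series) - 1")
    return series[:-n_holdout], series[-n_holdout:]


small_fit, small_holdout = holdout_split(simulate_arima_110(35))
large_fit, large_holdout = holdout_split(simulate_arima_110(210))
print(len(small_fit), len(small_holdout), len(large_fit), len(large_holdout))
```

The held-out points are never used for fitting, so forecast errors computed on them are out-of-sample.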

Akaike weights
Based on the guideline indicated earlier, five models were identified and fitted to both datasets, of sample sizes n = 25 and n = 200. The results of all necessary computations are given in Table 1.

*is the 'best' model
In Table 1, the weight of the 'best' model (which is model 1, (1, 1, 0)) is quite similar to that of model 2, and even its differences from the weights of models 3, 4, and 5 are considerably small. In other words, all the models in the entire set look good as approximations of the true model. Thus, in this case, it is not appropriate to consider only the 'best' model for inference and neglect equally good models. This is the basis of combining forecasts or models. In Table 2, the weight of the 'best' model is relatively high compared to those of models 2, 4, and 5, but the difference from model 3 is not large.
Since the difference between the weight of the 'best' model and the weights of the other competing or alternative models is very small, instead of using only the 'best' model for inference, it is appropriate to add the other four alternative models to construct a single weighted model (composite model) for proper inference. This will lead to an increase in precision.

Equal weight
There are five (5) models in the entire set, thus the equal weight will be 0.2 for each model.

Proposed weight (weighted ranking procedure)
Based on the selection criterion indicated earlier, five models were identified and fitted to both datasets, of sample sizes n = 25 and n = 200. It should be noted that several models were fitted to the data but, for lack of space, we report only the models that meet the selection criterion. We obtained the forecast values of each model for 10 horizons. The out-of-sample forecast accuracy measure, the mean square forecast error (MSFE), was computed for each model. The MSFE values of the individual models were then ranked from lowest to highest. The results are given in Tables 3 and 4.

*'best' model in terms of forecast performance
In Table 4, the distributions of weights are also very different from the Akaike weights.
Here, the 'best' model (i.e., (2,1,1)) selected by the Akaike information criterion was ranked as the third 'best' model in terms of forecasting performance by the ranking weight. Thus, we expect our weighted ranking approach to outperform the Akaike weights.
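For reference, the MSFE used to rank the models is the average of the squared differences between the held-out observations and the corresponding forecasts. A minimal sketch, with a hypothetical function name `msfe` and made-up numbers:

```python
def msfe(actual, forecast):
    """Mean square forecast error over the forecast horizons."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("actual and forecast must be non-empty and of equal length")
    squared_errors = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical held-out values and forecasts for three horizons.
print(msfe([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

In this example only the third forecast is off, by 2, so the MSFE is 4/3.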

Overall performance of approaches
The purpose of this section is to compare, across different horizons, the performance of combined forecasts using Akaike weights, equal weights, and our proposed weights with the individual forecast obtained from the 'best' model selected by the AICc. Here, we define horizon as the distance from one observation in time to the other. Mean square forecast errors (MSFE) for sample size n = 25 are given in Table 5. Mean square forecast errors (MSFE) for sample size n = 200 are given in Table 6. In Tables 5 and 6, the weighted ranking procedure has the lowest MSFE, indicating that our proposed weighted ranking procedure outperforms the other weighted combined forecasts (i.e., equal weights and Akaike weights) and the individual forecast obtained from the 'best' model selected by the minimum AICc. The second-best approach is the equal weighting method, followed by the Akaike weighting method and, lastly, the 'best' model selected by the minimum AICc; this is depicted in Figure 1. Such performance indicates the dominance of combined forecasts.
Generally, in all the approaches, MSFE increases with the forecast horizon, and this is evident in Figure 1. This confirms that time series forecasts do not perform well at long horizons; thus, it is always appropriate to forecast at shorter horizons. It should be noted that the smaller the MSFE value, the better the model.

Fig 1. Performance of weighted approaches and best model as horizon increases
MinAICc is the MSFE from the 'best' model selected by minimum AICc, Wequal is the MSFE from equal weights, Wakaike is the MSFE from Akaike weights and Wrank is the MSFE from weighted ranking.

Combining models using Weighted Ranking procedure
The available literature often neglects the aspect of combining models. Thus, in this section, we show how to combine models using the weighted ranking procedure. It should be noted that the approach can be applied with any other weighting technique. We illustrate the technique using the models fitted when the sample size is n = 200.

Weighted parameter estimates
The computation of the weighted parameter estimates of the models is presented in Table 7.
It should be noted that, when n = 25, the 'best' model was (1,1,0); however, it did not make it into the set of competing models when n = 200, according to the selection criterion.

CONCLUSION
The problems or challenges associated with the Akaike weights, equal weights for combining forecasts, and the forecast from a single 'best' model selected by the minimum AICc value are pointed out in this study. Thus, an approach that can minimize or handle these challenges will be useful. We, therefore, proposed a weight based on a ranking procedure to combine forecasts. The results revealed that the MSFE of the combined forecast from the proposed weight (the weighted ranking procedure) was lower than that of all other approaches investigated in this study. Furthermore, the three combined forecast approaches consistently outperformed the forecast from the 'best' model selected by the minimum AICc. We, therefore, recommend the use of the weighted ranking procedure for combining models.
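In the rescaled differences $\Delta_i$ used for the Akaike weights, $\mathrm{AICc}_i$ is the AICc of each alternative model and minAICc is the AICc of the best model, i.e., the model with the minimum AICc. The $\Delta_i$ allow a ranking of the models from best to worst; the larger the $\Delta_i$, the less plausible is model $i$.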
Applying equations (8) and (9), we obtain the following weighted parameter estimates for the autoregressive and moving average parameters, respectively.
Note that the values in parentheses are the respective standard errors. Thus, the combined or weighted model is ARIMA (3,1,2), but with different parameter estimates. It is defined as:

Table 3. Deriving the weighted ranking procedure when n = 25
*'best' model in terms of forecast performance
In Table 3, the distributions of weights are very different from the Akaike weights. Here, the 'best' model (i.e., (1, 1, 0)) selected by the Akaike information criterion was the third 'best' model in terms of forecasting performance. This confirms the fact that the 'best' model does not always give a better forecast.

Table 5. Mean Square Forecast Error (MSFE) for sample n = 25
MinAICc is the MSFE from 'best' model selected by minimum AICc, Wequal is the MSFE from equal weights, Wakaike is MSFE from Akaike weights and Wrank is the MSFE from weighted ranking.