ASYMPTOTICS FOR OUT OF SAMPLE TESTS OF CAUSALITY1

BY MICHAEL W. MCCRACKEN

NOVEMBER 8, 1999

HEADNOTE

This paper presents analytical and numerical evidence concerning out of sample tests of causality. The relevant environment is one in which the relative predictive ability of two nested parametric regression models is of interest. Results are provided for three statistics: a regression-based statistic suggested by Granger and Newbold (1977), a t-type statistic comparable to those suggested by Diebold and Mariano (1995) and West (1996), and an F-type statistic akin to Theil's U (1966). Since the limiting distributions under the null are nonstandard, tables of asymptotically valid critical values are provided. The null limiting distributions indicate that overfit models should predict poorly and that the Principle of Parsimony should be applied judiciously. Power calculations under a local alternative provide some guidance on the choice of test statistic and the percentage of the sample withheld for predictive evaluation.

Keywords: causality, forecast evaluation, hypothesis testing, model selection. JEL categories: C12, C32, C52, C53.

Department of Economics, Louisiana State University, 2107 CEBA, Baton Rouge LA, 70803; mmccrac@unix1.sncc.lsu.edu.

1. INTRODUCTION

EVALUATING A TIME SERIES model's ability to forecast is one method of determining its usefulness. Tegene and Kuchler (1994), Swanson and White (1995), Huh (1996), Diebold and Kilian (1997), and Sullivan, Timmermann and White (1998) are a few examples of applications that have determined the appropriateness of a model based on its ability to predict Out-Of-Sample (OOS). When using this methodology a model is determined to be valuable if the resulting forecast errors are deemed small relative to some loss function. Typically this loss function is mean squared error (MSE), though others such as mean absolute error (MAE) and directional accuracy have been used by Leitch and Tanner (1991) and Breen, Glosten and Jagannathan (1989) respectively. This OOS methodology is in contrast to traditional methods (like the classical F-test reported by most statistical software) that determine the quality of the predictive model based on its ability to replicate or "fit" the same realizations used to estimate the model.

This paper contributes to recent analytical work on OOS model evaluation, specifically that of West (1996), by providing asymptotic results for OOS tests that compare the predictive ability of two nested models when parameters are estimated. Null limiting distributions are derived for three commonly used tests: a regression-based test for equal MSE proposed by Granger and Newbold (1977), a similar t-type test commonly attributed to either Diebold and Mariano (1995) or West (1996), and an F-type test similar in spirit to Theil's U (1966) but perhaps closer to in-sample likelihood ratio tests. Since the limiting distributions of the former two tests are identical they will be referenced simultaneously as "OOS-t" tests; the latter test will be referenced as an "OOS-F" test. The limiting null distributions of both the OOS-t and OOS-F tests are non-standard. Each can be written as functions of stochastic integrals of quadratics of Brownian Motion. The distributions bear some resemblance to those in Andrews (1993) but are distinct. Tables are provided in order to facilitate the use of these distributions.
A limited collection of analytical and numerical results regarding the local power of these tests is also provided. Monte Carlo evidence on the finite sample size and power of these tests and an empirical example can be found in Clark and McCracken (1999).

There are a number of interesting implications of the asymptotics under the null and under the local alternative. First, the null asymptotics provide a simple method of constructing asymptotically valid tests of OOS predictive ability between two nested models. A test can be conducted by simply consulting the provided tables of estimates of asymptotically valid critical values.

In addition, the null asymptotics have implications for the Principle of Parsimony and overfitting when OOS predictive ability is the objective. Assuming for the moment that the predictive model is linear, we know that in-sample predictive ability improves deterministically with the number of extraneous regressors. The results of this paper show that OOS quite the contrary is true: the probability that the unrestricted model has lower predictive ability than the restricted model is increasing in the number of extraneous regressors. This result is particularly intriguing in the context of comparing the predictive ability of the random walk and economic models of asset movements. Meese and Rogoff (1983, 1988), Wolff (1987), Chinn and Meese (1995), and Berkowitz and Giorgianni (1999) are a few examples of such horse races.

Finally, the local alternative results indicate that the choice of sample split and the number of extraneous parameters in the unrestricted model jointly determine whether the OOS-t or OOS-F test is more powerful. The OOS-F has greater local power when the post-sample size is small relative to the in-sample size and when the number of extraneous parameters is small. As more of the sample is used for post-sample evaluation, or when the number of extraneous parameters is large, the OOS-t tends to be more powerful. The choice of optimal sample split is less clear and is left to Section four.

The remainder of the paper proceeds as follows. Section two introduces the OOS methodology and provides a brief literature review. The review focuses on uses of the OOS methodology to date and potential applications of the results contained in this paper. Section three and its subsections provide notation, assumptions, theorems and corollaries regarding the null asymptotics. Section four provides a limited set of results regarding the power of both the OOS-t and OOS-F tests under a sequence of local alternatives. Section five concludes and suggests directions for future research. All proofs are presented within the Appendix.

2. LITERATURE REVIEW

Recent work by West (1996) has shown how to construct asymptotically valid OOS tests of predictive ability when forecasts are generated using estimated parameters. He provides conditions under which t-type statistics will be asymptotically standard normal. These conditions extend and clarify previous analytical work on OOS hypothesis testing by Mincer and Zarnowitz (1969), Chong and Hendry (1986), Hoffman and Pagan (1989), Fair and Shiller (1989, 1990), Mizrach (1992), and Diebold and Mariano (1995). More recent work on OOS hypothesis testing has also developed. Corradi, Swanson and Olivetti (1999) extend previous work to allow for the comparison of non-nested models when cointegrating relationships exist.
McCracken (1999a) provides analytical results for constructing OOS tests when the test involves non-smooth functions such as the indicator or absolute value function. Harvey, Leybourne and Newbold (1998) construct tests of equal predictive ability in the presence of ARCH. Diebold, Gunther and Tay (1997) discuss the evaluation of density forecasts. White (1999) shows how to use the bootstrap to compensate for data-snooping biases when comparing the predictive ability of a large number of models. Sanchez (1998) tests for unit roots using OOS forecast errors.

One test that is considered by several of these authors is whether two models have the same predictive ability with respect to some loss function L(.). Diebold and Mariano (1995) suggest a test of the form

(2.1)    $P^{-0.5}\hat{\Omega}^{-0.5}\sum_{t=R}^{T}[L(\hat{u}_{1,t+1}) - L(\hat{u}_{2,t+1})]$

where $\hat{\Omega}$ denotes a consistent estimate of the limiting variance of $P^{-0.5}\sum_{t=R}^{T}[L(\hat{u}_{1,t+1}) - L(\hat{u}_{2,t+1})]$; $T + 1 = P + R$; P is the number of OOS observations and R is the number of observations used to construct the first forecast. In (2.1), $\hat{u}_{i,t+1}$, $i = 1,2$, is the forecast error from model i observed at time t+1 associated with a forecast made at time t. When each forecast is constructed using $\hat{\beta}_{i,t}$, an estimator of the parameters associated with model i, West (1996) shows that the test statistic in (2.1) can be asymptotically standard normal.

For this to be true, however, some conditions must hold. One condition is that the estimate of the limiting variance, $\hat{\Omega}$, must be appropriately constructed. The estimated limiting variance should account not only for sample variation, heteroskedasticity and serial correlation but also for the fact that forecasts are typically made using parametric models for which the parameters are unknown. If the parameters are estimated using the random data then they too are random and may contribute to the limiting variance.2 West (1996) provides the correct limiting variance. The correct limiting variance is sometimes complicated, but West and McCracken (1998) show that many OOS tests can be conveniently constructed using regression-based tests. These artificial regressions are similar to in-sample diagnostic tests suggested by, for example, Pagan and Hall (1983).

Unfortunately, it is easy to overlook the most crucial condition for limiting normality. For the OOS-t test in (2.1) to be limiting standard normal, $\Omega$ must be positive. If $\Omega$ is zero then $P^{-0.5}\sum_{t=R}^{T}[L(\hat{u}_{1,t+1}) - L(\hat{u}_{2,t+1})] \to_p 0$. The problem is more pronounced when we look back at (2.1). This OOS-t statistic involves $\hat{\Omega}^{-0.5}$ as well. Using results in West (1996), we know that $\hat{\Omega}$ converges in probability to zero when $\Omega = 0$. If we put the two items together it is unclear whether the OOS-t statistic is degenerate, divergent or bounded in probability. What is clear is that the limiting distribution will not be standard normal.

This last problem may seem unlikely but it is in fact quite common. Using results in West (1996) one can easily show that $\Omega$ equals zero if the two parametric models are nested rather than non-nested. This has serious implications for OOS tests of causality and market efficiency for which the models are inherently nested. For example, in testing for a causal relationship between aggregate advertising expenditure and aggregate consumption expenditure, Ashley, Granger and Schmalensee (1980) construct an OOS-t statistic similar to that in (2.1).
Using a method suggested by Granger and Newbold (1977), they test for causality from advertising to consumption using the t-statistic (and standard normal tables) associated with $\hat{\alpha}$ from the OLS estimated artificial regression

(2.2)    $\hat{u}_{1,t+1} - \hat{u}_{2,t+1} = \alpha(\hat{u}_{1,t+1} + \hat{u}_{2,t+1}) + \text{error term}.$

In (2.2), $\hat{u}_{1,t+1}$ is the one-step ahead forecast error from an autoregressive model for aggregate consumption and $\hat{u}_{2,t+1}$ is the one-step ahead forecast error from a bivariate autoregressive model for both aggregate consumption and aggregate advertising. Ashley (1981) uses similar methods to test for causality between the consumer price index and its dispersion across different consumption categories. Park (1990) tests for causal relationships in cattle markets using (2.2).

There are also a number of potential applications to tests for the predictability of asset returns and, more generally, tests of market efficiency. If the null is that asset returns form a martingale difference sequence then any parametric model for asset returns nests the null model (i.e. a constant zero conditional mean function) within it. For example, Mark (1995) constructs OOS-t statistics of the form (2.1) to test the null that changes in exchange rates are unpredictable. If this is the case then the MSE using the null zero conditional mean model should equal the MSE using a linear model that depends upon certain fundamentals. Kilian (1997) constructs similar tests but under the null that changes in exchange rates form a martingale difference sequence around a nonzero unconditional mean. It should be mentioned that Mark (1995) and Kilian (1997) each use the bootstrap when conducting their hypothesis tests in these Long-Horizon regressions. They do not reference standard normal tables per se. However, the reason they use the bootstrap is that they are concerned about finite sample size distortions relative to the (claimed) limiting standard normal distribution of (2.1). The results in Section three of this paper indicate that those distortions may also arise because the limiting distribution is not standard normal, nor is it well approximated by a standard normal distribution.

This paper focuses on constructing asymptotically valid OOS tests that compare the predictability of two nested parametric models. Three different statistics are considered. The first two, those from (2.1) and (2.2), are OOS-t statistics. The third is an OOS-F statistic of the form

(2.3)    $P\,\dfrac{(P^{-1}\sum_{t=R}^{T} L(\hat{u}_{1,t+1})) - (P^{-1}\sum_{t=R}^{T} L(\hat{u}_{2,t+1}))}{\hat{c}}.$

In (2.3), $\hat{c}$ converges in probability to a certain normalizing constant c. For the moment it suffices to focus on the most useful case in which $\hat{c} = P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2$ and $\hat{u}_{i,t+1}^2 = L(\hat{u}_{i,t+1})$, $i = 1,2$. As in the descriptions of (2.1) and (2.2), the restricted and unrestricted models are referenced using the indexes i = 1 and i = 2 respectively.

The OOS-F statistic is not generally used in the form (2.3). For example, Leitch and Tanner (1991) simply report Theil's U without providing any formal test that the unrestricted model has a lower MSE than the random walk. Others, including Mark and Sul (1998) and Ashley (1998), test the null of equal MSE by bootstrapping the ratio of the restricted MSE to the unrestricted MSE. Another group, including Urbain (1989), Pesaran and Timmermann (1995), and Swanson and White (1997a), tests for equal predictive ability using model selection criteria.
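To fix ideas, the following sketch shows how the three statistics might be computed from two series of one-step ahead forecast errors when squared-error loss is used. It is written in Python with NumPy; the function name, the no-intercept form of the Granger-Newbold regression, and the choice $\hat{c} = P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2$ are illustrative assumptions of mine, not prescriptions from the papers cited above.

```python
import numpy as np

def oos_statistics(e1, e2):
    """OOS-t (2.1), Granger-Newbold t (2.2) and OOS-F (2.3) under squared-error loss.

    e1 : forecast errors of the restricted model, length P
    e2 : forecast errors of the unrestricted (nesting) model, length P
    """
    P = len(e1)
    d = e1**2 - e2**2                        # loss differentials L(u1) - L(u2)

    # (2.1) with Omega-hat = P^{-1} * sum(d^2), i.e. no serial correlation correction
    oos_t = d.sum() / np.sqrt((d**2).sum())

    # (2.2): regress (e1 - e2) on (e1 + e2) without intercept; t-stat on alpha-hat
    x, y = e1 + e2, e1 - e2
    alpha = (x @ y) / (x @ x)
    resid = y - alpha * x
    gn_t = alpha / np.sqrt((resid @ resid) / ((P - 1) * (x @ x)))

    # (2.3) with c-hat equal to the unrestricted OOS MSE
    oos_f = d.sum() / (e2**2).mean()

    return oos_t, gn_t, oos_f
```

Because the models are nested, none of these statistics would be compared with standard normal tables; the relevant critical values are the nonstandard ones developed in Section three.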
When model selection criteria are used, a statistic similar to (2.3) is constructed, but one that includes penalty terms like those associated with well-known information criteria (e.g. AIC, SBC, Hannan-Quinn, etc.).

I introduce the OOS-F for a number of reasons. First, it is essentially Theil's U when the null model is a random walk but allows for a wider range of nested parameterizations. Also, given the limiting distribution results in Section three, there does not seem to be any need to include penalty terms; the OOS MSE is not decreasing in the number of extraneous parameters as is the case in-sample. Finally, it seems natural to use the OOS-F because it is a direct analog of the in-sample F-test.

3. THEORETICAL RESULTS

This section provides the null limiting distributions of both the OOS-t tests in (2.1) and (2.2) and the OOS-F test in (2.3). It does so in five subsections. Section 3.1 presents the basic environment while Section 3.2 presents the assumptions needed for the results in Section 3.3. Section 3.3 presents the limiting distribution of the OOS-t and OOS-F tests, first allowing a wide range of likelihood-type loss functions to measure predictive ability. Section 3.4 specializes the results to the leading case in which parameters are estimated by NLLS and MSE is used to measure predictive ability. Since the null limiting distributions are nonstandard, tables of asymptotically valid critical values are provided. Section 3.5 provides a discussion of the asymptotic results and their relation to the Principle of Parsimony.

3.1 Environment

Throughout it will be assumed that there is an observed sample $\{X_s\}_{s=1}^{T+1}$ of length T + 1. Using that sample, the researcher wishes to compare the one-step ahead predictive ability of two nested parametric regression models. This structure allows for many of the relevant applications discussed in Section two. It does eliminate applications like those in Diebold and Nason (1990), Swanson and White (1997b) and McCracken (1999b), who use local-regression, series-based and kernel-based nonparametric methods to estimate the regression function and construct forecasts. The focus on one-step ahead forecasts rather than τ-step (τ > 1) is both substantive and for purposes of clarity. By limiting the discussion to one-step ahead forecasts I am able to derive results for a wide range of potential loss functions used to measure predictive ability. Asymptotics for multi-step forecasts are left to future research.

Given the pair of nested parametric regression models, two sequences of one-step ahead forecasts are constructed using one of three methods. These are referred to as the recursive, rolling and fixed sampling schemes. Within each of these schemes an initial in-sample portion of the data, of length R, is used to select the two nested models and estimate their respective model parameters. Using the chosen nested models and the estimated parameters, a sequence of P one-step ahead forecasts is then generated. See West (1996), West and McCracken (1998), McCracken (1999a), and Pesaran and Timmermann (1999) for more discussion on the use of these three schemes. A brief description is given below.

Pagan and Schwert (1990) use the recursive sampling scheme. Under this scheme a sequence of parametric forecasts is generated with updated parameter estimates. Specifically, at each time t = R,…,T the parameter estimate $\hat{\beta}_t$ depends explicitly on all information from s = 1,…,t.
If OLS is used to estimate the parameters from a linear model with regressors $Z_s$ and predictand $y_s$ then $\hat{\beta}_t = (t^{-1}\sum_{s=1}^{t} Z_s Z_s')^{-1}(t^{-1}\sum_{s=1}^{t} Z_s y_s)$. The first forecast for models i = 1,2 is then of the form $\hat{y}_{R+1}(\hat{\beta}_{i,R})$. The resulting forecast error is constructed as $\hat{u}_{i,R+1} = y_{R+1} - \hat{y}_{R+1}(\hat{\beta}_{i,R})$. For some loss function L(.) the loss associated with the first forecast is constructed as $L(\hat{u}_{i,R+1})$ and will usually be denoted as $L_{i,R+1}(\hat{\beta}_{i,R})$. The second forecast, $\hat{y}_{R+2}(\hat{\beta}_{i,R+1})$, is constructed similarly using observations s = 1,...,R+1. The forecast error and loss associated with the second forecast are constructed as for the first forecast. This process is iterated P times so that for each $t \in [R, T]$, the parameter estimates are based upon all data $s \in [1, t]$.

Swanson (1998) uses the rolling sampling scheme. Under this scheme a sequence of parametric forecasts, forecast errors and losses is constructed in much the same way as under the recursive scheme. What distinguishes the rolling from the recursive is its treatment of observations from the distant past. The rolling scheme uses only a fixed window of the past R observations. As t increases from R to T, older observations are not used in estimating the parameters. If OLS is used to estimate the parameters using regressors $Z_s$ and predictand $y_s$ then $\hat{\beta}_t = (R^{-1}\sum_{s=t-R+1}^{t} Z_s Z_s')^{-1}(R^{-1}\sum_{s=t-R+1}^{t} Z_s y_s)$. This implies that the first rolling forecast, $\hat{y}_{R+1}(\hat{\beta}_{i,R})$, forecast error, and loss are identical to those for the recursive scheme. The second rolling forecast, $\hat{y}_{R+2}(\hat{\beta}_{i,R+1})$, is constructed using only observations s = 2,...,R+1 to estimate the model parameters. This implies that the second rolling forecast, forecast error, and loss are distinct from those using the recursive scheme. The process is iterated P times such that for each $t \in [R, T]$ the parameter estimates are based upon all data $s \in [t - R + 1, t]$.

Ashley, Granger and Schmalensee (1980) use the fixed scheme. This method is distinct from the previous two in that the parameters are not updated with the introduction of new observations. Although this method may seem inefficient it is frequently used when the computational burden is large, such as when artificial neural networks are used to form forecasts (Kuan and Liu, 1995). Since the parameter vector is estimated only once, each of the P forecasts, $\hat{y}_{t+1}(\hat{\beta}_{i,R})$, uses the same parameter estimate.3 If OLS is used to estimate the parameters using regressors $Z_s$ and predictand $y_s$ then $\hat{\beta}_t = (R^{-1}\sum_{s=1}^{R} Z_s Z_s')^{-1}(R^{-1}\sum_{s=1}^{R} Z_s y_s)$. Hence for each one-step ahead forecast from time $t \in [R, T]$, the parameter estimate is based only upon data $s \in [1, R]$.

Using each of the two series of subsequent forecast errors, one from the nesting model and one from the nested model, a test statistic of the form in either (2.1), (2.2) or (2.3) is constructed. Based upon the value of this statistic one either fails to reject or rejects the null of equal predictive ability. The null and alternative can be stated as

(3.1.1)    $H_0: EL_{1,t}(\beta_1^*) \le EL_{2,t}(\beta_2^*)$ vs. $H_A: EL_{1,t}(\beta_1^*) > EL_{2,t}(\beta_2^*)$.

The alternative is one-sided rather than two-sided because the two models are nested by construction. Note that if a log-likelihood function is being used to evaluate predictive ability, (3.1.1) implies that L(.) is defined as the negative of that log-likelihood function.
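The three schemes differ only in the estimation window used at each forecast origin. As an illustration, here is a minimal sketch (Python with NumPy; the function name and interface are mine) that generates the P one-step ahead OLS forecast errors under each scheme:

```python
import numpy as np

def forecast_errors(y, Z, R, scheme="recursive"):
    """One-step ahead OLS forecast errors for predictand y and regressors Z.

    y : array of length T+1; Z : array of shape (T+1, k); R : in-sample size.
    Indexing is zero-based, so observation s of the paper is row s-1 here.
    At each forecast origin t = R,...,T the estimation window is
      recursive: s = 1,...,t;  rolling: s = t-R+1,...,t;  fixed: s = 1,...,R.
    """
    T1 = len(y)                     # T + 1 observations
    errors = []
    for t in range(R, T1):          # forecast targets R+1,...,T+1 of the paper
        if scheme == "recursive":
            lo, hi = 0, t
        elif scheme == "rolling":
            lo, hi = t - R, t
        elif scheme == "fixed":     # re-solved each step only for symmetry
            lo, hi = 0, R
        else:
            raise ValueError(scheme)
        beta, *_ = np.linalg.lstsq(Z[lo:hi], y[lo:hi], rcond=None)
        errors.append(y[t] - Z[t] @ beta)   # u-hat = y - y-hat
    return np.array(errors)         # length P = T + 1 - R
```

In practice the rows of Z would hold variables known at the forecast origin (e.g. lags), so that $Z_t'\hat{\beta}$ is a genuine one-step ahead forecast.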
3.2 Assumptions

Before discussing specific assumptions, some notation is required. For the loss function $L_{i,t}(\beta_i)$ let $h_{i,t}(\beta_i)$ denote $\partial L_{i,t}(\beta_i)/\partial\beta_i$ and $q_{i,t}(\beta_i)$ denote $\partial^2 L_{i,t}(\beta_i)/\partial\beta_i\partial\beta_i'$. For any matrix A with elements $a_{i,j}$ let $|A|$ denote $\max_{i,j}|a_{i,j}|$. For any (m×n) matrix A with column vectors $a_i$ let vec(A) denote the (mn×1) vector $[a_1', a_2', \ldots, a_n']'$. Without loss of generality let $\beta_2^* \equiv (\beta_{2,1}^{*\prime}, \beta_{2,2}^{*\prime})' = (\beta_{2,1}^{*\prime}, 0)' = (\beta_1^{*\prime}, 0)'$. Define a selection matrix $J \equiv (I_{k_1\times k_1}, 0_{k_1\times k_2})$ ($k_1$×k, k > $k_1$). Since the two models are nested we know that under the null, $L_{1,t}(\beta_1^*) = L_{2,t}(\beta_2^*)$, $Jh_{2,t} = h_{1,t}$ and $Jq_{2,t}J' = q_{1,t}$ for all t. The following assumptions are not intended to be necessary and sufficient, only sufficient.

ASSUMPTION 1: The parameter vectors $\beta_1^*$ and $\beta_2^*$ are estimated by minimizing the aggregate loss functions $\Lambda_{1,t}(\beta_1)$ and $\Lambda_{2,t}(\beta_2)$. For i = 1,2, and t = R,…,T, $\Lambda_{i,t}(\beta_i) = t^{-1}\sum_{j=1}^{t}L_{i,j}(\beta_i)$, $R^{-1}\sum_{j=t-R+1}^{t}L_{i,j}(\beta_i)$ and $R^{-1}\sum_{j=1}^{R}L_{i,j}(\beta_i)$ for the recursive, rolling, and fixed schemes respectively.

This first assumption provides two pieces of information. Analytically it states that the parameter estimates are of the form $\hat{\beta}_{1,t}$ = argmin $\Lambda_{1,t}(\beta_1)$ and $\hat{\beta}_{2,t}$ = argmin $\Lambda_{2,t}(\beta_2)$. This allows for both linear and nonlinear models estimated by OLS, NLLS, and maximum likelihood. The substantive part of the first assumption is that it requires the loss function used to estimate the parameters and the loss function used to measure predictive accuracy to be the same. An implication of this assumption is that if MSE is the measure of OOS predictive ability, parameters must be estimated using OLS, NLLS, or maximum likelihood under the additional assumption that the disturbances are normal. One benefit of Assumption 1 is that it otherwise does not place a restriction on the chosen loss function. For example, if parameters are estimated by minimizing the negative of a log-likelihood and then the negative of that log-likelihood is used to measure predictive ability, the limiting distribution is essentially the same as if the model had been estimated by OLS and then MSE was used to measure predictive ability.4

ASSUMPTION 2: For i = 1,2, (a) $\beta_i \in \Theta_i$, $\Theta_i$ compact, (b) $EL_{i,t}(\beta_i)$ is uniquely minimized at $\beta_i^* \in \Theta_i$ with $Eq_{i,t}$ nonsingular, (c) in an open neighborhood $N_i$ around $\beta_i^*$, and with probability one, $L_{i,t}(\beta_i)$ is twice continuously differentiable, admitting a mean value expansion $L_{i,t}(\beta_i) = L_{i,t}(\beta_i^*) + h_{i,t}'(\beta_i - \beta_i^*) + (0.5)(\beta_i - \beta_i^*)'q_{i,t}(\tilde{\beta}_i)(\beta_i - \beta_i^*)$ for some $\tilde{\beta}_i$ on the line between $\beta_i$ and $\beta_i^*$, (d) in the open neighborhood $N_i$, and for all t, there exist a positive constant $\varphi$ and a positive random variable $m_t$ such that $|q_{i,t}(\beta_i) - q_{i,t}(\beta_i^*)| \le m_t|\beta_i - \beta_i^*|^{\varphi}$ with $Em_t < \infty$ and $\varphi < \infty$, (e) $\sup_{\beta_i\in\Theta_i}|\Lambda_{i,t}(\beta_i) - EL_{i,t}(\beta_i)| \to_{a.s.} 0$.

Assumption 2 insures that the parameters are identified and are consistently estimated. It is directly comparable to Theorem (2.1) of Newey and McFadden (1994). The substantive component of this assumption is the requirement that the loss function be twice continuously differentiable. This allows for MSE and many log-likelihood type measures of predictive ability but eliminates applications, like that of Weiss and Andersen (1984), that estimate the parameters using LAD and then use MAE as the measure of predictive ability.
ASSUMPTION 3: Let $U_t \equiv [h_{2,t}', \mathrm{vec}(h_{2,t}h_{2,t}' - Eh_{2,t}h_{2,t}')', \mathrm{vec}(q_{2,t} - Eq_{2,t})']'$. (a) $EU_t = 0$, (b) $U_t$ is uniformly $L^8$ bounded, (c) for some $8 > d > 2$, $U_t$ is strong mixing with coefficients of size $-8d/(8-d)$, (d) $\lim_{T\to\infty} T^{-1}E\sum_{j=1}^{T}U_jU_j' < \infty$.

The conditions in Assumption 3 differ from those in, say, West (1996) because the models are nested rather than non-nested. If the models are non-nested then the OOS-t statistics in (2.1) and (2.2) can be asymptotically standard normal, and hence one needs to make assumptions sufficient for the application of a central limit theorem. West (1996) and West and McCracken (1998) use a central limit theorem derived by Wooldridge and White (1989). In this paper, the limiting distributions are comprised of functions of stochastic integrals of quadratics of Brownian Motion. Hence we require conditions sufficient for the joint weak convergence of partial sums, and averages of these partial sums, to Brownian Motion and stochastic integrals of Brownian Motion. Hansen (1992) provides sufficient conditions for just such a situation. The details of Assumption 3 above are directly comparable to those for Theorems (2.1) and (3.1) in Hansen (1992).

ASSUMPTION 4: (a) $Eh_{2,t}h_{2,t}' = cEq_{2,t} \equiv cB_2^{-1}$ for a constant c, (b) $E(h_{2,t}\,|\,h_{2,t-j}, q_{2,t-j}, j = 1,2,\ldots) = 0$.

The reasons for imposing Assumption 4 are much the same as for Assumption 1. In order to insure that the limiting distribution does not depend upon the underlying data generating process, additional conditions must be imposed on the loss function L. The first is that the loss function has the property that the expected outer product of the score is proportional to the expected hessian. Moreover, that constant of proportionality must be positive and finite. If the loss function L(.) is the negative of a log-likelihood then that constant is one.

The need for the constant c arises from the fact that MSE is the most common measure of predictability. If the disturbances from the parametric linear regression model $y_t = Z_{2,t}'\beta_2^* + u_{2,t}$ are i.i.d. normal and conditionally homoskedastic with variance $\sigma_u^2$ then the OLS estimates of $\beta_2^*$ are numerically identical to those estimated using the log-likelihood. That is not the same as saying that $h_{2,t}$ and $q_{2,t}$ do not depend on whether you use MSE or the log-likelihood as your measure of predictive ability. For example, if we use OLS to estimate the parameters then $h_{2,t}^{(OLS)} = -2u_{2,t}Z_{2,t}$, $q_{2,t}^{(OLS)} = 2Z_{2,t}Z_{2,t}'$ and hence $Eh_{2,t}^{(OLS)}h_{2,t}^{(OLS)\prime} = 4\sigma_u^2 EZ_{2,t}Z_{2,t}' \ne 2EZ_{2,t}Z_{2,t}' = Eq_{2,t}^{(OLS)}$. Similarly, if we minimize a negative log-likelihood, $h_{2,t}^{(MLE)} = -\sigma_u^{-2}u_{2,t}Z_{2,t}$, $q_{2,t}^{(MLE)} = \sigma_u^{-2}Z_{2,t}Z_{2,t}'$ and hence $Eh_{2,t}^{(MLE)}h_{2,t}^{(MLE)\prime} = \sigma_u^{-2}EZ_{2,t}Z_{2,t}' = Eq_{2,t}^{(MLE)}$. This difference generates the need for the constant c. For a more detailed discussion see Section 3.4.

ASSUMPTION 5: $\lim_{T\to\infty} P/R = \pi$, $0 < \pi < \infty$; $\lambda \equiv (1+\pi)^{-1}$.

This final assumption introduces the means by which the asymptotics are achieved. As in Hoffman and Pagan (1989), West (1996), and White (1999), the limiting distributions are derived by imposing a slightly stronger condition than simply that the sample size T becomes arbitrarily large. The additional condition is that both the number of in-sample (R) and OOS (P) observations become arbitrarily large at the same rate.
This insures that the parameters estimated in-sample and certain OOS averages are both consistent estimators of their population level analogs.

3.3 Asymptotics Under the Null

This section presents the null limiting distributions of the OOS-t statistic in (2.1) and the OOS-F statistic in (2.3). Since the limiting distributions are non-standard, tables of critical values are provided that can be used to test the null in (3.1.1). We will return to the Granger-Newbold statistic from (2.2) in Section 3.4, where the loss function is specialized to the case of MSE.

There are two main components of the OOS-t and OOS-F statistics in (2.1) and (2.3). Both test statistics depend upon $\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]$. A second component, $\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]^2$, arises in the OOS-t statistic from (2.1). This latter component is a denominator term that was originally designed to estimate the limiting variance of $P^{-0.5}\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]$ which, since the models are nested, is equal to zero. To see how these components affect the OOS-t and OOS-F statistics let's rewrite (2.1) and (2.3):

(3.3.1)    OOS-t $= P^{-0.5}\hat{\Omega}^{-0.5}\sum_{t=R}^{T}[L(\hat{u}_{1,t+1}) - L(\hat{u}_{2,t+1})] = \dfrac{\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]}{(\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]^2)^{0.5}}$

(3.3.2)    OOS-F $= P\,\dfrac{(P^{-1}\sum_{t=R}^{T}L(\hat{u}_{1,t+1})) - (P^{-1}\sum_{t=R}^{T}L(\hat{u}_{2,t+1}))}{\hat{c}} = \dfrac{\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]}{\hat{c}}$.

We can see from (3.3.1) and (3.3.2) that the OOS-t and OOS-F are somewhat related. They differ in that the OOS-t has a denominator component that the OOS-F does not have. Notice that since the forecasts are one-step ahead I am assuming that $\hat{\Omega} = P^{-1}\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]^2$ and hence one is not using a serial correlation consistent covariance matrix.5 I emphasize this case because it is the most common. Clark (1999) and Harvey, Leybourne and Newbold (1998) consider OOS tests when serial correlation is of concern.

To gain some intuition as to how these two components contribute to the limiting distributions, consider the following three lemmas. In the following, for i = 1,2, define $H_i(t)$ as $t^{-1}\sum_{s=1}^{t}h_{i,s}$, $R^{-1}\sum_{s=t-R+1}^{t}h_{i,s}$ and $R^{-1}\sum_{s=1}^{R}h_{i,s}$ for the recursive, rolling and fixed schemes respectively. Also, define $B_1$ such that $Eh_{1,t}h_{1,t}' = cEq_{1,t} \equiv cB_1^{-1}$. For the matrices C and A defined in Lemma 3.1, let $c^{-0.5}A'CB_2^{0.5}h_{2,t} = \tilde{h}_{2,t}$ and $c^{-0.5}A'CB_2^{0.5}H_2(t) = \tilde{H}_2(t)$.

LEMMA 3.1: (a) Let $-J'B_1J + B_2 = M$ and $B_2^{-0.5}MB_2^{-0.5} = Q$; then Q is idempotent. (b) Let A be a (k×$k_2$) matrix with $I_{k_2\times k_2}$ on the upper ($k_2$×$k_2$) block and zeroes elsewhere. There exists a symmetric orthonormal matrix C such that $Q = CAA'C$.

LEMMA 3.2: $\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})] = c[\sum_{t=R}^{T}(T^{0.5}\tilde{H}_2(t))'(T^{-0.5}\tilde{h}_{2,t+1}) - (0.5)T^{-1}\sum_{t=R}^{T}(T^{0.5}\tilde{H}_2(t))'(T^{0.5}\tilde{H}_2(t))] + o_p(1)$.

LEMMA 3.3: $\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]^2 = c^2T^{-1}\sum_{t=R}^{T}(T^{0.5}\tilde{H}_2(t))'(T^{0.5}\tilde{H}_2(t)) + o_p(1)$.

When deriving the limiting distribution of the in-sample F-statistic one first shows that the statistic can be written as a weighted quadratic of, say, a (k×1) limiting standard normal random vector. The second step is to show that the weighting matrix is idempotent of, say, rank $k_2 \le k$.
The final step is to apply the continuous mapping theorem and conclude that the limiting distribution is chi-square with $k_2$ degrees of freedom. The OOS statistics are roughly the same, at least in spirit. They are comprised of weighted quadratics of standard normal random vectors for which the weighting matrix is idempotent. The OOS statistics differ in that they depend upon weighted averages of an entire sample path of these quadratics. To see this, consider $T^{0.5}\tilde{H}_2(t) \equiv T^{0.5}t^{-1}\sum_{s=1}^{t}\tilde{h}_{2,s} = (T/t)(T^{-0.5}\sum_{s=1}^{t}\tilde{h}_{2,s})$ for the recursive scheme and let W(s) denote a ($k_2$×1) standard Brownian Motion on $[\lambda, 1]$ with W(0) = 0 and $\lambda \equiv (1+\pi)^{-1}$. Since the increments $\tilde{h}_{2,s}$ are conditionally homoskedastic vector martingale differences with unit variance and T/t is bounded by Assumption 5, $T^{0.5}\tilde{H}_2(t)$ is well approximated (weakly) by $s^{-1}W(s)$ for large enough T. The in-sample result can be thought of as just the endpoint of a similar, but distinct, sample path.

Lemmas 3.2 and 3.3 also clarify the potential need for scaling by the factor c. When the OOS-F statistic is of interest, Lemma 3.2 shows that the data generating process and loss function are irrelevant to the asymptotics but for the factor c. To eliminate that factor the OOS-F is defined relative to some consistent estimator $\hat{c}$ of c. The OOS-t statistics do not require a consistent estimator of c. The reason for this is that c arises in both the numerator and denominator of (3.3.1) and hence cancels. Recall that the denominator of (3.3.1) will be akin to the square root of the right-hand side of Lemma 3.3. Lemmas 3.2 and 3.3 provide the building blocks for the following theorems.

THEOREM 3.1: Let $\hat{c} \to_p c$. OOS-F $= \sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]/\hat{c} \to_d F_1$ where $F_1$ equals $\int_{\lambda}^{1}s^{-1}W'(s)dW(s) - (0.5)\int_{\lambda}^{1}s^{-2}W'(s)W(s)ds$, $\lambda^{-1}\{W(1) - W(\lambda)\}'W(\lambda) - (0.5)\pi\lambda^{-1}W'(\lambda)W(\lambda)$, and $\lambda^{-1}\int_{\lambda}^{1}\{W(s) - W(s-\lambda)\}'dW(s) - (0.5)\lambda^{-2}\int_{\lambda}^{1}\{W(s) - W(s-\lambda)\}'\{W(s) - W(s-\lambda)\}ds$ for the recursive, fixed and rolling schemes respectively.

THEOREM 3.2: OOS-t $= \sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]/(\sum_{t=R}^{T}[L_{1,t+1}(\hat{\beta}_{1,t}) - L_{2,t+1}(\hat{\beta}_{2,t})]^2)^{0.5} \to_d F_2$ where $F_2$ equals $F_1/[\int_{\lambda}^{1}s^{-2}W'(s)W(s)ds]^{0.5}$, $F_1/[\pi\lambda^{-1}W'(\lambda)W(\lambda)]^{0.5}$, and $F_1/[\lambda^{-2}\int_{\lambda}^{1}\{W(s) - W(s-\lambda)\}'\{W(s) - W(s-\lambda)\}ds]^{0.5}$ for the recursive, fixed and rolling schemes respectively.

There are a number of things to notice about Theorems 3.1 and 3.2. The first is that both the OOS-t and OOS-F statistics are asymptotically pivotal. This permits the construction of estimates of asymptotically valid critical values without knowledge of the underlying data generating process. Given these critical values, one can conduct asymptotically valid tests for equal forecast accuracy between two nested parametric models. Tables of these critical values are provided and discussed in Section 3.4. A second fact to note is that the limiting distributions do not depend upon the choice of loss function L(.). So long as the parameters are estimated using the same loss function as is used to measure predictive ability, the loss function itself has no effect on the limiting distribution. That does not imply that finite sample size and power performance is invariant to the choice of loss function.
Though the null limiting distributions do not depend upon the loss function itself, the distributions are dependent upon two parameters. The first is the number of excess parameters $k_2$. We can see this in the dimension of the vector Brownian Motion W(s). It is easier to see if we rewrite $F_1$. Consider the recursive sampling scheme. If we let $W_i(s)$ denote the ith element of W(s) then

(3.3.3)    $F_1 = \sum_{i=1}^{k_2}\left[\int_{\lambda}^{1}s^{-1}W_i(s)dW_i(s) - (0.5)\int_{\lambda}^{1}s^{-2}W_i^2(s)ds\right]$.

This representation is useful for two purposes. First, it provides some insight into the effect that $k_2$ has on the mean of $F_1$. Taking expectations, and noting that each of the i = 1,…,$k_2$ summands is independently and identically distributed, it is straightforward to show that

(3.3.4)    $E(F_1) = -(0.5)k_2\ln(1+\pi)$ for the recursive scheme, $= -(0.5)k_2\pi$ for the rolling and fixed schemes.

Hence as $k_2$ increases we expect the distribution of the OOS-F statistic to drift into the negative orthant. This occurs because the first term in $F_1$ is mean zero for all $k_2$ while the second term is increasingly negative. See Section 3.4 for a discussion of this fact and its relevance to the Principle of Parsimony. A less important effect due to $k_2$ is on the variance of $F_1$. Since $F_1$ can be written as the sum of $k_2$ i.i.d. terms, we know that the variance is monotonically increasing in $k_2$. The effect of $k_2$ on $F_2$ is less clear. Since $F_2$ is nonlinear in its components it is difficult to analytically derive properties concerning its mean and variance. Numerical results suggest that the mean does become increasingly negative in $k_2$ but that the variance is relatively constant in $k_2$.

There is a second reason that the representation in (3.3.3) is useful. One of the assumptions in Section 3.2 was that $k_2$ is finite. We can heuristically see the need for that assumption by simply taking the limit of $F_1$ as $k_2$ goes to infinity: it diverges under the null. Hong and White (1995) show, in the context of series-based nonparametric regressions, that the in-sample F-statistic also diverges under the null as the number of series terms increases to infinity. They suggest a transformed version of the F-statistic that is asymptotically standard normal as the number of series terms increases to infinity. It seems that such an argument could be used for the OOS-F statistic as well. Such a proof is beyond the scope of this paper and is left for future research.

A second parameter, π, affects the null limiting distribution of both the OOS-t and OOS-F. It affects the limiting distributions in two ways. It directly affects the weights on each of the components of the statistics (recall that $\lambda = (1+\pi)^{-1}$). It also affects the range of integration on each of the stochastic integrals through λ. Since the parameter π enters both $F_1$ and $F_2$ nonlinearly, its effect on their distributions is less clear than it was for $k_2$. Looking at (3.3.4) we can say with certainty that the mean of $F_1$ decreases with π just as it did with $k_2$. Numerical results indicate that the variance of $F_1$ is also monotonically increasing in π for a fixed value of $k_2$. For $F_2$, numerical results suggest that the mean is decreasing and the variance is increasing in π, but to a lesser extent than for $F_1$.
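As a quick check of (3.3.4) for the recursive scheme, note that the stochastic-integral term in (3.3.3) is a mean-zero martingale and that $EW_i^2(s) = s$; a sketch of the calculation (the rolling and fixed cases follow analogously from Theorem 3.1) is

$E(F_1) = \sum_{i=1}^{k_2}\left[\,0 - (0.5)\int_{\lambda}^{1}s^{-2}EW_i^2(s)ds\,\right] = -(0.5)k_2\int_{\lambda}^{1}s^{-1}ds = -(0.5)k_2\ln(\lambda^{-1}) = -(0.5)k_2\ln(1+\pi).$

For example, with $k_2 = 5$ and π = 1 the recursive mean is $-2.5\ln 2 \approx -1.73$, while the rolling and fixed means equal $-2.5$.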
3.4 Null Asymptotics When MSE is the Measure of Predictive Ability

If one is interested in using MSE as the measure of predictive ability there are two loose ends remaining from Section 3.3. The first is that the limiting distribution of the OOS-t statistic from (2.2) has not been provided. This omission was deliberate: it places some added emphasis on the fact that the OOS-t from (2.1) can be applied to a wider range of measures of predictive ability than just MSE. The OOS-t in (2.2) is only applicable when MSE is the measure of predictive ability. The first loose end is alleviated by the following theorem.

THEOREM 3.3: Let $L_{i,t+1}(\hat{\beta}_{i,t}) = \hat{u}_{i,t+1}^2(\hat{\beta}_{i,t}) \equiv \hat{u}_{i,t+1}^2$ and define $a_{0,T} \equiv P^{-1}\sum_{t=R}^{T}[\hat{u}_{1,t+1}^2 - \hat{u}_{2,t+1}^2]$, $a_{1,T} \equiv P^{-1}\sum_{t=R}^{T}[\hat{u}_{1,t+1} - \hat{u}_{2,t+1}]^2$, $a_{2,T} \equiv P^{-1}\sum_{t=R}^{T}[\hat{u}_{1,t+1} + \hat{u}_{2,t+1}]^2$ and $a_{3,T} \equiv P^{-1}\sum_{t=R}^{T}[\hat{u}_{1,t+1}^2 - \hat{u}_{2,t+1}^2]^2$. Then $[a_{1,T}a_{2,T} - a_{0,T}^2]^{-0.5}P^{0.5}a_{0,T} - [a_{3,T}]^{-0.5}P^{0.5}a_{0,T} \to_p 0$.

Theorem 3.3 states that the two OOS-t statistics are asymptotically equivalent. Hence one can use the same critical values to construct asymptotically valid tests of equal predictive ability when using either of the tests.

Tables I - III relate to the OOS-t statistic. These were generated numerically using the limiting distribution in Theorem 3.2 and hence can be considered estimates of the true asymptotic critical values. The critical values are the 90th, 95th and 99th percentiles of 5000 independent draws from the distribution of $F_2$ for a given sampling scheme and value of $k_2$ and π. Generating these draws proceeded as follows. Weights that depend upon π were estimated in the obvious way using $\hat{\pi} = P/R$. The necessary $k_2$ Brownian Motions were simulated as random walks, each using an independent sequence of 10,000 i.i.d. $N(0, T^{-0.5})$ increments. The integrals were emulated by summing the relevant weighted quadratics of the random walks from the R+1st observation to the Tth observation. The random number generator was seeded so that all $k_2$ and π pairs and all sampling schemes use the same 5000 draws of $k_2$ sequences of 10,000 i.i.d. $N(0, T^{-0.5})$ increments.

A brief listing of critical values is provided in Tables I - III. Each table corresponds to either the recursive, rolling or fixed scheme. Within each table there are 330 critical values. Each of these corresponds to one permutation of three parameters: $k_2$ = {1, 2, 3,…, 9, 10}, π = {0.1, 0.2, 0.4,…, 1.0, 1.2,…, 2.0} and nominal size of the test taking the values {0.01, 0.05, 0.10}.6 Tables that allow for larger values of both $k_2$ and π are available from the author upon request.
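A minimal sketch of this simulation for the recursive scheme follows (Python with NumPy; the discretization, seeding and function name are mine and only approximate the procedure described above):

```python
import numpy as np

def simulate_null_draws(k2, pi, n_draws=5000, n_steps=10_000, seed=0):
    """Draws from the recursive-scheme null limits: 2*F1 (modified OOS-F) and F2 (OOS-t).

    Each Brownian Motion is emulated by a random walk with increments of
    standard deviation n_steps**-0.5; integrals over [lam, 1] are emulated
    by sums from the R+1st to the n_steps-th grid point, R = lam * n_steps.
    """
    rng = np.random.default_rng(seed)
    lam = 1.0 / (1.0 + pi)
    R = int(lam * n_steps)
    s = np.arange(1, n_steps + 1) / n_steps       # time grid on (0, 1]
    f1 = np.empty(n_draws)
    f2 = np.empty(n_draws)
    for j in range(n_draws):
        dW = rng.normal(0.0, n_steps**-0.5, size=(k2, n_steps))
        W = np.cumsum(dW, axis=1)                 # k2 emulated Brownian Motions
        # int s^-1 W'(s) dW(s): pair W at grid point i with the next increment
        term1 = np.sum(W[:, R:-1] / s[R:-1] * dW[:, R + 1:])
        # int s^-2 W'(s) W(s) ds
        term2 = np.sum(W[:, R:-1]**2 / s[R:-1]**2) / n_steps
        f1[j] = term1 - 0.5 * term2
        f2[j] = f1[j] / np.sqrt(term2)
    return 2.0 * f1, f2

# The 95th percentile of the first output for k2 = 1, pi = 0.4 should lie near
# the tabulated 5% critical value for the modified OOS-F quoted below (1.298),
# up to simulation error.
```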
The second loose end concerns the OOS-F test when MSE is the measure of predictive ability. Recall the discussion following Assumption 4 in Section 3.2. There we discussed how c is defined. Specifically, it was shown that if OLS is used to estimate the parameters and MSE is used to measure predictive ability then $Eh_{2,t}^{(OLS)}h_{2,t}^{(OLS)\prime} = 2\sigma_u^2 Eq_{2,t}^{(OLS)}$. This was in contrast to the case where the parameters are estimated by minimizing a negative log-likelihood and the same negative log-likelihood is used to measure predictive ability. In this latter case we know that $Eh_{2,t}^{(MLE)}h_{2,t}^{(MLE)\prime} = Eq_{2,t}^{(MLE)}$.

The constant c is intended to soak up any difference between the expected outer product of the score and the expected hessian determined by the choice of loss function. Assumption 4 defines c as $2\sigma_u^2$ when MSE is the measure of predictive ability. If so, we can consistently estimate c using $\hat{c} = 2(P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2)$.7

Unfortunately this is not the most commonly used normalization factor for this type of statistic. When the in-sample F-test is constructed, the denominator is the mean square error associated with the unrestricted regression. When Ashley (1998), Mark (1995), Kilian (1997) and others bootstrap versions of this statistic, the denominator term is $P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2$. When Pesaran and Timmermann (1995) and Swanson and White (1997a) use OOS information criteria (such as AIC, SBC, Hannan-Quinn, etc.) to compare the predictive ability of two nested models they are effectively normalizing by $P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2$. In these cases the OOS-F with $\hat{c} = 2(P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2)$ is not applicable. By modifying the definition of the OOS-F in accordance with these applications we have

(3.4.1)    (modified) OOS-F $= \dfrac{\sum_{t=R}^{T}[\hat{u}_{1,t+1}^2 - \hat{u}_{2,t+1}^2]}{P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2}$.

The limiting distribution of this statistic follows immediately from Theorem 3.1.

COROLLARY 3.1: Let $L_{i,t+1}(\hat{\beta}_{i,t}) = \hat{u}_{i,t+1}^2(\hat{\beta}_{i,t}) \equiv \hat{u}_{i,t+1}^2$; then $[P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2]^{-1}\sum_{t=R}^{T}[\hat{u}_{1,t+1}^2 - \hat{u}_{2,t+1}^2] \to_d 2F_1$.

Since MSE is the most heavily used measure of predictive ability I focus on the limiting distribution in Corollary 3.1 rather than that in Theorem 3.1. Tables IV - VI provide the critical values associated with constructing an asymptotically valid test of the null of equal MSE between two nested models using the modified OOS-F statistic in (3.4.1). Each table corresponds to either the recursive, rolling or fixed sampling scheme. The 330 values reported in Tables IV - VI correspond to the same permutations of $k_2$, π and nominal size of the test that were used in Tables I - III. More detailed tables are available from the author upon request.

It should be emphasized that Tables IV - VI cannot be directly applied to applications where a negative log-likelihood is used to measure predictive ability. They can, however, be used after a simple adjustment. If one is interested in using the OOS-F statistic in the form presented in Theorem 3.1, the critical values presented in Tables IV - VI can be used only after they have been divided by two. Suppose that the recursive scheme is used, $k_2$ = 1 and π = 0.4. If MSE is used to measure predictive ability, and Corollary 3.1 is applied, then the critical value associated with a 5% test of the null hypothesis is 1.298. If instead the negative log-likelihood associated with a normal random variate is used to measure predictive ability, and hence Theorem 3.1 is applied, the appropriate critical value is 0.649.

In each of Tables I - VI it is also useful to note that the critical values are not generally monotone in either $k_2$ or π. In Section 3.3 we discussed the fact that the means of the OOS-F and OOS-t statistics are monotone decreasing in both $k_2$ and π, and hence we expect the distributions to drift into the negative orthant as these parameters increase. That does not imply that the upper tails decrease monotonically. Consider the OOS-F statistic for a fixed value of π but allow $k_2$ to increase. As $k_2$ increases the mean of the distribution becomes increasingly negative while at the same time the variance increases. Put together, these two forces imply that the upper tail need not be monotonically decreasing in $k_2$.
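Pulling the pieces together, a 5% test under the recursive scheme with $k_2$ = 1 and π = 0.4 might look like the following sketch, which reuses the hypothetical forecast_errors helper from Section 3.1 on a toy data generating process of my own in which the null holds:

```python
import numpy as np

rng = np.random.default_rng(1)
T1, R = 1400, 1000                       # T+1 observations; pi = P/R = 400/1000 = 0.4
x = rng.normal(size=T1)                  # candidate predictor (irrelevant under this null DGP)
y = rng.normal(size=T1)                  # null DGP: x has no predictive content

Z1 = np.ones((T1, 1))                    # restricted model: constant only
Z2 = np.column_stack([np.ones(T1), x])   # unrestricted model: one extra regressor, k2 = 1

e1 = forecast_errors(y, Z1, R, scheme="recursive")
e2 = forecast_errors(y, Z2, R, scheme="recursive")

oos_f = np.sum(e1**2 - e2**2) / np.mean(e2**2)   # modified OOS-F, equation (3.4.1)
print(oos_f, "reject" if oos_f > 1.298 else "fail to reject")  # 5% c.v. quoted above
```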
3.5 Discussion of the Null Distributions: The Principle of Parsimony and Why Overfit Models Predict Poorly

The preceding two sections present the null limiting distributions of, and the critical values associated with, the OOS-t and OOS-F statistics. A quick glance at Tables I - VI indicates that the distributions are nonstandard; they are not well approximated by either the normal or chi-square distributions.

The density plots in Figures 1 - 4 are intended to provide some feel for the behavior of the distributions of $2F_1$ and $F_2$, corresponding to the (modified) OOS-F and OOS-t statistics respectively. In order to reduce the number of plots I focus exclusively on the recursive sampling scheme. Plots for the rolling and fixed schemes are qualitatively similar in shape. They do differ in location and scale. When the rolling and fixed schemes are used the statistics have heavier tails and drift into the negative orthant much more quickly than when the recursive scheme is used. For example, when $k_2$ = 20 and π = 50 the 95th percentiles associated with the (modified) OOS-F statistic are -64.018, -939.127, and -540.728 for the recursive, rolling and fixed schemes respectively.

Figure 1 is comprised of four plots. Each shows the effect on the density of $2F_1$ when π increases from 0.2 to 1.0 to 2.0 holding $k_2$ constant at 1, 2, 5, and 10. Figure 3 is the same but for $F_2$. In each figure and plot, as π increases the probability that the statistic is negative increases. This is particularly true for the (modified) OOS-F statistic in Figure 1. Figure 2 is comprised of four plots. Each shows the effect on the density of $2F_1$ when $k_2$ increases from 1 to 2 to 5 to 10 to 20 holding π constant at 0.2, 1, 2 and 50.0. Figure 4 is the same but for $F_2$. In each figure and plot, as $k_2$ increases the probability that the statistic is negative increases. Again, this is especially true for the (modified) OOS-F statistic in Figure 2.

These density plots and the associated percentiles indicate that the probability that both the OOS-F and OOS-t statistics are negative increases in both $k_2$ and π. For the moment focus on the (modified) OOS-F statistic. Algebraically this implies that under the null

(3.5.1)    $\lim_{T\to\infty}\mathrm{Prob}\left(\dfrac{\sum_{t=R}^{T}[\hat{u}_{1,t+1}^2 - \hat{u}_{2,t+1}^2]}{P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2} \le 0\right) = \lim_{T\to\infty}\mathrm{Prob}\left(P^{-1}\sum_{t=R}^{T}\hat{u}_{1,t+1}^2 \le P^{-1}\sum_{t=R}^{T}\hat{u}_{2,t+1}^2\right)$

increases in both $k_2$ and π.

What makes (3.5.1) interesting is its effect on model selection based upon OOS predictive performance. In-sample we know that, when parameters are estimated by NLLS, the MSE from a restricted parametric regression model must be numerically at least as large as the MSE from an unrestricted parametric regression model that nests the restricted model. One repercussion of this numerical ordering of MSE's is on the application of the Principle of Parsimony. Granger (1995): "If two models appear to fit the data equally well, choose the simpler model (that is the one involving the fewest parameters)." When in-sample predictive ability is of interest, the fact that the unrestricted MSE must be less than or equal to that for the restricted model places the burden of proof on the unrestricted model. For a researcher to feel confident that the unrestricted model is providing information beyond that contained in the restricted model, the unrestricted MSE must be "significantly" lower than the restricted MSE. If it is not significantly lower, the Principle of Parsimony says to choose the less parameterized model.

If OOS predictive ability is of interest, that logic no longer holds. OOS the unrestricted MSE can be less than or greater than the restricted MSE. To make matters worse, the plots in Figures 1 - 4 indicate that the probability in (3.5.1) increases with the number of extraneous parameters introduced in the unrestricted regression model. Moreover, it seems that this probability increases to one as either $k_2$ or π becomes arbitrarily large.
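A small Monte Carlo illustration of (3.5.1), reusing the hypothetical forecast_errors sketch from Section 3.1 (the experiment design is mine, not the paper's): under a null DGP, the fraction of replications in which the restricted model attains the lower OOS MSE should grow as extraneous regressors are added.

```python
import numpy as np

def prob_restricted_wins(k2, T1=600, R=300, n_rep=1000, seed=0):
    """Fraction of replications in which the restricted model has lower OOS MSE.

    The DGP satisfies the null: y is i.i.d. noise and the k2 extra
    regressors in the unrestricted model are extraneous.
    """
    rng = np.random.default_rng(seed)
    wins = 0
    for _ in range(n_rep):
        y = rng.normal(size=T1)
        X = rng.normal(size=(T1, k2))              # k2 extraneous regressors
        Z1 = np.ones((T1, 1))                      # restricted: constant only
        Z2 = np.column_stack([Z1, X])              # unrestricted adds X
        e1 = forecast_errors(y, Z1, R, scheme="recursive")
        e2 = forecast_errors(y, Z2, R, scheme="recursive")
        wins += np.mean(e1**2) < np.mean(e2**2)
    return wins / n_rep

# Comparing, say, prob_restricted_wins(1) with prob_restricted_wins(10) should
# show the probability rising with k2, in line with (3.5.1).
```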
This behavior of (3.5.1) implies that when OOS predictive ability is of interest, the burden of proof that is solely on the unrestricted model in-sample is also on the restricted model: the MSE from the restricted model must also be "significantly" lower than the MSE from the unrestricted model.8 It is for this reason that it is particularly important to apply tests of significance when comparing the OOS predictive ability of two nested models. Simply reporting the OOS MSE's from two nested models is insufficient. Consider the case in which the value of the OOS-F statistic is zero and hence the restricted and unrestricted models have "equal" predictive ability. The critical values from Tables IV - VI indicate that $k_2$ and π can always be chosen large enough that zero lies in the rejection region for a given nominal size of the test. If this is the case, we reject the null even though the OOS MSE's are the same.

Another related implication concerns the use of OOS information criteria to identify regression models. Swanson and White (1995, 1997a) have applied this methodology. There, and in other applications, the penalty terms used were those commonly associated with in-sample information criteria. In other words, the penalty terms were positive, additive, and increasing in $k_2$. This form of penalty term is intended to serve as a statistical mechanism for the Principle of Parsimony. But as mentioned above, it is not clear how the Principle of Parsimony should be applied in an OOS context. As Figures 1 - 4 indicate, for small $k_2$ and π the unrestricted model has a lower MSE than the restricted model a sizable percentage of the time. This holds even under the null. Hence it may be appropriate to use traditional information criteria for smaller values of $k_2$ and π. As $k_2$ and π increase, the restricted model tends to have a lower MSE under the null. In this case it is unclear why penalty terms would be necessary at all. Moreover, including penalty terms could potentially reduce the power of the model selection procedure by artificially deflating the measure of "predictive ability" associated with the unrestricted model, as measured by the information criterion. To eliminate such a problem it may even be the case that negative penalty terms are required in the construction of OOS information criteria, particularly for larger values of $k_2$ and π. In any event, it is not clear that traditional in-sample information criteria are appropriate in an OOS context. Development of a theory of model selection using OOS information criteria is left to future research.

It should be noted that most of the comments made above are based upon (3.5.1), which in turn relies upon Theorem 3.1. In Theorem 3.1 the results are only applicable to correctly specified regression functions. One cannot infer that the same would be true for misspecified models. It may very well be the case that a more heavily parameterized misspecified model has greater predictive ability than a less parameterized misspecified model. Extensions to misspecified models are left to future research.

4. ASYMPTOTICS UNDER A LOCAL ALTERNATIVE

The null asymptotics provide us with a basis for constructing asymptotically valid tests for equal OOS predictive ability between two nested models. If we use the appropriate critical values from Tables I - VI then we know that for large enough T both the OOS-t and OOS-F statistics will be well sized. However, these null distributions do not provide us with any rationale for choosing between the OOS-t and OOS-F.
The null distributions also do not provide us with any information on how to choose the sample split parameter, π, or the sampling scheme in order to maximize the power of the test. This section provides a limited set of evidence concerning the local power of both the OOS-t and OOS-F statistics for each of the recursive, rolling, and fixed sampling schemes and a limited range of values of $k_2$ and π. The evidence suggests that the recursive scheme is usually the most powerful among the three schemes. The evidence also suggests that which of the OOS-t and OOS-F statistics is more powerful depends jointly upon the values of $k_2$ and π. Clark and McCracken (1999) provide Monte Carlo evidence on the finite sample size and power performance of these statistics, and comparable tests of encompassing, when the nested model is autoregressive and the nesting model is a bivariate vector autoregression.

Rather than do a complete analysis of all possible local alternatives and parametric regression models, I focus on the most relevant application. I presume that one is interested in measuring predictive ability using MSE as the loss function and that the parameters of two nested linear parametric regression models are estimated using OLS. The null and local alternative models can then be specified as

(4.1)    $H_0: y_t = Z_{1,t}'\beta_1^* + u_t$ vs. $H_A: y_t = Z_{1,t}'\beta_1^* + T^{-0.5}c^{0.5}Z_{22,t}'\beta_{22}^* + u_t = Z_{2,t}'\beta_2^* + (c^{0.5}T^{-0.5} - 1)Z_{22,t}'\beta_{22}^* + u_t$

where $Z_{2,t} = (Z_{1,t}', Z_{22,t}')'$, $\beta_2^* = (\beta_1^{*\prime}, \beta_{22}^{*\prime})'$, $\beta_{22}^* \ne 0$ and $u_t$ is a conditionally homoskedastic martingale difference sequence with unconditional variance $\sigma_u^2$.

This alternative specification is chosen for two reasons. The first is that linear models are the parametric regression models of choice in many applications. Examples include Clarida and Taylor (1997) and Meese and Rogoff (1983), where OOS predictive ability is of interest. The second reason is to simplify the algebra involved with deriving the limiting distributions under the local alternative.

Define $\chi_3$ as $\int_{\lambda}^{1}s^{-1}W(s)ds$, $\lambda^{-1}\int_{\lambda}^{1}[W(s) - W(s-\lambda)]ds$ and $\pi W(\lambda)$ for the recursive, rolling and fixed sampling schemes respectively. Furthermore, define $\chi_2$ as the square of the denominator term in $F_2$ from Theorem 3.2. In the following, Assumptions 1 - 5 are maintained.

THEOREM 4.1: Let $L_{i,t+1}(\hat{\beta}_{i,t}) = \hat{u}_{i,t+1}^2(\hat{\beta}_{i,t}) \equiv \hat{u}_{i,t+1}^2$ and define a selection matrix $J_2 \equiv (0_{k_2\times k_1}, I_{k_2\times k_2})$ ($k_2$×k, k > $k_2$). Under the local alternative in (4.1), (modified) OOS-F $\to_d F_3$ and OOS-t $\to_d F_4$ where $F_3 = 2F_1 + \pi\lambda\beta_{22}^{*\prime}J_2B_2^{-0.5}QB_2^{-0.5}J_2'\beta_{22}^* - 2\beta_{22}^{*\prime}J_2B_2^{-0.5}CA[W(1) - W(\lambda)]$ and $F_4 = (0.5)F_3/[\chi_2 + \pi\lambda\beta_{22}^{*\prime}J_2B_2^{-0.5}QB_2^{-0.5}J_2'\beta_{22}^* - 2\beta_{22}^{*\prime}J_2B_2^{-0.5}CA\chi_3]^{0.5}$.

The first thing to note about Theorem 4.1 is that the statistics are not asymptotically pivotal. The local power of the tests depends upon the data generating process through $B_2$ and $\beta_{22}^*$. Under the null, $\beta_{22}^* = 0$ and $B_2$ was irrelevant; both affect the limiting distribution under the local alternative. This is important because it implies that any given set of power calculations using the results of Theorem 4.1 should be interpreted with care. Local power of the test for one data generating process does not imply comparable local power for other data generating processes. Moreover, Nelson and Savin (1990) show that local asymptotics may provide a poor approximation to true finite sample power.
With these caveats in mind, Tables VII - IX provide a brief list of local power characteristics. The calculations apply the same methods used to construct the critical values in Tables I - VI. The random number generator was seeded so that the random walks used to emulate Brownian Motion under the null were also used under the alternative. In this way much of the null numerical calculation used to generate 5000 draws of $F_1$ and $F_2$ was directly applied in generating 5000 draws of $F_3$ and $F_4$.

What distinguishes the two simulations is the need to construct the drift terms in $F_3$ and $F_4$. To do so a particular data generating process needed to be chosen. In order to simplify the presentation, the chosen data generating process was one for which the regressors are i.i.d. orthonormal and are of equal relevance to the conditional mean function, so that $\beta_2^* = (1,1,\ldots,1)'$. After this simplification the limiting distributions under the local alternative can be rewritten as

$F_3 = 2F_1 + \pi\lambda k_2 - 2\sum_{i=1}^{k_2}[W_i(1) - W_i(\lambda)]$

and

$F_4 = (0.5)F_3/[\chi_2 + \pi\lambda k_2 - 2\sum_{i=1}^{k_2}\chi_{i,3}]^{0.5}$

where $W_i$ and $\chi_{i,3}$ represent the ith components of W and $\chi_3$ respectively.

Tables VII - IX report the percentage of 5000 draws of $F_3$ and $F_4$ that were greater than the relevant critical values reported in Tables I - VI. In each table the local power of the OOS-t and OOS-F is reported for a range of π = (0.2, 1.0, 2.0, 50.0), $k_2$ = (1, 2, 5, 10, 20), and nominal sizes of the test (1%, 5%, 10%). For example, under the recursive scheme with $k_2$ = 2 and π = 1, 1400 of the 5000 draws of $F_3$ were greater than 1.802, and hence at a nominal size of 5% the local power of the (modified) OOS-F statistic is 28%. Similarly, at a nominal size of 5% the local power of the OOS-t statistic is 12.53%. Table VII relates to the recursive scheme, VIII to the rolling, and IX to the fixed.

In each of the three tables, and in panels A, B and C, it is usually the case that when both $k_2$ and π are smaller the OOS-F is more powerful than the OOS-t. As either $k_2$ or π becomes sufficiently large the OOS-t becomes more powerful. Hence, given a particular ($k_2$, π) pair, Tables VII - IX provide some guidance on the choice between the OOS-F and OOS-t statistics.

If we compare local power across Tables VII - IX we can also draw some conclusions on the choice of sampling scheme. Of the 60 possible ($k_2$, π, nominal size) comparisons among the three sampling schemes, when the OOS-t is used the recursive scheme is most powerful 57 times and the rolling 3 times. When the OOS-F is used the recursive is most powerful 39 times and the fixed 21 times. In this latter case, the fixed scheme is most powerful only when both $k_2$ and π are smaller. It therefore seems that when choosing a sampling scheme the recursive scheme should be the first choice unless both $k_2$ and π are small, in which case perhaps the fixed should be considered.

Deciding upon the optimal sample split parameter π is less clear. The sample split that maximizes the power of the test varies with the statistic, the sampling scheme and sometimes the number of excess parameters $k_2$. For the recursive scheme larger values of π (π = 50.0) are best when the OOS-t is used. For the OOS-F the optimal split is usually small (π = 0.2) when $k_2$ is small and moderate (π = 1.0) when $k_2$ is larger. For both the fixed and rolling schemes smaller values of π (π = 0.2) are best when the OOS-F is used. The two schemes differ when the OOS-t statistic is used, as described after the following sketch.
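The draws of $F_3$ underlying these power figures can be emulated along the same lines as the null simulation in Section 3.4. Here is a minimal sketch for the (modified) OOS-F under the recursive scheme (Python with NumPy; the names and discretization are mine):

```python
import numpy as np

def local_power_oos_f(k2, pi, crit, n_draws=5000, n_steps=10_000, seed=0):
    """Approximate local power of the modified OOS-F under the recursive scheme.

    Uses F3 = 2*F1 + pi*lam*k2 - 2*sum_i [W_i(1) - W_i(lam)] from Theorem 4.1
    under the orthonormal, equal-relevance simplification.
    crit : critical value from Tables IV-VI for this (k2, pi, nominal size).
    """
    rng = np.random.default_rng(seed)
    lam = 1.0 / (1.0 + pi)
    R = int(lam * n_steps)
    s = np.arange(1, n_steps + 1) / n_steps
    rejections = 0
    for _ in range(n_draws):
        dW = rng.normal(0.0, n_steps**-0.5, size=(k2, n_steps))
        W = np.cumsum(dW, axis=1)
        term1 = np.sum(W[:, R:-1] / s[R:-1] * dW[:, R + 1:])
        term2 = np.sum(W[:, R:-1]**2 / s[R:-1]**2) / n_steps
        f1 = term1 - 0.5 * term2
        drift = pi * lam * k2 - 2.0 * np.sum(W[:, -1] - W[:, R])  # W(1) - W(lam)
        rejections += (2.0 * f1 + drift) > crit
    return rejections / n_draws

# e.g. local_power_oos_f(2, 1.0, 1.802) should come out near the 28% reported
# for the recursive scheme in Table VII, up to simulation error.
```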
When the fixed scheme is used, power is highest at moderate sample splits (π = 1.0); when the rolling scheme is used, power is highest at slightly larger splits (π = 2.0).

5. CONCLUSION

This paper presents the null limiting distributions of three statistics commonly used to test for equal predictive ability between two nested models. The limiting null distributions of these statistics are non-standard. Numerically calculated critical values are provided so that asymptotically valid tests of equal predictive ability can be constructed.

The limiting distributions of these statistics are also presented under a particular sequence of local alternatives. Though limited, the results indicate that at smaller values of k2 and π the OOS-F is more powerful, but as k2 and π increase the OOS-t becomes more powerful. The results also indicate that the recursive scheme is generally the most powerful, though the fixed scheme is most powerful in particular circumstances. The numerical results shed little light on the optimal choice of the sample split parameter π: there are situations where power seems to be monotone in π, but often it is not. Steckel and VanHonacker (1993) note this type of nonlinear behavior.

Perhaps the most interesting results concern implications for the Principle of Parsimony and overfitting. In Section 3.5 it is shown that the probability that the MSE of an overparameterized model exceeds the MSE of a more parsimonious model is increasing in both the number of excess parameters in the overparameterized model and the percentage of the sample used for OOS prediction. This simple result implies that the common in-sample application of the Principle of Parsimony is sometimes inappropriate when applied OOS. It can also be interpreted as analytical evidence for why overfit models tend to (but do not always) exhibit poor OOS predictive ability (Diebold, 1998, p. 47).

A number of questions remain concerning the OOS predictive ability of nested models. As mentioned in Section 3.3, the assumptions rule out series-based (Swanson, 1996), local-linear (Diebold and Nason, 1990), and kernel-based (McCracken, 1999b) nonparametric estimation of the regression function. Since these are increasingly prevalent methods of constructing forecasts, it would be useful to develop tools that allow the application of the OOS-t and OOS-F statistics when nonparametric forecasts are used. This may be of particular use when one is interested in testing for market efficiency and hence is concerned with what Fama (1991) refers to as the joint-hypothesis problem.

Another potential extension is the development of OOS model selection criteria. The present paper considers using OOS predictive ability only as a means of choosing between two parametric models. In general there are situations in which one wishes to choose from among multiple models, as when the model is known to be autoregressive with unknown lag order. As discussed in Section 3.5, it is not clear that existing in-sample information criteria can be directly extended to the OOS environment. Other extensions include the application to nondifferentiable loss functions such as MAE, linex, and maximum score. Furthermore, it would be useful to extend the results so that the predictive models are potentially misspecified.
Finally, since it is not always the case that the loss function used to estimate the parameters is identical to the one used to measure predictive ability, it would be helpful to extend the results in Section three to allow for that possibility.

REFERENCES

AKAIKE, HIROTUGU (1969): "Fitting Autoregressive Models for Prediction", Annals of the Institute of Statistical Mathematics, 21, 243-247.
ANDREWS, DONALD W.K. (1993): "Tests for Parameter Instability and Structural Change with Unknown Change Point", Econometrica, 61, 821-856.
ASHLEY, RICHARD (1981): "Inflation and the Distribution of Price Changes Across Markets: A Causal Analysis", Economic Inquiry, 19, 650-660.
---------- (1998): "A New Technique for Postsample Model Selection and Validation", Journal of Economic Dynamics and Control, 22, 647-665.
----------, CLIVE W.J. GRANGER AND R. SCHMALENSEE (1980): "Advertising and Aggregate Consumption: An Analysis of Causality", Econometrica, 48, 1149-1167.
BERKOWITZ, JEREMY AND LORENZO GIORGIANNI (1999): "Long-Horizon Exchange Rate Predictability", Review of Economics and Statistics, forthcoming.
BREEN, WILLIAM, LAWRENCE R. GLOSTEN AND RAVI JAGANNATHAN (1989): "Economic Significance of Predictable Variations in Stock Index Returns", The Journal of Finance, 44, 1177-1189.
CHINN, MENZIE D. AND RICHARD A. MEESE (1995): "Banking on Currency Forecasts: How Predictable is Change in Money?", Journal of International Economics, 38, 161-178.
CHONG, YOCK Y. AND DAVID F. HENDRY (1986): "Econometric Evaluation of Linear Macro-Economic Models", Review of Economic Studies, 53, 671-690.
CHUNG, Y. PETER AND ZHUNG GUO ZHOU (1996): "The Predictability of Stock Returns--A Nonparametric Approach", Econometric Reviews, 15, 299-330.
CLARIDA, RICHARD H. AND MARK P. TAYLOR (1997): "The Term Structure of Forward Exchange Premiums and the Forecastability of Spot Exchange Rates: Correcting the Errors", The Review of Economics and Statistics, 79, 353-361.
CLARK, TODD E. (1999): "Finite-Sample Properties of Tests for Equal Forecast Accuracy", Journal of Forecasting, forthcoming.
---------- AND MICHAEL W. MCCRACKEN (1999): "Tests of Equal Forecast Accuracy and Encompassing for Nested Models", manuscript, Federal Reserve Bank of Kansas City.
CORRADI, V., N.R. SWANSON AND C. OLIVETTI (1999): "Predictive Ability with Cointegrated Variables", manuscript, Texas A & M University.
DAVIDSON, RUSSELL AND JAMES G. MACKINNON (1987): "Implicit Alternatives and the Local Power of Test Statistics", Econometrica, 55, 1305-1329.
DIEBOLD, FRANCIS X. (1998): Elements of Forecasting, (Cincinnati, South-Western College Publishing).
----------, TODD A. GUNTHER AND ANTHONY S. TAY (1997): "Evaluating Density Forecasts", NBER Technical Working Paper #215.
---------- AND LUTZ KILIAN (1997): "Measuring Predictability: Theory and Macroeconomic Applications", NBER Technical Working Paper #213.
---------- AND ROBERT S. MARIANO (1995): "Comparing Predictive Accuracy", Journal of Business and Economic Statistics, 13, 253-263.
---------- AND JAMES NASON (1990): "Nonparametric Exchange Rate Prediction?", Journal of International Economics, 28, 315-322.
FAIR, RAY C. AND ROBERT SHILLER (1989): "The Informational Content of Ex Ante Forecasts", Review of Economics and Statistics, 71, 325-331.
---------- AND ---------- (1990): "Comparing Information in Forecasts from Econometric Models", American Economic Review, 80, 375-389.
FAMA, EUGENE F. (1991): "Efficient Capital Markets: II", The Journal of Finance, 46, 1575-1617.
GRANGER, CLIVE W.J. (1995): "Where are the Controversies in Econometric Methodology?", in C.W.J. Granger, ed., Modeling Economic Series, (Oxford University Press, New York).
---------- AND PAUL NEWBOLD (1977): Forecasting Economic Time Series, (London, Academic Press Inc.).
HALL, PETER (1992): The Bootstrap and Edgeworth Expansion, (New York, Springer-Verlag).
HANSEN, BRUCE E. (1992): "Convergence to Stochastic Integrals for Dependent Heterogeneous Processes", Econometric Theory, 8, 489-500.
HARVEY, DAVID I., STEPHEN J. LEYBOURNE AND PAUL NEWBOLD (1998): "Forecast Evaluation Tests in the Presence of ARCH", manuscript, Loughborough University and University of Nottingham.
HOFFMAN, DENNIS AND ADRIAN PAGAN (1989): "Practitioners Corner: Post-Sample Prediction Tests for Generalized Method of Moments Estimators", Oxford Bulletin of Economics and Statistics, 51, 333-343.
HONG, YONGMIAO AND HALBERT WHITE (1995): "Consistent Specification Testing Via Nonparametric Series Regression", Econometrica, 63, 1133-1159.
HUH, CHAN (1996): "Some Evidence on the Efficacy of the UK Inflation Targeting Regime: An Out of Sample Forecast Approach", Federal Reserve Board of Governors, International Finance Discussion Paper #565.
KILIAN, LUTZ (1999): "Exchange Rates and Monetary Fundamentals: What Do We Learn From Long-Horizon Regressions?", Journal of Applied Econometrics, forthcoming.
KUAN, CHUNG-MING AND TUNG LIU (1995): "Forecasting Exchange Rates Using Feedforward and Recurrent Neural Networks", Journal of Applied Econometrics, 10, 347-364.
LEITCH, GORDON AND J. ERNEST TANNER (1991): "Economic Forecast Evaluation: Profits Versus the Conventional Error Measures", American Economic Review, 81, 580-590.
MAGNUS, J. AND H. NEUDECKER (1988): Matrix Differential Calculus with Applications in Statistics and Econometrics, (New York, Wiley).
MARK, NELSON C. (1995): "Exchange Rates and Fundamentals: Evidence on Long-Horizon Predictability", The American Economic Review, 85, 201-218.
---------- AND DONGGYU SUL (1998): "Nominal Exchange Rates and Monetary Fundamentals: Evidence from a Seventeen Country Panel", manuscript, Ohio State University.
MCCRACKEN, MICHAEL W. (1998): "Data Mining and Out-of-Sample Inference", manuscript, Louisiana State University.
---------- (1999a): "Robust Out of Sample Inference", manuscript, Louisiana State University.
---------- (1999b): "An Out-of-Sample, Nonparametric Test of the Martingale Difference Hypothesis", manuscript, Louisiana State University.
MEESE, RICHARD A. AND KENNETH ROGOFF (1983): "Empirical Exchange Rate Models of the Seventies: Do They Fit Out of Sample?", Journal of International Economics, 14, 3-24.
---------- AND ---------- (1988): "Was It Real? The Exchange Rate-Interest Differential Relation Over the Modern Floating-Rate Period", Journal of Finance, 43, 933-948.
MINCER, JACOB AND VICTOR ZARNOWITZ (1969): "The Evaluation of Economic Forecasts", in J. Mincer, ed., Economic Forecasts and Expectations, (New York, National Bureau of Economic Research).
MIZRACH, BRUCE (1992): "The Distribution of the Theil U-Statistic in Bivariate Normal Populations", Economics Letters, 38, 163-167.
MORGAN, W.A. (1939): "A Test for Significance of the Difference Between Two Variances in a Sample from a Normal Bivariate Population", Biometrika, 31, 13-19.
NELSON, FORREST D. AND N.E. SAVIN (1990): "The Danger of Extrapolating Asymptotic Local Power", Econometrica, 58, 977-981.
NEWEY, WHITNEY K. AND DANIEL MCFADDEN (1994): "Large Sample Estimation and Hypothesis Testing", in R.F. Engle and D.L. McFadden, eds., Handbook of Econometrics, Volume IV, (Amsterdam, North-Holland).
PAGAN, ADRIAN R. AND ANTHONY D. HALL (1983): "Diagnostic Tests as Residual Analysis", Econometric Reviews, 2, 159-218.
---------- AND G. WILLIAM SCHWERT (1990): "Alternative Models for Conditional Stock Volatility", Journal of Econometrics, 45, 267-290.
PARK, TIMOTHY (1990): "Forecast Evaluation for Multivariate Time-Series Models: The U.S. Cattle Market", Western Journal of Agricultural Economics, July, 133-143.
PESARAN, M. HASHEM AND ALLAN TIMMERMANN (1995): "Predictability of Stock Returns: Robustness and Economic Significance", The Journal of Finance, 50, 1201-1228.
---------- AND ---------- (1999): "Model Instability and Choice of Observation Window", manuscript, University of California-San Diego Working Paper #99-19.
RANDLES, RONALD H. (1982): "On the Asymptotic Normality of Statistics with Estimated Parameters", The Annals of Statistics, 10, 463-474.
SANCHEZ, ISMAEL (1998): "Testing for Unit Roots with Prediction Errors", manuscript, University of California San Diego.
STECKEL, JOEL AND WILFRIED VANHONACKER (1993): "Cross-Validating Regression Models in Market Research", Marketing Science, 12, 415-427.
SULLIVAN, RYAN, ALLAN TIMMERMANN AND HALBERT WHITE (1998): "Data-Snooping, Technical Trading Rule Performance, and the Bootstrap", Journal of Finance, forthcoming.
SWANSON, NORMAN R. (1996): "Forecasting Using First-Available versus Fully Revised Economic Time-Series Data", Studies in Nonlinear Dynamics and Econometrics, 1, 47-64.
---------- (1998): "Money and Output Viewed Through a Rolling Window", Journal of Monetary Economics, 41, 455-473.
----------, ATAMAN OZYILDIRIM AND MARIA PISU (1996): "A Comparison of Alternative Causality and Predictive Accuracy Tests in the Presence of Integrated and Co-Integrated Economic Variables", manuscript, Pennsylvania State University.
---------- AND HALBERT WHITE (1995): "A Model-Selection Approach to Assessing the Information in the Term Structure Using Linear Models and Artificial Neural Networks", Journal of Business and Economic Statistics, 13, 265-275.
---------- AND ---------- (1997a): "A Model-Selection Approach to Real-Time Macroeconomic Forecasting Using Linear Models and Artificial Neural Networks", The Review of Economics and Statistics, 79, 265-275.
---------- AND ---------- (1997b): "Forecasting Economic Time Series Using Flexible Versus Fixed Specification and Linear Versus Nonlinear Econometric Models", International Journal of Forecasting, 13, 439-461.
TEGENE, ABEBAYEHU AND FRED KUCHLER (1994): "Evaluating Forecasting Models of Farmland Prices", International Journal of Forecasting, 10, 65-80.
THEIL, H. (1966): Applied Economic Forecasting, (Amsterdam, North-Holland).
URBAIN, J.P. (1989): "Model Selection Criteria and Granger Causality Tests: An Empirical Note", Economics Letters, 29, 317-320.
WEISS, ANDREW A. (1996): "Estimating Time Series Models Using the Relevant Cost Function", Journal of Applied Econometrics, 11, 539-560.
---------- AND A.P. ANDERSEN (1984): "Estimating Time Series Using the Relevant Forecast Evaluation Criterion", Journal of the Royal Statistical Society, Series A, 147, 484-487.
WEST, KENNETH D. (1996): "Asymptotic Inference About Predictive Ability", Econometrica, 64, 1067-1084.
---------- AND MICHAEL W. MCCRACKEN (1998): "Regression-Based Tests of Predictive Ability", International Economic Review, 39, 817-840.
WHITE, HALBERT (1999): "A Reality Check for Data Snooping", Econometrica, forthcoming.
WOLFF, CHRISTIAN C.P. (1987): "Time-Varying Parameters and the Out-of-Sample Forecasting Performance of Structural Exchange Rate Models", Journal of Business and Economic Statistics, 5, 87-97.
WOOLDRIDGE, JEFFREY M. AND HALBERT WHITE (1989): "Central Limit Theorems for Dependent, Heterogeneous Processes with Trending Moments", manuscript, Michigan State University.

FOOTNOTES
1. I'd like to thank Todd Clark, Walter Enders, Bruce Hansen, Dek Terrell, Ken West, and seminar participants at LSU and the 1999 Midwest Economic Association meetings for helpful comments.
2. See Randles (1982) for the in-sample analog.
3. Notice that the fixed and rolling parameter estimates should be subscripted both by t and R. In order to simplify the notation the subscript R is suppressed.
4. See the discussion in Section 3.4.
5. I also could have estimated Ω using squared deviations from the sample mean. Doing so is asymptotically irrelevant and hence is omitted for notational convenience.
6. These values of π correspond to percentages of in-sample observations λ = {0.909, 0.833, 0.714, 0.625, 0.555, 0.500, 0.454, 0.417, 0.385, 0.357, 0.333}; each follows from λ = 1/(1 + π).
7. This estimator is consistent by Theorem 4.1 of West (1996).
8. This does not mean that the test should be lower tailed. It means that the asymptotic median of the difference in MSEs is now negative.

TABLE I
PERCENTILES OF THE OOS-t STATISTIC: RECURSIVE SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  1.921  1.784  1.625  1.515  1.462  1.436  1.413  1.343  1.316  1.274  1.238
    (0.95)  1.245  1.111  0.994  0.971  0.863  0.771  0.740  0.705  0.671  0.638  0.610
    (0.90)  0.885  0.780  0.657  0.598  0.512  0.443  0.402  0.370  0.330  0.306  0.281
 2  (0.99)  1.986  1.856  1.563  1.436  1.387  1.312  1.276  1.196  1.158  1.127  1.074
    (0.95)  1.274  1.140  0.986  0.868  0.782  0.704  0.623  0.596  0.537  0.507  0.478
    (0.90)  0.932  0.786  0.614  0.541  0.455  0.361  0.295  0.253  0.235  0.194  0.160
 3  (0.99)  1.840  1.737  1.542  1.448  1.359  1.252  1.148  1.071  0.976  0.978  0.953
    (0.95)  1.300  1.120  0.968  0.808  0.685  0.610  0.552  0.496  0.438  0.419  0.386
    (0.90)  0.939  0.751  0.551  0.454  0.356  0.279  0.222  0.175  0.108  0.074  0.035
 4  (0.99)  1.872  1.731  1.581  1.365  1.195  1.119  1.108  1.041  0.902  0.861  0.854
    (0.95)  1.264  1.101  0.914  0.772  0.609  0.502  0.419  0.345  0.285  0.239  0.221
    (0.90)  0.898  0.742  0.562  0.419  0.263  0.169  0.094  0.052  -0.014  -0.054  -0.106
 5  (0.99)  1.849  1.679  1.468  1.242  1.095  0.995  0.979  0.913  0.795  0.732  0.677
    (0.95)  1.222  1.061  0.849  0.689  0.491  0.386  0.308  0.224  0.148  0.107  0.081
    (0.90)  0.866  0.694  0.461  0.315  0.179  0.062  -0.021  -0.083  -0.145  -0.174  -0.228
 6  (0.99)  1.836  1.639  1.390  1.200  1.042  0.943  0.859  0.755  0.686  0.610  0.593
    (0.95)  1.192  0.998  0.768  0.615  0.429  0.328  0.259  0.141  0.078  0.055  -0.019
    (0.90)  0.823  0.642  0.394  0.256  0.108  -0.011  -0.101  -0.164  -0.218  -0.266  -0.319
 7  (0.99)  1.836  1.649  1.341  1.154  0.994  0.872  0.810  0.637  0.549  0.476  0.438
    (0.95)  1.199  0.976  0.742  0.546  0.372  0.279  0.191  0.072  -0.002  -0.034  -0.105
    (0.90)  0.811  0.615  0.359  0.213  0.062  -0.088  -0.152  -0.230  -0.305  -0.363  -0.449
 8  (0.99)  1.789  1.659  1.298  1.090  0.879  0.788  0.728  0.503  0.444  0.401  0.359
    (0.95)  1.193  0.928  0.677  0.462  0.302  0.198  0.105  0.020  -0.058  -0.101  -0.176
    (0.90)  0.773  0.574  0.329  0.139  0.003  -0.131  -0.203  -0.293  -0.383  -0.452  -0.516
 9  (0.99)  1.813  1.607  1.268  1.112  0.804  0.724  0.634  0.523  0.427  0.391  0.305
    (0.95)  1.112  0.912  0.617  0.397  0.276  0.121  0.030  -0.055  -0.122  -0.193  -0.257
    (0.90)  0.733  0.561  0.273  0.096  -0.068  -0.187  -0.286  -0.377  -0.437  -0.518  -0.579
10  (0.99)  1.743  1.534  1.193  1.035  0.758  0.621  0.506  0.419  0.347  0.285  0.185
    (0.95)  1.082  0.890  0.566  0.358  0.205  0.043  -0.072  -0.162  -0.222  -0.296  -0.339
    (0.90)  0.749  0.529  0.226  0.032  -0.130  -0.248  -0.355  -0.454  -0.524  -0.591  -0.651

TABLE II
PERCENTILES OF THE OOS-t STATISTIC: ROLLING SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  1.875  1.799  1.604  1.447  1.340  1.221  1.179  1.098  1.021  0.969  0.882
    (0.95)  1.251  1.117  0.970  0.859  0.722  0.651  0.575  0.510  0.455  0.382  0.334
    (0.90)  0.903  0.776  0.637  0.530  0.401  0.317  0.246  0.180  0.136  0.116  0.078
 2  (0.99)  1.959  1.757  1.504  1.325  1.180  1.165  0.996  0.953  0.883  0.744  0.640
    (0.95)  1.280  1.105  0.884  0.753  0.631  0.484  0.401  0.304  0.235  0.166  0.103
    (0.90)  0.915  0.755  0.569  0.425  0.280  0.155  0.111  0.026  -0.050  -0.094  -0.140
 3  (0.99)  1.860  1.669  1.473  1.271  1.076  0.984  0.896  0.773  0.614  0.504  0.431
    (0.95)  1.274  1.088  0.842  0.667  0.490  0.381  0.251  0.146  0.066  -0.016  -0.084
    (0.90)  0.938  0.718  0.521  0.346  0.201  0.064  -0.042  -0.137  -0.224  -0.302  -0.346
 4  (0.99)  1.905  1.700  1.503  1.183  1.003  0.903  0.755  0.656  0.455  0.342  0.234
    (0.95)  1.267  1.087  0.852  0.585  0.376  0.274  0.136  0.024  -0.080  -0.173  -0.222
    (0.90)  0.866  0.731  0.494  0.248  0.098  -0.047  -0.164  -0.262  -0.362  -0.434  -0.505
 5  (0.99)  1.881  1.627  1.347  1.112  0.927  0.790  0.657  0.504  0.307  0.193  0.123
    (0.95)  1.229  1.034  0.716  0.479  0.280  0.155  -0.019  -0.090  -0.219  -0.329  -0.385
    (0.90)  0.825  0.694  0.402  0.154  -0.025  -0.168  -0.305  -0.399  -0.508  -0.589  -0.674
 6  (0.99)  1.826  1.680  1.312  1.007  0.850  0.641  0.558  0.336  0.195  0.069  0.017
    (0.95)  1.176  0.966  0.621  0.407  0.225  0.058  -0.119  -0.218  -0.336  -0.428  -0.535
    (0.90)  0.811  0.602  0.319  0.088  -0.095  -0.262  -0.423  -0.523  -0.638  -0.732  -0.821
 7  (0.99)  1.842  1.620  1.233  0.989  0.751  0.526  0.485  0.227  0.055  -0.039  -0.127
    (0.95)  1.154  0.936  0.628  0.346  0.171  -0.011  -0.182  -0.320  -0.433  -0.531  -0.663
    (0.90)  0.791  0.573  0.279  0.038  -0.157  -0.326  -0.497  -0.611  -0.750  -0.841  -0.933
 8  (0.99)  1.819  1.582  1.178  0.918  0.702  0.466  0.349  0.132  -0.018  -0.176  -0.302
    (0.95)  1.157  0.924  0.562  0.258  0.081  -0.099  -0.281  -0.432  -0.552  -0.672  -0.785
    (0.90)  0.758  0.541  0.244  -0.042  -0.244  -0.408  -0.576  -0.727  -0.838  -0.957  -1.040
 9  (0.99)  1.768  1.510  1.110  0.845  0.600  0.408  0.235  0.036  -0.099  -0.277  -0.407
    (0.95)  1.117  0.892  0.504  0.213  0.021  -0.156  -0.374  -0.529  -0.623  -0.785  -0.885
    (0.90)  0.742  0.520  0.193  -0.105  -0.322  -0.491  -0.657  -0.803  -0.951  -1.049  -1.153
10  (0.99)  1.713  1.428  1.075  0.808  0.536  0.298  0.122  -0.064  -0.248  -0.381  -0.482
    (0.95)  1.068  0.872  0.443  0.133  -0.038  -0.258  -0.466  -0.605  -0.765  -0.909  -1.011
    (0.90)  0.727  0.500  0.138  -0.144  -0.374  -0.568  -0.757  -0.902  -1.045  -1.167  -1.288

TABLE III
PERCENTILES OF THE OOS-t STATISTIC: FIXED SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  2.201  2.051  1.974  2.061  2.037  2.024  1.992  2.018  1.996  2.016  1.993
    (0.95)  1.506  1.416  1.364  1.428  1.346  1.252  1.301  1.293  1.249  1.235  1.218
    (0.90)  1.149  1.079  1.042  1.040  0.976  0.917  0.896  0.893  0.908  0.834  0.862
 2  (0.99)  2.145  2.089  1.923  1.947  1.964  1.749  1.751  1.665  1.725  1.646  1.613
    (0.95)  1.468  1.342  1.301  1.265  1.164  1.072  1.034  1.046  0.977  0.982  0.955
    (0.90)  1.096  0.999  0.901  0.873  0.798  0.711  0.680  0.639  0.578  0.556  0.520
 3  (0.99)  2.045  1.977  1.957  1.805  1.739  1.602  1.520  1.597  1.463  1.513  1.407
    (0.95)  1.432  1.277  1.195  1.095  1.014  0.909  0.893  0.851  0.761  0.735  0.733
    (0.90)  1.063  0.922  0.793  0.705  0.621  0.540  0.511  0.455  0.386  0.373  0.306
 4  (0.99)  2.013  1.883  1.829  1.687  1.528  1.467  1.475  1.422  1.318  1.255  1.277
    (0.95)  1.369  1.281  1.110  0.997  0.883  0.755  0.689  0.650  0.607  0.566  0.509
    (0.90)  1.004  0.895  0.764  0.575  0.476  0.367  0.340  0.273  0.204  0.171  0.081
 5  (0.99)  1.930  1.878  1.716  1.596  1.405  1.254  1.301  1.230  1.171  1.115  1.034
    (0.95)  1.333  1.193  1.009  0.863  0.725  0.646  0.570  0.486  0.410  0.365  0.291
    (0.90)  0.945  0.838  0.636  0.487  0.374  0.258  0.193  0.115  0.020  -0.022  -0.085
 6  (0.99)  1.933  1.874  1.628  1.481  1.382  1.146  1.188  1.091  1.016  1.007  0.878
    (0.95)  1.269  1.122  0.936  0.771  0.652  0.538  0.487  0.367  0.314  0.222  0.152
    (0.90)  0.912  0.764  0.552  0.400  0.299  0.169  0.103  0.003  -0.106  -0.146  -0.235
 7  (0.99)  1.925  1.859  1.556  1.377  1.257  1.105  1.103  0.987  0.896  0.828  0.765
    (0.95)  1.263  1.086  0.878  0.692  0.557  0.446  0.346  0.254  0.191  0.074  0.014
    (0.90)  0.895  0.731  0.513  0.332  0.215  0.060  -0.003  -0.147  -0.252  -0.308  -0.386
 8  (0.99)  1.856  1.827  1.467  1.245  1.146  1.029  0.980  0.860  0.786  0.762  0.666
    (0.95)  1.249  1.064  0.807  0.623  0.481  0.363  0.268  0.151  0.054  -0.042  -0.120
    (0.90)  0.868  0.663  0.467  0.247  0.153  -0.029  -0.115  -0.227  -0.343  -0.440  -0.502
 9  (0.99)  1.878  1.697  1.440  1.198  1.124  0.902  0.791  0.683  0.644  0.595  0.507
    (0.95)  1.197  1.031  0.754  0.537  0.416  0.305  0.162  0.050  -0.067  -0.171  -0.242
    (0.90)  0.844  0.655  0.396  0.182  0.034  -0.111  -0.224  -0.303  -0.437  -0.543  -0.625
10  (0.99)  1.824  1.604  1.354  1.126  0.998  0.797  0.659  0.557  0.550  0.505  0.415
    (0.95)  1.143  1.007  0.688  0.455  0.337  0.167  0.040  -0.057  -0.174  -0.246  -0.358
    (0.90)  0.797  0.616  0.348  0.125  -0.055  -0.210  -0.305  -0.398  -0.559  -0.645  -0.729

TABLE IV
PERCENTILES OF THE (MODIFIED) OOS-F STATISTIC: RECURSIVE SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  1.608  2.129  2.768  3.179  3.459  3.584  3.771  3.589  3.838  3.882  3.951
    (0.95)  0.850  1.038  1.298  1.554  1.567  1.548  1.583  1.623  1.599  1.553  1.518
    (0.90)  0.530  0.659  0.814  0.796  0.798  0.751  0.759  0.698  0.685  0.687  0.616
 2  (0.99)  1.996  2.691  3.426  3.907  4.129  4.200  4.362  4.304  4.309  4.278  4.250
    (0.95)  1.184  1.453  1.733  1.891  1.820  1.802  1.819  1.752  1.734  1.692  1.706
    (0.90)  0.794  0.912  1.029  1.077  1.008  0.880  0.785  0.697  0.666  0.587  0.506
 3  (0.99)  2.418  3.092  4.080  4.136  4.322  4.341  4.337  4.192  4.089  4.365  4.184
    (0.95)  1.434  1.710  2.062  2.073  1.978  1.909  1.930  1.795  1.715  1.710  1.612
    (0.90)  0.970  1.064  1.117  1.121  0.960  0.857  0.691  0.599  0.386  0.276  0.127
 4  (0.99)  2.714  3.440  4.541  4.609  4.378  4.202  4.586  4.477  4.337  4.247  4.096
    (0.95)  1.566  1.964  2.246  2.194  1.900  1.809  1.578  1.376  1.256  1.122  1.029
    (0.90)  1.060  1.225  1.313  1.184  0.829  0.545  0.354  0.197  -0.058  -0.234  -0.456
 5  (0.99)  2.902  3.673  4.466  4.434  4.249  4.351  4.349  4.187  3.945  3.783  3.783
    (0.95)  1.688  2.082  2.235  2.242  1.773  1.449  1.316  1.045  0.718  0.502  0.459
    (0.90)  1.130  1.277  1.228  0.958  0.614  0.241  -0.099  -0.361  -0.656  -0.820  -1.072
 6  (0.99)  3.212  3.846  4.545  4.676  4.637  4.703  4.286  4.144  3.981  3.525  3.321
    (0.95)  1.828  2.124  2.217  2.121  1.660  1.360  1.181  0.761  0.413  0.299  -0.109
    (0.90)  1.220  1.313  1.164  0.890  0.419  -0.044  -0.405  -0.776  -1.072  -1.395  -1.664
 7  (0.99)  3.450  4.098  4.508  4.419  4.271  4.312  4.150  3.677  3.155  3.090  2.880
    (0.95)  2.000  2.239  2.424  2.057  1.604  1.282  0.928  0.378  -0.008  -0.199  -0.591
    (0.90)  1.272  1.333  1.118  0.799  0.242  -0.363  -0.728  -1.194  -1.657  -2.033  -2.507
 8  (0.99)  3.408  4.130  4.645  4.625  4.202  4.147  3.912  3.185  2.933  2.952  2.484
    (0.95)  2.136  2.312  2.373  1.895  1.390  0.943  0.587  0.131  -0.372  -0.680  -1.14
    (0.90)  1.338  1.369  1.058  0.552  0.014  -0.632  -1.076  -1.633  -2.174  -2.731  -3.16
 9  (0.99)  3.540  4.388  4.703  4.873  4.122  4.066  3.753  3.027  2.925  2.802  2.186
    (0.95)  2.168  2.440  2.219  1.714  1.286  0.631  0.198  -0.356  -0.851  -1.241  -1.696
    (0.90)  1.354  1.432  0.920  0.393  -0.327  -1.007  -1.595  -2.229  -2.666  -3.250  -3.794
10  (0.99)  3.646  4.433  4.813  4.718  3.944  3.645  3.194  2.578  2.282  2.152  1.436
    (0.95)  2.202  2.489  2.157  1.536  1.055  0.205  -0.431  -1.071  -1.459  -1.988  -2.378
    (0.90)  1.458  1.401  0.884  0.155  -0.600  -1.341  -2.008  -2.782  -3.348  -3.839  -4.437

TABLE V
PERCENTILES OF THE (MODIFIED) OOS-F STATISTIC: ROLLING SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  1.638  2.230  2.812  3.300  3.634  3.811  3.688  3.721  3.924  3.612  3.765
    (0.95)  0.854  1.112  1.394  1.644  1.627  1.583  1.574  1.469  1.488  1.378  1.215
    (0.90)  0.536  0.667  0.838  0.865  0.773  0.693  0.602  0.482  0.390  0.355  0.276
 2  (0.99)  2.036  2.819  3.544  3.988  4.066  4.398  4.403  4.109  4.293  4.046  3.566
    (0.95)  1.232  1.481  1.802  1.889  1.841  1.695  1.495  1.264  1.015  0.783  0.504
    (0.90)  0.812  0.920  1.028  1.004  0.806  0.468  0.399  0.095  -0.198  -0.394  -0.623
 3  (0.99)  2.476  3.128  4.135  4.120  4.264  4.519  4.386  4.123  3.373  3.089  2.685
    (0.95)  1.472  1.752  2.089  2.042  1.700  1.532  1.100  0.694  0.340  -0.071  -0.471
    (0.90)  1.006  1.074  1.135  0.944  0.617  0.224  -0.174  -0.600  -1.080  -1.529  -1.847
 4  (0.99)  2.724  3.649  4.474  4.586  4.432  4.459  4.296  3.621  2.905  2.337  1.699
    (0.95)  1.600  2.078  2.332  1.979  1.536  1.228  0.701  0.116  -0.491  -1.112  -1.487
    (0.90)  1.096  1.284  1.263  0.777  0.356  -0.200  -0.788  -1.341  -1.973  -2.528  -3.182
 5  (0.99)  3.008  3.721  4.504  4.710  4.508  4.199  4.042  3.216  2.167  1.370  1.055
    (0.95)  1.768  2.191  2.164  1.783  1.175  0.764  -0.121  -0.542  -1.454  -2.172  -2.765
    (0.90)  1.152  1.315  1.143  0.541  -0.114  -0.774  -1.583  -2.300  -3.102  -3.896  -4.649
 6  (0.99)  3.214  4.093  4.532  4.786  4.456  3.899  3.473  2.324  1.500  0.615  0.159
    (0.95)  1.926  2.181  2.117  1.652  1.062  0.318  -0.745  -1.424  -2.302  -3.162  -4.256
    (0.90)  1.224  1.311  0.981  0.365  -0.428  -1.348  -2.392  -3.270  -4.292  -5.180  -6.114
 7  (0.99)  3.552  4.340  4.568  4.566  4.181  3.450  3.167  1.696  0.442  -0.300  -1.151
    (0.95)  1.994  2.273  2.321  1.478  0.868  -0.076  -1.162  -2.243  -3.224  -4.261  -5.62
    (0.90)  1.292  1.355  0.939  0.151  -0.783  -1.833  -3.061  -4.153  -5.487  -6.474  -7.583
 8  (0.99)  3.480  4.537  4.707  4.681  4.041  3.065  2.488  1.013  -0.144  -1.562  -2.729
    (0.95)  2.080  2.436  2.169  1.213  0.468  -0.626  -1.885  -3.233  -4.430  -5.681  -6.991
    (0.90)  1.386  1.357  0.886  -0.186  -1.278  -2.492  -3.824  -5.225  -6.556  -7.690  -8.939
 9  (0.99)  3.552  4.438  4.711  4.443  3.754  2.622  1.723  0.327  -0.877  -2.543  -3.923
    (0.95)  2.164  2.518  1.966  1.075  0.138  -1.113  -2.620  -4.036  -5.258  -6.931  -8.345
    (0.90)  1.378  1.433  0.772  -0.521  -1.835  -3.139  -4.586  -6.133  -7.734  -9.173  -10.558
10  (0.99)  3.728  4.413  4.815  4.589  3.460  2.145  1.121  -0.624  -2.313  -3.795  -5.166
    (0.95)  2.224  2.520  1.893  0.730  -0.235  -1.733  -3.496  -4.940  -6.512  -8.255  -9.863
    (0.90)  1.456  1.411  0.555  -0.701  -2.189  -3.790  -5.566  -7.200  -9.046  -10.574  -12.294

TABLE VI
PERCENTILES OF THE (MODIFIED) OOS-F STATISTIC: FIXED SCHEME
k2\π        0.1     0.2     0.4     0.6     0.8     1.0     1.2     1.4     1.6     1.8     2.0
 1  (0.99)  1.480  1.981  2.681  3.055  3.230  3.377  3.562  3.619  3.816  3.812  3.838
    (0.95)  0.784  1.015  1.345  1.534  1.677  1.667  1.738  1.807  1.812  1.857  1.862
    (0.90)  0.514  0.649  0.835  0.885  0.933  0.964  1.009  0.986  1.050  1.080  1.037
 2  (0.99)  1.840  2.554  3.241  3.514  3.944  4.019  4.173  4.364  4.251  4.556  4.414
    (0.95)  1.132  1.421  1.765  1.999  2.077  2.116  2.169  2.232  2.275  2.260  2.195
    (0.90)  0.784  0.914  1.140  1.237  1.299  1.268  1.330  1.250  1.126  1.189  1.151
 3  (0.99)  2.324  2.985  3.854  4.103  4.272  4.233  4.549  4.764  4.687  4.915  4.900
    (0.95)  1.408  1.653  2.050  2.322  2.308  2.319  2.325  2.336  2.187  2.283  2.275
    (0.90)  1.000  1.106  1.328  1.367  1.375  1.291  1.289  1.225  1.073  1.043  0.954
 4  (0.99)  2.576  3.283  3.999  4.339  4.349  4.629  4.602  5.002  4.793  4.984  5.028
    (0.95)  1.536  1.947  2.374  2.478  2.426  2.238  2.310  2.175  2.063  1.891  1.784
    (0.90)  1.100  1.317  1.472  1.362  1.195  1.109  1.008  0.860  0.667  0.544  0.289
 5  (0.99)  2.748  3.437  4.212  4.454  4.330  4.396  4.739  5.044  4.761  4.731  4.560
    (0.95)  1.664  2.018  2.368  2.504  2.337  2.167  2.109  1.862  1.593  1.540  1.249
    (0.90)  1.178  1.387  1.414  1.341  1.038  0.865  0.696  0.445  0.083  -0.075  -0.347
 6  (0.99)  3.084  3.755  4.467  4.754  4.559  4.715  4.836  4.515  4.561  4.303  4.365
    (0.95)  1.826  2.164  2.406  2.422  2.267  2.010  1.995  1.654  1.302  1.107  0.744
    (0.90)  1.242  1.417  1.428  1.167  0.962  0.634  0.410  0.014  -0.449  -0.666  -1.113
 7  (0.99)  3.294  3.980  4.599  4.683  4.704  5.000  4.828  4.667  4.489  4.367  4.155
    (0.95)  1.962  2.282  2.441  2.410  2.198  1.886  1.535  1.263  0.943  0.356  0.071
    (0.90)  1.342  1.536  1.457  1.057  0.817  0.254  -0.015  -0.668  -1.314  -1.610  -2.23
 8  (0.99)  3.364  4.116  4.775  4.724  4.715  4.762  4.480  4.111  4.278  4.482  3.804
    (0.95)  2.078  2.394  2.580  2.244  2.007  1.666  1.266  0.744  0.293  -0.244  -0.658
    (0.90)  1.434  1.501  1.374  0.900  0.622  -0.146  -0.593  -1.157  -1.925  -2.678  -3.109
 9  (0.99)  3.430  4.233  4.671  4.640  4.856  4.580  4.112  3.756  3.536  3.648  3.158
    (0.95)  2.090  2.525  2.533  2.064  1.881  1.434  0.892  0.292  -0.344  -0.960  -1.536
    (0.90)  1.474  1.564  1.329  0.727  0.181  -0.578  -1.168  -1.751  -2.653  -3.457  -4.122
10  (0.99)  3.582  4.232  4.750  4.674  4.489  4.251  3.643  3.467  3.373  3.342  3.036
    (0.95)  2.158  2.611  2.481  1.967  1.601  0.936  0.282  -0.360  -1.068  -1.467  -2.404
    (0.90)  1.504  1.583  1.176  0.520  -0.246  -1.125  -1.722  -2.449  -3.606  -4.312  -5.252

TABLE VII
LOCAL POWER OF THE (MODIFIED) OOS-F AND OOS-t STATISTICS: RECURSIVE SCHEME
A. Nominal Size of 1%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.1292   0.1664   0.2125   0.2446   0.2829
OOS-t   0.2    0.0241   0.0258   0.0405   0.0555   0.0835
OOS-F   1      0.1216   0.1570   0.2207   0.2619   0.3110
OOS-t   1      0.0316   0.0429   0.0847   0.1466   0.2866
OOS-F   2      0.1239   0.1531   0.1929   0.2246   0.2604
OOS-t   2      0.0381   0.0521   0.0990   0.1860   0.3564
OOS-F   50     0.0837   0.0821   0.0590   0.0287   0.0113
OOS-t   50     0.0507   0.0857   0.1578   0.2509   0.4295
B. Nominal Size of 5%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.2266   0.2680   0.3072   0.3321   0.3788
OOS-t   0.2    0.0920   0.1018   0.1275   0.1587   0.2375
OOS-F   1      0.2182   0.2660   0.3201   0.3505   0.3989
OOS-t   1      0.1157   0.1484   0.2379   0.3329   0.5049
OOS-F   2      0.2168   0.2489   0.2870   0.3095   0.3491
OOS-t   2      0.1300   0.1658   0.2634   0.3848   0.5852
OOS-F   50     0.1386   0.1277   0.0929   0.0505   0.0218
OOS-t   50     0.1538   0.2096   0.3249   0.4649   0.6587
C. Nominal Size of 10%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.2850   0.3255   0.3625   0.3886   0.4306
OOS-t   0.2    0.1622   0.1760   0.2167   0.2665   0.3649
OOS-F   1      0.2772   0.3217   0.3716   0.4001   0.4457
OOS-t   1      0.1973   0.2478   0.3531   0.4581   0.6255
OOS-F   2      0.2682   0.3063   0.3390   0.3600   0.3982
OOS-t   2      0.2167   0.2674   0.3878   0.5230   0.6991
OOS-F   50     0.1739   0.1591   0.1118   0.0639   0.0293
OOS-t   50     0.2429   0.3064   0.4399   0.5883   0.7648
Notes: Each element of Panels A, B and C is the local power of either the OOS-t or (modified) OOS-F test for a given permutation of the choice of sample split π, number of excess parameters in the unrestricted model k2, and the nominal size of the test. For a description of how the results were generated see Section four of the text.

TABLE VIII
LOCAL POWER OF THE (MODIFIED) OOS-F AND OOS-t STATISTICS: ROLLING SCHEME
A. Nominal Size of 1%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.1249   0.1566   0.2009   0.2345   0.2626
OOS-t   0.2    0.0222   0.0277   0.0414   0.0584   0.0776
OOS-F   1      0.1012   0.1294   0.1534   0.1740   0.1795
OOS-t   1      0.0288   0.0356   0.0702   0.1277   0.2365
OOS-F   2      0.0985   0.1059   0.1048   0.0950   0.0697
OOS-t   2      0.0325   0.0495   0.0884   0.1578   0.2690
OOS-F   50     0.0001   0.0000   0.0000   0.0000   0.0000
OOS-t   50     0.0162   0.0171   0.0216   0.0280   0.0459
B. Nominal Size of 5%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.2165   0.2565   0.2941   0.3182   0.3573
OOS-t   0.2    0.0886   0.1010   0.1226   0.1549   0.2298
OOS-F   1      0.1862   0.2183   0.2434   0.2504   0.2575
OOS-t   1      0.1011   0.1407   0.2095   0.2992   0.4622
OOS-F   2      0.1706   0.1808   0.1627   0.1464   0.1072
OOS-t   2      0.1191   0.1574   0.2252   0.3301   0.4843
OOS-F   50     0.0004   0.0000   0.0000   0.0000   0.0000
OOS-t   50     0.0664   0.0765   0.0925   0.1140   0.1501
C. Nominal Size of 10%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.2795   0.3160   0.3509   0.3720   0.4115
OOS-t   0.2    0.1557   0.1763   0.2077   0.2580   0.3549
OOS-F   1      0.2428   0.2722   0.2941   0.2982   0.3035
OOS-t   1      0.1806   0.2351   0.3250   0.4258   0.5941
OOS-F   2      0.2134   0.2209   0.2002   0.1784   0.1350
OOS-t   2      0.1959   0.2438   0.3358   0.4445   0.6154
OOS-F   50     0.0009   0.0000   0.0000   0.0000   0.0000
OOS-t   50     0.1277   0.1442   0.1630   0.1936   0.2534
Notes: Each element of Panels A, B and C is the local power of either the OOS-t or (modified) OOS-F test for a given permutation of the choice of sample split π, number of excess parameters in the unrestricted model k2, and the nominal size of the test. For a description of how the results were generated see Section four of the text.

TABLE IX
LOCAL POWER OF THE (MODIFIED) OOS-F AND OOS-t STATISTICS: FIXED SCHEME
A. Nominal Size of 1%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.1428   0.1846   0.2313   0.2563   0.2710
OOS-t   0.2    0.0237   0.0226   0.0328   0.0505   0.0716
OOS-F   1      0.1392   0.1822   0.2101   0.2045   0.1918
OOS-t   1      0.0291   0.0364   0.0632   0.0979   0.1627
OOS-F   2      0.1418   0.1625   0.1492   0.1231   0.0952
OOS-t   2      0.0287   0.0437   0.0569   0.0855   0.1471
OOS-F   50     0.0808   0.0461   0.0093   0.0031   0.0000
OOS-t   50     0.0299   0.0250   0.0208   0.0197   0.0205
B. Nominal Size of 5%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.2492   0.2935   0.3209   0.3343   0.3624
OOS-t   0.2    0.0843   0.0926   0.1171   0.1491   0.2115
OOS-F   1      0.2475   0.2800   0.2816   0.2717   0.2597
OOS-t   1      0.1067   0.1253   0.1767   0.2396   0.3468
OOS-F   2      0.2399   0.2431   0.2111   0.1784   0.1473
OOS-t   2      0.1045   0.1223   0.1701   0.2274   0.3347
OOS-F   50     0.1067   0.0681   0.0244   0.0099   0.0013
OOS-t   50     0.0866   0.0756   0.0704   0.0742   0.0790
C. Nominal Size of 10%
Test    π      k2=1     k2=2     k2=5     k2=10    k2=20
OOS-F   0.2    0.3204   0.3507   0.3704   0.3846   0.4160
OOS-t   0.2    0.1489   0.1608   0.2015   0.2469   0.3367
OOS-F   1      0.3149   0.3336   0.3233   0.3137   0.3055
OOS-t   1      0.1784   0.2110   0.2728   0.3514   0.4718
OOS-F   2      0.2939   0.2881   0.2504   0.2163   0.1833
OOS-t   2      0.1711   0.2073   0.2639   0.3390   0.4651
OOS-F   50     0.1344   0.0926   0.0409   0.0184   0.0035
OOS-t   50     0.1379   0.1305   0.1292   0.1352   0.1450
Notes: Each element of Panels A, B and C is the local power of either the OOS-t or (modified) OOS-F test for a given permutation of the choice of sample split π, number of excess parameters in the unrestricted model k2, and the nominal size of the test. For a description of how the results were generated see Section four of the text.
Figure 1: Density Plots for OOS-F, Recursive Scheme. Panels for 1, 2, 5, and 10 excess parameters; each panel plots densities for π = {0.2, 1, 2}.
Figure 2: Density Plots for OOS-F, Recursive Scheme. Panels for π = 0.2, 1, 2, and 50; each panel plots densities for excess parameters = {1, 2, 5, 10, 20}.
Figure 3: Density Plots for OOS-t, Recursive Scheme. Panels for 1, 2, 5, and 10 excess parameters; each panel plots densities for π = {0.2, 1, 2}.
Figure 4: Density Plots for OOS-t, Recursive Scheme. Panels for π = 0.2, 1, 2, and 50; each panel plots densities for excess parameters = {1, 2, 5, 10, 20}.