Bootstrapping GMM Estimators for Time Series*

Atsushi Inoue† (North Carolina State University)   Mototsugu Shintani‡ (Vanderbilt University)

First Draft: October 2000. This Version: February 2001.

Abstract

This paper establishes that the bootstrap provides asymptotic refinements for the generalized method of moments estimator of overidentified linear models when autocovariance structures of moment functions are unknown. Because the heteroskedasticity and autocorrelation consistent covariance matrix estimator cannot be written as a function of sample moments and converges at a rate slower than $T^{-1/2}$, the asymptotic refinement cannot be proved in the conventional way. As a result, we find that the bootstrap approximation error for the distribution of the t test and the test of overidentifying restrictions is of larger order than typically found in the literature. We also find that the choice of kernels plays a more important role in our second-order asymptotic theory than in the conventional first-order asymptotic theory. Nevertheless, the bootstrap approximation improves upon the first-order asymptotic approximation. A Monte Carlo experiment shows that the bootstrap improves the accuracy of inference on regression parameters in small samples. We apply our bootstrap method to inference about the parameters in the monetary policy reaction function.

KEYWORDS: asymptotic refinements, block bootstrap, HAC covariance matrix estimator, dependent data, Edgeworth expansions, instrumental variables, J test.

* We thank Jordi Galí for providing us with the data and program used in Clarida, Galí and Gertler (2000). We also thank Alastair Hall, Lutz Kilian and seminar participants at Brown University, University of Michigan and the 2000 Triangle Econometrics Conference for helpful comments.
† Department of Agricultural and Resource Economics, North Carolina State University, Box 8109, Raleigh, NC 27695-8109. E-mail: atsushi@unity.ncsu.edu.
‡ Department of Economics, Vanderbilt University, Box 1819 Station B, Nashville, TN 37235. E-mail: mototsugu.shintani@vanderbilt.edu.

1. Introduction

In this paper we establish that the bootstrap provides asymptotic refinements for the generalized method of moments (GMM) estimator of possibly overidentified linear models. Our analysis differs from earlier work in that we allow for general autocovariance structures of moment functions. In typical empirical situations, the autocovariance structure of moment functions is unknown and the inverse of the heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimator is used as a weighting matrix in GMM estimation. It is well known, however, that coverage probabilities based on the HAC covariance estimator are often too low, and that the t test tends to reject too frequently (see Andrews, 1991). In this paper, we propose a bootstrap method for the GMM estimator to improve the finite-sample performance of the t test and the test of overidentifying restrictions (J test). We use the block bootstrap originally proposed by Künsch (1989) for weakly dependent data (see also Carlstein, 1986). When the block length increases at a suitable rate with the sample size, such block bootstrap procedures will eventually capture the unknown structure of dependence. Our linear framework is of particular interest in applied time series analysis.
GMM estimation of linear models has been applied to the expectations hypothesis of the term structure (Campbell and Shiller, 1991), the monetary policy reaction function (Clarida, Galí and Gertler, 2000), the permanent-income hypothesis (Runkle, 1991), and the present value model of stock prices (West, 1988). Since the GMM estimates often have policy implications in structural econometric models, it is important for researchers to obtain accurate confidence intervals. For example, the interpretation of the policy rule crucially depends on the value of the estimated parameters (see Clarida, Galí and Gertler, 2000).

Not surprisingly, given the poor performance of the conventional asymptotic approximation, the econometric literature on the bootstrap for GMM is growing rapidly. Hahn (1996) shows the first-order validity of the bootstrap for GMM with iid observations.¹ For dependent data, Hall and Horowitz (1996) show that the block bootstrap provides asymptotic refinements for GMM. However, Hall and Horowitz (1996) assume that the autocovariances of the moment function are zero after finite lags, and thus their framework does not cover the use of the HAC covariance matrix estimator for general dependence structures. Economic theory often provides information about the specification of moment conditions, but not necessarily about the dependence structure of the moment conditions. Therefore, it is important for applied work to be able to allow for more general forms of autocorrelation. This extension is not straightforward because the HAC covariance matrix estimator cannot be written as a function of sample moments and converges at a rate slower than $T^{-1/2}$. Thus, the conventional arguments cannot be applied directly to prove the existence of Edgeworth expansions and to establish asymptotic refinements of the bootstrap.

Recently, Götze and Künsch (1996) and Lahiri (1996) show that the block bootstrap can provide asymptotic refinements for a smooth function of sample means and for parameters in a linear regression model, respectively, even when the HAC covariance estimator is used. They show that the bootstrap provides asymptotic refinements for approximating the distribution of the estimator and for the coverage probability of one-sided confidence intervals. However, they do not show asymptotic refinements for the two-sided symmetric t test, nor do they provide any result for the overidentified case, which is of great interest in empirical work. The purpose of this paper is to prove that the bootstrap provides asymptotic refinements for these statistics in overidentified linear models estimated by GMM. To our knowledge, the higher-order properties of the block bootstrap for GMM with unknown autocovariance structures have not been formally investigated.

¹ Brown and Newey (1995) propose an alternative efficient bootstrap method based on the empirical likelihood.

Our results are nonstandard for two reasons. First, we show that the order of the bootstrap approximation error is larger than typically found in the literature on the bootstrap for parametric estimators. The intuition behind this result is as follows: the HAC covariance matrix estimator is (proportional to) a nonparametric estimator of the spectral density at frequency zero, and its convergence rate is slower than $T^{-1/2}$. For the first-order asymptotic theory, all that matters is the consistency of the HAC covariance matrix estimator.
However, the nonparametric nature of the HAC covariance matrix estimator becomes important in the higher-order asymptotic theory and complicates the analysis of the two-sided symmetric t test and the J test statistic. Nevertheless, we are able to establish that the bootstrap approximation error is smaller than the conventional normal approximation error.

Second, we note that the choice of kernels plays a more important role in our second-order asymptotic theory than in the conventional first-order asymptotic theory because the order of the bootstrap approximation error depends on the bias of the HAC covariance estimator. For the bootstrap to provide asymptotic refinements, the bias must vanish sufficiently fast. For the one-sided t test, most of the commonly used kernels satisfy this condition. For the two-sided symmetric t test and for the J test statistic, however, one must use kernels, such as the truncated kernel (White, 1984) and the trapezoidal kernel (Politis and Romano, 1995), whose bias vanishes even faster. The resulting HAC covariance matrix estimator based on these kernels, however, is not necessarily positive semidefinite. In this paper, we propose a modified HAC covariance matrix estimator that is always positive semidefinite.

In a Monte Carlo experiment, we find that our bootstrap method improves the accuracy of inference in small samples, especially for the two-sided symmetric t test. To illustrate the usefulness of the bootstrap approach, we apply our bootstrap procedure to the monetary policy reaction function of Clarida, Galí and Gertler (2000). We find that the data do not necessarily support some of their conclusions.

The rest of the paper is organized as follows. Section 2 introduces the model and describes the proposed bootstrap procedure. Section 3 presents the assumptions and theoretical results. Section 4 provides some Monte Carlo results. Section 5 presents an empirical illustration. Section 6 concludes the paper. All proofs are relegated to an appendix.

2. Model and Bootstrap Procedure

Consider a stationary time series $(x_t', y_t, z_t')'$ which satisfies
$$E[z_t u_t] = 0, \qquad (2.1)$$
where $u_t = y_t - \beta_0' x_t$, $\beta_0$ is a $p$-dimensional parameter, $x_t$ is a $p$-dimensional vector, $z_t$ is a $k$-dimensional vector and $p < k$. Given a realization $\{(x_t', y_t, z_t')'\}_{t=1}^{T_0}$, we are interested in two-step GMM estimation of $\beta_0$ based on the moment condition (2.1). Let $\ell$ denote the lag truncation parameter used in HAC covariance matrix estimation and let $T = T_0 - \ell + 1$.²

We first obtain the first-step GMM estimator $\tilde\beta_T$ by minimizing
$$\Big[\frac{1}{T_0}\sum_{t=1}^{T_0} z_t(y_t - \beta' x_t)\Big]' V_T \Big[\frac{1}{T_0}\sum_{t=1}^{T_0} z_t(y_t - \beta' x_t)\Big]$$
with respect to $\beta$, where $V_T$ is some $k \times k$ positive semidefinite matrix. Then we obtain the second-step GMM estimator $\hat\beta_T$ by minimizing
$$\Big[\frac{1}{T}\sum_{t=1}^{T} z_t(y_t - \beta' x_t)\Big]' \hat S_T^{-1} \Big[\frac{1}{T}\sum_{t=1}^{T} z_t(y_t - \beta' x_t)\Big],$$
where
$$\hat S_T = \frac{1}{T}\sum_{t=1}^{T}\Big[z_t \tilde u_t^2 z_t' + \sum_{j=1}^{\ell} \omega\Big(\frac{j}{\ell}\Big)\big(z_{t+j}\tilde u_{t+j}\tilde u_t z_t' + z_t \tilde u_t \tilde u_{t+j} z_{t+j}'\big)\Big], \qquad \tilde u_t = y_t - \tilde\beta_T' x_t,$$
is the HAC covariance matrix estimator for the moment function (2.1) and $\omega(\cdot)$ is a kernel. We are interested in the distribution of the studentized statistic $\hat\Sigma_T^{-1/2}(\hat\beta_T - \beta_0)$, where $\hat\Sigma_T = (\sum_{t=1}^T x_t z_t' \hat S_T^{-1} \sum_{t=1}^T z_t x_t')^{-1}$, and in the distribution of the J test statistic
$$J_T = \Big[\frac{1}{\sqrt T}\sum_{t=1}^{T} z_t(y_t - \hat\beta_T' x_t)\Big]' \hat S_T^{-1} \Big[\frac{1}{\sqrt T}\sum_{t=1}^{T} z_t(y_t - \hat\beta_T' x_t)\Big].$$

² We use $T$ observations and the modified HAC covariance matrix estimator $\hat S_T$ to obtain asymptotic refinements for the two-sided symmetric t test and the J test statistic. This modification is not necessary for obtaining asymptotic refinements for one-sided confidence intervals. See also Hall and Horowitz (1996, p.895).
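For concreteness, the following is a minimal sketch (ours, not the authors') of the two-step estimator in Python/NumPy. It uses simple within-sample truncated sums for the HAC matrix rather than the paper's exact end-point convention with $T = T_0 - \ell + 1$, and the first-step weight $V_T = (Z'Z/T)^{-1}$ is an assumed choice; the function names are our own.

```python
import numpy as np

def hac_estimate(u, Z, ell, kernel=lambda x: 1.0):
    """HAC estimate of the long-run variance of v_t = z_t * u_t.

    `kernel(j / ell)` supplies the weight omega(j / ell); the default
    (constant 1) corresponds to the truncated kernel."""
    V = Z * u[:, None]                       # moment function, T x k
    T = V.shape[0]
    S = V.T @ V / T                          # lag-0 autocovariance
    for j in range(1, ell + 1):
        G = V[j:].T @ V[:-j] / T             # j-th sample autocovariance
        S += kernel(j / ell) * (G + G.T)
    return S

def two_step_gmm(y, X, Z, ell):
    """Two-step GMM for y = X beta + u with instruments Z (k >= p)."""
    T = len(y)
    def gmm(W):
        # minimizes [Z'(y - X b)/T]' W [Z'(y - X b)/T] in b
        A = X.T @ Z @ W @ Z.T @ X
        return np.linalg.solve(A, X.T @ Z @ W @ Z.T @ y)
    beta1 = gmm(np.linalg.inv(Z.T @ Z / T))  # first step, assumed V_T
    S_hat = hac_estimate(y - X @ beta1, Z, ell)
    beta2 = gmm(np.linalg.inv(S_hat))        # second step, weight S_hat^{-1}
    return beta2, S_hat
```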
We propose the following block bootstrap procedure. Suppose that $T = b\ell$ for some integer $b$.

Step 1. Let $N_1, N_2, \ldots, N_b$ be iid uniform random variables on $\{0, 1, \ldots, T - \ell\}$ and let
$$(x^{*\prime}_{(j-1)\ell+i},\; y^*_{(j-1)\ell+i},\; z^{*\prime}_{(j-1)\ell+i})' = (x_{N_j+i}',\; y_{N_j+i},\; z_{N_j+i}')', \qquad 1 \le i \le \ell,\ 1 \le j \le b.$$

Step 2. Calculate the first-step bootstrap GMM estimator $\tilde\beta_T^*$ by minimizing
$$\Big[\frac1T\sum_{t=1}^T\big(z_t^*(y_t^* - \beta'x_t^*) - \mu_T^*\big)\Big]' V_T \Big[\frac1T\sum_{t=1}^T\big(z_t^*(y_t^* - \beta'x_t^*) - \mu_T^*\big)\Big],$$
where
$$\mu_T^* = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\frac1\ell\sum_{i=1}^{\ell}z_{t+i}(y_{t+i} - \hat\beta_T'x_{t+i}).$$

Step 3. Compute the second-step bootstrap GMM estimator $\hat\beta_T^*$ by minimizing
$$\Big[\frac1T\sum_{t=1}^T\big(z_t^*(y_t^* - \beta'x_t^*) - \mu_T^*\big)\Big]' \hat S_T^{*-1} \Big[\frac1T\sum_{t=1}^T\big(z_t^*(y_t^* - \beta'x_t^*) - \mu_T^*\big)\Big],$$
where
$$\hat S_T^* = \frac1T\sum_{k=1}^b\sum_{i=1}^{\ell}\sum_{j=1}^{\ell}\big(z_{(k-1)\ell+i}^*\tilde u_{(k-1)\ell+i}^* - \mu_T^*\big)\big(z_{(k-1)\ell+j}^*\tilde u_{(k-1)\ell+j}^* - \mu_T^*\big)', \qquad \tilde u_t^* = y_t^* - \tilde\beta_T^{*\prime}x_t^*.$$

Step 4. Obtain the bootstrap version of the studentized statistic, $\hat\Sigma_T^{*-1/2}(\hat\beta_T^* - \hat\beta_T)$, where $\hat\Sigma_T^* = (\sum_{t=1}^Tx_t^*z_t^{*\prime}\hat S_T^{*-1}\sum_{t=1}^Tz_t^*x_t^{*\prime})^{-1}$, and of the J test statistic,
$$J_T^* = \Big[\frac1{\sqrt T}\sum_{t=1}^T\big(z_t^*(y_t^* - \hat\beta_T^{*\prime}x_t^*) - \mu_T^*\big)\Big]'\hat S_T^{*-1}\Big[\frac1{\sqrt T}\sum_{t=1}^T\big(z_t^*(y_t^* - \hat\beta_T^{*\prime}x_t^*) - \mu_T^*\big)\Big].$$

By repeating Steps 1–4 sufficiently many times, one can approximate the finite-sample distributions of the studentized statistic and the J test statistic by the empirical distributions of their bootstrap versions.

Remarks:

1. As in Hall and Horowitz (1996), we recenter the bootstrap version of the moment functions. Unlike the just-identified case, the bootstrap version of the moment condition does not hold without recentering in the case of overidentifying restrictions. The expression $\mu_T^*$ is the mean of the bootstrapped moment function with respect to the probability measure induced by the bootstrap algorithm.

2. Davison and Hall (1993) show that naïve applications of the block bootstrap do not provide asymptotic refinements for studentized statistics involving the long-run variance estimator. Specifically, they show that the error of the naïve bootstrap is of order $O(b^{-1}) + O(\ell^{-1})$ and thus is greater than or equal to the error of the first-order asymptotic approximation. We therefore modify the bootstrap version of the HAC covariance matrix estimator (see Götze and Hipp, 1996, for the just-identified case). The expression $\hat S_T^*$ given in Step 3 is a consistent estimator for the variance of the bootstrapped moment function under the bootstrap probability measure.
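A compact sketch of one bootstrap draw may help fix the indexing. The code below is ours: it assumes $T = b\ell$ exactly and, for brevity, evaluates the recentered weight matrix at the original-sample estimate $\hat\beta_T$ rather than at the first-step bootstrap estimate $\tilde\beta_T^*$. It implements Step 1, the recentering term $\mu_T^*$, and the block form of $\hat S_T^*$.

```python
import numpy as np

def bootstrap_draw(y, X, Z, beta_hat, ell, rng):
    """One block-bootstrap draw: resampled data, mu_star, and S_star."""
    T = len(y)
    b = T // ell                                 # assumes T = b * ell
    # Step 1: iid uniform block starts N_1, ..., N_b on {0, 1, ..., T - ell}
    N = rng.integers(0, T - ell + 1, size=b)
    idx = (N[:, None] + np.arange(ell)).ravel()  # block k holds obs N_k+1,...,N_k+ell
    ys, Xs, Zs = y[idx], X[idx], Z[idx]
    # Recentering term mu_star: average over all length-ell windows of the
    # moment function evaluated at beta_hat
    V = Z * (y - X @ beta_hat)[:, None]
    windows = np.array([V[t:t + ell].mean(axis=0) for t in range(T - ell + 1)])
    mu_star = windows.mean(axis=0)
    # Recentered block weight matrix (Step 3), shown at beta_hat for brevity
    Vs = Zs * (ys - Xs @ beta_hat)[:, None] - mu_star
    S_star = np.zeros((Z.shape[1], Z.shape[1]))
    for k in range(b):
        block_sum = Vs[k * ell:(k + 1) * ell].sum(axis=0)
        S_star += np.outer(block_sum, block_sum) / T
    return ys, Xs, Zs, mu_star, S_star
```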
3. Asymptotic Theory

In this section, we present our main theoretical results. Unless noted otherwise, we denote the Euclidean norm of a vector $x$ by $\|x\|$. First, we provide the following set of assumptions.

Assumption 1:

(a) $\{(x_t', y_t, z_t')'\}$ is strictly stationary and strong mixing with mixing coefficients satisfying $\alpha_m \le (1/d)\exp(-dm)$ for some $d > 0$.

(b) There is a unique $\beta_0 \in$ [...]
[...] $> 0$.

(d) Let $\mathcal F_a^b$ denote the sigma-algebra generated by $R_a, R_{a+1}, \ldots, R_b$. For all $m, s, t = 1, 2, \ldots$ and $A \in \mathcal F_{t-s}^{t+s}$,
$$E\big|P(A \mid \mathcal F_{-\infty}^{t-1} \cup \mathcal F_{t+1}^{\infty}) - P(A \mid \mathcal F_{t-s-m}^{t-1} \cup \mathcal F_{t+1}^{t+s+m})\big| \le (1/d)\exp(-dm).$$

(e) For all $m, t = 1, 2, \ldots$ and $\theta \in$ [...]

[...] $= \alpha + o(\ell T^{-1}) + O(\ell^{-q}). \qquad (3.8)$

Remarks: Theorems 1 and 2 show that the distributions of the studentized statistic and the J test statistic and of their bootstrap versions can be approximated by their Edgeworth expansions. Theorem 3 shows the order of the bootstrap approximation error. For the one-sided t test, the two-sided symmetric t test and the J test statistic, the approximation errors made by the first-order asymptotic theory are of order
$$O(T^{-1/2}) + O(\ell^{-q}), \qquad O(\ell T^{-1}) + O(\ell^{-q}) \qquad \text{and} \qquad O(\ell T^{-1}) + O(\ell^{-q}), \qquad (3.9)$$
respectively, whereas the bootstrap approximation errors are of order
$$O(\ell T^{-1}) + O(\ell^{-q}), \qquad o(\ell T^{-1}) + O(\ell^{-q}) \qquad \text{and} \qquad o(\ell T^{-1}) + O(\ell^{-q}). \qquad (3.10)$$
Thus the bootstrap provides asymptotic refinements if the bias of the HAC covariance matrix estimator vanishes fast enough, i.e.,
$$O(\ell^{-q}) = o(T^{-1/2}), \qquad O(\ell^{-q}) = o(\ell T^{-1}) \qquad \text{and} \qquad O(\ell^{-q}) = o(\ell T^{-1}) \qquad (3.11)$$
for the three statistics, respectively. For the one-sided t test, the bootstrap provides asymptotic refinements for a wide class of kernels that satisfy $O(\ell^{-q}) = o(T^{-1/2})$, such as the Parzen kernel. However, the bootstrap does not provide asymptotic refinements for the Bartlett kernel, because its characteristic exponent is one and thus it does not satisfy (3.11). For the two-sided symmetric t test and the J test statistic, the bootstrap can provide asymptotic refinements only for kernels whose characteristic exponent is greater than 2, such as the truncated kernel,
$$\omega(x) = \begin{cases} 1 & \text{for } |x| < 1,\\ 0 & \text{otherwise}, \end{cases}$$
the trapezoidal kernel (Politis and Romano, 1995),
$$\omega(x) = \begin{cases} 1 & \text{for } |x| \le \alpha,\\[2pt] 1 - \dfrac{|x| - \alpha}{1 - \alpha} & \text{for } \alpha < |x| \le 1,\\[2pt] 0 & \text{otherwise}, \end{cases}$$
where $0 < \alpha < 1$, and the Parzen (b) kernel (Parzen, 1957),
$$\omega(x) = \begin{cases} 1 - |x|^q & \text{for } |x| \le 1,\\ 0 & \text{otherwise}, \end{cases}$$
where $q > 2$. Under the assumption of exponentially decaying mixing coefficients, the truncated and trapezoidal kernels have no asymptotic bias and thus satisfy (3.11). If $q > 2$ and $\ell \ne O(T^{1/(q+1)})$, the Parzen (b) kernel also satisfies (3.11).

A potential problem with these kernels is that the resulting weighting matrix is not necessarily positive semidefinite. To eliminate this problem, the weighting matrix can be modified as follows. By Schur's decomposition theorem (e.g., Theorem 13 of Magnus and Neudecker, 1999, p.16), there exist an orthogonal $k \times k$ matrix $E$, whose columns are eigenvectors of $W_T = \hat S_T^{-1}$, and a diagonal matrix $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_k)$, whose elements are the eigenvalues of $W_T$, such that
$$W_T = E'^{-1}\Lambda E^{-1}. \qquad (3.12)$$
Define a modified HAC covariance matrix estimator by
$$W_T^+ = E'^{-1}\Lambda^+E^{-1}, \qquad (3.13)$$
where $\Lambda^+ = \mathrm{diag}(\max(\lambda_1, 0), \ldots, \max(\lambda_k, 0))$. Then $W_T^+$ is positive semidefinite, asymptotically equivalent to (3.12) and thus consistent. Politis and Romano (1995, equation 12) use a similar modification in the context of univariate spectral density estimation. For the trapezoidal kernel, the frequency of positive semidefinite corrections can be reduced by choosing a small $\alpha$. However, Politis and Romano (1995) recommend $\alpha = 1/2$.
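The eigenvalue modification in (3.12)–(3.13) is straightforward to implement. Below is a minimal sketch (ours) for a symmetric weighting matrix, together with the trapezoidal kernel; `numpy.linalg.eigh` returns the spectral decomposition $W = E\Lambda E'$ used in the text.

```python
import numpy as np

def psd_correct(W):
    """Replace negative eigenvalues of a symmetric matrix W by zero,
    as in (3.12)-(3.13); the result is positive semidefinite."""
    lam, E = np.linalg.eigh(W)          # W = E diag(lam) E', E orthogonal
    return (E * np.maximum(lam, 0.0)) @ E.T

def trapezoidal_kernel(x, alpha=0.5):
    """Politis-Romano trapezoidal kernel, flat on [-alpha, alpha]."""
    ax = abs(x)
    if ax <= alpha:
        return 1.0
    if ax <= 1.0:
        return 1.0 - (ax - alpha) / (1.0 - alpha)
    return 0.0
```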
4. Monte Carlo Results

In this section, we conduct a small simulation study to examine the accuracy of the proposed bootstrap procedure. We consider the following stylized linear regression model with an intercept and a regressor $x_t$:
$$y_t = \beta_1 + \beta_2x_t + u_t, \qquad t = 1, \ldots, T. \qquad (4.14)$$
The disturbance and the regressor are generated from the following AR(1) processes with common $\rho$:
$$u_t = \rho u_{t-1} + \varepsilon_{1t}, \qquad (4.15)$$
$$x_t = \rho x_{t-1} + \varepsilon_{2t}, \qquad (4.16)$$
where $\varepsilon_t = (\varepsilon_{1t}, \varepsilon_{2t})' \sim N(0, I_2)$. In the simulation, we use $\beta = (\beta_1, \beta_2)' = (0, 0)'$ for the regression parameters and $\rho \in \{0.5, 0.9, 0.95\}$ for the AR parameter. For instruments, we use $x_t$, $x_{t-1}$ and $x_{t-2}$ in addition to an intercept. This choice of instruments implies an overidentified model with 2 degrees of freedom for the J test. Two values of the sample size $T$, 64 and 128, are considered. The kernel functions employed are the trapezoidal, Parzen (b) and truncated kernels. In all experiments, the number of Monte Carlo trials is 1000.

The choice of the block length is important in practice. Ideally, one would choose a longer block length for more persistent processes and a shorter block length for less persistent processes. In the literature, this is typically accomplished by selecting the lag truncation parameter that minimizes the mean squared error of the HAC covariance matrix estimator (see Andrews, 1991; and Newey and West, 1994). Because the trapezoidal and truncated kernels have no asymptotic bias, however, one cannot take advantage of the usual bias–variance trade-off, and thus no optimal block length can be defined for these kernels. We therefore propose the following procedure, which is similar to the general-to-specific modeling strategy for selecting the lag order of autoregressions in the literature on unit root testing (see Hall, 1994; Ng and Perron, 1995). By the Wold representation theorem, the moment function has a moving average (MA) representation of possibly infinite order. The idea is to approximate this MA representation by a sequence of finite-order MA processes. Because the block bootstrap is originally designed to capture the dependence of m-dependent-type processes when $\ell$ is fixed, it makes sense to approximate the process by an MA process that is m-dependent. The proposed procedure takes the following steps.

Step 1. Let $\ell_1 < \ell_2 < \cdots < \ell_{\max}$ be candidate block lengths that satisfy Assumption 1(g), and set $k = \max - 1$.

Step 2. Test the null that every element of the moment function is MA($\ell_k$) against the alternative that at least one of the elements is MA($\ell_{k+1}$).

Step 3. If the null is accepted and $k > 1$, then let $k = k - 1$ and go to Step 2. If the null is accepted and $k = 1$, then let $\ell = \ell_1$. If the null is rejected, then set $\ell = \ell_{k+1}$.

Because there is parameter uncertainty due to first-step estimation and because we apply a univariate testing procedure to each element of the moment function, it is difficult to control the size of this procedure. In this Monte Carlo experiment, therefore, we use the 99% level critical value to be conservative.

Our primary interest is to compare the size properties of tests based on asymptotic and bootstrap critical values. For each experiment, the empirical size of the t test for the regression slope parameter $\beta_2$, as well as of the J test, is obtained using the 10% nominal significance level. Each bootstrap critical value is constructed from 999 replications of the bootstrap sampling process.
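For concreteness, the design (4.14)–(4.16) can be generated as follows (a sketch in NumPy; the burn-in length is our choice, not specified in the text).

```python
import numpy as np

def simulate_dgp(T, rho, rng, burn=100):
    """Generate one sample from (4.14)-(4.16) with beta = (0, 0)'.

    Returns y, the regressor matrix X = [1, x_t], and the instrument
    matrix Z = [1, x_t, x_{t-1}, x_{t-2}]."""
    n = T + burn + 2                      # extra obs for lags and burn-in
    eps = rng.standard_normal((n, 2))
    u = np.zeros(n)
    x = np.zeros(n)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + eps[t, 0]
        x[t] = rho * x[t - 1] + eps[t, 1]
    u, x = u[burn:], x[burn:]             # drop burn-in; length T + 2
    y = u[2:]                             # beta1 = beta2 = 0, so y_t = u_t
    ones = np.ones(T)
    X = np.column_stack([ones, x[2:]])
    Z = np.column_stack([ones, x[2:], x[1:-1], x[:-2]])
    return y, X, Z
```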
In addition to the results based on the asymptotic and bootstrap critical values using our proposed procedure, we report the asymptotic results based on the Bartlett and QS kernels, with Andrews' (1991) data-dependent bandwidth estimator and Andrews and Monahan's (1992) prewhitening procedure.

Table 1 summarizes the results of the simulation study. In all cases, the size properties of the bootstrap t test are better than those of the asymptotic t test. The choice of kernel function does not make much of a difference for the performance. Indeed, the empirical sizes of the bootstrap test are very close to the nominal size when $T$ is 128. The degree of the reduction in the size distortion depends on the value of the AR parameter as well as the sample size. The bootstrap works quite well with persistent processes. Because the moment functions have an AR(1) autocovariance structure, the prewhitening procedure has a considerable advantage in our simulation design. However, the bootstrap outperforms the conventional prewhitened HAC procedure with asymptotic critical values. In contrast, the advantage of the bootstrap for the J test is not clear, because the J test performs quite well even with asymptotic critical values.⁴ Based on this experiment, we recommend our bootstrap procedure especially for the t test for regression parameters.

⁴ See Tauchen (1986) and Hall and Horowitz (1996) for similar findings.

5. Empirical Illustration

To illustrate the usefulness of the proposed bootstrap approach, we conduct bootstrap inference about the parameters in the monetary policy reaction function of Clarida, Galí and Gertler (2000, hereafter CGG). CGG model the target for the federal funds rate $r_t^*$ by
$$r_t^* = r^* + \beta(E[\pi_{t+1} \mid \Omega_t] - \pi^*) + \gamma E[x_t \mid \Omega_t], \qquad (5.17)$$
where $\pi_t$ is the inflation rate, $\pi^*$ is the target for inflation, $\Omega_t$ is the information set at time $t$, $x_t$ is the output gap, and $r^*$ is the target with zero inflation and output gap. Policy rules (5.17) with $\beta > 1$ and $\gamma > 0$ are stabilizing, and those with $\beta \le 1$ and $\gamma \le 0$ are destabilizing. CGG obtain GMM estimates of $\beta$ and $\gamma$ based on the set of unconditional moment conditions
$$E\big\{\big[r_t - (1 - \rho_1 - \rho_2)[rr^* - (\beta - 1)\pi^* + \beta\pi_{t+1} + \gamma x_t] - \rho_1r_{t-1} - \rho_2r_{t-2}\big]z_t\big\} = 0, \qquad (5.18)$$
where $r_t$ is the actual federal funds rate, $rr^*$ is the equilibrium real rate and $z_t$ is a vector of instruments. They find that the GMM estimate of $\beta$ is significantly less than unity during the pre-Volcker era, while the estimate is significantly greater than unity during the Volcker–Greenspan era.

We reexamine these findings by applying our bootstrap procedure as well as the bootstrap procedure of Hall and Horowitz (1996) and the standard HAC asymptotics. We obtain GMM estimates of $\beta$ and $\gamma$ based on the linear moment conditions
$$E\big\{[r_t - c - \theta_1\pi_{t+1} - \theta_2x_t - \rho_1r_{t-1} - \rho_2r_{t-2}]z_t\big\} = 0, \qquad (5.19)$$
where $c = (1 - \rho_1 - \rho_2)[rr^* - (\beta - 1)\pi^*]$. Then $\hat\beta_T = \hat\theta_{1T}/(1 - \hat\rho_{1T} - \hat\rho_{2T})$ and $\hat\gamma_T = \hat\theta_{2T}/(1 - \hat\rho_{1T} - \hat\rho_{2T})$, where $\hat\theta_{1T}$, $\hat\theta_{2T}$, $\hat\rho_{1T}$ and $\hat\rho_{2T}$ are the GMM estimates of $\theta_1$, $\theta_2$, $\rho_1$ and $\rho_2$, respectively. We use CGG's baseline dataset and two sample periods, the pre-Volcker period (1960:1–1979:2) and the Volcker–Greenspan period (1979:3–1996:3) (see CGG for a description of the data source). In addition to their baseline specification, we construct the optimal weighting matrix using the inverse of the HAC covariance matrix estimator to allow for more general dynamic specifications in the determination of the actual funds rate.
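The mapping from the linear coefficients in (5.19) back to the policy parameters is a one-liner. The sketch below (ours) assumes a particular ordering of the coefficient vector, which is not specified in the text; only the ratio formulas $\hat\beta_T = \hat\theta_{1T}/(1-\hat\rho_{1T}-\hat\rho_{2T})$ and $\hat\gamma_T = \hat\theta_{2T}/(1-\hat\rho_{1T}-\hat\rho_{2T})$ come from the paper.

```python
def policy_parameters(coef):
    """Map GMM estimates of (5.19) into (beta, gamma) of the rule (5.17).

    coef = (c, theta1, theta2, rho1, rho2) is an assumed ordering."""
    c, theta1, theta2, rho1, rho2 = coef
    denom = 1.0 - rho1 - rho2
    return theta1 / denom, theta2 / denom
```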
For the asymptotic confidence intervals, we use the conventional prewhitened and recolored estimates based on the Bartlett and QS kernels with the automatic bandwidth selection method (Andrews, 1991; Andrews and Monahan, 1992). For the confidence intervals constructed from our bootstrap, we use the trapezoidal, Parzen (b) and truncated kernels. We use the data-dependent procedure described in the previous section to select the block length for the bootstrap. The number of bootstrap replications is set to 999.

Table 2 presents GMM estimates of these parameters. Asymptotic standard errors are reported in parentheses. The first two rows of each of Tables 2(a) and (b) replicate CGG's results. These findings are robust to whether or not the HAC covariance matrix estimator is used.

Table 3 shows 90% two-sided confidence intervals of these parameters. Consistent with CGG's findings, the upper bound of the asymptotic confidence interval for $\beta$ is less than unity during the pre-Volcker period, and the lower bound is far greater than unity during the Volcker–Greenspan period. Based on these estimates, CGG suggest that the Fed was accommodating inflation before 1979, but not after 1979. The bootstrap confidence interval, however, indicates that $\beta$ may be greater than unity even during the pre-Volcker period, consistent with the view that the Fed has always been combating inflation. Moreover, unlike the asymptotic confidence interval, the bootstrap confidence interval does not rule out that $\gamma$ is negative during the Volcker–Greenspan period.

6. Concluding Remarks

In this paper we establish that the bootstrap provides asymptotic refinements for the GMM estimator of possibly overidentified linear models when the autocovariance structure of the moment function is unknown. Because the HAC covariance matrix estimator cannot be written as a function of sample moments and converges at a rate slower than $T^{-1/2}$, the conventional techniques cannot be used directly to prove the existence of the Edgeworth expansions. Because of the nonparametric nature of the HAC covariance matrix estimator, the order of the bootstrap approximation error is larger than the typical order of the bootstrap approximation error for parametric estimators. Nevertheless, the bootstrap provides improved approximations relative to the first-order approximation. We also find that the choice of kernels plays a more important role in our second-order asymptotic theory than in the conventional first-order asymptotic theory, because the order of the bootstrap approximation error depends on the bias of the HAC covariance estimator. We note that an extension of the present results to nonlinear dynamic models, as well as further investigation of data-dependent methods for selecting the optimal block length, would be useful.

Appendix

Notation

To simplify the notation, we assume $p = 1$ throughout the appendix. In the proof for the case $p > 1$, the scalar $\beta$ in the current proof is replaced by an arbitrary linear combination of $\beta$. $\otimes$ denotes the Kronecker product operator. If $\alpha$ is an $n$-dimensional nonnegative integral vector, $|\alpha|$ denotes its length, i.e., $|\alpha| = \sum_{i=1}^n|\alpha_i|$. $\|\cdot\|$ denotes the Euclidean norm, i.e., $\|x\| = (\sum_{i=1}^nx_i^2)^{1/2}$, where $x$ is an $n$-dimensional vector. We write $\omega(j/\ell)$ as $\omega_j$ for notational simplicity. $\kappa_j(x)$ denotes the $j$th cumulant of a random variable $x$. $\mathrm{vec}(\cdot)$ is the column-by-column vectorization function. $\mathrm{vech}(\cdot)$ denotes the column stacking operator that stacks the elements on and below the leading diagonal.
For a nonnegative integral vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, let
$$D^\alpha = \frac{\partial^{\alpha_1}}{\partial x_1^{\alpha_1}}\cdots\frac{\partial^{\alpha_n}}{\partial x_n^{\alpha_n}}.$$
$\ell$ and $l$ are treated differently: $\ell$ denotes the lag truncation parameter and $l$ denotes an integer. Let $u_t = y_t - \beta_0'x_t$, $\hat u_t = y_t - \hat\beta_T'x_t$, $\tilde u_t = y_t - \tilde\beta_T'x_t$, $v_t = z_tu_t$, $\hat v_t = z_t\hat u_t$, $\tilde v_t = z_t\tilde u_t$, $w_t = z_tx_t'$,
$$\hat\Gamma_j = \begin{cases}(1/T)\sum_{t=1}^T\tilde v_{t+j}\tilde v_t' & j \ge 0,\\ (1/T)\sum_{t=1}^T\tilde v_t\tilde v_{t-j}' & j < 0,\end{cases} \qquad \nabla\tilde\Gamma_j = \begin{cases}(1/T)\sum_{t=1}^T(v_{t+j}w_t' + w_{t+j}v_t') & j \ge 0,\\ (1/T)\sum_{t=1}^T(v_tw_{t-j}' + w_tv_{t-j}') & j < 0,\end{cases}$$
$$\tilde\Gamma_j = \begin{cases}(1/T)\sum_{t=1}^Tv_{t+j}v_t' & j \ge 0,\\ (1/T)\sum_{t=1}^Tv_tv_{t-j}' & j < 0,\end{cases} \qquad \nabla\Gamma_j = \begin{cases}E(v_{t+j}w_t' + w_{t+j}v_t') & j \ge 0,\\ E(v_tw_{t-j}' + w_tv_{t-j}') & j < 0,\end{cases}$$
$$\Gamma_j = \begin{cases}E(v_{t+j}v_t') & j \ge 0,\\ E(v_tv_{t-j}') & j < 0,\end{cases} \qquad \nabla^2\tilde\Gamma_j = \begin{cases}(1/T)\sum_{t=1}^Tw_{t+j}w_t' & j \ge 0,\\ (1/T)\sum_{t=1}^Tw_tw_{t-j}' & j < 0,\end{cases}$$
$$\hat S_T = \sum_{j=-\ell}^{\ell}\omega_j\hat\Gamma_j, \qquad \tilde S_T = \sum_{j=-\ell}^{\ell}\omega_j\tilde\Gamma_j, \qquad \bar S_T = \sum_{j=-\ell}^{\ell}\omega_j\Gamma_j, \qquad S_T = \sum_{j=-T+1}^{T-1}\Big(1 - \frac{|j|}{T}\Big)\Gamma_j,$$
$$\nabla\tilde S_T = \sum_{j=-\ell}^{\ell}\omega_j\nabla\tilde\Gamma_j, \qquad \nabla\bar S_T = \sum_{j=-\ell}^{\ell}\omega_j\nabla\Gamma_j, \qquad \nabla S = \sum_{j=-\infty}^{\infty}\nabla\Gamma_j, \qquad \nabla^2\tilde S_T = \sum_{j=-\ell}^{\ell}\omega_j\nabla^2\tilde\Gamma_j.$$
Let $G_T = (1/T)\sum_{t=1}^Tw_t$ and $m_T = T^{-1/2}\sum_{t=1}^Tv_t$. Then the studentized statistic can be written as
$$f_T = \sqrt T\,\hat\Sigma_T^{-1/2}(\hat\beta_T - \beta_0) = (G_T'\hat S_T^{-1}G_T)^{-1/2}G_T'\hat S_T^{-1}m_T.$$
We use the following notation for the bootstrap. Let
$$m_T^* = \frac1{\sqrt T}\sum_{t=1}^T(z_t^*u_t^* - \mu_T^*) = \frac1{\sqrt b}\sum_{k=1}^bB_{N_k}, \qquad B_{N_k} = \frac1{\sqrt\ell}\sum_{i=1}^{\ell}(z_{N_k+i}\hat u_{N_k+i} - \mu_T^*) = \frac1{\sqrt\ell}\sum_{i=1}^{\ell}(\hat v_{N_k+i} - \mu_T^*),$$
$$\hat B_{N_k} = \frac1{\sqrt\ell}\sum_{i=1}^{\ell}\big(z_{N_k+i}^*\hat u_{N_k+i}^* - \mu_T^*\big), \qquad \hat u_i^* = y_i^* - \tilde\beta_T^{*\prime}x_i^*,$$
$$G_T^* = \frac1T\sum_{t=1}^Tz_t^*x_t^{*\prime} = \frac1b\sum_{k=1}^bF_{N_k}, \qquad F_{N_k} = \frac1\ell\sum_{i=1}^{\ell}z_{N_k+i}x_{N_k+i}' = \frac1\ell\sum_{i=1}^{\ell}w_{N_k+i},$$
$$\hat S_T^* = \frac1b\sum_{k=1}^b\hat B_{N_k}\hat B_{N_k}', \qquad \tilde S_T^* = \frac1b\sum_{k=1}^bB_{N_k}B_{N_k}', \qquad S_T^* = \mathrm{Var}^*(m_T^*).$$
Then the bootstrap versions of the first-step and second-step GMM estimators can be written as
$$\tilde\beta_T^* = \hat\beta + \Big[\frac1b\sum_{k=1}^bF_{N_k}'V_T\frac1b\sum_{k=1}^bF_{N_k}\Big]^{-1}\frac1b\sum_{k=1}^bF_{N_k}'V_T\frac1{\sqrt T}\frac1{\sqrt b}\sum_{k=1}^bB_{N_k} = \hat\beta + [G_T^{*\prime}V_TG_T^*]^{-1}G_T^{*\prime}V_T\frac1{\sqrt T}m_T^*,$$
$$\hat\beta_T^* = \hat\beta + \Big[\frac1b\sum_{k=1}^bF_{N_k}'\hat S_T^{*-1}\frac1b\sum_{k=1}^bF_{N_k}\Big]^{-1}\frac1b\sum_{k=1}^bF_{N_k}'\hat S_T^{*-1}\frac1{\sqrt T}\frac1{\sqrt b}\sum_{k=1}^bB_{N_k} = \hat\beta + [G_T^{*\prime}\hat S_T^{*-1}G_T^*]^{-1}G_T^{*\prime}\hat S_T^{*-1}\frac1{\sqrt T}m_T^*,$$
respectively.

Proofs of Lemmas

Next, we present the lemmas used in the proofs of the theorems. Lemma A.1 produces a Taylor series expansion of the studentized statistic $f_T$. Lemma A.2 provides bounds on the moments and will be used in the proofs of Lemmas A.3–A.6. Lemma A.3 shows the limits and the convergence rates of the first three cumulants of $g_T$ in (A.1), which will be used to derive the formal Edgeworth expansion. Lemmas A.5 and A.6 provide bounds on the approximation error. For convenience, we present Lemma B.1, which will be used in the proofs of Lemmas B.2 and B.3. Lemma B.2 shows the consistency and convergence rate of the bootstrap versions of the moments. Lemma B.3 shows the limits and the convergence rates of the first three cumulants of the bootstrap version.
Lemma A.1:
$$f_T = a'm_T + b'[(G_T - G_0)\otimes m_T] + c'[\mathrm{vech}(\hat S_T - S_0)\otimes m_T] + d'[(G_T - G_0)\otimes\mathrm{vech}(\hat S_T - S_0)\otimes m_T]$$
$$\quad + e'[\mathrm{vech}(\hat S_T - S_0)\otimes\mathrm{vech}(\hat S_T - S_0)\otimes m_T] + O_p((\ell/T)^{3/2})$$
$$= a'm_T + b'[(G_T - G_0)\otimes m_T] + c'[\mathrm{vech}(\hat S_T - \bar S_T)\otimes m_T] + c'[\mathrm{vech}(\bar S_T - S_0)\otimes m_T]$$
$$\quad + d'[(G_T - G_0)\otimes\mathrm{vech}(\hat S_T - \bar S_T)\otimes m_T] + e'[\mathrm{vech}(\hat S_T - \bar S_T)\otimes\mathrm{vech}(\hat S_T - \bar S_T)\otimes m_T]$$
$$\quad + d'[(G_T - G_0)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T] + e'[\mathrm{vech}(\hat S_T - \bar S_T)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T]$$
$$\quad + e'[\mathrm{vech}(\bar S_T - S_0)\otimes\mathrm{vech}(\hat S_T - \bar S_T)\otimes m_T] + e'[\mathrm{vech}(\bar S_T - S_0)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T] + O_p((\ell/T)^{3/2})$$
$$\equiv g_T + c'[\mathrm{vech}(\bar S_T - S_0)\otimes m_T] + d'[(G_T - G_0)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T] + e'[\mathrm{vech}(\hat S_T - \bar S_T)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T]$$
$$\quad + e'[\mathrm{vech}(\bar S_T - S_0)\otimes\mathrm{vech}(\hat S_T - \bar S_T)\otimes m_T] + e'[\mathrm{vech}(\bar S_T - S_0)\otimes\mathrm{vech}(\bar S_T - S_0)\otimes m_T] + O_p((\ell/T)^{3/2}), \qquad (A.1)$$
where $a$, $b$, $c$, $d$ and $e$ are $q$-, $q^2$-, $q(q^2+q)/2$-, $q^2(q^2+q)/2$- and $q((q^2+q)/2)^2$-dimensional vectors of smooth functions of $G_0$ and $S_0$, respectively.

Proof of Lemma A.1: (A.1) immediately follows from a Taylor series expansion of $f_T$ around $(m_T', G_T', \mathrm{vech}(\hat S_T)')' = (0_{1\times q}, G_0', \mathrm{vech}(S_0)')'$ and from Theorem 1 of Andrews (1991). Q.E.D.

Lemma A.2:
$$E\|m_T\|^{r+\eta} = O(1), \qquad (A.2)$$
Note that T 1=2 (ST ¡ ST ) = rST T 1=2 (¯T ¡ ¯0 ) + r2 ST T 1=2 (¯T ¡ ¯0 )2 : ^ ~ ~ ~ ~ ~ (A.14) 21 Thus it follows from (A.5) and Minkowski's inequality that [EkrST kr ]1=r · [EkrST ¡ rST kr ]1=r + [EkrST kr ]1=r = O(`1=2 T ¡1=2 ) + O(1); ~ ~ ¹ ¹ (A.15) X ` X ` [Ekr2 ST kr ]1=r ~ · [Ek !j (r2 ¡j ¡ E(r2 ¡j ))kr ]1=r + [Ek !j E(r2 ¡j )kr ]1=r j=¡` j=¡` ¡1=2 = O(`T ) + O(`): (A.16) Therefore (A.6) follows from (A.14), (A.15), (A.16), Assumption 1(i) and HÄlder's inequality. o Q.E.D. Lemma A.3: T 1=2 ·1 (gT ) = ®1 + O(`¡q ) + o(`T ¡1=2 ); (A.17) (T =`)(·2 (gT ) ¡ 1) = °1 + O(`¡1=2 ); (A.18) T 1=2 ·3 (gT ) = ·1 ¡ 3®1 + O(`¡q ) + o(`T ¡1=2 ); (A.19) (T =`)(·4 (gT ) ¡ 3) = ³1 + O(`¡1=2 ); (A.20) where X 1 X 1 ®1 = b0 E[w0 -vi ] + c0 0 E[vech(v0 vi ) -vj ] i=¡1 i;j=¡1 X 1 0 +c Efvech[rS(E(w0 )0 V E(w0 ))¡1 E(w0 )0 V v0 ] -vi g ¹ i=¡1 1 X X ` T °1 = 2 lim Efa0 v0 c0 [vech(vi vi¡j ¡ ¡j ) -vk ]g 0 T !1 ` j=¡` i;k=¡T 1 X X ` T +2 lim Efa0 v0 e0 [vech(vi vi¡j ¡ ¡j ) -vech(vk vk¡l ¡ ¡l ) -vm ]g 0 0 T !1 `T i;l=¡` i;k;m=¡T 1 X T X ` + lim Efc0 [vech(v0 v¡i ¡ ¡i ) -vj ]c0 [vech(vk vk¡l ¡ ¡k ) -vm ]g; 0 0 T !1 `T j;k;m=¡T i;l=¡` 1 X T ¡1 X 1 ·1 = E(a0 v0 a0 vi a0 vj ) + 3 lim Efa0 v0 a0 vi b0 [vech(wj ¡ E(wj )) -vk ]g T !1 T i;j=¡1 i;j;k=¡T +1 1 X T +3 lim Efa0 v0 a0 vi c0 [vech(vj vj¡k ¡ ¡k ) -vl g 0 T !1 T i;j;k;l=¡T 1 X T +3 lim Efa0 v0 a0 vi c0 vech[rS(E(w0 )0 V E(w0 ))¡1 E(w0 )0 V vj ] -vk g; ¹ T !1 T 2 i;j;k=¡T ³1 4 X T X ` = Efa0 v0 a0 vi a0 vj c0 [vech(vk vk¡l ¡ ¡l ) -vm ]g 0 `T i;j;k;m=¡T l=¡` 4 X T X ` + lim Efa0 v0 a0 vi a0 vj e0 [vech(vk vk¡l ¡ ¡l ) -vech(vm vm¡n ¡ ¡n ) -vo ]g 0 0 `T 2 i;j;k;m;o=¡T l;n=¡` 6 X T X ` + lim Efa0 v0 a0 vi c0 [vech(vj vj¡k ¡ ¡k ) -vl ]c0 [vech(vm vm¡n ¡ ¡n ) -vo ]g 0 0 `T 2 i;j;l;m;o=¡T k;n=¡` 22 1 X X T ` ¡12 lim Efa0 v0 c0 [vech(vj vj¡k ¡ ¡k ) -vl ]g 0 ` j;l=¡T k=¡` 1 X T X ` ¡12 lim Efa0 v0 e0 [vech(vj vj¡k ¡ ¡k ) -(vl vl¡m ¡ ¡m ) -vn ]g 0 0 `T j;l;n=¡T k;m=¡` 1 X T X ` ¡6 lim Efc0 [(v0 v¡i ¡ ¡i ) -vj ]c0 [(vk vk¡l ¡ ¡l ) -vm ]g: 0 0 `T 2 j;k;m=¡T i;l=¡` Proof of Lemma A.3: First, we will prove (A.17). By HÄlder's inequality and Lemma A.2, it o su±ces to show that X 1 1=2 T E[(GT ¡ G0 ) -mT ] = E[w0 -vi ] + O(T ¡1 ); (A.21) i=¡1 1 X T 1=2 E[vech(ST ¡ ST ) -mT ] = ~ ¹ E[vech(v0 vi ) -vj ] + O(`¡q ) + O(`T ¡1 ); 0 (A.22) i;j=¡1 X1 T 1=2 E[vech(ST ¡ ST ) -mT ] = ^ ~ Efvech[rS(E(w0 )0 V E(w0 ))¡1 E(w0 )0 V v0 ] -vi g ¹ i=¡1 +O(`1=2 T ¡1=2 ); (A.23) (T =`)E[vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ] = o(1): ^ ¹ ^ ¹ (A.24) First, (A.21) follows from several applications of the mixing inequality. Second, we will show (A.22). We have 1 X` T 2 E[ !j vech(¡j ¡ ¡j ) -mT ] ~ j=0 X ` X T ¡1 T ¡ i1(j > i) ¡ jjj1(j > 0 or j · ¡i) 0 = !i E[vech(v0 v¡i ) -vj ] T i=0 j=¡`¡T +1 X ` X T ¡1 = !i E[vech(v0 v¡i ) -vj ] + O(`T ¡1 ) 0 i=0 j=¡`¡T +1 X ` T ¡1 X = E[vech(v0 v¡i ) -vj ] + O(`¡q ) + O(`T ¡1 ) 0 i=0 j=¡`¡T +1 X X 1 1 = E[vech(v0 v¡i ) -vj ] + O(`¡q ) + O(`T ¡1 ): 0 (A.25) i=0 j=¡1 The ¯rst equality follows from strict stationarity. Repeated applications of the moment inequal- ity of Yokoyama (1980) produce X ` T ¡j X T ¡ i1(j > i) ¡ jjj1(j > 0 or j · ¡i) 0 !i E[vech(v0 v¡i ) -vj ] T i=0 j=¡`¡T +1 0 2 ¡2j¡1 ¡j ¡(1=2)i X ` X X X @T ¡1 4 r0 r0 = O !i jjj®¡i¡j + jjj®i + i®¡j i=0 j=¡`¡T j=¡2j j=¡i 31 ¡1 X X i T ¡1 X 0 r0 r 0 5A + i®r i+j + (i + j)®i + (i + j)®j j=¡(1=2)i+1 j=0 j=i+1 ¡1 = O(`T ): (A.26) 23 for some r0 2 (0; 1), from which the second equality follows. Arguments analogous to the proof of Theorem 10 of Hannan (1970, pp.283-284) yield the last two equalities. 
By symmetric arguments, it follows that 1 X ¡1 T E[ 2 ~ !j vech(¡j ¡ ¡j ) -mT ] j=¡` X ¡1 X 1 = E[vech(v0 v¡i ) -vj ] + O(`¡q ) + O(`T ¡1 ): 0 (A.27) i=¡1 j=¡1 Hence, (A.23) follows from (A.25) and (A.27). Third, we will show (A.23). It follows from (A.14), Assumption 1(i) and Lemma A.2 that 1 ^ ~ T 2 E[vech(ST ¡ ST ) -mT ] 1 = T 2 E[vech(rST (¯T ¡ ¯0 ) + r2 ST (¯T ¡ ¯0 )2 ) -mT ] ~ ~ ~ ~ 1 1 ¹ ~ ¹ ~ = T 2 E[vech((rST ¡ rST )(¯T ¡ ¯0 ) -mT )] + T 2 E[vech(rST (¯T ¡ ¯0 ) -mT )] ~ 1 +T 2 E[vech((r2 ST ¡ r2 ST )(¯T ¡ ¯0 )2 ) -mT ] ~ ¹ ~ 1 +T 2 E[vech((r2 r2 ST (¯T ¡ ¯0 )2 ) -mT )] ¹ ~ 1 X = Efvech[rS(E(w0 )0 V E(w0 ))¡1 E(w0 )0 V v0 -vi ]g + O(`1=2 T ¡1=2 ); ¹ (A.28) i=¡1 which completes the proof of (A.23). Lastly, we will show (A.24). ^ ¹ ^ ¹ (T =`)E[vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ] ~ ¹ ~ ¹ = (T =`)E[vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ] + o(1) X ` X T = `¡1 T ¡3=2 0 E[vech(vt+i vt ¡ ¡i ) -vech(vs+j vs ¡ ¡j ) -vu ] + o(1) i;j=¡` t;s;u=1 2 ¡1=2 = O(` T ) = o(1): (A.29) Therefore, (A.17) follows from (A.21){(A.24). Next, we will prove (A.18). It follows from (A.17), HÄlder's inequality and Lemma A.2 that o ·2 (gT ) ¡ 1 = E(gT ) ¡ [E(gT )]2 ¡ 1 2 = 2Efa0 mT b0 [(GT ¡ G0 ) -mT ]g + 2Efa0 mT c0 [vech(ST ¡ ST ) -mT ]g ~ ¹ 0 0 +2Efa mT e [vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ]g ~ ¹ ~ ¹ 0 ^T ¡ ST ) -mT ]g2 + O(`1=2 T ¡1 ): ¹ +Efc [vech(S (A.30) Thus, we only need to analyze the ¯rst four terms on the RHS of (A.30). First, by repeated applications of the mixing inequality as in the proof of moment inequalities (e.g, the proof of Lemma 4 of Billingsley, 1968, pp.172{174), one can show that T Efa0 mT b0 [(GT ¡ G0 ) -mT ]g = O(1): (A.31) Second, it follows from arguments similar to the one used in the proof of (A.17) that (T =`)Efa0 mT c0 [vech(ST ¡ ST ) -mT ]g ~ ¹ X XXX ` T T T = (`T )¡1 !j Efa0 vt c0 [vech(vs vs¡j ¡ ¡j ) -vu ]g 0 j=¡` t=1 s=1 u=1 X ` X T ¡1 = `¡1 !j (1 ¡ ¿i;k )Efa0 v0 c0 [vech(vi vi¡j ¡ ¡j ) -vk ]g 0 j=¡` i;k=¡T +1 24 X ` T ¡1 X = `¡1 !j Efa0 v0 c0 [vech(vi vi¡j ¡ ¡j ) -vk ]g + O(`T ¡1 ) 0 j=¡` i;k=¡T +1 X ` X T ¡1 = `¡1 Efa0 v0 c0 [vech(vi vi¡j ¡ ¡j ) -vk ]g + O(`¡qw ) + O(`T ¡1 ) 0 j=¡` i;k=¡T +1 X ` T ¡1 X T ¡1 X = lim `¡1 Efa0 v0 c0 [vech(vt vt¡j ¡ ¡j ) -vs ]g + O(`¡1 ); (A.32) 0 T !1 j=¡` t=¡T +1 s=¡T +1 (T =`)Efa0 mT e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ]g ~ ¹ ~ ¹ 1 X X ` T = !i !j Efa0 vr e0 [vech(vs vs¡i ¡ ¡i ) -vech(vt vt¡j ¡ ¡j ) -vu ]g 0 0 `T 2 i;j=¡` r;s;t;u=1 1 X X ` T = !i !j (1 ¡ ¿s;t;u )Efa0 v0 e0 [vech(vs vs¡i ¡ ¡i ) -vech(vt vt¡j ¡ ¡j ) -vu ]g 0 0 `T i;j=¡` s;t;u=¡T 1 X X ` T = !i !j Efa0 v0 e0 [vech(vs vs¡i ¡ ¡i ) -vech(vt vt¡j ¡ ¡j ) -vu ]g 0 0 `T i;j=¡` s;t;u=¡T +O(`2 T ¡1 ) 1 X X ` T = Efa0 v0 e0 [vech(vs vs¡i ¡ ¡i ) -vech(vt vt¡j ¡ ¡j ) -vu ]g 0 0 `T i;j=¡` s;t;u=¡T +O(`¡q ) + O(`2 T ¡1 ) 1 X X ` T = lim Efa0 v0 e0 [vech(vs vs¡i ¡ ¡i ) -vech(vt vt¡j ¡ ¡j ) -vu ]g 0 0 T !1 `T s;t;u=¡T i;j=¡` ¡1 +O(` ); (A.33) and (T =`)Efc0 [vech(ST ¡ ST ) -mT ]g2 ^ ¹ X T X ` = `¡1 T ¡2 !i !j Efc0 [vech(vs vs¡i ¡ ¡i ) -vt ]c0 [vech(vu vu¡j ¡ ¡j ) -vv ]g 0 0 t;s;u;v=1 i;j=¡` X T X ` = (`T )¡1 !i !j (1 ¡ ¿j;k;m )Efc0 [vech(v0 v¡i ¡ ¡i ) -vj ] 0 j;k;m=¡T i;l=¡` £c0 [vech(vk vk¡l ¡ ¡k ) -vm ]g 0 X T X ` = (`T )¡1 !i !j Efc0 [vech(v0 v¡i ¡ ¡i ) -vj ]c0 [vech(vk vk¡l ¡ ¡k ) -vm ]g 0 0 j;k;m=¡T i;l=¡` ¡1 +O(`T ) X T X ` = (`T )¡1 Efc0 [vech(v0 v¡i ¡ ¡i ) -vj ]c0 [vech(vk vk¡l ¡ ¡k ) -vm ]g 0 0 j;k;m=¡T i;l=¡` ¡q +O(` ) + O(`T ¡1 ) XT X ` = lim `¡1 T ¡1 Efc0 [vech(v0 v¡i ¡ ¡i ) -vj ]c0 [vech(vk vk¡l ¡ ¡k ) -vm ]g 0 0 T !1 j;k;m=¡T i;l=¡` ¡1 +O(` ); (A.34) 25 where ¿i;k = (1=T ) 
min(max(jij; jkj; ji ¡ kj); T ) and ¿s;t;u = (1=T ) min(max(jsj; jtj; juj; js ¡ tj; jt ¡ uj; ju ¡ sj); T ). The proofs of (A.32), (A.33) and (A.34) are similar to that of (A.17) and thus details are omitted. Therefore, (A.18) follows from (A.30){(A.33). Third, we will prove (A.19). By (A.17), (A.18) and ·3 (gT ) = E(gT ) ¡ 3E(gT )E(gT ) + 2(E(gT ))3 ; 3 2 (A.35) it su±ces to show that T 1=2 E(gT ) = ·1 + O(`¡q ) + o(`T ¡1=2 ): 3 (A.36) It follows from Assumption 1(i), HÄlder's inequality and Lemma A.2 that o E(gT ) = E[(a0 mT )3 ] + 3Ef(a0 mT )2 b0 [(GT ¡ G0 )0 -mT ]g 3 +3Ef(a0 mT )2 c0 [vech(ST ¡ ST ) -mT ]g ~ ¹ +3Ef(a0 mT )2 c0 [vech(ST ¡ ST ) -mT ]g + o(`T ¡1 ): ^ ~ (A.37) The rest of the proof is similar to that of (A.17), and thus we will only show that 1 X ` T 2 Ef(a0 mT )2 c0 [ ~ vech(¡j ¡ ¡j ) -mT ]g j=¡` T ¡1 X = lim (1=T ) Efa0 v0 a0 v¿ c0 [vech(vt vt¡k ¡ ¡k ) -vs ]g: 0 (A.38) T !1 ¿;t;s;k=¡T +1 It follows from arguments similar to the proof of (A.21) that 1 T 2 Ef(a0 mT )2 c0 [vech(ST ¡ ST ) -mT ]g ~ ¹ T ¡1 X X ` = (1=T ) !j (1 ¡ ¿s;t;u )Efa0 v0 a0 vs c0 [vech(vt vt¡j ¡ ¡j ) -vu ]g s;t;u=¡T +1 j=¡` X T ¡1 X ` = (1=T ) !j Efa0 v0 a0 vs c0 [vech(vt vt¡j ¡ ¡j ) -vu ]g + O(T ¡1 ) s;t;u=¡T +1 j=¡` T ¡1 X X ` = (1=T ) Efa0 v0 a0 vs c0 [vech(vt vt¡j ¡ ¡j ) -vu ]g + O(`¡q ) s;t;u=¡T +1 j=¡` X T ¡1 X ` = lim T ¡1 Efa0 v0 a0 v¿ c0 [vech(vt vt¡j ¡ ¡j )0 -vs ]g + O(`¡q ): (A.39) 0 T !1 ¿;t;s=¡T +1 j=¡` By arguments similar to the proof of Lemma 1 of Andrews (1991, pp.850{851), one can show that the RHS of (A.39) equals the in¯nite sum of the product of two expectations plus some ¯nite number. By the mixing inequality, it follows that the in¯nite sum of the product of two expectations is ¯nite. Therefore, the RHS of (A.39) is well de¯ned. Lastly, we will show (A.20). ·4 (gT ) ¡ 3 = 4Ef(a0 mT )3 c0 [vech(ST ¡ ST ) -mT ]g ^ ¹ +4Ef(a0 mT )3 e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ]g ^ ¹ ^ ¹ ³ ´ +6E (a0 mT )2 fc0 [vech(ST ¡ ST ) -mT ]g2 ^ ¹ ¡12Efa0 mT c0 [vech(ST ¡ ST ) -mT ]g ^ ¹ ¡12Efa0 mT e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -mT ]g ^ ¹ ^ ¹ 0 2 1=2 ¡1 ¡6Efc [vech(ST ¡ ST ) -mT ]g + O(` T ); ^ ¹ (A.40) 26 from which the desired result follows by similar arguments. Q.E.D. Lemma A.4: Ãg;T (x) · ¸ 1 1 iµ 3 ` µ2 µ4 ` = exp ¡ µ 2 + T ¡ 2 (®1 (iµ) ¡ (·1 ¡ 3®1 )) ¡ ( °1 + ³1 ) + o( ) ; A.41) ( 2 6 T 2 24 T P (gT · x) = ª(x) + T ¡1=2 p1 (x) + (`=T )p2 (x) + o(`=T ): (A.42) Proof of Lemma A.4: The proof of (A.41) follows from the standard arguments. (A.42) can be obtained by inverting (A.41). Q.E.D. Lemma A.5: Following GÄtze and KÄnsch (1996), de¯ne a truncation function by o u ¿ (x) = T ° xf (T ¡° kxk)=kxk where ° 2 (2=r; 1=2) and f 2 C 1 (0; 1) satis¯es (i) f (x) = x for x · 1; (ii) f is increasing; and y (iii) f (x) = 2 for x ¸ 2. Let fT denote fT with Rt ´ (vt ; vt ; vec(wt )0 ) replaced by ¹ 0 ~ Ry = (vt ; vt ; vec(wt )0 )0 = ¿ ((vt ; vt ; vec(wt )0 )0 ) : ¹ t y0 y0 ~ y 0 ~0 Let ªy and ªy denote the Edgeworth expansions of fT and gT , respectively. Let Ãg;T (x) and T g;T y y y Ãg;T (x) denote the characteristic functions of gT and ªy , respectively. Then ~y y g;T Z y ~y sup jP (fT · x) ¡ ªT (x)j · C jÃg;T (µ) ¡ Ãg;T (µ)jjµj¡1 dµ + O(`¡q ) + o(`T ¡1 ): (A.43) x jµj T ° ¹ · P (kRt k > T ° ) = O(T 1¡°r ); ¹ (A.45) 1·t·T t=1 it follows that ¯ ¯ ¯ y ¯ sup ¯P (fT · x) ¡ P (fT · x)¯ = O(T 1¡°r ) = O(T ¡1 ): (A.46) ¡1 T ° )] 1·t·T 2j 1=2 · 2 (EkmT k ) j P ( max kRt k > T ° )1=2 1·t·T = o(T ¡1=2 ) (A.47) for j · r=2. 
Similarly, we obtain that EkT 1=2 [(Gy ¡ Gy ) ¡ (GT ¡ G0 )]kj T 0 = o(T ¡1=2 ); (A.48) ~y ¹y ~y ¹y Ek(T=`)1=2 [vech(ST ¡ ST ) ¡ vech(ST ¡ ST )]kj = o(T ¡1=2 ); (A.49) Ek(T =`)1=2 [vech(rST ¡ rST ) ¡ vech(rST ¡ rST )]kj ~ ¹ ~ ¹ = o(T ¡1=2 ); (A.50) EkT 1=2 ^y [vech(ST ¡ ~y ST ) ^y ¡ vech(ST ~y ¡ ST )]jj = o(T ¡1=2 ); (A.51) 27 for j · r=2. Thus it follows from Lemma A.2, (A.45), (A.47)-(A.51) that ¯ ¯ ¯ ¯ sup ¯ªT (x) ¡ ªy (x)¯ = o(`T ¡1 ): T (A.52) ¡1 `3=2 T ¡3=2 ) can be made arbitrarily small. y T Thus we have sup jP (fT · x) ¡ ªy (x)j = sup jP (hy · x) ¡ ªy (x)j + O(`3=2 T ¡3=2 ): y T T h;T (A.54) x x y Since the di®erence between the Edgeworth expansions of gT and of hy is O(ST ¡ ST ), it follows T ¹y y that sup jP (hy · x) ¡ ªy (x)j = sup jP (gT · x) ¡ ªy (x)j + O(`¡q ): T h;T y g;T (A.55) x x Therefore, (A.53) follows from (A.54) and (A.55). Lastly, it follows from the so-called smoothing lemma (e.g., Proposition C1 of Fan and Linton, 1997) that Z y sup jP (gT · x) ¡ ªy (x)j · C g;T y ~y jÃg;T (µ) ¡ Ãg;T (µ)jjµj¡1 dµ + O(T ¡1+2=r ): (A.56) x jµj 12. Q.E.D. Lemma A.6: For 0 < " < 1=6, Z y ~y jÃg;T (µ) ¡ Ãg;T (µ)jjµj¡1 dµ = o(`T ¡1 ): (A.57) jµj·T " y Proof of Lemma A.6: Write gT as y gT = a0 my + b0 [(Gy ¡ Gy ) -my ] + c0 [vech(ST ¡ ST ) -my ] T T 0 T ~y ¹ T 0 +c [vech(ST^ ¡ ST ) -m ] + d0 [(G ¡ G ) -vech(S y ¡ S y ) -my ] y ~ y y y ~ ¹ T T 0 T T T 0 y y ^y ¡ S y ) -my ] +d [(G ¡ G ) -vech(S T 0 ~ T T T +e [vech(ST ¡ ST ) -vech(ST ¡ ST ) -my ] 0 ~y ¹y ~y ¹y T +e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -my ] ~y ¹y ^y ~y T +e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -my ] ^y ~y ~y ¹y T +e0 [vech(Sy ¡ S y ) -vech(S y ¡ S y ) -my ] ^ ~ T ^ T ~ T T T y y y ´ gT ;1 + gT;2 + ::: + gT ;10 : 28 y y y y Then a Taylor series expansion of E(exp(iµgT )) around gT;2 + gT;3 + ::: + gT;10 = 0 yields y y y y y y E(exp(iµgT )) = E(exp(iµgT;1 ) + iµE[exp(iµgT;1 )(gT;2 + gT ;3 + gT;4 )] (iµ)2 y y y y y y2 + E[exp(iµgT;1 )(2gT;1 gT ;3 + 2gT ;1 gT;7 + gT ;3 )] 2 (iµ)3 y y2 y y2 y y2 y + E[exp(iµgT;1 )(3gT;1 gT ;2 + 3gT ;1 gT;3 + 3gT;1 gT;4 )] 6 (iµ)4 y y3 y y3 y y2 y2 + E[exp(iµgT;1 )(4gT;1 gT ;3 + 4gT ;1 gT;7 + 6gT;1 gT;3 )] 24 y4 y4 y4 +O(µ 4 [E(gT;2 ) + E(gT;3 ) + ::: + E(gT;10 )]): (A.58) We will analyze each term on the RHS of (A.58) in turn. First, it follows from Lemma 3.33 of GÄtze and Hipp (1983) that o ½ · 3 4 6 ¸ 2 ¾ y (iµ) µ µ µ E exp(iµgT ;1 ) ¡ 1 + E(a0 my )3 + (E(a0 my )4 ¡ 3) ¡ (E(a0 my )3 )2 exp(¡ ) T T T 6 24 72 2 = O((1 + jµj9 ) exp(¡µ2 )T ¡1¡" ): (A.59) P y y Second, let ÃX denote the multivariate expansion of E(exp(ic0 T ¡1=2 T Xt )) where Xt = ~ t=1 0 y y0 y y 0 0 (a vt ; vt ; (wt ¡ G0 ) ) . Then an application of Lemma 3.33 of GÄtze and Hipp (1983) with o # = (µ; 0; :::; 0)0 yields y y (iµ)3 y2 y jEfexp(iµgT ;1 )[iµgT ;2 + g g ]g 2 T ;1 T;2 µ ¶ (iµ)3 (iµ)3 µ2 ¡ (iµ ¡ )Efb0 [(Gy ¡ Gy ) -my ]g + T 0 T Ef(a0 my )2 b0 [(Gy ¡ Gy ) -my ]g exp(¡ )j T T 0 T 2 2 2 X X T · T ¡1=2 jc® jjD® [E(exp(i#0 T ¡1=2 y ~ X2t )) ¡ ÃX ]j ® t=1 = O((1 + jµj8 + jµj10 ) exp(¡µ2 )T ¡1¡" ); (A.60) where c® are the corresponding elements of a, b and G0 . 
Third, we will show that y y y y (iµ)3 y2 y (iµ)4 y3 y jiµE[exp(iµgT;1 )[iµgT;3 + (iµ)2 gT ;1 gT;3 + gT;1 gT ;3 + g g ]] µ 2 6 T;1 T;3 1 ¡ ~y ¹y (iµ ¡ (iµ)3 )Efc0 [vech(ST ¡ ST ) -my ]g + (iµ)2 Efa0 my c0 [vech(ST ¡ ST ) -my ]g T T ~y ¹y T 2 (iµ)3 + Ef(a0 my )2 c0 [vech(ST ¡ ST ) -my ]g T ~y ¹y T 2 ¶ 4 (iµ) 0 y 3 0 ~y ¡ S y ) -my ]g exp(¡ 1 µ2 )j ¹ + Ef(a mT ) c [vech(ST T T 6 2 = O((1 + jµj6 ) exp(¡µ 2 )`T ¡1¡" ): (A.61) Note that the ¯rst term of (A.61) can be written as a weighted sum of (iµ)3 0 y 2 (iµ)4 0 y 3 0 Efexp(iµgT ;1 )[iµ + (iµ)2 a0 my + y T (a mT ) + (a mT ) ]c [vech(¡y ¡ ¡y ) -my ]g (A.62) ~ j j T 2 6 and that the rest of the terms can be written as a weighted sum of 1 (iµ)3 0 y 2 (iµ)4 0 y 3 0 µ2 Ef[iµ ¡ (iµ)3 + (iµ)2 a0 my + T (a mT ) + (a mT ) ]c [vech(¡y ¡ ¡y ) -my ]g exp(¡ ) ~ j j T 2 2 6 2 (A.63) 29 ~ We will apply Lemma 3.33 of GÄtze and Hipp (1983) to (A.62) and (A.63). Let ÃY denote the o 0 ¡1=2 PT y multivariate expansion of E(exp(i# T t=1 Yt )) where # = (µ; 0; :::; 0) and X T Yty = (a0 my ; my0 ; T ¡1=2 T T vech[vt vt¡j ¡ E(vt vt¡j )]0 )0 : 0 0 t=1 Then the di®erence between (A.61) and (A.62) are bounded by ¯ ¯ X ¯ X y T ¯ ¯ ~ ¯ T ¡1=2 jc® jD® ¯E(exp(i#0 T ¡1=2 Yt )) ¡ ÃY )¯ = O((1 + jµj6 ) exp(¡µ2 )T ¡1¡" ); (A.64) ® ¯ ¯ t=1 where c® are the corresponding linear combinations of a and c. Thus (A.61) follows. Fourth, by arguments analogous to the proof of (A.61), one can show that y y (iµ)3 y2 y jE[exp(iµgT;1 )(iµgT ;4 + g g )] 2 T ;1 T;4 1 ¡ ((iµ ¡ (iµ)3 )Efc0 [vech(ST ¡ ST ) -my ]g + (iµ)3 Ef(a0 my )2 c0 [vech(ST ¡ ST ) -my ]g) ^y ~y T T ^y ~y T 2 2 µ £ exp(¡ )j 2 = O((1 + jµj6 ) exp(¡µ 2 )`T ¡1¡" ); (A.65) and y (iµ)2 y y 2 (iµ)4 y3 y (iµ)4 y2 y2 jE[exp(iµgT;1 )[ (2gT;1 gT;7 + gT ;3 ) + gT;1 gT;7 + g g ] 2 6 4 T ;1 T;3 (iµ)2 ¡ ( (2Efa0 mT c0 [vech(ST ¡ ST ) -mT ]g + Efc0 [vech(ST ¡ ST ) -mT ]g2 ) ~ ¹ ^ ¹ 2 (iµ)4 + Efa0 my e0 [vech(ST ¡ ST ) -vech(ST ¡ ST ) -my ]g T ^y ¹y ^y ¹y T 6 (iµ)4 µ2 + Efc0 [vech(ST ¡ ST ) -mT ]g2 ) exp(¡ )j ^ ¹ 6 2 = O((1 + jµj6 ) exp(¡µ 2 )`2 T ¡3=2¡" ): (A.66) Lastly, it follows from Lemma A.2 that y4 y4 y4 µ 4 [E(gT;2 ) + E(gT;3 ) + : : : + E(gT ;4 )] = O(µ 4 `2 T ¡2 ): (A.67) Combining and integrating (A.59), (A.60), (A.61), (A.65), (A.66) and (A.67) produces the desired result. Q.E.D. Lemma A.7: Z y ~y jÃg;T (µ) ¡ Ãg;T (µ)jjµj¡1 dµ = o(`T ¡1 ): (A.68) T " 0. Let N = [(T=µ 2 + 1)m2 ] for T " < jµj < T 1¡2=r . Then m · N · T for su±ciently large T . 
De¯ne X N X T mN = T ¡1=2 vt ; mT ¡N = T ¡1=2 vt ; t=1 t=N+1 X N X T GN ¡ E(GN ) = (1=T ) (wt ¡ E(wt )); GT ¡N ¡ E(GT ¡N ) = (1=T ) (wt ¡ E(wt )); t=1 t=N+1 30 X ` X ` SN ¡ SN = ~ ¹ !j (¡j;N ¡ ¡j ); ~ ST ¡N ¡ ST ¡N = ~ ¹ !j (¡j;T ¡N ¡ ¡j ); ~ j=¡` j=¡` X ` X ` ^ ~ SN ¡ SN = ^ ~ !j (¡j;N ¡ ¡j;N ); ^ ~ ST ¡N ¡ ST ¡N = ^ ~ !j (¡j;T ¡N ¡ ¡j;T ¡N ) j=¡` j=¡` so that mT = mN + mT ¡N ; GT ¡ G0 = GN ¡ E(GN ) + GT ¡N ¡ E(GT ¡N ); ~ ST ¹ ¡ ST = ~ ¹ ~ ¹ SN ¡ SN + ST ¡N ¡ ST ¡N ; ^T S ¡ ST ~ = ^N ¡ SN + ST ¡N ¡ ST ¡N : S ~ ^ ~ Write gT = a0 mT + Q(mT ; GT ; ST ; ST ; ST ): ^ ~ ¹ Then a Taylor series expansion of Q around vt = 0 and wt = 0 for t = 1; 2; :::; N yields E exp(iµgT ) = E[exp(iµa0 mT + iµQ(mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N ) ^ ~ ¹ X £ ¹ º ^ ~ ¹ v w Q¹º (mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N )] ®;¯ ^ ~ ¹ ^ ~ ¹ +O(jµjr EjQ(mT ; GT ; ST ; ST ; ST ) ¡ Q(mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N )jr )(A.69) where the power is element-by-element and the indices satisfy ¹ = (¹1 ; :::; ¹N +`¡1 ; 0; :::; 0); º = (º1 ; :::; ºN ; 0; :::; 0); j¹j + jºj · 5(r ¡ 1): First, we will consider the expansion terms in (A.69). Let 0 0 fj1 ; :::; j5(r¡1) g = fj : ¹j or ºj > 0g; 0 I = fj 2 f1; :::; N ¡ mg : jj ¡ jk j ¸ 3m; k = 1; :::; 5(r ¡ 1)g; jk+1 = inffj 2 I : j ¸ jk + 7mg and j1 = inf I. Let s denote the smallest integer for which the inf is unde¯ned. Let Y Ak = fexp(iµT ¡1=2 a0 vt : j 2 I; jj ¡ jk j · mg; k = 1; :::; s; Y Bk = fexp(iµT ¡1=2 a0 vt : j 2 I; jk + m + 1 · j · jk+1 ¡ m ¡ 1g; k = 1; :::; s ¡ 1; Y Bs = fexp(iµT ¡1=2 a0 vt : j 2 I; j ¸ js ¡ m ¡ 1g; Y R = exp(iµT ¡1=2 a0 vt ) exp(iµQ(mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N ))v¹ wº Q¹º : ^ ~ ¹ j62I Then we can write E[exp(iµa0 mT + iµQ(mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N ) ^ ~ ¹ X Y s £ ^ ~ ¹ v¹ wº Q¹º (mT ¡N ; GT ¡N ; ST ¡N ; ST ¡N ; ST ¡N )] = Ak Bk R: (A.70) ®;¯ k=1 Note that jAk j · 1, jBk j · 1, jRj · T °(s¡1)r , and that Ak , Bk and R are measurable with jk +2m jk +1 respect to Fjk ¡2m , Fjk ¡1 , fFl : 9j 62 I; jl ¡ jj · mg, respectively. By Assumption 1(d), it 31 follows that Y s Y s jE[ Ak Bk R] ¡ E[ E(Ak jFj : jj ¡ jk j · 3m)Bk R]j k=1 k=1 X s Y j¡1 Y s · jE[ Ak Bk (Aj ¡ E(Aj jFj : jj ¡ jk j · 3m)) E(Al jFj ; jj ¡ jl j · 3m)Bl j j=1 k=1 l=j+1 X s Y j¡1 jk ¡1 1 = jE[ Ak Bk (E(Aj jF¡1 [ Fjk +1 ) ¡ E(Aj jFj : jj ¡ jk j · 3m)) j=1 k=1 Y s £ E(Al jFj : jj ¡ jl j · 3m)Bl j l=j+1 = O(T c1 exp(¡dm)) = o(T ¡c2 ) (A.71) for any arbitrary c2 > 0 by choosing su±ciently large M . By the mixing inequality of Hall and Heyde (1980), we obtain Y s jE[R E(Ak jFj : jj ¡ jk j · 3m)Bk ]j k=1 Y s · T c3 E jE(Ak jFk : 0 < jj ¡ jk j · 3m)j j=1 Ys +T c3 EjE(Ak jFj : 0 < jj ¡ jk j · 3m)j + 4T c3 (q=d) exp(¡dm) (A.72) j=1 for some c3 > 0. For jµj ¸ d, we have EjE(Ak jFj ; j 6= jk )j · exp(¡d). Thus by Lemma 3.2 of GÄtze and Hipp (1983) and Assumption 1(d), it follows that o EjE(Ak jFj ; jj ¡ jk j · 3m)j · EjE(Ak jFj : jj ¡ jk j 6= 0)j + O(T c exp(¡dm)) · max(exp(¡dµ2 =T ); exp(¡d)) + O(T c3 exp(¡dm))(A.73) Y s E[ Ak Bk R] = O(T ¡c ) (A.74) k=1 for arbitrary c > 0 by choosing su±ciently large M . Next, consider the remainder term in (A.69). 
It follows from Lemma A.2 that
$$E|m_N|^r = O((N/T)^r), \qquad (A.75)$$
$$E|T^{1/2}(G_N - E(G_N))|^r = O((N/T)^r), \qquad (A.76)$$
$$E|(T/\ell)^{1/2}\mathrm{vech}(\tilde S_N - \bar S_N)|^r = O((N/T)^{r/2}), \qquad (A.77)$$
$$E|(T/\ell)^{1/2}\mathrm{vech}(\nabla\tilde S_N - \nabla\bar S_N)|^r = O((N/T)^{r/2}), \qquad (A.78)$$
$$E|T^{1/2}\mathrm{vech}(\hat S_N - \tilde S_N)|^r = O((N/T)^{r/2}). \qquad (A.79)$$
Using the definition of $N$ and $\varepsilon r > 2$, we obtain that
$$|\theta|^rE|Q(m_T, G_T, \hat S_T, \tilde S_T, \bar S_T) - Q(m_{T-N}, G_{T-N}, \hat S_{T-N}, \tilde S_{T-N}, \bar S_{T-N})|^r = O(\ell^{r/2}|\theta|^rN^{r/2}T^{-r})$$
$$= \begin{cases}O(\ell^{r/2}m^rT^{-r/2}) & \text{for}\ |\theta| \le T^{1/2},\\ O(|\theta|^r\ell^{r/2}m^rT^{-r}) & \text{for}\ T^{1/2} < |\theta| \le \ell^{-1/2}T^{1-\varepsilon},\end{cases} \quad = o(\ell T^{-1}). \qquad (A.80)$$
Lastly, it follows from (A.69), (A.71)–(A.73) and (A.80) that
$$E\exp(i\theta g_T) = T^c\max(\exp(-d\theta^2/T), \exp(-d))^{N/M} + O(T^c\exp(-dm)) + o(\ell T^{-1}) = o(\ell T^{-1}) \qquad (A.81)$$
for $s \ge N/M$ and sufficiently large $M$, which completes the proof. Q.E.D.

Lemma B.1: For $1 \le s \le r/2$,
$$E^*[\|\mathrm{vec}(F_{N_j})\|^s] - E\big\{E^*[\|\mathrm{vec}(F_{N_j})\|^s]\big\} = O_p(b^{-1/2}), \qquad (A.82)$$
$$E^*[\|B_{N_j}\|^s] - E\big\{E^*[\|B_{N_j}\|^s]\big\} = O_p(b^{-1/2}). \qquad (A.83)$$

Proof of Lemma B.1: First we prove (A.82). We can write the LHS of (A.82) as
$$\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\big(\|\mathrm{vec}(F_t)\|^s - E[\|\mathrm{vec}(F_t)\|^s]\big) = \frac b{T-\ell+1}\sum_{\nu=1}^{\ell}f_{s,\nu}, \qquad (A.84)$$
where
$$f_{s,\nu} = \frac1b\sum_{\mu=0}^{b-1}\big(\|\mathrm{vec}(F_{\mu\ell+\nu})\|^s - E(\|\mathrm{vec}(F_{\mu\ell+\nu})\|^s)\big).$$
Note that $\{\mathrm{vec}(F_{\mu\ell+\nu})\}_{\mu=0}^{b-1}$ is a triangular array of strong mixing sequences with mixing coefficients given by $\{\alpha_{\mu\ell}\}$, where $\alpha_m$ is the mixing coefficient of the original variables; so is $\|\mathrm{vec}(F_{\mu\ell+\nu})\|^s$. Thus it follows that
$$f_{s,\nu} = O_p(b^{-1/2}). \qquad (A.85)$$
Since the decay rate of the mixing coefficients is uniform in $\nu$, (A.85) also holds uniformly in $\nu$. Hence (A.82) follows from (A.84) and (A.85).

Next we prove (A.83). Note that the LHS of (A.83) is bounded by
$$O\Big(\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\|\tilde B_t\|^s - E\|\tilde B_t\|^s\Big) \qquad (A.86)$$
$$+ O\Big(\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\|\hat B_t\|^s - \|\tilde B_t\|^s - E(\|\hat B_t\|^s - \|\tilde B_t\|^s)\Big) \qquad (A.87)$$
$$+ O\big(\|\mu_T^*\|^s - E(\|\mu_T^*\|^s)\big), \qquad (A.88)$$
where $\hat B_t = \ell^{-1/2}\sum_{j=1}^{\ell}\hat v_{t+j}$ and $\tilde B_t = \ell^{-1/2}\sum_{j=1}^{\ell}v_{t+j}$. First, the proof that (A.86) is $O_p(b^{-1/2})$ is analogous to the proof of (A.82) and thus is omitted. Second, we prove that (A.87) is $O_p(b^{-1/2})$. A Taylor series expansion yields
$$\|\hat B_t\|^s - \|\tilde B_t\|^s = s\|\tilde B_t\|^{s-2}\tilde B_t'F_t\ell^{1/2}(\hat\beta_T - \beta_0). \qquad (A.89)$$
Thus we have
$$\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}(\|\hat B_t\|^s - \|\tilde B_t\|^s) = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}s\|\tilde B_t\|^{s-2}\tilde B_t'F_t\ell^{1/2}(\hat\beta_T - \beta_0). \qquad (A.90)$$
By arguments analogous to those used in the proof of (A.82), it follows from the ergodic theorem that
$$\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}s\|\tilde B_t\|^{s-2}\tilde B_t'F_t = O_{a.s.}(1). \qquad (A.91)$$
Thus it follows from Assumption 1(i) that
$$\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}(\|\hat B_t\|^s - \|\tilde B_t\|^s) = O_p(b^{-1/2}). \qquad (A.92)$$
Similarly we obtain
$$\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}E(\|\hat B_t\|^s - \|\tilde B_t\|^s) = O(b^{-1/2}). \qquad (A.93)$$
Hence it follows from (A.92) and (A.93) that (A.87) is $O_p(b^{-1/2})$. Third, we prove that (A.88) is $O_p(b^{-1/2})$. We can write $\mu_T^*$ as
$$\mu_T^* = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\hat B_t = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\tilde B_t + \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}F_t\ell^{1/2}(\hat\beta_T - \beta_0) + \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}H_t\ell^{1/2}(\hat\beta_T - \beta_0)^2. \qquad (A.94)$$
Thus we obtain
$$\|\mu_T^*\|^s - E\|\mu_T^*\|^s = O\Big(\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\|\tilde B_t\|^s - E\|\tilde B_t\|^s\Big) + O\Big(\frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\|F_t\ell^{1/2}(\hat\beta_T - \beta_0)\|^s - E\|F_t\ell^{1/2}(\hat\beta_T - \beta_0)\|^s\Big). \qquad (A.95)$$
The rest of the proof is analogous to the proofs for (A.86) and (A.87). Therefore (A.83) follows from (A.86), (A.87) and (A.88). Q.E.D.

Lemma B.2: Let $G_0^* = E^*(G_T^*)$, and let $B_T^*$ and $C_T^*$ denote the bootstrap versions of $B$ and $C$ in Lemma A.1 with $S_0$ replaced by $S_T^*$, respectively.
Then
$$G_0^* = G_0 + O_p(T^{-1/2}), \qquad (A.96)$$
$$S_T^* = S + O(\ell^{-1}) + O_p(b^{-1/2}). \qquad (A.97)$$

Proof of Lemma B.2: First, we prove (A.96). We have
$$G_0^* = E^*[G_T^*] = E^*\Big[\frac1b\sum_{k=1}^bF_{N_k}\Big] = E^*[F_{N_1}] = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}F_t = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}\frac1\ell\sum_{i=1}^{\ell}w_{t+i} = \frac1T\sum_{t=1}^Tw_t + O_p(\ell T^{-1}) = G_T + O_p(\ell T^{-1}).$$
Therefore, (A.96) follows from $G_T - G_0 = O_p(T^{-1/2})$.

Next, we prove (A.97). By definition, it follows that
$$S_T^* \equiv \mathrm{Var}^*(m_T^*) = \mathrm{Var}^*\Big(\frac1{\sqrt b}\sum_{k=1}^bB_{N_k}\Big) \qquad (A.98)$$
$$= E^*\Big[\Big(\frac1{\sqrt b}\sum_{k=1}^bB_{N_k} - \sqrt b\,E^*(B_{N_1})\Big)\Big(\frac1{\sqrt b}\sum_{k=1}^bB_{N_k} - \sqrt b\,E^*(B_{N_1})\Big)'\Big] \qquad (A.99)$$
$$= E^*\Big[\Big(\frac1{\sqrt b}\sum_{k=1}^bB_{N_k}\Big)\Big(\frac1{\sqrt b}\sum_{k=1}^bB_{N_k}\Big)'\Big] \qquad (A.100)$$
$$= \frac1b\sum_{k=1}^bE^*(B_{N_k}B_{N_k}') = E^*(B_{N_1}B_{N_1}') = \frac1{T-\ell+1}\sum_{t=0}^{T-\ell}B_tB_t'. \qquad (A.101)$$
It follows from Lemma B.1 that
$$S_T^* - E[S_T^*] = O_p(b^{-1/2}). \qquad (A.102)$$
Since $\mu_T^* = O_p(T^{-1/2})$, we have
$$E[S_T^*] = E[B_tB_t'] = \sum_{j=-\ell}^{\ell}(1 - |j|/\ell)E[v_0v_{-j}'] = S + O(\ell^{-1}). \qquad (A.103)$$
Thus (A.97) follows from (A.101), (A.102) and (A.103). Q.E.D.

Lemma B.3: Let
$$\alpha_T^* = T^{1/2}\kappa_1^*(g_T^*), \qquad \gamma_T^* = (T/\ell)(\kappa_2^*(g_T^*) - 1) = (T/\ell)(E^*(g_T^{*2}) - [E^*(g_T^*)]^2 - 1),$$
$$\kappa_T^* = T^{1/2}E^*(g_T^{*3}) = T^{1/2}\{\kappa_3^*(g_T^*) + 3E^*(g_T^{*2})E^*(g_T^*) - 2[E^*(g_T^*)]^3\}, \qquad \zeta_T^* = (T/\ell)(\kappa_4^*(g_T^*) - 3).$$
Then
$$\alpha_T^* = T^{1/2}b^{*\prime}E^*[(G_T^* - G_0^*)\otimes m_T^*] + T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes m_T^*] + T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*] + o_p^*(\ell T^{-1/2})$$
$$= \alpha_1 + O_p(\ell^{-1}) + O_p(b^{-1/2}) + o_p^*(\ell T^{-1/2}), \qquad (A.104)$$
$$\gamma_T^* = 2(T/\ell)E^*\{a^{*\prime}m_T^*b^{*\prime}[(G_T^* - G_0^*)\otimes m_T^*]\} + 2(T/\ell)E^*\{a^{*\prime}m_T^*c^{*\prime}[\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes m_T^*]\}$$
$$\quad + 2(T/\ell)E^*\{a^{*\prime}m_T^*e^{*\prime}[\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes m_T^*]\} + (T/\ell)E^*\{c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\}^2 + o_p^*(1)$$
$$= \gamma_1 + o_p(1) + o_p^*(1), \qquad (A.105)$$
$$\kappa_T^* = T^{1/2}E^*[(a^{*\prime}m_T^*)^3] + 3T^{1/2}E^*\{(a^{*\prime}m_T^*)^2b^{*\prime}[(G_T^* - G_0^*)\otimes m_T^*]\} + 3T^{1/2}E^*\{(a^{*\prime}m_T^*)^2c^{*\prime}[\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes m_T^*]\}$$
$$\quad + 3T^{1/2}E^*\{(a^{*\prime}m_T^*)^2c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\} + o_p^*(\ell T^{-1/2}) = \kappa_1 + O_p(\ell^{-1/2}) + O_p(b^{-1/2}) + o_p^*(\ell T^{-1/2}), \qquad (A.106)$$
$$\zeta_T^* = 4(T/\ell)E^*\{(a^{*\prime}m_T^*)^3c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\} + 4(T/\ell)E^*\{(a^{*\prime}m_T^*)^3e^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\}$$
$$\quad + 6(T/\ell)E^*\big((a^{*\prime}m_T^*)^2\{c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\}^2\big) - 12(T/\ell)E^*\{a^{*\prime}m_T^*c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\}$$
$$\quad - 12(T/\ell)E^*\{a^{*\prime}m_T^*e^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\} - 6(T/\ell)E^*\{c^{*\prime}[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*]\}^2 + o_p^*(1)$$
$$= \zeta_1 + o_p(1) + o_p^*(1), \qquad (A.107)$$
where $\alpha_1$, $\gamma_1$, $\kappa_1$ and $\zeta_1$ are defined in Lemma A.3.

Proof of Lemma B.3: The first equalities in (A.104)–(A.107) follow from Lemmas B.1 and B.2. Thus we show that the second equalities hold in the rest of the proof.

Part (a): Proof of (A.104). First, we introduce some notation for the proof. Let
$$\alpha_{1T}^* = T^{1/2}b^{*\prime}E^*[(G_T^* - G_0^*)\otimes m_T^*], \qquad \alpha_{2T}^* = T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\tilde S_T^* - S_T^*)\otimes m_T^*], \qquad \alpha_{3T}^* = T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\hat S_T^* - \tilde S_T^*)\otimes m_T^*],$$
$$\alpha_{1T} = T^{1/2}b'E[(G_T - G_0)\otimes m_T], \qquad \alpha_{2T} = T^{1/2}c'E[\mathrm{vech}(\tilde S_T - \bar S_T)\otimes m_T], \qquad \alpha_{3T} = T^{1/2}c'E[\mathrm{vech}(\hat S_T - \tilde S_T)\otimes m_T],$$
$$\alpha_{11} = b'\sum_{i=-\infty}^{\infty}E[w_0\otimes v_i], \qquad \alpha_{21} = c'\sum_{i,j=-\infty}^{\infty}E[\mathrm{vech}(v_0v_i')\otimes v_j],$$
$$\alpha_{31} = c'\sum_{i=-\infty}^{\infty}E\{\mathrm{vech}[\nabla S(E(w_0)'VE(w_0))^{-1}E(w_0)'Vv_0]\otimes v_i\}.$$
Next, we prove that
$$\alpha_{1T}^* - \alpha_{11} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}), \qquad (A.108)$$
$$\alpha_{2T}^* - \alpha_{21} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}), \qquad (A.109)$$
$$\alpha_{3T}^* - \alpha_{31} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}). \qquad (A.110)$$
Since $\alpha_1 = \alpha_{11} + \alpha_{21} + \alpha_{31}$ and $\alpha_T^* = \alpha_{1T}^* + \alpha_{2T}^* + \alpha_{3T}^*$, (A.104) follows from (A.108), (A.109) and (A.110). First, we prove (A.108).
Lemma B.3: Let
$$\alpha^*_T = T^{1/2}\kappa^*_1(g^*_T),$$
$$\gamma^*_T = (T/\ell)(\kappa^*_2(g^*_T) - 1) = (T/\ell)\left(E^*(g^{*2}_T) - [E^*(g^*_T)]^2 - 1\right),$$
$$\kappa^*_T = T^{1/2}E^*(g^{*3}_T) = T^{1/2}\left\{\kappa^*_3(g^*_T) + 3E^*(g^*_T)E^*(g^{*2}_T) - 2[E^*(g^*_T)]^3\right\},$$
$$\zeta^*_T = (T/\ell)(\kappa^*_4(g^*_T) - 3).$$
Then
$$\alpha^*_T = \alpha_1 + T^{1/2}b^{*\prime}E^*[(G^*_T - G^*_0)\otimes m^*_T] + T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T] + T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T] + o^*_p(\ell T^{-1/2})$$
$$= \alpha_1 + O_p(\ell^{-1}) + O_p(b^{-1/2}) + o^*_p(\ell T^{-1/2}), \qquad (A.104)$$
$$\gamma^*_T = \gamma_1 + 2(T/\ell)E^*\{a^{*\prime}m^*_T\,b^{*\prime}[(G^*_T - G^*_0)\otimes m^*_T]\} + 2(T/\ell)E^*\{a^{*\prime}m^*_T\,c^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\} + 2(T/\ell)E^*\{a^{*\prime}m^*_T\,e^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\} + (T/\ell)E^*\{c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\}^2 + o^*_p(1)$$
$$= \gamma_1 + o_p(1) + o^*_p(1), \qquad (A.105)$$
$$\kappa^*_T = \kappa_1 + T^{1/2}E^*[(a^{*\prime}m^*_T)^3] + 3T^{1/2}E^*\{(a^{*\prime}m^*_T)^2 b^{*\prime}[(G^*_T - G^*_0)\otimes m^*_T]\} + 3T^{1/2}E^*\{(a^{*\prime}m^*_T)^2 c^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\} + 3T^{1/2}E^*\{(a^{*\prime}m^*_T)^2 c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\} + o^*_p(\ell T^{-1/2})$$
$$= \kappa_1 + O_p(\ell^{-1/2}) + O_p(b^{-1/2}) + o^*_p(\ell T^{-1/2}), \qquad (A.106)$$
$$\zeta^*_T = \zeta_1 + 4(T/\ell)E^*\{(a^{*\prime}m^*_T)^3 c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\} + 4(T/\ell)E^*\{(a^{*\prime}m^*_T)^3 e^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\} + 6(T/\ell)E^*\left((a^{*\prime}m^*_T)^2\{c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\}^2\right) - 12(T/\ell)E^*\{a^{*\prime}m^*_T\,c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\} - 12(T/\ell)E^*\{a^{*\prime}m^*_T\,e^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\} - 6(T/\ell)E^*\{c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\}^2 + o^*_p(1)$$
$$= \zeta_1 + o_p(1) + o^*_p(1), \qquad (A.107)$$
where $\alpha_1$, $\gamma_1$, $\kappa_1$, and $\zeta_1$ are defined in Lemma A.4.

Proof of Lemma B.3: The first equalities in (A.104)-(A.107) follow from Lemmas B.1 and B.2. Thus we will show that the second equalities hold in the rest of the proof.

Part (a): Proof of (A.104). First, we introduce some notation for the proof. Let
$$\alpha^*_{1T} = T^{1/2}b^{*\prime}E^*[(G^*_T - G^*_0)\otimes m^*_T], \qquad \alpha^*_{2T} = T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T], \qquad \alpha^*_{3T} = T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T],$$
$$\alpha_{1T} = T^{1/2}b'E[(G_T - G_0)\otimes m_T], \qquad \alpha_{2T} = T^{1/2}c'E[\mathrm{vech}(\tilde{S}_T - \bar{S}_T)\otimes m_T], \qquad \alpha_{3T} = T^{1/2}c'E[\mathrm{vech}(\hat{S}_T - \tilde{S}_T)\otimes m_T],$$
$$\alpha_{11} = b'\sum_{i=-\infty}^{\infty}E[w_0\otimes v_i], \qquad \alpha_{21} = c'\sum_{i,j=-\infty}^{\infty}E[\mathrm{vech}(v_0v_i')\otimes v_j], \qquad \alpha_{31} = c'\sum_{i=-\infty}^{\infty}E\{\mathrm{vech}[\nabla\bar{S}(E(w_0)'VE(w_0))^{-1}E(w_0)'Vv_0]\otimes v_i\}.$$
Next, we will prove that
$$\alpha^*_{1T} - \alpha_{11} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}), \qquad (A.108)$$
$$\alpha^*_{2T} - \alpha_{21} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}), \qquad (A.109)$$
$$\alpha^*_{3T} - \alpha_{31} = O_p(\ell^{-1}) + O_p(\ell^{1/2}b^{-1/2}). \qquad (A.110)$$
Since $\alpha_1 = \alpha_{11} + \alpha_{21} + \alpha_{31}$ and $\alpha^*_T = \alpha^*_{1T} + \alpha^*_{2T} + \alpha^*_{3T}$, (A.104) follows from (A.108), (A.109) and (A.110).

First, we will prove (A.108). From Lemma B.2, we have $b^* = b + O(\ell^{-1}) + O_p(b^{-1/2})$ and thus
$$\alpha^*_{1T} = \sqrt{\ell}\,b^{*\prime}E^*\{[F_{N1} - E^*(F_{N1})]\otimes B_{N1}\} = \sqrt{\ell}\,b^{*\prime}E^*[\tilde{F}_{N1}\otimes B_{N1}] = \sqrt{\ell}\,b'E^*[\tilde{F}_{N1}\otimes B_{N1}] + O_p(\ell^{-1}) + O_p(b^{-1/2}) = \alpha^*_{11} + O_p(\ell^{-1}) + O_p(b^{-1/2}), \text{ say.} \qquad (A.111)$$
By combining (A.111) with
$$E[\alpha^*_{11}] = \sqrt{\ell}\,b'E[\tilde{F}_t\otimes B_t] = \sum_{j=-\ell}^{\ell}(1 - |j|/\ell)\,b'E[w_0\otimes v_{-j}] = \alpha_{11} + O(\ell^{-1}) \qquad (A.112)$$
and $\alpha^*_{11} - E[\alpha^*_{11}] = O_p(b^{-1/2})$ from Lemma B.1, we obtain (A.108).

Second, we will prove (A.109). Similarly, we have $c^* = c + O(\ell^{-1}) + O_p(b^{-1/2})$ from Lemma B.2 and thus
$$\alpha^*_{2T} = \sqrt{\ell}\,c^{*\prime}E^*[\mathrm{vech}(B_{N1}B_{N1}' - E^*(B_{N1}B_{N1}'))\otimes B_{N1}] = \sqrt{\ell}\,c'E^*[\mathrm{vech}(B_{N1}B_{N1}' - E^*(B_{N1}B_{N1}'))\otimes B_{N1}] + O_p(\ell^{-1}) + O_p(b^{-1/2})$$
$$= \sqrt{\ell}\,c'E^*[\mathrm{vech}(B_{N1}B_{N1}')\otimes B_{N1}] + O_p(\ell^{-1}) + O_p(b^{-1/2}) = \alpha^*_{21} + O_p(\ell^{-1}) + O_p(b^{-1/2}), \text{ say.} \qquad (A.113)$$
By combining (A.113) with
$$E[\alpha^*_{21}] = \sqrt{\ell}\,c'E[\mathrm{vech}(B_tB_t')\otimes B_t] = \sum_{i,j=-\ell}^{\ell}\left(1 - \frac{\min\{(\max(|i|,|j|))\mathbf{1}(ij > 0) + (|i| + |j|)\mathbf{1}(ij \le 0),\,\ell\}}{\ell}\right)c'E[\mathrm{vech}(v_0v_{-i}')\otimes v_{-j}] = \alpha_{21} + O(\ell^{-1}) \qquad (A.114)$$
and $\alpha^*_{21} - E[\alpha^*_{21}] = O_p(b^{-1/2})$ from Lemma B.1, we obtain (A.109).

Lastly, we will prove (A.110). Note that
$$\hat{S}^*_T - \tilde{S}^*_T = \frac{1}{b}\sum_{k=1}^{b}\left(\hat{B}_{Nk}\hat{B}_{Nk}' - B_{Nk}B_{Nk}'\right) = \nabla\tilde{S}^*_T(\hat{\beta}^* - \hat{\beta}) + \nabla^2\tilde{S}^*_T(\hat{\beta}^* - \hat{\beta})^2, \qquad (A.115)$$
where
$$\nabla\tilde{S}^*_T = \frac{\sqrt{\ell}}{b}\sum_{k=1}^{b}\left(F_{Nk}B_{Nk}' + B_{Nk}F_{Nk}'\right), \qquad \nabla^2\tilde{S}^*_T = \frac{\ell}{b}\sum_{k=1}^{b}F_{Nk}F_{Nk}', \qquad \hat{\beta}^* - \hat{\beta} = [G^{*\prime}_TV_TG^*_T]^{-1}G^{*\prime}_TV_T\frac{1}{\sqrt{T}}m^*_T.$$
First, note that
$$\nabla\tilde{S}^*_T = E^*[\nabla\tilde{S}^*_T] + O^*_p(b^{-1/2}), \qquad \ell^{-1}\nabla^2\tilde{S}^*_T = \ell^{-1}E^*[\nabla^2\tilde{S}^*_T] + O^*_p(b^{-1/2}),$$
where
$$E^*[\nabla\tilde{S}^*_T] = \sqrt{\ell}\,E^*[F_{N1}B_{N1}' + B_{N1}F_{N1}'] = \sqrt{\ell}\,E^*[\tilde{F}_{N1}B_{N1}' + B_{N1}\tilde{F}_{N1}'], \qquad E^*[\nabla^2\tilde{S}^*_T] = \ell\,E^*[F_{N1}F_{N1}'].$$
Second, note that
$$T^{1/2}(\hat{\beta}^* - \hat{\beta}) = \left[E^*[G^{*\prime}_T]V_TE^*[G^*_T]\right]^{-1}E^*[G^{*\prime}_T]V_Tm^*_T + O^*_p(b^{-1/2}) \qquad (A.116)$$
since $G^*_T - E^*[G^*_T] = O^*_p(b^{-1/2})$ and
$$G^{*\prime}_TV_TG^*_T - E^*[G^{*\prime}_T]V_TE^*[G^*_T] = O^*_p(b^{-1/2}).$$
Thus it follows from (A.115)-(A.116) that
$$\alpha^*_{3T} = T^{1/2}c^{*\prime}E^*[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T] = c^{*\prime}E^*\left[\mathrm{vech}\left(\nabla\tilde{S}^*_T\,T^{1/2}(\hat{\beta}^* - \hat{\beta}) + \nabla^2\tilde{S}^*_T\,T^{1/2}(\hat{\beta}^* - \hat{\beta})^2\right)\otimes m^*_T\right]$$
$$= c^{*\prime}E^*\left[\mathrm{vech}\left(\nabla\tilde{S}^*_T\,T^{1/2}(\hat{\beta}^* - \hat{\beta})\right)\otimes m^*_T\right] + O^*_p(\ell^{1/2}T^{-1/2})$$
$$= c^{*\prime}E^*\left\{\mathrm{vech}\left(E^*[\nabla\tilde{S}^*_T]\left[E^*[G^{*\prime}_T]V_TE^*[G^*_T]\right]^{-1}E^*[G^{*\prime}_T]V_Tm^*_T\right)\otimes m^*_T\right\} + O^*_p(\ell^{1/2}T^{-1/2})$$
$$= c'E^*\left\{\mathrm{vech}\left(E^*[\tilde{F}_{N1}B_{N1}' + B_{N1}\tilde{F}_{N1}']\left[E^*[F_{N1}]'VE^*[F_{N1}]\right]^{-1}E^*[F_{N1}]'VB_{N1}\right)\otimes B_{N1}\right\} + O^*_p(\ell^{1/2}T^{-1/2}) + O_p(\ell^{-1}) + O_p(b^{-1/2})$$
$$= \alpha^*_{31} + O^*_p(\ell^{1/2}T^{-1/2}) + O_p(\ell^{-1}) + O_p(b^{-1/2}), \text{ say.} \qquad (A.117)$$
Since
$$E\left[E^*[\tilde{F}_{N1}B_{N1}' + B_{N1}\tilde{F}_{N1}']\right] = \sqrt{\ell}\,E[F_tB_t' + B_tF_t'] = \sum_{j=-\ell}^{\ell}(1 - |j|/\ell)E[w_0v_{-j}' + v_0w_{-j}'] = \sum_{j=-\infty}^{\infty}E[w_0v_{-j}' + v_0w_{-j}'] + O(\ell^{-1}) = \nabla\bar{S} + O(\ell^{-1}), \qquad (A.118)$$
$$E\left[E^*[F_{N1}]\right] = E[F_t] = E[w_0], \qquad (A.119)$$
it follows that
$$E^*[\tilde{F}_{N1}B_{N1}' + B_{N1}\tilde{F}_{N1}'] - E\left[E^*[\tilde{F}_{N1}B_{N1}' + B_{N1}\tilde{F}_{N1}']\right] = O_p(b^{-1/2}), \qquad (A.120)$$
$$E^*[F_{N1}] - E\left[E^*[F_{N1}]\right] = O_p(b^{-1/2}). \qquad (A.121)$$
Hence, it follows from the moment inequality, Lemma B.1, (A.120) and (A.121) that
$$\alpha^*_{31} = E[\alpha^*_{31}] + O_p(b^{-1/2}) = c'\sum_{i=-\infty}^{\infty}E\{\mathrm{vech}[\nabla\bar{S}(E(w_0)'VE(w_0))^{-1}E(w_0)'Vv_0]\otimes v_i\} + O_p(\ell^{-1}) + O_p(b^{-1/2}) = \alpha_{31} + O_p(\ell^{-1}) + O_p(b^{-1/2}). \qquad (A.122)$$
Therefore, (A.110) follows from (A.117) and (A.122).
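As a check on the identity (A.112) used in the proof of (A.108), the following scalar sketch (our own illustration; the MA(1) cross-correlation design and the helper name Ewv are arbitrary choices, not the paper's) compares $\sqrt{\ell}\,E^*[\tilde{F}_{N1}\otimes B_{N1}]$, computed from the overlapping blocks, with the Bartlett-weighted sum $\sum_{j=-\ell}^{\ell}(1-|j|/\ell)E[w_0 v_{-j}]$.

import numpy as np

rng = np.random.default_rng(2)
T, ell = 2000, 10

# scalar stand-ins for w_t and v_t with cross-correlation at lags 0 and 1
u = rng.standard_normal(T + 1)
w = u[1:] + 0.5 * u[:-1]  # MA(1), so E[w_0 v_{-j}] = 1{j=0} + 0.5*1{j=1}
v = u[1:]

c_w = np.concatenate(([0.0], np.cumsum(w)))
c_v = np.concatenate(([0.0], np.cumsum(v)))
F = (c_w[ell:] - c_w[:-ell]) / ell            # F_t = ell^{-1} sum_i w_{t+i}
B = (c_v[ell:] - c_v[:-ell]) / np.sqrt(ell)   # B_t = ell^{-1/2} sum_j v_{t+j}
F_tilde = F - F.mean()                        # recentered, as in F~_{N1}

lhs = np.sqrt(ell) * np.mean(F_tilde * B)     # sqrt(ell) E*[F~ (x) B], cf. (A.112)
Ewv = lambda j: 1.0 if j == 0 else (0.5 if j == 1 else 0.0)
rhs = sum((1 - abs(j) / ell) * Ewv(j) for j in range(-ell, ell + 1))
print(f"sqrt(ell)*E*[F~ B] = {lhs:.3f}   Bartlett-weighted sum = {rhs:.3f}")

For this design $E[w_0 v_{-j}]$ is nonzero only at $j = 0$ and $j = 1$, so the population value is $1 + (1 - 1/\ell)(0.5)$; the gap between the two printed numbers reflects the $O_p(b^{-1/2})$ sampling error of Lemma B.1.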
Part (b): Proof of (A.105). Let
$$\gamma^*_{1T} = (T/\ell)E^*\{a^{*\prime}m^*_T\,b^{*\prime}[(G^*_T - G^*_0)\otimes m^*_T]\},$$
$$\gamma^*_{2T} = (T/\ell)E^*\{a^{*\prime}m^*_T\,c^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\},$$
$$\gamma^*_{3T} = (T/\ell)E^*\{a^{*\prime}m^*_T\,e^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\},$$
$$\gamma^*_{4T} = (T/\ell)E^*\{c^{*\prime}[\mathrm{vech}(\hat{S}^*_T - \tilde{S}^*_T)\otimes m^*_T]\}^2.$$
From Lemma B.2, we have $a^* = a + O(\ell^{-1}) + O_p(b^{-1/2})$ and thus
$$\gamma^*_{1T} = \ell^{-1/2}E^*\left\{a^{*\prime}B_{N1}\,b^{*\prime}\left[(F_{N1} - E^*(F_{N1}))\otimes B_{N1}\right]\right\} = o^*_p(1).$$
Similarly,
$$\gamma^*_{2T} = (T/\ell)E^*\{a^{*\prime}m^*_T\,c^{*\prime}[\mathrm{vech}(\tilde{S}^*_T - S^*_T)\otimes m^*_T]\} = E^*\{a^{*\prime}B_{N1}\,c^{*\prime}[\mathrm{vech}(B_{N1}B_{N1}' - E^*(B_{N1}B_{N1}'))\otimes B_{N1}]\}$$
$$= E^*\{a'B_{N1}\,c'[\mathrm{vech}(B_{N1}B_{N1}' - E^*(B_{N1}B_{N1}'))\otimes B_{N1}]\} + O_p(\ell^{-1}) + O_p(b^{-1/2}) = \gamma^*_{21} + O_p(\ell^{-1}) + O_p(b^{-1/2}), \text{ say.}$$
It follows from the moment inequality and Lemma B.1 that
$$\gamma^*_{21} = E[\gamma^*_{21}] + O_p(b^{-1/2}) = \lim_{T\to\infty}\frac{1}{\ell}\sum_{j=-\ell}^{\ell}\sum_{i,k=-T}^{T}E\{a'v_0\,c'[\mathrm{vech}(v_iv_{i-j}' - \Gamma_{-j})\otimes v_k]\} + O_p(\ell^{-1}) + O_p(b^{-1/2}) = \gamma_{21} + O_p(\ell^{-1}) + O_p(b^{-1/2}). \qquad (A.123)$$
The results for $\gamma^*_{3T}$ and $\gamma^*_{4T}$ can be proved using similar arguments, and thus the proofs are omitted.

Part (c): Proof of (A.106). Let $\kappa^*_{1T} = T^{1/2}E^*[(a^{*\prime}m^*_T)^3]$ denote the second term on the RHS of (A.106). Because the proof of Part (c) is analogous to the proofs of Parts (a) and (b), we will only show that
$$\kappa^*_{1T} = \sum_{i,j=-\infty}^{\infty}E(a'v_0\,a'v_i\,a'v_j) + O_p(\ell^{-1}) + O_p(b^{-1/2}). \qquad (A.124)$$
By definition, we have
$$\kappa^*_{1T} = \ell^{1/2}E^*[(a^{*\prime}B_{N1})^3] = \frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}\left[a^{*\prime}\left(\tilde{B}_t + \hat{B}_t - \tilde{B}_t - \bar{\mu}^*_T\right)\right]^3, \qquad (A.125)$$
where $\hat{B}_t$ and $\tilde{B}_t$ are defined in the proof of Lemma B.1. Thus it suffices to show that
$$\frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}(a^{*\prime}\tilde{B}_t)^3 = \sum_{i,j=-\infty}^{\infty}E(a'v_0\,a'v_i\,a'v_j) + o_p(\ell T^{-1/2}), \qquad (A.126)$$
$$\frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}\left[a^{*\prime}(\hat{B}_t - \tilde{B}_t)\right]^3 = O_p(\ell^{-1}) + O_p(b^{-1/2}), \qquad (A.127)$$
$$\frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}(a^{*\prime}\bar{\mu}^*_T)^3 = O_p(\ell^{-1}) + O_p(b^{-1/2}). \qquad (A.128)$$
First, we will show (A.126). Since a HAC covariance matrix estimator converges at rate $O_p(\ell^{1/2}T^{-1/2})$, it follows that
$$\frac{1}{T-\ell+1}\sum_{i=0}^{T-\ell}\left[\ell^{1/2}(a^{*\prime}\tilde{B}_i)^3 - \ell^{1/2}E(a^{*\prime}\tilde{B}_i)^3\right] = O_p\left(\sum_{i,j=0}^{\ell}\left(1 - \frac{\min(\max(i,j,|i-j|),\ell)}{\ell}\right)\frac{1}{T-\ell+1}\sum_{t=0}^{T-\ell}\left[a^{*\prime}v_t\,a^{*\prime}v_{t+i}\,a^{*\prime}v_{t+j} - E(a^{*\prime}v_t\,a^{*\prime}v_{t+i}\,a^{*\prime}v_{t+j})\right]\right) = O_p(\ell^{1/2}T^{-1/2}). \qquad (A.129)$$
By the moment inequality, it follows that
$$\frac{1}{b}\sum_{i=0}^{b-1}\ell^{1/2}E(a^{*\prime}\tilde{B}_i)^3 = \sum_{i,j=-\infty}^{\infty}E(a'v_0\,a'v_i\,a'v_j) + o(\ell T^{-1/2}). \qquad (A.130)$$
Thus (A.126) follows from (A.129) and (A.130).

Next we will show (A.127) and (A.128). Using arguments similar to the ones used in the proof of Lemma B.1, we obtain
$$\frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}\left[a^{*\prime}(\hat{B}_t - \tilde{B}_t)\right]^3 = \frac{\ell^{2}}{T-\ell+1}\sum_{t=0}^{T-\ell}\left\{a^{*\prime}[\tilde{F}_t(\hat{\beta}_T - \beta_0)]\right\}^3 = O_p(\ell^2T^{-3/2}) \qquad (A.131)$$
and
$$\frac{\ell^{1/2}}{T-\ell+1}\sum_{t=0}^{T-\ell}(a^{*\prime}\bar{\mu}^*_T)^3 = \ell^{1/2}(a^{*\prime}\bar{\mu}^*_T)^3 = O_p(\ell^{1/2}b^{-3/2}). \qquad (A.132)$$
Thus (A.127) and (A.128) are satisfied. Therefore, (A.106) follows.

Part (d): Proof of (A.107). Part (d) can be proved using similar arguments and thus the proof is omitted. Q.E.D.

Proofs of Main Theorems

Lastly, we will prove the main theorems.

Proof of Theorem 1: The result for the studentized statistic (3.2) follows from Lemmas A.5-A.7. Note that the J test statistic can be written as
$$J_T = J_T^{1/2\prime}J_T^{1/2}, \qquad (A.133)$$
where
$$J_T^{1/2} = \hat{S}_T^{-1/2}\frac{1}{\sqrt{T}}\sum_{t=1}^{T}z_t(y_t - \hat{\beta}_T'x_t).$$
Then one can show that Lemmas A.1-A.7 with $f_T$ replaced by $J_T^{1/2}$ hold except that $a$, $b$, $c$, $d$ and $e$ now take different values. Thus the distribution of $J_T^{1/2}$ can be approximated by its Edgeworth expansion in a suitable sense. A slight modification of Theorem 1 of Chandra and Ghosh (1979) completes the proof of (3.3). Q.E.D.
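For concreteness, the factorization (A.133) can be computed directly. The sketch below is a minimal illustration under assumptions of our own (the data-generating design, the Bartlett bandwidth, and the two-step weighting scheme are arbitrary choices and are not the paper's Monte Carlo or empirical design): an overidentified linear model with one regressor and two instruments, estimated by two-step GMM with a Bartlett-kernel HAC weight.

import numpy as np

rng = np.random.default_rng(3)
T, ell = 500, 8

# overidentified linear model: one regressor, two instruments
Z = rng.standard_normal((T, 2))
x = Z @ np.array([1.0, 0.5]) + rng.standard_normal(T)
y = 0.8 * x + rng.standard_normal(T)
X = x[:, None]

def bartlett_hac(u, ell):
    # S_hat = sum_{|j| < ell} (1 - |j|/ell) (1/T) sum_t u_t u_{t-j}'
    n = len(u)
    S = u.T @ u / n
    for j in range(1, ell):
        G = u[j:].T @ u[:-j] / n
        S += (1 - j / ell) * (G + G.T)
    return S

# step 1: weight (Z'Z/T)^{-1}; step 2: HAC weight from step-1 residuals
W = np.linalg.inv(Z.T @ Z / T)
b1 = np.linalg.solve(X.T @ Z @ W @ Z.T @ X, X.T @ Z @ W @ Z.T @ y)
S = bartlett_hac(Z * (y - X @ b1)[:, None], ell)
Wi = np.linalg.inv(S)
b2 = np.linalg.solve(X.T @ Z @ Wi @ Z.T @ X, X.T @ Z @ Wi @ Z.T @ y)

# J_T = J_T^{1/2'} J_T^{1/2} with J_T^{1/2} = S^{-1/2} T^{-1/2} Z'(y - X b)
m = Z.T @ (y - X @ b2) / np.sqrt(T)
J_half = np.linalg.cholesky(Wi).T @ m  # one choice of square root S^{-1/2}
print(f"J_T = {J_half @ J_half:.3f}  (compare chi^2_1 5% critical value 3.84)")

Under the overidentifying restrictions the statistic is asymptotically chi-squared with one degree of freedom here (two moment conditions, one parameter).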
Proof of Theorem 2: For iid observations, a modification of Theorem 1 with $\ell = 1$ yields
$$\sup_{x}\left|P\left(\hat{\Sigma}_T^{-1/2}(\hat{\beta}_T - \beta_0) \le x\right) - \Psi_T(x)\right| = o(T^{-1}), \qquad (A.134)$$