Bootstrapping GMM Estimators for Time Series*

Atsushi Inoue†                                  Mototsugu Shintani‡
North Carolina State University                 Vanderbilt University

First Draft: October 2000
This Version: February 2001

Abstract

This paper establishes that the bootstrap provides asymptotic refinements for the generalized method of moments estimator of overidentified linear models when autocovariance structures of moment functions are unknown. Because the heteroskedasticity and autocorrelation consistent covariance matrix estimator cannot be written as a function of sample moments and converges at a rate slower than T^{-1/2}, the asymptotic refinement cannot be proved in the conventional way. As a result, we find that the bootstrap approximation error for the distribution of the t test and the test of overidentifying restrictions is of larger order than typically found in the literature. We also find that the choice of kernels plays a more important role in our second-order asymptotic theory than in the conventional first-order asymptotic theory. Nevertheless, the bootstrap approximation improves upon the first-order asymptotic approximation. A Monte Carlo experiment shows that the bootstrap improves the accuracy of inference on regression parameters in small samples. We apply our bootstrap method to inference about the parameters in the monetary policy reaction function.

KEYWORDS: asymptotic refinements, block bootstrap, HAC covariance matrix estimator, dependent data, Edgeworth expansions, instrumental variables, J test.

* We thank Jordi Galí for providing us with the data and program used in Clarida, Galí and Gertler (2000). We also thank Alastair Hall, Lutz Kilian and seminar participants at Brown University, University of Michigan and the 2000 Triangle Econometrics Conference for helpful comments.
† Department of Agricultural and Resource Economics, North Carolina State University, Box 8109, Raleigh, NC 27695-8109. E-mail: atsushi@unity.ncsu.edu.
‡ Department of Economics, Vanderbilt University, Box 1819 Station B, Nashville, TN 37235. E-mail: mototsugu.shintani@vanderbilt.edu.

1. Introduction

In this paper we establish that the bootstrap provides asymptotic refinements for the generalized method of moments (GMM) estimator of possibly overidentified linear models. Our analysis differs from earlier work in that we allow for general autocovariance structures of moment functions. In typical empirical situations, the autocovariance structure of moment functions is unknown and the inverse of the heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimator is used as a weighting matrix in GMM estimation. It is well known, however, that coverage probabilities based on the HAC covariance estimator are often too low, and that the t test tends to reject too frequently (see Andrews, 1991). In this paper, we propose a bootstrap method for the GMM estimator to improve the finite-sample performance of the t test and the test of overidentifying restrictions (J test). We use the block bootstrap originally proposed by Künsch (1989) for weakly dependent data (see also Carlstein, 1986). When the block length increases at a suitable rate with the sample size, such block bootstrap procedures eventually capture the unknown structure of dependence.

Our linear framework is of particular interest in applied time series analysis. GMM estimation of linear models has been applied to the expectations hypothesis of the term structure (Campbell and Shiller, 1991), the monetary policy reaction function (Clarida, Galí and Gertler, 2000), the permanent-income hypothesis (Runkle, 1991), and the present value model of stock prices (West, 1988). Since GMM estimates often have policy implications in structural econometric models, it is important for researchers to obtain accurate confidence intervals.
For example, the interpretation of the policy rule crucially depends on the value of the estimated parameters (see Clarida, Galí and Gertler, 2000).

Not surprisingly, given the poor performance of the conventional asymptotic approximation, the econometric literature on the bootstrap for GMM is growing rapidly. Hahn (1996) shows the first-order validity of the bootstrap for GMM with iid observations.¹ For dependent data, Hall and Horowitz (1996) show that the block bootstrap provides asymptotic refinements for GMM. However, Hall and Horowitz (1996) assume that the autocovariances of the moment function are zero after finite lags, and thus their framework does not cover the use of the HAC covariance matrix estimator for the general dependence structure. Economic theory often provides information about the specification of moment conditions, but not necessarily about the dependence structure of the moment conditions. Therefore, it is important for applied work to be able to allow for more general forms of autocorrelation. This extension is not straightforward because the HAC covariance matrix estimator cannot be written as a function of sample moments and converges at a rate slower than T^{-1/2}. Thus, the conventional arguments cannot be applied directly to prove the existence of Edgeworth expansions and to establish asymptotic refinements of the bootstrap.

Recently, Götze and Künsch (1996) and Lahiri (1996) show that the block bootstrap can provide asymptotic refinements for a smooth function of sample means and for parameters in a linear regression model, respectively, even when the HAC covariance estimator is used. They show that the bootstrap provides asymptotic refinements for approximating the distribution of the estimator and for the coverage probability of one-sided confidence intervals.
However, they do not show asymptotic refinements for the two-sided symmetric t test, nor do they provide any result for the overidentified case, which is of great interest in empirical work. The purpose of this paper is to prove that the bootstrap provides asymptotic refinements for these statistics in overidentified linear models estimated by GMM. To our knowledge, the higher-order properties of the block bootstrap for GMM with unknown autocovariance structures have not been formally investigated.

Our results are nonstandard for two reasons. First, we show that the order of the bootstrap approximation error is larger than typically found in the literature on the bootstrap for parametric estimators. The intuition behind this result is as follows. The HAC covariance matrix estimator is (proportional to) a nonparametric estimator of the spectral density at frequency zero, and its convergence rate is slower than T^{-1/2}. For the first-order asymptotic theory, all that matters is the consistency of the HAC covariance matrix estimator. However, the nonparametric nature of the HAC covariance matrix estimator becomes important in the higher-order asymptotic theory and complicates the analysis of the two-sided symmetric t test and the J test statistic. Nevertheless, we are able to establish that the bootstrap approximation error is smaller than the conventional normal approximation error.

Second, we note that the choice of kernels plays a more important role in our second-order asymptotic theory than in the conventional first-order asymptotic theory because the order of the bootstrap approximation error depends on the bias of the HAC covariance estimator. For the bootstrap to provide asymptotic refinements, the bias must vanish sufficiently fast. For the one-sided t test, most of the commonly used kernels satisfy this condition.

¹ Brown and Newey (1995) propose an alternative efficient bootstrap method based on the empirical likelihood.
For the two-sided symmetric t test and for the J test statistic, however, one must use kernels, such as the truncated kernel (White, 1984) and the trapezoidal kernel (Politis and Romano, 1995), whose bias vanishes even faster. The resulting HAC covariance matrix estimator based on these kernels, however, is not necessarily positive semidefinite. In this paper, we propose a modified HAC covariance matrix estimator that is always positive semidefinite.

In a Monte Carlo experiment, we find that our bootstrap method improves the accuracy of inference in small samples, especially for the two-sided symmetric t test. To illustrate the usefulness of the bootstrap approach, we apply our bootstrap procedure to the monetary policy reaction function of Clarida, Galí and Gertler (2000). We find that the data do not necessarily support some of their conclusions.

The rest of the paper is organized as follows. Section 2 introduces the model and describes the proposed bootstrap procedure. Section 3 presents the assumptions and theoretical results. Section 4 provides some Monte Carlo results. Section 5 presents an empirical illustration. Section 6 concludes the paper. All proofs are relegated to an appendix.

2. Model and Bootstrap Procedure

Consider a stationary time series (x_t′, y_t, z_t′)′ which satisfies

    E[z_t u_t] = 0,                                                          (2.1)

where u_t = y_t − β_0′x_t, β_0 is a p-dimensional parameter, x_t is a p-dimensional vector, z_t is a k-dimensional vector and p < k. Given a realization {(x_t′, y_t, z_t′)′}_{t=1}^{T_0}, we are interested in two-step GMM estimation of β_0 based on the moment condition (2.1). Let ℓ denote the lag truncation parameter used in HAC covariance matrix estimation and let T = T_0 − ℓ + 1.² We first obtain the first-step GMM estimator β̃_T by minimizing

    [ (1/T_0) Σ_{t=1}^{T_0} z_t (y_t − β′x_t) ]′ V_T [ (1/T_0) Σ_{t=1}^{T_0} z_t (y_t − β′x_t) ]

with respect to β, where V_T is some k × k positive semidefinite matrix.
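As a concrete illustration, the two-step estimator for this linear model can be sketched as follows. This is a hypothetical Python implementation, not the authors' code: the first-step weight V_T = (Z′Z/T)^{-1} is one common, arbitrary choice, and the Bartlett weight appears only as a placeholder (the refinement results of this paper require other kernels).

```python
import numpy as np

def bartlett_weight(x):
    # Placeholder kernel weight; the paper's refinement results call for
    # kernels with a higher characteristic exponent (see Section 3).
    return max(0.0, 1.0 - abs(x))

def hac(V, ell, weight=bartlett_weight):
    # HAC estimator for the (T x k) moment series V:
    # S = Gamma_0 + sum_{j=1..ell} w(j/ell) (Gamma_j + Gamma_j')
    T = V.shape[0]
    S = V.T @ V / T
    for j in range(1, ell + 1):
        G = V[j:].T @ V[:-j] / T
        S += weight(j / ell) * (G + G.T)
    return S

def two_step_gmm(y, X, Z, ell):
    # First step: arbitrary positive definite weight V_T = (Z'Z/T)^{-1}
    T = len(y)
    W1 = np.linalg.inv(Z.T @ Z / T)
    A = Z.T @ X / T
    b = Z.T @ y / T
    beta1 = np.linalg.solve(A.T @ W1 @ A, A.T @ W1 @ b)
    # Second step: reweight by the inverse HAC estimate at first-step residuals
    u = y - X @ beta1
    S = hac(Z * u[:, None], ell)
    W2 = np.linalg.inv(S)
    beta2 = np.linalg.solve(A.T @ W2 @ A, A.T @ W2 @ b)
    return beta2, S
```

The second step simply replaces the arbitrary first-step weight with the inverse of the HAC estimate evaluated at the first-step residuals.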
Then we obtain the second-step GMM estimator β̂_T by minimizing

    [ (1/T) Σ_{t=1}^{T} z_t (y_t − β′x_t) ]′ Ŝ_T^{-1} [ (1/T) Σ_{t=1}^{T} z_t (y_t − β′x_t) ],

where

    Ŝ_T = (1/T) Σ_{t=1}^{T} [ z_t ũ_t² z_t′ + Σ_{j=1}^{ℓ} ω(j/ℓ) ( z_{t+j} ũ_{t+j} ũ_t z_t′ + z_t ũ_t ũ_{t+j} z_{t+j}′ ) ],
    ũ_t = y_t − β̃_T′ x_t,

is the HAC covariance matrix estimator for the moment function (2.1) and ω(·) is a kernel. We are interested in the distribution of the studentized statistic Σ̂_T^{-1/2}(β̂_T − β_0), where Σ̂_T = ( Σ_{t=1}^{T} x_t z_t′ Ŝ_T^{-1} Σ_{t=1}^{T} z_t x_t′ )^{-1}, and in the distribution of the J test statistic

    J_T = [ (1/√T) Σ_{t=1}^{T} z_t (y_t − β̂_T′x_t) ]′ Ŝ_T^{-1} [ (1/√T) Σ_{t=1}^{T} z_t (y_t − β̂_T′x_t) ].

We propose the following block bootstrap procedure. Suppose that T = bℓ for some integer b.

Step 1. Let N_1, N_2, ..., N_b be iid uniform random variables on {0, 1, ..., T − ℓ} and let

    (x*′_{(j−1)ℓ+i}, y*_{(j−1)ℓ+i}, z*′_{(j−1)ℓ+i})′ = (x′_{N_j+i}, y_{N_j+i}, z′_{N_j+i})′

for 1 ≤ i ≤ ℓ and 1 ≤ j ≤ b.

² We use T observations and the modified HAC covariance matrix estimator Ŝ_T to obtain asymptotic refinements for the two-sided symmetric t test and the J test statistic. This modification is not necessary for obtaining asymptotic refinements for one-sided confidence intervals. See also Hall and Horowitz (1996, p.895).

Step 2. Calculate the first-step bootstrap GMM estimator β̃*_T by minimizing

    [ (1/T) Σ_{t=1}^{T} z*_t (y*_t − β′x*_t) − μ*_T ]′ V_T [ (1/T) Σ_{t=1}^{T} z*_t (y*_t − β′x*_t) − μ*_T ],

where

    μ*_T = (1/(T − ℓ + 1)) Σ_{t=0}^{T−ℓ} (1/ℓ) Σ_{i=1}^{ℓ} z_{t+i} (y_{t+i} − β̂_T′ x_{t+i}).

Step 3. Compute the second-step bootstrap GMM estimator β̂*_T by minimizing

    [ (1/T) Σ_{t=1}^{T} z*_t (y*_t − β′x*_t) − μ*_T ]′ Ŝ*_T^{-1} [ (1/T) Σ_{t=1}^{T} z*_t (y*_t − β′x*_t) − μ*_T ],

where

    Ŝ*_T = (1/T) Σ_{k=1}^{b} Σ_{i=1}^{ℓ} Σ_{j=1}^{ℓ} ( z*_{(k−1)ℓ+i} ũ*_{(k−1)ℓ+i} − μ*_T )( z*_{(k−1)ℓ+j} ũ*_{(k−1)ℓ+j} − μ*_T )′,
    ũ*_t = y*_t − β̃*_T′ x*_t.

Step 4.
Obtain the bootstrap version of the studentized statistic Σ̂*_T^{-1/2}(β̂*_T − β̂_T), where Σ̂*_T = ( Σ_{t=1}^{T} x*_t z*_t′ Ŝ*_T^{-1} Σ_{t=1}^{T} z*_t x*_t′ )^{-1}, and the J test statistic

    J*_T = { (1/√T) Σ_{t=1}^{T} [ z*_t (y*_t − β̂*_T′x*_t) − μ*_T ] }′ Ŝ*_T^{-1} { (1/√T) Σ_{t=1}^{T} [ z*_t (y*_t − β̂*_T′x*_t) − μ*_T ] }.

By repeating Steps 1–4 sufficiently many times, one can approximate the finite-sample distributions of the studentized statistic and the J test statistic by the empirical distributions of their bootstrap versions.

Remarks:

1. As in Hall and Horowitz (1996), we recenter the bootstrap version of the moment functions. Unlike the just-identified case, the bootstrap version of the moment condition does not hold without recentering in the case of overidentifying restrictions. The expression μ*_T is the mean of the bootstrapped moment function with respect to the probability measure induced by the bootstrap algorithm.

2. Davison and Hall (1993) show that naïve applications of the block bootstrap do not provide asymptotic refinements for studentized statistics involving the long-run variance estimator. Specifically, they show that the error of the naïve bootstrap is of order O(b^{-1}) + O(ℓ^{-1}) and thus is greater than or equal to the error of the first-order asymptotic approximation. We therefore modify the bootstrap version of the HAC covariance matrix estimator (see Götze and Hipp, 1996, for the just-identified case). The expression Ŝ*_T given in Step 3 is a consistent estimator for the variance of the bootstrapped moment function under the bootstrap probability measure.

3. Asymptotic Theory

In this section, we present our main theoretical results. Unless noted otherwise, we shall denote the Euclidean norm of a vector x by ‖x‖. First, we provide the following set of assumptions.

Assumption 1:

(a) {(x_t′, y_t, z_t′)′} is strictly stationary and strong mixing with mixing coefficients satisfying α_m ≤ (1/d) exp(−dm) for some d > 0.

(b) There is a unique β_0 ∈
0.

(d) Let F_a^b denote the sigma-algebra generated by R_a, R_{a+1}, ..., R_b. For all m, s, t = 1, 2, ... and A ∈ F_{t−s}^{t+s},

    E| P(A | F_{−∞}^{t−1} ∪ F_{t+1}^{∞}) − P(A | F_{t−s−m}^{t−1} ∪ F_{t+1}^{t+s+m}) | ≤ (1/d) exp(−dm).

(e) For all m, t = 1, 2, ... and θ ∈
*) = α + o(ℓT^{−1}) + O(ℓ^{−q}).                                              (3.8)
Remarks: Theorems 1 and 2 show that the distributions of the studentized statistic
and the J test statistic and their bootstrap versions can be approximated by their
Edgeworth expansions. Theorem 3 shows the order of the bootstrap approximation
error. For the one-sided t test, the two-sided symmetric t test and the J test statistic,
the approximation errors made by the first-order asymptotic theory are of order

    O(T^{−1/2}) + O(ℓ^{−q}),   O(ℓT^{−1}) + O(ℓ^{−q})   and   O(ℓT^{−1}) + O(ℓ^{−q}),       (3.9)

respectively, whereas the bootstrap approximation errors are of order

    O(ℓT^{−1}) + O(ℓ^{−q}),   o(ℓT^{−1}) + O(ℓ^{−q})   and   o(ℓT^{−1}) + O(ℓ^{−q}).        (3.10)
Thus the bootstrap provides asymptotic refinements if the bias of the HAC covariance
matrix estimator vanishes fast enough, i.e.,

    O(ℓ^{−q}) = o(T^{−1/2}),   O(ℓ^{−q}) = o(ℓT^{−1})   and   O(ℓ^{−q}) = o(ℓT^{−1}),        (3.11)

for the three statistics, respectively.
For the one-sided t test, the bootstrap provides asymptotic refinements for a wide
class of kernels that satisfy O(ℓ^{−q}) = o(T^{−1/2}), such as the Parzen kernel. However, the
bootstrap does not provide asymptotic refinements for the Bartlett kernel, as it does not
satisfy (3.11) because its characteristic exponent is one. For the two-sided symmetric t
test and the J test statistic, the bootstrap can provide asymptotic refinements only for
kernels whose characteristic exponent is greater than 2, such as the truncated kernel,
    ω(x) = 1   for |x| < 1,      ω(x) = 0   otherwise;

the trapezoidal kernel (Politis and Romano, 1995)

    ω(x) = 1                           for |x| ≤ α,
    ω(x) = 1 − (|x| − α)/(1 − α)       for α < |x| ≤ 1,
    ω(x) = 0                           otherwise,

where 0 < α < 1; and the Parzen (b) kernel (Parzen, 1957)

    ω(x) = 1 − |x|^q   for |x| ≤ 1,    ω(x) = 0   otherwise,
where q > 2. Under the assumption of exponentially decaying mixing coefficients, the
truncated and trapezoidal kernels have no asymptotic bias and thus satisfy (3.11). If q > 2
and ℓ ≠ O(T^{1/(q+1)}), the Parzen (b) kernel also satisfies (3.11). A potential problem
with these kernels is that the resulting weighting matrix is not necessarily positive
semidefinite. To eliminate this problem, the weighting matrix can be modified as follows.
By Schur's decomposition theorem (e.g., Theorem 13 of Magnus and Neudecker, 1999,
p.16), there exist an orthogonal k × k matrix E whose columns are eigenvectors of
W_T = Ŝ_T^{-1} and a diagonal matrix Λ = diag(λ_1, ..., λ_k), whose elements are the
eigenvalues of W_T, such that

    W_T = E′^{−1} Λ E^{−1}.                                                   (3.12)

Define a modified HAC covariance matrix estimator by

    W_T^{+} = E′^{−1} Λ^{+} E^{−1},                                           (3.13)

where Λ^{+} = diag(max(λ_1, 0), ..., max(λ_k, 0)). Then W_T^{+} is positive semidefinite,
asymptotically equivalent to (3.12) and thus consistent. Politis and Romano (1995,
equation 12) use a similar modification in the context of univariate spectral density
estimation. For the trapezoidal kernel, the frequency of positive semidefinite corrections
can be reduced by choosing a small α. However, Politis and Romano (1995) recommend α = 1/2.
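These kernels and the eigenvalue modification in (3.13) can be sketched in a few lines of Python. This is hypothetical illustration code: numpy's symmetric eigendecomposition stands in for the Schur decomposition (for an orthogonal E, E′^{-1} = E), and the input is symmetrized for numerical safety.

```python
import numpy as np

def truncated(x):
    # Truncated kernel (White, 1984)
    return 1.0 if abs(x) < 1 else 0.0

def trapezoidal(x, a=0.5):
    # Trapezoidal kernel (Politis and Romano, 1995); a = 1/2 is the
    # recommended choice for the flat-top width.
    ax = abs(x)
    if ax <= a:
        return 1.0
    if ax <= 1.0:
        return 1.0 - (ax - a) / (1.0 - a)
    return 0.0

def psd_correct(W):
    # Eigenvalue modification of (3.13): replace negative eigenvalues by
    # zero so the weighting matrix is always positive semidefinite.
    lam, E = np.linalg.eigh((W + W.T) / 2.0)
    return (E * np.maximum(lam, 0.0)) @ E.T
```

Because max(λ, 0) changes nothing when all eigenvalues are already nonnegative, the correction is only active in the samples where the kernel produces an indefinite estimate.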
4. Monte Carlo Results
In this section, we conduct a small simulation study to examine the accuracy of the
proposed bootstrap procedure. We consider the following stylized linear regression
model with an intercept and a regressor, xt :
    y_t = β_1 + β_2 x_t + u_t,   for t = 1, ..., T.                           (4.14)

The disturbance and the regressor are generated from the following AR(1) processes
with common ρ,

    u_t = ρ u_{t−1} + ε_{1t},                                                (4.15)
    x_t = ρ x_{t−1} + ε_{2t},                                                (4.16)

where ε_t = (ε_{1t}, ε_{2t})′ ~ N(0, I_2). In the simulation, we use β = (β_1, β_2)′ = (0, 0)′ for the
regression parameters and ρ ∈ {0.5, 0.9, 0.95} for the AR parameter. For instruments,
we use x_t, x_{t−1} and x_{t−2} in addition to an intercept. This choice of instruments implies
an overidentified model with 2 degrees of freedom for the J test. Two values of the
sample size T, 64 and 128, are considered. The kernel functions employed are the
trapezoidal, Parzen (b) and truncated kernels. In all experiments, the number of Monte
Carlo trials is 1000.
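The design above can be sketched as follows (hypothetical code; the burn-in length and the seed are illustrative choices not taken from the paper):

```python
import numpy as np

def simulate(T, rho, burn=100, seed=0):
    # u_t and x_t are independent AR(1) processes with common rho, as in
    # (4.15)-(4.16); y_t = beta1 + beta2*x_t + u_t with (beta1, beta2) = (0, 0).
    rng = np.random.default_rng(seed)
    e = rng.standard_normal((burn + T, 2))
    u = np.zeros(burn + T)
    x = np.zeros(burn + T)
    for t in range(1, burn + T):
        u[t] = rho * u[t - 1] + e[t, 0]
        x[t] = rho * x[t - 1] + e[t, 1]
    u, x = u[burn:], x[burn:]
    y = 0.0 + 0.0 * x + u
    # Instruments: intercept, x_t, x_{t-1}, x_{t-2} (first two obs dropped)
    Z = np.column_stack([np.ones(T - 2), x[2:], x[1:-1], x[:-2]])
    X = np.column_stack([np.ones(T - 2), x[2:]])
    return y[2:], X, Z
```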
The choice of the block length is important in practice. Ideally, one would choose
a longer block length for more persistent processes and a shorter block length for less
persistent processes. In the literature, this is typically accomplished by selecting the
lag truncation parameter that minimizes the mean squared error of the HAC covariance
matrix estimator (see Andrews, 1991; and Newey and West, 1994). Because the
trapezoidal and truncated kernels have no asymptotic bias, however, one cannot take
advantage of the usual bias-variance trade-off, and thus no optimal block length can be
defined for these kernels. We therefore propose the following procedure, which is similar to
the general-to-specific modeling strategy for selecting the lag order of autoregressions
in the literature on unit root testing (see Hall, 1994; Ng and Perron, 1995). By the
Wold representation theorem, the moment function has a moving average (MA)
representation of possibly infinite order. The idea is to approximate this MA representation
by a sequence of finite-order MA processes. Because the block bootstrap is originally
designed to capture the dependence of m-dependent-type processes when ℓ is fixed, it
makes sense to approximate the process by an MA process that is m-dependent.
The proposed procedure takes the following steps.
Step 1. Let ℓ_1 < ℓ_2 < ··· < ℓ_max be candidate block lengths that satisfy Assumption 1(g),
and set k = max − 1.

Step 2. Test the null that every element of the moment function is MA(ℓ_k) against the
alternative that at least one of the elements is MA(ℓ_{k+1}).

Step 3. If the null is accepted and if k > 1, then let k = k − 1 and go to Step 2. If the null
is accepted and if k = 1, then let ℓ = ℓ_1. If the null is rejected, then set ℓ = ℓ_{k+1}.
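A sketch of this general-to-specific selection follows. It is a hypothetical simplification: a significance check on sample autocorrelations stands in for the paper's MA-order test, and the candidate lengths and critical value are illustrative assumptions.

```python
import numpy as np

def select_block_length(v, candidates, crit=2.5758):
    # v: (T x q) moment series; candidates: increasing block lengths.
    # The null "every element is MA(l_k)" is treated as rejected when any
    # sample autocorrelation at lags l_k+1,...,l_{k+1} is significant;
    # crit = 2.5758 mimics the conservative 99% level used in the paper.
    T, q = v.shape
    v = v - v.mean(axis=0)

    def sig(lag):
        r = np.array([(v[lag:, i] @ v[:-lag, i]) / (v[:, i] @ v[:, i])
                      for i in range(q)])
        return np.any(np.abs(r) * np.sqrt(T) > crit)

    k = len(candidates) - 2              # Step 1: start at k = max - 1
    while k >= 0:
        lo, hi = candidates[k], candidates[k + 1]
        if any(sig(lag) for lag in range(lo + 1, hi + 1)):
            return candidates[k + 1]     # null rejected: longer block length
        k -= 1                           # null accepted: try a shorter one
    return candidates[0]
```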
Because there is parameter uncertainty due to first-step estimation and because
we apply a univariate testing procedure to each element of the moment function, it is
difficult to control the size of this procedure. In this Monte Carlo experiment, therefore,
we use the 99% level critical value to be conservative.
Our primary interest is to compare the size properties of tests based on asymptotic
and bootstrap critical values. For each experiment, the empirical size for the t test
for the regression slope parameter β_2 as well as for the J test is obtained using the
10% nominal significance level. Each bootstrap critical value is constructed from 999
replications of the bootstrap sampling process. In addition to the results based on
the asymptotic and bootstrap critical values using our proposed procedure, we report
the asymptotic results based on the Bartlett and QS kernels, with Andrews' (1991)
data-dependent bandwidth estimator and Andrews and Monahan's (1992) prewhitening
procedure.
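A one-dimensional sketch of how such bootstrap critical values are produced is given below. This is a hypothetical simplification of Steps 1–4: it studentizes the mean of a single moment series rather than the full GMM statistic, but it shows the three essential ingredients — block resampling, recentering (the analogue of μ*_T), and the block-based variance estimate (the analogue of Ŝ*_T).

```python
import numpy as np

def block_bootstrap_crit(v, ell, B=999, alpha=0.10, seed=0):
    # Symmetric bootstrap critical value for a studentized mean using b = T/ell
    # blocks of length ell drawn with replacement (Step 1 of the procedure).
    rng = np.random.default_rng(seed)
    v = np.asarray(v, dtype=float)
    T = len(v) - len(v) % ell
    v = v[:T]
    b = T // ell
    starts = rng.integers(0, T - ell + 1, size=(B, b, 1))
    blocks = v[starts + np.arange(ell)]                  # (B, b, ell)
    # Recenter by the mean over all possible blocks, so E*[v*] = 0
    mu = v[np.arange(T - ell + 1)[:, None] + np.arange(ell)].mean()
    samples = blocks - mu
    mean = samples.mean(axis=(1, 2))
    # Block-based variance: average of squared within-block sums over ell
    s2 = (samples.sum(axis=2) ** 2).mean(axis=1) / ell
    tstats = np.abs(np.sqrt(T) * mean / np.sqrt(s2))
    return np.quantile(tstats, 1 - alpha)
```

The empirical 1 − α quantile of the bootstrap statistics then replaces the asymptotic critical value in the t test.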
Table 1 summarizes the result of the simulation study. In all cases, the size proper-
ties of the bootstrap t test are better than those of the asymptotic t test. The choice of
kernel function does not make much of a difference for the performance. Indeed, the
empirical sizes of the bootstrap test are very close to the nominal size when T is 128. The
degree of the reduction in the size distortion depends on the value of the AR parameters
as well as the sample size. The bootstrap works quite well with persistent processes.
Because the moment functions have an AR(1) autocovariance structure, the
prewhitening procedure has a considerable advantage in our simulation design. However, the
bootstrap outperforms the conventional prewhitened HAC procedure with asymptotic
critical values. In contrast, the advantage of the bootstrap for the J test is not clear
4
because the J test performs quite well even with asymptotic critical values. Based on
this experiment, we recommend our bootstrap procedure especially for the t test for
regression parameters.
5. Empirical Illustration
To illustrate the usefulness of the proposed bootstrap approach, we conduct bootstrap
inference about the parameters in the monetary policy reaction function of Clarida,
Galí and Gertler (2000, hereafter CGG). CGG model the target for the federal funds
rate r_t* by

    r_t* = r* + β(E[π_{t+1} | Ω_t] − π*) + γ E[x_t | Ω_t],                    (5.17)

where π_t is the inflation rate, π* is the target for inflation, Ω_t is the information set at
time t, x_t is the output gap, and r* is the target with zero inflation and output gap.
Policy rules (5.17) with β > 1 and γ > 0 are stabilizing, and those with β ≤ 1 and
γ ≤ 0 are destabilizing. CGG obtain the GMM estimates of β and γ based on the set
of unconditional moment conditions

    E{ [r_t − (1 − ρ_1 − ρ_2)[rr* − (β − 1)π* + βπ_{t+1} + γx_t] − ρ_1 r_{t−1} − ρ_2 r_{t−2}] z_t } = 0,   (5.18)

where r_t is the actual federal funds rate, rr* is the equilibrium real rate and z_t is a vector
of instruments. They find that the GMM estimate of β is significantly less than unity
during the pre-Volcker era, while the estimate is significantly greater than unity
during the Volcker-Greenspan era.

⁴ See Tauchen (1986) and Hall and Horowitz (1996) for similar findings.
We reexamine these findings by applying our bootstrap procedure as well as the
bootstrap procedure of Hall and Horowitz (1996) and the standard HAC asymptotics.
We obtain GMM estimates of β and γ based on the linear moment conditions

    E{ [r_t − c − θ_1 π_{t+1} − θ_2 x_t − ρ_1 r_{t−1} − ρ_2 r_{t−2}] z_t } = 0,              (5.19)

where c = (1 − ρ_1 − ρ_2)[rr* − (β − 1)π*]. Then β̂_T = θ̂_{1T}/(1 − ρ̂_{1T} − ρ̂_{2T}) and
γ̂_T = θ̂_{2T}/(1 − ρ̂_{1T} − ρ̂_{2T}), where θ̂_{1T}, θ̂_{2T}, ρ̂_{1T} and ρ̂_{2T} are the GMM
estimates of θ_1, θ_2, ρ_1 and ρ_2, respectively. We use CGG's baseline dataset and two
sample periods, the pre-Volcker
period (1960:1-1979:2) and the Volcker-Greenspan period (1979:3-1996:3) (see CGG for
the description of the data source). In addition to their baseline specification, we
construct the optimal weighting matrix using the inverse of the HAC covariance matrix
estimator to allow for more general dynamic specifications in the determination of the
actual funds rate. For the asymptotic confidence intervals, we use the conventional
prewhitened and recolored estimates based on the Bartlett and QS kernels with the
automatic bandwidth selection method (Andrews 1991, Andrews and Monahan 1992).
For the confidence intervals constructed from our bootstrap, we use the trapezoidal,
Parzen (b) and truncated kernels. We use the data-dependent procedure described
in the previous section to select the block length for the bootstrap. The number of
bootstrap replications is set to 999.
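The mapping from the linear-moment estimates in (5.19) back to the policy parameters, which is applied to each bootstrap replication when forming the bootstrap confidence intervals, is just the transformation above (a sketch; the function name is ours):

```python
def policy_params(theta1, theta2, rho1, rho2):
    # beta = theta1 / (1 - rho1 - rho2), gamma = theta2 / (1 - rho1 - rho2),
    # from c = (1 - rho1 - rho2)[rr* - (beta - 1) pi*] and the linearization.
    d = 1.0 - rho1 - rho2
    return theta1 / d, theta2 / d
```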
Table 2 presents GMM estimates of these parameters. Asymptotic standard errors
are reported in parentheses. The first two rows of each of Tables 2(a) and (b) replicate
CGG's results. These findings are robust to whether or not the HAC covariance matrix
estimator is used.

Table 3 shows 90% two-sided confidence intervals of these parameters. Consistent
with CGG's findings, the upper bound of the asymptotic confidence interval for β is less
than unity during the pre-Volcker period, and the lower bound is far greater than unity
during the Volcker-Greenspan period. Based on these estimates, CGG suggest that
the Fed was accommodating inflation before 1979, but not after 1979. The bootstrap
confidence interval, however, indicates that β may be greater than unity even during the
pre-Volcker period, consistent with the view that the Fed has always been combating
inflation. Moreover, unlike the asymptotic confidence interval, the bootstrap confidence
interval does not rule out that γ is negative during the Volcker-Greenspan period.
6. Concluding Remarks
In this paper we establish that the bootstrap provides asymptotic refinements for the
GMM estimator of possibly overidentified linear models when the autocovariance
structure of the moment function is unknown. Because the HAC covariance matrix estimator
cannot be written as a function of sample moments and converges at a rate slower than
T^{-1/2}, the conventional techniques cannot be used directly to prove the existence of
the Edgeworth expansions. Because of the nonparametric nature of the HAC covariance
matrix estimator, the order of the bootstrap approximation error is larger than
the typical order of the bootstrap approximation error for parametric estimators.
Nevertheless, the bootstrap provides improved approximations relative to the first-order
approximation. We also find that the choice of kernels plays a more important role
in our second-order asymptotic theory than in the conventional first-order asymptotic
theory because the order of the bootstrap approximation error depends on the bias of
the HAC covariance estimator. We note that an extension of the present results to
nonlinear dynamic models, as well as further investigation of data-dependent methods
for selecting the optimal block length, would be useful.
Appendix

Notation

To simplify the notation, we will assume p = 1 throughout the appendix. In the proof for the
case p > 1, the scalar β in the current proof is replaced by an arbitrary linear combination of β.
⊗ denotes the Kronecker product operator. If α is an n-dimensional nonnegative integral vector,
|α| denotes its length, i.e., |α| = Σ_{i=1}^{n} |α_i|. ‖·‖ denotes the Euclidean norm, i.e.,
‖x‖ = (Σ_{i=1}^{n} x_i²)^{1/2}, where x is an n-dimensional vector. We will write ω(j/ℓ) as ω_j
for notational simplicity. κ_j(x) denotes the jth cumulant of a random variable x. vec(·) is the
column-by-column vectorization function. vech(·) denotes the column stacking operator that
stacks the elements on and below the leading diagonal. For a nonnegative integral vector
α = (α_1, α_2, ..., α_n), let

    D^α = ∂^{α_1}/∂x_1^{α_1} ··· ∂^{α_n}/∂x_n^{α_n}.

ℓ and l are treated differently: ℓ denotes the lag truncation parameter and l denotes an integer.
Let u_t = y_t − β_0′x_t, û_t = y_t − β̂_T′x_t, ũ_t = y_t − β̃_T′x_t, v_t = z_t u_t, v̂_t = z_t û_t,
ṽ_t = z_t ũ_t, w_t = z_t x_t′,

    Γ̂_j   = (1/T) Σ_{t=1}^{T} ṽ_{t+j} ṽ_t′ for j ≥ 0,   (1/T) Σ_{t=1}^{T} ṽ_t ṽ_{t−j}′ for j < 0;
    ∇Γ̃_j  = (1/T) Σ_{t=1}^{T} (ṽ_{t+j} w_t′ + w_{t+j} ṽ_t′) for j ≥ 0,
            (1/T) Σ_{t=1}^{T} (ṽ_t w_{t−j}′ + w_t ṽ_{t−j}′) for j < 0;
    Γ̃_j   = (1/T) Σ_{t=1}^{T} v_{t+j} v_t′ for j ≥ 0,    (1/T) Σ_{t=1}^{T} v_t v_{t−j}′ for j < 0;
    ∇Γ̄_j  = E(v_{t+j} w_t′ + w_{t+j} v_t′) for j ≥ 0,    E(v_t w_{t−j}′ + w_t v_{t−j}′) for j < 0;
    Γ_j   = E(v_{t+j} v_t′) for j ≥ 0,                   E(v_t v_{t−j}′) for j < 0;
    ∇²Γ̃_j = (1/T) Σ_{t=1}^{T} w_{t+j} w_t′ for j ≥ 0,    (1/T) Σ_{t=1}^{T} w_t w_{t−j}′ for j < 0;

    Ŝ_T = Σ_{j=−ℓ}^{ℓ} ω_j Γ̂_j;    S̃_T = Σ_{j=−ℓ}^{ℓ} ω_j Γ̃_j;    S̄_T = Σ_{j=−ℓ}^{ℓ} ω_j Γ_j;
    S_T = Σ_{j=−T+1}^{T−1} (1 − |j|/T) Γ_j;    ∇S̃_T = Σ_{j=−ℓ}^{ℓ} ω_j ∇Γ̃_j;    ∇S̄_T = Σ_{j=−ℓ}^{ℓ} ω_j ∇Γ̄_j;
    ∇S = Σ_{j=−∞}^{∞} ∇Γ̄_j;    ∇²S̃_T = Σ_{j=−ℓ}^{ℓ} ω_j ∇²Γ̃_j.
Let G_T = (1/T) Σ_{t=1}^{T} w_t and m_T = T^{−1/2} Σ_{t=1}^{T} v_t. Then the studentized
statistic can be written as

    f_T = √T Σ̂_T^{−1/2} (β̂_T − β_0) = (G_T′ Ŝ_T^{−1} G_T)^{−1/2} G_T′ Ŝ_T^{−1} m_T.

We use the following notation for the bootstrap. Let

    m*_T = (1/√T) Σ_{t=1}^{T} (z*_t u*_t − μ*_T) = (1/√b) Σ_{k=1}^{b} B_{N_k},
    B_{N_k} = (1/√ℓ) Σ_{i=1}^{ℓ} (z_{N_k+i} û_{N_k+i} − μ*_T) = (1/√ℓ) Σ_{i=1}^{ℓ} (v̂_{N_k+i} − μ*_T),
    B̂_{N_k} = (1/√ℓ) Σ_{i=1}^{ℓ} (z*_{N_k+i} û*_{N_k+i} − μ*_T),    û*_i = y*_i − β̃*_T′ x*_i,
    G*_T = (1/T) Σ_{t=1}^{T} z*_t x*_t′ = (1/b) Σ_{k=1}^{b} F_{N_k},
    F_{N_k} = (1/ℓ) Σ_{i=1}^{ℓ} z_{N_k+i} x_{N_k+i}′ = (1/ℓ) Σ_{i=1}^{ℓ} w_{N_k+i},
    Ŝ*_T = (1/b) Σ_{k=1}^{b} B̂_{N_k} B̂_{N_k}′,    S̃*_T = (1/b) Σ_{k=1}^{b} B_{N_k} B_{N_k}′,    S*_T = Var*(m*_T).

Then the bootstrap version of the first-step and the second-step GMM estimators can be
written as

    β̃*_T = β̂_T + [ (1/b) Σ_{k=1}^{b} F_{N_k}′ V_T (1/b) Σ_{k=1}^{b} F_{N_k} ]^{−1} (1/b) Σ_{k=1}^{b} F_{N_k}′ V_T (1/√T)(1/√b) Σ_{k=1}^{b} B_{N_k}
          = β̂_T + [G*_T′ V_T G*_T]^{−1} G*_T′ V_T (1/√T) m*_T,

    β̂*_T = β̂_T + [ (1/b) Σ_{k=1}^{b} F_{N_k}′ Ŝ*_T^{−1} (1/b) Σ_{k=1}^{b} F_{N_k} ]^{−1} (1/b) Σ_{k=1}^{b} F_{N_k}′ Ŝ*_T^{−1} (1/√T)(1/√b) Σ_{k=1}^{b} B_{N_k}
          = β̂_T + [G*_T′ Ŝ*_T^{−1} G*_T]^{−1} G*_T′ Ŝ*_T^{−1} (1/√T) m*_T,

respectively.
Proofs of Lemmas

Next, we present the lemmas used in the proofs of the theorems. Lemma A.1 produces
a Taylor series expansion of the studentized statistic f_T. Lemma A.2 provides bounds on the
moments and will be used in the proofs of Lemmas A.3–A.6. Lemma A.3 shows the limits and
the convergence rates of the first three cumulants of g_T in (A.1), which will be used to derive
the formal Edgeworth expansion. Lemmas A.5 and A.6 provide bounds on the approximation
error. For convenience, we present Lemma B.1, which will be used in the proofs of Lemmas B.2
and B.3. Lemma B.2 shows the consistency and convergence rate of the bootstrap version of the
moments. Lemma B.3 shows the limits and the convergence rates of the first three cumulants
of the bootstrap version.

Lemma A.1:

    f_T = a′m_T + b′[(G_T − G_0) ⊗ m_T] + c′[vech(Ŝ_T − S_0) ⊗ m_T]
          + d′[(G_T − G_0) ⊗ vech(Ŝ_T − S_0) ⊗ m_T] + e′[vech(Ŝ_T − S_0) ⊗ vech(Ŝ_T − S_0) ⊗ m_T]
          + O_p((ℓ/T)^{3/2})
        = a′m_T + b′[(G_T − G_0) ⊗ m_T] + c′[vech(Ŝ_T − S̄_T) ⊗ m_T] + c′[vech(S̄_T − S_0) ⊗ m_T]
          + d′[(G_T − G_0) ⊗ vech(Ŝ_T − S̄_T) ⊗ m_T] + e′[vech(Ŝ_T − S̄_T) ⊗ vech(Ŝ_T − S̄_T) ⊗ m_T]
          + d′[(G_T − G_0) ⊗ vech(S̄_T − S_0) ⊗ m_T] + e′[vech(Ŝ_T − S̄_T) ⊗ vech(S̄_T − S_0) ⊗ m_T]
          + e′[vech(S̄_T − S_0) ⊗ vech(Ŝ_T − S̄_T) ⊗ m_T] + e′[vech(S̄_T − S_0) ⊗ vech(S̄_T − S_0) ⊗ m_T]
          + O_p((ℓ/T)^{3/2})
        ≡ g_T + c′[vech(S̄_T − S_0) ⊗ m_T] + d′[(G_T − G_0) ⊗ vech(S̄_T − S_0) ⊗ m_T]
          + e′[vech(Ŝ_T − S̄_T) ⊗ vech(S̄_T − S_0) ⊗ m_T] + e′[vech(S̄_T − S_0) ⊗ vech(Ŝ_T − S̄_T) ⊗ m_T]
          + e′[vech(S̄_T − S_0) ⊗ vech(S̄_T − S_0) ⊗ m_T] + O_p((ℓ/T)^{3/2}),                (A.1)

where a, b, c, d and e are q-, q²-, q(q² + q)/2-, q²(q² + q)/2- and q((q² + q)/2)²-dimensional
vectors of smooth functions of G_0 and S_0, respectively.

Proof of Lemma A.1: (A.1) immediately follows from a Taylor series expansion of f_T around

    (m_T′, G_T′, vech(Ŝ_T)′)′ = (0_{1×q}, G_0′, vech(S_0)′)′

and from Theorem 1 of Andrews (1991). Q.E.D.
Lemma A.2:

    E‖m_T‖^{r+η} = O(1),                                                     (A.2)
    E‖T^{1/2}(G_T − G_0)‖^{r+η} = O(1),                                      (A.3)
    E‖(T/ℓ)^{1/2} vech(S̃_T − S̄_T)‖^{r/2} = O(1),                            (A.4)
    E‖(T/ℓ)^{1/2} vech(∇S̃_T − ∇S̄_T)‖^{r/2} = O(1),                          (A.5)
    E‖T^{1/2} vech(Ŝ_T − S̃_T)‖^{r/2} = O(1).                                (A.6)
Proof of Lemma A.2: First, (A.2) and (A.3) immediately follow from the moment inequality of
Yokoyama (1980). Second, we will show (A.4). Note that
[T =`]
X
` X
(T =`)1=2 (ST ¡ ST ) = (T=`)1=2
~ ¹ !j (¡j ¡ ¡j ) = (`=T )1=2
~ Wi
j=¡` i=1
X X X
= (`=T )1=2 ( Wi + Wi + Wi ); (A.7)
i=0mod3 i=1mod3 i=2mod3
where
1 X
i` X
`
0 0 0 0 0 0
Wi = fvt vt ¡ E(vt vt ) + !j [vt+j vt ¡ E(vt+j vt ) + vt vt+j ¡ E(vt vt+j )]g:
` j=1
t=(i¡1)`+1
Note that the summands in each sum on the RHS of (A.7) are asymptotically independent by construction. Thus,

E\|(T/\ell)^{1/2} vech(\tilde S_T - \bar S_T)\|^{r/2} = O(E\|vech(W_2)\|^{r/2}) = \sum_{i=1}^{3} O(E\|vech(W_2(i))\|^{r/2}),   (A.8)

where

W_2(1) = \ell^{-1} \sum_{t=\ell+1}^{2\ell} \sum_{j=0}^{\ell-1} \omega_j v_{t+j} v_t',  W_2(2) = \ell^{-1} \sum_{t=\ell+1}^{2\ell} \sum_{j=-\ell+1}^{-1} \omega_j v_t v_{t-j}',  W_2(3) = \sum_{j=-\ell+1}^{\ell-1} E(v_0 v_{-j}').
Thus it suffices to show that, for i, j = 1, 2, ..., q,

E|W_2(1)^{(i,j)}|^{r/2} = O(1),   (A.9)
E|W_2(2)^{(i,j)}|^{r/2} = O(1),   (A.10)
E|W_2(3)^{(i,j)}|^{r/2} = O(1),   (A.11)

where W_2(\cdot)^{(i,j)} denotes the (i,j)th element of W_2(\cdot). By Assumptions 1(a) and 1(f), it follows that

E|W_2(1)^{(i,j)}|^{r/2} = O( \ell^{-r/2} \sum_{t_1 \le t_2 \le \cdots \le t_r} E|v_{t_1}^{(k_1)} v_{t_2}^{(k_2)} \cdots v_{t_r}^{(k_r)}| ),   (A.12)

where 0 \le t_l \le 2\ell and k_l = i, j for l = 1, 2, ..., r. Then the standard arguments used in proofs of the moment inequality complete the proof of (A.9). The proof of (A.10) is analogous to that of (A.9) and thus is omitted. By the mixing inequality of Hall and Heyde (1980, Corollary A.2), it follows that for some d' > 0

E|W_2(3)^{(i,j)}|^{r/2} = | \sum_{j=-\ell+1}^{\ell-1} E(v_0 v_{-j}')^{(i,j)} |^{r/2} = O( ( \sum_{j=-\ell+1}^{\ell-1} \alpha_j^{d'} )^{r/2} ) = O(1),   (A.13)

and thus (A.11) holds. Therefore, (A.4) immediately follows from (A.7)-(A.11). The proof of (A.5) is analogous to that of (A.4) and thus is omitted.
Lastly, we will prove (A.6). Note that

T^{1/2}(\hat S_T - \tilde S_T) = \nabla\tilde S_T T^{1/2}(\tilde\beta_T - \beta_0) + \nabla^2\tilde S_T T^{1/2}(\tilde\beta_T - \beta_0)^2.   (A.14)

Thus it follows from (A.5) and Minkowski's inequality that

[E\|\nabla\tilde S_T\|^r]^{1/r} \le [E\|\nabla\tilde S_T - \nabla\bar S_T\|^r]^{1/r} + [E\|\nabla\bar S_T\|^r]^{1/r} = O(\ell^{1/2} T^{-1/2}) + O(1),   (A.15)

[E\|\nabla^2\tilde S_T\|^r]^{1/r} \le [E\| \sum_{j=-\ell}^{\ell} \omega_j (\nabla^2\tilde\Gamma_j - E(\nabla^2\tilde\Gamma_j)) \|^r]^{1/r} + [E\| \sum_{j=-\ell}^{\ell} \omega_j E(\nabla^2\tilde\Gamma_j) \|^r]^{1/r}
= O(\ell T^{-1/2}) + O(\ell).   (A.16)

Therefore (A.6) follows from (A.14), (A.15), (A.16), Assumption 1(i) and Hölder's inequality. Q.E.D.
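The kernel covariance estimators manipulated throughout Lemma A.2 are straightforward to compute. The sketch below forms a weighted sum of sample autocovariances of a moment series; it uses Bartlett weights ω_j = 1 − |j|/(ℓ+1) purely as one common concrete choice, and the MA(1) input series is illustrative (the lemmas restrict the kernel and the data only through the paper's assumptions):

```python
import numpy as np

def hac_estimate(v, ell):
    """Kernel HAC estimate: sum_{j=-ell}^{ell} w_j * Gamma_j, where Gamma_j is
    the j-th sample autocovariance of the (demeaned) moment series v (T x q).
    Bartlett weights are used here for illustration only."""
    v = np.asarray(v, dtype=float)
    T, q = v.shape
    vc = v - v.mean(axis=0)                  # center the moment series
    S = np.zeros((q, q))
    for j in range(-ell, ell + 1):
        w = 1.0 - abs(j) / (ell + 1.0)       # Bartlett kernel weight w_j
        if j >= 0:
            G = vc[j:].T @ vc[:T - j] / T    # Gamma_j = T^-1 sum v_{t+j} v_t'
        else:
            G = vc[:T + j].T @ vc[-j:] / T
        S += w * G
    return S

rng = np.random.default_rng(0)
e = rng.standard_normal(500)
v = np.column_stack([e[1:] + 0.5 * e[:-1]])  # illustrative MA(1) moment series
S = hac_estimate(v, ell=4)
```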
Lemma A.3:

T^{1/2} \kappa_1(g_T) = \alpha_1 + O(\ell^{-q}) + o(\ell T^{-1/2}),   (A.17)
(T/\ell)(\kappa_2(g_T) - 1) = \gamma_1 + O(\ell^{-1/2}),   (A.18)
T^{1/2} \kappa_3(g_T) = \kappa_1 - 3\alpha_1 + O(\ell^{-q}) + o(\ell T^{-1/2}),   (A.19)
(T/\ell)(\kappa_4(g_T) - 3) = \zeta_1 + O(\ell^{-1/2}),   (A.20)

where

\alpha_1 = b' \sum_{i=-\infty}^{\infty} E[w_0 \otimes v_i] + c' \sum_{i,j=-\infty}^{\infty} E[vech(v_0 v_i') \otimes v_j]
  + c' \sum_{i=-\infty}^{\infty} E\{vech[\nabla\bar S (E(w_0)' V E(w_0))^{-1} E(w_0)' V v_0] \otimes v_i\},

\gamma_1 = 2 \lim_{T\to\infty} \frac{1}{\ell} \sum_{j=-\ell}^{\ell} \sum_{i,k=-T}^{T} E\{a'v_0 c'[vech(v_i v_{i-j}' - \Gamma_j) \otimes v_k]\}
  + 2 \lim_{T\to\infty} \frac{1}{\ell T} \sum_{j,l=-\ell}^{\ell} \sum_{i,k,m=-T}^{T} E\{a'v_0 e'[vech(v_i v_{i-j}' - \Gamma_j) \otimes vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\}
  + \lim_{T\to\infty} \frac{1}{\ell T} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\},

\kappa_1 = \sum_{i,j=-\infty}^{\infty} E(a'v_0 a'v_i a'v_j) + 3 \lim_{T\to\infty} \frac{1}{T} \sum_{i,j,k=-T+1}^{T-1} E\{a'v_0 a'v_i b'[(w_j - E(w_j)) \otimes v_k]\}
  + 3 \lim_{T\to\infty} \frac{1}{T} \sum_{i,j,k,l=-T}^{T} E\{a'v_0 a'v_i c'[vech(v_j v_{j-k}' - \Gamma_k) \otimes v_l]\}
  + 3 \lim_{T\to\infty} \frac{1}{T^2} \sum_{i,j,k=-T}^{T} E\{a'v_0 a'v_i c'[vech[\nabla\bar S (E(w_0)' V E(w_0))^{-1} E(w_0)' V v_j] \otimes v_k]\},

\zeta_1 = 4 \lim_{T\to\infty} \frac{1}{\ell T} \sum_{i,j,k,m=-T}^{T} \sum_{l=-\ell}^{\ell} E\{a'v_0 a'v_i a'v_j c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\}
  + 4 \lim_{T\to\infty} \frac{1}{\ell T^2} \sum_{i,j,k,m,o=-T}^{T} \sum_{l,n=-\ell}^{\ell} E\{a'v_0 a'v_i a'v_j e'[vech(v_k v_{k-l}' - \Gamma_l) \otimes vech(v_m v_{m-n}' - \Gamma_n) \otimes v_o]\}
  + 6 \lim_{T\to\infty} \frac{1}{\ell T^2} \sum_{i,j,l,m,o=-T}^{T} \sum_{k,n=-\ell}^{\ell} E\{a'v_0 a'v_i c'[vech(v_j v_{j-k}' - \Gamma_k) \otimes v_l] c'[vech(v_m v_{m-n}' - \Gamma_n) \otimes v_o]\}
  - 12 \lim_{T\to\infty} \frac{1}{\ell} \sum_{j,l=-T}^{T} \sum_{k=-\ell}^{\ell} E\{a'v_0 c'[vech(v_j v_{j-k}' - \Gamma_k) \otimes v_l]\}
  - 12 \lim_{T\to\infty} \frac{1}{\ell T} \sum_{j,l,n=-T}^{T} \sum_{k,m=-\ell}^{\ell} E\{a'v_0 e'[vech(v_j v_{j-k}' - \Gamma_k) \otimes vech(v_l v_{l-m}' - \Gamma_m) \otimes v_n]\}
  - 6 \lim_{T\to\infty} \frac{1}{\ell T} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\}.
Proof of Lemma A.3: First, we will prove (A.17). By Hölder's inequality and Lemma A.2, it suffices to show that

T^{1/2} E[(G_T - G_0) \otimes m_T] = \sum_{i=-\infty}^{\infty} E[w_0 \otimes v_i] + O(T^{-1}),   (A.21)
T^{1/2} E[vech(\tilde S_T - \bar S_T) \otimes m_T] = \sum_{i,j=-\infty}^{\infty} E[vech(v_0 v_i') \otimes v_j] + O(\ell^{-q}) + O(\ell T^{-1}),   (A.22)
T^{1/2} E[vech(\hat S_T - \tilde S_T) \otimes m_T] = \sum_{i=-\infty}^{\infty} E\{vech[\nabla\bar S (E(w_0)' V E(w_0))^{-1} E(w_0)' V v_0] \otimes v_i\} + O(\ell^{1/2} T^{-1/2}),   (A.23)
(T/\ell) E[vech(\hat S_T - \bar S_T) \otimes vech(\hat S_T - \bar S_T) \otimes m_T] = o(1).   (A.24)
First, (A.21) follows from several applications of the mixing inequality. Second, we will show (A.22). We have

T^{1/2} E[ \sum_{j=0}^{\ell} \omega_j vech(\tilde\Gamma_j - \Gamma_j) \otimes m_T ]
= \sum_{i=0}^{\ell} \omega_i \sum_{j=-\ell-T+1}^{T-1} \frac{T - i 1(j > i) - |j| 1(j > 0 \text{ or } j \le -i)}{T} E[vech(v_0 v_{-i}') \otimes v_j]
= \sum_{i=0}^{\ell} \omega_i \sum_{j=-\ell-T+1}^{T-1} E[vech(v_0 v_{-i}') \otimes v_j] + O(\ell T^{-1})
= \sum_{i=0}^{\ell} \sum_{j=-\ell-T+1}^{T-1} E[vech(v_0 v_{-i}') \otimes v_j] + O(\ell^{-q}) + O(\ell T^{-1})
= \sum_{i=0}^{\infty} \sum_{j=-\infty}^{\infty} E[vech(v_0 v_{-i}') \otimes v_j] + O(\ell^{-q}) + O(\ell T^{-1}).   (A.25)

The first equality follows from strict stationarity. Repeated applications of the moment inequality of Yokoyama (1980) produce

\sum_{i=0}^{\ell} \omega_i \sum_{j=-\ell-T+1}^{T-1} \frac{i 1(j > i) + |j| 1(j > 0 \text{ or } j \le -i)}{T} E[vech(v_0 v_{-i}') \otimes v_j]
= O( T^{-1} \sum_{i=0}^{\ell} \omega_i [ \sum_{j=-\ell-T}^{-2i-1} |j| \alpha_{-i-j}^{r'} + \sum_{j=-2i}^{-i} |j| \alpha_i^{r'} + \sum_{j=-i}^{-(1/2)i} i \alpha_{-j}^{r'} + \sum_{j=-(1/2)i+1}^{-1} i \alpha_{i+j}^{r'} + \sum_{j=0}^{i} (i+j) \alpha_i^{r'} + \sum_{j=i+1}^{T-1} (i+j) \alpha_j^{r'} ] )
= O(\ell T^{-1}),   (A.26)

for some r' \in (0, 1), from which the second equality follows. Arguments analogous to the proof of Theorem 10 of Hannan (1970, pp. 283-284) yield the last two equalities. By symmetric arguments, it follows that

T^{1/2} E[ \sum_{j=-\ell}^{-1} \omega_j vech(\tilde\Gamma_j - \Gamma_j) \otimes m_T ]
= \sum_{i=-\infty}^{-1} \sum_{j=-\infty}^{\infty} E[vech(v_0 v_{-i}') \otimes v_j] + O(\ell^{-q}) + O(\ell T^{-1}).   (A.27)
Hence, (A.22) follows from (A.25) and (A.27). Third, we will show (A.23). It follows from (A.14), Assumption 1(i) and Lemma A.2 that

T^{1/2} E[vech(\hat S_T - \tilde S_T) \otimes m_T]
= T^{1/2} E[vech(\nabla\tilde S_T (\tilde\beta_T - \beta_0) + \nabla^2\tilde S_T (\tilde\beta_T - \beta_0)^2) \otimes m_T]
= T^{1/2} E[vech((\nabla\tilde S_T - \nabla\bar S_T)(\tilde\beta_T - \beta_0)) \otimes m_T] + T^{1/2} E[vech(\nabla\bar S_T (\tilde\beta_T - \beta_0)) \otimes m_T]
  + T^{1/2} E[vech((\nabla^2\tilde S_T - \nabla^2\bar S_T)(\tilde\beta_T - \beta_0)^2) \otimes m_T] + T^{1/2} E[vech(\nabla^2\bar S_T (\tilde\beta_T - \beta_0)^2) \otimes m_T]
= \sum_{i=-\infty}^{\infty} E\{vech[\nabla\bar S (E(w_0)' V E(w_0))^{-1} E(w_0)' V v_0] \otimes v_i\} + O(\ell^{1/2} T^{-1/2}),   (A.28)

which completes the proof of (A.23). Lastly, we will show (A.24).

(T/\ell) E[vech(\hat S_T - \bar S_T) \otimes vech(\hat S_T - \bar S_T) \otimes m_T]
= (T/\ell) E[vech(\tilde S_T - \bar S_T) \otimes vech(\tilde S_T - \bar S_T) \otimes m_T] + o(1)
= \ell^{-1} T^{-3/2} \sum_{i,j=-\ell}^{\ell} \sum_{t,s,u=1}^{T} E[vech(v_{t+i} v_t' - \Gamma_i) \otimes vech(v_{s+j} v_s' - \Gamma_j) \otimes v_u] + o(1)
= O(\ell^2 T^{-1/2}) = o(1).   (A.29)

Therefore, (A.17) follows from (A.21)-(A.24).
Next, we will prove (A.18). It follows from (A.17), Hölder's inequality and Lemma A.2 that

\kappa_2(g_T) - 1 = E(g_T^2) - [E(g_T)]^2 - 1
= 2E\{a'm_T b'[(G_T - G_0) \otimes m_T]\} + 2E\{a'm_T c'[vech(\tilde S_T - \bar S_T) \otimes m_T]\}
  + 2E\{a'm_T e'[vech(\tilde S_T - \bar S_T) \otimes vech(\tilde S_T - \bar S_T) \otimes m_T]\}
  + E\{c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}^2 + O(\ell^{1/2} T^{-1}).   (A.30)

Thus, we only need to analyze the first four terms on the RHS of (A.30). First, by repeated applications of the mixing inequality as in the proof of moment inequalities (e.g., the proof of Lemma 4 of Billingsley, 1968, pp. 172-174), one can show that

T E\{a'm_T b'[(G_T - G_0) \otimes m_T]\} = O(1).   (A.31)

Second, it follows from arguments similar to the one used in the proof of (A.17) that

(T/\ell) E\{a'm_T c'[vech(\tilde S_T - \bar S_T) \otimes m_T]\}
= (\ell T)^{-1} \sum_{j=-\ell}^{\ell} \omega_j \sum_{t=1}^{T} \sum_{s=1}^{T} \sum_{u=1}^{T} E\{a'v_t c'[vech(v_s v_{s-j}' - \Gamma_j) \otimes v_u]\}
= \ell^{-1} \sum_{j=-\ell}^{\ell} \omega_j \sum_{i,k=-T+1}^{T-1} (1 - \tau_{i,k}) E\{a'v_0 c'[vech(v_i v_{i-j}' - \Gamma_j) \otimes v_k]\}
= \ell^{-1} \sum_{j=-\ell}^{\ell} \omega_j \sum_{i,k=-T+1}^{T-1} E\{a'v_0 c'[vech(v_i v_{i-j}' - \Gamma_j) \otimes v_k]\} + O(\ell T^{-1})
= \ell^{-1} \sum_{j=-\ell}^{\ell} \sum_{i,k=-T+1}^{T-1} E\{a'v_0 c'[vech(v_i v_{i-j}' - \Gamma_j) \otimes v_k]\} + O(\ell^{-q}) + O(\ell T^{-1})
= \lim_{T\to\infty} \ell^{-1} \sum_{j=-\ell}^{\ell} \sum_{t,s=-T+1}^{T-1} E\{a'v_0 c'[vech(v_t v_{t-j}' - \Gamma_j) \otimes v_s]\} + O(\ell^{-1}),   (A.32)

(T/\ell) E\{a'm_T e'[vech(\tilde S_T - \bar S_T) \otimes vech(\tilde S_T - \bar S_T) \otimes m_T]\}
= \frac{1}{\ell T^2} \sum_{i,j=-\ell}^{\ell} \omega_i \omega_j \sum_{r,s,t,u=1}^{T} E\{a'v_r e'[vech(v_s v_{s-i}' - \Gamma_i) \otimes vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\}
= \frac{1}{\ell T} \sum_{i,j=-\ell}^{\ell} \omega_i \omega_j \sum_{s,t,u=-T}^{T} (1 - \tau_{s,t,u}) E\{a'v_0 e'[vech(v_s v_{s-i}' - \Gamma_i) \otimes vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\}
= \frac{1}{\ell T} \sum_{i,j=-\ell}^{\ell} \omega_i \omega_j \sum_{s,t,u=-T}^{T} E\{a'v_0 e'[vech(v_s v_{s-i}' - \Gamma_i) \otimes vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\} + O(\ell^2 T^{-1})
= \frac{1}{\ell T} \sum_{i,j=-\ell}^{\ell} \sum_{s,t,u=-T}^{T} E\{a'v_0 e'[vech(v_s v_{s-i}' - \Gamma_i) \otimes vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\} + O(\ell^{-q}) + O(\ell^2 T^{-1})
= \lim_{T\to\infty} \frac{1}{\ell T} \sum_{i,j=-\ell}^{\ell} \sum_{s,t,u=-T}^{T} E\{a'v_0 e'[vech(v_s v_{s-i}' - \Gamma_i) \otimes vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\} + O(\ell^{-1}),   (A.33)

and

(T/\ell) E\{c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}^2
= \ell^{-1} T^{-2} \sum_{t,s,u,v=1}^{T} \sum_{i,j=-\ell}^{\ell} \omega_i \omega_j E\{c'[vech(v_s v_{s-i}' - \Gamma_i) \otimes v_t] c'[vech(v_u v_{u-j}' - \Gamma_j) \otimes v_v]\}
= (\ell T)^{-1} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} \omega_i \omega_l (1 - \tau_{j,k,m}) E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\}
= (\ell T)^{-1} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} \omega_i \omega_l E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\} + O(\ell T^{-1})
= (\ell T)^{-1} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\} + O(\ell^{-q}) + O(\ell T^{-1})
= \lim_{T\to\infty} \ell^{-1} T^{-1} \sum_{j,k,m=-T}^{T} \sum_{i,l=-\ell}^{\ell} E\{c'[vech(v_0 v_{-i}' - \Gamma_i) \otimes v_j] c'[vech(v_k v_{k-l}' - \Gamma_l) \otimes v_m]\} + O(\ell^{-1}),   (A.34)

where \tau_{i,k} = (1/T) \min(\max(|i|, |k|, |i-k|), T) and \tau_{s,t,u} = (1/T) \min(\max(|s|, |t|, |u|, |s-t|, |t-u|, |u-s|), T). The proofs of (A.32), (A.33) and (A.34) are similar to that of (A.17) and thus details are omitted. Therefore, (A.18) follows from (A.30)-(A.34).
Third, we will prove (A.19). By (A.17), (A.18) and

\kappa_3(g_T) = E(g_T^3) - 3E(g_T^2)E(g_T) + 2(E(g_T))^3,   (A.35)

it suffices to show that

T^{1/2} E(g_T^3) = \kappa_1 + O(\ell^{-q}) + o(\ell T^{-1/2}).   (A.36)

It follows from Assumption 1(i), Hölder's inequality and Lemma A.2 that

E(g_T^3) = E[(a'm_T)^3] + 3E\{(a'm_T)^2 b'[(G_T - G_0) \otimes m_T]\}
  + 3E\{(a'm_T)^2 c'[vech(\tilde S_T - \bar S_T) \otimes m_T]\}
  + 3E\{(a'm_T)^2 c'[vech(\hat S_T - \tilde S_T) \otimes m_T]\} + o(\ell T^{-1}).   (A.37)

The rest of the proof is similar to that of (A.17), and thus we will only show that

T^{1/2} E\{(a'm_T)^2 c'[ \sum_{j=-\ell}^{\ell} \omega_j vech(\tilde\Gamma_j - \Gamma_j) \otimes m_T ]\}
= \lim_{T\to\infty} \frac{1}{T} \sum_{\tau,t,s,k=-T+1}^{T-1} E\{a'v_0 a'v_\tau c'[vech(v_t v_{t-k}' - \Gamma_k) \otimes v_s]\}.   (A.38)

It follows from arguments similar to the proof of (A.21) that

T^{1/2} E\{(a'm_T)^2 c'[vech(\tilde S_T - \bar S_T) \otimes m_T]\}
= \frac{1}{T} \sum_{s,t,u=-T+1}^{T-1} \sum_{j=-\ell}^{\ell} \omega_j (1 - \tau_{s,t,u}) E\{a'v_0 a'v_s c'[vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\}
= \frac{1}{T} \sum_{s,t,u=-T+1}^{T-1} \sum_{j=-\ell}^{\ell} \omega_j E\{a'v_0 a'v_s c'[vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\} + O(T^{-1})
= \frac{1}{T} \sum_{s,t,u=-T+1}^{T-1} \sum_{j=-\ell}^{\ell} E\{a'v_0 a'v_s c'[vech(v_t v_{t-j}' - \Gamma_j) \otimes v_u]\} + O(\ell^{-q})
= \lim_{T\to\infty} T^{-1} \sum_{\tau,t,s=-T+1}^{T-1} \sum_{j=-\ell}^{\ell} E\{a'v_0 a'v_\tau c'[vech(v_t v_{t-j}' - \Gamma_j) \otimes v_s]\} + O(\ell^{-q}).   (A.39)

By arguments similar to the proof of Lemma 1 of Andrews (1991, pp. 850-851), one can show that the RHS of (A.39) equals the infinite sum of the product of two expectations plus some finite number. By the mixing inequality, it follows that the infinite sum of the product of two expectations is finite. Therefore, the RHS of (A.39) is well defined.
Lastly, we will show (A.20).

\kappa_4(g_T) - 3 = 4E\{(a'm_T)^3 c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}
  + 4E\{(a'm_T)^3 e'[vech(\hat S_T - \bar S_T) \otimes vech(\hat S_T - \bar S_T) \otimes m_T]\}
  + 6E( (a'm_T)^2 \{c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}^2 )
  - 12E\{a'm_T c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}
  - 12E\{a'm_T e'[vech(\hat S_T - \bar S_T) \otimes vech(\hat S_T - \bar S_T) \otimes m_T]\}
  - 6E\{c'[vech(\hat S_T - \bar S_T) \otimes m_T]\}^2 + O(\ell^{1/2} T^{-1}),   (A.40)

from which the desired result follows by similar arguments. Q.E.D.
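The cumulant bookkeeping in Lemma A.3 rests on moment identities such as (A.35) and its fourth-order analogue. As a quick sanity check (illustrative only, not part of the proof), these identities can be evaluated on simulated draws; for a standard normal statistic the third and fourth cumulants should be near zero:

```python
import numpy as np

def cumulants(g):
    """Sample analogues of the first four cumulants of a statistic g,
    written out via raw moments; k3 follows equation (A.35) exactly."""
    m1 = g.mean()
    m2 = (g ** 2).mean()
    m3 = (g ** 3).mean()
    m4 = (g ** 4).mean()
    k1 = m1
    k2 = m2 - m1 ** 2
    k3 = m3 - 3 * m2 * m1 + 2 * m1 ** 3              # equation (A.35)
    k4 = m4 - 4 * m3 * m1 - 3 * m2 ** 2 + 12 * m2 * m1 ** 2 - 6 * m1 ** 4
    return k1, k2, k3, k4

rng = np.random.default_rng(1)
g = rng.standard_normal(200_000)
k1, k2, k3, k4 = cumulants(g)   # close to (0, 1, 0, 0) for N(0,1) draws
```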
Lemma A.4:

\psi_{g,T}(\theta) = \exp[ -\tfrac{1}{2}\theta^2 + T^{-1/2}( \alpha_1 (i\theta) - \tfrac{(i\theta)^3}{6} (\kappa_1 - 3\alpha_1) ) - \tfrac{\ell}{T}( \tfrac{\theta^2}{2}\gamma_1 + \tfrac{\theta^4}{24}\zeta_1 ) + o(\tfrac{\ell}{T}) ],   (A.41)

P(g_T \le x) = \Phi(x) + T^{-1/2} p_1(x) + (\ell/T) p_2(x) + o(\ell/T).   (A.42)

Proof of Lemma A.4: The proof of (A.41) follows from standard arguments. (A.42) can be obtained by inverting (A.41). Q.E.D.
Lemma A.5: Following Götze and Künsch (1996), define a truncation function by

\tau(x) = T^\gamma x f(T^{-\gamma} \|x\|) / \|x\|,

where \gamma \in (2/r, 1/2) and f \in C^\infty(0, \infty) satisfies (i) f(x) = x for x \le 1; (ii) f is increasing; and (iii) f(x) = 2 for x \ge 2. Let f_T^\dagger denote f_T with R_t \equiv (v_t', \bar v_t', vec(w_t)')' replaced by

R_t^\dagger = (v_t^{\dagger\prime}, \bar v_t^{\dagger\prime}, vec(w_t^\dagger)')' = \tau((v_t', \bar v_t', vec(w_t)')').

Let \Psi_T^\dagger and \Psi_{g,T}^\dagger denote the Edgeworth expansions of f_T^\dagger and g_T^\dagger, respectively. Let \psi_{g,T}^\dagger(\theta) and \tilde\psi_{g,T}^\dagger(\theta) denote the characteristic functions of g_T^\dagger and \Psi_{g,T}^\dagger, respectively. Then

\sup_x |P(f_T^\dagger \le x) - \Psi_T^\dagger(x)| \le C \int_{|\theta| \le \cdot} |\psi_{g,T}^\dagger(\theta) - \tilde\psi_{g,T}^\dagger(\theta)| |\theta|^{-1} d\theta + O(\ell^{-q}) + o(\ell T^{-1}).   (A.43)
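The truncation map of Lemma A.5 is easy to implement once a concrete f is fixed. The lemma only requires a smooth increasing f with f(x) = x for x ≤ 1 and f(x) = 2 for x ≥ 2; the cubic interpolant below is our own illustrative stand-in (C^1 rather than C^∞), which already exhibits the two properties the argument exploits: τ(x) = x whenever ‖x‖ ≤ T^γ, and ‖τ(x)‖ ≤ 2T^γ always:

```python
import numpy as np

def f(u):
    """Concrete stand-in for the smooth function in Lemma A.5:
    f(u) = u for u <= 1, increasing, f(u) = 2 for u >= 2.
    The cubic on [1, 2] is a Hermite fit with f(1)=1, f'(1)=1, f(2)=2, f'(2)=0;
    it is only C^1, whereas the lemma assumes a C-infinity choice."""
    if u <= 1.0:
        return u
    if u >= 2.0:
        return 2.0
    t = u - 1.0
    return 1.0 + t + t ** 2 - t ** 3

def tau(x, T, gamma):
    """Truncation map tau(x) = T^gamma * x * f(T^{-gamma} ||x||) / ||x||.
    Leaves x unchanged when ||x|| <= T^gamma; caps ||tau(x)|| at 2 T^gamma."""
    x = np.asarray(x, dtype=float)
    n = np.linalg.norm(x)
    if n == 0.0:
        return x
    scale = T ** gamma
    return scale * x * f(n / scale) / n
```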