Robust Tests for Heteroskedasticity in the One-Way Error Components Model∗

This paper constructs tests for heteroskedasticity in one-way error components models, in line with Baltagi, Bresson and Pirotte (Journal of Econometrics, 134, 2006). Our tests have two additional robustness properties. First, standard tests for heteroskedasticity in the individual component are shown to be negatively affected by heteroskedasticity in the remainder component. We derive modified tests that are insensitive to heteroskedasticity in the component not being checked, and hence help identify the source of heteroskedasticity. Second, Gaussian-based LM tests are shown to reject too often in the presence of heavy-tailed (e.g. Student's t) distributions. By using a conditional moments framework, we derive distribution-free tests that are robust to non-normality. Our tests are computationally convenient since they are based on simple artificial regressions after pooled OLS estimation. JEL Classification: C12, C23.


Introduction
Typical panels in econometrics are largely asymmetric, in the sense that their cross-sectional dimension is much larger than their temporal one. Consequently, most of the concerns that affect cross-sectional models harm panel data models similarly. This is surely the case of heteroskedasticity, a subject that has played a substantial role in the history of econometric research and practice, and still occupies a relevant place in its pedagogical side: all basic texts include a chapter on the subject. As is well known, heteroskedasticity invalidates standard inferential procedures, and usually calls for alternative strategies that either accommodate heterogeneous conditional variances, or are insensitive to them. The one-way error components model is the most basic extension of simple linear models to handle panel data, and it is widely used in the applied literature. In this model, heteroskedasticity may be present in the 'individual' error component, in the observation-specific 'remainder' error component, or in both simultaneously.
Consider the case of testing for heteroskedasticity. In the cross-sectional domain, the landmark paper by Breusch and Pagan (1979) derives a widely used, asymptotically valid test in the Lagrange multiplier (LM) maximum likelihood (ML) framework under normality. Further work by Koenker (1981) proposed a simple 'studentization' that avoids the restrictive Gaussian assumption. This is an important result since non-normalities severely affect the performance of the standard LM-based test, as clearly documented by Evans (1992) in a comprehensive Monte Carlo study. Wooldridge (1990, 1991) and Dastoor (1997) consider a more general framework allowing for heterokurtosis.
The literature on panel data has only recently produced results analogous to those available for the cross-sectional case.$^{1}$ For the one-way error components model, Holly and Gardiol (2000) study the case where heteroskedasticity is only present in the individual-specific component, and derive a test statistic that is a direct analog of the classic Breusch-Pagan test in an LM framework under normality.$^{2}$ Baltagi, Bresson and Pirotte (2006) allow for heteroskedasticity in both components and derive a test for the joint null of homoskedasticity, again in the Gaussian LM framework. They also derive 'marginal' tests for homoskedasticity in either component, that is, tests that assume that heteroskedasticity is absent in the component not being checked, of which, naturally, the test by Holly and Gardiol (2000) is a particular case. Both articles propose LM-type tests and, consequently, are based on estimating a null homoskedastic model, which makes them computationally attractive.$^{3}$ Closer to our work is Lejeune (2006), who proposes a pseudo maximum likelihood framework for estimation and inference of a full heteroskedastic model. This paper derives new tests for homoskedasticity in the error components model that possess two robustness properties. Though the term robust has a long tradition in statistics (Huber, 1981), in this paper it is used to mean being resistant to 1) misspecification of the conditional variance of the remainder term, and 2) departures away from the strict Gaussian framework used in the ML-LM context.

1 An early contribution on this topic is the seminal paper by Mazodier and Trognon (1978).
2 Recently, Baltagi, Jung and Song (2010) extend this test to incorporate serial correlation as well.
The first robustness property is related to resistance to misspecification of the a priori admissible hypotheses, that is, to 'type-III errors' in the terminology of Kimball (1957) (see Welsh, 1996, pp. 119-120, for a discussion of these concepts). The negative effects of this type of misspecification on the performance of LM tests have been studied by Davidson and MacKinnon (1987), Saikkonen (1989) and Bera and Yoon (1993), and are found to occur when the score of the parameter of interest is correlated with that of the nuisance parameter. This type of misspecification affects the Holly and Gardiol (2000) test when the temporal dimension of the panel is fixed: the test assumes that heteroskedasticity is absent in the remainder term, and therefore may reject its null spuriously, not because heteroskedasticity is present in the individual component being tested, but because it is present in the other one. This problem can be observed directly in the corresponding non-zero element of the Fisher information matrix presented in Baltagi et al. (2006).
As discussed in Section 4, Lejeune's (2006) [...]

The second robustness property is related to the idea of robustness of validity of Box (1953), that is, tests that achieve an intended asymptotic level for a rather large family of distributions (see Welsh, 1996, ch. 5, for a discussion). In this paper, through an extensive Monte Carlo experiment, non-normalities are shown to severely affect the performance of the tests by Holly and Gardiol (2000) and Baltagi et al. (2006), consistent with the results found by Evans (1992) for the cross-sectional case. We derive new tests using a conditional moments framework, and thus they are by construction distribution-free, subject to mild regularity assumptions. In this context, the LM-type tests proposed by Lejeune (2006) are also resistant to non-normalities. We also consider the case of possible heterokurtosis, as a convenient and simple extension of our framework, along the lines of the work by Wooldridge (1990, 1991) and Dastoor (1997).
An additional advantage of all our proposed statistics is that of simplicity, since they are based on simple transformations of pooled OLS residuals of a fully homoskedastic model, unlike the tests by Holly and Gardiol (2000) and Baltagi et al. (2006), which require ML estimation. Furthermore, all tests proposed in this paper can be computed based on the $R^2$ coefficients from simple artificial regressions.
The paper is organized as follows. Section 2 presents the heteroskedastic error components model and the set of moment conditions used to derive test statistics in Section 3. Section 4 presents the results of a detailed Monte Carlo experiment that compares all our statistics and those obtained by Holly and Gardiol (2000), Baltagi et al. (2006) and Lejeune (2006). Section 5 considers an extension of the proposed statistics to handle heterokurtosis. Section 6 concludes and presents suggestions for practitioners and future research.

Moment conditions for the one-way heteroskedastic error components model
Baltagi et al. (2006) use a parametric error components model under normality, and an ML estimator. In order to highlight differences and similarities, our search for distribution-free tests for heteroskedasticity will be based on a set of appropriate moment conditions. Consider the following regression model with general heteroskedasticity in a one-way error components model:

$$y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \mu_i + \nu_{it}, \qquad i = 1, \dots, N, \quad t = 1, \dots, T, \tag{1}$$

where $y_{it}$, $u_{it}$, $\mu_i$ and $\nu_{it}$ are scalars, $x_{it}$ is a $k_\beta$-vector of regressors, and $\beta$ is a $k_\beta$-vector of parameters. As usual, the subscript $i$ refers to individual, and $t$ to temporal observations. We follow the conditional moments framework introduced by Newey (1985), Tauchen (1985) and White (1987), and consider a set of conditioning variables $w_{it}$, containing the not necessarily disjoint elements $x_{it}$, $z_{\mu i}$ and $z_{\nu it}$. Here $z_{\mu i}$ and $z_{\nu it}$ are vectors of regressors of dimensions $k_{\theta_\mu}$ and $k_{\theta_\nu}$ respectively. For notational convenience we also [...]

Throughout the paper we assume that the conditional mean of model (1) is well specified, [...] In the context of the general framework specified by Wooldridge (1990, p. 18) this implies that the validity of the derived tests actually imposes more than just the hypothesis of interest, by ruling out misspecification in the conditional mean.$^{4}$ Further, we assume that the conditional processes $\mu_i | w_i$ and $\nu_{it} | w_i$ are conditionally uncorrelated, independent across $i$, with $\nu_{it} | w_i$ also uncorrelated across $t$, and with zero conditional mean, conditional variances given by [...] and finite fourth moments. $h_\mu(\cdot)$ and $h_\nu(\cdot)$ are twice continuously differentiable [...] : $\theta_\nu = \theta_\mu = 0$. Because, in general, the nature of the heteroskedasticity is unknown, $z_\mu$ and $z_\nu$ may be similar, when not identical, hence we cannot rely on them to distinguish among different types of heteroskedasticity.
Let $\bar{u}_i \equiv T^{-1} \sum_{t=1}^{T} u_{it}$ be the between residuals and $\tilde{u}_{it} \equiv u_{it} - \bar{u}_i$ the within residuals. Different moment conditions on these errors provide alternative ways of testing for both sources of heteroskedasticity.
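As an illustration of the two transformations just defined, the following minimal Python sketch (assuming NumPy and a balanced panel stored as an $N \times T$ array; the function name and the toy numbers are hypothetical) computes between and within residuals from a matrix of residuals.

```python
import numpy as np

def between_within_residuals(u):
    """Split an (N, T) array of residuals into between and within parts."""
    u = np.asarray(u, dtype=float)
    u_bar = u.mean(axis=1)          # between residuals: one time average per individual
    u_tilde = u - u_bar[:, None]    # within residuals: deviations from the individual mean
    return u_bar, u_tilde

# Toy residuals for N = 3 individuals and T = 2 periods (illustrative values only).
u = np.array([[1.0, 3.0], [2.0, 2.0], [0.0, -4.0]])
u_bar, u_tilde = between_within_residuals(u)
```

By construction the within residuals sum to zero for each individual, so only the between residuals carry information on the level of the individual component.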
The squared between residual provides moment conditions for testing [...] Unlike (4), this moment condition does not involve parameters related to heteroskedasticity in the remainder component, and, hence, it will be used in Section 3.2 to construct tests for heteroskedasticity in the individual component in short panels that are robust to the presence of heteroskedasticity in the remainder component.
Consider now the moment condition based on the squared within residual: [...] This condition can be used to construct tests for $H_0^{\sigma^2_\nu}$. Note that $\sigma^2_\mu$ and $\theta_\mu$ do not appear anywhere in (7), which means that a test based on this moment condition will be robust to the presence of heteroskedasticity in the individual error component, i.e. when $\theta_\mu \neq 0$. A test for heteroskedasticity in the remainder component will be based on $NT \times R^2$, where $R^2$ is the centered coefficient of determination of an auxiliary regression of $\tilde{u}^2_{it}$ on $z_\nu$ and a constant (see Section 3.3). Note that there may be differences between short and long panels because [...]. This is explored in Section 3.4.
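The $NT \times R^2$ computation described above can be sketched as follows. This is a minimal Python illustration assuming NumPy, a balanced panel, and a single covariate stored as an $N \times T$ array; the function names, the data layout and the simulated inputs are assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def n_times_r2(y, Z):
    """Sample size times the centered R^2 from OLS of y on Z and a constant."""
    y = np.asarray(y, dtype=float).ravel()
    Z = np.asarray(Z, dtype=float).reshape(y.size, -1)
    X = np.column_stack([np.ones(y.size), Z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    tss = np.sum((y - y.mean()) ** 2)
    return y.size * (1.0 - resid @ resid / tss)

def remainder_statistic(u_hat, z_nu):
    """NT x R^2 from regressing squared within residuals on z_nu and a constant."""
    u_hat = np.asarray(u_hat, dtype=float)
    u_tilde = u_hat - u_hat.mean(axis=1, keepdims=True)   # within residuals
    return n_times_r2(u_tilde ** 2, np.asarray(z_nu, dtype=float).ravel())

# Illustrative homoskedastic errors: N = 200 individuals, T = 5 periods.
rng = np.random.default_rng(0)
z_nu = rng.uniform(0.0, 2.0, size=(200, 5))
u_hat = rng.normal(size=(200, 1)) + rng.normal(size=(200, 5))
stat = remainder_statistic(u_hat, z_nu)
```

In an application `u_hat` would hold pooled OLS residuals, and the resulting value would be compared with a chi-squared critical value with as many degrees of freedom as there are columns in $z_\nu$.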

Robust tests for heteroskedasticity
Our tests will be based on the moment conditions considered in the previous section, following Koenker's (1981) studentization procedure. We use the asymptotic framework of Dastoor (1997) adapted to the one-way error components model structure described above.
Assumption 1 For each $i = 1, \dots, N$ and $t = 1, \dots, T$, $E[w_{j,it} w_{j,it}']$ is a finite positive definite matrix, where $w_{j,\cdot}$ is a column vector containing the distinct elements of $w$ and 1. Moreover, Dastoor's framework includes Wooldridge's (1990, 1991) set-up for heterokurtosis, that is, the case where the error term is allowed to have different conditional fourth moments. In our case, this would involve allowing that [...] constants. In this section we derive tests assuming homokurtosis, since it provides an intuitive framework to motivate the statistics. The heterokurtic case and a related Monte Carlo exploration are treated as an extension, in Section 5.
Assumption 2 For each $i = 1, \dots, N$, and $t = 1, \dots,$ [...] The test statistics will be based on transformations of the OLS residuals [...]

Test for
Define $\hat{\eta}$, an $N$-vector containing the sample squared between residuals, $Z_\mu$, an $N \times k_{\theta_\mu}$ matrix with the sample matrix of covariates for testing this hypothesis, and $M_N \equiv I_N - \bar{J}_N$, where $\bar{J}_N = \iota_N \iota_N'/N$ and $\iota_N$ is an $(N \times 1)$ vector of ones. Consider a sequence of alternatives à la Pitman [...] Then, under Assumptions 1 and 2, as $N, T \to \infty$ or [...] Proof: Note that the sequence of random variables $\{\bar{u}_i^2\}$ is independent. Moreover, by taking a Taylor series expansion of the function $h_\mu(\cdot)$ and [...] we apply Theorem 1 in Dastoor (1997) for our sequence of squared OLS between residuals on $i = 1, \dots, N$, which under Assumption 2 (homokurtosis) gives the desired result. Q.E.D.
Note that if $\mu$ is Gaussian, $\phi_\mu = 2 \times (\sigma^2_\mu + T^{-1} \sigma^2_\nu)^2$, and then the Koenker-type test reduces to the Holly and Gardiol (2000) marginal test, which is analogous to the Breusch and Pagan (1979) test where the between OLS residuals are used instead of the untransformed OLS residuals.
Consider now the auxiliary regression model (see Davidson and MacKinnon, 1990, on the use of artificial regressions) [...] Note that $m_\mu$ is $N \times R^2_\mu$, where $R^2_\mu$ is the centered coefficient of determination of this regression model, i.e. an auxiliary regression of $\hat{\eta}$ on $z_\mu$ and a constant (see Koenker, 1981, p. 111).
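The equivalence between a studentized quadratic form and $N \times R^2_\mu$ can be checked numerically. The sketch below is only an illustration in Python with NumPy: it assumes that the statistic takes the usual Koenker (1981) form $\hat{\eta}' M_N Z_\mu (Z_\mu' M_N Z_\mu)^{-1} Z_\mu' M_N \hat{\eta} / \hat{\phi}_\mu$ with $\hat{\phi}_\mu = \hat{\eta}' M_N \hat{\eta} / N$ (the displayed formula is not available in this text), and all function names and simulated data are hypothetical.

```python
import numpy as np

def m_mu_quadratic_form(eta, Z):
    """Koenker-type studentized quadratic form (assumed form, see lead-in)."""
    eta = np.asarray(eta, dtype=float).ravel()
    n = eta.size
    Z = np.asarray(Z, dtype=float).reshape(n, -1)
    M = np.eye(n) - np.ones((n, n)) / n        # M_N = I_N - iota iota' / N
    MZ, Meta = M @ Z, M @ eta
    phi_hat = Meta @ Meta / n                  # sample variance of eta
    score = MZ.T @ Meta                        # Z' M eta, since M is idempotent
    return score @ np.linalg.solve(MZ.T @ MZ, score) / phi_hat

def m_mu_r2(eta, Z):
    """N times the centered R^2 from OLS of eta on Z and a constant."""
    eta = np.asarray(eta, dtype=float).ravel()
    n = eta.size
    X = np.column_stack([np.ones(n), np.asarray(Z, dtype=float).reshape(n, -1)])
    coef, *_ = np.linalg.lstsq(X, eta, rcond=None)
    resid = eta - X @ coef
    return n * (1.0 - resid @ resid / np.sum((eta - eta.mean()) ** 2))

# Illustrative data: squared between residuals with variance increasing in z.
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 2.0, size=200)
eta = (rng.normal(size=200) * np.exp(0.25 * z)) ** 2
m_quad, m_r2 = m_mu_quadratic_form(eta, z), m_mu_r2(eta, z)
```

Both routes return the same number up to rounding error, which is why the artificial regression is the more convenient implementation in practice.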

Test for
A test for the individual component in short panels with potential heteroskedasticity in the remainder component requires the use of condition [...] Then, under Assumptions 1 and 2, as $N \to \infty$ and under H [...] Proof: similar to that in Theorem 1.
Consider the auxiliary regression model [...] Using a similar argument as before, $m^*_\mu = N \times R^{2*}_\mu$, where $R^{2*}_\mu$ is the centered coefficient of determination of the regression model. Note that the auxiliary regression model (11) covers that in model (9), and therefore, the case analyzed here is a generalization of the former.

Test for H
Consider a test for homoskedasticity in the remainder component in long [...] vector containing the sample within residuals squared, $Z_\nu$, an $NT \times k_{\theta_\nu}$ matrix with the sample matrix of covariates for testing this hypothesis, and $M_{NT} =$ [...] The following theorem derives an asymptotically valid test for this hypothesis.
Then, under Assumptions 1 and 2, as $N,$ [...] Proof: Note that the sequence of random variables $\{\tilde{u}^2_{it}\}$ is asymptotically in- [...] Then follow the proof of Theorem 1 for our sequence on $i = 1, \dots, N$ and $t = 1, \dots, T$, which under Assumption 2 (homokurtosis) gives the desired result. Q.E.D. Consider now the auxiliary regression model [...]

Consider now the case where $N \to \infty$ and $T$ is finite. For this case, consider a Taylor expansion of eq. (7) where $\theta_\nu$ is expanded about 0; additional covariance terms need to be taken into consideration. Define $\eta$ as the vector of within residuals $\{\tilde{u}_{it}\}$, and let $\hat{\Phi}_\nu$ be a consistent estimate of the variance-covariance matrix of $\eta$.
Then, under Assumptions 1 and 2, as $N \to \infty$, $T$ fixed and under $H_A^{\sigma^2_\nu}$ : [...]

5 As noted by an anonymous referee, a significant limitation of this test is that it requires $\nu_{it} | w_i$ not to be serially correlated; it should not be very difficult to construct a modified test that does not rely on this assumption (see for instance the next subsection, where additional covariance terms are considered).
A convenient way to implement this test is based on the auxiliary regression [...]

Following Baltagi et al. (2006) we construct a joint test based on the sum of the individual tests, $m_{\mu,\nu} = m_\mu + m_\nu$. With $N$ and $T$ tending to infinity, the joint test is trivially derived by exploiting the two orthogonal moment conditions (5) and (7), and hence a valid test is based on the sum of the marginal tests for each source of heteroskedasticity, which involves the sum of independent chi-squared random variables, and therefore we have that $m_{\mu,\nu} \xrightarrow{d} \chi^2_{k_{\theta_\mu} + k_{\theta_\nu}}$. Note that the joint test by Baltagi et al. (2006) also reduces to the sum of two marginal tests when $T \to \infty$. A preliminary analysis of the Monte Carlo experiments showed that with $T$ small, $m_{\mu,\nu}$ behaves similarly to the large $T$ case, and therefore we find that it is not necessary to make a small panel correction.

6 The Monte Carlo experiments of the next section are carried out with $T \geq 5$, and we find no significant discrepancies between the results obtained from model (14) and those based on the statistic in Theorem 4, where the within-individual covariance terms c in $\hat{\Phi}_\nu$ are estimated as [...]
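To make the joint decision rule concrete, the following Python sketch adds two marginal statistics and evaluates the chi-squared upper-tail probability with $k_{\theta_\mu} + k_{\theta_\nu}$ degrees of freedom. It uses only the standard library; the series for the incomplete gamma function is a generic numerical device chosen for this illustration, and the function names and input values are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of freedom."""
    if x <= 0:
        return 1.0
    a, y = df / 2.0, x / 2.0
    # Series for the lower incomplete gamma function:
    # gamma(a, y) = y^a e^{-y} * sum_{n>=0} y^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= y / (a + n)
        total += term
        n += 1
    log_prefactor = a * math.log(y) - y - math.lgamma(a)
    return max(0.0, 1.0 - total * math.exp(log_prefactor))

def joint_test(m_mu, m_nu, k_mu, k_nu):
    """Sum of the marginal statistics and its asymptotic p-value."""
    stat = m_mu + m_nu
    return stat, chi2_sf(stat, k_mu + k_nu)

# Hypothetical marginal statistics, one covariate in each variance function.
stat, pval = joint_test(2.1, 4.5, 1, 1)
```

With two degrees of freedom the upper-tail probability has the closed form $\exp(-x/2)$, which gives a simple check on the numerical routine.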

Monte Carlo experiments
In order to explore the robustness properties of the proposed tests in small samples, the design of our Monte Carlo experiment will initially follow very closely that of Baltagi et al. (2006), to which we refer for further details on the experimental design, and will be modified accordingly to highlight some specific features of our tests. The baseline model is:

$$y_{it} = \beta_0 + \beta_1 x_{it} + \mu_i + \nu_{it},$$

where $x_{it} = w_{i,t} + 0.5 w_{i,t-1}$ and $w_{i,t} \sim$ iid $U(0, 2)$. The parameters $\beta_0$ and $\beta_1$ are assigned values 5 and 0.5, respectively. For each $x_i$, we generate $T + 10$ observations and drop the first 10 observations in order to reduce the dependency on initial values.
The experiment considers three cases, corresponding to different sources of heteroskedasticity. In all of them, the total variance is set to $\bar{\sigma}^2_\mu + \bar{\sigma}^2_\nu = 8$, [...]. For all DGPs, $\nu_{it}$ has zero mean and variance $\sigma^2_{\nu_{it}}$, while $\mu_i$ has zero mean and variance $\sigma^2_{\mu_i}$. For each case we consider exponential heteroskedasticity, $h(z'\theta) = \exp(z'\theta)$.$^{7}$ The following heteroskedastic models are considered:

Heteroskedasticity in the remainder component (case a): [...], 1, 2, 3}, and $\theta_\mu = 0$.
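A rough sketch of a data generating process of this kind is given below, in Python with NumPy. The regressor recursion, the burn-in of 10 observations, the coefficients 5 and 0.5 and the exponential variance function come from the description above; the Gaussian errors, the rescaling that keeps the average remainder variance at `sigma2_nu`, and every name in the code are assumptions made for illustration, since the exact normalization is not available in this text.

```python
import numpy as np

def simulate_panel(N, T, theta_nu=0.0, sigma2_mu=2.0, sigma2_nu=6.0, burn=10, seed=0):
    """Simulate (y, x) with exponential heteroskedasticity in the remainder component."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 2.0, size=(N, T + burn + 1))
    x = w[:, 1:] + 0.5 * w[:, :-1]        # x_it = w_it + 0.5 w_{i,t-1}
    x = x[:, burn:]                       # drop the first `burn` observations
    h = np.exp(theta_nu * x)              # exponential variance function
    h = h / h.mean()                      # assumption: average remainder variance = sigma2_nu
    mu = rng.normal(0.0, np.sqrt(sigma2_mu), size=(N, 1))   # homoskedastic individual effect
    nu = rng.normal(size=(N, T)) * np.sqrt(sigma2_nu * h)   # heteroskedastic remainder
    y = 5.0 + 0.5 * x + mu + nu
    return y, x

y, x = simulate_panel(N=50, T=5, theta_nu=1.0, seed=1)
```

Setting `theta_nu=0.0` gives the fully homoskedastic design, which is the relevant case when studying empirical size.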
For each replication we have computed the test statistics proposed in this paper, those based on Lejeune's (2006) framework (based on pooled OLS residuals), and those of Baltagi et al. (2006) and Holly and Gardiol (2000), using residuals after ML estimation. Specifically, the statistics considered and their corresponding null hypotheses are: [...] The statistic is $N$ times the $R^2$ from the pooled OLS regression of $\bar{u}^2_i$ on $\bar{x}_i$ and a constant (see Section 3.1, eq. (8)).
• $m^*_\mu$. H [...] in short panels, and is $N$ times the $R^2$ from the pooled OLS regression of $\tilde{u}^2_{it}$ on $x_{it}$ and a constant (see Section 3.2, eq. (10)).
• $m_{\mu,\nu}$. $H_0 : \theta_\mu = \theta_\nu = 0$. This is the proposed statistic for the joint null of homoskedasticity in both components, and is the sum of $m_\mu$ and $m_\nu$.

7 Simulations were also run for quadratic heteroskedasticity, $h(z'\theta) = (1 + z'\theta)^2$, and the results are similar for size and power to those of exponential heteroskedasticity. Following the referees' suggestions we omit these results but they are available from the authors upon request.
We have performed 5000 replications for each case, and the proportion of rejections was obtained based on a 5% nominal level. The main goals of the experiment are to quantify 1) the effects of misspecified heteroskedasticity on new and existing tests, 2) the effects of departures away from Gaussianity, and 3) the 'cost of robustification', that is, the potential power losses due to using robust tests when the 'ideal' conditions (normality and correct specification) used to derive the ML-LM based tests hold, and hence a robustification is not necessary. In order to isolate each problem, in the first subsection we will focus on robustness to misspecification, and in the second one on robustness of validity, measuring robustification costs for each case. It is important to note that all tests are constructed using parameters estimated under the joint null hypothesis of full homoskedasticity. Therefore, Holly and Gardiol (2000), Baltagi et al. (2006) and Lejeune (2006) [...]

It is important to observe that, as predicted by the results of Section 2, the effects of misspecification are stronger the smaller $T$ is and the more important the between variation in the remainder component is. The first effect can be appreciated by comparing results for different panel sizes, and the second by comparing the cases $\sigma^2_\mu = 6$, $\bar{\sigma}^2_\nu = 2$ and $\sigma^2_\mu = 2$, $\bar{\sigma}^2_\nu = 6$ in Tables 1 and 2.
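The way an empirical rejection frequency at the 5% nominal level is obtained can be sketched as follows, in Python with NumPy. The example is deliberately simplified and is not the paper's design: it uses a homoskedastic Gaussian DGP, an iid uniform regressor, 500 replications, and the $N \times R^2$ statistic from regressing squared between residuals of pooled OLS on $\bar{x}_i$; all names are hypothetical.

```python
import numpy as np

CHI2_1_CRIT_5PCT = 3.841  # 95th percentile of a chi-squared with 1 degree of freedom

def n_times_r2(y, z):
    """Sample size times the centered R^2 from OLS of y on z and a constant."""
    X = np.column_stack([np.ones(y.size), z])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return y.size * (1.0 - resid @ resid / np.sum((y - y.mean()) ** 2))

def one_replication(rng, N, T):
    """Simulate a homoskedastic panel and return the between-residual statistic."""
    x = rng.uniform(0.0, 3.0, size=(N, T))                  # simplified regressor (assumption)
    u = rng.normal(0.0, np.sqrt(2.0), size=(N, 1)) + rng.normal(0.0, np.sqrt(6.0), size=(N, T))
    y = 5.0 + 0.5 * x + u
    X = np.column_stack([np.ones(N * T), x.ravel()])        # pooled OLS design
    beta, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    u_hat = (y.ravel() - X @ beta).reshape(N, T)
    u_bar = u_hat.mean(axis=1)                              # between residuals
    return n_times_r2(u_bar ** 2, x.mean(axis=1))

def rejection_rate(reps=500, N=50, T=5, seed=0):
    """Proportion of replications in which the statistic exceeds the 5% critical value."""
    rng = np.random.default_rng(seed)
    stats = np.array([one_replication(rng, N, T) for _ in range(reps)])
    return float(np.mean(stats > CHI2_1_CRIT_5PCT))

rate = rejection_rate()
```

Under the null the rate should be close to 0.05 up to simulation error; replacing the homoskedastic errors with a heteroskedastic design turns the same loop into a power calculation.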

Robustness to misspecified heteroskedasticity
In order to highlight these points, consider the following experiments, which are a variation of the exponential heteroskedasticity in the remainder component, case a, where $\sigma^2_\mu = 2$ for all $i$, $\lambda_\nu = 3$, and $\bar{\sigma}^2_\nu = 6$. First, to assess the sensitivity of the proposed statistics to the panel size, we fix $N = 50$ and consider 1000 simulations for each $T \in \{2, 3, \dots, 30\}$. Simulation results are presented graphically in Figure 1, and show that the main problem arises because of short panels. Moreover, they show that the main gain of using $m^*_\mu$ is in the small $T$ case, the most likely situation in practice.
All tests achieve correct size for large $T$, but $m^*_\mu$ achieves the correct size in shorter panels. INSERT

INSERT FIGURES 3 2 HERE
Finally, we explored the effects of the relative importance of between vs. within heteroskedasticity in the remainder component. Consider now the following form of functional heteroskedasticity: [...] with $\alpha \in [0, 1]$. If $\alpha = 0$, this corresponds to case a in Table 1. If $\alpha = 1$, by construction, there is only within heteroskedasticity, and therefore no differences in the variance across individuals. For different values of $\alpha$, we have generated 1000 replications for $(N, T) = (50, 5)$, and calculated the empirical size at a theoretical level of 5% of $HG_\mu$, $L_\mu$, $m_\mu$ and $m^*_\mu$. Results are shown graphically in Figure 4. $HG_\mu$, $L_\mu$ and $m_\mu$ reject too often for small $\alpha$, while $m^*_\mu$ has better size properties. Moreover, for the four statistics, the simulated empirical size approaches the theoretical level as $\alpha$ goes to 1.

INSERT FIGURE 4 HERE
Regarding robustification costs, tests specifically designed to react to heteroskedasticity in the remainder ($m_\nu$, $BBP_\nu$, $BBP_\nu$, $L_\nu$) increase their empirical power with the strength of this type of heteroskedasticity and, as expected under normality, the power of $BBP_\nu$ is the largest. Interestingly, our robust test $m_\nu$ performs relatively close to the Baltagi et al. (2006) LM statistics, implying that robustification costs for these particular experiments are low, that is, the loss in power for unnecessarily using a robust test is minor. Finally, note that the performance of $m^*_\nu$, our proposed statistic designed to increase its power in small samples, is not as good as expected. First, it shows over-rejection for the ($\sigma^2_\mu = 6$, $\sigma^2_\nu = 2$) case. Second, its power outperforms that of $m_\nu$ only in Table 2.

INSERT TABLE 3 HERE

Consider now Table 3, where we allow for heteroskedasticity in the individual component only, under Gaussianity. The Holly and Gardiol (2000) test is locally optimal and should have correct asymptotic size, so robustifications are not necessary. Our robust statistics $m_\mu$ and $m^*_\mu$ have very similar rejection rates for all values of $\theta_\mu$, suggesting that robustification costs are small in this case too. Interestingly, the test by Lejeune (1996) [...]

Robustness of validity
In order to explore the effect of departures away from Gaussianity, we evaluate the performance of all the test statistics under $H_0 : \theta_\mu = \theta_\nu = 0$, $N = 50$ and $T = 5$, for non-normal DGPs using 5000 replications. First, we generate Student's t DGPs with 3 and 5 degrees of freedom. Second, we consider skewed-Normal distributions constructed as in Azzalini and Capitanio (2003).$^{9}$ Finally, we have also considered log-normal, exponential, $\chi^2_1$ and uniform distributions. In all cases, the random variables are standardized to have the required variances.
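One way to produce such standardized non-normal errors is sketched below in Python with NumPy. The text does not say how the standardization was carried out, so the approach here (centering and rescaling with the theoretical mean and variance of each distribution) is an assumption, as are the function name and the distribution labels; the skewed-Normal and Student's t(3) cases are left out of the sketch.

```python
import numpy as np

def standardized_draws(name, size, variance, rng):
    """Draw zero-mean errors with a given variance from a non-normal distribution."""
    if name == "t5":
        raw, mean, var = rng.standard_t(5, size=size), 0.0, 5.0 / 3.0
    elif name == "lognormal":
        raw, mean, var = rng.lognormal(0.0, 1.0, size=size), np.exp(0.5), (np.e - 1.0) * np.e
    elif name == "exponential":
        raw, mean, var = rng.exponential(1.0, size=size), 1.0, 1.0
    elif name == "chi2_1":
        raw, mean, var = rng.chisquare(1, size=size), 1.0, 2.0
    elif name == "uniform":
        raw, mean, var = rng.uniform(0.0, 1.0, size=size), 0.5, 1.0 / 12.0
    else:
        raise ValueError(f"unknown distribution: {name}")
    # Center with the theoretical mean and rescale to the target variance.
    return (raw - mean) * np.sqrt(variance / var)

rng = np.random.default_rng(0)
nu = standardized_draws("exponential", 200_000, 6.0, rng)   # remainder-type errors
mu = standardized_draws("uniform", 200_000, 2.0, rng)       # individual-type errors
```

Standardizing in this way changes only location and scale, so the skewness and kurtosis that drive the size distortions of the Gaussian-based tests are preserved.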
The effects of departures away from Gaussianity are dramatic. For the Student's t cases, the empirical sizes of the Gaussian-based LM statistics are considerably above the nominal level. Moreover, the simulations show that rejection rates decrease as the degrees of freedom increase, and thus as the DGP becomes closer to normal. Even higher rejection rates are observed for the log-normal, exponential, $\chi^2_1$ and uniform DGPs. For instance, the log-normal has rejection rates above 0.24 for $HG_\mu$, and close to 0.50 for $BBP_\nu$. However, rejection rates are close to the nominal level for the skewed-Normal distribution (with considerable skewness but limited kurtosis). These results are in line with Evans' (1992) simulations for the Breusch-Pagan cross-sectional test, which was found to be highly sensitive to excess kurtosis but less so to skewness.
Interestingly, our new test statistics and those of Lejeune (2006) are robust to departures away from Gaussianity, presenting empirical sizes very close to their nominal values. Surprisingly, we also find good empirical size for the Student's t case with 3 degrees of freedom, which has infinite fourth moment and therefore does not satisfy the assumptions used in the theorems of Section 3. Finally, all tests derived under Lejeune's (2006) framework present good empirical size and are, hence, robust to distributional misspecifications. Although not reported, in all cases the proposed tests have monotonically increasing empirical power as heteroskedasticity in the tested component increases.
To summarize, the analysis confirms that, although optimal in the Gaussian case, LM tests derived under this assumption are severely affected by non-normalities, and that, on the contrary, our new statistics and those based on Lejeune's (2006) framework remain unaltered by changes in the underlying distribution of the error terms.

An extension: the heterokurtic case
We consider an extension of the tests proposed above to the case of finite but non-identical fourth moments, i.e. heterokurtosis. This is, thus, a generalization of the procedures of Wooldridge (1990, 1991) and Dastoor (1997) in the cross-sectional case, to the error components model in panel data. In this case, Assumption 2 should be dropped and the asymptotic results should be modified to allow for different variances of the conditional squared residuals. We illustrate this procedure by modifying Theorem 1 (for the tests for heteroskedasticity in the individual component), which provides guidance for straightforward extensions of Theorems 2 and 3.
Recall from Section 3.1 that $\hat{\eta}_i = \bar{u}^2_i$. Define [...] Consider the following assumption, which ensures existence of the fourth moments: [...] is a finite positive definite matrix.
The following theorem provides the asymptotic distribution of a Wooldridge (1990)-type statistic for testing heteroskedasticity in the individual component with heterokurtosis. The intuition is that, as argued in Wooldridge (1990, p. 23), the White (1980) covariance matrix (in our case based on $\hat{\Phi}_\mu$) can be used to compute heteroskedasticity tests that are not affected by heterokurtosis. A similar procedure can be used to construct tests that are robust to heterokurtosis for all the test statistics considered in this paper.
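The idea of replacing the homokurtic variance with a White-type covariance can be illustrated with the standard heterokurtosis-robust form of the test, written here in Python with NumPy. This is a sketch of the generic Wooldridge (1990) construction, not a transcription of the theorem: it assumes that the moment function is $g_i = (\hat{\eta}_i - \bar{\eta})(z_{\mu i} - \bar{z}_\mu)$ and that the statistic is $N$ minus the sum of squared residuals from a regression of a vector of ones on the $g_i$; the function names and simulated data are hypothetical.

```python
import numpy as np

def moment_matrix(eta, Z):
    """Rows g_i = (eta_i - mean(eta)) * (z_i - mean(z)); assumed form, see lead-in."""
    eta = np.asarray(eta, dtype=float).ravel()
    Z = np.asarray(Z, dtype=float).reshape(eta.size, -1)
    return (eta - eta.mean())[:, None] * (Z - Z.mean(axis=0))

def robust_stat_regression_of_ones(eta, Z):
    """N minus the SSR from regressing a vector of ones on the moment matrix."""
    G = moment_matrix(eta, Z)
    ones = np.ones(G.shape[0])
    coef, *_ = np.linalg.lstsq(G, ones, rcond=None)
    ssr = np.sum((ones - G @ coef) ** 2)
    return G.shape[0] - ssr

def robust_stat_quadratic_form(eta, Z):
    """Same statistic written with a White-type middle matrix sum_i g_i g_i'."""
    G = moment_matrix(eta, Z)
    s = G.sum(axis=0)
    return s @ np.linalg.solve(G.T @ G, s)

# Illustrative squared between residuals from heavy-tailed draws.
rng = np.random.default_rng(0)
z = rng.uniform(0.0, 2.0, size=300)
eta = rng.standard_t(5, size=300) ** 2
stat_reg = robust_stat_regression_of_ones(eta, z)
stat_qf = robust_stat_quadratic_form(eta, z)
```

The two functions return the same value, so the robust test keeps the computational convenience of an artificial regression while dropping the common fourth moment assumption.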
Interestingly, following Wooldridge (1990, Example 3.2, pp. 32-34), this test can also be implemented in an artificial regression set-up, as $N \times R^{2h}_\mu$ of the regression of a vector of ones on $\eta - 1$ [...]

[...] in order to guarantee that the resulting joint test has the desired asymptotic size. This is the essence of the 'multiple comparison procedure' in Bera and Jarque (1982).
Regarding further research, this paper focuses mostly on preserving consistency and correct asymptotic size, with minimal power losses with respect to existing ML-based tests. Power improvements can be expected from using a quantile regression framework, as in Koenker and Bassett (1982), who find power gains by basing a test for heteroskedasticity on the difference in slopes in a quantile regression framework, for the cross-sectional case. The literature on quantile models for panels is still incipient, though promising (see Koenker, 2004; Canay, 2008; and Galvao, 2009), so further developments along this line of research seem worthwhile.