CHAPTER 8: MULTICOLLINEARITY

Perfect multicollinearity is the violation of Assumption 6 (no explanatory variable is a perfect linear function of any other explanatory variables).

Perfect (or Exact) Multicollinearity

If two or more independent variables have an exact linear relationship between them, then we have perfect multicollinearity. Examples: including the same information twice (weight in pounds and weight in kilograms), not using dummy variables correctly (falling into the dummy variable trap), etc.

Here is an example of perfect multicollinearity in a model with two explanatory variables:

Consequence: OLS cannot generate estimates of the regression coefficients (you get an error message). Why? OLS cannot estimate the marginal effect of $X_1$ on $Y$ while holding $X_2$ constant because $X_2$ moves exactly when $X_1$ moves!

Solution: Easy - Drop one of the variables!
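
The pounds-and-kilograms case can be checked numerically. The following is a minimal sketch, assuming NumPy and made-up data (the variable names and the simulated weights are illustrative, not from the course material): the design matrix loses a rank when both weight variables are included, and regains full column rank once one of them is dropped.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50

# Hypothetical data: the same weight recorded in two units.
weight_kg = rng.normal(70.0, 10.0, size=n)
weight_lb = 2.20462 * weight_kg  # exact linear function of weight_kg

# Design matrix with an intercept and both weight variables.
X = np.column_stack([np.ones(n), weight_kg, weight_lb])

# Three columns, but the columns are not linearly independent,
# so the numerical rank falls short of the number of columns.
rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], "rank:", rank)

# Dropping one of the redundant variables restores full column rank.
X_fixed = X[:, :2]
rank_fixed = np.linalg.matrix_rank(X_fixed)
print("columns:", X_fixed.shape[1], "rank:", rank_fixed)
```

When the rank is smaller than the number of columns, the matrix X'X has no inverse, which is why regression software typically reports an error or silently drops a variable.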

Imperfect (or Near) Multicollinearity

When we use the word multicollinearity we are usually talking about severe imperfect multicollinearity. When explanatory variables are approximately linearly related, we have imperfect multicollinearity.

The Consequences of Multicollinearity

1. Imperfect multicollinearity does not violate Assumption 6. Therefore the Gauss-Markov Theorem tells us that the OLS estimators are BLUE. So then why do we care about multicollinearity?
2. The variances and the standard errors of the regression coefficient estimates will increase. This means lower t-statistics.
3. The overall fit of the regression equation will be largely unaffected by multicollinearity. This also means that forecasting and prediction will be largely unaffected.
4. Regression coefficients will be sensitive to specifications. Regression coefficients can change substantially when variables are added or dropped.
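
Consequences 2 and 3 can be illustrated with a small simulation. This is a sketch under stated assumptions, not part of the original notes: the data-generating process, the coefficient values, and the helper names `ols_summary` and `simulate` are all hypothetical, and the only thing varied between the two runs is the correlation `rho` between the two regressors.

```python
import numpy as np

def ols_summary(X, y):
    """Return OLS coefficients, standard errors, and R^2 (intercept added)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    dof = n - Z.shape[1]
    sigma2 = resid @ resid / dof               # estimated error variance
    cov = sigma2 * np.linalg.inv(Z.T @ Z)      # classical OLS covariance matrix
    se = np.sqrt(np.diag(cov))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return beta, se, r2

def simulate(rho, n=200, seed=0):
    """Simulate y from two regressors whose population correlation is rho."""
    rng = np.random.default_rng(seed)
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)
    y = 1.0 + 2.0 * x1 + 2.0 * x2 + rng.normal(size=n)
    return ols_summary(np.column_stack([x1, x2]), y)

_, se_low, r2_low = simulate(rho=0.0)
_, se_high, r2_high = simulate(rho=0.99)

print("std. error of b1, rho=0.00:", se_low[1])
print("std. error of b1, rho=0.99:", se_high[1])
print("R^2, rho=0.00:", r2_low, " R^2, rho=0.99:", r2_high)
```

In this setup the standard error on the first slope should come out several times larger in the highly correlated run, while the overall fit stays high in both runs.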

The Detection of Multicollinearity

High Correlation Coefficients

Pairwise correlations among independent variables might be high (in absolute value). Rule of thumb: If the absolute value of a correlation exceeds 0.8, then severe multicollinearity may be present.
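
The correlation rule of thumb is easy to automate. The sketch below assumes NumPy and simulated regressors; the function name `high_correlation_pairs` and the data are hypothetical. It flags every pair of explanatory variables whose correlation exceeds 0.8 in absolute value, so strongly negative correlations are caught as well.

```python
import numpy as np

def high_correlation_pairs(X, names, threshold=0.8):
    """List (name_i, name_j, r) for every pair of columns with |r| > threshold."""
    corr = np.corrcoef(X, rowvar=False)  # columns of X are the variables
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                flagged.append((names[i], names[j], corr[i, j]))
    return flagged

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = -0.95 * x1 + 0.2 * rng.normal(size=500)  # strongly negatively related to x1
x3 = rng.normal(size=500)                     # unrelated to the others

X = np.column_stack([x1, x2, x3])
flagged = high_correlation_pairs(X, ["x1", "x2", "x3"])
for a, b, r in flagged:
    print(a, b, round(r, 3))
```

Pairwise correlations can miss multicollinearity that involves three or more variables jointly, which is one reason the notes also discuss VIFs.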

High $R^2$ with Low t-Statistic Values

It is possible for individual regression coefficients to be insignificant even though the overall fit of the equation is high.

High Variance Inflation Factors (VIFs)

A VIF measures the extent to which multicollinearity has increased the variance of an estimated coefficient. It looks at the extent to which an explanatory variable can be explained by all the other explanatory variables in the equation.

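Suppose our regression equation includes k explanatory variables:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$$

In this equation there are k VIFs:

Step 1: Run an auxiliary OLS regression for each X variable, regressing it on all the other explanatory variables. For example, for $X_1$:

$$X_1 = \alpha_1 + \alpha_2 X_2 + \alpha_3 X_3 + \dots + \alpha_k X_k + v$$

Step 2: Calculate the VIF for $\hat{\beta}_1$:

$$\mathrm{VIF}(\hat{\beta}_1) = \frac{1}{1 - R_1^2}$$

where $R_1^2$ is the $R^2$ for the auxiliary regression in Step 1.

Step 3: Analyze the degree of multicollinearity by evaluating each $\mathrm{VIF}(\hat{\beta}_i)$. Rule of thumb: If $\mathrm{VIF}(\hat{\beta}_i) > 5$ then severe multicollinearity may be present.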
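
The three steps above can be sketched in code. This is a minimal illustration assuming NumPy and simulated data; the function name `vif` and the example variables are hypothetical. Each column is regressed on all the others (with an intercept), and the R-squared from that auxiliary regression is converted into a VIF.

```python
import numpy as np

def vif(X):
    """Return the VIF of each column of X (shape n x k) via auxiliary regressions."""
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Step 1: regress X_j on an intercept and all other explanatory variables.
        Z = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(Z, target, rcond=None)
        resid = target - Z @ coef
        r2 = 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)
        # Step 2: VIF = 1 / (1 - R^2) of the auxiliary regression.
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

rng = np.random.default_rng(2)
x1 = rng.normal(size=300)
x2 = x1 + 0.3 * rng.normal(size=300)  # closely tracks x1
x3 = rng.normal(size=300)             # unrelated to x1 and x2

vifs = vif(np.column_stack([x1, x2, x3]))

# Step 3: compare each VIF with the rule-of-thumb cutoff of 5.
for name, value in zip(["x1", "x2", "x3"], vifs):
    print(name, round(value, 2), "severe?" if value > 5 else "ok")
```

A VIF can never fall below 1, because the auxiliary R-squared lies between 0 and 1; a value near 1 means the variable is barely explained by the others.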

Remedies for Multicollinearity

No single solution exists that will eliminate multicollinearity. Certain approaches may be useful:

1. Do Nothing. Live with what you have.
2. Drop a Redundant Variable. If a variable is redundant, it should never have been included in the model in the first place, so dropping it is really just correcting a specification error. Use economic theory to guide your choice of which variable to drop.
3. Transform the Multicollinear Variables. Sometimes you can reduce multicollinearity by re-specifying the model, for instance by creating a combination of the multicollinear variables. As an example, rather than including the variables GDP and population in the model, include GDP/population (GDP per capita) instead.
4. Increase the Sample Size. Increasing the sample size improves the precision of an estimator and reduces the adverse effects of multicollinearity. Usually, though, adding data is not feasible.
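
Why a larger sample helps can be seen from the standard textbook expression for the variance of an OLS slope, Var(b_j) = sigma^2 / (SST_j (1 - R_j^2)), where SST_j is the total variation in X_j and R_j^2 is the auxiliary R-squared used for the VIF. The sketch below is an illustration under simplifying assumptions: it approximates SST_j by n times the variance of X_j and holds everything else fixed. The function name and the numbers are hypothetical.

```python
import math

def coef_std_error(sigma, n, var_x, r2_aux):
    """Approximate standard error of one OLS slope coefficient.

    Uses Var(b_j) = sigma^2 / (SST_j * (1 - R_j^2)), approximating
    SST_j by n * var_x. All inputs are assumed known for illustration.
    """
    sst = n * var_x
    return sigma / math.sqrt(sst * (1.0 - r2_aux))

# Same degree of multicollinearity (auxiliary R^2 = 0.9), two sample sizes.
se_small = coef_std_error(sigma=1.0, n=100, var_x=1.0, r2_aux=0.9)
se_large = coef_std_error(sigma=1.0, n=400, var_x=1.0, r2_aux=0.9)

print("n = 100:", se_small)
print("n = 400:", se_large)
print("ratio:", se_small / se_large)
```

Under these assumptions, quadrupling the sample size halves the standard error even though the correlation among the regressors is unchanged.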

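Example

How would perfect multicollinearity arise in our previous election example? We’ve already seen one case:

$$\text{vote share}_i = \beta_0 + \beta_1 \text{spending}_i + \beta_2 \text{incumbency}_i + \beta_3 \text{male}_i + \beta_4 \text{Liberal}_i + \beta_5 \text{Conservative}_i + \beta_6 \text{BQ}_i + \beta_7 \text{NDP}_i + \varepsilon_i$$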

What else? Campaign spending is really the sum of five separate expenditures: advertising, election surveys, office expenses, salaries, and other.

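What if a researcher were interested in the individual effect of each of these expenditures?

$$\text{vote share}_i = \beta_0 + \beta_1 \text{spending}_i + \beta_2 \text{advertising}_i + \beta_3 \text{surveys}_i + \beta_4 \text{office}_i + \beta_5 \text{salaries}_i + \beta_6 \text{other}_i + \beta_7 \text{incumbency}_i + \beta_8 \text{male}_i + \beta_9 \text{Liberal}_i + \beta_{10} \text{Conservative}_i + \beta_{11} \text{BQ}_i + \beta_{12} \text{NDP}_i + \varepsilon_i$$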

Even if you correct the model, there may still be imperfect multicollinearity among the components of campaign expenditures.
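
To see why the model with every expenditure variable needs correcting in the first place, the exact relationship can be written out. This assumes, as stated earlier, that total campaign spending is precisely the sum of the five components; the variable names follow the regression equation.

```latex
\text{spending}_i = \text{advertising}_i + \text{surveys}_i + \text{office}_i + \text{salaries}_i + \text{other}_i
```

Because $\text{spending}_i$ is an exact linear function of the five component variables, including all six in the same regression violates Assumption 6. Dropping $\text{spending}_i$ removes the exact relationship, leaving only whatever imperfect multicollinearity exists among the components.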
