Introductory Econometrics: A Modern Approach, 4e
Jeffrey M. Wooldridge
This document contains a listing of all data sets that are provided with the fourth edition of Introductory Econometrics: A Modern Approach. For each data set, I list its source (wherever possible), where it is used or mentioned in the text (if it is), and, in some cases, notes on how an instructor might use the data set to generate new homework exercises, exam problems, or term projects. In some cases, I suggest ways to improve the data sets.
Special thanks to Edmund Wooldridge, who provided valuable assistance in updating the page numbers for the fourth edition.
Source: L.E. Papke (1995), “Participation in and Contributions to 401(k) Pension Plans: Evidence from Plan Data,” Journal of Human Resources 30, 311-325.
Professor Papke kindly provided these data. She gathered them from the Internal Revenue Service’s Form 5500 tapes.
Used in Text: pages 64, 80, 135-136, 173, 217, 685-686
Notes: This data set is used in a variety of ways in the text. One additional possibility is to investigate whether the coefficients from the regression of prate on mrate and log(totemp) differ by whether the plan is a sole plan. The Chow test (see Section 7.4), or the less restrictive version that allows different intercepts, can be used.
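As a rough sketch of the Chow statistic, the F statistic can be computed from three sums of squared residuals: one from the pooled regression and one from each subsample regression (sole plans and all other plans). The function name chow_f and the numbers passed to it are hypothetical and are not computed from the data set; with prate regressed on mrate and log(totemp), k = 2.

```python
def chow_f(ssr_restricted, ssr_1, ssr_2, n, k, test_intercept=True):
    """Chow-type F statistic for two groups.

    ssr_restricted: SSR from the pooled (restricted) regression
    ssr_1, ssr_2:   SSRs from the separate regressions on each group
    n:              total number of observations
    k:              number of slope coefficients in each regression
    test_intercept: if False, the restricted regression is assumed to
                    include a group dummy, so only the k slopes are tested
    """
    ssr_unrestricted = ssr_1 + ssr_2
    q = k + 1 if test_intercept else k      # number of restrictions
    df = n - 2 * (k + 1)                    # unrestricted degrees of freedom
    if df <= 0:
        raise ValueError("not enough observations for two separate regressions")
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / df)


# Hypothetical SSRs, for illustration only.
f_stat = chow_f(ssr_restricted=1000.0, ssr_1=450.0, ssr_2=500.0, n=206, k=2)
print(round(f_stat, 3))
```

For the version that allows different intercepts, ssr_restricted should come from the pooled regression that includes the sole-plan dummy, and test_intercept should be set to False.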
Source: A. Abadie (2003), “Semiparametric Instrumental Variable Estimation of Treatment Response Models,” Journal of Econometrics 113, 231-263.
Professor Abadie kindly provided these data. He obtained them from the 1991 Survey of Income and Program Participation (SIPP).
Used in Text: pages 165, 182, 222, 261, 279-280, 288, 298-299, 336, 542
Notes: This data set can also be used to illustrate the binary response models, probit and logit, in Chapter 17, where, say, pira (an indicator for having an individual retirement account) is the dependent variable, and e401k [the 401(k) eligibility indicator] is the key explanatory variable.
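A minimal sketch of the simplest such exercise, assuming e401k is the only explanatory variable: with a single binary regressor and an intercept, the logit model is saturated, so the maximum likelihood estimates follow directly from the two cell proportions and no iterative routine is needed. The function name and the counts below are hypothetical, not taken from the data set. A specification with additional controls would need a standard logit or probit routine instead.

```python
import math


def logit_binary_regressor(y1_x1, y0_x1, y1_x0, y0_x0):
    """Logit MLE for P(y = 1 | x) with an intercept and one binary x.

    Arguments are cell counts, e.g. y1_x1 is the number of observations
    with y = 1 and x = 1. Returns (b0, b1, p1 - p0).
    """
    p1 = y1_x1 / (y1_x1 + y0_x1)   # sample share with y = 1 when x = 1
    p0 = y1_x0 / (y1_x0 + y0_x0)   # sample share with y = 1 when x = 0
    b0 = math.log(p0 / (1 - p0))           # intercept: log-odds when x = 0
    b1 = math.log(p1 / (1 - p1)) - b0      # slope: log odds ratio
    return b0, b1, p1 - p0


# Hypothetical counts: y plays the role of pira, x the role of e401k.
b0, b1, effect = logit_binary_regressor(y1_x1=300, y0_x1=700, y1_x0=100, y0_x0=900)
print(round(b0, 4), round(b1, 4), round(effect, 4))
```

The third value returned, p1 - p0, is the estimated effect of eligibility on the probability of having an IRA; in this saturated case it coincides with the slope from the linear probability model.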
Source: Data from the National Highway Traffic Safety Administration: “A Digest of State Alcohol-Highway Safety Related Legislation,” U.S. Department of Transportation, NHTSA. I used the third (1985), eighth (1990), and 13th (1995) editions.
Used in Text: not used
Notes: This is not so much a data set as a summary of so-called “administrative per se” laws at the state level, for three different years. It could be supplemented with drunk-driving fatalities for a nice econometric analysis. In addition, the data for 2000 or later years can be added, forming the basis for a term project. Many other explanatory variables could be included. Unemployment rates, state-level tax rates on alcohol, and membership in MADD are just a few possibilities.
Source: R.C. Fair (1978), “A Theory of Extramarital Affairs,” Journal of Political Economy 86, 45-61.
I collected the data from Professor Fair’s website at the Yale University Department of Economics. He originally obtained the data from a survey by Psychology Today.
Used in Text: not used
Notes: This is an interesting data set for problem sets, starting in Chapter 7. Even though naffairs (number of extramarital affairs a woman reports) is a count variable, a linear model can be used as a decent approximation. Or, you could ask the students to estimate a linear probability model for the binary indicator affair, equal to one if the woman reports having any extramarital affairs. One possibility is to test whether including the single marriage rating variable, ratemarr, is enough, against the alternative that a full set of dummy variables is needed; see pages 237-238 for a similar example. This is also a good data set to illustrate Poisson regression (using naffairs) in Section 17.3 or probit and logit (using affair) in Section 17.1.
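The variable construction behind these exercises can be sketched as follows. I assume here that ratemarr takes the integer values 1 through 5 and that the data are held as a list of dictionaries; the rows shown, the helper name add_indicators, and the dummy names ratemarr2 through ratemarr5 are hypothetical.

```python
def add_indicators(rows, levels=(1, 2, 3, 4, 5), base=1):
    """Add the binary indicator affair and dummies for each ratemarr level.

    The base level is omitted so the dummies can be used alongside an intercept.
    """
    out = []
    for row in rows:
        new = dict(row)
        # affair = 1 if any extramarital affairs are reported
        new["affair"] = 1 if row["naffairs"] > 0 else 0
        for level in levels:
            if level != base:
                new[f"ratemarr{level}"] = 1 if row["ratemarr"] == level else 0
        out.append(new)
    return out


# Hypothetical rows, for illustration only.
rows = [
    {"naffairs": 0, "ratemarr": 5},
    {"naffairs": 3, "ratemarr": 2},
    {"naffairs": 1, "ratemarr": 4},
]
data = add_indicators(rows)

# The model that is linear in ratemarr is nested in the dummy-variable model:
# four dummy coefficients are restricted to a single slope.
n_restrictions = (5 - 1) - 1
print(data[1], n_restrictions)
```

Under the five-level assumption, the F test of the linear specification against the full set of dummies therefore has three numerator degrees of freedom.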
Source: Jiyoung Kwon, a doctoral candidate in economics at MSU, kindly provided...