Topic 1: Panel data models

Dr Pham Thi Bich Ngoc
Hoa Sen University
ngoc.phamthibich@hoasen.edu.vn

Learn and use Stata? http://www.ats.ucla.edu/stat/stata/
"Introductory Econometrics: A Modern Approach", Jeffrey M. Wooldridge (2012)
"Econometric Analysis of Cross Section and Panel Data", Jeffrey M. Wooldridge (2010)

The unit (subject) factor and the time factor

These are models that combine cross-section and time-series data. In panel data the same cross-sectional unit (industry, firm, country) is surveyed over time, so the data are pooled over space as well as time.

i: ID (firm, individual, household, country, industry)
t: time (day, week, quarter, year)

ID      YEAR  EDU
KHOA    2010  12
KHOA    2011  12
KHOA    2012  12
PHUONG  2010  12
PHUONG  2011  13

The table also has the columns WAGE, EXP and MARRIED. Only two WAGE values (8 and 5) and two MARRIED values (0 and 0) survive on the slide, and EXP is blank (Excel file BT1).

If all the cross-sectional units have the same number of time-series observations, the panel is balanced; if not, it is unbalanced.

A matrix of balanced panel data observations on variable y, with N cross-sectional observations (columns) and T time-series observations (rows):

$$
\begin{bmatrix}
y_{11} & y_{21} & \cdots & y_{i1} & \cdots & y_{N1} \\
y_{12} & y_{22} & \cdots & y_{i2} & \cdots & y_{N2} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1t} & y_{2t} & \cdots & y_{it} & \cdots & y_{Nt} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
y_{1T} & y_{2T} & \cdots & y_{iT} & \cdots & y_{NT}
\end{bmatrix}
$$

Panel data can take explicit account of individual-specific heterogeneity ("individual" here means related to the micro-unit).
By combining data in two dimensions, panel data gives more data variation, less collinearity and more degrees of freedom.
Panel data is better suited than cross-sectional data for studying the dynamics of change. For example, it is well suited to understanding transition behaviour such as company bankruptcy or merger, the effects of technological change, or economic cycles.

Grunfeld and Griliches [1960]

$$ I_{it} = \alpha_i + \beta F_{it} + \gamma C_{it} + \varepsilon_{it} $$

◦ i = 10 firms: GM, CH, GE, WE, US, AF, DM, GY, UN, IBM; t = 20 years: 1935-1954
◦ I_it = Gross investment
◦ F_it = Market value
◦ C_it = Value of the stock of plant and equipment

$$ y_{it} = \gamma_t + \rho\, y_{i,t-1} + \beta_1 \ln(s_i) + \beta_2 \ln(n_i + g + d) + \beta_3 COM_i + \beta_4 OPEC_i + \varepsilon_{it} $$

◦ y_it = Real per capita GDP
◦ s_i = Average saving rate (over 1960-1985)
◦ n_i = Average population growth rate (over 1960-1985)
◦ g + d = 5%
◦ COM_i = 1 if communist, 0 otherwise
◦ OPEC_i = 1 if OPEC, 0 otherwise

◦ LWAGE = log of wage = dependent variable in regressions
◦ EXP = work experience
◦ WKS = weeks worked
◦ OCC = occupation, 1 if blue collar
◦ IND = 1 if manufacturing industry
◦ SOUTH = 1 if resides in south
◦ SMSA = 1 if resides in a city (SMSA)
◦ MS = 1 if married
◦ FEM = 1 if female
◦ UNION = 1 if wage set by union contract
◦ ED = years of education
◦ BLK = 1 if individual is black

Pooled OLS
Difference in Differences, First Differences (FD), Between Effects, Fixed Effects (FE), Random Effects (RE), and the Hausman test
Two-Stage Least Squares (2SLS)
Generalized Method of Moments (GMM)

David Roodman, 2009. "How to do xtabond2: An introduction to difference and system GMM in Stata," Stata Journal, StataCorp LP, vol. 9(1), pages 86-136, March.
David Roodman, 2006. "How to Do xtabond2: An Introduction to "Difference" and "System" GMM in Stata," Working Papers 103, Center for Global Development.

A. Pooled OLS (Pooled Cross Section)

Previously we assumed that the unobserved effect was correlated with the x's, but what if it is not? OLS would be consistent in that case, but the composite error will be serially correlated.

We need to transform the model and do GLS to solve the problem and make correct inferences. We end up with a sort of weighted average of OLS and Fixed Effects, estimated on quasi-demeaned data:

$$ y_{it} - \theta \bar{y}_i = \beta_0 (1 - \theta) + \beta_1 (x_{it1} - \theta \bar{x}_{i1}) + \cdots + \beta_k (x_{itk} - \theta \bar{x}_{ik}) + (\nu_{it} - \theta \bar{\nu}_i) $$

$$ \theta = 1 - \left[ \frac{\sigma_u^2}{\sigma_u^2 + T \sigma_a^2} \right]^{1/2} $$

Here \nu_{it} is the composite error, \sigma_a^2 is the variance of the unobserved effect and \sigma_u^2 is the variance of the idiosyncratic error.

If θ = 1, then this is just the fixed effects estimator.
If θ = 0, then this is just the OLS estimator.
So, the bigger the variance of the unobserved effect, the closer the estimator is to FE.
The smaller the variance of the unobserved effect, the closer it is to OLS.

Random Effects Estimation: RE versus FE?
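The role of the quasi-demeaning weight θ can be checked numerically. The following is a minimal sketch, assuming the weight takes the form θ = 1 - [σ_u² / (σ_u² + T σ_a²)]^(1/2), with σ_a² the variance of the unobserved effect and σ_u² the variance of the idiosyncratic error. The function names and the variance values are hypothetical and chosen only for illustration.

```python
import math


def quasi_demeaning_theta(sigma2_u, sigma2_a, T):
    """Return theta = 1 - sqrt(sigma2_u / (sigma2_u + T * sigma2_a)).

    sigma2_u: variance of the idiosyncratic error (assumed known here)
    sigma2_a: variance of the unobserved effect (assumed known here)
    T:        number of time periods per unit
    """
    return 1.0 - math.sqrt(sigma2_u / (sigma2_u + T * sigma2_a))


def quasi_demean(values, theta):
    """Subtract theta times the unit's time average from each observation."""
    mean = sum(values) / len(values)
    return [v - theta * mean for v in values]


# No variance in the unobserved effect: theta = 0, the data are left as they are (pooled OLS).
theta_ols = quasi_demeaning_theta(sigma2_u=1.0, sigma2_a=0.0, T=8)

# Very large variance in the unobserved effect: theta is close to 1 (fixed effects).
theta_fe = quasi_demeaning_theta(sigma2_u=1.0, sigma2_a=1e9, T=8)

# Intermediate case: theta lies strictly between 0 and 1.
theta_mid = quasi_demeaning_theta(sigma2_u=1.0, sigma2_a=1.0, T=8)

print(theta_ols, theta_mid, theta_fe)
print(quasi_demean([1.0, 2.0, 3.0], theta=1.0))  # full demeaning, as in FE
```

In practice the two variances are unknown and have to be estimated first, so the values passed in above stand in for those estimates.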
FE assumes that each group (firm) has a non-stochastic group-specific component to y. RE treats these unobservable effects as stochastic (i.e. random):

$$ y_{it} = a_0 + a_1 x_{it} + u_i + e_{it} $$

◦ u_i is the random error term that varies between groups but not within groups
◦ e_it is the element of the error that varies over group and time

We assume that:
◦ E(u_i) = E(e_it) = 0
◦ E(u_i^2) = \sigma_u^2 and E(e_it^2) = \sigma_e^2 (both components homoscedastic)
◦ E(e_it u_j) = 0 for all i, t, j (independence of the two components)
◦ E(e_it e_js) = 0 if t ≠ s or i ≠ j (no autocorrelation)
◦ E(u_i u_j) = 0 if i ≠ j (no across-group correlation)
◦ E(u_i x_it) = E(e_it x_it) = 0 (both independent of the regressor)

Stata: xtreg depvar [indepvars] [if] [in] [weight], re

Choosing between Fixed Effects (FE) and Random Effects (RE)

With large T and small N there is likely to be little difference, so FE is preferable as it is easier to compute.
With large N and small T, the estimates can differ significantly.
If the cross-sectional groups are a random sample of the population, RE is preferable. If not, FE is preferable.
If the error component u_i is correlated with x, then RE is biased, but FE is not.
For large N and small T, and if the assumptions behind RE hold, RE is more efficient than FE.

Test for Var(u_i) = 0, that is

$$ Cov(\varepsilon_{it}, \varepsilon_{is}) = Cov(u_i + e_{it}, u_i + e_{is}) = Cov(e_{it}, e_{is}) $$

◦ If T_i = T for all i, the Lagrange-multiplier test statistic (Breusch-Pagan, 1980) is:

$$ LM = \frac{NT}{2(T-1)} \left[ \frac{\hat{e}'(I_N \otimes J_T)\hat{e}}{\hat{e}'\hat{e}} - 1 \right]^2 = \frac{NT}{2(T-1)} \left[ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T} \hat{e}_{it} \right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T} \hat{e}_{it}^2} - 1 \right]^2 \sim \chi^2(1) $$

where \hat{e}_{it} is the pooled OLS residual (from the regression of y_it on x_it and a constant) and J_T = i_T i_T' is the T x T matrix of ones.

◦ For unbalanced panels, the modified Breusch-Pagan LM test for random effects (Baltagi-Li, 1990) is:

$$ LM = \frac{\left( \sum_{i=1}^{N} T_i \right)^2}{2 \sum_{i=1}^{N} T_i (T_i - 1)} \left[ \frac{\sum_{i=1}^{N} \left( \sum_{t=1}^{T_i} \hat{e}_{it} \right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T_i} \hat{e}_{it}^2} - 1 \right]^2 \sim \chi^2(1) $$

◦ Alternative one-sided test: \sqrt{LM} \sim N(0,1) under H_0, with p-value Pr(z > \sqrt{LM})

Stata, LM test: run xttest0 after xtreg ..., re

The fixed effects estimator is consistent under both H0 and H1. The random effects estimator is efficient under H0, but it is inconsistent under H1.

Hausman Test Statistic

$$ H = (\hat{\beta}_{RE} - \hat{\beta}_{FE})' \left[ Var(\hat{\beta}_{FE}) - Var(\hat{\beta}_{RE}) \right]^{-1} (\hat{\beta}_{RE} - \hat{\beta}_{FE}) \sim \chi^2(\#\hat{\beta}_{FE}), \quad \text{provided } \#\hat{\beta}_{FE} = \#\hat{\beta}_{RE} \text{ (no intercept)} $$

Hausman Test
◦ Estimate any of the following random effects models, where θ is the quasi-demeaning weight:

$$ (y_{it} - \theta \bar{y}_i) = (x_{it}' - \theta \bar{x}_i')\beta + (x_{it}' - \bar{x}_i')\gamma + e_{it} $$

(or, as a random effects model: y_{it} = x_{it}'\beta + (x_{it}' - \bar{x}_i')\gamma + e_{it})

$$ (y_{it} - \theta \bar{y}_i) = (x_{it}' - \theta \bar{x}_i')\beta + \bar{x}_i'\gamma + e_{it} $$

$$ (y_{it} - \theta \bar{y}_i) = (x_{it}' - \theta \bar{x}_i')\beta + x_{it}'\gamma + e_{it} $$

◦ F test that γ = 0: H_0: γ = 0 is equivalent to H_0: Cov(u_i, x_it) = 0

Hausman test: tests for the statistical significance of the difference between the coefficient estimates obtained by FE and by RE, under the null hypothesis that the RE estimates are efficient and consistent, and the FE estimates are consistent but inefficient.

Stata: hausman FE RE (where FE and RE are the names under which the two sets of estimates were stored)

The data in WAGEPAN.RAW are from Vella and Verbeek (1998). Each of the 545 men in the sample worked in every year from 1980 through 1987. Some variables in the data set change over time: experience, marital status, and union status are the three important ones. Other variables do not change: race and education are the key examples. If we use fixed effects (or first differencing), we cannot include race, education, or experience in the equation.

We use three methods: pooled OLS, random effects, and fixed effects. In the first two methods, we can include educ and race dummies (black and hispan), but these drop out of the fixed effects analysis. The time-varying variables are exper, exper2, union, and married. "exper" is dropped in the FE analysis (but exper2 remains), because experience rises by exactly one year for every man in every year and so cannot be separated from the year dummies. Each regression also contains a full set of year dummies.

[...]

We often loosely use the term panel data to refer to any data set that has both a cross-sectional dimension and a time-series dimension. More precisely, it is only data following the same cross-section units over time. Otherwise it is a pooled cross-section (also called POLS), which treats all the ...
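The distinction between a true panel and a pooled cross-section, and between a balanced and an unbalanced panel, can be checked mechanically from the ID and time columns. The sketch below is one possible way to do it, assuming that the ID values really identify the same unit across periods. The function name `panel_structure` is hypothetical, and the sample rows are the ID/YEAR pairs from the small table in the slides.

```python
from collections import defaultdict


def panel_structure(records):
    """Classify a list of (unit_id, period) pairs.

    Returns "pooled cross-section" if no unit is observed in more than one
    period, "balanced panel" if every unit has the same number of periods,
    and "unbalanced panel" otherwise.
    """
    if not records:
        raise ValueError("records must not be empty")

    # Collect the distinct periods in which each unit is observed.
    periods_by_unit = defaultdict(set)
    for unit_id, period in records:
        periods_by_unit[unit_id].add(period)

    counts = [len(periods) for periods in periods_by_unit.values()]
    if max(counts) == 1:
        return "pooled cross-section"
    if min(counts) == max(counts):
        return "balanced panel"
    return "unbalanced panel"


# ID/YEAR pairs taken from the example table in the slides.
slide_rows = [
    ("KHOA", 2010), ("KHOA", 2011), ("KHOA", 2012),
    ("PHUONG", 2010), ("PHUONG", 2011),
]
structure = panel_structure(slide_rows)
print(structure)
```

The check only counts periods per unit, so it follows the definition of balance used in the slides (the same number of time-series observations for every unit) and nothing stricter.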
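The balanced-panel Breusch-Pagan LM statistic from the random effects section can be computed directly from pooled OLS residuals. The sketch below assumes the form LM = NT / (2(T-1)) * [ Σ_i (Σ_t ê_it)² / Σ_i Σ_t ê_it² - 1 ]² and takes the residuals as given. The function name and the toy residuals are hypothetical, and this is not a replacement for xttest0.

```python
def breusch_pagan_lm(residuals_by_unit):
    """Breusch-Pagan LM statistic for random effects in a balanced panel.

    residuals_by_unit: list of N lists, each holding the T pooled OLS
    residuals of one cross-sectional unit.
    """
    N = len(residuals_by_unit)
    T = len(residuals_by_unit[0])
    if T < 2 or any(len(unit) != T for unit in residuals_by_unit):
        raise ValueError("need a balanced panel with at least two periods")

    # Numerator: squared sum of residuals within each unit, summed over units.
    sum_of_squared_sums = sum(sum(unit) ** 2 for unit in residuals_by_unit)
    # Denominator: overall residual sum of squares.
    sum_of_squares = sum(e * e for unit in residuals_by_unit for e in unit)

    ratio = sum_of_squared_sums / sum_of_squares
    return N * T / (2 * (T - 1)) * (ratio - 1.0) ** 2


# Toy residuals: each unit's residuals share a sign, which is the pattern
# an ignored unit-specific effect would produce.
toy_residuals = [[1.0, 1.0], [-1.0, -1.0]]
lm = breusch_pagan_lm(toy_residuals)
print(lm)
```

Under the null hypothesis Var(u_i) = 0 the statistic is compared with a chi-squared distribution with one degree of freedom.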
