Examples
In Stata, after installation, load data
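The loading command itself is not shown in this section. The variable names and labels used below (wage "Hourly wage", grade "Current grade completed", race, ttl_exp, age) match Stata's bundled nlsw88 example data, so a plausible setup, offered here only as an assumption, is:

```stata
* Assumption: the example uses Stata's bundled NLSW 1988 extract.
* Substitute your own dataset if the original example used a different one.
sysuse nlsw88, clear
```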
Run tregs
. tregs wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear
Warning: You specified variables in xvar() that are not included as covariates. All variables in xvar() have been added as covariates in all regressions.
Computing the power with the highest RESET specification test p-value...
Power with the highest p-value in the RESET specification test: 1/9
Processing specification: y^1
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/2)
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/3)
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/4)
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/5)
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: log(y)
- Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/9)
- Computing semi-elasticity and elasticity with respect to grade
Regression Results, Dep. Var: Hourly wage
---------------------------------------------------------------------------------------------------------------------
                                 (1)          (2)          (3)          (4)          (5)          (6)          (7)
                                 y^1      y^(1/2)      y^(1/3)      y^(1/4)      y^(1/5)       log(y)      y^(1/9)
---------------------------------------------------------------------------------------------------------------------
White                              0            0            0            0            0            0            0
                                 (.)          (.)          (.)          (.)          (.)          (.)          (.)
Black                         -0.917       -0.137      -0.0624      -0.0389      -0.0279      -0.0908      -0.0128
                             (1.122)      (0.162)     (0.0738)     (0.0460)     (0.0330)      (0.108)     (0.0152)
Other                         -3.428***    -0.833***    -0.430***    -0.284***    -0.210***    -0.768***    -0.102***
                             (0.868)      (0.130)     (0.0591)     (0.0368)     (0.0264)     (0.0854)     (0.0121)
Current grade completed        0.785***     0.125***    0.0580***    0.0364***    0.0262***    0.0860***    0.0120***
                             (0.217)     (0.0325)     (0.0148)    (0.00920)    (0.00659)     (0.0213)    (0.00301)
---------------------------------------------------------------------------------------------------------------------
eydx: Current grade com~d     0.0921       0.0897       0.0885       0.0879       0.0875       0.0860       0.0869
eyex: Current grade com~d      1.105        1.076        1.062        1.055        1.050        1.032        1.042
Predicted y: At means          9.743        8.948        8.772        8.687        8.636        8.446        8.549
RESET Test p                0.000412      0.00647       0.0326       0.0933        0.152        0.130        0.222
Observations                     205          205          205          205          205          205          205
---------------------------------------------------------------------------------------------------------------------
Standard errors in parentheses
* p<0.10, ** p<0.05, *** p<0.01
Interpretation
The dependent variable here is wage, or "Hourly wage".
The independent variable of interest in this case is grade (specified in xvar(grade)), or "Current grade completed".
In all specifications, grade is significantly associated with wage.
While the coefficient estimates vary across specifications because the unit of the transformed dependent variable changes, the semi-elasticities and elasticities are quite stable. These elasticities are the quantities of interest.
On average, a one-unit increase in grade at grade 12 will change wage by 9.21% in the linear specification (y^1), by 8.75% in the fifth-root specification (y^(1/5)), and by 8.60% in the log specification (log(y)).
These numbers can be read off the row for semi-elasticities (eydx).
The story is very similar if we instead look at elasticities (eyex), the percent change in wage with respect to a percent change in grade.
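As a quick consistency check on the table, and assuming eydx and eyex follow the usual definitions \(\partial \ln y / \partial x\) and \(\partial \ln y / \partial \ln x\) (an assumption; the definitions are not spelled out in this section), the two rows should differ only by the value \(x = 12\) at which grade is evaluated:

\[
\text{eyex} \;=\; \frac{\partial \ln y}{\partial \ln x} \;=\; x \cdot \frac{\partial \ln y}{\partial x} \;=\; x \cdot \text{eydx},
\qquad
12 \times 0.0921 \approx 1.105 \;\; (y^1), \quad 12 \times 0.0860 \approx 1.032 \;\; (\log(y)).
\]

Both products agree with the reported eyex row, so the elasticity can be read as roughly a 1.03% to 1.11% change in wage per 1% change in grade, evaluated at grade 12.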
Here, wage data are all positive, so all specifications have the same number of observations and are therefore directly comparable (we can compute log(wage) for all observations).
The predicted wage at the means of the covariates is also very similar across specifications. Note that this is the predicted value of the untransformed dependent variable, wage. With a non-linear transformation (e.g., \(\log(y)\)), one cannot simply invert the predicted value of the transformed dependent variable (e.g., by computing \(\exp(X\hat{\beta})\)) to obtain the implied predicted value of the untransformed dependent variable (\(y\)). The estimates here are obtained using Duan's (1983) smearing estimate.
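For reference, the smearing idea can be sketched in generic notation. The symbols \(h\), \(\hat{\varepsilon}_i\), and \(n\) are introduced here for illustration, and this section does not state exactly how tregs implements the estimator. If the fitted model is \(h(y_i) = X_i\hat{\beta} + \hat{\varepsilon}_i\) for a transformation \(h\), the smearing estimate of the mean of \(y\) at covariate values \(x\) averages the back-transformed prediction over the \(n\) estimated residuals:

\[
\hat{E}[\,y \mid x\,] \;=\; \frac{1}{n}\sum_{i=1}^{n} h^{-1}\!\left(x\hat{\beta} + \hat{\varepsilon}_i\right).
\]

For \(h = \log\) this reduces to \(\exp(x\hat{\beta}) \cdot \frac{1}{n}\sum_{i} \exp(\hat{\varepsilon}_i)\), and for a root transformation \(h(y) = y^{1/k}\) it becomes \(\frac{1}{n}\sum_{i} (x\hat{\beta} + \hat{\varepsilon}_i)^{k}\). In both cases the correction term is what naive inversion leaves out.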
Ramsey's RESET test rejects the null hypothesis of a linear relationship between the transformed dependent variable and the covariates at the 5% significance level for the linear, square root, and cube root specifications. This finding will be unsurprising to those who work with wage data and tend to use logarithmic transformations to achieve a better fit for the data. In this case, according to the RESET, the transformation y^(1/9) yields a relationship that is "most" linear.
Retrieve returned values
Returned values can be accessed with ereturn list and return list.
First, ereturn list will return the following:
. ereturn list
macros:
e(mtitles) : "y^1" "y^(1/2)" "y^(1/3)" "y^(1/4)" "y^(1/5)" "log(y)" "y^(1/9)"
e(mostlinear) : "mostlinear"
e(mostlinear_power) : "1/9"
e(absorb_option) : "absorb(ttl_exp#age)"
e(regopts) : "vce(cluster age) resid"
e(reg_command) : "reghdfe"
e(y) : "wage"
e(cnames) : "1b.race 2.race 3.race grade"
e(speclist) : "1 1/2 1/3 1/4 1/5 log 1/9"
e(cmdline) : "treg wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear"
e(cmd) : "treg"
matrices:
e(elasticities) : 3 x 7
The matrix e(elasticities) contains the elasticities and predicted values, with named rows and columns:
. matrix list e(elasticities)
e(elasticities)[3,7]
                               y^1    y^(1/2)    y^(1/3)    y^(1/4)    y^(1/5)     log(y)    y^(1/9)
Semi-elasticity:grade    .09210216  .08966786  .08852852  .08791762  .08754021  .08596683  .08685207
Elasticity:grade         1.1052259  1.0760144  1.0623422  1.0550114  1.0504825   1.031602  1.0422249
Predicted y:At means     9.7430196  8.9478998  8.7721528  8.6866237  8.6363291    8.44572  8.5492522
Next, return list will return the following:
. return list
scalars:
r(nmodels) = 7
r(ccols) = 3
macros:
r(names) : "reg_1 reg_1_2 reg_1_3 reg_1_4 reg_1_5 reg_log reg_1_9"
r(m7_depname) : "__00001E"
r(m6_depname) : "__000016"
r(m5_depname) : "__00000Y"
r(m4_depname) : "__00000Q"
r(m3_depname) : "__00000I"
r(m2_depname) : "__00000A"
r(m1_depname) : "__000002"
r(cmdline) : "estout reg_*, cells(b(fmt(a3) star) se(fmt(a3) par("{ralign @modelwidth:{txt:(}" "{txt:)}}"))) drop(_cons, relax) stats(semi_grade elas.."
matrices:
r(coefs) : 4 x 21
r(stats) : 5 x 7
To retrieve coefficients and standard errors for all specifications, r(coefs) can be used:
. matrix list r(coefs)
r(coefs)[4,21]
             reg_1:     reg_1:     reg_1:   reg_1_2:   reg_1_2:   reg_1_2:   reg_1_3:   reg_1_3:   reg_1_3:   reg_1_4:   reg_1_4:   reg_1_4:   reg_1_5:
                  b         se          p          b         se          p          b         se          p          b         se          p          b
1.race            0          .          .          0          .          .          0          .          .          0          .          .          0
2.race   -.91713138  1.1222037   .4311257 -.13674825  .16217866  .41707779 -.06242501   .0738332  .41585135 -.03891155  .04603871  .41600909 -.02789641
3.race   -3.4282385  .86796147  .00227364 -.83319327  .12984377  .00004963 -.43045271  .05913321  .00001584 -.28378344  .03679788  9.241e-06 -.21004786
grade     .78545163  .21699037   .0040282  .12530074  .03246094   .0026531  .05801803   .0147833  .00237414  .03637153  .00919947   .0022585  .02615427

           reg_1_5:   reg_1_5:   reg_log:   reg_log:   reg_log:   reg_1_9:   reg_1_9:   reg_1_9:
                 se          p          b         se          p          b         se          p
1.race            .          .          0          .          .          0          .          .
2.race    .03303105  .41635185 -.09082638   .1083043  .41953216   -.012784  .01517268  .41741417
3.race    .02635142  6.761e-06 -.76798984  .08537324  2.106e-06 -.10157507  .01205005  3.960e-06
grade     .00658786  .00219582  .08596683  .02134331  .00198941  .01204197  .00301251  .00209578
To retrieve semi-elasticity estimates, elasticity estimates, and all other statistics returned by tregs, use r(stats):
. matrix list r(stats)
r(stats)[5,7]
                  reg_1    reg_1_2    reg_1_3    reg_1_4    reg_1_5    reg_log    reg_1_9
semi_grade    .09210216  .08966786  .08852852  .08791762  .08754021  .08596683  .08685207
elas_grade    1.1052259  1.0760144  1.0623422  1.0550114  1.0504825   1.031602  1.0422249
pred          9.7430196  8.9478998  8.7721528  8.6866237  8.6363291    8.44572  8.5492522
reset_p       .00041195  .00646536  .03256949  .09331925  .15206974  .12989616  .22201128
N                   205        205        205        205        205        205        205
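To use one of these numbers in later code, it can help to copy the matrix out of r() and look up the cell by name. The sketch below uses only built-in Stata matrix functions; the names S and p19 are chosen here for illustration and are not part of tregs.

```stata
* Copy r(stats) right away: later r-class commands overwrite r()
matrix S = r(stats)

* Look up the cell by row and column name instead of by position
scalar p19 = S[rownumb(S, "reset_p"), colnumb(S, "reg_1_9")]
display "RESET p-value for the y^(1/9) specification: " p19
```

Looking cells up by name keeps the code valid if the set of specifications, and therefore the column order, changes.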
Each regression is stored separately under one of the names listed in r(names).
For example, if we want to access the y^(1/9) specification:
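The command that belongs at this step is not shown in this section. Assuming the specifications are kept as stored estimation results under the names in r(names), which is consistent with e(_estimates_name) being "reg_1_9" in the listing that follows, one way to make that specification active is:

```stata
* Assumption: reg_1_9 is a stored estimation result, as r(names) suggests
estimates restore reg_1_9
```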
Now ereturn list will show the stored results for this particular specification:
. ereturn list
scalars:
e(reset_p) = .2220112831743523
e(elas_grade) = 1.042224878670878
e(F) = .
e(rmse) = .07198609357511
e(mss) = .7760035773469278
e(rss) = .5285637621568577
e(tss_within) = .6380255370108875
e(tss) = 1.304567339503786
e(df_m) = 3
e(N_full) = 2244
e(num_singletons) = 2039
e(drop_singletons) = 1
e(ic) = 1
e(df_a_nested) = 100
e(df_a_redundant) = 100
e(df_a_initial) = 100
e(df_a) = 0
e(N_hdfe_extended) = 1
e(N_hdfe) = 1
e(df_r) = 11
e(rank) = 3
e(N) = 205
e(N_clustervars) = 1
e(report_constant) = 1
e(sumweights) = 205
e(N_clust1) = 12
e(N_clust) = 12
e(r2_a_within) = .1471975280008481
e(r2_a) = .1896719377354549
e(r2_within) = .1715633129151096
e(r2) = .5948359688677274
e(ll_0) = 300.7872629567238
e(ll) = 320.0792864923622
e(semi_grade) = .0868520732225732
e(pred) = 8.549252180753959
macros:
e(_estimates_name) : "reg_1_9"
e(vce) : "cluster"
e(vcetype) : "Robust"
e(resid) : "_reghdfe_resid"
e(clustvar1) : "age"
e(clustvar) : "age"
e(indepvars) : "2bn.race 3bn.race grade _cons"
e(depvar) : "__00001E"
e(title3) : "Statistics robust to heteroskedasticity"
e(cmdline) : "reghdfe __00001E 1b.race 2.race 3.race grade [], vce(cluster age) resid absorb(ttl_exp#age)"
e(title2) : "Absorbing 1 HDFE group"
e(title) : "HDFE Linear regression"
e(marginsnotok) : "Residuals SCore"
e(footnote) : "reghdfe_footnote"
e(estat_cmd) : "reghdfe_estat"
e(predict) : "reghdfe_p"
e(extended_absvars) : "ttl_exp#age"
e(absvars) : "ttl_exp#age"
e(dofmethod) : "pairwise clusters continuous"
e(cmd) : "reghdfe"
e(properties) : "b V"
matrices:
e(b) : 1 x 5
e(V) : 5 x 5
e(dof_table) : 1 x 5