Examples

After installing tregs, load the example data in Stata

. sysuse nlsw88, clear
(NLSW, 1988 extract)

Run tregs

. tregs wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear
Warning: You specified variables in xvar() that are not included as covariates. All variables in xvar() have been added as covariates in all regressions.
Computing the power with the highest RESET specification test p-value...
Power with the highest p-value in the RESET specification test: 1/9
Processing specification: y^1
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/2)
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/3)
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/4)
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/5)
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: log(y)
  - Computing semi-elasticity and elasticity with respect to grade
Processing specification: y^(1/9)
  - Computing semi-elasticity and elasticity with respect to grade

Regression Results, Dep. Var: Hourly wage
-----------------------------------------------------------------------------------------------------------------------------------------
                                   (1)             (2)             (3)             (4)             (5)             (6)             (7)
                                   y^1         y^(1/2)         y^(1/3)         y^(1/4)         y^(1/5)          log(y)         y^(1/9)
-----------------------------------------------------------------------------------------------------------------------------------------
White                                0               0               0               0               0               0               0
                                   (.)             (.)             (.)             (.)             (.)             (.)             (.)

Black                           -0.917          -0.137         -0.0624         -0.0389         -0.0279         -0.0908         -0.0128
                               (1.122)         (0.162)        (0.0738)        (0.0460)        (0.0330)         (0.108)        (0.0152)

Other                           -3.428***       -0.833***       -0.430***       -0.284***       -0.210***       -0.768***       -0.102***
                               (0.868)         (0.130)        (0.0591)        (0.0368)        (0.0264)        (0.0854)        (0.0121)

Current grade completed          0.785***        0.125***       0.0580***       0.0364***       0.0262***       0.0860***       0.0120***
                               (0.217)        (0.0325)        (0.0148)       (0.00920)       (0.00659)        (0.0213)       (0.00301)
-----------------------------------------------------------------------------------------------------------------------------------------
eydx: Current grade com~d       0.0921          0.0897          0.0885          0.0879          0.0875          0.0860          0.0869
eyex: Current grade com~d        1.105           1.076           1.062           1.055           1.050           1.032           1.042
Predicted y: At means            9.743           8.948           8.772           8.687           8.636           8.446           8.549
RESET Test p                  0.000412         0.00647          0.0326          0.0933           0.152           0.130           0.222
Observations                       205             205             205             205             205             205             205
-----------------------------------------------------------------------------------------------------------------------------------------
Standard errors in parentheses
* p<0.10, ** p<0.05, *** p<0.01

Interpretation

The dependent variable here is wage, or "Hourly wage".

The independent variable of interest in this case is grade (specified in xvar(grade)), or "Current grade completed".

In all specifications, grade is significantly associated with wage.

The coefficient estimates vary across specifications because the unit of the transformed dependent variable changes, but the semi-elasticities and elasticities are quite stable. These semi-elasticities and elasticities are the quantities of interest.

On average, a one-unit increase in grade, evaluated at grade 12, changes wage by 9.21% in the linear specification (y^1), by 8.75% in the fifth-root specification (y^(1/5)), and by 8.60% in the log specification (log(y)). These numbers can be read off the semi-elasticity row (eydx). The story is very similar for the elasticities (eyex), the percent change in wage associated with a one percent change in grade.
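As a quick consistency check on the table (a worked calculation under the standard definitions of the two quantities, not additional tregs output), the elasticity is the semi-elasticity scaled by the evaluation point, here \(x = 12\) from xvar_at(12):

\[
\texttt{eydx} = \frac{\partial \log y}{\partial x},
\qquad
\texttt{eyex} = \frac{\partial \log y}{\partial \log x} = x \cdot \texttt{eydx}
\]

\[
y^1:\; 12 \times 0.0921 \approx 1.105,
\qquad
\log(y):\; 12 \times 0.0860 = 1.032
\]

Both products match the eyex row. In the log(y) column, the semi-elasticity (0.0860) also coincides with the coefficient on grade, since that coefficient is already the derivative of log wage with respect to grade.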

Here, wage data are all positive, so all specifications have the same number of observations and therefore are directly comparable (we can compute log(wage) for all observations).

The predicted wage at the means of the covariates is also similar across specifications. Note that this is the predicted value of the untransformed dependent variable, wage. With a non-linear transformation (e.g., \(\log(y)\)), one cannot simply invert the predicted value of the transformed dependent variable (e.g., compute \(\exp(X\hat{\beta})\)) to obtain the implied predicted value of the untransformed dependent variable (\(y\)). The estimates here are obtained using Duan's (1983) smearing estimate.
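For reference, the textbook form of the smearing estimate can be sketched as follows. The notation is introduced here for illustration: \(h\) is the transformation applied to \(y\), \(\hat{\varepsilon}_i\) are the residuals from the regression of \(h(y)\), and \(n\) is the number of observations. How tregs applies it internally for each power is an assumption, not something documented on this page.

\[
\hat{y}(X) = \frac{1}{n}\sum_{i=1}^{n} h^{-1}\!\left(X\hat{\beta} + \hat{\varepsilon}_i\right),
\qquad
\text{so for } h(y) = \log(y):\quad
\hat{y}(X) = \exp(X\hat{\beta}) \cdot \frac{1}{n}\sum_{i=1}^{n} \exp(\hat{\varepsilon}_i)
\]

Because the residuals average to zero, the correction factor \(\frac{1}{n}\sum_i \exp(\hat{\varepsilon}_i)\) is at least 1 by Jensen's inequality, which is why the naive inversion \(\exp(X\hat{\beta})\) tends to under-predict \(y\).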

Ramsey's RESET test rejects the null hypothesis of a linear relationship between the transformed dependent variable and the covariates at the 5% significance level for the linear, square-root, and cube-root specifications (p = 0.0004, 0.0065, and 0.0326, respectively). This finding will be unsurprising to those who work with wage data and routinely use logarithmic transformations to achieve a better fit. In this case, according to the RESET test, the transformation y^(1/9) has the highest p-value (0.222) and therefore yields the relationship that is "most" linear.

Retrieve returned values

Returned values can be accessed with ereturn list and return list.

ereturn list

will return the following:

. ereturn list

macros:
            e(mtitles) : "y^1" "y^(1/2)" "y^(1/3)" "y^(1/4)" "y^(1/5)" "log(y)" "y^(1/9)"
         e(mostlinear) : "mostlinear"
   e(mostlinear_power) : "1/9"
      e(absorb_option) : "absorb(ttl_exp#age)"
            e(regopts) : "vce(cluster age) resid"
        e(reg_command) : "reghdfe"
                  e(y) : "wage"
             e(cnames) : "1b.race 2.race 3.race grade"
           e(speclist) : "1 1/2 1/3 1/4 1/5 log 1/9"
            e(cmdline) : "treg wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear"
                e(cmd) : "treg"

matrices:
       e(elasticities) :  3 x 7

e(elasticities) stores the semi-elasticities, elasticities, and predicted values as a named matrix:

. matrix list e(elasticities)

e(elasticities)[3,7]
                             y^1    y^(1/2)    y^(1/3)    y^(1/4)    y^(1/5)     log(y)    y^(1/9)
Semi-elasticity:grade  .09210216  .08966786  .08852852  .08791762  .08754021  .08596683  .08685207
     Elasticity:grade  1.1052259  1.0760144  1.0623422  1.0550114  1.0504825   1.031602  1.0422249
 Predicted y:At means  9.7430196  8.9478998  8.7721528  8.6866237  8.6363291    8.44572  8.5492522
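To use these numbers in later calculations, the matrix can be copied into a regular Stata matrix and indexed by position. The following is a minimal sketch that reruns the example from this page; the matrix name E is arbitrary, and the row and column positions are read off the listing above.

```stata
* Assumes tregs is installed; reruns the example from this page.
sysuse nlsw88, clear
tregs wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear

* Copy e(elasticities) before another estimation command replaces e().
matrix E = e(elasticities)

* Rows: 1 = semi-elasticity, 2 = elasticity, 3 = predicted y.
* Column 1 is the y^1 specification.
display "Semi-elasticity (y^1): " E[1,1]

* The elasticity divided by the semi-elasticity recovers the
* evaluation point given in xvar_at().
display "eyex / eydx (y^1): " E[2,1] / E[1,1]
```

Copying the matrix first matters because e(elasticities) is replaced as soon as another set of estimation results becomes active.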

Next,

return list

will return the following:

. return list

scalars:
            r(nmodels) =  7
              r(ccols) =  3

macros:
              r(names) : "reg_1 reg_1_2 reg_1_3 reg_1_4 reg_1_5 reg_log reg_1_9"
         r(m7_depname) : "__00001E"
         r(m6_depname) : "__000016"
         r(m5_depname) : "__00000Y"
         r(m4_depname) : "__00000Q"
         r(m3_depname) : "__00000I"
         r(m2_depname) : "__00000A"
         r(m1_depname) : "__000002"
            r(cmdline) : "estout reg_*, cells(b(fmt(a3) star) se(fmt(a3) par("{ralign @modelwidth:{txt:(}" "{txt:)}}"))) drop(_cons, relax) stats(semi_grade elas.."

matrices:
              r(coefs) :  4 x 21
              r(stats) :  5 x 7

To retrieve coefficients and standard errors for all specifications, r(coefs) can be used:

. matrix list r(coefs)

r(coefs)[4,21]
             reg_1:      reg_1:      reg_1:    reg_1_2:    reg_1_2:    reg_1_2:    reg_1_3:    reg_1_3:    reg_1_3:    reg_1_4:    reg_1_4:    reg_1_4:    reg_1_5:
                 b          se           p           b          se           p           b          se           p           b          se           p           b
1.race           0           .           .           0           .           .           0           .           .           0           .           .           0
2.race  -.91713138   1.1222037    .4311257  -.13674825   .16217866   .41707779  -.06242501    .0738332   .41585135  -.03891155   .04603871   .41600909  -.02789641
3.race  -3.4282385   .86796147   .00227364  -.83319327   .12984377   .00004963  -.43045271   .05913321   .00001584  -.28378344   .03679788   9.241e-06  -.21004786
 grade   .78545163   .21699037    .0040282   .12530074   .03246094    .0026531   .05801803    .0147833   .00237414   .03637153   .00919947    .0022585   .02615427

           reg_1_5:    reg_1_5:    reg_log:    reg_log:    reg_log:    reg_1_9:    reg_1_9:    reg_1_9:
                se           p           b          se           p           b          se           p
1.race           .           .           0           .           .           0           .           .
2.race   .03303105   .41635185  -.09082638    .1083043   .41953216    -.012784   .01517268   .41741417
3.race   .02635142   6.761e-06  -.76798984   .08537324   2.106e-06  -.10157507   .01205005   3.960e-06
 grade   .00658786   .00219582   .08596683   .02134331   .00198941   .01204197   .00301251   .00209578

To retrieve semi-elasticity estimates, elasticity estimates, and all other statistics returned by tregs, use r(stats):

. matrix list r(stats)

r(stats)[5,7]
                reg_1    reg_1_2    reg_1_3    reg_1_4    reg_1_5    reg_log    reg_1_9
semi_grade  .09210216  .08966786  .08852852  .08791762  .08754021  .08596683  .08685207
elas_grade  1.1052259  1.0760144  1.0623422  1.0550114  1.0504825   1.031602  1.0422249
      pred  9.7430196  8.9478998  8.7721528  8.6866237  8.6363291    8.44572  8.5492522
   reset_p  .00041195  .00646536  .03256949  .09331925  .15206974  .12989616  .22201128
         N        205        205        205        205        205        205        205
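Individual statistics can also be looked up by name rather than by position. The sketch below reruns the example from this page and uses Stata's rownumb() and colnumb() matrix functions. The matrix name S and the local macro names are arbitrary choices for illustration.

```stata
* Assumes tregs is installed; reruns the example from this page.
sysuse nlsw88, clear
tregs wage i.race, absorb(ttl_exp#age) xvar(grade) xvar_at(12) log reset regopts(vce(cluster age)) mostlinear

* Copy r(stats) immediately: any later r-class command overwrites r().
matrix S = r(stats)

* Look up a single statistic by row and column name.
local r = rownumb(S, "reset_p")
local c = colnumb(S, "reg_1_9")
display "RESET p-value for y^(1/9): " S[`r', `c']

* Find the specification with the highest RESET p-value.
local names : colnames S
local best = 1
forvalues j = 2/`=colsof(S)' {
    if S[`r', `j'] > S[`r', `best'] {
        local best = `j'
    }
}
local bestname : word `best' of `names'
display "Highest RESET p-value: `bestname'"
```

The loop compares every column, including reg_log, so it reports whichever stored specification has the largest RESET p-value.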

Each regression is stored as a separate set of estimation results under the names listed in r(names).

For example, if we want to access the y^(1/9) specification:

. estimates restore reg_1_9
(results reg_1_9 are active now)

Now ereturn list will show the list for this particular specification:

. ereturn list

scalars:
            e(reset_p) =  .2220112831743523
         e(elas_grade) =  1.042224878670878
                  e(F) =  .
               e(rmse) =  .07198609357511
                e(mss) =  .7760035773469278
                e(rss) =  .5285637621568577
         e(tss_within) =  .6380255370108875
                e(tss) =  1.304567339503786
               e(df_m) =  3
             e(N_full) =  2244
     e(num_singletons) =  2039
    e(drop_singletons) =  1
                 e(ic) =  1
        e(df_a_nested) =  100
     e(df_a_redundant) =  100
       e(df_a_initial) =  100
               e(df_a) =  0
    e(N_hdfe_extended) =  1
             e(N_hdfe) =  1
               e(df_r) =  11
               e(rank) =  3
                  e(N) =  205
      e(N_clustervars) =  1
    e(report_constant) =  1
         e(sumweights) =  205
           e(N_clust1) =  12
            e(N_clust) =  12
        e(r2_a_within) =  .1471975280008481
               e(r2_a) =  .1896719377354549
          e(r2_within) =  .1715633129151096
                 e(r2) =  .5948359688677274
               e(ll_0) =  300.7872629567238
                 e(ll) =  320.0792864923622
         e(semi_grade) =  .0868520732225732
               e(pred) =  8.549252180753959

macros:
    e(_estimates_name) : "reg_1_9"
                e(vce) : "cluster"
            e(vcetype) : "Robust"
              e(resid) : "_reghdfe_resid"
          e(clustvar1) : "age"
           e(clustvar) : "age"
          e(indepvars) : "2bn.race 3bn.race grade _cons"
             e(depvar) : "__00001E"
             e(title3) : "Statistics robust to heteroskedasticity"
            e(cmdline) : "reghdfe __00001E 1b.race 2.race 3.race grade [], vce(cluster age) resid absorb(ttl_exp#age)"
             e(title2) : "Absorbing 1 HDFE group"
              e(title) : "HDFE Linear regression"
       e(marginsnotok) : "Residuals SCore"
           e(footnote) : "reghdfe_footnote"
          e(estat_cmd) : "reghdfe_estat"
            e(predict) : "reghdfe_p"
   e(extended_absvars) : "ttl_exp#age"
            e(absvars) : "ttl_exp#age"
          e(dofmethod) : "pairwise clusters continuous"
                e(cmd) : "reghdfe"
         e(properties) : "b V"

matrices:
                  e(b) :  1 x 5
                  e(V) :  5 x 5
          e(dof_table) :  1 x 5