添加链接
link之家
链接快照平台
  • 输入网页链接,自动生成快照
  • 标签化管理网页链接
Stata学习:如何进行PSM-DID?

Stata学习:如何进行PSM-DID?

一、详细解读

0. 数据导入

导入自带数据集:

webuse nlswork.dta,clear

1. 前期处理

设置面板:

xtset idcode year, delta(1)
       panel variable:  idcode (unbalanced)
        time variable:  year, 68 to 88, but with gaps
                delta:  1 unit

变量加工与描述性统计:

xtdescribe 
g age2= age^2
g ttl_exp2=ttl_exp^2
g tenure2=tenure^2
global xlist "grade age age2 ttl_exp ttl_exp2 tenure tenure2 not_smsa south race"
sum ln_w $xlist 

2. 传统DID

在这个数据集中,政策执行时间为1977年,因此设置:

g time = (year >= 77) & !missing(year) 

而政策执行地方为idcode大于2000的地方,因此设置:

g treated = (idcode >2000)&!missing(idcode) 

生成政策变量:

g did = time*treated 

采用OLS估计:

reg ln_w did time treated $xlist 
      Source |       SS           df       MS      Number of obs   =    28,091
-------------+----------------------------------   F(13, 28077)    =   1295.76
       Model |  2405.08413        13  185.006471   Prob > F        =    0.0000
    Residual |  4008.77975    28,077  .142778066   R-squared       =    0.3750
-------------+----------------------------------   Adj R-squared   =    0.3747
       Total |  6413.86388    28,090  .228332641   Root MSE        =    .37786
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         did |   .0124603   .0094099     1.32   0.185    -.0059836    .0309042
        time |  -.0694754   .0097324    -7.14   0.000    -.0885513   -.0503996
     treated |  -.0142815   .0078374    -1.82   0.068    -.0296431    .0010801
       grade |   .0633445   .0010293    61.54   0.000      .061327     .065362
         age |   .0484968   .0036834    13.17   0.000     .0412771    .0557165
        age2 |  -.0008259   .0000583   -14.18   0.000    -.0009401   -.0007117
     ttl_exp |   .0250849    .002395    10.47   0.000     .0203906    .0297793
    ttl_exp2 |   .0002872   .0001266     2.27   0.023      .000039    .0005353
      tenure |   .0460783    .001968    23.41   0.000     .0422209    .0499356
     tenure2 |  -.0019604   .0001341   -14.62   0.000    -.0022233   -.0016976
    not_smsa |  -.1686808   .0051713   -32.62   0.000    -.1788169   -.1585447
       south |  -.1017625   .0055672   -18.28   0.000    -.1126745   -.0908504
        race |  -.0528909   .0049329   -10.72   0.000    -.0625597   -.0432221
       _cons |   .1400083   .0535917     2.61   0.009     .0349661    .2450506
------------------------------------------------------------------------------

固定效应模型:

xtreg ln_w did time treated $xlist i.year, fe
Fixed-effects (within) regression               Number of obs     =     28,091
Group variable: idcode                          Number of groups  =      4,697
R-sq:                                           Obs per group:
     within  = 0.1781                                         min =          1
     between = 0.3576                                         avg =        6.0
     overall = 0.2666                                         max =         15
                                                F(23,23371)       =     220.25
corr(u_i, Xb)  = 0.1840                         Prob > F          =     0.0000
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         did |    .019761    .008746     2.26   0.024     .0026182    .0369037
        time |  -.3296955   .2012367    -1.64   0.101    -.7241326    .0647416
     treated |          0  (omitted)
       grade |          0  (omitted)
         age |   .0665613   .0105138     6.33   0.000     .0459536    .0871689
        age2 |  -.0009369   .0000616   -15.20   0.000    -.0010577    -.000816
     ttl_exp |   .0392161    .003072    12.77   0.000     .0331947    .0452375
    ttl_exp2 |  -.0001047   .0001352    -0.77   0.439    -.0003697    .0001603
      tenure |   .0338566   .0018579    18.22   0.000     .0302151    .0374981
     tenure2 |  -.0018164    .000126   -14.42   0.000    -.0020633   -.0015695
    not_smsa |   -.086641   .0095117    -9.11   0.000    -.1052846   -.0679974
       south |  -.0597302   .0109246    -5.47   0.000    -.0811431   -.0383173
        race |          0  (omitted)
        year |
         69  |   .0423343    .015528     2.73   0.006     .0118984    .0727702
         70  |  -.0339863    .022938    -1.48   0.138    -.0789464    .0109737
         71  |  -.0299658   .0318704    -0.94   0.347     -.092434    .0325023
         72  |  -.0584678   .0412716    -1.42   0.157    -.1393627    .0224272
         73  |  -.0961513   .0508379    -1.89   0.059     -.195797    .0034943
         75  |  -.1513149   .0698104    -2.17   0.030    -.2881479   -.0144818
         77  |   .1558488   .1129486     1.38   0.168    -.0655379    .3772354
         78  |   .1417381   .1027217     1.38   0.168    -.0596031    .3430793
         80  |   .0826885   .0834407     0.99   0.322    -.0808607    .2462377
         82  |   .0268096   .0638067     0.42   0.674    -.0982557     .151875
         83  |   .0103961    .054166     0.19   0.848    -.0957727     .116565
         85  |    .007777   .0347411     0.22   0.823    -.0603178    .0758717
         87  |  -.0223798   .0162763    -1.37   0.169    -.0542825    .0095229
         88  |          0  (omitted)
       _cons |   .5038214   .1945871     2.59   0.010     .1224179    .8852249
-------------+----------------------------------------------------------------
     sigma_u |   .3532234
     sigma_e |   .2898202
         rho |  .59764928   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4696, 23371) = 6.58                 Prob > F = 0.0000

3. PSM-DID

一般匹配

定义种子:

set seed 0001

生成随机数并随机整理:

g tmp = runiform()
sort tmp 

近邻匹配:

psmatch2 treated $xlist, out(ln_w) logit ate neighbor(1) common caliper(.05) ties 
Logistic regression                             Number of obs     =     28,091
                                                LR chi2(10)       =   10093.80
                                                Prob > chi2       =     0.0000
Log likelihood =  -13636.63                     Pseudo R2         =     0.2701
------------------------------------------------------------------------------
     treated |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       grade |  -.0342869   .0070082    -4.89   0.000    -.0480227   -.0205511
         age |   .1726049   .0227237     7.60   0.000     .1280673    .2171424
        age2 |  -.0026816   .0003693    -7.26   0.000    -.0034054   -.0019577
     ttl_exp |  -.0681925   .0155242    -4.39   0.000    -.0986193   -.0377657
    ttl_exp2 |   .0040423   .0008363     4.83   0.000     .0024031    .0056815
      tenure |  -.0138884   .0128354    -1.08   0.279    -.0390453    .0112686
     tenure2 |   .0002263   .0008671     0.26   0.794    -.0014733    .0019259
    not_smsa |   .4024444   .0355904    11.31   0.000     .3326885    .4722002
       south |   2.865226   .0409688    69.94   0.000     2.784928    2.945523
        race |   .7655284    .034012    22.51   0.000     .6988662    .8321907
       _cons |  -3.400584   .3254663   -10.45   0.000    -4.038487   -2.762682
------------------------------------------------------------------------------
----------------------------------------------------------------------------------------
        Variable     Sample |    Treated     Controls   Difference         S.E.   T-stat
----------------------------+-----------------------------------------------------------
         ln_wage  Unmatched | 1.62754443   1.75722818  -.129683748   .005816206   -22.30
                        ATT | 1.62798811   1.70873922  -.080751107   .021132148    -3.82
                        ATU | 1.75707925   1.75230645  -.004772802            .        .
                        ATE |                          -.051698588            .        .
----------------------------+-----------------------------------------------------------
Note: S.E. does not take into account that the propensity score is estimated.
 psmatch2: |   psmatch2: Common
 Treatment |        support
assignment | Off suppo  On suppor |     Total
-----------+----------------------+----------
 Untreated |         2     10,733 |    10,735 
   Treated |        20     17,336 |    17,356 
-----------+----------------------+----------
     Total |        22     28,069 |    28,091 

检验协变量在处理组与控制组之间是否平衡:

pstest $xlist, both graph
----------------------------------------------------------------------------------------
                Unmatched |       Mean               %reduct |     t-test    |  V(T)/
Variable          Matched | Treated Control    %bias  |bias| |    t    p>|t| |  V(C)
--------------------------+----------------------------------+---------------+----------
grade                  U  | 12.341   12.853    -22.3         | -18.01  0.000 |  1.18*
                       M  | 12.348   12.645    -12.9    42.1 | -12.03  0.000 |  1.15*
                          |                                  |               |
age                    U  | 29.223   28.848      5.6         |   4.56  0.000 |  0.94*
                       M  | 29.216   30.875    -24.7  -342.5 | -23.65  0.000 |  1.05*
                          |                                  |               |
age2                   U  |  897.8   878.68      4.7         |   3.86  0.000 |  0.95*
                       M  | 897.33   994.81    -24.1  -409.7 | -22.80  0.000 |  1.01
                          |                                  |               |
ttl_exp                U  | 6.2403   6.2447     -0.1         |  -0.08  0.938 |  1.05*
                       M  | 6.2334   6.9942    -16.4-17031.2 | -14.80  0.000 |  0.92*
                          |                                  |               |
ttl_exp2               U  | 61.008   60.008      1.2         |   1.01  0.312 |  1.03
                       M  | 60.849   72.756    -14.8 -1090.6 | -12.91  0.000 |  0.79*
                          |                                  |               |
tenure                 U  | 3.0891   3.1782     -2.4         |  -1.93  0.053 |  0.96*
                       M  | 3.0899   2.9777      3.0   -25.9 |   2.87  0.004 |  1.10*
                          |                                  |               |
tenure2                U  |  23.41   24.482     -1.9         |  -1.60  0.110 |  0.90*
                       M  | 23.423   21.518      3.5   -77.7 |   3.41  0.001 |  1.14*
                          |                                  |               |
not_smsa               U  | .32945   .20689     27.9         |  22.36  0.000 |     .
                       M  | .32897   .32066      1.9    93.2 |   1.65  0.099 |     .
                          |                                  |               |
south                  U  |   .618   .07219    140.2         | 107.35  0.000 |     .
                       M  | .61756   .61802     -0.1    99.9 |  -0.09  0.930 |     .
                          |                                  |               |
race                   U  | 1.3912   1.1631     50.3         |  39.53  0.000 |  1.91*
                       M  | 1.3896   1.3763      2.9    94.2 |   2.43  0.015 |  1.05*
                          |                                  |               |
----------------------------------------------------------------------------------------
* if variance ratio outside [0.97; 1.03] for U and [0.97; 1.03] for M
-----------------------------------------------------------------------------------
 Sample    | Ps R2   LR chi2   p>chi2   MeanBias   MedBias      B      R     %Var
-----------+-----------------------------------------------------------------------
 Unmatched | 0.272  10149.10    0.000     25.7       5.2     145.8*   3.70*    88
 Matched   | 0.019    932.83    0.000     10.4       8.2      33.1*   1.16     88
-----------------------------------------------------------------------------------
* if B>25%, R outside [0.5; 2]

去掉不满足共同区域假定的观测值:

gen common=_support
drop if common == 0 

根据上面匹配好的数据,进行DID。

首先是OLS:

reg ln_w did time treated $xlist 
      Source |       SS           df       MS      Number of obs   =    28,069
-------------+----------------------------------   F(13, 28055)    =   1294.85
       Model |  2402.32966        13  184.794589   Prob > F        =    0.0000
    Residual |   4003.8761    28,055  .142715241   R-squared       =    0.3750
-------------+----------------------------------   Adj R-squared   =    0.3747
       Total |  6406.20576    28,068  .228238769   Root MSE        =    .37778
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         did |   .0130378   .0094101     1.39   0.166    -.0054064    .0314821
        time |   -.069613   .0097317    -7.15   0.000    -.0886876   -.0505384
     treated |  -.0145157    .007837    -1.85   0.064    -.0298765    .0008452
       grade |   .0635306   .0010326    61.53   0.000     .0615067    .0655545
         age |   .0487295   .0036856    13.22   0.000     .0415054    .0559535
        age2 |  -.0008306   .0000583   -14.24   0.000    -.0009449   -.0007163
     ttl_exp |   .0246821   .0023991    10.29   0.000     .0199798    .0293843
    ttl_exp2 |   .0003217   .0001271     2.53   0.011     .0000726    .0005708
      tenure |   .0461767   .0019693    23.45   0.000     .0423168    .0500365
     tenure2 |  -.0019786   .0001343   -14.73   0.000    -.0022419   -.0017154
    not_smsa |  -.1681968   .0051738   -32.51   0.000    -.1783376    -.158056
       south |   -.101765   .0055661   -18.28   0.000    -.1126748   -.0908551
        race |  -.0534454   .0049452   -10.81   0.000    -.0631381   -.0437526
       _cons |   .1361316   .0536086     2.54   0.011     .0310562    .2412069
------------------------------------------------------------------------------

然后是固定效应模型:

xtreg ln_w did time treated $xlist i.year, fe
Fixed-effects (within) regression               Number of obs     =     28,069
Group variable: idcode                          Number of groups  =      4,696
R-sq:                                           Obs per group:
     within  = 0.1782                                         min =          1
     between = 0.3585                                         avg =        6.0
     overall = 0.2678                                         max =         15
                                                F(23,23350)       =     220.19
corr(u_i, Xb)  = 0.1861                         Prob > F          =     0.0000
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         did |   .0199922   .0087501     2.28   0.022     .0028414     .037143
        time |  -.3259785   .2014472    -1.62   0.106    -.7208281    .0688712
     treated |          0  (omitted)
       grade |          0  (omitted)
         age |   .0664258   .0105214     6.31   0.000     .0458031    .0870485
        age2 |  -.0009367   .0000617   -15.18   0.000    -.0010576   -.0008158
     ttl_exp |   .0388635   .0030813    12.61   0.000     .0328238    .0449031
    ttl_exp2 |  -.0000845   .0001359    -0.62   0.534    -.0003508    .0001819
      tenure |   .0339768   .0018608    18.26   0.000     .0303295    .0376241
     tenure2 |  -.0018292   .0001263   -14.48   0.000    -.0020768   -.0015817
    not_smsa |  -.0865921   .0095179    -9.10   0.000    -.1052478   -.0679365
       south |  -.0600416   .0109379    -5.49   0.000    -.0814807   -.0386026
        race |          0  (omitted)
        year |
         69  |   .0425357   .0155411     2.74   0.006     .0120741    .0729974
         70  |  -.0332855   .0229601    -1.45   0.147    -.0782887    .0117178
         71  |  -.0290703    .031901    -0.91   0.362    -.0915985    .0334578
         72  |  -.0573452   .0413132    -1.39   0.165    -.1383217    .0236314
         73  |  -.0947357   .0508901    -1.86   0.063    -.1944837    .0050123
         75  |  -.1496357   .0698826    -2.14   0.032    -.2866102   -.0126612
         77  |   .1540973   .1130656     1.36   0.173    -.0675187    .3757134
         78  |    .140131   .1028285     1.36   0.173    -.0614196    .3416817
         80  |    .081445   .0835287     0.98   0.330    -.0822767    .2451668
         82  |   .0260429   .0638735     0.41   0.683    -.0991533     .151239
         83  |   .0097373   .0542219     0.18   0.857    -.0965412    .1160159
         85  |    .007336   .0347776     0.21   0.833    -.0608304    .0755023
         87  |  -.0225526   .0162929    -1.38   0.166    -.0544878    .0093826
         88  |          0  (omitted)
       _cons |   .5068104   .1947252     2.60   0.009     .1251361    .8884847
-------------+----------------------------------------------------------------
     sigma_u |   .3531224
     sigma_e |  .28988636
         rho |  .59740195   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4695, 23350) = 6.56                 Prob > F = 0.0000

逐年匹配

这里采用Xu等(2022)的思路:

  1. Xu, L., et al. (2022). How has China's low-carbon city pilot policy influenced its CO2 abatement costs? Analysis from the perspective of the shadow price
    1. Appendix B. Supplementary data 【数据+Stata+Matlab】
*(1)matching data
use psmdata.dta, clear
forvalue i = 2003/2018{
      preserve
          capture {
              keep if year == `i'
              set seed 10101
              gen  ranorder = runiform()
              sort ranorder
              psmatch2 treated $xlist , outcome(lnMAC) logit neighbor(2)  ///
                                        ties common ate caliper(0.05)
              save `i'.dta, replace
      restore
clear all
use 2003.dta, clear
forvalue k =2004/2018 {
      capture {
          append using `k'.dta
save ybydata.dta, replace
**(2) regression 
use ybydata.dta, clear
*- PSM-DID3(samples that meet the Common Support Assumption )
reghdfe lnMAC LCC $xlist if _support == 1, $regopt //Column (3) of Table 5
outreg2 using psm.rtf,bdec(4) sdec(4) append
*- PSM-DID4(frequency-weighted regression)
gen     weight = _weight * 2
replace weight = 1 if treated == 1 & _weight != .
reghdfe  lnMAC LCC  $xlist [fweight = weight], $regopt //Column (4) of Table 5
outreg2 using psm.rtf,bdec(4) sdec(4) append

4. DID的条件检验

4.1 共同趋势假设

生成年度虚拟变量

tab year, g(yrdum) 
  interview |
       year |      Freq.     Percent        Cum.
------------+-----------------------------------
         68 |      1,375        4.82        4.82
         69 |      1,232        4.32        9.14
         70 |      1,686        5.91       15.05
         71 |      1,851        6.49       21.53
         72 |      1,693        5.93       27.47
         73 |      1,981        6.94       34.41
         75 |      2,141        7.50       41.91
         77 |      2,171        7.61       49.52
         78 |      1,964        6.88       56.40
         80 |      1,847        6.47       62.88
         82 |      2,085        7.31       70.18
         83 |      1,987        6.96       77.15
         85 |      2,085        7.31       84.45
         87 |      2,164        7.58       92.04
         88 |      2,272        7.96      100.00
------------+-----------------------------------
      Total |     28,534      100.00

生成政策实行前的那些年份与处理虚拟变量的交互项

forval v=1/7{
g treated`v'=yrdum`v'*treated
}

做三个回归:

  • 第一列:没有控制变量
  • 第二列:如果did依然显著,且treated*这些政策施行前年份交互项并不显著,那就好
  • 第三列:工会会影响这个处理组和控制组的共同趋势,因此看看union=0的情形
xtreg ln_w did treated*  i.year ,fe  
outreg2 using result_1.doc,replace tstat bdec(3) tdec(2) ctitle(ln_w,Full-sample)addtext(Controls, No,Year Effect, Yes) keep(did treated*)
xtreg ln_w did treated* $xlist i.year ,fe 
outreg2 using result_1.doc,append tstat bdec(3) tdec(2) ctitle(ln_w,Full-sample)addtext(Controls, Yes,Year Effect, Yes) keep(did treated*)
xtreg ln_w did treated* $xlist i.year if union!=1 ,fe 
outreg2 using result_1.doc,append tstat bdec(3) tdec(2) ctitle(ln_w,union=0)addtext(Controls, Yes,Year Effect, Yes) keep(did treated*)

4.2 政策干预时间的随机性

将政策执行时间分别提前到1975年和1976年,重新构造DID变量:

g time1 = (year >= 75) & !missing(year)
g treated_1= (idcode >2000)&!missing(idcode)
g did1 = time1*treated_1
g time2 = (year >= 76) & !missing(year)
g treated_2= (idcode >2000)&!missing(idcode)
g did2 = time2*treated_2 

回归结果:

xtreg ln_w did1 $xlist i.year,fe 
outreg2 using result_2.doc,replace tstat bdec(3) tdec(2) ctitle(ln_w,Full-sample)addtext(Controls, Yes,Year Effect, Yes) keep(did*)
xtreg ln_w did2 $xlist i.year,fe 
outreg2 using result_2.doc,append tstat bdec(3) tdec(2) ctitle(ln_w,Full-sample)addtext(Controls, Yes,Year Effect, Yes) keep(did*)

4.3 控制组将不受政策影响

考虑一个并没有受政策影响地方假设其受到政策影响的情况:

g time = (year >= 77) & !missing(year)  
g treated_3= (idcode<1600 & idcode>1000)&!missing(idcode) 
g did3 = time*treated_3 
xtreg ln_w did3 $xlist i.year,fe
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        did3 |   .0047208   .0137331     0.34   0.731     -.022197    .0316387
...

did3不显著,证明控制组不受政策影响。符合预期。

4.4 政策实施的唯一性

至少证明这个政策才是主要影响因素。需要寻找某些受到其他政策影响的地方:

g time4 = (year >= 77) & !missing(year)  
g treated_4= (idcode<3000 & idcode>2300)&!missing(idcode)
g did4 = time4*treated_4  
xtreg ln_w did4 $xlist i.year,fe 
------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        did4 |   -.009966   .0129562    -0.77   0.442     -.035361    .0154289
...

这里did4不显著,可以证明政策实施的唯一性。但有时候did4可能依然显著,但是系数变小,证明还受到其他政策影响。

4.5 控制组和政策影响组的分组是随机的

用工具变量来替代政策变量,解决因为分组非随机导致的内生性问题:

xi:xtivreg2 ln_w (did=hours tenure) $xlist i.year,fe first 

二、期刊示例

示例1:psmatch2

文献来源

  1. Zhang, X., et al. (2023). The impact of carbon markets on the financial performance of power producers: Evidence based on China
    1. Appendix A. Supplementary data 【数据+Stata】

示例代码

cd "C:\Download\1-s2.0-S0140988323006175-mmc1\data and command"
use "data.dta",clear
gen top10=ln(Top10)
gen Soe=0
replace Soe=1 if 公司属性=="中央国有企业"|公司属性=="地方国有企业"
destring stkcd,replace
xtset stkcd time
gen miss1=missing(ROA ,did ,Size ,Leverage ,Growth ,top10 ,Soe)
drop if miss1==1
set seed 10101
gen ranorder = runiform()
sort ranorder
*==========进行一对一匹配========================================
psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(1) ate ties logit common
*==========平衡性检验============================================
qui psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(1) ate ties logit common
pstest Size Leverage Growth Top10 Soe, both graph
*====================未匹配结果==================================
psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(1) ate ties logit common
*========(1)一对一匹配===========================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(1) ate ties logit common
gen psm_sample1=1 if _weight!=.
drop  _pscore _treated _support _weight _id _n* _pdif  
*========(2)邻近匹配k=4========================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(4) ate ties logit common
sum _pscore
dis 0.25 * r(sd)
gen psm_sample2=1 if _weight!=.
drop _pscore _treated _support _weight _id _n* _pdif  
*========(3)卡尺匹配=============================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) n(4) cal(0.01) ate ties logit common quietly
gen psm_sample3=1 if _weight!=.
drop _pscore _treated _support _weight _id _n* _pdif  
*========(4)半径匹配=============================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) radius cal(0.01) ate ties logit common quietly
gen psm_sample4=1 if _weight!=.
drop _pscore _treated _support _weight  
*========(5)核匹配===============================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) kernel ate ties logit common quietly
gen psm_sample5=1 if _weight!=.
drop _pscore _treated _support _weight 
*========(6)局部线性回归匹配=====================================
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) llr ate ties logit common quietly
gen psm_sample6=1 if _weight!=.
drop _pscore _treated _support _weight _id _n* _pdif  
*========(7)样条匹配=============================================
set seed 10101
bootstrap r(att) r(atu) r(ate), reps(100): psmatch2 did i.time Size Leverage Growth top10 Soe, outcome(ROA) spline ate ties logit common quietly
gen psm_sample7=1 if _weight!=.
drop _treated _support _weight _id _n*   
*========(8)马氏匹配=============================================
psmatch2 did, outcome(ROA) mahal(i.time Size Leverage Growth top10 Soe) n(4) ai(4) ate
gen psm_sample8=1 if _weight!=.
drop _treated _support _weight _id _n*
*****************************************************************
*========生成结果=============================================
sum psm_sample1 psm_sample2 psm_sample3 psm_sample4 psm_sample5 psm_sample6  psm_sample8
xtset stkcd time
cap drop _est*
qui{ 
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample1==1,cluster(stkcd) r
est store m1
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample2==1,cluster(stkcd) r
est store m2
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample3==1,cluster(stkcd) r
est store m3
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample4==1,cluster(stkcd) r
est store m4
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample5==1,cluster(stkcd) r
est store m5
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample6==1,cluster(stkcd) r
est store m6
reg ROA did Size Leverage Growth top10 Soe i.time if psm_sample8==1,cluster(stkcd) r
est store m8
outreg2 [m1 m2 m3 m4 m5 m6  m8] using "倾向得分匹配PSM(内生性2).doc",word replace tstat tdec(2) bdec(3) rdec(3) nonote drop(_Itime_*) sortvar(ROA)addtext(Firm FE,YES,Year FE,YES)

得到结果

示例2:kmatch

文献来源

  1. Shangguan, Y., et al. (2023). Environmental bonuses of employment protection: Evidence from labor contract law in China
    1. Appendix B. Supplementary data 【数据+Stata】

示例代码

cd "C:\Download\1-s2.0-S0140988323007363-mmc1\DATA & CODE Environmental Bonuses of Employment Protection"
use "Environmental Bonuses of Employment Protection.dta",clear
sum LC2_dum
kmatch ps  LC2_dum  (lny1), ///
        ebalance ( lnFixasset lnLabor acf_sale age sa hhi c_d roa ) ///
		gen(var) wgen(wvar) nn(1)
drop if _KM_ps == .
reghdfe lny1 did2  lnFixasset lnLabor acf_sale age sa hhi c_d roa [pweight=_KM_mw ] , ///
        a( id  year ind4a_year ind4a_cnamecode cnamecode_year ) clu(id)  

得到结果

Propensity-score nearest-neighbor matching              Number of obs = 93,066
                                                Neighbors:    min =      46512
Treatment   : LC2_dum = 1                                     max =      46554
Covariates  : (none)
PS model    : logit (pr)
Entropy bal.: targets = 1
              balance tolerance = .00001
              covariates = lnFixasset lnLabor acf_sale age sa hhi c_d roa
              basis = matching weights
Matching statistics
------------------------------------------------------------------------------------------
           |             Matched             |             Controls            |  Balance 
           |       Yes         No      Total |      Used     Unused      Total |     loss 
-----------+---------------------------------+---------------------------------+----------
   Treated |     46512          0      46512 |     46554          0      46554 |  4.14e-13
 Untreated |     46554          0      46554 |     46512          0      46512 |  1.52e-16
  Combined |     93066          0      93066 |     93066          0      93066 |          
------------------------------------------------------------------------------------------
Treatment-effects estimation
------------------------------------------------------------------------------
        lny1 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         ATE |   .2756808   .0174657    15.78   0.000     .2414481    .3099134
------------------------------------------------------------------------------
Stored variables
Variable      Storage   Display    Value
    name         type    format    label      Variable label
---------------------------------------------------------------------------------------------
var             byte    %8.0g                 Treatment indicator
_KM_nc          long    %10.0g                Number of matched controls
_KM_nm          long    %10.0g                Number of times used as a match
_KM_mw          double  %10.0g                Matching weight
_KM_ps          double  %10.0g                Propensity score
_KM_strata      byte    %8.0g                 Matching stratum
wvar            double  %10.0g                Matching weights for ATE
HDFE Linear regression                            Number of obs   =     88,762
Absorbing 5 HDFE groups                           F(   9,  17649) =      62.55
Statistics robust to heteroskedasticity           Prob > F        =     0.0000
                                                  R-squared       =     0.8876
                                                  Adj R-squared   =     0.8219
                                                  Within R-sq.    =     0.0917
Number of clusters (id)      =     17,650         Root MSE        =     0.4057
                                (Std. err. adjusted for 17,650 clusters in id)
------------------------------------------------------------------------------
             |               Robust
        lny1 | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        did2 |   -.118895   .0215247    -5.52   0.000    -.1610854   -.0767046
  lnFixasset |  -.1014892   .0089206   -11.38   0.000    -.1189745   -.0840039
     lnLabor |  -.1498149   .0124582   -12.03   0.000    -.1742341   -.1253956
    acf_sale |  -.3305972   .0283343   -11.67   0.000    -.3861353   -.2750591
         age |   .0001202   .0006171     0.19   0.846    -.0010895    .0013299
          sa |   .0235574   .0060927     3.87   0.000     .0116151    .0354997
         hhi |  -3.597629   1.952235    -1.84   0.065    -7.424202     .228945
         c_d |  -.0109467   .0124806    -0.88   0.380    -.0354099    .0135164
         roa |  -.0889202    .044236    -2.01   0.044    -.1756272   -.0022132
       _cons |   4.526707     .25066    18.06   0.000     4.035389    5.018026
------------------------------------------------------------------------------
Absorbed degrees of freedom:
---------------------------------------------------------+
     Absorbed FE | Categories  - Redundant  = Num. Coefs |
-----------------+---------------------------------------|
              id |     17650       17650           0    *|
            year |        10           0          10     |
      ind4a_year |      3194          10        3184     |
 ind4a_cnamecode |      9357         279        9078    ?|
  cnamecode_year |      3072         274        2798    ?|
---------------------------------------------------------+
? = number of redundant parameters may be higher
* = FE nested within cluster; treated as redundant for DoF computation

期刊排版

示例3:psmatch2

文献来源

  1. Wang, W., et al. (2024). The impact of energy-consuming rights trading on green total factor productivity in the context of digital economy: Evidence from listed firms in China
    1. Appendix B. Supplementary data 【数据+Stata】

示例代码

/**Near-neighbor matching**/
clear 
use data.dta
set seed 103
gen u=runiform()
sort u
psmatch2 du SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND,outcome(FGTFP) n(2) ate ties logit common
pstest SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND,both graph
psgraph
reghdfe FGTFP c._treated#c.dt SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND if _weight!=.,absorb(id year)
reghdfe FGTFP c._treated#c.dt SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND if _support ==1,absorb(id year)
/**Radius matching**/
clear
use data.dta 
set seed 101
gen u=runiform()
sort u
psmatch2 du SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND,outcome(FGTFP) radius ate logit cal(0.01) common
pstest SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND,both graph
psgraph
reghdfe FGTFP c._treated#c.dt SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND if _weight!=.,absorb(id year)
reghdfe FGTFP c._treated#c.dt SIZE AGE LEV TOBINQ ROA CAPITAL GROWTH TOP1 INDEP EPI FDI GDP IND if _support ==1,absorb(id year)
/**Nuclear matching**/
clear
use data.dta 
set seed 104
gen u=runiform()