Nan Fang Yi Ke Da Xue Xue Bao.
2023 Jul 20; 43(7): 1241–1247.
An interpretable machine learning-based prediction model for risk of death for patients with ischemic stroke in intensive care unit
Luo Xiao (罗枭), Cheng Yi (程义), Wu Cheng (吴骋), He Jia (贺佳)
Department of Military Health Statistics, Naval Medical University, Shanghai 200433, China
Variable | Group 1 (n=1679) | Group 2 (n=690) | P
Gender | | | <0.001
  Female | 784 (46.7%) | 376 (54.5%) |
  Male | 895 (53.3%) | 314 (45.5%) |
Age (year) | 69.5±14.3 | 78.8±11.6 | <0.001
Weight (kg) | 81.9±21.1 | 72.9±18.9 | <0.001
Smoking history | | | 0.9
  Yes | 306 (18.2%) | 128 (18.6%) |
  No | 1373 (81.8%) | 562 (81.4%) |
Ventilation | | | <0.001
  Yes | 927 (55.2%) | 319 (46.2%) |
  No | 752 (44.8%) | 371 (53.8%) |
Hypertension | | | <0.001
  Yes | 927 (55.2%) | 319 (46.2%) |
  No | 752 (44.8%) | 371 (53.8%) |
Hyperlipidemia | | | <0.001
  Yes | 1148 (68.4%) | 410 (59.4%) |
  No | 531 (31.6%) | 280 (40.6%) |
Diabetes | | | 0.02
  Yes | 402 (23.9%) | 198 (28.7%) |
  No | 1277 (76.1%) | 492 (71.3%) |
COPD | | | 0.02
  Yes | 402 (23.9%) | 198 (28.7%) |
  No | 1277 (76.1%) | 492 (71.3%) |
Coronary heart disease | | | 0.34
  Yes | 620 (36.9%) | 270 (39.1%) |
  No | 1059 (63.1%) | 420 (60.9%) |
Atrial fibrillation | | | 0.004
  Yes | 528 (31.4%) | 312 (45.2%) |
  No | 1151 (68.6%) | 378 (54.8%) |
Heart rate (beats/min) | 77.9±14.0 | 85.2±15.8 | <0.001
Diastolic blood pressure (mmHg) | 68.7±13.1 | 66.9±12.1 | 0.002
Systolic blood pressure (mmHg) | 130.6±18.7 | 130.2±20.3 | 0.67
Respiratory rate (breaths/min) | 18.5±3.0 | 20.2±3.8 | <0.001
Temperature (℃) | 36.8±0.4 | 36.9±0.5 | 0.03
Oxygen saturation (%) | 96.8±1.7 | 97.2±2.0 | 0.003
Glucose (mg/dL) | 147.9±77.9 | 174.3±96.1 | <0.001
WBC (10^9/L) | 11.8±5.5 | 13.3±6.3 | <0.001
Blood urea nitrogen (mg/dL) | 22.2±15.1 | 32.3±24.2 | <0.001
Serum creatinine (mg/dL) | 1.2±1.2 | 1.5±1.2 | <0.001
Sodium (mmol/L) | 140.2±3.6 | 141.0±5.5 | 0.003
International Normalized Ratio | 1.4±0.7 | 1.5±1.0 | 0.002
Partial thromboplastin time (s) | 36.9±19.4 | 36.8±17.3 | 0.9
Erythrocyte specific volume (%) | 34.2±7.0 | 33.1±6.4 | <0.001
Platelets (10^9/L) | 204.3±79.5 | 212.9±94.0 | 0.04
Anion gap (mmol/L) | 13.1±2.9 | 14.3±3.3 | <0.001
Bicarbonate (mmol/L) | 22.5±3.4 | 21.7±4.5 | <0.001
Calcium (mmol/L) | 8.5±0.8 | 8.4±0.7 | 0.07
Chloride (mmol/L) | 102.9±4.9 | 102.1±5.9 | 0.004
Potassium (mmol/L) | 3.9±0.5 | 3.9±0.6 | 0.27
SOFA score | 3.4±2.6 | 5.5±3.4 | <0.001
COPD: Chronic obstructive pulmonary disease; WBC: White blood cell.
2.2. Performance of the machine learning models
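All variables were entered into the machine learning models. After hyperparameter tuning, the best-performing model for each of the 4 methods was selected and applied to the test set, where the EBM model performed best. The AUC values, in descending order, were EBM (0.857), RF (0.838), LR (0.807), and Naive Bayes (0.785). By the DeLong test, only the difference in AUC between EBM and Naive Bayes was statistically significant (P<0.05); none of the other pairwise differences was significant. The Brier scores, in ascending order, were EBM (0.135), RF (0.148), LR (0.158), and Naive Bayes (0.200). The calibration plots of the 4 prediction models show that EBM was the best calibrated, with the smallest slope and intercept, and that Naive Bayes was the worst. The decision curve analysis of the 4 prediction models shows that for probability thresholds from 0.10 to 0.80, the net benefit of EBM was higher than that of the other models.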
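The Brier score and the net benefit used in decision curve analysis both follow from short formulas. The sketch below is an illustration only, not the authors' code: the function names and the toy labels and probabilities are hypothetical. It assumes the usual definitions, namely the Brier score as the mean squared difference between predicted probability and the 0/1 outcome, and net benefit at threshold pt as TP/n - FP/n × pt/(1 - pt).

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one probability threshold: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Patients with predicted risk at or above the threshold count as positive.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Hypothetical toy data (not from the study): 1 = died, 0 = survived.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]

print(round(brier_score(y_true, y_prob), 4))
print(round(net_benefit(y_true, y_prob, 0.5), 4))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds (for example 0.10 to 0.80) for each model and plotting the results against the threshold.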
Table 2
Performance of the 4 models on the test set
Indicator | LR | Naive Bayes | RF | EBM
AUC (95% CI) | 0.807 (0.773-0.836) | 0.785 (0.755-0.813) | 0.838 (0.810-0.870) | 0.857 (0.831-0.887)
Accuracy | 0.789 | 0.747 | 0.787 | 0.808
Precision | 0.671 | 0.580 | 0.734 | 0.733
Recall | 0.766 | 0.471 | 0.420 | 0.536
F1-score | 0.488 | 0.520 | 0.535 | 0.619
Brier score | 0.158 | 0.200 | 0.148 | 0.135
LR: Logistic regression; RF: Random forest; EBM: Explainable boosting machine; AUC: Area under the receiver operating characteristic curve.
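For readers less familiar with the threshold-based indicators in this table, the following minimal sketch shows how accuracy, precision, recall and F1-score are conventionally derived from a confusion matrix. The function name and the counts are hypothetical and are not taken from the study.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard threshold-based metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against division by zero when no positives are predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=130, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike AUC and the Brier score, these four indicators depend on the probability cut-off used to turn predicted risks into a yes/no classification.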