Laboratory Medicine (检验医学), 2025, Vol. 40, Issue 7: 680-686.

• Original Article •

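Application of a machine learning model based on routine inflammatory markers to distinguish the severity of community-acquired pneumonia

GUAN Chao¹, HUANG Ying², SONG Yunxiao³, ZHOU Ying²

  1. Department of Respiratory and Critical Care Medicine, Xuhui District Central Hospital, Shanghai 200031, China
  2. Department of General Medicine, Xuhui District Central Hospital, Shanghai 200031, China
  3. Department of Clinical Laboratory, Xuhui District Central Hospital, Shanghai 200031, China
  • Received: 2024-12-11 Revised: 2025-04-27 Online: 2025-07-30 Published: 2025-07-28
  • Corresponding author: ZHOU Ying, E-mail: yingzhouwui@hotmail.com
  • About the authors:
    GUAN Chao, female, born in 1986, master's degree, attending physician; mainly engaged in the diagnosis and treatment of respiratory diseases.
    HUANG Ying, female, born in 1976, master's degree, associate chief physician; mainly engaged in the diagnosis and treatment of general practice conditions.
    GUAN Chao and HUANG Ying contributed equally to this study and are co-first authors.
  • Funding:
    Xuhui District Medical Research Project, Shanghai (SHXH202201); Project funded by the Xuhui District Health Commission, Shanghai (SHXHZDXK202323)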

Abstract:

Objective To develop and validate a machine learning model that distinguishes mild from severe community-acquired pneumonia (CAP) using 11 routine clinical inflammatory markers. Methods A total of 3,674 patients newly diagnosed with CAP at Xuhui District Central Hospital between January 2016 and November 2024 were retrospectively enrolled. Patients were assigned by time of diagnosis to a training cohort (January 2016 to December 2021; 1,363 mild and 1,320 severe cases) or a validation cohort (January 2022 to November 2024; 563 mild and 428 severe cases). Clinical data and measurements of 11 inflammatory markers were collected for each patient. Six machine learning algorithms, namely decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost), were applied to the training cohort to build models distinguishing mild from severe CAP. The optimal model was selected by the area under the receiver operating characteristic (ROC) curve (AUC) and then evaluated in the validation cohort. Results In both the training and validation cohorts, the mild and severe groups differed significantly in white blood cell (WBC) count, platelet (PLT) count, absolute neutrophil count (NEUT#), absolute lymphocyte count (LYMPH#), absolute monocyte count (MO#), neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), lymphocyte-to-monocyte ratio (LMR), systemic immune-inflammation index (SII), C-reactive protein (CRP) and procalcitonin (PCT) (P<0.001); no statistically significant differences were found for the other indicators (P>0.05). In the training cohort, XGBoost performed best among the 6 models, with an AUC of 0.95 and an accuracy of 89%. In the validation cohort, XGBoost achieved an AUC of 0.91, an accuracy of 86%, a sensitivity of 81% and a specificity of 90% for distinguishing mild from severe CAP. In subgroup analysis, its AUC was 0.92 for bacterial CAP and 0.90 for viral CAP. Conclusions The XGBoost model based on routine inflammatory markers can effectively differentiate mild from severe CAP, offering a practical tool for clinical severity assessment.
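
Two pieces of arithmetic in the abstract can be made concrete with a short Python sketch. The abstract names NLR, PLR, LMR and SII without giving formulas, so the code assumes the commonly used definitions (in particular SII = PLT × NEUT# / LYMPH#) and counts expressed in 10^9/L. The `BloodCounts` class, the function names and the example patient values are hypothetical and are not taken from the paper. The second function checks whether the reported validation accuracy is consistent with the reported sensitivity, specificity and cohort sizes, assuming severe CAP is the positive class.

```python
from dataclasses import dataclass


@dataclass
class BloodCounts:
    """Complete blood count values; units assumed to be 10^9 cells/L."""
    neut: float   # NEUT#, absolute neutrophil count
    lymph: float  # LYMPH#, absolute lymphocyte count
    mono: float   # MO#, absolute monocyte count
    plt: float    # PLT, platelet count


def derived_indices(c: BloodCounts) -> dict[str, float]:
    """Ratio-based markers using the commonly cited definitions (assumed here)."""
    if c.lymph <= 0 or c.mono <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": c.neut / c.lymph,          # neutrophil-to-lymphocyte ratio
        "PLR": c.plt / c.lymph,           # platelet-to-lymphocyte ratio
        "LMR": c.lymph / c.mono,          # lymphocyte-to-monocyte ratio
        "SII": c.plt * c.neut / c.lymph,  # systemic immune-inflammation index
    }


def implied_accuracy(sensitivity: float, specificity: float,
                     n_positive: int, n_negative: int) -> float:
    """Accuracy implied by sensitivity, specificity and the class sizes."""
    true_pos = sensitivity * n_positive   # expected correctly flagged positives
    true_neg = specificity * n_negative   # expected correctly cleared negatives
    return (true_pos + true_neg) / (n_positive + n_negative)


# Hypothetical patient values, for illustration only.
example = BloodCounts(neut=8.0, lymph=1.0, mono=0.5, plt=250.0)
indices = derived_indices(example)

# Validation cohort figures from the abstract: 428 severe (treated as
# positive, an assumption) and 563 mild cases, sensitivity 81%, specificity 90%.
acc = implied_accuracy(0.81, 0.90, n_positive=428, n_negative=563)

print(indices)
print(f"implied accuracy: {acc:.3f}")
```

Under these assumptions the implied accuracy is about 0.861, which agrees with the 86% reported for the validation cohort.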

Key words: Inflammatory marker, Community-acquired pneumonia, Disease severity, Machine learning, Model

CLC number: