Laboratory Medicine ›› 2025, Vol. 40 ›› Issue (7): 680-686.

Application of a machine learning model based on routine inflammatory markers to distinguish the severity of community-acquired pneumonia

GUAN Chao1, HUANG Ying2, SONG Yunxiao3, ZHOU Ying2

    1. Department of Respiratory and Critical Care Medicine, Xuhui District Central Hospital, Shanghai 200031, China
    2. Department of General Medicine, Xuhui District Central Hospital, Shanghai 200031, China
    3. Department of Clinical Laboratory, Xuhui District Central Hospital, Shanghai 200031, China
  • Received:2024-12-11 Revised:2025-04-27 Online:2025-07-30 Published:2025-07-28

Abstract:

Objective To develop and validate a machine learning model for assessing the severity of community-acquired pneumonia (CAP) based on 11 routinely used clinical inflammatory markers. Methods A total of 3 674 patients with newly diagnosed CAP at Xuhui District Central Hospital between January 2016 and November 2024 were retrospectively enrolled. Patients were divided by date of diagnosis into a training cohort (January 2016-December 2021, 1 363 mild cases and 1 320 severe cases) and a validation cohort (January 2022-November 2024, 563 mild cases and 428 severe cases). Clinical data and the results of the 11 inflammatory markers were collected for each patient. Six machine learning algorithms, namely decision tree (DT), K-nearest neighbors (KNN), logistic regression (LR), random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost), were applied to the training cohort to construct models for distinguishing mild from severe CAP. The optimal model was selected by the area under the receiver operating characteristic (ROC) curve (AUC) and was then evaluated in the validation cohort. Results In both the training and validation cohorts, the mild and severe CAP groups differed significantly in white blood cell (WBC) count, platelet (PLT) count, absolute neutrophil count (NEUT#), absolute lymphocyte count (LYMPH#), absolute monocyte count (MO#), neutrophil-to-lymphocyte ratio (NLR), platelet-to-lymphocyte ratio (PLR), lymphocyte-to-monocyte ratio (LMR), systemic immune-inflammation index (SII), C-reactive protein (CRP) and procalcitonin (PCT) (P<0.001). No statistically significant differences were found for the other indicators (P>0.05). Among the 6 models, XGBoost performed best in the training cohort, with an AUC of 0.95 and an accuracy of 89%. In the validation cohort, XGBoost distinguished mild from severe CAP with an AUC of 0.91, an accuracy of 86%, a sensitivity of 81% and a specificity of 90%. In subgroup analysis, XGBoost achieved an AUC of 0.92 for distinguishing severity in bacterial CAP and 0.90 in viral CAP. Conclusions The XGBoost model based on routine inflammatory markers can effectively differentiate mild from severe CAP, offering a practical tool for clinical severity assessment.
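
As an illustration of the markers and metrics named in the abstract, the following minimal Python sketch computes the four derived indices (NLR, PLR, LMR, SII) from absolute cell counts and checks that the reported validation accuracy is consistent with the reported sensitivity, specificity and cohort sizes. The abstract does not state the formulas; the sketch assumes the commonly used definitions (SII = PLT × NEUT# / LYMPH#), assumes severe CAP is the positive class, and uses hypothetical function names and a hypothetical example patient. It is not the authors' code.

```python
def derived_indices(neut, lymph, mono, plt):
    """Derive NLR, PLR, LMR and SII from absolute counts (all in 10^9/L).

    Assumed (commonly used) definitions, not taken from the article:
    NLR = NEUT#/LYMPH#, PLR = PLT/LYMPH#, LMR = LYMPH#/MO#,
    SII = PLT * NEUT# / LYMPH#.
    """
    if lymph <= 0 or mono <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": neut / lymph,
        "PLR": plt / lymph,
        "LMR": lymph / mono,
        "SII": plt * neut / lymph,
    }


def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Hypothetical patient, for illustration only (not data from the study).
example = derived_indices(neut=6.0, lymph=1.5, mono=0.5, plt=240.0)

# Validation cohort figures from the abstract: 428 severe (assumed positive
# class), 563 mild, sensitivity 81%, specificity 90%.
val_accuracy = accuracy_from_rates(0.81, 0.90, n_positive=428, n_negative=563)
print(example)
print(round(val_accuracy, 3))
```

Because the sensitivity and specificity in the abstract are rounded, the implied accuracy is only approximate; it comes out at about 0.86, in line with the reported 86%.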

Key words: Inflammatory marker, Community-acquired pneumonia, Disease severity, Machine learning, Model
