Laboratory Medicine, 2025, Vol. 40, Issue 12: 1190-1196. DOI: 10.3969/j.issn.1673-8640.2025.12.009

A machine learning early warning model for acute promyelocytic leukemia based on blood cell analyzer parameters

CHANG Nan, WEI Yali, LU Qifeng, LI Tian, HOU Tingting, LI Yuan, ZHU Mengyu, SHEN Yajuan

  1. Department of Clinical Laboratory, Shandong Provincial Hospital, Shandong First Medical University, Jinan 250021, Shandong, China
  • Received: 2025-05-06 Revised: 2025-10-29 Online: 2025-12-30 Published: 2025-12-26
  • Contact: SHEN Yajuan

Abstract:

Objective To construct a machine learning (ML) early warning model for acute promyelocytic leukemia (APL) based on blood cell analyzer parameters, and to characterize the parameters specific to APL, providing a reference for its early clinical diagnosis. Methods Blood cell analyzer data were collected from 958 patients with APL, other acute myeloid leukemia (AML) and lymphoid tissue tumors, as well as from 985 healthy subjects, at Shandong Provincial Hospital of Shandong First Medical University between January 2018 and September 2024. Data from January 2018 to December 2023 formed the modeling set, and data from January 2024 to September 2024 formed the external validation set. Three algorithms (Lasso, support vector machine recursive feature elimination and random forest) were used to screen the modeling parameters. Five algorithms (light gradient boosting machine, support vector machine, multi-layer perceptron, multinomial logistic regression and random forest) were used to construct the APL early warning model. Receiver operating characteristic (ROC) curves were plotted, and the model with the highest area under the curve (AUC) was selected. Ten-fold cross-validation and external validation were performed. The contribution of each model feature was explained using SHapley Additive exPlanations (SHAP). Results Among the 5 ML models, the APL early warning model constructed with the light gradient boosting machine had the best overall performance in the test set (AUC of 0.974). For predicting APL in the test set, its AUC was 0.976 and its area under the precision-recall curve (PR-AUC) was 0.906. In ten-fold cross-validation, the AUC and PR-AUC were both >0.835. In the external validation set, the AUC for predicting APL was 0.969 and the PR-AUC was 0.718. The SHAP analysis showed that platelet (PLT) count, side scatter intensity in the monocyte region (MO_X) and mean corpuscular hemoglobin concentration (MCHC) contributed most to the APL early warning model. Conclusions The APL early warning model based on blood cell analyzer parameters and the light gradient boosting machine algorithm can effectively support the early identification and diagnosis of APL.
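
The abstract reports two discrimination metrics for the APL class, AUC and PR-AUC. As a minimal sketch of how such values can be computed, the Python below treats APL as the positive class against all other samples (a one-vs-rest reading that is an assumption here, not a detail stated in the abstract). The function names, the toy labels and the toy scores are hypothetical and are not study data; PR-AUC is approximated by average precision, which may differ from the estimator the authors used.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random positive sample
    receives a higher score than a random negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve.
    Assumes no positive sample shares a score with a negative sample."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive sample")
    # Rank samples from highest to lowest predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = APL, 0 = any other class (not from the study).
labels: List[int] = [1, 1, 0, 1, 0, 0]
scores: List[float] = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Same positives, four times as many negatives with the same score pattern.
labels_rare = labels + [0, 0, 0] * 3
scores_rare = scores + [0.7, 0.4, 0.2] * 3

auc_balanced = roc_auc(labels, scores)
auc_rare = roc_auc(labels_rare, scores_rare)
ap_balanced = average_precision(labels, scores)
ap_rare = average_precision(labels_rare, scores_rare)

print(f"AUC: {auc_balanced:.3f} vs {auc_rare:.3f}")
print(f"PR-AUC: {ap_balanced:.3f} vs {ap_rare:.3f}")
```

In this toy setting the ROC AUC is unchanged when negatives are added with the same score pattern, while average precision falls, because precision depends on the prevalence of the positive class. This is a general property of the two metrics and is shown only for orientation; the abstract does not state what drives the difference between its test-set and external-validation PR-AUC values.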

Key words: Blood cell analyzer parameter, Acute promyelocytic leukemia, Machine learning, Early warning model, Hematologic malignancy
