Laboratory Medicine ›› 2024, Vol. 39 ›› Issue (7): 668-672. DOI: 10.3969/j.issn.1673-8640.2024.07.009

Pulmonary tuberculosis diagnosis model for blood routine test based on machine learning algorithms

HUANG Ying1, ZHOU Ying1, SONG Yunxiao2, MAO Junjie1, GUAN Chao1, ZHAO Jinyan1, NI Peiqing1

  1. Department of General Practice, Shanghai Xuhui Central Hospital, Shanghai 200237, China
  2. Department of Clinical Laboratory, Shanghai Xuhui Central Hospital, Shanghai 200237, China
  • Received: 2024-01-06 Revised: 2024-04-16 Online: 2024-07-30 Published: 2024-07-31

Abstract:

Objective To construct a pulmonary tuberculosis diagnosis model based on routine blood test results using machine learning algorithms, and to analyze its clinical application value. Methods A total of 469 newly diagnosed patients with pulmonary tuberculosis (pulmonary tuberculosis group) at Shanghai Xuhui Central Hospital from January 2019 to December 2022 were enrolled, and 506 age- and sex-matched healthy subjects were enrolled as the healthy control group. Data on 22 routine blood test items and demographic parameters were collected for all subjects. Collinearity was analyzed by LASSO regression analysis. The data set was randomly divided into 2 parts: 75% was used as the training set for model construction, and 25% was used as the test set for performance evaluation. Four machine learning algorithms, distributed random forest (DRF), deep learning, gradient boosting machine and generalized linear model, were evaluated, and the diagnostic efficiency of the model was verified by 5-fold cross-validation. The diagnostic performance of the model was evaluated by receiver operating characteristic (ROC) curve. Results Based on logistic regression analysis and LASSO regression analysis, 10 non-collinear indicators were selected. DRF was the optimal machine learning algorithm for constructing the pulmonary tuberculosis diagnosis model. In the training set and test set, the areas under the curve of the DRF model were 0.9921 and 0.8474, the sensitivities were 99.16% and 92.04%, the specificities were 80.91% and 55.22%, and the accuracies were 89.84% and 72.06%, respectively. Conclusions The pulmonary tuberculosis diagnosis model based on machine learning algorithms is an effective diagnostic tool, but its clinical application value needs further verification.
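
The sensitivity, specificity and accuracy reported above all follow from a 2×2 confusion matrix. The sketch below shows the arithmetic in Python. The abstract does not report the underlying counts, so the test-set counts used here (104 true positives, 9 false negatives, 74 true negatives, 60 false positives) are an assumption, chosen only because they reproduce the reported test-set percentages. The function name `diagnostic_metrics` is likewise illustrative and not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as fractions.

    tp/fn: tuberculosis patients classified correctly/incorrectly.
    tn/fp: healthy controls classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # overall agreement
    return sensitivity, specificity, accuracy


# ASSUMED counts: not reported in the abstract, picked to be consistent
# with the published test-set percentages.
test_counts = {"tp": 104, "fn": 9, "tn": 74, "fp": 60}

sens, spec, acc = diagnostic_metrics(**test_counts)
print(f"sensitivity={sens:.2%} specificity={spec:.2%} accuracy={acc:.2%}")
# prints: sensitivity=92.04% specificity=55.22% accuracy=72.06%
```

Under these assumed counts, 60 of 134 healthy controls in the test set would be flagged as positive, which illustrates why a specificity of 55.22% calls for the further verification noted in the conclusions.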

Key words: Machine learning, Diagnostic model, Pulmonary tuberculosis, Blood routine test
