Multivariate classification models to diagnose tuberculosis/non-tuberculosis patients

Short description:

The correct diagnosis of tuberculosis patients is of paramount importance. The main reason is that traditional methods (i.e. Ziehl-Neelsen staining and microscopy) take too long (~3 weeks), whereas treatment should be administered to the patient as soon as possible (~1 day). A typical shortcut to avoid the traditional methods is genotype analysis of the mycobacterium (via PCR). However, these methods are labour-intensive and require well-trained personnel.

Our proposal in this project is to use pyrolysis-GC-MS to analyse a sputum sample from the patient. In principle, we should be able to identify the biomarkers that are specific to tuberculosis mycobacteria, and use this information to classify patients as tuberculosis or non-tuberculosis. The project involved a considerable amount of work on data pre-treatment (mainly alignment and standardization). In a first attempt, univariate analysis failed to pick up the right biomarkers. However, Partial Least Squares Discriminant Analysis (PLS-DA) was successful in detecting the right biomarkers and in building a discrimination method. In a second step we were able to translate the biomarker features into chemical components, so the method can be easily transferred to other instruments and/or operators. A first paper describing the proposed methodology can be found here [38a]. A validation of this method (using samples from different countries) can be found here [39a]. In the latter case, a tree model was shown to outperform the traditional PLS-DA model while also being simpler. The method works for mycobacterium samples, but it still has to be adapted for application to sputum samples.


This project was developed at several institutions and involved several people. See the co-authorship of the presentations for more information.


University of Amsterdam.


None available


None available