RESEARCH PRODUCT

Determining gastrointestinal tract dysbiosis using machine learning techniques

Sindre Lindahl

subject

IKT 590
VDP::Technology: 500::Information and communication technology: 550

description

Master's thesis in information and communication technology - Universitetet i Agder, 2015

This thesis explores machine learning techniques for determining gastrointestinal tract dysbiosis. Dysbiosis is an imbalance of the bacterial flora. Stool sample analysis of relevant bacteria can be used in the "diagnosis" of this condition. The problem is how best to distinguish dysbiosis from a healthy balance of bacteria. Pattern recognition methods could be used to create a diagnostic decision support system. The approach compares classifiers, with and without feature reduction techniques. Experiments show that accuracy varies significantly depending on which classifier is used. The best classifier for the data set used here was found to be the C4.5 decision tree. Much of the analyzed data is shown to be noisy, confusing and irrelevant to the classifier. Accuracy can be improved by reducing the number of bacterial species by more than 90%. In addition, the results imply that differences between microbial stool analysis panels seriously affect accuracy. The choice of classifier and the highly relevant feature subsets found should be helpful for future work in the field of gut dysbiosis. The comparisons could also be applicable to the classification of similar data sets.

http://hdl.handle.net/11250/299472