Statistical Machine Learning for Modeling Early Respiratory Microbiota Composition
Update: 2014-03-31
Description
Co-authors: Giske Biesbroek (UMC Utrecht), Elisabeth A.M. Sanders (UMC Utrecht), Roy Montijn (TNO Research Institute), Reinier H. Veenhoven (Research Center Linnaeus Institute), Bart J.F. Keijser (TNO Research Institute), Debby Bogaert (UMC Utrecht)
Many bacterial pathogens that cause respiratory infections in children are common residents of the respiratory tract. Insight into bacterial colonization patterns and their stability at a young age may allow identification of biomarker strains that indicate healthy conditions or susceptibility to respiratory disease.
We used statistical machine learning algorithms to analyze complex nasopharyngeal microbiota profiles of 60 healthy children at the ages of 6 weeks and 6, 12, and 24 months. Our unsupervised and semi-supervised learning methods are particularly suitable for high-dimensional metagenomic datasets. The methods stem from a recently proposed class of multi-view algorithms (closely related to ensemble and consensus techniques) that combine multiple clustering hypotheses for increased accuracy and are not limited to a single similarity measure, which makes the results more robust. Furthermore, our algorithms allow identification of the optimal number of clusters via construction of co-occurrence matrices, and detection of biomarker species using an unsupervised greedy forward feature selection approach.
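The co-occurrence idea mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the base clustering algorithm or the similarity measures, so plain k-means on toy two-dimensional points is assumed here, and the names kmeans_labels, co_occurrence, and toy_profiles are hypothetical. Each run of the base clusterer yields one clustering hypothesis, and the matrix records how often each pair of samples ends up in the same cluster across all runs.

```python
import random


def kmeans_labels(points, k, seed, iters=20):
    """Plain k-means (Lloyd's algorithm); returns one cluster label per point.

    Assumption: k-means stands in for whatever base clusterer is actually used.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct points as initial centers
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; empty clusters keep their center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def co_occurrence(points, ks, seeds):
    """Fraction of clustering runs in which each pair of points shares a cluster."""
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    runs = 0
    for k in ks:
        for seed in seeds:
            labels = kmeans_labels(points, k, seed)
            runs += 1
            for i in range(n):
                for j in range(n):
                    if labels[i] == labels[j]:
                        counts[i][j] += 1
    return [[count / runs for count in row] for row in counts]


# Toy data (hypothetical): two well-separated groups of four samples each.
toy_profiles = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (50.0, 50.0), (50.0, 51.0), (51.0, 50.0), (51.0, 51.0),
]
matrix = co_occurrence(toy_profiles, ks=[2, 3], seeds=range(5))
for row in matrix:
    print(" ".join(f"{value:.1f}" for value in row))
```

Entries close to 1 mark pairs of samples that are grouped together under most hypotheses, while entries close to 0 mark pairs that are almost never grouped together. A matrix with clear block structure suggests a stable partition, which is one way such a matrix could inform the choice of the number of clusters.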
We identified 6 distinct microbiota profiles represented by the dominant genera Moraxella, Haemophilus, Streptococcus, or Staphylococcus, a combination of Dolosigranulum and Corynebacterium, plus cluster-specific low-abundance biomarker bacteria. The current study gave us insight into the dynamic nature of nasopharyngeal microbiota in infants. Our results suggest that the composition of early-life microbiota is associated with long-term stability and may predict susceptibility to disease.