SIAMCAT: user-friendly and versatile machine learning workflows for statistically rigorous microbiome analyses
by
Jakob Wirbel, Konrad Zych, Morgan Essex, Nicolai Karcher, Ece Kartal, Guillem Salazar, Peer Bork, Shinichi Sunagawa, Georg Zeller
2020
Abstract
The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers. However, computational tools tailored to such analyses are still scarce. Here, we present the SIAMCAT R package, a versatile and user-friendly toolbox for comparative metagenome analyses using machine learning (ML), statistical tests, and visualization. Based on a large meta-analysis of gut microbiome studies, we optimized the choice of ML algorithms and preprocessing routines for default workflow settings. Furthermore, we illustrate common pitfalls leading to overfitting and show how SIAMCAT safeguards against this to make statistically rigorous ML workflows broadly accessible. SIAMCAT is available from siamcat.embl.de and Bioconductor.
Archived Files and Locations
application/pdf, 4.1 MB (file_flwul2zyonbdhp2a5r22lvmtfm): www.biorxiv.org (repository), web.archive.org (webarchive)
application/pdf, 3.8 MB (file_qw3vhv7sarbrpk4j5hxbph643i): www.biorxiv.org (repository), web.archive.org (webarchive)
post
Stage: unknown
Date: 2020-02-06