SIAMCAT: user-friendly and versatile machine learning workflows for statistically rigorous microbiome analyses

by Jakob Wirbel, Konrad Zych, Morgan Essex, Nicolai Karcher, Ece Kartal, Guillem Salazar, Peer Bork, Shinichi Sunagawa, Georg Zeller

Released as a post by Cold Spring Harbor Laboratory.

2020  

Abstract

The human microbiome is increasingly mined for diagnostic and therapeutic biomarkers. However, computational tools tailored to such analyses are still scarce. Here, we present the SIAMCAT R package, a versatile and user-friendly toolbox for comparative metagenome analyses using machine learning (ML), statistical tests, and visualization. Based on a large meta-analysis of gut microbiome studies, we optimized the choice of ML algorithms and preprocessing routines for default workflow settings. Furthermore, we illustrate common pitfalls leading to overfitting and show how SIAMCAT safeguards against them to make statistically rigorous ML workflows broadly accessible. SIAMCAT is available from siamcat.embl.de and Bioconductor.

Type: post
Stage: unknown
Date: 2020-02-06