A Snapshot of SARS-CoV-2 Genome Availability up to April 2020 and its Implications: Data Analysis (Preprint)

by Carla Mavian, Simone Marini, Mattia Prosperi, Marco

Released as a post by JMIR Publications Inc.

2020  

Abstract

<sec> <title>BACKGROUND</title> The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has been growing exponentially, affecting over 4 million people and causing enormous distress to economies and societies worldwide. A plethora of analyses based on viral sequences has already been published both in scientific journals and through non–peer-reviewed channels to investigate the genetic heterogeneity and spatiotemporal dissemination of SARS-CoV-2. However, a systematic investigation of phylogenetic information and sampling bias in the available data is lacking. Although the number of available genome sequences of SARS-CoV-2 is growing daily and the sequences show increasing phylogenetic information, country-specific data still present severe limitations and should be interpreted with caution. </sec>
<sec> <title>OBJECTIVE</title> The objective of this study was to determine the quality of the currently available SARS-CoV-2 full genome data in terms of sampling bias as well as phylogenetic and temporal signals to inform and guide the scientific community. </sec>
<sec> <title>METHODS</title> We used maximum likelihood–based methods to assess the presence of sufficient information for robust phylogenetic and phylogeographic studies in several SARS-CoV-2 sequence alignments assembled from GISAID (Global Initiative on Sharing All Influenza Data) data released between March and April 2020. </sec>
<sec> <title>RESULTS</title> Although the number of high-quality full genomes is growing daily, and sequence data released in April 2020 contain sufficient phylogenetic information to allow reliable inference of phylogenetic relationships, country-specific SARS-CoV-2 data sets still present severe limitations. </sec>
<sec> <title>CONCLUSIONS</title> At the present time, studies assessing within-country spread or transmission clusters should be considered preliminary or hypothesis-generating at best. Hence, current reports should be interpreted with caution, and concerted efforts should continue to increase the number and quality of sequences required for robust tracing of the epidemic. </sec>
In application/xml+jats format

Archived Files and Locations

application/pdf  474.9 kB
file_eefyyqloejbhlikh5bxe2nf75m
publichealth.jmir.org (web)
web.archive.org (webarchive)
Type  post
Stage   unknown
Date   2020-04-06