TandemMapper and TandemQUAST: mapping long reads and assessing/improving assembly quality in extra-long tandem repeats
release_ja7qqkej7jfz3a2lfktusg4un4
by
Alla Mikheenko, Andrey V Bzikadze, Alexey Gurevich, Karen H Miga, Pavel A Pevzner
2019
Abstract
Extra-long tandem repeats (ETRs) are widespread in eukaryotic genomes and play an important role in fundamental cellular processes, such as chromosome segregation. Although emerging long-read technologies have enabled ETR assemblies, the accuracy of such assemblies is difficult to evaluate because there is no standard tool for their quality assessment. Moreover, because mapping long error-prone reads to ETRs remains an open problem, it is not clear how to polish draft ETR assemblies. To address these problems, we developed the tandemMapper tool for mapping reads to ETRs and the tandemQUAST tool for assessing the quality of ETR assemblies and polishing them. We demonstrate that tandemQUAST not only reveals errors in ETR assemblies and evaluates their quality, but also improves them. To illustrate how tandemMapper and tandemQUAST work, we apply them to recently generated assemblies of human centromeres.
Archived Files and Locations
application/pdf 3.2 MB
file_uldmo5xmxzftvcdjbdtla4xdai
www.biorxiv.org (web), web.archive.org (webarchive)
post
Stage unknown
Date 2019-12-23