Correlation-based feature selection to identify functional dynamics in proteins
by
Georg Diez, Daniel Nagel, Gerhard Stock
2022
Abstract
To interpret molecular dynamics simulations of biomolecular systems,
systematic dimensionality reduction methods are commonly employed. These
include, among others, principal component analysis (PCA) and time-lagged
independent component analysis (TICA), which aim to maximize the variance and
the timescale of the first components, respectively. A crucial first step of
such an analysis is the identification of suitable and relevant input
coordinates (the so-called features), such as backbone dihedral angles and
interresidue distances. As typically only a small subset of those coordinates
is involved in a specific biomolecular process, it is important to discard the
remaining uncorrelated motions or weakly correlated noise coordinates. This is
because they may exhibit large amplitudes or long timescales and therefore will
erroneously be considered important by PCA and TICA, respectively. To
discriminate collective motions underlying functional dynamics from
uncorrelated motions, the correlation matrix of the input coordinates is
block-diagonalized by a clustering method. This strategy avoids possible bias
due to presumed functional observables and conformational states or variational
principles that maximize variance or timescales. Considering several linear and
nonlinear correlation measures and various clustering algorithms, it is shown
that the combination of linear correlation and the Leiden community detection
algorithm yields excellent results for all considered model systems. These
include the functional motion of T4 lysozyme to demonstrate the successful
identification of collective motion, as well as the folding of villin headpiece
to highlight the physical interpretation of the correlated motions in terms of
a functional mechanism.
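
The core idea of the abstract, grouping input coordinates by their mutual correlation rather than by amplitude or timescale, can be illustrated with a minimal sketch. It is not the authors' implementation: instead of the Leiden community detection algorithm it uses a much simpler stand-in, namely connected components of the graph in which two features are linked when their absolute Pearson correlation exceeds a cutoff. The toy data, the cutoff of 0.5, and all function names (`pearson`, `correlation_matrix`, `threshold_clusters`, `toy_features`) are assumptions made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(features):
    """Matrix of absolute linear correlations between all feature pairs."""
    n = len(features)
    return [[abs(pearson(features[i], features[j])) for j in range(n)]
            for i in range(n)]


def threshold_clusters(corr, cutoff):
    """Stand-in for Leiden: connected components of the graph that links
    features i and j whenever corr[i][j] >= cutoff (cutoff is an assumption)."""
    unvisited = set(range(len(corr)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            linked = [j for j in unvisited if corr[i][j] >= cutoff]
            for j in linked:
                unvisited.remove(j)
            stack.extend(linked)
            members.extend(linked)
        clusters.append(sorted(members))
    # Largest clusters first, mimicking a block-diagonal ordering.
    return sorted(clusters, key=len, reverse=True)


def toy_features(n_frames=2000, seed=0):
    """Hypothetical data: two collective motions plus one noise coordinate."""
    rng = random.Random(seed)
    motion_a = [rng.gauss(0, 1) for _ in range(n_frames)]
    motion_b = [rng.gauss(0, 1) for _ in range(n_frames)]
    feats = []
    # Features 0-2 follow motion A, features 3-4 follow motion B.
    feats += [[m + 0.3 * rng.gauss(0, 1) for m in motion_a] for _ in range(3)]
    feats += [[m + 0.3 * rng.gauss(0, 1) for m in motion_b] for _ in range(2)]
    # Feature 5 is uncorrelated noise with a large amplitude.
    feats.append([5.0 * rng.gauss(0, 1) for _ in range(n_frames)])
    return feats


features = toy_features()
corr = correlation_matrix(features)
clusters = threshold_clusters(corr, cutoff=0.5)
print(clusters)
```

In this toy setting the large-amplitude noise coordinate would dominate a variance-based ranking, yet it ends up isolated in its own cluster because it correlates with nothing else, whereas the two groups of mutually correlated features are recovered as separate blocks. A real analysis would replace the threshold step with a proper community detection method and choose its parameters carefully.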
arXiv: 2204.02770v1