Gryffin: An algorithm for Bayesian optimization for categorical
variables informed by physical intuition with applications to chemistry
by
Florian Häse, Loïc M. Roch, Alán Aspuru-Guzik
2020
Abstract
Designing functional molecules and advanced materials requires complex
interdependent design choices: tuning continuous process parameters such as
temperatures or flow rates, while simultaneously selecting categorical
variables like catalysts or solvents. To date, the development of data-driven
experiment planning strategies for autonomous experimentation has largely
focused on continuous process parameters, despite the pressing need for
efficient strategies for selecting categorical variables to substantially
accelerate scientific discovery. We introduce Gryffin, a general-purpose
optimization framework for the autonomous selection of categorical variables
driven by expert knowledge. Gryffin augments Bayesian optimization with kernel
density estimation using smooth approximations to categorical distributions.
Leveraging domain knowledge from physicochemical descriptors to characterize
categorical options, Gryffin can significantly accelerate the search for
promising molecules and materials. Gryffin can further highlight relevant
correlations between the provided descriptors to inspire physical insights and
foster scientific intuition. In addition to comprehensive benchmarks, we
demonstrate the capabilities and performance of Gryffin on three examples in
materials science and chemistry: (i) the discovery of non-fullerene acceptors
for organic solar cells, (ii) the design of hybrid organic-inorganic
perovskites for light-harvesting, and (iii) the identification of ligands and
process parameters for Suzuki-Miyaura reactions. Our observations suggest that
Gryffin, in its simplest form without descriptors, constitutes a competitive
categorical optimizer compared to state-of-the-art approaches. However, when
leveraging domain knowledge provided via descriptors, Gryffin can optimize at
considerably higher rates and refine this domain knowledge to spark scientific
understanding.
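
The abstract mentions "smooth approximations to categorical distributions" without spelling them out. As a minimal sketch of what such an approximation can look like, the Python below uses the Gumbel-softmax (concrete) relaxation, a widely used choice. Whether this matches Gryffin's exact formulation is an assumption here, and the function name, option names, and probabilities are hypothetical and chosen only for illustration.

```python
import math
import random


def gumbel_softmax_sample(probs, temperature, rng):
    """Draw one relaxed (continuous) one-hot sample from a categorical distribution.

    probs: category probabilities, each > 0, summing to 1.
    temperature: > 0; lower values push samples toward a one-hot vector.
    rng: a random.Random instance, passed in for reproducibility.
    """
    scaled = []
    for p in probs:
        u = rng.random()
        # Clamp u away from 0 and 1 so the double logarithm stays finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        scaled.append((math.log(p) + gumbel_noise) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
options = ["ligand_A", "ligand_B", "ligand_C"]  # hypothetical categorical options
probs = [0.2, 0.5, 0.3]                         # hypothetical selection probabilities

warm = gumbel_softmax_sample(probs, 5.0, rng)   # high temperature: smoother weights
cold = gumbel_softmax_sample(probs, 0.05, rng)  # low temperature: closer to one-hot
print(dict(zip(options, warm)))
print(dict(zip(options, cold)))
```

Each sample is a point on the probability simplex rather than a hard choice, so a density model defined on continuous inputs can be evaluated on it. Lowering the temperature moves samples toward the corners of the simplex, which correspond to the discrete options.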
Archived Files and Locations
application/pdf, 9.2 MB: arxiv.org (repository), web.archive.org (webarchive), version 2003.12127v1