Information Newton's flow: second-order optimization method in probability space

by Yifei Wang, Wuchen Li

Released as an article.

2020  

Abstract

We introduce a framework for Newton's flows in probability space with information metrics, named information Newton's flows. Two information metrics are considered: the Fisher-Rao metric and the Wasserstein-2 metric. Several examples of information Newton's flows for learning objective/loss functions are provided, including the Kullback-Leibler (KL) divergence, maximum mean discrepancy (MMD), and cross entropy. Asymptotic convergence results for the proposed Newton's methods are provided. It is a known fact that overdamped Langevin dynamics correspond to Wasserstein gradient flows of the KL divergence. Extending this fact to Wasserstein Newton's flows of the KL divergence, we derive Newton's Langevin dynamics. We provide examples of Newton's Langevin dynamics in both one-dimensional space and Gaussian families. For the numerical implementation, we design sampling-efficient variational methods to approximate Wasserstein Newton's directions. Numerical examples in Gaussian families and Bayesian logistic regression demonstrate the effectiveness of the proposed method.
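As a minimal sketch of the known fact cited in the abstract (standard background on Langevin dynamics, with notation assumed here rather than taken from the paper): for a target density \pi \propto e^{-f}, the overdamped Langevin dynamics are

    dX_t = -\nabla f(X_t)\, dt + \sqrt{2}\, dB_t,

and the law \rho_t of X_t evolves by the Fokker-Planck equation

    \partial_t \rho_t = \nabla \cdot (\rho_t \nabla f) + \Delta \rho_t = \nabla \cdot \Big( \rho_t \, \nabla \tfrac{\delta}{\delta \rho} \mathrm{KL}(\rho_t \,\|\, \pi) \Big),

which is exactly the Wasserstein-2 gradient flow of \mathrm{KL}(\rho \,\|\, \pi). The paper's Newton's Langevin dynamics are obtained by extending this correspondence from the Wasserstein gradient flow to the Wasserstein Newton's flow of the KL divergence.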

Archived Files and Locations

application/pdf  3.1 MB
file_li4ajrk6czaopjoj4es2txyubi
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-01-16
Version   v2
Language   en
arXiv  2001.04341v2
Work Entity
Access all versions, variants, and formats of this work (e.g., pre-prints).
Catalog Record
Revision: 6eb4b6a3-834c-4b66-b6cd-08c74844f2ff