Fast and scalable learning of neuro-symbolic representations of biomedical knowledge

by Asan Agibetov, Matthias Samwald

Released as an article.

2018  

Abstract

In this work we address the problem of fast and scalable learning of neuro-symbolic representations for general biological knowledge. Building on a recently published comprehensive biological knowledge graph (Alshahrani, 2017) that was used to demonstrate neuro-symbolic representation learning, we show how to train log-linear neural embeddings of the entities quickly (in under 1 minute). We use these representations as inputs for machine learning classifiers to enable important tasks such as biological link prediction. Each entity relation is represented by concatenating the learned embeddings of its two entities, and classifiers are trained on these concatenated embeddings to discern true relations from automatically generated negative examples. Our simple embedding methodology greatly reduces classification error compared to previously published state-of-the-art results, yielding maximum gains of +0.28 in F-measure and +0.22 in ROC AUC for the most difficult biological link prediction problem. Finally, our embedding approach is orders of magnitude faster to train (≤ 1 minute vs. hours), much more economical in terms of embedding dimensions (d=50 vs. d=512), and naturally encodes the directionality of asymmetric biological relations, which can be controlled by the order in which we concatenate the embeddings.
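The classification step described in the abstract (concatenate two entity embeddings, then separate true relations from generated negatives) can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the entity names, the random vectors standing in for learned embeddings, the tail-corruption negative sampler, and the plain logistic regression are all hypothetical choices made here. Only the dimension d=50 and the idea of order-sensitive concatenation come from the abstract.

```python
import math
import random

D = 50  # embedding dimension reported in the abstract


def concat_features(emb, head, tail):
    """Represent the directed relation head -> tail as [emb(head); emb(tail)].

    Swapping the arguments gives a different vector, which is how the
    concatenation order can encode the direction of an asymmetric relation.
    """
    return emb[head] + emb[tail]  # list concatenation, length 2 * D


def sample_negatives(positives, entities, rng):
    """Assumed negative sampler: corrupt the tail of each true pair."""
    known = set(positives)
    negatives = []
    for head, _ in positives:
        candidates = [t for t in entities
                      if t != head and (head, t) not in known]
        negatives.append((head, rng.choice(candidates)))
    return negatives


def predict(w, b, x):
    """Logistic-regression probability that x is a true relation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.05):
    """Plain stochastic gradient descent on the logistic loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            g = predict(w, b, x) - t  # gradient of the loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


rng = random.Random(0)

# Hypothetical entities; random vectors stand in for the learned embeddings.
entities = [f"e{i}" for i in range(10)]
emb = {e: [rng.gauss(0.0, 1.0) for _ in range(D)] for e in entities}

# Hypothetical true relations and automatically generated negatives.
positives = [("e0", "e1"), ("e2", "e3"), ("e4", "e5"), ("e6", "e7")]
negatives = sample_negatives(positives, entities, rng)

X = [concat_features(emb, h, t) for h, t in positives + negatives]
y = [1] * len(positives) + [0] * len(negatives)

w, b = train_logreg(X, y)
score = predict(w, b, concat_features(emb, "e0", "e1"))
```

In practice the random vectors would be replaced by embeddings trained on the knowledge graph, and any off-the-shelf binary classifier could take the place of the hand-written logistic regression.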

Archived Files and Locations

application/pdf  367.1 kB
file_x6jxbs7yx5ej7ilmjzjrezkpd4
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2018-04-30
Version   v1
Language   en
arXiv  1804.11105v1