A supervised protein complex prediction method with network representation learning and gene ontology knowledge

by Xiaoxu Wang, Yijia Zhang, Peixuan Zhou, Xiaoxia Liu

Published in BMC Bioinformatics by Springer Science and Business Media LLC.

2022   Volume 23, Issue 1, p300

Abstract

<jats:title>Abstract</jats:title><jats:sec> <jats:title>Background</jats:title> Identifying protein complexes is essential for understanding cell organization and function. In recent years, predicting complexes from protein–protein interaction (PPI) networks by computational methods has become an active research area, and many prediction methods have been proposed. However, how to make use of the information in known protein complexes remains a fundamental open problem in protein complex prediction. </jats:sec><jats:sec> <jats:title>Results</jats:title> To address this problem, we propose a supervised learning method based on network representation learning and gene ontology knowledge, which makes full use of the information in known protein complexes to predict new ones. The method first constructs a weighted PPI network from gene ontology knowledge and topology information, which reduces noise in the network. On this basis, the topological information of known protein complexes is extracted as features, and a supervised learning model, SVCC, is trained on these features. The SVCC model is then used to predict candidate protein complexes from the protein interaction network. Next, we use a network representation learning method to obtain vector representations of protein complexes and train a random forest model. Finally, we use the random forest model to classify the candidate protein complexes and obtain the final predicted protein complexes. We evaluate the performance of the proposed method on two publicly available PPI data sets. </jats:sec><jats:sec> <jats:title>Conclusions</jats:title> Experimental results show that our method improves the performance of protein complex recognition compared with existing methods. In addition, we analyze the biological significance of the protein complexes predicted by our method and by other methods. The results show that the protein complexes predicted by our method have high biological significance. </jats:sec>
In application/xml+jats format
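
The first two steps described in the abstract (weighting the PPI network, then extracting topological features from a complex) can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula or feature set, so everything here is an assumption for illustration: the Jaccard-based similarities, the mixing parameter `alpha`, the function names, the three features, and the toy protein and GO labels are all hypothetical and are not the authors' implementation.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_ppi_network(edges, go_terms, alpha=0.5):
    """Return {(u, v): weight} for each edge.

    Assumed scheme (not from the paper): the weight is a convex mix of
    GO-annotation overlap and closed-neighbourhood overlap.
    """
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    weights = {}
    for u, v in edges:
        go_sim = jaccard(go_terms.get(u, set()), go_terms.get(v, set()))
        # Closed neighbourhoods include the node itself, so the two
        # endpoints of an edge always share some context.
        topo_sim = jaccard(neighbors[u] | {u}, neighbors[v] | {v})
        weights[(u, v)] = alpha * go_sim + (1 - alpha) * topo_sim
    return weights


def subgraph_features(nodes, weights):
    """Hypothetical topological features: size, density, mean edge weight."""
    nodes = set(nodes)
    inner = [w for (u, v), w in weights.items() if u in nodes and v in nodes]
    n = len(nodes)
    possible = n * (n - 1) / 2
    density = len(inner) / possible if possible else 0.0
    mean_weight = sum(inner) / len(inner) if inner else 0.0
    return [n, density, mean_weight]


# Toy data: protein names and GO labels are placeholders, not real annotations.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
go_terms = {
    "A": {"GO:1", "GO:2"},
    "B": {"GO:1", "GO:2"},
    "C": {"GO:2"},
    "D": {"GO:9"},
}
weights = weight_ppi_network(edges, go_terms)
features = subgraph_features({"A", "B", "C"}, weights)
```

Feature vectors of this kind, computed for known complexes and for non-complex subgraphs, are the sort of input a supervised model could be trained on; the later embedding and random forest stages are not sketched here.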

Archived Files and Locations

application/pdf  2.2 MB
file_pfmbsaqcqzhohec77r7mu4fypi
bmcbioinformatics.biomedcentral.com (publisher)
web.archive.org (webarchive)
Preserved and Accessible
Type  article-journal
Stage   published
Date   2022-07-25
Language   en
Container Metadata
Open Access Publication
In DOAJ
In ISSN ROAD
In Keepers Registry
ISSN-L:  1471-2105