An Exploration of Approaches to Integrating Neural Reranking Models in Multi-Stage Ranking Architectures

by Zhucheng Tu, Matt Crane, Royal Sequiera, Junchen Zhang, Jimmy Lin

Released as an article.



We explore different approaches to integrating a simple convolutional neural network (CNN) with the Lucene search engine in a multi-stage ranking architecture. Our models are trained using the PyTorch deep learning toolkit, which is implemented in C/C++ with a Python frontend. One obvious integration strategy is to expose the neural network directly as a service. For this, we use Apache Thrift, a software framework for building scalable cross-language services. In exploring alternative architectures, we observe that once trained, the feedforward evaluation of neural networks is quite straightforward. Therefore, we can extract the parameters of a trained CNN from PyTorch and import the model into Java, taking advantage of the Java Deeplearning4J library for feedforward evaluation. This has the advantage that the entire end-to-end system can be implemented in Java. As a third approach, we can extract the neural network from PyTorch and "compile" it into a C++ program that exposes a Thrift service. We evaluate these alternatives in terms of performance (latency and throughput) as well as ease of integration. Experiments show that feedforward evaluation of the convolutional neural network is significantly slower in Java, while the performance of the compiled C++ network does not consistently beat the PyTorch implementation.
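
The abstract's observation that feedforward evaluation is straightforward once a network is trained can be illustrated with a minimal sketch. It assumes the trained parameters have already been exported as plain nested lists; the names `conv_relu_maxpool`, `score_pair`, `rerank`, and the `params` layout are hypothetical, and the toy two-arm network below is only an illustration, not the architecture or code used in the paper.

```python
import math

def conv_relu_maxpool(embeddings, filters, biases):
    """Slide each filter over the token embeddings, apply ReLU, then max-pool over time."""
    features = []
    for filt, bias in zip(filters, biases):
        width = len(filt)  # number of consecutive tokens the filter spans
        best = 0.0         # ReLU floor folded into the max-pooling step
        for start in range(len(embeddings) - width + 1):
            act = bias
            for offset in range(width):
                for w, x in zip(filt[offset], embeddings[start + offset]):
                    act += w * x
            best = max(best, act)
        features.append(best)
    return features

def linear(vec, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def score_pair(query_emb, doc_emb, params):
    """Return the probability of the 'relevant' class for one query/document pair."""
    q = conv_relu_maxpool(query_emb, params["q_filters"], params["q_biases"])
    d = conv_relu_maxpool(doc_emb, params["d_filters"], params["d_biases"])
    logits = linear(q + d, params["out_weights"], params["out_bias"])
    return softmax(logits)[1]

def rerank(query_emb, candidates, params):
    """Second-stage reranking: sort first-stage candidates by descending network score."""
    ranked = sorted(candidates, key=lambda c: score_pair(query_emb, c[1], params), reverse=True)
    return [doc_id for doc_id, _ in ranked]

# Toy parameters (hypothetical): one width-2 filter per arm over 2-dimensional embeddings.
params = {
    "q_filters": [[[1.0, 0.0], [0.0, 1.0]]],
    "q_biases": [0.0],
    "d_filters": [[[1.0, 0.0], [1.0, 0.0]]],
    "d_biases": [0.0],
    "out_weights": [[0.0, 0.0], [0.0, 1.0]],
    "out_bias": [0.0, 0.0],
}
query_emb = [[1.0, 0.0], [0.0, 1.0]]
candidates = [
    ("B", [[0.0, 1.0], [0.0, 1.0]]),
    ("A", [[1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]),
]
ranking = rerank(query_emb, candidates, params)
```

Because evaluation needs only multiplications, additions, and a maximum, the same loops could be ported to Java or C++ or wrapped behind a service interface, which is the design space the abstract compares.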

Type  article
Stage   submitted
Date   2017-07-26
Version   v1
Language   en
arXiv  1707.08275v1