Semi-Automated Protocol Disambiguation and Code Generation

by Jane Yen, Tamás Lévai, Qinyuan Ye, Xiang Ren, Ramesh Govindan, and Barath Raghavan

Released as an article.

2020  

Abstract

For decades, Internet protocols have been specified using natural language. Given the ambiguity inherent in such text, it is not surprising that over the years protocol implementations exhibited bugs and non-interoperabilities. In this paper, we explore to what extent natural language processing (NLP), an area that has made impressive strides in recent years, can be used to generate protocol implementations. We advocate a semi-automated protocol generation approach, Sage, that can be used to uncover ambiguous or under-specified sentences in specifications; these can then be fixed by a human iteratively until Sage is able to generate protocol code automatically. Using an implementation of Sage, we discover 5 instances of ambiguity and 6 instances of under-specification in the ICMP RFC, after fixing which Sage is able to generate code automatically that interoperates perfectly with Linux implementations. We demonstrate the ability to generalize Sage to parts of IGMP and NTP. We also find that Sage supports half of the conceptual components found in major standards protocols; this suggests that, with some additional machinery, Sage may be able to generalize to TCP and BGP.

Archived Files and Locations

application/pdf  736.1 kB
file_6xv3f2ieozcyho6xdctvwhuyaa
arxiv.org (repository)
web.archive.org (webarchive)
Type  article
Stage   submitted
Date   2020-10-09
Version   v1
Language   en
arXiv  2010.04801v1