Semi-Automated Protocol Disambiguation and Code Generation
by Jane Yen, Tamás Lévai, Qinyuan Ye, Xiang Ren, Ramesh Govindan, and Barath Raghavan
2020
Abstract
For decades, Internet protocols have been specified using natural language.
Given the ambiguity inherent in such text, it is not surprising that protocol
implementations have exhibited bugs and interoperability failures over the years. In
this paper, we explore to what extent natural language processing (NLP), an
area that has made impressive strides in recent years, can be used to generate
protocol implementations. We advocate a semi-automated protocol generation
approach, Sage, that can be used to uncover ambiguous or under-specified
sentences in specifications; these can then be fixed by a human iteratively
until Sage is able to generate protocol code automatically. Using an
implementation of Sage, we discover 5 instances of ambiguity and 6 instances of
under-specification in the ICMP RFC; once these are fixed, Sage automatically
generates code that interoperates perfectly with Linux
implementations. We demonstrate the ability to generalize Sage to parts of IGMP
and NTP. We also find that Sage supports half of the conceptual components
found in major standards protocols; this suggests that, with some additional
machinery, Sage may be able to generalize to TCP and BGP.
arXiv:2010.04801v1