Getafix: Learning to Fix Bugs Automatically
release_26gb7gdsqzhtnispyhai3jb7py
by
Johannes Bader and Andrew Scott and Michael Pradel and Satish Chandra
2019
Abstract
Static analyzers help find bugs early by warning about recurring bug
categories. While fixing these bugs still remains a mostly manual task in
practice, we observe that fixes for a specific bug category often are
repetitive. This paper addresses the problem of automatically fixing instances
of common bugs by learning from past fixes. We present Getafix, an approach
that produces human-like fixes while being fast enough to suggest fixes in time
proportional to the amount of time needed to obtain static analysis results in
the first place. Getafix is based on a novel hierarchical clustering algorithm
that summarizes fix patterns into a hierarchy ranging from general to specific
patterns. Instead of an expensive exploration of a potentially large space of
candidate fixes, Getafix uses a simple yet effective ranking technique that
uses the context of a code change to select the most appropriate fix for a
given bug. Our evaluation applies Getafix to 1,268 bug fixes for six bug
categories reported by popular static analyzers for Java, including null
dereferences, incorrect API calls, and misuses of particular language
constructs. The approach predicts exactly the human-written fix as the top-most
suggestion between 12% and 91% of the time, depending on the bug category. The
top-5 suggestions contain fixes for 526 of the 1,268 bugs. Moreover, we report
on deploying the approach within Facebook, where it contributes to the
reliability of software used by billions of people. To the best of our
knowledge, Getafix is the first industrially-deployed automated bug-fixing tool
that learns fix patterns from past, human-written fixes to produce human-like
fixes.
In text/plain
format
Archived Files and Locations
application/pdf 1.2 MB
file_5r62wn4ydzdnjh7onlof6mnaty
|
arxiv.org (repository) web.archive.org (webarchive) |
1902.06111v3
access all versions, variants, and formats of this works (eg, pre-prints)