The Effect of Coincidental Correctness on Defect Detection: an Empirical Study

by Wes Masri

Published by Figshare.



<b>According to the PIE model, three conditions must be met for a failure to be observed: 1) the defect is executed, 2) the program is infected, and 3) the infection propagates to the output. Weak coincidental correctness (CC) occurs when the program produces the correct output while condition 1) is satisfied but conditions 2) and 3) are not. Strong coincidental correctness occurs when a correct output is observed while conditions 1) and 2) are satisfied but condition 3) is not.</b> <b>In prior work, we analytically demonstrated that CC is a safety-reducing factor for coverage-based fault localization (CBFL). However, we did not validate that result experimentally, which we do in this paper. Specifically, we comparatively evaluated the performance of CBFL using ten different suspiciousness metrics when: a) both weak and strong CC tests are present; b) neither weak nor strong CC tests are present; c) only weak CC tests are present; d) only strong CC tests are present. Our experiments showed that when the CC tests are discarded, in most cases the suspiciousness score of the defective statement increased and its EXAM ranking score also improved. The metrics that benefited most from discarding CC tests are <i>Tarantula, Ample, Ochiai, Dstar<sup>2</sup>,</i> and <i>Dstar<sup>3</sup></i>. By contrast, discarding CC tests had no effect on <i>Russel</i>, <i>Wong1</i>, and <i>Binary</i>; however, these three metrics were also the worst performers with regard to the EXAM score.</b>
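The effect described in the abstract can be illustrated with a minimal sketch that computes two of the named metrics, Tarantula and Ochiai, for a single defective statement before and after CC tests are discarded. The formulas are the commonly published definitions of these metrics; the function names and the test counts (4 failing tests, 16 passing tests, 6 of them CC tests) are hypothetical values chosen for illustration and are not taken from the study. The sketch assumes that every passing test that executes the defective statement is a CC test, which follows from the definitions above.

```python
import math


def tarantula(ef, ep, total_failed, total_passed):
    """Tarantula score: (ef/F) / (ef/F + ep/P).

    ef / ep are the numbers of failing / passing tests that execute
    the statement; F / P are the total failing / passing tests.
    """
    fail_ratio = ef / total_failed if total_failed else 0.0
    pass_ratio = ep / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0


def ochiai(ef, ep, total_failed):
    """Ochiai score: ef / sqrt(F * (ef + ep))."""
    denom = math.sqrt(total_failed * (ef + ep))
    return ef / denom if denom else 0.0


def scores(ef, ep, total_failed, total_passed):
    """Return both scores for one statement as a dict."""
    return {
        "tarantula": tarantula(ef, ep, total_failed, total_passed),
        "ochiai": ochiai(ef, ep, total_failed),
    }


# Hypothetical suite (illustrative numbers only): 4 failing tests and
# 16 passing tests. The defective statement is executed by all 4 failing
# tests and by 6 passing tests, which are therefore CC tests.
with_cc = scores(ef=4, ep=6, total_failed=4, total_passed=16)

# Discarding the 6 CC tests leaves 10 passing tests, none of which
# executes the defective statement.
without_cc = scores(ef=4, ep=0, total_failed=4, total_passed=10)

for name in ("tarantula", "ochiai"):
    print(f"{name}: {with_cc[name]:.3f} -> {without_cc[name]:.3f}")
```

In this hypothetical case both scores rise once the CC tests are removed, because the defective statement no longer appears in any passing execution. Whether its rank among all statements improves depends on how the scores of the other statements change, which this single-statement sketch does not model.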

Type  article-journal
Stage   published
Date   2018-09-12