Case Studies

The case studies analyzed existing software from three large commercial organizations. The applications were all commercial data processing systems, covering areas such as finance, sales, and distribution. The analyzed software comprised about 360 modules totaling 156,000 NCSS.

The single-value metrics all perform fairly poorly. In fact, a ranking by similarity of size performs as well as a ranking by any of the other single-value metrics. However, the average similarity, which is a composite of the single-value rankings, performs substantially better.
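To make the idea of a composite concrete, the following Python sketch shows one plausible way to combine single-value metrics into an average similarity and rank module pairs by it. It is an illustration only: the min/max ratio used as the per-metric similarity, the function names, and the metric names in the usage note are assumptions, not the definitions used in the case studies.

```python
from itertools import combinations


def ratio_similarity(a, b):
    """Similarity of two non-negative metric values, in [0, 1].

    Assumption: similarity is the smaller value divided by the larger one,
    so identical values score 1.0. The study's own formula may differ.
    """
    if a == b:
        return 1.0
    return min(a, b) / max(a, b)


def rank_pairs_by_average_similarity(modules):
    """Rank all module pairs by the mean of their per-metric similarities.

    modules maps a module name to a dict of {metric name: value}.
    Returns a list of (score, name_a, name_b), most similar pair first.
    """
    ranked = []
    for name_a, name_b in combinations(sorted(modules), 2):
        metrics_a, metrics_b = modules[name_a], modules[name_b]
        shared = sorted(set(metrics_a) & set(metrics_b))
        if not shared:
            continue  # no metric in common, so the pair cannot be scored
        total = sum(ratio_similarity(metrics_a[m], metrics_b[m]) for m in shared)
        ranked.append((total / len(shared), name_a, name_b))
    # Sort by descending score; break ties by name for a stable order.
    ranked.sort(key=lambda item: (-item[0], item[1], item[2]))
    return ranked
```

For example, calling `rank_pairs_by_average_similarity` on three hypothetical modules described by `ncss` and `cyclomatic` values places the two modules with identical values at the top of the ranking. Averaging is the simplest possible composite; a weighted mean would be an equally reasonable variant.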

The ranking by structure profile is the best method for locating instances of actual reuse. This is particularly true for the top part of the ranking, which is the part that has practical value. For example, 37, 50, and 63 percent of the known cases of reuse fall within the first 100 ranked pairs in the three case studies, respectively.
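The "percentage of known reuse cases within the first 100 ranked pairs" measure can be sketched in Python as follows. The function name `percent_found_in_top_k` and the pair representation are assumptions made for illustration; the sketch shows only how such a figure is computed from a ranking and a list of known reuse pairs, not how the reported values were produced.

```python
def percent_found_in_top_k(ranked_pairs, known_reuse_pairs, k=100):
    """Percentage of known reuse pairs that appear in the first k ranked pairs.

    ranked_pairs: sequence of (module_a, module_b), most similar first.
    known_reuse_pairs: iterable of (module_a, module_b) known to involve reuse.
    Pairs are treated as unordered, so (a, b) matches (b, a).
    """
    known = {frozenset(pair) for pair in known_reuse_pairs}
    if not known:
        raise ValueError("at least one known reuse pair is required")
    top = {frozenset(pair) for pair in ranked_pairs[:k]}
    return 100.0 * len(known & top) / len(known)
```

With a ranking of four hypothetical pairs in which the known reuse pairs sit at positions 1 and 4, the function returns 50.0 for `k=2` and 100.0 for `k=4`.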

We tested the statistical significance of the difference between the structure profile and the other metrics. Using Dunnett's T to control for Type I error across the set of comparisons, the structure profile shows a significant performance improvement over the single-value metrics at the 0.05 level. The improvement over the average similarity, while substantial, is not significant at the 0.05 level.

The pattern of results is very consistent across the three case studies. This is particularly encouraging because the case study settings vary substantially, including differences in programming language, development methodology, and organization size.

In general, concepts carried over quite well from plagiarism detection to reuse. Single-value metrics showed poor results for locating instances of informal reuse. A composite based on these metrics performed much better, and the structure profile approach provided the best performance.