SoftKin Prototype

SoftKin is a prototype tool for exploring approaches to measuring software similarity. The name reflects the idea that SoftKin looks for related software parts.

SoftKin consists of a data collector and an analyzer. The collector processes existing software and calculates measures of form for each module. The analyzer computes similarity measures for each pair of modules. The module pairs can then be ranked from most similar to least similar under each similarity measure.
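To make the collector and analyzer split concrete, the following is a minimal sketch of the analyzer step, not SoftKin's actual implementation. It assumes the collector has already produced one single-value measure per module. The names rank_pairs and metric_similarity, the sample module names, and the similarity formula 1 / (1 + |x - y|) are all hypothetical choices made for illustration.

```python
from itertools import combinations


def rank_pairs(measures, similarity):
    """Rank all unordered module pairs from most to least similar.

    measures:   dict mapping module name -> measure of form (assumed input)
    similarity: function taking two measures and returning a score,
                where a higher score means more similar
    """
    scored = [
        (similarity(measures[a], measures[b]), a, b)
        for a, b in combinations(sorted(measures), 2)
    ]
    # Sort by descending similarity; break ties by module name so the
    # ranking is deterministic.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(a, b, s) for s, a, b in scored]


def metric_similarity(x, y):
    # Hypothetical similarity for a single-value metric: 1.0 when the
    # values are equal, approaching 0 as they diverge.
    return 1.0 / (1.0 + abs(x - y))


# Hypothetical collector output: one complexity value per module.
complexity = {"parse": 12, "scan": 11, "emit": 4, "report": 5}

ranking = rank_pairs(complexity, metric_similarity)
for a, b, score in ranking:
    print(f"{a:7s} {b:7s} {score:.3f}")
```

Passing the similarity function as a parameter mirrors the idea that the same pair ranking can be produced under each measure of form in turn.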

SoftKin can calculate similarity based on a variety of measures of software form. These include single-value metrics (such as McCabe's Cyclomatic Complexity), a metric composite, and also a structure profile that is a slight variant of the profile proposed by Whale.

The goal of the case studies was to determine whether any of these approaches to measuring similarity shows promise for identifying candidate parts that might be reengineered for a reusable parts collection.

To evaluate the various similarity rankings, I focused on the task of locating informal reuse. In each of the case studies, I was able to identify a set of known instances of reuse in the existing software. This allowed me to evaluate each measure by where those known instances fall in its similarity ranking. A good ranking is one that places the actual reuse instances near the top of the list.
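The idea of judging a ranking by where the known reuse instances fall can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which summary statistic was used, so the mean normalized rank below is a hypothetical choice, and the function names and sample module pairs are invented for the example.

```python
def reuse_positions(ranking, known_reuse):
    """Return the 1-based positions of known reuse pairs in a ranking.

    ranking:     list of (module_a, module_b) pairs, most similar first
    known_reuse: iterable of (module_a, module_b) pairs known to be reuse
    """
    # Pairs are unordered, so compare them as frozensets.
    known = {frozenset(pair) for pair in known_reuse}
    return [
        position
        for position, (a, b) in enumerate(ranking, start=1)
        if frozenset((a, b)) in known
    ]


def mean_normalized_rank(ranking, known_reuse):
    """Hypothetical score in (0, 1]; lower means reuse sits nearer the top."""
    positions = reuse_positions(ranking, known_reuse)
    if not positions:
        raise ValueError("no known reuse pair appears in the ranking")
    return sum(positions) / (len(positions) * len(ranking))


# Hypothetical known instances of informal reuse.
known = [("scan", "parse"), ("emit", "report")]

# Two hypothetical rankings of the same six pairs under different measures.
ranking_a = [("emit", "report"), ("parse", "scan"), ("report", "scan"),
             ("emit", "scan"), ("parse", "report"), ("emit", "parse")]
ranking_b = [("report", "scan"), ("emit", "report"), ("emit", "scan"),
             ("parse", "report"), ("parse", "scan"), ("emit", "parse")]

score_a = mean_normalized_rank(ranking_a, known)
score_b = mean_normalized_rank(ranking_b, known)
print(score_a, score_b)
```

Under this scoring, the measure behind ranking_a would be preferred, because it places both known reuse pairs at the top of the list.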

This evaluation provides a relative comparison of the various methods of similarity detection. As a production tool, SoftKin would guide an analysis of existing software: the ranking by similarity would suggest where to focus attention when looking for candidate reusable parts. If SoftKin is successful, the best candidates for a parts collection will be clustered near the top of the list.