home *** CD-ROM | disk | FTP | other *** search
- Subject: v20i010: Tools for generating software metrics, Part03/14
- Newsgroups: comp.sources.unix
- Sender: sources
- Approved: rsalz@uunet.UU.NET
-
- Submitted-by: Brian Renaud <huron.ann-arbor.mi.us!bdr>
- Posting-number: Volume 20, Issue 10
- Archive-name: metrics/part03
-
- ---- Cut Here and unpack ----
- #!/bin/sh
- # this is part 3 of a multipart archive
- # do not concatenate these parts, unpack them in order with /bin/sh
- # file doc/Results1 continued
- #
- CurArch=3
- if test ! -r s2_seq_.tmp
- then echo "Please unpack part 1 first!"
- exit 1; fi
- ( read Scheck
- if test "$Scheck" != $CurArch
- then echo "Please unpack part $Scheck next!"
- exit 1;
- else exit 0; fi
- ) < s2_seq_.tmp || exit 1
- echo "x - Continuing file doc/Results1"
- sed 's/^X//' << 'SHAR_EOF' >> doc/Results1
- Xheavily to the change count. This is also intuitive. (I think there
- Xare two reasons for this, (1) that code with many special cases requiring
- Xreturns/exits is not well thought out ahead of time, but rather, coded
- Xas the implementor thought and (2) that changes to code with many returns
- Xcan be difficult, and can easily be incorrect.
- X
- XYou might think at this point that you have a magic formula for predicting
- Xerrors. Unfortunately, although this stuff does a very good job of
- Xpredicting problems on the pascal project, the best R-sqared I ever
- Xgot for the RSM(*1) data was still less than .4. Apparently, there are
- Xother factors at work. I believe that one very important factor is the
- Xexperience/skill of the original implementor and of those who made changes
- Xto the module. The uids of these people can be obtained from the sccs file.
- XYou then need to have some way to associate a skill level with the person
- X(note that time may be important here too). I never completed this final
- Xstep.
- X
- X
- X*1 - RSM == Remote Software Management, a distributed source distribution
- Xand control system. See results2 for more info.
- X
- XAs always, feel free to contact me with questions or assistance in setting
- Xthis stuff up to work on a project you want to analyze.
- X
- XBrian Renaud
- SHAR_EOF
- echo "File doc/Results1 is complete"
- chmod 0644 doc/Results1 || echo "restore of doc/Results1 fails"
- echo "x - extracting doc/Results2 (Text)"
- sed 's/^X//' << 'SHAR_EOF' > doc/Results2
- XSome additional example metrics. These are from the RSM and pascal
- X(combined) projects.
- X
- XThis example predicts changes (errors) by using (halstead) volume,
- Xcomments, functions (number of functions in file) and returns
- X(total returns). There are 490 files (data points). The data contains
- Xa number of significant outliers. For every variable, the standard
- Xdeviation is greater than the mean.
- X
- XNote that I included function count rather than mccabe. This
- Xseemed to be a more significant variable.
- X
- XThe regression equation is:
- X
- Xchanges = 0.00164 volume - 0.1097comments - 0.2515 functions
- X + 0.07432 returns + 3.18609
- X
- XR-Squared is .5818
- X
- XNote that, once again we have a counter-intuitive variable. This time
- Xfunctions varies inversely with changes. (That is, more functions in
- Xa file imply fewer changes.) Strange; I speculate that since file size
- Xis explained by the volume variable, many functions in a file means small
- Xfunctions. Small functions typically contain fewer errors.
- X
- XThe correlation matrix is:
- X
- Xchanges 1.0000
- Xvolume 0.6630 1.0000
- Xcomments 0.2152 0.7068 1.0000
- Xfunctions 0.2394 0.6311 0.7528 1.0000
- Xreturns 0.6879 0.7087 0.1825 0.3079 1.0000
- X
- X
- XThe t test results are:
- X
- Xvolume 10.5995
- Xcomments 5.0503
- Xfunctions 1.8426
- Xreturns 3.8045
- SHAR_EOF
- chmod 0644 doc/Results2 || echo "restore of doc/Results2 fails"
- echo "x - extracting doc/halstead.doc (Text)"
- sed 's/^X//' << 'SHAR_EOF' > doc/halstead.doc
- XHalstead provides various indicators of the module's complexity.
- XThe module is examined to determine the following
- X
- Xn1 == number of unique operators
- Xn2 == number of unique operands
- XN1 == number of total operators
- XN2 == number of total operands
- X
- XThe program writes these output fields:
- X
- XFile Name
- XProgram Length (N = N1 + N2)
- XProgram Volume (V = N log<base2> (n1 + n2))
- XProgram Level (L = (2/n1) * (n2/N2))
- XMental Discriminations (E^ = V / L)
- X
- XThe program volume (aka ``Halstead Volume'') is probably most useful
- Xand seems to be reasonably well correlated with error counts for modules.
- SHAR_EOF
- chmod 0644 doc/halstead.doc || echo "restore of doc/halstead.doc fails"
- echo "x - extracting doc/kdsi.1L (Text)"
- sed 's/^X//' << 'SHAR_EOF' > doc/kdsi.1L
- X.TH KDSI "L COSI" 2/2/86
- X.\" bdr
- X.UC
- X.SH NAME
- Xkdsi - count number of lines of code in a C program
- X.SH SYNOPSIS
- X.B kdsi
- X.B [ file ]*
- X.SH DESCRIPTION
- X.I Kdsi
- Xcounts the lines of code in a C program.
- XIt provides the following information:
- X.sp 1
- X.nf
- X lines of code
- X blank lines
- X comments lines
- X number of comments
- X.fi
- X.LP
- XIf you specify no files,
- X.I kdsi
- Xwill read from stdin.
- XIf you specify more than one file on the command line,
- X.I kdsi
- Xwill print the total for each category.
- X.SH SEE ALSO
- Xwc(1)
- SHAR_EOF
- chmod 0644 doc/kdsi.1L || echo "restore of doc/kdsi.1L fails"
- echo "x - extracting doc/mccabe.doc (Text)"
- sed 's/^X//' << 'SHAR_EOF' > doc/mccabe.doc
- XMccabe determines function complexity based on Mccabe model
- Xof program complexity.
- X
- XUsage: mccabe [-n] file [file]
- X
- XThe -n flag (No Header) is useful if you are using this to produce data for
- Xother tools
- X
- XMccabe produces (in order) the following output
- X
- X File Name
- X Function Name (or *** for complexity not in a function, as in yacc)
- X Complexity
- X Count of Return statements
- X
- XTypically a complexity over 10 is cause for further examination. The
- Xnumber of returns in a function is also important in predicting error
- Xcounts for that function, perhaps even more important than the complexity.
- SHAR_EOF
- chmod 0644 doc/mccabe.doc || echo "restore of doc/mccabe.doc fails"
- echo "x - extracting src/control/README (Text)"
- sed 's/^X//' << 'SHAR_EOF' > src/control/README
- XThis directory contains an example of some tools which glom together the
- Xoutput of various metrics tools into some (potentially) usable databases.
- XThere are two databases produced, one of which is file-based (containing
- Xhalstead, change count and line count information) and one of which is
- Xfunction based (containing mccabe and return information). Obviously,
- Xthere can be one or more functions in a file.
- X
- XI have only a few printouts left indicating the results of the multiple
- Xregression models I built. The results from these are presented later in
- Xthis file. The most (statistically) significant variable for the
- Xprediction of errors (SCCS changes) was halstead volume. Mccabe, returns
- Xand comment counts were also significant, at similar statistical levels.
- XKdsi and volume were highly correlated. The only variable I disbelieve is
- Xthe values for mccabe. See my comments in the doc directory.
- X
- XThe structure of the scripts follows. They are not easy to read, although
- Xif you lay them all out before you, they are not too bad.
- X
- XThe file pascal_stats is the highest level script. It actually knows about
- Xthe structure of the application's directories. (But, before trying to use
- Xthis file, see the comments about proj_stats below.) It calls gather_stats
- Xto pull stuff together for each directory. The output from gather_stats is
- Xjust appended to the appropriate metrics database.
- X
- XGather_stats actually gets the statistics, for the file specification
- Xpassed to it. Note that I have modified it since it was last used. I
- Xassume that it still compiles. In addition, some of the commands produce
- Xmore and different output than they used to. I have changed the
- Xspecification of the joins to what I think it should be.
- X
- XI have created a new script proj_stats which employs a different (easier to
- Xread) method than that in "pascal_stats" for naming all of the directories
- Xand files in a project. It reads a file for the names of the directories
- Xand the patterns which describe files in those directories. An example
- Xfile which should work for the pascal project is in "example_spec".
- X
- XThe other new scripts are altparse.prs and altkdsi. Altkdsi just post-
- Xprocesses the output of the kdsi command to strip off the totals line.
- XAltparse.prs finds the change count from sccs. It does not count ".1"
- Xdeltas, assuming that those are new releases, rather than bug fixes.
- SHAR_EOF
- chmod 0644 src/control/README || echo "restore of src/control/README fails"
- echo "x - extracting src/control/altkdsi (Text)"
- sed 's/^X//' << 'SHAR_EOF' > src/control/altkdsi
- X: postprocess kdsi to put it in form for gather_stats
- X
- Xkdsi $* | awk '$5 != "total" {printf("%s\t%s\t%s\n", $5, $1, $4);}'
- SHAR_EOF
- chmod 0644 src/control/altkdsi || echo "restore of src/control/altkdsi fails"
- echo "x - extracting src/control/altparse.prs (Text)"
- sed 's/^X//' << 'SHAR_EOF' > src/control/altparse.prs
- X: parse output from sccs prs command
- X
- Xfor file in $*
- Xdo
- X prs ${file} | awk '
- X BEGIN {
- X True = 1;
- X False = 0;
- X inMR = False;
- X inComment = False;
- X first = True;
- X }
- X
- X $0 == "" {
- X inMR = False;
- X inComment = False;
- X next;
- X }
- X
- X $0 ~ /^D [0-9][0-9]*\.[0-9][0-9]*/ {
- X # got delta entry
- X len = length( $2 );
- X if ( substr($2, len - 1) != ".1" )
- X changect++;
- X next;
- X }
- X
- X $1 ~ /^MRs:/ {
- X inMR = True;
- X next;
- X }
- X
- X $1 ~ /^COMMENT/ {
- X inComment = True;
- X next;
- X }
- X
- X inMR == 1 { # skipping through MR section
- X next;
- X }
- X
- X inComment == 1 { # skipping through comment section
- SHAR_EOF
- echo "End of part 3"
- echo "File src/control/altparse.prs is continued in part 4"
- echo "4" > s2_seq_.tmp
- exit 0
-
-
-