NetNews Usenet Archive 1992 #16

Xref: sparky comp.software-eng:3011 comp.object:3069
Newsgroups: comp.software-eng,comp.object
Path: sparky!uunet!destroyer!gumby!wupost!m.cs.uiuc.edu!marick
From: marick@m.cs.uiuc.edu (Brian Marick)
Subject: Testing Software that Reuses (long)
Message-ID: <1992Jul30.200315.11868@m.cs.uiuc.edu>
Organization: University of Illinois, Dept. of Comp. Sci., Urbana, IL
Date: Thu, 30 Jul 1992 20:03:15 GMT
Lines: 225

                  TESTING SOFTWARE THAT REUSES[1]

                           Brian Marick

The producer of reusable software provides a part of the product built
by a software developer (here called the "consumer"). The producer
tests this reusable part, but the consumer must test the rest. This
note addresses the question of how the producer can help the consumer
test.

The producer should provide a catalog of reusable test conditions
derived from likely misuses of the reused software. The consumer then
designs test cases by combining the producer's test conditions with
test conditions derived from the rest of the product.

But there's more to testing than designing tests - there's also
measuring whether they do what they're supposed to do. Coverage tools,
which record measures like branch coverage, fill this role. They
should be extended so that producers can provide modules that measure
how well potential misuses have been tested. An existing freely
distributable coverage tool will be so extended.

____________________
[1] Technical Note 2, Testing Foundations, Champaign, IL 61820, 1992.

1. The Argument

(1) Conditions to test can be found by examining past programmer
    errors.
(2) Often, errors are due to misuse of an abstraction.
(3) Sometimes, the abstraction is implemented as reusable software.
(4) Its producer should provide the error history in a form useful for
    testing.
(5) The consumer can use this information when building test cases.
(6) A coverage tool can measure how thoroughly the consumer's tests
    have probed likely misuses.

2. Test Design

Test design occurs in two stages, be they implicit or explicit. In the
first stage, test conditions are created. A test condition is a
requirement that at least one test case must satisfy. Examples:

    Argument A is negative.
    Argument A is 0.
    Arguments A and B are equal.

In the second stage, the test conditions are combined into test cases
which precisely describe the inputs to the program and its expected
results. Example:

    A=-1, B=-1
    Expected value: 13

This test satisfies two of the test conditions. The rules by which
test conditions are combined into test cases are irrelevant to the
argument of this note.

Where do the test conditions come from? Most obviously, they come from
the specification, the description of what the program is to do. For
example, if we're testing the implementation of a memory allocator,
malloc(amount), we might get these test conditions:

    amount is negative.
    amount memory is available.
    Not enough memory is available.

In this case, there's a test condition to see if the program handles
invalid inputs correctly and two test conditions for the two kinds of
return values.

Test conditions also come from an understanding of the types of errors
the programmer likely made while writing the program. Specifications
and programs are built from cliches [Rich90], such as "searching a
list", "sorting a list", "decomposing a pathname into a directory and
a file", "hash tables", "strings" (null terminated character arrays),
and so on. Notice that both operations and data types can be cliches.
Programmers often make cliched errors when implementing or using
cliches [Johnson83]. Thus, you can create a catalog of test
conditions, indexed by cliche. One such catalog is [Marick92].

There are two ways cliches may be manifest in programs:

(1) They may be implemented inline. For example, a searching cliche
    may be implemented as a loop over a vector.
(2) They may be reusable code.
    The searching cliche may be implemented as a call to a subroutine
    named bsearch().

The test conditions for a cliche-using program depend on the
manifestation. If the cliche is implemented inline, you must test that
implementation. For a vector search loop, you'll use a test condition
that probes for off-by-one errors: "element found in last position".
But if the search is a call to a well-tested subroutine, that test
condition would likely be useless. Instead, you would restrict your
attention to plausible errors in the program's use of the cliche --
faults where bsearch is called correctly, but the program fails to
handle the result properly. Here, a test condition might be "element
not found", since programmers sometimes fail to think about that
possibility. Another example would be testing uses of the write()
routine with "write fails", since programs that assume writes always
succeed abound.

Of course, not all reusable subroutines implement cliches, but all
reusable code can generate the same sort of test conditions, test
conditions that probe likely misuses. In the absence of more
information, some general rules are:

(1) A test condition for each error return.
(2) A test condition for each distinct type of "normal return". For
    example, if a routine returns five status codes, the calling
    program isn't well tested unless it has shown that it can handle
    all five possibilities.

But there can also be test conditions particular to a reused routine
or datatype. For example, suppose a collection that can grow and
shrink is provided as a datatype. However, the user must reinitialize
the collection whenever the last element is removed. Programmers will
surely often forget the reinitialization; this likely error can be
captured in a test condition like "an element is removed from a
single-element collection, then a new element is added". Henceforth,
these test conditions will be called producer test conditions.
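
The two general rules above can be sketched for the bsearch() case.
This is only an illustration under stated assumptions: find_index(),
compare_ints(), the primes table, and the two test functions are
hypothetical names invented here; only the standard C bsearch() comes
from the text. Each test function satisfies one test condition, one
per kind of return.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison in the form bsearch() expects. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Hypothetical consumer routine: index of key in a sorted table,
   or -1 when the key is absent. */
int find_index(const int *table, size_t count, int key)
{
    const int *hit = bsearch(&key, table, count, sizeof *table,
                             compare_ints);
    if (hit == NULL)        /* the use that "element not found" probes */
        return -1;
    return (int)(hit - table);
}

/* Hypothetical sorted test data. */
static const int primes[] = { 2, 3, 5, 7, 11 };
#define NPRIMES (sizeof primes / sizeof primes[0])

/* Test condition: bsearch() succeeds (the "normal return"). */
void test_element_found(void)
{
    assert(find_index(primes, NPRIMES, 7) == 3);
}

/* Test condition: bsearch() returns NULL ("element not found"). */
void test_element_not_found(void)
{
    assert(find_index(primes, NPRIMES, 4) == -1);
}
```

A test driver would call both functions. If the NULL check were
missing from find_index(), only the second test could expose it, which
is why "element not found" earns its own test condition.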

It is the consumers who discover these producer test conditions, since
they are the ones who make the common mistakes. But the people best
suited for circulating this information to all consumers are the
producers. (Of course, in this particular case, they would also want
to rework the design of the datatype to eliminate the cause of the
common error, but this cannot always be done. Some abstractions are
simply inherently complicated.)

Along with reusable software, vendors should sell catalogs of producer
test conditions. These are used in the same way as, and in conjunction
with, the general-purpose cliche catalog mentioned earlier. For
example, if a consumer is writing software that uses a stream of input
records to modify a collection, the "empty, then add" producer test
condition can be combined with generic stream test conditions to
produce good test cases for that software.

3. Test Coverage

I won't discuss coverage in detail. If you want to learn about
coverage, get a copy of my freely available test coverage tool, GCT.
It's available by anonymous FTP from cs.uiuc.edu. Start by fetching
pub/testing/GCT.README.

GCT is used in three phases.

(1) The program is instrumented by adding code to check whether
    coverage conditions are satisfied. Coverage conditions are test
    conditions derived mechanically from the code. Example: one
    coverage condition might require that an IF on line 256 be taken
    in the TRUE direction, while another would require that the < on
    line 397 be evaluated with its left-hand side equal to its
    right-hand side. (This boundary condition helps discover
    off-by-one errors.)
(2) At runtime, the program executes and updates a log of which
    coverage conditions have been satisfied.
(3) After the program runs, reporting tools produce output that looks
    like this:

    "lc.c", line 256: if was taken TRUE 0, FALSE 11 times.
    "lc.c", line 397: operator < might be <=.

What good is that output? In some cases, it points to weaknesses in
test design.
For example, the second line may be a symptom of forgetting to test
boundaries. The first line may point to an untested feature. In other
cases, coverage points to mistakes in test implementation, where your
tests don't test what you thought they did. (This happens a lot.)

The tester of software that reuses would find coverage more useful if
it were derived from the reused software. Suppose the tester forgot
about the "last element removed and new one added" test condition. Or
suppose it was tested, but the test's input was handled by
special-case code that never exercised the collection at all.
Conventional coverage might very well miss the omission. What the
tester wants is a coverage report that looks like this:

    "lc.c", line 218: collection.add never applied to reinitialized collection.
    "lc.c", line 256: if was taken TRUE 0, FALSE 11 times.
    "lc.c", line 397: operator < might be <=.
    "lc.c", line 403: bsearch never failed.

The last condition prompts us to write a test to detect omitted
error-handling code. Branch coverage, for instance, would not tell us
we need to write such a test - it can't generate a coverage condition
for an IF statement that ought to be there but isn't. Such faults of
omission are common in fielded systems [Glass81].

To allow such coverage, a producer must provide two things:

(1) A module which communicates with GCT during instrumentation. For
    appropriate function calls or method calls, the module tells GCT
    how many log entries to allocate and what the reporting tools
    should report for each.
(2) Testing versions of reusable software that mark when producer test
    conditions are satisfied.

The GCT support required is being designed now. I invite reuse
providers to participate in the design.

REFERENCES

[Glass81]
    Robert L. Glass, "Persistent Software Errors", IEEE Transactions
    on Software Engineering, vol. SE-7, no. 2, pp. 162-168, March
    1981.

[Johnson83]
    W.L. Johnson, E. Soloway, B. Cutler, and S.W. Draper. Bug
    Catalogue: I.
    Yale University Technical Report, October 1983.

[Marick92]
    B. Marick, Test Condition Catalog. Testing Foundations, Champaign,
    IL 61820, 1992.

[Rich90]
    C. Rich and R. Waters. The Programmer's Apprentice. New York: ACM
    Press, 1990.