home *** CD-ROM | disk | FTP | other *** search
- INSTALLATION
- 1) Change makefile settings to reflect
- ATT vs. BSD software
- termio vs. termcap
- MGR vs. no MGR (MGR is a BELLCORE produced
- window manager that is also available
- free to the public.)
- 2) Then, just say "make".
- If you want to "make install", you should first
- change definition of INSDIR in the makefile
-
- 3) to test the software say
-
- spiff Sample.1 Sample.2
-
- spiff should find 4 differences and
- you should see the words "added", "deleted", "changed",
- and "altered" as well as four number in stand-out mode.
-
- spiff Sample.1 Sample.2 | cat
-
- should produce the same output, only the differences
- should be underlined However, on many terminals the
- underlining does not appear. So try the command
-
- spiff Sample.1 Sample.2 | cat -v
-
- or whatever the equivalent to cat -v is on your system.
-
- A more complicated test set is found in Sample.3 and Sample.4
- These files show how to use embedded commands to do things
- like change the commenting convention and tolerances on the
- fly. Be sure to run the command with the -s option to spiff:
-
- spiff -s 'command spiffword' Sample.3 Sample.4
-
- These files by no means provide an exhaustive test of
- spiff's features. But they should give you some idea if
- things are working right.
-
- This code (or it's closely related cousins) has been run on
- Vaxen running 4.3BSD, a CCI Power 6, some XENIX machines, and some
- other machines running System V derivatives as well as
- (thanks to eugene@ames.arpa) Cray, Amdahl and Convex machines.
-
- 4) Share and enjoy.
-
- AUTHOR'S ADDRESS
- Please send complaints, comments, praise, bug reports, etc to
- Dan Nachbar
- Bell Communications Research (also known as BELLCORE)
- 445 South St. Room 2B-389
- Morristown, NJ 07960
-
- nachbar@bellcore.com
- or
- bellcore!nachbar
- or
- (201) 829-4392 (praise only, please)
-
- OVERVIEW OF OPERATION
-
- Each of two input files is read and stored in core.
- Then it is parsed into a series of tokens (literal strings and
- floating point numbers, white space is ignored).
- The token sequences are stored in core as well.
- After both files have been parsed, a differencing algorithm is applied to
- the token sequences. The differencing algorithm
- produces an edit script, which is then passed to an output routine.
-
- SIZE LIMITS AND OTHER DEFAULTS
- file implementing limit name default value
- maximum number of lines lines.h _L_MAXLINES 10000
- per file
- maximum number of tokens token.h K_MAXTOKENS 50000
- per file
- maximum line length misc.h Z_LINELEN 1024
- maximum word length misc.h Z_WORDLEN 20
- (length of misc buffers for
- things like literal
- delimiters.
- NOT length of tokens which
- can be virtually any length)
- default absolute tolerance tol.h _T_ADEF "1e-10"
- default relative tolerance tol.h _T_RDEF "1e-10"
- maximum number of commands command.h _C_CMDMAX 100
- in effect at one time
- maximum number of commenting comment.h W_COMMAX 20
- conventions that can be
- in effect at one time
- (not including commenting
- conventions that are
- restricted to beginning
- of line)
- maximum number of commenting comment.h W_BOLMAX 20
- conventions that are
- restricted to beginning of
- line that are in effect at
- one time
- maximum number of literal comment.h W_LITMAX 20
- string conventions that
- can be in effect at one time
- maximum number of tolerances tol.h _T_TOLMAX 10
- that can be in effect at one
- time
-
-
- DIFFERENCES BETWEEN THE CURRENT VERSION AND THE ENCLOSED PAPER
-
- The files paper.ms and paper.out contain the nroff -ms input and
- output respectively of a paper on spiff that was given the Summer '88
- USENIX conference in San Francisco. Since that time many changes
- have been made to the code. Many flags have changed and some have
- had their meanings reversed, see the enclosed man page for the current
- usage. Also, there is no longer control over the
- granularity of object used when applying the differencing algorithm.
- The current version of spiff always applies the differencing
- in terms of individual tokens. The -t flag controls how the edit script
- is printed. This arrangement more closely reflects the original intent
- of having multiple differencing granularities.
-
- PERFORMANCE
-
- Spiff is big and slow. It is big because all the storage is
- in core. It is a straightforward but boring task to move the temporary
- storage into a file. Someone who cares is invited to take on the job.
- Spiff is slow because whenever a choice had to be made between
- speed of operation and ease of coding, speed of operation almost always lost.
- As the program matures it will almost certainly get smaller and faster.
- Obvious performance enhancements have been avoided in order to make the
- program available as soon as possible.
-
- COPYRIGHT
-
- Our lawyers advise the following:
-
- Copyright (c) 1988 Bellcore
- All Rights Reserved
- Permission is granted to copy or use this program, EXCEPT that it
- may not be sold for profit, the copyright notice must be reproduced
- on copies, and credit should be given to Bellcore where it is due.
- BELLCORE MAKES NO WARRANTY AND ACCEPTS NO LIABILITY FOR THIS PROGRAM.
-
- Given that all of the above seems to be very reasonable, there should be no
- reason for anyone to not play by the rules.
-
-
- NAMING CONVENTIONS USED IN THE CODE
-
- All symbols (functions, data declarations, macros) are named as follows:
-
- L_foo -- for names exported to other modules
- and possibly used inside the module as well.
- _L_foo -- for names used by more than one routine
- within a module
- foo -- for names used inside a single routine.
-
- Each module uses a different value for "L" --
- module files letter used implements
- spiff.c Y top level routines
- misc.[ch] Z various routines used throughout
- strings.[ch] S routines for handling strings
- edit.h E list of changes found and printed
- tol.[ch] T tolerances for real numbers
- token.[ch] K storage for objects
- float.[ch] F manipulation of floats
- floatrep.[ch] R representation of floats
- line.[ch] L storage for input lines
- parse.[ch] P parse for input files
- command.[ch] C storage and recognition of commands
- comment.[ch] W comment list maintenance
- compare.[ch] X comparisons of a single token
- exact.[ch] Q exact match differencing algorithm
- miller.[ch] G miller/myers differencing algorithm
- output.[ch] O print listing of differences
- flagdefs.h U define flag bits that are used in
- several of the other modules.
- These #defines could have been
- included in misc.c, but were separated
- out because of their explicit
- communication function.
- visual.[ch] V screen oriented display for MGR
- window manager, also contains
- dummy routines for people who don't
- have MGR
-
- I haven't cleaned up visual.c yet. It probably doesn't even compile
- in this version anyway. But since most people don't have mgr, this
- isn't urgent.
-
- NON-OBVIOUS DATA STRUCTURES
-
- The Floating Point Representation
-
- Floating point numbers are stored in a struct R_flstr
- The fractional part is often called the mantissa.
-
- The structure consists of
- a flag for the sign of the factional part
- the exponent in binary
- a character string containing the fractional part
-
- The structure could be converted to a float via
- atof(strcat(".",mantissa)) * (10^exponent)
-
- To be properly formed, the mantissa string must:
- start with a digit between 1 and 9 (i.e. no leading zeros)
- except for the zero, in which case the mantissa is exactly "0"
- for the special case of zero, the exponent is always 0, and the
- sign is always positive. (i.e. no negative 0)
-
- In other words, (except for the value 0)
- the mantissa is a fractional number ranging
- between 0.1 (inclusive) and 1.0 (exclusive).
- The exponent is interpreted as a power of 10.
-
- Lines
- there are three sets of lines:
- implemented in line.c and line.h
- real_lines --
- the lines as they come from the file
- content_lines --
- a subset of reallines that excluding embedded commands
- implemented in token.c and token.h
- token_lines --
- a subset of content_lines consisting of those lines that
- have tokens that begin on them (literals can go on for
- more than one line)
- i.e. content_lines excluding comments and blank lines.
-
-
- THE STATE OF THE CODE
- Things that should be added
- visual mode should handle tabs and wrapped lines
- handling huge files in chunks when in using the ordinal match
- algorithm. right now you have to parse and then diff the
- whole thing before you get any output. often, you run out of memory.
-
- Things that would be nice to add
- output should optionally be expressed in real line numbers
- (i.e. including command lines)
- at present, all storage is in core. there should
- be a compile time decision to allow temporary storage
- in files rather than core.
- that way the user could decide how to handle the
- speed/space tradeoff
- a front end that looked like diff should be added so that
- one could drop spiff into existing shell scripts
- the parser converts floats into their internal form even when
- it isn't necessary.
- in the miller/myer code, the code should check for matching
- end sequences. it currently looks matching beginning
- sequences.
-
- Minor programming improvements (programming botches)
- some of the #defines should really be enumerated types
- all the routines in strings.c that alter the data at the end of
- a pointer but return void should just return the correct
- data. the current arrangement is a historical artifact
- of the days when these routines returned a status code.
- but then the code was never examined,
- so i made them void . . .
- comments should be added to the miller/myer code
- in visual mode, ask for font by name rather than number
-