Text File | 1990-02-26 | 31.2 KB | 1,157 lines |
- Newsgroups: comp.sources.misc
- organization: University of East Anglia, Norwich
- subject: v10i086: Names2, a random names generator (part 1 of 2)
- From: jrk@sys.uea.ac.uk (Richard Kennaway)
- Sender: allbery@uunet.UU.NET (Brandon S. Allbery - comp.sources.misc)
-
- Posting-number: Volume 10, Issue 86
- Submitted-by: jrk@sys.uea.ac.uk (Richard Kennaway)
- Archive-name: names2/part01
-
- This is names2.c, a program which generates random names for FRP
- characters or placenames. It is a development of names.c, which I posted
- to comp.sources.misc in July 1989.
-
- The posting is in two parts. This is part 1, containing the program;
- part 2 contains the data files.
-
- For those who saw the earlier version, this is what's new:
-
- - some data files are included to get you started (Sindarin, German,
- Chinese, and Chaucerian English).
-
- - you can specify which characters in the input should be considered
- "letters"; for example, you can have it recognise punctuation marks and
- accents, or apply it to Greek or Cyrillic text (provided each character
- is represented by a single byte).
-
- - the internal data structures are much smaller, at the cost of taking
- longer to analyse the input and begin generating names.
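-
- The configurable letter set described above can be pictured as a
- 256-entry lookup table that maps each byte either to its own small index
- or to a shared "space" index. The following is only a sketch of that idea,
- not the code in names2.c; the names build_index, NCHARS and SPACE_INDEX
- are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define NCHARS 256
#define SPACE_INDEX 0

/* Hypothetical helper: fill index_of[] so that every distinct byte in
   `letters` gets its own index 1, 2, 3, ... and every other byte maps
   to SPACE_INDEX.  Returns the alphabet size, counting the space. */
int build_index(const char *letters, int index_of[NCHARS])
{
    int n = 1;                      /* index 0 is reserved for "space" */
    size_t i;

    assert(letters != NULL);
    for (i = 0; i < NCHARS; i++) index_of[i] = SPACE_INDEX;
    for (i = 0; letters[i] != '\0'; i++) {
        unsigned char c = (unsigned char) letters[i];
        /* Skip bytes that were already given an index. */
        if (index_of[c] == SPACE_INDEX) index_of[c] = n++;
    }
    return n;
}
```

- Calling build_index("ab\"", table) would treat a, b and the double
- quote as letters and everything else as a word separator.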
-
- For those who didn't see the earlier version: unlike all similar programs
- I've seen, names2 will generate output to match any language you like.
- Feed it with text in that language, and it will generate words
- statistically similar to the input text.
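-
- To make "statistically similar" concrete, here is a minimal sketch of
- the underlying idea: choose the next character in proportion to how often
- it followed the preceding characters in the input. It is an illustration
- under simplifying assumptions, not the method used in names2.c: it looks
- at two preceding characters rather than three, and rescans the text
- instead of building tables. The names count_followers and pick_follower
- are hypothetical, and r stands for any random number supplied by the
- caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the places in `text` where the pair (a, b)
   occurs and is followed by at least one more character. */
size_t count_followers(const char *text, char a, char b)
{
    size_t i, n = 0, len;

    assert(text != NULL);
    len = strlen(text);
    for (i = 0; i + 2 < len; i++)
        if (text[i] == a && text[i + 1] == b) n++;
    return n;
}

/* Return the character that follows the (r mod count)-th occurrence of
   the pair (a, b), or '\0' if the pair never has a follower.  With r
   uniformly random, each follower is chosen in proportion to how often
   it appears after (a, b) in the text. */
char pick_follower(const char *text, char a, char b, unsigned long r)
{
    size_t i, len = strlen(text);
    size_t n = count_followers(text, a, b);

    if (n == 0) return '\0';
    r %= n;
    for (i = 0; i + 2 < len; i++) {
        if (text[i] == a && text[i + 1] == b) {
            if (r == 0) return text[i + 2];
            r--;
        }
    }
    return '\0';                    /* not reached when n > 0 */
}
```

- Repeatedly calling pick_follower with the last two characters produced
- so far yields words that resemble the input without copying it.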
-
- It runs on a Macintosh (if you have MPW C) and Unix.
-
- For further information and examples, see the manual entry (near the
- beginning of the shar archive).
-
- The program is public domain. Share and enjoy.
-
- --
- Richard Kennaway SYS, University of East Anglia, Norwich, U.K.
- Internet: jrk@uk.ac.uea.sys uucp: ...mcvax!ukc!uea-sys!jrk
-
- #!/bin/sh
- echo x - MANIFEST1
- sed 's/^X//' >MANIFEST1 <<'*-*-END-of-MANIFEST1-*-*'
- XMANIFEST1 This file.
- Xnames2.1 The manual entry.
- Xnames2.c The source.
- XNames2.make The Macintosh makefile. (See comment at the
- X beginning before using it.)
- XMakefile.unix The UNIX makefile.
- *-*-END-of-MANIFEST1-*-*
- echo x - names2.1
- sed 's/^X//' >names2.1 <<'*-*-END-of-names2.1-*-*'
- X.TH NAMES2 1 "January 1990"
- X.UC
- X.SH NAME
- Xnames2 \- generate random names \- version 2
- X.SH SYNOPSIS
- X.B names2
- X[
- X.B \-3
- X] [
- X.BR \-w\ |\ \-s
- X] [
- X.B \-c
- X] [
- X.BI \-a xxxx
- X] [
- X.BI \-d xxxx
- X] [
- X.BI \-l nnn
- X] [
- X.BI \-r nnn
- X] [
- X.I files
- X]
- X.SH DESCRIPTION
- X.I Names2
- Xis a random name generator. It will read text from standard input or from
- Xfiles given on the command line, and generate a random stream of words
- Xwhose statistical characteristics are similar to those of the input. Thus
- Xif you give it a database of Elvish names, it will generate Elvish-like
- Xnames; if a database of Orcish names, it generates Orc-like names, etc.
- X.PP
- XIt does this by counting the frequency of all 1-, 2-, 3-, and 4-character
- Xsequences of letters or spaces in the input. Case is ignored by default,
- Xand all runs of non-letters are seen as single spaces. The first character
- Xto be output, say "r", is generated according to the relative frequencies
- Xwith which each character was found to follow a space in the input. The
- Xsecond, say "o", is generated according to the relative frequencies with
- Xwhich each character can appear following the digraph " r". The third, say
- X"l" is generated according to the relative frequencies with which each
- Xcharacter follows the trigraph " ro", and thereafter each character is
- Xgenerated according to the frequencies with which the different possible
- Xcharacters follow the preceding three.
- X.PP
- XBy default, the letters are the characters a-z and A-Z, with case
- Xdifferences ignored. There are options letting you specify that case is
- Xsignificant, or to add or remove characters from the set of "letters".
- XThus you can use it to generate names in Greek, Cyrillic, Japanese
- Xkana, etc. provided the input text is encoded in some way with one byte
- Xper character.
- X.PP
- XThe larger the input, the better. You need at least a few thousand bytes
- Xof input for good results. If the input is not large enough, you will
- Xtend to get words from the input appearing verbatim in the output, as much
- Xof the time three consecutive characters will uniquely determine the next
- Xcharacter. (To see an extreme form of this, try running it on the text
- X"the cat sat on the mat".) If more input of the desired form is not
- Xavailable, the \-3 option makes the program use a third-order approximation
- Xinstead, each character of the output depending only on the two preceding
- Xcharacters.
- X.PP
- XThe output is wrapped to 76 chars maximum, hyphenating any word that has
- Xto be broken over a line-end.
- X.PP
- X.I Names2
- Xwill run on Unix, and on a Macintosh as an MPW shell tool with MPW C
- Xversion 3.
- X.SH EXAMPLES
- XFor your inspiration, here are some examples demonstrating its
- Xversatility. To generate Elvish names, feed it with the Sindarin words
- Xfrom a Sindarin-English dictionary. You get something like this:
- X.PP
- X.na
- X.in +3
- X.ll -3
- X.I
- Xthaiglin thoromirin mallorien orth girithrass bregalad imloth
- X.I
- Xmenel berhael cirion saur celebdil aradrif oroth ered eryd mindon
- X.I
- Xmirandros aer balrond adui narfingor emyn gaurnen sernui silien
- X.I
- Xglorn fuin celegost bladhren gil breth argonuiliath hinguruthon...
- X.PP
- X.ad
- XAs you can see, not all the output is directly usable, but by exercising
- Xsome selection you can obtain results like:
- X.PP
- X.na
- X.in +3
- X.ll -3
- X.I
- XTolfalad, Lothlain, Ossarnen, Malbarahir, Minarwen, Eredil,
- X.I
- XSuldor Belebrethand, Berielegor, Gaurgor, Mithron, Galadhril,
- X.I
- XSammathremmir, Erufinrod, Fangband, Turingamar, Ninuviel, Elwinion...
- X.PP
- X.ad
- XPretty convincing, eh? Yet none of these words actually occurred in the
- Xinput file. Using the output as inspiration, you can even construct
- XElvish text. Here is an extract from the tale of Suldor Belebrethand's
- Xjourney through the orclands of Gaurgor to the foothills of Tolfalad, in
- Xan ancient and little-known Sindarin dialect (thus forestalling any
- Xcriticisms from smart-alecs who actually speak Elvish :-)):
- X.PP
- X.na
- X.in +3
- X.ll -3
- X.I
- X...Argirien emyn druadar lhun arach sarinan-duhirion Erufinrod.
- X.I
- XSuldor bas caracharon arad mor gwanui alfiriel, i los hin arant
- X.I
- Xdruadan-calad. Til edras, "Barachaer bas gaurondon", orguldur...
- X.PP
- X.ad
- XIn contrast, here's some output from the names in the index of a book on
- Xearly mediaeval Germany. I told names2 to consider '"' a letter, and
- Xused it to represent a dieresis on the following vowel. On a Macintosh
- Xyou could use the accented characters themselves.
- X.PP
- X.na
- X.in +3
- X.ll -3
- X.I
- Xpassen chutizi albrem m"olden nordgard kunich sched salzburg
- X.I
- Xbaldwig capendhausengar dal vith assau gisela hildestins pr"ubeck
- X.I
- Xsclavaringau edgau rick hed albert alcuin ruodlingen hodo boleslas
- X.I
- Xmisti bert hrothard bold ekkarl wettinau tegenburg zevenstedt...
- X.PP
- X.ad
- XJust the thing for Warhammer. I can see it now: chaos tribes have crossed
- Xthe Chutizi river and threaten the territory of Count Hrothard of the
- XRuodling family, ruler of the town of Zevenstedt and province of
- XNordgard, who sends his faithful servant Hodo to ask for help from King
- XBaldwig in Wettinau; the party encounter the dying Hodo, waylaid by
- Xruffians in the pay of evil Baron Ekkarl; also in the plot are Abbess
- XHildestin of Pr"ubeck, Bishop Albrem of M"olden, the village of
- XCapendhausen, and the city state of Tegenburg-Assau ... the scenario
- Xwrites itself once you get the names right.
- X.PP
- XHere is some output from a list of names of characters in "The Story of
- Xthe Stone", a Chinese novel of the 18th century, in Pinyin romanization:
- X.PP
- X.na
- X.in +3
- X.ll -3
- X.I
- Xxian weng rong shixilua yu lian lin liao wen shiyin hun qian
- X.I
- Xyun xue chuan bing xiaoqing lian yin xian wan jing wen wan tian
- X.I
- Xsiji song xue shen xiang deng langzhe ruhai xin xifei lan hun
- X.I
- Xyouang xi baojinxiao ziten xifei xue erji huan zhe xingyun jie...
- X.PP
- X.ad
- XLastly, this is generated from English text, with all punctuation marks
- Xrecognised as "letters" and upper and lower case distinguished:
- X.PP
- X.in +3
- X.ll -3
- X.I
- Xof a Ch'huen as just is are your prime, stupidition. -- Alfred
- X.I
- Xin a mismal comple it in that fortune und fall Lighs take you
- X.I
- Xdoubtful fleeps. I that just Dented, somed formen. Sit chese
- X.PP
- XWith names2, who needs James Joyce?
- X.SH OPTIONS
- XWhen an option takes an argument, there must be no space between the
- Xoption and the argument. Options must be written separately, e.g. -3 -c,
- Xnot -3c.
- X.TP
- X.B \-3
- XUse trigraph frequencies instead of tetragraph frequencies. Gives better
- Xresults when input data is limited. It is interesting to experiment with
- Xthis option even if you have enough data to use tetragraphs.
- X.TP
- X.B \-axxxx
- XAdd the characters in the string xxxx to the set of "letters".
- X.TP
- X.B \-c
- XTreat letters in different cases as different. By default they are not
- Xdistinguished, and output is in lower-case.
- X.TP
- X.B \-dxxxx
- XRemove the characters in the string xxxx from the set of "letters".
- X\-a and \-d options are processed sequentially, and the \-c option, if
- Xpresent, is applied last.
- X.TP
- X.B \-lnnn
- XGenerate nnn lines of output. Default is 20. No space between the \-l
- Xand the nnn. If \-l is given with no argument, the output will go on
- X(nearly) forever.
- X.TP
- X.B \-rnnn
- XUse nnn as the seed for the random number generator. As the value of the
- Xseed is printed on stderr, this enables you to reproduce the output. By
- Xdefault, the value of the seconds clock is used.
- X.TP
- X.B \-s
- XNegation of
- X.B \-w
- Xoption.
- XThe first character of each word will depend on the last three characters
- Xgenerated (i.e. the last two characters of the preceding word, and the
- Xinter-word space).
- X.TP
- X.B \-w
- XNegation of
- X.B \-s
- Xoption.
- XGenerate successive words independently, i.e. each word begins as if it
- Xwas the beginning of the whole output, ignoring how the preceding word
- Xended. (Default.)
- X.SH DIAGNOSTICS
- X.I Names2
- Xgives a usage message if the arguments are bad. Exits with status 0 if all
- Xwent well. Exits with status 1 if there were bad arguments (other than
- Xnon-existent files) or insufficient memory; in that case no names are
- Xgenerated. Otherwise, exits with status 2 if any files were not found
- X(however, it will read all the files it could find and generate names).
- X.PP
- XWrites to stderr a count of the number of different "letter" characters, a
- Xcount of the characters read (i.e. letters and runs of non-letters), and
- Xthe seed for the random number generator.
- X.PP
- XIf compiled with SHOWTABLE defined, it dumps the tables to standard output
- Xbefore the random names.
- X.SH CHANGES SINCE PREVIOUS VERSION
- XAdded -a, -c, -d, -r options. Vastly improved memory allocation. Ported
- Xto MPW version 3.
- X.SH BUGS
- XThe ignoring of case only applies to the characters a-z and A-Z, not to
- Xthe accented letters and ligatures in the Macintosh character set. If you
- Xwant to accept all the extra characters and ignore case differences, you
- Xwill need to preprocess your input to map, say, A-dieresis into
- Xa-dieresis, OE to oe, etc.
- X.SH FURTHER IDEAS
- XThere is still some room for improvement in the efficiency of
- Xrepresentation of the tables. The space required is approximately four
- Xtimes the size of the input, plus eight times the square of the size of
- Xthe alphabet. With the -3 option, it would be possible to compress the
- Xtables by half, but this is not done - the presence of the -3 option makes
- Xno difference to the amount of memory used.
- X.PP
- XArrange to write the tables to a file and read them in again, to avoid
- Xhaving to reconstruct them every time you run the program on the same
- Xinput.
- X.PP
- XThe enthusiastic may want to convert the program to run as a stand-alone
- Xapplication on the Macintosh.
- X.SH ACKNOWLEDGEMENTS
- XThe distribution includes several word-lists from various sources: the
- XSindarin dictionary contained in "An Introduction to Elvish", edited by
- XJim Allan, published by Bran's Head Books Ltd, 91 Wimborne Avenue, Hayes,
- XMiddlesex, U.K., 1978 (but I hear they've gone bust, so that address may
- Xnot be any use); the personal and place names from the index of "Rule and
- XConflict in an early Medieval Society", by Karl Leyser (Basil Blackwell,
- XOxford, U.K., 1989); the names from the index of characters of "The Story
- Xof the Stone", by Cao Xueqin (trans. David Hawkes, 3 volumes, Penguin
- X1973); and some text from Chaucer.
- X.PP
- XThe Chinese file is rather short; the example above was produced with
- Xthe -3 option.
- X.SH AUTHOR
- XRichard Kennaway.
- X.TP
- Xjrk@sys.uea.ac.uk (INTERNET), ...mcvax!uea-sys!jrk (UUCP).
- X.TP
- XThis program is public domain.
- *-*-END-of-names2.1-*-*
- echo x - names2.c
- sed 's/^X//' >names2.c <<'*-*-END-of-names2.c-*-*'
- X/* names2.c */
- X/* Random name generator */
- X
- X/* Richard Kennaway */
- X/* INTERNET: jrk@uk.ac.uea.sys */
- X/* UUCP: ...mcvax!uea-sys!jrk */
- X
- X/* Public domain! */
- X/* August 1989: First version. */
- X/* January 1990: Ported to MPW3.
- X Removed some untidiness (lint warnings).
- X Print randseed to stderr and take randseed as option
- X to allow reproducibility.
- X Better representation of tetragraph table.
- X Ability to specify character set. */
- X
- X
- X#define FALSE 0
- X#define TRUE 1
- X
- X/* Choose one... */
- X#define UNIX TRUE /* Version for Unix */
- X#define MPW FALSE /* Version for Apple Macintosh (MPW C) */
- X
- X/* If MPW is TRUE, define one of MPW2 or MPW3 as TRUE, the other as FALSE. */
- X#define MPW2 FALSE /* MPW version 2 */
- X#define MPW3 FALSE /* MPW version 3 */
- X
- X
- X/* System declarations */
- X
- X#include <stdio.h>
- X#if MPW
- X#include <Memory.h> /* For BlockMove(). */
- X#include <QuickDraw.h> /* For random numbers. */
- X#include <OSUtils.h> /* For GetDateTime(). */
- X#endif
- X
- X#define EOFCHAR (-1)
- X
- Xextern char *malloc();
- X
- X
- X/* Compatibility */
- X
- Xtypedef char int8;
- Xtypedef unsigned char uint8;
- Xtypedef short int16;
- Xtypedef unsigned short uint16;
- Xtypedef unsigned long uint32;
- Xtypedef long int32;
- X
- X#define MAXUINT8 ((uint8) ((int8) (-1)))
- X#define MAXUINT16 ((uint16) ((int16) (-1)))
- X#define MAXUINT32 ((uint32) ((int32) (-1)))
- X#define NUMCHARS 256
- X#define chartoint(c) ((int16)(uint8)(c))
- X#define A_CHAR chartoint('A')
- X#define Z_CHAR chartoint('Z')
- X#define a_CHAR chartoint('a')
- X#define z_CHAR chartoint('z')
- X#define SPACE_CHAR chartoint(' ')
- X#define A_TO_a (a_CHAR-A_CHAR)
- X
- X#if MPW2
- X#define NEWLINECHAR chartoint('\r')
- X#endif
- X#if UNIX || MPW3
- X#define NEWLINECHAR chartoint('\n')
- X/* Note: the actual value of '\n' is different in UNIX and MPW3,
- X and '\n' in MPW3 is the same as '\r' in MPW2. */
- X#endif
- X
- X
- X/* Where is the random number generator? */
- X
- X#if UNIX
- Xtypedef char *Ptr;
- X#define Boolean int
- X#define BlockMove bcopy
- Xint32 random();
- X#define Random() ((int16) (random()))
- X#endif
- Xuint32 RandSeed;
- X
- X
- X/* Globals. */
- X
- Xint Argc;
- Xunsigned char **Argv;
- Xint ExitStatus = 0;
- XBoolean FileArgs = FALSE, Big = TRUE, SeparateWords = TRUE;
- XBoolean CaseSignificant = FALSE, Letters[NUMCHARS];
- Xint16 CurFile;
- X
- X
- X/* Layout. */
- X
- X#define BREAK1 60
- X#define BREAK2 75
- Xint16 Column = 0;
- Xint32 Lines = 0;
- X#define DEFAULTMAXLINES 20
- Xint32 MaxLines = DEFAULTMAXLINES;
- X
- X
- X/* Tables */
- X
- Xint16 NumChars = 0;
- X#define SPACEINDEX 0
- Xint32 t2size, t3size, t4size;
- X
- X#define NOTCHOICE MAXUINT16
- X
- Xint16 CharToIndex[NUMCHARS], IndexToChar[NUMCHARS];
- X
- Xint32 table0 = 0,
- X *table1 = NULL,
- X *table2 = NULL;
- X
- X#define BLOCKSIZE(n) (sizeof(int32) + sizeof(int32) + (n)*sizeof(uint16))
- X#define INITSIZE 10
- X#define GROWNUM 5
- X#define GROWDEN 4
- X
- Xtypedef struct DigraphBlock {
- X int32 size, maxsize;
- X uint16 data[1];
- X} DigraphBlockRec, *DigraphBlockPtr;
- X
- XDigraphBlockPtr *quadtable = NULL;
- X
- X
- X/* Sorting */
- X
- Xstatic void SortArray();
- X
- Xtypedef Boolean (*ComparisonProc)();
- X
- X
- X/* Memory allocation */
- X
- Xchar *trymemory( bytesNeeded, mustGet )
- Xint32 bytesNeeded;
- XBoolean mustGet;
- X{
- Xchar *result;
- X
- X result = (char *) malloc( bytesNeeded );
- X if ((result==NULL) && (mustGet)) {
- X fprintf( stderr, "Could not get %lu bytes - terminating.%c",
- X bytesNeeded, NEWLINECHAR );
- X ExitStatus = 1;
- X exit( ExitStatus );
- X }
- X return( result );
- X} /* char *trymemory( bytesNeeded, mustGet ) */
- X
- Xvoid zero( start, numBytes )
- Xchar *start;
- Xint32 numBytes;
- X{
- X/* Your system may well have a faster way of zeroing memory. */
- X/* In fact, the static arrays to which this procedure is applied */
- X/* may be automatically initialised to zero already. */
- X/* But portability would be impaired by assuming that. */
- X
- Xint32 remainder, i, num32bits;
- X
- X remainder = numBytes % ((int32) 4);
- X for (i=1; i <= remainder; i++) start[numBytes-i] = 0;
- X num32bits = numBytes / ((int32) 4);
- X for (i=0; i<num32bits; i++) ((int32 *) start)[i] = 0;
- X} /* void zero( start, numBytes ) */
- X
- Xvoid getmemory()
- X{
- Xint32 i;
- X
- X table1 = (int32 *) trymemory( NumChars * sizeof(int32), TRUE );
- X table2 = (int32 *) trymemory( t2size * sizeof(int32), TRUE );
- X quadtable = (DigraphBlockPtr *) trymemory( t2size * sizeof(DigraphBlockPtr), TRUE );
- X
- X zero( (char *) table1, NumChars * sizeof(int32) );
- X zero( (char *) table2, t2size * sizeof(int32) );
- X for (i=0; i<t2size; i++) quadtable[i] = NULL;
- X} /* void getmemory() */
- X
- Xvoid freememory()
- X{
- X if (table1 != NULL) free( (char *) table1 );
- X if (table2 != NULL) free( (char *) table2 );
- X} /* void freememory() */
- X
- X
- X/* Preliminary setup */
- X
- Xvoid setchar( c, accept )
- Xuint8 c;
- XBoolean accept;
- X{
- X Letters[c] = accept;
- X if (! CaseSignificant) {
- X if ((A_CHAR <= c) && (c <= Z_CHAR)) Letters[c + A_TO_a] = accept;
- X if ((a_CHAR <= c) && (c <= z_CHAR)) Letters[c - A_TO_a] = accept;
- X }
- X} /* void setchar( c, accept ) */
- X
- Xvoid setchars( s, accept )
- Xuint8 *s;
- XBoolean accept;
- X{
- Xint16 i;
- Xuint8 c;
- X
- X if (s==NULL) return;
- X i = 0;
- X while ((c = s[i++]) != 0) setchar( c, accept );
- X} /* void setchars( s, accept ) */
- X
- Xvoid maketranstable()
- X{
- Xint16 c;
- X
- X for (c=0; c < NUMCHARS; c++) {
- X CharToIndex[(uint8)c] = SPACEINDEX;
- X IndexToChar[(uint8)c] = SPACE_CHAR;
- X }
- X NumChars = 1;
- X if (!CaseSignificant) {
- X for (c=a_CHAR; c<= z_CHAR; c++) {
- X if (Letters[(uint8)(c - A_TO_a)] != Letters[(uint8)c]) {
- X Letters[(uint8)c] = TRUE;
- X Letters[(uint8)(c - A_TO_a)] = TRUE;
- X }
- X }
- X }
- X for (c=0; c < NUMCHARS; c++) {
- X if (Letters[(uint8)c] && (CaseSignificant || (c < A_CHAR) || (Z_CHAR < c))) {
- X CharToIndex[(uint8)c] = NumChars;
- X IndexToChar[(uint8)NumChars] = c;
- X NumChars++;
- X }
- X }
- X if (!CaseSignificant) {
- X for (c=a_CHAR; c<= z_CHAR; c++) {
- X CharToIndex[(uint8)(c - A_TO_a)] = CharToIndex[(uint8)c];
- X }
- X }
- X IndexToChar[(uint8)SPACEINDEX] = SPACE_CHAR;
- X
- X t2size = NumChars*NumChars;
- X t3size = t2size*NumChars;
- X t4size = t2size*t2size;
- X} /* void maketranstable() */
- X
- X
- X/* Input */
- X
- XBoolean openfile()
- X{
- XFILE *temp;
- X
- X temp = freopen( Argv[CurFile], "r", stdin );
- X if (temp == NULL) {
- X fprintf( stderr, "%s: could not open file \"%s\"%c",
- X Argv[0], Argv[CurFile], NEWLINECHAR );
- X ExitStatus = 2;
- X }
- X return( temp != NULL );
- X} /* Boolean openfile() */
- X
- XBoolean getnextfile()
- X{
- X while (((++CurFile) < Argc) && (! openfile())) { /* nothing */ }
- X return( CurFile < Argc );
- X} /* Boolean getnextfile() */
- X
- Xint16 getrawchar()
- X{
- Xint16 c;
- X c = getchar();
- X while ((c==EOFCHAR) && getnextfile()) {
- X c = getchar();
- X }
- X return(c);
- X} /* int16 getrawchar() */
- X
- X#define WASSPACE 0
- X#define WASNONSPACE 1
- X#define END 2
- Xint16 Where = WASSPACE;
- X
- Xint16 nextchar()
- X{
- Xint16 c;
- X
- X switch (Where) {
- X case WASSPACE:
- X while (((c = getrawchar()) != EOFCHAR) &&
- X (!Letters[(uint8)c])) {
- X /* nothing */
- X }
- X if (c==EOFCHAR) {
- X Where = END;
- X return(-1);
- X } else {
- X Where = WASNONSPACE;
- X return(CharToIndex[(uint8)c]);
- X }
- X case WASNONSPACE:
- X c = getrawchar();
- X if (c==EOFCHAR) {
- X Where = END;
- X return(SPACEINDEX);
- X } else if (Letters[(uint8)c]) {
- X return(CharToIndex[(uint8)c]);
- X } else {
- X Where = WASSPACE;
- X return(SPACEINDEX);
- X }
- X case END:
- X return(-1);
- X }
- X return(-1); /* Never happens. */
- X} /* int16 nextchar() */
- X
- XDigraphBlockPtr NewBlock( size )
- Xint32 size;
- X{
- XDigraphBlockPtr temp;
- X temp = (DigraphBlockPtr) malloc( BLOCKSIZE(size) );
- X return( temp );
- X} /* DigraphBlockPtr NewBlock( size ) */
- X
- XBoolean insertdigraph( t, cd )
- XDigraphBlockPtr *t;
- Xuint16 cd;
- X{
- XDigraphBlockPtr temp;
- Xint32 newSize;
- X
- X if (t==NULL) return( FALSE );
- X if (((*t)==NULL) || ((*t)->size >= (*t)->maxsize)) {
- X newSize = (*t)==NULL ? INITSIZE : ((*t)->size * GROWNUM)/GROWDEN;
- X temp = NewBlock( newSize );
- X if (temp==NULL) return( FALSE );
- X if ((*t)==NULL) {
- X temp->size = 1;
- X } else {
- X BlockMove( (Ptr) (*t), (Ptr) temp, BLOCKSIZE((*t)->size) );
- X temp->size = (*t)->size + 1;
- X free( (char *) (*t) );
- X }
- X temp->maxsize = newSize;
- X *t = temp;
- X (*t)->data[(*t)->size-1] = cd;
- X } else {
- X (*t)->data[(*t)->size++] = cd;
- X }
- X return( TRUE );
- X} /* Boolean insertdigraph( t, cd ) */
- X
- Xint16 AA = 0, BB = 0, CC = 0;
- X
- Xvoid entergroup( d )
- Xint16 d;
- X{
- Xint32 ab, cd;
- X
- X ab = AA*NumChars + BB;
- X cd = CC*NUMCHARS + d;
- X if (table2[ab] < MAXUINT16) {
- X if (insertdigraph( &(quadtable[ab]), (uint16) cd )) {
- X table0++;
- X table1[d]++;
- X table2[ab]++;
- X }
- X }
- X AA = BB; BB = CC; CC = d;
- X} /* void entergroup( d ) */
- X
- Xvoid buildtable()
- X{
- Xint16 a0, b0, c0, d;
- X
- X a0 = nextchar();
- X if (a0==SPACEINDEX) a0 = nextchar();
- X b0 = nextchar();
- X c0 = nextchar();
- X if (c0 == -1) return;
- X AA = a0; BB = b0; CC = c0;
- X while ((d = nextchar()) != (-1)) {
- X entergroup( d );
- X }
- X if (CC==SPACEINDEX) {
- X entergroup( a0 );
- X entergroup( b0 );
- X entergroup( c0 );
- X } else {
- X entergroup( SPACEINDEX );
- X entergroup( a0 );
- X entergroup( b0 );
- X entergroup( c0 );
- X }
- X} /* void buildtable() */
- X
- X
- X#ifdef SHOWTABLE
- X
- X/* Dump the tables. */
- X/* Only called if SHOWTABLE is defined at compile time. */
- X
- Xvoid showtable()
- X{
- Xint16 i, j;
- Xint32 k;
- Xint32 *t2;
- Xuint16 *t4;
- X
- X for (i=0; i<NumChars; i++) if (table1[i] != 0) {
- X printf( "%c\t%ld%c", IndexToChar[i], table1[i], NEWLINECHAR );
- X t2 = table2 + i*NumChars;
- X for (j=0; j<NumChars; j++) if (t2[j] != 0) {
- X printf( "%c%c\t%ld", IndexToChar[i], IndexToChar[j], t2[j] );
- X t4 = quadtable[i*NumChars + j]->data;
- X for (k=0; k<t2[j]; k++) {
- X if ((k%20==0) && (k>0)) { putchar( NEWLINECHAR ); putchar( '\t' ); }
- X putchar( ' ' );
- X /* Each entry packs two indices: third character in the high byte, */
- X /* fourth in the low byte, whatever the byte order of the machine. */
- X putchar( IndexToChar[ (uint8)(t4[k] >> 8) ] );
- X putchar( IndexToChar[ (uint8)(t4[k] & 0xFF) ] );
- X }
- X putchar( NEWLINECHAR );
- X }
- X putchar( NEWLINECHAR );
- X }
- X fflush( stdout );
- X} /* void showtable() */
- X
- X#endif
- X
- X
- X/* Generation of output */
- X
- Xuint16 Rand16()
- X{
- X return( (uint16) Random() );
- X} /* uint16 Rand16() */
- X
- Xint32 randint( max )
- Xint32 max;
- X{
- X if (max==0) return( 0 );
- X if (max <= MAXUINT16) return( ((int32) Rand16())%max );
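- X /* Combine two unsigned 16-bit values so the result is never negative. */
- X return( (int32) (((((uint32) Rand16()) << 16) | ((uint32) Rand16())) % ((uint32) max)) );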
- X} /* int32 randint( max ) */
- X
- Xuint16 randchoice32( tot, dist )
- Xint32 tot;
- Xint32 *dist;
- X{
- Xint32 i;
- Xuint8 j;
- X
- X if (tot==0) return(NOTCHOICE);
- X i = randint( tot );
- X for (j=0; j<NumChars; j++) {
- X if (i < dist[j]) return(j);
- X i -= dist[j];
- X }
- X return( NOTCHOICE ); /* Should never happen. */
- X} /* uint16 randchoice32( tot, dist ) */
- X
- Xcleanupquads()
- X{
- Xint32 i;
- X
- X for (i=0; i<t2size; i++) if (table2[i] > 0) {
- X SortArray( quadtable[i]->data, quadtable[i]->size );
- X }
- X} /* cleanupquads() */
- X
- Xuint16 randtrip( a, b )
- Xuint8 a, b;
- X{
- Xuint16 aNb;
- Xint32 t2;
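- Xuint16 *t4;
- Xint32 r;
- X
- X aNb = a*NumChars+b;
- X t2 = table2[ aNb ];
- X if (t2==0) return( NOTCHOICE ); /* Digraph never seen: no block allocated. */
- X t4 = quadtable[ aNb ]->data;
- X r = randint( t2 );
- X /* The third character is the high byte of the packed entry, */
- X /* whatever the byte order of the machine. */
- X return( (uint16) (t4[r] >> 8) );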
- X} /* uint16 randtrip( a, b ) */
- X
- Xuint16 randquad( a, b, c )
- Xuint8 a, b, c;
- X{
- Xuint16 aNb;
- Xint32 t2;
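- Xuint16 *t4;
- Xint32 lo, hi, i, r;
- Xuint16 ci;
- X
- X aNb = a*NumChars+b;
- X t2 = table2[ aNb ];
- X if (t2==0) return( NOTCHOICE ); /* Digraph never seen: no block allocated. */
- X t4 = quadtable[ aNb ]->data;
- X /* Entries are sorted, and the third character is the high byte of each, */
- X /* so the entries whose third character is c form the run lo..hi-1. */
- X lo = 0;
- X hi = 0;
- X for (i=0; i<t2; i++) {
- X ci = t4[i] >> 8;
- X if (ci <= c){
- X hi++;
- X if (ci < c) lo++;
- X } else break;
- X }
- X if (lo >= hi) {
- X /* This should never happen. */
- X return( NOTCHOICE );
- X }
- X r = lo + randint( hi-lo );
- X return( (uint16) (t4[r] & 0xFF) );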
- X} /* uint16 randquad( a, b, c ) */
- X
- Xvoid outchar( c )
- Xint16 c;
- X{
- X if (Column < BREAK1) {
- X putchar(c); Column++;
- X } else if (c==chartoint(' ')) {
- X putchar( NEWLINECHAR );
- X Column = 0; Lines++;
- X } else if (Column >= BREAK2) {
- X putchar('-'); putchar( NEWLINECHAR );
- X Column = 0; Lines++;
- X if (Lines < MaxLines) {
- X putchar(c); Column++;
- X }
- X } else {
- X putchar(c); Column++;
- X }
- X} /* void outchar( c ) */
- X
- Xvoid generateword()
- X{
- Xuint16 a, b, c, d;
- X
- X a = (uint16)SPACEINDEX;
- X b = randchoice32( (int32) (table1[a]), table2 + a*NumChars );
- X if (b==NOTCHOICE) return;
- X outchar( IndexToChar[(uint8)b] );
- X if (SeparateWords && (b==SPACEINDEX)) return;
- X
- X c = randtrip( (uint16)SPACEINDEX, (uint8)b );
- X outchar( IndexToChar[(uint8)c] );
- X if (SeparateWords && (c==(uint16)SPACEINDEX)) return;
- X
- X while (Lines < MaxLines) {
- X d = Big ? randquad( (uint8)a, (uint8)b, (uint8)c )
- X : randtrip( (uint8)b, (uint8)c );
- X if (d==NOTCHOICE) {
- X outchar( '.' );
- X return;
- X }
- X outchar( IndexToChar[(uint8)d] );
- X if (SeparateWords && (d==(uint16)SPACEINDEX)) return;
- X a = b; b = c; c = d;
- X }
- X} /* void generateword() */
- X
- Xvoid generate()
- X{
- X if (table0 > 0) while (Lines < MaxLines) generateword();
- X} /* void generate() */
- X
- X
- X/* Argument parsing */
- X
- Xvoid usageerror()
- X{
- X fprintf( stderr, "Usage: %s [-3] [-s|-w] [-c] [-axxxx] [-dxxxx] [-lnnn] [-rnnn] [file]%c",
- X Argv[0], NEWLINECHAR );
- X fprintf( stderr, "\t-3: 3rd-order statistics.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-w: Successive words are independent (default).%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-s: (Sentences) Successive words are dependent.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-c: Treat case differences as significant.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-axxxx: Accept characters in string \"xxxx\" as 'letters'.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-dxxxx: Treat characters in string \"xxxx\" as spaces.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\tSuccessive -a/-d options are processed sequentially.%c",
- X NEWLINECHAR );
- X fprintf( stderr, "\t-lnnn: Generate nnn lines of output (default %d).%c",
- X DEFAULTMAXLINES, NEWLINECHAR );
- X fprintf( stderr, "\t-rnnn: Specify random generator seed.%c",
- X NEWLINECHAR );
- X ExitStatus = 1;
- X exit( ExitStatus );
- X} /* void usageerror() */
- X
- Xvoid processoptions()
- X{
- Xint16 i;
- X
- X/* getopt()? What's that? :-) */
- X
- X CaseSignificant = FALSE;
- X for (i=0; i<NUMCHARS; i++) Letters[(uint8)i] = FALSE;
- X setchars( (uint8 *) "abcdefghijklmnopqrstuvwxyz", TRUE );
- X
- X CurFile = Argc;
- X for (i=1; i<Argc; i++) {
- X if (Argv[i][0] == '-') {
- X switch (Argv[i][1]) {
- X case 's':
- X SeparateWords = FALSE;
- X break;
- X case 'w':
- X SeparateWords = TRUE;
- X break;
- X case 'a':
- X setchars( &(Argv[i][2]), TRUE ); /* characters after "-a" */
- X break;
- X case 'd':
- X setchars( &(Argv[i][2]), FALSE ); /* characters after "-d" */
- X break;
- X case 'c':
- X CaseSignificant = TRUE;
- X break;
- X case '3':
- X Big = FALSE;
- X break;
- X case 'l':
- X if (Argv[i][2]==0) {
- X /* No count given: run (nearly) forever, up to the largest int32. */
- X MaxLines = (int32) (((uint32) MAXUINT32) >> 1);
- X } else if ((sscanf( &(Argv[i][2]), "%ld", &MaxLines ) != 1) ||
- X (MaxLines < 0)) {
- X usageerror(); /* exits */
- X }
- X break;
- X case 'r':
- X if ((Argv[i][2] != 0) &&
- X (sscanf( &(Argv[i][2]), "%lu", &RandSeed ) != 1)) {
- X usageerror(); /* exits */
- X }
- X break;
- X default:
- X usageerror(); /* exits */
- X }
- X } else if (Argv[i][0] == 0) {
- X FileArgs = FALSE;
- X } else {
- X FileArgs = TRUE;
- X CurFile = i-1;
- X (void) getnextfile();
- X return;
- X }
- X }
- X} /* void processoptions() */
- X
- X
- X/* Control */
- X
- X#if UNIX
- Xcleanup( status, ignore )
- Xint status;
- Xchar *ignore;
- X#endif
- X#if MPW
- Xvoid cleanup( status )
- Xint status;
- X#endif
- X{
- X freememory();
- X} /* cleanup( status, ignore ) */
- X
- Xvoid SeedRand()
- X{
- X#if MPW
- X qd.randSeed = (int32) RandSeed;
- X#endif
- X#if UNIX
- X srandom( RandSeed );
- X#endif
- X} /* void SeedRand() */
- X
- Xmain( argc, argv )
- Xint argc;
- Xuint8 **argv;
- X{
- X Argc = argc; Argv = argv;
- X
- X#if MPW
- X InitGraf( &(qd.thePort) ); /* for random numbers */
- X GetDateTime( &RandSeed );
- X#endif
- X#if UNIX
- X RandSeed = time(0);
- X on_exit( cleanup, NULL ); /* Probably not necessary. */
- X#endif
- X#if MPW2
- X onexit( cleanup ); /* Maybe necessary? */
- X#endif
- X
- X processoptions();
- X
- X SeedRand();
- X maketranstable();
- X getmemory();
- X fprintf( stderr, "Reading input...%c", NEWLINECHAR );
- X buildtable();
- X fprintf( stderr, "%d different letters, %ld characters input. Randseed = %lu%c",
- X NumChars-1, table0, RandSeed, NEWLINECHAR );
- X if (table0 > 0) {
- X#ifdef SHOWTABLE
- X showtable();
- X#endif
- X if (Big) cleanupquads();
- X#ifdef SHOWTABLE
- X showtable();
- X#endif
- X generate();
- X fflush( stdout );
- X }
- X exit( ExitStatus );
- X} /* main() */
- X
- X
- X/* Heapsort. */
- X
- Xstatic uint16 *TheArray;
- X
- Xuint16 Temp;
- X
- X#define SWAPITEM( i, j ) \
- X Temp = TheArray[(i)]; \
- X TheArray[(i)] = TheArray[(j)]; \
- X TheArray[(j)] = Temp \
- X
- Xstatic void MakeHeap( theElement, numElements )
- Xint32 theElement, numElements;
- X{
- Xint32 left, right;
- X
- X while ((left = theElement+theElement+1L) < numElements) {
- X right = left+1L;
- X if (TheArray[theElement] < TheArray[left]) {
- X if ((right < numElements) &&
- X (TheArray[left] < TheArray[right])) {
- X /* M<L<R */
- X SWAPITEM( theElement, right );
- X theElement = right;
- X } else { /* M<L, M<R<L, R<M<L */
- X SWAPITEM( theElement, left );
- X theElement = left;
- X }
- X } else if ((right < numElements) &&
- X (TheArray[theElement] < TheArray[right])) {
- X /* L<M<R */
- X SWAPITEM( theElement, right );
- X theElement = right;
- X } else {
- X /* L<M, L<R<M, R<L<M */
- X break;
- X }
- X }
- X} /* static void MakeHeap( theElement, numElements ) */
- X
- Xstatic void SortArray( theArray, length )
- Xuint16 *theArray;
- Xint32 length;
- X{
- Xint32 i;
- X
- X TheArray = theArray;
- X
- X for (i = (length / 2L) - 1L; i >= 0L; i--) {
- X MakeHeap( i, length );
- X }
- X for (i = length-1L; i >= 1L; i--) {
- X SWAPITEM( 0L, i );
- X MakeHeap( 0L, i );
- X }
- X} /* static void SortArray( theArray, length ) */
- *-*-END-of-names2.c-*-*
- echo x - Names2.make
- sed 's/^X//' >Names2.make <<'*-*-END-of-Names2.make-*-*'
- X# Replace each backslash by option-d, and each colon by option-f.
- X# Then use the Build command on the Build menu to build Names2.
- X
- X# File: Names2.make
- X# Target: Names2
- X# Sources: names2.c
- X# Created: Thursday, February 15, 1990 20:42:06 with MPW C version 3
- X
- Xnames2.c.o : Names2.make names2.c
- X C names2.c
- X
- XSOURCES = names2.c
- XOBJECTS = names2.c.o
- X
- XNames2 :: Names2.make {OBJECTS}
- X Link -w -c 'MPS ' -t MPST \
- X {OBJECTS} \
- X "{Libraries}"stubs.o \
- X "{CLibraries}"CRuntime.o \
- X "{Libraries}"Interface.o \
- X "{CLibraries}"StdCLib.o \
- X "{CLibraries}"CSANELib.o \
- X "{CLibraries}"Math.o \
- X "{CLibraries}"CInterface.o \
- X "{Libraries}"ToolLibs.o \
- X -o Names2
- *-*-END-of-Names2.make-*-*
- echo x - Makefile.unix
- sed 's/^X//' >Makefile.unix <<'*-*-END-of-Makefile.unix-*-*'
- Xnames2 : names2.c
- X cc names2.c -o names2
- *-*-END-of-Makefile.unix-*-*
- exit
-
-