Random text generation as specified by TextDNA files is not the only function of your JanusNode robot. It also has an entirely orthogonal set of tools for text-morphing: that is, for altering pre-existent texts (including but not limited to texts produced by your JanusNode) in random but controlled ways. To access these functions, set the Mode Switch to 'Morph'. The 'TextDNA:' pop-up menu will disappear, because text-morphing does not use the TextDNA files, and the contents of the 'Subject:' pop-up menu will change to reflect the many currently-available text-morphing tools.
There are three main types of text-morphing tools: a variety of
methods of Markov chaining, which allow you to build statistical
models of texts; a set of mappings, which allow you to make systematic
substitutions in a target text; and a variety of random methods,
which simply mix up the target text in random ways. There is also
a way of adding new functions of any kind. We will first consider
the Markov chaining methods.
The idea behind Markov chaining is very simple: for every element of a text, compute the likelihood that it is followed by each other element, then reconstruct the text in a way which reflects those probabilities, by stepping through the probability table. For example, in the string 'ababca' the likelihood that 'b' will follow 'a' is 100% (the final 'a' is followed by nothing, so it does not count), while the likelihood that 'a' will follow 'b' is just 50%, since 'b' is followed exactly once by 'a' and exactly once by 'c'. One probabilistic reconstruction of this string might be 'bcabcaba', since this string has (roughly) the same statistical structure as the original one.
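The 'ababca' example can be checked with a few lines of Python. This is only a sketch of the general technique, not JanusNode's own code (which is written in Hypertalk); the names `build_table`, `probabilities` and `chain` are invented for the illustration.

```python
import random
from collections import Counter, defaultdict

def build_table(elements):
    """Count how often each element is followed by each other element."""
    table = defaultdict(Counter)
    for current, following in zip(elements, elements[1:]):
        table[current][following] += 1
    return table

def probabilities(table, element):
    """Turn the raw follower counts for one element into probabilities."""
    counts = table[element]
    total = sum(counts.values())
    return {follower: n / total for follower, n in counts.items()}

def chain(table, start, length, rng=random):
    """Step through the table, picking each follower in proportion to its count."""
    output = [start]
    while len(output) < length and table[output[-1]]:
        followers = table[output[-1]]
        choice = rng.choices(list(followers), weights=list(followers.values()))[0]
        output.append(choice)
    return output

table = build_table("ababca")
print(probabilities(table, "a"))   # {'b': 1.0}
print(probabilities(table, "b"))   # {'a': 0.5, 'c': 0.5}
print("".join(chain(table, "b", 8)))  # a random 8-letter reconstruction
```

The same functions work on a list of words instead of a string of letters, which is the case the tools described below are concerned with.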
Any ordered sequence of elements can be Markov chained. Your JanusNode
will Markov chain any text, including its own output, although,
for reasons which may be clear upon reflection, simply chaining
JanusNode's output is usually not very interesting- or, at least,
no more interesting (but much slower) than generating it from
scratch. It is more interesting to chain 'real' texts, which you
can paste in to JanusNode, or read in from text files. You can
also use probability tables that are stored in the 'MarkovTables'
folder inside the 'JanusNode Resources' folder. JanusNode ships
with a multitude of tables you can play with. You can easily make
(and mix) your own, as described below.
There are, alas, a few technical limitations in Markov chaining
with your JanusNode. One is that it is quite slow, as JanusNode
needs to take two passes through the text to build the probability
table, and JanusNode is not fast. (However, once the probability
table is built and stored in the MarkovTables folder, your JanusNode
can use it without further computational effort.) The other limitation
is on the size of the input texts: JanusNode is limited internally
to tables that are smaller than 32K, although you can spread
a single text across several chunks of that size. Because the Markov
table of a text is sometimes larger than the text itself, even a
text that is under the limit may produce a table that is over it.
JanusNode simply cuts off the text when it runs out of internal
storage space. This is transparent to the user, and you don't
have to worry about it, unless it is extremely important for you
to be sure that every word in your input text was Markov-chained.
Your JanusNode is able to provide a variety of Markov-chaining
services. All are available in the same way, by choosing them
from the subject/method menu and then clicking on the Janus icon.
'Make Markov file' will create a new Markov chain table
(as a text file) from the text in the JanusNode window, and store
it on disk for you. If you want to be able to use it, you must
store it in the 'MarkovTables' folder inside the 'JanusNode Resources'
folder. JanusNode uses the same tables for both single and pairwise
Markov chaining, and it only looks in this folder for them. Although
the tables are editable text files, it is a bad idea to alter
them in any way unless you are quite sure you understand what
you are doing, as it is very easy to introduce errors into these
tables which will render them useless. For example, you can't
just add or delete words as you like, as the table is full of
dependencies.
'Write loosely' allows you to use one or more of the
pre-calculated tables stored in the 'MarkovTables' folder, using
single-word chaining. This means that any word pair is guaranteed
to contain words that actually appeared together in that order
in the original text. You can select as many tables as you like
from the folder (use the command key to make a discontiguous selection).
Your JanusNode will chain and randomly switch between tables as
it produces a text. The resultant output will resemble all of
the input files (i.e. the files from which the tables were made)
to some degree- it is something very much like a statistical average
of those files. The 'Write loosely' function is your best bet
for generating original output, since the looseness of the algorithm
makes it relatively unlikely that any of the lines in the output
appeared exactly in that form in the input file.
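Here is a rough Python sketch of single-word chaining across several tables. It is an illustration under stated assumptions, not JanusNode's algorithm: the table layout, the function names and the `switch_chance` of 0.3 are all made up, and the two tiny source texts stand in for real Markov table files.

```python
import random
from collections import defaultdict

def word_table(text):
    """Map each word to the list of words that followed it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return dict(table)

def write_loosely(tables, length, switch_chance=0.3, rng=random):
    """Chain by single words, sometimes hopping to another table that knows the current word."""
    active = rng.choice(tables)
    word = rng.choice(sorted(active))
    output = [word]
    while len(output) < length:
        candidates = [t for t in tables if word in t]
        if not candidates:
            break  # dead end: no table records a follower for this word
        if word not in active or rng.random() < switch_chance:
            active = rng.choice(candidates)
        word = rng.choice(active[word])
        output.append(word)
    return output

cats = "the cat sat on the mat"
dogs = "the dog sat on the log"
tables = [word_table(cats), word_table(dogs)]
print(" ".join(write_loosely(tables, 12)))
```

Every adjacent pair of words in the output occurred somewhere in one of the sources, but a line such as 'the cat sat on the log' appears in neither, which is the sense in which loose chaining tends to produce original output.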
'Write tightly' allows you to use one or more of the
pre-calculated tables stored in the 'MarkovTables' folder using
pair-wise chaining. This means that any word _triplet_ (not word
pair, as above) is guaranteed to contain words that actually
appeared together in that order in the original text. You can
again select as many tables as you like from the folder (again,
use the command key to make a discontiguous selection), in order
to mingle different tables. The JanusNode will chain and randomly
switch between tables as it produces a text. However, because
of the nature of 'tight' Markov chaining (which requires a match
on word pairs rather than on single words) JanusNode may have
trouble finding a file to switch to. As a result, it may stay
with the same file for a relatively long time. You will see it
trying to switch and switching back, if you look in the information
window that appears when it is working.
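The difference between loose and tight chaining is easiest to see in code. The following Python sketch (again an illustration, with invented names, and not how JanusNode stores its tables) keys the table on word pairs, so each new word must have followed the previous two words in the source.

```python
import random
from collections import defaultdict

def pair_table(text):
    """Map each adjacent word pair to the list of words that followed that pair."""
    words = text.split()
    table = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        table[(first, second)].append(third)
    return dict(table)

def write_tightly(table, length, rng=random):
    """Chain by word pairs: every new word must have followed the previous two."""
    output = list(rng.choice(sorted(table)))
    while len(output) < length:
        followers = table.get((output[-2], output[-1]))
        if not followers:
            break  # this pair only occurred at the very end of the source
        output.append(rng.choice(followers))
    return output

source = "the cat sat on the mat and the cat sat on the dog"
print(" ".join(write_tightly(pair_table(source), 10)))
```

Switching to another table would require that table to contain the current pair, not just the current word, which is why switches succeed much less often when chaining tightly.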
When chaining tightly, JanusNode may often end up reproducing
fairly long sections of the input text verbatim. This is not a
bug- it is the nature of pair-wise chaining. So BEWARE: if your
JanusNode produces something brilliant when it is Markov chaining
by pairs, it may not be Janus speaking to you. It might just be
straight plagiarism. What you find so brilliant may be a statistically-reconstructed
but verbatim quote from the input files which were used to make
the Markov tables. This is of course so even if you never saw
the input file from which the table was constructed: the table
holds all the information necessary to reconstruct the file with
some degree of accuracy. You alone are responsible for distinguishing
randomly-brilliant statements by Janus- to which you have been
gifted the rights- from straight plagiarized brilliance, which
belongs to the person who owns the copyright on the source document.
The current version of JanusNode can also Markov-chain texts
in the input window on the fly. This is not recommended: it is
better to save the table as an intermediate step. There is no
advantage to doing it on the fly, except that you can skip the
step of saving the Markov table to disk and then choosing it.
The disadvantage is that if you do not save the table you can't
re-use it unless you re-generate it from the original text.
'Chain loosely' will simply Markov chain any text
in the JanusNode window, without saving the probability table
to disk. The probability table will be built based on the odds
of any single word following any other single word. This leads
to a loose model of the original text: i.e. one which resembles
that text quite distantly.
'Chain tightly' does the same thing, but it builds
the probability table using pairs instead of single words. This
gives more structure to the probability table: the output text
will bear a closer resemblance to the input text, and may include
sections of several words in length which are exact duplicates
of sections of the input text.
'Chain letters' will chain a text in the JanusNode
window by letter pairs instead of the word pairs used by 'Chain
tightly'. The output will resemble the language of the original,
but may not necessarily consist of real words. There are many
interesting experiments to be conducted with Markov chaining by
letter. Try exploring the difference it makes to use highly redundant
texts (such as, for example, those generated by asking a JanusNode
to produce lines using just one or a few rules), as compared to
more randomized prose, or the difference it makes to use a long
text versus a short text. Do not fail to chain by letter and then
have JanusNode read it out- the results can be quite amusing.
There are many computer programs whose purpose is to turn a text into a specific dialect. These programs are usually extremely simple: they simply make a set of pre-defined substitutions to the text. Your JanusNode has this ability, which can be completely configured by the user. The 'Text Mapping' command uses information stored in files inside the 'Mappings' folder (in the 'JanusNode Resources' folder) to make substitutions to the text in the JanusNode window.
The mappings have their own simple grammar. In general, each line
must have at least two and at most three elements, separated by
commas. The second element will be substituted for the first element
with a probability equal to the third element, if there is a third
element. (If there is not, the probability is assumed to be 100%.) Elements
may be subword character strings, words, or multiword strings.
The probabilities operate globally, not by item- so once a mapping
'passes' the probability test and is chosen to be applied, it
will be applied to every item. For example, consider the following
mapping:
you , thee, 20
This will be applied 20% of the time it is chosen (and every mapping will be chosen exactly once, in random order, when the tool is applied). When it is applied, it will replace the word 'you' with 'thee'. Note the space inserted after the word 'you': this ensures that the mapping is only applied when the whole word is 'you', so that, for example, the word 'your' will not be changed to 'theer'. It would be so changed if there were no space after the first word. If you leave a space after the first element, there is no need to also leave one after the second element: JanusNode can figure this much out for itself. Your JanusNode also deals by itself with the complications of capitalization and punctuation of various kinds, so that it will recognize that 'You', 'you.', or 'you)' (and so on) should be replaced in the above example with 'Thee', 'thee.', or 'thee)' respectively.
Along with such two- or three-element substitutions, there are
two other allowable forms which may appear in a mapping: comments,
and random exclamations. A comment is any line beginning with
an '*': it will simply be ignored when encountered, allowing you
to insert notes into your mappings. A random exclamation has the
form 'random(X)' (optionally followed by a percent probability),
where 'X' is some text. When JanusNode sees a line like this in
a mapping file it will (if it passes the optional probability
test) randomly insert the text contained between the parentheses
after the end of a random number (one or more) of randomly-selected
sentences in the text. You can use this to add a little spice
to your dialects.
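The substitution part of the mapping grammar can be sketched in Python as follows. This is a simplified illustration, not JanusNode's implementation: the function names are invented, 'random(X)' lines are skipped rather than handled, and none of JanusNode's capitalization or punctuation handling is reproduced. The whole-word test for a trailing space is approximated with a regular-expression word boundary.

```python
import random
import re

def parse_mapping(lines):
    """Parse mapping lines into (old, new, percent) rules, skipping comments."""
    rules = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("*") or stripped.startswith("random("):
            continue  # comments and random exclamations are not handled in this sketch
        parts = line.split(",")
        old, new = parts[0].lstrip(), parts[1].strip()
        percent = float(parts[2]) if len(parts) > 2 else 100.0
        rules.append((old, new, percent))
    return rules

def apply_mapping(text, rules, rng=random):
    """Try each rule once, in random order; a rule that passes is applied everywhere."""
    rules = list(rules)
    rng.shuffle(rules)
    for old, new, percent in rules:
        if rng.random() * 100 >= percent:
            continue  # this rule failed its probability test
        if old.endswith(" "):
            # A trailing space asks for a whole-word match only.
            pattern = r"\b" + re.escape(old.strip()) + r"\b"
        else:
            pattern = re.escape(old)
        text = re.sub(pattern, lambda match, new=new: new, text)
    return text

rules = parse_mapping(["* a made-up dialect", "you , thee", "ing,in'"])
print(apply_mapping("are you taking your time", rules))
```

Because both rules here default to 100%, the printed line has 'you' replaced and 'taking' clipped, while 'your' is left alone.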
JanusNode comes with a variety of mapping files which should serve
as further examples, and will hopefully make the idea clear.
The remaining text-degeneration features work only if there is text in the JanusNode window. All of them are methods for randomly mixing up elements of the text.
'Blur' will randomly replace a given percentage of the letters
in the text with randomly chosen letters.
'Blur Vowels' will randomly replace a given percentage
of the vowels in the text with randomly chosen vowels.
'Flip Pairs' will randomly swap a given number
of letter pairs.
'Flip Vowels' will randomly swap a given number
of vowel pairs.
'Reverse By Word' will reverse every word of the text,
leaving that word in its current position in the text.
'Delete Every Other' will delete every second word of the
text. If you are trying to construct a poem from a prose text,
this can sometimes be a helpful step for loosening your associations.
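Three of the simpler tools above are sketched here in Python to make their behaviour concrete. These are guesses at the general idea rather than JanusNode's code: in particular, this 'blur' tests each letter independently (so the percentage is only approximate) and always substitutes lowercase letters, and this 'delete every other' assumes that the first word is the one that is kept.

```python
import random
import string

def blur(text, percent, rng=random):
    """Replace roughly `percent` percent of the letters with random lowercase letters."""
    chars = list(text)
    for i, ch in enumerate(chars):
        if ch.isalpha() and rng.random() * 100 < percent:
            chars[i] = rng.choice(string.ascii_lowercase)
    return "".join(chars)

def reverse_by_word(text):
    """Reverse the letters of every word, leaving each word in its position."""
    return " ".join(word[::-1] for word in text.split(" "))

def delete_every_other(text):
    """Keep the first, third, fifth (and so on) words and drop the rest."""
    return " ".join(text.split()[::2])

print(blur("text morphing", 30))
print(reverse_by_word("hello big world"))              # olleh gib dlrow
print(delete_every_other("one two three four five"))   # one three five
```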
'eecummingsfy' will attempt to mimic the style of
the great poet ee cummings, using the available text. Text which
has been eecummingsfied tends to function 'more poetically' than
text which has not been so treated. Like all the randomization
tools, eecummingsfication works probabilistically, so treating
the same text twice will not give precisely the same result.
Eecummingsfication is user-configurable. It uses three files in the 'eecummings' folder, which is inside the 'JanusNode Resources' folder. You can add items to and delete items from these files to customize the way eecummingsfication functions. Eecummingsfication works by looking for subword elements which can be interestingly 'isolated' from their context. The file 'FrontCuts' contains strings that may be isolated from the front if they appear in the text. (Since the tools apply by chance, there is no guarantee that any isolation will actually be made.) For example, if the word 'be' appears in the 'FrontCuts' file, then the word 'babe' might be split into 'ba', a return, and 'be' when the tool is applied. Here the word 'be' is isolated from the front. The file 'EndCuts' contains strings that may (probabilistically) be isolated from the end if they appear in the text. For example, if the word 'be' also appears in the 'EndCuts' file, then the word 'bear' might be split into 'be', a return, and 'ar' when the tool is applied. Here the word 'be' is isolated from the end. Note that such isolation would not occur from the appearance of the word 'be' in the 'FrontCuts' file. The 'FrameMe' file contains strings that will be isolated from both sides at once. If 'be' appears in that file, then the word 'unbearable' might be split up as 'un', return, 'be', return, and 'arable', with 'be' isolated (= 'framed') from both sides at once.
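The three kinds of isolation can be pictured with the following Python sketch. It is an interpretation of the description above, not JanusNode's code: the function name, the `where` argument and the default `chance` of 0.5 are assumptions, and it assumes that a cut only counts when there are letters on the side where the break is inserted.

```python
import random
import re

def isolate(text, cuts, where, chance=0.5, rng=random):
    """Insert line breaks around strings from `cuts` that occur inside words.

    `where` is "front" (break before), "end" (break after) or "frame" (both).
    """
    for cut in cuts:
        before = r"(?<=\w)" if where in ("front", "frame") else ""
        after = r"(?=\w)" if where in ("end", "frame") else ""
        pattern = before + re.escape(cut) + after

        def split(match):
            if rng.random() >= chance:
                return match.group(0)  # the tool applies by chance
            lead = "\n" if where in ("front", "frame") else ""
            tail = "\n" if where in ("end", "frame") else ""
            return lead + match.group(0) + tail

        text = re.sub(pattern, split, text)
    return text

# chance=1.0 forces every possible isolation, to make the effect visible.
print(isolate("babe", ["be"], "front", chance=1.0))
print(isolate("bear", ["be"], "end", chance=1.0))
print(isolate("unbearable", ["be"], "frame", chance=1.0))
```

Under these assumptions a 'front' cut leaves 'bear' untouched, because there is nothing in front of 'be' to isolate it from.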
'Dadaize' will randomly choose words from the
original text and print them in a randomly-arrayed manner. There
are two forms, representing sampling with or without replacement.
If you choose 'No replacement' each word from the original text will
be used at most once in each round (though you may ask your JanusNode
to go through the original text multiple times, by setting the
output size to an integer higher than the number of words in the
input). Some of you may recognize this as the original formula
for producing Dadaist poetry, as conceived of by the patron saint
of JanusNodes, Tristan Tzara: "And here you are a writer,
infinitely original and endowed with a sensibility that is charming
though beyond the understanding of the vulgar". If you choose 'With replacement', a single word from the original text may be chosen
more than once. Note that punctuation that is separated by a space
from any word will be treated as a word by the Dadaize function,
so by adding such spaces one can have random punctuation in the
Dadaized text.
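The two sampling modes can be sketched in Python like this. It is a minimal illustration with an invented function name; it returns the chosen words as a list and does not attempt the random layout that Dadaize gives its output.

```python
import random

def dadaize(text, size, replacement, rng=random):
    """Pick `size` words from the text, with or without replacement."""
    words = text.split()  # punctuation set off by spaces counts as a word
    if not words:
        return []
    if replacement:
        return [rng.choice(words) for _ in range(size)]
    output = []
    while len(output) < size:
        # One round: every word is used at most once, in shuffled order.
        round_words = words[:]
        rng.shuffle(round_words)
        output.extend(round_words[: size - len(output)])
    return output

poem = "take a newspaper , take some scissors"
print(" ".join(dadaize(poem, 7, replacement=False)))
print(" ".join(dadaize(poem, 7, replacement=True)))
```

Without replacement, asking for more words than the input contains simply starts another round through the text.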
'Random sentences' will randomly print entire sentences
from the original text, sampling without replacement: in other
words, it's like the Dadaize function, but works with sentences
instead of words.
[Note to long-time users: 'Replace terms', a function which appeared
in JanusNode's predecessors, has been removed. The 'Make A Rule'
function (described below) became so much better than 'Replace
Terms' was that it became misleading to allow users the choice
between the two. If you want to see what a text would look like
with its terms randomly replaced, use the 'Make A Rule' function
to make an executable rule from the text, and then execute the
rule. You can also use the 'Replace Words' function to replace
specific user-chosen words.]
'Randomize' will randomly swap words in the text.
'Make TextDNA' will attempt to turn any text in the
JanusNode window into an executable line of TextDNA. If you are
too lazy to write TextDNA, you can now simply write (or import)
a sentence (or more) of the form which you would like to produce,
and let this function translate that sentence into TextDNA. The
TextDNA produced can then be used like any other line of TextDNA,
if you paste it into the TextDNA field. The function can only
work if you use words in your template that appear in your JanusNode's
BrainFood files. Each recognized word will be replaced with a
call to a global variable that is set to a word from the same
file as the recognized word. Do not use texts that are too long,
or, if you do, break them up afterwards into smaller rules connected
by 'ChooseTextDNA' calls. Although the function itself can generate
very long rules, such rules will not run on your JanusNode, which
has a limit on how many times it can recurse.
There are two options offered to you: local and global. The global option replaces every word it recognizes with a call to a global variable: so, for example, every occurrence of 'cat' will be replaced by the same word (maybe 'dog') when the rule is run. If you have the sentence 'Cats hate cats' in the input and choose the global option, it might come out as 'Dogs love dogs' when the rule is run. The local option replaces each word it recognizes with a random call to the file in which that word was found. This means that the same word in the input text will not (necessarily) be replaced by the same word in the output. So, if you have that sentence 'Cats hate cats' in the input and choose the local option, it might come out as 'Mice love pigs' when the rule is run. You also get the option to replace pronouns with the local option, by request from John Waterman. If anyone wants that option in the global case, let me know and I'll put it in.
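The difference between the two options can be sketched in Python. This illustration collapses making the rule and running it into a single step and does not produce real TextDNA; the word lists in `WORD_FILES` are hypothetical stand-ins for BrainFood files, and the function names are invented.

```python
import random

# Hypothetical stand-ins for two BrainFood word files.
WORD_FILES = {
    "animals": ["cats", "dogs", "mice", "pigs"],
    "verbs": ["hate", "love", "fear"],
}

def find_file(word):
    """Return the name of the first word file containing the word, if any."""
    for name, words in WORD_FILES.items():
        if word.lower() in words:
            return name
    return None

def run_template(sentence, option, rng=random):
    """'global' reuses one choice per source word; 'local' redraws every time."""
    chosen = {}
    output = []
    for word in sentence.split():
        name = find_file(word)
        if name is None:
            output.append(word)  # unrecognized words pass through unchanged
            continue
        if option == "global":
            key = word.lower()
            if key not in chosen:
                chosen[key] = rng.choice(WORD_FILES[name])
            new = chosen[key]
        else:
            new = rng.choice(WORD_FILES[name])
        output.append(new.capitalize() if word[0].isupper() else new)
    return " ".join(output)

print(run_template("Cats hate cats", "global"))
print(run_template("Cats hate cats", "local"))
```

With the global option the first and last words of the output always match; with the local option they match only by coincidence.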
After you have generated a rule, you will be offered the option to run it right away. If you accept this, the contents of the TextDNA field will be over-written with the newly-generated rule, and your JanusNode will enter text-generation mode. You can run the new rule in the usual manner, by clicking on the icon of Janus. If there is an error in your new rule, you can edit it in the TextDNA window.
Your JanusNode's rule-generating tool is not perfect. Because it blindly replaces words that it finds in the order that it searches, it sometimes makes errors in deciding which BrainFood file a word should come from. It has turned out to be (to me) surprisingly difficult to generate rules that work perfectly every time, and the 'Make TextDNA' function does occasionally produce TextDNA that contains (usually very minor) errors. However, it most often generates TextDNA which is either useable as it is, or in need of only minimal repairs. JanusNode's standard example files now include many lines of TextDNA that were generated automatically. The 'Robot Johnson' project is now making very heavy use of the 'Make TextDNA' function, resulting in a huge speed-up and greatly increased ease in generating the TextDNA files used by that project.
The 'Replace Words' command allows you to choose specific words
to be replaced with words from JanusNode's database. When you
choose a word, JanusNode will look for a word-list that contains
that word, and randomly replace the word with another one from
that same word-list.
The 'Steal words' command allows you to use the main tool from
rule-fix mode (described above) without having to be in the 'Make
A Rule' command: the tool for snatching words from a text and
adding them to JanusNode's database. The sole purpose of the command
is to make it easier to add words to JanusNode. After selecting
'Steal words' from the menu, click on the Janus icon, and you
will enter the 'Steal words' mode. While in 'Steal words' mode
you can simply click on words which do not (or might not) appear
in your JanusNode's database. The JanusNode will ask you what
kind of word it is and, if it does not already appear in its database,
it will add it to the proper file. The purpose, of course, is
to make it easy to use texts from other sources as input to JanusNode's
word database: simply read in the text (by selecting 'Open File'
from the File menu) and run the 'Steal words' tool. You exit the
'Steal words' mode by pressing command-period. No words which
are chosen from the original text are replaced while using the
'Steal words' command.
JanusNodes have a mechanism by which new text-morphing functions can be easily added by anyone who knows Hypertalk. These functions must be contained in the 'TextMorphers' folder, in files whose title is the same as the name of the Hypertalk text-morphing function that is defined within that file. These can be accessed using the 'Use External Morpher' command. Simply select the external morpher you wish to use, and the function will be automatically applied to the text in the JanusNode field.
A word of warning is probably in order: Anyone who knows Hypertalk can write an external text-morpher. It is simple- trivial- for anyone to write a text-morpher that does damage to your computer by erasing files or engaging in other mischievous behaviour. When you use an external text-morphing function, your JanusNode is simply a means of running that function- it has no control at all over what that function does. As a result, it is impossible for me to accept responsibility for the actions of external text-morphing functions. User beware. However, with that warning given, it should be pointed out that external text-morphers are in this respect no more dangerous than any other shareware, since any program you run on your computer might be a rogue program. External functions are actually a little safer than most shareware, because their code is wide-open: anyone can look and see what the code for an external text-morpher does, because it is not compiled.
I will distribute new text-morphers (if any) here on Janus's web site, and will attempt to make sure that any I distribute through that channel are safe. The official version of a JanusNode ships with three examples of text-morphers (two of which were built-in to JanusNode's predecessor).
'NeoDadaize' is an experimental randomization, loosely inspired by Tristan Tzara's method. It will randomly choose short ordered sets of words from the original text (with replacement) and print them (two per line) in a randomly-arrayed manner. It formats its output into equal-length stanzas with a 'chorus', which is occasionally changed. This function works best when the input text is quite long.
The 'ReverseText' text-morpher will reverse the entire text.
'LengthHopper' works in a deterministic manner (i.e. it treats the same text the same way every time), deleting words in the input field according to a simple formula based on the length of those words. It is highly selective, in the sense that it will always delete the vast majority of the input text.
The remainder of this section is devoted to very brief instructions for those who wish to write their own TextMorphers. It assumes familiarity with Hypertalk. If you want your TextMorpher to be distributed on the JanusNode web site, send it to me.
The basic idea is simple: Define a procedure which acts on text. Name your TextMorphing file with the same name as that procedure. After it loads in a TextMorpher, your JanusNode tries to run a function with the same name as that file.
You have one requisite global variable which you must declare in order to access text at all: TextField. The text you can morph (and the place to which you can write out) is 'card field TextField'.
You are very strongly encouraged to call the procedure 'FollowTheWay' as often as possible. This rotates the Taoist cursor, and, more importantly, checks to see if the user has hit 'command-.', the halt command.
If you are writing text (rather than just randomizing) then it is nice to call the 'scrollcontrol' and 'checksize' procedures. 'Scrollcontrol' will scroll the window so that the end of the text is always visible. 'Checksize' will check to see if the output field is nearly full, and write it to a text file if it is.
The only other function your JanusNode gives you is 'TypeIt', which takes a single argument. It will write its argument to the end of the output window.
Other than that, just use your imagination and Hypertalk.
Here's a simple and heavily-annotated example, the 'ReverseText' morpher:
on ReverseText
  -- This function will only run if it is in a file also called 'ReverseText'
  global TextField
  -- The text is in 'card field TextField'; this variable must be
  -- declared in every text-morpher.
  put card field TextField into x
  -- Take control of the input text by putting it into a variable, 'x'
  put "" into card field TextField
  -- Delete the input text
  put the number of chars in x into y
  -- start with the last character
  repeat while y > 0
    FollowTheWay
    -- This animates the cursor, and, more importantly, allows
    -- the user to force the function to quit. Please call the
    -- 'FollowTheWay' procedure as often as possible.
    TypeIt(char y of x)
    -- Write the current last character
    -- 'TypeIt' just writes its argument x to the output field
    -- You can also write 'put x after card field TextField', of
    -- course. But 'typeit' will use the 'Type-o-matic' if it is on.
    ScrollControl
    -- 'ScrollControl' keeps the latest text visible.
    -- There is no need to call 'CheckSize' (which checks to see
    -- if the field is full) since the output text can never be
    -- bigger than the input text here. If we were writing long
    -- output texts, we would need to call 'CheckSize' here.
    put y - 1 into y
    -- Move to the next character
  end repeat
end ReverseText