- Newly Revised and Updated Formatting Standard for Project Galactic Guide
- Revised 19930420 by Paul Clegg, with lots of information supplied by
- Stephane Lussier, Tobias B Koehler, and everyone on alt.galactic-guide
-
- Introduction:
-
- The point of all this is to have a very, very extensive reference for
- programmers and editors to create and maintain the data archives for
- Project Galactic Guide. This extensive formatting design is
- necessary because the Guide will be (and already has been) ported
- to various computer architectures, and not all computers use the same
- character sets, or can handle the same type of information. In particular,
- the Unix systems that most of us use can only handle 7-bit ASCII for
- mailings, news posts, etc, so we are constrained to use the worst
- possible character set for our data.
-
- This does not mean that we cannot represent alternate character sets.
- This was the primary reason for updating the design into an extremely
- complex standard in the first place. The purpose has since expanded
- to include various text effects, margin control, etc, that are or might
- be needed to properly portray specific articles.
-
- This text here should not intimidate field researchers in any way.
- Articles will be accepted in raw ASCII format, hand-written hardcopy,
- or even in text printed with a word processor package. The editors
- would like to encourage field researchers to use the following
- standard, to lighten their workload, but the hierarchy here does not
- at all require field researchers to use this format for their
- submissions.
-
- With that aside, I now cast you into the world of 7-bit data
- representations...
-
- Special Characters:
-
- This section details all the special characters that might be used in
- any given article. Accompanying the name of the character is the
- code, 7-bit replacement (if there is no better replacement in any
- given character set), and numerical codes for several popular
- character sets. Most of the information contained within this section
- has been derived from Tobias B. Koehler's posting to alt.galactic-guide.
-
- Definitions of accents:
- breve accent: \_/ (above letter)
- acute accent: / (above letter)
- grave accent: \ (above letter)
- circumflex: /\ (above letter)
- hacek accent: \/ (above letter)
- tilde: ~ (above letter)
- two dots: .. (above letter)
- ring: o (above letter)
- two acute acc: // (above letter)
- dot: . (above letter)
- cedilla: _) (under letter)
- ogonek hook: (_ (under letter)
-
- Special letters: Eth and Thorn are special Icelandic characters. The
- uppercase Eth looks like a slashed D, the lowercase eth looks like a
- horizontally flipped 6 with a slash. The uppercase Thorn looks like
- the upper half of a b combined with the lower half of a p. The long s
- looks like the f without the horizontal bar; the sharp s is a ligature
- of a long s and a normal s. Both are German thingies.
-
- code: Textual code
- repl: 7-bit replacement to be used if character not available
- EC: TeX Extended Computer Modern character set code
- ISO: ISO 8859/1 (Amiga, Windows) character set code
- 850: IBM codepage 850 (MS-DOS, OS/2) character set code
-
- Most important: To represent a backslash (which is normally an escape
- character to denote a special character or effect) use a double backslash:
- \\ inserts a single \ character.
-
- code |repl|description |position |
- | | |EC |ISO |850 |
-
- \ch`` " Eng dbl left/Ger dbl right quote 16 147
- \ch'' " English double right quote 17 148
- \ch,, " German double left quote 18 132
- \ch<< " French double left quote 19 171 174
- \ch>> " French double right quote 20 187 175
- \ch < ` French single left quote 14 139
- \ch > ' French single right quote 15 155
-
- \ch-- -- long dash (as opposed to hyphen) 22 151 196
- \ch r d degree sign 6 176 248
- \ch$$ $ paragraph or section sign 159 167 245
- \ch%o o/oo promille sign 37+24 137
- \chOC (C) copyright sign 169 184
- \chOR (R) registered trademark sign 174 169
- \ch=L L pound sterling sign 191 163 156
-
- \chuA A A with breve accent 128
- \ch;A A A with ogonek hook 129
- \ch`A A A with grave accent 192 192 183
- \ch'A A A with acute accent 193 193 181
- \ch^A A A with circumflex 194 194 182
- \ch~A A A with tilde 195 195 199
- \ch"A Ae A with two dots 196 196 142
- \chrA Aa A with ring (ala Angstrom) 197 197 143
- \chAE AE AE ligature 198 198 146
- \chua a a with breve accent 160
- \ch;a a a with ogonek hook 161
- \ch`a a a with grave accent 224 224 133
- \ch'a a a with acute accent 225 225 160
- \ch^a a a with circumflex 226 226 131
- \ch~a a a with tilde 227 227 198
- \ch"a ae a with two dots 228 228 132
- \chra aa a with ring 229 229 134
- \chae ae ae ligature 230 230 145
-
- \ch'C C C with acute accent 130
- \chvC C C with hacek accent 131
- \ch,C C C with cedilla 199 199 128
- \ch'c c c with acute accent 162
- \chvc c c with hacek accent 163
- \ch,c c c with cedilla 231 231 135
-
- \chvD D D with hacek accent 132
- \ch-D D slashed D or Eth (\chEt) 208 208 209
- \ch-d d slashed d 158
- \chet eth (\chet) 240 240 208
-
- \chvE E E with hacek accent 133
- \ch;E E E with ogonek hook 134
- \ch`E E E with grave accent 200 200 212
- \ch'E E E with acute accent 201 201 144
- \ch^E E E with circumflex 202 202 210
- \ch"E E E with two dots 203 203 211
- \chve e e with hacek accent 165
- \ch;e e e with ogonek hook 166
- \ch`e e e with grave accent 232 232 138
- \ch'e e e with acute accent 233 233 130
- \ch^e e e with circumflex 234 234 136
- \ch"e e e with two dots 235 235 137
-
- \chuG G G with breve accent 135
- \chug g g with breve accent 167
-
- \ch.I I I with dot 157
- \ch`I I I with grave accent 204 204 222
- \ch'I I I with acute accent 205 205 214
- \ch^I I I with circumflex 206 206 215
- \ch"I I I with two dots 207 207 216
- \ch i i dotless i 25 213
- \ch`i i i with grave accent 236 236 141
- \ch'i i i with acute accent 237 237 161
- \ch^i i i with circumflex 238 238 140
- \ch"i i i with two dots 239 239 139
-
- \ch j j dotless j 26
-
- \ch'L L L with acute accent 136
- \ch-L L slashed L 138
- \ch'l l l with acute accent 168
- \ch-l l slashed l 170
-
- \ch'N N N with acute accent 139
- \chvN N N with hacek accent 140
- \chNJ Nj NJ ligature 141
- \ch~N N N with tilde 209 209 165
- \ch'n n n with acute accent 171
- \chvn n n with hacek accent 172
- \chnj nj nj ligature 173
- \ch~n n n with tilde 241 241 164
-
- \chhO Oe O with two acute accents 142
- \ch`O O O with grave accent 210 210 227
- \ch'O O O with acute accent 211 211 224
- \ch^O O O with circumflex 212 212 226
- \ch~O O O with tilde 213 213 229
- \ch"O Oe O with two dots 214 214 153
- \chOE OE OE ligature 215 140
- \ch/O Oe slashed O 216 216 157
- \chho oe o with two acute accents 174
- \ch`o o o with grave accent 242 242 149
- \ch'o o o with acute accent 243 243 162
- \ch^o o o with circumflex 244 244 147
- \ch~o o o with tilde 245 245 228
- \ch"o oe o with two dots 246 246 148
- \choe oe oe ligature 247 156
- \ch/o oe slashed o 248 248 155
-
- \ch'R R R with acute accent 143
- \chvR R R with hacek accent 144
- \ch'r r r with acute accent 175
- \chvr r r with hacek accent 176
-
- \ch'S S S with acute accent 145
- \chvS S S with hacek accent 146 138
- \ch,S S S with cedilla 147
- \ch's s s with acute accent 177
- \chvs s s with hacek accent 178 154
- \ch,s s s with cedilla 179
- \chss ss sharp s 255 223 225
- \chls s long s
-
- \chvT T T with hacek accent 148
- \ch,T T T with cedilla 149
- \chTh Thorn 222 222 232
- \ch,t t t with cedilla 181
- \chth thorn 254 254 231
-
- \chhU UE U with two acute accents 150
- \chrU U U with ring 151
- \ch`U U U with grave accent 217 217 235
- \ch'U U U with acute accent 218 218 233
- \ch^U U U with circumflex 219 219 234
- \ch"U Ue U with two dots 220 220 154
- \ch.U U U with dot
- \chhu ue u with two acute accents 182
- \chru u u with ring 183
- \ch`u u u with grave accent 249 249 151
- \ch'u u u with acute accent 250 250 163
- \ch^u u u with circumflex 251 251 150
- \ch"u ue u with two dots 252 252 129
- \ch.u u u with dot
-
- \ch"Y Y Y with two dots 152
- \ch'Y Y Y with acute accent 221 221 237
- \ch"y y y with two dots 184 152
- \ch'y y y with acute accent 253 253 236
-
- \ch'Z Z Z with acute accent 153
- \chvZ Z Z with hacek accent 154
- \ch.Z Z Z with dot 155
- \ch'z z z with acute accent 185
- \chvz z z with hacek accent 186
- \ch.z z z with dot 187
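-
- For programmers, here is a minimal sketch of how a reader might expand
- the codes above. It is not part of the standard. It assumes every code is
- \ch followed by exactly two characters, carries only a handful of rows
- from the table, and the names CH_TABLE and expand_ch are made up for the
- example.

```python
# Hypothetical subset of the table above:
# two-character code -> (7-bit replacement, ISO 8859/1 position or None).
CH_TABLE = {
    '"a': ("ae", 228),
    '"u': ("ue", 252),
    "ss": ("ss", 223),
    "'e": ("e", 233),
    ",c": ("c", 231),
    "uA": ("A", None),   # the table lists only an EC position for this one
}


def expand_ch(text, use_iso=False):
    """Expand \\ch codes and \\\\ in text; other escapes pass through untouched."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\\\", i):
            out.append("\\")            # double backslash -> single backslash
            i += 2
        elif text.startswith("\\ch", i) and text[i + 3:i + 5] in CH_TABLE:
            repl, iso = CH_TABLE[text[i + 3:i + 5]]
            if use_iso and iso is not None:
                out.append(bytes([iso]).decode("latin-1"))
            else:
                out.append(repl)        # fall back to the 7-bit replacement
            i += 5
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

- With use_iso left off, the sketch always falls back to the repl column,
- which is what a plain 7-bit reader would want.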
-
- NOTE: The following information was mostly picked out of one of Stephane
- Lussier's numerous informative posts. The following are REALLY special
- characters that are usually only used in special circumstances, such as
- mathematical texts. I do not have the resources to research the characters
- in the various character sets, so in this case, the character code is
- followed by the 7-bit ASCII representation and a short explanation.
-
- Greek Characters:
-
- code |repl|description
- \Galp a lower case alpha
- \GALP A upper case alpha
- \Gbet b lowercase beta
- \GBET B uppercase beta
- \Ggam g lowercase gamma
- \GGAM G uppercase gamma
- \Gdel d lowercase delta
- \GDEL D uppercase delta
- \Geps e lowercase epsilon
- \GEPS E uppercase epsilon
- \Gzet z lowercase zeta
- \GZET Z uppercase zeta
- \Geta h lowercase eta
- \GETA H uppercase eta
- \Gthe o lowercase theta
- \GTHE O uppercase theta
- \Giot i lowercase iota
- \GIOT I uppercase iota
- \Gkap k lowercase kappa
- \GKAP K uppercase kappa
- \Glam l lowercase lambda
- \GLAM L uppercase lambda
- \G*mu m lowercase mu
- \G*MU M uppercase mu
- \G*nu n lowercase nu
- \G*NU N uppercase nu
- \G*xi x lowercase xi
- \G*XI X uppercase xi
- \Gomi o lowercase omicron
- \GOMI O uppercase omicron
- \G*pi pi lowercase pi
- \G*PI PI uppercase pi
- \Grho p lowercase rho
- \GRHO P uppercase rho
- \Gsig s lowercase sigma
- \GSIG S uppercase sigma
- \Gtau t lowercase tau
- \GTAU T uppercase tau
- \Gups u lowercase upsilon
- \GUPS U uppercase upsilon
- \Gphi o lowercase phi
- \GPHI O uppercase phi
- \Gchi x lowercase chi
- \GCHI X uppercase chi
- \Gpsi y lowercase psi
- \GPSI Y uppercase psi
- \Gome w lowercase omega
- \GOME W uppercase omega
-
- Note: Some 7-bit representations are duplicated (theta, omicron, and phi
- all map to "o", for example). From a programming standpoint, it is probably
- better to replace the symbol with its full name (ignoring upper/lowercase),
- since the 7-bit letters do not correspond closely to the real characters.
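-
- A rough sketch of the full-name replacement suggested in the note, for
- readers that cannot show Greek at all. The names GREEK_NAMES and
- greek_to_names are invented for the example, and looking codes up without
- regard to case is an assumption based on the "sans upper/lowercase" remark.

```python
import re

# Three-character Greek codes from the table above, mapped to full names.
GREEK_NAMES = {
    "alp": "alpha", "bet": "beta", "gam": "gamma", "del": "delta",
    "eps": "epsilon", "zet": "zeta", "eta": "eta", "the": "theta",
    "iot": "iota", "kap": "kappa", "lam": "lambda", "*mu": "mu",
    "*nu": "nu", "*xi": "xi", "omi": "omicron", "*pi": "pi",
    "rho": "rho", "sig": "sigma", "tau": "tau", "ups": "upsilon",
    "phi": "phi", "chi": "chi", "psi": "psi", "ome": "omega",
}

# A Greek code is \G followed by exactly three letters or asterisks.
_GREEK_RE = re.compile(r"\\G([A-Za-z*]{3})")


def greek_to_names(text):
    """Replace each recognised \\Gxxx code with its full (lowercase) name."""
    def repl(match):
        name = GREEK_NAMES.get(match.group(1).lower())
        # Unknown codes are left exactly as they were found.
        return name if name is not None else match.group(0)
    return _GREEK_RE.sub(repl, text)
```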
-
- Mathematical Characters:
-
- code |repl|description
- \M**8 oo infinity
- \M*+- +- plus over minus
- \MNOT - negation character (horizontal bar w/ short vertical bar on left)
- \M*lv V logic: OR
- \M(+) (+) logic: XOR (Exclusive OR)
- \M(/) 0 empty set notation
- \M*|^ v logic: NOR (down arrow type of thing)
- \M--> --> implication
- \M-/> -/-> "does not imply"
- \M<-- <-- implication
- \M</- <-/- "does not imply"
- \M<-> <--> double implication
- \M</> <-/-> "there is no double implication"
- \M==> ==> implication
- \M=/> =/=> "does not imply"
- \M<== <== implication
- \M</= <=/= "does not imply"
- \M<=> <==> equivalence
- \M</> <=/=> "there is no equivalence"
- \M*-= = congruence (three horizontal bars)
- \M/-= != not congruent
- \M*/= != not equal (slashed equal sign)
- \M**~ ~ is equivalent to
- \M*~- ~- isomorphism (tilde over single bar)
- \M*~~ ~= approximately equals (two stacked wavy lines)
- \M*~= = wavy line over equal sign
- \M*)( asymptotal (upcurve over downcurve)
- \M*|| || two parallel lines
- \M*rA upturned A, "for all"
- \M*rE reversed E, "there exists"
- \M/rE slashed reversed E, "there does not exist"
- \M*.: three dots in triangle, "therefore"
- \M**U U union
- \M*rU intersection (overturned U)
- \M**E "is an element of"
- \M*/E "is not an element of"
- \M**C C "is a subset of"
- \M*/C !C "is not a subset of"
- \M**X X Cartesian product sign
- \M**| | Full vertical bar for absolute values, etc.
- \M*/| !| Does not divide (vertical bar w/ slash)
- \M**o o Composition (small circle)
- \M**. * Product (small point)
- \M**> Derivable, right pointing hollow triangle
- \M**< Normal subgroup notation, left pointing hollow triangle
- \M**% Division sign (circle over and below horizontal line)
- \M*>= >= Greater than or equal to
- \M/>= !>= Not greater than or equal to
- \M*<= <= Less than or equal to
- \M/<= !<= Not less than or equal to
- \Mint Integration sign
- \Mont Integration sign with small circle on it
- \M**' ' Prime
- \M**" " Double prime
- \M*'" '" Triple prime (etc. up to \M""", sextuple prime)
-
- Formatting Effects:
-
- The following sections include various special text effects and devices to
- allow various platforms to display various things in special formats. Since
- monospaced ASCII has been shown to not work very well, particularly with
- varying display widths, it is impossible to relegate text formatting to the
- ASCII dump. Many of the ideas within this section have been taken straight
- from Stephane Lussier's post(s), though everyone's posts have influenced the
- end result you see here.
-
- Text Effects:
-
- Text effects are things such as bold, italic, superscript, subscript,
- underline, and other visual effects that may be applied to text to make
- it more visually appealing, clear, and informative.
-
- All format controls are denoted by a backslash, a code (usually four
- letters), and a left curly brace ("{"). These sections are terminated by a
- right curly brace ("}"). The text that should have the given effect
- should be inside the two curly braces. Because there may easily be a reason
- to have a right curly brace in the text, a right curly brace is denoted as
- \}, to indicate that it is not part of the text coding. There is no reason
- for an alternate marker for left curly braces.
-
- Bold: \bold{ <text> }
- Italic: \ital{ <text> }
- Underlined: \undl{ <text> }
- Double Underlined: \dund{ <text> }
- Subdued: \subd{ <text> }
- Flashing: \flsh{ <text> }
- Subscript: \subs{ <text> }
- Superscript: \sups{ <text> }
-
- Effects primarily used in mathematics:
- Overlined: \ovrl{ <text> }
- Right Arrow Over Expression (vector): \raro{ <text> }
- Left Arrow Over Expression: \laro{ <text> }
- Hat Over Expression: \mhat{ <text> } Note: <text> here must be a single
- character.
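-
- One possible way for a reader to pick the effect codes apart is sketched
- below. It is only an illustration: it assumes the twelve four-letter codes
- listed above, treats \} as a literal brace, and returns runs of text tagged
- with whatever effects enclose them. The names EFFECTS and parse_effects
- are made up for the example.

```python
EFFECTS = {"bold", "ital", "undl", "dund", "subd", "flsh", "subs", "sups",
           "ovrl", "raro", "laro", "mhat"}


def parse_effects(text):
    """Return a list of (active_effects, chunk) pairs for text.

    active_effects is a tuple of the effect codes enclosing the chunk,
    outermost first.
    """
    runs = []
    stack = []          # effect codes currently open
    buf = []            # characters collected for the current run

    def flush():
        if buf:
            runs.append((tuple(stack), "".join(buf)))
            buf.clear()

    i = 0
    while i < len(text):
        if text.startswith("\\}", i):
            buf.append("}")             # escaped brace is literal text
            i += 2
        elif text.startswith("\\\\", i):
            buf.append("\\")            # double backslash is a literal backslash
            i += 2
        elif (text[i] == "\\" and text[i + 1:i + 5] in EFFECTS
              and text[i + 5:i + 6] == "{"):
            flush()
            stack.append(text[i + 1:i + 5])
            i += 6
        elif text[i] == "}" and stack:
            flush()                     # unescaped brace closes the innermost effect
            stack.pop()
            i += 1
        else:
            buf.append(text[i])
            i += 1
    flush()
    return runs
```

- A display routine can then walk the runs and switch bold, italic, and so
- on according to each tuple, or ignore the tuples on a dumb terminal.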
-
- NOTE: Very intricate mathematical formatting instructions may eventually
- be included in this standard, but they are not being included in this
- version. Programmers writing readers should, on coming across the
- \MATH{ <text> } escape code sequence, ignore all of it. This will allow
- reader programs written to this format to handle the only major
- expansion to this format that I foresee in the future, or at least not barf
- if they come across an article with the expanded math features.
- Addendum: You WILL have to check to make sure all the curly braces are
- matched within the \MATH structure, in order to figure out when the \MATH
- structure ends. Within MATH structures, \{ and \} indicate curly braces
- with no escape codes attached (and thus don't affect the stack of braces).
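-
- The brace matching described in the addendum might look something like
- this. It is a sketch, not a reference implementation: skip_math is a
- made-up name, a simple depth counter stands in for the stack of braces,
- and an unterminated \MATH block is assumed to swallow the rest of the text.

```python
def skip_math(text):
    """Remove every \\MATH{ ... } block from text, matching nested braces."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith("\\MATH{", i):
            depth = 1
            i += len("\\MATH{")
            while i < len(text) and depth > 0:
                if text.startswith("\\{", i) or text.startswith("\\}", i):
                    i += 2              # escaped brace: does not touch the depth
                    continue
                if text[i] == "{":
                    depth += 1
                elif text[i] == "}":
                    depth -= 1
                i += 1
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```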
-
- Standard Structure:
-
- The body of every article is organized into sections. For instance, should
- this become an entry, this paragraph is considered a section. A table would
- have to be used for the character codes above, and that would be another
- section. In this case even the subtitles (such as "Standard Structure:")
- would be separate sections. Whether or not sections should be separated
- by blank lines is optional, and may be left to a user-defined option, or
- programmer's choice; the ruling is not made here.
-
- Text formatting codes (such as underline, etc. as listed above) should be
- reset to default in between sections. If a text style is to be continued
- into the next section, the proper codes must be re-applied within the
- section's curly braces.
-
- Paragraphs:
-
- The type of section that should be most common would be the standard
- paragraph. A paragraph is denoted by \para{ followed by all the text that
- should go into that paragraph. The paragraph must be terminated by an
- ending }. Escape codes are allowed in paragraphs provided they are not
- section codes. You cannot embed paragraphs inside other paragraphs, nor
- can you embed matrices, lists, etc. within paragraphs. An example paragraph:
-
- \para{This is an example paragraph. Other than the initial escape code, and
- the ending curly brace, and any required escape codes within this text, this
- text should be completely \bold{ASCII}. For electronic mail transmission
- purposes, the length of a line should not be more than 78 characters in
- width, and lines of fewer than 76 characters are appreciated. Because the end
- of a paragraph is only when a \} is found, the reader programs can wrap text
- on their own, and so the EOL can be relatively ignored. Do \bold{NOT}
- hyphenate words.}
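-
- As a sketch of what "the reader programs can wrap text on their own" could
- mean in practice, the following pulls the body out of a \para section,
- drops effect codes, turns \} into a plain brace, and re-wraps the words
- for the display. The name render_para is hypothetical, and recognising an
- effect code as "backslash, four letters, left brace" is an assumption.

```python
import re
import textwrap

# Assumed shape of an opening effect code inside a paragraph, e.g. \bold{
_OPEN = re.compile(r"\\[A-Za-z]{4}\{")


def render_para(source, width=78):
    """Return the text of the first \\para{ ... } in source, wrapped to width."""
    i = source.index("\\para{") + len("\\para{")
    out = []
    depth = 0                           # effect codes currently open
    while i < len(source):
        opening = _OPEN.match(source, i)
        if source.startswith("\\}", i):
            out.append("}")             # escaped brace is literal text
            i += 2
        elif opening:
            depth += 1                  # drop the code, remember its brace
            i = opening.end()
        elif source[i] == "}":
            if depth == 0:
                break                   # unescaped brace ends the paragraph
            depth -= 1
            i += 1
        else:
            out.append(source[i])
            i += 1
    # Line ends in the source carry no meaning, so collapse all whitespace.
    words = "".join(out).split()
    return textwrap.fill(" ".join(words), width=width, break_on_hyphens=False)
```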
-
- Individual Lines:
-
- Often individual lines are wanted or required, particularly for things such
- as subsection headers, and so on. An individual line is still considered
- a section, and as such should leave a blank line after it. However single
- lines are much more flexible than paragraphs in most respects, and there are
- actually several types of individual lines that may be employed in an
- article.
-
- Justification: Single lines may be justified in any one of three ways:
- left, right, and center. The codes for this are, respectively, \jstl{ },
- \jstr{ }, and \cntr{ }.
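-
- For a fixed-width display, the three justification codes could be
- handled roughly as follows. This is a sketch only: justify_line is a
- made-up name, the section is assumed to hold plain text with no nested
- codes, and trimming the blanks next to the braces is an assumption.

```python
def justify_line(section, width=80):
    """Render one \\jstl{ }, \\jstr{ } or \\cntr{ } line for the given width."""
    code, _, rest = section.partition("{")
    # Everything up to the last closing brace is the line's text.
    text = rest.rsplit("}", 1)[0].strip()
    if code == "\\jstl":
        return text.ljust(width)
    if code == "\\jstr":
        return text.rjust(width)
    if code == "\\cntr":
        return text.center(width)
    raise ValueError("not a justification code: " + code)
```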
-
- Preformat: A single line may be dictated as being preformatted, or absolute,
- where the reader should accept the text as being formatted for a 75 column
- display and should not try to "play" with the text involved. This is
- included only for those rare problems, and should not be used if at all
- possible. The escape code is \PREF{ <text> }. Textual effects may still
- be applied to the text contained in a preformatted line, but spacing should
- not be toyed with by the reader program.
-
- Special Effects: A single line allows us some freedom in other ways, too.
- Inserting a \. into a single line inserts a line feed, such that the text
- should drop to the same column, but the next row. This may be accomplished
- almost as easily, if not more easily, by simply using several preformat
- commands.
-
- Internal Passages:
-
- Long quotes should be given special cases, being different from a standard
- paragraph. Text enclosed in the \quot{} formatting code should be treated
- as a normal paragraph, but it should be indented on both sides when
- displayed. For an 80 column text screen, a five space indent on both sides
- is suggested.
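-
- A small sketch of the suggested indent for a text screen. The name
- render_quote is invented for the example, and the text is assumed to have
- had its escape codes dealt with already.

```python
import textwrap


def render_quote(text, columns=80, indent=5):
    """Wrap quoted text with the same indent on the left and the right."""
    pad = " " * indent
    # Stopping the lines 'indent' columns short of the edge gives the right margin.
    return textwrap.fill(" ".join(text.split()), width=columns - indent,
                         initial_indent=pad, subsequent_indent=pad)
```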
-
- Lists:
-
- Lists are obviously used for lists of information, which may be of any number
- of things. The list command, however, also works for outline designs, which
- is basically a specialized list design. There are several types of lists,
- and all of them may be nested within each other, with the one exception of
- the military notation list (see below). In any case, an element in a list
- should be offset from the left margin by some number of characters; for an
- 80 column display, the suggested indent space is 10 characters. Text that
- wraps around a display should be indented so as to line up with the first
- character of the actual text, and not just with the first digit of the
- element identifier. Sublists, or lists embedded in other lists, should
- be indented again. For all lists, the list type is used only to determine
- the type of list. Each element in the list must be contained in a \item{}
- field.
-
- Arabic Number List: This is your basic list, with elements numbered 1, 2,
- 3, etc. The escape code for this type of list is \LSAx { ... }, where
- x is the character that follows the number (see below).
-
- Lowercase Letter List: This uses the alphabet to denote its elements. The
- first element will be marked by "a", the next by "b", etc. There
- may NOT be more than 26 elements in a letter list. The escape code is
- \LSlx { ... }.
-
- Uppercase Letter List: Exactly like the \LSlx { ... } list type, but
- using uppercase letters instead. The escape code is \LSLx { ... }, and
- it too is restricted to 26 or fewer elements.
-
- Lowercase Roman List: Uses lowercase Roman numerals, i, ii, iii, iv, etc.
- The escape code is \LSrx { ... }.
-
- Uppercase Roman List: Uses uppercase Roman numerals, I, II, III, IV, etc.
- The escape code is \LSRx { ... }.
-
- No Identifier List: This does not use any number or character to
- differentiate between elements. The escape code is \LS_x { ... }, which
- allows the author to still use special characters listed below to mark
- elements.
-
- Military Notation List: This is a tricky one. Only Military Notation
- Lists may be nested within Military Notation Lists. The identifying
- numbers are in Arabic numerals (ie. decimal), but also show the hierarchy
- of the list itself. The reader program must run through the list and
- determine how deep the sublists embedded in the list go, as each number
- must be expanded to show this. Thus, if you have a list that has a sublist
- inside it, and that sublist has yet another sublist, the numbers must be
- expanded to three places, so the very first element would be 1.0.0, the
- second element would be 2.0.0, etc., but the sublist off the first element
- would have 1.1.0 for the first element. The first element off the first
- sublist of the first sublist would be 1.1.1. If sublists are nested
- five deep, the very first number would be 1.0.0.0.0, but if they are nested
- only two deep, the first number would be 1.0. The escape code for this
- type of list is \LSMN { ... }.
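-
- The two passes described above (find the deepest nesting, then pad every
- number out to that many places) can be sketched like so. The list is
- assumed to be already parsed into (text, sublist) pairs; that layout and
- the names list_depth and military_numbers are invented for the example.

```python
def list_depth(items):
    """Depth of a nested list: 1 if flat, 2 if any item has a sublist, etc."""
    return 1 + max((list_depth(sub) for _, sub in items if sub), default=0)


def military_numbers(items, depth=None, prefix=()):
    """Yield (label, text) pairs for a Military Notation List."""
    if depth is None:
        depth = list_depth(items)       # first pass: how many places are needed
    for n, (text, sub) in enumerate(items, start=1):
        path = prefix + (n,)
        # Pad with zeros so every label has the same number of places.
        padded = path + (0,) * (depth - len(path))
        yield ".".join(map(str, padded)), text
        if sub:
            yield from military_numbers(sub, depth, path)
```

- For a list whose first element has a sublist, which in turn has a sublist
- of its own, the labels come out as 1.0.0, 1.1.0, 1.1.1, and so on, with
- the second top-level element labelled 2.0.0.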
-
- Separator Characters: With the exception of the Military Notation List,
- all the lists have one space in their command for a single character.
- This character must be chosen off the following list:
-
- . Uses a period after the list identifier.
- , Uses a comma after the list identifier.
- : Uses a colon after the list identifier.
- - Uses a dash after the list identifier.
- ) Uses a right parenthesis after the list identifier
- _ Puts nothing after the list identifier.
- > Puts an arrow after the list identifier.
- * Puts a bullet after the list identifier.
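-
- Putting the list types and separator characters together, a reader might
- build each element's label along these lines. This is a sketch under
- assumptions: list_label, to_roman, and SEPARATORS are made-up names, the
- list code is passed in as the five characters before the brace (such as
- \LSA. or \LSr)), and the arrow and bullet are shown as plain ">" and "*"
- since a 7-bit display has nothing better.

```python
# Separator character in the list code -> text placed after the identifier.
SEPARATORS = {".": ".", ",": ",", ":": ":", "-": "-", ")": ")",
              "_": "", ">": ">", "*": "*"}

_ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]


def to_roman(n):
    """Uppercase Roman numeral for a positive integer."""
    out = []
    for value, glyph in _ROMAN:
        while n >= value:
            out.append(glyph)
            n -= value
    return "".join(out)


def list_label(code, index):
    """Label for element number index (counting from 1) of a list such as \\LSA."""
    kind, sep = code[3], code[4]
    if kind == "A":
        ident = str(index)
    elif kind in ("l", "L"):
        if not 1 <= index <= 26:
            raise ValueError("letter lists hold at most 26 elements")
        ident = chr(ord("a") + index - 1)
        if kind == "L":
            ident = ident.upper()
    elif kind in ("r", "R"):
        ident = to_roman(index)
        if kind == "r":
            ident = ident.lower()
    elif kind == "_":
        ident = ""                      # No Identifier List
    else:
        raise ValueError("unknown list type: " + kind)
    return ident + SEPARATORS[sep]
```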
-
- Matrices:
-
- There have been several suggestions for matrices, but I have yet to figure
- out exactly how to implement them. A matrix will be given the escape
- code \MTRX { ... }, so until a matrix standard is produced, ignore the
- matrices.
-
- Conclusion:
-
- This is the first Really Big Galactic Guide Format in the Guide's history.
- Undoubtedly, there are many problems with what I've put together here, and
- I've almost certainly left things out. But that's what revisions are all
- about. With this standard, however, the use of escape codes allows for
- future expansion very easily, and any revisions will most likely not be
- of such a large scale. I want to take this time here to thank everyone
- who actually put more than thirty seconds of thought into this project,
- and especially everyone who stuck with the project from the very beginning.
- And a really big hand to all the programmers who've created Guide readers,
- cuz they're really going to be pissed when they try to program for this
- monstrosity!
-
- ...Paul
-
-
-
-