home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
kermit.columbia.edu
/
kermit.columbia.edu.tar
/
kermit.columbia.edu
/
archives
/
ucsterminal.zip
/
mail.txt
< prev
next >
Wrap
Text File
|
1998-12-08
|
315KB
|
7,108 lines
1-Oct-98 7:14:07-GMT,3775;000000000001
Return-Path: <gpw@cybersurf.net>
Received: from mailrelay1.cc.columbia.edu (mailrelay1.cc.columbia.edu [128.59.35.143])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id DAA20582
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 03:14:05 -0400 (EDT)
Received: from sagitta.cia.com (sagitta.cybersurf.net [206.186.113.4])
by mailrelay1.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id DAA11389
for <fdc@columbia.edu>; Thu, 1 Oct 1998 03:14:04 -0400 (EDT)
Received: from cybersurf.net (anzu.cybersurf.net [206.186.111.67])
by sagitta.cia.com (8.8.5/8.8.5) with ESMTP id AAA29349
for <fdc@columbia.edu>; Thu, 1 Oct 1998 00:13:47 -0700
Message-ID: <36132BC6.D5494815@cybersurf.net>
Date: Thu, 01 Oct 1998 00:14:14 -0700
From: Geoffrey Waigh <gpw@cybersurf.net>
X-Mailer: Mozilla 4.06 [en] (Win95; U)
MIME-Version: 1.0
To: fdc@columbia.edu
Subject: Re: Terminal Graphics Proposal
References: <9810010139.AA14114@unicode.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
I was going to send this to offline@unicode.org (the list for offline
unicode discussions,) but assumed from your request for private
correspondence you were not aware of the forum.
Frank da Cruz wrote:
>
> D R A F T # 1
>
> ABSTRACT
>
> A selection of terminal graphics characters is proposed for Unicode [24]
> and ISO 10646 [19] to allow Unicode-based terminal emulation software to
> (a) display glyphs that are found on popular types of terminals but
> currently are not available in Unicode, and (b) interoperate with other
> Unicode applications.
I can see clear merit in handling b), but I'm leary of the code space
consumption that a) is having here. In general, my feeling is that
if 98% emulation does the job in an adequate fashion for
non-perfectionists, then that is the way to go.
[On control character display]
I don't think that the existing C0 control code graphics being
indicated as horizontal and a preference for diagonal glyphs
warrants disunification. I think that is a font variation and
so long as the control code is represented with one of it's
standard names (2 or 3 letters, horizontal or diagonal,) the
information is being properly conveyed and the user can understand
what is going on. However, I think it would be useful to
complete the control code set.
[On Hex code display]
That seems kind of wasteful for a debugging mode. Do the terminals
that produce this output have escape sequences for enabling this
mode, or is it strictly a terminal configuration option? (Of course
by that measure the control character codes come under scrutiny...)
[On math symbols]
I cannot comment on these since our customers had 0 interest in the
technical symbols, and so aside from glancing at the code pages and
realizing they wouldn't map to Unicode very well I didn't work with
them.
[On Line and Blocks]
Again, I didn't have to deal with the terminals that form the bulk
of these codes and cannot comment.
> 9. UNFINISHED BUSINESS
>
> The selection of characters presented in this draft is far from
> comprehensive. Hundreds of other terminals from the past 30+ years are
> likely to have glyphs or entire character sets covered neither here nor
> in Unicode, and these might or might not be important in some application
> somewhere. Readers are invited, therefore, to propose any needed
> additions, bearing in mind that Unicode code space is not unlimited.
And hopefully the compleatists out there will let sleeping dogs lie.
Which is not to say that some other terminals might be worth
supporting, but I suspect that the cost to the rest of the world
in terms of codepoint space for most of them means that doing the
emulation with alternate glyphs or custom fonts is appropriate.
Geoffrey Waigh
gpw@cybersurf.net
1-Oct-98 11:27:35-GMT,3161;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id HAA23098
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 07:27:34 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id EAA59550
; Thu, 1 Oct 1998 04:24:53 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA15098; Thu, 1 Oct 98 04:16:20 -0700
Message-Id: <9810011116.AA15098@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6029 (1998-10-01 11:15:47 GMT)
From: Kevin Bracey <kbracey@acorn.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 04:15:46 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
In message <9810010143.AA14122@unicode.org>
Frank da Cruz <fdc@watsun.cc.columbia.edu> wrote:
> Sorry for the length of the following; if you're not interested,
> skip it. The intention is to bring likeminded parties out of the woodwork;
> if you are one, please contact me and we can continue the topic offline.
>
> - Frank
>
Very well put together proposal - I like it. A few comments (on the mailing
list, so keeping it short):
>
> Unicode already has a block of Control Pictures at U+2400 through U+2421,
> but (except for "NL" at U+2424) these go horizontally across the character
> cell, rather than diagonally, thus making them difficult to distinguish
> from normal alphanumeric text. A new, parallel block of C0 control
> pictures is needed in which the abbreviations are displayed diagonally.
That's a glyph variation - the Unicode Standard explicitly states that
you can use whatever preferred glyph you like for these. Indeed, IIRC,
ISO 10646-1 has considerably different suggested glyphs for these characters.
> E080 SP Space (like U+2420 but arranged diagonally)
> E081 DEL Delete (Rubout) (2-character name: DT)
These two are glyph variants of U+2420 and U+2421.
> E082 LS1 Locking Shift 1 (ISO name for SO)
> E083 LS0 Locking Shift 0 (ISO name for SI)
Maybe these two could be considered glyph variants of U+240E and u+240F?
Probably not, I suppose.
> Hexadecimal byte values, 2 hex digits each. Like display controls, but for
> all 256 8-bit byte values, showing the byte code in hexadecimal, rather
> than the (context-dependent) name. For hex debugging (in terminal
> emulators, line monitors, protocol analyzers, etc). Should be arranged
> diagonally within the character cell as shown in Figure 5.1:
Fair enough - but who are you to specify diagonality? These are just
characters with the semantic meaning "Graphic representation of octet value
xx".
> E0F0 Reverse Question Mark DEC VTxxx, Wyse, Televideo (1)
I would suggest U+FFFD for this.
--
Kevin Bracey, Senior Software Engineer
Acorn Computers Ltd Tel: +44 (0) 1223 725228
Acorn House, 645 Newmarket Road Fax: +44 (0) 1223 725328
Cambridge, CB5 8PB, United Kingdom WWW: http://www.acorn.co.uk/
1-Oct-98 14:43:20-GMT,2442;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id KAA04241
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 10:43:19 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id HAA47416
; Thu, 1 Oct 1998 07:41:57 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA15592; Thu, 1 Oct 98 07:35:05 -0700
Message-Id: <9810011435.AA15592@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6030 (1998-10-01 14:34:45 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 07:34:43 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> > Unicode already has a block of Control Pictures at U+2400 through U+2421,
> > but (except for "NL" at U+2424) these go horizontally across the character
> > cell, rather than diagonally, thus making them difficult to distinguish
> > from normal alphanumeric text. A new, parallel block of C0 control
> > pictures is needed in which the abbreviations are displayed diagonally.
>
> That's a glyph variation - the Unicode Standard explicitly states that
> you can use whatever preferred glyph you like for these. Indeed, IIRC,
> ISO 10646-1 has considerably different suggested glyphs for these characters.
>
My concern is that the pictures in the Unicode book go horizontally. Although
I do not claim to be an expert on Unicode fonts, I have never seen one that
implemented this block, so I don't actually know how it looks. However, I'd
say that the horizontal arrangement would make it extremely difficult for the
viewer to discern the cell boundaries, as in:
NULSOHSTXETXEOTENQACKBELDELNAKSYNETBCANSUBESCCANACKSSASS3SPAEPACSISCI
And thus, at minumum, the table in the book should be altered to show all
control pictures arranged diagonally, and all future control picture additions
should also be arranged that way.
> > E0F0 Reverse Question Mark DEC VTxxx, Wyse, Televideo (1)
>
> I would suggest U+FFFD for this.
>
U+FFFD means "this character is not in Unicode" (or in this font), which is
not quite the same meaning as "this character is illegal in this context"
on the VT terminals. Anyway, reverse question mark is a regular glyph
character on the Wyse and Televideo models.
- Frank
1-Oct-98 16:24:10-GMT,921;000000000011
Return-Path: <nelson@pinotnoir.media.mit.edu>
Received: from aleve.media.mit.edu (aleve.media.mit.edu [18.85.2.171])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id MAA03674
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 12:24:05 -0400 (EDT)
Received: from pinotnoir.media.mit.edu (nelson@pinotnoir.media.mit.edu [18.85.16.104])
by aleve.media.mit.edu (8.8.7/ML970927) with ESMTP id MAA00747;
Thu, 1 Oct 1998 12:23:52 -0400 (EDT)
Received: (from nelson@localhost)
by pinotnoir.media.mit.edu (8.8.5/8.8.5) id MAA04913;
Thu, 1 Oct 1998 12:23:52 -0400
Date: Thu, 1 Oct 1998 12:23:52 -0400
Message-Id: <199810011623.MAA04913@pinotnoir.media.mit.edu>
From: nelson@media.mit.edu (Nelson Minar)
To: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Subject: Re: Terminal Graphics Proposal
In-Reply-To: <9810010143.AA14122@unicode.org>
References: <9810010143.AA14122@unicode.org>
Wow, that was an impressive proposal.
1-Oct-98 16:59:50-GMT,2868;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id MAA14961
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 12:59:49 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id JAA32536
; Thu, 1 Oct 1998 09:58:25 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA16971; Thu, 1 Oct 98 09:48:27 -0700
Message-Id: <9810011648.AA16971@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6034 (1998-10-01 16:48:08 GMT)
From: John Cowan <cowan@locke.ccil.org>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 09:48:07 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 7bit
Frank da Cruz wrote:
> More useful in a terminal emulator, however, is the ability to display the
> the official abbreviation [1,18], or "name", of the control character in a
> single cell. [...]
>
> Some control characters have two-character abbreviations (such as CR, LF,
> HT, FF), while others are three characters (NUL, SOH, DC1, DLE). Some
> terminals compress three-letter abbreviations to the two-character forms
> shown in Table 4.2. All terminals, however, display the abbreviations
> diagonally in the character cell, as shown in Figure 4.1. [...]
>
> Unicode already has a block of Control Pictures at U+2400 through U+2421,
> but (except for "NL" at U+2424) these go horizontally across the character
> cell, rather than diagonally, thus making them difficult to distinguish from
> normal alphanumeric text. A new, parallel block of C0 control pictures is
> needed in which the abbreviations are displayed diagonally.
This reflects a failure to understand the semantics of the
Control Pictures block, specifically the range U+2400 - U+214F,
which is documented on page 6-84 of the Unicode Standard 2.0.
# [F]or the control code graphics U+2400 -> U+241F only the
# semantic is encoded in the Unicode Standard. This allwos a particular
# application to use the graphic representation it prefers.
# [...] The [code points U+2400 to U+241F] are not associated with
# specific glyphs, but rather are available to encode <em>any</em>
# desired pictorial representation of the given control code.
The horizontal representations printed on page 7-188, therefore,
are not standardized in any way. Diagonal representations would
be entirely equivalent; the distinction is one of font only.
--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
1-Oct-98 17:15:40-GMT,2779;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA19770
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 13:15:38 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA28598
; Thu, 1 Oct 1998 10:11:05 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA17056; Thu, 1 Oct 98 09:53:13 -0700
Message-Id: <9810011653.AA17056@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6035 (1998-10-01 16:51:54 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 09:51:51 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Frank da Cruz sent a long proposal. It looks like a pretty thorough
analysis, though I've not made it all the way through. One thing leapt out
that I thought I'd mention...
> Unicode already has a block of Control Pictures at U+2400 through U+2421,
> but (except for "NL" at U+2424) these go horizontally across the character
> cell, rather than diagonally, thus making them difficult to distinguish
> from normal alphanumeric text. A new, parallel block of C0 control
> pictures is needed in which the abbreviations are displayed diagonally
I think rather that the current control pictures are a SUGGESTION of the
possible glyphs for particular functions. The glyphs for them even changed
between Unicode 1.0 and 2.0! So I would have to seriously question adding a
parallel set of pictures. Unless there is some need for having multiple,
parallel representations for THE SAME CODE on the SAME TERMINAL, I don't see
any point to adding several glyphic variations. Pick your glyphs and use the
existing control pix for existing controls.
Of course, there are a *lot* of controls, many control sets, and some degree
of overlap, as Frank's proposal points out rather dramatically. I would
suggest that he take up an attempt at serious unification of these things,
and collect all of the wonderful data he's gathered into a "white paper" on
how to use control pictures for what terminals, etc. With mapping tables,
and a list of the minimum required additions to support full cross-mappings.
This proposal contains a lot of data. It would be best to do as much
unification work as possible up-front, rather than relying on UTC and/or WG2
to take it up. The proposal would stand a greater chance of success. If the
committees look at it and say that it needs much work to clarify what can
and cannot be unified, then they're less likely to act quickly. In my
opinion.
And the bibliography is impressive.
Rick
1-Oct-98 17:24:04-GMT,1128;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA21767
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 13:24:03 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA60476
; Thu, 1 Oct 1998 10:20:31 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA17189; Thu, 1 Oct 98 10:03:08 -0700
Message-Id: <9810011703.AA17189@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6036 (1998-10-01 17:00:19 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 10:00:18 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> I do not claim to be an expert on Unicode fonts, I have never seen one that
> implemented this block, so I don't actually know how it looks.
Frank -- Go out to
http://www.indigo.ie/egt/celtscript/
and look for Everson Mono. There's a PS font that implements these, with
completely different glyphs.
Rick
1-Oct-98 18:29:33-GMT,6717;000000000011
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA09734
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 14:29:31 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA54084
; Thu, 1 Oct 1998 11:19:28 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA17232; Thu, 1 Oct 98 10:06:03 -0700
Message-Id: <9810011706.AA17232@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6037 (1998-10-01 17:03:51 GMT)
From: Paul Keinanen <keinanen@sci.fi>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 10:03:50 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Kevin Bracey said most that I was planning to say about this interesting
proposal, but here are some more observations.
Frank da Cruz <fdc@watsun.cc.columbia.edu> wrote:
>Table 4.2: C0 Control Pictures
>
> Code Name 2X Code Name 2X
> E000 NUL NU E010 DLE DL
> E001 SOH SH E011 DC1 D1
> E002 STX SX E012 DC2 D2
> E003 ETX EX E013 DC3 D3
> E004 EOT ET E014 DC4 D4
> E005 ENQ EQ E015 NAK NK
> E006 ACK AK E016 SYN SY
> E007 BEL BL E017 ETB EB
> E009 BS BS E018 CAN CN
> E009 HT HT E019 EM EM
> E00A LF LF E01A SUB SU
> E00B VT VT E01B ESC EC
> E00C FF FF E01C FS FS
> E00D CR CR E01D GS GS
> E00E SO SO E01E RS RS
> E00F SI SI E01F US US
>
>There is little to gain by defining separate 2- and 3-character glyphs for
>control characters that have 3-character names; therefore it is suggested
>that the full abbreviation (from the Name column) be used, with the
>characters arranged diagonally within each cell (rather than horizontally as
>in the U+2400 block), and that the 2X column be ignored.
As far as I know, the Unicode standard does not specify the writing
direction or actual representation of these characters. I would think that
the two or three character forms are just variations of the same glyph. To
me, it would make perfectly sense for readability point of view to use e.g.
AK (horisontally, diagonally or vertically spaced) for a very small font and
use ACK for larger fonts with more available pixels.
If all octet values (00 .. FF) are also going to be displayed, there might
be some ambiguity with some of the two letter codes, e.g. FF, D1, D2, D3,
D4, EB and EC, which should be noted in the actual font design.
>C1 Control characters are specified in ISO-6429 and used in the VT220
>family of terminals [5] and the Wyse 370 [26], where they are represented
>in the right half of the "display controls" font as shown in Table 4.3 (DEC
>terminals use the full name, Wyse terminals use the 2X name). As with C0
>controls, the "name" is displayed diagonally within the character cell.
>Unicode presently includes no C1 control pictures.
Looking through various EBCDIC code pages (e.g. IBM278, IBM880) and other
unnumbered sets it appears that these control codes are all also available
in EBCDIC, but of course at different positions (e.g. IND at 0x24). Some
references to these sets are "IBM NLS RM Vol2 SE09-8002-01, March 1990" and
"IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987".
Based on this observation, it is strange that the C0 control pictures are in
the Unicode standard, but not the C1 control pictures.
>Table 4.3: C1 Control Pictures
>Note that three of the C1 control pictures are unassigned (the ones marked
>by "(1)", that would be at U+E020, U+E021, and U+E039 if these were
>assigned). These positions should be left vacant in case names are assigned
>to these characters in a future revision of ISO 6429.
In ISO 8859-1 these are listed as
80 PADDING CHARACTER (PAD)
81 HIGH OCTET PRESET (HOP)
99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>Table 4.4 shows the names of control characters unique to EBCDIC (that is,
>the ones it does not share with ASCII).
There seems to be different names for the same EBCDIC control characters and
some of these names are equivalent to the ASCII names. Just wondering what
should be done to these control pictures ? Some examples below.
> E082 LS1 Locking Shift 1 (ISO name for SO)
> E083 LS0 Locking Shift 0 (ISO name for SI)
> E084 IS4 ISO Name for FS: Information Separator 4
> E085 IS3 ISO Name for GS: Information Separator 3
> E086 IS2 ISO Name for RS: Information Separator 2
> E087 IS1 ISO Name for US: Information Separator 1
>5. HEX BYTES
>
>Hexadecimal byte values, 2 hex digits each. Like display controls, but for
>all 256 8-bit byte values, showing the byte code in hexadecimal, rather than
>the (context-dependent) name. For hex debugging (in terminal emulators,
>line monitors, protocol analyzers, etc). Should be arranged diagonally
>within the character cell as shown in Figure 5.1:
These would be very nice :-). Note the possible ambiguity with some two
character control pictures r.g. FF, EB etc. So special precautions should be
taken when designing the fonts.
>8. MISCELLANEOUS SINGLE-CELL GLYPHS
>Notes:
> (1) The reverse question is essential in VT terminal emulation, where it
> indicates that an invalid code was received, or a parity or other
> error was detected. It also stands for SUB and/or RS in Wyse display
> controls mode, and is the glyph for 0xFF in the Televideo Multinational
> Character Set [23]. And it it is also a glyph in the DG Special
> Graphics Character Set [2].
Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no need
to re-invent it.
>9. UNFINISHED BUSINESS
>No attempt was made to account for the many Viewdata, Videotex, Minitel,
>NAPLPS, or other mosaic graphics character sets. These should be tackled,
>if appropriate, by someone who knows something about them.
And not forgetting the tele-text block characters on European TVs. With the
introduction of TV cards for PCs that also contains a teletext decoder, so
there is a need to display the text and block graphics on PC. As far as I
remember, the block graphic format is more or less the same as Viewdata with
2 columns and 3 rows per character cell, thus requiring 64 glyphs.
All in all a very interesting proposal. By using as much existing characters
from current Unicode standard, i guess there would be a greater likelyhood
of getting thing officially approved.
Paul
1-Oct-98 19:21:23-GMT,4957;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id PAA23982;
Thu, 1 Oct 1998 15:20:06 -0400 (EDT)
Date: Thu, 1 Oct 98 15:20:04 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: unicode@unicode.org
Subject: Re: Terminal Graphics Proposal
In-Reply-To: Your message of Thu, 1 Oct 1998 10:03:50 -0700 (PDT)
Message-ID: <CMM.0.90.4.907269604.fdc@watsun.cc.columbia.edu>
> As far as I know, the Unicode standard does not specify the writing
> direction or actual representation of [control pictures]. I would think that
> the two or three character forms are just variations of the same glyph.
>
This seems to be the consensus, and the most prominent reaction to the
proposal. Still, if I were a font maker working from the Unicode book, I'd
probably copy the pictures in it, so again, I'd suggest the next edition
show the characters diagonally within the cell, and the accompanying text
(which if I can overlook, so can a font maker :-) point out the importance
of visually preserving the character-cell boundaries by some means. The
diagonal arrangement is used on all terminals I have seen that support
display controls, so this would be the most obvious method.
> If all octet values (00 .. FF) are also going to be displayed, there might
> be some ambiguity with some of the two letter codes, e.g. FF, D1, D2, D3,
> D4, EB and EC, which should be noted in the actual font design.
>
Good point. One path to disambiguation would be to show hex digits A-F in
lower case. Sounds OK?
> >C1 Control characters are specified in ISO-6429 and used in the VT220
> >family of terminals [5] and the Wyse 370 [26], where they are represented
> >in the right half of the "display controls" font as shown in Table 4.3 (DEC
> >terminals use the full name, Wyse terminals use the 2X name). As with C0
> >controls, the "name" is displayed diagonally within the character cell.
> >Unicode presently includes no C1 control pictures.
>
> Looking through various EBCDIC code pages (e.g. IBM278, IBM880) and other
> unnumbered sets it appears that these control codes are all also available
> in EBCDIC, but of course at different positions (e.g. IND at 0x24). Some
> references to these sets are "IBM NLS RM Vol2 SE09-8002-01, March 1990" and
> "IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987".
>
Thanks for pointing this out -- I'll be sure to unify all duplicates in the
next go-round.
> In ISO 8859-1 these are listed as
>
> 80 PADDING CHARACTER (PAD)
> 81 HIGH OCTET PRESET (HOP)
> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
I suppose that's a good enough source, though I wonder why they are not
named in ISO 6429!
> >Table 4.4 shows the names of control characters unique to EBCDIC (that
> >is, the ones it does not share with ASCII).
>
> There seems to be different names for the same EBCDIC control characters
> and some of these names are equivalent to the ASCII names. Just wondering
> what should be done to these control pictures ? Some examples below.
>
In the spirit of unification, I would venture that if two different control
characters have the same name, only one control picture is needed.
> >Notes:
> > (1) The reverse question is essential in VT terminal emulation...
>
> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
> need to re-invent it.
>
But that one is upside down. The one I'm talking about is upright but
flipped on its vertical axis.
Clearly an important component of this proposal, before it reaches its final
stage, is a collection of pictures of the proposed characters. I'll do my
best to scan in the relevant pages from the many terminal manuals, for what
it's worth -- many of them are crude and unclear to begin with, some of them
are even hand-drawn. Others are wedded to dot matrices of specific
dimensions and, in fact, are shown as large tty graphics.
I wonder how one proceeds from such elusive sources to create a definitive
picture of each character, and then to translate this into the style of a
particular font. Oh well, not my problem :-)
> All in all a very interesting proposal. By using as much existing characters
> from current Unicode standard, i guess there would be a greater likelyhood
> of getting thing officially approved.
>
And of course, many characters in many of these sets are indeed well covered
by existing Unicode characters and so never appeared in the proposal in the
first place. I considered fully enumerating each character set and noting
which characters already did and did not have suitable Unicode equivalents,
but that would have made the proposal much too long.
Thanks to you and everyone else for the helpful and supportive comments. I
think the next step will be to run a new draft (updated according to comments
from this list) past the broad constituencies of some of the terminals it
treats, for which there are several well-suited newsgroups.
Thanks again!
- Frank
1-Oct-98 19:58:53-GMT,5362;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA06792
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 15:58:51 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA36108
; Thu, 1 Oct 1998 12:59:01 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19623; Thu, 1 Oct 98 12:35:11 -0700
Message-Id: <9810011935.AA19623@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6042 (1998-10-01 19:32:37 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 12:32:36 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> As far as I know, the Unicode standard does not specify the writing
> direction or actual representation of [control pictures]. I would think that
> the two or three character forms are just variations of the same glyph.
>
This seems to be the consensus, and the most prominent reaction to the
proposal. Still, if I were a font maker working from the Unicode book, I'd
probably copy the pictures in it, so again, I'd suggest the next edition
show the characters diagonally within the cell, and the accompanying text
(which if I can overlook, so can a font maker :-) point out the importance
of visually preserving the character-cell boundaries by some means. The
diagonal arrangement is used on all terminals I have seen that support
display controls, so this would be the most obvious method.
> If all octet values (00 .. FF) are also going to be displayed, there might
> be some ambiguity with some of the two letter codes, e.g. FF, D1, D2, D3,
> D4, EB and EC, which should be noted in the actual font design.
>
Good point. One path to disambiguation would be to show hex digits A-F in
lower case. Sounds OK?
> >C1 Control characters are specified in ISO-6429 and used in the VT220
> >family of terminals [5] and the Wyse 370 [26], where they are represented
> >in the right half of the "display controls" font as shown in Table 4.3 (DEC
> >terminals use the full name, Wyse terminals use the 2X name). As with C0
> >controls, the "name" is displayed diagonally within the character cell.
> >Unicode presently includes no C1 control pictures.
>
> Looking through various EBCDIC code pages (e.g. IBM278, IBM880) and other
> unnumbered sets it appears that these control codes are all also available
> in EBCDIC, but of course at different positions (e.g. IND at 0x24). Some
> references to these sets are "IBM NLS RM Vol2 SE09-8002-01, March 1990" and
> "IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987".
>
Thanks for pointing this out -- I'll be sure to unify all duplicates in the
next go-round.
> In ISO 8859-1 these are listed as
>
> 80 PADDING CHARACTER (PAD)
> 81 HIGH OCTET PRESET (HOP)
> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
I suppose that's a good enough source, though I wonder why they are not
named in ISO 6429!
> >Table 4.4 shows the names of control characters unique to EBCDIC (that
> >is, the ones it does not share with ASCII).
>
> There seems to be different names for the same EBCDIC control characters
> and some of these names are equivalent to the ASCII names. Just wondering
> what should be done to these control pictures ? Some examples below.
>
In the spirit of unification, I would venture that if two different control
characters have the same name, only one control picture is needed.
> >Notes:
> > (1) The reverse question is essential in VT terminal emulation...
>
> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
> need to re-invent it.
>
But that one is upside down. The one I'm talking about is upright but
flipped on its vertical axis.
Clearly an important component of this proposal, before it reaches its final
stage, is a collection of pictures of the proposed characters. I'll do my
best to scan in the relevant pages from the many terminal manuals, for what
it's worth -- many of them are crude and unclear to begin with, some of them
are even hand-drawn. Others are wedded to dot matrices of specific
dimensions and, in fact, are shown as large tty graphics.
I wonder how one proceeds from such elusive sources to create a definitive
picture of each character, and then to translate this into the style of a
particular font. Oh well, not my problem :-)
> All in all a very interesting proposal. By using as much existing characters
> from current Unicode standard, i guess there would be a greater likelyhood
> of getting thing officially approved.
>
And of course, many characters in many of these sets are indeed well covered
by existing Unicode characters and so never appeared in the proposal in the
first place. I considered fully enumerating each character set and noting
which characters already did and did not have suitable Unicode equivalents,
but that would have made the proposal much too long.
Thanks to you and everyone else for the helpful and supportive comments. I
think the next step will be to run a new draft (updated according to comments
from this list) past the broad constituencies of some of the terminals it
treats, for which there are several well-suited newsgroups.
Thanks again!
- Frank
1-Oct-98 19:58:53-GMT,1419;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA06793
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 15:58:51 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA16728
; Thu, 1 Oct 1998 12:57:48 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19780; Thu, 1 Oct 98 12:42:14 -0700
Message-Id: <9810011942.AA19780@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6043 (1998-10-01 19:39:42 GMT)
From: John Cowan <cowan@locke.ccil.org>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 12:39:38 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 7bit
Paul Keinanen wrote:
> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no need
> to re-invent it.
No, that is the inverted (head over heels) question mark. What is
being described here is a reversed (left-to-right) question mark.
--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
1-Oct-98 21:32:59-GMT,2116;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA03320
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 17:32:57 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA17196
; Thu, 1 Oct 1998 14:28:31 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA20910; Thu, 1 Oct 98 13:56:11 -0700
Message-Id: <9810012056.AA20910@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6046 (1998-10-01 20:54:15 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 13:54:14 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> Still, if I were a font maker working from the Unicode book, I'd
> probably copy the pictures in it, so again, I'd suggest the next edition
> show the characters diagonally within the cell, and the accompanying text
> (which if I can overlook, so can a font maker :-)
Yes, yes, but... People should read, Grasshopper. It is that for which we write.
People who do not "R.T.F.M." waste everyone else's time asking questions
that are answered in "T.F.M." The Internet is absolutely RAMPANT with that
behavior and it always has been. (Well, society itself, for that matter, is
full of such behavior, so I shouldn't knock the net...)
It would be a poor, poor font designer indeed who, having NOT Read The
Formidable Manual, tried to implement a font for the control pictures. I
would not pity such a person, only the victims of the resulting "font".
> I wonder how one proceeds from such elusive sources to create a definitive
> picture of each character, and then to translate this into the style of a
> particular font. Oh well, not my problem :-)
Actually, Grasshopper... it *is* your problem. Nobody else is going to do
this for you. I suggest you gird your loins and heave to. This is a chance
for you to make a useful and lasting contribution to History.
Cheerily,
Rick
1-Oct-98 21:53:31-GMT,1581;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA06957
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 17:53:30 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA11468
; Thu, 1 Oct 1998 14:53:30 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA21562; Thu, 1 Oct 98 14:41:30 -0700
Message-Id: <9810012141.AA21562@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6047 (1998-10-01 21:40:59 GMT)
From: Asmus Freytag <asmusf@ix.netcom.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 14:40:56 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
>And thus, at minumum, the table in the book should be altered to show all
>control pictures arranged diagonally, and all future control picture
additions
>should also be arranged that way.
We are looking into this for Unicode 3.0. Although the mail discussion makes
clear that the distinction between characters and glyphs is widely known, it
makes no sense to depart from the established use in the one area the
characters are intended for!
Since the two glyph forms are equivalent (i.e. there's no question of changing
the identity of the characters) such a change is editorial in nature. For
what it's worth, ISO 10646 uses the diagonal forms (although incorrectly in
a roman type face).
A./
1-Oct-98 22:06:05-GMT,1299;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id SAA08132
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 18:06:03 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id PAA40338
; Thu, 1 Oct 1998 15:05:50 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA21781; Thu, 1 Oct 98 14:52:06 -0700
Message-Id: <9810012152.AA21781@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6048 (1998-10-01 21:50:21 GMT)
From: Asmus Freytag <asmusf@ix.netcom.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 14:50:20 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
>> >Notes:
>> > (1) The reverse question is essential in VT terminal emulation...
>>
>> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
>> need to re-invent it.
>>
>But that one is upside down. The one I'm talking about is upright but
>flipped on its vertical axis.
>
This important character is already on the list of characters to be added
in one the coming amendments in ISO 10646.
A./
1-Oct-98 23:12:25-GMT,1652;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id TAA15789
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 19:12:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id QAA65550
; Thu, 1 Oct 1998 16:11:33 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA22944; Thu, 1 Oct 98 16:06:19 -0700
Message-Id: <9810012306.AA22944@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6049 (1998-10-01 23:05:56 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Thu, 1 Oct 1998 16:05:51 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal (reverse QMark)
>
> >> >Notes:
> >> > (1) The reverse question is essential in VT terminal emulation...
> >>
> >> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
> >> need to re-invent it.
> >>
> >But that one is upside down. The one I'm talking about is upright but
> >flipped on its vertical axis.
> >
>
> This important character is already on the list of characters to be added
> in one the coming amendments in ISO 10646.
>
As Asmus mentioned, this one is already on its way. It is encoded in
Amendment 18 to 10646, which is just entering its last round of ballotting:
U+2426 SYMBOL FOR SUBSTITUTE FORM TWO
with the requisite shape of the reversed question mark.
This character is derived from ISO 2047, also shows up in DIN 66 213,
and in various terminal emulations.
--Ken
2-Oct-98 0:27:41-GMT,1442;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id UAA23786
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 20:27:41 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id RAA31536
; Thu, 1 Oct 1998 17:27:33 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA24011; Thu, 1 Oct 98 17:17:58 -0700
Message-Id: <9810020017.AA24011@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6054 (1998-10-02 00:16:05 GMT)
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 17:16:04 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Paul Keinanen wrote on 1998-10-01 17:03 UTC:
> In ISO 8859-1 these are listed as
>
> 80 PADDING CHARACTER (PAD)
> 81 HIGH OCTET PRESET (HOP)
>
> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
Are you sure about the source? Last time I looked into
ISO/IEC 8859-1:1986(E), it was certainly free of any control
characters. ISO 8859 defines only graphical characters.
What exactly is your source on this?
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
2-Oct-98 0:48:25-GMT,2800;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id UAA27172
for <fdc@watsun.cc.columbia.edu>; Thu, 1 Oct 1998 20:48:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id RAA11956
; Thu, 1 Oct 1998 17:48:34 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA23953; Thu, 1 Oct 98 17:15:49 -0700
Message-Id: <9810020015.AA23953@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6053 (1998-10-02 00:15:13 GMT)
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 17:15:12 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Frank da Cruz wrote on 1998-10-01 14:34 UTC:
> My concern is that the pictures in the Unicode book go horizontally.
Not much love went into the U+24XX glyphs used to print Unicode 2.0.
The OCR symbols look quite strange as well.
Unicode is a character set, not a font. Keeping things readable is the
duty of the font designer. Of course most good fonts will have the Control
Pictures with diagonal letters. The ISO 10646-1 standard shows them
all nicely diagonally. It is a good idea for font designers to have
BOTH the Unicode 2.0 and the ISO 10646 standard on their desk, to see a few
glyph variations as the two standards were printed using different fonts.
> Although
> I do not claim to be an expert on Unicode fonts, I have never seen one that
> implemented this block, so I don't actually know how it looks.
One X11 ISO 10646-1 font that implements this block is available from
http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz
See the included README file for instructions on how to have a quick
look at it with xfd. I don't claim that the control pictures in there
are extremely beautiful (doing ENQ in a 6x13 matrix is quite
challenging), but I think it is quite readable.
> However, I'd
> say that the horizontal arrangement would make it extremely difficult for the
> viewer to discern the cell boundaries, as in:
>
> NULSOHSTXETXEOTENQACKBELDELNAKSYNETBCANSUBESCCANACKSSASS3SPAEPACSISCI
>
> And thus, at minumum, the table in the book should be altered to show all
> control pictures arranged diagonally, and all future control picture additions
> should also be arranged that way.
I agree that the glyphs used to print the ISO 10646-1 standard are
much better here than those used in the Unicode 2.0 standard for the U+24XX
range.
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
2-Oct-98 5:47:14-GMT,2024;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id BAA03026
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 01:47:13 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id WAA11502
; Thu, 1 Oct 1998 22:47:21 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25821; Thu, 1 Oct 98 22:41:05 -0700
Message-Id: <9810020541.AA25821@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6056 (1998-10-02 05:40:41 GMT)
From: Paul Keinanen <keinanen@sci.fi>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 22:40:39 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
At 12:32 1.10.1998 -0700, Frank da Cruz wrote:
>> If all octet values (00 .. FF) are also going to be displayed, there might
>> be some ambiguity with some of the two letter codes, e.g. FF, D1, D2, D3,
>> D4, EB and EC, which should be noted in the actual font design.
>>
>Good point. One path to disambiguation would be to show hex digits A-F in
>lower case. Sounds OK?
I weas also thinking about that, but since the characters are really small
to begin with, trying to make them lower case on a low resolution matrix
would make them even harder to read.
>> >Notes:
>> > (1) The reverse question is essential in VT terminal emulation...
>>
>> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
>> need to re-invent it.
>>
>But that one is upside down. The one I'm talking about is upright but
>flipped on its vertical axis.
Sorry about that, I did not read your text thoroughly enough.
In all those DEC and other systems I have used, the inverted question mark
(nmot the reverse question mark) has been used for (parity etc.) error
indication. I assumed, incorrectly, that you were refering to this usage.
Paul
2-Oct-98 5:53:42-GMT,2125;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id BAA03730
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 01:53:42 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id WAA09530
; Thu, 1 Oct 1998 22:53:47 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25825; Thu, 1 Oct 98 22:41:06 -0700
Message-Id: <9810020541.AA25825@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6057 (1998-10-02 05:40:51 GMT)
From: Paul Keinanen <keinanen@sci.fi>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 1 Oct 1998 22:40:50 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
At 17:16 1.10.1998 -0700, Markus Kuhn wrote:
>Paul Keinanen wrote on 1998-10-01 17:03 UTC:
>> In ISO 8859-1 these are listed as
>>
>> 80 PADDING CHARACTER (PAD)
>> 81 HIGH OCTET PRESET (HOP)
>>
>> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
>Are you sure about the source? Last time I looked into
>ISO/IEC 8859-1:1986(E), it was certainly free of any control
>characters. ISO 8859 defines only graphical characters.
>What exactly is your source on this?
So that explains why
ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
list no code points in the C0 and C1 range.
Then I am just wondering why
ftp://dkuug.dk/i18n/charmaps/CP819 (alias Latin1 alias ISO_8859-1:1987) lists
<PA> /x80 <U0080> PADDING CHARACTER (PAD)
<HO> /x81 <U0081> HIGH OCTET PRESET (HOP)
<GC> /x99 <U0099> SINGLE GRAPHIC CHARACTER
INTRODUCER (SGCI)
and ftp://dkuug.dk/i18n/charmaps.646/ISO_8859-1:1987
lists the same code point values for these control characters
<PA> /d128
<HO> /d129
<GC> /d153
So I just wonder, where they at dkuug.dk/i18n have taken these C0 and C1
codes from, unfortunately these tables did not contain any references (as
did most EBCDIC tables).
Paul
2-Oct-98 9:37:48-GMT,1880;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id FAA07143
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 05:37:47 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id CAA14408
; Fri, 2 Oct 1998 02:37:14 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA26800; Fri, 2 Oct 98 01:54:04 -0700
Message-Id: <9810020854.AA26800@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6061 (1998-10-02 08:51:08 GMT)
From: Michael Everson <everson@indigo.ie>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 2 Oct 1998 01:51:01 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id FAA07143
Ar 17:16 -0700 1998-10-01, scrφobh Markus Kuhn:
>Paul Keinanen wrote on 1998-10-01 17:03 UTC:
>> In ISO 8859-1 these are listed as
>>
>> 80 PADDING CHARACTER (PAD)
>> 81 HIGH OCTET PRESET (HOP)
>>
>> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
>Are you sure about the source? Last time I looked into
>ISO/IEC 8859-1:1986(E), it was certainly free of any control
>characters. ISO 8859 defines only graphical characters.
>What exactly is your source on this?
I am not following the Terminal Graphics Proposal thread in great detail
because I think Elvish more relevant to my work :-) but I would like to say
that I hope lots of hardcopy examples will be forwarded to WG2 so that we
who are not so expert in the field can evaluate it appropriately.
Cf. the Western Musical Symbols or Syriac proposals.
Michael Everson
PS. Yes, I would make TTFs for them if necessary.
2-Oct-98 11:58:41-GMT,1798;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id HAA15340
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 07:58:41 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id EAA13470
; Fri, 2 Oct 1998 04:53:01 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA29143; Fri, 2 Oct 98 04:44:36 -0700
Message-Id: <9810021144.AA29143@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6067 (1998-10-02 11:44:14 GMT)
From: Kevin Bracey <kbracey@acorn.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 2 Oct 1998 04:44:12 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
In message <9810020026.AA24186@unicode.org>
Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> wrote:
>
> Unicode is a character set, not a font. Keeping things readable is the
> duty of the font designer. Of course most good fonts will have the Control
> Pictures with diagonal letters. The ISO 10646-1 standard shows them
> all nicely diagonally. It is a good idea for font designers to have
> BOTH the Unicode 2.0 and the ISO 10646 standard on their desk, to see a few
> glyph variations as the two standards were printed using different fonts.
>
I heartily concur with this. IMHO, most of ISO 10646-1's glyphs are a lot
better than the Unicode Standard's.
--
Kevin Bracey, Senior Software Engineer
Acorn Computers Ltd Tel: +44 (0) 1223 725228
Acorn House, 645 Newmarket Road Fax: +44 (0) 1223 725328
Cambridge, CB5 8PB, United Kingdom WWW: http://www.acorn.co.uk/
2-Oct-98 15:07:41-GMT,2466;000000000001
Return-Path: <keka@im.se>
Received: from www.im.se (fw.im.se [193.14.22.222])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id LAA25411
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 11:07:37 -0400 (EDT)
Received: from imhps.im.se (imhps.im.se [192.36.35.5])
by www.im.se (8.8.7/8.8.7) with ESMTP id QAA28984;
Fri, 2 Oct 1998 16:48:31 +0200 (METDST)
Received: from msxsth1.im.se by imhps.im.se (1.37.109.16/IM-3.12)
id AA088490748; Fri, 2 Oct 1998 17:05:48 +0200
Received: by msxsth1 with Internet Mail Service (5.5.2232.9)
id <TX2WHJ04>; Fri, 2 Oct 1998 17:03:39 +0200
Message-Id: <C110A2268F8DD111AA1A00805F85E58D57DC2F@ntgbg1>
From: Karlsson Kent - keka <keka@im.se>
To: "'Paul Keinanen'" <keinanen@sci.fi>, "'Rick McGowan'" <rmcgowan@apple.com>,
"'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>,
"'Markus Kuhn'"
<Markus.Kuhn@cl.cam.ac.uk>,
"'Ken Whistler'" <kenw@sybase.com>
Cc: "'Asmus Freytag'" <asmusf@ix.netcom.com>,
"'Kevin Bracey'"
<kbracey@acorn.com>,
"'John Cowan'" <cowan@locke.ccil.org>
Subject: RE: Terminal Graphics Proposal
Date: Fri, 2 Oct 1998 17:03:15 +0200
Mime-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
I'm ***not*** REALLY interested in the control code display-instead-of-do
characters, since I find them to be a thing of the past* (no flames, please,
I alredy know that some of you disagree). And I know TUS says one can use
ANY (in some way appropriate) glyps for them.
It still disturbs me that thay have no compatibility decompositions.
(Compare the <square> decompositons for some characters.) The glyphs for
these (nonce, imho) symbol characters are still fairly fixed to actually be
a short (2-3) sequence of letters/digits. I think it would be reasonable to
have compatibility decompositions for these characters too. This would
affect collation also: they would be sorted according to the constituent
letters (which is what I, at least, would expect).
I probably should not say this, but... If you are abolutely hardbent on
having symbols for control codes, there should be some for the Unicode
control codes too (like paragraph separator, left-to-right-mark, etc.) They
need not be constructed from letters...
R.
/kent k
*(though the newly suggested hexadecimal-digit-pair display ones might
continue to be useful; though hexadecimal digit quadruples would fill an
entier plane and more! ;-)
2-Oct-98 16:42:26-GMT,2281;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id MAA23664;
Fri, 2 Oct 1998 12:41:04 -0400 (EDT)
Date: Fri, 2 Oct 1998 12:41:04 -0400 (EDT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Message-Id: <199810021641.MAA23664@watsun.cc.columbia.edu>
To: unicode@unicode.org
Subject: Re: Terminal Graphics Proposal
In-Reply-To: Your message of Fri, 2 Oct 1998 01:51:01 -0700 (PDT)
Ar Fri, 2 Oct 1998 01:51:01 -0700 (PDT) scrφobh Michael Everson:
> ... I hope lots of hardcopy examples will be forwarded to WG2 so that we
> who are not so expert in the field can evaluate it appropriately.
>
I'll be happy to provide copies of the character set table from all the
relevant manuals. How do I forward them to WG2?
> PS. Yes, I would make TTFs for them if necessary.
>
What's a TTF? You mean Web-viewable glyph tables like on your website?
Thanks!
- Frank
P.S. By the way, I realize some people find this focus on arcane,
"obsolete", ""legacy"" technology amusing, but it might have certain
unanticipated benefits. For historic or scholarly purposes, the UTC has an
interest in encoding scripts that are no longer in active use; one might
view these glyphs in the same way. I am always amazed by the vigor with
which the history of computing is discarded and wiped out on a continuing
basis. Computing is quite likely to dominate human life from now on; some
day everyone will look around and wonder how it all happened, and nobody
will know. At least now (I hope) we'll be able to publish works --
electronic or otherwise -- in a Unicode font, illustrating how people used
computers in ancient times (the 1970s and 80s), for the continued
amusement of generations to come.
P.P.S. Those interested in preserving the signs and symbols of bygone eras
of computing might also want to take a look at Fred Hoyle's book, The Black
Cloud, circa 1954, which I read a long time ago but don't have any more. As
I recall, it included fragments of computer programs written in the strange
punch-card symbols of the time -- lozenges, etc -- which I dimly recall from
my youthful experiences with IBM EAM equipment. Does anyone have a copy
handy? I wonder if it can be printed in Unicode; perhaps here is fodder for
another fun proposal...
2-Oct-98 17:29:14-GMT,2329;000000000001
Return-Path: <kenw@sybase.com>
Received: from inergen.sybase.com (inergen.sybase.com [192.138.151.43])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA07088
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 13:29:12 -0400 (EDT)
Received: from smtp1.sybase.com (sybgate.sybase.com [130.214.220.35])
by inergen.sybase.com (8.8.4/8.8.4) with SMTP
id KAA10504; Fri, 2 Oct 1998 10:26:44 -0700 (PDT)
Received: from birdie.sybase.com by smtp1.sybase.com (4.1/SMI-4.1/SybH3.5-030896)
id AA25742; Fri, 2 Oct 98 10:25:29 PDT
Received: by birdie.sybase.com (5.x/SMI-SVR4/SybEC3.5)
id AA16847; Fri, 2 Oct 1998 10:25:23 -0700
Date: Fri, 2 Oct 1998 10:25:23 -0700
From: kenw@sybase.com (Kenneth Whistler)
Message-Id: <9810021725.AA16847@birdie.sybase.com>
To: keka@im.se
Subject: RE: Terminal Graphics Proposal
Cc: keinanen@sci.fi, rmcgowan@apple.com, fdc@watsun.cc.columbia.edu,
Markus.Kuhn@cl.cam.ac.uk, kenw@sybase.com, asmusf@ix.netcom.com,
kbracey@acorn.com, cowan@locke.ccil.org
X-Sun-Charset: US-ASCII
Kent said:
> I'm ***not*** REALLY interested in the control code display-instead-of-do
> characters, since I find them to be a thing of the past* (no flames, please,
> I alredy know that some of you disagree). And I know TUS says one can use
> ANY (in some way appropriate) glyps for them.
>
> It still disturbs me that thay have no compatibility decompositions.
> (Compare the <square> decompositons for some characters.)
I disagree completely on this. Proposing compatibility decompositions
for glyphs which have arbitrary content (such as these) confuses
apples and oranges. These 32 glyphs for control codes could contain
3-letter acronyms, or 2-letter acronyms, or could be substituted out
for something completely different, such as the ISO 2047 set. Compatibility
decompositions in that context would be completely misleading.
> The glyphs for
> these (nonce, imho) symbol characters are still fairly fixed to actually be
> a short (2-3) sequence of letters/digits. I think it would be reasonable to
> have compatibility decompositions for these characters too. This would
> affect collation also: they would be sorted according to the constituent
> letters (which is what I, at least, would expect).
Again, I disagree. These should *not* sort as "NUL", "SOH", etc.
--Ken
2-Oct-98 19:42:24-GMT,854;000000000001
Return-Path: <jon@kanji.com>
Received: from lotus.kanji.com (lotus.kanji.com [206.230.42.4])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA20222
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 15:42:17 -0400 (EDT)
Received: by kanji.com
via sendmail from stdin
id <m0zPAuc-002Aq4C@lotus.kanji.com> (Debian Smail3.2.0.101)
for fdc@watsun.cc.columbia.edu; Fri, 2 Oct 1998 13:30:46 -0600 (MDT)
Message-Id: <m0zPAuc-002Aq4C@lotus.kanji.com>
Date: Fri, 2 Oct 1998 13:30:46 -0600 (MDT)
From: Jon Babcock <jon@kanji.com>
To: fdc@watsun.cc.columbia.edu
Subject: The Black Hole
_The Black Hole_ by Fred Hoyle(ISBN: 0899683444) is available on
www.amazon.com for $26.96 plus shipping.
Just noticed your note in a unicode ML msg and thought you might be
interested. (I've no connection with Amazon, btw.)
Jon
--
Jon Babcock <jon@kanji.com>
2-Oct-98 20:25:29-GMT,1979;000000000011
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA02116
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 16:25:25 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA08278
; Fri, 2 Oct 1998 12:15:57 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA02257; Fri, 2 Oct 98 11:15:01 -0700
Message-Id: <9810021815.AA02257@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6079 (1998-10-02 18:13:04 GMT)
From: Michael Everson <everson@indigo.ie>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 2 Oct 1998 11:13:03 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id QAA02116
Ar 09:54 -0700 1998-10-02, scrφobh Frank da Cruz:
>What's a TTF? You mean Web-viewable glyph tables like on your website?
A True Type Font.
>- Frank
>
>P.S. By the way, I realize some people find this focus on arcane,
>"obsolete", ""legacy"" technology amusing, but it might have certain
>unanticipated benefits. For historic or scholarly purposes, the UTC has an
>interest in encoding scripts that are no longer in active use; one might
>view these glyphs in the same way. I am always amazed by the vigor with
>which the history of computing is discarded and wiped out on a continuing
>basis.
I for my part do NOT!!!! want to see these terminal graphic things in the
BMP. They belong in Plane 1.
--
Michael Everson, Everson Gunn Teoranta ** http://www.indigo.ie/egt
15 Port Chaeimhghein ═ochtarach; Baile ┴tha Cliath 2; ╔ire/Ireland
Guthßn: +353 1 478-2597 ** Facsa: +353 1 478-2597 (by arrangement)
27 Pßirc an FhΘithlinn; Baile an Bh≤thair; Co. ┴tha Cliath; ╔ire
2-Oct-98 22:55:26-GMT,1768;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id SAA02054
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 18:55:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id PAA19284
; Fri, 2 Oct 1998 15:48:52 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA04658; Fri, 2 Oct 98 14:30:05 -0700
Message-Id: <9810022130.AA04658@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6084 (1998-10-02 21:24:10 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 2 Oct 1998 14:24:06 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> I for my part do NOT!!!! want to see these terminal graphic things in the
> BMP. They belong in Plane 1.
>
Perhaps, but as the lawyers say, the door was opened by the characters
already included in blocks at U+2400, U+2500, U+2600, and U+2700. In any
case, the intention here is to help Unicode become somewhat more
"technology-neutral". Terminal emulation is a fact of life, and important
to a significant number of serious and productive computer users; why should
its special glyphs be excluded from the same status enjoyed by dingbats and
astrological signs? Seriously, I think terminal emulation is far more
mainstream than many Unicoders seem to think, and I hope it is a worthy goal
to welcome this consituency into the fold, thus allowing them to continue
their work in their accustomed manner, rather than according to the dictates
of haute couture, with the added bonus of uniform access to the world's
writing systems.
- Frank
2-Oct-98 23:32:19-GMT,4256;000000000001
Return-Path: <tzha0@amdahl.com>
Received: from orpheus.amdahl.com (orpheus.amdahl.com [159.199.101.3])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with SMTP id TAA05771
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 19:32:18 -0400 (EDT)
Received: from minerva.amdahl.com([129.212.33.25]) (3880 bytes) by orpheus.amdahl.com
via sendmail with P:smtp/R:match-mx-hosts/T:smtp
(sender: <tzha0@amdahl.com>)
id <m0zPEd8-0004qxC@orpheus.amdahl.com>
for <fdc@watsun.cc.columbia.edu>; Fri, 2 Oct 1998 16:28:58 -0700 (PDT)
(Smail-3.2.0.102 1998-Aug-2 #1 built 1998-Aug-14)
Received: from libra by minerva.amdahl.com with smtp
(Smail3.1.29.1 #5) id m0zPEc1-0001AFC; Fri, 2 Oct 98 16:27 PDT
Message-Id: <m0zPEc1-0001AFC@minerva.amdahl.com>
From: "Tony Harminc" <tzha0@amdahl.com>
To: Frank da Cruz <fdc@watsun.cc.columbia.edu>, unicode@unicode.org
Date: Fri, 2 Oct 1998 19:29:05 -0400
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Terminal Graphics Proposal
Priority: normal
In-reply-to: <9810010137.AA14105@unicode.org>
X-mailer: Pegasus Mail for Win32 (v3.01a)
On 30 Sep 98, at 18:35, Frank da Cruz wrote:
> 8. MISCELLANEOUS SINGLE-CELL GLYPHS
>
> Table 8.1: Miscellaneous Single-Cell Terminal Glyphs
>
> Code Description Reference
> E0F0 Reverse Question Mark DEC VTxxx, Wyse, Televideo (1)
> E0F1 Box with X inside DG Math 06/07, GCGID SP500000
> E0F2 Human stick figure with hat SNI Facet 04/14
> E0F3 Clock (with hands at 3:00) SNI Klammern 05/01
> E0F4 Overscore asterisk IBM 3270
> E0F5 Overscore semicolon IBM 3270
> E0F6 Padlock (keyboard locked) IBM 3270
This last one introduces a bit of a problem, I think. It differs
from all other characters mentioned in that it is never displayed in
the data portion of a 3270 screen, but rather occurs "below the line"
as an indication of keyboard status. If it is to be included, then
there are several more uniquely 3270 characters that can be seen
below the line; I don't know formal names for them, and indeed they
generally don't appear in IBM's CDRA documents. Roughly, they are:
Outline up arrow (indication of upshifted condition)
Outline down arrow (indication of downshifted (override) condition)
Key (indication of terminal physically locked (I think
this may be what is meant by E0F6 above)
Stick figure (terminal is connected to "operator" (really to a
supervisory program))
Solid block (terminal is connected to "application program")
4 in square box (terminal is connected to 3274-type control unit)
6 in square box (terminal is connected to 3276-type control unit)
Lightning bolt (communication failure)
Rectangle with slash (machine check)
Printer symbol with slash (associated printer has an error condition)
and most problematic:
Left half of clock (these two form a doublewidth clock (set at 6:10
Right half of clock or 2:30, though I'm sure the time would be
considered a matter of glyph - indeed at least
one non-IBM manufacturer's clock symbol was 5:50
or 10:30)
Now it's entirely reasonable to argue that all the above (and I may
have forgotten a couple) have no business being encoded at all.
Indeed some terminal emulators use graphical means to produce the
symbols. In any case there is nothing in the 3270 architecture that
specifies use of any of them, and an emulator program can use other
means to communicate the same information to the user. However a
number of Windows-based emulators I know do use glyphs encoded in a
font that they supply to produce at least a subset of the symbols.
(It should be pointed out that a number of "ordinary" glyphs can also
appear below the line, but I can think of no reason not to unify them
with the upper case letters, numbers, and so on.)
That IBM doesn't include them in CDRA may be a good reason to exclude
them from this proposal. But they can be genuinely useful for
writers of emulators. What to do ? And how many clocks and stick
figures is it reasonable to encode ?
Tony Harminc
3-Oct-98 9:44:33-GMT,2720;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id FAA27978
for <fdc@watsun.cc.columbia.edu>; Sat, 3 Oct 1998 05:44:32 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id CAA46594
; Sat, 3 Oct 1998 02:44:30 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA07667; Sat, 3 Oct 98 02:33:15 -0700
Message-Id: <9810030933.AA07667@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6089 (1998-10-03 09:32:58 GMT)
From: Michael Everson <everson@indigo.ie>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Sat, 3 Oct 1998 02:32:57 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id FAA27978
Ar 14:24 -0700 1998-10-02, scrφobh Frank da Cruz:
>> I for my part do NOT!!!! want to see these terminal graphic things in the
>> BMP. They belong in Plane 1.
>>
>Perhaps, but as the lawyers say, the door was opened by the characters
>already included in blocks at U+2400, U+2500, U+2600, and U+2700.
I will not support their inclusion in the BMP unless there is a really good
reason. (I'd still make TTFs if necessary though, because I am a loon.) The
list of characters I saw was rather long.
>In any
>case, the intention here is to help Unicode become somewhat more
>"technology-neutral".
The UCS is going to be used for centuries. Do we really think VT100
emulation will be needed via BMP support?
>Terminal emulation is a fact of life, and important
>to a significant number of serious and productive computer users; why should
>its special glyphs be excluded from the same status enjoyed by dingbats and
>astrological signs?
Because the dingbats are used in typography, and astrological signs have a
definite semantic.
>Seriously, I think terminal emulation is far more
>mainstream than many Unicoders seem to think, and I hope it is a worthy goal
>to welcome this consituency into the fold, thus allowing them to continue
>their work in their accustomed manner, rather than according to the dictates
>of haute couture, with the added bonus of uniform access to the world's
>writing systems.
I don't see the argument for BMP here.
--
Michael Everson, Everson Gunn Teoranta ** http://www.indigo.ie/egt
15 Port Chaeimhghein ═ochtarach; Baile ┴tha Cliath 2; ╔ire/Ireland
Guthßn: +353 1 478-2597 ** Facsa: +353 1 478-2597 (by arrangement)
27 Pßirc an FhΘithlinn; Baile an Bh≤thair; Co. ┴tha Cliath; ╔ire
3-Oct-98 21:07:30-GMT,1792;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA02053
for <fdc@watsun.cc.columbia.edu>; Sat, 3 Oct 1998 17:07:29 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA55830
; Sat, 3 Oct 1998 14:02:52 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA09212; Sat, 3 Oct 98 13:52:08 -0700
Message-Id: <9810032052.AA09212@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
X-Uml-Sequence: 6091 (1998-10-03 20:51:56 GMT)
From: Elliotte Rusty Harold <elharo@sunsite.unc.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Sat, 3 Oct 1998 13:51:54 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> E0B4 Latin capital letter H with bar SNI Math 04/05 (2)
> E0B5 Latin small letter h with bar SNI Math 04/06 (2)
Is E0B5 supposed to be Planck's constant over 2*PI? If so, it's encoded at
210F, 0127, and 045B. And your E0B4 is at 0126.
+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@sunsite.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
| XML: Extensible Markup Language (IDG Books 1998) |
| http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/ |
+----------------------------------+---------------------------------+
| Read Cafe au Lait for Java News: http://sunsite.unc.edu/javafaq/ |
| Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/ |
+----------------------------------+---------------------------------+
4-Oct-98 21:50:26-GMT,3358;000000000001
Return-Path: <keka@im.se>
Received: from www.im.se (fw.im.se [193.14.22.222])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA11663
for <fdc@watsun.cc.columbia.edu>; Sun, 4 Oct 1998 17:50:19 -0400 (EDT)
Received: from imhps.im.se (imhps.im.se [192.36.35.5])
by www.im.se (8.8.7/8.8.7) with ESMTP id XAA24796;
Sun, 4 Oct 1998 23:33:11 +0200 (METDST)
Received: from msxsth1.im.se by imhps.im.se (1.37.109.16/IM-3.12)
id AA239507835; Sun, 4 Oct 1998 23:50:35 +0200
Received: by msxsth1 with Internet Mail Service (5.5.2232.9)
id <TX2WHN8C>; Sun, 4 Oct 1998 23:48:37 +0200
Message-Id: <C110A2268F8DD111AA1A00805F85E58D57DC34@ntgbg1>
From: Karlsson Kent - keka <keka@im.se>
To: "'kenw@sybase.com'" <kenw@sybase.com>
Cc: keinanen@sci.fi, rmcgowan@apple.com, fdc@watsun.cc.columbia.edu,
Markus.Kuhn@cl.cam.ac.uk, asmusf@ix.netcom.com, kbracey@acorn.com,
cowan@locke.ccil.org
Subject: RE: Terminal Graphics Proposal
Date: Sun, 4 Oct 1998 23:47:53 +0200
Mime-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain
> ----------
> From: kenw@sybase.com
> Sent: fredag 2 oktober 1998 19:25
> To: keka@im.se
> Cc: keinanen@sci.fi; rmcgowan@apple.com; fdc@watsun.cc.columbia.edu;
> Markus.Kuhn@cl.cam.ac.uk; kenw@sybase.com; asmusf@ix.netcom.com;
> kbracey@acorn.com; cowan@locke.ccil.org
> Subject: RE: Terminal Graphics Proposal
>
> Kent said:
>
> > I'm ***not*** REALLY interested in the control code
> display-instead-of-do
> > characters, since I find them to be a thing of the past* (no flames,
> please,
> > I alredy know that some of you disagree). And I know TUS says one can
> use
> > ANY (in some way appropriate) glyps for them.
> >
> > It still disturbs me that thay have no compatibility decompositions.
> > (Compare the <square> decompositons for some characters.)
>
> I disagree completely on this. Proposing compatibility decompositions
> for glyphs which have arbitrary content (such as these) confuses
> apples and oranges. These 32 glyphs for control codes could contain
> 3-letter acronyms, or 2-letter acronyms, or could be substituted out
> for something completely different, such as the ISO 2047 set.
> Compatibility
> decompositions in that context would be completely misleading.
>
> > The glyphs for
> > these (nonce, imho) symbol characters are still fairly fixed to actually
> be
> > a short (2-3) sequence of letters/digits. I think it would be
> reasonable to
> > have compatibility decompositions for these characters too. This would
> > affect collation also: they would be sorted according to the constituent
> > letters (which is what I, at least, would expect).
>
> Again, I disagree. These should *not* sort as "NUL", "SOH", etc.
>
> --Ken
>
When looking in an index for a terminal emulator say, which is somethign I
actually do sometimes (still), both I and I expect the average reader would
expect to find NL under N, FF under F and so on, rather than quite
unexpectedly before A.
In practice the "content" (glyphs) for these characters do not
appear to be arbitrary. They appear to be rather fixed, and not much
intended for future arbitrary glyph invention. The only variation I have
seen, correct me if I am wrong, is between two and three letter acronyms,
for which a differing code positions would be tolarable.
Kind regards
/kent k
5-Oct-98 16:34:10-GMT,6749;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id MAA25339;
Mon, 5 Oct 1998 12:32:44 -0400 (EDT)
Date: Mon, 5 Oct 98 12:32:43 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: unicode@unicode.org
Subject: Re: Terminal Graphics Proposal
In-Reply-To: Your message of Sat, 3 Oct 1998 02:32:57 -0700 (PDT)
Message-ID: <CMM.0.90.4.907605163.fdc@watsun.cc.columbia.edu>
> Ar 14:24 -0700 1998-10-02, scrmobh Frank da Cruz:
> >> I for my part do NOT!!!! want to see these terminal graphic things in the
> >> BMP. They belong in Plane 1.
> >>
> >Perhaps, but as the lawyers say, the door was opened by the characters
> >already included in blocks at U+2400, U+2500, U+2600, and U+2700.
>
> I will not support their inclusion in the BMP unless there is a really good
> reason. (I'd still make TTFs if necessary though, because I am a loon.) The
> list of characters I saw was rather long.
>
Hurray for Loons!
> >Terminal emulation is a fact of life, and important
> >to a significant number of serious and productive computer users; why should
> >its special glyphs be excluded from the same status enjoyed by dingbats and
> >astrological signs?
>
> Because the dingbats are used in typography, and astrological signs have a
> definite semantic.
>
But what about the block at U+2500? It was included to allow for
character-cell graphics that are possible on the PC -- and the so-called ANSI
emulations that based on it -- but they exclude other types of terminals that
are just as important ("existing standards"). The blocks at U+2580 and
U+25A0 are also clearly intended for character-cell graphic applications, but
they are incomplete. This proposal aims to fill some holes in existing
categories.
The argument for including the missing characters (not necessarily all of
them), stated as clearly as I can, is:
1. There are numerous terminal emulation products on the market, with a user
base numbering in the millions.
2. Increasingly, these products are used on systems -- like Windows NT --
that have Unicode fonts.
3. Many terminal based applications take full advantage of the features and
glyph repertoires of the terminals they are designed for.
4. The glyph repertoire of many common terminals -- VT100/VT220, Wyse,
Siemens Nixdorf, Data General, etc, include glyphs that are not presently
in Unicode.
5. Customers of terminal emulation products demand complete and accurate
emulation.
6. In order to succeed, makers of terminal emulation software must create
private fonts containing the missing glyphs (which, as an aside,
unnecessarily drives up the cost of the product for the end user) in
the Private Use area.
7. Because of the closed an proprietary nature of this process, each terminal
emulation product potentially (and in fact) encodes the same characters
at different places.
8. Other applications use the Private Use Area for other purposes (and other
glyphs).
9. The result is that terminal emulation products do not interoperate with
each other (who cares), or (the real point) with other applications on the
same platform.
For example, a VT100 or HP forms-based screen can not be pasted into a word
processing document without changing the forms borders (etc, depending on
exactly how they are encoded) into whatever other glyphs happen to be defined
at the same code points in the font used by the other application. Ditto for
mathematical formulae displayed on DEC or Siemens Nixdorf screen. Ditto for
character-cell illustrations or tables in numerous online texts intended for
display on any of the widespread terminals.
> >In any
> >case, the intention here is to help Unicode become somewhat more
> >"technology-neutral".
>
> The UCS is going to be used for centuries. Do we really think VT100
> emulation will be needed via BMP support?
>
How does one answer a question like this? Should it be based purely on
numbers? For example, if there are currently millions of users of terminal
emulators (there are), is it right to turn our backs on them while at the
same time we encode writing systems that are used by only a handful of
scholars?
Or, to turn the question on its head, what is wrong with VT100 emulation?
The fact that the popular trade press would like us all to live in a GUI
world that we all know is unreliable, mysterious, proprietary, and constantly
in flux, rather than in a proven, productive, stable, dependable, and
cost-effective open environment should not be a factor in this discussion,
any more than it should be in deciding whether to encode Linear B.
Here in New York City we have thousands of people whose jobs are to sit in
front of a 3270 (or other) terminal all day and respond to telephone calls.
These include 911 (police/fire emergency) operators, EMS dispatchers,
heat-complaint bureau and poison control agents, and car rental and airline
reservation clerks (to name a few). These are what we like to call "mission
critical" applications, and they must be (what we like to call) "rock solid".
These people use a particular application all day, every day. They are
trained on it, they must be able to use it effectively. At some point, the
aging terminals will be replaced by PCs, because the terminals wear out and
almost nobody makes them any more, but the applications themselves will not
go away, nor should they.
The new PCs will need to do exactly what the terminals did. We don't want
our 911 operators to become needlessly confused when some strange symbol
shows up on their screen in place of the one they expect. Taking this a step
further, the people who write the training, operations, and procedures
manuals for these systems need to be able to show the terminal screens and
quote individual glyphs in the text. This is legitimate, real-world,
nuts-and-bolts stuff that might not grab headlines in PC Week (but then I
think that's an excellent indicator its importance :-)
The original proposal included:
Math symbols: 34
Line/Box/Block symbols: 31
Misc symbols: 7
Control pictures: 115
Hex bytes: 256
TOTAL: 443
The single biggest category is hex bytes, which so far seems to have received
a warm reception. Thus the greatest controversy seems to swirl around the
smallest number of characters.
We begin by unifying the proposed diagonal C0 control pictures with the ones
already at U+2400:
Math symbols: 34
Line/Box/Block symbols: 31
Misc symbols: 7
Control pictures: 81
Hex bytes: 256
TOTAL: 409
If we eliminate the hex bytes, this brings the total down to 153.
- Frank
5-Oct-98 19:07:40-GMT,2001;000000000001
Return-Path: <asmusf@ix.netcom.com>
Received: from dfw-ix10.ix.netcom.com (dfw-ix10.ix.netcom.com [206.214.98.10])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA13364
for <fdc@watsun.cc.columbia.edu>; Mon, 5 Oct 1998 15:07:39 -0400 (EDT)
Received: (from smap@localhost)
by dfw-ix10.ix.netcom.com (8.8.4/8.8.4)
id OAA18941; Mon, 5 Oct 1998 14:03:12 -0500 (CDT)
Received: from stl-wa51-59.ix.netcom.com(207.220.40.187) by dfw-ix10.ix.netcom.com via smap (V1.3)
id rma018829; Mon Oct 5 14:02:24 1998
Message-Id: <3.0.5.32.19981005120405.00a62a20@popd.ix.netcom.com>
X-Sender: asmusf@popd.ix.netcom.com
X-Mailer: QUALCOMM Windows Eudora Light Version 3.0.5 (32)
Date: Mon, 05 Oct 1998 12:04:05 -0700
To: Karlsson Kent - keka <keka@im.se>, "'kenw@sybase.com'" <kenw@sybase.com>
From: Asmus Freytag <asmusf@ix.netcom.com>
Subject: RE: Terminal Graphics Proposal
Cc: keinanen@sci.fi, rmcgowan@apple.com, fdc@watsun.cc.columbia.edu,
Markus.Kuhn@cl.cam.ac.uk, kbracey@acorn.com, cowan@locke.ccil.org
In-Reply-To: <C110A2268F8DD111AA1A00805F85E58D57DC34@ntgbg1>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
>>
>> Again, I disagree. These should *not* sort as "NUL", "SOH", etc.
>>
>> --Ken
>>
>When looking in an index for a terminal emulator say, which is somethign I
>actually do sometimes (still), both I and I expect the average reader would
>expect to find NL under N, FF under F and so on, rather than quite
>unexpectedly before A.
>
I would side with Ken. If the emulator manual used the character codes that
correspond to the 'Control code Pictures', I would in fact expect them to
sort with all the other control code pictures and special symbols. If the
index wanted to focus on the names for the control functions, it would use
the charcter codes for the Latin letters and spell out FF etc.
There is no need to burdern *every* single implementation of the standard
sort with the table entries since such simple solutions are possible.
A./
6-Oct-98 17:58:19-GMT,1532;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA10835
for <fdc@watsun.cc.columbia.edu>; Tue, 6 Oct 1998 13:58:17 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA09096
; Tue, 6 Oct 1998 10:57:23 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA04434; Tue, 6 Oct 98 10:43:08 -0700
Message-Id: <9810061743.AA04434@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6096 (1998-10-06 17:42:51 GMT)
From: "Julie Doll Allen" <adollen@ix.netcom.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 6 Oct 1998 10:42:50 -0700 (PDT)
Subject: RE: Terminal Graphics Proposal
Content-Transfer-Encoding: 7bit
Asmus wrote:
Finally, once you feel that your proposal is pretty stable, there are brand
new instructions on how to submit proposals to Unicode on the web site (the
page should be called proposals.html, but I'm not sure where you will find
it). It would be useful to assemble the kinds of information that are
needed, esp. the answers to the form.
---------[end snip]-------
The new page is at:
http://www.unicode.org/pending/proposals.html
I am still adding links to get to it, but it can be accessed from What's New
or, of course, directly.
Julie Allen
Editor
Unicode, Inc.
6-Oct-98 19:09:10-GMT,1153;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA02708
for <fdc@watsun.cc.columbia.edu>; Tue, 6 Oct 1998 15:09:09 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA53736
; Tue, 6 Oct 1998 12:07:35 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA05236; Tue, 6 Oct 98 12:01:06 -0700
Message-Id: <9810061901.AA05236@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
X-Uml-Sequence: 6097 (1998-10-06 19:00:47 GMT)
From: "Tony Harminc" <tzha0@amdahl.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 6 Oct 1998 12:00:46 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 7BIT
On 5 Oct 98, at 13:57, Frank da Cruz wrote:
> The single biggest category is hex bytes, which so far seems to have received
> a warm reception.
Btw, should the hex bytes have the Number property ?
Tony Harminc
6-Oct-98 20:11:20-GMT,955;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA20909
for <fdc@watsun.cc.columbia.edu>; Tue, 6 Oct 1998 16:11:19 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id NAA67062
; Tue, 6 Oct 1998 13:10:21 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA05696; Tue, 6 Oct 98 13:04:37 -0700
Message-Id: <9810062004.AA05696@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6098 (1998-10-06 20:04:23 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 6 Oct 1998 13:04:21 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
> The single biggest category is hex bytes, which so far seems to have received
> a warm reception.
What does "warm reception" mean?
Rick
6-Oct-98 20:49:45-GMT,1073;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA03497
for <fdc@watsun.cc.columbia.edu>; Tue, 6 Oct 1998 16:49:41 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id NAA10010
; Tue, 6 Oct 1998 13:45:33 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA05911; Tue, 6 Oct 98 13:31:48 -0700
Message-Id: <9810062031.AA05911@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6099 (1998-10-06 20:30:21 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 6 Oct 1998 13:30:20 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
>
> On 5 Oct 98, at 13:57, Frank da Cruz wrote:
>
> > The single biggest category is hex bytes, which so far seems to have received
> > a warm reception.
>
> Btw, should the hex bytes have the Number property ?
>
Clearly not.
--Ken
> Tony Harminc
>
6-Oct-98 21:10:51-GMT,1278;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA08700
for <fdc@watsun.cc.columbia.edu>; Tue, 6 Oct 1998 17:10:49 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA53574
; Tue, 6 Oct 1998 14:09:56 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA06001; Tue, 6 Oct 98 13:37:31 -0700
Message-Id: <9810062037.AA06001@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6100 (1998-10-06 20:37:07 GMT)
From: John Cowan <cowan@locke.ccil.org>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 6 Oct 1998 13:37:05 -0700 (PDT)
Subject: Re: Terminal Graphics Proposal
Content-Transfer-Encoding: 7bit
Tony Harminc wrote:
> Btw, should the hex bytes have the Number property ?
IMHO no. They are "Symbol, other".
--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
8-Oct-98 0:08:32-GMT,16887;000000000401
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id UAA27987;
Wed, 7 Oct 1998 20:07:16 -0400 (EDT)
Date: Wed, 7 Oct 98 20:07:15 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: unicode@unicode.org
Subject: Collected Comments on Terminal Graphics Proposal
Message-ID: <CMM.0.90.4.907805235.fdc@watsun.cc.columbia.edu>
Thanks to all who commented on the Terminal Graphics proposal. Here are
some collected responses to particular points.
Geoffrey Waigh <gpw@cybersurf.net> wrote:
> > A selection of terminal graphics characters is proposed for Unicode [24]
> > and ISO 10646 [19] to allow Unicode-based terminal emulation software to
> > (a) display glyphs that are found on popular types of terminals but
> > currently are not available in Unicode, and (b) interoperate with other
> > Unicode applications.
>
> I can see clear merit in handling b), but I'm leary of the code space
> consumption that a) is having here. In general, my feeling is that
> if 98% emulation does the job in an adequate fashion for
> non-perfectionists, then that is the way to go.
>
When a company is in the market for a terminal emulator, one of the factors
affecting their choice is the quality of the emulation, which includes the
ability to display all the same glyphs the terminal displays. If product A
can do this (but has to use a custom font to do so) and product B is a good
citizen and sticks with Unicode -- which prevents it from displaying the same
glyphs properly -- many companies will choose product A because its emulation
is better, even though they might suffer down the road with its nonstandard
encodings -- and maybe even total lack of Unicode support.
> [On Hex code display]
>
> That seems kind of wasteful for a debugging mode. Do the terminals
> that produce this output have escape sequences for enabling this
> mode, or is it strictly a terminal configuration option? (Of course
> by that measure the control character codes come under scrutiny...)
>
This is the largest biggest block in the proposal, and it can be dispensed
with. I do believe, however, that many developers, help-desk people, network
managers, etc, will find it handy in debugging not only terminal sessions but
Web pages, word processors, network protocols, and files using Unicode-based
tools.
Kevin Bracey <kbracey@acorn.com> wrote:
> > Unicode already has a block of Control Pictures at U+2400 through
> > U+2421, but (except for "NL" at U+2424) these go horizontally across the
> > character cell, rather than diagonally, thus making them difficult to
> > distinguish from normal alphanumeric text. A new, parallel block of C0
> > control pictures is needed in which the abbreviations are displayed
> > diagonally.
>
> That's a glyph variation - the Unicode Standard explicitly states that you
> can use whatever preferred glyph you like for these. Indeed, IIRC, ISO
> 10646-1 has considerably different suggested glyphs for these characters.
>
(And many others concurred.) OK, this block is removed from Draft 2 of the
proposal, but some suggestions added for the next edition of the Unicode
Standard.
Asmus Freytag <asmusf@ix.netcom.com> wrote [On the same topic...]:
> And thus, at minumum, the table in the book should be altered to show all
> control pictures arranged diagonally, and all future control picture
> additions should also be arranged that way.
We are looking into this for Unicode 3.0. Although the mail discussion makes
clear that the distinction between characters and glyphs is widely known, it
makes no sense to depart from the established use in the one area the
characters are intended for! Since the two glyph forms are equivalent
(i.e. there's no question of changing the identity of the characters) such a
change is editorial in nature. For what it's worth, ISO 10646 uses the
diagonal forms (although incorrectly in a roman type face).
Kevin Bracey <kbracey@acorn.com> wrote:
> > E080 SP Space (like U+2420 but arranged diagonally)
> > E081 DEL Delete (Rubout) (2-character name: DT)
>
> These two are glyph variants of U+2420 and U+2421.
>
OK, these are removed too.
> > E082 LS1 Locking Shift 1 (ISO name for SO)
> > E083 LS0 Locking Shift 0 (ISO name for SI)
>
> Maybe these two could be considered glyph variants of U+240E and u+240F?
> Probably not, I suppose.
>
I've left them in, along with IS1 through IS4.
> > E0F0 Reverse Question Mark DEC VTxxx, Wyse, Televideo (1)
>
> I would suggest U+FFFD for this.
>
This was discussed at some length, but I've left it in, since many terminals
display this glyph, and for different purposes. It does not always mean
"unknown character received".
> Even ISO-Latin1 contains the reverse question mark at 0xBF, so it is no
> need to re-invent it.
>
As noted in previous postings, the ISO one is upside down, whereas this one
is upright.
Asmus Freytag <asmusf@ix.netcom.com> wrote:
> This important character [reverse question mark] is already on the list of
> characters to be added in one the coming amendments in ISO 10646.
kenw@sybase.com (Kenneth Whistler) wrote:
> As Asmus mentioned, this one is already on its way. It is encoded in
> Amendment 18 to 10646, which is just entering its last round of ballotting:
>
> U+2426 SYMBOL FOR SUBSTITUTE FORM TWO
>
> with the requisite shape of the reversed question mark.
>
Thanks; draft 2 amended accordingly.
Rick McGowan <rmcgowan@apple.com> wrote:
> Of course, there are a *lot* of controls, many control sets, and some
> degree of overlap, as Frank's proposal points out rather dramatically. I
> would suggest that he take up an attempt at serious unification of these
> things, and collect all of the wonderful data he's gathered into a "white
> paper" on how to use control pictures for what terminals, etc. With
> mapping tables, and a list of the minimum required additions to support
> full cross-mappings.
>
I have tried to do this in Draft 2.
Paul Keinanen <keinanen@sci.fi> wrote:
> If all octet values (00 .. FF) are also going to be displayed, there might
> be some ambiguity with some of the two letter codes, e.g. FF, D1, D2, D3,
> D4, EB and EC, which should be noted in the actual font design.
>
Thanks for noticing! A caution to this effect has been added to Draft 2.
> > C1 Control characters are specified in ISO-6429 and used in the VT220
> > family of terminals [5] and the Wyse 370 [26], where they are
> > represented in the right half of the "display controls" font as shown in
> > Table 4.3 (DEC terminals use the full name, Wyse terminals use the 2X
> > name). As with C0 controls, the "name" is displayed diagonally within
> > the character cell. Unicode presently includes no C1 control pictures.
>
> Looking through various EBCDIC code pages (e.g. IBM278, IBM880) and other
> unnumbered sets it appears that these control codes are all also available
> in EBCDIC, but of course at different positions (e.g. IND at 0x24). Some
> references to these sets are "IBM NLS RM Vol2 SE09-8002-01, March 1990"
> and "IBM 3270 Char Set Ref Ch 10, GA27-2837-9, April 1987".
>
Thanks for the reference. I found a complete listing of modern EBCDIC
(which has changed considerable since the System/360 days!) in the CDRA
Registry, and have totally revised the EBCDIC controls section in Draft 2.
> >Note that three of the C1 control pictures are unassigned (the ones
> >marked by "(1)", that would be at U+E020, U+E021, and U+E039 if these
> >were assigned). These positions should be left vacant in case names are
> >assigned to these characters in a future revision of ISO 6429.
>
> In ISO 8859-1 these are listed as
>
> 80 PADDING CHARACTER (PAD)
> 81 HIGH OCTET PRESET (HOP)
> 99 SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
I have both the ISO and ECMA versions of this standard and find no reference
to these or any other control characters. Nor can I find these characters
ISO 6429 or any of the control sets in the ISO Registry. Can you give a
more precise source?
> Then I am just wondering why:
>
> ftp://dkuug.dk/i18n/charmaps/CP819 (alias Latin1 alias ISO_8859-1:1987)
> lists:
> <PA> /x80 <U0080> PADDING CHARACTER (PAD)
> <HO> /x81 <U0081> HIGH OCTET PRESET (HOP)
> <GC> /x99 <U0099> SINGLE GRAPHIC CHARACTER INTRODUCER (SGCI)
>
> and ftp://dkuug.dk/i18n/charmaps.646/ISO_8859-1:1987
> lists the same code point values for these control characters
>
> <PA> /d128
> <HO> /d129
> <GC> /d153
>
> So I just wonder, where they at dkuug.dk/i18n have taken these C0 and C1
> codes from, unfortunately these tables did not contain any references (as
> did most EBCDIC tables).
> > 5. HEX BYTES
> >
> > Hexadecimal byte values, 2 hex digits each. Like display controls, but
> > for all 256 8-bit byte values...
>
> These would be very nice :-). Note the possible ambiguity with some two
> character control pictures r.g. FF, EB etc. So special precautions should be
> taken when designing the fonts.
>
Noted in Draft 2.
Karlsson Kent - keka <keka@im.se> wrote:
> ... (though the newly suggested hexadecimal-digit-pair display ones might
> continue to be useful; though hexadecimal digit quadruples would fill an
> entier plane and more! ;-)
Rick McGowan <rmcgowan@apple.com> wrote:
> > The single biggest category is hex bytes, which so far seems to have
> > received a warm reception.
>
> What does "warm reception" mean?
>
Some nice comments (like the ones just above).
Paul Keinanen <keinanen@sci.fi> wrote:
> > No attempt was made to account for the many Viewdata, Videotex, Minitel,
> > NAPLPS, or other mosaic graphics character sets. These should be
> > tackled, if appropriate, by someone who knows something about them.
>
> And not forgetting the tele-text block characters on European TVs. With
> the introduction of TV cards for PCs that also contains a teletext
> decoder, so there is a need to display the text and block graphics on
> PC. As far as I remember, the block graphic format is more or less the
> same as Viewdata with 2 columns and 3 rows per character cell, thus
> requiring 64 glyphs.
>
There are numerous mosaic graphics, Teletex, and similar character sets in
the ISO Register. Quite honestly, I have never even seen such a terminal
and do not feel qualified to propose how/if/when/whether this class of glyphs
should be handled in Unicode.
> All in all a very interesting proposal. By using as much existing
> characters from current Unicode standard, i guess there would be a greater
> likelyhood of getting thing officially approved.
>
In most places, the proposal does not bother enumerate all of the characters
used by these terminals that are already in Unicode -- and this evidently
leaves the false impression that they were not researched. Indeed they
were! If it is necessary to get the proposal passed, of course it can be
done.
Rick McGowan <rmcgowan@apple.com> wrote:
> > Still, if I were a font maker working from the Unicode book, I'd
> > probably copy the pictures in it, so again, I'd suggest the next edition
> > show the characters diagonally within the cell, and the accompanying text
> > (which if I can overlook, so can a font maker :-)
>
> Yes, yes, but... People should read, Grasshopper. It is that for which we
> write.
>
Yes, I should know this as well as anyone, having written several books
myself, which serve to varying degrees as software manuals, and which, if
users of the software would only read them, would save me my daily 6-8 hours
of question answering -- hence the smiley :-)
Karlsson Kent - keka <keka@im.se> wrote:
> I probably should not say this, but... If you are abolutely hardbent on
> having symbols for control codes, there should be some for the Unicode
> control codes too (like paragraph separator, left-to-right-mark, etc.)
> They need not be constructed from letters...
>
I have added a section on these to Draft 2. They are not needed for
terminal emulators (at least not yet), but might be handy in other contexts.
Tony Harminc <tzha0@amdahl.com> wrote:
> > E0F6 Padlock (keyboard locked) IBM 3270
>
> This last one introduces a bit of a problem, I think. It differs
> from all other characters mentioned in that it is never displayed in
> the data portion of a 3270 screen, but rather occurs "below the line"
> as an indication of keyboard status. If it is to be included, then
> there are several more uniquely 3270 characters that can be seen
> below the line; I don't know formal names for them, and indeed they
> generally don't appear in IBM's CDRA documents. Roughly, they are:
>
> Outline up arrow (indication of upshifted condition)
> Outline down arrow (indication of downshifted (override) condition)
> Key (indication of terminal physically locked (I think
> this may be what is meant by E0F6 above)
> Stick figure (terminal is connected to "operator" (really to a
> supervisory program))
> Solid block (terminal is connected to "application program")
> 4 in square box (terminal is connected to 3274-type control unit)
> 6 in square box (terminal is connected to 3276-type control unit)
> Lightning bolt (communication failure)
> Rectangle with slash (machine check)
> Printer symbol with slash (associated printer has an error condition)
>
These have been added in Draft 2 -- but just the ones not already in
Unicode (such as outline arrows, "4 in square box" which is really just
an inverse video "4" as far as a terminal is concerned, etc).
> and most problematic:
> Left half of clock (these two form a doublewidth clock (set at 6:10
> Right half of clock or 2:30, though I'm sure the time would be
> considered a matter of glyph - indeed at least
> one non-IBM manufacturer's clock symbol was 5:50
> or 10:30)
>
I don't have an actual 3270 terminal to look at just now, but I did manage
to scrape up the IBM 3270 Component Description manual, which lists (and
illustrates) all the special glyphs shown in the Operator Information Area,
in which there is nothing to suggest that the clock is made from two
character cells. In fact, it looks quite round to me :-)
Even if it is made from pieces, I assume there is no way to see them in
isolation, and so there should be no harm in encoding the clock as a single
glyph (and then, if necessary, show it in double size).
> Now it's entirely reasonable to argue that all the above (and I may
> have forgotten a couple) have no business being encoded at all.
> Indeed some terminal emulators use graphical means to produce the
> symbols. In any case there is nothing in the 3270 architecture that
> specifies use of any of them, and an emulator program can use other
> means to communicate the same information to the user. However a
> number of Windows-based emulators I know do use glyphs encoded in a
> font that they supply to produce at least a subset of the symbols.
> (It should be pointed out that a number of "ordinary" glyphs can also
> appear below the line, but I can think of no reason not to unify them
> with the upper case letters, numbers, and so on.)
>
Right. The reason for including the special glyphs appears at the top of
this message.
> That IBM doesn't include them in CDRA may be a good reason to exclude
> them from this proposal. But they can be genuinely useful for
> writers of emulators. What to do ? And how many clocks and stick
> figures is it reasonable to encode ?
>
In Draft 2, I'm listing one of each (I retired the SNI 3:00 clock and
stick figure with hat).
(Yes, I know that on the RS/6000 there is a little animated "running man"
who can stop, fall down, etc, as an indicator of the system status, but
that's above and beyond...)
Elliotte Rusty Harold <elharo@sunsite.unc.edu> wrote:
> > E0B4 Latin capital letter H with bar SNI Math 04/05 (2)
> > E0B5 Latin small letter h with bar SNI Math 04/06 (2)
>
> Is E0B5 supposed to be Planck's constant over 2*PI? If so, it's encoded at
> 210F, 0127, and 045B. And your E0B4 is at 0126.
>
Who knows what it's supposed to be! In any case, I looked harder and found
barred H's and T's, dotted L's, etc (which look just right for the SNI
character set), as well as some Engs, in Latin Extended A (U+0100..) and so
removed them from the proposal.
As a result of all your comments, and further research, Draft 2 should be
much tighter in terms of unifications, but also more complete -- win some,
lose some :-)
It's coming up in the next message. NOTE: If it is the sense of the readers
that these proposals should no longer be posted here, but rather just
pointers to them, I'm happy to comply. In case you want to skip the next
draft in email, the pointer is:
ftp://kermit.columbia.edu/kermit/charsets/ucsterminal.txt
Thanks again!
- Frank
8-Oct-98 1:12:12-GMT,1571;000000000011
Return-Path: <rmcgowan@scv4.apple.com>
Received: from mail-out1.apple.com (mail-out1.apple.com [17.254.0.52])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id VAA08504
for <fdc@watsun.cc.columbia.edu>; Wed, 7 Oct 1998 21:12:11 -0400 (EDT)
Received: from mailgate.apple.com (A17-128-100-225.apple.com [17.128.100.225])
by mail-out1.apple.com (8.8.5/8.8.5) with ESMTP id SAA40798
for <fdc@watsun.cc.columbia.edu>; Wed, 7 Oct 1998 18:06:40 -0700
Received: from scv4.apple.com (scv4.apple.com) by mailgate.apple.com
(mailgate.apple.com - SMTPRS 2.0.15) with ESMTP id <B0002724774@mailgate.apple.com> for <fdc@watsun.cc.columbia.edu>;
Wed, 07 Oct 1998 18:06:30 -0700
Received: from rangda (rangda.apple.com [17.202.14.171])
by scv4.apple.com (8.8.5/8.8.5) with SMTP id SAA56586
for <fdc@watsun.cc.columbia.edu>; Wed, 7 Oct 1998 18:06:29 -0700
Message-Id: <199810080106.SAA56586@scv4.apple.com>
To: fdc@watsun.cc.columbia.edu
Subject: Re: Terminal Graphics Draft 2
Date: Wed, 7 Oct 1998 18:06:29 -0700
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: rmcgowan@apple.com
Received: by Apple.Mailer (2.95.2)
Thanks Frank for working on the next draft. But PLEASE, in future *DO NOT*
post 59.1k worth of draft document to the Unicode list!!! I get enough
megabytes of e-mail per day. And some people pay for downloading and
connect-time.
A pointer to:
ftp://kermit.columbia.edu/kermit/charsets/ucsterminal.txt
is sufficient. Anyone out there without access to either a web browser OR
ftp should ask for a copy of the draft via private e-mail.
Rick
8-Oct-98 6:44:07-GMT,2499;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id CAA22615
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 02:44:06 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id XAA08294
; Wed, 7 Oct 1998 23:43:45 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA14555; Wed, 7 Oct 98 23:39:36 -0700
Message-Id: <9810080639.AA14555@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
X-Uml-Sequence: 6106 (1998-10-08 06:39:23 GMT)
From: brox@corena.no (Bjorn Brox)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Wed, 7 Oct 1998 23:39:21 -0700 (PDT)
Subject: Re: Collected Comments on Terminal Graphics Proposal
Content-Transfer-Encoding: 8bit
Frank da Cruz wrote this:
>
> Thanks to all who commented on the Terminal Graphics proposal. Here are
> some collected responses to particular points.
...
> > And not forgetting the tele-text block characters on European TVs. With
> > the introduction of TV cards for PCs that also contains a teletext
> > decoder, so there is a need to display the text and block graphics on
> > PC. As far as I remember, the block graphic format is more or less the
> > same as Viewdata with 2 columns and 3 rows per character cell, thus
> > requiring 64 glyphs.
> >
> There are numerous mosaic graphics, Teletex, and similar character sets in
> the ISO Register. Quite honestly, I have never even seen such a terminal
> and do not feel qualified to propose how/if/when/whether this class of glyphs
> should be handled in Unicode.
The national norwegian teletext service on WWW is using a
Teletex-font (nrkttv.ttf) when properly configured.
http://www.nrk.no/teksttv (sorry, it's in Norwegian)
Wouldn't it be nice to be able to cut and paste from such a window?
You should also take a look on the corporate use subarea defined by Adobe
Systems.
http://www.adobe.com/supportservice/devrelations/typeforum/corporateuse.txt
http://www.adobe.com/supportservice/devrelations/typeforum/unicodegn.html
Some of your maths symbols, and probably others is covered by this range..
--
Bjorn Brox, CORENA Norge AS, http://www.corena.no/
Kirkegaardsvn. 45, P.O.Box 1024, N-3601 Kongsberg, NORWAY
Phone: +47 32737435, Fax: +47 32736877, Mobile: +47 92638590
8-Oct-98 8:55:12-GMT,5092;000000000001
Return-Path: <keka@im.se>
Received: from www.im.se (fw.im.se [193.14.22.222])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id EAA14769
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 04:55:07 -0400 (EDT)
Received: from imhps.im.se (imhps.im.se [192.36.35.5])
by www.im.se (8.9.1/8.9.1) with ESMTP id KAA07148;
Thu, 8 Oct 1998 10:36:53 +0200 (METDST)
Received: from msxsth1.im.se by imhps.im.se (1.37.109.16/IM-3.12)
id AA106666867; Thu, 8 Oct 1998 10:54:27 +0200
Received: by msxsth1 with Internet Mail Service (5.5.2232.9)
id <TX2W2GSD>; Thu, 8 Oct 1998 10:52:29 +0200
Message-Id: <C110A2268F8DD111AA1A00805F85E58D57DC41@ntgbg1>
From: Karlsson Kent - keka <keka@im.se>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Cc: "'kenw@sybase.com'" <kenw@sybase.com>,
"'rmcgowan@apple.com'"
<rmcgowan@apple.com>,
"'asmusf@ix.netcom.com'" <asmusf@ix.netcom.com>,
"'kbracey@acorn.com'" <kbracey@acorn.com>,
"'Markus.Kuhn@cl.cam.ac.uk'"
<Markus.Kuhn@cl.cam.ac.uk>,
"'cowan@locke.ccil.org'"
<cowan@locke.ccil.org>
Subject: RE: Terminal Graphics Draft 2
Date: Thu, 8 Oct 1998 10:52:00 +0200
Mime-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
> Rusty Harold, Paul Keinanen, Karlsson Kent, Rick McGowan, Kenneth
> Whistler.
>
Well, it is Kent Karlsson (but so what...).
> Math Symbols
> Although most math symbols found on terminals are already in Unicode,
> certain terminal-based applications rely on the ability to construct
> large
> symbols (integral and summation signs, braces, brackets) from smaller
> character-cell-sized pieces. Section 6.
>
Some of these are used INTERNALLY in Knuth's TeX (and INTERNALLY in the now
hopefully retired troff). Users CANNOT access these glyph pieces directly
in these systems, if I remember correctly (nor would they want to). I don't
know to what extent they may be considered to be "characters" in dvi files
(which are never written by humans).
> 4. HEX BYTES
>
> Hexadecimal byte values, 2 hex digits each, allow any 8-bit byte to be
> displayed in hexadecimal in a single character cell (and therefore allow
> any
> Unicode character value to be displayed in two cells),
>
Well, if in UTF-16 one would need 2 OR 4 such per character. If in UTF-8
one would need from 1 to 4 such per character (assuming only UTF-16 space is
used, otherwise UTF-8 can have up to 6 octets per character).
> Table 5.0: Unicode Control Characters
>
> Code Val Name Description
> E000 2000 NQ SP Symbol for En Quad
> E001 2001 MQ SP Symbol for Em Quad
> E002 2002 EN SP Symbol for En Space
> E003 2003 EM SP Symbol for Em Space
> E004 2004 3/M SP Symbol for Three-Per-Em-Space
> E005 2005 4/M SP Symbol for Four-Per-Em-Space
> E006 2006 6/M SP Symbol for Six-Per-Em-Space
> E007 2007 F SP Symbol for Figure Space
> E008 2008 P SP Symbol for Punctuation Space
> E009 2009 TH SP Symbol for Thin Space
> E00A 200A H SP Symbol for Hair Space
> E00B 200B ZW SP Symbol for Zero-Width Space
> E00C 200C ZW NJ Symbol for Zero-Width Non-Joiner
> E00D 200D ZW J Symbol for Zero-Width Joiner
> E00E 200E LRM Symbol for Left-to-Right Mark
> E00F 200F RLM Symbol for Right-to-Left Mark
> E010 2028 L SEP Symbol for Line Separator
> E011 2029 P SEP Symbol for Paragraph Separator
> E012 202A LRE Symbol for Left-to-Right Embedding
> E013 202B RLE Symbol for Right-to-Left Embedding
> E014 202C PDF Symbol for Pop Directional Formatting
> E015 202D LRO Symbol for Left-to-Right Override
> E016 202E RLO Symbol for Right-to-Left Override
> E017 206A I SS Symbol for Inhibit Symmetric Swapping
> E018 206B A SS Symbol for Activate Symmetric Swapping
> E019 206C I AFS Symbol for Inhibit Arabic Form Shaping
> E01A 206D A AFS Symbol for Activate Arabic Form Shaping
> E01B 206E NA DS Symbol for National Digit Shapes
> E01C 206F NO DS Symbol for Nominal Digit Shapes
> E01D FEFF ZWN BSP Symbol for Zero Width No Break Space
> E01E FFFE FF FE Symbol for Not A Character (Byte Order) (2)
> E01F FFFE FF FF Symbol for Not A Character (2)
>
I think these have more room for glyph invention, since there is no need to
be at all compatible with existing terminals. Like grey(ish) arrows,
grey(ish) section mark or pilcrow, grey(ish) 'space box with annotation',
etc., rather than letters.
> Table 5.2: C1 Control Characters
>
> Code Val Name 2X Description
> 80 80 (1)
> 81 81 (1)
> E022 82 BPH Symbol for Break Permitted Here (2)
> E023 83 NBH Symbol for No Break Here (2)
>
Aren't these two 'control codes' the same as the Unicode characters Zero
Width Space and Zero Width No-Break Space?
> E024 84 IND IN Symbol for Index (3)
> E025 85 NEL NL Symbol for Next Line
>
newline??? again?
Regards
/kent k
8-Oct-98 13:09:08-GMT,2066;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id JAA11405
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 09:09:07 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id GAA30778
; Thu, 8 Oct 1998 06:08:22 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA15702; Thu, 8 Oct 98 06:03:47 -0700
Message-Id: <9810081303.AA15702@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
X-Uml-Sequence: 6107 (1998-10-08 13:02:16 GMT)
From: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 06:02:15 -0700 (PDT)
Subject: RE: Terminal Graphics Proposal
I can see value for encoding the paired hex digits (00 to FF) in the
proposal. However with appropriate rendering software, I could also see
having them merely as a glyph variant for rendering. I might compare this
to encoding the Braille script.
1. Two of the glyphs could be used to display a 16-bit Unicode
character (e.g., for debugging or for displaying an unknown character)
2. Protocol analyzers for communications and LANs use these glyphs to
display captured data. (Today, the Network Associates (formerly, Network
General) Sniffer is perhaps the most widely recognized device. 20 years
ago, it was the Spectron DataScope. Both of these are likely trademarks.)
They have 2 display modes, hex and text (typically ASCII, EBCDIC). In my
youth, these devices used hardware fonts in ROM and TV-resolution CRTs.
Now, these devices tend to be PCs or computers with embedded software. If
the manufacturers of such equipment want to display characters beyond the
7-bit ASCII set, Unicode is the natural choice.
3. The glyphs are used for debugging communication problems and
software problems with the "real" terminals (rather than PCs emulating the
terminals).
Ed Hart
8-Oct-98 13:27:17-GMT,1078;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id JAA15084;
Thu, 8 Oct 1998 09:27:15 -0400 (EDT)
Date: Thu, 8 Oct 98 9:27:15 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: rmcgowan@apple.com
Subject: Re: Terminal Graphics Draft 2
In-Reply-To: Your message of Wed, 7 Oct 1998 18:06:29 -0700
Message-ID: <CMM.0.90.4.907853235.fdc@watsun.cc.columbia.edu>
> Thanks Frank for working on the next draft. But PLEASE, in future *DO NOT*
> post 59.1k worth of draft document to the Unicode list!!! I get enough
> megabytes of e-mail per day. And some people pay for downloading and
> connect-time.
>
> A pointer to:
>
> ftp://kermit.columbia.edu/kermit/charsets/ucsterminal.txt
>
> is sufficient. Anyone out there without access to either a web browser OR
> ftp should ask for a copy of the draft via private e-mail.
>
OK. That was my original idea (I suggested keeping the discussion private
to spare the bulk of Unicode readers, but nobody seemed to want it that way).
I'll post pointers from now on.
- Frank
8-Oct-98 16:28:04-GMT,1105;000000000001
Return-Path: <kenw@sybase.com>
Received: from inergen.sybase.com (inergen.sybase.com [192.138.151.43])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id MAA08787
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 12:28:03 -0400 (EDT)
Received: from smtp1.sybase.com (sybgate.sybase.com [130.214.220.35])
by inergen.sybase.com (8.8.4/8.8.4) with SMTP
id JAA17259; Thu, 8 Oct 1998 09:29:17 -0700 (PDT)
Received: from birdie.sybase.com by smtp1.sybase.com (4.1/SMI-4.1/SybH3.5-030896)
id AA16596; Thu, 8 Oct 98 09:28:05 PDT
Received: by birdie.sybase.com (5.x/SMI-SVR4/SybEC3.5)
id AA19822; Thu, 8 Oct 1998 09:27:56 -0700
Date: Thu, 8 Oct 1998 09:27:56 -0700
From: kenw@sybase.com (Kenneth Whistler)
Message-Id: <9810081627.AA19822@birdie.sybase.com>
To: keka@im.se
Subject: RE: Terminal Graphics Draft 2
Cc: fdc@watsun.cc.columbia.edu, kenw@sybase.com
X-Sun-Charset: US-ASCII
> > E024 84 IND IN Symbol for Index (3)
> > E025 85 NEL NL Symbol for Next Line
> >
> newline??? again?
I agree with Kent. Shouldn't this simply be:
U+2424 SYMBOL FOR NEWLINE
?
--Ken
8-Oct-98 18:33:27-GMT,2068;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA18263
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 14:33:25 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA43790
; Thu, 8 Oct 1998 10:10:50 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA16630; Thu, 8 Oct 98 09:57:59 -0700
Message-Id: <9810081657.AA16630@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6109 (1998-10-08 16:57:33 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 09:57:32 -0700 (PDT)
Subject: RE: Terminal Graphics Draft 2
> > > E024 84 IND IN Symbol for Index (3)
> > > E025 85 NEL NL Symbol for Next Line
> > >
> > newline??? again?
>
> I agree with Kent. Shouldn't this simply be:
>
> U+2424 SYMBOL FOR NEWLINE
>
That depends on what the (unstated) semantics are for U+2424.
I expect it simply represents a "line terminator", like LF in
UNIX, CR on the Macintosh, or CRLF in DOS.
NEL stands for Next Line (not Newline). The definition of NEL
in ISO 6429 [8.3.87] is rather complex: "The effect of NEL
depends on the setting of the DEVICE COMPONENT SELECT MODE
(DCSM) and on the parameter value of SELECT IMPLICIT MOVEMENT
DIRECTION (SIMD)." Several paragraphs go on to explain this.
Confusingly, terminals that support C1 controls and that also
use 2-character abbreviations for them abbreviate NEL as NL.
However, the VT220 class of terminals actually puts "NEL" on
the screen in display-controls mode. All this in contrast to
EBCDIC, which defines an actual NL character, which I doubt
carries the ISO NEL semantics.
- Frank
P.S. Sorry for getting your name backwards, Kent. And for
omitting Geoffrey Waigh from the acknowledgements. Both errors
fixed in my working copy. Also, no more posting long drafts;
just pointers from now on.
8-Oct-98 18:55:44-GMT,765;000000000001
Return-Path: <Ed.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA24567
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 14:55:40 -0400 (EDT)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <41YQFQF6>; Thu, 8 Oct 1998 14:55:38 -0400
Message-ID: <91D1D51C2955D111B82B00805F19989501CD7119@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Subject: RE: Terminal Graphics Draft 2
Date: Thu, 8 Oct 1998 14:55:36 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain
Thanks for championing this effort.
It seems to be going very well.
Ed
8-Oct-98 19:01:11-GMT,751;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id PAA26296;
Thu, 8 Oct 1998 15:01:05 -0400 (EDT)
Date: Thu, 8 Oct 98 15:01:05 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
Subject: RE: Terminal Graphics Draft 2
In-Reply-To: Your message of Thu, 8 Oct 1998 14:55:36 -0400
Message-ID: <CMM.0.90.4.907873265.fdc@watsun.cc.columbia.edu>
> Thanks for championing this effort.
>
> It seems to be going very well.
>
Thanks for saying so, Ed (and good to hear from you).
Yup, we old timers have to stick up for what's right :-)
Maybe a few more USS Yorktown incidents will get more
people longing for the good old days when things actually
worked...
- Frank
8-Oct-98 19:23:24-GMT,2765;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id PAA03172;
Thu, 8 Oct 1998 15:23:22 -0400 (EDT)
Date: Thu, 8 Oct 98 15:23:22 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
Subject: RE: Terminal Graphics Draft 2
In-Reply-To: Your message of Thu, 8 Oct 1998 15:11:38 -0400
Message-ID: <CMM.0.90.4.907874602.fdc@watsun.cc.columbia.edu>
> This is how the work really gets done. Someone with the need and the
> expertise gets the ball rolling. I must say that I was impressed by the
> depth of your first draft. Doing Kermit did not hurt.
>
Yes, I learned long ago that the person who cares -- and can write it down --
can usually get it done.
> This seems like an issue that SHARE needs to support given all of the legacy
> systems in use by our organizational members.
>
Don't say "legacy"! I hate that. It means "Not Microsoft Windows and so
deserves to be ground into fine powder at the earliest opportunity but because
we are so stupid and lazy we can't do it yet, please don't hate us, we are
so ashamed." (Seriously, my personal mission is to expunge that word from
the computing lexicon.)
> Since we swapped our
> mainframe for lots of VMS and NT systems, I don't go to SHARE anymore. It's
> nice to have an issue where I know that SHARE has a definite interest in the
> outcome. I believe that you have made the point about the need for terminal
> emulation and the key players on the UTC accept the argument. If you can
> sell Rick McGowan, Ken Whistler, and Asmus Freytag, the rest of the UTC will
> accept the proposal.
>
Then I guess looks good, since they are mainly quibbling about individual
characters and not the entire idea.
> BTW, pardon my ignorance, but what was the Yorktown incident?
>
The US Navy has a "smart ship" program, meaning everything is controlled by
computer. The USS Yorktown is a guided missile frigate entirely controlled
by a network PCs running Windows NT (installed, naturally, over the vigorous
objections of the technical people). According to a front page article in
Government Computer News, the network froze and the ship turned itself off,
engine, rudder, and all. No amount of prodding would bring it back to life.
It had to be ignomiously towed back to port. Evidently this has happened more
than once. The fact that no no missiles were launched is a pretty lucky break.
Sigh. Microsoft evidently has a pretty cozy deal with US government -- NT
is the only platform that any government installation can just buy, without
any sort of approval. Anything else -- mainframes, UNIX, VMS, etc -- requires
mountains of paperwork, RFPs, RFQs, sealed bids, etc etc.
Oh well, don't get me started :-)
- Frank
8-Oct-98 19:27:09-GMT,834;000000000001
Return-Path: <Ed.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA04303
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 15:27:09 -0400 (EDT)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <41YQFQK2>; Thu, 8 Oct 1998 15:27:02 -0400
Message-ID: <91D1D51C2955D111B82B00805F19989501CD711C@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Subject: RE: Terminal Graphics Draft 2
Date: Thu, 8 Oct 1998 15:27:01 -0400
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
Thanks for the feedback. I had not heard of the Yorktown.
I'll try to avoid the term, legacy. : )
Best regards,
Ed
8-Oct-98 19:42:48-GMT,6418;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA10390
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 15:42:34 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA23382
; Thu, 8 Oct 1998 11:24:11 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA17365; Thu, 8 Oct 98 10:52:54 -0700
Message-Id: <9810081752.AA17365@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6112 (1998-10-08 17:51:24 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 10:51:23 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank -- I reviewed the latest draft and have more comments...
> ... All proposed characters have Combining Class 0 (although
> some of the characters are designed to "combine" (connect) with other
> characters in adjacent cells).
You might re-word the above a bit:
... (although some of the corresponding glyphs must be designed to
"combine" (connect) with other glyphs in adjacent display cells).
> Digital VT220 and higher terminals, as well as Televideo, Wyse, HP, Perkin
> Elmer, and other models, allow the user to select whether control characters
> are acted upon or displayed graphically. Unicode itself includes its own
Well, to my mind that indicates that these aren't NEEDED to be encoded at
all! Just set the terminal emulator into the "display controls" mode and let
it display the glyphs that the emulator has for the control codes. They
should not need to be encoded, since they're merely a variant representation
that the terminal does internally. It's unfortunate that my argument is
weakend by the fact that we already have a bunch of control pix encoded...
I do have a problem generally with adding "picture" characters to correspond
to existing things that are Unicode-specific. For instance, I see it as
just pointless to add "Symbol for En Quad" or "Symbol for Right-to-Left
Override" or other such things, unless you can show that this and other such
codes are absolutely necessary for supporting the emulation of these Actual
Physical Terminals. Of course you say:
> (1) There is no known need for these symbols when emulating current
> terminals. In the future, if/when terminals are based on Unicode, they
> might be useful in that context. In the meantime, makers of word
> processors, Web browsers, etc, might have a use for these glyphs.
So, it's my opinion that we should note the possible use, and move on.
I.e., don't propose adding them now on the off chance the *someone* might
have a user for them. Wait for a use. It's perfectly possible in
implementations to have a "show control" mode that shows controls as glyphs,
without having the pictures for them encoded as characters. So if characters
aren't in support of the graphical requirements for Actual Physical
Terminals, they should not be proposed. Or should be proposed separately.
((Here's a little side-bar... It's sometimes desirable to separate things
into independent proposals so that characters which appear to be
"non-controversial" or less controversial can be put into one proposal and
the controversial stuff in another. That way, when committees look at them
"formally" and vote, things move more quickly. In practice, this can lead to
forward progress in pieces rather than multiple rounds back into the draft
stage for an entire set of stuff. This happened recently with Tibetan
extensions that were recently approved for addition. The ad-hoc group of
experts removed everything controversial and quickly came to consensus on an
agreed set for immediate proposal. If they had waited until the last bit of
controversy were resolved for a few items, they would still not have a
proposal today.))
I guess it would be nice to see this document broken into two really major
sections -- one is an analysis of the existing controls, with recommendations
about usage and mappings to character sets for popular Actual Physical
Terminals, as best you're able to determine. The other section would be
proposed additions.
> Table 5.2: C1 Control Characters
Table 5.2 is particularly valuable information of the "here's what exists"
variety... and given the widespread use of ISO-6249 controls, it is probably
worth adding these. You also say in the notes "ISO-6428". Is that different
from 6429? Or just a typo?
> 5.3. EBCDIC Control Pictures
Likewise, this is valuable information. It would be good to somehow call
out the proposed additions, perhaps by putting an asterisk before or after
the names. Because they're in EBCDIC order I found it a bit hard to discern
precisely which are proposed additions.
Someone from IBM should look at the 3270 stuff... I suppose someone will do so.
Another thing that should be discussed is when adding "symbol for foo" one
should also add "foo" itself. For instance, there is no "Start of Field"
control character; but a picture of it is being proposed. Probably UTC needs
to hash through *that* issue...
> Table 6.1: Math Symbols for Terminals
You should look at the glyph pieces in the Adobe Symbol font, which is a
widely used font. Many of these are contained in the Symbol font (0xE6 to
0xFE inclusive).
I believe the following two characters are just masculine and feminine
ordinal indicators, and are already encoded between 0x80 and 0xFF, as part of
ISO Latin 1. They are probably just variant glyphs... unless the
documentation distinguishes them and they occur in pairs with lower-case. Do
you mean "small" or "capital"? Or are they really different?
> E0B3 Latin small letter a with underbar SNI Math 04/04 (2)
> E0B4 Latin capital letter O with underbar SNI Math 04/09 (2)
By the way, I'm opposed quite strongly to adding the 256 "hex bytes" under
any circumstances. Good thing they're an indepenedent proposal. The total
proposed, including Hex Bytes is 448. Without Hex Bytes, it's a modest 192,
and I think it could be reduced with a little more unification. Of course
reduction will offset the expected increase due to other terminals clamoring
to be included...
Rick
8-Oct-98 20:29:17-GMT,1895;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA23231
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 16:29:16 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA06860
; Thu, 8 Oct 1998 12:55:10 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA18332; Thu, 8 Oct 98 12:46:59 -0700
Message-Id: <9810081946.AA18332@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain;
charset="iso-8859-1"
X-Uml-Sequence: 6114 (1998-10-08 19:46:44 GMT)
From: "Hart, Edwin F." <Ed.Hart@jhuapl.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: "'Unicode List'" <unicode@unicode.org>
Date: Thu, 8 Oct 1998 12:46:43 -0700 (PDT)
Subject: RE: Terminal Graphics Draft 2
What should the "customary and familiar mnemonic" be?
One of my concerns is that the names for the EBCDIC controls seemed to vary
from device to device and with different editions of the IBM "green card"
(System/360 Reference Summary) (and "yellow card" and "pink card"). I'm
unsure how much of this has solidified and/or disappeared with IBM's SNA
devices because I have not looked at any of this in over 10 years.
Ed Hart
----------
From: Frank da Cruz [SMTP:fdc@watsun.cc.columbia.edu]
Sent: 08 October, 1998 13:37
To: Unicode List
Subject: RE: Terminal Graphics Draft 2
. . .
I'm sure we could also find other examples of control characters
in the C1 and EBCDIC sets whose semantics are the same or close
but whose names differ; I don't think that means we should unify
them. The purpose of "display controls" is to show the customary
and familiar mnemonic for each control character in its context so
people can read them easily.
- Frank
8-Oct-98 20:52:59-GMT,1934;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA01890
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 16:52:58 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id NAA30836
; Thu, 8 Oct 1998 13:18:15 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA18816; Thu, 8 Oct 98 13:09:10 -0700
Message-Id: <9810082009.AA18816@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6115 (1998-10-08 20:09:01 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: unicode@unicode.org
Date: Thu, 8 Oct 1998 13:09:00 -0700 (PDT)
Subject: RE: Terminal Graphics Draft 2
> What should the "customary and familiar mnemonic" be?
>
> One of my concerns is that the names for the EBCDIC controls seemed to vary
> from device to device and with different editions of the IBM "green card"
> (System/360 Reference Summary) (and "yellow card" and "pink card"). I'm
> unsure how much of this has solidified and/or disappeared with IBM's SNA
> devices because I have not looked at any of this in over 10 years.
>
I took the EBCDIC names from the current reference, and do indeed note the
fact that the names have changed over the years, and include a table of
original names for reference. Is there any sentiment in favor of actually
encoding the old names?
By the way, I also omitted the mnemonics of numerous special-purpose control
sets found in the ISO Register since, to my knowledge, no terminal ever
displays these mnemonics in "display controls" mode, nor does any kind of
protocol analyzer or data scope. I think people who look at "display
controls" screens will be satisfied with the proposed familiar set of
mnemonics. But I could be wrong.
- Frank
8-Oct-98 21:19:51-GMT,6421;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA09529
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 17:19:47 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA10974
; Thu, 8 Oct 1998 14:18:08 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19370; Thu, 8 Oct 98 14:07:51 -0700
Message-Id: <9810082107.AA19370@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6117 (1998-10-08 21:07:34 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 14:07:31 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Rick McGowan wrote:
> Frank -- I reviewed the latest draft and have more comments...
>
I appreciate it, thanks.
> I do have a problem generally with adding "picture" characters to
> correspond to existing things that are Unicode-specific.
> ...
> So, it's my opinion that we should note the possible use, and move on.
> I.e., don't propose adding them now on the off chance the *someone* might
> have a user for them. Wait for a use.
>
Fine with me!
> Table 5.2 is particularly valuable information of the "here's what exists"
> variety... and given the widespread use of ISO-6249 controls, it is probably
> worth adding these. You also say in the notes "ISO-6428". Is that
> different from 6429? Or just a typo?
>
It's a typo; thanks for spotting it.
> > 5.3. EBCDIC Control Pictures
>
> Likewise, this is valuable information. It would be good to somehow call
> out the proposed additions, perhaps by putting an asterisk before or after
> the names. Because they're in EBCDIC order I found it a bit hard to discern
> precisely which are proposed additions.
>
I suppose the proposal is rather dense -- the inevitabal tug-of-war between
saying everything everywhere, thus making it so long nobody will read it, or
presuming it is read from top to bottom so everything is explained in advance
but must be remembered (the topological sort). The marking of new additions
is in the left ("Code") column. If the code is Exxx it is to be added;
otherwise it is already in Unicode (usually in the U+2xxx's).
But OK, I'll try to highlight them better.
> Someone from IBM should look at the 3270 stuff... I suppose someone will
> do so.
>
I was hoping for some feedback from the IBM mainframe camp too; not just
3270 users, but also those who analyze and debug 3270 data streams. If any
readers happen to know people outside this group who might be interested,
please feel free to forward the proposal to them.
> Another thing that should be discussed is when adding "symbol for foo" one
> should also add "foo" itself. For instance, there is no "Start of Field"
> control character; but a picture of it is being proposed. Probably UTC
> needs to hash through *that* issue...
>
Oh what a tangled web we weave... I think in this case we have an exception
to the rule. I think we can say that Unicode is ISO/ASCII based rather than
EBCDIC based. The structure of U+0000 through U+00FF is identical with
ASCII (= ISO 646 International Reference Version) + ISO 8859-1, with the
layout of ISO 4873 (C0, GL, C1, GR). The C0 control set is, indeed, the
ASCII C0 set (and that of ISO 646; ISO Registry number #001). Granted, the
C1 area is left unspecified, but what else could it be but that of ISO 6429?
I think it would be pretty weird (note: this is how we spell "weird" this
week...) to add EBCDIC controls to an ISO/ASCII based character set.
Personally, I'd rather leave them out and use the positions they would
occupy for something more useful. But the *symbols* for them do need
encoding, since we will be using Unicode-based software to analyze EBCDIC
and/or 3270 data streams (wire bearing EBCDIC comes into PC, which uses
Unicode internally). However, I would heartily welcome review by IBM or
other EBCDIC/3270-centric party of the specific repertoire of glyphs in the
proposal.
> You should look at the glyph pieces in the Adobe Symbol font, which is a
> widely used font. Many of these are contained in the Symbol font (0xE6 to
> 0xFE inclusive).
>
All the more reason to add them to Unicode. Another, as Kent Karlsson
pointed out earlier today, is that they are used in TeX (see the original
TeX and METAFONT book, p.175: TeX Standard Extension Fonts).
> I believe the following two characters are just masculine and feminine
> ordinal indicators, and are already encoded between 0x80 and 0xFF, as part
> of ISO Latin 1. They are probably just variant glyphs... unless the
> documentation distinguishes them and they occur in pairs with lower-case.
> Do you mean "small" or "capital"? Or are they really different?
>
> > E0B3 Latin small letter a with underbar SNI Math 04/04 (2)
> > E0B4 Latin capital letter O with underbar SNI Math 04/09 (2)
>
Well, "small" means lowercase; "capital" actually means "big" -- who can
tell with an "O"! Hopefully I'll be able to post GIFs of scanned pages
soon; that'll be a day's work! The reason these need to be encoded
separately from feminine/masculine ordinals are their size -- they fill the
whole cell, like a regular letter. Since terminal emulators and data
analyzers use fixed-pitch fonts, we can't just switch to another point size
to display these characters, since that will wreck the matrix arrangement
of the screen.
> By the way, I'm opposed quite strongly to adding the 256 "hex bytes" under
> any circumstances. Good thing they're an indepenedent proposal.
>
I certainly would not want to see them hold up the rest.
> The total proposed, including Hex Bytes is 448. Without Hex Bytes, it's a
> modest 192, and I think it could be reduced with a little more unification.
> Of course reduction will offset the expected increase due to other
> terminals clamoring to be included...
>
Yes, I see the mobs starting to form on the street below, waving placards
emblazoned with vertical lightnings with solidi; diagonal lightnings with
horizontal bars; European no-parking signs; Canadian moose-crossing signs...
Seriously, the hex bytes are entirely separable from the rest. I'll be
glad to cut them loose unless somebody speaks up strongly in their favor.
Thanks again!
- Frank
8-Oct-98 21:31:49-GMT,6665;000000000011
Return-Path: <tzha0@amdahl.com>
Received: from orpheus.amdahl.com (orpheus.amdahl.com [159.199.101.3])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with SMTP id RAA13325
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 17:31:47 -0400 (EDT)
Received: from minerva.amdahl.com([129.212.33.25]) (6252 bytes) by orpheus.amdahl.com
via sendmail with P:smtp/R:match-mx-hosts/T:smtp
(sender: <tzha0@amdahl.com>)
id <m0zRNeu-0008eGC@orpheus.amdahl.com>
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 14:31:40 -0700 (PDT)
(Smail-3.2.0.102 1998-Aug-2 #1 built 1998-Aug-14)
Received: from libra by minerva.amdahl.com with smtp
(Smail3.1.29.1 #5) id m0zRNdh-0001THC; Thu, 8 Oct 98 14:30 PDT
Message-Id: <m0zRNdh-0001THC@minerva.amdahl.com>
From: "Tony Harminc" <tzha0@amdahl.com>
To: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Date: Thu, 8 Oct 1998 17:31:54 -0400
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: Terminal Graphics Draft 2
Priority: normal
In-reply-to: <9810080053.AA13270@unicode.org>
X-mailer: Pegasus Mail for Win32 (v3.01a)
Just a few very minor comments. In general my comments are not meant
to be inserted as text - they're for your information. I've left
headings in to help identify the places.
> This document represents a survey of the following terminals:
> IBM 3164 and 3270 [15,16,27]
Really, "3270" is not a terminal, i.e. there has never been a device
made by IBM with that model number. Rather, 3270 is an architecture,
with a large number of IBM terminals having been made that conform in
varying degrees to its specifications. Typical 3270 model numbers
are 3277 (the earliest implementation c. 1971), 3278 (the thing that
most people think of when they think of a "real" 3270, c. 1977), and
3178 (a simpler, cheaper version c. 1982).
> 3.1. Temporary Reference Code Assignments
>
> The characters proposed in this document are assigned temporary Unicode
> values from the Private Use area, strictly for reference within (or to)
> this document only. Final values should be assigned out of the Private
^^^^^^
Should probably read "outside of" to avoid ambiguity
> 5.3. EBCDIC Control Pictures
> Table 5.3 shows the EBCDIC control characters [29], in EBCDIC order. The
> Code column shows the Unicode value; those starting with 24 are already in
> Unicode block U+2400; those starting with E need to be added. The Val
> column shows the EBCDIC value (hex). The Name column shows the EBCDIC
> abbreviation for the code, and the description lists "Symbol for" plus the
> EBCDIC name. There are no known "2X" forms in use.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I don't understand what this means. Are you saying that none of the
EBCDIC control characters in the range X'20'-X'2F' are in use ? This
is certainly not true. It probably means something else, but it's
not obvious to me.
> 5.5. 3270 Terminal Operator Status Indicators
>
> The IBM 3270 terminal displays a variety of unique glyphs in its Operator
> Information Area [15, Figure A-4]. Although they are not encoded in any IBM
> character set (known to me), they nevertheless appear on the screen, and are
> therefore required for accurate terminal emulation. These glyphs are listed
> in Table 5.5.
In particular, they are not assigned GCGIDs in [29 as updated].
> Table 5.5: 3270 Terminal Operator Status Indicators
>
> Code Description
> E080 Human stick figure
> E081 Human stick figure in box
> E082 Clock at 6:10 (or 1:30)
Oops - I think I meant 2:30. :-) The hands are the same length.
As for the double vs single width issue, I did look at some "real"
3270s today, unfortunately made by Memorex/Telex (who were never
known for great faithfulness to the IBM model). They have a very
tall, thin, squished looking clock that is clearly a single cell.
One PC-based terminal emulator I have is TCP3270 from McGill
University (since sold to Hummingbird Communications), and it ships a
font with the two clock halves in separate characters in order to get
a satisfactorily round clock face of legible size. It winds up being
a 1 1/2 width character.
> E083 White rectangle with stroke (1)
> E084 Black rectangle with stroke (2)
> E085 Lighting with stroke (3)
> E086 Security key (4)
> E087 Black and White Right-Pointing Triangles (5)
Elsewhere it was suggested that the 4 and 6 in boxes were just
inverse video characters; I think they are different. In particular,
if we have a "white" numeral, then the surrounding box is also
"white", and the background inside the box is "black".
> Notes:
> (1) A rectangle like the one at U+25AD with an oblique stroke through it.
> Note that "white" and "black" are used in the sense of the Unicode
> standard, and do not imply any particular colors or measure of goodness.
> (2) A rectangle like the one at U+25AC with an oblique stroke through it.
> (3) A horizontal lightning symbol with an oblique stroke through it.
> (4) A picture of a key (indicating the keyboard is locked).
This should not be unified with other lock or key-like symbols, in
particular with the locked padlock commonly used to indicate shift
lock. (This isn't in Unicode, I believe, but I think is part of Alan
LaBonte's keyboard standards, and so might get in via that route.)
This one is a key (rather than a lock) to show that a (physical) key
is needed to use the terminal.
> 10. REFERENCES
> [13] IBM System/360 Principles of Operation, GA22-6821-8, Poughkeepsie,
> NY, 1970.
I would ditch the reference to the S/360 POO - it's been pretty much
obsolete since 1970 or so. Fine for a historic reference, but I
think [29] (as updated) is the better ref.
> [14] IBM National Language Design Guide, Volume 2: National Language
> Support Reference Manual, 4th Edition, North York, ON, 1994.
(Order number SE09-8002-03)
> [29] IBM Character Data Representation Architecture, Level 1 Registry,
> IBM Canada Ltd., National Language Technical Centre, Ontario,
> SC09-1391-00, 1990.
The above publication is obsolete, and is replaced by:
IBM Character Data Representation Architecture, Registration
and Registry, IBM Canada Ltd., Toronto, SC09-2190-00, 1995.
(This is a 300 page book also containing two CD-ROMs.)
Thanks for doing all this work. I hope the views of the likes of
Michael Everson ("Unicode will be in use for centuries" with the
implication that all these silly terminal emulations are just
dinosaurs) will not prevail.
Cheers... Tony H.
8-Oct-98 21:52:29-GMT,2360;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA20099
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 17:52:27 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA41846
; Thu, 8 Oct 1998 14:48:19 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19672; Thu, 8 Oct 98 14:28:58 -0700
Message-Id: <9810082128.AA19672@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6118 (1998-10-08 21:27:30 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 14:27:29 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
I neglected to answer this one...
> > Digital VT220 and higher terminals, as well as Televideo, Wyse, HP, Perkin
> > Elmer, and other models, allow the user to select whether control
> > characters are acted upon or displayed graphically. Unicode itself
> > includes its own ...
>
> Well, to my mind that indicates that these aren't NEEDED to be encoded at
> all! Just set the terminal emulator into the "display controls" mode and
> let it display the glyphs that the emulator has for the control codes.
>
Good point. Indeed, this character set, unlike others in the VT220 (and
higher) can not be selected by the host using ISO 2022 escape sequences, which
makes sense -- the familiar "transparent mode" conundrum -- once entered, how
then to exit, since everything is transparent?
However, this is not to say that other terminals, such as the Wyse 60, which
do not comply with ISO 2022 rules, do allow the host to command them into
"display controls" mode.
In any case, the control-picture symbols must be encoded because we're
concerned not only about the terminal emulator but also the applications it
must interact with, as in:
User: "Help, my screen is messed up!"
Help desk: "OK, click on Debug in the Terminal window menu bar
and repeat what you did before."
User: "Now my screen is REALLY messed up!"
Help desk: "Let's have a look. Please use your mouse to copy it
and paste it into your email window and send it to us."
This is, of course, looking forward to the day when All Is Unicode...
- Frank
8-Oct-98 22:14:17-GMT,2992;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id SAA25881
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 18:14:16 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id PAA15352
; Thu, 8 Oct 1998 15:11:49 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19887; Thu, 8 Oct 98 14:44:41 -0700
Message-Id: <9810082144.AA19887@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6119 (1998-10-08 21:43:01 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 14:43:00 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
John Cowan wrote:
> I agree with Rick: don't propose characters that someone might need
> someday. There are quite enough characters, and indeed whole
> scripts, that are not yet available!
>
> > 2421 7F DEL DT Symbol for Delete (3)
>
> [...]
>
> > (3) Not, strictly speaking, a control character, but not a visible
> > one either.
>
> DEL is a control character in every sense, despite its position at 7F.
>
It depends who you ask. ISO 6429, 3rd Edition, 1992, says (in Annex F,
section 8.1) "The character DELETE..., not being a control function in the
strict sense, has been removed from the body of this International Standard."
DELETE and SPACE are special to ISO 4873 and ISO 2022, in which "character
sets" (as we think of them, monolithically) are actually composed of a control
portion (C0 or C1), SPACE (or not), a graphics portion, and DELETE (or not).
SPACE is a character set unto itself, and so is DELETE. If one is present,
the other must be too, in which case the graphics set is a 94-byte character
set (with a little "byte" taken out of the northwest and southeast corner),
and this is crucial to its identification. When SPACE and DELETE are not
present, it is a 96-byte set (such as "the right half of ISO-8859 Latin
Alphabet 1").
> > 5.3. EBCDIC Control Pictures
>
> Note: the EBCDIC/Unicode mapping tables at the Unicode FTP site
> map the EBCDIC-specific controls onto the C1 space, but the mapping
> seems to make no sense. For example, EBCDIC 09 (Superscript)
> is mapped to Unicode 008D (Reverse Line Feed). Why?
>
IBM provides its own EBCDIC / ASCII control-character mapping in the CDRA.
Of course it is inadequate as there are 64 EBCDIC control characters (three
undefined), but there are only 32 in ASCII.
In any case, we don't care about ASCII/EBCDIC mapping here. The need is
for glyphs for visual representation of each EBCDIC control character,
by name so people who live in the EBCDIC / 3270 world who must debug EBCDIC
and/or 3270 data streams using Unicode-based software will be able to see
these control characters represented by the names they are known by in
the EBCDIC/3270 world.
- Frank
8-Oct-98 23:08:25-GMT,7301;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id TAA10645
for <fdc@watsun.cc.columbia.edu>; Thu, 8 Oct 1998 19:08:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id QAA41818
; Thu, 8 Oct 1998 16:02:17 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA20721; Thu, 8 Oct 98 15:56:45 -0700
Message-Id: <9810082256.AA20721@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6120 (1998-10-08 22:56:30 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 15:56:28 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
"Tony Harminc" <tzha0@amdahl.com> wrote:
> Just a few very minor comments. In general my comments are not meant
> to be inserted as text - they're for your information. I've left
> headings in to help identify the places.
>
I appreciate them, many thanks!
> > This document represents a survey of the following terminals:
> >
> > IBM 3164 and 3270 [15,16,27]
>
> Really, "3270" is not a terminal, i.e. there has never been a device
> made by IBM with that model number. Rather, 3270 is an architecture...
>
Right -- sloppy wording again.
> with a large number of IBM terminals having been made that conform in
> varying degrees to its specifications. Typical 3270 model numbers
> are 3277 (the earliest implementation c. 1971), 3278 (the thing that
> most people think of when they think of a "real" 3270, c. 1977)
>
The one that weighs about 700 pounds :-) (318 Kg)
(Actually it's not so heavy compared to the 2741...)
> > 5.3. EBCDIC Control Pictures
> ...
> > EBCDIC name. There are no known "2X" forms in use.
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I don't understand what this means. Are you saying that none of the
> EBCDIC control characters in the range X'20'-X'2F' are in use ? This
> is certainly not true. It probably means something else, but it's
> not obvious to me.
>
Sorry, sequential reading required again :-) You probably skipped directly
to the EBCDIC sections. "2X" was my shorthand for 2-character abbreviations
for 3-character mnemonics, such as used in the Display Controls font of
Wyse and Televideo terminals (see Section 5.1).
> > Table 5.5: 3270 Terminal Operator Status Indicators
> >
> > Code Description
> > E080 Human stick figure
> > E081 Human stick figure in box
> > E082 Clock at 6:10 (or 1:30)
>
> Oops - I think I meant 2:30. :-)
>
Oops, right.
> As for the double vs single width issue, I did look at some "real"
> 3270s today, unfortunately made by Memorex/Telex (who were never
> known for great faithfulness to the IBM model). They have a very
> tall, thin, squished looking clock that is clearly a single cell.
> One PC-based terminal emulator I have is TCP3270 from McGill
> University (since sold to Hummingbird Communications), and it ships a
> font with the two clock halves in separate characters in order to get
> a satisfactorily round clock face of legible size. It winds up being
> a 1 1/2 width character.
>
Do you think this is worth worrying about? We certainly have a lot of
glyphs in Unicode that are more complex than this but, presumably, fit
in a single cell.
> > E083 White rectangle with stroke (1)
> > E084 Black rectangle with stroke (2)
> > E085 Lighting with stroke (3)
> > E086 Security key (4)
> > E087 Black and White Right-Pointing Triangles (5)
>
> Elsewhere it was suggested that the 4 and 6 in boxes were just
> inverse video characters; I think they are different. In particular,
> if we have a "white" numeral, then the surrounding box is also
> "white", and the background inside the box is "black".
>
Do you think it matters? We need to conserve code points whenever possible;
in this case it would seem to me that no information is lost by displaying
full-cell inverse video digits. I should probably have a look at a real
327x...
> > (4) A picture of a key (indicating the keyboard is locked).
>
> This should not be unified with other lock or key-like symbols, in
> particular with the locked padlock commonly used to indicate shift
> lock. (This isn't in Unicode, I believe, but I think is part of Alan
> LaBonte's keyboard standards, and so might get in via that route.)
> This one is a key (rather than a lock) to show that a (physical) key
> is needed to use the terminal.
>
Noted. And I do remember seeing a padlock on an IBM 327x terminal screen,
don't I? It was a long time ago, maybe I dreamed it. Anyway, I can't find
any reference to it now.
> > [13] IBM System/360 Principles of Operation, GA22-6821-8, Poughkeepsie,
> > NY, 1970.
>
> I would ditch the reference to the S/360 POO - it's been pretty much
> obsolete since 1970 or so. Fine for a historic reference, but I think
> [29] (as updated) is the better ref.
>
This is my reference for Table 5.3A. Thanks for the other updated
references.
> Thanks for doing all this work. I hope the views of the likes of
> Michael Everson ("Unicode will be in use for centuries" with the
> implication that all these silly terminal emulations are just
> dinosaurs) will not prevail.
>
Michael's views are mainstream. In any case, Michael is a passionate
advocate of some of my favorite scripts :-) I certainly would not want
to see (say) hex bytes squeezing out (say) Nordic or Irish runes.
Terminals (or emulators -- including xterms and the like), protocol
analyzers, escape sequences, termcaps, timesharing systems, mainframes, etc,
are approximately as widespread now as they ever were, but around them has
grown an entirely new world of Windows and GUIs and Web browsers, and this
is all we hear about in the mass market.
Younger people are not even aware of this older world, quietly doing its job
in its unglamorous "machine rooms", just as passengers on an ocean liner
could be unaware of what went on below decks. I have seen Columbia students
drop their jaws in amazement upon first seeing a terminal emulation screen
-- let alone an actual terminal -- "What's THAT???? It's so UGLY!". But
that world is there; we all depend on it, and it's not going anywhere.
(Really. Nothing ever quite disappears. I have heard of a payroll system
that was originally written in the early 1960s for an IBM ... OK, I forget
the exact model number -- 1104 or something like that. When that machine
was replaced by a 7094 (?), the same payroll system ran under an 1104
emulator. When the 7094 was replaced by a 360, it still ran on the 1104
emulator, which itself ran on a 7094 emulator. And so on and so on, legend
has it, to this day.)
- Frank
P.S. This is coming to you from the original IBM Thomas J Watson Research
Laboratory, where IBM developed much of its 1940s and 50s technology before
turning the building over to Columbia University in 1955 and moving to
Yorktown Heights (no, I wasn't here then). "THIMK" :-)
P.P.S. A slightly updated draft (2.5), based on today's discussion, is in
the usual place (get out your clickers):
ftp://kermit.columbia.edu/kermit/charsets/ucsterminal.txt
(End)
9-Oct-98 5:28:10-GMT,2059;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id BAA01525
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 01:28:09 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id WAA47442
; Thu, 8 Oct 1998 22:27:49 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA21907; Thu, 8 Oct 98 22:19:52 -0700
Message-Id: <9810090519.AA21907@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6121 (1998-10-09 05:19:27 GMT)
From: Geoffrey Waigh <gpw@cybersurf.net>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Thu, 8 Oct 1998 22:19:25 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Content-Transfer-Encoding: 7bit
To interject a small point on the matter of high-fidelity terminal
emulation; the vast majority of emulation users are not going to
quibble over the exact appearance or dimensions of the status area
indicators. When I worked for a company doing terminal emulation,
the customers only wanted it to run their applications correctly
and support a bizarre array of kludges^H^H^H^H^H^H^Hfeatures that
let them improve performance/functionality of their system. Quite
a few terminal features were missing until a new customer needed
it and I was appalled at how divergent some of the character code
to glyph mappings that customers ended up with were. [When I was
migrating the system to Unicode the fonts, character sets and
character -> glyph mappings were cleaned up.]
Also on the matter of debugging 5250/3270 streams our developers
always used an ASCII text representation. There wasn't any desire
that I can recall for fancy debugging fonts from either our
technical staff or our customers. (Then again the customers
usually left terminal protocol analysis to our support group.)
Geoffrey Waigh
gpw@cybersurf.net
9-Oct-98 7:25:55-GMT,2599;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id DAA20023
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 03:25:53 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id AAA43726
; Fri, 9 Oct 1998 00:24:28 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA22250; Fri, 9 Oct 98 00:12:21 -0700
Message-Id: <9810090712.AA22250@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6122 (1998-10-09 07:12:10 GMT)
From: Paul Keinanen <keinanen@sci.fi>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 00:12:09 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id DAA20023
At 13:44 8.10.1998 -0700, John Cowan wrote:
>Frank da Cruz wrote:
>
>> 2421 7F DEL DT Symbol for Delete (3)
>
>[...]
>
>> (3) Not, strictly speaking, a control character, but not a visible
>> one either.
>
>DEL is a control character in every sense, despite its position at 7F.
The situation with Delete/Rubout is a bit complicated, depending on usage.
In the VTxxx environment it is clearly a control function (erasing the
previous character), but originally the code for Delete/Rubout was chosen
for Teletype style paper tape manual entry. If you hit the wrong key, you
manually stepped back the tape one character and overpunched all 7 holes (7F
in 7 bit+odd parity) or all 8 holes (FF in 7 bit+even parity) and the
reader/computer would then ignore the character. In that sense it is a dummy
nonspaced character. When using a Teletype with a paper tape reader, you had
to be careful to have the computer in correct mode for manual resp. paper
tape reader input.
Due to this duality, should Rubout and Delete be concidered to be two
separate characters, although they seem to have the same code point in all
ASCII based character sets ? Probably not, as we try to unify other
characters as well, but at least the Rubout functionality should also be
included in the description. You could of course argue that the Rubout
functionality is either a nonspaced dummy character just in the same way as
NUL, but as NUL is concidered a control character, thus the Rubout should
also be concidered a control character.
So indeed, there can be many interpretations.
Paul KeinΣnen
9-Oct-98 13:40:53-GMT,1567;000000000001
Return-Path: <Paul.Williams@rrds.co.uk>
Received: from mailrelay1.cc.columbia.edu (mailrelay1.cc.columbia.edu [128.59.35.143])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id JAA02739
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 09:40:52 -0400 (EDT)
Received: from cyber (gate1.rrds.co.uk [195.166.41.35])
by mailrelay1.cc.columbia.edu (8.8.5/8.8.5) with SMTP id JAA00654
for <fdc@columbia.edu>; Fri, 9 Oct 1998 09:40:48 -0400 (EDT)
Sender: pawillia@rrds.co.uk
Message-Id: <361E101C.E3A323C1@rrds.co.uk>
Date: Fri, 09 Oct 1998 14:31:09 +0100
From: Paul Williams <Paul.Williams@rrds.co.uk>
Organization: Racal Radar Defence Systems Ltd
X-Mailer: Mozilla 4.02 [en] (X11; I; SunOS 5.5.1 sun4m)
MIME-Version: 1.0
To: fdc@columbia.edu
Subject: Terminal Graphics for Unicode. Trainspotter Alert!
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Frank,
[Your email inbox must be huge. Please don't feel obliged to reply to
this.]
I've just been reading your very interesting proposal to add characters
found in terminal repertoires to the Unicode standard. Although I expect
that point 2 in the Problems section doesn't hold true for the VT320,
i.e. "Lack of definitive, high-quality pictures of the glyphs in some
cases", you may still like to take a look at:
http://www.celigne.co.uk/terminal/built-in_glyphs.html
I do realise that this is the work of a complete anorak, but it is
part of my project to write a "real" VT320 terminal emulator and
document it completely.
Regards,
Paul
[not speaking on behalf of my employer]
9-Oct-98 13:54:43-GMT,3358;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id JAA06407;
Fri, 9 Oct 1998 09:54:39 -0400 (EDT)
Sender: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Date: Fri, 9 Oct 98 9:54:38 EDT
From: Kermit Software Support <kermit-support@columbia.edu>
To: Karlsson Kent - keka <keka@im.se>
Subject: Re: Kermit95
In-Reply-To: Your message of Fri, 9 Oct 1998 11:52:00 +0200
Cc: kermit-support@columbia.edu
Reply-To: kermit-support@columbia.edu
Message-ID: <CMM.0.90.4.907941278.fdc@watsun.cc.columbia.edu>
> I have a few questions:
>
> 1. Can one use Unicode (UTF-16, either endianism, or UTF-8) on the host
> side, i.e. 'on the wire' (between the host and Kermit)?=A0 (This would
> be for a Unicode enabled application on the host.)
>
Not yet. So far nobody has asked for it. In any case, I have not heard of
any platform that offers UTF-8 host-terminal sessions. Well, maybe Plan 9?
But yes, we do plan to add UTF-8 support, even in advance of user demand.
> 2. Is the Unicode enabled Kermit available also on Win95/98?
> it only for NT?
>
It's only for NT because, at present, Kermit 95 is a console application,
and Unicode is not supported in console windows on Windows 95/98.
> 3. Can one use Kanji/Hangul etc. with Kermit95? Can one use proportional
> width fonts in general with Kermit95?
>
DCBS is problematic in console windows. Proportional-width fonts don't
usually make sense in a terminal screen. However, when K-95 is converted to
full GUI form, the user should be able to choose any font at all, including a
proportional one.
>From the Kermit 95 FAQ:
There is no explicit support in Kermit 95 for Chinese, Japanese, or
Korean (CJK), but you still might be able to view CJK in a Kermit 95
window if (a) your PC is configured to allow it; (b) the CJK character
set used on your PC is the same as that on the host; (c) K95 has been
told to "set terminal character-set transparent" and "set terminal
bytesize 8". However, the ability to view CJK text does not necessarily
mean you can also enter it. According to Microsoft Knowledge Base
Article Q156793, CJK Input Method Editors are not available in Console
windows under Windows 95, even though they are available in Windows NT
3.51 and later. Explicit support for CJK terminal emulation is planned
for future releases of Kermit 95, after release of the GUI.
We do, however, support translation among the full range of Kanji character
sets during file transfer.
> (I.e. can I use, e.g. Bitstreams Cyberbit font to display
> Latin/Hangul/Kanji/Hiragana/...?)
>
Maybe in NT. Certainly not in Win95/98 at present.
> 4. Can Kermit95 properly handle (a non-empty subset of) combining
> characters? Conjoining Hangul Jamo?
>
No.
> 5. Can Kermit95 interpret ESC-sequences, so that (terminalish) forms
> can be drawn, and filled in? (ECS-sequence positioning would be to a "cell
> grid", but text strings should still be displayable via a proportional font).
>
Yes, except for the proprortional font part. All terminals emulated by
Kermit 95 use fixed-pitch fonts, and so all of Kermit 95's emulations
assume a regular matrix of screen cells. Are you aware of any terminal
that does not follow this model?
> 6. Can Kermit be used together with an IME (to input, e.g. Kanji).
> (Just asking...)
>
See FAQ quote above.
- Frank
9-Oct-98 14:09:23-GMT,3918;000000000001
Return-Path: <jaltman>
Received: (from jaltman@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id KAA10541;
Fri, 9 Oct 1998 10:09:21 -0400 (EDT)
Sender: Jeffrey Altman <jaltman@watsun.cc.columbia.edu>
Date: Fri, 9 Oct 98 10:09:20 EDT
From: kermit-support@watsun.cc.columbia.edu
To: kermit-support@columbia.edu
Cc: Karlsson Kent - keka <keka@im.se>, kermit-support@columbia.edu
Subject: Re: Kermit95
In-Reply-To: Your message of Fri, 9 Oct 98 9:54:38 EDT
Reply-To: kermit-support@watsun.cc.columbia.edu
Message-ID: <CMM.0.90.4.907942160.jaltman@watsun.cc.columbia.edu>
Let me elaborate on some of the responses in more detail.
> > 2. Is the Unicode enabled Kermit available also on Win95/98?
> > it only for NT?
> >
> It's only for NT because, at present, Kermit 95 is a console application,
> and Unicode is not supported in console windows on Windows 95/98.
All versions of K95 use Unicode internally during terminal emulation.
In other words, all remote character-sets are translated to Unicode
and stored in the screen buffer. On NT, Unicode is used for display
because it is supported by the OS. On Win95/98 and OS/2 the Unicode
characters are converted to the equivalent character (if one exists)
in the code page used on the local system.
Adding support for host based UTF-7 or UTF-8 in conjunction with an
existing terminal emulation will not be difficult. If you have a host
application that does this, we would like to know about it.
> > 3. Can one use Kanji/Hangul etc. with Kermit95? Can one use proportional
> > width fonts in general with Kermit95?
> >
> DCBS is problematic in console windows. Proportional-width fonts don't
> usually make sense in a terminal screen. However, when K-95 is converted to
> full GUI form, the user should be able to choose any font at all, including a
> proportional one.
>
> >From the Kermit 95 FAQ:
>
> There is no explicit support in Kermit 95 for Chinese, Japanese, or
> Korean (CJK), but you still might be able to view CJK in a Kermit 95
> window if (a) your PC is configured to allow it; (b) the CJK character
> set used on your PC is the same as that on the host; (c) K95 has been
> told to "set terminal character-set transparent" and "set terminal
> bytesize 8". However, the ability to view CJK text does not necessarily
> mean you can also enter it. According to Microsoft Knowledge Base
> Article Q156793, CJK Input Method Editors are not available in Console
> windows under Windows 95, even though they are available in Windows NT
> 3.51 and later. Explicit support for CJK terminal emulation is planned
> for future releases of Kermit 95, after release of the GUI.
>
> We do, however, support translation among the full range of Kanji character
> sets during file transfer.
>
> > (I.e. can I use, e.g. Bitstreams Cyberbit font to display
> > Latin/Hangul/Kanji/Hiragana/...?)
> >
> Maybe in NT. Certainly not in Win95/98 at present.
The primary reason that we have not implemented DBCS support for K95
is because there are no freely distributed monospaced Unicode fonts
that populate the CJK areas. The Bitstream Cyberbit font is
proportionally spaced and cannot be used in Console windows.
Monotype does have fully populated versions of Lucida Console and
Monotype.com fonts that can be purchased. You would have to contact
them for pricing info. These fonts can them be used in Win95/98 and
NT. In NT you would now have the ability to display the CJK
characters. However, as Win95/98 does not have code page support for
the CJK characters (at least in the Western releases) you would still
not be able to display the CJK characters.
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
The Kermit Project * Columbia University
612 West 115th St #716 * New York, NY * 10025
http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org
9-Oct-98 14:20:35-GMT,3605;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id KAA13757
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 10:20:34 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id HAA45070
; Fri, 9 Oct 1998 07:19:55 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA23282; Fri, 9 Oct 98 07:13:18 -0700
Message-Id: <9810091413.AA23282@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6123 (1998-10-09 14:12:43 GMT)
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 07:12:42 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank da Cruz wrote on 1998-10-08 21:27 UTC:
> In any case, the control-picture symbols must be encoded because we're
> concerned not only about the terminal emulator but also the applications it
> must interact with, as in:
>
> User: "Help, my screen is messed up!"
>
> Help desk: "OK, click on Debug in the Terminal window menu bar
> and repeat what you did before."
>
> User: "Now my screen is REALLY messed up!"
>
> Help desk: "Let's have a look. Please use your mouse to copy it
> and paste it into your email window and send it to us."
>
> This is, of course, looking forward to the day when All Is Unicode...
The helpdesk can already get the same effect today by much simpler
asking for a bitmap of the screen or window to be mailed:
Help desk: "What GUI are you using?"
User: "X11"
Help desk: "Excellent choice. Just enter the shell command
'xwd | uuencode - | mail helpdesk' and then click on the
messed-up window"
User: "Done."
Help desk: "Ah, now I see your problem. That is easy to fix ..."
Let's not create unnecessary complicated user requirements.
I highly welcome attempts to complete Unicode with the various technical
character set symbols that various terminal types have, after proper
unification according to the well-proven character/glyph model.
Also symbols that are not part of the user accessible character set of
a terminal but that appear as part of the normal look-and-feel of this
terminal in status lines, etc. should be added to Unicode, in order to allow
to emulate one terminal inside another terminal emulator (e.g., an IBM 3270
emulator that runs inside an UTF-8 enhanced xterm). I am sceptical
however, whether all the many control symbols really need to have a place
in the BMP. I think that debugging tools can quite easily provide them
using some other replacement notation, that might or might not bypass the
usual font mechanisms. I have been used to see ^M in a different color as a
common replacement symbol for Carriage Return (Ctrl-M) for over 15 years,
and I never missed a much less readable CR glyph for debugging purposes.
Debugging is done by experts and experts are used to cope with any replacement
notation anyway. We can do hexdumps quite nicely with 0-9A-F without having
to resort to hex-byte-glyphs and control abbreviation glyphs.
I think the criterion for inclusion of terminal emulator characters
should be whether the character can ever be seen by a normal user
in normal (non-debugging, non-configure) operation on the screen of
the terminal.
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
9-Oct-98 14:49:23-GMT,2248;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id KAA22940
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 10:49:21 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id HAA22448
; Fri, 9 Oct 1998 07:47:47 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA23387; Fri, 9 Oct 98 07:29:59 -0700
Message-Id: <9810091429.AA23387@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6124 (1998-10-09 14:29:42 GMT)
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 07:29:41 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank da Cruz wrote on 1998-10-08 22:56 UTC:
> (Really. Nothing ever quite disappears. I have heard of a payroll system
> that was originally written in the early 1960s for an IBM ... OK, I forget
> the exact model number -- 1104 or something like that. When that machine
> was replaced by a 7094 (?), the same payroll system ran under an 1104
> emulator. When the 7094 was replaced by a 360, it still ran on the 1104
> emulator, which itself ran on a 7094 emulator. And so on and so on, legend
> has it, to this day.)
Another question is which terminals should actually be supported. Many
of the ones you mentioned have died away already. I am aware of the IBM
3270 family and the DEC VT100 family having a long and healthy live (to
Michael Everson: those might indeed still be around in a hundred years
from now, we will know after Y2K), but much of the rest is probably not
sufficiently mainstream enough to deserve consideration in Unicode.
Do you have any form of data on the terminal emulator market regarding
more exotic terminal types? Are there really more than a few hundred
people out there who use applications that depend on a terminal type
radically different from a DEC VT340 or a IBM 3278?
Markus
--
Markus G. Kuhn, Security Group, Computer Lab, Cambridge University, UK
email: mkuhn at acm.org, home page: <http://www.cl.cam.ac.uk/~mgk25/>
9-Oct-98 15:38:18-GMT,1363;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id LAA08268;
Fri, 9 Oct 1998 11:38:10 -0400 (EDT)
Date: Fri, 9 Oct 98 11:38:09 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: Otto Stolz <Otto.Stolz@uni-konstanz.de>
Subject: Re: Terminal Graphics Draft 2
In-Reply-To: Your message of Fri, 9 Oct 1998 17:31:21 -0600
Message-ID: <CMM.0.90.4.907947489.fdc@watsun.cc.columbia.edu>
> Hello,
>
> on 1998-10-09 at 17:19, I have written to the Unicode list:
> > Cf. figure 3-1 in IBM form GA27-2837-8
> > "IBM 3270 Information Display System Character Set Reference".
>
> If you want this old (Aug 1968), pre-CDRA pamphlet, you can have it
> for the asking. It contains various keyboard layouts, I/O interface
> code charts and some ancillary material (including special 3270 controll
> character assignments, cf. my forthcoming note to the Unicode list).
>
I'd love to have it -- as you can guess, I collect such things.
Thanks!
Frank da Cruz
The Kermit Project
Columbia University
612 West 115th Street
New York NY 10025-7799
USA
P.S. Another rare object I was hoping to find was an Siemens (or Nixdorf?)
BA80 terminal manual. Nobody at SNI acknowledges there ever was such a thing,
but I don't believe them. This is just a "shot in the dark" -- ignore it if
you don't know what I'm talking about.
- Frank
9-Oct-98 15:43:42-GMT,1745;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id LAA09826
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 11:43:40 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id IAA17762
; Fri, 9 Oct 1998 08:42:00 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA23887; Fri, 9 Oct 98 08:33:42 -0700
Message-Id: <9810091533.AA23887@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6125 (1998-10-09 15:33:15 GMT)
From: Otto Stolz <Otto.Stolz@uni-konstanz.de>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 08:33:14 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Am 1998-10-8 um 12:19 hat John Cowan geschrieben:
> A typical (though not the only)
> glyph for U+2424 is the one which appears on the Enter key of PC
> keyboards.
Please describe.
On my keyboards (both PC and X-Terminal), the Enter key has the word
"Enter" engraved, whilst the Return key has a U+21B2 Downwards Arrow
with Tip Leftwards (or is it a glyph variant of U+21B5?).
If I remember correctly, my 3270 terminal had the similar engravings
on these keys, viz. "DatFreig" (Datenfreigabe = German translation of
"Enter"), and U+21B5, respectively. Cf. figure 3-1 in IBM form GA27-2837-8
"IBM 3270 Information Display System Character Set Reference".
Btw., the semantics of these 3270 keys were quite different: the Enter
key sends data to zhe host, whilst the Return key is just a local cursor
movement without sending anything.
Best wishes,
Otto Stolz
9-Oct-98 16:02:21-GMT,842;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id MAA14500;
Fri, 9 Oct 1998 12:02:17 -0400 (EDT)
Date: Fri, 9 Oct 98 12:02:17 EDT
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: Otto Stolz <Otto.Stolz@uni-konstanz.de>
Subject: Re: Terminal Graphics Draft 2
In-Reply-To: Your message of Fri, 9 Oct 1998 17:58:28 -0600
Message-ID: <CMM.0.90.4.907948937.fdc@watsun.cc.columbia.edu>
> Am 1998-10-9 um 17:31 hat Otto Stolz geschrieben:
> > ancillary material (including special 3270 controll
> > character assignments, cf. my forthcoming note to the Unicode list).
>
> This is only to tell you that no note is forthcoming, as I eventually have
> found all of those control characters in either table 5.3 or 5.4 of your
> 2nd draft.
>
OK, that will please the minimalists :-)
Thanks.
- Frank
9-Oct-98 16:25:12-GMT,1442;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id MAA20078
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 12:25:11 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id JAA65830
; Fri, 9 Oct 1998 09:24:47 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA24229; Fri, 9 Oct 98 09:18:58 -0700
Message-Id: <9810091618.AA24229@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Uml-Sequence: 6126 (1998-10-09 16:18:46 GMT)
From: John Cowan <cowan@locke.ccil.org>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 09:18:45 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Content-Transfer-Encoding: 7bit
Otto Stolz wrote:
> On my keyboards (both PC and X-Terminal), the Enter key has the word
> "Enter" engraved, whilst the Return key has a U+21B2 Downwards Arrow
> with Tip Leftwards (or is it a glyph variant of U+21B5?).
Mm, you're right. I don't know what I was thinking of.
--
John Cowan http://www.ccil.org/~cowan cowan@ccil.org
You tollerday donsk? N. You tolkatiff scowegian? Nn.
You spigotty anglease? Nnn. You phonio saxo? Nnnn.
Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
9-Oct-98 17:17:51-GMT,1805;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA04954
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 13:17:50 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA52330
; Fri, 9 Oct 1998 10:17:12 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA24788; Fri, 9 Oct 98 10:04:28 -0700
Message-Id: <9810091704.AA24788@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6129 (1998-10-09 17:04:14 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:04:12 -0700 (PDT)
Subject: careless quotation and forwarding
It's nice to quote a little context when you're replying to some previous
message. People do that a lot by pre-pending ">" to the quoted lines.
Sometimes, however, people don't take enough care with REMOVING the
irrelevant portions of the note they're quoting.
I get a little tired of scenarios like this: Someone quotes a few lines of
a note, intersperses comments, then just LEAVES the rest of the note. There
was one note yesterday on this list that ended with 717 lines quoted from
another note, WITHOUT COMMENT. This was some 30k of excess garbage that
clogged up my inbox, and I had to scroll all the way through it to verify
that it was UNCOMMENTED, and hence irrelevant to forward.
Please take more care in removing excess quoted material -- parts of notes
that are irrelevant or upon which you are not commenting. It doesn't take
that much time to do, and saves all of your readers the trouble of
fruitlessly wading through the excess.
Thanks,
Rick
9-Oct-98 17:37:03-GMT,1854;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA11282
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 13:37:01 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA56198
; Fri, 9 Oct 1998 10:35:01 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA24939; Fri, 9 Oct 98 10:15:15 -0700
Message-Id: <9810091715.AA24939@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6130 (1998-10-09 17:13:48 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:13:47 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank wrote...
> The reason these need to be encoded separately from feminine/masculine
> ordinals are their size
Ah, I figured so. In this case, they should just be unified with the
existing codes.
> Since terminal emulators and data analyzers use fixed-pitch fonts, we can't
> just switch to another point size to display these characters, since that will
> wreck the matrix arrangement of the screen.
Eh? There are plenty of fixed-pitch fonts that include masculine and
feminine ordinal indicators taking up the same size cell as everything else.
Big or small, unless you need to distinguish these from small ordinals for
the emulation of ONE Actual Physical Terminal, there's no point in encoding
them again.
> Seriously, the hex bytes are entirely separable from the rest. I'll be
> glad to cut them loose unless somebody speaks up strongly in their favor.
I'll repeat myself strongly in their disfavor. I think you should remove
them from this proposal, whether or not you make another proposal for them.
Rick
9-Oct-98 17:49:15-GMT,1449;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA14124
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 13:49:12 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA57790
; Fri, 9 Oct 1998 10:47:35 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25247; Fri, 9 Oct 98 10:36:02 -0700
Message-Id: <9810091736.AA25247@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6131 (1998-10-09 17:35:33 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:35:32 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank wrote:
>
> (Really. Nothing ever quite disappears. I have heard of a payroll system
> that was originally written in the early 1960s for an IBM ... OK, I forget
> the exact model number -- 1104 or something like that. When that machine
> was replaced by a 7094 (?), the same payroll system ran under an 1104
> emulator. When the 7094 was replaced by a 360, it still ran on the 1104
> emulator, which itself ran on a 7094 emulator. And so on and so on, legend
> has it, to this day.)
Somewhat orthogonally, many of these old dinosaurs are about to be
cleared away by the asteroid known as Y2K on its way to impact.
--Ken
9-Oct-98 18:10:55-GMT,1548;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA19499
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 14:10:53 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA42778
; Fri, 9 Oct 1998 11:09:06 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25845; Fri, 9 Oct 98 10:55:36 -0700
Message-Id: <9810091755.AA25845@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6135 (1998-10-09 17:54:18 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:54:14 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
> > The reason these need to be encoded separately from feminine/masculine
> > ordinals are their size
>
> Ah, I figured so. In this case, they should just be unified with the
> existing codes.
>
See my response to Otto Stolz on this. Again, I don't care so much whether
these particular characters are encoded, but they illustrate a point worth
making, namely that unifications that work in the GUI world don't
necessarily work in an environment where we must use a fixed-pitch font.
If a "big" feminine ordinal arrives, I can't just display the regular one
at a bigger point size with a lower baseline, because (a) I might not be
able to (maybe it's a console application), and (b) cells must be fixed
size.
- Frank
9-Oct-98 18:11:50-GMT,2295;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA19718
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 14:11:47 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA45088
; Fri, 9 Oct 1998 11:09:08 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25534; Fri, 9 Oct 98 10:47:19 -0700
Message-Id: <9810091747.AA25534@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6132 (1998-10-09 17:45:54 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Fri, 9 Oct 1998 10:45:47 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
John Cowan wrote:
>
> But the SYMBOL FOR NEWLINE is not tied to U+2424, which is a LINE
> SEPARATOR, not a line terminator. A typical (though not the only)
> glyph for U+2424 is the one which appears on the Enter key of PC
> keyboards.
No.
U+2424 *is* SYMBOL FOR NEWLINE
It is a graphic symbol for the NEWLINE function.
U+2424 is *not* a line separator, nor a line terminator. It is not
a control function or control code at all.
It *is* the character which should be used when displaying a graphic
symbol for the control function NEWLINE. And here we are talking
the EBCDIC NL (etc.), not C `\n'.
U+21B5 DOWNARDS ARROW WITH CORNER LEFTWARDS is the character which
can be used to represent that which typically appears on the Enter key
of PC keyboards (i.e., serving as a graphic symbol for CARRIAGE RETURN).
Incidentally, there is another pile of graphic symbols for keyboard
functions coming down the pike in Amendment 22 to 10646 (based on
ISO 9995-7). These should be checked to verify that there are no
duplicates against the collection of symbols being proposed for
terminal emulation. (Examples: symbols for compose, enter, alternate,
shift lock, undo, print screen, clear screen, delete, etc.)
--Ken
>
> --
> John Cowan http://www.ccil.org/~cowan cowan@ccil.org
> You tollerday donsk? N. You tolkatiff scowegian? Nn.
> You spigotty anglease? Nnn. You phonio saxo? Nnnn.
> Clear all so! 'Tis a Jute.... (Finnegans Wake 16.5)
>
9-Oct-98 18:22:26-GMT,2272;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA22300
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 14:22:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA40550
; Fri, 9 Oct 1998 11:20:39 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25711; Fri, 9 Oct 98 10:53:19 -0700
Message-Id: <9810091753.AA25711@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6133 (1998-10-09 17:51:01 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Fri, 9 Oct 1998 10:51:00 -0700 (PDT)
Subject: RE: Terminal Graphics Draft 2
> I can't think of any connections where both NL and NEL would be
> used in the same data stream, since data streams tend to be
> either ASCII/ISO or EBCDIC, but not a mixture.
>
> However, real terminals (VT220-520) print "NEL" when in display-
> controls mode, so why make an exception to the rule of printing
> the actual name in this one case? I think this would be more
> confusing than doing what the actual terminal does. Especially
> considering the "metareferences" we might make to these characters
> in Unicode texts; e.g. "An ISO data stream will show the [NEL]
> character, whereas an EBCDIC data stream will show the [NL]
> character"...
>
> I'm sure we could also find other examples of control characters
> in the C1 and EBCDIC sets whose semantics are the same or close
> but whose names differ; I don't think that means we should unify
> them. The purpose of "display controls" is to show the customary
> and familiar mnemonic for each control character in its context so
> people can read them easily.
And since your collection of display controls lists both the
three-letter and two-letter mnemonics for these things, I
cannot see any argument for disunification. This is the thing meant
for what in your chart is:
E025 85 NEL NL Symbol for Next Line
U+2424 is the correct character for the graphic symbol
display of "NEL" or "NL Symbol or Next Line (or Symbol for Newline).
--Ken
>
> - Frank
>
9-Oct-98 18:40:04-GMT,3883;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA28263
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 14:40:02 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA40650
; Fri, 9 Oct 1998 11:26:03 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25671; Fri, 9 Oct 98 10:52:15 -0700
Message-Id: <9810091752.AA25671@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6134 (1998-10-09 17:51:41 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:51:40 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
> If the Feminine, and Masculine, Ordinal Indicators (U+00AA, and U+00BA,
> respectively) were written in the same fixed-pitch font as the surrounding
> text, they also would occupy one cell, each, won't they?
>
Yes, but they would be too small. The SNI glyphs are full-size base
characters, but the ordinal indicator glyphs are superscripts.
> As Frank had written in both of his own drafts:
> > arriving at a sufficient set of character-cell terminal graphics for
> > Unicode is complicated by the well-known problems that affect other
> > preexisting character sets to varying degrees:
> > 1. Lack of official names for the characters of some of the sets.
> > 2. Lack of definitive, high-quality pictures of the glyphs in some cases.
> > 3. Lack of descriptions of the purpose and intended use of the glyphs.
>
> I think, those are good reasons not to take the glyphs in the Siemens
> Nixdorf 97801-5xx Benutzerhandbuch too seriously -- good reasons to unify
> these characters with the above-mentioned U+00AA and U+00BA. The only
> reason, IMHO, not to unify them would be existence of a character set
> containing different glyphs both for the proposed characters and the
> existing ones -- as Rick has already noted.
>
I tend to agree. The "strange" SNI glyphs are not a high priority, to me
personally at least. I have, however, posted a message to the Sinix newsgroup
(of SNI customers) to see if any strong opinions come to the surface. All I
can say from my own experience is that there was heavy demand for accurate
SNI terminal emulation for Windows 95/98/NT, and we met that demand as best
we could within the limitations of the code pages and fonts available to us.
For those of you not familiar with SNI 97801, it probably has the most
advanced ISO 2022 implementation and repertoire of character sets of any
terminal ever built -- at least in the West (it lacks Hebrew, Arabic, and
CJK, but includes ISO 8859-1,2,3,4,5,7,9, various ISO 646 versions, plus
a selection of "strange" private sets, and a wide variety of input methods).
To answer Otto's point with a question: what is a character set? I can see
both a superscript feminine ordinal and a "big" feminine ordinal on the same
screen simply by sending ISO 2022 escape sequences to switch "character sets".
So in a sense, all character sets that can be designated and invoked by ISO
2022 escape sequences form one big character set :-) See, for example:
http://www.columbia.edu/kermit/kuishots.html
Go down to Shot 3. This screen was produced using ISO 2022 escape sequences
from the host to a VT320 terminal emulator on Windows 95, with Lucida Console
as the (Unicode) font. The same screen could be produced by sending the exact
same data stream to the 97801. (This screen does not show any of the SNI
"strange" glyphs, but I hope it illustrates the point.)
Again, I have no great investment in these characters, and so far our SNI
users have not complained about their absence, but before striking them I
hope to hear some additional testimony from them.
- Frank
9-Oct-98 19:46:19-GMT,1944;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id PAA16362;
Fri, 9 Oct 1998 15:46:15 -0400 (EDT)
Date: Fri, 9 Oct 1998 15:46:15 -0400 (EDT)
Message-Id: <199810091946.PAA16362@watsun.cc.columbia.edu>
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: Michael Everson <everson@indigo.ie>
Subject: Re: Terminal Graphics Draft 2
In-Reply-To: Your message of Fri, 9 Oct 1998 12:20:23 -0700 (PDT)
> Everson Mono is fixed-pitch font that includes masculine and feminine
> ordinal indicators taking up the same size cell as everything else.
>
Hi Michael. Did we discuss Everson Mono before? I mean, the possibility
of packaging it with Kermit 95? (Since Microsoft so resolutely and, may I
say, arrogantly, refuses to decently populate Lucida Console.)
I assume we would have to license it and pay some money. Can you give me a
rough idea of the terms? And could it be used as a plug-in replacement for
Lucida Console on Windows NT (but with so many of those huge gaps filled in)?
- Frank
P.S. If your offer to make TTF glyphs for them still stands, the local
photocopier has been fixed so I can start copying from books and manuals to
sheets of paper. These, of course, I can send by post or fax, or scan them.
P.P.S. What did John Cowan mean about Ireland? I had discussions about
this with various people last year and learned that the name of the country
was a somewhat sensitive topic, but the consensus seemed to favor
"Republic of Ireland" rather than "╔ire" (which is controversial since
it came from Eamon de Valera (sp?) who agreed with England on the partition)
or "Ireland" which is confusing to some people (e.g. postal authorities).
By the way, in case you are interested in the results of that discussion,
I'd be glad to send it -- it is a discourse on how to address postal mail
to many lands, the Isles to the north of continental Europe being the most
interesting case :-)
9-Oct-98 19:53:17-GMT,1561;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id PAA18311
for fdc; Fri, 9 Oct 1998 15:53:15 -0400 (EDT)
Date: Fri, 9 Oct 1998 15:53:15 -0400 (EDT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Message-Id: <199810091953.PAA18311@watsun.cc.columbia.edu>
To: fdc@watsun.cc.columbia.edu
Path: news.columbia.edu!watsun.cc.columbia.edu!fdc
From: fdc@watsun.cc.columbia.edu (Frank da Cruz)
Newsgroups: de.comp.os.sinix
Subject: SNI 97801 Emulation vs Unicode
Date: 9 Oct 1998 17:55:14 GMT
Organization: Columbia University
Lines: 24
Message-ID: <6vlim2$9s6$1@apakabar.cc.columbia.edu>
NNTP-Posting-Host: watsun.cc.columbia.edu
Xref: news.columbia.edu de.comp.os.sinix:1873
I would like to know if anybody is using applications with SNI 97801
terminals (or emulators) that require any of the following character sets:
1. Klammern (Brackets) -- pieces of brackets, braces, integral signs,
little clocks, and other funny symbols.
2. Facet -- shapes for drawing pictures (mosaic graphics), but not
like Videotex / Teletex.
3. "IBM" -- contains some unique symbols not found in any IBM code page,
like rotated hex digit-pairs, superscript "proportional-to" symbol,
superscript infinity symbol, etc.
These three character sets contain glyphs that are not in Unicode.
The question is whether they should be added.
I don't care about the other 97801 sets (Math, Euro, German, International,
or the Latin alphabets) because they are already in Unicode.
Thank you!
Frank da Cruz
Columbia University
9-Oct-98 21:07:45-GMT,2657;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA08937
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 17:07:44 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA51036
; Fri, 9 Oct 1998 14:06:05 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA29038; Fri, 9 Oct 98 13:53:48 -0700
Message-Id: <9810092053.AA29038@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6139 (1998-10-09 20:53:35 GMT)
From: "Alain" <alb@sct.gouv.qc.ca>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 13:53:33 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id RAA08937
A 08:33 98-10-09 -0700, Otto Stolz a Θcrit :
>Am 1998-10-8 um 12:19 hat John Cowan geschrieben:
>> A typical (though not the only)
>> glyph for U+2424 is the one which appears on the Enter key of PC
>> keyboards.
>
>Please describe.
>
>On my keyboards (both PC and X-Terminal), the Enter key has the word
>"Enter" engraved, whilst the Return key has a U+21B2 Downwards Arrow
>with Tip Leftwards (or is it a glyph variant of U+21B5?).
>
>If I remember correctly, my 3270 terminal had the similar engravings
>on these keys, viz. "DatFreig" (Datenfreigabe = German translation of
>"Enter"), and U+21B5, respectively. Cf. figure 3-1 in IBM form GA27-2837-8
>"IBM 3270 Information Display System Character Set Reference".
>
>Btw., the semantics of these 3270 keys were quite different: the Enter
>key sends data to zhe host, whilst the Return key is just a local cursor
>movement without sending anything.
>
>Best wishes,
> Otto Stolz
[Alain] :
According to ISO/IEC 9995-7 (Symbols used for keyboard functions), "Enter"
(fr: Validation) and "Return" (fr: Retour) are indeed two very different
functions, with two different international symbols. On 3270's they are
used simultaneously. On PCs, only the Return function is used *generally*,
unless you use a terminal emulator, in which case, Enter is also used
(generally dedicating the same scan code as the righ-hand-side Control key;
some applications alos use the Return key of the numeric keypad as an Enter
function).
Alain LaBontΘ
QuΘbec
Project editor, ISO/IEC 9995 series (8 parts)
Coeditor, ISO/IEC 9995-7 (with Bernard Chauvois and Fred Bealle)
[and author/designer of several keyboard drivers for PCs]
9-Oct-98 21:37:01-GMT,7519;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA17342
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 17:36:59 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA43772
; Fri, 9 Oct 1998 14:36:11 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA29631; Fri, 9 Oct 98 14:28:34 -0700
Message-Id: <9810092128.AA29631@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6142 (1998-10-09 21:25:14 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 14:25:12 -0700 (PDT)
Subject: Terminal Graphics: Assorted Responses
Paul Keinanen <keinanen@sci.fi> wrote [About DEL and RUB]:
> ...
> Due to this duality, should Rubout and Delete be concidered to be two
> separate characters, although they seem to have the same code point in all
> ASCII based character sets ? Probably not...
>
I don't think they should be separated. The name Rubout was rubbed out
decades (literally) ago. ANSI X3.4-1977 does not contain the word Rubout.
Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk> wrote:
>
> I highly welcome attempts to complete Unicode with the various technical
> character set symbols that various terminal types have, after proper
> unification according to the well-proven character/glyph model.
>
Good...
> Also symbols that are not part of the user accessible character set of
> a terminal but that appear as part of the normal look-and-feel of this
> terminal in status lines, etc. should be added to Unicode, in order to allow
> to emulate one terminal inside another terminal emulator (e.g., an IBM 3270
> emulator that runs inside an UTF-8 enhanced xterm).
>
Good...
> I am sceptical
> however, whether all the many control symbols really need to have a place
> in the BMP. I think that debugging tools can quite easily provide them
> using some other replacement notation, that might or might not bypass the
> usual font mechanisms.
>
Indeed they can, but then will such debugging tools be interoperable
with other applications? I think it is a worthy goal to be able to paste
terminal screens -- even when they contain debugging information -- into
other applications. For example, for publication purposes, e.g. by people
who write networking and data communications textbooks, manuals, and
"for dummies" books. I think there is a nontrivial market there :-)
I don't think we should go out of our way to anticipate how people will use
these characters, or what kind of people will use them.
> Another question is which terminals should actually be supported. Many
> of the ones you mentioned have died away already.
>
Like the Perkin Elmers. But they are not the basis for any proposed
characters, only illustrations of features like "display controls". The
VT100 family, Wyse, Televideo, IBM, and SNI are all very current. The
Heath/Zenith 19 hasn't been manufactured in quite a while, but it remains
one of the most popular terminals to emulate due to its powerful but simple
command set and several unique features. People who were not even born
until after the last H19s vanished are using emulators for them today, with
matching termcaps (or 3270 protocol converter terminal types) on the other
end. In the IBM world, the H19 is especially popular since it lets the host
change the cursor shape, and Series/1-based protocol converters use this
feature to tell the user whether the 3270 is in insert or overwrite mode.
For this reason alone, countless people stick with this emulation rather
than "modernize" to VT100 (or beyond).
> Do you have any form of data on the terminal emulator market regarding
> more exotic terminal types? Are there really more than a few hundred
> people out there who use applications that depend on a terminal type
> radically different from a DEC VT340 or a IBM 3278?
>
I can tell you as a maker of terminal emulation software that there is
indeed a significant and and insistent demand for all sorts of terminals you
never heard of. The original list for Kermit 95 was simply VT100, VT220,
and ANSI. In the three years since we released it -- that is, here in the
late 1990s -- quite contrary to our expectations, customer demands have
compelled us to expand the list as follows:
[C:\k95\] K-95> set terminal type ? One of the following:
aixterm beterm hft qansi tvi910+ vt100 wy30
ansi-bbs dg200 hp2621a qnx tvi925 vt102 wy370
at386 dg210 hpterm scoansi tvi950 vt220 wy50
avatar/0+ dg217 hz1500 sni-97801 vc404 vt320 wy60
ba80 heath19 linux tty vip7809 vt52
[C:\k95\] K-95>
You'll see that some of them are true antiques: the Volker Craig 404, the
Hazeltine 1500, ... Customers require these emulations because they have
applications that are hardwired to use them. And yes, some of these
applications are "dinosaurs", but they do not die so easily, and who is
to say they should?
kenw@sybase.com (Kenneth Whistler) wrote [of ancient programs]:
>
> Somewhat orthogonally, many of these old dinosaurs are about to be
> cleared away by the asteroid known as Y2K on its way to impact.
>
And many won't -- there is a huge industry that installs patches in ancient
binaries for which source code is lost, or the source language is long
forgotten (or no longer compilable or assemblable). And then, whenever the
"window" rolls around, they'll have to do it again :-)
> Incidentally, there is another pile of graphic symbols for keyboard
> functions coming down the pike in Amendment 22 to 10646 (based on
> ISO 9995-7). These should be checked to verify that there are no
> duplicates against the collection of symbols being proposed for
> terminal emulation. (Examples: symbols for compose, enter, alternate,
> shift lock, undo, print screen, clear screen, delete, etc.)
>
For sure. I assume someone who is party to both proposals will take
responsbility? If not, is Amendment 22 in a public place so I can check
it myself?
> And since your collection of display controls lists both the
> three-letter and two-letter mnemonics for these things [NEL and NL],
> I cannot see any argument for disunification. This is the thing meant
> for what in your chart is:
>
> E025 85 NEL NL Symbol for Next Line
>
> U+2424 is the correct character for the graphic symbol
> display of "NEL" or "NL Symbol or Next Line (or Symbol for Newline).
>
Well, we've beaten this one to death, but I would say we should be
consistent. Our choices are these:
1. Encode the full form and no "2X" forms, i.e. no abbreviations of
abbreviations should be encoded.
2. Encode both forms (I'm not advocating that).
3. List 2X forms as glyph alternatives and allow font designers to use
*all* full forms or *all* 2X forms.
4. Encode only 2X forms.
Only in the last case, I think, does it make sense to unify NEL and NL.
Otherwise NEL is a full C1 form and NL is an EBCDIC form, which
unfortunately happens to coincide with the "2X" representation of NEL.
Any other argument for unification would lead us to unify the symbols
for all controls -- C0, C1, EBCDIC, Unicode, and otherwise -- that have
similar functions but different names, which would defeat the purpose of
having these glyphs to begin with.
- Frank
9-Oct-98 22:19:51-GMT,1665;000000000001
Return-Path: <jaltman>
Received: (from jaltman@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id SAA26443;
Fri, 9 Oct 1998 18:19:46 -0400 (EDT)
Sender: Jeffrey Altman <jaltman@watsun.cc.columbia.edu>
Date: Fri, 9 Oct 98 18:19:46 EDT
From: kermit-support@watsun.cc.columbia.edu
To: Karlsson Kent - keka <keka@im.se>
Cc: kermit-support@columbia.edu
Subject: Re: RE2: Kermit95
In-Reply-To: Your message of Fri, 9 Oct 1998 20:49:52 +0200
Reply-To: kermit-support@watsun.cc.columbia.edu
Message-ID: <CMM.0.90.4.907971586.jaltman@watsun.cc.columbia.edu>
One more note regarding Unicode UTF-8 and text terminals. If the text
terminal is an ANSI X3.64-1979 derivative and the character set is
UTF-8, all of the terminal command sequences must use the 7-bit
equivalents to the 8-bit C1 controls.
This may be a reason for why UTF-7 might be perferred in an text
terminal environment. I do not believe that UTF-7 would interfere
with the C1 control character range.
Also, modifications would need to be made to how the terminal responds
to character-set invocation comamnds. In general, I believe that all
of the ISO 2022 rules for character-set handling would need to be
ignored. The result is that you would be restricted to using only
those characters available in Unicode and could not use any of the
Special Graphics characters available in most terminal emulations for
box drawing.
Jeffrey Altman * Sr.Software Designer * Kermit-95 for Win32 and OS/2
The Kermit Project * Columbia University
612 West 115th St #716 * New York, NY * 10025
http://www.kermit-project.org/k95.html * kermit-support@kermit-project.org
9-Oct-98 17:08:26-GMT,2889;000000000011
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA02699
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 13:08:24 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA36498
; Fri, 9 Oct 1998 10:05:17 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA24722; Fri, 9 Oct 98 10:01:01 -0700
Message-Id: <9810091701.AA24722@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Uml-Sequence: 6128 (1998-10-09 17:00:45 GMT)
From: Otto Stolz <Otto.Stolz@uni-konstanz.de>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:00:44 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
Frank da Cruz had proposed:
> E0B3 Latin small letter a with underbar SNI Math 04/04 (2)
> E0B4 Latin capital letter O with underbar SNI Math 04/09 (2)
Rick McGowan wrote:
> I believe [those] two characters are just masculine and feminine
> ordinal indicators, and are already encoded between 0x80 and 0xFF, as part
> of ISO Latin 1. They are probably just variant glyphs... unless the
> documentation distinguishes them and they occur in pairs with lower-case.
Am 1998-10-8 um 14:07 hat Frank da Cruz geschrieben:
> The reason these need to be encoded
> separately from feminine/masculine ordinals are their size -- they fill the
> whole cell, like a regular letter. Since terminal emulators and data
> analyzers use fixed-pitch fonts, we can't just switch to another point size
> to display these characters, since that will wreck the matrix arrangement
> of the screen.
This is what I cannot understand.
If the Feminine, and Masculine, Ordinal Indicators (U+00AA, and U+00BA,
respectively) were written in the same fixed-pitch font as the surrounding
text, they also would occupy one cell, each, won't they?
As Frank had written in both of his own drafts:
> arriving at a sufficient set of character-cell terminal graphics for
> Unicode is complicated by the well-known problems that affect other
> preexisting character sets to varying degrees:
> 1. Lack of official names for the characters of some of the sets.
> 2. Lack of definitive, high-quality pictures of the glyphs in some cases.
> 3. Lack of descriptions of the purpose and intended use of the glyphs.
I think, those are good reasons not to take the glyphs in the Siemens
Nixdorf 97801-5xx Benutzerhandbuch too seriously -- good reasons to unify
these characters with the above-mentioned U+00AA and U+00BA. The only
reason, IMHO, not to unify them would be existence of a character set
containing different glyphs both for the proposed characters and the
existing ones -- as Rick has already noted.
Best wishes,
Otto Stolz
9-Oct-98 18:40:04-GMT,3883;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA28263
for <fdc@watsun.cc.columbia.edu>; Fri, 9 Oct 1998 14:40:02 -0400 (EDT)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA40650
; Fri, 9 Oct 1998 11:26:03 -0700
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25671; Fri, 9 Oct 98 10:52:15 -0700
Message-Id: <9810091752.AA25671@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6134 (1998-10-09 17:51:41 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 9 Oct 1998 10:51:40 -0700 (PDT)
Subject: Re: Terminal Graphics Draft 2
> If the Feminine, and Masculine, Ordinal Indicators (U+00AA, and U+00BA,
> respectively) were written in the same fixed-pitch font as the surrounding
> text, they also would occupy one cell, each, won't they?
>
Yes, but they would be too small. The SNI glyphs are full-size base
characters, but the ordinal indicator glyphs are superscripts.
> As Frank had written in both of his own drafts:
> > arriving at a sufficient set of character-cell terminal graphics for
> > Unicode is complicated by the well-known problems that affect other
> > preexisting character sets to varying degrees:
> > 1. Lack of official names for the characters of some of the sets.
> > 2. Lack of definitive, high-quality pictures of the glyphs in some cases.
> > 3. Lack of descriptions of the purpose and intended use of the glyphs.
>
> I think, those are good reasons not to take the glyphs in the Siemens
> Nixdorf 97801-5xx Benutzerhandbuch too seriously -- good reasons to unify
> these characters with the above-mentioned U+00AA and U+00BA. The only
> reason, IMHO, not to unify them would be existence of a character set
> containing different glyphs both for the proposed characters and the
> existing ones -- as Rick has already noted.
>
I tend to agree. The "strange" SNI glyphs are not a high priority, to me
personally at least. I have, however, posted a message to the Sinix newsgroup
(of SNI customers) to see if any strong opinions come to the surface. All I
can say from my own experience is that there was heavy demand for accurate
SNI terminal emulation for Windows 95/98/NT, and we met that demand as best
we could within the limitations of the code pages and fonts available to us.
For those of you not familiar with SNI 97801, it probably has the most
advanced ISO 2022 implementation and repertoire of character sets of any
terminal ever built -- at least in the West (it lacks Hebrew, Arabic, and
CJK, but includes ISO 8859-1,2,3,4,5,7,9, various ISO 646 versions, plus
a selection of "strange" private sets, and a wide variety of input methods).
To answer Otto's point with a question: what is a character set? I can see
both a superscript feminine ordinal and a "big" feminine ordinal on the same
screen simply by sending ISO 2022 escape sequences to switch "character sets".
So in a sense, all character sets that can be designated and invoked by ISO
2022 escape sequences form one big character set :-) See, for example:
http://www.columbia.edu/kermit/kuishots.html
Go down to Shot 3. This screen was produced using ISO 2022 escape sequences
from the host to a VT320 terminal emulator on Windows 95, with Lucida Console
as the (Unicode) font. The same screen could be produced by sending the exact
same data stream to the 97801. (This screen does not show any of the SNI
"strange" glyphs, but I hope it illustrates the point.)
Again, I have no great investment in these characters, and so far our SNI
users have not complained about their absence, but before striking them I
hope to hear some additional testimony from them.
- Frank
30-Oct-98 20:53:04-GMT,1988;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA29389
for <fdc@watsun.cc.columbia.edu>; Fri, 30 Oct 1998 15:53:03 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA49598
; Fri, 30 Oct 1998 12:50:52 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA17044; Fri, 30 Oct 98 11:48:35 -0800
Message-Id: <9810301948.AA17044@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6361 (1998-10-30 19:45:46 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Fri, 30 Oct 1998 11:45:42 -0800 (PST)
Subject: Re: Terminal Emulation
Doug commented:
>
> Excluding the hex-byte characters (which almost nobody seems to like),
> we're only talking about 256 characters, aren't we? I guess I don't
> understand why the opposition is so vigorous.
As *glyphs*, nobody cares. They're fine. Anybody who wants to use
glyphs like these to represent hex byte values may feel free to do
so, and nobody will object.
As *characters*, they are useless dreck. There is no reason to
introduce into a text stream a *character*--say U+2841--to serve
as a visible symbolic placeholder for the byte value 0x41. What
purpose does this serve? Debuggers translate *byte values* into
visibly displayed glyphs (either unitary, as proposed here, or
simply as sequences of glyphs for the hex digits, i.e. "41").
Adding an arbitrary layer of textual *characters* in between
just gets in the way of what the debugger should be doing.
Unicode is a *character* encoding standard. It is not a glyph
registry. People who want a registry of well-defined glyphs that
font vendors can use to produce common collections of displayable
glyphs (for terminal emulations or whatever) should be talking
to AFII, instead.
--Ken
30-Oct-98 22:51:33-GMT,1287;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA17156
for <fdc@watsun.cc.columbia.edu>; Fri, 30 Oct 1998 17:51:32 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA35960
; Fri, 30 Oct 1998 14:44:39 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA18475; Fri, 30 Oct 98 13:06:37 -0800
Message-Id: <9810302106.AA18475@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6366 (1998-10-30 21:01:33 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 30 Oct 1998 13:01:30 -0800 (PST)
Subject: Re: Terminal Emulation
Michael Everson wrote:
> > 2. The letterlike characters from the SNI Math set that I had in the
> > first draft but later withdrew are in fact from ISO Registration
> > 103: Teletex Supplementary Set of Graphic Characters from CCITT
> > (ITU-T) T.61. These include Lappish Eng...
>
> Sami eng.
>
Right, I knew that, sorry. (I was hurriedly copying the names from the
Teletex standard, which says "Lappish"...)
Revisions coming up momentarily.
- Frank
30-Oct-98 23:12:33-GMT,2135;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id SAA18729
for <fdc@watsun.cc.columbia.edu>; Fri, 30 Oct 1998 18:12:33 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id PAA25438
; Fri, 30 Oct 1998 15:10:28 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA18765; Fri, 30 Oct 98 13:16:27 -0800
Message-Id: <9810302116.AA18765@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6367 (1998-10-30 21:14:12 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 30 Oct 1998 13:14:11 -0800 (PST)
Subject: Terminal Charsets Proposal
Well, I ran out of time -- gotta go now, probably won't be back till
Monday, so there's no way to get these to the appropriate parties on
paper by Fedex on November 2nd. Maybe November 3rd?...
Anyway, the proposal is now split into 3: regular glyphs, control pics,
and hex bytes.
Michael Everson's glyph map is included, as well as an archive of the
email discussion and a document from SNI about their glyphs.
I would have liked to spend more time polishing -- and will do so before
I send them in for real. In the meantime, any comments will be
appreciated (but not responded to for a few days).
HEX BYTE PICTURES FOR UNICODE (plain text)
ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt
ADDITIONAL CONTROL PICTURES FOR UNICODE (plain text)
ftp://kermit.columbia.edu/kermit/ucsterminal/control
TERMINAL GRAPHICS FOR UNICODE (plain text)
ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
Glyph Map (PDF, binary, contributed by Michael Everson)
ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf
Clarification of SNI Glyphs (Microsoft Word 7.0, binary, from SNI)
ftp://kermit.columbia.edu/kermit/ucsterminal/sni-charsets.doc
Discussion (Unicode list e-mail, plain text)
ftp://kermit.columbia.edu/kermit/ucsterminal/mail.txt
- Frank
30-Oct-98 23:43:13-GMT,1360;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id SAA23024
for <fdc@watsun.cc.columbia.edu>; Fri, 30 Oct 1998 18:43:12 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id PAA36380
; Fri, 30 Oct 1998 15:34:42 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19518; Fri, 30 Oct 98 13:48:53 -0800
Message-Id: <9810302148.AA19518@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6370 (1998-10-30 21:45:39 GMT)
From: Rick McGowan <rmcgowan@apple.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 30 Oct 1998 13:45:38 -0800 (PST)
Subject: Re: Terminal Emulation
Doug Ewell wrote and Everson commented...
> > ... to include characters of debatable usefulness
> > rather than excluding them.
>
> No way! Include characters of limited usefulness, perhaps. But not of
> debatable usefulness.
I missed this one. Good call, Michael! Nobody wants things of debatable
usefulness. Either the committees as a whole are convinced of some utility,
and the characters go IN, or they're not convinced of utility, and the
characters stay OUT. (Luckily, nobody has to use or like every single
character...)
Rick
31-Oct-98 2:30:27-GMT,1649;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id VAA08939
for <fdc@watsun.cc.columbia.edu>; Fri, 30 Oct 1998 21:30:27 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id SAA27418
; Fri, 30 Oct 1998 18:27:33 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA00927; Fri, 30 Oct 98 16:46:07 -0800
Message-Id: <9810310046.AA00927@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6382 (1998-10-31 00:45:41 GMT)
From: "Joan Aliprand" <BR.JMA@rlg.org>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Fri, 30 Oct 1998 16:45:40 -0800 (PST)
Subject: Re: Terminal Charsets Proposal - alternative deadlines
REPLY TO 10/30/98 15:25 FROM UNICODE@UNICODE.ORG: Terminal Charsets Proposal
Frank:
> Well, I ran out of time -- gotta go now, probably won't be back till
> Monday, so there's no way to get these to the appropriate parties on
> paper by Fedex on November 2nd. Maybe November 3rd?...
You have two chances:
(a) Yes, I guess Arnold could hold off for one day. (I'll leave a message
for him.)
(b) But if you get your proposal to the Unicode Office by Monday, November
16, there is enough time to make copies for distribution at the UTC/L2
meeting (even with the Thanksgiving holiday intervening).
And it is possible for us to pull a copy from a site. (However, then you
are trusting that software and connections work correctly.)
-- Joan Aliprand
Chair, UTC
To: UNICODE@UNICODE.ORG
1-Nov-98 9:01:20-GMT,2440;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id EAA11750
for <fdc@watsun.cc.columbia.edu>; Sun, 1 Nov 1998 04:01:19 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id BAA19848
; Sun, 1 Nov 1998 01:00:40 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA06934; Sun, 1 Nov 98 00:42:50 -0800
Message-Id: <9811010842.AA06934@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Disposition: inline
X-Uml-Sequence: 6387 (1998-11-01 08:42:33 GMT)
From: Doug Ewell <dewell@compuserve.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Sun, 1 Nov 1998 00:42:31 -0800 (PST)
Subject: Re: Terminal Emulation
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id EAA11750
Kenneth Whistler <kenw@sybase.com> wrote:
> Doug commented:
>
>> Excluding the hex-byte characters (which almost nobody seems to
>> like), we're only talking about 256 characters, aren't we? I
>> guess I don't understand why the opposition is so vigorous.
>
> As *glyphs*, nobody cares. They're fine. Anybody who wants to use
> glyphs like these to represent hex byte values may feel free to do
> so, and nobody will object.
>
> As *characters*, they are useless dreck.
...
Sorry, I guess my use of the word "excluding" was somehow misleading.
I did not mean to appear to be supporting addition of the hex bytes
into Unicode. I meant to say that, IF the hex bytes were removed
from the proposal, we would be left with a single 256-character block
(which is not even fully populated) and that I wouldn't have guessed
that its addition would have caused so much controversy.
I should also point out for the benefit of Michael, Rick, and others
that I nearly used the phrase "limited usefulness" instead of
"debatable usefulness," and in retrospect should have. I meant
"debatable" from the perspective of individual users, but "limited"
from the perspective of the committees. All character sets have at
least one character that SOMEBODY might think is not necessary, as
evidenced by the case of the gentleman who wanted to replace the
supposedly useless vertical bar in ASCII with the Euro symbol.
Cheers,
-Doug
1-Nov-98 11:24:09-GMT,2545;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id GAA20339
for <fdc@watsun.cc.columbia.edu>; Sun, 1 Nov 1998 06:24:08 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id DAA63194
; Sun, 1 Nov 1998 03:23:26 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA07272; Sun, 1 Nov 98 03:17:44 -0800
Message-Id: <9811011117.AA07272@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Uml-Sequence: 6388 (1998-11-01 11:17:31 GMT)
From: Michael Everson <everson@indigo.ie>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Sun, 1 Nov 1998 03:17:30 -0800 (PST)
Subject: Re: Terminal Emulation
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id GAA20339
Ar 00:42 -0800 1998-11-01, scrφobh Doug Ewell:
>Sorry, I guess my use of the word "excluding" was somehow misleading.
>I did not mean to appear to be supporting addition of the hex bytes
>into Unicode. I meant to say that, IF the hex bytes were removed
>>from the proposal, we would be left with a single 256-character block
>(which is not even fully populated) and that I wouldn't have guessed
>that its addition would have caused so much controversy.
>I should also point out for the benefit of Michael, Rick, and others
>that I nearly used the phrase "limited usefulness" instead of
>"debatable usefulness," and in retrospect should have. I meant
>"debatable" from the perspective of individual users, but "limited"
>from the perspective of the committees.
Any character the users aren't sure they want should not be proposed to the
committees.
>All character sets have at
>least one character that SOMEBODY might think is not necessary, as
>evidenced by the case of the gentleman who wanted to replace the
>supposedly useless vertical bar in ASCII with the Euro symbol.
>>>Shhhhh!<<< One oughtn't want say this too loudly. Best not encourage
>>>people to think of such things. (In ASCII one writes "EUR " or "E" if
>>>one cannot otherwise represent the EURO SIGN.)
--
Michael Everson, Everson Gunn Teoranta ** http://www.indigo.ie/egt
15 Port Chaeimhghein ═ochtarach; Baile ┴tha Cliath 2; ╔ire/Ireland
Guthßn: +353 1 478-2597 ** Facsa: +353 1 478-2597 (by arrangement)
27 Pßirc an FhΘithlinn; Baile an Bh≤thair; Co. ┴tha Cliath; ╔ire
2-Nov-98 14:35:21-GMT,2873;000000000011
Return-Path: <oleary@msmail.awii.com>
Received: from timone.hac.awii.com (nat17.awii.com [208.133.247.17])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id JAA22408
for <fdc@watsun.cc.columbia.edu>; Mon, 2 Nov 1998 09:35:21 -0500 (EST)
Received: by timone with Internet Mail Service (5.0.1460.8)
id <4D96PAWD>; Mon, 2 Nov 1998 09:36:05 -0500
Message-ID: <D2815B3074C2D1118F2B0060975B95283E3E8D@timone>
From: "O'Leary, Sean (NJ)" <oleary@msmail.awii.com>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Subject: RE: Terminal Charsets Proposal
Date: Mon, 2 Nov 1998 09:36:03 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.0.1460.8)
Content-Type: text/plain
Frank,
I have found cases where a hex bytes area would be extremely useful. To me,
the hex bytes are at least as useful as the Braille or control character
encodings. It does not seem likely that the hex bytes will make it into
Unicode's BMP, but I am still interested in tracking which directions this
proposal goes.
I would be interested in reviewing your site at:
ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt
but the site rejected my login attempts. Is this site for public viewing?
Thanks,
Sean O'Leary
Software Internationalization
Automated Wagering International
(201) 489-5950
email: oleary@awii.com
> -----Original Message-----
> From: Frank da Cruz [SMTP:fdc@watsun.cc.columbia.edu]
> Sent: Friday, October 30, 1998 4:14 PM
> To: Unicode List
> Subject: Terminal Charsets Proposal
>
> Well, I ran out of time -- gotta go now, probably won't be back till
> Monday, so there's no way to get these to the appropriate parties on
> paper by Fedex on November 2nd. Maybe November 3rd?...
>
> Anyway, the proposal is now split into 3: regular glyphs, control pics,
> and hex bytes.
>
> Michael Everson's glyph map is included, as well as an archive of the
> email discussion and a document from SNI about their glyphs.
>
> I would have liked to spend more time polishing -- and will do so before
> I send them in for real. In the meantime, any comments will be
> appreciated (but not responded to for a few days).
>
> HEX BYTE PICTURES FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt
>
> ADDITIONAL CONTROL PICTURES FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/control
>
> TERMINAL GRAPHICS FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
>
> Glyph Map (PDF, binary, contributed by Michael Everson)
> ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf
>
> Clarification of SNI Glyphs (Microsoft Word 7.0, binary, from SNI)
> ftp://kermit.columbia.edu/kermit/ucsterminal/sni-charsets.doc
>
> Discussion (Unicode list e-mail, plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/mail.txt
>
> - Frank
2-Nov-98 19:30:37-GMT,2884;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA11073
for <fdc@watsun.cc.columbia.edu>; Mon, 2 Nov 1998 14:30:29 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id LAA69238
; Mon, 2 Nov 1998 11:27:30 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA14632; Mon, 2 Nov 98 11:10:50 -0800
Message-Id: <9811021910.AA14632@unicode.org>
Errors-To: uni-bounce@unicode.org
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
X-Uml-Sequence: 6401 (1998-11-02 19:10:25 GMT)
From: Mark Davis <medavis2@us.ibm.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Mon, 2 Nov 1998 11:10:24 -0800 (PST)
Subject: New draft Unicode technical reports available for review
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id OAA11073
Unicode Technical Committee (UTC) meeting #78 will be held the first week of
December. As at every meeting, technical reports on
http://www.unicode.org/unicode/reports/techreports.html will come up for
discussion or approval. These papers can have significant impact on the
recommendations for implementations and on Unicode conformance. Topics include
how Unicode text is normalized for program identifiers and on the web, how
Unicode text should line-break, how to deal with characters that can either
have full-width or half-width in East Asian contexts, and how to sort Unicode
characters.
If you have any feedback on these topics, be sure to review the documents, and
send your feedback to contact point listed in each paper. For consideration at
any UTC meeting, you should make sure that your comments are sent well before
the meeting dates.
The draft technical reports include:
UTR #15: Unicode Normalization Forms
UTR #14: Line Breaking Properties
UTR #13: Unicode Newline Guidelines
UTR #11: East Asian Character Width
UTR #10: Unicode Collation Algorithm
In addition. we will be now be posting proposed draft technical reports as they
become available. These are in an earlier stage of development, and have not
yet been considered by the UTC, so feedback is especially valuable. Topics
include how to handle Unicode characters in regular expressions, the structure
and terminology for character encodings, handling Unicode on EBCDIC systems,
coding annotations (Ruby), and the Unicode BIDI algorithm.
UTR #18: Unicode Regular Expression Guidelines
UTR #17: Character Encoding Model
UTR #16: EBCDIC-Friendly UCS Transformation Format
UTR #12: Support for Interlinear Annotations
UTR #9: The Bidirectional Algorithm Reference Implementation
(UTR #6: Standard Compression Scheme for Unicode (SCSU) has also been
updated--editorial fixes only.)
4-Nov-98 4:19:06-GMT,3638;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id XAA22069
for <fdc@watsun.cc.columbia.edu>; Tue, 3 Nov 1998 23:19:05 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id UAA71048
; Tue, 3 Nov 1998 20:18:22 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA25646; Tue, 3 Nov 98 20:06:44 -0800
Message-Id: <9811040406.AA25646@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6418 (1998-11-04 04:06:32 GMT)
From: "Julia Oesterle (Unicode)" <v-juliao@microsoft.com>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Tue, 3 Nov 1998 20:06:31 -0800 (PST)
Subject: Re: Terminal Emulation from Frank da Cruz
this one went astray...resend.
Date: Tue, 3 Nov 98 19:25:39 EST
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Subject: Re: Terminal Emulation
>
> Rich McGowan wrote:
>
> > I'd suggest the best course of action for now would be to bring the
> > proposal to the attention of UTC members. (Some of them are on the
> > Unicode list, others aren't.) Ken and I can do this part, which merely
> > involves sending UTC some pointers and a blurb about the proposal, to
> > solicit their consideration and feedback.
> >
> Thanks. I think it's ready for this. The proposal has been split into
> three (as noted previously), and updated again today for polish and
> constistency (from the rushed hatchet job of last Friday):
>
> HEX BYTE PICTURES FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/hex.txt
>
> ADDITIONAL CONTROL PICTURES FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/control.txt
>
> TERMINAL GRAPHICS FOR UNICODE (plain text)
> ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
>
> Glyph Map (PDF, contributed by Michael Everson) (*)
> ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf
>
> Clarification of SNI Glyphs (Microsoft Word 7.0, from SNI)
> ftp://kermit.columbia.edu/kermit/ucsterminal/sni-charsets.doc
>
> Discussion (plain text -- from Unicode mailing list)
> ftp://kermit.columbia.edu/kermit/ucsterminal/mail.txt
>
> I think the hex bytes proposal is worth another look by those on both
> sides
> of the fence. It has been beefed up considerably in the motivation /
> justification department.
>
> After several more days of comment, I'll send them in on paper.
>
> Thanks again to everybody for all the help (and patience).
>
> - Frank
>
> (*) Michael's glyph map is based on previous drafts; some of the
> characters
> shown in it have since been eliminated from the proposal.
> ------------------------------Header--------------------------------------
> ---
> From watsun.cc.columbia.edu!fdc Tue Nov 3 17:09:57 1998
> Received: from watsun.cc.columbia.edu by unicode.com id aa28351;
> 3 Nov 98 17:09 PST
> Received: from watsun.cc.columbia.edu (watsun.cc.columbia.edu
> [128.59.39.2])
> by mail1.dynamic.com (8.8.5/8.8.5) with ESMTP id QAA15248
> for <unicode@unicode.com>; Tue, 3 Nov 1998 16:25:31 -0800
> Received: (from fdc@localhost)
> by watsun.cc.columbia.edu (8.8.5/8.8.5) id TAA24416;
> Tue, 3 Nov 1998 19:25:40 -0500 (EST)
> Date: Tue, 3 Nov 98 19:25:39 EST
> From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
> Subject: Re: Terminal Emulation
> In-Reply-To: Your message of Thu, 29 Oct 1998 14:18:14 -0800
> To: unicode@unicode.com
> Message-ID: <CMM.0.90.4.910139139.fdc@watsun.cc.columbia.edu>
9-Nov-98 21:09:50-GMT,2560;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id QAA13219
for <fdc@watsun.cc.columbia.edu>; Mon, 9 Nov 1998 16:09:49 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id NAA26578
; Mon, 9 Nov 1998 13:09:21 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA18733; Mon, 9 Nov 98 13:00:04 -0800
Message-Id: <9811092100.AA18733@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6450 (1998-11-09 20:59:44 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Mon, 9 Nov 1998 12:59:43 -0800 (PST)
Subject: Re: Displaying Plane 1 characters (annotating the code table
Markus Scherer noted:
> However, it probably makes sense for files as an easy and somewhat compact
> format, and it makes sense for the number of possible characters: 1M + 64k,
> including 128k+6400 private use character code points. There are about 38000
> characters assigned so far, with about 20000-30000 more in the pipeline.
Here are the exact values of what currently is encoded and what Unicode 3.0
will contain (synched with the prospective content of the republication
of ISO/IEC 10646-1):
Unicode 2.1:
6813 Misc. characters
20902 Unihan
11172 Johab Hangul
6400 Private use
2048 Surrogates
65 Controls
2 Not characters
18134 Unassigned assignable
38887 Assigned graphic characters
Unicode 3.0 (prospective, as of November 3, 1998):
10554 Misc. characters
20902 Unihan
6582 Unihan Extension A
11172 Johab Hangul
6400 Private use
2048 Surrogates
65 Controls
2 Not characters
7811 Unassigned assignable
49210 Assigned graphic characters
For a net gain of 10323 new characters.
Others have noted the following, but I would like to reiterate, so that
*correct* rumors can circulate, instead of incorrect ones:
Unicode 3.0 will *not* contain any encoded characters requiring surrogates.
The republication of ISO/IEC 10646-1 will *not* contain any encoded
characters outside of the Basic Multilingual Plane.
Plane 1 (and 2 and 14) are for ISO/IEC 10646-2, which is still in
working draft and which has not yet even started a CD ballot. When 10646
Part 2 progresses far enough, we anticipate publishing a Version 4.0 of
the Unicode Standard -- and *that* will make use of surrogate codes
to access encoded characters on Planes 1 and beyond.
--Ken Whistler
9-Nov-98 21:53:20-GMT,3634;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id QAA28069;
Mon, 9 Nov 1998 16:52:57 -0500 (EST)
Date: Mon, 9 Nov 98 16:52:57 EST
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: Joan Aliprand <BR.JMA@rlg.org>
cc: Ken Whistler <kenw@sybase.com>, Rick McGowan <rmcgowan@apple.com>,
"Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
Subject: Re: Terminal Charsets Proposal - alternative deadlines
In-Reply-To: Your message of Fri, 30 Oct 1998 16:45:40 -0800 (PST)
Message-ID: <CMM.0.90.4.910648377.fdc@watsun.cc.columbia.edu>
> > Well, I ran out of time -- gotta go now, probably won't be back till
> > Monday, so there's no way to get these to the appropriate parties on
> > paper by Fedex on November 2nd. Maybe November 3rd?...
>
> You have two chances:
> ...
> (b) But if you get your proposal to the Unicode Office by Monday, November
> 16, there is enough time to make copies for distribution at the UTC/L2
> meeting (even with the Thanksgiving holiday intervening).
>
Well, I posted an announcement of the latest drafts about a week ago (to the
wrong address, but you reposted them, thanks!) and have not heard a peep, so
I suppose they must be ready to go.
> And it is possible for us to pull a copy from a site. (However, then you
> are trusting that software and connections work correctly.)
>
The relevant files are all available via anonymous ftp to
kermit.columbia.edu [128.59.39.2], directory kermit/ucsterminal. Transfer
all files in text mode except the ones marked (*):
-rw-rw-r-- 1 fdc 1067 Nov 4 11:14 README.TXT
-rw-rw-r-- 1 fdc 41001 Nov 3 19:12 control.txt
-rw-rw-r-- 1 fdc 14665 Nov 3 19:12 hex.txt
-rw-rw-r-- 1 fdc 257434 Nov 9 16:14 mail.txt
-rw-rw-r-- 1 fdc 42496 Oct 30 12:46 sni-charsets.doc (*)
-rw-rw-r-- 1 fdc 88216 Oct 30 12:45 terminal-emulation.pdf (*)
-rw-rw-r-- 1 fdc 38763 Nov 3 19:12 ucsterminal.txt
-rw-rw-r-- 1 fdc 44534 Sep 30 21:27 ucsterminal_01.txt
-rw-rw-r-- 1 fdc 59180 Oct 7 20:03 ucsterminal_02.txt
-rw-rw-r-- 1 fdc 37651 Oct 30 15:52 ucsterminal_03.txt
(*) Transfer these in binary mode.
The three proposals are:
-rw-rw-r-- 1 fdc 41001 Nov 3 19:12 control.txt
-rw-rw-r-- 1 fdc 14665 Nov 3 19:12 hex.txt
-rw-rw-r-- 1 fdc 38763 Nov 3 19:12 ucsterminal.txt
The Unicode mail list discussion is:
-rw-rw-r-- 1 fdc 257434 Nov 9 16:14 mail.txt
The glyph maps are in a PDF file from Michael Everson:
-rw-rw-r-- 1 fdc 88216 Oct 30 12:45 terminal-emulation.pdf
Clarification on the mysterious Siemens Nixdorf glyphs is in the
following Microsoft Word file:
-rw-rw-r-- 1 fdc 42496 Oct 30 12:46 sni-charsets.doc
And the following are earlier drafts of the original monolithic proposal:
-rw-rw-r-- 1 fdc 44534 Sep 30 21:27 ucsterminal_01.txt
-rw-rw-r-- 1 fdc 59180 Oct 7 20:03 ucsterminal_02.txt
-rw-rw-r-- 1 fdc 37651 Oct 30 15:52 ucsterminal_03.txt
I also have a set of "exhibits" on paper, which are photocopies of
character-set tables from a selection of terminal manuals. These are
listed at the end of ucsterminal.txt. I don't have any way to put them
online, so I'll be glad to send them by fedex to any address you designate.
However, I don't think they could arrive at a Post Office box in time for
the deadline.
Can I consider the proposals submitted? (Your web page says to send paper
"unless prior arrangements have been made for receipt of electronic copy").
Thanks!
- Frank
9-Nov-98 22:49:26-GMT,1317;000000000011
Return-Path: <BR.JMA@RLG.ORG>
Received: from RLG.ORG (rlg.org [204.161.104.131])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with SMTP id RAA15456
for <fdc@watsun.cc.columbia.edu>; Mon, 9 Nov 1998 17:49:24 -0500 (EST)
Message-Id: <199811092249.RAA15456@watsun.cc.columbia.edu>
Date: Mon, 9 Nov 98 14:48:50 PST
From: "Joan Aliprand" <BR.JMA@RLG.ORG>
To: fdc@watsun.cc.columbia.edu
Subject: Re: Terminal Charsets Proposal - alternative deadlines
REPLY TO 11/09/98 13:52 FROM FDC@WATSUN.CC.COLUMBIA.EDU "Frank da Cruz": Re:
Terminal Charsets Proposal - alternative deadlines
Dear Frank,
I have put your Terminal Charsets Proposal on the agenda for the UTC/L2
joint meeting in December. However, the agenda has many items that are
time-critical for Version 3.0, so I cannot guarantee whether this proposal
will be discussed at this meeting.
I have been unable to connect to the Kermit FTP server this afternoon. (I
tried from the Kermit Web site, as well as the direct IP address you gave.)
If problems persist, I may have to ask you to send the main documents
(i.e., the three proposals) by express mail or fax to the Unicode Office.
Yours sincerely,
-- Joan Aliprand
Chair, UTC
To: FDC@WATSUN.CC.COLUMBIA.EDU
cc: KEN(KENW@SYBASE.COM), RICK(RMCGOWAN@APPLE.COM),
HART(EDWIN.HART@JHUAPL.EDU)
21-Nov-98 12:43:55-GMT,3190;000000000005
Return-Path: <markus.kuhn@cl.cam.ac.uk>
Received: from mailrelay1.cc.columbia.edu (mailrelay1.cc.columbia.edu [128.59.35.143])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id HAA01746
for <fdc@watsun.cc.columbia.edu>; Sat, 21 Nov 1998 07:43:48 -0500 (EST)
Received: from heaton.cl.cam.ac.uk (heaton.cl.cam.ac.uk [128.232.32.11])
by mailrelay1.cc.columbia.edu (8.8.5/8.8.5) with SMTP id HAA17231
for <fdc@columbia.edu>; Sat, 21 Nov 1998 07:43:35 -0500 (EST)
Received: from trillium.cl.cam.ac.uk (cl.cam.ac.uk) [128.232.8.5] (mgk25)
by heaton.cl.cam.ac.uk with esmtp (Exim 1.82 #1)
id 0zhCNx-0001jE-00; Sat, 21 Nov 1998 12:43:33 +0000
X-Mailer: exmh version 2.0.2+CL 2/24/98
To: fdc@columbia.edu
cc: unicode@unicode.org
Subject: UCS Terminal Emulation Draft
X-URL: http://www.cl.cam.ac.uk/~mgk25/
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Sat, 21 Nov 1998 12:43:31 +0000
From: Markus Kuhn <Markus.Kuhn@cl.cam.ac.uk>
Message-Id: <E0zhCNx-0001jE-00@heaton.cl.cam.ac.uk>
A few questions/suggestions on
ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
which I am implementing at the moment.
> * E0A6 Extensible UR or LL brace section IBM SS240000
> * E0A7 Extensible LR or UL brace section IBM SS250000
I don't understand why there are not four of these. How can UR and LL be
unified?
> * E0AE Right ceiling corner DEC Tech 03/05
> * E0AF Right floor corner DEC Tech 03/06
What are these good for? Big floor-ceiling operators can already be
constructed using the bracket segments. And why are there only right
versions of these?
> E0EF Box drawing double dash H DGL 03/12 (5)
> (5) Similar to U+2504 but double rather than triple.
What is the difference to U+254c (BOX DRAWINGS LIGHT DOUBLE DASH
HORIZONTAL)? Michael's glyph in
ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf
doesn't seem to fit the description here. This character is still a bit
confusing.
> E0D8 H line - Scan 5 DSG 07/01, Wyse ANSI 02/02 (2)
I think, this one should be unified with U+2500. The E0D6-E0DA
characters should also be renamed, as a scan line count is ambiguous and
resolution dependent. Something like
E0D6 BOX DRAWINGS LIGHT HORIZONTAL UPPER ONE SIXTH
E0D7 BOX DRAWINGS LIGHT HORIZONTAL UPPER TWO SIXTH
E0D9 BOX DRAWINGS LIGHT HORIZONTAL LOWER TWO SIXTH
E0DA BOX DRAWINGS LIGHT HORIZONTAL UPPER ONE SIXTH
and together with 2500 we would then have all the required lines.
An implementation of these characters is now available in
http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz
in the file 6x13-future.bdf, in which I collect proposed implementations
of post-Unicode 2.1 characters for my 6x13 font. It would be nice if you
could have a look at these characters. [BTW: The 6x13.bdf file is now
complete and will be added to various Linux distributions in a few days.
This is your last chance to send me bug reports and suggestions for this
free Unicode xterm font before wide distribution.]
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
23-Nov-98 18:59:08-GMT,4546;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id NAA05690
for <fdc@watsun.cc.columbia.edu>; Mon, 23 Nov 1998 13:59:05 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id KAA68464
; Mon, 23 Nov 1998 10:56:50 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA19230; Mon, 23 Nov 98 10:49:18 -0800
Message-Id: <9811231849.AA19230@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6651 (1998-11-23 18:49:05 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: unicode@unicode.org
Date: Mon, 23 Nov 1998 10:49:02 -0800 (PST)
Subject: Re: UCS Terminal Emulation Draft
Hi Markus.
> A few questions/suggestions on
>
> ftp://kermit.columbia.edu/kermit/ucsterminal/ucsterminal.txt
>
> which I am implementing at the moment.
>
> > * E0A6 Extensible UR or LL brace section IBM SS240000
> > * E0A7 Extensible LR or UL brace section IBM SS250000
>
> I don't understand why there are not four of these. How can UR and LL be
> unified?
>
Because they look exactly the same :-) (IBM being clever...)
> > * E0AE Right ceiling corner DEC Tech 03/05
> > * E0AF Right floor corner DEC Tech 03/06
>
> What are these good for? Big floor-ceiling operators can already be
> constructed using the bracket segments. And why are there only right
> versions of these?
>
They're not centered vertically or horizontally. Do you have a DEC terminal
manual? They look about like this:
+------------------+ +------------------+
| | | |
| -------------+ | | |
| | | | | |
| | | | | |
| | | -------------+ |
| | | |
+------------------+ +------------------+
03/05 03/06
> > E0EF Box drawing double dash H DGL 03/12 (5)
> > (5) Similar to U+2504 but double rather than triple.
>
> What is the difference to U+254c (BOX DRAWINGS LIGHT DOUBLE DASH
> HORIZONTAL)? Michael's glyph in
>
> ftp://kermit.columbia.edu/kermit/ucsterminal/terminal-emulation.pdf
>
> doesn't seem to fit the description here. This character is still a bit
> confusing.
>
I might have missed the glyph at U+254C -- I think this one might be a
candidate for unification. The DG character, however, has wider spacing
between the dashes (for what it's worth).
> > E0D8 H line - Scan 5 DSG 07/01, Wyse ANSI 02/02 (2)
>
> I think, this one should be unified with U+2500.
>
I think I commented on this in the proposal. Yes, they should be unified,
but only if it can be guaranteed that the unified character works in both
contexts (PC-style box drawing and VT-style box drawing). I don't see any
reason why it shouldn't, but I'm not a font designer.
> The E0D6-E0DA
> characters should also be renamed, as a scan line count is ambiguous and
> resolution dependent. Something like
>
> E0D6 BOX DRAWINGS LIGHT HORIZONTAL UPPER ONE SIXTH
> E0D7 BOX DRAWINGS LIGHT HORIZONTAL UPPER TWO SIXTH
> E0D9 BOX DRAWINGS LIGHT HORIZONTAL LOWER TWO SIXTH
> E0DA BOX DRAWINGS LIGHT HORIZONTAL UPPER ONE SIXTH
>
> and together with 2500 we would then have all the required lines.
>
I'm certainly not averse to this.
> An implementation of these characters is now available in
>
> http://www.cl.cam.ac.uk/~mgk25/download/ucs-fonts.tar.gz
>
> in the file 6x13-future.bdf, in which I collect proposed implementations
> of post-Unicode 2.1 characters for my 6x13 font.
>
You've encoded them in the private-use area, right? Hopefully final resting
places will be designated for them in the U+2xxx region, and the repertoire
and/or sequencing might be altered. For that matter the entire proposal might
be rejected. In the latter case, of course, we can just keep these characters
where they are.
> It would be nice if you
> could have a look at these characters. [BTW: The 6x13.bdf file is now
> complete and will be added to various Linux distributions in a few days.
> This is your last chance to send me bug reports and suggestions for this
> free Unicode xterm font before wide distribution.]
>
I don't have a way to look at BDF files at the moment so can't comment --
let's take this offline...
Thanks!
- Frank
23-Nov-98 22:37:12-GMT,1527;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id RAA12301
for <fdc@watsun.cc.columbia.edu>; Mon, 23 Nov 1998 17:37:11 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id OAA49562
; Mon, 23 Nov 1998 14:33:24 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA21773; Mon, 23 Nov 98 14:15:36 -0800
Message-Id: <9811232215.AA21773@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6657 (1998-11-23 22:15:25 GMT)
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
Reply-To: unicode@unicode.org
To: Unicode List <unicode@unicode.org>
Date: Mon, 23 Nov 1998 14:15:24 -0800 (PST)
Subject: Re: Glyphs of new Unicode 3.0 symbols
Roman Czyborra <czyborra@cs.tu-berlin.de> wrote:
> These exist in http://czyborra.com/unifont/.
>
> > 237E BELL SYMBOL
>
If this is an approved addition, and it is indeed a picture of a
bell, it can be unified with the "Picture of Bell" character in the
"Additional Control Pictures for Unicode" proposal.
> I also would like to see a standardized APPLE.
>
I thought corporate logos were off limits. Note that Data General
terminals also include a DG-logo glyph. There are no doubt several
others. Well, come to think of it, the number of such logos would
be bounded by the number of corporations (and other organizations).
Looks, sounds, and smells like a can of worms to me!
- Frank
30-Nov-98 20:43:16-GMT,6376;000000000001
Return-Path: <unicode@unicode.org>
Received: from public.lists.apple.com (public.lists.apple.com [17.254.0.151])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA22648
for <fdc@watsun.cc.columbia.edu>; Mon, 30 Nov 1998 15:43:09 -0500 (EST)
Received: from unicode.org (unicode2.apple.com [17.254.3.212])
by public.lists.apple.com (8.9.1a/8.9.1) with SMTP id MAA37976
; Mon, 30 Nov 1998 12:42:00 -0800
Received: by unicode.org (NX5.67g/NX3.0S)
id AA00477; Mon, 30 Nov 98 12:13:45 -0800
Message-Id: <9811302013.AA00477@unicode.org>
Errors-To: uni-bounce@unicode.org
X-Uml-Sequence: 6845 (1998-11-30 20:12:10 GMT)
From: kenw@sybase.com (Kenneth Whistler)
Reply-To: considered_harmful@unicode.org
To: Unicode List <unicode@unicode.org>
Cc: kenw@sybase.com
Date: Mon, 30 Nov 1998 12:12:09 -0800 (PST)
Subject: Re: Glyphs of new Unicode 3.0 symbols
MIME-Version: 1.0
Content-Type: text/plain; charset=unknown-8bit
Content-Transfer-Encoding: 8bit
Roman suggested:
>
> Speaking of Unicode 3.0 (thank you all for the many enlightening
> details!) I would like to express my wish for the following additions
> to the Unicode 3.0 CD-ROM for implementor's convenience:
>
> 2. Add an "age" field to the unidata.txt to specify since which
> Unicode version each character has been defined:
> "1.0", "1.1", "2.0", "2.1", or "3.0"
This is under active consideration for a much revised and extended
form of the Unicode Character Database data to accompany the release
of the Unicode Standard, Version 3.0. However, do not expect it to
simply be an additional field for the UnicodeData-X.Y.Z.txt file. The
format and field content of that file have been fixed for long enough that
there are multiple implementations out there that parse it with
particular assumptions about its format. There is an ongoing discussion,
but chances are that new data files will be introduced, with similar,
but new formats, for additional information provided about characters
in the future.
>
> 3. Add an "ASCII transliteration" mapping to each Unicode character
> so that it can be rendered readable in ASCII contexts
This suggestion got thoroughly chewed over last week. Suffice it to
say that this is *way* down the priority list for those of us working
on the properties, attributes, and sundry characteristics of characters.
I consider this to be A) a black hole, and B) a great opportunity for
the vendors and industrious entrepeneurs to come up with appropriate
solutions for different classes of applications and groups of customers.
It is certainly not ripe for an ad hoc standardization by the
Unicode Consortium.
>
> 4. Make the names.txt equivalent to the book's charts by illustrating
> it with UTF-8 characters, for example
>
> 0025 % PERCENT SIGN
> x (arabic percent sign - 066A ٪)
> x (per mille sign - 2030 ‰)
> x (per ten thousand sign - 2031 ‱)
This is, of course, a fairly simple thing to do, but it has annoying
edge cases, since there are four digit years and four digit standards
citations in the file that have to be filtered so they don't produce
erroneous conversions. (For an example of the problem, see the note under
U+0197 in the Unicode Standard, Version 2.0.)
The transformation from the format of the text-only version of the
names list to the formatted, final version of the names list is fairly
complex and subtle. We will certainly again be placing the text-only
version of the names list on the CD-ROM, but the amount of special-purpose
massaging we do to it is a matter of resource contention with other
tasks for publication.
>
> 6. Add mapping tables for the other ISO standards listed as source
> standards in chapter R.1 but not in mappings/iso*/
As someone else speculated, much of this information is not just
"available" and being held back -- it is implicit in mountains of
standards documents, explicit but scattered in various vendors'
implementations of mappings, but not sitting ready somewhere to just
stick on the CD-ROM.
We'll put what we have available, but even reviewing and updating the
sometimes outdated information in the tables we *do* have is going
to be a major task.
Frank asked:
> > > 237E BELL SYMBOL
> >
> If this is an approved addition, and it is indeed a picture of a
> bell, it can be unified with the "Picture of Bell" character in the
> "Additional Control Pictures for Unicode" proposal.
This one is from ISO 2047 (see also DIN 66 213). Yes, it could (and should)
have been unified with U+2407 SYMBOL FOR BELL, but that is not what the
ISO committee decided. But this is not the only instance in which a graphic
representation of a control code has taken on a life of its own as a
separate graphic character. Think of U+237E BELL SYMBOL now as a cute
little mushroom with legs in the technical symbols area. A propos for
representation of a door buzzer, or whatever... But if the terminal
graphics proposal needs both a "BEL" and a character for a picture of
a bell, this is it.
Roman asked:
> And is U+3004 JAPANESE INDUSTRIAL STANDARD SYMBOL no corporate symbol?
Yep, but there are always exceptions. This one is in Unicode because,
although this symbol is not in JIS standards (X 0208, for example), it
is universally used in Japanese JIS dictionaries as a little symbol to
indicate the JIS value of a character. So even if someone could point to
a claim that this is a trademarked logo, it has been genericized by
usage.
Tim Partridge asked:
> Back to the subject of what would be useful on the Unicode 3.0 CD, how about
> a list of the characters used by various languages? (Perhaps with
> classifications like "essential" and "only in foreign words".) Could the
> European subsetters be persuaded to contribute their data? The Cyrillic and
> Arabic blocks also merit attention.
This would be a nice thing to have, but is also a tremendous amount of
work and an open-ended project, since there are disagreements about the
status of various letters even within well-known languages, and there
are potentially 1000's of languages to deal with.
As for whether the European subsetters could be persuaded to contribute
their data, it might be more efficient for us to simply point at their
results for European languages when they stabilize and are available
in a public place. [CEN Workshop Agreement (CWA) on Alphabets of Europe]
--Ken Whistler
7-Dec-98 15:57:11-GMT,7925;000000000005
Return-Path: <Edwin.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id KAA21896
for <fdc@watsun.cc.columbia.edu>; Mon, 7 Dec 1998 10:57:07 -0500 (EST)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <XX0C3C3R>; Mon, 7 Dec 1998 10:56:32 -0500
Message-ID: <91D1D51C2955D111B82B00805F19989501CD7290@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
To: "'da Cruz, Frank'" <fdc@watsun.cc.columbia.edu>
Cc: "'Aliprand, Joan'" <br.jma@rlg.org>, "'Whistler, Ken'"
<kenw@sybase.com>,
"'McGowan, Rick'" <rmcgowan@apple.com>,
"'Thewlis, Dave'" <dthewlis@dcta.com>
Subject: UTC response to your request to encode characters for terminal em
ulation
Date: Mon, 7 Dec 1998 10:56:31 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id KAA21896
Frank,
What follows is the text of the response of the UTC to your proposals. (If
you would prefer, I also have a Word97 version.) The UTC needs some
additional information from you (and IBM and SHARE) before it decides about
the "Terminal Graphics for Unicode" proposal. The next UTC meeting is in
February in Palo Alto and we would appreciate your response by then.
If you want, we can talk about this.
Best regards,
Ed
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu <mailto:edwin.hart@jhuapl.edu>
1998-December-07
To: Frank da Cruz
From: Unicode Technical Committee
Subject: Response to your three proposals to encode characters for
terminal emulation.
Thank you for your three proposals for encoding terminal-emulation
characters in Unicode. The proposals were well organized, thorough, and
well researched.
Results
The UTC acknowledges the concerns raised in your three documents. The UTC
had extended discussions on these documents at its December, 1998 meeting in
San JosΘ. Here is the result of these discussions for each paper.
1. Document L2/98-353, "Additional Control Pictures for Unicode"
Status: rejected
The UTC believes that the proposed glyphs would be used as an
alternate way to display control characters rather than to interchange
information; e.g., to document control sequences. The UTC decided not to
encode these glyphs in Unicode.
However, the UTC noted that a bell glyph may have value in other
contexts and so could be encoded for another purpose in the future. In
addition, the UTC noted that Unicode 3.0 aligns the abbreviations for
control characters along a diagonal as you had requested.
As a secondary concern, encoding glyphs for control characters is an
open-ended proposition. The UTC knows that multiple sets of control
characters are defined for the C1 control area. For example, ISO had two
standards defining control characters, ISO/IEC 6429 and ISO 6630. When
someone proposes a new set of C1 control characters, should they also be
considered for encoding? What should be encoded? Should exactly one glyph
be encoded per control-character code position or should multiple glyphs be
encoded for the same control-character code position? These are examples of
concerns underlying the UTC decision rather than a request for you to answer
the questions.
2. Document L2/98-354, "Terminal Graphics for Unicode"
Status: deferred for additional information
The UTC has requested more information before it makes a decision.
Table 5.1, range of E080 to E087. The UTC has requested an official
position from IBM and feedback from SHARE on the glyphs used in the status
area of a 3270 display.
Table 5.2, range of E0A0 to E0AD. The UTC has requested that
Microsoft provide a list of the full set of glyphs used to construct
mathematical entities (brackets, braces, sigma, etc.). Previously, the UTC
had decided not to encode these as characters. However, once this
information is available, the UTC will revisit the issue.
In addition, the UTC would appreciate your response to the
following:
a. What is the full set of terminal-emulation glyphs that you
considered and how did you map those not in your proposal into Unicode? The
UTC's concern is for round-trip integrity and distinguishing different
characters so that the UTC avoids mapping the characters in you proposal to
the same Unicode characters you used already for other glyphs in your full
set. (The concern is not the characters from standard coded character sets
like 7-bit ASCII and the ISO/IEC 8859 series, but rather the set of symbols
outside of these sets.)
b. Which of the following proposed characters could be unified with
(mapped into) Unicode characters?
1) Can you provide (a) the source glyphs for the proposed E0AC and E0AD
sigma/summation parts, and also (b) better glyphs for them.
2) What is the purpose of the proposed E0AE and E0AF characters? Are
they supposed to be full corners for a box, or partial corners, or to
provide the top and bottom corners of right brackets, or to provide serifs
for the sigma (E0AC and E0AD)? Could the proposed E0AE and E0AF characters
be unified with 231D top right corner and 231F bottom right corner?
3) For the proposed E0B0, could it be unified with either 2713 check
mark or 221A radical? Is "small" the distinguishing characteristic for not
unifying it with 2713 or 221A?
4) For the proposed E0B1, should it be unified with either 237B not
check mark or 2415 symbol for negative acknowledge?
5) What is the purpose of the proposed E0D0 and E0D1 characters? Are
they to be used to construct extended brackets and braces with the E0A0 to
E0AB "extensible" characters? If so, then they should be moved to the
mathematical symbol area of your proposal. If not, please explain how they
might be used.
6) Could the proposed E0D2 to E0D5 triangle characters be unified with
the 25E2 to 25E5 black triangle characters?
7) Could the proposed E0E5 diamond be unified with 25C6 black diamond?
8) Could the proposed E0EC be unified with 21E5 rightward arrow to bar?
9) Could the proposed E0ED be unified with 21E4 leftward arrow to bar?
3. Document L2/98-355, "Hex Byte Pictures for Unicode"
Status: rejected
The UTC considers that these are glyphs and, as such, they are out
of the scope of the Unicode standard. Representing hex bytes visibly is a
font-rendering issue rather than an information interchange issue.
Suggestions
Here are some suggestions for you to consider to help you meet your
requirements.
1. Code the glyphs in the Private Use Area. If at some time in the
future, the terminal emulation vendors are using the assignments, then you
may resubmit your proposal (except for the pictures for hex bytes) to
Unicode with this additional justification. It is beyond the scope of the
UTC to encode characters in the Private Use Area or to endorse any
particular use of characters in the Private Use Area. If the terminal
emulation community believes that consistent use of private-use code
positions is desirable, you might consider registering your code assignments
in a registry for the Private Use Area such as the Conscript Registry. Note
that Unicode does not endorse any registry for the Private Use Area. Both
Adobe and Apple have described how each uses the Private Use Area. You may
want to contact these organizations for additional information.
2. Register the glyphs with AFII (Association for Font Information
Interchange). AFII is the registration authority for the ISO/IEC 10036
glyph registry. AFII charges a nominal fee for registering glyphs. If you
are interested in pursuing this, contact AFII ( <mailto:asmus@unicode.org>
afii@unicode.org) for more information.
7-Dec-98 15:57:11-GMT,7925;000000000015
Return-Path: <Edwin.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id KAA21896
for <fdc@watsun.cc.columbia.edu>; Mon, 7 Dec 1998 10:57:07 -0500 (EST)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <XX0C3C3R>; Mon, 7 Dec 1998 10:56:32 -0500
Message-ID: <91D1D51C2955D111B82B00805F19989501CD7290@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
To: "'da Cruz, Frank'" <fdc@watsun.cc.columbia.edu>
Cc: "'Aliprand, Joan'" <br.jma@rlg.org>, "'Whistler, Ken'"
<kenw@sybase.com>,
"'McGowan, Rick'" <rmcgowan@apple.com>,
"'Thewlis, Dave'" <dthewlis@dcta.com>
Subject: UTC response to your request to encode characters for terminal em
ulation
Date: Mon, 7 Dec 1998 10:56:31 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by watsun.cc.columbia.edu id KAA21896
Frank,
What follows is the text of the response of the UTC to your proposals. (If
you would prefer, I also have a Word97 version.) The UTC needs some
additional information from you (and IBM and SHARE) before it decides about
the "Terminal Graphics for Unicode" proposal. The next UTC meeting is in
February in Palo Alto and we would appreciate your response by then.
If you want, we can talk about this.
Best regards,
Ed
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu <mailto:edwin.hart@jhuapl.edu>
1998-December-07
To: Frank da Cruz
From: Unicode Technical Committee
Subject: Response to your three proposals to encode characters for
terminal emulation.
Thank you for your three proposals for encoding terminal-emulation
characters in Unicode. The proposals were well organized, thorough, and
well researched.
Results
The UTC acknowledges the concerns raised in your three documents. The UTC
had extended discussions on these documents at its December, 1998 meeting in
San JosΘ. Here is the result of these discussions for each paper.
1. Document L2/98-353, "Additional Control Pictures for Unicode"
Status: rejected
The UTC believes that the proposed glyphs would be used as an
alternate way to display control characters rather than to interchange
information; e.g., to document control sequences. The UTC decided not to
encode these glyphs in Unicode.
However, the UTC noted that a bell glyph may have value in other
contexts and so could be encoded for another purpose in the future. In
addition, the UTC noted that Unicode 3.0 aligns the abbreviations for
control characters along a diagonal as you had requested.
As a secondary concern, encoding glyphs for control characters is an
open-ended proposition. The UTC knows that multiple sets of control
characters are defined for the C1 control area. For example, ISO had two
standards defining control characters, ISO/IEC 6429 and ISO 6630. When
someone proposes a new set of C1 control characters, should they also be
considered for encoding? What should be encoded? Should exactly one glyph
be encoded per control-character code position or should multiple glyphs be
encoded for the same control-character code position? These are examples of
concerns underlying the UTC decision rather than a request for you to answer
the questions.
2. Document L2/98-354, "Terminal Graphics for Unicode"
Status: deferred for additional information
The UTC has requested more information before it makes a decision.
Table 5.1, range of E080 to E087. The UTC has requested an official
position from IBM and feedback from SHARE on the glyphs used in the status
area of a 3270 display.
Table 5.2, range of E0A0 to E0AD. The UTC has requested that
Microsoft provide a list of the full set of glyphs used to construct
mathematical entities (brackets, braces, sigma, etc.). Previously, the UTC
had decided not to encode these as characters. However, once this
information is available, the UTC will revisit the issue.
In addition, the UTC would appreciate your response to the
following:
a. What is the full set of terminal-emulation glyphs that you
considered and how did you map those not in your proposal into Unicode? The
UTC's concern is for round-trip integrity and distinguishing different
characters so that the UTC avoids mapping the characters in you proposal to
the same Unicode characters you used already for other glyphs in your full
set. (The concern is not the characters from standard coded character sets
like 7-bit ASCII and the ISO/IEC 8859 series, but rather the set of symbols
outside of these sets.)
b. Which of the following proposed characters could be unified with
(mapped into) Unicode characters?
1) Can you provide (a) the source glyphs for the proposed E0AC and E0AD
sigma/summation parts, and also (b) better glyphs for them.
2) What is the purpose of the proposed E0AE and E0AF characters? Are
they supposed to be full corners for a box, or partial corners, or to
provide the top and bottom corners of right brackets, or to provide serifs
for the sigma (E0AC and E0AD)? Could the proposed E0AE and E0AF characters
be unified with 231D top right corner and 231F bottom right corner?
3) For the proposed E0B0, could it be unified with either 2713 check
mark or 221A radical? Is "small" the distinguishing characteristic for not
unifying it with 2713 or 221A?
4) For the proposed E0B1, should it be unified with either 237B not
check mark or 2415 symbol for negative acknowledge?
5) What is the purpose of the proposed E0D0 and E0D1 characters? Are
they to be used to construct extended brackets and braces with the E0A0 to
E0AB "extensible" characters? If so, then they should be moved to the
mathematical symbol area of your proposal. If not, please explain how they
might be used.
6) Could the proposed E0D2 to E0D5 triangle characters be unified with
the 25E2 to 25E5 black triangle characters?
7) Could the proposed E0E5 diamond be unified with 25C6 black diamond?
8) Could the proposed E0EC be unified with 21E5 rightward arrow to bar?
9) Could the proposed E0ED be unified with 21E4 leftward arrow to bar?
3. Document L2/98-355, "Hex Byte Pictures for Unicode"
Status: rejected
The UTC considers that these are glyphs and, as such, they are out
of the scope of the Unicode standard. Representing hex bytes visibly is a
font-rendering issue rather than an information interchange issue.
Suggestions
Here are some suggestions for you to consider to help you meet your
requirements.
1. Code the glyphs in the Private Use Area. If at some time in the
future, the terminal emulation vendors are using the assignments, then you
may resubmit your proposal (except for the pictures for hex bytes) to
Unicode with this additional justification. It is beyond the scope of the
UTC to encode characters in the Private Use Area or to endorse any
particular use of characters in the Private Use Area. If the terminal
emulation community believes that consistent use of private-use code
positions is desirable, you might consider registering your code assignments
in a registry for the Private Use Area such as the Conscript Registry. Note
that Unicode does not endorse any registry for the Private Use Area. Both
Adobe and Apple have described how each uses the Private Use Area. You may
want to contact these organizations for additional information.
2. Register the glyphs with AFII (Association for Font Information
Interchange). AFII is the registration authority for the ISO/IEC 10036
glyph registry. AFII charges a nominal fee for registering glyphs. If you
are interested in pursuing this, contact AFII ( <mailto:asmus@unicode.org>
afii@unicode.org) for more information.
8-Dec-98 1:00:34-GMT,1315;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id UAA21174;
Mon, 7 Dec 1998 20:00:17 -0500 (EST)
Date: Mon, 7 Dec 98 20:00:17 EST
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
Cc: "'Aliprand, Joan'" <br.jma@rlg.org>, "'Whistler, Ken'" <kenw@sybase.com>,
"'McGowan, Rick'" <rmcgowan@apple.com>,
"'Thewlis, Dave'" <dthewlis@dcta.com>
Subject: Re: UTC response to your request to encode characters for terminal
em ulation
In-Reply-To: Your message of Mon, 7 Dec 1998 10:56:31 -0500
Message-ID: <CMM.0.90.4.913078817.fdc@watsun.cc.columbia.edu>
> What follows is the text of the response of the UTC to your proposals. (If
> you would prefer, I also have a Word97 version.)
>
No thanks, I *like* plain text :-)
> The UTC needs some
> additional information from you (and IBM and SHARE) before it decides about
> the "Terminal Graphics for Unicode" proposal. The next UTC meeting is in
> February in Palo Alto and we would appreciate your response by then.
>
OK, I'm happy to keep following up. I'll try to have a detailed response
for you by end of January. In the meantime, please let me know what comes in
from IBM and SHARE.
Thanks for your consideration and the detailed response.
- Frank
8-Dec-98 14:52:44-GMT,2723;000000000015
Return-Path: <Edwin.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id JAA13274
for <fdc@watsun.cc.columbia.edu>; Tue, 8 Dec 1998 09:52:37 -0500 (EST)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <XX0C3F08>; Tue, 8 Dec 1998 09:52:27 -0500
Message-ID: <91D1D51C2955D111B82B00805F19989501CD72A0@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Subject: RE: UTC response to your request to encode characters for termina
l em ulation
Date: Tue, 8 Dec 1998 09:52:20 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain;
charset="iso-8859-1"
Frank,
Since you sent your reports in plain ASCII text, I had thought that this was
your preferred medium.
I'm sorry that the response was not positive. The UTC recognized your
concerns but felt that the resolution was more appropriate in the
font/rendering arena rather than coding in Unicode. If you can get written
support from terminal emulation vendors, it would strengthen your case.
Some of the UTC players were rather vehement about not coding the hex digits
so this part is really dead.
Regarding the glyphs used in the 3270 status area, my feeling is that unless
these are communicated from the controller to the 3270 terminal, the UTC
will reject these.
Email me with your phone number if you want to talk about any of this.
Ed
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu <mailto:edwin.hart@jhuapl.edu>
----------
From: Frank da Cruz [SMTP:fdc@watsun.cc.columbia.edu]
Sent: 07 December, 1998 20:00
To: Hart, Edwin F.
Cc: 'Aliprand, Joan'; 'Whistler, Ken'; 'McGowan, Rick'; 'Thewlis,
Dave'
Subject: Re: UTC response to your request to encode characters for
terminal em ulation
> What follows is the text of the response of the UTC to your
proposals. (If
> you would prefer, I also have a Word97 version.)
>
No thanks, I *like* plain text :-)
> The UTC needs some
> additional information from you (and IBM and SHARE) before it
decides about
> the "Terminal Graphics for Unicode" proposal. The next UTC
meeting is in
> February in Palo Alto and we would appreciate your response by
then.
>
OK, I'm happy to keep following up. I'll try to have a detailed
response
for you by end of January. In the meantime, please let me know what
comes in
from IBM and SHARE.
Thanks for your consideration and the detailed response.
- Frank
8-Dec-98 15:57:40-GMT,2245;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id KAA02356;
Tue, 8 Dec 1998 10:57:22 -0500 (EST)
Date: Tue, 8 Dec 98 10:57:22 EST
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
Subject: RE: UTC response to your request to encode characters for termina l
em ulation
In-Reply-To: Your message of Tue, 8 Dec 1998 09:52:20 -0500
Message-ID: <CMM.0.90.4.913132642.fdc@watsun.cc.columbia.edu>
> Since you sent your reports in plain ASCII text, I had thought that this was
> your preferred medium.
>
You were right.
> I'm sorry that the response was not positive. The UTC recognized your
> concerns but felt that the resolution was more appropriate in the
> font/rendering arena rather than coding in Unicode. If you can get written
> support from terminal emulation vendors, it would strengthen your case.
>
But this is a competitive market. The other terminal emulation makers
compete with us, so that will be tough, or at least awkward. And quite
honestly, in a way this whole proposal was partly against my better judgement,
since the other emulation companies have been profiting from our work for
years, and had this proposal been approved, it would have solved a big problem
for them.
> Some of the UTC players were rather vehement about not coding the hex digits
> so this part is really dead.
>
I expected that, but thought it should be entered into the record anyway,
because I think it addesses issues that will come up again and again.
> Regarding the glyphs used in the 3270 status area, my feeling is that unless
> these are communicated from the controller to the 3270 terminal, the UTC
> will reject these.
>
It's not a big deal. It would have been nice to have standardized glyphs,
and I feel I did my duty by proposing them. So now we'll go ahead and
put everything in the private use area and distribute custom fonts just like
everybody else, and live with the fallout.
Although I do plan to provide the requested responses, I don't feel there
is much point, since there is no chance that any of this will get into
Unicode 3.0 anyway, and I can't afford to drag this out forever -- we have
deadlines too.
- Frank
8-Dec-98 19:26:20-GMT,1202;000000000011
Return-Path: <Edwin.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id OAA01837
for <fdc@watsun.cc.columbia.edu>; Tue, 8 Dec 1998 14:26:19 -0500 (EST)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <XX0C3H3G>; Tue, 8 Dec 1998 14:26:15 -0500
Message-ID: <91D1D51C2955D111B82B00805F19989501CD72A8@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
To: "'da Cruz, Frank'" <fdc@watsun.cc.columbia.edu>
Subject: feedback from IBM on 3270 status symbols
Date: Tue, 8 Dec 1998 14:26:12 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain
Dave,
IBM has raised the possibility of using of the symbols for documentation and
this is certainly a valid use.
The symbols are internal to the 3270 display terminal rather than
communicated between it and the controller.
Best regards,
Ed
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu <mailto:edwin.hart@jhuapl.edu>
8-Dec-98 19:37:08-GMT,2025;000000000001
Return-Path: <fdc>
Received: (from fdc@localhost)
by watsun.cc.columbia.edu (8.8.5/8.8.5) id OAA04895;
Tue, 8 Dec 1998 14:36:17 -0500 (EST)
Date: Tue, 8 Dec 98 14:36:17 EST
From: Frank da Cruz <fdc@watsun.cc.columbia.edu>
To: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
Subject: Re: feedback from IBM on 3270 status symbols
In-Reply-To: Your message of Tue, 8 Dec 1998 14:26:12 -0500
Message-ID: <CMM.0.90.4.913145777.fdc@watsun.cc.columbia.edu>
> Dave,
>
Who's Dave?
> IBM has raised the possibility of using of the symbols for documentation
> and this is certainly a valid use.
>
The UTC probably won't see it that way, since I made that argument for many
of the other characters that they rejected.
> The symbols are internal to the 3270 display terminal rather than
> communicated between it and the controller.
>
All symbols are internal to terminals. The same can be said for (e.g.)
the backwards question mark, which is not sent by the host, but is displayed
on screen to indicate some kind of error, and which was accepted by the UTC
(although perhaps in some other context), or many other symbols that are
displayed in response to communications from the host, but not necessarily
mapped to a particular character.
No big deal -- I think I made all the relevant arguments already. Even if
these symbols are not sent by the host or the controller, they still are
shown on the screen, and therefore PC based emulators will also need to show
them on the screen. If these emulators are based on Unicode, but Unicode
does not include these characters, then all such emulators will have to bundle
custom fonts, each one probably incompatible with the other.
On the other hand, if some company like Monotype makes a Unicode "terminal
emulation font" with all these characters at well-defined positions (and in
fact, this is exactly what will happen), then this will become a de facto
standard anyway, which is as good as a real standard except it will conflict
with other uses of the Private Use area.
- Frank
8-Dec-98 20:33:52-GMT,3517;000000000001
Return-Path: <Edwin.Hart@jhuapl.edu>
Received: from aples2.jhuapl.edu (aples2.jhuapl.edu [128.244.26.86])
by watsun.cc.columbia.edu (8.8.5/8.8.5) with ESMTP id PAA20309
for <fdc@watsun.cc.columbia.edu>; Tue, 8 Dec 1998 15:33:48 -0500 (EST)
Received: by aples2.jhuapl.edu with Internet Mail Service (5.5.2232.9)
id <XX0C3HYW>; Tue, 8 Dec 1998 15:33:48 -0500
Message-ID: <91D1D51C2955D111B82B00805F19989501CD72A9@aples2.jhuapl.edu>
From: "Hart, Edwin F." <Edwin.Hart@jhuapl.edu>
To: "'Frank da Cruz'" <fdc@watsun.cc.columbia.edu>
Subject: RE: feedback from IBM on 3270 status symbols
Date: Tue, 8 Dec 1998 15:33:47 -0500
MIME-Version: 1.0
X-Mailer: Internet Mail Service (5.5.2232.9)
Content-Type: text/plain
Comments are embedded in your message.
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu <mailto:edwin.hart@jhuapl.edu>
----------
From: Frank da Cruz [SMTP:fdc@watsun.cc.columbia.edu]
Sent: 08 December, 1998 14:36
To: Hart, Edwin F.
Subject: Re: feedback from IBM on 3270 status symbols
> Dave,
>
Who's Dave?
Dave is my SHARE manager. I decided originally to write to him and
then changed my mind and decided to write directly to you.
> IBM has raised the possibility of using of the symbols for
documentation
> and this is certainly a valid use.
>
The UTC probably won't see it that way, since I made that argument
for many
of the other characters that they rejected.
Well, I was not smart enough to repeat this argument at the meeting.
You needed a better mouthpiece. : )
> The symbols are internal to the 3270 display terminal rather than
> communicated between it and the controller.
>
All symbols are internal to terminals. The same can be said for
(e.g.)
the backwards question mark, which is not sent by the host, but is
displayed
on screen to indicate some kind of error, and which was accepted by
the UTC
(although perhaps in some other context), or many other symbols that
are
displayed in response to communications from the host, but not
necessarily
mapped to a particular character.
All symbols are internal to terminals. Yes, but the third set
represents graphic characters from a code table presumably invoked by a
ISO/IEC 2022 control sequence and then by the 7-bit/8-bit code positions on
the wire. Since the characters are communicated on the wire (and hence have
inherent information content), the UTC is willing to consider encoding them.
The alternate control characters and hex digits could be displayed using an
alternate font and appropriate rendering software.
No big deal -- I think I made all the relevant arguments already.
Even if
these symbols are not sent by the host or the controller, they still
are
shown on the screen, and therefore PC based emulators will also need
to show
them on the screen. If these emulators are based on Unicode, but
Unicode
does not include these characters, then all such emulators will have
to bundle
custom fonts, each one probably incompatible with the other.
On the other hand, if some company like Monotype makes a Unicode
"terminal
emulation font" with all these characters at well-defined positions
(and in
fact, this is exactly what will happen), then this will become a de
facto
standard anyway, which is as good as a real standard except it will
conflict
with other uses of the Private Use area.
- Frank