- Date: Fri, 3 Oct 86 12:26:22 PDT
- From: sun!gorodish!guy@utastro.UUCP (Guy Harris)
-
- > From: mark@cbosgd.att.com (Mark Horton)
- > Subject: Case sensitive file names
-
- > I think this is a mistake. UNIX is the only major operating system
- > that treats things like file names, logins, host names, and commands
- > as case sensitive.
-
- It's been a while since I used Multics; I think it was case-sensitive. Of
- course, I don't know whether it counts as "major" here or not; I don't know
- how many sites are around. Are you sure there are no others?
-
- > It's also reasonable to leave the case alone, but ignore case in
- > comparisons.
-
- This would probably be the best scheme (I think the Xerox Alto's operating
- system did this). Some people may want to use mixed case in file names for
- aesthetic reasons, for example.
-
- > There is also probably a good argument for keeping it case sensitive
- > (after all, there are probably 5 or 6 people out there who really need
- > both makefile and Makefile...
-
- This means UNIX probably can't change, at least not without a fair bit of
- pain. I know of at least one directory on a UNIX system that has both
- "makefile" and "Makefile" in it; this would cause some upset on a
- case-mapping UNIX system.
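-
- A minimal sketch of the "leave the case alone, but ignore case in
- comparisons" scheme, and of why an existing directory holding both
- "makefile" and "Makefile" would be a problem under it. The class and
- function names are hypothetical, and the folding rule (Python's lower())
- is an assumption that only really suits English text.
-
```python
class CaseInsensitiveDirectory:
    """Hypothetical directory: keeps names as typed, ignores case on lookup."""

    def __init__(self):
        # folded name -> name exactly as the user created it
        self._entries = {}

    def create(self, name):
        key = name.lower()  # assumed folding rule; fine for English only
        if key in self._entries:
            raise FileExistsError(
                f"{name!r} collides with existing {self._entries[key]!r}"
            )
        self._entries[key] = name

    def lookup(self, name):
        """Return the stored spelling of the name, or None if absent."""
        return self._entries.get(name.lower())


def find_collisions(names):
    """Group existing names that would become the same file after folding."""
    groups = {}
    for n in names:
        groups.setdefault(n.lower(), []).append(n)
    return [g for g in groups.values() if len(g) > 1]
```
-
- Running find_collisions over an existing directory listing before
- converting it would show which entries need renaming by hand.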
-
- However, there is another problem with case mapping. It's dependent on the
- language the text is in! Doing case mapping is all very well and good for
- English-speaking users; the algorithm for mapping characters between cases
- in English is straightforward. However, in German the sharp s (written "ss"
- here) is a single special character in lower case but "SS" in upper case. Even if you don't have
- anomalies like this, the current schemes proposed by AT&T for "international
- UNIX" use various ISO codes; this means that the character whose hex value
- is E6 is the "ae" ligature in the ISO Latin Alphabet #1, and thus matches
- the character whose hex value is C6 (which is the "AE" ligature); however,
- in the JIS C6226 Kanji set, it is probably the first byte of a two-byte
- sequence representing a Kanji symbol, and I don't think it gets case mapped
- at all.
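-
- The point can be illustrated with a short sketch: the same byte value
- case-maps differently depending on the character set it is read in. This
- assumes Python's "latin-1" and "euc_jp" codecs as stand-ins for ISO Latin
- Alphabet #1 and an 8-bit encoding of the JIS Kanji set; upper_in is a
- hypothetical helper, and the second byte (A1) is chosen arbitrarily.
-
```python
def upper_in(raw: bytes, charset: str) -> bytes:
    """Decode raw bytes in the given charset, upper-case, and re-encode."""
    return raw.decode(charset).upper().encode(charset)


# In Latin-1, E6 is a lower-case letter whose upper-case form is C6.
latin1_result = upper_in(b"\xe6", "latin-1")

# In EUC-JP, E6 is only the first byte of a two-byte Kanji, which has no case,
# so the bytes come back unchanged.
kanji_result = upper_in(b"\xe6\xa1", "euc_jp")

# The German sharp s is one character in lower case but two in upper case.
sharp_s_upper = "\u00df".upper()
```
-
- Applying the Latin-1 rule blindly to the Kanji bytes would turn E6 into
- C6 and so silently change the name into a different character.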
-
- This means that the operating system would have to know what character set a
- particular character was in, so that it could map its case correctly; this
- would be best done with sequences embedded in the file name indicating
- shifts in the character set to which bytes belong. (These same sequences
- should be used in text files, character strings in programs, etc. Other
- suggestions include a per-file character set designator, that would
- presumably apply to any files containing character strings, including
- directories; however, this means that *all* strings in that file must be in
- the same character set, which is not always a reasonable restriction.) It
- would then have to know how to do case mapping for all character sets
- supported by the system, and would have to be modified or have new
- information supplied to it if a new character set were to be supported.
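-
- A rough sketch of what that would involve, assuming a file name has
- already been split at its shift sequences into (character set, bytes)
- segments. The segment representation, the CASE_FOLDERS table and the
- function names are all hypothetical; the point is only that every
- supported character set needs its own entry, and an unknown one cannot
- be compared at all.
-
```python
# One case-folding rule per supported character set (assumed table).
CASE_FOLDERS = {
    "ascii": lambda raw: raw.decode("ascii").lower(),
    "latin-1": lambda raw: raw.decode("latin-1").lower(),
    "euc_jp": lambda raw: raw.decode("euc_jp"),  # Kanji: no case to fold
}


def fold_name(segments):
    """Fold a name given as a list of (charset, raw_bytes) segments."""
    folded = []
    for charset, raw in segments:
        folder = CASE_FOLDERS.get(charset)
        if folder is None:
            # A new character set means new information must be supplied.
            raise LookupError(f"no case mapping registered for {charset!r}")
        folded.append(folder(raw))
    return "".join(folded)


def same_name(a, b):
    """Compare two segmented names, ignoring case where case exists."""
    return fold_name(a) == fold_name(b)
```
-
- Supporting another character set then means adding an entry to the
- table, which is exactly the maintenance burden described above.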
-
- Volume-Number: Volume 7, Number 16
-
-