- Path: sparky!uunet!snorkelwacker.mit.edu!ai-lab!life.ai.mit.edu!burley
- From: burley@geech.gnu.ai.mit.edu (Craig Burley)
- Newsgroups: comp.edu
- Subject: Case Sensitivity (was Re: Small Language Wanted)
- Message-ID: <BURLEY.92Aug27172140@geech.gnu.ai.mit.edu>
- Date: 27 Aug 92 21:21:40 GMT
- References: <DAVIS.92Aug23010605@pacific.mps.ohio-state.edu>
- <WVENABLE.92Aug26154731@algona.stats.adelaide.edu.au>
- <1992Aug26.151818.1@vxdesy.desy.de>
- Sender: news@ai.mit.edu
- Followup-To: comp.edu
- Organization: Free Software Foundation 545 Tech Square Cambridge, MA 02139
- Lines: 135
- In-reply-to: pawlak@vxdesy.desy.de's message of 26 Aug 92 15:18:18 GMT
-
- In article <1992Aug26.151818.1@vxdesy.desy.de> pawlak@vxdesy.desy.de writes:
-
- In article <WVENABLE.92Aug26154731@algona.stats.adelaide.edu.au>, wvenable@algona.stats.adelaide.edu.au (Bill Venables) writes:
- >>>>>> "Michal" == pawlak <pawlak@vxdesy.desy.de> writes:
- >
- > Michal> ... about 70% of the C code I had to work with was
- > Michal> badly written because of:
- > Michal> - ...
- > Michal> - making use of case sensitivity (i.e. symbols 'value' and 'Value' in the
- > Michal> same program)
- > Michal> - ...
- >
- > I'll accept all the other points, but NOT this one. The concept of "case"
- > is a printing artefact and makes no sense in this context. What's wrong
- > with using all 52 letters in the character set? What could be more natural
- > than using "x" for a singly indexed array and "X" for a doubly indexed one?
- >
- > Most versions of Fortran now do allow mixed case input (thank goodness -
- > otherwise you would go deaf reading the stuff) but then to regard "i" and
- > "I" as the *same* is an absolute gotcha.
- >
- > It's case INsensitivity that is the real syntactic curse in any language
- > (or operating system for that matter).
-
- Did you ever try to DISCUSS your code with someone? How do you pronounce
- capital letters then? Some people have 'verbal' memory (me for instance)
- - I remember the word, not its graphical representation (therefore I also
- have a lot of trouble if 'unnatural' identifiers are used). What to do
- with such people? They are guaranteed to mix up such symbols... MixeD cASe
- inPUT? YES! cASe seNSitIviTy? NO!
-
- Besides, with 31 letters allowed in identifiers I have 26^31 possible
- identifiers (forgetting $, _, digits, shorter ones, etc). I don't really
- feel I need more so urgently...
-
- The number of available identifiers has nothing to do with the issue of
- whether case is sensitive in a name space (like the variable and type names
- in a language, or the file names in a file system), unless you're dealing
- with an extremely constrained name space (like BASIC, with a single letter
- and optional single digit following it for variable names).
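-
- To put rough numbers on that, here is a quick calculation sketched in
- Python. The 31-letter limit is the figure quoted above and the BASIC rule is
- the one just described; the function names are made up for illustration.

```python
def letter_only_names(alphabet_size, length=31):
    """Count identifiers of exactly `length` letters (ignores digits, $, _)."""
    return alphabet_size ** length


def basic_names():
    """One letter, optionally followed by one digit: 26 + 26*10 names."""
    return 26 + 26 * 10


insensitive = letter_only_names(26)   # 26^31, the figure quoted above
sensitive = letter_only_names(52)     # 52^31 if case is significant

# Case sensitivity multiplies an already astronomical count by 2^31.
print(sensitive // insensitive == 2 ** 31)   # True
# Only a name space this small makes the count itself matter.
print(basic_names())                         # 286
```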
-
- I used to be a major fan of case insensitivity and would design that into
- subsystems I wrote. The OS I was working on generally did case insensitivity,
- but I felt that _true_ case insensitivity wasn't just uppercasing or
- lowercasing everything entered into the name space, but remembering the
- _original_ case of the name as entered into the space. So my subsystems
- remembered the original case as entered by the user, but matched
- case-insensitively, so that while "Foo" might be displayed as a batch job
- name, it'd be matched when asking to display info on "foo", "FOO", "fOo",
- and so on.
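-
- A minimal sketch of that case-preserving, case-insensitive scheme, in
- Python rather than anything those subsystems were actually written in. The
- class and method names are hypothetical, and folding with str.casefold() is
- an assumption about how the matching might be done.

```python
class CasePreservingRegistry:
    """Name space that matches case-insensitively but remembers the
    spelling under which each name was first entered."""

    def __init__(self):
        self._entries = {}  # folded name -> (original spelling, value)

    def register(self, name, value):
        key = name.casefold()
        if key in self._entries:
            original = self._entries[key][0]
            raise KeyError(f"{name!r} clashes with existing {original!r}")
        self._entries[key] = (name, value)

    def lookup(self, name):
        """Return (original spelling, value) for any case variant."""
        return self._entries[name.casefold()]


jobs = CasePreservingRegistry()
jobs.register("Foo", "batch job #1")
for query in ("foo", "FOO", "fOo"):
    original, value = jobs.lookup(query)
    print(query, "->", original, value)
```

- Every query above finds the same entry and reports it as "Foo", the
- spelling the user originally typed.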
-
- I went on this way for a while, increasingly struggling with the problems
- this supposedly enlightened approach introduced. For example, what about
- name spaces where there isn't a single, clear point of entry for a new name?
- What is the "registered" case of that name as used when displaying a report
- on names in the name space? For example, if two Fortran (given a
- case-insensitive dialect as I would have designed it) programs referred
- to a COMMON area as /FOO/ and /foo/, which entry should hold sway in the
- name space representing global entities such as common blocks in the linker's
- map?
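-
- The ambiguity is easy to reproduce in a toy model. The sketch below is
- Python, not a real linker; link_map() and the "first spelling seen wins"
- rule are assumptions chosen only to show that the reported name then
- depends on something as arbitrary as link order.

```python
def link_map(object_files):
    """Build a case-insensitive map of COMMON block names.

    The spelling reported for each block is whichever one happens to
    arrive first (an assumed policy, for illustration only).
    """
    shown = {}
    for _filename, commons in object_files:
        for name in commons:
            shown.setdefault(name.upper(), name)
    return sorted(shown.values())


a = ("a.o", ["FOO"])   # one program refers to /FOO/
b = ("b.o", ["foo"])   # the other refers to /foo/

print(link_map([a, b]))   # ['FOO']
print(link_map([b, a]))   # ['foo']
```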
-
- Then, I was subjected to seminars on internationalization, and that pretty
- much rooted out the last of my penchant for feeling that case insensitivity
- is the only way to go. Fortunately, this happened a couple of years before
- my first encounter with C and four years before that with UNIX, both of
- which are thoroughly case-sensitive. What I had discovered was that case
- _insensitivity_ was a feature that was fairly nationalistic (or at least
- restricted to a subset of languages), hard to internationalize well, and
- really a _user-friendliness_ issue that didn't necessarily _always_ belong
- ensconced in name spaces such as language variable names and file systems.
- User interfaces constructed on top of such things could do a much better
- job of providing useful case-insensitivity (plus catching other typos,
- like exchanging "1" and "l" or "0" and "O") than could be achieved by
- trying to foist such features on facilities that don't need them.
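-
- A few concrete cases of why case mapping is hard to internationalize,
- sketched with Python's built-in Unicode string methods. The two helper
- functions are hypothetical; the characters are the German sharp s and the
- Turkish dotted and dotless i.

```python
def same_by_lowercasing(a, b):
    """Naive case-insensitive compare: lowercase both sides."""
    return a.lower() == b.lower()


def same_by_casefold(a, b):
    """Compare using Unicode full case folding (not locale-aware)."""
    return a.casefold() == b.casefold()


# German: uppercasing the sharp s yields two letters, so a round trip
# through upper case does not give back the original name.
print("straße".upper())                           # STRASSE
print(same_by_lowercasing("straße", "STRASSE"))   # False
print(same_by_casefold("straße", "STRASSE"))      # True

# Turkish: capital dotted I (U+0130) lowercases to "i" plus a combining
# dot, and dotless i (U+0131) uppercases to plain "I".
dotted_capital_i = "\u0130"
print(len(dotted_capital_i.lower()))              # 2
print("\u0131".upper() == "I")                    # True
```

- Note that these methods ignore the user's language: "I".lower() is always
- "i" here, whereas Turkish text expects the dotless form. That is the sort of
- rule a name space would have to pick a side on.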
-
- That's not to say I don't think Fortran compilers should generally be
- case-insensitive. It's what most Fortran users expect; it's an extension
- to FORTRAN 77 but standard with Fortran 90; GNU Fortran supports _only_
- that form of case-insensitivity (for now; I'll probably add more flexibility
- in the form of configuration or compiler options later); and so on. It's
- just that, when designing a new language or facility with a name space, I
- no longer see the job of the manager of that name space as including
- case insensitivity. That's now strictly the job of the user interface,
- if it wants to provide that feature.
-
- Just because you have a language with case-sensitive variable names doesn't
- mean programmers should or will use "X" and "x" simultaneously in a program.
- I wouldn't do such things unless the nomenclature came directly from the
- branch of sciences that laid the foundation work for the software -- so
- intercommunication between program, programmer, and others would be made
- easier. Otherwise, such uses of uppercase vs. lowercase seem to invite
- difficulties and misunderstandings, as suggested by the poster -- as would
- deliberately choosing names like "nl" with "n1" and "FOO" with "F00".
-
- Ultimately, it's up to the programmer to choose names out of the name space
- that not only don't conflict within that name space, but don't conflict in
- the "wet" name spaces in our heads, whether verbalized or hand-written or
- whatever. Forcing case-insensitivity on every language's name space is
- hardly the way to ensure this; in fact it might invite laziness here. The
- best way is probably to have an enlightened group of programmers (in that
- they know what they want to avoid) along with user-interface tools that
- help choose names. For example, it's a simple matter, given a user interface
- that knows the name space of variables and procedures in a program, to make
- sure a newly created name isn't so "close" to an existing name as to make
- typos or misreads from one to the other too easy. (Spelling checkers that
- include "guess intended spelling" have had such algorithms for over a decade.)
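-
- One way such a check might look, as a rough Python sketch. The table of
- confusable characters, the 0.85 cutoff, and the function names are all
- assumptions for illustration; difflib.SequenceMatcher is the standard
- library's general-purpose similarity measure, not a purpose-built one.

```python
import difflib

# Characters that are easy to misread or mistype for one another.
# A deliberately tiny, assumed table: "1" vs "l", "0" vs "O", and
# word separators that are easy to drop.
CONFUSABLE = str.maketrans({"1": "l", "0": "o", "_": None, "-": None})


def skeleton(name):
    """Reduce a name to a form in which look-alikes compare equal."""
    return name.casefold().translate(CONFUSABLE)


def too_close(new_name, existing, cutoff=0.85):
    """Return the existing names that new_name could be confused with."""
    new_skel = skeleton(new_name)
    hits = []
    for old in existing:
        if old == new_name:
            continue  # an exact match is a redefinition, not a near miss
        ratio = difflib.SequenceMatcher(None, new_skel, skeleton(old)).ratio()
        if ratio >= cutoff:
            hits.append(old)
    return hits


print(too_close("n1", ["nl", "count"]))    # ['nl']
print(too_close("F00", ["FOO"]))           # ['FOO']
print(too_close("total", ["index"]))       # []
```

- A tool like this would only warn; whether a flagged pair is acceptable
- stays the programmer's decision.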
-
- I prefer case-sensitive languages now because I sometimes like to use the
- occasional capital to indicate a new word as an alternative to using an
- underscore or hyphen. That doesn't mean I'll ever have both "fooBar" and
- "foobar", at least not intentionally; what it means is that I might have
- both "fooBar" and "foo_bar", where the former always indicates a type
- while the latter always indicates a function (procedure in Fortran terms).
- I could do the exact same thing if my C compiler suddenly became
- case-_in_sensitive -- in fact, I think all my code would compile just fine
- anyway (though it'd be an interesting experiment) -- but I prefer having the
- compiler, in effect, "police" my code in case I accidentally did type "foobar"
- and meant "foo_bar" rather than the compiler's case-insensitive "guess" that
- I meant "fooBar".
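-
- The difference can be shown with a toy symbol table. This is a Python
- sketch, not how any C compiler resolves names; the table contents and both
- resolver functions are hypothetical.

```python
# Declared names: a capital marks a type, an underscore a function.
declared = {"fooBar": "type", "foo_bar": "function"}


def resolve_sensitive(name):
    """Case-sensitive lookup: an unknown spelling is an error."""
    if name not in declared:
        raise NameError(f"undeclared identifier {name!r}")
    return declared[name]


def resolve_insensitive(name):
    """Case-insensitive lookup: any case variant of a declared name matches."""
    folded = {key.casefold(): kind for key, kind in declared.items()}
    if name.casefold() not in folded:
        raise NameError(f"undeclared identifier {name!r}")
    return folded[name.casefold()]


# The typo "foobar" (meant: foo_bar) silently resolves to the type...
print(resolve_insensitive("foobar"))   # type

# ...while the case-sensitive lookup reports it.
try:
    resolve_sensitive("foobar")
except NameError as err:
    print("error:", err)
```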
-
- I do know that ever since going over to the "I prefer case sensitivity, all
- else being equal" camp from the other camp, I've found far fewer vexing
- problems in areas of file-system, OS kernel, linker, and utility design.
-
- So I suggest that case-insensitivity be thought of as an _optional_ feature
- to be provided in the realm of user-friendliness, but not as the _only_
- feature relating to effective and intelligent choice of names in a name
- space. In that context, given that we now have the opportunity to use
- program-creating tools better than keypunches and simple line editors, I
- believe that realm is best handled in the user interfaces of those facilities
- that help create and maintain programs, _not_ in the designs of the languages
- in which those programs are written.
- --
-
- James Craig Burley, Software Craftsperson burley@gnu.ai.mit.edu
- Member of the League for Programming Freedom (LPF)
-