NetNews Usenet Archive 1992 #19

Internet Message Format | 1992-08-27 | 8.6 KB

Path: sparky!uunet!snorkelwacker.mit.edu!ai-lab!life.ai.mit.edu!burley
From: burley@geech.gnu.ai.mit.edu (Craig Burley)
Newsgroups: comp.edu
Subject: Case Sensitivity (was Re: Small Language Wanted)
Message-ID: <BURLEY.92Aug27172140@geech.gnu.ai.mit.edu>
Date: 27 Aug 92 21:21:40 GMT
References: <DAVIS.92Aug23010605@pacific.mps.ohio-state.edu> <WVENABLE.92Aug26154731@algona.stats.adelaide.edu.au> <1992Aug26.151818.1@vxdesy.desy.de>
Sender: news@ai.mit.edu
Followup-To: comp.edu
Organization: Free Software Foundation 545 Tech Square Cambridge, MA 02139
Lines: 135
In-reply-to: pawlak@vxdesy.desy.de's message of 26 Aug 92 15:18:18 GMT

In article <1992Aug26.151818.1@vxdesy.desy.de> pawlak@vxdesy.desy.de writes:

   In article <WVENABLE.92Aug26154731@algona.stats.adelaide.edu.au>, wvenable@algona.stats.adelaide.edu.au (Bill Venables) writes:
   >>>>>> "Michal" == pawlak <pawlak@vxdesy.desy.de> writes:
   >
   > Michal> ... about 70% of the C code I had to work with was
   > Michal> badly written because of:
   > Michal> - ...
   > Michal> - making use of case sensitivity (i.e. symbols 'value' and 'Value' in the
   > Michal> same program)
   > Michal> - ...
   >
   > I'll accept all the other points, but NOT this one. The concept of "case"
   > is a printing artefact and makes no sense in this context. What's wrong
   > with using all 52 letters in the character set? What could be more natural
   > than using "x" for a singly indexed array and "X" for a doubly indexed one?
   >
   > Most versions of Fortran now do allow mixed case input (thank goodness -
   > otherwise you would go deaf reading the stuff) but then to regard "i" and
   > "I" as the *same* is a absolute gotcha.
   >
   > It's case INsensitivity that is the real syntactic curse in any language
   > (or operating system for that matter).

   Did you ever try to DISCUSS your code with someone? How do you pronounce capital letters then?

   Some people have 'verbal' memory (me for instance) - I remember the word, not its graphical representation (therefore I also have a lot of trouble if 'unnatural' identifiers are used). What to do with such people? They will guaranteed mix such symbols...

   MixeD cASe inPUT? YES!
   cASe seSitIviTy? NO!

   Besides, with 31 letters allowed in identifiers I have 26^31 possible identifiers (forgetting $, _, digits, shorter ones, etc). I don't really feel I need more so urgently...

The number of available identifiers has nothing to do with the issue of whether case is sensitive in a name space (like the variable and type names in a language, or the file names in a file system), unless you're dealing with an extremely constrained name space (like BASIC, with a single letter and optional single digit following it for variable names).

I used to be a major fan of case insensitivity and would design that into subsystems I wrote. The OS I was working on generally did case insensitivity, but I felt that _true_ case insensitivity wasn't just uppercasing or lowercasing everything entered into the name space, but remembering the _original_ case of the name as entered into the space. So my subsystems remembered the original case as entered by the user, but matched case-insensitively, so that while "Foo" might be displayed as a batch job name, it'd be matched when asking to display info on "foo", "FOO", "fOo", and so on.

I went on this way for a while, increasingly struggling with the problems this supposedly enlightened approach introduced. For example, what about name spaces where there isn't a single, clear point of entry for a new name? What is the "registered" case of that name as used when displaying a report on names in the name space?
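
To make the "remember the original case, match insensitively" idea concrete, here is a minimal sketch in Python. It is not the code from the subsystems described above; the class name `CasePreservingNames`, its methods, and the "first spelling registered wins" rule are all assumptions chosen for illustration.

```python
class CasePreservingNames:
    """Name table that matches case-insensitively but keeps the case first entered."""

    def __init__(self):
        # Maps the case-folded key to the spelling used when the name was registered.
        self._registered = {}

    def register(self, name):
        """Record a name; the first spelling seen wins. Returns the registered spelling."""
        return self._registered.setdefault(name.casefold(), name)

    def lookup(self, name):
        """Return the registered spelling for any case variant, or None if unknown."""
        return self._registered.get(name.casefold())


# A single point of entry: the user names the job "Foo" once.
jobs = CasePreservingNames()
jobs.register("Foo")
print(jobs.lookup("foo"), jobs.lookup("FOO"), jobs.lookup("fOo"))  # Foo Foo Foo

# No single point of entry: the displayed spelling depends on arrival order.
table = CasePreservingNames()
for spelling in ("FOO", "foo"):
    table.register(spelling)
print(table.lookup("Foo"))  # FOO
```

The second half shows the difficulty: when several sources contribute spellings of the same name, some arbitrary rule (here, arrival order) has to pick the one that gets displayed.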

For example, if two Fortran programs (given a case-insensitive dialect as I would have designed it) referred to a COMMON area as /FOO/ and /foo/, which entry should hold sway in the name space representing global entities such as common blocks in the linker's map?

Then, I was subjected to seminars on internationalization, and that pretty much rooted out the last of my penchant for feeling that case insensitivity is the only way to go. Fortunately, this happened a couple of years before my first encounter with C and four years before that with UNIX, both of which are thoroughly case-sensitive.

What I had discovered was that case _insensitivity_ was a feature that was fairly nationalistic (or at least restricted to a subset of languages), hard to internationalize well, and really a _user-friendliness_ issue that didn't necessarily _always_ belong ensconced in name spaces such as language variable names and file systems. User interfaces constructed on top of such things could do a much better job of providing useful case-insensitivity (plus catching other typos, like exchanging "1" and "l" or "0" and "O") than could be achieved by trying to foist such features on facilities that don't need them.

That's not to say I don't think Fortran compilers should generally be case-insensitive. It's what most Fortran users expect; it's an extension to FORTRAN 77 but standard with Fortran 90; GNU Fortran supports _only_ that form of case-insensitivity (for now; I'll probably add more flexibility in the form of configuration or compiler options later); and so on.

It's just that, when designing a new language or facility with a name space, I no longer see the job of the manager of that name space as including case insensitivity. That's now strictly the job of the user interface, if it wants to provide that feature.

Just because you have a language with case-sensitive variable names doesn't mean programmers should or will use "X" and "x" simultaneously in a program.
I wouldn't do such things unless the nomenclature came directly from the branch of science that laid the foundation work for the software -- so intercommunication between program, programmer, and others would be made easier. Otherwise, such uses of uppercase vs. lowercase seem to invite difficulties and misunderstandings, as suggested by the poster -- as would deliberately choosing names like "nl" with "n1" and "FOO" with "F00".

Ultimately, it's up to the programmer to choose names out of the name space that not only don't conflict within that name space, but don't conflict in the "wet" name spaces in our heads, whether verbalized or hand-written or whatever. Forcing case-insensitivity on every language's name space is hardly the way to ensure this; in fact it might invite laziness here.

The best way is probably to have an enlightened group of programmers (in that they know what they want to avoid) along with user-interface tools that help choose names. For example, it's a simple matter, given a user interface that knows the name space of variables and procedures in a program, to make sure a newly created name isn't so "close" to an existing name as to make typos or misreads from one to the other too easy. (Spelling checkers that include "guess intended spelling" have had such algorithms for over a decade.)

I prefer case-sensitive languages now because I sometimes like to use the occasional capital to indicate a new word as an alternative to using an underscore or hyphen. That doesn't mean I'll ever have both "fooBar" and "foobar", at least not intentionally; what it means is that I might have both "fooBar" and "foo_bar", where the former always indicates a type while the latter always indicates a function (procedure in Fortran terms).
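
One way a name-choosing tool might test whether a new name is too "close" to an existing one can be sketched in Python as below. This is only an illustration under stated assumptions: the `skeleton` and `too_close` helpers and the small `CONFUSABLE` table are hypothetical, and comparing reduced "skeletons" is a simpler stand-in for the spelling-checker algorithms mentioned above, not a reproduction of them.

```python
# Characters that are easy to mistake for one another, mapped to one representative.
# This table is an illustrative assumption, not an exhaustive list.
CONFUSABLE = str.maketrans({"1": "l", "0": "o"})


def skeleton(name):
    """Reduce a name to a form that ignores case, underscores, and look-alike characters."""
    return name.casefold().replace("_", "").translate(CONFUSABLE)


def too_close(new_name, existing_names):
    """Return the existing names that differ from new_name but share its skeleton."""
    target = skeleton(new_name)
    return [n for n in existing_names if n != new_name and skeleton(n) == target]


existing = ["fooBar", "foo_bar", "nl", "FOO", "count"]
print(too_close("foobar", existing))  # ['fooBar', 'foo_bar']
print(too_close("n1", existing))      # ['nl']
print(too_close("F00", existing))     # ['FOO']
print(too_close("total", existing))   # []
```

Note that this check would also report "fooBar" against "foo_bar", so a real tool would need some way to accept a deliberate convention such as the type/function split described above.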

I could do the exact same thing if my C compiler suddenly became case-_in_sensitive -- in fact, I think all my code would compile just fine anyway (though it'd be an interesting experiment) -- but I prefer having the compiler, in effect, "police" my code in case I accidentally typed "foobar" when I meant "foo_bar", rather than having a case-insensitive compiler "guess" that I meant "fooBar".

I do know that ever since going over to the "I prefer case sensitivity, all else being equal" camp from the other camp, I've found far fewer vexing problems in areas of file-system, OS kernel, linker, and utility design.

So I suggest that case-insensitivity be thought of as an _optional_ feature to be provided in the realm of user-friendliness, but not as the _only_ feature relating to effective and intelligent choice of names in a name space. In that context, given that we now have the opportunity to use program-creating tools better than keypunches and simple line editors, I believe that realm is best handled in the user interfaces of those facilities that help create and maintain programs, _not_ in the designs of the languages in which those programs are written.
--

James Craig Burley, Software Craftsperson    burley@gnu.ai.mit.edu
Member of the League for Programming Freedom (LPF)