NetNews Usenet Archive 1992 #16

Internet Message Format | 1992-07-30 | 8.1 KB

Xref: sparky comp.arch:8455 comp.lang.misc:2709
Path: sparky!uunet!gatech!purdue!mentor.cc.purdue.edu!pop.stat.purdue.edu!hrubin
From: hrubin@pop.stat.purdue.edu (Herman Rubin)
Newsgroups: comp.arch,comp.lang.misc
Subject: Re: CISC Microcode (was Re: RISC Mainframe)
Message-ID: <55535@mentor.cc.purdue.edu>
Date: 30 Jul 92 13:16:51 GMT
References: <id.RKUR.GFF@ferranti.com> <55294@mentor.cc.purdue.edu> <id.GQWR.83D@ferranti.com>
Sender: news@mentor.cc.purdue.edu
Followup-To: comp.arch
Organization: Purdue University Statistics Department
Lines: 142

In article <id.GQWR.83D@ferranti.com> peter@ferranti.com (Peter da Silva) writes:
>In article <55294@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>> In article <id.RKUR.GFF@ferranti.com> peter@ferranti.com (Peter da Silva) writes:

>> >I'm glad you noticed it. We're agreed then that these idiot savants generate
>> >code that is as fast as a human can write, if not the same sort of code.
>
>> This is not always the case. It is quite possible that a simple idea which
>> a human can see is not within the category of those known to the compiler.
>> No compiler can change the algorithm to one that it does not know about.
>> I can, and do.

>Yes, I quite agree, different problem spaces demand different algorithms for
>solution. However, and this is the key point, it is (in 1992) extremely rare
>that you encounter a constraint on the problem space due to the hardware that
>has any significant effect on the choice of algorithm. Things like 64K segment
>limits, massively parallel hardware, or the absence of a FPU... yes. Things
>like the presence or absence of autoincrement addressing... no. It's been many
>years that microoptimisation for the instruction set has been cost-effective
>outside of tight loops.

Try designing algorithms for massively parallel, or even vector, machines.
The massively parallel SIMD machines really scream for VVVCISC, as any
conditional operation is a major bottleneck.
The conditioning in hardware can be very cheap, as now each processor does
it independently.

Things like the use of fixed-point operations on floating, the speed of
conversion, the use of bits, and like operations; packing and unpacking
of floats; and quite a few others, are still important. With a decent way
of writing machine instructions, I consider a couple of hundred instructions,
certainly not what I would call a tight loop, as reasonable. I do not know
how important autoincrement addressing is, but vector machines use a special
case of it to great advantage.

>> >Because allowing the compiler to find machine-specific optimisations is more
>> >cost effective than doing it myself. Because the code I write runs on Sparcs,
>> >68000s, 80386s, 80286s, VAXen, and 68020s. Because next month it might be
>> >running on MIPS or 88000. Next year it might be running on Supersparc or Alpha.
>> >Because optimising when you don't need to is a waste of resources.

>> This means that the language should be expanded to include the alternatives.

>Why? So I can write the same code half a dozen times? Whether I do so by
>coding a loop in five different assembly dialects or by coding a bunch of
>alternatives in Herman Rubin's Perfect Language it's still a waste of time.

There is not, was not, and never will be a "perfect language." The superfast
sub-imbecile in the compiler, and even the assembler instruction reorderer,
can try out lots of alternatives, including the investigation of many
branches. The HLL gurus claim that their language will make it unnecessary
for programmers ever to use machine codes. My proposal is for far less; let
the programmer set up translations, some of which will be for quite
considerable use, and the machine operations may or may not be invoked.

An example, which may or may not use machine code, would be to find the
integer closest to a floating-point number. Now this, and other such
"standard" situations, could be put in a dictionary of alternatives.
Quite some time ago, for simulation, I saw that producing vectors of random
variables, instead of separate calls, would greatly speed up things. I have
not seen this in wide use in the standard libraries. For a very simple
operation, which can be extremely nasty, consider that of using a random
bit to put a sign on a random number. In fact, consider the operation of
putting the sign of x on y, or reversing the sign of y if x is negative.
This is a fast operation on some machines, and a slow operation on others,
and if it is slow, the algorithm in which it is used should be changed to
avoid this problem, if at all possible.

>> >> >I expect any good programmer to code for the abstract machine defined for
>> >> >the language. Remember, software longa, hardware brevis.

>> >> This might be reasonable if we had a language produced by people who have
>> >> some respect for the capabilities of humans.

>> >We do. We have dozens of them.

>> Name one in which recognizes the various things I have previously published
>> in a syntax intended for fast use by a human being.

>The various things you're looking for are in the "tight loop" category, and
>the effort of coding them in assembler is negligable compared to the cost of
>coding the rest of the program in a language designed for microoptimisation
>rather than expression of an algorithm.

>C++ with embedded assembly and inline expansion.
>Forth.
>C with inline assembler.
>Turbo Pascal.
>Dec Fortran 77.
>Most modern Lisp dialects.

I have been mainly using C with inline assembler. But its headaches are
massive. And my complaint that the great bulk of assembler languages are
not designed for human use definitely holds. At least one Unix manual
had the explicit statement that the assembler was essentially for compiler
maintainers ONLY.

Forth, and in fact all stack machines, are overly restrictive.
"Superinlining," which essentially uses code expansion, not subroutine
conversion, is what should be done.
Even if memory <-> register is as fast as claimed, it still involves lots
of moving. An inlined procedure, to be good, must take the arguments where
it finds them, and put the results where they are wanted, not using any
standard memory or register locations.

>> Lisp seems to have a
>> fairly reasonable collection of primitives, but a human being should not
>> be REQUIRED to use clumsy prenex notation.

>Oddly, many human beings prefer it. HP has managed to remain the last
>successful US calculator manufacturer by using a similar notation.

I do not believe that many human beings would like to have to type
plus(x,y) for x+y, minus(x,y) [or is it minus(y,x)?] for x-y, etc.
Nor should they have to type iplus for integers, fplus for floats, etc.

HP uses stack notation, not prenex. For manual operation, this minimizes
work, and even in a computer, this is the actual function or subroutine
procedure, even if it is not written that way. I am not aware of any
calling sequences where the call is issued first, and then the arguments
are obtained, or some arguments are issued, and then the instruction, and
then other arguments.

The prenex form is easier for the parser. I have seen the structure of
a microcomputer assembler, and it uses the opcode to branch to the parsing
of the remaining part of the instruction. This is easy for the assembler
only.

And even the HP uses registers, which are really memory, to handle
arguments, instead of having to manipulate a stack. The current registers,
not the registers on the early machines, are really a specialized high-speed
memory, not where the computations are done. Too few machine designers have
realized this and included registers in the addressable memory, so that
registers themselves could be indexed and addressed as variable locations.

Now I have no problems with the assembler, optimizer, etc., using things
in this manner.
But I would like to write things my way, and you would like
to write things your way, and I see no reason why we cannot be accommodated.
What we need is a versatile macro processor to translate between.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@pop.stat.purdue.edu (Internet, bitnet)
{purdue,pur-ee}!pop.stat!hrubin(UUCP)