Newsgroups: comp.lang.functional
Path: sparky!uunet!mcsun!sunic!kth.se!dentex.tds.kth.se!kff
From: kff@dentex.tds.kth.se (Karl-Filip Faxen)
Subject: Re: Predicting efficiency from theoretical models (Re: ** LAZY LANGUAGES FASTER?? **)
Message-ID: <1992Jul27.165728.16279@kth.se>
Sender: usenet@kth.se (Usenet)
Nntp-Posting-Host: dentex.tds.kth.se
Reply-To: kff@dentex.tds.kth.se (Karl-Filip Faxen)
Organization: Royal Institute of Technology, Stockholm, Sweden
References: <1992Jul24.134857.23289@gdstech.grumman.com> <1992Jul25.190252.9173@cs.cornell.edu>
Date: Mon, 27 Jul 1992 16:57:28 GMT
Lines: 59

In article <1992Jul25.190252.9173@cs.cornell.edu>, murthy@cs.cornell.edu (Chet Murthy) writes:

|> Anybody thought about the question of _why_ CBV is more efficient?

[Stuff deleted...]

|> In the end, it came down for me to something disgustingly applied -
|> the frequency of expressions which are weak-head-normalizable to
|> closure-free objects (i.e. which compute to data-values) is very high
|> in normal programs - so CBV wins because it "represents" them
|> efficiently - as the values they denote.

Well, I would like to put it in a slightly different way:

(1) In CBN, the value of a variable may or may not already
be evaluated. A strict primitive, like integer add, needs its
arguments in evaluated form. Hence in an expression like `x+y',
both arguments need to be tested, and perhaps evaluated, before
the addition is performed. This incurs two costs relative to CBV:

  (a) The run-time test is not needed in CBV, since variables are
  only allowed to denote fully evaluated objects.

  (b) A variable must be able to represent not only all objects of
  its type, but also all possible ways of evaluating such objects
  (all possible closures). Thus if a variable is implemented as a
  machine word, either there will be fewer objects of its type
  than bit patterns in a machine word (which is a pain in the
  behind when it comes to ints or floats), or some (or all) values
  in the type have to be stored in the heap and represented as
  pointers (all values must be boxed), which wastes storage and
  instructions (to move data between the heap and the CPU).

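Costs (1a) and (1b) can be sketched concretely. The following is my own toy construction in Python (not taken from any actual compiler): a variable may hold either a plain value or a suspension, so a strict primitive must test, and perhaps force, each operand before doing the real work.

```python
class Thunk:
    """A suspended computation; force() evaluates at most once
    (call-by-need) and caches the result."""
    def __init__(self, compute):
        self.compute = compute
        self.evaluated = False
        self.value = None

    def force(self):
        if not self.evaluated:        # the run-time test CBV never needs
            self.value = self.compute()
            self.compute = None       # drop the closure once evaluated
            self.evaluated = True
        return self.value

def whnf(v):
    """Reduce to weak head normal form: the variable may hold either a
    plain value or a thunk, so it must be tested first -- cost (1a)."""
    return v.force() if isinstance(v, Thunk) else v

def add(x, y):
    # Under CBN: two tests, perhaps two evaluations, then the add.
    # Under CBV the operands would already be plain machine integers.
    return whnf(x) + whnf(y)

x = Thunk(lambda: 2 * 3)   # not evaluated yet
y = 4                      # already a value
print(add(x, y))           # forces x, then adds: prints 10
print(add(x, y))           # x already forced; only the test remains
```

Note that cost (1b) is visible here too: `x` cannot be represented as a bare machine integer, since the same slot must also be able to hold a closure; a real implementation pays for this with boxing or tag bits.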
(2) Optimizations such as register allocation and static instruction
scheduling depend on the compiler being able to figure out what the
program will do when run. Every conditional branch is one more place
where the compiler has to assume the worst. Thus these tests carry
indirect costs as well as the direct cost of executing them.

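To make (2) a bit more tangible, here is a toy sketch (my own construction, in the same spirit as above): each "is it evaluated?" test is a conditional branch, and each branch ends a basic block, shrinking the window within which a block-local optimizer can rearrange code.

```python
def basic_blocks(code):
    """Split a list of pseudo-instructions into basic blocks:
    a conditional branch ends the current block, so everything
    after it is invisible to block-local optimizations such as
    local register allocation or instruction scheduling."""
    blocks, current = [], []
    for ins in code:
        current.append(ins)
        if ins[0] == "branch_if_unevaluated":
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks

# CBV code for `x+y': one straight-line block.
cbv = [("load", "x"), ("load", "y"), ("add",)]

# CBN code for `x+y': a test (and a possible call to an
# evaluation routine) per operand.
cbn = [("load", "x"), ("branch_if_unevaluated", "eval_x"),
       ("load", "y"), ("branch_if_unevaluated", "eval_y"),
       ("add",)]

print(len(basic_blocks(cbv)))   # 1: the optimizer sees the whole expression
print(len(basic_blocks(cbn)))   # 3: it must assume the worst at each branch
```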
(3) It may be expensive to allocate, initialize and in general manipulate
the closures. Especially since these generally live in the heap, they
may well destroy locality, resulting in a thrashing data cache.

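Point (3) shows up clearly in a lazy left fold. In the sketch below (again mine, with Python closures standing in for heap-allocated thunks), the lazy version allocates one suspension per element and chains them through the heap, while the strict version keeps everything in a single accumulator:

```python
def lazy_foldl(f, z, xs):
    """Call-by-name left fold: every step allocates a suspension
    (a closure) instead of performing f immediately."""
    acc = lambda: z
    for x in xs:
        # one fresh heap closure per element, chained to the previous
        acc = (lambda a, v: (lambda: f(a(), v)))(acc, x)
    return acc                       # a chain of len(xs) closures

def strict_foldl(f, z, xs):
    """Call-by-value left fold: one accumulator, no suspensions."""
    acc = z
    for x in xs:
        acc = f(acc, x)
    return acc

xs = range(100)
chain = lazy_foldl(lambda a, b: a + b, 0, xs)
print(chain())                                   # forcing walks 100 closures: 4950
print(strict_foldl(lambda a, b: a + b, 0, xs))   # same answer, no chain: 4950
```

Forcing the lazy result touches a hundred separately allocated closures scattered over the heap, which is exactly the sort of access pattern that defeats a data cache.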
Unfortunately, effects such as (1b), (2) and the locality loss in (3) are
not accounted for by standard complexity theory. They just slip between
your fingers if you're only counting the number of operations a program
will execute, since on a modern computer the cost of an operation in
general depends on its context: what operations precede and follow it in
the dynamic instruction stream; when, where and by whom its operands were
computed; and when, where and by whom its result will be used.

|> I mean, for a theoretician, there seems to be no really good answer.

I think that is a sure sign that there is a lot of new and interesting
theory to be done in this field!

Karl-Filip
----------------------------------------------------------------
Make the frequent case fast, and the fast case frequent!