home *** CD-ROM | disk | FTP | other *** search
-
-
- Notes on potential compiler improvements
-
-
- * The analysis which generates `block-stack-link' could be improved.
- Currently, it fails in the case where the procedure is always invoked
- as a subproblem, but there are multiple possible continuations. The
- reason is subtle: we need to know that the `continuation/offset' of
- all of the continuations is the same (as well as the frame-size and
- the closing-block). Unfortunately, the computation of the offset
- depends on the subproblem ordering, which depends on the stack-link
- (to decide whether or not to use static links). Catch-22. Probably
- this can be solved.
-
- * Pathological case of "takr.scm" can perhaps be solved by integrating
- simapp and outer into one pass. By handling "passed-in" nodes before
- other nodes, and not making links to such nodes, the explosion of
- useless dataflow information would be avoided. However, this affects
- the static-link analysis, which looks at BOTH the "passed-in" bit as
- well as the set of values. Think of some way to make this degrade
- properly.
-
- * Make the static-link analysis more sophisticated so that it uses
- static links whenever the current strategy would require going through
- at least two links. This sometimes happens when the parent must be
- located through the closing block of the continuation. In this case
- it is probably better to add a redundant static link for speed in
- lookup.
-
- * When tail-recursing into an internal procedure, if the procedure has
- no free variables, we can erase the calling frame. In the simplest
- case, this means that such a procedure is actually an external
- procedure. However, we could get more sophisticated and notice that
- it was OK to delete some of the ancestor stack frames but not others.
-
- * The code generated by the rewrite rule for disjunctions demonstrates
- that the decision about whether or not to use registers for LET
- parameters does not depend on the entire body of the LET. In this
- case, the predicate parameter can ALWAYS be register allocated,
- independent of the complexity of the alternative, because it is unused
- once the decision has been made in favor of the alternative. This can
- be generalized to handle more complex cases.
-
- * Change CFG implementation so that `hook' objects are just partially
- connected edges. I think that noop nodes can then be eliminated in
- favor of disconnected edges. This will also solve a potential problem
- where deletion of the noop nodes at the entry points of continuations
- leaves random hooks scattered around in various subproblems.
-
- * Many closures are never invoked through their external entry points.
- For such closures, the external entry point and associated code need
- never be generated. Also, the closure object need not contain a code
- pointer. This is one step closer to just using the closure frame
- pointer in place of the closure.
-
- * Perform dead-code-elimination at the same time as constant folding.
- Be thorough, deleting all nodes associated with all code that is
- eliminated. This is tricky but pays off handsomely later on. Also,
- doing it after the dataflow but before the rest of the analysis
- greatly reduces the amount of details that have to be kept track of
- during deletion.
-
- ALSO: note that removal of code to hack known predicates in "rgretn"
- may make something like this necessary for simple cases.
-
- Subsequent note: performing dead code elimination prior to subproblem
- ordering has a problem in that there are cfg fragments in the
- subproblems with invisible pointers into the node structure. We can't
- delete nodes unless we know about these pointers, so we must do dead
- code elimination after subproblem ordering.
-
- * Now that RTL generator does not generate temporaries for quantities
- that are immediately pushed, tested, etc., we will need to modify the
- CSE to generate temporaries for the cases where those quantities are
- found in multiple places. Hopefully this won't break the world.
-
- * The interning of SCode variable objects (for explicit lookup) is
- done on a per-block basis. It should be changed so that stack blocks
- are skipped and the interning is done on the nearest IC block.
-
- * Fixnum operations
-
- ** Is signed bit-field extraction faster than current strategy if the
- operand is in memory?
-
- ** In the case of addition of two boxed fixnums to a boxed result, no
- unboxing is needed on the operands provided the result is boxed in the
- usual way.
-
-
- Items that have been processed
-
-
- * Introduction of inline-coded continuations (i.e. continuations of
- type UNIQUE or SIMPLE) has invalidated the current method of
- maintaining the frame pointer offset. The reason is that the body of
- such a continuation thinks that the frame pointer knows where its
- frame is, while the offset in fact refers to some ancestor of that
- frame. I think that ignoring the frame of such a continuation in
- `find-block' will produce the desired effect.
-
- * JOIN type blocks aren't needed for offset, but they ARE needed to
- prevent continuations from being classified as UNIFORM when they
- aren't.
-
- * To do `block-parent' operation on a "real" block, must skip any
- intervening JOIN blocks to find the next "real" block.
-
- * `generator/subproblem' has code to mark frame-size of a join block
- if the continuation is closed in one. That needs to be moved
- elsewhere?
-
- * Theory: JOIN blocks are always invisible _except_ when needed to
- compute a frame pointer offset. This means:
-
- ** `find-block' and friends in "emodel" need to know about them. Also
- the associated `stack-block-parent-locative' and similar
- constructions.
-
- ** `procedure-closure-block' now refers to the previous
- `block-parent'. The closing code must refer to `block-%parent' to get
- the lower-level closing block.
-
- ** `block->join/distance' in "rgretn" needs to learn about them.
-
- * (implemented 8/88 -- cph) The code in "rgretn" should be modified as
- follows. At a return point, if the continuation is known, then we can
- just jump to the continuation, as long as we set things up correctly
- based on the operator class of the continuation. This might mean, for
- example, that we throw away the return address on the stack because we
- know that it has a certain value. In practice, this can occur when we
- supply a continuation to a combination that goes to one of two
- procedures. The procedure in which the return appears is ONLY invoked
- with this continuation, while the other procedure is sometimes invoked
- with another continuation. Thus we must push the return address,
- because we don't know which procedure we're invoking, but at return
- time it isn't needed.
-
- * Some procedures that are being considered closures can easily be
- open external. Each of the free variables must satisfy one of the
- following criteria: (1) it has a known value, or (2) it is bound in
- the IC block being used for cached references. This optimization will
- make an enormous performance improvement on programs that consist of
- many procedures closed in a compiled block, with a few external
- closure entry points, because it will allow most of the internal
- procedures to be open. Currently they will all become closures.
-