- Newsgroups: comp.compilers
- Path: sparky!uunet!spool.mu.edu!enterpoop.mit.edu!world!iecc!compilers-sender
- From: syiek@tartan.com
- Subject: High Quality DSP Compilers
- Reply-To: syiek@tartan.com
- Organization: Compilers Central
- Date: Mon, 4 Jan 1993 22:43:42 GMT
- Approved: compilers@iecc.cambridge.ma.us
- Message-ID: <93-01-008@comp.compilers>
- Keywords: DSP, optimize, comment
- References: <92-12-094@comp.compilers> <92-12-097@comp.compilers>
- Sender: compilers-sender@iecc.cambridge.ma.us
- Lines: 211
-
- Response to Johan Van Praet and Jeff Enderwick
- (discussion of DSP compiler quality).
-
- Tartan Inc. is a supplier of Ada development systems for real-time
- embedded systems. We produced the first Ada compiler for a DSP (the
- TMS320C3x in '90), and currently also have TMS320C31 and TMS320C40
- compilers.
-
- As you may already know, the use of Ada is mandated by the U.S.
- Department of Defense (DOD) for all defense-related programs. Use of
- assembly is usually limited to 5% or less of a total application.
-
- Most real-time DSP applications require extremely tight code.
- Traditionally this could only be achieved through careful assembly
- language coding. Thus it is not unusual for DSP users to reject
- high-level language compilers (sometimes before even trying them)
- because of this previous experience.
-
- Fortunately, the DOD mandate makes it very hard to reject an Ada compiler
- without first trying it out. As a result, the Tartan Ada DSP compilers
- are often benchmarked against other compilers and hand-coded assembly.
- The results have been surprising (but not to us!): Tartan Ada consistently
- outperforms other DSP compilers and is usually only slightly slower than
- hand-coded assembly. Indeed, there are cases where the compiler has
- produced code that runs faster or is smaller than hand-coded assembly.
-
- There are published accounts of several of these benchmarking efforts.
- Two are:
-
- P.K. Lawlis and T.W. Elam "Ada Outperforms Assembly: A Case Study"
- Proceedings of TRI-Ada '92
- Orlando, Florida 1992.
-
- Ralph E. Crafts, "Ada vs. C for the DSP - Advantage Ada",
- Ada Strategies, Vol 5. No. 4, April 1991.
-
- The Lawlis paper discusses an incident where the compiler was used to
- produce a version of an algorithm that was the same size as hand-coded
- assembly, but ran twice as fast.
-
- There are many technical reasons why the compiler performs as well as it
- does. I refer you to the following papers:
-
- D.A. Syiek "Challenging Assembly Code Quality,"
- Proceedings of the International Conference on DSP Applications
- and Technology,
- Berlin, Germany, 1991.
-
- D.A. Syiek and D. Burton
- "Optimizing Ada Code for the TI SMJ320C30 Digital Signal Processor",
- Proceedings of the International Conference on DSP Applications
- and Technology,
- Brussels, Belgium, 1990.
-
- ======================================================================
-
- Some specific responses to Johan Van Praet's message:
-
- >What causes this factor of 5 on the code size?
-
- It is speed that usually matters for my customers. However, the Lawlis
- paper describes how the Ada compiler was also used to produce a version
- that was half the size of the hand-coded assembly. This is a factor of
- 0.5!
-
- >The instruction set of a DSP processor does not lend itself
- >to conventional compiling techniques ?
-
- All machines are unique to some degree. A good compiler technology will
- accommodate this uniqueness across a wide variety of machines. We had few
- real problems finding ways to generate good TMS320C3x/C4x code. We expect
- few problems when (and if) we tackle other DSP architectures in the
- future.
-
- >the High Level Languages are not useful for DSP?
- >not enough parallelism in C (too difficult to extract the parallelism)?
- >too many possible constructs in C ? (a subset of C is better)
- >non-procedural languages as e.g. Silage are better ?
-
- 1. There are many who believe that C is NOT an example of a modern
- high-level language. I will pretend that you use "C" as a meta-variable for
- high-level language (HLL).
- 2. You can kill performance in ANY language with bad coding style.
- Conversely, you can tune code easily in a high-level language (but
- not always the same way for all languages, compilers and targets).
- 3. Parallelism at what level?
- Are you building a multiprocessor and want your algorithm magically
- distributed across the system?
- Or do you have asynchronous functional units in your CPU?
- Is the instruction pipeline visible at the user level?
- Are there multiple operations per instruction?
- 4. The Tartan TMS320C3x/C4x Ada compiler is able to "fold" loops (make
- software pipelines) and generate parallel instructions. The parallel
- instructions are also used in many pre-canned sequences and
- library functions invoked automatically where applicable. Finally, the
- parallel instructions may be created in unusual places through an
- algorithm we call "threads" that is too combinatorially complex to
- duplicate by hand.
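-
- The loop folding described in point 4 can be sketched in a few lines.
- This is only an illustration of the general software-pipelining idea, not
- Tartan's algorithm: the function names are made up for the example, and
- Python is used purely as executable pseudocode. The folded version
- overlaps the multiply for element i with the accumulate for element i - 1,
- which is the kind of pairing a parallel multiply/add instruction can
- issue together.

```python
def dot_product_plain(a, b):
    """Reference loop: multiply and accumulate in the same iteration."""
    acc = 0.0
    for i in range(len(a)):
        p = a[i] * b[i]        # multiply
        acc += p               # accumulate (depends on the multiply above)
    return acc


def dot_product_folded(a, b):
    """Hand-folded loop (hypothetical sketch of software pipelining).

    Each kernel pass performs the multiply for element i together with the
    accumulate for element i - 1. The two operations are independent of each
    other, so a machine with parallel instructions could issue them as one.
    """
    n = len(a)
    if n == 0:
        return 0.0
    acc = 0.0
    p = a[0] * b[0]            # prologue: first multiply, nothing to add yet
    for i in range(1, n):      # kernel: add(i - 1) alongside multiply(i)
        acc, p = acc + p, a[i] * b[i]
    acc += p                   # epilogue: drain the last product
    return acc
```

- Both functions add the products in the same order, so they return
- identical results; only the grouping of work per loop pass changes.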
-
- >no possibility of using all the provided tricks for the DSP
- >processors ?
-
- Almost all good compilers contain language extensions, compiler built-ins,
- or libraries that allow you to touch the hardware features no matter what
- the input language. For example, the Tartan compiler contains constructs
- for using the circular and bit-reversed addressing features of the
- TMS320C3x/C4x.
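-
- For readers unfamiliar with circular addressing, here is a rough sketch
- of what the hardware feature buys you. It is written in Python as
- executable pseudocode and says nothing about the actual Tartan
- constructs; the class and function names are invented for this example.
- The modulo operation on the index is the part the DSP addressing unit
- performs for free.

```python
class CircularBuffer:
    """Fixed-size delay line addressed with a wrapping index."""

    def __init__(self, size):
        self.data = [0.0] * size
        self.pos = 0                       # slot holding the oldest sample

    def push(self, sample):
        """Overwrite the oldest sample and advance the index with wraparound."""
        self.data[self.pos] = sample
        self.pos = (self.pos + 1) % len(self.data)

    def newest_first(self):
        """Return the samples ordered from most recent to oldest."""
        n = len(self.data)
        return [self.data[(self.pos - 1 - k) % n] for k in range(n)]


def fir_step(buf, coeffs, sample):
    """Feed one sample into the delay line and return one FIR filter output."""
    buf.push(sample)
    return sum(c * x for c, x in zip(coeffs, buf.newest_first()))
```

- No samples are ever shifted in memory; only the index moves, which is why
- the technique suits filter delay lines.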
-
- > no global ordering and scheduling of the generated code ?
-
- At the machine code level, we schedule code to avoid pipeline delays both
- for delayed branching and for all other pipeline locks. At a higher
- level, scheduled tasking and interrupt handling usually deal with the
- remainder of the dynamic scheduling issues. Automatic multiprocessor
- scheduling is usually not a requirement of embedded systems. Tasks are
- statically divided up amongst the resources.
-
- >no or not enough use of the low-overhead-loop facility of a
- >processor as the "DO" for Motorola and the "RPT" for Texas
- >Instruments processors ?
-
- The Tartan TMS320C3x/C4x compilers automatically use the RPTS and RPTB
- instructions where appropriate. In fact there are about a dozen
- specialized looping constructs the compiler chooses from when building
- iteration loops, depending on the nesting depth, iteration count,
- resources available, and so on.
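-
- To give a feel for this kind of selection, here is a deliberately tiny
- sketch. It is NOT the Tartan compiler's decision procedure: the function,
- its parameters, and the three-way choice are assumptions made up for
- illustration (Python as executable pseudocode). It relies only on the
- facts that RPTS repeats a single instruction, RPTB repeats a block, and
- both use the processor's one set of repeat registers.

```python
def choose_loop_form(body_instructions, is_innermost, repeat_regs_free):
    """Pick a loop construct for a counted loop (hypothetical heuristic).

    body_instructions: number of instructions in the loop body
    is_innermost:      True if the loop contains no other loop
    repeat_regs_free:  True if the repeat registers are not already in use
    """
    if is_innermost and repeat_regs_free:
        if body_instructions == 1:
            return "RPTS"                  # single-instruction repeat
        return "RPTB"                      # block repeat
    # Fallback: an ordinary counter with a conditional branch.
    return "decrement-and-branch"
```

- A real code generator would weigh far more than this (iteration count,
- nesting depth, available registers), as noted above.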
-
- >no use of special addressing ? (e.g. in circular buffers)
-
- As already stated, the Tartan compiler contains constructs for doing
- circular addressing and bit-reversed addressing.
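-
- Bit-reversed addressing is the access order needed by radix-2 FFTs. The
- sketch below shows the index sequence that the addressing hardware
- produces. It is an illustration in Python with invented function names,
- not the compiler's interface.

```python
def bit_reverse(index, bits):
    """Reverse the low `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # shift in the lowest bit
        index >>= 1
    return result


def bit_reversed_order(n):
    """Return the indices 0..n-1 in bit-reversed order (n a power of two)."""
    if n <= 0 or n & (n - 1) != 0:
        raise ValueError("n must be a positive power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]
```

- For an 8-point transform the order is 0, 4, 2, 6, 1, 5, 3, 7. In
- software this costs a loop per index; with hardware support it costs
- nothing extra per access.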
-
- >no use of special block data moves ?
-
- The Tartan compiler uses fast block move sequences whenever appropriate.
-
- >I also know of three code generation approaches that would generate more
- >optimal code :
-
- Now you know of a fourth. Furthermore it is mature and off-the-shelf and
- supports a standardized high-level language.
-
- >What is their quality on real life examples ?
-
- Tartan has a large customer base - it seems like just about every company
- in the defense industry owns a Tartan compiler. Our DSP compilers have
- been used to build many delivered systems. The feedback we get from our
- customers is VERY positive.
-
- So what does this mean for the commercial (non-DOD) world?
-
- 1. You COULD start using Ada. If you have a large application, there
- is much evidence that the high cost of entry will pay for itself
- several times over. However, much of the commercial world:
- a. Builds small systems on rather primitive fixed-point DSP processors
- for which good compilers do not exist.
- b. Refuses to consider Ada and will not even study the cost-benefit.
-
- 2. Tartan recently formed a commercial division in order to leverage our
- existing technology into that market. There are already two products
- designed to boost C performance: FasTar (high speed trig function
- library) and FloTar (double precision floats for the C3x/C4x). Look
- for more ambitious products in the coming year.
-
- ======================================================================
-
- Specific responses to Enderwick's message:
-
- I agree with much of what you say. The difference between your
- perspective and mine is that my customers are generally creating
- lower-production-volume systems using the newer floating-point chips.
- These systems are large (often measured in 100K line increments) and have
- such long life expectancies that 100% assembly code is totally impractical
- (even if the DOD would allow it).
-
- In a large system, the 90-10 rule tends to hold up pretty well (90% of the
- time is spent in 10% of the code). A small amount of time spent tuning
- the key 10% of the Ada code usually gets the algorithm to perform in the
- required time. The compiler contains assembly code insertion capability
- (with symbolic reference to Ada variables) as well as assembly code
- interface capability. These can be (and are) used as a last resort and
- usually on less than 5% of the total code.
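-
- The arithmetic behind the 90-10 argument can be checked with Amdahl's
- law. The sketch below is illustrative only: the function name is invented
- and the speedup figures are example inputs, not measurements from any
- Tartan customer. It shows why tuning only the hot 10% of the code pays
- off, and also where the returns flatten out.

```python
def overall_speedup(hot_fraction, hot_speedup):
    """Amdahl's law: whole-program speedup when only the hot part is tuned.

    hot_fraction: share of run time spent in the tuned code (0.9 for 90-10)
    hot_speedup:  factor by which the tuned code gets faster
    """
    untouched = 1.0 - hot_fraction           # time share left unchanged
    return 1.0 / (untouched + hot_fraction / hot_speedup)


# Doubling the speed of the hot code gives about 1.8x overall;
# a tenfold gain there gives about 5.3x.
for factor in (2.0, 10.0):
    print(factor, round(overall_speedup(0.9, factor), 2))
```

- The remaining 10% of run time caps the overall gain at 10x no matter how
- well the hot code is tuned.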
-
- The register selection problem you mention shows up on the C3x/C4x as
- well. Since loop folding and other parallelization techniques are done
- AFTER the code is generated, registers must often be re-assigned to get
- the right ones. The compiler does not always come up with the ideal
- solution (but we are working on it!). However, since most loops employing
- parallel instructions are 5 or 6 instructions long, it's not really a big
- deal to use a machine code insert.
-
- Your comment about the referenced schemes being "library goop-together"
- matches my limited understanding of them as well. However, as you also
- note, this is a powerful approach. Tartan uses this approach in a limited
- way by providing some amount of library routines, often with high speed
- interfacing supported directly from the code generator of the compiler.
- For example, complex numbers are supported in this way, as are mixed
- complex/real operations and complex vectors. It is likely that we will
- expand on this as time goes by.
-
- David A. Syiek
- Tartan Inc., Monroeville, PA 15146
- (412) 856-3600, FAX:856-3636
- syiek@tartan.com
- [Does Tartan have DSP compilers for other languages or just Ada? I can
- imagine that there'd be a market for Fortran, considering all the numeric
- Fortran code there is, or maybe C++ for people who want a cool language.
- -John]
- --
- Send compilers articles to compilers@iecc.cambridge.ma.us or
- {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
-