This is Info file gcc.info, produced by Makeinfo-1.55 from the input
file gcc.texi.

   This file documents the use and the internals of the GNU compiler.

   Published by the Free Software Foundation 675 Massachusetts Avenue
Cambridge, MA 02139 USA

   Copyright (C) 1988, 1989, 1992, 1993 Free Software Foundation, Inc.

   Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.

   Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the sections entitled "GNU General Public License" and "Protect
Your Freedom--Fight `Look And Feel'" are included exactly as in the
original, and provided that the entire resulting derived work is
distributed under the terms of a permission notice identical to this
one.

   Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the sections entitled "GNU General Public
License" and "Protect Your Freedom--Fight `Look And Feel'", and this
permission notice, may be included in translations approved by the Free
Software Foundation instead of in the original English.

File: gcc.info,  Node: Output Statement,  Next: Constraints,  Prev: Output Template,  Up: Machine Desc

C Statements for Assembler Output
=================================

   Often a single fixed template string cannot produce correct and
efficient assembler code for all the cases that are recognized by a
single instruction pattern.  For example, the opcodes may depend on the
kinds of operands; or some unfortunate combinations of operands may
require extra machine instructions.

   If the output control string starts with a `@', then it is actually
a series of templates, each on a separate line.  (Blank lines and
leading spaces and tabs are ignored.)
The templates correspond to the pattern's constraint alternatives
(*note Multi-Alternative::.).  For example, if a target machine has a
two-address add instruction `addr' to add into a register and another
`addm' to add a register to memory, you might write this pattern:

     (define_insn "addsi3"
       [(set (match_operand:SI 0 "general_operand" "=r,m")
             (plus:SI (match_operand:SI 1 "general_operand" "0,0")
                      (match_operand:SI 2 "general_operand" "g,r")))]
       ""
       "@
        addr %2,%0
        addm %2,%0")

   If the output control string starts with a `*', then it is not an
output template but rather a piece of C program that should compute a
template.  It should execute a `return' statement to return the
template-string you want.  Most such templates use C string literals,
which require doublequote characters to delimit them.  To include these
doublequote characters in the string, prefix each one with `\'.

   The operands may be found in the array `operands', whose C data type
is `rtx []'.

   It is very common to select different ways of generating assembler
code based on whether an immediate operand is within a certain range.
Be careful when doing this, because the result of `INTVAL' is an
integer on the host machine.  If the host machine has more bits in an
`int' than the target machine has in the mode in which the constant
will be used, then some of the bits you get from `INTVAL' will be
superfluous.  For proper results, you must carefully disregard the
values of those bits.

   It is possible to output an assembler instruction and then go on to
output or compute more of them, using the subroutine `output_asm_insn'.
This receives two arguments: a template-string and a vector of
operands.  The vector may be `operands', or it may be another array of
`rtx' that you declare locally and initialize yourself.

   When an insn pattern has multiple alternatives in its constraints,
often the appearance of the assembler code is determined mostly by
which alternative was matched.
When this is so, the C code can test the variable `which_alternative',
which is the ordinal number of the alternative that was actually
satisfied (0 for the first, 1 for the second alternative, etc.).

   For example, suppose there are two opcodes for storing zero,
`clrreg' for registers and `clrmem' for memory locations.  Here is how
a pattern could use `which_alternative' to choose between them:

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "=r,m")
             (const_int 0))]
       ""
       "*
       return (which_alternative == 0
               ? \"clrreg %0\" : \"clrmem %0\");
       ")

   The example above, where the assembler code to generate was *solely*
determined by the alternative, could also have been specified as
follows, having the output control string start with a `@':

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "=r,m")
             (const_int 0))]
       ""
       "@
        clrreg %0
        clrmem %0")

File: gcc.info,  Node: Constraints,  Next: Standard Names,  Prev: Output Statement,  Up: Machine Desc

Operand Constraints
===================

   Each `match_operand' in an instruction pattern can specify a
constraint for the type of operands allowed.  Constraints can say
whether an operand may be in a register, and which kinds of register;
whether the operand can be a memory reference, and which kinds of
address; whether the operand may be an immediate constant, and which
possible values it may have.  Constraints can also require two operands
to match.

* Menu:

* Simple Constraints::  Basic use of constraints.
* Multi-Alternative::   When an insn has two alternative constraint-patterns.
* Class Preferences::   Constraints guide which hard register to put things in.
* Modifiers::           More precise control over effects of constraints.
* Machine Constraints:: Existing constraints for some particular machines.
* No Constraints::      Describing a clean machine without constraints.
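
   As a rough illustration of what a constraint string expresses, the
following C sketch models the check for a single operand.  It is not
code from the compiler: the enum `operand_kind' and the functions
`letter_allows' and `first_matching_alternative' are hypothetical names
invented for this example.  Only the letters `r', `m', `i' and `g'
(described in the nodes that follow) are modeled, every register is
assumed to be a general register, and modifier characters such as `='
are simply skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand kinds for this sketch; the compiler itself
   works on RTL expressions, not on an enum like this.  */
enum operand_kind { OP_REGISTER, OP_MEMORY, OP_IMMEDIATE };

/* Return true if constraint letter C permits an operand of kind KIND.
   Only `r', `m', `i' and `g' are modeled; anything else (including
   modifiers such as `=') permits nothing.  */
static bool
letter_allows (char c, enum operand_kind kind)
{
  switch (c)
    {
    case 'r': return kind == OP_REGISTER;
    case 'm': return kind == OP_MEMORY;
    case 'i': return kind == OP_IMMEDIATE;
    case 'g': return true;  /* assumes every register is general */
    default:  return false;
    }
}

/* Return the ordinal number (0 for the first) of the first
   comma-separated alternative in CONSTRAINT that permits KIND,
   or -1 if no alternative does.  */
static int
first_matching_alternative (const char *constraint, enum operand_kind kind)
{
  int alternative = 0;
  bool permitted = false;
  const char *p;

  assert (constraint != NULL);
  for (p = constraint; ; p++)
    {
      if (*p == ',' || *p == '\0')
        {
          /* End of one alternative: report it if it matched.  */
          if (permitted)
            return alternative;
          if (*p == '\0')
            return -1;
          alternative++;
          permitted = false;
        }
      else if (letter_allows (*p, kind))
        permitted = true;
    }
}
```

   Under these assumptions, the constraint `"=r,m"' gives alternative 0
for a register operand and alternative 1 for a memory operand, and
reports no match (-1) for an immediate operand.  The real compiler does
not stop at "no match"; as the following nodes explain, it tries to
reload the operand so that some alternative fits.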
File: gcc.info,  Node: Simple Constraints,  Next: Multi-Alternative,  Up: Constraints

Simple Constraints
------------------

   The simplest kind of constraint is a string full of letters, each of
which describes one kind of operand that is permitted.  Here are the
letters that are allowed:

`m'
     A memory operand is allowed, with any kind of address that the
     machine supports in general.

`o'
     A memory operand is allowed, but only if the address is
     "offsettable".  This means that adding a small integer (actually,
     the width in bytes of the operand, as determined by its machine
     mode) to the address gives another valid memory address.

     For example, an address which is constant is offsettable; so is an
     address that is the sum of a register and a constant (as long as a
     slightly larger constant is also within the range of
     address-offsets supported by the machine); but an autoincrement or
     autodecrement address is not offsettable.  More complicated
     indirect/indexed addresses may or may not be offsettable depending
     on the other addressing modes that the machine supports.

     Note that in an output operand which can be matched by another
     operand, the constraint letter `o' is valid only when accompanied
     by both `<' (if the target machine has predecrement addressing)
     and `>' (if the target machine has preincrement addressing).

`V'
     A memory operand that is not offsettable.  In other words,
     anything that would fit the `m' constraint but not the `o'
     constraint.

`<'
     A memory operand with autodecrement addressing (either
     predecrement or postdecrement) is allowed.

`>'
     A memory operand with autoincrement addressing (either
     preincrement or postincrement) is allowed.

`r'
     A register operand is allowed provided that it is in a general
     register.

`d', `a', `f', ...
     Other letters can be defined in machine-dependent fashion to stand
     for particular classes of registers.  `d', `a' and `f' are defined
     on the 68000/68020 to stand for data, address and floating point
     registers.

`i'
     An immediate integer operand (one with constant value) is allowed.
     This includes symbolic constants whose values will be known only
     at assembly time.

`n'
     An immediate integer operand with a known numeric value is
     allowed.  Many systems cannot support assembly-time constants for
     operands less than a word wide.  Constraints for these operands
     should use `n' rather than `i'.

`I', `J', `K', ... `P'
     Other letters in the range `I' through `P' may be defined in a
     machine-dependent fashion to permit immediate integer operands
     with explicit integer values in specified ranges.  For example, on
     the 68000, `I' is defined to stand for the range of values 1 to 8.
     This is the range permitted as a shift count in the shift
     instructions.

`E'
     An immediate floating operand (expression code `const_double') is
     allowed, but only if the target floating point format is the same
     as that of the host machine (on which the compiler is running).

`F'
     An immediate floating operand (expression code `const_double') is
     allowed.

`G', `H'
     `G' and `H' may be defined in a machine-dependent fashion to
     permit immediate floating operands in particular ranges of values.

`s'
     An immediate integer operand whose value is not an explicit
     integer is allowed.

     This might appear strange; if an insn allows a constant operand
     with a value not known at compile time, it certainly must allow
     any known value.  So why use `s' instead of `i'?  Sometimes it
     allows better code to be generated.

     For example, on the 68000 in a fullword instruction it is possible
     to use an immediate operand; but if the immediate value is between
     -128 and 127, better code results from loading the value into a
     register and using the register.  This is because the load into
     the register can be done with a `moveq' instruction.  We arrange
     for this to happen by defining the letter `K' to mean "any integer
     outside the range -128 to 127", and then specifying `Ks' in the
     operand constraints.

`g'
     Any register, memory or immediate integer operand is allowed,
     except for registers that are not general registers.

`X'
     Any operand whatsoever is allowed, even if it does not satisfy
     `general_operand'.  This is normally used in the constraint of a
     `match_scratch' when certain alternatives will not actually
     require a scratch register.

`0', `1', `2', ... `9'
     An operand that matches the specified operand number is allowed.
     If a digit is used together with letters within the same
     alternative, the digit should come last.

     This is called a "matching constraint" and what it really means is
     that the assembler has only a single operand that fills two roles
     considered separate in the RTL insn.  For example, an add insn has
     two input operands and one output operand in the RTL, but on most
     CISC machines an add instruction really has only two operands, one
     of them an input-output operand:

          addl #35,r12

     Matching constraints are used in these circumstances.  More
     precisely, the two operands that match must include one input-only
     operand and one output-only operand.  Moreover, the digit must be
     a smaller number than the number of the operand that uses it in
     the constraint.

     For operands to match in a particular case usually means that they
     are identical-looking RTL expressions.  But in a few special cases
     specific kinds of dissimilarity are allowed.  For example, `*x' as
     an input operand will match `*x++' as an output operand.  For
     proper results in such cases, the output template should always
     use the output-operand's number when printing the operand.

`p'
     An operand that is a valid memory address is allowed.  This is for
     "load address" and "push address" instructions.

     `p' in the constraint must be accompanied by `address_operand' as
     the predicate in the `match_operand'.  This predicate interprets
     the mode specified in the `match_operand' as the mode of the
     memory reference for which the address would be valid.

`Q', `R', `S', ... `U'
     Letters in the range `Q' through `U' may be defined in a
     machine-dependent fashion to stand for arbitrary operand types.
     The machine description macro `EXTRA_CONSTRAINT' is passed the
     operand as its first argument and the constraint letter as its
     second argument.

     A typical use for this would be to distinguish certain types of
     memory references that affect other insn operands.

     Do not define these constraint letters to accept register
     references (`reg'); the reload pass does not expect this and would
     not handle it properly.

   In order to have valid assembler code, each operand must satisfy its
constraint.  But a failure to do so does not prevent the pattern from
applying to an insn.  Instead, it directs the compiler to modify the
code so that the constraint will be satisfied.  Usually this is done by
copying an operand into a register.

   Contrast, therefore, the two instruction patterns that follow:

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "=r")
             (plus:SI (match_dup 0)
                      (match_operand:SI 1 "general_operand" "r")))]
       ""
       "...")

which has two operands, one of which must appear in two places, and

     (define_insn ""
       [(set (match_operand:SI 0 "general_operand" "=r")
             (plus:SI (match_operand:SI 1 "general_operand" "0")
                      (match_operand:SI 2 "general_operand" "r")))]
       ""
       "...")

which has three operands, two of which are required by a constraint to
be identical.  If we are considering an insn of the form

     (insn N PREV NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 6) (reg:SI 109)))
       ...)

the first pattern would not apply at all, because this insn does not
contain two identical subexpressions in the right place.  The pattern
would say, "That does not look like an add instruction; try other
patterns."  The second pattern would say, "Yes, that's an add
instruction, but there is something wrong with it."  It would direct
the reload pass of the compiler to generate additional insns to make
the constraint true.
The results might look like this:

     (insn N2 PREV N
       (set (reg:SI 3) (reg:SI 6))
       ...)

     (insn N N2 NEXT
       (set (reg:SI 3)
            (plus:SI (reg:SI 3) (reg:SI 109)))
       ...)

   It is up to you to make sure that each operand, in each pattern, has
constraints that can handle any RTL expression that could be present
for that operand.  (When multiple alternatives are in use, each pattern
must, for each possible combination of operand expressions, have at
least one alternative which can handle that combination of operands.)
The constraints don't need to *allow* any possible operand--when this
is the case, they do not constrain--but they must at least point the
way to reloading any possible operand so that it will fit.

   * If the constraint accepts whatever operands the predicate permits,
     there is no problem: reloading is never necessary for this operand.

     For example, an operand whose constraints permit everything except
     registers is safe provided its predicate rejects registers.

     An operand whose predicate accepts only constant values is safe
     provided its constraints include the letter `i'.  If any possible
     constant value is accepted, then nothing less than `i' will do; if
     the predicate is more selective, then the constraints may also be
     more selective.

   * Any operand expression can be reloaded by copying it into a
     register.  So if an operand's constraints allow some kind of
     register, it is certain to be safe.  It need not permit all
     classes of registers; the compiler knows how to copy a register
     into another register of the proper class in order to make an
     instruction valid.

   * A nonoffsettable memory reference can be reloaded by copying the
     address into a register.  So if the constraint uses the letter
     `o', all memory references are taken care of.

   * A constant operand can be reloaded by allocating space in memory to
     hold it as preinitialized data.  Then the memory reference can be
     used in place of the constant.  So if the constraint uses the
     letters `o' or `m', constant operands are not a problem.

   * If the constraint permits a constant and a pseudo register used in
     an insn was not allocated to a hard register and is equivalent to
     a constant, the register will be replaced with the constant.  If
     the predicate does not permit a constant and the insn is
     re-recognized for some reason, the compiler will crash.  Thus the
     predicate must always recognize any objects allowed by the
     constraint.

   If the operand's predicate can recognize registers, but the
constraint does not permit them, it can make the compiler crash.  When
this operand happens to be a register, the reload pass will be stymied,
because it does not know how to copy a register temporarily into memory.

File: gcc.info,  Node: Multi-Alternative,  Next: Class Preferences,  Prev: Simple Constraints,  Up: Constraints

Multiple Alternative Constraints
--------------------------------

   Sometimes a single instruction has multiple alternative sets of
possible operands.  For example, on the 68000, a logical-or instruction
can combine a register or an immediate value into memory, or it can
combine any kind of operand into a register; but it cannot combine one
memory location into another.

   These constraints are represented as multiple alternatives.  An
alternative can be described by a series of letters for each operand.
The overall constraint for an operand is made from the letters for this
operand from the first alternative, a comma, the letters for this
operand from the second alternative, a comma, and so on until the last
alternative.  Here is how it is done for fullword logical-or on the
68000:

     (define_insn "iorsi3"
       [(set (match_operand:SI 0 "general_operand" "=m,d")
             (ior:SI (match_operand:SI 1 "general_operand" "%0,0")
                     (match_operand:SI 2 "general_operand" "dKs,dmKs")))]
       ...)

   The first alternative has `m' (memory) for operand 0, `0' for
operand 1 (meaning it must match operand 0), and `dKs' for operand 2.
The second alternative has `d' (data register) for operand 0, `0' for
operand 1, and `dmKs' for operand 2.
The `=' and `%' in the constraints apply to all the alternatives; their
meaning is explained in a later section (*note Modifiers::.).

   If all the operands fit any one alternative, the instruction is
valid.  Otherwise, for each alternative, the compiler counts how many
instructions must be added to copy the operands so that that
alternative applies.  The alternative requiring the least copying is
chosen.  If two alternatives need the same amount of copying, the one
that comes first is chosen.  These choices can be altered with the `?'
and `!' characters:

`?'
     Disparage slightly the alternative that the `?' appears in, as a
     choice when no alternative applies exactly.  The compiler regards
     this alternative as one unit more costly for each `?' that appears
     in it.

`!'
     Disparage severely the alternative that the `!' appears in.  This
     alternative can still be used if it fits without reloading, but if
     reloading is needed, some other alternative will be used.

   When an insn pattern has multiple alternatives in its constraints,
often the appearance of the assembler code is determined mostly by
which alternative was matched.  When this is so, the C code for writing
the assembler code can use the variable `which_alternative', which is
the ordinal number of the alternative that was actually satisfied (0
for the first, 1 for the second alternative, etc.).  *Note Output
Statement::.

File: gcc.info,  Node: Class Preferences,  Next: Modifiers,  Prev: Multi-Alternative,  Up: Constraints

Register Class Preferences
--------------------------

   The operand constraints have another function: they enable the
compiler to decide which kind of hardware register a pseudo register is
best allocated to.  The compiler examines the constraints that apply to
the insns that use the pseudo register, looking for the
machine-dependent letters such as `d' and `a' that specify classes of
registers.  The pseudo register is put in whichever class gets the most
"votes".
The constraint letters `g' and `r' also vote: they vote in favor of a
general register.  The machine description says which registers are
considered general.

   Of course, on some machines all registers are equivalent, and no
register classes are defined.  Then none of this complexity is relevant.

File: gcc.info,  Node: Modifiers,  Next: Machine Constraints,  Prev: Class Preferences,  Up: Constraints

Constraint Modifier Characters
------------------------------

`='
     Means that this operand is write-only for this instruction: the
     previous value is discarded and replaced by output data.

`+'
     Means that this operand is both read and written by the
     instruction.

     When the compiler fixes up the operands to satisfy the constraints,
     it needs to know which operands are inputs to the instruction and
     which are outputs from it.  `=' identifies an output; `+'
     identifies an operand that is both input and output; all other
     operands are assumed to be input only.

`&'
     Means (in a particular alternative) that this operand is written
     before the instruction is finished using the input operands.
     Therefore, this operand may not lie in a register that is used as
     an input operand or as part of any memory address.

     `&' applies only to the alternative in which it is written.  In
     constraints with multiple alternatives, sometimes one alternative
     requires `&' while others do not.  See, for example, the `movdf'
     insn of the 68000.

     `&' does not obviate the need to write `='.

`%'
     Declares the instruction to be commutative for this operand and the
     following operand.  This means that the compiler may interchange
     the two operands if that is the cheapest way to make all operands
     fit the constraints.  This is often used in patterns for addition
     instructions that really have only two operands: the result must
     go in one of the arguments.
     Here, for example, is how the 68000 halfword-add instruction is
     defined:

          (define_insn "addhi3"
            [(set (match_operand:HI 0 "general_operand" "=m,r")
                  (plus:HI (match_operand:HI 1 "general_operand" "%0,0")
                           (match_operand:HI 2 "general_operand" "di,g")))]
            ...)

`#'
     Says that all following characters, up to the next comma, are to be
     ignored as a constraint.  They are significant only for choosing
     register preferences.

`*'
     Says that the following character should be ignored when choosing
     register preferences.  `*' has no effect on the meaning of the
     constraint as a constraint, and no effect on reloading.

     Here is an example: the 68000 has an instruction to sign-extend a
     halfword in a data register, and can also sign-extend a value by
     copying it into an address register.  While either kind of
     register is acceptable, the constraints on an address-register
     destination are less strict, so it is best if register allocation
     makes an address register its goal.  Therefore, `*' is used so
     that the `d' constraint letter (for data register) is ignored when
     computing register preferences.

          (define_insn "extendhisi2"
            [(set (match_operand:SI 0 "general_operand" "=*d,a")
                  (sign_extend:SI
                   (match_operand:HI 1 "general_operand" "0,g")))]
            ...)

File: gcc.info,  Node: Machine Constraints,  Next: No Constraints,  Prev: Modifiers,  Up: Constraints

Constraints for Particular Machines
-----------------------------------

   Whenever possible, you should use the general-purpose constraint
letters in `asm' arguments, since they will convey meaning more readily
to people reading your code.  Failing that, use the constraint letters
that usually have very similar meanings across architectures.  The most
commonly used constraints are `m' and `r' (for memory and
general-purpose registers respectively; *note Simple Constraints::.),
and `I', usually the letter indicating the most common
immediate-constant format.

   For each machine architecture, the `config/MACHINE.h' file defines
additional constraints.
These constraints are used by the compiler itself for instruction
generation, as well as for `asm' statements; therefore, some of the
constraints are not particularly interesting for `asm'.  The
constraints are defined through these macros:

`REG_CLASS_FROM_LETTER'
     Register class constraints (usually lower case).

`CONST_OK_FOR_LETTER_P'
     Immediate constant constraints, for non-floating point constants of
     word size or smaller precision (usually upper case).

`CONST_DOUBLE_OK_FOR_LETTER_P'
     Immediate constant constraints, for all floating point constants
     and for constants of greater than word size precision (usually
     upper case).

`EXTRA_CONSTRAINT'
     Special cases of registers or memory.  This macro is not required,
     and is only defined for some machines.

   Inspecting these macro definitions in the compiler source for your
machine is the best way to be certain you have the right constraints.
However, here is a summary of the machine-dependent constraints
available on some particular machines.

*AMD 29000 family--`a29k.h'*
    `l'
          Local register 0

    `b'
          Byte Pointer (`BP') register

    `q'
          `Q' register

    `h'
          Special purpose register

    `A'
          First accumulator register

    `a'
          Other accumulator register

    `f'
          Floating point register

    `I'
          Constant greater than 0, less than 0x100

    `J'
          Constant greater than 0, less than 0x10000

    `K'
          Constant whose high 24 bits are on (1)

    `L'
          16 bit constant whose high 8 bits are on (1)

    `M'
          32 bit constant whose high 16 bits are on (1)

    `N'
          32 bit negative constant that fits in 8 bits

    `O'
          The constant 0x80000000 or, on the 29050, any 32 bit constant
          whose low 16 bits are 0.

    `P'
          16 bit negative constant that fits in 8 bits

    `G'
    `H'
          A floating point constant (in `asm' statements, use the
          machine independent `E' or `F' instead)

*IBM RS6000--`rs6000.h'*
    `b'
          Address base register

    `f'
          Floating point register

    `h'
          `MQ', `CTR', or `LINK' register

    `q'
          `MQ' register

    `c'
          `CTR' register

    `l'
          `LINK' register

    `x'
          `CR' register (condition register) number 0

    `y'
          `CR' register (condition register)

    `I'
          Signed 16 bit constant

    `J'
          Constant whose low 16 bits are 0

    `K'
          Constant whose high 16 bits are 0

    `L'
          Constant suitable as a mask operand

    `M'
          Constant larger than 31

    `N'
          Exact power of 2

    `O'
          Zero

    `P'
          Constant whose negation is a signed 16 bit constant

    `G'
          Floating point constant that can be loaded into a register
          with one instruction per word

    `Q'
          Memory operand that is an offset from a register (`m' is
          preferable for `asm' statements)

*Intel 386--`i386.h'*
    `q'
          `a', `b', `c', or `d' register

    `f'
          Floating point register

    `t'
          First (top of stack) floating point register

    `u'
          Second floating point register

    `a'
          `a' register

    `b'
          `b' register

    `c'
          `c' register

    `d'
          `d' register

    `D'
          `di' register

    `S'
          `si' register

    `I'
          Constant in range 0 to 31 (for 32 bit shifts)

    `J'
          Constant in range 0 to 63 (for 64 bit shifts)

    `K'
          `0xff'

    `L'
          `0xffff'

    `M'
          0, 1, 2, or 3 (shifts for `lea' instruction)

    `G'
          Standard 80387 floating point constant

*Intel 960--`i960.h'*
    `f'
          Floating point register (`fp0' to `fp3')

    `l'
          Local register (`r0' to `r15')

    `b'
          Global register (`g0' to `g15')

    `d'
          Any local or global register

    `I'
          Integers from 0 to 31

    `J'
          0

    `K'
          Integers from -31 to 0

    `G'
          Floating point 0

    `H'
          Floating point 1

*MIPS--`mips.h'*
    `d'
          General-purpose integer register

    `f'
          Floating-point register (if available)

    `h'
          `Hi' register

    `l'
          `Lo' register

    `x'
          `Hi' or `Lo' register

    `y'
          General-purpose integer register

    `z'
          Floating-point status register

    `I'
          Signed 16 bit constant (for arithmetic instructions)

    `J'
          Zero

    `K'
          Zero-extended 16-bit constant (for logic instructions)

    `L'
          Constant with low 16 bits zero (can be loaded with `lui')

    `M'
          32 bit constant which requires two instructions to load (a
          constant which is not `I', `K', or `L')

    `N'
          Negative 16 bit constant

    `O'
          Exact power of two

    `P'
          Positive 16 bit constant

    `G'
          Floating point zero

    `Q'
          Memory reference that can be loaded with more than one
          instruction (`m' is preferable for `asm' statements)

    `R'
          Memory reference that can be loaded with one instruction (`m'
          is preferable for `asm' statements)

    `S'
          Memory reference in external OSF/rose PIC format (`m' is
          preferable for `asm' statements)

*Motorola 680x0--`m68k.h'*
    `a'
          Address register

    `d'
          Data register

    `f'
          68881 floating-point register, if available

    `x'
          Sun FPA (floating-point) register, if available

    `y'
          First 16 Sun FPA registers, if available

    `I'
          Integer in the range 1 to 8

    `J'
          16 bit signed number

    `K'
          Signed number whose magnitude is greater than 0x80

    `L'
          Integer in the range -8 to -1

    `G'
          Floating point constant that is not a 68881 constant

    `H'
          Floating point constant that can be used by Sun FPA

*SPARC--`sparc.h'*
    `f'
          Floating-point register

    `I'
          Signed 13 bit constant

    `J'
          Zero

    `K'
          32 bit constant with the low 12 bits clear (a constant that
          can be loaded with the `sethi' instruction)

    `G'
          Floating-point zero

    `H'
          Signed 13 bit constant, sign-extended to 32 or 64 bits

    `Q'
          Memory reference that can be loaded with one instruction (`m'
          is more appropriate for `asm' statements)

    `S'
          Constant, or memory address

    `T'
          Memory address aligned to an 8-byte boundary

    `U'
          Even register

File: gcc.info,  Node: No Constraints,  Prev: Machine Constraints,  Up: Constraints

Not Using Constraints
---------------------

   Some machines are so clean that operand constraints are not
required.  For example, on the Vax, an operand valid in one context is
valid in any other context.
On such a machine, every operand constraint would be `g', excepting
only operands of "load address" instructions which are written as if
they referred to a memory location's contents but actually refer to its
address.  They would have constraint `p'.

   For such machines, instead of writing `g' and `p' for all the
constraints, you can choose to write a description with empty
constraints.  Then you write `""' for the constraint in every
`match_operand'.  Address operands are identified by writing an
`address' expression around the `match_operand', not by their
constraints.

   When the machine description has just empty constraints, certain
parts of compilation are skipped, making the compiler faster.  However,
few machines actually do not need constraints; all machine descriptions
now in existence use constraints.
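
   The condition "the machine description has just empty constraints"
can be pictured with a small C sketch.  This is only a model under
stated assumptions, not the way the compiler itself detects the
situation: the function name `description_uses_no_constraints' and the
idea of collecting the constraint string of every `match_operand' into
one array are hypothetical, introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if every one of the N constraint strings in CONSTRAINTS
   is the empty string "", i.e. the (hypothetical) machine description
   they were collected from uses no constraints at all.  An empty
   collection (N == 0) trivially qualifies.  */
static bool
description_uses_no_constraints (const char *const *constraints, size_t n)
{
  size_t i;

  assert (constraints != NULL || n == 0);
  for (i = 0; i < n; i++)
    {
      assert (constraints[i] != NULL);
      if (constraints[i][0] != '\0')
        return false;  /* at least one operand is constrained */
    }
  return true;
}
```

   Given the three constraint strings of the `iorsi3' pattern shown
earlier (`"=m,d"', `"%0,0"' and `"dKs,dmKs"'), this function returns
false; given three empty strings, as a description for a clean machine
would contain, it returns true.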