The machine instruction sets are (almost by definition) different on each machine where as runs. Floating point representations vary as well, and as often supports a few additional directives or command-line options for compatibility with other assemblers on a particular platform. Finally, some versions of as support special pseudo-instructions for branch optimization.
This chapter discusses most of these differences, though it does not include details on any machine's instruction set. For details on that subject, see the hardware manufacturer's manual.
The ARC chip family includes several successive levels (or other variants) of chip, using the same core instruction set, but including a few additional instructions at each level.
By default, as assumes the core instruction set (ARC base). The .cpu pseudo-op is intended to be used to select the variant.
-mbig-endian
-mlittle-endian
as can select big-endian or little-endian output at run time (unlike most other GNU development tools, which must be configured for one or the other). Use `-mbig-endian' to select big-endian output, and `-mlittle-endian' for little-endian.
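As a rough illustration of what the two byte orders mean for emitted data, the following Python sketch (not part of as; the sample value is arbitrary) lays out the same 32-bit word both ways using the standard `struct` module.

```python
import struct

# Arbitrary sample value chosen for illustration only.
value = 0x12345678

# ">" requests big-endian layout, "<" little-endian; "I" is a 32-bit unsigned int.
big = struct.pack(">I", value)
little = struct.pack("<I", value)

print(big.hex())     # most significant byte first
print(little.hex())  # least significant byte first
```

The two byte strings contain the same bytes in opposite order, which is the only difference the endianness options make to a data word of this kind.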
The ARC CPU family currently does not have hardware floating point support. Software floating point support is provided by GCC and uses IEEE floating-point numbers.
The ARC version of as supports the following additional machine directives:
.cpu
.cpu is used to select the desired variant [though currently there are none].
as has no additional command-line options for the AMD 29K family.
The macro syntax used on the AMD 29K is like that described in the AMD 29K Family Macro Assembler Specification. Normal as macros should still work.
`;' is the line comment character.
The character `?' is permitted in identifiers (but may not begin an identifier).
General-purpose registers are represented by predefined symbols of the form `GRnnn' (for global registers) or `LRnnn' (for local registers), where nnn represents a number between 0 and 127, written with no leading zeros. The leading letters may be in either upper or lower case; for example, `gr13' and `LR7' are both valid register names.
You may also refer to general-purpose registers by specifying the register number as the result of an expression (prefixed with `%%' to flag the expression as a register number):
%%expression
where expression must be an absolute expression evaluating to a number between 0 and 255. The range [0, 127] refers to global registers, and the range [128, 255] to local registers.
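The two ranges can be sketched as a small Python helper. This is only an illustration: the function name `reg_name` is hypothetical, and the assumption that number 128 + n corresponds to local register n is inferred from the ranges stated above rather than spelled out there.

```python
def reg_name(number: int) -> str:
    """Map a 29K register number (0-255) to a symbolic register name.

    Assumption: 0-127 map to gr0-gr127 and 128-255 map to lr0-lr127.
    """
    if not 0 <= number <= 255:
        raise ValueError("register number must be between 0 and 255")
    if number < 128:
        return f"gr{number}"       # global register range [0, 127]
    return f"lr{number - 128}"     # local register range [128, 255]


print(reg_name(13))
print(reg_name(135))
```

Under that assumption, `%%13` names the same register as `gr13', and `%%135' the same register as `lr7'.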
In addition, as understands the following protected special-purpose register names for the AMD 29K family:
vab chd pc0 ops chc pc1 cps rbp pc2 cfg tmc mmu cha tmr lru
These unprotected special-purpose register names are also recognized:
ipc alu fpe ipa bp inte ipb fc fps q cr exop
The AMD 29K family uses IEEE floating-point numbers.
.block size , fill
.cputype
.file
Warning: in other versions of the GNU assembler, .file is used for the directive called .app-file in the AMD 29K support.
.line
.sect
.use section name
section name may be one of .text, .data, .data1, or .lit. With one of the first three section name options, `.use' is equivalent to the machine directive section name; the remaining case, `.use .lit', is the same as `.data 200'.
as implements all the standard AMD 29K opcodes. No additional pseudo-instructions are needed on this family.
For information on the 29K machine instruction set, see Am29000 User's Manual, Advanced Micro Devices, Inc.
-marm[2|250|3|6|60|600|610|620|7|7m|7d|7dm|7di|7dmi|70|700|700i|710|710c|7100|7500|7500fe|7tdmi|8|810|9|9tdmi|920|strongarm|strongarm110|strongarm1100]
-marmv[2|2a|3|3m|4|4t|5|5t]
-mthumb
-mall
-mfpa [10|11]
-mfpe-old
-mno-fpu
-mthumb-interwork
-mapcs [26|32]
-mapcs-float
-mapcs-reentrant
-EB
-EL
-k
-moabi
The presence of a `@' on a line indicates the start of a comment that extends to the end of the current line. If a `#' appears as the first character of a line, the whole line is treated as a comment.
On ARM systems running the GNU/Linux operating system, `;' can be used instead of a newline to separate statements.
Either `#' or `$' can be used to indicate immediate operands.
*TODO* Explain about /data modifier on symbols.
*TODO* Explain about ARM register naming, and the predefined names.
The ARM family uses IEEE floating-point numbers.
.align expression [, expression]
name .req register name
foo .req r0
.code [16|32]
.thumb
.arm
.force_thumb
.thumb_func
.thumb_set
Like the .set directive, this directive creates a symbol which is an alias for another symbol (possibly not yet defined). It also marks the aliased symbol as being a thumb function entry point, in the same way that the .thumb_func directive does.
.ltorg
.pool
as implements all the standard ARM opcodes. It also implements several pseudo opcodes, including several synthetic load instructions.
NOP
nop
This pseudo op will always evaluate to a legal ARM instruction that does nothing. Currently it will evaluate to MOV r0, r0.
LDR
ldr <register> , = <expression>
If expression evaluates to a numeric constant then a MOV or MVN instruction will be used in place of the LDR instruction, if the constant can be generated by either of these instructions. Otherwise the constant will be placed into the nearest literal pool (if it is not already there) and a PC relative LDR instruction will be generated.
ADR
adr <register> <label>
This instruction will load the address of label into the indicated register. The instruction will evaluate to a PC relative ADD or SUB instruction depending upon where the label is located. If the label is out of range, or if it is not defined in the same file (and section) as the ADR instruction, then an error will be generated. This instruction will not make use of the literal pool.
ADRL
adrl <register> <label>
This instruction will load the address of label into the indicated register. The instruction will evaluate to one or two PC relative ADD or SUB instructions depending upon where the label is located. If a second instruction is not needed a NOP instruction will be generated in its place, so that this instruction is always 8 bytes long. If the label is out of range, or if it is not defined in the same file (and section) as the ADRL instruction, then an error will be generated. This instruction will not make use of the literal pool.
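The choice described for the LDR pseudo op can be sketched in Python. This is an approximation under a stated assumption: a constant "can be generated" by MOV or MVN when it (or its bitwise complement) is an 8-bit value rotated right by an even number of bits, which is the classic 32-bit ARM data-processing immediate encoding. The function names are hypothetical, and Thumb mode is not modelled.

```python
MASK32 = 0xFFFFFFFF


def rotate_right_32(value: int, amount: int) -> int:
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    value &= MASK32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def fits_arm_immediate(value: int) -> bool:
    """True if value is an 8-bit quantity rotated by an even amount."""
    value &= MASK32
    # Trying every even rotation covers all 16 possible encodings.
    return any(rotate_right_32(value, rot) < 256 for rot in range(0, 32, 2))


def choose_load(value: int) -> str:
    """Pick the instruction a `ldr rX, =value` would become (assumed rule)."""
    value &= MASK32
    if fits_arm_immediate(value):
        return "mov"
    if fits_arm_immediate(~value & MASK32):
        return "mvn"
    return "ldr (literal pool)"


for constant in (0xFF, 0xFF000000, 0xFFFFFFFF, 0x12345678):
    print(hex(constant), "->", choose_load(constant))
```

Constants that fail both tests are the ones that end up in the literal pool and are fetched with a PC relative LDR.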
For information on the ARM or Thumb instruction sets, see ARM Software Development Toolkit Reference Manual, Advanced RISC Machines Ltd.
The Mitsubishi D10V version of as has a few machine dependent options.
as will attempt to optimize its output by detecting when instructions can be executed in parallel.
as will sometimes swap the order of instructions. Normally this generates a warning. When this option is used, no warning will be generated when instructions are swapped.
The D10V syntax is based on the syntax in Mitsubishi's D10V architecture manual. The differences are detailed below.
The D10V version of as uses the instruction names in the D10V Architecture Manual. However, the names in the manual are sometimes ambiguous. There are instruction names that can assemble to a short or long form opcode. How does the assembler pick the correct form? as will always pick the smallest form if it can. When dealing with a symbol that is not defined yet when a line is being assembled, it will always use the long form. If you need to force the assembler to use either the short or long form of the instruction, you can append either `.s' (short) or `.l' (long) to it. For example, if you are writing an assembly program and you want to do a branch to a symbol that is defined later in your program, you can write `bra.s foo'.
Objdump and GDB will always append `.s' or `.l' to instructions which have both short and long forms.
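The selection rule for short and long forms can be summarized with a small Python sketch. It is a simplified model, not the assembler's code: the function name `choose_form` and the boolean `fits_short` (whether the operand would fit the short encoding) are assumptions made for illustration.

```python
def choose_form(mnemonic: str, symbol_defined: bool, fits_short: bool) -> str:
    """Return "short" or "long" following the rules described above."""
    # An explicit size modifier always wins.
    if mnemonic.endswith(".s"):
        return "short"
    if mnemonic.endswith(".l"):
        return "long"
    # A symbol not yet defined when the line is assembled forces the long form.
    if not symbol_defined:
        return "long"
    # Otherwise the smallest form that works is used.
    return "short" if fits_short else "long"


print(choose_form("bra", symbol_defined=False, fits_short=True))
print(choose_form("bra.s", symbol_defined=False, fits_short=True))
```

The first call models a forward branch written without a modifier, and the second the same branch written as `bra.s foo'.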
The D10V assembler takes as input a series of instructions, either one-per-line, or in the special two-per-line format described in the next section. Some of these instructions will be short-form or sub-instructions. These sub-instructions can be packed into a single instruction. The assembler will do this automatically. It will also detect when it should not pack instructions. For example, when a label is defined, the next instruction will never be packaged with the previous one. Whenever a branch and link instruction is called, it will not be packaged with the next instruction so the return address will be valid. Nops are automatically inserted when necessary.
If you do not want the assembler automatically making these decisions, you can control the packaging and execution type (parallel or sequential) with the special execution symbols described in the next section.
`;' and `#' are the line comment characters. Sub-instructions may be executed in order, in reverse-order, or in parallel. Instructions listed in the standard one-per-line format will be executed sequentially. To specify the executing order, use the following symbols:
The D10V syntax allows either one instruction per line, one instruction per line with the execution symbol, or two instructions per line. For example
abs a1 -> abs r0
abs r0 <- abs a1
ld2w r2,@r8+ || mac a0,r0,r7
ld2w r2,@r8+ ||
mac a0,r0,r7
ld2w r2,@r8+
mac a0,r0,r7
ld2w r2,@r8+ ->
mac a0,r0,r7
Since `$' has no special meaning, you may use it in symbol names.
You can use the predefined symbols `r0' through `r15' to refer to the D10V registers. You can also use `sp' as an alias for `r15'. The accumulators are `a0' and `a1'. There are special register-pair names that may optionally be used in opcodes that require even-numbered registers. Register names are not case sensitive.
Register Pairs
r0-r1
r2-r3
r4-r5
r6-r7
r8-r9
r10-r11
r12-r13
r14-r15
The D10V also has predefined symbols for these control registers and status bits:
psw
bpsw
pc
bpc
rpt_c
rpt_s
rpt_e
mod_s
mod_e
iba
f0
f1
c
as understands the following addressing modes for the D10V. Rn in the following refers to any of the numbered registers, but not the control registers.
Rn
@Rn
@Rn+
@Rn-
@-SP
@(disp, Rn)
addr
#imm
Any symbol followed by @word will be replaced by the symbol's value shifted right by 2. This is used in situations such as loading a register with the address of a function (or any other code fragment). For example, if you want to load a register with the location of the function main then jump to that function, you could do it as follows:
ldi r2, main@word
jmp r2
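The effect of the @word modifier is a plain right shift by two bits, which the following Python sketch makes concrete. The helper name `word_value` and the sample address are hypothetical.

```python
def word_value(symbol_value: int) -> int:
    """Value substituted for `symbol@word`: the symbol's value shifted right by 2."""
    return symbol_value >> 2


# Hypothetical byte address for a function such as main.
main_address = 0x1000
print(hex(word_value(main_address)))
```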
The D10V has no hardware floating point, but the .float and .double directives generate IEEE floating-point numbers for compatibility with other development tools.
For detailed information on the D10V machine instruction set, see D10V Architecture: A VLIW Microprocessor for Multimedia Applications (Mitsubishi Electric Corp.).
as implements all the standard D10V opcodes. The only changes are those described in the section on size modifiers.
The Mitsubishi D30V version of as has a few machine dependent options.
as will attempt to optimize its output by detecting when instructions can be executed in parallel.
as will issue a warning every time it adds a nop instruction.
as will issue a warning if it needs to insert a nop after a 32-bit multiply before a load or 16-bit multiply instruction.
The D30V syntax is based on the syntax in Mitsubishi's D30V architecture manual. The differences are detailed below.
The D30V version of as uses the instruction names in the D30V Architecture Manual. However, the names in the manual are sometimes ambiguous. There are instruction names that can assemble to a short or long form opcode. How does the assembler pick the correct form? as will always pick the smallest form if it can. When dealing with a symbol that is not defined yet when a line is being assembled, it will always use the long form. If you need to force the assembler to use either the short or long form of the instruction, you can append either `.s' (short) or `.l' (long) to it. For example, if you are writing an assembly program and you want to do a branch to a symbol that is defined later in your program, you can write `bra.s foo'.
Objdump and GDB will always append `.s' or `.l' to instructions which have both short and long forms.
The D30V assembler takes as input a series of instructions, either one-per-line, or in the special two-per-line format described in the next section. Some of these instructions will be short-form or sub-instructions. These sub-instructions can be packed into a single instruction. The assembler will do this automatically. It will also detect when it should not pack instructions. For example, when a label is defined, the next instruction will never be packaged with the previous one. Whenever a branch and link instruction is called, it will not be packaged with the next instruction so the return address will be valid. Nops are automatically inserted when necessary.
If you do not want the assembler automatically making these decisions, you can control the packaging and execution type (parallel or sequential) with the special execution symbols described in the next section.
`;' and `#' are the line comment characters. Sub-instructions may be executed in order, in reverse-order, or in parallel. Instructions listed in the standard one-per-line format will be executed sequentially unless you use the `-O' option.
To specify the executing order, use the following symbols:
The D30V syntax allows either one instruction per line, one instruction per line with the execution symbol, or two instructions per line. For example
abs r2,r3 -> abs r4,r5
abs r2,r3 <- abs r4,r5
abs r2,r3 || abs r4,r5
ldw r2,@(r3,r4) ||
mulx r6,r8,r9
mulx a0,r8,r9
stw r2,@(r3,r4)
stw r2,@(r3,r4) ->
mulx a0,r8,r9
stw r2,@(r3,r4) <-
mulx a0,r8,r9
Since `$' has no special meaning, you may use it in symbol names.
as supports the full range of guarded execution directives for each instruction. Just append the directive after the instruction proper. The directives are:
You can use the predefined symbols `r0' through `r63' to refer to the D30V registers. You can also use `sp' as an alias for `r63' and `link' as an alias for `r62'. The accumulators are `a0' and `a1'.
The D30V also has predefined symbols for these control registers and status bits:
psw
bpsw
pc
bpc
rpt_c
rpt_s
rpt_e
mod_s
mod_e
iba
f0
f1
f2
f3
f4
f5
f6
f7
s
v
va
c
b
as understands the following addressing modes for the D30V. Rn in the following refers to any of the numbered registers, but not the control registers.
Rn
@Rn
@Rn+
@Rn-
@-SP
@(disp, Rn)
addr
#imm
The D30V has no hardware floating point, but the .float and .double directives generate IEEE floating-point numbers for compatibility with other development tools.
For detailed information on the D30V machine instruction set, see D30V Architecture: A VLIW Microprocessor for Multimedia Applications (Mitsubishi Electric Corp.).
as implements all the standard D30V opcodes. The only changes are those described in the section on size modifiers.
as has no additional command-line options for the Hitachi H8/300 family.
`;' is the line comment character.
`$' can be used instead of a newline to separate statements. Therefore you may not use `$' in symbol names on the H8/300.
You can use predefined symbols of the form `rnh' and `rnl' to refer to the H8/300 registers as sixteen 8-bit general-purpose registers. n is a digit from `0' to `7'; for instance, both `r0h' and `r7l' are valid register names.
You can also use the eight predefined symbols `rn' to refer to the H8/300 registers as 16-bit registers (you must use this form for addressing).
On the H8/300H, you can also use the eight predefined symbols `ern' (`er0' ... `er7') to refer to the 32-bit general purpose registers.
The two control registers are called pc (program counter; a 16-bit register, except on the H8/300H where it is 24 bits) and ccr (condition code register; an 8-bit register). r7 is used as the stack pointer, and can also be called sp.
as understands the following addressing modes for the H8/300:
rn
@rn
@(d, rn)
@(d:16, rn)
@(d:24, rn)
@rn+
@-rn
@aa
@aa:8
@aa:16
@aa:24
Absolute address aa. (The address size `:24' only makes sense on the H8/300H.)
#xx
#xx:8
#xx:16
#xx:32
as neither requires the size specifier nor uses it; the data size required is taken from context.
@@aa
@@aa:8
as neither requires the size specifier nor uses it.
The H8/300 family has no hardware floating point, but the .float directive generates IEEE floating-point numbers for compatibility with other development tools.
as has only one machine-dependent directive for the H8/300:
.h8300h
Makes .int emit 32-bit numbers rather than the usual (16-bit) for the H8/300 family.
On the H8/300 family (including the H8/300H) `.word' directives generate 16-bit numbers.
For detailed information on the H8/300 machine instruction set, see H8/300 Series Programming Manual (Hitachi ADE--602--025). For information specific to the H8/300H, see H8/300H Series Programming Manual (Hitachi).
as implements all the standard H8/300 opcodes. No additional pseudo-instructions are needed on this family.
Four H8/300 instructions (add, cmp, mov, sub) are defined with variants using the suffixes `.b', `.w', and `.l' to specify the size of a memory operand. as supports these suffixes, but does not require them; since one of the operands is always a register, as can deduce the correct size.
For example, since r0 refers to a 16-bit register,
mov r0,@foo
is equivalent to
mov.w r0,@foo
If you use the size suffixes, as issues a warning when the suffix and the register size do not match.
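The size deduction can be sketched in Python from the register naming scheme described earlier (`rnh'/`rnl' for 8-bit, `rn' for 16-bit, `ern' for 32-bit registers). The helper names are hypothetical, and the sketch only looks at the register operand; it does not parse full instructions.

```python
import re


def size_suffix(register: str) -> str:
    """Deduce the operand size suffix from an H8/300 register name."""
    if re.fullmatch(r"r[0-7][hl]", register):
        return ".b"   # 8-bit half register
    if re.fullmatch(r"r[0-7]", register):
        return ".w"   # 16-bit register
    if re.fullmatch(r"er[0-7]", register):
        return ".l"   # 32-bit register (H8/300H only)
    raise ValueError(f"not a general-purpose register: {register}")


def with_suffix(mnemonic: str, register: str) -> str:
    """Attach the deduced suffix to one of add, cmp, mov, sub."""
    if mnemonic not in ("add", "cmp", "mov", "sub"):
        raise ValueError("only add, cmp, mov and sub take size suffixes here")
    return mnemonic + size_suffix(register)


print(with_suffix("mov", "r0"))
```

For the example in the text, the register `r0' yields `mov.w'.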
as has no additional command-line options for the Hitachi H8/500 family.
`!' is the line comment character.
`;' can be used instead of a newline to separate statements.
Since `$' has no special meaning, you may use it in symbol names.
You can use the predefined symbols `r0', `r1', `r2', `r3', `r4', `r5', `r6', and `r7' to refer to the H8/500 registers.
The H8/500 also has these control registers:
cp
dp
bp
tp
ep
sr
ccr
All registers are 16 bits long. To represent 32 bit numbers, use two adjacent registers; for distant memory addresses, use one of the segment pointers (cp for the program counter; dp for r0-r3; ep for r4 and r5; and tp for r6 and r7).
as understands the following addressing modes for the H8/500:
Rn
@Rn
@(d:8, Rn)
@(d:16, Rn)
@-Rn
@Rn+
@aa:8
@aa:16
#xx:8
#xx:16
The H8/500 family has no hardware floating point, but the .float directive generates IEEE floating-point numbers for compatibility with other development tools.
as has no machine-dependent directives for the H8/500. However, on this platform the `.int' and `.word' directives generate 16-bit numbers.
For detailed information on the H8/500 machine instruction set, see H8/500 Series Programming Manual (Hitachi M21T001).
as implements all the standard H8/500 opcodes. No additional pseudo-instructions are needed on this family.
As a back end for GNU CC, as has been thoroughly tested and should work extremely well. We have tested it only minimally on hand written assembly code, and no one has tested it much on the assembly output from the HP compilers.
The format of the debugging sections has changed since the original as port (version 1.3X) was released; therefore, you must rebuild all HPPA objects and libraries with the new assembler so that you can debug the final executable.
The HPPA as port generates a small subset of the relocations available in the SOM and ELF object file formats. Additional relocation support will be added as it becomes necessary.
as has no machine-dependent command-line options for the HPPA.
The assembler syntax closely follows the HPPA instruction set reference manual; assembler directives and general syntax closely follow the HPPA assembly language reference manual, with a few noteworthy differences.
First, a colon may immediately follow a label definition. This is simply for compatibility with how most assembly language programmers write code.
Some obscure expression parsing problems may affect hand written code which uses the spop instructions, or code which makes significant use of the ! line separator.
as is much less forgiving about missing arguments and other similar oversights than the HP assembler. as notifies you of missing arguments as syntax errors; this is regarded as a feature, not a bug.
Finally, as allows you to use an external symbol without explicitly importing the symbol. Warning: in the future this will be an error for HPPA targets.
Special characters for HPPA targets include:
`;' is the line comment character.
`!' can be used instead of a newline to separate statements.
Since `$' has no special meaning, you may use it in symbol names.
The HPPA family uses IEEE floating-point numbers.
as for the HPPA supports many additional directives for compatibility with the native assembler. This section describes them only briefly. For detailed information on HPPA-specific assembler directives, see HP9000 Series 800 Assembly Language Reference Manual (HP 92432-90001).
as does not support the following assembler directives described in the HP manual:
.endm .liston .enter .locct .leave .macro .listoff
Beyond those implemented for compatibility, as supports one additional assembler directive for the HPPA: .param. It conveys register argument locations for static functions. Its syntax closely follows the .export directive.
These are the additional directives in as for the HPPA:
.block n
.blockz n
.call
.callinfo [ param=value, ... ] [ flag, ... ]
.code
.copyright "string"
.enter
.entry
.exit
.export name [ ,typ ] [ ,param=r ]
0 to 3, and indicates one of four one-word arguments); `rtnval' (the procedure's result); or `priv_lev' (privilege level). For arguments or the result, r specifies how to relocate, and must be one of `no' (not relocatable), `gr' (argument is in general register), `fr' (in floating point register), or `fu' (upper half of float register). For `priv_lev', r is an integer.
.half n
as directive .short.
.import name [ ,typ ]
.export; make a procedure available to call. The arguments use the same conventions as the first two arguments for .export.
.label name
.leave
.origin lc
portable directive .org.
.param name [ ,typ ] [ ,param=r ]
.export, but used for static procedures.
.proc
.procend
label .reg expr
.equ; define label with the absolute expression expr as its value.
.space secname [ ,params ]
.spnum secnam
.space directive.)
.string "str"
as strings.
Warning! The HPPA version of .string differs from the usual as definition: it does not write a zero byte after copying str.
.stringz "str"
.string, but appends a zero byte after copying str to object file.
.subspa name [ ,params ]
.nsubspa name [ ,params ]
.space, but selects a subsection name within the current section. You may only specify params when you create a subsection (in the first instance of .subspa for this name).
If specified, the list params declares attributes of the subsection, identified by keywords. The keywords recognized are `quad=expr' ("quadrant" for this subsection), `align=expr' (alignment for beginning of this subsection; a power of two), `access=expr' (value for "access rights" field), `sort=expr' (sorting order for this subspace in link), `code_only' (subsection contains only code), `unloadable' (subsection cannot be loaded into memory), `common' (subsection is common block), `dup_comm' (initialized data may have duplicate names), or `zero' (subsection is all zeros, do not write in object file).
.nsubspa always creates a new subspace with the given name, even if one with the same name already exists.
.version "str"
For detailed information on the HPPA machine instruction set, see PA-RISC Architecture and Instruction Set Reference Manual (HP 09740-90039).
The ESA/390 as port is currently intended to be a back-end for the GNU CC compiler. It is not HLASM compatible, although it does support a subset of some of the HLASM directives. The only supported binary file format is ELF; none of the usual MVS/VM/OE/USS object file formats, such as ESD or XSD, are supported.
When used with the GNU CC compiler, the ESA/390 as will produce correct, fully relocated, functional binaries, and has been used to compile and execute large projects. However, many aspects should still be considered experimental; these include shared library support, dynamically loadable objects, and any relocation other than the 31-bit relocation.
as has no machine-dependent command-line options for the ESA/390.
The opcode/operand syntax follows the ESA/390 Principles of Operation manual; assembler directives and general syntax are loosely based on the prevailing AT&T/SVR4/ELF/Solaris style notation. HLASM-style directives are not supported for the most part, with the exception of those described herein.
A leading dot in front of directives is optional, and the case of directives is ignored; thus for example, .using and USING have the same effect.
A colon may immediately follow a label definition. This is simply for compatibility with how most assembly language programmers write code.
`#' is the line comment character.
`;' can be used instead of a newline to separate statements.
Since `$' has no special meaning, you may use it in symbol names.
Registers can be given the symbolic names r0..r15, fp0, fp2, fp4, fp6. By using these symbolic names, as can detect simple syntax errors. The name rarg or r.arg is a synonym for r11, rtca or r.tca for r12, sp, r.sp, dsa or r.dsa for r13, lr or r.lr for r14, rbase or r.base for r3 and rpgt or r.pgt for r4.
`*' is the current location counter. Unlike `.' it is always relative to the last USING directive. Note that this means that expressions cannot use multiplication, as any occurrence of `*' will be interpreted as a location counter.
All labels are relative to the last USING. Thus, branches to a label always imply the use of base+displacement.
Many of the usual forms of address constants / address literals are supported. Thus,
        .using  *,r3
        L       r15,=A(some_routine)
        LM      r6,r7,=V(some_longlong_extern)
        A       r1,=F'12'
        AH      r0,=H'42'
        ME      r6,=E'3.1416'
        MD      r6,=D'3.14159265358979'
        O       r6,=XL4'cacad0d0'
        .ltorg
should all behave as expected: that is, an entry in the literal pool will be created (or reused if it already exists), and the instruction operands will be the displacement into the literal pool using the current base register (as last declared with the .using directive).
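The create-or-reuse behaviour of the literal pool can be modelled with a short Python sketch. It is deliberately simplified: the class name `LiteralPool` is hypothetical, entry lengths are supplied by the caller, alignment padding is ignored, and displacements are counted from the start of the pool rather than from the base register.

```python
class LiteralPool:
    """Toy model of a literal pool: one entry per distinct literal."""

    def __init__(self) -> None:
        self.offsets = {}   # literal text -> offset within the pool
        self.size = 0       # bytes allocated so far

    def displacement(self, literal: str, length: int) -> int:
        """Return the literal's offset, creating the entry on first use."""
        if literal not in self.offsets:
            self.offsets[literal] = self.size
            self.size += length
        return self.offsets[literal]


pool = LiteralPool()
first = pool.displacement("=F'12'", 4)    # new fullword entry
second = pool.displacement("=H'42'", 2)   # new halfword entry
again = pool.displacement("=F'12'", 4)    # reused, no new entry
print(first, second, again, pool.size)
```

The third request returns the same offset as the first, which mirrors the "reused if it already exists" rule.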
The assembler generates only IEEE floating-point numbers. The older floating point formats are not supported.
as for the ESA/390 supports all of the standard ELF/SVR4 assembler directives that are documented in the main part of this documentation. Several additional directives are supported in order to implement the ESA/390 addressing model. The most important of these are .using and .ltorg.
These are the additional directives in as for the ESA/390:
.dc
.drop regno
.using directive in the same section as the current section.
.ebcdic string
.string etc. emit ascii strings by default.
EQU
as directive .equ can be used to the same effect.
.ltorg
.using must have been previously specified in the same section.
.using expr,regno
.using directives to be simultaneously outstanding, one in the .text section, and one in another section (typically, the .data section). This feature allows dynamically loaded objects to be implemented in a relatively straightforward way. A .using directive must always be specified in the .text section; this will specify the base register that will be used for branches in the .text section. A second .using may be specified in another section; this will specify the base register that is used for non-label address literals. When a second .using is specified, then the subsequent .ltorg must be put in the same section; otherwise an error will result.
Thus, for example, the following code uses r3 to address branch targets and r4 to address the literal pool, which has been written to the .data section. That is, the constants =A(some_routine), =H'42' and =E'3.1416' will all appear in the .data section.
.data
        .using  LITPOOL,r4
.text
        BASR    r3,0
        .using  *,r3
        B       START
        .long   LITPOOL
START:
        L       r4,4(,r3)
        L       r15,=A(some_routine)
        LTR     r15,r15
        BNE     LABEL
        AH      r0,=H'42'
LABEL:
        ME      r6,=E'3.1416'
.data
LITPOOL:
        .ltorg
Note that this dual-.using directive semantics extends and is not compatible with HLASM semantics. Note that this assembler directive does not support the full range of HLASM semantics.
For detailed information on the ESA/390 machine instruction set, see ESA/390 Principles of Operation (IBM Publication Number DZ9AR004).
The 80386 has no machine dependent options.
In order to maintain compatibility with the output of gcc, as supports AT&T System V/386 assembler syntax. This is quite different from Intel syntax. We mention these differences because almost all 80386 documents use Intel syntax. Notable differences between the two syntaxes are:
Instruction mnemonics are suffixed with one character modifiers which specify the size of operands. The letters `b', `w', and `l' specify byte, word, and long operands. If no suffix is specified by an instruction then as tries to fill in the missing suffix based on the destination register operand (the last one by convention). Thus, `mov %ax, %bx' is equivalent to `movw %ax, %bx'; also, `mov $1, %bx' is equivalent to `movw $1, %bx'. Note that this is incompatible with the AT&T Unix assembler which assumes that a missing mnemonic suffix implies long operand size. (This incompatibility does not affect compiler output since compilers always explicitly specify the mnemonic suffix.)
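The suffix inference can be sketched in Python as a lookup on the destination register. The table below lists the usual 8-, 16- and 32-bit general registers of the 80386; the function name `add_suffix` is hypothetical, and the sketch does not handle memory destinations or mnemonics that already carry a suffix.

```python
# Suffix implied by each general-purpose register (destination operand).
REGISTERS_BY_SUFFIX = {
    "b": {"al", "ah", "bl", "bh", "cl", "ch", "dl", "dh"},
    "w": {"ax", "bx", "cx", "dx", "si", "di", "bp", "sp"},
    "l": {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"},
}


def add_suffix(mnemonic: str, destination: str) -> str:
    """Append the size suffix implied by the destination register."""
    name = destination.lstrip("%")
    for suffix, registers in REGISTERS_BY_SUFFIX.items():
        if name in registers:
            return mnemonic + suffix
    raise ValueError(f"cannot infer operand size from {destination!r}")


print(add_suffix("mov", "%bx"))
```

This reproduces the example above, where `mov $1, %bx' is treated as `movw $1, %bx'.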
Almost all instructions have the same names in AT&T and Intel format. There are a few exceptions. The sign extend and zero extend instructions need two sizes to specify them. They need a size to sign/zero extend from and a size to zero extend to. This is accomplished by using two instruction mnemonic suffixes in AT&T syntax. Base names for sign extend and zero extend are `movs...' and `movz...' in AT&T syntax (`movsx' and `movzx' in Intel syntax). The instruction mnemonic suffixes are tacked on to this base name, the from suffix before the to suffix. Thus, `movsbl %al, %edx' is AT&T syntax for "move sign extend from %al to %edx." Possible suffixes, thus, are `bl' (from byte to long), `bw' (from byte to word), and `wl' (from word to long).
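The way the two suffixes combine can be illustrated with a small Python sketch. The function name `extend_mnemonic` is hypothetical; the three allowed size pairs are exactly the `bl', `bw' and `wl' combinations listed above.

```python
SIZE_LETTER = {"byte": "b", "word": "w", "long": "l"}

# Only widening conversions exist: byte->word, byte->long, word->long.
VALID_PAIRS = {("byte", "word"), ("byte", "long"), ("word", "long")}


def extend_mnemonic(signed: bool, from_size: str, to_size: str) -> str:
    """Build the AT&T mnemonic: base name, then from suffix, then to suffix."""
    if (from_size, to_size) not in VALID_PAIRS:
        raise ValueError(f"no extension from {from_size} to {to_size}")
    base = "movs" if signed else "movz"
    return base + SIZE_LETTER[from_size] + SIZE_LETTER[to_size]


print(extend_mnemonic(True, "byte", "long"))
print(extend_mnemonic(False, "word", "long"))
```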
The Intel-syntax conversion instructions are called `cbtw', `cwtl', `cwtd', and `cltd' in AT&T naming. as accepts either naming for these instructions.
Far call/jump instructions are `lcall' and `ljmp' in AT&T syntax, but are `call far' and `jump far' in Intel convention.
Register operands are always prefixed with `%'. The 80386 registers consist of
Instruction prefixes are used to modify the following instruction. They are used to repeat string instructions, to provide section overrides, to perform bus lock operations, and to change operand and address sizes. (Most instructions that normally operate on 32-bit operands will use 16-bit operands if the instruction has an "operand size" prefix.) Instruction prefixes are best written on the same line as the instruction they act upon. For example, the `scas' (scan string) instruction is repeated with:
repne scas %es:(%edi),%al
You may also place prefixes on the lines immediately preceding the instruction, but this circumvents checks that as does with prefixes, and will not work with all prefixes.
Here is a list of instruction prefixes:
.code16 section) into 32-bit operands/addresses. These prefixes must appear on the same line of code as the instruction they modify. For example, in a 16-bit .code16 section, you might write:
addr32 jmpl *(%ebx)
An Intel syntax indirect memory reference of the form
section:[base + index*scale + disp]
is translated into the AT&T syntax
section:disp(base, index, scale)
where base and index are the optional 32-bit base and
index registers, disp is the optional displacement, and
scale, taking the values 1, 2, 4, and 8, multiplies index
to calculate the address of the operand. If no scale is
specified, scale is taken to be 1. section specifies the
optional section register for the memory operand, and may override the
default section register (see an 80386 manual for section register
defaults). Note that section overrides in AT&T syntax must
be preceded by a `%'. If you specify a section override which
coincides with the default section register, as
does not
output any section register override prefixes to assemble the given
instruction. Thus, section overrides can be specified to emphasize which
section register is used for a given memory operand.
Here are some examples of Intel and AT&T style memory references:
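The translation rule can also be sketched in Python. The helper below takes the parts of an Intel-style reference and formats the AT&T operand; the usage lines pair a few Intel references with the AT&T form the sketch produces. The function name `att_memory_operand' and its parameters are assumptions made for this illustration and do not correspond to anything inside as.

```python
# Illustrative sketch only: format an AT&T memory operand from the parts of
# an Intel-style reference  section:[base + index*scale + disp].
# att_memory_operand is a hypothetical helper, not part of as.

def att_memory_operand(disp="", base="", index="", scale=None, section=""):
    if scale is not None and scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    text = str(disp)
    if base or index:
        inner = "%" + base if base else ""
        if index:
            inner += ",%" + index
            if scale is not None:
                inner += "," + str(scale)
        text += "(" + inner + ")"
    if section:
        # Section overrides must be preceded by `%' in AT&T syntax.
        text = "%" + section + ":" + text
    return text


# Intel: [ebp - 4]
print(att_memory_operand(disp=-4, base="ebp"))                 # -4(%ebp)
# Intel: [foo + eax*4]
print(att_memory_operand(disp="foo", index="eax", scale=4))    # foo(,%eax,4)
# Intel: gs:foo
print(att_memory_operand(disp="foo", section="gs"))            # %gs:foo
```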
Absolute (as opposed to PC relative) call and jump operands must be
prefixed with `*'. If no `*' is specified, as
always chooses PC relative addressing for jump/call labels.
Any instruction that has a memory operand, but no register operand, must specify its size (byte, word, or long) with an instruction mnemonic suffix (`b', `w', or `l', respectively).
Jump instructions are always optimized to use the smallest possible displacements. This is accomplished by using byte (8-bit) displacement jumps whenever the target is sufficiently close. If a byte displacement is insufficient a long (32-bit) displacement is used. We do not support word (16-bit) displacement jumps in 32-bit mode (i.e. prefixing the jump instruction with the `data16' instruction prefix), since the 80386 insists upon masking `%eip' to 16 bits after the word displacement is added.
Note that the `jcxz', `jecxz', `loop', `loopz',
`loope', `loopnz' and `loopne' instructions only come in byte
displacements, so that if you use these instructions (gcc does
not use them) you may get an error message (and incorrect code). The AT&T
80386 assembler tries to get around this problem by expanding `jcxz foo'
to
         jcxz cx_zero
         jmp cx_nonzero
cx_zero: jmp foo
cx_nonzero:
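The displacement choice described above can be sketched in a few lines of Python. This is only a model: it assumes the signed displacement to the target has already been computed, and the function names are hypothetical.

```python
# Illustrative sketch only: displacement width selection in 32-bit mode.
# The offset is assumed to be the already-computed signed displacement.

def fits_in_byte(offset):
    return -128 <= offset <= 127


def jump_displacement_bits(offset):
    """Ordinary jumps: byte displacement when it fits, otherwise long."""
    return 8 if fits_in_byte(offset) else 32


def loop_family_displacement_bits(offset):
    """jcxz, jecxz, loop, loopz, loope, loopnz, loopne: byte only."""
    if not fits_in_byte(offset):
        raise ValueError("target out of range for a byte-only instruction")
    return 8


print(jump_displacement_bits(100))   # 8
print(jump_displacement_bits(1000))  # 32
```

Word (16-bit) displacements never appear in the result, matching the restriction stated above for 32-bit mode.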
All 80387 floating point types except packed BCD are supported. (BCD support may be added without much difficulty.) These data types are 16-, 32-, and 64-bit integers, and single (32-bit), double (64-bit), and extended (80-bit) precision floating point. Each supported type has an instruction mnemonic suffix and a constructor associated with it. Instruction mnemonic suffixes specify the operand's data type. Constructors build these data types into memory.
Register to register operations should not use instruction mnemonic suffixes. `fstl %st, %st(1)' will give a warning, and be assembled as if you wrote `fst %st, %st(1)', since all register to register operations use 80-bit floating point operands. (Contrast this with `fstl %st, mem', which converts `%st' from 80-bit to 64-bit floating point format, then stores the result in the 8 byte location `mem'.)
as
supports Intel's MMX instruction set (SIMD
instructions for integer data), available on Intel's Pentium MMX
processors and Pentium II processors, AMD's K6 and K6-2 processors,
Cyrix' M2 processor, and probably others. It also supports AMD's 3DNow!
instruction set (SIMD instructions for 32-bit floating point data)
available on AMD's K6-2 processor and possibly others in the future.
Currently, as
does not support Intel's floating point
SIMD, Katmai (KNI).
The eight 64-bit MMX operands, also used by 3DNow!, are called `%mm0', `%mm1', ... `%mm7'. They contain eight 8-bit integers, four 16-bit integers, two 32-bit integers, one 64-bit integer, or two 32-bit floating point values. The MMX registers cannot be used at the same time as the floating point stack.
See Intel and AMD documentation, keeping in mind that the operand order in instructions is reversed from the Intel syntax.
While as
normally writes only "pure" 32-bit i386 code,
it also supports writing code to run in real mode or in 16-bit protected
mode code segments. To do this, put a `.code16' or
`.code16gcc' directive before the assembly language instructions to
be run in 16-bit mode. You can switch as
back to writing
normal 32-bit code with the `.code32' directive.
`.code16gcc' provides experimental support for generating 16-bit code from gcc, and differs from `.code16' in that `call', `ret', `enter', `leave', `push', `pop', `pusha', `popa', `pushf', and `popf' instructions default to 32-bit size. This is so that the stack pointer is manipulated in the same way over function calls, allowing access to function parameters at the same stack offsets as in 32-bit mode. `.code16gcc' also automatically adds address size prefixes where necessary to use the 32-bit addressing modes that gcc generates.
The code which as
generates in 16-bit mode will not
necessarily run on a 16-bit pre-80386 processor. To write code that
runs on such a processor, you must refrain from using any 32-bit
constructs which require as
to output address or operand
size prefixes.
Note that writing 16-bit code instructions by explicitly specifying a prefix or an instruction mnemonic suffix within a 32-bit code section generates different machine instructions than those generated for a 16-bit code segment. In a 32-bit code section, the following code generates the machine opcode bytes `66 6a 04', which pushes the value `4' onto the stack, decrementing `%esp' by 2.
pushw $4
The same code in a 16-bit code section would generate the machine opcode bytes `6a 04' (i.e. without the operand size prefix), which is correct since the processor default operand size is assumed to be 16 bits in a 16-bit code section.
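The two encodings can be reproduced with a short Python model. It covers only `pushw' with an immediate that fits in one byte, and the helper name `encode_pushw_imm8' is an assumption made for this sketch.

```python
# Illustrative sketch only: when the 0x66 operand-size prefix is emitted.
# encode_pushw_imm8 is a hypothetical helper limited to byte immediates.
OPERAND_SIZE_PREFIX = 0x66
PUSH_IMM8_OPCODE = 0x6A


def encode_pushw_imm8(value, code_bits):
    """Encode `pushw $value' for a 16-bit or 32-bit code section."""
    if code_bits not in (16, 32):
        raise ValueError("code_bits must be 16 or 32")
    if not -128 <= value <= 127:
        raise ValueError("this sketch only handles byte-sized immediates")
    body = bytes([PUSH_IMM8_OPCODE, value & 0xFF])
    # A 16-bit operand needs the prefix only where the default is 32 bits.
    prefix = bytes([OPERAND_SIZE_PREFIX]) if code_bits == 32 else b""
    return prefix + body


print(encode_pushw_imm8(4, 32).hex(" "))  # 66 6a 04
print(encode_pushw_imm8(4, 16).hex(" "))  # 6a 04
```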
The UnixWare assembler, and probably other AT&T derived ix86 Unix assemblers, generate floating point instructions with reversed source and destination registers in certain cases. Unfortunately, gcc and possibly many other programs use this reversed syntax, so we're stuck with it.
For example
fsub %st,%st(3)
results in `%st(3)' being updated to `%st - %st(3)' rather than the expected `%st(3) - %st'. This happens with all the non-commutative arithmetic floating point operations with two register operands where the source register is `%st' and the destination register is `%st(i)'.
There is some trickery concerning the `mul' and `imul'
instructions that deserves mention. The 16-, 32-, and 64-bit expanding
multiplies (base opcode `0xf6'; extension 4 for `mul' and 5
for `imul') can be output only in the one operand form. Thus,
`imul %ebx, %eax' does not select the expanding multiply;
the expanding multiply would clobber the `%edx' register, and this
would confuse gcc
output. Use `imul %ebx' to get the
64-bit product in `%edx:%eax'.
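To make the register clobbering concrete, here is a Python model of the one-operand signed 32-bit expanding multiply. It is a sketch of the arithmetic only; the function names are hypothetical.

```python
# Illustrative sketch only: the one-operand form `imul %ebx' multiplies
# %eax by the operand and leaves the 64-bit product in %edx:%eax.

def to_signed32(value):
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def imul_expanding(eax, operand):
    """Return the new (edx, eax) pair as unsigned 32-bit register values."""
    product = (to_signed32(eax) * to_signed32(operand)) & 0xFFFFFFFFFFFFFFFF
    return (product >> 32) & 0xFFFFFFFF, product & 0xFFFFFFFF


edx, eax = imul_expanding(0x10000, 0x10000)
print(hex(edx), hex(eax))  # 0x1 0x0
```

The high half lands in `%edx', which is why the two-operand spelling is not allowed to select this form.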
We have added a two operand form of `imul' when the first operand is an immediate mode expression and the second operand is a register. This is just a shorthand, so that, multiplying `%eax' by 69, for example, can be done with `imul $69, %eax' rather than `imul $69, %eax, %eax'.
-ACA | -ACA_A | -ACB | -ACC | -AKA | -AKB | -AKC | -AMC
as
generates code
for any instruction or feature that is supported by some version of the
960 (even if this means mixing architectures!). In principle,
as
attempts to deduce the minimal sufficient processor type if
none is specified; depending on the object code format, the processor type may
be recorded in the object file. If it is critical that the as
output match a specific architecture, specify that architecture explicitly.
-b
        call    increment routine
        .word   0       # pre-counter
Label:  BR
        call    increment routine
        .word   0       # post-counter
The counter following a branch records the number of times that branch was not taken; the difference between the two counters is the number of times the branch was taken. A table of every such Label is also generated, so that the
external postprocessor gbr960 (supplied by Intel) can locate all
the counters. This table is always labelled `__BRANCH_TABLE__';
this is a local symbol to permit collecting statistics for many separate
object files. The table is word aligned, and begins with a two-word
header. The first word, initialized to 0, is used in maintaining linked
lists of branch tables. The second word is a count of the number of
entries in the table, which follow immediately: each is a word, pointing
to one of the labels illustrated above.
The first word of the header is used to locate multiple branch tables,
since each object file may contain one. Normally the links are
maintained with a call to an initialization routine, placed at the
beginning of each function in the file. The GNU C compiler
generates these calls automatically when you give it a `-b' option.
For further details, see the documentation of `gbr960'.
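The table layout and the counter arithmetic described above can be sketched in Python. This is not how `gbr960' is implemented; it is a model that assumes 32-bit words stored in little-endian order, and the function names are hypothetical.

```python
# Illustrative sketch only: read a branch table and derive taken counts.
# Assumptions: 32-bit words, little-endian byte order.
import struct


def parse_branch_table(data):
    """Return (link_word, label_addresses) from a raw branch table."""
    link, count = struct.unpack_from("<II", data, 0)   # two-word header
    labels = list(struct.unpack_from("<%dI" % count, data, 8))
    return link, labels


def times_taken(pre_counter, post_counter):
    """The post-counter counts 'not taken'; the difference is 'taken'."""
    return pre_counter - post_counter


table = struct.pack("<IIII", 0, 2, 0x1000, 0x1040)
print(parse_branch_table(table))  # (0, [4096, 4160])
print(times_taken(10, 3))         # 7
```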
-no-relax
as
should generate errors instead, if the target displacement
is larger than 13 bits.
This option does not affect the Compare-and-Jump instructions; the code
emitted for them is always adjusted when necessary (depending on
displacement size), regardless of whether you use `-no-relax'.
as
generates IEEE floating-point numbers for the directives
`.float', `.double', `.extended', and `.single'.
.bss symbol, length, align
.lcomm symbol
, length.
.extended flonums
.extended
expects zero or more flonums, separated by commas; for
each flonum, `.extended' emits an IEEE extended-format (80-bit)
floating-point number.
.leafproc call-lab, bal-lab
callj
instruction to enable faster calls of leaf
procedures. If a procedure is known to call no other procedures, you
may define an entry point that skips procedure prolog code (and that does
not depend on system-supplied saved context), and declare it as the
bal-lab using `.leafproc'. If the procedure also has an
entry point that goes through the normal prolog, you can specify that
entry point as call-lab.
A `.leafproc' declaration is meant for use in conjunction with the
optimized call instruction `callj'; the directive records the data
needed later to choose between converting the `callj' into a
bal or a call.
call-lab is optional; if only one argument is present, or if the
two arguments are identical, the single argument is assumed to be the
bal entry point.
.sysproc name, index
All Intel 960 machine instructions are supported; see section i960 Command-line Options for a discussion of selecting the instruction subset for a particular 960 architecture.
Some opcodes are processed beyond simply emitting a single corresponding instruction: `callj', and Compare-and-Branch or Compare-and-Jump instructions with target displacements larger than 13 bits.
callj
You can write callj to have the assembler or the linker determine
the most appropriate form of subroutine call: `call',
`bal', or `calls'. If the assembly source contains
enough information--a `.leafproc' or `.sysproc' directive
defining the operand--then as translates the callj; if not, it
simply emits the callj, leaving it for the linker to resolve.
The 960 architectures provide combined Compare-and-Branch instructions that permit you to store the branch target in the lower 13 bits of the instruction word itself. However, if you specify a branch target far enough away that its address won't fit in 13 bits, the assembler can either issue an error, or convert your Compare-and-Branch instruction into separate instructions to do the compare and the branch.
Whether as gives an error or expands the instruction depends
on two choices you can make: whether you use the `-no-relax' option,
and whether you use a "Compare and Branch" instruction or a "Compare
and Jump" instruction. The "Jump" instructions are always
expanded if necessary; the "Branch" instructions are expanded when
necessary unless you specify `-no-relax', in which case
as gives an error instead.
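The decision can be summarized as a small Python sketch. It assumes that "fits in 13 bits" means a signed 13-bit range, which the text does not spell out, and the function names are hypothetical.

```python
# Illustrative sketch only: what happens to a Compare-and-Branch or
# Compare-and-Jump instruction. Assumption: signed 13-bit displacement.

def fits_in_13_bits(displacement):
    return -(1 << 12) <= displacement < (1 << 12)


def handle_compare(kind, displacement, no_relax=False):
    """kind is "branch" or "jump"; returns "single", "expand", or "error"."""
    if kind not in ("branch", "jump"):
        raise ValueError("kind must be 'branch' or 'jump'")
    if fits_in_13_bits(displacement):
        return "single"          # one combined instruction is enough
    if kind == "jump" or not no_relax:
        return "expand"          # separate compare and branch instructions
    return "error"               # Branch form with -no-relax


print(handle_compare("branch", 100))                    # single
print(handle_compare("branch", 100000))                 # expand
print(handle_compare("branch", 100000, no_relax=True))  # error
print(handle_compare("jump", 100000, no_relax=True))    # expand
```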
These are the Compare-and-Branch instructions, their "Jump" variants, and the instruction pairs they may expand into:
The Motorola 680x0 version of as
has a few machine
dependent options.
You can use the `-l' option to shorten the size of references to undefined
symbols. If you do not use the `-l' option, references to undefined
symbols are wide enough for a full long
(32 bits). (Since
as
cannot know where these symbols end up, as
can
only allocate space for the linker to fill in later. Since as
does not know how far away these symbols are, it allocates as much space as it
can.) If you use this option, the references are only one word wide (16 bits).
This may be useful if you want the object file to be as small as possible, and
you know that the relevant symbols are always less than 17 bits away.
For some configurations, especially those where the compiler normally does not prepend an underscore to the names of user variables, the assembler requires a `%' before any use of a register name. This is intended to let the assembler distinguish between C variables and functions named `a0' through `a7', and so on. The `%' is always accepted, but is not required for certain configurations, notably `sun3'. The `--register-prefix-optional' option may be used to permit omitting the `%' even for configurations for which it is normally required. If this is done, it will generally be impossible to refer to C variables and functions with the same names as register names.
Normally the character `|' is treated as a comment character, which means that it can not be used in expressions. The `--bitwise-or' option turns `|' into a normal character. In this mode, you must either use C style comments, or start comments with a `#' character at the beginning of a line.
If you use an addressing mode with a base register without specifying
the size, as
will normally use the full 32 bit value.
For example, the addressing mode `%a0@(%d0)' is equivalent to
`%a0@(%d0:l)'. You may use the `--base-size-default-16'
option to tell as
to default to using the 16 bit value.
In this case, `%a0@(%d0)' is equivalent to `%a0@(%d0:w)'.
You may use the `--base-size-default-32' option to restore the
default behaviour.
If you use an addressing mode with a displacement, and the value of the
displacement is not known, as
will normally assume that
the value is 32 bits. For example, if the symbol `disp' has not
been defined, as
will assemble the addressing mode
`%a0@(disp,%d0)' as though `disp' is a 32 bit value. You may
use the `--disp-size-default-16' option to tell as
to instead assume that the displacement is 16 bits. In this case,
as
will assemble `%a0@(disp,%d0)' as though
`disp' is a 16 bit value. You may use the
`--disp-size-default-32' option to restore the default behaviour.
as
can assemble code for several different members of the
Motorola 680x0 family. The default depends upon how as
was configured when it was built; normally, the default is to assemble
code for the 68020 microprocessor. The following options may be used to
change the default. These options control which instructions and
addressing modes are permitted. The members of the 680x0 family are
very similar. For detailed information about the differences, see the
Motorola manuals.
This syntax for the Motorola 680x0 was developed at MIT.
The 680x0 version of as uses instruction names and
syntax compatible with the Sun assembler. Intervening periods are
ignored; for example, `movl' is equivalent to `mov.l'.
In the following table apc stands for any of the address registers (`%a0' through `%a7'), the program counter (`%pc'), the zero-address relative to the program counter (`%zpc'), a suppressed address register (`%za0' through `%za7'), or it may be omitted entirely. The use of size means one of `w' or `l', and it may be omitted, along with the leading colon, unless a scale is also specified. The use of scale means one of `1', `2', `4', or `8', and it may always be omitted along with the leading colon.
The following addressing modes are understood:
%a6
is also known as `%fp', the Frame Pointer.
The standard Motorola syntax for this chip differs from the syntax
already discussed (see section Syntax). as
can
accept Motorola syntax for operands, even if MIT syntax is used for
other operands in the same instruction. The two kinds of syntax are
fully compatible.
In the following table apc stands for any of the address registers (`%a0' through `%a7'), the program counter (`%pc'), the zero-address relative to the program counter (`%zpc'), or a suppressed address register (`%za0' through `%za7'). The use of size means one of `w' or `l', and it may always be omitted along with the leading dot. The use of scale means one of `1', `2', `4', or `8', and it may always be omitted along with the leading asterisk.
The following additional addressing modes are understood:
%a6
is also known as `%fp', the Frame Pointer.
Packed decimal (P) format floating literals are not supported. Feel free to add the code!
The floating point formats generated by directives are these.
.float
Single
precision floating point constants.
.double
Double
precision floating point constants.
.extend
.ldouble
Extended
precision (long double
) floating point constants.
In order to be compatible with the Sun assembler the 680x0 assembler understands the following directives.
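.data1
This directive is identical to a `.data 1' directive.
.data2
This directive is identical to a `.data 2' directive.
.even
This directive is a special case of the `.align' directive; it
aligns the output to an even byte boundary.
.skip
This directive is identical to a `.space' directive.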
Certain pseudo opcodes are permitted for branch instructions. They expand to the shortest branch instruction that reaches the target. Generally these mnemonics are made by substituting `j' for `b' at the start of a Motorola mnemonic.
The following table summarizes the pseudo-operations. A * flags
cases that are more fully described after the table:
               Displacement
               +-------------------------------------------------
               |                68020   68000/10
     Pseudo-Op |BYTE    WORD    LONG    LONG      non-PC relative
               +-------------------------------------------------
          jbsr |bsrs    bsr     bsrl    jsr       jsr
           jra |bras    bra     bral    jmp       jmp
     *     jXX |bXXs    bXX     bXXl    bNXs;jmpl bNXs;jmp
     *    dbXX |dbXX    dbXX        dbXX; bra; jmpl
     *    fjXX |fbXXw   fbXXw   fbXXl             fbNXw;jmp

     XX: condition
     NX: negative of condition XX
*---see full description below
jbsr
jra
jXX
jhi jls jcc jcs jne jeq jvc jvs jpl jmi jge jlt jgt jle
For the cases of non-PC relative displacements and long displacements on
the 68000 or 68010, as issues a longer code fragment in terms of
NX, the opposite condition to XX. For example, for the
non-PC relative case:
        jXX foo
gives
        bNXs oof
        jmp foo
oof:
dbXX
dbhi dbls dbcc dbcs dbne dbeq dbvc dbvs dbpl dbmi dbge dblt dbgt dble dbf dbra dbt
Other than for word and byte displacements, when the source reads `dbXX foo',
as emits
        dbXX oo1
        bra oo2
oo1:    jmpl foo
oo2:
fjXX
fjne fjeq fjge fjlt fjgt fjle fjf fjt fjgl fjgle fjnge fjngl fjngle fjngt fjnle fjnlt fjoge fjogl fjogt fjole fjolt fjor fjseq fjsf fjsne fjst fjueq fjuge fjugt fjule fjult fjun
For branch targets that are not PC relative, as emits
        fbNX oof
        jmp foo
oof:
when it encounters `fjXX foo'.
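The `jXX' row of the table can be written out as a Python sketch. It models only the mapping from displacement class to emitted instructions; the function name, the `reach' labels, and the fixed local label `oof' (borrowed from the example above) are choices made for this illustration.

```python
# Illustrative sketch only: the jXX row of the pseudo-op table.
# expand_jxx and the "reach" names are hypothetical.
NEGATED = {"hi": "ls", "cc": "cs", "ne": "eq", "vc": "vs",
           "pl": "mi", "ge": "lt", "gt": "le"}
NEGATED.update({v: k for k, v in list(NEGATED.items())})


def expand_jxx(cond, reach, target="foo"):
    """Return the instruction lines emitted for `jXX target'."""
    if cond not in NEGATED:
        raise ValueError("unknown condition: " + cond)
    nx = NEGATED[cond]
    if reach == "byte":
        return [f"b{cond}s {target}"]
    if reach == "word":
        return [f"b{cond} {target}"]
    if reach == "long-68020":
        return [f"b{cond}l {target}"]
    if reach == "long-68000":          # 68000 or 68010
        return [f"b{nx}s oof", f"jmpl {target}", "oof:"]
    if reach == "non-pc-relative":
        return [f"b{nx}s oof", f"jmp {target}", "oof:"]
    raise ValueError("unknown reach: " + reach)


print(expand_jxx("ne", "byte"))             # ['bnes foo']
print(expand_jxx("ne", "non-pc-relative"))  # ['beqs oof', 'jmp foo', 'oof:']
```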
The immediate character is `#' for Sun compatibility. The line-comment character is `|' (unless the `--bitwise-or' option is used). If a `#' appears at the beginning of a line, it is treated as a comment unless it looks like `# line file', in which case it is treated normally.
The Motorola 68HC11 and 68HC12 version of as
has a few machine
dependent options.
The `-m68hc11' option switches the assembler into M68HC11 mode. In this mode, the assembler only accepts 68HC11 operands and mnemonics. It produces code for the 68HC11.
The `-m68hc12' option switches the assembler into M68HC12 mode. In this mode, the assembler also accepts 68HC12 operands and mnemonics. It produces code for the 68HC12. A few 68HC11 instructions are replaced by some 68HC12 instructions as recommended by Motorola specifications.
You can use the `--strict-direct-mode' option to disable
the automatic translation of direct page mode addressing into
extended mode when the instruction does not support direct mode.
For example, the `clr' instruction does not support direct page
mode addressing. When it is used with the direct page mode,
as ignores the direct page mode and generates extended (absolute)
addressing instead. This option prevents as from doing this, and the wrong
usage of the direct page mode raises an error.
The `--short-branchs' option turns off the translation of
relative branches into absolute branches when the branch offset is
out of range. By default as transforms the relative
branch (`bsr', `bgt', `bge', `beq', `bne',
`ble', `blt', `bhi', `bcc', `bls',
`bcs', `bmi', `bvs', `bvc', `bra') into
an absolute branch when the offset is out of the -128 .. 127 range.
In that case, the `bsr' instruction is translated into a
`jsr', the `bra' instruction is translated into a
`jmp' and the conditional branch instructions are inverted and
followed by a `jmp'. This option disables these translations
and as generates an error if a relative branch
is out of range. This option does not affect the optimization
associated with the `jbra', `jbsr' and `jbXX' pseudo opcodes.
The `--force-long-branchs' option forces the translation of relative branches into absolute branches. This option does not affect the optimization associated to the `jbra', `jbsr' and `jbXX' pseudo opcodes.
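The effect of the two options on plain branch mnemonics can be sketched in Python. The sketch assumes the two options are not combined, uses the standard 68HC11 pairs of opposite conditions, and writes operands as the placeholders `<pc-rel>' and `<abs>'; the function name is hypothetical.

```python
# Illustrative sketch only: translation of relative branches.
# Assumption: --short-branchs and --force-long-branchs are not combined.
INVERTED = {"bgt": "ble", "bge": "blt", "beq": "bne", "bhi": "bls",
            "bcc": "bcs", "bmi": "bpl", "bvs": "bvc"}
INVERTED.update({v: k for k, v in list(INVERTED.items())})


def translate_branch(mnemonic, offset, short_branchs=False,
                     force_long_branchs=False):
    if short_branchs and force_long_branchs:
        raise ValueError("this sketch does not model both options together")
    in_range = -128 <= offset <= 127
    if in_range and not force_long_branchs:
        return [mnemonic + " <pc-rel>"]
    if short_branchs:
        raise ValueError("relative branch out of range")
    if mnemonic == "bsr":
        return ["jsr <abs>"]
    if mnemonic == "bra":
        return ["jmp <abs>"]
    # Conditional branch: invert it to skip over an absolute jump.
    return [INVERTED[mnemonic] + " +3", "jmp <abs>"]


print(translate_branch("beq", 10))    # ['beq <pc-rel>']
print(translate_branch("beq", 300))   # ['bne +3', 'jmp <abs>']
print(translate_branch("bsr", 300))   # ['jsr <abs>']
```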
You can use the `--print-insn-syntax' option to obtain the syntax description of the instruction when an error is detected.
The `--print-opcodes' option prints the list of all the
instructions with their syntax. The first item of each line
represents the instruction name and the rest of the line indicates
the possible operands for that instruction. The list is printed
in alphabetical order. Once the list is printed as
exits.
The `--generate-example' option is similar to `--print-opcodes' but it generates an example for each instruction instead.
In the M68HC11 syntax, the instruction name comes first and it may
be followed by one or several operands (up to three). Operands are
separated by comma (`,'). In the normal mode,
as
will complain if too many operands are specified for
a given instruction. In the MRI mode (turned on with `-M' option),
it will treat them as comments. Example:
inx
lda #23
bset 2,x #4
brclr *bot #8 foo
The following addressing modes are understood:
Packed decimal (P) format floating literals are not supported. Feel free to add the code!
The floating point formats generated by directives are these.
.float
Single
precision floating point constants.
.double
Double
precision floating point constants.
.extend
.ldouble
Extended
precision (long double
) floating point constants.
Certain pseudo opcodes are permitted for branch instructions. They expand to the shortest branch instruction that reaches the target. Generally these mnemonics are made by prepending `j' to the start of a Motorola mnemonic. These pseudo opcodes are not affected by the `--short-branchs' or `--force-long-branchs' options.
The following table summarizes the pseudo-operations.
                               Displacement Width
          +-------------------------------------------------------------+
          |                     Options                                 |
          |     --short-branchs             --force-long-branchs        |
          +--------------------------+----------------------------------+
Pseudo-Op | BYTE          WORD       | BYTE          WORD               |
          +--------------------------+----------------------------------+
      bsr | bsr <pc-rel>  <error>    |               jsr <abs>          |
      bra | bra <pc-rel>  <error>    |               jmp <abs>          |
     jbsr | bsr <pc-rel>  jsr <abs>  | bsr <pc-rel>  jsr <abs>          |
     jbra | bra <pc-rel>  jmp <abs>  | bra <pc-rel>  jmp <abs>          |
      bXX | bXX <pc-rel>  <error>    |               bNX +3; jmp <abs>  |
     jbXX | bXX <pc-rel>  bNX +3;    | bXX <pc-rel>  bNX +3; jmp <abs>  |
          |               jmp <abs>  |                                  |
          +--------------------------+----------------------------------+
XX: condition
NX: negative of condition XX
jbsr
jbra
jbXX
jbcc jbeq jbge jbgt jbhi jbvs jbpl jblo jbcs jbne jblt jble jbls jbvc jbmi
For the cases of non-PC relative displacements and long displacements,
as issues a longer code fragment in terms of
NX, the opposite condition to XX. For example, for the
non-PC relative case:
        jbXX foo
gives
        bNXs oof
        jmp foo
oof:
GNU as
for MIPS architectures supports several
different MIPS processors, and MIPS ISA levels I through IV. For
information about the MIPS instruction set, see MIPS RISC
Architecture, by Kane and Heinrich (Prentice-Hall). For an overview
of MIPS assembly conventions, see "Appendix D: Assembly Language
Programming" in the same work.
The MIPS configurations of GNU as
support these
special options:
-G num
This option sets the largest size of an object that can be referenced
implicitly with the gp register. It is only accepted for targets
that use ECOFF format. The default value is 8.
-EB
-EL
as
can select big-endian or
little-endian output at run time (unlike the other GNU development
tools, which must be configured for one or the other). Use `-EB'
to select big-endian output, and `-EL' for little-endian.
-mips1
-mips2
-mips3
-mips4
-mgp32
move
, which will assemble
to a 32-bit or a 64-bit instruction depending on this flag. On some
MIPS variants there is a 32-bit mode flag; when this flag is set,
64-bit instructions generate a trap. Also, some 32-bit OSes only save
the 32-bit registers on a context switch, so it is essential never to
use the 64-bit registers.
-mgp64
-mips16
-no-mips16
-mfix7000
-no-mfix7000
-m4010
-no-m4010
-m4650
-no-m4650
-m3900
-no-m3900
-m4100
-no-m4100
-mcpu=cpu
2000, 3000, 3900, 4000, 4010, 4100, 4111, 4300, 4400, 4600, 4650, 5000, 6000, 8000, 10000
-nocpp
With GNU as, there is no need for `-nocpp', because the
GNU assembler itself never runs the C preprocessor.
--trap
--no-break
as
automatically macro expands certain division and
multiplication instructions to check for overflow and division by zero. This
option causes as
to generate code to take a trap exception
rather than a break exception when an error is detected. The trap instructions
are only supported at Instruction Set Architecture level 2 and higher.
--break
--no-trap
Assembling for a MIPS ECOFF target supports some additional sections
besides the usual .text, .data and .bss. The additional sections are
.rdata, used for read-only data, .sdata, used for small data, and
.sbss, used for small common objects.
When assembling for ECOFF, the assembler uses the $gp ($28)
register to form the address of a "small object". Any object in the
.sdata or .sbss sections is considered "small" in this sense.
For external objects, or for objects in the .bss section, you can use
the gcc `-G' option to control the size of objects addressed via
$gp; the default value is 8, meaning that a reference to any object
eight bytes or smaller uses $gp. Passing `-G 0' to as prevents it
from using the $gp register on the basis of object size (but the
assembler uses $gp for objects in .sdata or .sbss in any case). The
size of an object in the .bss section is set by the .comm or .lcomm
directive that defines it. The size of an external object may be set
with the .extern directive. For example, `.extern sym,4' declares
that the object at sym is 4 bytes in length, while leaving sym
otherwise undefined.
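The rule for when a reference goes through $gp can be condensed into a Python sketch. It treats an object whose size is unknown (for example, an external symbol with no `.extern' size) as not small, which is an assumption of this sketch; the function name is hypothetical.

```python
# Illustrative sketch only: does a reference to an object use $gp?
# Assumption: an object of unknown size (None) is not treated as small.

def addressed_via_gp(section, size=None, g_value=8):
    """section is e.g. ".sdata", ".sbss", ".bss", or "extern"."""
    if section in (".sdata", ".sbss"):
        return True              # always small, whatever -G says
    if g_value == 0 or size is None:
        return False             # -G 0 disables the size-based rule
    return size <= g_value


print(addressed_via_gp(".bss", size=4))              # True
print(addressed_via_gp(".bss", size=16))             # False
print(addressed_via_gp(".bss", size=4, g_value=0))   # False
print(addressed_via_gp(".sdata", size=64, g_value=0))  # True
```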
Using small ECOFF objects requires linker support, and assumes that the
$gp register is correctly initialized (normally done automatically by
the startup code). MIPS ECOFF assembly code must not modify the
$gp register.
MIPS ECOFF as supports several directives used for
generating debugging information which are not supported by traditional MIPS
assemblers. These are .def, .endef, .dim, .file, .scl, .size, .tag,
.type, .val, .stabd, .stabn, and .stabs. The debugging information
generated by the three .stab directives can only be read by GDB,
not by traditional MIPS debuggers (this enhancement is required to fully
support C++ debugging). These directives are primarily used by compilers, not
assembly language programmers!
GNU as supports an additional directive to change
the MIPS Instruction Set Architecture level on the fly: .set mipsn.
n should be a number from 0 to 4. A value from 1
to 4 makes the assembler accept instructions for the corresponding
ISA level, from that point on in the assembly. .set mipsn
affects not only which instructions are permitted, but also
how certain macros are expanded. .set mips0 restores the
ISA level to its original level: either the level you selected with
command line options, or the default for your configuration. You can
use this feature to permit specific R4000 instructions while
assembling in 32 bit mode. Use this directive with care!
The directive `.set mips16' puts the assembler into MIPS 16 mode, in which it will assemble instructions for the MIPS 16 processor. Use `.set nomips16' to return to normal 32 bit mode.
Traditional MIPS assemblers do not support this directive.
By default, MIPS 16 instructions are automatically extended to 32 bits when necessary. The directive `.set noautoextend' will turn this off. When `.set noautoextend' is in effect, any 32 bit instruction must be explicitly extended with the `.e' modifier (e.g., `li.e $4,1000'). The directive `.set autoextend' may be used to once again automatically extend instructions when necessary.
This directive is only meaningful when in MIPS 16 mode. Traditional MIPS assemblers do not support this directive.
The .insn
directive tells as
that the following
data is actually instructions. This makes a difference in MIPS 16 mode:
when loading the address of a label which precedes instructions,
as
automatically adds 1 to the value, so that jumping to
the loaded address will do the right thing.
The directives .set push and .set pop may be used to save
and restore the current settings for all the options which are
controlled by .set. The .set push directive saves the
current settings on a stack. The .set pop directive pops the
stack and restores the settings.
These directives can be useful inside a macro which must change an option such as the ISA level or instruction reordering but does not want to change the state of the code which invoked the macro.
Traditional MIPS assemblers do not support these directives.
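The save-and-restore behaviour is an ordinary stack of settings, which a short Python model makes explicit. The class name and the two option names (`isa' and `reorder') are placeholders chosen for this sketch; they do not reflect how as stores its state.

```python
# Illustrative sketch only: .set push / .set pop as a stack of settings.
# SetState and its option names are hypothetical.

class SetState:
    def __init__(self, isa=1, reorder=True):
        self.current = {"isa": isa, "reorder": reorder}
        self._stack = []

    def set(self, name, value):
        if name not in self.current:
            raise KeyError(name)
        self.current[name] = value

    def push(self):
        # Save a copy so later changes do not alter the saved settings.
        self._stack.append(dict(self.current))

    def pop(self):
        if not self._stack:
            raise IndexError(".set pop without a matching .set push")
        self.current = self._stack.pop()


state = SetState(isa=1)
state.push()              # macro entry: .set push
state.set("isa", 3)       # .set mips3
state.set("reorder", False)
state.pop()               # macro exit: .set pop
print(state.current)      # {'isa': 1, 'reorder': True}
```

A macro that brackets its body this way leaves the invoking code's settings untouched, which is the use case described above.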
as has two additional command-line options for the picoJava
architecture.
-ml
-mb
as
has no additional command-line options for the Hitachi
SH family.
`!' is the line comment character.
You can use `;' instead of a newline to separate statements.
Since `$' has no special meaning, you may use it in symbol names.
You can use the predefined symbols `r0', `r1', `r2', `r3', `r4', `r5', `r6', `r7', `r8', `r9', `r10', `r11', `r12', `r13', `r14', and `r15' to refer to the SH registers.
The SH also has these control registers:
pr
pc
mach
macl
sr
gbr
vbr
as
understands the following addressing modes for the SH.
Rn
in the following refers to any of the numbered
registers, but not the control registers.
Rn
@Rn
@-Rn
@Rn+
@(disp, Rn)
@(R0, Rn)
@(disp, GBR)
GBR
offset
@(R0, GBR)
addr
@(disp, PC)
as
implementation allows you to use the simpler form
addr anywhere a PC relative address is called for; the alternate
form is supported for compatibility with other assemblers.
#imm
The SH family has no hardware floating point, but the .float
directive generates IEEE floating-point numbers for compatibility
with other development tools.
.uaword
.ualong
as will issue a warning when a misaligned .word or
.long directive is used. You may use .uaword or
.ualong to indicate that the value is intentionally misaligned.
For detailed information on the SH machine instruction set, see SH-Microcomputer User's Manual (Hitachi Micro Systems, Inc.).
as
implements all the standard SH opcodes. No additional
pseudo-instructions are needed on this family. Note, however, that
because as
supports a simpler form of PC-relative
addressing, you may simply write (for example)
mov.l bar,r0
where other assemblers might require an explicit displacement to
bar
from the program counter:
mov.l @(disp, PC)
The SPARC chip family includes several successive levels, using the same core instruction set, but including a few additional instructions at each level. There are exceptions to this however. For details on what instructions each variant supports, please see the chip's architecture reference manual.
By default, as
assumes the core instruction set (SPARC
v6), but "bumps" the architecture level as needed: it switches to
successively higher architectures as it encounters instructions that
only exist in the higher levels.
If not configured for SPARC v9 (sparc64-*-*), GAS will not bump
past sparclite by default; an option must be passed to enable the
v9 instructions.
GAS treats sparclite as being compatible with v8, unless an architecture is explicitly requested. SPARC v9 is always incompatible with sparclite.
-Av6 | -Av7 | -Av8 | -Asparclet | -Asparclite
-Av8plus | -Av8plusa | -Av9 | -Av9a
If you select an architecture explicitly with one of the `-A'
options, as reports a fatal error if it encounters an instruction
or feature requiring an incompatible or higher level.
`-Av8plus' and `-Av8plusa' select a 32 bit environment.
`-Av9' and `-Av9a' select a 64 bit environment and are not
available unless GAS is explicitly configured with 64 bit environment
support.
`-Av8plusa' and `-Av9a' enable the SPARC V9 instruction set with
UltraSPARC extensions.
-xarch=v8plus | -xarch=v8plusa
-bump
-32 | -64
SPARC GAS normally permits data to be misaligned. For example, it
permits the .long
pseudo-op to be used on a byte boundary.
However, the native SunOS and Solaris assemblers issue an error when
they see misaligned data.
You can use the --enforce-aligned-data
option to make SPARC GAS
also issue an error about misaligned data, just as the SunOS and Solaris
assemblers do.
The --enforce-aligned-data
option is not the default because gcc
issues misaligned data pseudo-ops when it initializes certain packed
data structures (structures defined using the packed
attribute).
You may have to assemble with GAS in order to initialize packed data
structures in your own code.
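As a rough illustration of the two behaviours, the following sketch flags data items whose offset is not a multiple of their size. The size table and the function check_alignment are assumptions made for the example; they are not part of GAS, and "misaligned" is taken here to mean "not naturally aligned".

```python
# Sizes in bytes for a few data pseudo-ops; this table is illustrative.
SIZES = {".byte": 1, ".half": 2, ".long": 4, ".xword": 8}


def check_alignment(items, enforce=False):
    """items is a list of (offset, directive) pairs.

    Returns the misaligned entries. With enforce=True, raises on the first
    misaligned entry instead, mimicking the effect described for
    --enforce-aligned-data.
    """
    misaligned = []
    for offset, directive in items:
        size = SIZES[directive]
        if offset % size != 0:
            if enforce:
                raise ValueError(f"{directive} at offset {offset} is misaligned")
            misaligned.append((offset, directive))
    return misaligned


if __name__ == "__main__":
    # A packed structure: one byte followed directly by a .long at offset 1.
    print(check_alignment([(0, ".byte"), (1, ".long")]))
```

With enforce left at its default, the packed layout is accepted and merely reported, which matches the permissive default described above.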
The Sparc uses IEEE floating-point numbers.
The Sparc version of as
supports the following additional
machine directives:
.align
.common
"bss"
. This behaves somewhat like .comm
, but the
syntax is different.
.half
.short
.
.nword
The .nword directive produces a native-word-sized value: when
assembling with -32 it is equivalent to .word, and when assembling
with -64 it is equivalent to .xword.
.proc
.register
If the symbol name is #scratch, it is a scratch register; if it is
#ignore, the directive just suppresses any errors about using an
undeclared global register, but does not emit any information about
it into the object file. This can be useful, e.g., if you save the
register before use and restore it after.
.reserve
"bss"
. This behaves somewhat like .lcomm
, but the
syntax is different.
.seg
"text"
, "data"
, or
"data1"
. It behaves like .text
, .data
, or
.data 1
.
.skip
.space
directive.
.word
On the Sparc, the .word directive produces 32 bit values,
instead of the 16 bit values it produces on many other machines.
.xword
The .xword directive produces 64 bit values.
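The widths of .word, .xword and .nword described in this list can be summarised in a few lines. This is a minimal sketch; the function directive_bits and its env_bits parameter (standing for the -32 or -64 option) are names chosen for the example.

```python
def directive_bits(directive: str, env_bits: int = 32) -> int:
    """Width in bits of a SPARC data directive under a -32 or -64 environment."""
    if env_bits not in (32, 64):
        raise ValueError("env_bits must be 32 or 64")
    if directive == ".word":
        return 32
    if directive == ".xword":
        return 64
    if directive == ".nword":
        # Native word: same as .word with -32, same as .xword with -64.
        return directive_bits(".word" if env_bits == 32 else ".xword", env_bits)
    raise KeyError(directive)


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, directive_bits(".nword", bits))
```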
The Z8000 as supports both members of the Z8000 family: the unsegmented Z8002, with 16 bit addresses, and the segmented Z8001 with 24 bit addresses.
When the assembler is in unsegmented mode (specified with the
unsegm directive), an address takes up one word-sized (16 bit)
register. When the assembler is in segmented mode (specified with
the segm directive), a 24-bit address takes up a long (32 bit)
register. See section Assembler Directives for the Z8000,
for a list of other Z8000 specific assembler directives.
as
has no additional command-line options for the Zilog
Z8000 family.
`!' is the line comment character.
You can use `;' instead of a newline to separate statements.
The Z8000 has sixteen 16 bit registers, numbered 0 to 15. You can refer to different sized groups of registers by register number, with the prefix `r' for 16 bit registers, `rr' for 32 bit registers and `rq' for 64 bit registers. You can also refer to the contents of the first eight (of the sixteen 16 bit registers) by bytes. They are named `rnh' and `rnl'.
byte registers
  r0l r0h r1h r1l r2h r2l r3h r3l
  r4h r4l r5h r5l r6h r6l r7h r7l
word registers
  r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15
long word registers
  rr0 rr2 rr4 rr6 rr8 rr10 rr12 rr14
quad word registers
  rq0 rq4 rq8 rq12
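The naming pattern follows directly from how the sixteen word registers are grouped. The sketch below regenerates the names; the function z8000_register_names is hypothetical and exists only to show the pattern (even numbers for `rr', multiples of four for `rq', byte halves for the first eight registers only).

```python
def z8000_register_names():
    """Build the register-name groups described above."""
    return {
        # High and low bytes of the first eight word registers.
        "byte": [f"r{n}{half}" for n in range(8) for half in ("h", "l")],
        "word": [f"r{n}" for n in range(16)],
        # A long register pairs two word registers, so only even numbers occur.
        "long": [f"rr{n}" for n in range(0, 16, 2)],
        # A quad register groups four word registers.
        "quad": [f"rq{n}" for n in range(0, 16, 4)],
    }


if __name__ == "__main__":
    for group, names in z8000_register_names().items():
        print(group, " ".join(names))
```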
as understands the following addressing modes for the Z8000:
rn
@rn
addr
address(rn)
rn(#imm)
rn(rm)
#xx
The Z8000 port of as includes these additional assembler directives, for compatibility with other Z8000 assemblers. As shown, these do not begin with `.' (unlike the ordinary as directives).
segm
unsegm
name
.file
global
.global
wval
.word
lval
.long
bval
.byte
sval
sval
expects one string literal, delimited by
single quotes. It assembles each byte of the string into consecutive
addresses. You can use the escape sequence `%xx' (where
xx represents a two-digit hexadecimal number) to represent the
character whose ASCII value is xx. Use this feature to
describe single quote and other characters that may not appear in string
literals as themselves. For example, the C statement `char *a =
"he said \"it's 50% off\"";' is represented in Z8000 assembly language
(shown with the assembler output in hex at the left) as
68652073 sval 'he said %22it%27s 50%25 off%22%00'
61696420
22697427
73203530
25206F66
662200
rsect
.section
block
.space
even
.align
; aligns output to even byte boundary.
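The `%xx' escape rule described for sval can be checked with a short script. This is a sketch rather than the assembler's own code: the function sval_bytes is invented for the example, and it assumes that a `%' not followed by two hexadecimal digits is passed through unchanged, which the text does not specify.

```python
import re


def sval_bytes(literal: str) -> bytes:
    """Expand %xx escapes in the body of an sval string literal."""
    out = bytearray()
    pos = 0
    for match in re.finditer(r"%([0-9A-Fa-f]{2})", literal):
        # Copy the plain text before the escape, then the escaped byte itself.
        out += literal[pos:match.start()].encode("ascii")
        out.append(int(match.group(1), 16))
        pos = match.end()
    out += literal[pos:].encode("ascii")
    return bytes(out)


if __name__ == "__main__":
    data = sval_bytes("he said %22it%27s 50%25 off%22%00")
    print(data.hex().upper())
```

Run on the sval example, the script prints the same byte sequence that appears in the hex column of that example.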
For detailed information on the Z8000 machine instruction set, see Z8000 Technical Manual.
The Vax version of as
accepts any of the following options,
gives a warning message that the option was ignored and proceeds.
These options are for compatibility with scripts designed for other
people's assemblers.
-D
(Debug)
-S
(Symbol Table)
-T
(Token Trace)
-d
(Displacement size for JUMPs)
-V
(Virtualize Interpass Temporary File)
as
always does this, so this
option is redundant.
-J
(JUMPify Longer Branches)
-t
(Temporary File Directory)
Because as does not use a temporary disk file, this
option makes no difference. `-t' needs exactly one
filename.
The Vax version of the assembler accepts additional options when compiled for VMS:
-H
option directs as
to display
every mapped symbol during assembly.
Symbols whose names include a dollar sign `$' are exceptions to the
general name mapping. These symbols are normally only used to reference
VMS library names. Such symbols are always mapped to upper case.
The `-+' option causes as to truncate any symbol
name longer than 31 characters. It also suppresses some code,
normally added after the `_main' symbol, that makes the object
file compatible with Vax-11 "C".
as
version 1.x.
as
to print every symbol
which was changed by case mapping.
Conversion of flonums to floating point is correct, and compatible with previous assemblers. Rounding is towards zero if the remainder is exactly half the least significant bit.
D
, F
, G
and H
floating point formats
are understood.
Immediate floating literals (e.g. `S`$6.9') are rendered correctly. Again, rounding is towards zero in the boundary case.
The .float
directive produces f
format numbers.
The .double
directive produces d
format numbers.
The Vax version of the assembler supports four directives for generating Vax floating point constants. They are described in the table below.
.dfloat
d
format 64-bit floating point constants.
.ffloat
f
format 32-bit floating point constants.
.gfloat
g
format 64-bit floating point constants.
.hfloat
h
format 128-bit floating point constants.
All DEC mnemonics are supported. Beware that case...
instructions have exactly 3 operands. The dispatch table that
follows the case...
instruction should be made with
.word
statements. This is compatible with all unix
assemblers we know of.
Certain pseudo opcodes are permitted. They are for branch instructions. They expand to the shortest branch instruction that reaches the target. Generally these mnemonics are made by substituting `j' for `b' at the start of a DEC mnemonic. This feature is included both for compatibility and to help compilers. If you do not need this feature, avoid these opcodes. Here are the mnemonics, and the code they can expand into.
jbsb
jbr
jr
jCOND
neq
, nequ
, eql
, eqlu
, gtr
,
geq
, lss
, gtru
, lequ
, vc
, vs
,
gequ
, cc
, lssu
, cs
.
COND may also be one of the bit tests
bs
, bc
, bss
, bcs
, bsc
, bcc
,
bssi
, bcci
, lbs
, lbc
.
NOTCOND is the opposite condition to COND.
jacbX
b d f g h l w
.
OPCODE ..., foo ; brb bar ; foo: jmp ... ; bar:
jaobYYY
lss leq
.
jsobZZZ
geq gtr
.
OPCODE ..., foo ; brb bar ; foo: brw destination ; bar:
OPCODE ..., foo ; brb bar ; foo: jmp destination ; bar:
aobleq
aoblss
sobgeq
sobgtr
OPCODE ..., foo ; brb bar ; foo: brw destination ; bar:
OPCODE ..., foo ; brb bar ; foo: jmp destination ; bar:
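The "shortest branch that reaches the target" rule can be sketched as a lookup over displacement ranges. The candidate lists below for jbr and jbsb are an assumption based on the standard VAX byte-displacement, word-displacement and jump forms; they are not spelled out in this text, and the names EXPANSIONS and expand are illustrative only.

```python
# Candidate expansions, shortest first: (mnemonic, lowest, highest displacement).
# The last entry of each list is modelled here as reaching any target.
EXPANSIONS = {
    "jbr": [("brb", -128, 127), ("brw", -32768, 32767), ("jmp", None, None)],
    "jbsb": [("bsbb", -128, 127), ("bsbw", -32768, 32767), ("jsb", None, None)],
}


def expand(pseudo_op, displacement):
    """Pick the first (shortest) candidate whose range covers the displacement."""
    for mnemonic, low, high in EXPANSIONS[pseudo_op]:
        if low is None or low <= displacement <= high:
            return mnemonic
    raise AssertionError("unreachable: the last candidate accepts any displacement")


if __name__ == "__main__":
    for disp in (100, 1000, 100000):
        print(disp, expand("jbr", disp))
```

A real assembler also has to account for the size of the instruction itself when measuring the displacement; the sketch takes the displacement as given.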
The immediate character is `$' for Unix compatibility, not `#' as DEC writes it.
The indirect character is `*' for Unix compatibility, not `@' as DEC writes it.
The displacement sizing character is ``' (an accent grave) for
Unix compatibility, not `^' as DEC writes it. The letter
preceding ``' may have either case. `G' is not
understood, but all other letters (b i l s w
) are understood.
Register names understood are r0 r1 r2 ... r15 ap fp sp
pc
. Upper and lower case letters are equivalent.
For instance
tstb *w`$4(r5)
Any expression is permitted in an operand. Operands are comma separated.
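The three character substitutions can be summarised as a translation table. The sketch below is a naive character-for-character mapping from DEC notation to the Unix-compatible notation; it is an illustration only (it ignores string literals and comments), and the names DEC_TO_UNIX and dec_to_unix are invented for the example.

```python
# DEC character -> Unix-compatible character, as listed above:
# immediate '#' -> '$', indirect '@' -> '*', displacement sizing '^' -> '`'.
DEC_TO_UNIX = str.maketrans({"#": "$", "@": "*", "^": "`"})


def dec_to_unix(operand: str) -> str:
    """Rewrite DEC-style operand punctuation in the Unix-compatible style."""
    return operand.translate(DEC_TO_UNIX)


if __name__ == "__main__":
    # Hypothetical DEC-style spelling of the example operand above.
    print(dec_to_unix("tstb @w^#4(r5)"))
```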
Vax bit fields cannot be assembled with as. Someone
can add the required code if they really need it.
as
supports the following additional command-line options
for the V850 processor family:
-wsigned_overflow
-wunsigned_overflow
-mv850
-mv850e
-mv850any
`#' is the line comment character.
as
supports the following names for registers:
general register 0
general register 1
general register 2
general register 3
general register 4
general register 5
general register 6
general register 7
general register 8
general register 9
general register 10
general register 11
general register 12
general register 13
general register 14
general register 15
general register 16
general register 17
general register 18
general register 19
general register 20
general register 21
general register 22
general register 23
general register 24
general register 25
general register 26
general register 27
general register 28
general register 29
general register 30
general register 31
system register 0
system register 1
system register 2
system register 3
system register 4
system register 5
system register 16
system register 17
system register 18
system register 19
system register 20
The V850 family uses IEEE floating-point numbers.
.offset <expression>
.section "name", <type>
.v850
.v850e
as
implements all the standard V850 opcodes.
as
also implements the following pseudo ops:
hi0()
lo()
hi()
hilo()
sdaoff()
tdaoff()
zdaoff()
ctoff()
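The list above gives only the operator names. As an assumption drawn from how hi0(), lo() and hi() are usually described for this port (not from this text), hi0() yields the upper 16 bits of a value, lo() the lower 16 bits, and hi() the upper 16 bits plus the top bit of the lower half, so that adding a sign-extended lo() afterwards reproduces the original value. The sketch models that arithmetic; sign_extend16 and rebuild are helper names invented here.

```python
def lo(value: int) -> int:
    """Lower 16 bits of a 32-bit value."""
    return value & 0xFFFF


def hi0(value: int) -> int:
    """Upper 16 bits of a 32-bit value, with no adjustment."""
    return (value >> 16) & 0xFFFF


def hi(value: int) -> int:
    """Upper 16 bits, plus bit 15 of the value (assumed definition)."""
    return (hi0(value) + ((value >> 15) & 1)) & 0xFFFF


def sign_extend16(x: int) -> int:
    """Interpret a 16-bit pattern as a signed integer."""
    return x - 0x10000 if x & 0x8000 else x


def rebuild(value: int) -> int:
    """Recombine hi() and a sign-extended lo(), as a two-instruction load would."""
    return ((hi(value) << 16) + sign_extend16(lo(value))) & 0xFFFFFFFF


if __name__ == "__main__":
    addr = 0x12348000
    print(hex(hi0(addr)), hex(hi(addr)), hex(lo(addr)), hex(rebuild(addr)))
```

The adjustment in hi() matters only when bit 15 of the value is set; otherwise hi() and hi0() agree.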
For information on the V850 instruction set, see V850 Family 32-/16-Bit single-Chip Microcontroller Architecture Manual from NEC, Ltd.