Tuesday, November 16, 2010

assembly tutorial

http://docs.sun.com/app/docs/doc/817-5477/ennab?a=view

 There are two types of lables: symbolic and numeric.

Symbolic Labels

A symbolic label consists of an identifier (or symbol) followed by a colon (:) (ASCII 0x3A). Symbolic labels must be defined only once. Symbolic labels have global scope and appear in the object file's symbol table.
Symbolic labels with identifiers beginning with a period (.) (ASCII 0x2E) are considered to have local scope and are not included in the object file's symbol table.

Numeric Labels

A numeric label consists of a single digit in the range zero (0) through nine (9) followed by a colon (:). Numeric labels are used only for local reference and are not included in the object file's symbol table. Numeric labels have limited scope and can be redefined repeatedly.
When a numeric label is used as a reference (as an instruction operand, for example), the suffixes b (“backward”) or f (“forward”) should be added to the numeric label. For numeric label N, the reference Nb refers to the nearest label N defined before the reference, and the reference Nf refers to the nearest label N defined after the reference. The following example illustrates the use of numeric labels:
1:          / define numeric label "1"
one:        / define symbolic label "one"

/ ... assembler code ...

jmp   1f    / jump to first numeric label "1" defined
            / after this instruction
            / (this reference is equivalent to label "two")

jmp   1b    / jump to last numeric label "1" defined
            / before this instruction
            / (this reference is equivalent to label "one")

1:          / redefine label "1"
two:        / define symbolic label "two"

jmp   1b    / jump to last numeric label "1" defined
            / before this instruction
            / (this reference is equivalent to label "two")

Operands
An x86 instruction can have zero to three operands. Operands are separated by commas (,) (ASCII 0x2C). For instructions with two operands, the first (lefthand) operand is the source operand, and the second (righthand) operand is the destination operand (that is, source->destination).

Note – The Intel assembler uses the opposite order (destination<-source) for operands.

Operands can be immediate (that is, constant expressions that evaluate to an inline value), register (a value in the processor number registers), or memory (a value stored in memory). An indirect operand contains the address of the actual operand value. Indirect operands are specified by prefixing the operand with an asterisk (*) (ASCII 0x2A). Only jump and call instructions can use indirect operands.
  • Immediate operands are prefixed with a dollar sign ($) (ASCII 0x24)
  • Register names are prefixed with a percent sign (%) (ASCII 0x25)
  • Memory operands are specified either by the name of a variable or by a register that contains the address of a variable. A variable name implies the address of a variable and instructs the computer to reference the contents of memory at that address. Memory references have the following syntax:segment:offset(base, index, scale).
    • Segment is any of the x86 architecture segment registers. Segment is optional: if specified, it must be separated from offset by a colon (:). If segment is omitted, the value of %ds (the default segment register) is assumed.
    • Offset is the displacement from segment of the desired memory value. Offset is optional.
    • Base and index can be any of the general 32–bit number registers.
    • Scale is a factor by which index is to be multipled before being added to base to specify the address of the operand. Scale can have the value of 1, 2, 4, or 8. If scale is not specified, the default value is 1.
    Some examples of memory addresses are:
    movl var, %eax
    Move the contents of memory location var into number register %eax.
    movl %cs:var, %eax
    Move the contents of memory location var in the code segment (register %cs) into number register %eax.
    movl $var, %eax
    Move the address of var into number register %eax.
    movl array_base(%esi), %eax
    Add the address of memory location array_base to the contents of number register %esi to determine an address in memory. Move the contents of this address into number register %eax.
    movl (%ebx, %esi, 4), %eax
    Multiply the contents of number register %esi by 4 and add the result to the contents of number register %ebx to produce a memory reference. Move the contents of this memory location into number register %eax.
    movl struct_base(%ebx, %esi, 4), %eax
    Multiply the contents of number register %esi by 4, add the result to the contents of number register %ebx, and add the result to the address of struct_base to produce an address. Move the contents of this address into number register %eax.


Assembler Directives

Directives are commands that are part of the assembler syntax but are not related to the x86 processor instruction set. All assembler directives begin with a period (.) (ASCII 0x2E).
.align integer, pad
The .align directive causes the next data generated to be aligned modulo integer bytes. Integer must be a positive integer expression and must be a power of 2. If specified, pad is an integer bye value used for padding. The default value of pad for the text section is 0x90 (nop); for other sections, the default value of pad is zero (0).
.bss
The .bss directive changes the current section to .bss.
.bss symbol, integer
Define symbol in the .bss section and add integer bytes to the value of the location counter for .bss. When issued with arguments, the .bss directive does not change the current section to .bss. Integer must be positive.
.
.globl symbol1, symbol2, ..., symbolN
The .globl directive declares each symbol in the list to be global. Each symbol is either defined externally or defined in the input file and accessible in other files. Default bindings for the symbol are overridden. A global symbol definition in one file satisfies an undefined reference to the same global symbol in another file. Multiple definitions of a defined global symbol are not allowed. If a defined global symbol has more than one definition, an error occurs. The .globl directive only declares the symbol to be global in scope, it does not define the symbol.
.group group, section, #comdat
The .group directive adds section to a COMDAT group. Refer to COMDAT Section in Linker and Libraries Guide for additional information about COMDAT.
.section section, attributes
The .section directive makes section the current section. If section does not exist, a new section with the specified name and attributes is created. If section is a non-reserved section, attributes must be included the first time section is specified by the .section directive.

No comments: