Lexical Analysis: The assembler reads each line of the assembly code and splits it into its parts: labels, mnemonics (like ADD, MOV), operands, and comments. Comments and blank lines are discarded, since they produce no machine code.
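As a rough illustration of this step, the sketch below splits one source line into a label, a mnemonic, and a list of operands. The syntax is an assumption made for the example only: labels end with a colon, operands are separated by commas, and comments start with a semicolon. The function name parse_line is hypothetical.

```python
def parse_line(line):
    """Split one source line into (label, mnemonic, operands).

    Assumed syntax (not from any specific assembler):
        [label:] [MNEMONIC [operand, operand, ...]] [; comment]
    """
    code = line.split(";", 1)[0].strip()      # drop the comment, if any
    label = None
    if ":" in code:
        label, code = code.split(":", 1)      # text before ':' is the label
        label = label.strip()
        code = code.strip()
    if not code:
        return label, None, []                # blank, comment-only, or label-only line
    parts = code.split(None, 1)               # mnemonic, then the rest
    mnemonic = parts[0].upper()
    operands = [op.strip() for op in parts[1].split(",")] if len(parts) > 1 else []
    return label, mnemonic, operands


print(parse_line("loop: ADD R1, R2 ; accumulate"))
```

A real assembler would also have to handle things like string literals that contain a semicolon or colon, which this sketch does not attempt.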
Symbol Table Construction: The assembler identifies labels (symbols) in the code, which often represent memory addresses or locations in the program. Each label is stored in a symbol table alongside its memory address. A label may mark an instruction or a data item such as a variable, so the table is what lets the assembler track where instructions and data reside.
Address Assignment: As the assembler reads through each instruction, it assigns memory addresses sequentially using a location counter, which advances by the size of each instruction or data item. The value of the counter at the point where a label appears becomes the address that label points to. This phase defines memory locations but does not yet translate the code into machine instructions.
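The symbol table and address assignment steps can be sketched together as a single loop over the source. This is a minimal sketch under stated assumptions: every instruction occupies the same number of address units (the instruction_size parameter), labels end with a colon, and comments start with a semicolon. The name first_pass and its parameters are hypothetical.

```python
def first_pass(lines, start_address=0, instruction_size=1):
    """Build a symbol table mapping each label to its address.

    Assumes a fixed instruction size, which real instruction sets
    often do not have.
    """
    symbols = {}
    location = start_address                  # the running address counter
    for line in lines:
        code = line.split(";", 1)[0].strip()  # ignore comments
        if ":" in code:
            label, code = code.split(":", 1)
            symbols[label.strip()] = location # label takes the current address
            code = code.strip()
        if code:                              # an instruction follows the label
            location += instruction_size
    return symbols


program = [
    "start: MOV R1, 5",
    "loop:  SUB R1, 1   ; count down",
    "       JNZ loop",
    "done:  HLT",
]
print(first_pass(program))
```

Note that no opcode is produced here; the loop only needs to know how much space each line occupies.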
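Error Checking: Pass 1 can flag errors that are visible without full translation, such as a label that is defined more than once. An undefined symbol generally cannot be reported at this stage, because a label may legitimately be used before its definition appears later in the source (a forward reference); that check is normally completed in Pass 2, once the symbol table is final. Syntax errors may also be only partially caught here, as a detailed analysis happens in Pass 2.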
Symbol Table Lookup: Using the symbol table created in Pass 1, the assembler revisits each instruction in the assembly code. It replaces symbolic labels with the actual memory addresses they represent, as defined in the symbol table.
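A lookup of this kind might look like the sketch below. It assumes that an operand is either a label or a numeric literal (registers and other operand kinds are left out), and the helper name resolve_operand is hypothetical. A name that is neither in the table nor a number is reported as an undefined symbol.

```python
def resolve_operand(operand, symbols):
    """Return the numeric value of an operand.

    Assumes the operand is either a label from the symbol table or a
    numeric literal such as 42 or 0x2A.
    """
    if operand in symbols:
        return symbols[operand]               # label: substitute its address
    try:
        return int(operand, 0)                # base 0 accepts decimal, 0x.., 0b.., 0o..
    except ValueError:
        raise NameError(f"undefined symbol: {operand}") from None


symbols = {"loop": 4, "done": 10}
print(resolve_operand("loop", symbols))
print(resolve_operand("0x10", symbols))
```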
Mnemonic Translation: Each assembly language instruction mnemonic (like ADD, SUB, etc.) is translated into its corresponding machine code operation, typically represented in binary or hexadecimal. This is often done by looking up each mnemonic in an operation code (opcode) table, which lists each mnemonic with its binary opcode.
Addressing Mode Resolution: For instructions that involve addressing modes (e.g., immediate, direct, indirect), the assembler determines the correct machine code format for the instruction based on how data or memory is being accessed.
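To make the idea concrete, the sketch below classifies an operand by a prefix character. The notation is invented for the example and differs between real assemblers: here "#" marks an immediate value, "@" marks an indirect address, and a bare operand is treated as a direct address. The Mode values stand in for whatever mode bits a particular instruction format would use.

```python
from enum import Enum


class Mode(Enum):
    # Hypothetical mode codes; a real instruction set defines its own.
    IMMEDIATE = 0
    DIRECT = 1
    INDIRECT = 2


def addressing_mode(operand):
    """Return (mode, operand text without its prefix)."""
    if operand.startswith("#"):
        return Mode.IMMEDIATE, operand[1:]    # the value itself is the data
    if operand.startswith("@"):
        return Mode.INDIRECT, operand[1:]     # address of the address of the data
    return Mode.DIRECT, operand               # address of the data


for text in ("#5", "count", "@ptr"):
    print(text, addressing_mode(text))
```

The assembler would then combine the mode code with the opcode according to the instruction format of the target machine.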
Binary Code Generation: The assembler generates the final machine code, a series of binary instructions that the CPU can execute. In practice this output is usually written to an object file, which may still need to be linked and loaded before it runs. Any remaining symbolic references are resolved by inserting the corresponding addresses, so that all labels and instructions are correctly represented in machine language.
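Putting the Pass 2 steps together, the following sketch turns source lines into bytes. Everything about the target is assumed for illustration: a made-up instruction set with four mnemonics, one optional operand per instruction, and a fixed two-byte encoding (opcode byte, then operand byte). The OPCODES table and the second_pass function are hypothetical, and the symbol table is passed in as if Pass 1 had already produced it.

```python
# Hypothetical opcode table for an invented two-byte instruction format.
OPCODES = {"LOAD": 0x01, "ADD": 0x02, "JMP": 0x03, "HLT": 0xFF}


def second_pass(lines, symbols):
    """Translate source lines to bytes: [opcode, operand] per instruction."""
    output = bytearray()
    for line in lines:
        code = line.split(";", 1)[0]          # drop the comment
        code = code.split(":", 1)[-1].strip() # drop the label, if present
        if not code:
            continue
        parts = code.split()
        mnemonic = parts[0].upper()
        if mnemonic not in OPCODES:
            raise ValueError(f"invalid instruction: {mnemonic}")
        operand = parts[1] if len(parts) > 1 else "0"
        if operand in symbols:
            value = symbols[operand]          # label becomes its address
        else:
            try:
                value = int(operand, 0)
            except ValueError:
                raise NameError(f"undefined symbol: {operand}") from None
        output += bytes([OPCODES[mnemonic], value & 0xFF])
    return bytes(output)


program = ["start: LOAD 10", "loop: ADD 1 ; bump", "JMP loop", "HLT"]
symbols = {"start": 0, "loop": 2}             # as Pass 1 would compute for 2-byte instructions
print(second_pass(program, symbols).hex())    # prints 010a02010302ff00
```

The two error branches correspond to checks that can only be completed at this stage: an unknown mnemonic and a symbol that never received an address.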
Error Checking: The assembler performs additional checks for errors, including incorrect operand types, invalid instructions, and unresolved symbols, flagging any issues missed in Pass 1.