thecodingidiot.com

Registers and Addressing

Before writing any code, you need a mental model of the CPU's internal state. The 6502 has six registers. Every instruction either reads from one, writes to one, or tests the flags that track what happened to the last result.

The six registers

A — the accumulator. This is the main working register. Almost every arithmetic or logic operation happens here. When you add two numbers, the result lands in A. When you AND a value with a mask, A holds the output. If you want to move a value from one memory location to another, it passes through A on the way.
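As a minimal sketch of that last point, copying one byte between two memory locations takes a load followed by a store. The addresses $0200 and $0201 are arbitrary placeholders, not locations used by this chapter's programs.

```asm
  lda $0200   ; read the byte at $0200 into A (sets Z and N from the value)
  sta $0201   ; write A to $0201 (flags unchanged)
```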

X and Y — the index registers. These are general-purpose counters and loop variables. You can load them, increment or decrement them, and branch based on whether they have reached zero. They can also offset addresses — lda $4000,X reads from $4000 plus whatever is in X — though the programs in this chapter use them mainly as loop counters.
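For illustration only, here is how an index register can serve as a counter and an address offset at the same time. The buffer addresses $0200 and $0300 and the label copy are hypothetical, and inx and cpx (increment X, compare X) are instructions this chapter's programs do not use.

```asm
  ldx #$00        ; start at offset 0
copy:
  lda $0200,X     ; read the source byte at $0200 + X
  sta $0300,X     ; write it to $0300 + X
  inx             ; move to the next offset
  cpx #$04        ; Z is set once X equals 4
  bne copy        ; loop until four bytes have been copied
```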

SP — the stack pointer. The 6502's stack lives at $0100–$01FF. SP holds the low byte of the current stack address; the high byte is always $01. The stack grows downward: pushing a byte decrements SP, pulling increments it. PHA pushes A; PLA pulls A. JSR (jump to subroutine) pushes the return address; RTS pulls it. The programs in this chapter do not use subroutines, but the stack is always there.
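A small sketch of a push and a pull, assuming SP is first set to $FF with txs (transfer X to SP), a common initialization that this chapter's programs may or may not perform. The value $42 is arbitrary.

```asm
  ldx #$ff
  txs         ; SP = $FF, so the next push lands at $01FF
  lda #$42
  pha         ; store $42 at $01FF, SP becomes $FE
  lda #$00    ; overwrite A to show the value really comes back
  pla         ; SP becomes $FF again, A = $42
```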

PC — the program counter. PC holds the address of the next instruction the CPU will fetch. The CPU updates it automatically after every instruction. Branch and jump instructions change it explicitly to redirect execution. You never write to PC directly; you use JMP, BNE, BEQ, and similar instructions.

P — the processor status register. P packs seven individual flags into one byte (bit 5 is unused). Instructions set and clear these flags as a side effect of their result. Branch instructions test specific flags to decide whether to redirect PC. The flags relevant to this chapter are:

  • Z (zero flag): set to 1 when the result of an operation is exactly zero; cleared otherwise.
  • N (negative flag): set to 1 when bit 7 of the result is 1 (the 6502 treats values with bit 7 set as negative in signed arithmetic); cleared otherwise.

The other flags (carry, overflow, interrupt disable, decimal mode, break) matter in later work. For now, Z and N are the ones the programs here test.
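To make the two flags concrete, the sketch below shows what Z and N hold after a few loads and one decrement. The values are chosen only to hit each case.

```asm
  lda #$00    ; result $00: Z = 1, N = 0
  lda #$7f    ; result $7F: Z = 0, N = 0
  lda #$80    ; result $80: Z = 0, N = 1 (bit 7 set)
  ldx #$00
  dex         ; X wraps from $00 to $FF: Z = 0, N = 1
```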

Addressing modes

The 6502 has a dozen addressing modes. Three appear in this chapter's programs.

Immediate. The operand is a literal value embedded in the instruction itself. The # prefix marks an immediate operand.

lda #$ff

This loads the value $FF directly into A. No memory access — the byte after the opcode is the value.

Absolute. The operand is a 16-bit address. The CPU reads the value at that address from memory.

sta $4002

This stores A into the memory cell at $4002 — which, in our circuit, is the VIA's DDRB register. Absolute addresses take two bytes after the opcode (low byte first, then high byte), so absolute instructions are three bytes total.

Zero page. The operand is an 8-bit address in the range $0000–$00FF. The CPU treats the single byte as an address in the first 256 bytes of memory, called the zero page.

sta $00

This stores A into $0000. Zero-page instructions are two bytes total instead of three, and they execute one cycle faster than their absolute equivalents. The counter program uses zero-page addressing to keep the loop tight.
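Side by side, and assuming the standard 6502 opcode table, the three example instructions above assemble to the bytes shown in the comments. The cycle counts are the documented ones for these opcodes.

```asm
  lda #$ff     ; A9 FF      immediate, 2 bytes, 2 cycles
  sta $4002    ; 8D 02 40   absolute,  3 bytes, 4 cycles (address low byte first)
  sta $00      ; 85 00      zero page, 2 bytes, 3 cycles
```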

When the assembler sees a symbolic constant like DDRB = $4002 and then sta DDRB, it substitutes the address and picks absolute mode automatically. You can override the mode with a < prefix to force zero page, but you will not need that here.
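A short sketch of the symbolic-constant form. The value $ff is only an illustration (on a 6522 VIA, a 1 bit in DDRB makes the matching Port B pin an output); the chapter's own programs may write something different.

```asm
DDRB = $4002    ; VIA data direction register B, as given in the text

  lda #$ff      ; illustrative value: all eight bits set
  sta DDRB      ; assembles exactly like sta $4002 (absolute, 3 bytes)
```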

Key instructions

These are the only instructions the chapter programs use. Each one is simple on its own; the work comes from combining them.

lda — load accumulator. Copies a value into A. Comes in immediate, absolute, and zero-page forms. Sets Z if the loaded value is zero, N if bit 7 is set.

sta — store accumulator. Writes A to a memory address. Does not affect flags.

ldx — load X register. Copies a value into X. Immediate form only in these programs. Sets Z and N the same way lda does.

dex — decrement X. Subtracts 1 from X and stores the result back in X. Sets Z if X reaches zero, N if bit 7 of the result is set. Two cycles. Used in delay loops: load X with a count, decrement in a loop until Z is set.

inc — increment memory. Reads a byte from a memory address, adds 1, and writes it back. Five cycles for a zero-page address. Increments the counter variable in counter.s.

and — bitwise AND. ANDs A with an operand and stores the result back in A. Sets Z if the result is zero, N if bit 7 is set. Used to isolate a single bit: AND with a mask that has only that bit set. If the result is zero the bit was clear; if non-zero the bit was set.

cmp — compare. Subtracts an operand from A and sets flags based on the result, but does not change A. Used to test whether A equals a value. The programs here use and to test bits rather than cmp to test values, so cmp does not appear in the assembly listings — it is listed here because you will encounter it in almost any 6502 code you read.

bne — branch if not equal (branch if Z clear). If the Z flag is 0, add a signed offset to PC — the branch is taken, execution continues at the target label. If Z is 1, the instruction is ignored and execution falls through to the next instruction. The name "not equal" comes from using cmp before the branch; after and, think of it as "branch if non-zero."

beq — branch if equal (branch if Z set). The opposite of bne: the branch is taken when Z is 1. The chapter's programs do not use it directly; it is listed here because it is the mirror image of bne.
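Since cmp does not appear in the chapter's listings, here is a hedged sketch of the usual cmp-then-branch pattern. The zero-page addresses $00 and $01, the value $0A, and the label not_ten are all made up for the example.

```asm
  lda $00       ; load the value to test
  cmp #$0a      ; computes A - $0A for the flags only; A is unchanged
  bne not_ten   ; Z = 0: A differs from $0A, skip the next instruction
  inc $01       ; runs only when A was exactly $0A
not_ten:
  ; execution continues here in both cases
```

Swapping bne for beq would invert the test: the branch would be taken only when A equals $0A.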

jmp — jump. Sets PC to the target address unconditionally. Three cycles. The tightest possible loop is two instructions: one doing work, one jmp back.
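One possible shape for such a two-instruction loop, using inc as the working instruction. This is an illustration, not the counter.s listing; the zero-page address $00 and the label loop are assumptions.

```asm
  lda #$00
  sta $00       ; clear the counter byte once, before the loop
loop:
  inc $00       ; 5 cycles: counter = counter + 1, wraps from $FF to $00
  jmp loop      ; 3 cycles: back to the top, forever
```

Each pass through the loop costs 8 cycles, so the counter byte advances once every 8 clock ticks.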

The status flags in practice

Two concrete patterns appear over and over in this chapter.

The first is a delay loop using X and bne:

ldx #200
wait:
  dex
  bne wait

ldx #200 loads X with 200. Each time through the loop, dex subtracts 1 from X and sets Z when X reaches zero. While Z is clear — while X is not yet zero — bne takes the branch back to wait. When X finally reaches zero, Z is set, bne does not branch, and execution continues past the loop.
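Counting cycles shows how long this loop lasts. dex takes 2 cycles and a taken bne takes 3 (assuming the branch does not cross a page boundary), while the final, untaken bne takes 2. Two hundred passes therefore cost 199 × 5 + 4 = 999 cycles, which is about a millisecond if the clock runs at 1 MHz (the clock speed is an assumption). One hypothetical way to stretch the delay is to wrap the loop in a second counter held in Y; dey (decrement Y) is not used by this chapter's programs, and the outer count of 100 is arbitrary.

```asm
  ldy #100      ; outer count (arbitrary)
outer:
  ldx #200      ; reload the inner count on every outer pass
inner:
  dex           ; 2 cycles
  bne inner     ; 3 cycles when taken, 2 on the final pass
  dey           ; inner loop finished: count one outer pass
  bne outer     ; repeat until Y reaches zero
```

This runs the inner loop 100 times, for roughly 100,000 cycles in total.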

The second is bit testing using and and bne:

lda PORTB
and #$80
bne somewhere

lda PORTB reads the full eight-bit port value into A. and #$80 keeps only bit 7 ($80 = %10000000) and clears all others. If bit 7 was set in PORTB, the result is $80: non-zero, Z clear, and bne is taken. If bit 7 was clear, the result is $00: zero, Z set, and bne is not taken. This is how the button program decides whether the button is pressed.
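The same test can drive a polling loop. The sketch below waits until bit 7 reads as 1, using beq to loop while the masked result is zero. It assumes PORTB sits at $4000, which this section does not state, and it makes no claim about whether a set bit means pressed or released, since that depends on the wiring. The label wait_high is hypothetical.

```asm
PORTB = $4000     ; assumed address, not stated in this section

wait_high:
  lda PORTB       ; read all eight pins
  and #$80        ; keep only bit 7
  beq wait_high   ; result $00: bit 7 clear, poll again
  ; execution reaches here once bit 7 reads as 1
```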