Instruction Set Architecture

The SaarCPU ISA is a rather CISC-ish little-endian 8-bit instruction set architecture. All its opcodes are exactly 8 bits long, but immediates may follow in a second or third byte. While it focuses on 8-bit operations, we've implemented a few instructions that operate on 16-bit register pairs for convenience.

The ISA exposes seven 8-bit registers. a, b, c and d are general-purpose registers. p (for “page”) and i (“index”) can be used anywhere the above can be used; however, they can also be used together to fetch one operand from memory. In this case, we denote the operand as [pi]: p serves as the upper 8 bits of the address, i as the lower 8 bits. The last and most special register is the accumulator acc. It can also be used wherever a, b, c and d can be used, but it additionally serves as the target and source operand of every 8-bit arithmetic or logical operation. (For operations with two operands, the other operand can be an immediate, [pi] or any other register.) An overview of the encoding in instruction opcodes can be found below. 8-bit operands are called $\mathit{reg8}$.

There are two reasons for having the accumulator. First, a data bus is rather costly at the hardware level, so we decided to have only a single data bus. For binary operations, this means that either both operands or one operand and the result need to be stored in special registers anyway.
We more or less follow the design of Ulf Casper’s build: one operand is transferred via the data bus, while the other operand and the result are stored in a special register. For more details, refer to the ALU documentation.

[Figure: Models of transferring ALU operands via data buses]

Second, it would have been possible to build arithmetic instructions that take arbitrary registers as target and source, but this would have consumed many opcodes. Furthermore, it would only shift the burden of performing the moves to the microcode; the cycle count would not have been much lower.

For 16-bit operations, two 8-bit registers can be used together: ab consists of a as the upper byte and b as the lower byte, cd consists of c and d, pi of p and i. 16-bit instructions may also access the stack pointer sp. Such 16-bit operands are called $\mathit{reg16}$ .
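The way two 8-bit registers form a 16-bit value (and the way p and i form the address used by [pi]) can be sketched in a few lines of Python. This is only an illustration: the helper names `pair` and `split` and the sample register values are made up for this example and are not part of the ISA.

```python
def pair(hi: int, lo: int) -> int:
    """Combine two 8-bit values into one 16-bit value; hi is the upper byte."""
    return ((hi & 0xFF) << 8) | (lo & 0xFF)


def split(value: int) -> tuple[int, int]:
    """Split a 16-bit value into its (upper, lower) bytes."""
    return (value >> 8) & 0xFF, value & 0xFF


# Hypothetical register contents, chosen only for illustration.
regs = {"a": 0x12, "b": 0x34, "p": 0x80, "i": 0x05}

ab = pair(regs["a"], regs["b"])          # a is the upper byte, b the lower byte
pi_address = pair(regs["p"], regs["i"])  # address accessed by the operand [pi]
```

With these sample values, `ab` is `0x1234` and an access to [pi] would use address `0x8005`.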

Furthermore, there is the 16-bit program counter. It (more or less) describes which instruction is currently executing. To avoid confusion, when defining the semantics of instructions we use $\mathit{PC}$ for the address of the first byte after the current instruction’s opcode, while $\mathit{pc}$ refers to the register itself.

As mentioned above, our ISA is little-endian. In particular, this means that a 16-bit immediate is encoded with the lower byte first. Likewise, call pushes the upper byte of the program counter first so that the lower byte ends up at the lower address.
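As a rough illustration of the byte order, the following Python sketch encodes a 16-bit immediate and models the push order used by call. The helper names are hypothetical. The sketch also assumes that the stack grows downward and that a push decrements sp before storing; the text above only implies the downward growth, so treat the exact push mechanics as an assumption.

```python
def encode_imm16(value: int) -> bytes:
    """Encode a 16-bit immediate with the lower byte first (little-endian)."""
    return bytes([value & 0xFF, (value >> 8) & 0xFF])


def push_return_address(mem: bytearray, sp: int, pc: int) -> int:
    """Push pc, upper byte first, and return the new stack pointer.

    Assumption: the stack grows downward and sp is decremented before
    each store. This is a model for illustration, not the microcode.
    """
    sp = (sp - 1) & 0xFFFF
    mem[sp] = (pc >> 8) & 0xFF   # upper byte lands at the higher address
    sp = (sp - 1) & 0xFFFF
    mem[sp] = pc & 0xFF          # lower byte lands at the lower address
    return sp


# call imm16 has opcode 00 001 001 according to the encoding table.
call_0x1234 = bytes([0b00_001_001]) + encode_imm16(0x1234)

mem = bytearray(0x10000)
sp = push_return_address(mem, 0x0100, 0xABCD)
```

After the push, the two bytes at `mem[sp]` and `mem[sp + 1]` read as a little-endian 16-bit value again yield the pushed address.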

The ISA defines four flags, which are updated by arithmetic/logic operations:

The sign flag S (or $\mathit{SF}$ ) is the most significant bit of the ALU result.
The zero flag Z (or $\mathit{ZF}$ ) is set if and only if the ALU result is zero.
The overflow flag V (or $\mathit{OF}$ ) indicates a signed overflow (crossing the boundary between $127$ and $-128$ ).
The carry flag C (or $\mathit{CF}$ ) indicates an unsigned wrap (crossing the boundary between $255$ and $0$ ).
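To make the four definitions concrete, here is a small Python model of how the flags could be derived for an 8-bit addition. It is a sketch for illustration only; the function name `add8_flags` is hypothetical, and it covers addition only, since the carry convention for subtraction is not specified here.

```python
def add8_flags(x: int, y: int, carry_in: int = 0) -> tuple[int, dict[str, int]]:
    """Add two 8-bit values and derive the S, Z, V and C flags."""
    x &= 0xFF
    y &= 0xFF
    total = x + y + carry_in
    result = total & 0xFF
    flags = {
        "S": (result >> 7) & 1,      # most significant bit of the result
        "Z": int(result == 0),       # result is zero
        # Signed overflow: both inputs have the same sign bit,
        # but the result's sign bit differs from it.
        "V": int(((x ^ result) & (y ^ result) & 0x80) != 0),
        "C": int(total > 0xFF),      # unsigned wrap past 255
    }
    return result, flags


signed_wrap = add8_flags(127, 1)     # crosses the 127 / -128 boundary
unsigned_wrap = add8_flags(255, 1)   # crosses the 255 / 0 boundary
```

In this model, `127 + 1` sets S and V but not C, while `255 + 1` sets Z and C but not V.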

When we specify the affected flags below, a letter means the flag is set according to the result, - means the flag is not affected, and ? means its value is undefined.

For movs, our ISA supports three more sophisticated addressing modes, see below.

Notation

The assembly syntax we use is to some extent inspired by x86 assembly (Intel syntax). In particular, the destination is denoted before the source. For memory indirections, we also use square brackets [ and ]. For example, mov acc, [ab+42] means to load the value at address $\mathit{ab} + 42$ into $\mathit{acc}$.

To describe the semantics of an instruction like mov acc, [reg16+imm8s16] (imm8s16 stands for an 8-bit immediate that is sign-extended to 16 bits), we use the following notation: $\mathit{acc} \gets \mathit{mem}[\mathit{reg16} + \mathit{sext}(\mathit{imm8})]$ .
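The semantics $\mathit{acc} \gets \mathit{mem}[\mathit{reg16} + \mathit{sext}(\mathit{imm8})]$ can be modelled in Python as follows. The helper names are hypothetical, and the wrap-around of the address to 16 bits is an assumption of this sketch, not something stated above.

```python
def sext8(imm8: int) -> int:
    """Sign-extend an 8-bit immediate: 0x80..0xFF map to -128..-1."""
    imm8 &= 0xFF
    return imm8 - 0x100 if imm8 & 0x80 else imm8


def load_indexed(mem: bytearray, reg16: int, imm8: int) -> int:
    """Model of mov acc, [reg16+imm8s16]; returns the value loaded into acc.

    Assumption: the effective address wraps around modulo 2**16.
    """
    address = (reg16 + sext8(imm8)) & 0xFFFF
    return mem[address]


mem = bytearray(0x10000)
mem[0x1000 + 42] = 0x99
mem[0x1000 - 2] = 0x55

forward = load_indexed(mem, 0x1000, 42)     # mov acc, [ab+42] with ab = 0x1000
backward = load_indexed(mem, 0x1000, 0xFE)  # immediate 0xFE is sign-extended to -2
```

The second load shows why the sign extension matters: the immediate byte `0xFE` addresses the byte two positions below `reg16`, not 254 positions above it.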

Encoding

Encoding	Affected Flags	Instruction	Comments
`00 000 000`	`SZVC`	`reset`
`00 aaa 000`		`other alu`	`a` ≠ `000`
`00 fff 001`	`----`	`other control flow`
`00 ddd 010`	`----`	`mov d, imm8`
`00 bbb 011`		`binop acc, imm8`
`00 rrr 100`	`----`	`push r`	`r` ≠ `[pi]`
`00 110 100`	`----`	`pushf`
`00 rrr 101`	`----`	`pop r`	`r` ≠ `[pi]`
`00 110 101`	`----`	`popf`
`00 ccc 11n`	`----`	`jcc imm8s16/imm16`
`01 bbb rrr`		`binop acc, r`
`10 ddd rrr`	`----`	`mov d, r`	`d` ≠ `r`, no `acc` from/to `[pi]`
`10 000 000`	`----`	`prefix_a16`
`10 001 001`	`----`	`i2c_send`
`10 010 010`	`----`	`i2c_recv`
`10 011 011`	`----`	`spi`
`10 100 100`	`----`	`switch_fb`
`10 101 101`		`(unused)`
`10 110 110`	`----`	`hlt`
`10 111 111`		`(unused)`
`10 111 110`	`----`	`mov [imm16], acc`
`10 110 111`	`----`	`mov acc, [imm16]`
`11 00 iiww`	`----`	`mov [i(w)], acc`
`11 01 iiww`	`----`	`mov acc, [i(w)]`
`11 10 00ww`	`SZVC`	`add ab, w`
`11 10 01ww`	`SZVC`	`sub ab, w`
`11 10 10ww`	`SZVC`	`add w, imm8s16`
`11 10 11ww`	`----`	`mov w, imm16`
`11 11 vvww`	`----`	`mov v, w`	`v` ≠ `w`
`11 11 0000`		`(unused)`
`11 11 0101`		`(unused)`
`11 11 1010`	`----`	`assertz`
`11 11 1111`	`----`	`nop`

`bbb`: binary operator

Encoding	Instruction	Affected Flags
`000`	`add`	`SZVC`
`001`	`adc`	`SZVC`
`010`	`sub`	`SZVC`
`011`	`sbc`	`SZVC`
`100`	`and`	`SZ??`
`101`	`or`	`SZ??`
`110`	`xor`	`SZ??`
`111`	`cmp`	`SZVC`

`aaa`: other ALU operations

Encoding	Instruction	Affected Flags	Description
`000`			(`reset`)
`001`	`rcr`	`SZ?C`	rotate right `acc` through carry
`010`	`shr`	`SZ?C`	shift right `acc` filling in 0
`011`	`sar`	`SZ?C`	shift right `acc` replicating MSB
`100`	`not`	`SZVC`	`~acc`
`101`	`neg`	`SZVC`	`-acc`
`110`	`clc`	`---C`	clear carry
`111`	`stc`	`---C`	set carry

`fff`: other control flow

Encoding	Instruction	Description
`000`	`jmp pi`	indirect jump
`001`	`call imm16`	call absolute
`010`	`call pi`	indirect call
`011`	`ret`	return
`100`	`int imm8`	software interrupt
`101`	`iret`	return from interrupt
`110`	`cli`	disable interrupts
`111`	`sti`	enable interrupts

`ddd`/`rrr`: 8-bit operand

Encoding	Meaning
`000`	`a`
`001`	`b`
`010`	`c`
`011`	`d`
`100`	`p` (page)
`101`	`i` (index)
`110`	`[pi]` memory indirect
`111`	`acc`
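As an illustration of how the `bbb`, `ddd` and `rrr` fields fit together, the following Python sketch decodes opcodes of the `01 bbb rrr` and `10 ddd rrr` groups using only the tables above. The function name `decode` and the table names are hypothetical; this is not the project's actual assembler or disassembler.

```python
REG8 = ["a", "b", "c", "d", "p", "i", "[pi]", "acc"]
BINOPS = ["add", "adc", "sub", "sbc", "and", "or", "xor", "cmp"]

# Opcodes in the 10 group that do not mean "mov d, r" (see encoding table).
SPECIAL_10 = {
    0b10_000_000: "prefix_a16",
    0b10_001_001: "i2c_send",
    0b10_010_010: "i2c_recv",
    0b10_011_011: "spi",
    0b10_100_100: "switch_fb",
    0b10_110_110: "hlt",
    0b10_111_110: "mov [imm16], acc",
    0b10_110_111: "mov acc, [imm16]",
}
UNUSED_10 = {0b10_101_101, 0b10_111_111}


def decode(opcode: int) -> str:
    """Decode an opcode from the 01 (binop) or 10 (mov) group."""
    group = opcode >> 6
    upper = (opcode >> 3) & 0b111   # bbb or ddd
    lower = opcode & 0b111          # rrr
    if group == 0b01:
        return f"{BINOPS[upper]} acc, {REG8[lower]}"
    if group == 0b10:
        if opcode in SPECIAL_10:
            return SPECIAL_10[opcode]
        if opcode in UNUSED_10:
            return "(unused)"
        return f"mov {REG8[upper]}, {REG8[lower]}"
    raise ValueError(f"opcode {opcode:#04x} is outside the 01/10 groups")
```

For example, `decode(0b01_000_110)` yields `add acc, [pi]`, and `decode(0b10_000_111)` yields `mov a, acc`. The lookup table makes explicit that the slots where `d` equals `r`, as well as the two `acc`/`[pi]` combinations, are reused for other instructions.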

`ii`: address mode

Encoding	Address Mode	Notation
`00`	register	`[reg16]`
`01`	register post-increment	`[reg16++]`
`10`	register pre-decrement	`[--reg16]`
`11`	register + `imm8s16`	`[reg16+imm8s16]`

`vv`/`ww`: 16-bit register

Encoding	Register
`00`	`ab`
`01`	`cd`
`10`	`pi`
`11`	`sp`
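The `ii`, `vv` and `ww` fields can be decoded in the same spirit. The sketch below covers the `11` group of the encoding table; the names `decode_wide`, `REG16` and `ADDR_MODES` are hypothetical and exist only for this illustration.

```python
REG16 = ["ab", "cd", "pi", "sp"]
ADDR_MODES = ["[{r}]", "[{r}++]", "[--{r}]", "[{r}+imm8s16]"]


def decode_wide(opcode: int) -> str:
    """Decode an opcode from the 11 group (memory movs and 16-bit operations)."""
    if opcode >> 6 != 0b11:
        raise ValueError(f"opcode {opcode:#04x} is outside the 11 group")
    kind = (opcode >> 4) & 0b11     # selects the row family in the table
    field = (opcode >> 2) & 0b11    # ii, a sub-operation, or vv
    ww = REG16[opcode & 0b11]
    if kind == 0b00:
        return f"mov {ADDR_MODES[field].format(r=ww)}, acc"
    if kind == 0b01:
        return f"mov acc, {ADDR_MODES[field].format(r=ww)}"
    if kind == 0b10:
        return [
            f"add ab, {ww}",
            f"sub ab, {ww}",
            f"add {ww}, imm8s16",
            f"mov {ww}, imm16",
        ][field]
    # kind == 0b11: mov v, w, where the v == w slots are reused.
    vv = REG16[field]
    if vv != ww:
        return f"mov {vv}, {ww}"
    return {"ab": "(unused)", "cd": "(unused)", "pi": "assertz", "sp": "nop"}[ww]
```

For example, `decode_wide(0b11_01_11_00)` yields `mov acc, [ab+imm8s16]`, and `decode_wide(0b11_11_11_11)` yields `nop`.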

We use ab as the 16-bit accumulator. This just means that most 16-bit instructions only support ab as the destination; it has nothing to do with the 8-bit register acc.

Instruction length

For each instruction, we give the number of micro-instructions / clock cycles needed to execute it. Often, using prefix_a16 makes the instruction take more clock cycles. You can then find documentation like

5 cycles for add acc, [pi]/adc acc, [pi] with prefix_a16 (the sequence prefix_a16; add acc, [pi] takes 6 cycles)

This means that add acc, [pi] takes 5 cycles if the prefix_a16 latch is set. Since the prefix_a16 instruction, which sets the latch, also takes one cycle, the total length of this two-instruction sequence is 6 cycles.
