Updates ISA spec accordingly.
Lichtso committed Nov 22, 2024
1 parent e72ec04 commit 08bcb37
Showing 1 changed file with 82 additions and 52 deletions.
134 changes: 82 additions & 52 deletions doc/bytecode.md
@@ -19,7 +19,7 @@ All of them are 64 bit wide.
| `r8` | all | GPR | Call-preserved
| `r9` | all | GPR | Call-preserved
| `r10` | all | Frame pointer | System register
| `r11` | from v2 | Stack pointer | System register
| `r11` | from v1 | Stack pointer | System register
| `pc` | all | Program counter | Hidden register


@@ -78,7 +78,7 @@ The following Rust equivalents assume that:
- `imm` is `u32`
- `off` is `u16`

### 32 bit Arithmetic and Logic
### Memory Load or 32 bit Arithmetic and Logic
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
| ------------------ | ----------- | ---------------------- | ---------------
| `04` / `00000100` | until v2 | `add32 dst, imm` | `dst = (dst as u32).wrapping_add(imm) as i32 as i64 as u64`
@@ -90,9 +90,13 @@ The following Rust equivalents assume that:
| `1C` / `00011100` | until v2 | `sub32 dst, src` | `dst = (dst as u32).wrapping_sub(src as u32) as i32 as i64 as u64`
| `1C` / `00011100` | from v2 | `sub32 dst, src` | `dst = (dst as u32).wrapping_sub(src as u32) as u64`
| `24` / `00100100` | until v2 | `mul32 dst, imm` | `dst = (dst as i32).wrapping_mul(imm as i32) as i64 as u64`
| `24` / `00100100` | from v2 | -- reserved --
| `2C` / `00101100` | until v2 | `mul32 dst, src` | `dst = (dst as i32).wrapping_mul(src as i32) as i64 as u64`
| `2C` / `00101100` | from v2 | `ldxb dst, [src + off]`
| `34` / `00110100` | until v2 | `div32 dst, imm` | `dst = ((dst as u32) / imm) as u64`
| `34` / `00110100` | from v2 | -- reserved --
| `3C` / `00111100` | until v2 | `div32 dst, src` | `dst = ((dst as u32) / (src as u32)) as u64`
| `3C` / `00111100` | from v2 | `ldxh dst, [src + off]`
| `44` / `01000100` | all | `or32 dst, imm` | `dst = (dst as u32).or(imm) as u64`
| `4C` / `01001100` | all | `or32 dst, src` | `dst = (dst as u32).or(src as u32) as u64`
| `54` / `01010100` | all | `and32 dst, imm` | `dst = (dst as u32).and(imm) as u64`
@@ -102,9 +106,13 @@ The following Rust equivalents assume that:
| `74` / `01110100` | all | `rsh32 dst, imm` | `dst = (dst as u32).wrapping_shr(imm) as u64`
| `7C` / `01111100` | all | `rsh32 dst, src` | `dst = (dst as u32).wrapping_shr(src as u32) as u64`
| `84` / `10000100` | until v2 | `neg32 dst` | `dst = (dst as i32).wrapping_neg() as u32 as u64`
| `8C` / `10001100` | | -- reserved --
| `84` / `10000100` | from v2 | -- reserved --
| `8C` / `10001100` | until v2 | -- reserved --
| `8C` / `10001100` | from v2     | `ldxw dst, [src + off]`
| `94` / `10010100` | until v2 | `mod32 dst, imm` | `dst = ((dst as u32) % imm) as u64`
| `94` / `10010100` | from v2 | -- reserved --
| `9C` / `10011100` | until v2 | `mod32 dst, src` | `dst = ((dst as u32) % (src as u32)) as u64`
| `9C` / `10011100` | from v2     | `ldxdw dst, [src + off]`
| `A4` / `10100100` | all | `xor32 dst, imm` | `dst = (dst as u32).xor(imm) as u64`
| `AC` / `10101100` | all | `xor32 dst, src` | `dst = (dst as u32).xor(src as u32) as u64`
| `B4` / `10110100` | all | `mov32 dst, imm` | `dst = imm as i32 as i64 as u64`
@@ -113,10 +121,11 @@ The following Rust equivalents assume that:
| `C4` / `11000100` | all | `ash32 dst, imm` | `dst = (dst as i32).wrapping_shr(imm) as u32 as u64`
| `CC` / `11001100` | all | `ash32 dst, src` | `dst = (dst as i32).wrapping_shr(src as u32) as u32 as u64`
| `D4` / `11010100` | until v2 | `le dst, imm` | `dst = dst as u32 as u64`
| `D4` / `11010100` | from v2 | -- reserved --
| `DC` / `11011100` | all | `be dst, imm` | `dst = match imm { 16 => (dst as u16).swap_bytes() as u64, 32 => (dst as u32).swap_bytes() as u64, 64 => dst.swap_bytes() }`
| `E4` to `FC` | | -- reserved --
| `E4` to `FC` | all | -- reserved --
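
To make the difference between the two `sub32 dst, src` rows concrete, the following is a minimal sketch that wraps the Rust equivalents from the table in plain functions. The function names are hypothetical and not part of the spec, and returning `None` from `be` for widths other than 16, 32 and 64 is an assumption of this sketch; the table does not say what happens in that case.

```rust
/// `sub32 dst, src` as listed for feature sets until v2:
/// the 32 bit result is sign extended to 64 bit.
fn sub32_until_v2(dst: u64, src: u64) -> u64 {
    (dst as u32).wrapping_sub(src as u32) as i32 as i64 as u64
}

/// `sub32 dst, src` as listed from v2:
/// the 32 bit result is zero extended to 64 bit.
fn sub32_from_v2(dst: u64, src: u64) -> u64 {
    (dst as u32).wrapping_sub(src as u32) as u64
}

/// `be dst, imm`: byte swap of the lowest 16, 32 or 64 bits.
/// Assumption: any other width yields `None` here.
fn be(dst: u64, imm: u32) -> Option<u64> {
    match imm {
        16 => Some((dst as u16).swap_bytes() as u64),
        32 => Some((dst as u32).swap_bytes() as u64),
        64 => Some(dst.swap_bytes()),
        _ => None,
    }
}
```

With `dst = 0` and `src = 1`, the first function fills the upper 32 bits with ones while the second leaves them cleared.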

### 64 bit Arithmetic and Logic
### Memory Store or 64 bit Arithmetic and Logic
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
| ------------------ | ----------- | ------------------ | ---------------
| `07` / `00000111` | all | `add64 dst, imm` | `dst = dst.wrapping_add(imm as i32 as i64 as u64)`
@@ -125,9 +134,13 @@ The following Rust equivalents assume that:
| `17` / `00010111` | from v2 | `sub64 dst, imm` | `dst = (imm as i32 as i64 as u64).wrapping_sub(dst)`
| `1F` / `00011111` | all | `sub64 dst, src` | `dst = dst.wrapping_sub(src)`
| `27` / `00100111` | until v2 | `mul64 dst, imm` | `dst = dst.wrapping_mul(imm as u64)`
| `27` / `00100111` | from v2     | `stb [dst + off], imm`
| `2F` / `00101111` | until v2 | `mul64 dst, src` | `dst = dst.wrapping_mul(src)`
| `2F` / `00101111` | from v2     | `stxb [dst + off], src`
| `37` / `00110111` | until v2 | `div64 dst, imm` | `dst = dst / (imm as u64)`
| `37` / `00110111` | from v2     | `sth [dst + off], imm`
| `3F` / `00111111` | until v2 | `div64 dst, src` | `dst = dst / src`
| `3F` / `00111111` | from v2     | `stxh [dst + off], src`
| `47` / `01000111` | all | `or64 dst, imm` | `dst = dst.or(imm)`
| `4F` / `01001111` | all | `or64 dst, src` | `dst = dst.or(src)`
| `57` / `01010111` | all | `and64 dst, imm` | `dst = dst.and(imm)`
@@ -137,18 +150,23 @@ The following Rust equivalents assume that:
| `77` / `01110111` | all | `rsh64 dst, imm` | `dst = dst.wrapping_shr(imm)`
| `7F` / `01111111` | all | `rsh64 dst, src` | `dst = dst.wrapping_shr(src as u32)`
| `87` / `10000111` | until v2 | `neg64 dst` | `dst = (dst as i64).wrapping_neg() as u64`
| `8F` / `10001111` | | -- reserved --
| `87` / `10000111` | from v2     | `stw [dst + off], imm`
| `8F` / `10001111` | until v2    | -- reserved --
| `8F` / `10001111` | from v2     | `stxw [dst + off], src`
| `97` / `10010111` | until v2 | `mod64 dst, imm` | `dst = dst % (imm as u64)`
| `97` / `10010111` | from v2     | `stdw [dst + off], imm`
| `9F` / `10011111` | until v2 | `mod64 dst, src` | `dst = dst % src`
| `9F` / `10011111` | from v2     | `stxdw [dst + off], src`
| `A7` / `10100111` | all | `xor64 dst, imm` | `dst = dst.xor(imm)`
| `AF` / `10101111` | all | `xor64 dst, src` | `dst = dst.xor(src)`
| `B7` / `10110111` | all | `mov64 dst, imm` | `dst = imm as u64`
| `BF` / `10111111` | all | `mov64 dst, src` | `dst = src`
| `C7` / `11000111` | all | `ash64 dst, imm` | `dst = (dst as i64).wrapping_shr(imm)`
| `CF` / `11001111` | all | `ash64 dst, src` | `dst = (dst as i64).wrapping_shr(src as u32)`
| `D7` to `EF` | | -- reserved --
| `D7` to `EF` | all | -- reserved --
| `F7` / `11110111` | until v2 | -- reserved --
| `F7` / `11110111` | from v2 | `hor64 dst, imm` | `dst = dst.or((imm as u64).wrapping_shl(32))`
| `FF` / `11111111` | | -- reserved --
| `FF` / `11111111` | all | -- reserved --
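
Two rows in this table are easy to misread: `hor64` only touches the upper half of the register, and `sub64 dst, imm` from v2 subtracts the register from the immediate rather than the other way round. A minimal sketch, with hypothetical function names, and with the `|` operator standing in for the `.or(...)` notation used in the table:

```rust
/// `hor64 dst, imm` (from v2): OR the immediate into the upper 32 bits.
fn hor64(dst: u64, imm: u32) -> u64 {
    dst | (imm as u64).wrapping_shl(32)
}

/// `sub64 dst, imm` (from v2): the sign extended immediate is the minuend
/// and the register is the subtrahend.
fn sub64_imm_from_v2(dst: u64, imm: u32) -> u64 {
    (imm as i32 as i64 as u64).wrapping_sub(dst)
}
```

For example, `hor64(0xDEAD_BEEF, 0x1234_5678)` keeps the lower half and fills in the upper half, and `sub64_imm_from_v2(3, 10)` computes `10 - 3`.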

### Product / Quotient / Remainder
| bit index | when `0` | when `1`
@@ -165,7 +183,7 @@ The following Rust equivalents assume that:

| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
| ------------------ | ----------- | ------------------ | ---------------
| `06` to `2E` | | -- reserved --
| `06` to `2E` | all | -- reserved --
| `36` / `00110110` | from v2 | `uhmul64 dst, imm` | `dst = (dst as u128).wrapping_mul(imm as u128).wrapping_shr(64) as u64`
| `3E` / `00111110` | from v2 | `uhmul64 dst, src` | `dst = (dst as u128).wrapping_mul(src as u128).wrapping_shr(64) as u64`
| `46` / `01000110` | from v2 | `udiv32 dst, imm` | `dst = ((dst as u32) / imm) as u64`
@@ -180,7 +198,7 @@ The following Rust equivalents assume that:
| `8E` / `10001110` | from v2 | `lmul32 dst, src` | `dst = (dst as i32).wrapping_mul(src as i32) as u32 as u64`
| `96` / `10010110` | from v2 | `lmul64 dst, imm` | `dst = dst.wrapping_mul(imm as u64)`
| `9E` / `10011110` | from v2 | `lmul64 dst, src` | `dst = dst.wrapping_mul(src)`
| `A6` to `AE` | | -- reserved --
| `A6` to `AE` | all | -- reserved --
| `B6` / `10110110` | from v2 | `shmul64 dst, imm` | `dst = (dst as i128).wrapping_mul(imm as i32 as i128).wrapping_shr(64) as i64 as u64`
| `BE` / `10111110` | from v2 | `shmul64 dst, src` | `dst = (dst as i128).wrapping_mul(src as i64 as i128).wrapping_shr(64) as i64 as u64`
| `C6` / `11000110` | from v2 | `sdiv32 dst, imm` | `dst = ((dst as i32) / (imm as i32)) as u32 as u64`
@@ -192,7 +210,7 @@ The following Rust equivalents assume that:
| `F6` / `11110110` | from v2 | `srem64 dst, imm` | `dst = ((dst as i64) % (imm as i64)) as u64`
| `FE` / `11111110` | from v2 | `srem64 dst, src` | `dst = ((dst as i64) % (src as i64)) as u64`
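
The split between the low and high halves of a product can be sketched as below. The function names are hypothetical. Returning `None` on a zero divisor is an assumption of this sketch, made so that the example cannot abort; the spec handles that case through its own panic rules.

```rust
/// `uhmul64 dst, src`: upper 64 bits of the unsigned 128 bit product.
fn uhmul64(dst: u64, src: u64) -> u64 {
    (dst as u128).wrapping_mul(src as u128).wrapping_shr(64) as u64
}

/// `lmul64 dst, src`: lower 64 bits of the product.
fn lmul64(dst: u64, src: u64) -> u64 {
    dst.wrapping_mul(src)
}

/// `udiv32 dst, imm`: unsigned 32 bit division, zero extended.
/// Assumption: a zero divisor is reported as `None` in this sketch.
fn udiv32_imm(dst: u64, imm: u32) -> Option<u64> {
    (dst as u32).checked_div(imm).map(|quotient| quotient as u64)
}
```

Multiplying `u64::MAX` by `2` overflows 64 bits by exactly one bit, so `uhmul64` returns `1` and `lmul64` returns `u64::MAX - 1`.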

### Memory
### Deprecated Memory Load and Store

#### Panics
- Out of bounds: When the memory location is not mapped.
@@ -201,36 +219,36 @@ The following Rust equivalents assume that:
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
| ------------------ | ----------- | ------------------ | ---------------
| `00` / `00000000` | until v2 | `lddw dst, imm` | `dst = dst.or((imm as u64).wrapping_shl(32))`
| `08` to `10` | | -- reserved --
| `08` to `10` | all | -- reserved --
| `18` / `00011000` | until v2 | `lddw dst, imm` | `dst = imm as u64`
| `20` to `F8` | | -- reserved --
| `20` to `F8` | all | -- reserved --

| opcode (hex / bin) | feature set | assembler mnemonic
| ------------------ | ----------- | ------------------
| `01` to `59` | | -- reserved --
| `61` / `01100001` | all | `ldxw dst, [src + off]`
| `69` / `01101001` | all | `ldxh dst, [src + off]`
| `71` / `01110001` | all | `ldxb dst, [src + off]`
| `79` / `01111001` | all | `ldxdw dst, [src + off]`
| `81` to `F9` | | -- reserved --
| `01` to `59` | all | -- reserved --
| `61` / `01100001` | until v2 | `ldxw dst, [src + off]`
| `69` / `01101001` | until v2 | `ldxh dst, [src + off]`
| `71` / `01110001` | until v2 | `ldxb dst, [src + off]`
| `79` / `01111001` | until v2 | `ldxdw dst, [src + off]`
| `81` to `F9` | all | -- reserved --

| opcode (hex / bin) | feature set | assembler mnemonic
| ------------------ | ----------- | ------------------
| `02` to `5A` | | -- reserved --
| `62` / `01100010` | all | `stw [dst + off], imm`
| `6A` / `01101010` | all | `sth [dst + off], imm`
| `72` / `01110010` | all | `stb [dst + off], imm`
| `7A` / `01111010` | all | `stdw [dst + off], imm`
| `82` to `FA` | | -- reserved --
| `02` to `5A` | all | -- reserved --
| `62` / `01100010` | until v2 | `stw [dst + off], imm`
| `6A` / `01101010` | until v2 | `sth [dst + off], imm`
| `72` / `01110010` | until v2 | `stb [dst + off], imm`
| `7A` / `01111010` | until v2 | `stdw [dst + off], imm`
| `82` to `FA` | all | -- reserved --

| opcode (hex / bin) | feature set | assembler mnemonic
| ------------------ | ----------- | ------------------
| `03` to `5B` | | -- reserved --
| `63` / `01100011` | all | `stxw [dst + off], src`
| `6B` / `01101011` | all | `stxh [dst + off], src`
| `73` / `01110011` | all | `stxb [dst + off], src`
| `7B` / `01111011` | all | `stxdw [dst + off], src`
| `83` to `FB` | | -- reserved --
| `03` to `5B` | all | -- reserved --
| `63` / `01100011` | until v2 | `stxw [dst + off], src`
| `6B` / `01101011` | until v2 | `stxh [dst + off], src`
| `73` / `01110011` | until v2 | `stxb [dst + off], src`
| `7B` / `01111011` | until v2 | `stxdw [dst + off], src`
| `83` to `FB` | all | -- reserved --

### Control Flow

@@ -239,24 +257,24 @@ Except that the target location of `callx` is the src register, thus runtime dyn

Call instructions (`call` and `callx` but not `syscall`) do:
- Save the registers `r6`, `r7`, `r8`, `r9`, the frame pointer `r10` and the `pc` (pointing at the next instruction)
- If v1: Add one stack frame size to the frame pointer `r10`
- If ≥ v2: Move the stack pointer `r11` into the frame pointer `r10`
- If < v1: Add one stack frame size to the frame pointer `r10`
- If ≥ v1: Move the stack pointer `r11` into the frame pointer `r10`

The `exit` (a.k.a. return) instruction does:
- Restore the registers `r6`, `r7`, `r8`, `r9`, the frame pointer `r10` and the `pc`
- Or gracefully terminate the program if there is no stack frame to restore
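
The save and restore steps above can be sketched as a small call stack. Everything named here (`Vm`, `CallFrame`, `STACK_FRAME_SIZE`, `max_depth`, `dynamic_frames`) is hypothetical and chosen for illustration; the frame size value, counting `pc` in instruction slots, and selecting the frame pointer rule with a boolean flag are assumptions of this sketch, not statements about the actual virtual machine.

```rust
/// Hypothetical frame size, for illustration only.
const STACK_FRAME_SIZE: u64 = 4096;

/// State saved by a call instruction.
struct CallFrame {
    saved: [u64; 4],    // r6, r7, r8, r9
    frame_pointer: u64, // r10
    return_pc: u64,     // pc of the next instruction
}

struct Vm {
    reg: [u64; 12], // r0 to r11
    pc: u64,
    frames: Vec<CallFrame>,
    max_depth: usize,     // assumed limit on nested calls
    dynamic_frames: bool, // true: r10 = r11, false: r10 += frame size
}

impl Vm {
    /// Models `call` / `callx`: save state, adjust `r10`, then jump.
    fn call(&mut self, target: u64) -> Result<(), &'static str> {
        if self.frames.len() >= self.max_depth {
            return Err("stack overflow");
        }
        self.frames.push(CallFrame {
            saved: [self.reg[6], self.reg[7], self.reg[8], self.reg[9]],
            frame_pointer: self.reg[10],
            return_pc: self.pc + 1,
        });
        if self.dynamic_frames {
            self.reg[10] = self.reg[11];
        } else {
            self.reg[10] = self.reg[10].wrapping_add(STACK_FRAME_SIZE);
        }
        self.pc = target;
        Ok(())
    }

    /// Models `exit`: restore the saved state, or return `false` when
    /// there is no frame left and the program terminates.
    fn exit(&mut self) -> bool {
        match self.frames.pop() {
            Some(frame) => {
                self.reg[6..10].copy_from_slice(&frame.saved);
                self.reg[10] = frame.frame_pointer;
                self.pc = frame.return_pc;
                true
            }
            None => false,
        }
    }
}
```

Note that `exit` does not restore `r11`; the list above only mentions `r6` to `r10` and the `pc`.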

#### Panics
- Out of bounds: When the target location is outside the bytecode if ≤ v1.
- Out of bounds: When the target location is outside the current function if ≥ v2 and a jump.
- Out of bounds: When the target location is not a registered function if ≥ v2 and a call.
- Out of bounds: When the target location is outside the bytecode if < v3.
- Out of bounds: When the target location is outside the current function if ≥ v3 and a jump.
- Out of bounds: When the target location is not a registered function if ≥ v3 and a call.
- Second slot of `lddw`: When the target location has opcode `0x00`.
- Stack overflow: When one too many nested calls happen.

| opcode (hex / bin) | feature set | assembler mnemonic | condition Rust equivalent
| ------------------ | ----------- | -------------------- | -------------------------
| `05` / `00000101` | all | `ja off` | `true`
| `0D` / `00001101` | | -- reserved --
| `0D` / `00001101` | all | -- reserved --
| `15` / `00010101` | all | `jeq dst, imm, off` | `dst == (imm as i32 as i64 as u64)`
| `1D` / `00011101` | all | `jeq dst, src, off` | `dst == src`
| `25` / `00100101` | all | `jgt dst, imm, off` | `dst > (imm as i32 as i64 as u64)`
@@ -271,13 +289,14 @@ The `exit` (a.k.a. return) instruction does:
| `6D` / `01101101` | all | `jsgt dst, src, off` | `(dst as i64) > (src as i64)`
| `75` / `01110101` | all | `jsge dst, imm, off` | `(dst as i64) >= (imm as i32 as i64)`
| `7D` / `01111101` | all | `jsge dst, src, off` | `(dst as i64) >= (src as i64)`
| `85` / `10000101` | until v2 | `call off`
| `85` / `10000101` | from v2 | `syscall src=0, off`
| `85` / `10000101` | from v2 | `call src=1, off`
| `85` / `10000101` | until v3 | `call imm` or `syscall imm`
| `85` / `10000101` | from v3 | `call off`
| `8D` / `10001101` | until v2 | `callx imm`
| `8D` / `10001101` | from v2 | `callx src`
| `95` / `10010101` | all | `exit`
| `9D` / `10011101` | | -- reserved --
| `95` / `10010101` | until v3 | `exit` or `return`
| `95` / `10010101` | from v3 | `syscall imm`
| `9D` / `10011101` | until v3 | -- reserved --
| `9D` / `10011101` | from v3 | `exit` or `return`
| `A5` / `10100101` | all | `jlt dst, imm, off` | `dst < imm as i32 as i64 as u64`
| `AD` / `10101101` | all | `jlt dst, src, off` | `dst < src`
| `B5` / `10110101` | all | `jle dst, imm, off` | `dst <= imm as i32 as i64 as u64`
@@ -286,7 +305,7 @@ The `exit` (a.k.a. return) instruction does:
| `CD` / `11001101` | all | `jslt dst, src, off` | `(dst as i64) < (src as i64)`
| `D5` / `11010101` | all | `jsle dst, imm, off` | `(dst as i64) <= (imm as i32 as i64)`
| `DD` / `11011101` | all | `jsle dst, src, off` | `(dst as i64) <= (src as i64)`
| `E5` to `FD` | | -- reserved --
| `E5` to `FD` | all | -- reserved --
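
Because the 32 bit immediate is sign extended before every comparison, unsigned and signed jumps can disagree for the same operands. A minimal sketch of three conditions from the table, with hypothetical function names:

```rust
/// Condition of `jeq dst, imm, off`.
fn jeq_imm(dst: u64, imm: u32) -> bool {
    dst == (imm as i32 as i64 as u64)
}

/// Condition of `jgt dst, imm, off` (unsigned comparison).
fn jgt_imm(dst: u64, imm: u32) -> bool {
    dst > (imm as i32 as i64 as u64)
}

/// Condition of `jsge dst, imm, off` (signed comparison).
fn jsge_imm(dst: u64, imm: u32) -> bool {
    (dst as i64) >= (imm as i32 as i64)
}
```

An immediate of `0xFFFF_FFFF` extends to all ones, so it equals a register holding `u64::MAX` but not one holding `0xFFFF_FFFF`. Compared against it, a register holding `1` is not greater in the unsigned sense, yet it is greater or equal in the signed sense, where the immediate reads as `-1`.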


Verification
Expand All @@ -307,23 +326,34 @@ Verification
- For all instructions the opcode must be valid
- Memory write instructions can use `r10` as destination register

### until v1
- No instruction can use `r11` as destination register

### from v1
- `add64 reg, imm` can use `r11` as destination register

### until v2
- Opcodes from the product / quotient / remainder instruction class are forbidden
- `neg32` and `neg64` are allowed
- `le` is allowed
- `lddw` (opcodes `0x18` and `0x00`) are allowed
- `hor64` is forbidden
- `callx` source register is encoded in the imm field
- The targets of `call` instructions are checked at runtime, not at verification time
- The offset of jump instructions must be limited to the range of the bytecode

### from v2
- Every function must end in a `ja` or `exit` instruction
- `lddw` (opcodes `0x18` and `0x00`) are forbidden
- `neg32` and `neg64` are forbidden
- Opcodes from the product / quotient / remainder instruction class are allowed
- `neg32` and `neg64` are forbidden
- `le` is forbidden
- `lddw` (opcodes `0x18` and `0x00`) are forbidden
- `hor64` is allowed
- The offset of jump instructions must be limited to the range of the current function
- `callx` source register is encoded in the src field
- The targets of internal calls (`call` instructions with src ≠ 0) must have been registered at verification time
- The targets of syscalls (`call` instructions with src = 0) must have been registered at verification time
- `add64 reg, imm` can use `r11` as destination register

### until v3
- The targets of `call` instructions (which include `syscall` instructions) are checked at runtime, not at verification time
- The offset of jump instructions must be limited to the range of the bytecode

### from v3
- Every function must end in a `ja` or `exit` instruction
- The targets of `call` instructions must have been registered at verification time
- The targets of `syscall` instructions must have been registered at verification time
- The offset of jump instructions must be limited to the range of the current function
