This repository was archived by the owner on Jan 10, 2025. It is now read-only.

Commit 0442846

Updates ISA spec accordingly.
1 parent b48c04d commit 0442846

1 file changed

+82
-52
lines changed

doc/bytecode.md

@@ -19,7 +19,7 @@ All of them are 64 bit wide.
1919
| `r8` | all | GPR | Call-preserved
2020
| `r9` | all | GPR | Call-preserved
2121
| `r10` | all | Frame pointer | System register
22-
| `r11` | from v2 | Stack pointer | System register
22+
| `r11` | from v1 | Stack pointer | System register
2323
| `pc` | all | Program counter | Hidden register
2424

2525

@@ -78,7 +78,7 @@ The following Rust equivalents assume that:
7878
- `imm` is `u32`
7979
- `off` is `u16`
8080

81-
### 32 bit Arithmetic and Logic
81+
### Memory Load or 32 bit Arithmetic and Logic
8282
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
8383
| ------------------ | ----------- | ---------------------- | ---------------
8484
| `04` / `00000100` | until v2 | `add32 dst, imm` | `dst = (dst as u32).wrapping_add(imm) as i32 as i64 as u64`
@@ -90,9 +90,13 @@ The following Rust equivalents assume that:
9090
| `1C` / `00011100` | until v2 | `sub32 dst, src` | `dst = (dst as u32).wrapping_sub(src as u32) as i32 as i64 as u64`
9191
| `1C` / `00011100` | from v2 | `sub32 dst, src` | `dst = (dst as u32).wrapping_sub(src as u32) as u64`
9292
| `24` / `00100100` | until v2 | `mul32 dst, imm` | `dst = (dst as i32).wrapping_mul(imm as i32) as i64 as u64`
93+
| `24` / `00100100` | from v2 | -- reserved --
9394
| `2C` / `00101100` | until v2 | `mul32 dst, src` | `dst = (dst as i32).wrapping_mul(src as i32) as i64 as u64`
95+
| `2C` / `00101100` | from v2 | `ldxb dst, [src + off]`
9496
| `34` / `00110100` | until v2 | `div32 dst, imm` | `dst = ((dst as u32) / imm) as u64`
97+
| `34` / `00110100` | from v2 | -- reserved --
9598
| `3C` / `00111100` | until v2 | `div32 dst, src` | `dst = ((dst as u32) / (src as u32)) as u64`
99+
| `3C` / `00111100` | from v2 | `ldxh dst, [src + off]`
96100
| `44` / `01000100` | all | `or32 dst, imm` | `dst = (dst as u32).or(imm) as u64`
97101
| `4C` / `01001100` | all | `or32 dst, src` | `dst = (dst as u32).or(src as u32) as u64`
98102
| `54` / `01010100` | all | `and32 dst, imm` | `dst = (dst as u32).and(imm) as u64`
@@ -102,9 +106,13 @@ The following Rust equivalents assume that:
102106
| `74` / `01110100` | all | `rsh32 dst, imm` | `dst = (dst as u32).wrapping_shr(imm) as u64`
103107
| `7C` / `01111100` | all | `rsh32 dst, src` | `dst = (dst as u32).wrapping_shr(src as u32) as u64`
104108
| `84` / `10000100` | until v2 | `neg32 dst` | `dst = (dst as i32).wrapping_neg() as u32 as u64`
105-
| `8C` / `10001100` | | -- reserved --
109+
| `84` / `10000100` | from v2 | -- reserved --
110+
| `8C` / `10001100` | until v2 | -- reserved --
111+
| `8C` / `10001100` | from v2     | `ldxw dst, [src + off]`
106112
| `94` / `10010100` | until v2 | `mod32 dst, imm` | `dst = ((dst as u32) % imm) as u64`
113+
| `94` / `10010100` | from v2 | -- reserved --
107114
| `9C` / `10011100` | until v2 | `mod32 dst, src` | `dst = ((dst as u32) % (src as u32)) as u64`
115+
| `9C` / `10011100` | from v2     | `ldxdw dst, [src + off]`
108116
| `A4` / `10100100` | all | `xor32 dst, imm` | `dst = (dst as u32).xor(imm) as u64`
109117
| `AC` / `10101100` | all | `xor32 dst, src` | `dst = (dst as u32).xor(src as u32) as u64`
110118
| `B4` / `10110100` | all | `mov32 dst, imm` | `dst = imm as i32 as i64 as u64`
@@ -113,10 +121,11 @@ The following Rust equivalents assume that:
113121
| `C4` / `11000100` | all | `ash32 dst, imm` | `dst = (dst as i32).wrapping_shr(imm) as u32 as u64`
114122
| `CC` / `11001100` | all | `ash32 dst, src` | `dst = (dst as i32).wrapping_shr(src as u32) as u32 as u64`
115123
| `D4` / `11010100` | until v2 | `le dst, imm` | `dst = dst as u32 as u64`
124+
| `D4` / `11010100` | from v2 | -- reserved --
116125
| `DC` / `11011100` | all | `be dst, imm` | `dst = match imm { 16 => (dst as u16).swap_bytes() as u64, 32 => (dst as u32).swap_bytes() as u64, 64 => dst.swap_bytes() }`
117-
| `E4` to `FC` | | -- reserved --
126+
| `E4` to `FC` | all | -- reserved --
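
The difference between the `until v2` and `from v2` rows of a 32 bit instruction is easiest to see on a wrapping result. The following is a minimal sketch that copies the two `sub32 dst, src` equivalents from the table above; the function names are illustrative and not part of the spec, and registers are assumed to be `u64`.

```rust
/// `sub32 dst, src` until v2: the 32 bit result is sign-extended to 64 bits.
fn sub32_until_v2(dst: u64, src: u64) -> u64 {
    (dst as u32).wrapping_sub(src as u32) as i32 as i64 as u64
}

/// `sub32 dst, src` from v2: the 32 bit result is zero-extended to 64 bits.
fn sub32_from_v2(dst: u64, src: u64) -> u64 {
    (dst as u32).wrapping_sub(src as u32) as u64
}
```

With `dst = 0` and `src = 1` the 32 bit subtraction wraps to `0xFFFF_FFFF`, so the first function returns `0xFFFF_FFFF_FFFF_FFFF` and the second returns `0x0000_0000_FFFF_FFFF`. When the result does not have bit 31 set, both agree.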
118127

119-
### 64 bit Arithmetic and Logic
128+
### Memory Store or 64 bit Arithmetic and Logic
120129
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
121130
| ------------------ | ----------- | ------------------ | ---------------
122131
| `07` / `00000111` | all | `add64 dst, imm` | `dst = dst.wrapping_add(imm as i32 as i64 as u64)`
@@ -125,9 +134,13 @@ The following Rust equivalents assume that:
125134
| `17` / `00010111` | from v2 | `sub64 dst, imm` | `dst = (imm as i32 as i64 as u64).wrapping_sub(dst)`
126135
| `1F` / `00011111` | all | `sub64 dst, src` | `dst = dst.wrapping_sub(src)`
127136
| `27` / `00100111` | until v2 | `mul64 dst, imm` | `dst = dst.wrapping_mul(imm as u64)`
137+
| `27` / `00100111` | from v2     | `stb [dst + off], imm`
128138
| `2F` / `00101111` | until v2 | `mul64 dst, src` | `dst = dst.wrapping_mul(src)`
139+
| `2F` / `00101111` | from v2     | `stxb [dst + off], src`
129140
| `37` / `00110111` | until v2 | `div64 dst, imm` | `dst = dst / (imm as u64)`
141+
| `37` / `00110111` | from v2     | `sth [dst + off], imm`
130142
| `3F` / `00111111` | until v2 | `div64 dst, src` | `dst = dst / src`
143+
| `3F` / `00111111` | from v2     | `stxh [dst + off], src`
131144
| `47` / `01000111` | all | `or64 dst, imm` | `dst = dst.or(imm)`
132145
| `4F` / `01001111` | all | `or64 dst, src` | `dst = dst.or(src)`
133146
| `57` / `01010111` | all | `and64 dst, imm` | `dst = dst.and(imm)`
@@ -137,18 +150,23 @@ The following Rust equivalents assume that:
137150
| `77` / `01110111` | all | `rsh64 dst, imm` | `dst = dst.wrapping_shr(imm)`
138151
| `7F` / `01111111` | all | `rsh64 dst, src` | `dst = dst.wrapping_shr(src as u32)`
139152
| `87` / `10000111` | until v2 | `neg64 dst` | `dst = (dst as i64).wrapping_neg() as u64`
140-
| `8F` / `10001111` | | -- reserved --
153+
| `87` / `10000111` | from v2     | `stw [dst + off], imm`
154+
| `8F` / `10001111` | until v2    | -- reserved --
155+
| `8F` / `10001111` | from v2     | `stxw [dst + off], src`
141156
| `97` / `10010111` | until v2 | `mod64 dst, imm` | `dst = dst % (imm as u64)`
157+
| `97` / `10010111` | from v2     | `stdw [dst + off], imm`
142158
| `9F` / `10011111` | until v2 | `mod64 dst, src` | `dst = dst % src`
159+
| `9F` / `10011111` | from v2     | `stxdw [dst + off], src`
143160
| `A7` / `10100111` | all | `xor64 dst, imm` | `dst = dst.xor(imm)`
144161
| `AF` / `10101111` | all | `xor64 dst, src` | `dst = dst.xor(src)`
145162
| `B7` / `10110111` | all | `mov64 dst, imm` | `dst = imm as u64`
146163
| `BF` / `10111111` | all | `mov64 dst, src` | `dst = src`
147164
| `C7` / `11000111` | all | `ash64 dst, imm` | `dst = (dst as i64).wrapping_shr(imm)`
148165
| `CF` / `11001111` | all | `ash64 dst, src` | `dst = (dst as i64).wrapping_shr(src as u32)`
149-
| `D7` to `EF` | | -- reserved --
166+
| `D7` to `EF` | all | -- reserved --
167+
| `F7` / `11110111` | until v2 | -- reserved --
150168
| `F7` / `11110111` | from v2 | `hor64 dst, imm` | `dst = dst.or((imm as u64).wrapping_shl(32))`
151-
| `FF` / `11111111` | | -- reserved --
169+
| `FF` / `11111111` | all | -- reserved --
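
As a small illustration of the `hor64` row, the sketch below writes its Rust equivalent as a standalone function and places it next to the two `lddw` slots listed in the deprecated memory table (opcodes `18` and `00`). The function names are hypothetical, and `|` is used where the table writes `.or(...)`.

```rust
/// `hor64 dst, imm` (from v2): ORs `imm` into the upper 32 bits of `dst`.
fn hor64(dst: u64, imm: u32) -> u64 {
    dst | (imm as u64).wrapping_shl(32)
}

/// First slot of `lddw` (opcode `18`, until v2): loads the lower 32 bits.
fn lddw_first_slot(imm: u32) -> u64 {
    imm as u64
}

/// Second slot of `lddw` (opcode `00`, until v2): same expression as `hor64`.
fn lddw_second_slot(dst: u64, imm: u32) -> u64 {
    dst | (imm as u64).wrapping_shl(32)
}
```

For example, `hor64(0x0000_0000_DEAD_BEEF, 0x1234_5678)` yields `0x1234_5678_DEAD_BEEF`, which is the same value produced by applying `lddw_second_slot` to the output of `lddw_first_slot(0xDEAD_BEEF)`.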
152170

153171
### Product / Quotient / Remainder
154172
| bit index | when `0` | when `1`
@@ -165,7 +183,7 @@ The following Rust equivalents assume that:
165183

166184
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
167185
| ------------------ | ----------- | ------------------ | ---------------
168-
| `06` to `2E` | | -- reserved --
186+
| `06` to `2E` | all | -- reserved --
169187
| `36` / `00110110` | from v2 | `uhmul64 dst, imm` | `dst = (dst as u128).wrapping_mul(imm as u128).wrapping_shr(64) as u64`
170188
| `3E` / `00111110` | from v2 | `uhmul64 dst, src` | `dst = (dst as u128).wrapping_mul(src as u128).wrapping_shr(64) as u64`
171189
| `46` / `01000110` | from v2 | `udiv32 dst, imm` | `dst = ((dst as u32) / imm) as u64`
@@ -180,7 +198,7 @@ The following Rust equivalents assume that:
180198
| `8E` / `10001110` | from v2 | `lmul32 dst, src` | `dst = (dst as i32).wrapping_mul(src as i32) as u32 as u64`
181199
| `96` / `10010110` | from v2 | `lmul64 dst, imm` | `dst = dst.wrapping_mul(imm as u64)`
182200
| `9E` / `10011110` | from v2 | `lmul64 dst, src` | `dst = dst.wrapping_mul(src)`
183-
| `A6` to `AE` | | -- reserved --
201+
| `A6` to `AE` | all | -- reserved --
184202
| `B6` / `10110110` | from v2 | `shmul64 dst, imm` | `dst = (dst as i128).wrapping_mul(imm as i32 as i128).wrapping_shr(64) as i64 as u64`
185203
| `BE` / `10111110` | from v2 | `shmul64 dst, src` | `dst = (dst as i128).wrapping_mul(src as i64 as i128).wrapping_shr(64) as i64 as u64`
186204
| `C6` / `11000110` | from v2 | `sdiv32 dst, imm` | `dst = ((dst as i32) / (imm as i32)) as u32 as u64`
@@ -192,7 +210,7 @@ The following Rust equivalents assume that:
192210
| `F6` / `11110110` | from v2 | `srem64 dst, imm` | `dst = ((dst as i64) % (imm as i64)) as u64`
193211
| `FE` / `11111110` | from v2 | `srem64 dst, src` | `dst = ((dst as i64) % (src as i64)) as u64`
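
To show how the high and low multiply instructions relate, here is a minimal sketch that copies the `uhmul64 dst, src` and `lmul64 dst, src` equivalents from the table into functions. The function names are illustrative only, and registers are assumed to be `u64`.

```rust
/// `uhmul64 dst, src`: upper 64 bits of the unsigned 128 bit product.
fn uhmul64(dst: u64, src: u64) -> u64 {
    (dst as u128).wrapping_mul(src as u128).wrapping_shr(64) as u64
}

/// `lmul64 dst, src`: lower 64 bits of the product.
fn lmul64(dst: u64, src: u64) -> u64 {
    dst.wrapping_mul(src)
}

/// Recombines both halves into the full 128 bit unsigned product.
fn full_product(dst: u64, src: u64) -> u128 {
    ((uhmul64(dst, src) as u128) << 64) | (lmul64(dst, src) as u128)
}
```

For `dst = u64::MAX` and `src = 2`, `uhmul64` returns `1` and `lmul64` returns `0xFFFF_FFFF_FFFF_FFFE`; together they form the full product `(u64::MAX as u128) * 2`.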
194212

195-
### Memory
213+
### Deprecated Memory Load and Store
196214

197215
#### Panics
198216
- Out of bounds: When the memory location is not mapped.
@@ -201,36 +219,36 @@ The following Rust equivalents assume that:
201219
| opcode (hex / bin) | feature set | assembler mnemonic | Rust equivalent
202220
| ------------------ | ----------- | ------------------ | ---------------
203221
| `00` / `00000000` | until v2 | `lddw dst, imm` | `dst = dst.or((imm as u64).wrapping_shl(32))`
204-
| `08` to `10` | | -- reserved --
222+
| `08` to `10` | all | -- reserved --
205223
| `18` / `00011000` | until v2 | `lddw dst, imm` | `dst = imm as u64`
206-
| `20` to `F8` | | -- reserved --
224+
| `20` to `F8` | all | -- reserved --
207225

208226
| opcode (hex / bin) | feature set | assembler mnemonic
209227
| ------------------ | ----------- | ------------------
210-
| `01` to `59` | | -- reserved --
211-
| `61` / `01100001` | all | `ldxw dst, [src + off]`
212-
| `69` / `01101001` | all | `ldxh dst, [src + off]`
213-
| `71` / `01110001` | all | `ldxb dst, [src + off]`
214-
| `79` / `01111001` | all | `ldxdw dst, [src + off]`
215-
| `81` to `F9` | | -- reserved --
228+
| `01` to `59` | all | -- reserved --
229+
| `61` / `01100001` | until v2 | `ldxw dst, [src + off]`
230+
| `69` / `01101001` | until v2 | `ldxh dst, [src + off]`
231+
| `71` / `01110001` | until v2 | `ldxb dst, [src + off]`
232+
| `79` / `01111001` | until v2 | `ldxdw dst, [src + off]`
233+
| `81` to `F9` | all | -- reserved --
216234

217235
| opcode (hex / bin) | feature set | assembler mnemonic
218236
| ------------------ | ----------- | ------------------
219-
| `02` to `5A` | | -- reserved --
220-
| `62` / `01100010` | all | `stw [dst + off], imm`
221-
| `6A` / `01101010` | all | `sth [dst + off], imm`
222-
| `72` / `01110010` | all | `stb [dst + off], imm`
223-
| `7A` / `01111010` | all | `stdw [dst + off], imm`
224-
| `82` to `FA` | | -- reserved --
237+
| `02` to `5A` | all | -- reserved --
238+
| `62` / `01100010` | until v2 | `stw [dst + off], imm`
239+
| `6A` / `01101010` | until v2 | `sth [dst + off], imm`
240+
| `72` / `01110010` | until v2 | `stb [dst + off], imm`
241+
| `7A` / `01111010` | until v2 | `stdw [dst + off], imm`
242+
| `82` to `FA` | all | -- reserved --
225243

226244
| opcode (hex / bin) | feature set | assembler mnemonic
227245
| ------------------ | ----------- | ------------------
228-
| `03` to `5B` | | -- reserved --
229-
| `63` / `01100011` | all | `stxw [dst + off], src`
230-
| `6B` / `01101011` | all | `stxh [dst + off], src`
231-
| `73` / `01110011` | all | `stxb [dst + off], src`
232-
| `7B` / `01111011` | all | `stxdw [dst + off], src`
233-
| `83` to `FB` | | -- reserved --
246+
| `03` to `5B` | all | -- reserved --
247+
| `63` / `01100011` | until v2 | `stxw [dst + off], src`
248+
| `6B` / `01101011` | until v2 | `stxh [dst + off], src`
249+
| `73` / `01110011` | until v2 | `stxb [dst + off], src`
250+
| `7B` / `01111011` | until v2 | `stxdw [dst + off], src`
251+
| `83` to `FB` | all | -- reserved --
234252

235253
### Control Flow
236254

@@ -239,24 +257,24 @@ Except that the target location of `callx` is the src register, thus runtime dyn
239257

240258
Call instructions (`call` and `callx` but not `syscall`) do:
241259
- Save the registers `r6`, `r7`, `r8`, `r9`, the frame pointer `r10` and the `pc` (pointing at the next instruction)
242-
- If v1: Add one stack frame size to the frame pointer `r10`
243-
- If ≥ v2: Move the stack pointer `r11` into the frame pointer `r10`
260+
- If < v1: Add one stack frame size to the frame pointer `r10`
261+
- If ≥ v1: Move the stack pointer `r11` into the frame pointer `r10`
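
The frame pointer update described in the last two bullets can be sketched as follows. This is not the VM's actual code: the `Version` enum, the function name, and the `stack_frame_size` parameter are assumptions made for illustration, and wrapping addition is chosen only to keep the sketch panic-free.

```rust
/// Hypothetical feature set versions, ordered so that `<` and `>=` work.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Version {
    V0,
    V1,
    V2,
    V3,
}

/// Returns the new value of the frame pointer `r10` after a call.
fn frame_pointer_after_call(version: Version, r10: u64, r11: u64, stack_frame_size: u64) -> u64 {
    if version < Version::V1 {
        // If < v1: add one stack frame size to the frame pointer.
        r10.wrapping_add(stack_frame_size)
    } else {
        // If >= v1: move the stack pointer `r11` into the frame pointer.
        r11
    }
}
```

With `r10 = 0x1000`, `r11 = 0x2000` and a frame size of `0x100`, the sketch returns `0x1100` before v1 and `0x2000` from v1 onwards.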
244262

245263
The `exit` (a.k.a. return) instruction does:
246264
- Restore the registers `r6`, `r7`, `r8`, `r9`, the frame pointer `r10` and the `pc`
247265
- Or gracefully terminate the program if there is no stack frame to restore
248266

249267
#### Panics
250-
- Out of bounds: When the target location is outside the bytecode if ≤ v1.
251-
- Out of bounds: When the target location is outside the current function if ≥ v2 and a jump.
252-
- Out of bounds: When the target location is not a registered function if ≥ v2 and a call.
268+
- Out of bounds: When the target location is outside the bytecode if < v3.
269+
- Out of bounds: When the target location is outside the current function if ≥ v3 and a jump.
270+
- Out of bounds: When the target location is not a registered function if ≥ v3 and a call.
253271
- Second slot of `lddw`: When the target location has opcode `0x00`.
254272
- Stack overflow: When one too many nested calls happen.
255273

256274
| opcode (hex / bin) | feature set | assembler mnemonic | condition Rust equivalent
257275
| ------------------ | ----------- | -------------------- | -------------------------
258276
| `05` / `00000101` | all | `ja off` | `true`
259-
| `0D` / `00001101` | | -- reserved --
277+
| `0D` / `00001101` | all | -- reserved --
260278
| `15` / `00010101` | all | `jeq dst, imm, off` | `dst == (imm as i32 as i64 as u64)`
261279
| `1D` / `00011101` | all | `jeq dst, src, off` | `dst == src`
262280
| `25` / `00100101` | all | `jgt dst, imm, off` | `dst > (imm as i32 as i64 as u64)`
@@ -271,13 +289,14 @@ The `exit` (a.k.a. return) instruction does:
271289
| `6D` / `01101101` | all | `jsgt dst, src, off` | `(dst as i64) > (src as i64)`
272290
| `75` / `01110101` | all | `jsge dst, imm, off` | `(dst as i64) >= (imm as i32 as i64)`
273291
| `7D` / `01111101` | all | `jsge dst, src, off` | `(dst as i64) >= (src as i64)`
274-
| `85` / `10000101` | until v2 | `call off`
275-
| `85` / `10000101` | from v2 | `syscall src=0, off`
276-
| `85` / `10000101` | from v2 | `call src=1, off`
292+
| `85` / `10000101` | until v3 | `call imm` or `syscall imm`
293+
| `85` / `10000101` | from v3 | `call off`
277294
| `8D` / `10001101` | until v2 | `callx imm`
278295
| `8D` / `10001101` | from v2 | `callx src`
279-
| `95` / `10010101` | all | `exit`
280-
| `9D` / `10011101` | | -- reserved --
296+
| `95` / `10010101` | until v3 | `exit` or `return`
297+
| `95` / `10010101` | from v3 | `syscall imm`
298+
| `9D` / `10011101` | until v3 | -- reserved --
299+
| `9D` / `10011101` | from v3 | `exit` or `return`
281300
| `A5` / `10100101` | all | `jlt dst, imm, off` | `dst < imm as i32 as i64 as u64`
282301
| `AD` / `10101101` | all | `jlt dst, src, off` | `dst < src`
283302
| `B5` / `10110101` | all | `jle dst, imm, off` | `dst <= imm as i32 as i64 as u64`
@@ -286,7 +305,7 @@ The `exit` (a.k.a. return) instruction does:
286305
| `CD` / `11001101` | all | `jslt dst, src, off` | `(dst as i64) < (src as i64)`
287306
| `D5` / `11010101` | all | `jsle dst, imm, off` | `(dst as i64) <= (imm as i32 as i64)`
288307
| `DD` / `11011101` | all | `jsle dst, src, off` | `(dst as i64) <= (src as i64)`
289-
| `E5` to `FD` | | -- reserved --
308+
| `E5` to `FD` | all | -- reserved --
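
Because opcodes `85`, `95` and `9D` change meaning at v3, a decoder has to consult the feature set before choosing a mnemonic. The sketch below covers only those three rows of the table; the `Version` and `Decoded` types and the function name are hypothetical, and it assumes that "until v3" means strictly below v3, as the `< v3` wording in the panics list suggests.

```rust
/// Hypothetical feature set versions, ordered so that `>=` works.
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Version {
    V0,
    V1,
    V2,
    V3,
}

#[derive(Debug, PartialEq, Eq)]
enum Decoded {
    Mnemonic(&'static str),
    Reserved,
    /// Opcode is outside the three rows this sketch handles.
    NotCovered,
}

fn decode_control_flow(opcode: u8, version: Version) -> Decoded {
    let from_v3 = version >= Version::V3;
    match (opcode, from_v3) {
        (0x85, false) => Decoded::Mnemonic("call imm / syscall imm"),
        (0x85, true) => Decoded::Mnemonic("call off"),
        (0x95, false) => Decoded::Mnemonic("exit"),
        (0x95, true) => Decoded::Mnemonic("syscall imm"),
        (0x9D, false) => Decoded::Reserved,
        (0x9D, true) => Decoded::Mnemonic("exit"),
        _ => Decoded::NotCovered,
    }
}
```

Under these assumptions, opcode `95` decodes as `exit` for v2 but as `syscall imm` for v3, while `9D` is reserved for v2 and becomes `exit` for v3.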
290309

291310

292311
Verification
@@ -307,23 +326,34 @@ Verification
307326
- For all instructions the opcode must be valid
308327
- Memory write instructions can use `r10` as destination register
309328

329+
### until v1
330+
- No instruction can use `r11` as destination register
331+
332+
### from v1
333+
- `add64 reg, imm` can use `r11` as destination register
334+
310335
### until v2
311336
- Opcodes from the product / quotient / remainder instruction class are forbidden
337+
- `neg32` and `neg64` are allowed
312338
- `le` is allowed
339+
- `lddw` (opcodes `0x18` and `0x00`) is allowed
313340
- `hor64` is forbidden
314341
- `callx` source register is encoded in the imm field
315-
- The targets of `call` instructions is checked at runtime not verification time
316-
- The offset of jump instructions must be limited to the range of the bytecode
317342

318343
### from v2
319-
- Every function must end in a `ja` or `exit` instruction
320-
- `lddw` (opcodes `0x18` and `0x00`) are forbidden
321-
- `neg32` and `neg64` are forbidden
322344
- Opcodes from the product / quotient / remainder instruction class are allowed
345+
- `neg32` and `neg64` are forbidden
323346
- `le` is forbidden
347+
- `lddw` (opcodes `0x18` and `0x00`) is forbidden
324348
- `hor64` is allowed
325-
- The offset of jump instructions must be limited to the range of the current function
326349
- `callx` source register is encoded in the src field
327-
- The targets of internal calls (`call` instructions with src ≠ 0) must have been registered at verification time
328-
- The targets of syscalls (`call` instructions with src = 0) must have been registered at verification time
329-
- `add64 reg, imm` can use `r11` as destination register
350+
351+
### until v3
352+
- The targets of `call` instructions (which includes `syscall` instructions) are checked at runtime not verification time
353+
- The offset of jump instructions must be limited to the range of the bytecode
354+
355+
### from v3
356+
- Every function must end in a `ja` or `exit` instruction
357+
- The targets of `call` instructions must have been registered at verification time
358+
- The targets of `syscall` instructions must have been registered at verification time
359+
- The offset of jump instructions must be limited to the range of the current function
