A collection of cute optimization tricks, mostly things I haven't had time to follow up on...
Loads from constant globals with patterns in data
(Note: LLVM currently handles some cases w/compares, but that's all I can find)
a = constant global [1,3,5,7,9]
a[i] -> 1 + i * 2 (for in-bounds i)
(and other assorted idioms)
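A minimal sketch of the table-to-arithmetic fold in C. The function names are made up for illustration, and the fold is only valid when i is known to be in bounds.

```c
/* Constant table whose entries follow the pattern a[i] = 1 + 2*i. */
static const int a[5] = {1, 3, 5, 7, 9};

/* Original form: a real load from the constant global. */
int table_load(unsigned i) {
    return a[i];
}

/* Folded form: no memory access, assuming 0 <= i < 5. */
int folded_load(unsigned i) {
    return 1 + (int)i * 2;
}
```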
Use masked loads w/pass through to eliminate selects of addresses
a = select c, a1, a2
v = load a
->
v1 = masked.load(c, a1, undef)
v = masked.load(!c, a2, v1)
(can do better if either are safe to speculate)
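A scalar model of the rewrite in C, just to show the data flow. masked_load here is a hypothetical helper that mimics the pass-through semantics (it is not an intrinsic), and 0 stands in for undef.

```c
#include <stdbool.h>

/* Hypothetical scalar model of a masked load: only touches memory
   when the mask is set, otherwise returns the pass-through value. */
static int masked_load(bool mask, const int *p, int passthru) {
    return mask ? *p : passthru;
}

/* Original form: select between addresses, then one load. */
int select_then_load(bool c, const int *a1, const int *a2) {
    const int *a = c ? a1 : a2;
    return *a;
}

/* Rewritten form: two masked loads chained through the pass-through.
   The pointer on the inactive side is never dereferenced. */
int two_masked_loads(bool c, const int *a1, const int *a2) {
    int v1 = masked_load(c, a1, 0 /* stands in for undef */);
    return masked_load(!c, a2, v1);
}
```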
trivial specialization (no cost dispatch)
foo(..., bool b) { if (b) C; else D; }
->
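foo(..., bool b) { if (b) tail C(); else tail D(); }
(C and D outlined into their own functions; a caller with a known b can call C or D directly)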
(potentially very interesting w/early return idioms)
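A sketch of the shape in C, assuming the two arms get outlined into helpers. body_c and body_d are hypothetical names with placeholder bodies; whether the calls in foo actually become tail calls is up to the compiler.

```c
#include <stdbool.h>

/* Outlined arms of the original branch (placeholder bodies). */
static int body_c(int x) { return x + 1; }
static int body_d(int x) { return x * 2; }

/* foo is now a thin dispatcher; each arm is a call in tail position. */
int foo(int x, bool b) {
    if (b)
        return body_c(x);
    else
        return body_d(x);
}

/* A caller that knows b statically can skip the dispatch entirely. */
int caller_with_known_flag(int x) {
    return body_c(x); /* same result as foo(x, true) */
}
```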
Tricks for encoding long jumps with fewer bytes using implicit fault handlers
LongJCC: # 6 bytes
    jle baz
ShortJCC: # 2 bytes
    jle bar
forward_jump_int3: # 3 bytes
    jle continue
    int3 # bad to have in instruction stream?
continue:
forward_jump_ud2: # 4 bytes
    jle continue2
    ud2 # bad to have in instruction stream?
continue2:
jcc_payload: # 4 bytes
    test %rax, %rcx
    .byte 116 # 0x74, je
fault:
    .byte 204 # 0xcc, int3 as payload of je
    jle fault # back-branch into previous jcc
and_payload: # 4 bytes
    test %rax, %rcx
    .byte 4 # 0x04, add AL, 204
fault2:
    .byte 204 # 0xcc, int3 as payload of add
    jle fault2 # back-branch into previous add
bar:
    retq
.section ".other"
baz:
    retq
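To make the byte counts for the payload variants concrete, here is a small C sketch that lays out the 4 bytes following the test instruction. The helper names are made up. It assumes the standard short-jcc encoding: a 1-byte opcode followed by a rel8 measured from the end of the 2-byte instruction.

```c
#include <stdint.h>

/* rel8 for a 2-byte short jcc at offset jcc_off that targets target_off. */
static int8_t short_jcc_rel8(int jcc_off, int target_off) {
    return (int8_t)(target_off - (jcc_off + 2));
}

/* Lay out: <first_opcode> 0xCC, then jle back to the 0xCC byte.
   first_opcode is 0x74 (je rel8) or 0x04 (add AL, imm8); either way
   the 0xCC byte is consumed as its immediate on the fall-through path. */
static void encode_payload_trick(uint8_t first_opcode, uint8_t out[4]) {
    out[0] = first_opcode;
    out[1] = 0xCC;                           /* int3, hidden as payload */
    out[2] = 0x7E;                           /* jle rel8 opcode */
    out[3] = (uint8_t)short_jcc_rel8(2, 1);  /* branch back to offset 1 */
}
```

The back-branch displacement comes out as -3, so both variants cost 4 bytes against 6 for the long form.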