-
Notifications
You must be signed in to change notification settings - Fork 0
/
notes.txt
69 lines (56 loc) · 2.34 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
/*** Implementation related notes ***/
Aka re-engineering todos to clean and refactor things
- trait Print should belong in LMS
- make people somehow know that they need to extend LiftVariables?
- x+2 : found Int, expecting Rep[Int]
- Rep[Unit] => Rep[X] : compiler complains with types
- getting a "proper" iterator implementation with next and
hasNext
- LiftVariables for tuples?
- Integrate "emitDataStructures right into ScalaCompile"
- how to "ScalaCompile" programs with structs as input?
- integrate "Array mutation (2d arrays)" without sym errors
- printing globalDefs to understand how symbols link to each other. How the
effect system works.
/*** Architecture ideas (Thierry) ***/
Ideas:
- separate the concerns: evaluation / pretty print algebrae and DP algebra
-> use Vojin macros to make these two interact.
-> What files/framework do we need?
- file naming conventions: separate the phases
- *Ops.scala: describes IR
- we need to describe all the parsers into IR nodes
- *Opt.scala: describes optimizations
- bottom-up evaluation is done by the outer wrapping node
- *SGen.scala: Scala generation
- *CGen.scala: C/CUDA generation
- rework folders:
- src/main/scala/lms/utils? -> evaluation algebrae helpers (+package LMS ops)
- src/main/scala/lms/dp -> DP algebra
- src/main/scala/examples -> examples of the algebrae
- src/test/scala/lms -> regression tests (with shellscript?)
idea: compare v4 vs lms-scala vs lms-cuda
IR Nodes:
- Terminal[T](min,max) + F(input,i,j)=>List(T)
- Tabulate(P)
- Aggregate(P)
- Filter(P,f) f: T=>Bool
- Or(P,P)
- Map(P,f) f:T=>U
- Concat(P,P,type,indices)
- Wrapper(List[P],axiom_ptr,single/k_element?)
+++ syntactic sugar to Detuple
Optimization phase:
- wrapper -> normalization(single/k_element)
Code generation:
- Wrapper
- Generates bottom-up loops (Scala+CUDA) [quite generic]
-> we have 2 different iteration schemes: single track and two track
- Generates JNI transfer + CUDA wrappers
-> how to tightly integrate with JVM as currently?
--> can we reuse Delite JNI generation? What about array(struct)
Shortcomings of previous work:
- All the arrays are in O(n^2), we do not leverage yield_max
- Can we implement k-best for CUDA? (with k comparisons)
Misc:
- What is ListToGen? Could we leverage the idea to generalize the k-normalization ?