Draft superscalar support #69

yugr · 2024-03-05T09:00:57Z

This draft PR illustrates the possibility of adding support for superscalar architectures in Unison.

It's based on the following suggestion from Roberto:

Here is my idea to model out-of-order execution in a more accurate and
(hopefully) scalable way. The key idea is to mimic the OoO
architecture by decoupling the order in which instructions are fetched
and their execution schedule in the constraint model. Currently, the
model uses a single set of variables issue(o) (in the notation of the
TOPLAS paper [1]) to capture both fetching and executing. My proposal
would be to decouple this into two sets of variables, say f(o) and
e(o), giving the fetch order and (estimated) execution cycle of each
operation o.

Fetch order variables. These variables model the sequential order in
which operations are presented to the OoO processor. They define a
total ordering among all operations in a basic block, would determine
the live ranges and, by extension, would be involved in all register
allocation constraints (see Table 7 in [1]).
There may be "holes" in the f(o) variables but that shouldn't be a
problem from the modeling perspective.

Execution cycle variables. These variables model an estimation of the
actual schedule in which operations are executed. They define a
partial ordering among all operations in a basic block, since as many
operations as the issue width (W) of the processor can be scheduled in
parallel. They would be involved in all instruction scheduling
constraints (resource usage, etc.).

These two sets of variables would have to be linked ("channeled" in
constraint programming speak) so that all operations estimated to be
executed in parallel (by the e(o) variables) are ordered contiguously
according to the f(o) variables. This could be achieved, I think, by
the following constraints:

e(o) * W <= f(o) < (e(o) + 1) * W     for all o in O_b, for all b in B
alldifferent({f(o) : o in O_b})    for all b in B
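
As a rough illustration of the two constraints above (not part of Unison or of the original email), the following Python sketch checks them for a single basic block. The operation names, the dictionaries `f` and `e`, and both helper functions are hypothetical and chosen only for this example. It also shows that the window constraint forces e(o) = floor(f(o) / W), and that a "hole" in the fetch positions is harmless.

```python
from collections import Counter


def check_channeling(f, e, W):
    """Check the channeling constraints for one basic block.

    f and e map each (hypothetical) operation name to its fetch position
    and estimated execution cycle. The constraints checked are:
        e(o) * W <= f(o) < (e(o) + 1) * W   for every operation o
        alldifferent({f(o)})
    """
    if set(f) != set(e):
        return False
    windows_ok = all(e[o] * W <= f[o] < (e[o] + 1) * W for o in f)
    all_different = len(set(f.values())) == len(f)
    return windows_ok and all_different


def cycles_from_fetch(f, W):
    """The window constraint leaves exactly one choice: e(o) = f(o) // W."""
    return {o: pos // W for o, pos in f.items()}


W = 2
# Fetch position 3 is unused: a "hole" in the f(o) variables.
f = {"o1": 0, "o2": 1, "o3": 2, "o4": 4}
e = cycles_from_fetch(f, W)

assert check_channeling(f, e, W)
# With distinct fetch positions, no cycle can hold more than W operations.
assert max(Counter(e.values()).values()) <= W
```

Here o1 and o2 share cycle 0 while o3 and o4 each get their own cycle, so the issue-width limit follows from the two constraints without a separate counting constraint in this sketch.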

The appeal of this model is that it avoids introducing a quadratic
number of variables or constraints, which tends to affect scalability
severely as you noted in your email.

I also created an example of this model extension using the program
from Fig. 15 in [1] and assuming W = 2, see attachment.

While the idea is, conceptually, fairly simple, extending the Unison
implementation would probably be non-trivial and require serious
constraint programming expertise.

The code most likely has many deficiencies, so feedback and suggestions are very welcome.


robcasloz · 2024-03-07T21:11:05Z

Interesting, thanks for the prototype implementation @yugr!

yugr · 2024-04-05T08:51:46Z

One problem with the new constraints is that they cause many more solver timeouts. Perhaps additional presolver constraints could help with this. Let me know if anything comes to mind.

This commit illustrates the possibility of adding support for supersc…

417e6f4

…alar architectures in Unison.
