Draft superscalar support #69

Draft: wants to merge 1 commit into base: master

@yugr commented Mar 5, 2024

This draft PR illustrates the possibility of adding support for superscalar architectures in Unison.

It's based on the following suggestion from Roberto:

Here is my idea to model out-of-order execution in a more accurate and
(hopefully) scalable way. The key idea is to mimic the OoO
architecture by decoupling the order in which instructions are fetched
and their execution schedule in the constraint model. Currently, the
model uses a single set of variables issue(o) (in the notation of the
TOPLAS paper [1]) to capture both fetching and executing. My proposal
would be to decouple this into two sets of variables, say f(o) and
e(o), giving the fetch order and (estimated) execution cycle of each
operation o.

Fetch order variables. These variables model the sequential order in
which operations are presented to the OoO processor. They define a
total ordering among all operations in a basic block; would determine
the live ranges; and, by extension, would be involved in all register
allocation constraints (see Table 7 in [1]).
There may be "holes" in the f(o) variables but that shouldn't be a
problem from the modeling perspective.

Execution cycle variables. These variables model an estimation of the
actual schedule in which operations are executed. They define a
partial ordering among all operations in a basic block, since as many
operations as the issue width (W) of the processor can be scheduled in
parallel. They would be involved in all instruction scheduling
constraints (resource usage, etc.).

These two sets of variables would have to be linked ("channeled" in
constraint programming speak) so that all operations estimated to be
executed in parallel (by the e(o) variables) are ordered contiguously
according to the f(o) variables. This could be achieved, I think, by
the following constraints:

e(o) * W <= f(o) < (e(o) + 1) * W     for all o in O_b, for all b in B
alldifferent({f(o) : o in O_b})    for all b in B

The appeal with this model is that it avoids introducing a quadratic
number of variables or constraints, which tends to affect scalability
severely as you noted in your email.

I also created an example of this model extension using the program
from Fig. 15 in [1] and assuming W = 2, see attachment.

While the idea is, conceptually, fairly simple, extending the Unison
implementation would probably be non-trivial and require serious
constraint programming expertise.

[Image: 20230218_172429]
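
For readers who want to experiment with the two channeling constraints from the quoted proposal, here is a minimal Python sketch. It is not code from this PR or from Unison: the function names `exec_cycle` and `satisfies_channeling` and the four-operation block are hypothetical, chosen only for illustration. For integer variables and W > 0, the first constraint is equivalent to e(o) = floor(f(o) / W), so each fetch slot fixes its execution cycle, and combined with alldifferent on f at most W operations can share a cycle.

```python
from collections import Counter


def exec_cycle(fetch_slot, width):
    """Execution cycle implied by a fetch slot: e(o) = floor(f(o) / W)."""
    return fetch_slot // width


def satisfies_channeling(f, e, width):
    """Check both constraints for one basic block.

    f, e: dicts mapping each operation to its fetch slot / execution cycle.
    """
    # e(o) * W <= f(o) < (e(o) + 1) * W for every operation o
    in_window = all(e[o] * width <= f[o] < (e[o] + 1) * width for o in f)
    # alldifferent({f(o)}): no two operations share a fetch slot
    all_different = len(set(f.values())) == len(f)
    return in_window and all_different


# Hypothetical block with W = 2; fetch slots 3 and 4 are left as "holes".
W = 2
f = {"a": 0, "b": 1, "c": 2, "d": 5}
e = {op: exec_cycle(slot, W) for op, slot in f.items()}

print(e)                              # execution cycle derived for each op
print(satisfies_channeling(f, e, W))  # derived cycles satisfy the constraints
print(Counter(e.values()))            # operations per cycle, never more than W
```

Moving `d` to cycle 1 while keeping f(d) = 5 violates the window constraint, which shows how the channeling ties the two sets of variables together.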

The code most likely has many deficiencies, so feedback and suggestions are very welcome.

@robcasloz (Contributor)

Interesting, thanks for the prototype implementation @yugr!

@yugr (Author) commented Apr 5, 2024

One problem with the new constraints is that they cause many more solver timeouts. Additional presolver constraints might help with this. Let me know if anything comes to mind.
