Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nodegroup messages are too low a priority #3674

Open
lvkale opened this issue Dec 1, 2022 · 4 comments · May be fixed by #3676
Open

Nodegroup messages are too low a priority #3674

lvkale opened this issue Dec 1, 2022 · 4 comments · May be fixed by #3676

Comments

@lvkale
Copy link
Contributor

lvkale commented Dec 1, 2022

I noted this in a tram-replacement experiment, with histogram code from bale suite. Nodegroup messages carrying boatload of data were arriving on the node but not being processed as all PEs were busy cycling through there own locally generated work. (Although the work generation method was broken down using CThYield() every x iterations). @nilsdeppe also seconded this request on slack.
What is happening: scheduler's function for finding next message always processes non-local (i.e. messages from others but meant for my PE) and local (from my pe) messages, so nodeGroup messages, although they can be processed by any PE, get delayed.
The request is to allow nodegroup messages to execute earlier.. somehow.

@lvkale
Copy link
Contributor Author

lvkale commented Dec 1, 2022

A nuance of intent: Do we want to think of nodegroup messages as a within-node load balancing mechanism also? If used for that purpose, it should be low priority. "Finish all your local work assigned to you and then pay attention to nodegroup messages" makes sense for this scenario. However we do have task queue for that purpose. And nodegroup messages have another purpose: to quickly process data arriving from outside the node. (Basically saying: any pe, please execute this, it is urgent so I don't want to wait for any particular PE to become available). The common use case appears to be the latter, so its worth fixing.

@lvkale
Copy link
Contributor Author

lvkale commented Dec 1, 2022

I have tested a fix in my copy, which works, but can sometimes slow things down. (I think because checking the nodeq, which is accessed via lock/fence, has a cost.. But also because changing order in which things happen can have unexpected consequences). But mostly it does what we expect. I will work on making a good branch with that fix and push a PR. (The idea is to check nodegroup messages with highest priority once in every K iterations of the scheduler.)

@stwhite91
Copy link
Collaborator

If you want a nodegroup entry method to be executed urgently upon arrival, isn't that what [immediate] is for?

@lvkale
Copy link
Contributor Author

lvkale commented Dec 1, 2022

Immediate will execute on commthread, which we don't want in this case. (comm thread will become a bottleneck). Also, we do want to parallelize/distribute the (not insignificant) work of handling incoming messages to all PEs... not for each message, but across them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants