[RFC] Future of the Boost.fiber back-end #1714

Closed
j-stephan opened this issue May 10, 2022 · 13 comments · Fixed by #1718

Comments

@j-stephan
Member

While working on #1713 I discovered that the Boost.fiber back-end is broken when enabling C++20. This was fixed in their repository a few days ago but all stable versions including the current one are unusable with C++20.

So we can either:

  • raise the minimum Boost version to 1.80 (or whenever the fix lands in a stable release)
  • remove the Boost.fiber back-end
  • deactivate the Boost.fiber back-end for C++20 and Boost versions < (fixed version)

Before someone puts any work into this: Do we want to continue to support Boost.fiber as a back-end? What is the use case? Which benefits does this back-end have over our OpenMP and TBB back-ends?
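The third option above (deactivating the back-end only for the broken combination) could be sketched as a compile-time check. This is a minimal illustration, not alpaka code: `fiberBackendUsable` and `fixedBoostVersion` are hypothetical names, and 1.80 is only the tentative fix release mentioned above.

```cpp
// Boost encodes its version as major * 100000 + minor * 100 + patch,
// so Boost 1.80.0 corresponds to 108000.
// ASSUMPTION: 1.80 is the tentative fix release named in this issue.
constexpr int fixedBoostVersion = 108000;

// Hypothetical helper (not part of alpaka): decides whether the fiber
// back-end may be enabled for a given language standard and Boost version.
// cppStandard uses the values of __cplusplus
// (201703L for C++17, 202002L for C++20).
constexpr bool fiberBackendUsable(long cppStandard, int boostVersion)
{
    return cppStandard < 202002L || boostVersion >= fixedBoostVersion;
}

static_assert(fiberBackendUsable(201703L, 107800), "C++17 works with older Boost");
static_assert(!fiberBackendUsable(202002L, 107900), "C++20 needs the fixed Boost");
static_assert(fiberBackendUsable(202002L, 108000), "C++20 works once the fix is released");
```

In a real header the same condition would more likely be a preprocessor guard comparing `__cplusplus` and `BOOST_VERSION` (from `<boost/version.hpp>`) against these values.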

@j-stephan
Member Author

Our internal discussion didn't show much love for the fiber back-end. I'll submit a PR to remove the back-end. Until then this issue is open for anyone who wants to keep it.

@bernhardmgruber
Member

I wanted to run a quick benchmark of the fiber backend yesterday to compare it against other CPU backends but failed to make it scale across more than 1 core. The workdiv should have enough blocks and threads. Maybe the backend is already broken.

Furthermore, the fiber backend requires us to build boost and is thus an annoying dependency.

In the face of our limited maintenance resources, I see no strong reason to keep it.

@fwyzard
Contributor

fwyzard commented May 11, 2022

> Our internal discussion didn't show much love for the fiber back-end.

I'm not familiar with Fibers -- in the sense that I've read about them, but I've never had occasion to use them.
It looks like an interesting approach, but maybe with only a niche use case.

If there is no urgency in removing it, I would be interested in giving it a try and comparing it with the other CPU backends.

As for the breakage, my suggestion would be

  • keep it enabled in c++17 mode
  • enable it in c++20 mode only if a recent enough version of Boost is detected

@bernhardmgruber
Member

> If there is no urgency in removing it, I would be interested in giving it a try and comparing it with the other CPU backends.

Please do so! I could not make it scale across more than one core in my quick test with @sliwowitz's random engine benchmark.

@fwyzard
Contributor

fwyzard commented May 17, 2022

> I wanted to run a quick benchmark of the fiber backend yesterday to compare it against other CPU backends but failed to make it scale across more than 1 core. The workdiv should have enough blocks and threads. Maybe the backend is already broken.

Looking at the code of the Fibers implementation:

  • "threads" inside a "block" will be run by a single OS thread, with as many fibers running cooperatively as required by the block configuration:
        auto const blockThreadCount(blockThreadExtent.prod());
        FiberPool fiberPool(blockThreadCount);
  • "blocks" inside a "grid" will be run serially, within the calling OS thread:
        // Execute the blocks serially.
        meta::ndLoopIncIdx(gridBlockExtent, boundGridBlockExecHost);

So, if the observed behaviour is that the whole grid runs in a single OS thread, one block at a time, then it matches what the code and comments say.
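The execution order described above can be modelled with a short sketch. It is a simplified stand-in, not the alpaka implementation: a plain inner loop replaces the fibers of a block, Boost.Fiber is not used, and `runGridOnCallingThread` is a hypothetical name. It only shows that every (block, thread) pair ends up on the calling OS thread.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>

// Simplified model of the execution order described above.
// ASSUMPTION: a plain inner loop stands in for the fibers of a block;
// this is not alpaka code and does not use Boost.Fiber.
// Returns the set of OS thread ids that executed the kernel body and
// stores the number of kernel invocations in `invocations`.
std::set<std::thread::id> runGridOnCallingThread(
    std::size_t gridBlockCount,
    std::size_t blockThreadCount,
    std::size_t& invocations)
{
    std::set<std::thread::id> osThreads;
    invocations = 0;
    // "Blocks" run serially within the calling OS thread.
    for(std::size_t block = 0; block < gridBlockCount; ++block)
    {
        // All "threads" of a block share that same OS thread.
        for(std::size_t thread = 0; thread < blockThreadCount; ++thread)
        {
            osThreads.insert(std::this_thread::get_id());
            ++invocations;
        }
    }
    // Every (block, thread) pair ran exactly once.
    assert(invocations == gridBlockCount * blockThreadCount);
    return osThreads;
}
```

However many blocks and threads the work division contains, the returned set holds a single thread id, which is consistent with the benchmark not scaling beyond one core.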

Indeed running multiple threads in parallel (one per block) could be an interesting extension.

@fwyzard
Contributor

fwyzard commented May 17, 2022

In fact, a useful extension could be the possibility to "mix and match" the grid-level and block-level scheduling algorithms:

  • run the blocks in the grid: serially, in parallel using OS threads, in parallel using TBB, etc.
  • run the threads in the block: serially, cooperatively using Fibers or coroutines, in parallel using OS threads, etc.

Then one could pick the block-level and thread-level strategy to better suit the problem and hardware?
For example:

  • serial blocks, serial threads
  • parallel blocks using OS threads, cooperative threads using Fibers
  • parallel blocks using TBB, serial threads
  • etc.
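One way to picture such a "mix and match" design is with policy types chosen at compile time. The sketch below is only an illustration under stated assumptions: `GridSerial`, `GridOsThreads`, `BlockSerial`, `launch` and `countInvocations` are hypothetical names, none of them is alpaka API, and a Fibers- or TBB-based policy is left out to keep the example free of external dependencies.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// All names below are hypothetical and only illustrate the idea.

// Grid-level strategy: run the blocks one after another on the calling thread.
struct GridSerial
{
    template<typename RunBlock>
    static void run(std::size_t blockCount, RunBlock runBlock)
    {
        for(std::size_t b = 0; b < blockCount; ++b)
            runBlock(b);
    }
};

// Grid-level strategy: run each block on its own OS thread.
struct GridOsThreads
{
    template<typename RunBlock>
    static void run(std::size_t blockCount, RunBlock runBlock)
    {
        std::vector<std::thread> workers;
        workers.reserve(blockCount);
        for(std::size_t b = 0; b < blockCount; ++b)
            workers.emplace_back(runBlock, b);
        for(auto& w : workers)
            w.join();
    }
};

// Block-level strategy: run the threads of a block one after another.
struct BlockSerial
{
    template<typename Kernel>
    static void run(std::size_t block, std::size_t threadCount, Kernel& kernel)
    {
        for(std::size_t t = 0; t < threadCount; ++t)
            kernel(block, t);
    }
};

// Combines one grid-level and one block-level strategy.
template<typename GridPolicy, typename BlockPolicy, typename Kernel>
void launch(std::size_t blockCount, std::size_t threadCount, Kernel kernel)
{
    assert(threadCount > 0 && "a block needs at least one thread");
    GridPolicy::run(
        blockCount,
        [&](std::size_t block) { BlockPolicy::run(block, threadCount, kernel); });
}

// Example kernel launch: counts how often the kernel body is invoked.
template<typename GridPolicy, typename BlockPolicy>
std::size_t countInvocations(std::size_t blockCount, std::size_t threadCount)
{
    std::atomic<std::size_t> counter{0};
    launch<GridPolicy, BlockPolicy>(
        blockCount,
        threadCount,
        [&counter](std::size_t, std::size_t) { ++counter; });
    return counter.load();
}
```

Swapping `GridSerial` for `GridOsThreads` changes how blocks are scheduled without touching the block-level policy or the kernel, which is the separation the list above describes.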

@fwyzard
Contributor

fwyzard commented May 17, 2022

Trying to get back on topic: is it correct that Fibers is the only single-thread CPU backend that supports thread-level synchronization primitives, like syncBlockThreads()?

If that is the case, could you keep the (optional) Fibers back-end in Alpaka until an eventual Coroutine-based back-end is available?

@BenjaminW3
Member

Yes, the fiber backend only uses a single OS thread. There once was a ticket to scale this across multiple threads, not sure where this went.

We should try to keep the fiber-based backend in there as long as technically possible until we have a reasonable replacement.
If we disable it for C++20 builds, then I am fine with it.
It may only be of academic relevance, but it is a good starting point for similar experiments and extensions.

@j-stephan
Member Author

> There once was a ticket to scale this across multiple threads, not sure where this went.

This should be #22.

@bernhardmgruber
Member

> Looking at the code of the Fibers implementation:
> [...]

Thank you for that investigation and summarizing the outcome! I did not have the time to look myself!

> In fact, a useful extension could be the possibility to "mix and match" the grid-level and block-level scheduling algorithms:

This is exactly what @psychocoderHPC has had in mind for a long time, and we discussed it today in the alpaka VC. Unfortunately, it requires a redesign and probably a big overhaul of alpaka, and we don't have the resources for this ATM.

@bernhardmgruber
Member

> Yes, the fiber backend only uses a single OS thread.

Good to have that confirmed! Thank you!

j-stephan mentioned this issue Jun 20, 2022

@bernhardmgruber
Member

I just touched ConcurrentExecPool again in #1850, and one reason for the complexity of the class is that it has to support the fiber backend. Dropping the fiber backend would make it considerably easier to simplify ConcurrentExecPool.

@fwyzard
Contributor

fwyzard commented Dec 4, 2022

From the CMS side, since we are not actively using it and we no longer have Tony looking into it, I guess it can be dropped.

Eventually we might look into an alternative implementation based on Argobots.
