[RFC] Future of the Boost.fiber back-end #1714

j-stephan · 2022-05-10T13:06:51Z

While working on #1713 I discovered that the Boost.fiber back-end is broken when enabling C++20. This was fixed in their repository a few days ago but all stable versions including the current one are unusable with C++20.

So we can either:

- raise the minimum Boost version to 1.80 (or whenever the fix lands in a stable release)
- remove the Boost.fiber back-end
- deactivate the Boost.fiber back-end for C++20 and Boost versions < (fixed version)

Before someone puts any work into this: Do we want to continue to support Boost.fiber as a back-end? What is the use case? Which benefits does this back-end have over our OpenMP and TBB back-ends?

j-stephan · 2022-05-11T07:23:03Z

Our internal discussion didn't show much love for the fiber back-end. I'll submit a PR to remove the back-end. Until then this issue is open for anyone who wants to keep it.

bernhardmgruber · 2022-05-11T07:52:25Z

I wanted to run a quick benchmark of the fiber backend yesterday to compare it against other CPU backends but failed to make it scale across more than 1 core. The workdiv should have enough blocks and threads. Maybe the backend is already broken.

Furthermore, the fiber backend requires us to build boost and is thus an annoying dependency.

In the face of our limited maintenance resources, I see no strong reason to keep it.

fwyzard · 2022-05-11T08:42:14Z

> Our internal discussion didn't show much love for the fiber back-end.

I'm not familiar with Fibers -- in the sense that I've read about it, but I've never had an occasion to use it.
It looks like an interesting approach, but maybe with only a niche use case.

If there is no urgency in removing it, I would be interested in giving it a try and comparing it with the other CPU backends.

As for the breakage, my suggestion would be

- keep it enabled in C++17 mode
- enable it in C++20 mode only if a recent enough version of Boost is detected
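
The two rules above amount to a small predicate on the language standard and the Boost version. The following is a minimal sketch, assuming the fix first ships in Boost 1.80 (the thread leaves the exact release open). `fiberBackendEnabled` and `assumedFixedBoostVersion` are hypothetical names for illustration, not alpaka or Boost API; only the numeric encodings of `__cplusplus` and `BOOST_VERSION` are taken from the respective specifications.

```cpp
// Hypothetical helper, not part of alpaka: decides whether the Boost.fiber
// back-end should be offered for a given language standard and Boost version.
//
// cppStandard  uses the encoding of __cplusplus (201703L for C++17).
// boostVersion uses the encoding of BOOST_VERSION
//              (major * 100000 + minor * 100 + patch, so 1.80.0 is 108000).
constexpr bool fiberBackendEnabled(long cppStandard, int boostVersion, int firstFixedBoostVersion)
{
    // Anything newer than C++17 is treated as "C++20 mode"; compilers in
    // -std=c++2a mode may report values between 201703L and 202002L.
    return cppStandard <= 201703L || boostVersion >= firstFixedBoostVersion;
}

// ASSUMPTION: the fix lands in Boost 1.80. Adjust once the release is known.
constexpr int assumedFixedBoostVersion = 108000;

// C++17 keeps the back-end regardless of the Boost version.
static_assert(fiberBackendEnabled(201703L, 107800, assumedFixedBoostVersion), "C++17 stays enabled");
// C++20 with an older Boost disables it.
static_assert(!fiberBackendEnabled(202002L, 107800, assumedFixedBoostVersion), "C++20 needs the fix");
// C++20 with a recent enough Boost enables it again.
static_assert(fiberBackendEnabled(202002L, 108000, assumedFixedBoostVersion), "C++20 with fixed Boost");
```

In a real build the two arguments would come from `__cplusplus` and `BOOST_VERSION` (from `<boost/version.hpp>`), or the same check would be expressed in the build system instead.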

bernhardmgruber · 2022-05-11T08:50:59Z

> If there is no urgency in removing it, I would be interested in giving it a try and comparing it with the other CPU backends.

Please do so! I could not make it scale across more than one core in my quick test with @sliwowitz's random engine benchmark.

fwyzard · 2022-05-17T10:37:06Z

> I wanted to run a quick benchmark of the fiber backend yesterday to compare it against other CPU backends but failed to make it scale across more than 1 core. The workdiv should have enough blocks and threads. Maybe the backend is already broken.

Looking at the code of the Fibers implementation:

"threads" inside a "block" will be run by a single OS thread, with as many fibers running in parallel as required by the block configuration:

auto const blockThreadCount(blockThreadExtent.prod());
FiberPool fiberPool(blockThreadCount);

"blocks" inside a "grid" will be run serially, within the calling OS thread:

// Execute the blocks serially.
meta::ndLoopIncIdx(gridBlockExtent, boundGridBlockExecHost);

So, if the observed behaviour is that the whole grid runs in a single OS thread, one block at a time, then it matches what the code and comments say.
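
The scheduling shape described above can be sketched without Boost. This is an illustration only: `runGridLikeFiberBackend` and `RunStats` are hypothetical names, and the inner loop is a plain serial loop standing in for the fibers, which in the real back-end can yield to each other. What the sketch does show is that every kernel invocation lands on the calling OS thread, which would explain why a benchmark does not scale past one core.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <thread>

// Hypothetical bookkeeping for the sketch: how many kernel invocations ran
// and on which OS threads they ran.
struct RunStats
{
    std::size_t kernelCalls;
    std::set<std::thread::id> osThreads;
};

// Hypothetical stand-in for the fiber back-end's scheduling shape.
// kernel is called as kernel(blockIdx, threadIdx).
template<typename Kernel>
RunStats runGridLikeFiberBackend(std::size_t gridBlockCount, std::size_t blockThreadCount, Kernel kernel)
{
    assert(gridBlockCount > 0 && blockThreadCount > 0);
    RunStats stats{0, {}};

    // Execute the blocks serially, within the calling OS thread.
    for(std::size_t block = 0; block < gridBlockCount; ++block)
    {
        // The real back-end creates one fiber per thread of the block here.
        // A plain loop is used instead, so this sketch cannot express
        // block-level synchronization between the "threads".
        for(std::size_t thread = 0; thread < blockThreadCount; ++thread)
        {
            kernel(block, thread);
            ++stats.kernelCalls;
            stats.osThreads.insert(std::this_thread::get_id());
        }
    }
    return stats;
}
```

Calling it with, say, 4 blocks of 8 threads yields 32 kernel calls and a single entry in `osThreads`, namely the caller's thread id.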

Indeed running multiple threads in parallel (one per block) could be an interesting extension.

fwyzard · 2022-05-17T10:46:09Z

In fact, a useful extension could be the possibility to "mix and match" the grid-level and block-level scheduling algorithms:

- run the blocks in the grid: serially, in parallel using OS threads, in parallel using TBB, etc.
- run the threads in the block: serially, cooperatively using Fibers or coroutines, in parallel using OS threads, etc.

Then one could pick the block-level and thread-level strategy to better suit the problem and hardware?
For example:

- serial blocks, serial threads
- parallel blocks using OS threads, cooperative threads using Fibers
- parallel blocks using TBB, serial threads
- etc.
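
One way to picture such a "mix and match" design is with two independent policy types, one per level. The sketch below is a guess at the shape of the idea, not alpaka's design: `SerialBlocks`, `OsThreadBlocks`, `SerialThreads` and `launch` are hypothetical names, and only the strategies that need nothing beyond the standard library are shown (no Fibers, no TBB).

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Grid-level strategy: run the blocks one after another on the calling thread.
struct SerialBlocks
{
    template<typename RunBlock>
    static void forEachBlock(std::size_t blockCount, RunBlock runBlock)
    {
        for(std::size_t block = 0; block < blockCount; ++block)
            runBlock(block);
    }
};

// Grid-level strategy: run every block on its own OS thread.
// A real implementation would more likely use a bounded thread pool.
struct OsThreadBlocks
{
    template<typename RunBlock>
    static void forEachBlock(std::size_t blockCount, RunBlock runBlock)
    {
        std::vector<std::thread> workers;
        workers.reserve(blockCount);
        for(std::size_t block = 0; block < blockCount; ++block)
            workers.emplace_back([runBlock, block] { runBlock(block); });
        for(auto& worker : workers)
            worker.join();
    }
};

// Block-level strategy: run the threads of a block one after another.
struct SerialThreads
{
    template<typename Kernel>
    static void forEachThread(std::size_t block, std::size_t threadCount, Kernel kernel)
    {
        for(std::size_t thread = 0; thread < threadCount; ++thread)
            kernel(block, thread);
    }
};

// Combines one grid-level and one block-level strategy.
// kernel is called as kernel(blockIdx, threadIdx) and must be safe to call
// concurrently for different blocks when a parallel grid strategy is used.
template<typename GridStrategy, typename BlockStrategy, typename Kernel>
void launch(std::size_t blockCount, std::size_t threadCount, Kernel kernel)
{
    assert(blockCount > 0 && threadCount > 0);
    GridStrategy::forEachBlock(
        blockCount,
        [=](std::size_t block) { BlockStrategy::forEachThread(block, threadCount, kernel); });
}
```

With this layout, `launch<SerialBlocks, SerialThreads>(...)` and `launch<OsThreadBlocks, SerialThreads>(...)` run the same kernel under two of the combinations listed above. A cooperative block-level strategy based on Fibers or coroutines would be a third policy type with the same `forEachThread` interface.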

fwyzard · 2022-05-17T10:50:40Z

Trying to get back on topic: is it correct that Fibers is the only single-thread CPU backend that supports thread-level synchronization primitives, like syncBlockThreads()?

If that is the case, could you keep the (optional) Fibers back-end in alpaka until an eventual Coroutine-based back-end is available?

BenjaminW3 · 2022-05-17T11:05:27Z

Yes, the fiber backend only uses a single OS thread. There once was a ticket to scale this across multiple threads, not sure where this went.

We should try to keep the fiber-based backend in there as long as technically possible until we have a reasonable replacement.
If we disable it for C++20 builds, then I am fine with it.
It may only be of academic relevance, but it is a good starting point for similar experiments and extensions.

j-stephan · 2022-05-17T11:15:02Z

> There once was a ticket to scale this across multiple threads, not sure where this went.

This should be #22.

bernhardmgruber · 2022-05-17T11:32:10Z

> Looking at the code of the Fibers implementation:
> [...]

Thank you for that investigation and for summarizing the outcome! I did not have the time to look myself!

> In fact, a useful extension could be the possibility to "mix and match" the grid-level and block-level scheduling algorithms:

This is exactly what @psychocoderHPC has had in mind for a long time, and we discussed it today in the alpaka VC. Unfortunately, that requires a redesign and probably a big overhaul of alpaka, and we don't have the resources for this ATM.

bernhardmgruber · 2022-05-17T11:32:30Z

> Yes, the fiber backend only uses a single OS thread.

Good to have that confirmed! Thank you!

bernhardmgruber · 2022-12-04T02:22:15Z

I just touched ConcurrentExecPool again in #1850 and one reason for the complexity of the class is to support the fiber backend. It would help a fair bit to simplify ConcurrentExecPool if we could drop the fiber backend.

fwyzard · 2022-12-04T20:20:02Z

From the CMS side, since we are not actively using it and we no longer have Tony looking into it, I guess it can be dropped.

Eventually we might look into an alternative implementation based on Argobots.

j-stephan added Type:Question Backend:Boost.Fiber labels May 10, 2022

j-stephan mentioned this issue May 11, 2022

Drop Boost.fiber back-end #1718

Merged

fwyzard mentioned this issue May 17, 2022

Coroutine based accelerator backend #689

Closed

j-stephan mentioned this issue Jun 20, 2022

CI update #1713

Closed

bernhardmgruber closed this as completed in #1718 Dec 7, 2022
