chore(deps): update dependency org.typelevel:cats-effect to v3 #2067
This PR contains the following updates: `org.typelevel:cats-effect` `2.5.5` -> `3.6-1f95fd7`
Release Notes
typelevel/cats-effect
v3.5.0
This is the forty-fifth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release.
This release contains some changes that may be semantically breaking. If you're using fs2, http4s, or other libraries from the ecosystem, make sure you've upgraded to versions of these libraries that are compatible with this release (for fs2, that's 3.7.0, for http4s it's 0.23.19)!
Additionally, if you're using methods like `fromFuture`, make sure you're aware of the major changes to `async`, described in these release notes.

This is an incredibly exciting release! 3.5.0 represents the very first steps towards a fully integrated runtime, with support for timers (`IO.sleep`) built directly into the Cats Effect fiber runtime. This considerably increases performance for existing Cats Effect applications, particularly those which rely more heavily on native `IO` concurrency (e.g. Http4s Ember will see more benefits than Http4s Blaze).

Additionally, we've taken the opportunity presented by a minor release to fix some breaking semantic issues within some of the core `IO` functionality, particularly related to `async`. For most applications this should be essentially invisible, but it closes a long-standing loophole in the cancelation and backpressure model, ensuring a greater degree of safety in Cats Effect's guarantees.

Major Changes
Despite the deceptively short list of merged pull requests, this release contains an unusually large number of significant changes in runtime semantics. The changes in `async` cancelation (and particularly the implications for `async_`) are definitely expected to have user-facing impact, potentially breaking existing code in subtle ways. If you have any code which uses `async_` (or `async`) directly, you should read this section very carefully and potentially make the corresponding changes.

`async` Cancelation Semantics

The `IO.async` (and correspondingly, `Async#async`) constructor takes a function which returns a value of type `IO[Option[IO[Unit]]]`, with the `Some` case indicating the finalizer which should be invoked if the fiber is canceled while asynchronously suspended at this precise point, and `None` indicating that there is no finalizer for the current asynchronous suspension. This mechanism is most commonly used for "unregister" functions. For example, consider the following reimplementation of the `sleep` constructor:
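The snippet itself did not survive into this excerpt, so what follows is a hedged reconstruction based on the surrounding description; the single-threaded `scheduler` here is an assumption for illustration, not the actual runtime's timer dispatcher:

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

import scala.concurrent.duration.FiniteDuration

import cats.effect.IO

// assumed stand-in for the runtime's timer dispatch thread
val scheduler: ScheduledExecutorService =
  Executors.newSingleThreadScheduledExecutor()

def sleep(time: FiniteDuration): IO[Unit] =
  IO.async[Unit] { cb =>
    IO {
      val f = scheduler.schedule(
        new Runnable { def run(): Unit = cb(Right(())) },
        time.toNanos,
        TimeUnit.NANOSECONDS)

      // the Some case: a finalizer which unregisters the timer
      // if the fiber is canceled while suspended here
      Some(IO(f.cancel(false)).void)
    }
  }
```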
In the above, the `IO` returned from `sleep` will suspend for `time`. If its fiber is canceled, the `f.cancel()` function will be invoked (on the `ScheduledFuture`), which in turn removes the `Runnable` from the `ScheduledExecutorService`, avoiding memory leaks and such. If we had instead returned `None` from the registration effect, there would have been no finalizer and no way for fiber cancelation to clean up the stray `ScheduledFuture`.

The entirety of Cats Effect's design is prescriptively oriented around safe cancelation. If Cats Effect cannot guarantee that a resource is safely released, it will prevent cancelation from short-circuiting until execution proceeds to a point at which all finalization is safe. This design does have some tradeoffs (it can lead to deadlocks in poorly behaved programs), but it has the helpful outcome of strictly avoiding resource leaks, either due to incorrect finalization or circumvented backpressure.
...except in `IO.async`. Prior to 3.5.0, defining an `async` effect without a finalizer (i.e. producing `None`) resulted in an effect which could be canceled unconditionally, without the invocation of any finalizer. This was most seriously felt in the `async_` convenience constructor, which always returns `None`. Unfortunately, this semantic is very much the wrong default. It makes the assumption that the normal case for `async` is that the callback just cleans itself up (somehow) and no unregistration is possible or necessary. In almost all cases, the opposite is true.

It is exceptionally rare, in fact, for an `async` effect to not have an obvious finalizer. By defining the default in this fashion, Cats Effect made it very easy to engineer resource leaks and backpressure loss. This loophole is now closed, both in the `IO` implementation and in the laws which govern its behavior.

As of 3.5.0, the following is now considered to be uncancelable:
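The original example is missing from this excerpt; a hedged illustration of the shape being described, where `register` is a hypothetical callback-registration function rather than part of the Cats Effect API:

```scala
import cats.effect.IO

// register wires a callback into some external system with no
// way to unregister it (hypothetical)
def example[A](register: (Either[Throwable, A] => Unit) => Unit): IO[A] =
  IO.async[A] { cb =>
    // returning None means "no finalizer"; prior to 3.5.0 this
    // suspension was unconditionally cancelable, as of 3.5.0 it
    // is treated as uncancelable instead
    IO(register(cb)).as(None)
  }
```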
Previously, the above was cancelable without any caveats. Notably, this applies to all uses of the `async_` constructor!

In practice, we expect that usage of the `async` constructor which was already well behaved will be unaffected by this change. However, any use which is (possibly unintentionally) relying on the old semantic will break, potentially resulting in deadlock, as a cancelation which was previously observed will now be suppressed until the `async` completes. For this reason, users are advised to carefully audit their use of `async` to ensure that they always return `Some(...)` with the appropriate finalizer that unregisters their callback.

In the event that you need to restore the previous semantics, they can be approximated by producing `Some(IO.unit)` from the registration. This is a very rare situation, but it does arise in some cases. For example, the definition of `IO.never` had to be adjusted to the following:
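The adjusted definition is likewise missing from this excerpt; a hedged approximation of its shape:

```scala
import cats.effect.IO

// an async which never invokes its callback, but explicitly supplies
// a no-op finalizer (Some(IO.unit)) so that it remains cancelable
// under the new semantics
def never[A]: IO[A] =
  IO.async[A](_ => IO.pure(Some(IO.unit)))
```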
This change can result in some very subtle consequences. If you find unexpected effects in your application after upgrading to 3.5.0, you should start your investigation with this change! (Note that this change also affects third-party libraries using `async`, even if they themselves have not yet updated to 3.5.0 or higher!)

Integrated Timers
From the very beginning, Cats Effect and applications built on top of it have managed timers (i.e. `IO.sleep` and everything built on top of it) on the JVM by using a separate thread pool, in particular a `ScheduledExecutorService`. This is an extremely standard approach used prolifically by almost all JVM applications. Unfortunately, it is also fundamentally suboptimal.

The problem stems from the fact that `ScheduledExecutorService` isn't magic. It works by maintaining one or more event dispatch threads which interrogate a data structure containing all active timers. If any timers have passed their expiry, the thread invokes their `Runnable`. If no timers are expired, the thread blocks for the minimum time until the next timer becomes available. In its default configuration, the Cats Effect runtime provisions exactly one event dispatch thread for this purpose.

This isn't so bad when an application makes very little use of timers, since the thread in question will spend almost all of its time blocked, doing nothing. This affects timeslice granularity within the OS kernel and adds an additional GC root, but both effects are small enough that they usually go unnoticed. The bigger problem comes when an application is using a lot of timers and the thread is constantly busy reading that data structure and dispatching the next set of `Runnable`s (all of which complete `async`s and immediately shift back into the Cats Effect compute pool).

Unfortunately, this situation where a lot of timers are in use is exactly what happens in every network application, since each and every active socket must have at least one `IO.sleep` associated with it to time out handling if the remote side stops responding (in most cases, such as HTTP, even more than one timer is needed). In other words, the fact that `IO.sleep` is relatively inefficient when a lot of concurrent `sleep`s are scheduled is particularly egregious, since this is precisely the situation that describes most real-world usage of Cats Effect.

So we made this better! Cats Effect 3.5.0 introduces a new implementation of timers based on cooperative polling, which is basically the idea that timers can be dispatched and handled entirely by the same threads which handle compute work. Every time a compute worker thread runs out of work to do (and has nothing to steal), rather than just parking and waiting for more work, it first checks to see if there are any outstanding timers. If there are some which are ready to run, it runs them. Otherwise, if there are timers which aren't yet completed, the worker parks for that period of time (or until awakened by new work), ensuring the timer fires on schedule. In the event that a worker has not had the opportunity to park in some number of iterations, it proactively checks on its timers just to see if any have expired while it has been busy doing CPU-bound work.
This technique works extremely well in Cats Effect precisely because every timer had to shift back to the compute pool anyway, meaning that it was already impossible for any timer to have a granularity which was finer than that of the compute worker thread task queue. Thus, having that same task queue manage the dispatching of the timers themselves ensures that at worst those timers run with the same precision as previously, and at best we are able to avoid a considerable amount of overhead both in the form of OS kernel scheduler contention (since we are removing a whole thread from the application!) and the expense of a round-trip context shift and passage through the external work queue.
And, as mentioned, this optimization applies specifically to a scenario which is present in almost all real-world Cats Effect applications! To that end, we tested the performance of a relatively simple Http4s Ember server while under heavy load generated using the `hey` benchmark tool. The result was a roughly 15-25% improvement in sustained maximum requests per second, and a roughly 15% improvement in the 99th percentile latencies (P99). In practical terms, this means that this one change makes standard microservice applications around 15% more efficient with no other adjustments.

Obviously, you should do your own benchmarking to measure the impact of this optimization, but we expect the results to be very visible in production top-line metrics.
User-Facing Pull Requests
- `uncancelable` would remain masked for one stage (@djspiewak)
- `cede`s (@armanbilge)
- `uncancelable` body (@armanbilge)
- `Queue.synchronous` to include a two-phase commit (@djspiewak)
- `Queue.synchronous` internals to simplify concurrent hand-off (@djspiewak)
- `Mutex` memory leak (@BalmungSan)
- `ioRuntimeConfig`, pass it to `CPUStarvation` (@manuelcueto)
- `AsyncMutex` implementation (@BalmungSan)
- `blockedThreadDetectionEnabled` configurable via a system property (@chunjef)
- `map2` optimization (@durban)
- `AtomicCell#get` should not semantically block (@armanbilge)
- `Console#readLine` cancelable (@armanbilge)
- `IODeferred` (@armanbilge)
- `HotSwap` safe to concurrent access (@armanbilge)
- `IORuntimeBuilder` `failureReporter` config on JS (@armanbilge)
- `cancelable` (@djspiewak)
- `timeout` (@djspiewak)
- `fromFutureCancelable` and friends (@armanbilge)
- `Mutex` & `AtomicCell` (@BalmungSan)
- `IOLocal#scope`, revert #3214 (@armanbilge)
- `ConcurrentAtomicCell` (@BalmungSan)
- `IOLocal` - generalize `scope` function (@iRevive)
- `BatchingMacrotaskExecutor` (@armanbilge)
- `Ref`'s `flatModify` (@mn98)
- `asyncCheckAttempt` in `IODeferred#get` (@armanbilge)
- `IO#supervise`, `IO#toResource`, `IO#metered` (@kamilkloch)
- `IO#voidError` (@armanbilge)
- `async_` to be uncancelable (@djspiewak)
- `flatModify`, on `Ref` (@mn98)
- `Defer` instance for `Resource` without `Sync` requirement (@Odomontois)
- `Async#asyncCheckAttempt` for #3087 (@seigert)
- `IOLocal#scope` (@iRevive)

A very special and heartfelt thanks to all of you!
v3.4.11
This is the forty-fourth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
Thank you, Daniel!
v3.4.10
This is the forty-second release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
- `map2` optimization (@durban)

Very special thanks to all!
v3.4.9
This is the fortieth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
- `Dispatcher`: check for outstanding actions before release (@samspills)
- `raceOutcome` to correct implementation (@Jasper-M)
- `Dispatcher` error reporting (@samspills)
- `IOFiber` (@djspiewak)
- `std.Console` (@zetashift)
- `IODeferred` specialization (@armanbilge)
- `unsafeRunAndForget` (@samspills)
- `IOFiber#toString` (@durban)

Special thanks to each and every one of you!
v3.4.8
This is the thirty-ninth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
This release fixes a very rare runtime bug which manifests in applications with a high degree of contention on `blocking`/`interruptible` operations. In some rare circumstances, a fiber could be lost during the scheduling process, which could result in application-level deadlocks.

User-Facing Pull Requests
- `fromCompletableFuture` cancelation leak (@TimWSpence, @armanbilge)

Thank you, everyone!
v3.4.7
This is the thirty-sixth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
- `CallbackStack` leak, restore specialized `IODeferred` (@armanbilge)

Thanks, Arman! <3
v3.4.6
This is the thirty-sixth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
- `ContState` (@durban)
- `async` callback receives `null` (@durban)
- `clearTimeout` (@armanbilge)
- `fromCompletableFuture` to use `cont` (@armanbilge)
- `nowMicros()` into `Try` (@armanbilge)

Special thanks to each and every one of you!
v3.4.5
This is the thirty-fifth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
This release rolls back the `Deferred[IO, A]` optimizations for the time being, due to a memory leak in certain common scenarios. In particular, any use of fs2's `interruptWhen` where the stream in question naturally completes quickly would hit this case relatively hard (for example, Http4s Ember). We have a fix for the memory leak which needs a bit more testing before release, and we felt that, out of an abundance of caution, it is better to revert the changes immediately rather than wait for the hardening.

User-Facing Pull Requests
- `IO`-specialized `Deferred` for now (@djspiewak)
- `memoize` (@armanbilge)
- `Dispatcher.sequential(await = true)` release (@armanbilge)
- `CallbackStack` leak on JS (@armanbilge)

Thank you so very much!
v3.4.4
This is the thirty-fourth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
This release fixes a memory leak in `Deferred`. The memory leak in question is relatively small, but can accumulate over a long period of time in certain common applications. Additionally, this leak regresses GC performance slightly for almost all Cats Effect applications. For this reason, it is highly recommended that users upgrade to this release as soon as possible if currently using version 3.4.3.

User-Facing Pull Requests
- `CallbackStack#clear` method (@armanbilge)
- `CallbackStack` on JS (@armanbilge)
- `IODeferred` (@durban)
- `ExitCase` in `Resource#{both,combineK}` (@armanbilge)

Thank you so very much!
v3.4.3
This is the thirty-third release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
Despite being a patch release, this update contains two notable feature additions: full tracing support for Scala Native applications (including enhanced exceptions!), and significantly improved performance for `Deferred` when `IO` is the base monad. Regarding the latter, since `Deferred` is at the core of most concurrent logic written against Cats Effect, it is expected that this change will result in noticeable performance improvements in most applications, though it is hard to predict exactly how pronounced the effect will be.

User-Facing Pull Requests
- `Deferred` based on `IOFiber`'s machinery (@djspiewak)
- `Resource.race` (@armanbilge)
- `reportFailure` for `MainThread` (@armanbilge)
- `IOLocal` micro-optimizations (@armanbilge)

Very special thanks to all of you!
v3.4.2
This is the thirty-second release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
User-Facing Pull Requests
- `Deferred#complete` uncancelable (@durban)
- `Ref` without wrapping `AtomicReference` on JS/Native (@armanbilge)
- `cell` read in `AtomicCell#evalModify` (@armanbilge)

Thank you so much!
v3.4.1
This is the thirty-first release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. The primary purpose of this release is to address a minor link-time regression which manifested when extending `IOApp` with a `class` (not a `trait`) which was in turn extended by another class. In this scenario, the resulting main class would hang on exit if the intervening extension class had not been recompiled against Cats Effect 3.4.0. Note that this issue with separate compilation and `IOApp` does remain in a limited form: the `MainThread` executor is inaccessible when linked in this fashion. The solution is to ensure that all compilation units which extend `IOApp` (directly or indirectly) are compiled against Cats Effect 3.4.0 or later.

User-Facing Pull Requests
- `IOApp` deadlock (@armanbilge)

Thank you, everyone!
v3.4.0
This is the thirtieth release in the Cats Effect 3.x lineage. It is fully binary compatible with every 3.x release, and fully source-compatible with every 3.4.x release. Note that source compatibility has been broken with 3.3.x in some minor areas. Since those changes require active choice on the part of users to decide the best adjusted usage for their specific scenario, we have chosen to not provide scalafixes which automatically patch the affected call sites.
A Note on Release Cadence
While Cats Effect minor releases are always guaranteed to be fully backwards compatible with prior releases, they are not forwards compatible with prior releases, and partially as a consequence of this, can (and often do) break source compatibility. In other words, sources which compiled and linked successfully against prior Cats Effect releases will continue to do so, but recompiling those same sources may fail against a subsequent minor release.
For this reason, we seek to balance the inconvenience this imposes on downstream users against the need to continually improve and advance the ecosystem. Our target cadence for minor releases is somewhere between once every three months and once every six months, with frequent patch releases shipping forwards compatible improvements and fixes in the interim.
Unfortunately, Cats Effect 3.3.0 was released over ten months ago, meaning that the 3.4.0 cycle has required considerably more time than usual to come to fruition. There are several reasons for this, but the long and short of it is that this is expected to be an unusual occurrence. We currently expect to release Cats Effect 3.5.0 sometime in Spring 2023, in line with our target cadence.
Major Changes
As this has been a longer than usual development stretch (between 3.3.0 and 3.4.0), this release contains a large number of significant changes and improvements. Additionally, several improvements that we're very excited about didn't quite make the cutoff and have been pushed to 3.5.0. This section details some of the more impactful changes in this release.
High Performance `Queue`

One of the core concurrency utilities in Cats Effect is `Queue`. Despite its ubiquity in modern applications, the implementation of `Queue` has always been relatively naive, based entirely on immutable data structures, `Ref`, and `Deferred`. In particular, the core of the bounded `Queue` implementation since 3.0 looks like the following:
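The snippet is missing from this excerpt; below is a hedged sketch of the general shape the prose describes (field names are illustrative):

```scala
import scala.collection.immutable.{Queue => ScalaQueue}

import cats.effect.kernel.Deferred

// hedged sketch: a persistent queue of values, plus queues of
// Deferreds representing fibers blocked on take (queue empty)
// or on offer (queue full)
final case class State[F[_], A](
    queue: ScalaQueue[A],
    size: Int,
    takers: ScalaQueue[Deferred[F, A]],
    offerers: ScalaQueue[(A, Deferred[F, Unit])])
```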
The `ScalaQueue` type refers to `scala.collection.immutable.Queue`, which is a relatively simple banker's queue implementation within the Scala standard library. All end-user operations (e.g. `take`) within this implementation rely on `Ref#modify` to update internal state, with `Deferred` functioning as a signalling mechanism when `take` or `offer` need to semantically block (because the queue is empty or full, respectively).

This implementation has several advantages. Notably, it is quite simple and easy to reason about. This is actually an important property, since lock-free queues, particularly multi-producer multi-consumer queues, are extremely complex to implement correctly. Additionally, as it is built entirely in terms of `Ref` and `Deferred`, it is usable in any context which has a `Concurrent` constraint on `F[_]`, allowing for a significant amount of generality and abstraction within downstream frameworks.

Despite its simplicity, this implementation also does surprisingly well on performance metrics. Anecdotal use of `Queue` within extremely hot I/O processing loops shows that it is rarely, if ever, the bottleneck on performance. This is somewhat surprising precisely because it's implemented in terms of these purely functional abstractions, meaning that it is relatively representative of the kind of performance you can expect out of Cats Effect as an end user when writing complex concurrent logic in terms of the `Concurrent` abstraction.

Despite all this, though, we always knew we could do better. Persistent, immutable data structures are not known for getting the absolute top end of performance out of the underlying hardware. Lock-free queues in particular have a very rich legacy of study and optimization, due to their central position in most practical applications, and it would be unquestionably beneficial to take advantage of this mountain of knowledge within Cats Effect. The problem has always been two-fold: first, the monumental effort of implementing an optimized lock-free async queue essentially from scratch, and second, how to achieve this kind of implementation without leaking into the abstraction and forcing an `Async` constraint in place of the `Concurrent` one.

The constraint problem is particularly thorny, since numerous downstream frameworks have built around the fact that the naive `Queue` implementation only requires `Concurrent`, and it would not make much sense to force an `Async` constraint when no surface functionality is being changed or added (only performance improvements). However, any high-performance implementation would require access to `Async`, both to directly implement asynchronous suspension (rather than redirecting through `Deferred`) and to safely suspend the side effects required to manipulate mutable data structures.

This problem has been solved by using runtime casing on the `Concurrent` instance behind the scenes. In particular, whenever you construct a `Queue.bounded`, the runtime type of that instance is checked to see if it is secretly an `Async`. If it is, the higher-performance implementation is transparently used instead of the naive one. In practice, this should apply at almost all possible call sites, meaning that the new implementation represents an entirely automatic and behind-the-scenes performance improvement.
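As a hedged illustration of the dispatch pattern (not the actual private constructors), the check amounts to a runtime type test on the implicit instance:

```scala
import cats.effect.kernel.{Async, GenConcurrent}

// returns true when the Concurrent instance is secretly an Async,
// i.e. when the optimized mutable implementation can be selected
def wouldUseFastPath[F[_]](implicit F: GenConcurrent[F, _]): Boolean =
  F match {
    case _: Async[_] => true // optimized, mutable internals
    case _ => false          // naive Ref/Deferred implementation
  }
```

For `IO` (and any other `Async` datatype), this test succeeds, which is why the optimized implementation applies at almost all call sites.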
As for the implementation, we chose to start from the foundation of the industry-standard JCTools project. In particular, we ported the `MpmcArrayQueue` implementation from Java to Scala, making slight adjustments along the way. In particular:

- `sun.misc.Unsafe` for manipulation of directional memory fences
- `null` values without introducing extra boxing

All credit goes to Nitsan Wakart (and other JCTools contributors) for this data structure.
This implementation is used to contain the fundamental data within the queue, and it handles an enormous number of very subtle corner cases involving numerous producers and consumers all racing against each other to read from and write to the same underlying data, but it is insufficient on its own to implement the Cats Effect `Queue`. In particular, when `offer` fails on `MpmcArrayQueue` (because the queue is full), it simply rejects the value. When `offer` fails on Cats Effect's `Queue`, the calling fiber is blocked until space is available, encoding a form of backpressure that sits at the heart of many systems.

In order to achieve this semantic, we had to implement not only a fast bounded queue for the data, but also a fast unbounded queue to contain any suspended fibers which are waiting on a condition on the queue. We could have used `ConcurrentLinkedQueue` (from the Java standard library) for this, but we can do even better on performance with a bit of specialization. Additionally, due to cancelation, each listener needs to be able to efficiently remove itself from the queue, regardless of how far along it is in line. To resolve these issues, Viktor Klang and I built a more optimized implementation based on atomic pointer chaining. It's actually possible to improve on this implementation even further (among other things, by removing branching), which should arrive in a future release.

Congratulations on traversing this entire wall of text! Have a pretty performance chart as a reward:
This has been projected onto a linear relative scale. You can find the raw numbers here. In summary, the new queues are between 2x and 4x faster than the old ones.
The bottom line on all of this is that any application which relies on queues (which is to say, most applications) should see an automatic improvement in performance of some magnitude. As mentioned at the top, the queue data structure itself does not appear to be the performance bottleneck in any practical application, but every bit helps, and free performance is still free performance!
Hardened `Queue` Semantics

As a part of the rework of the core data structures, it was decided to make a very subtle change to the semantics of the `Queue` data structure while under heavy load, particularly in true multi-producer, multi-consumer (MPMC) scenarios. Under certain circumstances, the previous implementation of `Queue` could actually lose data. This manifested when one fiber enqueued a value while another fiber dequeued that value and was canceled during the dequeue. When this happened, it was possible for the value to have been removed from the underlying data structure but not fully returned from the `poll` effect, meaning that it could be lost without user-land code having any chance to access it within a finalizer.

This sounds like a relatively serious issue, though it's important to understand that the race condition which gives rise to it was vanishingly rare (to the point where no one has ever, to our knowledge, encountered it in the wild). However, fixing this semantic required reworking a lot of the core guarantees offered by the data structure. In particular, it is now no longer strictly guaranteed in all cases that, while under contention, elements read from a queue by multiple concurrent consumers will be read in exactly insertion order.
More specifically, imagine a situation where you have two consumers and two producers on an empty queue. Consumer A attaches first (using `poll`), followed by consumer B. Immediately after this, the first producer writes value `1`, followed by the second producer writing value `2`. Critically, both the first and second producer need to write to the queue at nearly exactly the same moment.

With the previous implementation of `Queue`, users could rely on an ironclad guarantee that consumer A would get value `1`, while consumer B would get value `2`. Now, this is no longer strictly guaranteed. It is possible for B to get `1` while A gets `2`. In fact, there is an even stranger version of this race condition which involves only a single producer but still generates a similar outcome: consumer A calls `poll`, and sometime later consumer B calls `poll` at the same moment that the single producer `offer`s item `1`. When this scenario arises, it is possible for B to get item `1` and A to get nothing at all, despite the fact that A has been waiting patiently for some significant length of time.
More precisely, the new `Queue` no longer strictly guarantees fairness across multiple consumers under concurrent contention. This loss of fairness can, under certain circumstances, manifest as a corruption of ordering, though one which is unobservable unless the user were to somehow coordinate precise timestamps across multiple consuming fibers. And, as it turns out, the weakening of these guarantees is directly connected to the fix for the (rare) loss of data during fiber cancelation.

To be clear, multi-consumer scenarios are rather rare to begin with, and I cannot think of a single circumstance under which someone would have a multi-consumer `Queue` and any expectation of strong ordering or fairness between their consumers. As an appeal to authority, this kind of loss of fairness is extremely standard across MPMC queue implementations in other languages and runtimes, specifically because data loss is a much more dangerous and impactful outcome and must be avoided at all costs.

To that end, it is considered very unlikely that users will even notice this change, but it is still a significant and subtle adjustment in the core semantics of `Queue`. The upside of all of this is that users can now rely on the guarantee that, if an effect `offer(a)` completes successfully, then the value `a` will be "in the queue" and will be later readable by a `poll` effect. Additionally, if and only if `poll` removes the element `a` from the queue, it will complete successfully even if externally canceled; conversely, if `poll` is canceled before it removes `a` from the queue, then `a` will remain available for subsequent `poll`s. Thus, data loss is avoided in all cases.

More Robust `Dispatcher` (and `Supervisor`!)
`Dispatcher` was one of the most significant changes from Cats Effect 2 to 3. In particular, it addresses a long-standing annoyance when working with effect types: the tongue-in-cheek termed "Soviet interop" case, where unsafe code calls you. In previous versions of Cats Effect, this scenario was handled by the `ConcurrentEffect` typeclass and the universally confusing `runAsync` method.
The way in which `Dispatcher` works is effectively a fiber-level event dispatch pattern: a single fiber (the dispatcher) polls an asynchronous queue which contains `IO[Any]` values (the units of work), and when a new work unit is acquired, the dispatcher spawns a fiber for that unit and continues polling. This type of pattern is extremely general: it doesn't matter how long the work units take to complete, and they cannot interfere with each other because each is proactively relocated to its own fiber.

Additionally, when CE3 was released, we weren't entirely certain how users wanted to use `Dispatcher` in practical applications. It was believed likely that most users would create a single top-level `Dispatcher` for their entire application, and thus the implementation of the event dispatch fibers was optimized with the assumption that a single `Dispatcher` instance would be under heavy concurrent load. These optimizations are fairly robust, but they do come with a pair of costs: there is no guarantee of ordering between two sequentially-submitted work units (`IO[Any]` values), and every unit of work must pay the price of spawning a new fiber regardless of how long that work unit needs to execute. The former issue is well exemplified by the following:
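The example itself is missing from this excerpt; the following is a hedged reconstruction, written against the named `parallel` mode introduced below (which corresponds to the previous default behavior), with `ioa` and `iob` as stand-in effects:

```scala
import cats.effect.IO
import cats.effect.std.Dispatcher

// ioa and iob stand in for arbitrary effects submitted from impure code
def submitBoth(ioa: IO[Unit], iob: IO[Unit]): IO[Unit] =
  Dispatcher.parallel[IO].use { dispatcher =>
    for {
      _ <- IO(dispatcher.unsafeRunAndForget(ioa))
      _ <- IO(dispatcher.unsafeRunAndForget(iob))
      _ <- IO.never[Unit] // wait around for stuff…
    } yield ()
  }
```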
In the above, we submit `ioa` strictly before we submit `iob`, but `iob` may actually execute first! This creates a whole series of strange issues that users must account for in common `Dispatcher` scenarios, particularly when using it as a mechanism for inserting ordered items into a `Queue` from impure event handlers. Accounting for this ordering issue often imposes significant overhead on user code, more than undoing the benefits of `Dispatcher`'s own optimizations. Additionally, if `ioa` and `iob` are extremely cheap (e.g. `q.offer(a)`), the overhead of calling `.start` to create a wrapping fiber for each will exceed the total runtime of the operation itself. Fiber spawning is extremely cheap, but it's not as cheap as inserting into a queue!
For all of these reasons, `Dispatcher` has been adjusted to have two major modes: `parallel` and `sequential`. The previous default mode of operation corresponds to the `parallel` mode. When you aren't sure which to pick, select this one. The `sequential` mode adjusts `Dispatcher`'s optimization mode for more localized usage (e.g. one per request, which is a common paradigm in practice), offers strong ordering guarantees (in the above example, `ioa` will run before `iob`, guaranteed), and much more efficient work unit execution (by removing the fiber wrapping). The danger is that units of work can interfere with each other, and thus `sequential` is not an appropriate mode for `Dispatcher`s which are shared across an entire application.

If that weren't enough, `Dispatcher` has also received a new configuration option that applies to both `parallel` and `sequential` modes: `await = true`. In the above example, there is a deceptively annoying comment: `// wait around for stuff…`. Most people who have used `Dispatcher` in anger have received the dreaded `dispatcher already shutdown` error message. This happens when the `use` scope for the `Dispatcher` resource is closed before the work unit finishes. When this happens, `Dispatcher` invalidates its internal state, cancels all current work fibers, and shuts down. This is a very safe default, but as it turns out, this is often not what people want.
Dispatcher
will simply wait for all outstanding work to finish before allowing theuse
block to terminate, rather than aggressively canceling all outstanding tasks. With the addition of the newawait = true
parameter, this is now possible. In 3.4.0, we can rewrite the above example in a more natural fashion, such that it has the guarantees we expect: