-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unstreamM etc destroys performance; expose unstreamPrimM & document most-performant method #416
Comments
We must face the fact that Once we are sure that my example is the only undefined behavior of the function, we could expose |
Wait -- what if we employed |
I see your point about safety. It looks like there's some understanding of this elsewhere e.g. https://www.tweag.io/blog/2021-02-10-linear-base/ "the [linear approach] even makes I didn't consider non-determinism before, but the behaviour would be safe for a linear-in-spirit monad (like
Simply Changing
(I refactored the code slightly from what I had in the OP; latest version available in https://github.com/infinity0/lang-benchmarks) It looks like performance is not significantly affected at least in this simple case (perhaps GHC optimises away the clone?), though it's incredibly weird that unsafeFreeze/unboxed is slower than freeze/unboxed (I ran this several times and got the same weird result). |
That would be a sensible solution if it works. I haven't looked deep into GHC's linear monad yet, but it sounds promising.
See #409 for another example of behavior you observed. There is also some discussion about that there. |
What's crucial is to understand how exactly they're unsafe and when it's safe to use them. import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Fusion.Bundle.Monadic as BM
import ListT
bundleList :: BM.Bundle (ListT IO) v Int
bundleList = BM.generateM 2 $ \i -> fromFoldable [i, 100+i]
goBad :: IO [U.Vector Int]
goBad = toList $ unstreamPrimM bundleList
>>> goBad
[[100,101],[100,101],[100,101],[100,101]] Since all vector share same mutable buffer they all end up same! This causes problem since we continue to modify mutable vector after we called I suspect that this happens because |
I've been thinking of this safety issues these days. Of course
|
And thanks to the explanation of @Shimuuar, I guess I got your point better, @infinity0. Obviously It's better if we have |
I was messing around with squeezing performance out of this library.
By contrast, here is the equivalent rust code:
As you can see, the properly-written fusion version performs as well as rust. This was a very pleasant surprise for me. HOWEVER - the correct way is not documented anywhere!!! In particular, the convenience function
unstreamPrimM
is for some reason defined in the source code ofVector.Generic
, used no-where else, not exported nor advertised, yet is absolutely vital for reaching this nirvana of performance.By contrast, unstreamM is what's exposed in the API and destroys the performance so it performs even worse than the manual imperative mutable version. Even worse, all the utility functions are written in terms of unstreamM, e.g. VG.mapM, etc etc. Yes this means they have a convenient
Monad m =>
constraint, but any non-haskell-expert that cares about performance would benchmark it and write off the library as "Haskell is slow". Providing mirror utilities that have aPrimMonad m =>
constraint that useunstreamPrimM
instead ofunstreamM
, as well as a few examples, would help this effort.For reference, the above took me about half a day. Not everybody exploring Haskell has that sort of time or patience.
The text was updated successfully, but these errors were encountered: