Configurable object serialisation #119

Open
aidansteele opened this issue Feb 2, 2015 · 3 comments

@aidansteele

Hi,

I really appreciate the library. I've been considering a potential optimisation and would like to know your thoughts before submitting a pull request. Currently, Marshal is used to serialise objects that cross process boundaries.

It would be great if we could opt to use a different serialisation class, e.g. Oj or MessagePack; both are quite a bit faster than Marshal. Parallel could be configured to use another (de)serialisation method via a parameter, a configuration block, or similar.
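For illustration only, here is a minimal, self-contained sketch of the idea: data crossing a process boundary over a pipe, with the (de)serialiser chosen by the caller rather than hard-coded to Marshal. The `SERIALIZER` constant and the pipe plumbing are just for this example, not part of the gem:

```ruby
require 'oj' # Oj.dump / Oj.load mirror Marshal.dump / Marshal.load

SERIALIZER = Oj # swap in Marshal to compare; this constant is illustrative only

reader, writer = IO.pipe

# Requires a platform with fork (e.g. MRI on Unix).
pid = fork do
  reader.close
  result = { "squares" => (1..5).map { |i| i * i } } # stand-in for real work
  writer.write(SERIALIZER.dump(result))              # serialise across the boundary
  writer.close
end

writer.close
payload = reader.read
Process.wait(pid)

p SERIALIZER.load(payload) # => {"squares"=>[1, 4, 9, 16, 25]}
```

One caveat: unlike Marshal, JSON-style serialisers may not round-trip arbitrary Ruby objects (symbols, custom classes, etc.) without extra configuration, so an opt-in would presumably keep Marshal as the default.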

What do you think?

@grosser
Owner

grosser commented Feb 2, 2015

Sounds good. Parallel.serializer = JSON / Marshal / XXX should do the trick; all of them respond to .load and .dump, afaik.

The amount of data must be gigantic for this to make any real difference, but I can see this being useful ...
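A minimal sketch of what such a hook might look like, assuming a module-level `serializer` accessor (a hypothetical name, not an existing option of the gem):

```ruby
# Hypothetical configuration hook -- a sketch of the proposal, not the gem's API.
require 'json'

module Parallel
  class << self
    # Accept anything that responds to .dump and .load (Marshal, JSON, Oj, ...).
    attr_writer :serializer

    def serializer
      @serializer || Marshal # keep Marshal as the default behaviour
    end
  end
end

# Internally, the hard-coded calls would then become:
#   Parallel.serializer.dump(item)  instead of  Marshal.dump(item)
#   Parallel.serializer.load(data)  instead of  Marshal.load(data)

Parallel.serializer = JSON
p Parallel.serializer.load(Parallel.serializer.dump([1, 2, 3])) # => [1, 2, 3]
```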

@aidansteele
Author

You're right, it's a bit of an odd request. If one cares about the performance that much, why would Ruby be used? But flexibility is always nice :)

That said, I've just done a very quick implementation and another issue popped up. All of my parallel jobs took almost exactly the same amount of time to run and their resulting giant encoded blobs are deserialised serially. It would be neat to deserialise them concurrently in threads on the receiving end (assuming these libraries even release the GVL), but that adds a whole lot of additional complexity for what is probably minimal gain in the general case.
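For reference, a rough sketch of the receive-side change being described; the `encoded_blobs` array stands in for the payloads read back from the workers, and whether the threads actually overlap depends on the parser releasing the GVL:

```ruby
require 'oj'

# Stand-in for the large payloads read back from each worker process.
encoded_blobs = 4.times.map do |i|
  Oj.dump({ "worker" => i, "data" => Array.new(100_000) { rand } })
end

# Current behaviour as described: each blob is deserialised one after another.
serial_results = encoded_blobs.map { |blob| Oj.load(blob) }

# Proposed alternative: deserialise concurrently in threads. This only helps
# if the parser releases the GVL while working; otherwise the threads take turns.
threads = encoded_blobs.map { |blob| Thread.new { Oj.load(blob) } }
threaded_results = threads.map(&:value)

puts serial_results.size == threaded_results.size # => true
```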

@grosser
Owner

grosser commented Feb 2, 2015

Strange, I thought the deserialization was done in threads too oO

