serializing umap crashes application because of exploding memory #1125
When trying to serialize it to disk using joblib, memory consumption increases to 106 GB and then, in my case, it crashes because the hard disk was full: on the filesystem, umap.pcl was 67 GB.
So apparently, when serializing, joblib calls this function, and the following line produces the error. Here hyperplane_dim seems to be the same as my dataset dimensionality, and since that is over a million, an _ArrayMemoryError is thrown.
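To get a feel for the numbers, here is a back-of-the-envelope illustration (not the library's actual allocation code): if the RP-tree hyperplanes are materialised as a dense array of shape (n_nodes, hyperplane_dim) while pickling, a dimensionality around a million is enough to explain array sizes in this range. The node count and dtype below are assumptions.

```python
# Rough arithmetic only; the real node count depends on the data and tree parameters.
n_nodes = 10_000            # assumed total number of tree nodes across the forest
hyperplane_dim = 1_000_000  # matches the reported dataset dimensionality
bytes_needed = n_nodes * hyperplane_dim * 8  # assuming float64 entries
print(f"{bytes_needed / 1e9:.0f} GB")        # ~80 GB for this single dense array
```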
Hi!
I am trying to serialize a trained UMAP model with pickle.dumps.
Unfortunately something is going wrong: memory explodes from 5 GB to more than 252 GB,
and for some reason the following output is printed when executing
io_bytes_array_data = dumps(umap)
and the whole thing crashes as it exceeds my memory. Apparently some code is executed while pickle does its thing that probably should not happen.
I managed to create a minimal example that also generates this kind of output when using the pickle.dumps method.
However, it does not explode the memory, since that probably also depends on the size of the matrix that is fed into UMAP.
It only happens when the approximation algorithm is run.
In my real use case I am feeding a
scipy.sparse.csr_matrix
into UMAP.
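For reference, here is a sketch of the kind of reproduction described above (not the reporter's exact script): fit UMAP on a high-dimensional sparse CSR matrix so the nearest-neighbor approximation path is used, then pickle the fitted model. The data shape, density, variable names, and the force_approximation_algorithm flag are assumptions; the column count is scaled down from the million-dimensional input in the report so the snippet runs.

```python
import pickle

import scipy.sparse as sp
import umap

# Sparse input with many columns, mimicking the high-dimensional CSR matrix
# from the report (scaled down here).
n_samples, n_features = 2000, 100_000
X = sp.random(n_samples, n_features, density=1e-4, format="csr", random_state=42)

# Force the approximate nearest-neighbor path, since the issue is reported
# to occur only when the approximation algorithm runs.
reducer = umap.UMAP(force_approximation_algorithm=True)
reducer.fit(X)

# Serializing the fitted model is where the memory blow-up is observed;
# joblib.dump(reducer, "umap.pcl") reportedly behaves the same way.
blob = pickle.dumps(reducer)
print(f"pickled size: {len(blob) / 1e6:.1f} MB")
```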