Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential memory leak #126

Open
atsiflis opened this issue Jan 17, 2021 · 4 comments
Open

Potential memory leak #126

atsiflis opened this issue Jan 17, 2021 · 4 comments
Labels

Comments

@atsiflis
Copy link

Hi,

Thank you for making this very useful library available.

I am using it in a Python script that builds a spline once and then evaluates it multiple times. I have noticed that the memory allocation of the Python process increases by a few MB after each evaluation and I am considering whether this is due to some memory leak in splinter. To investigate this, I run the following simplified example (adapted from the basic usage example)

import numpy as np
from guppy import hpy
import psutil, os
import splinter

def f1(x):
    return -1. + 2*x + 0.1*(x**2) + 10*np.random.rand(1)[0]

# Generate the sample data
x = np.arange(0, 11, 1)
y = np.zeros((len(x),))
for i in range(len(x)):
    y[i] = f1(x[i])

# Build cubic B-spline
bspline = splinter.BSplineBuilder(x, y, degree=3).build()

# Generate evaluation points
xd = np.arange(0, 10, .01)

# Evaluate B-spline, calculating memory allocation
hp = hpy()
process = psutil.Process()
hp_before = hp.heap()
psutil_before = process.memory_info().rss
bspline.eval(xd)
print(process.memory_info().rss - psutil_before)
print(hp.heap() - hp_before)

from which I get

73728
Partition of a set of 2 objects. Total size = 444 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0      1  50      416  94       416  94 types.FrameType
     1      1  50       28   6       444 100 int

Given that the call to bspline.eval() increased the memory allocated to the process by 73728 bytes but the size of the reachable objects increased by only 444 bytes, is it right to conclude that there is a memory leak in splinter?

@gablank
Copy link
Collaborator

gablank commented Jan 19, 2021

Hello @atsiflis!

We're happy the library has been useful to you :-)

I agree that it may seem like a memory leak. As far as I know (and remember), the evaluation of a spline should not result in any allocation of memory (that is not also freed before returning), except for the data structure to hold the result. However, the Python interface makes reasoning about this a bit harder, as it uses a garbage collector which may run at (seemingly) random times.

Ideally, the memory usage of this should converge to some constant:

while True:
    y = bspline.eval(x)

I would have to have a more thorough look to determine if this is actually a memory leak, or if this is just Python playing us a trick.

Thank you for the report and the example, it will be very useful for debugging purposes!

@yj-Roy
Copy link

yj-Roy commented Jun 4, 2021

I also meet the same problem.
My dataset is large,it will occupy all the memory of my computer, and report an error.
I still have a problem and do not know how to solve it. I use python, if I use for loop + splinter, its time cost is mainly in a large number of for loops, it will take more time than scipy( because scipy uses matrix processing), how to solve this problem, thanks

@yj-Roy
Copy link

yj-Roy commented Jun 8, 2021

I can't solve this problem, but I think I can alleviate it.
The cinterface.h file contains a function SPLINTER_API void splinter_datatable_delete(splinter_obj_ptr datatable_ptr);
But this function is not used in codes, So in the datatable.py I add this function:

def __del__(self):
    if self.__handle is not None:
        splinter._call(splinter._get_handle().splinter_datatable_delete, self.__handle)
    self.__handle = None

After that, still have a memory leak problem, but the leak speed is very slow.
Thanks

@bgrimstad
Copy link
Owner

Hi, @yj-Roy. Good catch! I have not investigated this problem, but it seems reasonable to implement the del method for all the Python classes (not only DataTable) to prevent memory leakage. It would be great if you could prepare and submit a pull request so that we can fix the problem for all users.

Bjarne

@bgrimstad bgrimstad added the Bug label Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants