Memory optimization: consider using Python array rather than list in some places #293
Comments
Hi Valeriu, most of the memory is consumed in this list: `JUNE/june/groups/group/subgroup.py`, line 20 (commit 8535ee6).
I had a go at using fixed numpy arrays for the subgroups: the idea is that all groups carry a maximum subgroup size, so we can have a fixed numpy array and fill it with people using a running index. I had a first go at https://github.com/IDAS-Durham/JUNE/tree/optimization/staticarrays, so you're welcome to have a look. I don't know if it's fully functional yet, as it was WIP that I hadn't had time to test. Speed-wise it was a bit slower, but I didn't measure the memory consumption. I think if we can reduce the memory consumption by a potential factor of 2 or 3, that would be great, even if the code is a bit slower.
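Roughly, the idea is something like this (a sketch with illustrative names, not the actual code on that branch):

```python
import numpy as np

class StaticSubgroup:
    """Preallocated member storage: a fixed-size object array plus a
    running fill index, instead of a list that grows by appending."""

    def __init__(self, max_size: int):
        self._people = np.empty(max_size, dtype=object)  # fixed allocation
        self._size = 0  # running index: next free slot

    def append(self, person) -> None:
        self._people[self._size] = person  # raises IndexError if full
        self._size += 1

    @property
    def people(self):
        return self._people[: self._size]  # view of the filled slots only
```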
This is the first time I've heard of Python arrays, so maybe it's worth a try as well!
@arnauqb mate - I just realized that the main culprit for overall memory consumption is the actual data (the
This means that the People persistent hdf5 data fully loaded in memory takes 68% of run memory; also that's about 3kB per Person object. The good news is this is dominant and should increase linearly with the population size; the bad news is that we can put
I mean, e.g., why is sex so expensive? 🤣 (we can use True/False; not nice, but it saves you 22 bytes 😁)
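For reference, the numbers behind those 22 bytes (64-bit CPython 3; exact sizes vary across builds):

```python
import sys

print(sys.getsizeof("m"))                        # 50: a one-char string
print(sys.getsizeof(True))                       # 28: a bool
print(sys.getsizeof("m") - sys.getsizeof(True))  # 22 bytes saved per value
```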
About the obscure Python arrays: here is the documentation, and a working implementation of interaction.py together with the working simulator (in the same dir), to account for the changed data type to
Does an infected person weigh the same as a non-infected one?
@florpi good question! Memory-allocation-wise, yes, give or take, unless some attribute values change from say
a few things will change, but I have just completed a run in which, at most, 20k were infected out of the 160k total population, and the memory jumped from 740M at the start of the loop to about 755M at peak (assumed to be where the 20k were infected). So that's about 15M of extra changed attribute values for 20k people, i.e. about 0.75 kB for each of the 20k, but only ~0.1 kB extra memory per Person on average. If you ran a simulation in which everyone would be infected at some point, then yes, that'd be an extra 0.75 kB per person; but against the 3 kB per person you start with, that'd be a memory hike of only 25% over the base consumption when loading the data.
So, to my understanding, I thought strings in Python were singletons: there is only one string "m" and one string "f" in the code, and Python creates pointers to that unique object, similarly to True and False. But this doesn't seem to be the case here? What am I missing?
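What I mean, as a quick CPython check (illustrative, not JUNE code):

```python
# CPython interns short strings like "m", so every literal "m" is the
# same object; each holder then pays only for an 8-byte reference to it
a = "m"
b = "m"
print(a is b)  # True on CPython
```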
sorry for not saying it explicitly guys, I reckon the computational implementation is A*-class 🍺 - there could be a million and one ways that the original (for the example) 500M of data in memory could become 3-4G in computations, so kudos! But we gotta think about how to improve the data loading in memory and it'll be ace 👍 @arnauqb on my machine the minimal size of a string in Python 3 is 50:
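(a minimal check, on a 64-bit CPython 3 build; exact numbers vary by build)

```python
import sys

print(sys.getsizeof("m"))        # 50: the minimal (one-char ASCII) string
print(sys.getsizeof("mm"))       # 51: each extra ASCII char adds one byte
print(sys.getsizeof("m" * 100))  # 149
```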
whatever is longer adds on top of that 50-byte baseline.
Also note that True and False actually have different sizes in Python 3 (yeah, I didn't know that myself): https://stackoverflow.com/questions/53015922/different-object-size-of-true-and-false-in-python-3
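i.e. (same 64-bit CPython 3 caveat):

```python
import sys

print(sys.getsizeof(True))   # 28: bool is an int subclass holding one digit
print(sys.getsizeof(False))  # 24: False stores no digits at all
```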
There was an annoying bug in how we loaded the world from hdf5; now the memory consumption is down by 50% (#302). It seems that households are more memory intensive than people now, possibly because they are not
Excellent detective work! 🔍 I will test it myself, but I noticed that the households, companies etc. don't consume too much, and that's pretty much dwarfed by a large population 👍
These are the memory profiling results I got for the South West region. Restoring household information (which basically saves a dictionary with the closest social venues to the household, and the relatives of the people in the household) seems to be at least as important as loading the population. Maybe changing the data structure we use to store the social venues would help.
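One possible direction (purely a sketch; the names and layout here are made up, not the current JUNE structures): keep venue IDs in a compact integer matrix per venue type, indexed by household, and resolve IDs to objects only when needed.

```python
import numpy as np

n_households = 100_000
n_closest = 3  # keep only the 3 nearest venues of each type

# hypothetical layout: one int32 matrix per venue type, indexed by
# household ID, instead of a per-household dict of object references
closest_pubs = np.full((n_households, n_closest), -1, dtype=np.int32)

closest_pubs[42] = [17, 203, 4518]  # made-up venue IDs for household 42

# at lookup time, drop the -1 padding and map IDs back to venue objects
ids = closest_pubs[42][closest_pubs[42] >= 0]
```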
It might indeed be the case for larger regions with less densely populated bits - for the example I am running (the example in
Hey guys! So I went through a lot of code recently, running all sorts of ideas and trial-and-error tryouts, and I am happy to say that I think there really isn't much more room for speedup in serial mode. Take for instance this tree I looked at yesterday: all the elementary computations I looked at are O(1e-5 s), which I think is the best one can get from something that's not a builtin Python function; in fact, a lot of the builtin functions are slower than that, depending on what you use. There are some things that @sadielbartholomew and myself are still a bit quizzical about, like statistics, but I didn't see those being used in the main time loop. So, unless we manage to parallelize it with mpi or Pool, I honestly can't see anything else major to speed up in the serial run. I'm sure @sadielbartholomew will find another ace, but I can't think of anything else unless I understand and change the workflow in detail (which I can't and don't want to, since that would be silly 😁)
Having said that, I think there is room to improve the memory consumption. One thing I can think of off the top of my head is to use Python arrays instead of lists when you have long lists that you keep appending to. Python arrays are not as memory-efficient as Numpy arrays, but they are heaps faster to append to (about 75% slower to append to than lists, but orders of magnitude faster than `np.append` or `np.concatenate`), and they are 4-5 times lighter on memory than lists. Do you think this would be something good to do? If so, would it be possible to point me to the bits of the code where this can be done, so I can start testing? Cheers 🍺
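PS: for concreteness, a rough comparison (numbers from a 64-bit CPython 3 build; they vary):

```python
import sys
from array import array

n = 1_000_000

lst = list(range(n))        # n 8-byte pointers + n separate int objects
arr = array("q", range(n))  # "q": raw signed 64-bit values stored inline

print(sys.getsizeof(lst))   # ~8 MB for the pointers alone; the int
                            # objects add roughly 28 bytes each on top
print(sys.getsizeof(arr))   # ~8 MB total, with no per-element objects

arr.append(n)  # amortized O(1), like list.append; np.append instead
               # copies the whole buffer on every call
```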