
Memory optimization: consider using Python array rather than list in some places #293

Open
valeriupredoi opened this issue Aug 13, 2020 · 14 comments


@valeriupredoi
Collaborator

valeriupredoi commented Aug 13, 2020

Hey guys! So I went through a lot of code recently, running all sorts of ideas and trial-and-error tryouts, and I'm happy to say that I think there really isn't much more room for speedup in serial mode. Take for instance this tree I looked at yesterday - all the elementary computations I looked at are of O(1e-5 s), which I think is the best one can get from something that isn't a builtin Python function; a lot of the builtin functions are actually slower than that, depending on what you use. There are some things that @sadielbartholomew and I are still a bit quizzical about, like statistics, but I didn't see those being used in the main time loop. So, unless we manage to parallelize it with mpi or Pool, I honestly can't see anything else major to speed up in the serial run. I'm sure @sadielbartholomew will find another ace, but I can't think of anything else unless I understand and change the workflow in detail (which I can't and don't want to, since that would be silly 😁 )

Having said that, I think there is room to improve the memory consumption. One thing I can think of off the top of my head is to use Python arrays instead of lists wherever you have long lists that you keep appending to - Python arrays are not as memory-efficient as Numpy arrays, but they are heaps faster to append to (about 75% slower to append to than lists, but orders of magnitude faster than np.append or np.concatenate), and they are 4-5 times lighter on memory than lists. Do you think this would be something good to do? If so, would it be possible to point me to the bits of the code where this can be done so I can start testing? Cheers 🍺
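To illustrate what I mean - a toy sketch only, not JUNE code (the variable names are made up for the demo): appending a long run of integer IDs to a plain list vs a Python array (array.array).

```python
import sys
from array import array

n = 1_000_000

ids_list = []
for i in range(n):
    ids_list.append(i)        # each element is a pointer to a full Python int object

ids_array = array("l", [])    # typecode "l" = signed C long; must be declared up front
for i in range(n):
    ids_array.append(i)       # values stored inline as raw C longs, no per-element objects

# getsizeof only counts the containers; for the list, every int above the small-int
# cache is an extra ~28-byte object on top, which is where the real saving comes from
print(sys.getsizeof(ids_list), sys.getsizeof(ids_array))
```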

@arnauqb
Member

arnauqb commented Aug 13, 2020

Hi Valeriu,

Most of the memory is consumed in this list:

self.people = []

I had a go at using fixed numpy arrays for the subgroups; the idea is that all groups carry a maximum subgroup size, so we can have a fixed numpy array and fill it with people using a running index. I had a first go at https://github.com/IDAS-Durham/JUNE/tree/optimization/staticarrays, so you're welcome to have a look. I don't know if it's fully functional yet as it was WIP which I haven't had time to test. Speed-wise it was a bit slower, but I didn't measure the memory consumption. I think if we can reduce the memory consumption by a potential factor of 2 or 3 then that would be great, even if the code is a bit slower.
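Roughly the idea, as an illustrative sketch only (the class and names here are made up for this comment, not the actual branch code):

```python
import numpy as np

class StaticSubgroup:
    """Fixed-capacity container for people, filled via a running index."""

    def __init__(self, max_size):
        self.people = np.empty(max_size, dtype=object)  # pre-allocated slots
        self.size = 0                                    # running index

    def append(self, person):
        self.people[self.size] = person
        self.size += 1

    def __iter__(self):
        # iterate only over the filled part of the array
        return iter(self.people[: self.size])

    def __len__(self):
        return self.size
```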

@arnauqb
Member

arnauqb commented Aug 13, 2020

This is the first time I've heard about Python arrays, so maybe it's worth a try as well!

@valeriupredoi
Collaborator Author

@arnauqb mate - I just realized that the main culprit for overall memory consumption is the actual data (the people as returned by load_population_from_hdf5); I changed it to a numpy array correctly populated with People objects and saw no overall decrease in memory per run (the example run). Here are the numbers:

  • maximum resident memory (RAM) per run: 740 MB (stays constant throughout, bar a fast increase in the first minute or so)
  • resident memory right at the end of load_population_from_hdf5: 505 MB
  • size of the people array: 163852 entries

This means that the People persistent hdf5 data, fully loaded in memory, takes about 68% of the run memory; it also works out to about 3 kB per Person object.
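Quick back-of-the-envelope check of those two statements, using the numbers above:

```python
print(505 / 740)               # ≈ 0.68 -> the loaded population is ~68% of peak RAM
print(505 * 1024**2 / 163852)  # ≈ 3200 bytes -> roughly 3 kB per Person object
```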

The good news is that this is dominant and should scale linearly with the population size; the bad news is that we can swap lists for arrays and np.arrays as much as we want in the code and it won't make much of a difference. The other good news is that you could reduce the size of a Person object itself (and that should be easy - use less expensive attribute values or restrict data types?). Here's the list of attributes for a regular Joe and their sizes (don't mind the stderr lines, I wanted to grab everything I could find; there's a sketch of the kind of scan that produced this right after the list):

__annotations__ 640
__class__ 1064
__copy__ 72
__defaults__ 640
__delattr__ 48
__dir__ 72
__doc__ 16
__eq__ 48
__fields__ 184
__format__ 72
__ge__ 48
__getattribute__ 48
__getnewargs__ 72
__getstate__ 72
__gt__ 48
__hash__ 48
__init__ 48
__init_subclass__ 72
__iter__ 48
__le__ 48
__len__ 48
__lt__ 48
__module__ 71
__ne__ 48
__new__ 136
__reduce__ 72
__reduce_ex__ 72
__repr__ 48
__setattr__ 48
__setstate__ 72
__sizeof__ 72
__str__ 48
__subclasshook__ 72
_id 56
age 32
area 32
busy 24
comorbidity 16
dead 24
ethnicity 50
find_guardian 64
from_attributes 64
health_information 16
home_city 16
hospitalised 24
'list' object has no attribute 'residence'
id 32
infected 24
infection 16
'Person' object has no attribute 'hospital'
'list' object has no attribute 'leisure'
lockdown_status 16
'list' object has no attribute 'medical_facility'
mode_of_transport 48
'list' object has no attribute 'primary_activity'
'list' object has no attribute 'rail_travel'
recovered 24
sector 16
sex 50
socioecon_index 32
sub_sector 16
subgroups 120
susceptibility 24
susceptible 28
symptoms 16
work_super_area 16
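For reference, this is roughly the kind of scan that produced the list above (a sketch only; `person` would be any loaded Person instance, and the error lines are presumably properties that fail on this particular one):

```python
import sys

def dump_attribute_sizes(person):
    """Print the name and getsizeof of everything reachable on a Person."""
    for name in dir(person):
        try:
            print(name, sys.getsizeof(getattr(person, name)))
        except AttributeError as err:
            # some properties fail on this particular instance -> the error lines above
            print(err)
```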

attn @sadielbartholomew @grenville

@valeriupredoi
Collaborator Author

valeriupredoi commented Aug 13, 2020

I mean, e.g. - why is sex so expensive? 🤣 (could use True/False - not nice, but it saves you 22 bytes 😁 )

@valeriupredoi
Collaborator Author

about the obscure Python arrays: here is the documentation and a working implementation of interaction.py, together with the working simulator (in the same dir), adapted to the changed data type (array). The only extra thing is that one needs to declare the data type when the array is created - the rest behaves just like a list, but much lighter on memory 🍺
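A toy example of that extra declaration (made-up variable name, not the actual interaction.py change):

```python
from array import array

# "d" = C double; the typecode has to be declared when the array is created
transmission_probabilities = array("d", [])

transmission_probabilities.append(0.05)          # appending works just like on a list
transmission_probabilities.extend([0.1, 0.2])
print(len(transmission_probabilities), sum(transmission_probabilities))
```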

@florpi
Collaborator

florpi commented Aug 13, 2020

does an infected person weigh the same as a non-infected one?

@valeriupredoi
Collaborator Author

valeriupredoi commented Aug 13, 2020

@florpi good question! Memory allocation-wise, yes, give or take - unless some attribute values change from, say, None to a string, but roughly it's about the same. Here's your regular Joe as he's loaded into memory right after loading the hdf5 file:

Person(id=163851, sex='m', age=98, ethnicity='A', socioecon_index=7, area=521, work_super_area=None, sector=None, sub_sector=None, lockdown_status=None, comorbidity=None, home_city=None, mode_of_transport=<ModeOfTransport Driving a car or van>, busy=False, subgroups=[['household', 66703, 3], [None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None], [None, None, None]], health_information=None, susceptibility=1.0, dead=False)

a few things will change, but I have just completed a run in which, at the peak, 20k were infected out of the 160k total population, and the memory jumped from 740M at the start of the loop to about 755M at its peak (presumably when the 20k were infected). That's about 15M of extra changed attribute values for 20k people -> roughly 0.75 kB for each of the 20k infected, but only about 0.1 kB of extra memory per Person on average. If you ran a simulation in which everyone got infected at some point then yes, that'd be an extra 0.75 kB per person, but against the 3 kB per person you start with, that'd be a memory hike of only ~25% over the base consumption from loading the data
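Back-of-the-envelope arithmetic for the above:

```python
extra_bytes = (755 - 740) * 1024**2   # peak increase during the run
print(extra_bytes / 20_000)           # ≈ 790 bytes -> ~0.75 kB per infected person
print(extra_bytes / 163_852)          # ≈ 96 bytes  -> ~0.1 kB averaged over everyone
```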

@arnauqb
Member

arnauqb commented Aug 13, 2020

I mean, e.g. - why is sex so expensive? 🤣 (could use True/False - not nice, but it saves you 22 bytes 😁 )

So, to my understanding, I thought strings in Python were singletons - that there is actually only one string "m" and one string "f" in the code, and Python creates pointers to that unique object, similarly to True and False. But that doesn't seem to be the case here? What am I missing?

@valeriupredoi
Collaborator Author

sorry for not saying it explicitly, guys - I reckon the computational implementation is A*-class 🍺. There could be a million and one ways that the original (for the example) 500M of data in memory could have become 3-4G in computations, so kudos! But we've got to think about how to improve the data loading into memory and then it'll be ace 👍

@arnauqb on my machine the minimal size of a string in Python 3 is 50 bytes:

In [1]: import sys                                                                                                                       

In [2]: sys.getsizeof("m")                                                                                                               
Out[2]: 50

In [3]: sys.getsizeof("meeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee")                                                                      
Out[3]: 91

whatever is longer just adds to that ~50-byte overhead, roughly one byte per extra ASCII character
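That said, to your point about sharing: on CPython (implementation detail, but easy to check), single-character strings like "m" are cached and shared, so every Person's sex attribute points at the same object and the ~50 bytes above are paid once overall rather than once per Person:

```python
import sys

a = "m"
b = "m"
print(a is b)            # True on CPython: single-character strings are cached and shared
print(sys.getsizeof(a))  # ~50 bytes, but it's one shared object, not 50 bytes per Person
```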

@valeriupredoi
Collaborator Author

also note that True and False actually have different sizes in Python 3 (yeah, I didn't know that myself): https://stackoverflow.com/questions/53015922/different-object-size-of-true-and-false-in-python-3
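For example, on a 64-bit CPython 3 build:

```python
import sys

print(sys.getsizeof(True))   # 28 -- True is the int 1 underneath, which carries one digit
print(sys.getsizeof(False))  # 24 -- False is the int 0, no digits, hence 4 bytes less
```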

@arnauqb
Member

arnauqb commented Aug 18, 2020

There was an annoying bug in how we loaded the world from hdf5; the memory consumption is now down by 50% (#302). It seems that households are more memory intensive than people now, possibly because they are not dataobjects like people are. I'm not sure how making Household a dataobject would change how it inherits from the Group class, though...
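One generic way to get dataobject-like compactness is `__slots__` (whether it plays nicely with inheriting from Group is exactly the open question) - a purely illustrative sketch with made-up classes, not the actual Household:

```python
import sys

class PlainHousehold:
    def __init__(self, id, area):
        self.id = id
        self.area = area

class SlottedHousehold:
    __slots__ = ("id", "area")   # fixed attribute layout, no per-instance __dict__

    def __init__(self, id, area):
        self.id = id
        self.area = area

p, s = PlainHousehold(1, 2), SlottedHousehold(1, 2)
# the plain instance pays for its instance dict on top of the object itself
print(sys.getsizeof(p) + sys.getsizeof(p.__dict__), sys.getsizeof(s))
```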

@valeriupredoi
Collaborator Author

excellent detective work! 🔍 I will test it myself, but I noticed that the households, companies etc. don't consume too much, and that's pretty much dwarfed by a large population 👍

@arnauqb
Member

arnauqb commented Aug 18, 2020

These are the memory profiling results I got for the South West region.

[memory profiling plot for the South West region]

Restoring the households information (which basically saves a dictionary with the closest social venues to each household, and the relatives of the people in the household) seems to be at least as important as loading the population. Maybe changing the data structure we use to store the social venues would help.
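Purely as a sketch of what I mean (made-up names, not the actual code): instead of every household holding a dict of venue objects, one flat integer array of venue IDs per venue type, indexed by household, resolved to objects only when needed:

```python
import numpy as np

n_households = 100_000
n_closest = 5   # e.g. keep the 5 closest pubs per household

# one flat int32 table per venue type, -1 meaning "no venue in that slot"
closest_pubs = np.full((n_households, n_closest), -1, dtype=np.int32)

def set_closest_pubs(household_id, pub_ids):
    closest_pubs[household_id, : len(pub_ids)] = pub_ids

def get_closest_pubs(household_id, pubs_by_id):
    return [pubs_by_id[i] for i in closest_pubs[household_id] if i >= 0]
```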

@valeriupredoi
Collaborator Author

valeriupredoi commented Aug 18, 2020

it might indeed be the case for larger regions with less densely populated bits - for the example I am running (the one in scripts) this would be around 200M while the population was 520M, so yeah, significant but not a show-stopper 🍺
