Add footprint finder code #127
base: main
Conversation
astrocut/cube_cut_from_footprint.py
Outdated

```python
def single_intersect(ffi, polygon):
    return ffi.intersects_poly(polygon)

return np.vectorize(single_intersect)(ffi_list["polygon"], polygon)
```
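For context on the performance discussion that follows: `np.vectorize` is a convenience wrapper around a Python-level loop, not compiled vectorization, so the PR's version performs like a plain loop over the polygons. A minimal standalone illustration (no astrocut involved):

```python
import numpy as np

def slow_intersect(a, b):
    # Stand-in for a per-polygon, Python-level check like intersects_poly
    return a + b

# np.vectorize gives array-style broadcasting semantics, but under the
# hood it still calls slow_intersect once per element in Python.
vec = np.vectorize(slow_intersect)
print(vec([1, 2, 3], 10))  # [11 12 13]
```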
I tried adding multiprocessing to this function, but found that there was no performance improvement (if anything, execution time went up because of thread overhead). In any case, the `ra_dec_crossmatch` function takes < 1 sec to run.
When you say threads, do you actually mean threads or processes?
This is CPU-bound, and currently, with the GIL, there will not be any improvement from using threads. You would need to use multiple processes.
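The threads-versus-processes distinction can be sketched with a standalone benchmark (stdlib only, no astrocut): a pure-Python CPU-bound task mapped over a thread pool versus a process pool. Timings are machine-dependent, but on a GIL build of CPython the thread pool should show little speedup:

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def cpu_bound(n):
    # Pure-Python arithmetic: the GIL serializes this across threads.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, jobs, workers=4):
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as ex:
        results = list(ex.map(cpu_bound, jobs))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    jobs = [200_000] * 8
    t_res, t_secs = timed(ThreadPoolExecutor, jobs)
    p_res, p_secs = timed(ProcessPoolExecutor, jobs)
    assert t_res == p_res
    # Expect the thread version to take roughly as long as a serial loop,
    # while the process version can use multiple cores (minus pool startup).
    print(f"threads: {t_secs:.2f}s  processes: {p_secs:.2f}s")
```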
I did mean threads, but I just tried with `ProcessPoolExecutor`, and oddly enough, timing is a bit worse: it takes around 1 second longer with the process pool vs. `np.vectorize`. This is the code I'm using for comparison:
```python
def single_intersect(ffi, polygon):
    return ffi.intersects_poly(polygon)

def _ffi_intersect_process(ffi_list: Table, polygon: SphericalPolygon, workers=None):
    """
    Parallelize the spherical_geometry intersects_poly check across processes.
    """
    with ProcessPoolExecutor(max_workers=workers) as executor:
        # Map the function across the ffi_list["polygon"] column
        results = list(executor.map(single_intersect, ffi_list["polygon"], [polygon] * len(ffi_list)))
    return np.array(results)
```
I tried your attached code and couldn't get it to work in a different file without some modifications. I'm also not sure the above was correct, given the arguments you've specified in `map`.
However, I can confirm that using the process pool executor is no faster, and a tiny bit slower, no matter what number of processes I use. There is overhead to starting up subprocesses in Python, so perhaps that's the issue we're running into. Also, `spherical_geometry` might do some multiprocessing itself, so we may just be getting in the way of that by starting multiple processes?
Running the code below reports the intersect taking just over 1 second on my machine, whereas the version in this PR takes around a second or just under.
```python
import functools
import sys
import time
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import freeze_support

import numpy as np
from astropy.coordinates import SkyCoord
from astropy.table import Table
from spherical_geometry.polygon import SphericalPolygon

import astrocut.cube_cut_from_footprint

ccff = sys.modules["astrocut.cube_cut_from_footprint"]

def single_intersect(polygon, ffi_poly):
    return ffi_poly.intersects_poly(polygon)

def _ffi_intersect_process(ffi_list: Table, polygon: SphericalPolygon, workers=None):
    """
    Parallelize the spherical_geometry intersects_poly check across processes.
    """
    start_t = time.monotonic()
    with ProcessPoolExecutor(max_workers=4) as executor:
        # Map the function across the ffi_list["polygon"] column
        single_intersect_poly = functools.partial(single_intersect, polygon)
        results = executor.map(single_intersect_poly, ffi_list["polygon"], chunksize=64)
    print("took:", time.monotonic() - start_t)
    return np.array(list(results))

if __name__ == "__main__":
    # monkeypatch our version in
    ccff._ffi_intersect = _ffi_intersect_process

    footprint_file = "s3://tesscut-ops-footprints/tess_ffi_footprint_cache.json"
    all_ffis = ccff._get_s3_ffis(footprint_file, as_table=True, load_polys=True)

    coordinates = "217.42893801 -62.67949189"
    coordinates = SkyCoord(coordinates, unit="deg")
    cutout_size = ccff.parse_size_input(10)

    cone_results = ccff._ra_dec_crossmatch(all_ffis, coordinates, cutout_size, ccff.TESS_ARCSEC_PER_PX)
    print(cone_results)
```
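The subprocess-startup hypothesis is easy to check in isolation. This sketch (stdlib only, no astrocut) times a process pool doing trivial work, so nearly all of the elapsed time is pool creation and inter-process communication rather than computation:

```python
import time
from concurrent.futures import ProcessPoolExecutor

def noop(x):
    return x

if __name__ == "__main__":
    # For sub-second workloads like the ~1 s intersect above, this fixed
    # cost can erase any parallel speedup.
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=4) as ex:
        results = list(ex.map(noop, range(8)))
    assert results == list(range(8))
    print(f"pool startup + 8 trivial tasks: {time.perf_counter() - start:.3f}s")
```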
My initial reaction is to say let's set up a single …
astrocut/cube_cut_from_footprint.py
Outdated

```python
# Generate cutout from each cube file
cutout_files = []
if threads == 'auto' or threads > 1:
```
I am not sure threads here will help, and it may in fact hurt. `cube_cut` already uses threads for accessing the S3 data, so with this change, each cutout file would spawn that many threads. In my testing, there are diminishing returns after 8 threads. Since this could end up creating many times that number of threads, I expect we'd see thread contention here.
If you set threads to 0, versus setting threads to 'auto' or 8, what are the results here?
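The contention concern can be sketched abstractly. In this hypothetical layout (illustrative names and pool sizes, not astrocut's actual API), a single shared I/O pool caps the total thread count instead of letting every cutout spawn its own:

```python
from concurrent.futures import ThreadPoolExecutor

# One shared pool for the inner I/O fan-out, instead of a fresh pool per cutout.
IO_POOL = ThreadPoolExecutor(max_workers=8)

def fetch_chunk(i):
    return i * 2  # stand-in for an S3 byte-range read

def make_cutout(file_id):
    # Inner fan-out borrows the shared I/O pool rather than creating one.
    chunks = list(IO_POOL.map(fetch_chunk, range(4)))
    return file_id, sum(chunks)

def make_all_cutouts(file_ids, workers=2):
    # The outer pool stays small, so the process holds roughly
    # workers + 8 threads instead of workers * 8.
    with ThreadPoolExecutor(max_workers=workers) as outer:
        return list(outer.map(make_cutout, file_ids))

print(make_all_cutouts(["s1.fits", "s2.fits"]))
# -> [('s1.fits', 12), ('s2.fits', 12)]
```

Note the outer and inner fan-outs use two different pools; submitting both levels to one bounded pool risks deadlock when outer tasks occupy every worker while waiting on inner futures.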
Testing on my machine, I do see a performance improvement with a larger number of threads. It's more apparent when a sector isn't provided and more than one cutout is being generated. For example, these commands each generate 7 cutouts:

```python
cube_cut_from_footprint('130 30', cutout_size=50, threads=0)       # 1 min, 23.4 sec
cube_cut_from_footprint('130 30', cutout_size=50, threads=8)       # 57.6 sec
cube_cut_from_footprint('130 30', cutout_size=50, threads='auto')  # 46.4 sec
```
Is this before or after you made the change to pass the same `threads` variable into the `cube_cut` function?
Looks like I ran the test before, when `threads` was set to `'auto'` for `cube_cut`. When using the same `threads` variable, the call with `threads=0` takes a lot longer, which makes sense. I also see that there is less than a second's difference between `threads=8` and `threads='auto'`.

Maybe it would be best to keep `threads` for `cube_cut` constant at some value, like `'auto'` or 8? I think that using threads in `cube_cut_from_footprint` is still worthwhile for the performance improvement when making several cutouts at once, but performance does seem to stagnate after a certain point.

I'm also thinking that the default value for `threads` in `cube_cut_from_footprint` should be set to 8 rather than 1, since performance is consistently better.
astrocut/__init__.py
Outdated

```diff
@@ -28,3 +28,4 @@ class UnsupportedPythonError(Exception):
 from .cutout_processing import (path_to_footprints, center_on_path,  # noqa
                                 CutoutsCombiner, build_default_combine_function)  # noqa
 from .asdf_cutouts import asdf_cut, get_center_pixel  # noqa
+from .cube_cut_from_footprint import cube_cut_from_footprint  # noqa
```
This hides the module; is that intended?
I think it would be better to name the module differently from the function that you are intending to export.
Good point! I renamed the module to `footprint_cutouts`.
I'm not sure I know what you mean by hiding the module, though. You can still import the module on its own.
Without the change to the name of the module, you couldn't easily import the module from another file; you had to resort to some trickery, e.g.:

```python
import astrocut.cube_cut_from_footprint
ccff = sys.modules["astrocut.cube_cut_from_footprint"]
```
Note that the first import above doesn't bind the module name, it binds the function. But the module does get added to `sys.modules` when you import the function, so you can extract it from there.
I see, that makes sense!
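The shadowing behavior is easy to reproduce with a throwaway package (the `pkg`/`greet` names here are invented for the demo; the layout mirrors astrocut's original `cube_cut_from_footprint` clash):

```python
import os
import sys
import tempfile
import types

# pkg/greet.py defines greet(), and pkg/__init__.py does
# `from .greet import greet` -- function and submodule share a name.
tmp = tempfile.mkdtemp()
pkg_dir = os.path.join(tmp, "pkg")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from .greet import greet\n")
with open(os.path.join(pkg_dir, "greet.py"), "w") as f:
    f.write("def greet():\n    return 'hi'\n")
sys.path.insert(0, tmp)

import pkg.greet  # noqa: E402

# The attribute `pkg.greet` is the *function*: executing __init__ rebound
# the name, shadowing the submodule.
assert callable(pkg.greet) and not isinstance(pkg.greet, types.ModuleType)

# The submodule was still imported, though, and can be dug out of
# sys.modules -- the "trickery" described above.
mod = sys.modules["pkg.greet"]
assert isinstance(mod, types.ModuleType)
print(mod.greet())  # hi
```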
astrocut/cube_cut_from_footprint.py
Outdated

```python
sequences : int, List[int], optional
    Default None. Sequence(s) from which to generate cutouts. Can provide a single
    sequence number as an int or a list of sequence numbers. If not specified,
    cutouts will be generated from all sequences that contain the cutout.
```
Maybe you were trying to generalize this, but I'm not sure what sequences are. Are these sectors?
Yes, they refer to sectors! I was trying to make the parameter more general and borrowed "sequence" from the CAOM field descriptions: https://mast.stsci.edu/api/v0/_c_a_o_mfields.html
I guess for a user it might be a little unclear that for TESS this means sectors, and not cameras or CCDs or anything like that, so maybe some documentation or examples would help here.
Added some more info and an example to the docstring in the latest commit! I'll also be updating the documentation at some point (probably next sprint) and will definitely include examples there.
This is a draft PR that adds the footprint finder code from TESScut. Leaving as draft since I haven't added tests yet, but I'd welcome any feedback on the actual implementation.

I tried to make things as general as possible in terms of functions and variable names, but some things are more TESS-specific and will need to be pulled out into a wrapper later when we generalize. Namely, `_extract_sequence_information`, `_create_sequence_list`, and `_get_cube_files_from_sequence_obs` are more TESS-specific as of now. The same is true about certain parts of `cube_cut_from_footprint`, mainly variables.

Something I was unsure about is how to best handle multiprocessing. The `cube_cut_from_footprint` function takes a `threads` parameter to use as `max_workers` when calling `cube_cut` on the gathered cube files. However, the `cube_cut` function also takes its own `threads` parameter. Should these be two separate parameters to `cube_cut_from_footprint`, or the same? Should `threads` in `cube_cut` just be set to `'auto'`?
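One possible resolution, sketched with invented stand-ins rather than astrocut's real signatures: accept a single `threads` value in the outer function and leave the inner call's own default at `'auto'`, so the two knobs stay independent:

```python
from concurrent.futures import ThreadPoolExecutor

def cube_cut(cube_file, threads="auto"):
    # Hypothetical stand-in: 'auto' maps to max_workers=None, letting the
    # executor pick a pool size for the per-file I/O fan-out.
    workers = None if threads == "auto" else threads
    with ThreadPoolExecutor(max_workers=workers) as ex:
        return list(ex.map(lambda i: (cube_file, i), range(4)))

def cube_cut_from_footprint(files, threads=8):
    # A single outer `threads` parameter governs the fan-out over cube
    # files; the inner call keeps its own 'auto' default.
    if threads == "auto" or threads > 1:
        workers = None if threads == "auto" else threads
        with ThreadPoolExecutor(max_workers=workers) as ex:
            return list(ex.map(cube_cut, files))
    return [cube_cut(f) for f in files]

print(len(cube_cut_from_footprint(["cube-a", "cube-b"])))  # 2
```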