
Add footprint finder code #127

Draft: snbianco wants to merge 9 commits into main
Conversation

@snbianco (Collaborator) commented Sep 18, 2024:

This is a draft PR that adds the footprint finder code from TESScut. Leaving as draft since I haven't added tests yet, but I'd welcome any feedback on the actual implementation.

I tried to make things as general as possible in terms of function and variable names, but some things are more TESS-specific and will need to be pulled out into a wrapper later when we generalize. Namely, _extract_sequence_information, _create_sequence_list, and _get_cube_files_from_sequence_obs are TESS-specific as of now. The same is true of certain parts of cube_cut_from_footprint, mainly some of its variables.

Something I was unsure about is how to best handle multiprocessing. The cube_cut_from_footprint function takes a threads parameter to use as max_workers when using cube_cut on the gathered cube files. However, the cube_cut function also takes its own threads parameter. Should these be two separate parameters to cube_cut_from_footprint, or the same? Should threads in cube_cut just be set to 'auto'?
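
For reference, here is a rough sketch of how the threads parameter in cube_cut_from_footprint is wired up at the moment; the helper names below are stand-ins for illustration, not the actual functions in this PR:

from concurrent.futures import ThreadPoolExecutor, as_completed


def _cutout_single_cube(cube_file):
    # Stand-in for the real cube_cut call, which also takes its own threads parameter
    return f"cutout_of_{cube_file}"


def _cutout_all_cubes(cube_files, threads=1):
    # Generate a cutout from each cube file, optionally with an outer thread pool
    if threads == 'auto' or threads > 1:
        max_workers = None if threads == 'auto' else threads
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(_cutout_single_cube, f) for f in cube_files]
            return [future.result() for future in as_completed(futures)]
    return [_cutout_single_cube(f) for f in cube_files]


print(_cutout_all_cubes(['cube_1.fits', 'cube_2.fits'], threads='auto'))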

def single_intersect(ffi, polygon):
    return ffi.intersects_poly(polygon)

return np.vectorize(single_intersect)(ffi_list["polygon"], polygon)
@snbianco (Collaborator, Author):

I tried adding multiprocessing to this function, but found that there was no performance improvement (if anything, execution time went up because of thread overhead). In any case, the ra_dec_crossmatch function takes less than a second to run.

Member:

When you say threads, do you actually mean threads or processes?

This is CPU bound, and currently, with the GIL, there will not be any improvement using threads. You would need to use multiple processes.

@snbianco (Collaborator, Author) commented Sep 18, 2024:

I did mean threads, but I just tried with ProcessPoolExecutor, and oddly enough, timing is a bit worse. It takes around 1 second longer with the process pool vs. vectorize. This is the code I'm using for comparison:

from concurrent.futures import ProcessPoolExecutor

import numpy as np
from astropy.table import Table
from spherical_geometry.polygon import SphericalPolygon


def single_intersect(ffi, polygon):
    return ffi.intersects_poly(polygon)


def _ffi_intersect_process(ffi_list: Table, polygon: SphericalPolygon, workers=None):
    """
    Process-pool version of the spherical_geometry intersects_poly check
    (adapted from the vectorized _ffi_intersect).
    """

    with ProcessPoolExecutor(max_workers=workers) as executor:
        # Map the function across the ffi_list["polygon"]
        results = list(executor.map(single_intersect, ffi_list["polygon"], [polygon] * len(ffi_list)))

    return np.array(results)

@falkben (Member) commented Sep 18, 2024:

I tried your attached code and couldn't get it to work in a different file without some modifications. I'm also not sure the above was correct, given the arguments you've specified in map.

However, I can confirm that using the process pool executor is no faster, and a tiny bit worse, no matter what number of processes I use. There is overhead to starting up subprocesses in Python, so perhaps that's the issue we're running into. Also, spherical_geometry might do some multiprocessing itself, and we're just getting in the way of it by starting multiple processes?

Running the code below, the intersect takes just over 1 second on my machine, whereas the version in this PR takes around a second or just under.

import functools
import sys
import time
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import freeze_support

import numpy as np
from astropy.coordinates import SkyCoord
from astropy.table import Table
from spherical_geometry.polygon import SphericalPolygon

import astrocut.cube_cut_from_footprint

ccff = sys.modules["astrocut.cube_cut_from_footprint"]


def single_intersect(polygon, ffi_poly):
    return ffi_poly.intersects_poly(polygon)


def _ffi_intersect_process(ffi_list: Table, polygon: SphericalPolygon, workers=None):
    """
    Vectorizing the spherical_coordinate intersects_polygon function
    """

    start_t = time.monotonic()
    with ProcessPoolExecutor(max_workers=4) as executor:
        # Map the function across the ffi_list["polygon"]

        single_intersect_poly = functools.partial(single_intersect, polygon)
        results = executor.map(single_intersect_poly, ffi_list["polygon"], chunksize=64)
    print("took:", time.monotonic() - start_t)

    return np.array(list(results))


if __name__ == "__main__":
    # monkeypatch our version in
    ccff._ffi_intersect = _ffi_intersect_process

    footprint_file = "s3://tesscut-ops-footprints/tess_ffi_footprint_cache.json"

    all_ffis = ccff._get_s3_ffis(footprint_file, as_table=True, load_polys=True)

    coordinates = "217.42893801 -62.67949189"
    coordinates = SkyCoord(coordinates, unit="deg")

    cutout_size = ccff.parse_size_input(10)
    cone_results = ccff._ra_dec_crossmatch(all_ffis, coordinates, cutout_size, ccff.TESS_ARCSEC_PER_PX)

    print(cone_results)

@scfleming (Collaborator) commented Sep 18, 2024:

My initial reaction is to say let's set up a single n_threads parameter for multi-threading of any function that can use it. While on paper it might be nice to say "use 8 threads for this one but 16 for that one", that sounds like over-engineering to me at this stage of the process, and it would be much simpler to have a single n_threads parameter used globally.


# Generate cutout from each cube file
cutout_files = []
if threads == 'auto' or threads > 1:
Member:


I am not sure threads here will help, and it may in fact hurt. cube_cut already uses threads for accessing the S3 data, so with this change, each cutout file would spawn that many threads. In my testing, there are diminishing returns after 8 threads. Since this could end up creating many times that number of threads, I expect we'd see thread contention here.

If you set threads to 0, versus setting it to "auto" or 8, what are the results here?
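
To make the nesting concern concrete, here is a rough stub (the names are stand-ins, not the real astrocut functions); with an outer pool of 8 and an inner pool of 8 per cube_cut call, you can end up with on the order of 64 worker threads alive at once:

import threading
import time
from concurrent.futures import ThreadPoolExecutor


def _s3_read(_):
    # Stand-in for an S3 range request made inside cube_cut
    time.sleep(0.2)
    return threading.active_count()


def cube_cut_stub(cube_file, inner_threads=8):
    # Each cube_cut call opens its own thread pool, as in the current draft
    with ThreadPoolExecutor(max_workers=inner_threads) as pool:
        return max(pool.map(_s3_read, range(inner_threads)))


if __name__ == '__main__':
    cube_files = [f'cube_{i}.fits' for i in range(7)]
    # Outer pool, standing in for the one in cube_cut_from_footprint
    with ThreadPoolExecutor(max_workers=8) as outer:
        peak = max(outer.map(cube_cut_stub, cube_files))
    print('peak thread count observed:', peak)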

@snbianco (Collaborator, Author) commented Sep 19, 2024:

Testing on my machine, I do see a performance improvement with a larger number of threads. It's more apparent when a sector isn't provided and more than 1 cutout is being generated. For example, these commands each generate 7 cutouts.

cube_cut_from_footprint('130 30', cutout_size=50, threads=0) --> 1 min, 23.4 sec
cube_cut_from_footprint('130 30', cutout_size=50, threads=8) --> 57.6 sec
cube_cut_from_footprint('130 30', cutout_size=50, threads='auto') --> 46.4 sec

Member:


Is this before or after you made the change to pass the same threads variable into the cube_cut function?

@snbianco (Collaborator, Author) commented Sep 19, 2024:

Looks like I ran that test before making the change, when threads was still set to 'auto' for cube_cut. When using the same threads variable, the call with threads=0 takes a lot longer, which makes sense. I also see that there's less than a second of difference between threads=8 and threads='auto'.

Maybe it would be best to keep threads for cube_cut constant at some value, like 'auto' or 8? I think that using threads in cube_cut_from_footprint is still worthwhile for the performance improvement when making several cutouts at once, but performance does seem to stagnate after a certain point.

I'm also thinking that the default value for threads in cube_cut_from_footprint should be set to 8 rather than 1, since performance is consistently better.
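
Roughly what I have in mind, as an illustrative stub (not the actual astrocut signatures): keep the value passed down to cube_cut pinned at 'auto', and default the outer pool to 8 workers.

from concurrent.futures import ThreadPoolExecutor


def cube_cut(cube_file, threads='auto'):
    # Stand-in for the real cube_cut; its threads value stays fixed at 'auto' under this proposal
    return f"cutout_of_{cube_file}"


def cube_cut_from_footprint(cube_files, threads=8):
    # Proposed default of 8 outer workers; threads=0 or 1 falls back to a serial loop
    if threads != 'auto' and threads <= 1:
        return [cube_cut(f) for f in cube_files]
    max_workers = None if threads == 'auto' else threads
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cube_cut, cube_files))


print(cube_cut_from_footprint([f'cube_{i}.fits' for i in range(7)]))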

@@ -28,3 +28,4 @@ class UnsupportedPythonError(Exception):
from .cutout_processing import (path_to_footprints, center_on_path, # noqa
                                CutoutsCombiner, build_default_combine_function) # noqa
from .asdf_cutouts import asdf_cut, get_center_pixel # noqa
from .cube_cut_from_footprint import cube_cut_from_footprint # noqa
Member:


This hides the module. Is that intended?

I think it would be better to name the module differently from the function that you're intending to export.

@snbianco (Collaborator, Author):


Good point! I renamed the module to footprint_cutouts.

I'm not sure I follow what you mean by hiding the module, though. You can still import the module on its own.

Member:


Without the change to the module name, you couldn't easily import the module from another file. You had to resort to some trickery, e.g.:

import sys

import astrocut.cube_cut_from_footprint

ccff = sys.modules["astrocut.cube_cut_from_footprint"]

Member:


Note that the first import above doesn't give you the module; what you get is the function. But the module does get added to sys.modules when you import the function, so you can extract it from there.
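
Concretely, assuming the pre-rename layout where astrocut's __init__.py re-exports the function under the same name as its module:

import sys

import astrocut.cube_cut_from_footprint

# The attribute on the package has been rebound to the function by the
# `from .cube_cut_from_footprint import cube_cut_from_footprint` line in __init__.py...
print(type(astrocut.cube_cut_from_footprint))  # <class 'function'>

# ...but the submodule was still imported and registered under its dotted name,
# so it can be pulled back out of sys.modules.
ccff = sys.modules["astrocut.cube_cut_from_footprint"]
print(type(ccff))  # <class 'module'>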

@snbianco (Collaborator, Author):


I see, that makes sense!

Comment on lines 198 to 201
sequences : int, List[int], optional
Default None. Sequence(s) from which to generate cutouts. Can provide a single
sequence number as an int or a list of sequence numbers. If not specified,
cutouts will be generated from all sequences that contain the cutout.
Member:


Maybe you were trying to generalize this, but I'm not sure what sequences are. Are these sectors?

@snbianco (Collaborator, Author):


Yes, they refer to sectors! I was trying to make the parameter more general and borrowed "sequence" from the CAOM field descriptions: https://mast.stsci.edu/api/v0/_c_a_o_mfields.html

Member:


I guess for a user it might be a little unclear that for TESS this means sectors, and not cameras or CCDs or anything like that, so maybe some documentation or examples would help here.
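
For instance, something along these lines (hypothetical usage, assuming the current draft's signature):

from astrocut import cube_cut_from_footprint

# For TESS, `sequences` corresponds to sector numbers: this restricts the cutouts
# to sectors 27 and 28 instead of every sector that contains the target.
cutout_files = cube_cut_from_footprint('130 30', cutout_size=50, sequences=[27, 28])
print(cutout_files)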

@snbianco (Collaborator, Author) commented Sep 19, 2024:

Added some more info and an example to the docstring in the latest commit! I'll also be updating the documentation at some point (probably next sprint) and will definitely include examples there.
