Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-3784 Fix Pixel replace and cube_build output file names #8979

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

jemorrison
Copy link
Collaborator

@jemorrison jemorrison commented Nov 25, 2024

Resolves JP-3784

Closes #8904

This PR addresses fixes the names of the output from pixel_replacement when called using a ModelContainer. In addition, it fixes the output names of cube_build when a single file is input to cube_build.

Tasks

news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.datamodels.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.fits_generator.rst
  • changes/<PR#>.set_telescope_pointing.rst
  • changes/<PR#>.pipeline.rst

stage 1

  • changes/<PR#>.group_scale.rst
  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.emicorr.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.ipc.rst
  • changes/<PR#>.firstframe.rst
  • changes/<PR#>.lastframe.rst
  • changes/<PR#>.reset.rst
  • changes/<PR#>.superbias.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.rscd.rst
  • changes/<PR#>.persistence.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.charge_migration.rst
  • changes/<PR#>.jump.rst
  • changes/<PR#>.clean_flicker_noise.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.gain_scale.rst

stage 2

  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.badpix_selfcal.rst
  • changes/<PR#>.msaflagopen.rst
  • changes/<PR#>.nsclean.rst
  • changes/<PR#>.imprint.rst
  • changes/<PR#>.background.rst
  • changes/<PR#>.extract_2d.rst
  • changes/<PR#>.master_background.rst
  • changes/<PR#>.wavecorr.rst
  • changes/<PR#>.srctype.rst
  • changes/<PR#>.straylight.rst
  • changes/<PR#>.wfss_contam.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.fringe.rst
  • changes/<PR#>.pathloss.rst
  • changes/<PR#>.barshadow.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.pixel_replace.rst
  • changes/<PR#>.resample_spec.rst
  • changes/<PR#>.residual_fringe.rst
  • changes/<PR#>.cube_build.rst
  • changes/<PR#>.extract_1d.rst
  • changes/<PR#>.resample.rst

stage 3

  • changes/<PR#>.assign_mtwcs.rst
  • changes/<PR#>.mrs_imatch.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.exp_to_source.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.tso_photometry.rst
  • changes/<PR#>.stack_refs.rst
  • changes/<PR#>.align_refs.rst
  • changes/<PR#>.klip.rst
  • changes/<PR#>.spectral_leak.rst
  • changes/<PR#>.source_catalog.rst
  • changes/<PR#>.combine_1d.rst
  • changes/<PR#>.ami.rst

other

  • changes/<PR#>.wfs_combine.rst
  • changes/<PR#>.white_light.rst
  • changes/<PR#>.cube_skymatch.rst
  • changes/<PR#>.engdb_tools.rst
  • changes/<PR#>.guider_cds.rst

@jemorrison jemorrison requested a review from a team as a code owner November 25, 2024 20:25
@jemorrison jemorrison changed the title JP-3784Pixel replace output JP-3784 Fix Pixel replace and cube_build output file names Nov 25, 2024
Copy link

codecov bot commented Nov 25, 2024

Codecov Report

Attention: Patch coverage is 60.86957% with 9 lines in your changes missing coverage. Please review.

Project coverage is 76.82%. Comparing base (e3d263f) to head (0069f17).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
jwst/pixel_replace/pixel_replace_step.py 12.50% 7 Missing ⚠️
jwst/stpipe/utilities.py 84.61% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8979      +/-   ##
==========================================
+ Coverage   76.81%   76.82%   +0.01%     
==========================================
  Files         496      496              
  Lines       45610    45612       +2     
==========================================
+ Hits        35034    35043       +9     
+ Misses      10576    10569       -7     
Flag Coverage Δ *Carryforward flag
nightly 77.42% <ø> (-0.01%) ⬇️ Carriedforward from e3d263f

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jemorrison
Copy link
Collaborator Author

@melanieclarke @emolter
As I am writing a unit test for this step I have come across something I want to check with you guys.
After running pixel_replace on a single file the output data has 'pixelreplacestep' in the output.meta.filename.
But if I run it on a container the returned output is the input cal file. BUT if I write the data out with save_results=True. The correct filename (with pixelreplacestep) is on disk.
I am testing this:
file = []
file.append('jw01267003001_02105_00001_mirifushort_cal.fits')
file.append('jw01267003001_02105_00002_mirifushort_cal.fits')
pixel_replace = pixel_replace_step.PixelReplaceStep()
container = ModelContainer(file)
pixel_replace.call(container, save_results=True, skip=False)
for c in container:
print(c.meta.filename)
Here is a snap shot of running the step on a container. The information in the log show the correct filenames are created but the print statement shows the return value in meta.filename is the cal file

024-11-27 07:10:50,422 - stpipe.PixelReplaceStep - INFO - Saved model in jw01267003001_02105_00001_mirifushort_pixelreplacestep.fits
2024-11-27 07:10:51,059 - stpipe.PixelReplaceStep - INFO - Saved model in jw01267003001_02105_00002_mirifushort_pixelreplacestep.fits
2024-11-27 07:10:51,060 - stpipe.PixelReplaceStep - INFO - Step PixelReplaceStep done
2024-11-27 07:10:51,060 - stpipe - INFO - Results used jwst version: 1.16.1.dev237+g59ef24d68

jw01267003001_02105_00001_mirifushort_cal.fits
jw01267003001_02105_00002_mirifushort_cal.fits

So the step is working just fine but I can not seem to access the output filenames when it is called from a container.
Any suggestions ? Frankly it works so that is a win. I just wanted the filenames to have the pixelreplacestep in the name so I could have a unit test

@jemorrison
Copy link
Collaborator Author

@melanieclarke I got this step to work by including the 'invariant_filenames' routine that is from cal spec3 and it manages filenames. I just copied the routine and suck in pixel_replace_step.py. I tried to import it from:
from jwst.pipeline.calwebb_spec3 import invariant_filename

But that gets circular and I get an error because calwebb_spec3 is importing pixel_replace.
I can leave it as it is - but maybe you have a better idea rather than having duplicate code.

@emolter
Copy link
Collaborator

emolter commented Nov 27, 2024

So the step is working just fine but I can not seem to access the output filenames when it is called from a container.
Any suggestions ? Frankly it works so that is a win. I just wanted the filenames to have the pixelreplacestep in the name so I could have a unit test

The invariant_filename wrapper appears to ensure that model.meta.filename does NOT change when step.save_model is called. I'm not convinced that having _pixelreplacestep in model.meta.filename is the right thing to do, because isn't that just outdated again once the container is passed to the next step?

I didn't review this in detail yet but I don't see anything wrong with the current solution.

For unit testing, instead of checking the model.meta.filename, could you check the actual name of the files written to disk? See e.g. this outlier detection unit test as an example - it just uses assert path.isfile(filename). Make sure you pass in the tmp_cwd fixture like this test does, too.

@melanieclarke
Copy link
Collaborator

But that gets circular and I get an error because calwebb_spec3 is importing pixel_replace.
I can leave it as it is - but maybe you have a better idea rather than having duplicate code.

We could move that function to a library and import it in both places - maybe in lib/pipe_utils.py. I'm not sure I understand why that's working, though.

@jemorrison jemorrison added this to the Build 11.2 milestone Nov 27, 2024
@emolter
Copy link
Collaborator

emolter commented Nov 27, 2024

We could move that function to a library and import it in both places - maybe in lib/pipe_utils.py.

At risk of nit-picking, since invariant_filename wraps an stpipe.step method, would somewhere within jwst.stpipe be a good place for it?

@melanieclarke
Copy link
Collaborator

Testing locally, it looks like this solution does work for the case we were trying to fix, where container is made from a list of filenames:
PixelReplaceStep.call(container, save_results=True, skip=False)

But it now un-fixes the calwebb_spec3 case. Calling like this:
strun calwebb_spec3 s3.json --steps.pixel_replace.skip=False --steps.pixel_replace.save_results=True
the output from pixel_replace is correct, but the final products look like this:
_g395h-f290lp_s3d.fits and _g395h-f290lp_x1d.fits.

@jemorrison
Copy link
Collaborator Author

@melanieclarke @emolter I do have a question about a unit test that saves the output to disk. Is that allowed ? It writes out files to the test directory ?

@melanieclarke I need to understand your test better. In your s3.json file what is the output name set to ?

I have a set of tests (in a Jupyter notebook) that I have developed to test the various filenames of pixel_replace and cube_build. When I run calspec3 and save the pixel_replace output. ALL the filenames have the correct format.

spec3_asn = 'test.json'
spec3 = Spec3Pipeline()
result = spec3.call(spec3_asn,
steps={ 'outlier_detection':{'skip':True},
'pixel_replace':{'skip': False, 'save_results': True},
'cube_build':{'save_results':True},
'extract_1d':{'skip': True}
},
save_results=True)

in the association the name is set to 'test_miri' and there are 4 input files.
The output from pixel_replace is:
024-11-27 11:37:44,322 - stpipe.Spec3Pipeline.pixel_replace - INFO - Saved model in jw01267003001_02105_00002_mirifushort_ctest_pixel_replace.fits
2024-11-27 11:37:44,765 - stpipe.Spec3Pipeline.pixel_replace - INFO - Saved model in jw01267003001_02105_00001_mirifushort_ctest_pixel_replace.fits
2024-11-27 11:37:45,482 - stpipe.Spec3Pipeline.pixel_replace - INFO - Saved model in jw01267003001_02105_00004_mirifushort_ctest_pixel_replace.fits
2024-11-27 11:37:45,936 - stpipe.Spec3Pipeline.pixel_replace - INFO - Saved model in jw01267003001_02105_00003_mirifushort_ctest_pixel_replace.fits

and the output from cube_build is:
2024-11-27 11:38:17,804 - stpipe.Spec3Pipeline.spectral_leak - INFO - Saved model in test_miri_ch2-short_x1d.fits
2024-11-27 11:38:17,914 - stpipe.Spec3Pipeline.spectral_leak - INFO - Saved model in test_miri_ch1-short_x1d.fits

@jemorrison
Copy link
Collaborator Author

@melanieclarke Forget last message. I was on a wrong branch. Let me check what is going on with cube_build and I will update

@emolter
Copy link
Collaborator

emolter commented Nov 27, 2024

@emolter I do have a question about a unit test that saves the output to disk. Is that allowed ? It writes out files to the test directory ?

If you're passing in e.g. the tmp_cwd fixture to the test, it will make its own temporary directory where it writes out files. After the test ran, you should not see any new files in the test directory.

@jemorrison
Copy link
Collaborator Author

jemorrison commented Dec 6, 2024

@emolter @melanieclarke
I can get reasonable filename now but one problem.
If a ModelContainer is created from a list of filenames. It will by default create a container.asn_table["products"][0]["name"]
AND that product name that is used to make the cube output is the first name of the file in the list.
No ASN id is created.
The problem is when cube_build is finished creating an IFU cube from 2 files the out put name only has the first name because that is what is in the product name. So it is confusing that it is really a cube from 2 files. Not sure what to do here. Our pipeline is not setup to properly work on a ModelContainer but not a valid association.

for example:
file = []
file.append('jw01267003001_02105_00001_mirifushort_cal.fits')
file.append('jw01267003001_02105_00002_mirifushort_cal.fits')
container = ModelContainer(file)
print(container.asn_table["products"][0]["name"]) - > jw01267003001_02105_00001_mirifushort
final cubes made from both files has the name:
jw01267003001_02105_00001_mirifushort_ch1-short_s3d.fits
jw01267003001_02105_00001_mirifushort_ch2-short_s3d.fits

@jemorrison
Copy link
Collaborator Author

@emolter @melanieclarke
I also need to use the invariant_filename from calwebb_spec3. But I get a circular import if I just try and
from jwst.pipeline.calwebb_spec3 import invariant_filename
I have the step running by making a copy of invariant_filename in the pixel_replace step. But I should do this properly
and pull out invariant_filename and put it some place else.
Suggestions ?

@emolter
Copy link
Collaborator

emolter commented Dec 6, 2024

If a ModelContainer is created from a list of filenames. It will by default create a container.asn_table["products"][0]["name"]
AND that product name that is used to make the cube output is the first name of the file in the list.

@jemorrison It appears this is working the way I intended it to, but I do see why this could be confusing. It wasn't obvious to me how else to set a default output filename in cases like this, where there is no association metadata passed in. We could attempt to parse the filename strings and remove the exposure number, but then what happens if someone renamed their files to a custom name? or if the list includes files from multiple programs/observations/visits?

Do you have a suggestion for a better way to do this?

@emolter
Copy link
Collaborator

emolter commented Dec 6, 2024

But I should do this properly
and pull out invariant_filename and put it some place else.
Suggestions ?

I think Melanie and I already made some suggestions
#8979 (comment)
#8979 (comment)
was there something wrong with those places? or am I misunderstanding the new question?

@jemorrison jemorrison force-pushed the pixel_replace_output branch from 5756f31 to ddcab09 Compare December 6, 2024 17:02
@jemorrison
Copy link
Collaborator Author

@emolter the reason I find it confusing is it it seems like cube build only included 1 file and not both files. There really is no way to know because that information is not written to the header. So you just have to look at the log message and know that both files were included in creating the IFU cubes. Not that I think this is much of an issue - I do not think most people will create a model container and run cube_build on that container. It is really set up to run using an association that sets up the correct product name. But since this ticket all started from running pixel_replace and cube build on containers I thought I would raise this issue.

@jemorrison
Copy link
Collaborator Author

@emolter @melanieclarke I should have been clearer on invariant_filenames. I am checking what is better. Melanie did not seem convinced. Are we agreed to move it to jwst.stpipe ?

@melanieclarke
Copy link
Collaborator

Are we agreed to move it to jwst.stpipe ?

Yep, works for me!

@jemorrison
Copy link
Collaborator Author

@emolter I move the invariant_filename to stpipe.utiltites and import it as follows:
from jwst.stpipe.utilities import invariant_filename

good with you ?

@emolter
Copy link
Collaborator

emolter commented Dec 6, 2024

@emolter I move the invariant_filename to stpipe.utiltites and import it as follows: from jwst.stpipe.utilities import invariant_filename

good with you ?

yeah seems great

@jemorrison jemorrison force-pushed the pixel_replace_output branch from c8f6688 to d51601f Compare December 9, 2024 16:02
Copy link
Collaborator

@emolter emolter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great Jane, just one question about the test setup and a (low priority, but I hope easy) request for a unit test of invariant_filename

@@ -211,3 +212,23 @@ def query_step_status(datamodel, cal_step):
return getattr(datamodel[0].meta.cal_step, cal_step, NOT_SET)
else:
return getattr(datamodel.meta.cal_step, cal_step, NOT_SET)


def invariant_filename(save_model_func):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not crucial, but it would be helpful to write a unit test for this function. I think it could be pretty simple. Writing some pseudocode on-the-fly, I'm thinking something like

def mock_save_func(model):
    model.meta.filename = "changed"
    return model

def test_invariant_filename(input_model):
    input_model.meta.filename = "initial_name"
    invariant_save_func = invariant_filename(mock_save_func)
    output_model = invariant_save_func(input_model)
    assert output_model.meta.filename == "initial_name"


# for this simple case, the results from either algorithm should
# be the same
result = PixelReplaceStep.call(input_model, skip=False, algorithm=algorithm)
result = PixelReplaceStep.call(input_model, skip=False, algorithm=algorithm,save_results=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result = PixelReplaceStep.call(input_model, skip=False, algorithm=algorithm,save_results=True)
result = PixelReplaceStep.call(input_model, skip=False, algorithm=algorithm, save_results=True)

return_files = []
# for this simple case, the results from either algorithm should
# be the same
result = PixelReplaceStep.call(container, skip=False, algorithm=algorithm,save_results=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
result = PixelReplaceStep.call(container, skip=False, algorithm=algorithm,save_results=True)
result = PixelReplaceStep.call(container, skip=False, algorithm=algorithm, save_results=True)

# for this simple case, the results from either algorithm should
# be the same
result = PixelReplaceStep.call(container, skip=False, algorithm=algorithm,save_results=True)
for dirname in [output_dir, tmp_cwd]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to check both the cwd and the output dir here? I would have thought we would know beforehand which files went to which directories, and part of the test would be to ensure those are as expected.

@melanieclarke melanieclarke modified the milestones: Build 11.2, Build 11.3 Dec 13, 2024
@jemorrison
Copy link
Collaborator Author

@emolter I added a unit test for invariant filenames. Is this good enough ?

Copy link
Collaborator

@emolter emolter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All my questions / comments have been answered, thanks Jane!

@jemorrison
Copy link
Collaborator Author

@melanieclarke There are some regression test failures.
https://github.com/spacetelescope/RegressionTests/actions/runs/12359349881
Going to run these after the dust settles from this delivery and see if some of them go away

@melanieclarke
Copy link
Collaborator

melanieclarke commented Dec 17, 2024

Going to run these after the dust settles from this delivery and see if some of them go away

Some of them might, but it does look like many are because the S_PXREPL is present and set to COMPLETE in the truth file, but not in the test output. Can you check on that?

I forgot you are out - I can take over for this issue and see if we can sort that out and get this settled.

Copy link
Collaborator

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some issues with output cal_step status in the regression tests - this needs to be resoved before this PR can be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PixelReplaceStep returns strange output filenames from save_results=True
3 participants