Add value per hectare output values to relevant models #1270

Closed
newtpatrol opened this issue Apr 4, 2023 · 15 comments · Fixed by #1717
Labels
science request A request/proposal from within natcap related to science (rather than engineering)

Comments

@newtpatrol
Contributor

Several recent internal discussions have noted that a number of our models provide output values per pixel, which many users then need to convert to values per hectare. That conversion could easily be done within the model itself.

This is most apparent with the Carbon Storage model, which takes in carbon pool values as tons/ha, then produces results that are tons/pixel, which doesn't make a whole lot of sense. Interestingly, Blue Carbon provides outputs in tons/ha.

The number of forum posts where people ask about this conversion (and how many of those are unsure of how to do it) points to the utility of providing value/ha as standard output. It would be good to do a more rigorous model review to see which ones this would apply to, but these come to mind:

  • Carbon storage
  • Carbon edge
  • SDR
  • NDR
  • Crop production

AWY and SWY might not need it, since they output in millimeters, which I think should be the same per pixel or per ha, and any calculated volumes are total across a watershed. Urban Flood and Urban stormwater have a lot of volumes, and it's unclear whether they're per pixel or something else.

@davemfish
Contributor

Are tons/ha a well-known convention for measuring carbon stock? If so I agree there could be value to returning results in those units. For other cases/models, I would ask, why hectares?

And for all cases,

I think there is some potential to introduce confusion if a raster's values do not represent the native pixel area. I feel like that's a built-in assumption of the raster: the pixel value is per-unit-area of the pixel. That allows for easy operations like summing all the pixel values to get a total per region.

@davemfish davemfish changed the title Add per value per hectare output values to relevant models Add value per hectare output values to relevant models Apr 4, 2023
@davemfish davemfish added science request A request/proposal from within natcap related to science (rather than engineering) and removed science request A request/proposal from within natcap related to science (rather than engineering) labels Apr 4, 2023
@dcdenu4
Member

dcdenu4 commented Apr 5, 2023

potential to introduce confusion if a raster's values do not represent the native pixel area. I feel like that's a built-in assumption of the raster: the pixel value is per-unit-area of the pixel.

I think this is the point I was trying to bring up when talking about this in person with @lmandle and @adrianvogl. I was confused about interpreting a raster with values per hectare when the pixel area itself was smaller than a hectare... Maybe those are different confusion points.

@davemfish
Contributor

I was confused about interpreting a raster with values per hectare when the pixel area itself was smaller than a hectare

yes, I think this is exactly the sort of case I was thinking about. Slightly less confusing when the pixel is larger than a hectare, but would still require extra operations before doing things like summing values across pixels.
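To make the "extra operations" concrete, here is a minimal sketch of how a regional total would be computed from each kind of raster. It assumes square pixels of a known area, no nodata values, and rasters held as plain nested lists; the function names are hypothetical and are not part of InVEST.

```python
import math

M2_PER_HA = 10_000


def total_from_per_pixel(raster):
    """Sum a raster whose values are already totals per pixel (e.g. tons/pixel)."""
    return sum(sum(row) for row in raster)


def total_from_per_hectare(raster, pixel_area_m2):
    """Sum a raster of densities (e.g. tons/ha) by weighting with pixel area."""
    pixel_area_ha = pixel_area_m2 / M2_PER_HA
    return sum(sum(row) for row in raster) * pixel_area_ha


# Illustrative 2x2 rasters describing the same landscape with 30 m pixels (900 m2).
per_hectare = [[100.0, 50.0], [0.0, 150.0]]   # tons/ha
per_pixel = [[9.0, 4.5], [0.0, 13.5]]         # tons/pixel

# Both approaches should agree on the regional total, up to floating-point error.
print(math.isclose(total_from_per_pixel(per_pixel),
                   total_from_per_hectare(per_hectare, 900.0)))
```

The per-hectare raster needs the pixel area before it can be summed, which is the extra step being discussed; the per-pixel raster does not.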

@adrianvogl

adrianvogl commented Apr 6, 2023

I disagree that this could be confusing in cases where pixel size does not equal one hectare. To me it would be very clearly stated in the units that it is per hectare, and many of the outputs that InVEST provides are commonly reported in per hectare terms (carbon storage, erosion, sediment export, nutrient export). To me it is less confusing when summing across a larger area to adjust the sum of per ha values using the total area, than it is to pull up raster outputs from multiple InVEST models, try to remember which ones are per pixel and which are per hectare, check the user guide to verify, and then apply pixel level conversions to create outputs that standardize the units across multiple models.

@davemfish davemfish added this to the 3.14.0 milestone Jun 27, 2023
@davemfish davemfish self-assigned this Jan 11, 2024
@davemfish
Contributor

It appears that Carbon, Forest Carbon, NDR, & SDR are the only models that create rasters with per-pixel values,

C:\Users\dmf\projects\invest\src\natcap\invest > grep -r "/u.pixel" .

The Crop Production models and Coastal Blue Carbon already calculate per-hectare. And no other models calculate per-area raster values, according to our MODEL_SPEC['outputs'] dictionaries.

Updating those 4 models to calculate per-hectare values is straightforward. In general, they all have one or two places where a value is currently being scaled by a pixel size (measured in hectares) to yield tons/pixel. That scaling would be removed from the raster math. But then it will often be re-applied later when models aggregate raster values within polygons and report a total per polygon. (There's one case in FCEE where the aggregation also reports a mean density in the polygon, using tons/hectare, as well as a total sum.)

If we do this, many regression tests for these models will fail. In theory, the aggregated values in polygons should remain the same, or at least very close. But values in rasters will be very different.

✔️ Carbon tests have regression values hardcoded in the tests.
✔️ NDR tests have regression values hardcoded in the tests.
✔️ SDR tests have regression values hardcoded in the tests.
❌ FCEE tests use regression data files including carbon rasters and aggregate vectors, making them more laborious to update

Time estimate: 2 days

The changes are likely to be straightforward. But since regression data is changing, we cannot rely on tests to make sure we didn't screw up. So we should afford extra time to verifying the results of the changes, which sometimes will cascade through a large number of intermediate data products.
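One way to do that extra verification is to compare an output raster from before the change against the same output after it. The sketch below is only an illustration under simple assumptions (uniform pixel area, no nodata, rasters as nested lists); the function name and tolerance are hypothetical and this is not how the InVEST test suite is written.

```python
import math

M2_PER_HA = 10_000


def outputs_are_consistent(old_per_pixel, new_per_hectare, pixel_area_m2,
                           rel_tol=1e-6):
    """Check that every new tons/ha value equals the old tons/pixel value
    rescaled by (10000 / pixel area in m2)."""
    if len(old_per_pixel) != len(new_per_hectare):
        return False
    factor = M2_PER_HA / pixel_area_m2
    for old_row, new_row in zip(old_per_pixel, new_per_hectare):
        if len(old_row) != len(new_row):
            return False
        for old_value, new_value in zip(old_row, new_row):
            if not math.isclose(old_value * factor, new_value,
                                rel_tol=rel_tol, abs_tol=1e-12):
                return False
    return True


# Illustrative data for 30 m pixels (900 m2 each).
old = [[9.0, 4.5], [0.0, 13.5]]        # tons/pixel, before the change
new = [[100.0, 50.0], [0.0, 150.0]]    # tons/ha, after the change
print(outputs_are_consistent(old, new, 900.0))
```

If a check like this passes for the rasters and the per-polygon totals are unchanged, the new regression values can be recorded with more confidence.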

@davemfish
Contributor

davemfish commented Jan 24, 2024

I also still have some uncertainty about making this change for the Carbon model. If the model is not scaling the input carbon pools data (tons/hectare) by the area, then is it really doing anything useful? In other words, if the output raster is still tons/hectare for each pixel, does it contain any information that wasn't already present in the carbon pools table?

The same concern might apply to the Carbon Edge Effects model, I'm not sure.

@lmandle

lmandle commented Jan 24, 2024

I see value to our users in having the biophysical values mapped to the LULC layer, even if it's not transforming the values. @adrianvogl would you like to weigh in?

The per ha value might be especially useful when combining with other layers (e.g. cost of restoration or protection would more likely be in per ha than per pixel values, I'd expect.) Per ha values would also be more useful than per pixel values in cases where there's variation in pixel size. InVEST results are often used to identify hotspots and per pixel values would bias those results to pixels with more on-the-ground area. I'm less sure how relevant that is for most InVEST users.
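As a rough illustration of what "mapping the biophysical values to the LULC layer" means, the sketch below sums the pool densities for each land-cover class and writes that density into every pixel of the class. The table values and LULC codes are made up, the pool column names are used here only for illustration, and the rasters are plain nested lists rather than real raster files.

```python
# Hypothetical carbon pools table: LULC code -> pool densities in tons/ha.
carbon_pools = {
    1: {"c_above": 80.0, "c_below": 20.0, "c_soil": 40.0, "c_dead": 10.0},
    2: {"c_above": 5.0, "c_below": 5.0, "c_soil": 15.0, "c_dead": 5.0},
}


def total_density_by_lulc(pools):
    """Add up the pools for each LULC code to get total tons/ha per class."""
    return {code: sum(values.values()) for code, values in pools.items()}


def map_density_to_lulc(lulc_raster, density_by_code):
    """Replace each LULC code with its total carbon density (tons/ha)."""
    return [[density_by_code[code] for code in row] for row in lulc_raster]


lulc = [[1, 2], [2, 1]]  # illustrative 2x2 land-cover raster
density = total_density_by_lulc(carbon_pools)
carbon_per_ha = map_density_to_lulc(lulc, density)
print(carbon_per_ha)
```

The output carries no new numbers beyond the table, but it places them spatially, which is what makes it usable alongside other per-hectare layers.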

@newtpatrol
Contributor Author

The carbon model is so simple that I often wonder why it exists at all. That said, if all it's going to do is add up 4 numbers, it may as well produce results in the same units that they were provided in, which are the units people most often work with, as noted by Lisa and Adrian.

Not sure if I missed this in the discussion, but can we report both per-pixel and per hectare values, provide clear file naming, and be clear in the user guide what they mean?

@davemfish
Contributor

Not sure if I missed this in the discussion, but can we report both per-pixel and per hectare values, provide clear file naming, and be clear in the user guide what they mean?

I had been thinking in terms of one or the other, but yes, it's worth considering doing both. It feels a bit redundant in terms of disk space since one is simply a copy of the other multiplied by a scalar value. But there could be convenient ways to use a GDAL virtual raster (VRT) instead of having a complete copy of the whole raster.

@lmandle

lmandle commented Feb 5, 2024

I share your worries, @davemfish, about disk space. For example, Jesse has been running SDR for all of Columbia at 30 m resolution, and the resulting rasters are 5-12 GB each. (See here: https://stanford-natcap.slack.com/archives/C010LBUED7V/p1707165454707979)

@davemfish
Contributor

@lmandle that's definitely a case where I wouldn't want the extra files!

I played around with using a VRT to "virtually" create a second raster that is a linear re-scaling of the first. I'm curious what people think in terms of usability of a file like this, whether it's potentially a point of confusion, or a useful addition.

In this example, I took one of the current carbon model outputs, tot_c_cur_willamatte.tif, which represents total carbon tons in each pixel, and created ha_c_cur_willamette.vrt, which has tons/hectare in each pixel.
carbon_results.zip

@newtpatrol what do you think?

As discussed earlier, I think we want to change the main output to be the tons/hectare version, and then the VRT would be the simple total per pixel. But for now that's not what the model produces, so I did it this way.
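For readers unfamiliar with the idea, here is a minimal sketch of one way a VRT can express a linear rescaling of another raster: a `ComplexSource` with a `ScaleRatio`. This is an assumption about how such a file could be built, not necessarily how the attached example was made. The file name, raster size, and helper function are hypothetical, and a usable VRT would also need the `<SRS>` and `<GeoTransform>` (and any nodata settings) copied from the source raster.

```python
import xml.etree.ElementTree as ET


def build_scaled_vrt(source_filename, x_size, y_size, scale_ratio,
                     data_type="Float32"):
    """Return VRT XML text whose single band is source band 1 times scale_ratio."""
    dataset = ET.Element("VRTDataset",
                         rasterXSize=str(x_size), rasterYSize=str(y_size))
    band = ET.SubElement(dataset, "VRTRasterBand",
                         dataType=data_type, band="1")
    source = ET.SubElement(band, "ComplexSource")
    filename = ET.SubElement(source, "SourceFilename", relativeToVRT="1")
    filename.text = source_filename
    ET.SubElement(source, "SourceBand").text = "1"
    # output value = ScaleRatio * source value + ScaleOffset
    ET.SubElement(source, "ScaleOffset").text = "0"
    ET.SubElement(source, "ScaleRatio").text = repr(scale_ratio)
    return ET.tostring(dataset, encoding="unicode")


# Hypothetical source: a tons/pixel raster with 900 m2 pixels, 512 x 512 in size.
tons_per_ha_factor = 10_000 / 900.0
vrt_xml = build_scaled_vrt("tot_c_cur.tif", 512, 512, tons_per_ha_factor)
print(vrt_xml)
```

Because the VRT is a small text file that points at the original raster, it adds almost nothing to disk usage, but it only helps if the user's GIS software applies the scaling when reading it.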

@newtpatrol
Contributor Author

newtpatrol commented Mar 5, 2024

Sorry I'm just noticing this now @davemfish. When I bring the .tif and .vrt into ArcGIS Pro, I see no difference at all in the values between the two layers, they are exactly the same. I've never used a VRT before, but in general would advocate for doing the simple thing of just making a separate TIFF with per hectare values, which will be easier for users to understand.

The reason I came back to this thread was to note that yet another user on the forum today is wondering why their carbon values were so small, and expected tons/hectare units.

@davemfish
Contributor

Sorry I'm just noticing this now @davemfish. When I bring the .tif and .vrt into ArcGIS Pro, I see no difference at all in the values between the two layers, they are exactly the same.

Thanks @newtpatrol , good to know Arc Pro doesn't seem to support this. When I drop both files into QGIS, they look like this:
[Screenshot: the .tif and .vrt layers displayed in QGIS]

We're going to change the main tif output to tons/hectare. I'm not sure if we're going to provide both in tif form for reasons discussed above.

@dcdenu4
Member

dcdenu4 commented Sep 17, 2024

A short discussion came up in our Slack about converting to hectares, as it can be slightly confusing if you think about it too much. Here's a snippet from @lmandle:

If 1 pixel = 900 m2 for example, then the conversion from tons/pixel to tons/ha would be:
tons/pixel × (1 pixel / 900 m²) × (10,000 m² / 1 ha), or the per-pixel value × (10000/900). The number will get bigger when the pixel is smaller than a hectare.
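The same arithmetic as a small sketch, assuming the pixel area is known in square meters. The function names are hypothetical and are not taken from InVEST.

```python
SQUARE_METERS_PER_HECTARE = 10_000


def per_pixel_to_per_hectare(value_per_pixel, pixel_area_m2):
    """Convert a per-pixel quantity (e.g. tons/pixel) to per hectare."""
    if pixel_area_m2 <= 0:
        raise ValueError("pixel_area_m2 must be positive")
    return value_per_pixel * SQUARE_METERS_PER_HECTARE / pixel_area_m2


def per_hectare_to_per_pixel(value_per_hectare, pixel_area_m2):
    """Convert a per-hectare quantity (e.g. tons/ha) to per pixel."""
    if pixel_area_m2 <= 0:
        raise ValueError("pixel_area_m2 must be positive")
    return value_per_hectare * pixel_area_m2 / SQUARE_METERS_PER_HECTARE


# A 30 m x 30 m pixel covers 900 m2, which is 0.09 ha.
print(per_pixel_to_per_hectare(9.0, 900.0))   # 100.0 tons/ha
```

With pixels larger than a hectare the factor is below 1, so the per-hectare value is smaller than the per-pixel value.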

@davemfish
Contributor

Done in #1717

@emilyanndavis emilyanndavis removed the in progress This issue is actively being worked on label Feb 25, 2025
6 participants