Add HRRR historical data investigation report #410

Closed
aldenks wants to merge 1 commit into main from claude/extend-hrrr-dataset-qLDKh

aldenks commented Feb 2, 2026

Summary

This PR adds an investigation report on the feasibility of extending the HRRR analysis dataset backward from its current start date of 2018-09-16 to 2014-10-01, which would add nearly 4 years of historical weather data.

Key Findings

  • Extension is feasible: The HRRR analysis dataset can be reliably extended to 2014-10-01
  • Variable availability: 16 out of 20 current variables are available in the historical period
  • Data quality: File structure, grid coordinates, and temporal availability are consistent across all versions examined
  • Grid stability: The grid shifts by less than 0.02 pixels between HRRRv1/v2 and HRRRv3, which is negligible

Notable Observations

  1. GRIB name change: Mean sea level pressure uses PRMSL in HRRRv1/v2 but MSLMA in HRRRv3, so it requires a version-aware lookup
  2. Missing variables:
    • downward_long_wave_radiation_flux_surface (added in HRRRv3)
    • relative_humidity_2m (added in HRRRv3)
  3. URL pattern: Unchanged since 2014-09-30, compatible with current implementation
  4. Data gaps: Transition period (2016-08-23 to 2016-09-01) has missing hours; first operational day (2014-09-30) is incomplete
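
The GRIB name change in observation 1 could be handled with a small date-based lookup. The following is a minimal sketch, not the project's implementation: the function name `mslp_grib_element` and the constant `HRRR_V3_START` are hypothetical, and the cutover date assumes the PRMSL to MSLMA switch coincides with the 2018-07-12 HRRRv3 date given for the newly added variables. The exact cutover hour is not stated in this PR and would need to be verified against the files.

```python
from datetime import datetime

# Assumption: the rename happens at the HRRRv3 transition, taken here as
# 2018-07-12 00:00. The precise cutover hour is not given in the report.
HRRR_V3_START = datetime(2018, 7, 12)


def mslp_grib_element(init_time: datetime) -> str:
    """Return the GRIB element name for mean sea level pressure.

    HRRRv1/v2 files use PRMSL; HRRRv3 files use MSLMA.
    """
    return "MSLMA" if init_time >= HRRR_V3_START else "PRMSL"
```

For example, `mslp_grib_element(datetime(2015, 1, 1))` selects the HRRRv1/v2 name, while any time on or after the assumed cutover selects the HRRRv3 name.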

Implementation Recommendations

The report outlines a phased approach:

  1. Add version-aware variable handling with available_from timestamps
  2. Implement GRIB element name mapping for PRMSL/MSLMA
  3. Update append_dim_start to 2014-10-01
  4. Test with small backfill before production rollout
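
Steps 1 and 3 above might look roughly like the sketch below. It is illustrative only: `VarConfig`, `VARIABLES`, and `available_variables` are hypothetical names, `temperature_2m` is a placeholder for one of the variables available throughout the historical period, and the 2018-07-12 date is the HRRRv3 availability date from the commit message. Only the two HRRRv3-only variable names and the `available_from` and `append_dim_start` concepts come from this PR.

```python
from dataclasses import dataclass
from datetime import datetime

# Proposed new dataset start (step 3) and the HRRRv3 availability date.
APPEND_DIM_START = datetime(2014, 10, 1)
HRRR_V3_START = datetime(2018, 7, 12)


@dataclass(frozen=True)
class VarConfig:
    """Hypothetical per-variable config with an availability timestamp."""

    name: str
    # Variables default to being available from the dataset start.
    available_from: datetime = APPEND_DIM_START


VARIABLES = (
    VarConfig("temperature_2m"),  # placeholder for a fully available variable
    VarConfig("downward_long_wave_radiation_flux_surface", HRRR_V3_START),
    VarConfig("relative_humidity_2m", HRRR_V3_START),
)


def available_variables(time: datetime) -> list[str]:
    """Names of variables that should be read from source files at `time`."""
    return [v.name for v in VARIABLES if time >= v.available_from]
```

A backfill job could call `available_variables` for each timestamp and leave the remaining variables as missing values, so the two HRRRv3-only variables are simply empty before their availability date.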

Investigation Details

  • Files examined: 7 GRIB2 files spanning 2014-2018
  • Dates tested: 15+ random dates across the historical period
  • Investigation date: 2026-02-02

This report provides the technical foundation for implementing the historical extension in a follow-up PR.

https://claude.ai/code/session_014f4ZR6Cc5XvmSxQCQdCerk

Investigated extending HRRR analysis dataset back from 2018-09-16 to 2014-10-01.

Key findings:
- 16/20 current variables available from 2014-10-01 (nearly 4 years of additional data)
- 2 variables (DLWRF, RH) only available from 2018-07-12
- 1 variable (MSL pressure) changed GRIB name from PRMSL to MSLMA
- Grid and file structure consistent across all periods
- Data quality verified through actual GRIB file inspection

Report includes detailed variable availability matrix, implementation
recommendations, and phased backfill approach.

https://claude.ai/code/session_014f4ZR6Cc5XvmSxQCQdCerk

aldenks commented Feb 2, 2026

fabulous, thanks claude.

aldenks closed this Feb 2, 2026

aldenks commented Feb 2, 2026

Tracking implementation in #415

aldenks deleted the claude/extend-hrrr-dataset-qLDKh branch February 2, 2026 21:45