Add admin lookup to blob #34

Merged Dec 19, 2024 (3 commits)

Conversation

hannahker
Collaborator

@hannahker hannahker commented Dec 18, 2024

A short-term solution to #30. This table is needed to unblock work on preparing Floodscan data for HDX. As discussed over Slack, the reference data needs more overall refactoring before we put this in the database and fully productionalize it. Data is output here, based on the cached Fieldmaps data in the same location.

@hannahker hannahker requested a review from zackarno December 18, 2024 21:49
@zackarno

Nice! This looks good and I believe it will work.

However, I was thinking it would just be the admin 2 level data w/ all the columns we have here. There's no need to mix levels, which I think might actually make it easier to get confused or make mistakes in the future.

Also what is the purpose of the __index_level_0__ column?

Pretty sure I can still get to the table I want by simply doing:

df_admin_lookup |> 
  filter(!is.na(ADM2_PCODE))

but not sure if the mixing of levels is necessary?

also tagging @isatotun so she is aware that there is a parquet file on the way for labelling the admin units on our hdx-floodscan pipeline

@hannahker
Collaborator Author

@zackarno good point about there being some redundancy -- although we won't consistently go down to the admin-2 level for each ISO3 (e.g. because some CODs don't have it, or because we don't process non-HRP locations). I can simplify to just go down to the "max" admin level for each ISO3 (which we have specified in the iso3 database table).

@zackarno

zackarno commented Dec 18, 2024

ok gotcha - yeah I was thinking floodscan which is admin 2...

What if we just add an adm_level column (sound familiar? 😄)? I know an analyst could probably infer the admin level from the NAs present in the other cols, but it still seems useful.

Could then easily run something like

df %>%
  group_by(ADM0_NAME, ADM0_PCODE) %>%
  filter(admin_level == max(admin_level))

to get to the lowest level and clearly see what that is per country etc.
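
For anyone working outside R, the same per-country filter can be sketched in plain Python. This is only an illustration under assumptions: the rows and p-codes below are made-up placeholders rather than contents of the lookup file, the level column is assumed to be called `ADM_LEVEL` (the name used later in this thread), and `deepest_level_rows` is a hypothetical helper, not something in the repo.

```python
# Hypothetical rows shaped like the lookup table; the p-codes are placeholders.
rows = [
    {"ADM0_PCODE": "AA", "ADM1_PCODE": None, "ADM2_PCODE": None, "ADM_LEVEL": 0},
    {"ADM0_PCODE": "AA", "ADM1_PCODE": "AA01", "ADM2_PCODE": None, "ADM_LEVEL": 1},
    {"ADM0_PCODE": "AA", "ADM1_PCODE": "AA01", "ADM2_PCODE": "AA0101", "ADM_LEVEL": 2},
    {"ADM0_PCODE": "BB", "ADM1_PCODE": None, "ADM2_PCODE": None, "ADM_LEVEL": 0},
    {"ADM0_PCODE": "BB", "ADM1_PCODE": "BB01", "ADM2_PCODE": None, "ADM_LEVEL": 1},
]


def deepest_level_rows(rows, group_key="ADM0_PCODE", level_key="ADM_LEVEL"):
    """Keep only the rows at the maximum admin level within each country."""
    # First pass: record the highest level seen for each country.
    max_level = {}
    for row in rows:
        country = row[group_key]
        max_level[country] = max(max_level.get(country, row[level_key]), row[level_key])
    # Second pass: keep the rows that sit at that level.
    return [row for row in rows if row[level_key] == max_level[row[group_key]]]


deepest = deepest_level_rows(rows)
# Map each country to the level it bottoms out at.
levels_by_country = {row["ADM0_PCODE"]: row["ADM_LEVEL"] for row in deepest}
```

With an explicit level column the deepest level per country can be read off directly, instead of being inferred from which p-code columns are empty.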

@hannahker
Collaborator Author

yeah a good idea! Just added that in.

@zackarno

zackarno commented Dec 18, 2024

it looks like it's all ADM_LEVEL==2 now. That's fine w/ me if it actually is and that's what you want, but I'm also fine with having multiple levels to make it more flexible. If that's intended, happy for you to merge -- and we can do more completeness checks as we go along

@hannahker
Collaborator Author

@zackarno where are you seeing it with everything ADM_LEVEL=2? Just took a look and can see cases (eg. ZWE), with ADM_LEVEL=1.

@zackarno

sorry my mistake - i see admin 1 as well!

Looks good to me - first test will be to see if we can slap labels on all the FloodScan data. Will check that out - so you can either merge and I will open an issue if necessary or leave open till this is tested

@hannahker
Collaborator Author

Ok will merge and let me know if anything doesn't work for Floodscan!

@zackarno

zackarno commented Dec 19, 2024

it looks good - no missing labels according to notebook which i just updated with this commit: f9b45b3
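
A "no missing labels" check of this kind can be sketched as an anti-join between the p-codes in the data and those in the lookup. The notebook itself is not shown in this thread, so this is an assumed shape of the check rather than its actual code: `missing_labels` is a hypothetical helper, `ADM2_NAME` is an assumed column name, and the p-codes are placeholders.

```python
def missing_labels(data_pcodes, lookup_rows, key="ADM2_PCODE"):
    """Return the p-codes in the data that have no row in the lookup (an anti-join)."""
    # Collect every non-empty p-code the lookup knows about.
    known = {row[key] for row in lookup_rows if row.get(key) is not None}
    # Anything in the data but not in the lookup would end up unlabelled.
    return sorted(set(data_pcodes) - known)


# Hypothetical inputs; the p-codes and names are placeholders, not real values.
lookup_rows = [
    {"ADM2_PCODE": "AA0101", "ADM2_NAME": "District one"},
    {"ADM2_PCODE": "AA0102", "ADM2_NAME": "District two"},
]
data_pcodes = ["AA0101", "AA0102", "AA0103"]

unlabelled = missing_labels(data_pcodes, lookup_rows)
```

An empty result corresponds to the "no missing labels" outcome reported above; a non-empty one lists the units that would need a follow-up issue.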

@hannahker
Collaborator Author

Great! I also just uploaded the same file to the prod blob, which you should use for the final pipeline.

@hannahker hannahker merged commit ca5d895 into main Dec 19, 2024
1 check passed
@hannahker hannahker deleted the admin-lookup branch December 19, 2024 17:22