Include ISO 3166-2 subdivisions in client region field annotation #217

Open
critzo opened this issue Mar 7, 2019 · 3 comments

Comments

critzo commented Mar 7, 2019

The annotation service currently populates connection_spec.client_geolocation.region with only the top-level ISO 3166-2 region code. To restore the region code granularity we had prior to 2017-05-11, we should add a field for the client region subdivision, annotated with the subdivision codes in the ISO 3166-2 standard.

To demonstrate the issue, I include the query below, and a trimmed result set for Great Britain. Prior to 2017-05-11 we annotated ~198 region codes in this field, whereas now it's 4:

SELECT partition_date, connection_spec.client_geolocation.region
FROM `measurement-lab.release.ndt_all`
WHERE connection_spec.client_geolocation.country_code = 'GB'
AND partition_date BETWEEN '2017-05-10' AND '2017-05-12'
GROUP BY partition_date, connection_spec.client_geolocation.region
ORDER BY partition_date, connection_spec.client_geolocation.region

# Result:
partition_date	region
2017-05-10	A1
2017-05-10	A2
2017-05-10	A3
# + 195 more FIPS 10-4 regions in Great Britain
...
2017-05-11	England
2017-05-11	Northern Ireland
2017-05-11	Scotland
2017-05-11	Wales

critzo added the review/triage label (Team should review and assign priority) Mar 7, 2019

critzo commented Mar 8, 2019

Noting that the ISO 3166-2 region codes are part of the MaxMind GeoLite2-City Locations data.

The fields I believe we should consider adding to the ndt schema and annotator are:

  • subdivision_2_iso_code
  • subdivision_2_name
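
As a rough illustration of what carrying these two fields through an annotator could look like, here is a minimal Go sketch. It assumes a Locations CSV with a header row that includes geoname_id, subdivision_2_iso_code and subdivision_2_name columns. The ClientSubdivision type, the ParseSubdivisions and loadSample helpers, and the sample rows are all hypothetical and are not taken from the annotation-service code or from the real dataset.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// ClientSubdivision is a hypothetical holder for the two proposed fields.
type ClientSubdivision struct {
	Subdivision2ISOCode string
	Subdivision2Name    string
}

// ParseSubdivisions reads a Locations-style CSV (header row required) and
// returns a map from geoname_id to the second-level subdivision columns.
func ParseSubdivisions(r io.Reader) (map[string]ClientSubdivision, error) {
	reader := csv.NewReader(r)
	header, err := reader.Read()
	if err != nil {
		return nil, fmt.Errorf("reading header: %w", err)
	}
	// Look columns up by name so the sketch does not depend on column order.
	col := make(map[string]int, len(header))
	for i, name := range header {
		col[name] = i
	}
	for _, want := range []string{"geoname_id", "subdivision_2_iso_code", "subdivision_2_name"} {
		if _, ok := col[want]; !ok {
			return nil, fmt.Errorf("missing column %q", want)
		}
	}
	out := make(map[string]ClientSubdivision)
	for {
		rec, err := reader.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, fmt.Errorf("reading row: %w", err)
		}
		out[rec[col["geoname_id"]]] = ClientSubdivision{
			Subdivision2ISOCode: rec[col["subdivision_2_iso_code"]],
			Subdivision2Name:    rec[col["subdivision_2_name"]],
		}
	}
	return out, nil
}

// sampleCSV holds made-up rows for illustration only; the IDs and values
// are not copied from the real dataset.
const sampleCSV = `geoname_id,country_iso_code,subdivision_1_iso_code,subdivision_1_name,subdivision_2_iso_code,subdivision_2_name,city_name
1,GB,ENG,England,CMA,Cumbria,Kendal
2,GB,SCT,Scotland,,,Example
`

// loadSample parses the illustrative rows above.
func loadSample() (map[string]ClientSubdivision, error) {
	return ParseSubdivisions(strings.NewReader(sampleCSV))
}

func main() {
	subs, err := loadSample()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", subs["1"])
}
```

Rows with no second-level subdivision simply yield empty strings, so a schema change along these lines would need both fields to be nullable or to accept empty values.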

critzo commented Jun 3, 2020

Adding a link to the place in the code where the current region code is referenced: https://github.com/m-lab/annotation-service/blob/master/geolite2v2/geo-ip.go#L111

It should be straightforward to pull the ISO 3166-2 subdivision 2 code for the sub-regions from the City dataset.
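
If the subdivision columns hold only the part of the ISO 3166-2 code after the hyphen (an assumption worth checking against the dataset documentation), the full code could be rebuilt from the country code. The FullSubdivisionCode helper below is hypothetical and only sketches that idea:

```go
package main

import (
	"fmt"
	"strings"
)

// FullSubdivisionCode joins a country code and a subdivision code part into
// an ISO 3166-2 style code such as "GB-ENG". It returns "" when either part
// is missing, so callers can leave the field unset.
func FullSubdivisionCode(countryCode, subdivisionCode string) string {
	c := strings.ToUpper(strings.TrimSpace(countryCode))
	s := strings.ToUpper(strings.TrimSpace(subdivisionCode))
	if c == "" || s == "" {
		return ""
	}
	return c + "-" + s
}

func main() {
	fmt.Println(FullSubdivisionCode("GB", "ENG"))
	fmt.Println(FullSubdivisionCode("GB", "") == "")
}
```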

stephen-soltesz (Contributor)

m-lab/etl-gardener#281