Hi @ttomasz, thanks for this! On the addresses side of things, we are pointing people to add new data sources to https://openaddresses.io rather than building a separate database of openly available address datasets. Their data for Poland is here: https://github.com/openaddresses/openaddresses/tree/master/sources/pl. If your data is different, would you be up for submitting a pull request to OpenAddresses? Once it's in their format we can easily add it to Overture, and I'd be happy to help facilitate that.
-
@ttomasz sorry for the delayed response. I was out on vacation but am back now. Yeah, my understanding is that OA needs a single file to convert to its output format. If you have to do some other processing first, you put it in the scripts/ directory and then upload the result. But it seems like that doesn't work if you need to do it consistently. I can help ask over in the OA repo. If that doesn't work out, we should look into doing this on the Overture side.

For the Overture schema, we only want the parts of the admin hierarchy that are relevant to the address, rather than splitting into district, city, region, etc. We also chose the generic "address_levels" to avoid having to assign names to different levels for different countries, which always seems to spur lots of debate. Based on https://www.upu.int/UPU/media/upu/PostalEntitiesFiles/addressingUnit/polEn.pdf it looks like Poland only needs a locality/city? I also agree we should remove the planned addresses. Let's try to figure this out for Poland in Overture soon.
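To make the "address_levels" idea concrete, here is a minimal sketch of how one source record might be mapped. It assumes a single locality level for Poland and drops planned addresses, as discussed above. The input keys (`status`, `locality`, `street`, `house_number`, `postcode`) and the helper name `to_overture_like` are hypothetical, and the output keys are only an approximation of the Overture address schema, so check them against the published schema before relying on this.

```python
from typing import Optional


def to_overture_like(record: dict) -> Optional[dict]:
    """Map one source address record to an Overture-style dict.

    Returns None for planned addresses so the caller can skip them.
    All input field names here are hypothetical placeholders.
    """
    # Assumption: the source marks not-yet-existing addresses with status "planned".
    if record.get("status") == "planned":
        return None

    # Keep only the admin level that is part of the written address.
    # For Poland we assume that is just the locality/city.
    locality = record.get("locality")
    address_levels = [{"value": locality}] if locality else []

    return {
        "country": "PL",
        "postcode": record.get("postcode"),
        "street": record.get("street"),
        "number": record.get("house_number"),
        "address_levels": address_levels,
    }


# Made-up sample records, for illustration only.
existing = {
    "status": "existing",
    "locality": "Warszawa",
    "street": "Marszałkowska",
    "house_number": "1",
    "postcode": "00-001",
}
planned = dict(existing, status="planned")

converted = to_overture_like(existing)
skipped = to_overture_like(planned)
```

Keeping the levels as an unnamed ordered list means no country-specific names such as district or region have to be agreed on.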
-
Category Feedback
The Polish government releases a lot of data into the public domain, such as buildings and addresses.
I propose adding the address dataset to Overture. It contains around 8 151 958 address points at the moment.
The data is in GML format, but I have prepared sample code that parses it into GeoParquet using Apache Spark. The code is available here: https://github.com/ttomasz/polish_addresses_spark_parser. The repo is dual-use: it generates one dataset with a schema for the Polish community and a second one that can be used as a basis for Overture.
GeoParquet files created using the code from the repo above are available here for reference: https://data.openstreetmap.org.pl/addresses/
At the moment the poland_addresses.parquet file does not conform exactly to the Overture address schema, but I can modify it to be compliant if there is interest in actually using that code. I can also explain any of the data attributes or the code if needed.
Dependency with other categories, if any.
No response