(that needs some preparation before loading via ogr2ogr)
Author: Astun Technology Ltd.
Contact: support [at] astuntechnology.com
A simple GML loader written in Python that makes use of OGR 1.9. Source data can be in GML or KML format (including compressed files in GZ or ZIP format) and can be output to any of the formats supported by OGR. The source data can be prepared using simple Python code, either to make it suitable for loading with OGR (useful with complex feature types) or to add value by deriving attributes.
The loader was originally written to load Ordnance Survey OS MasterMap Topographic Layer data in GML/GZ format but has since been used to load other GML and KML data.
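As an illustration of handling the compressed inputs mentioned above, the following is a minimal sketch of reading a plain, GZ or ZIP source with the Python 3 standard library. The function name `iter_source_streams` is hypothetical and this is not the loader's own implementation.

```python
import gzip
import zipfile


def iter_source_streams(path):
    """Yield (name, binary stream) pairs for a plain, .gz or .zip source.

    Hypothetical helper for illustration only; not taken from loader.py.
    """
    lower = path.lower()
    if lower.endswith(".gz"):
        # gzip.open returns a file-like object that decompresses on read
        with gzip.open(path, "rb") as stream:
            yield path[:-3], stream
    elif lower.endswith(".zip"):
        with zipfile.ZipFile(path) as archive:
            for member in archive.namelist():
                if member.endswith("/"):
                    continue  # skip directory entries
                with archive.open(member) as stream:
                    yield member, stream
    else:
        with open(path, "rb") as stream:
            yield path, stream
```

Each stream is only valid while the generator is positioned on it, so the data should be read (or copied to a working file) before asking for the next item.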
- OGR 1.9
  - OGR is part of the GDAL suite of tools for translating and manipulating geospatial data.
- Python 2.6+ or 3
  - Python 2.6 or above (including 3) is required. Most modern Linux operating systems will already have 2.6 or above.
  - Python lxml module for parsing and manipulating XML
Installation details are available on the project wiki
First configure Loader by editing loader.config, specifying:
src_dir
- The directory containing your source files or an individual file. All supported files in the specified directory and its descendants will be loaded.
out_dir
- The directory used to store the translated data if writing to a file-based format such as ESRI Shape, MapInfo TAB etc.
tmp_dir
- The directory used to store temporary working files during loading.
ogr_cmd
- The ogr2ogr command that will be used to load the data. Here you can specify the destination format and any associated settings (for example database connection details if you are writing to PostGIS).
prep_cmd
- The command used to prepare the source data so it is suitable for loading with OGR. Choose one that is suitable for your source data, such as prep_osgml.prep_osmm_topo for OS MasterMap Topo.
post_cmd
- An optional command to be run once OGR has created its output. Called once per file, useful for loading SQL dump files etc.
gfs_file
- OGR .gfs file used to define the feature attributes and geometry type of the features read from the GML. Again, choose a suitable gfs file for your source data, such as ../gfs/osmm_topo_postgres.gfs for loading OS MasterMap Topo into PostgreSQL.
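To make the src_dir behaviour concrete (a single file, or a directory searched together with its descendants), here is a minimal sketch. The function name `find_source_files` and the extension list are assumptions for illustration; the loader's real list of supported files may differ.

```python
import os

# Assumed set of extensions, based on the formats named in this README;
# the loader's actual list may differ.
SUPPORTED_EXTENSIONS = (".gml", ".kml", ".gz", ".zip")


def find_source_files(src_dir):
    """Return supported files for a single file path or a directory tree."""
    if os.path.isfile(src_dir):
        return [src_dir]
    found = []
    # os.walk visits the directory and all of its descendants
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            if name.lower().endswith(SUPPORTED_EXTENSIONS):
                found.append(os.path.join(root, name))
    return sorted(found)
```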
See python/loader.config for further explanation and details of available tokens. Environment variables can be used with any of the options by using a token of the form:
Then run from the command-line:
python loader.py loader.config
Additional arguments can be passed to override the values in the config file (useful when running more than one instance of the loader), for example to specify a different source directory (src_dir):
python loader.py loader.config src_dir=./data/tq
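The override behaviour can be pictured as key=value arguments layered on top of the values read from the config file. The sketch below assumes the config has already been read into a dictionary; `apply_overrides` is a hypothetical name and not necessarily how loader.py does it.

```python
def apply_overrides(config, args):
    """Return a copy of config with key=value arguments applied on top.

    Hypothetical helper for illustration; config is a plain dict of options.
    """
    merged = dict(config)
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % arg)
        merged[key.strip()] = value.strip()
    return merged


# Values from the config file, then the command-line override shown above
config = {"src_dir": "./data", "out_dir": "./out"}
print(apply_overrides(config, ["src_dir=./data/tq"]))
```

Returning a copy leaves the values read from the file untouched, which is convenient when several loader instances share one config file.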
Some configuration examples are available on the project wiki
- Data
  - OS OSMM Water Network Layer
  - Improve support for elements that require an external code list by fetching the code list when it's available
  - Support for nil attributes such as:
    <net:inNetwork nilReason="missing" xsi:nil="true" />,
    <hy-n:length xsi:nil="true" uom="m" nilReason="missing" />,
    <water:level xsi:nil="true" nilReason="missing" />
  - Add example to wiki
- loader.py
  - Add exception and message when source data is not found
  - Use standard logging instead of print
- OS MasterMap ITN
  - Test load on national cover
  - PostgreSQL scripts to join it all up
  - Add ferryterminal, roadpartiallinkinformation, roadpartialrouteinformation types
- Potential improvements due to changes in OGR
  - Use --config GML_GFS_TEMPLATE path/to/file.gfs to specify template instead of copying template file for each source file (requires GDAL 1.9.0)
  - Use --config GML_READ_MODE SEQUENTIAL_LAYERS with GML files that include multiple feature types that appear sequentially to avoid the GML being scanned multiple times (requires GDAL 1.9.0)
  - Make use of ability to use GML attributes as feature attributes using the element@attribute syntax in the GFS file (and remove relevant prep logic that creates an element to hold the attribute values) (requires GDAL 1.11.0)
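As a rough sketch of how the two --config options above could be added to an ogr2ogr invocation, the function below only builds the argument list; it does not run anything. `build_ogr_args` and the example destination and source values are hypothetical, while the GML_GFS_TEMPLATE and GML_READ_MODE options are the ones named in the list.

```python
def build_ogr_args(dest_format, dest, src, gfs_template=None,
                   sequential_layers=False):
    """Build an ogr2ogr argument list with optional GML config options.

    Hypothetical helper for illustration; not part of loader.py.
    """
    args = ["ogr2ogr"]
    if gfs_template:
        # Point OGR at a shared template instead of copying it per source file
        args += ["--config", "GML_GFS_TEMPLATE", gfs_template]
    if sequential_layers:
        # Avoid rescanning when feature types appear one after another
        args += ["--config", "GML_READ_MODE", "SEQUENTIAL_LAYERS"]
    args += ["-f", dest_format, dest, src]
    return args


# Placeholder destination and source values, for illustration only
print(build_ogr_args("PostgreSQL", "PG:dbname=example", "tile.gml",
                     gfs_template="path/to/file.gfs",
                     sequential_layers=True))
```

The resulting list could be handed to subprocess.call, which avoids shell quoting problems with paths that contain spaces.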