Simple GML & KML loader written in Python using OGR

tjmgis/Loader

 
 


A simple loader for geographic data in GML and KML

(that needs some preparation before loading via ogr2ogr)

Author: Astun Technology Ltd.

Contact: support [at] astuntechnology.com

A simple GML loader written in Python that makes use of OGR 1.9. Source data can be in GML or KML format (including compressed files in GZ or ZIP format) and can be output to any of the formats supported by OGR. The source data can be prepared using a simple Python script, both to make it suitable for loading with OGR (useful with complex feature types) and to add value by deriving attributes.
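
As a rough illustration of what "deriving attributes" during preparation could look like, the sketch below appends a derived element to a feature. It uses the standard-library `xml.etree.ElementTree` instead of lxml so that it runs without extra dependencies. The element names (`feature`, `code`, `category`), the lookup table and the function `add_derived_category` are hypothetical and are not taken from the loader's own prep code.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup used to derive a new attribute from an existing one.
CATEGORY_BY_CODE = {"B01": "Building", "R01": "Road"}


def add_derived_category(feature_xml):
    """Return the feature XML with a derived <category> element appended."""
    feature = ET.fromstring(feature_xml)
    code = feature.findtext("code")
    category = ET.SubElement(feature, "category")
    # Fall back to "Unknown" when the code is missing or not in the lookup.
    category.text = CATEGORY_BY_CODE.get(code, "Unknown")
    return ET.tostring(feature, encoding="unicode")


sample = "<feature><code>B01</code></feature>"
prepared = add_derived_category(sample)
```

Adding the value as a plain child element means OGR can read it as an ordinary feature attribute.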

The loader was originally written to load Ordnance Survey OS MasterMap Topographic Layer data in GML/GZ format but has since been used to load other GML and KML data.

Dependencies

  • OGR 1.9

    • OGR is part of the GDAL suite of tools for translating and manipulating geospatial data.
  • Python 2.6+ or 3

    • Python 2.6 or above (including 3) is required. Most modern Linux operating systems will already have 2.6 or above.
    • Python lxml module for parsing and manipulating XML

Installation details are available on the project wiki

Usage

First configure Loader by editing loader.config specifying:

Basic configuration

  • src_dir
    • The directory containing your source files or an individual file. All supported files in the specified directory and its descendants will be loaded.
  • out_dir
    • The directory used to store the translated data if writing to a file based format such as ESRI Shape, MapInfo TAB etc.
  • tmp_dir
    • The directory used to store temporary working files during loading.
  • ogr_cmd
    • The ogr2ogr command that will be used to load the data. Here you can specify the destination format and any associated settings (for example database connection details if you are writing to PostGIS).
  • prep_cmd
    • The command used to prepare the source data so it is suitable for loading with OGR. Choose one that is suitable for your source data, such as prep_osgml.prep_osmm_topo for OS MasterMap Topo.
  • post_cmd
    • An optional command to be run once OGR has created its output. Called once per file; useful for loading SQL dump files etc.
  • gfs_file
    • OGR .gfs file used to define the feature attributes and geometry type of the features read from the GML. Again, choose a suitable gfs file for your source data, such as ../gfs/osmm_topo_postgres.gfs for loading OS MasterMap Topo into PostgreSQL.
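
The way src_dir is described (a single file, or a directory searched along with its descendants) can be sketched roughly as follows. This is not the loader's actual code: the function name `find_source_files` is hypothetical, and the extension list is an assumption based on the formats named in this README.

```python
import os

# Assumed extensions, based on the formats this README lists (GML, KML, GZ, ZIP).
SUPPORTED_EXTENSIONS = (".gml", ".kml", ".gz", ".zip")


def find_source_files(src):
    """Return supported files for src, which may be a file or a directory."""
    if os.path.isfile(src):
        return [src] if src.lower().endswith(SUPPORTED_EXTENSIONS) else []
    matches = []
    # os.walk visits the directory and all of its descendants.
    for root, _dirs, files in os.walk(src):
        for name in sorted(files):
            if name.lower().endswith(SUPPORTED_EXTENSIONS):
                matches.append(os.path.join(root, name))
    return matches
```

Calling `find_source_files("./data")` would return the matching paths found under `./data`, or an empty list if none are found.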

See python/loader.config for further explanation and details of available tokens. Environment variables can be used with any of the options by using a token of the form $HOME, ${HOME} or %TEMP% (Windows only).
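
The token forms listed above match those handled by Python's `os.path.expandvars`, so the expansion can be sketched as below. Whether the loader uses this exact call is an assumption. The function `expand_tokens` and the variable `LOADER_DEMO_DIR` are made up for the example.

```python
import os


def expand_tokens(options):
    """Expand environment-variable tokens in every option value."""
    return {key: os.path.expandvars(value) for key, value in options.items()}


# Hypothetical environment variable, set here only for the demonstration.
os.environ["LOADER_DEMO_DIR"] = "/data"

expanded = expand_tokens({
    "src_dir": "$LOADER_DEMO_DIR/osmm",
    "tmp_dir": "${LOADER_DEMO_DIR}/tmp",
})
```

Tokens that name an undefined variable are left unchanged by `os.path.expandvars`, which makes a missing variable easy to spot in the resulting paths.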

Then run from the command-line:

python loader.py loader.config

Additional arguments can be passed to override the values in the config file (useful when running more than one instance of the loader) for example to specify a different source directory (src_dir):

python loader.py loader.config src_dir=./data/tq
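
A minimal sketch of how such key=value arguments might be merged over the values read from the config file is shown below. The function `apply_overrides` is hypothetical and only illustrates the override behaviour described above, not how loader.py actually implements it.

```python
def apply_overrides(config, args):
    """Return a copy of config updated with key=value pairs from args."""
    merged = dict(config)
    for arg in args:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = arg.partition("=")
        if not sep:
            raise ValueError("Expected key=value, got: %r" % arg)
        merged[key.strip()] = value.strip()
    return merged


config = {"src_dir": "./data", "out_dir": "./out"}
merged = apply_overrides(config, ["src_dir=./data/tq"])
```

Because the original dictionary is copied rather than modified, several loader instances could share one set of config values while each applies its own overrides.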

Some configuration examples are available on the project wiki

To-do

  • Data

    • OS OSMM Water Network Layer
      • Improve support for elements that require an external code list by fetching the code list when it's available
      • Support for nil attributes such as: <net:inNetwork nilReason="missing" xsi:nil="true" />, <hy-n:length xsi:nil="true" uom="m" nilReason="missing" />, <water:level xsi:nil="true" nilReason="missing" />
      • Add example to wiki
  • loader.py

    • Add exception and message when source data is not found
    • Use standard logging instead of print
  • OS MasterMap ITN

    • Test load on national cover
    • PostgreSQL scripts to join it all up
    • Add ferryterminal, roadpartiallinkinformation, roadpartialrouteinformation types
  • Potential improvements due to changes in OGR

    • Use --config GML_GFS_TEMPLATE path/to/file.gfs to specify template instead of copying template file for each source file (requires GDAL 1.9.0)
    • Use --config GML_READ_MODE SEQUENTIAL_LAYERS with GML files that include multiple feature types that appear sequentially to avoid the GML being scanned multiple times (requires GDAL 1.9.0)
    • Make use of ability to use GML attributes as feature attributes using the element@attribute syntax in the GFS file (and remove relevant prep logic that creates an element to hold the attribute values) (requires GDAL 1.11.0)
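
The first two items could be sketched as a helper that builds an ogr2ogr argument list with the relevant `--config` options. The function `build_ogr_args`, its parameters and the output format used are hypothetical. The sketch only assembles the arguments and does not run ogr2ogr.

```python
def build_ogr_args(src_path, dst_path, gfs_template=None, sequential_layers=False):
    """Build an ogr2ogr argument list with optional GML config options."""
    args = ["ogr2ogr"]
    if gfs_template:
        # Point OGR at a shared .gfs template instead of copying it per file.
        args += ["--config", "GML_GFS_TEMPLATE", gfs_template]
    if sequential_layers:
        # Avoid rescanning GML whose feature types appear sequentially.
        args += ["--config", "GML_READ_MODE", "SEQUENTIAL_LAYERS"]
    # ogr2ogr expects the destination before the source.
    args += ["-f", "ESRI Shapefile", dst_path, src_path]
    return args


ogr_args = build_ogr_args(
    "input.gml", "out_dir", gfs_template="path/to/file.gfs", sequential_layers=True
)
```

The resulting list could be passed to `subprocess.call` on a system with GDAL 1.9.0 or later installed.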
