CONTRIBUTING.md

Tilezen loves contributions from community members like you! Contributions come in many different shapes and sizes. In this file we provide guidance around two of the most common types of contributions: opening issues and opening pull requests.

Community Values

We ask that you are respectful when contributing to Tilezen or engaging with our community. As a community, we appreciate the fact that contributors might be approaching the project from a different perspective and background. We hope that beginners as well as advanced users will be able to use and contribute back to Tilezen. We want to encourage contributions and feedback from all over the world, which means that English might not be a contributor's native language, and sometimes we may encounter cultural differences. Constructive disagreements can be essential to moving a project forward, but disrespectful language or behavior will not be tolerated.

Above all, be patient, be respectful, and be kind!

Submitting Issues

Most issues for Tilezen are housed in the Tilezen/vector-datasource repo. Before opening an issue, be sure to search the repository to see if someone else has asked your question before. If not, go ahead and open a new issue.

Submitting technical bugs

When submitting bug reports, please be sure to give us as much context as possible so that we can reproduce the error you encountered. Be sure to include:

  • System conditions (operating system, browser, etc), if you're running from source
  • Steps to reproduce
  • Expected outcome
  • Actual outcome
  • Screenshots, if applicable
  • Code that exposes the bug, if you have it (such as a failing test or a basic script)

Submitting issues around vector tile quality

It's important to get feedback about the quality of local tile results. Your local knowledge will make it easier for us to understand the problem. When submitting issues be sure to include details like:

Pull Requests Welcome!

Project standards overview

Tilezen has several miscellaneous standards:

  • we follow PEP8 coding style for Python and use Flake8 to enforce those conventions
  • we love tests, check them out
  • we use CircleCI for continuous integration testing
  • we use semver for package versioning

All unit tests in a project will be automatically invoked when you commit to an existing project; make sure they pass!

Active contributors

We'll gladly invite active contributors to become members of the Tilezen organization. New members will gain direct write permissions, and with great power comes great responsibility.

Project overview (what to change, where)

Generally speaking there are three aspects of developing vector tiles.

  • Configuring project setup, see wiki page
  • Updating database properties (can be done ahead of time or at runtime)
  • Changing how features are selected from the database (requires tileserver restart)
⚠️ Yellow call-outs like this are meant to draw your attention to an important idea or distinction you should keep in mind.

Map database in Postgres stores data from OpenStreetMap and other projects like Natural Earth and Who's On First.

When data is loaded, database triggers calculate which layer(s) a feature is included in, at what "minimum zoom", and other Mapzen-specific "mz" properties.

When modifying the logic below, we'll need to update our Postgres functions, migrate the data, and cut new tiles.

⚠️ COMMON TERMS
  • Vector tiles (or just "tiles") allow bite sized access to large geographic databases with "raw" vector geometries and properties instead of rendered raster images.
  • Map features (or just "features") are individual items on the map, commonly with a name, geometry, and additional properties like unique ID and source. Examples include individual roads, landuse polygons, or business icons.

Changing tile content in the vector-datasource repo

Tile layers are configured in a root queries.yaml file. This file specifies which jinja template to use per layer, and also specifies post-processing via Python transforms. Individual database features are "filtered" into tiles per layer based on yaml files.

Typical tile content changes occur at the yaml level, but there are 4 levels total:

  • yaml files determine which features are included in a tile layer by specifying a series of source data filters and property value rules, the most important of which are a feature's kind and min_zoom.
  • jinja templates: these filter and property rules get folded into sql functions, which are generated via layer templates. Some sql functions are run in the database before a tile is requested (for properties like min_zoom) and others are run as a tile is requested (like kind).
  • Python post-processing occurs per feature and across layers once a set of features has been returned for a given tile, which is useful for more involved logic.
  • layers are specified in the root queries.yaml. This file specifies which jinja template to use per layer, and also specifies per layer post-processing Python transforms.

The yaml configuration files establish which features are included per layer, and the jinja templates are better suited for rules that apply to all features in a layer.

To recap, with examples:

  • yaml files are located in the yaml/ directory. Example: pois.yaml
  • jinja files are located in the queries/ directory. Example: pois.jinja2.
  • Python files are located in the vectordatasource/ directory. Example: transform.py.
  • layers are specified in queries.yaml.
⚠️ NOTE: Generally perform maintenance on pre-existing jinja filters or optionally migrate them to the newer YAML format.

Serving tiles in the tileserver repo

Tileserver listens for API requests on localhost, which are in the format of layer/z/x/y.ext.

When tileserver hears a request it asks Postgres for "the stuff" inside that tile's bounding box, configured via the queries.yaml layers file, jinja2 templates, Python transforms, and per feature sql functions generated from the yaml filter files.
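
To make the request format concrete, here is a minimal Python sketch that splits a layer/z/x/y.ext path into its parts. The names TileRequest and parse_tile_path, and the example path, are hypothetical and for illustration only; this is not tileserver's actual parsing code.

```python
from typing import NamedTuple


class TileRequest(NamedTuple):
    """Components of a 'layer/z/x/y.ext' request path (illustrative type)."""
    layer: str
    z: int
    x: int
    y: int
    ext: str


def parse_tile_path(path: str) -> TileRequest:
    """Split a 'layer/z/x/y.ext' path into its components.

    Hypothetical helper; raises ValueError if the path does not have
    exactly four slash-separated parts or the numbers are not integers.
    """
    layer, z, x, last = path.strip("/").split("/")
    # The final segment carries both the y coordinate and the extension.
    y, ext = last.split(".", 1)
    return TileRequest(layer, int(z), int(x), int(y), ext)


request = parse_tile_path("/landuse/16/10959/25337.json")
print(request.layer, request.z, request.x, request.y, request.ext)
```

The z, x and y values are the same three numbers that the integration tests take as their first arguments.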

⚠️ NOTE: You must restart tileserver when vector-datasource layers are updated in queries.yaml.

Other considerations

  • yaml updates don't require restart (but do require a database migration, see below).
  • jinja updates don't require restarting tileserver; they are re-read on request during development.
  • Python updates don't require restarting tileserver; they are re-read on request during development.

DATABASE MIGRATION: Changes to layer yaml files will require at a minimum reloading the sql functions. This is sufficient if only the kind or any output properties have changed. But for min_zoom changes the affected features will need to be recalculated, probably via a data migration. This topic is covered in further detail below.

Let's do this!

We'll cover the following topics in the next sections:

1. Choose an issue to work on

We have a backlog of issues, but they are also grouped into milestones and tracked with Waffle board.

When picking an issue from the Ready column for the active milestone, self-assign it to let other people know you'll be working on it and move it to the In Progress column.

If you propose to work on an issue in the Backlog but want to confirm some details, add a comment to the issue or ask about it in Slack.

2. Create a new branch

Ensure you're on the master branch to establish a clean history:

git checkout master

Ensure your master branch is up-to-date with the server:

git pull

You will need a descriptive name for your new branch. One way to do that is to concatenate your user name, any relevant issue number(s), and a brief description, for example olga/875-camp-ground-zoom. Then create a new branch using git checkout -b <branchname>, like so:

git checkout -b olga/875-camp-ground-zoom

Congratulations, now you have a local branch!

We're going to push our work to the server eventually (so other people can see your work, and so you have a backup), so let's make sure that works now:

git push

NOTE: Your first push for a branch might require additional details:

git push --set-upstream origin olga/875-camp-ground-zoom

3. Create a new test

Create a new test for the issue in the integration-test directory. Sometimes it's helpful to look through the existing tests to find one that is a close match to the pattern and start there.

  • Create new test file
  • You'll need a specific OpenStreetMap feature ID to test against
  • You'll need the coordinates (z/x/y) of a map tile containing that feature
  • Run the test
⚠️ Remember to note the openstreetmap.org URL for your test feature. You'll store that in your test file so that the test knows where to find the example, and for the continuous integration computer to download that feature and verify your work.

Example test

The unit tests are written using the unittest framework, which has been subclassed in FixtureTest to provide some useful methods. This means that each test starts with the import of FixtureTest, and defines a test class. These make it slightly harder to see what's going on, but are necessary to fit into the way unittest structures tests.

from . import FixtureTest


class CampGroundZoom(FixtureTest):

    def test_camp_ground_in_landuse_layer(self):
        self.load_fixtures([
            'http://www.openstreetmap.org/way/431725967',
        ])
        self.assert_has_feature(
            16, 10959, 25337, 'landuse',
            {'kind': 'camp_site'})

Example test run:

In the vector-datasource directory in your first terminal window, run your new test to make sure it fails using the existing config:

python integration-test/__init__.py integration-test/875-camp-grounds-zoom.py

Once it fails, we'll update our logic in step 4 below so it passes.

Now the gory details...

Find example feature in the raw data to test against

There are two options to identify test features:

  1. Query local database using psql on the command line or PGAdmin app.
  2. Query remote OpenStreetMap database using Overpass Turbo.

Confused about which tags to use? Read up on the OSM wiki (example) and confirm actual usage in TagInfo.

Overpass Turbo example

To find an example feature in OpenStreetMap search overpass-turbo for specific tags. Here's a sample query (assuming you've zoomed the map to an interesting area like the greater San Francisco metropolitan area):

/*
This has been generated by the overpass-turbo wizard.
The original search was:
“highway=rest_area”
*/
[out:json][timeout:25];
// gather results
(
  // query part for: “highway=rest_area”
  node["highway"="rest_area"]({{bbox}});
  way["highway"="rest_area"]({{bbox}});
  relation["highway"="rest_area"]({{bbox}});
);
// print results
out body;
>;
out skel qt;
⚠️ NOTE: Update the above example for your tag (e.g.: `"highway"="rest_area"`)!
Determine which tile the feature should appear in for your test

Once you find a result you like, click on its map marker to pull up the info window. Following the link from Overpass takes you to a page like:

On that new web page, zoom the map out to the desired min zoom of the feature (it's usually specified in the Issue description in GitHub), then right click on the map near the marker (but not on the marker!). Then use your web browser's debug tools to "Inspect element", look for the leaflet-map-pane, and follow that down until you find the named raster tile file, which encodes the tile coordinate.
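
If you already know the feature's longitude and latitude, you can also compute the tile coordinate directly. The sketch below uses the standard Web Mercator ("slippy map") tile formula used by OpenStreetMap raster tiles. The function name and the sample coordinates are assumptions for illustration, so check the result against the tile you find in the browser.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple:
    """Return (z, x, y) of the Web Mercator tile containing a point.

    Illustrative helper, not part of the Tilezen test harness.
    """
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the map edge still fall inside the tile grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return zoom, x, y


# Sample point in San Francisco (coordinates chosen for illustration).
z, x, y = lonlat_to_tile(-122.4194, 37.7749, 10)
print("%d, %d, %d" % (z, x, y))
```

The printed "z, x, y" form matches the argument order the tests expect.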

Alternative method if feature is already in Mapzen tiles

Use one of the Mapzen house styles, like Bubble Wrap, to determine the tile:

Click on a feature to "view more", then click "view tile data".

If you're modifying a feature, it can be helpful to search in the JSON response for the thing you want to change to confirm it's the right tile. If you're adding a new feature, you could search for something you know should be in the tile already to confirm you got the right one.

TIP: It's helpful to install a browser extension to view the JSON formatted. jsonview for Chrome is pretty good.

Specific map tile to test with:

But the tests require this to be formatted like:

  • 7, 20, 49

Common test types

The FixtureTest class provides several useful tests (called using self.):

  • assert_has_feature
  • assert_no_matching_feature
  • assert_at_least_n_features
  • assert_less_than_n_features
  • assert_feature_geom_type

4. Edit database &/or query logic

Edit the YAML file corresponding to the layer. In this case we're modifying landuse.yaml to add a new filter that looks for OpenStreetMap features tagged tourism=camp_site, assigns them a min_zoom between 13 and 16 depending on the feature's area, and assigns them a Tilezen kind of camp_site.

- filter: {tourism: camp_site}
  min_zoom: GREATEST(LEAST(zoom, 16), 13)
  output: {kind: camp_site}
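
The min_zoom expression clamps a value to the range 13 to 16. A rough Python equivalent is sketched below, assuming that zoom is a number derived from the feature's area. The function name is hypothetical, and SQL NULL handling is not modeled.

```python
def clamp_min_zoom(zoom: float, lowest: float = 13, highest: float = 16) -> float:
    """Python equivalent of the SQL GREATEST(LEAST(zoom, 16), 13).

    Hypothetical helper for illustration; NULL inputs are not handled.
    """
    # LEAST caps the value at `highest`, GREATEST raises it to `lowest`.
    return max(min(zoom, highest), lowest)


for zoom in (10, 14.5, 20):
    print(zoom, "->", clamp_min_zoom(zoom))
```

Whatever the input, the result never drops below 13 and never exceeds 16.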

5. Verify the new logic by running the test

Run the test again; hopefully it passes now! You'll need to run the test from the project's root directory, so you may need to cd ../../ to get back there after step 4 above.

python integration-test/__init__.py integration-test/875-camp-grounds-zoom.py

Example output:

python integration-test/__init__.py integration-test/875-camp-grounds-zoom.py
..
----------------------------------------------------------------------
Ran 2 tests in 2.236s

OK

If the test fails, the output looks like this:

python integration-test/__init__.py integration-test/875-camp-grounds-zoom.py
.F
======================================================================
FAIL: test_small (integration-test.875-camp-grounds-zoom.CampGroundsZoom)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/matt/Programming/Mapzen/vector-datasource/integration-test/875-camp-grounds-zoom.py", line 34, in test_small
    {'kind': 'camp_site', 'sort_rank': 92, 'foo': 'bar'})
  File "/home/matt/Programming/Mapzen/vector-datasource/integration-test/__init__.py", line 1212, in assert_has_feature
    self.assertions.assert_has_feature(z, x, y, layer, props)
  File "/home/matt/Programming/Mapzen/vector-datasource/integration-test/__init__.py", line 1079, in assert_has_feature
    (properties, closest['properties'], misses))
AssertionError: Did not find feature including properties {'sort_rank': 92, 'kind': 'camp_site', 'foo': 'bar'}. The closest match was {'kind': 'camp_site', 'area': 29753, 'sort_rank': 92, 'source': 'openstreetmap.org', 'min_zoom': 13.0, 'id': 417405356}: missed {'foo': "None != 'bar'"}.

----------------------------------------------------------------------
Ran 2 tests in 2.214s

FAILED (failures=1)
⚠️ NOTE: It's best practice to run your own test AND to also confirm that all other tests are still passing before submitting a pull request. It's possible that you might need to run an overall database migration to achieve this locally, or you can rely on CircleCI to run all the tests for you in your branch by pushing it to the server.
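
To read the failure message above: the assertion appears to look for a feature whose properties include every expected key/value pair, and the "missed" dictionary lists the expected properties that did not match the closest candidate. The sketch below re-creates that comparison in simplified form. missed_properties is a hypothetical name, and this is not the test harness's actual code.

```python
def missed_properties(expected: dict, actual: dict) -> dict:
    """Return the expected properties that `actual` does not satisfy.

    Simplified illustration of a subset match: extra keys in `actual`
    are ignored, and a missing key is reported as None.
    """
    misses = {}
    for key, want in expected.items():
        got = actual.get(key)
        if got != want:
            misses[key] = "%r != %r" % (got, want)
    return misses


expected = {'sort_rank': 92, 'kind': 'camp_site', 'foo': 'bar'}
closest = {'kind': 'camp_site', 'area': 29753, 'sort_rank': 92,
           'source': 'openstreetmap.org', 'min_zoom': 13.0, 'id': 417405356}
print(missed_properties(expected, closest))
```

An empty result means every expected property matched, so only list the properties your change is responsible for.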

6. Perform any modifications, as necessary

Rinse and repeat, rewrite your code. Don't be afraid to ask for help!

7. Update data migrations

Once you've finished testing your new database logic in step 4 above, you need to record that same SQL, in modified form, in data/migrations/ to ensure someone with an earlier database configuration can catch up with you. (Migrations are reset for each Tilezen release.)

Continuing the camp_site example, edit the following in the data/migrations/v1.0.0-polygon.sql file:

⚠️ NOTE: The exact file name for the migration will change, matching the upcoming release version.
UPDATE
   planet_osm_polygon
    SET mz_poi_min_zoom = mz_calculate_min_zoom_pois(planet_osm_polygon.*)
    WHERE
     (barrier = 'toll_booth' OR
      highway IN ('services', 'rest_area') OR
      tourism = 'camp_site')
      AND COALESCE(mz_poi_min_zoom, 999) <> COALESCE(mz_calculate_min_zoom_pois(planet_osm_polygon.*), 999);

UPDATE
   planet_osm_polygon
   SET mz_landuse_min_zoom = mz_calculate_min_zoom_landuse(planet_osm_polygon.*)
   WHERE
     (highway IN  ('services', 'rest_area') OR
      barrier IN ('city_wall', 'retaining_wall', 'fence') OR
      historic = 'citywalls' OR
      man_made = 'snow_fence' OR
      waterway = 'dam' OR
      tourism = 'camp_site' OR
      "natural" IN ('forest', 'park'))
      AND COALESCE(mz_landuse_min_zoom, 999) <> COALESCE(mz_calculate_min_zoom_landuse(planet_osm_polygon.*), 999);
⚠️ NOTE: Occasionally two PRs will land at the same time and you'll need to clean up the SQL to address a merge conflict. To prevent this, use more new lines in your SQL.
⚠️ NOTE: You will need to run the database migrations after updating the SQL files. You can do this with the `data/migrations/run_migrations.sh -d osm` command. Replace `-d osm` with your database name and whatever other `psql` arguments are necessary to connect to your local database.

Running the migration script will have the effect of building and re-installing the SQL functions which are responsible for calculating the min zoom columns used to determine if a feature is visible at all. If you find that your tests are passing, but your live database doesn't show a feature in tiles (or shows it at the wrong zoom), then you may need to run a migration again.

Migrations should be idempotent: Running them multiple times should result in the same changes. This is important, as the migrations may be run several times on the development database and many, many times against a local database. This can lead to some complex code to ensure that actions aren't performed twice. Some actions, such as UPDATEs, may be safe to perform multiple times. Even in those cases, it's best to add a restrictive WHERE clause to avoid making unnecessary writes, as these can hugely slow down the migration process.
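
The idempotency idea can be tried out in isolation. The sketch below uses Python's built-in sqlite3 module with a made-up features table and a fixed min_zoom of 13, all of which are assumptions for illustration; the real migrations run against Postgres and call the mz_calculate_min_zoom_* functions. It shows that a restrictive WHERE clause makes the second run write nothing.

```python
import sqlite3


def run_migration(conn: sqlite3.Connection) -> int:
    """Update only rows whose stored min_zoom is stale; return rows written."""
    cursor = conn.execute(
        """
        UPDATE features
           SET min_zoom = 13
         WHERE tourism = 'camp_site'
           AND COALESCE(min_zoom, 999) <> 13
        """
    )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (id INTEGER PRIMARY KEY, tourism TEXT, min_zoom REAL)"
)
conn.executemany(
    "INSERT INTO features (tourism, min_zoom) VALUES (?, ?)",
    [("camp_site", None), ("camp_site", 13), ("hotel", None)],
)

first = run_migration(conn)   # writes only the row with a NULL min_zoom
second = run_migration(conn)  # nothing left to change
print(first, second)
```

The COALESCE matters because comparing NULL with <> yields NULL rather than true, so without it rows that have never been calculated would be skipped.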

Migration details

OpenStreetMap related migrations are recorded in the following files:

  • v1.0.0-point.sql
  • v1.0.0-line.sql
  • v1.0.0-polygon.sql

Migrations for other data sources like Natural Earth and Who's On First go in:

  • v1.0.0-other-tables.sql
⚠️ NOTE: The exact file names for the migration will change, matching the upcoming release version.

Example database SQL

Here's an example out of the v1.0.0-point.sql file:

Updating a simple point feature:

UPDATE planet_osm_point
  SET mz_poi_min_zoom = mz_calculate_min_zoom_pois(planet_osm_point.*)
  WHERE shop IN ('outdoor');

A more complicated point example:

UPDATE
  planet_osm_point
  SET mz_poi_min_zoom = mz_calculate_min_zoom_pois(planet_osm_point.*)
  WHERE
    (barrier = 'toll_booth' OR
     highway IN ('services', 'rest_area'))
    AND COALESCE(mz_poi_min_zoom, 999) <> COALESCE(mz_calculate_min_zoom_pois(planet_osm_point.*), 999);

Updating a simple line feature:

UPDATE planet_osm_line
  SET mz_boundary_min_zoom = mz_calculate_min_zoom_boundaries(planet_osm_line.*)
  WHERE
    waterway = 'dam';

Updating a simple polygon feature:

UPDATE planet_osm_polygon
  SET mz_poi_min_zoom = mz_calculate_min_zoom_pois(planet_osm_polygon.*)
  WHERE shop IN ('outdoor');
⚠️ NOTE: Some features can have a POI "label" and a landuse polygon, so calculate both!

When we calculate both the POIs and the landuse min zoom:

UPDATE planet_osm_polygon
  SET mz_poi_min_zoom = mz_calculate_min_zoom_pois(planet_osm_polygon.*),
      mz_landuse_min_zoom = mz_calculate_min_zoom_landuse(planet_osm_polygon.*)
  WHERE shop IN ('outdoor');

8. Update documentation

Everything good? Time to update the docs! Generally this is in the docs/layers.md file in the various layer sections to specify new properties and new kind values.

Since camp_site was already in the pois layer, we only need to document its addition to the alphabetical list of landuse kinds:

  * `bridge`
  * `camp_site`
  * `caravan_site`
⚠️ NOTE: Project documentation is publicly accessible and documents the API promises the service makes.

9. Push your local branch to the server

First let's commit our changes. Let's confirm which files changed:

git status

You can also do a git diff on each file to determine if you meant to change or insert logic. Once you've confirmed the changes...

For each file, commit using a specific commit message. The first commit should use the "Connects to #issuenum" format to link up the PR to the original issue in Waffle.io.

git commit -m 'Connects to #875 to add camp_site polygons' filename

NOTE: Subsequent commit messages can be more generic.

Make sure you'll have a clean merge by pulling down the latest master. First, check out master:

git checkout master

Fetch latest changes from the server:

git pull origin master

Go back to your branch:

git checkout olga/875-camp-ground-zoom

Rebase it onto master:

git rebase master

And resolve any funk, as necessary.

Then push to the server so other people can see your work. (If this is a large change over multiple days, please push to the server once a day so your work is backed up.)

git push

NOTE: Your first push for a branch might require additional details:

git push --set-upstream origin olga/875-camp-ground-zoom

10. Submit a Pull Request (PR)

Back on GitHub.com, load the project page and notice there's a button suggesting you create a PR for your active branch. Press that green button. Need help? GitHub docs have you covered.

In the PR form, give it a good title that ties in with the original Issue title. In the comment section summarize the work you did to resolve the issue and indicate you added tests, data migrations, and updated the documentation.

A Tilezen team member will review the PR for you, either merging it right away or following up with questions.

If the review leads to code modifications, those should be done in the same branch, and the PR will automatically update with subsequent commits to the branch.

👏 Good luck, and thank you for contributing!