Developer Guidelines

Isaiah Norton edited this page Mar 11, 2022 · 13 revisions

Introduction

Welcome! This page contains development guidelines specific to the TileDB code base.

Formatting

All changes in a PR must be formatted with clang-format version 9.0.x. If your changes violate this formatting, the CI checks will fail.

If you are a frequent contributor, consider adding the following .git/hooks/pre-commit to automatically format your C++ source when you make a commit:

#!/bin/sh

# Run an in-place `clang-format` on all staged .h/.cc files, ignoring deleted files.
# The pattern is anchored so that paths such as index.html are not matched.
git diff --name-only --cached --diff-filter=d | grep -E '\.(h|cc)$' | xargs clang-format -i

# Re-stage all of the files modified by `clang-format`.
git diff --name-only --cached --diff-filter=d | xargs git add
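
The file selection step of the hook can be tried on its own before installing it. The following is a minimal sketch, assuming an extension pattern anchored to the end of the path; the function name `filter_cpp_sources` is hypothetical and not part of the TileDB repository.

```shell
#!/bin/sh

# Hypothetical helper: read file paths on stdin, one per line, and print
# only those ending in .h or .cc. Anchoring the pattern with `$` keeps
# paths such as docs/index.html or notes.cc.bak out of the result.
filter_cpp_sources() {
  # `|| true`: finding no matching path is not an error for this helper.
  grep -E '\.(h|cc)$' || true
}
```

To preview what the hook would format, pipe the staged file list through it: `git diff --name-only --cached --diff-filter=d | filter_cpp_sources`.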

Testing

There are several types of tests in the TileDB codebase. The main integration and legacy unit tests are located in test/src. Newer object library unit tests are located in a test subdirectory at the unit level.

Before submitting a pull request, run all tests:

make check

To build and run a single unit test (targets are created at the superbuild level, so cd to the root of the build tree first):

make unit_array_schema
./tiledb/sm/array_schema/unit_array_schema

To run a subset of the integration tests for quicker development cycles, or to select tests by tag, use the tiledb_unit target and binary. For example, from the tiledb subdirectory of the build tree, the following commands run only the tests matching the [empty] tag (in the Catch2 test description), exclude tests matching [gcs], [s3], and [azure], and use only the native (local) filesystem backend for VFS:

make tiledb_unit
./test/tiledb_unit -v normal --vfs=native "~[gcs]" "~[s3]" "~[azure]" "[empty]"
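
If you repeat this invocation with different tags, a small wrapper saves retyping the exclusions. The following is a sketch only: `run_local_tests` and the `TILEDB_UNIT` override variable are hypothetical names, and the arguments are copied from the command above rather than taken from any project script.

```shell
#!/bin/sh

# Hypothetical wrapper: run the tests carrying one Catch2 tag against the
# native (local) filesystem backend, excluding the cloud-backend tags.
# The binary path defaults to ./test/tiledb_unit; set TILEDB_UNIT to
# point at a different location.
run_local_tests() {
  tag="$1"
  "${TILEDB_UNIT:-./test/tiledb_unit}" -v normal --vfs=native \
    "~[gcs]" "~[s3]" "~[azure]" "[$tag]"
}
```

For example, `run_local_tests empty` is equivalent to the command shown above when run from the tiledb subdirectory of the build tree.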

Pull Request Annotations

When a pull request is opened or updated, a GitHub Action scans the pull request's description for two reserved keywords: TYPE: and DESC:. Each keyword must start on a new line. These annotations are used to build the release notes. Your pull request will fail the CI checks if either annotation is missing or malformed. Here is an example pull request description:

This patch fixes a use-after-free on variable `x` in the `Foo::bar()` API. This path is executed when a read request exceeds 10TB.

---

TYPE: BUG
DESC: Fix use-after-free on `x` in `Foo::bar()`. 

If you are a frequent contributor, we recommend configuring your development environment to use the following git commit template. If you open a pull request with a single commit, the PR description will auto-populate with your commit message. A commit template is set through Git's commit.template configuration option (for example, git config commit.template <path-to-template-file>).

<long description of your pull request>

---

TYPE: FEATURE | BUG | IMPROVEMENT | DEPRECATION | C_API | CPP_API | BREAKING_BEHAVIOR | BREAKING_API | FORMAT
DESC: <short description of your pull request>
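
To catch a missing annotation before CI does, a description can be checked locally. The following is a minimal sketch and not the project's actual GitHub Action: `check_pr_description` is a hypothetical helper, and it assumes that TYPE: carries exactly one of the values listed in the template and that DESC: must be non-empty.

```shell
#!/bin/sh

# Hypothetical local check. Reads a pull request description on stdin and
# succeeds only if it contains a line starting with `TYPE:` followed by
# one known value, and a line starting with `DESC:` followed by some text.
check_pr_description() {
  desc="$(cat)"
  types='FEATURE|BUG|IMPROVEMENT|DEPRECATION|C_API|CPP_API|BREAKING_BEHAVIOR|BREAKING_API|FORMAT'
  # Both keywords must begin their line, hence the `^` anchors.
  printf '%s\n' "$desc" | grep -Eq "^TYPE: *($types) *\$" || return 1
  printf '%s\n' "$desc" | grep -Eq '^DESC: *[^ ]' || return 1
  return 0
}
```

With a single-commit pull request, the last commit message can be checked directly: `git log -1 --format=%B | check_pr_description && echo ok`.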

Branch Naming

When pushing to the repository, name your branches with a prefix that is likely unique to your name or username. For example, jpm/fix-segfault if your initials are jpm.

Pull Request Best Practices

Pull requests should be as small as possible to accomplish a single, well-defined objective. We prefer multiple small pull requests over one large pull request.

Dynamic Memory API

TileDB has an external API to enable heap memory profiling. The heap profiler relies on developers using an internal API instead of the standard C/C++ dynamic memory APIs. For example, developers must use tdb_malloc instead of malloc. A pre-checkin script performs static analysis to detect violations, so expect a pre-checkin check to fail if you use malloc in your pull request.

These APIs apply only to the tiledb/ directory, excluding tiledb/sm/c_api and tiledb/sm/cpp_api.

Currently, all APIs are defined in tiledb/common/heap_memory.h and the violation detection script is located in scripts/find_heap_api_violations.py.
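
For a quick look at a file before pushing, a rough scan along the following lines may help. It is a sketch under stated assumptions and far simpler than scripts/find_heap_api_violations.py, whose actual rules may differ: `find_raw_heap_calls` is a hypothetical helper that only looks for the four C allocation functions.

```shell
#!/bin/sh

# Hypothetical rough pre-check: print file:line:text for each call to
# malloc/calloc/realloc/free that is not part of a longer identifier such
# as tdb_malloc. It does not cover new/delete or smart pointers, and it
# also flags member calls and matches inside comments or strings.
find_raw_heap_calls() {
  # The extra /dev/null operand makes grep print file names even when a
  # single file is given, and avoids reading stdin when none is given.
  grep -nE '(^|[^_[:alnum:]])(malloc|calloc|realloc|free)[[:space:]]*\(' \
    "$@" /dev/null || true
}
```

Pass it the files you changed, for example `find_raw_heap_calls path/to/changed_file.cc`; an empty result means this crude scan found nothing, not that the real check will pass.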

Each C++ API has a complementary preprocessor macro that tags the usage of the API with the file name and line number from where it is called. Prefer the preprocessor macro to the C++ API unless you need a custom label. These macro interfaces are:

#define tdb_malloc(size)

#define tdb_calloc(num, size)

#define tdb_realloc(p, size)

#define tdb_free(p)

#define tdb_new(T, ...)

#define tdb_delete(p)

#define tdb_new_array(T, size)

#define tdb_delete_array(p)

#define tdb_shared_ptr

#define tdb_unique_ptr

#define tdb_make_shared(T, ...)