Add support for PostgreSQL #5

Merged
merged 5 commits into from
Aug 15, 2024
59 changes: 56 additions & 3 deletions README.md
@@ -58,9 +58,7 @@ Environment variables are interpolated before interpreting the configuration fil

### Defining tests
All test configurations are defined in the main configuration file.
-Currently, there are two types of tests supported: tests that publish
-their results to a CSV file, and tests that publish their results
-to a Graphite database.
+Hunter supports publishing results to a CSV file, [Graphite](https://graphiteapp.org/), and [PostgreSQL](https://www.postgresql.org/).

Tests are defined in the `tests` section.

@@ -142,6 +140,61 @@ $ curl -X POST "http://graphite_address/events/" \
Posting those events is not mandatory, but when they are available, Hunter is able to
filter data by commit or version using `--since-commit` or `--since-version` selectors.

#### Importing results from PostgreSQL

To import data from PostgreSQL, the Hunter configuration must contain the database connection details:

```yaml
# External systems connectors configuration:
postgres:
  hostname: ...
  port: ...
  username: ...
  password: ...
  database: ...
```
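As a rough illustration of how these five settings might be consumed, the sketch below mirrors them in a small dataclass and renders a libpq keyword/value connection string. The class name `PostgresConnSettings`, its `to_dsn` method, and the sample values are hypothetical; they are not taken from `hunter/postgres.py`, which this diff does not show.

```python
from dataclasses import dataclass


@dataclass
class PostgresConnSettings:
    """Hypothetical mirror of the five keys in the `postgres` section."""

    hostname: str
    port: int
    username: str
    password: str
    database: str

    def to_dsn(self) -> str:
        # libpq keyword/value format: values are single-quoted, with
        # backslashes and single quotes escaped by a backslash.
        def quote(value) -> str:
            escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
            return f"'{escaped}'"

        pairs = {
            "host": self.hostname,
            "port": self.port,
            "user": self.username,
            "password": self.password,
            "dbname": self.database,
        }
        return " ".join(f"{key}={quote(value)}" for key, value in pairs.items())


# Made-up values, for illustration only.
settings = PostgresConnSettings("db.example.org", 5432, "hunter", "s3cr3t", "benchmarks")
dsn = settings.to_dsn()
```

Whether Hunter builds such a string or passes the fields to its database driver one by one is not visible in this diff.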

Test configurations must contain a query to select experiment data, a time column, and a list of columns to analyze:

```yaml
tests:
  aggregate_mem:
    type: postgres
    time_column: commit_ts
    attributes: [experiment_id, config_id, commit]
    metrics:
      process_cumulative_rate_mean:
        direction: 1
        scale: 1
      process_cumulative_rate_stderr:
        direction: -1
        scale: 1
      process_cumulative_rate_diff:
        direction: -1
        scale: 1
    query: |
      SELECT e.commit,
             e.commit_ts,
             r.process_cumulative_rate_mean,
             r.process_cumulative_rate_stderr,
             r.process_cumulative_rate_diff,
             r.experiment_id,
             r.config_id
      FROM results r
      INNER JOIN configs c ON r.config_id = c.id
      INNER JOIN experiments e ON r.experiment_id = e.id
      WHERE e.exclude_from_analysis = false AND
            e.branch = 'trunk' AND
            e.username = 'ci' AND
            c.store = 'MEM' AND
            c.cache = true AND
            c.benchmark = 'aggregate' AND
            c.instance_type = 'ec2i3.large'
      ORDER BY e.commit_ts ASC;
```

For more details, see the examples in [examples/psql](examples/psql).
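
The relationship between the query and the `time_column`, `attributes`, and `metrics` keys can be sketched as follows. This is only an illustration under assumptions: the helper `split_columns` and the sample rows are invented here, and Hunter's actual importer may organize the data differently. The point is that every column named in the test definition has to be returned by the query.

```python
from datetime import datetime, timezone

# Subset of the test definition above, as it might look after YAML parsing.
test_conf = {
    "time_column": "commit_ts",
    "attributes": ["experiment_id", "config_id", "commit"],
    "metrics": [
        "process_cumulative_rate_mean",
        "process_cumulative_rate_stderr",
        "process_cumulative_rate_diff",
    ],
}


def split_columns(column_names, rows, conf):
    """Group query result columns into time, attributes, and metrics."""
    wanted = [conf["time_column"], *conf["attributes"], *conf["metrics"]]
    missing = [name for name in wanted if name not in column_names]
    if missing:
        raise ValueError(f"query does not return columns: {missing}")
    idx = {name: i for i, name in enumerate(column_names)}
    time = [row[idx[conf["time_column"]]] for row in rows]
    attributes = {a: [row[idx[a]] for row in rows] for a in conf["attributes"]}
    metrics = {m: [row[idx[m]] for row in rows] for m in conf["metrics"]}
    return time, attributes, metrics


# Hypothetical result set in the column order of the SELECT above.
columns = [
    "commit", "commit_ts",
    "process_cumulative_rate_mean",
    "process_cumulative_rate_stderr",
    "process_cumulative_rate_diff",
    "experiment_id", "config_id",
]
rows = [
    ("abc123", datetime(2024, 8, 1, tzinfo=timezone.utc), 1000, 12, 3, "exp-1", 7),
    ("def456", datetime(2024, 8, 2, tzinfo=timezone.utc), 1100, 10, 2, "exp-2", 7),
]
time, attributes, metrics = split_columns(columns, rows, test_conf)
```

If the query omitted, say, `r.config_id`, this sketch would raise a `ValueError` naming the missing column instead of failing later during analysis.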

#### Avoiding test definition duplication
You may find that your test definitions are very similar to each other,
e.g. they all have the same metrics. Instead of copy-pasting the definitions
22 changes: 22 additions & 0 deletions examples/psql/README.md
@@ -0,0 +1,22 @@
## Schema

See [schema.sql](schema.sql) for the example schema.

## Usage

Define PostgreSQL connection details via environment variables:

```bash
export POSTGRES_HOSTNAME=...
export POSTGRES_PORT=...
export POSTGRES_USERNAME=...
export POSTGRES_PASSWORD=...
export POSTGRES_DATABASE=...
```

or in `hunter.yaml`.

The following command shows results for a single test `aggregate_mem` and updates the database with newly found change points:

```bash
$ BRANCH=trunk HUNTER_CONFIG=hunter.yaml hunter analyze aggregate_mem --update-postgres
```
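
The `--update-postgres` flag depends on the `update_statement` template defined in `hunter.yaml`, which mixes two kinds of placeholders. The sketch below shows one plausible way they could be resolved: the braces (`{metric}`, `{experiment_id}`, `{config_id}`) filled with `str.format`, and the three `%s` markers left for the database driver to bind. The function `build_update` and the sample values are assumptions made for illustration, not code from this PR.

```python
UPDATE_STATEMENT = """\
UPDATE results
  SET {metric}_rel_forward_change=%s,
      {metric}_rel_backward_change=%s,
      {metric}_p_value=%s
WHERE experiment_id = '{experiment_id}' AND config_id = {config_id}
"""


def build_update(template, metric, attributes, forward, backward, p_value):
    """Return (sql, params) for one change point.

    Assumption: braces are filled with str.format, while the three %s
    placeholders are left for the database driver to bind.
    """
    # Attributes not named in the template (such as `commit`) are ignored.
    sql = template.format(metric=metric, **attributes)
    return sql, (forward, backward, p_value)


# Made-up change point values, for illustration only.
sql, params = build_update(
    UPDATE_STATEMENT,
    metric="process_cumulative_rate_mean",
    attributes={"experiment_id": "exp-1", "config_id": 7, "commit": "abc123"},
    forward=-4.2,
    backward=4.4,
    p_value=0.0001,
)
```

With a DB-API driver that uses the `format` parameter style, the pair would typically be handed to `cursor.execute(sql, params)`. Because the attribute values are substituted into the SQL text itself, they should come from trusted data such as the rows returned by the test query.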
55 changes: 55 additions & 0 deletions examples/psql/hunter.yaml
@@ -0,0 +1,55 @@
# External systems connectors configuration:
postgres:
  hostname: ${POSTGRES_HOSTNAME}
  port: ${POSTGRES_PORT}
  username: ${POSTGRES_USERNAME}
  password: ${POSTGRES_PASSWORD}
  database: ${POSTGRES_DATABASE}

# Templates define common bits shared between test definitions:
templates:
  common:
    type: postgres
    time_column: commit_ts
    attributes: [experiment_id, config_id, commit]
    # required for --update-postgres to work
    update_statement: |
      UPDATE results
        SET {metric}_rel_forward_change=%s,
            {metric}_rel_backward_change=%s,
            {metric}_p_value=%s
      WHERE experiment_id = '{experiment_id}' AND config_id = {config_id}
    metrics:
      process_cumulative_rate_mean:
        direction: 1
        scale: 1
      process_cumulative_rate_stderr:
        direction: -1
        scale: 1
      process_cumulative_rate_diff:
        direction: -1
        scale: 1

# Define your tests here:
tests:
  aggregate_mem:
    inherit: [ common ] # avoids repeating metrics definitions and postgres-related config
    query: |
      SELECT e.commit,
             e.commit_ts,
             r.process_cumulative_rate_mean,
             r.process_cumulative_rate_stderr,
             r.process_cumulative_rate_diff,
             r.experiment_id,
             r.config_id
      FROM results r
      INNER JOIN configs c ON r.config_id = c.id
      INNER JOIN experiments e ON r.experiment_id = e.id
      WHERE e.exclude_from_analysis = false AND
            e.branch = '${BRANCH}' AND
            e.username = 'ci' AND
            c.store = 'MEM' AND
            c.cache = true AND
            c.benchmark = 'aggregate' AND
            c.instance_type = 'ec2i3.large'
      ORDER BY e.commit_ts ASC;
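
The `${POSTGRES_HOSTNAME}` and `${BRANCH}` placeholders in this file rely on environment variables being interpolated before the YAML is interpreted. A minimal sketch of that step, using only the standard library, could look like the following. The function `interpolate` and the abridged config text are illustrative assumptions; the library Hunter actually uses for this is not shown in the diff.

```python
from string import Template

# Abridged, hypothetical excerpt of a config file with ${NAME} placeholders.
CONFIG_TEXT = """\
postgres:
  hostname: ${POSTGRES_HOSTNAME}
  port: ${POSTGRES_PORT}
tests:
  aggregate_mem:
    query: |
      SELECT e.commit FROM experiments e WHERE e.branch = '${BRANCH}'
"""


def interpolate(text: str, env: dict) -> str:
    """Replace ${NAME} placeholders, failing loudly on unset variables."""
    try:
        return Template(text).substitute(env)
    except KeyError as err:
        raise ValueError(f"environment variable {err.args[0]} is not set") from err


# In a real run, `env` would be os.environ.
env = {"POSTGRES_HOSTNAME": "db.example.org", "POSTGRES_PORT": "5432", "BRANCH": "trunk"}
rendered = interpolate(CONFIG_TEXT, env)
```

Failing on an unset variable is a deliberate choice in this sketch: a silently empty `${BRANCH}` would produce a query that matches no rows. The `{metric}` and `%s` markers in `update_statement` use different syntax, so `string.Template` would leave them untouched.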
48 changes: 48 additions & 0 deletions examples/psql/schema.sql
@@ -0,0 +1,48 @@
CREATE TABLE IF NOT EXISTS configs (
    id SERIAL PRIMARY KEY,
    benchmark TEXT NOT NULL,
    scenario TEXT NOT NULL,
    store TEXT NOT NULL,
    instance_type TEXT NOT NULL,
    cache BOOLEAN NOT NULL,
    UNIQUE(benchmark,
           scenario,
           store,
           cache,
           instance_type)
);

CREATE TABLE IF NOT EXISTS experiments (
    id TEXT PRIMARY KEY,
    ts TIMESTAMPTZ NOT NULL,
    branch TEXT NOT NULL,
    commit TEXT NOT NULL,
    commit_ts TIMESTAMPTZ NOT NULL,
    username TEXT NOT NULL,
    details_url TEXT NOT NULL,
    exclude_from_analysis BOOLEAN DEFAULT false NOT NULL,
    exclude_reason TEXT
);

CREATE TABLE IF NOT EXISTS results (
    experiment_id TEXT NOT NULL REFERENCES experiments(id),
    config_id INTEGER NOT NULL REFERENCES configs(id),

    process_cumulative_rate_mean BIGINT NOT NULL,
    process_cumulative_rate_stderr BIGINT NOT NULL,
    process_cumulative_rate_diff BIGINT NOT NULL,

    process_cumulative_rate_mean_rel_forward_change DOUBLE PRECISION,
    process_cumulative_rate_mean_rel_backward_change DOUBLE PRECISION,
    process_cumulative_rate_mean_p_value DECIMAL,

    process_cumulative_rate_stderr_rel_forward_change DOUBLE PRECISION,
    process_cumulative_rate_stderr_rel_backward_change DOUBLE PRECISION,
    process_cumulative_rate_stderr_p_value DECIMAL,

    process_cumulative_rate_diff_rel_forward_change DOUBLE PRECISION,
    process_cumulative_rate_diff_rel_backward_change DOUBLE PRECISION,
    process_cumulative_rate_diff_p_value DECIMAL,

    PRIMARY KEY (experiment_id, config_id)
);
24 changes: 24 additions & 0 deletions hunter/config.py
@@ -8,6 +8,7 @@

from hunter.grafana import GrafanaConfig
from hunter.graphite import GraphiteConfig
from hunter.postgres import PostgresConfig
from hunter.slack import SlackConfig
from hunter.test_config import TestConfig, create_test_config
from hunter.util import merge_dict_list
@@ -20,6 +21,7 @@ class Config:
    tests: Dict[str, TestConfig]
    test_groups: Dict[str, List[TestConfig]]
    slack: SlackConfig
    postgres: PostgresConfig


@dataclass
@@ -110,6 +112,27 @@ def load_config_from(config_file: Path) -> Config:
                bot_token=config["slack"]["token"],
            )

        postgres_config = None
        if config.get("postgres") is not None:
            if not config["postgres"]["hostname"]:
                raise ValueError("postgres.hostname")
            if not config["postgres"]["port"]:
                raise ValueError("postgres.port")
            if not config["postgres"]["username"]:
                raise ValueError("postgres.username")
            if not config["postgres"]["password"]:
                raise ValueError("postgres.password")
            if not config["postgres"]["database"]:
                raise ValueError("postgres.database")

            postgres_config = PostgresConfig(
                hostname=config["postgres"]["hostname"],
                port=config["postgres"]["port"],
                username=config["postgres"]["username"],
                password=config["postgres"]["password"],
                database=config["postgres"]["database"],
            )

        templates = load_templates(config)
        tests = load_tests(config, templates)
        groups = load_test_groups(config, tests)
@@ -118,6 +141,7 @@ def load_config_from(config_file: Path) -> Config:
            graphite=graphite_config,
            grafana=grafana_config,
            slack=slack_config,
            postgres=postgres_config,
            tests=tests,
            test_groups=groups,
        )
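
The five near-identical checks above could also be expressed as a loop. The sketch below is an alternative for discussion, not the code merged in this PR; the names `REQUIRED_POSTGRES_KEYS` and `validate_postgres_section` are invented here. It assumes the parsed config is a plain `dict`, in which case indexing an absent key (as the merged checks do) would raise `KeyError` rather than `ValueError`, whereas `.get()` handles absent and empty values the same way.

```python
REQUIRED_POSTGRES_KEYS = ("hostname", "port", "username", "password", "database")


def validate_postgres_section(config: dict):
    """Return the postgres settings, or None if the section is absent."""
    section = config.get("postgres")
    if section is None:
        return None
    # .get() treats an absent key like an empty value, so both cases
    # end up in the same ValueError instead of a KeyError.
    missing = [key for key in REQUIRED_POSTGRES_KEYS if not section.get(key)]
    if missing:
        names = ", ".join(f"postgres.{key}" for key in missing)
        raise ValueError(f"missing or empty configuration values: {names}")
    return {key: section[key] for key in REQUIRED_POSTGRES_KEYS}


# Made-up settings, for illustration only.
complete = {
    "postgres": {
        "hostname": "db.example.org",
        "port": 5432,
        "username": "hunter",
        "password": "s3cr3t",
        "database": "benchmarks",
    }
}
settings = validate_postgres_section(complete)
```

The returned dictionary has exactly the keyword names that `PostgresConfig(...)` receives in the diff above, so it could in principle be unpacked with `**settings`.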