Commit

Merge pull request #60 from no10ds/fix/sdk-fixes
rAPId Small Fixes
TobyDrane authored Nov 7, 2023
2 parents 86bf824 + 98526a9 commit 07b0f80
Showing 18 changed files with 148 additions and 79 deletions.
6 changes: 3 additions & 3 deletions api/api/controller/datasets.py
@@ -90,7 +90,7 @@ async def list_all_datasets(
| Parameters | Required| Usage | Example values | Definition |
|---------------|---------|-----------------------------------------|------------------------------------------------------------------------------------------------------- |-----------------------|
| enriched | False | Boolean Query parameter | True | enriches the metadata |
- | query | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#examples-2)| the filtering query |
+ | query | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/routes/dataset/#filtering-query) | the filtering query |
### Accepted permissions
@@ -505,7 +505,7 @@ async def query_dataset(
| `domain` | True | URL parameter | `space` | domain of the dataset |
| `dataset` | True | URL parameter | `rocket_launches` | dataset title |
| `version` | False | Query parameter | '3' | dataset version |
- | `query` | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#how-to-construct-a-query-object)| the query object |
+ | `query` | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/query/) | the query object |
#### Layer
@@ -616,7 +616,7 @@ async def query_large_dataset(
| `domain` | True | URL parameter | `space` | domain of the dataset |
| `dataset` | True | URL parameter | `rocket_launches` | dataset title |
| `version` | False | Query parameter | '3' | dataset version |
- | `query` | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#how-to-construct-a-query-object)| the query object |
+ | `query` | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/query/) | the query object |
#### Layer
6 changes: 3 additions & 3 deletions api/api/controller/schema.py
@@ -51,7 +51,7 @@ async def generate_schema(
In order to upload the dataset for the first time, you need to define its schema. This endpoint is provided for your
convenience to generate a schema based on an existing dataset. Alternatively you can consult
- the [schema writing guide](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/schema_creation.md) if you would like to create the schema yourself. You can then use the
+ the [schema writing guide](https://rapid.readthedocs.io/en/latest/api/schema/) if you would like to create the schema yourself. You can then use the
output of this endpoint in the Schema Upload endpoint.
⚠️ WARNING:
@@ -109,7 +109,7 @@ async def upload_schema(schema: Schema):
When you have a schema definition you can use this endpoint to upload it. This will allow you to subsequently upload
datasets that match the schema. If you do not yet have a schema definition, you can craft this yourself (see
- the [schema writing guide](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/schema_creation.md)) or use the Schema Generation endpoint (see above).
+ the [schema writing guide](https://rapid.readthedocs.io/en/latest/api/schema/)) or use the Schema Generation endpoint (see above).
### Inputs
@@ -159,7 +159,7 @@ async def update_schema(schema: Schema):
This endpoint is for uploading an updated schema definition. This will allow you to subsequently upload
datasets that match the updated schema. To create a schema definition (see
- the [schema writing guide](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/schema_creation.md)) or use the Schema Generation endpoint (see above).
+ the [schema writing guide](https://rapid.readthedocs.io/en/latest/api/schema/)) or use the Schema Generation endpoint (see above).
### Inputs
2 changes: 1 addition & 1 deletion api/api/entry.py
@@ -121,7 +121,7 @@ def info():
"url": PROJECT_URL,
"contact": PROJECT_CONTACT,
"organisation": PROJECT_ORGANISATION,
"documentation-url": "https://github.com/no10ds/rapid-api",
"documentation-url": "https://rapid.readthedocs.io/en/latest/",
},
}
],
39 changes: 21 additions & 18 deletions docs/api/query.md
@@ -4,29 +4,32 @@ Data can be queried provided data has been uploaded at some point in the past.

There are six values you can customise:

<!-- prettier-ignore-start -->
<!-- SOMETHING AUTO-GENERATED BY TOOLS - START -->
- `select_columns`
- Which column(s) you want to select
- List of strings
- Can contain aggregation functions e.g.: `"avg(col1)"`, `"sum(col2)"`
- Can contain renaming of columns e.g.: `"col1 AS custom_name"`
- Which column(s) you want to select
- List of strings
- Can contain aggregation functions e.g.: `"avg(col1)"`, `"sum(col2)"`
- Can contain renaming of columns e.g.: `"col1 AS custom_name"`
- `filter`
- How to filter the data
- This is provided as a raw SQL string
- Omit the `WHERE` keyword
- How to filter the data
- This is provided as a raw SQL string
- Omit the `WHERE` keyword
- `group_by_columns`
- Which columns to group by
- List of column names as strings
- Which columns to group by
- List of column names as strings
- `aggregation_conditions`
- What conditions you want to apply to aggregated values
- This is provided as a raw SQL string
- Omit the `HAVING` keyword
- What conditions you want to apply to aggregated values
- This is provided as a raw SQL string
- Omit the `HAVING` keyword
- `order_by_columns`
- By which column(s) to order the data
- List of strings
- Defaults to ascending (`ASC`) if not provided
- `limit`
- How many rows to limit the results to
- String of an integer
- By which column(s) to order the data
- List of strings
- Defaults to ascending (`ASC`) if not provided
- `limit` - How many rows to limit the results to - String of an integer

<!-- SOMETHING AUTO-GENERATED BY TOOLS - END -->
<!-- prettier-ignore-end -->
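
As a rough illustration of how these values fit together, the sketch below assembles a query object in Python and serialises it to a JSON request body. The helper name `build_query` and the column names are hypothetical and not part of rAPId; `order_by_columns` is deliberately left out, since the exact shape of its entries is best checked against the full query guide.

```python
import json


def build_query(select_columns=None, filter_sql=None, group_by_columns=None,
                aggregation_conditions=None, limit=None):
    """Assemble a query object, leaving out any value that was not supplied.

    Hypothetical helper for illustration only; it is not part of the rAPId SDK.
    """
    candidate = {
        "select_columns": select_columns,
        # Raw SQL condition, written without the WHERE keyword
        "filter": filter_sql,
        "group_by_columns": group_by_columns,
        # Raw SQL condition, written without the HAVING keyword
        "aggregation_conditions": aggregation_conditions,
        # The limit is documented as a string of an integer
        "limit": None if limit is None else str(limit),
    }
    return {key: value for key, value in candidate.items() if value is not None}


query = build_query(
    select_columns=["col1", "avg(col2)"],
    filter_sql="col3 >= 10",
    group_by_columns=["col1"],
    aggregation_conditions="avg(col2) <= 15",
    limit=30,
)

# JSON text that could be sent as the request body of a query endpoint
body = json.dumps(query, indent=2)
print(body)
```

Dropping unset keys keeps the request body minimal, which matches the description above of every value being optional.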

For example:

83 changes: 65 additions & 18 deletions docs/api/routes/dataset.md
@@ -122,10 +122,57 @@ None

### Inputs

- | Parameters | Required | Usage | Example values | Definition |
- | ---------- | -------- | ----------------------- | ------------------------------------------------------------------------------------------------------- | --------------------- |
- | enriched | False | Boolean Query parameter | True | enriches the metadata |
- | query | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#examples-2) | the filtering query |
+ | Parameters | Required | Usage | Example values | Definition |
+ | ---------- | -------- | ----------------------- | ---------------------------------------------------------------------------------------------- | --------------------- |
+ | enriched | False | Boolean Query parameter | True | enriches the metadata |
+ | query | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/routes/dataset/#filtering-query) | the filtering query |

#### Filtering Query

**Example 1 - Filtering by tags**

Here we retrieve all datasets that have a tag with key `tag1` with any value and `tag2` with value `value2`.

```json
{
"key_value_tags": {
"tag1": null,
"tag2": "value2"
}
}
```

**Example 2 - Filtering by sensitivity**

```json
{
"sensitivity": "PUBLIC"
}
```

**Example 3 - Filtering by tags and sensitivity**

```json
{
"sensitivity": "PUBLIC",
"key_value_tags": {
"tag1": null,
"tag2": "value2"
}
}
```

**Example 4 - Filtering by key value tags and key only tags**

```json
{
"sensitivity": "PUBLIC",
"key_value_tags": {
"tag2": "value2"
},
"key_only_tags": ["tag1"]
}
```
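
The four examples above share one shape, so a small helper can produce any of them. The following is a sketch only: `build_dataset_filter` is a hypothetical name rather than an SDK function, and it emits nothing beyond the `sensitivity`, `key_value_tags` and `key_only_tags` keys shown above.

```python
import json


def build_dataset_filter(sensitivity=None, key_value_tags=None, key_only_tags=None):
    """Build the JSON-serialisable filtering query for the dataset list endpoint.

    Hypothetical helper for illustration; only keys that are supplied are emitted.
    """
    body = {}
    if sensitivity is not None:
        body["sensitivity"] = sensitivity
    if key_value_tags:
        # A value of None serialises to JSON null, i.e. "tag present with any value"
        body["key_value_tags"] = dict(key_value_tags)
    if key_only_tags:
        body["key_only_tags"] = list(key_only_tags)
    return body


# Mirrors Example 1: tag1 with any value, tag2 with value "value2"
example_1 = build_dataset_filter(key_value_tags={"tag1": None, "tag2": "value2"})

# Mirrors Example 4: sensitivity plus key-value and key-only tags
example_4 = build_dataset_filter(
    sensitivity="PUBLIC",
    key_value_tags={"tag2": "value2"},
    key_only_tags=["tag1"],
)

print(json.dumps(example_1, indent=2))
print(json.dumps(example_4, indent=2))
```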

### Outputs

@@ -208,13 +255,13 @@ You will need `READ` permission appropriate to the dataset sensitivity level, e.

### Inputs

- | Parameters | Required | Usage | Example values | Definition |
- | ---------- | -------- | ----------------- | ---------------------------------------------------------------------------------------------------------------------------- | --------------------- |
- | `layer` | True | URL parameter | `raw` | layer of the dataset |
- | `domain` | True | URL parameter | `space` | domain of the dataset |
- | `dataset` | True | URL parameter | `rocket_launches` | dataset title |
- | `version` | False | Query parameter | '3' | dataset version |
- | `query` | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#how-to-construct-a-query-object) | the query object |
+ | Parameters | Required | Usage | Example values | Definition |
+ | ---------- | -------- | ----------------- | --------------------------------------------------------------------- | --------------------- |
+ | `layer` | True | URL parameter | `raw` | layer of the dataset |
+ | `domain` | True | URL parameter | `space` | domain of the dataset |
+ | `dataset` | True | URL parameter | `rocket_launches` | dataset title |
+ | `version` | False | Query parameter | '3' | dataset version |
+ | `query` | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/query/) | the query object |

### Outputs

@@ -258,13 +305,13 @@ You will need a `READ` permission appropriate to the dataset sensitivity level,

### Inputs

- | Parameters | Required | Usage | Example values | Definition |
- | ---------- | -------- | ----------------- | ---------------------------------------------------------------------------------------------------------------------------- | --------------------- |
- | `layer` | True | URL parameter | `raw` | layer of the dataset |
- | `domain` | True | URL parameter | `space` | domain of the dataset |
- | `dataset` | True | URL parameter | `rocket_launches` | dataset title |
- | `version` | False | Query parameter | '3' | dataset version |
- | `query` | False | JSON Request Body | Consult the [docs](https://github.com/no10ds/rapid-api/blob/main/docs/guides/usage/usage.md#how-to-construct-a-query-object) | the query object |
+ | Parameters | Required | Usage | Example values | Definition |
+ | ---------- | -------- | ----------------- | --------------------------------------------------------------------- | --------------------- |
+ | `layer` | True | URL parameter | `raw` | layer of the dataset |
+ | `domain` | True | URL parameter | `space` | domain of the dataset |
+ | `dataset` | True | URL parameter | `rocket_launches` | dataset title |
+ | `version` | False | Query parameter | '3' | dataset version |
+ | `query` | False | JSON Request Body | Consult the [docs](https://rapid.readthedocs.io/en/latest/api/query/) | the query object |

### Outputs

18 changes: 17 additions & 1 deletion docs/changelog.md
@@ -1,5 +1,20 @@
# Changelog

## v7.0.7 / v0.1.5 (sdk) - _2023-11-07_

### Fixes

- Issue within the sdk `upload_and_create_dataset` function where schema metadata wasn't being correctly overridden.
- Hitting maximum security group rules for the load balancer.
- Documentation improvements and removes any references to the old deprecated repositories.

### Closes relevant GitHub issues

- https://github.com/no10ds/rapid/issues/50
- https://github.com/no10ds/rapid/issues/59
- https://github.com/no10ds/rapid/issues/54
- https://github.com/no10ds/rapid/issues/51

## v7.0.6 / v0.1.4 (sdk) - _2023-10-18_

### Features
@@ -70,7 +85,8 @@

- See the [migration doc](migration.md) for details on how to migrate to v7 from v6.

- [Unreleased changes]: https://github.com/no10ds/rapid/compare/v7.0.6...HEAD
+ [Unreleased changes]: https://github.com/no10ds/rapid/compare/v7.0.7...HEAD
+ [v7.0.7 / v0.1.5 (sdk)]: https://github.com/no10ds/rapid/v7.0.6...v7.0.7
[v7.0.6 / v0.1.4 (sdk)]: https://github.com/no10ds/rapid/v7.0.5...v7.0.6
[v7.0.5 / v0.1.3 (sdk)]: https://github.com/no10ds/rapid/v7.0.4...v7.0.5
[v7.0.4 / v0.1.2 (sdk)]: https://github.com/no10ds/rapid/v7.0.3...v7.0.4
8 changes: 4 additions & 4 deletions docs/sdk/useful_patterns.md
@@ -7,7 +7,7 @@ Below is a simple example for uploading a Pandas DataFrame to the API.
```python
import pandas as pd
from rapid import Rapid
- from rapid.patterns import data
+ from rapid.patterns import dataset
from rapid.items.schema import SchemaMetadata, SensitivityLevel, Owner
from rapid.exceptions import DataFrameUploadValidationException

@@ -25,7 +25,7 @@ metadata = SchemaMetadata(
)

try:
- data.upload_and_create_dataset(
+ dataset.upload_and_create_dataset(
rapid=rapid, df=df, metadata=metadata, upgrade_schema_on_fail=False
)
except DataFrameUploadValidationException:
@@ -39,7 +39,7 @@ Now going forward say for instance we now expect that for column c we can expect
```python
import pandas as pd
from rapid import Rapid
- from rapid.patterns import data
+ from rapid.patterns import dataset
from rapid.items.schema import SchemaMetadata, SensitivityLevel, Owner, Column
from rapid.exceptions import ColumnNotDifferentException

@@ -57,7 +57,7 @@ metadata = SchemaMetadata(
)

try:
- data.update_schema_to_dataframe(
+ dataset.update_schema_to_dataframe(
rapid=rapid,
df=df,
metadata=metadata,
28 changes: 19 additions & 9 deletions infrastructure/modules/app-cluster/load_balancer.tf
@@ -5,7 +5,7 @@ resource "aws_alb" "application_load_balancer" {
internal = false
load_balancer_type = "application"
subnets = var.public_subnet_ids_list
- security_groups = [aws_security_group.load_balancer_security_group.id]
+ security_groups = [aws_security_group.load_balancer_security_group_http.id, aws_security_group.load_balancer_security_group_https.id]
drop_invalid_header_fields = true
enable_deletion_protection = true

@@ -64,7 +64,7 @@ POLICY
data "aws_ec2_managed_prefix_list" "cloudwatch" {
name = "com.amazonaws.global.cloudfront.origin-facing"
}
- resource "aws_security_group" "load_balancer_security_group" {
+ resource "aws_security_group" "load_balancer_security_group_http" {
# checkov:skip=CKV_AWS_260: Limits by prefix list ID's
vpc_id = var.vpc_id
description = "ALB Security Group"
@@ -75,13 +75,6 @@ resource "aws_security_group" "load_balancer_security_group" {
prefix_list_ids = [data.aws_ec2_managed_prefix_list.cloudwatch.id]
description = "Allow HTTP ingress"
}
- ingress {
- from_port = 443
- to_port = 443
- protocol = "tcp"
- prefix_list_ids = [data.aws_ec2_managed_prefix_list.cloudwatch.id]
- description = "Allow HTTPS ingress"
- }
egress {
from_port = 0
to_port = 0
@@ -96,6 +89,23 @@ resource "aws_security_group" "load_balancer_security_group" {
create_before_destroy = true
}
}
+ resource "aws_security_group" "load_balancer_security_group_https" {
+ # checkov:skip=CKV_AWS_260: Limits by prefix list ID's
+ vpc_id = var.vpc_id
+ description = "ALB Security Group"
+ ingress {
+ from_port = 443
+ to_port = 443
+ protocol = "tcp"
+ prefix_list_ids = [data.aws_ec2_managed_prefix_list.cloudwatch.id]
+ description = "Allow HTTPS ingress"
+ }
+ tags = var.tags
+
+ lifecycle {
+ create_before_destroy = true
+ }
+ }

resource "aws_lb_target_group" "target_group" {
name = "${var.resource-name-prefix}-tg"
5 changes: 3 additions & 2 deletions infrastructure/modules/app-cluster/main.tf
@@ -356,7 +356,8 @@ resource "aws_ecs_service" "aws-ecs-service" {
assign_public_ip = false
security_groups = [
aws_security_group.service_security_group.id,
- aws_security_group.load_balancer_security_group.id
+ aws_security_group.load_balancer_security_group_http.id,
+ aws_security_group.load_balancer_security_group_https.id
]
}

@@ -376,7 +377,7 @@ resource "aws_security_group" "service_security_group" {
from_port = 0
to_port = 0
protocol = "-1"
- security_groups = [aws_security_group.load_balancer_security_group.id]
+ security_groups = [aws_security_group.load_balancer_security_group_http.id, aws_security_group.load_balancer_security_group_https.id]
description = "Allow traffic from load balancer"
}
