feat(api)!: ingest-api/docs auth flow (#356)
### What?
- Breaking: the `VEDA_USERPOOL_DOMAIN_BASE_URL` configuration is now required,
along with a one-time administrator step to add the ingest-api's URL to the
allowed callback URLs of the existing user pool client
- Updates the ingest-api auth flow: the `Authorize` button in the swagger docs
now redirects to the Cognito UI for secure user password entry
- Removes outdated whoami auth response model
- ~Upgrades ingest-api fastapi to patched version 0.9.1~
- Improves ingest-api configuration and deployment documentation
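
The `Authorize` flow described above is an OAuth2 authorization code grant with PKCE (the diff sets `usePkceWithAuthorizationCodeGrant` in the Swagger UI options). As a rough sketch of what the browser prepares before redirecting to Cognito, the snippet below builds a PKCE verifier/challenge pair and an authorization URL. The helper names, domain, client id, and redirect URI are placeholders for illustration and do not come from this commit. The `/docs/oauth2-redirect` path used in the usage note is FastAPI's default Swagger UI redirect route, assumed here to be the callback an administrator would allow on the user pool client.

```python
import base64
import hashlib
import secrets
from typing import Tuple
from urllib.parse import urlencode


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    # 64 random bytes encode to 86 URL-safe characters (allowed range: 43-128).
    verifier = secrets.token_urlsafe(64)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is the unpadded base64url encoding of the SHA-256 digest.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(
    base_url: str, client_id: str, redirect_uri: str, code_challenge: str
) -> str:
    """Build the authorization request URL (hypothetical helper, not in the commit)."""
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
            "code_challenge": code_challenge,
            "code_challenge_method": "S256",
        }
    )
    return f"{base_url.rstrip('/')}/oauth2/authorize?{query}"
```

For example, `build_authorize_url("https://example.auth.us-west-2.amazoncognito.com", "my-client-id", "https://ingest.example.com/docs/oauth2-redirect", make_pkce_pair()[1])` yields a URL whose `redirect_uri` would have to match an allowed callback exactly.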

### Why?

- The `/token` endpoint is sluggish when used via the swagger docs, and it is
not desirable to enter secrets in a plain string input form (the `/token`
endpoint is still supported for programmatic access)

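For the programmatic path that remains supported, a client posts credentials to `/token`. The sketch below only constructs the request and does not send it. It assumes form-encoded `username` and `password` fields, as suggested by the `OAuth2PasswordRequestForm` import in `main.py`; the API base URL and the helper name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(api_base_url: str, username: str, password: str) -> Request:
    """Build (but do not send) a form-encoded POST to the /token endpoint."""
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    return Request(
        url=f"{api_base_url.rstrip('/')}/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen`. The shape of the response body is not shown in this diff, so it is not assumed here. In practice the password would come from `getpass` or an environment variable rather than a literal.
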
### Testing?

- Deployed manually to the dev stack and updated the allowed callbacks of the
dev Cognito user pool client
- Obtained tokens via the `Authorize` button and the `/token` endpoint
- Created, then deleted one collection
anayeaye authored Apr 24, 2024
2 parents 9f0289b + a734874 commit 5c3cea4
Showing 13 changed files with 96 additions and 120 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -139,7 +139,7 @@ In case of failure, all container logs will be written out to `container_logs.lo

> **Warning** PgSTAC records should be loaded in the database using [pypgstac](https://github.com/stac-utils/pgstac#pypgstac) for proper indexing and partitioning.
The VEDA ecosystem includes tools specifially created for loading PgSTAC records and optimizing data assets. The [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) project provides examples of cloud pipelines that transform data to cloud optimized formats, generate STAC metadata, and submit records for publication to the veda-backend database using the [veda-stac-ingestor](https://github.com/NASA-IMPACT/veda-stac-ingestor).
The VEDA ecosystem includes tools specifially created for loading PgSTAC records and optimizing data assets. The [veda-data-airflow](https://github.com/NASA-IMPACT/veda-data-airflow) project provides examples of cloud pipelines that transform data to cloud optimized formats, generate STAC metadata, and submit records for publication to the veda-backend database via veda-backend's ingest API. Veda-backend's integrated ingest system includes an API lambda for enqueuing collection and item records in a DynamoDB table and an ingestor lambda that batch loads DDB enqueued records into the PgSTAC database. Currently, the client id and domain of an existing Cognito user pool programmatic client must be supplied in [configuration](ingest_api/infrastructure/config.py) as `VEDA_CLIENT_ID` and `VEDA_USERPOOL_DOMAIN_BASE_URL` (the [veda-auth project](https://github.com/NASA-IMPACT/veda-auth) can be used to deploy a Cognito user pool and client). To dispense auth tokens via the ingest API swagger docs and `/token` endpoints, an administrator must add the ingest API lambda URL to the allowed callbacks of the Cognito client.

## Support scripts
Support scripts are provided for manual system operations.
5 changes: 4 additions & 1 deletion ingest_api/infrastructure/config.py
@@ -2,7 +2,7 @@
from typing import Optional

import aws_cdk
from pydantic import BaseSettings, Field, constr
from pydantic import AnyHttpUrl, BaseSettings, Field, constr

AwsArn = constr(regex=r"^arn:aws:iam::\d{12}:role/.+")

@@ -34,6 +34,9 @@ class IngestorConfig(BaseSettings):
client_secret: Optional[str] = Field(
"", description="The Cognito APP client secret"
)
userpool_domain_base_url: AnyHttpUrl = Field(
description="The base url of the Cognito domain for authorization and token urls"
)
stac_db_security_group_id: str = Field(
description="ID of Security Group used by pgSTAC DB"
)
1 change: 1 addition & 0 deletions ingest_api/infrastructure/construct.py
@@ -62,6 +62,7 @@ def __init__(
"RASTER_URL": config.raster_api_url,
"ROOT_PATH": config.ingest_root_path,
"STAGE": config.stage,
"USERPOOL_DOMAIN_BASE_URL": config.userpool_domain_base_url,
}

# create lambda
3 changes: 2 additions & 1 deletion ingest_api/runtime/requirements.txt
@@ -1,5 +1,5 @@
# Waiting for https://github.com/stac-utils/stac-pydantic/pull/116 and 117
Authlib==1.0.1
cryptography>=42.0.5
ddbcereal==2.1.1
fastapi<=0.108.0
fsspec==2023.3.0
@@ -8,6 +8,7 @@ orjson>=3.6.8
psycopg[binary,pool]>=3.0.15
pydantic_ssm_settings>=0.2.0
pydantic>=1.10.12
pyjwt>=2.8.0
pypgstac==0.7.4
python-multipart==0.0.7
requests>=2.27.1
20 changes: 1 addition & 19 deletions ingest_api/runtime/requirements_dev.txt
@@ -1,22 +1,4 @@
# Waiting for https://github.com/stac-utils/stac-pydantic/pull/116 and 117
Authlib==1.0.1
ddbcereal==2.1.1
fastapi<=0.108.0
fsspec==2023.3.0
mangum>=0.15.0
orjson>=3.6.8
psycopg[binary,pool]>=3.0.15
pydantic_ssm_settings>=0.2.0
pydantic>=1.10.12
pypgstac==0.7.4
python-multipart==0.0.7
requests>=2.27.1
s3fs==2023.3.0
stac-pydantic @ git+https://github.com/ividito/stac-pydantic.git@3f4cb381c85749bb4b15d1181179057ec0f51a94
xarray==2023.1.0
xstac==1.1.0
zarr==2.13.6
boto3==1.24.59
-r requirements.txt
moto[dynamodb, ssm]>=4.0.9,<5.0
httpx
aws_xray_sdk>=2.6.0,<3
93 changes: 42 additions & 51 deletions ingest_api/runtime/src/auth.py
@@ -2,68 +2,59 @@
import hashlib
import hmac
import logging
from typing import Dict
from typing import Annotated, Any, Dict

import boto3
import requests
import src.config as config
from authlib.jose import JsonWebKey, JsonWebToken, JWTClaims, KeySet, errors
from cachetools import TTLCache, cached
import jwt
from src.config import settings

from fastapi import Depends, HTTPException, security
from fastapi import Depends, HTTPException, Security, security, status

logger = logging.getLogger(__name__)

token_scheme = security.HTTPBearer()
oauth2_scheme = security.OAuth2AuthorizationCodeBearer(
authorizationUrl=settings.cognito_authorization_url,
tokenUrl=settings.cognito_token_url,
refreshUrl=settings.cognito_token_url,
)

jwks_client = jwt.PyJWKClient(settings.jwks_url) # Caches JWKS

def get_settings() -> config.Settings:
import src.main as main

return main.settings


def get_jwks_url(settings: config.Settings = Depends(get_settings)) -> str:
return settings.jwks_url


@cached(TTLCache(maxsize=1, ttl=3600))
def get_jwks(jwks_url: str = Depends(get_jwks_url)) -> KeySet:
with requests.get(jwks_url) as response:
response.raise_for_status()
return JsonWebKey.import_key_set(response.json())


def decode_token(
token: security.HTTPAuthorizationCredentials = Depends(token_scheme),
jwks: KeySet = Depends(get_jwks),
) -> JWTClaims:
"""
Validate & decode JWT
"""
def validated_token(
token_str: Annotated[str, Security(oauth2_scheme)],
required_scopes: security.SecurityScopes,
) -> Dict:
# Parse & validate token
try:
claims = JsonWebToken(["RS256"]).decode(
s=token.credentials,
key=jwks,
claims_options={
# # Example of validating audience to match expected value
# "aud": {"essential": True, "values": [APP_CLIENT_ID]}
},
token = jwt.decode(
token_str,
jwks_client.get_signing_key_from_jwt(token_str).key,
algorithms=["RS256"],
)

if "client_id" in claims:
# Insert Cognito's `client_id` into `aud` claim if `aud` claim is unset
claims.setdefault("aud", claims["client_id"])

claims.validate()
return claims
except errors.JoseError: #
logger.exception("Unable to decode token")
raise HTTPException(status_code=403, detail="Bad auth token")


def get_username(claims: security.HTTPBasicCredentials = Depends(decode_token)):
return claims["sub"]
except jwt.exceptions.InvalidTokenError as e:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
headers={"WWW-Authenticate": "Bearer"},
) from e

# Validate scopes (if required)
for scope in required_scopes.scopes:
if scope not in token["scope"]:
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Not enough permissions",
headers={
"WWW-Authenticate": f'Bearer scope="{required_scopes.scope_str}"'
},
)

return token


def get_username(token: Annotated[Dict[Any, Any], Depends(validated_token)]):
return token["username"]


def _get_secret_hash(username: str, client_id: str, client_secret: str) -> str:
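
The scope loop in `validated_token` above tests `scope not in token["scope"]`. Cognito access tokens carry `scope` as a single space-delimited string, so that `in` is a substring test rather than a membership test. A hedged sketch of a whole-word comparison follows; the helper name and the sample scope strings are invented for illustration and are not part of this commit.

```python
from typing import Iterable, List


def missing_scopes(granted: str, required: Iterable[str]) -> List[str]:
    """Return the required scopes that are absent from a space-delimited scope claim."""
    # Split the claim into whole scope names so partial matches do not count.
    granted_set = set(granted.split())
    return [scope for scope in required if scope not in granted_set]
```

With a hypothetical granted string of `"openid stac:readonly"`, the substring test would accept a required scope of `stac:read`, while `missing_scopes` reports it as missing.
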
26 changes: 26 additions & 0 deletions ingest_api/runtime/src/config.py
@@ -1,3 +1,5 @@
import os
from getpass import getuser
from typing import Optional

from pydantic import AnyHttpUrl, BaseSettings, Field, constr
@@ -21,14 +23,38 @@ class Settings(BaseSettings):

userpool_id: str = Field(description="The Cognito Userpool used for authentication")

userpool_domain_base_url: AnyHttpUrl = Field(
description="The base url of the Cognito domain for authorization and token urls"
)
client_id: str = Field(description="The Cognito APP client ID")
client_secret: str = Field("", description="The Cognito APP client secret")
root_path: Optional[str] = Field(description="Root path of API")
stage: Optional[str] = Field(description="API stage")

@property
def cognito_authorization_url(self) -> AnyHttpUrl:
"""Cognito user pool authorization url"""
return f"{self.userpool_domain_base_url}/oauth2/authorize"

@property
def cognito_token_url(self) -> AnyHttpUrl:
"""Cognito user pool token and refresh url"""
return f"{self.userpool_domain_base_url}/oauth2/token"

class Config(AwsSsmSourceConfig):
env_file = ".env"

@classmethod
def from_ssm(cls, stack: str):
return cls(_secrets_dir=f"/{stack}")


settings = (
Settings()
if os.environ.get("NO_PYDANTIC_SSM_SETTINGS")
else Settings.from_ssm(
stack=os.environ.get(
"STACK", f"veda-stac-ingestion-system-{os.environ.get('STAGE', getuser())}"
),
)
)
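
The two new properties in `Settings` concatenate the Cognito base URL with fixed OAuth2 paths. A standalone sketch of the same derivation, without pydantic, is below. It additionally strips a trailing slash from the base URL, which is an assumption about how a configured value might look rather than something the commit does. The class name and example domain are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CognitoEndpoints:
    """Derives OAuth2 endpoint URLs from a Cognito domain base URL."""

    userpool_domain_base_url: str

    @property
    def _base(self) -> str:
        # Normalise so that a trailing slash does not produce a double slash.
        return self.userpool_domain_base_url.rstrip("/")

    @property
    def authorization_url(self) -> str:
        return f"{self._base}/oauth2/authorize"

    @property
    def token_url(self) -> str:
        # Cognito serves both token issuance and refresh from this endpoint.
        return f"{self._base}/oauth2/token"
```
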
4 changes: 2 additions & 2 deletions ingest_api/runtime/src/dependencies.py
@@ -12,9 +12,9 @@
token_scheme = security.HTTPBearer()


def get_table(settings: config.Settings = Depends(auth.get_settings)):
def get_table():
client = boto3.resource("dynamodb")
return client.Table(settings.dynamodb_table)
return client.Table(config.settings.dynamodb_table)


def get_db(table=Depends(get_table)) -> services.Database:
5 changes: 3 additions & 2 deletions ingest_api/runtime/src/doc.py
@@ -11,10 +11,11 @@
## Auth
The auth API allows users to retrieve an access token and get information about the
current user.
To get an access token, the user must provide their username and password in the
request body to the POST `/token` API.
To get an access token, the user must provide their username and password via the Cognito UI (click the 'Authorize' button for a temporary login redirect).
The current user's information can be retrieved using the GET `/auth/me` API.
For programmatic access, post username and password in the request body to the POST `/token` API.
Before using the API, user must ask a VEDA team member to create credentials (username
and password) for VEDA auth.
The user name and password is used to get the access token from Auth API call in order
3 changes: 1 addition & 2 deletions ingest_api/runtime/src/ingestor.py
@@ -5,7 +5,6 @@

from boto3.dynamodb.types import TypeDeserializer
from pypgstac.db import PgstacDB
from src.auth import get_settings
from src.dependencies import get_table
from src.schemas import Ingestion, Status
from src.utils import IngestionType, get_db_credentials, load_into_pgstac
@@ -41,7 +40,7 @@ def update_dynamodb(
"""
# Update records in DynamoDB
print(f"Updating ingested items status in DynamoDB, marking as {status}...")
table = get_table(get_settings())
table = get_table()
with table.batch_writer(overwrite_by_pkeys=["created_by", "id"]) as batch:
for ingestion in ingestions:
batch.put_item(
32 changes: 12 additions & 20 deletions ingest_api/runtime/src/main.py
@@ -1,14 +1,12 @@
import os
from getpass import getuser
from typing import Dict

import src.auth as auth
import src.config as config
import src.dependencies as dependencies
import src.schemas as schemas
import src.services as services
from aws_lambda_powertools.metrics import MetricUnit
from src.collection_publisher import CollectionPublisher, ItemPublisher
from src.config import settings
from src.doc import DESCRIPTION
from src.monitoring import LoggerRouteHandler, logger, metrics, tracer

@@ -18,17 +16,6 @@
from fastapi.security import OAuth2PasswordRequestForm
from starlette.requests import Request

settings = (
config.Settings()
if os.environ.get("NO_PYDANTIC_SSM_SETTINGS")
else config.Settings.from_ssm(
stack=os.environ.get(
"STACK", f"veda-stac-ingestion-system-{os.environ.get('STAGE', getuser())}"
),
)
)


app = FastAPI(
title="VEDA Ingestion API",
description=DESCRIPTION,
@@ -40,6 +27,11 @@
root_path=settings.root_path,
openapi_url="/openapi.json",
docs_url="/docs",
swagger_ui_init_oauth={
"appName": "Cognito",
"clientId": settings.client_id,
"usePkceWithAuthorizationCodeGrant": True,
},
router=APIRouter(route_class=LoggerRouteHandler),
)

@@ -71,7 +63,7 @@ async def list_ingestions(
)
async def enqueue_ingestion(
item: schemas.AccessibleItem,
username: str = Depends(auth.get_username),
username: str = Depends(auth.validated_token),
db: services.Database = Depends(dependencies.get_db),
) -> schemas.Ingestion:
"""
@@ -142,7 +134,7 @@ def cancel_ingestion(
"/collections",
tags=["Collection"],
status_code=201,
dependencies=[Depends(auth.get_username)],
dependencies=[Depends(auth.validated_token)],
)
def publish_collection(collection: schemas.DashboardCollection):
"""
@@ -162,7 +154,7 @@ def publish_collection(collection: schemas.DashboardCollection):
@app.delete(
"/collections/{collection_id}",
tags=["Collection"],
dependencies=[Depends(auth.get_username)],
dependencies=[Depends(auth.validated_token)],
)
def delete_collection(collection_id: str):
"""
@@ -180,7 +172,7 @@ def delete_collection(collection_id: str):
"/items",
tags=["Items"],
status_code=201,
dependencies=[Depends(auth.get_username)],
dependencies=[Depends(auth.validated_token)],
)
def publish_item(item: schemas.Item):
"""
@@ -219,8 +211,8 @@ async def get_token(
)


@app.get("/auth/me", tags=["Auth"], response_model=schemas.WhoAmIResponse)
def who_am_i(claims=Depends(auth.decode_token)):
@app.get("/auth/me", tags=["Auth"])
def who_am_i(claims=Depends(auth.validated_token)):
"""
Return claims for the provided JWT
"""
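
In the `main.py` hunks above, the routes now depend on `auth.validated_token`, which per `auth.py` returns the decoded claims dictionary, while `auth.get_username` returns only `token["username"]`. The plain-Python sketch below (no FastAPI, with fabricated claim values and stub names) illustrates what a handler would receive from each of the two.

```python
from typing import Any, Dict

# Fabricated claims for illustration only; real tokens carry more fields.
SAMPLE_CLAIMS: Dict[str, Any] = {
    "username": "alice",
    "scope": "openid",
    "token_use": "access",
}


def validated_token_stub() -> Dict[str, Any]:
    """Stand-in for auth.validated_token: returns the whole claims dictionary."""
    return dict(SAMPLE_CLAIMS)


def get_username_stub(token: Dict[str, Any]) -> str:
    """Stand-in for auth.get_username: extracts only the username claim."""
    return token["username"]
```

A parameter annotated `username: str` that depends on `validated_token` would therefore receive the whole dictionary rather than a string. Whether that is intended for `enqueue_ingestion` is not stated in the commit.
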
21 changes: 0 additions & 21 deletions ingest_api/runtime/src/schemas.py
@@ -88,27 +88,6 @@ class AuthResponse(BaseModel):
)


class WhoAmIResponse(BaseModel):
sub: str = Field(..., description="A unique identifier for the user")
# cognito_groups: List[str] = Field(
# ..., description="A list of Cognito groups the user belongs to"
# )
iss: str = Field(..., description="The issuer of the token")
client_id: str = Field(..., description="The client ID of the authenticated app")
origin_jti: str = Field(
..., description="A unique identifier for the authentication event"
)
event_id: str = Field(..., description="A unique identifier for the event")
token_use: str = Field(..., description="The intended use of the token")
scope: str = Field(..., description="The scope of the token")
auth_time: int = Field(..., description="The time when the user was authenticated")
exp: int = Field(..., description="The time when the token will expire")
iat: int = Field(..., description="The time when the token was issued")
jti: str = Field(..., description="A unique identifier for the token")
username: str = Field(..., description="The username of the user")
aud: str = Field(..., description="The audience of the token")


class Ingestion(BaseModel):
id: str = Field(..., description="ID of the STAC item")
status: Status = Field(..., description="Status of the ingestion")