Cloud Run job to pull metadata manifests from Synapse and update tables in the Google BigQuery dataset htan-dcc.combined_assays. This dataset contains clinical, biospecimen, and assay metadata tables combined across HTAN centers.
Scheduled to run daily at 0200 ET.
Requires access to deploy resources in the HTAN Google Cloud Project, htan-dcc. Please contact an owner of htan-dcc to request access (Owners in 2024: Clarisse Lau, Vesteinn Thorsson, William Longabaugh, ISB).
- Create a Synapse Auth Token secret in Secret Manager. The token requires download access to all individual HTAN-center Synapse projects. Currently uses the synapse-service-HTAN-lambda service account.
- Install Terraform >= 1.7.0
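To confirm the Terraform prerequisite before going further, a check along the following lines may help. This is a minimal sketch, not part of the repository: the helper name version_ge is hypothetical, and it assumes a sort that supports -V (GNU coreutils and recent macOS do).

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on `sort -V` (version sort), which is an assumption about the local sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="1.7.0"

if command -v terraform >/dev/null 2>&1; then
  # The first line of `terraform version` looks like "Terraform v1.7.5".
  installed="$(terraform version | head -n 1 | sed 's/^Terraform v//')"
  if version_ge "$installed" "$required"; then
    echo "Terraform $installed satisfies >= $required"
  else
    echo "Terraform $installed is older than $required" >&2
  fi
else
  echo "terraform not found on PATH" >&2
fi
```

Pre-release version strings (for example a release candidate) may not compare as expected with this approach.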
Before creating the job, build and push a Docker image to Google Artifact Registry (recommended):
cd src
docker build . -t us-docker.pkg.dev/<gc-project>/gcr.io/<image-name>
docker push us-docker.pkg.dev/<gc-project>/gcr.io/<image-name>
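The build and push commands above can be wrapped in a small script so the image path is composed in one place. This is a sketch under assumptions: GC_PROJECT and IMAGE_NAME, their placeholder defaults, and the function names are hypothetical, and the gcloud auth configure-docker step assumes Docker has not yet been set up to authenticate against us-docker.pkg.dev.

```shell
#!/usr/bin/env bash
# Hypothetical placeholder values; substitute your own project and image name.
GC_PROJECT="${GC_PROJECT:-my-gc-project}"
IMAGE_NAME="${IMAGE_NAME:-my-image}"

# Compose the image path in the same form as the commands above.
image_uri() {
  printf 'us-docker.pkg.dev/%s/gcr.io/%s' "$1" "$2"
}

# Build from src/ and push. Defined here but not called automatically.
build_and_push() {
  local uri
  uri="$(image_uri "$GC_PROJECT" "$IMAGE_NAME")"
  # Register gcloud as a Docker credential helper for this registry host.
  gcloud auth configure-docker us-docker.pkg.dev --quiet || return 1
  (cd src && docker build . -t "$uri") || return 1
  docker push "$uri"
}

echo "Image path: $(image_uri "$GC_PROJECT" "$IMAGE_NAME")"
```

After sourcing the script, run build_and_push from the repository root with GC_PROJECT and IMAGE_NAME set in the environment.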
Define variables in terraform.tfvars. Variable descriptions can be found in variables.tf.
terraform init
terraform plan
terraform apply
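One optional variation on the three commands above is to save the plan to a file and apply that exact plan, so that what gets applied is what was reviewed. The sketch below is an illustration rather than the repository's own workflow: the run and deploy helpers, the DRY_RUN variable, and the tfplan file name are hypothetical. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Print each command when DRY_RUN=1 (the default here); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Initialise, validate, write the plan to ./tfplan, then apply that saved plan.
deploy() {
  run terraform init || return 1
  run terraform validate || return 1
  run terraform plan -out=tfplan || return 1
  run terraform apply tfplan
}

deploy
```

Set DRY_RUN=0 to execute the commands for real. Applying a saved plan file does not prompt for confirmation, so review the plan output first.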