Feature/action transient secrets #1699

cohansen · 2025-06-24T23:50:10Z

Tickets addressed: NA
Review: By commit
Merge strategy: Merge (no squash)

Description

This PR adds transient secrets that can be supplied for an action run. The secrets are held in memory in the action-server until they are retrieved for the associated action_run.
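
As a rough illustration of the "held in memory until retrieved" behaviour described above, a minimal sketch might look like the following. The class name `TransientSecretStore` and its `add`/`take`/`has` methods are hypothetical and are not taken from this PR; the only assumption borrowed from the diff is that secrets are a `Record<string, string>` keyed by action run id.

```typescript
// Hypothetical sketch: names are illustrative, not the PR's actual API.
type Secrets = Record<string, string>;

class TransientSecretStore {
  private readonly secretsByRunId = new Map<string, Secrets>();

  // Hold a copy of the secrets until the associated action run asks for them.
  add(actionRunId: string, secrets: Secrets): void {
    this.secretsByRunId.set(actionRunId, { ...secrets });
  }

  // Hand the secrets over once, then forget them.
  take(actionRunId: string): Secrets | undefined {
    const secrets = this.secretsByRunId.get(actionRunId);
    this.secretsByRunId.delete(actionRunId);
    return secrets;
  }

  has(actionRunId: string): boolean {
    return this.secretsByRunId.has(actionRunId);
  }
}

const store = new TransientSecretStore();
store.add("run-1", { API_TOKEN: "example" });
const first = store.take("run-1");  // the secrets for run-1
const second = store.take("run-1"); // nothing left to retrieve
```

Removing the entry on retrieval keeps the secrets in memory no longer than needed; whether the real implementation deletes on read or on a timer is not stated in the description.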

Verification

Manually tested.

Documentation

NA

Future work

Refactor the delay / wait period for the secrets.

action-server/src/type/actionSecrets.ts

action-server/src/type/actionRunner.ts

action-server/src/type/actionRunner.ts

+  static async addActionSecret(actionRunId: string, actionSecrets: Record<string, string>): Promise<void> {
+    this.actionSecretsMap.set(actionRunId, actionSecrets);
+
+    logger.info(`Secret found for Action Run: ${actionRunId}, running action...`);


action-server/src/type/actionRunner.ts

dandelany

Spent some time looking at/tweaking this one today, some thoughts:

Major

A few things re: the setTimeouts on L32 and L47 of actionRunner.ts:

* These both use the same timeout value, but are very different scenarios. The first is "once I've received an action run, how long should I wait to receive the secrets it requires" - this should always be pretty short. The other is (I think) "once I've submitted an action to be started by the pool, how long should I wait for it to be **finished** running before I delete its secrets to be safe" - this could be much longer, since it includes the time that the action will be **queued waiting** for other action runs. 60 seconds is probably more than enough time for the first, but not enough for the latter (maybe more like 10-60 minutes? open to suggestion but it's more of an emergency backstop).

* The "timed out" message inside the first one (L32) will always be logged, even if the secrets **were** received. We should either cancel the timeout, or more simply, wrap an `if` around the innards to check if it still needs to be removed.

* The second timeout (L47) is not started until _after_ the `await`ed function returns, which is after the action run has already completed. I think it should go _before_, so it catches very-long-running actions as intended, or cases where `await actionRunFunc` throws. It should also have a cancel or `if` like the other one, so that it doesn't log on every run.

* We should also call `deleteActionSecret` immediately **after** the `await actionRunFunc` for the nominal case, so we don't wait around for the long timeout before deleting the secrets. If we got past that line, I don't think we need them anymore.
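
The points above about the second timeout could be sketched roughly as follows. This is not the code from actionRunner.ts: `secretsByRunId`, `runWithSecrets` and `BACKSTOP_TIMEOUT_MS` are hypothetical names, the 30-minute value is only an assumption inside the suggested 10-60 minute range, and using `finally` (so the secrets are also dropped when the action throws) is one possible reading of the suggestion rather than what was requested verbatim.

```typescript
// Hypothetical sketch; names and values are assumptions, not the PR's code.
type Secrets = Record<string, string>;

const secretsByRunId = new Map<string, Secrets>();
const BACKSTOP_TIMEOUT_MS = 30 * 60 * 1000; // assumed emergency backstop

// Returns true only if there was something left to delete.
function deleteActionSecret(actionRunId: string): boolean {
  return secretsByRunId.delete(actionRunId);
}

async function runWithSecrets(
  actionRunId: string,
  actionRunFunc: (id: string) => Promise<void>,
  backstopMs: number = BACKSTOP_TIMEOUT_MS,
): Promise<void> {
  // Start the backstop BEFORE awaiting, so it covers queued and
  // very-long-running actions.
  const backstop = setTimeout(() => {
    // Log only when the secrets were actually still present.
    if (deleteActionSecret(actionRunId)) {
      console.warn(`Secrets for action run ${actionRunId} removed by backstop timeout.`);
    }
  }, backstopMs);

  try {
    await actionRunFunc(actionRunId);
  } finally {
    // Past this point the secrets are no longer needed: cancel the timer
    // so it never logs, and delete the secrets immediately.
    clearTimeout(backstop);
    deleteActionSecret(actionRunId);
  }
}
```

Cancelling the timer in `finally` also avoids leaving a long-lived pending timer behind for every completed run.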

Minor

* I merged in changes from `develop` and fixed conflicts/migration numbers

* Fixed a bug in [50860ac](https://github.com/NASA-AMMOS/aerie/pull/1699/commits/50860acc409137cb802f3e1ccb1e847b8d12f884) (the admin secret code I added before last release) since it was accidentally mutating the secrets objects, breaking tests

* Pushed small changes in [46c4373](https://github.com/NASA-AMMOS/aerie/pull/1699/commits/46c43733034a44ba7ee1e4e62a918b4342870e90) to replace the secrets/runs objects with `Map`s. CodeQL was correct in this case - since they're unvalidated strings straight from the user, it's safer not to dereference objects directly with them since they could accidentally or maliciously access JS internals like `constructor` etc. and cause weird issues.
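
To illustrate the `Map` point, here is a small standalone sketch (variable names are made up for the example) of what happens when an unvalidated string such as `constructor` is used as a key on a plain object versus a `Map`:

```typescript
// A key that arrives as an unvalidated string, e.g. from a request body.
const userKey = "constructor";

// Plain object: the lookup falls through to Object.prototype.
const secretsObject: Record<string, unknown> = {};
const fromObject = secretsObject[userKey]; // the built-in Object function

// Map: only keys that were explicitly set are visible.
const secretsMap = new Map<string, unknown>();
const fromMap = secretsMap.get(userKey); // undefined

console.log(typeof fromObject); // "function"
console.log(fromMap);           // undefined
```

With the plain object, a lookup for a run id nobody registered can return an inherited value, which is the kind of "weird issue" the comment refers to.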

cohansen · 2025-08-18T14:56:57Z

Just addressed all of your feedback!

sonarqubecloud · 2025-08-18T14:58:17Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

action-server/src/type/actionRunner.ts

+    if (actionRunFunc) {
+      setTimeout(() => {
+        if (this.actionSecretsMap.get(actionRunId) !== null) {
+          logger.info(`Secret for Action Run: ${actionRunId} timed out waiting for the associated action run.`);


action-server/src/type/actionRunner.ts

+        }
+      }, this.WAIT_FOR_ACTION_RUN_TIMEOUT);
+
+      await actionRunFunc(actionRunId);
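
One detail worth checking in a guard like the one in the snippet above: `Map.prototype.get` returns `undefined`, not `null`, for a missing key, so a `!== null` comparison stays true even after the entry has been deleted. A small standalone sketch (the map here is a local stand-in, not the class field from the PR):

```typescript
// Local stand-in for a secrets map keyed by action run id.
const actionSecretsMap = new Map<string, Record<string, string>>();
actionSecretsMap.set("run-1", { TOKEN: "x" });
actionSecretsMap.delete("run-1");

// undefined !== null is true, so this does not detect the deletion.
const looseCheck = actionSecretsMap.get("run-1") !== null;

// has() reports presence directly.
const presenceCheck = actionSecretsMap.has("run-1");

console.log(looseCheck);    // true
console.log(presenceCheck); // false
```

Using `has(actionRunId)` (or comparing against `undefined`) would make the timeout body run only when the secrets are still present.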


cohansen requested a review from dandelany June 24, 2025 23:50

cohansen self-assigned this Jun 24, 2025

cohansen requested a review from a team as a code owner June 24, 2025 23:50

cohansen requested a review from Mythicaeda June 24, 2025 23:50

cohansen force-pushed the feature/action-transient-secrets branch from 7f56c42 to 23a4a2f Compare June 24, 2025 23:59

cohansen had a problem deploying to e2e-test June 24, 2025 23:59 — with GitHub Actions Failure

github-advanced-security bot found potential problems Jun 25, 2025

cohansen force-pushed the feature/action-transient-secrets branch from 1c8dc3f to 9883ed4 Compare July 15, 2025 16:15

cohansen had a problem deploying to e2e-test July 15, 2025 16:15 — with GitHub Actions Failure

github-advanced-security bot found potential problems Jul 15, 2025

cohansen had a problem deploying to e2e-test July 15, 2025 16:37 — with GitHub Actions Failure

cohansen temporarily deployed to e2e-test July 15, 2025 17:29 — with GitHub Actions Inactive

cohansen had a problem deploying to e2e-test July 15, 2025 17:29 — with GitHub Actions Failure

cohansen temporarily deployed to e2e-test July 15, 2025 17:29 — with GitHub Actions Inactive

cohansen temporarily deployed to e2e-test July 16, 2025 16:14 — with GitHub Actions Inactive

cohansen added 7 commits July 23, 2025 07:25

Added transient secrets to the actions server

9fe7f64

Refactored actions with secrets to be executed via a callback

f05d3ad

Renamed secrets to has_secrets

5001d2f

Fixed action-server tests and added a secret test

4747499

Moved the secrets db migration to 24

719e0c9

Added table context to drop triggers

9ef9d49

Updated to use the newest version of aerie actions

d21f4d7

Moved transient secrets to migration 25

8a44d23

cohansen temporarily deployed to e2e-test July 23, 2025 17:56 — with GitHub Actions Inactive

merge develop into feature/feature/action-transient-secrets and resol…

9b35f90

…ve conflicts

dandelany had a problem deploying to e2e-test August 15, 2025 20:45 — with GitHub Actions Failure

update migration numbers for action secrets migration (27)

4bc7992

dandelany had a problem deploying to e2e-test August 15, 2025 20:55 — with GitHub Actions Failure

dandelany temporarily deployed to e2e-test August 15, 2025 20:55 — with GitHub Actions Inactive

dandelany had a problem deploying to e2e-test August 15, 2025 21:10 — with GitHub Actions Failure

avoid overwriting value of secrets variable in action-server codeRunner

50860ac

dandelany temporarily deployed to e2e-test August 15, 2025 21:55 — with GitHub Actions Inactive

remove action run queue/secrets objects with Map for better key safety

46c4373

dandelany temporarily deployed to e2e-test August 16, 2025 00:55 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Aug 16, 2025

dandelany requested changes Aug 16, 2025

More actionRunner improvements

cfcff04

cohansen temporarily deployed to e2e-test August 18, 2025 14:56 — with GitHub Actions Inactive

github-advanced-security bot found potential problems Aug 18, 2025
