cohansen
Contributor

  • Tickets addressed: NA
  • Review: By commit
  • Merge strategy: Merge (no squash)

Description

This PR adds transient secrets that can be added during an action run. Those secrets are stored in memory in the action-server until they are retrieved for the associated action_run.
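As a rough illustration of the behavior described above, here is a minimal sketch of an in-memory, read-once secret store. The names `TransientSecretStore`, `put`, `take`, and `has` are hypothetical and are not taken from this PR; the sketch only assumes that secrets are keyed by action run ID and dropped as soon as they are retrieved.

```typescript
// Hypothetical sketch: none of these names come from the PR itself.
type Secrets = Record<string, string>;

class TransientSecretStore {
  // A Map keyed by action run ID; nothing is ever written to disk or a database.
  private readonly secretsByRunId = new Map<string, Secrets>();

  /** Hold secrets in memory for one action run. A shallow copy is stored so
   *  later mutations of the caller's object do not affect the stored secrets. */
  put(actionRunId: string, secrets: Secrets): void {
    this.secretsByRunId.set(actionRunId, { ...secrets });
  }

  /** Return the secrets for a run and forget them immediately (read-once). */
  take(actionRunId: string): Secrets | undefined {
    const secrets = this.secretsByRunId.get(actionRunId);
    this.secretsByRunId.delete(actionRunId);
    return secrets;
  }

  /** True while secrets for the run are still waiting to be retrieved. */
  has(actionRunId: string): boolean {
    return this.secretsByRunId.has(actionRunId);
  }
}

const store = new TransientSecretStore();
store.put("42", { API_TOKEN: "s3cr3t" });
const first = store.take("42");  // the stored secrets
const second = store.take("42"); // undefined: already retrieved once
```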

Verification

Manually tested.

Documentation

NA

Future work

Refactor the delay / wait period for the secrets.

@cohansen cohansen requested a review from dandelany June 24, 2025 23:50
@cohansen cohansen self-assigned this Jun 24, 2025
@cohansen cohansen requested a review from a team as a code owner June 24, 2025 23:50
@cohansen cohansen requested a review from Mythicaeda June 24, 2025 23:50
@cohansen cohansen force-pushed the feature/action-transient-secrets branch from 7f56c42 to 23a4a2f Compare June 24, 2025 23:59
static async addActionSecret(actionRunId: string, actionSecrets: Record<string, string>): Promise<void> {
  this.actionSecretsMap.set(actionRunId, actionSecrets);

  logger.info(`Secret found for Action Run: ${actionRunId}, running action...`);

Check warning: Code scanning / CodeQL
Log injection (Medium): Log entry depends on a user-provided value.
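One common way to address this kind of warning is to strip line breaks and other control characters from the value before it is interpolated into the log message. The sketch below shows that idea only; `sanitizeForLog` and its `maxLength` parameter are hypothetical, and this is not necessarily how the PR resolved the finding.

```typescript
// Hypothetical helper, not part of the PR.
function sanitizeForLog(value: string, maxLength = 64): string {
  // Replace ASCII control characters (including \r and \n) and the Unicode
  // line/paragraph separators, so one value cannot forge extra log lines.
  const cleaned = value.replace(/[\u0000-\u001f\u007f\u2028\u2029]/g, "?");
  // Truncate very long values so a single ID cannot flood the log.
  return cleaned.length > maxLength ? `${cleaned.slice(0, maxLength)}...` : cleaned;
}

const suspiciousId = "12\nINFO forged entry";
const logLine = `Secret found for Action Run: ${sanitizeForLog(suspiciousId)}, running action...`;
```

If action run IDs are known to be numeric, rejecting anything that is not a plain integer before logging would be a stricter alternative.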
Collaborator

@dandelany dandelany left a comment

Spent some time looking at/tweaking this one today, some thoughts:

Major

A few things re: the setTimeouts on L32 and L47 of actionRunner.ts:

  • These both use the same timeout value, but are very different scenarios. The first is "once I've received an action run, how long should I wait to receive the secrets it requires" - this should always be pretty short. The other is (I think) "once I've submitted an action to be started by the pool, how long should I wait for it to be finished running before I delete its secrets to be safe" - this could be much longer, since it includes the time that the action will be queued waiting for other action runs. 60 seconds is probably more than enough time for the first, but not enough for the latter (maybe more like 10-60 minutes? open to suggestion but it's more of an emergency backstop).
  • The "timed out" message inside the first one (L32) will always be logged, even if the secrets were received. We should either cancel the timeout, or more simply, wrap an if around the innards to check if it still needs to be removed.
  • The second timeout (L47) is not started until after the awaited function returns, which is after the action run has already completed. I think it should go before, so it catches very-long-running actions as intended, or cases where await actionRunFunc throws. It should also have a cancel or if like the other one, so that it doesn't log on every run.
  • We should also call deleteActionSecret immediately after the await actionRunFunc for the nominal case, so we don't wait around for the long timeout before deleting the secrets. If we got past that line, I don't think we need them anymore.
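The points above could be sketched roughly as follows. This is not the code in actionRunner.ts: `runWithSecretCleanup`, the `actionSecrets` map, the `log` parameter, and both timeout values are assumptions made for illustration. The backstop timer starts before the `await`, only logs when it actually removed something, and is cancelled once the run settles, at which point the secrets are deleted immediately.

```typescript
type Secrets = Record<string, string>;

// Assumed values for illustration; the review only suggests "short" vs "10-60 minutes".
const WAIT_FOR_SECRETS_TIMEOUT_MS = 60_000;    // how long a run waits for its secrets
const SECRET_EXPIRY_BACKSTOP_MS = 30 * 60_000; // emergency backstop: queueing plus execution

// Hypothetical stand-in for the action server's secrets map.
const actionSecrets = new Map<string, Secrets>();

function deleteActionSecret(actionRunId: string): boolean {
  // Map.delete returns true only if an entry was actually removed.
  return actionSecrets.delete(actionRunId);
}

async function runWithSecretCleanup(
  actionRunId: string,
  actionRunFunc: (id: string) => Promise<void>,
  backstopMs: number = SECRET_EXPIRY_BACKSTOP_MS,
  log: (message: string) => void = console.warn,
): Promise<void> {
  // Start the backstop BEFORE awaiting, so it covers long-running or stuck actions.
  const backstop = setTimeout(() => {
    // Guard: log only if the secrets were still present when the timer fired.
    if (deleteActionSecret(actionRunId)) {
      log("Secrets expired before the action run finished and were removed.");
    }
  }, backstopMs);

  try {
    await actionRunFunc(actionRunId);
  } finally {
    // Nominal and error cases alike: cancel the timer and drop the secrets now.
    clearTimeout(backstop);
    deleteActionSecret(actionRunId);
  }
}
```

The shorter wait-for-secrets timer (`WAIT_FOR_SECRETS_TIMEOUT_MS` here) could use the same guard so that its "timed out" message is logged only when secrets never arrived. Using `finally` means a throwing action also releases its secrets right away, which leaves the backstop to cover only runs that never settle.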

Minor

  • I merged in changes from develop and fixed conflicts/migration numbers
  • Fixed a bug in 50860ac (the admin secret code I added before last release) since it was accidentally mutating the secrets objects, breaking tests
  • Pushed small changes in 46c4373 to replace the secrets/runs objects with Maps. CodeQL was correct in this case - since they're unvalidated strings straight from the user, it's safer not to dereference objects directly with them since they could accidentally or maliciously access JS internals like constructor etc. and cause weird issues.
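To illustrate the last point, here is a small sketch (variable names are hypothetical) of how a plain object and a `Map` differ when the key is an arbitrary user-supplied string such as `constructor`:

```typescript
// Both containers are empty: nothing has been stored under any key.
const secretsObject: Record<string, unknown> = {};
const secretsMap = new Map<string, unknown>();

// Pretend this string arrived unvalidated from a request.
const userSuppliedKey: string = "constructor";

// Plain object lookup walks the prototype chain and finds Object.prototype.constructor.
const fromObject = secretsObject[userSuppliedKey];

// Map lookup only sees keys that were explicitly set, so this is undefined.
const fromMap = secretsMap.get(userSuppliedKey);

console.log(typeof fromObject); // "function"
console.log(fromMap);           // undefined
```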

@cohansen
Contributor Author

Just addressed all of your feedback!

if (actionRunFunc) {
  setTimeout(() => {
    // Map.get returns undefined (never null) for a missing key, so test membership with has().
    if (this.actionSecretsMap.has(actionRunId)) {
      logger.info(`Secret for Action Run: ${actionRunId} timed out waiting for the associated action run.`);

Check warning: Code scanning / CodeQL
Log injection (Medium): Log entry depends on a user-provided value.

    }
  }, this.WAIT_FOR_ACTION_RUN_TIMEOUT);

  await actionRunFunc(actionRunId);

Check failure: Code scanning / CodeQL
Unvalidated dynamic method call (High): Invocation of method with user-controlled name may dispatch to unexpected target and cause an exception.
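A typical mitigation for this class of finding is to look the callable up in a `Map` and confirm it really is a function before invoking it. The sketch below assumes pending action runs are registered by ID; `pendingActionRuns` and `startActionRun` are hypothetical names and may not match how the PR stores or dispatches runs.

```typescript
type ActionRunFunc = (actionRunId: string) => Promise<void>;

// Hypothetical registry of runs waiting for their secrets, keyed by action run ID.
const pendingActionRuns = new Map<string, ActionRunFunc>();

/** Starts the pending run for this ID, if one was registered. Returns false otherwise. */
async function startActionRun(actionRunId: string): Promise<boolean> {
  // Map.get never resolves inherited names such as "constructor" or "toString".
  const actionRunFunc = pendingActionRuns.get(actionRunId);

  // Refuse to call anything that is not a function we registered ourselves.
  if (typeof actionRunFunc !== "function") {
    return false;
  }

  // Remove the entry first so the same run cannot be triggered twice.
  pendingActionRuns.delete(actionRunId);
  await actionRunFunc(actionRunId);
  return true;
}
```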