Timeseries Explainability - Env vars are disabled. #4872

Alex-Wenner-FHR · 2024-09-16T17:43:14Z

Describe the bug
My processing job for TSX fails immediately with the following error message:
Failure reason ClientError: ValidationException: Environment variable is not allowed for the provided image 205585389593.dkr.ecr.us-east-1.amazonaws.com/sagemaker-clarify-processing:1.0 status code: 400, request id: b111b9e1-3dbc-4c08-8bc1-db8d15041c6c

To reproduce
Pass the following env parameter to the processor and run the timeseries explainability processing job:

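from sagemaker import Session
from sagemaker.clarify import SageMakerClarifyProcessor

env_vars = {"model": "modelV1"}

# <role>, <instance_type> and <prefix> are placeholders for your own values.
processor = SageMakerClarifyProcessor(
    role=<role>,
    sagemaker_session=Session(),
    instance_count=1,
    instance_type=<instance_type>,
    job_name_prefix=<prefix>,
    env=env_vars,
)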

Expected behavior
I would expect the environment variables to be passed through to the processing job.

System information
A description of your system. Please provide:

SageMaker Python SDK version: 2.232.0 (latest)
Framework name (eg. PyTorch) or algorithm (eg. KMeans): N/A
Framework version: N/A
Python version: N/A
CPU or GPU: N/A
Custom Docker image (Y/N): N

Additional Context:
I am trying to subscribe to processing job completion events, and I see that the Environment key is passed through in the event. I would like to act on the successful completion of a timeseries explainability job, so I was attempting to send env variables through as additional context for my consumer Lambda.
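
For illustration, a minimal sketch of the kind of consumer described above. The function name `handler` is hypothetical, and the detail field names (`ProcessingJobStatus`, `ProcessingJobName`, `ProcessingJobArn`) are assumed to mirror the DescribeProcessingJob response, so check them against a real event before relying on them. The `Environment` key is the one mentioned above.

```python
# Hypothetical consumer Lambda for EventBridge events emitted by SageMaker.
# Field names are assumptions based on the "SageMaker Processing Job State
# Change" event shape; verify them against an actual event.

def handler(event, context=None):
    """Return a small summary for completed processing jobs, else None."""
    if event.get("source") != "aws.sagemaker":
        return None
    if event.get("detail-type") != "SageMaker Processing Job State Change":
        return None

    detail = event.get("detail", {})
    if detail.get("ProcessingJobStatus") != "Completed":
        return None

    return {
        "job_name": detail.get("ProcessingJobName"),
        "job_arn": detail.get("ProcessingJobArn"),
        # Falls back to an empty dict when the job has no env variables.
        "environment": detail.get("Environment", {}),
    }
```

Events from other sources, and jobs that are still running or have failed, are ignored by returning None.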


cylaceste · 2024-09-17T22:16:41Z

I feel like env variables should generally not be exposed once they are passed into a job. They're really just for changing runtime behaviour, and sometimes they contain secrets, so it's really not great to emit them elsewhere.

You have a lot of options for what you're trying to do: tags are probably easiest, or a processing output if you run into limitations. Or maybe simplify it and call the lambda as the last step of the job, or add a lambda step.

Alex-Wenner-FHR · 2024-09-18T13:38:51Z

Hi! I am currently using tags as a workaround, but every other SageMaker job supports environment variables. Tags are fine, but they only support string data, not objects, and the way I am using them right now doesn't seem like the right way to leverage tags. I do not have control over the processing output since it is a SageMaker Clarify processor. The most control I have is essentially env vars or tags.
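
To make the string-only limitation concrete, here is a rough sketch of one way to squeeze an object into a single tag: serialize it to JSON, then base64-encode it so the value stays within the characters tags accept. The helper names `encode_context_tag` and `decode_context_tag` are hypothetical, and the 256-character value limit and the allowed character set are assumptions taken from the general AWS tagging rules, so treat this as a workaround sketch rather than a recommended pattern.

```python
import base64
import json

# Assumption: AWS tag values are limited to 256 characters.
MAX_TAG_VALUE_LEN = 256


def encode_context_tag(key, context):
    """Pack a JSON-serializable object into one {"Key", "Value"} tag dict."""
    raw = json.dumps(context, separators=(",", ":"), sort_keys=True)
    # URL-safe base64 only emits letters, digits, "-", "_" and "=".
    value = base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")
    if len(value) > MAX_TAG_VALUE_LEN:
        raise ValueError(f"encoded context is {len(value)} chars, limit is {MAX_TAG_VALUE_LEN}")
    return {"Key": key, "Value": value}


def decode_context_tag(tags, key):
    """Return the decoded object stored under `key`, or None if absent."""
    for tag in tags:
        if tag["Key"] == key:
            return json.loads(base64.urlsafe_b64decode(tag["Value"]))
    return None
```

The resulting dict could be passed in the `tags` list of the processor and decoded again in the consumer, at the cost of the tag no longer being human-readable in the console.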

Alex-Wenner-FHR added the bug label Sep 16, 2024
