
Wrong notebook being run in pipeline when creating NotebookJobStep with shared environment_variables dict #4856

Open
fakio opened this issue Aug 29, 2024 · 1 comment
Labels
bug component: pipelines Relates to the SageMaker Pipeline Platform

Comments


fakio commented Aug 29, 2024

Describe the bug
When I create a Pipeline with two NotebookJobStep steps, and both steps were created with the same dict passed as the environment_variables parameter, the first step runs with the second step's input notebook instead of its own.

To reproduce

import json

from sagemaker.workflow.notebook_job_step import NotebookJobStep
from sagemaker.workflow.pipeline import Pipeline

env_vars = {
    'test': 'test',
}
steps = [
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job1.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
    NotebookJobStep(
        image_uri="885854791233.dkr.ecr.us-east-1.amazonaws.com/sagemaker-distribution-prod:1-cpu",
        kernel_name="python3",
        input_notebook="job2.ipynb",
        initialization_script="setup.sh",
        environment_variables=env_vars,
    ),
]
pipeline = Pipeline(
    name="pipeline",
    steps=steps,
)
pipeline.upsert(role_arn=role)
execution = pipeline.start()

The problem can be seen in the environment variables in the pipeline definition for each step:

print(json.loads(pipeline.definition())["Steps"][0]["Arguments"]["Environment"])
{
'test': 'test',
'AWS_DEFAULT_REGION': 'us-east-1',
'SM_JOB_DEF_VERSION': '1.0',
'SM_ENV_NAME': 'sagemaker-default-env',
'SM_SKIP_EFS_SIMULATION': 'true',
'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
'SM_KERNEL_NAME': 'python3',
'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb', <<==== wrong input
'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
'SM_INIT_SCRIPT': 'setup.sh'
}

print(json.loads(pipeline.definition())["Steps"][1]["Arguments"]["Environment"])
{
'test': 'test',
'AWS_DEFAULT_REGION': 'us-east-1',
'SM_JOB_DEF_VERSION': '1.0',
'SM_ENV_NAME': 'sagemaker-default-env',
'SM_SKIP_EFS_SIMULATION': 'true',
'SM_EXECUTION_INPUT_PATH': '/opt/ml/input/data/sagemaker_headless_execution_pipelinestep',
'SM_KERNEL_NAME': 'python3',
'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb',
'SM_OUTPUT_NOTEBOOK_NAME': 'job2-ipynb-2024-08-29-15-04-49-575.ipynb',
'SM_INIT_SCRIPT': 'setup.sh'
}

Expected behavior
Run job1.ipynb and job2.ipynb in each step.

Screenshots or logs

Screenshot of notebook jobs in Studio UI:


System information
A description of your system. Please provide:

  • SageMaker Python SDK version: 2.226.1
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans):
  • Framework version:
  • Python version: 3.8.18
  • CPU or GPU: CPU
  • Custom Docker image (Y/N): N
@fakio fakio added the bug label Aug 29, 2024
@svia3 svia3 added the component: pipelines Relates to the SageMaker Pipeline Platform label Sep 3, 2024
@qidewenwhen (Member)

Hi @fakio, thanks for the great find and for the investigation!

This is a bug for sure.

The issue is here:

job_envs = self.environment_variables if self.environment_variables else {}
system_envs = {
    "AWS_DEFAULT_REGION": self._region_from_session,
    "SM_JOB_DEF_VERSION": "1.0",
    "SM_ENV_NAME": "sagemaker-default-env",
    "SM_SKIP_EFS_SIMULATION": "true",
    "SM_EXECUTION_INPUT_PATH": "/opt/ml/input/data/"
    "sagemaker_headless_execution_pipelinestep",
    "SM_KERNEL_NAME": self.kernel_name,
    "SM_INPUT_NOTEBOOK_NAME": os.path.basename(self.input_notebook),
    "SM_OUTPUT_NOTEBOOK_NAME": f"{self._underlying_job_prefix}.ipynb",
}
if self.initialization_script:
    system_envs["SM_INIT_SCRIPT"] = os.path.basename(self.initialization_script)
job_envs.update(system_envs)

The environment_variables dict supplied by the user is updated in place with the system_envs. Thus, if the same environment_variables dict is shared across multiple notebook job steps, the steps override each other's values.
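The aliasing can be reproduced in plain Python. This is a minimal sketch of the pattern (the helper name is hypothetical, not the SDK's code): each "step" calls .update() on the very same dict object, so the last step's values win everywhere.

```python
# Hypothetical helper mimicking the buggy pattern: it mutates the
# caller's dict in place instead of copying it first.
def build_step_env(shared_env, notebook):
    system_envs = {"SM_INPUT_NOTEBOOK_NAME": notebook}
    shared_env.update(system_envs)  # in-place mutation of the shared dict
    return shared_env

env_vars = {"test": "test"}
env1 = build_step_env(env_vars, "job1.ipynb")
env2 = build_step_env(env_vars, "job2.ipynb")

# env1, env2, and env_vars are all the SAME object, so the second
# call overwrote the first step's notebook name:
assert env1 is env2 is env_vars
print(env1["SM_INPUT_NOTEBOOK_NAME"])  # job2.ipynb
```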

The fix is simply to copy the user's environment_variables and add the system_envs to the copy.

However, as a workaround until we release a fix, you can pass a separate environment_variables dict to each step.
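A minimal sketch of that workaround: give each step its own shallow copy of the shared dict (e.g. `dict(env_vars)`), so an in-place .update() on one step's dict cannot leak into another's.

```python
env_vars = {'test': 'test'}

# Independent top-level copies, one per step:
env_for_job1 = dict(env_vars)
env_for_job2 = dict(env_vars)

# Simulate what the SDK does internally to each step's dict:
env_for_job1.update({'SM_INPUT_NOTEBOOK_NAME': 'job1.ipynb'})
env_for_job2.update({'SM_INPUT_NOTEBOOK_NAME': 'job2.ipynb'})

# Each step now keeps its own notebook name, and the original
# env_vars dict is untouched.
print(env_for_job1['SM_INPUT_NOTEBOOK_NAME'])  # job1.ipynb
print(env_for_job2['SM_INPUT_NOTEBOOK_NAME'])  # job2.ipynb
```

In the reproduction above, that means passing `environment_variables=dict(env_vars)` to each NotebookJobStep instead of the shared `env_vars` object.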

I'll create a backlog item for this bug fix in our service queue, and will update this issue once the fix PR is published.
