
(3.10.0) Build image fails in China regions

Giacomo Marciani edited this page Jul 4, 2024 · 1 revision

The issue

ParallelCluster 3.10.0 fails to build images in all China regions.

The error surfaced by Image Builder is:

Image ARN: arn:aws-cn:imagebuilder:REGION:ACCOUNT:image/IMAGE_NAME/3.10.0/1 failed with error: Workflow Execution ID: 'wf-xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxxxx' failed with reason: Document arn:aws-cn:imagebuilder:REGION:ACCOUNT:component/parallelclusterimage-xxxxxx-xxxxxx-xxxxxx-xxxxxx-xxxxxx/3.10.0/1 failed!.

This failure is caused internally by an incorrect URL used to download the gems.tgz dependency from the regional bucket.

Affected versions

ParallelCluster version 3.10.0 is affected.

Mitigation

Choose one of the following mitigations, according to your needs:

  1. Create your AMI by following the approach described in "Modifying an AWS ParallelCluster AMI" in the AWS ParallelCluster documentation.

  2. If you want to use 3.10.0 and can build in another region, build the image in a Classic region (e.g., us-east-1) and copy the resulting image to your China region by following https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ami-store-restore.html

  3. If you want to use 3.10.0 and must build in China regions, patch the CLI locally in a virtual environment as follows:

    1. Set up the Python virtual environment by following https://docs.aws.amazon.com/parallelcluster/latest/ug/install-v3-virtual-environment.html
    2. Download the ParallelCluster CLI code from GitHub: https://github.com/aws/aws-parallelcluster/tree/v3.10.0
    3. Modify the gems URL to point to the bucket in us-east-1, by replacing ${AWS::Region} with us-east-1.
    4. Install the modified version from local code within the virtual environment using pip install cli/
  4. If you want to build in China regions, but you're not forced to use 3.10.0, then downgrade to ParallelCluster 3.9.3.
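
For mitigation 3, the URL replacement in step 3 can be scripted. The following is a minimal sketch, not part of ParallelCluster: this page does not name the file that contains the gems URL, so the hypothetical helper patch_gems_url scans a local checkout and assumes that the URL sits on a single line containing both gems.tgz and ${AWS::Region}. Review the result (for example with git diff) before installing.

```python
from pathlib import Path

PLACEHOLDER = "${AWS::Region}"
MARKER = "gems.tgz"  # assumption: the gems URL line mentions this file name


def patch_gems_url(root, region="us-east-1"):
    """Replace ${AWS::Region} with `region` on lines that mention gems.tgz.

    Hypothetical helper: walks `root`, rewrites matching text files in
    place, and returns the list of paths that were modified.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and anything inside the Git metadata folder.
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_bytes().decode("utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        lines = text.splitlines(keepends=True)
        # Only touch lines that reference gems.tgz; every other use of
        # the region placeholder is left as it is.
        patched = [
            line.replace(PLACEHOLDER, region) if MARKER in line else line
            for line in lines
        ]
        if patched != lines:
            path.write_bytes("".join(patched).encode("utf-8"))
            changed.append(path)
    return changed
```

Calling patch_gems_url("cli") from the root of the downloaded v3.10.0 code returns the files it changed; if the list is empty, the single-line assumption does not hold and the URL has to be edited by hand. Restricting the replacement to lines that mention gems.tgz avoids rewriting other regional references that are unrelated to this issue.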
