Expensive Cat pictures with Stable Diffusion on AWS SageMaker in CDK

Pieterjan Criel @pjcr
5 min read · Jan 2, 2023


Stable Diffusion cat pictures

This article will demonstrate how to use AWS Cloud Development Kit (CDK) to set up an endpoint for a Stable Diffusion model on SageMaker. The repository can be found here. (Please note that the resources and services used in this repository may not be covered under the AWS Free Tier. Be sure to carefully review the pricing details for each resource and service before deploying this solution, and monitor your usage to avoid unexpected charges.)

On November 10, 2022, AWS announced that Amazon SageMaker JumpStart provides two additional state-of-the-art foundation models: Bloom for text generation and Stable Diffusion for image generation. These models are hosted by AWS and can be interacted with through the AWS SDK.

Jupyter Notebooks are often used to demonstrate how to deploy and use models on AWS SageMaker. While notebooks can be useful for experimentation, they can be inconvenient for deploying solutions across multiple AWS accounts (such as for staging and production environments). Additionally, notebooks can be problematic for version control, as even minor changes can cause conflicts when merging.

While you can use the SageMaker GUI to deploy models, doing so manually for multiple AWS accounts can be time-consuming. To avoid the need to repeat these steps in the GUI multiple times, you may want to consider an alternative method.

CDK allows you to define your infrastructure as code in a supported programming language, such as TypeScript. Instead of relying on Jupyter Notebooks or the AWS GUI, we will use CDK to manage our deployment.

Currently, there are no official level 2 (L2) constructs for SageMaker in AWS CDK. While the experimental construct library is available in preview as a separate package, we will not use it in this article. Instead, we will use custom resources and a Lambda function to create the necessary AWS resources.

The Stable Diffusion Construct

The StableDiffusionInferenceConstruct extends the CDK Construct and is initialised with StableDiffusionInferenceConstructProps. As construct properties we'll use modelId, modelVersion (if the version is set to *, the latest version is used) and inferenceInstancetype.

export interface StableDiffusionInferenceConstructProps {
  modelId: string;
  modelVersion?: string;
  inferenceInstancetype?: string;
}
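
The construct class itself (sketched below; the exact shape may differ slightly from the repository) exposes the endpoint name and an IAM policy statement that consumers can use to invoke the endpoint:

export class StableDiffusionInferenceConstruct extends Construct {
  // Outputs consumed by the surrounding stack
  public readonly endpointName: string;
  public readonly invokeEndPointPolicyStatement: PolicyStatement;

  constructor(scope: Construct, id: string, props: StableDiffusionInferenceConstructProps) {
    super(scope, id);
    // The role, Lambda function and custom resource are created here (see below).
  }
}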

This construct will create three objects. A Role with SageMaker permissions:

const sageMakerRole = new Role(this, 'Role', {
  assumedBy: new ServicePrincipal('sagemaker.amazonaws.com'),
});

sageMakerRole.addManagedPolicy(
  ManagedPolicy.fromAwsManagedPolicyName('AmazonSageMakerFullAccess')
);

a Lambda Function that creates SageMaker resources to deploy a Stable Diffusion endpoint:

const customResourceFunction = new PythonFunction(this, 'crFunction', {
  index: 'index.py',
  handler: 'handler',
  runtime: Runtime.PYTHON_3_9,
  memorySize: 10240,
  ephemeralStorageSize: Size.gibibytes(10),
  description: 'StableDiffusionModelCustomResource',
  entry: path.join(
    __dirname, '..', '..', 'lambda', 'StableDiffusionModelCustomResource'
  ),
  environment: {
    ModelId: props.modelId,
    ModelVersion: props.modelVersion || '*', // * will fetch the latest model
    inferenceInstancetype: props.inferenceInstancetype || 'ml.p3.2xlarge',
    sageMakerRoleArn: sageMakerRole.roleArn,
  },
  timeout: Duration.minutes(15),
});

and a CDK CustomResource that uses that Lambda function as its serviceToken:

// Allow the Lambda's execution role to pass the SageMaker role to SageMaker
// and to create the SageMaker resources on our behalf.
sageMakerRole.grantPassRole(customResourceFunction.grantPrincipal);
customResourceFunction.role?.addManagedPolicy(
  ManagedPolicy.fromAwsManagedPolicyName('AmazonSageMakerFullAccess')
);

const sagemakerCustomResource = new CustomResource(this, 'sdmodel', {
  serviceToken: customResourceFunction.functionArn,
  removalPolicy: RemovalPolicy.DESTROY,
});

this.endpointName = sagemakerCustomResource.getAttString('endpoint_name');

this.invokeEndPointPolicyStatement = new PolicyStatement({
  actions: ['sagemaker:InvokeEndpoint'],
  resources: [
    `arn:aws:sagemaker:${Stack.of(this).region}:${Stack.of(this).account}:endpoint/${this.endpointName}`,
  ],
});

The Lambda function to create the SageMaker resources

Our Lambda function includes three methods for creating, deleting, and updating custom resources. The crhelper package is used to handle CloudFormation events and invoke the appropriate method.

import logging
import os

import sagemaker
from crhelper import CfnResource
from sagemaker import image_uris, model_uris, script_uris
from sagemaker.model import Model
from sagemaker.predictor import Predictor

logger = logging.getLogger(__name__)
helper = CfnResource(json_logging=False, log_level='DEBUG',
                     boto_level='CRITICAL', sleep_on_delete=120, ssl_verify=None)
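
The module-level SageMaker session, the execution role (read from the sageMakerRoleArn environment variable set by the construct) and the Lambda entry point that hands the CloudFormation event to crhelper look roughly like this (a sketch; the repository may organise this differently):

sess = sagemaker.Session()
role = os.environ['sageMakerRoleArn']  # set by the CDK construct


def handler(event, context):
    # crhelper dispatches the CloudFormation event to the registered
    # @helper.create / @helper.update / @helper.delete functions.
    helper(event, context)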

The create method first retrieves the deploy image URI, the deploy source URI and the model URI (all maintained and hosted by AWS). A model is then created and immediately deployed as a SageMaker inference endpoint.


@helper.create
def create(event, context):
    logger.info("Create")

    model_id = os.environ['ModelId']
    model_version = os.environ['ModelVersion']
    endpoint_name = "stablediffusion"

    inference_instance_type = os.environ['inferenceInstancetype']
    logger.info(f"Using a {inference_instance_type} type for inference")

    deploy_image_uri = image_uris.retrieve(
        region=None,
        framework=None,
        image_scope="inference",
        model_id=model_id,
        model_version=model_version,
        instance_type=inference_instance_type,
    )

    deploy_source_uri = script_uris.retrieve(
        model_id=model_id, model_version=model_version, script_scope="inference"
    )

    model_uri = model_uris.retrieve(
        model_id=model_id, model_version=model_version, model_scope="inference"
    )

    env = {
        "MMS_MAX_RESPONSE_SIZE": "20000000",
    }

    model = Model(
        image_uri=deploy_image_uri,
        source_dir=deploy_source_uri,
        model_data=model_uri,
        entry_point="inference.py",
        role=role,
        predictor_cls=Predictor,
        name=endpoint_name,
        env=env,
    )

    logger.info("Model deploy start")
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type=inference_instance_type,
        predictor_cls=Predictor,
        endpoint_name=endpoint_name,
    )
    logger.info("Model deploy end")

    print(predictor.endpoint_name)
    helper.Data['endpoint_name'] = predictor.endpoint_name
    return predictor.endpoint_name

The delete and update methods are quite straightforward. The delete method deletes the endpoint and the model; the update method first deletes the existing endpoint and model and then deploys new ones.

@helper.delete
def delete(event, context):
    logger.info("Delete")
    physical_id = event["PhysicalResourceId"]
    predictor = Predictor(
        endpoint_name=physical_id, sagemaker_session=sess)
    predictor.delete_model()
    predictor.delete_endpoint()


@helper.update
def update(event, context):
    logger.info("Update")
    delete(event, context)
    return create(event, context)

The Stable Diffusion Stack

Using the reusable construct we created, we can now create a stack with a Stable Diffusion endpoint, a Lambda function to interact with the model, and an S3 bucket to store the generated images. To do so, we simply instantiate a StableDiffusionInferenceConstruct like the following:

const sdConstruct = new StableDiffusionInferenceConstruct(
  this,
  'StableDiffusionInferenceConstructConstruct', {
    modelId: 'model-txt2img-stabilityai-stable-diffusion-v2-fp16',
    modelVersion: '*',
  }
);
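
The rest of the stack wires the construct's outputs into the other resources. A minimal sketch, with imports omitted as in the other snippets (the bucket, the function name and the Lambda path are illustrative, not taken from the repository):

const imageBucket = new Bucket(this, 'GeneratedImages');

const inferenceFunction = new PythonFunction(this, 'inferenceFunction', {
  index: 'index.py',
  handler: 'handler',
  runtime: Runtime.PYTHON_3_9,
  entry: path.join(__dirname, '..', '..', 'lambda', 'StableDiffusionInference'),
  environment: {
    endpointName: sdConstruct.endpointName,
    bucketName: imageBucket.bucketName,
  },
  timeout: Duration.minutes(5),
});

// Allow the function to invoke the endpoint and to read/write the bucket
inferenceFunction.addToRolePolicy(sdConstruct.invokeEndPointPolicyStatement);
imageBucket.grantReadWrite(inferenceFunction);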

In addition to this construct, a second Lambda function is deployed that can make requests to the Stable Diffusion endpoint and save the results to S3. The inference method takes a prompt (string), which is sent to the SageMaker endpoint.

def inference(prompt):
    predictor = Predictor(
        endpoint_name=os.environ['endpointName'], sagemaker_session=sess)

    response = predictor.predict(
        prompt.encode("utf-8"),
        {
            "ContentType": "application/x-text",
            "Accept": "application/json",
        },
    )
    return response

The response contains the generated image as an array of pixel values, which can be converted back into an image like this:

import json

import numpy as np
from PIL import Image

response = inference(prompt)
response_dict = json.loads(response)

im = Image.fromarray(np.array(response_dict['generated_image']).astype(np.uint8), 'RGB')
im.save("/tmp/image.png")

Finally, this Lambda function uploads the image to S3 and returns a presigned link to the user:

return {
    "statusCode": 200,
    "body": json.dumps({
        "url": presigned_url,
        "prompt": prompt
    })
}
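
The upload and presigned URL generation are not shown above; with boto3 they could look roughly like this (the bucketName environment variable and the key naming are assumptions):

import uuid

import boto3

s3 = boto3.client("s3")

bucket = os.environ["bucketName"]      # assumed to be provided by the stack
key = f"generated/{uuid.uuid4()}.png"  # illustrative key naming

s3.upload_file("/tmp/image.png", bucket, key)
presigned_url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": bucket, "Key": key},
    ExpiresIn=3600,  # the link stays valid for one hour
)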

Deploy the stack to an AWS account

In order to deploy this stack to an AWS account, you'll need to define it in the index.ts file of your CDK project:

const app = new App();

const stableDiffusionStack = new StableDiffusionStack(
  app, 'StableDiffusionStack', {}
);

To deploy a stack using AWS CDK, you will need to have the AWS CDK CLI installed and set up on your local machine. You will also need to have the necessary permissions to create and manage resources in the AWS account where you want to deploy the stack.

cdk deploy StableDiffusionStack --profile=YOUR-AWS-PROFILE

This will create a new stack called StableDiffusionStack in your AWS account and deploy the resources defined in the stack.

The repository can be found here.

Enjoy generating images! And again, please note that the resources and services used in this repository may not be covered under the AWS Free Tier. Be sure to carefully review the pricing details for each resource and service before deploying this solution, and monitor your usage to avoid unexpected charges.
