LLAMA 2 on AWS Bedrock

Generative AI, foundation model

2 min read · Nov 30, 2023

With AWS re:Invent, the popular LLAMA 2 models have finally been added to the Bedrock catalog. They support on-demand invocation, which requires no deployment at all, although you can also set up provisioned throughput. Here is how you interact with the on-demand LLAMA 2 models. Suppose you have set up an AWS profile named ‘dev’.

export AWS_PROFILE=dev
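If you would rather not export the variable, the profile can also be selected in code. The following is a minimal sketch, not part of the original script: the helper names `resolve_profile` and `make_runtime_client` are hypothetical, and the `us-east-1` region is an assumption that you should replace with a region where Bedrock is enabled for your account.

```python
import os


def resolve_profile(default="dev"):
    """Return the profile named in AWS_PROFILE, or the given default if unset."""
    return os.environ.get("AWS_PROFILE", default)


def make_runtime_client(profile_name=None, region_name="us-east-1"):
    """Create a 'bedrock-runtime' client bound to a named AWS profile.

    The region default is an assumption for illustration only.
    """
    # Imported lazily so resolve_profile() works even without boto3 installed.
    import boto3

    session = boto3.Session(
        profile_name=profile_name or resolve_profile(),
        region_name=region_name,
    )
    return session.client("bedrock-runtime")
```

A client created this way can be passed to the function in main.py in place of the one built with `boto3.client`.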

main.py

import json
import logging

import boto3
from botocore.exceptions import ClientError

# The except block below logs through this logger.
logger = logging.getLogger(__name__)


def invoke_llama2(bedrock_runtime_client, prompt):
    """
    Invokes the Meta Llama 2 large-language model to run an inference
    using the input provided in the request body.

    :param bedrock_runtime_client: A boto3 'bedrock-runtime' client.
    :param prompt: The prompt that you want Llama 2 to complete.
    :return: Inference response from the model.
    """

    try:
        # The different model providers have individual request and response formats.
        # For the format, ranges, and default values for Meta Llama 2 Chat, refer to:
        # https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html

        body = {
            "prompt": prompt,
            "temperature": 0.5,
            "top_p": 0.9,
            "max_gen_len": 512,
        }

        # model_id = 'meta.llama2-13b-chat-v1'
        model_id = 'meta.llama2-70b-chat-v1'
        response = bedrock_runtime_client.invoke_model(
            modelId=model_id, body=json.dumps(body)
        )

        # The body is a stream; read it once and decode the JSON payload.
        response_body = json.loads(response["body"].read())
        completion = response_body["generation"]

        return completion

    except ClientError:
        logger.error("Couldn't invoke Llama 2")
        raise

# Management-plane calls (for example, listing models) use the 'bedrock' client instead:
# bedrock = boto3.client(service_name='bedrock')
# bedrock.list_foundation_models()
brt = boto3.client(service_name='bedrock-runtime')
result = invoke_llama2(brt, 'what is llama 2?')
print(result)
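The script above keeps only the `generation` field. The Llama 2 response body also carries token counts and a stop reason, which are handy for cost tracking and for noticing truncated answers. The sketch below is an illustration rather than part of the original script: `parse_llama2_response` is a hypothetical helper, and the response is a hand-built stand-in so the example runs without AWS credentials.

```python
import io
import json


def parse_llama2_response(response):
    """Extract the generated text and its metadata from an invoke_model response."""
    payload = json.loads(response["body"].read())
    return {
        "text": payload["generation"],
        # .get() is used because these fields are treated as optional here.
        "prompt_tokens": payload.get("prompt_token_count"),
        "generated_tokens": payload.get("generation_token_count"),
        "stop_reason": payload.get("stop_reason"),
    }


# Stand-in for a real response: any object with a .read() method works as the body.
fake_payload = {
    "generation": "Llama 2 is a family of language models from Meta.",
    "prompt_token_count": 6,
    "generation_token_count": 12,
    "stop_reason": "stop",
}
fake_response = {"body": io.BytesIO(json.dumps(fake_payload).encode("utf-8"))}

info = parse_llama2_response(fake_response)
print(info["text"])
print(info["stop_reason"])
```

A `stop_reason` of `length` means the output hit `max_gen_len`, in which case raising that limit in the request body is usually the first thing to try.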

Bedrock has also added fine-tuning of foundation models (similar to what AzureML offers).

Customize models in Amazon Bedrock with your own data using fine-tuning and continued pre-training…

Today, I'm excited to share that you can now privately and securely customize foundation models (FMs) with your own…

aws.amazon.com

If you encounter “UnknownServiceError: Unknown service: ‘bedrock-runtime’”, your boto3 version predates Bedrock support. Upgrade to the latest boto3:

pip install -U boto3
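To check for this before creating the client, you can ask a boto3 session which services it knows about. This is a minimal sketch: `supports_bedrock_runtime` is a hypothetical helper name, and it relies only on `Session.get_available_services()`.

```python
def supports_bedrock_runtime(session):
    """Return True if this boto3 session knows the 'bedrock-runtime' service."""
    return "bedrock-runtime" in session.get_available_services()


# Usage with a real session (requires boto3 to be installed):
#   import boto3
#   if not supports_bedrock_runtime(boto3.Session()):
#       print(f"boto3 {boto3.__version__} is too old; run: pip install -U boto3")
```

Failing fast with a clear message tends to be friendlier than letting `boto3.client` raise `UnknownServiceError` deep inside the application.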

Appendix

Build Generative AI Applications with Foundation Models - Amazon Bedrock Pricing - AWS

Find detailed information about Amazon Bedrock pricing models including on-demand and provisioned throughput with the…