Amazon Bedrock

How to Maximize AI Performance with Amazon Bedrock Playgrounds?

Sampath Kumar Basa

Amazon Bedrock provides a versatile platform called Playgrounds, enabling users to experiment with various foundation models within the service. This feature allows users to run model inference against different base models through a graphical interface, simplifying the process of selecting the most suitable model for specific needs. Playgrounds support text, chat, and image-based interfaces, offering flexibility for designing and building diverse generative AI applications.

Experimenting with Model Inference

Playgrounds support experimentation by letting users vary both prompts and inference parameters, then compare the resulting responses. This helps users bring model responses as close as possible to what their use case requires. By ‘playing’ in these playgrounds, users can apply prompt engineering techniques and explore how different foundation models respond to the same input.
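The same kind of experiment can be organized outside the console as a small grid of prompts and parameter values. The sketch below is only an illustration of that idea: `fake_model` is a hypothetical stand-in for a real model call, and the second prompt variant is invented for the example.

```python
from itertools import product

# Two phrasings of the same question, plus a few temperature settings.
prompts = [
    "What are the best budget-friendly destinations in Europe?",
    "List five budget-friendly European cities and give one reason for each.",
]
temperatures = [0.0, 0.7, 0.9]


def fake_model(prompt, temperature):
    """Hypothetical stand-in for a real model call; returns a labelled string."""
    return f"[temperature={temperature}] response to: {prompt}"


# Run every prompt with every temperature and keep the settings next to the
# output, so the responses can be compared side by side afterwards.
runs = [
    {"prompt": p, "temperature": t, "response": fake_model(p, t)}
    for p, t in product(prompts, temperatures)
]

for run in runs:
    print(run["temperature"], "|", run["prompt"])
```

Keeping the prompt and parameters together with each response makes it easier to see which change produced which difference.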

The Role of Prompt Engineering

Prompt engineering is crucial when working with large language models. It involves refining input prompts to generate the most appropriate outputs. By optimizing prompts, users can enhance the performance of generative AI models for specific tasks. Adjusting and altering input prompts allows for manipulation of the AI’s output and behavior, making responses more relevant to the intended application.

Inference Parameters in Playgrounds

The model selected for a playground determines which inference parameters can be adjusted. For instance, the Jurassic-2 Mid foundation model (FM) by AI21 Labs offers three main areas of configuration:

  1. Randomness and Diversity

  • Temperature: Adjusts the randomness of the model’s response. A lower temperature results in more focused responses, while a higher temperature leads to more diverse outputs. The typical range is 0 to 1, with a default of 0.7.
  • Top P: Sets the cumulative probability threshold that limits which tokens can be chosen. A value below 1.0 restricts sampling to the most probable tokens, leading to more consistent and more repetitive completions.
  2. Length

  • Max Completion Length: Limits the length of the response and can be set from 200 to 8192 tokens. Tokens can represent whole words, parts of words, spaces, or characters, depending on the tokenization method.
  • Stop Sequences: Specifies character sequences at which the model should stop generating text, preventing unnecessary token generation.
  3. Repetition

  • Presence Penalty: Decreases the probability of generating tokens that have already appeared at least once in the prompt or completion. Higher values lower the probability further.
  • Count Penalty: Reduces the likelihood of generating tokens that have already appeared, in proportion to the number of times they have appeared.
  • Frequency Penalty: Similar to the count penalty, but proportional to how frequently a token appears relative to the length of the text rather than to its raw count.
  • Penalize Special Tokens: Reduces the probability of repeating special characters, including whitespaces, emojis, punctuation, stopwords, and numbers.
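To make the effect of Temperature and Top P more concrete, here is a minimal sketch of temperature scaling and top-p (nucleus) filtering on a toy next-token distribution. It shows the general technique only, not how Jurassic-2 implements sampling internally; the token scores are invented for illustration.

```python
import math


def apply_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top token.
        best = max(logits, key=logits.get)
        return {tok: (1.0 if tok == best else 0.0) for tok in logits}
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def top_p_filter(probs, top_p):
    """Keep the smallest set of most-likely tokens whose mass reaches top_p."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())  # renormalize so the kept tokens sum to 1
    return {tok: p / total for tok, p in kept.items()}


# Hypothetical next-token scores, invented for illustration.
toy_logits = {"Budapest": 2.0, "Krakow": 1.5, "Lisbon": 1.0, "Belgrade": 0.2}

greedy = apply_temperature(toy_logits, 0.0)   # always picks the top token
focused = apply_temperature(toy_logits, 0.3)  # sharply peaked distribution
diverse = apply_temperature(toy_logits, 0.9)  # flatter distribution
nucleus = top_p_filter(diverse, 0.8)          # drops the least likely token

for name, dist in [("focused", focused), ("diverse", diverse), ("nucleus", nucleus)]:
    print(name, {tok: round(p, 3) for tok, p in dist.items()})
```

A lower temperature concentrates probability on the top-scoring token, while a Top P below 1.0 removes the low-probability tail before sampling.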

Example: Varying the Temperature Parameter

To illustrate how altering the temperature affects responses, consider the following example using the Jurassic-2 Mid Chat Playground.

Figure 1: Model Selection screen in Amazon Bedrock

The prompt was, “What are the best budget-friendly destinations in Europe?” with the following parameters:

  • Temperature: 0
  • Top P: 1
  • Max Completion Length: 300
  • Stop Sequences: Null
  • Presence Penalty: 0
  • Count Penalty: 0
  • Frequency Penalty: 0
  • Penalize Special Tokens: Null

Figure 2: Chat playground screen showing prompt and parameters.
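Outside the console, the same settings could be expressed as a JSON request body. The sketch below assumes the field names used by the AI21 Jurassic-2 request schema on Bedrock (`maxTokens`, `topP`, `stopSequences`, and penalty objects with a `scale` key); these are not taken from the article and should be checked against the current model documentation. The helper `build_jurassic2_body` is a hypothetical name.

```python
import json


def build_jurassic2_body(prompt, temperature=0.7, top_p=1.0, max_tokens=300,
                         stop_sequences=None, presence=0.0, count=0.0,
                         frequency=0.0):
    """Assemble a request body; field names are assumed, verify before use."""
    return {
        "prompt": prompt,
        "temperature": temperature,
        "topP": top_p,
        "maxTokens": max_tokens,
        "stopSequences": stop_sequences or [],
        "presencePenalty": {"scale": presence},
        "countPenalty": {"scale": count},
        "frequencyPenalty": {"scale": frequency},
    }


prompt = "What are the best budget-friendly destinations in Europe?"

# Same prompt, two temperature settings, everything else unchanged.
focused_body = build_jurassic2_body(prompt, temperature=0.0)
diverse_body = build_jurassic2_body(prompt, temperature=0.9)

print(json.dumps(focused_body, indent=2))
```

A body like this would typically be serialized and passed to a Bedrock runtime call such as boto3's `invoke_model`; that call is omitted here so the example runs without AWS credentials.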

The response was focused and concise:

Europe offers a variety of budget-friendly destinations. Some popular options include:

1. Budapest, Hungary: This city is known for its stunning architecture, thermal baths, and a vibrant nightlife scene.
2. Krakow, Poland: Krakow is known for its well-preserved medieval architecture, vibrant cultural scene, and affordable cost of living.
3. Lisbon, Portugal: Lisbon is a popular destination for budget travelers, offering a mix of historic charm, vibrant nightlife, and affordable cost of living.
4. Sofia, Bulgaria: Sofia is known for its historic architecture, cultural sites, and affordable cost of living.
5. Athens, Greece: Athens is a popular destination for budget travelers, offering a mix of historic ruins, vibrant nightlife, and affordable food options.
6. Dublin, Ireland: Dublin is a popular destination for budget travelers, offering a mix of historic charm, vibrant cultural scene, and affordable food and drink options.
7. Berlin, Germany: Berlin is a popular destination for budget travelers, offering a mix of historic landmarks, vibrant nightlife, and affordable cost of living.
8. Prague, Czech Republic: Prague is known for its historic architecture, cultural sites, and affordable cost of living.
9. Istanbul, Turkey: Istanbul is a popular destination for budget travelers, offering a mix of historic landmarks, vibrant cultural scene, and affordable cost of living.
10. Barcelona, Spain: Barcelona is a popular destination for budget travelers, offering a mix of historic charm, vibrant cultural scene, and affordable food and drink options.

Increasing the temperature to 0.9 yielded a more diverse and expanded response, demonstrating the impact of configurable inference parameters on completion results:

Europe offers a wide range of budget options that cater to travelers looking to save money. Some of these destinations include:

1. Kiev, Ukraine: This Eastern European city is known for its historic landmarks and cultural sites. It's also one of the cheapest places to visit in Europe.
2. Lisbon, Portugal: Portugal is a budget-friendly destination in Western Europe, with affordable accommodation, cheap food, and a wide range of things to do.
3. Budapest, Hungary: Hungary is another great option for budget travelers, with cities like Budapest offering a blend of history and modernity.
4. Katowice, Poland: Poland is often overlooked as a budget-friendly destination, but it can be very wallet-friendly if you know where to look. Katowice is a city in Poland that's becoming increasingly popular with budget travelers.
5. Istanbul, Turkey: Istanbul is a big city with a lot to offer budget travelers. It's located in Eastern Europe, but also borders Asia, giving it a unique cultural mix.
6. Sofia, Bulgaria: Bulgaria is another Eastern European country that offers budget-friendly travel options. Sofia is a popular destination for budget travelers, and it's becoming increasingly known as a budget travel destination.
7. Krakow, Poland: Krakow is another popular destination for budget travelers. It's located in Southern Poland and it's a Medieval city with many historic and cultural attractions.
8. Berlin, Germany: Berlin is one of the most budget-friendly cities in Europe. It's a huge city, and it's a destination that attracts a lot of budget travelers.
9. Belgrade, Serbia: Serbia is another Eastern European country that's becoming a popular budget travel destination. Belgrade is the capital city of Serbia, and it's known for its cheap cost of living.
10. Athens, Greece: Athens is a big city with a lot to offer budget travelers. It's located in Southern Europe, and it's a popular budget travel destination.

Assessing Model Metrics

Playgrounds also provide model metrics to evaluate latency and cost based on prompts. In the Chat playground, a ‘Model Metrics’ information screen displays latency, token counts, and inference costs. These metrics help users determine the efficiency of a model for their use case. Users can define metric criteria to set specific boundaries, such as a latency threshold, to evaluate model performance.

Figure 3: Model Metrics information screen

For example, I chose the matching condition ‘less than’ and set a latency threshold of 1200 milliseconds:

Figure 4: Metric Criteria definition screen

In this run, the model’s latency exceeded the threshold, so the latency metric is highlighted in red. This kind of check supports informed decisions about model selection and baseline evaluations.

Figure 5: Model metrics screen showing latency exceeded the threshold highlighted in red.
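The logic behind such a metric criterion is a simple comparison, sketched below. The function name, the ‘greater than’ condition, and the observed latency values are hypothetical and used only for illustration; the ‘less than’ condition and the 1200 millisecond threshold come from the example above.

```python
import operator

# Map the condition label to a comparison. "less than" is the condition used
# in the example; "greater than" is an assumed additional option.
CONDITIONS = {
    "less than": operator.lt,
    "greater than": operator.gt,
}


def meets_criterion(observed, condition, threshold):
    """Return True when the observed metric satisfies the criterion."""
    return CONDITIONS[condition](observed, threshold)


latency_threshold_ms = 1200

# Hypothetical observed latencies, invented for illustration.
for observed_latency_ms in (900, 1850):
    ok = meets_criterion(observed_latency_ms, "less than", latency_threshold_ms)
    status = "within threshold" if ok else "exceeds threshold"
    print(f"{observed_latency_ms} ms: {status}")
```

The same check could be applied to token counts or cost to compare candidate models against a shared baseline.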

Conclusion

Amazon Bedrock Playgrounds offer a user-friendly and flexible environment for experimenting with generative AI models. By adjusting inference parameters and employing prompt engineering techniques, users can optimize model performance for specific applications. Assessing model metrics further aids in selecting the most suitable model, ensuring efficiency and relevance in AI-driven projects.