Efficient Few-Shot Prompting in LangChain: Response Caching, Prompt Templating and Prompt Serialization (Part 2)
Welcome to the second article in this series. In the first article, we learned about different types of messages in LangChain that are predominantly used to communicate between chat models and users. For further details, you can refer to this article.
In this article, we will learn about Response Caching, Prompt Templating and Prompt Serialization. We will explore each of these topics in detail, providing a comprehensive understanding of their roles in Prompt Engineering. Let’s get started! 🏃‍♂️
Response Caching
Response caching is an optimization technique used to store the precomputed outputs of a server in response to specific requests. In the context of LLMs, it means storing and reusing the generated outputs for prompts. There are two main reasons why this is beneficial:
- It can save money, as it reduces the number of API calls we make to the LLM.
- It also speeds up the response time, since we are not repeatedly calling the API for the same prompt; the response is served from the cache instead.
Response caching can work on an exact match (using the same prompt twice) or on a similar match (two prompts with similar meanings). Let’s look at how this is done in LangChain.
%%time
from langchain.cache import InMemoryCache
from langchain.globals import set_llm_cache
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage

chat = ChatOpenAI()
set_llm_cache(InMemoryCache())

# First request: the response is not yet in the cache, so it takes longer
messages = [HumanMessage(content = 'Tell me a joke')]
response = chat(messages = messages, max_tokens = 50)
print(response.content)
# time: 15.6 ms
%%time
# Second request is faster, as the response is served from the cache
messages = [HumanMessage(content = 'Tell me a joke')]
response = chat(messages = messages, max_tokens = 50)
print(response.content)
# time: 0 ns
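The in-memory cache above lives only for the current session and matches identical prompts. For a cache that persists across runs, LangChain also provides a SQLite-backed cache; here is a minimal sketch (the database path is an arbitrary choice). Similar-meaning matching is available through integrations such as GPTCache rather than the built-in caches.
# Persist cached responses to disk so they survive restarts
from langchain.cache import SQLiteCache

# '.langchain.db' is an arbitrary file name; SQLite creates it if missing
set_llm_cache(SQLiteCache(database_path = '.langchain.db'))

# Repeating the exact same prompt is now served from disk, even in a new session
messages = [HumanMessage(content = 'Tell me a joke')]
response = chat(messages = messages, max_tokens = 50)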
Prompt Templating
Here we create pre-defined structures to better guide the response from the LLM. These templates act like a blueprint capturing the important elements of a prompt.
Prompt templating can be done using format strings and f-string literals, just like regular string formatting in Python.
# Using Format strings
from langchain.llms import OpenAI
llm = OpenAI()
prompt_template = "Tell me something about {topic}"
print(llm(prompt_template.format(topic='Insurance')))
# Insurance is a contract between an individual or organization (the insured) and an insurance company (the insurer) where the insurer agrees to provide financial compensation for specified losses, damages, or injuries in exchange for a premium. The purpose of insurance is to protect against the risk of financial loss due to unexpected events such as accidents, illness, natural disasters, or death. Insurance can cover various aspects of life, including health, property, life, and liability. The amount of coverage and the cost of premiums depend on factors such as the type of insurance, the level of risk, and the individual's age and health. Insurance is typically required for certain activities, such as driving a car or owning a home, and can provide peace of mind and financial security for individuals and businesses.
# Using f-string literals
topic = 'Insurance'
prompt_template = f"Tell me something about {topic}"
print(llm(prompt_template))
# (output is identical to the previous example)
We can also use format strings and f-string literals with LangChain schema objects.
# Using format strings with langchain schema
from langchain.schema import SystemMessage
prompt_template = "Tell me something about {topic}"
system_prompt = SystemMessage(prompt_template.format(topic = 'Insurance'))
print(llm(system_prompt.content))
The issue here is that these methods do not scale well, especially when we are dealing with complex prompts. We also need to define the inputs as global variables, which makes them less flexible.
Let’s learn about templating using LangChain prompt templates.
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables = ['topic'],
    template = "Tell me something about {topic}"
)
prompt = prompt_template.format(topic = 'Data Science')
# Prompt: 'Tell me something about Data Science'
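The formatted string can then be passed straight to the model, exactly as in the f-string examples above:
print(llm(prompt))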
Let’s add more elements to the same prompt.
# adding more elements
prompt_template = PromptTemplate(
    input_variables = ['topic', 'num_words'],
    template = "Tell me something about {topic} in {num_words} words"
)
prompt = prompt_template.format(topic = 'Data Science', num_words = 30)
# Prompt: 'Tell me something about Data Science in 30 words'
Note that input_variables here is mostly declarative: recent LangChain versions infer the variables from the template itself, so the list can even be left empty.
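Two related conveniences are worth a quick sketch here (both are standard PromptTemplate methods): from_template infers the input variables for you, and partial pre-fills inputs that rarely change.
# from_template infers input_variables from the braces in the template
prompt_template = PromptTemplate.from_template(
    "Tell me something about {topic} in {num_words} words"
)
print(prompt_template.input_variables)
# ['num_words', 'topic']

# partial() pre-fills a variable and returns a new template
short_template = prompt_template.partial(num_words = 30)
print(short_template.format(topic = 'Data Science'))
# Tell me something about Data Science in 30 words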
Now, let’s look at prompt templating using LangChain’s HumanMessagePromptTemplate for a single prompt.
from langchain.prompts.chat import HumanMessagePromptTemplate, ChatPromptTemplate
human_template = "Tell me something about {topic}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
chat_prompt = ChatPromptTemplate.from_messages([human_message_prompt])
prompt = chat_prompt.format_prompt(topic = 'Insurance')
print(prompt)
# to get the messages from Chat Prompt
messages = prompt.messages
print(messages)
# getting response from chat model
response = chat(messages = messages)
response
# prompt: messages=[HumanMessage(content='Tell me something about Insurance')]
# messages: [HumanMessage(content='Tell me something about Insurance')]
# response: AIMessage(content='Insurance is a form of risk management that provides financial protection against potential losses or damages. It involves an individual or entity paying a premium to an insurance company in exchange for coverage in the event of an unexpected event, such as an accident, illness, or natural disaster. Types of insurance include health insurance, auto insurance, home insurance, life insurance, and business insurance. The purpose of insurance is to help individuals and businesses mitigate financial risks and recover from unexpected events.')
Here, we first define a human template as a plain string. We then convert it into LangChain’s HumanMessagePromptTemplate. Finally, we build a ChatPromptTemplate from that single human message prompt.
This might look complex, but you will appreciate the flexibility it provides once we deal with complex prompts. Let’s now build a more complex prompt step by step, using multiple message schemas.
First, we add a system message prompt template.
# System Message Prompt Template
from langchain.prompts.chat import SystemMessagePromptTemplate

system_template = "You are a {character}"
system_message_prompt = SystemMessagePromptTemplate.from_template(system_template)
system_message_prompt
# SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['character'], template='You are a {character}'))
Then, we add a human message prompt template.
# Human Message Prompt Template
human_template = "Write a crime scene involving {item1} in a {item2}"
human_message_prompt = HumanMessagePromptTemplate.from_template(human_template)
human_message_prompt
# HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['item1', 'item2'], template='Write a crime scene involving {item1} in a {item2}'))
Finally, let’s add a chat prompt template.
# Chat Prompt Template
chat_prompt = ChatPromptTemplate.from_messages([system_message_prompt,human_message_prompt])
chat_prompt
# ChatPromptTemplate(input_variables=['character', 'item1', 'item2'], messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['character'], template='You are a {character}')), HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['item1', 'item2'], template='Write a crime scene involving {item1} in a {item2}'))])
Here is how the final prompt looks.
prompt = chat_prompt.format_prompt(character = 'Writer', item1 = 'robbery', item2 = 'mansion')
prompt.messages
# [SystemMessage(content='You are a Writer'),
# HumanMessage(content='Write a crime scene involving robbery in a mansion')]
The response from the model:
response = chat(messages = prompt.messages, max_tokens = 100)
response.content
# 'The grand mansion stood imposingly in the moonlit night, its ivy-covered walls casting eerie shadows across the manicured lawn. Inside, the opulent interior was a stark contrast to the darkness outside. The air was heavy with the scent of wealth and privilege, but tonight, that sense of security was shattered.\n\nDetective Jameson surveyed the scene before him, the sound of police radios and flashing lights filling the room. The room was in disarray, expensive artwork ripped from the walls, shattered'
This is how we can customize prompts using LangChain prompt templates. You can see that it is a lot more flexible when we have multiple schemas. We can also add other message types, such as AIMessagePromptTemplate, as shown below.
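For instance, an AI message template lets us bake a worked example into the prompt, which is the essence of few-shot prompting. Here is a minimal one-shot sketch; the example topic and answer are made up for illustration.
from langchain.prompts.chat import AIMessagePromptTemplate

# A sample question and an ideal answer act as the 'shot'
example_human = HumanMessagePromptTemplate.from_template("Explain {example_topic} in one line")
example_ai = AIMessagePromptTemplate.from_template("{example_answer}")
question = HumanMessagePromptTemplate.from_template("Explain {topic} in one line")

chat_prompt = ChatPromptTemplate.from_messages([example_human, example_ai, question])
prompt = chat_prompt.format_prompt(
    example_topic = 'recursion',
    example_answer = 'Recursion is a function calling itself until a base case stops it.',
    topic = 'caching'
)
response = chat(messages = prompt.messages)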
Prompt Serialization
Prompt serialization is the process of converting a prompt into a storable, readable format, which improves the reusability and maintainability of prompts.
By serializing prompts, we can save the prompt state and reload it whenever needed, without manually recreating the prompt configuration.
LangChain uses either JSON or YAML format to serialize the prompts. Let’s look at how we can serialize a LangChain prompt.
from langchain.prompts import PromptTemplate

prompt_template = PromptTemplate(
    input_variables = [''],  # inferred from the template (see the note above)
    template = "Tell me something about {topic}"
)
prompt_template.save('prompt.json')
Here, the prompt template is stored in a file “prompt.json”. The file looks something like this.
{
  "name": null,
  "input_variables": ["topic"],
  "input_types": {},
  "output_parser": null,
  "partial_variables": {},
  "metadata": null,
  "tags": null,
  "template": "Tell me something about {topic}",
  "template_format": "f-string",
  "validate_template": false,
  "_type": "prompt"
}
How do we load the serialized prompt?
We can use the load_prompt function, which reads the JSON file and recreates the prompt template.
from langchain.prompts import load_prompt
loaded_prompt = load_prompt('prompt.json')
loaded_prompt
# PromptTemplate(input_variables=['topic'], template='Tell me something about {topic}')
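The same workflow works for YAML: save() picks the serialization format from the file extension.
# the .yaml extension switches the format to YAML
prompt_template.save('prompt.yaml')
loaded_prompt = load_prompt('prompt.yaml')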
That’s all for this article. The entire code can be found in this GitHub link. Let me know if you have questions or suggestions.
Thank you for taking the time to read this. Keep learning!