Prompt Templates, intro
CassIO provides a set of bindings that inject data from Cassandra tables into your LangChain prompt templates.
Basic usage
First, import the specialized Cassandra prompt template:
from langchain.prompts.database import CassandraReaderPromptTemplate
A database connection is needed. (If on a Colab, the only supported option is the cloud service Astra DB.)
# Ensure loading of database credentials into environment variables:
import os
from dotenv import load_dotenv
load_dotenv("../../../.env")
import cassio
Select your choice of database by editing this cell, if needed:
database_mode = "cassandra" # "cassandra" / "astra_db"
if database_mode == "astra_db":
    cassio.init(
        database_id=os.environ["ASTRA_DB_ID"],
        token=os.environ["ASTRA_DB_APPLICATION_TOKEN"],
        keyspace=os.environ.get("ASTRA_DB_KEYSPACE"),  # this is optional
    )
if database_mode == "cassandra":
    from cqlsession import getCassandraCQLSession, getCassandraCQLKeyspace
    cassio.init(
        session=getCassandraCQLSession(),
        keyspace=getCassandraCQLKeyspace(),
    )
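If a required variable is missing, the Astra DB branch above fails with a bare KeyError. The following is a minimal sketch of one way to fail early with a clearer message. The helper name astra_init_kwargs is hypothetical (it is not part of CassIO); only the environment variable names and the keyword argument names are taken from the cell above.

```python
import os


def astra_init_kwargs(env=os.environ):
    """Collect the keyword arguments for the Astra DB branch, failing early if incomplete."""
    # Maps each required keyword argument to the environment variable that supplies it.
    required = {"database_id": "ASTRA_DB_ID", "token": "ASTRA_DB_APPLICATION_TOKEN"}
    missing = [var for var in required.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    kwargs = {arg: env[var] for arg, var in required.items()}
    kwargs["keyspace"] = env.get("ASTRA_DB_KEYSPACE")  # optional, may be None
    return kwargs


# Demonstration with a fake environment (placeholder values, not real credentials).
fake_env = {"ASTRA_DB_ID": "my-db-id", "ASTRA_DB_APPLICATION_TOKEN": "my-token"}
print(astra_init_kwargs(fake_env))
```

With such a helper, the Astra DB branch could presumably be written as cassio.init(**astra_init_kwargs()).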
Pre-populate the database
The following cell prepares some example data on the database. Of course, your real application would insert data into its tables in very different ways. Please run the cell: the data is also used in the other prompt-template demos.
c_session = cassio.config.resolve_session()
c_keyspace = cassio.config.resolve_keyspace()
c_session.execute(f"""
    CREATE TABLE IF NOT EXISTS {c_keyspace}.people (
        city text,
        name text,
        age int,
        PRIMARY KEY (city, name)
    ) WITH CLUSTERING ORDER BY (name ASC);
""")
c_session.execute(f"""
    CREATE TABLE IF NOT EXISTS {c_keyspace}.nickname_by_city (
        city text PRIMARY KEY,
        nickname text
    );
""")
c_session.execute(f"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('turin', 'beppe', 2);")
c_session.execute(f"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('milan', 'samanta', 14);")
c_session.execute(f"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('tokyo', 'hideo', 144);")
c_session.execute(f"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('turin', 'alberto', 6);")
c_session.execute(f"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('lisbon', 'Pedro', 1);")
c_session.execute(f"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('turin', 'CereaNeh');")
c_session.execute(f"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('lisbon', 'ACidade');")
c_session.execute(f"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('milan', 'Taaac');")
Natural binding with the DB
We start by defining a template string to be formatted into a final prompt:
ctemplate0 = """Please answer a question from a user.
Keep in mind that the user's age is {user_age} and they live in a city with
nickname {city_nickname}.
USER'S QUESTION: {user_question}
YOUR ANSWER:
"""
In the (string) template above, some variables are to be filled with a DB lookup.
The following instruction specifies the details of the binding: for instance, the variable user_age is to be found in table people, specifically in column age:
cassPrompt = CassandraReaderPromptTemplate(
    session=None,
    keyspace=None,
    field_mapper={
        'user_age': ('people', 'age'),
        'city_nickname': ('nickname_by_city', 'nickname'),
    },
    template=ctemplate0,
    input_variables=["user_question"],
)
Note that in the command above, input_variables lists only the regular template variable (user_question): neither the DB-bound variable names found in the prompt string nor the primary key columns are declared there.
When formatting the prompt template, you specify the primary key values for the DB lookup, and the prompt template does the rest.
In this case there are two lookups from as many tables: the prompt template takes care of everything, provided you pass all the primary key columns required across tables (here, city and name).
Note: this operation is essentially a client-side join (a standard pattern with Cassandra).
print(cassPrompt.format(
    city='turin',
    name='beppe',
    user_question='Is functional programming fun?',
))
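To make the "client-side join" concrete, here is a minimal, database-free sketch of the idea. Plain dictionaries stand in for the two tables, and a hypothetical helper, lookup_and_format, runs one lookup per mapped variable before filling in the template. The names TABLES, PRIMARY_KEYS and lookup_and_format are illustrative assumptions, not part of CassIO or LangChain, and the actual template class may well work differently internally.

```python
# In-memory stand-ins for the two Cassandra tables (illustrative only).
# Each table maps a primary-key tuple to a row dictionary.
TABLES = {
    "people": {
        ("turin", "beppe"): {"city": "turin", "name": "beppe", "age": 2},
        ("milan", "samanta"): {"city": "milan", "name": "samanta", "age": 14},
    },
    "nickname_by_city": {
        ("turin",): {"city": "turin", "nickname": "CereaNeh"},
        ("milan",): {"city": "milan", "nickname": "Taaac"},
    },
}

# Primary-key columns of each table, in key order.
PRIMARY_KEYS = {
    "people": ("city", "name"),
    "nickname_by_city": ("city",),
}

FIELD_MAPPER = {
    "user_age": ("people", "age"),
    "city_nickname": ("nickname_by_city", "nickname"),
}

TEMPLATE = (
    "The user's age is {user_age} and their city's nickname is "
    "{city_nickname}. QUESTION: {user_question}"
)


def lookup_and_format(template, field_mapper, **kwargs):
    """Resolve each mapped variable with one lookup per table, then format."""
    resolved = {}
    for variable, (table, column) in field_mapper.items():
        # Build this table's key from the primary-key values supplied by the caller.
        key = tuple(kwargs[pk_column] for pk_column in PRIMARY_KEYS[table])
        resolved[variable] = TABLES[table][key][column]
    # Everything that is not a primary-key column is a regular template input.
    all_pk_columns = {c for columns in PRIMARY_KEYS.values() for c in columns}
    regular = {k: v for k, v in kwargs.items() if k not in all_pk_columns}
    return template.format(**resolved, **regular)


print(lookup_and_format(
    TEMPLATE,
    FIELD_MAPPER,
    city="turin",
    name="beppe",
    user_question="Is functional programming fun?",
))
```

The caller passes city only once, yet it feeds both lookups: this is why supplying the union of the primary key columns across tables is enough.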
Arbitrary row functions
You can specify an arbitrary function to transform the database row into the returned field. The function receives the row as a {column_name: value} dictionary and returns the value for the prompt template:
def nicknamer(row_dict):
    return f"{row_dict['nickname']} (i.e. {row_dict['city']})"

field_mapper_f = {
    'user_age': ('people', lambda row_dict: row_dict['age'] + 10000),
    'city_nickname': ('nickname_by_city', nicknamer),
}

cassPromptF = CassandraReaderPromptTemplate(
    session=None,
    keyspace=None,
    field_mapper=field_mapper_f,
    template=ctemplate0,
    input_variables=["user_question"],
)

print(cassPromptF.format(
    city='milan',
    name='samanta',
    user_question='Is there a square circle?',
))
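The distinction between "column name" and "row function" in a field-mapper entry can be pictured with the following database-free sketch. The helper extract_field is a hypothetical name introduced here for illustration; it only mimics the behaviour described above and is not the library's own code.

```python
def extract_field(row_dict, column_or_function):
    """Return the prompt value for a row, given a column name or a callable."""
    if callable(column_or_function):
        # A function receives the whole {column_name: value} row.
        return column_or_function(row_dict)
    # A plain string is treated as the name of the column to read.
    return row_dict[column_or_function]


def nicknamer(row_dict):
    return f"{row_dict['nickname']} (i.e. {row_dict['city']})"


# Rows as they would come back from the two example tables.
people_row = {"city": "milan", "name": "samanta", "age": 14}
nickname_row = {"city": "milan", "nickname": "Taaac"}

print(extract_field(people_row, "age"))                           # 14
print(extract_field(people_row, lambda row: row["age"] + 10000))  # 10014
print(extract_field(nickname_row, nicknamer))                     # Taaac (i.e. milan)
```

Since the function sees the full row, it can combine several columns (as nicknamer does) rather than just reshaping one.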
Null and missing values
You can control how the prompt template behaves when a None value is encountered, or even when a table has no row at all for a given primary key.
First, you can pass a boolean parameter admit_nulls to the prompt template.
Second, you can use the full four-element tuple format for the entries in the "field mapper": (table_name, column_name_or_function, admit_nulls, default_value). The entry's admit_nulls overrides the template-wide default.
field_mapper_n = {
    'user_age': ('people', 'age'),
    'city_nickname': ('nickname_by_city', 'nickname', True, '(no nickname)'),
}

cassPromptN = CassandraReaderPromptTemplate(
    session=None,
    keyspace=None,
    field_mapper=field_mapper_n,
    template=ctemplate0,
    input_variables=["user_question"],
    admit_nulls=False,
)

# Note: there is no "tokyo" in the nicknames table
print(cassPromptN.format(
    city='tokyo',
    name='hideo',
    user_question='What are we having for lunch?',
))

try:
    # Note: there are no rows with city='madrid' in the "people" table
    print(cassPromptN.format(
        city='madrid',
        name='alberto',
        user_question='What are we having for lunch?',
    ))
except Exception as e:
    print(f"Exception => {str(e)}")
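The interplay between the template-wide admit_nulls and the per-entry override can be sketched without a database as follows. The helpers normalize_entry and resolve_value are hypothetical names, and the choice of ValueError is an assumption: the sketch only reproduces the behaviour shown in the two examples above (a default is returned when nulls are admitted, an exception is raised otherwise).

```python
def normalize_entry(entry, default_admit_nulls=False):
    """Expand a two- or four-element field-mapper tuple to its full form."""
    if len(entry) == 4:
        # The entry's own admit_nulls takes precedence over the template-wide one.
        return tuple(entry)
    table, column_or_function = entry
    return table, column_or_function, default_admit_nulls, None


def resolve_value(row_dict, column, admit_nulls, default_value):
    """Apply the null/missing policy to one lookup result (None means no row)."""
    value = None if row_dict is None else row_dict.get(column)
    if value is not None:
        return value
    if admit_nulls:
        return default_value
    raise ValueError(f"Null or missing value for column '{column}'")


# No row for 'tokyo' in the nicknames table, but nulls are admitted for this entry.
_, column, admit, default = normalize_entry(
    ("nickname_by_city", "nickname", True, "(no nickname)")
)
print(resolve_value(None, column, admit, default))

# No row for ('madrid', 'alberto') in "people", and nulls are not admitted.
_, column, admit, default = normalize_entry(("people", "age"), default_admit_nulls=False)
try:
    resolve_value(None, column, admit, default)
except ValueError as e:
    print(f"Exception => {e}")
```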
Partialing Prompt Templates
Cassandra-powered prompt templates support partialing. Suppose you have just enough information to bind the template to the DB-lookup values: you can leave the user_question unspecified for later completion at "format-time":
cassPartialPrompt = cassPrompt.partial(city='lisbon', name='Pedro')
The partial prompt template will keep the provided inputs ready to execute the full lookup-and-format operation when needed:
print(cassPartialPrompt.format(user_question='Em verdade, o que quiseres?'))
You can partial on any choice of input variables, even mixing database-bound and regular inputs:
cassPartialPrompt2 = cassPrompt.partial(city='lisbon', user_question='Estou perto do Tejo?')
print(cassPartialPrompt2.format(name='Pedro'))
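The key point of partialing, namely that the provided inputs are only stored and the lookup-and-format step runs later, can be illustrated with this small sketch. PartialFormatter and demo_format are hypothetical names used purely for illustration, with demo_format standing in for the actual database lookup; this is not how LangChain implements partial.

```python
class PartialFormatter:
    """Hypothetical helper: stores some inputs now, receives the rest at format time."""

    def __init__(self, format_function, **stored):
        self._format_function = format_function
        self._stored = stored

    def partial(self, **more):
        # Return a new object so that the original stays reusable.
        return PartialFormatter(self._format_function, **{**self._stored, **more})

    def format(self, **kwargs):
        # Stored and late inputs are merged only when formatting actually happens.
        return self._format_function(**{**self._stored, **kwargs})


def demo_format(city, name, user_question):
    # Stand-in for the lookup-and-format step; a real template would query the DB here.
    return f"[{name} from {city}] {user_question}"


base = PartialFormatter(demo_format)

# Bind the database keys first, supply the question later.
p1 = base.partial(city="lisbon", name="Pedro")
print(p1.format(user_question="Em verdade, o que quiseres?"))

# Mix a database key and a regular input, supply the remaining key later.
p2 = base.partial(city="lisbon", user_question="Estou perto do Tejo?")
print(p2.format(name="Pedro"))
```

Because nothing is looked up until format is called, the sketch treats database-bound and regular inputs in exactly the same way when partialing.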
Chat Prompt Templates
The Cassandra-specific approach can be seamlessly integrated with LangChain's "chat prompt templates", which represent a (template-based) way to manage chat exchanges.
Start with a prompt, not very different from what you've seen so far:
systemTemplate = """
You are a chat assistant, helping a user of age {user_age} from a city
they refer to as {city_nickname}.
"""

cassSystemPrompt = CassandraReaderPromptTemplate(
    session=None,
    keyspace=None,
    template=systemTemplate,
    input_variables=[],
    field_mapper={
        'user_age': ('people', 'age'),
        'city_nickname': ('nickname_by_city', 'nickname'),
    },
)
Next, you need specific abstractions to wrap this "system prompt" as part of a broader chat exchange:
from langchain.prompts import (
    ChatPromptTemplate,
    SystemMessagePromptTemplate,
    HumanMessagePromptTemplate,
)
systemMessagePrompt = SystemMessagePromptTemplate(prompt=cassSystemPrompt)
A sequence of messages
Once you wrap a single prompt template as a "system message prompt", go ahead and make it part of a longer chat conversation:
humanTemplate = "{text}"
humanMessagePrompt = HumanMessagePromptTemplate.from_template(humanTemplate)
cassChatPrompt = ChatPromptTemplate.from_messages(
    [systemMessagePrompt, humanMessagePrompt]
)
Formatting
LangChain takes care of correctly propagating the formatting steps throughout the sequence of messages, including the Cassandra-backed template:
print(cassChatPrompt.format_prompt(
    city='turin',
    name='beppe',
    text='Assistant, please help me!',
).to_string())
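As a rough mental model of this propagation, the sketch below formats a sequence of messages by handing each one only the inputs it declares. The functions system_message, human_message and format_chat are hypothetical stand-ins written for this illustration; they make no claim about how LangChain routes the arguments internally.

```python
import inspect


def system_message(city, name):
    # Stand-in for the Cassandra-backed system prompt (a real one would query the DB).
    return f"You are helping {name} from {city}."


def human_message(text):
    return text


def format_chat(messages, **kwargs):
    """Format each (role, function) pair with only the inputs that function declares."""
    lines = []
    for role, function in messages:
        wanted = inspect.signature(function).parameters
        content = function(**{name: kwargs[name] for name in wanted})
        lines.append(f"{role}: {content}")
    return "\n".join(lines)


CHAT = [("System", system_message), ("Human", human_message)]

print(format_chat(
    CHAT,
    city="turin",
    name="beppe",
    text="Assistant, please help me!",
))
```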
Partialing and Chat Prompt Templates
In some cases, you may want to partial with respect to the database lookup key(s) even within a chat prompt template:
cassChatPartialPrompt = cassChatPrompt.partial(
    city='turin',
    name='beppe',
)
print(cassChatPartialPrompt.format(text="Hahaha!"))