{ "cells": [ { "cell_type": "markdown", "id": "3f5c5368", "metadata": {}, "source": [ "# Prompt Templates, intro" ] }, { "cell_type": "markdown", "id": "871486f5", "metadata": {}, "source": [ "CassIO powers a sophisticated set of bindings to seamlessly inject data\n", "from Cassandra tables into your LangChain prompt templates." ] }, { "cell_type": "markdown", "id": "c0763b09", "metadata": {}, "source": [ "## Basic usage" ] }, { "cell_type": "markdown", "id": "a7b2eb09-a8bc-482f-9a8d-e5769d15c764", "metadata": {}, "source": [ "First, import the specialized Cassandra prompt template:" ] }, { "cell_type": "code", "execution_count": 1, "id": "a61e4821", "metadata": {}, "outputs": [], "source": [ "from langchain.prompts.database import CassandraReaderPromptTemplate" ] }, { "cell_type": "markdown", "id": "4578a87b", "metadata": {}, "source": [ "A database connection is needed. _(If on a Colab, the only supported option is the cloud service Astra DB.)_" ] }, { "cell_type": "code", "execution_count": 2, "id": "9515d4af-9fc2-4c29-b9f2-5c90cd48cc1b", "metadata": {}, "outputs": [], "source": [ "# Ensure loading of database credentials into environment variables:\n", "import os\n", "from dotenv import load_dotenv\n", "load_dotenv(\"../../../.env\")\n", "\n", "import cassio" ] }, { "cell_type": "markdown", "id": "25ee44a4-8d6a-4b41-8a8f-7c93eed35034", "metadata": {}, "source": [ "Select your choice of database by editing this cell, if needed:" ] }, { "cell_type": "code", "execution_count": 3, "id": "bfe78b92-7194-4ee9-ba7f-cd308409212c", "metadata": {}, "outputs": [], "source": [ "database_mode = \"cassandra\" # \"cassandra\" / \"astra_db\"" ] }, { "cell_type": "code", "execution_count": 4, "id": "fb5597b3-a028-4d8f-ad6d-8eaf0c941dcc", "metadata": {}, "outputs": [], "source": [ "if database_mode == \"astra_db\":\n", " cassio.init(\n", " database_id=os.environ[\"ASTRA_DB_ID\"],\n", " token=os.environ[\"ASTRA_DB_APPLICATION_TOKEN\"],\n", " keyspace=os.environ.get(\"ASTRA_DB_KEYSPACE\"), # this is optional\n", " )" ] }, { "cell_type": "code", "execution_count": 5, "id": "329d969b-6532-4dc4-a315-1a5438a59919", "metadata": {}, "outputs": [], "source": [ "if database_mode == \"cassandra\":\n", " from cqlsession import getCassandraCQLSession, getCassandraCQLKeyspace\n", " cassio.init(\n", " session=getCassandraCQLSession(),\n", " keyspace=getCassandraCQLKeyspace(),\n", " )" ] }, { "cell_type": "markdown", "id": "8fe8369a-fd2b-4526-8ef1-2bf014680624", "metadata": {}, "source": [ "### Pre-populate the database\n", "\n", "The following cell prepares some example data on the database. Of course your real application would\n", "have very different ways to insert data in your tables. Please run the cell, it will be used in other prompt-template-related demos:" ] }, { "cell_type": "code", "execution_count": 6, "id": "1cc6443f-6053-433f-8466-f57eb3521ad0", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c_session = cassio.config.resolve_session()\n", "c_keyspace = cassio.config.resolve_keyspace()\n", "\n", "c_session.execute(f\"\"\"\n", " CREATE TABLE IF NOT EXISTS {c_keyspace}.people (\n", " city text,\n", " name text,\n", " age int,\n", " PRIMARY KEY (city, name)\n", " ) WITH CLUSTERING ORDER BY (name ASC);\n", "\"\"\")\n", "\n", "c_session.execute(f\"\"\"\n", " CREATE TABLE IF NOT EXISTS {c_keyspace}.nickname_by_city (\n", " city text PRIMARY KEY,\n", " nickname text\n", " );\n", "\"\"\")\n", "\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('turin', 'beppe', 2);\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('milan', 'samanta', 14);\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('tokyo', 'hideo', 144);\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('turin', 'alberto', 6);\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.people (city, name, age) VALUES ('lisbon', 'Pedro', 1);\")\n", "\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('turin', 'CereaNeh');\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('lisbon', 'ACidade');\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('turin', 'CereaNeh');\")\n", "c_session.execute(f\"INSERT INTO {c_keyspace}.nickname_by_city (city, nickname) VALUES ('milan', 'Taaac');\")" ] }, { "cell_type": "markdown", "id": "547a4105", "metadata": {}, "source": [ "### Natural binding with the DB" ] }, { "cell_type": "markdown", "id": "ebefe5e0-582c-4613-ae54-1bf0212a8494", "metadata": {}, "source": [ "We start by defining a template string to be formatted into a final prompt:" ] }, { "cell_type": "code", "execution_count": 7, "id": "657cd948", "metadata": {}, "outputs": [], "source": [ "ctemplate0 = \"\"\"Please answer a question from a user.\n", "Keep in mind that the user's age is {user_age} and they live in a city with\n", "nickname {city_nickname}.\n", "\n", "USER'S QUESTION: {user_question}\n", "YOUR ANSWER:\n", "\"\"\"" ] }, { "cell_type": "markdown", "id": "258b7b92-5b48-45d5-9b5c-86d2d6edbea6", "metadata": {}, "source": [ "In the (string) template above, some variables are to be filled with a DB lookup.\n", "\n", "The following instructions specifies the details of the binding: for instance,\n", "the variable `user_age` is to be found on table `people`, specifically in column `age`:" ] }, { "cell_type": "code", "execution_count": null, "id": "1ef22db8", "metadata": {}, "outputs": [], "source": [ "cassPrompt = CassandraReaderPromptTemplate(\n", " session=None,\n", " keyspace=None,\n", " field_mapper={\n", " 'user_age': ('people', 'age'),\n", " 'city_nickname': ('nickname_by_city', 'nickname'),\n", " },\n", " template=ctemplate0,\n", " input_variables=[\"user_question\"],\n", ")" ] }, { "cell_type": "markdown", "id": "e188d25b", "metadata": {}, "source": [ "**Note** that in the command above you specify the _primary key columns_ as `input_variables`, and not the variable names found in the prompt string above." ] }, { "cell_type": "markdown", "id": "9d954f71", "metadata": {}, "source": [ "When formatting the Prompt Template, you will have to specify the primary key values\n", "for the DB lookup -- the rest is done by the prompt template.\n", "\n", "In this case there are two lookups from as many tables: the prompt template\n", "takes care of everything, provided you pass all the primary key columns required\n", "across tables.\n", "\n", "Note: this operation essentially is a _client-side join_ (a standard pattern with Cassandra)." ] }, { "cell_type": "code", "execution_count": null, "id": "584efc7c", "metadata": {}, "outputs": [], "source": [ "print(cassPrompt.format(city='turin', name='beppe',\n", " user_question='Is functional programming fun?'))" ] }, { "cell_type": "markdown", "id": "83ba64c4-6aad-4e98-85ab-37bb34e7802f", "metadata": {}, "source": [ "#### Arbitrary row functions\n", "\n", "You can specify an arbitrary function to transform the database row into the returned field. The function gets a `{column_name: value}` dictionary expressing the row and returns the value for the prompt template:" ] }, { "cell_type": "code", "execution_count": null, "id": "9128d438-55ad-4402-9d2a-07b234912933", "metadata": {}, "outputs": [], "source": [ "def nicknamer(row_dict):\n", " return f\"{row_dict['nickname']} (i.e. {row_dict['city']})\"\n", "\n", "field_mapper_f = {\n", " 'user_age': ('people', lambda row_dict: row_dict['age'] + 10000),\n", " 'city_nickname': ('nickname_by_city', nicknamer),\n", "}\n", "\n", "cassPromptF = CassandraReaderPromptTemplate(\n", " session=None,\n", " keyspace=None,\n", " field_mapper=field_mapper_f,\n", " template=ctemplate0,\n", " input_variables=[\"user_question\"],\n", ")\n", "\n", "print(cassPromptF.format(city='milan', name='samanta',\n", " user_question='Is there a square circle?'))" ] }, { "cell_type": "markdown", "id": "2a2247ed-85e7-4bb7-abb0-9097c6922ade", "metadata": {}, "source": [ "#### Null and missing values\n", "\n", "You can control how the prompt template should behave when a `None` value is encountered or even when a table has no rows altogether for a given primary key.\n", "\n", "First, you can pass a boolean parameter `admit_nulls` to the prompt template.\n", "\n", "Second, you can use the full four-element tuple format for the entries in the \"field mapper\". This would be `(table_name, column_name_or_function, admit_nulls, default_value)` (whose `admit_nulls` will override the overall default)." ] }, { "cell_type": "code", "execution_count": null, "id": "a1e83cec-31e4-4120-85d0-45cf7d829379", "metadata": {}, "outputs": [], "source": [ "field_mapper_n = {\n", " 'user_age': ('people', 'age'),\n", " 'city_nickname': ('nickname_by_city', 'nickname', True, '(no nickname)'),\n", "}\n", "\n", "cassPromptN = CassandraReaderPromptTemplate(\n", " session=None,\n", " keyspace=None,\n", " field_mapper=field_mapper_n,\n", " template=ctemplate0,\n", " input_variables=[\"user_question\"],\n", " admit_nulls=False,\n", ")\n", "\n", "# Note: there is no \"tokyo\" in the nicknames table\n", "print(cassPromptN.format(city='tokyo', name='hideo',\n", " user_question='What are we having for lunch?'))" ] }, { "cell_type": "code", "execution_count": null, "id": "8bfe002d-9362-4910-a62d-51e27b26ec9e", "metadata": {}, "outputs": [], "source": [ "try:\n", " # Note: there are no rows with city='madrid' in the \"people\" table\n", " print(cassPromptN.format(city='madrid', name='alberto',\n", " user_question='What are we having for lunch?'))\n", "except Exception as e:\n", " print(f\"Exception => {str(e)}\")" ] }, { "cell_type": "markdown", "id": "673f685a-2991-4ac5-80be-25482a596d77", "metadata": {}, "source": [ "## Partialing Prompt Templates\n", "\n", "Cassandra-powered prompt templates support partialing. Suppose you have just enough information to bind the template to the DB-lookup values: you can leave the `user_question` unspecified for later completion at \"format-time\":" ] }, { "cell_type": "code", "execution_count": null, "id": "47c5f06b-98ff-404c-8564-00e91a6249e2", "metadata": {}, "outputs": [], "source": [ "cassPartialPrompt = cassPrompt.partial(city='lisbon', name='Pedro')" ] }, { "cell_type": "markdown", "id": "e837616e-b184-4c32-8072-2beb4b7a2c9d", "metadata": {}, "source": [ "The partial prompt template will keep the provided inputs ready to execute the full lookup-and-format operation when needed:" ] }, { "cell_type": "code", "execution_count": null, "id": "7ce69822-7049-4161-8286-69f9e130e42f", "metadata": {}, "outputs": [], "source": [ "print(cassPartialPrompt.format(user_question='Em verdade, o que quiseres?'))" ] }, { "cell_type": "markdown", "id": "f6a7c163-32d1-42b2-a7ca-63b37d8ba0e0", "metadata": {}, "source": [ "You can partial on any choice of input variables, even mixing database-bound and regular inputs:" ] }, { "cell_type": "code", "execution_count": null, "id": "b309e5fe-5515-4280-a584-c4b4dcf641b1", "metadata": {}, "outputs": [], "source": [ "cassPartialPrompt2 = cassPrompt.partial(city='lisbon', user_question='Estou perto do Tejo?')\n", "\n", "print(cassPartialPrompt2.format(name='Pedro'))" ] }, { "cell_type": "markdown", "id": "94b1044b-4fb6-4135-995d-653be3574f59", "metadata": {}, "source": [ "## Chat Prompt Templates" ] }, { "cell_type": "markdown", "id": "00890926-02e6-4995-88d9-ff9732a11a52", "metadata": {}, "source": [ "The Cassandra-specific approach can be seamlessly integrated\n", "with LangChain's \"chat prompt templates\", which represent a (template-based) way to manage chat exchanges.\n", "\n", "Start with a prompt, not much dissimilar from what you've seen so far:" ] }, { "cell_type": "code", "execution_count": null, "id": "7f09fd9e-77fa-4e5a-ad9d-299cd31a82e3", "metadata": {}, "outputs": [], "source": [ "systemTemplate = \"\"\"\n", "You are a chat assistant, helping a user of age {user_age} from a city\n", "they refer to as {city_nickname}.\n", "\"\"\"\n", "\n", "cassSystemPrompt = CassandraReaderPromptTemplate(\n", " session=None,\n", " keyspace=None,\n", " template=systemTemplate,\n", " input_variables=[],\n", " field_mapper={\n", " 'user_age': ('people', 'age'),\n", " 'city_nickname': ('nickname_by_city', 'nickname'),\n", " },\n", ")" ] }, { "cell_type": "markdown", "id": "5a9915c8-ff18-4f47-b4e0-9319c3e035d3", "metadata": {}, "source": [ "Next, you need specific abstractions to wrap this \"system prompt\" as part of a broader chat exchange:" ] }, { "cell_type": "code", "execution_count": null, "id": "9a96803b-4c2f-466b-ad7a-0c8c64031541", "metadata": {}, "outputs": [], "source": [ "from langchain.prompts import (\n", " ChatPromptTemplate,\n", " SystemMessagePromptTemplate,\n", " HumanMessagePromptTemplate,\n", ")\n", "systemMessagePrompt = SystemMessagePromptTemplate(prompt=cassSystemPrompt)" ] }, { "cell_type": "markdown", "id": "598cf408-be02-4a0b-8e02-f96453f43b6e", "metadata": {}, "source": [ "### A sequence of messages\n", "\n", "Once you wrap a single prompt template as a \"system message prompt\", go ahead and make it part of a longer chat conversation:" ] }, { "cell_type": "code", "execution_count": null, "id": "f3f54ad2-378a-49d0-ab7e-664682b73793", "metadata": {}, "outputs": [], "source": [ "humanTemplate = \"{text}\"\n", "humanMessagePrompt = HumanMessagePromptTemplate.from_template(humanTemplate)\n", "\n", "cassChatPrompt = ChatPromptTemplate.from_messages(\n", " [systemMessagePrompt, humanMessagePrompt]\n", ")" ] }, { "cell_type": "markdown", "id": "300b3a25-fe7a-4d53-b49b-b1bf42e82ec9", "metadata": {}, "source": [ "### Formatting\n", "\n", "LangChain takes care of correctly propagating the formatting steps throughout the sequence of messages, including the Cassandra-backed template:" ] }, { "cell_type": "code", "execution_count": null, "id": "9ba93311-8bf5-4475-9681-9a6fbc0edf35", "metadata": {}, "outputs": [], "source": [ "print(cassChatPrompt.format_prompt(\n", " city='turin',\n", " name='beppe',\n", " text='Assistant, please help me!'\n", ").to_string())" ] }, { "cell_type": "markdown", "id": "608d2143-0a80-4dd3-ba8e-4661a20c96e3", "metadata": {}, "source": [ "### Partialing and Chat Prompt Templates\n", "\n", "In some cases, you may want to partial with respect to the database lookup key(s) even within a chat prompt template:" ] }, { "cell_type": "code", "execution_count": null, "id": "110cbc15-66f3-400a-9b32-416d5cdb2289", "metadata": {}, "outputs": [], "source": [ "cassChatPartialPrompt = cassChatPrompt.partial(\n", " city='turin',\n", " name='beppe'\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "89c39495-875b-4dc0-a297-b4075925f0c7", "metadata": {}, "outputs": [], "source": [ "print(cassChatPartialPrompt.format(text=\"Hahaha!\"))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3" } }, "nbformat": 4, "nbformat_minor": 5 }