ChatGPT AI: QA Bot Using Semantic Search (RAG) (Difficulty: 4)

CreatiCode

Introduction

Up until now, we’ve been using ChatGPT’s built-in knowledge and reasoning skills in our projects. For example, if you ask a medical question, ChatGPT can often answer it right away — because it was trained on lots of medical information and common medical questions.

But in many real-world projects, ChatGPT’s existing knowledge won’t be enough. Why?

ChatGPT doesn’t “know” about anything that happened after its training finished. This is called the knowledge cut-off date.
ChatGPT doesn’t memorize exact sources. It only learns patterns — like what words usually come next in a sentence. So if you ask something that depends on specific facts from a book or website, it might make something up (called “hallucinating”) instead of giving the right answer.
A lot of information is private and is not available online or in any book. For example, a doctor may have detailed medical histories of their patients, but such information is not public, so it won’t show up in ChatGPT’s training data. Similarly, an organization may have many internal documents that are not shared publicly.

In this tutorial, you will learn a very useful skill: how to teach ChatGPT some new knowledge. Specifically, you will build a chatbot that can answer questions about CreatiCode. Because CreatiCode is fairly new at the time of writing (November 2023), ChatGPT does not know much about CreatiCode. We will teach ChatGPT some new knowledge about CreatiCode, so that it can help answer questions about CreatiCode based on this knowledge.

What is Semantic Search?

Before we start coding, let’s understand the key method we’ll be using: semantic search.

“Semantic” means “meaning”, so semantic search means: searching by meaning instead of just matching words.

Let’s look at an example. Imagine we have 5 commonly asked questions about a book that we already have answers to:

What’s the name of the book?
How many pages are in the book?
What’s the book called?
How much does the book cost?
Where can I buy this book?

Now suppose someone asks: “What’s the price of the book?”. We would like to find out which of the 5 prepared questions is most similar to this new question, so that we can use the prepared answer for it.

If we use a basic search, we might look for the question that shares the most words. That might wrongly match:

“What’s the name of the book?”

This happens simply because the two questions share 5 of their 6 words.
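To see the word-matching problem in action, here is a minimal Python sketch of a basic word-overlap search. The `words` tokenizer and the `shared_words` helper are assumptions made for illustration and are not part of CreatiCode.

```python
import re
from collections import Counter

PREPARED = [
    "What's the name of the book?",
    "How many pages are in the book?",
    "What's the book called?",
    "How much does the book cost?",
    "Where can I buy this book?",
]

def words(text):
    # Assumed tokenizer: lowercase, keep only letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())

def shared_words(a, b):
    # Count the words two sentences have in common, repeats included.
    return sum((Counter(words(a)) & Counter(words(b))).values())

query = "What's the price of the book?"
best = max(PREPARED, key=lambda q: shared_words(query, q))
print(best, shared_words(query, best))  # What's the name of the book? 5
```

The question about the price only scores 2 here ("the" and "book"), so a search based purely on shared words ranks it well below the wrong match.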

Instead, with semantic search, we look for the question with the most similar meaning. This time, we get the correct match:

“How much does the book cost?”

Even though the words are quite different, the meaning is similar. That’s the power of semantic search — and why it’s so useful for our project.

How Does Semantic Search Work?

Curious how semantic search works? Let’s walk through a simple example.

Imagine we convert each sentence into a vector — which is like a little arrow pointing in a certain direction on a graph. Don’t worry about how this conversion is done at this point.

Here’s how our 5 example questions might look as XY coordinates:

What’s the name of the book? -> (x = 0.9, y = 0.1)
How many pages are in the book? -> (x = 0.3, y = 0.8)
What’s the book called? -> (x = 0.4, y = 0.2)
How much does the book cost? -> (x = 0.5, y = 0.5)
Where can I buy this book? -> (x = 0.15, y = 0.9)

We can draw each question as an arrow from the center:

Now, when someone asks: “What’s the price of the book?”

We also turn that into a vector — maybe (0.45, 0.4):

Then we look for the arrow that points in almost the same direction. In this case:

“How much does the book cost?” has the smallest angle from our new vector — so it’s the most similar!

This process of turning sentences into vectors is called embedding. Behind the scenes, real embedding vectors typically have hundreds or thousands of dimensions instead of just 2, but the idea is the same: find the text that’s closest in meaning, not just in words.
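The "smallest angle" comparison can be sketched in a few lines of Python using cosine similarity, a common way to measure how closely two vectors point in the same direction. This is only an illustration with the toy 2D vectors from the example; the tutorial does not say which similarity measure the CreatiCode server actually uses.

```python
import math

# Toy 2D vectors copied from the example above (real embeddings are much longer).
QUESTIONS = {
    "What's the name of the book?": (0.9, 0.1),
    "How many pages are in the book?": (0.3, 0.8),
    "What's the book called?": (0.4, 0.2),
    "How much does the book cost?": (0.5, 0.5),
    "Where can I buy this book?": (0.15, 0.9),
}

def cosine_similarity(a, b):
    # 1.0 means the arrows point the same way; smaller values mean a wider angle.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

query_vector = (0.45, 0.4)  # "What's the price of the book?"
closest = max(QUESTIONS, key=lambda q: cosine_similarity(query_vector, QUESTIONS[q]))
print(closest)  # How much does the book cost?
```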

Step-by-Step Guide

Now, let’s start building our QA bot using this new technique.

Step 1 - Remix the Starting Project

We will use the following project as the starting point. Please open it in your account and remix it:

play.creaticode.com/projects/65544f377f7509db74c436c7

Step 2 - Populate the ‘data’ Table

The project has a data table with two columns: key (the question) and answer.

Your job is to fill this table with good question-and-answer pairs about CreatiCode.

You can download a ready-made CSV file with QA pairs here:
https://ccdn.creaticode.com/public/sampledata.csv

To import:

Right-click (or long-press) on the data table on the stage
Choose “import”
Select the sampledata.csv file you have downloaded

Now you’ll see all the questions and answers loaded in!

After the data is imported, you can put your mouse over any cell to review its content. For each row, the ‘key’ column is a question, and the ‘answer’ column is the corresponding answer.

If you are building another project, you can prepare the questions and answers using Google Sheets or Excel, and then export the table as a CSV file. Then you can upload the file into the ‘data’ table. Note that it is important that the first column is named ‘key’, since that’ll be the column we search by. You can have any number of columns after the ‘key’ column.
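If you would rather generate the CSV file with code than with a spreadsheet, a minimal Python sketch might look like the one below. It assumes the import expects a header row whose first column is "key"; the layout has not been checked against sampledata.csv, and the file name in the comment is hypothetical.

```python
import csv
import io

# One example row taken from this tutorial; add your own rows below it.
rows = [
    {"key": "Is CreatiCode free to use?", "answer": "Most of the features are free"},
]

# An in-memory buffer keeps the sketch self-contained; to write a real file,
# use open("mydata.csv", "w", newline="") instead (hypothetical file name).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["key", "answer"])
writer.writeheader()    # the first column must be named "key"
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```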

Step 3 - Create a Semantic Database

Next, we will build a ‘semantic database’ using the data from the table. Please drag the “create semantic database” block into the editor by itself, make sure the ‘data’ table is selected, and then click on this block to run it by itself.

Even though nothing visible happens, the “data” table has been sent to the CreatiCode server, and a semantic search database has been created!

Notes:

Each project can only have one semantic database at a time.
The database can store up to 100 rows of data.
You can update the database by re-running the block with new data.

Step 4 - Review the Chat Sprite

Now we will build a chatbot that will utilize this semantic database. Before we start coding, please quickly review the existing blocks in the ‘Chat’ sprite. They serve these 2 basic functions:

When the project starts, send an initial request to ChatGPT and display its response;
Whenever the user enters some input, display it, send it to ChatGPT, and then display the new response.

Step 5 - Test Question

Now we are going to verify that ChatGPT doesn’t know much about CreatiCode with a question: How much does creaticode cost?

As shown, ChatGPT doesn’t know the answer. We will need to “teach” it, and then we will ask the same question again.

Step 6 - Run a Semantic Search

To teach ChatGPT more about CreatiCode, the basic idea is the following:

Search among the prepared questions to find a few questions that are most similar to the user’s question
Instruct ChatGPT to answer the user’s question based on these search results.

You might be wondering: why don’t we just give ChatGPT all the questions and answers we have prepared? This method might work, but there are a few reasons why it’s a bad idea:

In real-world applications, there can be thousands or even millions of question/answer pairs. That is too much to fit into a single prompt, and sending all of it with every request would cost a lot of money.
When we feed a lot of information to ChatGPT, it may be “distracted” and fail to find the most relevant information.

Therefore, we will use semantic search to find only questions that have similar meanings to the user question, so their answers will provide the most relevant information for answering the user question.

We need to run the “search semantic database” block when we get the user question. Let’s add this block by itself for now to test it first.

In this new block, the first 3 inputs are most important:

Query: The question we are searching for.
Number of Results: How many items should be returned by the search, such as 3. The search results will be ranked by how similar they are to the query.
Result Table: The table for storing the result items.

Now let’s run this block by itself by clicking on it. To review the results, we need to look at the ‘result’ table. Since the stage window is occupied by the chat widget, we need to remove that widget as well, using the “remove all widgets” block:

After we run both the “search semantic database” and the “remove all widgets” blocks, show the “result” table on the stage, and hide the “data” table:

You should now see that the “result” table contains 3 columns.

The first column, “score,” represents how similar that item is to the query. It is usually between 0 and 1, with larger values meaning more similar.
The second and third columns are “key” and “answer”, which are the questions and answers we added to the semantic database earlier. This is working well since the first row is the question “Is CreatiCode free to use?”, which is strongly related to our query of “How much does Creaticode cost?”
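To make the shape of the "result" table concrete, here is a rough Python sketch of a top-N search that returns rows with the same three columns. It reuses the toy 2D book vectors from earlier and cosine similarity as the score; the `search` function, the placeholder answers, and the scoring method are all assumptions, not the actual CreatiCode implementation.

```python
import math

# Toy database rows: (key, answer, vector). The answers are placeholders only.
DATABASE = [
    ("What's the name of the book?", "<answer 1>", (0.9, 0.1)),
    ("How many pages are in the book?", "<answer 2>", (0.3, 0.8)),
    ("What's the book called?", "<answer 3>", (0.4, 0.2)),
    ("How much does the book cost?", "<answer 4>", (0.5, 0.5)),
    ("Where can I buy this book?", "<answer 5>", (0.15, 0.9)),
]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def search(query_vector, count):
    # Score every row, sort from most to least similar, keep the top `count`.
    scored = [
        {"score": cosine(query_vector, vector), "key": key, "answer": answer}
        for key, answer, vector in DATABASE
    ]
    scored.sort(key=lambda row: row["score"], reverse=True)
    return scored[:count]

result = search((0.45, 0.4), 3)  # vector for "What's the price of the book?"
for row in result:
    print(round(row["score"], 3), row["key"], row["answer"])
```

The first row printed is the best match, just as the first row of the "result" table is the question most similar to the query.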

Feel free to test with other questions and observe how the top 3 results change based on the question.

Note: if you get an empty “result” table, that means you did not successfully create the semantic database in step 3, so please try that step again.

Step 7 - Search with the User Input

Now, let’s add the “search semantic database” block to our program so that it uses the user input as the query. Let’s also detach the blocks below it for now.

After this change, whatever the user says will be used to run a search, and the result will be stored in the “result” table.

Step 8 - Reading the First Row from the “result” Table

Next, we will extract the search results from the “result” table and include them in the request that we send to ChatGPT. First, create 2 variables named “key” and “answer”, and try to read the content of the first row from the result table like this:

Click on these 2 blocks to run them, and you can check these 2 variables’ values on the stage:

Step 9 - Reading All Rows from the “result” Table

Next, let’s read all rows out of the result table. We can use a ‘for-loop’, which sets an index variable to go from 1 to the total number of rows:

In our case, the row count is 3 for the “result” table, so this for-loop will run 3 iterations.

Step 10 - Join All Questions and Answers into a Reference

As we go through each row of the result table, we can use the “join” block to join the content of that row into a “reference” variable. The reference variable will start empty, and we will append new content from each row. We will use the “\n” character to join the questions and answers, which adds a new line between them. After the loop finishes, we can print out the reference to review its value.
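For readers who like to see the same idea as text code, the loop and join can be sketched in Python as follows. This is not the CreatiCode block stack itself; the rows are the sample question/answer pairs used in this tutorial, the score column is left out, and the trailing newline after each answer is an assumption.

```python
# Rows as they might appear in the "result" table (score column omitted).
result_rows = [
    ("Is CreatiCode free to use?", "Most of the features are free"),
    ("Who is CreatiCode's target user?",
     "CreatiCode is a simple language for learners of all ages"),
    ("can I write 3D programs on CreatiCode?", "Yes"),
]

reference = ""  # the reference starts empty
for index in range(1, len(result_rows) + 1):  # rows are numbered from 1, as in the blocks
    key, answer = result_rows[index - 1]
    # Append this row's question and answer, each followed by a new line.
    reference = reference + key + "\n" + answer + "\n"

print(reference)
```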

If you run this stack of blocks, the reference should be printed out in the console panel like this:

Step 11 - Compose a New Request to ChatGPT Using the Reference

Now we are ready to create a new request to ChatGPT with this new reference information. The basic idea is to say the following to ChatGPT: “Hey, ChatGPT, we need you to answer this question, and by the way, here is some information you can use as reference.”

More specifically, we need to compose a “request” using 3 parts: the user’s input, a “[REFERENCE]” line to indicate the reference section, and the reference we have retrieved above.
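As a rough Python sketch, the three parts can be joined like this. The `compose_request` name and the use of a single newline between the parts are assumptions; the blocks in your project may separate the parts slightly differently.

```python
def compose_request(user_input, reference):
    # Part 1: the user's question, part 2: a marker line, part 3: the retrieved Q&A text.
    return user_input + "\n[REFERENCE]\n" + reference

request = compose_request(
    "How much does creaticode cost?",
    "Is CreatiCode free to use?\nMost of the features are free\n",
)
print(request)
```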

Now run the program. You can see the complete request printed in the console panel:

Step 12 - Get ChatGPT’s Response

Now let’s send the “request” to ChatGPT and display its response:

This is an example of what you might get from ChatGPT:

This is clearly a significant improvement compared to our initial test result at step 5. ChatGPT is much more knowledgeable about CreatiCode now.

Step 13 - Make ChatGPT’s Response More Specific

Currently, we are just using a very simple request, and there is a lot of room to improve it. The most obvious problem with the current response is that it is not specific enough. Clearly, ChatGPT is using all 3 question/answer pairs in our reference to compose the answer for the user.

Can you try to change our instruction in the request, so that ChatGPT will only provide a specific and direct answer like this?

Don’t look at the answer below before you spend some time on this challenge. This is a very good exercise on how to fine-tune the prompt to get the desired response from ChatGPT.

Have you found a solution? Don’t look at the answer if not!

Step 14 - The Solution to Make ChatGPT’s Response More Specific

OK. If you have figured out what to say in the prompt, great job! This is not an easy task, since ChatGPT can be pretty ‘stubborn’ and would insist on using all the information in the reference.

Here is one solution to make ChatGPT more specific. We will compose the request like this:

user input:
how much does creaticode cost?

Instruction: below are some common questions/answers, but they may not be what the user is asking. can you compose a concise answer that only responds to the user’s question?

Is CreatiCode free to use?
Most of the features are free
Who is CreatiCode’s target user?
CreatiCode is a simple language for learners of all ages
can I write 3D programs on CreatiCode?
Yes

As you can see, we are making these 2 changes:

We divide the request into 2 sections: “user input” and “instruction”.
In the instruction, we explicitly tell ChatGPT some of the questions/answers may not be what the user is asking, and we are asking it to compose an answer that ONLY responds to the user’s question.

To compose this new request, we can modify the code this way:

Specifically, this is the third part of the request that you can copy to your code:

\nInstruction: below are some common questions/answers, but they may not be what the user is asking. can you compose a concise answer that only responds to the user’s question?
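In text form, the updated request could be assembled along these lines. This Python sketch only mirrors the layout of the sample request shown earlier in this step; the function name and the exact blank lines between sections are assumptions.

```python
INSTRUCTION = (
    "Instruction: below are some common questions/answers, but they may not be "
    "what the user is asking. can you compose a concise answer that only "
    "responds to the user's question?"
)

def compose_specific_request(user_input, reference):
    # Section 1 holds the user's question; section 2 holds the instruction,
    # followed by the retrieved questions and answers.
    return "user input:\n" + user_input + "\n\n" + INSTRUCTION + "\n\n" + reference

specific_request = compose_specific_request(
    "how much does creaticode cost?",
    "Is CreatiCode free to use?\nMost of the features are free\n",
)
print(specific_request)
```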

Now please try to test your program again.

Step 15 - Improve Our Reference Questions/Answers

Besides updating the prompt, we can also improve the questions/answers in our semantic database.

For example, currently in our database, we have this pair of question/answer:

Q: Is CreatiCode free to use?
A: Most of the features are free

The answer is a bit unclear, and ChatGPT won’t be able to provide a better answer than what we provide.

As an example, let’s try to improve the answer to the following:

Most of the features are free to use. However, for non-paying users, there is a rate limit to some of the premium blocks, such as the ChatGPT block or the text-to-speech block.

To make this change, you can remove all widgets from the stage, update the content of the “data” table (not the “result” table), and then run the “create semantic database” block by itself again.

Now we can test it again. You will see that ChatGPT provides an updated answer based on our new answer:

In addition, we can verify that the reference information we provide to ChatGPT indeed contains the updated answer:

Retrieval-Augmented Generation (RAG)

Now you have learned a very powerful technique: we first retrieve some reference information using semantic search, and then use that information in our request to ChatGPT. This is often called “Retrieval-Augmented Generation”, or RAG for short, because we are using the retrieved information to augment ChatGPT’s response generation.
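The whole flow fits in one small function. The Python sketch below is only an outline under stated assumptions: `retrieve` and `ask_model` are hypothetical stand-ins for the semantic search and the ChatGPT request, and the fake versions exist purely so the example runs without any network access.

```python
def answer_with_rag(question, retrieve, ask_model, count=3):
    # 1. Retrieve: look up the prepared questions most similar to the user's.
    rows = retrieve(question, count)
    # 2. Augment: paste the retrieved Q&A pairs into the request.
    reference = "".join(key + "\n" + answer + "\n" for key, answer in rows)
    request = question + "\n[REFERENCE]\n" + reference
    # 3. Generate: let the language model write the final answer.
    return ask_model(request)

def fake_retrieve(question, count):
    # Stand-in for the semantic search; always returns the same sample row.
    rows = [("Is CreatiCode free to use?", "Most of the features are free")]
    return rows[:count]

def fake_model(request):
    # Stand-in for ChatGPT; it only reports how long the request was.
    return "(model reply to a request of %d lines)" % len(request.splitlines())

reply = answer_with_rag("How much does creaticode cost?", fake_retrieve, fake_model)
print(reply)  # (model reply to a request of 4 lines)
```

Passing the search and the model in as functions keeps the three stages separate, so either one can be swapped out without touching the rest.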

This project is just a simple example of how to use RAG to enhance ChatGPT’s response. This technique can achieve many more things.

Build Your Own QA Bot

Now, try to build another QA Bot for something you know well and ChatGPT doesn’t.

For example, you can create a QA bot about a person you know well, or about a particular place like a restaurant or a park, or an organization like a school club or a non-profit (e.g. https://visionsvcb.org/), or about an object like a book or an electronic device.

You should first test it to make sure ChatGPT doesn’t already know about the subject; otherwise, it will be hard to show how much new knowledge you have taught ChatGPT.

As you may have guessed, the most important factor for a successful QA bot is high-quality input data. It will take a lot of work to prepare such data. It might be a good idea to work as a team on this project.