ChatGPT AI: QA Bot Using Semantic Search (RAG) (Difficulty: 4)
CreatiCode last edited by info-creaticode
So far, we have been using ChatGPT’s existing knowledge and reasoning capability in our projects. For example, when we ask ChatGPT a medical question, we do not need to teach ChatGPT anything, since it has already been trained with a lot of medical knowledge and questions.
However, ChatGPT’s knowledge may not be enough in many applications.
- First, ChatGPT doesn’t know about events that have happened after its knowledge cut-off date (time of its latest training data).
- Also, ChatGPT only remembers the probability of which word will come next, and it doesn’t remember the exact content of any source. If we want ChatGPT to answer questions solely based on a specific medical book, it will likely fail or hallucinate (make up an answer).
In this tutorial, you will learn a very useful skill: how to teach ChatGPT some new knowledge. Specifically, you will build a chatbot that can answer questions about CreatiCode. Because CreatiCode is fairly new at the time of writing (November 2023), ChatGPT does not know anything about CreatiCode. We will teach ChatGPT some new knowledge about CreatiCode, so that it can help answer questions about CreatiCode based on this knowledge.
Before we start coding, let me explain the key method we will be using: semantic search. ‘Semantic’ means ‘meaning’, so ‘semantic search’ means to “search by meaning”.
For example, suppose I have a list of 5 questions about a book:
- What’s the name of the book?
- How many pages are in the book?
- What’s the book called?
- How much does the book cost?
- Where can I buy this book?
Now I have a new question “What’s the price of the book?”, and I want to find which of the 5 questions is similar to this new question.
If we use a traditional search method, we would need to find which existing question has the most words in common with the new question. Obviously, these 2 questions share the most words (5 out of 6):
- What’s the name of the book?
- What’s the price of the book?
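The word-counting approach can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular search engine works; the helper names (`words`, `shared_word_count`, `keyword_search`) are made up for this example.

```python
import re
from collections import Counter

QUESTIONS = [
    "What's the name of the book?",
    "How many pages are in the book?",
    "What's the book called?",
    "How much does the book cost?",
    "Where can I buy this book?",
]

def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())

def shared_word_count(a, b):
    """Count the words two sentences have in common (repeated words included)."""
    return sum((Counter(words(a)) & Counter(words(b))).values())

def keyword_search(query, candidates):
    """Return the candidate that shares the most words with the query."""
    return max(candidates, key=lambda c: shared_word_count(query, c))

query = "What's the price of the book?"
best = keyword_search(query, QUESTIONS)
print(best, shared_word_count(query, best))
```

The search picks “What's the name of the book?” with 5 shared words, while the question that actually asks about the price shares only 2 words (“the” and “book”) with the query.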
However, thanks to the development of large language models, we now have a brand new tool called ‘semantic search’, which allows us to search this way: which of the 5 known questions is closest in meaning to the new question? Using semantic search, we will get a much better answer: ‘How much does the book cost?’
Essentially ‘semantic search’ allows us to find sentences that are most similar in meaning, even if the words in these sentences are very different. As you shall see below, this will be very useful for our project.
If you are curious about how semantic search works, here is a simplified explanation.
Let’s still use the example above. We will first convert each of the 5 questions to a vector, which is like an arrow on the XY plane:
- What’s the name of the book? -> (x = 0.9, y = 0.1)
- How many pages are in the book? -> (x = 0.3, y = 0.8)
- What’s the book called? -> (x = 0.4, y = 0.2)
- How much does the book cost? -> (x = 0.5, y = 0.5)
- Where can I buy this book? -> (x = 0.15, y = 0.9)
We can illustrate these 5 arrows like this:
Now, when we get a query question “What’s the price of the book?”, it will also be mapped to a vector, such as (x=0.45, y=0.4). We can draw this new vector as an arrow as well:
Next, we just need to compare all 5 blue arrows, and find the one that has the smallest angle from the red arrow. As shown, the arrow for “How much does the book cost?” has the smallest angle from the red arrow, which means these 2 arrows have the most similar directions. So we can use that question as the search result.
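To check the arrow comparison by hand, here is a small Python sketch that measures the angle between the query vector and each of the 5 vectors listed above. The 2D numbers come from this simplified example, not from a real embedding model, and the function names are chosen just for illustration.

```python
import math

VECTORS = {
    "What's the name of the book?": (0.9, 0.1),
    "How many pages are in the book?": (0.3, 0.8),
    "What's the book called?": (0.4, 0.2),
    "How much does the book cost?": (0.5, 0.5),
    "Where can I buy this book?": (0.15, 0.9),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two 2D vectors (1.0 means same direction)."""
    dot = a[0] * b[0] + a[1] * b[1]
    return dot / (math.hypot(*a) * math.hypot(*b))

def angle_degrees(a, b):
    """Angle between two vectors in degrees; clamped to guard against rounding."""
    cosine = max(-1.0, min(1.0, cosine_similarity(a, b)))
    return math.degrees(math.acos(cosine))

query = (0.45, 0.4)  # "What's the price of the book?"
closest = min(VECTORS, key=lambda q: angle_degrees(VECTORS[q], query))

for question, vector in VECTORS.items():
    print(round(angle_degrees(vector, query), 1), question)
print("closest:", closest)
```

The smallest angle belongs to “How much does the book cost?”, which matches the picture. In practice the cosine of the angle is often used directly as the similarity score, since a smaller angle gives a cosine closer to 1.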
The process of mapping a sentence to a vector is called “embedding”, because it is like placing each sentence as a stone onto a wall at the right place. In reality, the vectors usually have hundreds or thousands of dimensions rather than 2, and the best embedding models work really well at measuring the similarity between 2 pieces of text.
Now let’s get started with our QA Bot project. We will use the following project as the starting point. Please open it in your account and remix it:
In the project, a table ‘data’ has already been defined. It has 2 columns named ‘key’ and ‘answer’. Our first task is to populate this table with some questions and answers. The idea is that we will provide some standard questions and answers to ChatGPT as new knowledge so that it can answer other questions based on these answers.
To save time, a CSV file has been prepared that contains the questions and answers. Please download it here:
After you have downloaded the sampledata.csv file, import it into your data table. You can right-click on the data table in the stage, select ‘import’, then find the sampledata.csv file on your computer.
After the data is imported, you can put your mouse over any cell to review its content. For each row, the ‘key’ column is a question, and the ‘answer’ column is the corresponding answer.
If you are building another project, you can prepare the questions and answers using Google Sheets or Excel, and then export the table as a CSV file. Then you can upload the file into the ‘data’ table. Note that it is important that the first column is named ‘key’, since that’ll be the column we search by. You can have any number of columns after the ‘key’ column.
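If you prefer to generate the CSV file with a script instead of a spreadsheet, a minimal Python sketch could look like this. It assumes a plain comma-separated file with a header row is what the import expects; the `to_csv` helper is hypothetical, and the 3 sample rows are question/answer pairs that appear later in this tutorial.

```python
import csv
import io

rows = [
    {"key": "Is CreatiCode free to use?",
     "answer": "Most of the features are free"},
    {"key": "Who is CreatiCode's target user?",
     "answer": "CreatiCode is a simple language for learners of all ages"},
    {"key": "can I write 3D programs on CreatiCode?",
     "answer": "Yes"},
]

def to_csv(rows, columns):
    """Render the rows as CSV text, insisting that 'key' is the first column."""
    if columns[0] != "key":
        raise ValueError("the first column must be named 'key'")
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(rows, ["key", "answer"])
print(csv_text)

# To save it to disk, write csv_text to a file such as "mydata.csv".
```

The `csv` module takes care of quoting, so answers that contain commas or quotation marks will still import as a single cell.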
Next, we will build a ‘semantic database’ using the data from the table. Please drag this block into the editor by itself, make sure the ‘data’ table is selected from the dropdown, and then click on it to run it.
You will not notice any change, but behind the scenes, the following has happened:
- The data in the table have been sent to the CreatiCode server.
- A ‘semantic database’ has been created based on the data, which will be used for semantic search.
Note that at the time of writing, each project can have only one semantic database, which should be enough for most applications. If you run the ‘create semantic database’ block multiple times, each later run will remove the existing semantic database first.
In addition, for users without a subscription, the ‘data’ table is currently limited to 100 rows. For users with subscriptions, the limit is 1000 rows. If you are building an application that requires a higher limit, please reach out to firstname.lastname@example.org
Now we will build a chatbot that will utilize this semantic database. Before we start coding, please quickly review the existing blocks in the ‘Chat’ sprite. They serve these 2 basic functions:
- When the project starts, send an initial request to ChatGPT and display its response;
- Whenever the user enters some input, display that input, then send it to ChatGPT, and then display the response as well.
Now we are going to verify that ChatGPT doesn’t know much about CreatiCode with a question: How much does creaticode cost?
As shown, ChatGPT doesn’t know the answer. We will need to “teach” it, and then we will ask the same question again.
To teach ChatGPT more about CreatiCode, the basic idea is the following: Given a user question, we will first search among the questions that we have prepared to find a few questions that are similar, then we tell ChatGPT to answer the user question based on the search result.
You might be wondering this: why don’t we just feed ChatGPT with all the questions and answers we have prepared? This method might work, but there are a few reasons why it’s a bad idea:
- When we have a lot of QA pairs, they may take up a large number of tokens. It may exceed the token limit of ChatGPT, and it would also cost a lot of money.
- When we feed a lot of information to ChatGPT, it may take longer to read all of it, and it may also fail to find the relevant information.
Therefore, we will use semantic search to find only questions that have similar meaning to the user question, so their answers will provide the most relevant information for answering the user question.
We need to add the “search semantic database” block when we get the user question. Let’s add this block by itself for now to test it first.
In this new block, the first 3 inputs are specified:
- Query: The first input specifies what we are searching for. In this case, we are searching for the test question “How much does Creaticode cost?”.
- Number of Results: The second input specifies how many items will be returned by the search, such as 3. The search results will be ranked by how similar they are to the query, so we will get the top 3 questions that are the most similar to the user input.
- Result Table: The third input specifies the table for storing the result items. Each item returned will be stored as one row in this table.
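Conceptually, a search like this scores every stored question against the query, sorts by score, and keeps the top few. The Python sketch below illustrates that idea only; it is not the actual implementation behind the block. The 2D vectors are invented stand-ins for real embeddings, and the `search` function is hypothetical.

```python
import math

# Hypothetical pre-computed vectors; a real system gets these from an embedding model.
DATABASE = [
    {"key": "Is CreatiCode free to use?",
     "answer": "Most of the features are free", "vector": (0.9, 0.3)},
    {"key": "Who is CreatiCode's target user?",
     "answer": "CreatiCode is a simple language for learners of all ages",
     "vector": (0.4, 0.8)},
    {"key": "can I write 3D programs on CreatiCode?",
     "answer": "Yes", "vector": (0.1, 0.9)},
]

def cosine_similarity(a, b):
    """Similarity of two vectors by direction; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, database, number_of_results):
    """Return the best-matching rows as a list of score/key/answer records."""
    scored = [
        {"score": cosine_similarity(query_vector, row["vector"]),
         "key": row["key"], "answer": row["answer"]}
        for row in database
    ]
    scored.sort(key=lambda row: row["score"], reverse=True)
    return scored[:number_of_results]

query_vector = (0.8, 0.35)  # invented vector for "How much does Creaticode cost?"
result = search(query_vector, DATABASE, 2)
for row in result:
    print(round(row["score"], 2), row["key"], "->", row["answer"])
```

Each returned record has a score, a key, and an answer, ranked from the most similar to the least similar.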
Now let’s run this block by itself by clicking on it. To review the results, we need to look at the ‘result’ table. Since the stage window is occupied by the chat widget, we need to remove that widget as well, using the “remove all widgets” block:
After we run both the “search semantic database” and the “remove all widgets” blocks, add the “result” table to the stage, and hide the “data” table:
You should now see that the “result” table contains 3 rows. The first column “score” represents how similar that item is to the query. It is usually between 0 and 1. The second and third columns are “key” and “answer”, which are the questions and answers we added to the semantic database earlier. This is working well since the first row is the question “Is CreatiCode free to use?”, which is strongly related to our query of “How much does Creaticode cost?”
Note: if you get an empty “result” table, that means you did not successfully create the semantic database in step 3, so please try that step again.
Now let’s add the “search semantic database” block to our program, so it uses the user input as the query. Let’s also detach the ChatGPT block below it, since we need to improve our prompt before we can send it to ChatGPT.
After this change, whatever the user says will be used to run a search, and the result will be stored in the “result” table.
Next, we will extract the data from the “result” table, and embed it in the request that we send to ChatGPT. First, create 2 variables named “key” and “answer”, and try to read the content of the first row from the result table like this:
Click on these 2 blocks to run them, and you can check these 2 variables’ values on the stage:
Next, let’s read all rows out of the result table. We can use a ‘for-loop’, which sets an index variable to go from 1 to the total number of rows like this:
In our case, the row count is 3 for the “result” table, so this for-loop will run 3 iterations.
As we go through each row of the result table, we can use the “join” block to join the content of that row into a “reference” variable. The reference variable will start as “”, and we will append new content from each row to this variable. We will use the “\n” character to join the questions and answers, which adds a new line between them. After the loop finishes, we can print out the reference to review its value.
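For readers who think in text-based code, the same loop can be sketched in Python. The list of rows below stands in for the “result” table (the score column is left out), and the exact order in which the pieces are joined is an assumption, since it depends on how you arrange your “join” blocks.

```python
# Stand-in for the "result" table: one dictionary per row.
result = [
    {"key": "Is CreatiCode free to use?",
     "answer": "Most of the features are free"},
    {"key": "Who is CreatiCode's target user?",
     "answer": "CreatiCode is a simple language for learners of all ages"},
    {"key": "can I write 3D programs on CreatiCode?",
     "answer": "Yes"},
]

reference = ""  # start empty, then append one question/answer pair per row

# The index goes from 1 to the row count, like the for-loop block.
for index in range(1, len(result) + 1):
    row = result[index - 1]  # Python lists start at 0, table rows start at 1
    reference = reference + row["key"] + "\n" + row["answer"] + "\n"

print(reference)
```

After 3 iterations, the reference holds 6 lines: each question followed by its answer.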
If you run this stack of blocks, the reference should be printed out in the console panel like this:
Now we are ready to create a new request to ChatGPT with this new reference information. The basic idea is to say the following to ChatGPT: Hey, ChatGPT, we need you to answer this question, and by the way, here is some information you can use as reference.
More specifically, we need to compose a “request” using 3 parts: the user’s input, a “[REFERENCE]” tag to indicate the reference section, and the reference we have retrieved above.
Now if we run the program, we can see the following request being printed in the console panel:
Now let’s send the value of the “request” variable to ChatGPT and display its response:
This is what you should get from ChatGPT:
This is clearly a significant improvement compared to our initial test result at step 5. ChatGPT appears to be very knowledgeable about CreatiCode.
Currently, we are just using a very simple request, and there is a lot of room to improve it. The most obvious problem with the current response is that it is not specific enough. Clearly, ChatGPT is using all 3 question/answer pairs in our reference to compose the answer for the user.
Can you try to change our instruction in the request, so that ChatGPT will only provide a specific and direct answer like this?
Don’t look at the answer below before you spend some time on this challenge. This is a very good exercise on how to fine-tune the prompt to get the desired response from ChatGPT.
Have you found a solution? Don’t look at the answer if not!
OK. If you have figured out what to say in the prompt, great job! This is not an easy task, since ChatGPT can be pretty ‘stubborn’ and would insist on using all the information in the reference.
Here is one solution to make ChatGPT more specific. We will compose the request like this:
user input: how much does creaticode cost?
Instruction: below are some common questions/answers, but they may not be what the user is asking. can you compose a concise answer that only responds to the user's question?
Is CreatiCode free to use?
Most of the features are free
Who is CreatiCode's target user?
CreatiCode is a simple language for learners of all ages
can I write 3D programs on CreatiCode?
Yes
As you can see, we are making these 2 changes:
- We divide the request into 2 sections: “user input” and “instruction”.
- In the instruction, we explicitly tell ChatGPT some of the questions/answers may not be what the user is asking, and we are asking it to compose an answer that ONLY responds to the user question.
To compose this new request, we can modify the code this way:
Specifically, this is the third part of the request that you can copy to your code:
\nInstruction: below are some common questions/answers, but they may not be what the user is asking. can you compose a concise answer that only responds to the user's question?
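In text form, composing this request is a simple string join. The Python sketch below assumes the 3 parts are joined in the order shown in the example request above, with a new line before the reference; `compose_request` is a made-up helper name.

```python
INSTRUCTION = (
    "\nInstruction: below are some common questions/answers, but they may "
    "not be what the user is asking. can you compose a concise answer that "
    "only responds to the user's question?"
)

def compose_request(user_input, reference):
    """Join the labeled user input, the instruction, and the reference text."""
    return "user input: " + user_input + INSTRUCTION + "\n" + reference

reference = "\n".join([
    "Is CreatiCode free to use?",
    "Most of the features are free",
    "Who is CreatiCode's target user?",
    "CreatiCode is a simple language for learners of all ages",
    "can I write 3D programs on CreatiCode?",
    "Yes",
])

request = compose_request("how much does creaticode cost?", reference)
print(request)
```

Keeping the instruction in one constant makes it easy to try different wordings and compare how ChatGPT responds to each.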
Now please try to test your program again.
Besides updating the prompt, we can also improve the questions/answers in our semantic database.
For example, currently in our database, we have this pair of question/answer:
Q: Is CreatiCode free to use?
A: Most of the features are free
The answer is a bit vague, and ChatGPT won’t be able to provide a better answer beyond what we provide it.
As an example, let’s try to improve the answer to the following:
Most of the features are free to use. However, for non-paying users, there is a rate limit to some of the premium blocks, such as the ChatGPT block or the text-to-speech block.
To make this change, you can remove all widgets from the stage, update the content of the “data” table (not the “result” table), and then run the “create semantic database” block by itself again.
Now we can test it again. You will see that ChatGPT provides an updated answer based on our new answer:
In addition, we can verify that the reference information we provide to ChatGPT indeed contains the updated answer:
Now you have learned a very powerful technique: we first retrieve some reference information using semantic search, then use that information in our request to ChatGPT. This is often called “Retrieval-Augmented Generation”, or RAG for short. That’s because we are using the information we have retrieved to augment ChatGPT’s response generation.
This project is just a simple example of how to use RAG to enhance ChatGPT’s response. There are many more things you can achieve with this technique.
Now try to build another QA Bot for something you know well and ChatGPT doesn’t.
For example, you can create a QA bot about a person you know well, or about a particular place like a restaurant or a park, or an organization like a school club or a non-profit (e.g. https://visionsvcb.org/), or about an object like a book or an electronic device.
You should first test that ChatGPT doesn’t already know about the subject. Otherwise, it will be hard to show how much new knowledge you have taught ChatGPT.
As you may have guessed, the most important factor for a successful QA bot is high-quality input data. It will take a lot of work to prepare such data. It might be a good idea to work as a team on this project.