AI - Use Neural Network Model for Training and Prediction (Difficulty: 5)
Introduction
Neural Network (NN) Models are the core building blocks for many modern AI models, including large language models like ChatGPT.
In this tutorial, you will learn to build a simple NN model that predicts the result of a math expression.
Step 1 - Starting Project
Please open and remix this project as the starting point:
https://play.creaticode.com/projects/663058eedc3cd90d25ae3b11
This project contains some basic blocks that add 5 buttons for creating, training, testing, saving and loading NN models. 2 empty tables for “training data” and “test data” are also available. Lastly, an “error” variable will be used to display the prediction error.
Step 2 - Create an Empty NN Model
When button0 is clicked, we will create an empty NN model like this:
The model will be named “m1”, and we will be using this name to refer to this model in many places below.
Step 3 - Add a Layer
A neural network model is made of many “layers” of “neurons”. Each neuron is a small calculator that takes some input values and calculates an output value based on the input values. A layer is just an array of such neurons. For example, this picture below shows an NN model with 3 layers: the first layer has 2 neurons, the second layer has 5 neurons, and the last layer has 1.
source: https://www.makeuseof.com/neural-network-vs-deep-learning-are-they-different/
For our example project, we will build a 3-layer NN model like the one drawn above. It will take 2 input values at layer 1, and output 1 value as the final output at layer 3.
Note that the first layer is always the “input layer”, so we don’t need to explicitly add it.
We can start by adding the second layer like this:
A few notes:
- Its input shape “2” must match the number of input variables in the first layer, which is 2.
- Its output size of “5” is how many neurons are used in the second layer.
- The “activation” setting refers to how each neuron’s output value is determined from its input values. You can choose “relu” for most situations.
After this step, we have a 2-layer NN model now.
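To make the idea of a layer of neurons more concrete, here is a minimal Python sketch of what one such layer might compute. It is not CreatiCode's actual implementation: the function names `relu` and `dense_layer` are made up for illustration, and the random starting weights are an assumption. Each neuron multiplies every input by a weight, adds a bias, and passes the total through the “relu” activation, which turns negative totals into 0.

```python
import random


def relu(value):
    """Return the value itself if it is positive, otherwise 0."""
    return max(0.0, value)


def dense_layer(inputs, weights, biases, activation=relu):
    """Compute one layer: each neuron is a weighted sum of the inputs plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(total))
    return outputs


random.seed(0)
# Input shape 2 and output size 5: five neurons, each holding two weights.
# Random starting weights are an assumption made for this sketch.
hidden_weights = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(5)]
hidden_biases = [0.0] * 5

hidden_values = dense_layer([3.0, 4.0], hidden_weights, hidden_biases)
print(len(hidden_values))  # 5
```

Because the weights start out random, the 5 output values carry no meaning yet. Training is what gives them useful values.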
Step 4 - Add Another Layer
Now let’s add the third layer:
Its input shape must match the output size of the previous layer, which is 5. Since this is the last layer, its output size also has to match the number of variables we are predicting, which is 1.
After this step, we have built a 3-layer model, which takes 2 input variables, calculates 5 values based on them, then aggregates these 5 values into one output variable.
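As a rough illustration of how data flows through the finished model, the Python sketch below pushes one (x, y) pair through a 5-neuron layer and then a 1-neuron layer. The weights are hand-picked for this example rather than learned, the helper names are hypothetical, and the sketch assumes the last layer applies no activation, which the tutorial does not specify.

```python
def relu(value):
    return max(0.0, value)


def identity(value):
    return value


def run_layer(inputs, layer):
    """Apply one layer: weighted sum plus bias for each neuron, then activation."""
    return [
        layer["activation"](sum(w * x for w, x in zip(ws, inputs)) + b)
        for ws, b in zip(layer["weights"], layer["biases"])
    ]


def predict(model, inputs):
    """Feed the inputs through every layer in order."""
    values = inputs
    for layer in model:
        values = run_layer(values, layer)
    return values


def count_parameters(model):
    """Count every weight and bias in the model."""
    weights = sum(len(ws) for layer in model for ws in layer["weights"])
    biases = sum(len(layer["biases"]) for layer in model)
    return weights + biases


# Hand-picked (not trained) weights, chosen only to make the arithmetic easy to follow.
model = [
    {"weights": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]],
     "biases": [0.0] * 5, "activation": relu},
    {"weights": [[2.0, 3.0, 0.0, 0.0, 0.0]],
     "biases": [0.0], "activation": identity},
]

print(predict(model, [1.0, 2.0]))  # [8.0]
print(count_parameters(model))     # 21
```

With these particular weights the model happens to output x * 2 + y * 3 for non-negative inputs. The 21 parameters (15 in the 5-neuron layer, 6 in the output layer) are the numbers that training adjusts.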
Step 5 - Compile the Model
Now we need to compile the NN model before using it:
This step allows us to configure the model with a few parameters:
- The “loss” calculation method is set to “meanSquareError”: this is a common way to calculate the prediction error of the model. The basic idea is the following: suppose we have 10 sets of training data, and when we run the model, we generate a prediction for each set. We calculate the “error” for each prediction, square it, then calculate the average of all these squared values.
- The optimizer is set to “adam”. This is a commonly used method that controls how the model adjusts its neurons based on the errors it gets during training.
- The learning rate is set to 0.01, which controls how much the model changes its neurons in each training cycle.
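The mean squared error calculation can be sketched in a few lines of Python. The function name and the sample numbers below are made up for illustration: each error is squared, and the squared errors are then averaged.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between true values and predictions."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("both lists must be non-empty and the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Made-up sample values: the errors are 1, -2, 0 and 0.
actual = [8.0, 13.0, 5.0, 10.0]
predicted = [7.0, 15.0, 5.0, 10.0]

print(mean_squared_error(actual, predicted))  # (1 + 4 + 0 + 0) / 4 = 1.25
```

Squaring makes every error positive and penalizes large misses more heavily than small ones.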
By now we have completed the model creation, but the model only contains some random values for the neurons in it, and it can not do anything useful. Therefore, the next task is to train the model with some data.
Step 6 - Generate Training Data
We will first need to generate some training data. Please define a new block “generate training data”:
Step 7 - Define the “training data” Table
To store the training data, we need to add 3 columns of “x”, “y” and “z” in the “training data” table:
Step 8 - Add Training Data
Now we will use a repeat loop to add 1024 rows of data into the “training data” table. For each row, it will have a random number for the x and y values, and the z value is calculated as x * 2 + y * 3:
Of course, once you are done with this project, you can change the z value to a more complex expression.
Now if you run the “generate training data” block, you will get 1024 rows of data like this:
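For readers who want to see the same loop in a text-based language, here is a minimal Python sketch. The tutorial does not state the range of the random numbers, so the range of 0 to 10 used here is an assumption, as are the function name and the choice to store each row as a dictionary.

```python
import random


def generate_rows(count, low=0.0, high=10.0, seed=None):
    """Build rows with random x and y, and z calculated as x * 2 + y * 3."""
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        x = rng.uniform(low, high)  # assumed range; adjust to match your project
        y = rng.uniform(low, high)
        rows.append({"x": x, "y": y, "z": x * 2 + y * 3})
    return rows


training_data = generate_rows(1024, seed=1)
print(len(training_data))  # 1024
print(training_data[0])    # one row with its x, y and z values
```

To try a more complex expression later, only the line that calculates z needs to change.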
Step 9 - Train the NN Model
Now we can use the training data to train the model:
When we “train” the model, we use the model to predict the z value repeatedly using the given x and y values, and we try to adjust the value of the neurons in the model based on the prediction error we get. Over time, the prediction error will reduce as we adjust the model.
This block takes these inputs:
- The model to be trained is named “m1”
- The training data is in the given table
- We are using rows 1 to 1024 from the table for the training: you can change these 2 inputs to use less data if you like.
- Input columns are “x,y”: these are the column names of the input variables in the table, separated by commas. They have to match the names of the columns in the “training data” table.
- Output column is “z”, which is the desired output value from the model for each row. It also has to match the column name in the training data table.
- Batch size is 128: this controls how many rows of data are used to train the model in each cycle.
- Epochs: this is the number of full passes over the training data. The higher the value, the more chances the model gets to improve its parameters, and in general the smaller the prediction error we will get.
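To show how batch size and epochs fit together, here is a deliberately simplified Python sketch of a training loop. It is a stand-in, not what the training block does internally: it trains a single linear neuron instead of the 3-layer model, uses plain gradient descent instead of “adam”, draws x and y between 0 and 1, and uses a learning rate of 0.1. All of these are assumptions chosen to keep the example short and stable.

```python
import random


def make_rows(count, rng):
    """Rows of (x, y, z) with z = x * 2 + y * 3; inputs between 0 and 1 (assumed)."""
    rows = []
    for _ in range(count):
        x, y = rng.random(), rng.random()
        rows.append((x, y, x * 2 + y * 3))
    return rows


def train(rows, epochs=100, batch_size=128, learning_rate=0.1, seed=0):
    """Mini-batch gradient descent on the model z = w_x * x + w_y * y + bias."""
    rng = random.Random(seed)
    w_x = w_y = bias = 0.0
    loss_per_epoch = []
    for _ in range(epochs):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        # One epoch: walk through all rows, one batch at a time.
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]
            grad_x = grad_y = grad_b = 0.0
            for x, y, z in batch:
                error = (w_x * x + w_y * y + bias) - z
                # Gradient of the mean squared error for this batch.
                grad_x += 2 * error * x / len(batch)
                grad_y += 2 * error * y / len(batch)
                grad_b += 2 * error / len(batch)
            w_x -= learning_rate * grad_x
            w_y -= learning_rate * grad_y
            bias -= learning_rate * grad_b
        loss = sum((w_x * x + w_y * y + bias - z) ** 2 for x, y, z in rows) / len(rows)
        loss_per_epoch.append(loss)
    return (w_x, w_y, bias), loss_per_epoch


rows = make_rows(1024, random.Random(1))
(w_x, w_y, bias), losses = train(rows)
print(round(w_x, 2), round(w_y, 2))  # close to 2 and 3
print(losses[0] > losses[-1])        # the error shrinks as training proceeds
```

With 1024 rows and a batch size of 128, each epoch performs 8 updates, so 100 epochs give the model 800 chances to adjust its parameters.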
Step 10 - Create Test Data
Next, we will test our model to see how well it can predict the z variable for any given x and y variables. When the “test” button is clicked, we will first create 100 rows of test data:
As shown, the code is very similar to how we create the training data, except that the table name is changed to “test data”, and we are generating only 100 rows. Note that there is a new column “prediction” as well, which will be used to store the prediction from the model.
When we run this block, we will see some test data like this:
Step 11 - Run the Model for Prediction
Next, we can use the model “m1” to run prediction for the data in the “test data” table:
The inputs are:
- model name: “m1”
- table containing the data: “test data”
- rows: we are using row 1 to 100 of this table
- input columns: the x and y columns will be used as the 2 input variables for the model
- output column: the prediction output will be stored in this column
When we run this block, it will run the model on every row of the test data, and store the corresponding prediction output in the “prediction” column. Here is an example result:
Note that the predictions are very close to the actual z values, but they are still not exactly the same. That’s because the NN model doesn’t really know that the z variable is 2 times x plus 3 times y. Instead, it tries to guess this relationship by adjusting its neurons repeatedly. If we train the model more, it will generally get more accurate.
Step 12 - Calculate Root Mean Square Error (RMSE)
To measure how well our model is doing, we can calculate the Root Mean Square Error (RMSE) using the test data. RMSE is a common metric used to measure how well a model predicts numeric variables. It tells us roughly how far, on average, the model’s predictions are from the true values.
Here is the logic for calculating RMSE. We go through each row in the test data table, and calculate the difference between the z value and the prediction, and take a square of this difference, and add it to the “error” variable. In the end, we calculate the average of the squared differences, and calculate the square root of it.
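The same logic can be sketched in Python as follows. The function name and the four sample rows are made up for illustration. The column names “z” and “prediction” match the test data table.

```python
import math


def root_mean_square_error(rows):
    """Square each difference, average the squares, then take the square root."""
    total = 0.0
    for row in rows:
        difference = row["z"] - row["prediction"]
        total += difference ** 2
    return math.sqrt(total / len(rows))


# Made-up sample rows: the differences are 3, -4, 0 and 0.
test_rows = [
    {"z": 8.0, "prediction": 5.0},
    {"z": 13.0, "prediction": 17.0},
    {"z": 5.0, "prediction": 5.0},
    {"z": 10.0, "prediction": 10.0},
]

print(root_mean_square_error(test_rows))  # sqrt((9 + 16 + 0 + 0) / 4) = 2.5
```

Taking the square root at the end brings the result back to the same unit as z, which makes it easier to interpret than the squared error.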
Note that the RMSE depends on the test data. If we click the “test” button a few times, we will see the error value change each time, but it stays in roughly the same range:
Step 13 - Re-train the Model
Suppose we are not happy with the current error value we are getting. We can train the model again. The easiest way to improve it is to train it through more iterations, so it gets more chances to adjust its neurons to reduce the prediction error. For example, we can change the training epochs from 100 to 500 and train the model again:
After the training completes, we can test it again:
We can clearly see that the errors are significantly smaller than before.
Step 14 - Saving and Loading the Model
If you are building the model for other users, you don’t want to require everyone to train the model. They should be able to simply load the model you have trained and use it for prediction. You can save the model parameters to the CreatiCode server, and then load the model from the server in the future. Note that you can only load models that are saved in the same project.
Here are the blocks:
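Conceptually, saving a model means writing its learned parameters somewhere they can be read back later. The Python sketch below illustrates the idea with a local JSON file. This is only an analogy: CreatiCode stores the model on its server, and the file name, the dictionary layout and the function names here are all assumptions made for this example.

```python
import json
import os
import tempfile


def save_model(model, path):
    """Write the model's parameters to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(model, handle)


def load_model(path):
    """Read the parameters back so the model can be used without retraining."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


# A hypothetical layout for a trained model's parameters.
model = {
    "name": "m1",
    "layers": [
        {"weights": [[1.0, 0.0], [0.0, 1.0]], "biases": [0.0, 0.0], "activation": "relu"},
        {"weights": [[2.0, 3.0]], "biases": [0.0], "activation": "none"},
    ],
}

path = os.path.join(tempfile.gettempdir(), "m1.json")
save_model(model, path)
restored = load_model(path)
print(restored == model)  # True
```

Only the parameters are stored, not the training data, which is why loading a saved model is much faster than training it again.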
For a test, try to save the model now, then reload the project. The model will not be available in the playground when you reload the project, so when we test the model, it will not generate any prediction. However, if we load the model, then it can be used to predict again:
Step 15 - Next Steps
Here is the final project:
https://play.creaticode.com/projects/6544eaa9f569b60441928e1f
This is a very simple example, and there are many ways to play with the model more. Here are some suggestions:
- A more complex expression: you can try to change how the true value of z is calculated, and see if the model can still be trained to predict z accurately. You can also add more variables besides x and y.
- A more powerful model: our current model is very simple. You can add more layers to it and make each layer bigger. Note that the first layer’s size should always match the number of input variables, and the last layer’s output size should match the number of variables you are predicting.