Deep Learning Workflow in PyTorch
Super Kai (Kazuya Ito)
Posted on June 8, 2024
*Memos:
- My post explains Linear Regression in PyTorch.
- My post explains Batch, Mini-Batch and Stochastic Gradient Descent with DataLoader() in PyTorch.
- My post explains Batch Gradient Descent without DataLoader() in PyTorch.
- My post explains how to save a model in PyTorch.
- My post explains how to load a saved model in PyTorch.
- My repo has models.
- Prepare dataset.
- Prepare a model.
- Train the model.
- Test the model.
- Save the model.
1. Prepare dataset.
(1) Get a dataset such as images, video, sound, text, etc.
(2) Divide the dataset into one part for training(train data) and one part for testing(test data). *Basically, the train data is 80% and the test data is 20%.
(3) Shuffle the datasets with DataLoader(), as in the sketch below:
*Memos:
- Basically, datasets are shuffled to mitigate Overfitting.
- Basically, only the train data is shuffled; the test data is not.
- My post explains Overfitting and Underfitting.
- My post explains DataLoader().
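Below is a minimal sketch of (1) to (3) with a hypothetical toy regression dataset. *The data, sizes and variable names here are just assumptions for illustration, not part of the posts above.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

# (1) A hypothetical toy dataset: 100 samples with 1 feature and 1 target each.
X = torch.linspace(0, 1, 100).unsqueeze(dim=1)
y = 3 * X + 0.5 + 0.1 * torch.randn_like(X)
dataset = TensorDataset(X, y)

# (2) Divide the dataset into 80% train data and 20% test data.
train_size = int(0.8 * len(dataset))
test_size = len(dataset) - train_size
train_data, test_data = random_split(dataset, [train_size, test_size])

# (3) Shuffle only the train data with DataLoader(); the test data is not shuffled.
train_loader = DataLoader(dataset=train_data, batch_size=10, shuffle=True)
test_loader = DataLoader(dataset=test_data, batch_size=10, shuffle=False)
```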
2. Prepare a model.
(1) Select suitable layers for the dataset. *My post explains layers in PyTorch.
(2) Select activation functions if necessary, as in the sketch below. *My post explains activation functions in PyTorch.
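Below is a minimal model sketch for the toy dataset above: two Linear() layers with a ReLU() activation in between. *The class name and layer sizes are just assumptions for illustration.

```python
from torch import nn

# A minimal model sketch(1 input feature -> 8 hidden units -> 1 output).
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer1 = nn.Linear(in_features=1, out_features=8)
        self.act = nn.ReLU()                           # Activation function.
        self.layer2 = nn.Linear(in_features=8, out_features=1)

    def forward(self, x):
        return self.layer2(self.act(self.layer1(x)))   # Input layer -> output layer.

model = SimpleModel()
```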
3. Train the model.
(1) Select a suitable loss function and optimizer for the dataset.
(2) Calculate the model's predictions from the input data(train data), working from the input layer to the output layer. *This calculation is called Forward Propagation or Forward Pass.
(3) Calculate the mean(average) of the sum of the losses(differences) between the model's predictions and true values(train data) using a loss function.
(4) Zero out the gradients of all tensors every training(epoch) for proper calculation. *The gradients are accumulated(added) in buffers rather than overwritten whenever backward() is called, so they must be cleared before the next backward pass.
(5) Calculate the gradients using the average loss(difference) calculated in (3), working from the output layer to the input layer. *This calculation is called Backpropagation or Backward Pass.
(6) Update the model's parameters(weight and bias) by gradient descent with an optimizer, using the gradients calculated in (5), to minimize the mean(average) of the sum of the losses(differences) between the model's predictions and true values(train data).
*Memos:
- The tasks from (2) to (6) are one training(epoch).
- Basically, the training(epoch) is repeated with a for loop to minimize the mean(average) of the sum of the losses(differences) between the model's predictions and true values(train data).
- Basically, the model is tested (4. Test the model) after (6), either every training(epoch) or once every n trainings(epochs).
- A minimal training loop sketch of (1) to (6) is shown below.
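Below is a minimal training loop sketch covering (1) to (6), continuing from the model and train_loader sketched above. *MSELoss() and SGD() with lr=0.01 and 100 epochs are just example choices for this toy regression, not the only suitable ones.

```python
import torch
from torch import nn

# (1) A loss function and an optimizer suitable for this toy regression.
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epochs = 100
for epoch in range(epochs):
    model.train()
    for X_batch, y_batch in train_loader:
        pred = model(X_batch)          # (2) Forward Pass.
        loss = loss_fn(pred, y_batch)  # (3) Mean of the losses between predictions and true values.
        optimizer.zero_grad()          # (4) Zero out the accumulated gradients.
        loss.backward()                # (5) Backpropagation.
        optimizer.step()               # (6) Update weight and bias by gradient descent.
```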
4. Test the model.
(1) Calculate the model's predictions from the input data(test data).
(2) Calculate the mean(average) of the sum of the losses(differences) between the model's predictions and true values(test data) with a loss function.
(3) Show each mean(average) of the sum of the losses(differences) for the train and test data as text or a graph, as in the sketch below.
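Below is a minimal testing sketch covering (1) to (3), continuing from the model, loss_fn and test_loader sketched above. *Gradients are not needed for testing, so the calculation runs under torch.no_grad(); showing the result as text with print() is just one option.

```python
import torch

model.eval()                  # Switch the model to evaluation mode.
test_loss = 0.0
with torch.no_grad():         # No gradients are needed for testing.
    for X_batch, y_batch in test_loader:
        pred = model(X_batch)                       # (1) Predictions from test data.
        test_loss += loss_fn(pred, y_batch).item()  # (2) Loss against true values(test data).
test_loss /= len(test_loader)
print(f"Test loss: {test_loss:.4f}")                # (3) Show the result as text.
```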
5. Save the model.
Finally, save the model.
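Below is a minimal saving sketch. *Saving state_dict() is the commonly recommended way in PyTorch, and "model.pth" is just an example file name; my posts above explain saving and loading a model in more detail.

```python
import torch

# Save only the learned parameters(state_dict) of the model.
torch.save(model.state_dict(), "model.pth")

# Later, load the parameters back into a model with the same architecture.
loaded_model = SimpleModel()
loaded_model.load_state_dict(torch.load("model.pth"))
loaded_model.eval()
```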