Doubt in Resolving Overfitting and Underfitting
Atul.A.Das
Posted on February 28, 2024
Hey guys, I've recently been developing an artificial neural network for a binary classification problem: predicting whether a particular employee will get promoted or not. The dataset has 54,808 rows. I'm using an 80:20 train-test split, and I built the model with dropout layers. However, I'm getting some really weird results.
The code is as follows:
import torch
import torch.nn as nn

class ANN(nn.Module):
    def __init__(self, input_features=5, h1=60, h2=60, h3=45, h4=45, output_features=1, dropout_prob=0.4):
        super().__init__()
        # Four hidden layers, each followed by LeakyReLU and dropout
        self.fc1 = nn.Linear(input_features, h1)
        self.relu1 = nn.LeakyReLU()
        self.dropout1 = nn.Dropout(p=dropout_prob, inplace=False)
        self.fc2 = nn.Linear(h1, h2)
        self.relu2 = nn.LeakyReLU()
        self.dropout2 = nn.Dropout(p=dropout_prob, inplace=False)
        self.fc3 = nn.Linear(h2, h3)
        self.relu3 = nn.LeakyReLU()
        self.dropout3 = nn.Dropout(p=dropout_prob, inplace=False)
        self.fc4 = nn.Linear(h3, h4)
        self.relu4 = nn.LeakyReLU()
        self.dropout4 = nn.Dropout(p=dropout_prob, inplace=False)
        self.output = nn.Linear(h4, output_features)
        self.output_activation_function = nn.Sigmoid()

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu1(x)
        x = self.dropout1(x)
        x = self.fc2(x)
        x = self.relu2(x)
        x = self.dropout2(x)
        x = self.fc3(x)
        x = self.relu3(x)
        x = self.dropout3(x)
        x = self.fc4(x)
        x = self.relu4(x)
        x = self.dropout4(x)
        x = self.output(x)
        x = self.output_activation_function(x)
        return x

    def flatten_parameters(self):
        # Concatenate every parameter tensor into a single 1-D vector
        flattened_parameters = []
        for param in self.parameters():
            flattened_parameters.append(param.flatten())
        return torch.cat(flattened_parameters)

model = ANN()
flattened_params = model.flatten_parameters()
print(f"Flattened Parameters: {flattened_params}")
print(f"Shape: {flattened_params.shape}")
from sklearn.model_selection import train_test_split
X_train,X_test,Y_train,Y_test=train_test_split(X_resampled[['awards_won','avg_training_score','previous_year_rating','education','region']],Y_resampled,test_size=0.2,random_state=42)
from sklearn.preprocessing import StandardScaler
scaler=StandardScaler()
X_train_scaled=scaler.fit_transform(X_train)
X_test_scaled=scaler.transform(X_test)
print("Successfully Scaled")
from torch import tensor as tn
X_train_scaled_tensor=tn(X_train_scaled,dtype=torch.float32)
X_test_scaled_tensor=tn(X_test_scaled,dtype=torch.float32)
Y_train_tensor=tn(Y_train,dtype=torch.int64)
Y_test_tensor=tn(Y_test,dtype=torch.int64)
Y_train_tensor = Y_train_tensor.unsqueeze(1)
Y_test_tensor = Y_test_tensor.unsqueeze(1)
print("Tensors created")
import torch.optim as optim

criterion = nn.BCEWithLogitsLoss()  # expects raw logits; note the model's forward already applies a sigmoid
optimizer = optim.Adam(model.parameters(), lr=0.00001, betas=(0.95, 0.999), eps=1e-7, weight_decay=0.0001, amsgrad=False)

num_epochs = 100
batch_size = 32
print(Y_train_tensor.dtype)

for epoch in range(num_epochs):
    model.train()  # set to training mode (dropout active)
    for i in range(0, len(X_train_scaled_tensor), batch_size):
        outputs = model(X_train_scaled_tensor[i:i + batch_size])
        loss = criterion(outputs, Y_train_tensor[i:i + batch_size].float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    model.eval()  # set to evaluation mode (dropout disabled)
    with torch.no_grad():
        outputs = model(X_test_scaled_tensor)
        predictions = torch.round(outputs)
        accuracy = (predictions == Y_test_tensor).sum().item() / len(Y_test_tensor)
    # note: the printed loss is only that of the final mini-batch of the epoch
    print(f"Epoch {epoch + 1}, Loss: {loss.item():.4f}, Test Accuracy: {accuracy:.4f}")
The last part gives these weird results:
torch.int64
Epoch 1, Loss: 0.6607, Test Accuracy: 0.5084
Epoch 2, Loss: 0.6602, Test Accuracy: 0.5084
Epoch 3, Loss: 0.6509, Test Accuracy: 0.5084
Epoch 4, Loss: 0.6422, Test Accuracy: 0.5347
Epoch 5, Loss: 0.6585, Test Accuracy: 0.6086
Epoch 6, Loss: 0.6395, Test Accuracy: 0.6358
Epoch 7, Loss: 0.6488, Test Accuracy: 0.6451
Epoch 8, Loss: 0.6417, Test Accuracy: 0.6504
Epoch 9, Loss: 0.6505, Test Accuracy: 0.6533
Epoch 10, Loss: 0.6433, Test Accuracy: 0.6570
Epoch 11, Loss: 0.6376, Test Accuracy: 0.6598
Epoch 12, Loss: 0.6424, Test Accuracy: 0.6600
Epoch 13, Loss: 0.6412, Test Accuracy: 0.6609
Epoch 14, Loss: 0.6360, Test Accuracy: 0.6622
Epoch 15, Loss: 0.6475, Test Accuracy: 0.6631
Epoch 16, Loss: 0.6541, Test Accuracy: 0.6643
Epoch 17, Loss: 0.6539, Test Accuracy: 0.6653
Epoch 18, Loss: 0.6331, Test Accuracy: 0.6656
Epoch 19, Loss: 0.6458, Test Accuracy: 0.6657
Epoch 20, Loss: 0.6363, Test Accuracy: 0.6661
Epoch 21, Loss: 0.6193, Test Accuracy: 0.6659
Epoch 22, Loss: 0.6422, Test Accuracy: 0.6660
Epoch 23, Loss: 0.6311, Test Accuracy: 0.6673
Epoch 24, Loss: 0.6477, Test Accuracy: 0.6682
Epoch 25, Loss: 0.6207, Test Accuracy: 0.6687
Epoch 26, Loss: 0.6352, Test Accuracy: 0.6710
Epoch 27, Loss: 0.6402, Test Accuracy: 0.6721
Epoch 28, Loss: 0.6323, Test Accuracy: 0.6716
Epoch 29, Loss: 0.6454, Test Accuracy: 0.6732
Epoch 30, Loss: 0.6303, Test Accuracy: 0.6735
Epoch 31, Loss: 0.6361, Test Accuracy: 0.6734
Epoch 32, Loss: 0.6385, Test Accuracy: 0.6745
Epoch 33, Loss: 0.6333, Test Accuracy: 0.6754
Epoch 34, Loss: 0.6469, Test Accuracy: 0.6768
Epoch 35, Loss: 0.6028, Test Accuracy: 0.6780
Epoch 36, Loss: 0.6260, Test Accuracy: 0.6771
Epoch 37, Loss: 0.6230, Test Accuracy: 0.6801
Epoch 38, Loss: 0.6486, Test Accuracy: 0.6790
Epoch 39, Loss: 0.6383, Test Accuracy: 0.6808
Epoch 40, Loss: 0.6248, Test Accuracy: 0.6810
Epoch 41, Loss: 0.6400, Test Accuracy: 0.6811
Epoch 42, Loss: 0.6406, Test Accuracy: 0.6818
Epoch 43, Loss: 0.6053, Test Accuracy: 0.6822
Epoch 44, Loss: 0.6365, Test Accuracy: 0.6824
Epoch 45, Loss: 0.6580, Test Accuracy: 0.6831
Epoch 46, Loss: 0.6454, Test Accuracy: 0.6843
Epoch 47, Loss: 0.6489, Test Accuracy: 0.6845
Epoch 48, Loss: 0.6146, Test Accuracy: 0.6858
Epoch 49, Loss: 0.6071, Test Accuracy: 0.6869
Epoch 50, Loss: 0.6227, Test Accuracy: 0.6866
Epoch 51, Loss: 0.6185, Test Accuracy: 0.6871
Epoch 52, Loss: 0.6240, Test Accuracy: 0.6887
Epoch 53, Loss: 0.6312, Test Accuracy: 0.6887
Epoch 54, Loss: 0.6216, Test Accuracy: 0.6885
Epoch 55, Loss: 0.6287, Test Accuracy: 0.6881
Epoch 56, Loss: 0.6261, Test Accuracy: 0.6892
Epoch 57, Loss: 0.6083, Test Accuracy: 0.6897
Epoch 58, Loss: 0.6348, Test Accuracy: 0.6898
Epoch 59, Loss: 0.6443, Test Accuracy: 0.6901
Epoch 60, Loss: 0.6102, Test Accuracy: 0.6924
Epoch 61, Loss: 0.6331, Test Accuracy: 0.6901
Epoch 62, Loss: 0.6264, Test Accuracy: 0.6910
Epoch 63, Loss: 0.6017, Test Accuracy: 0.6911
Epoch 64, Loss: 0.6241, Test Accuracy: 0.6915
Epoch 65, Loss: 0.6350, Test Accuracy: 0.6927
Epoch 66, Loss: 0.6080, Test Accuracy: 0.6933
Epoch 67, Loss: 0.6064, Test Accuracy: 0.6928
Epoch 68, Loss: 0.6013, Test Accuracy: 0.6930
Epoch 69, Loss: 0.6134, Test Accuracy: 0.6947
Epoch 70, Loss: 0.6079, Test Accuracy: 0.6932
Epoch 71, Loss: 0.6371, Test Accuracy: 0.6936
Epoch 72, Loss: 0.6320, Test Accuracy: 0.6951
Epoch 73, Loss: 0.6258, Test Accuracy: 0.6943
Epoch 74, Loss: 0.6089, Test Accuracy: 0.6949
Epoch 75, Loss: 0.6142, Test Accuracy: 0.6949
Epoch 76, Loss: 0.6109, Test Accuracy: 0.6965
Epoch 77, Loss: 0.6138, Test Accuracy: 0.6972
Epoch 78, Loss: 0.6077, Test Accuracy: 0.6964
Epoch 79, Loss: 0.6300, Test Accuracy: 0.6964
Epoch 80, Loss: 0.6348, Test Accuracy: 0.6976
Epoch 81, Loss: 0.6145, Test Accuracy: 0.6982
Epoch 82, Loss: 0.6276, Test Accuracy: 0.6991
Epoch 83, Loss: 0.6181, Test Accuracy: 0.7001
Epoch 84, Loss: 0.6333, Test Accuracy: 0.6989
Epoch 85, Loss: 0.6119, Test Accuracy: 0.6994
Epoch 86, Loss: 0.5859, Test Accuracy: 0.6993
Epoch 87, Loss: 0.6312, Test Accuracy: 0.7005
Epoch 88, Loss: 0.6394, Test Accuracy: 0.7007
Epoch 89, Loss: 0.6410, Test Accuracy: 0.7014
Epoch 90, Loss: 0.6238, Test Accuracy: 0.7024
Epoch 91, Loss: 0.6405, Test Accuracy: 0.7026
Epoch 92, Loss: 0.6310, Test Accuracy: 0.7029
Epoch 93, Loss: 0.6087, Test Accuracy: 0.7042
Epoch 94, Loss: 0.6277, Test Accuracy: 0.7035
Epoch 95, Loss: 0.6142, Test Accuracy: 0.7045
Epoch 96, Loss: 0.6347, Test Accuracy: 0.7045
Epoch 97, Loss: 0.5915, Test Accuracy: 0.7058
Epoch 98, Loss: 0.6408, Test Accuracy: 0.7059
Epoch 99, Loss: 0.6111, Test Accuracy: 0.7053
Epoch 100, Loss: 0.6109, Test Accuracy: 0.7073
Is this by any chance an indicator of overfitting or underfitting? If it is, how do I resolve it?
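(For reference, this is the kind of per-epoch check I was thinking of adding to tell the two apart, by comparing train loss against test loss; a rough, untested sketch using the names from my code above.)

def epoch_losses(model, criterion, X_train, Y_train, X_test, Y_test):
    # A growing gap between train and test loss would suggest overfitting;
    # both staying high would suggest underfitting.
    model.eval()
    with torch.no_grad():
        train_loss = criterion(model(X_train), Y_train.float()).item()
        test_loss = criterion(model(X_test), Y_test.float()).item()
    return train_loss, test_loss

# e.g. at the end of each epoch:
# train_loss, test_loss = epoch_losses(model, criterion,
#     X_train_scaled_tensor, Y_train_tensor, X_test_scaled_tensor, Y_test_tensor)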
I have attached the dataset with this case as well. The features were chosen based on their positive correlation values.
https://drive.google.com/file/d/1UMUpuBvJJP1069EI1ZA4mTSA0vq1yi3u/view?usp=sharing
Thanks in advance, guys!