MC (Monte Carlo) Dropout is a technique that keeps dropout active at inference time and averages several stochastic forward passes, giving more robust predictions and a usable estimate of model uncertainty on top of the regularisation that dropout already provides during training. PyTorch is a widely used open-source deep learning framework. In this article, we will explore how to implement MC Dropout in PyTorch.
MC Dropout stands for Monte Carlo Dropout. Dropout is a regularisation technique that randomly drops units (along with their connections) during training, which helps prevent overfitting. MC Dropout extends this idea to inference: instead of disabling dropout at test time, we keep it active, run the forward pass several times, and average the predictions. Each pass samples a different sub-network, so the average is a Monte Carlo estimate of the predictive distribution, and the spread across passes indicates how uncertain the model is.
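Before turning to the full example, here is a minimal, self-contained sketch of the idea; the toy model, input shape, and number of passes are made up purely for illustration and are not part of the MNIST example below.

import torch
import torch.nn as nn

# A toy network with a dropout layer, just to illustrate MC Dropout
toy = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 2))
x = torch.randn(3, 4)

toy.train()  # keep dropout active at inference (safe here: the toy model has no batch norm)
with torch.no_grad():
    preds = torch.stack([toy(x) for _ in range(20)])  # 20 stochastic forward passes
print(preds.mean(dim=0))  # Monte Carlo estimate of the prediction
print(preds.std(dim=0))   # spread across passes, a rough uncertainty signal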
Let's implement MC Dropout in PyTorch using a simple example. We will be using the popular MNIST dataset.
First, let's import the required libraries.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms
Next, we need to load the dataset and create a dataloader.
batch_size = 64
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size)
We will be using a simple CNN as our model.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)  # channel-wise dropout on the conv feature maps
        self.dropout2 = nn.Dropout(0.5)     # regular dropout on the flattened features
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)
    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output
model = Net()
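As a quick sanity check (this snippet is not part of the original example), we can push a dummy MNIST-sized batch through the network. It also shows where the 9216 in fc1 comes from: the two 3x3 convolutions shrink the 28x28 image to 24x24, the 2x2 max-pool halves that to 12x12, and 64 channels of 12x12 features give 64 * 12 * 12 = 9216 values.

dummy = torch.zeros(1, 1, 28, 28)  # one fake MNIST image
print(model(dummy).shape)          # expected: torch.Size([1, 10]) of log-probabilities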
Next, we define the training and testing functions.
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
Next, we define a function that performs MC Dropout during inference. The crucial detail is that the dropout layers must remain active while the rest of the network stays in evaluation mode; we then run the forward pass num_samples times on the same batch and average the predictions.
def predict_with_dropout(model, device, data, num_samples):
    # Put the model in eval mode, then switch ONLY the dropout layers back to
    # train mode so that each forward pass remains stochastic. This is the key
    # step of MC Dropout; with plain model.eval() every pass would be identical.
    model.eval()
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()
    with torch.no_grad():
        outputs = torch.stack([model(data) for _ in range(num_samples)])
    model.eval()
    return outputs.mean(dim=0)
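Beyond the averaged prediction, the spread of the individual passes can serve as a rough uncertainty estimate. The helper below is an optional extension rather than part of the original example (the name predict_with_uncertainty is ours); it also converts the log-probabilities back to probabilities with exp() before averaging, which is a common way to aggregate MC Dropout samples for classification.

def predict_with_uncertainty(model, data, num_samples=10):
    # Same stochastic forward passes as predict_with_dropout, but on probabilities
    model.eval()
    for m in model.modules():
        if isinstance(m, (nn.Dropout, nn.Dropout2d)):
            m.train()
    with torch.no_grad():
        probs = torch.stack([model(data).exp() for _ in range(num_samples)])
    model.eval()
    return probs.mean(dim=0), probs.std(dim=0)  # mean class probabilities and their spread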
We also define the number of samples we want to take during inference.
num_samples = 10
Now, we can train the model and perform MC Dropout during inference.
epochs = 10
lr = 0.01
momentum = 0.5
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = Net().to(device)
optimizer = optim.SGD(model.parameters(), lr=lr, momentum=momentum)
for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
# perform MC Dropout during inference
correct = 0
with torch.no_grad():
    for data, target in test_loader:
        data, target = data.to(device), target.to(device)
        output = predict_with_dropout(model, device, data, num_samples)
        pred = output.argmax(dim=1, keepdim=True)
        correct += pred.eq(target.view_as(pred)).sum().item()

print('\nMC Dropout Test set: Accuracy: {}/{} ({:.0f}%)\n'.format(
    correct, len(test_loader.dataset), 100. * correct / len(test_loader.dataset)))
MC Dropout is a simple but effective way to turn a dropout-regularised network into one that produces averaged, uncertainty-aware predictions at inference time, and PyTorch makes it straightforward to implement. In this article, we saw how to do so with a small CNN on the MNIST dataset.