How to Set Up and Run CUDA Operations in PyTorch?

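CUDA (Compute Unified Device Architecture) is NVIDIA's proprietary parallel computing platform and programming model. With the CUDA SDK, developers can use their NVIDIA GPUs (Graphics Processing Units) to bring GPU-based parallel processing into their usual programming workflow, in place of the usual CPU-based sequential processing.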

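With the rise of deep learning in recent years, it has become clear that many operations involved in model training, such as matrix multiplication and inversion, can be parallelized to a large extent, which gives better performance and faster training cycles. Many deep learning libraries, PyTorch among them, therefore give their users a set of interfaces and utilities for using the GPU. This article covers setting up the CUDA environment on any system with a CUDA-capable GPU and briefly introduces the CUDA operations available in the PyTorch library from Python.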

Installation

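First, check your system's GPU against NVIDIA's official list of CUDA-compatible GPUs to confirm that it supports CUDA. PyTorch provides a user-friendly selector on its website that lets you choose your operating system and other requirements, which makes the installation straightforward, as shown in the figure below. We install according to the specifications shown in the figure, which match our machine.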

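Refer to the official PyTorch website and choose the options that match your own machine. We also recommend fully restarting the system after installation to make sure the toolkit works properly.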

[Figure: screenshot of the PyTorch installation page]

pip3 install torch==1.9.0+cu102 torchvision==0.10.0+cu102 torchaudio===0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
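The `+cu102` suffix in the command above marks wheels built against CUDA 10.2. As a small illustration, the sketch below extracts that CUDA version from a wheel version string. The helper name `cuda_version_from_wheel` is hypothetical, and the parsing assumes the tag follows the `cu<major><minor>` pattern with a single-digit minor version (as in `cu102` or `cu111`).

```python
import re
from typing import Optional


def cuda_version_from_wheel(version: str) -> Optional[str]:
    """Return the CUDA version encoded in a wheel version string, or None.

    Assumption: the local version label looks like "cu102" or "cu111",
    where the last digit is the minor version and the rest is the major.
    """
    match = re.search(r"\+cu(\d+)(\d)$", version)
    if match is None:
        return None  # e.g. a CPU-only build such as "1.9.0+cpu"
    major, minor = match.groups()
    return f"{major}.{minor}"


print(cuda_version_from_wheel("1.9.0+cu102"))   # 10.2
print(cuda_version_from_wheel("0.10.0+cu102"))  # 10.2
print(cuda_version_from_wheel("1.9.0+cpu"))     # None
```

The CUDA version printed here should match the one you selected on the PyTorch installation page.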


Getting Started with CUDA in PyTorch

After installation, we can interact with CUDA from PyTorch through the torch.cuda interface. We will use the following functions:

Syntax:

torch.version.cuda: A string attribute (not a function) holding the CUDA version that the installed PyTorch build was compiled with; None for CPU-only builds
torch.cuda.is_available(): Returns True if CUDA is supported by your system, else False
torch.cuda.current_device(): Returns ID of current device
torch.cuda.get_device_name(device_ID): Returns name of the CUDA device with ID = ‘device_ID’


Code:

Python3

import torch

# Check whether PyTorch can see a CUDA-capable GPU
print(f"Is CUDA supported by this system? {torch.cuda.is_available()}")

# torch.version.cuda is an attribute, not a function
print(f"CUDA version: {torch.version.cuda}")

# Storing ID of current CUDA device
cuda_id = torch.cuda.current_device()
print(f"ID of current CUDA device: {cuda_id}")

print(f"Name of current CUDA device: {torch.cuda.get_device_name(cuda_id)}")

Output:

[Figure: CUDA version]
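The code above assumes that a GPU is present. As a minimal sketch of a more defensive variant, the helper below lists every visible CUDA device and returns an empty list on a CPU-only machine. The name `describe_cuda_devices` is hypothetical; it uses `torch.cuda.device_count()`, which is not covered in the list above, to enumerate the devices.

```python
import torch


def describe_cuda_devices():
    """Return one description string per visible CUDA device (empty if none)."""
    if not torch.cuda.is_available():
        return []
    return [
        f"cuda:{index} -> {torch.cuda.get_device_name(index)}"
        for index in range(torch.cuda.device_count())
    ]


devices = describe_cuda_devices()
if devices:
    print("\n".join(devices))
else:
    print("No CUDA device found; computations will run on the CPU.")
```

Checking availability first avoids the error that the device query functions raise on a machine without a usable GPU.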

Handling Tensors with CUDA

To work with PyTorch tensors on CUDA, we can use the following utility functions:

Syntax:

Tensor.device: Returns the device name of ‘Tensor’
Tensor.to(device_name): Returns new instance of ‘Tensor’ on the device specified by ‘device_name’: ‘cpu’ for CPU and ‘cuda’ for CUDA enabled GPU
Tensor.cpu(): Returns a copy of ‘Tensor’ on the CPU (or ‘Tensor’ itself if it is already there)


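To demonstrate the functions above, we create a test tensor and perform the following operations:

Check the tensor's current device and apply a tensor operation (squaring); then transfer the tensor to the GPU, apply the same operation, and compare the results from the two devices.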

Code:

Python3

import torch
  
# Creating a test tensor
x = torch.randint(1, 100, (100, 100))
  
# Checking the device name:
# Should return 'cpu' by default
print(x.device)
  
# Applying tensor operation
res_cpu = x ** 2
  
# Transferring tensor to GPU
x = x.to(torch.device('cuda'))
  
# Checking the device name:
# Should return 'cuda:0'
print(x.device)
  
# Applying same tensor operation
res_gpu = x ** 2
  
# Checking the equality
# of the two results
assert torch.equal(res_cpu, res_gpu.cpu())

Output:

cpu
cuda:0
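The example above fails on a machine without a GPU because it requests 'cuda' unconditionally. The following sketch, under the assumption that falling back to the CPU is acceptable, performs the same comparison on whichever device is available and also shows that a tensor can be allocated directly on a device through the `device=` argument. The variable names are illustrative only.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Reference computation on the CPU
x_cpu = torch.randint(1, 100, (100, 100))
res_cpu = x_cpu ** 2

# Same computation on the selected device
x_dev = x_cpu.to(device)
res_dev = x_dev ** 2

# Tensors can also be created directly on the target device
ones = torch.ones(100, 100, dtype=x_dev.dtype, device=device)
shifted = res_dev + ones

print(x_dev.device.type)

# Bring the result back to the CPU before comparing
assert torch.equal(res_cpu, res_dev.cpu())
```

Operands of an operation must live on the same device, which is why `ones` is created with `device=device` rather than on the CPU.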

Handling Machine Learning Models with CUDA

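A good PyTorch practice is to write device-agnostic code, because some systems have no GPU and must rely on the CPU alone. Once the device has been selected, any machine learning model can be moved to it with the following function: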

Syntax: Model.to(device_name)

Returns: The same machine learning ‘Model’, with its parameters and buffers moved in place to the device specified by ‘device_name’: ‘cpu’ for CPU and ‘cuda’ for CUDA enabled GPU


In this example, we import a pre-trained ResNet-18 model from torchvision.models; readers can follow the same steps to transfer a model to the device of their choice.

Code:

Python3

import torch
import torchvision.models as models
  
# Making the code device-agnostic
device = 'cuda' if torch.cuda.is_available() else 'cpu'
  
# Instantiating a pre-trained model
model = models.resnet18(pretrained=True)
  
# Transferring the model to the selected device (GPU if available, else CPU)
model = model.to(device)
  
# Now the reader can continue the rest of the workflow
# including training, cross validation, etc!

Output:

[Figure: machine learning with CUDA]
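Moving the model is only half of the workflow: every input batch has to be on the same device as the model. The sketch below runs a single training step to illustrate this. It uses a hypothetical toy model (`nn.Linear`) and random data in place of ResNet-18 so that nothing needs to be downloaded; the layer sizes, learning rate, and batch size are arbitrary assumptions.

```python
import torch
from torch import nn

# Making the code device-agnostic
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Toy model and random data, used only for illustration
model = nn.Linear(10, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

# Inputs and targets must be moved to the same device as the model
inputs = torch.randn(32, 10).to(device)
targets = torch.randint(0, 2, (32,)).to(device)

# One optimization step
optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()

print(f"device: {device.type}, loss: {loss.item():.4f}")
```

In a real training loop, the two `.to(device)` calls on the batch would sit inside the loop over the data loader, while the model is moved once before training starts.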