Published 2025-04-28 · Updated 2025-04-28 · Tech Learning / Deep Learning

PyTorch Automatic Mixed Precision (AMP)

AMP stands for Automatic Mixed Precision.

It can help reduce the running time and memory footprint of deep learning networks. This post briefly introduces how to use it.

1. Introduction

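PyTorch officially provides a mixed-precision mechanism: by mixing torch.float32 and torch.float16, it reduces the running time and memory footprint of the network during training.

Operations such as linear layers and convolutions run much faster in float16, while others, such as reductions, are better suited to float32.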
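To illustrate why reductions tend to prefer float32, here is a minimal sketch that emulates a float16 accumulator in plain Python. The helpers `to_fp16` and `running_sum_fp16` are hypothetical names introduced only for this example; rounding is done with the `struct` module's half-precision format, so the snippet needs neither PyTorch nor a GPU.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def running_sum_fp16(values):
    """Accumulate in emulated float16: the total is rounded after every add."""
    total = 0.0
    for v in values:
        total = to_fp16(total + to_fp16(v))
    return total


values = [1.0] * 4096
fp16_total = running_sum_fp16(values)
wide_total = sum(values)  # Python floats are 64-bit and stand in for a wider accumulator

print(fp16_total)  # 2048.0: above 2048 the float16 spacing is 2, so adding 1.0 is lost
print(wide_total)  # 4096.0
```

The float16 running total stalls at 2048 because 2049 is not representable in half precision, which is the kind of error a wider accumulator avoids.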

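When building a deep learning training script in PyTorch, AMP usually means using torch.autocast together with torch.amp.GradScaler (exposed as torch.cuda.amp.GradScaler in older releases).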

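Note: the float16 plus GradScaler workflow shown in this post targets CUDA GPUs. torch.autocast itself also accepts device_type="cpu", typically with torch.bfloat16 as the lower-precision dtype, but that setup is not covered here.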
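One way to keep a single script usable on machines with and without a GPU is to derive the AMP switch from CUDA availability. This is a minimal sketch; the variable names `device` and `use_amp` are choices made for this example.

```python
import torch

# Pick the device at runtime and enable AMP only when a CUDA GPU is present
device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"

print(f"device={device}, use_amp={use_amp}")
```

The resulting `use_amp` flag can then be passed as the `enabled` argument of both torch.autocast and GradScaler, so the same training loop runs in full precision when no GPU is available.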

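2. Usage

In the train.py script that trains the network, you only need to add a few lines of mixed-precision code.

The official example code is summarized below.

Default training loop without AMP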

import torch

# make_model, data, targets, loss_fn, start_timer and end_timer_and_print
# are defined in the official example and are not shown here.
net = make_model(in_size, out_size, num_layers)
opt = torch.optim.SGD(net.parameters(), lr=0.001)

start_timer()
for epoch in range(epochs):
    for input, target in zip(data, targets):
        output = net(input)
        loss = loss_fn(output, target)
        loss.backward()
        opt.step()
        opt.zero_grad() # set_to_none=True here can modestly improve performance
end_timer_and_print("Default precision:")

Adding AMP

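torch.autocast acts as a context manager that runs the enclosed region of the script in mixed precision.
- Operations that need higher precision, such as BN or softmax, are not well suited to float16. autocast chooses the dtype for each operation automatically, so the user does not have to manage it.
torch.amp.GradScaler (torch.cuda.amp.GradScaler in older releases) conveniently performs the gradient-scaling steps.
- It helps prevent small-magnitude gradients from flushing to zero (underflow) during mixed-precision training.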
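The underflow problem and the effect of scaling can be sketched with plain arithmetic. The example below emulates float16 rounding with the `struct` module; `to_fp16` is a hypothetical helper, the gradient value is made up for illustration, and the scale factor 2**16 is assumed to match GradScaler's default initial scale.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


true_grad = 1e-8   # illustrative gradient, smaller than float16 can represent
scale = 65536.0    # 2**16, assumed here as the scale factor

unscaled = to_fp16(true_grad)        # stored directly in float16: flushes to zero
scaled = to_fp16(true_grad * scale)  # scaled first: survives as a normal float16 value
recovered = scaled / scale           # unscaled again in higher precision

print(unscaled)   # 0.0
print(recovered)  # close to 1e-8
```

Multiplying the loss by the scale factor scales every gradient by the same amount through the chain rule, which is why scaling the loss once is enough.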

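import torch

use_amp = True
device = "cuda"  # the float16 workflow below assumes a CUDA GPU

# make_model, data, targets, loss_fn and the timer helpers
# are defined in the official example and are not shown here.
net = make_model(in_size, out_size, num_layers)
opt = torch.optim.SGD(net.parameters(), lr=0.001)
scaler = torch.amp.GradScaler("cuda", enabled=use_amp)

start_timer()
for epoch in range(epochs):
    for input, target in zip(data, targets):
        # Forward pass and loss computation run in mixed precision
        with torch.autocast(device_type=device, dtype=torch.float16, enabled=use_amp):
            output = net(input)
            loss = loss_fn(output, target)
        scaler.scale(loss).backward()  # backward pass on the scaled loss
        scaler.step(opt)               # unscales gradients; skips the step if they contain inf/NaN
        scaler.update()                # adjusts the scale factor for the next iteration
        opt.zero_grad() # set_to_none=True here can modestly improve performance
end_timer_and_print("Mixed precision:")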

During validation
AMP can also be used here: simply wrap the forward pass in autocast. GradScaler is not needed, because there is no backward pass.

model.eval()
with torch.no_grad():
    with torch.autocast(device_type="cuda", dtype=torch.float16):
        output = model(input)

https://zhouwentong7.github.io/2025/04/28/Pytorch自动混合精度AMP/

Author

Zhou
