
How Mixed-Precision Computation Works (FP32 + FP16)

news 2025/7/1 2:24:34

Original article:

https://lightning.ai/pages/community/tutorial/accelerating-large-language-models-with-mixed-precision-techniques/

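Mixed-precision training runs the forward and backward passes in FP16 while keeping an FP32 copy of the weights for the update. This approach allows for efficient training while maintaining the accuracy and stability of the neural network.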

In more detail, the steps are as follows.

1. Convert weights to FP16: The weights (parameters) of the neural network, which are stored in FP32, are cast to the lower-precision FP16 format. FP16 values take half the memory of FP32 values and can be processed faster by hardware that supports FP16 arithmetic.
2. Compute gradients: The forward and backward passes of the neural network are performed using the lower-precision FP16 weights. This step calculates the gradients (partial derivatives) of the loss function with respect to the network's weights, which are used to update the weights during optimization.
3. Convert gradients to FP32: After the gradients are computed in FP16, they are converted back to the higher-precision FP32 format. This conversion keeps the weight update numerically stable: in FP16, small values can underflow to zero or be rounded away.
4. Multiply by learning rate: Now in FP32 format, the gradients are multiplied by the learning rate (a scalar that determines the step size during optimization).
5. Update weights: The product from step 4 is then used to update the original FP32 neural network weights. The learning rate controls the convergence of the optimization process and is crucial for achieving good performance.
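
The five steps can be traced on a toy problem. The following is a minimal sketch, not the tutorial's code: it assumes a one-parameter model `y_hat = w * x` with a squared-error loss, and it simulates FP16 and FP32 rounding with the standard-library `struct` module instead of a deep learning framework. The helper names (`to_fp16`, `to_fp32`, `mixed_precision_step`) and the numbers are made up for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest FP16 (IEEE 754 binary16) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_fp32(x: float) -> float:
    """Round a Python float to the nearest FP32 (IEEE 754 binary32) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_precision_step(w_master: float, x: float, y: float, lr: float) -> float:
    """One SGD step for y_hat = w * x with loss 0.5 * (y_hat - y) ** 2."""
    # Step 1: cast the FP32 master weight to an FP16 working copy.
    w16 = to_fp16(w_master)

    # Step 2: forward and backward pass, rounding each intermediate to FP16.
    x16 = to_fp16(x)
    y_hat = to_fp16(w16 * x16)           # forward
    err = to_fp16(y_hat - to_fp16(y))    # dLoss/dy_hat
    grad16 = to_fp16(err * x16)          # dLoss/dw

    # Step 3: convert the FP16 gradient to FP32.
    grad32 = to_fp32(grad16)

    # Step 4: multiply by the learning rate in FP32.
    update = to_fp32(lr * grad32)

    # Step 5: apply the update to the FP32 master weight.
    return to_fp32(w_master - update)


w = 0.0  # FP32 master weight (hypothetical starting value)
for _ in range(200):
    w = mixed_precision_step(w, x=2.0, y=3.0, lr=0.05)

print(w)  # close to 1.5, the solution of 2 * w = 3
```

Only the value returned by `mixed_precision_step` persists between iterations, so the FP16 copy is a throwaway working value and the FP32 master weight is the source of truth.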

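In short:

w_new = w_old - lr * g, where g, w_old, and w_new are all FP32.

Everything else used while computing the gradients (the working copy of w, the activations, the gradients, and so on) is FP16.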
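
A small numeric check shows why the weight update is kept in FP32. This sketch assumes the standard SGD rule `w_new = w_old - lr * g` and uses made-up values for `w_old`, `lr`, and `g`; the rounding helper `round_to` is hypothetical and relies on the standard-library `struct` module.

```python
import struct


def round_to(fmt: str, x: float) -> float:
    """Round x to the format given by a struct code: '<e' is FP16, '<f' is FP32."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


w_old = 1.0   # hypothetical weight
lr = 1e-3     # hypothetical learning rate
g = 0.1       # hypothetical gradient
update = lr * g  # 1e-4, smaller than half the FP16 spacing just below 1.0

# Entire update done in FP16: the step is rounded away.
w_new_fp16 = round_to("<e", round_to("<e", w_old) - round_to("<e", update))

# Update done in FP32: the step survives.
w_new_fp32 = round_to("<f", w_old - update)

print(w_new_fp16 == w_old)  # True: the FP16 weight did not move
print(w_new_fp32 < w_old)   # True: the FP32 weight moved
```

FP16 values just below 1.0 are spaced 2^-11 (about 0.00049) apart, so a step of 0.0001 rounds back to 1.0. Repeated over many iterations, such lost steps would stall training, which is the motivation for an FP32 master copy.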

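Training results:

Training time drops to roughly 1/2 to 1/3 of the FP32 time.

GPU memory usage changes little. Memory goes up because an extra FP16 copy of the weights is stored, and goes down because the activations saved during the forward pass are now FP16; the two effects roughly cancel out.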
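
The memory argument can be written as back-of-the-envelope arithmetic. This is a rough sketch under strong assumptions: it counts only the two terms mentioned above (weights and saved activations) and ignores gradients, optimizer state, and framework overhead. The parameter and activation counts are hypothetical.

```python
BYTES_FP32 = 4
BYTES_FP16 = 2


def fp32_training_bytes(n_params: int, n_activations: int) -> int:
    """Weights and saved activations, both stored in FP32."""
    return BYTES_FP32 * n_params + BYTES_FP32 * n_activations


def mixed_training_bytes(n_params: int, n_activations: int) -> int:
    """FP32 master weights plus an FP16 copy, with activations stored in FP16."""
    weights = (BYTES_FP32 + BYTES_FP16) * n_params
    activations = BYTES_FP16 * n_activations
    return weights + activations


# Hypothetical sizes: 100M parameters, 120M saved activation values per step.
n_params = 100_000_000
n_activations = 120_000_000

fp32_total = fp32_training_bytes(n_params, n_activations)
mixed_total = mixed_training_bytes(n_params, n_activations)

print(f"FP32 training:  {fp32_total / 1e6:.0f} MB")
print(f"Mixed training: {mixed_total / 1e6:.0f} MB")
```

Under these assumptions the extra weight copy costs 2 bytes per parameter and the FP16 activations save 2 bytes per activation value, so the totals stay close whenever the two counts are of similar size. When activations dominate (large batches or long sequences), mixed precision would save more.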

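Inference results:

GPU memory usage is halved, and inference time drops to about 1/2 of the FP32 time.

With FP16, test accuracy actually went up. The explanation is a regularization effect: lower precision introduces noise, which helps the model generalize better and reduces overfitting. In the words of the original article: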

A likely explanation is that this is due to regularizing effects of using a lower precision. Lower precision may introduce some level of noise in the training process, which can help the model generalize better and reduce overfitting, potentially leading to higher accuracy on the validation and test sets.

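bf16 has more exponent bits than FP16 (8 instead of 5), so it covers a much larger numeric range. This makes training more robust and reduces the chance of overflow and underflow.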
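
The range difference follows directly from the bit layouts. The sketch below derives the largest finite value, the smallest normal value, and the machine epsilon from the exponent and mantissa widths of an IEEE-style format (FP16: 5 exponent bits and 10 mantissa bits; bf16: 8 and 7). The function name `float_format_stats` is made up for illustration.

```python
def float_format_stats(exp_bits: int, mantissa_bits: int):
    """Return (max_finite, min_normal, epsilon) for an IEEE-style binary format."""
    bias = 2 ** (exp_bits - 1) - 1
    max_finite = (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias
    min_normal = 2.0 ** (1 - bias)
    epsilon = 2.0 ** -mantissa_bits  # gap between 1.0 and the next value
    return max_finite, min_normal, epsilon


formats = {
    "fp16": float_format_stats(exp_bits=5, mantissa_bits=10),
    "bf16": float_format_stats(exp_bits=8, mantissa_bits=7),
}

for name, (max_finite, min_normal, epsilon) in formats.items():
    print(f"{name}: max={max_finite:.4g}  min_normal={min_normal:.4g}  eps={epsilon:.4g}")
```

The trade-off is visible in the last column: bf16 gains range by giving up mantissa bits, so its values are spaced more coarsely than FP16 values of the same magnitude.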
