
[RKNN] YOLOv5 pytorch2onnx: PyTorch and ONNX model outputs differ, and accuracy drops

news 2026/2/9 16:44:56

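After training a model with YOLOv5, converting it to ONNX, and then converting that to RKNN, testing showed:

The RKNN model, both quantized and non-quantized, scores lower in testing than the PyTorch model.
The ONNX model also scores lower than the PyTorch model, and its accuracy is closer to that of the RKNN model.

ONNX sits directly upstream of the RKNN model, so a discrepancy that already shows up in the ONNX model points to the PyTorch-to-ONNX step. I therefore examined that step and found that the loss of precision is already present there.

This post records that investigation. Suggestions are welcome: the problem has been located, but some questions remain open.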

1. PyTorch to ONNX: torch.onnx.export

In export_onnx() in YOLOv5's export.py, add the code below to check whether the exported ONNX model produces the same output as the PyTorch model:

torch.onnx.export(
    model.cpu() if dynamic else model,  # --dynamic only compatible with cpu
    im.cpu() if dynamic else im,
    f,
    verbose=False,
    opset_version=opset,
    export_params=True,  # store the trained weights in the model file
    do_constant_folding=True,  # apply constant folding as an optimization
    input_names=['images'],
    output_names=output_names,
    # keys must match the names in input_names / output_names
    dynamic_axes={
        'images': {0: 'batch_size'},  # variable length axes
        'output0': {0: 'batch_size'},
    } if dynamic else None,
)

# Checks
model_onnx = onnx.load(f)  # load onnx model
onnx.checker.check_model(model_onnx)  # check onnx model

import onnxruntime
import numpy as np

print('onnxruntime run start', f)
# load the file that was just exported, on the CPU provider
sess = onnxruntime.InferenceSession(str(f), providers=['CPUExecutionProvider'])
print('sess run start')
output = sess.run(['output0'], {'images': im.detach().cpu().numpy()})[0]
print('pytorch model inference start')
with torch.no_grad():
    pytorch_result = model(im)[0].detach().cpu().numpy()
print(' allclose start')
print('output:', output)
print('pytorch_result:', pytorch_result)
assert np.allclose(output, pytorch_result), 'the output is different between pytorch and onnx !!!'

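I printed both outputs and marked the places where they differ noticeably:

[Image: printed ONNX and PyTorch outputs with the differing values marked]

Alternatively, you can use the standalone version below. It is run after the ONNX conversion and evaluates the difference between the exported ONNX file and the .pt file:

Reference, from the official PyTorch tutorials: (OPTIONAL) EXPORTING A MODEL FROM PYTORCH TO ONNX AND RUNNING IT USING ONNX RUNTIME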

import os
import platform
import sys
import warnings
from pathlib import Path

import cv2
import numpy as np
import torch

FILE = Path(__file__).resolve()
ROOT = FILE.parents[0]  # YOLOv5 root directory
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))  # add ROOT to PATH
if platform.system() != 'Windows':
    ROOT = Path(os.path.relpath(ROOT, Path.cwd()))  # relative

from models.experimental import attempt_load
from models.yolo import Detect
from utils.general import check_img_size
from utils.torch_utils import select_device


def cosine_distance(arr1, arr2):
    # flatten the arrays to shape (16128, 7)
    arr1_flat = arr1.reshape(-1, 7)
    arr2_flat = arr2.reshape(-1, 7)
    # NOTE: this averages a 7x7 matrix of normalized column products;
    # it is not the usual cosine distance between the two flattened outputs
    score = np.dot(arr1_flat.T, arr2_flat) / (np.linalg.norm(arr1_flat) * np.linalg.norm(arr2_flat))
    return score.mean()


def check_onnx(model, im):
    import onnxruntime

    print('onnxruntime run start')
    sess = onnxruntime.InferenceSession('best.onnx')
    print('sess run start')
    output = sess.run(['output0'], {'images': im.detach().numpy()})[0]
    print('pytorch model inference start')
    with torch.no_grad():
        pytorch_result = model(im)[0].detach().numpy()
    print(' allclose start')
    print('output:', output, output.shape)
    print('pytorch_result:', pytorch_result, pytorch_result.shape)
    cosine_dis = cosine_distance(output, pytorch_result)
    print('cosine_dis:', cosine_dis)
    # check equality to 4 decimal places; raise an error if they differ
    # np.testing.assert_almost_equal(pytorch_result, output, decimal=4)
    # compare ONNX Runtime and PyTorch results
    np.testing.assert_allclose(pytorch_result, output, rtol=1e-03, atol=1e-05)
    # assert np.allclose(output, pytorch_result), 'the output is different between pytorch and onnx !!!'


def preprocess(img, device):
    img = cv2.resize(img, (512, 512))
    img = img.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
    img = np.ascontiguousarray(img)
    img = torch.from_numpy(img).to(device)
    img = img.float()
    img /= 255  # 0-255 to 0.0-1.0
    if len(img.shape) == 3:
        img = img[None]  # add batch dimension
    return img


def main(
        weights=ROOT / 'weights/best.pt',  # weights path
        imgsz=(512, 512),  # image (height, width)
        batch_size=1,  # batch size
        device='cpu',  # cuda device, i.e. 0 or 0,1,2,3 or cpu
        inplace=False,  # set YOLOv5 Detect() inplace=True
        dynamic=False,  # ONNX/TF/TensorRT: dynamic axes
):
    # Load PyTorch model
    device = select_device(device)
    model = attempt_load(weights, device=device, inplace=True, fuse=True)  # load FP32 model

    # Checks
    imgsz *= 2 if len(imgsz) == 1 else 1  # expand

    # Input
    gs = int(max(model.stride))  # grid size (max stride)
    imgsz = [check_img_size(x, gs) for x in imgsz]  # verify img_size are gs-multiples
    im = torch.zeros(batch_size, 3, *imgsz).to(device)  # image size(1,3,320,192) BCHW iDetection
    # im = cv2.imread(r'F:\tmp\yolov5_multiDR\data\0000005_20200929_M_063Y16640.jpeg')
    # im = preprocess(im, device)
    print(im.shape)

    # Update model
    model.eval()
    for k, m in model.named_modules():
        if isinstance(m, Detect):
            m.inplace = inplace
            m.dynamic = dynamic
            m.export = True

    warnings.filterwarnings(action='ignore', category=torch.jit.TracerWarning)  # suppress TracerWarning
    check_onnx(model, im)


if __name__ == "__main__":
    main()
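A note on the cosine_distance helper in the script above: it averages a 7x7 matrix of column products, so it does not approach 1 (or 0) even for identical inputs, which may be why the values it prints are hard to interpret. The sketch below shows the conventional cosine similarity over the flattened outputs. The function names and the synthetic arrays are my own illustration, not part of YOLOv5 or of the original script.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity of two arrays, each treated as one flat vector."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0  # convention chosen here for all-zero inputs
    return float(np.dot(a, b) / denom)


def cosine_distance(a, b):
    """1 - cosine similarity: close to 0 when the outputs point the same way."""
    return 1.0 - cosine_similarity(a, b)


# Synthetic stand-ins with the same shape as the model outputs (not real data).
rng = np.random.default_rng(0)
ref = rng.normal(size=(1, 16128, 7)).astype(np.float32)
noisy = ref + rng.normal(scale=1e-4, size=ref.shape).astype(np.float32)

print("similarity:", cosine_similarity(ref, noisy))
print("distance:  ", cosine_distance(ref, noisy))
```

With this definition, two outputs that agree up to small numerical noise give a similarity very close to 1, so the number can be read directly as a consistency score.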

Test 1: the input image is an all-zero array. The consistency check reports:

Mismatched elements: 76 / 112896 (0.0673%)
Max absolute difference:  0.00053406
Max relative difference:      2.2101

output: [[[     3.1054       3.965      8.9553 ...  6.8545e-07     0.36458     0.53113]
  [     9.0205      2.5498       13.39 ...  6.2585e-07     0.18449     0.70698]
  [     20.786      2.2233      13.489 ...  2.3842e-06    0.033101     0.95657]
  ...
  [     419.42      493.04      106.14 ...  8.4937e-06     0.24135     0.60916]
  [     485.68      500.22      46.923 ...  1.1176e-05     0.33573     0.48875]
  [     488.37      503.87      68.881 ...  5.9605e-08  0.00030029     0.99639]]] (1, 16128, 7)
pytorch_result: [[[     3.1054       3.965      8.9553 ...  7.0523e-07     0.36458     0.53113]
  [     9.0205      2.5498       13.39 ...  6.0181e-07     0.18449     0.70698]
  [     20.786      2.2233      13.489 ...  2.4172e-06    0.033101     0.95657]
  ...
  [     419.42      493.04      106.14 ...  8.5151e-06     0.24135     0.60916]
  [     485.68      500.22      46.923 ...  1.1174e-05     0.33573     0.48875]
  [     488.37      503.87      68.881 ...  9.3094e-08   0.0003003     0.99639]]] (1, 16128, 7)
cosine_dis: 0.04229331

Test 2: the input is an image loaded from local disk. The consistency check reports:

Mismatched elements: 158 / 112896 (0.14%)
Max absolute difference:   0.0016251
Max relative difference:      1.2584

output: [[[     3.0569      2.4338      10.758 ...  2.0862e-07     0.16333     0.78551]
  [     11.028      2.0251      13.407 ...  3.5763e-07    0.090503     0.88087]
  [     19.447      1.8957      13.431 ...  6.8545e-07    0.047358     0.95029]
  ...
  [     418.66       487.8      80.157 ...  1.4573e-05     0.65453     0.23448]
  [     472.99      491.78      79.313 ...  1.3232e-05     0.79356     0.15061]
  [     496.41      488.49      44.447 ...  2.6256e-05     0.89966     0.08772]]] (1, 16128, 7)
pytorch_result: [[[     3.0569      2.4338      10.758 ...  2.5371e-07     0.16333     0.78551]
  [     11.028      2.0251      13.407 ...  3.3069e-07    0.090503     0.88087]
  [     19.447      1.8957      13.431 ...  6.6051e-07    0.047358     0.95029]
  ...
  [     418.66       487.8      80.157 ...  1.4618e-05     0.65453     0.23448]
  [     472.99      491.78      79.313 ...  1.3215e-05     0.79356     0.15061]
  [     496.41      488.49      44.447 ...  2.6262e-05     0.89966     0.08772]]] (1, 16128, 7)
cosine_dis: 0.04071107

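The outputs differ at quite a few positions. This suggests that somewhere in the model the parameters or the computation differ between the two versions, so that the same input ends up producing different final outputs.

Within a certain tolerance, however, the results agree. For example, I checked that the values match to 3 decimal places; differences begin to appear at the 4th.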
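To see where the disagreement concentrates, it can help to count mismatches per output column instead of only looking at the global maximum. The sketch below is one way to do that with NumPy. The helper name `mismatch_report`, its default tolerances, and the toy arrays are assumptions made for illustration; the real inputs would be the `output` and `pytorch_result` arrays.

```python
import numpy as np


def mismatch_report(actual, desired, rtol=1e-3, atol=1e-5):
    """Summarize element-wise disagreement between two same-shaped arrays."""
    actual = np.asarray(actual, dtype=np.float64)
    desired = np.asarray(desired, dtype=np.float64)
    abs_diff = np.abs(actual - desired)
    # Element-wise criterion of the form used by np.testing.assert_allclose.
    bad = abs_diff > atol + rtol * np.abs(desired)
    return {
        "mismatched": int(bad.sum()),
        "total": int(bad.size),
        "max_abs_diff": float(abs_diff.max()),
        # mismatch count for each position along the last axis
        "per_column": bad.reshape(-1, bad.shape[-1]).sum(axis=0).tolist(),
    }


# Toy data shaped (batch, boxes, columns); the values are made up.
desired = np.array([[[100.0, 0.5, 1e-7],
                     [200.0, 0.25, 2e-7]]])
actual = np.array([[[100.05, 0.5, 3e-7],
                    [200.0, 0.2506, 2e-7]]])

report = mismatch_report(actual, desired)
print(report)
```

In the toy data, the large coordinate value absorbs a difference of 0.05 through the relative term, and the tiny value passes through the absolute term even though it is off by a factor of three, while the mid-sized value in the second column is the one flagged. A per-column count like this shows whether the mismatches sit in box coordinates, confidences, or class scores.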

So how can this difference be reduced, or even eliminated? If you have knowledge or experience in this area, guidance in the comments is welcome. Thanks.

2. The new PyTorch-to-ONNX path: torch.onnx.dynamo_export

The official PyTorch documentation on converting a model with torch.onnx.export includes the tutorial (OPTIONAL) EXPORTING A MODEL FROM PYTORCH TO ONNX AND RUNNING IT USING ONNX RUNTIME.

That tutorial gives the official code for comparing the outputs of the PyTorch model and the exported ONNX model on the same input. The comparison relies on:

np.testing.assert_allclose(actual, desired, rtol=1e-07, atol=0, equal_nan=True, err_msg='', verbose=True)

where:

rtol: relative tolerance (the permitted relative deviation)
atol: absolute tolerance
The difference between actual and desired must not exceed atol + rtol * abs(desired); otherwise an error is raised
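A small worked example of that tolerance rule follows. The helper `within_tolerance` and the numbers are hypothetical, chosen only to show how rtol and atol each contribute to the threshold.

```python
import numpy as np


def within_tolerance(actual, desired, rtol, atol):
    """The scalar form of the check: |actual - desired| <= atol + rtol * |desired|."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


desired = 0.5
actual = 0.5004  # hypothetical value that differs at the 4th decimal place

# Loose tolerances: threshold = 1e-5 + 1e-3 * 0.5 = 5.1e-4, difference = 4e-4.
loose_ok = within_tolerance(actual, desired, rtol=1e-3, atol=1e-5)
# NumPy defaults: threshold = 0 + 1e-7 * 0.5 = 5e-8, so the same pair fails.
strict_ok = within_tolerance(actual, desired, rtol=1e-7, atol=0.0)
# A tiny value passes through atol alone, despite being off by a factor of 3.
tiny_ok = within_tolerance(3e-7, 1e-7, rtol=1e-3, atol=1e-5)
print(loose_ok, strict_ok, tiny_ok)

# The same behaviour through the NumPy API.
np.testing.assert_allclose(actual, desired, rtol=1e-3, atol=1e-5)  # passes silently
try:
    np.testing.assert_allclose(actual, desired)  # default rtol=1e-7, atol=0
    numpy_strict_ok = True
except AssertionError:
    numpy_strict_ok = False
print(numpy_strict_ok)
```

The point of the example is that whether two outputs count as "the same" depends entirely on the chosen rtol and atol, and that atol dominates for values near zero.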

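So this evaluation is performed within an allowed error: as long as the outputs meet the tolerance, the check passes. In this test case the outputs did indeed satisfy the tolerance set above.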

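Then came a twist: the tutorial carries a note that links to torch.onnx.dynamo_export.

Following it leads to the tutorial EXPORT A PYTORCH MODEL TO ONNX.

The flow is the same: export the model, then check consistency. To my surprise, the official example does not appear to set an error tolerance for the comparison, and instead does the following:

[Image: output consistency check from the official dynamo_export tutorial]

The outputs match exactly, which is great news. (If that check is torch.testing.assert_close, note that it still applies dtype-dependent default tolerances rather than requiring bit-exact equality.) With that, I set out to verify it.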

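2.1 Verification

Around the same time I noticed that YOLOv5 had been updated to v7.0.0, so I decided to upgrade YOLOv5 and also move PyTorch to the latest version at the time, 2.1.0, in order to try torch.onnx.dynamo_export for the ONNX conversion.

With everything in place, exporting the ONNX model with the code below produced an error.

export_output = torch.onnx.dynamo_export(
    model.cpu() if dynamic else model,
    im.cpu() if dynamic else im,
)
export_output.save("my_image_classifier.onnx")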

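2.2 Export failure

[Image: error message from torch.onnx.dynamo_export]

The failure is reported as torch.onnx.OnnxExporterError: exporting the ONNX model failed, and a SARIF file was generated. The message then explains what a SARIF file is and says it can be viewed with the VS Code SARIF extension or with the SARIF web viewer. Finally, it asks that the error be reported as an issue on PyTorch's GitHub.

A file named report_dynamo_export.sarif was generated. Opening it shows the following: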

{
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "torch.onnx.dynamo_export",
          "contents": ["localizedData", "nonLocalizedData"],
          "language": "en-US",
          "rules": [],
          "version": "2.1.0+cu118"
        }
      },
      "language": "en-US",
      "newlineSequences": ["\r\n", "\n"],
      "results": []
    }
  ],
  "version": "2.1.0",
  "schemaUri": "https://docs.oasis-open.org/sarif/sarif/v2.1.0/cs01/schemas/sarif-schema-2.1.0.json"
}
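A quick way to confirm that such a report carries no diagnostics is to load it with the standard json module and count the entries under results. The sketch below does this on a trimmed copy of the report shown above (the newlineSequences field is left out to keep the string literal simple). The helper name `summarize_sarif` is my own; for a real file, the text would come from reading report_dynamo_export.sarif.

```python
import json

# Trimmed copy of the report shown above.
SARIF_TEXT = """
{
  "runs": [
    {
      "tool": {
        "driver": {
          "name": "torch.onnx.dynamo_export",
          "rules": [],
          "version": "2.1.0+cu118"
        }
      },
      "results": []
    }
  ],
  "version": "2.1.0"
}
"""


def summarize_sarif(text):
    """Return one summary dict per run in a SARIF document."""
    doc = json.loads(text)
    summary = []
    for run in doc.get("runs", []):
        driver = run.get("tool", {}).get("driver", {})
        summary.append({
            "tool": driver.get("name"),
            "tool_version": driver.get("version"),
            "num_rules": len(driver.get("rules", [])),
            "num_results": len(run.get("results", [])),
        })
    return summary


for entry in summarize_sarif(SARIF_TEXT):
    print(entry)
```

For this report both counts are zero: the file identifies the tool and its version but records no individual findings.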

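This looks more like a record of the runtime environment than a diagnosis. Searching the web turned up similar error reports but no solution. Perhaps this is because the function is still in beta and has not been fully adapted yet.

If you have run into the same problem, please leave a comment on where it goes wrong and how to solve it. Thanks.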

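3. Summary

The original goal was to verify whether the final RKNN model is consistent with the original PyTorch model. It turned out that the difference already exists at the ONNX conversion stage, and that the RKNN test results track the ONNX results more closely. This holds for both the quantized and the non-quantized RKNN model.

I also found that, for the quantized RKNN model, changing the quantization method at the config stage does improve the model's performance, bringing it almost to the level of the non-quantized version.

I had hoped that PyTorch's new ONNX export function would solve the problem. However, it is still a beta feature, I do not know where things went wrong, and I have not yet been able to get it to run. Help from experts is needed.

Finally, if you have run into the same problem, please leave a comment on where it goes wrong and how to solve it. Thanks.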
