
[PyTorch] Convolution with conv2d

news 2025/12/22 12:28:52

Contents

[PyTorch] Convolution with conv2d
- 1. F.conv2d
- 2. nn.Conv2d
- 3. How nn.Conv2d computes its output


1. F.conv2d

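The full signature of torch.nn.functional.conv2d() (note the lowercase name; padding accepts an int, a tuple, or the strings 'valid' and 'same', and defaults to 0):

conv2d(input: Tensor, weight: Tensor, bias: Optional[Tensor]=None, stride: Union[_int, _size]=1, padding: Union[_int, _size, str]=0, dilation: Union[_int, _size]=1, groups: _int=1)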

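The main parameters of F.conv2d are:

input: the input feature map, of shape (minibatch, in_channels, iH, iW)
weight: the convolution kernels, of shape (out_channels, in_channels / groups, kH, kW)
bias: an optional bias tensor of shape (out_channels)
stride: the step with which the kernel slides over the input
padding: the padding added to both sides of each spatial dimension
dilation: the spacing between kernel elements
groups: the number of groups the input channels are split into (grouped convolution)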
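
How stride, padding and dilation interact is easiest to see in the output size. The sketch below applies the size formula from the PyTorch documentation to one spatial dimension; the helper name conv2d_output_size is made up for this illustration and is not part of PyTorch.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Hypothetical helper (not a PyTorch function) implementing
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1.
    """
    # A dilated kernel covers dilation * (kernel_size - 1) + 1 input positions.
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


print(conv2d_output_size(5, 3))              # 5x5 input, 3x3 kernel
print(conv2d_output_size(5, 3, padding=1))   # padding 1 keeps the size
print(conv2d_output_size(5, 3, stride=2))    # stride 2 roughly halves it
print(conv2d_output_size(5, 3, dilation=2))  # dilation 2 widens the kernel to 5
```

With a 5x5 input and a 3x3 kernel, as in the demos in this article, the default settings give a 3x3 output.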

Demo: using F.conv2d to pick out the horizontal line in a small image (the row of large values in x):

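import torch
from torch.nn import functional as F
from torchvision import transforms

# Input image: 3 channels of 5x5 pixels; row 2 of every channel holds the large values (the line)
x = torch.tensor([
    [[1.0, 4, 1, 4, 5], [0, 5, 3, 2, 1], [21, 25, 25, 23, 26], [5, 2, 5, 2, 5], [4, 9, 3, 0, 7]],
    [[2, 2, 2, 7, 2], [0, 0, 6, 3, 0], [24, 25, 25, 23, 27], [0, 1, 1, 1, 5], [0, 2, 0, 2, 2]],
    [[2, 2, 2, 1, 0], [7, 2, 4, 3, 1], [24, 23, 28, 23, 24], [0, 0, 2, 2, 5], [5, 2, 4, 5, 2]],
])
# One 3x3 kernel per input channel; only the middle row is non-zero, so it responds to horizontal lines
weight = torch.tensor([
    [[0.0, 0, 0], [1, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
])
# weight.unsqueeze(0) has shape (out_channels=1, in_channels=3, 3, 3); x is passed unbatched as (C, H, W)
out = F.conv2d(x, weight=weight.unsqueeze(0), bias=None, stride=1, padding=0)

toPIL = transforms.ToPILImage()  # converts a tensor to a PIL image; float tensors are expected in [0, 1] and are scaled to 0-255
img_PIL = toPIL(x / x.max())  # rescale to [0, 1] first, then convert the tensor to an image
img_PIL.save('./original.png')  # save the image; img_PIL.show() displays it directly
img_PIL = toPIL(out / out.max())
img_PIL.save('./convoluted.png')
print(out)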
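
To make explicit what F.conv2d computes in this demo, here is a minimal pure-Python sketch of the same operation (a multi-channel cross-correlation with stride 1 and no padding). The function name cross_correlate is hypothetical, and the nested-list layout is an assumption chosen so the example runs without PyTorch.

```python
def cross_correlate(x, weight):
    """Naive multi-channel cross-correlation, stride 1, no padding.

    x:      nested list of shape (channels, height, width)
    weight: nested list of shape (channels, k_height, k_width)
    Returns one output channel as a nested list.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k_h, k_w = len(weight[0]), len(weight[0][0])
    out = []
    for i in range(height - k_h + 1):
        row = []
        for j in range(width - k_w + 1):
            total = 0
            # Sum the element-wise products over every channel and kernel position.
            for c in range(channels):
                for di in range(k_h):
                    for dj in range(k_w):
                        total += x[c][i + di][j + dj] * weight[c][di][dj]
            row.append(total)
        out.append(row)
    return out


# Same values as the demo above, written as plain integers.
x = [
    [[1, 4, 1, 4, 5], [0, 5, 3, 2, 1], [21, 25, 25, 23, 26], [5, 2, 5, 2, 5], [4, 9, 3, 0, 7]],
    [[2, 2, 2, 7, 2], [0, 0, 6, 3, 0], [24, 25, 25, 23, 27], [0, 1, 1, 1, 5], [0, 2, 0, 2, 2]],
    [[2, 2, 2, 1, 0], [7, 2, 4, 3, 1], [24, 23, 28, 23, 24], [0, 0, 2, 2, 5], [5, 2, 4, 5, 2]],
]
line_kernel = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]
weight = [line_kernel, line_kernel, line_kernel]

out = cross_correlate(x, weight)
for row in out:
    print(row)
# [27, 28, 23]
# [220, 220, 224]
# [16, 16, 28]
```

The middle output row is roughly ten times larger than the other two, which is how the kernel marks the position of the line.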


2. nn.Conv2d

PyTorch also provides convolution as a module, Conv2d in torch.nn:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

The official PyTorch docstring describes the parameters as follows:

 Args:
     in_channels (int): Number of channels in the input image
     out_channels (int): Number of channels produced by the convolution
     kernel_size (int or tuple): Size of the convolving kernel
     stride (int or tuple, optional): Stride of the convolution. Default: 1
     padding (int, tuple or str, optional): Padding added to all four sides of
         the input. Default: 0
     padding_mode (str, optional): ``'zeros'``, ``'reflect'``,
         ``'replicate'`` or ``'circular'``. Default: ``'zeros'``
     dilation (int or tuple, optional): Spacing between kernel elements. Default: 1
     groups (int, optional): Number of blocked connections from input
         channels to output channels. Default: 1
     bias (bool, optional): If ``True``, adds a learnable bias to the
         output. Default: ``True``

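Only in_channels, out_channels and kernel_size are required. stride, padding, padding_mode, dilation, groups and bias are optional and fall back to the defaults listed above.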
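
As a small illustration of the optional arguments, the sketch below builds three layers and inspects their shapes. The layer names and the 6-channel output size are arbitrary choices for this example, and padding='same' assumes a PyTorch version that accepts string padding (1.9 or later).

```python
import torch
from torch import nn

# Only the three required arguments; everything else uses its default.
default_layer = nn.Conv2d(3, 6, kernel_size=3)
# padding='same' pads so that height and width are preserved (requires stride 1).
same_layer = nn.Conv2d(3, 6, kernel_size=3, padding='same')
# groups=3 splits the 3 input channels into 3 groups, so every kernel sees 1 channel.
grouped_layer = nn.Conv2d(3, 6, kernel_size=3, groups=3, bias=False)

x = torch.randn(1, 3, 5, 5)  # (batch, channels, height, width)
with torch.no_grad():
    default_shape = tuple(default_layer(x).shape)
    same_shape = tuple(same_layer(x).shape)

print(default_shape)                      # (1, 6, 3, 3)
print(same_shape)                         # (1, 6, 5, 5)
print(tuple(default_layer.weight.shape))  # (6, 3, 3, 3)
print(tuple(grouped_layer.weight.shape))  # (6, 1, 3, 3)
print(grouped_layer.bias)                 # None
```

The weight shape is always (out_channels, in_channels / groups, kernel height, kernel width), which is why the grouped layer stores a third as many weights.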

import torch
from torch import nn
from torchvision import transforms
from torch.nn import functional as F

# 3 input channels -> 3 output channels, 3x3 kernel; weights and bias are randomly initialised
convolutional_layer = nn.Conv2d(3, 3, kernel_size=3, stride=1, padding=0)
x = torch.tensor([
    [[231.0, 120, 111, 34, 45], [100, 85, 23, 200, 111], [31, 45, 100, 103, 220], [5, 5, 5, 5, 5], [54, 89, 103, 150, 67]],
    [[12, 58, 52, 87, 100], [200, 140, 86, 23, 10], [60, 75, 45, 30, 7], [155, 155, 155, 155, 155], [0, 122, 0, 0, 12]],
    [[12, 12, 12, 11, 10], [67, 12, 45, 23, 1], [56, 12, 5, 10, 8], [0, 0, 0, 0, 0], [5, 12, 34, 56, 12]],
])
out = convolutional_layer(x)  # unbatched (C, H, W) input gives a (3, 3, 3) output
print(out)
print(convolutional_layer.weight.shape)  # (out_channels, in_channels, kH, kW) = (3, 3, 3, 3)
print(convolutional_layer.bias)  # one learnable value per output channel


To get a visual feel for the effect of the convolution, convert the tensors to images:

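toPIL = transforms.ToPILImage()  # converts a tensor to a PIL image; float tensors are expected in [0, 1] and are scaled to 0-255
img_PIL = toPIL(x / 255)  # x already holds 0-255 values, so rescale to [0, 1] before converting
img_PIL.save('./original.png')  # save the image; img_PIL.show() displays it directly
out_img = out.detach()  # drop the autograd graph before converting
out_img = (out_img - out_img.min()) / (out_img.max() - out_img.min())  # min-max normalise to [0, 1]
img_PIL = toPIL(out_img)
img_PIL.save('./convoluted.png')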

[Figure: original feature map]
[Figure: output feature map after the convolution]

3. How nn.Conv2d computes its output

Suppose the layer is defined as follows (bias is a boolean flag, so it is switched off with bias=False):

convolutional_layer = torch.nn.Conv2d(2, 3, 3, 1, bias=False)

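Let us look at how this layer turns 2 input channels into 3 output channels.
The input feature map has 2 channels, drawn in orange and blue in the figure.
nn.Conv2d randomly initialises 3 groups of kernels, one group per output channel. Each group contains 2 kernels, one per input channel (an orange kernel and a blue kernel), and each kernel is convolved with the input channel of the same colour. Every group therefore produces two intermediate feature maps, and adding these two maps element by element gives that group's output channel. The three resulting channels together form the output of nn.Conv2d. In total the layer holds 3x2 kernels, that is, output channels x input channels.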
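
The description above can be written compactly as a formula. The symbols are notation introduced here rather than taken from the original figure: x_c is input channel c, W_{o,c} is the kernel in group o that belongs to input channel c, and the star denotes the sliding-window operation PyTorch performs (2D cross-correlation). There is no bias term because the layer was created without one.

```latex
\mathrm{out}_o \;=\; \sum_{c=0}^{C_\mathrm{in}-1} W_{o,c} \star x_c ,
\qquad o = 0, \dots, C_\mathrm{out}-1
```

For this layer, C_in = 2 and C_out = 3, so each output channel is out_o = W_{o,0} ⋆ x_0 + W_{o,1} ⋆ x_1.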
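To verify this description, we use two F.conv2d calls to reproduce separately what the orange kernels and the blue kernels do to their input channels, add the two results, and compare the sum with the output of nn.Conv2d itself: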

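import torch
from torch.nn import functional as F

convolutional_layer = torch.nn.Conv2d(2, 3, 3, 1, bias=False)  # 2 input channels, 3 output channels, 3x3 kernel, stride 1
input = torch.randn(1, 2, 4, 4)
output = convolutional_layer(input)
weight = convolutional_layer.weight  # shape (3, 2, 3, 3)
# Input channel 0 convolved with the first kernel of each of the 3 groups
out_1 = F.conv2d(input[:, 0, :, :].unsqueeze(1), weight[:, 0, :, :].unsqueeze(1), bias=None, stride=1, padding=0)
# Input channel 1 convolved with the second kernel of each of the 3 groups
out_2 = F.conv2d(input[:, 1, :, :].unsqueeze(1), weight[:, 1, :, :].unsqueeze(1), bias=None, stride=1, padding=0)
print(f'output:\n{output}')
print('-----------------------------------------------------------------')
print(f'out_1 + out_2:\n{out_1 + out_2}')
print(torch.allclose(output, out_1 + out_2, atol=1e-5))  # numerical comparison up to float rounding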

Comparing the two printouts shows that the results agree, which confirms the description of how nn.Conv2d computes its output.
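
The same check can be taken one step further, down to the 3x2 individual kernels. The sketch below is an illustration under the same setup, not part of the original experiment: it convolves every input channel with every single kernel, sums the results per output channel, and compares the result with the layer's own output. The names layer, manual and matches are chosen for this example.

```python
import torch
from torch.nn import functional as F

torch.manual_seed(0)  # fixed seed so the run is repeatable
layer = torch.nn.Conv2d(2, 3, kernel_size=3, stride=1, bias=False)
x = torch.randn(1, 2, 4, 4)

with torch.no_grad():
    weight = layer.weight  # shape (out_channels=3, in_channels=2, 3, 3)
    channel_maps = []
    for o in range(weight.shape[0]):          # one group of kernels per output channel
        acc = torch.zeros(1, 1, 2, 2)         # a 4x4 input and a 3x3 kernel give a 2x2 map
        for c in range(weight.shape[1]):      # one kernel per input channel
            kernel = weight[o, c].view(1, 1, 3, 3)
            acc = acc + F.conv2d(x[:, c:c + 1], kernel)
        channel_maps.append(acc)
    manual = torch.cat(channel_maps, dim=1)   # stack the 3 output channels
    matches = torch.allclose(manual, layer(x), atol=1e-5)

print(tuple(weight.shape))
print(tuple(manual.shape))
print(matches)
```

Each pass of the inner loop uses exactly one of the six kernels, which matches the count of output channels x input channels given above.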
