
Haar Wavelet Downsampling Module

news 2025/11/9 16:46:15

Paper: Haar wavelet downsampling: A simple but effective downsampling module for semantic segmentation - ScienceDirect

Original code: HWD/HWD.py at main · apple1986/HWD (github.com)

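Introduction

Deep convolutional neural networks (DCNNs) usually rely on standard downsampling operations such as max pooling, average pooling, and strided convolution, all of which can discard information. The lost information, such as boundaries and textures, may be essential for semantic segmentation. Four kinds of approaches are commonly used to mitigate this problem: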

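Passing encoder features to the decoder subnetwork through skip connections (e.g., U-Net, LCU-Net, CENet, LinkNet, and RefineNet).
Extracting multi-scale feature maps with spatial pyramid pooling or dilated convolution and feeding them into a fusion module (e.g., DeepLab, PSPNet, PCPLP-Net, BiSeNet, and ICNet).
Feeding multimodal images to the encoder (e.g., DiSegNet, MMADT, CANet, and CCFFNet).
Adding prior information. For example, a contour-enhanced attention module is designed to extract boundary and shape cues from CT images in order to refine the segmented regions.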

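The main purpose of these approaches is to supply more information or features to learn from, through strategies such as multi-scale fusion, prior guidance, and multimodal input, so that a good relationship can be established between the downsampled features and the segmentation labels.

This raises the question: can a downsampling module be designed that itself preserves information, so that a DCNN retains as much information as possible for semantic segmentation? This is the idea the authors pursue.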
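To make the information loss concrete, here is a minimal sketch in plain Python. The two 2x2 patches and the helper names are made up for illustration and are not taken from the paper. One patch contains a vertical edge and the other a diagonal texture, yet both max pooling and average pooling reduce them to the same single value, so later layers cannot tell them apart.

```python
# Hypothetical 2x2 patches, chosen only for illustration.
edge_patch = [[0.0, 1.0],
              [0.0, 1.0]]      # vertical edge: dark left column, bright right column
texture_patch = [[1.0, 0.0],
                 [0.0, 1.0]]   # diagonal texture


def flatten(patch):
    """Return all values of a 2D patch as a flat list."""
    return [value for row in patch for value in row]


def max_pool(patch):
    """Reduce the whole patch to its maximum value."""
    return max(flatten(patch))


def avg_pool(patch):
    """Reduce the whole patch to its mean value."""
    values = flatten(patch)
    return sum(values) / len(values)


for name, patch in [("edge", edge_patch), ("texture", texture_patch)]:
    print(name, "max:", max_pool(patch), "avg:", avg_pool(patch))
```

Both patches yield a maximum of 1.0 and a mean of 0.5, even though their spatial structure differs.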

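Downsampling modules

Max pooling and average pooling

Pooling works much like convolution. In the illustrated example, a 2x2 filter scans a 4x4 feature map with a stride of 2. Selecting the maximum value inside each window and passing it to the next layer is called max pooling.

The most common max pooling parameters are S=2 and f=2 (stride 2, filter size 2), which halve the height and width of the feature map while leaving the number of channels unchanged.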
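The halving effect can be checked with a small sketch of the no-padding output-size formula, output_size = (input_size - kernel_size) // stride + 1. The function name `pool_output_size` and the sample input sizes are chosen here only for illustration.

```python
def pool_output_size(input_size: int, kernel_size: int, stride: int) -> int:
    """Output length along one spatial axis for pooling without padding."""
    return (input_size - kernel_size) // stride + 1


# With f=2 and S=2, every even input size is halved.
for size in (4, 64, 224):
    print(size, "->", pool_output_size(size, kernel_size=2, stride=2))
```

The formula applies to height and width independently. The channel dimension is not involved, which is why the channel count stays the same.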

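As illustrated, the same setup applies here: a 2x2 filter scans a 4x4 feature map with a stride of 2, but this time the mean of the values inside each window is computed and passed to the next layer. This operation is called average pooling (mean pooling).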
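As a worked example of both operations, the sketch below pools a 4x4 feature map with a 2x2 window and stride 2 in plain Python. The feature map values and the helper `pool2d` are invented for illustration and are not the values from the original figure.

```python
def pool2d(feature_map, kernel_size, stride, reduce_fn):
    """Slide a square window over a 2D list and reduce each window with reduce_fn."""
    height, width = len(feature_map), len(feature_map[0])
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Collect the values covered by the window whose top-left corner
            # is at (i * stride, j * stride).
            window = [
                feature_map[i * stride + di][j * stride + dj]
                for di in range(kernel_size)
                for dj in range(kernel_size)
            ]
            row.append(reduce_fn(window))
        output.append(row)
    return output


# Hypothetical 4x4 feature map.
feature_map = [
    [1, 3, 2, 4],
    [5, 7, 6, 8],
    [4, 2, 0, 1],
    [3, 1, 2, 5],
]

max_pooled = pool2d(feature_map, 2, 2, max)
avg_pooled = pool2d(feature_map, 2, 2, lambda w: sum(w) / len(w))
print(max_pooled)  # [[7, 8], [4, 5]]
print(avg_pooled)  # [[4.0, 5.0], [2.5, 2.0]]
```

Each output entry summarizes one non-overlapping 2x2 window, so the 4x4 input becomes a 2x2 output in both cases.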

"""
Copyright (c) 2023, Auorui.
All rights reserved.

The Torch implementation of average pooling and maximum pooling has been compared with the official Torch implementation
"""
import torch
import torch.nn as nn

__all__ = ["MaxPool2d", "AvgPool2d"]


class MaxPool2d(nn.Module):
    """
    Pooling output size:
        output_size = (input_size - kernel_size) // stride + 1
    """
    def __init__(self, kernel_size, stride):
        super(MaxPool2d, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride

    def max_pool2d(self, input_tensor, kernel_size, stride):
        batch_size, channels, height, width = input_tensor.size()
        output_height = (height - kernel_size) // stride + 1
        output_width = (width - kernel_size) // stride + 1
        # new_zeros keeps the dtype and device of the input tensor
        output_tensor = input_tensor.new_zeros((batch_size, channels, output_height, output_width))
        for i in range(output_height):
            for j in range(output_width):
                # Slice the part of the input covered by the pooling window
                window = input_tensor[:, :,
                                      i * stride: i * stride + kernel_size,
                                      j * stride: j * stride + kernel_size]
                # torch.max along a dim returns (values, indices); keep the values
                output_tensor[:, :, i, j] = torch.max(window.reshape(batch_size, channels, -1), dim=2)[0]
        return output_tensor

    def forward(self, input_tensor):
        return self.max_pool2d(input_tensor, kernel_size=self.kernel_size, stride=self.stride)


class AvgPool2d(nn.Module):
    """
    Pooling output size:
        output_size = (input_size - kernel_size) // stride + 1
    """
    def __init__(self, kernel_size, stride):
        super(AvgPool2d, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride

    def avg_pool2d(self, input_tensor, kernel_size, stride):
        batch_size, channels, height, width = input_tensor.size()
        output_height = (height - kernel_size) // stride + 1
        output_width = (width - kernel_size) // stride + 1
        # new_zeros keeps the dtype and device of the input tensor
        output_tensor = input_tensor.new_zeros((batch_size, channels, output_height, output_width))
        for i in range(output_height):
            for j in range(output_width):
                # Slice the part of the input covered by the pooling window
                window = input_tensor[:, :,
                                      i * stride: i * stride + kernel_size,
                                      j * stride: j * stride + kernel_size]
                output_tensor[:, :, i, j] = torch.mean(window.reshape(batch_size, channels, -1), dim=2)
        return output_tensor

    def forward(self, input_tensor):
        return self.avg_pool2d(input_tensor, kernel_size=self.kernel_size, stride=self.stride)


if __name__ == "__main__":
    # input_data = torch.rand((1, 3, 3, 3))
    input_data = torch.tensor([[[[0.3939, 0.8964, 0.3681],
                                 [0.5134, 0.3780, 0.0047],
                                 [0.0681, 0.0989, 0.5962]],
                                [[0.7954, 0.4811, 0.3329],
                                 [0.8804, 0.3986, 0.3561],
                                 [0.2797, 0.3672, 0.6508]],
                                [[0.6309, 0.1340, 0.0564],
                                 [0.3101, 0.9927, 0.5554],
                                 [0.0947, 0.2305, 0.8299]]]])
    print(input_data.shape)

    kernel_size = 3
    stride = 1

    # Official torch.nn layers
    MaxPool2d1 = nn.MaxPool2d(kernel_size, stride)
    output_data_with_torch_max = MaxPool2d1(input_data)
    AvgPool2d1 = nn.AvgPool2d(kernel_size, stride)
    output_data_with_torch_avg = AvgPool2d1(input_data)

    # Custom layers defined above
    AvgPool2d2 = AvgPool2d(kernel_size, stride)
    output_data_with_torch_Avg = AvgPool2d2(input_data)
    MaxPool2d2 = MaxPool2d(kernel_size, stride)
    output_data_with_torch_Max = MaxPool2d2(input_data)

    print("\ntorch.nn pooling Output:")
    print(output_data_with_torch_max, "\n", output_data_with_torch_max.size())
    print(output_data_with_torch_avg, "\n", output_data_with_torch_avg.size())
    print("\npooling Output:")
    print(output_data_with_torch_Max, "\n", output_data_with_torch_Max.size())
    print(output_data_with_torch_Avg, "\n", output_data_with_torch_Avg.size())

    # Exact equality checks can fail because of floating-point error, so use allclose
    print(torch.allclose(output_data_with_torch_max, output_data_with_torch_Max))
    print(torch.allclose(output_data_with_torch_avg, output_data_with_torch_Avg))

    # Max pooling result:
    # tensor([[[[0.8964]],
    #          [[0.8804]],
    #          [[0.9927]]]])
    # Average pooling result:
    # tensor([[[[0.3686]],
    #          [[0.5047]],
    #          [[0.4261]]]])

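Here the custom layers are simply compared against the official PyTorch implementation, and the results match, so the reproduction is successful.

Strided convolution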

import torch
import torch.nn as nn


class StridedConvolution(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=2, is_relu=True):
        super(StridedConvolution, self).__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, stride=stride, padding=1)
        self.relu = nn.ReLU(inplace=True)
        self.is_relu = is_relu

    def forward(self, x):
        x = self.conv(x)
        if self.is_relu:
            x = self.relu(x)
        return x


if __name__ == '__main__':
    input_data = torch.rand((1, 3, 64, 64))
    strided_conv = StridedConvolution(3, 64)
    output_data = strided_conv(input_data)
    print("Input shape:", input_data.shape)
    print("Output shape:", output_data.shape)

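The module applies a strided convolution to the input and, depending on the is_relu argument, optionally follows it with a ReLU activation. Strided convolution is frequently used as the downsampling step when building convolutional neural networks, since it reduces the spatial size of the feature map.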
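The size reduction can be worked out with the standard convolution output-size formula, floor((n + 2p - k) / s) + 1, assuming a dilation of 1. The sketch below is plain Python; the function name `conv_output_size` is chosen here for illustration, and the default arguments mirror the kernel_size=3, stride=2, padding=1 settings of the StridedConvolution class above.

```python
def conv_output_size(input_size: int, kernel_size: int = 3,
                     stride: int = 2, padding: int = 1) -> int:
    """Output length along one spatial axis of a convolution (dilation = 1)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# A 64x64 input becomes 32x32 with a 3x3 kernel, stride 2, and padding 1.
print(conv_output_size(64))  # 32
# Odd sizes round up under these settings: 65 -> 33.
print(conv_output_size(65))  # 33
```

Unlike pooling, the kernel weights are learned, so the network can adapt which information survives the downsampling. The spatial resolution is still halved, though.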