
torch.nn.Linear in PyTorch

news 2025/7/6 22:56:28

torch.nn.Linear is the linear (fully connected) layer in PyTorch and is probably the most commonly used network layer. Official documentation: torch.nn.Linear.

torch.nn.Linear(in_features, out_features, bias=True, device=None, dtype=None)

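Here, in_features is the size of each input sample; out_features is the size of each output sample; bias specifies whether the layer includes a bias term and defaults to True.
nn.Linear applies a linear transformation to its input. The linear function we learn in secondary school is y = kx + b, but in a neural network the input, the output, and the weights are all matrices, i.e.: $output = input \times W + b$ where $input \in \mathbb{R}^{n \times i}$, $W \in \mathbb{R}^{i \times o}$, and $output \in \mathbb{R}^{n \times o}$. n is the number of rows of the input (usually the batch size), i is the number of input neurons, and o is the number of output neurons. The bias b has o entries and is added to every row. Note that PyTorch stores the weight with shape (out_features, in_features), so W here is the transpose of the layer's weight attribute, which matches the formula $y = xA^T + b$ in the official docstring. Usage example: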
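As a quick illustration of these three arguments, the following minimal sketch builds one layer with the default bias and one with bias=False and inspects the resulting parameters. The variable names (fc_with_bias, fc_no_bias, and so on) are chosen here for illustration only.

```python
import torch
from torch import nn

# One layer with the default bias=True, one with the bias disabled.
fc_with_bias = nn.Linear(in_features=20, out_features=40)
fc_no_bias = nn.Linear(in_features=20, out_features=40, bias=False)

# The weight is stored as (out_features, in_features).
weight_shape = tuple(fc_with_bias.weight.shape)
# The bias has one entry per output feature.
bias_shape = tuple(fc_with_bias.bias.shape)
# With bias=False the attribute still exists but is None.
no_bias = fc_no_bias.bias is None

print(weight_shape, bias_shape, no_bias)
```

With bias=False the layer only learns the weight matrix, which is common when the layer is immediately followed by a normalization layer that has its own shift term.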

import torch
from torch import nn

FC = nn.Linear(20, 40)
input = torch.randn(128, 20)  # shape: (128, 20)
output = FC(input)
print(output.size())  # torch.Size([128, 40])
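To connect the formula with the layer, the sketch below recomputes the output by hand from the layer's own parameters and compares it with the module's result. It assumes the stored weight has shape (out_features, in_features), so it is transposed before the matrix product; the names out_module, out_manual, and matches are illustrative.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed only so the run is repeatable

fc = nn.Linear(20, 40)
x = torch.randn(128, 20)

# Output computed by the module itself.
out_module = fc(x)

# The same transformation written out by hand:
# fc.weight has shape (40, 20), so it is transposed to (20, 40).
out_manual = x @ fc.weight.T + fc.bias

# Compare with a small tolerance to allow for floating-point rounding.
matches = torch.allclose(out_module, out_manual, atol=1e-5)
print(out_module.shape, matches)
```

The tolerance is there because the module and the hand-written expression may accumulate floating-point rounding in a different order.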

Official source code:

import math

import torch
from torch import Tensor
from torch.nn.parameter import Parameter, UninitializedParameter
from .. import functional as F
from .. import init
from .module import Module
from .lazy import LazyModuleMixin


class Identity(Module):
    r"""A placeholder identity operator that is argument-insensitive.

    Args:
        args: any argument (unused)
        kwargs: any keyword argument (unused)

    Shape:
        - Input: :math:`(*)`, where :math:`*` means any number of dimensions.
        - Output: :math:`(*)`, same shape as the input.

    Examples::

        >>> m = nn.Identity(54, unused_argument1=0.1, unused_argument2=False)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 20])

    """
    def __init__(self, *args, **kwargs):
        super(Identity, self).__init__()

    def forward(self, input: Tensor) -> Tensor:
        return input


class Linear(Module):
    r"""Applies a linear transformation to the incoming data: :math:`y = xA^T + b`

    This module supports :ref:`TensorFloat32<tf32_on_ampere>`.

    Args:
        in_features: size of each input sample
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input: :math:`(*, H_{in})` where :math:`*` means any number of
          dimensions including none and :math:`H_{in} = \text{in\_features}`.
        - Output: :math:`(*, H_{out})` where all but the last dimension
          are the same shape as the input and :math:`H_{out} = \text{out\_features}`.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`

    Examples::

        >>> m = nn.Linear(20, 30)
        >>> input = torch.randn(128, 20)
        >>> output = m(input)
        >>> print(output.size())
        torch.Size([128, 30])
    """
    __constants__ = ['in_features', 'out_features']
    in_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(Linear, self).__init__()
        self.in_features = in_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in_features), **factory_kwargs))
        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
        # uniform(-1/sqrt(in_features), 1/sqrt(in_features)). For details, see
        # https://github.com/pytorch/pytorch/issues/57109
        init.kaiming_uniform_(self.weight, a=math.sqrt(5))
        if self.bias is not None:
            fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in) if fan_in > 0 else 0
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input: Tensor) -> Tensor:
        return F.linear(input, self.weight, self.bias)

    def extra_repr(self) -> str:
        return 'in_features={}, out_features={}, bias={}'.format(
            self.in_features, self.out_features, self.bias is not None
        )


# This class exists solely to avoid triggering an obscure error when scripting
# an improperly quantized attention layer. See this issue for details:
# https://github.com/pytorch/pytorch/issues/58969
# TODO: fail fast on quantization API usage error, then remove this class
# and replace uses of it with plain Linear
class NonDynamicallyQuantizableLinear(Linear):
    def __init__(self, in_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        super().__init__(in_features, out_features, bias=bias,
                         device=device, dtype=dtype)


class Bilinear(Module):
    r"""Applies a bilinear transformation to the incoming data:
    :math:`y = x_1^T A x_2 + b`

    Args:
        in1_features: size of each first input sample
        in2_features: size of each second input sample
        out_features: size of each output sample
        bias: If set to False, the layer will not learn an additive bias.
            Default: ``True``

    Shape:
        - Input1: :math:`(*, H_{in1})` where :math:`H_{in1}=\text{in1\_features}` and
          :math:`*` means any number of additional dimensions including none. All but the last dimension
          of the inputs should be the same.
        - Input2: :math:`(*, H_{in2})` where :math:`H_{in2}=\text{in2\_features}`.
        - Output: :math:`(*, H_{out})` where :math:`H_{out}=\text{out\_features}`
          and all but the last dimension are the same shape as the input.

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in1\_features}, \text{in2\_features})`.
            The values are initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in1\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
                :math:`k = \frac{1}{\text{in1\_features}}`

    Examples::

        >>> m = nn.Bilinear(20, 30, 40)
        >>> input1 = torch.randn(128, 20)
        >>> input2 = torch.randn(128, 30)
        >>> output = m(input1, input2)
        >>> print(output.size())
        torch.Size([128, 40])
    """
    __constants__ = ['in1_features', 'in2_features', 'out_features']
    in1_features: int
    in2_features: int
    out_features: int
    weight: Tensor

    def __init__(self, in1_features: int, in2_features: int, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        super(Bilinear, self).__init__()
        self.in1_features = in1_features
        self.in2_features = in2_features
        self.out_features = out_features
        self.weight = Parameter(torch.empty((out_features, in1_features, in2_features), **factory_kwargs))

        if bias:
            self.bias = Parameter(torch.empty(out_features, **factory_kwargs))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self) -> None:
        bound = 1 / math.sqrt(self.weight.size(1))
        init.uniform_(self.weight, -bound, bound)
        if self.bias is not None:
            init.uniform_(self.bias, -bound, bound)

    def forward(self, input1: Tensor, input2: Tensor) -> Tensor:
        return F.bilinear(input1, input2, self.weight, self.bias)

    def extra_repr(self) -> str:
        return 'in1_features={}, in2_features={}, out_features={}, bias={}'.format(
            self.in1_features, self.in2_features, self.out_features, self.bias is not None
        )


class LazyLinear(LazyModuleMixin, Linear):
    r"""A :class:`torch.nn.Linear` module where `in_features` is inferred.

    In this module, the `weight` and `bias` are of :class:`torch.nn.UninitializedParameter`
    class. They will be initialized after the first call to ``forward`` is done and the
    module will become a regular :class:`torch.nn.Linear` module. The ``in_features`` argument
    of the :class:`Linear` is inferred from the ``input.shape[-1]``.

    Check the :class:`torch.nn.modules.lazy.LazyModuleMixin` for further documentation
    on lazy modules and their limitations.

    Args:
        out_features: size of each output sample
        bias: If set to ``False``, the layer will not learn an additive bias.
            Default: ``True``

    Attributes:
        weight: the learnable weights of the module of shape
            :math:`(\text{out\_features}, \text{in\_features})`. The values are
            initialized from :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})`, where
            :math:`k = \frac{1}{\text{in\_features}}`
        bias:   the learnable bias of the module of shape :math:`(\text{out\_features})`.
                If :attr:`bias` is ``True``, the values are initialized from
                :math:`\mathcal{U}(-\sqrt{k}, \sqrt{k})` where
                :math:`k = \frac{1}{\text{in\_features}}`


    """

    cls_to_become = Linear  # type: ignore[assignment]
    weight: UninitializedParameter
    bias: UninitializedParameter  # type: ignore[assignment]

    def __init__(self, out_features: int, bias: bool = True,
                 device=None, dtype=None) -> None:
        factory_kwargs = {'device': device, 'dtype': dtype}
        # bias is hardcoded to False to avoid creating tensor
        # that will soon be overwritten.
        super().__init__(0, 0, False)
        self.weight = UninitializedParameter(**factory_kwargs)
        self.out_features = out_features
        if bias:
            self.bias = UninitializedParameter(**factory_kwargs)

    def reset_parameters(self) -> None:
        if not self.has_uninitialized_params() and self.in_features != 0:
            super().reset_parameters()

    def initialize_parameters(self, input) -> None:  # type: ignore[override]
        if self.has_uninitialized_params():
            with torch.no_grad():
                self.in_features = input.shape[-1]
                self.weight.materialize((self.out_features, self.in_features))
                if self.bias is not None:
                    self.bias.materialize((self.out_features,))
                self.reset_parameters()
# TODO: PartialLinear - maybe in sparse?
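Three behaviours described in the source above can be checked with a short experiment: the initialization range from reset_parameters, the (*, H_in) input shape from the Linear docstring, and the inference of in_features by LazyLinear. This is a minimal sketch rather than part of the official code; the variable names and the small 1e-6 slack added to the bound (to absorb float32 rounding) are assumptions made here.

```python
import math

import torch
from torch import nn

in_features, out_features = 20, 40
fc = nn.Linear(in_features, out_features)

# reset_parameters draws weight and bias from U(-1/sqrt(in_features), 1/sqrt(in_features)).
bound = 1 / math.sqrt(in_features)
eps = 1e-6  # slack for float32 rounding (an assumption of this sketch)
weight_in_range = bool((fc.weight.abs() <= bound + eps).all())
bias_in_range = bool((fc.bias.abs() <= bound + eps).all())

# The input may have any number of leading dimensions;
# only the last one has to equal in_features.
x = torch.randn(8, 16, in_features)
out_shape = tuple(fc(x).shape)

# LazyLinear infers in_features from input.shape[-1] on the first forward call.
lazy = nn.LazyLinear(out_features)
lazy_out = lazy(x)
inferred_in_features = lazy.in_features

print(weight_in_range, bias_in_range, out_shape, inferred_in_features)
```

LazyLinear is convenient when the input width is only known at run time, for example after a stack of convolutions whose flattened output size depends on the image resolution.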
