
Learning RNNs by Writing Code

news 2025/12/20 10:08:00

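1. What is an RNN

A recurrent neural network (RNN) is a deep learning model that takes sequence data as its input, and it is one of the most commonly used model families in NLP. Its structure is shown in the figure below:

[Figure: RNN structure diagram]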

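x is the input, h is the hidden unit, o is the output, L is the loss function, and y is the label from the training set.
The superscript t on these symbols denotes the state at time step t. Note that the hidden unit h at time t is determined not only by the input at that moment but also by the time steps before t. V, W, and U are weights; connections of the same type share the same weights.
With this in mind, the forward propagation algorithm is quite simple. For time step t:
$h^{(t)} = \phi(Ux^{(t)} + Wh^{(t-1)} + b)$

where $\phi(\cdot)$ is the activation function, usually tanh, and b is the bias.
The output at time step t is even simpler:
$o^{(t)} = Vh^{(t)} + c$
The final predicted output of the model is:
$\hat y^{(t)} = \sigma(o^{(t)})$
where $\sigma$ is the activation function. RNNs are often used for classification, so softmax is typically used here.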
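The three equations above can be sketched directly in NumPy. This is only an illustrative sketch: the function name `rnn_forward`, the layer sizes, and the random weights are assumptions made for the example and do not come from the original article. The initial hidden state is assumed to be zero.

```python
import numpy as np


def softmax(z):
    # Subtract the max for numerical stability before exponentiating
    e = np.exp(z - np.max(z))
    return e / e.sum()


def rnn_forward(xs, U, W, V, b, c):
    """Run the forward pass over a sequence xs of shape (T, n_in)."""
    h = np.zeros(W.shape[0])            # h^(0), assumed to be all zeros
    y_hats = []
    for x in xs:
        h = np.tanh(U @ x + W @ h + b)  # h^(t) = phi(U x^(t) + W h^(t-1) + b)
        o = V @ h + c                   # o^(t) = V h^(t) + c
        y_hats.append(softmax(o))       # y_hat^(t) = sigma(o^(t))
    return np.array(y_hats), h


# Hypothetical sizes and random weights, chosen only for illustration
rng = np.random.default_rng(0)
n_in, n_hidden, n_out, T = 2, 3, 4, 5
U = rng.normal(size=(n_hidden, n_in))
W = rng.normal(size=(n_hidden, n_hidden))
V = rng.normal(size=(n_out, n_hidden))
b = np.zeros(n_hidden)
c = np.zeros(n_out)
xs = rng.normal(size=(T, n_in))

y_hats, h_last = rnn_forward(xs, U, W, V, b, c)
print(y_hats.shape, h_last.shape)
```

The same U, W, and V are reused at every time step, which is the weight sharing described above. Each row of `y_hats` is a probability distribution over the output classes.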

2. Experiment code

2.1. Build a model with a single RNN layer and a Dense layer

from keras.models import Sequential
from keras.layers import SimpleRNN, Dense


def simple_rnn_layer():
    # Build a model with one SimpleRNN layer followed by one Dense layer
    model = Sequential()
    # 3 units in the RNN layer, input_shape=(timesteps, features)
    model.add(SimpleRNN(units=3, input_shape=(3, 2)))
    model.add(Dense(1))  # Output layer with one neuron
    # summary() prints the table itself and returns None,
    # so wrapping it in print() also prints a trailing "None"
    print(model.summary())


if __name__ == '__main__':
    simple_rnn_layer()

Output

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simple_rnn (SimpleRNN)      (None, 3)                 18        
 dense (Dense)               (None, 1)                 4         
=================================================================
Total params: 22
Trainable params: 22
Non-trainable params: 0
_________________________________________________________________
None
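The parameter counts in the summary can be reproduced by hand. A SimpleRNN layer has an input kernel of shape (features, units), a recurrent kernel of shape (units, units), and a bias of length units. A Dense layer has a kernel of shape (inputs, units) plus a bias. The helper names below are hypothetical and only illustrate this arithmetic.

```python
def simple_rnn_params(features, units):
    # input kernel + recurrent kernel + bias
    return features * units + units * units + units


def dense_params(inputs, units):
    # kernel + bias
    return inputs * units + units


rnn_count = simple_rnn_params(features=2, units=3)
dense_count = dense_params(inputs=3, units=1)
print(rnn_count, dense_count, rnn_count + dense_count)  # 18 4 22
```

Note that the number of timesteps (3 here) does not appear in the formula, because the same weights are reused at every time step.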

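2.2. Verify the RNN logic

Let's write code to verify this process: set the weights by hand and check whether the layer's output matches a manual calculation.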

import numpy as np
from keras.layers import SimpleRNN


def change_weight():
    # Create a SimpleRNN layer with no activation so the arithmetic is easy to check by hand
    rnn_layer = SimpleRNN(units=3, input_shape=(3, 2), activation=None, return_sequences=True)
    # Dummy input data: a batch of 3 sequences, each with 3 timesteps and 2 features
    input_data = np.array([
        [[1.0, 2], [2, 3], [3, 4]],
        [[5, 6], [6, 7], [7, 8]],
        [[9, 10], [10, 11], [11, 12]],
    ])
    # Call the layer once so that it builds and initializes its weights and bias
    _ = rnn_layer(input_data)
    # Access the weights and bias of the RNN layer
    kernel, recurrent_kernel, biases = rnn_layer.get_weights()
    # Print the initial weights and bias
    print("recurrent_kernel:", recurrent_kernel)  # shape (3, 3)
    print('kernal:', kernel)  # shape (2, 3)
    print('biase: ', biases)  # shape (3,)
    # Overwrite the weights with simple, hand-picked values
    kernel = np.array([[1, 0, 2], [2, 1, 3]])
    recurrent_kernel = np.array([[1, 2, 1.0], [1, 0, 1], [0, 1, 0]])
    biases = np.array([0, 0, 1.0])
    rnn_layer.set_weights([kernel, recurrent_kernel, biases])
    print(rnn_layer.get_weights())
    # Run a single test sequence through the layer
    test_data = np.array([[[1.0, 3], [1, 1], [2, 3]]])
    output = rnn_layer(test_data)
    print(output)


if __name__ == '__main__':
    change_weight()

The output is shown below. As you can see, the result matches my hand calculation.

recurrent_kernel: [[ 0.06973135  0.40464386  0.9118119 ]
 [ 0.6186313  -0.7345941   0.27868783]
 [ 0.7825809   0.5446422  -0.3015495 ]]
kernal: [[-0.48868906  0.52718353 -0.08321357]
 [-1.0569452  -0.9872779   0.72809434]]
biase:  [0. 0. 0.]
[array([[1., 0., 2.],
       [2., 1., 3.]], dtype=float32), array([[1., 2., 1.],
       [1., 0., 1.],
       [0., 1., 0.]], dtype=float32), array([0., 0., 1.], dtype=float32)]
tf.Tensor(
[[[ 7.  3. 12.]
  [13. 27. 16.]
  [48. 45. 54.]]], shape=(1, 3, 3), dtype=float32)
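The hand calculation itself is not shown in the article, so here is a minimal NumPy sketch of it. It assumes a zero initial hidden state and uses Keras's row-vector convention, in which each step computes x · kernel + h · recurrent_kernel + bias. In terms of the formula in section 1, `kernel` plays the role of U transposed and `recurrent_kernel` the role of W transposed. With `activation=None` there is no tanh.

```python
import numpy as np

# The hand-picked weights and the test sequence from the code above
kernel = np.array([[1, 0, 2], [2, 1, 3]], dtype=float)                        # (features, units)
recurrent_kernel = np.array([[1, 2, 1], [1, 0, 1], [0, 1, 0]], dtype=float)   # (units, units)
bias = np.array([0, 0, 1], dtype=float)
test_seq = np.array([[1, 3], [1, 1], [2, 3]], dtype=float)                    # (timesteps, features)

h = np.zeros(3)  # initial hidden state, assumed to be zero
states = []
for x in test_seq:
    # One recurrence step with no activation function
    h = x @ kernel + h @ recurrent_kernel + bias
    states.append(h)

states = np.array(states)
print(states)
# [[ 7.  3. 12.]
#  [13. 27. 16.]
#  [48. 45. 54.]]
```

For example, at the second step the input term is [3, 1, 5], the recurrent term is [10, 26, 10], and adding the bias [0, 0, 1] gives [13, 27, 16], the second row of the tensor printed by Keras.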

2.3. Implementing a simple example in code

import numpy as np
from keras.models import Sequential
from keras.layers import SimpleRNN, Dense
from keras.optimizers import Adam

# Sample sequential data
# Each sequence has three timesteps, and each timestep has two features
data = np.array([
    [[1, 2], [2, 3], [3, 4]],
    [[5, 6], [6, 7], [7, 8]],
    [[9, 10], [10, 11], [11, 12]],
])
print('data.shape= ', data.shape)

# Define the RNN model
model = Sequential()
# 4 units in the RNN layer, input_shape=(timesteps, features)
model.add(SimpleRNN(units=4, input_shape=(3, 2), name="simpleRNN"))
model.add(Dense(1, name="output"))  # Output layer with one neuron

# Compile the model
model.compile(loss='mse', optimizer=Adam(learning_rate=0.01))

# Print the model summary
model.summary()

before_RNN_weight = model.get_layer("simpleRNN").get_weights()
print('before train ', before_RNN_weight)

# Train the model: one target value per sequence
model.fit(data, np.array([[10], [20], [30]]), epochs=2000, verbose=1)

RNN_weight = model.get_layer("simpleRNN").get_weights()
print('after train ', len(RNN_weight),)
for i in range(len(RNN_weight)):
    print('====', RNN_weight[i].shape, RNN_weight[i])

# Make predictions
predictions = model.predict(data)
print("Predictions:", predictions.flatten())

Code output

data.shape=  (3, 3, 2)
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 simpleRNN (SimpleRNN)       (None, 4)                 28        
 output (Dense)              (None, 1)                 5         
=================================================================
Total params: 33
Trainable params: 33
Non-trainable params: 0
_________________________________________________________________
before train  [array([[-0.00466371,  0.53100157,  0.5298798 ,  0.05514288],
       [-0.08896947,  0.43185067,  0.7861788 , -0.80616236]],
      dtype=float32), array([[-0.10712242, -0.03620092, -0.02182053, -0.9933471 ],
       [-0.6549012 , -0.02620655,  0.7532524 ,  0.05503315],
       [-0.01986913,  0.9989996 ,  0.02001702, -0.03470401],
       [-0.74781984,  0.00159313, -0.657065  ,  0.09502006]],
      dtype=float32), array([0., 0., 0., 0.], dtype=float32)]
2023-08-05 16:02:44.111298: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz
Epoch 1/2000
....
Epoch 1999/2000
1/1 [==============================] - 0s 11ms/step - loss: 0.0071
Epoch 2000/2000
1/1 [==============================] - 0s 13ms/step - loss: 0.0070
after train  3
==== (2, 4) [[ 0.27645147  0.6025058   1.6083356  -0.38382724]
 [ 0.11586202  0.32901326  1.4760928  -1.2268958 ]]
==== (4, 4) [[-0.99628973 -2.444563    1.7412992  -1.5265529 ]
 [ 0.80340594  0.9488743   2.44552    -0.7439341 ]
 [-0.1827681  -1.3091801   1.547736   -0.6644555 ]
 [-0.5724374   2.3090494  -2.1779017   0.35992467]]
==== (4,) [-0.40184066 -1.2391611   0.33460653 -0.29144585]
1/1 [==============================] - 0s 78ms/step
Predictions: [10.000422 19.999924 29.85534 ]
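As a rough sanity check, the mean squared error can be recomputed from the printed predictions and the training targets [10, 20, 30]. This is only a sketch using the numbers copied from the log above. The logged loss was measured during the final epoch, while the predictions come from the weights after the last update, so the two values should agree only approximately.

```python
import numpy as np

# Values copied from the training log above
predictions = np.array([10.000422, 19.999924, 29.85534])
targets = np.array([10.0, 20.0, 30.0])

# Mean squared error, the same quantity as loss='mse'
mse = np.mean((predictions - targets) ** 2)
print(round(float(mse), 4))
```

The result comes out at about 0.0070, in line with the loss reported for the last epoch. Almost all of the remaining error comes from the third sequence.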
