自学内容网 自学内容网

代码 RNN原理及手写复现

29、PyTorch RNN的原理及其手写复现_哔哩哔哩_bilibili

笔记链接: https://pan.baidu.com/s/1_Sm7ptEiJtTTq3vQWgOTNg?pwd=2rei 提取码: 2rei

import torch
import torch.nn as nn
bs,T=2,3  # batch size, input sequence length
input_size,hidden_size = 2,3 # per-step input feature size, hidden-state feature size
input = torch.randn(bs,T,input_size)  # random batch-first input sequence, shape (bs, T, input_size)
h_prev = torch.zeros(bs,hidden_size) # initial hidden state, shape (bs, hidden_size)
# step1: call the PyTorch RNN API to obtain the reference output
rnn = nn.RNN(input_size,hidden_size,batch_first=True)
# nn.RNN expects h_0 with a leading num_layers*num_directions dim, hence unsqueeze(0)
rnn_output,state_finall = rnn(input,h_prev.unsqueeze(0))

print(rnn_output)
print(state_finall)
# step2: hand-written rnn_forward implementing the RNN recurrence from scratch
def rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev):
    """Single-layer, unidirectional, tanh RNN forward pass (mirrors nn.RNN).

    Recurrence: h_t = tanh(W_ih @ x_t + b_ih + W_hh @ h_{t-1} + b_hh)

    Args:
        input:     (bs, T, input_size) batch-first input sequence.
        weight_ih: (h_dim, input_size) input-to-hidden weight matrix.
        weight_hh: (h_dim, h_dim) hidden-to-hidden weight matrix.
        bias_ih:   (h_dim,) input-to-hidden bias.
        bias_hh:   (h_dim,) hidden-to-hidden bias.
        h_prev:    (bs, h_dim) initial hidden state.

    Returns:
        (h_out, h_n): h_out is (bs, T, h_dim), the hidden state at every
        step; h_n is (1, bs, h_dim), the final state with the leading
        num_layers dimension nn.RNN uses.
    """
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim)  # collects the hidden state of every step

    for t in range(T):
        x = input[:, t, :]  # (bs, input_size), features of the current step
        # Broadcast matmul (x @ W.T) applies the shared weights to the whole
        # batch directly, avoiding the original's per-step tile(bs,1,1)+bmm,
        # which copied the loop-invariant weights bs times on every iteration.
        h_prev = torch.tanh(x @ weight_ih.T + bias_ih
                            + h_prev @ weight_hh.T + bias_hh)
        h_out[:, t, :] = h_prev

    return h_out, h_prev.unsqueeze(0)
# Verify: run the custom forward with the exact weights of the nn.RNN above
custom_rnn_output,custom_state_finall = rnn_forward(input,
                                                    rnn.weight_ih_l0,
                                                    rnn.weight_hh_l0,
                                                    rnn.bias_ih_l0,
                                                    rnn.bias_hh_l0,
                                                    h_prev)
print(custom_rnn_output)
print(custom_state_finall)
# Both checks should print True if the hand-written forward matches the API
print(torch.allclose(rnn_output,custom_rnn_output))
print(torch.allclose(state_finall,custom_state_finall))
# step3: hand-written bidirectional RNN forward, mirroring
# nn.RNN(..., bidirectional=True)
def bidirectional_rnn_forward(input, weight_ih, weight_hh, bias_ih, bias_hh, h_prev,
                              weight_ih_reverse, weight_hh_reverse, bias_ih_reverse,
                              bias_hh_reverse, h_prev_reverse):
    """Bidirectional single-layer RNN built from two rnn_forward passes.

    The forward direction processes the sequence left-to-right; the backward
    direction runs the same recurrence on the time-reversed sequence with its
    own parameters. Outputs are concatenated along the feature dimension.

    Args:
        input:  (bs, T, input_size) batch-first input sequence.
        weight_ih/weight_hh/bias_ih/bias_hh/h_prev: forward-direction
            parameters and (bs, h_dim) initial state (see rnn_forward).
        *_reverse: the analogous parameters/initial state for the
            backward direction.

    Returns:
        (h_out, h_n): h_out is (bs, T, 2*h_dim) with forward features in
        [:h_dim] and backward features in [h_dim:]; h_n is (2, bs, h_dim),
        the final state of each direction, matching nn.RNN's h_n layout.
    """
    bs, T, input_size = input.shape
    h_dim = weight_ih.shape[0]
    h_out = torch.zeros(bs, T, h_dim * 2)  # double width: forward + backward features

    # Forward direction: plain left-to-right pass.
    forward_output = rnn_forward(input, weight_ih, weight_hh,
                                 bias_ih, bias_hh, h_prev)[0]
    # Backward direction: same recurrence over the time-reversed input.
    backward_output = rnn_forward(torch.flip(input, [1]), weight_ih_reverse,
                                  weight_hh_reverse, bias_ih_reverse,
                                  bias_hh_reverse, h_prev_reverse)[0]

    h_out[:, :, :h_dim] = forward_output
    # Flip the backward result back so step t aligns with forward step t.
    h_out[:, :, h_dim:] = torch.flip(backward_output, [1])

    # Stack the two final states as (num_directions, bs, h_dim).
    # Note: backward_output[:, -1, :] is the state after consuming the whole
    # reversed sequence, i.e. the backward direction's state at original t=0,
    # which is exactly what nn.RNN reports.
    h_n = torch.zeros(bs, 2, h_dim)
    h_n[:, 0, :] = forward_output[:, -1, :]
    h_n[:, 1, :] = backward_output[:, -1, :]
    h_n = h_n.transpose(0, 1)

    return h_out, h_n

# Verify bidirectional_rnn_forward against PyTorch's bidirectional RNN
bi_rnn = nn.RNN(input_size,hidden_size,batch_first=True,bidirectional=True)
h_prev = torch.zeros((2,bs,hidden_size)) # one initial state per direction
bi_rnn_output,bi_state_finall = bi_rnn(input,h_prev)

# Inspect the parameter layout: *_l0 is the forward direction,
# *_l0_reverse is the backward direction
for k,v in bi_rnn.named_parameters():
    print(k,v)
custom_bi_rnn_output,custom_bi_state_finall = bidirectional_rnn_forward(input,
                                                                        bi_rnn.weight_ih_l0,
                                                                        bi_rnn.weight_hh_l0,
                                                                        bi_rnn.bias_ih_l0,
                                                                        bi_rnn.bias_hh_l0,
                                                                        h_prev[0],
                                                                        bi_rnn.weight_ih_l0_reverse,
                                                                        bi_rnn.weight_hh_l0_reverse,
                                                                        bi_rnn.bias_ih_l0_reverse,
                                                                        bi_rnn.bias_hh_l0_reverse,
                                                                        h_prev[1])
print("Pytorch API output")
print(bi_rnn_output)
print(bi_state_finall)

print("\n custom bidirectional_rnn_forward function output:")
print(custom_bi_rnn_output)
print(custom_bi_state_finall)
# Both checks should print True if the implementations agree
print(torch.allclose(bi_rnn_output,custom_bi_rnn_output))
print(torch.allclose(bi_state_finall,custom_bi_state_finall))


原文地址:https://blog.csdn.net/2301_77549977/article/details/143633665

免责声明:本站文章内容转载自网络资源,如本站内容侵犯了原著者的合法权益,可联系本站删除。更多内容请关注自学内容网(zxcms.com)!