OpenAI-gym how to implement a timer for a certain action in step()

Published 2024-09-24 17:31 · Tags: ai python pygame openai api gym

Question:

One of the actions I want the agent to perform needs a delay between consecutive uses. For context, in pygame I have the following code for shooting a bullet:

if keys[pygame.K_SPACE]:
    current_time = pygame.time.get_ticks()
    # ready to fire when 600 ms have passed.
    if current_time - previous_time > 600:
        previous_time = current_time
        bullets.append([x + 25, y + 24])

I've set a timer to prevent bullet spamming. How would I construct this to work with the step() method? My other actions are moving up, down, left, and right.

This is my first time creating a project with OpenAI-gym, so I'm not sure what the capabilities of the toolkit are. Any help would be greatly appreciated.

Answer:

You can use whatever method of tracking time you like (other than pygame.time.get_ticks() I suppose), and use a similar approach as in that pygame code. You'd want to store previous_time as a member of the environment instead of just a local variable, because you want it to persist across function calls.

It's not easy to actually prevent your Reinforcement Learning agent (assuming you're using gym for RL) from selecting the fire action altogether, but you can simply implement the step() function in such a way that the fire action has no effect if the agent selects it too soon after the previous shot.

As for measuring time, you could measure wall-clock time, but then the speed of your CPU would influence how often your agent is allowed to shoot: it might be able to fire a new bullet every step on very old hardware, yet only once every 100 steps on powerful hardware. That's probably a bad idea. Instead, I'd recommend measuring "time" simply by counting the step() calls. For example, using only the code from your question above, the step() function could look like:

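def step(self, action):
    # "time" is measured in step() calls, not in wall-clock milliseconds
    self.step_counter += 1

    # other step() code here

    if action == FIRE:
        # ready to fire once more than 10 steps have passed since the last shot
        if self.step_counter - self.previous_time > 10:    # or any other number
            self.previous_time = self.step_counter
            bullets.append([x + 25, y + 24])

    # other step() code here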
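
The threshold of 10 steps is arbitrary. If the environment is assumed to advance a fixed number of steps per simulated second (an assumption; neither the question nor gym specifies one), the original 600 ms delay can be converted into a step count. A minimal sketch, where ms_to_steps and steps_per_second are hypothetical names:

```python
def ms_to_steps(cooldown_ms, steps_per_second):
    """Convert a delay in milliseconds to a whole number of step() calls, rounding up."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return (cooldown_ms * steps_per_second + 999) // 1000


# Hypothetical rates: the same 600 ms delay at 60 and 30 steps per simulated second.
print(ms_to_steps(600, 60))
print(ms_to_steps(600, 30))
```

The result could stand in for the constant 10 in the comparison. Because it depends only on the simulated rate, the cooldown stays the same however fast the hardware runs the training loop.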

Don't forget that you'll also want to reset your newly added member variables in reset():

def reset(self):
    self.step_counter = 0
    self.previous_time = -100   # some negative number such that your agent can fire at start
    # other reset() code here
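
Putting the two methods together, here is a rough, dependency-free sketch of the whole idea. It only mirrors the reset()/step() shape of a gym environment and deliberately does not subclass gym.Env; the class name ShooterEnvSketch, the FIRE index, the cooldown_steps parameter, the "fired" info key, and the placeholder observation and reward are all assumptions made for illustration, not part of the original code.

```python
FIRE = 4  # hypothetical action index; 0-3 would be up, down, left, right


class ShooterEnvSketch:
    """Stand-in with gym-style reset()/step() methods (does not import gym)."""

    def __init__(self, cooldown_steps=10):
        self.cooldown_steps = cooldown_steps
        self.reset()

    def reset(self):
        self.step_counter = 0
        # Far enough in the past that the very first FIRE action succeeds.
        self.previous_time = -(self.cooldown_steps + 1)
        self.x, self.y = 0, 0
        self.bullets = []
        return self._observation()

    def step(self, action):
        self.step_counter += 1
        fired = False
        if action == FIRE:
            # FIRE is silently ignored while the cooldown is still running.
            if self.step_counter - self.previous_time > self.cooldown_steps:
                self.previous_time = self.step_counter
                self.bullets.append([self.x + 25, self.y + 24])
                fired = True
        # Movement, bullet updates, and a real reward are omitted in this sketch.
        reward, done = 0.0, False
        return self._observation(), reward, done, {"fired": fired}

    def _observation(self):
        # Placeholder observation: player position and number of live bullets.
        return (self.x, self.y, len(self.bullets))


env = ShooterEnvSketch(cooldown_steps=10)
env.reset()
# Spam FIRE for 25 steps and record the steps on which a bullet was actually created.
fired_on = [t for t in range(1, 26) if env.step(FIRE)[3]["fired"]]
print(fired_on)  # prints [1, 12, 23]
```

The 4-tuple returned by step() follows the classic gym convention of (observation, reward, done, info); newer Gymnasium releases return five values (terminated and truncated instead of done), so the return statement would need adjusting there.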

Original source: https://blog.csdn.net/suiusoar/article/details/142456176
