
AIGC Notes--Understanding the Classifier Guidance Code

1--Complete Debuggable Code

Simple_Classifer_Guide

2--Core Code

The snippet below simulates a single guided reverse-diffusion step: cond_fn computes the classifier gradient, condition_mean uses that gradient to shift the predicted mean, and p_sample draws the next sample from the shifted distribution.

import torch as th
import torch.nn.functional as F

from tools.unet import EncoderUNetModel

# Core function
def cond_fn(x, t, y = None, classifier = None): 
    # x.shape: [batch_size, 3, 64, 64]
    # t.shape: [batch_size] 
    # y.shape: [batch_size]
    assert y is not None
    with th.enable_grad():
        x_in = x.detach().requires_grad_(True)
        logits = classifier(x_in, t) # [batch_size, 1000] # the classifier is essentially an EncoderUNetModel
        log_probs = F.log_softmax(logits, dim = -1) # [batch_size, 1000]
        selected = log_probs[range(len(logits)), y.view(-1)] # [batch_size]
        classifier_scale = 1.0 # set by ljf
        return th.autograd.grad(selected.sum(), x_in)[0] * classifier_scale # th.autograd.grad(y, x) computes the gradient of y w.r.t. x
        # return th.autograd.grad(selected.sum(), x_in)[0] * args.classifier_scale # [batch_size, 3, 64, 64]
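
# Note: cond_fn returns classifier_scale * grad_x log p(y | x_t), i.e. the gradient
# of the classifier's log-probability for the target class with respect to the noisy
# input. th.enable_grad() and requires_grad_(True) are needed because sampling
# normally runs under no_grad, yet this gradient must be computed.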

def condition_mean(cond_fn, p_mean_var, x, t, model_kwargs = None, classifier = None):
    # gradient = cond_fn(x, self._scale_timesteps(t), **model_kwargs) # origin version
    gradient = cond_fn(x, t, y = model_kwargs, classifier = classifier) # [batch_size, 3, 64, 64] set by ljf
    new_mean = (
        p_mean_var["mean"].float() + p_mean_var["variance"] * gradient.float() # [batch_size, 3, 64, 64]
    )
    return new_mean
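
# Note: condition_mean implements the classifier-guided mean shift from
# Dhariwal & Nichol (2021), "Diffusion Models Beat GANs on Image Synthesis":
#     mu_new = mu + Sigma * grad_x log p(y | x_t)
# which nudges the reverse-step mean toward samples the classifier labels as y.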

def p_sample(x, t, cond_fn = None, model_kwargs = None, classifier = None):
        # origin version
        # out = self.p_mean_variance(
        #     model,
        #     x,
        #     t,
        #     clip_denoised=clip_denoised,
        #     denoised_fn=denoised_fn,
        #     model_kwargs=model_kwargs,
        # )

        # Here random tensors are used to simulate the output of p_mean_variance
        out = {}
        out["mean"] = th.rand(x.shape)
        out["variance"] = th.rand(x.shape)
        out["log_variance"] = th.rand(x.shape)
        out["pred_xstart"] = th.rand(x.shape)

        noise = th.randn_like(x)
        nonzero_mask = (
            (t != 0).float().view(-1, *([1] * (len(x.shape) - 1)))
        )  # no noise when t == 0
        if cond_fn is not None:
            out["mean"] = condition_mean(
                cond_fn, out, x, t, model_kwargs = model_kwargs, classifier = classifier
            )
        sample = out["mean"] + nonzero_mask * th.exp(0.5 * out["log_variance"]) * noise # [batch_size, 3, 64, 64]
        return {"sample": sample, "pred_xstart": out["pred_xstart"]}


if __name__ == "__main__":
    batch_size = 2
    channel = 3
    height = 64
    width = 64
    x = th.rand(batch_size, channel, height, width) # randomly generated input
    t = th.tensor([999]).repeat(batch_size) # simulated time steps (all set to 999 here)
    y = th.randint(low = 0, high = 1000, size = (batch_size,)) # simulated class labels; high = 1000 covers all 1000 classes (th.randint's high is exclusive)
    classifier = EncoderUNetModel(
        image_size = 64,
        in_channels = 3,
        model_channels = 128,
        out_channels = 1000, # 1000 classes
        num_res_blocks = 3,
        attention_resolutions = [32,16,8])
    
    output = p_sample(x = x, t = t, model_kwargs = y, cond_fn = cond_fn, classifier = classifier) # pass in the cond_fn function and the classifier
    assert list(output['sample'].shape) == [batch_size, channel, height, width]
    print("output['sample'].shape: ", output['sample'].shape)
    print("All Done!")


Original article: https://blog.csdn.net/weixin_43863869/article/details/140493373
