YOLOv11 improvement: YOLOv11 combined with DynamicConv (dynamic convolution), CVPR 2024, a secondary innovation on the C3k2 structure

🕗 Published 2024-11-17 04:15 | Tags: YOLO, object detection, artificial intelligence, computer vision, deep learning

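Abstract

Large-scale visual pretraining has significantly improved the performance of large vision models. Existing low-FLOPs models, however, cannot benefit from large-scale pretraining. In this paper, the authors propose a new design principle called ParameterNet, which aims to increase the number of parameters in large-scale visual pretraining models while keeping the increase in FLOPs minimal. It uses DynamicConv (dynamic convolution) to add extra parameters to the network with almost no increase in FLOPs. The ParameterNet approach lets low-FLOPs networks benefit from large-scale visual pretraining.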
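
To make the "more parameters, almost the same FLOPs" claim concrete, here is a rough back-of-the-envelope count for a single layer. It is a sketch under stated assumptions, not the paper's own accounting: the layer sizes are made up, the stride is 1, operations are counted as multiply-accumulates, and the dynamic layer is assumed to pool the input globally, compute one routing weight per expert, and mix the expert kernels once per sample. The function names are hypothetical.

```python
def conv_cost(c_in, c_out, k, h, w):
    """Parameters and multiply-accumulates (MACs) of a standard k x k convolution."""
    params = c_in * c_out * k * k
    macs = h * w * params  # one kernel application per output position
    return params, macs


def dynamic_conv_cost(c_in, c_out, k, h, w, num_experts):
    """Rough cost of a dynamic convolution with `num_experts` expert kernels.

    Assumed structure: global average pooling -> linear routing layer ->
    weighted sum of expert kernels -> one ordinary convolution.
    """
    base_params, base_macs = conv_cost(c_in, c_out, k, h, w)
    routing_params = c_in * num_experts + num_experts  # weights + bias
    params = num_experts * base_params + routing_params
    pooling_ops = h * w * c_in               # accumulate every input value once
    routing_macs = c_in * num_experts        # linear routing layer
    mixing_macs = num_experts * base_params  # scale and sum the expert kernels
    macs = base_macs + pooling_ops + routing_macs + mixing_macs
    return params, macs


# Hypothetical layer: 64 -> 64 channels, 3x3 kernel, 40x40 feature map, 4 experts
std_params, std_macs = conv_cost(64, 64, 3, 40, 40)
dyn_params, dyn_macs = dynamic_conv_cost(64, 64, 3, 40, 40, num_experts=4)

print(f"parameters: x{dyn_params / std_params:.2f}")
print(f"MACs:       x{dyn_macs / std_macs:.4f}")
```

With these made-up sizes the parameter count grows about fourfold while the operation count grows by less than one percent, because the extra work (pooling, routing, kernel mixing) does not scale with the number of output positions the way the convolution itself does.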

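# Theory

DynamicConv (dynamic convolution) is a technique for improving the performance of convolutional neural networks (CNNs). Its core idea is to generate the convolution kernels (filters) dynamically instead of using a fixed kernel. This input-dependent flexibility strengthens the expressive power of the convolution and, in turn, improves model performance. It works as follows:

Expert selection: DynamicConv introduces multiple "experts", each of which learns a particular convolution pattern. Different inputs weight the experts differently. In the DynamicConv code given later in this article, the weights are computed once per input sample from globally average-pooled features, not separately for each image region.
Dynamic kernel generation: based on the features of the input, the expert network produces the convolution kernel dynamically instead of using a fixed one. The convolution is therefore adjusted to the input features, which gives it more flexibility and expressive power.
Mixed convolution: in some cases the convolution results of several experts are fused by weighted combination to form the final convolution output. The weighting scheme can be adjusted to suit the task.

The core advantage of DynamicConv is that it generates highly adaptive convolution kernels from the input features, which improves the model's expressive power and flexibility.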
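
The mechanism described above can be sketched in plain Python for the simplest case, a 1x1 convolution applied to one sample. This is an illustration under assumptions, not the timm CondConv2d implementation: all function and variable names are hypothetical, and the routing uses a sigmoid over globally pooled features, mirroring the DynamicConv class given later in this article.

```python
import math


def matvec(weight, vector):
    """Multiply a (rows x c_in) matrix by a c_in-vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weight]


def routing_weights(pixels, route_w, route_b):
    """Global average pool over all pixels, then one sigmoid score per expert."""
    c_in = len(pixels[0])
    pooled = [sum(p[c] for p in pixels) / len(pixels) for c in range(c_in)]
    logits = [s + b for s, b in zip(matvec(route_w, pooled), route_b)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def dynamic_conv1x1(pixels, experts, route_w, route_b):
    """Mix the expert kernels with the routing weights, then apply the mixed kernel."""
    alphas = routing_weights(pixels, route_w, route_b)
    c_out, c_in = len(experts[0]), len(experts[0][0])
    mixed = [
        [sum(a * e[o][i] for a, e in zip(alphas, experts)) for i in range(c_in)]
        for o in range(c_out)
    ]
    return [matvec(mixed, p) for p in pixels], alphas


# Toy sample: 3 pixels with 2 channels each, 2 experts mapping 2 -> 2 channels
pixels = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]
experts = [
    [[1.0, 0.0], [0.0, 1.0]],  # expert 0: identity
    [[0.0, 1.0], [1.0, 0.0]],  # expert 1: swaps the two channels
]
route_w = [[0.2, -0.1], [-0.3, 0.4]]  # one row per expert
route_b = [0.0, 0.1]

outputs, alphas = dynamic_conv1x1(pixels, experts, route_w, route_b)
print(alphas)
print(outputs)
```

Because a convolution is linear in its kernel, mixing the kernels first gives the same result as running every expert and mixing their outputs, but only one convolution is actually executed.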

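For a detailed explanation of the theory, see: paper link
The code is available at: code link

Everything below is a step-by-step tutorial; follow along and the module will be added successfully.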

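Contents

Abstract
Theory
🎓 I. Downloading the original YOLOv11 code
- 🍀🍀 1. YOLOv11 model structure diagram
- 🍀🍀 2. Environment setup
🎓 II. DynamicConv code
🎓 III. How to add the module
🎓 IV. Modifying the yaml file
- 🍀🍀 1. First way to add the module
🎓 V. Modifying the training file
- 🍀🍀 1. Create the training file
- 🍀🍀 2. Edit the training file
Summary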

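🎓 I. Downloading the original YOLOv11 code

Official source download: official source

If the official site won't open, download it from my cloud drive instead. Cloud-drive link: YOLOv11 original source download. The version is ultralytics-8.3.6, extraction code: ehhs

Note: if you already downloaded the YOLOv11 source from one of my earlier articles, there is no need to download it again. Unless stated otherwise, all articles use the same version of the source.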

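🍀🍀 1. YOLOv11 model structure diagram

The overall YOLO architecture, drawn from yolo11.yaml, is shown in the figure below.
[Figure: overall YOLOv11 architecture diagram]

🍀🍀 2. Environment setup

For environment setup, see the tutorial: environment setup link. If your environment is already configured, skip this step.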

🎓 II. DynamicConv code

# -*- coding: utf-8 -*-
"""
@Auth : 挂科边缘
@File : DynamicConv.py
@IDE : PyCharm
@Motto : Learn new ideas, strive to be a new youth
"""
"""
An implementation of GhostNet Model as defined in:
GhostNet: More Features from Cheap Operations. https://arxiv.org/abs/1911.11907
The train script of the model is similar to that of MobileNetV3
Original model: https://github.com/huawei-noah/CV-backbones/tree/master/ghostnet_pytorch
"""
import math
from functools import partial

import torch
import torch.nn as nn
import torch.nn.functional as F

from timm.data import IMAGENET_DEFAULT_MEAN, IMAGENET_DEFAULT_STD
# timm.models.layers is a deprecated alias of timm.layers, so everything is imported from timm.layers
from timm.layers import CondConv2d, DropPath, SqueezeExcite, drop_path, hard_sigmoid

from ultralytics.nn.modules.block import Bottleneck, C2f, C3



def _cfg(url='', **kwargs):
    return {
        'url': url, 'num_classes': 1000, 'input_size': (3, 224, 224), 'pool_size': (1, 1),
        'crop_pct': 0.875, 'interpolation': 'bilinear',
        'mean': IMAGENET_DEFAULT_MEAN, 'std': IMAGENET_DEFAULT_STD,
        'first_conv': 'conv_stem', 'classifier': 'classifier',
        **kwargs
    }


default_cfgs = {
    'ghostnet_100': _cfg(
        url='https://github.com/huawei-noah/CV-backbones/releases/download/ghostnet_pth/ghostnet_1x.pth'),
    'ghostnet': _cfg(url=''),
}

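def _SE_LAYER(chs, se_ratio=0.25, act_layer=nn.ReLU):
    """Build the squeeze-and-excitation block used by GhostBottleneck.

    timm.layers.SqueezeExcite takes rd_ratio / rd_divisor / gate_layer; the gate_fn, divisor and
    se_ratio keywords used in the original GhostNet code are not accepted by it.
    """
    return SqueezeExcite(chs, rd_ratio=se_ratio, rd_divisor=4, act_layer=act_layer, gate_layer='hard_sigmoid')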


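class DynamicConv(nn.Module):
    """ Dynamic Conv layer

    Holds `num_experts` kernels (inside CondConv2d) and mixes them per input sample, using
    sigmoid routing weights computed from the globally average-pooled input.
    """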

    def __init__(self, in_features, out_features, kernel_size=1, stride=1, padding='', dilation=1,
                 groups=1, bias=False, num_experts=4):
        super().__init__()
        self.routing = nn.Linear(in_features, num_experts)
        self.cond_conv = CondConv2d(in_features, out_features, kernel_size, stride, padding, dilation,
                                    groups, bias, num_experts)

    def forward(self, x):
        pooled_inputs = F.adaptive_avg_pool2d(x, 1).flatten(1)  # CondConv routing
        routing_weights = torch.sigmoid(self.routing(pooled_inputs))
        x = self.cond_conv(x, routing_weights)
        return x


class ConvBnAct(nn.Module):
    """ Conv + Norm Layer + Activation w/ optional skip connection
    """

    def __init__(
            self, in_chs, out_chs, kernel_size, stride=1, dilation=1, pad_type='',
            skip=False, act_layer=nn.ReLU, norm_layer=nn.BatchNorm2d, drop_path_rate=0., num_experts=4):
        super(ConvBnAct, self).__init__()
        self.has_residual = skip and stride == 1 and in_chs == out_chs
        self.drop_path_rate = drop_path_rate
        # self.conv = create_conv2d(in_chs, out_chs, kernel_size, stride=stride, dilation=dilation, padding=pad_type)
        self.conv = DynamicConv(in_chs, out_chs, kernel_size, stride, dilation=dilation, padding=pad_type,
                                num_experts=num_experts)
        self.bn1 = norm_layer(out_chs)
        self.act1 = act_layer()

    def feature_info(self, location):
        # DynamicConv has no out_channels attribute of its own; read it from the wrapped CondConv2d
        out_chs = self.conv.cond_conv.out_channels
        if location == 'expansion':  # output of conv after act, same as block output
            info = dict(module='act1', hook_type='forward', num_chs=out_chs)
        else:  # location == 'bottleneck', block output
            info = dict(module='', hook_type='', num_chs=out_chs)
        return info

    def forward(self, x):
        shortcut = x
        x = self.conv(x)
        x = self.bn1(x)
        x = self.act1(x)
        if self.has_residual:
            if self.drop_path_rate > 0.:
                x = drop_path(x, self.drop_path_rate, self.training)
            x += shortcut
        return x


class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, act_layer=nn.ReLU, num_experts=4):
        super(GhostModule, self).__init__()
        self.oup = oup
        init_channels = math.ceil(oup / ratio)
        new_channels = init_channels * (ratio - 1)

        self.primary_conv = nn.Sequential(
            DynamicConv(inp, init_channels, kernel_size, stride, kernel_size // 2, bias=False, num_experts=num_experts),
            nn.BatchNorm2d(init_channels),
            act_layer() if act_layer is not None else nn.Sequential(),
        )

        self.cheap_operation = nn.Sequential(
            DynamicConv(init_channels, new_channels, dw_size, 1, dw_size // 2, groups=init_channels, bias=False,
                        num_experts=num_experts),
            nn.BatchNorm2d(new_channels),
            act_layer() if act_layer is not None else nn.Sequential(),
        )

    def forward(self, x):
        x1 = self.primary_conv(x)
        x2 = self.cheap_operation(x1)
        out = torch.cat([x1, x2], dim=1)
        return out[:, :self.oup, :, :]


class GhostBottleneck(nn.Module):
    """ Ghost bottleneck w/ optional SE"""

    def __init__(self, in_chs, mid_chs, out_chs, dw_kernel_size=3,
                 stride=1, act_layer=nn.ReLU, se_ratio=0., drop_path=0., num_experts=4):
        super(GhostBottleneck, self).__init__()
        has_se = se_ratio is not None and se_ratio > 0.
        self.stride = stride

        # Point-wise expansion
        self.ghost1 = GhostModule(in_chs, mid_chs, act_layer=act_layer, num_experts=num_experts)

        # Depth-wise convolution
        if self.stride > 1:
            self.conv_dw = nn.Conv2d(
                mid_chs, mid_chs, dw_kernel_size, stride=stride,
                padding=(dw_kernel_size - 1) // 2, groups=mid_chs, bias=False)
            self.bn_dw = nn.BatchNorm2d(mid_chs)
        else:
            self.conv_dw = None
            self.bn_dw = None

        # Squeeze-and-excitation
        self.se = _SE_LAYER(mid_chs, se_ratio=se_ratio,
                            act_layer=act_layer if act_layer is not nn.GELU else nn.ReLU) if has_se else None

        # Point-wise linear projection
        self.ghost2 = GhostModule(mid_chs, out_chs, act_layer=None, num_experts=num_experts)

        # shortcut
        if in_chs == out_chs and self.stride == 1:
            self.shortcut = nn.Sequential()
        else:
            self.shortcut = nn.Sequential(
                DynamicConv(
                    in_chs, in_chs, dw_kernel_size, stride=stride,
                    padding=(dw_kernel_size - 1) // 2, groups=in_chs, bias=False, num_experts=num_experts),
                nn.BatchNorm2d(in_chs),
                DynamicConv(in_chs, out_chs, 1, stride=1, padding=0, bias=False, num_experts=num_experts),
                nn.BatchNorm2d(out_chs),
            )

        self.drop_path = DropPath(drop_path) if drop_path > 0. else nn.Identity()

    def forward(self, x):
        shortcut = x

        # 1st ghost bottleneck
        x = self.ghost1(x)

        # Depth-wise convolution
        if self.conv_dw is not None:
            x = self.conv_dw(x)
            x = self.bn_dw(x)

        # Squeeze-and-excitation
        if self.se is not None:
            x = self.se(x)

        # 2nd ghost bottleneck
        x = self.ghost2(x)

        x = self.shortcut(shortcut) + self.drop_path(x)
        return x




class DynamicConv_Bottleneck(Bottleneck):

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):  # ch_in, ch_out, shortcut, groups, kernels, expand
        super().__init__(c1, c2, shortcut, g, k, e)
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = DynamicConv(c1, c_)
        self.cv2 = DynamicConv(c_, c2)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))






class C3k2_GhostModule(C2f):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""
    def __init__(self, c1, c2, n=1, c3k=False, e=0.5, g=1, shortcut=True):
        """Initializes the C3k2 module, a faster CSP Bottleneck with 2 convolutions and optional C3k blocks."""
        super().__init__(c1, c2, n, shortcut, g, e)
        self.m = nn.ModuleList(
            C3k_GhostModule(self.c, self.c, 2, shortcut, g) if c3k else GhostModule(self.c, self.c) for _ in range(n)
        )




class C3k_GhostModule(C3):
    """C3k is a CSP bottleneck module with customizable kernel sizes for feature extraction in neural networks."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5, k=3):
        """Initializes the C3k module with specified channels, number of layers, and configurations."""
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        # self.m = nn.Sequential(*(RepBottleneck(c_, c_, shortcut, g, k=(k, k), e=1.0) for _ in range(n)))
        self.m = nn.Sequential(*(GhostModule(c_, c_) for _ in range(n)))
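
The channel bookkeeping inside GhostModule is easy to misread, especially when the requested output width is odd, so here is a small sketch of the arithmetic only (no tensors involved). The helper name `ghost_channels` is hypothetical; the formulas are taken from the `__init__` and `forward` methods of GhostModule above.

```python
import math


def ghost_channels(oup, ratio=2):
    """Channel counts used by GhostModule for a requested output width `oup`."""
    init_channels = math.ceil(oup / ratio)       # produced by primary_conv
    new_channels = init_channels * (ratio - 1)   # produced by cheap_operation
    concatenated = init_channels + new_channels  # torch.cat([x1, x2], dim=1)
    kept = min(oup, concatenated)                # out[:, :oup] drops any surplus
    return init_channels, new_channels, concatenated, kept


for oup in (64, 7):
    print(oup, ghost_channels(oup))
```

For an even width such as 64 the two halves fit exactly. For an odd width such as 7 the concatenation has one channel too many and the final slice trims it. Note also that `cheap_operation` uses `groups=init_channels`, which works because `new_channels` is always a multiple of `init_channels`.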

🎓 III. How to add the module

🍀🍀 1. Add the code from Section II to the modules directory

(1) In the ultralytics/nn/modules directory, create a new file. I named mine DynamicConv.py.

(2) Then copy the code from Section II into it.

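🍀🍀 2. Import it in __init__.py

The file path is: ultralytics/nn/modules/__init__.py

(1) Before adding the import, comment out the code shown in the screenshot at the top of __init__.py. Once this has been done, you can skip this step in the future.
[Screenshot: the code to comment out at the top of __init__.py]

(2) Then import the module at the top of __init__.py:

from .DynamicConv import C3k2_GhostModule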

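🍀🍀 3. Register it in tasks.py

(1) Import all modules at the top of tasks.py. The file path is: ultralytics/nn/tasks.py

In this file, change how the modules are imported: comment out the code shown in the screenshot and switch to a * import. This is much more convenient, because everything is imported in one go and you no longer need to import each new module by hand. Once this step is done, you can skip it in later articles. Add the following line:

from ultralytics.nn.modules import *

[Screenshot: the modified import at the top of tasks.py]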
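
One Python detail is worth keeping in mind when relying on a * import: if a module defines `__all__`, only the names listed there are exported. The sketch below demonstrates this with a throwaway in-memory module. The module name `demo_modules` and the helper `star_import_names` are made up for the demonstration, and nothing here touches the ultralytics package. If the new class is not picked up in tasks.py, checking `__all__` in ultralytics/nn/modules/__init__.py is a reasonable first step.

```python
import sys
import types


def star_import_names(module_source):
    """Return the public names that `from demo_modules import *` would bring in."""
    module = types.ModuleType("demo_modules")
    exec(module_source, module.__dict__)
    sys.modules["demo_modules"] = module
    try:
        namespace = {}
        exec("from demo_modules import *", namespace)
    finally:
        del sys.modules["demo_modules"]
    return {name for name in namespace if not name.startswith("__")}


with_all = """
class Conv: ...
class C3k2_GhostModule: ...
__all__ = ("Conv",)  # the new class is not listed
"""
without_all = """
class Conv: ...
class C3k2_GhostModule: ...
"""

print(star_import_names(with_all))     # only Conv is exported
print(star_import_names(without_all))  # both classes are exported
```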

(2) Then add the module in the parse_model function of this file, as shown in the screenshot.

[Screenshot: C3k2_GhostModule added in parse_model]
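
To illustrate what registering a module in parse_model achieves, here is a simplified stand-in written for this article. It is not the ultralytics source: the function `build_args` and the two name sets are hypothetical. The idea it shows is that a registered block has its yaml arguments rewritten to (input channels, scaled output channels, repeat count, remaining arguments).

```python
import math

# Hypothetical stand-ins for the module sets checked inside parse_model
CHANNEL_AWARE = {"Conv", "C3k2", "C3k2_GhostModule"}  # receive (c1, c2, ...)
REPEAT_AWARE = {"C3k2", "C3k2_GhostModule"}           # also receive the repeat count n


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def build_args(module, c1, repeats, yaml_args, depth, width, max_channels):
    """Turn one yaml row into constructor arguments for a registered module."""
    if module not in CHANNEL_AWARE:
        return list(yaml_args)  # unregistered modules get their args unchanged
    n = max(round(repeats * depth), 1) if repeats > 1 else repeats
    c2 = make_divisible(min(yaml_args[0], max_channels) * width)
    args = [c1, c2, *yaml_args[1:]]
    if module in REPEAT_AWARE:
        args.insert(2, n)  # repeat count goes right after the channel counts
    return args


# Row `[-1, 2, C3k2_GhostModule, [256, False, 0.25]]` at scale n = [0.50, 0.25, 1024],
# assuming the previous layer outputs 32 channels
print(build_args("C3k2_GhostModule", 32, 2, [256, False, 0.25], 0.50, 0.25, 1024))
```

The resulting list lines up with the constructor signature `C3k2_GhostModule(c1, c2, n, c3k, e)`. If the class were missing from the registered sets, it would be called with the raw yaml arguments and the channel counts would not match.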

At this point the improved module has been added to the YOLOv11 source. Next, configure a yaml file to call it.

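🎓 IV. Modifying the yaml file

In the ultralytics/cfg/models/11 directory, copy yolo11.yaml and rename the copy yolo11-xxx.yaml, where xxx is usually the name of the improved module. Then edit that file. My version is shown below.

🍀🍀 1. First way to add the module

The complete yaml is as follows: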

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLO11 object detection model with P3-P5 outputs. For Usage examples see https://docs.ultralytics.com/tasks/detect

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 2, C3k2_GhostModule, [256, False, 0.25]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 2, C3k2_GhostModule, [512, False, 0.25]]
  - [-1, 1, Conv, [512, 3, 2]] # 5-P4/16
  - [-1, 2, C3k2_GhostModule, [512, True]]
  - [-1, 1, Conv, [1024, 3, 2]] # 7-P5/32
  - [-1, 2, C3k2_GhostModule, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 2, C2PSA, [1024]] # 10

# YOLO11n head
head:
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 2, C3k2_GhostModule, [512, False]] # 13

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 2, C3k2_GhostModule, [256, False]] # 16 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 13], 1, Concat, [1]] # cat head P4
  - [-1, 2, C3k2_GhostModule, [512, False]] # 19 (P4/16-medium)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 10], 1, Concat, [1]] # cat head P5
  - [-1, 2, C3k2_GhostModule, [1024, True]] # 22 (P5/32-large)

  - [[16, 19, 22], 1, Detect, [nc]] # Detect(P3, P4, P5)
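
Because C3k2_GhostModule replaces C3k2 in place, no layers are added or removed, so the absolute indices used by Concat and Detect stay valid. The sketch below re-derives the layer numbering from the (from, module) pairs of the yaml above. The helper `resolve_sources` is hypothetical and only mimics how a negative `from` value refers to an earlier layer relative to the current one.

```python
# (from, module) pairs copied from the yaml above, in order
LAYERS = [
    (-1, "Conv"), (-1, "Conv"), (-1, "C3k2_GhostModule"), (-1, "Conv"),
    (-1, "C3k2_GhostModule"), (-1, "Conv"), (-1, "C3k2_GhostModule"), (-1, "Conv"),
    (-1, "C3k2_GhostModule"), (-1, "SPPF"), (-1, "C2PSA"),            # backbone: 0-10
    (-1, "nn.Upsample"), ([-1, 6], "Concat"), (-1, "C3k2_GhostModule"),
    (-1, "nn.Upsample"), ([-1, 4], "Concat"), (-1, "C3k2_GhostModule"),
    (-1, "Conv"), ([-1, 13], "Concat"), (-1, "C3k2_GhostModule"),
    (-1, "Conv"), ([-1, 10], "Concat"), (-1, "C3k2_GhostModule"),
    ([16, 19, 22], "Detect"),                                          # head: 11-23
]


def resolve_sources(index, source):
    """Convert a yaml `from` entry into absolute layer indices."""
    sources = source if isinstance(source, list) else [source]
    return [index + s if s < 0 else s for s in sources]


for i, (source, module) in enumerate(LAYERS):
    inputs = [
        "input image" if j < 0 else f"{j}:{LAYERS[j][1]}"
        for j in resolve_sources(i, source)
    ]
    print(f"{i:2d} {module:<18} <- {', '.join(inputs)}")
```

Reading the printed table, the three Detect inputs (16, 19, 22) are all C3k2_GhostModule layers, which matches the index comments in the yaml.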

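🎓 V. Modifying the training file

🍀🍀 1. Create the training file

(1) Create a new Python file named train.py in the project root. If you have read my earlier articles and already created it, there is no need to create it again.

🍀🍀 2. Edit the training file

YOLOv11 is trained differently from YOLOv5, but the training dataset format is the same as YOLOv5's, so you only need a prepared dataset; I won't go into that here. My training file is shown below. Adjust the parameters to your own training needs: the parameters circled in the screenshot must be changed, and the rest can be changed or left alone as needed.

[Screenshot: training file with the parameters to change circled]

The training code is as follows. If you have read my earlier articles and already copied it, there is no need to copy it again; just modify the parameters.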

# -*- coding: utf-8 -*-
"""
@Auth : 挂科边缘
@File : train.py
@IDE : PyCharm
@Motto : Learn new ideas, strive to be a new youth
@Email : 179958974@qq.com
"""
import warnings
warnings.filterwarnings('ignore')
from ultralytics import YOLO

if __name__ == '__main__':
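    # Point this path at your modified yaml (for example yolo11-xxx.yaml) to train the improved model
    model = YOLO(model=r'D:\2-Python\1-YOLO\YOLOv11\ultralytics-8.3.6\ultralytics\cfg\models\11\yolo11.yaml')
    # model.load('yolo11n.pt')  # Load pretrained weights. Not recommended when testing improvements or running comparison experiments, because pretrained weights do not noticeably improve overall accuracy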
    model.train(data=r'data.yaml',
                imgsz=640,
                epochs=50,
                batch=4,
                workers=0,
                device='',
                optimizer='SGD',
                close_mosaic=10,
                resume=False,
                project='runs/train',
                name='exp',
                single_cls=False,
                cache=False,
                )

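Explanation of the parameters in the training code:

model: path to the model configuration file. When testing improvements, training without pretrained weights is recommended.
data: path to the training dataset configuration file.
imgsz: input image size; 640 means 640x640 pixels.
epochs: number of training epochs.
batch: batch size. The more GPU memory you have, the larger you can set it; choose a value that suits your hardware.
workers: number of data-loading workers; the default is 8. If you run into out-of-memory errors, you can set it to 0.
device: which GPU to train on; leave it empty to select an available GPU or the CPU automatically.
optimizer: optimizer type.
close_mosaic: number of final epochs during which mosaic augmentation is disabled. With epochs=50 and close_mosaic=10, mosaic is turned off for the last 10 epochs.
resume: whether to continue from the last interrupted training run. False starts a new run from scratch. True loads the model weights and optimizer state from the previous run and continues training, which is useful when training was interrupted.
project: project folder in which the training results are saved.
name: name of the results folder.
single_cls: whether to treat all classes as a single class; False keeps the original classes.
cache: whether to cache the data; False means no caching.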
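
The `data` argument expects a dataset configuration file. As a sketch, the snippet below builds the text of a minimal data.yaml using the usual ultralytics dataset keys (`path`, `train`, `val`, `names`). The helper `build_data_yaml`, the directory names, and the two class names are placeholders, so replace them with your own.

```python
def build_data_yaml(root, class_names):
    """Return the text of a minimal dataset config for model.train(data=...)."""
    lines = [
        f"path: {root}  # dataset root directory",
        "train: images/train  # training images, relative to path",
        "val: images/val  # validation images, relative to path",
        "names:",
    ]
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


# Placeholder dataset location and classes
text = build_data_yaml("datasets/my_dataset", ["cat", "dog"])
print(text)

# To save it next to train.py:
# from pathlib import Path
# Path("data.yaml").write_text(text, encoding="utf-8")
```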

Run a test training. The printed YOLOv11 structure shows that the improved module was added successfully.
[Screenshot: printed model structure]

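Summary

Once the environment is configured and the dataset is prepared, training should generally succeed. Writing these posts takes effort, so please leave a like and follow me, and I'll help you avoid failing your courses!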

Original article: https://blog.csdn.net/weixin_44779079/article/details/143788689
