【BoF】《Bag of Freebies for Training Object Detection Neural Networks》

Published 2024-09-20 07:23 | Tags: object detection, artificial intelligence, computer vision, BoF

arXiv-2019

https://github.com/dmlc/gluon-cv

Contents

1 Background and Motivation
2 Related Work
3 Advantages / Contributions
4 Method
5 Experiments
6 Conclusion（own） / Future work

1 Background and Motivation

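For image classification there is already 【BoT】《Bag of Tricks for Image Classification with Convolutional Neural Networks》 (CVPR-2019). Object detection is more complex than image classification, so the authors collect and adapt a bag of freebies specifically for detection. These tricks add no inference cost and give a clear accuracy gain.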

2 Related Work

Scattering tricks from Image Classification
- Learning rate warmup
- Label smoothing
- mixup
- Cosine annealing strategy
Deep Object Detection Pipelines
- one stage
- two stage

3 Advantages / Contributions

The paper collects a bag of freebies for object detection (and proposes a visually coherent image mixup method), improving YOLOv3 by 5 points on the COCO dataset.

4 Method

4.1 Visually Coherent Image Mixup for Object Detection

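The original 【Mixup】《Mixup: Beyond Empirical Risk Minimization》 (ICLR-2018) was proposed for classification tasks.

There the Beta distribution uses $\alpha=\beta=0.5$, so the blending ratio is rather extreme: the mixed image is essentially either A or B.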
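
For reference, the standard mixup formulation can be written as below. The symbols are chosen here for illustration: $(x_i, y_i)$ and $(x_j, y_j)$ are two training samples with one-hot labels, and $\lambda$ is the blending ratio.

```latex
\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad
\tilde{y} = \lambda y_i + (1 - \lambda) y_j, \qquad
\lambda \sim \mathrm{Beta}(\alpha, \beta)
```

When $\alpha = \beta < 1$ the Beta density is U-shaped, so most sampled $\lambda$ values fall close to 0 or 1, which matches the "either A or B" behaviour described above.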

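Applying a Beta distribution of this kind to object detection gives the following result:

[Image: detection result with an elephant pasted into the scene]

The elephant pasted into the scene is easily missed by the detector.

When applying mixup to object detection, the authors change the Beta distribution parameters to $\alpha=\beta=1.5$.

The images are then blended more thoroughly. The authors describe this form of blending as follows: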

similar to the transition frames commonly observed when we are watching low FPS movies or surveillance videos.

The blended result looks like this:

[Image: example of a blended training image]

networks are encouraged to observe unusual crowded patches
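
The blending step described in this section can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function name `detection_mixup` is hypothetical, and placing both images at the top-left corner of a shared canvas without resizing, as well as returning a per-box weight, are assumptions made for the example.

```python
import numpy as np

def detection_mixup(img1, boxes1, img2, boxes2, alpha=1.5, beta=1.5, rng=None):
    """Blend two HxWx3 images and keep the boxes of both (hypothetical helper).

    Boxes are (N, 4) arrays in absolute [xmin, ymin, xmax, ymax] pixels.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = float(rng.beta(alpha, beta))  # blending ratio in (0, 1)

    # Assumption: a canvas large enough for both images, so neither image is
    # resized and all box coordinates remain valid.
    height = max(img1.shape[0], img2.shape[0])
    width = max(img1.shape[1], img2.shape[1])
    mixed = np.zeros((height, width, 3), dtype=np.float32)
    mixed[:img1.shape[0], :img1.shape[1]] += lam * img1.astype(np.float32)
    mixed[:img2.shape[0], :img2.shape[1]] += (1.0 - lam) * img2.astype(np.float32)

    # Keep every box from both images; the per-box weight is an assumption
    # that a loss function could use to scale each object's contribution.
    boxes = np.concatenate([boxes1, boxes2], axis=0)
    weights = np.concatenate([
        np.full(len(boxes1), lam, dtype=np.float32),
        np.full(len(boxes2), 1.0 - lam, dtype=np.float32),
    ])
    return mixed, boxes, weights, lam
```

With $\alpha=\beta=1.5$ the sampled ratio tends to stay away from 0 and 1, so objects from both images remain visible in the blended result.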

4.2 Classification Head Label Smoothing

Standard label smoothing, applied to the classification branch, taken from 【Inception-v3】《Rethinking the Inception Architecture for Computer Vision》 (CVPR-2016).

The one-hot label distribution (drawback: "This encourages the model to be too confident") is replaced with a smoothed distribution.
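
A common way to write label smoothing is $q'_i = (1-\varepsilon)\,q_i + \varepsilon / K$, where $q$ is the one-hot target, $K$ is the number of classes and $\varepsilon$ is a small constant. The sketch below assumes this uniform-prior form; the function name `smooth_labels` and the value `eps=0.1` are illustrative choices, not values taken from the post.

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Return (1 - eps) * one_hot + eps / K, with K classes on the last axis."""
    one_hot = np.asarray(one_hot, dtype=np.float64)
    num_classes = one_hot.shape[-1]
    return (1.0 - eps) * one_hot + eps / num_classes

# Example: 4 classes, ground truth is class 2
smoothed = smooth_labels([0, 0, 1, 0], eps=0.1)
```

The smoothed target still sums to 1, but the ground-truth class no longer has probability exactly 1, which discourages over-confident predictions.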

4.3 Data Preprocessing

（1）Random geometry transformation

random cropping (with constraints)
random expansion
random horizontal flip
random resize (with random interpolation)

Compared with one-stage detectors, two-stage detectors have an additional RoI pooling step and the stages that follow it, so the two-stage pipeline does "not use random cropping techniques during data augmentation."
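
Unlike in classification, geometric augmentations for detection must transform the bounding boxes together with the image. A minimal sketch for the random horizontal flip, assuming boxes are stored as absolute `[xmin, ymin, xmax, ymax]` pixel coordinates (the function name and the probability `p` are hypothetical):

```python
import numpy as np

def random_horizontal_flip(image, boxes, p=0.5, rng=None):
    """Flip an HxWxC image left-right with probability p and update the boxes."""
    rng = np.random.default_rng() if rng is None else rng
    if rng.random() >= p:
        return image, boxes  # unchanged

    width = image.shape[1]
    flipped = image[:, ::-1].copy()

    # Mirror the x coordinates; the old xmax becomes the new xmin.
    out = boxes.copy()
    out[:, 0] = width - boxes[:, 2]
    out[:, 2] = width - boxes[:, 0]
    return flipped, out
```

Random cropping and expansion follow the same pattern: the image transform and the box transform have to be applied consistently.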

（2）Random color jittering

brightness
hue
saturation
contrast

4.4 Training Schedule Revamping

Drawback of the traditional step learning rate schedule:

Step schedule has sharp learning rate transition which may cause the optimizer to re-stabilize the learning momentum in the next few iterations.

The authors use a cosine learning rate schedule ("the higher frequency of learning rate adjustment") together with warmup (to "avoid gradient explosion during the initial training iterations").
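
A cosine schedule with warmup can be sketched as below. The linear shape of the warmup, the parameter names and the `final_lr` default are assumptions made for illustration; the post does not specify them.

```python
import math

def learning_rate(step, total_steps, base_lr, warmup_steps=0, final_lr=0.0):
    """Linear warmup followed by cosine decay from base_lr to final_lr."""
    if step < warmup_steps:
        # Assumed linear ramp: reaches base_lr on the last warmup step
        return base_lr * (step + 1) / warmup_steps

    # Fraction of the decay phase completed, clamped to [0, 1]
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return final_lr + 0.5 * (base_lr - final_lr) * (1.0 + math.cos(math.pi * progress))
```

Because the rate changes a little at every step, there is no sharp transition of the kind the step schedule produces.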

4.5 Synchronized Batch Normalization

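Cross-machine synchronized batch normalization in object detection: the batch statistics are computed over the samples on all devices rather than on each device separately.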
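
The idea can be illustrated with a small NumPy simulation. It assumes each "device" holds an `(N_i, C)` slice of the batch and shares only its sample count, per-channel sum and per-channel sum of squares; real frameworks exchange these quantities with collective communication, and the function name here is hypothetical.

```python
import numpy as np

def synced_batch_stats(per_device_batches):
    """Per-channel mean and variance over all devices combined.

    per_device_batches: list of (N_i, C) arrays, one per simulated device.
    """
    # Each device contributes three small quantities.
    count = sum(x.shape[0] for x in per_device_batches)
    total = sum(x.sum(axis=0) for x in per_device_batches)
    total_sq = sum((x ** 2).sum(axis=0) for x in per_device_batches)

    mean = total / count
    var = total_sq / count - mean ** 2  # biased variance: E[x^2] - E[x]^2
    return mean, var
```

The result equals the statistics of the concatenated batch, which matters for detection because the per-device batch size is usually small.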

4.6 Random shapes training for single-stage object detection networks

$H = W \in \{320, 352, 384, 416, 448, 480, 512, 544, 576, 608\}$
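
The candidate sizes are the multiples of 32 from 320 to 608. A minimal sketch of drawing one square input size per mini-batch follows; the names `SIZES` and `pick_training_size` are hypothetical, and how often the size is re-drawn is left to the training loop.

```python
import random

# 320, 352, ..., 608: ten square input sizes in steps of 32
SIZES = list(range(320, 608 + 1, 32))

def pick_training_size(rng=random):
    """Draw one (height, width) pair with H == W for the next mini-batch."""
    side = rng.choice(SIZES)
    return side, side
```

All images in a mini-batch are then resized to the drawn size, so the network sees objects at varying scales during training.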

5 Experiments

YOLOv3
Faster R-CNN

5.1 Datasets and Metrics

PASCAL VOC
Pascal VOC 2007 trainval and 2012 trainval for training and 2007 test set for validation.
COCO

5.2 Incremental trick evaluation on Pascal VOC

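Gain from the mixup modification:

[Image: results for the mixup variants]

Gains from the other freebies:

[Image: incremental results for the other freebies]

The one-stage detector relies more heavily on data augmentation.

For the two-stage detector, "sampling based proposals can effectively replace random cropping", so it depends less on data augmentation.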

5.3 Bag of Freebies on MS COCO

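[Image: results on MS COCO]

The improvement for YOLOv3 is substantial.

[Image: per-category results]

Across the categories, nearly all entries are shown in red, which marks an improvement.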

5.4 Impact of mixup on different phases of training detection network

Mixup is involved in two places:

pre-training classification network backbone with traditional mixup
training detection networks using proposed visually coherent image mixup for object detection

Using mixup in both backbone pre-training and detection training gives the largest improvement.

The authors' explanation:

We expect by applying mixup in both training phases, shallow layers of networks are receiving statistically similar inputs, resulting in less perturbations for low level filters.

6 Conclusion（own） / Future work

Rosenfeld A, Zemel R, Tsotsos J K. The elephant in the room[J]. arXiv preprint arXiv:1808.03305, 2018.
a large amount of anchor size(up to 30k) is effectively contributing to batch size implicitly

Original article: https://blog.csdn.net/bryant_meng/article/details/142335476
