Converting a YOLOv5 .pt model to RKNN for the RK3588, and calling the API on the RK3588 for forward inference
When running YOLOv5 object detection in an edge-computing scenario, the RK3588 board is a good choice if cost-effectiveness or domestically produced hardware matters.
This post walks through converting a YOLOv5 PyTorch model to RKNN, and shows how to call the relevant API on the RK board to run forward inference with the converted model.
PT-to-RKNN conversion workflow
PT to ONNX
First, convert the trained .pt model to an intermediate ONNX model. Before converting, you need to modify part of yolo.py under models in the project root: comment out the inference branch of the Detect head's forward method (the spot shown in the original post's screenshot) and replace it with:
def forward(self, x):
    z = []
    for i in range(self.nl):
        x[i] = self.m[i](x[i])
    return x
If the resulting detections come out as a mess of misplaced boxes, use this version instead:
def forward(self, x):
    z = []
    for i in range(self.nl):
        x[i] = torch.sigmoid(self.m[i](x[i]))
    return x
Remember to change this back before training again, otherwise training will error out; the modification is only needed when exporting the model.
Next, run export.py to produce the ONNX model:
python export.py --weights runs/train/exp/weights/best.pt --img 320 --batch 1 --include onnx
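As an optional sanity check (not part of the original workflow), you can open the exported ONNX with onnxruntime and confirm it exposes the three raw detection heads rather than a single decoded output; the path below is taken from the export command above:

import onnxruntime as ort

sess = ort.InferenceSession('runs/train/exp/weights/best.onnx')
for o in sess.get_outputs():
    # With the modified forward, expect three outputs shaped (1, 3*(5+num_classes), H, W)
    print(o.name, o.shape)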
ONNX to RKNN
First create a virtual environment; I used Python 3.8. You can set up the dependencies directly with pip install -r requirements.txt; if that fails, create the environment with conda env create -f environment.yml instead. If pip works for you, passing -i to switch to a mirror is the convenient option. It is normal for the rknn-toolkit2 install to fail at this stage; it is installed manually below.
*The above reproduces my environment directly; alternatively, install the required packages yourself following the steps below.
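For reference, a minimal sketch of the environment setup; the environment name rknn2 and the Tsinghua mirror are arbitrary choices, not from the original post:

conda create -n rknn2 python=3.8
conda activate rknn2
pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple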
Next, download RKNN-Toolkit2, either from the official repository or from the resource I uploaded. After downloading, activate the environment created above, enter the packages directory, and install the dependencies:
pip install -r requirements_cp38-1.6.0.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
Then install the rknn-toolkit2 package itself:
pip install rknn_toolkit2-1.6.0+81f21f4d-cp38-cp38-linux_x86_64.whl
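A quick way to verify the install (an optional check, not in the original post) is to import the toolkit:

python -c "from rknn.api import RKNN; print('rknn-toolkit2 ok')"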
*Next, convert ONNX to RKNN.
Open the test.py conversion script (its location is shown in the original post's screenshot) and adjust the relevant parameters. The last parameter must be changed to rk3588; it defaults to rk3566. Once modified, simply run test.py to produce the RKNN model.
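For orientation, a minimal sketch of what such a conversion script does with the rknn-toolkit2 API; the file names, normalization values, and quantization dataset below are assumptions, not the exact contents of test.py:

from rknn.api import RKNN

rknn = RKNN()
# Normalization must match the training preprocessing; scaling 0..255 to 0..1 is assumed here.
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')
rknn.load_onnx(model='best.onnx')                          # assumed ONNX file name
rknn.build(do_quantization=True, dataset='./dataset.txt')  # dataset.txt lists calibration images
rknn.export_rknn('best.rknn')
rknn.release()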
Forward inference on the RK board
Because the RK3588 is an aarch64 platform, the rknn-toolkit2 package cannot be used on it; use the rknn-toolkit-lite2 package instead, and install the corresponding whl on the RK board:
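The exact wheel name depends on the release; for the 1.6.0 toolkit used above it should look roughly like this (file name assumed):

pip install rknn_toolkit_lite2-1.6.0-cp38-cp38-linux_aarch64.whl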
The inference code is as follows:
from copy import copy

import numpy as np
import cv2
from rknnlite.api import RKNNLite
RKNN_MODEL = 'best.rknn'
# IMG_PATH = './input/240513_00000741.jpg'
OBJ_THRESH = 0.25
NMS_THRESH = 0.45
IMG_SIZE = (640, 640)
OUTPUT_VIDEO_PATH = 'output_1.mp4'
BOX = (450, 150, 1100, 550)
CLASSES = ['person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
'hair drier', 'toothbrush']
anchors = [[[10, 13], [16, 30], [33, 23]],
[[30, 61], [62, 45], [59, 119]],
[[116, 90], [156, 198], [373, 326]]]
class Letter_Box_Info():
    def __init__(self, shape, new_shape, w_ratio, h_ratio, dw, dh, pad_color) -> None:
        self.origin_shape = shape
        self.new_shape = new_shape
        self.w_ratio = w_ratio
        self.h_ratio = h_ratio
        self.dw = dw
        self.dh = dh
        self.pad_color = pad_color
def box_process(position, anchors):
    grid_h, grid_w = position.shape[2:4]
    col, row = np.meshgrid(np.arange(0, grid_w), np.arange(0, grid_h))  # (80, 80) (80, 80)
    col = col.reshape(1, 1, grid_h, grid_w)  # (1, 1, 80, 80)
    row = row.reshape(1, 1, grid_h, grid_w)
    grid = np.concatenate((col, row), axis=1)  # (1, 2, 80, 80)
    stride = np.array([IMG_SIZE[1]//grid_h, IMG_SIZE[0]//grid_w]).reshape(1, 2, 1, 1)  # e.g. 8 for the 80x80 head

    anchors = np.array(anchors)
    anchors = anchors.reshape(*anchors.shape, 1, 1)  # (3, 2, 1, 1)

    # YOLOv5 decode: xy = (2*sigmoid - 0.5 + grid) * stride, wh = (2*sigmoid)^2 * anchor
    box_xy = position[:, :2, :, :]*2 - 0.5
    box_wh = pow(position[:, 2:4, :, :]*2, 2) * anchors
    box_xy += grid
    box_xy *= stride
    box = np.concatenate((box_xy, box_wh), axis=1)  # (3, 4, 80, 80)

    # Convert [c_x, c_y, w, h] to [x1, y1, x2, y2]
    xyxy = np.copy(box)
    xyxy[:, 0, :, :] = box[:, 0, :, :] - box[:, 2, :, :] / 2  # top left x
    xyxy[:, 1, :, :] = box[:, 1, :, :] - box[:, 3, :, :] / 2  # top left y
    xyxy[:, 2, :, :] = box[:, 0, :, :] + box[:, 2, :, :] / 2  # bottom right x
    xyxy[:, 3, :, :] = box[:, 1, :, :] + box[:, 3, :, :] / 2  # bottom right y
    return xyxy
def filter_boxes(boxes, box_confidences, box_class_probs):
    """Filter boxes with object threshold."""
    box_confidences = box_confidences.reshape(-1)
    class_max_score = np.max(box_class_probs, axis=-1)
    classes = np.argmax(box_class_probs, axis=-1)

    _class_pos = np.where(class_max_score * box_confidences >= OBJ_THRESH)
    scores = (class_max_score * box_confidences)[_class_pos]
    boxes = boxes[_class_pos]
    classes = classes[_class_pos]
    return boxes, classes, scores
# def filter_boxes(boxes, box_confidences, box_class_probs):
# """Filter boxes with object threshold.
# """
# boxes = boxes.reshape(-1, 4)
# box_confidences = box_confidences.reshape(-1)
# box_class_probs = box_class_probs.reshape(-1, box_class_probs.shape[-1])
#
# _box_pos = np.where(box_confidences >= OBJ_THRESH)
# boxes = boxes[_box_pos]
# box_confidences = box_confidences[_box_pos]
# box_class_probs = box_class_probs[_box_pos]
#
# class_max_score = np.max(box_class_probs, axis=-1)
# classes = np.argmax(box_class_probs, axis=-1)
# _class_pos = np.where(class_max_score >= OBJ_THRESH)
#
# boxes = boxes[_class_pos]
# classes = classes[_class_pos]
# scores = (class_max_score * box_confidences)[_class_pos]
#
# return boxes, classes, scores
def nms_boxes(boxes, scores):
    """Suppress non-maximal boxes.
    # Returns
        keep: ndarray, index of effective boxes.
    """
    x = boxes[:, 0]
    y = boxes[:, 1]
    w = boxes[:, 2] - boxes[:, 0]
    h = boxes[:, 3] - boxes[:, 1]

    areas = w * h
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x[i], x[order[1:]])
        yy1 = np.maximum(y[i], y[order[1:]])
        xx2 = np.minimum(x[i] + w[i], x[order[1:]] + w[order[1:]])
        yy2 = np.minimum(y[i] + h[i], y[order[1:]] + h[order[1:]])

        w1 = np.maximum(0.0, xx2 - xx1 + 0.00001)
        h1 = np.maximum(0.0, yy2 - yy1 + 0.00001)
        inter = w1 * h1

        ovr = inter / (areas[i] + areas[order[1:]] - inter)
        inds = np.where(ovr <= NMS_THRESH)[0]
        order = order[inds + 1]
    keep = np.array(keep)
    return keep
def post_process(input_data, anchors):
    boxes, scores, classes_conf = [], [], []
    # 1*255*h*w -> 3*85*h*w
    input_data = [_in.reshape([len(anchors[0]), -1] + list(_in.shape[-2:])) for _in in input_data]
    for i in range(len(input_data)):  # (3, 85, 80, 80)
        boxes.append(box_process(input_data[i][:, :4, :, :], anchors[i]))  # (3, 4, 80, 80)
        scores.append(input_data[i][:, 4:5, :, :])  # (3, 1, 80, 80)
        classes_conf.append(input_data[i][:, 5:, :, :])  # (3, 80, 80, 80)

    def sp_flatten(_in):
        ch = _in.shape[1]
        _in = _in.transpose(0, 2, 3, 1)
        return _in.reshape(-1, ch)

    boxes = [sp_flatten(_v) for _v in boxes]                # (3, 19200, 4)
    classes_conf = [sp_flatten(_v) for _v in classes_conf]  # (3, 19200, 80)
    scores = [sp_flatten(_v) for _v in scores]              # (3, 19200, 1)

    boxes = np.concatenate(boxes)                # (25200, 4)
    classes_conf = np.concatenate(classes_conf)  # (25200, 80)
    scores = np.concatenate(scores)              # (25200, 1)

    # filter according to threshold
    boxes, classes, scores = filter_boxes(boxes, scores, classes_conf)

    # nms per class
    nboxes, nclasses, nscores = [], [], []
    for c in set(classes):
        inds = np.where(classes == c)
        b = boxes[inds]
        c = classes[inds]
        s = scores[inds]
        keep = nms_boxes(b, s)

        if len(keep) != 0:
            nboxes.append(b[keep])
            nclasses.append(c[keep])
            nscores.append(s[keep])

    if not nclasses and not nscores:
        return None, None, None

    boxes = np.concatenate(nboxes)
    classes = np.concatenate(nclasses)
    scores = np.concatenate(nscores)
    return boxes, classes, scores
def draw(image, boxes, scores, classes):
    for box, score, cl in zip(boxes, scores, classes):
        top, left, right, bottom = [int(_b) for _b in box]
        print("%s @ (%d %d %d %d) %.3f" % (CLASSES[cl], top, left, right, bottom, score))
        cv2.rectangle(image, (top, left), (right, bottom), (255, 0, 0), 2)
        cv2.putText(image, '{0} {1:.2f}'.format(CLASSES[cl], score),
                    (top, left - 6), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
def letterbox(im, new_shape=(640, 640), color=(0, 0, 0), letter_box_info_list=None):
    # Use None instead of a mutable default, so the info list does not grow across calls
    if letter_box_info_list is None:
        letter_box_info_list = []
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    ratio = r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    # dw, dh = np.mod(dw, 32), np.mod(dh, 32)
    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    letter_box_info_list.append(Letter_Box_Info(shape, new_shape, ratio, ratio, dw, dh, color))
    return im, letter_box_info_list
def get_real_box(box, in_format='xyxy', letter_box_info_list=[]):
    bbox = copy(box)
    # Undo the letterbox transform to map boxes back to the original image
    if in_format == 'xyxy':
        bbox[:, 0] -= letter_box_info_list[-1].dw
        bbox[:, 0] /= letter_box_info_list[-1].w_ratio
        bbox[:, 0] = np.clip(bbox[:, 0], 0, letter_box_info_list[-1].origin_shape[1])
        bbox[:, 1] -= letter_box_info_list[-1].dh
        bbox[:, 1] /= letter_box_info_list[-1].h_ratio
        bbox[:, 1] = np.clip(bbox[:, 1], 0, letter_box_info_list[-1].origin_shape[0])
        bbox[:, 2] -= letter_box_info_list[-1].dw
        bbox[:, 2] /= letter_box_info_list[-1].w_ratio
        bbox[:, 2] = np.clip(bbox[:, 2], 0, letter_box_info_list[-1].origin_shape[1])
        bbox[:, 3] -= letter_box_info_list[-1].dh
        bbox[:, 3] /= letter_box_info_list[-1].h_ratio
        bbox[:, 3] = np.clip(bbox[:, 3], 0, letter_box_info_list[-1].origin_shape[0])
    return bbox
if __name__ == '__main__':
    rknn = RKNNLite()

    print('--> Load RKNN model')
    ret = rknn.load_rknn(RKNN_MODEL)
    if ret != 0:
        print('Load RKNN model failed')
        exit(ret)
    print('done')

    ret = rknn.init_runtime()
    if ret != 0:
        print('Init runtime environment failed!')
        exit(ret)
    print('done')

    cap = cv2.VideoCapture("./input/out_240715151339.mp4")
    # Read properties of the source video
    fps = cap.get(cv2.CAP_PROP_FPS)
    # width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    # height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # Create the VideoWriter for the cropped output region
    x, y, w, h = BOX
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # or use 'XVID'
    out = cv2.VideoWriter(OUTPUT_VIDEO_PATH, fourcc, fps, (w, h))

    while True:
        # Read one frame
        ret, frame = cap.read()
        if not ret:
            break
        # Crop the region of interest
        img0 = frame[y:y+h, x:x+w, :]
        img, letter_box_info_list = letterbox(im=img0.copy(), new_shape=(IMG_SIZE[1], IMG_SIZE[0]))  # padded resize
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR to RGB
        if len(img.shape) == 3:
            img = img[None]  # expand for batch dim

        outputs = rknn.inference(inputs=[img])  # inference
        boxes, classes, scores = post_process(outputs, anchors)

        # Keep only the largest 'person' (class 0) box in the frame
        boxes_filter, scores_filter, classes_filter = None, None, None
        max_box = [0, 0, 0, 0]
        if boxes is not None:  # post_process returns (None, None, None) when nothing passes the thresholds
            for box, score, cl in zip(boxes, scores, classes):
                if cl == 0:
                    if (box[2]-box[0])*(box[3]-box[1]) > (max_box[2]-max_box[0])*(max_box[3]-max_box[1]):
                        max_box = box
                        boxes_filter = np.expand_dims(max_box, axis=0)
                        scores_filter = np.expand_dims(score, axis=0)
                        classes_filter = np.expand_dims(cl, axis=0)

        img_p = img0.copy()
        if boxes_filter is not None:
            draw(img_p, get_real_box(boxes_filter, 'xyxy', letter_box_info_list), scores_filter, classes_filter)
        cv2.imwrite("11.jpg", img_p)
        out.write(img_p)

    cap.release()
    out.release()
    rknn.release()
Here I feed a video in and dump each processed frame as an image for inspection; you can adapt the input and output formats to your needs.
A follow-up will cover speeding things up on the NPU.
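As a pointer in that direction: on the RK3588, rknn-toolkit-lite2 lets init_runtime pin the model to specific NPU cores via core_mask. A minimal sketch (the constant names come from the toolkit; the actual speedup depends on the model):

from rknnlite.api import RKNNLite

rknn = RKNNLite()
rknn.load_rknn('best.rknn')
# Run on all three NPU cores of the RK3588; single cores are NPU_CORE_0 / NPU_CORE_1 / NPU_CORE_2.
ret = rknn.init_runtime(core_mask=RKNNLite.NPU_CORE_0_1_2)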
Original article: https://blog.csdn.net/weixin_45354497/article/details/144021373