
A Messy Guide to Training YOLOv8 detect and pose on Your Own Dataset

news 2026/5/21 17:43:58

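Times move on, YOLO moves on, and I am still marching in place. I took a quick pass at detect and pose in v8 and am writing it down here. One complaint first: why is the model buried so deep in the directory tree? See the figure.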

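This tutorial only covers using YOLOv8 as-is, without modifying it. I already had an environment set up from working with v7, so I just ran: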

 pip install ultralytics

1. detect

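Under the detect folder, create a dataset folder holding the images (jpg) and the YOLO-format labels (txt), already split into a training set and a test set. Then create a data.yaml, as shown in the figure, and fill in your own paths and classes.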
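Since the figure with the data.yaml is not reproduced here, the following is a minimal sketch of what such a file could contain. The keys `path`, `train`, `val` and `names` are the ones Ultralytics dataset files use; the folder names `images/train` and `images/val`, the root path and the helper `build_data_yaml` are my own assumptions for illustration, not the original file.

```python
from pathlib import Path


def build_data_yaml(root, class_names, train="images/train", val="images/val"):
    """Return the text of a minimal Ultralytics-style dataset YAML.

    root, train and val are placeholders; swap in your own layout.
    """
    lines = [
        f"path: {Path(root).as_posix()}",  # dataset root folder
        f"train: {train}",                 # training images, relative to path
        f"val: {val}",                     # validation images, relative to path
        "names:",
    ]
    # one "index: name" entry per class, in label-index order
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml("D:/my_project/dataset", ["fish"])  # hypothetical path
print(yaml_text)
```

Saving the printed text as data.yaml should be enough for a single-class dataset. Ultralytics looks for the labels by swapping `images` for `labels` in each image path, so the txt files would sit in `labels/train` and `labels/val` under the same root.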

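Here is a script that converts bounding-box JSON annotations to YOLO format. Change the classes and the folder paths to your own.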

import os
import json

# Class names; a label's position in this list is its class index
CLASSES = ["fish"]


def convert(size, box):
    """Convert a pixel box to a normalized YOLO box.

    input:  size: (width, height); box: (x1, x2, y1, y2)
    output: (x, y, w, h), center and size divided by the image size
    """
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0
    y = (box[2] + box[3]) / 2.0
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


# json -> txt
def json2txt(path_json, path_txt):
    with open(path_json, "r", encoding="utf-8") as f:
        jsonx = json.load(f)
    width = int(jsonx["imageWidth"])    # original image width
    height = int(jsonx["imageHeight"])  # original image height
    with open(path_txt, "w") as ftxt:
        # iterate over every bbox object
        for shape in jsonx["shapes"]:
            obj_cls = str(shape["label"])    # class name
            cls_id = CLASSES.index(obj_cls)  # class index
            (xa, ya), (xb, yb) = shape["points"][:2]  # two opposite corners
            # sort the corners so width and height stay positive
            # no matter which corner was drawn first
            x1, x2 = sorted((int(xa), int(xb)))
            y1, y2 = sorted((int(ya), int(yb)))
            # (top-left, bottom-right) -> (center, width/height), normalized
            bb = convert((width, height), (x1, x2, y1, y2))
            ftxt.write(str(cls_id) + " " + " ".join(str(a) for a in bb) + "\n")


if __name__ == "__main__":
    dir_json = "C:\\Users\\ASUS\\Desktop\\111\\"  # folder with the json files
    dir_txt = "C:\\Users\\ASUS\\Desktop\\222\\"   # folder for the txt files
    os.makedirs(dir_txt, exist_ok=True)
    # convert every json file in the folder to a txt file
    for cnt, json_name in enumerate(os.listdir(dir_json)):
        if not json_name.endswith(".json"):
            continue  # skip images and any other files
        print("cnt=%d,name=%s" % (cnt, json_name))
        path_txt = dir_txt + json_name.replace(".json", ".txt")
        path_json = dir_json + json_name
        print("path_json\t", path_json)
        print("path_txt\t", path_txt)
        # (x1,y1,x2,y2) -> (x,y,w,h)
        json2txt(path_json, path_txt)
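To make the conversion concrete, here is a small worked example of the same arithmetic on one box. The image size, the corner coordinates and the helper name `to_yolo_box` are made up for illustration. It also sorts the two corners first, on the assumption that the annotation tool stores them in the order they were drawn, so a box dragged from bottom-right to top-left would otherwise come out with negative width and height.

```python
def to_yolo_box(img_w, img_h, corner_a, corner_b):
    """Normalize a box given by two opposite pixel corners to (cx, cy, w, h)."""
    # sort so (x1, y1) is the top-left and (x2, y2) the bottom-right corner
    x1, x2 = sorted((corner_a[0], corner_b[0]))
    y1, y2 = sorted((corner_a[1], corner_b[1]))
    cx = (x1 + x2) / 2.0 / img_w  # box center x as a fraction of image width
    cy = (y1 + y2) / 2.0 / img_h  # box center y as a fraction of image height
    return (cx, cy, (x2 - x1) / img_w, (y2 - y1) / img_h)


# 640x480 image, box from (100, 50) to (300, 250)
box = to_yolo_box(640, 480, (100, 50), (300, 250))
print(" ".join(f"{v:.5f}" for v in box))  # 0.31250 0.31250 0.31250 0.41667
```

Prefixing that line with the class index (0 for "fish") gives one row of the label txt file.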

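Once everything is ready, just type the command into the terminal. But if you want to change something, for example have predictions draw only the boxes without the class labels, you can't, because the ultralytics package comes fully wired up and is very heavily encapsulated. If you want to make changes to the model itself, you have to uninstall the package first and then work from the source.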

# training command
yolo task=detect mode=train model=yolov8s.yaml data=D:/DATA/ultralytics-main/ultralytics/models/yolo/detect/data.yaml epochs=200 batch=128
# prediction command
yolo task=detect mode=predict model=D:/DATA/ultralytics-main/weights/best.pt source=D:/DATA/ultralytics-main/ultralytics/models/yolo/detect/dataset/images/val  device=cpu

2. pose

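The pose dataset differs a little from the detection one. When annotating keypoints, first draw a rectangle around the target, then place the keypoints inside that rectangle. Every image must have the same number of points: points 1, 2, 3, 4 have to correspond across images, each point is annotated in order, and the total count must be the same. A point, say point 3, may be occluded, but it still has to be annotated; you then mark it as not visible. The result is a .json file, which needs to be converted to a .txt file, where 2 means visible and 0 means not visible. The conversion code is below, and it works for me. Note that the script as written outputs 2 for every point it finds inside the box and 0 0 0 for any point that is missing from the JSON.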

After that, a command much like the ones above is all you need.
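One thing the pose task needs beyond the detect setup is a `kpt_shape` entry in the dataset YAML, giving the number of keypoints and the number of values per keypoint (3 for x, y, visibility). The sketch below builds such a file for the 44-keypoint case; the root path, the folder names and the helper `build_pose_yaml` are assumptions for illustration, not the file I actually used.

```python
def build_pose_yaml(root, class_names, num_keypoints, dims=3):
    """Return the text of a minimal Ultralytics-style pose dataset YAML."""
    lines = [
        f"path: {root}",                          # dataset root (placeholder)
        "train: images/train",                    # assumed folder layout
        "val: images/val",
        f"kpt_shape: [{num_keypoints}, {dims}]",  # keypoint count, values per keypoint
        "names:",
    ]
    # one "index: name" entry per box class
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


pose_yaml = build_pose_yaml("D:/my_project/pose_dataset", ["fish"], num_keypoints=44)
print(pose_yaml)
```

The training command would then presumably use task=pose together with a pose model config such as yolov8s-pose.yaml, with data= pointing at this file.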

# Keypoint detection: json to txt
import os
import json
import shutil

Dataset_root = 'C:/Users/ASUS/Desktop/strong121/labels/'  # folder with the json files to convert
# box classes
bbox_class = ["fish"]
# keypoint classes, list as many as you have (here '1' through '44')
keypoint_class = [str(i) for i in range(1, 45)]

os.chdir(Dataset_root)


def process_single_json(labelme_path, save_folder='C:/Users/ASUS/Desktop/no/'):
    with open(labelme_path, 'r', encoding='utf-8') as f:
        labelme = json.load(f)
    img_width = labelme['imageWidth']    # image width
    img_height = labelme['imageHeight']  # image height

    # write the YOLO-format txt next to the json; it is moved to save_folder at the end
    yolo_txt_path = os.path.splitext(labelme_path)[0] + '.txt'
    with open(yolo_txt_path, 'w', encoding='utf-8') as f:
        for each_ann in labelme['shapes']:  # iterate over every annotation
            if each_ann['shape_type'] == 'rectangle':  # each box becomes one line in the txt
                yolo_str = ''

                ## box information
                # box class ID
                bbox_class_id = bbox_class.index(each_ann['label'])
                yolo_str += '{} '.format(bbox_class_id)
                # pixel XY coordinates of the top-left and bottom-right corners
                bbox_top_left_x = int(min(each_ann['points'][0][0], each_ann['points'][1][0]))
                bbox_bottom_right_x = int(max(each_ann['points'][0][0], each_ann['points'][1][0]))
                bbox_top_left_y = int(min(each_ann['points'][0][1], each_ann['points'][1][1]))
                bbox_bottom_right_y = int(max(each_ann['points'][0][1], each_ann['points'][1][1]))
                # pixel XY coordinates of the box center
                bbox_center_x = int((bbox_top_left_x + bbox_bottom_right_x) / 2)
                bbox_center_y = int((bbox_top_left_y + bbox_bottom_right_y) / 2)
                # box width and height
                bbox_width = bbox_bottom_right_x - bbox_top_left_x
                bbox_height = bbox_bottom_right_y - bbox_top_left_y
                # normalized box center, width and height
                bbox_center_x_norm = bbox_center_x / img_width
                bbox_center_y_norm = bbox_center_y / img_height
                bbox_width_norm = bbox_width / img_width
                bbox_height_norm = bbox_height / img_height
                yolo_str += '{:.5f} {:.5f} {:.5f} {:.5f} '.format(
                    bbox_center_x_norm, bbox_center_y_norm, bbox_width_norm, bbox_height_norm)

                ## collect all keypoints inside this box in bbox_keypoints_dict
                bbox_keypoints_dict = {}
                for kpt_ann in labelme['shapes']:  # iterate over all annotations
                    if kpt_ann['shape_type'] == 'point':  # keep only keypoint annotations
                        # keypoint XY coordinates and class
                        x = int(kpt_ann['points'][0][0])
                        y = int(kpt_ann['points'][0][1])
                        label = kpt_ann['label']
                        # keep only keypoints that fall inside this box
                        if (bbox_top_left_x < x < bbox_bottom_right_x
                                and bbox_top_left_y < y < bbox_bottom_right_y):
                            bbox_keypoints_dict[label] = [x, y]

                ## write the keypoints out in class order
                for each_class in keypoint_class:  # iterate over every keypoint class
                    if each_class in bbox_keypoints_dict:
                        keypoint_x_norm = bbox_keypoints_dict[each_class][0] / img_width
                        keypoint_y_norm = bbox_keypoints_dict[each_class][1] / img_height
                        # 2: visible, not occluded; 1: occluded; 0: no point
                        yolo_str += '{:.5f} {:.5f} {} '.format(keypoint_x_norm, keypoint_y_norm, 2)
                    else:  # missing points are written as all zeros
                        yolo_str += '0 0 0 '
                # write the line to the txt file
                f.write(yolo_str + '\n')

    shutil.move(yolo_txt_path, save_folder)
    print('{} --> {} converted'.format(labelme_path, yolo_txt_path))


save_folder = 'C:/Users/ASUS/Desktop/no'  # folder that receives the converted label files
os.makedirs(save_folder, exist_ok=True)
for labelme_path in os.listdir(Dataset_root):
    if not labelme_path.endswith('.json'):
        continue  # skip anything that is not a json annotation
    # try:
    process_single_json(Dataset_root + labelme_path, save_folder=save_folder)
    # except:
    #     print('******error******', labelme_path)
print('YOLO-format txt label files saved to ', save_folder)
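Because training only works when every label line carries the same number of keypoints, a quick sanity check on the converted files can save some head-scratching. The sketch below parses one pose label line and checks that it holds 5 box values plus 3 values per keypoint; `parse_pose_line` and the sample line (with 2 keypoints instead of 44, to keep it short) are my own illustration.

```python
def parse_pose_line(line, num_keypoints):
    """Split one YOLO pose label line into (class_id, box, keypoints).

    Expected layout: class cx cy w h, then x y visibility for each keypoint.
    """
    values = line.split()
    expected = 5 + 3 * num_keypoints
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    cls_id = int(values[0])
    box = tuple(float(v) for v in values[1:5])
    keypoints = []
    for i in range(num_keypoints):
        x, y, vis = values[5 + 3 * i: 8 + 3 * i]
        keypoints.append((float(x), float(y), int(vis)))
    return cls_id, box, keypoints


# one box with two keypoints: the first visible, the second missing
sample = "0 0.50000 0.50000 0.20000 0.30000 0.45000 0.40000 2 0 0 0 "
cls_id, box, keypoints = parse_pose_line(sample, num_keypoints=2)
print(cls_id, box, keypoints)
```

For the dataset in this post, num_keypoints would be 44, so every line should contain 137 values.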
