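ComfyUI - Tutorial: Integrating SAM2 + GroundingDINO into ComfyUI Workflows for Image and Video Processing
Follow me on CSDN: https://spike.blog.csdn.net/
Article URL: https://spike.blog.csdn.net/article/details/143359538
Disclaimer: This article is based on personal knowledge and publicly available material. It is intended for academic exchange only; discussion is welcome, but reposting is not permitted.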

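Combining SAM2 with GroundingDINO is a notable step forward for image segmentation and object detection. SAM2 produces precise segmentation masks, while GroundingDINO strengthens object detection, giving more accurate and fine-grained object recognition. Used together in practice, they improve results on complex image-processing tasks, speed up processing, and keep the output accurate and stable.
Deploying a ComfyUI node takes 3 steps:
- Prepare the node project: git clone it into ComfyUI/custom_nodes.
- Install the dependencies: enter the project directory and run pip install -r requirements.txt.
- (Optional) Download the models in advance and place them in the corresponding folders.
- Restart the service and refresh the page; the node is then ready to run.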
Download the projects: ComfyUI-segment-anything-2, ComfyUI-Florence2, ComfyUI-KJNodes, ComfyUI-SAM2, ComfyUI-VideoHelperSuite
cd ComfyUI/custom_nodes
git clone https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
git clone https://github.com/kijai/ComfyUI-segment-anything-2.git
git clone https://github.com/kijai/ComfyUI-Florence2.git
git clone https://github.com/kijai/ComfyUI-KJNodes.git
# v1.0, superseded by ComfyUI-SAM2
# git clone https://github.com/storyicon/comfyui_segment_anything
git clone https://github.com/neverbiasu/ComfyUI-SAM2.git
# Run inside each cloned project directory
pip install -r requirements.txt
1. ComfyUI-segment-anything-2
Node: ComfyUI-segment-anything-2
Prepare the models:
- SAM2 models: ComfyUI/models/sam2
- Florence-2 models: ComfyUI/models/LLM. Florence-2 stands in for a detection model such as GroundingDINO; see ComfyUI-Florence2.
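Before restarting the service, it can help to confirm that the model folders exist. The following is a minimal sketch, assuming ComfyUI is installed in a folder named ComfyUI relative to the current directory; the helper name missing_model_dirs is hypothetical and not part of ComfyUI.

```python
from pathlib import Path

# Folder names taken from the list above; the ComfyUI root path is an assumption.
MODEL_DIRS = ("models/sam2", "models/LLM")


def missing_model_dirs(comfy_root, subdirs=MODEL_DIRS):
    """Return the expected model folders that do not exist under comfy_root."""
    root = Path(comfy_root)
    return [sub for sub in subdirs if not (root / sub).is_dir()]


if __name__ == "__main__":
    # Report every expected folder that still has to be created.
    for sub in missing_model_dirs("ComfyUI"):
        print(f"missing: ComfyUI/{sub}")
```

The same helper can be reused for other node packs by passing a different tuple of folder names as subdirs.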
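It supports video workflows, but the overall segmentation quality is mediocre, and the point-prompt (Points) results are mediocre as well.
Required nodes: ComfyUI-Florence2, ComfyUI-KJNodes, ComfyUI-VideoHelperSuite
Test examples: https://github.com/kijai/ComfyUI-segment-anything-2/tree/main/examples
For example: points_segment_video_example.json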
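- Load Video (Upload): loads the video.
- Points Editor: edits the points; use Shift + left/right click to place positive and negative points.
- (Down)Load SAM2Model: downloads or loads the model, sam2.1_hiera_large-fp16.safetensors; select fp16.
- Sam2Segmentation: the segmentation node, which takes the positive and negative points. Note that it must be re-added, because the default workflow is broken.
- Preview Animation: displays the animated result.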
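The Points Editor passes its points on as JSON strings of {"x": ..., "y": ...} objects, as recorded in the saved workflow. The sketch below shows one way to turn those strings into coordinate pairs with the usual SAM-style labels (1 for a positive point, 0 for a negative point). The function names are hypothetical, and how the Sam2Segmentation node handles the strings internally is not shown in this article, so treat this as an illustration of the data format only.

```python
import json


def parse_points(coords_json):
    """Turn a string like '[{"x": 1, "y": 2}]' into a list of (x, y) tuples."""
    return [(p["x"], p["y"]) for p in json.loads(coords_json)]


def build_point_prompts(positive_json, negative_json):
    """Merge positive and negative points into coordinates plus 1/0 labels."""
    pos = parse_points(positive_json)
    neg = parse_points(negative_json)
    return pos + neg, [1] * len(pos) + [0] * len(neg)


# Coordinate strings copied from the Points Editor values in Workflow1.
positive = '[{"x": 256, "y": 256}, {"x": 237, "y": 463}, {"x": 321, "y": 138}]'
negative = '[{"x": 0, "y": 0}, {"x": 426, "y": 242}]'

points, labels = build_point_prompts(positive, negative)
print(points)
print(labels)
```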

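2. ComfyUI-SAM2
Node: ComfyUI-SAM2
Prepare the models: models/bert-base-uncased, models/grounding-dino, models/sams
GroundingDino + SAM2: only 3 nodes and a narrow feature set, but the detection quality is good.
- GroundingDinoModelLoader (segment anything2): loads the DINO model.
- SAM2ModelLoader (segment anything2): loads the SAM2 model.
- GroundingDinoSAM2Segment (segment anything2): combines the two; it has only 2 parameters, the prompt and the threshold.
When testing the model, the prompt can contain several terms, for example person and book; separate them with commas.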
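To make the two parameters concrete, here is a small sketch of what a comma-separated prompt and a score threshold mean conceptually. It does not reproduce the node's internals, which this article does not show; the function names and the detection scores are hypothetical and invented for illustration.

```python
def split_prompt(prompt):
    """Split a comma-separated prompt such as 'person,book' into clean terms."""
    return [term.strip() for term in prompt.split(",") if term.strip()]


def keep_confident(detections, threshold):
    """Keep (label, score) pairs whose score reaches the threshold."""
    return [(label, score) for label, score in detections if score >= threshold]


terms = split_prompt("person, book")

# Hypothetical detector output, invented purely for illustration.
detections = [("person", 0.82), ("book", 0.41), ("book", 0.12)]

print(terms)
print(keep_confident(detections, threshold=0.3))
```

Raising the threshold discards more low-confidence detections; lowering it keeps more candidates at the cost of more false positives.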


Workflow1:
{"last_node_id":117,"last_link_id":62,"nodes":[{"id":113,"type":"Note","pos":{"0":56,"1":-415},"size":{"0":309.1065368652344,"1":177.01339721679688},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[],"properties":{"text":""},"widgets_values":["To get the image for the points editor, first create a canvas, then either input image/video (first frame is taken), or copy/paste an image while the node is selected, or drag&drop an image.\n\nWARNING: the image WILL BE SAVED to the node in compressed format, including when saving the workflow!\n\nClick the ? on the node for more information"],"color":"#432","bgcolor":"#653"},{"id":116,"type":"Reroute","pos":{"0":1066,"1":115},"size":[75,26],"flags":{},"order":5,"mode":0,"inputs":[{"name":"","type":"*","link":60,"label":"","widget":{"name":"value"}}],"outputs":[{"name":"","type":"STRING","links":[61],"slot_index":0,"label":""}],"properties":{"showOutputText":false,"horizontal":false}},{"id":112,"type":"ShowText|pysssss","pos":{"0":1166,"1":-429},"size":{"0":315,"1":100},"flags":{},"order":4,"mode":0,"inputs":[{"name":"text","type":"STRING","link":53,"widget":{"name":"text"},"label":"text"}],"outputs":[{"name":"STRING","type":"STRING","links":null,"shape":6,"label":"STRING"}],"properties":{"Node name for S&R":"ShowText|pysssss"},"widgets_values":["","[{\"x\": 256, \"y\": 256}, {\"x\": 237, \"y\": 463}, {\"x\": 321, \"y\": 138}]"]},{"id":117,"type":"ShowText|pysssss","pos":{"0":1163,"1":-277},"size":{"0":315,"1":76},"flags":{},"order":6,"mode":0,"inputs":[{"name":"text","type":"STRING","link":62,"widget":{"name":"text"},"label":"text"}],"outputs":[{"name":"STRING","type":"STRING","links":null,"shape":6,"label":"STRING"}],"properties":{"Node name for S&R":"ShowText|pysssss"},"widgets_values":["","[{\"x\": 0, \"y\": 0}, {\"x\": 426, \"y\": 242}]"]},{"id":102,"type":"VHS_LoadVideo","pos":{"0":14,"1":-59},"size":[363.24957275390625,619.2495727539062],"flags":{},"order":1,"mode":0,"inputs":[{"name":"meta_batch","type":"VHS_BatchManager","link":null,"shape":7,"label":"meta_batch"},{"name":"vae","type":"VAE","link":null,"shape":7,"label":"vae"}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[43,52,57],"slot_index":0,"shape":3,"label":"IMAGE"},{"name":"frame_count","type":"INT","links":null,"shape":3,"label":"frame_count"},{"name":"audio","type":"AUDIO","links":null,"shape":3,"label":"audio"},{"name":"video_info","type":"VHS_VIDEOINFO","links":null,"shape":3,"label":"video_info"}],"properties":{"Node name for S&R":"VHS_LoadVideo"},"widgets_values":{"video":"2851_1708350515(原视频).mp4","force_rate":0,"force_size":"512x?","custom_width":512,"custom_height":512,"frame_load_cap":16,"skip_first_frames":0,"select_every_nth":3,"choose video to upload":"image","videopreview":{"hidden":false,"paused":false,"params":{"frame_load_cap":16,"skip_first_frames":0,"force_rate":0,"filename":"2851_1708350515(原视频).mp4","type":"input","format":"video/mp4","select_every_nth":3,"force_size":"512x?"}}}},{"id":114,"type":"PointsEditor","pos":{"0":439,"1":-477},"size":[557,812],"flags":{"collapsed":false},"order":3,"mode":0,"inputs":[{"name":"bg_image","type":"IMAGE","link":52,"shape":7,"label":"bg_image"}],"outputs":[{"name":"positive_coords","type":"STRING","links":[53,55],"slot_index":0,"shape":3,"label":"positive_coords"},{"name":"negative_coords","type":"STRING","links":[60,62],"slot_index":1,"shape":3,"label":"negative_coords"},{"name":"bbox","type":"BBOX","links":null,"slot_index":2,"shape":3,"label":"bbox"},{"name":"bbox_mask","type":"MASK","links":null,"shape":3,"label":"bbox_mask"},{"name":"cropped_image","type":"IMAGE","links":null,"shape":3,"label":"cropped_image"}],"properties":{"Node name for S&R":"PointsEditor","imgData":{"name":"bg_image","base64":[""]},"points":"PointsEditor","neg_points":"PointsEditor"},"widgets_values":["{\"positive\":[{\"x\":256,\"y\":256},{\"x\":237,\"y\":463},{\"x\":321,\"y\":138}],\"negative\":[{\"x\":0,\"y\":0},{\"x\":426,\"y\":242}]}","[{\"x\":256,\"y\":256},{\"x\":237,\"y\":463},{\"x\":321,\"y\":138}]","[{\"x\":0,\"y\":0},{\"x\":426,\"y\":242}]","[{}]","[{}]","xyxy",512,512,false,null,null,null]},{"id":106,"type":"DownloadAndLoadSAM2Model","pos":{"0":459,"1":393},"size":{"0":315,"1":130},"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"sam2_model","type":"SAM2MODEL","links":[56],"shape":3,"label":"sam2_model"}],"properties":{"Node name for S&R":"DownloadAndLoadSAM2Model"},"widgets_values":["sam2.1_hiera_large.safetensors","video","cuda","fp16"]},{"id":115,"type":"Sam2Segmentation","pos":{"0":898,"1":393},"size":{"0":315,"1":190},"flags":{},"order":7,"mode":0,"inputs":[{"name":"sam2_model","type":"SAM2MODEL","link":56,"label":"sam2_model"},{"name":"image","type":"IMAGE","link":57,"label":"image"},{"name":"bboxes","type":"BBOX","link":null,"shape":7,"label":"bboxes"},{"name":"mask","type":"MASK","link":null,"shape":7,"label":"mask"},{"name":"coordinates_positive","type":"STRING","link":55,"widget":{"name":"coordinates_positive"},"shape":7,"label":"coordinates_positive"},{"name":"coordinates_negative","type":"STRING","link":61,"widget":{"name":"coordinates_negative"},"shape":7,"label":"coordinates_negative"}],"outputs":[{"name":"mask","type":"MASK","links":[59],"slot_index":0,"label":"mask"}],"properties":{"Node name for S&R":"Sam2Segmentation"},"widgets_values":[true,"","",false]},{"id":107,"type":"PreviewAnimation","pos":{"0":1340,"1":-59},"size":{"0":514.92431640625,"1":577.3973999023438},"flags":{},"order":8,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":43,"shape":7,"label":"images"},{"name":"masks","type":"MASK","link":59,"slot_index":1,"shape":7,"label":"masks"}],"outputs":[],"title":"Preview Animation 16x512x512","properties":{"Node name for S&R":"PreviewAnimation"},"widgets_values":[16,null]}],"links":[[43,102,0,107,0,"IMAGE"],[52,102,0,114,0,"IMAGE"],[53,114,0,112,0,"STRING"],[55,114,0,115,4,"STRING"],[56,106,0,115,0,"SAM2MODEL"],[57,102,0,115,1,"IMAGE"],[59,115,0,107,1,"MASK"],[60,114,1,116,0,"*"],[61,116,0,115,5,"STRING"],[62,114,1,117,0,"STRING"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.5131581182307067,"offset":[396.07947776523474,760.0658700441401]}},"version":0.4}
Workflow2:
{"last_node_id":8,"last_link_id":7,"nodes":[{"id":2,"type":"SAM2ModelLoader (segment anything2)","pos":{"0":109,"1":303},"size":{"0":441,"1":58},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"SAM2_MODEL","type":"SAM2_MODEL","links":[1],"slot_index":0,"label":"SAM2_MODEL"}],"properties":{"Node name for S&R":"SAM2ModelLoader (segment anything2)"},"widgets_values":["sam2_1_hiera_large.pt"]},{"id":3,"type":"LoadImage","pos":{"0":110,"1":427},"size":{"0":315,"1":314},"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[3],"slot_index":0,"label":"IMAGE"},{"name":"MASK","type":"MASK","links":null,"label":"MASK"}],"properties":{"Node name for S&R":"LoadImage"},"widgets_values":["IMG_5539.JPG","image"]},{"id":7,"type":"MaskPreview+","pos":{"0":921,"1":433},"size":[210,246],"flags":{},"order":5,"mode":0,"inputs":[{"name":"mask","type":"MASK","link":7,"label":"mask"}],"outputs":[],"properties":{"Node name for S&R":"MaskPreview+"}},{"id":1,"type":"GroundingDinoModelLoader (segment anything2)","pos":{"0":104,"1":186},"size":{"0":554.4000244140625,"1":58},"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"GROUNDING_DINO_MODEL","type":"GROUNDING_DINO_MODEL","links":[2],"slot_index":0,"label":"GROUNDING_DINO_MODEL"}],"properties":{"Node name for S&R":"GroundingDinoModelLoader (segment anything2)"},"widgets_values":["GroundingDINO_SwinB (938MB)"]},{"id":6,"type":"PreviewImage","pos":{"0":575,"1":433},"size":[308.81640625,299.23828125],"flags":{},"order":4,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":6,"label":"images"}],"outputs":[],"properties":{"Node name for S&R":"PreviewImage"}},{"id":4,"type":"GroundingDinoSAM2Segment (segment anything2)","pos":{"0":683,"1":183},"size":{"0":554.4000244140625,"1":122},"flags":{},"order":3,"mode":0,"inputs":[{"name":"sam_model","type":"SAM2_MODEL","link":1,"label":"sam_model"},{"name":"grounding_dino_model","type":"GROUNDING_DINO_MODEL","link":2,"label":"grounding_dino_model"},{"name":"image","type":"IMAGE","link":3,"label":"image"}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[6],"slot_index":0,"label":"IMAGE"},{"name":"MASK","type":"MASK","links":[7],"slot_index":1,"label":"MASK"}],"properties":{"Node name for S&R":"GroundingDinoSAM2Segment (segment anything2)"},"widgets_values":["person,book",0.3]}],"links":[[1,2,0,4,0,"SAM2_MODEL"],[2,1,0,4,1,"GROUNDING_DINO_MODEL"],[3,3,0,4,2,"IMAGE"],[6,4,0,6,0,"IMAGE"],[7,4,1,7,0,"MASK"]],"groups":[],"config":{},"extra":{"ds":{"scale":0.8264462809917354,"offset":[-12.505597656249961,-82.9064101562497]}},"version":0.4}
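A saved workflow is plain JSON: "nodes" lists each node with its id and type, and each entry in "links" has the form [link_id, source_node, source_slot, target_node, target_slot, type]. The sketch below prints the connections in readable form; the helper name summarize_workflow is hypothetical, and the embedded JSON is a trimmed-down stand-in for Workflow2 that keeps only the fields the helper needs.

```python
import json


def summarize_workflow(workflow):
    """Return (node types keyed by id, readable edge descriptions)."""
    types = {node["id"]: node["type"] for node in workflow["nodes"]}
    edges = []
    for _link_id, src, src_slot, dst, dst_slot, kind in workflow["links"]:
        edges.append(f"{types[src]}[{src_slot}] -> {types[dst]}[{dst_slot}] ({kind})")
    return types, edges


# Trimmed-down stand-in for Workflow2: node ids, node types, and three links.
workflow_text = """
{"nodes": [
   {"id": 1, "type": "GroundingDinoModelLoader (segment anything2)"},
   {"id": 2, "type": "SAM2ModelLoader (segment anything2)"},
   {"id": 3, "type": "LoadImage"},
   {"id": 4, "type": "GroundingDinoSAM2Segment (segment anything2)"}],
 "links": [
   [1, 2, 0, 4, 0, "SAM2_MODEL"],
   [2, 1, 0, 4, 1, "GROUNDING_DINO_MODEL"],
   [3, 3, 0, 4, 2, "IMAGE"]]}
"""

types, edges = summarize_workflow(json.loads(workflow_text))
for edge in edges:
    print(edge)
```

To inspect a full workflow, save the JSON to a file and pass the result of json.load to the same helper.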
