当前位置：首页 > news >正文

浏览器实时播放摄像头数据并通过 Yolo 进行图像识别

news 2026/2/10 7:44:44

安装 Ultralytics 之后，可以直接通过本地获取摄像头数据流，并通过 Yolo 模型实时进行识别。大多情况下，安装本地程序成本比较高，需要编译打包等等操作，如果可以直接通过浏览器显示视频，并实时显示识别到的对象类型就会方便很多。本文将通过 JS 原生代码 + 后台 Yolo 识别服务实现浏览器实时显示并识别对象类型的效果。

后发服务

后台服务采用 Python Flask 框架实现图片识别的 Rest API，开发之前，首先安装 Ultralytics 环境，我们使用官方的 DockerImage，用官方镜像作为基础镜像，安装相关依赖。

Dockerfile

# Use the ultralytics/ultralytics image as the base
FROM ultralytics/ultralytics:latest# Update package lists and install vim
RUN apt-get update && apt-get install -y vim# Install Flask using pip
RUN pip install flask flask-cors# Set the working directory
WORKDIR /app# Copy the current directory contents into the container at /app
COPY . /app# Expose port 5000 for Flask
EXPOSE 5000# Command to run the Flask application

App.py
后台 Rest API，/detect，解析 base64 图片，并返回识别到的图片分类和位置信息。

import os
from flask import Flask, request, jsonify
from ultralytics import YOLO
import cv2
import numpy as np# Initialize Flask app
app = Flask(__name__)# Load YOLOv8 model
model = YOLO('yolov8n.pt')  # You can change 'yolov8n.pt' to other versions like 'yolov8m.pt' or 'yolov8x.pt'# Function to perform object detection
def detect_objects(image):results = model(image)detections = []for result in results:for box in result.boxes:x1, y1, x2, y2 = map(int, box.xyxy)class_id = int(box.cls)confidence = box.confdetections.append({'class_id': class_id,'label': model.names[class_id],'confidence': float(confidence),'bbox': [x1, y1, x2, y2]})return detections# Route for object detection
@app.route('/detect', methods=['POST'])
def detect():if 'image' not in request.files:return jsonify({'error': 'No image provided'}), 400file = request.files['image']if file.filename == '':return jsonify({'error': 'No image selected for uploading'}), 400# Read imageimage = np.frombuffer(file.read(), np.uint8)image = cv2.imdecode(image, cv2.IMREAD_COLOR)# Perform detectiondetections = detect_objects(image)return jsonify(detections)# Run Flask app
if __name__ == '__main__':app.run(host='0.0.0.0', port=5000)

前端页面

在页面显示摄像头，实时发送图片数据到后台进行识别，获取位置并显示在画布纸上。

<!DOCTYPE html>
<html lang="en">
<head><meta charset="UTF-8"><meta name="viewport" content="width=device-width, initial-scale=1.0"><title>Real-Time Object Detection with YOLO</title><style>body {display: flex;justify-content: center;align-items: center;height: 100vh;margin: 0;background-color: #f0f0f0;overflow: hidden;}#camera, #canvas {position: absolute;width: 50%;height: 50%;object-fit: cover;}#camera {z-index: 1;}#canvas {z-index: 2;}</style>
</head>
<body><video id="camera" autoplay playsinline></video><canvas id="canvas"></canvas><script>const classColors = {};function getRandomColor() {const letters = '0123456789ABCDEF';let color = '#';for (let i = 0; i < 6; i++) {color += letters[Math.floor(Math.random() * 16)];}return color;}function isMobileDevice() {return /Mobi|Android/i.test(navigator.userAgent);}async function setupCamera() {const video = document.getElementById('camera');const facingMode = isMobileDevice() ? 'environment' : 'user'; // Use back camera for mobile, front camera for PCtry {const stream = await navigator.mediaDevices.getUserMedia({video: {facingMode: { ideal: facingMode }}});video.srcObject = stream;return new Promise((resolve) => {video.onloadedmetadata = () => {resolve(video);};});} catch (error) {console.error('Error accessing camera:', error);alert('Error accessing camera: ' + error.message);}}async function sendFrameToBackend(imageData) {const response = await fetch('https://c.hawk.leedar360.com/api/detect', {method: 'POST',headers: {'Content-Type': 'application/json'},body: JSON.stringify({ image: imageData })});return await response.json();}function getBase64Image(video) {const canvas = document.createElement('canvas');canvas.width = video.videoWidth;canvas.height = video.videoHeight;const ctx = canvas.getContext('2d');ctx.drawImage(video, 0, 0, canvas.width, canvas.height);return canvas.toDataURL('image/jpeg').split(',')[1];}function renderDetections(detections, canvas, video) {const ctx = canvas.getContext('2d');ctx.clearRect(0, 0, canvas.width, canvas.height);ctx.drawImage(video, 0, 0, canvas.width, canvas.height);detections.forEach(det => {const { bbox, confidence, class_id, label } = det;if (confidence > 0.5) { // Only show detections with confidence > 0.5const [x, y, w, h] = bbox;if (!classColors[class_id]) {classColors[class_id] = getRandomColor();}const color = classColors[class_id];ctx.beginPath();ctx.rect(x, y, w - x, h - y);ctx.lineWidth = 2;ctx.strokeStyle = color;ctx.fillStyle = color;ctx.stroke();ctx.font = '24px Arial';  // Set font size to 24pxctx.fillText(`${label} (${Math.round(confidence * 100)}%)`,x,y > 24 ? y - 10 : 24  // Adjust position for the larger font size);}});}async function main() {const video = await setupCamera();if (!video) {console.error('Camera setup failed');return;}video.play();const canvas = document.getElementById('canvas');canvas.width = video.videoWidth;canvas.height = video.videoHeight;async function processFrame() {const imageData = getBase64Image(video);const detections = await sendFrameToBackend(imageData);renderDetections(detections, canvas, video);requestAnimationFrame(processFrame);}processFrame();}main();// Handle orientation change and resizingwindow.addEventListener('resize', () => {const video = document.getElementById('camera');const canvas = document.getElementById('canvas');canvas.width = video.videoWidth;canvas.height = video.videoHeight;});window.addEventListener('orientationchange', () => {const video = document.getElementById('camera');const canvas = document.getElementById('canvas');canvas.width = video.videoWidth;canvas.height = video.videoHeight;});</script>
</body>
</html>

总结

功能很好实现，效果还要微调，苹果的充电器并没有识别出来。

请添加图片描述

浏览器实时播放摄像头数据并通过 Yolo 进行图像识别

后发服务

前端页面

总结

相关文章：

浏览器实时播放摄像头数据并通过 Yolo 进行图像识别

redis清空list

汽车油耗NEDC与WLTP有什么区别？以及MATLAB/Simulink的汽车行驶工况仿真

【Python】已解决报错：AttributeError: module ‘json‘ has no attribute ‘loads‘解决办法

（5）按钮输入

嵌入式开发、C++后台开发、C++音视频开发怎么选择？

高考志愿填报，大学读什么专业比较好？

33 _ 跨站脚本攻击（XSS）：为什么Cookie中有HttpOnly属性？

C++入门小结

Java 开发实例：Spring Boot+AOP+注解+Redis防重复提交（防抖）

使用difflib实现文件差异比较用html显示

【文末附gpt升级秘笈】AI热潮降温与AGI场景普及的局限性

Vue待学习

TOP150-LC88

使用Python和TCN进行时间序列预测：一个完整的实战示例

如何用R语言ggplot2画高水平期刊散点图

Python基于 Jupyter Notebook 的图形可视化工具库之ipysigma使用详解

四叉树和KD树

C语言中结构体使用.与-＞访问成员变量的区别

计算机二级Access选择题考点

Python｜GIF 解析与构建（5）：手搓截屏和帧率控制

Cursor实现用excel数据填充word模版的方法

深入剖析AI大模型：大模型时代的 Prompt 工程全解析

Cilium动手实验室: 精通之旅---20.Isovalent Enterprise for Cilium: Zero Trust Visibility

【机器视觉】单目测距——运动结构恢复

SpringTask-03.入门案例

【Java学习笔记】BigInteger 和 BigDecimal 类

在QWebEngineView上实现鼠标、触摸等事件捕获的解决方案

招商蛇口 | 执笔CID，启幕低密生活新境

【Veristand】Veristand环境安装教程-Linux RT / Windows