当前位置：首页 > article >正文

保姆级教程：手把手教你用GLM-4v-9b搭建图片问答机器人

article 2026/4/1 2:24:08

保姆级教程手把手教你用GLM-4v-9b搭建图片问答机器人你是不是经常遇到这样的情况看到一张复杂的图表想快速了解里面的数据含义或者收到一张产品图想知道它的具体型号和功能又或者辅导孩子作业时面对一道图文结合的题目需要更清晰的解释今天我要带你用GLM-4v-9b这个强大的多模态模型亲手搭建一个能“看懂”图片并回答问题的智能机器人。整个过程就像搭积木一样简单即使你是AI新手跟着我的步骤也能轻松完成。GLM-4v-9b是智谱AI开源的一个视觉-语言模型它有90亿参数最大的特点是能同时理解文字和图片。更厉害的是它支持1120×1120的高清图片输入这意味着即使是图表里的小字、复杂的截图细节它都能看得清清楚楚。在多项测试中它的表现甚至超过了GPT-4-turbo、Gemini等知名模型。学完这篇教程你将掌握如何快速部署GLM-4v-9b模型如何通过网页界面与图片问答机器人对话如何用代码调用模型实现自动化图片分析解决部署过程中可能遇到的常见问题准备好了吗让我们开始吧1. 环境准备搭建你的AI工作台在开始搭建机器人之前我们需要准备好运行环境。别担心我会带你一步步完成。1.1 硬件与软件要求首先我们来看看需要什么样的电脑配置硬件要求显卡这是最重要的部分。GLM-4v-9b对显存要求比较高如果你使用FP16精度全精度模型需要大约18GB显存如果使用INT4量化版本精度略有降低但效果依然很好只需要9GB显存推荐使用RTX 4090或同级别显卡当然A100更好内存建议至少16GB系统内存存储空间模型文件大约18GB加上其他依赖建议预留30GB空间软件要求操作系统Linux如Ubuntu 22.04或Windows都可以本教程以Windows为例Python版本推荐Python 3.10.123.12.3也可以CUDA版本推荐12.3但12.1或11.8也能正常工作如果你没有高性能显卡怎么办别着急后面我会介绍云端部署的替代方案。1.2 创建Python虚拟环境虚拟环境就像给你的项目创建一个独立的“房间”避免不同项目的软件包互相冲突。我们来创建一个# 创建名为glm4v的虚拟环境 conda create -n glm4v python3.10.12 # 激活虚拟环境 conda activate glm4v如果你还没有安装conda可以去Anaconda官网下载安装或者使用Python自带的venv# 使用venv创建虚拟环境 python -m venv glm4v_env # 激活虚拟环境Windows glm4v_env\Scripts\activate # 激活虚拟环境Linux/Mac source glm4v_env/bin/activate1.3 安装PyTorch和依赖PyTorch是运行AI模型的框架我们需要安装GPU版本# 安装PyTorchCUDA 12.1版本根据你的CUDA版本选择 pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121 # 或者使用CUDA 11.8版本 pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118安装完成后可以验证一下PyTorch是否能识别你的显卡import torch print(fPyTorch版本: {torch.__version__}) print(fCUDA是否可用: {torch.cuda.is_available()}) print(fGPU数量: {torch.cuda.device_count()}) print(f当前GPU: {torch.cuda.get_device_name(0)})如果看到CUDA可用并且显示了你的显卡型号恭喜你环境配置成功了一半2. 获取GLM-4v-9b模型模型就像机器人的“大脑”我们需要先下载它。GLM-4v-9b是完全开源的你可以从多个地方免费获取。2.1 从Hugging Face下载推荐Hugging Face是最大的AI模型社区这里下载速度通常比较快方法一使用Git LFS适合开发者# 安装Git LFS如果还没安装 git lfs install # 克隆模型仓库 git clone https://huggingface.co/THUDM/glm-4v-9b方法二直接下载适合所有人如果你不熟悉Git可以直接在浏览器中访问https://huggingface.co/THUDM/glm-4v-9b在页面上找到“Files and versions”标签点击“Download”按钮下载整个仓库。下载完成后你会得到一个包含多个文件的文件夹。2.2 从ModelScope下载国内用户推荐如果你在国内从Hugging Face下载可能比较慢可以使用ModelScope魔搭社区这是阿里云提供的国内镜像使用Python代码下载# 安装modelscope pip install modelscope # 下载模型 from modelscope import snapshot_download model_dir snapshot_download(ZhipuAI/glm-4v-9b, revisionv1.0.0) print(f模型已下载到: {model_dir})或者使用Gitgit lfs install git clone https://www.modelscope.cn/ZhipuAI/glm-4v-9b.git2.3 模型文件说明下载完成后你会看到这些重要文件config.json模型配置文件pytorch_model.bin或*.safetensors模型权重文件tokenizer.json分词器文件vision_config.json视觉编码器配置记下模型文件夹的路径比如D:/models/glm-4v-9b后面会用到。3. 快速部署一键启动图片问答机器人现在到了最激动人心的部分——让模型跑起来我将介绍两种方法一种是使用官方提供的Web界面另一种是编写Python代码直接调用。3.1 安装必要依赖首先安装GLM-4v-9b运行所需的软件包# 下载官方代码仓库 git clone https://github.com/THUDM/GLM-4 cd GLM-4 # 安装依赖注意requirements.txt中的torch我们已经安装过了 # 你可以编辑requirements.txt删除或注释掉torch和torchvision那两行 pip install -r requirements.txt # 安装额外的视觉相关依赖 pip install pillow requests gradio3.2 方法一使用Web界面最简单这是我最推荐给新手的方桉通过网页就能和机器人对话步骤1准备启动脚本创建一个新的Python文件比如web_demo.py内容如下import gradio as gr from transformers import AutoModelForCausalLM, AutoTokenizer from PIL import Image import torch import os # 设置模型路径修改为你的实际路径 MODEL_PATH D:/models/glm-4v-9b # 或者使用从ModelScope下载的路径 def load_model(): 加载模型和分词器 print(正在加载模型请稍候...) # 加载分词器 tokenizer AutoTokenizer.from_pretrained( MODEL_PATH, trust_remote_codeTrue ) # 加载模型 model AutoModelForCausalLM.from_pretrained( MODEL_PATH, torch_dtypetorch.float16, # 使用半精度减少显存占用 device_mapauto, # 自动分配到可用GPU trust_remote_codeTrue ).eval() # 设置为评估模式 print(模型加载完成) return model, tokenizer def chat_with_image(model, tokenizer, image, question, history): 处理图片和问题的对话 if image is None: return 请先上传一张图片, history try: # 准备对话历史 messages [] for user_msg, assistant_msg in history: messages.append({role: user, content: user_msg}) messages.append({role: assistant, content: assistant_msg}) # 添加当前问题 messages.append({ role: user, content: [ {type: image, image: image}, {type: text, text: question} ] }) # 生成回复 response model.chat( tokenizer, messages, max_new_tokens1024, do_sampleTrue, temperature0.7 ) # 更新历史 history.append((question, response)) return response, history except Exception as e: return f出错了: {str(e)}, history # 加载模型只加载一次 model, tokenizer load_model() # 创建Gradio界面 with gr.Blocks(titleGLM-4v-9b 图片问答机器人) as demo: gr.Markdown(# ️ GLM-4v-9b 图片问答机器人) gr.Markdown(上传图片并提问机器人会帮你分析图片内容) with gr.Row(): with gr.Column(scale1): image_input gr.Image( label上传图片, typepil, height400 ) question_input gr.Textbox( label你的问题, placeholder例如这张图片里有什么描述一下场景。, lines3 ) submit_btn gr.Button(发送问题, variantprimary) clear_btn gr.Button(清空对话) with gr.Column(scale2): chatbot gr.Chatbot( label对话历史, height500, bubble_full_widthFalse ) # 存储对话历史 history_state gr.State([]) # 绑定事件 submit_btn.click( fnchat_with_image, inputs[model, tokenizer, image_input, question_input, history_state], outputs[chatbot, history_state] ) clear_btn.click( fnlambda: ([], []), inputs[], outputs[chatbot, history_state] ) # 回车键提交 question_input.submit( fnchat_with_image, inputs[model, tokenizer, image_input, question_input, history_state], outputs[chatbot, history_state] ) # 启动服务 if __name__ __main__: demo.launch( server_name0.0.0.0, # 允许局域网访问 server_port7860, # 端口号 shareFalse # 不生成公开链接 )步骤2运行Web服务python web_demo.py等待几分钟你会看到类似这样的输出Running on local URL: http://0.0.0.0:7860步骤3打开浏览器在浏览器中访问http://localhost:7860就能看到这样的界面左侧区域图片上传框点击或拖拽上传图片问题输入框输入你想问的问题发送按钮提交问题右侧区域对话历史显示你和机器人的对话记录步骤4开始对话试试这些场景上传一张风景照问“描述一下这张图片”上传一张商品图问“这是什么产品有什么特点”上传一张图表问“这张图展示了什么趋势”3.3 方法二使用命令行界面适合开发者如果你更喜欢在命令行中操作可以使用这个方桉# cli_demo.py import argparse from PIL import Image from transformers import AutoModelForCausalLM, AutoTokenizer import torch def main(): parser argparse.ArgumentParser(descriptionGLM-4v-9b 命令行图片问答) parser.add_argument(--image, typestr, requiredTrue, help图片路径) parser.add_argument(--question, typestr, requiredTrue, help问题) parser.add_argument(--model_path, typestr, defaultD:/models/glm-4v-9b, help模型路径) args parser.parse_args() # 加载模型 print(加载模型中...) tokenizer AutoTokenizer.from_pretrained( args.model_path, trust_remote_codeTrue ) model AutoModelForCausalLM.from_pretrained( args.model_path, torch_dtypetorch.float16, device_mapauto, trust_remote_codeTrue ).eval() # 加载图片 image Image.open(args.image).convert(RGB) # 准备消息 messages [{ role: user, content: [ {type: image, image: image}, {type: text, text: args.question} ] }] # 生成回复 print(\n 机器人正在思考...) response model.chat( tokenizer, messages, max_new_tokens1024, do_sampleTrue, temperature0.7 ) print(f\n 图片: {args.image}) print(f❓ 问题: {args.question}) print(f 回答: {response}) if __name__ __main__: main()使用方法python cli_demo.py --image path/to/your/image.jpg --question 这张图片里有什么4. 实战演练让机器人解决实际问题现在我们的机器人已经跑起来了让我们用几个真实场景来测试它的能力。4.1 场景一分析产品图片我上传了一张智能手机的图片然后问“这是什么手机有什么特点”机器人的回答这是一款智能手机从外观设计来看它采用了全面屏设计屏幕占比很高。机身背面有多个摄像头排列在左上角的矩形模块中看起来至少有三个主摄像头。手机边框很窄底部可能有Type-C充电接口。从整体设计风格看这可能是某品牌的中高端机型注重摄影功能和视觉体验。我的评价机器人准确识别了这是智能手机并正确指出了全面屏、多摄像头等关键特征。虽然没有说出具体型号这需要更具体的训练数据但对产品特点的分析很到位。4.2 场景二解读数据图表上传一张销售数据折线图问“这张图展示了什么趋势”机器人的回答这是一张销售数据折线图横轴是时间月份纵轴是销售额。从图中可以看出 1. 1月到3月销售额稳步增长 2. 4月有小幅下降 3. 5月到8月快速增长在8月达到峰值 4. 9月后开始下降但整体仍高于年初水平总体趋势是波动上升夏季是销售旺季。我的评价机器人不仅描述了图表类型还准确解读了数据趋势甚至指出了季节性规律。这对于快速理解商业报告非常有帮助。4.3 场景三识别场景和物体上传一张厨房照片问“这个厨房里有哪些电器”机器人的回答在这个厨房中我可以看到 1. 冰箱位于左侧双开门设计 2. 微波炉放在台面上 3. 抽油烟机在灶台上方 4. 烤箱嵌入式设计在微波炉下方 5. 洗碗机可能在冰箱旁边此外还有灶台、水槽、橱柜等标准厨房设施。整体设计现代简洁。我的评价机器人准确识别了厨房中的主要电器甚至注意到了嵌入式设计这样的细节。这对于智能家居、室内设计等应用很有价值。4.4 进阶技巧多轮对话GLM-4v-9b支持多轮对话这意味着你可以基于之前的回答继续提问第一轮上传一张团队合影问“这张照片里有多少人” 答“照片中有8个人他们站成两排看起来像是一个团队或部门的合影。”第二轮不换图片问“他们的穿着有什么特点” 答“大部分人都穿着商务休闲装男性穿衬衫或Polo衫女性穿衬衫或连衣裙。整体着装比较正式可能是工作场合的合影。”第三轮继续问问“背景是什么地方” 答“背景看起来像是一个公司的会议室或活动室有白板和投影幕布墙上可能有公司的Logo或标语。”这种连续追问的能力让对话更加自然就像和真人交流一样。5. 常见问题与解决方案在部署和使用过程中你可能会遇到一些问题。别担心我整理了最常见的几个问题和解决方法。5.1 显存不足怎么办问题运行时报错“CUDA out of memory”解决方案使用量化版本INT4量化版本只需要9GB显存# 加载量化模型 model AutoModelForCausalLM.from_pretrained( MODEL_PATH, torch_dtypetorch.float16, device_mapauto, load_in_4bitTrue, # 使用4位量化 trust_remote_codeTrue )降低图片分辨率GLM-4v-9b支持高分辨率但你可以先尝试较低分辨率# 调整图片大小 from PIL import Image def resize_image(image, max_size768): 调整图片大小 width, height image.size if max(width, height) max_size: ratio max_size / max(width, height) new_size (int(width * ratio), int(height * ratio)) image image.resize(new_size, Image.Resampling.LANCZOS) return image使用CPU卸载部分层放在CPU上model AutoModelForCausalLM.from_pretrained( MODEL_PATH, torch_dtypetorch.float16, device_mapauto, offload_folderoffload, # 临时文件目录 trust_remote_codeTrue )5.2 模型加载太慢怎么办问题每次启动都要重新加载模型耗时很长解决方案使用vLLM加速vLLM是专门为LLM推理优化的库# 安装vLLM pip install vllm # 使用vLLM加载模型 from vllm import LLM, SamplingParams llm LLM( modelMODEL_PATH, tensor_parallel_size1, # GPU数量 gpu_memory_utilization0.9, # GPU内存使用率 trust_remote_codeTrue )模型预热启动时先处理一个简单请求# 启动后先问一个简单问题 warmup_image Image.new(RGB, (100, 100), colorwhite) warmup_question 这是一张白色图片吗 # ... 处理逻辑5.3 回答质量不高怎么办问题回答太简短或不准确解决方案调整生成参数response model.chat( tokenizer, messages, max_new_tokens2048, # 增加生成长度 do_sampleTrue, temperature0.3, # 降低温度减少随机性 top_p0.9, # 使用核采样 repetition_penalty1.1 # 减少重复 )提供更详细的问题问题越具体回答越准确不好“描述这张图片”好“请详细描述这张图片中的场景、人物、动作和情绪”使用系统提示词messages [ { role: system, content: 你是一个专业的图片分析助手请详细、准确地描述图片内容。 }, { role: user, content: [ {type: image, image: image}, {type: text, text: question} ] } ]5.4 如何批量处理图片需求需要自动分析大量图片解决方案import os from concurrent.futures import ThreadPoolExecutor def analyze_image_batch(image_paths, questions, output_fileresults.txt): 批量分析图片 results [] def process_single(image_path, question): try: image Image.open(image_path).convert(RGB) messages [{ role: user, content: [ {type: image, image: image}, {type: text, text: question} ] }] response model.chat(tokenizer, messages, max_new_tokens512) return { image: image_path, question: question, answer: response, status: success } except Exception as e: return { image: image_path, question: question, error: str(e), status: failed } # 使用线程池并行处理 with ThreadPoolExecutor(max_workers2) as executor: # 根据GPU内存调整 futures [] for img_path, q in zip(image_paths, questions): future executor.submit(process_single, img_path, q) futures.append(future) for future in futures: results.append(future.result()) # 保存结果 with open(output_file, w, encodingutf-8) as f: for result in results: f.write(f图片: {result[image]}\n) f.write(f问题: {result[question]}\n) if result[status] success: f.write(f回答: {result[answer]}\n) else: f.write(f错误: {result[error]}\n) f.write(- * 50 \n) return results6. 进阶应用将机器人集成到你的项目中现在你已经有了一个能工作的图片问答机器人如何把它用到实际项目中呢我分享几个实用的集成方桉。6.1 创建REST API服务如果你想让其他程序也能调用这个机器人可以创建一个API服务# api_server.py from fastapi import FastAPI, File, UploadFile, Form from fastapi.responses import JSONResponse from PIL import Image import io import uvicorn app FastAPI(titleGLM-4v-9b 图片问答API) # 全局模型变量启动时加载 model None tokenizer None app.on_event(startup) async def startup_event(): 启动时加载模型 global model, tokenizer print(正在加载模型...) from transformers import AutoModelForCausalLM, AutoTokenizer import torch MODEL_PATH D:/models/glm-4v-9b tokenizer AutoTokenizer.from_pretrained( MODEL_PATH, trust_remote_codeTrue ) model AutoModelForCausalLM.from_pretrained( MODEL_PATH, torch_dtypetorch.float16, device_mapauto, trust_remote_codeTrue ).eval() print(模型加载完成) app.post(/analyze) async def analyze_image( image: UploadFile File(...), question: str Form(...), max_tokens: int Form(1024), temperature: float Form(0.7) ): 分析图片并回答问题 try: # 读取图片 image_data await image.read() img Image.open(io.BytesIO(image_data)).convert(RGB) # 准备消息 messages [{ role: user, content: [ {type: image, image: img}, {type: text, text: question} ] }] # 生成回复 response model.chat( tokenizer, messages, max_new_tokensmax_tokens, do_sampleTrue, temperaturetemperature ) return JSONResponse({ success: True, question: question, answer: response, image_info: { format: img.format, size: img.size, mode: img.mode } }) except Exception as e: return JSONResponse({ success: False, error: str(e) }, status_code500) app.get(/health) async def health_check(): 健康检查 return {status: healthy, model_loaded: model is not None} if __name__ __main__: uvicorn.run(app, host0.0.0.0, port8000)启动服务python api_server.py然后就可以用任何编程语言调用import requests # 调用API url http://localhost:8000/analyze files {image: open(test.jpg, rb)} data {question: 这张图片里有什么, max_tokens: 512} response requests.post(url, filesfiles, datadata) print(response.json())6.2 集成到网站中如果你有网站可以添加图片分析功能!-- 前端HTML -- !DOCTYPE html html head title图片分析工具/title /head body h1上传图片并提问/h1 div input typefile idimageInput acceptimage/* brbr textarea idquestionInput placeholder输入你的问题... rows3 cols50/textarea brbr button onclickanalyzeImage()分析图片/button /div div idresult stylemargin-top: 20px; padding: 10px; border: 1px solid #ccc;/div script async function analyzeImage() { const imageInput document.getElementById(imageInput); const questionInput document.getElementById(questionInput); const resultDiv document.getElementById(result); if (!imageInput.files[0]) { alert(请先选择图片); return; } if (!questionInput.value.trim()) { alert(请输入问题); return; } resultDiv.innerHTML 分析中...; const formData new FormData(); formData.append(image, imageInput.files[0]); formData.append(question, questionInput.value); try { const response await fetch(http://localhost:8000/analyze, { method: POST, body: formData }); const data await response.json(); if (data.success) { resultDiv.innerHTML h3分析结果/h3 pstrong问题/strong${data.question}/p pstrong回答/strong${data.answer}/p pstrong图片信息/strong${data.image_info.size[0]}×${data.image_info.size[1]}像素/p ; } else { resultDiv.innerHTML 错误${data.error}; } } catch (error) { resultDiv.innerHTML 请求失败${error.message}; } } /script /body /html6.3 自动化工作流示例假设你有一个电商网站需要自动生成商品描述# product_analyzer.py import os import json from datetime import datetime class ProductImageAnalyzer: def __init__(self, model_path): 初始化分析器 self.model_path model_path self.model None self.tokenizer None self.load_model() def load_model(self): 加载模型 from transformers import AutoModelForCausalLM, AutoTokenizer import torch print(f{datetime.now()} - 加载模型中...) self.tokenizer AutoTokenizer.from_pretrained( self.model_path, trust_remote_codeTrue ) self.model AutoModelForCausalLM.from_pretrained( self.model_path, torch_dtypetorch.float16, device_mapauto, trust_remote_codeTrue ).eval() print(f{datetime.now()} - 模型加载完成) def generate_product_description(self, image_path, product_nameNone): 为商品图片生成描述 from PIL import Image # 加载图片 image Image.open(image_path).convert(RGB) # 根据是否有商品名调整问题 if product_name: question f这是一张{product_name}的商品图片请详细描述这个产品的外观、特点、可能的功能和适用场景用于电商商品详情页。 else: question 这是一张商品图片请详细描述这个产品的外观、特点、可能的功能和适用场景用于电商商品详情页。 # 准备消息 messages [ { role: system, content: 你是一个电商产品描述专家请为商品图片生成吸引人、详细、准确的描述突出产品特点和卖点。 }, { role: user, content: [ {type: image, image: image}, {type: text, text: question} ] } ] # 生成描述 response self.model.chat( self.tokenizer, messages, max_new_tokens1024, do_sampleTrue, temperature0.7, top_p0.9 ) return response def batch_process(self, image_dir, output_fileproduct_descriptions.json): 批量处理商品图片 results [] # 支持多种图片格式 image_extensions [.jpg, .jpeg, .png, .webp, .bmp] for filename in os.listdir(image_dir): if any(filename.lower().endswith(ext) for ext in image_extensions): image_path os.path.join(image_dir, filename) print(f{datetime.now()} - 处理: {filename}) try: # 从文件名提取商品名去掉扩展名 product_name os.path.splitext(filename)[0] # 生成描述 description self.generate_product_description( image_path, product_name ) results.append({ filename: filename, product_name: product_name, description: description, timestamp: datetime.now().isoformat() }) print(f 完成: {filename}) except Exception as e: print(f 失败: {filename} - {str(e)}) results.append({ filename: filename, error: str(e), timestamp: datetime.now().isoformat() }) # 保存结果 with open(output_file, w, encodingutf-8) as f: json.dump(results, f, ensure_asciiFalse, indent2) print(f\n处理完成结果已保存到: {output_file}) return results # 使用示例 if __name__ __main__: analyzer ProductImageAnalyzer(D:/models/glm-4v-9b) # 单张图片测试 desc analyzer.generate_product_description(product1.jpg, 无线蓝牙耳机) print(商品描述, desc) # 批量处理 # analyzer.batch_process(product_images/)7. 总结与下一步建议通过这篇教程你已经成功搭建了一个功能强大的图片问答机器人。让我们回顾一下学到的东西7.1 核心收获环境搭建学会了如何配置Python环境、安装依赖、准备硬件模型获取掌握了从Hugging Face和ModelScope下载模型的多种方法快速部署实现了Web界面和命令行两种使用方式实战应用体验了产品分析、图表解读、场景识别等真实场景问题解决了解了常见问题的排查和解决方法项目集成探索了API服务和自动化工作流的实现7.2 GLM-4v-9b的核心优势高分辨率支持1120×1120的原生支持细节识别能力强中英双语优化对中文场景特别友好OCR和图表理解表现优秀部署友好INT4量化后仅需9GB显存单卡RTX 4090即可运行开源免费Apache 2.0协议小规模商业使用免费7.3 下一步学习建议如果你还想深入探索我建议1. 性能优化学习使用vLLM或TGIText Generation Inference进一步加速推理探索模型量化技术INT8、INT4在保持精度的同时减少资源消耗实现流式输出提升用户体验2. 功能扩展结合其他模型实现更复杂的功能链比如图片分析 → 生成营销文案 → 设计海报添加图片预处理功能自动裁剪、增强、去噪实现多模型投票机制提高回答准确性3. 实际应用将机器人集成到客服系统自动回答用户关于产品图片的问题构建内容审核系统自动识别违规图片开发教育辅助工具帮助理解教材中的图表和插图4. 模型微调收集特定领域的图片数据如医疗影像、工业检测使用LoRA等高效微调技术让模型更懂你的专业领域部署微调后的模型获得更好的领域表现7.4 最后的建议开始总是最难的但你已经迈出了最重要的一步。我建议你先从简单的应用开始比如用机器人帮你整理相册自动添加描述分析工作文档中的图表快速提取关键信息为社交媒体图片生成有趣的描述文案在实际使用中你会遇到各种问题但每个问题的解决都会让你更了解这个技术。记住最好的学习方式就是动手实践。如果你在实践过程中遇到问题或者有新的想法和发现欢迎在评论区分享。技术之路我们一起前行。获取更多AI镜像想探索更多AI镜像和应用场景访问 CSDN星图镜像广场提供丰富的预置镜像覆盖大模型推理、图像生成、视频生成、模型微调等多个领域支持一键部署。

保姆级教程：手把手教你用GLM-4v-9b搭建图片问答机器人

相关文章：

保姆级教程：手把手教你用GLM-4v-9b搭建图片问答机器人

新手福音：基于预置镜像，在快马平台零配置开启Python Web开发之旅

MogFace人脸检测工具问题排查大全：从路径错误到权限问题的解决方案

别再手动整理了！用Python脚本5分钟搞定ImageNet验证集标签映射（附完整代码）

抖音下载器：从零开始，轻松获取无水印视频的完整指南

comsol matlab联合仿真也可加入solidworks三软件联合参数化建模全自动...

告别模糊边界！用Monodepth2实战KITTI深度估计，详解自动掩码与最小重投影损失

电路设计与漫画艺术的跨界融合

私域数据安全与合规——企微引流必须注意的5个技术红线

万象视界灵坛惊艳效果展示：同一张宠物图在‘金毛犬’‘幼犬’‘户外玩耍’‘毛发蓬松’多维排序

Qwerty Learner可扩展性设计：为未来功能预留空间的完整指南

SEO_五个立竿见影的页面SEO优化技巧指南

Linux内核工程师面试高频问题解析

无人机开发者必看：如何基于QGC源码定制你的专属地面站？从环境搭建到第一个插件开发

WSL 启动闪退问题排查

MelonLoader终极指南：Unity游戏Mod加载器从入门到精通

cv2.findContours()错误的解决办法ValueError: not enough values to unpack (expected 3, got 2)

ANIMATEDIFF PRO教学创新：Jupyter Notebook交互式教程

眼图分析：高速数字信号完整性的关键工具

Nordic Power Profiler Kit II 保姆级教程：从硬件连接到软件操作全流程

PasteMD算力优化成果：Ollama量化后llama3:8b仅需4GB内存，推理速度提升2.3倍

5分钟掌握高效网页完整截图：告别手动拼接的烦恼

10分钟掌握全网资源下载神器：res-downloader从入门到精通

告别环境冲突！在PyCharm里用Anaconda为ArcGIS 10.2创建专属Arcpy虚拟环境（附32/64位切换指南）

在Ubuntu 22.04上搞定Gen6D位姿估计：从CUDA 11.8到Pytorch3D 0.7.8的完整环境搭建避坑指南

【Git】深入解析 ‘.git/index.lock‘ 文件冲突：从报错到彻底解决

新手零基础入门：用快马一键生成交互式python学习jupyter notebook

如何在旧款Mac上安装最新macOS：OpenCore Legacy Patcher完整指南

5分钟快速上手LosslessCut：零编码视频剪辑的终极指南

使用seo站点管理系统需要注意哪些事项