
[Beginner's Guide] Remotely Deploying the DeepSeek 7B Distilled Model with a PyQt GUI

news 2025/11/7 4:23:02

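Local result: (If you want to use this for an undergraduate thesis project, I recommend polishing the interface.)

Summary: connect to an AutoDL server with MobaXterm, then use MobaXterm's X11 forwarding so that the PyQt window running on the server is displayed on your own computer.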

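1. Download MobaXterm from the official website

MobaXterm ("free Xserver and tabbed SSH client for Windows"). I find it easier than PyCharm Professional for connecting to an AutoDL server: with PyCharm you have to think about file synchronization, while MobaXterm does not make you click through that many sync settings. Working in MobaXterm is also more convenient than using JupyterLab on AutoDL.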

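2. Rent a server on AutoDL

The price is usually 1.6 to 2 RMB per hour. Make sure the GPU memory is large enough for the model's parameter count (I don't yet know exactly how to judge this; I'll add it once I do). I rented a 4090D. I recommend renting early in the day: it is 8 p.m. as I write this and I just failed to rent a card when creating a new instance, whereas at 8 a.m. I saw spare capacity.

For the image you can pick a community image. I used the V3 version, which ships with the 1.5B model. deepseek-ai/DeepSeek-R1/DeepSeek-R1: "One-click chat, multi-dimensional research. DeepSeek-R1 currently ranks in the top three of the global LLM arena, with performance comparable to the official OpenAI o1 release. The latest model released by DeepSeek on January 20, 2025!" - CG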
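As a rough rule of thumb (my assumption, not something verified in this post), the weights alone need about "parameter count times bytes per parameter" of GPU memory, and activations plus the KV cache come on top of that. The sketch below turns this into a quick check. The helper names, the nominal 7e9 parameter count, the 24 GiB card size, and the 20% overhead factor are illustrative assumptions, not measured values.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights, in GiB (2 bytes/param for bf16/fp16)."""
    return num_params * bytes_per_param / 1024**3


def fits_on_gpu(num_params: float, gpu_gib: float,
                bytes_per_param: int = 2, overhead: float = 1.2) -> bool:
    """Rough check: weights plus an assumed 20% margin must fit in GPU memory."""
    return weight_memory_gib(num_params, bytes_per_param) * overhead <= gpu_gib


# Nominal 7B parameters in bf16 on an assumed 24 GiB card.
print(f"weights: {weight_memory_gib(7e9):.1f} GiB")
print("fits on 24 GiB:", fits_on_gpu(7e9, 24))
```

The margin is deliberately crude: long prompts and large max_new_tokens values grow the KV cache, so treat a "fits" result as a lower bound rather than a guarantee.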

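3. Connect MobaXterm to the AutoDL server

Reference: "[MobaXterm] Tutorial on logging in and connecting to a server" (CSDN blog)

4. I downloaded the 7B model. Note that the commands below download it into the autodl-tmp directory.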

conda activate torch_env
export HF_ENDPOINT=https://hf-mirror.com
huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --local-dir /root/autodl-tmp/DeepSeek-R1-Distill-Qwen-7B --resume-download
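Before loading the model it can help to confirm that the download actually finished. The following is a minimal sketch under my own assumptions: that a complete download contains config.json, tokenizer_config.json, and at least one *.safetensors weight file. The helper name `missing_model_files` and this file list are hypothetical and are not part of the original setup.

```python
from pathlib import Path
from typing import List


def missing_model_files(model_dir: str) -> List[str]:
    """Return the expected entries that are absent from model_dir (empty list = looks complete)."""
    root = Path(model_dir)
    # Assumed minimal set of metadata files for a Hugging Face checkpoint.
    missing = [name for name in ("config.json", "tokenizer_config.json")
               if not (root / name).is_file()]
    # At least one weight shard should be present.
    if not any(root.glob("*.safetensors")):
        missing.append("*.safetensors")
    return missing


print(missing_model_files("/root/autodl-tmp/DeepSeek-R1-Distill-Qwen-7B"))
```

An empty list means the directory passes this basic check. It does not verify file sizes or checksums.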

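Compared with the script for the 1.5B model, the 7B script should differ only in the model path. Once it is running you can chat with the model in the terminal; type exit to quit:

File name: deepseek_multichat-7B.py

Run it in the terminal with: python deepseek_multichat-7B.py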

import os
import json
import unicodedata
from typing import List

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM


@torch.inference_mode()
def generate(
    model: AutoModelForCausalLM,
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    max_new_tokens: int,
    temperature: float = 1.0,
) -> List[int]:
    """Generate response from the model with attention_mask provided."""
    outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,  # pass an explicit attention mask
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        eos_token_id=model.config.eos_token_id,
        pad_token_id=model.config.eos_token_id,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )
    return outputs[0].tolist()


def clean_input(user_input):
    """Clean user input: drop invisible characters and surrounding whitespace."""
    # Remove control characters (Unicode categories starting with "C")
    user_input = "".join(c for c in user_input if not unicodedata.category(c).startswith("C"))
    return user_input.strip()


def clean_message_content(content):
    """Clean message content: strip whitespace and filter out invalid input."""
    if not content or not isinstance(content, str):
        return ""
    return content.strip()


def build_prompt(messages, max_history=3):
    """Build prompt for the model, limiting the history to the most recent messages."""
    template = "The following is a conversation with an AI assistant. The assistant is helpful, knowledgeable, and polite:\n"
    for msg in messages[-max_history:]:
        content = clean_message_content(msg["content"])
        if not content:  # skip empty content
            continue
        template += f"{msg['role'].capitalize()}: {content}\n"
    template += "Assistant: "
    return template.strip()  # always returns a string


if __name__ == "__main__":
    print("Initializing DeepSeek-R1 Service...")

    # Configuration
    ckpt_path = "/root/autodl-tmp/DeepSeek-R1-Distill-Qwen-7B"  # model directory
    config_path = "/root/autodl-tmp/DeepSeek-R1-Distill-Qwen-7B/config.json"  # config file path

    # Load tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path)
    model = AutoModelForCausalLM.from_pretrained(
        ckpt_path,
        torch_dtype=torch.bfloat16,
    ).cuda()

    # Interactive session
    messages = []  # To maintain context
    while True:
        user_input = input("You: ").strip()
        user_input = clean_input(user_input)  # remove invisible characters
        if not user_input:  # reject empty input
            print("Invalid input. Please type something meaningful!")
            continue
        if user_input.lower() in ["exit", "quit"]:
            print("Exiting conversation. Goodbye!")
            break

        # Append user input to context
        messages.append({"role": "user", "content": user_input})

        # Limit conversation history to the 10 most recent messages
        messages = messages[-10:]

        # Build prompt and tokenize
        prompt = build_prompt(messages)
        if not isinstance(prompt, str) or len(prompt.strip()) == 0:  # make sure the prompt is non-empty
            print("Error: Prompt is empty or invalid. Skipping this turn.")
            continue
        tokenized_prompt = tokenizer(prompt, return_tensors="pt", truncation=True, padding=True)
        input_ids = tokenized_prompt["input_ids"].to("cuda")
        attention_mask = tokenized_prompt["attention_mask"].to("cuda")

        # Generate response
        max_new_tokens = 500  # previously 150
        temperature = 0.7
        completion_tokens = generate(model, input_ids, attention_mask, max_new_tokens, temperature)
        completion = tokenizer.decode(
            completion_tokens[len(input_ids[0]):],  # keep only the newly generated part
            skip_special_tokens=True,
        ).split("User:")[0].strip()
        print(f"Assistant: {completion}")

        # Append assistant response to context
        messages.append({"role": "assistant", "content": completion})
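To see what text the model actually receives, the snippet below runs a simplified copy of build_prompt (without the content-cleaning helper) on a short, made-up history. The conversation is illustrative only. Note that with max_history=3 only the last three messages reach the prompt, even though the chat loop keeps the last ten.

```python
def build_prompt(messages, max_history=3):
    """Simplified copy of the script's build_prompt, for illustration."""
    template = ("The following is a conversation with an AI assistant. "
                "The assistant is helpful, knowledgeable, and polite:\n")
    for msg in messages[-max_history:]:
        content = msg["content"].strip()
        if not content:
            continue
        template += f"{msg['role'].capitalize()}: {content}\n"
    template += "Assistant: "
    return template.strip()


# Made-up five-message history; only the last three should appear in the prompt.
history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
    {"role": "user", "content": "What is PyQt?"},
    {"role": "assistant", "content": "A Python binding for Qt."},
    {"role": "user", "content": "Thanks"},
]
prompt = build_prompt(history)
print(prompt)
```

Because the final strip() removes the trailing space, the prompt ends with "Assistant:" and the model continues from there. This is also why the script cuts the decoded text at the first "User:" it generates.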

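5. Next comes the part that gave me the most trouble: displaying the PyQt GUI.

A GUI cannot be displayed directly on AutoDL. There are two ways around this. Method 1: wrap the model in a Flask API, call that API from your own computer, and render the GUI locally. Method 2: use X11 forwarding, so the GUI program runs on AutoDL but its window is displayed on your own computer. I used method 2. Method 1 was suggested by a senior labmate; I haven't tried it yet and will write it up once I do.

I first tried Xlaunch and it did not work; MobaXterm does not need that much configuration.

If X11 Forward shows a red cross, see: "Connecting to a server with MobaXterm and using X11 forwarding for graphical display (personal study notes)" (CSDN blog)

The most important setting is probably X11UseLocalhost no (an sshd_config option on the server). I use PyQt, which does not need localhost, and I set it to no as well.

PyQt GUI code: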
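Method 1 could look roughly like the untested sketch below. To stay dependency-free it uses Python's standard-library http.server in place of Flask, so it is a stand-in for the idea rather than the Flask version itself. The `/chat` path, port 8000, the JSON field names `message` and `reply`, and the `echo_reply` placeholder are all hypothetical; in a real deployment `reply_fn` would wrap the tokenizer and model calls.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


def handle_chat(payload: Dict, reply_fn: Callable[[str], str]) -> Tuple[int, Dict]:
    """Validate a request body and return (HTTP status, JSON-serializable response)."""
    message = payload.get("message")
    if not isinstance(message, str) or not message.strip():
        return 400, {"error": "field 'message' must be a non-empty string"}
    return 200, {"reply": reply_fn(message.strip())}


def make_handler(reply_fn: Callable[[str], str]):
    """Build a request handler class bound to a particular reply function."""
    class ChatHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/chat":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length) or b"{}")
            except ValueError:  # malformed JSON or bad encoding
                payload = {}
            if not isinstance(payload, dict):
                payload = {}
            status, body = handle_chat(payload, reply_fn)
            data = json.dumps(body, ensure_ascii=False).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json; charset=utf-8")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)
    return ChatHandler


def echo_reply(message: str) -> str:
    """Placeholder standing in for the real model call."""
    return f"(echo) {message}"


def serve(reply_fn: Callable[[str], str], host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the blocking HTTP server; call this on the GPU machine."""
    HTTPServer((host, port), make_handler(reply_fn)).serve_forever()
```

Calling serve(echo_reply) on the server starts the endpoint. The local PyQt program would then POST {"message": "..."} to /chat and display the returned reply, so no X11 forwarding is needed in this design.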
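When X11 forwarding is active, the SSH session sets the DISPLAY environment variable on the server. If it is missing, Qt typically cannot open a window. A small pre-flight check, sketched below, can report this before the 7B model is loaded, which saves waiting for the weights only to have the GUI fail. The helper name `x11_display` is hypothetical.

```python
import os
from typing import Mapping, Optional


def x11_display(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the DISPLAY value if it is set and non-empty, otherwise None."""
    value = env.get("DISPLAY", "").strip()
    return value or None


display = x11_display()
if display is None:
    print("DISPLAY is not set: X11 forwarding is probably not active in this session.")
else:
    print(f"X11 display: {display}")
```

A non-empty DISPLAY only shows that forwarding was requested; the X server on the local side (MobaXterm here) still has to be running for the window to appear.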

import html
import sys
from PyQt5.QtWidgets import *
from PyQt5.QtCore import *
from PyQt5.QtGui import *
import os
import json
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
import unicodedata
from typing import List


# The model-related functions below are unchanged from deepseek_multichat-7B.py.
@torch.inference_mode()
def generate(
    model: AutoModelForCausalLM,
    input_ids: torch.Tensor,
    attention_mask: torch.Tensor,
    max_new_tokens: int,
    temperature: float = 1.0,
) -> List[int]:
    """Generate response from the model with attention_mask provided."""
    outputs = model.generate(
        input_ids=input_ids,
        attention_mask=attention_mask,  # pass an explicit attention mask
        max_new_tokens=max_new_tokens,
        temperature=temperature,
        eos_token_id=model.config.eos_token_id,
        pad_token_id=model.config.eos_token_id,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )
    return outputs[0].tolist()


def clean_input(user_input):
    """Clean user input: drop invisible characters and surrounding whitespace."""
    # Remove control characters (Unicode categories starting with "C")
    user_input = "".join(c for c in user_input if not unicodedata.category(c).startswith("C"))
    return user_input.strip()


def clean_message_content(content):
    """Clean message content: strip whitespace and filter out invalid input."""
    if not content or not isinstance(content, str):
        return ""
    return content.strip()


def build_prompt(messages, max_history=3):
    """Build prompt for the model, limiting the history to the most recent messages."""
    template = "The following is a conversation with an AI assistant. The assistant is helpful, knowledgeable, and polite:\n"
    for msg in messages[-max_history:]:
        content = clean_message_content(msg["content"])
        if not content:  # skip empty content
            continue
        template += f"{msg['role'].capitalize()}: {content}\n"
    template += "Assistant: "
    return template.strip()  # always returns a string


class ChatWindow(QWidget):
    def __init__(self, model, tokenizer):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer
        self.messages = []
        self.init_ui()

    def init_ui(self):
        self.setWindowTitle('DeepSeek Chat')
        self.setGeometry(300, 300, 800, 600)

        # Create the widgets
        self.history_area = QTextEdit()
        self.history_area.setReadOnly(True)
        self.input_area = QTextEdit()
        self.input_area.setMaximumHeight(100)
        self.send_btn = QPushButton('Send')
        self.clear_btn = QPushButton('Clear history')

        # Layout
        vbox = QVBoxLayout()
        vbox.addWidget(self.history_area)
        hbox = QHBoxLayout()
        hbox.addWidget(self.input_area)
        hbox.addWidget(self.send_btn)
        hbox.addWidget(self.clear_btn)
        vbox.addLayout(hbox)
        self.setLayout(vbox)

        # Signal connections
        self.send_btn.clicked.connect(self.on_send)
        self.clear_btn.clicked.connect(self.on_clear)
        self.input_area.installEventFilter(self)

    def eventFilter(self, obj, event):
        # Ctrl+Return in the input box sends the message
        if obj is self.input_area and event.type() == QEvent.KeyPress:
            if event.key() == Qt.Key_Return and event.modifiers() & Qt.ControlModifier:
                self.on_send()
                return True
        return super().eventFilter(obj, event)

    def update_history(self, role, content):
        # Escape the text so characters such as "<" in model output are shown, not parsed as HTML
        safe = html.escape(content).replace("\n", "<br>")
        self.history_area.append(f"<b>{role.capitalize()}:</b> {safe}<br>")
        self.history_area.verticalScrollBar().setValue(
            self.history_area.verticalScrollBar().maximum()
        )

    def on_clear(self):
        self.messages = []
        self.history_area.clear()

    def on_send(self):
        user_input = self.input_area.toPlainText().strip()
        # Ignore empty input, and ignore sends while a reply is still being generated
        if not user_input or not self.send_btn.isEnabled():
            return

        # Clean the input
        user_input = clean_input(user_input)
        self.input_area.clear()
        self.update_history("user", user_input)

        # Add to the message history
        self.messages.append({"role": "user", "content": user_input})
        self.messages = self.messages[-10:]

        # Generate the reply in a background thread; block further sends so a
        # running worker is never replaced by a new one
        self.send_btn.setEnabled(False)
        self.worker = Worker(self.model, self.tokenizer, self.messages)
        self.worker.finished.connect(self.handle_response)
        self.worker.start()

    def handle_response(self, completion):
        self.messages.append({"role": "assistant", "content": completion})
        self.update_history("assistant", completion)
        self.send_btn.setEnabled(True)


class Worker(QThread):
    finished = pyqtSignal(str)

    def __init__(self, model, tokenizer, messages):
        super().__init__()
        self.model = model
        self.tokenizer = tokenizer
        self.messages = messages

    def run(self):
        prompt = build_prompt(self.messages)
        tokenized_prompt = self.tokenizer(prompt, return_tensors="pt", truncation=True, padding=True)
        input_ids = tokenized_prompt["input_ids"].to("cuda")
        attention_mask = tokenized_prompt["attention_mask"].to("cuda")
        max_new_tokens = 500
        temperature = 0.7
        completion_tokens = generate(self.model, input_ids, attention_mask, max_new_tokens, temperature)
        completion = self.tokenizer.decode(
            completion_tokens[len(input_ids[0]):],  # keep only the newly generated part
            skip_special_tokens=True,
        ).split("User:")[0].strip()
        self.finished.emit(completion)


if __name__ == "__main__":
    # Initialize the model (same as in the original script)
    print("Initializing DeepSeek-R1 Service...")
    ckpt_path = "/root/autodl-tmp/DeepSeek-R1-Distill-Qwen-7B"
    tokenizer = AutoTokenizer.from_pretrained(ckpt_path)
    model = AutoModelForCausalLM.from_pretrained(
        ckpt_path,
        torch_dtype=torch.bfloat16,
    ).cuda()

    # Start the GUI
    app = QApplication(sys.argv)
    window = ChatWindow(model, tokenizer)
    window.show()
    sys.exit(app.exec_())
