当前位置：首页 > news >正文

deploy local llm ragflow

news 2026/5/22 15:36:36

CPU >= 4 cores
RAM >= 16 GB
Disk >= 50 GB
Docker >= 24.0.0 & Docker Compose >= v2.26.1

下载docker：

官方下载方式：https://docs.docker.com/desktop/install/ubuntu/

其中 DEB package需要手动下载并传输到服务器

国内下载方式：
https://blog.csdn.net/u011278722/article/details/137673353

Ensure vm.max_map_count >= 262144:

check：
$ sysctl vm.max_map_count

Reset vm.max_map_count to a value at least 262144 if it is not:
$ sudo sysctl -w vm.max_map_count=262144

This change will be reset after a system reboot. To ensure your change remains permanent, add or update the vm.max_map_count value in /etc/sysctl.conf accordingly:
$ vm.max_map_count=262144

Clone the repo:
$ git clone https://github.com/infiniflow/ragflow.git
该步骤需要手动下载并传输，国内无法下载

Build the pre-built Docker images and start up the server:
$ cd ragflow/docker
$ chmod +x ./entrypoint.sh
$ docker compose up -d
这一步也需要手动传输或直接用用源代码build（见最后）

Check the server status after having the server up and running:
$ docker logs -f ragflow-server

The following output confirms a successful launch of the system:
____ ______ __
/ __ \ ____ _ ____ _ / // / _ __
/ // // __ // __ // / / // __ | | /| / /
/ , // // // // // / / // // /| |/ |/ /
// || _,/ _, /// // _/ |/|_/
/____/

Running on all addresses (0.0.0.0)
Running on http://127.0.0.1:9380
Running on http://x.x.x.x:9380
INFO:werkzeug:Press CTRL+C to quit

In your web browser, enter the IP address of your server and log in to RAGFlow.

With the default settings, you only need to enter http://IP_OF_YOUR_MACHINE (sans port number) as the default HTTP serving port 80 can be omitted when using the default configurations.

In service_conf.yaml, select the desired LLM factory in user_default_llm and update the API_KEY field with the corresponding API key.

See llm_api_key_setup for more information.

Rebuild:

To build the Docker images from source:
$ git clone https://github.com/infiniflow/ragflow.git
$ cd ragflow/
$ docker build -t infiniflow/ragflow:dev .
$ cd ragflow/docker
$ chmod +x ./entrypoint.sh
$ docker compose up -d

卸载原有cuda和驱动
https://blog.alumik.cn/posts/90/#:~:text=Use%20the%20following%20command%20to%20uninstall%20a%20Toolkit,remove%20–purge%20%27%5Envidia-.%2A%27%20sudo%20apt-get%20remove%20–purge%20%27%5Elibnvidia-.%2A%27

CUDA 和 Nvdia driver安装：
https://blog.hellowood.dev/posts/ubuntu-22-%E5%AE%89%E8%A3%85-nvdia-%E6%98%BE%E5%8D%A1%E9%A9%B1%E5%8A%A8%E5%92%8C-cuda/

下载Vllm
https://qwen.readthedocs.io/zh-cn/latest/deployment/vllm.html

国内下载model： /Qwen2-7B-Instruct方法：
pip install modelscope
from modelscope import snapshot_download
model_dir = snapshot_download(‘qwen/Qwen2-7B-Instruct’, cache_dir=‘/home/llmlocal/qwen/qwen/’)

运行llm服务器
python -m vllm.entrypoints.openai.api_server --model /home/llmlocal/qwen/qwen/Qwen2-7B-Instruct --host 0.0.0.0 --port 8000

测试：
curl http://localhost:8000/v1/chat/completions -H “Content-Type: application/json” -d ‘{
“model”: “/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct”,
“messages”: [
{“role”: “system”, “content”: “You are a helpful assistant.”},
{“role”: “user”, “content”: “Tell me something about large language models.”}
],
“temperature”: 0.7,
“top_p”: 0.8,
“repetition_penalty”: 1.05,
“max_tokens”: 512
}’

更改ragflow的MODEL_NAME = “/home/llmlocal/qwen/qwen/Qwen2-7B-Instruct” 路径在rag里的chat_model

deploy local llm ragflow

相关文章：

deploy local llm ragflow

测桃花运（算姻缘）的网站系统源码

电商平台优惠券

内衣洗衣机多维度测评对比，了解觉飞、希亦、鲸立哪款内衣洗衣机更好

数据结构和算法入门

基于OpenCV C++的网络实时视频流传输——Windows下使用TCP/IP编程原理

(BS ISO 11898-1:2015）CAN_FD 总线协议详解6- PL（物理层）规定3

docker环境下php安装扩展步骤以mysqli为例

医院综合绩效核算系统，绩效核算系统源码，采用springboot+avue+MySQL技术开发，可适应医院多种绩效核算方式。

ROOM数据快速入门

刷新，前面接口的返回值没有到，第二个接口已经请求完了，导致第二个接口返回数据错误

pdcj设计

【数据结构】哈希表的模拟实现

面试经典算法150题系列-数组/字符串操作之多数元素

海南云亿商务咨询有限公司领航抖音电商服务

C#初级——继承

Github 2024-07-29 开源项目日报 Top10

nginx反向代理和负载均衡+安装jdk-22.0.2

软考高级科目怎么选？软考高级含金量排序

【机器学习西瓜书学习笔记——模型评估与选择】

终极指南：如何为Masa Mods全家桶安装中文汉化包，彻底告别英文界面困扰

DataRoom开源大屏设计器：零代码打造专业数据可视化大屏的终极指南

Navicat Premium试用期重置终极指南：三步恢复完整14天试用

免ROOT使用Frida：Android合规调试的底层原理与四条落地路径

当 ABAP 代码想走出 SAP 系统：一个标准化文件格式的故事

Node.js 服务中如何异步调用 Taotoken 聚合接口实现 AI 功能集成

WarcraftHelper终极教程：5分钟让魔兽争霸3焕发新生

Windows键盘终极改造指南：用SharpKeys解锁键盘隐藏潜力

3个妙招突破百度网盘限速：baidu-wangpan-parse终极解析指南

小爱音箱音乐播放限制破解实战：从基础配置到高级玩法深度解析