
Installing Xinference (and resolving the httpx-related error)

news 2026/2/9 0:08:41

Why use Xinference

Installing Xinference

Environment

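1) conda create -n Xinference python=3.11

Note: with Python 3.9 or 3.10, installing xinference may hit numpy compatibility problems, and the all variant may fail to install.

Errors: messages such as "error while attempting to bind on address" and "no directory" are caused by an SSL startup failure. Installing the full variant directly, pip install "xinference[all]", is recommended.

Note: single-machine deployment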

Start: xinference-local --host 127.0.0.1 --port 9997

Startup modes

Foreground: xinference-local --host 127.0.0.1 --port 9997

Background: nohup xinference-local --host 127.0.0.1 --port 9997 > output.log 2>&1 &

Available install variants:

# CUDA/CPU

pip install "xinference[transformers]"

pip install "xinference[vllm]"

pip install "xinference[sglang]"

# Metal(MPS)

pip install "xinference[mlx]"

CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python

Normal start: xinference-local --host 0.0.0.0 --port 9997

Start with model settings:

# CUDA/CPU

XINFERENCE_HOME=/path/.xinference XINFERENCE_MODEL_SRC=modelscope xinference-local --host 0.0.0.0 --port 9997

# Metal(MPS)

XINFERENCE_HOME=/path/.xinference XINFERENCE_MODEL_SRC=modelscope PYTORCH_ENABLE_MPS_FALLBACK=1 xinference-local --host 0.0.0.0 --port 9997

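Xinference supports cluster deployment

Start the Supervisor on the main server

Start: xinference-supervisor -H 192.168.31.100 --port 9997

Start a Worker on each of the other servers

# Format (MAIN_SERVER_IP is the Supervisor's IP, CURRENT_SERVER_IP is this worker server's own IP)

xinference-worker -e "http://${MAIN_SERVER_IP}:9997" -H ${CURRENT_SERVER_IP}

# Example

xinference-worker -e "http://192.168.31.100:9997" -H 192.168.31.101

API docs address: http://localhost:9997/docs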
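A quick way to confirm that the server came up is to request the docs page mentioned above. The following is a minimal sketch using only the Python standard library; the function name docs_reachable and the 3-second timeout are illustrative choices, not part of Xinference.

```python
import urllib.error
import urllib.request


def docs_reachable(base_url: str = "http://localhost:9997", timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/docs answers with HTTP 200."""
    url = base_url.rstrip("/") + "/docs"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("docs page reachable:", docs_reachable())
```

If this prints False right after startup, the server may still be initialising, or it may be bound to a different host or port than the one being probed.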

(1) Register the model

xinference register --model-type LLM --file custom-glm4-chat.json --persist

(2) Launch the model

xinference launch --model-name custom-glm4-chat --model-format pytorch --model-engine Transformers

Langchain-chatchat

Error 1

"C:\Users\Administrator\Desktop\Langchain-Chatchat-master\libs\chatchat-server\chatchat\webui_pages\kb_chat.py", line 118, in kb_chat
    kb_list = [x["kb_name"] for x in api.list_knowledge_bases()

{ "input": "The food was delicious and the waiter...", "model": "360Zhinao-search", "encoding_format": "float" }
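The JSON above has the shape of an OpenAI-style embeddings request body. Assuming it is sent to the OpenAI-compatible /v1/embeddings route of the local Xinference server, and that the embedding model 360Zhinao-search has already been launched, the call could be sketched as follows. The helper names build_embedding_request and embed are hypothetical, and the response layout (data[0].embedding) is assumed to follow the OpenAI format.

```python
import json
import urllib.request


def build_embedding_request(base_url: str, model: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying the same fields as the JSON body above."""
    payload = {"input": text, "model": model, "encoding_format": "float"}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(base_url: str, model: str, text: str, timeout: float = 30.0) -> list:
    """Send the request and return the first embedding vector.

    Assumes an OpenAI-style response: {"data": [{"embedding": [...]}], ...}.
    """
    request = build_embedding_request(base_url, model, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["data"][0]["embedding"]


if __name__ == "__main__":
    vector = embed(
        "http://127.0.0.1:9997",
        "360Zhinao-search",
        "The food was delicious and the waiter...",
    )
    print("embedding length:", len(vector))
```

Calling the endpoint directly like this bypasses Langchain-Chatchat, which helps tell a server-side problem apart from a client-side one.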

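Important:

Run pip list to check your httpx version. I found that the latest release, httpx==0.28.0, does not work; version 0.27.2 is required. After reinstalling that version, the error no longer appears.

The resulting error is:

langchain-chatchat fails with: Client.__init__() got an unexpected keyword argument 'proxies'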
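The version check can also be done from Python instead of reading pip list by eye. This is a minimal sketch: the (0, 28) cut-off encodes the observation above (0.27.2 works, 0.28.0 does not), which is consistent with httpx 0.28.0 removing the deprecated proxies argument. The function names are illustrative.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract the leading 'X.Y' of a version string as a pair of ints."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def accepts_proxies_argument(version: str) -> bool:
    """True for httpx versions older than 0.28 (cut-off taken from the note above)."""
    return major_minor(version) < (0, 28)


def installed_httpx_version() -> Optional[str]:
    """Return the installed httpx version, or None if httpx is not installed."""
    try:
        return metadata.version("httpx")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_httpx_version()
    if version is None:
        print("httpx is not installed")
    elif accepts_proxies_argument(version):
        print(f"httpx {version}: still accepts the 'proxies' argument")
    else:
        print(f"httpx {version}: too new, try: pip install httpx==0.27.2")
```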

Error 2

RuntimeError: Cluster is not available after multiple attempts

This is mainly caused by a wrong host IP at startup, most typically the 0.0.0.0 address; for local use, start with 127.0.0.1 instead.
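A related point on the client side: 0.0.0.0 is a wildcard bind address, not an address to connect to. If scripts build the server URL from the same host value that was passed to --host, a small helper like this one (hypothetical, not part of Xinference) can map the wildcard to the loopback address recommended above.

```python
WILDCARD_HOST = "0.0.0.0"  # "listen on every interface"; not a connect target
LOOPBACK_HOST = "127.0.0.1"


def client_base_url(host: str, port: int = 9997) -> str:
    """Build the URL a local client should use for a server started with --host <host>."""
    connect_host = LOOPBACK_HOST if host == WILDCARD_HOST else host
    return f"http://{connect_host}:{port}"


if __name__ == "__main__":
    print(client_base_url("0.0.0.0"))
    print(client_base_url("192.168.31.100"))
```

This only helps clients running on the same machine as the server; remote clients still need the server's real IP, as in the cluster example.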
