LitGPT - 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale

news 2025/12/11 2:37:02

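Contents

- I. About LitGPT
- II. Quick start
  - Install LitGPT
    - Advanced install options
  - Choose from 20+ LLMs
- III. Workflows
  - 1. All workflows
  - 2. Finetune an LLM
  - 3. Deploy an LLM
  - 4. Evaluate an LLM
  - 5. Test an LLM
  - 6. Pretrain an LLM
  - 7. Continue pretraining an LLM
- IV. State-of-the-art features
- V. Training recipes
  - Example
- VI. Project highlights
- Tutorials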

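I. About LitGPT

LitGPT lets you use, finetune, pretrain, and deploy LLMs, Lightning fast ⚡⚡

Every LLM is implemented from scratch with no abstractions and full control, which makes them fast, minimal, and performant at enterprise scale.

GitHub: https://github.com/Lightning-AI/litgpt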

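✅ **Enterprise ready -** Apache 2.0 for unlimited enterprise use.

✅ **Developer friendly -** Easy debugging with no abstraction layers and single-file implementations.

✅ **Optimized performance -** Models designed to maximize performance, reduce costs, and speed up training.

✅ **Proven recipes -** Highly optimized training/finetuning recipes tested at enterprise scale.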

✅ From scratch implementations     ✅ No abstractions    ✅ Beginner friendly   
✅ Flash attention                  ✅ FSDP               ✅ LoRA, QLoRA, Adapter
✅ Reduce GPU memory (fp4/8/16/32)  ✅ 1-1000+ GPUs/TPUs  ✅ 20+ LLMs

II. Quick start

Install LitGPT

pip install 'litgpt[all]'

Load and use any of the 20+ LLMs:

from litgpt import LLM

llm = LLM.load("microsoft/phi-2")
text = llm.generate("Fix the spelling: Every fall, the familly goes to the mountains.")
print(text)
# Corrected Sentence: Every fall, the family goes to the mountains.

✅ Optimized for fast inference
✅ Quantization
✅ Runs on low-memory GPUs
✅ No layers of internal abstractions
✅ Optimized for production scale

Advanced install options

Install from source:

git clone https://github.com/Lightning-AI/litgpt
cd litgpt
pip install -e '.[all]'

Explore the full Python API docs.

Choose from 20+ LLMs

Every model is written from scratch to maximize performance and remove layers of abstraction:

Model	Model size	Author	Reference
Llama 3, 3.1, 3.2	1B, 3B, 8B, 70B, 405B	Meta AI	Meta AI 2024
Code Llama	7B, 13B, 34B, 70B	Meta AI	Rozière et al. 2023
Mixtral MoE	8x7B, 8x22B	Mistral AI	Mistral AI 2023
Mistral	7B, 123B	Mistral AI	Mistral AI 2023
CodeGemma	7B	Google	Google Team, Google Deepmind
Gemma 2	2B, 9B, 27B	Google	Google Team, Google Deepmind
Phi 3 & 3.5	3.8B	Microsoft	Abdin et al. 2024
…	…	…	…

III. Workflows

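Use the command-line interface to run advanced workflows such as pretraining or finetuning on your own data.

1. All workflows

After installing LitGPT, select the model and the workflow to run (finetune, pretrain, evaluate, deploy, etc.):

# litgpt [action] [model]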
litgpt  serve     meta-llama/Llama-3.2-3B-Instruct
litgpt  finetune  meta-llama/Llama-3.2-3B-Instruct
litgpt  pretrain  meta-llama/Llama-3.2-3B-Instruct
litgpt  chat      meta-llama/Llama-3.2-3B-Instruct
litgpt  evaluate  meta-llama/Llama-3.2-3B-Instruct

2. Finetune an LLM

Run on Studios: https://lightning.ai/lightning-ai/studios/litgpt-finetune

Finetuning is the process of taking a pretrained AI model and training it further on a smaller, specialized dataset tailored to a specific task or application.

# 0) Set up your dataset
curl -L https://huggingface.co/datasets/ksaw008/finance_alpaca/resolve/main/finance_alpaca.json -o my_custom_dataset.json

# 1) Finetune a model (auto downloads weights)
litgpt finetune microsoft/phi-2 \
  --data JSON \
  --data.json_path my_custom_dataset.json \
  --data.val_split_fraction 0.1 \
  --out_dir out/custom-model

# 2) Test the model
litgpt chat out/custom-model/final

# 3) Deploy the model
litgpt serve out/custom-model/final
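
The `--data JSON` option reads instruction data from a JSON file. As a rough sketch, the snippet below writes a tiny dataset in an Alpaca-style layout (a list of records with `instruction`, `input`, and `output` keys) and checks it before training. The layout is an assumption about what `my_custom_dataset.json` contains, and the file name `tiny_dataset.json` and the helper `validate_records` are hypothetical, not part of LitGPT.

```python
import json
from pathlib import Path

# Assumed Alpaca-style record layout; confirm against the dataset docs before relying on it.
REQUIRED_KEYS = {"instruction", "output"}  # "input" is treated as optional in this sketch

records = [
    {
        "instruction": "Define compound interest.",
        "input": "",
        "output": "Interest earned on both the principal and previously accumulated interest.",
    },
    {
        "instruction": "Summarize the text.",
        "input": "Bonds pay fixed coupons.",
        "output": "Bonds provide fixed periodic payments.",
    },
]


def validate_records(rows):
    """Return the indices of records that lack a required key (hypothetical helper)."""
    return [i for i, row in enumerate(rows) if not REQUIRED_KEYS.issubset(row)]


# Write the dataset, then read it back the way a training script would.
path = Path("tiny_dataset.json")
path.write_text(json.dumps(records, indent=2), encoding="utf-8")

loaded = json.loads(path.read_text(encoding="utf-8"))
bad = validate_records(loaded)
print(f"{len(loaded)} records, {len(bad)} invalid")
```

A file checked this way could then be passed to `--data.json_path` in place of `my_custom_dataset.json`.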

Read the full finetuning docs.

3. Deploy an LLM

Deploy on Studios: https://lightning.ai/lightning-ai/studios/litgpt-serve

Deploy a pretrained or finetuned LLM to use it in real-world applications. Deploying automatically sets up a web server that a website or app can access.

# deploy an out-of-the-box LLM
litgpt serve microsoft/phi-2

# deploy your own trained model
litgpt serve path/to/microsoft/phi-2/checkpoint

Code to query the server:

Test the server in a separate terminal and integrate the model API into your AI product:

# 3) Use the server (in a separate Python session)
import requests

response = requests.post(
    "http://127.0.0.1:8000/predict",
    json={"prompt": "Fix typos in the following sentence: Exampel input"},
)
print(response.json()["output"])
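
For environments where `requests` is not installed, the same call can be sketched with the standard library only. This is a minimal sketch, assuming the server is the one started by `litgpt serve` on `127.0.0.1:8000` and that it answers with a JSON object containing an `output` field, as in the snippet above. The helper names `build_request` and `query` and the 60-second timeout are hypothetical choices.

```python
import json
import urllib.request


def build_request(prompt, url="http://127.0.0.1:8000/predict"):
    """Build a JSON POST request for the /predict endpoint (hypothetical helper)."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query(prompt, timeout=60.0):
    """Send the prompt and return the "output" field of the JSON response."""
    with urllib.request.urlopen(build_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["output"]


if __name__ == "__main__":
    try:
        print(query("Fix typos in the following sentence: Exampel input"))
    except OSError as err:  # URLError subclasses OSError; raised when no server is reachable
        print(f"Could not reach the server: {err}")
```

Separating request construction from sending keeps the payload easy to inspect without a running server.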

Read the full deployment docs.

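4. Evaluate an LLM

Evaluate an LLM to test its performance on various tasks and see how well it understands and generates text. Simply put, we can evaluate how well it would do on college-level chemistry, coding, and so on (MMLU, TruthfulQA, etc.).

litgpt evaluate microsoft/phi-2 --tasks 'truthfulqa_mc2,mmlu'

Read the full evaluation docs.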

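5. Test an LLM

Run on Studios: https://lightning.ai/lightning-ai/studios/litgpt-chat

Test how well the model works through an interactive chat. Use the chat command to chat, extract embeddings, etc.

Here is an example showing how to use the Phi-2 LLM:

litgpt chat microsoft/phi-2

>> Prompt: What do Llamas eat?

Full code:

# 1) List all supported LLMs
litgpt download list

# 2) Use a model (auto downloads weights)
litgpt chat microsoft/phi-2

>> Prompt: What do Llamas eat?

Downloading certain models requires an additional access token. You can read more about this in the download documentation.

Read the full chat docs.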

6. Pretrain an LLM

Run on Studios: https://lightning.ai/lightning-ai/studios/litgpt-pretrain

Pretraining is the process of teaching an AI model by exposing it to a large amount of data before it is finetuned for specific tasks.

Code:

mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt

# 1) Download a tokenizer
litgpt download EleutherAI/pythia-160m \
  --tokenizer_only True

# 2) Pretrain the model
litgpt pretrain EleutherAI/pythia-160m \
  --tokenizer_dir EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
  --train.max_tokens 10_000_000 \
  --out_dir out/custom-model

# 3) Test the model
litgpt chat out/custom-model/final

Read the full pretraining docs.

7. Continue pretraining an LLM

Run on Studios: https://lightning.ai/lightning-ai/studios/litgpt-continue-pretraining

Continued pretraining is another way of finetuning: it specializes an already pretrained model by training it on custom data.

Code:

mkdir -p custom_texts
curl https://www.gutenberg.org/cache/epub/24440/pg24440.txt --output custom_texts/book1.txt
curl https://www.gutenberg.org/cache/epub/26393/pg26393.txt --output custom_texts/book2.txt

# 1) Continue pretraining a model (auto downloads weights)
litgpt pretrain EleutherAI/pythia-160m \
  --tokenizer_dir EleutherAI/pythia-160m \
  --initial_checkpoint_dir EleutherAI/pythia-160m \
  --data TextFiles \
  --data.train_data_path "custom_texts/" \
  --train.max_tokens 10_000_000 \
  --out_dir out/custom-model

# 2) Test the model
litgpt chat out/custom-model/final

Read the full continued pretraining docs.

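IV. State-of-the-art features

✅ State-of-the-art optimizations: Flash Attention v2, multi-GPU support via fully-sharded data parallelism (FSDP), optional CPU offloading, and TPU and XLA support.

✅ Pretrain, finetune, and deploy.

✅ Reduce compute requirements with low-precision settings: FP16, BF16, and FP16/FP32 mixed.

✅ Lower memory requirements with quantization: 4-bit floats, 8-bit integers, and double quantization.

✅ Configuration files for great out-of-the-box performance.

✅ Parameter-efficient finetuning: LoRA, QLoRA, Adapter, and Adapter v2.

✅ Export to other popular model weight formats.

✅ Many popular datasets for pretraining and finetuning, plus support for custom datasets.

✅ Readable and easy-to-modify code for experimenting with the latest research ideas.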
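
To make the effect of the low-precision and quantization settings listed above concrete, here is a back-of-the-envelope estimate of weight memory at different precisions. It is a sketch under stated assumptions: it counts weights only (no activations, optimizer state, KV cache, or quantization metadata), and the 7B parameter count is an illustrative size, not a measurement of any specific model.

```python
# Bytes needed to store one parameter at each precision (weights only).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "fp16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gb(num_params, precision):
    """Approximate weight memory in GB, using 1 GB = 1e9 bytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Illustrative 7B-parameter model at several precisions.
for precision in ("fp32", "bf16", "int8", "4-bit"):
    print(f"7B parameters @ {precision}: {weight_memory_gb(7e9, precision):.1f} GB")
```

Under these assumptions the weights shrink from 28 GB at FP32 to 14 GB at BF16 and 3.5 GB at 4 bits, which is why quantization matters on low-memory GPUs.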

V. Training recipes

LitGPT comes with validated recipes (YAML configs) to train models under different conditions. We generated these recipes from the parameters we found to perform best under each training condition.

Browse all training recipes in the config_hub directory of the repository.

Example

litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml

✅ Use configs to customize training

Configs let you customize training down to individual parameters, for example:

# The path to the base model's checkpoint directory to load for finetuning. (type: <class 'Path'>, default: checkpoints/stabilityai/stablelm-base-alpha-3b)
checkpoint_dir: checkpoints/meta-llama/Llama-2-7b-hf

# Directory in which to save checkpoints and logs. (type: <class 'Path'>, default: out/lora)
out_dir: out/finetune/qlora-llama2-7b

# The precision to use for finetuning. Possible choices: "bf16-true", "bf16-mixed", "32-true". (type: Optional[str], default: null)
precision: bf16-true

...

✅ Example: LoRA finetuning config

# The path to the base model's checkpoint directory to load for finetuning. (type: <class 'Path'>, default: checkpoints/stabilityai/stablelm-base-alpha-3b)
checkpoint_dir: checkpoints/meta-llama/Llama-2-7b-hf

# Directory in which to save checkpoints and logs. (type: <class 'Path'>, default: out/lora)
out_dir: out/finetune/qlora-llama2-7b

# The precision to use for finetuning. Possible choices: "bf16-true", "bf16-mixed", "32-true". (type: Optional[str], default: null)
precision: bf16-true

# If set, quantize the model with this algorithm. See ``tutorials/quantize.md`` for more information. (type: Optional[Literal['nf4', 'nf4-dq', 'fp4', 'fp4-dq', 'int8-training']], default: null)
quantize: bnb.nf4

# How many devices/GPUs to use. (type: Union[int, str], default: 1)
devices: 1

# How many nodes to use. (type: int, default: 1)
num_nodes: 1

# The LoRA rank. (type: int, default: 8)
lora_r: 32

# The LoRA alpha. (type: int, default: 16)
lora_alpha: 16

# The LoRA dropout value. (type: float, default: 0.05)
lora_dropout: 0.05

# Whether to apply LoRA to the query weights in attention. (type: bool, default: True)
lora_query: true

# Whether to apply LoRA to the key weights in attention. (type: bool, default: False)
lora_key: false

# Whether to apply LoRA to the value weights in attention. (type: bool, default: True)
lora_value: true

# Whether to apply LoRA to the output projection in the attention block. (type: bool, default: False)
lora_projection: false

# Whether to apply LoRA to the weights of the MLP in the attention block. (type: bool, default: False)
lora_mlp: false

# Whether to apply LoRA to output head in GPT. (type: bool, default: False)
lora_head: false

# Data-related arguments. If not provided, the default is ``litgpt.data.Alpaca``.
data:
  class_path: litgpt.data.Alpaca2k
  init_args:
    mask_prompt: false
    val_split_fraction: 0.05
    prompt_style: alpaca
    ignore_index: -100
    seed: 42
    num_workers: 4
    download_dir: data/alpaca2k

# Training-related arguments. See ``litgpt.args.TrainArgs`` for details
train:

  # Number of optimizer steps between saving checkpoints (type: Optional[int], default: 1000)
  save_interval: 200

  # Number of iterations between logging calls (type: int, default: 1)
  log_interval: 1

  # Number of samples between optimizer steps across data-parallel ranks (type: int, default: 128)
  global_batch_size: 8

  # Number of samples per data-parallel rank (type: int, default: 4)
  micro_batch_size: 2

  # Number of iterations with learning rate warmup active (type: int, default: 100)
  lr_warmup_steps: 10

  # Number of epochs to train on (type: Optional[int], default: 5)
  epochs: 4

  # Total number of tokens to train on (type: Optional[int], default: null)
  max_tokens:

  # Limits the number of optimizer steps to run (type: Optional[int], default: null)
  max_steps:

  # Limits the length of samples (type: Optional[int], default: null)
  max_seq_length: 512

  # Whether to tie the embedding weights with the language modeling head weights (type: Optional[bool], default: null)
  tie_embeddings:

  #   (type: float, default: 0.0003)
  learning_rate: 0.0002

  #   (type: float, default: 0.02)
  weight_decay: 0.0

  #   (type: float, default: 0.9)
  beta1: 0.9

  #   (type: float, default: 0.95)
  beta2: 0.95

  #   (type: Optional[float], default: null)
  max_norm:

  #   (type: float, default: 6e-05)
  min_lr: 6.0e-05

# Evaluation-related arguments. See ``litgpt.args.EvalArgs`` for details
eval:

  # Number of optimizer steps between evaluation calls (type: int, default: 100)
  interval: 100

  # Number of tokens to generate (type: Optional[int], default: 100)
  max_new_tokens: 100

  # Number of iterations (type: int, default: 100)
  max_iters: 100

# The name of the logger to send metrics to. (type: Literal['wandb', 'tensorboard', 'csv'], default: csv)
logger_name: csv

# The random seed to use for reproducibility. (type: int, default: 1337)
seed: 1337
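
The batch-size settings in the config interact: `global_batch_size` counts samples per optimizer step across all ranks, while `micro_batch_size` counts samples per forward/backward pass on one rank. The sketch below works out the implied number of gradient-accumulation steps, assuming the common convention that the global batch is split evenly across `devices * num_nodes` ranks. The function `accumulation_steps` is a hypothetical helper for illustration, not LitGPT code.

```python
def accumulation_steps(global_batch_size, micro_batch_size, devices=1, num_nodes=1):
    """Micro-batches each rank processes per optimizer step (assumed convention)."""
    world_size = devices * num_nodes
    per_rank = global_batch_size // world_size
    if per_rank * world_size != global_batch_size or per_rank % micro_batch_size:
        raise ValueError("global_batch_size must divide evenly across ranks and micro-batches")
    return per_rank // micro_batch_size


# Values from the example config: global_batch_size 8, micro_batch_size 2, 1 device, 1 node.
steps = accumulation_steps(8, 2, devices=1, num_nodes=1)
print(steps)  # 4

# Upper bound on tokens per optimizer step if every sample reaches max_seq_length (512).
print(8 * 512)  # 4096
```

With the values shown, each optimizer step accumulates gradients over four micro-batches of two samples.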

✅ Override any parameter in the CLI:

litgpt finetune \
  --config https://raw.githubusercontent.com/Lightning-AI/litgpt/main/config_hub/finetune/llama-2-7b/lora.yaml \
  --lora_r 4
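
To see what an override such as `--lora_r 4` changes, the arithmetic below compares rank 32 (from the config file) with rank 4 (from the command line) while `lora_alpha` stays at 16. It is a sketch that assumes the standard LoRA formulation, where the low-rank update is scaled by `lora_alpha / lora_r` and a `d_out x d_in` weight gains `r * (d_in + d_out)` trainable parameters. The width 4096 is an illustrative value, not read from the config.

```python
def lora_scaling(lora_alpha, lora_r):
    """Scaling applied to the low-rank update in the standard LoRA formulation."""
    return lora_alpha / lora_r


def lora_params(d_in, d_out, lora_r):
    """Trainable parameters added to one d_out x d_in weight: A is r x d_in, B is d_out x r."""
    return lora_r * (d_in + d_out)


HIDDEN = 4096  # illustrative layer width (assumption)

for r in (32, 4):  # 32 from the config file, 4 from the --lora_r override
    print(r, lora_scaling(16, r), lora_params(HIDDEN, HIDDEN, r))
```

Under these assumptions, dropping the rank from 32 to 4 cuts the adapter parameters per matrix by a factor of eight and raises the scaling factor from 0.5 to 4.0, so a rank override may call for revisiting `lora_alpha` as well.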

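VI. Project highlights

LitGPT powers many great AI projects, initiatives, challenges, and of course enterprises. Please submit a pull request to be considered for a feature.

📊 SAMBA: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
The Samba project by researchers at Microsoft is built on top of the LitGPT code base. It combines state space models with sliding window attention and outperforms pure state space models.
🏆 NeurIPS 2023 Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day
The LitGPT repository was the official starter kit for the NeurIPS 2023 LLM Efficiency Challenge, a competition focused on finetuning an existing non-instruction-tuned LLM for 24 hours on a single GPU.
🦙 TinyLlama: An Open-Source Small Language Model
LitGPT powered the TinyLlama project and the research paper "TinyLlama: An Open-Source Small Language Model".
🍪 MicroLlama: MicroLlama-300M
MicroLlama is a 300M-parameter Llama model pretrained on 50B tokens, powered by TinyLlama and LitGPT.
🔬 Pre-training Small Base LMs with Fewer Tokens

The research paper "Pre-training Small Base LMs with Fewer Tokens" uses LitGPT to develop smaller base language models by inheriting a few transformer blocks from larger models and training on a small fraction of the data used by the larger models. It shows that these smaller models can perform comparably to larger models despite using significantly less training data and fewer resources.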

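Tutorials

🚀 Getting started
⚡ Finetuning, including LoRA, QLoRA, and Adapters
🤖 Pretraining
💬 Model evaluation
📘 Supported and custom datasets
🧹 Quantization
🤯 Tips for dealing with out-of-memory (OOM) errors
🧑🏽‍💻 Using cloud TPUs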

2025-01-27 (Mon)
