
Few-shot Learning Prompts for LLMs with LangChain: Fixed Examples and Dynamic Example Selection by Vector Similarity

news 2026/2/9 12:54:03

Contents

- few-shot
- Fixed Examples
- Dynamic few-shot prompting
- Appendix
- References

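few-shot

Compared with fine-tuning a large model, in some cases we would rather use few-shot learning: feeding the model relevant examples so that it performs better on the task at hand.

Fixed-example prompting vs. dynamic-example prompting:

Fixed-example prompting: the same examples are used in the prompt for every inference.
Dynamic-example prompting: for each sample to be inferred, a vector-similarity search picks the most similar samples from the training set, and those are used as the prompt examples.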
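
The selection step in dynamic-example prompting can be sketched in plain Python. This is only an illustration of ranking by cosine similarity: the 3-dimensional vectors are made up, and the names `cosine_similarity` and `select_top_k` are hypothetical helpers, not part of LangChain. In practice the vectors come from an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_top_k(query_vec, example_vecs, k=1):
    """Return the indices of the k example vectors most similar to the query."""
    ranked = sorted(
        range(len(example_vecs)),
        key=lambda i: cosine_similarity(query_vec, example_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings, invented for illustration only.
example_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query_vec = [0.9, 0.1, 0.0]

top = select_top_k(query_vec, example_vecs, k=2)
print(top)  # indices of the two nearest examples
```

A fixed-example prompt skips this step entirely and always uses the same examples.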

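Few-shot Learning:

Definition: few-shot learning teaches the model a task by giving it a small number of examples (for example, 1 to 5). Each example usually consists of an input and the corresponding output.
How it works: in most cases, the examples are placed directly in the model's input as part of the prompt. The model itself receives no additional training or tuning.
Advantage: the model adapts to a new task quickly, with no extra training time or resources.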
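
To make "the examples are placed directly in the model's input" concrete, here is a minimal sketch that builds such a prompt by hand. The antonym task, the `Input:`/`Output:` labels, and the function name `build_few_shot_prompt` are assumptions chosen for illustration.

```python
def build_few_shot_prompt(examples, query):
    """Concatenate input/output demonstrations, then append the new input."""
    lines = []
    for ex in examples:
        lines.append(f"Input: {ex['input']}")
        lines.append(f"Output: {ex['output']}")
    # The final "Output:" is left empty for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical demonstrations for a toy antonym task.
examples = [
    {"input": "hot", "output": "cold"},
    {"input": "tall", "output": "short"},
]

prompt_text = build_few_shot_prompt(examples, "fast")
print(prompt_text)
```

No model weights change here; the task is conveyed entirely through the text of the prompt.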

Open-source notebook for this article:
https://github.com/JieShenAI/csdn/blob/main/24/07/few_shot_prompt/langchain_fewshot.ipynb

Fixed Examples

We use a chat model as the example.

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, FewShotChatMessagePromptTemplate
from langchain_openai import ChatOpenAI

parser = StrOutputParser()  # defined here; not used in the chains below
model = ChatOpenAI(model="gpt-4o-mini")

# Each example pairs an input with the expected output.
examples = [
    {"input": "2 🦜 2", "output": "4"},
    {"input": "2 🦜 3", "output": "5"},
]

Here 🦜 stands for addition. The goal is for the model to learn from the examples that 🦜 means addition.

# This is a prompt template used to format each individual example.
example_prompt = ChatPromptTemplate.from_messages(
    [
        ("human", "{input}"),
        ("ai", "{output}"),
    ]
)
few_shot_prompt = FewShotChatMessagePromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
)

few_shot_prompt.invoke({}).messages

Output:

[HumanMessage(content='2 🦜 2'),
 AIMessage(content='4'),
 HumanMessage(content='2 🦜 3'),
 AIMessage(content='5')]

few_shot_prompt.format()

Output:

'Human: 2 🦜 2\nAI: 4\nHuman: 2 🦜 3\nAI: 5'

final_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a wondrous wizard of math."),
        few_shot_prompt,
        ("human", "{input}"),
    ]
)
# The prompt comes first and the model second; `model | final_prompt` is the wrong order.
chain = final_prompt | model
chain.invoke({"input": "What's 3 🦜 3?"})

Output:

AIMessage(content='Based on the previous pattern, the 🦜 operation appears to be addition. Therefore:\n\n\\[ 3 🦜 3 = 3 + 3 = 6 \\]', response_metadata={'token_usage': {'completion_tokens': 37, 'prompt_tokens': 30, 'total_tokens': 67}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': '', 'finish_reason': 'stop', 'logprobs': None}, id='run-xxx', usage_metadata={'input_tokens': 30, 'output_tokens': 37, 'total_tokens': 67})

As the output above shows, the model has learned that 🦜 means addition and returns 3 🦜 3 = 3 + 3 = 6.

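Dynamic few-shot prompting

Why do we need dynamic few-shot prompting?

In the previous section, Fixed Examples, the same fixed examples are used as the prompt no matter what question is asked.

Dynamic-example prompting uses different examples for different questions. The goal is to improve the model's performance.

To evaluate dynamic few-shot prompting, iterate over the test set one sample at a time. For each test sample, use a vector-similarity search to retrieve the most similar samples from the training set, then run few-shot prompting with them.

We plan to evaluate dynamic few-shot prompting in a follow-up article. This one is a tutorial, so we keep it simple.

The previous section used:
ChatPromptTemplate and FewShotChatMessagePromptTemplate

This section uses:
PromptTemplate and FewShotPromptTemplate

The classes come in matching pairs: the chat classes work together and the string classes work together. Do not mix them.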
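
The evaluation loop described above can be sketched as follows. Everything here is a stand-in chosen so the sketch runs without any API: word-overlap (Jaccard) similarity replaces embedding similarity, `fake_model` replaces a real LLM call, and the tiny train and test sets are invented. Only the loop structure is the point.

```python
def tokens(text):
    """Lowercased word set; a crude stand-in for an embedding."""
    return set(text.lower().replace("?", "").split())


def jaccard(a, b):
    """Word-overlap similarity, used here instead of vector similarity."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)


def select_examples(question, train_set, k=1):
    """Pick the k training samples most similar to the question."""
    ranked = sorted(
        train_set,
        key=lambda ex: jaccard(question, ex["question"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(selected, question):
    """Format the selected samples as demonstrations, then append the question."""
    blocks = [f"Question: {ex['question']}\nAnswer: {ex['answer']}" for ex in selected]
    blocks.append(f"Question: {question}\nAnswer:")
    return "\n\n".join(blocks)


def fake_model(prompt):
    """Hypothetical model stub: repeats the answer of the last demonstration."""
    answers = [
        line[len("Answer: "):]
        for line in prompt.splitlines()
        if line.startswith("Answer: ")
    ]
    return answers[-1] if answers else ""


def evaluate(test_set, train_set, ask_model, k=1):
    """Accuracy of few-shot prompting with per-sample example selection."""
    correct = 0
    for sample in test_set:
        selected = select_examples(sample["question"], train_set, k)
        prediction = ask_model(build_prompt(selected, sample["question"]))
        correct += prediction.strip() == sample["answer"]
    return correct / len(test_set)


# Invented toy data for illustration.
train_set = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What color is the sky?", "answer": "blue"},
]
test_set = [
    {"question": "What is the capital city of France?", "answer": "Paris"},
    {"question": "What color is the clear sky?", "answer": "blue"},
]

accuracy = evaluate(test_set, train_set, fake_model, k=1)
print(accuracy)
```

In a real evaluation, `select_examples` would query a vector store and `ask_model` would call the LLM; the same loop could then be run with fixed examples to compare the two approaches.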

from langchain_core.prompts import PromptTemplate

example_prompt = PromptTemplate.from_template("Question: {question}\n{answer}")

The following code shows what example_prompt produces (qa_examples is defined further below):

print(example_prompt.invoke(qa_examples[0]).text)

Output:

Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali

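The qa_examples list below serves as the training set. At inference time, the samples whose vectors are most similar to the question are selected from it.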

qa_examples = [
    {
        "question": "Who lived longer, Muhammad Ali or Alan Turing?",
        "answer": """Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali""",
    },
    {
        "question": "When was the founder of craigslist born?",
        "answer": """Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952""",
    },
    {
        "question": "Who was the maternal grandfather of George Washington?",
        "answer": """Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball""",
    },
    {
        "question": "Are both the directors of Jaws and Casino Royale from the same country?",
        "answer": """Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No""",
    },
]

example_prompt is passed to FewShotPromptTemplate as an argument, and the template uses it to format the entries in qa_examples.

from langchain_core.prompts import FewShotPromptTemplate

prompt = FewShotPromptTemplate(
    examples=qa_examples,
    example_prompt=example_prompt,
    # prefix="You are a helpful assistant.",
    suffix="Question: {input}",
    input_variables=["input"],
)
print(
    prompt.invoke({"input": "Who was the father of Mary Ball Washington?"}).to_string()
)

This prompt does not use a vector-based selector. When invoke is called, FewShotPromptTemplate formats every sample in qa_examples and includes all of them as context.

Output:

Question: Who lived longer, Muhammad Ali or Alan Turing?
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
......
Question: Who was the father of Mary Ball Washington?

Next, build a vector-based example selector with an embedding model. Each entry in qa_examples is embedded and stored in the Chroma vector database.

from langchain_chroma import Chroma
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_openai import OpenAIEmbeddings

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # This is the list of examples available to select from.
    qa_examples,
    # This is the embedding class used to produce embeddings which are used to measure semantic similarity.
    OpenAIEmbeddings(),
    # This is the VectorStore class that is used to store the embeddings and do a similarity search over.
    Chroma,
    # This is the number of examples to produce.
    k=1,
)

Use example_selector to find the single most similar sample for the user's question:

# Select the most similar example to the input.
question = "Who was the father of Mary Ball Washington?"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
    print("\n")
    print('【')
    for k, v in example.items():
        print(f"{k}: {v}")
    print('】')

Output:

Examples most similar to the input: Who was the father of Mary Ball Washington?


【
answer: Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
question: Who was the maternal grandfather of George Washington?
】

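Build the final prompt from the vector-based selector example_selector and the example formatter example_prompt.

FewShotPromptTemplate also accepts a prefix and a suffix. The prefix is typically used for a system-style instruction, and the suffix for the question.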
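
As a rough mental model of how prefix, examples, and suffix fit together, the sketch below joins the non-empty pieces with a separator and then fills in the variables. `render_few_shot` is a hypothetical helper written for this illustration, not LangChain's code; the blank-line separator is an assumption meant to mirror the template's default `example_separator`.

```python
def render_few_shot(prefix, example_strings, suffix, separator="\n\n", **variables):
    """Join prefix, formatted examples, and suffix, then fill in the variables."""
    pieces = [prefix, *example_strings, suffix]
    # Empty pieces (for example, an omitted prefix) are skipped.
    template = separator.join(piece for piece in pieces if piece)
    return template.format(**variables)


rendered = render_few_shot(
    prefix="You are a helpful assistant.",
    example_strings=["Question: 1+1?\n2"],  # already-formatted toy example
    suffix="Question: {input}",
    input="2+2?",
)
print(rendered)
```

With an empty prefix, the rendered text starts directly with the first example, which matches the prompts in this article where the prefix line is commented out.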


prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    # prefix="You are a helpful assistant.",
    suffix="Question: {input}",
    input_variables=["input"],
)
print(
    prompt.invoke({"input": "Who was the father of Mary Ball Washington?"}).to_string()
)

Output:

Question: Who was the maternal grandfather of George Washington?
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball

Question: Who was the father of Mary Ball Washington?

chain = prompt | model
chain.invoke({"input": "Who was the father of Mary Ball Washington?"})

Output:

AIMessage(content='The father of Mary Ball Washington was Joseph Ball.', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 103, 'total_tokens': 113}, 'model_name': 'gpt-4o-mini-2024-07-18', 'system_fingerprint': 'fp_0f03d4f0ee', 'finish_reason': 'stop', 'logprobs': None}, id='run-ae96f9c7-ac89-47ba-8074-69197b89bef5-0', usage_metadata={'input_tokens': 103, 'output_tokens': 10, 'total_tokens': 113})

Appendix

Connecting to Hugging Face through a proxy:

import os
os.environ['HTTP_PROXY'] = 'http://127.0.0.1:7890'
os.environ['HTTPS_PROXY'] = 'http://127.0.0.1:7890'

References

The following two official LangChain guides are both well written:

https://python.langchain.com/v0.2/docs/how_to/few_shot_examples_chat/ How to use few shot examples in chat models
https://python.langchain.com/v0.2/docs/how_to/few_shot_examples/#pass-the-examples-and-formatter-to-fewshotprompttemplate How to use few shot examples
