当前位置: 首页 > news >正文

Langchain 集成 FAISS

Langchain 集成 FAISS

  • 1. FAISS
  • 2. Similarity Search with score
  • 3. Saving and loading
  • 4. Merging
  • 5. Similarity Search with filtering

1. FAISS

Facebook AI Similarity Search (Faiss)是一个用于高效相似性搜索和密集向量聚类的库。它包含的算法可以搜索任意大小的向量集,甚至可能无法容纳在 RAM 中的向量集。它还包含用于评估和参数调整的支持代码。

Faiss 文档地址在这里.

本笔记本展示了如何使用与 FAISS 矢量数据库相关的功能。

示例代码,

# !pip install faiss
# OR
# !pip install faiss-cpu
import os
import getpassos.environ["COHERE_API_KEY"] = getpass.getpass("Cohere API Key:")# 如果需要在没有 AVX2 优化的情况下初始化 FAISS,请取消注释以下一行
# os.environ['FAISS_NO_AVX2'] = '1'
# from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.embeddings.cohere import CohereEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import FAISS
from langchain.document_loaders import TextLoader

输出结果,

from langchain.document_loaders import TextLoaderloader = TextLoader("./state_of_the_union_en.txt", encoding="utf-8")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)# embeddings = OpenAIEmbeddings
embeddings = CohereEmbeddings()

示例代码,

db = FAISS.from_documents(docs, embeddings)query = "What did the president say about Ketanji Brown Jackson"
docs = db.similarity_search(query)
print(docs[0].page_content)

输出结果,

Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. Tonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. One of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. And I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.

2. Similarity Search with score

有一些 FAISS 特定方法。其中之一是 similarity_search_with_score ,它不仅允许您返回文档,还允许返回查询到它们的距离分数。返回的距离分数是L2距离。因此,分数越低越好。

示例代码,

docs_and_scores = db.similarity_search_with_score(query)
docs_and_scores[0]

输出结果,

(Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'}),7172.888)

refer: https://python.langchain.com/docs/integrations/vectorstores/faiss 文档的分数是 0.36913747

    (Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': '../../../state_of_the_union.txt'}),0.36913747)

还可以使用 similarity_search_by_vector 搜索与给定嵌入向量类似的文档,它接受嵌入向量作为参数而不是字符串。

示例代码,

embedding_vector = embeddings.embed_query(query)
docs_and_scores = db.similarity_search_by_vector(embedding_vector)
docs_and_scores

输出结果如下,

[Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='We can’t change how divided we’ve been. But we can change how we move forward—on COVID-19 and other issues we must face together. \n\nI recently visited the New York City Police Department days after the funerals of Officer Wilbert Mora and his partner, Officer Jason Rivera. \n\nThey were responding to a 9-1-1 call when a man shot and killed them with a stolen gun. \n\nOfficer Mora was 27 years old. \n\nOfficer Rivera was 22. \n\nBoth Dominican Americans who’d grown up on the same streets they later chose to patrol as police officers. \n\nI spoke with their families and told them that we are forever in debt for their sacrifice, and we will carry on their mission to restore the trust and safety every community deserves. \n\nI’ve worked on these issues a long time. \n\nI know what works: Investing in crime preventionand community police officers who’ll walk the beat, who’ll know the neighborhood, and who can restore trust and safety.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='And for our LGBTQ+ Americans, let’s finally get the bipartisan Equality Act to my desk. The onslaught of state laws targeting transgender Americans and their families is wrong. \n\nAs I said last year, especially to our younger transgender Americans, I will always have your back as your President, so you can be yourself and reach your God-given potential. \n\nWhile it often appears that we never agree, that isn’t true. I signed 80 bipartisan bills into law last year. From preventing government shutdowns to protecting Asian-Americans from still-too-common hate crimes to reforming military justice. \n\nAnd soon, we’ll strengthen the Violence Against Women Act that I first wrote three decades ago. It is important for us to show the nation that we can come together and do big things. \n\nSo tonight I’m offering a Unity Agenda for the Nation. Four big things we can do together.  \n\nFirst, beat the opioid epidemic.', metadata={'source': './state_of_the_union_en.txt'}),Document(page_content='Tonight, I’m announcing a crackdown on these companies overcharging American businesses and consumers. \n\nAnd as Wall Street firms take over more nursing homes, quality in those homes has gone down and costs have gone up.  \n\nThat ends on my watch. \n\nMedicare is going to set higher standards for nursing homes and make sure your loved ones get the care they deserve and expect. \n\nWe’ll also cut costs and keep the economy going strong by giving workers a fair shot, provide more training and apprenticeships, hire them based on their skills not degrees. \n\nLet’s pass the Paycheck Fairness Act and paid leave.  \n\nRaise the minimum wage to $15 an hour and extend the Child Tax Credit, so no one has to raise a family in poverty. \n\nLet’s increase Pell Grants and increase our historic support of HBCUs, and invest in what Jill—our First Lady who teaches full-time—calls America’s best-kept secret: community colleges.', metadata={'source': './state_of_the_union_en.txt'})]

3. Saving and loading

您还可以保存和加载 FAISS 索引。这很有用,因此您不必每次使用它时都重新创建它。

示例代码,

db.save_local("faiss_index")
new_db = FAISS.load_local("faiss_index", embeddings)
docs = new_db.similarity_search(query)
docs[0]

输出结果,

Document(page_content='Tonight. I call on the Senate to: Pass the Freedom to Vote Act. Pass the John Lewis Voting Rights Act. And while you’re at it, pass the Disclose Act so Americans can know who is funding our elections. \n\nTonight, I’d like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \n\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court. \n\nAnd I did that 4 days ago, when I nominated Circuit Court of Appeals Judge Ketanji Brown Jackson. One of our nation’s top legal minds, who will continue Justice Breyer’s legacy of excellence.', metadata={'source': './state_of_the_union_en.txt'})

4. Merging

您还可以合并两个 FAISS 矢量存储。

示例代码,

db1 = FAISS.from_texts(["foo"], embeddings)
db2 = FAISS.from_texts(["bar"], embeddings)
db1.docstore._dict

输出结果,

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={})}

示例代码,

db1.docstore._dict

输出结果,

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={})}

示例代码,

db2.docstore._dict

输出结果,

{'8dcb4556-8eb5-43be-9eaa-0bff9a6e7997': Document(page_content='bar', metadata={})}

示例代码,

db1.docstore._dict

输出结果,

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={})}

示例代码,

db1.merge_from(db2)

输出结果,

db1.docstore._dict

输出结果,

{'43f79c6d-6bb3-4a62-979d-58e011dcb086': Document(page_content='foo', metadata={}),'8dcb4556-8eb5-43be-9eaa-0bff9a6e7997': Document(page_content='bar', metadata={})}

5. Similarity Search with filtering

FAISS vectorstore 还可以支持过滤,因为 FAISS 本身不支持过滤,我们必须手动执行。这是通过首先获取比 k 更多的结果然后过滤它们来完成的。您可以根据元数据过滤文档。您还可以在调用任何搜索方法时设置 fetch_k 参数,以设置在过滤之前要获取的文档数量。这是一个小例子:

示例代码,

from langchain.schema import Documentlist_of_documents = [Document(page_content="foo", metadata=dict(page=1)),Document(page_content="bar", metadata=dict(page=1)),Document(page_content="foo", metadata=dict(page=2)),Document(page_content="barbar", metadata=dict(page=2)),Document(page_content="foo", metadata=dict(page=3)),Document(page_content="bar burr", metadata=dict(page=3)),Document(page_content="foo", metadata=dict(page=4)),Document(page_content="bar bruh", metadata=dict(page=4)),
]
db = FAISS.from_documents(list_of_documents, embeddings)
results_with_scores = db.similarity_search_with_score("foo")
for doc, score in results_with_scores:print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

输出结果,

Content: foo, Metadata: {'page': 1}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 2}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 3}, Score: 0.018019594252109528
Content: foo, Metadata: {'page': 4}, Score: 0.018019594252109528

现在我们进行相同的查询调用,但我们仅过滤 page = 1

results_with_scores = db.similarity_search_with_score("foo", filter=dict(page=1))
for doc, score in results_with_scores:print(f"Content: {doc.page_content}, Metadata: {doc.metadata}, Score: {score}")

输出结果,

Content: foo, Metadata: {'page': 1}, Score: 0.018019594252109528
Content: bar, Metadata: {'page': 1}, Score: 10266.8544921875

同样的事情也可以用 max_marginal_relevance_search 来完成。

示例代码,

results = db.max_marginal_relevance_search("foo", filter=dict(page=1))
for doc in results:print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

输出结果,

Content: foo, Metadata: {'page': 1}
Content: bar, Metadata: {'page': 1}

以下是调用 similarity_search 时如何设置 fetch_k 参数的示例。通常您需要 fetch_k 参数 >> k 参数。这是因为 fetch_k 参数是过滤之前将获取的文档数。如果将 fetch_k 设置为较小的数字,您可能无法获得足够的文档进行过滤。

示例代码,

results = db.similarity_search("foo", filter=dict(page=1), k=1, fetch_k=4)
for doc in results:print(f"Content: {doc.page_content}, Metadata: {doc.metadata}")

输出结果,

Content: foo, Metadata: {'page': 1}

完结!

相关文章:

Langchain 集成 FAISS

Langchain 集成 FAISS 1. FAISS2. Similarity Search with score3. Saving and loading4. Merging5. Similarity Search with filtering 1. FAISS Facebook AI Similarity Search (Faiss)是一个用于高效相似性搜索和密集向量聚类的库。它包含的算法可以搜索任意大小的向量集&a…...

科技与人元宇宙论坛跨界对话

近来,“元宇宙”成为热门话题,越来越频繁地出现在人们的视野里。大家都在谈论它,但似 乎还没有一个被所有人认同的定义。元宇宙究竟是什么?未来它会对我们的工作和生活带来什么样 的改变?当谈论虚拟现实(VR…...

JAVA-生成二维码图片

使用hutool工具包,主动一个简单方便,pom添加依赖 <dependency><groupId>cn.hutool</groupId><artifactId>hutool-all</artifactId><version>5.8.12</version> </dependency> 直接上代码 //设置像素宽高 QrConfig config new…...

【iOS】iOS持久化

文章目录 一. 数据持久化的目的二. iOS中数据持久化方案三. 数据持有化方式的分类1. 内存缓存2. 磁盘缓存SDWebImage缓存 四. 沙盒机制的介绍五. 沙盒目录结构1. 获取应用程序的沙盒路径2. 访问沙盒目录常用C函数介绍3. 沙盒目录介绍 六. 持久化数据存储方式1. XML属性列表2. P…...

基于Javaweb+Vue3实现淘宝卖鞋前后端分离项目

前端技术栈&#xff1a;HTMLCSSJavaScriptVue3 后端技术栈&#xff1a;JavaSEMySQLJDBCJavaWeb 文章目录 前言1️⃣登录功能登录后端登录前端 2️⃣商家管理查询商家查询商家后端查询商家前端 增加商家增加商家后端增加商家前端 删除商家删除商家后端删除商家前端 修改商家修改…...

bat一键批量、有序启动jar

将脚本文件后缀改为 bat&#xff0c;脚本文件和 jar 包放在同一个目录 echo offstart cmd /c "java -jar register.jar " ping 192.0.2.2 -n 1 -w 10000 > nulstart cmd /c "java -jar admin.jar " ping 192.0.2.2 -n 1 -w 30000 > nulstart cmd /c…...

centos7安装mysql数据库详细教程及常见问题解决

mysql数据库详细安装步骤 1.在root身份下输入执行命令&#xff1a; yum -y update 2.检查是否已经安装MySQL&#xff0c;输入以下命令并执行&#xff1a; mysql -v 如出现-bash: mysql: command not found 则说明没有安装mysql 也可以输入rpm -qa | grep -i mysql 查看是否已…...

C++ STL sort函数的底层实现

C STL sort函数的底层实现 sort函数的底层用到的是内省式排序以及插入排序&#xff0c;内省排序首先从快速排序开始&#xff0c;当递归深度超过一定深度&#xff08;深度为排序元素数量的对数值&#xff09;后转为堆排序。 先来回顾一下以上提到的3中排序方法&#xff1a; 快…...

ICP算法和优化问题详细公式推导

1. 介绍 ICP(Iterative Closest Point)&#xff1a;求一组平移和旋转使得两个点云之间重合度尽可能高。 2. 算法流程 找最近邻关联点&#xff0c;求解 R , t R , t R , t R , t R,tR,tR,tR,t R,tR,tR,tR,t&#xff0c;如此反复直到重合程度足够高。 3. 数学描述 X { x 1 ,…...

【安全狗】linux免费服务器防护软件安全狗详细安装教程

在费用有限的基础上&#xff0c;复杂密码云服务器基础防护常见端口替换安全软件&#xff0c;可以防护绝大多数攻击 第一步&#xff1a;下载服务器安全狗Linux版&#xff08;下文以64位版本为例&#xff09; 官方提供了两个下载方式&#xff0c;本文采用的是 方式2 wget安装 方…...

【iOS】自定义字体

文章目录 前言一、下载字体二、添加字体三、检查字体四、使用字体 前言 在设计App的过程中我们常常会想办法去让我们的界面变得美观&#xff0c;使用好看的字体是我们美化界面的一个方法。接下来笔者将会讲解App中添加自定义字体 一、下载字体 我们要使用自定义字体&#x…...

WPF实战学习笔记06-设置待办事项界面

设置待办事项界面 创建待办待办事项集合并初始化 TodoViewModel&#xff1a; using Mytodo.Common.Models; using Prism.Commands; using Prism.Mvvm; using System; using System.Collections.Generic; using System.Collections.ObjectModel; using System.Linq; using Sy…...

推荐几个不错的免费配色工具网站

1. Paletton专业的配色套件,提供色轮理论及调色功能。可查看配色预览效果。 网站:http://paletton.com 2. Colormind一个基于机器学习的智能配色工具。可以一键生成配色方案。 网站:http://colormind.io 3. Adobe ColorAdobe官方的配色工具,可以从图片中取色,也可以随机生成配色…...

gitee page发布的静态网站,无法播放目录中的mp4视频

起因是希望在gitee上部署静态网站&#xff0c;利用three.js VideoTexture 环境贴图播放视频。 但是试了多几次 mp4均提示404&#xff0c;资源无法获取&#xff1b; 找了很多方案&#xff0c;最后发现将视频转为ogv 就可以完美适配了&#xff1b; mp4转ogv 附threejs使用ogv进…...

opencv-26 图像几何变换04- 重映射-函数 cv2.remap()

什么是重映射&#xff1f; 重映射&#xff08;Remapping&#xff09;是图像处理中的一种操作&#xff0c;用于将图像中的像素从一个位置映射到另一个位置。重映射可以实现图像的平移、旋转、缩放和透视变换等效果。它是一种基于像素级的图像变换技术&#xff0c;可以通过定义映…...

SkyWalking链路追踪中span全解

基本概念 在SkyWalking链路追踪中&#xff0c;Span&#xff08;跨度&#xff09;是Trace&#xff08;追踪&#xff09;的组成部分之一。Span代表一次调用或操作的单个组件&#xff0c;可以是一个方法调用、一个HTTP请求或者其他类型的操作。 每个Span都包含了一些关键的信息&am…...

【前端知识】React 基础巩固(三十一)——Redux的简介

React 基础巩固(三十一)——Redux 一、Redux是个纯函数 概念 纯函数&#xff08;确定的输入一定产生确定的输出&#xff0c;函数在执行过程中不产生副作用&#xff09;&#xff1a; 在程序设计中&#xff0c;若一个函数符合以下条件&#xff0c;那么这个函数就被称为纯函数…...

拦截Bean使用之前各个时机的Spring组件

拦截Bean使用之前各个时机的Spring组件 之前使用过的BeanPostProcessor就是在Bean实例化之后&#xff0c;注入属性值之前的时机。 Spring Bean的生命周期本次演示的是在Bean实例化之前的时机&#xff0c;使用BeanFactoryPostProcessor进行验证&#xff0c;以及在加载Bean之前进…...

RT thread 之 Nand flash 读写过程分析

文章目录 前言&#xff1a;什么是Nand Flash&#xff1f;1、Nand Flash 读取步骤2、从主存读到Cache2.1 在标准spi接口下读取过程2.2 测试时序&#xff08;SPI频率30MHz&#xff09; 3.从Cache读取数据3.1在标准spi接口读取过程测试时序 前言&#xff1a;什么是Nand Flash&…...

独立站最全出单营销指南,新手卖家赶紧学起来吧!

这是一个需要投入大量时间和精力的挑战&#xff0c;但只有经过筛选在众多品牌和渠道中找到最适合自己的营销策略&#xff0c;才能成功。 新手商家经常会发现自己有很多可以改进的地方&#xff1a;品牌的颜色、字体以及其他一些细节。但真正走向成熟的商家会意识到&#xff0c;…...

Unity安卓构建72小时实战指南:从零到真机运行

1. 这不是“又一本Unity教程”&#xff0c;而是我带三个新人从零上线第一款安卓游戏的真实路径你点开这个标题&#xff0c;大概率正站在两个路口之间&#xff1a;一边是满屏“30天速成Unity”“零基础做爆款”的短视频封面&#xff0c;一边是你刚下载完Unity Hub、卡在Android …...

机器学习结合基因无关通路映射:从临床数据挖掘新药靶点

1. 项目概述&#xff1a;当机器学习遇见代谢通路&#xff0c;如何从数据中“挖”出新药靶点&#xff1f;在生物医学研究的前沿&#xff0c;我们正面临一个核心矛盾&#xff1a;一方面&#xff0c;我们拥有海量的临床数据&#xff0c;比如血糖、血压、BMI等指标&#xff1b;另一…...

BLE四大广播模式详解:可连接/不可连接/定向/周期广播

一、前言在低功耗蓝牙&#xff08;BLE&#xff09;开发中&#xff0c;广播&#xff08;Advertising&#xff09;是设备发现、连接建立、数据广播、设备重连的核心基石&#xff0c;所有BLE交互流程均始于广播报文的收发。不同于传统经典蓝牙&#xff0c;BLE所有广播行为标准化、…...

如何从零构建智能FOC轮腿机器人:完整开源硬件系统终极指南

如何从零构建智能FOC轮腿机器人&#xff1a;完整开源硬件系统终极指南 【免费下载链接】foc-wheel-legged-robot Open source materials for a novel structured legged robot, including mechanical design, electronic design, algorithm simulation, and software developme…...

如何快速批量下载高质量歌词:ZonyLrcToolsX跨平台终极解决方案

如何快速批量下载高质量歌词&#xff1a;ZonyLrcToolsX跨平台终极解决方案 【免费下载链接】ZonyLrcToolsX ZonyLrcToolsX 是一个能够方便地下载歌词的小软件。 项目地址: https://gitcode.com/gh_mirrors/zo/ZonyLrcToolsX 还在为本地音乐库缺少歌词而烦恼吗&#xff1…...

从RD、CS到WK:一文讲透SAR主流成像算法的演进与选型实战

从RD、CS到WK&#xff1a;SAR成像算法选型实战指南 当无人机掠过灾区上空&#xff0c;或卫星扫描地球表面时&#xff0c;合成孔径雷达&#xff08;SAR&#xff09;正通过电磁波穿透云层和黑暗&#xff0c;将地面信息转化为高分辨率图像。而决定图像质量的关键&#xff0c;在于工…...

Windows开机自动全屏打开指定网页?一个快捷方式参数就搞定(Chrome/Edge/Firefox教程)

Windows开机自动全屏展示网页的终极方案每次开机都要手动打开浏览器、输入网址、切换全屏模式&#xff1f;这种重复操作不仅浪费时间&#xff0c;还容易在重要演示时手忙脚乱。想象一下&#xff1a;电脑启动后自动全屏显示你的仪表盘、会议日程或是监控大屏&#xff0c;整个过程…...

2026论文顶级降AI率工具大曝光:一键把AIGC率降至安全线!

步入2026年&#xff0c;学术圈的规则已经彻底变了味。过去那种只盯着查重率的“降重焦虑”早就被更可怕的“降AI焦虑”取代了。AI检测算法越来越聪明&#xff0c;高校审核标准也越来越严苛&#xff0c;光是把重复率压下去已经完全不够用了。现在摆在学生和科研人员面前的难题是…...

从数据到模型:手把手教你预处理MPIIFaceGaze和EyeDiap数据集(Python实战)

从数据到模型&#xff1a;手把手教你预处理MPIIFaceGaze和EyeDiap数据集&#xff08;Python实战&#xff09;当你第一次打开MPIIFaceGaze或EyeDiap数据集的压缩包时&#xff0c;那种面对杂乱文件夹和神秘.mat文件的迷茫感&#xff0c;我太熟悉了。作为计算机视觉工程师&#xf…...

别再把大模型当搜索框了:一文讲透 LLM 的基本原理、能力边界与局限性

写在前面很多人把大语言模型当成“会聊天的搜索引擎”&#xff0c;结果一上线就遇到幻觉、口径不稳、上下文丢失、成本失控。真正理解 LLM&#xff0c;要先抓住一句话&#xff1a;它是基于 Transformer 的概率生成模型&#xff0c;核心能力来自海量预训练、上下文学习与后训练对…...