
Knowledge-Base Q&A over PDF Files in Java, Implemented with Spring AI

news 2026/2/9 19:31:54

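Core idea:

In short: read the PDF file, convert its content into vectors, put those vectors into a vector store, and then let a large language model answer questions using what is retrieved from that store.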

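Introduction to RAG (Retrieval-Augmented Generation):

Retrieval-augmented generation (RAG) combines information retrieval with text generation to make a large model's responses more accurate and relevant. It pairs a retrieval model, which searches a proprietary dataset or knowledge base, with a generation model such as a large language model (LLM), so that private or proprietary data can inform the answer. This reduces the "hallucinations" that occur when the model lacks specific background knowledge, and it keeps the generated content closer to the user's needs and context. RAG is therefore a good fit for applications that have to work with internal enterprise data.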
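To make the retrieve-then-generate idea concrete, here is a minimal sketch in plain Java. It is not how Spring AI Alibaba or DashScope work internally: the class name `ToyRag`, the bag-of-words "embedding", and the prompt layout are all assumptions chosen for illustration, and the tokenizer only handles ASCII words. A real system would use an embedding model and a proper vector store.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Toy retrieval-augmented prompt builder (illustrative only, hypothetical names). */
class ToyRag {
    private final List<String> chunks = new ArrayList<>();

    /** Stand-in for an embedding model: a sparse word-count vector. */
    static Map<String, Integer> embed(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Cosine similarity between two sparse count vectors. */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
        }
        double normA = Math.sqrt(a.values().stream().mapToDouble(v -> v * v).sum());
        double normB = Math.sqrt(b.values().stream().mapToDouble(v -> v * v).sum());
        return (normA == 0 || normB == 0) ? 0 : dot / (normA * normB);
    }

    /** Stores one chunk of knowledge-base text. */
    void add(String chunk) {
        chunks.add(chunk);
    }

    /** Returns the topK stored chunks most similar to the question. */
    List<String> retrieve(String question, int topK) {
        Map<String, Integer> q = embed(question);
        return chunks.stream()
                .sorted(Comparator.comparingDouble((String c) -> -cosine(q, embed(c))))
                .limit(topK)
                .collect(Collectors.toList());
    }

    /** Builds the augmented prompt that would be sent to the language model. */
    String buildPrompt(String question, int topK) {
        return "Context:\n" + String.join("\n", retrieve(question, topK))
                + "\n\nQuestion: " + question;
    }
}
```

The model never sees the whole knowledge base. It only sees the few chunks that score highest against the question, which is what keeps the answer grounded in the private data.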

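Introduction to Spring AI Alibaba

Spring AI Alibaba is an application framework built on Spring AI for integrating Alibaba Cloud's Tongyi large-model services. With simple configuration and a small amount of code, developers can add AI capabilities such as chat and text-to-image generation to a Java application. Its main advantage is a standardized set of interfaces, so an application can switch between AI providers without large code changes. The framework also supports streaming output and provides features such as prompt templates to simplify development. Because it integrates with the Spring Boot ecosystem, it fits naturally into existing Spring Boot projects.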

Detailed example:

1 Writing the backend code

Read the PDF -> vectorize -> store the vectors -> retrieve and display
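Between reading and vectorizing, long documents are usually cut into smaller pieces so that each piece can be embedded and retrieved separately. In this article that work is left to the DashScope cloud reader, so the following is only a sketch of the general idea. The class name `TextChunker`, the character-based window, and the overlap parameter are assumptions, not part of Spring AI Alibaba.

```java
import java.util.ArrayList;
import java.util.List;

/** Splits text into fixed-size, overlapping character windows (illustrative only). */
class TextChunker {

    /**
     * @param chunkSize maximum number of characters per chunk
     * @param overlap   number of characters shared by consecutive chunks
     */
    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each time
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }
}
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of storing some text twice.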

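1. Environment setup

Make sure your development environment meets the following requirements:

JDK 17 or later.

Spring Boot 3.3.x or later.

An api-key for the Tongyi Qianwen API, obtained from Alibaba Cloud.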

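2. Configure the project to use Spring AI Alibaba

2.1 Set the API key

Before starting the application, set the environment variable AI_DASHSCOPE_API_KEY to the API key you obtained, and reference it in application.properties: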

spring.ai.dashscope.api-key=${AI_DASHSCOPE_API_KEY}
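The `${AI_DASHSCOPE_API_KEY}` placeholder is resolved from the environment when the application starts. The sketch below imitates that substitution in plain Java so it is clear what happens when the variable is missing. It is a simplified illustration, not Spring's actual resolver (which also supports defaults and other property sources), and the class name `PlaceholderDemo` is hypothetical.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Simplified ${NAME} substitution against a map of environment variables. */
class PlaceholderDemo {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([A-Za-z0-9_]+)\\}");

    /** Replaces each ${NAME} with env.get(NAME); fails fast if a variable is missing or blank. */
    static String resolve(String value, Map<String, String> env) {
        Matcher m = PLACEHOLDER.matcher(value);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = env.get(m.group(1));
            if (replacement == null || replacement.isBlank()) {
                throw new IllegalStateException("Environment variable not set: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the key from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String key = resolve("${AI_DASHSCOPE_API_KEY}", System.getenv());
        // Print only the length so the secret itself never reaches the console.
        System.out.println("API key resolved, length " + key.length());
    }
}
```

Running `main` without the variable set throws immediately, which is usually easier to diagnose than an authentication error on the first model call.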

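2.2 Add the dependencies

Add the spring-ai-alibaba-starter dependency to your pom.xml and specify the correct Spring Boot parent. Also include the repository entries below so that Maven can resolve milestone and snapshot artifacts.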

<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>3.3.4</version>
</parent>

<dependencies>
    <dependency>
        <groupId>com.alibaba.cloud.ai</groupId>
        <artifactId>spring-ai-alibaba-starter</artifactId>
        <version>1.0.0-M2</version>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>sonatype-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <snapshots>
            <enabled>true</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>

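3. Write the RAG service

Create a service class named RagService that handles the logic for the vector store and document retrieval. The service is also responsible for building the index and running queries.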

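// Imports omitted for brevity.
@Service // registers the class as a bean so that RagController can inject it
public class RagService {

    // Name of the DashScope knowledge-base index, used for both storing and retrieving.
    private static final String INDEX_NAME = "spring-ai知识库";

    private final ChatClient chatClient;
    private final VectorStore vectorStore;
    // Replace the placeholder with your own key; avoid committing a real key to source control.
    private final DashScopeApi dashscopeApi = new DashScopeApi("your-api-key");
    private final DocumentRetriever retriever;

    public RagService(ChatClient chatClient, EmbeddingModel embeddingModel) {
        this.chatClient = chatClient;
        // Cloud-hosted vector store backed by the named index.
        vectorStore = new DashScopeCloudStore(dashscopeApi, new DashScopeStoreOptions(INDEX_NAME));
        // Retriever that searches the same index at query time.
        retriever = new DashScopeDocumentRetriever(dashscopeApi,
                DashScopeDocumentRetrieverOptions.builder().withIndexName(INDEX_NAME).build());
    }

    /** Reads the PDF, lets DashScope parse it, and adds the resulting documents to the vector store. */
    public String buildIndex() {
        String filePath = "/path/to/阿里巴巴财报.pdf";
        DocumentReader reader = new DashScopeDocumentCloudReader(filePath, dashscopeApi, null);
        List<Document> documentList = reader.get();
        vectorStore.add(documentList);
        return "SUCCESS";
    }

    /** Answers a question as a stream, with retrieved documents injected into the prompt by the advisor. */
    public StreamResponseSpec queryWithDocumentRetrieval(String message) {
        // DEFAULT_USER_TEXT_ADVISE is the prompt-template constant passed to the advisor;
        // its definition is not shown in this listing and must be supplied.
        return chatClient.prompt()
                .user(message)
                .advisors(new DocumentRetrievalAdvisor(retriever, DEFAULT_USER_TEXT_ADVISE))
                .stream();
    }
}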

4. Create a controller to expose the endpoints

Next, define a REST controller that receives HTTP requests and returns the results to the client.

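// Imports omitted for brevity.
@RestController
@RequestMapping("/ai")
@CrossOrigin(origins = "http://localhost:3000") // allows calls from the React dev server, which runs on a different origin
public class RagController {

    private final RagService ragService;

    public RagController(RagService ragService) {
        this.ragService = ragService;
    }

    /** Streams the answer to the given question as it is generated. */
    @GetMapping("/steamChat")
    public Flux<String> generate(@RequestParam(value = "input", required = true) String input,
                                 HttpServletResponse httpResponse) {
        StreamResponseSpec chatResponse = ragService.queryWithDocumentRetrieval(input);
        // Force UTF-8 so that non-ASCII text is not garbled in the response.
        httpResponse.setCharacterEncoding("UTF-8");
        return chatResponse.content();
    }

    /** Builds the knowledge-base index from the PDF; call this once before querying. */
    @GetMapping("/buildIndex")
    public String buildIndex() {
        return ragService.buildIndex();
    }
}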

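5. Run the application

After starting the application, first build the index by calling http://localhost:8080/ai/buildIndex once. After that, you can query the information in the financial report by visiting http://localhost:8080/ai/steamChat?input=your question.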
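The question has to be URL-encoded when it is placed in the `input` query parameter, especially if it contains spaces or non-ASCII characters. The following sketch shows one way to call the endpoint from Java with the JDK's built-in `HttpClient`. The class name `RagClient` and its method names are hypothetical; the path `/ai/steamChat` and the `input` parameter are the ones defined by the controller above. `ask` collects the whole answer into one string instead of consuming it as a stream.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Minimal client for the /ai/steamChat endpoint (illustrative only). */
class RagClient {

    /** Builds the request URI, URL-encoding the question for use as a query parameter. */
    static URI chatUri(String baseUrl, String question) {
        String encoded = URLEncoder.encode(question, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/ai/steamChat?input=" + encoded);
    }

    /** Sends the question and returns the complete response body as UTF-8 text. */
    static String ask(String baseUrl, String question) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(chatUri(baseUrl, question)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Requires the backend to be running and the index to have been built.
        System.out.println(ask("http://localhost:8080", "What was the revenue last quarter?"));
    }
}
```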

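With the steps above, you can use retrieval augmentation to work with the Alibaba financial-report PDF and offer interactive Q&A through a simple Web API. This helps users find the information they need more efficiently, and it also shows how existing technologies and tools can be combined to put together a practical service quickly.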

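2 Writing the frontend code for retrieval augmentation

Create the project and fill in the code

First, create a new React application and install the required dependencies: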

npx create-react-app ragChatFrontend
cd ragChatFrontend
npm install

`public/index.html`

public/index.html needs no special changes; keep the default.

`src/index.js`

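Make sure your src/index.js looks like the following. It renders the application's root component, App:

import React from 'react';
import ReactDOM from 'react-dom/client';
import App from './App';

// React 18+ entry point: createRoot replaces the older ReactDOM.render call.
const root = ReactDOM.createRoot(document.getElementById('root'));
root.render(
  <React.StrictMode>
    <App />
  </React.StrictMode>
);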

`src/App.js`

This file defines the application's main layout. In this example it contains only a single chat component:

import React from 'react';
import RAGChatComponent from './components/RAGChatComponent';

function App() {
  return (
    <div className="App">
      <RAGChatComponent />
    </div>
  );
}

export default App;

`src/components/RAGChatComponent.js`

This is where the main functionality lives: handling user input, sending the request to the backend, and displaying the returned data stream.

import React, { useState } from 'react';

function RAGChatComponent() {
  const [input, setInput] = useState('');
  const [messages, setMessages] = useState('');

  const handleInputChange = (event) => {
    setInput(event.target.value);
  };

  const handleSendMessage = async () => {
    if (input.trim() === '') return;
    try {
      // Send the request to the backend RAG chat endpoint.
      // The path is spelled "steamChat" to match the controller's @GetMapping.
      const response = await fetch(
        `http://localhost:8080/ai/steamChat?input=${encodeURIComponent(input)}`
      );
      if (!response.ok) {
        throw new Error(`HTTP ${response.status}`);
      }
      const reader = response.body.getReader();
      const decoder = new TextDecoder('utf-8');
      // Read the response stream chunk by chunk and append each decoded chunk.
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        // stream: true keeps multi-byte characters intact across chunk boundaries.
        const chunk = decoder.decode(value, { stream: true });
        setMessages((prevMessages) => prevMessages + chunk);
      }
      // After each request completes, append a separator to tell the rounds apart.
      setMessages((prevMessages) => prevMessages + '\n\n=============================\n\n');
    } catch (error) {
      console.error('Failed to fetch', error);
    }
  };

  const handleClearMessages = () => {
    setMessages('');
  };

  return (
    <div>
      <input
        type="text"
        value={input}
        onChange={handleInputChange}
        placeholder="Enter your message"
      />
      <button onClick={handleSendMessage}>Send</button>
      <button onClick={handleClearMessages}>Clear</button>
      <div>
        <h3>Messages:</h3>
        <pre>{messages}</pre>
      </div>
    </div>
  );
}

export default RAGChatComponent;

Run the project

Start the frontend dev server:

cd ragChatFrontend
npm start

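Explanation of the steps

We created a new React application and built a simple interface for interacting with a chat service that supports retrieval augmentation (RAG).

The user types a message into the text box and clicks the "Send" button to send it to the backend.

The message is sent as an HTTP GET request to http://localhost:8080/ai/steamChat?input=.... The fetch API issues the asynchronous request, and the content is displayed incrementally by reading the data stream in the response body.

Each time a new chunk of data arrives, it is decoded into a string and appended to the current message text.

After each request completes, a separator line is inserted so that the different rounds of conversation are easy to tell apart.

A clear function lets the user empty the message history and start a new round of conversation.

This approach relies on the browser's built-in TextDecoder and ReadableStream APIs to process the data stream from the server as it arrives, which suits scenarios with real-time requirements such as online chat.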
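The reason for passing `{ stream: true }` to `TextDecoder.decode` is that a network chunk can end in the middle of a multi-byte UTF-8 character, which is common with Chinese text. The sketch below reproduces the same behaviour in Java with `CharsetDecoder` to show what the decoder has to do: hold back the incomplete trailing bytes and prepend them to the next chunk. The class name `StreamingUtf8Decoder` is hypothetical, and this is an illustration of the technique rather than the browser's actual implementation.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Decodes UTF-8 chunk by chunk, carrying incomplete byte sequences over to the next chunk. */
class StreamingUtf8Decoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Bytes left over from the previous chunk (an unfinished character), ready to be read.
    private ByteBuffer pending = ByteBuffer.allocate(0);

    /** Returns the text that can be decoded so far; an unfinished trailing character is kept for later. */
    String decode(byte[] chunk) {
        // Prepend the leftover bytes to the new chunk.
        ByteBuffer in = ByteBuffer.allocate(pending.remaining() + chunk.length);
        in.put(pending).put(chunk).flip();

        // UTF-8 never yields more chars than bytes, so this capacity is sufficient.
        CharBuffer out = CharBuffer.allocate(in.remaining());
        // endOfInput = false: an incomplete trailing sequence is left unread in `in`.
        decoder.decode(in, out, false);

        // Save whatever the decoder could not consume yet.
        pending = ByteBuffer.allocate(in.remaining());
        pending.put(in).flip();

        out.flip();
        return out.toString();
    }
}
```

Decoding each chunk independently would instead turn the split character into replacement characters on both sides of the boundary.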
