Elasticsearch Beginner Notes (Part 1)

news 2026/5/23 21:12:14

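Environment Setup

Elasticsearch is a distributed search engine built on Apache Lucene and one of the most widely used search tools.

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. It lets you search, view, and interact with data stored in Elasticsearch indices. Developers and operations staff can run advanced data analysis with little effort and visualize the results in charts, tables, and maps.

Other visualization tools exist, such as elasticsearch-head (lightweight, with a matching Chrome extension); this article does not cover them in detail.

This article uses version 7.17.0 of both Elasticsearch and Kibana and sets up the environment with Docker. The docker-compose.yml file is as follows: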

version: "3.1"
# Service configuration
services:
  elasticsearch:
    container_name: elasticsearch-7.17.0
    image: elasticsearch:7.17.0
    environment:
      - "ES_JAVA_OPTS=-Xms1024m -Xmx1024m"
      - "http.host=0.0.0.0"
      - "node.name=elastic01"
      - "cluster.name=cluster_elasticsearch"
      - "discovery.type=single-node"
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - ./es/plugins:/usr/share/elasticsearch/plugins
      - ./es/data:/usr/share/elasticsearch/data
    networks:
      - elastic_net
  kibana:
    container_name: kibana-7.17.0
    image: kibana:7.17.0
    ports:
      - "5601:5601"
    networks:
      - elastic_net
# Network configuration
networks:
  elastic_net:
    driver: bridge

Basic Commands

Check whether Elasticsearch started successfully:

curl http://IP:9200

Check cluster health:

curl "http://IP:9200/_cat/health?v"
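
The _cat APIs return column-aligned plain text, which is easy to read but awkward to process in a script. The following is a minimal sketch of turning that text into dictionaries. The helper name parse_cat_table and the sample text are made up for illustration, and the sketch assumes no cell contains spaces, which holds for the health output but not for every _cat endpoint.

```python
def parse_cat_table(text):
    """Turn the column-aligned text of a _cat API (requested with ?v) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # Assumption: cells never contain spaces, so whitespace splitting is enough.
    return [dict(zip(header, line.split())) for line in lines[1:]]


# Illustrative sample only; real values depend on the cluster.
sample = (
    "epoch      timestamp cluster               status node.total node.data\n"
    "1700000000 12:00:00  cluster_elasticsearch green  1          1\n"
)
rows = parse_cat_table(sample)
print(rows[0]["status"])
```

In a real script the text would come from the HTTP response body of the command above.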

List all indices in Elasticsearch:

curl http://IP:9200/_cat/indices

Show the document count for all indices or for a single index:

curl "http://IP:9200/_cat/count?v"
curl "http://IP:9200/_cat/count/some_index_name?v"

Show the plugins running on each node (the URL is quoted so that the shell does not interpret the & characters):

curl "http://IP:9200/_cat/plugins?v&s=component&h=name,component,version,description"
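
The query string in this command carries several parameters: v for a header row, s for the sort column, and h for the columns to show. A small sketch of assembling such a URL programmatically might look like this. The function name cat_url is hypothetical, and port 9200 is taken from the setup above.

```python
from urllib.parse import urlencode


def cat_url(host, endpoint, **params):
    """Build a _cat URL; list values are joined with commas."""
    query = {
        key: ",".join(value) if isinstance(value, (list, tuple)) else value
        for key, value in params.items()
    }
    # safe="," keeps the commas in h=name,component,... instead of escaping them
    return f"http://{host}:9200/_cat/{endpoint}?{urlencode(query, safe=',')}"


url = cat_url(
    "IP",
    "plugins",
    v="true",
    s="component",
    h=["name", "component", "version", "description"],
)
print(url)
```

Building the URL in code sidesteps shell quoting altogether, since the & characters never pass through a shell.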

View the tokenization result of the ik plugin:

curl -H 'Content-Type: application/json'  -XGET 'http://IP:9200/_analyze?pretty' -d '{"analyzer":"ik_max_word","text":"美国留给伊拉克的是个烂摊子吗"}'

Index Operations

View the mapping of an index:

curl http://IP:9200/some_index_name/_mapping

View the documents in an index (a search without a size returns only the first 10 hits):

curl http://IP:9200/some_index_name/_search

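Look up a document by ID (mapping types are deprecated in Elasticsearch 7.x, so _doc takes the place of the document type):

curl -X GET http://IP:9200/index_name/_doc/ID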

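Retrieve all data from an index (match_all query; a request body requires the Content-Type header):

curl "http://IP:9200/index_name/_search?pretty"
curl -X POST "http://IP:9200/index_name/_search?pretty" -H "Content-Type: application/json" -d "{\"query\": {\"match_all\": {} }}"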

Retrieve the first few documents of an index (size defaults to 10 when omitted):

curl -XPOST "IP:9200/index_name/_search?pretty" -H "Content-Type: application/json" -d "{\"query\": {\"match_all\": {} }, \"size\" : 2}"

Retrieve a slice from the middle of an index (for example, documents 11 to 20):

curl -XPOST "IP:9200/index_name/_search?pretty" -H "Content-Type: application/json" -d "{\"query\": {\"match_all\": {} }, \"from\" : 10, \"size\" : 10}"
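
The from and size values follow directly from a page number and a page size. As a rough sketch (the helper page_query is a made-up name, and pages are assumed to be 1-based), the request body can be computed like this:

```python
import json


def page_query(page, page_size=10):
    """Return a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,  # number of hits to skip
        "size": page_size,  # number of hits to return
    }


body = page_query(page=2, page_size=10)
print(json.dumps(body))
```

Page 2 with a page size of 10 gives from = 10 and size = 10, which is the request for documents 11 to 20 shown above.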

Search an index and return only the context field:

curl -XPOST "IP:9200/index_name/_search?pretty" -H "Content-Type: application/json" -d "{\"query\": {\"match_all\": {} }, \"_source\": [\"context\"]}"

Delete an index:

curl -XDELETE 'IP:9200/index_name'

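Searching in ES

When a match query contains several keywords, Elasticsearch treats them as an OR by default.
To run an AND search across several keywords, use a bool query (setting the match query's operator parameter to and is an alternative).

curl -H 'Content-Type: application/json' 'localhost:9200/index_name/_search' -d '
{
  "query": {
    "bool": {
      "must": [
        { "match": { "content": "软件" } },
        { "match": { "content": "系统" } }
      ]
    }
  }
}'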
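
When the keyword list is not fixed, the must array can be generated rather than written by hand. The snippet below is a minimal sketch of that idea; the function name and_match_query is an assumption made for illustration and is not part of any Elasticsearch client.

```python
import json


def and_match_query(field, keywords):
    """Build a bool query whose must clauses require every keyword to match."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {field: keyword}} for keyword in keywords]
            }
        }
    }


body = and_match_query("content", ["软件", "系统"])
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON has the same structure as the request body in the curl command above and can be sent the same way.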

A more complex search:

SQL statement:

select * from test_index where name = 'tom' and (hired = true or (personality = 'good' and rude != true))

DSL statement:

GET /test_index/_search
{
  "query": {
    "bool": {
      "must": { "match": { "name": "tom" } },
      "should": [
        { "match": { "hired": true } },
        {
          "bool": {
            "must": { "match": { "personality": "good" } },
            "must_not": { "match": { "rude": true } }
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
}
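
The bool query above combines one required must clause with two should clauses, of which at least one has to match because minimum_should_match is 1. To make that logic concrete, here is a rough sketch that mimics it with plain Python comparisons. It is a simplification: real match queries analyze text, whereas this sketch uses exact equality, and the sample documents are invented.

```python
def bool_matches(doc):
    """Mimic the bool query above with plain equality checks (no text analysis)."""
    must = doc.get("name") == "tom"
    should = [
        doc.get("hired") is True,
        doc.get("personality") == "good" and doc.get("rude") is not True,
    ]
    minimum_should_match = 1
    return must and sum(bool(clause) for clause in should) >= minimum_should_match


# Hypothetical documents, made up for illustration
docs = [
    {"name": "tom", "hired": True, "personality": "bad", "rude": True},
    {"name": "tom", "hired": False, "personality": "good", "rude": False},
    {"name": "tom", "hired": False, "personality": "good", "rude": True},
    {"name": "bob", "hired": True, "personality": "good", "rude": False},
]
results = [bool_matches(doc) for doc in docs]
print(results)
```

The first two documents pass because each satisfies one should clause; the third fails both should clauses, and the fourth fails the must clause.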

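The ik Analyzer

The ik analyzer is a Chinese word-segmentation plugin for Elasticsearch with good support for Chinese text. The ik version must match the Elasticsearch version.

ik 7.17.0 can be downloaded from https://github.com/medcl/elasticsearch-analysis-ik/releases/tag/v7.17.0 . After downloading and extracting it, rename the directory to ik and place it in Elasticsearch's plugins folder.

Using the ik analyzer (in the Kibana console):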

POST _analyze
{
  "text": "戚发轫是哪里人",
  "analyzer": "ik_smart"
}

The output is:

{
  "tokens" : [
    {"token" : "戚", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0},
    {"token" : "发轫", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 1},
    {"token" : "是", "start_offset" : 3, "end_offset" : 4, "type" : "CN_CHAR", "position" : 2},
    {"token" : "哪里人", "start_offset" : 4, "end_offset" : 7, "type" : "CN_WORD", "position" : 3}
  ]
}

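ik can load user dictionaries and stop words. It provides a configuration file, IKAnalyzer.cfg.xml (placed under ik/config), where you can configure your own extension dictionary, stop-word dictionary, and remote extension dictionary; several of each may be listed.

ES must be restarted after either the local extension dictionary or the remote extension dictionary is first configured. Later updates to the local user dictionary also require a restart, whereas the remote extension dictionary supports hot reloading once configured and is checked for updates every 60 seconds. Both extension dictionaries are merged into ik's main dictionary and apply to all indices.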
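
For hot reloading to work, the remote dictionary URL has to let ik detect that the word list changed. The sketch below shows one way a server could prepare such a response. It rests on assumptions not stated in this article: that ik compares the Last-Modified and ETag response headers between checks, and that the body holds one word per line in UTF-8. The function name remote_dict_response is hypothetical; check the ik README before relying on this.

```python
import hashlib
from email.utils import formatdate


def remote_dict_response(words, modified_at):
    """Return (headers, body) for a remote dictionary endpoint.

    Assumption: ik detects changes through the Last-Modified and ETag
    headers and expects one word per line, encoded as UTF-8.
    """
    body = "\n".join(words).encode("utf-8")
    headers = {
        "Content-Type": "text/plain; charset=utf-8",
        "Last-Modified": formatdate(modified_at, usegmt=True),
        # A content hash changes whenever the word list changes
        "ETag": '"' + hashlib.sha256(body).hexdigest() + '"',
    }
    return headers, body


headers, body = remote_dict_response(["戚发轫"], modified_at=0)
print(headers["Last-Modified"])
print(body.decode("utf-8"))
```

Any web server that serves a static file with these headers would do the same job; the function only makes the required pieces explicit.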

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionaries here -->
    <entry key="ext_dict">custom/mydict.dic</entry>
    <!-- Configure your own extension stop-word dictionaries here -->
    <entry key="ext_stopwords">custom/ext_stopword.dic</entry>
    <!-- Configure remote extension dictionaries here -->
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
    <!-- Configure remote extension stop-word dictionaries here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>

The user dictionary path is custom/mydict.dic and the stop-word dictionary path is custom/ext_stopword.dic; place both files under ik/config/custom.
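
Dictionary files are plain text with one word per line, and saving them as UTF-8 avoids garbled Chinese characters. Below is a small sketch that creates both files with the two words from this article's example. The helper write_dictionary is a made-up name, and a temporary directory stands in for the real directory that holds docker-compose.yml.

```python
import tempfile
from pathlib import Path


def write_dictionary(path, words):
    """Write one word per line in UTF-8, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return path


# Stand-in for the project directory; replace with the real location.
root = Path(tempfile.mkdtemp())
config_dir = root / "es" / "plugins" / "ik" / "config"
user_dict = write_dictionary(config_dir / "custom" / "mydict.dic", ["戚发轫"])
stop_dict = write_dictionary(config_dir / "custom" / "ext_stopword.dic", ["是"])
print(user_dict.read_text(encoding="utf-8"))
```

With the volume mapping from the docker-compose.yml above, files written under ./es/plugins on the host show up in the container's plugins directory.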

Add '戚发轫' to the user dictionary and '是' to the stop-word dictionary, then tokenize the original text again:

POST _analyze
{
  "text": "戚发轫是哪里人",
  "analyzer": "ik_smart"
}

The output is:

{
  "tokens" : [
    {"token" : "戚发轫", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 0},
    {"token" : "哪里人", "start_offset" : 4, "end_offset" : 7, "type" : "CN_WORD", "position" : 1}
  ]
}

With ik_smart as the analyzer, the text is split at the coarsest granularity; with ik_max_word, it is split at the finest granularity. A test:

POST _analyze
{
  "text": "戚发轫是哪里人",
  "analyzer": "ik_max_word"
}

The output is:

{
  "tokens" : [
    {"token" : "戚发轫", "start_offset" : 0, "end_offset" : 3, "type" : "CN_WORD", "position" : 0},
    {"token" : "发轫", "start_offset" : 1, "end_offset" : 3, "type" : "CN_WORD", "position" : 1},
    {"token" : "哪里人", "start_offset" : 4, "end_offset" : 7, "type" : "CN_WORD", "position" : 2},
    {"token" : "哪里", "start_offset" : 4, "end_offset" : 6, "type" : "CN_WORD", "position" : 3},
    {"token" : "里人", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 4}
  ]
}
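
To see at a glance what ik_max_word adds over ik_smart, the token strings of the two responses can be compared. The following sketch does this with abbreviated copies of the two outputs shown in this article (only the token and position fields are kept); the helper name token_texts is an assumption for illustration.

```python
import json


def token_texts(analyze_response):
    """Return the token strings from an _analyze response, in position order."""
    tokens = sorted(analyze_response["tokens"], key=lambda t: t["position"])
    return [t["token"] for t in tokens]


# Abbreviated copies of the two responses shown in this article
smart = json.loads(
    '{"tokens": [{"token": "戚发轫", "position": 0}, {"token": "哪里人", "position": 1}]}'
)
max_word = json.loads(
    '{"tokens": [{"token": "戚发轫", "position": 0}, {"token": "发轫", "position": 1},'
    ' {"token": "哪里人", "position": 2}, {"token": "哪里", "position": 3},'
    ' {"token": "里人", "position": 4}]}'
)
extra = [t for t in token_texts(max_word) if t not in token_texts(smart)]
print(extra)
```

The extra tokens are the shorter sub-words that ik_max_word emits in addition to the coarse ik_smart tokens.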

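Summary

This article covered some basic Elasticsearch commands and usage. It is the first of the author's Elasticsearch study notes, and more will follow.

The code for this article is available on GitHub: https://github.com/percent4/ES_Learning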
