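Elasticsearch 7.x
Introduction to Elasticsearch 7.x
- Elasticsearch is an open-source RESTful search engine built on the Apache Lucene library.
- Elasticsearch was released a few years after Solr. It provides a distributed, multi-tenant full-text search engine with an HTTP web interface (REST) and schema-free JSON documents. Official client libraries are available for Java, Groovy, PHP, Ruby, Perl, Python, .NET and JavaScript.
- [Official site] https://www.elastic.co/
- [Downloads] https://www.elastic.co/downloads/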
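- Index
- An index can be thought of as a database in a relational system
- Type
- A type is like a kind of table, for example a user table or an order table
- Note:
- In ES 5.x an index can have multiple types
- In ES 6.x an index can have only one type
- ES 7.x deprecates types (the typeless APIs use `_doc`), and 8.x removes them entirely
- Mapping
- A mapping defines the type and other properties of each field; it corresponds to a table schema in a relational database
- Document
- A document corresponds to a row in a relational database
- Field
- Corresponds to a column of a relational table
- Cluster
- A cluster consists of one or more nodes; the default cluster name is "elasticsearch"
- Node
- A member of the cluster: a machine or a process
- Shards and replicas
- Shards are either primary shards or replica shards; a replica shard is a copy of a primary shard
- The data of an index is physically distributed across several primary shards, and each primary shard holds only part of the data
- Each primary shard can have multiple copies, called replica shards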
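The relationship between primary and replica shards can be sketched in a few lines of Python. This is only an illustration: the helper names `total_shard_copies` and `index_settings` are made up here, while `number_of_shards` and `number_of_replicas` are the index settings that also appear in the index responses in this article.

```python
def total_shard_copies(primary_shards: int, replicas_per_primary: int) -> int:
    """Number of shard copies (primaries plus replicas) that one index needs."""
    if primary_shards < 1 or replicas_per_primary < 0:
        raise ValueError("need at least 1 primary shard and 0 or more replicas")
    return primary_shards * (1 + replicas_per_primary)


def index_settings(primary_shards: int, replicas_per_primary: int) -> dict:
    """Request body for creating an index with explicit shard settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas_per_primary,
            }
        }
    }


if __name__ == "__main__":
    # The 7.x defaults are 1 primary shard and 1 replica.
    print(total_shard_copies(1, 1))
    print(index_settings(3, 2))
```

With the defaults (1 primary, 1 replica) the function returns 2, which matches the `"_shards": { "total": 2 }` value in the document responses further down.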
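Category | Type | Description |
---|---|---|
String | text | Used for full-text indexing; the field value is split into terms by an analyzer |
String | keyword | Not analyzed; only the complete value of the field can be searched |
Numeric | long, integer, short, byte, double, float, half_float, scaled_float | - |
Boolean | boolean | - |
Binary | binary | The value is treated as a base64-encoded string; it is not stored by default and is not searchable |
Range | integer_range, float_range, long_range, double_range, date_range | A range type represents a range of values rather than a single value. For example, if age is an integer_range, a value can be {"gte" : 20, "lte" : 40}, and the query "term" :{"age": 21} matches it |
Date | date | JSON has no date type, so ES decides whether a string is a date by checking it against the formats defined in format; the default format is strict_date_optional_time||epoch_millis. Values look like "2022-01-01" or "2022/01/01 12:10:30" (the latter needs a matching custom format), or the number of milliseconds since the epoch (00:00 on January 1, 1970) |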
- Array type
- ES has no dedicated array type; just use [] in the document. All values in an array must have the same data type; mixed-type arrays are not supported
- String array [ "one", "two" ], integer array [ 1, 2 ]
- Array of objects [ { "name": "Louis", "age": 18 }, { "name": "Daniel", "age": 17 }]
- An array can only hold values of one type, so [ 10, "some string" ] is invalid
- Object type
- An object may contain inner objects
{ "name": "李蒙", "age": 14, "sex": "0", "class": "7(2)班", "birthday": "2005-10-15", "hobbies": [ "阅读", "跑步" ], "address": { "province": "山东", "location": { "city": "日照" } } }
- IP type
An ip field stores IPv4 or IPv6 addresses. (Older, IPv4-only versions stored the value internally as a long.)
Operation | Method | URL | Parameters |
---|---|---|---|
Create | PUT (required) | localhost:9200/stu | - |
Get | GET | localhost:9200/stu | - |
Delete | DELETE | localhost:9200/stu | - |
Get several | GET | localhost:9200/stu,tea | - |
Get all (1) | GET | localhost:9200/_all | - |
Get all (2) | GET | localhost:9200/_cat/indices?v | - |
Exists | HEAD | localhost:9200/stu | - |
Close | POST | localhost:9200/stu/_close | - |
Open | POST | localhost:9200/stu/_open | - |
Automatic index creation | PUT | localhost:9200/_cluster/settings | see below |
Copy data | POST | localhost:9200/_reindex | see below |
- Create
PUT localhost:9200/stu
// Response { "acknowledged": true, "shards_acknowledged": true, "index": "stu" }
- Get
GET localhost:9200/stu
// Response
{
  "stu": {
    "aliases": {}, // aliases
    "mappings": {}, // mappings
    "settings": {
      "index": {
        "creation_date": "1576139082806", // creation time (epoch milliseconds)
        "number_of_shards": "1", // primary shards
        "number_of_replicas": "1", // replicas per primary shard
        "uuid": "-ocQkbgoSyG2vDTsugK_9Q",
        "version": { "created": "7020099" },
        "provided_name": "stu"
      }
    }
  }
}
- Delete
DELETE localhost:9200/stu
// Response { "acknowledged": true }
- Get several
GET localhost:9200/stu,tea
// Response { "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }
- Get all (1)
GET localhost:9200/_all
// Response { "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }
- Get all (2)
GET localhost:9200/_cat/indices?v
// Response
health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .kibana_task_manager IjZxE0H9TtmTrpBgzjr-qg   1   0          2            0     12.8kb         12.8kb
yellow open   stu                  H9dyTutEQg-4OsV2Byt-gA   1   1          0            0       283b           283b
yellow open   tea                  nYhKuggbT_Wa2RI-M_COGA   1   1          0            0       283b           283b
- Exists
HEAD localhost:9200/stu
// Response when the index exists: 200 OK
- Close
POST localhost:9200/stu/_close
// Response { "acknowledged": true, "shards_acknowledged": true }
- Open
POST localhost:9200/stu/_open
// Response { "acknowledged": true, "shards_acknowledged": true }
- Automatic index creation
Controls whether an index is created automatically when a document is inserted (see below) into an index that does not exist yet
Send GET http://localhost:9200/_cluster/settings to check the state of auto_create_index;
true means indices are created automatically
- Change the auto_create_index setting
PUT localhost:9200/_cluster/settings
// Request body (the value is "true" or "false") { "persistent": { "action.auto_create_index": "true" } }
- Copy data (combined with index aliases, this lets you rebuild an index and import its data)
POST localhost:9200/_reindex
{ "source": { "index": "stu" }, "dest": { "index": "stu_oth" } }
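During development, as business requirements evolve, older business logic has to be updated or even refactored. For Elasticsearch this can mean changing existing indices, for example adjusting some fields or even rebuilding an index, and such changes can disrupt the business or even require downtime. Index aliases address this problem. An alias works like a shortcut or a symbolic link: it can point to one or more indices and can be used in any API that expects an index name, which gives applications a great deal of flexibility.
Several indices can share the same alias, and one index can have several aliases.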
Operation | Method | URL | Parameters |
---|---|---|---|
Query | GET | localhost:9200/_alias; localhost:9200/stu/_alias | - |
Add | POST | localhost:9200/_aliases | see below |
Add | PUT | localhost:9200/stu/_alias/stu_v1.0 | - |
Remove | POST | localhost:9200/_aliases | see below |
Remove | DELETE | localhost:9200/stu/_alias/stu_v1.0 | - |
Rename | POST | localhost:9200/_aliases | see below |
- Add
POST localhost:9200/_aliases
{ "actions": [ { "add": { "index": "stu", "alias": "stu_1214" } } ] }
- Remove
POST localhost:9200/_aliases
{ "actions": [ { "remove": { "index": "stu", "alias": "stu_v1.1" } } ] }
- Rename
POST localhost:9200/_aliases
{ "actions": [ { "remove": { "index": "stu", "alias": "stu_1214" } }, { "add": { "index": "stu", "alias": "stu_1215" } } ] }
- When an alias points to several indices, you can designate the index that receives writes
POST localhost:9200/_aliases
{ "actions": [ { "add": { "index": "stu", "alias": "alia_v1.0", "is_write_index": true } }, { "add": { "index": "tea", "alias": "alia_v1.0" } } ] }
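As a rough sketch of how reindexing and aliases combine for a rebuild without downtime, the Python below only builds the two request bodies; it does not send them. The function names and the index names `stu_v1`, `stu_v2` and alias `stu_alias` are hypothetical. The bodies follow the `_reindex` and `_aliases` formats shown above.

```python
import json


def reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for POST /_reindex: copy every document from source to dest."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def swap_alias_body(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases: move an alias from old_index to new_index.

    Both actions are sent in one request, so clients that read through
    the alias never see a moment where it points at nothing.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: rebuild stu_v1 as stu_v2, then repoint the alias.
    print(json.dumps(reindex_body("stu_v1", "stu_v2")))
    print(json.dumps(swap_alias_body("stu_alias", "stu_v1", "stu_v2")))
```

The application keeps querying the alias throughout; only the index behind it changes.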
Operation | Method | URL | Parameters |
---|---|---|---|
Create | PUT | localhost:9200/stu/_mapping | see below |
Get | GET | localhost:9200/stu/_mapping | - |
Get several | GET | localhost:9200/stu,tea/_mapping | - |
Get all (1) | GET | localhost:9200/_mapping | - |
Get all (2) | GET | localhost:9200/_all/_mapping | - |
Update | PUT | localhost:9200/stu/_mapping | see below |
- Create
PUT localhost:9200/stu/_mapping
// Request body { "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" } } }
- Get
GET localhost:9200/stu/_mapping
// Response { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }
- Get several
GET localhost:9200/stu,tea/_mapping
// Response { "tea": { "mappings": {} }, "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }
- Get all (1)
GET localhost:9200/_mapping
// Response { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }
- Get all (2)
GET localhost:9200/_all/_mapping
// Response { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }
- Update
Note:
When updating a mapping you can only add fields; existing fields cannot be changed or deleted
PUT localhost:9200/stu/_mapping
// Request body { "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" }, "birthday": { "type": "date" } } }
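The "add only" rule for mapping updates can be checked on the client before sending the request. The following Python is a minimal sketch under that assumption; `plan_mapping_update` is a hypothetical helper, not part of any Elasticsearch client, and it only compares the flat `properties` dictionaries used in this article.

```python
def plan_mapping_update(existing: dict, desired: dict) -> dict:
    """Return a PUT _mapping body containing only the fields that are new.

    existing and desired map field names to their definitions, e.g.
    {"age": {"type": "long"}}. A field that already exists with a
    different definition cannot be changed in place, so it is rejected.
    """
    conflicts = sorted(
        name for name, spec in desired.items()
        if name in existing and existing[name] != spec
    )
    if conflicts:
        raise ValueError(f"cannot change existing fields: {conflicts}")
    new_fields = {n: s for n, s in desired.items() if n not in existing}
    return {"properties": new_fields}


if __name__ == "__main__":
    current = {"name": {"type": "text"}, "age": {"type": "long"}}
    wanted = dict(current, birthday={"type": "date"})
    print(plan_mapping_update(current, wanted))
```

Sending only the new fields is equivalent to resending the full `properties` block as in the request above, since unchanged definitions are accepted as they are.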
Operation | Method | URL | Parameters |
---|---|---|---|
Create (with id) | PUT | localhost:9200/stu/_doc/1 | see below |
Create (without id) | POST (required) | localhost:9200/stu/_doc | see below |
Specify the operation type | PUT | localhost:9200/stu/_doc/1?op_type=create | see below |
Get | GET | localhost:9200/stu/_doc/1 | - |
Get multiple documents | POST | localhost:9200/_mget | see below |
Update | POST | localhost:9200/stu/_update/1 | see below |
Delete | DELETE | localhost:9200/stu/_doc/1 | - |
Delete all | POST | localhost:9200/stu/_delete_by_query | see below |
- Create (with id)
PUT localhost:9200/stu/_doc/1
// Request body { "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 0, "_primary_term": 3 }
- Create (without id)
If no id is given, the system assigns one automatically
POST localhost:9200/stu/_doc
// Request body { "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" }
// Response (the _id was assigned by the system) { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 1, "_primary_term": 3 }
- Specify the operation type
If no operation type is given, writing to an id that already exists replaces the existing document and produces a new version
PUT localhost:9200/stu/_doc/1
// Request body { "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response (a new version is produced, and result is "updated" rather than "created") { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "result": "updated", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 2, "_primary_term": 3 }
PUT localhost:9200/stu/_doc/1?op_type=create (writing to an id that already exists returns an error)
// Request body { "name": "杨光22", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }
// Response { "error": { "root_cause": [ { "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" } ], "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" }, "status": 409 }
- Get
GET localhost:9200/stu/_doc/1
// Response { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "_seq_no": 2, "_primary_term": 3, "found": true, "_source": { "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }
- Get multiple documents
- Option 1
POST localhost:9200/_mget
// Request body { "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1" }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
// Response { "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 3, "_seq_no": 3, "_primary_term": 3, "found": true, "_source": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "_seq_no": 1, "_primary_term": 3, "found": true, "_source": { "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" } } ] }
- Option 2
POST localhost:9200/stu/_mget
// Request body { "docs": [ { "_type": "_doc", "_id": "1" }, { "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
- Option 3
POST localhost:9200/stu/_doc/_mget
// Request body { "docs": [ { "_id": "1" }, { "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }
- Update
- Update the document from a partial document
POST localhost:9200/stu/_update/1
// Request body { "doc": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }
- Add a field to _source
POST localhost:9200/stu/_update/1
// Request body { "script": "ctx._source.height = \"173cm\"" }
- Remove a field from _source
POST localhost:9200/stu/_update/1
// Request body { "script": "ctx._source.remove(\"height\")" }
- Delete
DELETE localhost:9200/stu/_doc/1
// Response { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 6, "result": "deleted", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 6, "_primary_term": 3 }
- Delete all
POST localhost:9200/stu/_delete_by_query
// Request body { "query": { "match_all": {} } }
Data preparation: bulk-import the sample data. ES provides the bulk API for batch operations
- Data (one JSON object per line)
{"index": {"_index": "stu", "_type": "_doc", "_id": 1}}
{"name":"杨光","age":14,"sex":"1","class":"7(2)班","birthday":"2005-08-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 2}}
{"name":"张世杰","age":13,"sex":"0","class":"7(5)班","birthday":"2004-11-01"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 3}}
{"name":"李蒙","age":14,"sex":"0","class":"7(2)班","birthday":"2005-10-15"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 4}}
{"name":"李沁","age":15,"sex":"0","class":"7(3)班","birthday":"2004-10-15"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 5}}
{"name":"王昭","age":14,"sex":"1","class":"7(3)班","birthday":"2005-01-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 6}}
{"name":"李明","age":14,"sex":"1","class":"7(2)班","birthday":"2005-03-26"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 7}}
{"name":"张璐","age":14,"sex":"1","class":"7(5)班","birthday":"2005-06-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 8}}
{"name":"李思敏","age":14,"sex":"1","class":"7(3)班","birthday":"2005-06-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 9}}
{"name":"吴民锡","age":13,"sex":"1","class":"7(5)班","birthday":"2006-04-02"}
{"index": {"_index": "stu", "_type": "_doc", "_id": 10}}
{"name":"赵曦","age":14,"sex":"0","class":"7(2)班","birthday":"2005-09-02"}
- POST bulk (name is the file that holds the data above; the file must end with a newline)
curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @name
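The bulk payload is newline-delimited JSON: an action line followed by the document source, one JSON object per line, with a trailing newline at the end. The sketch below generates such a payload in Python; `to_bulk_ndjson` is a hypothetical helper name, and it omits `_type` because the typeless form is sufficient in 7.x.

```python
import json


def to_bulk_ndjson(index: str, docs: list) -> str:
    """Build a _bulk request body from (id, source) pairs.

    Each document becomes two lines: the "index" action metadata and
    the document itself. The body ends with a newline, as the bulk
    API requires.
    """
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_id": doc_id}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    students = [
        (1, {"name": "杨光", "age": 14, "sex": "1", "class": "7(2)班"}),
        (2, {"name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班"}),
    ]
    # Write the payload to a file that curl can send with --data-binary.
    with open("students.ndjson", "w", encoding="utf-8") as fh:
        fh.write(to_bulk_ndjson("stu", students))
```

The resulting file can be posted with the curl command above by replacing `@name` with `@students.ndjson` (a file name chosen here for illustration).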
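Term-level queries: a term-level query does not analyze the search input, so a document matches only when an indexed term is exactly equal to the query string. These queries are typically used on structured data such as number, date and keyword fields rather than on text fields. In other words, a full-text query analyzes the query text first, whereas a term-level query looks the value up directly in the inverted index of the field.
Operation | Method | URL | Parameters | Description |
---|---|---|---|---|
term query (single value) | POST | localhost:9200/stu/_search | see below | - |
terms query (multiple values) | POST | localhost:9200/stu/_search | see below | - |
Exists Query | POST | localhost:9200/stu/_search | see below | Finds documents that have a non-null value in the given field |
Prefix Query | POST | localhost:9200/stu/_search | see below | Finds documents that contain a term with the given prefix |
Wildcard Query | POST | localhost:9200/stu/_search | see below | Wildcard matching: * matches any sequence of characters, ? matches any single character |
Regexp Query | POST | localhost:9200/stu/_search | see below | Regular expression matching |
Ids Query | POST | localhost:9200/stu/_search | see below | Finds documents by id |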
- term query (single value)
POST localhost:9200/stu/_search
// Request body { "query":{ "term":{ "sex": "1" } } }
// Response { "took": 1, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 0.9808292, "hits": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_score": 0.9808292, "_source": { "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } } ] } }
- terms query (multiple values)
POST localhost:9200/stu/_search
// Request body { "query":{ "terms":{ "sex": ["0","1"] } } }
- Exists Query
POST localhost:9200/stu/_search
// Request body { "query": { "exists": { "field": "birthday" } } }
- Prefix Query
POST localhost:9200/stu/_search
// Request body { "query": { "prefix": { "class": { "value": "7" } } } }
- Wildcard Query
POST localhost:9200/stu/_search
// Request body { "query": { "wildcard": { "class": { "value": "*2*" } } } }
- Regexp Query
POST localhost:9200/stu/_search
// Request body { "query": { "regexp": { "class": "7.*" } } }
- Ids Query
POST localhost:9200/stu/_search
// Request body { "query": { "ids": { "values": [1,2] } } }
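For full-text queries, Elasticsearch first analyzes the query string and splits it into terms. A document matches and is returned when the analyzed field contains any one of the terms (or all of them, depending on the operator); if it contains none of them, no document matches.
Type | Method | URL | Parameters | Description |
---|---|---|---|---|
match_all | POST | localhost:9200/stu/_search | see below | Returns all documents |
match | POST | localhost:9200/stu/_search | see below | Analyzed (full-text) match |
multi_match | POST | localhost:9200/stu/_search | see below | Match across several fields |
match_phrase | POST | localhost:9200/stu/_search | see below | Phrase match: the terms must appear together and in order |
match_phrase_prefix | POST | localhost:9200/stu/_search | see below | Phrase match in which the last term is treated as a prefix (text fields) |
- match_all
POST localhost:9200/stu/_search
// Request body { "query":{ "match_all":{} }, "from": 0, "size": 10 }
- match
POST localhost:9200/stu/_search
// Request body { "query": { "match": { "name": "张" } } }
- multi_match
POST localhost:9200/stu/_search
// Request body (query is the search text, fields lists the fields to search) { "query": { "multi_match": { "query": "世", "fields": ["name","class"] } } }
- match_phrase
POST localhost:9200/stu/_search
// Request body { "query": { "match_phrase": { "class": "7(2)班" } } }
- match_phrase_prefix
POST localhost:9200/stu/_search
// Request body { "query": { "match_phrase_prefix": { "name": "世杰" } } }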
Range queries (dates, numbers or strings)
POST localhost:9200/stu/_search
// Students aged 14 to 15
{
"query": {
"range": {
"age": {
"gte": 14,
"lte": 15
}
}
}
}
// Students born from 2003 through 2004
{
"query": {
"range": {
"birthday": {
"gte": "2003",
"lte": "31-12-2004",
"format": "dd-MM-yyyy||yyyy"
}
}
}
}
Bool queries
Type | Method | URL | Parameters | Description |
---|---|---|---|---|
must | POST | localhost:9200/stu/_search | see below | The clause must match; it contributes to the score |
filter | POST | localhost:9200/stu/_search | see below | The clause must match, but it is not scored |
must_not | POST | localhost:9200/stu/_search | see below | The clause must not match |
should | POST | localhost:9200/stu/_search | see below | The clause should match |
- must
POST localhost:9200/stu/_search
// Students whose sex is "0" and whose name contains "曦" { "query": { "bool": { "must": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }
// Response (_score is the relevance score) { "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.9985561, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.9985561, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" } } ] } }
- filter
Same effect as must, but without scoring
POST localhost:9200/stu/_search
{ "query": { "bool": { "filter": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }
- must_not
POST localhost:9200/stu/_search
// Students whose name contains "张" and whose sex is not "0" { "query": { "bool": { "must": [ { "match": { "name": "张" } } ], "must_not": [ { "term": { "sex": { "value": "0" } } } ] } } }
- should
POST localhost:9200/stu/_search
// Students whose sex is "1" { "query": { "bool": { "should": [ { "term": { "sex": { "value": "1" } } } ] } } }
When should is combined with other clauses such as must, documents are returned even if no should clause matches; only the score differs
// Students whose name contains "李", preferring those aged 13 to 14 { "query": { "bool": { "must": [ { "match": { "name": "李" } } ], "should": [ { "range": { "age": { "gte": 13, "lte": 14 } } } ] } } }
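When these request bodies are assembled in application code, a small builder keeps the four clause types apart. The Python below is a sketch only: `bool_query` is a hypothetical helper that produces the same JSON structure as the requests above and leaves out clause lists that are empty.

```python
import json


def bool_query(must=None, filter=None, must_not=None, should=None) -> dict:
    """Assemble a bool query body from optional clause lists.

    must and should clauses contribute to the score; filter and
    must_not clauses only include or exclude documents.
    """
    clauses = {
        "must": must,
        "filter": filter,
        "must_not": must_not,
        "should": should,
    }
    # Keep only the clause types that were actually supplied.
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}


if __name__ == "__main__":
    body = bool_query(
        must=[{"match": {"name": "张"}}],
        must_not=[{"term": {"sex": {"value": "0"}}}],
    )
    print(json.dumps(body, ensure_ascii=False))
```

The printed body is the must_not example from above, ready to be posted to the `_search` endpoint.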
POST localhost:9200/stu/_search
// Students in class 7(5)班, sorted by age in descending order
{
"query": {
"term": {
"class": {
"value": "7(5)班"
}
}
},
"sort": [
{
"age": {
"order": "desc"
}
}
]
}
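Aggregation queries
- Aggregation analysis is an important database feature: it computes aggregates over the data set returned by a query, such as the maximum or minimum of a field (or of a computed expression), a sum or an average. As both a search engine and a data store, ES offers powerful aggregation capabilities as well
- Aggregations that compute a metric such as max, min, sum or avg over a data set are called metric aggregations in ES
- Relational databases can also group the queried data with GROUP BY and then compute metrics per group; in ES this is called bucket aggregation
- max min sum avg
POST localhost:9200/stu/_search
// max: the maximum age in class 7(3)班 (maxAge is a user-chosen name) { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "maxAge": { "max": { "field": "age" } } }, "size": 0 }
// min: the minimum age in class 7(3)班 { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "minAge": { "min": { "field": "age" } } }, "size": 0 }
// sum { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "sumAge": { "sum": { "field": "age" } } }, "size": 0 }
// avg: the average age in class 7(3)班 (avgAge is a user-chosen name) { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "avgAge": { "avg": { "field": "age" } } }, "size": 0 }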
- value_count
Counts the values of a field, that is, the documents in which the field is not empty
POST localhost:9200/stu/_search
// Number of students in class 7(3)班 whose age is not empty { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "countAge": { "value_count": { "field": "age" } } }, "size": 0 }
- Cardinality
Counts distinct values (the result is approximate)
POST localhost:9200/stu/_search
// Number of distinct ages in class 7(3)班 { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "cardinalityAge": { "cardinality": { "field": "age" } } }, "size": 0 }
- stats
Returns five values: count, max, min, avg and sum
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "statsAge": { "stats": { "field": "age" } } }, "size": 0 }
- Extended stats
Adds four more results to stats: sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "extendedAge": { "extended_stats": { "field": "age" } } }, "size": 0 }
- Percentiles
Returns the values at given percentiles; by default the percentiles are [ 1, 5, 25, 50, 75, 95, 99 ]
POST localhost:9200/stu/_search
{ "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age" } } }, "size": 0 }
// With explicit percentiles { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age", "percents": [ 20, 50, 75 ] } } }, "size": 0 }
- Terms Aggregation: group by the values of a field
POST localhost:9200/stu/_search
// Class 7(3)班 grouped by age { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5 } } }, "size": 0 }
- order: sorting the buckets
POST localhost:9200/stu/_search
// Class 7(3)班 grouped by age, buckets sorted by age in descending order { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_key": "desc" } } } }, "size": 0 }
// Class 7(3)班 grouped by age, buckets sorted by document count in descending order { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_count": "desc" } } } }, "size": 0 }
// Grouped by class, buckets sorted by average age in descending order { "aggs": { "aggsClass": { "terms": { "field": "class", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
- Filtering the buckets
POST localhost:9200/stu/_search
// include lists the values to keep, exclude the values to drop { "aggs": { "aggsClass": { "terms": { "field": "class", "include": ["7(3)班", "7(2)班", "7(5)班"], "exclude": ["7(5)班"], "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
// Regular expressions: include and exclude must be of the same kind (both arrays or both patterns); parentheses are regex operators, so literal ones are escaped { "aggs": { "aggsClass": { "terms": { "field": "class", "include": "7.*", "exclude": "7\\(5\\)班", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }
- Range Aggregation: group by numeric ranges
POST localhost:9200/stu/_search
// Buckets: below 13, 13 to 14, 15 and above (from is inclusive, to is exclusive, so the middle bucket ends at 15) { "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13 }, { "from": 13, "to": 15 }, { "from": 15 } ] } } }, "size": 0 }
// The same buckets with custom keys { "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13, "key":"A" }, { "from": 13, "to": 15, "key":"B" }, { "from": 15, "key":"C" } ] } } }, "size": 0 }
- Date Range Aggregation: group by date ranges
POST localhost:9200/stu/_search
// Buckets: before 2005, during 2005, 2006 and later (to is exclusive) { "aggs": { "aggsrange": { "date_range": { "field": "birthday", "format": "yyyy-MM", "ranges": [ { "to": "2005-01", "key":"A" }, { "from": "2005-01", "to": "2006-01", "key":"B" }, { "from": "2006-01", "key":"C" } ] } } }, "size": 0 }
- Date Histogram Aggregation
Aggregates by day, month, year and so on. calendar_interval accepts year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h) and minute (1m); second-level intervals such as 1s go in fixed_interval instead
POST localhost:9200/stu/_search
{ "aggs": { "aggsrange": { "date_histogram": { "field": "birthday", "format": "yyyy", "calendar_interval": "year" } } }, "size": 0 }
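A detail of the range aggregation that is easy to miss is that `from` is inclusive while `to` is exclusive. The Python sketch below emulates that bucketing locally on a list of ages so the effect can be seen without a cluster. `range_buckets` is a hypothetical helper, and its bucket keys are simplified compared with the keys Elasticsearch generates.

```python
def range_buckets(values, ranges):
    """Count values per bucket using range-aggregation semantics.

    A value v falls into a bucket when from <= v < to; a missing
    bound means the bucket is open on that side.
    """
    counts = {}
    for spec in ranges:
        low, high = spec.get("from"), spec.get("to")
        default_key = f"{'*' if low is None else low}-{'*' if high is None else high}"
        key = spec.get("key", default_key)
        counts[key] = sum(
            1 for v in values
            if (low is None or v >= low) and (high is None or v < high)
        )
    return counts


if __name__ == "__main__":
    # Ages of the ten sample students loaded with the bulk request.
    ages = [14, 13, 14, 15, 14, 14, 14, 14, 13, 14]
    gap = [{"to": 13}, {"from": 13, "to": 14}, {"from": 15}]
    full = [{"to": 13}, {"from": 13, "to": 15}, {"from": 15}]
    print(range_buckets(ages, gap))   # the 14-year-olds land in no bucket
    print(range_buckets(ages, full))  # every student is counted once
```

With a middle bucket of `"to": 14` the seven 14-year-old students are not counted anywhere, which is why adjacent buckets should share their boundary value.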
- query_string on a single field
POST localhost:9200/stu/_search
{ "query": { "query_string": { "default_field": "name", "query": "李 AND 思 OR 敏" } } }
- query_string on multiple fields
POST localhost:9200/stu/_search
{ "query": { "query_string": { "fields": ["name", "sex"], "query": "李 AND 0" } } }
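An analyzer is a tool that breaks a piece of user-supplied text into terms according to certain rules.
Built-in analyzers
- standard analyzer
The standard analyzer is the default; it is used when no analyzer is specified
- simple analyzer
The simple analyzer splits the text into terms whenever it meets a character that is not a letter, and lowercases every term
- whitespace analyzer
The whitespace analyzer splits the text into terms whenever it meets a whitespace character
- stop analyzer
The stop analyzer is much like the simple analyzer; the only difference is that it also removes stop words, using the english stop word list by default
stopwords: a predefined stop word list, for example the, a, an, this, of, at
- language analyzer
Analyzers for specific languages, for example english. Built-in languages: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai
- pattern analyzer
Splits the text into terms with a regular expression; the default expression is \W+ (non-word characters)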
GET /_analyze
{
"analyzer": "simple",
"text": "Deploy a 14-day trial of Elasticsearch Service."
}
{
"tokens" : [
{
"token" : "deploy",
"start_offset" : 0, // start offset
"end_offset" : 6, // end offset
"type" : "word",
"position" : 0 // position of the token in the token stream
},
{
"token" : "a",
"start_offset" : 7,
"end_offset" : 8,
"type" : "word",
"position" : 1
},
{
"token" : "day",
"start_offset" : 12,
"end_offset" : 15,
"type" : "word",
"position" : 2
},
{
"token" : "trial",
"start_offset" : 16,
"end_offset" : 21,
"type" : "word",
"position" : 3
},
{
"token" : "of",
"start_offset" : 22,
"end_offset" : 24,
"type" : "word",
"position" : 4
},
{
"token" : "elasticsearch",
"start_offset" : 25,
"end_offset" : 38,
"type" : "word",
"position" : 5
},
{
"token" : "service",
"start_offset" : 39,
"end_offset" : 46,
"type" : "word",
"position" : 6
}
]
}
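To make the behaviour of the simple analyzer concrete, the rule described above (split on anything that is not a letter, then lowercase) can be approximated locally. The Python below is an emulation for illustration, not the Lucene implementation; `simple_analyze` is a hypothetical name.

```python
import re

# One or more Unicode letters: word characters minus digits and underscore.
LETTERS = re.compile(r"[^\W\d_]+")


def simple_analyze(text: str) -> list:
    """Approximate the simple analyzer: letter runs, lowercased."""
    tokens = []
    for position, match in enumerate(LETTERS.finditer(text)):
        tokens.append({
            "token": match.group().lower(),
            "start_offset": match.start(),
            "end_offset": match.end(),
            "type": "word",
            "position": position,
        })
    return tokens


if __name__ == "__main__":
    for tok in simple_analyze("Deploy a 14-day trial of Elasticsearch Service."):
        print(tok)
```

For the sample sentence this yields the same seven tokens and offsets as the `_analyze` response above: the digits and the hyphen in "14-day" act as separators, so only "day" survives.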
Chinese analyzers
- smartCN
A simple analyzer for Chinese or mixed Chinese-English text
- Install (takes effect after the service is restarted)
sh elasticsearch-plugin install analysis-smartcn
- Example:
GET /_analyze { "analyzer": "smartcn", "text": "有限公司" }
{ "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "word", "position" : 0 } ] }
- IK analyzer
A smarter, friendlier Chinese analyzer
- Download: https://github.com/medcl/elasticsearch-analysis-ik/releases (the plugin version must match the ES version)
- Install: unzip it into the plugins directory under the ES installation directory
- Example:
GET /_analyze { "analyzer": "ik_max_word", "text": "有限公司" }
{ "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "有限", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1 }, { "token" : "公司", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2 } ] }
One might expect newly added data to be searchable as soon as it is indexed, but that is not what happens
- Add a document and search immediately: the new document is not returned
curl -X PUT localhost:9200/stu/_doc/666 -H 'Content-Type:application/json' -d '{ "name": "王丝菲" }'
curl -X GET 'localhost:9200/stu/_search?pretty'
- Force a refresh
curl -X PUT 'localhost:9200/stu/_doc/667?refresh' -H 'Content-Type:application/json' -d '{ "name": "王豆豆" }'
curl -X GET 'localhost:9200/stu/_search?pretty'
- Change the refresh interval (the default is 1s)
PUT localhost:9200/stu/_settings
{ "index": { "refresh_interval": "5s" } }
- Disable refresh
PUT localhost:9200/stu/_settings
{ "index": { "refresh_interval": "-1" } }
- Highlighting
POST localhost:9200/stu/_search
// Request body { "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": {} } } }
// Response (matches are wrapped in <em> tags by default) { "took" : 4, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "<em>赵</em>曦" ] } } ] } }
- Highlighting with custom tags
POST localhost:9200/stu/_search
// pre_tags and post_tags hold the opening and closing tags that wrap each match, for example "<b>" and "</b>"; they are left empty here
// Request body { "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": { "pre_tags": [""], "post_tags": [""] } } } }
// Response { "took" : 6, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "赵曦" ] } } ] } }
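Suggesters give users a better search experience; they cover term checking and auto-completion.
Parameter | Description |
---|---|
text | The text to get suggestions for |
field | The field the suggester reads its candidates from |
analyzer | The analyzer to use |
size | The maximum number of suggestions returned per term |
sort | How suggestions are sorted. Options: score (by score first, then by document frequency, then by the term itself); frequency (by document frequency first, then by score, then by the term itself) |
suggest_mode | Controls when suggestions are offered. missing (default): only for terms that are not in the index; popular: only suggest terms with a higher document frequency than the search term; always: suggest any matching term |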
- Term suggester
The term suggester analyzes the input text and offers suggestions for each resulting term
POST localhost:9200/stu/_search
// Request body { "suggest": { "MY_SUGGESTION": { "text": "7(6)班", "term": { "suggest_mode": "missing", "field": "class" } } } }
// Response { "took" : 105, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(6)班", "offset" : 0, "length" : 5, "options" : [ { "text" : "7(2)班", "score" : 0.8, "freq" : 4 }, { "text" : "7(3)班", "score" : 0.8, "freq" : 3 }, { "text" : "7(5)班", "score" : 0.8, "freq" : 3 } ] } ] } }
- Phrase suggester
The phrase suggester builds on the term suggester and also considers how the terms relate to each other, for example whether they occur together in the indexed text, how close they are, and their frequencies
POST localhost:9200/stu/_search
// Request body { "suggest": { "MY_SUGGESTION": { "text": "7(2) 班", "phrase": { "field": "class" } } } }
// Response { "took" : 17, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(2) 班", "offset" : 0, "length" : 6, "options" : [ { "text" : "7(2)班", "score" : 0.4678218 }, { "text" : "7(3)班", "score" : 0.37474233 }, { "text" : "7(5)班", "score" : 0.37474233 } ] } ] } }
- Completion suggester
The completion suggester auto-completes the text that follows what the user has typed
POST localhost:9200/stu/_search
// The queried field must be of type completion; MY_SUGGESTION is a user-chosen name { "suggest": { "MY_SUGGESTION": { "text": "I like", "completion": { "field": "selfDesc" } } } }
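Because the completion suggester only works on a field mapped as `completion`, the mapping has to exist before any documents are indexed. The Python sketch below builds both the mapping body and a suggest request. The helper names are hypothetical; `selfDesc` is the field name from the request above, and `prefix` and `size` are standard completion suggester options.

```python
import json


def completion_mapping(field: str) -> dict:
    """Body for PUT /<index>/_mapping that declares a completion field."""
    return {"properties": {field: {"type": "completion"}}}


def completion_suggest(name: str, field: str, prefix: str, size: int = 5) -> dict:
    """Body for POST /<index>/_search asking for completions of prefix."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(completion_mapping("selfDesc")))
    print(json.dumps(completion_suggest("MY_SUGGESTION", "selfDesc", "I like")))
```

The sample `stu` index has no `selfDesc` field, so the mapping body would need to be applied and matching documents indexed before the suggest request returns options.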