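Elasticsearch 7.x

Introduction to Elasticsearch 7.x

  • Elasticsearch is an open-source RESTful search engine built on the Apache Lucene library.
  • Elasticsearch was released a few years after Solr. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface (REST) and schema-free JSON documents. Official client libraries are available for Java, Groovy, PHP, Ruby, Perl, Python, .NET, and JavaScript.
Official site
  • [Official site] https://www.elastic.co/
  • [Downloads] https://www.elastic.co/downloads/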
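Core concepts
  • Index
    • An index is loosely comparable to a database in a relational system
  • Type
    • A type is like a kind of table, for example a user table or an order table
    • Note:
      • In ES 5.x an index can have multiple types
      • In ES 6.x an index can have only one type
      • In ES 7.x types are deprecated: the APIs are typeless and _doc is the only remaining placeholder; types are removed entirely in 8.0
  • Mapping
    • A mapping defines the type and other properties of each field, comparable to a table schema in a relational database
  • Document
    • A document is comparable to a row in a relational database
  • Field
    • Comparable to a column of a relational table
  • Cluster
    • A cluster consists of one or more nodes; the default cluster name is "elasticsearch"
  • Node
    • A member of the cluster: one running Elasticsearch process, typically one per machine
  • Shards and replicas
    • Shards are either primary shards or replica shards; a replica is a copy of a primary
    • The data of an index is physically distributed across several primary shards, each holding only part of the data
    • Each primary shard can have several copies, called replica shards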
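
To make the distribution across primary shards concrete, here is a minimal sketch of how a document could be routed to a shard. Elasticsearch picks the shard as a hash of the routing value modulo the number of primary shards, and the routing value defaults to the document id. The function name pick_shard and the use of CRC32 are assumptions for illustration only; Elasticsearch uses a Murmur3 hash internally, so real shard numbers will differ.

```python
import zlib


def pick_shard(routing: str, number_of_primary_shards: int) -> int:
    """Return the primary shard a routing value maps to (illustrative only)."""
    # Assumption: CRC32 stands in for the Murmur3 hash Elasticsearch uses.
    hashed = zlib.crc32(routing.encode("utf-8"))
    return hashed % number_of_primary_shards


# By default the routing value is the document id.
placement = {doc_id: pick_shard(doc_id, 3) for doc_id in ["1", "2", "3", "4", "5"]}
print(placement)
```

Because the result depends on the number of primary shards, changing that number would send existing ids to different shards, which is why it is chosen when the index is created.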
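Field types: core data types
Category | Type | Description
String | text | Used for full-text indexing; values are split into terms by an analyzer
String | keyword | Not analyzed; the field is matched against its complete value
Numeric | long, integer, short, byte, double, float, half_float, scaled_float | -
Boolean | boolean | -
Binary | binary | The value is treated as a Base64-encoded string; it is not stored by default and is not searchable
Range | integer_range, float_range, long_range, double_range, date_range | The value is a range rather than a single value. For example, if age is an integer_range, a value can be {"gte": 20, "lte": 40}, and the query "term": {"age": 21} matches it
Date | date | JSON has no date type, so ES decides whether a string is a date by checking it against the patterns given in format. The default is strict_date_optional_time||epoch_millis, which accepts values such as "2022-01-01" or the number of milliseconds since the epoch (1970-01-01 00:00:00 UTC). A value such as "2022/01/01 12:10:30" needs a custom format, for example yyyy/MM/dd HH:mm:ss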
Complex data types
  • Array
    • ES has no dedicated array type; just use []. All values in an array must have the same data type; mixed-type arrays are not supported
    • String array [ "one", "two" ], integer array [ 1, 2 ]
    • Array of objects [ { "name": "Louis", "age": 18 }, { "name": "Daniel", "age": 17 } ]
    • For example, [ 10, "some string" ] is invalid
  • Object
    • An object can contain inner objects
      { "name": "李蒙", "age": 14, "sex": "0", "class": "7(2)班", "birthday": "2005-10-15", "hobbies": [ "阅读", "跑步" ], "address": { "province": "山东", "location": { "city": "日照" } } }

Specialized data types
  • IP
    An ip field stores IPv4 or IPv6 addresses. (Before 5.0 the type supported only IPv4 and stored each address internally as a long value.)
Indices
Operation | Method | URL | Body
Create | PUT (required) | localhost:9200/stu | -
Get | GET | localhost:9200/stu | -
Delete | DELETE | localhost:9200/stu | -
Get several | GET | localhost:9200/stu,tea | -
Get all (1) | GET | localhost:9200/_all | -
Get all (2) | GET | localhost:9200/_cat/indices?v | -
Exists | HEAD | localhost:9200/stu | -
Close | POST | localhost:9200/stu/_close | -
Open | POST | localhost:9200/stu/_open | -
Automatic index creation | PUT | localhost:9200/_cluster/settings | see below
Copy data | POST | localhost:9200/_reindex | see below
  • Create
    PUT localhost:9200/stu
    // Response
    { "acknowledged": true, "shards_acknowledged": true, "index": "stu" }

  • Get
    GET localhost:9200/stu
    // Response
    {
      "stu": {
        "aliases": {},      // aliases
        "mappings": {},     // mappings
        "settings": {
          "index": {
            "creation_date": "1576139082806",  // creation time
            "number_of_shards": "1",           // primary shards
            "number_of_replicas": "1",         // replicas
            "uuid": "-ocQkbgoSyG2vDTsugK_9Q",
            "version": { "created": "7020099" },
            "provided_name": "stu"
          }
        }
      }
    }

  • Delete
    DELETE localhost:9200/stu
    // Response
    { "acknowledged": true }

  • Get several
    GET localhost:9200/stu,tea
    // Response
    { "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }

  • Get all (1)
    GET localhost:9200/_all
    // Response
    { "stu": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139586417", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "H9dyTutEQg-4OsV2Byt-gA", "version": { "created": "7020099" }, "provided_name": "stu" } } }, "tea": { "aliases": {}, "mappings": {}, "settings": { "index": { "creation_date": "1576139593175", "number_of_shards": "1", "number_of_replicas": "1", "uuid": "nYhKuggbT_Wa2RI-M_COGA", "version": { "created": "7020099" }, "provided_name": "tea" } } } }

  • Get all (2)
    GET localhost:9200/_cat/indices?v
    // Response
    health status index                uuid                   pri rep docs.count docs.deleted store.size pri.store.size
    green  open   .kibana_task_manager IjZxE0H9TtmTrpBgzjr-qg   1   0          2            0     12.8kb         12.8kb
    yellow open   stu                  H9dyTutEQg-4OsV2Byt-gA   1   1          0            0       283b           283b
    yellow open   tea                  nYhKuggbT_Wa2RI-M_COGA   1   1          0            0       283b           283b

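  • Exists
    HEAD localhost:9200/stu
    // Response when the index exists: 200 OK (404 when it does not)

  • Close
    POST localhost:9200/stu/_close
    // Response
    { "acknowledged": true, "shards_acknowledged": true }

  • Open
    POST localhost:9200/stu/_open
    // Response
    { "acknowledged": true, "shards_acknowledged": true }

  • Automatic index creation
    Controls whether an index is created automatically when a document is indexed into an index that does not exist yet (indexing documents is covered below).
    Send GET http://localhost:9200/_cluster/settings to check action.auto_create_index.
    true means indices are created automatically; this is also the default when the setting is absent.
    • Change auto_create_index
      PUT localhost:9200/_cluster/settings
      // Request body (the value is true or false)
      { "persistent": { "action.auto_create_index": "true" } }

  • Copy data (combined with index aliases, this lets you rebuild an index and import its data)
    POST localhost:9200/_reindex
    { "source": { "index": "stu" }, "dest": { "index": "stu_oth" } }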

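Index aliases
During development, as requirements evolve, older business logic has to be updated or even refactored. For ES, adapting to the new logic may mean changing an existing index, for example adjusting certain fields or even rebuilding the index. Such operations can affect the business and may even require downtime. ES provides index aliases to address this. An alias works like a shortcut or symbolic link: it can point to one or more indices and can be used in any API that expects an index name. Aliases give applications a great deal of flexibility.
Several indices can share one alias, and one index can have several aliases.
Operation | Method | URL | Body
Query | GET | localhost:9200/_alias; localhost:9200/stu/_alias | -
Add | POST | localhost:9200/_aliases | see below
Add | PUT | localhost:9200/stu/_alias/stu_v1.0 | -
Remove | POST | localhost:9200/_aliases | see below
Remove | DELETE | localhost:9200/stu/_alias/stu_v1.0 | -
Rename | POST | localhost:9200/_aliases | see below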
  • Add
    POST localhost:9200/_aliases
    { "actions": [ { "add": { "index": "stu", "alias": "stu_1214" } } ] }

  • Remove
    POST localhost:9200/_aliases
    { "actions": [ { "remove": { "index": "stu", "alias": "stu_v1.1" } } ] }

  • Rename (remove the old alias and add the new one in a single request)
    POST localhost:9200/_aliases
    { "actions": [ { "remove": { "index": "stu", "alias": "stu_1214" } }, { "add": { "index": "stu", "alias": "stu_1215" } } ] }

  • When an alias points to several indices, mark one of them as the write index
    POST localhost:9200/_aliases
    { "actions": [ { "add": { "index": "stu", "alias": "alia_v1.0", "is_write_index": true } }, { "add": { "index": "tea", "alias": "alia_v1.0" } } ] }
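
The rebuild-and-switch workflow that aliases enable can be sketched as two request bodies: one for POST /_reindex and one for POST /_aliases. The helper names reindex_body and alias_swap_actions are hypothetical, and the sketch only builds the JSON; sending it (with curl or any HTTP client) is left out. All actions in one _aliases request are applied atomically, so readers of the alias never see a moment without a target index.

```python
import json


def reindex_body(source_index: str, dest_index: str) -> dict:
    """Body for POST /_reindex: copy documents from one index to another."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


def alias_swap_actions(alias: str, old_index: str, new_index: str) -> dict:
    """Body for POST /_aliases: move an alias from the old index to the new one."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Step 1: copy the data into the rebuilt index.
print(json.dumps(reindex_body("stu", "stu_oth"), ensure_ascii=False))
# Step 2: repoint the alias that applications use.
swap = alias_swap_actions("stu_1214", "stu", "stu_oth")
print(json.dumps(swap, ensure_ascii=False))
```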

Mappings
Operation | Method | URL | Body
Add | PUT | localhost:9200/stu/_mapping | see below
Get | GET | localhost:9200/stu/_mapping | -
Get several | GET | localhost:9200/stu,tea/_mapping | -
Get all (1) | GET | localhost:9200/_mapping | -
Get all (2) | GET | localhost:9200/_all/_mapping | -
Update | PUT | localhost:9200/stu/_mapping | see below
  • Add
    PUT localhost:9200/stu/_mapping
    // Request body
    { "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" } } }

  • Get
    GET localhost:9200/stu/_mapping
    // Response
    { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }

  • Get several
    GET localhost:9200/stu,tea/_mapping
    // Response
    { "tea": { "mappings": {} }, "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } } }

  • Get all (1)
    GET localhost:9200/_mapping
    // Response
    { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }

  • Get all (2)
    GET localhost:9200/_all/_mapping
    // Response
    { "stu": { "mappings": { "properties": { "age": { "type": "long" }, "class": { "type": "keyword" }, "name": { "type": "text" }, "sex": { "type": "keyword" } } } }, "tea": { "mappings": {} } }

  • Update
    Note:
    When updating a mapping you can only add fields; existing fields cannot be changed or removed.
    PUT localhost:9200/stu/_mapping
    // Request body (adds the birthday field)
    { "properties": { "name": { "type": "text" }, "age": { "type": "long" }, "sex": { "type": "keyword" }, "class": { "type": "keyword" }, "birthday": { "type": "date" } } }

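Documents
Operation | Method | URL | Body
Create (with id) | PUT | localhost:9200/stu/_doc/1 | see below
Create (without id) | POST (required) | localhost:9200/stu/_doc | see below
Specify the operation type | PUT | localhost:9200/stu/_doc/1?op_type=create | see below
Get | GET | localhost:9200/stu/_doc/1 | -
Get several documents | POST | localhost:9200/_mget | see below
Update | POST | localhost:9200/stu/_update/1 | see below
Delete | DELETE | localhost:9200/stu/_doc/1 | -
Delete all | POST | localhost:9200/stu/_delete_by_query | see below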
  • Create (with id)
    PUT localhost:9200/stu/_doc/1
    // Request body
    { "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }

    // Response
    { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 0, "_primary_term": 3 }

  • Create (without id)
    When no id is given, the system assigns one automatically.
    POST localhost:9200/stu/_doc
    // Request body
    { "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" }

    // Response (_id is the id assigned by the system)
    { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "result": "created", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 1, "_primary_term": 3 }

  • Specify the operation type
    If no operation type is given, indexing a document under an id that already exists replaces the original document and produces a new version.
    PUT localhost:9200/stu/_doc/1
    // Request body
    { "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }

    // Response (a new _version is produced and result is "updated" rather than "created")
    { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "result": "updated", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 2, "_primary_term": 3 }

    PUT localhost:9200/stu/_doc/1?op_type=create (fails when the id already exists)
    // Request body
    { "name": "杨光22", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" }

    // Response
    { "error": { "root_cause": [ { "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" } ], "type": "version_conflict_engine_exception", "reason": "[1]: version conflict, document already exists (current version [2])", "index_uuid": "H9dyTutEQg-4OsV2Byt-gA", "shard": "0", "index": "stu" }, "status": 409 }

  • Get
    GET localhost:9200/stu/_doc/1
    // Response
    { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 2, "_seq_no": 2, "_primary_term": 3, "found": true, "_source": { "name": "杨光11", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }

  • Get several documents
    In 7.x, specifying _type in the body (methods 1 and 2) or a type in the URL (method 3) is deprecated; POST localhost:9200/stu/_mget with only _id entries is the typeless form.
    1. Method 1
      POST localhost:9200/_mget
      // Request body
      { "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1" }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }

      // Response
      { "docs": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 3, "_seq_no": 3, "_primary_term": 3, "found": true, "_source": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }, { "_index": "stu", "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er", "_version": 1, "_seq_no": 1, "_primary_term": 3, "found": true, "_source": { "name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01" } } ] }

    2. Method 2
      POST localhost:9200/stu/_mget
      // Request body
      { "docs": [ { "_type": "_doc", "_id": "1" }, { "_type": "_doc", "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }

    3. Method 3
      POST localhost:9200/stu/_doc/_mget
      // Request body
      { "docs": [ { "_id": "1" }, { "_id": "C_SI-W4Bj7nk6pLmw4Er" }] }

  • Update
    1. Update the document with a partial document
      POST localhost:9200/stu/_update/1
      // Request body
      { "doc": { "name": "杨光33", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } }

    2. Add a field to _source
      POST localhost:9200/stu/_update/1
      // Request body
      { "script": "ctx._source.height = \"173cm\"" }

    3. Remove a field from _source
      POST localhost:9200/stu/_update/1
      // Request body
      { "script": "ctx._source.remove(\"height\")" }

  • Delete
    DELETE localhost:9200/stu/_doc/1
    // Response
    { "_index": "stu", "_type": "_doc", "_id": "1", "_version": 6, "result": "deleted", "_shards": { "total": 2, "successful": 1, "failed": 0 }, "_seq_no": 6, "_primary_term": 3 }

  • Delete all
    POST localhost:9200/stu/_delete_by_query
    // Request body
    { "query": { "match_all": {} } }
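
Returning to the operation type described earlier in this section, the difference between the default index operation and op_type=create can be modelled with a small in-memory store. This is a sketch of the observable behaviour only (version numbers, created/updated results, the conflict on create); the names put_document and VersionConflict are hypothetical and nothing here talks to Elasticsearch.

```python
class VersionConflict(Exception):
    """Raised when op_type=create targets an id that already exists."""


def put_document(store: dict, doc_id: str, source: dict, op_type: str = "index") -> dict:
    """Mimic PUT /<index>/_doc/<id> with an optional op_type=create."""
    existing = store.get(doc_id)
    if existing is not None and op_type == "create":
        raise VersionConflict(f"[{doc_id}]: version conflict, document already exists")
    # The default "index" operation replaces the document and bumps the version.
    version = 1 if existing is None else existing["_version"] + 1
    store[doc_id] = {"_version": version, "_source": source}
    result = "created" if existing is None else "updated"
    return {"_id": doc_id, "_version": version, "result": result}


store = {}
first = put_document(store, "1", {"name": "杨光"})
second = put_document(store, "1", {"name": "杨光11"})
try:
    put_document(store, "1", {"name": "杨光22"}, op_type="create")
    conflict = False
except VersionConflict:
    conflict = True
print(first, second, conflict)
```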

Query and search
Preparing data: bulk import. ES provides the bulk API for batch operations.
  • Data (newline-delimited JSON: one action line followed by one source line per document; the file must end with a newline)
    {"index": {"_index": "stu", "_type": "_doc", "_id": 1}}
    {"name":"杨光","age":14,"sex":"1","class":"7(2)班","birthday":"2005-08-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 2}}
    {"name":"张世杰","age":13,"sex":"0","class":"7(5)班","birthday":"2004-11-01"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 3}}
    {"name":"李蒙","age":14,"sex":"0","class":"7(2)班","birthday":"2005-10-15"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 4}}
    {"name":"李沁","age":15,"sex":"0","class":"7(3)班","birthday":"2004-10-15"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 5}}
    {"name":"王昭","age":14,"sex":"1","class":"7(3)班","birthday":"2005-01-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 6}}
    {"name":"李明","age":14,"sex":"1","class":"7(2)班","birthday":"2005-03-26"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 7}}
    {"name":"张璐","age":14,"sex":"1","class":"7(5)班","birthday":"2005-06-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 8}}
    {"name":"李思敏","age":14,"sex":"1","class":"7(3)班","birthday":"2005-06-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 9}}
    {"name":"吴民锡","age":13,"sex":"1","class":"7(5)班","birthday":"2006-04-02"}
    {"index": {"_index": "stu", "_type": "_doc", "_id": 10}}
    {"name":"赵曦","age":14,"sex":"0","class":"7(2)班","birthday":"2005-09-02"}

  • POST bulk (name is the file that holds the data above)
    curl -X POST "localhost:9200/_bulk" -H 'Content-Type: application/json' --data-binary @name
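
Writing the bulk file by hand is error-prone, so a small generator can help. The sketch below builds the newline-delimited body from a dictionary of documents; build_bulk_body is a hypothetical helper, it uses the typeless action form (no _type), and only two of the sample students are shown.

```python
import json


def build_bulk_body(index: str, docs: dict) -> str:
    """Return an NDJSON bulk body: an action line, then the source, per document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_id": doc_id}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


students = {
    "1": {"name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26"},
    "2": {"name": "张世杰", "age": 13, "sex": "0", "class": "7(5)班", "birthday": "2004-11-01"},
}
body = build_bulk_body("stu", students)
print(body, end="")
```

The resulting string can be written to a file and posted with the curl command above.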

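Term queries
Term-level queries do not analyze the query text: a document matches only when an indexed term is exactly equal to the query string. They are normally used on structured data such as number, date, and keyword fields rather than on text fields. In other words, a full-text query analyzes the text first, whereas a term-level query looks the value up directly in the field's inverted index.
Operation | Method | URL | Body | Description
Single term query | POST | localhost:9200/stu/_search | see below | -
Multiple terms query | POST | localhost:9200/stu/_search | see below | -
Exists query | POST | localhost:9200/stu/_search | see below | Finds documents that have a non-null value in the given field
Prefix query | POST | localhost:9200/stu/_search | see below | Finds documents containing terms with the given prefix
Wildcard query | POST | localhost:9200/stu/_search | see below | Supports wildcards: * matches any sequence of characters, ? matches any single character
Regexp query | POST | localhost:9200/stu/_search | see below | Regular-expression query
Ids query | POST | localhost:9200/stu/_search | see below | Finds documents by id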
  • Single term query
    POST localhost:9200/stu/_search
    // Request body
    { "query":{ "term":{ "sex": "1" } } }

    // Response
    { "took": 1, "timed_out": false, "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 1, "relation": "eq" }, "max_score": 0.9808292, "hits": [ { "_index": "stu", "_type": "_doc", "_id": "1", "_score": 0.9808292, "_source": { "name": "杨光", "age": 14, "sex": "1", "class": "7(2)班", "birthday": "2005-08-26" } } ] } }

  • Multiple terms query
    POST localhost:9200/stu/_search
    // Request body
    { "query":{ "terms":{ "sex": ["0","1"] } } }

  • Exists query
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "exists": { "field": "birthday" } } }

  • Prefix query
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "prefix": { "class": { "value": "7" } } } }

  • Wildcard query
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "wildcard": { "class": { "value": "*2*" } } } }

  • Regexp query
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "regexp": { "class": "7.*" } } }

  • Ids query
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "ids": { "values": [1,2] } } }

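Full-text queries
Elasticsearch first analyzes the query string and splits it into tokens. A document matches when the analyzed field contains any one of the tokens (or all of them, depending on the query); if it contains none of them, the document does not match.
Type | Method | URL | Body | Description
match_all | POST | localhost:9200/stu/_search | see below | Returns all documents
match | POST | localhost:9200/stu/_search | see below | Analyzed (token-based) match
multi_match | POST | localhost:9200/stu/_search | see below | Match across several fields
match_phrase | POST | localhost:9200/stu/_search | see below | Phrase match: all tokens, adjacent and in order
match_phrase_prefix | POST | localhost:9200/stu/_search | see below | Phrase match in which the last token is treated as a prefix (text fields)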
  • match_all
    POST localhost:9200/stu/_search
    // Request body (from and size page through the results)
    { "query":{ "match_all":{} }, "from": 0, "size": 10 }

  • match
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "match": { "name": "张" } } }

  • multi_match
    POST localhost:9200/stu/_search
    // Request body (query is the search text, fields lists the fields to search)
    { "query": { "multi_match": { "query": "世", "fields": ["name","class"] } } }

  • match_phrase
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "match_phrase": { "class": "7(2)班" } } }

  • match_phrase_prefix
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "match_phrase_prefix": { "name": "世杰" } } }
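
The difference between a term query and a match query on a text field can be illustrated with a toy inverted index. This is a sketch under a stated assumption: the analyze function emits one token per character, which roughly mirrors how the standard analyzer treats Chinese text but is not the real implementation. All function names are hypothetical.

```python
from collections import defaultdict


def analyze(text: str) -> list:
    """Assumed analyzer: one lowercased token per non-space character."""
    return [ch.lower() for ch in text if not ch.isspace()]


def build_inverted_index(docs: dict) -> dict:
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def term_query(index: dict, value: str) -> set:
    # No analysis: the value must equal an indexed token exactly.
    return set(index.get(value, set()))


def match_query(index: dict, query: str) -> set:
    # The query text is analyzed first; any matching token is enough.
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return hits


names = {1: "杨光", 2: "张世杰", 7: "张璐"}
index = build_inverted_index(names)
print(term_query(index, "张世杰"))   # the full name is not an indexed token
print(term_query(index, "张"))
print(match_query(index, "张世杰"))
```

Under this model the full name finds nothing as a term, while the same text as a match query finds every document sharing at least one token with it.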

Range queries
Range queries work on dates, numbers, and strings.
POST localhost:9200/stu/_search
// Students aged 14 to 15
{ "query": { "range": { "age": { "gte": 14, "lte": 15 } } } }

// Students born in 2003 or 2004
{ "query": { "range": { "birthday": { "gte": "2003", "lte": "31-12-2004", "format": "dd-MM-yyyy||yyyy" } } } }

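Boolean queries
Type | Method | URL | Body | Description
must | POST | localhost:9200/stu/_search | see below | The clause must match; it contributes to the score
filter | POST | localhost:9200/stu/_search | see below | The clause must match, but it is not scored
must_not | POST | localhost:9200/stu/_search | see below | The clause must not match
should | POST | localhost:9200/stu/_search | see below | The clause should match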
  • must
    POST localhost:9200/stu/_search
    // Students whose sex is "0" and whose name contains "曦"
    { "query": { "bool": { "must": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }

    // Response (_score is the relevance score)
    { "took" : 1, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.9985561, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.9985561, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" } } ] } }

  • filter
    Matches the same documents as must, but the clauses are not scored.
    POST localhost:9200/stu/_search
    { "query": { "bool": { "filter": [ { "match": { "name": "曦" } }, { "term": { "sex": { "value": "0" } } } ] } } }

  • must_not
    POST localhost:9200/stu/_search
    // Students whose name contains "张" and whose sex is not "0"
    { "query": { "bool": { "must": [ { "match": { "name": "张" } } ], "must_not": [ { "term": { "sex": { "value": "0" } } } ] } } }

  • should
    POST localhost:9200/stu/_search
    // Students whose sex is "1"
    { "query": { "bool": { "should": [ { "term": { "sex": { "value": "1" } } } ] } } }

    When should is combined with must or filter, documents are returned even if no should clause matches; a matching should clause only raises the score.
    // Students whose name contains "李", preferring those aged 13 to 14
    { "query": { "bool": { "must": [ { "match": { "name": "李" } } ], "should": [ { "range": { "age": { "gte": 13, "lte": 14 } } } ] } } }
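
How the four clause types interact can be sketched with plain Python predicates. This is a simplified model, not Elasticsearch's scoring: the "score" here is just the number of matching should clauses, whereas in Elasticsearch must clauses contribute to the score as well. The function bool_search and the predicate names are hypothetical.

```python
def bool_search(docs, must=(), filter_=(), must_not=(), should=()):
    """Evaluate a simplified bool query over a list of dict documents."""
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in (*must, *filter_)):
            continue
        if any(clause(doc) for clause in must_not):
            continue
        should_matches = sum(1 for clause in should if clause(doc))
        # Without must/filter clauses, at least one should clause has to match.
        if should and not must and not filter_ and should_matches == 0:
            continue
        hits.append((doc["name"], should_matches))
    # Higher simplified score first; ties keep their original order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


students = [
    {"name": "李蒙", "age": 14, "sex": "0"},
    {"name": "李沁", "age": 15, "sex": "0"},
    {"name": "张璐", "age": 14, "sex": "1"},
    {"name": "赵曦", "age": 14, "sex": "0"},
]


def name_has_li(doc):
    return "李" in doc["name"]


def age_13_to_14(doc):
    return 13 <= doc["age"] <= 14


with_must = bool_search(students, must=[name_has_li], should=[age_13_to_14])
only_should = bool_search(students, should=[age_13_to_14])
print(with_must)
print(only_should)
```

With a must clause present, 李沁 (age 15) is still returned but ranks below 李蒙; with should alone, documents that match no should clause are dropped.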

Sorting
POST localhost:9200/stu/_search
// Students in class 7(5)班, sorted by age in descending order
{ "query": { "term": { "class": { "value": "7(5)班" } } }, "sort": [ { "age": { "order": "desc" } } ] }

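Aggregations
  • Aggregation is an important database feature: it computes summary values over the result set of a query, such as the maximum, minimum, sum, or average of a field (or of a computed expression). As both a search engine and a data store, ES provides powerful aggregation capabilities as well.
  • Aggregations that compute values such as max, min, sum, and average over a data set are called metric aggregations in ES.
  • Relational databases can also group the result set with GROUP BY and then compute metrics per group. In ES the grouping step is called a bucket aggregation.
Metric aggregations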
  • max, min, sum, avg
    POST localhost:9200/stu/_search
    // max: highest age in class 7(3)班 (maxAge is a user-chosen aggregation name)
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "maxAge": { "max": { "field": "age" } } }, "size": 0 }

    // min: lowest age in class 7(3)班
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "minAge": { "min": { "field": "age" } } }, "size": 0 }

    // sum: total of the ages in class 7(3)班
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "sumAge": { "sum": { "field": "age" } } }, "size": 0 }

    // avg: average age in class 7(3)班 (avgAge is a user-chosen aggregation name)
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "avgAge": { "avg": { "field": "age" } } }, "size": 0 }

  • value_count
    Counts the documents that have a value in the field.
    POST localhost:9200/stu/_search
    // Number of students in class 7(3)班 whose age is set
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "countAge": { "value_count": { "field": "age" } } }, "size": 0 }

  • cardinality
    Counts distinct values.
    POST localhost:9200/stu/_search
    // Number of distinct ages in class 7(3)班
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "cardinalityAge": { "cardinality": { "field": "age" } } }, "size": 0 }

  • stats
    Returns five values: count, max, min, avg, and sum.
    POST localhost:9200/stu/_search
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "statsAge": { "stats": { "field": "age" } } }, "size": 0 }

  • extended_stats
    Returns four more results than stats: sum of squares, variance, standard deviation, and the interval of the mean plus/minus two standard deviations.
    POST localhost:9200/stu/_search
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "extendedAge": { "extended_stats": { "field": "age" } } }, "size": 0 }

  • percentiles
    Returns the values at given percentiles; by default at [ 1, 5, 25, 50, 75, 95, 99 ].
    POST localhost:9200/stu/_search
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age" } } }, "size": 0 }

    // With explicit percentiles
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "percentilesAge": { "percentiles": { "field": "age", "percents": [ 20, 50, 75 ] } } }, "size": 0 }
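
To see what the metric aggregations compute, the same figures can be worked out locally for the three students of class 7(3)班 in the sample data (ages 15, 14, 14). This is only a sketch of the arithmetic, assuming the bulk data above has been loaded; the dictionary keys mirror the field names of the stats and extended_stats responses, and the variance is the population variance.

```python
import statistics

ages = [15, 14, 14]  # 李沁, 王昭, 李思敏 from the sample data

stats = {
    "count": len(ages),
    "min": min(ages),
    "max": max(ages),
    "avg": sum(ages) / len(ages),
    "sum": sum(ages),
}
extended = {
    "sum_of_squares": sum(age * age for age in ages),
    "variance": statistics.pvariance(ages),
    "std_deviation": statistics.pstdev(ages),
}
# Exact distinct count; the cardinality aggregation itself is approximate
# on large data sets.
cardinality = len(set(ages))

print(stats, extended, cardinality)
```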

Bucket aggregations
  • Terms aggregation: group by the values of a field
    POST localhost:9200/stu/_search
    // Group class 7(3)班 by age
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5 } } }, "size": 0 }

  • order: sorting the buckets
    POST localhost:9200/stu/_search
    // Group class 7(3)班 by age, buckets sorted by age in descending order
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_key": "desc" } } } }, "size": 0 }

    // Group class 7(3)班 by age, buckets sorted by document count in descending order
    { "query": { "term": { "class": { "value": "7(3)班" } } }, "aggs": { "aggsAge": { "terms": { "field": "age", "size": 5, "order": { "_count": "desc" } } } }, "size": 0 }

    // Group by class, buckets sorted by their average age in descending order
    { "aggs": { "aggsClass": { "terms": { "field": "class", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }

  • Filtering the buckets
    POST localhost:9200/stu/_search
    // include lists the values to keep, exclude the values to drop
    { "aggs": { "aggsClass": { "terms": { "field": "class", "include": ["7(3)班", "7(2)班", "7(5)班"], "exclude": ["7(5)班"], "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }

    // Regular-expression matching; include and exclude must be of the same kind (both arrays or both patterns).
    // Parentheses are regex operators, so they are escaped in exclude.
    { "aggs": { "aggsClass": { "terms": { "field": "class", "include": "7.*", "exclude": "7\\(5\\)班", "size": 10, "order": { "aggsAge": "desc" } }, "aggs": { "aggsAge": { "avg": { "field": "age" } } } } }, "size": 0 }

  • Range aggregation: group by value ranges
    POST localhost:9200/stu/_search
    // Buckets: below 13, 13 up to 14, 15 and above. from is inclusive and to is exclusive, so age 14 falls into none of these buckets.
    { "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13 }, { "from": 13, "to": 14 }, { "from": 15 } ] } } }, "size": 0 }

    // The same ranges with named buckets
    { "aggs": { "aggsrange": { "range": { "field": "age", "ranges": [ { "to": 13, "key":"A" }, { "from": 13, "to": 14, "key":"B" }, { "from": 15, "key":"C" } ] } } }, "size": 0 }

  • Date range aggregation: group by date ranges
    POST localhost:9200/stu/_search
    { "aggs": { "aggsrange": { "date_range": { "field": "birthday", "format": "yyyy-MM", "ranges": [ { "to": "2004-12", "key":"A" }, { "from": "2005-01", "to": "2005-12", "key":"B" }, { "from": "2006-01", "key":"C" } ] } } }, "size": 0 }

  • Date histogram aggregation
    Aggregates by day, month, year, and so on. Supported calendar intervals are year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), and minute (1m); second-level intervals such as 1s are specified with fixed_interval instead of calendar_interval.
    POST localhost:9200/stu/_search
    { "aggs": { "aggsrange": { "date_histogram": { "field": "birthday", "format": "yyyy", "calendar_interval": "year" } } }, "size": 0 }
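
The bucketing logic can be reproduced locally on the ten sample students to check what a terms or range aggregation should return. This is a sketch of the grouping rules only, assuming the bulk data above; range_buckets is a hypothetical helper that applies the same from-inclusive, to-exclusive rule as the range aggregation.

```python
from collections import Counter

# (class, age) for the ten students in the sample data, ids 1 to 10.
students = [
    ("7(2)班", 14), ("7(5)班", 13), ("7(2)班", 14), ("7(3)班", 15), ("7(3)班", 14),
    ("7(2)班", 14), ("7(5)班", 14), ("7(3)班", 14), ("7(5)班", 13), ("7(2)班", 14),
]

# Terms aggregation on class: one bucket per distinct value with its doc count.
by_class = Counter(cls for cls, _ in students)


def range_buckets(values, ranges):
    """Count values per range; 'from' is inclusive and 'to' is exclusive."""
    counts = []
    for r in ranges:
        low, high = r.get("from"), r.get("to")
        counts.append(sum(
            1 for v in values
            if (low is None or v >= low) and (high is None or v < high)
        ))
    return counts


ages = [age for _, age in students]
age_ranges = [{"to": 13}, {"from": 13, "to": 14}, {"from": 15}]
range_counts = range_buckets(ages, age_ranges)

# Sub-aggregation ordering: classes sorted by average age, highest first.
avg_age = {
    cls: sum(a for c, a in students if c == cls) / count
    for cls, count in by_class.items()
}
classes_by_avg_age = sorted(avg_age, key=avg_age.get, reverse=True)

print(by_class, range_counts, classes_by_avg_age)
```

Only three of the ten students land in a range bucket, because the seven students aged 14 fall into the gap between the second and third range.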

query_string queries
  • Single field
    POST localhost:9200/stu/_search
    { "query": { "query_string": { "default_field": "name", "query": "李 AND 思 OR 敏" } } }

  • Multiple fields
    POST localhost:9200/stu/_search
    { "query": { "query_string": { "fields": ["name", "sex"], "query": "李 AND 0" } } }

Analyzers
An analyzer is a tool that breaks a piece of user-supplied text into individual terms according to certain rules.
Built-in analyzers
  • standard analyzer
    The default analyzer; it is used when no analyzer is specified.
  • simple analyzer
    Splits the text into terms whenever it encounters a character that is not a letter, and lowercases every term.
  • whitespace analyzer
    Splits the text into terms whenever it encounters a whitespace character.
  • stop analyzer
    Much like the simple analyzer, with the one difference that it also removes stop words; by default it uses the English stop-word list.
    stopwords is a predefined stop-word list, for example (the, a, an, this, of, at).
  • language analyzers
    Analyzers for specific languages, for example english. Built-in languages: arabic, armenian, basque, bengali, brazilian, bulgarian, catalan, cjk, czech, danish, dutch, english, finnish, french, galician, german, greek, hindi, hungarian, indonesian, irish, italian, latvian, lithuanian, norwegian, persian, portuguese, romanian, russian, sorani, spanish, swedish, turkish, thai
  • pattern analyzer
    Splits the text into terms with a regular expression; the default pattern is \W+ (non-word characters).
Example:
GET /_analyze { "analyzer": "simple", "text": "Deploy a 14-day trial of Elasticsearch Service." }

// Response (start_offset and end_offset are the character offsets of the token; position is its ordinal in the token stream)
{ "tokens" : [ { "token" : "deploy", "start_offset" : 0, "end_offset" : 6, "type" : "word", "position" : 0 }, { "token" : "a", "start_offset" : 7, "end_offset" : 8, "type" : "word", "position" : 1 }, { "token" : "day", "start_offset" : 12, "end_offset" : 15, "type" : "word", "position" : 2 }, { "token" : "trial", "start_offset" : 16, "end_offset" : 21, "type" : "word", "position" : 3 }, { "token" : "of", "start_offset" : 22, "end_offset" : 24, "type" : "word", "position" : 4 }, { "token" : "elasticsearch", "start_offset" : 25, "end_offset" : 38, "type" : "word", "position" : 5 }, { "token" : "service", "start_offset" : 39, "end_offset" : 46, "type" : "word", "position" : 6 } ] }
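
The rules of the built-in analyzers described above can be approximated with a few lines of Python, which makes it easy to compare their output on the same sentence. These are rough stand-ins written with the re module, not the Lucene implementations, and the stop-word set is an assumption modeled on the default English list.

```python
import re

# Assumed stop-word list, modeled on the default English list.
ENGLISH_STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if", "in",
    "into", "is", "it", "no", "not", "of", "on", "or", "such", "that", "the",
    "their", "then", "there", "these", "they", "this", "to", "was", "will", "with",
}


def simple_analyzer(text: str) -> list:
    """Split on anything that is not a letter, then lowercase."""
    return [token.lower() for token in re.findall(r"[^\W\d_]+", text)]


def whitespace_analyzer(text: str) -> list:
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def stop_analyzer(text: str) -> list:
    """Like simple_analyzer, but drop stop words."""
    return [t for t in simple_analyzer(text) if t not in ENGLISH_STOP_WORDS]


def pattern_analyzer(text: str, pattern: str = r"\W+") -> list:
    """Split on a regular expression (default: non-word characters), lowercase."""
    return [token.lower() for token in re.split(pattern, text) if token]


sentence = "Deploy a 14-day trial of Elasticsearch Service."
for analyzer in (simple_analyzer, whitespace_analyzer, stop_analyzer, pattern_analyzer):
    print(analyzer.__name__, analyzer(sentence))
```

The simple_analyzer output lists the same tokens as the _analyze response above: the digits in "14-day" are dropped because they are not letters.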

Chinese analyzers
  • smartCN
    A simple analyzer for Chinese or mixed Chinese-English text.
    • Install (restart the service before use)
      sh elasticsearch-plugin install analysis-smartcn

    • Example:
      GET /_analyze { "analyzer": "smartcn", "text": "有限公司" }

      // Response
      { "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "word", "position" : 0 } ] }

  • IK analyzer
    A smarter, friendlier Chinese analyzer.
    • Download: https://github.com/medcl/elasticsearch-analysis-ik/releases (the plugin version must match the ES version)
    • Install: unzip it into the plugins directory under the ES installation directory
    • Example:
      GET /_analyze { "analyzer": "ik_max_word", "text": "有限公司" }

      // Response
      { "tokens" : [ { "token" : "有限公司", "start_offset" : 0, "end_offset" : 4, "type" : "CN_WORD", "position" : 0 }, { "token" : "有限", "start_offset" : 0, "end_offset" : 2, "type" : "CN_WORD", "position" : 1 }, { "token" : "公司", "start_offset" : 2, "end_offset" : 4, "type" : "CN_WORD", "position" : 2 } ] }

refresh
One might expect a newly indexed document to be searchable immediately, but that is not the case: it becomes visible to search only after the next refresh.
  • Index a document and search right away: the new document is not returned
    curl -X PUT 'localhost:9200/stu/_doc/666' -H 'Content-Type: application/json' -d '{ "name": "王丝菲" }'
    curl -X GET 'localhost:9200/stu/_search?pretty'

  • Force a refresh
    curl -X PUT 'localhost:9200/stu/_doc/667?refresh' -H 'Content-Type: application/json' -d '{ "name": "王豆豆" }'
    curl -X GET 'localhost:9200/stu/_search?pretty'

  • Change the refresh interval (the default is 1s)
    PUT localhost:9200/stu/_settings
    { "index": { "refresh_interval": "5s" } }

  • Disable refresh
    PUT localhost:9200/stu/_settings
    { "index": { "refresh_interval": "-1" } }
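
The near-real-time behaviour can be pictured with a tiny in-memory model: indexed documents sit in a buffer and only become visible to search after a refresh. The class TinyIndex is a hypothetical illustration of that idea, not how Elasticsearch is implemented internally.

```python
class TinyIndex:
    """Toy model of an index with a refresh step."""

    def __init__(self):
        self._buffer = []      # indexed, but not yet visible to search
        self._searchable = []  # visible to search

    def index(self, doc: dict, refresh: bool = False) -> None:
        self._buffer.append(doc)
        if refresh:  # models the ?refresh request parameter
            self.refresh()

    def refresh(self) -> None:
        # Everything buffered so far becomes searchable.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self) -> list:
        return list(self._searchable)


idx = TinyIndex()
idx.index({"name": "王丝菲"})
before = idx.search()                        # nothing is visible yet
idx.index({"name": "王豆豆"}, refresh=True)
after = idx.search()                         # both documents are visible now
print(before, after)
```

Note that this only concerns search: fetching a document by id (GET localhost:9200/stu/_doc/666) is real-time and does not wait for a refresh.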

Highlighting
  • Highlighting
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": {} } } }

    // Response (matches are wrapped in <em> and </em> by default)
    { "took" : 4, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "<em>赵</em>曦" ] } } ] } }

  • Custom highlighting
    Set pre_tags and post_tags to the markup that should wrap each match in place of the default <em> and </em>.
    POST localhost:9200/stu/_search
    // Request body
    { "query": { "match": { "name": "赵" } }, "highlight": { "fields": { "name": { "pre_tags": [""], "post_tags": [""] } } } }

    // Response
    { "took" : 6, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 1, "relation" : "eq" }, "max_score" : 2.4191523, "hits" : [ { "_index" : "stu", "_type" : "_doc", "_id" : "10", "_score" : 2.4191523, "_source" : { "name" : "赵曦", "age" : 14, "sex" : "0", "class" : "7(2)班", "birthday" : "2005-09-02" }, "highlight" : { "name" : [ "赵曦" ] } } ] } }

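Suggesters
Suggestions exist to give users a better search experience. They cover term checking (spelling correction) and auto-completion.
Parameters
Parameter | Description
text | The text to get suggestions for
field | The field to fetch candidate suggestions from
analyzer | The analyzer to use
size | Maximum number of suggestions returned per term
sort | How suggestions are sorted. Options: score sorts by score first, then by document frequency, then by the term itself; frequency sorts by document frequency first, then by score, then by the term itself
suggest_mode | Controls which suggestions are offered. missing (default): suggest only for terms that are not in the index; popular: suggest only terms that occur in more documents than the original term; always: always suggest matching terms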
Suggester types
  • Term suggester
    The term suggester analyzes the input text and offers term suggestions for each resulting token.
    POST localhost:9200/stu/_search
    // Request body
    { "suggest": { "MY_SUGGESTION": { "text": "7(6)班", "term": { "suggest_mode": "missing", "field": "class" } } } }

    // Response
    { "took" : 105, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(6)班", "offset" : 0, "length" : 5, "options" : [ { "text" : "7(2)班", "score" : 0.8, "freq" : 4 }, { "text" : "7(3)班", "score" : 0.8, "freq" : 3 }, { "text" : "7(5)班", "score" : 0.8, "freq" : 3 } ] } ] } }

  • Phrase suggester
    The phrase suggester builds on the term suggester and also considers the relationship between terms, such as whether they occur together in the indexed text, how close they are to each other, and their frequencies.
    POST localhost:9200/stu/_search
    // Request body
    { "suggest": { "MY_SUGGESTION": { "text": "7(2) 班", "phrase": { "field": "class" } } } }

    // Response
    { "took" : 17, "timed_out" : false, "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : { "value" : 0, "relation" : "eq" }, "max_score" : null, "hits" : [ ] }, "suggest" : { "MY_SUGGESTION" : [ { "text" : "7(2) 班", "offset" : 0, "length" : 6, "options" : [ { "text" : "7(2)班", "score" : 0.4678218 }, { "text" : "7(3)班", "score" : 0.37474233 }, { "text" : "7(5)班", "score" : 0.37474233 } ] } ] } }

  • Completion suggester
    Auto-completion: suggests what follows the text typed so far.
    POST localhost:9200/stu/_search
    // The queried field must be of type completion; MY_SUGGESTION is a user-chosen name
    { "suggest": { "MY_SUGGESTION": { "text": "I like", "completion": { "field": "selfDesc" } } } }
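
For the term suggester, the core idea is to offer indexed terms that lie within a small edit distance of the input. The sketch below implements that idea with a plain Levenshtein distance over the class values of the sample data; suggest_terms and its ordering rule (distance, then frequency, then the term) are simplifications chosen for illustration, not the scoring Elasticsearch applies.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def suggest_terms(text: str, indexed_terms: dict, max_edits: int = 2) -> list:
    """Return (term, distance, freq) for indexed terms close to the input."""
    suggestions = []
    for term, freq in indexed_terms.items():
        distance = edit_distance(text, term)
        if 0 < distance <= max_edits:
            suggestions.append((term, distance, freq))
    # Closest first, then most frequent, then alphabetical.
    return sorted(suggestions, key=lambda s: (s[1], -s[2], s[0]))


# Indexed class values with their document frequencies in the sample data.
class_terms = {"7(2)班": 4, "7(3)班": 3, "7(5)班": 3}
suggestions = suggest_terms("7(6)班", class_terms)
print(suggestions)
```

All three classes are one substitution away from "7(6)班", so they come back ordered by frequency, the same order as in the term suggester response above.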
