ES 7.x installation tutorial:
Configuring access from non-local hosts:
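Traditional database | ES |
---|---|
Database | Index |
Table | Type |
Primary-key field | id |
Row (record) | Document |

Traditional database | ES |
---|---|
Rows made up of typed fields such as varchar strings and int integers | Documents in JSON format |

Traditional database | ES |
---|---|
SQL statements | RESTful-style query DSL |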
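Scenario: a school has several classes, and each class has several students.
POST http://192.168.1.104:9200/school/class_1/xiaohong
{ "name":"小红", "age":18, "height":165, "tags":["学习认真","学霸","漂亮"] }
Here school is the index, class_1 is the type, and xiaohong is the document id. Custom type names such as class_1 still work in 7.x but are deprecated, and types are removed in 8.0.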
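As a rough illustration of how these request URLs are put together, the Python sketch below builds the URL and JSON body for the request above. The helper name `document_url` is hypothetical, the host is simply the one used in this post, and nothing is sent over the network.

```python
import json
from urllib.parse import quote

BASE_URL = "http://192.168.1.104:9200"  # host used throughout this post


def document_url(index, doc_type, doc_id=None):
    """Build /{index}/{type}/{id}; leave doc_id out to let ES generate an id."""
    parts = [index, doc_type]
    if doc_id is not None:
        parts.append(doc_id)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


student = {"name": "小红", "age": 18, "height": 165,
           "tags": ["学习认真", "学霸", "漂亮"]}
# ensure_ascii=False keeps the Chinese text readable in the request body.
body = json.dumps(student, ensure_ascii=False)

print("POST", document_url("school", "class_1", "xiaohong"))
print(body)
```

Any HTTP client can then send the body to that URL with a `Content-Type: application/json` header.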
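POST http://192.168.1.104:9200/school/class_1/
{ "name":"无名", "age":17, "height":175, "tags":["学习认真","学霸"] }
No id is given in the URL, so the "_id": "qoFDdHUB4DF8xfvRk1av" in the response is an automatically generated id.
Add one more student, xiaobai, to use for the delete example:
POST http://192.168.1.104:9200/school/class_1/xiaobai
{ "name":"小白", "age":18, "height":165 }
DELETE http://192.168.1.104:9200/school/class_1/xiaobai
The result field of the response is "deleted", which confirms that the document was removed.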
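PUT http://192.168.1.104:9200/school/class_1/xiaohong
{ "name":"小红", "age":19, "height":170, "tags":["学习认真","学霸","漂亮"] }
This PUT replaces the entire document rather than part of it. If the body contained only age and height, the other fields (name and tags) would be lost.
POST http://192.168.1.104:9200/school/class_1/xiaohong/_update
{ "doc":{ "age":20 } }
The _update endpoint changes only the fields listed under doc. After each modification, _version increases by 1.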
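The difference between a full replace and a partial update can be mimicked with plain dictionaries. This is only a sketch: `TinyStore` is a hypothetical in-memory stand-in, not an Elasticsearch client, and it ignores details such as no-op detection and recursive merging of nested objects.

```python
class TinyStore:
    """In-memory stand-in for one index (hypothetical, for illustration only)."""

    def __init__(self):
        self.docs = {}      # id -> source dict
        self.versions = {}  # id -> version counter

    def _bump(self, doc_id):
        self.versions[doc_id] = self.versions.get(doc_id, 0) + 1
        return self.versions[doc_id]

    def put(self, doc_id, source):
        # Like PUT: the stored document is replaced wholesale.
        self.docs[doc_id] = dict(source)
        return self._bump(doc_id)

    def update(self, doc_id, doc):
        # Like _update with "doc": only the given fields are overwritten.
        self.docs[doc_id] = {**self.docs[doc_id], **doc}
        return self._bump(doc_id)


store = TinyStore()
store.put("xiaohong", {"name": "小红", "age": 18, "height": 165, "tags": ["学霸"]})

v2 = store.put("xiaohong", {"age": 19, "height": 170})
after_put = dict(store.docs["xiaohong"])   # name and tags are gone

v3 = store.update("xiaohong", {"age": 20})
after_update = dict(store.docs["xiaohong"])  # height is kept, age changed

print(v2, after_put)
print(v3, after_update)
```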
GET http://192.168.1.104:9200/school/class_1/xiaohong
Prepare some data:
POST http://192.168.1.104:9200/school/class_1/xiaoli
{ "name":"小李", "age":22, "height":176, "tags":["调皮","学渣"] }
GET http://192.168.1.104:9200/school/class_1/_search
Query result:
{
  "took": 138,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "school", "_type": "class_1", "_id": "xiaohong", "_score": 1.0,
        "_source": { "name": "小红", "age": 20, "height": 170, "tags": [ "学习认真", "学霸", "漂亮" ] }
      },
      {
        "_index": "school", "_type": "class_1", "_id": "qoFDdHUB4DF8xfvRk1av", "_score": 1.0,
        "_source": { "name": "无名", "age": 17, "height": 175, "tags": [ "学习认真", "学霸" ] }
      },
      {
        "_index": "school", "_type": "class_1", "_id": "xiaoli", "_score": 1.0,
        "_source": { "name": "小李", "age": 22, "height": 176, "tags": [ "调皮", "学渣" ] }
      }
    ]
  }
}
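To show where the useful parts of this response live, here is a small Python sketch that pulls the total and the matched documents out of `hits`. The response text is a trimmed copy of the result above, and the function name `summarize` is made up for this example.

```python
import json

# Trimmed copy of the search response shown above.
response_text = """
{"took": 138, "timed_out": false,
 "hits": {"total": {"value": 3, "relation": "eq"}, "max_score": 1.0,
  "hits": [
   {"_id": "xiaohong", "_score": 1.0,
    "_source": {"name": "小红", "age": 20, "height": 170}},
   {"_id": "qoFDdHUB4DF8xfvRk1av", "_score": 1.0,
    "_source": {"name": "无名", "age": 17, "height": 175}},
   {"_id": "xiaoli", "_score": 1.0,
    "_source": {"name": "小李", "age": 22, "height": 176}}
  ]}}
"""


def summarize(response):
    """Return (total hit count, [(id, name, age), ...]) from a search response."""
    hits = response["hits"]
    rows = [(h["_id"], h["_source"]["name"], h["_source"]["age"])
            for h in hits["hits"]]
    return hits["total"]["value"], rows


total, rows = summarize(json.loads(response_text))
print(total)
for row in rows:
    print(row)
```

The documents themselves are always under `hits.hits[*]._source`; `hits.total.value` is the match count, which can be larger than the number of documents returned on one page.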
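Query the students aged 18 and above:
POST http://192.168.1.104:9200/school/class_1/_search
{ "query":{ "bool":{ "filter":[{ "range":{ "age":{ "gte":18 } } }] } }, "from":0, "size":10 }
query is the top-level search clause. bool is a compound query that combines other clauses. filter holds clauses that a document must match but that do not contribute to the score. range matches values within a range. from and size control paging.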
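When the threshold or page size varies, the same request body can be generated in code. The sketch below is one possible way to do that in Python; the function name `age_filter_search` and its parameters are assumptions made for this example, and the result is only printed, not sent.

```python
import json


def age_filter_search(min_age, offset=0, size=10):
    """Build a search body that keeps documents with age >= min_age, unscored."""
    return {
        "query": {
            "bool": {
                # filter clauses must match but do not affect the score
                "filter": [{"range": {"age": {"gte": min_age}}}]
            }
        },
        "from": offset,  # number of hits to skip
        "size": size,    # number of hits to return
    }


body = age_filter_search(18)
print(json.dumps(body, indent=2))
```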
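The DSL has a complex structure with many variations, so a single worked example is the easiest way in; the ways of combining searches and aggregations are too numerous to cover here and are left for a later post. Queries can be scored (for example match) or unscored (filters). A bool query combines must, must_not, should and filter clauses, and the usual leaf queries are term, terms and range. Metric aggregations include avg, max, sum, min, value_count (the number of values) and cardinality (the number of distinct values, the equivalent of a distinct count in a traditional database). The terms aggregation puts documents into one bucket per distinct value. All of this is considerably richer than SQL in both content and structure. The most common uses are querying, aggregation, full-text search, relevance scoring and highlighted search.
In the example below, the term clause targets name.keyword rather than name. This assumes the default dynamic mapping, where name is an analyzed text field and the exact value is kept in the name.keyword subfield.
{
  "from": 0,
  "size": 10,
  "query": {
    "bool": {
      "must": [ { "range": { "age": { "gte": 0 } } } ],
      "must_not": [ { "range": { "age": { "lte": 0 } } } ],
      "should": [ { "term": { "name.keyword": "小白" } } ],
      "filter": [ { "range": { "age": { "gte": 18 } } } ]
    }
  },
  "aggs": {
    "names": {
      "terms": { "field": "age" },
      "aggs": {
        "stu_avg_height": { "avg": { "field": "height" } }
      }
    },
    "age_gte_18_count": { "value_count": { "field": "age" } }
  }
}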
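To make the aggregation part of this request concrete, the following Python sketch reproduces its logic locally on the three documents from the earlier search result: keep the students aged 18 or more, bucket them by age, average the height per bucket, and count the age values. It is a simulation under stated assumptions, not what Elasticsearch runs internally, and the helper name `terms_with_avg` is invented here.

```python
from collections import defaultdict

docs = [
    {"name": "小红", "age": 20, "height": 170},
    {"name": "无名", "age": 17, "height": 175},
    {"name": "小李", "age": 22, "height": 176},
]


def terms_with_avg(documents, bucket_field, avg_field):
    """One bucket per distinct bucket_field value, with the mean of avg_field."""
    groups = defaultdict(list)
    for doc in documents:
        groups[doc[bucket_field]].append(doc[avg_field])
    buckets = [
        {"key": key, "doc_count": len(values), "avg": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    # Assumption: biggest buckets first, ties broken by ascending key.
    buckets.sort(key=lambda b: (-b["doc_count"], b["key"]))
    return buckets


# The bool query above effectively keeps documents with age >= 18.
adults = [d for d in docs if d["age"] >= 18]

buckets = terms_with_avg(adults, "age", "height")   # "names" + "stu_avg_height"
value_count = sum(1 for d in adults if "age" in d)  # "age_gte_18_count"

print(buckets)
print(value_count)
```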
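A few other commonly used clauses follow.
A field equals an exact value. term does not analyze its input, so with the default dynamic mapping the exact value has to be matched against name.keyword rather than the analyzed text field name:
{ "term":{ "name.keyword":"小红" } }
A field matches any of several values:
Add a terms clause to filter, keyed by the field name (field_name, value1 and value2 are placeholders):
{ "terms":{ "field_name":["value1","value2"] } }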
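Range filtering:
{ "range":{ "age":{ "gte":18 } } }
Time ranges (now is the current time; d is days, s is seconds, w is weeks, and so on for the other units):
{ "range":{ "finished_time":{ "gt":"now-1d" } } }
An explicit time range with a date format and a time zone:
{ "range":{ "finished_time":{ "gte":"2020-09-27 16:10:10", "lte":"2020-10-27 16:10:10", "format":"yyyy-MM-dd HH:mm:ss", "time_zone":"+08:00" } } }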
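To make expressions such as now-1d less abstract, here is a small Python sketch that resolves a simplified subset of them against a given clock value. It is an illustration only: it handles a single `+` or `-` step with the fixed-length units s, m, h, d and w, and deliberately leaves out months, years, rounding and chained operations. The function name `resolve_date_math` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Fixed-length units only; calendar units are out of scope for this sketch.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days", "w": "weeks"}
_PATTERN = re.compile(r"^now(?:([+-])(\d+)([smhdw]))?$")


def resolve_date_math(expr, now):
    """Resolve 'now', 'now-1d', 'now+2w', ... relative to the given datetime."""
    match = _PATTERN.match(expr)
    if match is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    sign, amount, unit = match.groups()
    if sign is None:  # plain "now"
        return now
    delta = timedelta(**{_UNITS[unit]: int(amount)})
    return now + delta if sign == "+" else now - delta


now = datetime(2020, 10, 27, 16, 10, 10, tzinfo=timezone(timedelta(hours=8)))
print(resolve_date_math("now-1d", now).isoformat())
```

With this clock value, `"gt": "now-1d"` keeps documents whose finished_time is later than the same time on the previous day.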
Bucketing by time (date histogram aggregation), here in fixed 10-minute intervals:
"aggs":{ "my_date_buckets":{ "date_histogram":{ "field":"finished_time", "fixed_interval":"10m" } } }
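Find the largest bucket (the example below finds the bucket with the most documents). buckets_path must reference the sibling aggregation by its name, here my_date_buckets:
"aggs":{ "my_date_buckets":{ "date_histogram":{ "field":"finished_time", "fixed_interval":"10m" } }, "my_max_buckets":{ "max_bucket":{ "buckets_path":"my_date_buckets>_count" } } }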
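The buckets can also be sorted and limited to the top N. date_histogram accepts order but has no size parameter, so the example below uses a bucket_sort sub-aggregation to sort the buckets by document count in descending order and keep the first 10 (top_10_by_count is an arbitrary name):
"aggs":{
  "my_date_buckets":{
    "date_histogram":{ "field":"finished_time", "fixed_interval":"10m" },
    "aggs":{
      "top_10_by_count":{ "bucket_sort":{ "sort":[{ "_count":{ "order":"desc" } }], "size":10 } }
    }
  }
}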
Reposted from: http://yfiu.baihongyu.com/