
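ES7: wrong total returned when a query matches more than 10,000 documents

Elasticsearch version: 7.10.2

When Java code queries Elasticsearch through the REST API and the index holds more than 10,000 matching documents, total always comes back as 10000. The code is as follows: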

// The class names suggest Apache HttpClient (HttpPost, StringEntity, EntityUtils) and fastjson (JSONObject).
JSONObject query = new JSONObject();
JSONObject q = new JSONObject();
q.put("bool", new JSONObject()); // an empty bool query matches every document
query.put("query", q);
query.put("from", 0);
query.put("size", 1);
while (true) {
    HttpPost post = new HttpPost(sourceUrl + "/" + indexName + "/_search");
    HttpEntity entity = new StringEntity(query.toString());
    post.setEntity(entity);
    post.addHeader("Authorization", "Basic " + getAuthorization(sourceUsername, sourcePassword));
    post.addHeader("Content-Type", "application/json");
    HttpResponse response = httpClient.execute(post);
    JSONObject ret = JSONObject.parseObject(EntityUtils.toString(response.getEntity()));
    int total = ret.getJSONObject("hits").getJSONObject("total").getInteger("value");
    // Print total: once more than 10,000 documents match, total is always 10000.
    // Paging and the loop's exit condition are omitted here.
}

Solution: add the "track_total_hits" parameter to the search request.

The code is as follows:

JSONObject query = new JSONObject();
JSONObject q = new JSONObject();
q.put("bool", new JSONObject()); // an empty bool query matches every document
query.put("query", q);
query.put("from", 0);
query.put("size", 1);
query.put("track_total_hits", true); // added line: return the exact count even when more than 10,000 documents match
while (true) {
    HttpPost post = new HttpPost(sourceUrl + "/" + indexName + "/_search");
    HttpEntity entity = new StringEntity(query.toString());
    post.setEntity(entity);
    post.addHeader("Authorization", "Basic " + getAuthorization(sourceUsername, sourcePassword));
    post.addHeader("Content-Type", "application/json");
    HttpResponse response = httpClient.execute(post);
    JSONObject ret = JSONObject.parseObject(EntityUtils.toString(response.getEntity()));
    int total = ret.getJSONObject("hits").getJSONObject("total").getInteger("value");
    // Paging and the loop's exit condition are omitted here.
}

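Brief explanation:

Elasticsearch 7.0 integrated Lucene 8 and introduced this behavior along with it. The track_total_hits parameter controls how Elasticsearch computes the total number of matching documents when it returns search results.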

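track_total_hits defaults to 10,000. Elasticsearch counts hits accurately up to that threshold. Once a query matches more documents than that, it stops counting and returns a lower bound instead of the exact number: hits.total.value is 10000 and hits.total.relation is "gte". Setting track_total_hits to false turns total-hit tracking off altogether.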
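The lower-bound behavior described above can be told apart from an exact count by reading hits.total.relation next to hits.total.value. The following is a minimal sketch with no external dependencies; the class name TotalHits and its methods are hypothetical helpers made up for illustration, not part of any Elasticsearch client.

```java
/**
 * Minimal sketch: interpreting the "hits.total" object of an
 * Elasticsearch 7.x search response. TotalHits is a hypothetical
 * helper, not an Elasticsearch client class.
 */
final class TotalHits {
    final long value;       // hits.total.value
    final String relation;  // hits.total.relation: "eq" or "gte"

    TotalHits(long value, String relation) {
        this.value = value;
        this.relation = relation;
    }

    /** True when value is the exact number of matching documents. */
    boolean isExact() {
        return "eq".equals(relation);
    }

    /** Human-readable form that makes a lower bound visible. */
    String describe() {
        return (isExact() ? "exactly " : "at least ") + value;
    }

    public static void main(String[] args) {
        // With fastjson, as in the article's snippet, the two fields could be read as:
        //   JSONObject t = ret.getJSONObject("hits").getJSONObject("total");
        //   new TotalHits(t.getLongValue("value"), t.getString("relation"));

        // Default threshold, query matching more than 10,000 documents:
        System.out.println(new TotalHits(10000, "gte").describe()); // at least 10000
        // Query matching fewer documents than the threshold:
        System.out.println(new TotalHits(4213, "eq").describe());   // exactly 4213
    }
}
```

Checking relation before trusting value avoids silently reporting 10000 as if it were the real count.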

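Setting track_total_hits to true forces Elasticsearch to count every matching document, so hits.total.value is exact and hits.total.relation is "eq". The trade-off is slower queries, because every match has to be visited.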
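Besides true and false, track_total_hits also accepts an integer, which tells Elasticsearch to count accurately only up to that number. The sketch below builds the three request-body variants as plain strings. The class TrackTotalHitsBody and its method names are hypothetical, and it uses match_all with size 0 (a count-only request) in place of the article's empty bool query with size 1.

```java
/**
 * Minimal sketch: request bodies for the three track_total_hits modes.
 * TrackTotalHitsBody is a hypothetical helper written for illustration.
 */
final class TrackTotalHitsBody {

    /** Exact count of all matches ("track_total_hits": true). */
    static String exact() {
        return body("true");
    }

    /** Accurate count up to limit; beyond it the total is a lower bound. */
    static String upTo(int limit) {
        return body(Integer.toString(limit));
    }

    /** No total-hit tracking at all ("track_total_hits": false). */
    static String disabled() {
        return body("false");
    }

    // size 0 asks for no documents, only the hit count metadata.
    private static String body(String trackTotalHits) {
        return "{\"query\":{\"match_all\":{}},\"size\":0,"
                + "\"track_total_hits\":" + trackTotalHits + "}";
    }

    public static void main(String[] args) {
        System.out.println(exact());
        System.out.println(upTo(50000));
        System.out.println(disabled());
    }
}
```

An integer bound can be a reasonable middle ground when an exact count matters only up to some number, for example to cap how many pages a UI offers.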


Original article: https://blog.csdn.net/knowwait/article/details/143686546
