Elasticsearch query to return all records

I have a small database in Elasticsearch and, for testing purposes, would like to pull all records back. I am attempting to use a URL of the form...

http://localhost:9200/foo/_search?pretty=true&q={'matchAll':{''}}

Can someone give me the URL you would use to accomplish this, please?

I think Lucene syntax is supported, so:

http://localhost:9200/foo/_search?pretty=true&q=*:*

size defaults to 10, so you may also need &size=BIGNUMBER to get more than 10 items (where BIGNUMBER is a number you believe is larger than your dataset).
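As a minimal sketch of assembling that URL in code (Python is my choice here, not the asker's; the helper name `build_search_url` and the size of 1000 are illustrative assumptions), the query string can be built with the standard library so that `*:*` is percent-encoded correctly:

```python
from urllib.parse import urlencode


def build_search_url(base, index, size, query="*:*"):
    """Return a URI-search URL asking for up to `size` hits matching `query`."""
    # urlencode percent-encodes the Lucene query, e.g. "*:*" becomes "%2A%3A%2A".
    params = urlencode({"pretty": "true", "q": query, "size": size})
    return f"{base}/{index}/_search?{params}"


# Hypothetical host and index, matching the URLs used in this thread.
url = build_search_url("http://localhost:9200", "foo", 1000)
print(url)
```

The resulting URL can be pasted into a browser or passed to curl unchanged.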

However, the Elasticsearch documentation suggests using the scan search type for large result sets.

For example:

 curl -XGET 'localhost:9200/foo/_search?search_type=scan&scroll=10m&size=50' -d ' { "query" : { "match_all" : {} } }'

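Then keep requesting as the documentation linked above suggests.

EDIT: scan is deprecated as of 2.1.0.

scan does not provide any benefit over a regular scroll request sorted by _doc. Link to the Elastic docs (spotted by @christophe-roussy).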
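For illustration, the request body for such a regular scroll sorted by _doc might look like the sketch below. The helper name `scroll_all_body` and the page size are my own assumptions; the body would be sent to something like /foo/_search?scroll=10m in place of the scan request above.

```python
import json


def scroll_all_body(page_size=1000):
    """Build a match_all body sorted by _doc, the cheapest order for scrolling."""
    return {
        "size": page_size,             # hits returned per scroll page
        "sort": ["_doc"],              # index order, no scoring work needed
        "query": {"match_all": {}},
    }


body = scroll_all_body(50)
print(json.dumps(body))
```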

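 http://127.0.0.1:9200/foo/_search/?size=1000&pretty=1

Note the size parameter, which raises the number of hits returned from the default (10) to 1000. (For a plain search this is the total across shards; size was applied per shard only with the old scan search type.)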

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-from-size.html

Elasticsearch (ES) supports both GET and POST requests for fetching data from an ES cluster index.

When we do a GET:

 http://localhost:9200/[your index name]/_search?size=[no of records you want]&q=*:*

When we do a POST:

 http://localhost:9200/[your_index_name]/_search
 {
   "size": [your value],        //default 10
   "from": [your start index],  //default 0
   "query": { "match_all": {} }
 }

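I would suggest using a UI plugin for Elasticsearch, http://mobz.github.io/elasticsearch-head/ . It will help you get a better feel for the indices you create and also test them.

The query below returns the NO_OF_RESULTS you would like returned.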

 curl -XGET 'localhost:9200/foo/_search?size=NO_OF_RESULTS' -d ' { "query" : { "match_all" : {} } }'

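Now, the problem here is that you want all the records returned. So naturally, before writing the query, you won't know the value of NO_OF_RESULTS.

How do we know how many records exist in the index? Simply run the query below:

 curl -XGET 'localhost:9200/foo/_search'

This gives you a result that looks like the one below:

 { "hits" : { "total" : 2357, "hits" : [ { ..................

The total tells you how many records are available in the index. So that is a good way to find the value of NO_OF_RESULTS.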
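Reading that total out of a parsed response could look like the following sketch. The function name `total_hits` is hypothetical. It also handles Elasticsearch 7 and later, where hits.total is an object with a "value" field rather than the plain number shown above.

```python
def total_hits(response):
    """Return the total hit count from a parsed _search response."""
    total = response["hits"]["total"]
    # Elasticsearch 7+ reports {"value": N, "relation": "eq" | "gte"};
    # older versions report a plain integer.
    if isinstance(total, dict):
        return total["value"]
    return total


# Sample responses shaped like the output in this answer (hits list left empty).
old_style = {"hits": {"total": 2357, "hits": []}}
new_style = {"hits": {"total": {"value": 2357, "relation": "eq"}, "hits": []}}
print(total_hits(old_style), total_hits(new_style))
```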

 curl -XGET 'localhost:9200/_search'

Searches all types in all indices.

 curl -XGET 'localhost:9200/foo/_search'

Searches all types in the foo index.

 curl -XGET 'localhost:9200/foo1,foo2/_search'

Searches all types in the foo1 and foo2 indices.

 curl -XGET 'localhost:9200/f*/_search'

Searches all types in any index whose name begins with f.

 curl -XGET 'localhost:9200/_all/type1,type2/_search'

Searches the given types (type1 and type2 here, for example user and tweet) in all indices.
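The path patterns above can be generated programmatically. This is only a sketch with a hypothetical helper, `search_path`; note that types in the URL apply to older Elasticsearch versions that still have mapping types.

```python
def search_path(indices=None, types=None):
    """Build a _search path for the given index names/patterns and types."""
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        # Types need an index segment in front of them; _all means every index.
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/" + "/".join(parts)


print(search_path())                          # all indices
print(search_path(["foo1", "foo2"]))          # two named indices
print(search_path(["f*"]))                    # wildcard pattern
print(search_path(types=["type1", "type2"]))  # two types in all indices
```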

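Using server:9200/_stats also gives you statistics about all your aliases, such as the size and number of elements per alias, which is very useful.

Simple! You can use the size and from parameters!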

 http://localhost:9200/[your index name]/_search?size=1000&from=0

Then you increase from step by step until you have fetched all the data.
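A sketch of that from/size loop follows. The `fetch_page` callable is an assumption standing in for whatever issues the HTTP request and returns the list of hits; a fake in-memory version is used here so the loop can be run without a cluster.

```python
def fetch_all(fetch_page, page_size=1000):
    """Collect every hit by advancing `from` in steps of `page_size`."""
    offset = 0
    hits = []
    while True:
        page = fetch_page(offset, page_size)
        hits.extend(page)
        # A short (or empty) page means the end of the result set was reached.
        if len(page) < page_size:
            break
        offset += page_size
    return hits


# Stand-in for a real request to /[index]/_search?size=...&from=...
DATA = list(range(2357))


def fake_fetch(from_, size):
    return DATA[from_:from_ + size]


result = fetch_all(fake_fetch, 1000)
print(len(result))
```

Deep from/size paging gets more expensive as the offset grows, which is why other answers in this thread recommend scroll for large result sets.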

This is the best solution I found using the Python client:

 from elasticsearch import Elasticsearch

 es = Elasticsearch()  # assumes a node on localhost:9200

 # Initialize the scroll
 page = es.search(
     index='yourIndex',
     doc_type='yourType',
     scroll='2m',
     search_type='scan',
     size=1000,
     body={
         # Your query's body
     })
 sid = page['_scroll_id']
 scroll_size = page['hits']['total']

 # Start scrolling
 while scroll_size > 0:
     print("Scrolling...")
     page = es.scroll(scroll_id=sid, scroll='2m')
     # Update the scroll ID
     sid = page['_scroll_id']
     # Get the number of results that we returned in the last scroll
     scroll_size = len(page['hits']['hits'])
     print("scroll size: " + str(scroll_size))
     # Do something with the obtained page

https://gist.github.com/drorata/146ce50807d16fd4a6aa

Using the Java client:

 import static org.elasticsearch.index.query.QueryBuilders.*;

 QueryBuilder qb = termQuery("multi", "test");

 SearchResponse scrollResp = client.prepareSearch(test)
         .addSort(FieldSortBuilder.DOC_FIELD_NAME, SortOrder.ASC)
         .setScroll(new TimeValue(60000))
         .setQuery(qb)
         .setSize(100).execute().actionGet(); // 100 hits per shard will be returned for each scroll
 // Scroll until no hits are returned
 do {
     for (SearchHit hit : scrollResp.getHits().getHits()) {
         // Handle the hit...
     }
     scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
             .setScroll(new TimeValue(60000)).execute().actionGet();
 } while (scrollResp.getHits().getHits().length != 0); // Zero hits mark the end of the scroll and the while loop.

https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-search-scrolling.html

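The best way to adjust the size is to add size=number to the URL's query string:

 curl -XGET "http://localhost:9200/logstash-*/_search?size=50&pretty"

Note: the maximum value you can set for size is 10000 by default (the limit is the index.max_result_window setting and applies to from + size). For anything above ten thousand, Elasticsearch expects you to use the scroll API, which minimizes the performance impact.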
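A small guard like the following sketch could decide up front whether a request fits in the window or needs scroll. The function name `needs_scroll` is hypothetical, and the constant assumes the default index.max_result_window of 10000 has not been changed on the index.

```python
MAX_RESULT_WINDOW = 10000  # assumed default of index.max_result_window


def needs_scroll(from_, size, window=MAX_RESULT_WINDOW):
    """Return True when from + size would exceed the result window."""
    return from_ + size > window


print(needs_scroll(0, 50))       # small page, plain search is fine
print(needs_scroll(9500, 1000))  # would reach past the window
```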

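If you just pass some big number as the size, Elasticsearch becomes significantly slower. One way to get all documents is to use scan and scroll IDs.

So your call would be:

 GET /foo/_search?search_type=scan&scroll=1m
 {
     "query": { "match_all": {}},
     "size": 1000
 }

This returns a _scroll_id, which you can now use to get the first batch of documents.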

https://www.elastic.co/guide/en/elasticsearch/guide/current/scan-scroll.html

 http://localhost:9200/foo/_search/?size=1000&pretty=1

You will need to specify the size query parameter, as the default is 10.

You can use the _count API to get the value for the size parameter:

 http://localhost:9200/foo/_count?q=<your query>

It returns {count:X, ...}. Extract the value X and then run the actual query:

 http://localhost:9200/foo/_search?q=<your query>&size=X
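The two requests can be chained as in this sketch. `get_json` is an assumed callable that performs an HTTP GET and returns the parsed JSON; a fake one is used below so the flow can be run without a cluster, and the query `user:kimchy` is only a placeholder.

```python
from urllib.parse import urlencode


def count_then_search(get_json, base, index, query):
    """Ask _count for the number of matches, then request exactly that many hits."""
    count_url = f"{base}/{index}/_count?{urlencode({'q': query})}"
    count = get_json(count_url)["count"]
    search_url = f"{base}/{index}/_search?{urlencode({'q': query, 'size': count})}"
    return get_json(search_url)


# Fake transport that records the URLs it is asked for.
calls = []


def fake_get_json(url):
    calls.append(url)
    if "/_count?" in url:
        return {"count": 42}
    return {"hits": {"total": 42, "hits": []}}


response = count_then_search(fake_get_json, "http://localhost:9200", "foo", "user:kimchy")
print(calls)
```

Documents indexed between the two calls are not covered by the count, so this fits small, mostly static data sets best.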

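A few of the answers correctly suggest scan and scroll, but I could not find a complete answer that just works. To pull all the records, first run the following curl command: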

 curl -XGET 'http://ip1:9200/myindex/_search?scroll=1m' -d ' { "query": { "match_all" : {} } } '

But we are not done here. The output of the above curl command looks something like this:

 {"_scroll_id":"c2Nhbjs1OzUyNjE6NU4tU3BrWi1UWkNIWVNBZW43bXV3Zzs1Mzc3OkhUQ0g3VGllU2FhemJVNlM5d2t0alE7NTI2Mjo1Ti1TcGtaLVRaQ0hZU0FlbjdtdXdnOzUzNzg6SFRDSDdUaWVTYWF6YlU2Uzl3a3RqUTs1MjYzOjVOLVNwa1otVFpDSFlTQWVuN211d2c7MTt0b3RhbF9oaXRzOjIyNjAxMzU3Ow==","took":109,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":22601357,"max_score":0.0,"hits":[]}}

It is important to keep the _scroll_id handy, because next you should run the following command:

  curl -XGET 'localhost:9200/_search/scroll' -d' { "scroll" : "1m", "scroll_id" : "c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1" } '

However, I don't think it is easy to run this manually. Your best bet is to write Java code to do the same.

 private TransportClient client = null;
 private Settings settings = ImmutableSettings.settingsBuilder()
         .put(CLUSTER_NAME, "cluster-test").build();
 private SearchResponse scrollResp = null;

 this.client = new TransportClient(settings);
 this.client.addTransportAddress(new InetSocketTransportAddress("ip", port));

 QueryBuilder queryBuilder = QueryBuilders.matchAllQuery();
 scrollResp = client.prepareSearch(index).setSearchType(SearchType.SCAN)
         .setScroll(new TimeValue(60000))
         .setQuery(queryBuilder)
         .setSize(100).execute().actionGet();

 scrollResp = client.prepareSearchScroll(scrollResp.getScrollId())
         .setScroll(new TimeValue(timeVal))
         .execute()
         .actionGet();

Now loop on the last command, using the SearchResponse to extract the data.
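The shape of that loop, independent of any client library, can be sketched as below (in Python rather than Java, to keep one language across the added examples). `search` and `scroll` are assumed callables wrapping the two requests shown above, and the fake cluster exists only so the loop can be exercised. The sketch assumes a plain scroll without search_type=scan, where the first response already contains hits.

```python
def scroll_hits(search, scroll, keep_alive="1m"):
    """Yield every hit, following _scroll_id until a page comes back empty."""
    page = search(keep_alive)
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit
        # Always use the most recent scroll ID; it may change between pages.
        page = scroll(page["_scroll_id"], keep_alive)


class FakeCluster:
    """In-memory stand-in that serves documents in fixed-size pages."""

    def __init__(self, docs, page_size):
        self.docs, self.page_size, self.pos = docs, page_size, 0

    def _page(self):
        batch = self.docs[self.pos:self.pos + self.page_size]
        self.pos += len(batch)
        return {"_scroll_id": "fake-id", "hits": {"hits": batch}}

    def search(self, keep_alive):
        return self._page()

    def scroll(self, scroll_id, keep_alive):
        return self._page()


docs = [{"_id": str(i)} for i in range(250)]
cluster = FakeCluster(docs, page_size=100)
collected = list(scroll_hits(cluster.search, cluster.scroll))
print(len(collected))
```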

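Another suggestion is to use size=0. Be aware, though, that on a search request size=0 returns no hits at all, only hits.total (plus any aggregations), so it tells you how many documents match rather than returning them. Example:

 curl -XGET 'localhost:9200/index/type/_search' -d ' { "size": 0, "query" : { "match_all" : {} } }'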
