Elasticsearch Java 8.5: retrieve all Ids - java

I am trying to retrieve all (elasticsearch) Ids of an index using the following code.
SearchRequest sr = SearchRequest.of(r -> r
.index("my_index")
.source(s -> s.fetch(false))
.size(10000));
System.out.println("Request: " + sr);
final SearchResponse<MyDoc> response;
try {
response = this.esClient.search(sr, MyDoc.class);
} catch (ElasticsearchException | IOException e) {
throw new MyException("fetch all ids: request failed: " + e.getMessage(), e);
}
System.out.println("Response: " + response);
There are no results in the response. However, the printed request is
POST /my_index/_search?typed_keys=true {"_source":false,"size":10000}
which works perfectly fine when run directly as a REST request.
Any idea how to do it using the Java client?
The REST response is
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 999,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "my_index",
"_id": "2LHJSP6dTjXuEM2vsEDyxdG4Y7HPzXL15tFvrkyZm8xn",
"_score": 1
},
{
"_index": "my_index",
"_id": "A8RCf2mV4qeWvLxSfzKNX418E734uEifCenoCAiM3syB",
"_score": 1
},
...
]
}
}

Maybe this works for you:
SearchRequest sr = SearchRequest.of(r -> r
.index("idx_name")
.source(s -> s.fetch(false))
.size(10000));
System.out.println("Request: " + sr);
try {
var response = this.esClient.search(sr, Void.class);
response.hits().hits().forEach(hit -> {
System.out.println("ID: " + hit.id());
});
} catch (ElasticsearchException | IOException e) {
}

Related

Elasticsearch Java client SearchResponse not recognizing aggregations/buckets

Elasticsearch Java client SearchResponse is unable to parse aggregation results. I've come across articles online which suggest adding aggregation types prefixed to the keys. I've added what I thought applies in my use case, such as "sterms# and sum#" but I am unable to figure out which type applies to the main filter (key:'matched' in my case). I am expecting the buckets object to be populated but it is currently being returned as an empty array despite response from elasticsearch containing aggregations.
Note: This is to be able to unit test.
Json response:
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 75,
"successful": 75,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 776329,
"max_score": 0,
"hits": []
},
"aggregations": {
"sterms#matched": {
"doc_count": 15,
"sterms#id": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "mykey",
"doc_count": 15,
"sterms#manufacturerName": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "subkey",
"doc_count": 15
}
]
},
"sum#totalCars": {
"value": 214244
},
"sum#totalTrucks": {
"value": 155738
}
}
]
}
}
}
}
Parser in Java
public static SearchResponse getSearchResponseFromJson(final String jsonFileName) {
try {
String jsonResponse = IOUtils.toString(new FileInputStream(jsonFileName));
NamedXContentRegistry registry = new NamedXContentRegistry(getDefaultNamedXContents());
XContentParser parser = JsonXContent.jsonXContent.createParser(registry, jsonResponse);
return SearchResponse.fromXContent(parser);
} catch (IOException e) {
throw new RuntimeException("Failed to get resource: " + jsonFileName, e);
} catch (Exception e) {
throw new RuntimeException(("exception " + e));
}
}
private static List<NamedXContentRegistry.Entry> getDefaultNamedXContents() {
Map<String, ContextParser<Object, ? extends Aggregation>> map = new HashMap<>();
map.put(TopHitsAggregationBuilder.NAME, (parser, content) -> ParsedTopHits.fromXContent(parser, (String) content));
map.put(StringTerms.NAME, (parser, content) -> ParsedStringTerms.fromXContent(parser, (String) content));
map.put(SumAggregationBuilder.NAME, (parser, content) -> ParsedSum.fromXContent(parser, (String) content));
//map.put(ScriptedMetricAggregationBuilder.NAME, (parser, content) -> ParsedScriptedMetric.fromXContent(parser, (String) content));
List<NamedXContentRegistry.Entry> entries = map.entrySet()
.stream()
.map(entry -> new NamedXContentRegistry.Entry(Aggregation.class,
new ParseField(entry.getKey()),
entry.getValue()))
.collect(Collectors.toList());
return entries;
}
Parsed response with empty buckets (expectation is for the buckets to have aggregations totalCars and totalTrucks)
{
"took": 7,
"timed_out": false,
"_shards": {
"total": 75,
"successful": 75,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 776329,
"max_score": 0,
"hits": []
},
"aggregations": {
"sterms#matched": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": []
}
}
}
I am not sure if sterms is the right type for the filter
The answer for what type needs to be used is 'filter' (filter#matched). And Java class counterpart is FilterAggregationBuilder

Kibana scripted field - text to timestamp conversion parse exception

I am unable to convert a text field ts1 in my ES index (sample value 2017-09-30 04:53:39.412496Z) to timestamp by using scripted fields in Kibana. The SimpleDateFormat expression works in Java, just not in Elasticsearch. This mentions a similar problem but the solution doesn't work for me.
Query
GET <index_name>/_search
{
"query" : {
"match_all": {}
},
"script_fields" : {
"test1" : {
"script" : {
"lang": "painless",
"source": """new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS'Z'").parse(doc['ts1'].value).getTime()"""
}
}
}
}
Output
{
"took": 76,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 4,
"skipped": 0,
"failed": 1,
"failures": [
{
"shard": 3,
"index": "<index_name>",
"node": "-TjsxK4-QsqgLQRQ-DSKGQ",
"reason": {
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"java.text.DateFormat.parse(DateFormat.java:366)",
"""new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS'Z'").parse(doc['ts1'].value).getTime()""",
" ^---- HERE"
],
"script": """new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS'Z'").parse(doc['ts1'].value).getTime()""",
"lang": "painless",
"caused_by": {
"type": "parse_exception",
"reason": """Unparseable date: "04""""
}
}
}
]
},
"hits": {
"total": 1,
"max_score": 1,
"hits": []
}
}
Java Code that works -
import java.text.SimpleDateFormat;
public class Main{
public static void main(String []args){
String text2 = "2017-09-30 04:53:39.123123Z";
SimpleDateFormat s2 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS'Z'");
try {
System.out.println(s2.parse(text2).getTime());
} catch (Exception e) {
System.out.println(e);
}
}
}

Search is not returning expected results

I am trying to connect to elasticsearch and do some basic query in 6.3.2 version.
The code I am trying is this:
RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(new HttpHost("localhost", 30100, "http")));
SearchRequest sr = new SearchRequest(INDEX);
sr.indicesOptions(IndicesOptions.lenientExpandOpen());
SearchSourceBuilder ssb = new SearchSourceBuilder();
sr.source(ssb);
MatchQueryBuilder builder = QueryBuilders.matchQuery("logLevel.keyword", "ERROR");
QueryBuilder qb = QueryBuilders.boolQuery().must(builder);
ssb.query(qb);
SearchResponse response = null;
try {
response = client.search(sr);
System.out.println("total hits ::: " + response.getHits().getTotalHits());
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
client.close();
} catch (IOException e) {
e.printStackTrace();
}
}
System.out.println(response);
UPDATE
As suggested i am using the query only now. I tried the build query from APIs and i see the result but for some reason response.getHits().getTotalHits() is returning zero. The generated query looks like below and is giving me the expected result in kibana i.e. a total count of 1 :
GET /_search
{
"from": 0,
"query": {
"bool": {
"must": [
{
"match": {
"logLevel.keyword": {
"query": "ERROR",
"operator": "OR",
"prefix_length": 0,
"max_expansions": 50,
"fuzzy_transpositions": true,
"lenient": false,
"zero_terms_query": "NONE",
"auto_generate_synonyms_phrase_query": true,
"boost": 1
}
}
}
],
"adjust_pure_negative": true,
"boost": 1
}
}
}
Am missing some configuration for the restclient ?
I don't see why you need all the parts in the query that you posted (like max_expansions etc). A simple query like below, should do the job if you only need the count of documents that match the criteria.
curl -XGET "http://localhost:9200/index_name/_search" -d'
{
"size": 0,
"query": {
"bool": {
"must": [
{
"match": {
"logLevel": "ERROR"
}
},
{
"range": {
"date": {
"gte": "enter_date_here"
}
}
}
]
}
}
}'
reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html

JSON content from file via Java

My program parse JSON from file where is saved unstructured content of this json adress.
Unstructured json format from file json.txt:
{"total":3307,"watershed":100,"maxPage":5,"search":"http://10.103.80.149:3312/bin/sp?app=pinglun&outfmt=json&seek_timeout=400&gmt_create=1465228800~&validfeedback=1&item_id=538236063692&rate=1&layer_quota=500000&status=0,-1&order=algo_sort:des&s=0&n=20&is_wireless=0&user_id=0&utd_id=ba3290d86b9fdcd927e8ae83fde1ee64&is_click_sku=0","currentPageNum":1,"comments":[{"auction":{"title":"","thumbnail":""....
For better view in structured form:
{
"total": 3307,
"watershed": 100,
"maxPage": 5,
"search": "http://10.103.80.162:3312/bin/sp?app=pinglun&outfmt=json&seek_timeout=400&gmt_create=1465228800~&validfeedback=1&item_id=538236063692&rate=1&layer_quota=500000&status=0,-1&order=algo_sort:des&s=0&n=20&is_wireless=0&user_id=0&is_click_sku=0",
"currentPageNum": 1,
"comments": [
{
"auction": {
"title": "",
"thumbnail": "",
"aucNumId": "538236063692",
"link": "//item.taobao.com/item.htm?id=538236063692",
"auctionPic": "//img.alicdn.com/bao/uploaded/null_40x40.jpg",
"sku": "机身颜色:黑色(顶级配置)[移动+联通] &nbsp套餐类型:官方标配 &nbsp存储容量:64MB &nbsp版本类型:中国大陆"
},
"promotionType": "活动促销 ",
"enableSNS": false,
"tag": "",
"appendCanExplainable": false,
"showCuIcon": true,
"validscore": 1,
"award": "",
"noQna": true,
"appendList": [
{
"photos": [
],
"content": "老人机使用效果好,功能强大,非常值得拥有!!!",
"vicious": "",
"reply": null,
"show": true,
"dayAfterConfirm": 0,
"appendId": 290673146633
}
],
"from": "",
"date": "2016年11月05日 16:46",
"dayAfterConfirm": 0,
"bidPriceMoney": {
"cent": 7905,
"amount": 79.05,
"currencyCode": "CNY",
"centFactor": 100,
"displayUnit": "元",
"currency": {
"defaultFractionDigits": 2,
"currencyCode": "CNY",
"symbol": "¥"
}
},
"rate": "1",
"o2oRate": null,
"propertiesAvg": "0.0",
"showDepositIcon": false,
"rateId": 290589793921,
"creditFraudRule": 0,
"useful": 0,
"reply": null,
"append": {
"photos": [
],
"content": "老人机使用效果好,功能强大,非常值得拥有!!!",
"vicious": "",
.
.
.
.}
And I need to parse only chinese comments in element "content" e.g. "老人机使用效果好,功能强大,非常值得拥有!!!", but I don't know how to works with this element in JSONArray with json-simple.
Here is example of code:
public class Final {
public static void main(String[] args) throws IOException, JSONException {
String jsonData = "";
BufferedReader br = null;
try {
String line;
br = new BufferedReader(new FileReader("json.txt"));
while ((line = br.readLine()) != null) {
jsonData += line + "\n";
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)
br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
System.out.println("\nFile Content: \n" + jsonData1); // displays all content
JSONObject obj = new JSONObject(jsonData);
System.out.println("search: " + obj.getString("search"));
System.out.println("comments: " + obj.getJSONObject("comments"));
}
}
Could you please help me with this problem?

How get array value from nested json data (java android)

I've got the title, author, and published date from json data to my android app.
I'm trying to get event_start_date and event_location, but failed. can anyone help me how I can get it?
Json:
{
"status": "ok",
"count": 1,
"count_total": 29,
"pages": 29,
"posts": [
{
"id": 2815,
"type": "event",
"slug": "itb-integrated-career-days",
"url": "http://example.com/event/itb-integrated-career-days/",
"status": "publish",
"title": "Title",
"title_plain": "Title",
"content": "<p>test</p>\n",
"excerpt": "<p>test […]</p>\n",
"date": "2015-09-25 01:09:40",
"modified": "2015-09-29 22:52:35",
"categories": [],
"tags": [],
"author": {
"id": 1,
"slug": "john",
"name": "john",
"first_name": "",
"last_name": "",
"nickname": "john",
"url": "",
"description": ""
},
"comments": [],
"attachments": [
{
"id": 2817,
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015.gif",
"slug": "bannertkt102015",
"title": "bannertkt102015",
"description": "",
"caption": "",
"parent": 2815,
"mime_type": "image/gif",
"images": {
"full": {
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015.gif",
"width": 1000,
"height": 563
},
"thumbnail": {
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015-150x150.gif",
"width": 150,
"height": 150
},
"medium": {
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015-300x169.gif",
"width": 300,
"height": 169
},
"large": {
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015.gif",
"width": 1000,
"height": 563
},
"blog-default": {
"url": "http://example.com/wp-content/uploads/2015/09/bannertkt102015-806x300.gif",
"width": 806,
"height": 300
}
}
},
{
"id": 2818,
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008.jpg",
"slug": "2015-09-25_012008",
"title": "2015-09-25_012008",
"description": "",
"caption": "",
"parent": 2815,
"mime_type": "image/jpeg",
"images": {
"full": {
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008.jpg",
"width": 589,
"height": 529
},
"thumbnail": {
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008-150x150.jpg",
"width": 150,
"height": 150
},
"medium": {
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008-300x269.jpg",
"width": 300,
"height": 269
},
"large": {
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008.jpg",
"width": 589,
"height": 529
},
"blog-default": {
"url": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008-589x300.jpg",
"width": 589,
"height": 300
}
}
}
],
"comment_count": 0,
"comment_status": "open",
"thumbnail": "http://example.com/wp-content/uploads/2015/09/2015-09-25_012008-150x150.jpg",
"custom_fields": {
"event_location": [
"New York"
],
"event_start_date": [
"10/23/2015"
],
"event_start_time": [
"09:00 AM"
],
"event_start_date_number": [
"1445590800"
],
"event_address_country": [
"US"
],
"event_address_state": [
"Jawdst"
],
"event_address_city": [
"New York"
],
"event_address_address": [
"Gedung Sasana Budaya Ganesha (Sabuga) Jalan Tamansari 73, New York"
],
"event_address_zip": [
"42132"
],
"event_phone": [
"5345"
],
"post_views_count": [
"23"
]
}
}
]
}
And my function
public void parseJson(JSONObject json) {
try {
info.pages = json.getInt("pages");
// parsing json object
if (json.getString("status").equalsIgnoreCase("ok")) {
JSONArray posts = json.getJSONArray("posts");
info.feedList = new ArrayList<PostItem>();
for (int i = 0; i < posts.length(); i++) {
Log.v("INFO",
"Step 3: item " + (i + 1) + " of " + posts.length());
try {
JSONObject post = (JSONObject) posts.getJSONObject(i);
PostItem item = new PostItem();
item.setTitle(Html.fromHtml(post.getString("title"))
.toString());
item.setDate(post.getString("date"));
item.setId(post.getInt("id"));
item.setUrl(post.getString("url"));
item.setContent(post.getString("content"));
if (post.has("author")) {
Object author = post.get("author");
if (author instanceof JSONArray
&& ((JSONArray) author).length() > 0) {
author = ((JSONArray) author).getJSONObject(0);
}
if (author instanceof JSONObject
&& ((JSONObject) author).has("name")) {
item.setAuthor(((JSONObject) author)
.getString("name"));
}
}
if (post.has("tags") && post.getJSONArray("tags").length() > 0) {
item.setTag(((JSONObject) post.getJSONArray("tags").get(0)).getString("slug"));
}
// TODO do we dear to remove catch clause?
try {
boolean thumbnailfound = false;
if (post.has("thumbnail")) {
String thumbnail = post.getString("thumbnail");
if (thumbnail != "") {
item.setThumbnailUrl(thumbnail);
thumbnailfound = true;
}
}
if (post.has("attachments")) {
JSONArray attachments = post
.getJSONArray("attachments");
// checking how many attachments post has and
// grabbing the first one
if (attachments.length() > 0) {
JSONObject attachment = attachments
.getJSONObject(0);
item.setAttachmentUrl(attachment
.getString("url"));
// if we do not have a thumbnail yet, get
// one now
if (attachment.has("images")
&& !thumbnailfound) {
JSONObject thumbnail;
if (attachment.getJSONObject("images")
.has("post-thumbnail")) {
thumbnail = attachment
.getJSONObject("images")
.getJSONObject(
"post-thumbnail");
item.setThumbnailUrl(thumbnail
.getString("url"));
} else if (attachment.getJSONObject(
"images").has("thumbnail")) {
thumbnail = attachment
.getJSONObject("images")
.getJSONObject("thumbnail");
item.setThumbnailUrl(thumbnail
.getString("url"));
}
}
}
}
} catch (Exception e) {
Log.v("INFO",
"Item "
+ i
+ " of "
+ posts.length()
+ " will have no thumbnail or image because of exception!");
e.printStackTrace();
}
if (item.getId() != info.ignoreId)
info.feedList.add(item);
} catch (Exception e) {
Log.v("INFO", "Item " + i + " of " + posts.length()
+ " has been skipped due to exception!");
e.printStackTrace();
}
}
}
} catch (Exception e) {
e.printStackTrace();
}
}
This should do it
JSONObject custom_fields = post.getJSONObject("custom_fields");
JSONArray event_start_date_array = custom_fields.getJSONArray("event_start_date");
JSONArray event_location_array = custom_fields.getJSONArray("event_location");
String event_start_date = event_start_date_array.getString(0);
String event_location = event_location_array.getString(0);

Categories

Resources