Limit fields on an Elasticsearch response but still serialize using Jackson - Java

I have successfully created an index using Elasticsearch, and can serialize those exact JSON payloads back into my Java application:
for (SearchHit searchHit : searchResponse.getHits()) {
    try {
        result.getItems().add(objectMapper.readValue(searchHit.getSourceRef().streamInput(), Program.class));
    } catch (IOException e) {
        throw new IllegalArgumentException("Cannot unmarshal json", e);
    }
}
The payload I index into Elasticsearch is very large, but I want the response to be very small, and I also want to let the client dynamically include or exclude some fields. So I did the following, where fields is the list of fields I want included. This works well in that only the fields I ask for are returned; however, searchHit.getSourceRef() is now null. Is there any way to have Jackson map just the fields that I included? Or must I always return the entire source object? Or do I have to write some sort of mapping code to translate (I would really like to avoid that)?
SearchResponse searchResponse = transportClient.prepareSearch("programs")
        .addFields(fields.toArray(new String[fields.size()]))
        .setTypes("program")
        .setQuery(query).setFrom(start).setSize(pageSize)
        .execute().actionGet();

"however the searchHit.getSourceRef is now null."
It is null because searchHit.getSource() is also null. As far as I know, you have to add "_source" to your fields list when you do the search. Something like this:
ArrayList<String> fields = new ArrayList<String>();
fields.add("field1");
fields.add("field2");
fields.add("_source"); // add this field
SearchResponse response = transportClient.prepareSearch("programs")
        .addFields(fields.toArray(new String[fields.size()]))
        .execute().actionGet();
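Depending on your client version, source filtering may be a better fit than field loading here: it keeps _source populated but trimmed to only the fields you ask for, so the Jackson deserialization above keeps working. A minimal sketch, assuming a transport-client version that supports setFetchSource and that the Program mapping tolerates the missing properties:
// Sketch: source filtering instead of addFields (assumption: setFetchSource
// is available in your client version).
String[] includes = fields.toArray(new String[fields.size()]);
SearchResponse filtered = transportClient.prepareSearch("programs")
        .setTypes("program")
        .setFetchSource(includes, null)          // trim _source to these fields
        .setQuery(query).setFrom(start).setSize(pageSize)
        .execute().actionGet();

for (SearchHit hit : filtered.getHits()) {
    try {
        // getSourceRef() stays non-null: _source is returned, just filtered.
        result.getItems().add(objectMapper.readValue(hit.getSourceRef().streamInput(), Program.class));
    } catch (IOException e) {
        throw new IllegalArgumentException("Cannot unmarshal json", e);
    }
}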

Related

Query JSON String using Jackson API with good performance

I am working on a use case where I need to parse different JSON strings and query for specific fields based on the "type" of the JSON. The "type" is a field in the JSON string.
I am using the Jackson API to perform this task, having gone through blogs and benchmarks, as it is the fastest.
I am able to parse the JSON and achieve what I want, but the issue is performance.
public String generate(String inputJson, List<String> idParams, final String seperator) throws Exception {
    JsonNode node = new ObjectMapper().readTree(new StringReader(inputJson));
    StringBuilder sb = new StringBuilder();
    idParams.forEach(e -> {
        String value = node.findValue(e).asText();
        sb.append(value).append(seperator);
    });
    return sb.toString();
}
In the above method, I get the field names as a List and fetch each value with forEach().
The culprit is the list iteration: I have to search the whole JSON tree to find the value for each element. Is there a better way to optimize this? I would also like input on other JSON parsing libraries that could improve performance here.
I am also thinking of parsing the whole JSON once and writing the keys and values to a HashMap. But I only really care about a few fields; the remaining fields are not needed.
Take a look at JsonPath. It offers an XPath-like query language that allows searching for and retrieving individual elements from the JSON tree.
Consider using the Jackson Streaming API:
https://github.com/FasterXML/jackson-docs/wiki/JacksonStreamingApi
Take a look at this example:
http://www.baeldung.com/jackson-streaming-api
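As a rough illustration of the streaming approach (a sketch, not a drop-in replacement for the method above: the helper name and the first-match-wins behavior are assumptions), a single pass can collect all wanted fields at once instead of running one findValue() tree search per field:
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch: one streaming pass, keeping the first value seen for each wanted field.
public static Map<String, String> extract(String json, Set<String> wanted) throws Exception {
    Map<String, String> values = new HashMap<>();
    try (JsonParser parser = new JsonFactory().createParser(json)) {
        while (parser.nextToken() != null && values.size() < wanted.size()) {
            if (parser.getCurrentToken() == JsonToken.FIELD_NAME
                    && wanted.contains(parser.getCurrentName())) {
                String name = parser.getCurrentName();
                parser.nextToken();                  // advance to the field's value
                values.putIfAbsent(name, parser.getText());
            }
        }
    }
    return values;
}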

How to parse a JSON object or array in Java

I have an API request from my CRM that can return either a jsonObject, if there is only one result, or a jsonArray, if there are multiple results. Here is what they look like in JSON Viewer:
[JsonObject and JsonArray screenshots omitted]
Before you answer: this is not my design, it's my CRM's design. I don't have any control over it, and I don't like how it is designed either. The only reason I am not storing the records in my own database and just parsing that (which would be MUCH easier) is that my account is having issues running some workflows that would let me auto-add the records. Is there any way to figure out whether the result is an object or an array in Java? This is for an Android app, by the way; I need it to display the records on the phone.
You should use the opt methods instead of the get methods:
JSONObject potentialObject = response.getJSONObject("result")
        .getJSONObject("Potentials");
// here use opt: if it returns null, the value is not the type you asked for
JSONObject row = potentialObject.optJSONObject("row");
if (row == null) {
    // row is a JSON array
    JSONArray rowArray = potentialObject.getJSONArray("row");
    // do whatever you like with rowArray
} else {
    // row is a JSON object; do whatever you like with it
}
ONE
You can use the instanceof keyword to check the instance, as in:
if (json instanceof JSONObject) {
    System.out.println("object");
} else {
    System.out.println("array");
}
TWO
BUT I think a better way is to choose to use only JSONArray, so that the format of your results is predictable and can be catered for. JSONArrays can contain JSONObjects, so they cover everything a JSONObject can.
For example, when you get the response (either a JSONObject or a JSONArray), you need to store it in an instance. What instance are you going to store it in? To avoid issues, store the response in a JSONArray and provide statements to handle that.
THREE
You can also try method overloading in Java, or generics, as sketched below.
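A minimal sketch of the overloading idea (the helper names are hypothetical, and note that Java resolves overloads statically, so you still need a runtime type check to dispatch; org.json types assumed):
// Hypothetical helpers: one overload per shape of the "row" entry.
void handleRow(JSONObject row) { /* single result */ }
void handleRow(JSONArray rows) { /* multiple results */ }

void dispatch(JSONObject potentials) throws JSONException {
    Object row = potentials.get("row");      // may be either type
    if (row instanceof JSONArray) {
        handleRow((JSONArray) row);
    } else if (row instanceof JSONObject) {
        handleRow((JSONObject) row);
    }
}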
The simplest way is to use Moshi, so that you don't have to parse by hand; even if the model changes later, you only have to change your POJO and it will still work.
Here is the snippet from the readme:
String json = ...;
Moshi moshi = new Moshi.Builder().build();
JsonAdapter<BlackjackHand> jsonAdapter = moshi.adapter(BlackjackHand.class);
BlackjackHand blackjackHand = jsonAdapter.fromJson(json);
System.out.println(blackjackHand);
https://github.com/square/moshi/blob/master/README.md

Elasticsearch: update an index document

I need to update a document in an Elasticsearch index, and this is the code I have implemented. But it is not working; what's wrong, and how should I implement this?
My code:
Map<String, Object> matching_result = null;
for (SearchHit hit : response_text.getHits()) {
    matching_result = hit.getSource();
    String flag_value = matching_result.get("flag").toString();
    matching_result.put("flag", true);
}
String indexString = JSONConverter.toJsonString(matching_result);
IndexResponse response = client.prepareIndex("index_name", "data").setSource(indexString).execute().actionGet();
boolean created = response.isCreated();
System.out.println("created or updated--------------------->" + created);
System.out.println("flag value==========" + matching_result.get("flag"));
return actual_theme;
(JSONConverter.toJsonString is our library class for converting to a JSON string.)
What is wrong with this query?
Instead of updating the existing document it is creating a new one. I want to change the existing one.
Based on your example code, it looks like by "update" you mean you are trying to replace the entire document. In order to do this, you must specify the id of the document you wish to update.
Using the Java API, in addition to calling setSource on the IndexRequestBuilder, you would also need to supply the id by calling setId. For example:
IndexResponse response = client.prepareIndex("index_name", "data")
        .setSource(indexString)
        .setId("123") // <-- supply the ID of the document you want to replace
        .execute()
        .actionGet();
Otherwise, just so you know, ES gives you the option to do a partial update, i.e., update only certain fields in the document. This can be done with a script or by providing a partial document; have a look at the documentation for the Update API.
In either case, you need to provide ES with the ID of the document you wish to modify.
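For completeness, a hedged sketch of the partial-document variant with the transport client (index, type, and field names are taken from the question; each hit's own ID addresses the document):
// Sketch: partial update of just the "flag" field, using the ID of each hit.
for (SearchHit hit : response_text.getHits()) {
    Map<String, Object> partialDoc = new HashMap<String, Object>();
    partialDoc.put("flag", true);
    UpdateResponse updateResponse = client.prepareUpdate("index_name", "data", hit.getId())
            .setDoc(partialDoc)   // merge only this field into the existing document
            .execute()
            .actionGet();
}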

Proper JSON for a back-end POST call

I'm having some trouble with different back-end processing of POST REST calls. I have two different objects which are updated through two different POST methods in my back-end. I catch the objects as a JsonNode, and in order to parse the attributes I need to update, I create an iterator like so:
final Iterator<String> fieldNames = attributes.fieldNames();
The problem comes when I send my data from Angular: in one case I need to explicitly send it as angular.toJson(data) in order to properly grab all the field names, and in the other case I just send the data (without the Angular JSON conversion). Why does this behavior occur? Does it have to do with how I create the $http POST call? Here are the two different calls from Angular:
$http.post(URL, angular.toJson(data)).success(function(data) {
    /* whatever */
}).error(function(data) {
    /* whatever */
});

// Second call looks like this (this one I resolve using $q.all)
var promise = $http({method: 'POST', url: URL, data: data, cache: 'false'});
I truncated the code to just the important stuff. My data is currently created like this (I tried multiple ways to avoid needing toJson):
var data = "{\"Attribute1:\"+"\""+$scope.value1+"\","+
"\"Attribute2:\"+"\""+$scope.value2+"\"}";
How do I need to send the JSON data in order for it to be correctly converted to a JsonNode in my back-end, so I can properly iterate the fieldNames?
I did manage to come to a common solution which consumes the JSON correctly in my back-end. I declare my JSON objects in Angular like this:
$scope.dataToSend = {
"SomeAttribute" : "",
"SomeOtherAttribute" : ""
};
And then added my values like so:
$scope.dataToSend.SomeAttribute = someValue;
$scope.dataToSend.SomeOtherAttribute = someOtherValue;
There is no longer any need to send the data through angular.toJson(); $http serializes a plain JavaScript object to JSON (and sets the application/json Content-Type) automatically.
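On the Java side, a minimal sketch of the receiving end (Spring MVC is assumed here; the controller, mapping, and names are illustrative, not from the question):
import java.util.Iterator;
import com.fasterxml.jackson.databind.JsonNode;
import org.springframework.web.bind.annotation.*;

@RestController
public class AttributeController {

    // With a real JSON body and Content-Type: application/json, Jackson binds
    // the payload to a JsonNode whose fieldNames() iterate as expected.
    @RequestMapping(value = "/attributes", method = RequestMethod.POST)
    public void update(@RequestBody JsonNode attributes) {
        final Iterator<String> fieldNames = attributes.fieldNames();
        while (fieldNames.hasNext()) {
            String name = fieldNames.next();
            JsonNode value = attributes.get(name);
            // apply the update for (name, value) ...
        }
    }
}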

What are the best practices to add metadata to a RESTful JSON response?

Background
We are building a RESTful API that should return data objects as JSON. In most cases it is fine to just return the data object, but in some cases, e.g. pagination or validation, we need to add some metadata to the response.
What we have so far
We have wrapped all JSON responses like this example:
{
    "metadata": {
        "status": 200|500,
        "msg": "Some message here",
        "next": "http://api.domain.com/users/10/20"
        ...
    },
    "data": {
        "id": 1001,
        "name": "Bob"
    }
}
Pros
We can add helpful metadata to the response
Cons
In most cases we don't need the metadata field, and it adds complexity to the JSON format.
Since it's no longer a plain data object but an enveloped response, we can't use the response directly in e.g. Backbone.js without extracting the data object.
Question
What are the best practices for adding metadata to a JSON response?
UPDATE
What I've got so far from the answers below:

- Remove metadata.status and return the HTTP status code in the HTTP protocol itself instead (200, 500, ...)
- Add the error msg to the body of an HTTP 500 response
- For pagination it is natural to have some metadata describing the pagination structure, with the data nested inside that structure
- A small amount of metadata can be added to HTTP headers (X-something)
You have several means of passing metadata in a RESTful API:

- HTTP status code
- Headers
- Response body

For metadata.status, use the HTTP status code; that's what it's for!
If the metadata refers to the whole response, you can add it as header fields.
If the metadata refers only to part of the response, you will have to embed it as part of the object. DON'T wrap the whole response in an artificial envelope just to split it into data and metadata.
And finally, be consistent across your API with the choices you make.
A good example is a GET on a whole collection with pagination: GET /items.
You could return the collection size and current page in custom headers, and pagination links in the standard Link header:
Link: <https://api.mydomain.com/v1/items?limit=25&offset=25>; rel=next
The problem with this approach comes when you need to add metadata referencing specific elements in the response. In that case, just embed it in the object itself. And to have a consistent approach... always add all metadata to the response. So, coming back to GET /items, imagine that each item has created and updated metadata:
{
    "items": [
        {
            "id": "w67e87898dnkwu4752igd",
            "message": "some content",
            "_created": "2014-02-14T10:07:39.574Z",
            "_updated": "2014-02-14T10:07:39.574Z"
        },
        ......
        {
            "id": "asjdfiu3748hiuqdh",
            "message": "some other content",
            "_created": "2014-02-14T10:07:39.574Z",
            "_updated": "2014-02-14T10:07:39.574Z"
        }
    ],
    "_total": 133,
    "_links": [
        {
            "next": {
                "href": "https://api.mydomain.com/v1/items?limit=25&offset=25"
            }
        }
    ]
}
Note that a collection response is a special case: if you add metadata to a collection, the collection can no longer be returned as a plain array; it must be an object with an array in it. Why an object? Because you want to add some metadata attributes.
Compare that with the metadata in the individual items: nothing close to wrapping the entity, you just add some attributes to the resource.
One convention is to differentiate control or metadata fields by prefixing them with an underscore.
Along the lines of @Charlie's comment: for the pagination part of your question you still need to bake the metadata into the response somehow, but the status and message attributes here are somewhat redundant, since they are already covered by the HTTP protocol itself (status 200 - model found, 404 - model not found, 403 - insufficient privileges; you get the idea) (see the spec). Even if your server returns an error condition, you can still send the message part in the response body. These two fields will cover quite a lot of your metadata needs.
Personally, I have tended towards (ab)using custom HTTP headers for smaller pieces of metadata (with an X- prefix), but I guess the limit where that gets impractical is pretty low.
I've expanded a bit about this in a question with a smaller scope, but I think the points are still valid for this question.
I suggest you read https://www.odata.org/. You are not forced to use OData, but the way they do things is a good example of good practice with REST.
We had the same use case, in which we needed to add pagination metadata to a JSON response. We ended up creating a collection type in Backbone that could handle this data, and a lightweight wrapper on the Rails side. This example just adds the metadata to the collection object for reference by the view.
So we created a Backbone Collection class, something like this:
// Example response:
// { num_pages: 4, limit_value: 25, current_page: 1, total_count: 97
// records: [{...}, {...}] }
PageableCollection = Backbone.Collection.extend({
    parse: function(resp, xhr) {
        this.numPages = resp.num_pages;
        this.limitValue = resp.limit_value;
        this.currentPage = resp.current_page;
        this.totalCount = resp.total_count;
        return resp.records;
    }
});
And then we created this simple class on the Rails side to emit the metadata when paginated with Kaminari:
class PageableCollection
  def initialize(collection)
    @collection = collection
  end

  def as_json(opts = {})
    {
      :num_pages => @collection.num_pages,
      :limit_value => @collection.limit_value,
      :current_page => @collection.current_page,
      :total_count => @collection.total_count,
      :records => @collection.to_a.as_json(opts)
    }
  end
end
You use it in a controller like this:
class ThingsController < ApplicationController
  def index
    @things = Thing.all.page params[:page]
    render :json => PageableCollection.new(@things)
  end
end
Enjoy. Hope you find it useful.
How about directly returning the object that you want as the data, i.e. returning:
{
"id": 1001,
"name": "Bob"
}
And return the metadata in headers.
Option 1 (one header carrying all metadata as JSON):
X-METADATA = '{"status": 200|500, "msg": "Some message here", "next": "http://api.domain.com/users/10/20"...}'
Option 2 (one header per metadata field):
X-METADATA-STATUS = 200|500
X-METADATA-MSG = "Some message here"
X-METADATA-NEXT = "http://api.domain.com/users/10/20"
...
Until now I was using, like you, a complex JSON with two fields, one for data and one for metadata. But I'm thinking of starting to use the approach I suggested here; I think it will be easier.
Bear in mind that some servers have a size limit for HTTP headers; see for example: https://www.tutorialspoint.com/What-is-the-maximum-size-of-HTTP-header-values
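A minimal sketch of this header-based variant (Spring is assumed here; the endpoint, the User type, and everything beyond the X-METADATA-* header names from this answer are illustrative):
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;

@RestController
public class UserController {

    // Minimal data object; the body stays a bare data object.
    static class User {
        public long id;
        public String name;
        User(long id, String name) { this.id = id; this.name = name; }
    }

    @RequestMapping(value = "/users/{id}", method = RequestMethod.GET)
    public ResponseEntity<User> getUser(@PathVariable long id) {
        User user = new User(id, "Bob");  // stand-in for a real lookup
        // The HTTP status code itself carries what metadata.status used to say;
        // the remaining metadata rides in the headers.
        return ResponseEntity.ok()
                .header("X-METADATA-MSG", "Some message here")
                .header("X-METADATA-NEXT", "http://api.domain.com/users/10/20")
                .body(user);
    }
}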
JSON:API solves this by defining top-level meta and data properties.
