[This is my first post, so please excuse me if I'm doing something wrong! Also, sorry for my bad English.]
I'm trying to develop a mapper plugin for Elasticsearch (in Java) that analyzes some specified fields of a JSON document and adds another field with the result of the analysis before the data is stored and indexed.
For example:
I start with this popular JSON:
{
"tweet" : {
"user" : "kimchy",
"message" : "This is a tweet!",
"postDate" : "2009-11-15T14:12:12",
"priority" : 4,
"rank" : 12.3
}
}
And, before indexing, I want it to look like this:
{
"tweet" : {
"user" : "kimchy",
"message" : "This is a tweet!",
"postDate" : "2009-11-15T14:12:12",
"priority" : 4,
"rank" : 12.3,
"IsKimchy" : "Yes"
}
}
Here I want to read the field "user" and, if the user is Kimchy, create another field that contains "Yes".
How can I analyze a field like this (using Java) before indexing?
As far as I know, I can copy the content of one field into another using copy_to, so that I only have to work on a single field. Maybe that can help?
I finally found a solution to my problem. I'll post it here in the hope that it helps someone else!
The best solution for me (in terms of speed and integration) was a custom mapper!
I used this demo code https://github.com/dadoonet/elasticsearch-mapper-demo to implement my mapper class, which analyzes the JSON faster than scripting and adds a custom field to the record!
Cheers!
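For anyone who only wants to see the enrichment step itself, here is a minimal sketch in plain Java. It is not the mapper from the demo project and does not use any Elasticsearch API; the class TweetEnricher and the method addIsKimchy are hypothetical names, and the tweet is assumed to be already parsed into a Map. Inside a real mapper the same check would run on the parsed "user" field.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Elasticsearch or of the linked demo project.
class TweetEnricher {

    // Returns a copy of the tweet; adds "IsKimchy": "Yes" only when "user" is kimchy.
    static Map<String, Object> addIsKimchy(Map<String, Object> tweet) {
        Map<String, Object> enriched = new LinkedHashMap<>(tweet);
        Object user = tweet.get("user");
        // Case-insensitive comparison is an assumption: the sample stores "kimchy" in lower case.
        if (user instanceof String && ((String) user).equalsIgnoreCase("kimchy")) {
            enriched.put("IsKimchy", "Yes");
        }
        return enriched;
    }

    public static void main(String[] args) {
        Map<String, Object> tweet = new LinkedHashMap<>();
        tweet.put("user", "kimchy");
        tweet.put("message", "This is a tweet!");
        System.out.println(addIsKimchy(tweet));
    }
}
```

The input map is left untouched, so the original document can still be compared with the enriched one.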
I have a Java program that parses a JSON file. There are dependencies between the JSON objects: they are procedures that must be executed, and some of them depend on others. I want to build a graph to represent this. Is there any known way to do it? I tried mxGraph (JGraph) but could not get the representation to work.
Here is a simple JSON format:
{
"blueprint":
{
"1" : { "depends" : null },
"2" : { "depends" : "1" },
"3" : { "depends" : ["2", "1"] }
}
}
Answering this old question of mine just in case somebody needs it.
Graphviz (graphviz.org) with the DOT language was the way I tackled it.
Thank you everyone for the comments.
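A minimal sketch of what generating DOT from the blueprint could look like, assuming the JSON has already been parsed into a map from each procedure to the list of procedures it depends on. The class BlueprintDot and the method toDot are hypothetical names, and the edge direction ("a" -> "b" meaning "a depends on b") is my own choice.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that turns a dependency map into DOT text for Graphviz.
class BlueprintDot {

    // Builds a DOT digraph; an edge "a" -> "b" means "a depends on b".
    static String toDot(Map<String, List<String>> dependencies) {
        StringBuilder dot = new StringBuilder("digraph blueprint {\n");
        for (Map.Entry<String, List<String>> entry : dependencies.entrySet()) {
            // Declare the node explicitly so procedures without dependencies still appear.
            dot.append("  \"").append(entry.getKey()).append("\";\n");
            for (String dependency : entry.getValue()) {
                dot.append("  \"").append(entry.getKey())
                   .append("\" -> \"").append(dependency).append("\";\n");
            }
        }
        return dot.append("}\n").toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("1", Collections.<String>emptyList());
        deps.put("2", Arrays.asList("1"));
        deps.put("3", Arrays.asList("2", "1"));
        System.out.print(toDot(deps));
    }
}
```

Saving the output to a file such as blueprint.dot and running `dot -Tpng blueprint.dot -o blueprint.png` should render the graph.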
Please consider a MongoDB collection with the following document:
"_id": "clientsInfo",
"data": {
"clientsList" : [
{
"name" : "Mike",
"country" : "USA"
},
...
]
}
After setting the DataSet and defining the Query like this...
{
collectionName:'projectA',
findQuery: {
'_id':'clientsInfo',
},
findFields: {
'_id':0,
'data.clientsList':1
},
}
...I am able to display the first item of the fetched array (java.util.List type) in JasperSoft Studio inside a Text Field using the following expression:
$F{data.clientsList}.get(0)
However, I would like to show the whole data set in a Name/Country table...
Question 1: How can I access any of the dictionary fields? When I try the get method, I get the error "The method get(String) is undefined for the type Object". However, since the object is an instance of com.mongodb.BasicDBObject, it should have inherited that method (see doc).
I have also tried to cast the object to org.json.JSONObject, but then I get the error "net.sf.jasperreports.engine.fill.JRExpressionEvalException: Error evaluating expression for source text: (JSONObject)$F{data.clientsList}.get(0)".
Question 2: Supposing the first question is solved, how can I iterate over the list to access all of the items, according to the array length, and not only the first one? Is it possible to use a for loop inside the JasperSoft Expression Editor? (if-then-else seems to be available.)
Thanks in advance. Any clue that points me in the right direction will be appreciated.
Just in case someone is in the same situation as I was, I must say this whole approach was wrong.
The point is not to run a simple query that returns a big block of complex data, formatted as an object or a list of objects, and then manipulate it in JasperSoft Studio. Instead, what I had to do was design a more elaborate query that directly returns the simple fields I wanted to use. How? By using the Aggregation Framework.
So, by changing this...
{
collectionName:'projectA',
findQuery: {
'_id':'clientsInfo',
},
findFields: {
'_id':0,
'data.clientsList':1
},
}
...to this...
{
runCommand: {
aggregate : 'projectA',
pipeline : [
{'$match': {'_id':'clientsInfo'}},
{'$project': {'data.clientsList': 1}},
{'$unwind': '$data'},
{'$unwind': '$data.clientsList'}
]
}
}
...is how I get name and country fields in order to use them in Text Fields, Tables, ...etc.
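To illustrate why the pipeline helps, here is a rough plain-Java sketch of the effect of unwinding data.clientsList: one flat row per list element instead of one document holding the whole list. It does not use the MongoDB driver or JasperReports; the class UnwindSketch and the method unwind are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: mimics the effect of $unwind on the client list.
class UnwindSketch {

    // Emits one flat row per list element, so each row exposes simple fields.
    static List<Map<String, Object>> unwind(List<Map<String, Object>> clientsList) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (Map<String, Object> client : clientsList) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("name", client.get("name"));
            row.put("country", client.get("country"));
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Object> mike = new LinkedHashMap<>();
        mike.put("name", "Mike");
        mike.put("country", "USA");
        for (Map<String, Object> row : unwind(Arrays.asList(mike))) {
            System.out.println(row.get("name") + " / " + row.get("country"));
        }
    }
}
```

With one row per client, the report no longer needs a loop in the expression editor: the table simply repeats for each row of the data set.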
I am trying to update an Elasticsearch document using Java.
My document is as follows:
"_source": {
"gender": "male" ,
"names": ["name1"]
}
I need to add more names to the names list, but I want no duplicates.
How can I update an array in an ES document without duplicate values?
I tried something like this. But it's not working.
client.prepareUpdate(index,type,id)
.addScriptParam("newobject", "newName")
.setScript("ctx._source.names.contains(newobject) ? ctx.op = \"none\" : ctx._source.names+=newobject ").execute().actionGet();
The idea would be to simply call unique() on the resulting list:
client.prepareUpdate(index,type,id)
.addScriptParam("newobject", "newName")
.setScript("ctx._source.names+=newobject; ctx._source.names = ctx._source.names.unique(); ").execute().actionGet();
Also for this to work, you need to make sure that scripting is enabled.
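For reference, the intended effect of that script (append, then drop duplicates) can be sketched in plain Java. This is not Elasticsearch API code, and UniqueAppend and appendUnique are hypothetical names; the same logic could also be used client side if you read the document, modify the list and re-index it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper mirroring "names += newobject; names = names.unique()".
class UniqueAppend {

    // Appends newName and removes duplicates while keeping first-seen order.
    static List<String> appendUnique(List<String> names, String newName) {
        LinkedHashSet<String> unique = new LinkedHashSet<>(names);
        unique.add(newName); // no effect if the name is already present
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("name1");
        System.out.println(appendUnique(names, "newName"));
        System.out.println(appendUnique(names, "name1"));
    }
}
```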
I am working on a Spring MVC application and I want to insert JavaScript into the HTML output for analytics purposes. I am only partially familiar with serialization, but I figured it would do the job more nicely than manually constructing a string containing JavaScript.
Would it be possible to generate something like the following snippet? Any pointers would be great!
"emd" : new Date('6/6/2014')
Update:
I need to output a JavaScript object with many fields, some of which may be complex. Hence, on the backend I am gathering all the data into Java beans, and I plan to use the Jackson mapper to convert them to a string that I can just output through JSP.
Generating the above snippet does not seem straightforward, though, and I am not sure it is even possible. For context, the rest of that JavaScript looks something like this:
Analytics.items["item_123"] = {
//ratings and reviews
"rat" : a.b, //the decimal value for the rating
"rev" : xxxx, //integer
//list of flags that indicate how the product was displayed to the customer
//add as needed...tracking code will pick up flags as needed when they are available
"dec" : ["mbe", "green", "recycled"],
//delivery messaging
"delivery" : {
"cd" : new Date(), //current date
"offers" : [{
"type" : "abcd",
"emd" : new Date('6/6/2014'),
"weekend" : true
}
]
},
};
JSON.stringify should do the trick. It will be built into your browser, unless you're using a very old browser, in which case you can use a polyfill.
I've got an interesting problem that is somewhat related to this question. I have multiple values for a field that I want to check. For example, say I want to look for a document with the field name matching "bob" and "barker". Initially, I thought to do this:
db.TVHosts.find({ "name": { "$all" :
[ { "$regex": ".*bob.*" }, { "$regex" : ".*barker.*" } ] } })
The way to do it via the command line is to do this:
db.TVHosts.find({ "name": { "$all" : [ /.*bob.*/, /.*barker.*/ ] } })
But that didn't appear to work from Java. Is there some key piece of documentation that I've missed?
EDIT: I'm using MongoDB via the MongoDB Java Driver.
As seen here, to send a Regex to MongoDB in Java you need to use Pattern.compile from java.util.regex.Pattern.
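A small sketch of the idea using only the JDK, assuming the query is built as nested maps: the two patterns are compiled with Pattern.compile and placed in the "$all" list instead of { "$regex": ... } sub-documents. The class RegexAllQuery and its methods are hypothetical names. With the Java driver the same Pattern objects would go into the driver's own query object (for example a BasicDBObject in the legacy driver); I have left that part out so the sketch runs without the driver.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: builds the query shape with compiled Pattern values.
class RegexAllQuery {

    // The two regular expressions from the question, compiled for Java.
    static List<Pattern> patterns() {
        return Arrays.asList(Pattern.compile(".*bob.*"), Pattern.compile(".*barker.*"));
    }

    // Shape: { "name": { "$all": [ /.*bob.*/, /.*barker.*/ ] } }
    static Map<String, Object> buildQuery() {
        Map<String, Object> all = new LinkedHashMap<>();
        all.put("$all", patterns());
        Map<String, Object> query = new LinkedHashMap<>();
        query.put("name", all);
        return query;
    }

    // Local check with the same intent as $all: every pattern must match the value.
    static boolean matchesAll(List<Pattern> patterns, String value) {
        for (Pattern pattern : patterns) {
            if (!pattern.matcher(value).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery());
        System.out.println(matchesAll(patterns(), "bob barker"));
    }
}
```

The matchesAll method is only there to sanity-check the patterns locally; the server does the real matching.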