Is there a way to use a user-defined function saved with db.system.js.save(...) in an aggregation pipeline or in mapReduce?
Any function you save to system.js is available to "JavaScript" processing statements such as the $where operator and mapReduce, and can be referenced by the _id value it was assigned. For example, given this saved function:
db.system.js.save({
"_id": "squareThis",
"value": function(a) { return a*a }
})
And some data inserted into the "sample" collection:
{ "_id" : ObjectId("55aafd2bacbed38e06f9eccf"), "a" : 1 }
{ "_id" : ObjectId("55aafea6acbed38e06f9ecd0"), "a" : 2 }
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }
Then:
db.sample.mapReduce(
function() {
emit(null, squareThis(this.a));
},
function(key,values) {
return Array.sum(values);
},
{ "out": { "inline": 1 } }
);
Gives:
"results" : [
{
"_id" : null,
"value" : 14
}
],
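To see where the 14 comes from, the same map and reduce steps can be sketched in plain JavaScript outside MongoDB. This is only an illustration: the local squareThis stands in for the function stored in system.js, and the in-memory array stands in for the "sample" collection.

```javascript
// Stand-in for the function stored in system.js under _id "squareThis".
function squareThis(a) { return a * a; }

// The three sample documents, reduced to the field that matters.
const sample = [{ a: 1 }, { a: 2 }, { a: 3 }];

// "map": one squared value per document, all emitted under the key null.
const emitted = sample.map(doc => squareThis(doc.a)); // [1, 4, 9]

// "reduce": sum the emitted values, as Array.sum(values) does in the shell.
const total = emitted.reduce((acc, v) => acc + v, 0);

console.log(total); // 14
```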
Or with $where:
db.sample.find(function() { return squareThis(this.a) == 9 })
{ "_id" : ObjectId("55aafeabacbed38e06f9ecd1"), "a" : 3 }
But in "neither" case can you use globals such as the database db reference or other functions. Both the $where and mapReduce documentation describe the limits of what you can do here. So if you thought you were going to do something like "look up data in another collection", you can forget it, because it is "Not Allowed".
Every MongoDB command action is actually a call to a "runCommand" action "under the hood" anyway. But unless that command is actually "calling a JavaScript processing engine", functions stored in system.js are irrelevant to it. Only a few commands do this: mapReduce, group and eval, and of course find operations with $where.
The aggregation framework does not use JavaScript in any way at all. You might be misreading, as others have, a statement like this, which does not do what you think it does:
db.sample.aggregate([
{ "$match": {
"a": { "$in": db.sample.distinct("a") }
}}
])
So that is "not running inside" the aggregation pipeline; rather, the "result" of the .distinct() call is "evaluated" before the pipeline is sent to the server. It is much the same as using an external variable:
var items = [1,2,3];
db.sample.aggregate([
{ "$match": {
"a": { "$in": items }
}}
])
Both are essentially sent to the server in the same form:
db.sample.aggregate([
{ "$match": {
"a": { "$in": [1,2,3] }
}}
])
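A small plain JavaScript sketch can make the "evaluated before sending" point concrete. The functions fakeAggregate and fakeDistinct are hypothetical stand-ins, not MongoDB APIs: the first just serializes whatever pipeline it receives, which is roughly what the server would be handed.

```javascript
// Hypothetical stand-in for db.sample.aggregate(): it only reports the
// pipeline it was given, serialized as JSON.
function fakeAggregate(pipeline) {
  return JSON.stringify(pipeline);
}

// Hypothetical stand-in for db.sample.distinct("a").
function fakeDistinct() {
  return [1, 2, 3];
}

const items = [1, 2, 3];

// The argument expressions are evaluated on the client before the call,
// so all three forms hand over exactly the same pipeline.
const viaCall     = fakeAggregate([{ $match: { a: { $in: fakeDistinct() } } }]);
const viaVariable = fakeAggregate([{ $match: { a: { $in: items } } }]);
const viaLiteral  = fakeAggregate([{ $match: { a: { $in: [1, 2, 3] } } }]);

console.log(viaCall); // [{"$match":{"a":{"$in":[1,2,3]}}}]
console.log(viaCall === viaVariable && viaVariable === viaLiteral); // true
```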
So it is "not possible" to "call" any JavaScript function in the aggregation pipeline, nor is there really any point in "passing in" results in general from something saved in system.js. The "code" needs to be "loaded to the client" and only a JavaScript engine can actually do anything with it.
With the aggregation framework, all of the "operators" available are actually natively coded functions as opposed to the "free form" JavaScript interpretation provided for mapReduce. So instead of writing "JavaScript", you use the operators themselves:
db.sample.aggregate([
{ "$group": {
"_id": null,
"squared": { "$sum": {
"$multiply": [ "$a", "$a" ]
}}
}}
])
{ "_id" : null, "squared" : 14 }
So there are limitations on what you can do with functions saved in system.js, and the chances are that what you want to do is either:
Not allowed, such as accessing data from another collection
Not really required, as the logic is generally self-contained anyway
Or probably better implemented in client logic or in some other form
Just about the only practical use I can really think of is that you have a number of "mapReduce" operations that cannot be done any other way and you have various "shared" functions that you would rather just store on the server than maintain within every mapReduce function call.
But then again, 90% of the time the reason for choosing mapReduce over the aggregation framework is that the "document structure" of the collections has been poorly chosen and the JavaScript functionality is "required" to traverse the document for search and analysis.
So you can use it under the allowed constraints, but in most cases you probably should not be using this at all, but fixing the other issues that caused you to believe you needed this feature in the first place.
Related
data: [
{
"name": "mark",
"age": "20"
},
{
"name": "john",
"age": "10"
}
]
In this case, how do I get the entries with age greater than 10?
Sample code:
JsonPath.read(json, "$.data[?(@.age > 10)]");
This can be done using Jayway's JsonPath library.
First, the JSON you showed cannot be processed as-is, because the wrapping { } are missing.
Second, Jayway's comparison operator works if you test against the number as a string (it's a bit odd, but the library does the required casting internally that way).
So, with this JSON:
{
"data":[
{
"name":"mark",
"age":"20"
},
{
"name":"john",
"age":"10"
}
]
}
And this filter:
$.data[?(@.age > '10')]
You get the expected result [{"name" : "mark", "age" : "20"}]. You can test it online here.
Update
As pointed out by Parveen Verma, the suggested filter will not work if the range filter requires an implicit type conversion.
Jayway seems to only support implicit conversions when using the equality operators, e.g. $.data[?(@.age == 10)] (this works despite the type differences).
There is an unmerged pull request that adds this behavior; since the code is somewhat behind the current version it may require some work to integrate it but if you really need this functionality it can be done.
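As a rough illustration of why range filters over numeric strings are fragile, here is a plain JavaScript sketch. It is not Jayway's evaluation logic; it only contrasts comparing the values as strings with comparing them after an explicit conversion. The third record ("anna") is a hypothetical addition to the data from the question.

```javascript
const data = [
  { name: "mark", age: "20" },
  { name: "john", age: "10" },
  { name: "anna", age: "9" } // hypothetical extra record
];

// Compared as strings, the ordering is lexicographic, so "9" > "10".
const asStrings = data.filter(p => p.age > "10").map(p => p.name);

// Converted explicitly to numbers, the comparison behaves as intended.
const asNumbers = data.filter(p => Number(p.age) > 10).map(p => p.name);

console.log(asStrings); // [ 'mark', 'anna' ]
console.log(asNumbers); // [ 'mark' ]
```

If the data is under your control, storing age as a number avoids the question of conversion altogether.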
I have offers stored in Elasticsearch with the following structure:
[
{
"id": "123",
"tariffId": "15477",
"tariffFamilyId": "555",
"characteristics": "xxx"
},
{
"id": "124",
"tariffId": "15478",
"tariffFamilyId": "777",
"characteristics": "yyy"
},
{
"id": "351",
"tariffId": "25271",
"tariffFamilyId": "555",
"characteristics": "zzz"
}
]
I need to find all offers that share the tariffFamilyId of a given tariffId. Initially I know only the tariffId, not the tariffFamilyId (I need to look it up). Normally this means two separate requests to Elasticsearch:
First request: find the tariffFamilyId by tariffId.
Second request: find the offers with that tariffFamilyId.
For example, for tariffId=15477 we get tariffFamilyId=555, so for this family there will be two offers, with ids 123 and 351.
The question: is it possible to make only one request to Elasticsearch instead of two?
P.S. This is for Java implementation.
I have a Java program that parses a JSON file. There are dependencies between the JSON objects (they are procedures that must be executed, so some of them depend on others), and I want to build a graph to represent this. Is there a known way to do it? I tried mxGraph (JGraph) but could not produce the representation.
Here is a simple JSON format:
{
"blueprint":
{
"1" : { "depends" : null },
"2" : { "depends" : "1" },
"3" : { "depends" : [ "2", "1" ] }
}
}
Answering this old question of mine just in case somebody needs it.
Graphviz (graphviz.org) with its DOT language was the way I tackled it.
Thank you everyone for the comments.
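For anyone wanting a starting point, here is a minimal sketch of turning such a dependency map into DOT text. It is not the code I actually used: the function name toDot is hypothetical, and it assumes "depends" has been normalized to an array (empty when there are no dependencies).

```javascript
// Assumed shape: each procedure id maps to an array of prerequisite ids.
const blueprint = {
  "1": { depends: [] },
  "2": { depends: ["1"] },
  "3": { depends: ["2", "1"] }
};

// Build a DOT digraph: one node per procedure, one edge per dependency.
function toDot(bp) {
  const lines = ["digraph blueprint {"];
  for (const [id, node] of Object.entries(bp)) {
    lines.push(`  "${id}";`);
    for (const dep of node.depends) {
      // The edge points from the prerequisite to the procedure needing it.
      lines.push(`  "${dep}" -> "${id}";`);
    }
  }
  lines.push("}");
  return lines.join("\n");
}

const dot = toDot(blueprint);
console.log(dot);
```

Saving the output as blueprint.dot and running dot -Tpng blueprint.dot -o blueprint.png renders the graph as an image.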
Please consider a MongoDB collection with the following document:
{
"_id": "clientsInfo",
"data": {
"clientsList" : [
{
"name" : "Mike",
"country" : "USA"
},
...
]
}
}
After setting the DataSet and defining the Query like this...
{
collectionName:'projectA',
findQuery: {
'_id':'clientsInfo',
},
findFields: {
'_id':0,
'data.clientsList':1
},
}
...I am able to display the first item of the fetched array (java.util.List type) in JasperSoft Studio inside a Text Field using the following expression:
$F{data.clientsList}.get(0)
But suppose I would like to display the whole data set in a Name/Country table...
Question 1: How can I access any of the dictionary fields? When I try the get method I obtain the error "The method get(String) is undefined for the type Object." However, since the object is an instance of com.mongodb.BasicDBObject, it should have inherited that method (see doc).
I have also tried to cast the object to org.json.JSONObject, but then I get the error "net.sf.jasperreports.engine.fill.JRExpressionEvalException: Error evaluating expression for source text: (JSONObject)$F{data.clientsList}.get(0)".
Question 2: Supposing the first question is solved, how can I iterate over the list to access not only the first item but all of them, according to the array length? Is it possible to use a for-loop statement inside the JasperSoft Expression Editor? (if-then-else seems to be available.)
Thanks in advance. Any clue that points me in the right direction will be appreciated.
Just in case someone was in the same situation as I was, I must say this whole approach was wrong.
It's not about making a simple query that returns a big block of complex data formatted as an object or list of objects and then manipulating it in JasperSoft Studio. Instead, what I had to do was design a more elaborate query that directly returns the simple fields I wanted to use. How to do this? By using the Aggregation Framework.
So, by changing this...
{
collectionName:'projectA',
findQuery: {
'_id':'clientsInfo',
},
findFields: {
'_id':0,
'data.clientsList':1
},
}
...for this...
{
runCommand: {
aggregate : 'projectA',
pipeline : [
{'$match': {'_id':'clientsInfo'}},
{'$project': {'data.clientsList': 1}},
{'$unwind': '$data'},
{'$unwind': '$data.clientsList'}
]
}
}
...is how I get name and country fields in order to use them in Text Fields, Tables, ...etc.
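To picture what the final $unwind stage does to the document, here is a plain JavaScript imitation. It is a sketch, not MongoDB code: unwindClients is a hypothetical helper, and the second client is invented for illustration.

```javascript
const doc = {
  _id: "clientsInfo",
  data: {
    clientsList: [
      { name: "Mike", country: "USA" },
      { name: "Ana", country: "Spain" } // hypothetical second client
    ]
  }
};

// Emit one copy of the document per array element, with the array field
// replaced by that single element, as $unwind does for an array field.
function unwindClients(d) {
  return d.data.clientsList.map(client => ({
    ...d,
    data: { ...d.data, clientsList: client }
  }));
}

const rows = unwindClients(doc);
rows.forEach(r => console.log(r.data.clientsList.name, r.data.clientsList.country));
```

Each resulting row carries a single client, so data.clientsList.name and data.clientsList.country are plain values rather than entries buried in a list.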
I've got an interesting problem that is somewhat related to this question. I have multiple values for a field that I want to check. For example, say I want to look for a document with the field name matching "bob" and "barker". Initially, I thought to do this:
db.TVHosts.find({ "name": { "$all" :
[ { "$regex": ".*bob.*" }, { "$regex" : ".*barker.*" } ] } })
The way to do it via the command line is to do this:
db.TVHosts.find({ "name": { "$all" : [ /.*bob.*/, /.*barker.*/ ] } })
But that didn't appear to work from Java. Is there some key piece of documentation that I've missed?
EDIT: I'm using MongoDB via the MongoDB Java Driver.
As seen here, to send a regex to MongoDB from Java you need to use Pattern.compile from java.util.regex.Pattern; the driver serializes the resulting Pattern object as a regular expression.
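The underlying distinction is between a pattern string and a compiled regex object. As a language-neutral illustration of what the $all query is after, here is a plain JavaScript sketch (not Java driver code; matchesAll is a hypothetical helper) in which the pattern strings are compiled first and a name must match every one of them.

```javascript
// Compile the pattern strings into regex objects, the JavaScript analogue
// of calling Pattern.compile on each string in Java.
const patterns = [".*bob.*", ".*barker.*"].map(src => new RegExp(src));

// A name qualifies only if every pattern matches it.
function matchesAll(name) {
  return patterns.every(re => re.test(name));
}

console.log(matchesAll("bob barker")); // true
console.log(matchesAll("bob saget"));  // false
```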