MongoDB Java MapReduce getOutputCollection - java

I don't know anything about the OutputTypes. I'm trying something like this:
output=collection.mapReduce(map,reduce,null,
MapReduceCommand.OutputType.INLINE,null);
collection=output.getOutputCollection();
But the collection is null, because of the INLINE output type. I need the reduced collection because I need to reduce it further. How could I do this?

I finally found the solution:
output = collection.mapReduce(map, reduce, "mymap", MapReduceCommand.OutputType.REDUCE, null);
collection = output.getOutputCollection();
Note that you cannot store into the same target "mymap" again and again. When looping, use a different name on each pass, for example "mymap".concat(Integer.toString(i)).

Related

Calling groupBy method from dataset object through java interop using clojure

I need to call the groupBy method on a spark dataset by way of the java interop through clojure.
I only need to call this for one column, but the only groupBy signatures I can get to work involve multiple column names. The api seems to indicate that I should be able to use only one column name, but I cannot get this to work. What I really need is a good example to work from. What am I missing?
This does not work:
(-> a-dataset
    (.groupBy "a-column"))
This does:
(-> b-dataset
    (.groupBy "b-column" (into-array ["c-column"])))
The error message I receive says there is no groupBy method for dataset.
I know it is looking for a Column, but I don't know how to give it one.
I don't know a thing about Spark, but I think we can understand it better by translating this example from the Spark API documentation into Clojure:
// To create Dataset<Row> using SparkSession
Dataset<Row> people = spark.read().parquet("...");
Dataset<Row> department = spark.read().parquet("...");
people.filter(people.col("age").gt(30))
.join(department, people.col("deptId").equalTo(department.col("id")))
.groupBy(department.col("name"), people.col("gender"))
.agg(avg(people.col("salary")), max(people.col("age")));
We can assume that you already have a Dataset and want to call .groupBy on it. The method you are probably calling is the one that takes Column... as an argument. You were on the right track: variadic methods in Java collect their arguments into an array, so this is just like receiving a Column[] as the argument.
The problem is then how to get a Column from a Dataset. It seems you can call dataset.col(String colName) to get one. Putting everything together:
(.groupBy my-dataset (into-array Column [(.col my-dataset "a-column")]))
Again, I have no way to verify this, but I think it should help.
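The underlying issue can be reproduced in plain Java: a varargs parameter is an ordinary array parameter in the compiled signature, so a caller that matches on that signature (as Clojure interop appears to) has to pass the array explicitly, even when it is empty. The sketch below uses a hypothetical stand-in class, FakeDataset, which is not Spark's Dataset; it only borrows the groupBy(String, String...) shape that worked in the question.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

class VarargsInteropSketch {
    // Hypothetical stand-in for a Dataset-like class; NOT Spark's API.
    public static class FakeDataset {
        public String groupBy(String col1, String... cols) {
            return col1 + Arrays.toString(cols);
        }
    }

    // Looks the method up the way a reflective caller sees it:
    // two parameters, a String and a String[].
    static String groupByReflectively(FakeDataset ds, String col1, String[] rest)
            throws Exception {
        Method m = FakeDataset.class.getMethod("groupBy", String.class, String[].class);
        return (String) m.invoke(ds, col1, rest);
    }

    // Reports whether a one-parameter groupBy(String) exists on the class.
    static boolean hasSingleArgGroupBy() {
        try {
            FakeDataset.class.getMethod("groupBy", String.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        FakeDataset ds = new FakeDataset();
        // javac inserts the empty array for us here.
        System.out.println(ds.groupBy("a-column"));
        // A reflective caller must supply the empty array itself.
        System.out.println(groupByReflectively(ds, "a-column", new String[0]));
        // There is no groupBy(String) overload to find.
        System.out.println(hasSingleArgGroupBy());
    }
}
```

If the same reasoning carries over to the real Dataset, the single-column call from Clojure would presumably be (.groupBy a-dataset "a-column" (into-array String [])), though that is untested here.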

Hibernate search - '%like%' type query

I'm using hibernate-search in my Spring MVC project and I would like to accomplish something but I'm not sure if it's possible. Here is the problem:
I'm using the NGramFilterFactory class for this and have configured minGramSize=3 and maxGramSize=3.
Let's say my search term is "Keyword"
If I type anything like this:
"ywo", "key", "ord", "blablaordblabla"
the query will return "Keyword". This is fine and I understand how it works, but what I want is different. When I type something like:
"bkey", "blablaordblabla"
I don't want to return "Keyword". "Keyword" should be returned only when search term is something like:
"key", "ord", "ywo", "eywo", "word" etc...
So, I guess I'm looking for a '%like%' type query. How can I accomplish this with hibernate-search?
I don't know if this is what you are looking for, but maybe you need what are called "wildcard queries".
Try to have a look at this link as reference.
Also have a look at this stackoverflow topic
If you analyze your input with NGrams you won't be able to perform exact "Like%" queries.
You probably want a SimpleAnalyzer or something similar that doesn't break your keywords into smaller pieces, or you might want to skip analysis for this field and index it as-is.
You then combine this with a wildcard query; note how the example in the reference docs uses the keyword element to build the query, which inherently disables the analyzer on the input. (Make sure you scroll down to the Wildcard queries section in the docs.)
I assume you're using NGrams because you need them for another use case. Remember you can use the @Fields annotation to index the same property in several different ways, so you could index it with ngrams and also in another form better suited to wildcard queries.
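To make the difference concrete, here is a small plain-Java sketch comparing a simplified trigram match with '%like%' (substring) semantics. It is not Hibernate Search or Lucene code: ngramMatch is a hypothetical model in which any shared 3-gram counts as a hit, which is roughly why "bkey" and "blablaordblabla" find "Keyword" when the query itself is analyzed into n-grams.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class NGramVsLikeSketch {
    // Splits text into lowercase character n-grams of a fixed size.
    static Set<String> ngrams(String text, int size) {
        String s = text.toLowerCase(Locale.ROOT);
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + size <= s.length(); i++) {
            grams.add(s.substring(i, i + size));
        }
        return grams;
    }

    // Simplified model of an n-gram match: any shared gram counts as a hit.
    static boolean ngramMatch(String indexed, String query, int size) {
        Set<String> shared = ngrams(indexed, size);
        shared.retainAll(ngrams(query, size));
        return !shared.isEmpty();
    }

    // '%like%' semantics: the whole query must occur inside the indexed value.
    static boolean likeMatch(String indexed, String query) {
        return indexed.toLowerCase(Locale.ROOT)
                .contains(query.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String[] queries = {"key", "eywo", "word", "bkey", "blablaordblabla"};
        for (String q : queries) {
            System.out.println(q
                    + " ngram=" + ngramMatch("Keyword", q, 3)
                    + " like=" + likeMatch("Keyword", q));
        }
    }
}
```

In this model every query matches under the trigram rule, while only "key", "eywo", and "word" match under the substring rule, which is the behaviour the question asks for.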

Query that returns only certain fields?

I've got a very large, structured document(s) stored in MongoDB, and am using Morphia to query and model it in Java. I'd like to write a query that only returns a handful of the fields in that document, rather than returning the entire thing. I've looked in the documentation on the Morphia site, but couldn't find anything that explains how to do this. Is it possible to write a query like this with Morphia? In pseudocode it would be something like
GET doc.propertyA, doc.propertyB, doc.propertyX FROM doc WHERE doc.someOtherProperty = 'Foo'
Thoughts? Or is Morphia not designed to operate in this manner? Is there something better I could try?
Take a look at this: https://rawgithub.com/wiki/mongodb/morphia/javadoc/0.103/apidocs/com/google/code/morphia/query/Query.html#retrievedFields%28boolean,%20java.lang.String...%29
You'll still get back your entity objects but they'll only contain the fields listed.
An example is better than words.
This query returns only the "_id" field:
datastore.createQuery(entityClazz.class).retrievedFields(true, Mapper.ID_KEY);

How can I override a typesafe config list value on the command line?

I have an application.conf file with a structure like the following:
poller {
datacenters = []
}
I would like to override "datacenters" on the command line.
For other configuration keys whose values are simple types (strings, numbers) I can override using -Dpath.to.config.value=<value>, and this works fine.
However, I can't seem to find a way to do this for lists. In the example above, I tried to set "datacenters" to ["SJC", "IAD"] like so: -Dpoller.datacenters="['SJC', 'IAD']", but I get an exception that the key value is a string, not a list.
Is there a way to signal to the typesafe config library that this value is a list?
An alternative syntax is implemented in version 1.0.1 for this:
-Dpoller.datacenters.0=SJC -Dpoller.datacenters.1=IAD
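One way to picture the indexed syntax: each -D flag becomes a system property whose last path element is a number, and the numbered entries are read back in index order as a list. The sketch below models that idea with plain java.util.Properties. It is an illustration only, not the Typesafe Config implementation, and listFromIndexedKeys is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class IndexedPropertiesSketch {
    // Collects prefix.0, prefix.1, ... into a list ordered by numeric index.
    static List<String> listFromIndexedKeys(Properties props, String prefix) {
        Map<Integer, String> byIndex = new TreeMap<>();
        String dotted = prefix + ".";
        for (String name : props.stringPropertyNames()) {
            if (!name.startsWith(dotted)) {
                continue;
            }
            String suffix = name.substring(dotted.length());
            try {
                int index = Integer.parseInt(suffix);
                if (index >= 0) {
                    byIndex.put(index, props.getProperty(name));
                }
            } catch (NumberFormatException e) {
                // Suffix is not a numeric index, so it is not a list element.
            }
        }
        return new ArrayList<>(byIndex.values());
    }

    public static void main(String[] args) {
        // Stand-in for -Dpoller.datacenters.0=SJC -Dpoller.datacenters.1=IAD
        Properties props = new Properties();
        props.setProperty("poller.datacenters.1", "IAD");
        props.setProperty("poller.datacenters.0", "SJC");
        System.out.println(listFromIndexedKeys(props, "poller.datacenters"));
    }
}
```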
I had the same issue some weeks ago, and finally dived into the source code to understand what's going on:
This feature is not implemented; it's not possible to define a list using a command-line argument.
Fixing it wouldn't be that hard, but someone needs to take the time to do it.

How to create a variable name at runtime?

I am not able to create the name of the object at runtime. My statement is:
Map<String,String> objectName+""+lineNumber = new HashMap<String,String>();
It's giving me a compile-time error. I want to create the HashMap object at runtime depending on the line number.
Java is a compiled language rather than an interpreted one, so variable names must be fixed at compile time and the compiler does not know how to handle this. Such a thing might make sense in a scripting language.
If you need a custom name for a "variable", a construct like the following might make sense:
Map<String,Map<String,String>> varMap = new HashMap<String,Map<String,String>>();
varMap.put(objectName+" "+lineNumber, new HashMap<String, String>());
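A slightly fuller sketch of the same map-of-maps idea, wrapped in a small class so the lookup and lazy creation live in one place. The names NamedMapsSketch and mapFor are hypothetical, and the key is built as objectName + lineNumber, following the question.

```java
import java.util.HashMap;
import java.util.Map;

class NamedMapsSketch {
    // Outer map: runtime-built name -> the map that "variable" refers to.
    private final Map<String, Map<String, String>> mapsByName = new HashMap<>();

    // Returns the map registered under objectName + lineNumber,
    // creating an empty one the first time that name is requested.
    Map<String, String> mapFor(String objectName, int lineNumber) {
        return mapsByName.computeIfAbsent(objectName + lineNumber, k -> new HashMap<>());
    }

    // Number of distinct names created so far.
    int count() {
        return mapsByName.size();
    }

    public static void main(String[] args) {
        NamedMapsSketch maps = new NamedMapsSketch();
        maps.mapFor("record", 1).put("status", "ok");
        maps.mapFor("record", 2).put("status", "failed");
        // The same name always resolves to the same map.
        System.out.println(maps.mapFor("record", 1).get("status"));
        System.out.println(maps.count());
    }
}
```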
You can't do this directly in Java (without major tricks).
What you can (and probably should) do:
Put your Map in another map which has the 'variable' name as a key.
If you really want to do that, you have to use code generation. For this, again, you have multiple options:
Generate Java source code and compile it.
Generate Java bytecode on the fly. You might want to look at this list of available libraries: http://java-source.net/open-source/bytecode-libraries
Having a dynamic object name is of no use.
First, it's not possible to give a reference a dynamic name. The bigger question is: why do you want to do it?
If it is just for learning and experimenting, I suggest you follow proper exercises.
But if you are trying to meet a project requirement, please explain the requirement. There will be some other way to achieve it.
