First-time poster here. I was hoping someone could help me create custom SPARQL functions for use within the Jena (ARQ) API. I need SPARQL to do some aggregation, and I know that it already implements avg, count, min, max, and sum, but I also need standard deviation and median (and range, but that can be derived from min and max).
I was hoping the query could be similar to what you use for the already implemented functions:
PREFIX example: <http://www.examples.com/functions#>
PREFIX core: <http://www.core.com/values#>
SELECT (stddev(?price) as ?stddev)
WHERE {
?s core:hasPrice ?price
}
I don't know if that is possible or not, but if I need to use it like other custom functions that would be fine too, as long as it still gets the standard deviation of the results.
All I know is that the functions would be written in Java, which I already know pretty well. So, I was wondering if anyone knew of a good way to go about this or where to start looking for some guidance. I've tried looking for documentation on it, but there doesn't seem to be anything. Any help would be greatly appreciated.
Thanks in advance.
I am not sure you can do what you want without actually changing the grammar.
SUM(...) for example is a keyword defined by the SPARQL grammar:
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/Grammar/sparql_11.jj
https://svn.apache.org/repos/asf/jena/trunk/jena-arq/src/main/java/com/hp/hpl/jena/sparql/expr/aggregate/AggSum.java
A filter function or a property function is probably not what you are looking for.
By the way, STDDEV is not among the basic SQL aggregates either. Is it because two passes over the data are necessary?
Aggregate functions are a special case of SPARQL (and hence of ARQ) functions.
I think that in ARQ it is not easy to extend the set of aggregate functions, while it is easy (and documented) to extend the sets of filter functions and property functions.
You could still calculate the standard deviation with something like this:
PREFIX afn: <http://jena.hpl.hp.com/ARQ/function#>
PREFIX core: <http://www.core.com/values#>
SELECT ( afn:sqrt( sum( (?price - ?avg) * (?price - ?avg) ) / (?count - 1) ) as ?stddev )
WHERE {
?s core:hasPrice ?price .
{
SELECT (avg(?price) as ?avg) (count(*) as ?count)
WHERE {
?s core:hasPrice ?price
}
}
}
I'm still forced to use afn:sqrt, which is an ARQ "proprietary" function that is not in the SPARQL 1.1 draft, so this query wouldn't work on frameworks other than Jena.
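The query above is effectively a two-pass computation: the subquery finds the average and the count, then the outer query sums the squared deviations. For comparison, here is a minimal plain-Java sketch of the same sample standard deviation (n - 1 denominator), next to Welford's single-pass method, which suggests that two passes are not strictly required. The class and method names are made up for illustration, and both methods assume at least two values.

```java
final class StdDevSketch {

    // Two passes: compute the mean first, then sum the squared deviations
    // (this mirrors the subquery + outer query structure).
    static double twoPass(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        double mean = sum / xs.length;
        double squaredDeviations = 0;
        for (double x : xs) squaredDeviations += (x - mean) * (x - mean);
        return Math.sqrt(squaredDeviations / (xs.length - 1));
    }

    // One pass (Welford's method): update a running mean and a running
    // sum of squared deviations as each value arrives.
    static double onePass(double[] xs) {
        long n = 0;
        double mean = 0;
        double m2 = 0;
        for (double x : xs) {
            n++;
            double delta = x - mean;
            mean += delta / n;
            m2 += delta * (x - mean);
        }
        return Math.sqrt(m2 / (n - 1));
    }

    public static void main(String[] args) {
        double[] prices = {13, 15, 20, 30, 32, 11, 16}; // sample values only
        System.out.println("two-pass: " + twoPass(prices));
        System.out.println("one-pass: " + onePass(prices));
    }
}
```

The single-pass form is the kind of update an accumulator-style aggregate could apply one binding at a time.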
Yes, ARQ is extensible in a variety of ways. The ARQ extensions page would be the best place to start.
ARQ allows you to add your own aggregate functions by registering them in the AggregateRegistry. The example code shows how this is done. This can be used to add the custom standard deviation aggregate function requested in the question. In the example below, Commons Math is used to do the calculation.
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;
import org.apache.jena.graph.Graph;
import org.apache.jena.query.*;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.sparql.engine.binding.Binding;
import org.apache.jena.sparql.expr.ExprList;
import org.apache.jena.sparql.expr.NodeValue;
import org.apache.jena.sparql.expr.aggregate.Accumulator;
import org.apache.jena.sparql.expr.aggregate.AccumulatorFactory;
import org.apache.jena.sparql.expr.aggregate.AggCustom;
import org.apache.jena.sparql.expr.aggregate.AggregateRegistry;
import org.apache.jena.sparql.function.FunctionEnv;
import org.apache.jena.sparql.graph.NodeConst;
import org.apache.jena.sparql.sse.SSE;
public class StandardDeviationAggregate {
/**
* Custom aggregates use accumulators. One accumulator is created for each group in a query execution.
*/
public static AccumulatorFactory factory = (agg, distinct) -> new StatsAccumulator(agg);
private static class StatsAccumulator implements Accumulator {
private AggCustom agg;
private SummaryStatistics summaryStatistics = new SummaryStatistics();
StatsAccumulator(AggCustom agg) { this.agg = agg; }
@Override
public void accumulate(Binding binding, FunctionEnv functionEnv) {
// Add values to summaryStatistics
final ExprList exprList = agg.getExprList();
final NodeValue value = exprList.get(0).eval(binding, functionEnv) ;
summaryStatistics.addValue(value.getDouble());
}
@Override
public NodeValue getValue() {
// Get the standard deviation
return NodeValue.makeNodeDouble(summaryStatistics.getStandardDeviation());
}
}
public static void main(String[] args) {
// Register the aggregate function
AggregateRegistry.register("http://example/stddev", factory, NodeConst.nodeMinusOne);
// Add data
Graph g = SSE.parseGraph("(graph " +
"(:item1 :hasPrice 13) " +
"(:item2 :hasPrice 15) " +
"(:item3 :hasPrice 20) " +
"(:item4 :hasPrice 30) " +
"(:item5 :hasPrice 32) " +
"(:item6 :hasPrice 11) " +
"(:item7 :hasPrice 16))");
Model m = ModelFactory.createModelForGraph(g);
String qs = "PREFIX : <http://example/> " +
"SELECT (:stddev(?price) AS ?stddev) " +
"WHERE { ?item :hasPrice ?price }";
// Execute query and print results
Query q = QueryFactory.create(qs) ;
QueryExecution qexec = QueryExecutionFactory.create(q, m);
ResultSet rs = qexec.execSelect() ;
ResultSetFormatter.out(rs);
}
}
I hope this example helps someone at least, even though the question was posted a few years back.
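The question also asked for a median. Unlike the standard deviation, a median needs all the values of a group before it can be computed, so an accumulator for it would have to collect them first. Below is a minimal plain-Java sketch of that collecting step; MedianSketch and its methods are hypothetical names, and wiring it into an Accumulator like the one above (calling add from accumulate and wrapping median() in getValue) is left as an assumption rather than shown.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MedianSketch {
    private final List<Double> values = new ArrayList<>();

    // Called once per value in the group
    void add(double v) {
        values.add(v);
    }

    // Sorts a copy of the collected values and returns the middle one,
    // or the mean of the two middle ones for an even count.
    double median() {
        if (values.isEmpty()) {
            throw new IllegalStateException("no values collected");
        }
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        int n = sorted.size();
        if (n % 2 == 1) {
            return sorted.get(n / 2);
        }
        return (sorted.get(n / 2 - 1) + sorted.get(n / 2)) / 2.0;
    }

    // Convenience helper for trying the sketch out
    static double medianOf(double... xs) {
        MedianSketch m = new MedianSketch();
        for (double x : xs) m.add(x);
        return m.median();
    }

    public static void main(String[] args) {
        System.out.println(medianOf(13, 15, 20, 30, 32, 11, 16));
    }
}
```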
According to the Translator API reference, the following code identifies the language:
LanguageTranslator service = new LanguageTranslator();
service.setUsernameAndPassword("{username}","{password}");
List <IdentifiedLanguage> langs = service.identify("this is a test");
System.out.println(langs);
But, as the attached screenshot shows, that results in a compilation error. I corrected it by changing one line:
ServiceCall<List<IdentifiedLanguage>> langs = service.identify("this is a test");
It would be great if the documentation could be updated.
The error is gone, but what do I do with this ServiceCall? How do I get the language?
Also, a link listing all model IDs would be appreciated, as that helps during initial evaluation of the API. And where can I find which languages are currently supported?
Simply comment out that line, as I don't see the local variable "langs" being used anywhere further in the method.
If you do need it, call the execute method on service.identify("this is a test") and assign the result to "langs", like this:
List langs = service.identify("this is a test").execute();
The service.identify("...") call needs a .execute(), not a different type:
List<IdentifiedLanguage> langs = service.identify("this is a test").execute();
It then returns the expected list of IdentifiedLanguages. Here's a complete example that logs the list and then chooses the highest-confidence language from the list and logs that also:
package com.watson.example;
import java.util.Collections;
import com.ibm.watson.developer_cloud.language_translator.v2.LanguageTranslator;
import com.ibm.watson.developer_cloud.language_translator.v2.model.IdentifiedLanguage;
import java.util.Comparator;
import java.util.List;
public class ItentifyLanguage {
public ItentifyLanguage() {
LanguageTranslator service = new LanguageTranslator();
service.setUsernameAndPassword("{username}","{password}");
// identify returns a list of potential languages with confidence scores
List<IdentifiedLanguage> langs = service.identify("this is a test").execute();
System.out.println("language confidence scores:");
System.out.println(langs);
// this narrows the list down to a single language
IdentifiedLanguage lang = Collections.max(langs, new Comparator<IdentifiedLanguage>() {
public int compare (IdentifiedLanguage a, IdentifiedLanguage b) {
return a.getConfidence().compareTo(b.getConfidence());
}
});
System.out.println("Language " + lang.getLanguage() + " has the highest confidence score at " + lang.getConfidence());
}
public static void main(String[] args) {
new ItentifyLanguage();
}
}
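The compile error in the question comes from the fact that identify(...) hands back an object describing a pending call rather than the result itself. A minimal plain-Java sketch of that pattern is shown here; the Call interface, the identify method and the language codes are all hypothetical stand-ins, not the Watson SDK's actual classes.

```java
import java.util.Arrays;
import java.util.List;

final class DeferredCallSketch {

    // Hypothetical stand-in for a ServiceCall-like type: it describes a
    // request but does nothing until execute() is called.
    interface Call<T> {
        T execute();
    }

    // Counts how many "requests" were actually performed
    static int requestsSent = 0;

    // Hypothetical stand-in for service.identify(...); the result is made up.
    static Call<List<String>> identify(String text) {
        return () -> {
            requestsSent++;
            return Arrays.asList("en", "nl");
        };
    }

    public static void main(String[] args) {
        Call<List<String>> pending = identify("this is a test"); // nothing sent yet
        List<String> langs = pending.execute();                   // the work happens here
        System.out.println(langs + " after " + requestsSent + " request(s)");
    }
}
```

Assigning the pending call directly to a List variable cannot type-check, which is why the extra execute() step is needed.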
For your question about which languages are supported, that's in the doc: https://www.ibm.com/watson/developercloud/doc/language-translator/index.html#supported-languages
I've notified the doc team about the syntax error.
I am looking for an SME who can answer your technical questions
I need to create a new built-in for Jena. With it, I would like to be able to extract the minimum date from my data.
I am wondering whether it is possible to pass a whole class of data to a built-in instead of just one parameter.
Here is the bodyCall of my function:
@Override
public boolean bodyCall(Node[] args, int length, RuleContext context) {
System.out.println("Entra");
checkArgs(length, context);
BindingEnvironment env = context.getEnv();
Node n1 = getArg(0, args, context);
Node n2 = getArg(1, args, context);
//int count = 0;
//do{
//System.out.println("RULE"+context.getEnv().getGroundVersion(n2).getLiteralLexicalForm()); count ++;}while(count <2);
System.out.println("Date 1: " + n1 + " and Date 2: " + n2);
if (n1.isLiteral() && n2.isLiteral()) {
Object v1 = n1.getLiteralValue();
Object v2 = n2.getLiteralValue();
Node max = null;
if (v1 instanceof XSDDateTime && v2 instanceof XSDDateTime) {
XSDDateTime nv1 = (XSDDateTime) v1;
XSDDateTime nv2 = (XSDDateTime) v2;
// Calendar months are 0-based, while XSDDateTime months are 1-based
Calendar data1 = new GregorianCalendar (nv1.getYears(), nv1.getMonths() - 1, nv1.getDays());
Calendar data2 = new GregorianCalendar (nv2.getYears(), nv2.getMonths() - 1, nv2.getDays());
SimpleDateFormat df = new SimpleDateFormat();
df.applyPattern("yyyy-MM-dd");
if (data1.compareTo(data2) > 0)
{
System.out.println("The later date is DATE1: " +df.format(data1.getTime()));
max = args[0];
}
else
{
max = args[1];
System.out.println("The later date is DATE2: " +df.format(data2.getTime()));
}
return env.bind(args[2], max);
}
}
// Doesn't (yet) handle partially bound cases
return false;
}
});
This is my simple rule:
@prefix ex: <http://www.semanticweb.org/prova_rules_M#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
[maxDate:
(?p rdf:type ex:Persona)
(?p <http://www.semanticweb.org/prova_rules_M/persona#data_nascita> ?c)
(?p <http://www.semanticweb.org/prova_rules_M/persona#data_nascita> ?d)
maxDate(?c,?d,?x)
-> print(?x)
]
I pass three parameters to the built-in: two for input and one for output.
My idea is to use two variables, ?c and ?d, each holding a birth date. I would like to get the first record in ?c and the next record in ?d, but it looks like Jena takes the first record each time.
Is it possible, from Java, to say that I want the second record and to scroll through the results?
For example, my ontology contains two dates:
1)1992-04-13T00:00:00.0;
2)1988-04-25T00:00:00.0
I want to have 1) in ?c and 2) in ?d, and then run an algorithm to get the minimum of the two.
PS: The bodyCall above is my attempt to get the maximum of two dates that I pass to the rule. It works fine for that purpose.
Thank you all.
When you implement bodyCall(Node[], int, RuleContext) or headAction(Node[], int, RuleContext) as part of implementing a Builtin, you are given an array that represents the arguments to the builtin. In a rule, you can hand any number of variables to the builtin (not only one).
It seems (and you can correct me if I am misinterpreting your question) that you are looking to work over some class expression in order to get the data that you need. If your overall goal is to operate on 'a class of data', then there are multiple ways to achieve this.
(easiest) Formulate your class expression as statements within the body of the rule. This will ensure that your builtin is passed only individuals of the appropriate class. Chaining together multiple preconditions can allow you to only operate on certain individuals (a 'class of data').
(potentially nontrivial) If you intend to have your builtin operate on a class, use the RuleContext passed to your bodyCall(...) or headAction(...) in order to find individuals that satisfy your class expression (by calling RuleContext#find(...) or some other method).
As an example, let's say that we wanted to act on each member of the class urn:ex:Question. In the first solution, we'd formulate a rule similar to the following:
[eachIndividual: (?x rdf:type urn:ex:Question) -> builtin(?x)]
This would ensure that we'd operate on every single instance of urn:ex:Question. An example of the second solution would be to pass the class expression to your builtin directly. Your question does not indicate how you would identify the class in question, so I will arbitrarily assume that you are interested in classes which are rdfs:subClassOf urn:ex:Question.
[eachSubclass: (?x rdfs:subClassOf urn:ex:Question) -> builtin(?x)]
In this case, you would need to somehow operate on your 'class of data' within your builtin. As mentioned previously, you could potentially use the RuleContext to do so.
EDIT
Let us assume that you have 40 individuals of type urn:ex:Question, and each individual has a property urn:ex:dateSubmitted that indicates when it was submitted. Finding the earliest one can be solved rather trivially using a SPARQL query:
SELECT ?post WHERE {
?post a <urn:ex:Question> .
?post <urn:ex:dateSubmitted> ?date .
}
ORDER BY ?date
LIMIT 1
Edit 2
Based on the new information in your update, you can probably just modify your body call to look like the following:
@Override
public boolean bodyCall( final Node[] args, final int length, final RuleContext context )
{
checkArgs(length, context);
final Node n1 = getArg(0, args, context);
final Node n2 = getArg(1, args, context);
if (n1.isLiteral() && n2.isLiteral()) {
final Node max = Util.compareTypedLiterals(n1, n2) < 0 ? n2 : n1;
return context.getEnv().bind(args[2], max);
}
return false;
}
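Since the original goal was the minimum of two dates, the comparison itself can also be tried outside of Jena. The following is a minimal plain-Java sketch that works on the lexical forms of two date-time values; MinDateSketch and earliest are hypothetical names, and it assumes values without a time zone, like the two dates listed in the question.

```java
import java.time.LocalDateTime;

final class MinDateSketch {

    // Returns whichever of the two ISO date-time strings is earlier.
    // Assumes no time zone suffix; zoned values would need a different parser.
    static String earliest(String a, String b) {
        LocalDateTime first = LocalDateTime.parse(a);
        LocalDateTime second = LocalDateTime.parse(b);
        return first.isAfter(second) ? b : a;
    }

    public static void main(String[] args) {
        System.out.println(earliest("1992-04-13T00:00:00.0", "1988-04-25T00:00:00.0"));
    }
}
```

Inside a builtin, the equivalent change to the bodyCall above would be to keep the smaller of the two nodes instead of the larger one before binding the third argument.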
This is a follow-up question to my last one, since I am still struggling with this topic. I need to select some subjects from my model that meet specific requirements.
If I list my statements (this is only a short part of the output), I get something like this:
WorkOrder2 hasType Workorder .
WorkOrder2 hasResult Fuselage22 .
WorkOrder2 type NamedIndividual .
Now, I would like to select and iterate through all subjects that have hasType Workorder. My idea was something like this:
public static ArrayList<String> listAllWorkorders(Model model) {
ArrayList<String> workorders = new ArrayList<String>();
// list of all work orders associated with given fuselage and work
// station
ResIterator it = model.listSubjectsWithProperty(
ResourceFactory.createProperty(ArumCorePrefix + "hasType"), ArumCorePrefix + "Workorder");
while (it.hasNext()) {
Resource r = it.next();
String workorder = trimPrefix(r.toString());
workorders.add(workorder);
}
// sort the result alphabetically
Collections.sort(workorders);
return workorders;
}
However, it does not return anything. If I use listSubjectsWithProperty without the second argument (String), it works, but it returns not only Workorders but also some other things with the hasType property, which I do not want. What is wrong with my code? Can I use something like this and make it work?
Don't worry about the static use of this function (I will take care of this inelegant approach as soon as I understand what's wrong).
Also, I would like to implement some more complex filtering, for example selecting subjects with multiple properties that must all match for the subject to be returned, like hasType Workorder, hasResult someResult, inStation station, etc. Does Jena support something like this? If not, what is the common approach?
Thanks for any tips!
And a follow-up: how do I check whether some statement is present in my model? I know that there is a model.contains(Statement s) method, but do I have to create the statement passed as the argument in order to call this method? Isn't there a more elegant way, like model.contains(Resource r, Property p, Resource o)?
There are a number of ways you can do this in Jena, but they mostly come down to calling
StmtIterator Model.listStatements(Resource,Property,RDFNode)
with the resource as the first argument, and null as the second and third arguments, as a wildcard. The other methods that do similar things are really just special cases of this. For instance,
listObjectsOfProperty(Property p) — listStatements(null,p,null) and take the object from each statement.
listObjectsOfProperty(Resource s, Property p) — listStatements(s,p,null) and take the object from each statement.
listResourcesWithProperty(Property p) — listStatements(null,p,null) and take the subject from each statement
listResourcesWithProperty(Property p, RDFNode o) — listStatements(null,p,o) and take the subject from each statement
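The wildcard idea behind these methods can be illustrated without Jena at all. Here is a minimal plain-Java sketch in which null in any position means "match anything", plus a helper that keeps only subjects matching several property/object pairs at once. The Triple class, the method names and the extra Station1 row are made up for illustration; this is not Jena's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

final class WildcardMatchSketch {

    // Hypothetical stand-in for an RDF statement; this is not a Jena class.
    static final class Triple {
        final String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }

    static List<Triple> sample() {
        return Arrays.asList(
            new Triple("WorkOrder2", "hasType", "Workorder"),
            new Triple("WorkOrder2", "hasResult", "Fuselage22"),
            new Triple("WorkOrder2", "type", "NamedIndividual"),
            new Triple("Station1", "hasType", "Workstation")); // made-up extra row
    }

    // null in any position acts as a wildcard
    static List<Triple> match(List<Triple> data, String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : data) {
            if ((s == null || s.equals(t.s))
                    && (p == null || p.equals(t.p))
                    && (o == null || o.equals(t.o))) {
                out.add(t);
            }
        }
        return out;
    }

    // Subjects that have every required {property, object} pair
    static SortedSet<String> subjectsWithAll(List<Triple> data, String[][] required) {
        SortedSet<String> result = new TreeSet<>();
        for (Triple t : data) result.add(t.s);       // start with every subject
        for (String[] po : required) {
            SortedSet<String> having = new TreeSet<>();
            for (Triple t : match(data, null, po[0], po[1])) having.add(t.s);
            result.retainAll(having);                // drop subjects lacking this pair
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] required = {{"hasType", "Workorder"}, {"hasResult", "Fuselage22"}};
        System.out.println(subjectsWithAll(sample(), required));
    }
}
```

Note that the object is compared as a resource name here; in the question's code the second argument was a plain String, which is a different kind of value from a resource.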
For convenience, you might prefer to use the method
StmtIterator Resource.listProperties()
which returns an iterator over all the statements in the resource's model with the given resource as a subject.
Here's some example code that includes your model and uses each of these methods:
import java.io.ByteArrayInputStream;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.RDFNode;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.rdf.model.StmtIterator;
public class ResourcePropertiesExample {
final static String NS = "http://example.org/";
final static String modelText = "" +
"@prefix : <"+NS+"> .\n" +
":WorkOrder2 :hasType :Workorder .\n" +
":WorkOrder2 :hasResult :Fuselage22 .\n" +
":WorkOrder2 :type :NamedIndividual .\n" +
"";
public static void main(String[] args) {
final Model model = ModelFactory.createDefaultModel();
model.read( new ByteArrayInputStream( modelText.getBytes()), null, "TTL" );
final Resource workOrder2 = model.getResource( NS+"WorkOrder2" );
System.out.println( "Using Model.listStatements()" );
StmtIterator stmts = model.listStatements( workOrder2, null, (RDFNode) null );
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
System.out.println( "Using Resource.listProperties()" );
stmts = workOrder2.listProperties();
while ( stmts.hasNext() ) {
System.out.println( stmts.next() );
}
}
}
The output is:
Using Model.listStatements()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
Using Resource.listProperties()
[http://example.org/WorkOrder2, http://example.org/type, http://example.org/NamedIndividual]
[http://example.org/WorkOrder2, http://example.org/hasResult, http://example.org/Fuselage22]
[http://example.org/WorkOrder2, http://example.org/hasType, http://example.org/Workorder]
As for checking whether a model contains certain statements, you can, as you noted, use Model.contains, and I don't think there's anything the matter with that. You can also use the various Resource has* methods, such as
hasLiteral( Property p, [various types of literal] )
hasProperty( Property p, RDFNode / String / String, String )
Continuing the example above, and assuming you had defined the property hasResult and the resources fuselage21 and fuselage22, you could write:
workOrder2.hasProperty( hasResult, fuselage21 ); // false
workOrder2.hasProperty( hasResult, fuselage22 ); // true
Suppose I have some jena query object :
String query = "SELECT * WHERE{ ?s <some_uri> ?o ...etc. }";
Query q = QueryFactory.create(query, Syntax.syntaxARQ);
What would be the best way to get all of the subjects of the triples in the query? Preferably without having to do any string parsing/manipulation manually.
For example, given a query
SELECT * WHERE {
?s ?p ?o;
?p2 ?o2.
?s2 ?p3 ?o3.
?s3 ?p4 ?o4.
<http://example.com> ?p5 ?o5.
}
I would hope to have returned some list which looks like
[?s, ?s2, ?s3, <http://example.com>]
In other words, I want the list of all subjects in a query. Even having only those subjects which were variables or those which were literals/uris would be useful, but I'd like to find a list of all of the subjects in the query.
I know there are methods to return the result variables (Query.getResultVars) and some other information (see http://jena.apache.org/documentation/javadoc/arq/com/hp/hpl/jena/query/Query.html), but I can't seem to find anything which will get specifically the subjects of the query (a list of all result variables would return the predicates and objects as well).
Any help appreciated.
Interesting question. What you need to do is go through the query, and for each block of triples iterate through and look at the first part.
The most robust way to do this is via an element walker which will go through each part of the query. It might seem over the top in your case, but queries can contain all sorts of things, including FILTERs, OPTIONALs, and nested SELECTs. Using the walker means that you can ignore that stuff and focus on only what you want:
Query q = QueryFactory.create(query); // SPARQL 1.1
// Remember distinct subjects in this
final Set<Node> subjects = new HashSet<Node>();
// This will walk through all parts of the query
ElementWalker.walk(q.getQueryPattern(),
// For each element...
new ElementVisitorBase() {
// ...when it's a block of triples...
public void visit(ElementPathBlock el) {
// ...go through all the triples...
Iterator<TriplePath> triples = el.patternElts();
while (triples.hasNext()) {
// ...and grab the subject
subjects.add(triples.next().getSubject());
}
}
}
);
It might be too late, but another way is to use the Jena ARQ libraries to create the algebra of the given query. Once the query has been compiled to algebra, you can traverse all the triples (given in the WHERE clause). Here is the code; I hope it helps:
Query query = qExec.getQuery(); // qExec is a QueryExecution object
// Generate algebra of the query
Op op = Algebra.compile(query);
CustomOpVisitorBase opVisitorBase = new CustomOpVisitorBase();
opVisitorBase.opVisitorWalker(op);
List<Triple> queryTriples = opVisitorBase.triples;
The CustomOpVisitorBase class is given below:
public class CustomOpVisitorBase extends OpVisitorBase {
// Collects the triples of every basic graph pattern in the query
final List<Triple> triples = new ArrayList<Triple>();
void opVisitorWalker(Op op) {
OpWalker.walk(op, this);
}
@Override
public void visit(final OpBGP opBGP) {
// A query can contain several BGPs, so append rather than overwrite
triples.addAll(opBGP.getPattern().getList());
}
}
Traverse the list of triples and use accessor methods such as triple.getSubject() to get the parts you need.
I'm trying to implement paging using row-based limiting (for example: setFirstResult(5) and setMaxResults(10)) on a Hibernate Criteria query that has joins to other tables.
Understandably, data is getting cut off randomly; and the reason for that is explained here.
As a solution, the page suggests using a "second sql select" instead of a join.
How can I convert my existing criteria query (which has joins using createAlias()) to use a nested select instead?
You can achieve the desired result by requesting a list of distinct ids instead of a list of distinct hydrated objects.
Simply add this to your criteria:
criteria.setProjection(Projections.distinct(Projections.property("id")));
Now you'll get the correct number of results according to your row-based limiting. This works because the projection performs the distinctness check as part of the SQL query, whereas a ResultTransformer filters the results for distinctness only after the SQL query has been performed.
Worth noting is that instead of getting a list of objects, you will now get a list of ids, which you can use to hydrate objects from hibernate later.
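To see why the order of "distinct" and "limit" matters, here is a minimal plain-Java sketch that involves no Hibernate at all. The row data and the names JoinPagingSketch, limitThenDistinct and distinctThenLimit are made up for illustration; each number stands for the root-entity id on one row of a joined result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class JoinPagingSketch {

    // Root-entity ids as they would repeat across joined rows:
    // entity 1 has three child rows, entity 2 has two, and so on.
    static final List<Integer> JOINED_ROWS = Arrays.asList(1, 1, 1, 2, 2, 3, 4, 4, 5);

    // Row limit applied to the joined rows, duplicates removed afterwards
    // in memory (the situation that cuts pages short).
    static List<Integer> limitThenDistinct(List<Integer> rows, int first, int max) {
        return rows.stream().skip(first).limit(max).distinct().collect(Collectors.toList());
    }

    // Duplicates removed first (as a distinct id projection would do in SQL),
    // row limit applied to the distinct ids.
    static List<Integer> distinctThenLimit(List<Integer> rows, int first, int max) {
        return rows.stream().distinct().skip(first).limit(max).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("limit then distinct: " + limitThenDistinct(JOINED_ROWS, 0, 3));
        System.out.println("distinct then limit: " + distinctThenLimit(JOINED_ROWS, 0, 3));
    }
}
```

With a page size of three, the first variant yields a single entity because all three rows belong to entity 1, while the second yields three different entities.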
I am using this one with my code.
Simply add this to your criteria:
criteria.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
That code gives a result like select distinct * from table in native SQL, although the duplicates are removed from the returned entities rather than in the SQL itself.
A slight improvement building on FishBoy's suggestion.
It is possible to do this kind of query in one hit, rather than in two separate stages. i.e. the single query below will page distinct results correctly, and also return entities instead of just IDs.
Simply use a DetachedCriteria with an id projection as a subquery, and then add paging values on the main Criteria object.
It will look something like this:
DetachedCriteria idsOnlyCriteria = DetachedCriteria.forClass(MyClass.class);
//add other joins and query params here
idsOnlyCriteria.setProjection(Projections.distinct(Projections.id()));
Criteria criteria = getSession().createCriteria(MyClass.class);
criteria.add(Subqueries.propertyIn("id", idsOnlyCriteria));
criteria.setFirstResult(0).setMaxResults(50);
return criteria.list();
A small improvement to @FishBoy's suggestion is to use the id projection, so you don't have to hard-code the identifier property name.
criteria.setProjection(Projections.distinct(Projections.id()));
The solution:
criteria.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
works very well.
session = (Session) getEntityManager().getDelegate();
Criteria criteria = session.createCriteria(ComputedProdDaily.class);
ProjectionList projList = Projections.projectionList();
projList.add(Projections.property("user.id"), "userid");
projList.add(Projections.property("loanState"), "state");
criteria.setProjection(Projections.distinct(projList));
criteria.add(Restrictions.isNotNull("this.loanState"));
criteria.setResultTransformer(Transformers.aliasToBean(UserStateTransformer.class));
This helped me :D
If you want to use ORDER BY, just add:
criteria.setProjection(
Projections.distinct(
Projections.projectionList()
.add(Projections.id())
.add(Projections.property("the property that you want to order by"))
)
);
I will now explain a different solution, where you can use the normal query and pagination method without the problem of possible duplicates or suppressed items.
This solution has the advantage that it:
is faster than the PK id solution mentioned in this article
preserves the ordering and doesn't use the 'in clause' on a possibly large dataset of PKs
The complete article can be found on my blog.
Hibernate makes it possible to define the association fetching method not only at design time but also at runtime, per query execution. So we use this approach in conjunction with a little reflection, and can also automate the process of changing the query's fetching strategy for collection properties only.
First we create a method which resolves all collection properties from the Entity Class:
public static List<String> resolveCollectionProperties(Class<?> type) {
List<String> ret = new ArrayList<String>();
try {
BeanInfo beanInfo = Introspector.getBeanInfo(type);
for (PropertyDescriptor pd : beanInfo.getPropertyDescriptors()) {
if (Collection.class.isAssignableFrom(pd.getPropertyType()))
ret.add(pd.getName());
}
} catch (IntrospectionException e) {
e.printStackTrace();
}
return ret;
}
After that, you can use this little helper method to tell your criteria object to change the FetchMode to SELECT for that query.
Criteria criteria = …
// … add your expression here …
// set fetchmode for every Collection Property to SELECT
for (String property : ReflectUtil.resolveCollectionProperties(YourEntity.class)) {
criteria.setFetchMode(property, org.hibernate.FetchMode.SELECT);
}
criteria.setFirstResult(firstResult);
criteria.setMaxResults(maxResults);
criteria.list();
Doing this is different from defining the FetchMode of your entities at design time. So you can use the normal join association fetching alongside the paging algorithms in your UI, because this is usually not the critical part, and it is more important to get your results as quickly as possible.
Below is a way to use multiple projections to perform a distinct count:
package org.hibernate.criterion;
import org.hibernate.Criteria;
import org.hibernate.Hibernate;
import org.hibernate.HibernateException;
import org.hibernate.type.Type;
/**
* A count for style : count (distinct (a || b || c))
*/
public class MultipleCountProjection extends AggregateProjection {
private boolean distinct;
protected MultipleCountProjection(String prop) {
super("count", prop);
}
public String toString() {
if(distinct) {
return "distinct " + super.toString();
} else {
return super.toString();
}
}
public Type[] getTypes(Criteria criteria, CriteriaQuery criteriaQuery)
throws HibernateException {
return new Type[] { Hibernate.INTEGER };
}
public String toSqlString(Criteria criteria, int position, CriteriaQuery criteriaQuery)
throws HibernateException {
StringBuffer buf = new StringBuffer();
buf.append("count(");
if (distinct) buf.append("distinct ");
String[] properties = propertyName.split(";");
for (int i = 0; i < properties.length; i++) {
buf.append( criteriaQuery.getColumn(criteria, properties[i]) );
if(i != properties.length - 1)
buf.append(" || ");
}
buf.append(") as y");
buf.append(position);
buf.append('_');
return buf.toString();
}
public MultipleCountProjection setDistinct() {
distinct = true;
return this;
}
}
ExtraProjections.java
package org.hibernate.criterion;
public final class ExtraProjections
{
public static MultipleCountProjection countMultipleDistinct(String propertyNames) {
return new MultipleCountProjection(propertyNames).setDistinct();
}
}
Sample Usage:
String propertyNames = "titleName;titleDescr;titleVersion";
Criteria countCriteria = ....
countCriteria.setProjection(ExtraProjections.countMultipleDistinct(propertyNames));
Referenced from https://forum.hibernate.org/viewtopic.php?t=964506
NullPointerException in some cases!
Without criteria.setProjection(Projections.distinct(Projections.property("id")))
all queries run fine!
This solution is bad!
Another way is to use SQLQuery. In my case the following code works fine:
List result = getSession().createSQLQuery(
"SELECT distinct u.id as usrId, b.currentBillingAccountType as oldUser_type,"
+ " r.accountTypeWhenRegister as newUser_type, count(r.accountTypeWhenRegister) as numOfRegUsers"
+ " FROM recommendations r, users u, billing_accounts b WHERE "
+ " r.user_fk = u.id and"
+ " b.user_fk = u.id and"
+ " r.activated = true and"
+ " r.audit_CD > :monthAgo and"
+ " r.bonusExceeded is null"
+ " group by u.id, r.accountTypeWhenRegister")
.addScalar("usrId", Hibernate.LONG)
.addScalar("oldUser_type", Hibernate.INTEGER)
.addScalar("newUser_type", Hibernate.INTEGER)
.addScalar("numOfRegUsers", Hibernate.BIG_INTEGER)
.setParameter("monthAgo", monthAgo)
.setMaxResults(20)
.list();
The distinct filtering is done in the database! As opposed to:
criteria.setResultTransformer(Criteria.DISTINCT_ROOT_ENTITY);
where the distinct filtering is done in memory, after the entities are loaded!