How to get over limitations of the Hibernate Criteria and Example APIs?


I'm in a position where our company has a database search service that is highly configurable, for which it's very useful to configure queries in a programmatic fashion. The Criteria API is powerful but when one of our developers refactors one of the data objects, the criteria restrictions won't signal that they're broken until we run our unit tests, or worse, are live and on our production environment. Recently, we had a refactoring project essentially double in working time unexpectedly due to this problem, a gap in project planning that, had we known how long it would really take, we probably would have taken an alternative approach.
I'd like to use the Example API to solve this problem. The Java compiler can loudly indicate that our queries are borked if we are specifying 'where' conditions on real POJO properties. However, there's only so much functionality in the Example API and it's limiting in many ways. Take the following example
Product product = new Product();
product.setName("P%");
Example prdExample = Example.create(product);
prdExample.excludeProperty("price");
prdExample.enableLike();
prdExample.ignoreCase();
Here, the property "name" is being queried against (where name like 'P%'), and if I were to remove or rename the field "name", we would know instantly. But what about the property "price"? It's being excluded because the Product object has some default value for it, so we're passing the "price" property name to an exclusion filter. Now if "price" got removed, this query would be syntactically invalid and you wouldn't know until runtime. LAME.
Another problem - what if we added a second where clause:
product.setPromo("Discounts up to 10%");
Because of the call to enableLike(), this example will match on the promo text "Discounts up to 10%", but also "Discounts up to 10,000,000 dollars" or anything else that matches. In general, the Example object's query-wide modifications, such as enableLike() or ignoreCase() aren't always going to be applicable to every property being checked against.
Here's a third, and major, issue - what about other special criteria? There's no way to get every product with a price greater than $10 using the standard example framework. There's no way to order results by promo, descending. If the Product object joined on some Manufacturer, there's no way to add a criterion on the related Manufacturer object either. There's no way to safely specify the FetchMode on the criteria for the Manufacturer either (although this is a problem with the Criteria API in general - invalid fetched relationships fail silently, even more of a time bomb)
For all of the above examples, you would need to go back to the Criteria API and use string representations of properties to make the query - again, eliminating the biggest benefit of Example queries.
What alternatives exist to the Example API that can get the kind of compile-time advice we need?

My company gives developers days when we can experiment and work on pet projects (a la Google), and I spent some time working on a framework to use Example queries while getting around the limitations described above. I've come up with something that could be useful to other people interested in Example queries too. Here is a sample of the framework using the Product example.
Criteria criteriaQuery = session.createCriteria(Product.class);
Restrictions<Product> restrictions = Restrictions.create(Product.class);
Product example = restrictions.getQueryObject();
example.setName(restrictions.like("N%"));
example.setPromo("Discounts up to 10%");
restrictions.addRestrictions(criteriaQuery);
Here's an attempt to fix the issues in the code example from the question - the problem of the default value for the "price" field no longer exists, because this framework requires that criteria be explicitly set. The second problem of having a query-wide enableLike() is gone - the matcher is only on the "name" field.
The other problems mentioned in the question are also gone in this framework. Here are example implementations.
product.setPrice(restrictions.gt(10)); // price > 10
product.setPromo(restrictions.order(false)); // order by promo desc
Restrictions<Manufacturer> manufacturerRestrictions
= Restrictions.create(Manufacturer.class);
//configure manuf restrictions in the same manner...
product.setManufacturer(restrictions.join(manufacturerRestrictions));
/* there are also joinSet() and joinList() methods
for one-to-many relationships as well */
Even more sophisticated restrictions are available.
product.setPrice(restrictions.between(45,55));
product.setManufacturer(restrictions.fetch(FetchMode.JOIN));
product.setName(restrictions.or("Foo", "Bar"));
After showing the framework to a coworker, he mentioned that many data mapped objects have private setters, making this kind of criteria setting difficult as well (a different problem with the Example API!). So, I've accounted for that too. Instead of using setters, getters are also queryable.
restrictions.is(product.getName()).eq("Foo");
restrictions.is(product.getPrice()).gt(10);
restrictions.is(product.getPromo()).order(false);
I've also added some extra checking on the objects to ensure better type safety - for example, the relative criteria (gt, ge, le, lt) all require a parameter whose type extends Comparable. Also, if you use a getter in the style specified above, and there's a @Transient annotation present on the getter, it will throw a runtime error.
But wait, there's more!
If you like that Hibernate's built-in Restrictions utility can be statically imported, so that you can do things like criteria.add(eq("name", "foo")) without making your code really verbose, there's an option for that too.
Restrictions<Product> restrictions = new Restrictions<Product>() {
    public void query(Product queryObject) {
        queryObject.setPrice(gt(10));
        queryObject.setPromo(order(false));
        // gt() and order() are inherited from Restrictions
    }
};
That's it for now - thank you very much in advance for any feedback! We've posted the code on Sourceforge for those that are interested. http://sourceforge.net/projects/hqbe2/

The API looks great!
Restrictions.order(boolean) smells like control coupling. It's a little unclear what the values of the boolean argument represent.
I suggest replacing or supplementing with orderAscending() and orderDescending().

Have a look at Querydsl. Their JPA/Hibernate module requires code generation. Their Java collections module uses proxies but cannot be used with JPA/Hibernate at the moment.

Related

Is it okay to have type checking code when working with databases?

I'm trying to insert some code into a database, but I have encountered a problem while working with models that are subclasses of each other. I have a list that holds all these subclasses.
List<Metric> metrics = experiment.getMetrics();
for(Metric m : metrics) {
int id = m.getId();
// type checking code
}
Metric has subclasses Rating and Quantity. Each of these in turn has its own uniquely defined table. I am conflicted over the idea of using type checking, but I don't see any immediate solution. One alternative, which doesn't seem any better, would be to create a new column in the Metric table called metric_type. But this would lead to something quite similar to type checking. Any suggestions?

You have encountered Object-relational impedance mismatch due to mapping between not fully compatible systems. Since inheritance is not possible between tables in the relational model you will have to sacrifice something in the object model that uses inheritance. There will be edge cases no matter what you do unless you switch to an object database.
If you define custom CRUD operations in classes that extend Metric, loading entities can be tricky. What exactly will be loaded by Metric.get(id) if each table has its own PK sequence and both Rating and Quantity can have the same numeric PK value?
You can take a look at how JPA solves this problem. It uses custom annotations, e.g. @MappedSuperclass and @Entity. I guess that's a form of type checking.

I wouldn't suggest type checking.
The OOP way to solve this would be to add an insert method to the Metric class.
Then override the method in both Rating and Quantity with the appropriate code that inserts the object into the respective table.
for(Metric m : metrics) {
int id = m.getId();
m.insert();
}
Inside your loop simply call insert and due to late-binding the appropriate method will be called and the right code will be executed.
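To make the late-binding idea concrete, here is a minimal self-contained sketch. It assumes hypothetical table names (rating, quantity) and a hypothetical metric_id column, and it returns the SQL string instead of executing it, so it only illustrates the dispatch, not real persistence code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Base type: each subclass decides which table it is written to.
abstract class Metric {
    private final int id;

    Metric(int id) { this.id = id; }

    int getId() { return id; }

    // Hypothetical: returns the statement a real insert() would execute.
    abstract String insertSql();
}

class Rating extends Metric {
    Rating(int id) { super(id); }

    @Override
    String insertSql() {
        return "INSERT INTO rating (metric_id) VALUES (" + getId() + ")";
    }
}

class Quantity extends Metric {
    Quantity(int id) { super(id); }

    @Override
    String insertSql() {
        return "INSERT INTO quantity (metric_id) VALUES (" + getId() + ")";
    }
}

class MetricInsertDemo {
    // No instanceof checks: the overridden method is chosen at runtime.
    static List<String> insertAll(List<Metric> metrics) {
        List<String> statements = new ArrayList<>();
        for (Metric m : metrics) {
            statements.add(m.insertSql());
        }
        return statements;
    }

    public static void main(String[] args) {
        List<Metric> metrics = Arrays.asList(new Rating(1), new Quantity(2));
        for (String sql : insertAll(metrics)) {
            System.out.println(sql);
        }
    }
}
```

The loop never asks what kind of Metric it holds, which is the point of the answer above.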

Hibernate: Avoid using column names as strings

I would like to avoid having column names as strings in the code. Is there any other way to accomplish this?:
String query = "SELECT c.foo1.columnA, c.foo1.foo2.columnB FROM Table c";
session.createQuery(query).list();
I'm able to iterate over a column path given as a string, like c.foo1.foo2.columnB, by splitting it and using the ClassMetadata, the property Type and other Hibernate functions until I reach the last element. However, I can't think of a way to get a column string from Java beans by iterating through their properties.

Not sure what the intention is. A couple of thoughts:
If you are worried about property names being wrong, current IDEs do a good job of validating the property names in JPA queries.
Object reflection can give you the property names. But not necessarily all properties are mapped to columns. You can look at this and use it along with bean property names via reflection.
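As a rough illustration of the reflection idea, the sketch below lists bean property names with the standard java.beans.Introspector. The Foo class is a hypothetical stand-in for a mapped bean, and, as noted above, a property name found this way is not necessarily a mapped column.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

class BeanPropertyLister {

    // Hypothetical bean, used only for illustration.
    public static class Foo {
        private String columnA;

        public String getColumnA() { return columnA; }

        public void setColumnA(String columnA) { this.columnA = columnA; }
    }

    // Returns the JavaBeans property names of the class, excluding
    // those inherited from Object (such as "class").
    static List<String> propertyNames(Class<?> beanClass) {
        try {
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> names = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                names.add(pd.getName());
            }
            return names;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(propertyNames(Foo.class));
    }
}
```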
Hope that helps.

There is no way to achieve what you are looking for. But if your concern is the correctness of these queries, and you worry that a problem will not be noticed until execution reaches the query, you could use a NamedQuery:
@Entity
@NamedQuery(
    name="findAllEmployeesByFirstName",
    query="SELECT OBJECT(emp) FROM Employee emp WHERE emp.firstName = 'John'"
)
public class Employee implements Serializable {
...
}
Usage
List employees = em.createNamedQuery("findAllEmployeesByFirstName").getResultList();
The benefit is that queries defined in NamedQuery annotations are compiled to actual SQL at startup time, so incorrect field references (typos etc.) will cause a startup error and the application will not start.
Another option, as mentioned in the other answer, is to trust a good IDE to refactor all occurrences properly when you rename fields (IntelliJ IDEA does a great job at this, and so would other IDEs).
EDIT: I do not think there is any performance degradation with named queries. If anything, they may appear faster because compiled queries are cached (very subjective).
Finally, it's better to use the actual query as-is, as mentioned in the comments. It is far more readable and easier to debug in its context. If you are concerned about correctness, unit-test the heck out of it and be confident.

Equivalent to NEW operator that uses mutators instead of constructor

JPQL queries can return custom result objects with the NEW operator:
SELECT NEW myPackage.MyVO(e.fieldX, e.relationshipX.fieldY)
FROM MyEntity AS e
This is very useful to feed VOs. The problem is, you have to create constructors that exactly match the number of arguments, order and types of your query projection. This starts to get messy when you use a lot of projections for the same VO... Either you have one big constructor in your VO and resort to a lot of NULL literals on your query, or your VO must have a lot of different constructors.
So my question is: Is there a way in JPQL to set result object fields through mutators instead of constructors?
For people with a .NET background: I'm looking for an equivalent of LINQ + object initializers.

DataNucleus JPA certainly supports two ways of instantiating result objects using no non-standard annotations or calls, primarily driven by the fact that it also supports JDO and that has the requirement for it :-
Result type with argumented constructor (as you say)
Result type with default constructor, and with setters
Such as
TypedQuery<MyResultType> q = em.createQuery("SELECT x AS field1, y AS field2 FROM ...", MyResultType.class);
where MyResultType has setters "setField1", "setField2".

Short answer: no, you cannot use mutators in JPQL.
While I do not know LINQ, I cannot see this being done without creating a mess.
I am sure you know that classes can have multiple constructors, so why not create constructors that do not require you to pass in nulls?
Depending on what you need and which JPA implementation you are using, most providers offer non-standard ways around this. For example, Hibernate has @Formula, which in some cases can be used instead of a constructor.
If you are using JPA 2, then criteria queries might be a better choice and can take care of this kind of thing.
In some cases you might prefer using @PostLoad.
Either way, you need to know that this conversion is not happening in SQL, so you are not really offloading any work to the database. We generally prefer to make SQL do as much work as possible in a single hit.
Yes, these are generalizations, and they may not fit your concrete requirements.

Map database column1, column2, columnN to a collection of elements

In legacy database tables we have numbered columns like C1, C2, C3, C100 or M1, M2, M3, M100.
These columns represent BLOB data.
It is not possible to change anything in this database.
By using a JPA Embeddable we map each of the columns to a single field. Then, when embedding, we override the names using 100 override annotations.
Recently we have switched to Hibernate and I've found things like UserCollectionType and CompositeUserType. But I hadn't found any use cases that are close to mine.
Is it possible to implement some user type by using Hibernate to be able to map a bundle of columns to a collection without additional querying?
Edit:
As you probably noticed, the names of the columns can differ from table to table. I want to create one type like "LegacyArray" with no need to specify all of the @Column annotations each time I use this type.
Instead I'd use
@Type(type = "LegacyArrayUserType",
    parameters =
    {
        @Parameter(name = "prefix", value = "A"),
        @Parameter(name = "size", value = "128")
    })
List<Integer> legacyA;
@Type(type = "LegacyArrayUserType",
    parameters =
    {
        @Parameter(name = "prefix", value = "B"),
        @Parameter(name = "size", value = "64")
    })
List<Integer> legacyB;

I can think of a couple of ways that I would do this.
1. Create views for the collection information that simulates a normalized table structure, and map it to Hibernate as a collection:
Assuming your existing table is called primaryentity, I would create a view that's similar to the following:
-- untested SQL...
create view childentity as
(select primaryentity_id, c1 from primaryentity union
select primaryentity_id, c2 from primaryentity union
select primaryentity_id, c3 from primaryentity union
--...
select primaryentity_id, c100 from primaryentity)
Now from Hibernate's perspective, childentity is just a normalized table that has a foreign key to primaryentity. Mapping this should be pretty straightforward, and is covered here:
http://docs.jboss.org/hibernate/stable/core/reference/en/html/collections.html
The benefits of this approach:
From Hibernate's point of view, the tables are normalized, it's a fairly simple mapping
No updates to your existing tables
The drawbacks:
Data is read-only, I don't think your view can be defined in an updatable manner (I could be wrong)
Requires change to the database, you may need to create lots of views
Alternately, if your DBA won't even let you add a view to the database, or if you need to perform updates:
2. Use Hibernate's dynamic model mapping facility to map your C1, C2, C3 properties to a Map, and have some code in your DAO layer do the appropriate conversion between the Map and the Collection property:
I have never done this myself, but I believe Hibernate does allow you to map tables to HashMaps. I'm not sure how dynamically Hibernate allows you to do this (i.e., Can you get away with simply specifying the table name, and having Hibernate automatically map all the columns?), but it's another way I can think of doing this.
If going with this approach though, be sure to use the data access object pattern, and ensure that the internal implementation (use of HashMaps) is hidden from the client code. Also be sure to check before writing to the database that the size of your collection does not exceed the number of available columns.
The benefits of this approach:
No change to the database at all
Data is updatable
O/R Mapping is relatively simple
The drawbacks:
Lots of plumbing in the DAO layer to map the appropriate types
Uses experimental Hibernate features that may change in the future
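The DAO-layer conversion in approach 2 could look roughly like the sketch below. It does not use Hibernate at all: the row is assumed to arrive as a plain Map keyed by column name, and the class name, method names and the prefix/size parameters are all hypothetical. It includes the size check recommended above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LegacyColumnMapper {

    // Reads prefix1..prefixN from the row and stops at the first
    // column that is missing or NULL.
    static List<Integer> toList(Map<String, Integer> row, String prefix, int size) {
        List<Integer> values = new ArrayList<>();
        for (int i = 1; i <= size; i++) {
            Integer value = row.get(prefix + i);
            if (value == null) {
                break;
            }
            values.add(value);
        }
        return values;
    }

    // Spreads the list over prefix1..prefixN, padding with null, and
    // refuses lists that would not fit into the available columns.
    static Map<String, Integer> toRow(List<Integer> values, String prefix, int size) {
        if (values.size() > size) {
            throw new IllegalArgumentException(
                values.size() + " values do not fit into " + size + " columns");
        }
        Map<String, Integer> row = new LinkedHashMap<>();
        for (int i = 1; i <= size; i++) {
            row.put(prefix + i, i <= values.size() ? values.get(i - 1) : null);
        }
        return row;
    }
}
```

Client code would only ever see the List, so the Map stays an internal detail of the DAO.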

Personally, I think that design sounds like it breaks first normal form for relational databases. What happens if you need C101 or M101? Change your schema again? I think it's very intrusive.
If you add Hibernate to the mix it's even worse. Adding C101 or M101 means having to alter your Java objects, your Hibernate mappings, everything.
If you have 1:m relationships with C and M tables, you'd be able to handle the cases I just cited by adding additional rows. Your Java objects contain Collection<C> or Collection<M>. Your Hibernate mappings are one-to-many and don't change.
Maybe the reason that you don't see any Hibernate examples matching your case is that it's a design that's not recommended.
If you must, maybe you should look at Hibernate Component Mapping.
UPDATE: The fact that this is legacy is duly noted. My point in bringing up first normal form is as much for others who might find this question in the future as it is for the person who posted the question. I would not want to answer the question in such a way that it silently asserted this design as "good".
Pointing out Hibernate component mapping is pertinent because knowing the name of what you're looking for can be the key when you're searching. Hibernate allows an object model to be finer grained than the relational model it maps. You are free to model a denormalized schema (e.g., Name and Address objects as part of a larger Person object). That's just the name they give such a technique. It might help find other examples as well.

Sorry if I'm misunderstanding your problem here, I don't know much about Hibernate. But couldn't you just concatenate during selection from database to get something like what you want?
Like:
SELECT whatever
, C1||C2||C3||C4||...||C100 AS CDATA
, M1||M2||M3||M4||...||M100 AS MDATA
FROM ...
WHERE ...
(Of course, the concatenation operator differs between RDBMSs.)

[EDIT] I suggest using a CompositeUserType. Here is an example. There is also a good example on page 228f in the book "Java Persistence With Hibernate".
That allows you to handle the many columns as a single object in Java.
The mapping looks like this:
@org.hibernate.annotations.Columns(columns = {
    @Column(name="C1"),
    @Column(name="C2"),
    @Column(name="C3"),
    ...
})
private List<Integer> c;
Hibernate will load all columns at once during the normal query.
In your case, you must copy the int values from the list into a fixed number of columns in nullSafeSet. Pseudocode:
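// statement is the PreparedStatement passed to nullSafeSet;
// index is the position of the first column (C1)
for (int i = 0; i < numColumns; i++)
    if (i < list.size())
        statement.setInt(index + i, list.get(i));
    else
        statement.setNull(index + i, Hibernate.INTEGER.sqlType());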
In nullSafeGet you must create a list and stop adding elements when a column is NULL. For additional safety, I suggest creating your own list implementation which cannot grow beyond the number of columns (inherit from ArrayList and override the add() and addAll() methods; overriding ensureCapacity() alone is not enough, because add() does not go through it).
[EDIT2] If you don't want to type all the @Column annotations, use a code generator for them. That can be as simple as a script which you give a name and a number and which prints @Column(...) to System.out. After the script has run, just cut and paste the output into the source.
The only other solution would be to access the internal Hibernate API to build that information at runtime but that API is internal, so a lot of stuff is private. You can use Java reflection and setAccessible(true) but that code probably won't survive the next update of Hibernate.
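The code-generator idea from EDIT2 can be sketched in a few lines of plain Java. The class and method names are made up for illustration. The program prints a @Columns block for a given prefix and count, ready to be pasted into the entity source.

```java
class ColumnAnnotationGenerator {

    // Builds the annotation text for columns prefix1..prefixN.
    static String generate(String prefix, int count) {
        StringBuilder sb = new StringBuilder();
        sb.append("@org.hibernate.annotations.Columns(columns = {\n");
        for (int i = 1; i <= count; i++) {
            sb.append("    @Column(name=\"").append(prefix).append(i).append("\")");
            // Comma after every entry except the last one.
            sb.append(i < count ? ",\n" : "\n");
        }
        sb.append("})");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Usage: java ColumnAnnotationGenerator C 100
        String prefix = args.length > 0 ? args[0] : "C";
        int count = args.length > 1 ? Integer.parseInt(args[1]) : 3;
        System.out.println(generate(prefix, count));
    }
}
```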

You can use UserTypes to map a given number of columns to any type you wish. This could be a collection if, for example, your collections are always bounded in size by a known number of items.
It's been a while (> 3 years) since I used Hibernate so I'm pretty rusty but I recall it being very easy to do; your BespokeUserType class gets passed the ResultSet to hydrate your object from it.

I too have never used Hibernate.
I suggest writing a small program in an interpreted language (such as Python) in which you can execute a string as if it were a command. You could construct a statement which takes the tedious work out of doing what you want to do manually.

How do you query object collections in Java (Criteria/SQL-like)?

Suppose you have a collection of a few hundred in-memory objects and you need to query this List to return objects matching some SQL or Criteria like query. For example, you might have a List of Car objects and you want to return all cars made during the 1960s, with a license plate that starts with AZ, ordered by the name of the car model.
I know about JoSQL, has anyone used this, or have any experience with other/homegrown solutions?

Filtering is one way to do this, as discussed in other answers.
Filtering is not scalable though. On the surface time complexity would appear to be O(n) (i.e. already not scalable if the number of objects in the collection will grow), but actually because one or more tests need to be applied to each object depending on the query, time complexity more accurately is O(n t) where t is the number of tests to apply to each object.
So performance will degrade as additional objects are added to the collection, and/or as the number of tests in the query increases.
There is another way to do this, using indexing and set theory.
One approach is to build indexes on the fields within the objects stored in your collection and which you will subsequently test in your query.
Say you have a collection of Car objects and every Car object has a field color. Say your query is the equivalent of "SELECT * FROM cars WHERE Car.color = 'blue'". You could build an index on Car.color, which would basically look like this:
'blue' -> {Car{name=blue_car_1, color='blue'}, Car{name=blue_car_2, color='blue'}}
'red' -> {Car{name=red_car_1, color='red'}, Car{name=red_car_2, color='red'}}
Then given a query WHERE Car.color = 'blue', the set of blue cars could be retrieved in O(1) time complexity. If there were additional tests in your query, you could then test each car in that candidate set to check if it matched the remaining tests in your query. Since the candidate set is likely to be significantly smaller than the entire collection, time complexity is less than O(n) (in the engineering sense, see comments below). Performance does not degrade as much, when additional objects are added to the collection. But this is still not perfect, read on.
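A minimal sketch of such a field index, using only a HashMap, might look like this. It is not how CQEngine is implemented; the class and method names are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class CarColorIndex {

    static final class Car {
        final String name;
        final String color;

        Car(String name, String color) {
            this.name = name;
            this.color = color;
        }
    }

    // color -> all cars with that color
    private final Map<String, Set<Car>> byColor = new HashMap<>();

    // Indexing happens once, when the car is added.
    void add(Car car) {
        byColor.computeIfAbsent(car.color, k -> new LinkedHashSet<>()).add(car);
    }

    // A single hash lookup instead of a scan over every car.
    Set<Car> withColor(String color) {
        return byColor.getOrDefault(color, Collections.<Car>emptySet());
    }
}
```

Any remaining tests in the query would then be applied only to the returned candidate set.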
Another approach, is what I would refer to as a standing query index. To explain: with conventional iteration and filtering, the collection is iterated and every object is tested to see if it matches the query. So filtering is like running a query over a collection. A standing query index would be the other way around, where the collection is instead run over the query, but only once for each object in the collection, even though the collection could be queried any number of times.
A standing query index would be like registering a query with some sort of intelligent collection, such that as objects are added to and removed from the collection, the collection would automatically test each object against all of the standing queries which have been registered with it. If an object matches a standing query then the collection could add/remove it to/from a set dedicated to storing objects matching that query. Subsequently, objects matching any of the registered queries could be retrieved in O(1) time complexity.
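The standing query idea can be sketched as a small collection that evaluates each registered predicate once per object, at insertion time. Again, this is an illustrative assumption about the shape of such a class, not CQEngine's actual API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

class StandingQueryCollection<T> {

    private final Set<T> all = new LinkedHashSet<>();
    private final Map<String, Predicate<T>> queries = new LinkedHashMap<>();
    private final Map<String, Set<T>> results = new HashMap<>();

    // Registers a query and evaluates it against objects already stored.
    void register(String name, Predicate<T> query) {
        Set<T> matches = new LinkedHashSet<>();
        for (T item : all) {
            if (query.test(item)) {
                matches.add(item);
            }
        }
        queries.put(name, query);
        results.put(name, matches);
    }

    // Each new object is tested against every standing query exactly once.
    void add(T item) {
        if (!all.add(item)) {
            return;
        }
        for (Map.Entry<String, Predicate<T>> entry : queries.entrySet()) {
            if (entry.getValue().test(item)) {
                results.get(entry.getKey()).add(item);
            }
        }
    }

    void remove(T item) {
        if (!all.remove(item)) {
            return;
        }
        for (Set<T> matches : results.values()) {
            matches.remove(item);
        }
    }

    // Retrieval is a map lookup; no predicate runs at query time.
    Set<T> retrieve(String name) {
        Set<T> matches = results.get(name);
        return matches == null
            ? Collections.<T>emptySet()
            : Collections.unmodifiableSet(matches);
    }
}
```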
The information above is taken from CQEngine (Collection Query Engine). This basically is a NoSQL query engine for retrieving objects from Java collections using SQL-like queries, without the overhead of iterating through the collection. It is built around the ideas above, plus some more. Disclaimer: I am the author. It's open source and in maven central. If you find it helpful please upvote this answer!

I have used Apache Commons JXPath in a production application. It allows you to apply XPath expressions to graphs of objects in Java.

Yes, I know it's an old post, but new technologies appear every day and the answer will change over time.
I think this is a good problem to solve with LambdaJ. You can find it here:
http://code.google.com/p/lambdaj/
Here you have an example:
LOOK FOR ACTIVE CUSTOMERS // (Iterable version)
List<Customer> activeCustomers = new ArrayList<Customer>();
for (Customer customer : customers) {
if (customer.isActive()) {
activeCustomers.add(customer);
}
}
LambdaJ version
List<Customer> activeCustomers = select(customers,
having(on(Customer.class).isActive()));
Of course, this kind of beauty has an impact on performance (a little... on average about 2 times slower), but can you find more readable code?
It has many more features; another example is sorting:
Sort Iterative
List<Person> sortedByAgePersons = new ArrayList<Person>(persons);
Collections.sort(sortedByAgePersons, new Comparator<Person>() {
public int compare(Person p1, Person p2) {
return Integer.valueOf(p1.getAge()).compareTo(p2.getAge());
}
});
Sort with lambda
List<Person> sortedByAgePersons = sort(persons, on(Person.class).getAge());
Update: since Java 8 you can use lambda expressions out of the box, like:
List<Customer> activeCustomers = customers.stream()
.filter(Customer::isActive)
.collect(Collectors.toList());
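Applied to the Car example from the question (made during the 1960s, license plate starting with AZ, ordered by model name), a Java 8 stream version could look like the sketch below. The Car class and its getters are assumptions, since the question does not show them.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class CarQuery {

    // Hypothetical Car type with the three fields the query needs.
    static final class Car {
        private final String model;
        private final int year;
        private final String license;

        Car(String model, int year, String license) {
            this.model = model;
            this.year = year;
            this.license = license;
        }

        String getModel() { return model; }
        int getYear() { return year; }
        String getLicense() { return license; }
    }

    static List<Car> sixtiesWithAzPlates(List<Car> cars) {
        return cars.stream()
            .filter(c -> c.getYear() >= 1960 && c.getYear() <= 1969) // WHERE year in the 1960s
            .filter(c -> c.getLicense().startsWith("AZ"))            // AND license LIKE 'AZ%'
            .sorted(Comparator.comparing(Car::getModel))             // ORDER BY model
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Car> cars = Arrays.asList(
            new Car("Mustang", 1965, "AZ-123"),
            new Car("Corvette", 1963, "AZ-777"),
            new Car("Beetle", 1972, "AZ-555"),
            new Car("Mini", 1961, "BX-001"));
        for (Car car : sixtiesWithAzPlates(cars)) {
            System.out.println(car.getModel());
        }
    }
}
```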

Continuing the Comparator theme, you may also want to take a look at the Google Collections API. In particular, they have an interface called Predicate, which serves a similar role to Comparator, in that it is a simple interface that can be used by a filtering method, like Sets.filter. They include a whole bunch of composite predicate implementations, to do ANDs, ORs, etc.
Depending on the size of your data set, it may make more sense to use this approach than a SQL or external relational database approach.

If you need a single concrete match, you can have the class implement Comparator, then create a standalone object with all the hashed fields included and use it to return the index of the match. When you want to find more than one (potentially) object in the collection, you'll have to turn to a library like JoSQL (which has worked well in the trivial cases I've used it for).
In general, I tend to embed Derby into even my small applications, use Hibernate annotations to define my model classes and let Hibernate deal with caching schemes to keep everything fast.

I would use a Comparator that takes a range of years and license plate pattern as input parameters. Then just iterate through your collection and copy the objects that match. You'd likely end up making a whole package of custom Comparators with this approach.

The Comparator option is not bad, especially if you use anonymous classes (so as not to create redundant classes in the project), but eventually when you look at the flow of comparisons, it's pretty much just like looping over the entire collection yourself, specifying exactly the conditions for matching items:
for (Car car : cars) {
if (1959 < car.getYear() && 1970 > car.getYear() &&
car.getLicense().startsWith("AZ")) {
result.add(car);
}
}
Then there's the sorting... that might be a pain in the backside, but luckily there's class Collections and its sort methods, one of which receives a Comparator...
