I am trying to use the Objectify @IgnoreSave annotation together with a simple If condition (IfEmpty, IfNull), but it does not seem to work. Without an If condition the value is not persisted, as expected. However, when I use an If condition the value is always persisted (e.g. with IfNull and a null value provided, the null is persisted and hence the original value in the datastore is deleted).
...
@IgnoreSave(IfNull.class)
private String email;
...
...
this.objectify.save().entity(userDetails).now();
...
Is there any additional configuration needed? Or has anyone experienced the same?
From "hence original value in datastore deleted" it sounds like you misunderstand a fundamental characteristic of the GAE datastore - entities are stored whole. If you #IgnoreSave a field, it will be ignored during save and thus the field will not be present in the datastore. You do not get to update some fields and not others.
Suppose I have an entity with a field of type FOO, but an external process writes the invalid value BAR into the database. So next time I try to read this entity with Hibernate I get an exception like this:
org.hibernate.PropertyAccessException: Could not set field value [BAR] value by reflection
Unfortunately I also get this exception when I call the method getAllFoo() and the database contains 999 valid entities plus the one invalid entity. I would like to be able to get the 999 valid entities plus a warning of some sort for the invalid one.
Is that even possible in Hibernate?
I can think of 3 options:
(1): Add a WHERE field IN(<list of valid values>) clause to all queries to this table+field.
(2): Clean up the database and add a constraint to the field, so that no invalid values can end up in there.
(3): Add the invalid option (I assume it's an enum?) to your entity and filter it out in your own code. This may break proper pagination of queries. It will also get messy quickly if you have a large variety of invalid values.
Edit
(4): Create a custom Hibernate type (UserType). There you can parse the value manually from the prepared statement / result set. This lets you map 'invalid' values to null (or any other meaningful value you can process) instead of throwing an exception.
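A rough sketch of option (4), assuming the field is an enum (called Foo here purely for illustration) and Hibernate 5.2+ (earlier versions take a SessionImplementor instead of SharedSessionContractImplementor):

import java.io.Serializable;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Objects;

import org.hibernate.HibernateException;
import org.hibernate.engine.spi.SharedSessionContractImplementor;
import org.hibernate.usertype.UserType;

// Hypothetical enum from the question; BAR is not one of its constants.
enum Foo { FOO, BAZ }

// Reads the column leniently: unknown values such as BAR become null
// instead of failing the whole query with a PropertyAccessException.
public class LenientFooType implements UserType {

    @Override
    public int[] sqlTypes() {
        return new int[] { Types.VARCHAR };
    }

    @Override
    public Class returnedClass() {
        return Foo.class;
    }

    @Override
    public Object nullSafeGet(ResultSet rs, String[] names,
            SharedSessionContractImplementor session, Object owner)
            throws HibernateException, SQLException {
        String raw = rs.getString(names[0]);
        if (raw == null) {
            return null;
        }
        try {
            return Foo.valueOf(raw);
        } catch (IllegalArgumentException e) {
            // Invalid value in the database: map it to null (log a warning here).
            return null;
        }
    }

    @Override
    public void nullSafeSet(PreparedStatement st, Object value, int index,
            SharedSessionContractImplementor session) throws SQLException {
        if (value == null) {
            st.setNull(index, Types.VARCHAR);
        } else {
            st.setString(index, ((Foo) value).name());
        }
    }

    @Override
    public boolean equals(Object x, Object y) { return Objects.equals(x, y); }

    @Override
    public int hashCode(Object x) { return Objects.hashCode(x); }

    @Override
    public Object deepCopy(Object value) { return value; } // enums are immutable

    @Override
    public boolean isMutable() { return false; }

    @Override
    public Serializable disassemble(Object value) { return (Serializable) value; }

    @Override
    public Object assemble(Serializable cached, Object owner) { return cached; }

    @Override
    public Object replace(Object original, Object target, Object owner) { return original; }
}

You would then reference it on the mapped field with @Type(type = "your.package.LenientFooType") (or the equivalent hbm.xml mapping).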
I'm writing a simple webapp to show my coding skills to potential employers. It connects with an API and receives a JSON file which is then deserialized using Jackson and displayed in a table form in the browser. I want to enable the user to persist the Java object in a Postgres database using Hibernate. I got it to work and it does the job nicely but I want to make it more efficient.
Whenever there is no data in the JSON response for one of the object's fields (right now every possible JSON attribute is present in the Java class/Hibernate entity as a String field), I put in an empty String (''), so that every field has a value and no nulls are stored in the database.
Should I only store what I have and put no empty strings in the DB (using nulls instead) or is what I'm doing now the right way?
Null is the absence of a value; an empty string is a value. The difference has little impact on memory. If you display the data repeatedly and don't want to convert null to an empty string on retrieval, you can go with the empty string ''.
But if you want a unique constraint on values other than the empty string '', use null.
Sometimes null and empty '' are used to distinguish whether the data was known: use the empty string for data that is known but not available, and null for unknown data.
Use NULL when there isn't a known value.
Never use the empty string.
For example, if you have a customer who didn't supply an address, don't say the address is ''; say it is NULL. NULL unambiguously states "no value".
For database columns that must have a value for your web application to work, create the backing table with NOT NULL data constraints on those columns.
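For example, a minimal sketch of such an entity (the entity and field names here are illustrative, not from your app):

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class Customer {

    @Id
    @GeneratedValue
    private Long id;

    // Required by the application: enforce it in the schema rather than with ''.
    @Column(nullable = false)
    private String email;

    // Optional: leave it NULL when the JSON response has no value,
    // instead of storing the sentinel value ''.
    private String address;

    // getters and setters omitted
}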
In your unit tests, pass NULL (e.g. in tests named ..._address_is_null_) and test for success or failure (depending on whether the test should trigger no errors or throw an exception).
Using '' in a database as a sentinel, a special value that means something other than '', is discouraged, because nobody else will know what you meant it to mean. There might also be more than one special case; if you use '' for the first one, adding others later becomes harder (unless you fall into the even worse practice of using more special strings to enumerate the other cases, like "deleted" and so on).
I just implemented the integration of Hibernate Search with Elasticsearch, using Hibernate Search 5.8 and ES 5.5.
I have several fields created specifically for sorting, and they are all called [field]Sort.
When I was testing it locally, the first time I let Hibernate create the indexes, it created the String sort fields like this:
nameSort -> text
nameSort.keyword -> keyword
I realized that I should use the suffixed field for sorting.
But then, when I destroyed my Elasticsearch cluster to start over, it didn't create the suffixed fields; it just created the sort fields as keyword directly.
I recreated the cluster 5 or more times and it never created the suffixed fields again.
When I finally sent my changes to our staging environment, it created the suffixed fields again, causing my queries to fail, because they are trying to sort by a text field, instead of a keyword field.
Now, I'm really not sure why it sometimes creates the suffix and sometimes doesn't.
Is there any rule?
Is there a way to avoid it creating 2 fields and make it always create only one keyword field with exactly the name I gave it?
Here's an example of a sort field:
#Field(name = "nameSort", analyze = Analyze.NO, store = Store.YES, index = Index.NO)
#SortableField(forField = "nameSort")
public String getNameSort() {
return name != null ? name.toLowerCase(Locale.ENGLISH) : null;
}
Thanks in advance for any help.
Hibernate Search does no such thing as creating a separate keyword field for text fields. It creates either a text field or a keyword field, depending on whether the field should be analyzed. In your case, the field is not analyzed, so it should create a keyword field.
Now, Hibernate Search is not alone here, and this behavior could stem from the Elasticsearch cluster itself. Did you check whether you have particular index templates on your Elasticsearch cluster? Such a template could lead Elasticsearch to create a keyword sub-field whenever Hibernate Search creates a text field.
On a side note, you may be interested to know that Hibernate Search 5.8 allows defining normalizers (the same concept as Elasticsearch normalizers), which would let you annotate the getName() getter directly and avoid doing the lowercase conversion yourself. See this blog post for more information.
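A minimal sketch of what that looks like, assuming the Hibernate Search 5.8 annotations (@NormalizerDef on the entity plus the normalizer attribute of @Field); the entity and field names are just placeholders:

import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.Normalizer;
import org.hibernate.search.annotations.NormalizerDef;
import org.hibernate.search.annotations.SortableField;
import org.hibernate.search.annotations.Store;
import org.hibernate.search.annotations.TokenFilterDef;

@Indexed
@NormalizerDef(name = "lowercase",
        filters = @TokenFilterDef(factory = LowerCaseFilterFactory.class))
public class User {

    private String name;

    // The normalizer lowercases the value as a single token, so the manual
    // toLowerCase() in the getter is no longer needed and the field stays sortable.
    @Field(name = "nameSort", store = Store.YES,
            normalizer = @Normalizer(definition = "lowercase"))
    @SortableField(forField = "nameSort")
    public String getName() {
        return name;
    }
}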
I have the below class structure:
class A {
    int id;
    List<B> blist;
    List<C> clist;
    List<D> dlist;
}
I get a JSON as input, which is mapped to an object of A by a mapper. Now I have an object A which holds the lists of B, C and D objects. I want to use batching to reduce the insert time. I went through the documentation, which describes the solution when saving multiple parent objects. How would I use the batching capability in my case, which has nested lists of objects of multiple types?
I have enabled batch inserts using
<property name="hibernate.jdbc.batch_size">50</property>
This by itself doesn't give me any batching unless I clear and flush the session. Any suggestions on how to go about this?
The problem is that you're using IDENTITY strategy.
Whenever you save a new entity, Hibernate will place it into the Session's first-level cache (1LC); however, in order to do that, the identifier must be known. The problem with the IDENTITY strategy is that Hibernate must actually perform the insert to determine the identifier value.
In the end, batch insert capabilities are disabled.
You should either try to load your data using business key values that are known up front or, worst case, use the SEQUENCE generation type with a sequence optimizer to minimize the database hits. This will allow batch inserts to work.
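A minimal sketch of the SEQUENCE approach, using entity B from your example and an assumed sequence name:

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.SequenceGenerator;

@Entity
public class B {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "b_seq")
    // allocationSize acts as the optimizer: one database round trip hands out
    // 50 identifiers, so Hibernate can batch the INSERT statements.
    @SequenceGenerator(name = "b_seq", sequenceName = "b_seq", allocationSize = 50)
    private Long id;

    // ...
}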
UPDATE
For situations where you have no business key that defines the uniqueness of a row and your database doesn't support SEQUENCE, you could manage the identifiers yourself. You can either do this with a custom identifier generator or just do it in your loop as code.
The caveat here is that this solution is not thread-safe. You should guarantee that at no point would you ever be running this logic in two threads simultaneously, which is typically not something one does anyway with bulk data loads.
Define a variable to store your identifier. Initialize it from the existing maximum identifier value in the database; if there are no rows in the database, initialize it to 1.
// Seed the identifier from the current maximum in the database
// ('session' here is the current Hibernate Session).
Long value = (Long) session.createQuery("SELECT MAX(id) FROM YourEntity").uniqueResult();
value = (value == null ? 1L : value + 1);
The next step is to change the @Id annotated field. It should not be marked with @GeneratedValue, since we're going to let the application provide the value.
For each row you're going to insert, simply call your setId(value) method with the value variable seeded in step 1.
Increment your value variable by 1.
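Putting these steps together, a rough sketch (assuming a plain Hibernate Session named session, a collection called entities, and hibernate.jdbc.batch_size of 50; YourEntity is a placeholder):

int batchSize = 50;          // should match hibernate.jdbc.batch_size
int count = 0;
for (YourEntity entity : entities) {
    entity.setId(value++);   // steps 2 and 3: assign the id, then increment
    session.persist(entity);
    if (++count % batchSize == 0) {
        session.flush();     // push the current batch of INSERTs
        session.clear();     // keep the first-level cache from growing unbounded
    }
}
session.flush();             // flush the remaining entities
session.clear();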
I hit a problem when writing tests for a database application using JPA2 and EclipseLink:
I add some entity to a database, retrieve it later and want to compare it to an instance which has the values I expect to confirm that the addition worked as I intended.
First I wrote something like
assertEquals(expResult, dbResult);
which failed, because I can't really know the value of the id field, which is generated by the database; therefore dbResult differs from expResult, which I created with new and populated manually.
I see two options:
Either I remove the id field from equals and hashCode so that the comparison is only based on the "real values". I don't know if this causes problems in the database or elsewhere, though.
Or I write my tests to explicitly check every field except id manually.
What should I do?
You might find a lot of controversy about this one. My stance is that you absolutely don't use a database primary key for anything in your application. It should be completely invisible. Identify your objects in your application by some other property or combination of properties.
On the "testing persistence operations" front, what you really want is probably to check that the fields were saved and loaded correctly and maybe that the primary key got assigned some value when you saved it. This probably isn't a job for the equals method at all.
Relying on database generated Ids in your equals and hashCode implementation is not advisable. You ought to rely on the truly unique/semi-unique attributes of your classes in checking for equality, and in generating the hashcode values. The Hibernate documentation has an extensive page that discusses this, and the facts therein are applicable to more or less every JPA provider.
The underlying reason for preferring business keys over database-generated values in your equals and hashCode implementation is that the generated Id is only known after the JPA provider has actually persisted the entity in the database. If you compare objects using database-generated Ids, you will end up with an equality test that fails in the following scenarios:
If E1 and E2 are entities of class E (which checks equality using database-generated Ids), then E1 and E2 will be considered equal before they have been stored in the database, since both Ids are still null. This is not what you want, especially if you want to put E1 and E2 in a Set before persisting them. It gets worse when the attributes of E1 and E2 hold different values: the equals implementation would prevent two significantly different entities from being added to the Set, and the hashCode implementation gives you O(n) lookup time when unpersisted entities are looked up from a HashMap, because they all hash to the same bucket.
If E1 is a managed entity that has been persisted and E2 is an entity that has not been persisted, the equality test will deem E1 != E2 even when all the attribute values of E1 and E2 (except for the Ids) are identical. Again, this is probably not what you want, especially if you want to avoid duplicate entities in the database that differ only in their database-generated Ids.
The equals and hashCode implementations therefore ought to use business keys, in order to exhibit consistent behavior for both persisted and unpersisted entities.
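As an illustration, a minimal sketch of a business-key-based implementation, assuming email really is a stable, unique business key for this (hypothetical) entity:

import java.util.Objects;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;

@Entity
public class User {

    @Id
    @GeneratedValue
    private Long id;          // surrogate key: deliberately not used below

    private String email;     // business key: stable and unique in this example

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof User)) return false;
        return Objects.equals(email, ((User) o).email);
    }

    @Override
    public int hashCode() {
        return Objects.hash(email);
    }
}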
From the book Hibernate in Action, it's recommended to define a business key and test equality on that. A business key is "a property, or some combination of properties, that is unique for each instance with the same database identity." Elsewhere it says not to use the id as one of those properties and not to use values in collections.
I would write my test to explicitly check the fields. To make this easy, before performing the assertEquals test I set the id of both the expected and actual result to the same predefined value and then use the normal equals method.
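For instance, inside your existing test, right before the assertion (using the expResult/dbResult names from the question):

// Neutralize the generated id before comparing everything else.
expResult.setId(1L);
dbResult.setId(1L);
assertEquals(expResult, dbResult);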
Removing the ID from equals is not justifiable just because testing is slightly harder; you would be forgoing serious performance benefits and code integrity.