Hibernate not batching inserts

Hibernate not batching inserts - java

I hope you can help me with this problem since I'm really clueless by now and none of the related questions could help me so far.
I have a collection of entities I want to batch insert into TimescaleDB (extension of Postgresql) via spring-data saveAll() and I thought I configured everything by the book, but the hibernate stats never reflect a batch insert:
1223268 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
34076442411 nanoseconds spent executing 3408 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
My Hibernate properties are configured via HibernatePropertiesCustomizer like this:
props.put("hibernate.generate_statistics", true);
props.put("hibernate.order_inserts", true);
props.put("hibernate.order_updates", true);
props.put("hibernate.jdbc.batch_size", "100");
I validated that a transaction context is indeed present during saveAll() by calling:
TransactionSynchronizationManager.isSynchronizationActive()
And the entity's ID looks like this. There are no #GeneratedValue or #SequenceGenerator annotations, since I use the data's timestamp (from a web API) as the ID.
#Id
#Column
private Instant time;
I even tried by adapting my connection String to control the rewrite behavior, but it did not help:
jdbc:postgresql://localhost:5432/db?reWriteBatchedInserts=true&currentSchema=finance-data
I tested this with both maven org.postgresql 42.2.9 and 42.2.13 drivers.
The Spring Boot release train version is the 2.2.2.RELEASE which bundles hibernate-core 5.4.9.FINAL and spring-jdbc 5.2.2.RELEASE.
The timescale docker version is 1.7.1-pg12.
Please let me know if you need any more additional information. Thanks in advance!!!

From this, it mentions if the entity being batched inserted is manually assigned its ID , you have to add a #Version property. But for PostgreSQL , adding #Version is not required if using SEQUENCE generator to generate the ID.
I do not try if adding #Version can solve the problem if the ID is manually assigned but you could have a try. I personally use SEQUENCE generator to generate ID and it works without adding #Version in PostgreSQL.
And in order to generate ID effectively when using SEQUENCE generator , I also changed to use "pooled" or "pooled-lo" algorithm that is mentioned in this to reduce the database round trip to get the ID.
Here are what I do :
#Entity
public class Foo {
#Id
#GeneratedValue(strategy = GenerationType.SEQUENCE, generator="foo_sequence")
#SequenceGenerator(name="foo_sequence", sequenceName = "foo_id_seq", allocationSize = 100)
private Long id;
}
And the hibernate setting :
hibernate.order_inserts = true
hibernate.order_updates = true
hibernate.jdbc.batch_size = 50
hibernate.jdbc.batch_versioned_data = true
# For using "pool-lo" optimiser for generating ID when using JPA #SequenceGenerator
hibernate.id.optimizer.pooled.preferred = pooled-lo
And also need to make sure the sequence in PostreSQL is aligned with what is configured in #SequenceGenerator :
alter sequence foo_id_seq increment by 100;
For completeness , in case of PostgreSQL , also add reWriteBatchedInserts=true in the JDBC connection string which can provides 2-3x performance improvement which is said by the docs.

Related

Weblogic to Liberty w JPA upgrade - related entities intermittently not being queried

just a quick question please in case something stands out immediately.
We're migrating an EAR/EJB application from Weblogic 11g to latest WS Liberty (22.x) also upgrading several of the frameworks including JPA to 2.2. This also changes JPA implementation to eclipseLink. We came from com.oracle.weblogic.11g.modules:javax.persistence:1.0.0.0_1-0-2. Underlying DB is MS-SQL Server.
And I'm running into some weirdness with regards to related objects not being resolved/queried intermittently.
Just as an example we have entities where the columns hold reference data codes or similar lookups. Say I have an entity called PayemntRecordT and it has a status code which refers to a ref table that also holds a textual description. Something like this:
SQL:
CREATE TABLE [PAYMENT_RECORD_T](
[PAYMENT_ID] [int] NOT NULL,
...
[PAYMENT_STATUS_CD] [CHAR](8) NOT NULL,
...
)
ALTER TABLE [PAYMENT_RECORD_T] WITH CHECK ADD CONSTRAINT [FK_PAYM4] FOREIGN KEY([PAYMENT_STATUS_CD])
REFERENCES [RECORD_STATUS_T] ([REC_STAT_CD])
GO
CREATE TABLE [RECORD_STATUS_T] (
[RECORD_STAT_CD] [CHAR](8) NOT NULL,
[RECORD_STAT_DSC] [VARCHAR](60) NOT NULL
CONSTRAINT [PK_RECORD_STATUS_T] PRIMARY KEY CLUSTERED (
[RECORD_STAT_CD] ASC
)WITH (PAD_INDEX = OFF...) ON [PRIMARY]
) ON [PRIMARY]
GO
Java:
#Table(name = "PAYMENT_RECORD_T")
#Entity
public class PaymentRecordT {
...
#ManyToOne
#PrimaryKeyJoinColumn(name = "payment_status_cd", referencedColumnName = "REC_STAT_CD")
private RecordStatusT recordStatusT;
}
#Table(name = "RECORD_STATUS_T")
#Entity
public class RecordStatusT {
#Column(name = "REC_STAT_CD")
#Id
private String recStatCd;
#Column(name = "REC_STAT_DSC")
#Basic
private String recStatDsc;
}
Others relations in our app might not be primary key relations but loose relations in which case its just #JoinColumn but the pattern would be the same.
My 'weirdness' is the following:
So in this example I have a list of 10 'Payment Records' each of them have such a record status, which is actually NON NULL in the database. When I do the initial retrieval via EJB method it grabs the 10 records and I also get the correctly resolved/queried record statuses.
Then I add a new record via EJB method (TRANSACTION_REQUIERD). After the add method returns I can query the new payment record in the database via SSMS. Its committed and it looks 100% correct and it contains a correct record status code.
Now I run the retrieval method again and I get the 11 records as I would expect. Only the 11th (newly inserted) record will have recordStatusT as null.
When I restart the app all goes well again for the retrieval of all 11 records. But for subsequent additions the outcome seems again 'undefined'.
In JDBC logging I an see that during the original retrieval of the records the record_status_t table was queried but the 2nd time around it was not and I have no explanation why.
I played with FETCHTYPE.EAGER and read up on caching etc but I'm not going anywhere.
Any ideas?
Thanks for your time
Carsten

I solved the problem by ensuring that after inserts/updates the objects arent being queried from the cache.
In the end - rather than doing it with query hint - I disabled caching for the entity involved using the #Chacheable annotation, like so
#Table(name = "PAYMENT_RECORD_T")
#Entity
#Cacheable(false)
public class PaymentRecordT {
...
#ManyToOne
#PrimaryKeyJoinColumn(name = "payment_status_cd", referencedColumnName = "REC_STAT_CD")
private RecordStatusT recordStatusT;
}
I still feel like there should be a better solution. Eclipselink tracks the inserts/updates so it should be able track what needs rereading from the DB and what not. I still feel like I don't fully understand the entire picture, but this works for me and its reasonably clean.
I can leave the considerable amount of read-only data/objects chacheable and the few that are changeable as non-cacheable.
Thanks for reading
Carsten

JPA/Hibernate Bulk/Batch Insert using Mysql with Database Generated ID's

Okay, I've searched forever and I can't seem to find a good way of accomplishing batch inserts with JPA/Hibernate and MySql.
I want to be able to save/insert many records at once using JPA, but by default batching behavior is disabled if you use GenerationType.IDENTITY. I'm aware that you can switch to GenerationType.SEQUENCE, but that isn't available on MySql and creating new tables and using GenerationType.TABLE is not an option in my scenario.
So in the end, I need an efficient way of doing batch/bulk inserts using JPA/Hibernate, MySQL, and database generated IDs. I know it's possible to do this efficiently because I can do it with a JDBC connection, but I'd really like to not have to write my own JDBC queries for each of my repositories.
Anyone know how to accomplish this?
I'm okay if I'm unable to get the updated entities with the IDs back (think void saveAll() instead of List<User> saveAll()). My main requirement is this happens in one/two big queries instead of saving iteratively each entity like it does now when I call saveAll.
I can include more if needed, but my entity looks like this:
#Entity
#Builder
#Getter
#Setter
#With
#AllArgsConstructor
#NoArgsConstructor
#EqualsAndHashCode(callSuper = false, exclude = "id")
#Table(name = "user")
#ToString(callSuper = true, onlyExplicitlyIncluded = true)
public class User {
#Id
#ToString.Include
#GeneratedValue(strategy = GenerationType.IDENTITY)
#Column(name = "uID")
private long id;
private String name;
}

There is no way to accomplish JDBC batching on insert with Hibernate when using the identity generation strategy, because for Hibernate, every entity must have a PK value assigned after a persist/insert.
You can use Hibernate SPIs to implement this yourself though. Take a look at how Hibernate implements inserts here org.hibernate.persister.entity.AbstractEntityPersister#insert(java.lang.Object, java.lang.Object[], java.lang.Object, org.hibernate.engine.spi.SharedSessionContractImplementor). You can reduce the complexity if you want to implement this only for a few known entities that only use a handful of features.

IDENTITY generator disables JDBC batch insert of hibernate and JPA. since the sequence is not supported in MySQL, there is no way to bulk/batch insert records using MySQL and spring data JPA. Please read my blog on that. This is not the end of the road. we can use the JDBC template or query DSL-SQL. To see how to implement using query DSL-SQL click here. For the JDBC template click here.
If you need type-safe, easy to code choose query DSL-SQL else choose JDBC template

ConstraintViolationException using TableGenerator after updating dependencies

After updating dependencies from Spring 4.1.6.RELEASE to 5.1.4.RELEASE and Hibernate 4.3.9.Final to 5.4.1.Final, some of my test cases started failing with ConstraintViolationException on the primary key.
Due to historical reasons we are generating the primary like so:
#Id
#GeneratedValue(strategy = GenerationType.TABLE, generator = "NAME_HERE")
#TableGenerator(name = "NAME_HERE", table = "SEQUENCE_TABLE_NAME", pkColumnName = "Name", pkColumnValue = "ENTITY_NAME", valueColumnName = "VALUE_COLUMN_NAME", allocationSize = 100)
While using the application normally it does not seem to be a problem, however when running tests it does.
I have noticed that the ConstraintViolationException only occurs for tables where multiple values are inserted within the same test case. So i'm thinking the rows are somehow assigned the same primary key.
Running the tests locally it does not seem to be a problem, so it could also have something to do with how the tests are run on the build server. However it was not a problem before upgrading dependencies.
I have checked that the sequence values are higher than the latest value in the database, but again this was not a problem before upgrading dependencies.
I am hoping to jolt someones memory :-)

i found this:
Hibernate, #SequenceGenerator and allocationSize
Which made me find "hibernate.id.new_generator_mappings"
As per:
https://docs.jboss.org/hibernate/orm/5.0/userguide/html_single/Hibernate_User_Guide.html#configurations
*Setting which indicates whether or not the new org.hibernate.id.IdentifierGenerator are used for AUTO, TABLE and SEQUENCE.
Existing applications may want to disable this (set it false) for upgrade compatibility from 3.x and 4.x to 5.x.*
Setting this to false solved the problem.

Where to set hibernate.id.new_generator_mappings property?

I've wasted too much time on this ...
I'm using oracle and I have a sequence (MY_TABLE_SEQ) defined which increments by 1.
In my Pojo I have:
#SequenceGenerator(name = "MY_SEQ", sequenceName="MY_TABLE_SEQ", allocationSize=50)
#GeneratedValue(strategy=GenerationType.SEQUENCE, generator="MY_SEQ")
This gives me a unique constraint issue. From my understanding I need to set the following property:
hibernate.id.new_generator_mappings=true
I've tried setting in my hibernate.cfg.xml file but it does not seem to make any difference. I've come across server post to place in persistance.xml but this is a standalone app, no webcontainer.
Setting allocationSize=1 works but of course it hits the db on each insert to get the next sequence. Setting the above property is suppose to resolve it.

I haven't tried Oracle, but I had similar issues to yours inserting into an AS400 DB2 table.
I had to remove the identity flag on the id column on DB2 table - and instead used a custom jpa/hibernate sequence generator. This is set up on the pojo/entity annotation of the #ID entity field as you've done.
DB2 had been giving me errors about missing SYSIBM.SYSSEQUENCES table, so evidently hibernate (version 5.2), doesn't recognize the native DB2 identity designation. A custom sequence was and effective workaround.
On the #ID entity field:
#GeneratedValue(generator = "table", strategy=GenerationType.TABLE)
#TableGenerator(name = "table", allocationSize = 20)
This example allocates a pool of 20 sequence numbers each time it queries the table.
Next, create the required table Hibernate needs with columns that match the hibernate5 API - must be in lower case ... so put quotes around the names to work around the auto-upper casing that DB2 defaults to. The API will error out if these names are in caps.
Table:
"hibernate_sequences"
example of 2 Columns used:
"sequence_next_hi_value" (integer, not nullable, 0 default)
"sequence_name" (character, sample length 20, not nullable, natural default)
In the configuration code for the dialect used - ex: Spring Boot programmatically, add these properties:
properties.put("hibernate.supportsSequences","false");
properties.put("hibernate.id.new_generator_mappings","false");
and in the *.properties file:
spring.jpa.properties.hibernate.dialect.supportsSequences=false
spring.jpa.properties.hibernate.id.new_generator_mappings=false
Database systems are case-sensitive for schema/table/field names. Also watch for typos everywhere, incl. property names.
Be sure your pojo/entity only contains private fields that will be mapped to the table. Static finals such as serialVersionUID are ok.
I will be doing someething similar for SQL Server soon.
For MySQL, I had no issues using an identity column as defined in a table ID field to insert records, so didn't have to make all these changes. A simpler setup since hibernate recognizes the identity designation in MySQL.
#GeneratedValue(strategy=GenerationType.IDENTITY)
was all that was needed in the pojo.
I'm a newbie at all this, so always looking for better ways ... but this worked for now.

I set the property like this in the hibernate.cfg.xml file and it works !
<property name="hibernate.jpa.compliance.global_id_generators" value="true"/>

JPA GenerationType.AUTO not considering column with auto increment

I have a table with a simple int id column with Identity auto increment in SQL Server.
The entity's Id is annotated with #Id and #GeneratedValue
#Id
#GeneratedValue(strategy = GenerationType.AUTO)
#Column(name = "id", length = 4, precision = 10, nullable = false)
private Integer id;
In SQL Server the column is properly set as Identity with Seed and Increment equals to 1.
When I try to persist an instance of this entity, Hibernate tries to query the hibernate_sequence table to obtain the ID value. Since I haven't created that table in my schema I'm getting an error:
could not read a hi value: com.microsoft.sqlserver.jdbc.SQLServerException: Invalid object name 'MySchema.hibernate_sequence'
If I change the generation type to IDENTITY everything works as expected
#Id
#GeneratedValue(strategy = GenerationType.IDENTITY)
#Column(name = "id", length = 4, precision = 10, nullable = false)
private Integer id;
I cannot change it this way, since my App will run both on MS SQL and ORACLE, and the latter does not support auto incremented columns.
As far as I know the AUTO type should use the auto increment behaviour if the underlying database has support to it, so I don't know why is not working.
UPDATE:
It took me some time but I was able to understand exactly what is going on.
I am working with legacy databases with the following behaviours:
MSSQL: id generation uses table IDENTITY
ORACLE: id generation uses a trigger. The trigger queries and updates a custom table where all the "next ids" are stored. This table is called SEQ.
Here is the outcome of using some id generation strategies:
AUTO: does not work in MSSQL, as explained above
IDENTITY: works in MSSQL but is not supported by Oracle
"native": works in MSSQL but fails in ORACLE. It fails because Hibernate activates its default sequence strategy, which uses hibernate_sequences.nextval. Since this is a legacy application the values from the SEQ table (mentioned above) and the hibernate_sequences are not synchronized (SEQ's value for that particular table is at 6120, and hibernate_sequences' is at 1, which is expected since it was not used until now).
So what I need to figure out is a way to configure that entity to:
Use MSSQL Identity feature
OR
When using Oracle, do not automatically set any value to the ID variable and leave everything up to the pre-existing trigger
This can cause me serious issues on Oracle when I need to insert entities that depend on the main entity (via foreign key), because Hibernate won't know which ID value was generated by the "external" trigger.

I had a similar problem and found this information (deeper explained in here).
Adding this property into my persistence.xml file fixed the issue:
<property name="hibernate.id.new_generator_mappings" value="false" />

Orcale 12c supports IDENTITY and SQL SERVER 2012 supports SEQUENCES. I believe a SEQUENCE is always a better choice than an IDENTITY. IDENTITY disables batching and SEQUENCES allow you to provide optimizers, such as the pooled-lo optimization strategy.
This is how the actual identifier generator is chosen for the configured GenerationType value:
switch ( generatorEnum ) {
case IDENTITY:
return "identity";
case AUTO:
return useNewGeneratorMappings
? org.hibernate.id.enhanced.SequenceStyleGenerator.class.getName()
: "native";
case TABLE:
return useNewGeneratorMappings
? org.hibernate.id.enhanced.TableGenerator.class.getName()
: MultipleHiLoPerTableGenerator.class.getName();
case SEQUENCE:
return useNewGeneratorMappings
? org.hibernate.id.enhanced.SequenceStyleGenerator.class.getName()
: "seqhilo";
}
If you use the new identifier generators:
properties.put("hibernate.id.new_generator_mappings", "true");
The AUTO will actually use a SequenceStyleGenerator and where the database doesn't support sequences, you end up using a TABLE generator instead (which is a portable solution but it's less efficient than IDENTITY or SEQUENCE).
If you use the legacy identifier generators, you then end up with the "native" generation strategy, meaning:
public Class getNativeIdentifierGeneratorClass() {
if ( supportsIdentityColumns() ) {
return IdentityGenerator.class;
}
else if ( supportsSequences() ) {
return SequenceGenerator.class;
}
else {
return TableHiLoGenerator.class;
}
}
If a new Oracle12gDialect is going to be added and it will support IDENTITY, then AUTO might switch to IDENTITY rather than SEQUENCE, possibly breaking your current expectations. Currently there is no such dialect available so on Oracle you have SEQUENCE and in MSSQL you have IDENTITY.
Conclusion:
Try it like this:
#Id
#GenericGenerator(name = "native_generator", strategy = "native")
#GeneratedValue(generator = "native_generator")
private Long id;
make the id a Long instead of Integer, and you can let the HBMDDL handle the primary key column type.
force Hibernate to use the "native" generator
If your legacy system uses a table for generating sequence values and there was no hilo optimization ever used you can use a table identifier generator:
#Id
#GeneratedValue(generator = "table", strategy=GenerationType.TABLE)
#TableGenerator(name = "table", allocationSize = 1
)
private Long id;
You can also use the JPA table generator, just make sure you configure the right optimizer. For more info check my Hibernate tutorial

because
#GeneratedValue(strategy = GenerationType.AUTO)
use SequenceStyleGenerator by default in earlier versions
you have to look at this https://hibernate.atlassian.net/browse/HHH-11014

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.