How to bulk insert with Spring-Batch?

I'm trying to do bulk/batch inserts using spring-batch.
    public ItemWriter<MyEntity> jpaItemWriter() {
        LocalSessionFactoryBuilder builder = new LocalSessionFactoryBuilder(ds);
        builder.addAnnotatedClasses(MyEntity.class);
        builder.setProperty("hibernate.show_sql", "true");
        builder.setProperty("hibernate.batch_size", "20");
        builder.setProperty("hibernate.order_updates", "true");
        builder.setProperty("hibernate.order_inserts", "true");
        HibernateItemWriter<MyEntity> writer = new HibernateItemWriter<>();
        writer.setSessionFactory(builder.buildSessionFactory());
        return writer;
    }
Result:
I'm getting only single insert statements, not bulk inserts! I can see this in both the Hibernate logs and the PostgreSQL logs. Why is the bulk insert not working?
Update:
    @Entity
    public class MyEntity {
        @Id
        String shortname;
        String fullname;
    }

Spring has nothing to do with SQL statement batching; it's all managed by Hibernate.
I see you have batching enabled and configured, but that's not enough to make it work. You also need to use the right session type. Hibernate has two session types: the stateful session and the stateless session.
The stateful session, which is obtained with
sessionFactory.openSession();
and is also used by default with @Transactional, never uses batching (even if configured) and sends all SQL statements at once when the transaction commits. However, you can simulate batching by calling flush() periodically; the SQL statements are then sent to the database on every flush().
The stateless session, which is obtained with
sessionFactory.openStatelessSession();
respects the batching configuration, so just switch to the stateless session and batching will work as expected. Hibernate will log every session.insert(), but it will not send each SQL insert statement to the database immediately; instead, the insert statements are sent in batches of the configured size. So it's best to "tail -f" the database log.
The main idea behind having two session types is that the stateful session uses a cache: every saved entity ends up in the first-level cache, so if you save 100k entities you will get an OutOfMemoryError. The solution is to use the stateless session, which doesn't interact with any cache level.
You can read more about the stateless session.
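The periodic flush() idea mentioned above is usually written as a flush-and-clear loop. Below is a minimal sketch of that loop, not Hibernate code: the Work interface is a hypothetical stand-in for the save(), flush() and clear() calls on a Session, and RecordingWork is a fake that only counts what each flush would send. In a real application the interval is normally chosen to match Hibernate's JDBC batch size setting (hibernate.jdbc.batch_size).

```java
import java.util.ArrayList;
import java.util.List;

final class PeriodicFlushSketch {

    /** Hypothetical stand-in for the three Session calls the pattern needs. */
    interface Work<T> {
        void save(T entity);
        void flush();
        void clear();
    }

    /** Saves all entities, flushing and clearing after every batchSize saves. */
    static <T> void saveAll(Work<T> session, List<T> entities, int batchSize) {
        int pending = 0;
        for (T entity : entities) {
            session.save(entity);
            if (++pending == batchSize) {
                session.flush();  // push the queued statements to the database
                session.clear();  // drop saved entities from the first-level cache
                pending = 0;
            }
        }
        if (pending > 0) {        // flush the final, partially filled batch
            session.flush();
            session.clear();
        }
    }

    /** Fake (assumption, for illustration only) that records the size of each flush. */
    static final class RecordingWork implements Work<String> {
        final List<Integer> flushedSizes = new ArrayList<>();
        private int queued;

        public void save(String entity) { queued++; }
        public void flush() { flushedSizes.add(queued); queued = 0; }
        public void clear() { }
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 45; i++) {
            items.add("entity-" + i);
        }
        RecordingWork work = new RecordingWork();
        saveAll(work, items, 20);
        System.out.println(work.flushedSizes); // [20, 20, 5]
    }
}
```

Clearing after each flush is what keeps memory bounded in this pattern, since flushed entities would otherwise stay in the session cache.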

Related

Does Session.getCurrentSession() close without doing any transaction?

I am trying to load an entity by doing this:
    public void reloadRepository() {
        Session session = getSessionFactory().getCurrentSession();
        session.beginTransaction();
        Hibernate.initialize(Repository.class);
    }
This Stack Overflow post (Hibernate openSession() vs getCurrentSession()) says:
When you call SessionFactory.getCurrentSession, it creates a new
Session if it does not exist, otherwise use same session which is in
current hibernate context. It automatically flushes and closes session
when transaction ends, so you do not need to do it externally.
What does "transaction ends" mean here? If I don't start any transaction (I'm guessing Hibernate.initialize() does not start one), does Hibernate close this session?

Probably.
I'm guessing you set current_session_context_class to thread (since you're using beginTransaction). This means that, according to the Javadoc, the session is only usable after a transaction is started and is destroyed automatically when the transaction ends.
I'm not sure what you mean by 'not making any transaction'; you just made one using beginTransaction(). Once you commit or roll back, the transaction ends. Even if you do neither, the transaction will eventually time out, and that also counts as ending the transaction.

It's written like that because in modern apps you control transactions with the @Transactional annotation. You simply put it on the service methods, and a transaction is opened automatically and closed when the method returns.
I don't really know what you think your last line of code is doing, but it looks very wrong. If you want to load an entity, you can simply call session.get(), add @Transactional to your method, and delete session.beginTransaction() and Hibernate.initialize().

How can I use Hibernate/JPA to tell the DB who the user is before inserts/updates/deletes?

Summary (details below):
I'd like to make a stored proc call before any entities are saved/updated/deleted using a Spring/JPA stack.
Boring details:
We have an Oracle/JPA(Hibernate)/Spring MVC (with Spring Data repos) application that is set up to use triggers to record history of some tables into a set of history tables (one history table per table we want audited). Each of these entities has a modifiedByUser being set via a class that extends EmptyInterceptor on update or insert. When the trigger archives any insert or update, it can easily see who made the change using this column (we're interested in which application user, not database user). The problem is that for deletes, we won't get the last modified information from the SQL that is executed because it's just a plain delete from x where y.
To solve this, we'd like to execute a stored procedure to tell the database which app user is logged in before executing any operation. The audit trigger would then look at this value when a delete happens and use it to record who executed the delete.
Is there any way to intercept the begin transaction or some other way to execute SQL or a stored procedure to tell the db what user is executing the inserts/updates/deletes that are about to happen in the transaction before the rest of the operations happen?
I'm light on details about how the database side will work but can get more if necessary. The gist is that the stored proc will create a context that will hold session variables and the trigger will query that context on delete to get the user ID.

From the database end, there is some discussion on this here:
https://docs.oracle.com/cd/B19306_01/network.102/b14266/apdvprxy.htm#i1010372
Many applications use session pooling to set up a number of sessions
to be reused by multiple application users. Users authenticate
themselves to a middle-tier application, which uses a single identity
to log in to the database and maintains all the user connections. In
this model, application users are users who are authenticated to the
middle tier of an application, but who are not known to the
database.....in these situations, the application typically connects
as a single database user and all actions are taken as that user.
Because all user sessions are created as the same user, this security
model makes it very difficult to achieve data separation for each
user. These applications can use the CLIENT_IDENTIFIER attribute to
preserve the real application user identity through to the database.
From the Spring/JPA side of things, see section 8.2 of the following:
http://docs.spring.io/spring-data/jdbc/docs/current/reference/html/orcl.connection.html
There are times when you want to prepare the database connection in
certain ways that aren't easily supported using standard connection
properties. One example would be to set certain session properties in
the SYS_CONTEXT like MODULE or CLIENT_IDENTIFIER. This chapter
explains how to use a ConnectionPreparer to accomplish this. The
example will set the CLIENT_IDENTIFIER.
The example given in the Spring docs uses XML config. If you are using Java config then it looks like:
    @Component
    @Aspect
    public class ClientIdentifierConnectionPreparer implements ConnectionPreparer
    {
        @AfterReturning(pointcut = "execution(* *.getConnection(..))", returning = "connection")
        public Connection prepare(Connection connection) throws SQLException
        {
            String webAppUser = //from Spring Security Context or wherever;
            // try-with-resources closes the statement even if the call fails
            try (CallableStatement cs = connection.prepareCall(
                    "{ call DBMS_SESSION.SET_IDENTIFIER(?) }"))
            {
                cs.setString(1, webAppUser);
                cs.execute();
            }
            return connection;
        }
    }
Enable AspectJ via a Configuration class:
    @Configuration
    @EnableAspectJAutoProxy
    public class SomeConfigurationClass
    {
    }
Note that although this is tucked away in a section specific to Spring's Oracle extensions, nothing in section 8.2 (unlike 8.1) seems to be Oracle-specific other than the statement executed, so the general approach should be feasible with any database simply by specifying the relevant procedure call or SQL.
Postgres, for example, has the following, so I don't see why anyone using Postgres couldn't use this approach:
https://www.postgresql.org/docs/8.4/static/sql-set-role.html

Unless your stored procedure does more than what you described, the cleaner solution is to use Envers (Entity Versioning). Hibernate can automatically store the versions of an entity in a separate table and keep track of all the CRUD operations for you, and you don't have to worry about failed transactions since this will all happen within the same session.
As for keeping track of who made the change, add a new column (updatedBy) and get the login ID of the user from the security principal (e.g. the Spring Security User).
Also check out @CreationTimestamp and @UpdateTimestamp.

I think what you are looking for is a TransactionalEvent:
    @Service
    public class TransactionalListenerService {
        @Autowired
        SessionFactory sessionFactory;

        @TransactionalEventListener(phase = TransactionPhase.BEFORE_COMMIT)
        public void handleEntityCreationEvent(CreationEvent<Entity> creationEvent) {
            // use sessionFactory to run a stored procedure
        }
    }
Registering a regular event listener is done via the @EventListener
annotation. If you need to bind it to the transaction use
@TransactionalEventListener. When you do so, the listener will be
bound to the commit phase of the transaction by default.
Then in your transactional services you register the event where necessary:
    @Service
    public class MyTransactionalService {
        @Autowired
        private ApplicationEventPublisher applicationEventPublisher;

        @Transactional
        public void insertEntityMethod(Entity entity) {
            // insert
            // Publish event after insert operation
            applicationEventPublisher.publishEvent(new CreationEvent<>(this, entity));
            // more processing
        }
    }
This can also work outside the boundaries of a transaction if you have that requirement:
If no transaction is running, the listener is not invoked at all since
we can’t honor the required semantics. It is however possible to
override that behaviour by setting the fallbackExecution attribute of
the annotation to true.

What does "proxied" state mean for a Hibernate Session?

I came across this passage in the Hibernate documentation on the JBoss site:
Because Hibernate can't bind the "current session" to a transaction,
as it does in a JTA environment, it binds it to the current Java thread
when I do transaction demarcation with plain JDBC.
It is opened when getCurrentSession() is called for the first time,
but in a "proxied" state that doesn't allow you to do anything except
start a transaction.
So what exactly does the author mean by "proxied state" here? And what link does it have, if any, to proxy objects?

Without JTA, the transaction management is done through the commit/rollback methods of a JDBC Connection.
This means you have to bind one JDBC Connection to the current running Hibernate Session and to the current logical transaction.
Because passing a JDBC Connection to all Hibernate Session methods would be a terrible design, you have to use thread-local storage instead.
Hibernate has a flexible CurrentSessionContext, offering the following alternatives:
JTASessionContext
ManagedSessionContext
ThreadLocalSessionContext
So if you choose the ThreadLocalSessionContext, the underlying JDBC connection will be bound to thread-local storage and made available to the Session running on the current thread.
If you use Spring, you shouldn't rely on the Hibernate thread-local context, but use the Spring-specific transaction management support, which is implemented by:
SpringJtaSessionContext
SpringSessionContext
As for the proxied state, the Hibernate ThreadLocalSessionContext uses a proxy for the Hibernate Session:
    protected Session wrap(Session session) {
        final TransactionProtectionWrapper wrapper = new TransactionProtectionWrapper( session );
        final Session wrapped = (Session) Proxy.newProxyInstance(
                Session.class.getClassLoader(),
                SESSION_PROXY_INTERFACES,
                wrapper
        );
        wrapper.setWrapped( wrapped );
        return wrapped;
    }
allowing the current Session to unbind itself from the thread-local storage when the Session.close() method is called.
    // If close() is called, guarantee unbind()
    if ( "close".equals( methodName ) ) {
        unbind( realSession.getSessionFactory() );
    }
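To make the "proxied state" concrete, here is a minimal, self-contained sketch of the same idea using only java.lang.reflect.Proxy. It is not Hibernate's implementation: MiniSession, RealSession and the rule that only beginTransaction() is allowed before a transaction starts are simplified assumptions for illustration (the real wrapper permits a few more methods).

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

final class ProxiedSessionSketch {

    /** Hypothetical stand-in for the small part of Session that matters here. */
    interface MiniSession {
        void beginTransaction();
        String get(String id);
        void close();
    }

    /** Plain implementation that does the "real" work. */
    static final class RealSession implements MiniSession {
        boolean inTransaction;
        public void beginTransaction() { inTransaction = true; }
        public String get(String id) { return "row-" + id; }
        public void close() { inTransaction = false; }
    }

    /** The session bound to the current thread, if any. */
    static final ThreadLocal<MiniSession> CURRENT = new ThreadLocal<>();

    /** Returns the thread's session, creating and binding a proxied one on first use. */
    static MiniSession currentSession() {
        MiniSession session = CURRENT.get();
        if (session == null) {
            session = wrap(new RealSession());
            CURRENT.set(session);
        }
        return session;
    }

    /** Wraps the real session in a dynamic proxy that guards every call. */
    static MiniSession wrap(RealSession real) {
        InvocationHandler handler = (proxy, method, args) -> {
            String name = method.getName();
            if (method.getDeclaringClass() == Object.class) {
                return method.invoke(real, args);   // toString(), hashCode(), equals()
            }
            if ("close".equals(name)) {
                CURRENT.remove();                   // close() always unbinds from the thread
            } else if (!"beginTransaction".equals(name) && !real.inTransaction) {
                throw new IllegalStateException(name + " is not valid without an active transaction");
            }
            try {
                return method.invoke(real, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();                 // surface the real exception
            }
        };
        return (MiniSession) Proxy.newProxyInstance(
                MiniSession.class.getClassLoader(),
                new Class<?>[] { MiniSession.class },
                handler);
    }

    public static void main(String[] args) {
        MiniSession session = currentSession();
        try {
            session.get("1");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        session.beginTransaction();
        System.out.println(session.get("1"));
        session.close();
    }
}
```

So the "proxy" here is a java.lang.reflect.Proxy around the Session itself; it is unrelated to the lazy-loading proxies Hibernate creates for entities.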

What will happen if we begin transaction in hibernate but do not commit it?

What will happen if we begin a transaction in Hibernate, then do some work but do not commit it?
Will it be saved temporarily, or will it be rolled back immediately?
Thanks
Chetan

Look at the following code, which accesses the database with transaction boundaries without use of commit:
    Session session = sessionFactory.openSession();
    session.beginTransaction();
    session.get(Item.class, 123L);
    session.close();
By default, in a Java SE environment with a JDBC configuration, this is what happens if you execute this snippet:
A new Session is opened. It doesn’t obtain a database connection at
this point.
If a new underlying transaction is required, begin the transaction.
Otherwise continue the new work in the context of the existing
underlying transaction
The call to get() triggers an SQL SELECT. The Session now obtains a
JDBC Connection from the connection pool. Hibernate, by default,
immediately turns off the autocommit mode on this connection with
setAutoCommit(false). This effectively starts a JDBC transaction!
The SELECT is executed inside this JDBC transaction. The Session is
closed, and the connection is returned to the pool and released by
Hibernate — Hibernate calls close() on the JDBC Connection.
What happens to the uncommitted transaction?
The answer to that question is, “It depends!” The JDBC specification doesn’t say anything about pending transactions when close() is called on a connection. What happens depends on how the vendors implement the specification. With Oracle JDBC drivers, for example, the call to close() commits the transaction! Most other JDBC vendors take the sane route and roll back any pending transaction when the JDBC Connection object is closed and the resource is returned to the pool.
Obviously, this won’t be a problem for the SELECT you’ve executed, but look at this variation:
    Session session = getSessionFactory().openSession();
    session.beginTransaction();
    Long generatedId = (Long) session.save(item);  // save() returns Serializable
    session.close();
This code results in an INSERT statement, executed inside a transaction that is never committed or rolled back. On Oracle, this piece of code inserts data permanently; in other databases, it may not. (This situation is slightly more complicated: The INSERT is executed only if the identifier generator requires it. For example, an identifier value can be obtained from a sequence without an INSERT. The persistent entity is then queued until flush-time insertion — which never happens in this code. An identity strategy requires an immediate INSERT for the value to be generated.)
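Because the outcome of closing a connection with a pending transaction depends on the driver and the pool, a common defensive pattern is to end every transaction explicitly. The following is a minimal sketch of that pattern. The Sess and Tx interfaces are hypothetical stand-ins for the beginTransaction(), commit(), rollback() and close() calls on Hibernate's Session and Transaction, and RecordingSession is a fake that only logs the order of calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class UnitOfWorkSketch {

    /** Hypothetical stand-ins for the Transaction and Session calls used above. */
    interface Tx {
        void commit();
        void rollback();
    }

    interface Sess {
        Tx beginTransaction();
        void close();
    }

    /** Runs work in a transaction: commit on success, rollback on failure, always close. */
    static void inTransaction(Sess session, Consumer<Sess> work) {
        Tx tx = null;
        try {
            tx = session.beginTransaction();
            work.accept(session);
            tx.commit();
        } catch (RuntimeException e) {
            if (tx != null) {
                tx.rollback();   // never leave the outcome to the driver or pool
            }
            throw e;
        } finally {
            session.close();
        }
    }

    /** Fake session (assumption, for illustration only) that records the call order. */
    static final class RecordingSession implements Sess {
        final List<String> calls = new ArrayList<>();

        public Tx beginTransaction() {
            calls.add("begin");
            return new Tx() {
                public void commit() { calls.add("commit"); }
                public void rollback() { calls.add("rollback"); }
            };
        }

        public void close() { calls.add("close"); }
    }

    public static void main(String[] args) {
        RecordingSession session = new RecordingSession();
        inTransaction(session, s -> { /* save or load entities here */ });
        System.out.println(session.calls); // [begin, commit, close]
    }
}
```

With this shape, close() is never reached while a transaction is still pending, so the vendor-specific behaviour described above does not come into play.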

It depends on the Hibernate configuration and the connection pool configuration.
When you try to close a session with an open transaction, Hibernate will by default not call close on the connection proxy (to change this, set hibernate.ejb.discard_pc_on_close to true).
public void close() {
if ( !open ) {
throw new IllegalStateException( "EntityManager is closed" );
}
if ( !discardOnClose && isTransactionInProgress() ) {
Now suppose you have set discard_pc_on_close. In that case Hibernate will call close on the connection proxy (connection pools wrap connections), so the outcome depends on how the connection pool implements it.
You can see the c3p0 implementation in NewPooledConnection.
You will see that it depends on the flag FORCE_IGNORE_UNRESOLVED_TXNS (default is false), so by default it will reset the transaction.
static void resetTxnState( Connection pCon,
boolean forceIgnoreUnresolvedTransactions,
boolean autoCommitOnClose,
boolean txnKnownResolved ) throws SQLException
{
if ( !forceIgnoreUnresolvedTransactions && !pCon.getAutoCommit() )

Hibernate error "database is locked". How do I correctly close a session?

In my application I open a session and create a criteria query, but don't close the session. Then, in another method, I open a session again, update an object, and receive "database is locked" on tr.commit().
If I add session.close() to the first method, I receive
could not initialize proxy - no Session.
How do I close and open sessions correctly? Or do I need to copy the proxy objects into objects I create myself and then call close()?
    Session session = HibernateUtil.getSessionFactory().openSession();
    Transaction tr = session.beginTransaction();
    Criteria criteria = session.createCriteria(MyDocument.class);
    criteria.add(Expression.like("isMainDoc", 1));
    List docs = criteria.list();
    tr.commit();
    session.close();
I am a complete beginner and I use SQLite. Any help would be appreciated. Thanks in advance.

A Hibernate Session is generally tied to a thread.
So perhaps you should restructure your code to obtain a session at the beginning of your processing (e.g. in a ServletFilter instance in a web app).
Then, in each method, you can use the same session object to start a new transaction (and, of course, end the transaction as well).
    public void doWork() {
        Transaction tx = null;
        try {
            tx = session.beginTransaction();
            // ... do the work with the session ...
            tx.commit();
        } catch (RuntimeException e) {
            // the work failed: roll back, then rethrow
            if (tx != null) {
                tx.rollback();
            }
            throw e;
        }
    }
EDIT: And then, of course, close the session when the processing is done (in a web app, that could also be in the same ServletFilter).
Google: "Open Session In View" pattern.

Cause
You are probably getting the error when you try to access properties of the MyDocument instances returned by the query.
Hibernate is lazy by default. It returns a proxy for an object and only hits the database when a reference property is accessed. This behavior can be overridden when required.
Remember that "could not initialize proxy - no Session" is raised when code tries to access a proxy's properties (which requires hitting the database) and finds that the session is no longer available (a Session is needed because Hibernate accesses the database through this interface).
Solution
Make sure that your session is open whenever Hibernate tries to load objects that have not been loaded yet. How do you do that?
(In simple words) There are two schools of thought in Hibernate:
Fetch all the data that you might access before you close the Session, OR
keep the Session open for the entire time you work on the objects.
I would suggest brushing up on topics such as the unit of work in Hibernate. Hibernate provides a wonderful interface for defining boundaries on database access. Data must be accessed (read/written) between these boundaries.
Look at hibernate.current_session_context_class in the Hibernate configuration, which can take the values jta | thread | managed | custom.Class. This setting defines the unit of work for your Session.
Last but most importantly, try using contextual sessions (you must have come across .getCurrentSession()), which give you the same open session everywhere in your code. Hibernate handles everything behind the scenes.
Hope this answer serves as a guide for you for taking the correct path in using Hibernate rather than just solving this particular problem.

Follow the steps below when you are using Hibernate transactions. Read the API here.
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    // Or any other operation.
    session.save(a);
    tx.commit();
    session.close();
