StaleObjectStateException when reading from a Spring Data repository

I have the following Java code:
public Role reproduceIssue(String code) {
    repository.findOneByCode(code);
    roleChangedService.onRoleChanged(code);
    return null;
}
The code first reads a role by code from the repository (in one transaction, marked as read-only) and then calls another method, RoleChangedService#onRoleChanged (in another transaction). The method is like this:
@Transactional
public void onRoleChanged(String code) {
    Role resource = roleRepository.findOneAndLockByCode(code);
    resource.setName(null);
    roleRepository.saveAndFlush(resource);
}
RoleRepository#findOneAndLockByCode is annotated with @Lock(PESSIMISTIC_WRITE).
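For reference, a minimal sketch of what the repository described above might look like (the JpaRepository ID type here is an assumption):

import javax.persistence.LockModeType;
import org.springframework.data.jpa.repository.JpaRepository;
import org.springframework.data.jpa.repository.Lock;

public interface RoleRepository extends JpaRepository<Role, Long> {

    // Spring Data acquires a database-level write lock while this query runs
    @Lock(LockModeType.PESSIMISTIC_WRITE)
    Role findOneAndLockByCode(String code);

    Role findOneByCode(String code);
}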
Every time I send two parallel HTTP requests that call the reproduceIssue method, I get org.hibernate.StaleObjectStateException: Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) on the RoleRepository#findOneAndLockByCode method call.
I wonder how the exception is possible, given that I use the lock on this method?
The reproduction project (as well as the shell scripts that simulate the parallel call) is available here.


How can I tell if current session is dirty?

I want to publish an event if and only if there were changes to the DB. I'm running under @Transactional in a Spring context and I came up with this check:
Session session = entityManager.unwrap(Session.class);
session.isDirty();
That seems to fail for new (Transient) objects:
@Transactional
public Entity save(Entity newEntity) {
    Entity entity = entityRepository.save(newEntity);
    Session session = entityManager.unwrap(Session.class);
    session.isDirty(); // <-- returns `false` ):
    return entity;
}
Based on the answer here https://stackoverflow.com/a/5268617/672689 I would expect it to work and return true.
What am I missing?
UPDATE
Considering @fladdimir's answer: although this function is called in a transaction context, I did add @Transactional (from org.springframework.transaction.annotation) on the function, but I still encounter the same behaviour; isDirty is returning false.
Moreover, as expected, the new entity doesn't show in the DB while the program is held at a breakpoint on the session.isDirty() line.
UPDATE_2
I also tried to change the session flush modes before calling the repo save, also without any effect:
session.setFlushMode(FlushModeType.COMMIT);
session.setHibernateFlushMode(FlushMode.MANUAL);
First of all, Session.isDirty() has a different meaning than what I understood. It tells whether the current session is holding in-memory changes that still haven't been sent to the DB, while I thought it tells whether the transaction contains any changing statements. When saving a new entity, even in a transaction, the insert must be sent to the DB in order to get the new entity's id, therefore isDirty() will always be false after it.
So I ended up creating a class that extends SessionImpl and holds the change status for the session, updating it on persist and merge calls (the functions Hibernate is using).
This is the class I wrote:
import org.hibernate.HibernateException;
import org.hibernate.internal.SessionCreationOptions;
import org.hibernate.internal.SessionFactoryImpl;
import org.hibernate.internal.SessionImpl;

public class CustomSession extends SessionImpl {

    private boolean changed;

    public CustomSession(SessionFactoryImpl factory, SessionCreationOptions options) {
        super(factory, options);
        changed = false;
    }

    @Override
    public void persist(Object object) throws HibernateException {
        super.persist(object);
        changed = true;
    }

    @Override
    public void flush() throws HibernateException {
        changed = changed || isDirty();
        super.flush();
    }

    public boolean isChanged() {
        return changed || isDirty();
    }
}
In order to use it I had to (a sketch of the last two steps follows below):
- extend SessionFactoryImpl.SessionBuilderImpl to override the openSession function and return my CustomSession;
- extend SessionFactoryImpl to override the withOptions function to return the extended SessionFactoryImpl.SessionBuilderImpl;
- extend AbstractDelegatingSessionFactoryBuilderImplementor to override the build function to return the extended SessionFactoryImpl;
- implement SessionFactoryBuilderFactory so that getSessionFactoryBuilder returns the extended AbstractDelegatingSessionFactoryBuilderImplementor;
- add an org.hibernate.boot.spi.SessionFactoryBuilderFactory file under META-INF/services containing the fully qualified class name of my SessionFactoryBuilderFactory implementation (so that Spring picks it up).
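For illustration, a sketch of the SPI entry point from the last two steps, against Hibernate 5 bootstrap types (CustomSessionFactoryBuilder stands for the extended builder from the previous steps; its constructor arguments are illustrative):

import org.hibernate.boot.SessionFactoryBuilder;
import org.hibernate.boot.spi.MetadataImplementor;
import org.hibernate.boot.spi.SessionFactoryBuilderFactory;
import org.hibernate.boot.spi.SessionFactoryBuilderImplementor;

// Discovered via META-INF/services/org.hibernate.boot.spi.SessionFactoryBuilderFactory,
// a file containing this class's fully qualified name.
public class CustomSessionFactoryBuilderFactory implements SessionFactoryBuilderFactory {

    @Override
    public SessionFactoryBuilder getSessionFactoryBuilder(
            MetadataImplementor metadata, SessionFactoryBuilderImplementor defaultBuilder) {
        // Wraps the default builder and eventually returns the extended
        // SessionFactoryImpl that opens CustomSession instances.
        return new CustomSessionFactoryBuilder(metadata, defaultBuilder);
    }
}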
UPDATE
There was a bug with capturing the "merge" calls (as tremendous7 commented), so I ended up capturing the isDirty state before any flush, and also checking it once more in isChanged().
The following is a different approach you might be able to leverage to track dirtiness.
Though architecturally different than your sample code, it may be more to the point of your actual goal (I want to publish an event if and only if there were changes to the DB).
Maybe you could use an Interceptor listener to let the entity manager do the heavy lifting and just TELL you what's dirty. Then you only have to react to it, instead of prod it to sort out what's dirty in the first place.
Take a look at this article: https://www.baeldung.com/hibernate-entity-lifecycle
It has a lot of test cases that basically check for dirtiness of objects being saved in various contexts. It relies on a piece of code called the DirtyDataInspector that listens for any items flagged dirty on flush and remembers them (i.e. keeps them in a list), so the unit tests can assert that the things that SHOULD have been dirty were actually flushed as dirty.
The dirty data inspector code is on their github. Here's the direct link for ease of access.
Here is the code where the interceptor is applied to the factory so it can be effective. You might need to write this up in your injection framework accordingly.
The Interceptor it is based on has a TON of lifecycle methods you can probably exploit to get the perfect behavior for "do this if there was actually a dirty save that occurred".
You can see the full docs of it here.
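In the same spirit as DirtyDataInspector, a minimal interceptor sketch (Hibernate 5 EmptyInterceptor API; the class name is illustrative):

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import org.hibernate.EmptyInterceptor;
import org.hibernate.type.Type;

// Collects every entity Hibernate flags as dirty (update) or transient (insert)
// during flush, so callers can ask afterwards whether anything actually changed.
public class DirtyFlushListener extends EmptyInterceptor {

    private final List<Object> flushedChanges = new ArrayList<>();

    @Override
    public boolean onFlushDirty(Object entity, Serializable id, Object[] currentState,
            Object[] previousState, String[] propertyNames, Type[] types) {
        flushedChanges.add(entity); // an UPDATE is about to be flushed
        return false;               // we did not modify the entity state
    }

    @Override
    public boolean onSave(Object entity, Serializable id, Object[] state,
            String[] propertyNames, Type[] types) {
        flushedChanges.add(entity); // an INSERT is about to be flushed
        return false;
    }

    public boolean hasChanges() {
        return !flushedChanges.isEmpty();
    }
}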
We do not know your complete setup, but as @Christian Beikov suggested in the comment, is it possible that the insertion was already flushed before you called isDirty()?
This would happen when you called repository.save(newEntity) without a running transaction, since SimpleJpaRepository's save method is itself annotated with @Transactional:
@Transactional
@Override
public <S extends T> S save(S entity) {
    ...
}
This will wrap the call in a new transaction if none is already active, and flush the insertion to the DB at the end of the transaction just before the method returns.
You might choose to annotate the method where you call save and isDirty with #Transactional, so that the transaction is created when your method is called, and propagated to the repository call. This way the transaction would not be committed when the save returns, and the session would still be dirty.
(Edit, just for completeness: in case of an identity ID generation strategy, the insertion of a newly created entity is flushed during the repository's save call to generate the ID, before the running transaction is committed.)

Save to database inside CompletableFuture

I'm trying to save an entity to an Oracle DB using CrudRepository, from inside a method which returns a CompletableFuture.
In this method I do a REST call to get the data, process it into an Excel file, then save it to the DB.
public CompletableFuture<byte[]> getCompletableFuture(**some parameters**) {
    return CompletableFuture.supplyAsync(() -> caller.getForObjectList(**some parameters**))
            .thenApply(listWithData -> {
                Collections.sort(**some sorting**);
                // Inside the exportExcel method I create the Excel file and call the save
                // method from my service, which calls the repository extending CrudRepository.
                return excelExportService.exportExcel(listWithData, **some parameters**);
            });
}
I am calling getCompletableFuture(parameters).get() to get the result from the CompletableFuture.
When the save method is called, my entity is populated and everything looks fine, except for the fact that nothing is saved in the DB. I think this might be a transaction problem, since the code runs on a separate thread.
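If it is indeed the transaction boundary, one hedged way to test that hypothesis is to open a programmatic transaction on the worker thread via Spring's TransactionTemplate. This is a sketch only; the injected transactionTemplate is an assumption, not code from the question:

// transactionTemplate is an injected org.springframework.transaction.support.TransactionTemplate
return CompletableFuture.supplyAsync(() -> caller.getForObjectList(**some parameters**))
        .thenApply(listWithData -> transactionTemplate.execute(status -> {
            Collections.sort(**some sorting**);
            // the save inside exportExcel now runs in a transaction bound to this worker thread
            return excelExportService.exportExcel(listWithData, **some parameters**);
        }));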

Transaction handling when wrapping Stream into Flux

I really have issues understanding what's going on behind the scenes when manually wrapping a Stream, received as a query result from Spring Data JPA, into a Flux.
Consider the following:
Entity:
@NoArgsConstructor
@AllArgsConstructor
@Data
@Entity
public class TestEntity {
    @Id
    private Integer a;

    private Integer b;
}
Repository:
public interface TestEntityRepository extends JpaRepository<TestEntity, Integer> {
    Stream<TestEntity> findByBBetween(int b1, int b2);
}
Simple test code:
@Test
@SneakyThrows
@Transactional
public void dbStreamToFluxTest() {
    testEntityRepository.save(new TestEntity(2, 6));
    testEntityRepository.save(new TestEntity(3, 8));
    testEntityRepository.save(new TestEntity(4, 10));
    testEntityFlux(testEntityStream()).subscribe(System.out::println);
    testEntityFlux().subscribe(System.out::println);
    Thread.sleep(200);
}

private Flux<TestEntity> testEntityFlux() {
    return fromStream(this::testEntityStream);
}

private Flux<TestEntity> testEntityFlux(Stream<TestEntity> testEntityStream) {
    return fromStream(() -> testEntityStream);
}

private Stream<TestEntity> testEntityStream() {
    return testEntityRepository.findByBBetween(1, 9);
}

static <T> Flux<T> fromStream(final Supplier<Stream<? extends T>> streamSupplier) {
    return Flux
            .defer(() -> Flux.fromStream(streamSupplier))
            .subscribeOn(Schedulers.elastic());
}
Questions:
Is this the correct way to do what I do, especially regarding the static fromStream method?
While the call to testEntityFlux(testEntityStream()) does what I expect, for reasons I really don't understand, the call to testEntityFlux() runs into an error:
reactor.core.Exceptions$ErrorCallbackNotImplemented: org.springframework.dao.InvalidDataAccessApiUsageException: You're trying to execute a streaming query method without a surrounding transaction that keeps the connection open so that the Stream can actually be consumed. Make sure the code consuming the stream uses @Transactional or any other way of declaring a (read-only) transaction.
Caused by: org.springframework.dao.InvalidDataAccessApiUsageException: You're trying to execute a streaming query method without a surrounding transaction that keeps the connection open so that the Stream can actually be consumed. Make sure the code consuming the stream uses @Transactional or any other way of declaring a (read-only) transaction.
... which is what usually happens when I forget the @Transactional, which I didn't.
EDIT
Note: The code was inspired by: https://github.com/chang-chao/spring-webflux-reactive-jdbc-sample/blob/master/src/main/java/me/changchao/spring/springwebfluxasyncjdbcsample/service/CityServiceImpl.java which in turn was inspired by https://spring.io/blog/2016/07/20/notes-on-reactive-programming-part-iii-a-simple-http-server-application.
However, the Mono version has the same "issue".
EDIT 2
An example using Optional: note that in testEntityMono(), replacing testEntityOptional() with testEntityOptionalManual() leads to working code. Thus it all seems to be directly related to how JPA does the data fetching:
@SneakyThrows
@Transactional
public void dbOptionalToMonoTest() {
    testEntityRepository.save(new TestEntity(2, 6));
    testEntityRepository.save(new TestEntity(3, 8));
    testEntityRepository.save(new TestEntity(4, 10));
    testEntityMono(testEntityOptional()).subscribe(System.out::println);
    testEntityMono().subscribe(System.out::println);
    Thread.sleep(1200);
}

private Mono<TestEntity> testEntityMono() {
    return fromSingle(() -> testEntityOptional().get());
}

private Mono<TestEntity> testEntityMono(Optional<TestEntity> testEntity) {
    return fromSingle(() -> testEntity.get());
}

private Optional<TestEntity> testEntityOptional() {
    return testEntityRepository.findById(4);
}

@SneakyThrows
private Optional<TestEntity> testEntityOptionalManual() {
    Thread.sleep(1000);
    return Optional.of(new TestEntity(20, 20));
}

static <T> Mono<T> fromSingle(final Supplier<T> tSupplier) {
    return Mono
            .defer(() -> Mono.fromSupplier(tSupplier))
            .subscribeOn(Schedulers.elastic());
}
TL;DR:
It boils down to the differences between imperative and reactive programming assumptions and Thread affinity.
Details
We first need to understand what happens with transaction management to understand why your arrangement ends with a failure.
Using a #Transactional method creates a transactional scope for all code within the method. Transactional methods returning scalar values, Stream, collection-like types, or void (basically non-reactive types) are considered imperative transactional methods.
In imperative programming, flows stick to their carrier Thread. The code is expected to remain on the same Thread and not to switch threads. Therefore, transaction management associates transactional state and resources with the carrier Thread in a ThreadLocal storage. As soon as code within a transactional method switches threads (e.g. spinning up a new Thread or using a Thread pool), the unit of work that gets executed on a different Thread leaves the transactional scope and potentially runs in its own transaction. In the worst case, the transaction is left open on an external Thread because there is no transaction manager monitoring entry/exit of the transactional unit of work.
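To make the ThreadLocal binding tangible, a minimal sketch using Spring's TransactionSynchronizationManager (the method name below is illustrative; the API calls are real Spring APIs):

import java.util.concurrent.CompletableFuture;
import org.springframework.transaction.annotation.Transactional;
import org.springframework.transaction.support.TransactionSynchronizationManager;

@Transactional
public void illustrateThreadBinding() throws Exception {
    // true: the transaction is bound to this (carrier) thread
    boolean onCarrier = TransactionSynchronizationManager.isActualTransactionActive();

    // false: a pool thread has no such ThreadLocal state, so work executed
    // there silently leaves the transactional scope
    boolean onPool = CompletableFuture
            .supplyAsync(TransactionSynchronizationManager::isActualTransactionActive)
            .get();
}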
#Transactional methods returning a reactive type (such as Mono or Flux) are subject to reactive transaction management. Reactive transaction management is different from imperative transaction management as the transactional state is attached to a Subscription, specifically the subscriber Context. The context is only available with reactive types, not with scalar types as there are no means to attach data to void or a String.
Looking at the code:
@Test
@Transactional
public void dbStreamToFluxTest() {
    // …
}
we see that this method is a @Transactional test method. Here we have two things to consider:
1. The method returns void, so it is subject to imperative transaction management, associating the transactional state with a ThreadLocal.
2. There's no reactive transaction support for @Test methods, because typically a Publisher is expected to be returned from the method, and by doing so, there would be no way to assert the outcome of the stream.
@Test
@Transactional
public Publisher<Object> thisDoesNotWork() {
    return myRepository.findAll(); // Where did my assertions go?
}
Let's take a closer look at the fromStream(…) method:
static <T> Flux<T> fromStream(final Supplier<Stream<? extends T>> streamSupplier) {
    return Flux
            .defer(() -> Flux.fromStream(streamSupplier))
            .subscribeOn(Schedulers.elastic());
}
The code accepts a Supplier that returns a Stream. Next, subscription (subscribe(…), request(…)) signals are instructed to happen on the elastic Scheduler which effectively switches on which Thread the Stream gets created and consumed. Therefore, subscribeOn causes the Stream creation (call to findByBBetween(…)) to happen on a different Thread than your carrier Thread.
Removing subscribeOn(…) will fix your issue.
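With the hop removed, the helper reduces to the following; subscription, and thus Stream creation and consumption, then stays on the transactional carrier Thread:

// No subscribeOn(...): the Stream is created and consumed on whichever thread
// subscribes, i.e. the thread that holds the ThreadLocal-bound transaction.
static <T> Flux<T> fromStream(final Supplier<Stream<? extends T>> streamSupplier) {
    return Flux.defer(() -> Flux.fromStream(streamSupplier));
}

Note that the blocking database work then runs on the subscriber's thread, which in this case is exactly the point.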
There is a bit more to explain why you want to refrain from using reactive types with JPA. Reactive programming has no strong Thread affinity. Thread switching may occur at any time. Depending on how you use the resulting Flux and how you have designed your entities, you might experience visibility issues as entities are passed across threads. Ideally, data in a reactive context remains immutable. Such an approach does not always comply with JPA rules.
Another aspect is lazy loading. By using JPA entities from threads other than the carrier Thread, the entity may not be able to correlate its context back to the JPA Transaction. You can easily run into LazyInitializationException without being aware of why this is as Thread switching can be opaque to you.
The recommendation is: Do not use reactive types with JPA or any other transactional resources. Stay with Java 8 Stream instead.
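Staying imperative might look like this (a sketch reusing the question's repository; the service method name is illustrative):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import org.springframework.transaction.annotation.Transactional;

// Consume the Stream on the carrier thread, inside the transaction that keeps
// the connection open; hand a plain List to any async or reactive code afterwards.
@Transactional(readOnly = true)
public List<TestEntity> findBetween(int b1, int b2) {
    try (Stream<TestEntity> stream = testEntityRepository.findByBBetween(b1, b2)) {
        return stream.collect(Collectors.toList());
    }
}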
The Stream returned by the repository is lazy. It uses the connection to the database in order to get the rows when the stream is being consumed by a terminal operation.
The connection is bound to the current transaction, and the current transaction is stored in a ThreadLocal variable, i.e. it is bound to the thread that is executing your test method.
But the consumption of the stream is done on a separate thread, belonging to the thread pool used by the elastic scheduler of Reactor. So you create the lazy stream on the main thread, which has the transaction bound to it, but you consume the stream on a separate thread, which doesn't have the transaction bound to it.
Don't use reactor with JPA transactions and entities. They're incompatible.

Spring JPA: Change attribute of instance that is simultaneously being modified in scheduled task

I'm running a scheduled task in my Spring application that runs a job. The job itself is fetched at the beginning of the task. After that, a loop takes place that modifies the job in each iteration (incrementing a counter). After the loop I merge my instance using the entity manager. It works fairly well, but I'm facing an issue trying to modify the instance from another place. Since the instance has a 'paused' flag, I'm trying to set it. But whenever I do, it's quickly reset again, due to the scheduled task unsetting it (as far as I can tell).
Here's some code:
// This method is called using the @Scheduled annotation to be looping
// constantly with one second delay between invocations.
@Transactional
public void performActions() {
    Job job = jobRepository.findFirstByPausedAtIsNull();
    // Skip if no unpaused job exists
    if (job == null) return;
    // Iterate through batch of job actions
    for (Action action : job.nextActions()) {
        action.perform();
        job.increaseActionsPerformedCount();
        // Merge the action into the persistence context
        entityManager.merge(action);
    }
    // Merge the job into the persistence context
    entityManager.merge(job);
}
Now I'm trying to be able to pause the job at any time from the outside. I use a controller endpoint to call a pause method on the jobService. This method looks like this:
public Job pause(long id) throws JobNotFoundException, JobStatusException {
    Job job = this.show(id);
    if (job.getPausedAt() != null) throw new JobStatusException("The job is already paused");
    job.pause(); // This sets the flag on the instance, same as job.setPausedAt(new Date())
    return jobRepository.save(job); // Uses CrudRepository
}
Calling the method works fine and it actually returns the Job with pausedAt set, but the value is reset shortly after.
I've tried just straight up fetching a fresh instance from the database at the end of performActions and setting the modified instance's pausedAt to the freshly fetched one's value.
Any idea how this could be achieved properly?
As far as I understand, you need to stop the job when the pause flag is set. You can achieve this by applying an optimistic lock: add a @Version field to Job, then apply LockModeType.OPTIMISTIC to the job that you retrieve in performActions(), either by adding it to the find() method or by calling refresh() after retrieval (the first is better). Now, if the other endpoint changes the pause flag, the version field will be incremented and you will get an OptimisticLockException at persist time. This has some implications:
1- Whatever state changes in the Job, the same behavior will happen (not only for the pause field).
2- You will need to handle the exception from inside the persistence context (i.e. inside performActions()), because after returning it might be mapped to another exception type.
This is the idea I have now; maybe there is something better that gives you more control (tracking only the pause attribute).
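A sketch of that suggestion; field names and the lookup-by-id are illustrative, since the original code loads the job via a derived query:

import java.util.Date;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.LockModeType;
import javax.persistence.OptimisticLockException;
import javax.persistence.Version;

@Entity
public class Job {
    @Id
    private Long id;

    @Version
    private Long version; // bumped by every committed change, including pause()

    private Date pausedAt;
    // ...
}

// In performActions(), load with an optimistic lock and handle the conflict:
Job job = entityManager.find(Job.class, jobId, LockModeType.OPTIMISTIC);
try {
    // ... perform actions, merge ...
} catch (OptimisticLockException e) {
    // pause() (or any other concurrent change) won the race; skip this round
}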

JPA correct way to handle detached entity state in case of exceptions/rollback

I have this class and I thought of three ways to handle the detached entity state in case of persistence exceptions (which are handled elsewhere):
@ManagedBean
@ViewScoped
public class EntityBean implements Serializable
{
    @EJB
    private PersistenceService service;

    private Document entity;

    public void update()
    {
        // HANDLING 1. ignore errors
        service.transact(em ->
        {
            entity = em.merge(entity);
            // some other code that modifies [entity] properties:
            // entity.setCode(...);
            // entity.setResposible(...);
            // entity.setSecurityLevel(...);
        }); // an exception may be thrown on method return (rollback),
            // but [entity] has already been reassigned with a "dirty" one.

        //------------------------------------------------------------------

        // HANDLING 2. ensure entity is untouched before flush is ok
        service.transact(em ->
        {
            Document managed = em.merge(entity);
            // some other code that modifies [managed] properties:
            // managed.setCode(...);
            // managed.setResposible(...);
            // managed.setSecurityLevel(...);
            em.flush(); // an exception may be thrown here (rollback),
                        // forcing method exit without [entity] being reassigned.
            entity = managed;
        }); // an exception may be thrown on method return (rollback),
            // but [entity] has already been reassigned with a "dirty" one.

        //------------------------------------------------------------------

        // HANDLING 3. ensure entity is untouched before whole transaction is ok
        AtomicReference<Document> reference = new AtomicReference<>();
        service.transact(em ->
        {
            Document managed = em.merge(entity);
            // some other code that modifies [managed] properties:
            // managed.setCode(...);
            // managed.setResposible(...);
            // managed.setSecurityLevel(...);
            reference.set(managed);
        }); // an exception may be thrown on method return (rollback),
            // and [entity] is safe, it's not been reassigned yet.
        entity = reference.get();
    }
    ...
}
PersistenceService#transact(Consumer<EntityManager> consumer) can throw unchecked exceptions.
The goal is to maintain the state of the entity aligned with the state of the database, even in case of exceptions (prevent entity to become "dirty" after transaction fail).
Method 1. is obviously naive and doesn't guarantee coherence.
Method 2. asserts that nothing can go wrong after flushing.
Method 3. prevents the new entity assignment if there's an exception anywhere in the whole transaction.
Questions:
Is method 3. really safer than method 2.?
Are there cases where an exception is thrown between flush [excluded] and commit [included]?
Is there a standard way to handle this common problem?
Thank you
Note that I'm already able to roll back the transaction and close the EntityManager (PersistenceService#transact will do it gracefully), but I need to solve the problem that the database state and the business objects get out of sync. Usually this is not a problem. In my case it is, because exceptions are usually generated by the BeanValidator (those on the JPA side, not the JSF side, for computed values that depend on user inputs) and I want the user to input correct values and try again, without losing the values entered before.
Side note: I'm using Hibernate 5.2.1
This is the PersistenceService (CMT):
@Stateless
@Local
public class PersistenceService implements Serializable
{
    @PersistenceContext
    private EntityManager em;

    @TransactionAttribute(TransactionAttributeType.REQUIRED)
    public void transact(Consumer<EntityManager> consumer)
    {
        consumer.accept(em);
    }
}
@DraganBozanovic
That's it! Great explanation for points 1. and 2.
I'd just love you to elaborate a little more on point 3. and give me some advice on a real-world use case.
However, I would definitely not use AtomicReference or similar cumbersome constructs. Java EE, Spring and other frameworks and application containers support declaring transactional methods via annotations: Simply use the result returned from a transactional method.
When you have to modify a single entity, the transactional method would just take the detached entity as parameter and return the updated entity, easy.
public Document updateDocument(Document doc)
{
    Document managed = em.merge(doc);
    // managed.setXxx(...);
    // managed.setYyy(...);
    return managed;
}
But when you need to modify more than one in a single transaction, the method can become a real pain:
public LinkTicketResult linkTicket(Node node, Ticket ticket)
{
    LinkTicketResult result = new LinkTicketResult();

    Node managedNode = em.merge(node);
    result.setNode(managedNode);
    // modify managedNode

    Ticket managedTicket = em.merge(ticket);
    result.setTicket(managedTicket);
    // modify managedTicket

    Remark managedRemark = createRemark(...);
    result.setRemark(managedRemark);

    return result;
}
In this case, my pain:
- I have to create a dedicated transactional method (maybe a dedicated @EJB too).
- That method will be called only once (it will have just one caller); it is a "one-shot", non-reusable public method. Ugly.
- I have to create the dummy class LinkTicketResult.
- That class will be instantiated only once, in that method; it is "one-shot" too.
- The method could have many parameters (or another dummy class, LinkTicketParameters).
- JSF controller actions, in most cases, will just call an EJB method, extract updated entities from the returned container and reassign them to local fields.
My code will be steadily polluted with "one-shotters", too many for my taste.
Probably I'm not seeing something big that's just in front of me, I'll be very grateful if you can point me in the right direction.
Is method 3. really safer than method 2.?
Yes. Not only is it safer (see point 2), but it is conceptually more correct, as you change transaction-dependent state only when you proved that the related transaction has succeeded.
Are there cases where an exception is thrown between flush [excluded] and commit [included]?
Yes. For example:
- LockMode.OPTIMISTIC: "Optimistically assume that transaction will not experience contention for entities. The entity version will be verified near the transaction end." It would be neither performant nor practically useful to check optimistic lock violations during each flush operation within a single transaction.
- Deferred integrity constraints (enforced at commit time in the db). Not used often, but an illustrative example for this case.
- Later maintenance and refactoring. You or somebody else may later introduce additional changes after the last explicit call to flush.
Is there a standard way to handle this common problem?
Yes, I would say that your third approach is the standard one: Use the results of a complete and successful transaction.
However, I would definitely not use AtomicReference or similar cumbersome constructs. Java EE, Spring and other frameworks and application containers support declaring transactional methods via annotations: Simply use the result returned from a transactional method.
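Applied to the single-entity updateDocument example shown earlier, the call site then reduces to something like this (a sketch; documentService stands for whatever transactional bean hosts updateDocument):

// Reassign the field only after the transactional method returned normally,
// i.e. only after the transaction has committed successfully.
public void update()
{
    entity = documentService.updateDocument(entity);
}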
Not sure if this is entirely to the point, but there is only one way to recover after exceptions: rollback and close the EM. From https://docs.jboss.org/hibernate/entitymanager/3.6/reference/en/html/transactions.html#transactions-basics-issues
An exception thrown by the Entity Manager means you have to rollback your database transaction and close the EntityManager immediately (discussed later in more detail). If your EntityManager is bound to the application, you have to stop the application. Rolling back the database transaction doesn't put your business objects back into the state they were at the start of the transaction. This means the database state and the business objects do get out of sync. Usually this is not a problem, because exceptions are not recoverable and you have to start over your unit of work after rollback anyway.
-- EDIT--
Also see http://piotrnowicki.com/2013/03/jpa-and-cmt-why-catching-persistence-exception-is-not-enough/
ps: downvote is not mine.
