I have a table products. In this table I need an is_active column with a constraint: only one row with the same type can be true.
I have a service for saving a new Product with a check:
@Service
public class ProductServiceImpl implements ProductService {

    private final ProductRepository productRepository;

    public ProductServiceImpl(ProductRepository productRepository) {
        this.productRepository = productRepository;
    }

    @Override
    public void save(Product product) {
        Product productInDb = productRepository.findOneByTypeAndIsActive(product.getType());
        if (productInDb != null)
            throw new AlreadyActiveException();
        product.setActive(true);
        productRepository.saveAndFlush(product);
    }
}
When I call the save method from a few threads, findOneByTypeAndIsActive returns null for productInDb in both threads, because there are no active products in the table yet.
Each thread then calls product.setActive(true); and tries to save to the DB.
If I don't have the constraint in the DB, both products are saved with is_active = true and this check never fires:
if (productInDb != null)
    throw new AlreadyActiveException();
My question: can I fix this without adding a constraint in the DB?
And is the check above useless?
From my point of view this is not the best table design: an is_active flag in the record structure, paired with the restriction that only one record in the table can be active at a time.
You either have to use database schema constraints, or you have to lock the whole table with all its records. How to lock a whole table for modification is database specific, and I don't think JPA natively supports such locks.
But you wrote:
Can I fix this without adding of constraint in DB?
No, it is not possible with a strict guarantee for all clients.
But if you have only one application that uses this table, you can use local, application-specific locks; for example, you can create read/write Java locks at the @Service level.
Your operation consists of 2 actions:
Get an entity from DB
Save a new entity if it doesn't exist
Your problem is that a few threads can start this operation at the same time and not see each other's changes. That's definitely not what you need: the operation, which consists of a few actions, must be atomic.
If I understand correctly, you have a rule to keep only one active product of the same type in the datastore. It sounds like a data consistency requirement, which should be solved at the application level.
The most naive option to solve your problem is to acquire a lock before performing the operation. It may be done either with synchronized or with an explicit lock:
@Override
public synchronized void save(Product product) {
    Product productInDb = productRepository.findOneByTypeAndIsActive(product.getType());
    if (productInDb != null)
        throw new AlreadyActiveException();
    product.setActive(true);
    productRepository.saveAndFlush(product);
}
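The explicit-lock flavour of the same idea, as a minimal sketch (the lock field and the ReentrantLock import are my additions; everything else matches the code above):

import java.util.concurrent.locks.ReentrantLock;

// Same mutual exclusion as synchronized, but an explicit lock also offers
// fairness and tryLock with a timeout. Like synchronized, it only protects
// threads inside this one JVM.
private final ReentrantLock saveLock = new ReentrantLock();

@Override
public void save(Product product) {
    saveLock.lock();
    try {
        Product productInDb = productRepository.findOneByTypeAndIsActive(product.getType());
        if (productInDb != null)
            throw new AlreadyActiveException();
        product.setActive(true);
        productRepository.saveAndFlush(product);
    } finally {
        saveLock.unlock();
    }
}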
My Spring component gets a request from a client, asks a web-service about some data and saves received objects to a database.
I identify all objects and save only new ones.
The issue occurs when the client makes two or more identical requests at the same time (or when I receive the same objects from the web-service due to different user requests).
To describe the persistence issue, here are some details. For each client request my component starts execution in a separate thread; I get a new entityManager, begin a transaction, receive the data from the web-service, then identify the objects and persist the new ones using the given entityManager in the current transaction.
If I receive the same objects from the web-service in separate transactions, and they are new objects not yet in the database, I cannot identify them across uncommitted transactions, so they are persisted in all transactions. All the duplicate objects are then committed and saved to the database.
What could be good solutions in this case? Is there any way to identify new objects properly even in different transactions? Or what approaches can be applied?
Maybe Spring provides some approach to manage transactions or entityManagers that could help with this issue...
Note: of course I could use database instruments to avoid saving duplicate objects, but in this case that is not a very good solution.
Check if the objects are present in the database before saving.
Use @UniqueConstraint or @Column(unique = true) to prevent duplicate rows, and handle the exceptions appropriately (a sketch follows this list).
Use @Version to manage concurrent modification of existing entities. More about optimistic and pessimistic locking: Chapter 5. Locking. Related discussions: Hibernate Automatic Versioning and When to use @Version and @Audited in Hibernate?
You may use thread locks / synchronization mechanisms to ensure that requests for the same user happen in order. However, this won't work if your service is running on more than one node.
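To make the unique-constraint and @Version suggestions concrete, here is a minimal sketch of an entity carrying both; the entity and column names are invented for illustration:

import javax.persistence.*;

// Hypothetical entity: "external_id" stands in for whatever business key
// identifies the objects received from the web-service.
@Entity
@Table(name = "imported_object",
       uniqueConstraints = @UniqueConstraint(columnNames = "external_id"))
public class ImportedObject {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    // Duplicates are rejected by the database; the second transaction to
    // commit gets a constraint violation, which you catch and handle.
    @Column(name = "external_id", nullable = false)
    private String externalId;

    // Optimistic locking: Hibernate increments this on every update and
    // throws an OptimisticLockException on concurrent modification.
    @Version
    private Long version;

    // getters and setters omitted
}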
So the solution in my case is the following:
1. Make transactions pretty small and commit every object separately.
2. Create unique constraints in the database to prevent duplicate objects. This point alone will not help much, but it is needed for point 3.
3. Wrap every commit() call in a try-catch block. If we try to commit a duplicate object in parallel transactions, we will receive an exception, and in the catch block we can check the database, select the object that is already there, and work with it further.
The example:
boolean reidentifyNeed = false;
try {
    DofinService.getEntityManagerThreadLocal().getTransaction().begin();
    DofinService.getEntityManagerThreadLocal().persist(entity);
    try {
        DofinService.getEntityManagerThreadLocal().getTransaction().commit();
        // if the commit is successful
        entityIdInDB = (long) entity.getId();
        DofinService.getEntityManagerThreadLocal().clear();
    } catch (Exception ex) {
        logger.error("Error committing " + entity.getClass().getSimpleName()
                + " in DB. Possibly duplicate object. Will try to re-identify object. Error: " + ex.toString());
        reidentifyNeed = true;
    }
    if (reidentifyNeed) {
        // need to clear the entityManager, because if the duplicated object was
        // persisted, then during the *select* the flush() method will be executed
        // and it will throw a ConstraintViolationException
        DofinService.getEntityManagerThreadLocal().clear();
        CheckSimilarObject checkSimilarObject = new CheckSimilarObject();
        long objectId = checkSimilarObject.checkObject(dofinObject);
        logger.warn("Re-identifying was done. EntityId = " + objectId);
        entityIdInDB = objectId;
    }
} catch (Exception ex) {
    logger.error("Error persisting and committing object: " + ex.toString());
}
I was curious whether Spring JpaRepository methods are thread safe, so I read the Stack Overflow article (Is a Spring Data (JPA) Repository thread-safe? (aka is SimpleJpaRepository thread safe)). From it I understood that repository methods are thread safe, and then I made a POC to test the thread safety. I made a repository, say FormRepository, to do CRUD operations for a 'form' entity, extending JpaRepository. From the DAO I simply launched 100 threads, each building a form object, manually setting its id, and then saving the 'form' object.
Below is the code for reference:
@Repository
public interface FormRepository extends JpaRepository<Tbldynamicform, Long> {

    Tbldynamicform save(Tbldynamicform tblform);

    @Query("SELECT max(tblform.formid) FROM Tbldynamicform tblform")
    Optional<Integer> findMaxId();
}
......End of Repository above and start of DAO below...
@Component
public class DynamicFormDAOImpl implements DynamicFormDAO {

    @Inject
    private FormRepository formRepository;

    public void testThreadSafety() throws Exception {
        List<Callable<Integer>> tasks = new ArrayList<>(100);
        for (int i = 0; i < 100; i++) {
            tasks.add(() -> {
                try {
                    Tbldynamicform tbldynamicform = new Tbldynamicform(); // set all the required fields for the form
                    if (tbldynamicform.getFormid() == null)
                        tbldynamicform.setFormid(findFormID());
                    Tbldynamicform form = formRepository.save(tbldynamicform);
                    return form.getFormid();
                } catch (Exception e) {
                    e.printStackTrace();
                }
                return null;
            });
        }
        ExecutorService executor = Executors.newFixedThreadPool(100);
        executor.invokeAll(tasks);
        executor.shutdown();
    }

    private int findFormID() throws Exception {
        Optional<Integer> id = formRepository.findMaxId();
        if (id != null && id.isPresent() && id.get() != null) {
            int generatedId = id.get().intValue();
            return ++generatedId;
        }
        return 0;
    }
}
When I ran this, I assumed everything would work fine because the repository methods are thread safe, but I am getting the SQL DataIntegrityViolationException several times in the logs, making the insertion of several records fail. Below error for reference:
org.springframework.dao.DataIntegrityViolationException: could not execute statement; SQL [n/a]; constraint ["PRIMARY KEY ON PUBLIC.TBLDYNAMICFORM(FORMID)"; SQL statement:
insert into Tbldynamicform (clientid, copyfromexisting, creationdate, formdesc, formmode, formname, formtemplate, formtitle, procutype, status, formid) values (?, ?, ?, ?,...
This has made me think: is this a thread-safety problem or something else? In my understanding, all the 'tbldynamicform' objects created in my DAO remain on the thread stack. Only the formRepository lives on the heap, and if the formRepository methods are thread safe, 100 records should be inserted into the database without any problem.
If I do the setId and save in a synchronized block, everything works OK, but that is not my intention and should not be required if the repository methods are thread safe.
Experts, any help please?
Your saving task is not atomic: two threads might fetch the same maximum id before either of them saves the new entity.
So even if the save method of the repository is thread-safe, it won't help.
findMaxId is thread safe, save is thread safe, but the task each of your threads runs is not thread safe.
Simply put, yes, it is thread safe, but your database is also stateful (obviously), and for integrity to be maintained you may need things like a locking strategy (hold locks to make things synchronous, or use an optimistic strategy and retry where required). As someone noted in another answer, if you simply used a different method of generating an ID (check out SUID), your code would work fine.
The problem comes from how you retrieve the last ID with findFormID(); it doesn't work in a concurrent context.
What if two threads ask for an ID at the same time? They will retrieve the same ID and create two objects with the same ID. Here is your problem.
Integrated solutions for generated IDs already exist, and you should not try to implement your own unless you know what you are doing.
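For example, a database-generated key removes the read-then-write race entirely. A sketch, assuming the id column can be backed by a sequence (the sequence name is invented):

import javax.persistence.*;

@Entity
public class Tbldynamicform {

    // The sequence hands out ids atomically, so no two inserts can receive
    // the same value, no matter how many threads call save() concurrently.
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "form_seq")
    @SequenceGenerator(name = "form_seq", sequenceName = "form_seq", allocationSize = 1)
    private Long formid;

    // remaining fields omitted
}

With a mapping like this, findFormID() and the manual setFormid(...) call are no longer needed.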
Today I faced the following problem in Hibernate:
My method:
@Transactional
public Period getDefault(Team team) {
    Period defaultPeriod = team.getDefaultPeriod();
    List<Period> periods = _periodDAO.getPeriods(team);
    if (!periods.contains(defaultPeriod)) {
        defaultPeriod = periods.get(periods.size() - 1);
    }
    _periodDAO.initializeIssues(defaultPeriod);
    return defaultPeriod;
}
Method initializeIssues:
public void initializeIssues(Period period) {
    if (period.getIssues() != null) {
        Hibernate.initialize(period.getIssues());
    }
}
I receive an exception if the collection periods contains defaultPeriod:
Caused by: org.hibernate.HibernateException: collection is not associated with any session
at org.hibernate.collection.AbstractPersistentCollection.forceInitialization(AbstractPersistentCollection.java:474)
at org.hibernate.Hibernate.initialize(Hibernate.java:417)
But if I remove some lines and change the method to:
@Transactional
public Period getDefault(Team team) {
    Period defaultPeriod = team.getDefaultPeriod();
    _periodDAO.initializeIssues(defaultPeriod);
    return defaultPeriod;
}
it works fine.
I debugged the first example, and the Hibernate session does not close during the whole method.
As I understand it: if a loaded object (an element of periods) in the session has a collection associated with the active session, and an object that existed before (defaultPeriod) holds the same collection, then defaultPeriod loses its association.
Is that true? Has anyone else faced the same problem?
Thank you for your answers.
Presumably, your Team argument is coming from another transaction and another Hibernate Session.
When a @Transactional method returns, the TransactionManager closes the Session, which does some cleanup and unsets (sets to null) the Session field of all PersistentCollection instances. Your defaultPeriod has one of these in its issues field.
Hibernate.initialize() forces the initialization of a lazy PersistentCollection, but contains the following code (in AbstractPersistentCollection#forceInitialization(), which it calls):
if ( session == null ) {
    throw new HibernateException( "collection is not associated with any session" );
}
If you are planning on using the issues collection outside the original @Transactional method (the code that produces Team), you need to load the underlying objects. Either change the mapping to EAGER loading or do what you are doing with Hibernate.initialize().
Another solution is to make the Session last longer than just the length of the first @Transactional, but I don't have details for that. A quick Google or SO search should bring up some options.
This is what is happening:
Period defaultPeriod = team.getDefaultPeriod();
gets a Period object with id (for example) 42. Because it was loaded in another Session that has since been closed, issues is a PersistentCollection with a null Session reference, and it will throw the exception you get.
Then you do this:
List<Period> periods = _periodDAO.getPeriods(team);
Let's say the List contains a Period object with id 42, so the if in
if (!periods.contains(defaultPeriod)) {
defaultPeriod = periods.get(periods.size() - 1);
}
doesn't get executed. Although equals() returns true (contains() also returns true, and the condition becomes false because of the !), the objects are not the same. The one in the List has an attached (non-null) Session, so that one can be initialized. But yours, the one held by defaultPeriod, cannot.
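If you prefer to avoid the detached collection altogether, one option is to (re)load the Period together with its issues inside the open Session, for example with a fetch join. A sketch, assuming a Period.issues mapping:

// Reload within the current Session, fetching the collection in the same
// query; the returned entity's issues are initialized and attached.
public Period loadWithIssues(long periodId) {
    return (Period) session.createQuery(
            "select p from Period p left join fetch p.issues where p.id = :id")
        .setParameter("id", periodId)
        .uniqueResult();
}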
I'm trying to write a method that will return a Hibernate object based on a unique but non-primary key. If the entity already exists in the database I want to return it, but if it doesn't I want to create a new instance and save it before returning.
UPDATE: Let me clarify that the application I'm writing this for is basically a batch processor of input files. The system needs to read a file line by line and insert records into the db. The file format is basically a denormalized view of several tables in our schema, so what I have to do is parse out the parent record and either insert it into the db, so that I get a new synthetic key, or select it if it already exists. Then I can add additional associated records in other tables that have foreign keys back to that record.
The reason this gets tricky is that each file needs to be either totally imported or not imported at all, i.e. all inserts and updates for a given file should be part of one transaction. This is easy enough if there's only one process doing all the imports, but I'd like to spread this across multiple servers if possible. Because of these constraints I need to be able to stay inside one transaction but handle the exceptions where a record already exists.
The mapped class for the parent records looks like this:
@Entity
public class Foo {

    @Id
    @GeneratedValue(strategy = IDENTITY)
    private int id;

    @Column(unique = true)
    private String name;

    ...
}
My initial attempt at writing this method is as follows:
public Foo findOrCreate(String name) {
    Foo foo = new Foo();
    foo.setName(name);
    try {
        session.save(foo);
    } catch (ConstraintViolationException e) {
        foo = (Foo) session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
    }
    return foo;
}
The problem is that when the name I'm looking for already exists, an org.hibernate.AssertionFailure exception is thrown by the call to uniqueResult(). The full stack trace is below:
org.hibernate.AssertionFailure: null id in com.searchdex.linktracer.domain.LinkingPage entry (don't flush the Session after an exception occurs)
at org.hibernate.event.def.DefaultFlushEntityEventListener.checkId(DefaultFlushEntityEventListener.java:82) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.event.def.DefaultFlushEntityEventListener.getValues(DefaultFlushEntityEventListener.java:190) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.event.def.DefaultFlushEntityEventListener.onFlushEntity(DefaultFlushEntityEventListener.java:147) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.event.def.AbstractFlushingEventListener.flushEntities(AbstractFlushingEventListener.java:219) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.event.def.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:99) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(DefaultAutoFlushEventListener.java:58) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.impl.SessionImpl.autoFlushIfRequired(SessionImpl.java:1185) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1709) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.impl.CriteriaImpl.list(CriteriaImpl.java:347) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
at org.hibernate.impl.CriteriaImpl.uniqueResult(CriteriaImpl.java:369) [hibernate-core-3.6.0.Final.jar:3.6.0.Final]
Does anyone know what is causing this exception to be thrown? Does Hibernate support a better way of accomplishing this?
Let me also preemptively explain why I insert first and then select if and when that fails. This needs to work in a distributed environment, so I can't synchronize across the check for whether the record already exists and the insert. The easiest way is to let the database handle the synchronization by checking for the constraint violation on every insert.
I had a similar batch processing requirement, with processes running on multiple JVMs. The approach I took for this was as follows. It is very much like jtahlborn's suggestion. However, as vbence pointed out, if you use a NESTED transaction, when you get the constraint violation exception, your session is invalidated. Instead, I use REQUIRES_NEW, which suspends the current transaction and creates a new, independent transaction. If the new transaction rolls back it will not affect the original transaction.
I am using Spring's TransactionTemplate but I'm sure you could easily translate it if you do not want a dependency on Spring.
public T findOrCreate(final T t) throws InvalidRecordException {
    // 1) look for the record
    T found = findUnique(t);
    if (found != null)
        return found;
    // 2) if not found, start a new, independent transaction
    TransactionTemplate tt = new TransactionTemplate(
            (PlatformTransactionManager) transactionManager);
    tt.setPropagationBehavior(TransactionDefinition.PROPAGATION_REQUIRES_NEW);
    try {
        found = (T) tt.execute(new TransactionCallback<T>() {
            public T doInTransaction(TransactionStatus status) {
                try {
                    // 3) store the record in this new transaction
                    return store(t);
                } catch (ConstraintViolationException e) {
                    // another thread or process created this already, possibly
                    // between 1) and 2)
                    status.setRollbackOnly();
                    return null;
                }
            }
        });
        // 4) if we failed to create the record in the second transaction, found will
        // still be null; however, this can happen only if another process
        // created the record. let's see what they made for us!
        if (found == null)
            found = findUnique(t);
    } catch (...) {
        // handle exceptions
    }
    return found;
}
You need to use UPSERT or MERGE to achieve this goal.
However, Hibernate does not offer support for this construct, so you need to use jOOQ instead.
private PostDetailsRecord upsertPostDetails(
DSLContext sql, Long id, String owner, Timestamp timestamp) {
sql
.insertInto(POST_DETAILS)
.columns(POST_DETAILS.ID, POST_DETAILS.CREATED_BY, POST_DETAILS.CREATED_ON)
.values(id, owner, timestamp)
.onDuplicateKeyIgnore()
.execute();
return sql.selectFrom(POST_DETAILS)
.where(field(POST_DETAILS.ID).eq(id))
.fetchOne();
}
Calling this method on PostgreSQL:
PostDetailsRecord postDetailsRecord = upsertPostDetails(
sql,
1L,
"Alice",
Timestamp.from(LocalDateTime.now().toInstant(ZoneOffset.UTC))
);
Yields the following SQL statements:
INSERT INTO "post_details" ("id", "created_by", "created_on")
VALUES (1, 'Alice', CAST('2016-08-11 12:56:01.831' AS timestamp))
ON CONFLICT DO NOTHING;
SELECT "public"."post_details"."id",
"public"."post_details"."created_by",
"public"."post_details"."created_on",
"public"."post_details"."updated_by",
"public"."post_details"."updated_on"
FROM "public"."post_details"
WHERE "public"."post_details"."id" = 1
On Oracle and SQL Server, jOOQ will use MERGE while on MySQL it will use ON DUPLICATE KEY.
The concurrency mechanism is ensured by the row-level locking employed when inserting, updating, or deleting a record.
Code available on GitHub.
Two solutions come to mind:
That's what TABLE LOCKS are for
Hibernate does not support table locks, but this is a situation where they come in handy. Fortunately you can use native SQL through Session.createSQLQuery(). For example (on MySQL):
// no access to the table for any other clients
session.createSQLQuery("LOCK TABLES foo WRITE").executeUpdate();

// safe zone
Foo foo = (Foo) session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
    foo = new Foo();
    foo.setName(name);
    session.save(foo);
}

// releasing locks
session.createSQLQuery("UNLOCK TABLES").executeUpdate();
This way, when a session (client connection) gets the lock, all other connections are blocked until the operation ends and the locks are released. Read operations are also blocked for other connections, so needless to say, use this only for atomic operations.
What about Hibernate's locks?
Hibernate uses row-level locking. We cannot use it directly, because we cannot lock non-existent rows. But we can create a dummy table with a single record, map it to the ORM, then use SELECT ... FOR UPDATE style locks on that object to synchronize our clients. Basically we only need to be sure that no other clients (running the same software, with the same conventions) will do any conflicting operations while we are working.
// begin transaction
Transaction transaction = session.beginTransaction();

// blocks until any other client holds the lock
session.load("dummy", 1, LockOptions.UPGRADE);

// virtual safe zone
Foo foo = (Foo) session.createCriteria(Foo.class).add(eq("name", name)).uniqueResult();
if (foo == null) {
    foo = new Foo();
    foo.setName(name);
    session.save(foo);
}

// ends transaction (releasing locks)
transaction.commit();
Your database has to know the SELECT ... FOR UPDATE syntax (Hibernate is going to use it), and of course this only works if all your clients follow the same convention (they need to lock the same dummy entity).
The Hibernate documentation on transactions and exceptions states that all HibernateExceptions are unrecoverable and that the current transaction must be rolled back as soon as one is encountered. This explains why the code above does not work. Ultimately you should never catch a HibernateException without exiting the transaction and closing the session.
The only real way to accomplish this, it would seem, is to manage the closing of the old session and the reopening of a new one within the method itself. Implementing a findOrCreate method which can participate in an existing transaction and is safe within a distributed environment seems to be impossible with Hibernate, based on what I have found.
The solution is in fact really simple. First perform a select using your name value. If a result is found, return it. If not, create a new one. If the creation fails (with an exception), it is because another client added this very same value between your select and your insert statement. It is then logical that you get an exception. Catch it, roll back your transaction, and run the same code again. Because the row already exists, the select will find it and you'll return your object.
You can find an explanation of the optimistic and pessimistic locking strategies with Hibernate here: http://docs.jboss.org/hibernate/core/3.3/reference/en/html/transactions.html
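A rough sketch of that select-insert-retry loop (the method name is illustrative, and depending on your setup the violation may surface wrapped in another exception at commit time):

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;
import org.hibernate.criterion.Restrictions;
import org.hibernate.exception.ConstraintViolationException;

public Foo findOrCreateWithRetry(SessionFactory sessionFactory, String name) {
    while (true) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            Foo foo = (Foo) session.createCriteria(Foo.class)
                    .add(Restrictions.eq("name", name))
                    .uniqueResult();
            if (foo == null) {
                foo = new Foo();
                foo.setName(name);
                session.save(foo);
            }
            tx.commit();
            return foo;
        } catch (ConstraintViolationException e) {
            tx.rollback(); // another client won the race; loop and re-select
        } finally {
            session.close();
        }
    }
}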
a couple of people have mentioned different parts of the overall strategy. assuming that you generally expect to find an existing object more often than you create a new one:
1. search for the existing object by name. if found, return it
2. start a nested (separate) transaction
3. try to insert the new object
4. commit the nested transaction
5. catch any failure from the nested transaction; if it is anything but a constraint violation, re-throw
6. otherwise search for the existing object by name and return it
just to clarify, as pointed out in another answer, the "nested" transaction is actually a separate transaction (many databases don't even support true nested transactions).
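With Spring, steps 2-4 of that list can be expressed declaratively instead of via TransactionTemplate. A sketch (the method must live on a different bean than the caller so the proxy-based annotation takes effect):

// Runs in its own transaction, suspended from the caller's. A constraint
// violation here rolls back only this insert, not the outer transaction.
@Transactional(propagation = Propagation.REQUIRES_NEW)
public Foo tryInsert(String name) {
    Foo foo = new Foo();
    foo.setName(name);
    session.save(foo); // may throw; the caller catches and re-runs the lookup
    return foo;
}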
Well, here's one way to do it - but it's not appropriate for all situations.
In Foo, remove the unique = true attribute on name. Add a timestamp that gets set on every insert.
In findOrCreate(), don't bother checking if an entity with the given name already exists; just insert a new one every time.
When looking up Foo instances by name, there may be 0 or more with a given name, so you just select the newest one.
The nice thing about this method is that it doesn't require any locking, so everything should run pretty fast. The downside is that your database will be littered with obsolete records, so you may have to do something somewhere else to deal with them. Also, if other tables refer to Foo by its id, this will screw up those relations.
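Selecting the newest one could look like this sketch (it assumes a createdAt field set on every insert, which is not in the original mapping):

// Picks the most recently inserted Foo for a given name.
public Foo findNewestByName(String name) {
    return (Foo) session.createQuery(
            "from Foo f where f.name = :name order by f.createdAt desc")
        .setParameter("name", name)
        .setMaxResults(1)
        .uniqueResult();
}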
Maybe you should change your strategy:
first find the user with the name, and only if the user does not exist, create it.
I would try the following strategy:
A. Start a main transaction (at time 1)
B. Start a sub-transaction (at time 2)
Now, any object created after time 1 will not be visible in the main transaction. So when you do:
C. Create the new race-condition object, commit the sub-transaction
D. Handle the conflict by starting a new sub-transaction (at time 3) and getting the object from a query (the sub-transaction from point B is now out of scope)
only return the object's primary key, and then use EntityManager.getReference(..) to obtain the object you will be using in the main transaction. Alternatively, start the main transaction after D; it is not totally clear to me how many race conditions you will have within your main transaction, but the above should allow for n repetitions of B-C-D in a 'large' transaction.
Note that you might want to use multi-threading (one thread per CPU), and then you can probably reduce this issue considerably by using a shared static cache for these kinds of conflicts; point B can be kept 'optimistic', i.e. not doing a .find(..) first.
Edit: For a new transaction, you need an EJB interface method call annotated with transaction type REQUIRES_NEW.
Edit: Double-check that getReference(..) works as I think it does.
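For what the getReference step might look like, a sketch (createInNewTransaction is a hypothetical REQUIRES_NEW helper that returns the primary key obtained in the sub-transaction):

// Resolve the key from the sub-transaction into a managed reference in the
// main transaction; getReference returns a lazy proxy and does not hit the
// database until a field is accessed.
Long id = createInNewTransaction(name); // hypothetical helper, see steps C-D
Foo foo = entityManager.getReference(Foo.class, id);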