Sorry in advance for the long post. I'm working with a Java web application which uses Spring (2.0, I know...) and JPA with the Hibernate implementation (using Hibernate 4.1 and hibernate-jpa-2.0.jar). I'm having problems retrieving the value of a column from a DB table (MySQL 5) after I update it. This is my situation (simplified, but that's the core of it):
Table KcUser:
Id:Long (primary key)
Name:String
.
.
.
Contract_Id: Long (foreign key, references KcContract.Id)
Table KcContract:
Id: Long (primary Key)
ColA
.
.
ColX
In my server I have something like this:
MyController {
myService.doSomething();
}
MyService {
private EntityManager myEntityManager;
@Transactional(readOnly=true)
public void doSomething() {
List<Long> IDs = firstFetch(); // retrieves some users IDs querying the KcContract table
doUpdate(IDs); // updates a column on KcUser rows that matches the IDs retrieved by the previous query
secondFetch(IDs); // finally retrieves KcUser rows <-- here the returned rows contain the old value and not the new one I updated in the previous method
}
@Transactional(readOnly=true)
private List<Long> firstFetch() {
List<Long> userIDs = myEntityManager.createQuery("select c.id from KcContract c" ).getResultList(); // this is not the actual query, there are some conditions in the where clause but you get the idea
return userIDs;
}
@Transactional(readOnly=false, propagation=Propagation.REQUIRES_NEW)
private void doUpdate(List<Long> IDs) {
Query hql = myEntityManager.createQuery("update KcUser t set t.name='newValue' WHERE t.contract.id IN (:list)").setParameter("list", IDs);
int howMany = hql.executeUpdate();
System.out.println("HOW MANY: "+howMany); // howMany is correct, with the number of updated rows in DB
Query select = myEntityManager.createQuery("select t from KcUser t WHERE t.contract.id IN (:list)" ).setParameter("list", IDs);
List<KcUser> users = select.getResultList();
System.out.println("users: "+users.get(0).getName()); //correct, newValue!
}
private void secondFetch(List<Long> IDs) {
List<KcUser> users = myEntityManager.createQuery("from KcUser t WHERE t.contract.id IN (:list)").setParameter("list", IDs).getResultList();
for(KcUser u : users) {
myEntityManager.refresh(u);
String name = u.getName(); // still oldValue!
}
}
}
The strange thing is that if I comment out the call to the first method (firstFetch()) and call the other two methods with a constant list of IDs, I get the correct new KcUser.name value in the secondFetch() method.
I'm not very experienced with JPA and Hibernate, but I thought it might be a cache problem, so I've tried:
using myEntityManager.flush() after the update
clearing the cache with myEntityManager.clear() and myEntityManager.getEntityManagerFactory().evictAll();
clearing the cache with hibernate Session.clear()
using myEntityManager.refresh on KcUser entities
using native queries (myEntityManager.createNativeQuery("")), which to my understanding should not involve any cache
None of that worked, and I always got the old KcUser.name value back in the secondFetch() method.
The only things that worked so far are:
making the firstFetch() method public and moving its call outside of myService.doSomething(), so doing something like this in MyController:
List<Long> IDs = myService.firstFetch();
myService.doSomething(IDs);
using a new EntityManager in secondFetch(), so doing something like this:
EntityManager newEntityManager = myEntityManager.getEntityManagerFactory().createEntityManager();
and using it to execute the subsequent query to fetch users from DB
Using either of the last two methods, the second select works fine and I get users with the updated value in the "name" column.
But I'd like to know what's actually happening and why none of the other things worked: if it's actually a cache problem, a simple .clear() or .refresh() should have worked, I think. Or maybe I'm totally wrong and it's not related to the cache at all, but then I'm a bit lost as to what might actually be happening.
I fear there might be something wrong in the way we are using hibernate / jpa which might bite us in the future.
Any idea please? Tell me if you need more details and thanks for your help.
Actions are performed in following order:
Read-only transaction A opens.
First fetch (transaction A)
Not-read-only transaction B opens
Update (transaction B)
Transaction B closes
Second fetch (transaction A)
Transaction A closes
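Transaction A is read-only and stays open for the whole of doSomething(), so its reads are served from a snapshot. All subsequent queries in that transaction see only changes that were committed before the snapshot was taken; your update was committed after it. Assuming InnoDB with its default REPEATABLE READ isolation level, the snapshot is taken at the first read in the transaction, which would explain why skipping firstFetch(), or running it outside doSomething(), makes the second fetch return the new value.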
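The visibility rule can be pictured with a toy model. This is only a sketch under a stated assumption: the database takes a transaction's snapshot at its first read (as InnoDB does under REPEATABLE READ). The classes SnapshotDemo, Database and Transaction are hypothetical and have nothing to do with the Spring or Hibernate APIs.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not MySQL, not Hibernate): a transaction reads from a snapshot taken at its first read. */
final class SnapshotDemo {

    /** Committed state shared by all transactions. */
    static final class Database {
        private final Map<String, String> committed = new HashMap<>();

        void commit(String key, String value) {
            committed.put(key, value);
        }

        Transaction begin() {
            return new Transaction(this);
        }
    }

    /** Assumption: the snapshot is created lazily, on the first read, and never refreshed. */
    static final class Transaction {
        private final Database db;
        private Map<String, String> snapshot; // null until the first read

        Transaction(Database db) {
            this.db = db;
        }

        String read(String key) {
            if (snapshot == null) {
                snapshot = new HashMap<>(db.committed);
            }
            return snapshot.get(key);
        }
    }

    /** Replays the question's sequence: optional first fetch, update in another transaction, second fetch. */
    static String secondFetch(boolean firstFetchRuns) {
        Database db = new Database();
        db.commit("name", "oldValue");

        Transaction a = db.begin();        // transaction A opens
        if (firstFetchRuns) {
            a.read("name");                // first fetch pins A's snapshot
        }
        db.commit("name", "newValue");     // transaction B updates and commits
        return a.read("name");             // second fetch, still inside A
    }

    public static void main(String[] args) {
        System.out.println("with first fetch:    " + secondFetch(true));
        System.out.println("without first fetch: " + secondFetch(false));
    }
}
```

In this model the second fetch returns the old value only when a read happened in transaction A before the update was committed, which matches the behaviour described in the question.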
I am using Hibernate implementation of JPA. Let's say I have a list of objects which I have to persist in a table called Event. All these objects have the same zip code.
public class Event {
String id;
String zipCode;
String locationCode;
String eventName;
String eventDesc;
}
Here id is the primary key and zipCode and locationCode together make a unique key (UK_zipCode_locationCode). The table might already have objects with the given zip code. So, instead of finding which ones should be added, deleted or updated, what I do is delete all the objects in the table with the given zip code first and then insert all the given objects.
// getEventsToAdd method returns the list of events to be added for the zipCode 1234
// getEventsFromTheDB method returns all the events in the db with the zipCode 1234
List<Event> eventsToAdd = getEventsToAdd("1234");
List<Event> oldEvents = getEventsFromTheDB("1234");
for (Event e : oldEvents) {
entityManager.remove(e);
}
for (Event e : eventsToAdd) {
entityManager.persist(e);
}
entityManager.flush();
// ...
This works when the oldEvents list is empty or when all objects in the oldEvents are also in eventsToAdd list (by this I mean the event objects with the same id and same zip code).
However, if some event objects in oldEvents have a different id, i.e., one that does not match the id of any object in the eventsToAdd list, then it throws an exception:
Duplicate Entry found for key UK_zipCode_locationCode
The error is as if the old events were not deleted from the table and now inserting the events with the same values of zipCode and locationCode is causing org.hibernate.exception.ConstraintViolationException.
However, if I call entityManager.flush() after deleting the old events, it works -
// This works!
for (Event e : oldEvents) {
entityManager.remove(e);
}
// flush after removing all the old events
entityManager.flush();
for (Event e : eventsToAdd) {
entityManager.persist(e);
}
So, why does flushing at the end not work, while flushing after removing the old entities does?
By default, the EntityManager executes all SQL commands at the point when the transaction is committed. However, it can decide the order in which it executes them, and in your case the inserts were executed before the deletes, which caused the ConstraintViolationException. flush() causes all pending SQL to be executed immediately, so the deletion happens before the insertion. The world is not perfect, and neither is Hibernate.
The entity manager does not necessarily issue delete and insert statements when you call remove and persist, it waits and generates the SQL later, typically when you flush explicitly or implicitly. That means the order of the statements will be different, so some inserts may be performed before some deletes, thus triggering the constraint violation. Your workaround with the intermediate flush is common practice in cases like this.
In your second (working) example, flushing after the removal makes Hibernate issue the DELETE statements right away (you'll see them in your logs), so the persistence context and the database are in sync before anything is persisted, and the subsequent inserts succeed. In the first example, no DELETE has been sent to the database when you call persist, and, as @Michal said, the insertions were issued before the deletions because the order is not guaranteed. That is what causes the duplicate entries.
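The ordering problem described in these answers can be sketched with a toy model. It is not Hibernate code: FlushOrderDemo is a hypothetical class that simply queues work and, on flush, runs all queued inserts before all queued deletes, which is the ordering the answers describe. The string keys stand in for the (zipCode, locationCode) unique key.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy persistence context: remove/persist only queue work; flush runs inserts first, then deletes. */
final class FlushOrderDemo {
    private final Set<String> uniqueKeys = new HashSet<>();        // stands in for the unique index
    private final List<String> pendingInserts = new ArrayList<>();
    private final List<String> pendingDeletes = new ArrayList<>();

    FlushOrderDemo(String... existingKeys) {
        uniqueKeys.addAll(List.of(existingKeys));
    }

    void remove(String key) {
        pendingDeletes.add(key);   // nothing touches the "table" yet
    }

    void persist(String key) {
        pendingInserts.add(key);
    }

    /** Assumed ordering: every queued insert is executed before any queued delete. */
    void flush() {
        try {
            for (String key : pendingInserts) {
                if (!uniqueKeys.add(key)) {
                    throw new IllegalStateException("Duplicate entry for key " + key);
                }
            }
            uniqueKeys.removeAll(pendingDeletes);
        } finally {
            pendingInserts.clear();
            pendingDeletes.clear();
        }
    }

    /** remove + persist with one flush at the end: the insert runs while the old row still exists. */
    static boolean singleFlushSucceeds() {
        FlushOrderDemo em = new FlushOrderDemo("1234/A");
        em.remove("1234/A");
        em.persist("1234/A");
        try {
            em.flush();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    /** Flushing between remove and persist forces the delete to run before the insert is queued. */
    static boolean intermediateFlushSucceeds() {
        FlushOrderDemo em = new FlushOrderDemo("1234/A");
        em.remove("1234/A");
        em.flush();
        em.persist("1234/A");
        try {
            em.flush();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("single flush succeeds:       " + singleFlushSucceeds());
        System.out.println("intermediate flush succeeds: " + intermediateFlushSucceeds());
    }
}
```

In the model, the single-flush variant hits the duplicate key because the old row is still present when the insert runs, while the variant with the intermediate flush goes through.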
I have a One to many mapping in JPA as follows:
In my blockchain class I have the @OneToMany annotation over ArrayList (which is the "chain" property) for which I have a Block class.
I have a method for replacing the chain of a blockchain instance with another when a new chain is broadcast or new block is broadcast on the wire (e.g. pubsub, triggered by async listener). The method replaces the chain if it is found to be valid and of sufficient length.
The chain is reflected in the database as a join table between blockchain and block. When a new chain comes in, it will be mostly identical. In other words, there will be at least 1 collision with a primary key, if only for the genesis block. More likely all but one or a few blocks at the tip will collide, so I want the ability to handle this without incident. Ideally JPA would have figured out how to do it without me with the following code, but that's not the case.
@Override
public boolean replaceChain(String name, ArrayList<Block> new_chain) throws NoSuchAlgorithmException, ChainTooShortException, GenesisBlockInvalidException, BlocksInChainInvalidException {
this.connect();
em.getTransaction().begin();
Query query = em.createQuery("select b from Blockchain b where b.instance_name = :name");
query.setParameter("name", name);
Blockchain blockchain = (Blockchain) query.getSingleResult();
blockchain.replace_chain(new_chain);
em.getTransaction().commit();
this.disconnect();
return true;
}
From there I tried every permutation and trick I could think of. I tried manually deleting each duplicate block from the block entity, but then it had a problem with the join table, and Stack Overflow said that JPA is apparently not set up to manage that manually. It's not the OOP way. I'm fine with that, but then my question is: what is the OOP way to simply overwrite a one-to-many relationship? The new incoming OneToMany should overwrite everything, displace everything else and that's it, but it tries to duplicate. I read other SO posts, but I either didn't understand them well enough or they didn't seem to help.
I'm running this through a DAO service, wired up to work through a pubnub listener callback. I have two servers and in fact codebases running- this Main "node on the network" that is dealing with the database (port 8080) and an "in memory" one on 9080 that starts with only the genesis block and if it gets a 200 GET request to 8080 will clone that and replace that chain. Replace chain works- just not to write to database. I call the second node on the network the PEER instance. It has the ability to mine blocks and when it does, it broadcasts to pubsub which triggers the main node. That's my setup. Everything seems to be working beautifully except the JPA part of it. I'm using Eclipselink and Tomcat.
From my understanding, when you start a transaction with entitymanager, it basically watches what you do and takes notes and records the results and does its magic, kind of like a scribe or observer. You set it up to watch, and then you do your business and then you tell it to commit, it still has limits and constraints to deal with or exceptions will be thrown but that's my understanding and that's the route I initially went.
Here is my error log for just this code above, not the trying to manually delete the blocks of a given chain. I could do that but I couldn't get to the join table and I know that's not the ideal way
Error Code: 1062
Call: INSERT INTO block (TIMESTAMP, DATA, DIFFICULTY, HASH, LASTHASH, NONCE) VALUES (?, ?, ?, ?, ?, ?)
bind => [1617166967254, [B@5ebbe21e, 9, 002057d17e0de9c5f97f6a0f3e7534c0599036ae307ece2ee3f645025c153f80, 007e833b320c58bcf29096e22ced52a5c90c915e23830eeae0a7093290af4080, 246]
Query: InsertObjectQuery(privblock.gerald.ryan.entity.Block@d6f817c0)
at org.eclipse.persistence.internal.jpa.transaction.EntityTransactionImpl.commit(EntityTransactionImpl.java:157)
at privblock.gerald.ryan.dao.BlockchainDao.replaceChain(BlockchainDao.java:97)
at privblock.gerald.ryan.service.BlockchainService.replaceChainService(BlockchainService.java:38)
at pubsub.PubNubSubCallback.message(PubNubSubCallback.java:132)
at com.pubnub.api.managers.ListenerManager.announce(ListenerManager.java:61)
at com.pubnub.api.workers.SubscribeMessageWorker.processIncomingPayload(SubscribeMessageWorker.java:228)
at com.pubnub.api.workers.SubscribeMessageWorker.takeMessage(SubscribeMessageWorker.java:83)
at com.pubnub.api.workers.SubscribeMessageWorker.run(SubscribeMessageWorker.java:74)
at java.base/java.lang.Thread.run(Thread.java:832)
Any help or insight is appreciated!
I figured it out. It seems to work with one extra annotation property orphanRemoval=true as below
@OneToMany(targetEntity = Block.class, cascade = CascadeType.PERSIST, orphanRemoval=true)
@JoinTable(name = "BlocksByChain")
List<Block> chain; // The chain itself
I knew it had to be simple and some feature that already existed. It was just not the framework default
EDIT: Not quite. Not perfectly. I still have to have code that flushes it out inside the DAO, and that is clunky and not optimal. Also I get a console output about a deadlock or something. I didn't notice that before as my application works as expected, but I know there has to be a better way.
This code also has to exist for it to work:
@Override
public boolean replaceChain(String name, ArrayList<Block> new_chain) throws NoSuchAlgorithmException,
ChainTooShortException, GenesisBlockInvalidException, BlocksInChainInvalidException {
this.connect();
em.getTransaction().begin();
Query query = em.createQuery("select b from Blockchain b where b.instance_name = :name");
query.setParameter("name", name);
Blockchain blockchain = (Blockchain) query.getSingleResult();
// blockchain.replace_chain(new_chain);
// em.merge(blockchain);
System.out.println("GOING TO REPLACE CHAIN AS SERVICE");
// THIS LONG BLOCK IS BECAUSE I COULDN'T FIND A MORE NATURAL WAY. I KEEP GETTING
// ERRORS.
// I JUST WANT TO OVERWRITE THE CHAIN OR DO A SMART MERGE
// INSTEAD IT TRIES TO APPEND. I HAVE TO WRITE AN EMPTY SET TO DB AND COMMIT IT
// AND THEN REPOPULATE IT. ALTERNATELY I COULD MAYBE DO A NATIVE QUERY AND
// TRUNCATE
// REGARDLESS IT DOESN'T SEEM TO SMARTLY MERGE THE TWO CHAINS
// -- IT SHOULD BE EASY WHEN THE NEW CHAIN IS AN EXTENSION, VS A FORK
// -- HANDLING THE "FORK" POTENTIAL OF BLOCKCHAIN ADDS TO THE COMPLEXITY IN
// WHICH CASE EASIEST TO TRUNCATE AND START FRESH
// Try Flush
if (blockchain.willReplace(new_chain)) {
blockchain.setChain(null);
em.getTransaction().commit();
em.getTransaction().begin();
// em.flush();
Query query2 = em.createQuery("select b from Blockchain b where b.instance_name = :name");
query2.setParameter("name", name);
Blockchain blockchain2 = (Blockchain) query2.getSingleResult();
blockchain2.setChain(new_chain);
em.getTransaction().commit();
this.disconnect();
return true;
}
em.getTransaction().commit();
this.disconnect();
return true;
}
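The "smart merge" wished for in the comments above could look roughly like the sketch below: keep the shared prefix of the two chains, drop a forked tail if there is one, and append only the new blocks, all by mutating the existing list rather than swapping in a new one. ChainSync and its methods are hypothetical names, plain strings stand in for Block objects, and the thread does not confirm whether EclipseLink would then issue only the minimal statements for the join table.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of an in-place chain replacement; elements are compared with equals(). */
final class ChainSync {

    /** Number of leading positions at which both lists hold equal elements. */
    static <T> int commonPrefixLength(List<T> a, List<T> b) {
        int limit = Math.min(a.size(), b.size());
        int i = 0;
        while (i < limit && a.get(i).equals(b.get(i))) {
            i++;
        }
        return i;
    }

    /** Mutates current so that it equals incoming, leaving the shared prefix untouched. */
    static <T> void syncInPlace(List<T> current, List<T> incoming) {
        int shared = commonPrefixLength(current, incoming);
        current.subList(shared, current.size()).clear();            // drop the forked tail, if any
        current.addAll(incoming.subList(shared, incoming.size()));  // append only the new blocks
    }

    public static void main(String[] args) {
        // Extension: the incoming chain only adds a block at the tip.
        List<String> chain = new ArrayList<>(List.of("genesis", "b1", "b2"));
        syncInPlace(chain, List.of("genesis", "b1", "b2", "b3"));
        System.out.println(chain);

        // Fork: the incoming chain diverges after b1.
        List<String> forked = new ArrayList<>(List.of("genesis", "b1", "x2"));
        syncInPlace(forked, List.of("genesis", "b1", "y2", "y3"));
        System.out.println(forked);
    }
}
```

With real entities, equals() on Block would have to compare something stable such as the hash for this comparison to be meaningful.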
I have a table summary which has a column status. The table already has data with status 1. First, I insert new records into the table with status 0. Then
I delete the old records with status 1 and finally update the records with status 0 to 1.
It works fine when the server load is low, but when the load increases, the old records (status 1) are not deleted, while the new records (status 0) are inserted and updated to 1.
The flow is as follows:
1.saveAndFlush new records with status 0.
2.deleteRecords();
3.updateRecords();
The query for delete is
@Transactional
@Modifying
@Query(value = "DELETE FROM TableDataSummary t where t.status=1")
public void deleteSummary();
And for update -
@Transactional
@Modifying
@Query(value = "Update TableDataSummary t set t.status=1")
public void updateSummary();
This happens randomly. How can I resolve this issue?
Thanks.
I faced something like that recently. I had to provide CRUD functionality to manipulate data from the database. I also used JPA, and for the update functionality I didn't use an update query directly in my repository.
If it may help you, I'll share my idea:
1. saveAndFlush new records with status 0.
2.deleteRecords();
3. do a findAll() from your Controller to get all the data (normally, you should only get data with status = 0 , because you have just deleted the others with status = 1.)
4. create a loop to get access to each data of the result of the "findAll()"
5. inside the loop, you have to set "status" to "1" and call "save()" for each object.
And "save()" will update the data automatically
I hope that helps; otherwise, maybe I misunderstood you.
I'm trying to update all my 4000 Objects in ProfileEntity but I am getting the following exception:
javax.persistence.QueryTimeoutException: The datastore operation timed out, or the data was temporarily unavailable.
this is my code:
public synchronized static void setX4all()
{
em = EMF.get().createEntityManager();
Query query = em.createQuery("SELECT p FROM ProfileEntity p");
List<ProfileEntity> usersList = query.getResultList();
int a,b,x;
for (ProfileEntity profileEntity : usersList)
{
a = profileEntity.getA();
b = profileEntity.getB();
x = func(a,b);
profileEntity.setX(x);
em.getTransaction().begin();
em.persist(profileEntity);
em.getTransaction().commit();
}
em.close();
}
I'm guessing that I take too long to query all of the records from ProfileEntity.
How should I do it?
I'm using Google App Engine so no UPDATE queries are possible.
Edited 18/10
In this 2 days I tried:
using Backends as Thanos Makris suggested but got to a dead end. You can see my question here.
reading DataNucleus suggestion on Map-Reduce but really got lost.
I'm looking for a different direction. Since I'm only going to do this update once, maybe I can update manually every 200 objects or so.
Is it possible to query for the first 200 objects, then the second 200 objects, and so on?
Given your scenario, I would advise running a native update query:
Query query = em.createNativeQuery("update ProfileEntity pe set pe.X = 'x'");
query.executeUpdate();
Please note: Here the query string is SQL i.e. update **table_name** set ....
This will work better.
Change the update process to use something like Map-Reduce. This means all is done in datastore. The only problem is that appengine-mapreduce is not fully released yet (though you can easily build the jar yourself and use it in your GAE app - many others have done so).
If you want to set x for all objects, it is better to use an update statement (i.e. native SQL) through the JPA entity manager instead of fetching all the objects and updating them one by one.
Maybe you should consider the Task Queue API, which enables you to execute tasks for up to 10 minutes. If you want to update so many entities that Task Queues do not fit, you could also consider the use of Backends.
Put the transaction outside of the loop:
em.getTransaction().begin();
for (ProfileEntity profileEntity : usersList) {
...
}
em.getTransaction().commit();
Your class does not behave very well: JPA is not suitable for bulk updates done this way. You are just starting a lot of transactions in rapid sequence and producing a lot of load on the database. A better solution for your use case would be a scalar query that sets all the objects without loading them into the JVM first (depending on your object structure and laziness, you would load much more data than you think).
See hibernate reference:
http://docs.jboss.org/hibernate/orm/3.3/reference/en/html/batch.html#batch-direct
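The question's idea of working through the entities 200 at a time comes down to computing a sequence of (offset, limit) windows. The sketch below only does that arithmetic; BatchWindows and Window are hypothetical names. In JPA the two values would typically be passed to Query.setFirstResult(offset) and Query.setMaxResults(limit), but whether that is efficient on the App Engine datastore is not something this thread confirms.

```java
import java.util.ArrayList;
import java.util.List;

/** Computes the paging windows needed to walk totalRows rows in batches of batchSize. */
final class BatchWindows {

    /** One page of work: skip offset rows, then read at most limit rows. */
    static final class Window {
        final int offset;
        final int limit;

        Window(int offset, int limit) {
            this.offset = offset;
            this.limit = limit;
        }

        @Override
        public String toString() {
            return "offset=" + offset + ", limit=" + limit;
        }
    }

    static List<Window> windows(int totalRows, int batchSize) {
        if (totalRows < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and batchSize must be > 0");
        }
        List<Window> result = new ArrayList<>();
        for (int offset = 0; offset < totalRows; offset += batchSize) {
            // The last window may be shorter than batchSize.
            result.add(new Window(offset, Math.min(batchSize, totalRows - offset)));
        }
        return result;
    }

    public static void main(String[] args) {
        // 4000 entities in batches of 200, as in the question.
        for (Window w : windows(4000, 200)) {
            System.out.println(w);
        }
    }
}
```

Each window would then get its own transaction, so no single datastore operation has to touch all 4000 entities.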
So, I'm getting a number of instances of a particular entity by id:
for(Integer songId:songGroup.getSongIds()) {
session = HibernateUtil.getSession();
Song song = (Song) session.get(Song.class,songId);
processSong(song);
}
This generates a SQL query for each id, so it occurred to me that I should do this in one, but I couldn't find a way to get multiple entities in one call except by running a query. So I wrote a query
return (List) session.createCriteria(Song.class)
.add(Restrictions.in("id",ids)).list();
But if I enable 2nd level caching, doesn't that mean that my old method would be able to return the objects from the 2nd level cache (if they had been requested before), whereas my query would always go to the database?
What is the correct way to do this?
What you're asking to do here is for Hibernate to do special case handling for your Criteria, which is kind of a lot to ask.
You'll have to do it yourself, but it's not hard. Using SessionFactory.getCache(), you can get a reference to the actual storage for cached objects. Do something like the following:
for (Long id : allRequiredIds) {
if (!sessionFactory.getCache().containsEntity(Song.class, id)) {
idsToQueryDatabaseFor.add(id);
} else {
songs.add(session.get(Song.class, id));
}
}
List<Song> fetchedSongs = session.createCriteria(Song.class).add(Restrictions.in("id", idsToQueryDatabaseFor)).list();
songs.addAll(fetchedSongs);
Then the Songs from the cache get retrieved from there, and the ones that are not get pulled with a single select.
If you know that the IDs exist, you can use load(..) to create a proxy without actually hitting the DB:
Return the persistent instance of the given entity class with the given identifier, obtaining the specified lock mode, assuming the instance exists.
List<Song> list = new ArrayList<>(ids.size());
for (Integer id : ids)
list.add(session.load(Song.class, id, LockOptions.NONE));
Once you access a non-identifier accessor, Hibernate will check the caches and fallback to DB if needed, using batch-fetching if configured.
If the ID doesn't exist, an ObjectNotFoundException will occur once the object is loaded. This might be somewhere in your code where you wouldn't really expect an exception, since all you're doing there is calling a simple accessor. So either be 100% sure the ID exists, or at least force the ObjectNotFoundException early, where you'd expect it, e.g. right after populating the list.
There is a difference between the Hibernate 2nd level cache and the Hibernate query cache.
The following link explains it really well: http://www.javalobby.org/java/forums/t48846.html
In a nutshell,
If you are using the same query many times with the same parameters then you can reduce database hits using a combination of both.
Another thing that you could do is to sort the list of ids, and identify subsequences of consecutive ids and then query each of those subsequences in a single query. For example, given List<Long> ids, do the following (assuming that you have a Pair class in Java):
List<Pair> pairs=new LinkedList<Pair>();
List<Object> results=new LinkedList<Object>();
Collections.sort(ids);
Iterator<Long> it=ids.iterator();
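Long previous=null;
Long sequence_start=null;
while (it.hasNext()){
Long next=it.next();
if (previous==null) {
sequence_start=next; // the first id opens the first run
} else if (next>previous+1) {
pairs.add(new Pair(sequence_start, previous)); // gap found: close the current run
sequence_start=next;
}
previous=next;
}
if (previous!=null) pairs.add(new Pair(sequence_start, previous)); // close the last run, if any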
for (Pair pair : pairs){
Query query=session.createQuery("from Person p where p.id>=:start_id and p.id<=:end_id");
query.setLong("start_id", pair.getStart());
query.setLong("end_id", pair.getEnd());
results.addAll((List<Object>)query.list());
}
Fetching each entity one by one in a loop can lead to N+1 query issues.
Therefore, it's much more efficient to fetch all entities at once and do the processing afterward.
Now, in your proposed solution, you were using the legacy Hibernate Criteria, but it has been deprecated since Hibernate 4 and will probably be removed in Hibernate 6, so it's better to use one of the following alternatives.
JPQL
You can use a JPQL query like the following one:
List<Song> songs = entityManager
.createQuery(
"select s " +
"from Song s " +
"where s.id in (:ids)", Song.class)
.setParameter("ids", songGroup.getSongIds())
.getResultList();
Criteria API
If you want to build the query dynamically, then you can use a Criteria API query:
CriteriaBuilder builder = entityManager.getCriteriaBuilder();
CriteriaQuery<Song> query = builder.createQuery(Song.class);
ParameterExpression<List> ids = builder.parameter(List.class);
Root<Song> root = query
.from(Song.class);
query
.where(
root.get("id").in(
ids
)
);
List<Song> songs = entityManager
.createQuery(query)
.setParameter(ids, songGroup.getSongIds())
.getResultList();
Hibernate-specific multiLoad
List<Song> songs = entityManager
.unwrap(Session.class)
.byMultipleIds(Song.class)
.multiLoad(songGroup.getSongIds());
Now, the JPQL and Criteria API queries can also benefit from the hibernate.query.in_clause_parameter_padding optimization, which makes the SQL statement caching mechanism more effective.
For more details about loading multiple entities by their identifier, check out this article.
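To make the padding optimization mentioned above more concrete: with padding enabled, the number of bind parameters in the IN clause is rounded up to the next power of two and the extra slots repeat the last value, so lists of 5, 6, 7 or 8 ids all produce the same SQL text. The sketch below reproduces that arithmetic only; InClausePadding is a hypothetical helper, not Hibernate's own code.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates power-of-two padding of IN-clause bind values (a model, not Hibernate's implementation). */
final class InClausePadding {

    /** Smallest power of two that is greater than or equal to n. */
    static int paddedSize(int n) {
        if (n <= 0 || n > (1 << 30)) {
            throw new IllegalArgumentException("n must be between 1 and 2^30");
        }
        int size = 1;
        while (size < n) {
            size <<= 1;
        }
        return size;
    }

    /** Returns a copy of values padded to a power-of-two length by repeating the last element. */
    static <T> List<T> pad(List<T> values) {
        List<T> padded = new ArrayList<>(values);
        if (values.isEmpty()) {
            return padded; // nothing to pad
        }
        int target = paddedSize(values.size());
        T last = values.get(values.size() - 1);
        while (padded.size() < target) {
            padded.add(last);
        }
        return padded;
    }

    public static void main(String[] args) {
        System.out.println(pad(List.of(1L, 2L, 3L)));
        System.out.println(pad(List.of(1L, 2L, 3L, 4L, 5L)));
    }
}
```

Because repeated values do not change the result of an IN predicate, the padded query returns the same rows while the database sees far fewer distinct statement shapes.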