Out of heap space with Hibernate - what's the problem?

I'm running into the following (common) error after I added a new DB table, a Hibernate class, and other classes to access the Hibernate class:
java.lang.OutOfMemoryError: Java heap space
Here's the relevant code:
From .jsp:
<%
com.companyconnector.model.HomepageBean homepage = new com.companyconnector.model.HomepageBean();
%>
From HomepageBean:
public class HomepageBean {
    ...
    private ReviewBean review1;
    private ReviewBean review2;
    private ReviewBean review3;

    public HomepageBean() {
        ...
        GetSurveyResults gsr = new GetSurveyResults();
        List<ReviewBean> rbs = gsr.getRecentReviews();
        review1 = rbs.get(0);
        review2 = rbs.get(1);
        review3 = rbs.get(2);
    }
From GetSurveyResults:
public List<ReviewBean> getRecentReviews() {
    List<OpenResponse> ors = DatabaseBean.getRecentReviews();
    List<ReviewBean> rbs = new ArrayList<ReviewBean>();
    for (int x = 0; ors.size() > x; x =+ 2) {
        String employer = "";
        rbs.add(new ReviewBean(ors.get(x).getUid(), employer, ors.get(x).getResponse(), ors.get(x+1).getResponse()));
    }
    return rbs;
}
and lastly, from DatabaseBean:
public static List<OpenResponse> getRecentReviews() {
    SessionFactory session = HibernateUtil.getSessionFactory();
    Session sess = session.openSession();
    Transaction tx = sess.beginTransaction();
    List results = sess.createQuery(
        "from OpenResponse where (uid = 46) or (uid = 50) or (uid = 51)"
    ).list();
    tx.commit();
    sess.flush();
    sess.close();
    return results;
}
Sorry for all the code and such a long message, but I'm getting over a million instances of ReviewBean (I used JProfiler to find this). Am I doing something wrong in the for loop in GetSurveyResults? Any other problems?
I'm happy to provide more code if necessary.
Thanks for the help.
Joe

Using JProfiler to find which objects occupy the memory is a good first step. Now that you know that needlessly many instances are created, a logical next analysis step is to run your application in debug mode, and step through the code that allocates the ReviewBeans. If you do that, the bug should be obvious. (I am pretty sure I spotted it, but I'd rather teach you how to find such bugs on your own. It's a skill that is indispensable for any good programmer).

Also, you probably want to close the session and commit the transaction in a finally block, to make sure that always happens even if your method throws an exception. The standard pattern for working with resources in Java (simplified pseudocode):
Session s = null;
try {
    s = openSession();
    // do something useful
}
finally {
    if (s != null) s.close();
}
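Applied to the DatabaseBean method from the question, a sketch of the same pattern might look like this (same classes as above; note that flush() only has an effect before commit(), and close() belongs in finally so the session is released even on failure):

public static List<OpenResponse> getRecentReviews() {
    SessionFactory factory = HibernateUtil.getSessionFactory();
    Session sess = null;
    Transaction tx = null;
    try {
        sess = factory.openSession();
        tx = sess.beginTransaction();
        List<OpenResponse> results = sess.createQuery(
            "from OpenResponse where (uid = 46) or (uid = 50) or (uid = 51)"
        ).list();
        tx.commit(); // commit flushes pending changes; no separate flush() needed
        return results;
    } catch (RuntimeException e) {
        if (tx != null) tx.rollback(); // don't leave a half-done transaction behind
        throw e;
    } finally {
        if (sess != null) sess.close(); // always release the session
    }
}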

Related

Java Spark - java.lang.OutOfMemoryError: GC overhead limit exceeded - Large Dataset

We have a Spark SQL query that returns over 5 million rows. Collecting them all for processing results in java.lang.OutOfMemoryError: GC overhead limit exceeded (eventually). Here's the code:
final Dataset<Row> jdbcDF = sparkSession.read().format("jdbc")
        .option("url", "xxxx")
        .option("driver", "com.ibm.db2.jcc.DB2Driver")
        .option("query", sql)
        .option("user", "xxxx")
        .option("password", "xxxx")
        .load();
final Encoder<GdxClaim> gdxClaimEncoder = Encoders.bean(GdxClaim.class);
final Dataset<GdxClaim> gdxClaimDataset = jdbcDF.as(gdxClaimEncoder);
System.out.println("BEFORE PARALLELIZE");
final JavaRDD<GdxClaim> gdxClaimJavaRDD = javaSparkContext.parallelize(gdxClaimDataset.collectAsList());
System.out.println("AFTER");
final JavaRDD<ClaimResponse> gdxClaimResponse = gdxClaimJavaRDD.mapPartitions(mapFunc);
mapFunc = (FlatMapFunction<Iterator<GdxClaim>, ClaimResponse>) claim -> {
    System.out.println(":D " + claim.next().getRBAT_ID());
    if (claim != null && !currentRebateId.equals((claim.next().getRBAT_ID()))) {
        if (redisCommands == null || (claim.next().getRBAT_ID() == null)) {
            serObjList = Collections.emptyList();
        } else {
            generateYearQuarterKeys(claim.next());
            redisBillingKeys = redisBillingKeys.stream().collect(Collectors.toList());
            final String[] stringArray = redisBillingKeys.toArray(new String[redisBillingKeys.size()]);
            serObjList = redisCommands.mget(stringArray);
            serObjList = serObjList.stream().filter(clientObj -> clientObj.hasValue()).collect(Collectors.toList());
            deserializeClientData(serObjList);
            currentRebateId = (claim.next().getRBAT_ID());
        }
    }
    return (Iterator) racAssignmentService.assignRac(claim.next(), listClientRegArr);
};
You can ignore most of this; the line that runs forever and never returns is:
final JavaRDD<GdxClaim> gdxClaimJavaRDD = javaSparkContext.parallelize(gdxClaimDataset.collectAsList());
Because of:
gdxClaimDataset.collectAsList()
We are unsure where to go from here and are totally stuck. Can anyone help? We've looked everywhere for an example to help.
At a high level, collectAsList() is going to bring your entire dataset into memory, and this is what you need to avoid doing.
You may want to look at the Dataset docs in general. They explain its behavior, including the javaRDD() method, which is probably the way to avoid collectAsList().
Keep in mind: other "terminal" operations, that collect your dataset into memory, will cause the same problem. The key is to filter down to your small subset, whatever that is, either before or during the process of collection.
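For example, if only a manageable subset of claims is actually needed on the driver, filtering and limiting before collecting keeps memory bounded. A rough sketch; the predicate and the limit are made up for illustration (RBAT_ID comes from the question's code):

final Dataset<GdxClaim> smallSubset = gdxClaimDataset
        .filter(org.apache.spark.sql.functions.col("RBAT_ID").isNotNull()) // hypothetical filter
        .limit(10000);                                                     // cap what can reach the driver
final List<GdxClaim> bounded = smallSubset.collectAsList();                // now bounded by the limit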
Try replacing this line:
final JavaRDD<GdxClaim> gdxClaimJavaRDD = javaSparkContext.parallelize(gdxClaimDataset.collectAsList());
with:
final JavaRDD<GdxClaim> gdxClaimJavaRDD = gdxClaimDataset.javaRDD();
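The rest of the pipeline should then work unchanged: mapPartitions is defined on JavaRDD as well, so the existing gdxClaimJavaRDD.mapPartitions(mapFunc) call stays as it is, with the work distributed across the executors instead of first being collected onto the driver.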

How do you create a transaction in Neo4j 2.0?

How do you create a transaction in Neo4j 2.0? I tried dozens of ways and none of them worked.
Basically, the problem is that the second and subsequent transactions are never successful. Perhaps I don't begin transactions properly; I don't know. I tried all the combinations I could find in the Neo4j unit tests and in ExecutionEngine.
Here is how I create a transaction:
private def withTransaction[T](f: => T): T = {
  // FIXME: Sometimes it returns PlaceboTransaction which causes TONS of issues
  val tx = db.beginTx
  try {
    val result = f
    tx.success()
    result
  } catch {
    case e: Throwable =>
      // If I don't check this I'll get NullPointerException in TopLevelTransaction.markAsRollbackOnly()
      if (!tx.isInstanceOf[PlaceboTransaction])
        tx.failure()
      throw e
  } finally {
    // If I don't check this I'll get NullPointerException in TopLevelTransaction.markAsRollbackOnly()
    if (!tx.isInstanceOf[PlaceboTransaction])
      tx.close()
  }
}
It never works. Attempts to fetch any data or properties of a Node cause the following exception:
Exception in thread "main" org.neo4j.graphdb.NotInTransactionException
at org.neo4j.kernel.ThreadToStatementContextBridge.transaction(ThreadToStatementContextBridge.java:58)
at org.neo4j.kernel.ThreadToStatementContextBridge.statement(ThreadToStatementContextBridge.java:49)
at org.neo4j.kernel.impl.core.NodeProxy.hasLabel(NodeProxy.java:551)
at GraphDBManager$$anonfun$findUsers$1$$anonfun$apply$1.apply(GraphDBManager.scala:72)
at GraphDBManager$$anonfun$findUsers$1$$anonfun$apply$1.apply(GraphDBManager.scala:72)
at scala.collection.TraversableLike$WithFilter$$anonfun$map$2.apply(TraversableLike.scala:722)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$WithFilter.map(TraversableLike.scala:721)
at GraphDBManager$$anonfun$findUsers$1.apply(GraphDBManager.scala:72)
at GraphDBManager$$anonfun$findUsers$1.apply(GraphDBManager.scala:72)
at GraphDBManager$.withTransaction(GraphDBManager.scala:38)
at GraphDBManager$.findUsers(GraphDBManager.scala:71)
at Test$.main(Test.scala:12)
at Test.main(Test.scala)
I created a sample project here.
Any help is greatly appreciated. Thanks.
This was a client code bug; pull request to the project in question here: https://github.com/cppexpert/neo4j_2_bad_transactions/pull/1
Well... after hours of debugging I figured it out. I hope it's going to be fixed in the final release.
Here is what the problem function looks like:
def findUsers: List[ObjectId] = {
  val query = engine.execute(s"MATCH (n:$label) RETURN n")
  val it = query.columnAs[Node]("n")
  withTransaction {
    val lst = it.toList
    val ret = for (node <- lst; if node.hasLabel(label)) yield new ObjectId(node.getProperty("id").asInstanceOf[String])
    ret
  }
}
It turned out that ExecutionEngine.execute leaves a transaction open, which causes beginTx() in withTransaction to return a PlaceboTransaction instead of a real transaction object. On the other hand, I can't get rid of my transaction wrapper, because NodeProxy surprisingly obtains its transaction object differently:
Exception in thread "main" org.neo4j.graphdb.NotInTransactionException
at org.neo4j.kernel.ThreadToStatementContextBridge.transaction(ThreadToStatementContextBridge.java:58)
at org.neo4j.kernel.ThreadToStatementContextBridge.statement(ThreadToStatementContextBridge.java:49)
at org.neo4j.kernel.impl.core.NodeProxy.hasLabel(NodeProxy.java:551)
Here is where it comes from:
private KernelTransaction transaction()
{
    checkIfShutdown();
    KernelTransaction transaction = txManager.getKernelTransaction();
    if ( transaction == null )
    {
        throw new NotInTransactionException();
    }
    return transaction;
}
What the difference is between the transaction from getKernelTransaction and the object from the TLS map, I don't know.
Therefore, the fixed version of my function would be:
def findUsers: List[ObjectId] = {
  val query = engine.execute(s"MATCH (n:$label) RETURN n")
  val it = query.columnAs[Node]("n")
  val lst = it.toList
  query.close()
  withTransaction {
    val ret = for (node <- lst; if node.hasLabel(label)) yield new ObjectId(node.getProperty("id").asInstanceOf[String])
    ret
  }
}
Which, in my opinion, is not only ugly from a design perspective but also gives inconsistent data when I iterate through the nodes in the second transaction.

Discrepancy between memory usage obtained through getMemoryMXBean() and jvisualvm?

When trying to monitor my own program's memory usage through the following code
public static String heapMemUsage()
{
    long used = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    long max = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
    return ""+used+" of "+max+" ("+ used/(double)max*100.0 + "%)";
}
I got a slightly different result than what jvisualvm shows (17 588 616 in my program vs 18 639 640 in jvisualvm). I know it's not that big of a deal, but it did get me thinking.
Is there any explanation for this fact?
I'd like to use the coded version if possible, but if its results are skewed in some way, with jvisualvm being the more credible of the two, I'll have to stick with jvisualvm instead.
Thanks
VisualVM uses the same approach to get the required values; let's check MonitoredDataImpl.java:
MonitoredDataImpl(Jvm vm, JmxSupport jmxSupport, JvmMXBeans jmxModel) {
    //...
    MemoryUsage mem = jmxModel.getMemoryMXBean().getHeapMemoryUsage();
    MemoryPoolMXBean permBean = jmxSupport.getPermGenPool();
    //...
    genCapacity = new long[2];
    genUsed = new long[2];
    genMaxCapacity = new long[2];
    genCapacity[0] = mem.getCommitted();
    genUsed[0] = mem.getUsed();
    genMaxCapacity[0] = mem.getMax();
    if (permBean != null) {
        MemoryUsage perm = permBean.getUsage();
        genCapacity[1] = perm.getCommitted();
        genUsed[1] = perm.getUsed();
        genMaxCapacity[1] = perm.getMax();
    }
}
So it is safe to use your approach. Note also that heap usage changes from moment to moment, so two readings taken at slightly different times will rarely match exactly. If the discrepancy stays large, please post additional info (JVM version, etc.) so the issue can be traced.
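To see how much these numbers move between snapshots, here is a minimal sketch (plain JDK, no extra libraries; the allocation is only there to make the drift visible):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSnapshots {
    public static void main(String[] args) {
        MemoryUsage first = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        byte[] filler = new byte[1_000_000]; // any allocation shifts the next reading
        MemoryUsage second = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println("first:  " + first.getUsed());
        System.out.println("second: " + second.getUsed());
        System.out.println(filler.length); // keep 'filler' reachable
    }
}

Two readings taken milliseconds apart already differ, so a gap of the size you observed is not surprising.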

Java/MySQL chat performance issue with 100 users

I'm trying to develop a client-server chat application using Java servlets, MySQL (InnoDB engine), and the Jetty server. I tested the connection code with 100 simulated users hitting the server at once using JMeter, but I got 40 secs as the average time for all of them to get connected, with a min time per thread of 2 secs and a max of 80 secs. My connection table has two columns: connect(user, stranger). My servlet code is shown below. I'm using the InnoDB engine for row-level locking, and an explicit write lock (SELECT ... FOR UPDATE) inside the transaction. If the transaction rolls back due to a deadlock, I loop it until it executes at least once. Once two users get connected, each one's stranger column is updated with the other's randomly generated unique number.
I'm using c3p0 connection pooling with a minimum of 100 connections open, and Jetty with a minimum of 100 threads.
Please help me identify the bottlenecks, or the tools needed to find them.
import java.io.*;
import java.util.*;
import java.sql.*;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import javax.naming.*;
import javax.sql.*;

public class connect extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws java.io.IOException {
        String unumber = null;
        String snumber = null;
        String status = null;
        InitialContext contxt1 = null;
        DataSource ds1 = null;
        Connection conxn1 = null;
        PreparedStatement stmt1 = null;
        ResultSet rs1 = null;
        PreparedStatement stmt2 = null;
        InitialContext contxt3 = null;
        DataSource ds3 = null;
        Connection conxn3 = null;
        PreparedStatement stmt3 = null;
        ResultSet rs3 = null;
        PreparedStatement stmt4 = null;
        ResultSet rs4 = null;
        PreparedStatement stmt5 = null;
        boolean checktransaction = true;
        unumber = req.getParameter("number"); // GET THE USER's UNIQUE NUMBER
        try {
            contxt1 = new InitialContext();
            ds1 = (DataSource) contxt1.lookup("java:comp/env/jdbc/user");
            conxn1 = ds1.getConnection();
            stmt1 = conxn1.prepareStatement("SELECT * FROM profiles WHERE number=?"); // GETTING USER DATA FROM PROFILE
            stmt1.setString(1, unumber);
            rs1 = stmt1.executeQuery();
            if (rs1.next()) {
                res.getWriter().println("user found in PROFILE table.........");
                uage = rs1.getString("age");
                usex = rs1.getString("sex");
                ulocation = rs1.getString("location");
                uaslmode = rs1.getString("aslmode");
                stmt1.close();
                stmt1 = null;
                conxn1.close();
                conxn1 = null;
                contxt3 = new InitialContext();
                ds3 = (DataSource) contxt3.lookup("java:comp/env/jdbc/chat");
                conxn3 = ds3.getConnection();
                conxn3.setAutoCommit(false);
                while (checktransaction) {
                    // TRANSACTION STARTS HERE
                    try {
                        stmt2 = conxn3.prepareStatement("INSERT INTO "+ulocation+" (user,stranger) VALUES (?,'')"); // INSERTING RECORD INTO LOCAL CHAT TABLE
                        stmt2.setString(1, unumber);
                        stmt2.executeUpdate();
                        stmt2.close();
                        stmt2 = null;
                        res.getWriter().println("inserting row into LOCAL CHAT TABLE.........");
                        System.out.println("transaction starting........."+unumber);
                        stmt3 = conxn3.prepareStatement("SELECT user FROM "+ulocation+" WHERE (stranger='' && user!=?) LIMIT 1 FOR UPDATE");
                        stmt3.setString(1, unumber); // SEARCHING FOR STRANGER
                        rs3 = stmt3.executeQuery();
                        if (rs3.next()) { // stranger found
                            stmt4 = conxn3.prepareStatement("SELECT stranger FROM "+ulocation+" WHERE user=?");
                            stmt4.setString(1, unumber); // CHECKING FOR USER STATUS BEFORE CONNECTING TO STRANGER
                            rs4 = stmt4.executeQuery();
                            if (rs4.next()) {
                                status = rs4.getString("stranger");
                            }
                            stmt4.close();
                            stmt4 = null;
                            if (status.equals("")) { // user status is also null
                                snumber = rs3.getString("user");
                                stmt5 = conxn3.prepareStatement("UPDATE "+ulocation+" SET stranger=? WHERE user=?"); // CONNECTING USER AND STRANGER
                                stmt5.setString(1, snumber);
                                stmt5.setString(2, unumber);
                                stmt5.executeUpdate();
                                stmt5.setString(2, snumber);
                                stmt5.setString(1, unumber);
                                stmt5.executeUpdate();
                                stmt5.close();
                                stmt5 = null;
                            }
                        } // end of stranger found
                        stmt3.close();
                        stmt3 = null;
                        conxn3.commit(); // TRANSACTION ENDING
                        checktransaction = false;
                    } // END OF TRY INSIDE WHILE
                    catch (java.sql.SQLTransactionRollbackException e) {
                        System.out.println("transaction restarted......."+unumber);
                        counttransaction = counttransaction + 1;
                    }
                } // END OF WHILE LOOP
                conxn3.close();
                conxn3 = null;
            } // END OF USER FOUND IN PROFILE TABLE
        } // end of try
        catch (java.sql.SQLException sqlexe) {
            try { conxn3.rollback(); }
            catch (java.sql.SQLException exe) { conxn3 = null; }
            sqlexe.printStackTrace();
            res.getWriter().println("UNABE TO GET CONNECTION FROM POOL!");
        }
        catch (javax.naming.NamingException namexe) {
            namexe.printStackTrace();
            res.getWriter().println("DATA SOURCE LOOK UP FAILED!");
        }
    }
}
How many users do you have? Can you load them all into memory first and do a memory lookup?
If you separate your DB layer from your presentation layer, this is something you can change without changing the servlet (as it shouldn't care where the data comes from).
If you use Java memory, it shouldn't take more than 20 ms per user.
Here is a test which creates one million profiles in memory, looks them up and creates chat entries, which is removed later. The average time per operation was 640 ns (nano-seconds, or billionths of a second)
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
    public static void main(String... args) {
        UserDB userDB = new UserDB();
        // add 1,000,000 users
        for (int i = 0; i < 1000000; i++)
            userDB.addUser(
                new Profile(i,
                    "user" + i,
                    (short) (18 + i % 90),
                    i % 2 == 0 ? Profile.Sex.Male : Profile.Sex.Female,
                    "here", "mode"));
        // look up users and add a chat session.
        long start = System.nanoTime();
        int operations = 0;
        for (int i = 0; i < userDB.profileCount(); i += 2) {
            Profile p0 = userDB.getProfileByNumber(i);
            operations++;
            Profile p1 = userDB.getProfileByNumber(i + 1);
            operations++;
            userDB.chatsTo(i, i + 1);
            operations++;
        }
        for (int i = 0; i < userDB.profileCount(); i += 2) {
            userDB.endChat(i);
            operations++;
        }
        long time = System.nanoTime() - start;
        System.out.printf("Average lookup and update time per operation was %d ns%n", time / operations);
    }
}

class UserDB {
    private final Map<Long, Profile> profileMap = new LinkedHashMap<Long, Profile>();
    private final Map<Long, Long> chatsWith = new LinkedHashMap<Long, Long>();

    public void addUser(Profile profile) {
        profileMap.put(profile.number, profile);
    }

    public Profile getProfileByNumber(long number) {
        return profileMap.get(number);
    }

    public void chatsTo(long number1, long number2) {
        chatsWith.put(number1, number2);
        chatsWith.put(number2, number1);
    }

    public void endChat(long number) {
        Long other = chatsWith.get(number);
        if (other == null) return;
        Long number2 = chatsWith.get(other);
        if (number2 != null && number2 == number)
            chatsWith.remove(other);
    }

    public int profileCount() {
        return profileMap.size();
    }
}

class Profile {
    final long number;
    final String name;
    final short age;
    final Sex sex;
    final String location;
    final String aslmode;

    Profile(long number, String name, short age, Sex sex, String location, String aslmode) {
        this.number = number;
        this.name = name;
        this.age = age;
        this.sex = sex;
        this.location = location;
        this.aslmode = aslmode;
    }

    enum Sex {Male, Female}
}
prints
Average lookup and update time per operation was 636 ns
If you need this to be faster, you could look at using Trove4j, which could be twice as fast in this case. Given this is likely to be fast enough, I would try to keep things simple.
Have you considered caching reads and batching writes?
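On the write side, JDBC statement batching alone can cut round trips considerably. A minimal sketch of the idea (Java 7+ try-with-resources; the chat_sessions table and the map of pending pairings are made-up names, not from the question):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class BatchWriter {
    // Connect each user to their stranger in one round trip instead of one UPDATE each.
    public static void connectPairs(Connection conn, Map<String, String> userToStranger)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                "UPDATE chat_sessions SET stranger = ? WHERE user = ?")) {
            for (Map.Entry<String, String> e : userToStranger.entrySet()) {
                ps.setString(1, e.getValue()); // the stranger's number
                ps.setString(2, e.getKey());   // the user's number
                ps.addBatch();
            }
            ps.executeBatch(); // one network round trip for the whole batch
        }
    }
}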
I'm not sure how you can realistically expect anyone to determine where the bottlenecks are by merely looking at the source code.
To find the bottlenecks, you should run your app and the load test with a profiler attached, such as JVisualVM, YourKit, or JProfiler. This will tell you exactly how much time is spent in each area of the code.
The only thing that anyone can really critique from looking at your code is the basic architecture:
Why are you looking up the DataSource on each doGet()?
Why are you using transactions for what appears to be unrelated database insertions and queries?
Is using a RDBMS to back a chat system really the best idea in the first place?
If your response times are that high, you need to properly index your DB tables. Based on the times you provided, I will assume this was not done. You need to speed up your reads and writes.
Look up execution plans and how to read them. An execution plan will show you if and when indexes are being used by your queries, and whether you are performing seeks or scans on the tables. Using these, you can tweak your queries, indexes, and tables to be more optimal.
As others have stated, an RDBMS won't be your best option in large-scale applications, but since you are just starting out, it should be OK until you learn more.
Learn to properly set up those tables and you should see your deadlock counts and response times go down.
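As a concrete sketch of what such an index might look like: the servlet above repeatedly runs SELECT user FROM <location> WHERE (stranger='' && user!=?) LIMIT 1 FOR UPDATE, so a composite index on (stranger, user) is the kind of thing to try. The table name below is hypothetical, since the question builds table names from the user's location:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class AddChatIndex {
    public static void addIndex(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            // Lets the stranger='' lookup seek instead of scanning the whole table,
            // which also narrows the rows the FOR UPDATE lock has to touch.
            st.execute("CREATE INDEX idx_stranger_user ON chat_mumbai (stranger, user)");
        }
    }
}

With InnoDB, a FOR UPDATE scan that cannot use an index locks every row it examines, so a usable index should also reduce the deadlock-driven retries counted in the servlet's while loop.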

Problem in determining whether the second-level cache is working in Hibernate

I am trying to use ehcache in my project. I have specified the following properties in the Hibernate config file:
config.setProperty("hibernate.cache.provider_class","org.hibernate.cache.EhCacheProvider");
config.setProperty("hibernate.cache.provider_configuration_file_resource_path","ehcache.xml");
config.setProperty("hibernate.cache.use_second_level_cache","true");
config.setProperty("hibernate.cache.use_query_cache","true");
Now I am still not sure whether the results are coming from the DB or the cache.
I looked around and found Hibernate second level cache - print result, where the person suggests the HitCount/MissCount APIs.
However, when I tried using them, the hit count and miss count always come back 0. Here's my code:
String rName = "org.hibernate.cache.UpdateTimestampsCache";
Statistics stat = HibernateHelper.getInstance().getFactory().getStatistics();
long oldMissCount = stat.getSecondLevelCacheStatistics(rName).getMissCount();
long oldHitCount = stat.getSecondLevelCacheStatistics(rName).getHitCount();
UserDAO user = new UserDAO();
user.read(new Long(1));
long newMissCount = stat.getSecondLevelCacheStatistics(rName).getMissCount();
long newHitCount = stat.getSecondLevelCacheStatistics(rName).getHitCount();
if (oldHitCount + 1 == newHitCount && oldMissCount + 1 == newMissCount) {
    System.out.println("came from DB");
} else if (oldHitCount + 1 == newHitCount && oldMissCount == newMissCount) {
    System.out.println("came from cache");
}
Please let me know if I am using it wrong, and what the rName (region name) should be in this case.
Is there any other way of determining whether the second-level cache is working?
Thanks
You need to enable statistics collection:
config.setProperty("hibernate.generate_statistics", "true");
