I am trying to tune one of my Java applications.
I am using a Java profiler and have collected some reports from it.
I saw that the number of page faults for the application ranges from 30,000 to 35,000.
How can I decide whether this number is too high or normal?
I get the same numbers during the first minute and after half an hour as well.
The machine has 2 GB of RAM and the application runs a single thread.
The thread only tries to read messages from a queue every 3 seconds, and the queue is empty.
Since no processing is being done, I think that page faults should not occur at all.
Please guide me here.
When you start your JVM, it reserves the maximum heap size as a contiguous block of virtual memory. However, this virtual memory is only turned into main memory as you access those pages, i.e. every time your heap grows by 4 KB, you get one page fault. You will also get page faults from thread stacks in the same manner.
Your 35K page faults suggest you are touching about 140 MB of heap.
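A back-of-the-envelope check of that estimate, assuming the common 4 KB page size:

```java
public class PageFaultEstimate {
    public static void main(String[] args) {
        // Assumes 4 KB pages; roughly one fault per newly-touched page.
        long pageFaults = 35_000;
        long pageSizeBytes = 4 * 1024;
        long touchedBytes = pageFaults * pageSizeBytes;
        System.out.println(touchedBytes / (1024 * 1024) + " MB"); // prints "136 MB"
    }
}
```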
BTW, you can buy 8 GB of RAM for £25. You might consider an upgrade.
What's your JVM? If it's HotSpot, you can use JVM options such as -XX:LargePageSizeInBytes or -XX:+UseMPSS to force the desired page size and reduce paging overhead. I think there should be similar options for other JVMs too.
Take a look at this:
http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
Related
At my company we are trying an approach with JVM-based microservices. They are designed to be scaled horizontally, so we run multiple instances of each using rather small containers (up to 2 GB heap, usually 1-1.5 GB). The JVM we use is 1.8.0_40-b25.
Each such instance typically handles up to 100 RPS with a max memory allocation rate of around 250 MB/s.
The question is: what kind of GC would be a safe, sensible default to start off with? So far we are using CMS with Xms = Xmx (to avoid pauses during heap resizing) and Xms = Xmx = 1.5 GB. Results are decent: we hardly ever see any major GC performed.
I know that G1 could give me smaller pauses (at the cost of total throughput) but AFAIK it requires a bit more "breathing" space and at least 3-4G heap to perform properly.
Any hints (besides going for Azul's Zing :D) ?
Hint # 1: Do experiments!
Assuming that your microservice is deployed on at least two nodes, run one on CMS and another on G1, and compare response times.
Not very likely, but what if you find that G1 performs so well that you need only half the original cluster size?
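To be sure each node of the experiment really is running the collector you intended, you can have the service print its active collectors at startup; a small sketch (with -XX:+UseConcMarkSweepGC you would typically see "ParNew" and "ConcurrentMarkSweep", with -XX:+UseG1GC "G1 Young Generation" and "G1 Old Generation"):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcCheck {
    public static void main(String[] args) {
        // Print each active collector with its counters so the node's
        // configuration can be confirmed (and pause time compared later).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```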
Side notes:
re: "250 MB/s" -> if all of this is stack memory (or, alternatively, young-gen allocation), then G1 would provide little benefit, since collection from these areas is cheap.
re: "100 RPS" -> in many cases on our production systems we found that reducing the number of concurrent requests in the system (either via proxy config or at the application-container level) improves throughput. Given the small heap, it's very likely that you have a small number of CPUs as well (2 to 4).
Additionally, there are official Oracle hints on tuning for a small memory footprint. They might not reflect the latest options available in 1.8.0_40, but they're a good read anyway.
Measure how much memory is retained after a full GC. Add to this the amount of memory allocated per second, multiplied by 2 to 10 depending on how often you would like a minor GC to occur, e.g. every 2 seconds or every 10 seconds.
E.g. say you have up to 500 MB retained after a full GC and GCing every couple of seconds is fine; then you can have 500 MB + 2 × 250 MB, i.e. a heap of around 1 GB.
The number of RPS is not important.
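The rule above as a worked sketch (the numbers are the example's, not measured values):

```java
public class HeapSizing {
    public static void main(String[] args) {
        long retainedMb = 500;           // live set after a full GC
        long allocRateMbPerSec = 250;    // measured allocation rate
        long secondsBetweenMinorGcs = 2; // how often a minor GC is acceptable
        long heapMb = retainedMb + secondsBetweenMinorGcs * allocRateMbPerSec;
        System.out.println("suggested heap: " + heapMb + " MB"); // prints "suggested heap: 1000 MB"
    }
}
```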
We have a legacy multithreaded Java process on RHEL 6.5 which is very time critical (low latency) and processes hundreds of thousands of messages a day. It runs on a powerful Linux machine with 40 CPUs. What we found is that the process has high latency while it processes the first 50k messages, averaging 10 ms/msg; after this 'warmup' period the latency starts to drop, becoming about 7 ms, then 5 ms, and eventually settling at about 3-4 ms/msg by day end.
This puzzles me, and one possibility I can think of is that maps are being resized at the beginning until they reach a very large capacity, after which they simply never exceed the load factor again. From what I can see, the maps are not initialized with an initial capacity, which is why I suspect this may be the case. I ran it through a profiler and pumped millions of messages in, hoping to see some 'resize' method from the Java collections, but I was unable to find any. It could be that I am searching for the wrong things or looking in the wrong direction. As a new joiner, with the previous team member gone, I am trying to see if there are other reasons I haven't thought of.
Another possibility I can think of is something related to kernel settings, but I am unsure what it could be.
I don't think it is a programming-logic issue, because it runs at acceptable speed after the first 30k-50k messages.
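If the map-resizing theory holds, pre-sizing the maps is a cheap way to test it; a sketch (the entry count is a guess for illustration, not taken from the code):

```java
import java.util.HashMap;
import java.util.Map;

public class PresizedMap {
    public static void main(String[] args) {
        // HashMap rehashes when size exceeds capacity * loadFactor (0.75 by
        // default), so for N expected entries allocate at least N / 0.75
        // buckets up front and no resize will ever happen.
        int expectedEntries = 50_000; // hypothetical, based on the warmup window
        Map<String, Object> byKey = new HashMap<>((int) (expectedEntries / 0.75f) + 1);
        byKey.put("example", new Object());
        System.out.println(byKey.size()); // prints "1"
    }
}
```

If warmup latency stays the same with pre-sized maps, resizing can be ruled out and attention can move to kernel/OS effects or JIT compilation.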
Any suggestion?
It sounds like it takes some time for the operating system to realize that your application is a big resource consumer. After a few seconds it sees a lot of activity involving your application's files, and only then does it respond to that activity, by populating its caches and so on.
I noticed today that my program slowly chews through memory. I checked with Java VisualVM to try to learn more; I am very new at this. I am coding in Java 8, with Swing taking care of the game.
Note that nothing is supposed to happen except rendering of objects.
No new instances or anything.
The Game Loop is looking something along the lines of
while (running)
try render all drawables at optimal fps
try update all entities at optimal rate
sleep if there is time over
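As actual Java, that loop might look something like this minimal sketch (the 60 fps target, method names, and the shutdown after one second are illustrative, not taken from the project); note that the loop itself allocates nothing per frame, so any steady growth would have to come from render or update:

```java
public class GameLoop {
    static volatile boolean running = true;

    public static void main(String[] args) throws InterruptedException {
        long frameNanos = 1_000_000_000L / 60; // target 60 fps
        int frames = 0;
        while (running) {
            long start = System.nanoTime();
            render();                                                // draw all drawables
            update();                                                // update all entities
            long leftMs = (frameNanos - (System.nanoTime() - start)) / 1_000_000;
            if (leftMs > 0) Thread.sleep(leftMs);                    // yield leftover frame time
            if (++frames >= 60) running = false;                     // stop after ~1 s for the demo
        }
    }

    static void render() { /* rendering stub */ }
    static void update() { /* update stub */ }
}
```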
From what I could find, the following happened during a 20-minute period.
Sleeping is working, it is yielding most of its time.
5 classes were loaded some time into the run time.
The game uses about 70 MB when first launched. (At this point everything is loaded as far as I can tell.)
15 MB of RAM were taken pretty rapidly after the initial 70 MB, followed by a slow increase. Now 100 MB is taken in total.
CPU usage seems sane, about 7% on my i5-2500K.
The Heap size has been increased once. Used heap has never exceeded 50%.
If I comment out everything in the game loop except for the while (running) {} part, I get fewer leaks, but they still occur.
Is this normal, or should I dig deeper? If I need to dig deeper, can someone point me towards what to look for?
Now, after 25 minutes, it is up to 102 MB of RAM, meaning the leaks are smaller and less frequent.
A reminder: I am not very good at this, and it's the first time I have tried to debug my project this way. Please bear that in mind.
Update
After roughly 40 minutes it settles at 101 to 102 MB of RAM usage; it hasn't exceeded that for 15 minutes now, just going up and down a bit.
The heap size is getting smaller and smaller. CPU usage is steady.
Short Answer: There is no leak.
Explanation
Using these questions as a reference.
Simple Class - Is it a Memory Leak?
Tracking down a memory leak / garbage-collection issue in Java
Creating a memory leak with Java
And this article.
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html
As I mentioned in my question, I am very new to this. What triggered me was that I checked Windows Task Manager and saw an increase in memory usage by my application. So I decided to dig deeper. I learned a couple of things.
I had vastly underestimated Java's garbage collection. It is actually hard to cause a memory leak even when that is your intention. There can be problems when threading is involved, however, which caught my attention as I am using threading in my project.
The tools that Windows provides are sub-par; I recommend using external tools. I used Java VisualVM. In this tool I found that there are loads of classes being loaded in the first 2 minutes of the game, and 5 more a bit further in. Some of the first ones created are String references that the JVM makes.
"Thread.sleep could be allocating objects under the covers." -- I found out it does; a total of 5, even. Which explains my initial "5 classes were loaded some time into the run time." What they do, I have no clue as of yet.
About 10-15 MB was from the profiling that I did. I wish I weren't such a rookie.
So again no leak that I could find.
I compiled a jar file, and I write to the log every half a minute to check the thread and memory situation.
Attached are the start-of-day log and the end-of-day log, taken after the software got stuck and stopped working.
In the middle of the day several automatic operations happened. I received quotes, about 40 per second, and finished handling each one before the next arrived.
Plus, every 4 seconds I write a map of info to the DB.
Any ideas why heap size in increasing?
(look at currHeapSize)
morning:
evening:
Any ideas why heap size in increasing?
These are classic symptoms of a Java memory leak. Somewhere in your application there is a data structure that is accumulating more and more objects and preventing them from being garbage collected.
The best way to find a problem like this is to use a memory profiler. The answers to this question explain how.
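If you'd rather start without attaching a full profiler, a heap dump is often enough to spot the accumulating structure; here's a sketch of triggering one programmatically (this uses a HotSpot-specific MXBean, and the file name is arbitrary) which you can then open in VisualVM or Eclipse MAT:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDump {
    public static void main(String[] args) throws Exception {
        // HotSpot-specific diagnostic bean; not portable to all JVMs.
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap("app-heap.hprof", true); // true = dump live objects only
    }
}
```

Take one dump in the morning and one in the evening; comparing object counts per class between the two usually points straight at the leaking structure.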
As the title says, in my module I have a BlockingQueue to deliver my data. The server can produce a large amount of logging information. To avoid affecting the server's performance, I wrote multi-threaded clients that consume this data and persist it in data caches. Because the data can be produced in huge volumes per minute, I am confused about what size to give the queue when I initialize it. I know I can set a queue policy so that, if more data is produced, the overflow is simply dropped. But what size should the queue be so that it holds as much of the data as possible?
Could you give me some suggestions? As far as I know, it is related to my server's JVM heap size and the size of a single logging record?
Make it "as large as is reasonable". For example, if you are OK with it consuming up to 1 GB of memory, then set its size to 1 GB divided by the average size in bytes of the objects in the queue.
If I had to pick a "reasonable" number, I would start with 10000. The reason is that if it grows larger than that, making it larger isn't a good idea and isn't going to help much, because clearly the logging requirement is outpacing your ability to log, so it's time to back off the clients.
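The two rules above combined as a sketch (both figures are assumptions: measure the real average entry size with a profiler, and pick a memory budget you can actually afford):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class QueueSizing {
    public static void main(String[] args) {
        long memoryBudgetBytes = 1L << 30; // ~1 GB reserved for the queue
        long avgEntryBytes = 1024;         // measured average size of one record
        // Memory-derived capacity, capped at the 10000 "reasonable" ceiling.
        int capacity = (int) Math.min(memoryBudgetBytes / avgEntryBytes, 10_000);
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        System.out.println(queue.remainingCapacity()); // prints "10000"
    }
}
```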
"Tuning" through experimentation is usually the best approach, as it depends on the profile of your application:
If there are highs and lows in your application's activity, then a larger queue will help "smooth out" the load on your server
If your application has a relatively steady load, then a smaller queue is appropriate, as a larger queue only delays the inevitable point at which clients are blocked; you would be better off making it smaller and dedicating more resources (a couple more logging threads) to consuming the work.
Note also that a very large queue may impact garbage collection responsiveness to freeing up memory, as it has to traverse a much larger heap (all the objects in the queue) each time it runs, increasing the load on both CPU and memory.
You want to make the size as small as you can without impacting throughput and responsiveness too much. To assess this you'll need to set up a test server and hit it with a typical load to see what happens. Note that you'll probably need to hit it from multiple machines to put a realistic load on the server, as hitting it from one machine can limit the load due to the number of CPU cores and other resources on the test client machine.
To be frank, I'd just make the size 10000 and tune the number of worker threads rather than the queue size.
Sequential writes to disk are reasonably fast (easily 20 MB per second). Instead of storing data in RAM, you might be better off writing it to disk without worrying about memory requirements. Your clients can then read the data from files instead of RAM.
To find out the size of a Java object, you can use any Java profiler. YourKit is my favorite.
I think the real problem is not the size of the queue but what you want to do when things exceed your planned capacity. ArrayBlockingQueue will simply block your threads, which may or may not be the right thing to do. Your options typically are:
1) Block the threads (use ArrayBlockingQueue), based on the memory committed for this purpose.
2) Return an error to the "layer above" and let that layer decide what to do; maybe send the error on to the client.
3) Throw away some data, e.g. whatever was enqueued longest ago.
4) Start writing to disk once you overflow RAM capacity.
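Option 3 as a sketch (the helper name is mine; as written it is only safe with a single producer, since the poll+offer pair would need synchronization with several):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class DropOldest {
    // When the queue is full, evict the oldest entry instead of blocking
    // the producer, so the freshest logging data is always retained.
    static <T> void offerDroppingOldest(BlockingQueue<T> queue, T item) {
        while (!queue.offer(item)) {
            queue.poll(); // discard the entry enqueued longest ago
        }
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> q = new ArrayBlockingQueue<>(2);
        for (int i = 1; i <= 3; i++) {
            offerDroppingOldest(q, i);
        }
        System.out.println(q); // prints "[2, 3]"
    }
}
```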