how to slice bytebuffer WITHOUT creating garbage

I am trying to use ByteBuffer as an internal storage for a class. I want to abstract the flip() and ByteBuffer manipulation from the caller but also do not want to use slice() as it creates additional garbage.
Is there any alternative or design suggestions?

Assuming you're running on HotSpot, and as long as the lifetime of the slice is very short (e.g. it is used immediately in the method creating it or by its caller), escape analysis (EA) should be able to eliminate that allocation.
That is a JVM optimization, so there's no guarantee that it happens, but it generally is good enough to not worry about those things.
Also, young GCs are very efficient. The cost of such short-lived objects is very low, even if EA does not kick in.
Also, you should avoid premature optimizations. Worry about such things once you measured performance and figured out where the actual bottlenecks are.
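If measurements do show that slice() matters, one possible alternative is to keep the buffer private and read through absolute offsets, which creates no view object at all. The following is only a minimal sketch; the class RecordStore and its methods append, read and readInt are hypothetical names invented for illustration, not part of any library.

```java
import java.nio.ByteBuffer;

/** Hypothetical wrapper: keeps the ByteBuffer private and never calls slice() or flip(). */
final class RecordStore {
    private final ByteBuffer buffer;

    RecordStore(int capacity) {
        this.buffer = ByteBuffer.allocate(capacity);
    }

    /** Appends bytes with a relative put and returns the offset where they start. */
    int append(byte[] src) {
        int start = buffer.position();
        buffer.put(src);
        return start;
    }

    /** Copies a written region into dst with absolute gets; position and limit stay untouched. */
    void read(int offset, byte[] dst) {
        if (offset < 0 || offset + dst.length > buffer.position()) {
            throw new IndexOutOfBoundsException("region not written yet");
        }
        for (int i = 0; i < dst.length; i++) {
            dst[i] = buffer.get(offset + i);
        }
    }

    /** Reads a big-endian int at an absolute offset without creating a view. */
    int readInt(int offset) {
        return buffer.getInt(offset);
    }
}
```

Because absolute get and getInt do not move the position, the caller never needs flip(), and the write cursor stays valid between reads.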

Related

Do Orphaned object in java lead to performance Issues [duplicate]

Should Java Objects be reused as often as it can be reused ? Or should we reuse it only when they are "heavyweight", ie have OS resources associated with it ?
All old articles on the internet talk about object reuse and object pooling as much as possible, but I have read recent articles that say new Object() is highly optimized now ( 10 instructions ) and Object reuse is not as big a deal as it used to be.
What is the current best practice and how are you people doing it ?

I let the garbage collector do that kind of deciding for me, the only time I've hit heap limit with freshly allocated objects was after running a buggy recursive algorithm for a couple of seconds which generated 3 * 27 * 27... new objects as fast as it could.
Do what's best for readability and encapsulation. Sometimes reusing objects may be useful, but generally you shouldn't worry about it.

If you use them very intensively and the construction is costly, you should try to reuse them as much as you can.
If your objects are very small, and cheap to create ( like Object ) you should create new ones.
For instance, database connections are pooled because the cost of creating a new one is far higher than the cost of creating, say, a new Integer.
So the answer to your question is: reuse objects when they are heavy AND used often (it is not worth pooling a 3 MB object that is only used twice).
Edit:
Additionally, this item from Effective Java:Favor Immutability is worth reading and may apply to your situation.

Object creation is cheap, yes, but sometimes not cheap enough.
If you create a lot (and I mean A LOT) temporary objects in rapid succession, the costs for the garbage collector are considerable. However even with a good profiler you may not necessarily see the costs easily, as the garbage collector nowadays works in short intervals instead of blocking the whole application for a second or two.
Most of the performance improvements I got in my projects came from either avoiding object creation or avoiding the whole work (including the object creation) through aggressive caching. No matter how big or small the object is, it still takes time to create it and to manage the references and heap structures for it. (And of course, the cleanup and the internal heap-defrag/copying also takes time.)
I would not get religious about avoiding object creation at all costs, but if you see a sawtooth pattern in your memory profiler, it means your garbage collector is on heavy duty. And if your garbage collector is using the CPU, the CPU is not available for your application.
Regarding object pooling: Doing it right and not running into either memory leaks or invalid states or spending more time on the management than you would save is difficult. So I never used that strategy.
My strategy has been to simply strive for immutable objects. Immutable things can be cached easily and therefore help to keep the system simple.
However, no matter what you do: Make sure you check your hotspots with a profiler first. Premature optimization is the root of most evilness.
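To illustrate why immutable objects are easy to cache, here is a small sketch of an immutable value type with a shared instance cache. CurrencyCode and its factory method of are hypothetical names, and the unbounded map is an assumption that only suits a small, fixed set of values.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical immutable value type whose instances are shared through a cache. */
final class CurrencyCode {
    // Unbounded cache: an assumption that the set of distinct codes is small.
    private static final Map<String, CurrencyCode> CACHE = new ConcurrentHashMap<>();

    private final String code;

    private CurrencyCode(String code) {
        this.code = code;
    }

    /** Returns the cached instance for this code, creating it on first use. */
    static CurrencyCode of(String code) {
        return CACHE.computeIfAbsent(code, CurrencyCode::new);
    }

    String code() {
        return code;
    }
}
```

Since no caller can change a CurrencyCode after construction, handing the same instance to every thread is safe without any reset logic.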

Let the garbage collector do its job, it can be considered better than your code.
Unless a profiler proves it guilty. And don't even use common sense to try to figure out when it's wrong. In unusual cases even cheap objects like byte arrays are better pooled.
Rule 1 of optimization: don't do it.
Rule 2 (for experts only): don't do it yet.

The rule of thumb should be to use your common sense and reuse objects when their creation consumes significant resources such as I/O, network traffic, DB connections, etc...
If it's just creating a new String(), forget about the reuse, you'll gain nothing from it. Code readability has higher preference.

I would worry about performance issues if they arise. Do what makes sense first (would you do this with primitives?). If you then run a profiling tool and find that it is new that is causing you problems, start to think about pre-allocation (i.e. allocating when your program isn't doing much work).
Re-using objects sounds like a disaster waiting to happen by the way:
SomeClass someObject = new SomeClass();
someObject.doSomething();
someObject.changeState();
someObject.changeOtherState();
someObject.sendSignal();
// stuff
//re-use
someObject.reset(); // urgh, had to put this in to support reuse
someObject.doSomethingElse(); // oh oh, this is wrong after calling changeOtherState, regardless of reset
someObject.changeState(); // crap, now this is wrong but it's not obvious yet
someObject.doImportantStuff(); // what's going on?

Object creation is certainly faster than it used to be. The newer generational GCs in JDK 5 and higher are an improvement, too.
I don't think either of these makes excessive creation of objects cost-free, but they do reduce the importance of object pooling. I think pooling makes sense for database connections, but I don't attempt it for my own domain objects.
Reuse puts a premium on thread-safety. You need to think carefully to ensure that you can reuse objects safely.
If I decided that object reuse was important, I'd do it with products like Terracotta, Tangosol, GridGain, etc. and make sure that my server had scads of memory available to it.

Second the above comments.
Don't try to second-guess the GC and HotSpot. Object pooling may have been useful once, but these days it's not so useful unless you are talking about database connections or unique system resources.
Just try and write clean and simple code and be amazed at what Hotspot can do.
Why not use VisualVM or a profiler to take a look at your code?

Are there performance issues from using large numbers of objects in Java

I am currently working on a system where performance is an important consideration. It is going to be used for processing large quantities of data (some of the object types are in millions) with non-trivial algorithms (think about Integer Programming problems etc.). At the moment I have a working solution which creates all these data points as Objects.
Is there any performance increase to be gained, by treating them as arrays for example? Are there any best practices for working with large numbers of objects in Java (should it be avoided?).

I suggest you start by using a commercial CPU and memory profiler. This will give you a good idea of where your bottlenecks are.
Reducing garbage and making your memory more compact helps more once you have optimised the code to the point that your profilers cannot suggest anything.
You might like to consider which structures fit in your CPU caches better, as this can improve performance by 2-5x. E.g. your L3 cache might be 8 MB, and more than 5x faster than main memory. The more you can condense your working set to fit into it, the better.
BTW Your L1 cache is 32 KB and ~10x faster again.
This all assumes that the time to perform a GC doesn't bother you. If you create enough objects you can see multi-second, even multi-minute GC stop-the-world pauses.

Arrays or ArrayLists have similar performance although arrays are faster (up to 25% depending on what you do with them). Where you can find a significant performance gain is by avoiding boxed primitives for calculations, in which case the only solution is to use an array.
Apart from that, creating many short lived objects incurs little performance cost, apart from the fact that GC will run more often (but the cost of running minor GC depends on the number of reachable objects, not on unreachable ones).
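The difference between boxed and primitive storage can be sketched as follows. SumDemo and its two methods are hypothetical names; the sketch only shows where the per-element objects and unboxing occur, and makes no claim about the size of the speed-up.

```java
import java.util.List;

final class SumDemo {
    /** Sums a primitive array: no object per element, values sit contiguously in memory. */
    static double sumPrimitive(double[] values) {
        double total = 0.0;
        for (double v : values) {
            total += v;
        }
        return total;
    }

    /** Sums boxed values: every element is a separate Double object on the heap. */
    static double sumBoxed(List<Double> values) {
        double total = 0.0;
        for (Double v : values) {
            total += v; // implicit unboxing on each iteration
        }
        return total;
    }
}
```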

Premature optimization is evil. As Richard says in the comments, write your code, see if it's slow, then improve it. If you have suspicions, write an example to simulate high load. The time spent up front to determine this is worth it.
But as for your question...
Yes, creating objects is more expensive compared to creating primitives. It also occupies more heap space (memory.) Also if you are using objects for only a short time the garbage collector will have to run more often which will eat some CPU.
Again, only worry about this if you really need speed improvement.

Prototype key parts of your algorithms, test them in isolation, find the slowest, improve, repeat. Stay single-threaded for as long as possible, but always make a note of what can be done in parallel.
At the end your bottleneck may be either of below:
CPU because of the algorithm's computational complexity => try finding a better algorithm (or run on multiple CPUs in parallel if you are just slightly below the target; if you are far below, parallel processing won't help)
CPU because of excessive GC => profile memory, use low/zero-GC collections (trove4j etc.) or even arrays of primitive types, or even direct memory buffers from NIO, experiment
Memory - optimize data proximity (use chunked arrays matching cache sizes, etc).
Contentions on concurrent objects => revert to single threaded design, try lock-free synchronization primitives, etc.

ByteBuffer recycling class

I'm wondering how I'd code up a ByteBuffer recycling class that can get me a ByteBuffer which is at least as big as the specified length, and which can lock up ByteBuffer objects in use to prevent their use while they are being used by my code. This would prevent re-construction of DirectByteBuffers and such over and over, instead using existing ones. Is there an existing Java library which can do this very effectively? I know Javolution can work with object recycling, but does that extend to the ByteBuffer class in this context with the requirements set out?

It would be more to the point to be more conservative in your usage patterns in the first place. For example there is lots of code out there that shows allocation of a new ByteBuffer on every OP_READ. This is insane. You only need two ByteBuffers at most per connection, one for input and one for output, and depending on what you're doing you can get away with exactly one. In extremely simple cases like an echo server you can get away with one BB for the entire application.
I would look into that rather than paper over the cracks with yet another layer of software.
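As a rough illustration of reusing a single buffer instead of allocating one per read, the sketch below copies a channel to another through one ByteBuffer. EchoLoop and echo are hypothetical names, and the loop assumes blocking channels (a non-blocking read returning 0 would need selector handling that is not shown).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

final class EchoLoop {
    /** Copies everything from in to out through one reused buffer; returns bytes copied. */
    static long echo(ReadableByteChannel in, WritableByteChannel out, ByteBuffer buffer) {
        long total = 0;
        try {
            buffer.clear();
            while (in.read(buffer) != -1) {
                buffer.flip();                  // switch from filling to draining
                while (buffer.hasRemaining()) {
                    total += out.write(buffer);
                }
                buffer.clear();                 // same buffer is reused for the next read
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

The buffer is allocated once by the caller, for example one per connection, and only clear() and flip() are called between reads.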

This is just advice, not an answer. If you do implement some caching for DirectByteBuffer, then be sure to read about the GC implications, because the memory consumed by DirectByteBuffer is not tracked by the garbage collector.
Some references:
A thread - featuring Stack Overflow's tackline
A blog post on the same subject
And the followup

Typically, you would use a combination of ThreadLocal and a SoftReference wrapper: the former to simplify synchronization (essentially eliminating the need for it), and the latter to make the buffer reclaimable if there's not enough memory (keeping in mind the other comments regarding GC issues with direct buffers). It's actually quite simple: check whether the SoftReference holds a buffer of sufficient size; if not, allocate; if yes, clear the reference. Once you are done with the buffer, re-set the reference to point to it.
Another question is whether ByteBuffer is needed, compared to regular byte[]. Many developers assume ByteBuffers are better performance-wise, but that assumption is not usually backed by actual data (i.e. testing to see if there is performance difference, and to what direction). Reason why byte[] may often be faster is that code accessing it can be simpler, easier for HotSpot to efficiently JIT.
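The ThreadLocal plus SoftReference scheme described above could look roughly like this. BufferRecycler, acquire and release are hypothetical names; the sketch keeps at most one buffer per thread and uses heap buffers via ByteBuffer.allocate, which is an assumption (allocateDirect could be substituted, with the GC caveats mentioned in the other answers).

```java
import java.lang.ref.SoftReference;
import java.nio.ByteBuffer;

/** Hypothetical per-thread recycler holding at most one buffer per thread. */
final class BufferRecycler {
    private static final ThreadLocal<SoftReference<ByteBuffer>> SLOT = new ThreadLocal<>();

    /** Returns a cleared buffer of at least minCapacity bytes, reusing the cached one if it fits. */
    static ByteBuffer acquire(int minCapacity) {
        SoftReference<ByteBuffer> ref = SLOT.get();
        ByteBuffer cached = (ref == null) ? null : ref.get();
        if (cached != null && cached.capacity() >= minCapacity) {
            SLOT.remove();   // empty the slot so the buffer is not handed out twice
            cached.clear();
            return cached;
        }
        return ByteBuffer.allocate(minCapacity);
    }

    /** Hands the buffer back so the next acquire on this thread can reuse it. */
    static void release(ByteBuffer buffer) {
        SLOT.set(new SoftReference<>(buffer));
    }
}
```

Each release allocates a small SoftReference, so this trades a large buffer allocation for a tiny one rather than removing allocation entirely.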

Java: enough free heap to create an object?

I recently came across this in some code - basically someone trying to create a large object, coping when there's not enough heap to create it:
try {
// try to perform an operation using a huge in-memory array
byte[] massiveArray = new byte[BIG_NUMBER];
} catch (OutOfMemoryError oome) {
// perform the operation in some slower but less
// memory intensive way...
}
This doesn't seem right, since Sun themselves recommend that you shouldn't try to catch Error or its subclasses. We discussed it, and another idea that came up was explicitly checking for free heap:
if (Runtime.getRuntime().freeMemory() > SOME_MEMORY) {
// quick memory-intensive approach
} else {
// slower, less demanding approach
}
Again, this seems unsatisfactory - particularly in that picking a value for SOME_MEMORY is difficult to easily relate to the job in question: for some arbitrary large object, how can I estimate how much memory its instantiation might need?
Is there a better way of doing this? Is it even possible in Java, or is any idea of managing memory below the abstraction level of the language itself?
Edit 1: in the first example, it might actually be feasible to estimate the amount of memory a byte[] of a given length might occupy, but is there a more generic way that extends to arbitrary large objects?
Edit 2: as @erickson points out, there are ways to estimate the size of an object once it's created, but (ignoring a statistical approach based on previous object sizes) is there a way of doing so for yet-uncreated objects?
There also seems to be some debate as to whether it's reasonable to catch OutOfMemoryError - anyone know anything conclusive?

freeMemory isn't quite right. You'd also have to add maxMemory()-totalMemory(). e.g. assuming you start up the VM with max-memory=100M, the JVM may at the time of your method call only be using (from the OS) 50M. Of that, let's say 30M is actually in use by the JVM. That means you'll show 20M free (roughly, because we're only talking about the heap here), but if you try to make your larger object, it'll attempt to grab the other 50M its contract allows it to take from the OS before giving up and erroring. So you'd actually (theoretically) have 70M available.
To make this more complicated, the 30M it reports as in use in the above example includes stuff that may be eligible for garbage collection. So you may actually have more memory available, if it hits the ceiling it'll try to run a GC to free more memory.
You can try to get around this by manually triggering System.gc(), except that that's not such a good thing to do because
-it's not guaranteed to run immediately
-it will stop everything in its tracks while it runs
Your best bet (assuming you can't easily rewrite your algorithm to deal with smaller memory chunks, or write to a memory-mapped file, or something less memory intensive) might be to make a safe rough estimate of the memory needed and ensure that it's available before you run your function.
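The arithmetic from the first paragraph of this answer can be written down directly. HeapEstimate and availableBytes are hypothetical names; the result is only an upper bound before any collection, since memory counted as used may still be garbage. With the numbers above it gives 20M + (100M - 50M) = 70M.

```java
final class HeapEstimate {
    /** Bytes the heap could still hand out: free space now plus what can be claimed from the OS. */
    static long availableBytes() {
        Runtime rt = Runtime.getRuntime();
        long unclaimed = rt.maxMemory() - rt.totalMemory(); // not yet requested from the OS
        return rt.freeMemory() + unclaimed;
    }
}
```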

There are some kludges that you can use to estimate the size of an existing object; you could adapt some of these to predict the size of a yet-to-be created object.
However, in this case, I think it might be best to catch the Error. First of all, asking for the free memory doesn't account for what's available after garbage collection, which will be performed before raising an OOME. And, requesting a garbage collection with System.gc() isn't reliable. It's often explicitly disabled because it can wreck performance, and if it's not disabled… well, it can wreck performance when used unnecessarily.
It is impossible to recover from most errors. However, recoverability is up to the caller, not the callee. In this case, if you have a strategy to recover from an OutOfMemoryError, it is valid to catch it and fall back.
I guess that, in practice, it really comes down to the difference between the "slow" and "fast" way. If the "slow" method is fast enough, I'd stick with that, as it's safer and simpler. And, it seems to me, allowing it to be used as a fall back means that it is "fast enough." Don't let small optimizations derail the reliability of your application.

The "try to allocate and handle the error" approach is very dangerous.
What if you barely get your memory? A later OOM exception might occur because you brought things too close to the limits. Almost any library call will allocate memory at least briefly.
During your allocation a different thread may receive an OOM exception while trying to allocate a relatively small object. Even if your allocation is destined to fail.
The only viable approach is your second one, with the corrections noted in other answers. But you have to be sure and leave extra "slop space" in the heap when you decide to use your memory intensive approach.

I don't believe that there's a reasonable, generic approach to this that could safely be assumed to be 100% reliable. Even the Runtime.freeMemory approach is vulnerable to the fact that you may actually have enough memory after a garbage collection, but you wouldn't know that unless you force a gc. But then there's no foolproof way to force a GC either. :)
Having said that, I suspect that if you really did know approximately how much you needed, and did run a System.gc() beforehand, and you're running in a simple single-threaded app, you'd have a reasonably decent shot at getting it right with the .freeMemory call.
If any of those constraints fail, though, and you get the OOM error, you're back at square one, and therefore are probably no better off than just catching the Error subclass. While there are some risks associated with this (Sun's VM does not make a lot of guarantees about what happens after an OOM... there's some risk of internal state corruption), there are many apps for which just catching it and moving on with life will leave you with no serious harm.
A more interesting question in my mind, however, is why are there cases where you do have enough memory to do this and others where you don't? Perhaps some more analysis of the performance tradeoffs involved is the real answer?

Definitely catching error is the worst approach. Error happens when there is NOTHING you can do about it. Not even create a log, puff, like "... Houston, we lost the VM".
I didn't quite get the second reason. It was bad because it is hard to relate SOME_MEMORY to the operations? Could you rephrase it for me?
The only alternative I see, is to use the hard disk as the memory ( RAM/ROM as in the old days ) I guess that is what you're pointing in your "else slower, less demanding approach"
Every platform has its limits. Java supports as much RAM as your hardware is willing to give (well, actually as much as you allow by configuring the VM). In Sun's JVM implementation that can be done with the -Xmx option, for instance:
java -Xmx8g some.name.YourMemConsumingApp
Of course you may end up trying to perform an operation that takes 10 gb of RAM
If that's your case then you should definitely swap to disk.
Additionally, using the strategy pattern could make a nicer code. Although here it looks overkill:
if (isEnoughMemory(SOME_MEMORY)) {
strategy = new InMemoryStrategy();
} else {
strategy = new DiskStrategy();
}
strategy.performTheAction();
But it may help if the "else" involves a lot of code and looks bad. Furthermore if somehow you can use a third approach ( like using a cloud for processing ) you can add a third Strategy
...
strategy = new ImaginaryCloudComputingStrategy();
...
:P
EDIT
After getting the problem with the second approach: If there are some times when you don't know how much RAM is going to be consumed but you do know how much you have left, you could use a mixed approach ( RAM when you have enough, ROM[disk] when you don't )
Suppose this theoretical problem.
Suppose you receive a file from a stream and don't know how big it is.
Then you perform some operation on that stream ( encrypt it for instance ).
If you use RAM only it would be very fast, but if the file is large enough as to consume all your APP memory, then you have to perform some of the operation in memory and then swap to file and save temporary data there.
The VM will GC when running out of memory, you get more memory and then you perform the other chunk. And this repeat until you have the big stream processed.
while( !isDone() ) {
if (isMemoryLow()) {
//Runtime.getRuntime().freeMemory() < SOME_MEMORY + some other validations
swapToDisk(); // and make sure resources are GC'able
}
byte [] array = new byte[PREDEFINED_BUFFER_SIZE];
process( array );
process( array );
}
cleanUp();

Why do you not explicitly call finalize() or start the garbage collector?

After reading this question, I was reminded of when I was taught Java and told never to call finalize() or run the garbage collector because "it's a big black box that you never need to worry about". Can someone boil the reasoning for this down to a few sentences? I'm sure I could read a technical report from Sun on this matter, but I think a nice, short, simple answer would satisfy my curiosity.

The short answer: Java garbage collection is a very finely tuned tool. System.gc() is a sledge-hammer.
Java's heap is divided into different generations, each of which is collected using a different strategy. If you attach a profiler to a healthy app, you'll see that it very rarely has to run the most expensive kinds of collections because most objects are caught by the faster copying collector in the young generation.
Calling System.gc() directly, while technically not guaranteed to do anything, in practice will trigger an expensive, stop-the-world full heap collection. This is almost always the wrong thing to do. You think you're saving resources, but you're actually wasting them for no good reason, forcing Java to recheck all your live objects “just in case”.
If you are having problems with GC pauses during critical moments, you're better off configuring the JVM to use the concurrent mark/sweep collector, which was designed specifically to minimise time spent paused, than trying to take a sledgehammer to the problem and just breaking it further.
The Sun document you were thinking of is here: Java SE 6 HotSpot™ Virtual Machine Garbage Collection Tuning
(Another thing you might not know: implementing a finalize() method on your object makes garbage collection slower. Firstly, it will take two GC runs to collect the object: one to run finalize() and the next to ensure that the object wasn't resurrected during finalization. Secondly, objects with finalize() methods have to be treated as special cases by the GC because they have to be collected individually, they can't just be thrown away in bulk.)

Don't bother with finalizers.
Switch to incremental garbage collection.
If you want to help the garbage collector, null off references to objects you no longer need. Less path to follow= more explicitly garbage.
Don't forget that (non-static) inner class instances keep references to their parent class instance. So an inner class thread keeps a lot more baggage than you might expect.
In a very related vein, if you're using serialization, and you've serialized temporary objects, you're going to need to clear the serialization caches, by calling ObjectOutputStream.reset() or your process will leak memory and eventually die.
Downside is that non-transient objects are going to get re-serialized.
Serializing temporary result objects can be a bit more messy than you might think!
Consider using soft references. If you don't know what soft references are, have a read of the javadoc for java.lang.ref.SoftReference
Steer clear of Phantom references and Weak references unless you really get excitable.
Finally, if you really can't tolerate the GC use Realtime Java.
No, I'm not joking.
The reference implementation is free to download, and Peter Dibble's book from Sun is really good reading.
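The point about ObjectOutputStream.reset() can be seen in a small experiment. The stream remembers every object it has written, which is what keeps temporary objects reachable; a visible side effect is that a second write of the same object is sent as a back-reference unless reset() is called. ResetDemo and roundTrip are hypothetical names used only for this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class ResetDemo {
    /** Writes the same int[] twice, mutating it in between, and returns the second copy read back. */
    static int[] roundTrip(boolean resetBetweenWrites) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            int[] data = {1};
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(data);
                data[0] = 2;
                if (resetBetweenWrites) {
                    out.reset(); // drop the stream's memory of already written objects
                }
                out.writeObject(data);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                in.readObject();
                return (int[]) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Without reset() the second read yields the stale value 1 because only a back-reference was written; with reset() the array is serialized again and the reader sees 2, which is also the re-serialization cost mentioned above.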

As far as finalizers go:
They are virtually useless. They aren't guaranteed to be called in a timely fashion, or indeed, at all (if the GC never runs, neither will any finalizers). This means you generally shouldn't rely on them.
Finalizers are not guaranteed to be idempotent. The garbage collector takes great care to guarantee that it will never call finalize() more than once on the same object. With well-written objects, it won't matter, but with poorly written objects, calling finalize multiple times can cause problems (e.g. double release of a native resource ... crash).
Every object that has a finalize() method should also provide a close() (or similar) method. This is the function you should be calling. e.g., FileInputStream.close(). There's no reason to be calling finalize() when you have a more appropriate method that is intended to be called by you.
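A minimal sketch of the "provide a close() and call it yourself" advice, using AutoCloseable and try-with-resources. NativeHandle, isOpen and demo are hypothetical names; the actual resource release is left as a comment because it depends on the resource.

```java
/** Hypothetical resource with an explicit, idempotent close() instead of a finalizer. */
final class NativeHandle implements AutoCloseable {
    private boolean open = true;

    boolean isOpen() {
        return open;
    }

    @Override
    public void close() {
        if (!open) {
            return; // a second call is a no-op, so a double release cannot happen
        }
        open = false;
        // release the underlying resource here
    }

    /** Usage: the handle is closed when the try block exits, even on an exception. */
    static NativeHandle demo() {
        NativeHandle handle = new NativeHandle();
        try (NativeHandle h = handle) {
            // work with h while it is open
        }
        return handle;
    }
}
```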

Assuming finalizers are similar to their .NET namesake then you only really need to call these when you have resources such as file handles that can leak. Most of the time your objects don't have these references so they don't need to be called.
It's bad to try to collect the garbage because it's not really your garbage. You told the VM to allocate some memory when you created objects, and the garbage collector is hiding information about those objects. Internally the GC is performing optimisations on the memory allocations it makes. When you manually try to collect the garbage you have no knowledge of what the GC wants to hold onto and get rid of; you are just forcing its hand. As a result you mess up its internal calculations.
If you knew more about what the GC was holding internally then you might be able to make more informed decisions, but then you've missed the benefits of GC.

The real problem with closing OS handles in finalize is that the finalize are executed in no guaranteed order. But if you have handles to the things that block (think e.g. sockets) potentially your code can get into deadlock situation (not trivial at all).
So I'm for explicitly closing handles in a predictable orderly manner. Basically code for dealing with resources should follow the pattern:
SomeStream s = null;
...
try{
s = openStream();
....
s.io();
...
} finally {
if (s != null) {
s.close();
s = null;
}
}
It gets even more complicated if you write your own classes that work via JNI and open handles. You need to make sure handles are closed (released) and that it happens only once. A frequently overlooked OS handle in desktop J2SE is Graphics[2D]. Even BufferedImage.getGraphics() can potentially return a handle that points into a video driver (actually holding the resource on the GPU). If you don't release it yourself and leave it to the garbage collector, you may run into strange OutOfMemory and similar situations where you have run out of video-card-mapped bitmaps but still have plenty of memory. In my experience this happens rather frequently in tight loops working with graphics objects (extracting thumbnails, scaling, sharpening, you name it).
Basically, GC does not take over the programmer's responsibility for correct resource management. It only takes care of memory and nothing else. Stream.finalize calling close() would IMHO be better implemented as throwing new RuntimeException("garbage collecting the stream that is still open"). It would save hours and days of debugging and cleaning up code after sloppy amateurs have left loose ends.
Happy coding.
Peace.

The GC does a lot of optimization on when to properly finalize things.
So unless you're familiar with how the GC actually works and how it tags generations, manually calling finalize or starting a GC is more likely to hurt performance than help.

Avoid finalizers. There is no guarantee that they will be called in a timely fashion. It could take quite a long time before the Memory Management system (i.e., the garbage collector) decides to collect an object with a finalizer.
Many people use finalizers to do things like close socket connections or delete temporary files. By doing so you make your application behaviour unpredictable and tied to when the JVM is going to GC your object. This can lead to "out of memory" scenarios, not due to the Java Heap being exhausted, but rather due to the system running out of handles for a particular resource.
One other thing to keep in mind is that introducing calls to System.gc() or similar hammers may show good results in your environment, but they won't necessarily translate to other systems. Not everyone runs the same JVM; there are many: Sun, IBM J9, BEA JRockit, Harmony, OpenJDK, etc. These JVMs all conform to the JCK (those that have been officially tested, that is), but have a lot of freedom when it comes to making things fast. GC is one of those areas that everyone invests in heavily. Using a hammer will often destroy that effort.
