How to remove the unused constants from the generated Java class file - java

I often use constants in my Java programs. However, when looking at a decompiled Java class file, I noticed that all of a constant's uses throughout the program were replaced by its literal value, while the original declaration and assignment at the top of the class was still present. I'm wondering whether these variables are just sitting there wasting memory, since they're not used anywhere else in the file, and whether there's a way to restrict constants to compile time only and have them removed after the class file is generated.
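For context, here is a minimal sketch of the behaviour being asked about (the class and constant names are made up): javac inlines a static final field initialized with a constant expression at each use site, yet the field itself still shows up in the generated class file.

    // Hypothetical example of a compile-time constant that javac inlines at its use sites.
    public class Config {
        // static final + constant initializer => a compile-time constant
        static final int MAX_RETRIES = 3;

        int retriesLeft(int attemptsUsed) {
            // In the bytecode, MAX_RETRIES is replaced by the literal 3 here,
            // but the MAX_RETRIES field itself still appears in Config.class
            // (visible with javap -c -p Config).
            return MAX_RETRIES - attemptsUsed;
        }

        public static void main(String[] args) {
            System.out.println(new Config().retriesLeft(1)); // prints 2
        }
    }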

The compiler is smart enough to use memory efficiently in this kind of case. But even supposing you did waste some memory per constant, let's do a thought experiment on how much 'space' that might realistically be.
Let's say we had a mixture of string, int, and other basic types. Each char in a string is 2 bytes in Java, and an int is 4 bytes. Suppose we had 75,000 chars' worth of string constants and 25,000 int constants.
That is 75k * 2, so 150k bytes, or ~150 kilobytes of memory for the string constants.
For the ints that is 25k * 4, or 100k bytes, so ~100 kilobytes of memory.
So a total of ~250 KB.
The most basic computer you can buy these days has a minimum of 4 GB of RAM, more likely 8 GB to 16 GB.
So 250 KB / 4,000,000 KB * 100 = 0.00625% of the RAM in the lowest-spec modern computer.
Now if you are running on some microcontroller or other limited hardware, it may be something, but even then it is doubtful, and you probably aren't using Java there anyway.
As a general rule of thumb, compilers and interpreters tend to be pretty good at their job, and trusting them to do the right thing is, for most programmers and programs, good enough.
There are times when certain patterns, such as massive creation of new objects that then triggers garbage collection, or other issues like that, can have large performance impacts.
The key is to know where the actual bottlenecks and gotchas are within a given language/computer/usage pattern.
It does get more complicated once we move into the distributed space and massive scale - but again unless you are working in that kind of environment, you generally don't need to consider it.
Focus your expectations on your use case, and if you are optimizing for a 10 millisecond gain for something only a few people use, it's probably a waste of time - unless you are doing it for fun.

Related

Creating new objects versus encoding data in primitives

Let's assume I want to store (integer) x/y values. What is considered more efficient: storing them in a primitive like long (which fits perfectly, since sizeof(long) = 2*sizeof(int)) using bit operations like shift, or, and a mask, or creating a Point class?
Keep in mind that I want to create and store many(!) of these points (in a loop). Would there be a performance issue when using classes? The only reason I would prefer storing in primitives over storing in a class is the garbage collector. I guess generating new objects in a loop would trigger the GC way too much, is that correct?
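As a rough illustration of the packing idea from the question (the class and method names below are invented), two ints can be squeezed into one long with a shift and a mask, and pulled apart again later:

    // Sketch: pack an (x, y) pair of ints into one long and unpack it again.
    public class PackedPoint {
        static long pack(int x, int y) {
            // High 32 bits hold x, low 32 bits hold y; the mask strips y's sign extension.
            return ((long) x << 32) | (y & 0xFFFFFFFFL);
        }

        static int unpackX(long packed) {
            return (int) (packed >> 32);
        }

        static int unpackY(long packed) {
            return (int) packed;
        }

        public static void main(String[] args) {
            long p = pack(-3, 70000);
            System.out.println(unpackX(p) + ", " + unpackY(p)); // prints -3, 70000
        }
    }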
Of course packing those as a long[] is going to take less memory (and it is going to be contiguous). For each object (a Point) you will pay at least 12 bytes more for the two headers.
On the other hand, if you are creating them in a loop and escape analysis can prove they don't escape, the JIT can apply an optimization called "scalar replacement" (though it is very fragile), where your objects will not be allocated at all; instead those objects will be "desugared" into their fields.
The general rule is that you should write the code in the way that is easiest to maintain and read. If and only if you see performance issues (via a profiler, say, or too many pauses), only then should you look at GC logs and potentially optimize the code.
As an addendum, the JDK code itself is full of such longs where each bit means something different - so they do pack them. But then, neither I nor, I suspect, you are JDK developers. There, such things matter; for us, I have serious doubts.

Which is faster: Array list or looping through all data combinations?

I'm programming something in Java, for context see this question: Markov Model descision process in Java
I have two options:
byte[][] mypatterns = new byte[MAX][4];
or
ArrayList<byte[]> mypatterns = new ArrayList<>();
I can use a Java ArrayList and append new arrays whenever I create them, or use a static array by calculating all possible data combinations, then looping through to see which indexes are 'on or off'.
Essentially, I'm wondering if I should allocate a large block that may contain uninitialized values, or use the dynamic array.
I'm running at a frame rate, so looping through 200 elements every frame could be very slow, especially because I will have multiple instances of this loop.
Based on theory and what I have heard, dynamic arrays are very inefficient
My question is: Would looping through an array of say, 200 elements be faster than appending an object to a dynamic array?
Edit>>>
More information:
I will know the max length of the array if it is static.
The items in the array will frequently change, but their sizes are constant, therefore I can easily change them.
Allocating it statically would make it something like a memory pool.
Other instances may have more or less of the data initialized than others
You're right really, I should use a profiler first, but I'm also just curious about the question 'in theory'.
The "theory" is too complicated. There are too many alternatives (different ways to implement this) to analyse. On top of that, the actual performance for each alternative will depend on the the hardware, JIT compiler, the dimensions of the data structure, and the access and update patterns in your (real) application on (real) inputs.
And the chances are that it really doesn't matter.
In short, nobody can give you an answer that is well founded in theory. The best we can give is recommendations that are based on intuition about performance, and / or based on software engineering common sense:
simpler code is easier to write and to maintain,
a compiler is a more consistent¹ optimizer than a human being,
time spent on optimizing code that doesn't need to be optimized is wasted time.
1 - Certainly over a large code-base. Given enough time and patience, a human can do a better job for some problems, but that is not sustainable over a large code-base, and it doesn't take account of the facts that 1) compilers are always being improved, 2) optimal code can depend on things that a human cannot take into account, and 3) a compiler doesn't get tired and make mistakes.
The fastest way to iterate over bytes is as a single array. A faster way to process them is as int or long values, since processing 4-8 bytes at a time is faster than processing one byte at a time; however, it rather depends on what you are doing. Note: a byte[4] is actually 24 bytes on a 64-bit JVM, which means you are not making efficient use of your CPU cache. If you don't know the exact size you need, you might be better off creating a buffer larger than you need, even if you don't use all of it. I.e., in the case of the byte[][], you are already using 6x the memory you really need.
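One way to read the 'process several bytes at a time' advice, assuming the patterns really are just on/off flags (the class name and layout below are made up), is to pack them into a long[] and rule out 64 'off' flags with a single comparison:

    // Sketch: on/off flags packed into a long[], scanned one 64-bit word at a time.
    public class FlagScan {
        public static void main(String[] args) {
            long[] flags = new long[4]; // 4 * 64 = 256 flags
            flags[3] |= 1L << 10;       // turn flag 202 on (word 3, bit 10)

            for (int word = 0; word < flags.length; word++) {
                if (flags[word] == 0) {
                    continue; // skips 64 "off" flags with one comparison
                }
                for (int bit = 0; bit < 64; bit++) {
                    if ((flags[word] & (1L << bit)) != 0) {
                        System.out.println("flag " + (word * 64 + bit) + " is on");
                    }
                }
            }
        }
    }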
Any performance difference will not be visible when you set initialCapacity on the ArrayList (see the sketch after these answers). You say that your collection's size can never change, but what if this logic changes?
Using ArrayList you get access to a lot of methods such as contains.
As other people have said already, use ArrayList unless performance benchmarks say it is a bottleneck.
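To make the initialCapacity point concrete, here is a small sketch (the names and sizes are assumptions, not taken from the question) of pre-sizing the list so that appends never force the backing array to be reallocated:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch: an ArrayList created with its final capacity up front.
    public class PatternsDemo {
        static final int MAX = 200; // assumed maximum number of patterns

        public static void main(String[] args) {
            // With initialCapacity = MAX, appends up to MAX elements never
            // trigger an internal resize-and-copy of the backing array.
            List<byte[]> mypatterns = new ArrayList<>(MAX);
            mypatterns.add(new byte[] {1, 0, 1, 0});
            System.out.println(mypatterns.size()); // prints 1
        }
    }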

How do I calculate 64bit Java Memory Cost

I'm trying to find a simple and accurate reference for the cost in bytes of Java objects on a 64-bit JVM. I've not been able to find this. The primitives are clearly specified, but there are all these edge cases and exceptions that I am trying to figure out, like padding for an Object, and cost vs. the space they actually take up on the heap, etc. From the gist of what I'm reading here: http://btoddb-java-sizing.blogspot.com/ that can actually differ?? :-/
If you turn off the TLAB, you will get accurate accounting and you can see exactly how much memory each object allocation uses.
The best way to see where your memory is being used is via a memory profiler. Worrying about bytes here and there is most likely a waste of time. When you have hundreds of MB, then it makes a difference, and the best way to see that is in a profiler.
BTW, most systems use 32-bit references, even in 64-bit JVMs. There is no such thing as a 64-bit Object. Apart from the header, the object will use the same space whether it is on a 32-bit JVM or using 32-bit references in a 64-bit JVM.
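A rough sketch of the measure-the-delta idea mentioned above (the class name is made up; run it with the HotSpot flag -XX:-UseTLAB so allocations are accounted for individually, and treat the result as an approximation, since a GC in the middle would skew it):

    // Run with: java -XX:-UseTLAB ObjectSizeProbe
    public class ObjectSizeProbe {
        static long usedMemory() {
            Runtime rt = Runtime.getRuntime();
            return rt.totalMemory() - rt.freeMemory();
        }

        public static void main(String[] args) {
            Object[] hold = new Object[100_000]; // keeps the objects reachable
            long before = usedMemory();
            for (int i = 0; i < hold.length; i++) {
                hold[i] = new Object();
            }
            long after = usedMemory();
            System.out.println("approx bytes per object: "
                    + (after - before) / (double) hold.length);
        }
    }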
You are essentially asking for a simple way to get an accurate prediction of object sizes in Java.
Unfortunately ... there isn't one!
The blog posting you found mentions a number of complicating factors. Another one is that the object sizing calculation can potentially change from one Java release to the next, or between different Java implementation vendors.
In practice, your options are:
Estimate the sizes based on what you know, and accept that your estimates may be wrong. (If you take account of enough factors, you should be able to get reasonable ballpark estimates, at least for a particular platform. But accurate predictions are inherently hard work.)
Write micro benchmarks using the TLAB technique to measure the size of the objects.
The other point is that in most cases it doesn't matter if your object size predictions are not entirely accurate. The recommended approach is to implement, measure and then optimize. This does not require accurate size information until you get to the optimization stage, and at that point you can measure the sizes ... if you need the information.

When performing mmap, would C or Java have any significant performance differences?

I have a 50GB file that is a sorted csv file.
Would it in theory make any difference if I were performing lookups on this file using memory-mapped access from C or Java?
I'm guessing since the file access is pushed down to the operating system level, it really shouldn't make much of a difference correct?
In theory, Java will be infinitesimally slower because of the need for additional indirections due to Java's object-oriented method invocation, and possibly due to the need to cross the Java/JNI boundary.
In practice, the Hotspot compiler optimizes direct ByteBuffer access, and the cost of page faults will far exceed the extra memory indirection.
To give a direct answer to the question:
C's mmap() and Java's FileChannel.map() are considered pretty much equivalent and won't have significant performance differences.
Java can only map 2 GB at a time. This is because ByteBuffer uses 32-bit integers for length, size, etc. So you'd need 25 mmaps for your 50 GB file. C can just create a single mmap, although it won't be portable to 1990s computers (if you care about that).
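For illustration, here is a sketch of mapping a large file in chunks from Java, since each FileChannel.map() call is limited to Integer.MAX_VALUE bytes (the file name and chunking strategy are assumptions, not part of the original answer):

    import java.io.IOException;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;

    // Sketch: map a file bigger than 2 GB as a series of read-only regions.
    public class ChunkedMmap {
        static final long CHUNK = Integer.MAX_VALUE; // ~2 GB per mapping

        public static void main(String[] args) throws IOException {
            Path path = Paths.get("data.csv"); // hypothetical large sorted CSV
            try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
                long size = ch.size();
                int chunks = (int) ((size + CHUNK - 1) / CHUNK);
                MappedByteBuffer[] maps = new MappedByteBuffer[chunks];
                for (int i = 0; i < chunks; i++) {
                    long pos = i * CHUNK;
                    long len = Math.min(CHUNK, size - pos);
                    maps[i] = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                }
                System.out.println("mapped " + chunks + " chunk(s) of " + path);
            }
        }
    }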

determining java memory usage

Hmmm. Is there a primer anywhere on memory usage in Java? I would have thought Sun or IBM would have had a good article on the subject but I can't find anything that looks really solid. I'm interested in knowing two things:
at runtime, figuring out how much memory the classes in my package are using at a given time
at design time, estimating general memory overhead requirements for various things like:
how much memory overhead is required for an empty object (in addition to the space required by its fields)
how much memory overhead is required when creating closures
how much memory overhead is required for collections like ArrayList
I may have hundreds of thousands of objects created and I want to be a "good neighbor" to not be overly wasteful of RAM. I mean I don't really care whether I'm using 10% more memory than the "optimal case" (whatever that is), but if I'm implementing something that uses 5x as much memory as I could if I made a simple change, I'd want to use less memory (or be able to create more objects for a fixed amount of memory available).
I found a few articles (Java Specialists' Newsletter and something from Javaworld) and the built-in java.lang.instrument.Instrumentation.getObjectSize() method, which claims to measure an "approximation" (??) of memory use, but these all seem kind of vague...
(and yes I realize that a JVM running on two different OS's may be likely to use different amounts of memory for different objects)
I used JProfiler a number of years ago and it did a good job, and you could break down memory usage to a fairly granular level.
As of Java 5, on Hotspot and other VMs that support it, you can use the Instrumentation interface to ask the VM the memory usage of a given object. It's fiddly but you can do it.
In case you want to try this method, I've added a page to my web site on querying the memory size of a Java object using the Instrumentation framework.
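In case a concrete outline helps, here is a minimal sketch of that Instrumentation approach (the class name and agent packaging details are assumptions); the class has to be loaded as a java agent so the VM hands it an Instrumentation instance:

    import java.lang.instrument.Instrumentation;

    // Sketch: package this with "Premain-Class: ObjectSizer" in the jar manifest
    // and start the application with: java -javaagent:sizer.jar YourApp
    public class ObjectSizer {
        private static volatile Instrumentation instrumentation;

        public static void premain(String agentArgs, Instrumentation inst) {
            instrumentation = inst;
        }

        public static long sizeOf(Object obj) {
            if (instrumentation == null) {
                throw new IllegalStateException("agent not loaded; run with -javaagent");
            }
            // An implementation-specific approximation of the object's own size,
            // not counting anything it references.
            return instrumentation.getObjectSize(obj);
        }
    }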
As a rough guide in Hotspot on 32-bit machines:
objects use 8 bytes for "housekeeping"
fields use what you'd expect them to use given their bit length (though booleans tend to be allocated an entire byte)
object references use 4 bytes
overall object size has a granularity of 8 bytes (i.e. if you have an object with 1 boolean field it will use 16 bytes; if you have an object with 8 booleans it will also use 16 bytes)
There's nothing special about collections in terms of how the VM treats them. Their memory usage is the total of their internal fields plus -- if you're counting this -- the usage of each object they contain. You need to factor in things like the default array size of an ArrayList, and the fact that that size grows by a factor of 1.5 whenever the list gets full. But either asking the VM, or using the above metrics while looking at the source code of the collections and "working it through", will essentially get you to the answer.
If by "closure" you mean something like a Runnable or Callable, well again it's just a boring old object like any other. (N.B. They aren't really closures!!)
You can use JMP, but it's only caught up to Java 1.5.
I've used the profiler that comes with newer versions of Netbeans a couple of times and it works very well, supplying you with a ton of information about memory usage and runtime of your programs. Definitely a good place to start.
If you are using a pre-1.5 VM, you can get the approximate size of objects by using serialization. Be warned though: this can require double the amount of memory for that object.
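A small sketch of that serialization trick (the class name is invented); the serialized length is only a loose proxy for heap size, since it includes class metadata and omits object headers:

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.io.Serializable;

    // Sketch: estimate an object's size from its serialized form.
    public class SerializedSizeEstimate {
        static long serializedSize(Serializable obj) throws IOException {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(obj);
            }
            return bytes.size();
        }

        public static void main(String[] args) throws IOException {
            // An int[100] serializes to roughly 400 bytes of data plus stream overhead.
            System.out.println(serializedSize(new int[100]));
        }
    }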
See if PerfAnal will give you what you are looking for.
This might not be the exact answer you are looking for, but the posts at the following link will give you very good pointers: Other Question about Memory
I believe the profiler included in Netbeans can monitor memory usage too; you can try that.
