When we have wrapper classes, why are primitives supported? - java

We have wrapper classes in Java like Integer, Float... why does it still support primitives, which stops Java from being a fully object-oriented language?

Wrappers, being objects, get placed on the heap. Primitives are just "values" and live on the stack. This is more efficient, because a wrapped primitive needs (at least) both the object holding the value (on the heap) and a reference to that object (on the stack).
Whether this performance gain matters at all depends on what you're doing. For heavy numerical work, definitely; but for 99% of code out there, it's more of an annoyance. For one thing, you can't store primitives in a Collection anyway; they get autoboxed. So the only way to store lots of them is to use plain arrays, which in turn can lead to other kinds of inefficiencies (if you need to resize them, for instance).
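A minimal sketch of the boxing involved (the names here are just for illustration):

    import java.util.ArrayList;
    import java.util.List;

    List<Integer> boxed = new ArrayList<>();
    boxed.add(7);              // compiler inserts Integer.valueOf(7): a heap object (or cache hit)
    int n = boxed.get(0);      // compiler inserts intValue(): unboxing

    int[] plain = new int[1_000_000];  // one contiguous block, no per-element objects
    plain[0] = 7;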

Because primitives are lighter and more efficient in terms of memory and CPU processing.

One word: Performance.
The wrapper types are also immutable, which makes them extra expensive if one wants to use one as a loop counter, for example.
The JVM also has opcodes for doing arithmetic directly on primitives.
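A small sketch of both points:

    Integer count = 0;                       // boxed counter
    for (int i = 0; i < 1_000_000; i++)
        count++;                             // unbox, add, re-box: roughly
                                             // count = Integer.valueOf(count.intValue() + 1)

    int fast = 0;                            // primitive counter
    for (int i = 0; i < 1_000_000; i++)
        fast++;                              // a single iinc opcode, no heap traffic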

Related

Improve memory usage: IntegerHashMap

We use a HashMap<Integer, SomeType>() with more than a million entries. I consider that large.
But integers are their own hash code. Couldn't we save memory with a, say, IntegerHashMap<Integer, SomeType>() that uses a special Map.Entry, using int directly instead of a pointer to an Integer object? In our case, that would save 1000000x the memory required for an Integer object.
Any faults in my line of thought? Too special to be of general interest? (at least, there is an EnumHashMap)
add1. The first generic parameter of IntegerHashMap is used to make it closely similar to the other Map implementations. It could be dropped, of course.
add2. The same should be possible with other maps and collections. For example ToIntegerHashMap<KeyType, Integer>, IntegerHashSet<Integer>, etc.
What you're looking for is a "primitive collections" library. They are usually much better in memory usage and performance. One of the oldest and most popular libraries was Trove; however, it is a bit outdated now. The main active libraries in use now are:
Goldman Sachs Collections
fastutil
Koloboke
See Benchmarks Here
Some words of caution:
"integers are their own hash code" I'd be very careful with this statement. Depending on the integers you have, the distribution of keys may be anything from optimal to terrible. Ideally, I'd design the map so that you can pass in a custom IntFunction as hashing strategy. You can still default this to (i) -> i if you want, but you probably want to introduce a modulo factor, or your internal array will be enormous. You may even want to use an IntBinaryOperator, where one param is the int and the other is the number of buckets.
I would drop the first generic param. You probably don't want to implement Map<Integer, SomeType>, because then you will have to box / unbox in all your methods, and you will lose all your optimizations (except space). Trying to make a primitive collection compatible with an object collection will make the whole exercise pointless.
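As a minimal sketch of what the libraries above buy you (assuming fastutil is on the classpath, with String standing in for SomeType):

    import it.unimi.dsi.fastutil.ints.Int2ObjectOpenHashMap;

    Int2ObjectOpenHashMap<String> map = new Int2ObjectOpenHashMap<>();
    map.put(42, "value");       // int key stored directly in an internal int[] table
    String v = map.get(42);     // lookup without allocating an Integer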

Is there a mutable reduction operation (Collector) for Stream<BigDecimal>?

AFAIK the only way to sum a stream of BigDecimal is:
BigDecimal result = stream.reduce(BigDecimal.ZERO, BigDecimal::add);
The problem here is that every call to BigDecimal::add will create a new BigDecimal as opposed to changing a mutable type.
Is there a mutable reduction operation aka Collector for Stream<BigDecimal>?
BigDecimal: "Immutable, arbitrary-precision signed decimal numbers."
Since it's immutable, there is no method to manipulate an instance without creating new objects. Any method that could do that would break the guarantees of the class (like BigDecimal.ZERO being 0).
Well, there is no public mutable BigDecimal companion class, so there is no Collector using it. But you should not worry about the performance implication of the instance creation unless a profiling tool tells you that there is a problem.
Modern JVMs like HotSpot are usually good at dealing with temporary objects created in a hot loop. Even if they are not able to elide the allocation, the allocation costs are not that big. This is different from, e.g., String::concat, where the instance creation cost includes not only the allocation but also copying the entire contents of the previously created String instances, yielding quadratic time complexity for such a reduction (unless the optimizer manages to rewrite the code). The same applies to attempts to produce a Collection via pure (immutable) reduction.
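To illustrate the String point (a minimal sketch; the quadratic cost comes from the repeated copying, not from these three elements):

    import java.util.stream.Collectors;
    import java.util.stream.Stream;

    // Pure (immutable) reduction: each concat copies everything so far -> quadratic time.
    String slow = Stream.of("a", "b", "c").reduce("", String::concat);

    // Mutable reduction: Collectors.joining() appends into one StringBuilder -> linear time.
    String fast = Stream.of("a", "b", "c").collect(Collectors.joining());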
This might seem to contradict the existence of primitive-type specializations like IntStream, LongStream and DoubleStream, but that's a trade-off. Generally, the preference of the JRE developers is to improve JVM performance (for the benefit of all value types) rather than to add a mutable helper class for every immutable class. Special support for primitive types may continue until full value-type support arrives, but don't expect the addition of new public mutable companion classes for immutable types (unless we're talking about construction costs, as in the String example).
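For completeness, the mutable-reduction shape the question asks about can be written by hand with a one-element array as the container. This is a workaround sketch, not a JDK API; the container is mutable, but the BigDecimal inside is not, so each accumulation still allocates a new BigDecimal:

    import java.math.BigDecimal;
    import java.util.stream.Collector;
    import java.util.stream.Stream;

    Collector<BigDecimal, BigDecimal[], BigDecimal> toSum = Collector.of(
            () -> new BigDecimal[] { BigDecimal.ZERO },      // supplier: mutable holder
            (acc, v) -> acc[0] = acc[0].add(v),              // accumulator
            (a, b) -> { a[0] = a[0].add(b[0]); return a; },  // combiner for parallel streams
            acc -> acc[0]);                                  // finisher: unwrap the result

    BigDecimal sum = Stream.of(new BigDecimal("1.5"), new BigDecimal("2.5"))
                           .collect(toSum);                  // 4.0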

Coexistence of primitive and reference data types in Java

I have always asked myself this question: why did Java's designers introduce both primitive and reference types into the language? In other words, why do two data types exist that fulfil the same goal, such as (int & java.lang.Integer), (float & java.lang.Float)...? Could anyone please explain this to me?
Java has primitives because :
They are fast. (when compared to Objects)
They have less overhead. (when compared to Objects)
They actually make life easier for people with a C/C++ background and give them (almost) the same feel.
Java has Wrappers because :
In certain data structures like Collections, only objects can be stored, because when doing garbage collection the GC treats everything in them uniformly as Objects and performs its operations on them.
Using wrappers (Objects) instead of primitives in Collections is also a design choice, because it allows general behavior of methods. For example, equals() and contains() on collections work on the basis of method overriding, which cannot be done with primitives.
Primitives are faster than reference types (almost unnoticeable in small loops, but in large operations the difference is clear), make the code easier to read for people with C/C++ backgrounds, and in general are a better, optimized fit for general-purpose operations on basic types.
Consider, however, that some parts of Java (Collections, Generics, Reflection...) require classes rather than basic types, so the wrappers (plus boxing and unboxing) were added for that. In addition, wrappers are nullable while basic types are not, which may also be required in certain operations.
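A minimal sketch of both points (collections require objects; wrappers are nullable):

    import java.util.ArrayList;
    import java.util.List;

    List<Integer> xs = new ArrayList<>();   // List<int> would not compile: generics need objects
    xs.add(3);                              // autoboxed to Integer
    boolean found = xs.contains(3);         // works because Integer overrides equals(Object)

    Integer maybe = null;                   // wrappers are nullable...
    // int broken = maybe;                  // ...but unboxing null throws a NullPointerException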

Primitive Array vs ArrayList

I am receiving XML and need to convert it to either a primitive array or an ArrayList.
Is there much difference in performance, in terms of memory and garbage collection?
My application will be creating thousands of these objects every second, and I need to minimize GC since I need real-time performance.
Thanks
Primitive arrays are much more efficient, as they don't require wrapper objects. Guava has List implementations that are backed by primitive arrays (example: Ints.asList(int[])), perhaps that could be a reasonable solution for you: get the power of a collection but only use Objects when you actually need them.
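For example (a sketch assuming Guava is on the classpath):

    import com.google.common.primitives.Ints;
    import java.util.List;

    int[] values = {1, 2, 3};
    List<Integer> view = Ints.asList(values);   // fixed-size List view backed by the array
    view.set(0, 42);                            // writes through: values[0] is now 42
    // view.add(4);                             // would throw: the backing array cannot grow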
Primitive arrays are always more efficient, but by how much depends on the exact details of your use case. I recently sped up performance by a factor of 7 by ripping out the ArrayLists and replacing them with primitive arrays in the innermost loops. The use case was an O(n^2) algorithm applied to lists 100-1000 characters long. I then did a controlled experiment comparing the performance of an int[] array to an ArrayList, and interestingly, as the array/list sizes get bigger, the JIT compiler seems to kick in and the performance penalty becomes a lot smaller (only ~20%). But for list sizes under 500, the performance penalty of an ArrayList can be up to a factor of 10. So if you have a frequently called method manipulating lots of small lists or arrays (as in my use case), using primitive arrays can have a big performance impact.
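A sketch of the kind of refactoring described above (hypothetical method and parameter names, purely for illustration):

    import java.util.List;

    // Before: boxed list in the hot inner loop (unboxes on every get)
    static long sumBoxed(List<Integer> scores) {
        long total = 0;
        for (int i = 0; i < scores.size(); i++) total += scores.get(i);
        return total;
    }

    // After: primitive array, contiguous memory, no unboxing
    static long sumPlain(int[] scores) {
        long total = 0;
        for (int i = 0; i < scores.length; i++) total += scores[i];
        return total;
    }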
As Sean Patrick Floyd pointed out, primitive arrays are much more efficient.
However, there are cases where one would definitely prefer Collections. But as long as you just iterate over the Objects, there is no need for Collections.
Linked lists are good for inserts/deletes, and arrays are good for random access.

Performance of the Number class

I'm just wondering about the performance of the Number class as opposed to, say, using generics or even a whole lot of functions to handle primitive types.
The primitive types would clearly be the fastest option, I assume; however, if the performance hit is not too huge, it would likely be easier for the coder to just use the Number class or generics rather than writing a function that accepts and returns long, double, etc.
I am about to do a performance benchmark of the three options mentioned. Is there anything I should be aware of or try out when doing this? Or, even better, has someone done this before and can share their results?
Typically you use the Number class as opposed to primitive types because you need to use these values in collections or other classes that are based on Objects. If you are not restricted by this requirement, then you should use primitives.
Yes, there is a performance hit associated with using the Number class, in comparison with primitive types like int, long, etc. Especially if you are creating a lot of new Numbers, you will want to worry about the performance when compared with creating primitive types. But this is not necessarily the case for passing Numbers to methods. Passing an instance of Number to a method is no slower than passing an int or a long, since the compiler can basically pass a "pointer" to a memory location. This is very general information because your question is very general.
One thing you should be aware of is that object allocation is likely to be the largest cost when you use Numbers. This affects your benchmarks, because certain operations that use auto-boxing can draw on cached values (which don't create objects) and can give much better performance results. E.g., if you use Integers between -128 and 127 you will get much better results than with Doubles over the same range, because the former uses cached values and the latter does not.
In short, if you are micro-benchmarking the use of Numbers, you need to ensure the range of values you use is realistic; not all values are equal in terms of performance (for primitives, of course, this doesn't matter so much).
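A minimal demonstration of that cache effect (the == comparisons are exactly what makes cached vs. fresh objects visible):

    Integer a = Integer.valueOf(100), b = Integer.valueOf(100);
    System.out.println(a == b);   // true: -128..127 come from a shared cache

    Integer c = Integer.valueOf(1000), d = Integer.valueOf(1000);
    System.out.println(c == d);   // false here: outside the guaranteed cache range, new objects

    Double e = Double.valueOf(100), f = Double.valueOf(100);
    System.out.println(e == f);   // false: Double.valueOf never caches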
