Do 2D arrays use more resources than 1D arrays in Java? - java

For example, would a full int[50][8] use more resources (RAM and CPU) than 8 full int[50] arrays?

In the first case you have one array object pointing to fifty array objects, each holding 8 ints.
So 1 + 50 array objects, plus fifty pointers in the first array object.
In the second case you have one array object pointing to 8 array objects, each holding 50 ints.
So 1 + 8 array objects, plus eight pointers in the first array object. Holding the ints is a wash.
There is no good way to evaluate CPU usage for this.

There appear to be three things to compare here.
new int[50][8]
new int[8][50]
new int[400]
Now, I get this confused, but the way to remember is to think of new int[50][] which is valid.
So new int[50][8] is an array of 50 arrays of size 8 (51 objects). new int[8][50] is an array of 8 arrays of size 50 (9 objects). 9 objects will have a lower overhead than 51. new int[400] is just one object.
However, at this size it probably doesn't make any measurable difference to the performance of your program. You might want to encapsulate the array(s) within an object that will allow you to change the implementation and provide a more natural interface to client code.
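The encapsulation suggested above could look roughly like the following. This is only a sketch: the class name IntGrid and its methods are made up for illustration and do not come from any library. It happens to use a single int[400] as backing storage, but because callers only see get and set, it could be switched to int[50][8] later without touching client code.

```java
// Hypothetical wrapper: callers use (row, col), the storage layout stays private.
final class IntGrid {
    private final int rows;
    private final int cols;
    private final int[] cells; // one array object instead of rows + 1 array objects

    IntGrid(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.cells = new int[rows * cols];
    }

    int get(int row, int col) {
        return cells[index(row, col)];
    }

    void set(int row, int col, int value) {
        cells[index(row, col)] = value;
    }

    // Row-major mapping. The explicit check stands in for the per-row
    // bounds check that int[][] would otherwise perform.
    private int index(int row, int col) {
        if (row < 0 || row >= rows || col < 0 || col >= cols) {
            throw new IndexOutOfBoundsException("(" + row + ", " + col + ")");
        }
        return row * cols + col;
    }

    public static void main(String[] args) {
        IntGrid grid = new IntGrid(50, 8);
        grid.set(49, 7, 42);
        System.out.println(grid.get(49, 7)); // prints 42
    }
}
```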

One additional usage point (it came from a reference I unfortunately can't find now, but it's fairly commonsensical):
The authors of this paper were testing various ways of compressing sparse arrays into multidimensional arrays. One thing they noticed is that the direction in which you iterate makes a difference in speed.
The idea was that if you have int[i][j], it was faster to do
for (i) {
for (j)
than to do
for (j) {
for (i)
because in the first instance you're iterating through elements stored contiguously.
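Written out in Java, the two loop orders might look like this. It is a sketch with invented names (LoopOrder, sumRowByRow, sumColumnByColumn). Both methods return the same sum and differ only in memory access pattern. How much faster the first one is depends on the array size and the JVM, so no timings are claimed here.

```java
final class LoopOrder {
    // The inner loop walks one row array from start to end (adjacent ints).
    static long sumRowByRow(int[][] a) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            for (int j = 0; j < a[i].length; j++) {
                sum += a[i][j];
            }
        }
        return sum;
    }

    // The inner loop jumps to a different row array on every step.
    // Assumes a rectangular array (every row is as long as a[0]).
    static long sumColumnByColumn(int[][] a) {
        long sum = 0;
        int cols = a.length == 0 ? 0 : a[0].length;
        for (int j = 0; j < cols; j++) {
            for (int i = 0; i < a.length; i++) {
                sum += a[i][j];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] a = new int[50][8];
        for (int i = 0; i < 50; i++) {
            for (int j = 0; j < 8; j++) {
                a[i][j] = i + j;
            }
        }
        System.out.println(sumRowByRow(a) == sumColumnByColumn(a)); // prints true
    }
}
```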

You could save a tiny amount of memory by using a single array, int[] myInt = new int[400], and manually accessing the int at position (x,y) with myInt[x+y*50].
That would save you 50 32-bit pieces of memory (the row references). Accessing it that way will maybe (who knows exactly what the HotSpot compiler does to this) take one more instruction for the multiplication.
That kind of micro-optimisation will most likely not make your app perform better, and it will decrease readability.

I suggest writing a small performance test for this with very large arrays to see the actual difference. In reality I don't think this would make the slightest difference.

int[50][8] is 50 arrays of length 8
int[8][50] is 8 arrays of length 50
int[400] is one array of length 400.
Each array has an overhead of about 16 bytes.
However, for the sizes you have here, it really doesn't matter. You are not going to be saving much either way.

Related

What does this statement mean in Java?

Suppose there is an array declaration:
int [] a = new int[3];
I'm reading a book that says the reason we need to explicitly create arrays at runtime is because we cannot know how much space to reserve for the array at compile time. So, in this case above don't we know we need to reserve 3 times the size of int of the underlying java platform ?
So, in this case above don't we know we need to reserve 3 times the size of int of the underlying java platform ?
Well actually slightly more than that - because arrays are objects with a class reference, as well as the length.
But yes, in this case you happen to know the size in advance. But Java also lets you write:
int[] a = new int[getLengthFromMethod()];
... in which case no, the amount of memory won't be known at compile time. This flexibility makes it considerably simpler to work with arrays than if you had to know the size at compile time. (Imagine trying to make ArrayList work otherwise, for example.)
But bear in mind that memory allocation is very dynamic in Java in general - it's effectively only really the size of stack frames which is known in advance, and then only in relative terms (as the size of a reference can vary between platforms). All objects are allocated on the heap and references are used to keep track of them. (Okay, aside from smart JVMs which are sometimes able to allocate inline after performing escape analysis.)
In that book it probably says that there are scenarios in which you don't know at compile time how large an array will be (e.g.: when you ask a user for a set of numbers you don't know how many numbers he will insert).
In your example (int [] a = new int[3]) you obviously know the size of the array at compile time; it's 3.
Yes. You're correct.
3 * sizeOf(int) + array overhead.
You are reserving that much memory and then handing a pointer to the first position to your variable a. Then a can figure out how to index the array based on the location and size of what is being stored.
In your case the size of the array is 3. The number inside the square brackets is the size. Check this week's tutorial about arrays on www.codejava.co.uk
Are you sure that book you are reading is about Java?
Unlike C/C++, Java does not have that dilemma - arrays are always allocated at runtime, even if you know their size during compile time.

2D-Array : preferred way to access items

So here I am tonight with a question that came to mind:
What is your favourite way to access the items of an m x n matrix?
There is the normal way, where you use one index for the rows
and another index for the columns: matrix[i][j].
And there's another way, where your matrix is a vector of length m*n
and you access the items using [i*n+j] as the index.
Tell me which method you prefer most. Are there any other methods
that would work for specific cases?
Let's say we have this piece of C(++) code:
int x = 3;
int y = 4;
arr2d[x][y] = 0xFF;
arr1d[x*10+y] = 0xFF;
Where:
unsigned char arr2d[10][10];
unsigned char arr1d[10*10];
And now let's look at the compiled version of it (assembly; using debugger):
As you can see there's absolutely no penalty or slowdown when accessing array elements no matter if you're using 2D arrays or not, since both of the methods are actually the same.
There are only two reasons to go for the one-dimensional array to represent n-dimensions I can think of:
Performance: The usual way to allocate n-dimensional arrays means that we get n dimensions that may not necessarily be allocated in one piece - which isn't that great for spatial locality (and may also result in at least some additional memory accesses - in the worst case we need 1 additional read for each access). Now in C/C++ you can get around this (allocate memory in one piece, then afterwards specify the correct pointers; just be really careful not to forget this when you delete it) and other languages (C#) already can do this out of the box. Also note that in a language with a stop&copy GC the reasoning is unnecessary since all the objects will be allocated near each other anyhow. You avoid additional overhead for each single dimension though, so you use your memory and cache a bit better.
For some algorithms it's nicer to just use a one dimensional array which may make the code shorter and slightly faster - that's probably the one thing that can be argued as subjective here.
I think that if you need a 2D array, it is because you would like to access it as a 2D array, not as a 1D array.
Otherwise you can do a simple multiplication to index it as a 1D array.
If I was to use a 2-D array, I would vote for matrix[i][j]. I think this is more readable. However, I might consider using Guava's Table class.
http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/Table.html
I don't think that your "favourite" way, or the most aesthetically pleasing way is a good approach to take with this issue - underlying performance would be my main concern.
Storing a matrix as a contiguous array is often the most efficient way of doing matrix calculations. If you take a look at optimised BLAS (Basic Linear Algebra Subroutine) libraries, such as the Intel MKL, the AMD ACML, ATLAS etc etc contiguous matrix storage will be used. When contiguous storage is used, and contiguous data access patterns are exploited higher performance can result due to the improved locality-of-reference (i.e. cache performance) of the operations.
In some languages (e.g. C++) you could use operator overloading to achieve the data[i][j] style of indexing while doing the 1D array index mappings behind the scenes.
Hope this helps.
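To make the contiguous-storage point concrete, here is a small Java sketch of a matrix-vector product over a matrix kept row-major in one double[]. The class and method names (FlatMatVec, multiply) are invented for this example. It is only meant to show the access pattern, not how an optimised BLAS library is actually implemented.

```java
final class FlatMatVec {
    // Computes y = A * x, where A is rows x cols, stored row-major in one double[].
    static double[] multiply(double[] a, int rows, int cols, double[] x) {
        if (a.length != rows * cols || x.length != cols) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double[] y = new double[rows];
        for (int i = 0; i < rows; i++) {
            int rowStart = i * cols;
            double acc = 0.0;
            for (int j = 0; j < cols; j++) {
                acc += a[rowStart + j] * x[j]; // reads consecutive elements of a
            }
            y[i] = acc;
        }
        return y;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3,
                      4, 5, 6}; // a 2 x 3 matrix
        double[] y = multiply(a, 2, 3, new double[] {1, 1, 1});
        System.out.println(y[0] + " " + y[1]); // prints 6.0 15.0
    }
}
```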

Storing multiple datatypes inside a single two dimensional array

I need to store multiple datatypes (mostly int or String) inside a two dimensional array. Using Object[][] does solve the problem. But is it a good way to do so?
How does the Object[][] array then reserve the heap space? I mean, in accordance with which datatype? Does it lead to any waste of resources?
I was trying to do something like this:-
Object[][] dataToBeWritten={ {"Pami",34,45},
{"Ron","x",""},
{"spider","x",""}
};
Edit: You may also suggest better alternatives, if any exist.
See How to calculate the memory usage of a Java array and Memory usage of Java objects: general guide.
For example, let's consider a 10x10 int array. Firstly, the "outer" array has its 12-byte object header followed by space for the 10 elements. Those elements are object references to the 10 arrays making up the rows. That comes to 12+4*10=52 bytes, which must then be rounded up to the next multiple of 8, giving 56. Then, each of the 10 rows has its own 12-byte object header, 4*10=40 bytes for the actual row of ints, and again, 4 bytes of padding to bring the total for that row to a multiple of 8. So in total, that gives 11*56=616 bytes. That's a bit bigger than if you'd just counted on 10*10*4=400 bytes for the hundred "raw" ints themselves.
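The arithmetic above can be reproduced with a few lines of Java. This is a sketch that hard-codes the same assumptions as the worked example (12-byte header, 4-byte references, 8-byte alignment). It does not query the running JVM, and the class name ArrayFootprint is made up.

```java
final class ArrayFootprint {
    // Assumptions taken from the worked example, not measured on a real JVM.
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int INT_BYTES = 4;
    static final int ALIGNMENT = 8;

    // Rounds up to the next multiple of ALIGNMENT.
    static long roundUp(long bytes) {
        return (bytes + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    // Estimated size of one int[length].
    static long intArray(int length) {
        return roundUp(HEADER_BYTES + (long) INT_BYTES * length);
    }

    // Estimated size of an int[rows][cols]: outer array of references plus every row.
    static long int2dArray(int rows, int cols) {
        long outer = roundUp(HEADER_BYTES + (long) REFERENCE_BYTES * rows);
        return outer + rows * intArray(cols);
    }

    public static void main(String[] args) {
        System.out.println(int2dArray(10, 10)); // prints 616
        System.out.println(int2dArray(50, 8));  // prints 2616
        System.out.println(int2dArray(8, 50));  // prints 1776
        System.out.println(intArray(400));      // prints 1616
    }
}
```

Under these assumptions, the same 400 ints cost noticeably more as int[50][8] than as int[8][50] or a flat int[400], because each extra row array brings its own header and padding.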
I think this is for HotSpot only, though. References to any object are, just like ints, 4 bytes each, regardless of the actual object or of the reference being null. The space requirement for the objects themselves is a whole different story, though, as that space isn't reserved at array creation.
All objects are stored by reference. So a reference to the heap memory is stored. Therefore the amount of memory allocated for an array is one sizeof ( reference ) per entry.
An array of Objects is basically an array of pointers. However, that's what you get with any array of non-primitive types in Java - an array of Objects, and array of Strings, and an array of Robots of equal length take up the exact same amount of space. Heap space for the actual objects isn't allocated until you initialize the objects.
Alternative:
Use proper classes. You are trying to take some dynamic approach in a statically typed language. The thing is that Object[] doesn't help the reader of your code one bit what he is reading about. In fact I can't even suggest a design for a class because I can't make sense of your example. What is {"Pami",34,45} and how is this supposed to be related to {"spider","x",""}?
So supposing this information is something foo-like, you should create a class Foo and collect all that stuff in a Foo[] or a List<Foo>.
Remember: Not only comments store information about your code. The type system contains valuable information about what you're trying to accomplish. Object contains no such information.

Speeding up code - 3D array

I'm trying to improve the speed of some code I've written. I was wondering how efficient accessing data from a 3d array of integers is?
I have an array
int cube[][][] = new int[10][10][10];
which I populate with values. I then access these values several thousand times.
I was wondering, seeing as all 3D arrays are theoretically stored in 1D arrays in memory, is there a way to turn my 3D array into a 1D one? For instance I could have cube[0] referring to the old cube[0][0][0] and cube[1] referring to the old cube[0][0][1].
I'm not sure how to go about doing it. I'm sure it's possible but my brain is worn out.
Thanks
You can create the single-dimension array as follows:
int cube[] = new int[w * h * d];
And to access an element:
int value = cube[x * h * d + y * d + z];
But I doubt it will be much faster and you're losing some convenience and safety. Before deciding to go through with this change it might be a good idea to perform some benchmark tests on your data to see if you actually have a problem and whether the change gives a sufficiently large improvement to be worth the extra complexity.
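For completeness, the flat layout can be wrapped so the index arithmetic lives in one place. This is a sketch using an invented class name (FlatCube). The mapping is the same x * h * d + y * d + z as above, just factored as (x * h + y) * d + z. It performs no per-axis bounds checks, which is part of the safety you give up.

```java
final class FlatCube {
    private final int w;
    private final int h;
    private final int d;
    private final int[] cells;

    FlatCube(int w, int h, int d) {
        this.w = w;
        this.h = h;
        this.d = d;
        this.cells = new int[w * h * d];
    }

    // Same mapping as x * h * d + y * d + z; z varies fastest.
    int index(int x, int y, int z) {
        return (x * h + y) * d + z;
    }

    int get(int x, int y, int z) {
        return cells[index(x, y, z)];
    }

    void set(int x, int y, int z, int value) {
        cells[index(x, y, z)] = value;
    }

    int size() {
        return w * h * d;
    }

    public static void main(String[] args) {
        FlatCube cube = new FlatCube(10, 10, 10);
        cube.set(0, 0, 1, 5);
        System.out.println(cube.index(0, 0, 1)); // prints 1
        System.out.println(cube.get(0, 0, 1));   // prints 5
    }
}
```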
That's exactly what Java is doing behind the scenes. A three dimensional array is simply an array of arrays of arrays. In theory you could separate the arrays into 10 two dimensional arrays or 100 one-dimensional arrays (and even into 1000 individual variables), but it would be unlikely to speed up your performance. Focus on optimizing your algorithm instead.
int cube[] = new int[ X*Y*Z ];
cube[ i*Y*Z + j*Z + k ] = ...
But, as others already said: It's not expected to be faster (as the calculations have to be done anyway). Let Java do its stuff for reasons of error-avoidance.
Do not do it - Java handles all this for you. You can of course make it a 1D array and then do the calculations, but you will hardly beat the optimized JVM code which does the same in the background. Also - is this really causing a performance bottleneck according to a profiler? If not, you may be optimizing your code prematurely.
You could use a LinkedList and store a 2D array in each Node. That would be more efficient I believe.

in java, which is better - three arrays of booleans or 1 array of bytes?

I know the question sounds silly, but consider this: I have an array of ints (1..N) and a labelling algorithm. at any point the item the int represents is in one of three states. The current version holds these states in a byte array, where 0, 1 and 2 represent the three states. alternatively, I could have three arrays of boolean - one for each state. which is better (consumes less memory) depends on how jvm (sun's version) stores the arrays - is a boolean represented by 1 bit? is there any other magic happening behind the scenes? (p.s. don't start with all that "this is not the way OO/Java works" - I know, but here performance comes in front. plus the algorithm is simple and perfectly readable even in such form).
Thanks a lot
Instead of two booleans or 1 int, just use a BitSet - http://java.sun.com/j2se/1.4.2/docs/api/java/util/BitSet.html
You can then have two bits per label/state. And BitSet being a standard java class, you are likely to get good performance.
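One way to read the "two bits per state" suggestion is sketched below. The wrapper class PackedStates is invented for illustration, while java.util.BitSet and its get(int) and set(int, boolean) methods are standard. Item i uses bits 2*i and 2*i+1, so states 0, 1 and 2 all fit.

```java
import java.util.BitSet;

final class PackedStates {
    private final BitSet bits;
    private final int size;

    PackedStates(int size) {
        this.size = size;
        this.bits = new BitSet(2 * size);
    }

    // Stores a state in the range 0..2 (3 would also fit in two bits).
    void set(int i, int state) {
        check(i);
        if (state < 0 || state > 3) {
            throw new IllegalArgumentException("state must fit in two bits: " + state);
        }
        bits.set(2 * i, (state & 1) != 0);     // low bit
        bits.set(2 * i + 1, (state & 2) != 0); // high bit
    }

    int get(int i) {
        check(i);
        return (bits.get(2 * i) ? 1 : 0) | (bits.get(2 * i + 1) ? 2 : 0);
    }

    // BitSet grows silently, so the range is checked by hand.
    private void check(int i) {
        if (i < 0 || i >= size) {
            throw new IndexOutOfBoundsException("index " + i);
        }
    }

    public static void main(String[] args) {
        PackedStates states = new PackedStates(1000);
        states.set(7, 2);
        System.out.println(states.get(7)); // prints 2
        System.out.println(states.get(8)); // prints 0
    }
}
```

Each access costs two bit operations instead of one byte read, so whether this beats a plain byte[] depends on how large N is and how memory-bound the algorithm is.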
Theoretically, with 3 boolean arrays you'll need to do:
firstState[n] = false;
secondState[n] = true;
thirdState[n] = false;
every time you want to change the state of the n-th element. That is 3 index lookups and 3 assignments.
With 1 byte array you'll need:
elements[n] = 1;
It's more readable and takes a third of the operations. One more advantage of this solution is that you can easily add as many new states as you want (whereas with boolean arrays you'll need to introduce new arrays).
But I don't think you'll ever see the performance difference.
UPD: actually I'd do it in a more Java-like way (even though you aren't looking for the easy way) and use an array of enums. This will make the code much clearer and will give you some flexibility (maybe in the future you'll decide that OOP is not such a bad thing):
enum ElementState {
FIRST, SECOND, THIRD;
}
ElementState[] elementStates = new ElementState[N];
...
elementStates[i] = ElementState.FIRST;
The JVM second edition spec (http://java.sun.com/docs/books/jvms/second_edition/html/Overview.doc.html) specifies that boolean arrays are encoded as (0,1), but doesn't specify the type used. So a particular JVM may or may not use a single bit - it could use an int.
However, if performance is paramount, using a single byte would in any case seem to be your best option anyway.
EDIT: I incorrectly said that boolean arrays are stored as bit arrays - this is possible but implementation specific.
If you want a guaranteed minimum you could use three java.util.BitSets. These will only use one bit per flag (though you will have the extra object overhead, that may outweigh the benefits if the number of flags is small.) I would say if you have a large number of objects BitSet may be a better alternative, otherwise an array of byte constants or enums will lead to more readable code (and the extra storage shouldn't be a real concern.)
The array of bytes is much better!
In common JVMs such as HotSpot, each element of a boolean array takes 1 byte! So with three arrays you will use 3 bytes for every item, and you can do this with only 1 byte (in theory you can reduce it to a couple of bits; see the other posts).
With a byte array, you can simply set the element to the value you want. With three arrays you have to change the value in every array!
While developing your application, you may find you need an extra state. With three arrays that means creating yet another array, and you then have to change 4 values on every update (see the second point).
So, I hope we persuaded you!
