Java return array versus pass by reference

I'm designing an API function of a class which will return an array for the client to use. But I'm not quite sure whether I should make it a return value or an argument of the function. See below:
Method I:
MyObject[] getMyObject() {... return someObject;}
Method II:
void getMyObject(MyObject[] someObject) {...//assign value to someObject[index]};
In Android API I saw it is very common to return a List<MyObject> or Set<MyObject>. Does it indicate Method I is better? So what are the pros and cons of these two methods in Java?
Update: In Method II I mean to assign a value to someObject[index], not to someObject. My question is not about "does Java pass by reference or by value". It is simply comparing two feasible ways of doing things.

Arrays are not resizeable. So with method 1, you can create a new array with just the right size and return that. With method 2, if the incoming array is the wrong size, you're sunk.
Java does not have pass-by-reference. So assigning something to someObject in method 2 won't do anything for the caller. You can only alter the elements of someObject.
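To make the difference concrete, here is a minimal sketch (the class and method names are made up for illustration). Writing to the elements of the passed array is visible to the caller, while reassigning the parameter is not:

```java
import java.util.Arrays;

class PassByValueDemo {
    // Writing through the reference changes the caller's array.
    static void fillElements(String[] target) {
        for (int i = 0; i < target.length; i++) {
            target[i] = "item" + i;
        }
    }

    // Reassigning the parameter only rebinds the method's local copy of the reference.
    static void reassign(String[] target) {
        target = new String[] {"lost"};
    }

    public static void main(String[] args) {
        String[] a = new String[2];
        fillElements(a);
        System.out.println(Arrays.toString(a)); // prints [item0, item1]

        String[] b = new String[2];
        reassign(b);
        System.out.println(Arrays.toString(b)); // prints [null, null]
    }
}
```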

Both ways have advantages and disadvantages.
Version #1
MyObject[] getMyObject() {... return someObject;}
Pros:
This allows you to return an arbitrary number of results.
It is arguably easier for the caller.
Cons:
The called method has to allocate an array. (Alternatively, it has to manage / recycle arrays, which is going to be difficult to do in general. Note that reusing a static array is liable to make the method non-reentrant, etcetera.)
Version #2
void getMyObject(MyObject[] someObject) {...//assign value to someObject[index]};
Pros:
This is potentially better in terms of objects allocated because the caller will be in a better position to recycle / reuse the array.
It allows you to pass values in ... if that is a requirement.
Cons:
The caller has to provide the array, which makes the method more work to use.
The called method has no control over the array size. That means that there is a potential error case if the supplied array is too small ...
There is also a third way, where an array is passed and returned. If the array size is not correct (or maybe if a null is passed) the called method allocates or reallocates an array. The result is either the original array or the reallocated array.
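A rough sketch of that third way, assuming a hypothetical getItems method and a fixed DATA array standing in for the real data source. The caller must always use the returned array, because it may or may not be the one that was passed in:

```java
import java.util.Arrays;

class ReusableBufferDemo {
    // Hypothetical data source standing in for whatever the API exposes.
    private static final String[] DATA = {"a", "b", "c"};

    // Fills 'buffer' if it is non-null and large enough; otherwise allocates
    // a new array of exactly the right size. The caller must use the return value.
    static String[] getItems(String[] buffer) {
        if (buffer == null || buffer.length < DATA.length) {
            buffer = new String[DATA.length];
        }
        System.arraycopy(DATA, 0, buffer, 0, DATA.length);
        return buffer;
    }

    public static void main(String[] args) {
        String[] big = new String[5];
        System.out.println(getItems(big) == big);    // prints true: buffer reused

        String[] small = new String[1];
        String[] result = getItems(small);
        System.out.println(result == small);         // prints false: reallocated
        System.out.println(Arrays.toString(result)); // prints [a, b, c]
    }
}
```

Note that when the supplied array is larger than needed, this sketch gives the caller no way to tell how many slots were filled; a real API would have to return a count or mark the end somehow.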
Which is better?
IMO, the first version is better under most circumstances because it is easiest to get right. You should only consider the alternatives in an API design if there is a demonstrable need to minimize new object allocation. (If you are coding for a HotSpot Java implementation or equivalent, new object allocation is cheap ...)
Finally, a simpler / cleaner way than all of the above is to use a Collection rather than a bare array. Using a standard Collection type allows you to avoid the messiness of preallocating something of the correct size.

Returning the array is more natural to write and read. Also, pass by "reference", as you call it, has more complications than meets the eye:
someObject[i] = a; // works: the caller sees the new element
someObject = a; // doesn't work: only the local parameter is reassigned

Java has one parameter passing mechanism: everything is passed by value, not by reference.
It's subtle, but true. The implications matter.
You can certainly return any type from that method, be it an array, List, or Set. You may or may not be able to alter the contents of the List or Set, because the implementation underneath might have been made unmodifiable by the developer who wrote the method.
Personally, I tend to prefer the collections over arrays. They are more expressive than raw arrays. If I get a Set back, I know that all the entries are unique in some way.
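As an illustration of the unmodifiable case, here is a small sketch with a made-up Registry class. The getter hands out a read-only view, so callers can read the elements but any attempt to modify the list through the view throws:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Registry {
    private final List<String> names = new ArrayList<String>();

    void add(String name) {
        names.add(name);
    }

    // Returns a read-only view; callers cannot alter the internal list through it.
    List<String> getNames() {
        return Collections.unmodifiableList(names);
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.add("alpha");
        List<String> view = registry.getNames();
        System.out.println(view); // prints [alpha]
        try {
            view.add("beta");
        } catch (UnsupportedOperationException e) {
            System.out.println("read-only view"); // prints read-only view
        }
    }
}
```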

Related

Could java.util.ArrayList<T>.toArray() be made friendlier?

I'm surprised by how painful it is to use java.util.ArrayList<T>.toArray().
Suppose I declare my array list as:
java.util.ArrayList<double[]> arrayList = new java.util.ArrayList<double[]>();
... add some items ...
Then to convert it to an array, I have to do one of the following:
double[][] array = (double[][])arrayList.toArray(new double[0][]);
or:
double[][] array = (double[][])arrayList.toArray(new double[arrayList.size()][]);
or:
double[][] array = new double[arrayList.size()][];
arrayList.toArray(array);
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
But that gives a compile error because Object[] can't be converted to double[][].
Perhaps it's not possible because toArray has to return Object[]
for backwards compatibility with pre-generics days.
But if that's the case, couldn't a friendlier alternative method be added
with a different name? I can't think of a good name, but almost anything
would be better than the existing ways; e.g. the following would be fine:
double[][] array = arrayList.toArrayOfNaturalType();
No such member function exists, but maybe it's possible to write a generic helper function that will do it?
double[][] array = MyToArray(arrayList);
The signature of MyToArray would be something like:
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList)
Is it possible to implement such a function?
My various attempts at implementing it resulted in compile errors
"error: generic array creation" or "error: cannot select from a type variable".
Here's the closest I was able to get:
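@SuppressWarnings("unchecked")
public static <T> T[] MyToArray(java.util.ArrayList<T> arrayList, Class<T> type)
{
    // The cast is unchecked, but safe because the array is created with element type 'type'.
    T[] array = (T[]) java.lang.reflect.Array.newInstance(type, arrayList.size());
    arrayList.toArray(array);
    return array;
}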
It's called like this:
double[][] array = MyToArray(arrayList, double[].class);
I wish the redundant final parameter wasn't there, but, even so,
I think this is the least-horrible way that I've seen so far for converting array list to array.
Is it possible to do any better than this?

Is it possible to do any better than this?
Nope.
None of the above are very readable. Shouldn't I be able to say the following instead?
double[][] array = arrayList.toArray();
It would be nice ... but you can't.
The problem is that the toArray() method was specified way back in Java 1.2 with the behavior you are seeing. Generic types were not added to the language until Java 1.5. When they were added, the designers chose the "type erasure" approach, for compatibility with earlier versions of Java. So:
the semantics of the toArray() methods could not be changed without breaking compatibility, and
type erasure makes it impossible for a toArray() method implementation to know what the list's actual element type is, so it could not get it right anyway.

Unfortunately you cannot write
double[][] array = arrayList.toArray();
The reason is that toArray() was defined in JDK 1.2 (prior to generics) to return Object[]. This cannot be changed compatibly.
Generics were introduced in Java 5 but were implemented using erasure. This means that the ArrayList instance has no knowledge at runtime of the types of objects it contains; therefore, it cannot create an array of the desired element type. That's why you have to pass a type token of some sort -- in this case an actual array instance -- to tell ArrayList the type of the array to create.
You should be able to write
double[][] array = arrayList.toArray(new double[0][]);
without a cast. The one-arg overload of toArray() is generified, so you'll get the right return type.
One might think that it's preferable to pass a pre-sized array instead of a throwaway zero-length array. Aleksey Shipilev wrote an article analyzing this question. The answer is, somewhat counterintuitively, that creating a zero-length array is potentially faster.
Briefly, the reason is that allocation is cheap, a zero-length array is small, and it's probably going to be thrown away and garbage collected quickly, which is also cheap. By contrast, creating a pre-sized array requires it to be allocated and then filled with nulls/zeroes. It's then passed to toArray(), which then fills it with values from the list. Thus, every array element is typically written twice. By passing a zero-length array to toArray(), this allows the array allocation to occur in the same code as the array filling code, providing the opportunity for the JIT compiler to bypass the initial zero-fill, since it knows that every array element will be filled.
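For reference, a complete version of the zero-length-array call looks like the sketch below (ToArrayDemo and its helper are names invented for this example). No cast is needed because the one-argument overload is generic:

```java
import java.util.ArrayList;
import java.util.List;

class ToArrayDemo {
    static double[][] toArray(List<double[]> list) {
        // T is inferred as double[], so the result type is double[][].
        return list.toArray(new double[0][]);
    }

    public static void main(String[] args) {
        List<double[]> list = new ArrayList<double[]>();
        list.add(new double[] {1.0, 2.0});
        list.add(new double[] {3.0});

        double[][] array = toArray(list);
        System.out.println(array.length); // prints 2
        System.out.println(array[1][0]);  // prints 3.0
    }
}
```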
There is also JDK-8060192, which proposed adding the following:
<A> A[] Collection.toArray(IntFunction<A[]> generator)
This lets you pass a lambda expression that is given the array size and returns a created array of that size. (This is similar to Stream.toArray().) For example,
// requires Java 11 or later
double[][] array = arrayList.toArray(n -> new double[n][]);
// or, equivalently, with a constructor reference:
double[][] array = arrayList.toArray(double[][]::new);
This was not available when this answer was first written; it was added as a default method on Collection in Java 11.
You could rewrite your helper function along these lines:
static <T> T[] myToArray(List<T> list, IntFunction<T[]> generator) {
    return list.toArray(generator.apply(list.size()));
}
(Note that there is some subtlety here with concurrent modification of the list, which I'm ignoring for this example.) This would let you write:
double[][] array = myToArray(arrayList, double[][]::new);
which isn't terribly bad. But it's not actually clear that it's any better than just allocating a zero-length array to pass to toArray().
Finally, one might ask why toArray() takes an actual array instance instead of a Class object to denote the desired element type. Joshua Bloch (creator of the Java collections framework) said in comments on JDK-5072831 that this is feasible but that he's not sure it's a good idea, though he could live with it.
There is an additional use case here as well: copying the elements into an existing array, like the old Vector.copyInto() method. The array-bearing toArray(T[]) method also supports this use case. In fact, it's better than Vector.copyInto() because the latter cannot be used safely in the presence of concurrent modification, if the collection's size changes. The auto-sizing behavior of toArray(T[]) handles this, and it also handles the case of creating an array of the caller's desired type as described above. Thus, while adding an overload that takes a Class object would certainly work, it doesn't add much over the existing API.
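A short sketch of both behaviors of toArray(T[]) with ArrayList (the class name is invented for the example). When the array passed in is big enough it is filled and returned, and when it is too small a new one is allocated:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CopyIntoDemo {
    public static void main(String[] args) {
        List<String> list = new ArrayList<String>(Arrays.asList("x", "y"));

        // Large enough: the same array is filled and returned, and the slot
        // just past the last element is set to null.
        String[] roomy = {"?", "?", "?", "?"};
        String[] filled = list.toArray(roomy);
        System.out.println(filled == roomy);        // prints true
        System.out.println(Arrays.toString(roomy)); // prints [x, y, null, ?]

        // Too small: a new array of the same runtime type is allocated.
        String[] tiny = new String[0];
        String[] fresh = list.toArray(tiny);
        System.out.println(fresh == tiny);          // prints false
        System.out.println(Arrays.toString(fresh)); // prints [x, y]
    }
}
```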

A lookup object as an argument of a recursion

We're using a recursion that iterates through tree nodes and does some computation that is a logical equivalent of something as
public static Result iterate(TreeNode node, Dictionary dictionary) {
    Map<String, Result> accumulated = new HashMap<String, Result>();
    for (TreeNode child : node.getChildren()) {
        Result partialResult = iterate(child, dictionary);
        accumulated.put(child.getId(), partialResult);
    }
    return completeResult(accumulated);
}
Now the Dictionary object is not mutated while the recursion is being done. It's simply used as a lookup table. The object is in fact quite big.
Does the fact that we have the dictionary as an argument of our recursive call have a negative impact on the memory/performance? Is this a bad design?

The really interesting issue is: "How is the Dictionary related to the Tree?"
If several Dictionaries need to be used with different iterations, you would indeed pass a Dictionary as a parameter to the iterate method, as you have it right now. (But why is "iterate" static?)
If a Dictionary is a stable property associated with some specific Tree object, its reference should be passed to the constructor and stored as an instance field. The iterate being a method could access it as any other instance field.
Possibly the Dictionary is universal and unique for all Tree objects? Then you might advocate setting the Dictionary as a static class field, and your iterate method would access it as a "global".
Technically, all of the above just passes a reference ("address") around; no copying of a potentially huge object is involved...
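Here is a rough sketch of the instance-field variant. Node, TreeWalker and the use of a Map as the lookup table are all stand-ins, since the original TreeNode, Dictionary and Result types are not shown:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TreeWalker {
    // Hypothetical node type; the original TreeNode is not shown.
    static class Node {
        final String id;
        final List<Node> children = new ArrayList<Node>();

        Node(String id) {
            this.id = id;
        }
    }

    // The lookup table is handed over once; only a reference to it is stored.
    private final Map<String, Integer> dictionary;

    TreeWalker(Map<String, Integer> dictionary) {
        this.dictionary = dictionary;
    }

    // Sums the dictionary values of a node and all of its descendants.
    int iterate(Node node) {
        Integer own = dictionary.get(node.id);
        int total = (own == null) ? 0 : own;
        for (Node child : node.children) {
            total += iterate(child);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> dict = new HashMap<String, Integer>();
        dict.put("root", 1);
        dict.put("leaf", 10);

        Node root = new Node("root");
        root.children.add(new Node("leaf"));
        root.children.add(new Node("leaf"));

        System.out.println(new TreeWalker(dict).iterate(root)); // prints 21
    }
}
```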

I would say your design is correct, in that it should produce correct results. For its performance, you would really need to do some thorough testing to assess, with various combinations of sizes for your tree structure and dictionary. Also, the implementation of Dictionary will probably play a major role in the performance characteristics.
Memory-wise, your current implementation should be the most economical, as you use the existing structures, instead of copying to others, in order to use a faster algorithm.
Passing the dictionary as an argument has the benefit of isolating each recursive run, in the case that the dictionary can change between runs, and provided that you copy the dictionary for each run. Also, it gives you the capability of using the same code to do concurrent searches (using threads) on different trees using different dictionaries. Using a global dictionary wouldn't allow you to do this.

I think this question boils down to whether Java passes by reference or value. Somewhat confusingly Java always passes by value, but where an object is passed the value is the object reference.
So for your example the method iterate takes a parameter Dictionary dictionary. The internals of this object will be stored on the heap. This is an area of memory that is shared among all objects. Additionally your method will have its own unique reference on the stack. The reference acts as a kind of pointer so your method can look up the values of dictionary.
When you make the recursive call the JVM will make a new reference to the same dictionary object and put this reference on the stack for the new method call. So now you have two calls to iterate on the call stack, both with their own individual reference to the dictionary object, but only one actual dictionary object on the heap.
If you were to make changes to the dictionary object using either reference it would update the same underlying object so both methods would see these changes.
When the method returns, the dictionary reference, being local to the method, is removed from the stack. The JVM does not count references; instead, an object becomes eligible for garbage collection once it is no longer reachable from any live reference.
Back to your question about memory: I don't think you need to worry. It's the object on the heap where all of the data lives. References are cheap by comparison (typically 4 or 8 bytes each, depending on the JVM and its settings). Each reference will in theory take up a little memory, but you are only likely to hit problems if your recursion doesn't terminate.
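If you want to convince yourself that no copy is made, a tiny experiment like the following works. The nested Dictionary class here is a made-up stand-in for the large lookup object, not java.util.Dictionary:

```java
class SharedReferenceDemo {
    // Hypothetical stand-in for the large, read-only lookup object.
    static class Dictionary {
        final int[] payload = new int[1000000];
    }

    // Recurses 'depth' times and reports whether every call saw the same object.
    static boolean sameObjectAtEveryLevel(Dictionary dictionary, Dictionary original, int depth) {
        if (dictionary != original) {
            return false;
        }
        if (depth == 0) {
            return true;
        }
        return sameObjectAtEveryLevel(dictionary, original, depth - 1);
    }

    public static void main(String[] args) {
        Dictionary dict = new Dictionary();
        // 100 nested calls, but still only one Dictionary instance on the heap.
        System.out.println(sameObjectAtEveryLevel(dict, dict, 100)); // prints true
    }
}
```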

How Java handles dynamic Boolean array as method parameter?

I'm using a class that has a method that accepts a boolean[].
This code does not raise any errors
public class myclass {
    void move(boolean[] myarray) {
        // Do stuff
    }
}
Now, I do a little C++ coding, and this would not work in the context of dynamic memory.
So this is essentially a java question:
In my case the array being received has a known length, but I want to know how you would handle this in Java if it is dynamic (as well as what I should do if it's not dynamic).
I'm guessing the compiler or JVM is going to handle this, but I want to know the speed optimizations I can implement.

Arrays in Java are always constant length. From The Java Tutorials, "The length of an array is established when the array is created."
If you wanted dynamic arrays, you'd use something from the Collections Framework, e.g. ArrayList.
In any case, a reference to the array (or collection) is passed into move(...), so there shouldn't be any difference in speed just for the function call.
When using the array, I'd expect (static) arrays to be dereferenced more quickly than going through the function calls to access elements of (dynamic) collections. However, to have a proper comparison, you'd need to provide more context of how your array is used.
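For example, the following sketch (names invented for illustration) accepts a boolean[] of any length, because the length is part of the array object, and shows a List<Boolean> overload for the case where the number of elements grows over time:

```java
import java.util.ArrayList;
import java.util.List;

class MoveDemo {
    // Works for any array length; the length travels with the array object.
    static int countTrue(boolean[] flags) {
        int count = 0;
        for (boolean flag : flags) {
            if (flag) {
                count++;
            }
        }
        return count;
    }

    // A growable alternative when the number of elements is not known up front.
    static int countTrue(List<Boolean> flags) {
        int count = 0;
        for (boolean flag : flags) {
            if (flag) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = args.length + 3; // size decided at run time
        boolean[] array = new boolean[n];
        array[0] = true;
        System.out.println(countTrue(array)); // prints 1

        List<Boolean> list = new ArrayList<Boolean>();
        list.add(true);
        list.add(false);
        list.add(true);
        System.out.println(countTrue(list)); // prints 2
    }
}
```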

You should consider using ArrayList<>() for all your needs related to iterating arbitrary length collections.
Also, using List is good practice in the Java world. There is an article about programmers who use lists versus arrays, and those who use lists tend to produce fewer bugs.

Large array of 'int' type needs to be passed to a generic array & collections

I am generating large arrays (size > 1000) with elements of int type from a function. I need to pass such an array to a generic-type array, but since a generic-type array doesn't accept arrays of primitive type, I am unable to do so.
I'm wary of using an Integer array since it will be costly in terms of creation, performance, and space (an array of 12-byte objects) for large arrays. Moreover, it will create immutable Integers when I need to perform addition operations on the array elements.
What would be the best way to go with ?
EDIT: Just to clear up some confusion, I need to pass an int[] to a method with the signature void setKeys(K... keys).

I want to pass an int[] to this function: public Query<K> setKeys(K... keys);
I assume that you mean that int[] should be the set of keys ... not just one key.
That is impossible. The type parameters of a generic type have to be reference types. Your use-case requires K to be an int.
You have two choices:
use Integer (or a mutable int holder class) and pay the performance penalty, or
forgo the use of generics and change the signature of that method.
Incidentally, the Integer class keeps a cache of Integer objects for small int values (-128 to 127 by default). If you create your objects using Integer.valueOf(int) there's a good chance that you will get a reference to a pre-existing object. (Of course, this only works because Integer objects are immutable.)
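A quick way to see the cache at work is the sketch below. It only relies on the guaranteed range of -128 to 127; outside that range object identity is not guaranteed either way, so the example compares with equals() there:

```java
class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer a = Integer.valueOf(100);
        Integer b = Integer.valueOf(100);
        // Values from -128 to 127 are always cached, so this is the same object.
        System.out.println(a == b); // prints true

        Integer c = Integer.valueOf(100000);
        Integer d = Integer.valueOf(100000);
        // Outside the cached range identity is not guaranteed; compare with equals().
        System.out.println(c.equals(d)); // prints true
    }
}
```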

If your arrays are on the order of 1000 (or even 10,000 or 100,000) elements, the cost difference in terms of memory and performance probably wouldn't be noticeable unless you're processing the arrays thousands of times each. Write the code with Integer and optimize later if you have performance problems.

If you're that concerned about performance, you could write a simple class that wraps a public int, thus meaning you can make your call and still mutate it as needed. Having said that, I do agree that you want to make absolute sure you need this performance improvement before doing it.
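Something along these lines, as a sketch only. IntBox is a made-up holder class, and the nested Query<K> is a minimal stand-in for the generic class in the question, whose real definition is not shown:

```java
class MutableIntDemo {
    // Hypothetical holder class: a reference type wrapping a mutable int.
    static final class IntBox {
        int value;

        IntBox(int value) {
            this.value = value;
        }
    }

    // Minimal stand-in for the generic Query<K> from the question.
    static class Query<K> {
        private K[] keys;

        Query<K> setKeys(K... keys) {
            this.keys = keys;
            return this;
        }

        K key(int index) {
            return keys[index];
        }
    }

    // Wraps each primitive in a box so the array can be passed as K[].
    static IntBox[] box(int[] values) {
        IntBox[] boxes = new IntBox[values.length];
        for (int i = 0; i < values.length; i++) {
            boxes[i] = new IntBox(values[i]);
        }
        return boxes;
    }

    public static void main(String[] args) {
        Query<IntBox> query = new Query<IntBox>().setKeys(box(new int[] {1, 2, 3}));
        query.key(0).value += 10; // mutate in place, no new object is created
        System.out.println(query.key(0).value); // prints 11
    }
}
```

The boxes still cost one object per element, so this mainly helps when the values are updated often after creation.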

If you actually do need to worry about the performance implications of boxing/unboxing integers, you could consider GNU Trove, specifically their TIntArrayList. It lets you mimic the functionality of an ArrayList<Integer> while being backed by primitives. That said, I'm not certain you need this, and I'm not certain this is exactly what you are looking for.

If you don't want the integers permanently boxed, you could pass in the result of Ints.asList() from Guava, formerly the Google Collections library (http://guava-libraries.googlecode.com/svn/tags/release08/javadoc/com/google/common/primitives/Ints.html#asList(int...)), which would be a List<Integer> backed by the array. The values will get boxed as they're accessed, so this only makes sense if the values are not being accessed lots of times.

Java: How do I implement a method that takes 2 arrays and returns 2 arrays?

Okay, here is what I want to do:
I want to implement a crossover method for arrays.
It is supposed to take 2 arrays of same size and return two new arrays that are a kind of mix of the two input arrays.
as in [a,a,a,a] [b,b,b,b] ------> [a,a,b,b] [b,b,a,a].
Now I wonder what would be the suggested way to do that in Java, since I cannot return more than one value.
My ideas are:
- returning a Collection (or array) containing both new arrays.
I don't really like that one because I think it would result in harder-to-understand code.
- avoiding the need to return two results by calling the method for each case but only getting one of the results each time.
I don't like that one either, because there would be no natural order determining which solution should be returned. This would need to be specified, again resulting in harder-to-understand code.
Plus, this will work only for this basic case, but I will want to shuffle the array before the crossover and reverse that afterwards. I cannot do the shuffling in isolation from the crossover since I don't want to actually perform the operation; instead I want to use the information about the permutation while doing the crossover, which I think will be more efficient.
My question is not about the algorithm itself, but about the way to put it in a method (concerning input and output) in Java.

Following a suggestion from Bruce Eckel's book Thinking in Java, in my Java projects I frequently include some utility classes for wrapping groups of two or three objects. They are trivial and handy, especially for methods that must return several objects. For example:
public class Pair<TA, TB> {
    public final TA a;
    public final TB b;

    /**
     * Factory method.
     */
    public static <TA, TB> Pair<TA, TB> createPair(TA a, TB b) {
        return new Pair<TA, TB>(a, b);
    }

    /**
     * Private constructor - use the factory method instead.
     */
    private Pair(final TA a, final TB b) {
        this.a = a;
        this.b = b;
    }

    @Override
    public String toString() {
        return "(" + a + ", " + b + ")";
    }
}

Read the last section of this article:
http://www.yoda.arachsys.com/java/passing.html
To quote:
This is the real reason why pass by reference is used in many cases - it allows a method to effectively have many return values. Java doesn't allow multiple "real" return values, and it doesn't allow pass by reference semantics which would be used in other single-return-value languages. However, here are some techniques to work around this:
If any of your return values are status codes that indicate success or failure of the method, eliminate them immediately. Replace them with exception handling that throws an exception if the method does not complete successfully. The exception is a more standard way of handling error conditions, can be more expressive, and eliminates one of your return values.
Find related groups of return values, and encapsulate them into objects that contain each piece of information as fields. The classes for these objects can be expanded to encapsulate their behavior later, to further improve the design of the code. Each set of related return values that you encapsulate into an object removes return values from the method by increasing the level of abstraction of the method's interface. For instance, instead of passing co-ordinates X and Y by reference to allow them to be returned, create a mutable Point class, pass an object reference by value, and update the object's values within the method.
As a bonus, this section was updated by Jon Skeet :)

If it is reasonable for the caller to know the size of the returned arrays ahead of time, you could pass them into the method:
public void foo(Object[] inOne, Object[] inTwo, Object[] outOne, Object[] outTwo) {
    // etc.
}
That being said, 90+% of the time multiple return values out of a method are hiding a better design. My solution would be to make the transformation inside an object:
public class ArrayMixer {
    private Object[] one;
    private Object[] two;

    public ArrayMixer(Object[] first, Object[] second) {
        // Mix the arrays in the constructor and assign to one and two.
    }

    public Object[] getOne() { return one; }
    public Object[] getTwo() { return two; }
}
I suspect that in your real use case that class and array one and array two can get better names.

Since the specification of your method is that it takes two input arrays and produces output arrays, I agree with you that the method should return both arrays at the same time.
I think that the most natural choice of return value is an int[][] of length 2 (substitute int with whatever type you are using). I don't see any reason it should make the code harder to understand, especially if you specify what the contents of the return value will be.
Edit: in response to your comment, I understand that you have considered this and I am saying that despite your stylistic objections, I don't believe there is a strictly "better" alternative ("better" here being loosely defined in the question).
An alternative approach, largely equivalent to this one, would be to define an object that wraps the two arrays. This has the small distinction of being able to refer to them by names rather than array indices.
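A minimal sketch of the int[][] approach, assuming a simple midpoint crossover as in the question's example (the class and method names are made up). The documented contract is that index 0 holds the child starting with the first parent's head and index 1 holds the mirror image:

```java
import java.util.Arrays;

class CrossoverArrays {
    // Returns {childA, childB}: childA takes the first half of a and the
    // second half of b; childB is the mirror image. Inputs are not modified.
    static int[][] crossover(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int cut = a.length / 2;
        int[] childA = Arrays.copyOf(a, a.length);
        int[] childB = Arrays.copyOf(b, b.length);
        System.arraycopy(b, cut, childA, cut, a.length - cut);
        System.arraycopy(a, cut, childB, cut, a.length - cut);
        return new int[][] {childA, childB};
    }

    public static void main(String[] args) {
        int[][] children = crossover(new int[] {1, 1, 1, 1}, new int[] {2, 2, 2, 2});
        System.out.println(Arrays.toString(children[0])); // prints [1, 1, 2, 2]
        System.out.println(Arrays.toString(children[1])); // prints [2, 2, 1, 1]
    }
}
```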

The best way to do it would be to do
public void doStuff(int[] array1, int[] array2) {
    // Put code here
}
Since Java passes a reference to the array (by value), any modifications made to the array's elements inside the method are made to the caller's array itself. This has several caveats:
If you are setting it to null you must use a different way (such as encapsulating it in an object)
If you are initializing the array (in the method), you must use a different way
You would use this in the format:
// other method
int[] array1 = new int[20]; // the arrays can be whatever size
int[] array2 = new int[20];
doStuff(array1,array2);
// do whatever you need to with the arrays
Edit: This makes the assumption that it is okay to make changes to the input arrays.
If it isn't, then an object (such as in leonbloy's answer) is definitely what is called for.

You strictly cannot return more than one value (think object or primitive) in Java. Maybe you could return an instance of a specific "Result" object which has the two arrays as properties?

You could pass the output arrays as parameters to the method. This may give you more control over memory allocation for the arrays too.

The cleanest and easiest to understand way would be to create a container bean that contains two arrays, and return the container from the method. I'd probably also pass in the container into the method, to keep it symmetric.
The most memory efficient way, assuming both arrays are the same length, would be to pass a multidimensional array - Object[2][n] - where n is the length of the arrays.

If you're really against the arbitrary ordering that comes from a 2d array or a collection, perhaps consider making an inner class that reflects the logic of what you're doing. You could simply define a class that holds two arrays and you could have your method return that, with names and function that reflect the logic of exactly what you're doing.
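As a sketch of that idea, with Offspring as a made-up name for the result class and a plain midpoint crossover standing in for the real algorithm:

```java
class Crossover {
    // Hypothetical result type; the field names document which array is which.
    static final class Offspring {
        final char[] first;
        final char[] second;

        Offspring(char[] first, char[] second) {
            this.first = first;
            this.second = second;
        }
    }

    // Swaps the tails of the two parents at the midpoint. Inputs are not modified.
    static Offspring cross(char[] parentA, char[] parentB) {
        if (parentA.length != parentB.length) {
            throw new IllegalArgumentException("parents must have the same length");
        }
        int cut = parentA.length / 2;
        char[] first = parentA.clone();
        char[] second = parentB.clone();
        for (int i = cut; i < parentA.length; i++) {
            first[i] = parentB[i];
            second[i] = parentA[i];
        }
        return new Offspring(first, second);
    }

    public static void main(String[] args) {
        Offspring o = cross("aaaa".toCharArray(), "bbbb".toCharArray());
        System.out.println(new String(o.first));  // prints aabb
        System.out.println(new String(o.second)); // prints bbaa
    }
}
```

Compared with an int[][] or a generic pair, the caller reads result.first and result.second instead of remembering which index means what.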

A simple solution to the above problem is to return a Map. The trick is how you define the keys that identify the arrays. Let's say there are two
input arrays [a,a,a,a] [b,b,b,b] and two output arrays [a,a,b,b] [b,b,a,a].
For that you can use String keys to identify the arrays; Strings are immutable, so they are safe to use as map keys.
For example:
Map<String, String[]> method(String[] x, String[] y) {
    // do your stuff...
    Map<String, String[]> map = new HashMap<String, String[]>();
    map.put("Object1", new String[] {"a", "a", "b", "b"});
    map.put("Object2", new String[] {"b", "b", "a", "a"});
    return map;
}
