Passing a collection of JNA structures to native method - java

The Problem
I am attempting to pass a collection of JNA structures to a native method but it's proving very fiddly:
Let's say we have a structure:
class MyStructure extends Structure {
// fields...
}
and a method in a JNA interface:
void pass(MyStructure[] data);
which maps to the native method:
void pass(const MyStructure* data);
Now the complication comes from the fact that the application is building a collection of these structures dynamically, i.e. we are NOT dealing with a static array but something like this:
class Builder {
private final Collection<MyStructure> list = new ArrayList<>();
// Add some data
public void add(MyStructure entry) {
list.add(entry);
}
// Pass the data to the native library
public void pass() {
// TODO
}
}
A naive implementation of the pass() method could be:
MyStructure[] array = list.toArray(MyStructure[]::new);
api.pass(array);
(where api is the JNA library interface).
Of course this doesn't work because the array is not a contiguous block of memory - fair enough.
Rubbish Solution #1
One solution is to allocate a JNA array from a structure instance and populate it field-by-field:
MyStructure[] array = (MyStructure[]) new MyStructure().toArray(size);
for(int n = 0; n < array.length; ++n) {
array[n].field = list.get(n).field;
// other fields...
}
This guarantees that the array consists of contiguous memory. But we have had to implement a field-by-field copy of the data (which we have already populated in the list). This is OK for a simple structure, but some of the data I am dealing with has dozens of fields, structures that point to further nested arrays, etc. Basically this approach is just not viable.
Rubbish Solution #2
Another alternative is to convert the collection of data to a simple JNA pointer, something along these lines:
MyStructure[] array = list.toArray(MyStructure[]::new);
int size = array[0].size();
Memory mem = new Memory(array.length * size);
for(int n = 0; n < array.length; ++n) {
if(array[n] != null) {
array[n].write();
byte[] bytes = array[n].getPointer().getByteArray(0, size);
mem.write(n * size, bytes, 0, bytes.length);
}
}
This solution is generic, so we can apply it to other structures as well. But we have to change the method signatures to be Pointer instead of MyStructure[], which makes the code more obtuse, less self-documenting and harder to test. Also, we could be using a third-party library where this might not even be an option.
(Note: I asked a similar question a while ago but didn't get a satisfactory answer; I thought I'd try again and will delete the old one / answer both.)
Summary
Basically I was expecting/hoping to have something like this:
MyStructure[] array = MyStructure.magicContiguousMemoryBlock(list.toArray());
similar to how the JNA helper class provides StringArray for an array-of-string:
StringArray array = new StringArray(new String[]{...});
But no such 'magic' exists as far as I can tell. Is there another, simpler and more 'JNA' way of doing it? It seems really dumb (and probably incorrect) to have to allocate a byte-by-byte copy of the data that we essentially already have!
Do I have any other options? Any pointers (pun intended) gratefully accepted.

As the author of the previous answer, I realize a lot of the confusion came from approaching it one way before realizing a better solution, which we discussed primarily in comments on your answer. I will try to answer this additional clarification with an actual demonstration of my suggestion from that answer, which I think is the best approach. Simply put: if you have a non-contiguous structure and need a contiguous one, you must either bring the contiguous memory to the structure, or copy the structure to the contiguous memory. I'll outline both approaches below.
Is there another, simpler and more 'JNA' way of doing it? It seems really dumb (and probably incorrect) to have to allocate a byte-by-byte copy of the data that we essentially already have!
I did mention in my answer to the other question that you could use useMemory() in this situation. It is a protected method, but if you are already extending Structure you have access to it from the subclass (your structure), in much the same way (and for precisely the same purpose) as you would expose the Pointer constructor in a subclass.
You could therefore take an existing structure in your collection and change its native backing memory to be the contiguous memory. Here is a working example:
public class Test {
@FieldOrder({ "a", "b" })
public static class Foo extends Structure {
public int a;
public int b;
// You can either override or create a separate helper method
@Override
public void useMemory(Pointer m) {
super.useMemory(m);
}
}
public static void main(String[] args) {
List<Foo> list = new ArrayList<>();
for (int i = 1; i < 6; i += 2) {
Foo x = new Foo();
x.a = i;
x.b = i + 1;
list.add(x);
}
Foo[] array = (Foo[]) list.get(0).toArray(list.size());
// Index 0 copied on toArray()
System.out.println(array[0].toString());
// but we still need to change backing memory for it to the copy
list.get(0).useMemory(array[0].getPointer());
// iterate to change backing and write the rest
for (int i = 1; i < array.length; i++) {
list.get(i).useMemory(array[i].getPointer());
list.get(i).write();
// Since sending the structure array as an argument will auto-write,
// it's necessary to sync it here.
array[i].read();
}
// At this point you could send the contiguous structure array to native.
// Both list.get(n) and array[n] point to the same memory, for example:
System.out.println(list.get(1).toString());
System.out.println(array[1].toString());
}
Output (note the contiguous allocation). The last two outputs show the same values, read from either the list or the array.
Test$Foo(allocated#0x7fb687f0d550 (8 bytes) (shared from auto-allocated#0x7fb687f0d550 (24 bytes))) {
int a#0x0=0x0001
int b#0x4=0x0002
}
Test$Foo(allocated#0x7fb687f0d558 (8 bytes) (shared from allocated#0x7fb687f0d558 (8 bytes) (shared from allocated#0x7fb687f0d558 (8 bytes) (shared from allocated#0x7fb687f0d550 (8 bytes) (shared from auto-allocated#0x7fb687f0d550 (24 bytes)))))) {
int a#0x0=0x0003
int b#0x4=0x0004
}
Test$Foo(allocated#0x7fb687f0d558 (8 bytes) (shared from allocated#0x7fb687f0d558 (8 bytes) (shared from allocated#0x7fb687f0d550 (8 bytes) (shared from auto-allocated#0x7fb687f0d550 (24 bytes))))) {
int a#0x0=0x0003
int b#0x4=0x0004
}
If you don't want to put useMemory in every one of your structure definitions you can still put it in an intermediate class that extends Structure and then extend that intermediate class instead of Structure.
If you don't want to override useMemory() in your structure definitions (or a superclass of them), you can still do it "simply" in code with a little bit of inefficiency by copying over the memory.
In order to "get" that memory to write it elsewhere, you have to either read it from the Java-side memory (via reflection, which is what JNA does to convert the structure to the native memory block), or read it from native-side memory (which requires writing it there, even if all you want to do is read it). Under the hood, JNA writes the native bytes field-by-field, all hidden under a simple write() call in the API.
Your "Rubbish Solution #2" seems close to what's desired in this case. Here are the constraints that we have to deal with, with whatever solution:
In the existing list or array of Structure, the native memory is not contiguous (unless you pre-allocate contiguous memory yourself, and use that memory in a controlled manner, or override useMemory() as demonstrated above), and the size is variable.
The native function taking an array argument expects a block of contiguous memory.
Here are the "JNA ways" of dealing with structures and memory:
Structures have native-allocated memory at a pointer value accessible via Structure.getPointer() with a size of (at least) Structure.size().
Structure native memory can be read in bulk using Structure.getByteArray().
Structures can be constructed from a pointer to native memory using the new Structure(Pointer p) constructor.
The Structure.toArray() method creates an array of structures backed by a large, contiguous block of native memory.
I think your solution #2 is a rather efficient way of doing it, but your question indicates you'd like more type safety, or at least self-documenting code, in which case I'd point out a more "JNA way" of modifying #2 with two steps:
Replace the new Memory(array.length * size) native allocation with the Structure.toArray() allocation from your solution #1.
You still have a length * size block of contiguous native memory and a pointer to it (array[0].getPointer()).
You additionally have pointers to the offsets, so you could replace mem.write(n * size, ... ) with array[n].getPointer().write(0, ... ).
There is no getting around the memory copying, but having two well-commented lines which call getByteArray() and immediately write() that byte array seem clear enough to me.
You could even one-line it... write(0, getByteArray(0, size), 0, size), although one might argue whether that's more or less clear.
So, adapting your method #2, I'd suggest:
// Make your collection an array as you do, but you could just keep it in the list
// using `size()` and `list.get(n)` rather than `length` and `array[n]`.
MyStructure[] array = list.toArray(MyStructure[]::new);
// Allocate a contiguous block of memory of the needed size
// This actually writes the native memory for index 0,
// so you can start the below iteration from 1
MyStructure[] structureArray = (MyStructure[]) array[0].toArray(array.length);
// Iterate the contiguous memory and copy over bytes from the array/list
int size = array[0].size();
for(int n = 1; n < array.length; ++n) {
if(array[n] != null) {
// sync local structure to native (using reflection on fields)
array[n].write();
// read bytes from the non-contiguous native memory
byte[] bytes = array[n].getPointer().getByteArray(0, size);
// write bytes into the contiguous native memory
structureArray[n].getPointer().write(0, bytes, 0, bytes.length);
// sync native to local (using reflection on fields)
structureArray[n].read();
}
}
From a "clean code" standpoint I think this rather effectively accomplishes your goal. The one "ugly" part of the above method is that JNA doesn't provide an easy way to copy fields between Structures without writing them to native memory in the process. Unfortunately that's the "JNA way" of "serializing" and "deserializing" objects, and it's not designed with any "magic" for your use case. Strings include built-in methods to convert to bytes, making such "magic" methods easier.
It is also possible to avoid writing the structure to native memory just to read it back again if you do the field-by-field copy as you implied in your Method #1. However, you could use JNA's field accessors to make it a lot easier to access the reflection under the hood. The field methods are protected so you'd have to extend Structure to do this -- which if you're doing that, the useMemory() approach is probably better! But you could then pull this iteration out of write():
for (StructField sf : fields().values()) {
// do stuff with sf
}
My initial thought would be to iterate over the non-contiguous Structure's fields using the above loop, storing a copy of each field's value in a HashMap with sf.name as the key. Then perform the same iteration over the other (contiguous) Structure's fields, reading from the HashMap and setting their values.
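As a rough illustration of that field-by-field copy, here is a minimal sketch using plain Java reflection over public fields. It is shown on an ordinary class (`Foo` below is a stand-in, not a real Structure); since a JNA Structure exposes its mapped fields as public fields, the same loop applies, though JNA's own StructField bookkeeping would be the more robust route:

```java
import java.lang.reflect.Field;

public class FieldCopyDemo {
    // hypothetical helper: copy every public field from src to dst
    static <T> void copyPublicFields(T src, T dst) throws IllegalAccessException {
        for (Field f : src.getClass().getFields()) {
            f.set(dst, f.get(src));
        }
    }

    // stands in for a Structure subclass with public mapped fields
    public static class Foo {
        public int a;
        public int b;
    }

    public static void main(String[] args) throws Exception {
        Foo src = new Foo();
        src.a = 3;
        src.b = 4;
        Foo dst = new Foo();
        copyPublicFields(src, dst);
        System.out.println(dst.a + "," + dst.b); // prints 3,4
    }
}
```

Note this copies references for non-primitive fields, so nested Pointer or Structure fields would still be shared rather than deep-copied.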

If you are able to create a contiguous block of memory, why not simply deserialize your list into it?
I.e. something like:
MyStructure[] array = (MyStructure[]) list.get(0).toArray(list.size());
list.toArray(array);
pass(array);
In any case, you'd better not store Structure in your List or any other collection. It is a better idea to hold a POJO inside, and then remap it to an array of structures directly, using a bean mapping library or manually.
With the MapStruct bean mapping library it may look like:
@Mapper
public interface FooStructMapper {
FooStructMapper INSTANCE = Mappers.getMapper( FooStructMapper.class );
void update(FooBean src, @MappingTarget MyStruct dst);
}
MyStructure[] block = (MyStructure[]) new MyStructure().toArray(list.size());
for(int i=0; i < block.length; i++) {
FooStructMapper.INSTANCE.update(list.get(i), block[i]);
}
The point is that the Structure constructor allocates its memory block using Memory, which is a really slow operation. Moreover, that memory is allocated outside the Java heap. It is always better to avoid this allocation whenever you can.

The solutions offered by Daniel Widdis will solve this 'problem' if one really needs to perform a byte-by-byte copy of a JNA structure.
However, I have come round to the way of thinking expressed by some of the other posters: JNA structures are intended purely for marshalling to/from the native layer and should not really be used as 'data'. We should be defining domain POJOs and transforming those to JNA structures as required. A bit more work, but something we have to deal with I guess.
EDIT: Here is the solution that I eventually implemented using a custom stream collector:
public class StructureCollector <T, R extends Structure> implements Collector<T, List<T>, R[]> {
/**
* Helper - Converts the given collection to a contiguous array referenced by the <b>first</b> element.
* @param <T> Data type
* @param <R> Resultant JNA structure type
* @param data Data
* @param identity Identity constructor
* @param populate Population function
* @return <b>First</b> element of the array
*/
public static <T, R extends Structure> R toArray(Collection<T> data, Supplier<R> identity, BiConsumer<T, R> populate) {
final R[] array = data.stream().collect(new StructureCollector<>(identity, populate));
if(array == null) {
return null;
}
else {
return array[0];
}
}
private final Supplier<R> identity;
private final BiConsumer<T, R> populate;
private final Set<Characteristics> chars;
/**
* Constructor.
* @param identity Identity structure
* @param populate Population function
* @param chars Stream characteristics
*/
public StructureCollector(Supplier<R> identity, BiConsumer<T, R> populate, Characteristics... chars) {
this.identity = notNull(identity);
this.populate = notNull(populate);
this.chars = Set.copyOf(Arrays.asList(chars));
}
@Override
public Supplier<List<T>> supplier() {
return ArrayList::new;
}
@Override
public BiConsumer<List<T>, T> accumulator() {
return List::add;
}
@Override
public BinaryOperator<List<T>> combiner() {
return (left, right) -> {
left.addAll(right);
return left;
};
}
@Override
public Function<List<T>, R[]> finisher() {
return this::finish;
}
@SuppressWarnings("unchecked")
private R[] finish(List<T> list) {
// Check for empty data
if(list.isEmpty()) {
return null;
}
// Allocate contiguous array
final R[] array = (R[]) identity.get().toArray(list.size());
// Populate array
final Iterator<T> itr = list.iterator();
for(final R element : array) {
populate.accept(itr.next(), element);
}
assert !itr.hasNext();
return array;
}
@Override
public Set<Characteristics> characteristics() {
return chars;
}
}
This nicely wraps up the code that allocates and populates a contiguous array, example usage:
class SomeDomainObject {
private void populate(SomeStructure struct) {
...
}
}
class SomeStructure extends Structure {
...
}
Collection<SomeDomainObject> collection = ...
SomeStructure[] array = collection
.stream()
.collect(new StructureCollector<>(SomeStructure::new, SomeDomainObject::populate));
Hopefully this might help anyone that's doing something similar.

Related

Process multi dimensional arrays in Java

What is the best way to take in a multi-dimensional array as a method parameter in the form of an object and then reconstruct it as a variable inside that method? The reason I want to pass the array in as an object is that I want my code to be able to use any n-dimensional array. I could circumvent this by using method overloading, but making hundreds of identical methods just to account for all possible array dimensions seems like a very bad way to do it. However, using an object as a parameter causes a new set of challenges, since I have no way to initialize that array given that you normally need to explicitly declare an array's dimensions. Based on some of my research, I have figured out a way to determine the dimensions of an array passed in as an object, which you can view in the following code snippet.
public static void callTestArray() {
var matrix = new int[][]{{1,2}, {4, 6, 7}};
test(matrix);
}
public static void test(Object obj) {
final int dimensions = dimensionOf(obj);
System.out.println("Dimensions:" + dimensions);
//I can't create a variable from this though since I need to hard code the dimensions of the array
}
/**
* This returns the number of dimensions an array has.
*/
public static int dimensionOf(Object arr) {
int dimensionCount = 0;
Class<?> c = arr.getClass(); // getting the runtime class of an object
while (c.isArray()) { // check whether the object is an array
c = c.getComponentType(); // returns the class denoting the component type of the array
dimensionCount++;
}
return dimensionCount;
}
I have been looking around for a while now but I can't find a class that lets me pass in any n-dimensional array and easily access all of an array's typical information. Was this not included in Java, or am I just missing it? That being said, since 255 is the maximum number of dimensions an array can have, I could make my own utils class to handle this, but it would require a ton of redundancy and effort to handle all cases. I just want to make sure it has not already been made before I waste hours making something like that. Also, if anyone has a better way of doing it with any internal Java libraries, please let me know!
Instead of passing around arrays we more often than not use collections like ArrayList, this allows us some abstraction and allows us to add some common methods to it. Note that ArrayList doesn't extend arrays, it simply implements a list interface.
I recommend the same thing for you, instead of passing around an array, consider encapsulating the array in a class and pass that class around. Use the class to do certain simplifications, for instance you might have a method allowing it to apply a function to each element of the matrix or one to resize the matrix.
You might track your matrix's dimensions in different variables allowing you to resize it without re-allocating the array (like an ArrayList does)
Another advantage of the encapsulation, if you wish to do something different like make a sparse matrix out of it, you could re-implement the underlying code without changing the ways it's used (Like the way ArrayList and LinkedList have the same interface but do things different ways for different use cases)
Your other conditions seem to work for this Matrix object as well as they would for arrays; for instance, you would pass dimensions into the constructor to create it initially (although, as I said, you could easily expand it later, especially if you used an ArrayList of ArrayLists for your underlying implementation, if you needed that).
I think the reason it's not included in Java is that it is not very commonly used and quite easy to implement, but if you really don't want to do it yourself, Apache has a Matrix implementation that looks like it will fit.
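A minimal sketch of the kind of wrapper described above (the class and method names are illustrative, not from any library):

```java
import java.util.function.IntUnaryOperator;

public class IntMatrix {
    private final int[][] data;

    public IntMatrix(int rows, int cols) {
        this.data = new int[rows][cols];
    }

    public int rows() { return data.length; }
    public int cols() { return data.length == 0 ? 0 : data[0].length; }

    public int get(int r, int c) { return data[r][c]; }
    public void set(int r, int c, int v) { data[r][c] = v; }

    // apply a function to every element, one of the simplifications
    // suggested above
    public void apply(IntUnaryOperator f) {
        for (int[] row : data) {
            for (int c = 0; c < row.length; c++) {
                row[c] = f.applyAsInt(row[c]);
            }
        }
    }
}
```

Because callers only see the class's methods, a sparse or resizable implementation could later replace the `int[][]` without changing any calling code.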
We use time series data like hourly temperatures a lot (often down to 10-second resolution for a day), and so we built our own class that essentially represents a line on a graph with an axis of "Date", like a linked list but with each value timestamped. This structure is AMAZINGLY useful for us and I often wonder why it's not in Java, but I think I just answered my own question: not used enough.
This is a job for varargs:
public static void main(String[] args) {
var matrix = new int[][]{{1,2}, {4, 6, 7}};
System.out.println("Length is: " + getSize(matrix));
}
public static int getSize(int[]... multiArray) {
return multiArray.length;
}
which prints out:
Length is: 2
Also, unless you have to use an array to hold your int arrays, I would use an ArrayList<int[]> instead. That way you can easily add to your list like:
ArrayList<int[]> multiArray = new ArrayList<>();
multiArray.add(new int[]{1,2,3});
multiArray.add(new int[]{4,5,6});
and then you can get its size by simply calling:
multiArray.size()
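Putting those fragments together into a self-contained, runnable form:

```java
import java.util.ArrayList;

public class ListOfArrays {
    public static void main(String[] args) {
        ArrayList<int[]> multiArray = new ArrayList<>();
        multiArray.add(new int[]{1, 2, 3});
        multiArray.add(new int[]{4, 5, 6});
        // the outer "dimension" is just the list's size
        System.out.println("Length is: " + multiArray.size()); // prints "Length is: 2"
        // individual rows are plain arrays, accessed by index
        System.out.println(multiArray.get(1)[0]);              // prints 4
    }
}
```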
Here's my attempt. You use Object as the parameter and then check the array's dimension in the body of the method. In this example I limit it to 3D arrays, but you can go up to any dimension.
public class Main{
static void process(Object o){
if (o instanceof int[]){
int[] a = (int[]) o;
System.out.println("1D. length is " + a.length);
} else if (o instanceof int[][]){
int[][] a = (int[][]) o;
System.out.println("2D. row=" + a.length + ", col=" + a[0].length);
} else if (o instanceof int[][][]){
int[][][] a = (int[][][]) o;
System.out.println("3D. row=" + a.length + ", col=" + a[0].length + ", depth=" + a[0][0].length);
} else {
System.out.println("Unsupported array dimension.");
}
}
public static void main(String[] args) {
int[] a = {1,2,3};
int[][] b = {{1,2,3},{1,2,3}};
int[][][] c = {
{ {1,2,3}, {1,2,3} },
{ {1,2,3}, {1,2,3} }
};
process(a);
process(b);
process(c);
}
}
Output:
1D. length is 3
2D. row=2, col=3
3D. row=2, col=2, depth=3

How to call C function through JNA with 2d pointer?

Here is the signature of my C function :
typedef struct myitem_s {
int a;
int b;
} myitem_t;
int get_items(myitem_t** items);
The usage in a C program is:
myitem_t* items = NULL;
int n = get_items(&items);
for (int i = 0; i < n; ++i) {
myitem_t* item = &items[i];
}
items is allocated in the get_items() function and contains one or more myitem_t elements.
From Java code I have succeeded in doing this:
Memory itemsPtr = new Memory(Native.POINTER_SIZE);
Pointer p = itemsPtr.getPointer(0);
int n = CLibrary.INSTANCE.get_items(p);
The n value is valid and itemsPtr is updated, so I assume the value is good as well. But now I have no idea how to use it. Is there another way of doing it?
Your code works but you're using a lot of lower level functions when JNA has some higher level constructs.
First, the C struct can be represented by a JNA Structure.
@FieldOrder({ "a", "b" })
class MyitemT extends Structure {
public int a;
public int b;
public MyitemT() {
}
// pointer constructor, used with toArray() below
public MyitemT(Pointer p) {
super(p);
}
}
Since the native code is handling the memory allocation, all you need is the pointer to it. Rather than a pointer-sized memory allocation, you probably want a PointerByReference.
PointerByReference pbr = new PointerByReference();
The key methods you want from this are getPointer() (the pointer to the pointer) and getValue() (the pointed-to value).
Given the above, pass the pointer-to-the-pointer to the method, which will allocate the memory and populate the value.
Using the mapping you already have (not shown but inferred):
int n = CLibrary.INSTANCE.get_items(pbr.getPointer());
However, you should actually map get_items() to take a PointerByReference argument, and then you can just pass pbr directly.
At this point, pbr.getValue() is a Pointer to the start of your array of structures. Additional items are at offsets from that pointer based on the size of the structure (item.size()). There are multiple ways of getting at them.
In your case, since you know you just have pairs of ints, you could skip the whole "structure" part and just use pbr.getValue().getInt(0) and pbr.getValue().getInt(4) for the first pair; 8 and 12 for the second pair, and so on. Even better, pbr.getValue().getIntArray(0, n * 2) fetches an array of integers in one call; just pull them out in pairs.
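For illustration, unpacking such a flattened int array into (a, b) pairs might look like this plain-Java sketch, where `flat` stands in for the result of `getIntArray(0, n * 2)`:

```java
public class PairUnpack {
    public static void main(String[] args) {
        int[] flat = {10, 20, 30, 40};   // stand-in for getIntArray(0, n * 2)
        int n = flat.length / 2;
        for (int i = 0; i < n; i++) {
            int a = flat[2 * i];         // field a of the i-th struct (offset 0)
            int b = flat[2 * i + 1];     // field b of the i-th struct (offset 4)
            System.out.println("a=" + a + " b=" + b);
        }
    }
}
```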
But that takes advantage of internal details. Probably the most JNA-ish choice is to use Structure.toArray() to create an array of your MyitemT structures. If you include a pointer constructor and create the initial structure using that pointer, Structure.toArray() uses that existing mapping. You can then read() into your array:
MyitemT item = new MyitemT(pbr.getValue());
MyitemT[] items = (MyitemT[]) item.toArray(n);
for (int i = 0; i < n; i++) {
items[i].read();
// now you can see items[i].a and items[i].b
}
Don't forget to eventually release the native-allocated memory however the API tells you to!

What does my professor mean by "Implementing a static Abstract Data Type"?

In my class for our assignment we are making two different Abstract Data Types, Double Stack and Leaky Stack. I have no problem creating these, but my professor put in the assignment details for both of these ADTs to "Give an efficient static implementation of the ADT". But what the hell does that mean? I could ask him tomorrow, but I want to get this assignment done today. Anyone have any idea what he means by this?
One possible interpretation is that the solutions are to use a fixed size "static" underlying structure (such as an array), rather than using a dynamic growing amount. Each stack, therefore, would have a pre-assigned maximum capacity. Therefore, I would expect an exception to be thrown on a push(...) operation that would exceed the capacity of the stack (just as a pop() operation would throw on an empty stack).
An example of a static implementation (though it allows setting the total capacity) might look like the following. Here access is always O(1), as an index is used directly; there is no traversal of the data structure and no memory re-allocation. Note the code is an example and has not been tested. The use of the generic could be removed if the assignment specifies the element type of the stack (such as int or char).
public class AnotherStack<T>
{
private final T[] values;
private int loc = 0;
// must use the suppress, as we are using a raw Object array
// which is necessitated as cannot make a generic array
// See Effective Java
@SuppressWarnings("unchecked")
public AnotherStack(int size)
{
values = (T[])new Object[size];
}
public void push(T val)
{
if (loc < values.length) {
values[loc++] = val;
}
else {
throw new IllegalStateException("Stack full");
}
}
public T pop()
{
if (loc == 0) {
throw new IllegalStateException("Stack empty");
}
return (values[--loc]);
}
// other methods
}

JNA array of structures uninitialized after being passed from Java to C

I've been using JNA with relative success to make native function calls from Java to a small C library that I wrote. Passing structures or pointers from one to the other works great once you've worked out the tricks of structure mapping, memory management and passing by reference.
I am now trying to pass, from Java to C, an array of structures. Here is the C code for the structure:
typedef struct key {
int length;
void *data;
} key_t;
I have the matching definition in Java:
public class Key extends Structure {
public int length;
public Pointer data;
public Key() {
this.setFieldOrder(new String[] {"length", "data"});
}
public void setAsLong(long value) {
this.length = 8;
this.data = new Memory(this.length);
this.data.setLong(0, value);
}
public long longValue() {
return this.data != null ? this.data.getLong(0) : Long.MIN_VALUE;
}
};
If I understood the documentation and what I read online, I need to create my array as a contiguous memory section by doing the following on the Java side:
Key[] keys = (Key[]) new Key().toArray(2);
for (int i = 0; i < 2; i++) {
keys[i].setAsLong(42 + i);
}
So far so good. If I dump the content of each Key structure in Java using Structure.toString(), everything is in here as expected. Note that the code about setting as a long value, allocating memory for the key's content, etc, work fine when I pass a single Key structure from Java to C. So here I pass my array to my native function by using the pointer to the first element of the array:
instance.foo(keys[0].getPointer(), keys.length);
My C function is of course defined like this:
void foo(key_t *keys, size_t count) {
...;
}
The array gets there correctly: the keys pointer on the C side has the same address as keys[0].getPointer() in Java, but unfortunately the members of each structure in the array are 0/NULL, as pointed out by GDB:
(gdb) print keys
$1 = (key_t *) 0x7fd7e82389e0
(gdb) print keys[0]
$2 = {length = 0, data = 0x0}
At this point I honestly have no clue what's going on. As I said, if I pass just one structure it works fine, but here it doesn't. The only difference I can see is the Java native method signature, which uses Pointer instead of Key[], but when I use the array I get:
IllegalArgumentException: [Lfoo.bar.Key; is not a supported argument type (in method foo ...
Thanks
If you pass a Pointer value, JNA has no idea that you're actually passing a Structure or an array of them, and it's up to you to ensure you call Structure.write() before and Structure.read() after the native call.
If you pass either a Structure or Structure[], then JNA will take care of the synchronization automagically. In the case of Structure, JNA uses internal bookkeeping to determine whether the structure you're passing is at the head of an array of structures.

java - live view on collection contained within a collection contained within ... etc

I have a class A which can contain many instances of class B, which may in turn contain many instances of class C, which can contain many instances of class D.
Now, in class A I have a method getAllD. Currently, every time this is called a lot of iteration takes place, and a rather large list is freshly created and returned. This cannot be very efficient.
I was wondering how I could do this better. This question Combine multiple Collections into a single logical Collection? seems to touch upon a similar topic, but I'm not really sure how I could apply it to my situation.
All comments are much appreciated!
I would combine Iterables.concat with Iterables.transform to obtain a live view of Ds:
public class A {
private Collection<B> bs;
/**
* #return a live concatenated view of the Ds contained in the Cs
* contained in the Bs contained in this A.
*/
public Iterable<D> getDs() {
Iterable<C> cs = Iterables.concat(Iterables.transform(bs, BToCsFunction.INSTANCE));
Iterable<D> ds = Iterables.concat(Iterables.transform(cs, CToDsFunction.INSTANCE));
return ds;
}
private enum BToCsFunction implements Function<B, Collection<C>> {
INSTANCE;
@Override
public Collection<C> apply(B b) {
return b.getCs();
}
}
private enum CToDsFunction implements Function<C, Collection<D>> {
INSTANCE;
@Override
public Collection<D> apply(C c) {
return c.getDs();
}
}
}
public class B {
private Collection<C> cs;
public Collection<C> getCs() {
return cs;
}
}
public class C {
private Collection<D> ds;
public Collection<D> getDs() {
return ds;
}
}
This works well if your goal is simply to iterate over the Ds and you don't really need a collection view. It avoids the instantiation of a big temporary collection.
The answer to your question is going to depend on the specifics of your situation. Are these collections static or dynamic? How big is your collection of B's in A? Are you only going to access the Ds from A, or will you sometimes want to be farther down in the tree or returning Bs or Cs? How frequently are you going to want to access the same set of Ds from a particular A? Can a D (or C or B) be associated with more than 1 A?
If everything is dynamic, then the best chance of improving performance is to have parent references from the Cs to A, and then updating the parent whenever C's list of Ds changes. This way, you can keep a collection of Ds in your A object and update A whenever one of the Cs gets a new one or has one deleted.
If everything is static and there is some reuse of the D collections from each A, then caching may be a good choice, particularly if there are a lot of Bs. A would have a map with a key of B and a value of a collection of Ds. The getAllDs() method would first check to see if the map had a key for B and if so return its collection of Ds. If not, then it would generate the collection, store it into the cache map, and return the collection.
You could also use a tree to store the objects, particularly if they were fairly simple. For example, you could create an XML DOM object and use XPath expressions to pull out the subset of Ds that you wanted. This would allow far more dynamic access to the sets of objects you were interested in.
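As a rough sketch of the XPath idea using the JDK's built-in DOM and XPath APIs (the element names here are placeholders for the A/B/C/D hierarchy):

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XPathDemo {
    // count the nodes matching an XPath expression in an XML string
    static int count(String xml, String expr) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // a stand-in tree: two <b> branches holding three <d> leaves in total
        String xml = "<a><b><c><d/><d/></c></b><b><c><d/></c></b></a>";
        System.out.println(XPathDemo.count(xml, "//d")); // prints 3
    }
}
```

The `//d` expression selects every D regardless of depth; more selective expressions (e.g. restricting to one B branch) give the dynamic subset access described above.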
Each of these solutions has different tradeoffs in terms of cost to setup, cost to maintain, timeliness of results, flexibility of use, and cost to fetch results. Which you should choose is going to depend on your context.
Actually, I think Guava's Iterators.concat (or IteratorChain from Apache Commons) would work fine for your case:
class A {
Collection<B> children;
Iterator<D> getAllD() {
List<Iterator<D>> iters = new ArrayList<Iterator<D>>();
for (B child : children) {
iters.add(child.getAllD());
}
Iterator<D> iter = Iterators.concat(iters.iterator());
return iter;
}
}
class B {
Collection<C> children;
Iterator<D> getAllD() {
List<Iterator<D>> iters = new ArrayList<Iterator<D>>();
for (C child : children) {
iters.add(child.getAllD());
}
Iterator<D> iter = Iterators.concat(iters.iterator());
return iter;
}
}
class C {
Collection<D> children;
Iterator<D> getAllD() {
Iterator<D> iter = children.iterator();
return iter;
}
}
This cannot be very efficient.
Iterating in-memory is pretty damn fast. Also, the efficiency of creating one ArrayList of 10k elements compared to creating ten ArrayLists with 1k elements each won't be that drastically different. So, in conclusion, you should probably first just go with the most straightforward iteration. Chances are that this works just fine.
Even if you have a gazillion elements, it is probably wise to implement the straightforward iteration anyway for comparison. Otherwise you don't know whether you are actually optimizing or slowing things down by being clever.
Having said that, if you want to optimize for sequential read access of all Ds, I'd maintain an "index" outside. The index could be a LinkedList, ArrayList, TreeList etc. depending on your situation. For example, if you aren't sure of the length of the index, it is probably wise to avoid ArrayList. If you want to efficiently remove random elements using the reference of that element, OrderedSet might be much better than a list etc.
When you do this you have to worry about the consistency of the index & actual references in your classes. I.e. more complexity = more place to hide bugs. So, unless you find it necessary through performance testing, it is really not advisable to attempt an optimization.
(By the way, avoiding instantiation of new collection objects is unlikely to make things much faster unless you are talking about EXTREMELY high-performance code. Object instantiation in modern JVMs only takes a few tens of nanoseconds. Also, you could mistakenly use an ArrayList with a small initial capacity or something and make things worse.)
