How to avoid copying data between Java and Native C++ Code


I am writing a C++ library that will be used by different Android applications to process data organized as a two-dimensional store in which neither dimension has a predefined size limit (like an array of arrays of float, where the arrays can be quite large).
The current solution uses SWIG to copy data from memory allocated by Java code into C++ structures, so each array of float values (in Java) becomes a vector of float (in C++).
The problem is that duplicating a large amount of data increases the risk of running out of the memory available to the application. I understand that, in any case, memory consumption should be addressed by limiting the input volume, but the library does not know how much memory is available, and it must hold the whole data set (it needs repeated access to every data element) to process it correctly.
So now I am considering using a single data store for both Java and C++, so that the C++ code gets direct access to data that the Java code stores in memory allocated on the Java side (making memory allocated by C++ code the single store is not being considered).
I want to know how to organize such memory sharing safely (preferably using SWIG).
I expect some difficulties with such an implementation, e.g. with the Java garbage collector (C++ code could address storage that has already been deallocated) and with slower memory access through the wrapper (as mentioned earlier, the library requires repeated access to each data item), but perhaps someone can suggest a reliable solution.
An explanation of why my idea is wrong is also acceptable, if it is supported by sufficient and compelling arguments.

You can get access to the raw array data using a Critical Native implementation. This technology allows direct access to JVM memory without the overhead of transferring data between Java and native code.
However, it has the following restrictions:
the method must be static and not synchronized;
argument types must be primitive or primitive arrays;
the implementation must not call JNI functions, i.e. it cannot allocate Java objects or throw exceptions;
it should not run for a long time, since it blocks GC while running.
The declaration of a critical native looks like a regular JNI method, except that:
it starts with JavaCritical_ instead of Java_;
it does not have the extra JNIEnv* and jclass arguments;
Java arrays are passed as two arguments: the first is the array length, and the second is a pointer to the raw array data. That is, there is no need to call GetArrayElements and friends; you can use the direct array pointer immediately.
See the original answer and the source article for details.
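Because the restrictions above allow only primitives and primitive arrays, a float[][] cannot be passed to a critical native directly. A minimal sketch of the Java side, assuming the ragged data is kept in one flat float[] plus an offsets array from the start (FlatFloatRows and process are hypothetical names, not part of the original answer, and the sketch assumes a JVM that supports the JavaCritical_ lookup):

```java
// Hypothetical holder for ragged 2-D float data stored as one flat array,
// so that a native method taking only primitive arrays can consume it.
final class FlatFloatRows {
    final float[] values;    // all rows stored back to back
    final int[] rowOffsets;  // rowOffsets[r] = start of row r; last entry = values.length

    FlatFloatRows(float[] values, int[] rowOffsets) {
        this.values = values;
        this.rowOffsets = rowOffsets;
    }

    int rowLength(int row) {
        return rowOffsets[row + 1] - rowOffsets[row];
    }

    float get(int row, int col) {
        if (col < 0 || col >= rowLength(row)) {
            throw new IndexOutOfBoundsException("col " + col + " in row " + row);
        }
        return values[rowOffsets[row] + col];
    }

    // Static, not synchronized, primitive arrays only, as the rules above require.
    // Assumption: the native library would export this under the JavaCritical_
    // prefix, receiving (length, pointer) pairs for each array argument.
    static native float process(float[] values, int[] rowOffsets);
}
```

The native method is only declared here; calling it requires a loaded native library, which this sketch does not provide.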

Related

Implementation of the heap in the Java Memory Model

I was wondering what specifically the heap stores in its nodes. I understand a heap to be a kind of binary tree, and from what I have studied of trees, the nodes contain a reference to the value stored. My question is: in the case of the Java heap, does the node structure contain a Java object reference to the location (stored somewhere else in RAM) of a stored object (in the case of a reference type), a pointer to the memory location of the data type, or some other representation?
Reading about the subject, I thought it strange that an object defined as a local variable is present both on the stack and in the heap (until I realized that this would be necessary, since local variables are supposed to be visible only to the relevant thread with the relevant thread stack). However, I still thought it odd to use a pair of object references like this and wondered whether I had misunderstood the implementation.

The Java heap just has to conform to section 2.5.3 of the VM specification. There is no single implementation, so strictly speaking your question does not make sense.
There's too little space here to fully describe the Oracle server and client VMs. You should read up on your target VM and ask more specific questions if you get stuck.
You should compare the Java stack and heap to the related concepts in C (stack allocation vs. malloc), with the difference that you do not need to free heap memory because of GC, and you are not allowed to do pointer arithmetic because objects can be moved at any time.
The Java memory model, on the other hand, prescribes what guarantees the VM has to make under concurrent access to various types of variables. Compare it to C++'s std::atomic. This is unrelated to the memory layout.
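To illustrate the kind of guarantee the memory model is about, here is a small sketch using java.util.concurrent.atomic.AtomicLong, roughly the Java counterpart of std::atomic. The class and method names are made up for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

final class AtomicCounterDemo {
    // Two threads increment one shared counter; the atomic increment guarantees
    // that no update is lost, regardless of how the heap is laid out.
    static long countWithTwoThreads(int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();  // join also makes the threads' writes visible here
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

With a plain long field and `counter++` instead, the result could be lower than expected, because that increment is not atomic.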

Sizeof in c porting to Java

I have C code like this:
skip=(unsigned long) (st_row-1)*tot_numcols;
fseek(infile,sizeof(cnum)*skip,0);
Now I have to port it to Java. How can I do that? "cnum" is a structure in C, so I created a class in Java. But what about the fseek: how can I seek to the exact position in the file in Java?

Your C design is broken, and you can't do what you apparently want in Java.
It appears that you're storing information out of C structs by blindly dumping the pointer to disk. In addition to being difficult to debug, it's prone to break completely with any change that makes the compiler decide to pack the struct differently, including in particular compiling identical code for 32-bit and 64-bit or little- and big-endian targets. Instead, you should always explicitly serialize structured data. Human-readable formats are best unless there's a very large amount of data.
Java simply doesn't permit this kind of attempt. The Java memory model explicitly hides information about runtime memory packing, and the JVM has wide latitude to organize memory management as it sees fit.
Instead, define a clear format for saving your data, including endianness, and use that from both languages.
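A sketch of what such an explicit format could look like on the Java side. The record layout here (two little-endian 32-bit floats per cnum) is purely an assumption for illustration, since the question does not show the struct; CnumFile and its methods are hypothetical names. With a fixed record size, RandomAccessFile.seek plays the role of fseek.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class CnumFile {
    // Assumed layout: two little-endian 32-bit floats (re, im) = 8 bytes per record.
    static final int RECORD_BYTES = 2 * Float.BYTES;

    // Same arithmetic as the C code: skip = (st_row - 1) * tot_numcols.
    static long recordIndex(int stRow, int totNumCols) {
        return (long) (stRow - 1) * totNumCols;
    }

    static void write(RandomAccessFile file, long index, float re, float im) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(RECORD_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        buf.putFloat(re).putFloat(im);
        file.seek(index * RECORD_BYTES);  // counterpart of fseek(infile, size * skip, 0)
        file.write(buf.array());
    }

    static float[] read(RandomAccessFile file, long index) throws IOException {
        byte[] raw = new byte[RECORD_BYTES];
        file.seek(index * RECORD_BYTES);
        file.readFully(raw);
        ByteBuffer buf = ByteBuffer.wrap(raw).order(ByteOrder.LITTLE_ENDIAN);
        return new float[] { buf.getFloat(), buf.getFloat() };
    }

    // Writes one record to a temporary file and reads it back.
    static float[] roundTrip(long index, float re, float im) {
        try {
            File tmp = File.createTempFile("cnum", ".bin");
            tmp.deleteOnExit();
            try (RandomAccessFile file = new RandomAccessFile(tmp, "rw")) {
                write(file, index, re, im);
                return read(file, index);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The C side would then have to write the same byte layout field by field rather than dumping the struct.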

Java: How does the Array provide instant-lookup

More specifically, how does the Array object in Java allow users to access each bucket in constant time? I understand that Java allocates memory equivalent to the specified size at initialization, but what about the structure of it allows such rapid lookup? The reason I am asking is because it's something not available in any other data structure save those that use Array representations of data.
Another (possibly silly) question is why can't other data structures provide such quick lookup of the stored data? I guess this question can/will be answered by the answer to the first question (how Arrays are implemented).

An array is just stored as a big contiguous block of memory.
Accessing an element is a matter of:
Finding where the data starts
Validating the index
Multiplying the index by the element size, and adding the result to the start location
Accessing that memory location
Note that this is all done in the JVM, not in byte code. If arrays didn't exist at a JVM level, you couldn't fake them up within pure Java. Aside from anything else, arrays are the only objects in Java which logically vary in size between instances within the same JVM. (It's possible that some JVMs have clever systems to use compressed references in some cases and not in others, but that's an implementation detail rather than a logical difference.)
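The steps above can be sketched in Java for illustration only. ManualIntArray is a made-up class, and the ByteBuffer standing in for the contiguous block is itself backed by a real array, so this shows the arithmetic rather than a way to avoid JVM arrays.

```java
import java.nio.ByteBuffer;

final class ManualIntArray {
    private static final int ELEMENT_SIZE = Integer.BYTES;  // 4 bytes per int
    private final ByteBuffer memory;  // stands in for the contiguous block
    private final int length;

    ManualIntArray(int length) {
        this.length = length;
        this.memory = ByteBuffer.allocate(length * ELEMENT_SIZE);
    }

    // Validate the index, then multiply by the element size to get the byte offset.
    private int offsetOf(int index) {
        if (index < 0 || index >= length) {
            throw new ArrayIndexOutOfBoundsException(index);
        }
        return index * ELEMENT_SIZE;
    }

    int get(int index) {
        return memory.getInt(offsetOf(index));
    }

    void set(int index, int value) {
        memory.putInt(offsetOf(index), value);
    }
}
```

Every access costs one comparison, one multiplication and one memory read, independent of the array length, which is why lookup is constant time. A linked structure, by contrast, has to follow references from node to node.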

Is there any way to set RAM location (in java) to a variable so that we can retrieve value?

If I have not assumed or learned anything wrong, then every variable we assign takes up a certain place in RAM.
For example, when working with a Java array, printing the array prints a "location".
String[] a = new String[2];
System.out.println(a);
[Ljava.lang.String;@be6280
Now, is there any way to set that location?
I think it is possible in C++, is it? If any language offers this, then I should be able to scan my RAM for that variable or array location, at least exhaustively. Can't I? Has anyone tried doing it?

You could use the Unsafe class as shown here. This is specific to the HotSpot JVM, but it's probably a start.

No, this is not possible in Java. For developers of debugging tools and other special tools there is a backdoor for direct memory access, but this is not what you and 99.995% of other people want.
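As a side note on the value printed in the question: it comes from the default Object.toString, which combines the class name with the object's hash code in hexadecimal, so it should not be read as a RAM address. A small sketch (ArrayToStringDemo is a name invented for this example):

```java
final class ArrayToStringDemo {
    // Rebuilds what Object.toString() returns by default:
    // the class name, an '@', and the hash code in hexadecimal.
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        String[] a = new String[2];
        System.out.println(a);            // default toString
        System.out.println(describe(a));  // same text, built by hand
    }
}
```

Both lines print the same text within one run; the hexadecimal part can differ between runs.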

Memory in Java (RAM) is controlled by the Java Virtual Machine (JVM) and there is no way, unlike other programming environments, to handle memory allocation manually in the Java heap. You can allocate memory outside Java using JNI or NIO.
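A minimal sketch of the NIO route mentioned above, using a direct buffer whose storage lives outside the Java heap. OffHeapFloats is a hypothetical helper name. Native code can typically obtain the address of such a buffer through the JNI function GetDirectBufferAddress, though that part is not shown here.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

final class OffHeapFloats {
    // Allocates room for 'count' floats outside the Java heap, in the
    // platform's native byte order so native code sees ordinary floats.
    static FloatBuffer allocate(int count) {
        return ByteBuffer.allocateDirect(count * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
    }

    public static void main(String[] args) {
        FloatBuffer data = allocate(4);
        data.put(2, 3.5f);                  // absolute write at index 2
        System.out.println(data.get(2));    // reads the same element back
        System.out.println(data.isDirect());
    }
}
```

The buffer's memory is not moved by the garbage collector, but it is released only after the buffer object itself becomes unreachable.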

Setting the location is very unsafe, as GC can run at any time and a corrupted memory structure will crash the JVM.
You can get the location of an object, but you don't need it. You can use reflection to get any field of an object, and you can also use Unsafe to extract or set a field.

Why is there no sizeof in Java?

For what design reason is there no sizeof operator in Java? Knowing that it is very useful in C++ and C#, how can you get the size of a certain type if needed?

Because the size of primitive types is explicitly mandated by the Java language. There is no variance between JVM implementations.
Moreover, since allocation is done by the new operator depending on its argument there is no need to specify the amount of memory needed.
It would sure be convenient sometimes to know how much memory an object will take so you could estimate things like max heap size requirements but I suppose the Java Language/Platform designers did not think it was a critical aspect.
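For primitive types, the fixed sizes are exposed as constants on the wrapper classes (available since Java 8). A short sketch; PrimitiveSizes is a name chosen for this example:

```java
final class PrimitiveSizes {
    // Returns the size in bytes that the language fixes for a primitive type.
    // boolean is left out because no BYTES constant exists for it.
    static int bytesOf(Class<?> type) {
        if (type == byte.class) return Byte.BYTES;
        if (type == short.class) return Short.BYTES;
        if (type == char.class) return Character.BYTES;
        if (type == int.class) return Integer.BYTES;
        if (type == long.class) return Long.BYTES;
        if (type == float.class) return Float.BYTES;
        if (type == double.class) return Double.BYTES;
        throw new IllegalArgumentException("no BYTES constant for " + type);
    }
}
```

These values describe the primitive itself, not how much heap an object or array holding it occupies, which remains up to the JVM.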

In C, sizeof is useful only because you have to allocate and free memory manually. Since Java has automatic garbage collection, it is not necessary.

In Java, you don't work directly with memory, so sizeof is usually not needed. If you still want to determine the size of an object, check out this question.

Memory management is done by the VM in Java, perhaps this might help you: http://www.javamex.com/java_equivalents/memory_management.shtml

C needed sizeof because the size of ints and longs varied depending on the OS and compiler. Or at least it used to. :-) In Java all sizes, bit configurations (e.g. IEEE 754) are better defined.
EDIT - I see that @Maerics provided a link to the Java specs in his answer.

The sizeof operator exists in C/C++ because C and C++ are machine-dependent languages: data types may have different sizes on different machines, so programs need to know how big those types are when performing size-sensitive operations.
E.g. one machine might have a 32-bit integer while another has a 16-bit integer.
Java, however, is a machine-independent language in which all data types are the same size on every machine, so there is no need to query the size of a data type; it is predefined.
