I have to convert one of my code segments from C to Java. The code is given below.
union commandString{
    char commndStr[20];
    struct{
        char commnd[4];
        char separator1;
        char agr1[5];
        char separator2;
        char arg2[3];
        char separator3;
        char additionalArg[5];
    };
};
I don't want to use an explicit parser, and I don't want to use the System.arraycopy method.
Is there any way to do this the way I'd prefer?
The Java language does not support unions or direct control over memory layout the way that languages like C do.
However, Oracle does offer a backdoor, added in Java 5, through the class sun.misc.Unsafe. It takes a bit of work; the full details have been documented by Martin Thompson on his blog.
The other option would be to write it in C and access it from Java as native functions via JNI.
The best library for doing Struct and Union would be Javolution, which has been around for many years. Its Struct and Union classes were designed to do this.
If you are going to use Unsafe, I suggest you wrap it in a library that abstracts it away. This can save you from continuously running into bugs which crash your JVM (and I mean crash in the sense a C programmer would understand).
I have a library called Java-Lang which allows you to do the sort of things Java doesn't normally allow, such as 63-bit sized off-heap and memory-mapped memory, thread-safe off-heap operations, and sharing of memory between JVMs on the same machine. As I said, I use my own library to abstract away the use of Unsafe.
Reading about the Javolution Union led me to ByteBuffer, which can be used as a union. ByteBuffer is an abstract class, so you obtain an instance from a factory method: ByteBuffer.allocate() or ByteBuffer.wrap() gives a heap buffer, and for a file-backed buffer you can create a temp file with File.createTempFile(), open it with new RandomAccessFile(file, "rw"), get a FileChannel from RandomAccessFile.getChannel(), and get a MappedByteBuffer from FileChannel.map(). You can choose whether multi-byte values are read big-endian (the default) or little-endian. If you just need one union for mapping types to bytes, such as for a trie, this would suffice.
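The ByteBuffer idea can be sketched for the command string from the question. This is only a rough illustration, not a definitive implementation: the class name CommandString, the accessor names and the ASCII encoding are assumptions, and the offsets are simply read off the C struct (4 + 1 + 5 + 1 + 3 + 1 + 5 = 20 bytes). Each field is a slice that shares storage with the whole 20-byte buffer, which is what makes it behave like a union.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the C union: one 20-byte buffer, several views of it.
final class CommandString {
    static final int SIZE = 20;
    private final ByteBuffer whole = ByteBuffer.allocate(SIZE);

    // Returns a view that shares storage with the whole buffer (no copying).
    private ByteBuffer view(int offset, int length) {
        ByteBuffer dup = whole.duplicate();
        dup.position(offset);
        dup.limit(offset + length);
        return dup.slice();
    }

    private static String text(ByteBuffer b) {
        return StandardCharsets.US_ASCII.decode(b).toString();
    }

    // Fills the whole union, like writing to commndStr in C.
    void set(String twentyChars) {
        byte[] bytes = twentyChars.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length != SIZE) {
            throw new IllegalArgumentException("expected " + SIZE + " ASCII characters");
        }
        view(0, SIZE).put(bytes);
    }

    // Writes through the arg2 view; the change is visible in commndStr() too.
    void setArg2(String threeChars) {
        byte[] bytes = threeChars.getBytes(StandardCharsets.US_ASCII);
        if (bytes.length != 3) {
            throw new IllegalArgumentException("expected 3 ASCII characters");
        }
        view(11, 3).put(bytes);
    }

    String commndStr()     { return text(view(0, SIZE)); }
    String commnd()        { return text(view(0, 4)); }
    String agr1()          { return text(view(5, 5)); }
    String arg2()          { return text(view(11, 3)); }
    String additionalArg() { return text(view(15, 5)); }
}
```

After set("MOVE:00012:abc:xyz12"), commnd() reads the first four bytes and arg2() reads bytes 11 to 13 of the same storage. Turning a field into a String still decodes the bytes, so this avoids System.arraycopy and a hand-written parser rather than avoiding all copying.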
I recently read this document titled "Embedded Systems/Mixed C and Assembly Programming".
It basically deals with how C and C++ allow the user to use Assembly code via a technique called inline assembly that looks sort of like this:
#include<stdio.h>
void main() {
int a = 3, b = 3, c;
asm {
mov ax,a
mov bx,b
add ax,bx
mov c,ax
}
printf("%d", c);
}
And I was wondering if a similar interaction was possible in other high-level languages like Java, Python and others, or if this was only possible with C and C++.
Yes, D, Rust, Delphi, and quite a few other ahead-of-time-compiled languages have some form of inline asm.
Java doesn't, nor do most other languages that are normally JIT-compiled from a portable binary (like Java's .class bytecode, or C#'s CIL). See also the question "Code injecting/assembly inlining in Java?".
Memory-safe languages like Rust only allow inline asm in unsafe{} blocks because assembly language can mess up the program state in arbitrary ways if it's buggy, even more broadly than C undefined behaviour. Languages like Java intended to sandbox the guest program don't allow unsafe code at all.
Very high level languages like Python don't even have simple object-representations for numbers, e.g. an integer variable isn't just a 32-bit object, it has type info, and (in Python specifically) can be arbitrary length for large values. So even if a Python implementation did have inline-asm facilities, it would be a challenge to let you do anything to Python objects, except maybe for NumPy arrays which are laid out like C arrays.
It's possible to call native machine-code functions (e.g. libraries compiled from C, or hand-written asm) from most high-level languages - that's usually important for writing some kinds of applications. For example, in Java there's JNI (Java Native Interface). Even node.js JavaScript can call native functions. "Marshalling" args into a form that makes sense to pass to a C function can be expensive, depending on the high-level language and whether you want to let the C / asm function modify an array or just return a value.
Different forms of inline asm in different languages
Often they're not MSVC's inefficient form like you're using (which forces a store/reload for inputs and outputs). Better designs, like Rust's, which is modeled on GNU C inline asm, can use registers. For example, GNU C asm("lzcnt %1, %0" : "=r"(leading_zero_count) : "rm"(input)); lets the compiler pick an output register, and pick a register or a memory addressing mode for the input.
(But even better to use intrinsics like _lzcnt_u32 or __builtin_clz for operations the compiler knows about, only inline asm for instructions the compiler doesn't have intrinsics for, or if you want to micro-optimize a loop in a certain way. https://gcc.gnu.org/wiki/DontUseInlineAsm)
Some (like Delphi) have inputs via a "calling convention" similar to a function call, with args in registers, not quite free mixing of asm and high-level code. So it's more like an asm block with fixed inputs, and one output in a specific register (plus side-effects) which the compiler can inline like it would a function.
For syntax like you show to work, either
You have to manually save/restore every register you use inside the asm block (really bad for performance unless you're wrapping a big loop - apparently Borland Turbo C++ was like this)
Or the compiler has to understand every single instruction to know what registers it might write (MSVC is like this). The design notes / discussion for Rust's inline asm mention this requirement for D or MSVC compilers to implement what's effectively a DSL (Domain Specific Language), and how much extra work that is, especially for portability to new ISAs.
Note that MSVC's specific implementation of inline asm was so brittle and clunky that it doesn't work safely in functions with register args, which meant not supporting it at all for x86-64, or ARM/AArch64 where the standard calling convention uses register args. Instead, they provide intrinsics for basically every instruction, including privileged ones like invlpg, making it possible to write a kernel (such as Windows) in Visual C++. (Where other compilers would expect you to use asm() for such things.) Windows almost certainly has a few parts written in separate .asm files, like interrupt and system-call entry points, and maybe a context-switch function that has to load a new stack pointer, but with good intrinsics support you don't need asm, if you trust your compiler to make good-enough asm on its own.
You can inline assembly in HolyC.
In C/C++, you can do the following:
struct DataStructure
{
char member1;
char member2;
};
DataStructure ds;
char bytes[] = {0xFF, 0xFE};
memcpy(&ds, bytes, sizeof(ds));
and you would essentially get the following:
ds.member1 = 0xFF;
ds.member2 = 0xFE;
What is the Java equivalent?
There is no Java equivalent.
Java does not allow you to create or modify objects by accessing them at that level. You should be using new or setter methods, depending on what you are trying to achieve.
(There are a couple of ways to do this kind of thing, but they are unsafe, non-portable and "not Java" ... and they are not warranted in this situation.)
The memcpy you wrote depends on the internal implementation of the struct and would not necessarily work. In Java, you need to define a constructor that accepts a byte array and sets the fields. There are no shortcuts like this, as the memory structure of the class is not defined.
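A minimal sketch of such a constructor, assuming the two-byte layout from the question; the class below is a hypothetical Java counterpart of the C struct, not code from the question.

```java
// Hypothetical Java counterpart of the C struct: fields are set one by one.
final class DataStructure {
    final byte member1;
    final byte member2;

    // Plays the role of the memcpy: each field is read from an explicit index.
    DataStructure(byte[] bytes) {
        if (bytes.length < 2) {
            throw new IllegalArgumentException("need at least 2 bytes");
        }
        member1 = bytes[0];
        member2 = bytes[1];
    }
}
```

Usage would look like new DataStructure(new byte[] {(byte) 0xFF, (byte) 0xFE}). Java's byte is signed, so member1 then holds -1; masking with member1 & 0xFF gives back 255.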
In Java you cannot work with memory directly (there is no memcpy); that is an advantage (or disadvantage?) of Java. There are some Java library methods for copying arrays, such as System.arraycopy().
In general, to copy an object you need to give it a clone method.
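For arrays specifically, the two library routes look roughly like this. The helper class and method names are made up for illustration; System.arraycopy and the array clone() method are the standard ones.

```java
// Hypothetical helper showing the two standard ways to copy array contents.
final class CopyDemo {
    // Copies `length` elements starting at `from` into a fresh array.
    static byte[] copyRange(byte[] src, int from, int length) {
        byte[] dst = new byte[length];
        System.arraycopy(src, from, dst, 0, length);
        return dst;
    }

    // Arrays support clone() out of the box; the result is a shallow copy.
    static byte[] copyAll(byte[] src) {
        return src.clone();
    }
}
```

Both copy element values only. For an array of object references, the copy shares the referenced objects with the original.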
You might be able to do that in C. But you'd be wandering into aliasing problems and a hunka hunka burning undefined behavior.
And because struct padding is up to the compiler, what you might get with your memcpy is just ds.member1 = 0xFF and ds.member2 = whatever junk happened to be on the stack at the time, because member1 was padded to occupy 4 bytes rather than just 1. Or maybe you get junk for both, because you set the top 2 bytes of a 4-byte word and the members live in the bottom 2 bytes.
What you're wandering into is compiler/runtime-specific memory layouts. The same is true in Java. Java itself won't let you do something so horrendously un-Java, but if you write your own JVM or debug an existing JVM written in C or C++, you could do something like that. And who knows what would happen; I'm not Java god enough to know exactly how much the JVM spec pins down JVM implementation, but my guess is, not to the degree necessary to enable interoperability of the in-memory, runtime representations of objects.
So you get undefined behavior in every language flavor. Tastes just as good in each language, too - like mystery meat.
I want to try abusing Java classes as structures, and for that I'm wondering if it is possible to serialize a byte array to a class and the other way around.
So if I have a class like this:
public class Handshake
{
    byte command;
    byte error;
    short size;
    int major;
    int ts;
    char[] secret; // aligned size = 32 bytes
}
Is there an easy way (without having to manually read bytes and fill out the class which requires 3 times as much code) to deserialize a set of bytes into this class? I know that Java doesn't have structs but I'm wondering if it is possible to simplify the serialization process so it does it automatically. The bytes are not from Java's serializer, they are just aligned bytes derived from C structs.
Bad idea. It can break as soon as someone compiles that code on a different platform, using a different compiler or settings, etc.
Much better: use a standardized binary interface with implementations in Java and C++ like ASN.1 or Google's Protocol Buffers.
You can write a library to do the deserialization using reflection. This may result in more code being required, but may suit your needs. It is worth noting that char in Java is 16-bit rather than 8-bit, and a char[] is a separate object, unlike in C.
In short, you can write a library which reads this data without touching the Handshake class. Only you can decide if this is actually easier than adding a method or two to the Handshake class.
Do not do that! It will break sooner or later. Use some binary serialization format, like Hessian, which supports both Java and C++ (I'm not aware of anything that works on plain C).
Also remember that C does not fix the sizes of ints or longs; they are platform dependent.
So if you must use C, and you are forced to write your own library, be very careful.
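If you do end up reading the C layout by hand, a sketch along these lines keeps the field widths and byte order explicit. It rests on assumptions that are not in the question: the C side uses fixed-width fields (uint8_t, uint8_t, uint16_t, uint32_t, uint32_t) with no padding between them, secret is 32 raw bytes, and the byte order is agreed with the C side. The class name HandshakeRecord and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical hand-written reader/writer for the C handshake layout.
final class HandshakeRecord {
    static final int SECRET_LENGTH = 32;                         // assumption
    static final int WIRE_SIZE = 1 + 1 + 2 + 4 + 4 + SECRET_LENGTH;

    byte command;
    byte error;
    short size;
    int major;
    int ts;
    byte[] secret = new byte[SECRET_LENGTH]; // byte[], since a C char is 8 bits

    // Reads the fields in declaration order using the given byte order.
    static HandshakeRecord fromBytes(byte[] raw, ByteOrder order) {
        if (raw.length < WIRE_SIZE) {
            throw new IllegalArgumentException("need " + WIRE_SIZE + " bytes");
        }
        ByteBuffer in = ByteBuffer.wrap(raw).order(order);
        HandshakeRecord h = new HandshakeRecord();
        h.command = in.get();
        h.error = in.get();
        h.size = in.getShort();
        h.major = in.getInt();
        h.ts = in.getInt();
        in.get(h.secret);
        return h;
    }

    // Writes the fields back in the same order and byte order.
    byte[] toBytes(ByteOrder order) {
        ByteBuffer out = ByteBuffer.allocate(WIRE_SIZE).order(order);
        out.put(command).put(error).putShort(size).putInt(major).putInt(ts).put(secret);
        return out.array();
    }
}
```

This is about one line per field in each direction. If the C compiler inserts padding, or the platform's int is not 32 bits, the offsets here would need adjusting, which is the fragility the answers above warn about.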
I'm developing a Java application that uses some JNI calls.
In the C code I have the following variable:
GLuint *vboIds;
I want to pass this variable from Java to C, but I don't know how to declare it in Java.
GLuint is equivalent to an unsigned int.
So, I think this is the equivalent declaration in Java:
int[] vboIds;
What do you think?
Thanks
You don't say explicitly whether it is meant to be a pointer to a single value or an array, but I'd guess it's probably an array from the naming and what you are thinking of doing with the mapping (there should also be a parameter somewhere that specifies the length of the array; those both map to the same argument on the Java side as Java's arrays know their own lengths). You're probably right to use an int as that's generally the same size as a C int – not that that's a guarantee, not at all, but hardly any machine architectures are different from that these days – but you'll need to watch out for the fact that Java's numeric types are all signed. That's mostly not a problem provided you're a bit careful with arithmetic (other than addition, subtraction and left-shift, which work obviously) and comparisons.
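The care needed with signed types can be made concrete with the unsigned helpers on java.lang.Integer (available since Java 8). The wrapper class and method names below are only for illustration.

```java
// Hypothetical helpers for treating a Java int as a C unsigned int (e.g. a GLuint).
final class UnsignedInts {
    // Widens to long without sign extension, e.g. for printing or range checks.
    static long toUnsigned(int value) {
        return Integer.toUnsignedLong(value);
    }

    // Unsigned comparison: the bit pattern 0xFFFFFFFF is the largest value, not -1.
    static boolean lessThan(int a, int b) {
        return Integer.compareUnsigned(a, b) < 0;
    }

    // Unsigned division; plain '/' would treat both operands as signed.
    static int divide(int a, int b) {
        return Integer.divideUnsigned(a, b);
    }
}
```

For example, toUnsigned(-1) is 4294967295L and lessThan(1, -1) is true, whereas the plain signed comparison 1 < -1 is false.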
What is the 'correct' way to store a native pointer inside a Java object?
I could treat the pointer as a Java int, if I happen to know that native pointers are <= 32 bits in size, or a Java long if I happen to know that native pointers are <= 64 bits in size. But is there a better or cleaner way to do this?
Edit: Returning a native pointer from a JNI function is exactly what I don't want to do. I would rather return a Java object that represents the native resource. However, the Java object that I return must presumably have a field containing a pointer, which brings me back to the original question.
Or, alternatively, is there some better way for a JNI function to return a reference to a native resource?
IIRC, both java.util.zip and java.nio just use long.
java.nio.DirectByteBuffer does what you want.
Internally it uses a private long address to store the pointer value. Dah!
Use the JNI function env->NewDirectByteBuffer((void*) data, sizeof(MyNativeStruct)) to create a DirectByteBuffer on the C/C++ side and return it to the Java side as a ByteBuffer. Note: it's your job to free this data on the native side! It lacks the automatic Cleaner that a standard direct buffer has.
On the Java side, you can create a DirectByteBuffer this way:
ByteBuffer directBuff = ByteBuffer.allocateDirect(sizeInBytes);
Think of it as a sort of C malloc(sizeInBytes). Note: it has an automatic Cleaner, which deallocates the memory previously requested.
But there are some points to consider about using DirectByteBuffer:
It can be garbage collected (GC) if you lose your reference to the direct ByteBuffer.
You can read and write values in the pointed-to structure, but be careful with both offsets and data sizes. The compiler may add padding and break the offsets you assumed within the structure. Structures containing pointers (is the stride 4 or 8 bytes?) can also throw off your layout.
Direct ByteBuffers are very easy to pass as a parameter to native methods, as well as to get back as a return value.
You must cast to the correct pointer type on the JNI side. The type returned by env->GetDirectBufferAddress(buffer) is void*.
You are unable to change the pointer value once the buffer is created.
It's your job to free memory previously allocated for buffers on the native side, that is, the ones you passed to env->NewDirectByteBuffer().
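On the Java side, reading and writing at fixed offsets in a direct buffer might look like the sketch below. The layout is hypothetical (a 32-bit id at offset 0 and a 64-bit value at offset 8, as a C compiler would typically lay out struct { int32_t id; int64_t value; } with 4 bytes of padding), and the class and method names are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo: a direct buffer used as a 16-byte native-layout record.
final class DirectBufferDemo {
    static final int ID_OFFSET = 0;     // assumed offset of the 32-bit field
    static final int VALUE_OFFSET = 8;  // assumed offset of the 64-bit field

    // Native byte order matches what C code on the same machine would see.
    static ByteBuffer newRecord() {
        return ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
    }

    public static void main(String[] args) {
        ByteBuffer buf = newRecord();
        buf.putInt(ID_OFFSET, 42);               // absolute put: position is untouched
        buf.putLong(VALUE_OFFSET, 123456789L);
        System.out.println(buf.getInt(ID_OFFSET) + " " + buf.getLong(VALUE_OFFSET));
    }
}
```

A new ByteBuffer is big-endian regardless of platform, so the order(ByteOrder.nativeOrder()) call matters whenever native code reads the same memory.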
There is no good way. In SWT, this code is used:
int /*long*/ hModule = OS.GetLibraryHandle ();
and there is a tool which converts the code between 32bit and 64bit by moving the comment. Ugly but it works. Things would have been much easier if Sun had added an object "NativePointer" or something like that but they didn't.
A better way might be to store it in a byte array, since native pointers aren't very Java-ish in the first place. ints and longs are better reserved for storing numeric values.
I assume that this is a pointer returned from some JNI code, and my advice would be: just don't do it :)
Ideally the JNI code should pass you back some sort of logical reference to the resource and not an actual pointer.
As to your question there is nothing that comes to mind about a cleaner way to store the pointer - if you know what you have then use either the int or long or byte[] as required.
You could look to the way C# handles this with the IntPtr type. By creating your own type for holding pointers, the same type can be used as a 32-bit or 64-bit depending on the system you're on.
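Java has no built-in counterpart to IntPtr, but a small wrapper in that spirit could look like this. The class name NativePointer and its methods are hypothetical; it stores the address in a long, which is wide enough for both 32-bit and 64-bit pointers.

```java
// Hypothetical IntPtr-style wrapper: an opaque handle for a native address.
final class NativePointer {
    private final long address;

    NativePointer(long address) {
        this.address = address;
    }

    // Raw value, intended to be handed back to JNI code only.
    long address() {
        return address;
    }

    boolean isNull() {
        return address == 0L;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof NativePointer
                && ((NativePointer) other).address == address;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(address);
    }

    @Override
    public String toString() {
        return "0x" + Long.toHexString(address);
    }
}
```

Wrapping the long this way keeps pointer values from being mixed up with ordinary numbers in method signatures, and the native side can cast the long back to the real pointer type.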