c++ (u256)*(h256 const*)(char*[] + int) cast rewriting to java

I need to rewrite some code from C++ to Java and I've run into trouble with this C++ code:
using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;
using h256 = FixedHash<32>;
using bytes = std::vector<byte>;
uint32_t offset = ...;
bytes m_data = ...;
u256 result;
result = (u256)*(h256 const*)(m_data.data() + (size_t)offset);
I have no idea what's going on here or how to rewrite it in Java.
As far as I understand, we first apply an offset so that we point at some element of the m_data array, then cast that to a pointer to h256 (I watched this in the debugger, and the cast appeared to do the following: take the data from 0 to offset from m_data and then cast it to a 32-byte array with leading zeros).
Then we take the first value (I'm not sure about this) of that array and cast it to u256? But the first value after the (h256 const*) cast is zero, yet the resulting value is not zero.
Do you have any ideas?

I don't know what a u256 is, and the question is missing the typedef, but this is the typical way in C to get a scalar type (int16_t, int32_t, int64_t, double, ...) from a buffer in memory.
Essentially, the syntax:
type t = (type)*(const type *)(buffer + offset)
... lets you obtain an object of a specific type from a byte array, starting at a particular index.
It's not very safe, but it's blazing fast when compiled to assembly!
NOTE: the pointer arithmetic depends on the declared type of "buffer": if it's int8_t *, for instance, the value is read starting at byte "offset"; if it's int32_t *, it is read starting at byte "offset * 4".
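On the Java side there is no pointer cast, so the 32 bytes have to be copied out and converted explicitly. The following is only a minimal sketch: it assumes that the h256-to-u256 conversion treats the 32 bytes as an unsigned big-endian integer (check this against the FixedHash implementation in your C++ code base), and the names U256Reader and readU256 are made up for illustration.

```java
import java.math.BigInteger;
import java.util.Arrays;

final class U256Reader {
    private static final int WORD_SIZE = 32; // h256 is FixedHash<32>

    // Reads 32 bytes starting at 'offset' and interprets them as an
    // unsigned big-endian integer (assumption, see the text above).
    static BigInteger readU256(byte[] data, int offset) {
        if (offset < 0 || offset > data.length - WORD_SIZE) {
            throw new IndexOutOfBoundsException("need 32 bytes at offset " + offset);
        }
        byte[] word = Arrays.copyOfRange(data, offset, offset + WORD_SIZE);
        // signum = 1 forces a non-negative result even if the top bit is set.
        return new BigInteger(1, word);
    }

    public static void main(String[] args) {
        byte[] data = new byte[40];
        data[39] = 5; // last byte of the 32-byte word that starts at offset 8
        System.out.println(readU256(data, 8));
    }
}
```

Unlike the C++ cast, this version checks the bounds instead of reading past the end of the buffer.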

Related

How can an array of a data type with the size [1] suddenly be bigger than the datatype itself in Java

Today I have been experimenting with memory in Java. Specifically, I was serializing objects into binary data and deserializing them again. Something caught my eye: an array of bytes of size 1, for example, takes up less binary data than a single byte. Here's what I mean:
I defined a single byte in Java and printed out its binary data:
byte size: 75 bytes
101011001110110100000000000001010111001101110010000000000000111001101010011000010111011001100001001011100110110001100001011011100110011100101110010000100111100101110100011001011001110001001110011000001000010011101110010100001111010100011100000000100000000000000001010000100000000000000101011101100110000101101100011101010110010101111000011100100000000000010000011010100110000101110110011000010010111001101100011000010110111001100111001011100100111001110101011011010110001001100101011100101000011010101100100101010001110100001011100101001110000010001011000000100000000000000000011110000111000000000001
And here's a byte[1] array:
byte[] size: 28 bytes
10101100111011010000000000000101011101010111001000000000000000100101101101000010101011001111001100010111111110000000011000001000010101001110000000000010000000000000000001111000011100000000000000000000000000000000000100000001
But if I print out the size of byte[0] (the byte at index 0 in the array), it suddenly grows back to 75 bytes:
size of byte[0] in byte array:
101011001110110100000000000001010111001101110010000000000000111001101010011000010111011001100001001011100110110001100001011011100110011100101110010000100111100101110100011001011001110001001110011000001000010011101110010100001111010100011100000000100000000000000001010000100000000000000101011101100110000101101100011101010110010101111000011100100000000000010000011010100110000101110110011000010010111001101100011000010110111001100111001011100100111001110101011011010110001001100101011100101000011010101100100101010001110100001011100101001110000010001011000000100000000000000000011110000111000000000001
And yes, it's the full object and not metadata or something, because using this binary data I can reconstruct the object to its original state, so the values are stored inside the binary data. Here's the code I used to find out the size of the data:
import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;

public class MemoryFunctions {
    // Serializes the input and returns the length of the resulting byte stream.
    static int sizeOf(Object input) {
        int size = 0;
        ByteArrayOutputStream checker = new ByteArrayOutputStream();
        try {
            ObjectOutputStream byteArray = new ObjectOutputStream(checker);
            byteArray.writeObject(input);
            byteArray.flush();
            byte[] sizeDetector = checker.toByteArray();
            size = sizeDetector.length;
            int amountOfBytes = 0;
            // Print every serialized byte as eight binary digits.
            for (byte b : sizeDetector) {
                System.out.print(String.format("%8s", Integer.toBinaryString(b & 0xFF)).replace(' ', '0'));
                amountOfBytes += 1;
            }
            System.out.println("real size in byte " + amountOfBytes);
            System.out.println();
        } catch (Exception e) {
            System.err.println(e);
        }
        return size;
    }
}
Is there any reason why a byte array takes up less space than the byte itself? I need to heavily optimize a program. Given this, would it be better to keep the values of a class that I want to serialize into binary data in array form, or are there benefits to using "the full value"? I am also confused because, as far as I know, byte and byte[] are primitive data types, so they are not passed by reference but stored as binary in memory "as is". I also found that taking the element at index 0 of the smaller array suddenly seems to generate a new value, because its size is 75 bytes again. Does this mean that values are generated anew when you access an index of an array?
It would be nice if any of you had more information on this topic and could answer my questions.

In Java byte is a primitive type passed by-value, but all arrays are effectively objects and by-reference, including byte[].
If you pass a byte[1] to your sizeOf it passes the array, because all object types are subclasses of and compatible with the parameter Object input, and sizeOf serializes the array.
If you try to pass a primitive byte it doesn't work directly, because no primitive type is a reference type, so it cannot be compatible with java.lang.Object or any other object type. Instead the byte is 'boxed' into an object of the language-defined class java.lang.Byte (note the different spelling), and sizeOf serializes that object. This is often called auto-boxing (and auto-unboxing for the reverse) because the compiler does these conversions without you writing them in the source code. The boxed object is actually about the same size in memory as the array (and both are significantly larger than the primitive), but as commented by https://stackoverflow.com/users/869736/louis-wasserman the serialization of the java.lang.Byte object is more complicated and longer than the serialization of the byte[1] array.
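The boxing effect can be reproduced with a few lines. This is a small sketch rather than the asker's code: SerializedSizeDemo and serializedSize are hypothetical names, and the sizes it reports are those of the serialized stream, not of the objects in memory. On a standard JDK the three calls should report the 75, 28 and 75 bytes seen in the question.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

final class SerializedSizeDemo {
    // Returns the number of bytes Java serialization writes for 'value'.
    static int serializedSize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        byte single = 1;
        byte[] array = {1};
        // 'single' is auto-boxed to java.lang.Byte before it reaches the method.
        System.out.println("byte (boxed): " + serializedSize(single));
        // The array is already an object, so it is serialized as a byte[].
        System.out.println("byte[1]:      " + serializedSize(array));
        // array[0] is a primitive byte again, so it is boxed like 'single'.
        System.out.println("array[0]:     " + serializedSize(array[0]));
    }
}
```

For size-sensitive code this suggests measuring the in-memory layout or a custom binary format rather than the output of ObjectOutputStream, which carries class descriptors on top of the data.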

Casting vs Parsing vs Serialization in Java: What are the differences?

These 3 terms deal with the conversion from one form to another, which seems similar and confusing. In general, which unique features make them distinct? Under what situation(s) what should each be used?

They are similar in that all 3 deal in converting data from one representation to another one (almost, casting reference types is a bit special).
1. Casting
In Java casting does two different things, depending on whether you're casting references or primitive values:
Casting a reference simply changes the type of the reference; it does not change anything about the object. For example:
Object a = "a string constant";
String b = (String) a;
After running this code both a and b will point to the exact same object (of type String, representing the value "a string constant"). The difference is just that a is an Object type reference and b is a String type reference. This limits what you can call (so a.length() won't work, but b.length() will work).
Casting a reference type will only succeed when the object being referenced is actually of a compatible type. So if a was initialized as new Object() in the code block above, then the cast on the second line would fail with a ClassCastException.
Casting a primitive type does potentially change the value in question, depending on the range and resolution of the target type:
int i = 1000;
char c = (char) i;
byte b = (byte) i;
Here the int value 1000 is cast both to char and to byte. The first cast just leaves c equal to 1000. But byte can't hold the value 1000, so it will be truncated to -24.
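Both kinds of cast can be tried out in one small program. The class and method names below are made up for illustration; the sketch only wraps the casts from the examples above so that the outcomes can be observed.

```java
final class CastingDemo {
    // Narrowing primitive casts keep only the low-order bits of the value.
    static byte narrowToByte(int value) {
        return (byte) value;
    }

    static char narrowToChar(int value) {
        return (char) value;
    }

    // A reference cast never changes the object; it either succeeds or throws.
    static boolean canCastToString(Object o) {
        try {
            String s = (String) o;
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object a = "a string constant";
        String b = (String) a;
        System.out.println(a == b);                        // same object, two reference types
        System.out.println(canCastToString(new Object())); // incompatible type
        System.out.println((int) narrowToChar(1000));      // fits into char's range
        System.out.println(narrowToByte(1000));            // 1000 = 0x3E8, low byte 0xE8 = -24
    }
}
```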
2. Parsing
Parsing is about converting textual data to a more specific representation. The simplest example of parsing is something like this:
String s = "1000";
int i = Integer.parseInt(s);
s holds the textual representation of the number 1000, i.e. the Unicode characters U+0031 U+0030 U+0030 U+0030. Integer.parseInt takes that text representation and converts it into an int type.
However, parsing can describe a wide variety of processes, ranging from simple ones as above, through slightly more complex ones like parsing a decimal number or a date, up to building arbitrarily complex object trees.
As an example: The Java compiler will parse the Java source code and convert it into an internal representation that will then be further processed.
According to some definitions parsing can also apply to non-text inputs, as long as the input is some set of symbols (which could just be bytes), but that interpretation is rather rare.
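As a rough illustration of the simple end of that range, the sketch below parses an integer, a decimal number and an ISO-8601 date using standard library methods. ParsingDemo and its helper methods are hypothetical names chosen for the example.

```java
import java.time.LocalDate;

final class ParsingDemo {
    // Returns true if the text is a valid decimal int according to Integer.parseInt.
    static boolean isInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // LocalDate.parse expects the ISO-8601 form yyyy-MM-dd by default.
    static LocalDate parseDate(String text) {
        return LocalDate.parse(text);
    }

    public static void main(String[] args) {
        int i = Integer.parseInt("1000");
        double d = Double.parseDouble("12.5");
        LocalDate date = parseDate("2020-02-29");
        System.out.println(i + 1);              // arithmetic works on the parsed value
        System.out.println(d * 2);
        System.out.println(date.getDayOfMonth());
        System.out.println(isInt("12x"));       // not a number, so parsing fails
    }
}
```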
3. Serialization
Serialization is the process of turning data or program state into something that can easily be stored or transferred. Usually that means into a byte stream (or more directly, a byte[]).
Similarly to parsing, serialization can range from very simple one-value transformations up to serializing whole object trees and writing them to files.
In Java Serialization usually refers to the mechanism surrounding ObjectOutputStream and ObjectInputStream, but the term is also used to describe the general concept (i.e. other formats can also be described as "serialization").
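A minimal round trip through the built-in mechanism might look like this. It is a sketch with hypothetical names (RoundTripDemo, serialize, deserialize) that keeps everything in memory instead of writing to a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;

final class RoundTripDemo {
    // Turns an object graph into a byte[] using Java serialization.
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the object graph from the bytes produced by serialize().
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ArrayList<String> original = new ArrayList<>(Arrays.asList("a", "b"));
        Object copy = deserialize(serialize(original));
        System.out.println(copy.equals(original)); // equal content
        System.out.println(copy == original);      // but a different object
    }
}
```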

Implementing a very efficient bit structure

I'm looking for a solution in pseudocode, Java, or JS for the following problem:
We need to implement an efficient bit structure to hold data for N bits (you could think of the bits as booleans as well, on/off).
We need to support the following methods:
init(n)
get(index)
set(index, True/False)
setAll(True/false)
I got to a solution that is O(1) for everything except init, which is O(n). The idea was to create an array where each index stores the value of one bit. To support setAll, I would also save a timestamp with each bit value, to know whether to take the value from the array or from the last setAll value. The O(n) in init comes from having to go through the array to clear it; otherwise it will contain garbage, which can be ANYTHING. Now I was asked to find a solution where init is also O(1) (we can create an array, but we can't clear the garbage; the garbage might even look like valid data, which would make the solution wrong, and we need a solution that works 100%).
Update:
This is an algorithmic question and not a language-specific one. I encountered it as an interview question. Also, using an integer to represent the bit array is not good enough because of memory limits. I was given a hint that it has something to do with some kind of smart handling of the garbage data in the array without cleaning it in init, using some mechanism that does not fail because of the garbage data in the array (but I'm not sure how).

Make a lazy data structure based on a hash map (although a hash map may sometimes have worse access time than O(1)) with 32-bit values (8-, 16-, or 64-bit integers are suitable too) for storage and an auxiliary field InitFlag.
To clear all, make an empty map with InitFlag = 0 (deleting the old map is the GC's job in Java, isn't it?).
To set all, make an empty map with InitFlag = 1.
When changing some bit, check whether the corresponding int key bitnum/32 exists. If yes, just change bit bitnum%32; if not, and the bit value differs from InitFlag, create the key with a value based on InitFlag (all zeros or all ones) and change the needed bit.
When retrieving some bit, check whether the corresponding key exists. If yes, extract the bit; if not, return the InitFlag value.
SetAll(0): ifl = 0, map - {}
SetBit(35): ifl = 0, map - {1 : 0x08}
SetBit(32): ifl = 0, map - {1 : 0x09}
ClearBit(32): ifl = 0, map - {1 : 0x08}
ClearBit(1): do nothing, ifl = 0, map - {1 : 0x08}
GetBit(1): key=0 doesn't exist, return ifl=0
GetBit(35): key=1 exists, return (map[1]>>3)&1 = 1
SetAll(1): ifl = 1, map = {}
SetBit(35): do nothing
ClearBit(35): ifl = 1, map - {1 : 0xFFFFFFF7 = 0b...11110111}
and so on
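The scheme traced above could be sketched in Java roughly as follows. LazyBitSet and its method names are made up for illustration, and the O(1) bounds are only the expected costs of java.util.HashMap, not worst-case guarantees.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyBitSet {
    // Only 32-bit words that differ from the default value are stored.
    private Map<Integer, Integer> words = new HashMap<>();
    // Value of every bit whose word is not present in the map (InitFlag).
    private boolean initFlag = false;

    // Dropping the old map replaces all bits at once; the GC reclaims it later.
    void setAll(boolean value) {
        words = new HashMap<>();
        initFlag = value;
    }

    boolean get(int index) {
        Integer word = words.get(index >>> 5);       // key = index / 32
        if (word == null) {
            return initFlag;
        }
        return ((word >>> (index & 31)) & 1) != 0;   // bit = index % 32
    }

    void set(int index, boolean value) {
        int key = index >>> 5;
        int mask = 1 << (index & 31);
        Integer word = words.get(key);
        if (word == null) {
            if (value == initFlag) {
                return;                              // already the default, nothing to store
            }
            word = initFlag ? -1 : 0;                // all ones or all zeros
        }
        words.put(key, value ? (word | mask) : (word & ~mask));
    }

    public static void main(String[] args) {
        LazyBitSet bits = new LazyBitSet();
        bits.set(35, true);
        System.out.println(bits.get(35) + " " + bits.get(1));
        bits.setAll(true);
        bits.set(35, false);
        System.out.println(bits.get(35) + " " + bits.get(34));
    }
}
```

Creating the structure allocates nothing proportional to N, which is what makes init cheap; the price is the hash lookup on every access.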

If this is a college/high-school computer science test or homework assignment question - I suspect they are trying to get you to use BOOLEAN BIT-WISE LOGIC - specifically, saving the bit inside of an int or a long. I suspect (but I'm not a mind-reader - and I could be wrong!) that using "Arrays" is exactly what your teacher would want you to avoid.
For instance, this quote is copied from Google's search results:
long: The long data type is a 64-bit two's complement integer. The
signed long has a minimum value of -2^63 and a maximum value of 2^63-1.
In Java SE 8 and later, you can use the long data type to represent an
unsigned 64-bit long, which has a minimum value of 0 and a maximum
value of 2^64-1
What that means is that a single long variable in Java could store 64 of your bit-wise values:
long storage = 0L;
// To read a bit-value, use bitwise AND ('&') with a mask and compare against zero.
boolean result1 = (storage & 0b00000001) != 0; // Gets the first bit in 'storage'
boolean result2 = (storage & 0b00000010) != 0; // Gets the second
boolean result3 = (storage & 0b00000100) != 0; // Gets the third
...
boolean result8 = (storage & 0b10000000) != 0; // Gets the eighth result.
I could write the entire thing for you, but I'm not 100% sure of your actual specifications - if you use a long, you can only store 64 separate binary values. If you want an arbitrary number of values, you would have to use as many 'long' as you need.
Here is an SO post about binary / boolean values:
Binary representation in Java
Here is a SO post about bit-shifting:
Java - Circular shift using bitwise operations
Again, it would be a job, and I'm not going to write the entire project. However, the get(int index) and set(int index, boolean val) methods would involve bit-wise shifting of the number 1.
long pos = 1L;
pos = pos << 5; // This functions as a 'pointer' (a mask) to bit index 5, i.e. the sixth element of the binary number list.
boolean value = (storage & pos) != 0; // This retrieves the value stored at that position.
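Extending the single long to "as many long as you need" could look like the sketch below. PackedBits is a hypothetical name. Note that Java zero-fills a new array, so construction here is still O(n) and this does not address the O(1) init requirement from the question.

```java
final class PackedBits {
    private final long[] words;

    // One long stores 64 bits, so round the word count up.
    PackedBits(int bitCount) {
        words = new long[(bitCount + 63) >>> 6];
    }

    boolean get(int index) {
        long mask = 1L << (index & 63);        // position inside the word
        return (words[index >>> 6] & mask) != 0;
    }

    void set(int index, boolean value) {
        long mask = 1L << (index & 63);
        if (value) {
            words[index >>> 6] |= mask;        // switch the bit on
        } else {
            words[index >>> 6] &= ~mask;       // switch the bit off
        }
    }

    public static void main(String[] args) {
        PackedBits bits = new PackedBits(130);
        bits.set(129, true);
        System.out.println(bits.get(129) + " " + bits.get(128));
    }
}
```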

java jna - get byte array by reference java.lang.IndexOutOfBoundsException

I'm using JNA and I get a strange error getting a byte array.
I use this code:
PointerByReference mac=new PointerByReference();
NativeInterface.getMac(mac);
mac.getPointer().getByteArray(0,8)
And it throws an IndexOutOfBoundsException: Bounds exceeds available space : size=4, offset=8, even though I'm sure that the byte array returned is 8 bytes long.
I tried to get that array as String:
mac.getPointer().getString(0)
And here I successfully get a String 8 chars long.
Can you explain why?
Thank you.

PointerByReference.getValue() returns the Pointer you're looking for. PointerByReference.getPointer() returns its address.
mac.getPointer().getByteArray(0, 8) is attempting to read 8 bytes from the PointerByReference allocated memory (which is a pointer), and put those bytes into a Java primitive array. You're asking for 8 bytes but there are only 4 allocated, thus the corresponding error.
mac.getPointer().getString(0) is attempting to read a C string from the memory allocated for a pointer value (as if it were const char *), and convert that C string into a Java String. It only bounds-checks the start of the string on the Java side, so it will keep reading memory (even if it is technically out of bounds) until it finds a zero value.
EDIT
mac.getValue().getByteArray(0, 8) will give you what you were originally trying to obtain (an array of 8 bytes).
EDIT
If your called function is supposed to be writing to a buffer (and not writing the address of a buffer), then you should change its signature to accept byte[] instead, e.g.
byte[] buffer = new byte[8];
getMac(buffer);

store a string in an int

I'm trying to store a string in an integer as follows:
I read the characters of the string, and for every 4 characters I do this:
val = (int) ch << 24 | (int) ch << 16 | (int) ch << 8 | (int) ch;
Then I put the integer value into an array of integers called memory (=> int memory[16]).
I would like to do this automatically for a string of any length, and I also have difficulty reversing the procedure for a string of arbitrary size. Any help?
EDIT:
Basically, I'm doing an exercise in Java. It's a MIPS simulator system. I have Register, Datum, Instruction, Label, Control, APSImulator classes and others. When I load the program from an array into the simulator's memory, I read each element of the array, which is called 'program', and put it in memory. Memory is 2048 long and 32 bits wide. Registers are also declared as 32-bit integers. So when the array contains an element like Datum.datum( "string" ) (the Datum class has IntDatum and StringDatum subclasses), I somehow have to store the "string" in the data segment of the simulator's memory. Memory is split into a text region (0-1023) and a data region (1024-2047). I also have to terminate the string with a null char, plus perform checks for full memory and so on. I figured that one way to store a String into MemContents (a reference type: an empty interface implemented by the class that the memory field belongs to) is to store the string, every 2 or maybe 4 symbols, in a register and then take the contents of the register and store it in memory. I found that very difficult to implement, and the reverse procedure as well.

If you are working in C and your string is in a char array whose size is a multiple of the size of an int, you can just take the pointer to the char array, cast it to a pointer to an int array, and do whatever you want with your int array. If you don't have this guarantee, you can simply write a function that creates your int array on the fly:
#include <stdlib.h>
#include <string.h>

size_t IntArrayFromString(const char * Source, int ** Dest)
{
    size_t stringLength = strlen(Source);
    /* Number of ints needed to hold the string, rounded up. */
    size_t intArrElements = stringLength / sizeof(int);
    if (stringLength % sizeof(int) != 0)
        intArrElements++;
    /* calloc zero-fills the buffer, so the last int is padded with zeros. */
    *Dest = (int *)calloc(intArrElements, sizeof(int));
    if (*Dest == NULL)
        return 0;
    /* Copy into the allocated buffer (*Dest), not over the pointer variable itself. */
    memcpy(*Dest, Source, stringLength);
    return intArrElements;
}
The caller is responsible for freeing the Dest buffer.
(I'm not sure if it really works, I didn't test it)

Have you considered simply using String.getBytes()? You can then use the byte array to create the ints (for example, using the BigInteger(byte[]) constructor).
This may not be the most efficient solution, but it is probably less prone to errors and more readable.

Assuming Java: you could look at the ByteBuffer class and its getInt method. The buffer has a byte order setting (see ByteBuffer.order) which you need to configure first.
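A sketch of the ByteBuffer approach for strings of any length follows. It makes several assumptions that are not in the question: the text is ASCII, words are big-endian, and the string is terminated by at least one NUL byte as described for the simulator. StringWords, pack and unpack are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

final class StringWords {
    // Packs the string into 32-bit words, four characters per word,
    // padding with zero bytes so that at least one NUL terminator is stored.
    static int[] pack(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        int wordCount = (bytes.length + 1 + 3) / 4;   // +1 for the NUL, rounded up
        ByteBuffer buffer = ByteBuffer.allocate(wordCount * 4).order(ByteOrder.BIG_ENDIAN);
        buffer.put(bytes);                            // remaining bytes stay zero
        buffer.rewind();
        int[] words = new int[wordCount];
        buffer.asIntBuffer().get(words);
        return words;
    }

    // Reverses pack(): writes the words back as bytes and reads up to the first NUL.
    static String unpack(int[] words) {
        ByteBuffer buffer = ByteBuffer.allocate(words.length * 4).order(ByteOrder.BIG_ENDIAN);
        buffer.asIntBuffer().put(words);
        byte[] bytes = buffer.array();
        int length = 0;
        while (length < bytes.length && bytes[length] != 0) {
            length++;
        }
        return new String(bytes, 0, length, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        int[] words = pack("hello");
        System.out.println(Integer.toHexString(words[0])); // 'h','e','l','l' packed into one word
        System.out.println(unpack(words));
    }
}
```

The resulting int[] can be copied word by word into the simulator's data region; switching to ByteOrder.LITTLE_ENDIAN changes only the order of the characters inside each word.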

One common way to do this in C is to use a union. It could look like
union u_intstr {
    char fourChars[4];
    int singleInt;
};
Set the chars into the union as
union u_intstr myIntStr;
myIntStr.fourChars[0] = ch1;
myIntStr.fourChars[1] = ch2;
myIntStr.fourChars[2] = ch3;
myIntStr.fourChars[3] = ch4;
and then access the int as
printf("%d\n", myIntStr.singleInt);
Edit
In your case for 16 ints the union could be extended to look like
union u_my16ints {
    char str[16*sizeof(int)];
    int ints[16];
};

This is what I came up with:
size_t len = strlen(str);
size_t count = (len + sizeof(int)) / sizeof(int);
int *ptr = (int *)calloc(count, sizeof(int));
memcpy(ptr, str, len); /* copy only len bytes, so we never read past the end of str */
Due to the use of calloc(), the resulting buffer has at least one NUL byte, maybe more to pad the last integer. This is not portable because the integers are in native byte order.
