Basic arithmetic on two byte arrays in Java without BigInteger - java

I have two byte arrays that represent unsigned 256-bit values, and I want to perform simple arithmetic operations on them like ADD, SUB, DIV, MUL and EXP. Is there a way to perform these directly on the byte arrays? Currently I convert these byte array values to a BigInteger and then perform the calculations, but I suspect this is costing me performance. How would you do this to get the fastest results?
For example, this is my current add-function:
// Both byte arrays are length 32 and represent unsigned 256-bit values
public void add(byte[] data1, byte[] data2) {
    BigInteger value1 = new BigInteger(1, data1);
    BigInteger value2 = new BigInteger(1, data2);
    BigInteger result = value1.add(value2);
    byte[] bytes = result.toByteArray();
    ByteBuffer buffer = ByteBuffer.allocate(32);
    System.arraycopy(bytes, 0, buffer.array(), 32 - bytes.length, bytes.length);
    this.buffer = buffer.array();
}

I don’t think that there is much benefit from working on byte[] directly rather than using BigInteger, but to satisfy your curiosity, here is an example of how to add two byte arrays of size 32:
public static byte[] add(byte[] data1, byte[] data2) {
    if (data1.length != 32 || data2.length != 32)
        throw new IllegalArgumentException();
    byte[] result = new byte[32];
    // walk from the least significant byte to the most significant,
    // carrying the overflow of each byte addition into the next one
    for (int i = 31, overflow = 0; i >= 0; i--) {
        int v = (data1[i] & 0xff) + (data2[i] & 0xff) + overflow;
        result[i] = (byte) v;     // low 8 bits of the sum
        overflow = v >>> 8;       // carry (0 or 1) for the next byte
    }
    return result;
}
Note that it is possible to use one of the input arrays as the target for the result. However, don’t be surprised if such reuse even has a negative impact on performance. On today’s systems there are no simple answers to “how do I speed this up” anymore…
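For a quick sanity check, here is a rough sketch (assuming wrap-around on overflow of the top byte is the desired behaviour) that compares this byte[] add against the BigInteger route, reducing the BigInteger sum modulo 2^256:
import java.math.BigInteger;
import java.util.Arrays;
import java.util.Random;

public class AddCheck {

    // Same algorithm as the add method shown above.
    static byte[] add(byte[] data1, byte[] data2) {
        byte[] result = new byte[32];
        for (int i = 31, overflow = 0; i >= 0; i--) {
            int v = (data1[i] & 0xff) + (data2[i] & 0xff) + overflow;
            result[i] = (byte) v;
            overflow = v >>> 8;
        }
        return result;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        byte[] a = new byte[32], b = new byte[32];
        rnd.nextBytes(a);
        rnd.nextBytes(b);

        // BigInteger reference result, reduced modulo 2^256 and left-padded to 32 bytes
        BigInteger sum = new BigInteger(1, a).add(new BigInteger(1, b))
                                             .mod(BigInteger.ONE.shiftLeft(256));
        byte[] raw = sum.toByteArray();
        byte[] expected = new byte[32];
        int n = Math.min(raw.length, 32);
        System.arraycopy(raw, raw.length - n, expected, 32 - n, n);

        System.out.println(Arrays.equals(add(a, b), expected)); // prints true
    }
}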

A byte[] isn't an ideal representation for a big unsigned number. Consider, for example, that to add two of these numbers you have to loop over the two arrays, adding each byte (plus the carry from the previous byte), and then store the resulting byte somewhere.
BigInteger internally represents the value in a manner suitable for the operations it provides, so its operations will very likely be at least as good as you can do with byte[]. A slight drawback in terms of performance might be that BigInteger is immutable.
Performance-wise, a simple, mutable holder object consisting of four long fields would probably do best:
class My256BitNumber {
    long l0;
    long l1;
    long l2;
    long l3;
    public void add(My256BitNumber arg) {
        //...
    }
}
That would allow you to bypass the overhead of object creation (since the holder is mutable), as well as any potential array access overhead (such as array index bounds checks).
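For illustration, here is a rough sketch of what the add of such a holder could look like, assuming l0 holds the most significant 64 bits and using unsigned compares for carry detection (not benchmarked):
class My256BitNumber {
    long l0, l1, l2, l3;   // assumption: l0 = most significant 64 bits, l3 = least significant

    private long carry;    // carry propagated between limbs during add

    // Adds two 64-bit limbs plus the incoming carry; unsigned compares detect overflow.
    private long addLimb(long a, long b) {
        long s = a + b;
        long c1 = Long.compareUnsigned(s, a) < 0 ? 1 : 0;   // a + b overflowed
        long r = s + carry;
        long c2 = Long.compareUnsigned(r, s) < 0 ? 1 : 0;   // adding the carry-in overflowed
        carry = c1 | c2;
        return r;
    }

    public void add(My256BitNumber arg) {
        carry = 0;
        l3 = addLimb(l3, arg.l3);   // least significant limb first
        l2 = addLimb(l2, arg.l2);
        l1 = addLimb(l1, arg.l1);
        l0 = addLimb(l0, arg.l0);   // carry out of l0 is dropped (wrap-around at 256 bits)
    }
}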
But considering that none of the operations are trivial to implement, just make use of BigInteger. It combines reasonable performance with reasonable simplicity of use and, most importantly, it is a tested, working solution.
Whether rolling your own implementation is worth it depends on your use case. Considering you're asking if one could get better performance than BigInteger, the answer is: yes, you can - BUT at a severe expense in code complexity.

Related

java long to byte[] (primitive long, not Long to byte array) - not equal between two implementations

Small question regarding a conversion from a primitive long to a byte array, please.
I used to have a small piece of code:
import com.datastax.oss.driver.shaded.guava.common.primitives.Ints;
final long timeStamp = Instant.now().getEpochSecond();
final byte[] timeStampAsBytes = Ints.toByteArray((int) timeStamp);
Note that there is an external library dependency, as well as a cast, so I decided to refactor it to make it a bit clearer.
final long timeStamp = Instant.now().getEpochSecond();
final byte[] timeStampAsBytes2 = ByteBuffer.allocate(Long.SIZE / Byte.SIZE).putLong(timeStamp).array();
However, to my surprise, the two are actually not equal!
final long timeStamp = Instant.now().getEpochSecond();
final byte[] timeStampAsBytes = Ints.toByteArray((int) timeStamp);
final byte[] timeStampAsBytes2 = ByteBuffer.allocate(Long.SIZE / Byte.SIZE).putLong(timeStamp).array();
if (Arrays.equals(timeStampAsBytes, timeStampAsBytes2)) {
    System.out.println("EQUAL");
} else {
    System.out.println("NOT EQUAL"); // this got printed
}
I was wondering why they aren't equal, and, if possible, what would be a cleaner version of
final byte[] timeStampAsBytes = Ints.toByteArray((int) timeStamp);
Thank you
ints are 4 bytes, longs are 8 bytes. They'd have been equal if you had used the same data type for both.
Note that seconds-in-an-int is an extremely bad plan, because it doesn't get you any further than halfway through 2038. That ridiculous 'year 2000 bug' malarkey is now farther in the past than the 2038 problem is in the future. Millis-in-a-long is the usual strategy; that'll get you extremely far (millions of years). You can do seconds-in-a-long too; it isn't any more efficient (still 8 bytes), so do that only if you explicitly want to not store the free millisecond part.
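A minimal sketch showing that the two conversions agree once both sides use the full 8-byte long, without any external dependency (assuming big-endian order, which is what ByteBuffer uses by default):
import java.nio.ByteBuffer;
import java.time.Instant;
import java.util.Arrays;

public class LongToBytesDemo {
    public static void main(String[] args) {
        final long timeStamp = Instant.now().getEpochSecond();

        // Manual big-endian conversion, 8 bytes, no external library
        byte[] manual = new byte[Long.BYTES];
        for (int i = 0; i < Long.BYTES; i++) {
            manual[i] = (byte) (timeStamp >>> (8 * (Long.BYTES - 1 - i)));
        }

        // ByteBuffer version from the question (big-endian by default)
        byte[] viaBuffer = ByteBuffer.allocate(Long.BYTES).putLong(timeStamp).array();

        System.out.println(Arrays.equals(manual, viaBuffer)); // prints true
    }
}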

Java->C# BigInteger + Math Conversion

I am attempting to convert some BigInteger objects and math from Java to C#.
The Java flow is as follows:
1. Construct 2 BigIntegers from a base-10 string (i.e. 0-9 values).
2. Construct a third BigInteger from an inputted byte array.
3. Create a fourth BigInteger as third.modPow(first, second).
4. Return the byte result of fourth.
The main complications in converting to C# seem to consist of endianness and signed/unsigned values.
I have tried a couple different ways to convert the initial 2 BigIntegers from Java->C#. I believe that using the base-10 string with BigInteger.Parse will work as intended, but I am not completely sure.
Another complication comes from the use of a BinaryReader/BinaryWriter implementation, in C#, that is already big-endian (like Java). I use the BR/BW to supply the byte array to create the third BigInteger and consume the byte array produced from the modPow (the fourth BigInteger).
I have tried reversing the byte arrays for input and output in every way, and still do not get the expected output.
Java:
public static byte[] doMath(byte[] input)
{
    BigInteger exponent = new BigInteger("BASE-10-STRING");
    BigInteger mod = new BigInteger("BASE-10-STRING");
    BigInteger bigInput = new BigInteger(input);
    return bigInput.modPow(exponent, mod).toByteArray();
}
C#:
public static byte[] CSharpDoMath(byte[] input)
{
    BigInteger exponent = BigInteger.Parse("BASE-10-STRING");
    BigInteger mod = BigInteger.Parse("BASE-10-STRING");
    // big->little endian
    byte[] reversedBytes = input.Reverse().ToArray();
    BigInteger bigInput = new BigInteger(reversedBytes);
    BigInteger output = BigInteger.ModPow(bigInput, exponent, mod);
    // little->big endian
    byte[] bigOutput = output.ToByteArray().Reverse().ToArray();
    return bigOutput;
}
I need the same output from both.
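One thing worth illustrating on the Java side of the signed/unsigned pitfall (a small, hypothetical example): new BigInteger(byte[]) interprets the array as signed big-endian two's complement, so a leading byte of 0x80 or above yields a negative value unless the (signum, magnitude) constructor is used:
import java.math.BigInteger;

public class SignednessDemo {
    public static void main(String[] args) {
        byte[] input = { (byte) 0x80, 0x00 };          // high bit of the first byte set
        System.out.println(new BigInteger(input));     // -32768 (signed interpretation)
        System.out.println(new BigInteger(1, input));  // 32768 (unsigned interpretation)
    }
}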

C# equivalent to Java's Float.floatToIntBits

I've been writing a port of a networking library from Java, and this is the last line of code I have yet to decipher and move over. The line of code is as follows:
Float.floatToIntBits(Float);
Which returns an integer.
The code of floatToIntBits in Java
public static int floatToIntBits(float value) {
    int result = floatToRawIntBits(value);
    // Check for NaN based on values of bit fields, maximum
    // exponent and nonzero significand.
    if ( ((result & FloatConsts.EXP_BIT_MASK) ==
          FloatConsts.EXP_BIT_MASK) &&
         (result & FloatConsts.SIGNIF_BIT_MASK) != 0)
        result = 0x7fc00000;
    return result;
}
I'm not nearly experienced enough with memory and hex values to port this over myself, not to mention the bit shifting that's all over the place, which has been driving me absolutely mad.
Take a look at the BitConverter class. For doubles it has methods DoubleToInt64Bits and Int64BitsToDouble. For floats you could do something like this:
float f = ...;
int i = BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
Or changing endianness:
byte[] bytes = BitConverter.GetBytes(f);
Array.Reverse(bytes);
int i = BitConverter.ToInt32(bytes, 0);
If you can compile with unsafe, this becomes trivial:
public static unsafe uint FloatToUInt32Bits(float f) {
    return *((uint*)&f);
}
Replace uint with int if you want to work with signed values, but I would say unsigned makes more sense. This is actually equivalent to Java's floatToRawIntBits(); floatToIntBits() is identical except that it always returns the same bit mask for all NaN values. If you want that functionality, you can just replicate that if statement from the Java version, but it's probably unnecessary.
You'll need to switch on 'unsafe' support for your assembly, so it's up to you whether you want to go this route. It's not at all uncommon for high performance networking libraries to use unsafe code.
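For reference, the NaN-collapsing difference mentioned above looks like this from the Java side (a small illustrative example; the raw bit pattern of a quiet NaN is preserved on typical platforms):
public class NaNBitsDemo {
    public static void main(String[] args) {
        // A quiet NaN with a non-canonical payload
        float quietNaN = Float.intBitsToFloat(0x7fc12345);
        System.out.printf("%08x%n", Float.floatToRawIntBits(quietNaN)); // 7fc12345 (payload kept)
        System.out.printf("%08x%n", Float.floatToIntBits(quietNaN));    // 7fc00000 (canonical NaN)
    }
}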

How to convert int [] to Big Integer?

I would like to convert an integer array of values which originally were bytes.
First, make sure you know in which format your int[] is meant to be interpreted.
Each int can be seen as consisting of four bytes, and these bytes together can be converted to a BigInteger. The detail that matters is the byte order: which byte is the most significant and which the least?
Also, do you have a signed or unsigned number?
A simple way to convert your ints to bytes (for later use in a BigInteger constructor) would be to use a ByteBuffer and wrap an IntBuffer around it.
public BigInteger toBigInteger(int[] data) {
    byte[] array = new byte[data.length * 4];
    ByteBuffer bbuf = ByteBuffer.wrap(array);
    IntBuffer ibuf = bbuf.asIntBuffer();
    ibuf.put(data);
    return new BigInteger(array);
}
Obvious adaptations would be to set the byte order of bbuf, or to use another BigInteger constructor (for unsigned values).
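A rough sketch of that unsigned adaptation, using the (signum, magnitude) constructor (the byte order of an allocated ByteBuffer is already big-endian by default; requires java.math.BigInteger and java.nio.ByteBuffer imports):
public BigInteger toUnsignedBigInteger(int[] data) {
    ByteBuffer bbuf = ByteBuffer.allocate(data.length * Integer.BYTES);
    bbuf.asIntBuffer().put(data);            // the IntBuffer view writes through to the backing array
    return new BigInteger(1, bbuf.array());  // signum 1: treat the bytes as a non-negative magnitude
}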
Well, what about new BigInteger(byte[] val)?
To quote the API docs I linked to:
Translates a byte array containing the two's-complement binary representation of a BigInteger into a BigInteger. The input array is assumed to be in big-endian byte-order: the most significant byte is in the zeroth element.

Fast ByteBuffer to CharBuffer or char[]

What is the fastest method to convert a java.nio.ByteBuffer a into a (newly created) CharBuffer b or char[] b?
While doing this it is important that a[i] == b[i]; that is, a[i] and a[i+1] should not together make up one value b[j] (which is what getChar(i) would do), but each byte value should be "spread" into its own char.
byte a[] = { 1,2,3, 125,126,127, -128,-127,-126 } // each a byte (which are signed)
char b[] = { 1,2,3, 125,126,127, 128, 129, 130 } // each a char (which are unsigned)
Note that byte -128 has the same (lower 8) bits as char 128. Therefore I assume the "best" interpretation is the one I noted above, because the bits are the same.
After that I also need the reverse translation: the most efficient way to get a char[] or java.nio.CharBuffer back into a java.nio.ByteBuffer.
So, what you want is to convert using the encoding ISO-8859-1.
I don't claim anything about efficiency, but at least it is quite short to write:
CharBuffer result = Charset.forName("ISO-8859-1").decode(byteBuffer);
The other direction would be:
ByteBuffer result = Charset.forName("ISO-8859-1").encode(charBuffer);
Please measure this against other solutions. (To be fair, the Charset.forName part should not be included in the measurement, and the lookup should also be done only once, not again for each buffer.)
From Java 7 on there also is the StandardCharsets class with pre-instantiated Charset instances, so you can use
CharBuffer result = StandardCharsets.ISO_8859_1.decode(byteBuffer);
and
ByteBuffer result = StandardCharsets.ISO_8859_1.encode(charBuffer);
instead. (These lines do the same as the ones before; only the lookup is easier, there is no risk of mistyping the charset name, and there is no need to catch the impossible exceptions.)
I would agree with @Ishtar's suggestion to avoid converting to a new structure at all and to only convert as you need it.
However, if you have a heap ByteBuffer, you can do:
ByteBuffer bb = ...
byte[] array = bb.array();
char[] chars = new char[bb.remaining()];
for (int i = 0; i < chars.length; i++)
    chars[i] = (char) (array[i + bb.position()] & 0xFF);
Aside from deferring creation of the CharBuffer, you may be able to get by without one.
If the code that is using the data as characters does not strictly need a CharBuffer or char[], just do a simple on-the-fly conversion: use ByteBuffer.get() (relative or absolute), convert the byte to a char, and use that. Note that, as pointed out, you unfortunately MUST mask explicitly; otherwise values 128-255 will be sign-extended to incorrect values (0xFF80 - 0xFFFF). This is not needed for 7-bit ASCII.
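A rough sketch of that on-the-fly approach, together with the reverse narrowing back into a ByteBuffer (assuming every char value fits into 8 bits):
import java.nio.ByteBuffer;

public class OnTheFlyConversion {

    // Widen bytes to chars one at a time; the & 0xFF mask prevents sign extension,
    // so bytes -128..-1 become chars 128..255.
    static void printAsChars(ByteBuffer in) {
        while (in.hasRemaining()) {
            System.out.print((char) (in.get() & 0xFF));
        }
    }

    // The reverse direction: narrow chars back into a ByteBuffer,
    // keeping only the low 8 bits of each char.
    static ByteBuffer toByteBuffer(char[] chars) {
        ByteBuffer out = ByteBuffer.allocate(chars.length);
        for (char c : chars) {
            out.put((byte) c);
        }
        out.flip();   // make the buffer readable from the start
        return out;
    }

    public static void main(String[] args) {
        printAsChars(ByteBuffer.wrap(new byte[] { 72, 105, 33 }));                   // prints Hi!
        System.out.println();
        System.out.println(toByteBuffer(new char[] { 'H', 'i', '!' }).remaining());  // prints 3
    }
}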
