I want to put a short integer (16 bits) into tail of a 32 bits integer (I'm compressing some small-range numbers to an unique integer). But didn't found a better looking way than bellow, I have to use two operation, can it be better?
short sh = -1; // 16 bits 1
int zip = 0; // last 16 bits presents a short number
System.out.println("Wrong way, maybe many of us will try first:");
zip |= sh; // expect 0xffff
System.out.println(Integer.toHexString(zip)); // but 0xffffffff
System.out.println("Simple way, right but looks ugly, two operations");
zip = 0;
zip |= sh << 16 >>> 16; // expect 0xffff
System.out.println(Integer.toHexString(zip)); // yeah
System.out.println("Is there a nicer looking way, one operation for example?");
All bitwise operations in Java are done using ints or longs, therefore you don't have too much choice here; and casting a shorter primitive type whose sign bit is set will "expand" the sign bit to all "newer created" bits.
There is a solution to do it "in one line", but that will still be two operations:
zip |= (sh & 0xffff);
Note, it pretty much requires that zip is 0 to start with, or at least that its 16 low bits are 0.
Related
I was recently looking into some problems with but manipulation in Java and I came up with two questions.
1) Firstly, I came up to the problem of flipping all the bits in a number.
I found this solution:
public class Solution {
public int flipAllBits(int num) {
int mask = (1 << (int)Math.floor(Math.log(num)/Math.log(2))+1) - 1;
return num ^ mask;
}
}
But what happens when k = 32 bits? Can the 1 be shifted 33 times?
What I understand from the code (although it doesn't really make sense), the mask is 0111111.(31 1's)....1 and not 32 1's, as someone would expect. And therefore when num is a really large number this would fail.
2) Another question I had was determining when something is a bit sequence in 2s complement or just a normal bit sequence. For example I read that 1010 when flipped is 0110 which is -10 but also 6. Which one is it and how do we know?
Thanks.
1) The Math object calls are not necessary. Flipping all the bits in any ordinal type in Java (or C) is not an arithmatic operation. It is a bitwise operation. Using the '^' operator, simply using 1- as an operand will work regardless of the sizeof int in C/C++ or a Java template with with the ordinal type as a parameter T. The tilde '~' operator is the other option.
T i = 0xf0f0f0f0;
System.out.println(T.toHexString(i));
i ^= -1;
System.out.println(T.toHexString(i));
i = ~ i;
System.out.println(T.toHexString(i));
2) Since the entire range of integers maps to the entire range of integers in a 2's compliment transform, it is not possible to detect whether a number is or is not 2's complement unless one knows the range of numbers from which the 2's complement might be calculated and the two sets (before and after) are mutually exclusive.
That mask computation is fairly inscrutable, I'm going to guess that it (attempts to, since you mention it's wrong) make a mask up to and including the highest set bit. Whether that's useful for "flipping all bits" is an other possible point of discussion, since to me at least, "all bits" means all 32 of them, not some number that depends on the value. But if that's what you want then that's what you want. Especially combined with that second question, that looks like a mistake to me, so you'd be implementing the wrong thing from the start - see near the bottom.
Anyway, the mask can be generated with some reasonably nice bitmath, which does not create any doubt about possible edge cases (eg Math.log(0) is probably bad, and k=32 corresponds with negative numbers which are also probably bad to put into a log):
int m = num | (num >> 16);
m |= m >> 8;
m |= m >> 4;
m |= m >> 2;
m |= m >> 1;
return num ^ m;
Note that this function has odd properties, it almost always returns an unsigned-lower number than went in, except at 0. It flips bits so the name is not completely wrong, but flipAllBits(flipAllBits(x)) != x (usually), while the name suggests it should be an involution.
As for the second question, there is nothing to determine. Two's complement is scheme by which you can interpret a bitvector - any bitvector. So it's really a choice you make; to interpret a given bitvector that way or some other way. In Java the "default" interpretation is two's complement (eg toString will print an int by interpreting it according to its two's complement meaning), but you don't have to go along with it, you can (with care) treat int as unsigned, or as an array of booleans, or several bitfields packed together, etc.
If you wanted to invert all the bits but made the common mistake to assume that the number of bits in an int is variable (and that you therefore needed to compute a mask that covers "all bits"), I have some great news for you, because inverting all bits is a lot easier:
return ~num;
If you were reading "invert all bits" in the context of two's complement, it would have the above meaning, so all bits, including those left of the highest set bit.
I need to sum all data bytes in ByteArrayOutputStream, adding +1 to the result and taking the 2 least significant bytes.
int checksum = 1;
for(byte b : byteOutputStream.toByteArray()) {
checksum += b;
}
Any input on taking the 2 least significant bytes would be helpful. Java 8 is used in the environment.
If you really mean least significant bytes then:
checksum & 0xFFFF
If you meant that you want to take least significant bits from checksum, then:
checksum & 0x3
Add
checksum &= 0x0000ffff;
That will zero out everything to the left of the 2 least significant bytes.
Your question is a bit underspecified. You didn’t say neither, what you want to do with these two bytes nor how you want to store them (which depends on what you want to do).
To get to individual bytes, you can use
byte lowest = (byte)checksum, semiLowest=(byte)(checksum>>8);
In case you want to store them in a single integer variable, you have to decide, how these bytes are to be interpreted numerically, i.e signed or unsigned.
If you want a signed interpretation, the operation is as simple as
short lowest2bytes = (short)checksum;
If you want an unsigned interpretation, there’s the obstacle that Java has no dedicated type for that. There is a 2 byte sized unsigned type (char), but using it for numerical values can cause confusion when other code tries to interpret it as character value (i.e. when printing). So in that case, the best solution is to use an int variable again and only initialize it with the unsigned char value:
int lowest2bytes = (char)checksum;
Note that this is semantically equivalent to
int lowest2bytes = checksum&0xffff;
seen in other solutions.
I'm using byte arrays (of size 2 or 4) to emulate the effect of short and int data types.
Main idea is to have a data type that support both char and int types, however it is really hard for me to emulate arithmetic operations in this way, since I must do them in bit level.
For those who do not follow:
The int representation of 123 is not equal to the byte[] of {0,1,2,3} since their bit representations differ (123 representation is 00000000000000000000000001111011 and the representation of {0,1,2,3} is 00000000000000010000001000000011 on my system.
So "int of 123" would actually be equivalent to "byte[] of {0,0,0,123}". The problems occur when values stretch over several bytes and I try to subtract or decrement from those byte arrays, since then you have to interact with several different bytes and my math isn't that sharp.
Any pseudo-code or java library suggestions would be welcome.
Unless you really want to know what bits are being carried from one byte to the next, I'd suggest don't do it this way! If it's just plain math, then convert your arrays to real short and int types, do the math, then convert them back again.
If you must do it this way, consider the following:
Imaging you're adding two short variables that are in byte arrays.
The first problem you have is that all Java integer types are signed.
The second is that the "carry" from the least-significant-byte into the most-significant-byte is best done using a type that's longer than a byte because otherwise you can't detect the overflow.
i.e. if you add two 8-bit values, the carry will be in bit 8. But a byte only has bits 0..7, so to calculate bit 8 you have to promote your bytes to the next appropriate larger type, do the add operation, then figure out if it resulted in a carry, and then handle that when you add up the MSB. It's just not worth it.
BTW, I did actually have to do this sort of bit manipulation many years ago when I wrote an MC6809 CPU emulator. It was necessary to perform multiple operations on the same operands just to be able to figure out the effect on the CPU's various status bits, when those same bits are generated "for free" by a hardware ALU.
For example, my (C++) code to add two 8-bit registers looked like this:
void mc6809::help_adc(Byte& x)
{
Byte m = fetch_operand();
{
Byte t = (x & 0x0f) + (m & 0x0f) + cc.bit.c;
cc.bit.h = btst(t, 4); // Half carry
}
{
Byte t = (x & 0x7f) + (m & 0x7f) + cc.bit.c;
cc.bit.v = btst(t, 7); // Bit 7 carry in
}
{
Word t = x + m + cc.bit.c;
cc.bit.c = btst(t, 8); // Bit 7 carry out
x = t & 0xff;
}
cc.bit.v ^= cc.bit.c;
cc.bit.n = btst(x, 7);
cc.bit.z = !x;
}
which requires that three different additions get done on different variations of the operands just to extract the h, v and c flags.
I have a 3 byte signed number that I need to determine the value of in Java. I believe it is signed with one's complement but I'm not 100% sure (I haven't studied this stuff in more than 10 years and the documentation of my problem isn't super clear). I think the problem I'm having is Java does everything in two's complement. I have a specific example to show:
The original 3-byte number: 0xEE1B17
Parsed as an integer (Integer.parseInt(s, 16)) this becomes: 15604503
If I do a simple bit flip (~) of this I get (I think) a two's complement representation: -15604504
But the value I should be getting is: -1172713
What I think is happening is I'm getting the two's complement of the entire int and not just the 3 bytes of the int, but I don't know how to fix this.
What I have been able to do is convert the integer to a binary string (Integer.toBinaryString()) and then manually "flip" all of the 0s to 1s and vice-versa. When then parsing this integer (Integer.parseInt(s, 16)) I get 1172712 which is very close. In all of the other examples I need to always add 1 to the result to get the answer.
Can anyone diagnose what type of signed number encoding is being used here and if there is a solution other than manually flipping every character of a string? I feel like there must be a much more elegant way to do this.
EDIT: All of the responders have helped in different ways, but my general question was how to flip a 3-byte number and #louis-wasserman answered this and answered first so I'm marking him as the solution. Thanks to everyone for the help!
If you want to flip the low three bytes of a Java int, then you just do ^ 0x00FFFFFF.
0xFFEE1B17 is -1172713
You must only add the leading byte. FF if the highest bit of the 3-byte value is set and 00 otherwise.
A method which converts your 3-byte value to a proper intcould look like this:
if(byte3val>7FFFFF)
return byte3val| 0xFF000000;
else
return byte3val;
Negative signed numbers are defined so that a + (-a) = 0. So it means that all bits are flipped and then 1 added. See Two's complement. You can check that the condition is satisfied by this process by thinking what happens when you add a + ~a + 1.
You can recognize that a number is negative by its most significant bit. So if you need to convert a signed 3-byte number into a 4-byte number, you can do it by checking the bit and if it's set, set also the bits of the fourth byte:
if ((a & 0x800000) != 0)
a = a | 0xff000000;
You can do it also in a single expression, which will most likely perform better, because there is no branching in the computation (branching doesn't play well with pipelining in current CPUs):
a = (0xfffffe << a) >> a;
Here << and >> perform byte shifts. First we shift the number 8 bits to the right (so now it occupies the 3 "upper" bytes instead of the 3 "lower" ones), and then shift it back. The trick is that >> is so-called Arithmetic shift also known as signed shift. copies the most significant bit to all bits that are made vacant by the operation. This is exactly to keep the sign of the number. Indeed:
(0x1ffffe << 8) >> 8 -> 2097150
(0xfffffe << 8) >> 8 -> -2
Just note that java also has a unsigned right shift operator >>>. For more information, see Java Tutorial: Bitwise and Bit Shift Operators.
How can I declare an unsigned short value in Java?
You can't, really. Java doesn't have any unsigned data types, except char.
Admittedly you could use char - it's a 16-bit unsigned type - but that would be horrible in my view, as char is clearly meant to be for text: when code uses char, I expect it to be using it for UTF-16 code units representing text that's interesting to the program, not arbitrary unsigned 16-bit integers with no relationship to text.
If you really need a value with exactly 16 bits:
Solution 1: Use the available signed short and stop worrying about the sign, unless you need to do comparison (<, <=, >, >=) or division (/, %, >>) operations. See this answer for how to handle signed numbers as if they were unsigned.
Solution 2 (where solution 1 doesn't apply): Use the lower 16 bits of int and remove the higher bits with & 0xffff where necessary.
This is a really stale thread, but for the benefit of anyone coming after. The char is a numeric type. It supports all of the mathematical operators, bit operations, etc. It is an unsigned 16.
We process signals recorded by custom embedded hardware so we handle a lot of unsigned 16 from the A-D's. We have been using chars all over the place for years and have never had any problems.
You can use a char, as it is an unsigned 16 bit value (though technically it is a unicode character so could potnetially change to be a 24 bit value in the future)... the other alternative is to use an int and make sure it is within range.
Don't use a char - use an int :-)
And here is a link discussing Java and the lack of unsigned.
From DataInputStream.java
public final int readUnsignedShort() throws IOException {
int ch1 = in.read();
int ch2 = in.read();
if ((ch1 | ch2) < 0)
throw new EOFException();
return (ch1 << 8) + (ch2 << 0);
}
It is not possible to declare a type unsigned short, but in my case, I needed to get the unsigned number to use it in a for loop. There is the method toUnsignedInt in the class Short that returns "the argument converted to int by an unsigned conversion":
short signedValue = -4767;
System.out.println(signedValue ); // prints -4767
int unsignedValue = Short.toUnsignedInt(signedValue);
System.out.println(unsingedValue); // prints 60769
Similar methods exist for Integer and Long:
Integer.toUnsignedLong
Long.toUnsignedString : In this case it ends up in a String because there isn't a bigger numeric type.
No such type in java
Yep no such thing if you want to use the value in code vs. bit operations.
"In Java SE 8 and later, you can use the int data type to represent an unsigned 32-bit integer, which has a minimum value of 0 and a maximum value of 232-1." However this only applies to int and long but not short :(
If using a third party library is an option, there is jOOU (a spin off library from jOOQ), which offers wrapper types for unsigned integer numbers in Java. That's not exactly the same thing as having primitive type (and thus byte code) support for unsigned types, but perhaps it's still good enough for your use-case.
import static org.joou.Unsigned.*;
// and then...
UShort s = ushort(1);
(Disclaimer: I work for the company behind these libraries)
No, really there is no such method, java is a high-level language. That's why Java doesn't have any unsigned data types.
He said he wanted to create a multi-dimensional short array. Yet no one suggested bitwise operators? From what I read you want to use 16 bit integers over 32 bit integers to save memory?
So firstly to begin 10,000 x 10,000 short values is 1,600,000,000 bits, 200,000,000 bytes, 200,000 kilobytes, 200 megabytes.
If you need something with 200MB of memory consumption you may want to redesign this idea. I also do not believe that will even compile let alone run. You should never initialize large arrays like that if anything utilize 2 features called On Demand Loading and Data Caching. Essentially on demand loading refers to the idea to only load data as it is needed. Then data caching does the same thing, but utilizes a custom frame work for delete old memory and adding new information as needed. This one is tricky to have GOOD speed performance. There are other things you can do, but those two are my favorite when done right.
Alright back to what I was saying about bitwise operators.
So a 32bit integer or in Java "int". You can store what are called "bits" to this so let's say you had 32 Boolean values which in Java all values take up 32 bits (except long) or for arrays they take up 8 for byte, 16 for short, and 32 for int. So unless you have arrays you don't get any memory benefits from using a byte or short. This does not mean you shouldn't use it as its a way to ensure you and others know the data range this value should have.
Now as I was saying you could effectively store 32 Booleans into a single integer by doing the following:
int many_booleans = -1; //All are true;
int many_booleans = 0; //All are false;
int many_booleans = 1 | 2 | 8; //Bits 1, 2, and 4 are true the rest are false;
So now a short consists of 16 bits so 16 + 16 = 32 which fits PERFECTLY within a 32bit integer. So every int value can consist of 2 short values.
int two_shorts = value | (value2 << 16);
So what the above is doing is value is something between -32768 and 32767 or as an unsigned value 0 - 65535. So let's say value equaled -1 so as an unsigned value it was 65535. This would mean bits 1 through 16 are turned on, but when actually performing the math consider the range 0 - 15.
So we need to then activate bits 17 - 32. So we must begin at something larger than 15 bits. So we begin at 16 bits. So by taking value2 and multiplying it by 65536 which is what "<< 16" does. We now would have let's say value2 equaled 3 it would be OR'd 3x65536 = 196608. So our integer value would equal 262143.
int assumed_value = 262143;
so let's say we want to retrieve the two 16bit integer values.
short value1 = (short)(assumed_value & 0xFFFF); //-1
short value2 = (short)(assumed_value >> 16); //=3
Also basically think of bitwise operators as powers of 2. That is all they really are. Never look at it terms of 0's and 1's. I mostly posted this to assist anyone who may come across this searching for unsigned short or even possibly multi-dimensional arrays. If there are any typo's I apologize quickly wrote this up.
Java does not have unsigned types. What do you need it for?
Java does have the 'byte' data type, however.
You can code yourself up a ShortUnsigned class and define methods for those operators you want. You won't be able to overload + and - and the others on them, nor have implicit type conversion with other primitive or numeric object types, alas.
Like some of the other answerers, I wonder why you have this pressing need for unsigned short that no other data type will fill.
Simple program to show why unsigned numbers are needed:
package shifttest;
public class ShiftTest{
public static void main(String[] args){
short test = -15000;
System.out.format ("0x%04X 0x%04X 0x%04X 0x%04X 0x%04X\n",
test, test>>1, test>>2, test>>3, test>>4);
}
}
results:
0xC568 0xFFFFE2B4 0xFFFFF15A 0xFFFFF8AD 0xFFFFFC56
Now for those that are not system types:
JAVA does an arithmetic shift because the operand is signed, however, there are cases where a logical shift would be appropriate but JAVA (Sun in particular), deemed it unnecessary, too bad for us on their short sightedness. Shift, And, Or, and Exclusive Or are limited tools when all you have are signed longer numbers. This is a particular problem when interfacing to hardware devices that talk "REAL" computer bits that are 16 bits or more. "char" is not guaranteed to work (it is two bytes wide now) but in several eastern gif based languages such as Chinese, Korean, and Japanese, require at least 3 bytes. I am not acquainted with the number need for sandscript style languages. The number of bytes does not depend on the programmer rather the standards committee for JAVA. So basing char as 16 bits has a downstream risk. To safely implement unsigned shorts JAVA, as special class is the best solution based on the aforementioned ambiguities. The downside of the class is the inability of overloading the mathematical operations for this special class. Many of the contributors for this thread of accurately pointed out these issues but my contribution is a working code example and my experience with 3 byte gifs languages in C++ under Linux.
//вот метод для получения аналога unsigned short
public static int getShortU(byte [] arr, int i ) throws Exception
{
try
{
byte [] b = new byte[2];
b[1] = arr[i];
b[0] = arr[i+1];
int k = ByteBuffer.wrap(b).getShort();
//if this:
//int k = ((int)b[0] << 8) + ((int)b[1] << 0);
//65536 = 2**16
if ( k <0) k = 65536+ k;
return k;
}
catch(Throwable t)
{
throw new Exception ("from getShort: i=" + i);
}
}