I am reading a book about Java programming and in the first chapter it says: "The number 149 is stored in the byte at address 16" - is storing three characters, the 1, the 4, and the 9 in one byte possible?
No, the size of a character in Java is 2 bytes, so three characters (6 bytes) cannot fit into 1 byte.
I think the book was trying to ask whether the number 149 could fit into a byte. The answer is yes and no: an unsigned byte can hold values up to 255, while a two's complement (signed) byte can only hold values up to 127.
Info about primitive data type
Storing the number 149 and storing the characters '1', '4', and '9' separately are completely different things. Storing the character '1' is actually storing its ASCII value 49, and the ASCII values 52 and 57 represent '4' and '9' respectively. The size of each character in Java is 2 bytes, so 3 characters with a total size of 6 bytes cannot fit into a single byte.
The byte data type is only 8 bits and signed, so it can store numbers from -128 to +127. That means the maximum value for a byte (Byte.MAX_VALUE) is 127, and since 149 is bigger than 127, it cannot fit into a byte. You need at least a short to store 149. A short is 2 bytes in Java.
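As a rough illustration of the range limit described above, here is a minimal sketch. The class name ByteRangeDemo and the helper narrow are made up for this example; only Byte.MIN_VALUE, Byte.MAX_VALUE and the cast are standard Java.

```java
class ByteRangeDemo {
    // A narrowing cast keeps only the low 8 bits of the int value.
    static byte narrow(int value) {
        return (byte) value;
    }

    public static void main(String[] args) {
        System.out.println(Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE); // -128 .. 127

        // byte b = 149;  // would not compile: 149 is outside the byte range
        short s = 149;    // fits, because a short is 16 bits
        System.out.println(s); // 149

        // Forcing 149 into a byte wraps around: 149 - 256 = -107
        System.out.println(narrow(149)); // -107
    }
}
```

The cast compiles, but the stored value is no longer 149 when read back as a signed byte.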
I highly encourage you to read this Java documentation on data types. It's very short, but pretty useful to understand everything about data types.
http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
In general, a byte is 8 bits and can store 256 distinct values, for example the numbers 0 to 255 when it is treated as unsigned. Bytes are often used in raw data processing and are how data is stored in memory. When storing characters or a "String", you are storing a sequence of bytes that represents a sequence of characters.
The number 149 in binary byte form is 10010101.
Decimal to Binary Converter
But storing characters is different from storing numbers. To address your question, storing the characters "1", "4", and "9" in a single byte is not possible, but storing the number 149 is.
Also, the number of bytes that a given character/string uses is highly dependent on which encoding you are using.
Java String see .getBytes(Charset charset)
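To make the encoding point concrete, here is a small sketch using String.getBytes(Charset). The class EncodingSizeDemo and its helper encodedLength are hypothetical names for this example; the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingSizeDemo {
    // Number of bytes the text occupies in the given encoding.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The characters '1', '4', '9' become their code values 49, 52, 57.
        byte[] ascii = "149".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(ascii)); // [49, 52, 57]

        // The same three characters take a different number of bytes per encoding.
        System.out.println(encodedLength("149", StandardCharsets.UTF_8));    // 3
        System.out.println(encodedLength("149", StandardCharsets.UTF_16BE)); // 6
    }
}
```

UTF_16BE is used here rather than UTF_16 so that no byte order mark is added to the output.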
All this being said, a byte in Java is signed. Its range goes from -128 to +127 inclusive. A byte can store 256 unique values. You can think of them as numbers, individual flags, whatever you want. I have no context for the OP's original problem, but if they are using a Java primitive byte, it cannot by default hold the number 149. If you are talking about a sequence of 8 bits, it can.
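One way to read "a sequence of 8 bits" in Java is to keep the bit pattern in a byte and mask it when reading it back. This is only a sketch; UnsignedByteDemo, store and load are names invented for the example, while Byte.toUnsignedInt is a standard method since Java 8.

```java
class UnsignedByteDemo {
    // Keep the low 8 bits of a value in the range 0..255.
    static byte store(int value0to255) {
        return (byte) value0to255;
    }

    // Reinterpret the same 8 bits as an unsigned number.
    static int load(byte b) {
        return b & 0xFF; // same result as Byte.toUnsignedInt(b)
    }

    public static void main(String[] args) {
        byte b = store(149);
        System.out.println(b);                            // -107 (signed view)
        System.out.println(load(b));                      // 149 (unsigned view)
        System.out.println(Integer.toBinaryString(149));  // 10010101
    }
}
```

The bits never change; only the interpretation of the top bit does.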
Java Primitive Datatypes
Related
I'm confused about the LEB128 (Little Endian Base 128) format. In the AOSP source code Leb128.java, the return type of the read functions, whether signed or unsigned, is int. I know that the size of an int in Java is 4 bytes, i.e. 32 bits. But the maximum length of LEB128 in AOSP is 5 bytes, i.e. 35 bits. So where do the other 3 bits go?
Thanks for your reply.
Each byte of data in LEB only accounts for 7 bits in the actual output - the remaining bit is used to indicate whether or not it's the end.
From Wikipedia:
To encode an unsigned number using unsigned LEB128 first represent the number in binary. Then zero extend the number up to a multiple of 7 bits (such that the most significant 7 bits are not all 0). Break the number up into groups of 7 bits. Output one encoded byte for each 7 bit group, from least significant to most significant group.
The extra bits aren't so much "lost" as "used to indicate whether or not it's the end of the data".
You can't hope to encode arbitrary 32-bit values and some of them taking less than 4 bytes without some of them taking more than 4 bytes.
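The 7-bits-per-byte scheme can be sketched as follows. This is not the AOSP Leb128.java code, just a minimal unsigned encoder and decoder written for illustration; the class and method names are made up.

```java
import java.io.ByteArrayOutputStream;

class UnsignedLeb128Sketch {
    // Encode a 32-bit value (treated as unsigned) into 1 to 5 bytes.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int remaining = value;
        do {
            int group = remaining & 0x7F; // low 7 payload bits
            remaining >>>= 7;             // unsigned shift to the next group
            if (remaining != 0) {
                group |= 0x80;            // continuation bit: more bytes follow
            }
            out.write(group);
        } while (remaining != 0);
        return out.toByteArray();
    }

    // Decode bytes produced by encode back into a 32-bit value.
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;                    // last byte of this number
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(encode(127).length);        // 1
        System.out.println(encode(128).length);        // 2
        System.out.println(encode(0xFFFFFFFF).length); // 5
    }
}
```

Small values take one byte, while a value that uses all 32 bits needs five bytes, of which only 4 payload bits of the last byte are used.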
I'm making a hash algorithm, and the message block is 512 bits.
In C/C++ I can store in char[64], but Java char takes 2 bytes.
Question: is 512 bits of information a char[32] or a char[64]?
A char is 16 bits in Java, so char[32] should be enough for 512 bits.
I think using byte[64] is better though, because everyone knows a byte is 8 bits and char[32] makes the code harder to read. Also, you are not storing characters but bits.
From the documentation:
char: The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).
So in order to store 512 bits, you should have an array of size 32.
Why use a char[]?
A hash value consists of bytes so the logical choice would be to use a byte[64].
The datatype char is intended to be used as a character and not as a number.
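The arithmetic behind both answers can be checked with the SIZE constants from the wrapper classes. A small sketch, with BlockSizeDemo and its members as hypothetical names:

```java
class BlockSizeDemo {
    static final int BLOCK_BITS = 512;

    // Elements needed when each element is a byte (8 bits).
    static int bytesNeeded() {
        return BLOCK_BITS / Byte.SIZE;
    }

    // Elements needed when each element is a char (16 bits).
    static int charsNeeded() {
        return BLOCK_BITS / Character.SIZE;
    }

    public static void main(String[] args) {
        byte[] block = new byte[bytesNeeded()]; // the more conventional choice
        System.out.println(block.length);       // 64
        System.out.println(charsNeeded());      // 32
    }
}
```

Both arrays hold 512 bits; byte[64] simply matches how hash input is usually described.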
I can store numbers ranging from -127 to 127 in a byte, but anything outside that range is impossible and the compiler reports an error. The binary value of 127 is 01111111, and 130 is 10000010, which is still the same size (8 bits), so I would expect to be able to store 130 in a byte, but it is not possible. Why is that?
Java does not have unsigned types; each numeric type in Java is signed (except char, but it is not meant for representing numbers but Unicode characters).
Let's take a look at byte. It is one byte, which is 8 bits. If it were unsigned, yes, its range would be 0..255.
But if it is signed, it takes 1 bit of information to store the sign (2 possible values: + or -), which leaves us 7 bits to store the numeric (absolute) value. Range of 7 bit information is 0..127.
Note that the representation of signed integer numbers use the 2's complement number format in most languages, Java included.
Note: The range of Java's byte type is actually -128..127. The range -127..127 only contains 255 numbers (not 256 which is the number of all combinations of 8 bits).
In Java, a byte is a signed data type. You are thinking about unsigned bytes, in which case it is possible to store the value 130 in 8 bits. But with a signed data type, that also allows negative numbers, the first bit is necessary to indicate a negative number.
There are several ways to store negative numbers (such as sign-magnitude, one's complement and two's complement), but the most popular one is two's complement. Its benefit is that most arithmetic operations do not need to take the sign of the number into account; they work regardless of the sign.
The first bit indicates the sign of the number: When the first bit is 1, then the number is negative. When the first bit is 0, then the number is positive. So you basically only have 7 bits available to store the magnitude of the number. (Using a small trick, this magnitude is shifted by 1 for negative numbers - otherwise, there would be two different bit patterns for "zero", namely 00000000 and 10000000).
When you want to store a number like 130, whose binary representation is 10000010, then it will be interpreted as a negative number, due to the first bit being 1.
Also see http://en.wikipedia.org/wiki/Two%27s_complement , where the trick of how the magnitude is shifted is explained in more detail.
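To see how the bit pattern of 130 is read as a negative number, here is a minimal sketch. TwosComplementDemo and its helpers are names invented for this example.

```java
class TwosComplementDemo {
    // The 8 bits of a byte as a zero-padded string.
    static String bits(byte b) {
        String s = Integer.toBinaryString(b & 0xFF);
        return String.format("%8s", s).replace(' ', '0');
    }

    // Two's complement negation: invert all bits, then add one.
    static byte negate(byte b) {
        return (byte) (~b + 1);
    }

    public static void main(String[] args) {
        byte b = (byte) 130;           // bit pattern 10000010
        System.out.println(bits(b));   // 10000010
        System.out.println(b);         // -126, because 130 - 256 = -126

        System.out.println(bits((byte) 126));         // 01111110
        System.out.println(bits(negate((byte) 126))); // 10000010
    }
}
```

The same 8 bits that mean 130 as an unsigned value mean -126 in a signed Java byte.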
I wanted to use DataOutputStream#writeBytes, but was running into errors. Description of writeBytes(String) from the Java Documentation:
Writes out the string to the underlying output stream as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.
I think the problem I'm running into is due to the part about "discarding its high eight bits". What does that mean, and why does it work that way?
Most Western programmers tend to think in terms of ASCII, where one character equals one byte, but Java Strings are 16-bit Unicode. writeBytes just writes out the lower byte, which for ASCII/ISO-8859-1 is the "character" in the C sense.
The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive). The byte data type, however, is an 8-bit signed two's complement integer. It has a minimum value of -128 and a maximum value of 127 (inclusive). That is why this function writes only the low-order byte of each char in the string, from first to last. Any information in the high-order byte is lost. In other words, it assumes the string contains only characters whose value is between 0 and 255.
You may look into the writeUTF(String s) method, which retains the information in the high-order byte as well as the length of the string. First it writes the number of bytes in the encoded string onto the underlying output stream as a 2-byte unsigned value between 0 and 65,535. Next it encodes the string in modified UTF-8 and writes the bytes of the encoded string to the underlying output stream. This allows a data input stream reading those bytes to completely reconstruct the string.
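The difference between the two methods can be seen with an in-memory stream. This is a sketch only; WriteBytesDemo and its helper methods are made-up names, while the stream classes and methods are from java.io. The test string is "A" followed by the euro sign (U+20AC), written as an escape so the source file encoding does not matter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WriteBytesDemo {
    // Bytes produced by writeBytes: one byte per char, high 8 bits dropped.
    static byte[] viaWriteBytes(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeBytes(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Bytes produced by writeUTF: 2-byte length prefix plus modified UTF-8.
    static byte[] viaWriteUTF(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Read a string back from bytes written by writeUTF.
    static String readBack(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            return in.readUTF();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "A\u20AC";
        System.out.println(viaWriteBytes(text).length); // 2: the euro sign is reduced to 0xAC
        System.out.println(viaWriteUTF(text).length);   // 6: 2 length bytes + 1 + 3
        System.out.println(readBack(viaWriteUTF(text)).equals(text)); // true
    }
}
```

With writeBytes, the char 0x20AC is written as the single byte 0xAC, so the original character cannot be recovered.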
If I have the following:
byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
I know that the size of each element would be one byte. But what I don't seem to understand is how would the integer 87 be stored in one byte? Or, how does the byte[] store data?
EDIT: I see that you can store -128 to 127 in a byte in Java. So, does that mean there is no way to store anything greater or smaller than those numbers in a byte[]? If so, doesn't that limit its use? Or am I not understanding the exact places to use a byte[]?
A byte is 8 bits. 2^8 is 256, meaning that 8 bits can store 256 distinct values. In Java, those values are the numbers in the range -128 to 127, so 87 is a valid byte, as it is in that range.
Similarly, try doing something like byte x = 200, and you will see that you get an error, as 200 is not a valid byte.
A byte is just an 8-bit integer value, which means it can hold any value from -2^7 to 2^7 - 1. That includes all of the numbers in {87, 79, 87, 46, 46, 46}.
An int in Java is a 4-byte integer, allowing it to hold -2^31 to 2^31 - 1.
A Java byte is a primitive with a minimum value of -128 and a maximum value of 127 (inclusive). 87 is within the allowed range. The byte data type can be useful for saving memory in large arrays, where the memory savings actually matters.
A byte[] is an Object which stores a number of these primitives.
I think the short answer is that byte[] stores bytes. The number 87 in your array above is a byte, not an int. If you were to change it to 700 (or anything higher than 127) you'd get a compile error. Try it.
You can use byte to store 8-bit values, which have a (signed) range from -128 to 127.
With byte[] you can do some special operations like building Strings from a given bytestream and decode them with a desired Charset, and some functions will give you byte[] as their return value.
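For instance, the array from the question can be decoded into text. A minimal sketch, where ByteArrayToStringDemo and decodeAscii are hypothetical names and US_ASCII is assumed as the charset:

```java
import java.nio.charset.StandardCharsets;

class ByteArrayToStringDemo {
    // Interpret each byte as one ASCII character.
    static String decodeAscii(byte[] data) {
        return new String(data, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] byteArray = new byte[] {87, 79, 87, 46, 46, 46};
        System.out.println(decodeAscii(byteArray)); // WOW...
    }
}
```

Here 87 is 'W', 79 is 'O' and 46 is '.', so each small number in the array stands for one character.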
I don't know enough about the internals of the JVM but it might save memory though.
This is because the computer stores values in a circular progression, not a linear progression like we learn in mathematics. Memory is finite, not infinite, so every fixed-size integer type wraps around: going one past the maximum value gives the minimum value. To learn more, read the article at this link.
https://medium.com/#hmsathyajith/numbering-system-edge-cases-of-java-237377553444
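The wrap-around behaviour mentioned above can be observed directly. A small sketch, with WrapAroundDemo and next as invented names:

```java
class WrapAroundDemo {
    // Add one and narrow back to a byte, which wraps at the top of the range.
    static byte next(byte b) {
        return (byte) (b + 1);
    }

    public static void main(String[] args) {
        System.out.println(next(Byte.MAX_VALUE));  // -128
        System.out.println(Integer.MAX_VALUE + 1); // -2147483648
    }
}
```

In both cases, going one step past the largest value lands on the smallest one.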