To my understanding a char is a single character, that is a letter, a digit, a punctuation mark, a tab, a space or something similar. And therefore when I do:
char c = '1';
System.out.println(c);
The output 1 was exactly what I expected. So why is it that when I do this:
int a = 1;
char c = '1';
int ans = a + c;
System.out.println(ans);
I end up with the output 50?
You're getting that because it's adding the ASCII value of the char. You must convert it to an int first.
The character '1' has ASCII code 49. The compiler is doing the only sensible thing it can do with your request: promoting the char to an int.
You end up with an output of 50 because you have told Java to treat the result of the addition as an int in the following line:
int ans = a + c;
Instead of int, declare ans as a char and cast the sum back to char.
Like so:
final int a = 1;
final char c = '1';
final char ans = (char) (a + c);
System.out.println(ans);
Because you are adding the value of a (1) to the Unicode value of c ('1'), which is 49. The first 128 Unicode code points are identical to ASCII; you can find those here:
http://www.asciitable.com/
Notice Chr '1' is Dec 49. The rest of the unicode points are here:
http://www.utf8-chartable.de/
A char is a disguised int. A char represents a character by coding it into an int. So, for example, '1' (the value of c in your code) is coded as 49. When you add them together, you get an int which is the sum of the code of the char and the value of the int.
'1' is a digit, not a number, and is encoded in ASCII to be of value 49.
Chars in Java can be promoted to int, so if you ask to add an int like 1 to a char like '1' (that is, 49), the narrower type char is promoted to int, giving 49 + 1 => 50.
Note that every non-digit char can be added the same way:
'a' + 0 = 97
'A' + 0 = 65
' ' + 0 = 32
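The promotion described in these answers can be checked directly. The following is a minimal sketch, not anything taken from the answers themselves; the class name CharPromotionDemo and its method names are made up for illustration.

```java
// Hypothetical demo class; all names here are illustrative only.
class CharPromotionDemo {

    // char + int: the char is promoted to int, so the result is its code plus a.
    static int addAsInt(int a, char c) {
        return a + c;
    }

    // Casting the int sum back to char yields the character with that code.
    static char addAsChar(int a, char c) {
        return (char) (a + c);
    }

    // Subtracting '0' turns a digit character into the number it denotes.
    // This assumes c is one of '0'..'9'.
    static int addDigitValue(int a, char c) {
        return a + (c - '0');
    }

    public static void main(String[] args) {
        System.out.println(addAsInt(1, '1'));      // 50
        System.out.println(addAsChar(1, '1'));     // 2 (the character '2')
        System.out.println(addDigitValue(1, '1')); // 2 (the int 2)
    }
}
```

Which variant is wanted depends on whether '1' is meant as a character code or as the digit it displays.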
'char' is really just a two-byte unsigned integer.
The value '1' and 1 are very different. '1' is encoded as the two-byte value 49.
"Character encoding" is the topic you want to research. Or from the Java language spec: http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.2.1
I'm learning Java through "Introduction to Java Programming, 9th edition" by Daniel Liang. In chapter 9, "Strings", I encountered this piece of code:
public static int hexCharToDecimal(char ch) {
    if (ch >= 'A' && ch <= 'F')
        return 10 + ch - 'A';
    else
        return ch - '0';
}
Can someone explain what just happened here? How is it possible to add/subtract chars from integers, and what's the meaning behind it?
From the Docs
The char data type is a single 16-bit Unicode character.
A char is represented by its code point value:
min: '\u0000' (or 0)
max: '\uffff' (or 65,535)
You can see all of the English alphabetic code points on an ASCII table.
Note that 0 == \u0000 and 65,535 == \uffff, as well as everything in between. They are corresponding values.
A char is actually just stored as a number (its code point value). We have syntax to represent characters like char c = 'A';, but it's equivalent to char c = 65; and 'A' == 65 is true.
So in your code, the chars are being represented by their decimal values to do arithmetic (whole numbers from 0 to 65,535).
For example, the char 'A' is represented by its code point 65 (decimal value in ASCII table):
System.out.print('A'); // prints A
System.out.print((int)('A')); // prints 65 because you cast it to an int
As a note, a short is a 16-bit signed integer, so even though a char is also 16 bits, the maximum integer value of a char (65,535) exceeds the maximum integer value of a short (32,767). Therefore, a cast to (short) from a char does not always preserve the value. And the minimum integer value of a char is 0, whereas the minimum integer value of a short is -32,768.
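The range mismatch between char and short can be seen with a small experiment. This is only a sketch; the class name CharShortRangeDemo and the helper toShort are hypothetical.

```java
// Hypothetical demo class; the names are illustrative only.
class CharShortRangeDemo {

    // Narrowing a char to short keeps the same 16 bits
    // but reinterprets them as a signed value.
    static short toShort(char c) {
        return (short) c;
    }

    public static void main(String[] args) {
        System.out.println((int) Character.MAX_VALUE); // 65535
        System.out.println(Short.MAX_VALUE);           // 32767
        System.out.println(toShort('A'));              // 65
        System.out.println(toShort(Character.MAX_VALUE)); // -1
    }
}
```

Values up to 32,767 survive the cast unchanged; anything above wraps around to a negative short.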
For your code, let's say that the char was 'D'. Note that 'D' == 68 since its code point is 68.
return 10 + ch - 'A';
This returns 10 + 68 - 65, so it will return 13.
Now let's say the char was 'Q' == 81.
if (ch >= 'A' && ch <= 'F')
This is false since 'Q' > 'F' (81 > 70), so it would go into the else block and execute:
return ch - '0';
This returns 81 - 48 so it will return 33.
Your function returns an int type, but if it were to instead return a char, or have the int cast to a char afterward, then the returned value 33 would represent the '!' character, since 33 is its code point value. Look up the character in an ASCII table or Unicode table to verify that '!' == 33 (compare decimal values).
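As the 'Q' example shows, the function from the question silently returns a meaningless value for characters that are not hex digits. One possible way to tighten it is sketched below; this is a variant written for illustration, not the book's code, and the class name HexDigitDemo is hypothetical. The standard library method Character.digit(char, int) is shown alongside for comparison.

```java
// Hypothetical demo class; not the code from the book.
class HexDigitDemo {

    // Same character arithmetic as in the question, with explicit
    // range checks and support for lowercase letters.
    static int hexCharToDecimal(char ch) {
        if (ch >= '0' && ch <= '9') {
            return ch - '0';
        }
        if (ch >= 'A' && ch <= 'F') {
            return 10 + ch - 'A';
        }
        if (ch >= 'a' && ch <= 'f') {
            return 10 + ch - 'a';
        }
        throw new IllegalArgumentException("Not a hex digit: " + ch);
    }

    public static void main(String[] args) {
        System.out.println(hexCharToDecimal('D'));    // 13
        System.out.println(hexCharToDecimal('7'));    // 7
        System.out.println(Character.digit('D', 16)); // 13
        System.out.println(Character.digit('Q', 16)); // -1 (not a hex digit)
    }
}
```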
This is because char is a primitive type which can be used as a numerical value. Every character in a string is encoded as a specific number (not entirely true in all cases, but good enough for a basic understanding of the matter) and Java allows you to use chars in such a way.
It probably allows this mostly for historical reasons: this is how it worked in C, and they probably motivated it with "performance" or something like that.
If you think it's weird then don't worry, I think so too
The other answer is actually incorrect. ASCII is a specific encoding (an encoding is some specification that says "1 = A, 2 = B, ... , 255 = Space") and that is not the one used in Java. A Java char is two bytes wide and is interpreted through the Unicode character encoding.
Chars are stored as integers (their character codes, which match ASCII for the first 128 characters), so you can add and subtract them like integers; the result is an int that corresponds to a character code.
Regardless of how Java actually stores the char datatype, one thing is certain: the character 'A' subtracted from the character 'A' would be represented as the null character, \0. In memory, this means every bit is 0. The size of a char in memory may vary from language to language, but as far as I know, the null character is the same in all languages: every bit is equal to 0.
As an int value, a piece of memory with every bit equal to 0 represents the integer value of 0.
And as it turns out, when you do "character math", subtracting any alphabetical character from any other alphabetical character (of the same case) leaves a bit pattern that, interpreted as an int, represents the distance between those characters. Additionally, subtracting the char '0' from any other numeric char results in the int value of the digit you subtracted from, for basically the same reason.
'A' - 'A' = '\0'
'a' - 'a' = '\0'
'0' - '0' = '\0'
Can someone explain what runner.children[c-'a'] means in the following code?
public boolean search(String word) {
    TrieNode runner = root;
    for(char c : word.toCharArray()) {
        if(runner.children[c-'a'] == null) {
            return false;
        } else {
            runner = runner.children[c-'a'];
        }
    }
    return runner.isEndOfWord;
}
Every char has a numeric value, check out the ASCII table for more information.
So assume that the variable c contains character b, and subtract character a from that, you will get 1 for your answer.
That's just subtraction. You can subtract characters as though they were numbers. You end up with the result of subtracting their character codes. 'c' - 'a' (for example) equals 2, since 'a' is 2 less than 'c'.
- is the subtraction operator.
§15.18.2 The type of each of the operands of the binary - operator must be a type that is convertible to a primitive numeric type
§5.6.2 Widening primitive conversion is applied to convert either or both operands … both operands are converted to type int.
Binary numeric promotion is performed on the operands of certain operators: … addition and subtraction operators for numeric types + and - …
In other words, both c and 'a' are of type char (a UTF-16 code unit, which has a range from Character.MIN_VALUE to Character.MAX_VALUE). Due to subtraction, they are widened to type int, subtracted, resulting in a value of type int.
Think of characters on a number line. Subtraction is the distance from one character to the other. With a constant reference to 'a', the distances for 'a', 'b', … 'z' are 0, 1, … 25. This makes sense only over certain short segments of the UTF-16 number line.
Arrays are 0-based, so shifting the scale like this allows characters to be used to index an array without having a large unused portion with elements corresponding to unused characters.
(Note: Some people are saying ASCII because they think it's easier to understand a simpler, wrong thing on the way to learning the right thing. 🤷)
In this case children[] probably has one slot per letter from a to z, that is, 26 elements.
What is going on above is that they take the ASCII value of the char c and subtract the ASCII code of 'a'. This gives the index of the char c in the alphabet (0-based).
Let c = 'b'
[c-'a'] = 98 - 97 = 1 (Ascii of b - Ascii of a)
With c = 'd'
[c-'a'] = 100 - 97 = 3
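The same indexing trick works for any array keyed by lowercase letters. Below is a minimal sketch under the assumption that only 'a' to 'z' are valid input; the class LetterIndexDemo and its methods are hypothetical and are not part of the trie from the question.

```java
// Hypothetical demo class; assumes input restricted to 'a'..'z'.
class LetterIndexDemo {

    static final int ALPHABET_SIZE = 26;

    // Maps 'a'..'z' to 0..25 and rejects anything else, since c - 'a'
    // would otherwise produce an index outside the array.
    static int indexOf(char c) {
        int index = c - 'a';
        if (index < 0 || index >= ALPHABET_SIZE) {
            throw new IllegalArgumentException("Not a lowercase letter: " + c);
        }
        return index;
    }

    // Counts how often each letter occurs, using the letter as array index.
    static int[] countLetters(String word) {
        int[] counts = new int[ALPHABET_SIZE];
        for (char c : word.toCharArray()) {
            counts[indexOf(c)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(indexOf('b')); // 1
        System.out.println(indexOf('d')); // 3
        int[] counts = countLetters("banana");
        System.out.println(counts[indexOf('a')]); // 3
    }
}
```

Without the range check, a character such as 'A' or '-' would yield a negative index and an ArrayIndexOutOfBoundsException at the array access.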
It is a minus sign, not a hyphen. In Java a char takes 2 bytes of space. A char is a 16-bit unsigned value ranging from 0000000000000000 to 1111111111111111 (0 to 65,535); there is no sign bit. You can easily read it as a number by assigning a char to an int variable (an int holds 4 bytes, so the 2 bytes of a char fit easily).
char charA = 'A'; // represents 65
char charB = 'B'; // represents 66
int diff = charB - charA; // this will represent 66-65 i.e. 1
The index of an array is a non-negative int, and hence it can also accept values like
anyTypeArray[charB - charA] //represents the 2nd element (index starts from 0 for arrays in java).
anyTypeArray['C' - charA] // represents the 3rd element of the array
Also I liked answer above https://stackoverflow.com/a/47106997/504133 and would like to add its link to extend my answer.
char char1 = 'a';
System.out.println(char1); //prints char 1
System.out.println(char1+1); //prints char 1
System.out.println(char1++); //prints char 1
System.out.println(char1+=1); //prints incremented char1
char1 += 1;
System.out.println(char1); //prints incremented char1
In the above, why doesn't (char1+1) or (char1++) print the incremented character but the other two do?
First, I'm assuming that because you say the increment in System.out.println works, that you have really specified:
char char1 = 'a';
EDIT
In response to the change of the question (char1+1; => char1 += 1;) I see the issue.
The output is
a
98
b
The 98 shows up because the char a was promoted to an int (binary numeric promotion) to add 1. So a becomes 97 (the ASCII value for 'a') and 98 results.
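However, char1 += 1 and char1++ include an implicit narrowing conversion back to char after the addition (char1 += 1 behaves like char1 = (char) (char1 + 1)), so the result stays a char and prints as a character.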
Quoting the JLS, Section 5.6.2, "Binary Numeric Promotion":
Widening primitive conversion (§5.1.2) is applied to convert either or both operands as specified by the following rules:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted to float.
Otherwise, if either operand is of type long, the other is converted to long.
Otherwise, both operands are converted to type int.
(emphasis mine)
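One way to observe which type an expression has is to pass it to a pair of overloaded methods. The sketch below does that; the class CompoundAssignDemo and the methods kind and next are hypothetical helpers written for this illustration.

```java
// Hypothetical demo class; kind() reports which overload was selected.
class CompoundAssignDemo {

    static String kind(int value) {
        return "int " + value;
    }

    static String kind(char value) {
        return "char " + value;
    }

    // c += 1 compiles because compound assignment casts the int sum
    // back to char, as if written c = (char) (c + 1).
    static char next(char c) {
        c += 1;
        return c;
    }

    public static void main(String[] args) {
        char char1 = 'a';
        System.out.println(kind(char1));       // char a
        System.out.println(kind(char1 + 1));   // int 98
        System.out.println(kind(next(char1))); // char b
        System.out.println(kind(char1++));     // char a (value before the increment)
        System.out.println(kind(char1));       // char b
    }
}
```

System.out.println is overloaded in the same way, which is why the int expression prints as 98 while the char expressions print as letters.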
You didn't assign the result of the addition char1 + 1 to char1. So
char1 = (char) (char1 + 1);
or
char1 += 1;
char1++;
are correct. (The explicit cast is needed in the first form because char1 + 1 is an int.)
Okay, first of all, fixing the format of your code:
char char1;
char1 = 'a';
System.out.println(char1); // print 1
System.out.println(char1 + 1); // print 2
char1 += 1;
System.out.println(char1); // print 3
which yields the output:
a
98
b
Now, let's look at each call to println() in detail:
1: This is simply taking the character handle named char1 and printing it. It's been assigned the letter a (note the single quotes around the a in the assignment, indicating character). Not surprisingly, this prints the character a.
2: For this line, you're performing an integer addition. A char in Java is held as a Unicode character. The Unicode value for the letter a maps to the number 97. (Note that this also corresponds to the ASCII value for a.) When performing arithmetic operations in Java between mismatched types, the smaller/less precise type's value will be 'upgraded' to the larger type (this is very imprecisely stated). Because of this, the char is 'upgraded' to an int before the addition is performed, and the result is also an int. With this in mind, it's not surprising that the 97 from a, plus 1, results in a 98 being printed.
3: In this instance we are once again printing the value of a char, so a character is printed. This time the 98 we saw generated before is implicitly cast back into a character. Again, unsurprisingly the next highest number mapping from a is b, so we see a b printed.
Try this:
System.out.println(char1);
System.out.println(++char1);
char1 += 1;
System.out.println(char1);
instead of:
char1 = a;
System.out.println(char1);
system.out.println(char1+1);
char1 += 1;
System.out.println(char1);
So I have something like this:
char cr = "9783815820865".charAt(0);
System.out.println(cr); //prints out 9
If I do this:
int cr = "9783815820865".charAt(0);
System.out.println(cr); //prints out 57
I understand that the conversion between char and int is not simply from '9' to 9. My problem is that right now I simply need to keep the 9 as the int value, not 57. How do I get the value 9 instead of 57 as an int?
You can try with:
int cr = "9783815820865".charAt(0) - '0';
charAt(0) will return '9' (as a char), which is a numeric type. From this value we'll just subtract the value of '0', which is again numeric and sits exactly nine entries before the '9' character in the ASCII table.
So, behind the scenes, the subtraction will work with the ASCII codes of '9' and '0', which means that 57 - 48 will be calculated.
Try this:
char c = "9783815820865".charAt(0);
int cr = Integer.parseInt(c+"");
Using Character#getNumericValue may be more idiomatic. Bear in mind that it also accepts letters: 'A' through 'Z' (and 'a' through 'z') are converted to 10 through 35.
int cr = Character.getNumericValue("9783815820865".charAt(0));
System.out.println(cr);
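The approaches from these answers agree for digits but differ for other characters. The comparison below is only a sketch; the class DigitValueDemo and its method names are made up, while Character.getNumericValue(char) and Character.digit(char, int) are the standard library methods.

```java
// Hypothetical demo class comparing three char-to-digit conversions.
class DigitValueDemo {

    // Plain character arithmetic; only meaningful for '0'..'9'.
    static int bySubtraction(char c) {
        return c - '0';
    }

    // Also maps letters to 10..35.
    static int byNumericValue(char c) {
        return Character.getNumericValue(c);
    }

    // Returns -1 when c is not a valid digit in radix 10.
    static int byDigit(char c) {
        return Character.digit(c, 10);
    }

    public static void main(String[] args) {
        char first = "9783815820865".charAt(0);
        System.out.println(bySubtraction(first));  // 9
        System.out.println(byNumericValue(first)); // 9
        System.out.println(byDigit(first));        // 9

        System.out.println(bySubtraction('A'));    // 17
        System.out.println(byNumericValue('A'));   // 10
        System.out.println(byDigit('A'));          // -1
    }
}
```

If the input may contain non-digits, Character.digit with an explicit radix makes invalid characters easy to detect.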
Let's say I have a String. If I do this:
for (int index = 0; index < ch.length(); index++) {
char c = ch.charAt(index);
System.out.println(String.format("%04x", (int) c));
}
What will the output be ?
I tried a and got 0061, which seems to be the UTF-8/ASCII value of a.
Then I tried 𐅑 and got d800 dd51 which seems not to be a UTF value.
Just wondering, what is the int value of a char in Java?
I believe the ch variable in your for loop is a String and you want to access each character in it and then cast it to its ASCII value? Well, that's what your code does. I ran it after making minor corrections, using String ch = "abcdef", and it gave me this output:
0061
0062
0063
0064
0065
0066
Which is exactly what your print statement instructs:
-cast the character to its ASCII value
-output it as a four-character hexadecimal value.
If it helps, the ASCII values for a, b, c, d, e and f are 61, 62, 63, 64, 65 and 66 in hexadecimal (97 through 102 in decimal).
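The d800 dd51 output from the question is a surrogate pair: a Java char is a UTF-16 code unit, and characters outside the first 65,536 code points occupy two of them. The sketch below contrasts code units with code points; the class CodePointDemo and its methods are hypothetical, and the string is written with Unicode escapes for the two code units reported in the question.

```java
// Hypothetical demo class contrasting UTF-16 code units and code points.
class CodePointDemo {

    // Formats every char (UTF-16 code unit) as four hex digits.
    static String hexCodeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    // Formats every Unicode code point as hex, stepping over surrogate pairs.
    static String hexCodePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int codePoint = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(Integer.toHexString(codePoint));
            i += Character.charCount(codePoint);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String s = "\uD800\uDD51"; // the two code units from the question
        System.out.println(s.length());                      // 2
        System.out.println(s.codePointCount(0, s.length())); // 1
        System.out.println(hexCodeUnits(s));                 // d800 dd51
        System.out.println(hexCodePoints(s));                // 10151
        System.out.println(hexCodeUnits("a"));               // 0061
    }
}
```

So the int value of a char is its UTF-16 code unit; to get the full Unicode value of a character that needs two chars, use codePointAt.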