This question is hard to search for on Google even though it is so simple.
Basically I wrote this:
public static void main(String[] args) {
char cipher[] = {'a','b','c','c','d','t','w'};
System.out.println(cipher[0]+cipher[2]);
}
}
and the println result was 196 instead of ac. Of course, when I did
System.out.println(cipher[0]+""+cipher[2]);
it showed me ac as intended.
So my question is just: what is this 196?
Thanks!
So my question is just what is this 196 ?
It's the sum of the UTF-16 code unit for 'a' (which is 97) and the UTF-16 code unit for 'c' (which is 99).
Other than for string concatenation, operands of the addition operator undergo binary numeric promotion (JLS 5.6.2) so you're actually performing addition of int values. Your code is equivalent to:
System.out.println((int) cipher[0] + (int) cipher[2]);
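As a minimal sketch of the promotion described above (the class and method names are made up for illustration), the following shows that char + char yields an int, while starting the expression with a String turns + into concatenation:

```java
class CharAdditionDemo {
    // char + char: both operands are promoted to int, so the result is an int.
    static int sum(char a, char b) {
        return a + b;
    }

    // Starting with a String makes + mean string concatenation instead.
    static String join(char a, char b) {
        return "" + a + b;
    }

    public static void main(String[] args) {
        char[] cipher = {'a', 'b', 'c', 'c', 'd', 't', 'w'};
        System.out.println(sum(cipher[0], cipher[2]));              // 196
        System.out.println(join(cipher[0], cipher[2]));             // ac
        System.out.println(String.valueOf(cipher[0]) + cipher[2]);  // ac
    }
}
```

Any expression whose first operand is already a String would work the same way as join here.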
196 is the ASCII value of 'a' + the ASCII value of 'c'.
When you add chars together, without any other hints, Java interprets them as numbers.
In Java, a char is essentially an unsigned 16-bit integer whose integer value corresponds to its Unicode value. 196 is the sum of the integer representations of 'a' and 'c'.
The result 196 is the ASCII value of 'a' (ASCII 97) + 'c' (ASCII 99).
I’m learning Java through "Introduction to Java Programming, 9th edition" by Daniel Liang. In chapter 9, "Strings", I’ve encountered this piece of code:
public static int hexCharToDecimal(char ch) {
    if (ch >= 'A' && ch <= 'F')
        return 10 + ch - 'A';
    else
        return ch - '0';
}
Can someone explain what just happened here? How is it possible to add/subtract chars from integers, and what's the meaning behind it?
From the Docs
The char data type is a single 16-bit Unicode character.
A char is represented by its code point value:
min: '\u0000' (or 0)
max: '\uffff' (or 65,535)
You can see all of the English alphabetic code points on an ASCII table.
Note that 0 == \u0000 and 65,535 == \uffff, as well as everything in between. They are corresponding values.
A char is actually just stored as a number (its code point value). We have syntax to represent characters like char c = 'A';, but it's equivalent to char c = 65; and 'A' == 65 is true.
So in your code, the chars are being represented by their decimal values to do arithmetic (whole numbers from 0 to 65,535).
For example, the char 'A' is represented by its code point 65 (decimal value in ASCII table):
System.out.print('A'); // prints A
System.out.print((int)('A')); // prints 65 because you cast it to an int
As a note, a short is a 16-bit signed integer, so even though a char is also 16 bits, the maximum integer value of a char (65,535) exceeds the maximum integer value of a short (32,767). Therefore, a cast to (short) from a char cannot always preserve the value. And the minimum integer value of a char is 0, whereas the minimum integer value of a short is -32,768.
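A small sketch of the char versus short ranges described above (the class and method names are hypothetical). The cast compiles either way; it just reinterprets the same 16 bits as a signed value:

```java
class CharShortDemo {
    // Narrowing char -> short keeps the same 16 bits but reads them as signed.
    static short toShort(char c) {
        return (short) c;
    }

    public static void main(String[] args) {
        System.out.println((int) Character.MAX_VALUE);     // 65535
        System.out.println(Short.MAX_VALUE);               // 32767
        System.out.println(toShort('A'));                  // 65, fits in a short
        System.out.println(toShort(Character.MAX_VALUE));  // -1, value not preserved
    }
}
```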
For your code, let's say that the char was 'D'. Note that 'D' == 68 since its code point is 68.
return 10 + ch - 'A';
This returns 10 + 68 - 65, so it will return 13.
Now let's say the char was 'Q' == 81.
if (ch >= 'A' && ch <= 'F')
This is false since 'Q' > 'F' (81 > 70), so it would go into the else block and execute:
return ch - '0';
This returns 81 - 48 so it will return 33.
Your function returns an int type, but if it were to instead return a char, or if the int were cast to a char afterward, then the returned value 33 would represent the '!' character, since 33 is its code point value. Look up the character in an ASCII table or Unicode table to verify that '!' == 33 (compare decimal values).
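The walkthrough above can be checked with a short sketch. The method is the one from the question; the wrapper class name is made up, and the comparison with Character.digit (a standard library method that returns -1 for a character that is not a digit in the given radix) is an addition for illustration only:

```java
class HexCharDemo {
    // The method from the question, unchanged in behavior.
    static int hexCharToDecimal(char ch) {
        if (ch >= 'A' && ch <= 'F')
            return 10 + ch - 'A';
        else
            return ch - '0';
    }

    public static void main(String[] args) {
        System.out.println(hexCharToDecimal('D'));     // 13 (10 + 68 - 65)
        System.out.println(hexCharToDecimal('7'));     // 7  (55 - 48)
        System.out.println(hexCharToDecimal('Q'));     // 33 (81 - 48), not a real hex digit
        System.out.println(Character.digit('D', 16));  // 13
        System.out.println(Character.digit('Q', 16));  // -1, signals invalid input
    }
}
```

The question's method trusts its caller to pass only '0'..'9' or 'A'..'F'; anything else silently produces a number like 33.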
This is because char is a primitive type which can be used as a numerical value. Every character in a string is encoded as a specific number (not entirely true in all cases, but good enough for a basic understanding of the matter) and Java allows you to use chars in such a way.
It probably allows this mostly for historical reasons: this is how it worked in C, and they probably motivated it with "performance" or something like that.
If you think it's weird then don't worry, I think so too.
The other answer is actually incorrect. ASCII is a specific encoding (an encoding is some specification that says "1 = A, 2 = B, ... , 255 = Space") and it is not the one used in Java. A Java char is two bytes wide and is interpreted through the Unicode character encoding.
Chars are stored as integers (their character codes), so you can add and subtract them like integers; the result is an int that can be read back as a character code.
Regardless of how Java actually stores the char datatype, one thing is certain: subtracting the character 'A' from the character 'A' gives the value of the null character, \0. In memory, this means every bit is 0. The amount of memory a char takes up may vary from language to language, but as far as I know, the null character is the same in all of them: every bit is equal to 0.
As an int value, a piece of memory with every bit equal to 0 represents the integer value of 0.
And as it turns out, when you do "character math", subtracting any alphabetical character from any other alphabetical character (of the same case) results in bits being flipped in such a way that, if you were to interpret them as an int, they would represent the distance between these characters. Additionally, subtracting the char '0' from any other numeric char results in the int value of the digit you subtracted from, for basically the same reason.
'A' - 'A' = '\0'
'a' - 'a' = '\0'
'0' - '0' = '\0'
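A rough sketch of the "character math" above in Java (the class and method names are invented for the example). Note that in Java the subtraction itself produces an int; the result only becomes a char again when it is assigned or cast back:

```java
class CharDistanceDemo {
    // Distance between two characters on the code-unit number line.
    static int distance(char from, char to) {
        return to - from;
    }

    public static void main(String[] args) {
        System.out.println(distance('A', 'A'));  // 0
        System.out.println(distance('A', 'E'));  // 4
        System.out.println('7' - '0');           // 7

        // A constant int expression that fits in a char can be assigned directly.
        char zero = 'A' - 'A';
        System.out.println(zero == '\0');        // true
    }
}
```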
Can someone explain what runner.children[c-'a'] means in the following code?
public boolean search(String word) {
    TrieNode runner = root;
    for (char c : word.toCharArray()) {
        if (runner.children[c-'a'] == null) {
            return false;
        } else {
            runner = runner.children[c-'a'];
        }
    }
    return runner.isEndOfWord;
}
Every char has a numeric value, check out the ASCII table for more information.
So assume that the variable c contains the character 'b'; subtract the character 'a' from it and you get 1.
That's just subtraction. You can subtract characters as though they were numbers. You end up with the result of subtracting their character codes. 'c' - 'a' (for example) equals 2, since 'a' is 2 less than 'c'.
- is the subtraction operator.
§15.18.2 The type of each of the operands of the binary - operator must be a type that is convertible to a primitive numeric type
§5.6.2 Widening primitive conversion is applied to convert either or both operands … both operands are converted to type int.
Binary numeric promotion is performed on the operands of certain operators: … addition and subtraction operators for numeric types + and - …
In other words, both c and 'a' are of type char (a UTF-16 code unit, which has a range from Character.MIN_VALUE to Character.MAX_VALUE). Due to subtraction, they are widened to type int, subtracted, resulting in a value of type int.
Think of characters on a number line. Subtraction is the distance from one character to the other. With a constant reference to 'a', the distances for 'a', 'b', … 'z' are 0, 1, … 25. This makes sense only over certain short segments of the UTF-16 number line.
Arrays are 0-based, so shifting the scale like this allows characters to be used to index an array without having a large unused portion with elements corresponding to unused characters.
(Note: Some people are saying ASCII because they think it's easier to understand a simpler, wrong thing on the way to learning the right thing. 🤷)
In this case children[] probably has one slot per letter from a to z (26 entries).
What is going on above is that they take the ASCII value of the char c and subtract the ASCII code of 'a', which effectively gives the index of the char c in the alphabet (0-based).
Let c = 'b'
[c-'a'] = 98 - 97 = 1 (Ascii of b - Ascii of a)
With c = 'd'
[c-'a'] = 100 - 97 = 3
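To see the indexing in context, here is a minimal trie sketch. The search method and the names root, children and isEndOfWord come from the question; the 26-slot array, the insert method and the class name TrieSketch are assumptions, and the sketch only handles lowercase 'a' to 'z':

```java
class TrieSketch {
    static class TrieNode {
        // Assumed layout: one slot per lowercase letter, indexed by c - 'a'.
        TrieNode[] children = new TrieNode[26];
        boolean isEndOfWord;
    }

    private final TrieNode root = new TrieNode();

    // Hypothetical insert, written to match the search method from the question.
    void insert(String word) {
        TrieNode runner = root;
        for (char c : word.toCharArray()) {
            int index = c - 'a';  // 'a' -> 0, 'b' -> 1, ..., 'z' -> 25
            if (runner.children[index] == null) {
                runner.children[index] = new TrieNode();
            }
            runner = runner.children[index];
        }
        runner.isEndOfWord = true;
    }

    boolean search(String word) {
        TrieNode runner = root;
        for (char c : word.toCharArray()) {
            if (runner.children[c - 'a'] == null) {
                return false;
            }
            runner = runner.children[c - 'a'];
        }
        return runner.isEndOfWord;
    }

    public static void main(String[] args) {
        TrieSketch trie = new TrieSketch();
        trie.insert("bad");
        System.out.println(trie.search("bad"));  // true
        System.out.println(trie.search("ba"));   // false, only a prefix
    }
}
```

With this layout, a character outside 'a' to 'z' would produce an out-of-range index, so real code would need to validate its input.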
That is a minus sign, not a hyphen. In Java a char takes 2 bytes of space. A char is a 16-bit unsigned value, with bit patterns ranging from 0000000000000000 to 1111111111111111 (0 to 65,535); there is no sign bit. You can easily read it as a number by assigning a char to an int variable (an int holds 4 bytes, so the 2 bytes of a char fit easily).
char charA = 'A'; // represents 65
char charB = 'B'; // represents 66
int diff = charB - charA; // this will represent 66-65 i.e. 1
The index of an array is a non-negative int, and hence it can also accept values like
anyTypeArray[charB - charA] //represents the 2nd element (index starts from 0 for arrays in java).
anyTypeArray['C' - charA] // represents the 3rd element of the array
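As one possible use of a char difference as an array index, the sketch below counts lowercase letters. The class and method names are made up, and skipping anything outside 'a' to 'z' is a choice made for this example:

```java
class LetterCountDemo {
    // Counts lowercase letters; other characters are skipped.
    static int[] countLetters(String text) {
        int[] counts = new int[26];
        for (char c : text.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                counts[c - 'a']++;  // 'a' maps to index 0, 'z' to index 25
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countLetters("banana");
        System.out.println(counts['a' - 'a']);  // 3
        System.out.println(counts['b' - 'a']);  // 1
        System.out.println(counts['n' - 'a']);  // 2
    }
}
```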
I also liked the answer above at https://stackoverflow.com/a/47106997/504133 and am linking it here to extend my answer.
char char1 = 'a';
System.out.println(char1); //prints char 1
System.out.println(char1+1); //prints char 1
System.out.println(char1++); //prints char 1
System.out.println(char1+=1); //prints incremented char1
char1 += 1;
System.out.println(char1); //prints incremented char1
In the above, why doesn't (char1+1) or (char1++) print the incremented character, but the other two do?
First, I'm assuming that because you say the increment in System.out.println works, that you have really specified:
char char1 = 'a';
EDIT
In response to the change of the question (char1+1; => char1 += 1;) I see the issue.
The output is
a
98
b
The 98 shows up because the char a was promoted to an int (binary numeric promotion) to add 1. So a becomes 97 (the ASCII value for 'a') and 98 results.
However, char1 += 1 and char1++ implicitly cast the result back to char after the addition, so they work as expected.
Quoting the JLS, Section 5.6.2, "Binary Numeric Promotion":
Widening primitive conversion (§5.1.2) is applied to convert either or
both operands as specified by the following rules:
If either operand is of type double, the other is converted to double.
Otherwise, if either operand is of type float, the other is converted
to float.
Otherwise, if either operand is of type long, the other is converted
to long.
Otherwise, both operands are converted to type int.
(emphasis mine)
You didn't assign the result of the addition char1+1 to char1. Because that result is an int, a plain assignment needs a cast back to char. So
char1 = (char) (char1 + 1);
or
char1 += 1;
char1++;
are correct.
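The three forms can be compared side by side in a small sketch (the class and method names are invented for illustration). The explicit cast is required in the first form, while the compound assignment and the increment operator narrow back to char on their own:

```java
class CharIncrementDemo {
    // Explicit cast: c + 1 is an int, so it must be narrowed back to char.
    static char viaCast(char c) {
        return (char) (c + 1);
    }

    // Compound assignment includes an implicit cast back to char.
    static char viaCompound(char c) {
        c += 1;
        return c;
    }

    // The increment operator also stores a char.
    static char viaIncrement(char c) {
        c++;
        return c;
    }

    public static void main(String[] args) {
        System.out.println(viaCast('a'));       // b
        System.out.println(viaCompound('a'));   // b
        System.out.println(viaIncrement('a'));  // b
        // char next = c + 1;  // for a char variable c this does not compile without a cast
    }
}
```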
Okay, first of all, fixing the format of your code:
char char1;
char1 = 'a';
System.out.println(char1); // print 1
System.out.println(char1 + 1); // print 2
char1 += 1;
System.out.println(char1); // print 3
which yields the output:
a
98
b
Now, let's look at each call to println() in detail:
1: This is simply taking the character handle named char1 and printing it. It's been assigned the letter a (note the single quotes around the a in the assignment, indicating character). Not surprisingly, this prints the character a.
2: For this line, you're performing an integer addition. A char in Java is held as a Unicode character. The Unicode value for the letter a maps to the number 97. (Note that this also corresponds to the ASCII value for a.) When performing arithmetic operations in Java between mismatched types, the smaller/less precise type's value will be 'upgraded' to the larger type (this is very imprecisely stated). Because of this, the char is 'upgraded' to an int before the addition is performed, and the result is also an int. With this in mind, it's not surprising that the 97 from a + 1 results in a 98 being printed.
3: In this instance we are once again printing the value of a char, so a character is printed. This time the 98 we saw generated before is implicitly cast back into a character. Again, unsurprisingly the next highest number mapping from a is b, so we see a b printed.
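The three cases above come down to which overload receives the argument: println has separate versions for char and for int. A sketch with two hypothetical describe overloads standing in for println makes the choice visible:

```java
class PrintlnOverloadDemo {
    // Stand-ins for println(char) and println(int); the names are made up.
    static String describe(char value) {
        return "char " + value;
    }

    static String describe(int value) {
        return "int " + value;
    }

    public static void main(String[] args) {
        char char1 = 'a';
        System.out.println(describe(char1));      // char a
        System.out.println(describe(char1 + 1));  // int 98
        char1 += 1;
        System.out.println(describe(char1));      // char b
    }
}
```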
Try this:
System.out.println(char1);
System.out.println(++char1);
char1 += 1;
System.out.println(char1);
instead of:
char1 = 'a';
System.out.println(char1);
System.out.println(char1+1);
char1 += 1;
System.out.println(char1);
To my understanding a char is a single character, that is a letter, a digit, a punctuation mark, a tab, a space or something similar. And therefore when I do:
char c = '1';
System.out.println(c);
The output 1 was exactly what I expected. So why is it that when I do this:
int a = 1;
char c = '1';
int ans = a + c;
System.out.println(ans);
I end up with the output 50?
You're getting that because it's adding the ASCII value of the char. You must convert it to an int first.
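If the goal is to add the digit's numeric value rather than its character code, one way to do the conversion is sketched below. The class and method names are made up; digitValue assumes its argument is between '0' and '9', and Character.getNumericValue is the standard library alternative:

```java
class DigitCharDemo {
    // Assumes c is one of '0'..'9'; subtracting '0' gives the digit's value.
    static int digitValue(char c) {
        return c - '0';
    }

    public static void main(String[] args) {
        int a = 1;
        char c = '1';
        System.out.println(a + c);                               // 50 (1 + 49)
        System.out.println(a + digitValue(c));                   // 2
        System.out.println(a + Character.getNumericValue(c));    // 2
    }
}
```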
The character '1' is ASCII code 49. The compiler is doing the only sensible thing it can do with your request, and converting it to int.
You end up with an output of 50 because you have told Java to treat the result of the addition as an int in the following line:
int ans = a + c;
Instead of int, declare ans as a char.
Like so:
final int a = 1;
final char c = '1';
final char ans = (char) (a + c);
System.out.println(ans);
Because you are adding the value of a (1) to the Unicode value of c ('1'), which is 49. The first 128 Unicode code point values are identical to ASCII; you can find those here:
http://www.asciitable.com/
Notice Chr '1' is Dec 49. The rest of the unicode points are here:
http://www.utf8-chartable.de/
A char is a disguised int. A char represents a character by coding it into an int. So for example c, which holds '1', is coded as 49. When you add them together, you get an int which is the sum of the code of the char and the value of the int.
'1' is a digit, not a number, and is encoded in ASCII to be of value 49.
Chars in Java can be promoted to int, so if you ask to add an int like 1 to a char like '1', alias 49, the more narrow type char is promoted to int, getting 49, + 1 => 50.
Note that every non-digit char can be added the same way:
'a' + 0 = 97
'A' + 0 = 65
' ' + 0 = 32
'char' is really just a two-byte unsigned integer.
The value '1' and 1 are very different. '1' is encoded as the two-byte value 49.
"Character encoding" is the topic you want to research. Or from the Java language spec: http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.2.1
Thank you for your time.
For the following code (a method of a class), I am not sure what the (char) is supposed to do. arg1, of type B, has a char field, and at the end I also get a char.
So, does the (char) work like a cast? If it is commonly used I would like to learn more about it, so I would appreciate it if you could tell me where to look specifically.
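static B calculer(B arg1, int arg2) {
    arg1.b = (char)(arg1.b + arg2); // cast the int sum back to char
    return arg1; // assumed: the snippet omitted the return a non-void method needs
}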
Java characters are numbers behind the scenes, where the character actually holds the Unicode code point. The Java char is an unsigned 16-bit number (a UTF-16 code unit), so you can treat chars like numbers in many ways. What this does is add arg2 to the Unicode value of arg1.b.
For example, if arg1.b is 'a' (code point decimal 97) and arg2 is 10, then on return from the method arg1.b will be 'k' (code point decimal 107).
Because Java promotes data types during numerical operations, after adding an int to a char the result of the expression is an int not a char. The char gets promoted to an int. So the cast is needed to turn the result back in to a char for the assignment.
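A runnable sketch of the example above follows. The question does not show class B, so the nested B here (with a single char field b and a constructor) is a hypothetical stand-in, and returning arg1 is an assumption about what the original method was meant to do:

```java
class CalculerDemo {
    // Hypothetical stand-in for the question's class B.
    static class B {
        char b;

        B(char b) {
            this.b = b;
        }
    }

    static B calculer(B arg1, int arg2) {
        // arg1.b + arg2 is an int; the cast turns it back into a char.
        arg1.b = (char) (arg1.b + arg2);
        return arg1;  // assumed return value
    }

    public static void main(String[] args) {
        B result = calculer(new B('a'), 10);
        System.out.println(result.b);  // k (97 + 10 = 107)
    }
}
```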
It casts (arg1.b + arg2) to a char.
When there are multiple data types that can be interpreted, you cast it to the one you want.
For instance, if I were to display the number 97 as a char instead of just the int 97 that it would normally be interpreted as, I would cast it as such:
System.out.println( (char)97 ); // will display 'a', instead of 97
I could display a as 97 by casting it to an int as well.
In your case, it will interpret (arg1.b + arg2) as a char, instead of what it normally would be.
Example:
I wrote this code (where 'a' represents arg1.b and 5 represents arg2):
char x = (char)('a' + 5);
System.out.println(x); // displayed 'f'
It displayed 'f' because it added 'a' and 5 together, which it interpreted as 97 + 5 (97 is the ASCII value of 'a', or essentially its int value), and was then left with (char)(102), which is interpreted as 'f' (the char value of 102).
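The casts discussed above can be tried in both directions with a short sketch (the class and method names are invented). The last line goes slightly beyond the answer: it shows that an int too large for a char wraps around, because the cast keeps only the low 16 bits:

```java
class CastDemo {
    // Narrowing int -> char keeps the low 16 bits of the int.
    static char toChar(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println(toChar(97));          // a
        System.out.println((int) 'a');           // 97
        System.out.println(toChar('a' + 5));     // f
        System.out.println(toChar(65536 + 97));  // a, the value wrapped around
    }
}
```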
It's called casting. A cast converts a value of one data type (like int) into another (here, char). I think this is Java code? If so, you may want to learn a little more about Java characters and the way the language handles this kind of situation.