Confused About Integer and Char Type

There is code like this:
boolean isValid(int ch) {
if(ch < '1' | ch > '7' & ch != 'q') return false;
else return true;
}
class HelpClassDemo {
...
do {
choice=(char) System.in.read();
} while(!hlpobj.isValid(choice));
}
Here is my question:
Why do we use int ch in isValid(int ch) even though choice's type is char?
Shouldn't we use char ch? And if we should use int ch, why is the comparison written as (ch < '1' | ch > '7' & ch != 'q')?
Wouldn't ch<1 or ch>7 be more logical? I know it's quite a simple question, but I'm confused about this.

An int has a wider range than a char (see this data type range table). IMO, receiving a char as an int gives you some kind of overflow protection (points of view on this are welcome), but it is not something I'd do; I'd use the right data type instead.
Now, when you compare
if(ch < '1' | ch > '7' & ch != 'q') return false;
the char literals '1', '7' and 'q' are implicitly widened to int, which is perfectly valid.
And finally, regarding
Isn't ch<1 or ch>7 logical ? I know it's a quite simple question but
I'm confused about this.
It is logical, but it's not the same.
If you do ch > '1' && ch < '7', you are comparing ch to the ASCII value of '1', which is 49, and the ASCII value of '7', which is 55. Basically, you are making sure that ch is one of the characters '2' to '6', both inclusive.
But if you do ch > 1 && ch < 7, you are comparing against the integers 1 and 7, so you are checking that the character code of ch is between 2 and 6 (both inclusive). Those codes are ASCII control characters, which are not human-readable.
If you want to make them equivalent, you would have to compare against the appropriate character codes, like
if (ch > 49 && ch < 55)
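
To see the difference concretely, here is a minimal sketch (the class name CharVsInt is made up for illustration). It prints the numeric values behind the character literals and restates the check from the question with the grouping that operator precedence implies, since & binds tighter than |.

```java
class CharVsInt {
    // Same check as in the question, with explicit parentheses:
    // '&' has higher precedence than '|', so the original groups this way.
    static boolean isValid(int ch) {
        if (ch < '1' | (ch > '7' & ch != 'q')) return false;
        return true;
    }

    public static void main(String[] args) {
        System.out.println((int) '1');    // 49
        System.out.println((int) '7');    // 55
        System.out.println(isValid('3')); // true: the character '3' has code 51
        System.out.println(isValid(3));   // false: the integer 3 is below 49
        System.out.println(isValid('q')); // true: 'q' is explicitly allowed
    }
}
```

Passing the character '3' and the integer 3 gives different results because only the first lies between the codes of '1' and '7'.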

Related

Count Vowels and Consonants

I was exploring this code which gives a count of vowels and consonants, but didn't understand this else if (ch >= 'a' && ch <= 'z') line of code. Please tell me what's the logic behind it.
import java.util.Scanner;
public class Vowels {
public static void main(String[] args) {
// TODO Auto-generated method stub
Scanner sc = new Scanner(System.in);
System.out.println("Enter string");
String str = sc.nextLine();
int vowl = 0;
int conso = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u') {
vowl++;
} else if (ch >= 'a' && ch <= 'z') {
conso++;
}
}
System.out.println(vowl);
System.out.println(conso);
}
}

A benefit of chars is that you can operate on them as if they were integers.
For example, 'a' + 3 evaluates to 100, which is the code of 'd', so (char) ('a' + 3) == 'd'.
It also means that 'a' < 'd' is true.
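
A small sketch of that arithmetic, in case it helps (CharArithmetic and shift are hypothetical names, not part of the code in the question). Note that the sum of a char and an int is an int, so a cast is needed to get a char back.

```java
class CharArithmetic {
    // Moves a character n positions forward in the character table.
    static char shift(char c, int n) {
        return (char) (c + n); // c is promoted to int for the addition
    }

    public static void main(String[] args) {
        int sum = 'a' + 3;                 // 97 + 3
        System.out.println(sum);           // 100
        System.out.println(shift('a', 3)); // d
        System.out.println('a' < 'd');     // true
    }
}
```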

Notice that the if statement catches all lowercase vowels.
Whatever is not a vowel is either a capital letter, a digit, a special character or a consonant.
else if (ch >= 'a' && ch <= 'z')
This checks whether a character that is not a vowel at least falls in the range of the lowercase letters 'a' to 'z', so it is not a special character or a digit. (We know it's not a vowel, but is it in the ASCII range 97 = 'a' to 122 = 'z'?)
Refer to the ASCII table to understand the range comparison.

Comparing characters this way can create confusion, as you can see from Java: Character comparison.
Basically #TDG is correct in saying that ch is checked to be between 'a' and 'z', and thus the check might be translated as "is ch a lowercase character?"
The tricky part is that expectations differ depending on the language people use, since language-specific characters are not taken into account. In German, 'ö' definitely qualifies as a lowercase character but is not in the range of the check. The complexity becomes evident when studying the Unicode code charts.
The best check is to use Character.isLowerCase().
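
A minimal sketch of the difference, assuming you want 'ö' to count as lowercase (LowerCaseCheck and inAsciiRange are illustrative names). Character.isLowerCase is the standard JDK method; the range check is the one from the question.

```java
class LowerCaseCheck {
    // The range check from the question: only covers 'a' to 'z'.
    static boolean inAsciiRange(char ch) {
        return ch >= 'a' && ch <= 'z';
    }

    public static void main(String[] args) {
        char o = '\u00F6'; // the German letter o with umlaut
        System.out.println(inAsciiRange(o));            // false
        System.out.println(Character.isLowerCase(o));   // true
        System.out.println(Character.isLowerCase('b')); // true
        System.out.println(Character.isLowerCase('B')); // false
    }
}
```

Which check is right depends on the task: for counting English consonants the narrow range may be exactly what you want.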

A char is a character represented by a number, which is the index of the character in the ASCII/Unicode table. Since the alphabet characters are arranged in order in the ASCII table, the code checks whether ch is in the range of the lowercase letters, which is 97 to 122 in the table.
Using (int) ch you can see the decimal value of the character and compare it with the index in the ASCII table.
You can see the ASCII table here: https://www.asciitable.com/

How to prevent IntelliJ to stop continuing double-slash comments on the next line?

Short version
When pressing <enter> at the end of a // comment, IntelliJ sometimes decides to continue the // comment on the next line. How can I prevent that? Is there a setting somewhere to disable this automation?
Long version
There is a thing I do regularly: breaking a long expression with a double slash.
Let's say I have a line like
boolean isHex = c >= '0' && c <= '9' || c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
and I want to split it like that
boolean isHex = c >= '0' && c <= '9' //
|| c >= 'A' && c <= 'F' //
|| c >= 'a' && c <= 'f';
Note that I want the final // in order to prevent any formatter from joining the lines again.
So I insert a double slash followed by a return after the '9', by pressing //<enter>. But IntelliJ will auto-continue the comment on the next line.
boolean isHex = c >= '0' && c <= '9' //
// || c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
It forces me to uncomment and reindent the line manually.
I want IntelliJ to not continue the comment on the next line and optionally indent my code:
boolean isHex = c >= '0' && c <= '9' //
|| c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
So I want to disable this "continue // comment after <enter>" feature. Is it possible? I haven't found any setting related to that.

The closest you are going to get is to define a macro to insert a new line and remove the comment and then bind that macro to a suitable key.

Go to Settings → Code Style → Java → Wrapping and Braces and check "Line breaks" under "Keep when reformatting". This will make IntelliJ's formatter respect any manual line breaks, even if they contradict other formatting rules.

How do I get the numerical value/position of a character in the alphabet (1-26) in constant time (O(1)) without using any built in method or function?

How do I get the numerical value/position of a character in the alphabet (1-26) in constant time (O(1)) without using any built in method or function and without caring about the case of the character?

If your compiler supports binary literals you can use
int value = 0b00011111 & character;
If it does not, you can use 31 instead of 0b00011111 since they are equivalent.
int value = 31 & character;
or if you want to use hex
int value = 0x1F & character;
or in octal
int value = 037 & character;
You can use any way to represent the value 31.
This works because in ASCII, lowercase letters are prefixed with 011 and uppercase letters with 010, followed by the binary equivalent of 1-26.
By using the bitmask 00011111 and the AND operator, we clear the 3 most significant bits. This leaves us with 00001 to 11010, that is, 1 to 26.
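
For illustration, here is the same mask as a small Java sketch (AlphabetPosition and position are made-up names). It assumes the input is an ASCII letter; for anything else the result is meaningless.

```java
class AlphabetPosition {
    // Keeps the low five bits, which hold 1..26 for ASCII letters.
    // Assumption: the caller only passes 'a'-'z' or 'A'-'Z'.
    static int position(char letter) {
        return letter & 0x1F;
    }

    public static void main(String[] args) {
        System.out.println(position('a')); // 1
        System.out.println(position('A')); // 1
        System.out.println(position('m')); // 13
        System.out.println(position('Z')); // 26
    }
}
```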

Adding to the very good (self) answer of Charles Staal.
Assuming ASCII encoding, the following will work. Updated thanks to the kind comment of Yves Daoust.
int Get1BasedIndex(char ch) {
return ( ch | ('a' ^ 'A') ) - 'a' + 1;
}
This converts the character to lowercase (by setting the 0x20 bit) and then computes the index.
However a more readable solution (O(1)) is:
int Get1BasedIndex(char ch) {
return ('a' <= ch && ch <= 'z') ? ch - 'a' + 1 : ch - 'A' + 1;
}
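One more solution that is constant time but requires some extra memory is a lookup table:
#include <algorithm>

static int cha[256];
static void init() {
    // Mark every entry as "not a letter" first.
    std::fill_n(cha, 256, -1);
    int code = 1;
    for (char s = 'a', l = 'A'; s <= 'z'; ++s, ++l) {
        cha[(unsigned char)s] = cha[(unsigned char)l] = code++;
    }
}
int Get1BasedIndex(char ch) {
    // The cast avoids a negative index where char is signed.
    return cha[(unsigned char)ch];
}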

We can take the ASCII value of the character and subtract the ASCII value of the first letter ('a' is 97, 'A' is 65). This gives a zero-based position; add 1 to get the range 1-26.
char ch = 'a';
if(ch >=65 && ch <= 90)//if capital letter
System.out.println((int)ch - 65);
else if(ch >=97 && ch <= 122)//if small letters
System.out.println((int)ch - 97);

Strictly speaking it is not possible to do it portably in C/C++ because there is no guarantee on the ordering of the characters.
This said, with a contiguous sequence, Char - 'a' and Char - 'A' obviously give you the position of a lowercase or uppercase letter, and you could write
Ord= 'a' <= Char && Char <= 'z' ? Char - 'a' :
('A' <= Char && Char <= 'Z' ? Char - 'A' : -1);
If you want to favor efficiency over safety, exploit the binary representation of ASCII codes and use the branchless
#define ToLower(Char) ((Char) | 0x20)
Ord= ToLower(Char) - 'a';
(the output for non-letter characters is considered unspecified).
Contrary to the specs, these snippets return the position in the range [0, 25], which is more natural in languages with zero-based indexing.

To check whether a character is of English Alphabet (a-zA-Z)

The method Character.isLetter(char c) tells whether the character is a Unicode letter. What if I want to check for English letters (a-zA-Z) without regex?

Easy
char c = ...;
if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
//english letter
}
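
Wrapped in a method, the check can be compared directly with Character.isLetter. This is only a sketch; EnglishLetter and isEnglishLetter are names chosen for the example.

```java
class EnglishLetter {
    // True only for the 52 unaccented letters a-z and A-Z.
    static boolean isEnglishLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        char eAcute = '\u00E9'; // e with acute accent
        System.out.println(isEnglishLetter('q'));       // true
        System.out.println(isEnglishLetter('5'));       // false
        System.out.println(isEnglishLetter(eAcute));    // false
        System.out.println(Character.isLetter(eAcute)); // true
    }
}
```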

How to check if a character is correct

I have a bunch of characters and want to remove everything that isn't a '#' '.' 'E' and 'G'.
I tried to use this:
if (buffer.get(buffertest) == 'G'|'E'|'#'|'.')
But got an issue with an incompatible type.

The root problem is incorrect use of the bitwise OR operator combined with Java's operator precedence. The | operators are evaluated left to right, and the == operator takes precedence over |. Combined, your expression roughly translates to:
(buffer.get(buffertest) == 'G') | 'E' | '#' | '.'
The first part of the expression, buffer.get(buffertest) == 'G', evaluates to a boolean.
The remaining operands, 'E', '#' and '.', are chars, which | treats as integer values.
The | operator is not defined for a boolean and an integer operand, which leads to the incompatible type compile-time error. You can correct your code by expanding the check this way:
char ch = buffer.get(buffertest);
if(ch == 'G' || ch == 'E' || ch == '#' || ch == '.') {
// do something
}

You need to compare for each character individually. Assuming that buffer.get(buffertest) returns a char, here's how to do it:
char c = buffer.get(buffertest);
if (c == 'G' || c == 'E' || c == '#' || c == '.') {
// do something
}
Alternatively, you could do something like this:
char c = buffer.get(buffertest);
if ("GE#.".contains(Character.toString(c))) {
// do something
}

You haven't shown the type of buffer, which makes things harder. But assuming buffer.get returns a char, you could use:
if ("GE#.".indexOf(buffer.get(buffertest)) >= 0)
Or you could check each option explicitly, as per Simulant's answer... or to do the same thing but only calling get once:
char x = buffer.get(buffertest);
if (x == 'G' || x == 'E' || x == '#' || x == '.')
Your original code is failing because | is trying to perform a bitwise "OR" operation on the four characters... it's not the same thing as performing a logical "OR" on four conditions.
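
Since the type of buffer isn't shown, here is a sketch of the indexOf approach that assumes the characters arrive as a String (KeepAllowed, ALLOWED and keepAllowed are hypothetical names). It drops everything that is not one of the four allowed characters.

```java
class KeepAllowed {
    // The characters that should survive the filter.
    static final String ALLOWED = "GE#.";

    // Returns a copy of input containing only characters from ALLOWED.
    static String keepAllowed(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (ALLOWED.indexOf(c) >= 0) { // -1 means "not found"
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepAllowed("#G.x E1#")); // #G.E#
    }
}
```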

if (buffer.get(buffertest) == 'G'||
buffer.get(buffertest) == 'E'||
buffer.get(buffertest) == '#'||
buffer.get(buffertest) == '.')
