The method Character.isLetter(char c) tells whether the character is a Unicode letter. What if I want to check only for English letters (a-zA-Z) without using a regex?
Easy
char c = ...;
if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')) {
//english letter
}
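A minimal sketch of the same check wrapped in a reusable method. The names `EnglishLetters` and `isEnglishLetter` are made up for illustration and are not part of the JDK.

```java
class EnglishLetters {
    // Hypothetical helper: true only for the 52 ASCII letters a-z and A-Z.
    static boolean isEnglishLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        System.out.println(isEnglishLetter('q'));      // true
        System.out.println(isEnglishLetter('Q'));      // true
        System.out.println(isEnglishLetter('7'));      // false
        // e with an acute accent: a Unicode letter, but not an English one
        System.out.println(isEnglishLetter('\u00e9')); // false
        System.out.println(Character.isLetter('\u00e9')); // true
    }
}
```

The last two lines show the difference the question is about: Character.isLetter accepts the accented letter, while the range check rejects it.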
I was exploring this code, which counts vowels and consonants, but I didn't understand the line else if (ch >= 'a' && ch <= 'z'). Please tell me what the logic behind it is.
import java.util.Scanner;
public class Vowels {
public static void main(String[] args) {
// TODO Auto-generated method stub
Scanner sc = new Scanner(System.in);
System.out.println("Enter string");
String str = sc.nextLine();
int vowl = 0;
int conso = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u') {
vowl++;
} else if (ch >= 'a' && ch <= 'z') {
conso++;
}
}
System.out.println(vowl);
System.out.println(conso);
}
}
A benefit of chars is that you can operate on them as if they were integers.
For example, 'a' + 3 == 'd' is true, because 'a' + 3 evaluates to 100, which is the code of 'd'.
This also means that 'a' < 'd' is true.
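The arithmetic above can be tried out directly. This is only a small sketch; the class name `CharArithmetic` and the helper `shift` are hypothetical names chosen for the example.

```java
class CharArithmetic {
    // Hypothetical helper: moves a character by the given offset in the code table.
    static char shift(char c, int offset) {
        // c + offset is an int, so a cast is needed to get a char back
        return (char) (c + offset);
    }

    public static void main(String[] args) {
        System.out.println(shift('a', 3));  // d
        System.out.println('a' + 3);        // 100, because char + int is an int
        System.out.println('a' + 3 == 'd'); // true
        System.out.println('a' < 'd');      // true
    }
}
```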
Notice that the if statement catches all lowercase vowels.
Whatever is not a vowel is either a capital letter, a digit, a special character or a consonant.
else if (ch >= 'a' && ch <= 'z')
This checks whether a character that is not a vowel at least falls in the range of the lowercase letters 'a' to 'z', and is therefore not a special character or a digit. (We know it's not a vowel, but is it in the ASCII range 97 = 'a' to 122 = 'z'?)
Refer to the ASCII table to understand the range comparison.
Comparing characters this way can create confusion, as you can see from Java: Character comparison.
Basically @TDG is correct in saying that ch is checked to be between 'a' and 'z', and thus the check might be translated as "is ch a lowercase character?"
The tricky part is that expectations differ depending on the language people use, since language-specific characters are not taken into account. In German, 'ö' would definitely qualify as a lowercase character, but it is not in the range of the check. The complexity becomes evident when studying the Unicode code charts.
The most robust check is Character.isLowerCase(), which also accepts lowercase letters outside a-z.
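To make the difference concrete, here is a small sketch comparing the range check with Character.isLowerCase() for the German 'ö' (written as the escape \u00f6 to avoid encoding problems). The class and method names are assumptions made for this example.

```java
class LowerCaseChecks {
    // ASCII-only check: matches 'a' through 'z' and nothing else.
    static boolean isAsciiLowerCase(char ch) {
        return ch >= 'a' && ch <= 'z';
    }

    public static void main(String[] args) {
        char oUmlaut = '\u00f6'; // German letter o with umlaut
        System.out.println(isAsciiLowerCase('k'));          // true
        System.out.println(isAsciiLowerCase(oUmlaut));      // false
        System.out.println(Character.isLowerCase(oUmlaut)); // true
    }
}
```

Which of the two is right depends on the requirement: for "English letters only" the range check is what you want, while for text in other languages Character.isLowerCase() is the safer choice.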
A char is a character represented by a number, which is the index of the character in the ASCII/Unicode table. Since the letters of the alphabet are arranged in order in the ASCII table, the condition in question checks whether ch is in the range of the lowercase letters, which is 97 to 122 in the table.
Using (int) ch you can see the decimal value of the character and compare it with the index in the ASCII table.
You can see the ASCII table here: https://www.asciitable.com/
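A short sketch of the (int) cast mentioned above, printing the boundaries of the letter ranges. The class name `CharCodes` and the helper `codeOf` are hypothetical.

```java
class CharCodes {
    // Hypothetical helper: returns the numeric code of a character.
    static int codeOf(char ch) {
        return (int) ch;
    }

    public static void main(String[] args) {
        System.out.println(codeOf('a')); // 97
        System.out.println(codeOf('z')); // 122
        System.out.println(codeOf('A')); // 65
        System.out.println(codeOf('Z')); // 90
    }
}
```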
The method returns true if the character is a letter of the English alphabet (uppercase or lowercase) or one of the Arabic numerals. The method returns false otherwise.
Method for the ASCII character set:
boolean isEnglishLetterOrDigit(char letter) {
return (letter >= 'a' && letter <= 'z') ||
(letter >= 'A' && letter <= 'Z') ||
(letter >= '0' && letter <= '9');
}
Short version
When pressing <enter> at the end of a // comment, IntelliJ sometimes decides to continue the // comment on the next line. How can I prevent that? Is there a setting somewhere to disable this behavior?
Long version
There is a thing I do regularly: breaking a long expression with a double slash.
Let's say I have a line like
boolean isHex = c >= '0' && c <= '9' || c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
and I want to split it like that
boolean isHex = c >= '0' && c <= '9' //
|| c >= 'A' && c <= 'F' //
|| c >= 'a' && c <= 'f';
Note that I want the trailing // in order to prevent any formatter from joining the lines again.
So I insert a double slash and a line break after the '9' by pressing //<enter>. But IntelliJ will auto-continue the comment on the next line.
boolean isHex = c >= '0' && c <= '9' //
// || c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
It forces me to uncomment and reindent the line manually.
I want IntelliJ to not continue the comment on the next line and, optionally, to indent my code:
boolean isHex = c >= '0' && c <= '9' //
|| c >= 'A' && c <= 'F' || c >= 'a' && c <= 'f';
So I want to disable this "continue // comment after <enter>" feature. Is it possible? I haven't found any setting related to that.
The closest you are going to get is to define a macro to insert a new line and remove the comment and then bind that macro to a suitable key.
Go to Settings → Code Style → Java → Wrapping and Braces and check "Line breaks" under "Keep when reformatting". This will make IntelliJ's formatter respect any manual line breaks, even if they contradict other formatting rules.
There is code like this:
boolean isValid(int ch) {
if(ch < '1' | ch > '7' & ch != 'q') return false;
else return true;
}
class HelpClassDemo {
...
do {
choice=(char) System.in.read();
} while(!hlpobj.isValid(choice));
}
Here is the question:
Why do we use int ch in isValid(int ch) even though choice's type is char?
Shouldn't we use char ch? And if we should use int ch, why is there code like this: (ch < '1' | ch > '7' & ch != 'q')
Isn't ch<1 or ch>7 logical ? I know it's a quite simple question but I'm confused about this.
An int type has a wider range than a char type (see this data type range table). IMO, receiving a char as an int provides you with some kind of overflow protection (points of view on this are welcome), but it is not something I'd do; I'd use the right data type instead.
Now, when you compare
if(ch < '1' | ch > '7' & ch != 'q') return false;
you are implicitly casting '1' and '7' to int type, which is perfectly valid.
And finally, regarding
Isn't ch<1 or ch>7 logical ? I know it's a quite simple question but
I'm confused about this.
It is logical, but it's not the same.
If you do ch > '1' && ch < '7', you are comparing ch to the ASCII value of '7', which is 55, and the value of '1', which is 49. Basically, you are making sure that ch is a char between '2' and '6', both inclusive.
But if you do ch > 1 && ch < 7, you are comparing against the integers 1 and 7, so you are validating that the code of the char represented by ch is between 2 and 6 (both inclusive). Such a char is a control character and not human-readable.
If you want to make them equivalent, you would have to compare against the appropriate char values, like
if (ch > 49 && ch < 55)
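The difference between the character '1' and the integer 1 can be checked with a small sketch. The isValid method below restates the condition from the question with explicit parentheses, since & binds tighter than |; it uses the short-circuit operators || and &&, which give the same result here. The class name `DigitComparison` is an assumption made for the example.

```java
class DigitComparison {
    // Same logic as isValid in the question: valid inputs are '1' to '7' and 'q'.
    static boolean isValid(int ch) {
        return !(ch < '1' || (ch > '7' && ch != 'q'));
    }

    public static void main(String[] args) {
        System.out.println((int) '1');    // 49
        System.out.println((int) '7');    // 55
        System.out.println(isValid('3')); // true
        System.out.println(isValid('q')); // true
        System.out.println(isValid('8')); // false
        // The integer 3 is a control character, not the digit '3'
        System.out.println(isValid(3));   // false
    }
}
```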
I tried to implement Rot13 and make it as minimal as possible. These are my results so far:
if ( (c >= 'A') && (c <= 'Z') )
c=((c-'A'+13)%26)+'A';
if ( (c >= 'a') && (c <= 'z') )
c=((c-'a'+13)%26)+'a';
return c;
I showed this to my Prof and he said it would be possible in two lines. I don't know how I could shrink this code further without generating wrong output.
Thanks for your help
EDIT: If nothing changed (a character outside the ASCII letter ranges), it should simply return c. Maybe the solution is the second answer plus a return c line in case nothing was returned.
You don't need to update c; just return:
if ((c >= 'A') && (c <= 'Z')) {
return ((c - 'A' + 13) % 26) + 'A';
}
if ((c >= 'a') && (c <= 'z')) {
return ((c - 'a' + 13) % 26) + 'a';
}
I also made the code more readable.
This could easily be made into two lines:
if ((c >= 'A') && (c <= 'Z')) return ((c - 'A' + 13) % 26) + 'A';
if ((c >= 'a') && (c <= 'z')) return ((c - 'a' + 13) % 26) + 'a';
Or one:
if ((c >= 'A') && (c <= 'Z')) return ((c - 'A' + 13) % 26) + 'A'; if ((c >= 'a') && (c <= 'z')) return ((c - 'a' + 13) % 26) + 'a';
But of course, that is much less readable, and not a good idea.
One line:
return (c < 'a') ? ((c - 'A' + 13) % 26) + 'A' : ((c - 'a' + 13) % 26) + 'a';
This simply makes use of the fact that lower case letters come after upper case letters in ASCII and UTF-8. Of course, it doesn't verify the input in any way.
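A quick sketch of what "doesn't verify the input" means in practice. It wraps the one-liner in a method and adds a (char) cast so that it compiles with a char return type in Java; the class and method names are hypothetical.

```java
class UncheckedRot13 {
    // The one-liner with a cast added; it assumes c is an ASCII letter.
    static char rot13Unchecked(char c) {
        return (char) ((c < 'a') ? ((c - 'A' + 13) % 26) + 'A'
                                 : ((c - 'a' + 13) % 26) + 'a');
    }

    public static void main(String[] args) {
        System.out.println(rot13Unchecked('A')); // N
        System.out.println(rot13Unchecked('n')); // a
        // A non-letter is silently changed instead of being returned as is
        System.out.println(rot13Unchecked('!')); // .
    }
}
```

The '!' case turns into '.' because (33 - 65 + 13) % 26 is -19 in Java, and 65 - 19 is 46, the code of '.'.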
There is a little trick using the ASCII table. Upper and lower case chars differ in only one bit, so you can handle them both at once. Take a look at this:
A = 0100 0001 M = 0100 1101
a = 0110 0001 m = 0110 1101
So, I think this should work:
if (Character.isLetter(c))
return (char) ((((c & 0b01011111) - 'A' + 13) % 26 + 'A') | (c & 0b00100000));
return c;
Explanation:
c & 0b01011111 turns the char into uppercase.
- 'A' + 13 converts to a 0-based int and applies the offset.
% 26 + 'A' takes the modulo and turns the result back into a char code.
(c & 0b00100000) extracts the bit that indicates whether the char was lowercase or not.
| adds that bit back to the result to make it lowercase if it was.
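The steps of the explanation can be written out one per line. This is a sketch that assumes c is already known to be an ASCII letter; the names `Rot13Bits` and `rot13Letter` are made up for the example.

```java
class Rot13Bits {
    // Assumes c is an ASCII letter (A-Z or a-z); other input is not handled.
    static char rot13Letter(char c) {
        int upper = c & 0b01011111;                  // clear bit 5: force upper case
        int rotated = (upper - 'A' + 13) % 26 + 'A'; // rotate within A-Z
        int caseBit = c & 0b00100000;                // 32 if c was lower case, else 0
        return (char) (rotated | caseBit);           // restore the original case
    }

    public static void main(String[] args) {
        System.out.println(rot13Letter('a')); // n
        System.out.println(rot13Letter('m')); // z
        System.out.println(rot13Letter('Z')); // M
    }
}
```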
You could use the conditional operator here to make it a one-liner:
return Character.isLetter(c) ? (char) ((((c & 0b01011111) - 'A' + 13) % 26 + 'A') | (c & 0b00100000)) : c;
After replacing the binary and char literals by decimal int literals, you get:
return Character.isLetter(c) ? (char) ((((c & 95) - 52) % 26 + 65) | (c & 32)) : c;
Eliminating spaces and some extra brackets gives: (65 chars)
return Character.isLetter(c)?(char)((((c&95)-52)%26+65)|c&32):c;
Which is a win, IMHO, when it comes to code golfing. This is of course not readable.
Demo: Yep, confirmed. It works: http://ideone.com/l6xYy6
Excerpt from the output:
= -> =
> -> >
? -> ?
@ -> @
A -> N
B -> O
C -> P
D -> Q
And a bit further:
W -> J
X -> K
Y -> L
Z -> M
[ -> [
\ -> \
] -> ]
^ -> ^
_ -> _
` -> `
a -> n
b -> o
c -> p
d -> q
Slightly more correct than Sibbo's answer. This returns c as is if it falls in neither range, and it does so in a single statement.
return ((c >= 'A') && (c <= 'Z')) ? ((c-'A'+13)%26)+'A'
:((c >= 'a') && (c <= 'z') ? ((c-'a'+13)%26)+'a'
: c);
Even shorter and (perhaps) still easier to read is
char a = c < 'a' ? 'A' : 'a';
return (c - a + 13) % 26 + a;
Note that this solution, like some of the previous answers, doesn't check the input. Moreover, in Java this code returns an int, not a char, so a cast would be necessary if the method in which it is included returns a char.
As already mentioned, I would also like to stress that shortest is not necessarily best. Write readable code.
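For completeness, here is a readable sketch that checks the input and includes the cast to char mentioned above. The class name `Rot13` and the String overload are assumptions added for the example, not part of any answer.

```java
class Rot13 {
    // Rotates ASCII letters by 13 places; every other character is returned as is.
    static char rot13(char c) {
        if (c >= 'A' && c <= 'Z') return (char) ((c - 'A' + 13) % 26 + 'A');
        if (c >= 'a' && c <= 'z') return (char) ((c - 'a' + 13) % 26 + 'a');
        return c;
    }

    // Hypothetical convenience overload that applies rot13 to each character.
    static String rot13(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(rot13(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(rot13("Hello, World!"));        // Uryyb, Jbeyq!
        System.out.println(rot13(rot13("Hello, World!"))); // Hello, World!
    }
}
```

Applying the method twice gives back the original text, which makes for an easy sanity check of any of the shorter variants.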
Well, if we're going for short over readable:
return (c&~32) >= 'A' && (c&~32) <= 'Z' ? ((c&31) + 12) % 26 + (c&~31) + 1 : c;