Words with two or more capital letters in Java

I want to match words that contain at least two capital letters; any special characters (like @#$%^&*()_-+= and so on) are optional.
I tried:
public static boolean isWordHas2Caps(String s) {
return s.matches("\\b(?:\\p{Ll}*\\p{Lu}){2,}\\p{Ll}*\\b");
}
But, I am getting
System.out.println(isWordHas2Caps("eHJHJK"));
System.out.println(isWordHas2Caps("YUIYUI"));
System.out.println(isWordHas2Caps("LkfjkdJkdfj"));
System.out.println(isWordHas2Caps("LLdkjkd"));
System.out.println(isWordHas2Caps("OhdfjhdsjO"));
System.out.println(isWordHas2Caps("LLLuoiu9898"));
System.out.println(isWordHas2Caps("Ohdf&jh/dsjO"));
System.out.println(isWordHas2Caps("auuuu"));
System.out.println(isWordHas2Caps("JJJJJJJJ"));
System.out.println(isWordHas2Caps("YYYY99999"));
System.out.println(isWordHas2Caps("ooooPPPP"));
Output:
true eHJHJK
true YUIYUI
true LkfjkdJkdfj
true LLdkjkd
true OhdfjhdsjO
false LLLuoiu9898 It should be true but getting false
false Ohdf&jh/dsjO It should be true but getting false
false auuuu
true JJJJJJJJ
false YYYY99999 It should be true but getting false
true ooooPPPP
I think I need to allow digits and special characters in the regexp. How can I do that?

Update:
A valuable comment from anubhava:
Probably s.matches("(?:\\S*\\p{Lu}){2}\\S*"); may be better
Demo of the above solution.
Original answer:
You can use the regex, \b.*\p{Lu}.*\p{Lu}.*\b as shown below:
public static boolean isWordHas2Caps(String s) {
return s.matches("\\b.*\\p{Lu}.*\\p{Lu}.*\\b");
}
Demo:
public class Main {
public static void main(String[] args) {
System.out.println(isWordHas2Caps("eHJHJK"));
System.out.println(isWordHas2Caps("YUIYUI"));
System.out.println(isWordHas2Caps("LkfjkdJkdfj"));
System.out.println(isWordHas2Caps("LLdkjkd"));
System.out.println(isWordHas2Caps("OhdfjhdsjO"));
System.out.println(isWordHas2Caps("LLLuoiu9898"));
System.out.println(isWordHas2Caps("Ohdf&jh/dsjO"));
System.out.println(isWordHas2Caps("auuuu"));
System.out.println(isWordHas2Caps("JJJJJJJJ"));
System.out.println(isWordHas2Caps("YYYY99999"));
System.out.println(isWordHas2Caps("ooooPPPP"));
}
public static boolean isWordHas2Caps(String s) {
return s.matches("\\b.*\\p{Lu}.*\\p{Lu}.*\\b");
}
}
Output:
true
true
true
true
true
true
true
false
true
true
true
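One caveat, as far as I can tell: with String#matches the \b anchors are tested at the very start and very end of the string, so an input that begins or ends with a non-word character (such as "&AB&") is rejected even though it contains two uppercase letters. The pattern from the update has no such anchors, but it rejects any input containing whitespace. A minimal sketch comparing the two (the class and method names are made up for this illustration):

```java
class BoundaryCaveat {
    // Pattern from the original answer: \b is tested at both ends of the string.
    static boolean withBoundaries(String s) {
        return s.matches("\\b.*\\p{Lu}.*\\p{Lu}.*\\b");
    }

    // Pattern from the update: no \b, but whitespace is not allowed anywhere.
    static boolean withNonSpace(String s) {
        return s.matches("(?:\\S*\\p{Lu}){2}\\S*");
    }

    public static void main(String[] args) {
        // Prints each input followed by the result of both patterns.
        for (String s : new String[] {"Ohdf&jh/dsjO", "&AB&", "A B"}) {
            System.out.println(s + " -> " + withBoundaries(s) + " / " + withNonSpace(s));
        }
    }
}
```

Which behaviour is preferable depends on whether the input is guaranteed to be a single token.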

You want to check if there are at least two uppercase letters anywhere in a string that can contain arbitrary chars.
Then, you can use
public static boolean isWordHas2Caps(String s) {
return Pattern.compile("\\p{Lu}\\P{Lu}*\\p{Lu}").matcher(s).find();
}
See the Java demo.
Alternatively, if you still want to use String#matches you can use the following (keeping in mind that we need to match the entire string):
public static boolean isWordHas2Caps(String s) {
return s.matches("(?s)(?:\\P{Lu}*\\p{Lu}){2}.*");
}
The (?s)(?:\\P{Lu}*\\p{Lu}){2}.* regex matches
(?s) - the Pattern.DOTALL embedded flag option (makes . match any chars)
(?:\P{Lu}*\p{Lu}){2} - two occurrences of any zero or more chars other than uppercase letters and then an uppercase letter
.* - the rest of the string.
Your code did not return the expected results for those three inputs because they contain non-letter characters (digits, & and /), while String#matches() requires the whole string to match the pattern, and your pattern only matches strings that consist of letters.
That is why you should
Make sure you can match anywhere inside a string; Matcher.find does this job best
The \p{Lu}\P{Lu}*\p{Lu} pattern will find any sequence of an uppercase letter + any zero or more characters other than uppercase letters + an uppercase letter
Alternatively, you can use the (?s)(?:\P{Lu}*\p{Lu}){2}.* regex to match a full string that contains at least two uppercase letters.
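To make the difference concrete, here is a small sketch that puts the question's pattern next to the find()-based check, plus a regex-free variant that simply counts code points in the Lu category. The class and method names are invented for this example, and compiling the pattern once into a constant is my own choice rather than something the answers require.

```java
import java.util.regex.Pattern;

class TwoCapsDemo {
    // Compiled once and reused: uppercase, any non-uppercase chars, uppercase.
    private static final Pattern TWO_UPPER = Pattern.compile("\\p{Lu}\\P{Lu}*\\p{Lu}");

    // The pattern from the question: the whole string must consist of letters.
    static boolean original(String s) {
        return s.matches("\\b(?:\\p{Ll}*\\p{Lu}){2,}\\p{Ll}*\\b");
    }

    // find() succeeds as soon as two uppercase letters occur anywhere.
    static boolean hasTwoCapsFind(String s) {
        return TWO_UPPER.matcher(s).find();
    }

    // Regex-free variant: count code points whose general category is Lu.
    static boolean hasTwoCapsCount(String s) {
        return s.codePoints()
                .filter(cp -> Character.getType(cp) == Character.UPPERCASE_LETTER)
                .limit(2)
                .count() == 2;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"LLLuoiu9898", "Ohdf&jh/dsjO", "YYYY99999", "auuuu"}) {
            System.out.println(s + ": original=" + original(s)
                    + ", find=" + hasTwoCapsFind(s)
                    + ", count=" + hasTwoCapsCount(s));
        }
    }
}
```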

Related

How to know if a string could match a regular expression by adding more characters

This is a tricky question, and maybe in the end it has no solution (or not a reasonable one, at least). I'd like to have a Java specific example, but if it can be done, I think I could do it with any example.
My goal is to find a way of knowing whether a string being read from an input stream could still match a given regular expression pattern. Or, in other words, to read the stream until we've got a string that definitely will not match the pattern, no matter how many characters you add to it.
A declaration for a minimalist simple method to achieve this could be something like:
boolean couldMatch(CharSequence charsSoFar, Pattern pattern);
Such a method would return true in case that charsSoFar could still match pattern if new characters are added, or false if it has no chance at all to match it even adding new characters.
To put a more concrete example, say we have a pattern for float numbers like "^([+-]?\\d*\\.?\\d*)$".
With such a pattern, couldMatch would return true for the following example charsSoFar parameter:
"+"
"-"
"123"
".24"
"-1.04"
And so on and so forth, because you can continue adding digits to all of these, and also one dot to the first three.
On the other hand, all these examples derived from the previous one should return false:
"+A"
"-B"
"123z"
".24."
"-1.04+"
It's clear at first sight that these will never comply with the aforementioned pattern, no matter how many characters you add to them.
EDIT:
I'm adding my current non-regex approach to make things clearer.
First, I declare the following functional interface:
public interface Matcher {
/**
* It will return the matching part of "source" if any.
*
* @param source
* @return
*/
CharSequence match(CharSequence source);
}
Then, the previous function would be redefined as:
boolean couldMatch(CharSequence charsSoFar, Matcher matcher);
And a (drafted) matcher for floats could look like (note this does not support the + sign at the start, just the -):
public class FloatMatcher implements Matcher {
@Override
public CharSequence match(CharSequence source) {
StringBuilder rtn = new StringBuilder();
if (source.length() == 0)
return "";
if ("0123456789-.".indexOf(source.charAt(0)) != -1 ) {
rtn.append(source.charAt(0));
}
boolean gotDot = false;
for (int i = 1; i < source.length(); i++) {
if (gotDot) {
if ("0123456789".indexOf(source.charAt(i)) != -1) {
rtn.append(source.charAt(i));
} else
return rtn.toString();
} else if (".0123456789".indexOf(source.charAt(i)) != -1) {
rtn.append(source.charAt(i));
if (source.charAt(i) == '.')
gotDot = true;
} else {
return rtn.toString();
}
}
return rtn.toString();
}
}
The omitted body of the couldMatch method just calls matcher.match() repeatedly, each time with one more character appended to the source parameter. It returns true as long as the returned CharSequence is equal to the source parameter, and false as soon as it differs (meaning that the last character added broke the match).
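For illustration, the check described above might look roughly like the sketch below. PrefixMatcher is a hypothetical stand-in for the Matcher interface declared earlier (renamed only to avoid confusion with java.util.regex.Matcher), and DIGITS is a toy matcher invented for the example, not the FloatMatcher from the question.

```java
class CouldMatchSketch {
    /** Hypothetical stand-in for the question's Matcher interface. */
    interface PrefixMatcher {
        /** Returns the leading part of source that matches, if any. */
        CharSequence match(CharSequence source);
    }

    // Toy matcher: returns the leading run of ASCII digits.
    static final PrefixMatcher DIGITS = source -> {
        int i = 0;
        while (i < source.length() && source.charAt(i) >= '0' && source.charAt(i) <= '9') {
            i++;
        }
        return source.subSequence(0, i);
    };

    // The input is still viable while the matcher accepts all of it.
    static boolean couldMatch(CharSequence charsSoFar, PrefixMatcher matcher) {
        return matcher.match(charsSoFar).toString().contentEquals(charsSoFar);
    }

    public static void main(String[] args) {
        System.out.println(couldMatch("123", DIGITS));
        System.out.println(couldMatch("12a", DIGITS));
    }
}
```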
You can do it as easily as
boolean couldMatch(CharSequence charsSoFar, Pattern pattern) {
Matcher m = pattern.matcher(charsSoFar);
return m.matches() || m.hitEnd();
}
If the sequence does not match and the engine did not reach the end of the input, it implies that there is a contradicting character before the end, which won’t go away when adding more characters at the end.
Or, as the documentation says:
Returns true if the end of input was hit by the search engine in the last match operation performed by this matcher.
When this method returns true, then it is possible that more input would have changed the result of the last search.
This is also used by the Scanner class internally, to determine whether it should load more data from the source stream for a matching operation.
Using the method above with your sample data yields
Pattern fpNumber = Pattern.compile("[+-]?\\d*\\.?\\d*");
String[] positive = {"+", "-", "123", ".24", "-1.04" };
String[] negative = { "+A", "-B", "123z", ".24.", "-1.04+" };
for(String p: positive) {
System.out.println("should accept more input: "+p
+", couldMatch: "+couldMatch(p, fpNumber));
}
for(String n: negative) {
System.out.println("can never match at all: "+n
+", couldMatch: "+couldMatch(n, fpNumber));
}
should accept more input: +, couldMatch: true
should accept more input: -, couldMatch: true
should accept more input: 123, couldMatch: true
should accept more input: .24, couldMatch: true
should accept more input: -1.04, couldMatch: true
can never match at all: +A, couldMatch: false
can never match at all: -B, couldMatch: false
can never match at all: 123z, couldMatch: false
can never match at all: .24., couldMatch: false
can never match at all: -1.04+, couldMatch: false
Of course, this doesn’t say anything about the chances of turning a nonmatching content into a match. You could still construct patterns for which no additional character could ever match. However, for ordinary use cases like the floating point number format, it’s reasonable.
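Since the question is about reading from a stream, here is a rough sketch of how couldMatch might be used to consume characters until the text can no longer match. The class name and the readWhileViable helper are assumptions made for this example; it also re-runs the match from the start after every character, which is simple but quadratic in the length of the input.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ViablePrefixReader {
    static boolean couldMatch(CharSequence charsSoFar, Pattern pattern) {
        Matcher m = pattern.matcher(charsSoFar);
        return m.matches() || m.hitEnd();
    }

    /** Reads characters until the accumulated text can no longer match; returns the viable prefix. */
    static String readWhileViable(Reader in, Pattern pattern) {
        StringBuilder sb = new StringBuilder();
        try {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
                if (!couldMatch(sb, pattern)) {
                    sb.setLength(sb.length() - 1); // drop the character that broke the match
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Pattern fpNumber = Pattern.compile("[+-]?\\d*\\.?\\d*");
        System.out.println(readWhileViable(new StringReader("-1.04+7"), fpNumber));
    }
}
```

Note that the returned prefix is only known to be viable, not necessarily a complete match; call matches() on it if that distinction matters.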
I have no specific solution, but you might be able to do this with negations.
If you set up a blacklist of regex patterns that definitely cannot match your pattern (e.g. + followed by a letter), you could check against these. If a blacklisted regex matches, you can abort.
Another idea is to use negative lookaheads (https://www.regular-expressions.info/lookaround.html)

How to test if a string contains numbers at certain points

I have a string with 6 characters in length. The first character must be a capital letter, and the last 5 characters must be digits.
I need to write code to return true if the characters that follow after the capital letter are digits, and false if they are not.
Here is what I have so far, but when testing the code, I get an error:
public boolean hasValidDigits(String s)
{
if (Character.isDigit(s.charAt(1-5))) {
return true;
} else {
return false;
}
}
Next time, please include the error description.
What you need here is a regex, which tests the string against a pattern.
For example:
return s.matches("[A-Z]{1}[0-9]{5}");
[A-Z]{1}[0-9]{5} means: one capital letter, and 5 digits after.
Check str.matches("[A-Z][0-9]{5}");
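As for the error itself: 1-5 in the question's code is plain integer subtraction and evaluates to -4, so s.charAt(1-5) throws a StringIndexOutOfBoundsException instead of looking at positions 1 through 5. A minimal sketch of both a loop-based check and the regex check follows; the class name, the isValidCode name, and the null and length guards are my own additions for the example.

```java
class DigitCheck {
    // Loop variant: positions 1..5 must each be an ASCII digit.
    static boolean hasValidDigits(String s) {
        if (s == null || s.length() != 6) {
            return false;
        }
        for (int i = 1; i <= 5; i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }

    // Regex variant: one capital letter followed by exactly five digits.
    static boolean isValidCode(String s) {
        return s != null && s.matches("[A-Z][0-9]{5}");
    }

    public static void main(String[] args) {
        System.out.println(hasValidDigits("A12345") + " " + isValidCode("A12345"));
        System.out.println(hasValidDigits("A1234x") + " " + isValidCode("a12345"));
    }
}
```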

Java regex not working for operand groups of possible values

I'm trying to write a Java method that will determine (true or false) if a particular String matches a regex of animal<L_OPERAND,R_OPERAND>, where L_OPERAND can be any of the following values: dog, cat, sheep and R_OPERAND can be any one of the following values: red, blue. All values are case- and whitespace-sensitive.
Some examples:
animal<fizz,cat> => false; fizz is not a valid L_OPERAND value
animAl<dog,blue> => false; animAl contains an upper-case char (illegal)
animal<dog,sheep> => false; sheep is not a valid R_OPERAND value
animal<dog, blue> => false; contains whitespace between ',' and 'blue' (no whitespace allowed)
animal<dog,blue> => true; valid
animal<cat,red> => true; valid
animal<sheep,blue> => true; valid
My best attempt so far:
public class RegexExperiments {
public static void main(String[] args) {
boolean b = new RegexExperiments().isValidAnimalDef("animal<dog,blue>");
System.out.println(b);
}
public boolean isValidAnimalDef(String animalDef) {
String regex = "animal<[dog,cat,sheep],[red,blue]>";
if(animalDef.matches(regex)) {
return true;
} else {
return false;
}
}
}
Although I'm not getting any exceptions, I'm getting false for every type of input string (animalDef) I pass in. So obviously my regex is bad. Can anyone spot where I'm going awry?
Your problem lies within the [dog,cat,sheep] and [red,blue] structures. [] represents a character class: it matches a single character from the set inside the brackets. For the first one this would be any of ,acdeghopst and for the second any of ,bdelru. So you currently match strings like animal<d,b> or even animal<,,,>.
What you are after is a mix of a grouping structure and an alternation. Alternations are provided by |, so e.g. dog|cat|sheep would match dog or cat or sheep. As you want this alternation inside a larger pattern, you have to contain it inside a group. The simplest grouping structure for this case is a capturing group, which starts with ( and ends with ).
Your final pattern could then be animal<(dog|cat|sheep),(red|blue)>.
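A quick way to see the difference is to run both patterns side by side. The sketch below uses non-capturing groups, (?:...), because the captured text is never read; that is a small variation of mine, and the plain (...) form behaves the same for matching. The class and method names are made up for the example.

```java
class CharClassVsGroup {
    // The question's pattern: each [...] matches exactly one character.
    static boolean original(String s) {
        return s.matches("animal<[dog,cat,sheep],[red,blue]>");
    }

    // Alternation inside groups: each group matches one whole word.
    static boolean fixed(String s) {
        return s.matches("animal<(?:dog|cat|sheep),(?:red|blue)>");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"animal<d,b>", "animal<,,,>", "animal<dog,blue>", "animal<dog, blue>"}) {
            System.out.println(s + ": original=" + original(s) + ", fixed=" + fixed(s));
        }
    }
}
```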
Try
String regex = "animal<(dog|cat|sheep),(red|blue)>";
You can use RegEx animal<(dog|cat|sheep),(red|blue)>
Output
false
false
false
false
true
true
true
Code
import java.util.regex.*;
public class HelloWorld {
public static void main(String[] args) {
System.out.println(filterOut("animal<fizz,cat>"));
System.out.println(filterOut("animAl<dog,blue>"));
System.out.println(filterOut("animal<dog,sheep>"));
System.out.println(filterOut("animal<dog, blue>"));
System.out.println(filterOut("animal<dog,blue>"));
System.out.println(filterOut("animal<cat,red>"));
System.out.println(filterOut("animal<sheep,blue>"));
}
public static boolean filterOut(String str) {
return Pattern.compile("animal<(dog|cat|sheep),(red|blue)>").matcher(str).find();
}
}
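One thing to keep in mind about the code above: find() looks for the pattern anywhere in the input, so a string with extra text around a valid definition is also accepted. If the whole string has to be a valid definition, matches() (or explicit ^ and $ anchors) is probably closer to what the question asks for. A small sketch of the difference, with illustrative names:

```java
import java.util.regex.Pattern;

class FindVsMatches {
    private static final Pattern ANIMAL =
            Pattern.compile("animal<(dog|cat|sheep),(red|blue)>");

    // True if a valid definition occurs anywhere in the input.
    static boolean viaFind(String s) {
        return ANIMAL.matcher(s).find();
    }

    // True only if the entire input is a valid definition.
    static boolean viaMatches(String s) {
        return ANIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        String padded = "xxanimal<dog,blue>yy";
        System.out.println(viaFind(padded) + " " + viaMatches(padded));
    }
}
```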

Check that string contains non-latin letters

I have the following method to check that string contains only latin symbols.
private boolean containsNonLatin(String val) {
return val.matches("\\w+");
}
But it returns false if I pass the string "my string", because it contains a space.
I need a method that returns false if the string contains letters outside the Latin alphabet, and true in all other cases.
Please help to improve my method.
examples of valid strings:
w123.
w, 12
w#123
dsf%&#
You can use \p{IsLatin} class:
return !(val.matches("[\\p{Punct}\\p{Space}\\p{IsLatin}]+$"));
Java Regex Reference
I need something like not p{IsLatin}
If you need to match all letters but Latin ASCII letters, you can use
"[\\p{L}\\p{M}&&[^\\p{Alpha}]]+"
The \p{Alpha} POSIX class matches [A-Za-z]. The \p{L} matches any Unicode base letter, \p{M} matches diacritics. When we add &&[^\p{Alpha}] we subtract these [A-Za-z] from all the Unicode letters.
The whole expression means match one or more Unicode letters other than ASCII letters.
To add a space, just add \s:
"[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+"
See IDEONE demo:
List<String> strs = Arrays.asList("w123.", "w, 12", "w#123", "dsf%&#", "Двв");
for (String str : strs)
System.out.println(!str.matches("[\\s\\p{L}\\p{M}&&[^\\p{Alpha}]]+")); // => 4 true, 1 false
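If accented Latin letters such as é should also count as Latin (the \p{Alpha} subtraction above is ASCII-only), one option is to test the Unicode script instead. The sketch below shows a regex variant and a Character.UnicodeScript variant; the class and method names are invented for this example, and whether script-based matching is what the asker wants is an assumption on my part.

```java
import java.util.regex.Pattern;

class ScriptCheck {
    // A letter that does not belong to the Latin script.
    private static final Pattern NON_LATIN_LETTER =
            Pattern.compile("[\\p{L}&&[^\\p{IsLatin}]]");

    static boolean hasNonLatinLetterRegex(String s) {
        return NON_LATIN_LETTER.matcher(s).find();
    }

    // Same idea without a regex, using the script of each code point.
    static boolean hasNonLatinLetter(String s) {
        return s.codePoints().anyMatch(cp ->
                Character.isLetter(cp)
                        && Character.UnicodeScript.of(cp) != Character.UnicodeScript.LATIN);
    }

    public static void main(String[] args) {
        for (String s : new String[] {"w123.", "w, 12", "caf\u00e9", "\u0414\u0432\u0432"}) {
            System.out.println(hasNonLatinLetterRegex(s) + " " + hasNonLatinLetter(s));
        }
    }
}
```

The method asked for in the question would then return the negation of either check.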
Just add a space to your matcher:
private boolean isLatin(String val) {
return val.matches("[ \\w]+");
}
Use this:
public static boolean isNoAlphaNumeric(String s) {
return s.matches("[\\p{L}\\s]+");
}
\p{L} means any Unicode letter.
\s space character

Checking if all characters of a string is uppercase except special symbols

It's my first time using Java's Pattern class; I want to check whether a string is in uppercase. Special symbols like "." and "," should be treated as uppercase. Here are the expected results:
"test,." should return false //because it has a lowercase character
"TEST,." should return true //because all are uppercase and the special characters
"test" should return false //because it has a lowercase character
"TEST" should return true //because all are uppercase
"teST" should return false //because it has a lowercase character
I tried to use the StringUtils of apache but it doesn't work this way..
You can check:
if (str.toUpperCase().equals(str)) {..}
Just search for [a-z] then return false if it's found:
if (str.matches(".*[a-z].*")) {
// Negative match (false)
}
Alternatively, search for ^[^a-z]*$ (not sure on Java regex syntax, but basically whole string is not lowercase characters):
if (str.matches("[^a-z]*")) {
// Positive match (true)
}
You could iterate through the chars. It has the advantage that you can customize the matching as you wish.
boolean ok = true;
for(char c : yourString.toCharArray()) {
if(Character.isLetter(c) && !Character.isUpperCase(c)) {
ok = false;
break;
}
}
// ok contains your return value
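Note that the [a-z] patterns only know about ASCII, so a string such as "é" would pass as uppercase. If non-ASCII letters can occur, a Unicode-aware variant might look like the sketch below, which searches for any lowercase letter with \p{Ll}. The class and method names are made up for the example.

```java
import java.util.regex.Pattern;

class UpperCheck {
    // Any Unicode lowercase letter.
    private static final Pattern LOWER = Pattern.compile("\\p{Ll}");

    // True when the string contains no lowercase letter at all.
    static boolean isAllUpper(String s) {
        return !LOWER.matcher(s).find();
    }

    // ASCII-only check from the answers above, kept for comparison.
    static boolean isAllUpperAscii(String s) {
        return s.matches("[^a-z]*");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"test,.", "TEST,.", "test", "TEST", "teST", "\u00e9"}) {
            System.out.println(isAllUpper(s) + " " + isAllUpperAscii(s));
        }
    }
}
```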
