How to check if string contains a certain substring like [3:0] - java

I am working on a project where I need to search for a particular string token and check whether this token contains a number range in the [3:0] format. How can I check this? I searched Stack Overflow for reference, and I could find how to search for "{my string: }" in a string, like the following:
String myStr = "this is {my string: } ok";
if (myStr.trim().contains("{my string: }")) {
//Do something.
}
But I could not find how to check whether a string contains a number using a regular expression. I tried the following, but it did not work:
String myStr = "this is [3 string: ] ok";
if (myStr.trim().contains("[\\d string: ]")) {
//Do something.
}
Please help!

For "[int:int]", use the regex \\[\\d*:\\d*\\] (replace \\d* with \\d+ if at least one digit is required on each side).

You cannot use a regex inside String#contains; use .matches() with a regex instead.
To match [3 string: ]-like patterns inside larger strings (where string is a literal word string), use a regex like (?s).*\\[\\d+\\s+string:\\s*\\].*:
String myStr = "this is [3 string: ] ok";
if (myStr.matches("(?s).*\\[\\d+\\s+string:\\s*\\].*")) {
System.out.println("FOUND");
}
The regex matches any number of any characters from the start of the string with .* (as many as possible), then a [, one or more digits, one or more whitespace characters, the literal string:, zero or more whitespace characters, a ], and finally zero or more of any characters up to the end of the string.
The (?s) internal modifier makes the dot match newline characters, too.
Note we need .* on both sides because .matches() requires a full string match.
To match a [3:3]-like pattern inside larger strings, use:
"(?s).*\\[\\d+\\s*:\\s*\\d+\\].*"
Remove \\s* if whitespace around : is not allowed.
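As an alternative to wrapping the pattern in .* for .matches(), Matcher.find() searches for the pattern anywhere in the input. The following is a minimal sketch of that approach, assuming whitespace around the colon is allowed. The class and method names (BracketRangeCheck, containsBracketRange) are illustrative and do not come from the posts above.

```java
import java.util.regex.Pattern;

public class BracketRangeCheck {
    // Matches "[digits:digits]" such as [3:0]; allowing whitespace around ':' is an assumption.
    private static final Pattern BRACKET_RANGE = Pattern.compile("\\[\\d+\\s*:\\s*\\d+\\]");

    // Returns true if the input contains a [n:m]-style token anywhere.
    public static boolean containsBracketRange(String input) {
        return BRACKET_RANGE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsBracketRange("signal data[3:0] ok")); // token present
        System.out.println(containsBracketRange("signal data[3] ok"));   // no colon pair
    }
}
```

Because find() does not need to consume the whole input, neither the surrounding .* nor the (?s) modifier is required in this variant.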

Related

Regex to match [any char][any digit]dot with an exception case

I have a regex [a-zA-Z0-9]\\.(.*) to match:
[any character, any digit] followed by a dot and then followed by anything. For example e1.abc, r11.xyz, etc.
This works fine. However I have a case where if string is e.abc then it should not match i.e. only if it is e. then it should not match.
How do I modify my regex to handle this specific exclusion?
You can modify your regex by adding a negative lookahead assertion before the first pre-dot character. This lookahead will ensure that this first letter is not e. Here is the pattern:
.*(?!e)[a-zA-Z0-9]\.(.*)
Sample code:
String match = "a.abc";
if (match.matches(".*(?!e)[a-zA-Z0-9]\\.(.*)")) {
System.out.println("match");
}
String noMatch = "e.abc";
if (noMatch.matches(".*(?!e)[a-zA-Z0-9]\\.(.*)")) {
System.out.println("no match"); // never printed: "e.abc" is rejected by the lookahead
}
Note that I assume that there is only one dot in your string. If not, then this answer would need to change.
Alternatively, get all the matches using your current regex, then add an if statement as follows:
String test = "e.this is test";
if (!test.startsWith("e.")) {
// Do something
}
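The exclusion can also be folded into a single pattern by placing the lookahead at the very start so that it rejects only the exact prefix "e.". This is a sketch under two assumptions: the part before the dot is one or more alphanumeric characters (so that r11.xyz is accepted), and the whole input must match. The names PrefixDotMatcher and accepts are hypothetical.

```java
import java.util.regex.Pattern;

public class PrefixDotMatcher {
    // (?!e\.) fails only when the input starts with exactly "e.".
    // The + quantifier (an assumption) lets prefixes such as "r11" through.
    private static final Pattern PREFIX_DOT = Pattern.compile("(?!e\\.)[a-zA-Z0-9]+\\.(.*)");

    // Returns true if the whole input has the form <alphanumerics>.<anything>, except "e.<anything>".
    public static boolean accepts(String input) {
        return PREFIX_DOT.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts("e1.abc"));
        System.out.println(accepts("e.abc"));
    }
}
```

Unlike the .*(?!e) variant, this form still accepts inputs such as be.abc, where the letter e merely ends a longer prefix.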

What is the Regex for decimal numbers in Java?

I am not quite sure of what is the correct regex for the period in Java. Here are some of my attempts. Sadly, they all meant any character.
String regex = "[0-9]*[.]?[0-9]*";
String regex = "[0-9]*['.']?[0-9]*";
String regex = "[0-9]*["."]?[0-9]*";
String regex = "[0-9]*[\.]?[0-9]*";
String regex = "[0-9]*[\\.]?[0-9]*";
String regex = "[0-9]*.?[0-9]*";
String regex = "[0-9]*\.?[0-9]*";
String regex = "[0-9]*\\.?[0-9]*";
But what I want is the actual "." character itself. Anyone have an idea?
What I'm trying to do actually is to write out the regex for a non-negative real number (decimals allowed). So the possibilities are: 12.2, 3.7, 2., 0.3, .89, 19
String regex = "[0-9]*['.']?[0-9]*";
Pattern pattern = Pattern.compile(regex);
String x = "5p4";
Matcher matcher = pattern.matcher(x);
System.out.println(matcher.find());
The last line is supposed to print false but prints true anyway. I think my regex is wrong though.
Update
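To match a non-negative decimal number that contains a decimal point, you need this regex:
^\d*\.\d+|\d+\.\d*$
or in Java syntax: "^\\d*\\.\\d+|\\d+\\.\\d*$". Note that the dot is mandatory here, so a plain integer such as 19 does not match.
String regex = "^\\d*\\.\\d+|\\d+\\.\\d*$";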
String string = "123.43253";
if(string.matches(regex))
System.out.println("true");
else
System.out.println("false");
Explanation for your original regex attempts:
[0-9]*\.?[0-9]*
With Java escaping it becomes:
"[0-9]*\\.?[0-9]*";
If you need to make the dot mandatory, remove the ? mark:
[0-9]*\.[0-9]*
But this will also accept just a dot without any digits. So, if you want the validation to require digits, use + (which means one or more) instead of * (which means zero or more). In that case it becomes:
[0-9]+\.[0-9]+
If you are on Kotlin, you can use an extension function:
fun String.findDecimalDigits() =
    Pattern.compile("^[0-9]*\\.?[0-9]*").matcher(this).run { if (find()) group() else "" }
Your initial understanding was probably right, but you were thrown off because matcher.find() returns the first valid match anywhere within the string, and all of your patterns can match a zero-length string, so find() always succeeds.
I would suggest "^([0-9]+\\.?[0-9]*|\\.[0-9]+)$"
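The difference between find() and a whole-string match can be seen side by side. The sketch below reuses one of the question's patterns and the suggested anchored pattern; the class and method names (DecimalCheck, looseFind, isNonNegativeDecimal) are made up for illustration.

```java
import java.util.regex.Pattern;

public class DecimalCheck {
    // One of the question's patterns: every part is optional, so it can match an empty string.
    private static final Pattern LOOSE = Pattern.compile("[0-9]*[.]?[0-9]*");
    // The suggested pattern: digits with an optional fraction, or a fraction with a leading dot.
    private static final Pattern STRICT = Pattern.compile("^([0-9]+\\.?[0-9]*|\\.[0-9]+)$");

    // find() succeeds as soon as any substring (even an empty one) matches.
    public static boolean looseFind(String s) {
        return LOOSE.matcher(s).find();
    }

    // matches() requires the entire input to match the pattern.
    public static boolean isNonNegativeDecimal(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(looseFind("5p4"));            // true
        System.out.println(isNonNegativeDecimal("5p4")); // false
        System.out.println(isNonNegativeDecimal(".89")); // true
    }
}
```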
There are actually two ways to match a literal dot. One is backslash-escaping, as you do with \\., and the other is to enclose it in a character class (square brackets), like [.]. Most special characters, including the dot, become literal inside square brackets. Using \\. shows your intention more clearly than [.] if all you want is to match a literal dot. Use [] when you need to match one of several things; for example, the regex [\\d.] matches a single digit or a literal dot.
I have tested all the cases.
public static boolean isDecimal(String input) {
return Pattern.matches("^[-+]?\\d*[.]?\\d+|^[-+]?\\d+[.]?\\d*", input);
}

Extracting numbers into a string array

I have a string which is of the form
String str = "124333 is the otp of candidate number 9912111242. Please refer txn id 12323335465645 while referring blah blah.";
I need 124333, 9912111242 and 12323335465645 in a string array. I have tried this with
while (Character.isDigit(sms.charAt(i)))
I feel that running the above said method on every character is inefficient. Is there a way I can get a string array of all the numbers?
Use a regex (see Pattern and matcher):
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(<your string here>);
while (m.find()) {
//m.group() contains the digits you want
}
You can easily build an ArrayList that contains each matched group you find.
Or, as others suggested, you can split on non-digit characters (\D):
"blabla 123 blabla 345".split("\\D+")
Note that \ has to be escaped in Java, hence the need of \\.
You can use String.split():
String[] nbs = str.split("[^0-9]+");
This will split the String on any run of non-digit characters.
And this works perfectly for your input.
String str = "124333 is the otp of candidate number 9912111242. Please refer txn id 12323335465645 while referring blah blah.";
System.out.println(Arrays.toString(str.split("\\D+")));
Output:
[124333, 9912111242, 12323335465645]
\\D+ Matches one or more non-digit characters. Splitting the input according to one or more non-digit characters will give you the desired output.
Java 8 style:
long[] numbers = Pattern.compile("\\D+")
.splitAsStream(str)
.mapToLong(Long::parseLong)
.toArray();
If you only need a String array, then you can just use String.split as the other answers suggest.
Alternatively, you can try this:
String str = "124333 is the otp of candidate number 9912111242. Please refer txn id 12323335465645 while referring blah blah.";
str = str.replaceAll("\\D+", ",");
System.out.println(Arrays.asList(str.split(",")));
\\D+ matches one or more non digits
Output
[124333, 9912111242, 12323335465645]
The first thing that came to my mind was to filter and split, then I realized that it can be done via
String[] result =str.split("\\D+");
\D matches any non-digit character and + says that one or more of these are needed. The leading \ escapes the other \ because, in a Java string literal, "\D" would be parsed as an escape sequence, which is invalid.
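The Pattern/Matcher approach mentioned earlier in this thread can be completed into a small helper that returns a String array. One reason to prefer it over split("\\D+") is that split returns a leading empty string when the input starts with a non-digit character. The names NumberExtractor and extractNumbers are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NumberExtractor {
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Collects every run of digits in the input, in order of appearance.
    public static String[] extractNumbers(String input) {
        List<String> numbers = new ArrayList<>();
        Matcher m = DIGITS.matcher(input);
        while (m.find()) {
            numbers.add(m.group());
        }
        return numbers.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String str = "124333 is the otp of candidate number 9912111242. "
                + "Please refer txn id 12323335465645 while referring blah blah.";
        System.out.println(Arrays.toString(extractNumbers(str)));
        // split keeps an empty first element when the text starts with a non-digit
        System.out.println(Arrays.toString("otp 12 and 34".split("\\D+")));
    }
}
```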

How to remove special characters from a string?

I want to remove special characters like:
- + ^ . : ,
from a String using Java.
That depends on what you define as special characters, but try replaceAll(...):
String result = yourString.replaceAll("[-+.^:,]","");
Note that the ^ character must not be the first one in the list, since you'd then either have to escape it or it would mean "any but these characters".
Another note: the - character needs to be the first or last one on the list, otherwise you'd have to escape it or it would define a range (e.g. :-, would mean "all characters in the range : to ,").
So, in order to keep consistency and not depend on character positioning, you might want to escape all those characters that have a special meaning in regular expressions (the following list is not complete, so be aware of other characters like (, {, $ etc.):
String result = yourString.replaceAll("[\\-\\+\\.\\^:,]","");
If you want to get rid of all punctuation and symbols, try this regex: [\p{P}\p{S}] (keep in mind that in Java strings you'd have to escape backslashes: "[\\p{P}\\p{S}]").
A third way could be something like this, if you can exactly define what should be left in your string:
String result = yourString.replaceAll("[^\\w\\s]","");
This means: replace everything that is not a word character (a-z in any case, 0-9 or _) or whitespace.
Edit: please note that there are a couple of other patterns that might prove helpful. However, I can't explain them all, so have a look at the reference section of regular-expressions.info.
Here's a less restrictive alternative to the "define allowed characters" approach, as suggested by Ray:
String result = yourString.replaceAll("[^\\p{L}\\p{Z}]","");
The regex matches everything that is not a letter in any language and not a separator (whitespace, linebreak etc.). Note that you can't use [\P{L}\P{Z}] (upper case P means not having that property), since that would mean "everything that is not a letter or not whitespace", which almost matches everything, since letters are not whitespace and vice versa.
Additional information on Unicode
Some unicode characters seem to cause problems due to different possible ways to encode them (as a single code point or a combination of code points). Please refer to regular-expressions.info for more information.
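To make the difference between the two "keep only allowed characters" patterns concrete, here is a small sketch that applies both to a string containing accented letters. It relies on \w being ASCII-only by default in Java; the class and method names are hypothetical.

```java
public class LetterFilter {
    // Keeps ASCII word characters and whitespace; accented letters are removed.
    public static String keepAsciiWordChars(String s) {
        return s.replaceAll("[^\\w\\s]", "");
    }

    // Keeps letters of any script plus Unicode separators such as the space character.
    public static String keepUnicodeLetters(String s) {
        return s.replaceAll("[^\\p{L}\\p{Z}]", "");
    }

    public static void main(String[] args) {
        String input = "caf\u00e9, na\u00efve!"; // "café, naïve!" written with escapes
        System.out.println(keepAsciiWordChars(input)); // caf nave
        System.out.println(keepUnicodeLetters(input)); // café naïve
    }
}
```

Note that the second pattern also removes digits, since they are neither letters nor separators.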
This will remove all characters except alphanumeric ones:
replaceAll("[^A-Za-z0-9]","");
As described here
http://developer.android.com/reference/java/util/regex/Pattern.html
Patterns are compiled regular expressions. In many cases, convenience methods such as String.matches, String.replaceAll and String.split will be preferable, but if you need to do a lot of work with the same regular expression, it may be more efficient to compile it once and reuse it. The Pattern class and its companion, Matcher, also offer more functionality than the small amount exposed by String.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class RegularExpressionTest {
    public static void main(String[] args) {
        System.out.println("String is = "+getOnlyStrings("!&(*^*(^(+one(&(^()(*)(*&^%$##!#$%^&*()("));
        System.out.println("Number is = "+getOnlyDigits("&(*^*(^(+91-&*9hi-639-0097(&(^("));
    }
    // Removes every character that is not a digit
    public static String getOnlyDigits(String s) {
        Pattern pattern = Pattern.compile("[^0-9]");
        Matcher matcher = pattern.matcher(s);
        return matcher.replaceAll("");
    }
    // Removes every character that is not an ASCII letter or a space
    public static String getOnlyStrings(String s) {
        Pattern pattern = Pattern.compile("[^a-z A-Z]");
        Matcher matcher = pattern.matcher(s);
        return matcher.replaceAll("");
    }
}
Result
String is = one
Number is = 9196390097
Try replaceAll() method of the String class.
BTW here is the method, return type and parameters.
public String replaceAll(String regex,
String replacement)
Example:
String str = "Hello +-^ my + - friends ^ ^^-- ^^^ +!";
str = str.replaceAll("[-+^]*", "");
It should remove all the {'^', '+', '-'} chars that you wanted to remove!
To remove special characters:
String t2 = "!##$%^&*()-';,./?><+abdd";
t2 = t2.replaceAll("\\W+","");
The output will be: abdd
This works perfectly.
Use the String.replaceAll() method in Java.
replaceAll should be good enough for your problem.
You can remove a single character as follows:
String str="+919595354336";
String result = str.replaceAll("\\+","");
System.out.println(result);
OUTPUT:
919595354336
If you just want to do a literal replace in java, use Pattern.quote(string) to escape any string to a literal.
myString.replaceAll(Pattern.quote(matchingStr), replacementStr)
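Putting the pieces of this thread together, the sketch below removes exactly the characters listed in the question using a precompiled character class, and shows Pattern.quote for removing an arbitrary literal. The class and method names (SpecialCharStripper, strip, removeLiteral) are assumptions for illustration.

```java
import java.util.regex.Pattern;

public class SpecialCharStripper {
    // '-' is placed first so it is literal; '^' is not first, so it is literal too.
    private static final Pattern SPECIALS = Pattern.compile("[-+^.:,]");

    // Removes the characters - + ^ . : , from the input.
    public static String strip(String s) {
        return SPECIALS.matcher(s).replaceAll("");
    }

    // Removes every occurrence of a literal string, whatever regex metacharacters it contains.
    public static String removeLiteral(String s, String literal) {
        return s.replaceAll(Pattern.quote(literal), "");
    }

    public static void main(String[] args) {
        System.out.println(strip("a-b+c^d.e:f,g"));            // abcdefg
        System.out.println(removeLiteral("+919595354336", "+")); // 919595354336
    }
}
```

Compiling the pattern once into a constant follows the Pattern documentation quoted earlier in this thread, which suggests reusing a compiled pattern when the same expression is applied many times.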

Remove doubled letter from a string using java

I need to remove a doubled letter from a string using regex operations in java.
Eg: PRINCEE -> PRINCE
APPLE -> APLE
Simple Solution (remove duplicate characters)
Like this:
final String str = "APPLEE";
String replaced = str.replaceAll("(.)\\1", "$1");
System.out.println(replaced);
Output:
APLE
Not just any Characters, Letters only
As @Jim comments correctly, the above matches any double character, not just letters. Here are a few variations that just match letters:
// the basics, ASCII letters. these two are equivalent:
str.replaceAll("([A-Za-z])\\1", "$1");
str.replaceAll("(\\p{Alpha})\\1", "$1");
// Unicode Letters
str.replaceAll("(\\p{L})\\1", "$1");
// anything where Character.isLetter(ch) returns true
str.replaceAll("(\\p{javaLetter})\\1", "$1");
References:
Character.isLetter(ch) (javadocs)
Any method in Character of the form Character.isXyz(char) enables a pattern named \p{javaXyz} (mind the capitalization). This mechanism is described in the Pattern javadocs.
Unicode blocks and categories can also be matched with the \p and \P constructs as in Perl. \p{prop} matches if the input has the property prop, while \P{prop} does not match if the input has that property. This mechanism is also described in the Pattern javadocs.
String s = "...";
String replaced = s.replaceAll( "([A-Z])\\1", "$1" );
If you want to replace just duplicates ("AA" -> "A", "AAA" -> "AA"), use
public String undup(String str) {
return str.replaceAll("(\\w)\\1", "$1");
}
To collapse triplicates and longer runs as well, use: str.replaceAll("(\\w)\\1+", "$1");
To remove only a single duplicate in a long run (AAAA -> AAA, AAA -> AA), use: str.replaceAll("(\\w)(\\1+)", "$2");
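The three variants behave differently on runs longer than two, which is easiest to see side by side. The following sketch restricts matching to letters with \p{L}; the class and method names are illustrative rather than taken from the answers.

```java
public class DoubledLetters {
    // Replaces each adjacent pair of identical letters with one letter.
    public static String collapsePairs(String s) {
        return s.replaceAll("(\\p{L})\\1", "$1");
    }

    // Collapses any run of identical letters to a single letter.
    public static String collapseRuns(String s) {
        return s.replaceAll("(\\p{L})\\1+", "$1");
    }

    // Drops exactly one letter from each run of two or more identical letters.
    public static String dropOne(String s) {
        return s.replaceAll("(\\p{L})(\\1+)", "$2");
    }

    public static void main(String[] args) {
        System.out.println(collapsePairs("PRINCEE")); // PRINCE
        System.out.println(collapsePairs("APPLE"));   // APLE
        System.out.println(collapseRuns("AAAA"));     // A
        System.out.println(dropOne("AAAA"));          // AAA
    }
}
```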
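This can be done simply by iterating over the String instead of resorting to regexes. Note that this collapses every run of repeated characters to a single character.
static String removeRepeats(String text) { // method name is illustrative
    if (text.isEmpty()) return "";
    StringBuilder ret = new StringBuilder(text.length());
    ret.append(text.charAt(0));
    for (int i = 1; i < text.length(); i++) {
        // Append a character only if it differs from the previous one
        if (text.charAt(i) != text.charAt(i - 1))
            ret.append(text.charAt(i));
    }
    return ret.toString();
}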
