How to find a string including non word character? - java

How to find a string including non word character?
Example input: l l'oreal s.a. l l' ab l
search String: l
output: XX l'oreal s.a. XX l' ab l
Expected: XX l'oreal s.a. XX l' ab XX
I was trying to find the search string in the input string using the below regex.
String inputStr = "l l'oreal s.a. l l' ab l";
System.out.println(inputStr);
String searchStr = "l";
Pattern pattern = Pattern.compile("(\\b"+ searchStr +"\\b)(?=[\\s])");
Matcher matcher = pattern.matcher(inputStr);
if ( matcher.find()){
String rep = matcher.replaceAll("XX");
System.out.println("REP:" + rep);
}else{
System.out.println("no match...");
}
The regex only matches the search string when it is followed by whitespace (\s), so in the example above it misses the last occurrence, which sits at the end of the input and is not followed by a space.
The main goal is to match the search string only when it stands alone, so that text with adjacent non-word characters does not match, for example:
private-limited ( when searching for private should return false)
Hello! ( false when searched for Hello)
A couple of patterns that I tried, but which did not work:
pattern = Pattern.compile("(?<![\\w+])" + searchStr + "(?![\\W+])", Pattern.CASE_INSENSITIVE);
pattern = Pattern.compile("(?<=[\\s])(\\b"+ searchStr +"\\b)(?=\\s)");
In the example above, if I set searchStr = "l'", it doesn't match anything.
Is my approach correct?
What am I missing?
Thanks.

An easy solution, if you can modify the string to search, would be to concatenate it with a space before and afterwards, then simply search for " l "

Perhaps match a word boundary followed by whitespace or the end of the input.
In your case, for l:
\b(l)(\s+|$)
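The "whitespace or end of input" idea can also be written with lookarounds, so that only the search term itself is consumed and replaced. The following is a minimal sketch, assuming tokens are delimited by whitespace; the class `TokenReplace` and the method `replaceToken` are hypothetical names used for illustration. `Pattern.quote` keeps search strings such as `l'` or `s.a.` from being interpreted as regex syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TokenReplace {
    // Hypothetical helper: replaces searchStr only where it stands alone,
    // i.e. surrounded by whitespace or by the start/end of the input.
    static String replaceToken(String input, String searchStr, String replacement) {
        // (?<!\S) : not preceded by a non-whitespace character
        // (?!\S)  : not followed by a non-whitespace character
        Pattern p = Pattern.compile("(?<!\\S)" + Pattern.quote(searchStr) + "(?!\\S)");
        return p.matcher(input).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        String input = "l l'oreal s.a. l l' ab l";
        System.out.println(replaceToken(input, "l", "XX"));   // XX l'oreal s.a. XX l' ab XX
        System.out.println(replaceToken(input, "l'", "XX"));  // l l'oreal s.a. l XX ab l
        System.out.println(replaceToken("private-limited", "private", "XX")); // unchanged
    }
}
```

Unlike `\b`, these lookarounds do not depend on the search string starting or ending with a word character, which is why a term ending in an apostrophe can still match.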

Related

Using Regular Expression in Java to extract information from a String

I have one input String like this:
"I am Duc/N Ta/N Van/N"
The "/N" suffix marks a token as part of a person's name.
The expected output is:
Name: Duc Ta Van
How can I do it by using regular expression?
You can use Pattern and Matcher like this :
String input = "I am Duc/N Ta/N Van/N";
Pattern pattern = Pattern.compile("([^\\s]+)/N");
Matcher matcher = pattern.matcher(input);
String result = "";
while (matcher.find()) {
result+= matcher.group(1) + " ";
}
System.out.println("Name: " + result.trim());
Output
Name: Duc Ta Van
Another Solution using Java 9+
From Java9+ you can use Matcher::results like this :
String input = "I am Duc/N Ta/N Van/N";
String regex = "([^\\s]+)/N";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
String result = matcher.results().map(s -> s.group(1)).collect(Collectors.joining(" "));
System.out.println("Name: " + result); // Name: Duc Ta Van
Here is the regex to capture every "name" that is followed by /N:
(\w+)\/N
Validate with Regex101
Now you just need to loop over every match in the string and concatenate them to get the result:
String pattern = "(\\w+)\\/N";
String test = "I am Duc/N Ta/N Van/N";
Matcher m = Pattern.compile(pattern).matcher(test);
StringBuilder sbNames = new StringBuilder();
while(m.find()){
sbNames.append(m.group(1)).append(" ");
}
System.out.println(sbNames.toString());
Duc Ta Van
That covers the hardest part; I'll let you adapt it to your needs.
Note:
In Java it is not required to escape a forward slash. To keep the same regex throughout the answer I use "(\\w+)\\/N", but "(\\w+)/N" works as well.
I've used "[/N]+" as the regular expression.
Regex101
[] = Matches characters inside the set
\/ = Matches the character / literally (case sensitive)
+ = Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

How would I search a certain word after a character in java?

So I am doing some coursework, and I want to search a string for the word after a hashtag, "#".
How would I go about this?
Say, for example, the string was 'Hello World #me'. How would I return the word "me"?
Kind regards
Use a regex and a Matcher to find the hashtags iteratively:
String input = "Hello #World! #Me";
Pattern pattern = Pattern.compile("#(\\S+)");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
Output :
World!
Me
Split the string on that character, say:
String[] splitString = inputString.split("#");
System.out.println(splitString[1]);
So for the input string
Hello World #me
the output is
me
Use this
example.substring(example.indexOf("#") + 1);
Using regex:
// Matches a string of word characters preceded by a '#'
Pattern p = Pattern.compile("(?<=#)\\w*");
Matcher m = p.matcher("Hello World #me");
String hashtag = "";
if(m.find())
{
hashtag = m.group(); //me
}
So then John, let me guess. You're a computer Science student at the university of Warwick. Here you go,
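String s = "hello #yolo blaaa";
if (s.contains("#")) {
    int hash = s.indexOf("#");       // position of the '#'
    s = s.substring(hash);           // "#yolo blaaa"
    int space = s.indexOf(' ');      // end of the hashtag word, or -1 if it runs to the end
    if (space != -1) {
        s = s.substring(0, space);   // "#yolo"
    }
}
Use s.indexOf("#") + 1 if you don't want to include the #.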
A simple way of doing it would be to use indexOf to find the "#", and then the overloaded indexOf that takes a start index to find the next space, passing both to substring.
E.g.:
String myString = originalString.substring(originalString.indexOf("#"), originalString.indexOf(" ", originalString.indexOf("#")));
Please note that this throws an out-of-bounds exception if the characters are not found, including when no space follows the "#". Read the Javadoc for indexOf and substring to understand in detail what this is doing.
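To avoid the out-of-bounds cases of the index-based approaches, the regex route can be wrapped in a small helper that returns every hashtag word and simply yields an empty list when there is none. This is only a sketch: the class `Hashtags` and the method `extract` are hypothetical names, and it assumes a hashtag word consists of word characters (`\w`) only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Hashtags {
    // '#' followed by one or more word characters; group 1 is the word without the '#'
    private static final Pattern HASHTAG = Pattern.compile("#(\\w+)");

    // Hypothetical helper: collects the word after each '#' in the text.
    static List<String> extract(String text) {
        List<String> tags = new ArrayList<>();
        Matcher m = HASHTAG.matcher(text);
        while (m.find()) {
            tags.add(m.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(extract("Hello World #me"));    // [me]
        System.out.println(extract("Hello #World! #Me"));  // [World, Me]
        System.out.println(extract("no tags here"));       // []
    }
}
```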

How to remove spaces in between the String

I have below String
string = "Book Your Domain And Get\n \n\n \n \n \n Online Today."
string = str.replace("\\s","").trim();
which returns
str = "Book Your Domain And Get Online Today."
But what I want is
str = "Book Your Domain And Get Online Today."
I have tried many regular expressions and also googled, but had no luck, and I didn't find a related question. Please help. Many thanks in advance.
Use replaceAll (replace treats its first argument as literal text, not as a regex) with \\s+ instead of \\s, as there are two or more consecutive whitespace characters in your input.
string = str.replaceAll("\\s+"," ");
You can use replaceAll which takes a regex as parameter. And it seems like you want to replace multiple spaces with a single space. You can do it like this:
string = str.replaceAll("\\s{2,}"," ");
It will replace 2 or more consecutive whitespaces with a single whitespace.
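The answers above differ in small ways that are easy to miss, so here is a short sketch comparing them on the string from the question. The class `CollapseWhitespace` and the method `collapse` are hypothetical names chosen for illustration.

```java
class CollapseWhitespace {
    // Hypothetical helper: trims the ends and turns every run of whitespace
    // (spaces, tabs, newlines) into a single space.
    static String collapse(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String str = "Book Your Domain And Get\n \n\n \n \n \n Online Today.";

        // String.replace is literal: it looks for a backslash followed by 's',
        // which does not occur here, so the string comes back unchanged.
        System.out.println(str.replace("\\s", "").equals(str)); // true

        // replaceAll interprets its first argument as a regex.
        System.out.println(collapse(str)); // Book Your Domain And Get Online Today.

        // \\s{2,} only touches runs of two or more, so a lone newline survives.
        System.out.println("a\nb".replaceAll("\\s{2,}", " ").equals("a\nb")); // true
    }
}
```

Whether `\\s+` or `\\s{2,}` is the better fit depends on whether a single newline or tab between words should also become a space.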
First get rid of multiple spaces:
String after = before.trim().replaceAll(" +", " ");
If you just want to remove the whitespace between two words or characters, and not at the ends of the string, here is the regex that I have used:
String s = " N OR 15 2 ";
// One alphanumeric character, a run of whitespace, then another alphanumeric character
Pattern pattern = Pattern.compile("[a-zA-Z0-9]\\s+[a-zA-Z0-9]");
Matcher m = pattern.matcher(s);
while (m.find()) {
    // Strip the whitespace from the matched fragment, e.g. "N O" becomes "NO"
    String matched = m.group();
    s = s.replace(matched, matched.replaceAll("\\s+", ""));
    // Rescan the updated string from the start
    m = pattern.matcher(s);
}
System.out.println(s);
It only removes the whitespace between characters or words, not the whitespace at the ends, and the output is
NOR152
Eg. to remove space between words in a string:
String example = "Interactive Resource";
System.out.println("Without space string: "+ example.replaceAll("\\s",""));
Output:
Without space string: InteractiveResource
If you want to print a string without spaces in Python (this does not apply to Java), just add the argument sep='' to the print function, since this argument's default value is " ".
// Use this to remove all whitespace from a given string, for example a = " 1 2 3 4"
// output: 1234
a = a.replaceAll("\\s", "");
String s2=" 1 2 3 4 5 ";
String after=s2.replace(" ", "");
This works for me:
String string_a = "AAAA BBB";
String actualTooltip_3 = string_a.replaceAll("\\s{2,}", " ");
System.out.println(actualTooltip_3);
Output: AAAA BBB

Regex for floor in address

I have this regex:
String regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
I want to test it against:
String lineString = "8th floor, Prince's Building, 12 Chater Road";
so I do:
boolean isMatching = lineString.matches(regexPattern);
and it returns false. Why?
I thought it had something to do with whitespaces in Java, so I removed the whitespace in the regexPattern variable so it reads
regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)floor";
and matched it with a string without white space:
String lineString = "8thfloor,Prince'sBuilding,12ChaterRoad"
it still returns false. Why? Any help very much appreciated.
String.matches() only returns true if the entire string matches the pattern.
Try adding .* to the beginning and end of your regex.
Example:
String regex = ".*[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor.*";
This is not the best approach, however...
Here's a better alternative:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
boolean isMatch = p.matcher(input).find();
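The difference between matching the whole string and finding a match somewhere inside it can be seen side by side in the sketch below. The class `MatchesVsFind` and its two methods are hypothetical names; the regex is the one from the question.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    private static final Pattern FLOOR =
            Pattern.compile("[0-9A-Za-z]+(st|nd|rd|th) floor");

    // True only if the ENTIRE string fits the pattern (same rule as String.matches).
    static boolean wholeStringMatches(String s) {
        return FLOOR.matcher(s).matches();
    }

    // True if the pattern occurs ANYWHERE in the string.
    static boolean containsMatch(String s) {
        return FLOOR.matcher(s).find();
    }

    public static void main(String[] args) {
        String line = "8th floor, Prince's Building, 12 Chater Road";
        System.out.println(wholeStringMatches(line));        // false
        System.out.println(containsMatch(line));             // true
        System.out.println(wholeStringMatches("8th floor")); // true
    }
}
```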
If you want to extract the floor number, do this:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "([0-9A-Za-z]+)(st|nd|rd|th)" + " " + "floor"; // quantifier inside the group, so group(1) holds the whole number
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
if (m.find()) {
String num = m.group(1);
String suffix = m.group(2);
System.out.println("Welcome to the " + num + suffix + " floor!");
// prints 'Welcome to the 8th floor!'
}
Check out the Pattern API for a boatload of info about Java regular expressions.
Edited, per comments ...
The [0-9A-Za-z]+ part is greedily matching until the end of th.
Try [0-9]+ instead.
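Restricting the prefix to digits, as suggested above, might look like the following sketch. The class `FloorNumber` and the method `floorNumber` are hypothetical, and the pattern assumes the floor is always written as digits plus an ordinal suffix followed by a single space and the lowercase word "floor".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FloorNumber {
    // Group 1: the digits; group 2: the ordinal suffix.
    private static final Pattern FLOOR =
            Pattern.compile("([0-9]+)(st|nd|rd|th) floor");

    // Hypothetical helper: returns the floor digits, or null if no floor is mentioned.
    static String floorNumber(String address) {
        Matcher m = FLOOR.matcher(address);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(floorNumber("8th floor, Prince's Building, 12 Chater Road")); // 8
        System.out.println(floorNumber("Unit 3, 12th floor, Some Tower"));               // 12
        System.out.println(floorNumber("12 Chater Road"));                               // null
    }
}
```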

Why does Matcher fail for Strings obtained or provided at runtime in Java?

Hi, I was recently writing some code where I had to extract the last group of three digits, so I used a Pattern to extract the data. But I fail to understand the result. Can anyone help me understand it?
String str ="EGLA 0F 020";
String def = "ALT 1F 001 TO ALT 1F 029";
String arr[] = def.split("TO");
String str2 = arr[0];
System.out.println("str2:"+str2);
Pattern pt = Pattern.compile("[0-9][0-9][0-9]$");
Matcher m1 = pt.matcher(str);
Matcher m2 = pt.matcher(str2);
boolean flag = m1.find();
boolean flag2 = m2.find();
if(flag)
System.out.println("first match:::"+m1.group(0));
else
System.out.println("Not found");
if(flag2)
System.out.println("first match:::"+m2.group(0));
else
System.out.println("Not found");
The output produced by the above code is as follows:
str2:ALT 1F 001
first match:::020
Not found
Please reply, I am stuck here.
It's because when you split you have a trailing space.
String str = "EGLA 0F 020";
String str2 = "ALT 1F 001 ";
// ^ trailing space
You could fix it a number of ways. For example:
by splitting on " TO "
trimming the result
allowing trailing spaces in your regular expression.
For example, this change would work:
String arr[] = def.split(" TO ");
Notice that your split only consumes the letters "TO", which means str2 is "ALT 1F 001 " with a trailing space.
To resolve this you can split on "\\s*TO\\s*" instead of "TO", so that any spaces surrounding the word TO are removed too. Another solution would be to replace your pattern "[0-9][0-9][0-9]$" with "[0-9][0-9][0-9]", without the final $, so that trailing spaces no longer prevent a match (note that find() then returns the first group of three digits rather than the last).
Try this pattern:
Pattern pattern = Pattern.compile("[0-9][0-9][0-9]\\s*$");
or
Pattern pattern = Pattern.compile("[0-9]{3}\\s*$");
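Putting the two fixes together, a small sketch might split on "TO" with its surrounding whitespace and also tolerate trailing whitespace in the pattern. The class `LastDigits` and the method `lastThreeDigits` are hypothetical names; the input strings are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastDigits {
    // Three digits, optionally followed by whitespace, at the end of the input.
    private static final Pattern LAST_THREE = Pattern.compile("([0-9]{3})\\s*$");

    // Hypothetical helper: returns the trailing three-digit group, or null if there is none.
    static String lastThreeDigits(String s) {
        Matcher m = LAST_THREE.matcher(s);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String def = "ALT 1F 001 TO ALT 1F 029";
        // Splitting on the word plus its surrounding whitespace leaves no stray spaces.
        String[] parts = def.split("\\s*TO\\s*");
        System.out.println(lastThreeDigits(parts[0]));      // 001
        System.out.println(lastThreeDigits(parts[1]));      // 029
        // The pattern alone also copes with a trailing space.
        System.out.println(lastThreeDigits("ALT 1F 001 ")); // 001
    }
}
```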
