regex replace all ignore case - java

How do I ignore case in the below example?
outText = inText.replaceAll(word, word.replaceAll(" ", "~"));
Example:
Input:
inText = "Retail banking Wikipedia, the free encyclopedia Retail banking "
+ "From Wikipedia. retail banking industry."
word = "retail banking"
Output
outText = "Retail~banking Wikipedia, the free encyclopedia Retail~banking " +
"From Wikipedia. retail~banking industry."

To do case-insensitive search and replace, you can change
outText = inText.replaceAll(word, word.replaceAll(" ", "~"));
into
outText = inText.replaceAll("(?i)" + word, word.replaceAll(" ", "~"));
Avoid ruining the original capitalization:
The approach above, however, replaces every match with the word exactly as you typed it, so the original capitalization is lost. A better approach keeps the matched text and only swaps the spaces:
String inText="Sony Ericsson is a leading company in mobile. " +
"The company sony ericsson was found in oct 2001";
String word = "sony ericsson";
Pattern p = Pattern.compile(word, Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher(inText);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    String replacement = m.group().replace(' ', '~');
    m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
}
m.appendTail(sb);
String outText = sb.toString();
System.out.println(outText);
Output:
Sony~Ericsson is a leading company in mobile. The company sony~ericsson was found in oct 2001
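The loop above can be wrapped in a reusable method. The following is a minimal sketch rather than part of the original answer: the class name CaseInsensitiveReplace and the method name joinWords are hypothetical, and it assumes the word should be treated as literal text, which is why it passes the word through Pattern.quote.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CaseInsensitiveReplace {

    // Replaces the spaces inside every case-insensitive occurrence of `word`
    // with '~', keeping the capitalization found in `text`.
    static String joinWords(String text, String word) {
        // Assumption: `word` is literal text, so regex metacharacters are quoted.
        Pattern p = Pattern.compile(Pattern.quote(word), Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(text);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String replacement = m.group().replace(' ', '~');
            // quoteReplacement keeps '$' and '\' in the match from being interpreted.
            m.appendReplacement(sb, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinWords("Retail banking and retail banking (UK)", "retail banking"));
    }
}
```

Without Pattern.quote, a word such as "c++ code" would be compiled as a regex and fail or match something unintended.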

You could convert both strings to lowercase before doing the search, or use the regex flag Pattern.CASE_INSENSITIVE.
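As a rough sketch of the second option, the flag can be passed when compiling the pattern. The class and method names below are hypothetical, and the sketch assumes both the search text and the replacement are meant literally, so it uses Pattern.quote and Matcher.quoteReplacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class IgnoreCaseFlagDemo {

    // Replaces every occurrence of `search` in `text` with `replacement`, ignoring case.
    static String replaceAllIgnoreCase(String text, String search, String replacement) {
        // CASE_INSENSITIVE alone covers ASCII letters; UNICODE_CASE extends it to other scripts.
        Pattern p = Pattern.compile(Pattern.quote(search),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return p.matcher(text).replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceAllIgnoreCase("Foo fOO foo", "foo", "bar")); // bar bar bar
    }
}
```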

Here is my way of doing it:
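// Requires: import java.util.Locale;
private String replaceAllIgnoreCase(final String text, final String search, final String replacement) {
    // An empty search string would otherwise make the loop below run forever.
    if (search.isEmpty() || search.equals(replacement)) return text;
    final StringBuilder buffer = new StringBuilder(text);
    // Locale.ROOT keeps the lowercasing independent of the default locale.
    final String lowerSearch = search.toLowerCase(Locale.ROOT);
    int i = 0;
    int prev = 0;
    // The buffer changes after each replacement, so it is lowercased again on every pass.
    while ((i = buffer.toString().toLowerCase(Locale.ROOT).indexOf(lowerSearch, prev)) > -1) {
        buffer.replace(i, i + search.length(), replacement);
        prev = i + replacement.length(); // continue after the inserted text
    }
    return buffer.toString();
}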
It has worked in every case I have tested. The advantage of this approach is that it uses no regex, so if you want to replace a bracket, a plus sign, or any other metacharacter, it is replaced as literal text rather than interpreted as regex syntax. Hope this helps.
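A variation on the same regex-free idea, shown here only as a sketch: String.regionMatches with ignoreCase set to true compares the text in place, so the buffer does not have to be lowercased on every iteration. The class name LiteralReplace is hypothetical, and this swaps in a different technique from the method above.

```java
final class LiteralReplace {

    // Case-insensitive, regex-free replacement of every occurrence of `search`.
    static String replaceAllIgnoreCase(String text, String search, String replacement) {
        if (search.isEmpty()) {
            return text; // nothing to look for
        }
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            // regionMatches(true, ...) compares character by character, ignoring case.
            if (text.regionMatches(true, i, search, 0, search.length())) {
                out.append(replacement);
                i += search.length();
            } else {
                out.append(text.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceAllIgnoreCase("1 + 1 (Sum) and (SUM)", "(sum)", "[total]"));
    }
}
```

Because the replacement is appended to a separate builder, replaced text is never scanned again, even when it contains the search string.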

You didn't specify a language.
Java has Pattern.CASE_INSENSITIVE
C# and VB have RegexOptions.IgnoreCase

Related

Using Regular Expression in Java to extract information from a String

I have one input String like this:
"I am Duc/N Ta/N Van/N"
The "/N" suffix marks a word as part of a person's name.
The expected output is:
Name: Duc Ta Van
How can I do it by using regular expression?

You can use Pattern and Matcher like this :
String input = "I am Duc/N Ta/N Van/N";
Pattern pattern = Pattern.compile("([^\\s]+)/N");
Matcher matcher = pattern.matcher(input);
String result = "";
while (matcher.find()) {
    result += matcher.group(1) + " ";
}
System.out.println("Name: " + result.trim());
Output
Name: Duc Ta Van
Another solution using Java 9+
From Java 9 onward, you can use Matcher::results (this also needs import java.util.stream.Collectors):
String input = "I am Duc/N Ta/N Van/N";
String regex = "([^\\s]+)/N";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
String result = matcher.results().map(s -> s.group(1)).collect(Collectors.joining(" "));
System.out.println("Name: " + result); // Name: Duc Ta Van

Here is a regex that captures every "name" followed by /N:
(\w+)\/N
Now you just need to loop over every match in the string and concatenate the captured groups to get the result:
String pattern = "(\\w+)\\/N";
String test = "I am Duc/N Ta/N Van/N";
Matcher m = Pattern.compile(pattern).matcher(test);
StringBuilder sbNames = new StringBuilder();
while (m.find()) {
    sbNames.append(m.group(1)).append(" ");
}
System.out.println(sbNames.toString());
Duc Ta Van
This covers the hardest part. Adapt it to your needs; for example, the result still ends with a trailing space.
Note:
In Java, a forward slash does not need to be escaped. To use the same regex throughout this answer I keep "(\\w+)\\/N", but "(\\w+)/N" works as well.

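I've used "[/N]+" as the regular expression.
[] = Matches any single character inside the set (here "/" or "N")
/ = Matches the character / literally (case sensitive)
+ = Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
Because [/N] is a character class, this pattern also matches a lone "/" or "N" anywhere in the input, not only the "/N" suffix.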
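The answer does not show how the pattern is applied, so the following sketch assumes it is used with replaceAll to strip the markers; the class and method names are hypothetical. It compares the character class with a plain literal replacement of "/N" on the question's input and on a made-up input that starts with a capital N.

```java
final class CharClassDemo {

    // Removes every run of '/' and 'N' characters, which is what "[/N]+" matches.
    static String stripWithCharClass(String input) {
        return input.replaceAll("[/N]+", "");
    }

    // Removes only the literal two-character marker "/N".
    static String stripMarker(String input) {
        return input.replace("/N", "");
    }

    public static void main(String[] args) {
        System.out.println(stripWithCharClass("I am Duc/N Ta/N Van/N")); // I am Duc Ta Van
        System.out.println(stripMarker("I am Duc/N Ta/N Van/N"));        // I am Duc Ta Van
        System.out.println(stripWithCharClass("Nam/N"));                 // am
        System.out.println(stripMarker("Nam/N"));                        // Nam
    }
}
```

Both variants agree on the question's input, but the character class also removes the capital N at the start of "Nam".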

How to split a long string in Java?

How to edit this string and split it into two?
String asd = "{RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef}";
I want to make two strings.
String reponame;
String repoID;
reponame should be CodeCommitTest
repoID should be 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef
Can someone help me get it? Thanks

Here is Java code using a regular expression in case you can't use a JSON parsing library (which is what you probably should be using):
String pattern = "^\\{RepositoryName:\\s(.*?),RepositoryId:\\s(.*?)\\}$";
String asd = "{RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef}";
String reponame = "";
String repoID = "";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(asd);
if (m.find()) {
    reponame = m.group(1);
    repoID = m.group(2);
    System.out.println("Found reponame: " + reponame + " with repoID: " + repoID);
} else {
    System.out.println("NO MATCH");
}
This code has been tested in IntelliJ and runs without error.
Output:
Found reponame: CodeCommitTest with repoID: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef

Assuming there aren't quote marks in the input, and that the repository name and ID consist of letters, numbers, and dashes, then this should work to get the repository name:
Pattern repoNamePattern = Pattern.compile("RepositoryName: *([A-Za-z0-9\\-]+)");
Matcher matcher = repoNamePattern.matcher(asd);
if (matcher.find()) {
    reponame = matcher.group(1);
}
and you can do something similar to get the ID. The above code just looks for RepositoryName:, possibly followed by spaces, followed by one or more letters, digits, or hyphen characters; then the group(1) method extracts the name, since it's the first (and only) group enclosed in () in the pattern.
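One way to cover both fields with the same idea is a small helper that takes the key as a parameter. This is only a sketch under the assumptions stated above (values made of letters, digits, and hyphens); the class name RepoFields and the method name extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RepoFields {

    // Returns the value that follows "<key>:", or null when the key is absent.
    // Assumes values consist only of letters, digits, and hyphens.
    static String extract(String input, String key) {
        Pattern p = Pattern.compile(Pattern.quote(key) + ": *([A-Za-z0-9\\-]+)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String asd = "{RepositoryName: CodeCommitTest,RepositoryId: 425f5fc5-18d8-4ae5-b1a8-55eb9cf72bef}";
        System.out.println(extract(asd, "RepositoryName"));
        System.out.println(extract(asd, "RepositoryId"));
    }
}
```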

Regex for floor in address

I have this regex:
String regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
I want to test it against:
String lineString = "8th floor, Prince's Building, 12 Chater Road";
so I do:
boolean isMatching = lineString.matches(regexPattern);
and it returns false. Why?
I thought it had something to do with whitespace in Java, so I removed the whitespace from the regexPattern variable so that it reads
regexPattern = "[0-9A-Za-z]+(st|nd|rd|th)floor";
and matched it against a string without whitespace:
String lineString = "8thfloor,Prince'sBuilding,12ChaterRoad";
It still returns false. Why? Any help is very much appreciated.

String.matches() only returns true if the entire string matches the pattern.
Try adding .* to the beginning and end of your regex.
Example:
String regex = ".*[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor.*";
This is not the best approach, however...
Here's a better alternative:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "[0-9A-Za-z]+(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
boolean isMatch = p.matcher(input).find();
If you want to extract the floor number, do this:
String input = "8th floor, Prince's Building, 12 Chater Road";
String regex = "([0-9A-Za-z]+)(st|nd|rd|th)" + " " + "floor";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
if (m.find()) {
    String num = m.group(1);
    String suffix = m.group(2);
    System.out.println("Welcome to the " + num + suffix + " floor!");
    // prints 'Welcome to the 8th floor!'
}
Check out the Pattern API for a boatload of info about Java regular expressions.

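Edited, per comments ...
The [0-9A-Za-z]+ part first matches greedily through the end of th, so the engine has to backtrack before (st|nd|rd|th) can match.
Try [0-9]+ instead, which also restricts the prefix to digits.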
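Putting the two points together (whole-string matching versus searching, and a digits-only prefix), a minimal sketch could look like this. The class name FloorDemo, the method floorNumber, and the second sample address are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FloorDemo {

    private static final Pattern FLOOR = Pattern.compile("([0-9]+)(st|nd|rd|th) floor");

    // Returns the floor number found anywhere in the line, or -1 when there is none.
    static int floorNumber(String line) {
        Matcher m = FLOOR.matcher(line);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    public static void main(String[] args) {
        String line = "8th floor, Prince's Building, 12 Chater Road";
        // matches() must consume the whole string, so this prints false.
        System.out.println(line.matches("[0-9]+(st|nd|rd|th) floor"));
        System.out.println(floorNumber(line));                  // 8
        System.out.println(floorNumber("Suite 5, 12th floor")); // 12
    }
}
```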

Regex not matching words delimited by whitespace

I have an input string that will follow the pattern /user/<id>?name=<name>, where <id> is alphanumeric but must start with a letter, and <name> is a letter-only string that can have multiple spaces. Some examples of matches would be:
/user/ad?name=a a
/user/one111?name=one ONE oNe
/user/hello?name=world
I came up with the following regex:
String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";
All of the above examples match the regex, but it only looks at the first word in <name>. Shouldn't the sequence \s allow me to have white spaces?
The code that I made to test what it is doing is:
String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";
// Check to see that input matches pattern
if (Pattern.matches(regex, str) == true) {
    str = str.replaceFirst("/user/", "");
    str = str.replaceFirst("name=", "");
    String[] tokens = str.split("\\?");
    System.out.println("size = " + tokens.length);
    System.out.println("tokens[0] = " + tokens[0]);
    System.out.println("tokens[1] = " + tokens[1]);
} else
    System.out.println("Didn't match.");
So for example, one test might look like:
/user/myID123?name=firstName LastName
size = 2
tokens[0] = myID123
tokens[1] = firstName
whereas the desired output would be
tokens[1] = firstName LastName
How can I change my regex to do this?

Not sure what you think is the problem in your code. tokens[1] will indeed contain firstName LastName in your example.
Here's an ideone.com demo showing this.
However, have you considered using capturing groups for the id and the name?
If you write it like
String regex = "/user/(\\w+)\\?name=([a-zA-Z\\s]+)";
Matcher m = Pattern.compile(regex).matcher(input);
then, after a successful call to m.matches(), you can get hold of myID123 and firstName LastName through m.group(1) and m.group(2).
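A complete version of the capturing-group idea might look like the sketch below. The class name UserUrlParser and the method parse are hypothetical, and the id pattern is tightened to start with a letter, as the question requires; note that \w also allows underscores.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UserUrlParser {

    // Group 1: id starting with a letter; group 2: name made of letters and whitespace.
    private static final Pattern USER =
            Pattern.compile("/user/([a-zA-Z]\\w*)\\?name=([a-zA-Z\\s]+)");

    // Returns {id, name}, or null when the input does not follow the format.
    static String[] parse(String input) {
        Matcher m = USER.matcher(input);
        if (!m.matches()) { // groups are only available after a successful match
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("/user/myID123?name=firstName LastName");
        System.out.println(parts[0]); // myID123
        System.out.println(parts[1]); // firstName LastName
    }
}
```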

I don't see any fault in your code, but you can capture the groups like this:
String str = "/user/myID123?name=firstName LastName ";
String regex = "/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
if (m.find()) {
    System.out.println(m.group(1) + ", " + m.group(2));
}

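By default * is greedy, and adding a ? makes it reluctant. Note that \\w cannot match the literal ? that follows the id, so the greedy and reluctant forms capture the same text here; what extracts the two values is the pair of capturing groups together with matches():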
List<String> str = Arrays.asList("/user/ad?name=a a", "/user/one111?name=one ONE oNe", "/user/hello?name=world");
String regex = "/user/([a-zA-Z]+\\w*?)\\?name=([a-zA-Z\\s]+)";
for (String s : str) {
    Matcher matcher = Pattern.compile(regex).matcher(s);
    if (matcher.matches()) {
        System.out.println("user: " + matcher.group(1));
        System.out.println("name: " + matcher.group(2));
    }
}
Output:
user: ad
name: a a
user: one111
name: one ONE oNe
user: hello
name: world
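To see what the reluctant quantifier changes in practice, both variants can be run side by side. In this sketch (class and method names are hypothetical) the greedy and reluctant forms return the same groups for the three sample inputs, because \w cannot match the literal ? that follows the id.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuantifierComparison {

    static final String GREEDY = "/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)";
    static final String RELUCTANT = "/user/([a-zA-Z]+\\w*?)\\?name=([a-zA-Z\\s]+)";

    // Returns "id|name" for a full match, or null otherwise.
    static String parse(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) + "|" + m.group(2) : null;
    }

    public static void main(String[] args) {
        String[] inputs = {
            "/user/ad?name=a a", "/user/one111?name=one ONE oNe", "/user/hello?name=world"
        };
        for (String s : inputs) {
            // Both columns print the same id and name for each input.
            System.out.println(parse(GREEDY, s) + "  " + parse(RELUCTANT, s));
        }
    }
}
```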

Pattern for pulling strings out a string

I'm not new to Java, but have not dealt with Regex and Patterns before. What I'm looking to do is take a string like
"Class: " + data1 + "\nFrom: " + data2 + " To: " + data3 + "\nOccures: " + data4 + " In: " + data5 + " " + data6;
and pull out only data1 to data6.
I appreciate any help.

Use this regex:
// The last group is greedy: with find(), a reluctant group at the very end would capture only one character.
Pattern pattern = Pattern.compile("Class: (.+?)\nFrom: (.+?) To: (.+?)\nOccures: (.+?) In: (.+?) (.+)");
Matcher matcher = pattern.matcher(yourInputString);
if (matcher.find())
{
    String data1 = matcher.group(1);
    String data2 = matcher.group(2);
    String data3 = matcher.group(3);
    String data4 = matcher.group(4);
    String data5 = matcher.group(5);
    String data6 = matcher.group(6);
} else
{
    // String didn't match the specified format
}
Explanation:
.+? matches one or more of any character (except line terminators), as few times as possible (non-greedy).
Parentheses () create a group. Groups are numbered starting at 1, since group 0 is the entire match.
So (.+?) creates a group that captures a run of arbitrary characters.
The matcher searches for the whole pattern somewhere in the input string. Since you specified the format, we know exactly what the entire string looks like. All you have to do is copy the format and replace each piece of data you want to extract with (.+?), which gets an index because it is a group.
Afterwards, the matcher tries to find the pattern (via matcher.find()), and you ask it for the contents of groups 1 to 6.
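The explanation above can be turned into a small runnable sketch. The class name RecordParser and the sample values are hypothetical. It uses matches() and a greedy final group so that the last field runs to the end of the input; with find(), a reluctant group at the end of the pattern would stop after a single character.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RecordParser {

    // Every field is reluctant except the last, which runs to the end of the input.
    private static final Pattern RECORD = Pattern.compile(
            "Class: (.+?)\nFrom: (.+?) To: (.+?)\nOccures: (.+?) In: (.+?) (.+)");

    // Returns the six captured fields, or null when the input has a different format.
    static String[] parse(String input) {
        Matcher m = RECORD.matcher(input);
        if (!m.matches()) {
            return null;
        }
        String[] fields = new String[m.groupCount()];
        for (int i = 0; i < fields.length; i++) {
            fields[i] = m.group(i + 1); // group 0 is the whole match, so start at 1
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample data in the format from the question.
        String input = "Class: Math\nFrom: 09:00 To: 10:30\nOccures: Weekly In: Room 101";
        System.out.println(String.join(" | ", parse(input)));
    }
}
```

The format is ambiguous if the fifth value itself contains a space, since a single space is also the separator between the last two fields.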

How about using split() with ":" and then taking every second element of the resulting String[], that is, string[2i+1] for i starting from 0?
