Avoid using multiple split calls - Java

I have a string like this:
//Locaton;RowIndex;maxRows=New York, NY_10007;1;4
From this I need to get only the city name, New York.
How can I do this in a single step?
I used:
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
str = str.split("=")[1];
str = str.split(",")[0];
The code above uses several splits. How can I avoid this?
I want to get the city name with a single expression.

Try to use this regular expression "=(.*?)," like this:
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
Pattern pattern = Pattern.compile("=(.*?),");
Matcher matcher = pattern.matcher(str);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
Output:
New York
matcher.group(1) returns the text captured by the first pair of parentheses. Parentheses in a regex create numbered capturing groups, which make it easy to extract just part of a match.
Each group stores the part of the string matched by the sub-expression inside its parentheses; group 0 is the entire match.
Match (group 0): "=New York,"
Group 1: "New York"
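To make the difference between the whole match and group 1 concrete, here is a minimal sketch. The class name GroupDemo and the helper groups are illustrative names, not part of the answer above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {
    // Returns {whole match, first capturing group}, or null if nothing matched.
    static String[] groups(String input) {
        Matcher matcher = Pattern.compile("=(.*?),").matcher(input);
        if (!matcher.find()) {
            return null;
        }
        return new String[] { matcher.group(0), matcher.group(1) };
    }

    public static void main(String[] args) {
        String[] g = groups("Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ");
        System.out.println("Group 0: \"" + g[0] + "\""); // Group 0: "=New York,"
        System.out.println("Group 1: \"" + g[1] + "\""); // Group 1: "New York"
    }
}
```

Group 0 includes the delimiters because they are part of the pattern; only the parenthesized part ends up in group 1.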

Use capturing groups in a regex; they are well suited to extracting specific data from a string.
String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
String pattern = "(.*?=)(.*?)(,.*)";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(str);
if (m.find()) {
System.out.println("Group 1: " + m.group(1));
System.out.println("Group 2: " + m.group(2));
System.out.println("Group 3: " + m.group(3));
}
Here is the output
Group 1: Locaton;RowIndex;maxRows=
Group 2: New York
Group 3: , NY_10007;1;4
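If a regex feels heavier than the task needs, the same text can probably be pulled out with indexOf and substring. This is only a sketch under the assumption that the wanted value always sits between the first '=' and the next ','; SubstringDemo and between are hypothetical names.

```java
public class SubstringDemo {
    // Returns the text between the first '=' and the first ',' after it,
    // or null when either delimiter is missing.
    static String between(String input) {
        int start = input.indexOf('=');
        if (start < 0) {
            return null;
        }
        int end = input.indexOf(',', start + 1);
        if (end < 0) {
            return null;
        }
        return input.substring(start + 1, end);
    }

    public static void main(String[] args) {
        String str = "Locaton;RowIndex;maxRows=New York, NY_10007;1;4 ";
        System.out.println(between(str)); // New York
    }
}
```

Unlike chained split calls, this does not throw ArrayIndexOutOfBoundsException on malformed input; it returns null instead.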

Related

Using Regular Expression in Java to extract information from a String

I have one input String like this:
"I am Duc/N Ta/N Van/N"
The "/N" suffix marks a token as part of a person's name.
The expected output is:
Name: Duc Ta Van
How can I do it by using regular expression?
You can use Pattern and Matcher like this :
String input = "I am Duc/N Ta/N Van/N";
Pattern pattern = Pattern.compile("([^\\s]+)/N");
Matcher matcher = pattern.matcher(input);
String result = "";
while (matcher.find()) {
result+= matcher.group(1) + " ";
}
System.out.println("Name: " + result.trim());
Output
Name: Duc Ta Van
Another solution, using Java 9+
From Java 9 onward you can use Matcher::results like this:
String input = "I am Duc/N Ta/N Van/N";
String regex = "([^\\s]+)/N";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
String result = matcher.results().map(s -> s.group(1)).collect(Collectors.joining(" "));
System.out.println("Name: " + result); // Name: Duc Ta Van
Here is a regex that captures every "name" followed by /N:
(\w+)\/N
Validate with Regex101
Now you just need to loop over every match in the string and concatenate them to get the result:
String pattern = "(\\w+)\\/N";
String test = "I am Duc/N Ta/N Van/N";
Matcher m = Pattern.compile(pattern).matcher(test);
StringBuilder sbNames = new StringBuilder();
while(m.find()){
sbNames.append(m.group(1)).append(" ");
}
System.out.println(sbNames.toString());
Duc Ta Van
That covers the hardest part; I'll let you adapt it to your needs.
Note:
In Java, a forward slash does not need to be escaped. To keep the regex consistent throughout this answer I use "(\\w+)\\/N", but "(\\w+)/N" works just as well.
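The note about the optional escape can be checked with a short sketch that runs both spellings of the pattern over the same input. SlashDemo and names are illustrative names introduced here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SlashDemo {
    // Collects group 1 of every match of the given regex in the input.
    static List<String> names(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String test = "I am Duc/N Ta/N Van/N";
        System.out.println(names("(\\w+)\\/N", test)); // [Duc, Ta, Van]
        System.out.println(names("(\\w+)/N", test));   // [Duc, Ta, Van]
    }
}
```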
I've used "[/N]+" as the regular expression.
Regex101
[] = Matches characters inside the set
\/ = Matches the character / literally (case sensitive)
+ = Matches between one and unlimited times, as many times as possible, giving back as needed (greedy)

Java - regex parse string

Trying to parse out names with given samples
++++++++++++++++++SELIZABETH+COLLAZO+++++++++++++++++++
+++++++++++++++++++PALOMA+CORREA+++++++++++++++++++++++
+++++++++++++++++++NOAH+BLAKEMORE++++++++++++++++++++++
I've tried
//++(.*?)+(.*?)//++
but that's way off.
Would like to parse out the first and last name to two strings.
You can use this regex (\w+)\+(\w+) or \+{2,}(.*?)\+(.*?)\+{2,} with Pattern like this :
String str = "++++++++++++++++++SELIZABETH+COLLAZO+++++++++++++++++++\n"
+ "+++++++++++++++++++PALOMA+CORREA+++++++++++++++++++++++\n"
+ "+++++++++++++++++++NOAH+BLAKEMORE++++++++++++++++++++++";
Pattern pattern = Pattern.compile("(\\w+)\\+(\\w+)"); // or instead: "\\+{2,}(.*?)\\+(.*?)\\+{2,}"
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1) + " " + matcher.group(2));
}
Outputs
SELIZABETH COLLAZO
PALOMA CORREA
NOAH BLAKEMORE
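Since the question asks for the first and last name as two separate strings, one possible variation is to give the groups names and parse one line at a time. This is a sketch, not the answerer's code; NameLineDemo, parseLine, and the group names first and last are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NameLineDemo {
    // Named groups (Java 7+) make the two parts self-describing.
    private static final Pattern NAME = Pattern.compile("(?<first>\\w+)\\+(?<last>\\w+)");

    // Returns {firstName, lastName} for one padded line, or null if no name is found.
    static String[] parseLine(String line) {
        Matcher m = NAME.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("first"), m.group("last") };
    }

    public static void main(String[] args) {
        String[] name = parseLine("+++++++++++++++++++PALOMA+CORREA+++++++++++++++++++++++");
        String firstName = name[0];
        String lastName = name[1];
        System.out.println(firstName); // PALOMA
        System.out.println(lastName);  // CORREA
    }
}
```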

Java multiple regular expression search

I have a string something like this:
If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda}
My pattern should look for the particular words Password or tmpPwd or TEMP_PASSWORD.
How can I create a pattern for this kind of search?
I think you are looking for the values after these words. You need to set capturing groups to extract those values, e.g.
String content = "If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda} ";
Pattern p = Pattern.compile("\\{Password\\s*:\\s*([^,]+)\\s*,\\s*tmpPwd\\s*:\\s*([^,]+)\\s*,\\s*TEMP_PASSWORD:\\s*([^,]+)\\s*\\}");
Matcher m = p.matcher(content);
while (m.find()) {
System.out.println(m.group(1) + ", " + m.group(2) + ", " + m.group(3));
}
See IDEONE demo
This will output 123456, tesgjadgj, kfnda.
To just find out if there are any of the substrings, use contains method:
System.out.println(content.contains("Password") ||
content.contains("tmpPwd") ||
content.contains("TEMP_PASSWORD"));
See another demo
And if you want a regex-solution for the keywords, here it is:
String str = "If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda} ";
Pattern ptrn = Pattern.compile("Password|tmpPwd|TEMP_PASSWORD");
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println("Match found: " + m.group(0));
}
See Demo 3
In the end, this is what I used for my requirement:
private static final String censoredWords =
"(?i)PASSWORD|pwd";
The (?i) flag makes the pattern case-insensitive.
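As a rough illustration of what the case-insensitive pattern picks up, the sketch below lists every hit in the sample message. CensorDemo and findAll are hypothetical names; only the pattern string comes from the post above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CensorDemo {
    // (?i) switches on case-insensitive matching for both alternatives.
    private static final Pattern CENSORED = Pattern.compile("(?i)PASSWORD|pwd");

    // Returns every substring of the message that matches a censored word.
    static List<String> findAll(String message) {
        List<String> hits = new ArrayList<>();
        Matcher m = CENSORED.matcher(message);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String content = "If message contains sensitive info like: "
                + "{Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda}";
        System.out.println(findAll(content)); // [Password, Pwd, PASSWORD]
    }
}
```

Note that "pwd" matches inside tmpPwd, so a shorter keyword list can cover several field names.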

Regex not matching words delimited by whitespace

I have an input string that will follow the pattern /user/<id>?name=<name>, where <id> is alphanumeric but must start with a letter, and <name> is a letter-only string that can have multiple spaces. Some examples of matches would be:
/user/ad?name=a a
/user/one111?name=one ONE oNe
/user/hello?name=world
I came up with the following regex:
String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";
All of the above examples match the regex, but it only looks at the first word in <name>. Shouldn't the sequence \s allow me to have white spaces?
The code that I made to test what it is doing is:
String regex = "/user/[a-zA-Z]+\\w*\\?name=[a-zA-Z\\s]+";
// Check to see that input matches pattern
if(Pattern.matches(regex, str) == true){
str = str.replaceFirst("/user/", "");
str = str.replaceFirst("name=", "");
String[] tokens = str.split("\\?");
System.out.println("size = " + tokens.length);
System.out.println("tokens[0] = " + tokens[0]);
System.out.println("tokens[1] = " + tokens[1]);
} else
System.out.println("Didn't match.");
So for example, one test might look like:
/user/myID123?name=firstName LastName
size = 2
tokens[0] = myID123
tokens[1] = firstName
whereas the desired output would be
tokens[1] = firstName LastName
How can I change my regex to do this?
I'm not sure what you think the problem is in your code: tokens[1] will indeed contain firstName LastName in your example.
Here's an ideone.com demo showing this.
However, have you considered using capturing groups for the id and the name?
If you write it like
String regex = "/user/(\\w+)\\?name=([a-zA-Z\\s]+)";
Matcher m = Pattern.compile(regex).matcher(input);
you can get hold of myID123 and firstName LastName through m.group(1) and m.group(2) once m.matches() has returned true.
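A self-contained sketch of that idea, validating and extracting in one step with matches(), might look like this. UserPathDemo and parse are illustrative names, and the id part is written as [a-zA-Z]\\w* here on the assumption that the question's "must start with a letter" rule should be enforced.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UserPathDemo {
    private static final Pattern USER =
            Pattern.compile("/user/([a-zA-Z]\\w*)\\?name=([a-zA-Z\\s]+)");

    // Returns {id, name} when the whole input has the expected shape, else null.
    static String[] parse(String input) {
        Matcher m = USER.matcher(input);
        if (!m.matches()) { // matches() requires the entire input to fit the pattern
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] parts = parse("/user/myID123?name=firstName LastName");
        System.out.println(parts[0]); // myID123
        System.out.println(parts[1]); // firstName LastName
        System.out.println(parse("/user/1abc?name=x") == null); // true
    }
}
```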
I don't find any fault in your code but you may capture group like this:
String str = "/user/myID123?name=firstName LastName ";
String regex = "/user/([a-zA-Z]+\\w*)\\?name=([a-zA-Z\\s]+)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
if(m.find()) {
System.out.println(m.group(1) + ", " + m.group(2));
}
Quantifiers such as * are greedy by default (they match as much as they can); appending ? makes them reluctant. Here is a version with a reluctant \\w*? and capturing groups:
List<String> str = Arrays.asList("/user/ad?name=a a", "/user/one111?name=one ONE oNe", "/user/hello?name=world");
String regex = "/user/([a-zA-Z]+\\w*?)\\?name=([a-zA-Z\\s]+)";
for (String s : str) {
Matcher matcher = Pattern.compile(regex).matcher(s);
if (matcher.matches()) {
System.out.println("user: " + matcher.group(1));
System.out.println("name: " + matcher.group(2));
}
}
Output:
user: ad
name: a a
user: one111
name: one ONE oNe
user: hello
name: world
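The difference between greedy and reluctant quantifiers is easier to see on an input where the delimiter occurs more than once. The sketch below uses a made-up input string; GreedyDemo and firstGroup are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns group 1 of the first match, or null if the regex does not match.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "maxRows=New York, NY, USA";
        // Greedy: .* runs to the end, then backs off to the LAST comma.
        System.out.println(firstGroup("=(.*),", input));  // New York, NY
        // Reluctant: .*? grows one character at a time and stops at the FIRST comma.
        System.out.println(firstGroup("=(.*?),", input)); // New York
    }
}
```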

Pattern for pulling strings out a string

I'm not new to Java, but have not dealt with Regex and Patterns before. What I'm looking to do is take a string like
"Class: " + data1 + "\nFrom: " + data2 + " To: " + data3 + "\nOccures: " + data4 + " In: " + data5 + " " + data6;
and pull out only data_1 to data_n.
I appreciate any help.
Use this regex:
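// The last group is greedy (.+): a reluctant (.+?) at the very end of the
// pattern would capture only a single character when used with find().
Pattern pattern = Pattern.compile("Class: (.+?)\nFrom: (.+?) To: (.+?)\nOccures: (.+?) In: (.+?) (.+)");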
Matcher matcher = pattern.matcher(yourInputString);
if (matcher.find())
{
String data1 = matcher.group(1);
String data2 = matcher.group(2);
String data3 = matcher.group(3);
String data4 = matcher.group(4);
String data5 = matcher.group(5);
String data6 = matcher.group(6);
} else
{
// String didn't match the specified format
}
Explanation:
.+? matches one or more of any character (line terminators excluded by default), reluctantly, i.e. as few as possible.
(): parentheses create a group. Groups are indexed starting from 1, since group 0 is the entire match.
So (.+?) creates a group that captures a run of arbitrary characters.
The matcher searches for the whole pattern somewhere in the input string. Since you specified the format, we know exactly what the entire string will look like. All you have to do is copy the format and replace each piece of data you want to extract with (.+?), which gets an index because it is a group.
Afterwards, the matcher tries to find the pattern (via matcher.find()), and you ask it for the contents of groups 1 through 6.
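Put together as a runnable sketch, the approach could look like the following. The sample values (Math, 09:00, and so on) are invented for illustration, as are the names ScheduleDemo and extract; the final group is written greedy so that it captures the rest of the line.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScheduleDemo {
    // "Occures" is kept as spelled in the question's format string.
    private static final Pattern FORMAT = Pattern.compile(
            "Class: (.+?)\nFrom: (.+?) To: (.+?)\nOccures: (.+?) In: (.+?) (.+)");

    // Returns the six captured values in order, or null if the format does not match.
    static String[] extract(String input) {
        Matcher m = FORMAT.matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] data = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            data[i - 1] = m.group(i); // group numbering starts at 1
        }
        return data;
    }

    public static void main(String[] args) {
        String input = "Class: Math\nFrom: 09:00 To: 10:00\nOccures: Monday In: B12 Room 3";
        System.out.println(Arrays.toString(extract(input)));
        // [Math, 09:00, 10:00, Monday, B12, Room 3]
    }
}
```

If data5 itself can contain spaces, the split between the last two groups becomes ambiguous and the format would need a different separator.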
How about using split() with ":" and then taking the elements at odd indices, string[2i+1] for i starting from 0, from the resulting String[]?
