extract values with java regex - java

I am new to regex and I want to extract values from a String like this:
String test="[ABC]Name:User:Date: Adresse ";
I want to extract Name, User, Date and Adresse.
I can do the trick with substring and split:
String test = "[ABC]Name:User:Date: Adresse ";
String test2= test.substring(5,test.length());
System.out.println(test2);
String[] chaine = test2.split(":");
for(String s :chaine)
{
System.out.println("Valeur " + s);
}
but I want to try it with regex. I tried
pattern = Pattern.compile("^[(ABC)|:].");
but it doesn't work.
Can you help me, please?
Thanks a lot.

String#split is really the best way to accomplish what you are trying to do. Having said that, with regex, the following will give you the same output:
Pattern p = Pattern.compile("^(?:\\[ABC\\])([^:]+):([^:]+):([^:]+):([^:]+)$");
Matcher m = p.matcher(test);
while (m.find()) {
System.out.println("Valeur " + m.group(1)); // Name
System.out.println("Valeur " + m.group(2)); // User
System.out.println("Valeur " + m.group(3)); // Date
System.out.println("Valeur " + m.group(4)); // Address
}

You have to escape the [ and ]. Here is a working example:
^\[(.*)\](.*):(.*):(.*):(.*)$
Note that your code is probably more easily maintained than regular expressions in cases where the regular expression becomes complex.
Some people, when confronted with a problem, think "I know, I'll use
regular expressions." Now they have two problems. - Jamie Zawinski
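As a middle ground between the two answers, the split approach can be kept while a small regex handles the prefix and the stray whitespace. This is only a sketch: the class and method names are made up for illustration, and it assumes the prefix is always a single bracketed tag at the start of the string.

```java
import java.util.Arrays;
import java.util.List;

class FieldSplitSketch {
    // Hypothetical helper: drops a leading "[...]" tag, then splits on ':'
    // while swallowing any whitespace around each separator.
    static List<String> extractFields(String input) {
        String body = input.replaceFirst("^\\[[^\\]]*\\]", "").trim();
        return Arrays.asList(body.split("\\s*:\\s*"));
    }

    public static void main(String[] args) {
        for (String field : extractFields("[ABC]Name:User:Date: Adresse ")) {
            System.out.println("Valeur " + field);
        }
    }
}
```

Unlike substring(5, ...), this does not depend on the tag being exactly three characters long.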

Related

Regular expression in java( pattern to match a part of a string)

Could anyone please help me write a regular expression to match a part of the string given below?
"Cecscec\n:90A:/5645644343\nvalue1\nvalue2\nvalue3\n:80F:/245343\nfglfejfj\n"
I want to extract only value2 and value3 from the string above. They appear in the block that starts with :90A:/, which can be used as the lookbehind anchor.
Output:
value2
value3
Pattern:
((?<=(:90A:/).{0,40}[\n].{0,40}[\n]).*)[^:]*
I have been struggling with this for days. I would be very grateful if someone could help me with it.
Thanks
Alternative regex:
":90A:/(?:[^\n]{0,40}\n){2}([^\n]+)\n([^\n]+)"
Regex in context:
public static void main(String[] args) {
String input = "Cecscec\n:90A:/5645644343\nvalue1\nvalue2\nvalue3\n:80F:/245343\nfglfejfj\n";
Matcher matcher = Pattern.compile(":90A:/(?:[^\n]{0,40}\n){2}([^\n]+)\n([^\n]+)").matcher(input);
if(matcher.find()) {
System.out.println("Group1: '" + matcher.group(1) + "'");
System.out.println("Group2: '" + matcher.group(2) + "'");
}
}
Output:
Group1: 'value2'
Group2: 'value3'
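If the input is always line-oriented like this, a plain loop over the lines may be easier to maintain than the regex. The sketch below is an alternative to the answer above, not part of it; the helper name is hypothetical, and it assumes that every tag line starts with ':' and that the wanted values sit on the lines between two tags.

```java
import java.util.ArrayList;
import java.util.List;

class TagBlockSketch {
    // Hypothetical helper: collects the lines that follow the line starting with `tag`,
    // stopping at the next line that starts with ':' (assumed to be the next tag).
    static List<String> linesAfterTag(String input, String tag) {
        List<String> result = new ArrayList<>();
        boolean inBlock = false;
        for (String line : input.split("\n")) {
            if (line.startsWith(tag)) {
                inBlock = true;
                continue;
            }
            if (inBlock) {
                if (line.startsWith(":")) {
                    break;
                }
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "Cecscec\n:90A:/5645644343\nvalue1\nvalue2\nvalue3\n:80F:/245343\nfglfejfj\n";
        List<String> values = linesAfterTag(input, ":90A:/");
        // Skip the first value to keep only value2 and value3.
        System.out.println(values.subList(1, values.size()));
    }
}
```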

Need help in regex matching

It may be very simple, but I am new to regex and need to run some regex matches against a string and extract the number in it. Below is my code with sample input and the required output. I built the Pattern with the help of https://www.freeformatter.com/java-regex-tester.html, but the match itself returns false.
Pattern pattern = Pattern.compile(".*/(a-b|c-d|e-f)/([0-9])+(#[0-9]?)");
String str = "foo/bar/Samsung-Galaxy/a-b/1"; // need to extract 1.
String str1 = "foo/bar/Samsung-Galaxy/c-d/1#P2";// need to extract 2.
String str2 = "foo.com/Samsung-Galaxy/9090/c-d/69"; // need to extract 69
System.out.println("result " + pattern.matcher(str).matches());
System.out.println("result " + pattern.matcher(str1).matches());
System.out.println("result " + pattern.matcher(str2).matches());
All of the calls above print false. I am using Java 8; is there a way to match the pattern and extract the digits from the string in a single statement?
It would be great if somebody could show me how to debug or develop the regex. Please let me know if anything in my question is unclear.
You may use
Pattern pattern = Pattern.compile(".*/(?:a-b|c-d|e-f)/[^/]*?([0-9]+)");
When used with matches(), the pattern above does not require explicit anchors, ^ and $.
Details
.* - any 0+ chars other than line break chars, as many as possible
/ - the rightmost / that is followed with the subsequent subpatterns
(?:a-b|c-d|e-f) - a non-capturing group matching any of the alternatives inside: a-b, c-d or e-f
/ - a / char
[^/]*? - any chars other than /, as few as possible
([0-9]+) - Group 1: one or more digits.
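The difference between matches() and find() explains both the remark about anchors and part of why the pattern in the question printed false. A small sketch, with hypothetical class and method names: matches() must consume the entire input, while find() accepts a match anywhere in it.

```java
import java.util.regex.Pattern;

class MatchesVsFindSketch {
    // True only if the regex consumes the entire input.
    static boolean wholeMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches any part of the input.
    static boolean partialMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // The pattern from the question, tried on its second sample string.
        String original = ".*/(a-b|c-d|e-f)/([0-9])+(#[0-9]?)";
        String str1 = "foo/bar/Samsung-Galaxy/c-d/1#P2";
        System.out.println(wholeMatch(original, str1));   // false: "P2" is left over
        System.out.println(partialMatch(original, str1)); // true: matches up to the '#'
    }
}
```

The other two sample strings fail with either method because the original pattern requires a '#' that they do not contain.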
Java demo:
List<String> strs = Arrays.asList("foo/bar/Samsung-Galaxy/a-b/1","foo/bar/Samsung-Galaxy/c-d/1#P2","foo.com/Samsung-Galaxy/9090/c-d/69");
Pattern pattern = Pattern.compile(".*/(?:a-b|c-d|e-f)/[^/]*?([0-9]+)");
for (String s : strs) {
Matcher m = pattern.matcher(s);
if (m.matches()) {
System.out.println(s + ": \"" + m.group(1) + "\"");
}
}
A replacing approach using the same regex with anchors added:
List<String> strs = Arrays.asList("foo/bar/Samsung-Galaxy/a-b/1","foo/bar/Samsung-Galaxy/c-d/1#P2","foo.com/Samsung-Galaxy/9090/c-d/69");
String pattern = "^.*/(?:a-b|c-d|e-f)/[^/]*?([0-9]+)$";
for (String s : strs) {
System.out.println(s + ": \"" + s.replaceFirst(pattern, "$1") + "\"");
}
Output:
foo/bar/Samsung-Galaxy/a-b/1: "1"
foo/bar/Samsung-Galaxy/c-d/1#P2: "2"
foo.com/Samsung-Galaxy/9090/c-d/69: "69"
Because your regex always matches the last number in the string, I would simply use replaceAll with the regex .*?(\d+)$ :
String regex = ".*?(\\d+)$";
String strResult1 = str.replaceAll(regex, "$1");
System.out.println(!strResult1.isEmpty() ? "result " + strResult1 : "no result");
String strResult2 = str1.replaceAll(regex, "$1");
System.out.println(!strResult2.isEmpty() ? "result " + strResult2 : "no result");
String strResult3 = str2.replaceAll(regex, "$1");
System.out.println(!strResult3.isEmpty() ? "result " + strResult3 : "no result");
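Note that replaceAll returns the input unchanged when the regex does not match, so an empty result does not signal a missing number; compare the result with the original string to detect that case.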
Outputs
result 1
result 2
result 69
Here is a one-liner using String#replaceAll:
public String getDigits(String input) {
String number = input.replaceAll(".*/(?:a-b|c-d|e-f)/[^/]*?(\\d+)$", "$1");
return number.matches("\\d+") ? number : "no match";
}
System.out.println(getDigits("foo.com/Samsung-Galaxy/9090/c-d/69"));
System.out.println(getDigits("foo/bar/Samsung-Galaxy/a-b/some other text/1"));
System.out.println(getDigits("foo/bar/Samsung-Galaxy/9090/a-b/69ace"));
69
no match
no match
This works on the sample inputs you provided. Note that I added logic which will display no match for the case where ending digits could not be matched fitting your pattern. In the case of a non-match, we would typically be left with the original input string, which would not be all digits.
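An alternative to returning a sentinel such as "no match" is to use a Matcher directly and return an Optional, so the caller cannot confuse a failed match with real data. This is a sketch under that assumption; the class and method names are hypothetical and the regex is the one from the answers above.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalDigitsSketch {
    private static final Pattern TRAILING_NUMBER =
            Pattern.compile(".*/(?:a-b|c-d|e-f)/[^/]*?(\\d+)");

    // Returns the trailing digits of the last path segment, or empty if the input does not fit.
    static Optional<String> digits(String input) {
        Matcher m = TRAILING_NUMBER.matcher(input);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(digits("foo.com/Samsung-Galaxy/9090/c-d/69").orElse("no match"));
        System.out.println(digits("foo/bar/Samsung-Galaxy/9090/a-b/69ace").orElse("no match"));
    }
}
```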

Find all <a href>link</a> in a string with java regex

I have a String that contains some URLs. How can I find all the href values with a regular expression?
prodotto di prova
I currently have the regex below, which finds all Amazon links; now I need it to match the href as well:
String regex="(http|www\\.)(amazon|AMAZON)\\.(com|it|uk|fr|de)\\/(?:gp\\/product|gp\\/product\\/glance|[^\\/]+\\/dp|dp|[^\\/]+\\/product-reviews)\\/([^\\/]{10})";
This pattern works for me in Java:
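String input = "<a href=\"http://www.amazon.it/Die-10-Symphonien-Orchesterlieder-Sinfonie-Complete/dp/B003LQSHBO/ref=sr_1_2?ie=UTF8&qid=1440101590&sr=8-2&keywords=mahler\">prodotto di prova</a>";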
String pattern = "href=(?<link>['\\\"](?:https?:\\/\\/)?(?:www\\.)?(?:amazon|AMAZON)\\.(?:com|it|uk|fr|de)\\/(?<product>gp\\/product|gp\\/product\\/glance|[^\\/]+\\/dp|dp|[^\\/]+\\/product-reviews)\\/(?<productID>[^\\/]{10})\\/(?<queryString>.*?)\\\")";
Pattern r = Pattern.compile(pattern);
Matcher m = r.matcher(input);
if (m.find( )) {
System.out.println("Amazon link: " + m.group(0) );
System.out.println("product: " + m.group("product") );
System.out.println("productID: " + m.group("productID"));
System.out.println("querystring: " + m.group("queryString"));
} else {
System.out.println("NO MATCH");
}
output:
Amazon link: href="http://www.amazon.it/Die-10-Symphonien-Orchesterlieder-Sinfonie-Complete/dp/B003LQSHBO/ref=sr_1_2?ie=UTF8&qid=1440101590&sr=8-2&keywords=mahler"
product: Die-10-Symphonien-Orchesterlieder-Sinfonie-Complete/dp
productID: B003LQSHBO
querystring: ref=sr_1_2?ie=UTF8&qid=1440101590&sr=8-2&keywords=mahler
Java's rules for backslashes and escapes in strings are absolutely infuriating to me and I never get it right. You may find it helpful to go to http://www.regexplanet.com/advanced/java/index.html and enter a regex, which it will convert into a java string with the proper escapes. (I couldn't get mine working until I did this!)
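The escaping rule itself is mechanical: every backslash in the regex is doubled inside a Java string literal, and Pattern.quote can be used when a piece of text should be matched literally. The sketch below illustrates both points; the class name, method names and sample strings are invented for illustration.

```java
import java.util.regex.Pattern;

class EscapeSketch {
    // The regex \$\d+\.\d{2} written as a Java literal: each backslash is doubled.
    static boolean isPrice(String s) {
        return s.matches("\\$\\d+\\.\\d{2}");
    }

    // Pattern.quote wraps the text so that metacharacters such as '.' are matched literally.
    static boolean containsLiteral(String haystack, String literal) {
        return Pattern.compile(Pattern.quote(literal)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(isPrice("$12.50"));
        System.out.println(containsLiteral("see a.b/c", "a.b/c"));
        System.out.println(containsLiteral("see aXb/c", "a.b/c"));
        // '/' is not a metacharacter in Java regex, so "\\/" and "/" match the same text.
        System.out.println("a/b".matches("a\\/b") == "a/b".matches("a/b"));
    }
}
```

This means the \\/ sequences in the patterns above can be written as a plain / without changing what they match.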

Java multiple regular expression search

I have a string something like this:
If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda}
My pattern should look for the particular words Password or tmpPwd or TEMP_PASSWORD.
How can I create a pattern for this kind of search?
I think you are looking for the values after these words. You need to set capturing groups to extract those values, e.g.
String content = "If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda} ";
Pattern p = Pattern.compile("\\{Password\\s*:\\s*([^,]+)\\s*,\\s*tmpPwd\\s*:\\s*([^,]+)\\s*,\\s*TEMP_PASSWORD:\\s*([^,]+)\\s*\\}");
Matcher m = p.matcher(content);
while (m.find()) {
System.out.println(m.group(1) + ", " + m.group(2) + ", " + m.group(3));
}
This will output 123456, tesgjadgj, kfnda.
To just find out if there are any of the substrings, use contains method:
System.out.println(content.contains("Password") ||
content.contains("tmpPwd") ||
content.contains("TEMP_PASSWORD"));
And if you want a regex-solution for the keywords, here it is:
String str = "If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda} ";
Pattern ptrn = Pattern.compile("Password|tmpPwd|TEMP_PASSWORD");
Matcher m = ptrn.matcher(str);
while (m.find()) {
System.out.println("Match found: " + m.group(0));
}
In the end, this is what I used for my requirement:
private final static String censoredWords =
"(?i)PASSWORD|pwd";
The (?i) flag makes the match case-insensitive.
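If the goal is to hide the values rather than just detect the keywords, the same case-insensitive idea can drive a replacement. This is a sketch, not the asker's code: the class name, the mask text and the assumption that a value ends at a comma, closing brace or whitespace are all mine. Pattern.CASE_INSENSITIVE is the flag equivalent of (?i).

```java
import java.util.regex.Pattern;

class CensorSketch {
    // Group 1: a key ending in "password" or "pwd"; group 2: the separator; then the value.
    private static final Pattern SENSITIVE = Pattern.compile(
            "\\b(\\w*(?:password|pwd))(\\s*:\\s*)[^,}\\s]+", Pattern.CASE_INSENSITIVE);

    // Keeps the key and separator, replaces the value with a fixed mask.
    static String mask(String message) {
        return SENSITIVE.matcher(message).replaceAll("$1$2****");
    }

    public static void main(String[] args) {
        System.out.println(mask(
                "If message contains sensitive info like: {Password:123456, tmpPwd : tesgjadgj, TEMP_PASSWORD: kfnda}"));
    }
}
```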

Java regex comparing group to string

I am trying to do a replacement using regex. The relevant piece of code is as follows:
String msg =" <ClientVerificationResult>\n " +
" <VerificationIDCheck>Y</VerificationIDCheck>\n" +
" </ClientVerificationResult>\n";
String regex = "(<VerificationIDCheck>)([Y|N])(</VerificationIDCheck>)";
String replacedMsg= msg.replaceAll(regex, "$2".matches("Y") ? "$1YES$3" : "$1NO$3") ;
System.out.println(replacedMsg);
The output of this is
<ClientVerificationResult>
<VerificationIDCheck>NO</VerificationIDCheck>
</ClientVerificationResult>
When it should be
<ClientVerificationResult>
<VerificationIDCheck>YES</VerificationIDCheck>
</ClientVerificationResult>
I guess the problem is that "$2".matches("Y") is returning false. I have tried doing "$2".equals("Y"); and weird combinations inside matches() like "[Y]" or "([Y])", but still nothing.
If I print "$2" the output is Y. Any hints on what I am doing wrong?
The replacement argument of replaceAll is a plain string: the ternary expression is evaluated once, before replaceAll is called, so it cannot inspect the matched group. Use the Pattern and Matcher APIs instead and evaluate matcher.group(2) for your replacement logic.
Suggested Code:
String msg =" <ClientVerificationResult>\n " +
" <VerificationIDCheck>Y</VerificationIDCheck>\n" +
" </ClientVerificationResult>\n";
String regex = "(<VerificationIDCheck>)([YN])(</VerificationIDCheck>)";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher( msg );
StringBuffer sb = new StringBuffer();
while (m.find()) {
String repl = m.group(2).matches("Y") ? "YES" : "NO";
m.appendReplacement(sb, m.group(1) + repl + m.group(3));
}
m.appendTail(sb);
System.out.println(sb); // replaced string
You are checking the literal string "$2" to see if it matches "Y". This will never happen.
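On Java 9 or later (an assumption; the code in the thread does not say which version is used), Matcher.replaceAll also accepts a function, which allows the per-match decision without the appendReplacement loop. A minimal sketch with a hypothetical class name:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceWithFunctionSketch {
    private static final Pattern CHECK =
            Pattern.compile("(<VerificationIDCheck>)([YN])(</VerificationIDCheck>)");

    // Expands Y/N inside the tag to YES/NO, deciding per match (requires Java 9+).
    static String expand(String msg) {
        return CHECK.matcher(msg).replaceAll(mr ->
                // quoteReplacement guards against '$' or '\' in the computed text.
                Matcher.quoteReplacement(
                        mr.group(1) + ("Y".equals(mr.group(2)) ? "YES" : "NO") + mr.group(3)));
    }

    public static void main(String[] args) {
        String msg = " <ClientVerificationResult>\n"
                + "  <VerificationIDCheck>Y</VerificationIDCheck>\n"
                + " </ClientVerificationResult>\n";
        System.out.println(expand(msg));
    }
}
```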
