How to find passwords with regex - java

A password consists of digits and Latin letters in any case.
A password always follows the word "password" (in any case), but the two can be separated by any number of spaces and colon (:) characters.
I tried this regular expression:
Pattern pattern = Pattern.compile("password\\s\\w*",Pattern.CASE_INSENSITIVE);
If the text is
String text = "My email javacoder#gmail.com with password SECRET115. Here is my old PASSWORD: PASS111.\n";//scanner.nextLine();
we need to find SECRET115 and PASS111.
Right now the program fails and cannot find the pattern.

You may add an optional : after password, and match 0 or more whitespaces with \s*:
password:?\s*(\w+)
Details
password - a fixed string
:? - 1 or 0 colons
\s* - 0+ whitespaces
(\w+) - Capturing group 1: one or more word chars.
Java demo:
String s = "My email javacoder#gmail.com with password SECRET115. Here is my old PASSWORD: PASS111.\n";
Pattern pattern = Pattern.compile("password:?\\s*(\\w+)", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Output:
SECRET115
PASS111
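The question allows any number of spaces and colons between the word and the password, and limits passwords to digits and Latin letters. A minimal sketch of a stricter variant under those assumptions follows; the class and method names (PasswordFinder, findPasswords) are made up for illustration. It uses [\s:]* for the separator and [A-Za-z0-9]+ instead of \w+, so that _ is not accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class PasswordFinder {
    // "password", then any mix of whitespace and colons, then the password.
    // [A-Za-z0-9]+ is used instead of \w+ so that '_' is not accepted.
    private static final Pattern PASSWORD =
            Pattern.compile("password[\\s:]*([A-Za-z0-9]+)", Pattern.CASE_INSENSITIVE);

    static List<String> findPasswords(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = PASSWORD.matcher(text);
        while (matcher.find()) {
            found.add(matcher.group(1)); // group 1 holds only the password
        }
        return found;
    }

    public static void main(String[] args) {
        String text = "My email javacoder#gmail.com with password SECRET115. "
                + "Here is my old PASSWORD: PASS111.\n";
        System.out.println(findPasswords(text)); // prints [SECRET115, PASS111]
    }
}
```

Because the separator may be empty, a word such as "passwords" would yield "s" as a match. If that matters, requiring at least one separator with [\s:]+ is one option.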

Related

In Java, how do you tokenize a string that contains the delimiter in the tokens?

Let's say I have the string:
String toTokenize = "prop1=value1;prop2=String test='1234';int i=4;;prop3=value3";
I want the tokens:
prop1=value1
prop2=String test='1234';int i=4;
prop3=value3
For backwards compatibility, I have to use the semicolon as a delimiter. I have tried wrapping code in something like CDATA:
String toTokenize = "prop1=value1;prop2=<![CDATA[String test='1234';int i=4;]]>;prop3=value3";
But I can't figure out a regular expression to ignore the semicolons that are within the cdata tags.
I've tried escaping the non-delimiter:
String toTokenize = "prop1=value1;prop2=String test='1234'\\;int i=4\\;;prop3=value3";
But then there is an ugly mess of removing the escape characters.
Do you have any suggestions?
You may match either <![CDATA...]]> or any char other than ;, 1 or more times, to match the values. To match the keys, you may use a regular \w+ pattern:
(\w+)=((?:<!\[CDATA\[.*?]]>|[^;])+)
Details
(\w+) - Group 1: one or more word chars
= - a = sign
((?:<!\[CDATA\[.*?]]>|[^;])+) - Group 2: one or more sequences of
<!\[CDATA\[.*?]]> - a <![CDATA[...]]> substring
| - or
[^;] - any char but ;
See a Java demo:
String rx = "(\\w+)=((?:<!\\[CDATA\\[.*?]]>|[^;])+)";
String s = "prop1=value1;prop2=<![CDATA[String test='1234';int i=4;]]>;prop3=value3";
Pattern pattern = Pattern.compile(rx);
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1) + " => " + matcher.group(2));
}
Results:
prop1 => value1
prop2 => <![CDATA[String test='1234';int i=4;]]>
prop3 => value3
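If the CDATA wrapper should not end up in the final values, the same regex can feed a map and the wrapper can be stripped afterwards. This is only a sketch: PropertyParser, parse and unwrap are hypothetical names, and it assumes values do not span lines (without DOTALL, .*? stops at a line break).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class built around the regex from the answer.
class PropertyParser {
    private static final Pattern ENTRY =
            Pattern.compile("(\\w+)=((?:<!\\[CDATA\\[.*?]]>|[^;])+)");
    private static final String CDATA_OPEN = "<![CDATA[";
    private static final String CDATA_CLOSE = "]]>";

    // Returns key/value pairs in the order they appear in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> props = new LinkedHashMap<>();
        Matcher matcher = ENTRY.matcher(input);
        while (matcher.find()) {
            props.put(matcher.group(1), unwrap(matcher.group(2)));
        }
        return props;
    }

    // Strips one enclosing CDATA wrapper if the whole value is wrapped.
    static String unwrap(String value) {
        boolean wrapped = value.startsWith(CDATA_OPEN)
                && value.endsWith(CDATA_CLOSE)
                && value.length() >= CDATA_OPEN.length() + CDATA_CLOSE.length();
        return wrapped
                ? value.substring(CDATA_OPEN.length(), value.length() - CDATA_CLOSE.length())
                : value;
    }

    public static void main(String[] args) {
        String s = "prop1=value1;prop2=<![CDATA[String test='1234';int i=4;]]>;prop3=value3";
        // Prints each key with its unwrapped value, one pair per line.
        parse(s).forEach((key, value) -> System.out.println(key + " => " + value));
    }
}
```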
Prerequisites:
All your tokens start with prop.
There is no prop in the input other than at the beginning of a token.
Given those, I'd just replace every ;prop with ~prop.
Then your string becomes:
"prop1=value1~prop2=String test='1234';int i=4;~prop3=value3"
You can then tokenize using the ~ delimiter.
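The replace-then-split idea can be sketched as below. The class and method names are hypothetical, and the first method assumes ~ never occurs in the data. The second method is a variation not mentioned in the answer: it splits on a semicolon only when it is followed by prop, so no placeholder character is needed.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; relies on every token starting with "prop".
class PropTokenizer {
    // Replace the delimiter in front of each token, then split on the placeholder.
    // Assumption: '~' does not appear anywhere in the input.
    static List<String> tokenizeWithPlaceholder(String input) {
        return Arrays.asList(input.replace(";prop", "~prop").split("~"));
    }

    // Variation: split on ';' only where "prop" follows (lookahead), no placeholder.
    static List<String> tokenizeWithLookahead(String input) {
        return Arrays.asList(input.split(";(?=prop)"));
    }

    public static void main(String[] args) {
        String toTokenize = "prop1=value1;prop2=String test='1234';int i=4;;prop3=value3";
        // Both calls print the same three tokens, one per line.
        tokenizeWithPlaceholder(toTokenize).forEach(System.out::println);
        tokenizeWithLookahead(toTokenize).forEach(System.out::println);
    }
}
```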

How to make regex pattern for some scenarios

I am writing a Word XML parser in Java.
I want to check whether a pattern like (F(1) = 44) occurs in a paragraph.
Note: the inner parentheses must contain an integer value.
These are the strings I need to check:
(text text (text) text)
(F(1) = 44)
(text text [text] text)
[text text (text) text]
But I don't know how to write a regex pattern for these scenarios.
Please suggest one.
You can use the regex \([a-zA-Z]+\(\d+\)\s*=\s*\d+\), which means:
one or more letters [a-zA-Z]+
followed by one or more digits between parentheses \(\d+\)
followed by zero or more whitespace characters \s*
followed by an equals sign =
followed by zero or more whitespace characters \s*
followed by one or more digits \d+
all of this between parentheses \([a-zA-Z]+\(\d+\)\s*=\s*\d+\)
Use it with Pattern like this:
String[] texts = new String[]{"(text text (text) text)",
    "(F(1) = 44)",
    "(text text [text] text)",
    "[text text (text) text]"};
String regex = "\\([a-zA-Z]+\\(\\d+\\)\\s*=\\s*\\d+\\)";
// Compile once, outside the loop
Pattern pattern = Pattern.compile(regex);
for (String s : texts) {
    Matcher matcher = pattern.matcher(s);
    if (matcher.find()) {
        System.out.println("Match found: " + matcher.group());
    } else {
        System.out.println("No match occurred");
    }
}
Output
No match occurred
Match found: (F(1) = 44)
No match occurred
No match occurred
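Since the goal is to check a whole paragraph, it may also help to pull out the individual parts of every occurrence. The sketch below adds capturing groups to the regex from this answer; FunctionValueFinder and findAll are hypothetical names, and the sample paragraph is invented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
class FunctionValueFinder {
    // The answer's regex with capturing groups around the
    // name (1), the integer argument (2) and the integer value (3).
    private static final Pattern ASSIGNMENT =
            Pattern.compile("\\(([a-zA-Z]+)\\((\\d+)\\)\\s*=\\s*(\\d+)\\)");

    // Returns every occurrence in a normalized "NAME(ARG) = VALUE" form.
    static List<String> findAll(String paragraph) {
        List<String> found = new ArrayList<>();
        Matcher matcher = ASSIGNMENT.matcher(paragraph);
        while (matcher.find()) {
            found.add(matcher.group(1) + "(" + matcher.group(2) + ") = " + matcher.group(3));
        }
        return found;
    }

    public static void main(String[] args) {
        String paragraph = "Intro (text text (text) text), then (F(1) = 44) and (G(12)=7).";
        System.out.println(findAll(paragraph)); // prints [F(1) = 44, G(12) = 7]
    }
}
```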

Using Regular Expressions to Extract specific Values in Java

I have several strings in the rough form:
String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
I want to extract the values for websiteName, userAgentNameWithSpaces, username and someTime.
I have tried the following code.
private static final Pattern USER_NAME_PATTERN = Pattern.compile("for user.*;");
final Matcher matcher = USER_NAME_PATTERN.matcher(line);
matcher.find() ? Optional.of(matcher.group(group)) : Optional.empty();
It returns the whole matched substring, including "for user", so afterwards I have to replace "for user" with an empty string to get the user name.
However, I want to know whether there is a regex that gets just the username directly.
You can use regex groups:
Pattern pattern = Pattern.compile("for user (\\w+)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
The pair of parentheses ( and ) forms a group whose content can be obtained from the matcher with the group method (as it's the first pair of parentheses, it's group 1).
\w means a "word character" (letters, digits and _) and + means "one or more occurrences". So \w+ basically means "a word" (assuming your username has only these characters). PS: note that I had to escape the \ in the Java string literal, so the expression is written as \\w+.
The output of this code is:
username
If you want to match all the values (websiteName, userAgentNameWithSpaces and so on), you could do the following:
Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent (.*) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}
The output will be:
websiteNAme
userAgentNameWithSpaces
username
someTime
Note that if userAgentNameWithSpaces contains spaces, \w+ won't work (because \w doesn't match spaces), whereas .* will.
But you can also use [\w ]+ - the brackets [] mean "any of the characters inside me", so [\w ] means "a word character or a space" (note that there's a space between w and ]). So the code would be (testing with a user agent name with spaces):
String s = "Rendering content from websiteNAme using user agent userAgent Name WithSpaces ; for user username ; at time someTime";
Pattern pattern = Pattern.compile("Rendering content from (.*) using user agent ([\\w ]+) ; for user (.*) ; at time (.*)");
Matcher matcher = pattern.matcher(s);
if (matcher.find()) {
    System.out.println(matcher.group(1));
    System.out.println(matcher.group(2));
    System.out.println(matcher.group(3));
    System.out.println(matcher.group(4));
}
And the output will be:
websiteNAme
userAgent Name WithSpaces
username
someTime
Note: matcher.groupCount() returns the number of capturing groups in the pattern, not how many of them took part in the match. If you call matcher.group(n) with an n larger than that, you'll get an IndexOutOfBoundsException; a group that exists but did not take part in the match returns null.
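A small sketch of how groupCount() and group(n) behave in java.util.regex, using a made-up pattern with two alternative groups (GroupCountDemo and its methods are illustrative names only): groupCount() reports the number of capturing groups defined in the pattern, a group that did not take part in the match comes back as null, and asking for a group the pattern does not define throws IndexOutOfBoundsException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for groupCount() and group(n).
class GroupCountDemo {
    // Two capturing groups, but only one of them can take part in any match.
    private static final Pattern PATTERN = Pattern.compile("(a)|(b)");

    static String describe(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return "no match";
        }
        return "groupCount=" + m.groupCount()
                + ", group(1)=" + m.group(1)
                + ", group(2)=" + m.group(2);
    }

    // True if asking for a group the pattern does not define throws.
    static boolean missingGroupThrows(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.find()) {
            return false;
        }
        try {
            m.group(3);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("b"));           // prints groupCount=2, group(1)=null, group(2)=b
        System.out.println(missingGroupThrows("b")); // prints true
    }
}
```

So a null check on matcher.group(n) is the way to tell whether an optional group actually matched.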
I think you want to use lookaheads and lookbehinds:
String s = "Rendering content from websiteNAme using user agent userAgentNameWithSpaces ; for user username ; at time someTime";
Pattern USER_NAME_PATTERN = Pattern.compile("(?<=for user).*?(?=;)");
final Matcher matcher = USER_NAME_PATTERN.matcher(s);
if (matcher.find()) {
    // group(0) still includes the surrounding spaces, so trim them
    System.out.println(matcher.group(0).trim());
}
Output:
username
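Since the code in the question returns an Optional, here is a sketch of how the capturing-group approach might fit into that shape. The class and method names are assumptions, and the pattern expects the user name to be followed by a semicolon as in the sample line; \s+ and \s* keep the surrounding spaces out of the group, so no trim() is needed.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, shaped after the question's Optional-returning code.
class UserNameExtractor {
    // Group 1 lazily captures everything between "for user " and the next ';',
    // with surrounding whitespace left outside the group.
    private static final Pattern USER_NAME_PATTERN =
            Pattern.compile("for user\\s+(.*?)\\s*;");

    static Optional<String> extractUserName(String line) {
        Matcher matcher = USER_NAME_PATTERN.matcher(line);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "Rendering content from websiteNAme using user agent "
                + "userAgentNameWithSpaces ; for user username ; at time someTime";
        System.out.println(extractUserName(s).orElse("<none>")); // prints username
    }
}
```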

Regular expression for extracting instance ID, AMI ID, Volume ID

Given the following string
Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305
I want to be able to extract the following using a regular expression
i-b9b4ffaa
ami-dbcf88b1
vol-e97db305
This is the regular expression I came up with, which currently doesn't do what I need :
Pattern p = Pattern.compile("Created by CreateImage([a-z]+[0.9]+)([a-z]+[0.9]+)([a-z]+[0.9]+)",Pattern.CASE_INSENSITIVE);
Matcher m = p.matcher("Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305");
System.out.println(m.matches()); // prints false
You may match all words starting with letters, followed with a hyphen, and then having alphanumeric chars:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}
// => i-b9b4ffaa, ami-dbcf88b1, vol-e97db305
Pattern details:
(?i) - a case insensitive modifier (embedded flag option)
\\b - a word boundary
[a-z]+ - 1 or more ASCII letters
- - a hyphen
[a-z0-9]+ - 1 or more alphanumerics.
To make sure these values appear on the same line after Created by CreateImage, use a \G-based regex:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
Pattern pattern = Pattern.compile("(?i)(?:Created by CreateImage|(?!\\A)\\G)(?:(?!\\b[a-z]+-[a-z0-9]+).)*\\b([a-z]+-[a-z0-9]+)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
Note that the above pattern relies on the \G operator, which matches at the end of the previous successful match (so we only match right after a previous match or after Created...), and on a tempered greedy token, (?:(?!\\b[a-z]+-[a-z0-9]+).)*, which matches any char other than a line break that does not start a "word boundary + letters + hyphen + letters/digits" sequence. The tempered greedy token is very resource consuming.
You should consider using a two-step approach to first check if a string starts with Created... string, and then process it:
String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
if (s.startsWith("Created by CreateImage")) {
    Matcher n = Pattern.compile("(?i)\\b[a-z]+-[a-z0-9]+").matcher(s);
    while (n.find()) {
        System.out.println(n.group(0));
    }
}
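For comparison, the pattern from the question fails for three reasons: its parentheses are unescaped (so they form groups instead of matching literal brackets), [0.9] matches only the characters 0, . and 9, and the hyphens and the literal " for " and " from " text are missing. If the message always has exactly the layout shown, a single pattern with three groups may be enough. This is a sketch under that assumption; the class name and the i-, ami- and vol- prefixes are taken from the example string rather than from any specification.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; assumes the fixed layout of the sample message.
class ImageDescriptionParser {
    // Literal parentheses are escaped; each ID is "prefix-" plus letters/digits.
    private static final Pattern DESCRIPTION = Pattern.compile(
            "Created by CreateImage\\((i-[a-z0-9]+)\\) for (ami-[a-z0-9]+) from (vol-[a-z0-9]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns [instanceId, amiId, volumeId], or an empty list if the layout differs.
    static List<String> parse(String description) {
        Matcher m = DESCRIPTION.matcher(description);
        if (!m.matches()) {
            return Collections.emptyList();
        }
        return Arrays.asList(m.group(1), m.group(2), m.group(3));
    }

    public static void main(String[] args) {
        String s = "Created by CreateImage(i-b9b4ffaa) for ami-dbcf88b1 from vol-e97db305";
        System.out.println(parse(s)); // prints [i-b9b4ffaa, ami-dbcf88b1, vol-e97db305]
    }
}
```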

I want to extracting css image path by Java Pattern expression

I want to write a regex pattern to extract "/images/colorbox/ie6/borderBottomRight.png" from cssContent=".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}"
Can anyone write such a pattern for me? Thanks a lot.
My regex, which doesn't work:
Pattern pattern = Pattern.compile("[.*]*/:url/(/././/(.+?)/)/;[.*]*");
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(0));
}
Pattern pattern = Pattern.compile(":url\\(\\.\\.([^)]+)\\)");
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
The regex used is (shown quoted, without the extra Java string escaping of \)
":url\(\.\.([^)]+)\)"
It looks for :url(.. followed by [^)]+, that is, one or more characters that are not a closing ) bracket, and finally the closing ) bracket itself. The captured group () is available as group(1), whereas group(0) would give you the complete match, i.e. from :url to the closing ).
The biggest error you were making was using "/" to escape your literal characters. You need to use "\", and annoyingly, in a Java string literal "\" must itself be escaped with another "\", so the total escape sequence is "\\". Also, you print matcher.group(0), which is the entire match. You need matcher.group(1) to get the first (and only) group in your regex, which contains your string of interest. Here's the corrected code:
String cssContent = "cssContent=\".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}\"";
String regex = ".*?:url\\(\\.\\.(.+?)\\);[.*]*";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(cssContent);
if (matcher.find()) {
    System.out.println(matcher.group(1));
}
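If the stylesheet may contain several url(...) references, possibly quoted, a more general sketch could look like the following. CssUrlExtractor and its methods are hypothetical names, and the regex is a simplification that assumes paths contain no quotes or closing parentheses. Stripping the leading .. is done as a separate step so the extraction also works for paths that do not start with it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; a simplified take on CSS url(...) values.
class CssUrlExtractor {
    // url( + optional whitespace + optional quote + path + optional quote + )
    private static final Pattern URL =
            Pattern.compile("url\\(\\s*['\"]?([^'\")]+)['\"]?\\s*\\)");

    // Returns every path found inside url(...), in order of appearance.
    static List<String> extractUrls(String css) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL.matcher(css);
        while (matcher.find()) {
            urls.add(matcher.group(1).trim());
        }
        return urls;
    }

    // Drops a leading ".." so "../images/a.png" becomes "/images/a.png".
    static String stripParentPrefix(String path) {
        return path.startsWith("..") ? path.substring(2) : path;
    }

    public static void main(String[] args) {
        String cssContent = ".cboxIE6 #cboxBottomRight{background:url(../images/colorbox/ie6/borderBottomRight.png);}";
        for (String url : extractUrls(cssContent)) {
            System.out.println(stripParentPrefix(url)); // prints /images/colorbox/ie6/borderBottomRight.png
        }
    }
}
```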
