Find e-mail addresses in a text - java

Can somebody tell me how to find e-mail addresses in a text?
Example text:
"Hey,
I just blahblah
E-Mail: lolcat#catinator.com
Another would be lolcat2#catinator.com"
So the output is:
lolcat#catinator.com
lolcat2#catinator.com
I tried regex, but I have no idea how to do this over an entire text...
Pattern pattern = Pattern.compile("[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,4}");
Matcher matcher = pattern.matcher("asd#asdasd.de".toUpperCase());
if(matcher.matches()){
System.out.println("Mail found!");
}else{
System.out.println("No Mail...");
}
Can somebody help me? :(
Greetings!

There are so many different email address formats that it is hard to match all of them. A simple (for your structured data) but not so effective approach would be the following:
String s = "Hey,\n" +
"I just blahblah\n" +
"E-Mail: lolcat#catinator.com\n" +
"Another would be lolcat2#catinator.com";
Pattern p = Pattern.compile("\\S+#\\S+");
Matcher m = p.matcher(s);
while (m.find()) {
System.out.println(m.group());
}
Output
lolcat#catinator.com
lolcat2#catinator.com
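
If \S+#\S+ turns out to be too loose (it also grabs trailing punctuation, for example), the stricter pattern from the question can be used the same way, with find() instead of matches(). The following is only a sketch: the class and method names are made up for illustration, Pattern.CASE_INSENSITIVE is swapped in for the toUpperCase() call so the original casing is preserved, and '#' is kept as the separator because that is how the sample text is written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailFinder {
    // The question's pattern, made case-insensitive instead of upper-casing the input.
    // '#' is kept as the separator because the sample text uses it.
    private static final Pattern ADDRESS = Pattern.compile(
            "[A-Z0-9._%+-]+#[A-Z0-9.-]+\\.[A-Z]{2,4}", Pattern.CASE_INSENSITIVE);

    // Collects every substring of the text that matches the pattern.
    static List<String> findAddresses(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = ADDRESS.matcher(text);
        while (m.find()) { // find() scans forward; matches() would test the whole text
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "Hey,\nI just blahblah\nE-Mail: lolcat#catinator.com\n"
                 + "Another would be lolcat2#catinator.com";
        for (String address : findAddresses(s)) {
            System.out.println(address);
        }
    }
}
```

The {2,4} limit on the last part comes from the question's pattern and will miss longer top-level domains, so adjust it to your data.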

I am not sure about the regex you have provided. But if it is good and serves your purpose, you can use the following to extract the string:
Matcher matcher = pattern.matcher("asd#asdasd.de".toUpperCase());
String result;
while (matcher.find()) {
// result now will contain the email address
result = matcher.group();
System.out.println(result);
}

Related

Regex capture group within if statement in Java

I'm facing a stupid problem... I know how to use Pattern and Matcher objects to capture a group in Java.
However, I cannot find a way to use them with an if statement where each branch depends on a match (a simple example to illustrate the question; in reality, it's more complicated):
String input="A=B";
String output="";
if (input.matches("#.*")) {
output="comment";
} else if (input.matches("A=(\\w+)")) {
output="value of key A is ..."; //how to get the content of capturing group?
} else {
output="unknown";
}
Should I create a Matcher for each possible test?!
Yes, you should.
Here is an example.
Pattern p = Pattern.compile("Phone: (\\d{9})");
String str = "Phone: 123456789";
Matcher m = p.matcher(str);
if (m.find()) {
String g = m.group(1); // g should hold 123456789
}
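
Applied to the if/else chain from the question, this could look roughly like the sketch below. The class name, method name and constant names are invented for the example; the two patterns are the ones from the question. The Matcher is assigned inside the condition so that its group is still available in the branch body.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineClassifier {
    // Compile each pattern once and reuse it.
    private static final Pattern COMMENT = Pattern.compile("#.*");
    private static final Pattern KEY_A = Pattern.compile("A=(\\w+)");

    static String classify(String input) {
        Matcher m;
        if (COMMENT.matcher(input).matches()) {
            return "comment";
        } else if ((m = KEY_A.matcher(input)).matches()) {
            // group(1) is only valid after a successful matches()/find()
            return "value of key A is " + m.group(1);
        } else {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("A=B"));      // value of key A is B
        System.out.println(classify("# a note")); // comment
        System.out.println(classify("C=D"));      // unknown
    }
}
```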

Java Regex : Extract a specific pattern from a string "I_INSERT_TO_TOPIC_345674_123456_4.json"

I want to extract only "_123456_4" from this string using Java regex.
I_INSERT_TO_TOPIC_345674_123456_4.json
I have tried
Pattern.compile("(_([^_]*_[^_]))") and Pattern.compile("_" + "([^[0-9]]*)" + "_[0-9]") but these do not work.
If you want to get the two groups of digits just before .json, you can use a capturing group to find the required match. You can modify the pattern to fit your requirements.
String s = "I_INSERT_TO_TOPIC_345674_123456_4.json";
Pattern p = Pattern.compile("(_\\d+_\\d+)\\.json");
Matcher matcher = p.matcher(s);
if (matcher.find()) {
    String group = matcher.group(1); // group holds _123456_4
}
_[0-9]*_[0-9]*(?=\\.)
You can try this pattern and see if it works.
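
To compare the two suggestions, here is a small sketch that runs both patterns against the file name from the question. The class and method names are made up for the example; the patterns are the ones given in the answers. The first reads the capturing group, the second uses a lookahead so the whole match is already the wanted part.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TopicSuffix {
    private static final Pattern WITH_GROUP = Pattern.compile("(_\\d+_\\d+)\\.json");
    private static final Pattern WITH_LOOKAHEAD = Pattern.compile("_[0-9]*_[0-9]*(?=\\.)");

    // Returns the captured group, or null when the name does not match.
    static String viaGroup(String name) {
        Matcher m = WITH_GROUP.matcher(name);
        return m.find() ? m.group(1) : null;
    }

    // Returns the whole match; the lookahead checks for the dot without consuming it.
    static String viaLookahead(String name) {
        Matcher m = WITH_LOOKAHEAD.matcher(name);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "I_INSERT_TO_TOPIC_345674_123456_4.json";
        System.out.println(viaGroup(s));     // _123456_4
        System.out.println(viaLookahead(s)); // _123456_4
    }
}
```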

Java regex does not match as expected

I started with regex in Java recently, and I can't wrap my head around this problem.
Pattern p = Pattern.compile("[^A-Z]+");
Matcher matcher = p.matcher("GETs");
if (matcher.matches()) {
System.out.println("Matched.");
} else {
System.out.println("Did not match.");
}
Result: Did not match. (Unexpected result)
I get the output "Did not match." This is strange to me, because according to https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html,
X+ matches X "one or more times".
I thought my code in words would go something like this:
"Check if there is one or more characters in the string "GETs" which does not belong in A to Z."
So I'm expecting the following result:
"Yes, there is one character that does not belong to A-Z in "GETs", the regex was a match."
However, this is not the case, and I'm confused as to why.
I tried the following:
Pattern p = Pattern.compile("[A-Z]+");
Matcher matcher = p.matcher("GETs");
if (matcher.matches()) {
System.out.println("Matched.");
} else {
System.out.println("Did not match.");
}
Result: Did not match. (Expected result)
Pattern p = Pattern.compile("[A-Z]+");
Matcher matcher = p.matcher("GET");
if (matcher.matches()) {
System.out.println("Matched.");
} else {
System.out.println("Did not match.");
}
Result: Matched. (Expected result)
Please, explain why my first example did not work.
Matcher.matches returns true only if the ENTIRE region
matches the pattern.
For the output you are looking for, use Matcher.find instead.
Explanation of each case:
Pattern p = Pattern.compile("[^A-Z]+");
Matcher matcher = p.matcher("GETs");
if (matcher.matches()) {
Fails because the ENTIRE region 'GETs' isn't made up of non-uppercase characters; only the 's' is outside A-Z
Pattern p = Pattern.compile("[A-Z]+");
Matcher matcher = p.matcher("GETs");
if (matcher.matches()) {
This fails because the ENTIRE region 'GETs' isn't uppercase
Pattern p = Pattern.compile("[A-Z]+");
Matcher matcher = p.matcher("GET");
if (matcher.matches()) {
The ENTIRE region 'GET' is uppercase, the pattern matches.
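
The difference can be checked side by side with a small sketch like the one below. The class and helper method names are made up for illustration; only Pattern, Matcher.matches() and Matcher.find() come from java.util.regex.

```java
import java.util.regex.Pattern;

class MatchesVsFind {
    // true only if the whole input matches the regex
    static boolean wholeInputMatches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // true if any part of the input matches the regex
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("[^A-Z]+", "GETs")); // false
        System.out.println(containsMatch("[^A-Z]+", "GETs"));     // true, matches "s"
        System.out.println(wholeInputMatches("[A-Z]+", "GETs"));  // false
        System.out.println(wholeInputMatches("[A-Z]+", "GET"));   // true
    }
}
```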
Your very first regex asks to match any character that is not in the uppercase range A-Z. With find(), the match is on the lowercase "s" in GETs.
If you want a regex to match regardless of case, you can use the (?i) flag:
String test = "yes";
String test2= "YEs";
test.matches("(?i).*\\byes\\b.*");
test2.matches("(?i).*\\byes\\b.*");
Both calls return true.

Get image link and text from string

I have this string
<div><img width="100px" src="http://www.mysite.com/Content/dataImages/news/small/some-pic.png" /><br />This is some text that I need to get.</div>
and I need to get the image link and the text "This is some text that I need to get." from the string above in Java. Can anybody tell me how I can do this?
Use regex to get what you want.
If this is all you have to do, there's no point in bringing in extra packages; just use regex.
The pattern "(?<=src=\")(.*?)(?=\")" can be used to get the link; you can modify it to get the text.
Try this, and just change the pattern if you must.
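
As a rough sketch of that lookaround approach: the link pattern below is the one given above, while the text pattern is my own guess at the "modify that" step and assumes the text always sits between "<br />" and "</div>" exactly as in the sample string. The class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImgTagFields {
    // Lookbehind/lookahead keep src=" and the closing quote out of the match.
    private static final Pattern LINK = Pattern.compile("(?<=src=\")(.*?)(?=\")");
    // Assumption: the text sits between "<br />" and "</div>" as in the sample.
    private static final Pattern TEXT = Pattern.compile("(?<=<br />)(.*?)(?=</div>)");

    static String link(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group() : null;
    }

    static String text(String html) {
        Matcher m = TEXT.matcher(html);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String str = "<div><img width=\"100px\" src=\"http://www.mysite.com/Content/dataImages/news/small/some-pic.png\" /><br />This is some text that I need to get.</div>";
        System.out.println(link(str));
        System.out.println(text(str));
    }
}
```

This only holds for markup shaped like the sample; anything less regular is better handled by an HTML parser.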
String str = "<div><img width=\"100px\" src=\"http://www.mysite.com/Content/dataImages/news/small/some-pic.png\" /><br />This is some text that I need to get.</div>";
Pattern p = Pattern.compile("src=\"(.*?)\" /><br />(.*?)</div>");
Matcher m = p.matcher(str);
if (m.find()) {
String link = m.group(1);
String text = m.group(2);
}
My solution was:
String tmp=xpp.nextText();
desc=android.text.Html.fromHtml(tmp).toString();
img=FindUrls.extractUrls(tmp);
For extracting the text from the string I used:
desc=android.text.Html.fromHtml(tmp).toString();
and for the link inside the string I used this function:
public static String extractUrls(String input) {
String result = null;
Pattern pattern = Pattern.compile(
    "\\b(((ht|f)tp(s?)\\:\\/\\/|~\\/|\\/)|www.)" +
    "(\\w+:\\w+#)?(([-\\w]+\\.)+(com|org|net|gov" +
    "|mil|biz|info|mobi|name|aero|jobs|museum" +
    "|travel|[a-z]{2}))(:[\\d]{1,5})?" +
    "(((\\/([-\\w~!$+|.,=]|%[a-f\\d]{2})+)+|\\/)+|\\?|#)?" +
    "((\\?([-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" +
    "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)" +
    "(&(?:[-\\w~!$+|.,*:]|%[a-f\\d]{2})+=?" +
    "([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)*)*" +
    "(#([-\\w~!$+|.,*:=]|%[a-f\\d]{2})*)?\\b");
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
result=matcher.group();
}
return result;
}
I hope it helps someone who has a similar problem.

Question Pattern/Matcher

I want to extract the value 5342test behind the name="buddyname" from a fieldset tag.
But there are multiple fieldsets in the HTML code.
Below the example of the string in the HTML.
<fieldset style="display:none"><input type="hidden" name="buddyname" value="5342test" /></fieldset>
I am having difficulty working out which pattern to put in Pattern.compile. I just want the value 5342test displayed, not the other results. Could somebody please help?
Thank you.
My code:
String stringToSearch = "5342test";
Pattern pattern = Pattern.compile("(\\value=\\})");
Matcher m = pattern.matcher(stringToSearch);
while (m.find())
{
// get the matching group
String codeGroup = m.group(1);
// print the group
System.out.format("'%s'\n", codeGroup); // should be 5342test
}
Use this pattern: Pattern pattern = Pattern.compile("<input[^>]*?value\\s*?=\\s*?\\\"(.*?)\\\"");
Since you want the input values inside a fieldset tag, you can use this regex pattern.
Pattern pattern = Pattern.compile("<fieldset[^>]*>[^<]*<input.+?value\\s*=\\s*\\\"([^\\\"]*)\\\"");
Matcher matcher = pattern.matcher("<fieldset style=\"display:none\"><input type=\"hidden\" name=\"buddyname\" value=\"5342test\" /></fieldset>");
if (matcher.find())
System.out.println(matcher.group(1)); //this prints 5342test
else
System.out.println("Input html does not have a fieldset");
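
Since the page has several fieldsets, it may be safer to key on name="buddyname" instead of taking the first input that has a value. The pattern below is only a sketch of that idea, not a general HTML parser: it assumes the name attribute comes before value inside the same input tag and that both use double quotes, as in the sample. The class and method names are made up.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BuddyNameValue {
    // [^>]* cannot cross a '>', so name and value must belong to the same <input> tag.
    // Assumption: name="buddyname" appears before value="..." inside that tag.
    private static final Pattern BUDDY = Pattern.compile(
            "<input\\b[^>]*\\bname\\s*=\\s*\"buddyname\"[^>]*\\bvalue\\s*=\\s*\"([^\"]*)\"");

    // Returns the value attribute of the buddyname input, or null if there is none.
    static String extract(String html) {
        Matcher m = BUDDY.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String html = "<fieldset><input type=\"hidden\" name=\"other\" value=\"x\" /></fieldset>"
                + "<fieldset style=\"display:none\"><input type=\"hidden\" name=\"buddyname\" value=\"5342test\" /></fieldset>";
        System.out.println(extract(html)); // 5342test
    }
}
```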
