Matching dates with Regex inside of a random string - java

I am trying to do this in Java.
I receive a string like this:
"12/07/2004dddsss12/10/2010ñrrñrñr10/01/2000ksdifjsdifffffdd04/04/1998"
Then I have to find one or more dates inside that string (date format: dd/mm/yyyy).
Finally, I have to copy the matched dates to another string: "12/07/2004 12/10/2010 10/01/2000 04/04/1998"
PS: I'm using http://regexpal.com/ to test whether it works. I tried some regexes from websites, but none of them worked for me.

You can separate validating the dates from extracting them.
To extract the dates:
String regex = "\\d{2}/\\d{2}/\\d{4}";
Check here at fiddle: http://fiddle.re/fa0bf
Code:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
String input = "12/07/2004dddsss12/10/2010ñrrñrñr10/01/2000ksdifjsdifffffdd04/04/1998";
String regex = "\\d{2}/\\d{2}/\\d{4}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
// find() advances to the next match on every call
while (matcher.find()) {
    System.out.println(matcher.group());
}
This prints:
12/07/2004
12/10/2010
10/01/2000
04/04/1998
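The separation between extraction and validation mentioned above could be sketched as follows. This is only an illustration under assumptions: the class name DateExtractor, the method extractValidDates and the sample input are made up for the example, and java.time is used for the validity check because the answer does not say how it should be done.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the regex finds date-shaped text, java.time decides whether it is a real date.
public class DateExtractor {
    private static final Pattern DATE_SHAPE = Pattern.compile("\\d{2}/\\d{2}/\\d{4}");
    // STRICT rejects impossible dates such as 31/02/2010; it requires "uuuu" rather than "yyyy".
    private static final DateTimeFormatter STRICT_DATE =
            DateTimeFormatter.ofPattern("dd/MM/uuuu").withResolverStyle(ResolverStyle.STRICT);

    static String extractValidDates(String input) {
        List<String> valid = new ArrayList<>();
        Matcher matcher = DATE_SHAPE.matcher(input);
        while (matcher.find()) {
            String candidate = matcher.group();
            try {
                LocalDate.parse(candidate, STRICT_DATE);
                valid.add(candidate);
            } catch (DateTimeParseException e) {
                // Date-shaped but not a real calendar date: skip it.
            }
        }
        // Join the surviving dates with single spaces, as the question asks.
        return String.join(" ", valid);
    }

    public static void main(String[] args) {
        String input = "12/07/2004dddsss31/02/2010xyz04/04/1998";
        System.out.println(extractValidDates(input)); // prints "12/07/2004 04/04/1998"
    }
}
```

Keeping the regex loose and leaving the calendar rules to the date parser avoids encoding month lengths and leap years in the pattern itself.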

Related

How to parse a string to get array of #tags out of the string?

So I have a string like
"#tag1 #tag2 #tag3 not_tag1 not_tag2 #tag4" (the space between tag2 and tag4 is to indicate that there can be many spaces). From this string I want to parse just tag1, tag2 and so on. They are similar to the #tags we see on LinkedIn or other social media. Is there an easy way to do this using regex or another function in Java, or should I do it the hard way (i.e. using loops and conditions)?
A tag starts with "#" and ends with a space " ". In between there can be letters or digits, but the first character must be a letter.
example,
input : "#tag1 #tag2 #tag3 not_tag1 not_tag2 #12tag #tag4"
output : ["tag1", "tag2", "tag3", "tag4"]
Split by regex: "#\w+"
EDIT: this is the correct regex, but split is not the right method.
Same approach as javadev suggested, but use a Matcher instead:
String input = "#tag1 #tag2 #tag3 not_tag1 not_tag2 #12tag #tag4";
Matcher matcher = Pattern.compile("#\\w+").matcher(input);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}
Each printed tag keeps its leading #. Note that this pattern also matches #12tag, because \w accepts digits.
Maybe something like:
public static void main(String[] args) {
    String input = "#tag1 #tag2 #tag3 not_tag1 not_tag2 #12tag #tag4";
    // [A-Za-z] rather than [A-z]: the range A-z also covers [, \, ], ^, _ and `
    Pattern pattern = Pattern.compile("#([A-Za-z][A-Za-z0-9]*) *");
    Matcher matcher = pattern.matcher(input);
    while (matcher.find()) {
        // group(1) is the tag without the leading #
        System.out.println(matcher.group(1));
    }
}
worked for me :)
Output:
tag1
tag2
tag3
tag4
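Since the question asks for the tags as an array-like result rather than printed lines, here is a minimal sketch that collects them into a list. The class name HashtagParser and the method parseTags are hypothetical, and the lookarounds are one possible way to require that a tag is delimited by whitespace (or the start or end of the string).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that returns the tags without their leading '#'.
public class HashtagParser {
    // (?<!\S): the '#' must be at the start or preceded by whitespace.
    // (?!\S):  the tag must be followed by whitespace or the end of the string.
    private static final Pattern TAG =
            Pattern.compile("(?<!\\S)#([A-Za-z][A-Za-z0-9]*)(?!\\S)");

    static List<String> parseTags(String input) {
        List<String> tags = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        while (matcher.find()) {
            tags.add(matcher.group(1));
        }
        return tags;
    }

    public static void main(String[] args) {
        String input = "#tag1  #tag2 #tag3 not_tag1 not_tag2 #12tag #tag4";
        System.out.println(parseTags(input)); // prints [tag1, tag2, tag3, tag4]
    }
}
```

If an actual array is needed, `parseTags(input).toArray(new String[0])` converts the list.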

Java Regex : Extract a specific pattern from a string "I_INSERT_TO_TOPIC_345674_123456_4.json"

I want to extract only "_123456_4" from this string using java Regex.
I_INSERT_TO_TOPIC_345674_123456_4.json
I have tried
Pattern.compile("(_([^_]*_[^_]))") and Pattern.compile("_" + "([^[0-9]]*)" + "_[0-9]") but these do not work.
If you want the two groups of digits just before .json, you can use a capturing group to find the required match. You can modify the pattern as per your requirement.
String s = "I_INSERT_TO_TOPIC_345674_123456_4.json";
Pattern p = Pattern.compile("(_\\d+_\\d+)\\.json");
Matcher matcher = p.matcher(s);
if (matcher.find()) {
    String group = matcher.group(1); // "_123456_4"
}
_[0-9]*_[0-9]*(?=\\.)
You can try this and see if it works.
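A runnable variant of the lookahead idea might look like the sketch below. It is an illustration only: the class name FileNameSuffix and the method extractSuffix are made up, and the pattern is tightened slightly (\d+ instead of [0-9]*, and the lookahead anchored to ".json" at the end) on the assumption that both digit groups are always present.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: returns the "_digits_digits" part that sits directly before ".json".
public class FileNameSuffix {
    // The lookahead (?=\.json$) checks for the extension without including it in the match.
    private static final Pattern SUFFIX = Pattern.compile("_\\d+_\\d+(?=\\.json$)");

    static Optional<String> extractSuffix(String fileName) {
        Matcher matcher = SUFFIX.matcher(fileName);
        return matcher.find() ? Optional.of(matcher.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        String s = "I_INSERT_TO_TOPIC_345674_123456_4.json";
        System.out.println(extractSuffix(s).orElse("no match")); // prints "_123456_4"
    }
}
```

Because the lookahead consumes nothing, matcher.group() already holds only the wanted text and no capturing group is needed.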

Regex: how to extract a JSESSIONID cookie value from cookie string?

I might receive the following cookie string.
hello=world;JSESSIONID=sdsfsf;Path=/ei
I need to extract the value of JSESSIONID
I use the following pattern but it doesn't seem to work. However https://regex101.com shows it's correct.
Pattern PATTERN_JSESSIONID = Pattern.compile(".*JSESSIONID=(?<target>[^;\\n]*)");
You can reach your goal with a simpler regex, (^|;)JSESSIONID=([^;]*), and read the value from the second capturing group using the Matcher class (the saved Regex101 expression was not linked in your question):
String cookie = "hello=world;JSESSIONID=sdsfsf;Path=/ei";
Pattern PATTERN_JSESSIONID = Pattern.compile("(^|;)JSESSIONID=([^;]*)");
Matcher m = PATTERN_JSESSIONID.matcher(cookie);
if (m.find()) {
    // group(0) would be the whole match, including ";JSESSIONID="
    System.out.println(m.group(2));
}
Output value:
sdsfsf
Of course, the result depends on the possible variations of the input text. The snippet above works whenever the value runs from JSESSIONID= up to the next ; or the end of the string.
You can try below regex:
JSESSIONID=([^;]+)
String cookies = "hello=world;JSESSIONID=sdsfsf;Path=/ei;submit=true";
Pattern pat = Pattern.compile("\\bJSESSIONID=([^;]+)");
Matcher matcher = pat.matcher(cookies);
boolean found = matcher.find();
System.out.println("Session ID: " + (found ? matcher.group(1) : "not found"));
You can also get what you are aiming for by splitting the string; the following works for me:
String s = "hello=world;JSESSIONID=sdsfsf;Path=/ei";
for (String part : s.split(";")) {
    // Match on the cookie name only, so the value does not have to be known in advance.
    if (part.startsWith("JSESSIONID=")) {
        System.out.println(part.substring("JSESSIONID=".length()));
    }
}
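The splitting approach can be generalized so that any cookie value can be looked up by name. The following is a rough sketch, not a full cookie parser: the class name CookieParser and the method parse are hypothetical, and it assumes simple name=value pairs separated by semicolons with no quoting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns "a=1;b=2" into a map of names to values.
public class CookieParser {
    static Map<String, String> parse(String cookieString) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String part : cookieString.split(";")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // no '=' in this segment, so there is no name/value pair to store
            }
            // Split at the first '=' only, so values that contain '=' stay intact.
            cookies.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
        }
        return cookies;
    }

    public static void main(String[] args) {
        Map<String, String> cookies = parse("hello=world;JSESSIONID=sdsfsf;Path=/ei");
        System.out.println(cookies.get("JSESSIONID")); // prints "sdsfsf"
    }
}
```

A missing cookie simply yields null from get, which may be easier to handle than a failed regex match.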

how to extract date from the given filename in java

I have file names like the ones below:
C:\Users\name\Documents\repository\zzz\xxx_yyy\new\aaa_bbb_ccc_ddd_eee_ZZ_E_20160801_20160831_v1-0.csv
C:\Users\name\Documents\repository\zzz\xxx_yyy\new\aaa_bbb_ppp_ccc_ddd_eee_ZZ_E_20160801_20160831_v1-0.csv
I have to write a single Java program that handles both file-name formats and extracts both dates from each file name.
Can you please help?
You should use Regular expressions to extract dates from filenames like these.
private static Date[] extractDatesFromFileName(File file) throws ParseException {
    Date[] dates = new Date[2];
    SimpleDateFormat dateFormatter = new SimpleDateFormat("yyyyMMdd");
    String regex = ".*(\\d{8})_(\\d{8}).*";
    Pattern pattern = Pattern.compile(regex);
    Matcher m = pattern.matcher(file.getName());
    if (m.find()) {
        dates[0] = dateFormatter.parse(m.group(1));
        dates[1] = dateFormatter.parse(m.group(2));
    }
    System.out.println(dates[0]);
    System.out.println(dates[1]);
    return dates;
}
A little explanation of the regex .*(\\d{8})_(\\d{8}).*:
.* stands for any character repeated zero or more times
(\\d{8}) stands for exactly eight digits (the parentheses make it a capturing group; this regex has two capturing groups, one for each date)
_ stands for a literal _ character
If the file name matches the pattern, both dates are extracted, parsed and returned as an array. You should add some error handling, for example for file names that do not match.
If you mean a Java program (not JavaScript), you can use a regex, something like the following, which captures the date immediately before _v1-0:
String in = "C:\\Users\\name\\Documents\\repository\\zzz\\xxx_yyy\\new\\aaa_bbb_ppp_ccc_ddd_eee_ZZ_E_20160801_20160831_v1-0.csv";
Pattern p = Pattern.compile("_(\\d{8})_v1-0");
Matcher m = p.matcher(in);
if (m.find()) {
    System.out.println(m.group(1));
}
I think you want to extract two dates which are present in each file path.
This could be done as follows:
String filename1 = "C:\\Users\\name\\Documents\\repository\\zzz\\xxx_yyy\\new\\aaa_bbb_ccc_ddd_eee_ZZ_E_20160801_20160831_v1-0.csv";
Pattern p = Pattern.compile("[0-9]{8}+_[0-9]{8}+");
Matcher m = p.matcher(filename1);
String[] dateStrArr = m.find() ? m.group(0).split("_") : null;
The first date will be at index 0 and the second date at index 1.
The same goes for the second file name.
Hope this helps.
Once extracted, you can also convert them to Date objects using SimpleDateFormat.
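The conversion step could be sketched as below. Note the swap: this example uses java.time (LocalDate with DateTimeFormatter.BASIC_ISO_DATE) instead of the SimpleDateFormat mentioned above, on the assumption that a Java 8 or later runtime is available. The class name FileNameDates and the method extractDates are made up for the illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds the "yyyyMMdd_yyyyMMdd" pair in a file name and parses both dates.
public class FileNameDates {
    private static final Pattern DATE_PAIR = Pattern.compile("(\\d{8})_(\\d{8})");

    static List<LocalDate> extractDates(String fileName) {
        Matcher m = DATE_PAIR.matcher(fileName);
        if (!m.find()) {
            return List.of(); // no date pair in this name
        }
        // BASIC_ISO_DATE parses the compact yyyyMMdd form; it throws DateTimeParseException
        // if the eight digits do not form a real date.
        return List.of(
                LocalDate.parse(m.group(1), DateTimeFormatter.BASIC_ISO_DATE),
                LocalDate.parse(m.group(2), DateTimeFormatter.BASIC_ISO_DATE));
    }

    public static void main(String[] args) {
        String name = "aaa_bbb_ppp_ccc_ddd_eee_ZZ_E_20160801_20160831_v1-0.csv";
        System.out.println(extractDates(name)); // prints [2016-08-01, 2016-08-31]
    }
}
```

The same pattern works for both file-name formats in the question, because it does not depend on how many underscore-separated segments come before the dates.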

How to extract word from string?

Suppose I have a string:
String message = "you should try http://google.com/";
Now, I want to send "http://google.com/" to a new
String url
What I want to do is:
check if a "word" in the string begins with "http://" and extract that word, where a word is
something that's surrounded by spaces (general english definition of word).
I have no idea how to extract the string, and the best I can do is use startsWith on the whole string. How do I use startsWith on a word, and extract the word?
Sorry if this is a little bit difficult to explain.
Thanks in advance!
EDIT: Also, what should I do to extract the word from the REGEX operation? And how should I handle it if there is more than 1 url in the string?
Use Pattern & Matcher classes.
String str = "blabla http://www.mywebsite.com blabla";
String regex = "((https?:\\/\\/)?(www.)?(([a-zA-Z0-9-]){2,}\\.){1,4}([a-zA-Z]){2,6}(\\/([a-zA-Z-_/.0-9#:+?%=&;,]*)?)?)";
Matcher m = Pattern.compile(regex).matcher(str);
if (m.find()) {
    String url = m.group(); // value "http://www.mywebsite.com"
}
This regex works for http://..., https://... and even www... URLs. Other regexes can easily be found on the net.
You can try this:
String str = "blabla http://www.mywebsite.com blabla";
Matcher m = Pattern.compile("(http://.*)").matcher(str);
if (m.find()) {
    String url = (new StringTokenizer(m.group(), " ")).nextToken();
}
The "correct" way to perform this task is to split the String on whitespace with String#split("\\s+") and then pass each word to the URL constructor. If a word starts with your prefix and a MalformedURLException is thrown, it is invalid. The URL class constructor is far better tested and more robust than any solution that you or I could come up with, so please use it and don't reinvent the wheel.
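A minimal sketch of the split-and-validate approach follows. Two things in it are assumptions rather than part of the answer: it goes through new URI(word).toURL() instead of calling the URL constructor directly (the URL constructors are deprecated in recent JDKs), and it accepts https:// in addition to the http:// prefix from the question. The class name UrlWords and the method findUrls are hypothetical.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a message into words and keeps those that parse as URLs.
public class UrlWords {
    static List<URL> findUrls(String message) {
        List<URL> urls = new ArrayList<>();
        for (String word : message.trim().split("\\s+")) {
            // Only words with the expected prefix are worth validating.
            if (!word.startsWith("http://") && !word.startsWith("https://")) {
                continue;
            }
            try {
                urls.add(new URI(word).toURL());
            } catch (URISyntaxException | MalformedURLException | IllegalArgumentException e) {
                // The word has the prefix but is not a well-formed URL: ignore it.
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        String message = "you should try http://google.com/";
        System.out.println(findUrls(message)); // prints [http://google.com/]
    }
}
```

Because every word is checked, a message with more than one URL yields all of them in order.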
You can use a Java regex for this.
The following regex matches any string starting with http:// or https:// up to the next whitespace character:
Pattern urlPattern = Pattern.compile("https?://\\S*");
Matcher matcher = urlPattern.matcher(myString);
if (matcher.find()) {
    String url = matcher.group();
}
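For the case raised in the question's edit, where the string contains more than one URL, the same kind of pattern can be applied in a loop. This is a small sketch with made-up names (UrlFinder, findAll); it does not validate the URLs, it only collects every run of non-whitespace characters that starts with http:// or https://.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects every URL-looking word in the input.
public class UrlFinder {
    private static final Pattern URL_PATTERN = Pattern.compile("https?://\\S+");

    static List<String> findAll(String text) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(text);
        // Each call to find() continues after the previous match.
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "try http://google.com/ or https://example.org/page today";
        System.out.println(findAll(text)); // prints [http://google.com/, https://example.org/page]
    }
}
```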
