I would like to create my own pattern to split a String with a regular expression.
Specifically, I want to split a String into sentences, so I need a pattern like ". \p{Upper}".
I've tried to code it, but Java doesn't accept it:
String[] phrase = txtbrut.split(". \p{Upper}");
Basically, I need to split the text on a pattern like: dot-space-capital letter.
Does anyone know how to write such a pattern?
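To split the string into sentences, you could do:
String[] sentences = txtbrut.split("\\. (?=\\p{Upper})");
The backslashes must be doubled inside a Java string literal, the dot is escaped so that it matches a literal dot, and the lookahead (?=...) checks for the capital letter without removing it from the next sentence.
Note: as per Stephan's comment, this will not handle abbreviations or ellipses.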
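A minimal sketch of that split, including the abbreviation limitation mentioned above. The class name, method name, and sample sentences are made up for illustration:

```java
import java.util.Arrays;

class SentenceSplitDemo {
    // Splits on "dot + space" only when an uppercase letter follows.
    // The lookahead (?=\p{Upper}) checks for the capital without consuming it.
    static String[] splitSentences(String text) {
        return text.split("\\. (?=\\p{Upper})");
    }

    public static void main(String[] args) {
        // Plain case: two sentences.
        System.out.println(Arrays.toString(splitSentences("It rains. We stay home.")));
        // Limitation: "Dr. " is followed by a capital letter, so it is split as well.
        System.out.println(Arrays.toString(splitSentences("Ask Dr. Smith. He knows.")));
    }
}
```

Note that \p{Upper} matches only ASCII uppercase letters by default, so sentences starting with accented capitals would not be detected by this sketch.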
I have a regex pattern like "(\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2})"
I am passing this pattern as an argument to a function that tokenizes the input string on ",".
Example:
func((\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2}),func(n))";
How do I escape the comma in the regex while tokenizing?
Could you please post the function that tokenizes the string? That would make it easier to help with your actual code.
Without that information, you could use split() as follows (if all you want is to split on "," while skipping commas escaped with a backslash):
String s = "Messages,Hello\\,World,Hobbies,Java\\,Programming";
System.out.println(Arrays.toString(s.split("(?<!\\\\),")));
Reference: http://www.javacreed.com/how-to-split-a-string-with-escaped-delimiters/
You could replace your code with:
String str = "(\\d{4}\\,\\d{2}\\,\\d{2} \\d{2}:\\d{2}:\\d{2}), func(a)";
String[] tokens = str.split("(?<!\\\\),");
System.out.println(Arrays.toString(tokens));
This will give you a String array of tokens split on every "," that is not preceded by a backslash.
@Derryl Thomas's answer is probably the correct one.
Here is an alternative technique:
Use something else to indicate the comma in your regex.
Split based on commas.
Change the "something else" back to a comma.
For example:
Instead of "(\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2})"
Use "(\\d{4}boppity\\d{2}boppity\\d{2} :\\d{2}:\\d{2}:\\d{2})"
Do the split based on comma.
Change the "boppity" in the regex to a ","; perhaps like this:
newStringVariable = yourStringVariable.replace("boppity", ",");
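The three steps above can be sketched as follows. This is only an illustration: the class and method names are hypothetical, and it assumes the placeholder text never occurs in the real input.

```java
import java.util.Arrays;

class PlaceholderSplitDemo {
    // Hypothetical placeholder; any token that cannot occur in the input would do.
    static final String PLACEHOLDER = "boppity";

    // Splits on real commas, then restores the commas hidden behind the placeholder.
    static String[] tokenize(String input) {
        String[] tokens = input.split(",");
        for (int i = 0; i < tokens.length; i++) {
            tokens[i] = tokens[i].replace(PLACEHOLDER, ",");
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "(\\d{4}boppity\\d{2}boppity\\d{2} :\\d{2}:\\d{2}:\\d{2}),func(n)";
        // Two tokens: the regex with its commas restored, and func(n).
        System.out.println(Arrays.toString(tokenize(input)));
    }
}
```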
I want to match a string which occurs after a certain pattern but I am not able to come up with a regex to do that (I am using Java).
For example, let's say I have this string,
caa,abb,ksmf,fsksf,fkfs,admkf
and I want my regex to match only those commas which are prefixed by abb. How do I do that? Is it even possible using regexes?
If I use the regex abb, it matches the whole string abb, but I only want to match the comma after that.
I ask this because I wanted to use this regex in a split method which accepts a regex. If I pass abb, as the regex, it will consider the string abb, to be the delimiter and not the , which I want.
Any help would be greatly appreciated.
String test = "caa,abb,ksmf,fsksf,fkfs,admkf";
String regex = "(?<=abb),";
String[] split = test.split(regex);
for (String s : split) {
    System.out.println(s);
}
Output:
caa,abb
ksmf,fsksf,fkfs,admkf
See here for information:
https://www.regular-expressions.info/lookaround.html
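To make the difference concrete, here is a small sketch that compares the two delimiters side by side. The class and method names are made up for illustration:

```java
import java.util.Arrays;

class LookbehindSplitDemo {
    // "abb," is consumed as the delimiter, so "abb" disappears from the result.
    static String[] splitConsuming(String s) {
        return s.split("abb,");
    }

    // Only the comma is the delimiter; (?<=abb) merely checks what precedes it.
    static String[] splitLookbehind(String s) {
        return s.split("(?<=abb),");
    }

    public static void main(String[] args) {
        String test = "caa,abb,ksmf,fsksf,fkfs,admkf";
        System.out.println(Arrays.toString(splitConsuming(test)));
        System.out.println(Arrays.toString(splitLookbehind(test)));
    }
}
```

The first call yields "caa," and "ksmf,fsksf,fkfs,admkf", while the second keeps "abb" attached to the first part.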
Hi guys, it's been a while since I last asked a question.
I have this String, which consists of names and numbers.
Ex.
String myString = "give11arrow123test2356read809cell1245cable1257give222...";
What I am trying to do is split it after each number that is attached to a name,
so that I get a result like this:
give11, arrow123, test2356, read809, cell1245, cable1257, give222, ....
I could use this code, but I can't find the right regex:
String[] arrayString = myString.split("Regex");
Thanks for your help.
You can use a combination of lookarounds to split your string.
Lookarounds are zero-width assertions: they do not consume any characters of the string. They only check whether a pattern can (or cannot) be matched looking ahead of or behind the current position, without adding those characters to the overall match. Here, (?<=\d) requires a digit just before the split point and (?=\D) requires a non-digit just after it.
String s = "give11arrow123test2356read809cell1245cable1257give222...";
String[] parts = s.split("(?<=\\d)(?=\\D)");
System.out.println(Arrays.toString(parts));
Output
[give11, arrow123, test2356, read809, cell1245, cable1257, give222, ...]
Use this regex for splitting:
String regex = "(?<=\\d)(?=\\D)";
I am unfamiliar with using regex in Java, but this expression matches what you need on www.rubular.com:
([A-Za-z]+[0-9]+)
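That expression describes the tokens themselves rather than the split points, so in Java it would be used with Matcher.find() instead of split(). A minimal sketch, with hypothetical class and method names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NameNumberMatchDemo {
    // One or more letters followed by one or more digits, e.g. "give11".
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z]+[0-9]+");

    // Collects every letters-then-digits token found in the input.
    static List<String> extractTokens(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(extractTokens("give11arrow123test2356..."));
    }
}
```

Unlike the split-based answer, this approach silently drops anything that does not fit the pattern, such as the trailing "...".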
I am new to Java. I have a string:
"rdl_mod_id:0123456789\n\nrdl_mod_name:Driving Test\n\nrdl_mod_type:PUBL\n\nrdl_mod_mode:Practice\n\nrdl_mod_date:2013-04-23"
What I want is to extract the text Driving Test. The value changes dynamically, so I need to get whatever sits between rdl_mod_name: and the following \n.
Try the following; it should work in your case:
String str = "rdl_mod_id:0123456789\n\nrdl_mod_name:Driving Test\n\nrdl_mod_type:PUBL\n\nrdl_mod_mode:Practice\n\nrdl_mod_date:2013-04-23";
Pattern pattern = Pattern.compile("rdl_mod_name:(.*?)\n");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}
As shown, you can use regular expressions (Pattern and Matcher) to get the desired result.
The following links will also give you a fair idea:
Extract string between two strings in java
Java- Extract part of a string between two special characters
How to get a string between two characters?
I would look into Java regular expressions (regex). Note that String.matches() only tests whether the entire string matches a regex; it does not return the matched text, so to extract the value you need Pattern and Matcher with a capturing group, for example "rdl_mod_name:(.*?)\n". The '.*?' matches any run of characters (as few as possible), so in this context it captures everything between rdl_mod_name: and the next \n. Backslashes (not forward slashes) introduce special characters: in a Java string literal "\n" is already a real newline, while regex escapes such as \d must be written with a doubled backslash ("\\d").
Use Java's substring() method together with indexOf(), or split the string.
Try this code:
String s = "rdl_mod_id:0123456789\n\nrdl_mod_name:Driving Test\n\nrdl_mod_type:PUBL\n\nrdl_mod_mode:Practice\n\nrdl_mod_date:2013-04-23";
String[] sArr = s.split("\n\n");
String[] sArr1 = sArr[1].split(":");
System.out.println("sArr1[1] : " + sArr1[1]);
The call s.split("\n\n") splits the string on each \n\n.
The second split, sArr[1].split(":"), splits the second element of sArr on ":", i.e. it splits rdl_mod_name:Driving Test into rdl_mod_name and Driving Test.
sArr1[1] is your desired result. Note that this relies on rdl_mod_name being the second record and on the value containing no ":".
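The substring()/indexOf() variant mentioned above could look like the following sketch. The class and method names are hypothetical, and it assumes each value ends at the next newline (or at the end of the string):

```java
class FieldExtractDemo {
    // Returns the text between the given key and the next '\n',
    // or null when the key does not occur in the text.
    static String valueOf(String text, String key) {
        int start = text.indexOf(key);
        if (start < 0) {
            return null;
        }
        start += key.length();
        int end = text.indexOf('\n', start);
        if (end < 0) {
            end = text.length(); // key is on the last line
        }
        return text.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "rdl_mod_id:0123456789\n\nrdl_mod_name:Driving Test\n\nrdl_mod_type:PUBL";
        System.out.println(valueOf(s, "rdl_mod_name:"));
    }
}
```

This does not depend on the position of the record in the string, only on the key being present.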
I've never done this before, but basically I'm trying to break a large string up into substrings (based on a regular expression) and then make use of those substrings one at a time. Can anyone show me the easiest way to do this? I just don't quite know how to use the methods of pattern and matcher.
Thanks!
java.lang.String.split() takes a regular expression and will split the string, returning a String[] containing the substrings:
String s = "a:very:big:string";
String[] parts = s.split(":");
for (String part : parts) {
    System.out.println(part);
}
You don't need to use the Pattern and Matcher classes to achieve this.
Basic info about pattern matching in Java:
http://docs.oracle.com/javase/tutorial/essential/regex/
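If the same delimiter is applied to many strings, a precompiled Pattern can be reused instead of calling String.split() each time. A minimal sketch, assuming a comma delimiter with optional surrounding whitespace (the class name, method name, and delimiter are chosen for illustration only):

```java
import java.util.regex.Pattern;

class PatternSplitDemo {
    // Compiled once and reused: a comma with optional whitespace around it.
    private static final Pattern DELIMITER = Pattern.compile("\\s*,\\s*");

    static String[] splitFields(String line) {
        return DELIMITER.split(line);
    }

    public static void main(String[] args) {
        // Use the substrings one at a time.
        for (String field : splitFields("a , b,c")) {
            System.out.println(field);
        }
    }
}
```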