How to escape a delimiter while tokenizing a string in Java

I have a regex pattern like "(\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2})"
I am passing this pattern as argument to a function which tokenizes the input string based on ",".
Example:
func((\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2}),func(n))";
How do I escape the comma in the regex while tokenizing?

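Can you please post the function that tokenizes the string? That would make it easier to give an answer specific to your code.
Without that information, you could use split() as follows (if all you want to do is split on ","). The negative lookbehind keeps commas that are preceded by a backslash from being treated as delimiters: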
String s = "Messages,Hello\\,World,Hobbies,Java\\,Programming";
System.out.println(Arrays.toString(s.split("(?<!\\\\),")));
Refer - http://www.javacreed.com/how-to-split-a-string-with-escaped-delimiters/
You could replace your code with:
String str = "(\\d{4}\\,\\d{2}\\,\\d{2} \\d{2}:\\d{2}:\\d{2}), func(a)";
String[] tokens = str.split("(?<!\\\\),");
System.out.println(Arrays.toString(tokens));
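This will give you a String array of tokens split on every "," that is not preceded by a backslash. The escaped commas keep their backslash inside the tokens.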
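As a rough, self-contained sketch of the approach above (the class and method names are made up for illustration, and the input format is assumed to be "escaped regex, then other arguments"), the split can be followed by a step that turns each "\," back into a plain comma:

```java
import java.util.Arrays;

class EscapedSplitDemo {
    // Splits on commas that are NOT immediately preceded by a backslash.
    static String[] splitOnUnescapedCommas(String input) {
        return input.split("(?<!\\\\),");
    }

    // Turns each escaped comma "\," back into a plain ",".
    static String unescapeCommas(String token) {
        return token.replace("\\,", ",");
    }

    public static void main(String[] args) {
        // Runtime value: (\d{4}\,\d{2}\,\d{2} \d{2}:\d{2}:\d{2}),func(n)
        String input = "(\\d{4}\\,\\d{2}\\,\\d{2} \\d{2}:\\d{2}:\\d{2}),func(n)";
        for (String token : splitOnUnescapedCommas(input)) {
            System.out.println(unescapeCommas(token));
        }
        // prints:
        // (\d{4},\d{2},\d{2} \d{2}:\d{2}:\d{2})
        // func(n)
    }
}
```

One limitation of this sketch: a comma preceded by an escaped backslash (two backslashes, then a comma) is still treated as escaped, so it only fits input where that combination does not occur.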

@Derryl Thomas's answer is probably the correct answer.
Here is an alternate technique.
Use something else to indicate the comma in your regex.
Split based on commas.
Change the "something else" back to a comma.
For example:
Instead of "(\\d{4},\\d{2},\\d{2} :\\d{2}:\\d{2}:\\d{2})"
Use "(\\d{4}boppity\\d{2}boppity\\d{2} :\\d{2}:\\d{2}:\\d{2})"
Do the split based on comma.
Change the "boppity" in the regex back to a ","; perhaps like this:
newStringVariable = yourStringVariable.replace("boppity", ",");
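Put together, the placeholder steps might look roughly like this. It is only a sketch: the class and method names are hypothetical, and it assumes the marker text ("boppity" here) can never appear in real input.

```java
import java.util.ArrayList;
import java.util.List;

class PlaceholderSplitDemo {
    // Marker standing in for a literal comma; assumed never to occur in real data.
    static final String PLACEHOLDER = "boppity";

    // Splits on every comma, then restores the commas hidden behind the marker.
    static List<String> splitAndRestore(String input) {
        List<String> tokens = new ArrayList<>();
        for (String token : input.split(",")) {
            tokens.add(token.replace(PLACEHOLDER, ","));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "(\\d{4}boppity\\d{2}boppity\\d{2} :\\d{2}:\\d{2}:\\d{2}),func(n)";
        for (String token : splitAndRestore(input)) {
            System.out.println(token);
        }
        // prints:
        // (\d{4},\d{2},\d{2} :\d{2}:\d{2}:\d{2})
        // func(n)
    }
}
```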

Related

Matching a string which occurs after a certain pattern

I want to match a string which occurs after a certain pattern but I am not able to come up with a regex to do that (I am using Java).
For example, let's say I have this string,
caa,abb,ksmf,fsksf,fkfs,admkf
and I want my regex to match only those commas which are prefixed by abb. How do I do that? Is it even possible using regexes?
If I use the regex abb, it matches the whole string abb, but I only want to match the comma after that.
I ask this because I wanted to use this regex in a split method which accepts a regex. If I pass abb, as the regex, it will consider the string abb, to be the delimiter and not the , which I want.
Any help would be greatly appreciated.
String test = "caa,abb,ksmf,fsksf,fkfs,admkf";
String regex = "(?<=abb),";
String[] split = test.split(regex);
for(String s : split){
System.out.println(s);
}
Output:
caa,abb
ksmf,fsksf,fkfs,admkf
See here for information:
https://www.regular-expressions.info/lookaround.html

How can we remove a ':' characters from a string?

I have strings like
#lle #mme: #crazy #upallnight:
I would like to remove the words that start with either @ or #. This works fine as long as those words don't contain the ':' character. However, the ':' character is left behind whenever I delete the words. Therefore I decided to replace the ':' characters before deleting the words, using the String.replace() method. However, they are still not removed.
String example = "#lle #mme: #crazy #upallnight:";
example.replace(':',' ');
The result : #lle #mme: #crazy #upallnight:
I am pretty stuck here; any help would be appreciated.
You can do this:
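example = example.replaceAll(" +[@#][^ ]+", "");
What this will do is replace any substring that matches the regex " +[@#][^ ]+" (one or more spaces, then @ or #, then one or more non-space characters) with the empty string. Because [^ ]+ also consumes a trailing ':', the colon is removed together with the word. Note that the leading " +" means a word at the very start of the string is not matched.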
From Java docs:
String s = "Abc: abc#:";
String result = s.replace(':',' ');
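Output in variable result: "Abc  abc# " (each ':' is replaced by a space).
I think you forgot to store the value returned by the replace() method in a String variable. Strings are immutable, so replace() returns a new String and leaves the original unchanged.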
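A small sketch of both points, for illustration only. The class and method names are invented, and stripTaggedWords assumes the unwanted words start with '@' or '#' and are separated by whitespace:

```java
class ReplaceDemo {
    // replace() returns a new String, so the result has to be used or assigned.
    static String removeColons(String text) {
        return text.replace(":", "");
    }

    // Drops whitespace-delimited words starting with '@' or '#'
    // (a trailing ':' goes with them), then tidies up leftover spaces.
    static String stripTaggedWords(String text) {
        return text.replaceAll("[@#]\\S+", "").trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String example = "#lle #mme: #crazy #upallnight:";
        example.replace(':', ' ');                 // result discarded: example is unchanged
        System.out.println(example);               // #lle #mme: #crazy #upallnight:
        System.out.println(removeColons(example)); // #lle #mme #crazy #upallnight
        System.out.println(stripTaggedWords("hello @lle #mme: world #upallnight:")); // hello world
    }
}
```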

Split String if it has number

Hi guys, it's been a while since I last asked a question.
I have this String, which consists of names each followed by a number
Ex.
String myString = "give11arrow123test2356read809cell1245cable1257give222..."
Now what I am trying to do is to split it whenever there is a number attached to it
I have to split it so that I could have a result like this
give11, arrow123, test2356, read809, cell1245, cable1257, give222, ....
I could use this code, but I can't find the right regex
String[] arrayString = myString.split("Regex")
Thanks for your help.
You can use a combination of lookarounds to split your string.
Lookarounds are zero-width assertions: they check whether a pattern can (or cannot) be matched just before or just after the current position, without consuming any characters or adding them to the overall match. Here, (?<=\d)(?=\D) matches the empty position between a digit and a following non-digit, so split cuts there and loses no characters.
String s = "give11arrow123test2356read809cell1245cable1257give222...";
String[] parts = s.split("(?<=\\d)(?=\\D)");
System.out.println(Arrays.toString(parts));
Output
[give11, arrow123, test2356, read809, cell1245, cable1257, give222, ...]
Use this regex for splitting:
String regex = "(?<=\\d)(?=\\D)";
I am unfamiliar with using regex in Java, but this expression matches what you need on www.rubular.com:
([A-Za-z]+[0-9]+)
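For what it's worth, that last expression can be used from Java by matching tokens instead of splitting. A minimal sketch, with invented class and method names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LetterDigitTokens {
    // One or more ASCII letters followed by one or more digits.
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z]+[0-9]+");

    // Collects every non-overlapping match, left to right.
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher matcher = TOKEN.matcher(input);
        while (matcher.find()) {
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "give11arrow123test2356read809cell1245cable1257give222...";
        System.out.println(tokens(s));
        // prints [give11, arrow123, test2356, read809, cell1245, cable1257, give222]
    }
}
```

Unlike the split-based version, anything that does not fit the letters-then-digits shape (such as the trailing "...") is simply skipped.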

Splitting a String with regex in Java

I would like to create my own pattern to split a String with a regular expression.
Specifically, I want to split a String into sentences, so I need a pattern like ". \p{Upper}"
I've tried to code it, but Java doesn't accept it:
String[] phrase = txtbrut.split(". \p{Upper}");
Basically, I need to split the text String with a pattern like: dot, space, capital letter.
Does anyone know how to create such a pattern?
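To split the string into sentences, escape the dot, double each backslash (a Java string literal needs \\ to produce a single regex backslash, which is why "\p{Upper}" does not compile), and use a lookahead so the capital letter is not consumed: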
String[] sentences = txtbrut.split("\\. (?=\\p{Upper})");
Note that, as per Stephan's comment, this will not handle abbreviations or ellipses correctly.
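Here is a small runnable sketch of that split, including the abbreviation problem. The class name, method names and sample text are made up; keepPeriods is an assumed variant that uses a lookbehind so the dot stays with its sentence:

```java
import java.util.Arrays;

class SentenceSplitDemo {
    // Splits at "dot, space" when a capital letter follows; the ". " is consumed.
    static String[] sentences(String text) {
        return text.split("\\. (?=\\p{Upper})");
    }

    // Variant: only the space is consumed, so each sentence keeps its dot.
    static String[] keepPeriods(String text) {
        return text.split("(?<=\\.) (?=\\p{Upper})");
    }

    public static void main(String[] args) {
        String text = "It rained. We stayed in. Dr. Smith called.";
        System.out.println(Arrays.toString(sentences(text)));
        // prints [It rained, We stayed in, Dr, Smith called.]
        System.out.println(Arrays.toString(keepPeriods("It rained. We stayed in.")));
        // prints [It rained., We stayed in.]
    }
}
```

The "Dr" token in the first output shows the abbreviation case that this simple pattern gets wrong.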

Java String Split on any character (including regex special characters)

I'm sure I'm just overlooking something here...
Is there a simple way to split a String on an explicit character without applying RegEx rules?
For instance, I receive a string with a dynamic delimiter, I know the 5th character defines the delimiter.
String s = "This,is,a,sample";
For this, it's simple to do
String delimiter = String.valueOf(s.charAt(4));
String[] result = s.split(delimiter);
However, when I have a delimiter that's a special RegEx character, this doesn't work:
String s = "This*is*a*sample";
So... is there a way to split the string on an explicit character without trying to apply extra RegEx rules? I feel like I must be missing something pretty simple.
split takes a regular expression as its argument. * is a metacharacter (a quantifier meaning zero or more of the preceding element), so on its own it is not a valid pattern. You could use Pattern#quote to have the character treated literally:
String[] result = s.split(Pattern.quote(delimiter));
You need not worry about the character type if you compile the Pattern with the LITERAL flag:
Pattern regex = Pattern.compile(String.valueOf(s.charAt(4)), Pattern.LITERAL);
Matcher matcher = regex.matcher(yourString);
if (matcher.find()){
//do something
}
You can run Pattern.quote on the delimiter before passing it to split. This returns a literal pattern string in which regex-specific characters lose their special meaning:
delimiter = Pattern.quote(delimiter);
String[] result = s.split(delimiter);
Alternatively, pass the original (unquoted) delimiter to StringUtils.split:
StringUtils.split(s, delimiter);
That will treat the delimiter as just a character, not use it like a regex.
StringUtils is part of the Apache Commons Lang library, which has many useful methods. It is worth taking a look; it could save you some time in the future.
Simply put your delimiter between []
String delimiter = "["+s.charAt(4)+"]";
String[] result = s.split(delimiter);
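A character class [ ] matches any single character listed between the brackets, and most metacharacters such as * are treated literally inside it. You can also specify a list of delimiters like [*,.+-]. A few characters, such as \ and ^, are still special inside a class, so this approach fails when the delimiter is one of them.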
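To tie the answers together, here is a minimal sketch of the Pattern.quote approach with a delimiter read from the string itself. The class and method names are made up, and it assumes (as in the question) that the character at index 4 is always the delimiter:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class LiteralSplitDemo {
    // Reads the delimiter from index 4 and splits on it as a literal,
    // so regex metacharacters such as '*', '.', or '|' are safe.
    static String[] splitOnFifthChar(String s) {
        String delimiter = String.valueOf(s.charAt(4));
        return s.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnFifthChar("This,is,a,sample")));
        System.out.println(Arrays.toString(splitOnFifthChar("This*is*a*sample")));
        System.out.println(Arrays.toString(splitOnFifthChar("This.is.a.sample")));
        // each line prints [This, is, a, sample]
    }
}
```

Pattern.quote needs no special cases for individual characters, which is the main reason to prefer it over wrapping the delimiter in brackets.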
