Regex to find specific pattern string from string - java

How can I extract ricCode:.ABC from the following string?
String str = "{AMX:{ricCode:.ABC,indexDetailEnable:true,price:20,648.15,netChange:<spanclass=\"md-down\">-41.09</span>,percentChange:<spanclass=\"md-down\">-0.20%</span>,tradeDate:17/04/05,tradeTime:16:40,chartDate:17/04/05,chartTime:16:40}";
My matcher is:
Matcher matcher = Pattern.compile("ricCode:([A-Za-z]+),$").matcher(str);
What is missing in the regex?

Change your regex from this:
ricCode:([A-Za-z]+),$
to this:
ricCode:([A-Za-z.]+)(?=,)
Your original regex only allows alphabetic characters after ricCode:, but your example has a period (.) there, so I added . to the character class. You were also matching the , character itself, which pulls the comma into the overall match. You don't want that, so I replaced it with a positive lookahead, which checks that a comma follows without consuming it. Finally, the $ at the end of your regex anchors the match to the end of the string. The string does not end right after that comma, so I removed it.
It helps to use regexr.com to test out your expression.
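Here is a minimal sketch of the corrected pattern in Java. The class and method names (RicCodeDemo, extractRicCode) are made up for illustration, and the input is a shortened version of the string from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RicCodeDemo {
    // (?=,) requires a comma right after the code but does not consume it.
    private static final Pattern RIC = Pattern.compile("ricCode:([A-Za-z.]+)(?=,)");

    /** Returns the captured code (for example ".ABC"), or null when nothing matches. */
    static String extractRicCode(String input) {
        Matcher m = RIC.matcher(input);
        // find() looks for the pattern anywhere in the input;
        // matches() would require the whole string to match.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String str = "{AMX:{ricCode:.ABC,indexDetailEnable:true,price:20,648.15}";
        System.out.println(extractRicCode(str)); // .ABC
    }
}
```

Use group(1) for just .ABC; group() returns the whole match, ricCode:.ABC.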

Related

Java String Split using Regex with Escape Character

I have a string that needs to be split on a delimiter (:). The delimiter can be escaped by a character (say '?'), and it can be preceded by any number of escape characters. Consider the example string below:
a:b?:c??:d???????:e
Here, after the split, it should give the below list of string:
a
b?:c??
d???????:e
Basically, if the delimiter (:) is preceded by an even number of escape characters, it should split. If it is preceded by an odd number of escape characters, it should not split. Is there a solution to this with regex?
Any help would be greatly appreciated.
A similar question has been asked earlier here, but the answers do not work for this use case.
Update:
The solution with the regex (?:\?.|[^:?])* splits the string correctly. However, it also produces a few empty strings. If + is used instead of *, the genuinely empty tokens are dropped as well (e.g. a::b gives only a,b).
Scenario 1: No empty matches
You may use
(?:\?.|[^:?])+
Or, following the pattern in the linked answer
(?:\?.|[^:?]++)+
See this regex demo
Details
(?: - start of a non-capturing group
\?. - a ? (the escape char) followed by any char
| - or
[^:?] - any char but the : (your delimiter char) and ? (the escape char)
)+ - 1 or more repetitions.
In Java:
String regex = "(?:\\?.|[^:?]++)+";
In case the input contains line breaks, prepend the pattern with (?s) (like (?s)(?:\\?.|[^:?])+) or compile the pattern with Pattern.DOTALL flag.
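As a rough sketch, this is how the Scenario 1 pattern could be used to collect the tokens. The class and method names (EscapedSplitDemo, tokens) are only illustrative, and (?s) is included on the assumption that the input may contain line breaks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedSplitDemo {
    // \\?.    : the escape char ? plus whatever char it escapes
    // [^:?]++ : a run of chars that are neither delimiter nor escape (possessive)
    private static final Pattern TOKEN = Pattern.compile("(?s)(?:\\?.|[^:?]++)+");

    /** Collects every non-empty token; empty tokens between delimiters are skipped. */
    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("a:b?:c??:d???????:e")); // [a, b?:c??, d???????:e]
        System.out.println(tokens("a::b"));                // [a, b]
    }
}
```

Because each ? always consumes the char after it, an even run of ? before : leaves the : unescaped, so the token ends there.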
Scenario 2: Empty matches included
You may add (?<=:)(?=:) alternative to the above pattern to match empty strings between : chars, see this regex demo:
String s = "::a:b?:c??::d???????:e::";
Pattern pattern = Pattern.compile("(?>\\?.|[^:?])+|(?<=:)(?=:)");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println("'" + matcher.group() + "'");
}
Output of the Java demo:
''
'a'
'b?:c??'
''
'd???????:e'
''
Note that if you want to also match empty strings at the start/end of the string, use (?<![^:])(?![^:]) rather than (?<=:)(?=:).
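A small sketch of that last variant, assuming you want the same tokens a plain split on unescaped : would give, including empty ones at either end. The names EmptyTokenDemo and tokens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyTokenDemo {
    // First alternative: a non-empty token (escaped pairs or plain chars).
    // Second alternative: an empty position with no non-colon char on either
    // side, which also holds at the very start and end of the input.
    private static final Pattern TOKEN =
            Pattern.compile("(?>\\?.|[^:?])+|(?<![^:])(?![^:])");

    static List<String> tokens(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints five tokens: empty, a, empty, b, empty
        for (String t : tokens(":a::b:")) {
            System.out.println("'" + t + "'");
        }
    }
}
```

The negative lookarounds succeed where there is no neighbouring char at all, which is why they cover the string boundaries that (?<=:)(?=:) misses.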

regex pattern accepting comma and colon

I am looking for a regex pattern that matches the strings below. I am currently using this pattern:
^;[A-za-z0-9,:]+
This regex doesn't match the following strings.
I want all of the given strings to be matched by the pattern.
:a123,234,444:322 //String started with semicolon and values are separated with comma and colon
;123,A234:123;123,345,456:999,456 // Above case with repeated condition
;;123,345,C555:123 //String started with double semicolon
Can anyone provide a regex pattern that matches the above strings?
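This one
[;:]+[A-Za-z0-9,;:]+
will work for all three strings you want, see online on regex101.
[;:]+: starts with one or more ; or : characters.
[A-Za-z0-9,;:]+: your character class was missing the ; here. Also note that A-z should be A-Za-z, because the range A-z also covers the characters [ \ ] ^ _ ` that sit between Z and a in ASCII.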
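You can match the strings above that start with a semicolon with this regex (the first sample as written starts with a colon, so it would need [;:]+ at the front instead)
^;+[A-Za-z0-9,;:]+
Modifications:
;+ will match 1 or more semicolons
the semicolon ; has been added to the characters you want to match, and A-z has been corrected to A-Za-z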
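To compare the two suggestions, here is a small sketch that runs both against the three sample strings from the question. It uses A-Za-z rather than A-z, since the range A-z also admits punctuation such as _ and ^. The class and method names are invented for the example.

```java
import java.util.regex.Pattern;

class SeparatorPatternDemo {
    // Leading run of ; or : characters, then letters, digits and separators.
    private static final Pattern LEADING_EITHER =
            Pattern.compile("[;:]+[A-Za-z0-9,;:]+");
    // Same body, but the string must start with one or more semicolons.
    private static final Pattern LEADING_SEMICOLON =
            Pattern.compile(";+[A-Za-z0-9,;:]+");

    static boolean matchesEither(String s) {
        return LEADING_EITHER.matcher(s).matches();
    }

    static boolean matchesSemicolon(String s) {
        return LEADING_SEMICOLON.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            ":a123,234,444:322",
            ";123,A234:123;123,345,456:999,456",
            ";;123,345,C555:123"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + matchesEither(s) + " / " + matchesSemicolon(s));
        }
        // The range A-z spans the ASCII punctuation between Z and a.
        System.out.println("_".matches("[A-z]"));    // true
        System.out.println("_".matches("[A-Za-z]")); // false
    }
}
```

With matches() the whole string has to fit the pattern, so the ^ anchor is not needed here.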

How to prohibit a backslash with regex in java?

I'm new with regex and just can't find what the regex is to prohibit a backslash.
Thanks for any advice.
EDIT:
I'm using the regex for a JTextField, to keep the user from entering invalid input. The regex currently doesn't allow the user to enter a space character.
I'm doing this with
String regex = "\\S{1}";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(str);
So how do I change my regex to prohibit the backslash as well?
Based on your example snippet, the following expression should work similarly, but also disallowing the backslash:
String regex = "[^\\s\\\\]{1}";
It is a bit strange that you are looking for a one-char non-space and non-backslash pattern, but I guess you are iterating through and checking for consecutive matches.
I would use the following regex though:
String regex = "[^\\s\\\\]+";
and check whether it matched the whole String (matcher.matches()).
In a Java string literal, a regex that matches a single backslash is written as \\\\
String somestring = "a\\b"; // example input containing one backslash
somestring = somestring.replaceAll("\\\\", "");
This removes them. The literal "\\\\" becomes \\ at the regex level, which matches a single literal \.
You can also use a Pattern match if you just want to compare, or simply use String#contains
String somestring = "a\\b"; // example input containing one backslash
if (somestring.contains("\\")) {...}
To test if a string has a backslash:
if (input.matches(".*\\\\.*"))
To remove backslashes from strings:
input = input.replace("\\", "");
Note the use of the non-regex based replace() method.
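Putting the pieces together, a minimal sketch of a validator that rejects whitespace and backslashes, plus the non-regex removal. NoBackslashDemo, isValid and stripBackslashes are hypothetical names; how this gets wired into the JTextField is left out.

```java
import java.util.regex.Pattern;

class NoBackslashDemo {
    // One or more chars that are neither whitespace nor a backslash.
    // Java literal \\\\ -> regex \\ -> one literal backslash.
    private static final Pattern ALLOWED = Pattern.compile("[^\\s\\\\]+");

    /** True only if the whole text consists of allowed chars (empty text is rejected). */
    static boolean isValid(String text) {
        return ALLOWED.matcher(text).matches();
    }

    /** Removes every backslash using the literal (non-regex) replace(). */
    static String stripBackslashes(String text) {
        return text.replace("\\", "");
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc"));               // true
        System.out.println(isValid("a b"));               // false
        System.out.println(isValid("a\\b"));              // false
        System.out.println(stripBackslashes("a\\b\\c"));  // abc
    }
}
```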

How can i add multiple match conditions in a regex

I have a String like this : String x = "return function ('ABC','DEF')";
I am using this:
Pattern pattern = Pattern.compile("'(.*?)'");
Matcher matcher = pattern.matcher(x);
while (matcher.find()) {
    System.out.println("------> " + matcher.group());
}
to retrieve strings between single quotes.
My question is: how can I adapt this regex so that it matches strings between single quotes AND strings like " ,'DEF' " (that is, strings which start with ,' and end with ')?
You can use this pattern:
'[^']+'|"[^"]+"
To also match empty quoted strings, change each + to *.
See test.
This pattern should do what you want:
"(?:,\\s*)?'[^']*'"
The ? means the first group will match zero or one times.
I used (?:...) because this is a non-capturing group. It is better to use when you don't need to capture that portion of the match.
Also, I replaced .*? with [^']*, meaning the single-quoted string contains anything that is not a single quote. This is more efficient and less likely to lead to mistakes in your regex than .*?.
(Note: this regex allows there to be space between the comma and the start of the string. At first looking at your example, I thought that was true of your example. But now I see that it is not. Still, that might be useful depending on what your data looks like).
You could use the regex pattern:
Pattern.compile(",?'(.*?)'");
,? means 0 or 1 commas. The ? is greedy, so if there is a comma, it will be included in the match.
So: This will match:
A comma, followed by a string enclosed in single quotes
OR.. only a string enclosed in single quotes
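For what it's worth, here is a sketch that combines the optional leading comma with a capturing group around the quoted content, so group(1) gives the value without quotes or comma. The capturing group and the names QuotedStringDemo and quotedValues are additions for illustration, not part of the answers above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedStringDemo {
    // (?:,\\s*)? : optional comma plus optional whitespace, not captured
    // '([^']*)'  : a single-quoted string; group 1 holds the content
    private static final Pattern QUOTED = Pattern.compile("(?:,\\s*)?'([^']*)'");

    /** Returns the content of every single-quoted string, in order of appearance. */
    static List<String> quotedValues(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = QUOTED.matcher(input);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }

    public static void main(String[] args) {
        String x = "return function ('ABC','DEF')";
        System.out.println(quotedValues(x)); // [ABC, DEF]
    }
}
```

If you need the comma and quotes too, use m.group() instead of m.group(1).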

Java regular expression. Alphabetic, with spaces and apostrophe (')

What is the regular expression pattern in Java to validate the following?
A string with only letters, spaces and apostrophes (')
I know that for letters it is this ("^[a-zA-Z]+$")
For spaces it is this ("\\s")
I don't know what it is for the apostrophe.
But most of all I just want a single expression, not 3 individual ones.
You can create your own class with the characters that you need to match, like this:
Pattern.compile("[a-zA-Z\\s']+");
There is no single predefined class that matches something so specific, but you can build your own from the existing ones:
(\p{Alpha}|\s|')*
matches any number of letters, whitespace characters or apostrophes, in any order (including none at all, because of the *).
Take a look at http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Pattern p = Pattern.compile("^[a-zA-Z ]*$");
Matcher m = p.matcher("Tester String");
System.out.println(m.matches());// true
Matcher m2 = p.matcher("Tester String 123");
System.out.println(m2.matches());// false
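This accepts only letters and spaces. To allow apostrophes as well, add ' inside the character class: ^[a-zA-Z ']*$. Note that * also accepts the empty string.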
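A short sketch of a single-expression validator for letters, whitespace and apostrophes, assuming ASCII letters only and that an empty string should be rejected. NameValidatorDemo and isValidName are made-up names.

```java
import java.util.regex.Pattern;

class NameValidatorDemo {
    // Letters, any whitespace, and the apostrophe; + rejects the empty string.
    private static final Pattern NAME = Pattern.compile("[a-zA-Z\\s']+");

    static boolean isValidName(String text) {
        // matches() checks the whole string, so ^ and $ are not needed.
        return NAME.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("O'Neil Smith"));      // true
        System.out.println(isValidName("Tester String 123")); // false
        System.out.println(isValidName(""));                  // false
    }
}
```

The apostrophe has no special meaning in a regex, so it needs no escaping inside the character class.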
