Java regex pattern match for [] - java

I have a string output that I need to match, and I am using a regex:
String schemaName = "Amazon";
String test = "{\"data\": [], \"name\": \"Amazon\", \"title\": \"StoreDataConfig\"}";
String output= method("\\[\\]",schemaName);
Matcher n = Pattern.compile(output).matcher(test);
boolean available = n.find();
System.out.println(available);
I want to validate it by passing the regex to a method, as shown:
private static String method(String data, String schemaName) throws IOException {
    System.out.println(data);
    return ("{\"data\": " + data + ", \"name\": " + "\"" + schemaName + "\"" + ", \"title\": \"StoreDataConfig\"}");
}
But I always get java.util.regex.PatternSyntaxException: Illegal repetition.
Can you tell me what the mistake is?
If I don't use a method for [] and just give it directly, I don't get an error.

It looks like you are doing this:
1. Take a valid regex for matching [].
2. Embed the regex in some JSON.
3. Attempt to compile the JSON-with-an-embedded-regex as if the whole lot was a valid regex.
That fails ... because the JSON-with-an-embedded-regex is not a valid regex.
For a start, the { character is a regex meta character.
But the real puzzle is: what are you actually trying to do here?
If you simply want a regex that matches a literal string then this will do it.
Pattern p = Pattern.compile(Pattern.quote(someLiteralString));
And you could build a regex out of sub-regexes and literal strings by using Pattern.quote to escape the literal parts and then concatenating.
If what you are ultimately trying to do here is to extract information from a JSON string using pattern matching / regexes, then ... don't. The better approach is to use a proper JSON parser, and extract the information you need from the JSON object tree.
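As a minimal sketch of the Pattern.quote suggestion applied to the code in the question: the literal JSON text is quoted, and only the \[\] fragment is left as a real regex. The class name QuoteDemo and the helpers buildRegex and matches are made up for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative, not from the question.
class QuoteDemo {

    // Quote the literal JSON text and leave only dataRegex as a real regex.
    static String buildRegex(String dataRegex, String schemaName) {
        String prefix = "{\"data\": ";
        String suffix = ", \"name\": \"" + schemaName + "\", \"title\": \"StoreDataConfig\"}";
        return Pattern.quote(prefix) + dataRegex + Pattern.quote(suffix);
    }

    // Compile the combined regex and look for it in the input.
    static boolean matches(String dataRegex, String schemaName, String input) {
        return Pattern.compile(buildRegex(dataRegex, schemaName)).matcher(input).find();
    }

    public static void main(String[] args) {
        String test = "{\"data\": [], \"name\": \"Amazon\", \"title\": \"StoreDataConfig\"}";
        System.out.println(matches("\\[\\]", "Amazon", test)); // prints true
    }
}
```

Because the braces now sit inside quoted sections, they are no longer read as repetition operators, so the PatternSyntaxException should not occur.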

It's because you need to escape {} characters like this "\\{"

Related

Java Split String by colon on both side

Can you suggest me an approach by which I can split a String which is like:
:31C:150318
:31D:150425 IN BANGLADESH
:20:314015040086
So I tried to parse that string with
:[A-za-z]|\\d:
but it does not work. Please suggest a regular expression with which I can split that string so that 20, 31C, 31D etc. are the keys and 150318, 150425 IN BANGLADESH etc. are the values.
If I use string.split(":"), it does not serve my purpose.
If a string is like:
:20: MY VALUES : ARE HERE
then it will be split into 3 strings: key 20 will be associated with "MY VALUES", and "ARE HERE" will not be associated with key 20.
You may use matching mechanism instead of splitting since you need to match a specific colon in the string.
The regex to get 2 groups between the first and second colon and also capture everything after the second colon will look like
^:([^:]*):(.*)$
The ^ asserts the beginning of the string, ([^:]*) matches and captures into Group 1 zero or more characters other than :, and (.*) matches and captures into Group 2 the rest of the string. $ asserts the position at the end of a single-line string (as . matches any character but a newline unless Pattern.DOTALL is set).
String s = ":20:AND:HERE";
Pattern pattern = Pattern.compile("^:([^:]*):(.*)$");
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
    System.out.println("Key: " + matcher.group(1) + ", Value: " + matcher.group(2) + "\n");
}
Result for this demo: Key: 20, Value: AND:HERE
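If the whole multi-line input from the question arrives as one string, the same idea can be sketched with Pattern.MULTILINE so that ^ and $ apply per line. The class name TagParser and the method parse are hypothetical, and the key group is narrowed to [^:\r\n]* here (an assumption on my part) so a key can never run across a line break.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class TagParser {

    // ^ and $ match at line boundaries because of MULTILINE.
    private static final Pattern LINE =
            Pattern.compile("^:([^:\\r\\n]*):(.*)$", Pattern.MULTILINE);

    // Collect key/value pairs in the order they appear in the input.
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = LINE.matcher(input);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String input = ":31C:150318\n:31D:150425 IN BANGLADESH\n:20: MY VALUES : ARE HERE";
        for (Map.Entry<String, String> e : parse(input).entrySet()) {
            System.out.println("Key: " + e.getKey() + ", Value: " + e.getValue());
        }
    }
}
```

Only the first two colons of each line are treated as delimiters, so a value such as " MY VALUES : ARE HERE" stays attached to key 20, including its leading space.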
You can use the following to split:
^[:]+([^:]+):
Try with split function of String class
String[] splited = string.split(":");
For your requirements:
String c = ":31D:150425 IN BANGLADESH:todasdsa";
c=c.substring(1);
System.out.println("C="+c);
String key= c.substring(0,c.indexOf(":"));
String value = c.substring(c.indexOf(":")+1);
System.out.println("key="+key+" value="+value);
Result:
C=31D:150425 IN BANGLADESH:todasdsa
key=31D value=150425 IN BANGLADESH:todasdsa

Regex convert to convert a string to tab delimited field

I want to convert a string to get tab delimited format. In my opinion option 1 should do it. But it looks like option 2 is actually producing the desired result. Can someone explain why?
public class test {
    public static void main(String[] args) {
        String temp2 = "My name\" is something";
        System.out.println(temp2);
        System.out.println("\"" + temp2.replaceAll("\"", "\\\"") + "\""); // option 1
        System.out.println("\"" + temp2.replaceAll("\"", "\\\\\"") + "\""); // option 2
        if (temp2.contains("\"")) {
            System.out.println("Identified");
        }
    }
}
and the output is:
My name" is something
"My name" is something"
"My name\" is something"
Identified
If you want an Excel-compatible CSV format, a double quote is escaped by doubling it (so-called self-escaping).
String twoColumns = "\"a nice text\"\t\"with a \"\"quote\"\".\"";
String s = "Some \"quoted\" text.";
String s2 = "\"" + s.replace("\"", "\"\"") + "\"";
And no headache counting the backslashes.
Use String#replace(CharSequence, CharSequence) instead of String#replaceAll(). The former is a simple string replacement, so it works as you'd expect if you haven't read any documentation or don't know about regular expressions. The latter interprets its arguments differently because it's a regex find-and-replace:
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string.
You'll get this output:
My name" is something
"My name\" is something"
"My name\\" is something"
Identified
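To make the difference concrete, here is a small sketch (class and method names are made up for illustration). In replaceAll, a backslash in the replacement string escapes the next character, which is why option 1 in the question loses its backslash. Either String#replace or Matcher.quoteReplacement avoids that.

```java
import java.util.regex.Matcher;

// Hypothetical demo class; names are illustrative only.
class QuoteEscaping {

    // Literal replacement: no regex or replacement-string rules apply.
    static String escapeWithReplace(String s) {
        return s.replace("\"", "\\\"");
    }

    // Regex replacement: quoteReplacement protects the backslash in the replacement.
    static String escapeWithReplaceAll(String s) {
        return s.replaceAll("\"", Matcher.quoteReplacement("\\\""));
    }

    public static void main(String[] args) {
        String temp2 = "My name\" is something";
        System.out.println(escapeWithReplace(temp2));    // My name\" is something
        System.out.println(escapeWithReplaceAll(temp2)); // My name\" is something
    }
}
```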

Java RegEx replace all characters in string except for a word

I am using the code in Java:
String word = "hithere";
String str = "123hithere12345hi";
output(str.replaceAll("(?!"+word+")", "x"));
However, rather than outputting xxxhitherexxxxxxx as I want, it outputs x1x2x3hxixtxhxexrxex1x2x3x4x5xhxix x. I've tried a load of different regex patterns, but I can't figure out how to do this :(
Any help would be much appreciated.
Well, this technically works, using only replaceAll and a single line. It assumes your string does not contain the ASCII BEL character (code 7, which is the length of the word here), because that character is used as a temporary placeholder.
String string = "hithere";
String string2 = "asdfasdfasdfasdfhithereasasdf";
System.out.println(string2.replaceAll(string,"" + (char)string.length()).replaceAll("[^" + (char)string.length() + "]", "x").replaceAll("" + (char)string.length(), string));
I think this is what you're looking for, if I'm not mistaken:
String pattern = "(\\d)|(hi$)";
System.out.println("123hithere12345hi".replaceAll(pattern, "X"));
The pattern replaces any numeric digit and a trailing "hi" at the end of the string, each with a single X.
This lookaround based code will work for you:
String word = "hithere";
String string = "123hithere12345hi";
System.out.println(string.replaceAll(
".(?=.*?\\Q" + word + "\\E)|(?<=\\Q" + word + "\\E(.){0,99}).", "x"));
//=> xxxhitherexxxxxxx
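A different technique from the lookaround above, shown only as a sketch: find each occurrence of the word with a Matcher and build the masked string by hand. This is not one of the posted answers; the class name MaskExceptWord and the method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative only.
class MaskExceptWord {

    // Append n copies of 'x' to the builder.
    private static void appendMask(StringBuilder out, int n) {
        for (int i = 0; i < n; i++) {
            out.append('x');
        }
    }

    // Replace every character with 'x' except occurrences of word.
    static String mask(String input, String word) {
        StringBuilder out = new StringBuilder();
        // Pattern.quote treats the word as literal text, not as a regex.
        Matcher m = Pattern.compile(Pattern.quote(word)).matcher(input);
        int last = 0;
        while (m.find()) {
            appendMask(out, m.start() - last); // mask the gap before this occurrence
            out.append(word);                  // keep the word itself
            last = m.end();
        }
        appendMask(out, input.length() - last); // mask whatever follows the last occurrence
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mask("123hithere12345hi", "hithere")); // xxxhitherexxxxxxx
    }
}
```

This avoids the {0,99} bound that the lookbehind needs, at the cost of a few more lines.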

Giving inputs to java regex

I have a regex like the one below:
"\\t'AUR +(username) .*? /ROLE=\"(my_role)\".*$"
The username and my_role parts are taken from args, so they change every time the script starts. How can I pass parameters into those parts of the regex?
Thanks for your help.
Define regex like this:
String fmt = "\\t'AUR +(%s) .*? /ROLE=\"(%s)\".*$";
// assuming userName and myRole are your arguments
String regex = String.format(fmt, userName, myRole);
You should escape special characters in dynamic strings using Pattern.quote. To put the regex parts together you can simply use string concatenation like this:
String quotedUsername = Pattern.quote(username);
String quotedRole = Pattern.quote(my_role);
String regexString = "\\t'AUR +(" + quotedUsername +
") .*? /ROLE=\"(" + quotedRole + ")\".*$";
I think mixing regular expressions with format strings when using String.format can make the regex harder to understand.
Use string format or straight string concat to construct the regex before passing it to compile ...
Try this for an example:
String patternString = "\\t'AUR +(%s) .*? /ROLE=\"(%s)\".*$";
String formatted = String.format(patternString, username,my_role);
System.out.println(formatted);
Pattern pattern = Pattern.compile(formatted); // compile the formatted string, not the template
You can run a working example here: http://ideone.com/93YeNg
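The two suggestions in this thread (String.format and Pattern.quote) can also be combined. The sketch below assumes a made-up sample line, and the class name RegexParams and the method build are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names and the sample line are illustrative only.
class RegexParams {

    private static final String FORMAT = "\\t'AUR +(%s) .*? /ROLE=\"(%s)\".*$";

    // Quote the dynamic parts so regex metacharacters in them are matched literally.
    static Pattern build(String userName, String role) {
        return Pattern.compile(
                String.format(FORMAT, Pattern.quote(userName), Pattern.quote(role)));
    }

    public static void main(String[] args) {
        String line = "\t'AUR a.b LOGIN /ROLE=\"admin\" ok";
        System.out.println(build("a.b", "admin").matcher(line).find()); // true
        // Without quoting, the dot in "a.b" would also match "aXb".
        String other = "\t'AUR aXb LOGIN /ROLE=\"admin\" ok";
        System.out.println(build("a.b", "admin").matcher(other).find()); // false
    }
}
```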

Replace single quote with double quote with Regex

I have an app that received a malformed JSON string like this:
{'username' : 'xirby'}
I need to replace the single quotes ' with double quotes "
with these rules (I think):
A single quote comes after a { with one or more spaces
Comes before one or more spaces and :
Comes after a : with one or more spaces
Comes before one or more spaces and }
So this String {'username' : 'xirby'} or
{ 'username' : 'xirby' }
Would be transformed to:
{"username" : "xirby"}
Update:
Also a possible malformed JSON String:
{ 'message' : 'there's not much to say' }
In this example the single quote inside the message value should not be replaced.
Try this regex:
\s*\'\s*
and a call to replaceAll with " as the replacement will do the job.
Instead of doing this, you're better off using a JSON parser which can read such malformed JSON and "normalize" it for you. Jackson can do that:
final ObjectReader reader = new ObjectMapper()
.configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true)
.reader();
final JsonNode node = reader.readTree(yourMalformedJson);
// node.toString() does the right thing
This regex will capture all appropriate single quotes and associated white spaces while ignoring single quotes inside a message. One can replace the captured characters with double quotes, while preserving the JSON format. It also generalizes to JSON strings with multiple messages (delimited by commas ,).
((?<={)\s*\'|(?<=,)\s*\'|\'\s*(?=:)|(?<=:)\s*\'|\'\s*(?=,)|\'\s*(?=}))
I know you tagged your question for java, but I'm more familiar with python. Here's an example of how you can replace the single quotes with double quotes in python:
import re
regex = re.compile('((?<={)\s*\'|(?<=,)\s*\'|\'\s*(?=:)|(?<=:)\s*\'|\'\s*(?=,)|\'\s*(?=}))')
s = "{ 'first_name' : 'Shaquille' , 'lastname' : 'O'Neal' }"
regex.sub('"', s)
> '{"first_name":"Shaquille","lastname":"O\'Neal"}'
This method looks for single quotes next to the symbols {},: using look-ahead and look-behind operations.
String test = "{'username' : 'xirby'}";
String replaced = test.replaceAll("'", "\"");
Since your question is tagged Java, here is an answer in Java.
First, import the classes:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Then:
Pattern p = Pattern.compile("((?<=(\\{|\\[|\\,|:))\\s*')|('\\s*(?=(\\}|(\\])|(\\,|:))))");
String s = "{ 'firstName' : 'Malus' , 'lastName' : ' Ms'Malus' , marks:[ ' A+ ', 'B+']}";
String replace = "\"";
String o;
Matcher m = p.matcher(s);
o = m.replaceAll(replace);
System.out.println(o);
Output:
{"firstName":"Malus","lastName":" Ms'Malus", marks:[" A+ ","B+"]}
If you're looking to exactly satisfy all of those conditions, try this:
'{(\s)?\'(.*)\'(\s)?:(\s)?\'(.*)\'(\s)?}'
as your regex. It uses (\s)? to match zero or one whitespace character.
I recommend using a JSON parser instead of a regex.
String strJson = "{ 'username' : 'xirby' }";
strJson = new JSONObject(strJson).toString();
System.out.println(strJson);
