java JSON string formatting with regular expression

For given plain JSON data, apply the following formatting:
replace every special character in a key with an underscore
remove the double quotes around the key
replace the : with =
Example:
JSON Data: {"no/me": "139.82", "gc.pp": "\u0000\u000", ...}
After formatting: no_me="139.82", gc_pp="\u0000\u000"
Is it possible with a regular expression, or with any other single command?

A single regex for all of these changes may be overkill. I think you could code something similar to this:
(NOTE: Since I do not code in Java, my example is in JavaScript, just to give you the idea.)
var json_data = '{"no/me": "139.82", "gc.pp": "0000000", "foo":"bar"}';
console.log(json_data);
var data = JSON.parse(json_data);
var out = '';
for (var x in data) {
var clean_x = x.replace(/[^a-zA-Z0-9]/g, "_");
if (out != '') out += ', ';
out += clean_x + '="' + data[x] + '"';
}
console.log(out);
Basically, you loop through the keys and clean them (replace the unwanted characters), then build a new string in the format you like from each new key and its original value.
Important: Bear in mind that ids can collide. For example, both no/me and no#me collapse into the same id, no_me. This may not matter since you are not outputting JSON after all; I mention it just in case.
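The loop above, together with the collision warning, could be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name KeyValueFormatter and its format method are hypothetical, and the Map stands in for whatever a real JSON parser would hand back, since the standard library has no JSON parser of its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class KeyValueFormatter {

    // Replaces every character outside [a-zA-Z0-9] in each key with '_' and
    // renders the entries as key="value" pairs separated by ", ".
    // Throws if two different keys collapse into the same cleaned key.
    static String format(Map<String, String> data) {
        Map<String, String> seen = new LinkedHashMap<>();
        StringJoiner out = new StringJoiner(", ");
        for (Map.Entry<String, String> e : data.entrySet()) {
            String cleanKey = e.getKey().replaceAll("[^a-zA-Z0-9]", "_");
            String previous = seen.put(cleanKey, e.getKey());
            if (previous != null) {
                throw new IllegalArgumentException(
                        "Keys '" + previous + "' and '" + e.getKey()
                                + "' both map to '" + cleanKey + "'");
            }
            out.add(cleanKey + "=\"" + e.getValue() + "\"");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the parsed JSON object.
        Map<String, String> data = new LinkedHashMap<>();
        data.put("no/me", "139.82");
        data.put("gc.pp", "0000000");
        data.put("foo", "bar");
        System.out.println(format(data));
    }
}
```

Failing loudly on a collision is one possible design choice; silently keeping the last value, as the JavaScript loop does, may be just as acceptable for your use case.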

I haven't done Java in a long time, but I think you need something like this.
I'm assuming you mean 'all Non-Word characters' by specialchars here.
import java.util.regex.*;
String jsonData = "{\"no/me\": \"139.82\", \"gc.pp\": \"\\u0000\\u000\", ...}";
// remove the enclosing { and }
jsonData = jsonData.substring(1, jsonData.length() - 1);
StringBuffer result = new StringBuffer();
try {
    Pattern regex = Pattern.compile("\"([^\"]+)\"\\s*:\\s*"); // find each quoted key and its colon
    Matcher regexMatcher = regex.matcher(jsonData);
    while (regexMatcher.find()) {
        // group(1) is the key without its quotes, e.g. no/me becomes no_me=
        String key = regexMatcher.group(1).replaceAll("\\W", "_") + "=";
        // Strings are immutable, so build the output with appendReplacement
        regexMatcher.appendReplacement(result, Matcher.quoteReplacement(key));
    }
    regexMatcher.appendTail(result);
} catch (PatternSyntaxException ex) {
    // regex has syntax error
}
System.out.println(result);

Related

Java regex to extract and replace by value

Input String
${abc.xzy}/demo/${ttt.bbb}
test${kkk.mmm}
RESULT
World/demo/Hello
testSystem
The text inside the curly brackets is a key into my properties. I want to replace those properties with runtime values.
I can do the following to get the regex match, but what should I put in the replace logic to swap each ${..} match for its runtime value in the input string?
Pattern p = Pattern.compile("\\{([^}]*)\\}");
Matcher m = p.matcher(s);
while (m.find()) {
// replace logic comes here
}

An alternative is a third-party library such as Apache Commons Text.
Its StringSubstitutor class looks very promising.
Map<String, String> valuesMap = new HashMap<>();
valuesMap.put("abc.xzy", "World");
valuesMap.put("ttt.bbb", "Hello");
valuesMap.put("kkk.mmm", "System");
String templateString = "${abc.xzy}/demo/${ttt.bbb} test${kkk.mmm}";
StringSubstitutor sub = new StringSubstitutor(valuesMap);
String resolvedString = sub.replace(templateString);
For more info, check out the Javadoc: https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringSubstitutor.html

You may use the following solution:
String s = "${abc.xzy}/demo/${ttt.bbb}\ntest${kkk.mmm}";
Map<String, String> map = new HashMap<String, String>();
map.put("abc.xzy", "World");
map.put("ttt.bbb", "Hello");
map.put("kkk.mmm", "System");
StringBuffer result = new StringBuffer();
Matcher m = Pattern.compile("\\$\\{([^{}]+)\\}").matcher(s);
while (m.find()) {
String value = map.get(m.group(1));
m.appendReplacement(result, value != null ? value : m.group());
}
m.appendTail(result);
System.out.println(result.toString());
Output:
World/demo/Hello
testSystem
The regex is
\$\{([^{}]+)\}
It matches a literal ${, then captures one or more characters other than { and } into Group 1, and then matches }. If the Group 1 value is present in the Map as a key, the replacement is the mapped value; otherwise the matched text is put back where it was in the input string.
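On Java 9 or later, the same lookup can be written without the explicit loop by passing a function to Matcher.replaceAll. The sketch below is only one way to do it; the class name PlaceholderResolver and the method resolve are hypothetical. It also wraps the replacement in Matcher.quoteReplacement, so a value containing $ or \ is inserted literally instead of being read as a group reference.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderResolver {

    // Same expression as above: ${ then one or more non-brace characters, then }.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^{}]+)\\}");

    // Replaces each ${key} with its mapped value; unknown keys are left as they were.
    static String resolve(String template, Map<String, String> values) {
        return PLACEHOLDER.matcher(template).replaceAll(match -> {
            String value = values.get(match.group(1));
            // quoteReplacement escapes '$' and '\' so they are inserted literally.
            return Matcher.quoteReplacement(value != null ? value : match.group());
        });
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("abc.xzy", "World");
        map.put("ttt.bbb", "cost: $100");
        System.out.println(resolve("${abc.xzy}/demo/${ttt.bbb} and ${not.there}", map));
    }
}
```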

Your regex needs to include the dollar sign. Making the inner group lazy is also sufficient to keep any } out of the resulting key String.
String regex = "\\$\\{(.+?)\\}";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s);
while (m.find()) {
    String key = m.group(1); // This is your matching group (the one in braces).
    String value = someMap.get(key);
    // Strings are immutable, so the result must be assigned back to s.
    s = s.replaceFirst(regex, Matcher.quoteReplacement(value != null ? value : "missingKey"));
    m = p.matcher(s); // you could alternatively reset the existing Matcher, but just create a new one, for simplicity's sake.
}
You could streamline this by tracking the cursor position and doing the replacement on the string yourself. Either way, you need to reset your matcher, because otherwise it keeps scanning the old String.

The_Cute_Hedgehog's answer is good, but it adds a dependency.
Wiktor Stribiżew's answer misses a special case.
My answer uses only Java's built-in regex and tries to improve on Wiktor Stribiżew's answer. (The improvements are in the Java code only; the regex is OK.)
Improvements:
StringBuilder is faster than StringBuffer.
The StringBuilder is created with an initial capacity of (int)(s.length()*1.2), which avoids repeated reallocation when the input template s is large.
Regex special characters in a replacement value (as in "cost: $100") no longer produce a wrong result, as they do with appendReplacement. You can also fix this in Wiktor Stribiżew's code by escaping the $ character in the replacement String, like this: value.replaceAll("\\$", "\\\\\\$")
Here is the improved code:
String s = "khj${abc.xzy}/demo/${ttt.bbb}\ntest${kkk.mmm}{kkk.missing}string";
Map<String, String> map = new HashMap<>();
map.put("abc.xzy", "World");
map.put("ttt.bbb", "cost: $100");
map.put("kkk.mmm", "System");
StringBuilder result = new StringBuilder((int)(s.length()*1.2));
Matcher m = Pattern.compile("\\$\\{([^}]+)\\}").matcher(s);
int nonCaptureIndex = 0;
while (m.find()) {
String value = map.get(m.group(1));
if (value != null) {
int index = m.start();
if (index > nonCaptureIndex) {
result.append(s.substring(nonCaptureIndex, index));
}
result.append(value);
nonCaptureIndex = m.end();
}
}
result.append(s.substring(nonCaptureIndex, s.length()));
System.out.println(result.toString());

Accept everything in java if condition

Today I wrote a program that automatically checks whether a Netflix account is working. I'm stuck at the point where I need to accept all the country codes in the URL. I wanted to use something like * in Linux, but my IDE is giving me errors. What is the solution, and are there better ways?
WebUI.openBrowser('')
WebUI.navigateToUrl('https://www.netflix.com/login')
WebUI.setText(findTestObject('/Page_Netflix/input_email'), 'example@gmail.com')
WebUI.setText(findTestObject('/Page_Netflix/input_password'), '1234')
WebUI.click(findTestObject('/Page_Netflix/button_Sign In'))
TimeUnit.SECONDS.sleep(10)
if (WebUI.getUrl() == "https://www.netflix.com/" + * + "-" + * + "/login") {
}
WebUI.closeBrowser()

So this is your attempt:
if (WebUI.getUrl() == "https://www.netflix.com/" + * + "-" + * + "/login") {
}
which fails, because you can't use * like that (and == is not how you should compare strings in Java). I think this is what you want:
if (WebUI.getUrl().matches("https://www\\.netflix\\.com/.+-.+/login")) {
// do whatever
}
which matches whatever country you are in: any URL like https://www.netflix.com/it-en/login. If you need the country information inside the if statement, you'll want a matcher:
import java.util.regex.*;
Pattern p = Pattern.compile("https://www\\.netflix\\.com/(.+)-(.+)/login");
Matcher m = p.matcher(WebUI.getUrl());
if (m.matches()) {
String country = m.group(1);
String language = m.group(2);
// do whatever
}
Note that we're using Java here, since you tagged the question that way. Katalon can also use JavaScript and Groovy, which is what your single-quoted strings and missing semicolons suggest you are writing. In Groovy, == is fine for string comparison, and the language also has shorthand for regular expressions.
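If accepting any text on either side of the hyphen is too permissive, the pattern can be tightened. The sketch below assumes, without having verified it against the real site, that the segment is always two lowercase letters, a hyphen, and two more lowercase letters (as in it-en). The class LoginUrlCheck and the method parseLocale are hypothetical names, and the URL is passed in as a plain String so the code runs without Katalon.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LoginUrlCheck {

    // Assumption: two lowercase letters, a hyphen, two lowercase letters.
    // Loosen the character classes if real URLs differ.
    private static final Pattern LOGIN_URL =
            Pattern.compile("https://www\\.netflix\\.com/([a-z]{2})-([a-z]{2})/login");

    // Returns {country, language} when the URL is a localized login page, otherwise empty.
    static Optional<String[]> parseLocale(String url) {
        Matcher m = LOGIN_URL.matcher(url);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] { m.group(1), m.group(2) });
    }

    public static void main(String[] args) {
        parseLocale("https://www.netflix.com/it-en/login")
                .ifPresent(p -> System.out.println(p[0] + " / " + p[1]));
        System.out.println(parseLocale("https://www.netflix.com/browse").isPresent());
    }
}
```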

You could create a list of valid country-code pairs if you want to keep track of which country you are in, and then just compare the two strings.
If you don't want to do it that way and would rather accept whatever comes in the URL string, I recommend the split method:
String[] sections = WebUI.getUrl().split("/");
/* Now we have:
sections[0] = "https:"
sections[1] = ""
sections[2] = "www.netflix.com"
sections[3] = whatever the country code is
sections[4] = "login"
*/
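A variation on the split idea is to let java.net.URI isolate the path first, so the scheme and host never end up in the array. This is a sketch under assumptions: the class UrlSections and the method firstPathSegment are hypothetical names, and the URL is passed in as a String rather than read from WebUI.

```java
import java.net.URI;

public class UrlSections {

    // Returns the first path segment of the URL, or "" when the path has none.
    static String firstPathSegment(String url) {
        String path = URI.create(url).getPath();   // e.g. "/it-en/login"
        if (path == null) {
            return "";
        }
        String[] segments = path.split("/");        // e.g. ["", "it-en", "login"]
        return segments.length > 1 ? segments[1] : "";
    }

    public static void main(String[] args) {
        System.out.println(firstPathSegment("https://www.netflix.com/it-en/login"));
    }
}
```

Note that for a URL without a country segment, such as https://www.netflix.com/login, the first segment is simply login, so the caller still has to decide what counts as a country code.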

Try to solve it with regular expression on URL string:
final String COUNTRY_CODES_REGEX =
"Country1|Country2|Country3";
Pattern pattern = Pattern.compile(COUNTRY_CODES_REGEX);
Matcher matcher = pattern.matcher(WebUI.getUrl());
if (matcher.find()) {
// Do some stuff.
}

Instead of using WebUI.getUrl() == ...
you could use String.matches(String regex). Similarly to AutomatedOwl's reply, you would define a String variable that joins the individual country codes with the regex alternation operator |. So you have
String country1 = ...
String country2 = ...
String countryN = ...
String countryCodes = String.join("|", country1, country2, countryN);
and then something along the lines of:
if (WebUI.getUrl().matches("https://www\\.netflix\\.com/(" + countryCodes + ")/login")) {
    // do stuff; the parentheses keep the | alternation limited to the country codes
}
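When the country codes come from a list, the alternation has to be grouped, because | otherwise splits the whole expression into unrelated alternatives. A minimal sketch, assuming a hypothetical helper class CountryLoginPattern and made-up code values; Pattern.quote keeps dots and other metacharacters in the fixed parts and in the codes literal.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class CountryLoginPattern {

    // Builds a pattern that accepts the login URL for exactly the given codes.
    static Pattern build(List<String> codes) {
        String alternation = codes.stream()
                .map(Pattern::quote)                 // treat each code as literal text
                .collect(Collectors.joining("|"));
        return Pattern.compile(
                Pattern.quote("https://www.netflix.com/")
                        + "(?:" + alternation + ")"  // group limits the scope of |
                        + Pattern.quote("/login"));
    }

    public static void main(String[] args) {
        Pattern p = build(Arrays.asList("it-en", "de-de"));  // illustrative codes only
        System.out.println(p.matcher("https://www.netflix.com/it-en/login").matches());
        System.out.println(p.matcher("https://www.netflix.com/fr-fr/login").matches());
    }
}
```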

N-th indexOf in String?

I need to extract a sub-string of a URL.
URLs
/service1/api/v1.0/foo -> foo
/service1/api/v1.0/foo/{fooId} -> foo/{fooId}
/service1/api/v1.0/foo/{fooId}/boo -> foo/{fooId}/boo
And some of those URLs may have request parameters.
Code
String str = request.getRequestURI();
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1);
str = str.substring(str.indexOf("/") + 1, str.indexOf("?"));
Is there a better way to extract the sub-string instead of recurrent usage of indexOf method?

There are many alternative ways:
Use the Java Stream API on the String split with the / delimiter:
String str = "/service1/api/v1.0/foo/{fooId}/boo";
String[] split = str.split("\\/");
String url = Arrays.stream(split).skip(4).collect(Collectors.joining("/"));
System.out.println(url);
To also drop the request parameters, the Stream would look like:
String url = Arrays.stream(split)
.skip(4)
.map(i -> i.replaceAll("\\?.+", ""))
.collect(Collectors.joining("/"));
This is also where regex comes in! Use the Pattern and Matcher classes.
String str = "/service1/api/v1.0/foo/{fooId}/boo";
Pattern pattern = Pattern.compile("\\/.*?\\/api\\/v\\d+\\.\\d+\\/(.+)");
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
System.out.println(matcher.group(1));
}
If you prefer to stick with indexOf(..), you might want to use a while loop.
String str = "/service1/api/v1.0/foo/{fooId}/boo?parameter=value";
String string = str;
while(!string.startsWith("v1.0")) {
string = string.substring(string.indexOf("/") + 1);
}
System.out.println(string.substring(string.indexOf("/") + 1, string.indexOf("?")));
Other answers point out that if the prefix never changes, a single call to the indexOf(..) method is enough (@JB Nizet):
string.substring("/service1/api/v1.0/".length(), string.indexOf("?"));
All these solutions rely on your input and on the fact that the pattern is known, or at least on the number of preceding sections delimited by / or on the version v1.0 as a checkpoint. The best solution might not appear here, since there are unlimited combinations of URLs. You have to know all the possible forms of the input URL to find the best way to handle it.

Path is quite useful for that :
public static void main(String[] args) {
Path root = Paths.get("/service1/api/v1.0/foo");
Path relativize = root.relativize(Paths.get("/service1/api/v1.0/foo/{fooId}/boo"));
System.out.println(relativize);
}
Output :
{fooId}/boo

How about this:
String s = "/service1/api/v1.0/foo/{fooId}/boo";
String[] sArray = s.split("/");
StringBuilder sb = new StringBuilder();
for (int i = 4; i < sArray.length; i++) {
sb.append(sArray[i]).append("/");
}
sb.deleteCharAt(sb.length() - 1);
System.out.println(sb.toString());
Output:
foo/{fooId}/boo
If the url prefix is always /service1/api/v1.0/, you just need to do s.substring("/service1/api/v1.0/".length()).

There are a few good options here.
1) If you know "foo" will always be the 4th token, then you have the right idea already. The only issue with your way is that you have the information you need to be efficient, but you aren't using it. Instead of copying the String multiple times and looping anew from the beginning of the new String, you could just continue from where you left off, 4 times, to find the starting point of what you want.
String str = "/service1/api/v1.0/foo/{fooId}/boo";
// start at the beginning
int start = 0;
// get the 4th index of '/' in the string
for (int i = 0; i != 4; i++) {
// get the next index of '/' after the index 'start'
start = str.indexOf('/',start);
// increase the pointer to the next character after this slash
start++;
}
// get the substring
str = str.substring(start);
This will be far, far more efficient than any regex pattern.
2) Regex: (java.util.regex.*). This will work if what you want is always preceded by "/service1/api/v1.0/". There may be other directories before it, e.g. "one/two/three/service1/api/v1.0/".
// the dot in v1.0 is escaped so that it matches a literal '.'
// (.+) will capture the matched text at that position
// $ marks the end of the string (technically it also matches just before a final '\n')
Pattern pattern = Pattern.compile("/service1/api/v1\\.0/(.+)$");
// get a matcher for it
Matcher matcher = pattern.matcher(str);
// if there is a match
if (matcher.find()) {
// get the captured text
str = matcher.group(1);
}
If your path can vary somewhat, you can use regex to account for it. E.g. "/service/api/v3/foo/{bar}/baz/" (note the varying number formats and the trailing '/') can be matched as well by changing the regex to "/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$". The lazy (.+?) followed by /?$ keeps a trailing '/' out of the captured group.
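Both options can be wrapped in small helpers. The sketch below is an illustration only: the class PathTail and its method names are hypothetical. The first helper is a general "n-th indexOf" that continues from the previous hit, and the second uses a lazy variant of the flexible pattern so that a trailing '/' is not captured. Neither helper strips a query string.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PathTail {

    // Index of the n-th occurrence (n >= 1) of ch in s, or -1 if there are fewer than n.
    static int nthIndexOf(String s, char ch, int n) {
        int index = -1;
        for (int i = 0; i < n; i++) {
            index = s.indexOf(ch, index + 1);   // resume right after the previous hit
            if (index == -1) {
                return -1;
            }
        }
        return index;
    }

    // Option 1: everything after the n-th '/', or empty if the path has fewer slashes.
    static Optional<String> afterNthSlash(String path, int n) {
        int slash = nthIndexOf(path, '/', n);
        return slash == -1 ? Optional.empty() : Optional.of(path.substring(slash + 1));
    }

    // Option 2: a lazy group followed by "/?$" so a trailing '/' is not captured.
    private static final Pattern VERSIONED =
            Pattern.compile("/service\\d*/api/v\\d+(?:\\.\\d+)?/(.+?)/?$");

    static Optional<String> afterVersionPrefix(String path) {
        Matcher m = VERSIONED.matcher(path);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(afterNthSlash("/service1/api/v1.0/foo/{fooId}/boo", 4).orElse("(none)"));
        System.out.println(afterVersionPrefix("/service/api/v3/foo/{bar}/baz/").orElse("(none)"));
    }
}
```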

Regular expression in java that encloses some url

I have this problem:
I have to write a regular expression that handles these URLs:
http://www.amazon.it/TP-LINK-TL-WR841N-Wireless-300Mbps-Ethernet/dp/B001FWYGJS?ie=UTF8&redirect=true&ref_=s9_simh_gw_p147_d0_i2
http://www.amazon.it/gp/product/B014KMQWU0/
http://www.amazon.it/gp/product/glance/B014KMQWU0/
I need a regular expression that matches the full URL up to and including the ASIN of the product (an ASIN is a 10-character code of capital letters and digits).
I have written this regex, but it does not do what I want:
String regex="http:\\/\\/(?:www\\.|)amazon\\.com\\/(?:gp\\ product|| gp\\ product\\ glance || [^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
while (urlAmazonMatcher.find()) {
System.out.println("PROVA "+urlAmazonMatcher.group(0));
}

This is my solution. Finally it works :D
// optional scheme and www., then the Amazon domain, a product path, and the 10-character ASIN
String regex="(?:https?:\\/\\/)?(?:www\\.)?amazon\\.(?:com|it|uk|fr|de)\\/(?:gp\\/product|gp\\/product\\/glance|[^\\/]+\\/dp|dp)\\/([^\\/]{10})";
Pattern pattern=Pattern.compile(regex);
Matcher urlAmazonMatcher = pattern.matcher(url);
String toReturn = null;
while (urlAmazonMatcher.find()) {
    toReturn=urlAmazonMatcher.group(0); // the URL up to and including the ASIN
}

How about
/[^/?]{10}(/$|\?)
This matches 10 characters that are neither / nor ? following a slash if those characters are followed by a final slash or a question mark.
You can get the part that precedes or follows the ASIN using one of the various Matcher functions.
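One way to get the part that precedes the ASIN is to wrap the ten characters in a capturing group and cut the URL at the end of that group. This is a sketch, not the answerer's code: the class AsinPrefix and the method urlUpToAsin are hypothetical names. Like the expression above, it only recognizes an ASIN that is followed by a final slash or a question mark.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AsinPrefix {

    // The expression from above, with a capturing group added around the ASIN.
    private static final Pattern ASIN = Pattern.compile("/([^/?]{10})(?:/$|\\?)");

    // Returns the URL up to and including the ASIN, or empty if no ASIN is found.
    static Optional<String> urlUpToAsin(String url) {
        Matcher m = ASIN.matcher(url);
        // end(1) is the index just past the last character of the captured ASIN.
        return m.find() ? Optional.of(url.substring(0, m.end(1))) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(urlUpToAsin("http://www.amazon.it/gp/product/glance/B014KMQWU0/").orElse("(none)"));
    }
}
```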

Here is my work from a previous project that was to extract URLs from text:
private Pattern getUriPattern() {
if(uriPattern == null) {
// taken from http://labs.apache.org/webarch/uri/rfc/rfc3986.html
//TODO implement the full URI syntax
String genDelims = "\\:\\/\\?\\#\\[\\]\\@";
String subDelims = "\\!\\$\\&\\'\\*\\+\\,\\;\\=";
String reserved = genDelims + subDelims;
String unreserved = "\\w\\-\\.\\~"; // i.e. ALPHA / DIGIT / "-" / "." / "_" / "~"
String allowed = reserved + unreserved;
// ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
uriPattern = Pattern.compile("((?:[^\\:/\\?\\#]+:)?//[" + allowed + "&&[^\\?\\#]]*(?:\\?([" + allowed + "&&[^\\#]]*))?(?:\\#[" + allowed + "]*)?).*");
}
return uriPattern;
}
You can use the above method as follows:
Matcher uriMatcher =
getUriPattern().matcher(text);
if(uriMatcher.matches()) {
String candidateUriString = uriMatcher.group(1);
try {
new URI(candidateUriString); // check once again if you matched a URL
// your code here
} catch (Exception e) {
// error handling
}
}
This will catch the whole URL, including params. You can then split it at the first occurrence of '?' (if any) and take the first part. Of course, you can rework the regex too.
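Taking the part before the first '?' needs no regex at all. A minimal sketch, assuming a hypothetical helper named withoutQuery in a class named QueryStripper:

```java
public class QueryStripper {

    // Returns the URL up to, but not including, the first '?'.
    // A URL without '?' is returned unchanged.
    static String withoutQuery(String url) {
        int q = url.indexOf('?');
        return q < 0 ? url : url.substring(0, q);
    }

    public static void main(String[] args) {
        System.out.println(withoutQuery("http://www.amazon.it/dp/B001FWYGJS?ie=UTF8&redirect=true"));
    }
}
```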

Replace single quote with double quote with Regex

I have an app that receives a malformed JSON string like this:
{'username' : 'xirby'}
I need to replace the single quotes ' with double quotes "
With these rules (I think):
A single quote comes after a { with one or more spaces
Comes before one or more spaces and :
Comes after a : with one or more spaces
Comes before one or more spaces and }
So this String {'username' : 'xirby'} or
{ 'username' : 'xirby' }
Would be transformed to:
{"username" : "xirby"}
Update:
Also a possible malformed JSON String:
{ 'message' : 'there's not much to say' }
In this example the single quote inside the message value should not be replaced.

Try this regex:
\s*\'\s*
Replacing each match with " will do the job.

Instead of doing this, you're better off using a JSON parser that can read such malformed JSON and "normalize" it for you. Jackson can do that:
// imports: com.fasterxml.jackson.core.JsonParser, com.fasterxml.jackson.databind.*
final ObjectReader reader = new ObjectMapper()
    .configure(JsonParser.Feature.ALLOW_SINGLE_QUOTES, true)
    .reader();
final JsonNode node = reader.readTree(yourMalformedJson);
// node.toString() does the right thing

This regex will capture all appropriate single quotes and associated white spaces while ignoring single quotes inside a message. One can replace the captured characters with double quotes, while preserving the JSON format. It also generalizes to JSON strings with multiple messages (delimited by commas ,).
((?<={)\s*\'|(?<=,)\s*\'|\'\s*(?=:)|(?<=:)\s*\'|\'\s*(?=,)|\'\s*(?=}))
I know you tagged your question for Java, but I'm more familiar with Python. Here's an example of how you can replace the single quotes with double quotes in Python:
import re
regex = re.compile(r'((?<={)\s*\'|(?<=,)\s*\'|\'\s*(?=:)|(?<=:)\s*\'|\'\s*(?=,)|\'\s*(?=}))')
s = "{ 'first_name' : 'Shaquille' , 'lastname' : 'O'Neal' }"
regex.sub('"', s)
> '{"first_name":"Shaquille","lastname":"O\'Neal"}'
This method looks for single quotes next to the symbols {},: using look-ahead and look-behind operations.

String test = "{'username' : 'xirby'}";
String replaced = test.replaceAll("'", "\"");

Since your question is tagged Java, I answered in Java.
First, import the classes:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
Then:
Pattern p = Pattern.compile("((?<=(\\{|\\[|\\,|:))\\s*')|('\\s*(?=(\\}|(\\])|(\\,|:))))");
String s = "{ 'firstName' : 'Malus' , 'lastName' : ' Ms'Malus' , marks:[ ' A+ ', 'B+']}";
String replace = "\"";
String o;
Matcher m = p.matcher(s);
o = m.replaceAll(replace);
System.out.println(o);
Output:
{"firstName":"Malus","lastName":" Ms'Malus", marks:[" A+ ","B+"]}

If you're looking to exactly satisfy all of those conditions, try this:
'{(\s)?\'(.*)\'(\s)?:(\s)?\'(.*)\'(\s)?}'
as your regex. It uses (\s)? to match zero or one whitespace character.

I recommend using a JSON parser instead of regex.
String strJson = "{ 'username' : 'xirby' }";
strJson = new JSONObject(strJson).toString();
System.out.println(strJson);
