I need to convert the following string into json format.
Below is the input as well as the expected output for reference.
Input:
Employee Driver Report - EDR
--------------------------------
Employee Nbr: 123480 Employee Type: DI Cat: UPL
Driver License: PP3P30 Plate: ROWP
Part Number: 1006096
Output:
{
"Employee Nbr": "123480",
"Employee Type": "DI",
"Cat": "UPL",
"Driver License": "PP3P30",
"Plate": "ROWP",
"Part Number": "1006096",
}
Sample Code:
Map<String, String> keyValueMap = new HashMap<String, String>();
String[] lines = rawText.split(System.getProperty("line.separator"));
for(String line: lines) {
.......
keyValueMap.put(keyAndValues[i], keyAndValues[i + 1]);
}
Gson gson = new GsonBuilder().setPrettyPrinting().create();
String json = gson.toJson(keyValueMap);
Could you please help me on how to resolve this?
Thanks in advance.
For each line you get, you have to parse it using : and (space) as delimiters.
I'll let you search the correct regex to use and ask for help if needed ;)
You could use an expression like so: ([\w\s]+\s*):\s*([\w]+) (example here) to process the data. This expression assumes that there is no white space within your value parameters.
Thus given the code below:
String source = "Employee Driver Report - EDR\n" +
"--------------------------------\n" +
"Employee Nbr: 123480 Employee Type: DI Cat: UPL\n" +
"Driver License: PP3P30 Plate: ROWP\n" +
"Part Number: 1006096";
String[] lines = source.split("\n");
Pattern p = Pattern.compile("([\\w\\s]+\\s*):\\s*([\\w]+)");
System.out.println("{");
for(int i = 2; i < lines.length; i++)
{
Matcher m = p.matcher(lines[i]);
while(m.find())
{
System.out.println("\"" + m.group(1).trim() + "\":" + "\"" + m.group(2).trim() + "\"");
}
}
System.out.println("}");
You would get something like so:
{
"Employee Nbr":"123480"
"Employee Type":"DI"
"Cat":"UPL"
"Driver License":"PP3P30"
"Plate":"ROWP"
"Part Number":"1006096"
}
EDIT: As per your comment:
String source = "Employee Driver Report - EDR\n" +
"--------------------------------\n" +
"Employee Nbr: 123480 With white space Employee Type: DI Cat: UPL\n" +
"Driver License: PP3P30 Plate: ROWP\n" +
"Part Number: 1006096";
String[] lines = source.split("\n");
Pattern p = Pattern.compile("(Employee Nbr|Employee Type|Cat|Driver License|Plate|Part Number)\\s*:\\s*(.+?)(?:(?=Employee Nbr|Employee Type|Cat|Driver License|Plate|Part Number)|$)");
System.out.println("{");
for(int i = 2; i < lines.length; i++)
{
Matcher m = p.matcher(lines[i]);
while(m.find())
{
System.out.println("\"" + m.group(1).trim() + "\":" + "\"" + m.group(2).trim() + "\"");
}
}
System.out.println("}");
Yields:
{
"Employee Nbr":"123480 With white space"
"Employee Type":"DI"
"Cat":"UPL"
"Driver License":"PP3P30"
"Plate":"ROWP"
"Part Number":"1006096"
}
A description of the updated expression is available here.
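The printed output in the snippets above has no commas between members, so it is not yet valid JSON. A minimal sketch, assuming values contain no white space, quotes or backslashes, collects the pairs into a map first and then joins them with commas. `ReportToJson`, `parse` and `toJson` are hypothetical names used only for illustration; a library such as Gson (as in the question) could do the serialization step instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReportToJson {
    // Same key/value expression as in the first snippet: values must not contain white space.
    private static final Pattern PAIR = Pattern.compile("([\\w\\s]+\\s*):\\s*([\\w]+)");

    /** Collects every "key: value" pair of the report into an insertion-ordered map. */
    static Map<String, String> parse(String report) {
        Map<String, String> pairs = new LinkedHashMap<>();
        // Header lines contain no ':' and therefore produce no matches.
        for (String line : report.split("\\R")) {
            Matcher m = PAIR.matcher(line);
            while (m.find()) {
                pairs.put(m.group(1).trim(), m.group(2).trim());
            }
        }
        return pairs;
    }

    /** Hand-rolled serializer; assumes keys and values need no JSON escaping. */
    static String toJson(Map<String, String> pairs) {
        StringBuilder sb = new StringBuilder("{");
        String separator = "";
        for (Map.Entry<String, String> e : pairs.entrySet()) {
            sb.append(separator)
              .append('"').append(e.getKey()).append("\":\"")
              .append(e.getValue()).append('"');
            separator = ",";
        }
        return sb.append("}").toString();
    }
}
```

Calling `ReportToJson.toJson(ReportToJson.parse(source))` on the sample report gives a single-line object with the six members in their original order.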
I am working on an Android app that summarizes text messages sent to clients by a financial services provider. A text message is sent as a notification each time a user makes a transaction.
Here is a sample message:
[-ZAAD SHILLING-] Ref:1141125019 SLSH4,000 sent to AXMED XASAN
WARSAME(634458520) at 12/05/19 22:33:03, Your Balance is
SLSH44,222.62.
So I want to extract several portions of this message like
Ref:1141125019
Amount Sent: SLSH4,000
Recipient Name: AXMED XASAN WARSAME
Recipient Phone: 634458520
Date: 12/05/19
Time: 22:33:03
Balance: SLSH44,222.62
I already have the text messages appearing in a ListView. I now want to customize it so that only the portions listed above appear instead of the whole message.
Here is my sample code:
if (Data.contains("Ref:")){
String[] Tx = Data.split("Ref:");
String TxID = Tx[1];
}
This problem would probably be best solved using regular expressions (regex)
Regular expressions allow you to match a string based on a pattern, and extract information from the string.
public static void main(String[] args) {
String data = "( [-ZAAD SHILLING-] Ref:1141125019 SLSH4,000 sent to AXMED XASAN WARSAME(634458520) at 12/05/19 22:33:03, Your Balance is SLSH44,222.62. )";
String headerReg = "\\[-([a-zA-Z\\s]+?)-]";
String refReg = "Ref:([0-9]+)";
String amountReg = "([,.\\w]+)";
String nameReg = "([\\w\\s]+?)";
String accountReg = "\\(([0-9]+)\\)"; // capture the phone number between the parentheses
String dateReg = "([0-9]{2}/[0-9]{2}/[0-9]{2})";
String timeReg = "([0-9]{2}:[0-9]{2}:[0-9]{2})";
String balanceReg = "([,.\\w]+?)";
String finalReg = "\\( " + headerReg + " " + refReg + " " + amountReg + " sent to " + nameReg + accountReg + " at " + dateReg + " " + timeReg + ", Your Balance is " + balanceReg + "\\. \\)";
Pattern pattern = Pattern.compile(finalReg);
Matcher matcher = pattern.matcher(data);
if (matcher.find()) {
MatchResult result = matcher.toMatchResult();
int groups = result.groupCount();
for (int i = 0; i < groups; i++) {
System.out.println(result.group(i + 1));
}
}
}
Using this, we can find the relevant data from your input string.
If the message syntax is always the same then you can use some tricky split strings like this:
String msg = "( [-ZAAD SHILLING-] Ref:1141125019 SLSH4,000 sent to AXMED XASAN WARSAME(634458520) at 12/05/19 22:33:03, Your Balance is SLSH44,222.62. )";
String ref = msg.split("Ref:")[1].split(" ")[0];
String amount = msg.split("Ref:")[1].split(" ")[1].split(" ")[0];
String recipient = msg.split("Ref:")[1].split("sent to ")[1].split("\\(")[0];
String phone = msg.split("Ref:")[1].split("sent to ")[1].split("\\(")[1].split("\\)")[0];
String date = msg.split("Ref:")[1].split("sent to ")[1].split("\\(")[1].split(" at ")[1].split(" ")[0];
String time = msg.split("Ref:")[1].split("sent to ")[1].split("\\(")[1].split(" at ")[1].split(" ")[1].split(",")[0];
String balance = msg.split("Your Balance is ")[1].split("\\)")[0];
System.out.println("ref: "+ref);
System.out.println("amount: "+amount);
System.out.println("recipient: "+recipient);
System.out.println("phone: "+phone);
System.out.println("date: "+date);
System.out.println("time: "+time);
System.out.println("balance: "+balance);
Result is:
ref: 1141125019
amount: SLSH4,000
recipient: AXMED XASAN WARSAME
phone: 634458520
date: 12/05/19
time: 22:33:03
balance: SLSH44,222.62.
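As a variation on the regex approach above, here is a minimal sketch that uses named groups and tolerates the line breaks in the original message. It assumes the message always follows the sample layout; `SmsParser` and the field labels are hypothetical names chosen for illustration. If named groups are unavailable on your target runtime, numbered groups work the same way.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SmsParser {
    // DOTALL lets the lazy name group span the line break inside the recipient name.
    private static final Pattern MESSAGE = Pattern.compile(
            "Ref:(?<ref>\\d+)\\s+(?<amount>\\S+)\\s+sent to\\s+(?<name>.+?)\\((?<phone>\\d+)\\)"
            + "\\s+at\\s+(?<date>\\d{2}/\\d{2}/\\d{2})\\s+(?<time>\\d{2}:\\d{2}:\\d{2}),"
            + "\\s+Your Balance is\\s+(?<balance>[\\w,]+(?:\\.\\d+)?)",
            Pattern.DOTALL);

    /** Returns the extracted fields, or an empty map if the message has another layout. */
    static Map<String, String> parse(String message) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = MESSAGE.matcher(message);
        if (m.find()) {
            fields.put("Ref", m.group("ref"));
            fields.put("Amount Sent", m.group("amount"));
            // Collapse the embedded line break into a single space.
            fields.put("Recipient Name", m.group("name").replaceAll("\\s+", " ").trim());
            fields.put("Recipient Phone", m.group("phone"));
            fields.put("Date", m.group("date"));
            fields.put("Time", m.group("time"));
            // The balance group stops before the sentence-ending period.
            fields.put("Balance", m.group("balance"));
        }
        return fields;
    }
}
```

Returning a map keeps the labels and values together, so each entry can be bound directly to a row in the list view.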
I have the following json document:
{
"videoUrl":"",
"available":"true",
"movie":{
"videoUrl":"http..."
},
"account":{
"videoUrl":"http...",
"login":"",
"password":""
}
}
In this JSON I have a property named videoUrl, and I want to get the first non-empty videoUrl.
My regex:
("videoUrl":)("http.+")
But this regex matches the following string:
"videoUrl" :"http..."},
"account" : {"videoUrl" : "http...","login" : "","password" : ""
How can I write a regex that finds the first non-empty videoUrl together with its value?
(Result should be "videoUrl":"http...")
Add (?!,) at the end of the regex. This negative lookahead rejects a match that is immediately followed by a , and does not consume anything itself:
public static void main(String[] args) {
String input = "{ \n" +
" \"videoUrl\":\"\",\n" +
" \"available\":\"true\",\n" +
" \"movie\":{ \n" +
" \"videoUrl\":\"http...\"\n" +
" },\n" +
" \"account\":{ \n" +
" \"videoUrl\":\"http...\",\n" +
" \"login\":\"\",\n" +
" \"password\":\"\"\n" +
" }\n" +
"} ";
Pattern pattern = Pattern.compile("(\"videoUrl\":)(\"http.+\")(?!,)");
Matcher matcher = pattern.matcher(input);
while (matcher.find()) {
System.out.println(matcher.group()); // "videoUrl":"http..."
}
}
It would be more appropriate to use a JSON parser, such as Gson or Jackson, instead of a regex. Something like:
String jsonStr = "...";
Gson gson = new Gson();
JsonObject json = gson.fromJson(jsonStr, JsonObject.class);
String url = json.get("videoUrl").getAsString(); // reads the top-level "videoUrl" only
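If a regex has to be used anyway, a minimal sketch is to forbid quotes inside the value, so the match cannot run past the closing quote, and to require at least one character, so the empty property is skipped. It assumes URLs never contain escaped quotes; `VideoUrlFinder` and `firstNonEmpty` are hypothetical names used only for illustration.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VideoUrlFinder {
    // [^"]+ needs at least one character and cannot cross the closing quote.
    private static final Pattern NON_EMPTY =
            Pattern.compile("\"videoUrl\"\\s*:\\s*\"([^\"]+)\"");

    /** Returns the value of the first videoUrl property that is not empty. */
    static Optional<String> firstNonEmpty(String json) {
        Matcher m = NON_EMPTY.matcher(json);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Because `find()` is called only once, the first non-empty occurrence in document order wins, whether or not the properties sit on separate lines.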
I have the string below, and I want to add double quotes to it so that it looks like JSON:
[
{
LastName=abc,
FirstName=xyz,
EmailAddress=s#s.com,
IncludeInEmails=false
},
{
LastName=mno,
FirstName=pqr,
EmailAddress=m#m.com,
IncludeInEmails=true
}
]
I want the output below.
[
{
"LastName"="abc",
"FirstName"="xyz",
"EmailAddress"="s#s.com",
"IncludeInEmails"=false
},
{
"LastName"="mno",
"FirstName"="pqr",
"EmailAddress"="m#m.com",
"IncludeInEmails"=true
}
]
I have tried a regex replacement, but it didn't work. Could anyone please help?
String text= jsonString.replaceAll("[^\\{\\},]+", "\"$0\"");
System.out.println(text);
thanks
The regex way, similar to what you have tried:
String jsonString = "[ \n" + "{ \n" + " LastName=abc, \n" + " FirstName=xyz, \n"
+ " EmailAddress=s#s.com, \n" + " IncludeInEmails=false \n" + "}, \n" + "{ \n"
+ " LastName=mno, \n" + " FirstName=pqr, \n" + " EmailAddress=m#m.com, \n" + " Number=123, \n"
+ " IncludeInEmails=true \n" + "} \n" + "] \n";
System.out.println("Before:\n" + jsonString);
jsonString = jsonString.replaceAll("([\\w]+)[ ]*=", "\"$1\" ="); // to quote before = value
jsonString = jsonString.replaceAll("=[ ]*([\\w#\\.]+)", "= \"$1\""); // to quote after = value, add special character as needed to the exclusion list in regex
jsonString = jsonString.replaceAll("=[ ]*\"([\\d]+)\"", "= $1"); // to un-quote decimal value
jsonString = jsonString.replaceAll("\"true\"", "true"); // to un-quote boolean
jsonString = jsonString.replaceAll("\"false\"", "false"); // to un-quote boolean
System.out.println("===============================");
System.out.println("After:\n" + jsonString);
Since there are a lot of corner cases, like character escaping, booleans, numbers, ... a simple regex won't do.
You could split the input string by newline and then handle each key-value-pair separately
for (String line : input.split("\\R")) {
// split by "=" and handle key and value
}
But again, you will have to handle char. escaping, booleans, ... (and btw, = is not a valid JSON key-value separator, only : is).
I'd suggest using GSON since it provides lenient parsing. Using Maven you can add it to your project with this dependency:
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.6.2</version>
</dependency>
You can then parse your input string using
String output = new JsonParser()
.parse(input)
.toString();
Just use this library http://mvnrepository.com/artifact/com.googlecode.json-simple/json-simple/1.1
Here is code for your example:
JSONArray json = new JSONArray();
JSONObject key1 = new JSONObject();
key1.put("LastName", "abc");
key1.put("FirstName", "xyz");
key1.put("EmailAddress", "s#s.com");
key1.put("IncludeInEmails", false);
JSONObject key2 = new JSONObject();
key2.put("LastName", "mno");
key2.put("FirstName", "pqr");
key2.put("EmailAddress", "m#m.com");
key2.put("IncludeInEmails", true);
json.add(key1);
json.add(key2);
System.out.println(json.toString());
Use the code below to get the output you expect:
public class jsonTest {
public static void main(String[] args){
String test="[{ LastName=abc, FirstName=xyz, EmailAddress=s#s.com,IncludeInEmails=false},{ LastName=mno, FirstName=pqr, EmailAddress=m#m.com, IncludeInEmails=true}]";
String reg= test.replaceAll("[^\\{\\},]+", "\"$0\"");
String value=reg.replace("\"[\"{", "[{").replace("=","\"=\"").replace(" ","").replace("}\"]\"","}]").replace("\"true\"", "true").replace("\"false\"", "false");
System.out.println("value :: "+value);
}
}
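The chained replacements above can also be folded into a single pass. The following is a minimal sketch, assuming values contain no white space, commas, braces, brackets or quotes; it emits `:` rather than `=` so that the result is valid JSON, and it leaves booleans and plain numbers unquoted. `KeyValueQuoter` is a hypothetical name used only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class KeyValueQuoter {
    // key = value, where the value runs up to the next separator or bracket.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)\\s*=\\s*([^,\\s{}\\[\\]]+)");
    // Values that JSON allows without quotes.
    private static final Pattern BARE = Pattern.compile("true|false|-?\\d+(\\.\\d+)?");

    /** Rewrites every key=value pair as "key": value, quoting string values. */
    static String toJson(String input) {
        Matcher m = PAIR.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = m.group(2);
            String rendered = BARE.matcher(value).matches() ? value : "\"" + value + "\"";
            // quoteReplacement keeps '$' and '\' in the data from being read as group references.
            m.appendReplacement(out,
                    Matcher.quoteReplacement("\"" + m.group(1) + "\": " + rendered));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Everything outside the matched pairs (brackets, commas, white space) is copied through unchanged by `appendReplacement` and `appendTail`.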
I have a string, let's call it output, that equals the following:
ltm data-group internal str_testclass {
records {
baz {
data "value 1"
}
foobar {
data "value 2"
}
topaz {}
}
type string
}
And I'm trying to extract the substring between the quotes for a given "record" name. So given foobar, I want to extract value 2. The substring I want will always come in the form shown above: the "record" name, a whitespace, an open brace, a new line, whitespace, the string data, and then the substring I want to capture between the quotes. The one exception is when there is no value, which will always look like topaz above: the "record" name followed by just an open and a closed brace, in which case I'd like to get an empty string. How could I write a line of Java to capture this? So far I have:
String myValue = output.replaceAll("(?:foobar\\s{\n\\s*data "([^\"]*)|()})","$1 $2");
But I'm not sure where to go from here.
Let's start by extracting the "records" structure with the following regex: ltm\s+data-group\s+internal\s+str_testclass\s*\{\s*records\s*\{\s*(?<records>([^\s}]+\s*\{\s*(data\s*"[^"]*")?\s*\}\s*)*)\}\s*type\s*string\s*\}
Then, within the "records" group, find successive matches against [^\s}]+\s*\{\s*(?:data\s*"(?<data>[^"]*)")?\s*\}\s*. The "data" group contains what you're looking for and will be null in the "topaz" case.
Java strings:
"ltm\\s+data-group\\s+internal\\s+str_testclass\\s*\\{\\s*records\\s*\\{\\s*(?<records>([^\\s}]+\\s*\\{\\s*(data\\s*\"[^\"]*\")?\\s*\\}\\s*)*)\\}\\s*type\\s*string\\s*\\}"
"[^\\s}]+\\s*\\{\\s*(?:data\\s*\"(?<data>[^\"]*)\")?\\s*\\}\\s*"
Demo:
String input =
"ltm data-group internal str_testclass {\n" +
" records {\n" +
" baz {\n" +
" data \"value 1\"\n" +
" }\n" +
" foobar {\n" +
" data \"value 2\"\n" +
" }\n" +
" topaz {}\n" +
" empty { data \"\"}\n" +
" }\n" +
" type string\n" +
"}";
Pattern language = Pattern.compile("ltm\\s+data-group\\s+internal\\s+str_testclass\\s*\\{\\s*records\\s*\\{\\s*(?<records>([^\\s}]+\\s*\\{\\s*(data\\s*\"[^\"]*\")?\\s*\\}\\s*)*)\\}\\s*type\\s*string\\s*\\}");
Pattern record = Pattern.compile("(?<name>[^\\s}]+)\\s*\\{\\s*(?:data\\s*\"(?<data>[^\"]*)\")?\\s*\\}\\s*");
Matcher lgMatcher = language.matcher(input);
if (lgMatcher.matches()) {
String records = lgMatcher.group("records");
Matcher rdMatcher = record.matcher(records);
while (rdMatcher.find()) {
System.out.printf("%s:%s%n", rdMatcher.group("name"), rdMatcher.group("data"));
}
} else {
System.err.println("Language not recognized");
}
Output:
baz:value 1
foobar:value 2
topaz:null
empty:
Alternatives: as you're parsing a custom language, you could try writing an ANTLR grammar or creating a Groovy DSL.
Your regex shouldn't even compile, because you are not escaping the " inside your regex String, so it is ending your String at the first " inside your regex.
Instead, try this regex:
String regex = key + "\\s\\{\\s*\\n\\s*data\\s*\"([^\"]*)\"";
You can check out how it works here on regex101.
Try something like this getRecord() method where key is the record 'name' you're searching for, e.g. foobar, and the input is the string you want to search through.
public static void main(String[] args) {
String input = "ltm data-group internal str_testclass { \n" +
" records { \n" +
" baz { \n" +
" data \"value 1\" \n" +
" } \n" +
" foobar { \n" +
" data \"value 2\" \n" +
" }\n" +
" topaz {}\n" +
" } \n" +
" type string \n" +
"}";
String bazValue = getRecord("baz", input);
String foobarValue = getRecord("foobar", input);
String topazValue = getRecord("topaz", input);
System.out.println("Record data value for 'baz' is '" + bazValue + "'");
System.out.println("Record data value for 'foobar' is '" + foobarValue + "'");
System.out.println("Record data value for 'topaz' is '" + topazValue + "'");
}
private static String getRecord(String key, String input) {
String regex = key + "\\s\\{\\s*\\n\\s*data\\s*\"([^\"]*)\"";
final Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(input);
if (matcher.find()) {
//if we find a record with data return it
return matcher.group(1);
} else {
//else see if the key exists with empty {}
final Pattern keyPattern = Pattern.compile(key);
Matcher keyMatcher = keyPattern.matcher(input);
if (keyMatcher.find()) {
//return empty string if key exists with empty {}
return "";
} else {
//else handle error, throw exception, etc.
System.err.println("Record not found for key: " + key);
throw new RuntimeException("Record not found for key: " + key);
}
}
}
Output:
Record data value for 'baz' is 'value 1'
Record data value for 'foobar' is 'value 2'
Record data value for 'topaz' is ''
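The two lookups in getRecord() can be merged into one expression by making the data part optional. This is a minimal sketch under the assumption that record names are preceded by white space and that values contain no escaped quotes; `RecordLookup` is a hypothetical name, and `Pattern.quote` is used so that a key containing regex metacharacters is matched literally.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RecordLookup {
    /**
     * Returns the data value of the named record, an empty string for "name {}",
     * or Optional.empty() if no such record exists.
     */
    static Optional<String> find(String key, String input) {
        Pattern p = Pattern.compile(
                "(?<!\\S)" + Pattern.quote(key)               // whole name, taken literally
                + "\\s*\\{\\s*(?:data\\s*\"([^\"]*)\")?\\s*\\}"); // optional data "..."
        Matcher m = p.matcher(input);
        if (!m.find()) {
            return Optional.empty();
        }
        String data = m.group(1); // null when the braces are empty
        return Optional.of(data == null ? "" : data);
    }
}
```

Returning an `Optional` leaves it to the caller to decide whether a missing record is an error, instead of throwing from inside the lookup.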
You could try
(?:foobar\s\{\s*data "(.*)")
I think the replaceAll() isn't necessary here. Would something like this work:
String var1 = "foobar";
String regex = "(?:" + var1 + "\\s\\{\\n\\s*data \"([^\"]*)\")";
You can then use this as your regex to pass into your pattern and matcher to find the substring.
You can simply turn this into a function so that you can pass in the record name you are searching for:
public static String SearchString(String str)
{
// build the regex for the given record name; pass the result to Pattern.compile()
return "(?:" + str + "\\s\\{\\n\\s*data \"([^\"]*)\")";
}
aa: {
one: "hello",
two: "good",
three: "bye",
four: "tomorrow",
},
"bb": {
"1": "a quick fox",
"2": "a slow bird",
"3": "a smart dog",
"4": "a wilf flowert",
My data looks something like the above.
What I want to select is all the text within "" that is on the right side of the :, including the "" marks.
What I have so far is
: ("(.*?)")
but it also selects the :, which isn't what I want.
If you must use a regular expression, you can try the Matcher.group() method as found here.
public class TestClass {
public static void main(String[] args) {
String input = "aa: {\n" +
" one: \"hello\",\n" +
" two: \"good\",\n" +
" three: \"bye\",\n" +
" four: \"tomorrow\",\n" +
" },\n" +
" \"bb\": {\n" +
" \"1\": \"a quick fox\",\n" +
" \"2\": \"a slow bird\",\n" +
" \"3\": \"a smart dog\",\n" +
" \"4\": \"a wilf flowert\",\n";
// the actual code you need
Pattern pattern = Pattern.compile("(: )(\".+\")");
Matcher match = pattern.matcher(input);
while (match.find()) {
// here you go, only the value without the :
String value = match.group(2);
System.out.println("Found one = " + value);
}
}
}
This results in the following for me:
Found one = "hello"
Found one = "good"
Found one = "bye"
Found one = "tomorrow"
Found one = "a quick fox"
Found one = "a slow bird"
Found one = "a smart dog"
Found one = "a wilf flowert"
Try this:
String p = "(?<=:\\s{0,10})\"[^\"]*\"";
Pattern pat = Pattern.compile(p);
String s =
"aa: {\n" +
" one: \"hello\",\n" +
" two: \"good\",\n" +
" three: \"bye\",\n" +
" four: \"tomorrow\",\n" +
"" +
" },\n" +
" \"bb\": {\n" +
" \"1\": \"a quick fox\",\n" +
" \"2\": \"a slow bird\",\n" +
" \"3\": \"a smart dog\",\n" +
" \"4\": \"a wilf flowert\",\n";
Matcher m = pat.matcher(s);
while (m.find())
System.out.println(m.group());
result:
"hello"
"good"
"bye"
"tomorrow"
"a quick fox"
"a slow bird"
"a smart dog"
"a wilf flowert"
One possible regex is:
(?<=\: )\"*.*\"
(?<=\: ) checks that there is a colon before the prospective string, but does not select it in the regex selection. The rest selects the quotes and the string they surround.
String testData = "test: \"Hello\"";
Pattern p = Pattern.compile("(?<=\\: )\\\"*.*\\\"");
Matcher m = p.matcher(testData);
while (m.find()) {
System.out.println(testData.substring(m.start(), m.end()));
}
I strongly recommend using a JSON parser as opposed to a regex, as suggested by fge.
Even though your code is not technically valid JSON, it would be much more efficient and you would avoid reinventing the wheel.
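If you stay with a regex, note that the patterns in this thread stop at the first inner quote when a value contains an escaped quote. A minimal sketch that allows backslash escapes inside the value and returns the quoted strings (quotes included, colon excluded) could look like this; `QuotedValues` is a hypothetical name used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedValues {
    // A ':' and optional white space, then a quoted string that may contain \" escapes.
    private static final Pattern VALUE =
            Pattern.compile(":\\s*(\"(?:[^\"\\\\]|\\\\.)*\")");

    /** Returns every quoted value that follows a colon, including its quotes. */
    static List<String> extract(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = VALUE.matcher(input);
        while (m.find()) {
            values.add(m.group(1)); // group 1 excludes the colon and the white space
        }
        return values;
    }
}
```

Using a capture group instead of a lookbehind avoids the bounded-width restriction on lookbehinds, so any amount of white space after the colon is accepted.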