I have the following code:
public static void main(String[] args) {
String key = "myjsonkey";
String baseJson = "{\"" + key + "\":\"my json %svalue\"}";
String inBackslashAndN = String.format(baseJson, "\\n");
String inNewline = String.format(baseJson, "\n");
String outBackslashAndN = valueFromJson(key, inBackslashAndN);
String outNewLine = valueFromJson(key, inNewline);
System.out.print("\nInput strings matching: ");
System.out.println(inBackslashAndN.equals(inNewline));
System.out.print("Output strings matching: ");
System.out.println(outBackslashAndN.equals(outNewLine));
}
private static String valueFromJson(String key, String jsonStr) {
System.out.println("\nINPUT: " + jsonStr);
JsonObject json = new JsonParser().parse(jsonStr).getAsJsonObject();
String output = json.get(key).getAsString();
System.out.println("\nOUTPUT: " + output);
return output;
}
Output:
INPUT: {"myjsonkey":"my json \nvalue"}
OUTPUT: my json
value
INPUT: {"myjsonkey":"my json
value"}
OUTPUT: my json
value
Input strings matching: false
Output strings matching: true
My question is: Why does JSON parse both "\n" and "\\n" as newline and is there a way to force different parsing of these two without changing the original data?
I am using gson 2.7
EDIT: I am aware that "\n" is processed into the new line control character and the "\\n" is the sequence of the character 'backslash' and the character 'n' in Java. My question remains the same.
JSON does not support literal newlines inside strings. source: http://json.org/
A newline must be represented as \n. Gson most likely accepts either an already escaped backslash + n or a literal newline and normalizes both to backslash + n inside its JSON representation, which, when converted back to a string, is parsed into a literal newline again.
In Java source, \n is the line feed control character, while \\n is two characters: a backslash and the letter n.
Both cases are inserted into a JSON string "...". The second version is therefore a JSON escape sequence, which the parser converts to a line feed. And evidently, in the first case, Gson tolerates a literal line feed character inside a string.
Why does JSON parse both "\n" and "\\n" as newline?
\n is processed into an actual, literal newline character (i.e. Unicode 000A). \\n is equivalent to the string "\n" which the JSON parser (correctly) parses as a newline as "\n" is a newline in JSON. You might need \\\\n if you want an actual "\n". See JSON.org, escape sequences are on the right under "char". When you end up operating through several languages (e.g. Java + Regex/JSON) you tend to get some confusing nesting of escape sequences.
JSON itself technically doesn't support literal newlines in strings, either. Gson takes care of this for you, though, by converting the newline to "\n".
Is there a way to force different parsing of these two without changing the original data?
I believe Gson does not provide a way to do this, and it wouldn't make much sense according to the JSON standard. You could escape the backslashes in the raw text before parsing it:
String escaped = myString.replace("\\", "\\\\");
or with regular expressions:
String escaped = myString.replaceAll("\\\\", "\\\\\\\\");
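To see what the replace call above does without involving Gson, here is a minimal sketch. The class name EscapeDemo and the method doubleBackslashes are made up for illustration. It doubles every backslash, so a JSON parser would later read an escaped backslash followed by n instead of the \n escape. A literal line feed contains no backslash, so that input passes through unchanged and would still need separate handling.

```java
class EscapeDemo {
    // Doubles every backslash so a JSON parser reads it as an escaped backslash.
    // String.replace works on literal text, so no regex escaping is needed here.
    static String doubleBackslashes(String raw) {
        return raw.replace("\\", "\\\\");
    }

    public static void main(String[] args) {
        String backslashAndN = "my json \\nvalue"; // backslash followed by 'n'
        String newline = "my json \nvalue";        // a single line feed character

        // prints: my json \\nvalue
        System.out.println(doubleBackslashes(backslashAndN));
        // prints: true (there is no backslash to double)
        System.out.println(doubleBackslashes(newline).equals(newline));
    }
}
```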
Related
I want to remove only the starting special character i.e. " and ending special character ".
The given input string is
String input= "\"This#string%contains^special*characters&.\"";
The expected output is
String output= "This#string%contains^special*characters&.";
Can anyone help me do this in Java, using a regex or any other way, so that only the starting and ending special characters are removed from the input string?
I would use a regex approach:
String input= "\"This#string%contains^special*characters&.\"";
String output = input.replaceAll("^[^A-Za-z0-9]|[^A-Za-z0-9]$", "");
System.out.println(output);
This prints:
This#string%contains^special*characters&.
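Note that the character class above removes any non-alphanumeric first or last character, so an unquoted input ending in a period would lose that period. If only a surrounding double quote should go, a narrower pattern may be safer. A small sketch, with placeholder class and method names:

```java
class QuoteStripper {
    // Removes a double quote at the very start and at the very end, if present.
    // Every other character, special or not, is left alone.
    static String stripOuterQuotes(String s) {
        return s.replaceAll("^\"|\"$", "");
    }

    public static void main(String[] args) {
        String input = "\"This#string%contains^special*characters&.\"";
        // prints: This#string%contains^special*characters&.
        System.out.println(stripOuterQuotes(input));
        // prints: no quotes here.
        System.out.println(stripOuterQuotes("no quotes here."));
    }
}
```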
I want to replace ";" with "\n" except when it's escaped with a leading '\'. I haven't figured out the correct regex.
Here is what I have:
String s = "abc;efg\\;hij;pqr;xyz\\;123";
s.replaceAll("\\[^\\\\];", "\\\\n");
I'd expect the above string to be replaced with "abc\nefg\;hij\npqr\nxyz\;123"
Use a negative look behind:
s = s.replaceAll("(?<!\\\\);", "\n");
The expression (?<!\\) (coded as a java string literal "(?<!\\\\)") means "the previous character should not be a backslash"
Test code:
String s = "abc;efg\\;hij;pqr;xyz\\;123";
s = s.replaceAll("(?<!\\\\);", "\n");
System.out.println(s);
Output:
abc
efg\;hij
pqr
xyz\;123
In Java, I want to print a label with a String as the input:
String command
= "N\n"
+ "A50,5,0,1,2,2,N,\"" + name + "\"\n"
+ "P1\n";
But when the input (name) has a double quote character ("), it is blank and prints nothing. I have tried using the replace function:
name.replace('"', '\u0022');
but it doesn't work. I want that double quote printed in label, how can I do this?
Sending the " character in the text field of the EPL string makes the EPL code think it is the end of the string you are trying to print.
So, if you want to send (and print) "hello", you have to put a backslash before each " character and send \"hello\"
You also have to do that for backslashes.
So, your (EPL) output to the printer would have quotes to begin and end the string, and \" to print the quote characters WITHIN the string:
A30,210,0,4,1,1,N,"\"hello\""\n
Also remember that you have to escape these characters again to build a C# string, so in C# it would look like this:
outputEPLStr += "A30,210,0,4,1,1,N,\"\\\"hello\\\"\"\n";
[which contains 6 escaped characters]
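Since the question is about Java, here is a rough Java equivalent. It relies only on the escaping rule described in this answer (a backslash before each " and each \ inside the text field) and has not been checked against a printer. The names EplEscape, escapeForEpl, and buildLabel are invented for the example.

```java
class EplEscape {
    // Escapes backslashes first, then double quotes. Doing it in this order
    // keeps the backslashes added for the quotes from being doubled again.
    static String escapeForEpl(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the label command from the question around the escaped name.
    static String buildLabel(String name) {
        return "N\n"
             + "A50,5,0,1,2,2,N,\"" + escapeForEpl(name) + "\"\n"
             + "P1\n";
    }

    public static void main(String[] args) {
        // The second line printed is: A50,5,0,1,2,2,N,"\"hello\""
        System.out.print(buildLabel("\"hello\""));
    }
}
```

Because String.replace returns a new string, the result has to be used or assigned; calling it on name and discarding the return value has no effect.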
A couple of points:
The replace method returns a new string and leaves the original unchanged, so you need to assign the result back, something like:
command = command.replace...
The double quote has a special meaning in Java string literals and hence needs to be escaped there. For example, the following removes the quotes from name before building the command:
name = name.replace("\"", "");
String command
= "N\n"
+ "A50,5,0,1,2,2,N,\"" + name + "\"\n"
+ "P1\n";
System.out.println(command);
I want to split a string by an array of characters,
so I have this code:
String target = "hello,any|body here?";
char[] delim = {'|',',',' '};
String regex = "(" + new String(delim).replaceAll("(.)", "\\\\$1|").replaceAll("\\|$", ")");
String[] result = target.split(regex);
Everything works fine except when I add a character like 'Q' to the delim[] array;
then it throws an exception:
java.util.regex.PatternSyntaxException: Illegal/unsupported escape sequence near index 11
(\ |\,|\||\Q)
So how can I fix that to work with non-special characters as well?
Thanks in advance.
how can i fix that to work with non-special characters as well
Put square brackets around your characters instead of escaping them. If ^ is included in your list of characters, make sure it is not the first character, or escape it separately if it is the only character in the list.
Dashes also need special treatment: they need to go at the beginning or at the end of the character class.
String delimStr = new String(delim);
String regex;
if (delimStr.equals("^")) {
// A lone ^ cannot form a character class, so escape it on its own.
regex = "\\^";
} else if (delimStr.charAt(0) == '^') {
// This assumes that all characters are distinct.
// You may need a stricter check to make this work in general case.
// Put the second character first so that ^ no longer leads the class.
regex = "[" + delimStr.charAt(1) + delimStr + "]";
} else {
regex = "[" + delimStr + "]";
}
Using Pattern.quote and putting it in square brackets seems to work:
String regex = "[" + Pattern.quote(new String(delim)) + "]";
Tested with possible problem characters.
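A variation on the same idea, as a sketch only: quote each delimiter on its own with Pattern.quote and join the pieces with |, so that no character (including Q, ^, or -) needs position-dependent handling. The class and method names are invented for the example, and the sketch assumes the delimiter array is not empty.

```java
import java.util.regex.Pattern;

class DelimiterSplit {
    // Builds an alternation in which every delimiter is quoted individually,
    // so each one is matched as a literal character.
    static String buildRegex(char[] delims) {
        StringBuilder sb = new StringBuilder();
        for (char c : delims) {
            if (sb.length() > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(String.valueOf(c)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char[] delim = {'|', ',', ' ', 'Q'};
        String[] result = "hello,any|body here?".split(buildRegex(delim));
        // prints: hello/any/body/here?
        System.out.println(String.join("/", result));
    }
}
```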
Q is not a control character in a regex, so you do not have to put the \\ before it (it only serves to mark that you must interpret the following character as a literal, and not as a control character).
Example
`\\.` in a regex means "a dot"
`.` in a regex means "any character"
\\Q fails because Q is not a special character in a regex, so it does not need to be quoted.
I would make delim a String array and add the quotes to these values that need it.
delim = {"\\|", ..... "Q"};
I am using Jre 1.6.
I am executing the following lines of code:
String unicodeValue = "\\u001B"; text = text.replaceAll("" + character, unicodeValue);
Here, text is a string object containing an invalid XML character of Unicode value '\u001B'.
So, I am converting the invalid XML character to its Unicode value to write in the XML.
But on doing text.replaceAll, the '\' is getting stripped and the character is replaced by 'u001B'.
Can anyone please suggest a way to retain the '\' after replacing the character with its unicode value ?
The problem is that str.replaceAll(regex, repl) is defined as returning the same as
Pattern.compile(regex).matcher(str).replaceAll(repl)
But the documentation for replaceAll says,
Note that backslashes (\) and dollar signs ($) in the replacement string may cause the results to be different than if it were being treated as a literal replacement string. Dollar signs may be treated as references to captured subsequences as described above, and backslashes are used to escape literal characters in the replacement string.
So this means we need to add several extra layers of escaping:
public class Foo {
public static void main(String[] args)
{
String unicodeValue = "\u001B";
String escapedUnicodevalue = "\\\\u001B";
String text = "invalid" + unicodeValue + "string";
text = text.replaceAll(unicodeValue, escapedUnicodevalue);
System.out.println(text);
}
}
prints invalid\u001Bstring as desired.
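Two alternatives that avoid counting backslashes by hand, offered as a sketch: String.replace treats both arguments as literal text, and Matcher.quoteReplacement escapes a replacement string for use with replaceAll. Both are standard library methods available since Java 5; the class and method names below are placeholders.

```java
import java.util.regex.Matcher;

class LiteralReplace {
    // String.replace involves no regex, so the replacement is taken literally.
    static String viaReplace(String text) {
        return text.replace("\u001B", "\\u001B");
    }

    // quoteReplacement escapes backslashes and dollar signs for replaceAll.
    static String viaQuoteReplacement(String text) {
        return text.replaceAll("\u001B", Matcher.quoteReplacement("\\u001B"));
    }

    public static void main(String[] args) {
        String text = "invalid" + "\u001B" + "string";
        // Both lines print the text with a visible backslash before u001B.
        System.out.println(viaReplace(text));
        System.out.println(viaQuoteReplacement(text));
    }
}
```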
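In a replaceAll replacement, use \\\\ in the Java string literal to produce a single literal \ (one level of escaping for the Java literal, one for replaceAll):
String unicodeValue = "\\\\u001B"; text = text.replaceAll("" + character, unicodeValue);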
This ran perfectly. I tested it.
char character = 0x1b;
String unicodeValue = "\\\\u001B";
String text = "invalid " + character + " string";
System.out.println(text);
text = text.replaceAll("" + character, unicodeValue);
System.out.println(text);
Just used a concept of RegEx.