I have encoded my string (say String a = "123+gtyt") using the URLEncoder class. The encoded string is String b. I then send String b as a parameter appended to a URL, say http://example.com?request=b.
When I decode the string at example.com using URLDecoder, the + symbol in my string is missing, so I do not get String a back after decoding.
But when I print String b at example.com without decoding it, I get exactly String a.
So my question is: is the decoding done by the browser itself while redirecting?
When you encode "123+gtyt", the plus sign is encoded as %2B.
When you handle an HTTP request, the servlet API automatically decodes the parameter back to "123+gtyt". If you decode it a second time, the "+" is turned into a space.
So the key is: do not decode parameters explicitly.
For example:
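// The one-argument encode/decode overloads are deprecated; the Charset overloads used here need Java 10+.
final String encoded = URLEncoder.encode("123+gtyt", StandardCharsets.UTF_8); // 123%2Bgtyt
final String decoded = URLDecoder.decode(encoded, StandardCharsets.UTF_8);
System.out.println("decoded = " + decoded); // 123+gtyt
// Decoding a second time treats the literal "+" as an encoded space.
System.out.println("URLDecoder.decode(decoded) = "
        + URLDecoder.decode(decoded, StandardCharsets.UTF_8)); // prints 123 gtyt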
I'm trying to save the content of a PDF file in JSON and thought of saving the PDF as a String value converted from byte[].
byte[] byteArray = feature.convertPdfToByteArray(Paths.get("path.pdf"));
String byteString = new String(byteArray, StandardCharsets.UTF_8);
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
System.out.println(secondString.equals(byteString));
System.out.println(Arrays.equals(byteArray, newByteArray));
System.out.println(byteArray.length + " vs " + newByteArray.length);
The result of the above code is as follows:
true
false
421371 vs 760998
The two Strings are equal while the two byte[]s are not. Why is that, and how do I correctly convert/save a PDF inside JSON?
You are probably using the wrong charset when reading from the PDF file.
For example, the character é (e with acute) is a single byte (0xE9) in ISO-8859-1, and that byte on its own is not valid UTF-8:
byte[] byteArray = "é".getBytes(StandardCharsets.ISO_8859_1);
String byteString = new String(byteArray, StandardCharsets.UTF_8);
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
System.out.println(secondString.equals(byteString));
System.out.println(Arrays.equals(byteArray, newByteArray));
System.out.println(byteArray.length + " vs " + newByteArray.length);
Output :
true
false
1 vs 3
Why is that
If the byteArray indeed contains a PDF, it most likely is not valid UTF-8. Thus, wherever
String byteString = new String(byteArray, StandardCharsets.UTF_8);
stumbles over a byte sequence which is not valid UTF-8, it will replace that by a Unicode replacement character. I.e. this line damages your data, most likely beyond repair. So the following
byte[] newByteArray = byteString.getBytes(StandardCharsets.UTF_8);
does not result in the original byte array but instead a damaged version of it.
The newByteArray, on the other hand, is the result of UTF-8 encoding a given string, byteString. Thus, newByteArray is valid UTF-8 and
String secondString = new String(newByteArray, StandardCharsets.UTF_8);
does not need to replace anything; in particular, byteString and secondString are equal.
how to correctly convert/save a pdf inside a json?
As @mammago explained in his comment,
JSON is not the appropriate format for binary content (like files). You should probably use something like base64 to create a string out of your PDF and store that in your JSON object.
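As a minimal sketch of the Base64 suggestion above, assuming Java 8+ and the standard java.util.Base64 class: the class name PdfBase64Sketch, its helper methods, and the hand-built JSON string are only illustrative, and the short byte array stands in for real PDF content (it deliberately contains bytes that are not valid UTF-8).

```java
import java.util.Arrays;
import java.util.Base64;

public class PdfBase64Sketch {
    // Encodes arbitrary bytes as Base64 text, which is safe to place in a JSON string.
    public static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Reverses toBase64 and returns the original bytes.
    public static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for PDF content: "%PDF" followed by bytes that are invalid as UTF-8.
        byte[] pdfBytes = {0x25, 0x50, 0x44, 0x46, (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        String encoded = toBase64(pdfBytes);
        // Hypothetical JSON layout; a real application would use its JSON library here.
        String json = "{\"pdf\":\"" + encoded + "\"}";
        System.out.println(json);

        byte[] restored = fromBase64(encoded);
        System.out.println(Arrays.equals(pdfBytes, restored)); // true
    }
}
```

Unlike the new String(byteArray, UTF_8) round trip, this one is lossless, at the cost of the text being roughly a third larger than the binary data.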
How can I encode this String in Java?
http://demo.pl/sample?id=tests%trg=https%3A%2F%2Fwww.google.com%sample.html%3Fwmc%3DAFF48+_LS.%23%7NUMBER_ID%7D_%23%7NUMBER_ID%7D..
java.net.URLEncoder encodes this String like this:
http%3A%2F%2Fdemo.pl%2Fsample%3Fid%3Dtests%25trg%3Dhttps%253A%252F%252Fwww.google.com%25sample.html%253Fwmc%253DAFF48%2B_LS.%2523%257NUMBER_ID%257D_%2523%257NUMBER_ID%257D..
I expect this result:
http%3A%2F%2Fdemo.pl%2Fsample%3Fid%3Dtests%25trg%3Dhttps%3A%2F%2Fwww.google.com%sample.html%3Fwmc%3DAFF48+_LS.%23%7NUMBER_ID%7D_%23%7NUMBER_ID%7D..
I think the following code can help you:
String s = "http://demo.pl/sample?id=tests%trg=https%3A%2F%2Fwww.google.com%sample.html%3Fwmc%3DAFF48+_LS.%23%7NUMBER_ID%7D_%23%7NUMBER_ID%7D";
int i = s.indexOf("%");
String result1 = URLEncoder.encode(s.substring(0, i)) + "%25" + s.substring(i + 1);
System.out.println(result1); // print http%3A%2F%2Fdemo.pl%2Fsample%3Fid%3Dtests%25trg=https%3A%2F%2Fwww.google.com%sample.html%3Fwmc%3DAFF48+_LS.%23%7NUMBER_ID%7D_%23%7NUMBER_ID%7D
I do not want to encode the already-encoded part of the String. I need a universal algorithm. The String is not always partially encoded.
I think a universal algorithm is impossible in that case. What you can do is find the encoded part manually and not encode it again (see the code above).
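If a heuristic is acceptable, one possible approach is to leave anything that already looks like a %XX escape untouched and encode everything else. The sketch below assumes Java 10+ (for the Charset overload of URLEncoder.encode); the class and method names are made up for illustration. It is not a universal solution: it cannot tell an intentional escape from literal text that happens to look like one, and it will not reproduce the exact expected result from the question.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PartialEncoder {
    // Matches a percent sign followed by exactly two hex digits, e.g. %3A.
    private static final Pattern ESCAPE = Pattern.compile("%[0-9A-Fa-f]{2}");

    // Encodes the input but copies existing %XX escapes through unchanged.
    public static String encodeUnescaped(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = ESCAPE.matcher(input);
        int last = 0;
        while (m.find()) {
            // Encode the text between the previous escape and this one.
            out.append(URLEncoder.encode(input.substring(last, m.start()), StandardCharsets.UTF_8));
            out.append(m.group()); // keep the existing escape as-is
            last = m.end();
        }
        // Encode whatever follows the last escape.
        out.append(URLEncoder.encode(input.substring(last), StandardCharsets.UTF_8));
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeUnescaped("a b%20c%zz")); // a+b%20c%25zz
    }
}
```

Note that URLEncoder produces form encoding, so a space becomes "+" and a literal "+" becomes %2B.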
String original = "This is my string valúe";
I'm trying to encode the above string to its UTF-8 equivalent, but replacing only the special character (ú) with "&uacute;" in this case.
I've tried using the below but I get an error:
Input is not proper UTF-8, indicate encoding !Bytes: 0xFA 0x20 0x63 0x61
Code:
String original = new String("This is my string valúe");
byte ptext[] = original.getBytes("UTF-8");
String value = new String(ptext, "UTF-8");
System.out.println("Output : " + value);
This is my string valúe
You could use String.replace(CharSequence, CharSequence) and formatted I/O like
String original = "This is my string valúe";
System.out.printf("Output : %s%n", original.replace("ú", "&uacute;"));
Which outputs (as I think you wanted)
Output : This is my string val&uacute;e
You seem to want to use XML character entities.
Apache Commons Lang has a method for this (in StringEscapeUtils).
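If pulling in a library is not an option, a rough sketch with only the JDK could replace every non-ASCII character with a numeric character reference, which is valid in both XML and HTML. The class and method names here are hypothetical, and this does not escape &, < or > the way a full escaper would.

```java
public class EntityEscapeSketch {
    // Replaces each code point above ASCII with a decimal numeric character reference.
    public static String escapeNonAscii(String input) {
        StringBuilder out = new StringBuilder();
        input.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp); // plain ASCII is copied unchanged
            } else {
                out.append("&#").append(cp).append(';'); // e.g. U+00FA becomes &#250;
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00FA is the character ú, written as an escape to avoid source-encoding issues.
        System.out.println(escapeNonAscii("This is my string val\u00FAe")); // This is my string val&#250;e
    }
}
```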
I'm trying to encode the above string to UTF-8 equivalent but to replace only special character (ú) with -- "&uacute;" in this case.
I'm not sure what encoding "&uacute;" is, but have you tried looking at the URLEncoder class? It won't encode the string exactly the way you asked, but it gets rid of the spooky character.
Could you please try the below lines:
byte ptext[] = original.getBytes("UTF8");
String value = new String(ptext, "UTF8");
I have some encoded log information, cast to a string for transmission purposes (the cast might be ugly, but it works).
I'm trying to cast it back to a byte[] in order to decode it, but it's not working:
byte[] encodedBytes = android.util.Base64.encode((login + ":" + password).getBytes(), NO_WRAP);
String encoded = "Authentification " + encodedBytes;
String to_decode = encoded.substring(17);
byte[] cast1 = to_decode; // error
byte[] cast2 = (byte[]) to_decode; // error
byte[] cast3 = to_decode.getBytes();
// no error, but i get something totally different from encodedBytes (the array is even half the size of encodedBytes)
// and when i decode it i got an IllegalArgumentException
These three casts are not working. Any ideas?
There are multiple problems here.
In general, you need to use Base64.decode in order to reverse the result of Base64.encode:
byte[] data = android.util.Base64.decode(to_decode, DEFAULT);
In general, you should always ask yourself "How did I perform the conversion from type X to type Y?" when working out how to get back from type Y to type X.
Note that you've got a typo in your code too - "Authentification" should be "Authentication".
However, you've also got a problem with your encoding - you're creating a byte[], and using string concatenation with that will call toString() on the byte array, which is not what you want. You should call encodeToString instead. Here's a complete example:
String prefix = "Authentication "; // Note fix here...
// TODO: Don't use basic authentication; it's horribly insecure.
// Note the explicit use of ASCII here and later, to avoid any ambiguity.
byte[] rawData = (login + ":" + password).getBytes(StandardCharsets.US_ASCII);
String header = prefix + Base64.encodeToString(rawData, NO_WRAP);
// Now to validate...
String toDecode = header.substring(prefix.length());
byte[] decodedData = Base64.decode(toDecode, DEFAULT);
System.out.println(new String(decodedData, StandardCharsets.US_ASCII));
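To see the toString() problem outside Android, here is a small sketch using java.util.Base64 from the JDK (Java 8+) in place of android.util.Base64. The class name, the roundTrip helper, and the "user:secret" credentials are placeholders for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ByteArrayConcatSketch {
    // Encodes to a Base64 String and decodes it again, returning the original text.
    public static String roundTrip(String credentials) {
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.US_ASCII));
        byte[] decoded = Base64.getDecoder().decode(encoded);
        return new String(decoded, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] encodedBytes = Base64.getEncoder()
                .encode("user:secret".getBytes(StandardCharsets.US_ASCII));

        // Concatenating a byte[] calls Object.toString(), which yields a type tag and
        // a hash code (something like "[B@1b6d3586"), not the Base64 text.
        String broken = "prefix " + encodedBytes;
        System.out.println(broken.startsWith("prefix [B@")); // true

        // Using encodeToString keeps the actual Base64 characters, so decoding works.
        System.out.println(roundTrip("user:secret")); // user:secret
    }
}
```

This also explains why getBytes() on the substring returned something unrelated to encodedBytes: the string never contained the encoded data in the first place.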
I realise this is probably more of a general Java question, but since it's running in a Notes/Domino environment, I thought I'd check with that community first.
Summary:
I don't seem to be able to decode the string: dABlAHMAdAA= using lotus.domino.axis.encoding.Base64 or sun.misc.BASE64Decoder
I know the original text is: test
I confirmed this by decoding at http://www5.rptea.com/base64/; it appears to be UTF-16.
As a simple test, using either of the below:
String s_base64 = "dABlAHMAdAA=";
byte[] byte_base64 = null;
String s_decoded = "";
byte_base64 = new sun.misc.BASE64Decoder().decodeBuffer(s_base64);
s_decoded = new String(byte_base64, "UTF-16");
System.out.println("Test1: " + s_decoded);
byte_base64 = lotus.domino.axis.encoding.Base64.decode(s_base64);
s_decoded = new String(byte_base64, "UTF-16");
System.out.println("Test2: " + s_decoded);
System.out.println("========= FINISH.");
I get the output:
Test1: ????
Test2: ????
If I create String as UTF-8
s_decoded = new String(byte_base64, "UTF-8");
it outputs:
t
no error is thrown, but it doesn't complete the code, doesn't get to the "FINISH".
Detail
I'm accessing an asmx web service. In the SOAP response, some nodes contain base64 encoded data. At this point in time, there is no way to get the service changed, so I have to XPath and decode it myself. The encoded data is either text or HTML. If I pass the encoded data through http://www5.rptea.com/base64/ and select UTF-16, it decodes correctly, so I must be doing something incorrectly.
As a side note, I encoded "test":
s_base64 = lotus.domino.axis.encoding.Base64.encode(s_text.getBytes());
System.out.println("test1 encodes to: " + s_base64);
s_base64 = new sun.misc.BASE64Encoder().encode(s_text.getBytes());
System.out.println("test2 encodes to: " + s_base64);
they both encode to:
dGVzdA==
...which, if you then feed it into the two decoders above, decodes correctly, as expected.
If I go to the site above and encode "test" as UTF-16, I get: dABlAHMAdAA= so that confirms that the data is in UTF-16.
It's like the data is genuine base64 data, but the decoder doesn't recognise it as such. I'm slightly stumped at the moment.
Any pointers or comments would be gratefully received.
The string has been encoded in UTF-16LE (little-endian), where the least significant byte is stored first. Java's "UTF-16" charset assumes big-endian when the data carries no byte-order mark. You need to use:
s_decoded = new String(byte_base64, "UTF-16LE");
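For reference, a self-contained sketch of the same fix using java.util.Base64 (Java 8+) instead of the Domino or sun.misc decoders; the class and method names are only illustrative. The sample decodes to the bytes 74 00 65 00 73 00 74 00, which is "test" with the low byte first.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Utf16LeSketch {
    // Decodes Base64 text whose payload is UTF-16LE encoded characters.
    public static String decode(String base64) {
        byte[] raw = Base64.getDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        System.out.println(decode("dABlAHMAdAA=")); // test
    }
}
```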
I used your sample "dABlAHMAdAA=" in my online Base64 decoding tool, and it seems like you are missing the Apache Base64 jar files.
Click the link below.
http://www.hosting4free.info/Base64Decode/Base64-Decode.jsp
The code behind the website is
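import java.nio.charset.StandardCharsets;
import org.apache.commons.codec.binary.Base64;

public class base64decode
{
    public static void main(String[] args)
    {
        // Decode the Base64 text with Apache Commons Codec; YWJjZGVmZw== is "abcdefg".
        byte[] decoded = Base64.decodeBase64("YWJjZGVmZw==");
        // Name the charset explicitly instead of relying on the platform default.
        System.out.println(new String(decoded, StandardCharsets.US_ASCII) + "\n");
    }
}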