Send base64Binary SOAP parameter between Java client and PHP server - java

I have a PHP SOAP server (using nuSOAP with WSDL) that sends the content of an HTML page. Of course, the HTML can be encoded with different encodings, but the parameter is of type base64Binary in the XML, and I receive the HTML in its native encoding without problems.
To test this, I have written three SOAP clients, in PHP, C#, and Java 6, and with the first two I have no problem. The Java client was generated with WSIMPORT 2.1, and the code looks like this:
FileInputStream file = new FileInputStream(new File("/tmp/chinese.htm"));
BufferedReader buffer = new BufferedReader(new InputStreamReader(file, "BIG5"));
String line;
String content = "";
while ((line = buffer.readLine()) != null)
    content += line + "\n";

FileManagerAPI upload = new FileManagerAPI();
FileManagerAPIPortType servUpload = upload.getFileManagerAPIPort();
BigInteger result = servUpload.apiControllerServiceUploadHTML(
        "http://www.test.tmp/因此鳥哥建議您務.html", content.getBytes());
The problem is that, before sending the HTML as Base64, only the Java client re-encodes the HTML content to UTF-8, so when PHP receives the file, the server handles it as a "UTF-8 file", not as a "BIG5 file".
The question is: how do I avoid that initial UTF-8 encoding, or at least apply the UTF-8 encoding after Base64, not before?
Thanks in advance.

It looks like you need to convert the file from UTF-8 (I think that's the encoding of /tmp/chinese.htm) to BIG5 first.
To convert a file's content, read the file and re-encode it, for example with iconv:
$path = '/tmp/chinese.htm';
$buffer = file_get_contents($path);
$buffer = iconv('UTF-8', 'BIG5', $buffer);
The buffer $buffer is now re-encoded from UTF-8 into BIG5.
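On the Java side, the root cause is that content.getBytes() with no argument uses the platform default charset (UTF-8 in this case), which is exactly where the unwanted re-encoding happens. A minimal sketch that avoids any conversion by sending the raw file bytes instead -- assuming the generated servUpload port from the question accepts a byte[] for the base64Binary parameter:

// Read the file as raw bytes; no Reader is involved, so no charset is
// ever applied and the original BIG5 bytes reach the server untouched.
FileInputStream in = new FileInputStream("/tmp/chinese.htm");
ByteArrayOutputStream out = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int n;
while ((n = in.read(buf)) != -1) {
    out.write(buf, 0, n);
}
in.close();

// JAX-WS Base64-encodes the byte[] itself for base64Binary parameters.
BigInteger result = servUpload.apiControllerServiceUploadHTML(
        "http://www.test.tmp/因此鳥哥建議您務.html", out.toByteArray());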

Related

How to ensure that the JSON string is UTF-8 encoded in Java

I am working on legacy web service client code where JSON data is sent to the web service. Recently it was found that, for some requests, the service gives an HTTP 400 response due to invalid (non-UTF-8) characters in the JSON body.
Below is one example of the data which is causing the issue.
String value = "zu3z5eq tô‰U\f‹Á‹€z";
I am using the org.json.JSONObject.toString() method to generate the JSON string. Can you please let me know how I can ensure that the JSON string is UTF-8 encoded?
I already tried a few of the solutions available online, like converting to a byte array and back, or using the Java charset methods, but they did not work: either they mangle valid values such as Chinese/Japanese characters as well, or they don't work at all.
Can you please provide some input on this?
You need to set the character encoding for OutputStreamWriter when you create it:
httpConn.connect();
wr = new OutputStreamWriter(httpConn.getOutputStream(), StandardCharsets.UTF_8);
wr.write(jsonObject.toString());
wr.flush();
Otherwise it defaults to the "platform default encoding," which is some encoding that has been used historically for text files on whatever system you are running.
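For reference, that platform default can be inspected directly, which helps when debugging this kind of issue; a quick check:

// Prints the charset the JVM falls back to when none is specified,
// e.g. UTF-8 on most Linux systems or windows-1252 on Windows.
System.out.println(java.nio.charset.Charset.defaultCharset());
System.out.println(System.getProperty("file.encoding"));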
Use Base64 encoding to convert the value to a byte[]:
String value = "zu3z5eq tô‰U\f‹Á‹€z";

// While sending, encode the value
byte[] encodedBytes = Base64.getEncoder().encode(value.getBytes(StandardCharsets.UTF_8));
String encodedValue = new String(encodedBytes, StandardCharsets.UTF_8);

// Transport....

// On the receiving end, decode the value
byte[] decodedBytes = Base64.getDecoder().decode(encodedValue.getBytes(StandardCharsets.UTF_8));
System.out.println(new String(decodedBytes, StandardCharsets.UTF_8));

Since Base64 output is plain ASCII, the encoded string survives any transport that would otherwise mangle the raw bytes.

Difficulty when saving android file in sqlserver database

I am having a problem finding the right encoding for a file that is saved to the database through FileUpload in asp.net, on SQL Server 2008 with the Image type. I need to migrate a web system to an Android application, using a webservice in asp.net for communication with this SQL Server database, but the saved format does not correspond to the already-saved files. I do not understand whether it is a question of character encoding (ASCII, UTF-8, ...), a problem with the Base64 encode/decode, or whether it would be more appropriate to treat the file as hexadecimal.
The file is read by the system through the FileUpload component:
file.ARCHIVE = FileUpload1.FileBytes and then saved with: context.SaveChanges()
The type expected by the database is Image, and the web system reads the file normally after saving.
I need to do the same process from a native Android application (Java), so I read the file, convert it to Base64 to send it to the webservice, which decodes the file and saves it with the same type in the database. When I compare the stored file's string with the original, they differ, so the web system considers the file corrupted, even though the Android application reads it normally.
I already tested command.Parameters.Add("@ARCHIVE", SqlDbType.Image).Value = bytes; to save, but it seems to me that even before this save, the file format is no longer the same as the web system's.
On Android (Java) we do it like this:
InputStream inputStream = context.openFileInput(filename);
if (inputStream != null) {
    InputStreamReader inputStreamReader = new InputStreamReader(inputStream);
    BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
    String receiveString = "";
    StringBuilder stringBuilder = new StringBuilder();
    while ((receiveString = bufferedReader.readLine()) != null) {
        stringBuilder.append(receiveString);
    }
    inputStream.close();
    ret = stringBuilder.toString();
}
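Note that this snippet reads the file as text: InputStreamReader decodes the bytes using the platform default charset and readLine() drops line terminators, both of which corrupt binary data such as images. A minimal byte-safe sketch of the same read, using android.util.Base64 for the transport encoding (the filename variable is taken from the snippet above):

InputStream inputStream = context.openFileInput(filename);
ByteArrayOutputStream bytes = new ByteArrayOutputStream();
byte[] buf = new byte[8192];
int n;
// Copy raw bytes; no Reader is involved, so nothing is decoded or altered.
while ((n = inputStream.read(buf)) != -1) {
    bytes.write(buf, 0, n);
}
inputStream.close();

// Base64-encode the untouched bytes for the webservice call; the server
// side must Base64-decode before saving to the Image column.
String payload = android.util.Base64.encodeToString(bytes.toByteArray(),
        android.util.Base64.NO_WRAP);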

java printstream image file

Is a PrintStream appropriate for sending image files through a socket? I'm currently doing a homework assignment where I have to write a web proxy from scratch using basic sockets.
When I configure Firefox to use my proxy, everything works fine except that images don't download. If I go to an image file directly, Firefox comes back with the error: The image cannot be displayed because it contains errors
Here is my code for sending the response from the server back to the client (firefox):
BufferedReader serverResponse = new BufferedReader(
        new InputStreamReader(webServer.getInputStream()));
String responseLine;
while ((responseLine = serverResponse.readLine()) != null) {
    serverOutput.println(responseLine);
}
In the code above serverOutput is a PrintStream object. I am wondering if somehow the PrintStream is corrupting the data?
No, it is never appropriate to treat bytes as text unless you know they are text.
Specifically, the InputStreamReader will try to decode your image (which can be treated as a byte array) to a String. Then your PrintStream will try to encode the String back to a byte array.
There is no guarantee that this will produce the original byte array. You might even get an exception, depending on what encoding Java decides to use, if some of the image bytes aren't valid encoded characters.
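A minimal sketch of relaying the response as raw bytes instead -- clientSocket here is a hypothetical name for the socket Firefox is connected to; webServer is taken from the question:

InputStream fromServer = webServer.getInputStream();
OutputStream toClient = clientSocket.getOutputStream(); // hypothetical client socket
byte[] buf = new byte[8192];
int n;
// Copy the response verbatim; the bytes are never decoded to characters,
// so images pass through unchanged.
while ((n = fromServer.read(buf)) != -1) {
    toClient.write(buf, 0, n);
}
toClient.flush();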

JAX-RS and character encoding problems

I am using JAX-RS and have a simple POST webservice that takes an InputStream containing a MIME message (XML + file).
The MIME message is in UTF-8; the file contained as a body part is an email message in MIME RFC 822 format, in ISO-8859-1 encoding, which I'm converting to PDF using Aspose.
When running as a webservice, the resulting PDF has incorrect characters (ø, å etc.). But when I tried to use the exact input, but reading it from file instead and call the method with FileInputStream, the resulting PDF is OK.
Here is the simplified version of the code:
@POST
@Path(value = "/documents/convert/{flag}")
@Produces("text/plain")
public String convertFile(InputStream input, @PathParam("flag") String flag) throws WebApplicationException {
    FileInfo info = convertToPdf(input);
    return info.getResponse();
}
If I run this as a webservice, it produces a PDF with incorrectly encoded characters, with a "box" instead of some characters (such as ø, å etc.). When I run the same code with the same input by calling
FileInputStream fis = new FileInputStream(file);
convertFile(fis);
the resulting PDF has the correct encoding (the WS runs on a server; the file test is done on my local machine).
Could this be an incorrect locale setting on the server?
Do you use an InputStreamReader to read the FileInputStream? If so, did you initialize it using the two-parameter constructor, with Charset.forName("UTF-8") as the second argument (as you mentioned, the incoming stream is already in UTF-8)?
You might need to tell the container that it's UTF-8.
something like...
@Produces("text/plain; charset=utf-8")
Apparently your local file and your MIME message body are not encoded the same way.
Your post states that the file is encoded in ISO-8859-1.
If you are using an InputStreamReader (as Xavier Coulon suggests), you should pass the expected encoding to it, in this case:
Charset.forName("ISO-8859-1")
If this does not help, could you please provide the content of the convertToPdf(InputStream is) method?
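A minimal sketch of what that looks like (the question does not show convertToPdf, so the surrounding code is assumed; input is the InputStream from the resource method):

// Decode the incoming body part explicitly as ISO-8859-1 instead of
// relying on the container's or platform's default charset.
BufferedReader reader = new BufferedReader(
        new InputStreamReader(input, Charset.forName("ISO-8859-1")));
StringBuilder body = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    body.append(line).append("\n");
}
// body.toString() now holds correctly decoded text to hand to Aspose.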

Encoding problem from database to javamail

I have a small application which reads from an Oracle 9i database and sends the data via e-mail, using JavaMail. The database has NLS_CHARACTERSET = "WE8MSWIN1252", that is, CP1252.
If I run the app without any parameters, it works fine and the e-mails are sent correctly. However, I have a requirement that forces me to run the app with the -Dfile.encoding=utf8 parameter, which results in the text being sent with corrupted characters.
I've tried to change the encoding of the data read from the database, with:
String textToSend = new String(textRead.getBytes("CP1252"), "UTF-8");
But it doesn't help. I've tried all the possible combinations of CP1252, windows-1252, ISO-8859-1 and UTF-8, but still had no luck.
Any ideas?
Update to clarify my problem: when I do the following:
Statement stat = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE,
        ResultSet.CONCUR_READ_ONLY);
stat.executeQuery("SELECT blah FROM blahblah ...");
ResultSet rs = stat.getResultSet();
String textRead = rs.getString("whatever");
I get textRead corrupted, because the database is CP1252 and the application is running in UTF-8. Another approach that I've tried, but which also failed:
InputStream is = rs.getBinaryStream("whatever");
Writer writer = new StringWriter();
char[] buffer = new char[1024];
Reader reader = new BufferedReader(new InputStreamReader(is, "UTF-8"));
int n;
while ((n = reader.read(buffer)) != -1) {
    writer.write(buffer, 0, n);
}
String textRead = writer.toString();
Your driver should do the conversion automatically, and since every CP1252 character can also be represented in UTF-8, you shouldn't lose information.
Can you try the following: get the String with ResultSet.getString and write it to a file. Then open the file with an editor in which you can specify the UTF-8 character set (jEdit, for example).
The file should contain UTF-8 data.
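A minimal sketch of that check (the column name whatever comes from the question; the output path is assumed):

String textRead = rs.getString("whatever");
// Write the string out explicitly as UTF-8, regardless of -Dfile.encoding.
Writer w = new OutputStreamWriter(new FileOutputStream("/tmp/check.txt"), "UTF-8");
w.write(textRead);
w.close();
// Open /tmp/check.txt in an editor set to UTF-8 and inspect the text.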
You seem to be getting lost in charset space -- I understand this... :-)
This line
String textToSend = new String(textRead.getBytes("CP1252"), "UTF-8");
does not make much sense. You already have text; it converts the text to a CP1252-encoded byte[], and then tells the VM to treat those bytes as if they were UTF-8 (which is a lie...).
In short: if you have a String, as in textRead, you don't have to convert it at all. If something goes wrong, either the text is already rotten (look at it in the debugger) or it gets rotten in the API later on. Check this and come back with more detail: where exactly is the wrong text, and where do you read it from or write it to?
Your database data is in windows-1252. So -- assuming it's being handed back verbatim by the JDBC driver -- when you try to convert it to a Java String, that's the charset you need to specify:
Statement stat = connection.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
ResultSet rs = stat.executeQuery("SELECT blah FROM blahblah ...");
byte[] rawbytes = rs.getBytes("whatever");
String textRead = new String(rawbytes, "windows-1252");
Is part of the requirement that the data be mailed out as UTF-8? If so, the UTF-8 part needs to occur on the output side, not the input side. When you have String data in Java, it's stored internally as UTF-16. So when you serialize it out to the MimeMessage, you again need to pick a charset:
mimebodypart.setText(textRead, "UTF-8");
I had the same problem: an Oracle database using the WE8MSWIN1252 charset, with some VARCHAR2 column data/text containing the euro sign (€). Sending the text using JavaMail gave problems on the euro sign.
Finally it works. Two important things you should check/do:
- Be sure to use the most recent Oracle JDBC driver for the Java version you use.
- Specify the charset (prefer UTF-8) in JavaMail, e.g. MimeMessage.setSubject(String text, "UTF-8") and MimeMessage.setText(String text, "UTF-8"). That way the email text gets UTF-8 encoded.
NOTE: Because RFC 821 restricts mail messages to 7-bit US-ASCII, 8-bit character or binary data needs to be encoded into a 7-bit format. The email header "Content-Transfer-Encoding" specifies the encoding used. For more information: http://www.w3.org/Protocols/rfc1341/5_Content-Transfer-Encoding.html
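A minimal sketch of the JavaMail side (the session setup is omitted, and the subject text is illustrative; textRead is the string read from the database):

MimeMessage msg = new MimeMessage(session);
msg.setSubject("Invoice €", "UTF-8"); // subject encoded as UTF-8
msg.setText(textRead, "UTF-8");       // body declared and encoded as UTF-8
Transport.send(msg);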
Can you do the conversion in the database? Instead of:
SELECT blah FROM blahblah
try:
SELECT convert(blah, 'UTF8', 'WE8MSWIN1252') FROM blahblah
(Oracle's CONVERT takes the destination character set as the second argument and the source as the third.)
