message/rfc822 ignores utf-8 - java

I'm using JavaMail to convert incoming emails into attachments, and then attach them to another main email.
DataSource byteMail = new ByteArrayDataSource(attachedEmail.getMessageHtmlBody(), "message/rfc822; charset=utf-8");
DataHandler dataHandler = new DataHandler(byteMail);
mailBodyPart.setDataHandler(dataHandler);
mailBodyPart.setFileName(attachedEmail.getMessageSubject());
After converting the incoming HTML mail to a MimeMessage, I found that message/rfc822 does not respond to the UTF-8 charset: German letters such as ä and ü are still garbled.
However, message/rfc822 works well with "iso-8859-1", and "UTF-8" works well with other MIME types such as "text/html" and "www/mime".
Is it possible that rfc822 messages are not compatible with UTF-8? Or does anyone know how to solve this problem?
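One possible explanation, offered as a guess rather than a confirmed diagnosis: message/rfc822 is not a text type, so a mail client has no reason to honour a charset parameter on it. The attached message has to declare the charset in its own headers, and the snippet above passes only the HTML body with no headers at all. The sketch below uses only the JDK; the class Rfc822WrapSketch and the helper wrapHtml are hypothetical names, and the header set is the bare minimum for illustration.

```java
import java.nio.charset.StandardCharsets;

class Rfc822WrapSketch {
    // Hypothetical helper: wraps an HTML body in a minimal RFC 822 message
    // whose own headers declare the charset of the body.
    static byte[] wrapHtml(String htmlBody) {
        String headers =
              "MIME-Version: 1.0\r\n"
            + "Content-Type: text/html; charset=UTF-8\r\n"
            + "Content-Transfer-Encoding: 8bit\r\n"
            + "\r\n"; // the blank line ends the header block
        // Encode headers and body with the same charset the header announces.
        return (headers + htmlBody).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = wrapHtml("<p>Gr\u00fc\u00dfe</p>"); // "Grüße"
        // Assumption: with JavaMail on the classpath these bytes could be attached via
        //   new ByteArrayDataSource(raw, "message/rfc822")
        System.out.println(new String(raw, StandardCharsets.UTF_8));
    }
}
```

If the incoming mail is already available as a MimeMessage, passing that object directly to mailBodyPart.setContent(message, "message/rfc822") may be simpler, since JavaMail then writes the nested headers itself.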

Related

How can I set the encoding of a httpExchange response?

I'm trying to modify some server code which uses an HttpExchange object to handle the server's response to the client.
My issue is that for responses containing characters not supported by iso-8859-1, such as Chinese characters, I get something like '????' in place of the characters. I'd like to set the encoding of the response to utf-8, but have thus far been unsuccessful in doing so.
I tried adding this line:
httpExchange.getResponseHeaders().put("charset", Arrays.asList("UTF-8"));
This successfully puts a "charset" header in the response, but I still can't send the characters I want in the response.
How do I set the encoding of the response to allow for these characters?
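Use the Content-Type header to specify the encoding, and write the body with the same encoding.
String encoding = "UTF-8";
httpExchange.getResponseHeaders().set("Content-Type", "text/html; charset=" + encoding);
// Assumes sendResponseHeaders(...) is called after the header is set and before the body is written.
Writer out = new OutputStreamWriter(httpExchange.getResponseBody(), encoding);
out.write(something);
out.close(); // flushes the writer and completes the exchange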
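For reference, here is a more complete sketch of the same idea using the JDK's built-in com.sun.net.httpserver package. The class name Utf8ResponseSketch, the helpers encodeBody, sendUtf8 and start, and the choice of status code 200 are all assumptions made for illustration. Encoding the body to bytes first means the exact length can be passed to sendResponseHeaders.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

class Utf8ResponseSketch {
    static final String CONTENT_TYPE = "text/html; charset=UTF-8";

    // Encode once so the byte length is known before the headers are sent.
    static byte[] encodeBody(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical helper: sends text as a UTF-8 encoded HTML response.
    static void sendUtf8(HttpExchange exchange, String text) throws IOException {
        byte[] body = encodeBody(text);
        exchange.getResponseHeaders().set("Content-Type", CONTENT_TYPE);
        // Headers go out first; the length is counted in bytes, not chars.
        exchange.sendResponseHeaders(200, body.length);
        try (OutputStream out = exchange.getResponseBody()) {
            out.write(body);
        }
    }

    // Starts a server that answers every request with the given HTML.
    static HttpServer start(int port, String html) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> sendUtf8(exchange, html));
        server.start();
        return server;
    }
}
```

Calling Utf8ResponseSketch.start(8000, html) and opening http://localhost:8000/ in a browser should then show the characters as sent, provided the string itself was correct in memory.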

GAE: Incoming emails can not preserve format

I've set up my GAE/Java project to receive emails, and it works fine except that it cannot preserve the incoming mail's formatting (e.g. bold, italic, font size, text color, bulleted lists). The content type of incoming mails is always "text/plain", so from the end user's point of view the mail content is crammed together and unreadable.
For example, when I send a formatted mail from Gmail and receive it in GAE, all formatting is stripped and only a block of plain text is left.
Is there any way to get the incoming mail as HTML so that the formatting is preserved?
When sending the mail through the server, set the body content type to text/html.
.
.
.
htmlPart = new MimeBodyPart();
htmlPart.setContent("<b>html content</b>", "text/html");
This should work for you.
This looks like a duplicate of this question and answer.
Moreover, here are a few excerpts from the Google App Engine documentation, which says:
The message contains a subject, a plaintext body, and an optional HTML body.
It can also contain file attachments, as well as a limited set of headers.
So I am guessing the content type should be text/html.

Java + Jersey- sending utf-8 encoded data

In my application I need to use the REST API of a web service. For now I need to send an XML message. The problem is that some of the characters in this XML are Polish diacritics. The code that sends my message currently looks like this:
WebResource r = client.resource(resourceAddress);
String response = r.accept(
MediaType.APPLICATION_XML_TYPE,
MediaType.APPLICATION_JSON_TYPE,
MediaType.TEXT_HTML_TYPE
)
.type(MediaType.TEXT_XML_TYPE)
.header("Authorization", authorizationString)
.post(String.class, event);
Java Strings are UTF-16 and my XML should be UTF-8 encoded. Is there a way to tell Jersey to somehow change the encoding before serialization? Or is there some other way to send this String data as UTF-8 rather than UTF-16 using the Jersey client API?
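A note on the premise: the UTF-16 form of a String only exists in memory, and what goes on the wire is whatever bytes the String is encoded to when the request is written. One way to take control, sketched here with JDK classes only, is to encode the XML yourself and announce the charset in the Content-Type. The class name Utf8XmlPayloadSketch, the helper toUtf8, and the sample XML are made up for illustration, and the Jersey call in the comment is an untested assumption.

```java
import java.nio.charset.StandardCharsets;

class Utf8XmlPayloadSketch {
    // Content-Type value that tells the server how the bytes are encoded.
    static final String CONTENT_TYPE = "text/xml; charset=UTF-8";

    // Encode explicitly; the in-memory UTF-16 form never reaches the wire.
    static byte[] toUtf8(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample payload with Polish diacritics: "Zażółć"
        String event = "<event><name>Za\u017c\u00f3\u0142\u0107</name></event>";
        byte[] payload = toUtf8(event);
        System.out.println(CONTENT_TYPE + ", " + payload.length + " bytes");
        // Assumption: with Jersey 1.x on the classpath, the request from the
        // question might then be sent as
        //   r.type(CONTENT_TYPE)
        //    .header("Authorization", authorizationString)
        //    .post(String.class, payload);
    }
}
```

Sending a byte[] sidesteps any question about which charset the client library picks for a String entity, at the cost of doing the encoding by hand.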

Sending UTF-8 values in HTTP headers results in Mojibake

I want to send Arabic data from a servlet to the client using HttpServletResponse.
I am trying this:
response.setCharacterEncoding("UTF-8");
response.setHeader("Info", arabicWord);
and I receive the word like this:
String arabicWord = response.getHeader("Info");
On the client (receiving) side I also tried this:
byte[] d = response.getHeader("Info").getBytes("UTF-8");
arabicWord = new String(d);
but the Unicode text does not seem to survive, because I receive strange English characters. How can I send and receive Arabic UTF-8 words?
HTTP headers don't support UTF-8. They officially support ISO-8859-1 only. See also RFC 2616 section 2.2:
Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14].
Your best bet is to URL-encode and decode them.
response.setHeader("Info", URLEncoder.encode(arabicWord, "UTF-8"));
and
String arabicWord = URLDecoder.decode(response.getHeader("Info"), "UTF-8");
URL-encoding will transform them into %nn format, which is perfectly valid ISO-8859-1. Note that the data sent in the headers may have size limitations. It is better to send it in the response body instead, in plain text, JSON, CSV or XML format. Using custom HTTP headers this way is a design smell.
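The round trip can be tried without a servlet container. In this minimal sketch the class name HeaderValueCodecSketch, the two helper methods, and the sample word are assumptions for illustration; only URLEncoder and URLDecoder come from the answer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

class HeaderValueCodecSketch {
    // Turns arbitrary text into an ASCII-only value that is safe in a header.
    static String encodeHeaderValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Reverses encodeHeaderValue on the receiving side.
    static String decodeHeaderValue(String headerValue) {
        try {
            return URLDecoder.decode(headerValue, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        String arabicWord = "\u0645\u0631\u062d\u0628\u0627"; // "مرحبا"
        String encoded = encodeHeaderValue(arabicWord);
        System.out.println(encoded); // only %XX escapes, all ASCII
        System.out.println(decodeHeaderValue(encoded).equals(arabicWord));
    }
}
```

Both sides have to agree on the charset name passed to encode and decode; a mismatch there brings the garbled text right back.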
I don't know where the word variable is coming from, but try this:
arabicWord = new String(d, "UTF-8");
UPDATE: Looks like the problem is with UTF-8 encoded data in HTTP headers, see: HTTP headers encoding/decoding in Java for detailed discussion.

jetty ,websockets and UTF8 encoding

I'm having a small problem. I'm building a small server in Java, based on Jetty's WebSocket implementation.
The clients are browsers, and I send information using the WebSocket JavaScript API.
Everything works great until I send special characters such as: ă Ț î ș ê ñ ü
Here is the problem. Client 1 sends a message containing one of these characters to the server. The server prints the message and then sends it to client 2.
Client 2 receives the message and prints it on an HTML page in the browser, and that works great: the characters are shown correctly.
The problem appears when I print the String on the server side. Instead of ă it shows the ? character. This causes problems because I want to insert the text into a database (MySQL, with UTF-8 encoding enabled).
So what seems to be the problem? Is the text sent from the browser not UTF-8 encoded, or is the Jetty WebSocket implementation not receiving the String as UTF-8?
Thanks
Here's a function I use to HTML-encode all special characters in a string (but not html itself (like < or >)). If you apply it before sending a string to the server, everybody should see the same and you can store it in a database table:
function toHtmlEncoded(string){
return string.replace(/[\u0080-\uC350]/g,
function(a) {return '&#'+a.charCodeAt(0)+';';}
);
}
First read this http://kunststube.net/encoding/
Then check everywhere you've converted bytes into Strings (or the reverse). Common places to make a mistake include calling getBytes() on a String without specifying an encoding. Other pitfalls include not setting the encoding in the database connection string.
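To make the getBytes() pitfall concrete, here is a small JDK-only sketch. The class name CharsetRoundTripSketch and the helper roundTrip are made up for illustration. It shows that a charset which cannot represent a character silently substitutes '?', which matches the symptom in the question. It also suggests, as a guess, that the '?' printed on the server may come from the console's encoding rather than from the String itself.

```java
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTripSketch {
    // Encodes text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) throws Exception {
        String message = "\u0103 \u021a \u0219"; // "ă Ț ș"

        // UTF-8 can represent every character, so the text survives.
        System.out.println(roundTrip(message, StandardCharsets.UTF_8).equals(message)); // true

        // ISO-8859-1 has no mapping for these letters; each one becomes '?'.
        System.out.println(roundTrip(message, StandardCharsets.ISO_8859_1)); // ? ? ?

        // Printing through an explicitly UTF-8 stream avoids depending on the
        // platform default (the terminal must also be set to UTF-8).
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println(message);
    }
}
```

If the String compares equal to the expected text in code but prints as '?', the data is fine and only the console output is lossy.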
