I'm trying to modify some server code which uses an HttpExchange object to handle the server's response to the client.
My issue is that for responses containing characters not supported by ISO-8859-1, such as Chinese characters, I get something like '????' in place of the characters. I'd like to set the encoding of the response to UTF-8, but have thus far been unsuccessful in doing so.
I tried adding this line:
httpExchange.getResponseHeaders().put("charset", Arrays.asList("UTF-8"));
This successfully puts a "charset" header in the response, but I still can't send the characters I want in the response.
How do I set the encoding of the response to allow for these characters?
Use the Content-Type header to specify the encoding:
String encoding = "UTF-8";
httpExchange.getResponseHeaders().set("Content-Type", "text/html; charset=" + encoding);
Writer out = new OutputStreamWriter(httpExchange.getResponseBody(), encoding);
out.write(something);
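For completeness, a minimal runnable handler along these lines (the /hello path and the response text are invented for illustration). Note that with HttpExchange you must call sendResponseHeaders() before writing the body, and the length you pass must be the UTF-8 byte count, not the character count:

import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class Utf8Server {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
        server.createContext("/hello", exchange -> {
            String body = "你好, world";
            byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
            // Declare the charset so the client decodes the bytes as UTF-8.
            exchange.getResponseHeaders().set("Content-Type", "text/plain; charset=UTF-8");
            // Pass the encoded byte count, not body.length().
            exchange.sendResponseHeaders(200, bytes.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(bytes);
            }
        });
        server.start();
    }
}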
Related
My Spring MVC test (the source file is UTF-8 encoded) contains:
this.mockMvc = MockMvcBuilders.webAppContextSetup(context).apply(springSecurity())
.apply(documentationConfiguration(restDocumentation)
.snippets().withEncoding("UTF-8")) // default
.build();
...
myRequestDTO.setValue("Größe");
ResultActions action = this.mockMvc
.perform(post("/my-service")
.content(jacksonObjectMapper.writeValueAsString(myRequestDTO))
...
action.andDo(document("docs"));
The Asciidoctor file contains
HTTP Request
include::{snippets}/docs/http-request.adoc[]
After rendering it and opening the generated HTML file (which is also UTF-8 encoded) in my Firefox browser, I find
HTTP Request
POST /my-service HTTP/1.1
...
GrÃ¶ÃŸe
How can the special chars be displayed correctly?
The underlying problem here was with the conversion of a request's content as a byte[] into a String. Spring REST Docs uses the charset attribute of the Content-Type header to determine the Charset that should be used when creating the String. If there's no Content-Type header or its value doesn't have a charset attribute, the JVM's default Charset is used (as a result of calling new String(bytes)).
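That conversion step in miniature (a standalone sketch; the corrupted output in the comment assumes a JVM whose default charset is windows-1252):

import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    public static void main(String[] args) {
        byte[] bytes = "Größe".getBytes(StandardCharsets.UTF_8);
        // With an explicit charset, the original text comes back intact.
        System.out.println(new String(bytes, StandardCharsets.UTF_8)); // Größe
        // Without one, the JVM default charset is used; on a windows-1252
        // JVM this prints the corrupted "GrÃ¶ÃŸe".
        System.out.println(new String(bytes));
    }
}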
There are two ways to avoid corruption of special characters:
Specify a charset attribute in the request's Content-Type header. Use text/plain;charset=UTF-8 rather than text/plain, for example (see the sketch after this list).
Configure the JVM's default Charset by setting the file.encoding system property. -Dfile.encoding=UTF8, for example.
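For the first option, the charset can be declared directly on the test request. A sketch reusing the names from the question; MediaType.APPLICATION_JSON_UTF8 is Spring's constant for application/json;charset=UTF-8:

ResultActions action = this.mockMvc
        .perform(post("/my-service")
                // The charset attribute tells REST Docs how to decode the body.
                .contentType(MediaType.APPLICATION_JSON_UTF8)
                .content(jacksonObjectMapper.writeValueAsString(myRequestDTO)));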
After calling prettyPrint() it works:
action.andDo(document("docs",
preprocessRequest(prettyPrint()),
preprocessResponse(prettyPrint())));
I am trying to put data into a header. When I add it in Latin characters, it is received on the server side correctly. But when I try to add it in Chinese, for example 中國的錯誤, it is received on the client as ?????
How can I set header to be encoded as UTF-8?
I tried doing something like this, but it didn't help:
servletResponse.setCharacterEncoding("utf-8");
servletResponse.setContentType("text/html; charset=UTF-8");
Additionally, set -Dfile.encoding=utf8 on the server's startup command line.
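That is, start the JVM with something like java -Dfile.encoding=UTF-8 -jar my-server.jar (the jar name is a placeholder). A small check that the flag took effect:

import java.nio.charset.Charset;

public class EncodingCheck {
    public static void main(String[] args) {
        // Prints "UTF-8" when the JVM was started with -Dfile.encoding=UTF-8;
        // otherwise it prints the platform default charset.
        System.out.println(Charset.defaultCharset());
    }
}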
Our application downloads files using HttpClient. We take the filename from the HTTP header and emit it via SysOut (System.out.println()).
When there are non-US-ASCII characters like umlauts, the sysout does not decode correctly.
Ex: the filename Textkürzung.txt is printed in sysout (and also in the console) as Textk³rzung.txt
How can we encode this to UTF-8? Unfortunately we don't get the encoded character set from the HTTP headers.
I want to send Arabic data from a servlet to the client using HttpServletResponse.
I am trying this:
response.setCharacterEncoding("UTF-8");
response.setHeader("Info", arabicWord);
and I receive the word like this:
String arabicWord = response.getHeader("Info");
On the client (receiving side) I also tried this:
byte[]d = response.getHeader("Info").getBytes("UTF-8");
arabicWord = new String(d);
but it seems like there is no Unicode, because I receive strange English words. So please, how can I send and receive Arabic UTF-8 words?
HTTP headers don't support UTF-8. They officially support ISO-8859-1 only. See also RFC 2616, section 2:
Words of *TEXT MAY contain characters from character sets other than ISO-8859-1 [22] only when encoded according to the rules of RFC 2047 [14].
Your best bet is to URL-encode and decode them.
response.setHeader("Info", URLEncoder.encode(arabicWord, "UTF-8"));
and
String arabicWord = URLDecoder.decode(response.getHeader("Info"), "UTF-8");
URL-encoding will transform the characters into the %nn format, which is perfectly valid ISO-8859-1. Note that data sent in headers may be subject to size limitations. Rather, send it in the response body instead, in plain text, JSON, CSV or XML format. Using custom HTTP headers this way is, after all, a design smell.
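A self-contained round trip of that idea (the sample word is arbitrary; in the real application the encode and decode calls wrap setHeader() and getHeader() as shown above):

import java.net.URLDecoder;
import java.net.URLEncoder;

public class HeaderRoundTrip {
    public static void main(String[] args) throws Exception {
        String arabicWord = "مرحبا";
        // Sending side: percent-encode before putting the value in the header.
        String headerValue = URLEncoder.encode(arabicWord, "UTF-8");
        System.out.println(headerValue); // %D9%85%D8%B1%D8%AD%D8%A8%D8%A7 (pure ASCII)
        // Receiving side: decode after reading the header.
        String decoded = URLDecoder.decode(headerValue, "UTF-8");
        System.out.println(decoded.equals(arabicWord)); // true
    }
}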
I don't know where the d variable is coming from, but try this:
arabicWord = new String(d, "UTF-8");
UPDATE: It looks like the problem is with UTF-8 encoded data in HTTP headers; see HTTP headers encoding/decoding in Java for a detailed discussion.
I am working on extracting the response charset in a Java web app, where I am using Apache HttpClient.
For example, one possible value obtained from "Content-Type" header is
text/html; charset=UTF-8
Then my code will extract all text after the "=" sign...
So the charset as extracted will be
UTF-8
I just wanted to know, is the above method for obtaining response charset correct? Or is there some scenario where the above code will not work? Is there something I am missing here?
The method provided by forty-two can work, but it is deprecated. I found a good example of another way to get the charset:
HttpEntity entity = response.getEntity();
ContentType contentType = ContentType.getOrDefault(entity);
Charset charset = contentType.getCharset(); // may be null if the header has no charset attribute
System.out.println("Charset = " + charset);
Doesn't HttpClient (or HttpCore) already provide that functionality? Something like this:
HttpResponse response = ...
String charset = EntityUtils.getContentCharSet(response.getEntity());
Well, that approach will fail when:
the charset value is quoted
the quoted value uses escapes
there are parameters other than charset
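For instance, these header values are valid but would break a naive "everything after the = sign" extraction (the values are made-up examples; ContentType.parse() from HttpCore handles them):

import org.apache.http.entity.ContentType;

public class CharsetParsing {
    public static void main(String[] args) {
        // Quoted charset value: splitting on '=' would keep the quotes.
        ContentType quoted = ContentType.parse("text/plain; charset=\"UTF-8\"");
        System.out.println(quoted.getCharset()); // UTF-8

        // Extra parameters: "text after '='" would grab the wrong value.
        ContentType multi = ContentType.parse("text/plain; format=flowed; charset=UTF-8");
        System.out.println(multi.getCharset()); // UTF-8
    }
}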