Android HTTP Request Encoding

I want to make an HTTP request in my Android app, using the following code:
BufferedReader in = null;
try {
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet();
request.setURI(new URI("http://www.example.de/example.php"));
HttpResponse response = client.execute(request);
in = new BufferedReader
(new InputStreamReader(response.getEntity().getContent()));
StringBuffer sb = new StringBuffer("");
String line = "";
String NL = System.getProperty("line.separator");
while ((line = in.readLine()) != null) {
sb.append(line + NL);
}
in.close();
String page = sb.toString();
System.out.println(page);
return page;
} finally {
if (in != null) {
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
The webpage I'm calling is a PHP script which returns a string. My problem is that the special characters (ä, ü, ö, € etc.) are shown as a question mark in a box. How can I get these characters displayed correctly?
I think it's a problem with the encoding (German app -> UTF-8?).

Maybe you could try setting the encoding when printing to the console. Sometimes the characters are returned correctly from the server but fail to display in the console.
String page = sb.toString();
PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(page);

I have played around with your code, against http://www.google.de.
I was able to "hack" something, not sure it's the most elegant solution though.
After the line:
HttpResponse response = client.execute(request);
... I've added:
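HttpEntity e = response.getEntity();
Header ct = e.getContentType();
// Declared outside the if block so it is still visible where the reader is created.
String charset = null;
// getContentType() may return null when the server sends no Content-Type header.
HeaderElement[] he = (ct != null) ? ct.getElements() : new HeaderElement[0];
if (
he.length > 0
&& he[0].getParameters().length > 0
&& he[0].getParameter(0) != null
&& he[0].getParameter(0).getName().equalsIgnoreCase("charset")
) {
charset = he[0].getParameter(0).getValue();
// with google.de, will print ISO latin ("ISO-8859-1")
Log.d("com.example.test", charset);
}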
... then you can add the charset representation, or its Java equivalent as a second argument of your InputStreamReader constructor call:
in = new BufferedReader(
new InputStreamReader(
response.getEntity().getContent(),
charset != null ? charset : "UTF-8"
)
);
Let me know if that works out for you.
Also note that in order to check Java charset equivalences, you could use Charset.forName(String charsetName) and catch the relevant Exceptions (and then revert to Charset.defaultCharset() or UTF-8, etc. in your catch statement).
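The fallback described above could be sketched roughly as follows. CharsetResolver and resolve are hypothetical names, not part of any library, and falling back to UTF-8 (rather than Charset.defaultCharset()) is an assumption you may want to change.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper: turns a charset label taken from a response header
// into a Java Charset, falling back to UTF-8 when the label is unusable.
final class CharsetResolver {
    static Charset resolve(String label) {
        if (label == null || label.trim().isEmpty()) {
            return StandardCharsets.UTF_8; // no label given (assumed default)
        }
        try {
            return Charset.forName(label.trim());
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Label is syntactically illegal or unknown to this JVM.
            return StandardCharsets.UTF_8;
        }
    }
}
```

The result can be passed straight to the InputStreamReader constructor that takes a Charset, for example new InputStreamReader(stream, CharsetResolver.resolve(charset)).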

Related

JSONParser returns Unexpected character () at position 0

I'm having a problem parsing JSON. Whenever I try, the following result is returned: Unexpected character () at position 0.
public Object execute(HttpRequestBase request){
DefaultHttpClient client = new DefaultHttpClient();
HttpResponse response = null;
Object object = null;
try {
response = client.execute(request);
InputStream is = response.getEntity().getContent();
BufferedReader br = new BufferedReader(new InputStreamReader((is)));
StringBuilder builder = new StringBuilder();
String output;
while ((output = br.readLine()) != null) {
builder.append(output).append("\n");
}
if (response.getStatusLine().getStatusCode() == 200) {
object = new JSONParser().parse(builder.toString());
client.getConnectionManager().shutdown();
} else {
LOG.log(Level.SEVERE, builder.toString());
throw new RuntimeException(builder.toString());
}
} catch (IOException | ParseException ex) {
LOG.log(Level.SEVERE, ex.toString());
} finally {
}
return object;
}
PS:
My response is well-formed JSON.
The problem occurs when this piece of code runs: object = new JSONParser().parse(builder.toString());
This is part of my JSON file:
[
{
"id":2115,
"identificacao":"17\/2454634-6",
"ultima_atualizacao":null
},
{
"id":2251,
"identificacao":"17\/3052383-2",
"ultima_atualizacao":"2017-11-21"
},
{
"id":2258,
"identificacao":"17\/3070024-6",
"ultima_atualizacao":null
},
{
"id":2257,
"identificacao":"17\/3070453-5",
"ultima_atualizacao":null
}
]

Most probably your content has some non-printing special character at the beginning. For UTF-8 encoded data this may be a BOM.
Please post the start of your content as a byte[].

It is happening because of the UTF-8 BOM.
What is a UTF-8 BOM?
The UTF-8 BOM is a sequence of bytes (EF BB BF) at the start of the data that allows the reader to identify it as being encoded in UTF-8.
Normally, a BOM is used to signal the endianness of an encoding, but since endianness is irrelevant to UTF-8, the BOM is unnecessary there.
How to solve the issue?
Convert the encoding of your .json file (or whatever file is being served) from "UTF-8 with BOM" to plain UTF-8.
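If the file on the server cannot be changed, the BOM can also be skipped on the client. A minimal sketch, assuming the whole response body is already available as a byte array and is UTF-8; BomStripper and decodeUtf8WithoutBom are made-up names for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes UTF-8 bytes and skips a leading BOM if present.
final class BomStripper {
    static String decodeUtf8WithoutBom(byte[] data) {
        int offset = 0;
        // The UTF-8 BOM is exactly the three bytes EF BB BF at the very start.
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            offset = 3;
        }
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }
}
```

The returned string then starts directly with the first JSON character, so the parser no longer sees an unexpected character at position 0.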

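Use this instead of BufferedReader:
Map<String,Object> result
= (Map<String,Object>)JSONValue.parse(IOUtils.toString(response.getEntity().getContent(), "utf-8"));
You will get your JSON data as a Map, which you can simply iterate over. Note that this cast only works when the top-level JSON value is an object; for a top-level array like the one in the question, cast to a List instead.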

I believe your problem is with the content-type. Use something like this:
HttpEntity content = response.getEntity();
StringBuilder sb = new StringBuilder();
InputStream is = content.getContent();
InputStreamReader isr = new InputStreamReader(is, "UTF-8");
int character;
do {
character = isr.read();
if (character >= 0) {
sb.append((char) character);
}
} while (character >= 0);
return sb.toString();
No need for BufferedReader, InputStreamReader can handle it fine.
Hope it helps!

I found the problem!
My JSON was being returned with an unusual invisible character at the start, so I added this to my code:
String content = builder.toString();
content = content.replaceAll("\\uFEFF", "");
This \uFEFF was my problem! It does not happen in my dev environment, only in production.

Reading rest service of content type application/vnd.oracle.adf.resourceitem+json

I have a webservice whose content type is application/vnd.oracle.adf.resourceitem+json.
The HttpEntity of the response obtained by calling this service looks like this:
ResponseEntityProxy{[Content-Type: application/vnd.oracle.adf.resourceitem+json,Content-Length: 3,Chunked: false]}
When I try to convert this HttpEntity into String it gives me a blank String {}.
Below are the ways I tried to convert the HttpEntity to String
1.
String strResponse = EntityUtils.toString(response.getEntity());
2.
String strResponse = "";
String inputLine;
BufferedReader br = new BufferedReader(new InputStreamReader(entity.getContent()));
try {
while ((inputLine = br.readLine()) != null) {
System.out.println(inputLine);
strResponse += inputLine;
}
br.close();
} catch (IOException e) {
e.printStackTrace();
}
3.
response.getEntity().writeTo(new FileOutputStream(new File("C:\\Users\\harshita.sethi\\Documents\\Chabot\\post.txt")));
All of them return the String {}.
Can anyone tell me what I am doing wrong?
Is this because of the content type?

The above code is still giving the same response with empty JSON object. So I modified and wrote the below code. This one seems to run perfectly fine.
URL url = new URL(urlString);
HttpsURLConnection con = (HttpsURLConnection) url.openConnection();
con.setDoOutput(true);
con.setRequestMethod("POST");
con.addRequestProperty("Authorization", getAuthToken());
con.addRequestProperty("Content-Type", "application/vnd.oracle.adf.resourceitem+json;charset=utf-8");
String input = String.format("{\"%s\":\"%s\",\"%s\":\"%s\"}", field, value, field2, value2);
System.out.println(input);
OutputStream outputStream = con.getOutputStream();
outputStream.write(input.getBytes());
outputStream.flush();
con.connect();
System.out.println(con.getResponseCode());
// Uncompressing gzip content encoding
GZIPInputStream gzip = new GZIPInputStream(con.getInputStream());
StringBuffer szBuffer = new StringBuffer();
byte tByte[] = new byte[1024];
while (true) {
int iLength = gzip.read(tByte, 0, 1024);
if (iLength < 0) {
break;
}
szBuffer.append(new String(tByte, 0, iLength));
}
con.disconnect();
returnString = szBuffer.toString();
Authentication method
private String getAuthToken() {
String name = user;
String pwd = this.password;
String authString = name + ":" + pwd;
byte[] authEncBytes = Base64.getEncoder().encode(authString.getBytes());
System.out.println(new String(authEncBytes));
return "Basic " + new String(authEncBytes);
}
In case anybody faces the same issue, let me share the challenges I faced and how I resolved them.
The request part of the code above can be used with any content type and method (GET, POST, PUT, DELETE); the GZIPInputStream wrapper only applies when the response is actually gzip-compressed.
For my requirement I had a POST web service with
Content-Encoding → gzip
Content-Type → application/vnd.oracle.adf.resourceitem+json
Challenge: I was getting the correct response code, but the response string consisted of junk characters.
Solution: This was because the output was compressed in gzip format and needed to be uncompressed.
The code for uncompressing the gzip content encoding is included above.
Hope it helps future users.
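A small sketch of how the decompression step could be made conditional, so the same reading code also works for uncompressed responses. GzipBodyReader and readBody are hypothetical names, and decoding the body as UTF-8 is an assumption. All bytes are collected first and decoded once at the end, which avoids breaking a multi-byte character that happens to be split across two 1024-byte chunks.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper: reads a response body, un-gzipping it only when the
// Content-Encoding value passed in says "gzip".
final class GzipBodyReader {
    static String readBody(InputStream raw, String contentEncoding) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        try (InputStream body = in) {
            ByteArrayOutputStream collected = new ByteArrayOutputStream();
            byte[] chunk = new byte[1024];
            int n;
            while ((n = body.read(chunk)) != -1) {
                collected.write(chunk, 0, n); // keep only the bytes actually read
            }
            // Decode once, after everything has arrived (UTF-8 is assumed here).
            return new String(collected.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

With the connection from the code above, this would be called as GzipBodyReader.readBody(con.getInputStream(), con.getContentEncoding()).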

Incomplete content (buffer) of webpage

I want to read the content of a webpage with the following methods, but I only get 60-70 percent of it.
I've tried 2 different methods to read the webpage, both with the same result. I also tried different Urls. I get no errors or timeouts.
What am I doing wrong?
URL url = new URL(uri.toString());
HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
try
{
InputStream in = new BufferedInputStream(urlConnection.getInputStream());
BufferedReader br = new BufferedReader(new InputStreamReader(in));
StringBuilder sb = new StringBuilder();
String line = null;
while ((line = br.readLine()) != null)
{
sb.append(line + "\n");
}
br.close();
this.content = sb.toString();
}
finally
{
urlConnection.disconnect();
}
AND
HttpGet get = new HttpGet(uri);
HttpClient defaultHttp = new DefaultHttpClient(httpParameters);
HttpResponse response = defaultHttp.execute(get);
StatusLine status = response.getStatusLine();
if(status.getStatusCode() == HttpStatus.SC_OK)
{
HttpEntity entity = response.getEntity();
InputStream stream = entity.getContent();
String encoding = "utf-8";
//long length = entity.getContentLength();
//if(entity.getContentEncoding() != null)
//{
// encoding = entity.getContentEncoding().getValue();
//}
//if(length > 0)
//{
byte[] buffer = new byte[1024];
long read = 0;
do
{
read = stream.read(buffer);
if(read > 0)
{
this.content += new String(buffer, encoding);
}
}while(read > 0);
//}
}
#edit
I've tried it with C# and WinForms. I read the complete html source of that webpage.
With java-android it doesn't work.
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("http://www.kicker.de");
HttpWebResponse response = (HttpWebResponse)request.GetResponse();
StreamReader reader = new StreamReader(response.GetResponseStream());
string content = reader.ReadToEnd();
reader.Close();
response.Close();

As far as I remember, the HttpURLConnection in Apache's util jar limits the maximum number of bytes in a response; I can't remember the exact number.
More often, though, the problem is that the HTTP connection is used on the UI thread. That is not safe, and the request may be killed. You can handle the HTTP request in a separate thread instead of the UI thread. So I'd like to know whether you are doing this on the UI thread.

I currently have the same problem. I tried my code in a simple Java application and received the whole content, but on Android the content is incomplete. This question is now a year old, so I guess you have solved it in the meantime. Can you please add your solution?
Edit:
I wrote the content to a file on my Android device. The content was complete!
It seems logcat doesn't show the complete output you receive from the device.
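To see the whole page in logcat anyway, the string can be logged in pieces. A minimal sketch, assuming each log entry is capped at roughly 4 KB (the exact limit may differ between devices and versions); LogChunker and chunk are made-up names:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits a long string into pieces of at most maxLen
// characters, so that each piece fits into a single log entry.
final class LogChunker {
    static List<String> chunk(String text, int maxLen) {
        if (maxLen <= 0) {
            throw new IllegalArgumentException("maxLen must be positive");
        }
        List<String> parts = new ArrayList<>();
        for (int start = 0; start < text.length(); start += maxLen) {
            int end = Math.min(text.length(), start + maxLen);
            parts.add(text.substring(start, end));
        }
        return parts;
    }
}
```

On Android each piece would then go to a separate Log.d call, for example with a chunk size of 3000 characters to stay below the assumed limit. The split is by character count, so a surrogate pair may end up divided between two pieces, which only matters for display.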

Read non-english characters from http get request

I have a problem getting Hebrew characters from an HTTP GET request.
I'm getting square characters like this: "[]" instead of the Hebrew characters.
The English characters are OK.
This is my function:
public String executeHttpGet(String urlString) throws Exception {
BufferedReader in = null;
try {
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet();
request.setURI(new URI(urlString));
HttpResponse response = client.execute(request);
in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(),"UTF-8"));
StringBuffer sb = new StringBuffer("");
String line = "";
String NL = System.getProperty("line.separator");
while ((line = in.readLine()) != null) {
sb.append(line + NL);
}
in.close();
String page = sb.toString();
// System.out.println(page);
return page;
} finally {
if (in != null) {
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
You can test it with this example URL:
String str = executeHttpGet("http://kavim-t.co.il/include/getXMLStations.asp?parent=7_%20_1");
Thank you!

The file you linked to doesn't seem to be UTF-8. I tested that it opens correctly using WINDOWS-1255 (hebrew encoding), you should try that instead of UTF-8.

Try a different website, it looks like it doesn't use UTF-8. Alternatively, UTF-16 may work but I haven't tried. Your code looks fine.

As others have pointed out, the content is not actually encoded as UTF-8. You might want to look at httpEntity.getContentType() to extract the actual encoding of the content, and then pass this to your InputStreamReader. This means your code will then be able to cope correctly with any encoding.
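Extracting the encoding from the header value could look roughly like this. It is only a sketch that works on the raw Content-Type string (for example "text/xml; charset=windows-1255"); ContentTypeCharset and charsetOf are hypothetical names, and returning null when no charset is declared is an assumption left for the caller to handle.

```java
import java.util.Locale;

// Hypothetical helper: pulls the charset parameter out of a Content-Type value.
final class ContentTypeCharset {
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim()
                    .toLowerCase(Locale.ROOT).equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding quotes, as in charset="utf-8".
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null; // no charset parameter declared
    }
}
```

The returned name can be handed to the InputStreamReader constructor in place of the hard-coded "UTF-8", with a default chosen by the caller when the result is null.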

Hi. As posted in this other question, "Special characters in PHP / MySQL", you can set the character set in the PHP file. In that example they set UTF-8, but you can choose a different one that supports the characters you need.

Java UTF-8 encoding not set to URLConnection

I'm trying to retrieve data from http://api.freebase.com/api/trans/raw/m/0h47
As you can see, the text contains signs like this: /ælˈdʒɪəriə/.
When I try to get the source of the page, I get text with signs like &#250; etc.
So far I've tried with the following code:
urlConnection.setRequestProperty("Accept-Charset", "UTF-8");
urlConnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded;charset=utf-8");
What am I doing wrong?
My entire code:
URL url = null;
URLConnection urlConn = null;
DataInputStream input = null;
try {
url = new URL("http://api.freebase.com/api/trans/raw/m/0h47");
} catch (MalformedURLException e) {e.printStackTrace();}
try {
urlConn = url.openConnection();
} catch (IOException e) { e.printStackTrace(); }
urlConn.setRequestProperty("Accept-Charset", "UTF-8");
urlConn.setRequestProperty("Content-Type", "text/plain; charset=utf-8");
urlConn.setDoInput(true);
urlConn.setUseCaches(false);
StringBuffer strBseznam = new StringBuffer();
if (strBseznam.length() > 0)
strBseznam.deleteCharAt(strBseznam.length() - 1);
try {
input = new DataInputStream(urlConn.getInputStream());
} catch (IOException e) { e.printStackTrace(); }
String str = "";
StringBuffer strB = new StringBuffer();
strB.setLength(0);
try {
while (null != ((str = input.readLine())))
{
strB.append(str);
}
input.close();
} catch (IOException e) { e.printStackTrace(); }

The HTML page is in UTF-8, and could use Arabic characters and such. But characters above Unicode code point 127 are still encoded as numeric entities like &#250;. Request headers such as Accept-Charset will not help, and loading as UTF-8 is entirely right.
You have to decode the entities yourself. Something like:
String decodeNumericEntities(String s) {
StringBuffer sb = new StringBuffer();
Matcher m = Pattern.compile("\\&#(\\d+);").matcher(s);
while (m.find()) {
int uc = Integer.parseInt(m.group(1));
m.appendReplacement(sb, "");
sb.appendCodePoint(uc);
}
m.appendTail(sb);
return sb.toString();
}
By the way, those entities could stem from processed HTML forms, that is, from the editing side of the web app.
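A self-contained variant of the decoding idea, extended as an assumption to hexadecimal references (such as &#xFA;) in addition to decimal ones (such as &#250;). NumericEntities and decode are hypothetical names; references that are out of range or too long to parse are left in the text unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces numeric character references with the
// characters they stand for.
final class NumericEntities {
    // Group 1 captures hexadecimal digits, group 2 captures decimal digits.
    private static final Pattern ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]+)|(\\d+));");

    static String decode(String s) {
        StringBuffer out = new StringBuffer();
        Matcher m = ENTITY.matcher(s);
        while (m.find()) {
            m.appendReplacement(out, ""); // copy the text before the match
            out.append(decodeOne(m));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String decodeOne(Matcher m) {
        try {
            int codePoint = m.group(1) != null
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2));
            if (Character.isValidCodePoint(codePoint)) {
                return new String(Character.toChars(codePoint));
            }
        } catch (NumberFormatException ignored) {
            // Too many digits to fit into an int: fall through.
        }
        return m.group(); // keep the reference as it was
    }
}
```

Named entities such as &amp;amp; are not handled here; for those, a full HTML unescaping routine from a library would be the safer choice.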
After the code was added to the question:
I have replaced DataInputStream with a (Buffered)Reader for text. InputStreams read binary data, bytes; Readers text, Strings. An InputStreamReader has as parameter an InputStream and an encoding, and returns a Reader.
try {
BufferedReader input = new BufferedReader(
new InputStreamReader(urlConn.getInputStream(), "UTF-8"));
StringBuilder strB = new StringBuilder();
String str;
while (null != (str = input.readLine())) {
strB.append(str).append("\r\n");
}
input.close();
} catch (IOException e) {
e.printStackTrace();
}

Try adding also the user agent to your URLConnection:
urlConnection.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36");
This solved my decoding problem like a charm.

I think the problem is in how you are reading from the stream. DataInputStream.readLine is deprecated because it does not convert bytes to characters properly, and readUTF is not a substitute for it here, since it expects Java's length-prefixed modified UTF-8 format rather than plain text. What I would do is create an InputStreamReader with the encoding set, and then read from a BufferedReader line by line (this would be inside your existing try/catch):
Charset charset = Charset.forName("UTF8");
InputStreamReader stream = new InputStreamReader(urlConn.getInputStream(), charset);
BufferedReader reader = new BufferedReader(stream);
StringBuffer responseBuffer = new StringBuffer();
String read = "";
while ((read = reader.readLine()) != null) {
responseBuffer.append(read);
}
