JSONParser returns Unexpected character () at position 0 - java

I'm having a problem parsing a JSON response. Every time I try, the following error is returned: Unexpected character () at position 0.
public Object execute(HttpRequestBase request) {
    DefaultHttpClient client = new DefaultHttpClient();
    HttpResponse response = null;
    Object object = null;
    try {
        response = client.execute(request);
        InputStream is = response.getEntity().getContent();
        BufferedReader br = new BufferedReader(new InputStreamReader(is));
        StringBuilder builder = new StringBuilder();
        String output;
        while ((output = br.readLine()) != null) {
            builder.append(output).append("\n");
        }
        if (response.getStatusLine().getStatusCode() == 200) {
            object = new JSONParser().parse(builder.toString());
            client.getConnectionManager().shutdown();
        } else {
            LOG.log(Level.SEVERE, builder.toString());
            throw new RuntimeException(builder.toString());
        }
    } catch (IOException | ParseException ex) {
        LOG.log(Level.SEVERE, ex.toString());
    } finally {
    }
    return object;
}
PS:
My response returns well-formatted JSON.
The problem happens when this line runs: object = new JSONParser().parse(builder.toString());
This is part of my JSON file:
[
    {
        "id":2115,
        "identificacao":"17\/2454634-6",
        "ultima_atualizacao":null
    },
    {
        "id":2251,
        "identificacao":"17\/3052383-2",
        "ultima_atualizacao":"2017-11-21"
    },
    {
        "id":2258,
        "identificacao":"17\/3070024-6",
        "ultima_atualizacao":null
    },
    {
        "id":2257,
        "identificacao":"17\/3070453-5",
        "ultima_atualizacao":null
    }
]

Most probably your content has a non-printing special character at the beginning. For UTF-8 encoded data this may be a BOM (byte order mark).
Please post the start of your content as a byte[].
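A quick way to inspect that is to dump the first bytes of the raw response before wrapping it in a Reader. This is only a diagnostic sketch (not part of the original answer); it assumes you hold the entity's InputStream in a variable such as is, and it consumes those bytes, so only use it while debugging:

// Diagnostic sketch: print the first bytes of the response in hex to spot a BOM
// (EF BB BF) or other invisible characters at the start of the content
byte[] head = new byte[16];
int n = is.read(head);
for (int i = 0; i < n; i++) {
    System.out.printf("%02X ", head[i]);
}
System.out.println();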

It is happening because of the UTF-8 BOM.
What is the UTF-8 BOM?
The UTF-8 BOM is a sequence of bytes (EF BB BF) that allows the reader to identify a file as being encoded in UTF-8.
Normally, the BOM is used to signal the endianness of an encoding, but since endianness is irrelevant to UTF-8, the BOM is unnecessary.
How to solve the issue?
Convert the encoding of your .json file (or any other file) to plain UTF-8 instead of UTF-8 with BOM.
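If you can't change the producer of the file or response, you can also strip a leading BOM from the string before parsing. A minimal sketch, assuming the json-simple JSONParser and the builder variable from the question:

// Strip a leading UTF-8 BOM (decoded as U+FEFF) before parsing, if one is present
String content = builder.toString();
if (!content.isEmpty() && content.charAt(0) == '\uFEFF') {
    content = content.substring(1);
}
Object object = new JSONParser().parse(content);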

Use this instead of BufferedReader:
Map<String, Object> result =
    (Map<String, Object>) JSONValue.parse(IOUtils.toString(response.getEntity().getContent(), "utf-8"));
You will get your JSON data into a Map (or a List, if the top-level JSON is an array as in your sample) and you can simply iterate over it.

I believe your problem is with the content-type. Use something like this:
HttpEntity content = response.getEntity();
StringBuilder sb = new StringBuilder();
InputStream is = content.getContent();
InputStreamReader isr = new InputStreamReader(is, "UTF-8");
int character;
do {
    character = isr.read();
    if (character >= 0) {
        sb.append((char) character);
    }
} while (character >= 0);
return sb.toString();
No need for BufferedReader, InputStreamReader can handle it fine.
Hope it helps!

I found the problem!
My JSON response contained an unexpected whitespace character, so I added this to my code:
String content = builder.toString();
content = content.replaceAll("\\uFEFF", "");
This \uFEFF (the BOM character) was my problem! It doesn't happen in my dev environment, only in production!

Related

Reading rest service of content type application/vnd.oracle.adf.resourceitem+json

I have a web service whose content type is application/vnd.oracle.adf.resourceitem+json.
The HttpEntity of the response obtained by hitting this service looks like this:
ResponseEntityProxy{[Content-Type: application/vnd.oracle.adf.resourceitem+json,Content-Length: 3,Chunked: false]}
When I try to convert this HttpEntity into a String, it gives me an empty JSON object: {}.
Below are the ways I tried to convert the HttpEntity to a String:
1.
String strResponse = EntityUtils.toString(response.getEntity());
2.
String strResponse = "";
String inputLine;
BufferedReader br = new BufferedReader(new InputStreamReader(entity.getContent()));
try {
    while ((inputLine = br.readLine()) != null) {
        System.out.println(inputLine);
        strResponse += inputLine;
    }
    br.close();
} catch (IOException e) {
    e.printStackTrace();
}
3.
response.getEntity().writeTo(new FileOutputStream(new File("C:\\Users\\harshita.sethi\\Documents\\Chabot\\post.txt")));
All of them return the String {}.
Can anyone tell me what I am doing wrong?
Is this because of the content type?
The above code was still giving the same response with an empty JSON object, so I rewrote it as shown below. This version runs perfectly fine.
URL url = new URL(urlString);
HttpsURLConnection con = (HttpsURLConnection) url.openConnection();
con.setDoOutput(true);
con.setRequestMethod("POST");
con.addRequestProperty("Authorization", getAuthToken());
con.addRequestProperty("Content-Type", "application/vnd.oracle.adf.resourceitem+json;charset=utf-8");

String input = String.format("{\"%s\":\"%s\",\"%s\":\"%s\"}", field, value, field2, value2);
System.out.println(input);

OutputStream outputStream = con.getOutputStream();
outputStream.write(input.getBytes());
outputStream.flush();

con.connect();
System.out.println(con.getResponseCode());

// Uncompressing gzip content encoding
GZIPInputStream gzip = new GZIPInputStream(con.getInputStream());
StringBuffer szBuffer = new StringBuffer();
byte tByte[] = new byte[1024];
while (true) {
    int iLength = gzip.read(tByte, 0, 1024);
    if (iLength < 0) {
        break;
    }
    szBuffer.append(new String(tByte, 0, iLength));
}
con.disconnect();
returnString = szBuffer.toString();
Authentication method
private String getAuthToken() {
    String name = user;
    String pwd = this.password;
    String authString = name + ":" + pwd;
    byte[] authEncBytes = Base64.getEncoder().encode(authString.getBytes());
    System.out.println(new String(authEncBytes));
    return "Basic " + new String(authEncBytes);
}
In case anybody faces the same issue, let me share the challenges I faced and how I rectified them.
The above code works for all content types and can be used with any method (GET, POST, PUT, DELETE).
For my requirement I had a POST web service with:
Content-Encoding → gzip
Content-Type → application/vnd.oracle.adf.resourceitem+json
Challenge: I was able to get the correct response code, but I was getting junk characters in my response string.
Solution: This was because the output was compressed in gzip format and needed to be uncompressed.
The code for uncompressing the gzip-encoded content is shown above.
Hope it helps future users.
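As a related note (not from the original answer): if the service doesn't always compress its output, you can check the Content-Encoding response header before deciding whether to wrap the stream in GZIPInputStream. A minimal sketch using the HttpsURLConnection from the code above (assumes java.util.zip.GZIPInputStream and java.nio.charset.StandardCharsets imports):

// Only wrap in GZIPInputStream when the server actually sent gzip-compressed content
InputStream raw = con.getInputStream();
String contentEncoding = con.getContentEncoding(); // e.g. "gzip" or null
InputStream body = "gzip".equalsIgnoreCase(contentEncoding) ? new GZIPInputStream(raw) : raw;

BufferedReader reader = new BufferedReader(new InputStreamReader(body, StandardCharsets.UTF_8));
StringBuilder response = new StringBuilder();
String line;
while ((line = reader.readLine()) != null) {
    response.append(line);
}
reader.close();
String returnString = response.toString();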

â® characters getting converted to question marks when getting messages back

I am having a very weird issue.
I am putting messages onto and getting messages from Amazon AWS SQS.
While putting, I compress and encode the messages like this:
String responseMessageBodyOriginal = gson.toJson(responseData);
String responseMessageBodyCompressed = compressToBase64String(responseMessageBodyOriginal);
AmazonSqsHelper.sendMessage(responseMessageBodyCompressed, queue, null);
The compression and encoding function looks like this:
public static String compressToBase64String(String data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    gzip.write(data.getBytes());
    gzip.close();
    byte[] compressedBytes = bos.toByteArray();
    bos.close();
    return new String(Base64.encodeBase64(compressedBytes));
}
On the other hand, this is the code that receives the messages:
List<Message> sqsMessageList = AmazonSqsHelper.receiveMessages(queueUrl, max_message_read_count,
        default_visibility_timeout);
int num_messages = sqsMessageList.size();
if (num_messages > 0) {
    for (Message m : sqsMessageList) {
        String responseMessageBodyCompressed = m.getBody();
        String responseMessageBodyOriginal = decompressFromBase64String(responseMessageBodyCompressed);
    }
}
And the function used for decoding and unzipping looks like this:
public static String decompressFromBase64String(String compressedString) throws IOException {
    byte[] compressedBytes = Base64.decodeBase64(compressedString);
    ByteArrayInputStream bis = new ByteArrayInputStream(compressedBytes);
    GZIPInputStream gis = new GZIPInputStream(bis);
    BufferedReader br = new BufferedReader(new InputStreamReader(gis, "UTF-8"));
    StringBuilder sb = new StringBuilder();
    String line;
    while ((line = br.readLine()) != null) {
        sb.append(line);
    }
    br.close();
    gis.close();
    bis.close();
    return sb.toString();
}
But the problem is that, at times, if I pass characters like "â®", they get converted to ???? when I print the message after decoding.
I am not able to figure out why the encoding and decoding are behaving this way. Any help would be appreciated.
The issue is that encoding is done using the platform's default charset (data.getBytes()), while decoding uses UTF-8.
In compressToBase64String, change data.getBytes() to data.getBytes(StandardCharsets.UTF_8).
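Applied to the function above, that fix would look roughly like this (a sketch; it assumes the commons-codec Base64 from the original code and a java.nio.charset.StandardCharsets import):

public static String compressToBase64String(String data) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
    GZIPOutputStream gzip = new GZIPOutputStream(bos);
    // Encode explicitly as UTF-8 so it matches the UTF-8 used when decompressing
    gzip.write(data.getBytes(StandardCharsets.UTF_8));
    gzip.close();
    byte[] compressedBytes = bos.toByteArray();
    bos.close();
    // Base64 output is ASCII-safe, so any charset works here; UTF-8 keeps it explicit
    return new String(Base64.encodeBase64(compressedBytes), StandardCharsets.UTF_8);
}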

Android HTTP Request Encoding

I want to do an HTTP request in my Android app, using the following code:
BufferedReader in = null;
try {
    HttpClient client = new DefaultHttpClient();
    HttpGet request = new HttpGet();
    request.setURI(new URI("http://www.example.de/example.php"));
    HttpResponse response = client.execute(request);
    in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
    StringBuffer sb = new StringBuffer("");
    String line = "";
    String NL = System.getProperty("line.separator");
    while ((line = in.readLine()) != null) {
        sb.append(line + NL);
    }
    in.close();
    String page = sb.toString();
    System.out.println(page);
    return page;
} finally {
    if (in != null) {
        try {
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
The web page I'm calling is a PHP script which returns a string. My problem is that special characters (ä, ü, ö, €, etc.) are shown as a question mark in a box. How can I get these characters?
I think it's a problem with the encoding (German app -> UTF-8?).
Maybe you could try to set the encoding when printing to the console. Sometimes the characters are correctly returned from the server but fail to display in the console:
String page = sb.toString();
PrintStream out = new PrintStream(System.out, true, "UTF-8");
out.println(page);
I have played around with your code, against http://www.google.de.
I was able to "hack" something, not sure it's the most elegant solution though.
After the line:
HttpResponse response = client.execute(request);
... I've added:
HttpEntity e = response.getEntity();
Header ct = e.getContentType();
HeaderElement[] he = ct.getElements();
String charset = null;  // declared outside the if so it can be used below
if (he.length > 0
        && he[0].getParameters().length > 0
        && he[0].getParameter(0) != null
        && he[0].getParameter(0).getName().equals("charset")) {
    charset = he[0].getParameter(0).getValue();
    // with google.de, will print ISO latin ("ISO-8859-1")
    Log.d("com.example.test", charset);
}
... then you can add the charset representation, or its Java equivalent as a second argument of your InputStreamReader constructor call:
in = new BufferedReader(
        new InputStreamReader(
                response.getEntity().getContent(),
                charset != null ? charset : "UTF-8"));
Let me know if that works out for you.
Also note that in order to check Java charset equivalences, you could use Charset.forName(String charsetName) and catch the relevant Exceptions (and then revert to Charset.defaultCharset() or UTF-8, etc. in your catch statement).
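For example, a minimal sketch of that fallback (charsetName stands for whatever value you pulled from the Content-Type header; assumes java.nio.charset.Charset and StandardCharsets imports):

Charset resolved;
try {
    resolved = Charset.forName(charsetName);
} catch (IllegalArgumentException ex) {
    // covers IllegalCharsetNameException and UnsupportedCharsetException
    resolved = StandardCharsets.UTF_8; // or Charset.defaultCharset()
}
in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), resolved));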

Read non-english characters from http get request

I have a problem getting Hebrew characters from an HTTP GET request.
I'm getting square characters like this: "[]" instead of the Hebrew characters.
The English characters are OK.
This is my function:
public String executeHttpGet(String urlString) throws Exception {
    BufferedReader in = null;
    try {
        HttpClient client = new DefaultHttpClient();
        HttpGet request = new HttpGet();
        request.setURI(new URI(urlString));
        HttpResponse response = client.execute(request);
        in = new BufferedReader(new InputStreamReader(response.getEntity().getContent(), "UTF-8"));
        StringBuffer sb = new StringBuffer("");
        String line = "";
        String NL = System.getProperty("line.separator");
        while ((line = in.readLine()) != null) {
            sb.append(line + NL);
        }
        in.close();
        String page = sb.toString();
        // System.out.println(page);
        return page;
    } finally {
        if (in != null) {
            try {
                in.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
You can test it with this example URL:
String str = executeHttpGet("http://kavim-t.co.il/include/getXMLStations.asp?parent=7_%20_1");
Thank you!
The file you linked to doesn't seem to be UTF-8. I tested that it opens correctly using WINDOWS-1255 (Hebrew encoding); you should try that instead of UTF-8.
Try a different website; it looks like this one doesn't use UTF-8. Alternatively, UTF-16 may work, but I haven't tried. Your code looks fine.
As others have pointed out, the content is not actually encoded as UTF-8. You might want to look at httpEntity.getContentType() to extract the actual encoding of the content, and then pass this to your InputStreamReader. This means your code will then be able to cope correctly with any encoding.
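A minimal sketch of that idea, assuming Apache HttpClient 4.2+ where the org.apache.http.entity.ContentType helper is available, with the WINDOWS-1255 fallback suggested above:

// Use the charset declared in the response's Content-Type header,
// falling back to Hebrew Windows-1255 when none is declared
HttpEntity entity = response.getEntity();
ContentType contentType = ContentType.getOrDefault(entity);
Charset charset = contentType.getCharset() != null
        ? contentType.getCharset()
        : Charset.forName("windows-1255");
in = new BufferedReader(new InputStreamReader(entity.getContent(), charset));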
Hi, as posted in this other question, Special characters in PHP / MySQL, you can set the character encoding in the PHP file. In that example they set UTF-8, but you can set a different encoding that supports the characters you need.

Convert InputStream to String with encoding given in stream data

My input is an InputStream which contains an XML document. The encoding used in the XML is unknown in advance; it is declared in the first line of the document.
From this InputStream, I want to get the whole document into a String.
To do this, I use a BufferedInputStream to mark the beginning of the stream and read the first line. I read this first line to get the encoding, and then I use an InputStreamReader to generate a String with the correct encoding.
It seems this is not the best way to achieve this goal, because it produces an OutOfMemoryError.
Any idea how to do it?
public static String streamToString(final InputStream is) {
    String result = null;
    if (is != null) {
        BufferedInputStream bis = new BufferedInputStream(is);
        bis.mark(Integer.MAX_VALUE);
        final StringBuilder stringBuilder = new StringBuilder();
        try {
            // stream reader that handles encoding
            final InputStreamReader readerForEncoding = new InputStreamReader(bis, "UTF-8");
            final BufferedReader bufferedReaderForEncoding = new BufferedReader(readerForEncoding);
            String encoding = extractEncodingFromStream(bufferedReaderForEncoding);
            if (encoding == null) {
                encoding = DEFAULT_ENCODING;
            }
            // stream reader that handles encoding
            bis.reset();
            final InputStreamReader readerForContent = new InputStreamReader(bis, encoding);
            final BufferedReader bufferedReaderForContent = new BufferedReader(readerForContent);
            String line = bufferedReaderForContent.readLine();
            while (line != null) {
                stringBuilder.append(line);
                line = bufferedReaderForContent.readLine();
            }
            bufferedReaderForContent.close();
            bufferedReaderForEncoding.close();
        } catch (IOException e) {
            // reset string builder
            stringBuilder.delete(0, stringBuilder.length());
        }
        result = stringBuilder.toString();
    } else {
        result = null;
    }
    return result;
}
The call to mark(Integer.MAX_VALUE) is causing the OutOfMemoryError, since it's trying to allocate 2GB of memory.
You can solve this by using an iterative approach. Set the mark readLimit to a reasonable value, say 8K. In 99% of cases this will work, but in pathological cases, e.g. 16K of spaces between the attributes in the declaration, you will need to try again. Thus, have a loop that tries to find the encoding, but if it doesn't find it within the given mark region, it tries again, doubling the requested mark readLimit size.
To be sure you don't advance the input stream past the mark limit, you should read the InputStream yourself, up to the mark limit, into a byte array. You then wrap the byte array in a ByteArrayInputStream and pass that to the constructor of the InputStreamReader assigned to 'readerForEncoding', as sketched below.
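A minimal sketch of that approach, assuming the OP's extractEncodingFromStream helper and DEFAULT_ENCODING constant; the 8K probe limit is an arbitrary starting point and the retry-with-doubling loop described above is omitted for brevity:

public static String streamToString(final InputStream is) throws IOException {
    final int probeLimit = 8 * 1024;
    BufferedInputStream bis = new BufferedInputStream(is);
    bis.mark(probeLimit);

    // Read at most probeLimit bytes into a byte array for encoding detection,
    // so the underlying stream is never advanced past the mark limit
    byte[] probe = new byte[probeLimit];
    int total = 0, n;
    while (total < probeLimit && (n = bis.read(probe, total, probeLimit - total)) != -1) {
        total += n;
    }
    BufferedReader readerForEncoding = new BufferedReader(
            new InputStreamReader(new ByteArrayInputStream(probe, 0, total), "UTF-8"));
    String encoding = extractEncodingFromStream(readerForEncoding);
    if (encoding == null) {
        encoding = DEFAULT_ENCODING;
    }

    // Rewind and read the whole document with the detected encoding
    bis.reset();
    StringBuilder sb = new StringBuilder();
    BufferedReader readerForContent = new BufferedReader(new InputStreamReader(bis, encoding));
    String line;
    while ((line = readerForContent.readLine()) != null) {
        sb.append(line);
    }
    readerForContent.close();
    return sb.toString();
}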
You can use this method to convert an InputStream to a String; this might help you:
private String convertStreamToString(InputStream input) throws Exception {
    BufferedReader reader = new BufferedReader(new InputStreamReader(input));
    StringBuilder sb = new StringBuilder();
    String line = null;
    while ((line = reader.readLine()) != null) {
        sb.append(line);
    }
    input.close();
    return sb.toString();
}
