POST non-Latin data from Java to PHP

I post some data from Java to PHP:
try {
URL obj = new URL("http://myphpurl/insert.php");
HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
conn.setReadTimeout(10000);
conn.setConnectTimeout(15000);
conn.setRequestMethod(POST_METHOD);
conn.setDoInput(true);
conn.setDoOutput(true);
Map<String, String> params = new HashMap<String, String>();
params.put("title", "العربية");
OutputStream os = conn.getOutputStream();
BufferedWriter writer =
new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
writer.write(getQuery(params));
writer.flush();
writer.close();
os.close();
BufferedReader in =
new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
String inputLine;
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
LOG.debug("response {}", response);
in.close();
response = null;
inputLine = null;
conn.disconnect();
conn = null;
obj = null;
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
private String getQuery(Map<String, String> params) throws UnsupportedEncodingException {
StringBuilder result = new StringBuilder();
boolean first = true;
Iterator<Map.Entry<String, String>> it = params.entrySet().iterator();
while (it.hasNext()) {
if (first)
first = false;
else
result.append("&");
Map.Entry<String, String> pairs = it.next();
result.append(URLEncoder.encode(pairs.getKey(), "UTF-8"));
result.append("=");
result.append(URLEncoder.encode(pairs.getValue(), "UTF-8"));
it.remove(); // avoids a ConcurrentModificationException
}
return result.toString();
}
The insert.php file looks like this:
<?php
$posttitle = $_POST["title"];
echo "$posttitle";
echo urldecode($posttitle);
?>
The echo shows gibberish (مليون) instead of the actual title العربية.
This gibberish is then inserted into a MySQL database.
Additional info:
The database uses utf8_general_ci and does support Arabic (when I manually update the post using phpMyAdmin, it works).
I added UTF-8 to the InputStreamReader and OutputStreamWriter, and I saw the following behaviour:
Tomcat6 on windows, (PHP + mysql) on CentOS --> OK
Tomcat6 on CentOS , (PHP + mysql) on CentOS --> Not OK
Additional info 2
Posting using javascript works fine: The page responds with the right encoding.

There are a number of things that can go wrong with your code, and we can't test it. Also, I suggest using a full-featured HTTP client instead of URLConnection. Here is what you should check:
Pass the right source file encoding to javac (your test string is hardcoded; do you run the same binary on both machines, or do you run the program from your IDE or otherwise recompile on the deployment machine?)
Use UTF-8 to encode the query string
If your API uses the HTTP request body, check that both ends agree on the encoding, and/or declare it in the Content-Type header
PHP strings are plain byte strings (no encoding is attached to them), so make sure you use the appropriate charset parameters when connecting to the database, and/or transcode accordingly
When sending text from the PHP server, mind the encoding of the template and of the dynamic bits!
The number of moving parts is quite big. You should not debug via print/echo because that adds another level of transcoding. If possible, dump the raw text bytes and inspect them in a hex editor.
It's funny that Windows → Linux is OK, while Linux → Linux is not. You may want to check the locale on both CentOS machines (possibly by running the operating system command from inside the target process, i.e. the JVM and Apache).
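Part of that checklist can be verified without any network traffic. Here is a minimal sketch (the class and method names are made up for illustration, they are not from the question) that builds the form body with UTF-8 percent-encoding and prints the raw bytes of the title in hex, so you can compare them with what arrives on the PHP side. The title is written with \u escapes, which takes the javac source encoding out of the picture.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical debugging helper, not part of the question's code.
final class FormBodyDebug {

    // Builds key=value&key=value with UTF-8 percent-encoding.
    static String formEncode(Map<String, String> params) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(entry.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(entry.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    // Renders bytes as two-digit upper-case hex values separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // The Arabic title from the question, spelled as escapes so that
        // the result does not depend on the compiler's source encoding.
        String title = "\u0627\u0644\u0639\u0631\u0628\u064A\u0629";
        Map<String, String> params = new LinkedHashMap<>();
        params.put("title", title);

        System.out.println(formEncode(params));
        System.out.println(toHex(title.getBytes(StandardCharsets.UTF_8)));
    }
}
```

If the percent-encoded body looks right on both machines, the Java side is probably fine and the next thing to look at is the request header, e.g. sending Content-Type: application/x-www-form-urlencoded; charset=UTF-8, and then the PHP/MySQL connection charset.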

Try using CharsetEncoder to reveal possible encoding exceptions.
CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
encoder.onMalformedInput(CodingErrorAction.REPORT);
encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
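To make that suggestion concrete, here is a small self-contained sketch. The class name StrictEncoding and the helper encodeStrict are invented for this example. With CodingErrorAction.REPORT the encoder throws instead of silently substituting '?', so a charset that cannot represent the text shows up immediately.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class StrictEncoding {

    // Encodes text with the given charset and throws if any character
    // is malformed or cannot be represented in that charset.
    static byte[] encodeStrict(String text, Charset charset) throws CharacterCodingException {
        CharsetEncoder encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // The Arabic title from the question, written as escapes.
        String title = "\u0627\u0644\u0639\u0631\u0628\u064A\u0629";
        Charset[] candidates = {StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1};
        for (Charset charset : candidates) {
            try {
                System.out.println(charset + ": " + encodeStrict(title, charset).length + " bytes");
            } catch (CharacterCodingException e) {
                System.out.println(charset + ": cannot encode (" + e + ")");
            }
        }
    }
}
```

UTF-8 can encode the title, while ISO-8859-1 is rejected with an exception rather than producing question marks.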

Related

Java client POST to php script - empty $_POST

I have researched extensively and cannot find a solution. I have been using the solutions provided to other users and it does not seem to work for me.
My java code:
public class Post {
public static void main(String[] args) {
String name = "Bobby";
String address = "123 Main St., Queens, NY";
String phone = "4445556666";
String data = "";
try {
// POST as urlencoded is basically key-value pairs
// create key=value&key=value.... pairs
data += "name=" + URLEncoder.encode(name, "UTF-8");
data += "&address=" +
URLEncoder.encode(address, "UTF-8");
data += "&phone=" +
URLEncoder.encode(phone, "UTF-8");
// convert string to byte array, as it should be sent
byte[] dataBytes = data.toString().getBytes("UTF-8");
// open a connection to the site
URL url = new URL("http://xx.xx.xx.xxx/yyy.php");
HttpURLConnection conn =
(HttpURLConnection) url.openConnection();
// tell the server this is POST & the format of the data.
conn.setDoOutput(true);
conn.setRequestProperty("Content-Type",
"application/x-www-form-urlencoded");
conn.setRequestMethod("POST");
conn.setFixedLengthStreamingMode(dataBytes.length);
conn.getOutputStream().write(dataBytes);
conn.getInputStream();
// Print out the echo statements from the php script
BufferedReader in = new BufferedReader(
new InputStreamReader(url.openStream()));
String line;
while((line = in.readLine()) != null)
System.out.println(line);
in.close();
} catch(Exception e) {
e.printStackTrace();
}
}
}
and the php
<?php
echo $_POST["name"];
?>
The output I receive is an empty line. I tested to see if it was a php/server side issue by making an html form that sends data over to a similar script and prints the data on the screen and that worked. But, for the life of me, I cannot get this to work with a remote client.
I am using Ubuntu server and Apache.
Thank you in advance.
The problem is actually in what you read as output. You are doing two requests:
1) conn.getInputStream(); - sends the POST request with the desired body
2) BufferedReader in = new BufferedReader(
new InputStreamReader(url.openStream())); - sends an empty GET request (!!)
Change it to:
// ...
conn.getOutputStream().write(dataBytes);
BufferedReader in = new BufferedReader(
new InputStreamReader(conn.getInputStream()));
and see the result.
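Put together, the whole exchange on a single connection could look roughly like the sketch below. The class name SinglePost and the post helper are hypothetical, and the URL is whatever you pass on the command line. The point is that the body is written to and the reply is read from the same HttpURLConnection, so only one request is made.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class SinglePost {

    // Sends formBody as one POST and returns the reply read from the SAME connection.
    static String post(URL url, String formBody) throws IOException {
        byte[] payload = formBody.getBytes(StandardCharsets.UTF_8);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
            conn.setFixedLengthStreamingMode(payload.length);
            try (OutputStream out = conn.getOutputStream()) {
                out.write(payload);
            }
            StringBuilder response = new StringBuilder();
            // Reading from conn (not url.openStream()) reuses the POST request.
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    response.append(line).append('\n');
                }
            }
            return response.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SinglePost <script-url>");
            return;
        }
        String body = "name=" + URLEncoder.encode("Bobby", "UTF-8");
        System.out.print(post(new URL(args[0]), body));
    }
}
```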

Using Java bufferedreader to get html from URL

I'm trying to read all the HTML from a page using a BufferedReader as follows:
String charset = "UTF-8";
URLConnection connection = new URL(url).openConnection();
connection.addRequestProperty("User-Agent",
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)");
connection.setRequestProperty("Accept-Charset", charset);
InputStream response = connection.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(response,charset));
then I'm reading it line by line like this:
String data = br.readLine();
while(data != null){
data = br.readLine();
}
the problem is I'm getting something like:
}$B!)(BL$B!)(Bu"~$B!)$(D"C(B|X$B!x!)!x(B}
I've tried this:
do {
data = br.readLine();
SortedMap<String, Charset> map = Charset.availableCharsets();
for(Map.Entry<String, Charset> entry : map.entrySet()){
System.out.println(entry.getKey());
try {
System.out.println(new String(data.getBytes(entry.getValue())));
} catch (Exception e) {
e.printStackTrace();
}
}
}while(data!=null)
and I'm not getting any readable HTML in any of them. This is really weird, since it was working fine until this morning and I didn't change anything.
What am I doing wrong here? Is it possible that something changed in the website I'm trying to read? Please help.
The server has switched to sending compressed data, as you can see in its response headers:
Connection:keep-alive
Content-Encoding:gzip
Content-Type:text/html; charset=utf-8
Date:Mon, 09 Mar 2015 09:34:41 GMT
Server:nginx
Transfer-Encoding:chunked
Vary:Accept-Encoding
X-Powered-By:PHP/5.5.16-pl0-gentoo
As you can see, the content encoding is set to gzip (Content-Encoding:gzip).
So you have to decompress the gzipped content first:
GZIPInputStream gzis = new GZIPInputStream(connection.getInputStream());
BufferedReader br = new BufferedReader(new InputStreamReader(gzis,charset));
To view the headers of requests and responses, you could use a network monitor (see Free Network Monitor).
It is simpler to use the developer tools built into most common browsers. Here is the Chrome DevTools documentation on how to use the Network tab: https://developer.chrome.com/devtools/docs/network
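Since the server may switch compression on and off, it can be safer to look at the Content-Encoding header before wrapping the stream. The following is only a sketch: the class GzipAwareReader and its readBody method are invented names, and main simulates a gzipped response in memory instead of contacting a real site.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, for illustration only.
final class GzipAwareReader {

    // contentEncoding is the value of the Content-Encoding response header (may be null).
    static String readBody(InputStream raw, String contentEncoding, Charset charset) throws IOException {
        // Decompress only when the server says the body is gzipped.
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding) ? new GZIPInputStream(raw) : raw;
        StringBuilder html = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                html.append(line).append('\n');
            }
        }
        return html.toString();
    }

    public static void main(String[] args) throws IOException {
        // Build a gzipped payload in memory to stand in for the server response.
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
            gz.write("<html>ok</html>".getBytes(StandardCharsets.UTF_8));
        }
        InputStream fake = new ByteArrayInputStream(compressed.toByteArray());
        System.out.print(readBody(fake, "gzip", StandardCharsets.UTF_8));
    }
}
```

With a real connection you would call it as readBody(connection.getInputStream(), connection.getContentEncoding(), StandardCharsets.UTF_8).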

Selecting one among many possible choices in a Web page using Java

I want to get the HTML code of the following Web Page (http://www.studenti.ict.uniba.it/esse3/ListaAppelliOfferta.do) after:
selecting "Dipartimento di Informatica" among Facoltà
selecting "Informatica" (or one of the others available)
clicking "Avvia Ricerca"
I am not very experienced in this area, but I noticed that the URL of the page stays the same after each selection!?!
Can anyone describe, possibly in detail, how I can do that? Unfortunately I am not an expert in web programming.
Many thanks
After some tests, I found that the page refreshes itself with a POST request carrying these parameters:
fac_id:1012 --
cds_id:197 --
ad_id: -- Attività didattica
docente_id: -- ID of the selected teacher (docente)
data:06/03/2014 -- Date
Note that you did not mention the values of Attività didattica, Docente and Data esame.
Just send an HTTP request using HttpURLConnection (?) with these POST arguments, and read the tplmessage table from the output with an XML parser.
Try this tutorial for HTTP requests: click.
Try reading this to understand how to parse the response: click
An example using the code of the tutorial:
HttpURLConnection connection = null;
try
{
URL url = new URL("http://www.studenti.ict.uniba.it/esse3/ListaAppelliOfferta.do");
connection = (HttpURLConnection) url.openConnection(); // open the connection with the url
String params =
"fac_id=1012&cds_id=197"; // You need to add ad_id, docente_id and data
connection.setRequestMethod("POST"); // i need to use POST request method
connection.setRequestProperty("Content-Length", "" + Integer.toString(params.getBytes().length)); // It will add the length of params
connection.setRequestProperty("Content-Language", "it-IT"); // language italian
connection.setUseCaches (false);
connection.setDoInput (true);
connection.setDoOutput (true);
DataOutputStream wr = new DataOutputStream(
connection.getOutputStream ());
wr.writeBytes (params); // pass params
wr.flush (); // send request
wr.close ();
//Get Response
InputStream is = connection.getInputStream();
BufferedReader rd = new BufferedReader(new InputStreamReader(is));
String line;
StringBuilder response = new StringBuilder();
while((line = rd.readLine()) != null) {
response.append(line);
response.append('\r');
}
rd.close();
}
catch (MalformedURLException e)
{
e.printStackTrace();
} catch (IOException e)
{
e.printStackTrace();
}
finally
{
// close connection if created
if (connection != null)
connection.disconnect();
}
In response you will have the HTML of the page.
Anyway, use Chrome's developer tools to see the request arguments.

How to post with java without urlencoding query part of url

I am trying to use a signed java applet to post to a url like:
http://some.domain.com/something/script.asp?param=5041414F9015496EA699F3D2DBAB4AC2|178411|163843|557|1|1|164||attempt|1630315
But when java makes the connection, the java console shows:
network: Connecting http://some.domain.com/something/script.asp?param=5041414F9015496EA699F3D2DBAB4AC2%7C178411%7C163843%7C557%7C1%7C1%7C164%7C%7Cattempt%7C1630315
I do not want java to urlencode the pipes in the query from | to %7c. It seems the service I'm connecting to doesn't urldecode the param, and I can't change the server side code. Is there a way in java to make the post without escaping the query?
The java I'm using is below:
try {
URL url = new URL(myURL);
URLConnection connection = url.openConnection();
connection.setDoOutput(true);
OutputStreamWriter out = new OutputStreamWriter(
connection.getOutputStream());
out.write(toSend);
out.close();
BufferedReader in = new BufferedReader(
new InputStreamReader(
connection.getInputStream()));
String decodedString = "";
while ((decodedString = in.readLine()) != null) {
totalResponse = totalResponse + decodedString;
}
in.close();
} catch (Exception ex) {
}
Thank you for any help!
The URL class does not do any encoding. Testing this on my dev server confirmed this suspicion. Your code must be encoding the '|' character somewhere before the snippet you included in your question.
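You can check this locally without any server. The sketch below uses a made-up class name, PipeEncodingCheck, and a shortened param value. It prints the query exactly as java.net.URL stores it and, for contrast, what an explicit URLEncoder call produces. It only shows what the URL object holds, not what an applet plugin or proxy layer may do to the request afterwards.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only.
final class PipeEncodingCheck {

    // Returns the query string exactly as java.net.URL stores it.
    static String queryAsStored(String spec) throws MalformedURLException {
        return new URL(spec).getQuery();
    }

    public static void main(String[] args) throws MalformedURLException, UnsupportedEncodingException {
        String spec = "http://some.domain.com/something/script.asp?param=AB12|178411|163843";

        // URL keeps the pipes as typed.
        System.out.println(queryAsStored(spec));

        // An explicit encoding step is what turns | into %7C.
        System.out.println(URLEncoder.encode("AB12|178411", "UTF-8"));
    }
}
```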

Java connecting to Http which method to use?

I have been looking around at different ways to connect to URLs and there seem to be a few.
My requirements are to do POST and GET queries on a URL and retrieve the result.
I have seen
URL class
DefaultHttpClient class
HttpClient - apache commons
Which method is best?
My rule of thumb and recommendation: Don't introduce dependencies and 3rd party libraries if it's fairly easy to get away without.
In this case I would say: if you need features such as multiple requests per established connection, session handling, or cookie support, go for HttpClient.
If you only need to perform an HTTP GET, this will suffice:
Getting Text from a URL
try {
// Create a URL for the desired page
URL url = new URL("http://hostname:80/index.html");
// Read all the text returned by the server
BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
String str;
while ((str = in.readLine()) != null) {
// str is one line of text; readLine() strips the newline character(s)
}
in.close();
} catch (MalformedURLException e) {
} catch (IOException e) {
}
Sending a POST Request Using a URL
try {
// Construct data
String data = URLEncoder.encode("key1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
data += "&" + URLEncoder.encode("key2", "UTF-8") + "=" + URLEncoder.encode("value2", "UTF-8");
// Send data
URL url = new URL("http://hostname:80/cgi");
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
// Get the response
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String line;
while ((line = rd.readLine()) != null) {
// Process line...
}
wr.close();
rd.close();
} catch (Exception e) {
}
Both methods work great. (I've even done manual gets/posts with cookies.)
HttpClient is the way to go if your needs go beyond trivial URL connections (e.g. proxy authentication such as NTLM). There is at least one comparison here of standard HTTP client functionality across the libraries provided by the JRE, Apache HTTP Client and others.
If you are using JDK 1.4 or earlier and have fairly large data in your POST requests, such as large file uploads, the default HttpURLConnection that comes with the JRE is bound to run out of memory at some point, since it buffers the entire body before posting. Additionally, it does not support some advanced HTTP features such as chunked encoding.
So I'd recommend it only if your requests are trivial and you are not posting large data, as aioobe did.
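For what it's worth, on Java 5 and later HttpURLConnection has setChunkedStreamingMode and setFixedLengthStreamingMode, which avoid buffering the whole body. Here is a rough sketch of a streamed upload. The class StreamingUpload and the upload helper are invented names, and the URL and file path come from the command line.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class, for illustration only.
final class StreamingUpload {

    // Streams data to url in chunks and returns the HTTP status code.
    static int upload(URL url, InputStream data) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            conn.setRequestMethod("POST");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/octet-stream");
            // Use chunked transfer encoding instead of buffering the whole body.
            conn.setChunkedStreamingMode(8192);
            try (OutputStream out = conn.getOutputStream()) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = data.read(buffer)) != -1) {
                    out.write(buffer, 0, read);
                }
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StreamingUpload <url> <file>");
            return;
        }
        try (InputStream file = Files.newInputStream(Paths.get(args[1]))) {
            System.out.println("HTTP " + upload(new URL(args[0]), file));
        }
    }
}
```

Not every server accepts chunked request bodies, so if the length is known up front, setFixedLengthStreamingMode is the more conservative choice.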
