I have been looking around at different ways to connect to URLs and there seem to be a few.
My requirements are to do POST and GET queries on a URL and retrieve the result.
I have seen
URL class
DefaultHttpClient class
HttpClient - apache commons
Which method is best?
My rule of thumb and recommendation: Don't introduce dependencies and 3rd party libraries if it's fairly easy to get away without.
In this case I would say: if you need efficiency (such as multiple requests per established connection), session handling, cookie support, etc., go for HttpClient.
If you only need to perform an HTTP get, this will suffice:
Getting Text from a URL
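// Requires: java.io.*, java.net.*, java.nio.charset.StandardCharsets
try {
    // Create a URL for the desired page
    URL url = new URL("http://hostname:80/index.html");
    // Read all the text returned by the server (assumes the page is UTF-8);
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader in = new BufferedReader(
            new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
        String str;
        while ((str = in.readLine()) != null) {
            // str is one line of text; readLine() strips the newline character(s)
        }
    }
} catch (MalformedURLException e) {
    // The URL string could not be parsed
    e.printStackTrace();
} catch (IOException e) {
    // Connecting or reading failed
    e.printStackTrace();
}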
Sending a POST Request Using a URL
try {
    // Construct form-encoded data
    String data = URLEncoder.encode("key1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
    data += "&" + URLEncoder.encode("key2", "UTF-8") + "=" + URLEncoder.encode("value2", "UTF-8");
    // Send data; setDoOutput(true) turns the request into a POST
    URL url = new URL("http://hostname:80/cgi");
    URLConnection conn = url.openConnection();
    conn.setDoOutput(true);
    // Close the writer before reading the response
    try (OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream(), "UTF-8")) {
        wr.write(data);
    }
    // Get the response
    try (BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()))) {
        String line;
        while ((line = rd.readLine()) != null) {
            // Process line...
        }
    }
} catch (IOException e) {
    // Do not swallow failures silently
    e.printStackTrace();
}
Both methods work great. (I've even done manual gets/posts with cookies.)
HttpClient is the way to go if your needs go beyond trivial URL connections (e.g. proxy authentication such as NTLM). There is at least one comparison of standard HTTP client functionality across the libraries provided by the JRE, Apache HttpClient, and others.
If you are using JDK 1.4 or earlier and have fairly large data in your POST requests, such as large file uploads, the default HttpURLConnection that comes with the JRE is bound to run out of memory at some point, since it buffers the entire body before posting. Additionally, it does not support some advanced HTTP features such as chunked transfer encoding.
So I'd recommend it only if your requests are trivial and you are not posting large amounts of data, as in aioobe's example.
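For what it's worth, the buffering described above is the default behaviour. On newer JDKs, HttpURLConnection also offers setChunkedStreamingMode and setFixedLengthStreamingMode (both added in Java 5), which stream the body instead of holding it in memory. Below is a minimal sketch, assuming Java 7 or later; the class name StreamingUpload and its two methods are made up for illustration and are not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;

final class StreamingUpload {
    /**
     * Configures an unconnected connection for a streamed POST.
     * A negative contentLength means the size is not known in advance.
     */
    static void prepare(HttpURLConnection conn, long contentLength) throws IOException {
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        if (contentLength >= 0) {
            // Body length known: sent with a Content-Length header, no buffering
            conn.setFixedLengthStreamingMode(contentLength);
        } else {
            // Body length unknown: sent with chunked transfer encoding
            conn.setChunkedStreamingMode(8192);
        }
    }

    /** Copies body to the server in small pieces and returns the HTTP status code. */
    static int send(HttpURLConnection conn, InputStream body) throws IOException {
        try (OutputStream out = conn.getOutputStream()) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = body.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        }
        return conn.getResponseCode();
    }
}
```

prepare has to be called before the connection is opened. One trade-off to be aware of: in streaming mode the connection cannot transparently resend the body, so redirects and authentication challenges surface as exceptions rather than being followed.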
I want to commit the text file "demo2.txt" to a Bitbucket server using the REST API. I can upload the same file using Postman, but it does not work from Java code. As shown in the code below, I want to send the String object "str" as the body. Can someone help me upload the file to the Bitbucket server? Please also let me know if there is any other way to do this.
URL url = new URL("https://api.bitbucket.org/2.0/repositories/{team name}/{repository name}/src");
HttpURLConnection httpCon = (HttpURLConnection) url.openConnection();
httpCon.setRequestProperty("X-Requested-with", "Curl");
httpCon.setDoOutput(true);
httpCon.setDoInput(true);
httpCon.setRequestProperty("Connection", "Keep-Alive");
httpCon.setRequestProperty("Content-Type", "multipart/form-data; boundary="+boundary);
httpCon.setRequestProperty("Accept", "application/x-www-form-urlencoded");
httpCon.setRequestProperty("Authorization", basicauth);
httpCon.setRequestMethod("POST");
String str =
"{"
+ "\"-F\":\"File3=#/D:/log/demo2.txt\" "
+ "}";
try {
OutputStream output = httpCon.getOutputStream();
output.write(str.getBytes());
output.close();
} catch(Exception e){
System.out.println(e.getMessage());
}
int responseCode = httpCon.getResponseCode();
String inputLine;
StringBuffer response = new StringBuffer();
if (responseCode == HttpURLConnection.HTTP_OK || responseCode == HttpURLConnection.HTTP_CREATED){
BufferedReader in = new BufferedReader(new InputStreamReader(httpCon.getInputStream()));
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
in.close();
List<String> message = new ArrayList<>();
message.add(response.toString());
}
If this is all of your code, then your problem may be as simple as the fact that you're not making any sort of call to finalize the request...to tell HttpURLConnection that you're done forming the request and want it to complete. There are two things you can do to help this:
Close the output stream when you're done writing to it. You're generally supposed to do this. Here, you can call output.close(). Better still, since you have a try/catch block already anyway, use a "try with resources" construct to make sure that the stream is closed no matter what happens (assuming you're using a newer version of Java that supports this).
Make some sort of call to query the response to the request. It may be that the request is not being fully sent until you do this. Try calling httpCon.getResponseCode() at the bottom of your code.
Given that you have provided no information as to what "it's not working with Java code" means, this may be useful information but not the ultimate solution to your problem. Your code does look good other than exhibiting these omissions.
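The two suggestions above could be sketched roughly as follows. This is not the asker's code: the class PostAndRead and its methods are hypothetical, and decoding the response as UTF-8 is an assumption about the server. It also reads the error stream for 4xx/5xx answers, which is usually where a REST API explains why it rejected a request.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.nio.charset.StandardCharsets;

final class PostAndRead {
    /** Reads a stream to the end and decodes it as UTF-8 (an assumption). */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Sends body with POST and returns the status code followed by the response text. */
    static String post(HttpURLConnection conn, byte[] body) throws IOException {
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // try-with-resources closes the output stream even if write() throws
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        // Asking for the status code makes sure the request is actually completed
        int status = conn.getResponseCode();
        // For 4xx/5xx the body is on the error stream, which may be null
        InputStream stream = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        if (stream == null) {
            return status + " ";
        }
        try (InputStream in = stream) {
            return status + " " + readAll(in);
        }
    }
}
```

Headers such as Content-Type and Authorization would still be set on the connection before calling post, exactly as in the question.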
I have an assignment for school that involves writing a simple web crawler that crawls Wikipedia. The assignment stipulates that I can't use any external libraries so I've been playing around with the java.net.URL class. Based on the official tutorial and some code given by my professor I have:
public static void main(String[] args) {
System.setProperty("sun.net.client.defaultConnectTimeout", "500");
System.setProperty("sun.net.client.defaultReadTimeout", "1000");
try {
URL url = new URL(BASE_URL + "/wiki/Physics");
InputStream is = url.openStream();
BufferedReader br = new BufferedReader(new InputStreamReader(is));
String inputLine;
int lineNum = 0;
while ((inputLine = br.readLine()) != null && lineNum < 10) {
System.out.println(inputLine);
lineNum++;
}
is.close();
}
catch (MalformedURLException e) {
System.out.println(e.getMessage());
}
catch (IOException e) {
System.out.println(e.getMessage());
}
}
In addition, the assignment requires that:
Your program should not continuously send requests to wiki. Your program
must wait for at least 1 second after every 10 requests
So my question is, where exactly in the above code is the "request" being sent? And how does this connection work? Is the entire webpage being loaded in one go, or is it being downloaded line by line?
I honestly don't really understand much about networking at all so apologies if I'm misunderstanding something fundamental. Any help would be much appreciated.
InputStream is = url.openStream();
The line above is where the request is sent.
BufferedReader br = new BufferedReader(new InputStreamReader(is));
This line wraps the input stream in a reader; the response is then read from it with readLine().
Calling url.openStream() initiates a new TCP connection to the server that the URL resolves to. An HTTP GET request is then sent over the connection. If all goes right (i.e., 200 OK), the server sends back the HTTP response message that carries the data payload that is served up at the specified URL. You then need to read the bytes from the InputStream that the openStream() method returns in order to retrieve the data payload into your program.
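Since each url.openStream() call is one request, the "wait at least 1 second after every 10 requests" rule can be handled by counting those calls. Here is a minimal sketch using only the JDK; the class name PoliteFetcher and its methods are invented for illustration, and reading the page as UTF-8 is an assumption. Note that the body is not necessarily downloaded in one go: bytes are pulled from the connection as the reader asks for more.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

final class PoliteFetcher {
    private final int batchSize;
    private final long pauseMillis;
    private int requestsSent;

    PoliteFetcher(int batchSize, long pauseMillis) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
    }

    /** Counts one request; sleeps and returns true when a batch is complete. */
    boolean recordRequest() throws InterruptedException {
        requestsSent++;
        if (requestsSent % batchSize == 0) {
            Thread.sleep(pauseMillis);
            return true;
        }
        return false;
    }

    /** Fetches one page; openStream() is the point where the GET request goes out. */
    String fetch(URL url) throws IOException, InterruptedException {
        StringBuilder page = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                page.append(line).append('\n');
            }
        }
        recordRequest();
        return page.toString();
    }
}
```

For the assignment this would be created once as new PoliteFetcher(10, 1000) and reused for every page, so that the counter spans the whole crawl.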
I post some data from Java to PHP:
try {
URL obj = new URL("http://myphpurl/insert.php");
HttpURLConnection conn = (HttpURLConnection) obj.openConnection();
conn.setReadTimeout(10000);
conn.setConnectTimeout(15000);
conn.setRequestMethod(POST_METHOD);
conn.setDoInput(true);
conn.setDoOutput(true);
Map<String, String> params = new HashMap<String, String>();
params.put("title", "العربية");
OutputStream os = conn.getOutputStream();
BufferedWriter writer =
new BufferedWriter(new OutputStreamWriter(os, "UTF-8"));
writer.write(getQuery(params));
writer.flush();
writer.close();
os.close();
BufferedReader in =
new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
String inputLine;
while ((inputLine = in.readLine()) != null) {
response.append(inputLine);
}
LOG.debug("response {}", response);
in.close();
response = null;
inputLine = null;
conn.disconnect();
conn = null;
obj = null;
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
private String getQuery(Map<String, String> params) throws UnsupportedEncodingException {
StringBuilder result = new StringBuilder();
boolean first = true;
Iterator<Map.Entry<String, String>> it = params.entrySet().iterator();
while (it.hasNext()) {
if (first)
first = false;
else
result.append("&");
Map.Entry<String, String> pairs = it.next();
result.append(URLEncoder.encode(pairs.getKey(), "UTF-8"));
result.append("=");
result.append(URLEncoder.encode(pairs.getValue(), "UTF-8"));
it.remove(); // avoids a ConcurrentModificationException
}
return result.toString();
}
The insert.php file looks like this:
<?php
$posttitle = $_POST["title"];
echo "$posttitle";
echo urldecode($posttitle);
?>
The echo shows some gibberish, مليون, instead of the actual title, العربية.
This gibberish is then inserted into a MySQL database.
Additional info:
The database uses utf8_general_ci and does support Arabic (when I manually update the post using phpMyAdmin it works).
I added UTF-8 to the InputStreamReader and OutputStreamWriter, and observed the following behaviour:
Tomcat6 on windows, (PHP + mysql) on CentOS --> OK
Tomcat6 on CentOS , (PHP + mysql) on CentOS --> Not OK
Additional info 2
Posting using javascript works fine: The page responds with the right encoding.
There are a number of things that can go wrong with your code, and we can't test it. Also, I suggest using a full-featured HTTP client instead of URLConnection. Here is the list of things you should check:
Pass the right source file encoding to javac (your test string is hardcoded; do you run the same binary on both machines, or do you run the program from your IDE or otherwise recompile on the deployment machine?)
Use UTF-8 to encode the query string
If your API uses the HTTP request body, check that both ends agree on the encoding, and/or use the Content-Type MIME header
PHP has binary strings (the encoding must be given) so make sure you use the appropriate parameters when connecting to the database, and/or transcode accordingly
When sending text from the PHP server, mind the encoding of the template and of the dynamic bits!
The number of moving parts is quite big. You should not debug via print/echo because that adds another level of transcoding. If possible, dump the raw text bytes and use a hex editor.
It's funny that Windows → Linux is ok, while Linux → Linux is not. You may want to check the locale on both CentOS machines (possibly running the operating system command from inside the target process - JVM and Apache)
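To compare the two machines from inside the JVM, something like the following could be logged at startup on each Tomcat. It is only a sketch: the class name EncodingReport is made up, and which of these values matter depends on where the bytes actually get mangled.

```java
import java.nio.charset.Charset;
import java.util.Locale;

final class EncodingReport {
    /** Summarises the encoding-related settings this JVM is running with. */
    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset()
            + ", file.encoding=" + System.getProperty("file.encoding")
            + ", locale=" + Locale.getDefault()
            + ", LANG=" + System.getenv("LANG");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If the output differs between the Windows and CentOS hosts, any code path that relies on the platform default charset (including compiling the source without an explicit encoding) becomes a suspect.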
Try using CharsetEncoder to reveal possible encoding exceptions.
CharsetEncoder encoder = Charset.forName("UTF-8").newEncoder();
encoder.onMalformedInput(CodingErrorAction.REPORT);
encoder.onUnmappableCharacter(CodingErrorAction.REPORT);
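The three lines above only configure the encoder; it still has to be run over the text. A possible way to do that, and to dump the raw bytes as hex as suggested earlier, is sketched below. The class EncodingCheck and the method utf8Hex are hypothetical names. Writing the title with Unicode escapes instead of literal Arabic characters takes the javac source encoding out of the picture.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class EncodingCheck {
    /** Encodes text as UTF-8 strictly and returns the bytes as upper-case hex. */
    static String utf8Hex(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        // encode() throws instead of silently substituting '?' for bad input
        ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text));
        StringBuilder hex = new StringBuilder();
        while (bytes.hasRemaining()) {
            hex.append(String.format("%02X", bytes.get() & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        // The title from the question, written with escapes
        String title = "\u0627\u0644\u0639\u0631\u0628\u064A\u0629";
        System.out.println(utf8Hex(title));
    }
}
```

For the title in the question the correct UTF-8 bytes are D8A7D984D8B9D8B1D8A8D98AD8A9, so any other sequence seen on the wire or in PHP points at the step where the text was transcoded.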
I want to get the HTML code of the following Web Page (http://www.studenti.ict.uniba.it/esse3/ListaAppelliOfferta.do) after:
selecting "Dipartimento di Informatica" among Facoltà
selecting "Informatica" (or one of the others available)
clicking "Avvia Ricerca"
I am not very knowledgeable in the matter, but I noticed that the URL of the page stays the same after each selection.
Can anyone describe, possibly in detail, how I can do that? Unfortunately I am not an expert in web programming.
Many thanks
After some tests: the page refreshes itself with a POST request carrying these parameters:
fac_id:1012 --
cds_id:197 --
ad_id: -- Attività didattica
docente_id: -- Id of the docent selected
data:06/03/2014 -- Date
Anyway, you missed the values of Attività didattica, Docente and Data esame.
Just send an HTTP request using HttpURLConnection with these POST arguments, and read the tplmessage table from the output with an XML parser.
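As an illustration, the parameters listed above can be turned into a form-encoded request body like this. It is a sketch only: FormBody is a made-up helper, and the empty values for ad_id and docente_id are placeholders because their real values are not given here.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class FormBody {
    /** Joins the fields as name=value pairs separated by '&', URL-encoding both parts. */
    static String encode(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(URLEncoder.encode(field.getKey(), "UTF-8"))
                .append('=')
                .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return body.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // LinkedHashMap keeps the fields in insertion order
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("fac_id", "1012");
        fields.put("cds_id", "197");
        fields.put("ad_id", "");       // placeholder, fill in as needed
        fields.put("docente_id", "");  // placeholder, fill in as needed
        fields.put("data", "06/03/2014");
        System.out.println(encode(fields));
    }
}
```

Running main prints fac_id=1012&cds_id=197&ad_id=&docente_id=&data=06%2F03%2F2014; note that the slashes in the date are percent-encoded.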
Try this tutorial for HTTP request: click.
Try to read this to understand how to parse response: click
An example using the code of the tutorial:
HttpURLConnection connection = null;
try {
    URL url = new URL("http://www.studenti.ict.uniba.it/esse3/ListaAppelliOfferta.do");
    connection = (HttpURLConnection) url.openConnection(); // open the connection to the URL
    // You still need to add ad_id, docente_id and data
    byte[] params = "fac_id=1012&cds_id=197".getBytes("UTF-8");
    connection.setRequestMethod("POST"); // the form is submitted with POST
    connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
    connection.setRequestProperty("Content-Language", "it-IT"); // Italian
    // No need to set Content-Length by hand; HttpURLConnection computes it
    connection.setUseCaches(false);
    connection.setDoInput(true);
    connection.setDoOutput(true);
    // Send the request body; try-with-resources closes the stream
    try (OutputStream wr = connection.getOutputStream()) {
        wr.write(params);
    }
    // Get the response
    StringBuilder response = new StringBuilder();
    try (BufferedReader rd = new BufferedReader(
            new InputStreamReader(connection.getInputStream()))) {
        String line;
        while ((line = rd.readLine()) != null) {
            response.append(line);
            response.append('\n');
        }
    }
} catch (MalformedURLException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} finally {
    // close connection if created
    if (connection != null)
        connection.disconnect();
}
The response variable will then hold the HTML of the page.
Anyway, use Chrome's developer tools to get the request arguments.
I am trying to log in to a website and get the page source of a page after logging in, using Java's URLConnection. The problem I am facing is that I can't maintain the session, so the server gives me this warning and doesn't let me connect:
This system requires the use of HTTP cookies to verify authorization information.
Our system has detected that your browser has disabled HTTP cookies, or does not support them.
Please refer to the Help page in your browser for more information on how to correctly configure your browser for use with this system.
At first I tried sending an empty cookie to let the server know that I am handling sessions, but it doesn't give me a session id either.
This is my source code:
try {
// Construct data
String data = URLEncoder.encode("usr", "UTF-8") + "=" + URLEncoder.encode("usr", "UTF-8");
data += "&" + URLEncoder.encode("password", "UTF-8") + "=" + URLEncoder.encode("pass", "UTF-8");
// Send data
URL url = new URL("https://loginsite.com");
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
conn.setRequestProperty("Cookie", "SESSID=");
OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
wr.write(data);
wr.flush();
// Get the response
BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String line;
while ((line = rd.readLine()) != null) {
System.out.println(line);
}
wr.close();
rd.close();
String headerName=null;
for (int i=1; (headerName = conn.getHeaderFieldKey(i))!=null; i++) {
if (headerName.equals("Set-Cookie")) {
String cookie = conn.getHeaderField(i);
System.out.println(cookie.split(";", 2)[0]);
}
}
} catch (Exception e) {
}
You should use an HTTP library that handles session management and other details of the HTTP protocol for you, e.g. one that supports cookies, keep-alive, proxies, etc. out of the box. Try Apache HttpComponents.
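If adding a library is not an option, the JDK has had its own java.net.CookieManager since Java 6. Once it is installed as the default CookieHandler, HttpURLConnection stores the Set-Cookie values it receives and sends them back on later requests to the same site. This is a different technique from the HttpComponents route recommended above, shown only as a rough sketch; SessionCookies and its methods are hypothetical names.

```java
import java.io.IOException;
import java.net.CookieHandler;
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SessionCookies {
    /** Installs a JVM-wide in-memory cookie store; call once before the first request. */
    static CookieManager install() {
        // null store means the default in-memory store
        CookieManager manager = new CookieManager(null, CookiePolicy.ACCEPT_ORIGINAL_SERVER);
        CookieHandler.setDefault(manager);
        return manager;
    }

    /** Returns the Cookie header values that would be sent to uri (useful for debugging). */
    static List<String> cookiesFor(CookieManager manager, URI uri) throws IOException {
        Map<String, List<String>> headers =
            manager.get(uri, new HashMap<String, List<String>>());
        List<String> cookies = headers.get("Cookie");
        return cookies == null ? Collections.<String>emptyList() : cookies;
    }
}
```

With this in place, the login POST and the follow-up GET can both use plain url.openConnection() calls, and the manual "Cookie" request property from the question is no longer needed.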