I've created a small scraping class, and the method below reads the text from a page.
However, I've found that the method fails to close the connection properly. This results in a large number of open connections, which caused my hosting company to suspend my account. Is the code below correct?
private String getPageText(String urlString) {
    String pageText = "";
    BufferedReader reader = null;
    try {
        URL url = new URL(urlString);
        reader = new BufferedReader(new InputStreamReader(url.openStream()));
        StringBuilder builder = new StringBuilder();
        int read;
        char[] chars = new char[1024];
        while ((read = reader.read(chars)) != -1)
            builder.append(chars, 0, read);
        pageText = builder.toString();
    } catch (MalformedURLException e) {
        Log.e(CLASS_NAME, "getPageText.MalformedUrlException", e);
    } catch (IOException e) {
        Log.e(CLASS_NAME, "getPageText.IOException", e);
    } finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                Log.e(CLASS_NAME, "getPageText.IOException", e);
            }
        }
    }
    return pageText;
}
Your code is fine in the success case but can leak connections in the failure cases (when the HTTP server returns a 4xx or 5xx status code). In those cases HttpURLConnection provides the response body via .getErrorStream() rather than .getInputStream(), and you should make sure to drain and close that stream as well.
URLConnection conn = null;
BufferedReader reader = null;
try {
    conn = url.openConnection();
    reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    // ...
} finally {
    if (reader != null) {
        // ...
    }
    if (conn instanceof HttpURLConnection) {
        InputStream err = ((HttpURLConnection) conn).getErrorStream();
        if (err != null) {
            byte[] buf = new byte[2048];
            while (err.read(buf) >= 0) {}
            err.close();
        }
    }
}
There probably needs to be another layer of try/catch inside that finally, but you get the idea. You should not explicitly .disconnect() the connection unless you're sure there won't be any more requests to URLs on that host in the near future: disconnect() prevents subsequent requests from reusing the existing keep-alive connection, which for HTTPS in particular will slow things down considerably.
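For completeness, here is roughly what the method from the question might look like with the error-stream cleanup folded in. This is a sketch only, reusing the poster's Log.e calls and CLASS_NAME constant, with the extra try/catch layer mentioned above:
private String getPageText(String urlString) {
    String pageText = "";
    URLConnection conn = null;
    BufferedReader reader = null;
    try {
        conn = new URL(urlString).openConnection();
        reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        StringBuilder builder = new StringBuilder();
        int read;
        char[] chars = new char[1024];
        while ((read = reader.read(chars)) != -1)
            builder.append(chars, 0, read);
        pageText = builder.toString();
    } catch (IOException e) {
        Log.e(CLASS_NAME, "getPageText.IOException", e);
    } finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                Log.e(CLASS_NAME, "getPageText.IOException", e);
            }
        }
        if (conn instanceof HttpURLConnection) {
            // On a 4xx/5xx response the body is on the error stream; drain and
            // close it so the socket can be returned to the keep-alive pool.
            try {
                InputStream err = ((HttpURLConnection) conn).getErrorStream();
                if (err != null) {
                    byte[] buf = new byte[2048];
                    while (err.read(buf) >= 0) {}
                    err.close();
                }
            } catch (IOException e) {
                Log.e(CLASS_NAME, "getPageText.IOException", e);
            }
        }
    }
    return pageText;
}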
You are just closing the stream, not the connection. Use the following structure:
URL u = new URL(url);
HttpURLConnection conn = (HttpURLConnection) u.openConnection();
conn.connect();
reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
and then:
} finally {
    if (reader != null) {
        try {
            reader.close();
        } catch (IOException e) {
            Log.e(CLASS_NAME, "getPageText.IOException", e);
        }
    }
    try {
        if (conn != null) {
            conn.disconnect();
        }
    } catch (Exception ex) {
        // disconnect() should not throw, but don't let cleanup mask the result
    }
}
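Assembled into one piece, those fragments come out to something like the sketch below (again reusing the question's Log.e and CLASS_NAME; not the only valid arrangement):
private String getPageText(String urlString) {
    String pageText = "";
    HttpURLConnection conn = null;
    BufferedReader reader = null;
    try {
        URL u = new URL(urlString);
        conn = (HttpURLConnection) u.openConnection();
        conn.connect();
        reader = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        StringBuilder builder = new StringBuilder();
        int read;
        char[] chars = new char[1024];
        while ((read = reader.read(chars)) != -1)
            builder.append(chars, 0, read);
        pageText = builder.toString();
    } catch (IOException e) {
        Log.e(CLASS_NAME, "getPageText.IOException", e);
    } finally {
        if (reader != null) {
            try {
                reader.close();
            } catch (IOException e) {
                Log.e(CLASS_NAME, "getPageText.IOException", e);
            }
        }
        if (conn != null) {
            conn.disconnect(); // forcibly releases the underlying socket
        }
    }
    return pageText;
}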
I'm getting this error: java.io.IOException: Server returned HTTP response code: 502 for URL:
It ends my program when the website returns a bad gateway, and it's inconsistent about whether it works or not. Is there any way I can force it to keep retrying the website until it gets a response?
This is my current code, if it matters:
URL webpage = null;
URLConnection conn = null;
try {
    webpage = new URL(websiteurl);
    conn = webpage.openConnection();
    InputStreamReader reader = new InputStreamReader(conn.getInputStream(), "UTF8");
    BufferedReader buffer = new BufferedReader(reader);
    String line = "";
    while (true) {
        line = buffer.readLine();
        if (line != null) {
            System.out.println(line);
        } else {
            break;
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
Never mind, I solved it by calling my method again from the catch block and adding a pause between each call. This is what it is now:
URL webpage = null;
URLConnection conn = null;
try {
    webpage = new URL(website);
    conn = webpage.openConnection();
    InputStreamReader reader = new InputStreamReader(conn.getInputStream(), "UTF8");
    BufferedReader buffer = new BufferedReader(reader);
    String line = "";
    while (true) {
        line = buffer.readLine();
        if (line != null) {
            System.out.println(line);
        } else {
            break;
        }
    }
} catch (Exception e) {
    e.printStackTrace();
    try {
        Thread.sleep(5000);
    } catch (InterruptedException ex) {
        Thread.currentThread().interrupt();
    }
    findCreationDate(name);
}
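A caution on that approach: retrying by calling the method again from inside its own catch block is recursion, so a long outage keeps growing the call stack. A loop with a capped number of attempts avoids that. The sketch below uses a hypothetical readWebsite method and MAX_ATTEMPTS constant standing in for the poster's findCreationDate:
private static final int MAX_ATTEMPTS = 5; // hypothetical retry cap

void readWebsite(String website) {
    for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
        try {
            URLConnection conn = new URL(website).openConnection();
            // try-with-resources closes the reader even if readLine() throws
            try (BufferedReader buffer = new BufferedReader(
                    new InputStreamReader(conn.getInputStream(), "UTF8"))) {
                String line;
                while ((line = buffer.readLine()) != null) {
                    System.out.println(line);
                }
            }
            return; // success, stop retrying
        } catch (IOException e) {
            e.printStackTrace();
            try {
                Thread.sleep(5000); // pause before the next attempt
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}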
I have a Java application that runs constantly. This application makes HTTP requests to a cloud server. The problem is that with each request the memory consumption increases, until it reaches the point that the machine completely freezes. I isolated parts of the code and I'm sure the problem is in this code block making the HTTP requests. Analyzing the JVM numbers via Prometheus/Grafana, I see that the use of non-heap memory (code cache and metaspace) is constantly increasing, as shown here.
In the image above, wherever there is a drop in the line, that is where memory consumption reached 98% and Monit killed the app.
The method that is causing this memory consumption is below (it is executed approx. 300 times until it exhausts the little more than 1.5 GB of memory available after initialization).
public AbstractRestReponse send(RestRequest request) {
    BufferedReader in = null;
    OutputStream fout = null;
    URLConnection conn = null;
    InputStreamReader inputStreamReader = null;
    String result = "";
    try {
        MultipartEntityBuilder mb = MultipartEntityBuilder.create(); // org.apache.http.entity.mime
        for (String key : request.getParams().keySet()) {
            String value = (String) request.getParams().get(key);
            // System.out.println(key + " = " + value);
            mb.addTextBody(key, value);
        }
        if (request.getFile() != null) {
            mb.addBinaryBody("file", request.getFile());
        }
        org.apache.http.HttpEntity e = mb.build();
        conn = new URL(request.getUrl()).openConnection();
        conn.setDoOutput(true);
        conn.addRequestProperty(e.getContentType().getName(), e.getContentType().getValue()); // header "Content-Type"...
        conn.addRequestProperty("Content-Length", String.valueOf(e.getContentLength()));
        fout = conn.getOutputStream();
        e.writeTo(fout); // write multipart data...
        inputStreamReader = new InputStreamReader(conn.getInputStream());
        in = new BufferedReader(inputStreamReader);
        String line;
        while ((line = in.readLine()) != null) {
            result += line;
        }
        String text = result.toString();
        return objectMapper.readValue(text, FacialApiResult.class);
    } catch (Exception e) {
        e.printStackTrace();
        return null;
    } finally {
        try {
            inputStreamReader.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            conn.getInputStream().close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            fout.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        try {
            in.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
((HttpURLConnection) conn).disconnect() comes to mind. Also, String concatenation in a loop is time- and memory-intensive, and there was a minor bug: the newlines were being dropped.
NullPointerExceptions may arise in the finally block when an earlier exception prevented a stream from ever being opened. But you should have checked for that.
public AbstractRestReponse send(RestRequest request) {
    URLConnection conn = null;
    try {
        MultipartEntityBuilder mb = MultipartEntityBuilder.create(); // org.apache.http.entity.mime
        for (String key : request.getParams().keySet()) {
            String value = (String) request.getParams().get(key);
            mb.addTextBody(key, value);
        }
        if (request.getFile() != null) {
            mb.addBinaryBody("file", request.getFile());
        }
        org.apache.http.HttpEntity e = mb.build();
        conn = new URL(request.getUrl()).openConnection();
        conn.setDoOutput(true);
        conn.addRequestProperty(e.getContentType().getName(), e.getContentType().getValue()); // header "Content-Type"...
        conn.addRequestProperty("Content-Length", String.valueOf(e.getContentLength()));
        try (OutputStream fout = conn.getOutputStream()) {
            e.writeTo(fout); // write multipart data...
        }
        StringBuilder result = new StringBuilder(2048);
        try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(),
                StandardCharsets.UTF_8))) { // Charset
            String line;
            while ((line = in.readLine()) != null) {
                result.append(line).append('\n'); // Newline
            }
        }
        String text = result.toString();
        return objectMapper.readValue(text, FacialApiResult.class);
    } catch (Exception e) {
        e.printStackTrace(); // Better: a logger
        return null;
    } finally {
        if (conn instanceof HttpURLConnection) {
            ((HttpURLConnection) conn).disconnect(); // throws no checked exception
        }
    }
}
I explicitly defined the charset (UTF-8 might be wrong; previously it was the platform default).
Used a StringBuilder, and added the missing newline, which might have led to wrong parsing.
Try-with-resources for auto-closing, and a bit earlier. Hopefully this does not break anything.
Disconnecting the connection when it is an HttpURLConnection. Mind the instanceof, which might play a role in unit tests with mocking.
You seem to have handled all the necessary closing in the finally block. Anyway, it's better to use try-with-resources to safely close all Closeable objects if your application is running on Java 7+. That may isolate the problem further even if it doesn't fix it.
public static boolean sendRequest(String request) {
    InputStream inputStream = null;
    try {
        URL url = new URL(request);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setReadTimeout(TIMEOUT);
        connection.setConnectTimeout(TIMEOUT);
        connection.setRequestMethod("POST");
        connection.connect();
        inputStream = connection.getInputStream();
        while (inputStream.read() != -1);
        return true;
    } catch (IOException error) {
        return false;
    } finally {
        try {
            if (inputStream != null) {
                inputStream.close();
            }
        } catch (IOException secondError) {
            Log.w(RequestManager.class.getSimpleName(), secondError);
        }
    }
}
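For illustration, the same method rewritten with try-with-resources might look like this (a sketch; TIMEOUT and the surrounding class are assumed to be defined as in the snippet above):
public static boolean sendRequest(String request) {
    try {
        URL url = new URL(request);
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        connection.setReadTimeout(TIMEOUT);
        connection.setConnectTimeout(TIMEOUT);
        connection.setRequestMethod("POST");
        // The stream is closed automatically when the try block exits,
        // even if read() throws.
        try (InputStream inputStream = connection.getInputStream()) {
            while (inputStream.read() != -1); // drain the response body
        }
        return true;
    } catch (IOException error) {
        return false;
    }
}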
How do I read data from inputreader.read()? I want to read the data that is sent back from a server.
int iReadSize;
byte[] buffer = new byte[4096];
// "connection" is the HttpURLConnection you already opened
try (InputStream inputStream = connection.getInputStream()) {
    while ((iReadSize = inputStream.read(buffer)) != -1) {
        System.out.println(new String(buffer, 0, iReadSize));
    }
} catch (IOException error) {
    error.printStackTrace(); // don't swallow the exception silently
}
This might be useful.
import java.net.*;
import java.io.*;

public class URLConnectionReader {
    public static void main(String[] args) throws Exception {
        URL oracle = new URL("http://www.oracle.com/");
        URLConnection yc = oracle.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(
                yc.getInputStream()));
        String inputLine;
        while ((inputLine = in.readLine()) != null)
            System.out.println(inputLine);
        in.close();
    }
}
For reference, see https://docs.oracle.com/javase/tutorial/networking/urls/readingWriting.html.
I am trying to send data to the server when the user's mobile internet becomes active; in my application, whenever the internet connection is active, my broadcast receiver calls a service method.
Below is the method. I have tried both the POST and GET methods, but it does not make the request to the site. I am not able to print "i am in service5" (printed below), and it does not update my database.
public class LocalService extends Service {
    ....
    ....
    public void sendfeedback() {
        System.out.println("i am in service4");
        String filename = MainScreenActivity.UserName;
        HttpURLConnection urlConnection = null;
        final String target_uri = "http://readoline.com/feedback.php";
        try {
            BufferedReader mReader = new BufferedReader(new InputStreamReader(getApplicationContext().openFileInput(filename)));
            String line;
            StringBuffer buffer = new StringBuffer();
            while ((line = mReader.readLine()) != null) {
                buffer.append(line + "\n");
            }
            System.out.println(buffer.toString());
            try {
                System.out.println("i am in service41" + buffer.toString());
                Uri buildUri = Uri.parse("http://readoline.com/feedback.php" + "?").buildUpon()
                        .appendQueryParameter("user_id", filename)
                        .appendQueryParameter("feedback", buffer.toString())
                        .build();
                URL url = new URL(buildUri.toString());
                urlConnection = (HttpURLConnection) url.openConnection();
                urlConnection.setRequestMethod("GET");
                urlConnection.connect();
                System.out.println("i am in service44");
                /* PrintWriter out = new PrintWriter(urlConnection.getOutputStream());
                out.write("user_id=" + filename);
                out.write("&");
                out.write("feedback=" + buffer.toString());
                */
                System.out.println("i am in service5");
                // Log.e(LOG_TAG, dataToSend.toString());
                // out.close();
                // Read the input stream into a String
            } catch (IOException e) {
                System.out.println("i am in service6");
                e.printStackTrace();
            } finally {
                if (urlConnection != null) {
                    urlConnection.disconnect();
                }
            }
        } catch (Exception e) {
        }
    }
}
I can pull the user's statuses with no problem with cURL, but when I connect with Java, the XML comes out truncated and my parser wants to cry. I'm testing with small accounts, so it's not choking on the volume of data or anything.
public void getRuserHx() {
    System.out.println("Getting user status history...");
    String https_url = "https://twitter.com/statuses/user_timeline/" + idS.rootUser + ".xml?count=100&page=[1-32]";
    URL url;
    try {
        url = new URL(https_url);
        HttpsURLConnection con = (HttpsURLConnection) url.openConnection();
        con.setRequestMethod("GET");
        con.setReadTimeout(15 * 1000);
        // dump all the content into an xml file
        print_content(con);
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    System.out.println("Finished downloading user status history.");
}
private void print_content(HttpsURLConnection con) {
    if (con != null) {
        try {
            BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream()));
            File userHx = new File("/" + idS.rootUser + "Hx.xml");
            PrintWriter out = new PrintWriter(idS.hoopoeData + userHx);
            String input;
            while ((input = br.readLine()) != null) {
                out.println(input);
            }
            br.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
This request doesn't need auth. Sorry about my ugly code; my professor says input doesn't matter, so my I/O is a train wreck.
You have to flush the output stream when you write the content out. Did you flush or close the output stream?
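Concretely: the PrintWriter in print_content is buffered and is never flushed or closed, so its final buffer is discarded when the method returns, which truncates the file. A minimal sketch of the fix, keeping the poster's idS fields and using try-with-resources (Java 7+) so both streams are closed even on error:
private void print_content(HttpsURLConnection con) {
    if (con != null) {
        File userHx = new File("/" + idS.rootUser + "Hx.xml");
        try (BufferedReader br = new BufferedReader(new InputStreamReader(con.getInputStream()));
             PrintWriter out = new PrintWriter(idS.hoopoeData + userHx)) {
            String input;
            while ((input = br.readLine()) != null) {
                out.println(input);
            }
            // closing the PrintWriter flushes the remaining buffer to disk
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}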