Web Requests C#,Java - java

I am trying to shorten the time it takes HttpWebRequest or WebClient to get a string from a URL. In C# it takes about 2000 ms to get the string.
Using Java I can get the string in about 300 ms. (I am new to Java; please see the code below.)
In C# I have tried setting request.Proxy = null and System.Net.ServicePointManager.Expect100Continue = false, with no clear difference.
I don't know whether the C# and Java code below are comparable, but I would like to get the data in a shorter time using C# if possible.
Java:
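try {
URL url = new URL("SomeURL");
// try-with-resources closes the reader and the underlying stream automatically
try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()))) {
StringBuilder content = new StringBuilder();
String line;
// Read the whole response line by line
while ((line = br.readLine()) != null) {
content.append(line).append('\n');
}
}
} catch (Exception e) {
e.printStackTrace();
}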
C#:
using (WebClient nn = new WebClient()) {
nn.Proxy = null;
string SContent = await nn.DownloadStringTaskAsync(url);
return SContent;
}
or:
HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create(new Uri(url));
request.Method = "GET";
// Send the request to the server and wait for the response:
using (WebResponse response = await request.GetResponseAsync()) {
using (Stream stream = response.GetResponseStream()) {
StreamReader reader = new StreamReader(stream, Encoding.UTF8);
string SContent = reader.ReadToEnd();
return SContent;
}
}

I'm not sure whether the code below will be faster than Java's URL.openStream or URLConnection, but it sure is succinct. I wouldn't use HttpWebRequest anymore; Microsoft recommends using HttpClient.
using System.Net.Http;
using System.Threading.Tasks;
namespace CSharp.Console
{
public class Program
{
// Make HttpClient a static member so it's available for the lifetime of the application.
private static readonly HttpClient HttpClient = new HttpClient();
public static async Task Main(string[] args)
{
string body = await HttpClient.GetStringAsync("http://www.google.com");
System.Console.WriteLine(body);
System.Console.ReadLine();
}
}
}
Please note: to use async on the Main method of a console application, you need C# language version 7.1 or later (Project properties, Build, Advanced, Language version).

Related

Why am I getting 403 status code in Java after a while?

When I check the status codes of some sites, I start getting a 403 response code after a while. On the first run every site sends back data, but once my code repeats itself via the Timer, one web page returns a 403 response code. Here is my code.
public class Main {
public static void checkSites() {
Timer ifSee403 = new Timer();
try {
File links = new File("./linkler.txt");
Scanner scan = new Scanner(links);
ArrayList<String> list = new ArrayList<>();
while(scan.hasNext()) {
list.add(scan.nextLine());
}
File linkStatus = new File("LinkStatus.txt");
if(!linkStatus.exists()){
linkStatus.createNewFile();
}else{
System.out.println("File already exists");
}
BufferedWriter writer = new BufferedWriter(new FileWriter(linkStatus));
for(String link : list) {
try {
if(!link.startsWith("http")) {
link = "http://"+link;
}
URL url = new URL(link);
HttpURLConnection.setFollowRedirects(true);
HttpURLConnection http = (HttpURLConnection)url.openConnection();
http.setRequestMethod("HEAD");
http.setConnectTimeout(5000);
http.setReadTimeout(8000);
int statusCode = http.getResponseCode();
if (statusCode == 200) {
ifSee403.wait(5000);
System.out.println("Hello, here we go again");
}
http.disconnect();
System.out.println(link + " " + statusCode);
writer.write(link + " " + statusCode);
writer.newLine();
} catch (Exception e) {
writer.write(link + " " + e.getMessage());
writer.newLine();
System.out.println(link + " " +e.getMessage());
}
}
try {
writer.close();
} catch (Exception e) {
System.out.println(e.getMessage());
}
System.out.println("Finished.");
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
public static void main(String[] args) throws Exception {
Timer myTimer = new Timer();
TimerTask sendingRequest = new TimerTask() {
public void run() {
checkSites();
}
};
myTimer.schedule(sendingRequest,0,150000);
}
}
How can I solve this? Thanks
Edited comment:
I've added http.disconnect(); for closing connection after checked status codes.
Also I've added
if(statusCode == 200) {
ifSee403.wait(5000);
System.out.println("Test message");
}
But it didn't work: I got a "current thread is not owner" error. I need to fix this, then change 200 to 403 so that on a 403 the code calls ifSee403.wait(5000) and checks the status code again.
One "alternative" to IP spoofing or anonymizing, by the way, would be to try "obeying" what the security code expects you to do. If you are going to write a "scraper" and are aware there is "bot detection" that doesn't like you debugging your code while you visit the site over and over, you should try the HTML download code I posted as an answer to the last question you asked.
If you download the HTML and save it to a file (say, once an hour), and then write your HTML parsing / monitoring code against the contents of the saved file, you will (likely) be abiding by the security requirements of the web site and still be able to check availability.
If you wish to continue to use JSoup, that API has an option for receiving HTML as a String. So if you use the HTML scrape code I posted and then write that HTML String to disk, you can feed it to JSoup as often as you like without setting off the bot-detection security checks.
If you play by their rules once in a while, you can write your tester without much hassle.
import java.io.*;
import java.net.*;
...
// This line asks the "url" that you are trying to connect with for
// an instance of HttpURLConnection. These two classes (URL and HttpURLConnection)
// are in the standard JDK Package java.net.*
HttpURLConnection con = (HttpURLConnection) url.openConnection();
// Tells the connection to use "GET" ... and to "pretend" that you are
// using a "Chrome" web-browser. Note, the User-Agent sometimes means
// something to the web-server, and sometimes is fully ignored.
con.setRequestMethod("GET");
con.setRequestProperty("User-Agent", "Chrome/61.0.3163.100");
// The classes InputStream, InputStreamReader, and BufferedReader
// are all in the standard JDK package java.io.*
InputStream is = con.getInputStream();
BufferedReader br = new BufferedReader(new InputStreamReader(is));
StringBuffer sb = new StringBuffer();
String s;
// This reads each line from the web-server.
while ((s = br.readLine()) != null) sb.append(s + "\n");
// This writes the results from the web-server to a file
// It is using classes java.io.File and java.io.FileWriter
File outF = new File("SavedSite.html");
outF.createNewFile();
FileWriter fw = new FileWriter(outF);
fw.write(sb.toString());
fw.close();
Again, this code is very basic and doesn't use any special JAR library code at all. The next method uses the JSoup library (which you explicitly asked about; I don't use it myself, but it is just fine). This is the method "parse", which will parse the HTML you have just saved. You may load the HTML from disk and send it to JSoup using:
Method Documentation: org.jsoup.Jsoup.parse(File in, String charsetName, String baseUri)
If you wish to invoke JSoup, just pass it a java.io.File instance as follows:
File f = new File("SavedSite.html");
Document d = Jsoup.parse(f, "UTF-8", url.toString());
I do not think you need timers at all...
AGAIN: the purpose of this answer is to show you how to save the server's response to a file on disk, so that you don't have to make lots of calls to the server, JUST ONE! If you restrict your calls to the server to once per hour, you will likely (though it is not guaranteed) avoid the 403 Forbidden bot-detection problem.
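On the "current thread is not owner" error mentioned in the question: Object.wait() may only be called by a thread that holds the object's monitor (that is, inside a synchronized block on that object); otherwise it throws IllegalMonitorStateException at run time. For a plain pause before retrying, Thread.sleep() needs no lock. A minimal sketch of the difference; the class and method names here are made up for illustration:

```java
class PauseDemo {
    // Calls wait() without holding the monitor and reports whether that
    // fails with IllegalMonitorStateException.
    static boolean waitWithoutLockFails(Object lock) {
        try {
            lock.wait(10);          // not inside synchronized (lock) { ... }
            return false;
        } catch (IllegalMonitorStateException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Pauses the current thread with sleep(), which requires no lock.
    // Returns the measured pause in milliseconds.
    static long pauseMillis(long millis) {
        long start = System.nanoTime();
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("wait() without lock fails: " + waitWithoutLockFails(new Object()));
        System.out.println("slept for about " + pauseMillis(50) + " ms");
    }
}
```

In the checker from the question, that would probably mean replacing ifSee403.wait(5000) with Thread.sleep(5000) inside the existing try block.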

JAVA send/receive information via json

I've been researching how to send and receive information to a URL via JSON for the last 3 days. I have found a lot of documentation and code examples on how to do it; I just can't comprehend what they're saying. I've imported god knows how many .jar files into my Eclipse project. Does anyone have a good example of how to connect to a URL, send/receive information (even log in), parse it, and send more information? I understand that I'm asking for a lot. I don't need all the answers; good documentation and some good examples would make me soooo happy.
Start with http://hc.apache.org/
Then look at http://code.google.com/p/google-gson/
or: http://wiki.fasterxml.com/JacksonHome
That should be all you need.
Found a really solid example on this blog: http://www.gnovus.com/blog/programming/making-http-post-request-json-using-apaches-httpclient
Pasted below in case the link doesn't work.
public class SimpleHTTPPOSTRequester {
private String apiusername;
private String apipassword;
private String apiURL;
public SimpleHTTPPOSTRequester(String apiusername, String apipassword, String apiURL) {
this.apiURL = apiURL;
this.apiusername = apiusername;
this.apipassword = apipassword;
}
public void makeHTTPPOSTRequest() {
try {
HttpClient c = new DefaultHttpClient();
HttpPost p = new HttpPost(this.apiURL);
p.setEntity(new StringEntity("{\"username\":\"" + this.apiusername + "\",\"password\":\"" + this.apipassword + "\"}",
ContentType.create("application/json")));
HttpResponse r = c.execute(p);
BufferedReader rd = new BufferedReader(new InputStreamReader(r.getEntity().getContent()));
String line = "";
while ((line = rd.readLine()) != null) {
//Parse our JSON response
JSONParser j = new JSONParser();
JSONObject o = (JSONObject)j.parse(line);
Map response = (Map)o.get("response");
System.out.println(response.get("somevalue"));
}
}
catch(ParseException e) {
System.out.println(e);
}
catch(IOException e) {
System.out.println(e);
}
}
}
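One thing to watch in the example above is that the JSON body is built by plain string concatenation, so a quote or backslash in the username or password would produce invalid JSON. Below is a rough sketch of the same login request using the JDK's built-in java.net.http classes (Java 11+) instead of the Apache client, with a small hand-written escaper. The URL, the field names, and the class and method names are placeholders of mine, and in real code a library such as Gson or Jackson would normally do the escaping:

```java
import java.net.URI;
import java.net.http.HttpRequest;

class JsonPostSketch {
    // Escapes the characters that must not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds the login body; the field names are assumptions for illustration.
    static String loginJson(String username, String password) {
        return "{\"username\":\"" + escape(username)
                + "\",\"password\":\"" + escape(password) + "\"}";
    }

    // Builds (but does not send) a POST request carrying the JSON body.
    static HttpRequest buildRequest(String url, String json) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        String json = loginJson("alice", "p\"w");
        HttpRequest request = buildRequest("http://localhost:8080/login", json);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```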

Multithreading (Stateless Classes)

Apologies for the long code post, but I am wondering if someone can help with a multithreading question (I am quite new to multithreading). I am trying to design a facade class for a RESTful web services API that can be shared between multiple threads. I am using HttpURLConnection to make the connection and Google GSON to convert to and from JSON data.
The class below is what I have so far. In this example it has one public method to make an API call (authenticateCustomer()), and the private methods are used to facilitate the API call (i.e. to build the POST data string, make a POST request, etc.).
I make one instance of this class and share it with 1000 threads. The threads call the authenticateCustomer() method. Most of the threads work, but some threads get a null pointer exception, which I assume is because I haven't implemented any synchronization. If I make the authenticateCustomer() method 'synchronized' it works. The problem is that this results in poor concurrency (say, for example, the POST request suddenly takes a long time to complete; this will then hold up all the other threads).
Now to my question. Is the below class not stateless and therefore thread-safe? The very few fields that are in the class are declared final and assigned in the constructor. All of the methods use local variables. The Gson object is stateless (according to their web site) and created as a local variable in the API method anyway.
public final class QuizSyncAPIFacade
{
// API Connection Details
private final String m_apiDomain;
private final String m_apiContentType;
private final int m_bufferSize;
// Constructors
public QuizSyncAPIFacade()
{
m_apiDomain = "http://*****************************";
m_apiContentType = ".json";
m_bufferSize = 8192; // 8k
}
private String readInputStream(InputStream stream) throws IOException
{
// Create a buffer for the input stream
byte[] buffer = new byte[m_bufferSize];
int readCount;
StringBuilder builder = new StringBuilder();
while ((readCount = stream.read(buffer)) > -1) {
builder.append(new String(buffer, 0, readCount));
}
return builder.toString();
}
private String buildPostData(HashMap<String,String> postData) throws UnsupportedEncodingException
{
String data = "";
for (Map.Entry<String,String> entry : postData.entrySet())
{
data += (URLEncoder.encode(entry.getKey(), "UTF-8") + "=" + URLEncoder.encode(entry.getValue(), "UTF-8") + "&");
}
// Trim the last character (a trailing ampersand)
int length = data.length();
if (length > 0) {
data = data.substring(0, (length - 1));
}
return data;
}
private String buildJSONError(String message, String name, String at)
{
String error = "{\"errors\":[{\"message\":\"" + message + "\",\"name\":\"" + name + "\",\"at\":\"" + at + "\"}]}";
return error;
}
private String callPost(String url, HashMap<String,String> postData) throws IOException
{
// Set up the URL for the API call
URL apiUrl = new URL(url);
// Build the post data
String data = buildPostData(postData);
// Call the API action
HttpURLConnection conn;
try {
conn = (HttpURLConnection)apiUrl.openConnection();
} catch (IOException e) {
throw new IOException(buildJSONError("Failed to open a connection.", "CONNECTION_FAILURE", ""));
}
// Set connection parameters for posting data
conn.setRequestMethod("POST");
conn.setUseCaches(false);
conn.setDoInput(true);
conn.setDoOutput(true);
// Write post data
try {
DataOutputStream wr = new DataOutputStream(conn.getOutputStream());
wr.writeBytes(data);
wr.flush();
wr.close();
} catch (IOException e) {
throw new IOException(buildJSONError("Failed to post data in output stream (Connection OK?).", "POST_DATA_FAILURE", ""));
}
// Read the response from the server
InputStream is;
try {
is = conn.getInputStream();
} catch (IOException e) {
InputStream errStr = conn.getErrorStream();
if (errStr != null)
{
String errResponse = readInputStream(errStr);
throw new IOException(errResponse);
}
else
{
throw new IOException(buildJSONError("Failed to read error stream (Connection OK?).", "ERROR_STREAM_FAILURE", ""));
}
}
// Read and return response from the server
return readInputStream(is);
}
/* -------------------------------------
*
* Synchronous API calls
*
------------------------------------- */
public APIResponse<CustomerAuthentication> authenticateCustomer(HashMap<String,String> postData)
{
// Set the URL for this API call
String apiURL = m_apiDomain + "/customer/authenticate" + m_apiContentType;
Gson jsonConv = new Gson();
String apiResponse = "";
try
{
// Call the API action
apiResponse = callPost(apiURL, postData);
// Convert JSON response to the required object type
CustomerAuthentication customerAuth = jsonConv.fromJson(apiResponse, CustomerAuthentication.class);
// Build and return the API response object
APIResponse<CustomerAuthentication> result = new APIResponse<CustomerAuthentication>(true, customerAuth, null);
return result;
}
catch (IOException e)
{
// Build and return the API response object for a failure with error list
APIErrorList errorList = jsonConv.fromJson(e.getMessage(), APIErrorList.class);
APIResponse<CustomerAuthentication> result = new APIResponse<CustomerAuthentication>(false, null, errorList);
return result;
}
}
}
If you are getting an error, it could be because you are overloading the authentication service (something that doesn't happen if you make the calls one at a time). Perhaps it is returning an error such as 500, 503 or 504, which you may be ignoring, so you get back nothing you expect and end up returning null. See http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
I would use fewer threads; assuming you don't have 1000 CPUs, having this many threads may well be slower rather than more efficient.
I would also check that your service is returning correctly every time and investigate why you get a null value.
If your service can only handle, say, 20 requests at once, you can try using a Semaphore as a last resort. This can be used to limit the number of concurrent requests.
Any stateless class is inherently threadsafe, provided that the objects it accesses are either private to the thread, or threadsafe themselves.
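To illustrate the Semaphore suggestion, here is a minimal sketch that caps the number of simultaneous "requests" without making the whole method synchronized. The request itself is simulated with a short sleep, and the class name, method name, and numbers are illustrative assumptions, not part of the original facade:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ThrottleDemo {
    // Runs `tasks` simulated requests on `threads` threads while allowing at
    // most `permits` of them inside the guarded section at the same time.
    // Returns the highest number of simultaneous requests observed.
    static int maxConcurrent(int tasks, int threads, int permits) {
        Semaphore limit = new Semaphore(permits);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                try {
                    limit.acquire();           // blocks while `permits` requests are in flight
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                try {
                    int now = inFlight.incrementAndGet();
                    peak.accumulateAndGet(now, Math::max);
                    Thread.sleep(5);           // stand-in for the real POST request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    inFlight.decrementAndGet();
                    limit.release();           // always hand the permit back
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrent(200, 50, 20));
    }
}
```

In the facade, the acquire/release pair would presumably wrap only the callPost(...) call, so that threads still build their POST data and parse responses in parallel.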

Java HttpClient or PostMethod truncating return data to 64k

I'm using Apache Commons HttpClient to grab some data from a server. My problem is that the returned XML data is always truncated to the first 64k. I was hoping this might just be a case of setting a limit on the relevant object, but apparently not - or at least, I can't find any information about such a method. I'm assuming it's a client issue as the server belongs to another company and presumably serves the data fine to everyone else.
Any ideas?
My code btw is super simple:
protected String send(Server server, String query) throws Exception {
PostMethod post = new PostMethod(server.getUrl());
post.setParameter("XMLString", query);
try {
client.executeMethod(post);
return post.getResponseBodyAsString();
} finally {
post.releaseConnection();
}
}
fyi, same thing happens with the following code using InputStreams rather than the getResponseBodyAsString() method.
protected String send(Server server, String query) throws Exception {
PostMethod post = new PostMethod(server.getUrl());
post.setParameter("XMLString", query);
try {
client.executeMethod(post);
InputStream stream = post.getResponseBodyAsStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(stream));
String line = null;
StringBuilder sb = new StringBuilder();
while ( (line=reader.readLine()) != null ) {
sb.append(line);
}
return sb.toString();
} finally {
post.releaseConnection();
}
}
So, do you know if there is a limit set somewhere? The fact that the response String is always 64k does seem to suggest there must be. But where?!
Thanks
Alastair
ERRATUM: the 64k limit is false. It is an artefact of the Eclipse debugger, which appears to be the truncating culprit.
SOLUTION: I have now got the code working as follows:
PostMethod post = new PostMethod(server.getUrl());
post.setParameter("XMLString", query);
post.setRequestContentLength(PostMethod.CONTENT_LENGTH_AUTO);
post.setStrictMode(true);
try {
client.executeMethod(post);
return post.getResponseBodyAsString();
} finally {
post.releaseConnection();
}
I'll admit I don't know why that works, but speculate it's something to do with the content length settings.
Separately, be aware that these methods are deprecated and I'm only using them because I'm working on a legacy system. If you are working on a more adaptable system I'd suggest following @HellGhost's advice.
Save the response body as a String and return that String, because releaseConnection() is called before the result is returned.
Furthermore, you should consider whether you really want to keep using the HttpClient 3.x library, since its development has been discontinued and it has been replaced by the HttpComponents project with its HttpClient and HttpCore modules.
An example use for the new HttpClient:
HttpPost post = new HttpPost(server.getUrl());
HttpResponse response = client.execute(post);
return EntityUtils.toString(response.getEntity());

How efficient it is to return a string in Java

All sample functions I've seen so far avoid, for some reason, returning a string. I am a total rookie as far as Java goes, so I am not sure whether this is intentional. I know that in C++, for example, returning a reference to a string is way more efficient than returning a copy of that string.
How does this work in Java?
I am particularly interested in Java for Android, where resources are more limited than in a desktop/server environment.
To help this question be more focused, I am providing a code snippet in which I am interested in returning (to the caller) the string page:
public class TestHttpGet {
private static final String TAG = "TestHttpGet";
public void executeHttpGet() throws Exception {
BufferedReader in = null;
try {
HttpClient client = new DefaultHttpClient();
HttpGet request = new HttpGet();
request.setURI(new URI("http://www.google.com/"));
HttpResponse response = client.execute(request); // actual HTTP request
// read entire response into a string object
in = new BufferedReader(new InputStreamReader(response.getEntity().getContent()));
StringBuffer sb = new StringBuffer("");
String line = "";
String NL = System.getProperty("line.separator");
while ((line = in.readLine()) != null) {
sb.append(line + NL);
}
in.close();
String page = sb.toString();
Log.v(TAG, page); // instead of System.out.println(page);
}
// a 'finally' clause will always be executed, no matter how the program leaves the try clause
// (whether by falling through the bottom, executing a return, break, or continue, or throwing an exception).
finally {
if (in != null) {
try {
in.close(); // BufferedReader must be closed, also closes underlying HTTP connection
}
catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
In the example above, can I just define:
public String executeHttpGet() throws Exception {
instead of:
public void executeHttpGet() throws Exception {
and return:
return (page); // Log.v(TAG, page);
A String in Java corresponds more or less to std::string const * in C++. So it's cheap to pass around, and it can't be modified after it's created (String is immutable).
String is a reference type - so when you return a string, you're really just returning a reference. It's dirt cheap. It's not copying the contents of the string.
In java most of the time you return something, you return it by reference. There's no object copying or cloning of any kind. So it is fast.
Also, Strings in Java are immutable. No need to worry about that either.
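A small sketch that makes the "you only return a reference" point visible: two calls to the same method hand back the very same String object, however large it is. The class and method names are invented for this illustration:

```java
class StringReturnDemo {
    // Built once; a stand-in for a downloaded page.
    private static final String PAGE = buildPage();

    private static String buildPage() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100_000; i++) {
            sb.append("line ").append(i).append('\n');
        }
        return sb.toString();
    }

    // Returning a String hands back a reference; the characters are not copied.
    static String getPage() {
        return PAGE;
    }

    public static void main(String[] args) {
        String a = getPage();
        String b = getPage();
        // == compares references: both variables point at the same object.
        System.out.println("same object: " + (a == b));
        System.out.println("length: " + a.length());
    }
}
```

So changing executeHttpGet() to return String and ending it with return page; should cost no more than returning any other object reference.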
