How to click on a button through Java?

I want to access forms on HTML pages through the Java programming language, without involving a real browser in between.
At present I am doing it through HtmlUnit, but it takes a bit more time to load a page. When it comes to accessing millions of pages, that extra bit of time matters most.
Is there any other method for doing this?

I've used something similar called HttpUnit before, but I have no idea how it compares performance-wise.
If you have millions of pages to process, I would recommend throwing some more threads at it. Just a guess, but I think that if you scale this up to multiple threads, you'll run out of bandwidth before you run out of CPU power (in which case it won't matter how much faster it could be).
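For instance, here is a rough sketch of fanning the fetches out over a fixed thread pool; the URL list and the "process the page" step are placeholders for your own logic:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PageFetcher {
    public static void main(String[] args) throws InterruptedException {
        // Placeholder URLs; in practice this list would come from your own crawl frontier
        List<String> urls = Arrays.asList("http://example.com/a", "http://example.com/b");
        ExecutorService pool = Executors.newFixedThreadPool(20); // tune to your bandwidth and CPU
        for (final String u : urls) {
            pool.submit(new Runnable() {
                public void run() {
                    try {
                        BufferedReader in = new BufferedReader(
                                new InputStreamReader(new URL(u).openStream()));
                        StringBuilder page = new StringBuilder();
                        String line;
                        while ((line = in.readLine()) != null) {
                            page.append(line);
                        }
                        in.close();
                        // Parse the page or submit the form here instead of printing its size
                        System.out.println(u + ": " + page.length() + " chars");
                    } catch (IOException e) {
                        System.err.println(u + ": " + e.getMessage());
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}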

Accessing a web page using a browser, even HtmlUnit, is going to be slow. A better method is to test the layer just below the web interface, so that you don't need to access millions of pages -- instead you test enough to make sure that the web interface is using the lower layer correctly.
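As a rough illustration of what "testing the layer below" might look like (the AccountService class and its register method are hypothetical stand-ins for whatever sits beneath your web forms), a plain JUnit test exercises that layer without any browser at all:

import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class AccountServiceTest {

    @Test
    public void registrationAcceptsValidInput() {
        // Hypothetical service class representing the layer the web form calls into
        AccountService service = new AccountService();
        boolean created = service.register("user@example.com", "secret");
        assertTrue(created);
    }
}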

Most of the interaction in a browser comes down to an HTTP GET or an HTTP POST.
You need to figure out exactly which operation you need; then you can construct the URL and/or form data. Then you can use something like this:
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;

try {
    // Construct the form data
    String data = URLEncoder.encode("key1", "UTF-8") + "=" + URLEncoder.encode("value1", "UTF-8");
    data += "&" + URLEncoder.encode("key2", "UTF-8") + "=" + URLEncoder.encode("value2", "UTF-8");

    // Send the data
    URL url = new URL("http://hostname:80/cgi");
    URLConnection conn = url.openConnection();
    conn.setDoOutput(true);
    OutputStreamWriter wr = new OutputStreamWriter(conn.getOutputStream());
    wr.write(data);
    wr.flush();

    // Get the response
    BufferedReader rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
    String line;
    while ((line = rd.readLine()) != null) {
        // Process each line...
    }
    wr.close();
    rd.close();
} catch (Exception e) {
    e.printStackTrace();
}

Related

Best way to get content from a webserver when it's busy?

I have the code below to connect to a webserver and get content via an HTTP request, and everything is OK. But sometimes the website is really busy, taking too many requests at the same time from different users. Since I do not know exactly how HTTP servers work, is there any trick I can use to get the content more quickly?
I use the code below from multiple threads, every 5 milliseconds, so that I get the data instantly when the website is updated.
Can I keep the connection open? Does that make sense, or is there anything else that would let me get the data earlier when the server is really busy?
URLConnection con = url.openConnection();
BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
StringBuilder result = new StringBuilder();
String inputLine;
while ((inputLine = in.readLine()) != null) {
    result.append(inputLine);
}
in.close();
content = result.toString();
Thanks.

Java - Retrieving a web page with authorization

I'm trying to retrieve a GitHub web page using Java code. For this I used the following code.
String startingUrl = "https://github.com/xxxxxx";
URL url = new URL(startingUrl);
HttpURLConnection uc = (HttpURLConnection) url.openConnection();
uc.connect();
String line = null;
StringBuffer tmp = new StringBuffer();
try {
    BufferedReader in = new BufferedReader(new InputStreamReader(uc.getInputStream(), "UTF-8"));
    while ((line = in.readLine()) != null) {
        tmp.append(line);
    }
} catch (FileNotFoundException e) {
}
However, the page I receive here is different from what I see in the browser after logging in to GitHub. I tried sending an Authorization header as follows, but it didn't work either.
uc.setRequestProperty("Authorization", "Basic encodexxx");
How can I retrieve the same page that I see when I am logged in?
I can't tell you more on this, because I don't know what you are getting back, but the most common issue for web crawlers is the fact that website owners mostly don't like them. Thus, you should behave like a regular user, i.e. like your browser. Open your browser's inspector (press F12) when you visit the website, see what your browser sends in the request, and then try to mimic it: for example, add Host, Referer, etc. to your headers. You will need to experiment with this.
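As a rough sketch (the header values below are placeholders; copy the real ones, including the session cookie for a logged-in view, from your browser's inspector):

URL url = new URL("https://github.com/xxxxxx");
HttpURLConnection uc = (HttpURLConnection) url.openConnection();
// Mimic what the browser sends; these are placeholder values, take the real ones from the Network tab
uc.setRequestProperty("User-Agent", "Mozilla/5.0");
uc.setRequestProperty("Accept", "text/html");
uc.setRequestProperty("Referer", "https://github.com/");
// For a logged-in view you generally need the session cookie your browser holds
uc.setRequestProperty("Cookie", "session=PLACEHOLDER"); // cookie name and value depend on the site

BufferedReader in = new BufferedReader(new InputStreamReader(uc.getInputStream(), "UTF-8"));
StringBuffer tmp = new StringBuffer();
String line;
while ((line = in.readLine()) != null) {
    tmp.append(line);
}
in.close();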
Also good to know: some website owners use advanced techniques to block you from accessing their site, some won't stop you crawling their website, and some will let you do whatever you want. The fairest option is to check www.somedomain.com/robots.txt, which lists the endpoints that are allowed to be crawled and those that are not.

Reading bytes from a web site

I am trying to create a proxy server.
I want to read websites byte by byte so that I can display images and all the other content. I tried readLine, but I can't display images. Do you have any suggestions for how I can change my code and send all the data to the browser with a DataOutputStream object?
try {
    Socket s = new Socket(InetAddress.getByName(req.hostname), 80);
    String file = parcala(req.url);
    DataOutputStream out = new DataOutputStream(clientSocket.getOutputStream());
    BufferedReader dis = new BufferedReader(new InputStreamReader(s.getInputStream()));
    PrintWriter socketOut = new PrintWriter(s.getOutputStream());
    socketOut.print("GET " + req.url + "\n\n");
    //socketOut.print("Host: " + req.hostname);
    socketOut.flush();
    String line;
    while ((line = dis.readLine()) != null) {
        System.out.println(line);
    }
} catch (Exception e) {
}
Edit:
This is what I am supposed to do. I can block banned web sites, but I can't let allowed web sites through in my program.
In the filter program, you will open a TCP socket at the specified port and wait for connections. If a request comes (i.e. the client types a URL to access a web site), the application will process it to decide whether access is allowed or not and then, using the same socket, it will send the reply back to the client. After the client opened her connection to WebPolice (and her request has been checked and is allowed), the real web page needs to be shown to the client. Therefore, since the user already gave her request, now it is WebPolice's turn to forward the request so that the user can get the web page. Thus, WebPolice acts as a client and requests the web page. This means you need to open a connection to the web server (without closing the connection to the user), forward the request over this connection, get the reply and forward it back to the client. You will use threads to handle multiple connections (at the same time and/or at different times).
I don't know exactly what you're trying to do, but crafting an HTTP request and reading its response involves somewhat more than you have done here. readLine won't work on binary data anyway.
You can take a look at the URLConnection class (stolen here):
URL oracle = new URL("http://www.oracle.com/");
URLConnection yc = oracle.openConnection();
BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream()));
Then you can read textual or binary data from the in object.
readLine will treat what it reads as a String, so unless you want to mess around with converting back to bytes, I wouldn't recommend it.
I would just read bytes until you can't read any more, then write them out; this should allow you to grab the images and keep file headers intact, which can be important when dealing with files other than text.
Hope this helps.
Instead of using BufferedReader you can try to use InputStream.
It has several methods for reading bytes.
http://docs.oracle.com/javase/6/docs/api/java/io/InputStream.html
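For example, a rough sketch of relaying the raw response bytes to the browser, reusing the s socket and the out DataOutputStream from the question (the buffer size is arbitrary):

InputStream fromServer = s.getInputStream();   // raw bytes from the web server
byte[] buffer = new byte[8192];                // arbitrary buffer size
int n;
while ((n = fromServer.read(buffer)) != -1) {
    out.write(buffer, 0, n);                   // forward bytes unchanged, images included
}
out.flush();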

Sending SMS via Java

I am going to send SMS via Java. The problem is that the SMS gateway asks me to send it in this format:
http://push1.maccesssmspush.com/servlet/com.aclwireless.pushconnectivity.listeners.TextListener?userId=xxxxx&pass=xxxx&appid=xxxx&subappid=xxxx&msgtype=1&contenttype=1&selfid=true&to=9810790590,9810549717&from=ACL&dlrreq=true&text=This+is+a+test+msg+from+ACL&alert=
The problem is how to call this from a Java application. Is it possible, or does it need special libraries? Will using HttpURLConnection do the job? Thank you.
Sample code I have written is below. Is this correct?
URL sendSms1 = new URL("http://push1.maccesssmspush.com/servlet/com.aclwireless.pushconnectivity.listeners.TextListener?userId=xxxxx&pass=xxxx&appid=xxxx&subappid=xxxx&msgtype=1&contenttype=1&selfid=true&to=9810790590,9810549717&from=ACL&dlrreq=true&text=This+is+a+test+msg+from+ACL&alert=");
URLConnection smsConn1 = sendSms1.openConnection();
It's just an HTTP call; you don't need anything special in Java (or any modern language, I expect). Just build up the string as appropriate*, then make an HTTP request to that URL.
Take a peek at the Sun tutorial Reading from and Writing to a URLConnection if you need to pick up the basics of how to do the request part in Java. This uses the built-in classes; I'm sure there are dozens of libraries that handle connections in funky and/or convenient ways too, so by all means use one of those if you're familiar with it.
*One potential gotcha which might not have occurred to you: your query string arguments will have to be URL-encoded. The + characters in the text parameter, for example, are encoded spaces (a literal space would have a different meaning in the URL). Likewise, if you wanted to send a ? character in one of your parameters, it would have to appear as %3F. Have a look at the accepted answer to HTTP URL Address Encoding in Java for an example of how you might build the URL string safely.
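For instance, a rough sketch of building the query string safely with URLEncoder; buildSmsUrl is just a hypothetical helper, and the parameter values are the placeholders from the question:

import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

static String buildSmsUrl(String userId, String pass, String to, String text)
        throws UnsupportedEncodingException {
    String query = "userId=" + URLEncoder.encode(userId, "UTF-8")
                 + "&pass=" + URLEncoder.encode(pass, "UTF-8")
                 + "&to=" + URLEncoder.encode(to, "UTF-8")
                 + "&text=" + URLEncoder.encode(text, "UTF-8");
    // URLEncoder turns spaces into '+' and '?' into "%3F", so the values arrive intact
    return "http://push1.maccesssmspush.com/servlet/com.aclwireless.pushconnectivity.listeners.TextListener?" + query;
}

// Usage: buildSmsUrl("xxxxx", "xxxx", "9810790590,9810549717", "This is a test msg from ACL")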
It looks like a simple GET request; you can use the Apache HttpClient library for executing such a request. Have a look at the tutorial by Vogella here: http://www.vogella.de/articles/ApacheHttpClient/article.html for sample source code and explanations.
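A minimal sketch with the older HttpClient 4.x API (the gateway URL is shortened here; substitute the full one with your parameters):

import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;

HttpClient client = new DefaultHttpClient();
HttpGet get = new HttpGet("http://push1.maccesssmspush.com/servlet/..."); // full gateway URL with your parameters
HttpResponse response = client.execute(get);
String body = EntityUtils.toString(response.getEntity()); // the gateway's reply
System.out.println(body);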
You can try to use the java.net.URL class, like this:
// First you need to build the urlString, e.g. "http://push1.maccesssmspush.com/servlet/com.aclwireless.pushconnectivity.listeners.TextListener?userId=xxxxx&pass=xxxx&appid=xxxx&subappid=xxxx&msgtype=1&contenttype=1&selfid=true&to=9810790590,9810549717&from=ACL&dlrreq=true&text=This+is+a+test+msg+from+ACL&alert="
URL url = new URL(urlString);
// Send the SMS
URLConnection urlConnection = url.openConnection(); // open the URL
// You can also read the feedback if you want
BufferedReader br = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
URL url = new URL("http://smscountry.com/SMSCwebservice.asp");
HttpURLConnection urlconnection = (HttpURLConnection) url.openConnection();
[Edit]
urlconnection.setRequestMethod("POST");
urlconnection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
urlconnection.setDoOutput(true);
OutputStreamWriter out = new OutputStreamWriter(urlconnection.getOutputStream());
out.write(postData); // postData holds the URL-encoded parameter string
out.close();
BufferedReader in = new BufferedReader(new InputStreamReader(urlconnection.getInputStream()));
String decodedString;
String retval = "";
while ((decodedString = in.readLine()) != null) {
    retval += decodedString;
}

Calling a web service from Java by posting XML

I hope someone can help me. I'm a bit of a noob at Java, but I have a question regarding calling a web service from Java. The question is actually simple: one way works, the other does not.
If I call the web service from Java like this, it works:
try {
    String parameters = "<soap:Envelope xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<soap:Body>" +
            " <HelloWorld xmlns=\"http://np-challenger\" />" +
            "</soap:Body>" +
            "</soap:Envelope>";
    //out.println(parameters);
    java.net.URL url = new java.net.URL("http://localhost:50217/WebSite3/Service.asmx");
    java.net.HttpURLConnection connjava = (java.net.HttpURLConnection) url.openConnection();
    connjava.setRequestMethod("GET");
    connjava.setRequestProperty("Content-Length", Integer.toString(parameters.getBytes().length));
    connjava.setRequestProperty("Content-Language", "en-US");
    connjava.setRequestProperty("Content-Type", "text/xml; charset=utf-8");
    connjava.setRequestProperty("SOAPAction", "http://np-challenger/HelloWorld");
    connjava.setDoInput(true);
    connjava.setDoOutput(true);
    connjava.setUseCaches(false);
    connjava.setAllowUserInteraction(true);

    java.io.DataOutputStream printout = new java.io.DataOutputStream(connjava.getOutputStream());
    printout.writeBytes(parameters);
    printout.flush();
    printout.close();

    java.io.BufferedReader in = new java.io.BufferedReader(new java.io.InputStreamReader(connjava.getInputStream()));
    String line;
    while ((line = in.readLine()) != null) {
        System.out.println(line);
        /*pagecontent += stuff;*/
    }
    in.close();
} catch (Exception e) {
    System.out.println("Error: " + e);
}
However, if I try to do it like this, I keep getting a bad request. I'm just about ready to pull my hair out.
try {
    String xmlData = "<soap:Envelope xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlns:xsd=\"http://www.w3.org/2001/XMLSchema\" xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">" +
            "<soap:Body>" +
            " <HelloWorld xmlns=\"http://np-challenger\" />" +
            "</soap:Body>" +
            "</soap:Envelope>";

    // Create the socket
    String hostname = "localhost";
    int port = 50217;
    InetAddress addr = InetAddress.getByName(hostname);
    Socket sock = new Socket(addr, port);
    FileWriter fstream = new FileWriter("out.txt");

    // Send the header
    String path = "/WebSite3/Service.asmx";
    BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(sock.getOutputStream(), "UTF8"));
    bw.write("POST " + path + " HTTP/1.1\r\n");
    bw.write("Host: localhost\r\n");
    bw.write("Content-Type: text/xml; charset=\"utf-8\"\r\n");
    bw.write("Content-Length: " + xmlData.length() + "\r\n");
    bw.write("SOAPAction: \"http://np-challenger/HelloWorld\"");
    bw.write("\r\n");

    // Send the POST data string
    bw.write(xmlData);
    bw.flush();

    // Process the response from the web service
    BufferedReader br = new BufferedReader(new InputStreamReader(sock.getInputStream()));
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
    bw.close();
    br.close();
} catch (Exception e) {
    System.err.println(e.getMessage());
    e.printStackTrace(System.err);
}
I'm a bit suspicious about whether the way you calculate the content length is correct, but more importantly:
Use a testing tool.
You can use a testing tool to compare good and bad requests. One such tool is soapUI; it's very convenient for showing you the exact contents of the requests and responses.
Create a new project in soapUI, based on the WSDL of your web service. Make sure to mark the checkboxes "Create sample requests for all operations" and "Create a Web Service Simulation of the imported WSDL". This way, soapUI will be able to act both as a client for your actual .NET web service, and as a server to which your Java client will connect.
Make sure that when soapUI acts as a client and connects to your web service, the request is processed correctly. Then run it as a server, send a request from Java, and compare that request to the one that was processed successfully.
I chose to emphasize the role of a testing tool instead of addressing the specific problems in your code, because I believe that the ability to analyze the contents of your requests and responses will prove to be valuable time after time.
Use a WS framework.
Working with web services on such a low level requires a lot of unnecessary work from you. There are several frameworks and tools in Java that allow you to work on a higher abstraction level, eliminating the need to handle sockets and HTTP headers yourself. Take a look at the JAX-WS standard. This tutorial shows how to create a client for an existing web service. You'll notice that it's much simpler than your code sample.
Other popular WS frameworks in Java are Apache Axis2 and Apache CXF.
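As a taste of that higher abstraction level, here is a rough SAAJ sketch (SAAJ is the SOAP messaging API that ships alongside JAX-WS); the endpoint and SOAPAction are taken from the question, and SOAPException/IOException handling is omitted:

import javax.xml.namespace.QName;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPConnection;
import javax.xml.soap.SOAPConnectionFactory;
import javax.xml.soap.SOAPMessage;

SOAPConnection connection = SOAPConnectionFactory.newInstance().createConnection();

// Build the same HelloWorld request as in the question (SOAP 1.1 by default)
SOAPMessage request = MessageFactory.newInstance().createMessage();
request.getSOAPBody().addChildElement(new QName("http://np-challenger", "HelloWorld"));
request.getMimeHeaders().addHeader("SOAPAction", "http://np-challenger/HelloWorld");
request.saveChanges();

// Send it and print the raw response envelope
SOAPMessage response = connection.call(request, "http://localhost:50217/WebSite3/Service.asmx");
response.writeTo(System.out);
connection.close();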
It's actually a difference in the data that is going to the server. Monitor the data that you are actually posting using TCP Monitor and compare it, i.e. the MIME headers, the request XML, etc.
You will find the mistake. As far as I can see, the first method uses GET while the second uses POST. I am not saying that this is the error; just monitor the actual data going to the server and you will get your problem resolved.
