I tried to use HtmlUnit to send POST requests to a server and ran into a small problem: the target .php URL changes from time to time
(www123.example.net -> www345.example.net, etc.). The only way to get the new address is to open the site, check its XHR requests, find the one that goes to www???.example.net, and then use this address to send the POSTs.
So the question is: is there a way to track XHR requests using HtmlUnit or any other Java library?
If you really need help, you have to show your problem in more detail: provide some info about the web site you are requesting, show your code, and try to explain what you expect and what goes wrong. Without these details we can only guess.
It looks like you should think about HtmlUnit more as a browser you can control from Java than as a tool for doing simple HTTP requests. Have a look at the simple samples on the HtmlUnit web site (the one at the bottom is for you).
Try something like this (the same steps as the user of an ordinary browser does)
* open the url/page
* fill the various form fields
* find the submit button and click it
* use the resulting page content
Usually HtmlUnit does all the stuff in the background for you.
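If you still need the changing www???.example.net host itself, here is a minimal sketch of the last step: picking the host out of the request URLs the page fired. It assumes you have already recorded those URLs somehow (for example by wrapping the web client's connection with HtmlUnit's WebConnectionWrapper and noting each request URL). The class name, method name and sample URLs are hypothetical, and only the JDK is used.

```java
import java.net.URI;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

final class XhrHostFinder {
    // Matches hosts such as www123.example.net (the pattern described in the question).
    private static final Pattern ROTATING_HOST = Pattern.compile("www\\d+\\.example\\.net");

    /** Returns the first rotating host among the recorded request URLs, if any. */
    static Optional<String> findRotatingHost(List<String> requestUrls) {
        return requestUrls.stream()
                .map(url -> URI.create(url).getHost())
                .filter(host -> host != null && ROTATING_HOST.matcher(host).matches())
                .findFirst();
    }

    public static void main(String[] args) {
        // Hypothetical list of URLs recorded while the page was loading.
        List<String> recorded = List.of(
                "http://www.example.net/index.html",
                "http://www345.example.net/api/target.php");
        System.out.println(findRotatingHost(recorded).orElse("not found"));
    }
}
```

The returned host can then be used to build the address for the POSTs.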
Related
Long story short, I am having an issue sending a POST request to the server after completing my first one. This is being done in Java.
Basically my question is: using Apache HttpClient, is it possible to press a button? I can't find any other way around this. (I normally use Selenium but am attempting to save RAM by removing the browser, so I am using POST requests instead.)
Here is an example of html code:
I have looked into making the POST request to the server by using the Network tab in Chrome and tried multiple things, but it wouldn't work.
I have an informative web page in my Spring-based web application which needs to be saved as HTML/downloaded.
My requirement is to save/download this opened web page on click of a button on the same page.
I used the code below in JavaScript.
document.execCommand("SaveAs",true,"C:\\Saved Content.html");
But this is only working in IE and not in other browsers.
Kindly help on this.
Simply no. JavaScript/jQuery is restricted from performing such operations for security reasons.
The best way to achieve this would be to have the server write the new file.
From JavaScript, perform a POST request to the server page, passing the data you want written to the new file.
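One way the server side could look, as a rough sketch only: instead of writing a file from the browser, the server returns the page with a Content-Disposition: attachment header, so the browser offers it as a download. Note that this swaps in the JDK's built-in com.sun.net.httpserver package rather than Spring (which the question uses), and the /download path, the port, and the class and method names are all assumptions.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

final class DownloadPageServer {
    /** Header value asking the browser to save the response under the given name. */
    static String contentDisposition(String fileName) {
        // Drop double quotes so the quoted filename stays well-formed.
        return "attachment; filename=\"" + fileName.replace("\"", "") + "\"";
    }

    /** Starts a server whose /download endpoint (hypothetical path) returns the HTML as a file. */
    static HttpServer start(int port, String html) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/download", exchange -> {
            byte[] body = html.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
            exchange.getResponseHeaders().set("Content-Disposition",
                    contentDisposition("Saved Content.html"));
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) {
        // Only prints the header value; call start(port, html) to actually serve the page.
        System.out.println(contentDisposition("Saved Content.html"));
    }
}
```

The button on the page would then simply link to (or submit to) the download endpoint, and this works the same way in every browser.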
I am planning to write a Java program where I have the URL of website x, to which I will append a number from 1 to 100, and I will get a result from the website for each one.
Should I write it using HTTP requests and responses, or would a plain Java program with the URL as a string do?
If I get the result as it is displayed in the browser, how do I get the values from a div and write them to a text file? I guess the other option is to get them from the response.
All you need is a programmatic browser, which submits the request and gets you the response.
You could study the HTTP request and response objects on top of the TCP/IP protocol stack and implement your own, but instead of reinventing the wheel you can use the Apache HttpComponents project, which has all this implemented:
Apache Http Components
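As a rough illustration of the whole loop, here is a minimal sketch that uses the JDK's own java.net.http.HttpClient (Java 11+) instead of Apache HttpComponents, so it runs without extra dependencies. The base URL, the div id and the class and method names are hypothetical. The div is extracted with a regular expression, which is fragile (nested divs are not handled); a real HTML parser would be more robust.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberedPageFetcher {
    /** Appends the number to the base URL. */
    static String buildUrl(String baseUrl, int number) {
        return baseUrl + number;
    }

    /** Returns the trimmed content of the first div with the given id, or "" if absent. */
    static String extractDiv(String html, String id) {
        Pattern p = Pattern.compile(
                "<div[^>]*\\bid=\"" + Pattern.quote(id) + "\"[^>]*>(.*?)</div>",
                Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    /** Fetches baseUrl + 1 .. baseUrl + 100 and writes one "number<TAB>value" line per page. */
    static void fetchAll(String baseUrl, String divId, Path out)
            throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= 100; i++) {
            HttpRequest request = HttpRequest.newBuilder(URI.create(buildUrl(baseUrl, i))).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            lines.add(i + "\t" + extractDiv(response.body(), divId));
        }
        Files.write(out, lines);
    }

    public static void main(String[] args) {
        // Offline demonstration of the extraction step on a hypothetical page.
        String html = "<html><body><div id=\"result\"> 42 </div></body></html>";
        System.out.println(extractDiv(html, "result"));
    }
}
```

So a plain Java program with the URL as a string is enough: the request and response handling is just the part that fetches each page before the div is read.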
I'm not sure you will be able to control the browser using only Java. Even if you know where the browser's exe file is installed, you will not be able to use its handle to control it (no pointers in Java, different process, different memory area, etc.). Sure, you could write a DLL and then use it via JNI, but the final result would not be multiplatform...
Another possible approach would be to inject key presses, but you would be blind to the browser's response (you would have to do some ugly screen capture).
I don't think it is an easy task, so if I were you I would search the web for an existing DLL or library to control the browser.
I know that Selenium does some kind of browser control (http://docs.seleniumhq.org/)
my 5 cents in 5 minutes.
Nowadays many websites contain content loaded by Ajax (e.g., comments on some video websites). Normally we can't crawl this data, and what we get is just some JS source code. So here is the question: how can we execute the JavaScript code after we get the HTML response, so that we get to the final page we want?
I know that HtmlUnit can execute background JS, yet it has many bugs and errors. Are there any other tools that can help me with this?
Some people tell me that I can capture the Ajax request URL, analyze its parameters, and send the request again to get the data. If the approach above can't work, can anyone tell me how to extract the Ajax URL and send the request in the correct format?
By the way, Java would be the preferred language.
Yes, Netwoof can crawl Ajax easily. Its API and bot builder let you do it without a line of code.
That's the great thing about HTTP: you don't even need Java. My go-to tool for debugging AJAX is the Chrome extension Postman. I start by looking at the request in the Chrome debugger and identifying the salient bits (URL, form-encoded params, etc.).
Then it can be as simple as opening a tab and launching requests at the server with Postman. As long as it's all in the same browser context, all of your cookies (for authentication, etc.) will be shipped along too.
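Since the question asked for Java, the same replay can be sketched there too. This is only a sketch under assumptions: the URL, parameter names and cookie value are hypothetical stand-ins for whatever you see in the browser's debugger, and it uses the JDK's java.net.http classes (Java 11+). Outside the browser the cookies are not shipped along automatically, so the session cookie is set by hand.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class AjaxReplay {
    /** Encodes parameters as application/x-www-form-urlencoded. */
    static String formEncode(Map<String, String> params) {
        return params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds a POST that mimics the observed Ajax call, including the session cookie. */
    static HttpRequest buildRequest(String url, Map<String, String> params, String cookie) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .header("Cookie", cookie)
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(params)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical values copied from the browser's network tab.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("videoId", "123");
        params.put("page", "2");
        HttpRequest request = buildRequest("http://www.example.net/comments.php", params, "SESSIONID=abc");
        System.out.println(request.method() + " " + request.uri() + " body: " + formEncode(params));
    }
}
```

Sending it is then a matter of passing the request to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and parsing the response body.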
I've often wanted to create applications that provide a simpler front-end to other websites that require users to login before the pages I want to use can be accessed. I was wondering, if
(1) any website with a POST to an http page can be authenticated by POSTing
postField1name=pf1Value&postField2name=pf2Value
to the website, if that's true how can you inspect the HTML to POST correctly?
(2) I wanted to know if you could parse HTML, say for a sign up form, and display all the fields in an application UI, including downloading a Captcha, and displaying it to the user, and allowing them to type the value in, to send back to the website, and process the response.
Also if anyone knows how I might accomplish (2) using Apache HTTP Client in java, I'd greatly appreciate it!
http://hc.apache.org/httpcomponents-client/httpclient/index.html
(1) An easy way to find out what's actually being POST'd is to look at the actual HTTP requests. You can do that with a tool like LiveHTTPHeaders. Then have your script simulate that.
(2) Yes. You can use cURL, which is excellent for things like this.
(1) Try FireBug. There's actually a lot of options for authentication.
(2) Try JTidy
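For (2), a minimal sketch of the "find out which fields the form has" step, using only the JDK. The class name and the sample form are hypothetical, and the regular expressions are a shortcut that will miss unusual markup; a proper parser such as JTidy would be the more robust route. The resulting names are what you would show in your UI and later send back as name=value pairs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FormFieldLister {
    private static final Pattern FIELD_TAG =
            Pattern.compile("<(?:input|select|textarea)\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    private static final Pattern NAME_ATTR =
            Pattern.compile("\\bname\\s*=\\s*[\"']([^\"']*)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the name attribute of every input, select and textarea tag, in document order. */
    static List<String> fieldNames(String html) {
        List<String> names = new ArrayList<>();
        Matcher tag = FIELD_TAG.matcher(html);
        while (tag.find()) {
            Matcher name = NAME_ATTR.matcher(tag.group());
            if (name.find()) {
                names.add(name.group(1)); // tags without a quoted name are skipped
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical login form; the field names are the ones used in the question.
        String html = "<form method=\"post\" action=\"/login\">"
                + "<input type=\"text\" name=\"postField1name\">"
                + "<input type=\"password\" name=\"postField2name\">"
                + "<input type=\"submit\" value=\"Log in\"></form>";
        System.out.println(fieldNames(html));
    }
}
```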