How to log in and then submit a form using Java? - java

I need to create a Java program which should go to a website's login page, log in, then go to another page of the site and submit a form. I know how to submit a form, but my problem is with the login part. This script should work with multiple sites; some use cookies and some use sessions. Is there any way to solve my problem?
I can't show you any code because I don't know where to begin. Should I first submit the login form and then separately go to the submission page? Please tell me how I could solve this problem: I want to submit a form on various sites automatically, and only I will be using this script. Until now I've created a script in JavaScript: I opened the sites in iframes, discovered that in Google Chrome I can control external iframes too, and used JavaScript to fill the forms automatically. My problem is that I also need to submit files and images, and I can't do that using only JavaScript. If this isn't possible in Java, please help me find another solution; I need to make it fully automated.

You can use the Apache HttpClient library to log in to websites from Java.
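A minimal sketch of that approach, assuming HttpClient 4.x and a login form that POSTs `username` and `password` fields to a hypothetical `/login` URL. The field names, URLs, and credentials here are placeholders; inspect the site's actual login form for the real action URL and input names. The default client keeps a cookie store, so a session cookie set by the login response is sent automatically on later requests:

```java
import java.util.Arrays;
import org.apache.http.client.entity.UrlEncodedFormEntity;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.message.BasicNameValuePair;
import org.apache.http.util.EntityUtils;

public class LoginExample {
    public static void main(String[] args) throws Exception {
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            // Hypothetical login endpoint and field names.
            HttpPost login = new HttpPost("https://example.com/login");
            login.setEntity(new UrlEncodedFormEntity(Arrays.asList(
                    new BasicNameValuePair("username", "me"),
                    new BasicNameValuePair("password", "secret"))));
            try (CloseableHttpResponse resp = client.execute(login)) {
                // Consume the body so the connection can be reused.
                EntityUtils.consume(resp.getEntity());
            }
            // The same client (same cookie store) now fetches a page
            // that sits behind the login.
            try (CloseableHttpResponse resp =
                    client.execute(new HttpGet("https://example.com/submit-page"))) {
                System.out.println(EntityUtils.toString(resp.getEntity()));
            }
        }
    }
}
```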

I would take a look at the Selenium RC framework and APIs. It's a test automation tool but there's no reason why you couldn't use it for doing programmatic logins to websites. It has client libraries for many languages including Java.
Using Selenium RC you can write Java code that can load, navigate and fill in forms programmatically. You can target the form input fields using field names or classes, and the Java API allows you to load multipart data into a form.
Selenium comes in two flavours, the older Selenium RC version and the newer WebDriver version. Both are capable of doing what you want, however they have slightly different ways of doing it. The documentation provides some good examples to get you started.
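A WebDriver-flavoured sketch of the flow described above. The URLs, field names, and file path are hypothetical placeholders; calling `sendKeys` on a file input is how WebDriver attaches a file for a multipart upload:

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;

public class SeleniumLogin {
    public static void main(String[] args) {
        WebDriver driver = new FirefoxDriver();
        try {
            // Log in first. Check the page source for the real field names.
            driver.get("https://example.com/login");
            driver.findElement(By.name("username")).sendKeys("me");
            driver.findElement(By.name("password")).sendKeys("secret");
            driver.findElement(By.cssSelector("form input[type=submit]")).click();

            // Then navigate to the form page and attach a file;
            // sendKeys on a file input uploads that file on submit.
            driver.get("https://example.com/submit-page");
            driver.findElement(By.name("attachment"))
                  .sendKeys("/path/to/image.png");
            driver.findElement(By.cssSelector("form input[type=submit]")).click();
        } finally {
            driver.quit();
        }
    }
}
```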

Related

imitate browser with java

I'm looking for a java-framework which enables me to easily communicate with a website.
What I'd like to do is for example:
log into a website
open various pages
read information
submit information into forms
send ajax-requests
read ajax-response
What I'm not looking for is a browser automation plugin like selenium. I'm trying to have my application directly communicate with the website.
That's the general outline. If you can think of a better solution for the following problem, I'm more than willing to follow your advice (:
We're working with a web application that has a gruesome GUI. Unfortunately we have no means to tinker with said application or request changes to it. What I'd like to do is build a client which logs into said application, fetches the data, and displays it in a more appropriate manner with additional information based on that data, while also providing tools to process this data and submit it back to that web application.
Thanks in advance.
Selenium is available for Java. You can download it from here: http://www.seleniumhq.org/download/
Here is a tutorial:
https://www.airpair.com/selenium/posts/selenium-tutorial-with-java
How Selenium WebDriver works
Selenium WebDriver (e.g. the Firefox driver) will open a web browser (Firefox) for you, so you can actually see what's going on. If opening a visible browser window is not a requirement for you, you can use a headless driver instead:
HtmlUnitDriver
PhantomJSDriver
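Switching to headless mode only means instantiating a different driver class; the rest of the WebDriver code stays the same. A minimal sketch with HtmlUnitDriver (the URL is a placeholder):

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.htmlunit.HtmlUnitDriver;

public class HeadlessExample {
    public static void main(String[] args) {
        // The boolean argument enables HtmlUnit's JavaScript engine;
        // no browser window is opened at any point.
        WebDriver driver = new HtmlUnitDriver(true);
        try {
            driver.get("https://example.com"); // placeholder URL
            System.out.println(driver.getTitle());
        } finally {
            driver.quit();
        }
    }
}
```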
Take a look at
http://hc.apache.org/httpcomponents-client-ga/quickstart.html
It's not a framework but a library, and it should provide the methods you need to interact with your web application.

Download HTML of a webpage that is changed by JavaScript

The program I am writing is in Java.
I am writing a little program that will download the html of webpages and save them. It works easily for basic pages that don't use JavaScript. But how can I download the page if I want it after a script has updated it? The page I am dealing with is actually updated by Ajax which might be one step harder.
I understand that this is probably a difficult problem that involves setting up a JavaScript run time environment of some kind. I am prepared for a solution of any level of difficulty, I just don't know exactly how to approach it or where to get started.
You can't do that with plain Java alone. Since the page you want to download is rendered with JavaScript, you must be able to execute that JavaScript to get the fully rendered page.
Because of this, you need a headless browser: a web browser that can access web pages but doesn't show the output in a GUI, and whose aim is to provide the content of web pages, fully rendered, to other programs or scripts.
You can start with the most famous ones: Selenium, HtmlUnit and PhantomJS.
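A minimal HtmlUnit sketch of that approach, assuming a recent HtmlUnit where WebClient is AutoCloseable. The URL is a placeholder and the 10-second Ajax timeout is an assumption you would tune for your page:

```java
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class RenderedDownload {
    public static void main(String[] args) throws Exception {
        try (WebClient client = new WebClient()) {
            client.getOptions().setJavaScriptEnabled(true);
            // Real-world pages often have script errors we don't care about.
            client.getOptions().setThrowExceptionOnScriptError(false);
            HtmlPage page = client.getPage("https://example.com/dynamic");
            // Give pending Ajax requests up to 10 seconds to complete.
            client.waitForBackgroundJavaScript(10_000);
            // asXml() serialises the DOM as it stands after the scripts ran,
            // not the original static HTML that came over the wire.
            System.out.println(page.asXml());
        }
    }
}
```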

Navigate to and learn all the web objects on a page using Java (without Selenium)

I work for a start-up, where we have a requirement to automatically navigate to a given web application and find out information about all the objects contained within a page (inclusive of any iframes inside). We are supposed to code this module in Java.
So, I used Selenium WebDriver and was successful. However, for various reasons, we've been asked not to use Selenium but rather core Java to do this.
So here's my question. Let's say I want to open "http://www.google.co.in" in my Firefox browser, and I have to get the attribute values for the Search text box, the Search button and the I'm Feeling Lucky button. I have to do this using Java. Where do I start?
I had an idea, which was to actually navigate to a page, read its HTML source and build an XPath query to find each element and get its attributes. But how do I accomplish this navigation using Java (or jQuery as well, if that's possible)?
It may sound as if I'm trying to build an automation tool from scratch, but I'm just considering all possibilities.
Please help.
If you have loaded the HTML content of the page into a single string variable, you can use standard Java string mechanisms to find contents of the HTML page in your string.
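A core-Java sketch of that idea, using only the standard library. The regex-based extraction is deliberately naive (a real HTML parser handles far more cases), and the `q`/`title` names are illustrations, not a description of Google's actual markup:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AttributeScraper {
    /**
     * Returns the value of the given attribute on the first element
     * whose name attribute equals elementName, or null if not found.
     * A naive regex sketch; a real parser handles far more than this.
     */
    static String attributeOf(String html, String elementName, String attribute) {
        Pattern tag = Pattern.compile(
                "<[^>]*\\bname=[\"']" + Pattern.quote(elementName) + "[\"'][^>]*>");
        Matcher m = tag.matcher(html);
        if (!m.find()) return null;
        Matcher a = Pattern
                .compile("\\b" + Pattern.quote(attribute) + "=[\"']([^\"']*)[\"']")
                .matcher(m.group());
        return a.find() ? a.group(1) : null;
    }

    public static void main(String[] args) throws Exception {
        String html;
        if (args.length > 0) {
            // Fetch a live page with nothing but the standard library.
            StringBuilder sb = new StringBuilder();
            try (java.io.BufferedReader in = new java.io.BufferedReader(
                    new java.io.InputStreamReader(
                            new java.net.URL(args[0]).openStream(), "UTF-8"))) {
                for (String line; (line = in.readLine()) != null; ) {
                    sb.append(line).append('\n');
                }
            }
            html = sb.toString();
        } else {
            html = "<input name=\"q\" title=\"Search\">"; // offline demo
        }
        System.out.println(attributeOf(html, "q", "title"));
    }
}
```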
This might help http://www.javaworld.com/article/2077567/core-java/java-tip-66--control-browsers-from-your-java-application.html
I don't know why you want to do this in Java instead of Selenium. Selenium would be the best tool for this job; you should try to convince your team instead.

Command line based HTTP POST to retrieve data from javascript-rich webpage

I'm not sure if this is possible but I would like to retrieve some data from a web page that uses Javascript to render data. This would be from a linux shell.
What I am able to do now:
HTTP POST using curl/lynx/wget to log in and get headers from the command line
use those headers to get into 'secure' locations of the webpage from the command line
However, the only elements rendered on the page are the static HTML. Most of the info I need is rendered dynamically with JS (albeit eventually as HTML as well) and doesn't show up in a command-line browser. I understand the issue is the lack of a JS interpreter.
As such... some workarounds I thought might be possible are:
calling a full browser from the command line and somehow passing the info back to stdout; this would mean that I have to be able to POST.
passing the headers (with session info, etc.) I got from curl to one of these full browsers and again dumping the output HTML back to stdout. It could even be a screenshot of the window if all else fails.
a pure Java solution would be OK too.
Has anyone had experience doing something similar and succeeding?
Thanks!
You can use WebDriver to do this; you just need to have a web browser installed. There are other solutions as well, such as Selenium RC and HtmlUnit (which needs no browser but might behave differently).
You can find an example Selenium project here.
WebDriver
WebDriver is a tool for writing automated tests of websites. It aims
to mimic the behaviour of a real user, and as such interacts with the
HTML of the application.
Selenium
Selenium automates browsers. That's it. What you do with that power is
entirely up to you. Primarily it is for automating web applications
for testing purposes, but is certainly not limited to just that.
Boring web-based administration tasks can (and should!) also be
automated as well.
HtmlUnit
HtmlUnit is a "GUI-Less browser for Java programs". It models HTML
documents and provides an API that allows you to invoke pages, fill
out forms, click links, etc... just like you do in your "normal"
browser.
I would recommend using WebDriver because it does not require a standalone server like Selenium RC does, while HtmlUnit might be suitable if you don't want to install a browser or worry about Xvfb in a headless environment.
You might want to see what Selenium can do for you. It has numerous language drivers (Java included) that can be used to interact with the browser to process content typically for testing and verification purposes. I'm not exactly sure how you can get exactly what you are looking for out of it but wanted to make you aware of its existence and potential.
This is impossible unless you set up a WebSocket, and even then I guess it really depends.
Could you detail your objective? For my personal curiosity :-)

autogenerate HTTP screen scraping Java code

I need to screen scrape some data from a website, because it isn't available via their web service. When I've needed to do this previously, I've written the Java code myself using Apache's HTTP client library to make the relevant HTTP calls to download the data. I figured out the relevant calls I needed to make by clicking through the relevant screens in a browser while using the Charles web proxy to log the corresponding HTTP calls.
As you can imagine this is a fairly tedious process, and I'm wondering if there's a tool that can actually generate the Java code that corresponds to a browser session. I expect the generated code wouldn't be as pretty as code written manually, but I could always tidy it up afterwards. Does anyone know if such a tool exists? Selenium is one possibility I'm aware of, though I'm not sure if it supports this exact use case.
Thanks,
Don
I would also add +1 for HtmlUnit, since its functionality is very powerful: if you need behaviour 'as though a real browser were scraping and using the page', that's definitely the best option available. HtmlUnit executes (if you want it to) the JavaScript in the page.
It currently has full-featured support for all the main JavaScript libraries and will execute JS code that uses them. Correspondingly, you can get handles to the JavaScript objects in the page programmatically within your test.
If, however, the scope of what you are trying to do is smaller, more along the lines of reading some HTML elements where you don't much care about JavaScript, then NekoHTML should suffice. It's similar to JDOM, giving programmatic (rather than XPath) access to the tree. You would probably need to use Apache's HttpClient to retrieve the pages.
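A sketch of that combination, assuming HttpClient 4.x and NekoHTML on the classpath; the URL is a placeholder. NekoHTML tolerates the tag soup real pages are made of and hands back a standard W3C DOM:

```java
import java.io.StringReader;

import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import org.cyberneko.html.parsers.DOMParser;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ScrapeLinks {
    public static void main(String[] args) throws Exception {
        // Retrieve the raw HTML with HttpClient.
        String html;
        try (CloseableHttpClient client = HttpClients.createDefault()) {
            html = EntityUtils.toString(
                    client.execute(new HttpGet("https://example.com")).getEntity());
        }
        // Parse the tag soup into a W3C DOM with NekoHTML.
        DOMParser parser = new DOMParser();
        parser.parse(new InputSource(new StringReader(html)));
        Document doc = parser.getDocument();
        // NekoHTML upper-cases element names by default.
        NodeList anchors = doc.getElementsByTagName("A");
        for (int i = 0; i < anchors.getLength(); i++) {
            System.out.println(((Element) anchors.item(i)).getAttribute("href"));
        }
    }
}
```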
The manageability.org blog has an entry which lists a whole bunch of web page scraping tools for Java. However, I do not seem to be able to reach it right now, but I did find a text only representation in Google's cache here.
You should take a look at HtmlUnit - it was designed for testing websites but works great for screen scraping and navigating through multiple pages. It takes care of cookies and other session-related stuff.
I would say I personally like to use HtmlUnit and Selenium as my 2 favorite tools for Screen Scraping.
A tool called The Grinder allows you to script a session to a site by going through its proxy. The output is Python (runnable in Jython).
