Web site thinks I'm using a small display when using HtmlUnit - java

I'm trying to extract statements from my gas company (PSNC Energy) at https://www.psncenergy.com
Code is as follows:
// Quieten HtmlUnit's own logging
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.INFO);
WebClient webClient = new WebClient(BrowserVersion.CHROME);
try {
    HtmlPage page = webClient.getPage("https://www.psncenergy.com");
    System.out.println(page.asXml());
    // XPath attribute tests use '@', not '#'
    HtmlInput userNameInput = page.getFirstByXPath("//input[@name='user-name']");
    userNameInput.setValueAttribute("john"); // setTextContent() does not set an input's value
    HtmlPasswordInput password = page.getFirstByXPath("//input[@type='password']");
    password.setText("doe");
    HtmlButton loginButton = (HtmlButton) page.getElementById("login-button");
    HtmlPage accountPage = loginButton.click();
} finally {
    webClient.close();
}
The page returned is entirely different from what I would get in either Firefox or Chrome. The OS is Linux (Fedora 24). In particular, PSNC's page as rendered by HtmlUnit says "You appear to be using a small screen."
I could try a different login page, but that gives a different (JavaScript) error, which again is not what a real browser gives.
My goal is to sign in, download the latest bill from them, then sign off.
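For the download step, a hedged sketch of what might follow once the login succeeds, continuing from the accountPage returned by the login click above; the XPath, the file name and the UnexpectedPage assumption are hypothetical, since the real PSNC markup isn't shown (needs java.io.InputStream, java.nio.file.Files/Paths/StandardCopyOption and com.gargoylesoftware.htmlunit.Page):

HtmlAnchor billLink = accountPage.getFirstByXPath("//a[contains(@href, 'ViewBill')]"); // hypothetical link
Page billPage = billLink.click();                  // a PDF normally comes back as an UnexpectedPage
try (InputStream in = billPage.getWebResponse().getContentAsStream()) {
    Files.copy(in, Paths.get("latest-bill.pdf"), StandardCopyOption.REPLACE_EXISTING);
}
// signing off is then just closing the client, which the finally block above already does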

Related

HtmlUnit: fill a form with JavaScript

I want to fill in the form at https://login.live.com/ but I could not. I don't want to use the Chromium Embedded Framework or Selenium for Java, because they open a browser. Is there a way to do it without opening a browser?
I tried HtmlUnit, but a JavaScript problem occurred:
WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
final HtmlPage page1 = webClient.getPage("https://login.live.com/en");
final HtmlForm form = (HtmlForm) page1.getElementById("i0281");
final HtmlTextInput textField = form.getInputByName("loginfmt");
textField.setValueAttribute("email");
Error message:
Exception in thread "main" com.gargoylesoftware.htmlunit.ElementNotFoundException: elementName=[input] attributeName=[name] attributeValue=[loginfmt]
at com.gargoylesoftware.htmlunit.html.HtmlForm.getInputByName(HtmlForm.java:572)
It works on HTML pages without JavaScript.
If you don't want to post code, you can just give me a hint, e.g. which framework to use or what to google.
Thank you
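login.live.com assembles its sign-in form with JavaScript, so the loginfmt input may simply not exist yet at the moment getPage() returns. A hedged sketch of the usual mitigation: let background JavaScript finish before looking the field up (heavily scripted login pages may still not render fully under HtmlUnit):

WebClient webClient = new WebClient(BrowserVersion.CHROME);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setThrowExceptionOnScriptError(false);           // live.com scripts may raise errors
webClient.setAjaxController(new NicelyResynchronizingAjaxController()); // resynchronize AJAX calls
HtmlPage page1 = webClient.getPage("https://login.live.com/en");
webClient.waitForBackgroundJavaScript(10_000);                          // give the scripts up to 10 s
HtmlInput textField = page1.getFirstByXPath("//input[@name='loginfmt']");
if (textField != null) {
    textField.setValueAttribute("email");
}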

Open a web browser page after a POST request using the HtmlUnit library

I'm testing my website, and I move around inside it using the HtmlUnit library and Java, like this for example:
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);
HtmlPage page1 = webClient.getPage(mypage);
// sent using POST
HtmlForm form = page1.getForms().get(0);
HtmlSubmitInput button = form.getInputByName("myButton");
HtmlPage page2 = button.click();
// I want to open page2 on a web browser and continue there using a function like
// continueOnBrowser(page2);
I filled in a form programmatically using HtmlUnit and then submitted it; the form uses the POST method. Now I'd like to see the content of the response in a web browser. The problem is that simply opening the URL in a browser doesn't work, since the page is the response to a POST request.
It seems like the wrong approach to me; obviously, if you do everything programmatically you can't just expect to open a browser and continue from there... but I can't figure out what would solve my problem.
Do you have any suggestions?
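One possible workaround (not from the original question) is to save the POST response that HtmlUnit received to a local file and open that file in the default browser. A minimal sketch, continuing from page2 in the snippet above and using HtmlPage.save plus java.awt.Desktop and java.io.File; note the browser only sees a static snapshot and does not share HtmlUnit's cookies or session:

File snapshot = new File("post-response.html");
page2.save(snapshot);                              // writes the page and its resources to disk
if (Desktop.isDesktopSupported()) {
    Desktop.getDesktop().browse(snapshot.toURI()); // open the local copy in the default browser
}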

Web scraping using HtmlUnit on an intranet website

I am presently using HtmlUnit to automatically fill forms and click a button on an intranet site. The code works successfully on internet websites but fails on the intranet site. The intranet site is an ASP site that only opens in IE. The code I am using is as follows:
final WebClient webClient =
        new WebClient(BrowserVersion.INTERNET_EXPLORER, "10.20.30.31", 8182); // proxy host and port
webClient.getOptions().setCssEnabled(false);
webClient.getOptions().setJavaScriptEnabled(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(true);
System.out.println(url);
HtmlPage page = webClient.getPage(url);
System.out.println("HTML page opened");
HtmlInput searchBox = page.getElementByName("txtFaq"); // this is the actual field name
searchBox.setValueAttribute(faq);
HtmlSubmitInput update = page.getElementByName("clear");
page = update.click();
// XPath attribute tests use '@', not '#'
HtmlDivision resultStatsDiv = page.getFirstByXPath("//div[@id='resultStats']");
System.out.println(resultStatsDiv.asText());
webClient.close();
On execution it throws the following exception:
java.net.SocketTimeoutException: Read timed out
What am I missing here?
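A read timeout usually means the request never got an answer, not that the page itself is broken. A few hedged things to try, assuming the HtmlUnit 2.x API; the host, port and credentials below are placeholders:

// 1) Try without the explicit proxy first; an internal proxy may not be able to reach intranet hosts.
final WebClient webClient = new WebClient(BrowserVersion.INTERNET_EXPLORER);

// 2) Raise the connect/read timeout (milliseconds); intranet ASP pages can be slow on a cold hit.
webClient.getOptions().setTimeout(120_000);

// 3) If the ASP site sits behind Windows/NTLM authentication, supply credentials explicitly (placeholders).
DefaultCredentialsProvider credentials =
        (DefaultCredentialsProvider) webClient.getCredentialsProvider();
credentials.addNTLMCredentials("user", "password", "intranet-host", -1, "workstation", "DOMAIN");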

Testing for concurrent users for a dynamic web application

I would like to test a web application which takes an input parameter and produces output. I don't want to do load or stress testing; I would like to have some 100 users entering the parameter and clicking submit. How can we automate this?
The web application I would like to test is http://protein.rnet.missouri.edu:8080/MongoTest/
You can achieve such functionality by using HtmlUnit.
HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc... just like you do in your "normal" browser.
The way to do this is something like the following:
// set the browser
WebClient webClient = new WebClient(BrowserVersion.FIREFOX_10);
// do not throw an exception on JavaScript errors
webClient.setThrowExceptionOnScriptError(false);
// load the page to test
final HtmlPage homepageEn = webClient.getPage("http://protein.rnet.missouri.edu:8080/MongoTest/");
// get the form by id (XPath attribute tests use '@', not '#')
HtmlForm form = homepageEn.getFirstByXPath("//form[@id='input_form']");
// set up the fields to use
HtmlTextInput mailField = form.getInputByName("mail");
HtmlPasswordInput passwordField = form.getInputByName("password");
// find the submit button (located by its value attribute)
HtmlSubmitInput submitButton = form.getInputByValue("submit");
// change the value of the text fields
mailField.setValueAttribute("somemail@xyzmail.com");
passwordField.setValueAttribute("some_password");
// finally submit the form by clicking the button
final HtmlPage resultsPage = submitButton.click();
You can then simulate the 100 users with a loop or a thread pool; that's up to you (a sketch follows below).
Hope this helps...
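To simulate the 100 users, one hedged sketch is a fixed thread pool where each task repeats the steps above with its own WebClient (a WebClient instance should not be shared between threads). This assumes a recent HtmlUnit in which WebClient is AutoCloseable and options live on getOptions(); the class name is made up for the example:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class ConcurrentUsersTest {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        for (int i = 0; i < 100; i++) {
            final int user = i;
            pool.submit(() -> {
                // each simulated user gets its own WebClient
                try (WebClient client = new WebClient(BrowserVersion.FIREFOX)) {
                    client.getOptions().setThrowExceptionOnScriptError(false);
                    HtmlPage home = client.getPage("http://protein.rnet.missouri.edu:8080/MongoTest/");
                    // ... fill the form and click submit exactly as in the snippet above ...
                } catch (Exception e) {
                    System.err.println("user " + user + " failed: " + e);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.MINUTES);
    }
}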

Getting the final HTML, with JavaScript rendered, as a String in Java

I want to fetch data from an HTML page (scrape it), but it contains reviews loaded by JavaScript. With a normal Java URL fetch I only get the raw HTML, without JavaScript executed. I want the final page after JavaScript has run.
Example :- http://www.glamsham.com/movies/reviews/rowdy-rathore-movie-review-cheers-for-rowdy-akki-051207.asp
This page has comments as a facebook plugin which are fetched as Javascript.
Something similar also happens on this page:
http://www.imdb.com/title/tt0848228/reviews
What should I do?
Use phantomjs: http://phantomjs.org
var page = require('webpage').create();
page.open("http://www.glamsham.com/movies/reviews/rowdy-rathore-movie-review-cheers-for-rowdy-akki-051207.asp");
setTimeout(function () {
    // where you want to save the screenshot
    page.render("screenshot.png");
    // you can access the page content using jQuery
    var fbcomments = page.evaluate(function () {
        return $(".fb-comments iframe").contents().find(".postContainer");
    });
}, 10000);
You have to use the phantomjs option --web-security=no to allow cross-domain interaction (i.e. for the Facebook iframe).
To communicate with other applications from phantomjs you can use a web server or make a POST request: https://github.com/ariya/phantomjs/blob/master/examples/post.js
You can use HtmlUnit, a Java-based "GUI-less browser". You can easily get the final rendered output of any page, because it loads the page just as a web browser does and returns the final rendered output. You can disable this behaviour, though.
UPDATE: You were asking for an example? You don't have to do anything extra for that:
Example:
WebClient webClient = new WebClient();
HtmlPage myPage = ((HtmlPage) webClient.getPage(myUrl));
UPDATE 2: You can get an iframe's content as follows:
HtmlPage myFrame = (HtmlPage) myPage.getFrameByName(myIframeName).getEnclosedPage();
Please read the documentation at the link above. There is nothing you can't do with HtmlUnit when it comes to getting page content.
A simple way to solve that problem: you can use HtmlUnit, a Java API; I think it can help you access the executed JS content as plain HTML.
WebClient webClient = new WebClient();
HtmlPage myPage = (HtmlPage) webClient.getPage(new URL("YourURL"));
System.out.println(myPage.getVisibleText());
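Tying the two HtmlUnit answers above together, a minimal sketch for getting the JavaScript-rendered DOM back as a String; the IMDb URL comes from the question and the 10-second wait is an arbitrary choice:

WebClient webClient = new WebClient();
webClient.getOptions().setThrowExceptionOnScriptError(false);            // third-party scripts often fail
webClient.setAjaxController(new NicelyResynchronizingAjaxController());  // resynchronize AJAX calls
HtmlPage page = webClient.getPage("http://www.imdb.com/title/tt0848228/reviews");
webClient.waitForBackgroundJavaScript(10_000);                           // give scripts up to 10 s to finish
String renderedHtml = page.asXml();                                      // the DOM after JavaScript has run
webClient.close();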
