When Selenium navigates through web pages, I want to save some of the text from those pages. I use Selenium with Java. Is there a way to extract the text at a specific XPath from those pages?
selenium.getText("//div[@id='debugState']");
See the getText documentation: http://release.seleniumhq.org/selenium-remote-control/0.9.0/doc/java/com/thoughtworks/selenium/DefaultSelenium.html#getText%28java.lang.String%29
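To check what an attribute predicate such as `[@id='debugState']` selects without starting a browser, here is a minimal sketch that uses only the JDK's `javax.xml.xpath` package. It is not Selenium and it assumes well-formed markup, which real pages often are not. The class name `XPathTextDemo`, the helper `textAt`, and the sample markup are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

class XPathTextDemo {

    // Hypothetical helper: returns the text of the first node matched by the
    // XPath expression, or an empty string when nothing matches.
    // Assumes the markup is well-formed XML.
    public static String textAt(String wellFormedMarkup, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            InputSource source = new InputSource(new StringReader(wellFormedMarkup));
            return xpath.evaluate(expression, source);
        } catch (XPathExpressionException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new IllegalArgumentException("Bad XPath or markup: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Sample markup, invented for this example.
        String page = "<html><body>"
                + "<div id='debugState'>ready</div>"
                + "<div id='other'>ignored</div>"
                + "</body></html>";
        System.out.println(textAt(page, "//div[@id='debugState']")); // prints "ready"
    }
}
```

Note that the attribute test is written with `@id`. The same expression string can then be passed to Selenium's getText.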
Can I fill out forms, fire events, and execute JavaScript functions in Jsoup? If yes, how? Or should I go for another parser?
JSoup is just an HTML parser/"tidyfier", not a browser emulator. To interact with HTML pages (execute JavaScript, fill out forms, etc.) you should use a tool like HtmlUnit or Selenium.
Use Selenium. With the Selenium 2 WebDriver API, the main types are WebDriver, FirefoxDriver, and JavascriptExecutor.
Is there any way to get text from PDF pages using Selenium/Java, apart from reading the file through an input stream?
In my application a report opens in PDF format, and I need to get data from it.
When it is opened in Firefox, the browser shows a DOM structure, but I wasn't able to locate elements with it.
No. Selenium automates browsers to exercise web applications and run tests. What you are asking for is not part of the Selenium API. Third-party APIs are available, but they don't work 100% of the time. Check out:
How to extract text from a PDF?
I am generating PDF files dynamically in my application using the Apache PDFBox library.
I have a JSP page with a Print button. When the user clicks that button, I want to generate the PDF file, show it in the browser, and call window.print().
How can I achieve this in my JSP page?
Create a PDF link on your page, mapped to the actual location of the PDF on your server.
The browser decides what to do with the PDF (based on its settings): download it or open it via a plugin. The bottom line is that you cannot control this from server-side code.
In either case you cannot apply window.print(), because it applies only to the browser window and not to the PDF plugin; and if the file is downloaded, the user has to open it manually.
There is an alternative solution: show the PDF in a div in your HTML and print that div.
For how to show a PDF in an HTML div, see "Display Adobe pdf inside a div".
For printing a div or any other HTML element, jQuery plugins are available. I have used print.js, which prints an HTML div and preserves your CSS.
So when the user clicks the print button, first show the PDF in a div and then call the print function to print that div.
I'm trying to parse an HTML page with a DOM parser and the jsoup library.
The problem I'm facing is this:
On the website there are two buttons, which show two different tables.
I need to parse the table that is shown when the second button is clicked.
Different attribute values are set after the second button is clicked.
When I call Jsoup.connect("example.com"),
I get the response as if the first button were selected, and I don't need that data.
Is there a way to click the second button and then start parsing and retrieving data from the website?
Jsoup is just a parser, i.e. it can't handle events such as clicking on buttons. Have a look at browser automation tools (e.g. Selenium) to perform this kind of job.
JSoup is an HTML parser, not a browser alternative. Take a look at HtmlUnit.
HtmlUnit is a "GUI-Less browser for Java programs". It models HTML documents and provides an API that allows you to invoke pages, fill out forms, click links, etc... just like you do in your "normal" browser.
JSoup can't control the web page; it can only parse the content. For manipulation and interaction there are other tools. I recommend Geb, which uses a Groovy DSL with a jQuery-like syntax, making it very fluent. It also makes parsing XML/HTML pretty easy.