Reading a webpage's inspect element data using Java

I have a requirement. I am reading data from a dynamic web page, and the values I need lie within <td> elements, which are visible when I inspect the element. So my question is: is it somehow possible to print the data shown in the inspect element view using Java?

Using Jsoup. Here is a cookbook-style example:
// Fetch and parse the page first (the URL here is just a placeholder)
Document doc = Jsoup.connect("http://www.whatever.com").get();
ArrayList<String> downServers = new ArrayList<>();
Element table = doc.select("table").get(0);
Elements rows = table.select("tr");
for (int i = 1; i < rows.size(); i++) {   // start at 1 to skip the header row
    Element row = rows.get(i);
    Elements cols = row.select("td");
    // Use cols.get(index).text() to read the data from a td element,
    // e.g. downServers.add(cols.get(0).text());
}

I found the solution to this one; leaving this answer here in case anyone runs into this in the future.
Whatever you see inside inspect element can be retrieved using Selenium.
Here's the code I used:
WebDriver driver = new ChromeDriver();
driver.manage().timeouts().implicitlyWait(15, TimeUnit.SECONDS);
driver.manage().window().maximize();
driver.get("http://www.whatever.com");
Thread.sleep(1000);
// The data sits inside a frame, so switch into it before locating the cell
List<WebElement> frameList = driver.findElements(By.tagName("frame"));
System.out.println(frameList.size());
driver.switchTo().frame(0);
String temp = driver.findElement(By.xpath("/html/body/table/thead/tr/td/div[2]/table/thead/tr[2]/td[2]")).getText();
Read here for more.
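To print every cell instead of a single one, a minimal sketch (still assuming the data sits inside the frame and that the first table in it holds the values; the locators are placeholders) could iterate the rows and cells:
// Sketch only: assumes driver has already loaded the page and switched into the frame
WebElement dataTable = driver.findElement(By.tagName("table"));
for (WebElement row : dataTable.findElements(By.tagName("tr"))) {
    for (WebElement cell : row.findElements(By.tagName("td"))) {
        System.out.println(cell.getText());
    }
}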

Related

UnexpectedTagNameException: Element should have been "select" but was "a" while trying to get the texts of dropdown menu using Selenium and Java

While trying to get the menu list, I'm getting this error message:
Exception in thread "main" org.openqa.selenium.support.ui.UnexpectedTagNameException: Element should have been "select" but was "a".
Here is the code:
public static void main(String[] args) {
    // TODO Auto-generated method stub
    System.setProperty("webdriver.chrome.driver", "D:\\selenium files\\chromedriver_win32_new\\chromedriver.exe");
    WebDriver driver = new ChromeDriver();
    driver.manage().window().maximize();
    driver.get("https://www.tutorialspoint.com/tutor_connect/index.php");
    driver.manage().timeouts().implicitlyWait(10, TimeUnit.SECONDS);
    WebElement ele = driver.findElement(By.xpath("//*[@id=\"logo-menu\"]/div/div[1]/div/a"));
    Select s = new Select(ele);
    // getting list of menu
    List<WebElement> op = s.getOptions();
    int size = op.size();
    for (int i = 0; i < size; i++) {
        String options = op.get(i).getText();
        System.out.println(options);
    }
}
}
That is because the element you are trying to cast is an <a> (link) tag and not a select tag.
You need to give the XPath or CSS of the correct select element and then cast it from a WebElement into a Select object.
In the example you are using there is no real select element; you first need to click on the button that says "Categories" and then take the options that appear:
WebElement button = driver.findElement(By.cssSelector("div[class='mui-dropdown']"));
button.click();
WebElement selectObj = driver.findElement(By.cssSelector("ul[class*='mui--is-open']"));
Select s = new Select(selectObj);
The desired element is not a <select> element but a <ul> element. The classname mui--is-open is appended to the desired <ul> element only after you click on the <a> element.
Solution
So to get the contents of the dropdown menu you need to induce a WebDriverWait for visibilityOfAllElementsLocatedBy(), and you can use Java 8 stream() and map() with either of the following locator strategies:
Using cssSelector:
new WebDriverWait(driver, 20).until(ExpectedConditions.elementToBeClickable(By.cssSelector("a.mui-btn.mui-btn--primary.categories"))).click();
System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.cssSelector("ul.mui-dropdown__menu.cat-menu.mui--is-open a"))).stream().map(element->element.getText()).collect(Collectors.toList()));
Using xpath:
new WebDriverWait(driver, 20).until(ExpectedConditions.elementToBeClickable(By.xpath("//a[@class='mui-btn mui-btn--primary categories']"))).click();
System.out.println(new WebDriverWait(driver, 20).until(ExpectedConditions.visibilityOfAllElementsLocatedBy(By.xpath("//ul[@class='mui-dropdown__menu cat-menu mui--is-open']//a"))).stream().map(element->element.getText()).collect(Collectors.toList()));
References
You can find a couple of relevant detailed discussions in:
How to extract the text iterating specific rows within a table using XPath with Selenium and Java
How to extract the dynamic values of the id attributes of the table elements using Selenium and Java
How to print runs scored by a batsmen in a scoreboard format webelements through CSS selector using Selenium and Java

Eliminating duplicate links on the webpage and avoiding the "link is stale" error

I have a list of 20 links and some of them are duplicates. I click on the first link, which leads me to the next page, and I download some files from that page.
Page 1
Link 1
Link 2
Link 3
link 1
link 3
link 4
link 2
Link 1 (click) --> (opens) Page 2
Page 2 (click browser back button) --> (goes back to) Page 1
Now I click on Link 2 and repeat the same thing.
System.setProperty("webdriver.chrome.driver", "C:\\chromedriver.exe");
String fileDownloadPath = "C:\\Users\\Public\\Downloads";

// Set properties to suppress popups and download PDFs directly
Map<String, Object> prefsMap = new HashMap<String, Object>();
prefsMap.put("profile.default_content_settings.popups", 0);
prefsMap.put("download.default_directory", fileDownloadPath);
prefsMap.put("plugins.always_open_pdf_externally", true);
prefsMap.put("safebrowsing.enabled", "false");

// assign driver properties
ChromeOptions option = new ChromeOptions();
option.setExperimentalOption("prefs", prefsMap);
option.addArguments("--test-type");
option.addArguments("--disable-extensions");
option.addArguments("--safebrowsing-disable-download-protection");
option.addArguments("--safebrowsing-disable-extension-blacklist");

WebDriver driver = new ChromeDriver(option);
driver.get("http://www.mywebpage.com/");

List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')]"));
Thread.sleep(500);
int pageSize = listOfLinks.size();
String linkText;
System.out.println("The number of links in the page is: " + pageSize);

// iterate through all the links on the page
for (int i = 0; i < pageSize; i++) {
    System.out.println("Clicking on link: " + i);
    try {
        linkText = listOfLinks.get(i).getText();
        listOfLinks.get(i).click();
    } catch (org.openqa.selenium.StaleElementReferenceException ex) {
        // re-locate the links after a page reload and retry
        listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')]"));
        linkText = listOfLinks.get(i).getText();
        listOfLinks.get(i).click();
    }
    try {
        driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
    } catch (org.openqa.selenium.NoSuchElementException ee) {
        driver.navigate().back();
        Thread.sleep(300);
        continue;
    }
    Thread.sleep(300);
    driver.navigate().back();
    Thread.sleep(100);
}
The code works fine: it clicks on all the links and downloads the files. Now I need to improve the logic to omit the duplicate links. I tried to filter the duplicates out of the list, but then I am not sure how to handle the org.openqa.selenium.StaleElementReferenceException. The solution I am looking for is to click on the first occurrence of a link and avoid clicking on it if it re-occurs.
(This is part of a complex logic to download multiple files from a portal that I don't have control over, so please don't come back with questions like why there are duplicate links on the page in the first place.)
First, I don't suggest doing findElements requests to the WebDriver repeatedly; you will see a lot of performance issues following that path, especially if you have a lot of links and pages.
Also, if you always work in the same tab you have to wait for two page loads per link (the page with the links and the download page); if you open each link in a new tab, you only need to wait for the page where you will download.
I have a suggestion: just take the distinct links as @supputuri said and open each link in a NEW tab. This way you don't need to handle staleness, you don't need to search the screen for the links every time, and you don't need to wait for the links page to reload on each iteration.
List<WebElement> uniqueLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
for (int i = 0; i < uniqueLinks.size(); i++) {
    // Ctrl+click to open the link in a new tab
    new Actions(driver)
            .keyDown(Keys.CONTROL)
            .click(uniqueLinks.get(i))
            .keyUp(Keys.CONTROL)
            .build()
            .perform();
    // if you want you can create the array here on this line instead of creating it inside the method below
    driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(1));
    // do your wait stuff.
    driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
    // do your wait stuff.
    driver.close();
    driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(0));
}
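A possible way to fill in the "do your wait stuff" placeholders above (a sketch only, using the same Selenium 3 style WebDriverWait as elsewhere in this thread) is to wait for the second tab to exist and for the download control to become clickable:
// Wait until the new tab is open before switching to it
new WebDriverWait(driver, 20).until(ExpectedConditions.numberOfWindowsToBe(2));
driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(1));
// Wait until the download control is clickable before clicking it
new WebDriverWait(driver, 20)
        .until(ExpectedConditions.elementToBeClickable(By.xpath("//span[contains(@title,'download')]")))
        .click();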
I'm not in a place where I can test this code properly right now; if you find any issues with it, just comment and I will update the answer, but the idea is right and it's pretty simple.
First, let's look at the XPath.
Sample HTML:
<!DOCTYPE html>
<html>
<body>
<div>
<a href='https://google.com'>Google</a>
<a href='https://yahoo.com'>Yahoo</a>
<a href='https://google.com'>Google</a>
<a href='https://msn.com'>MSN</a>
</div>
</body>
</html>
Let's see the XPath that gets the distinct links out of the above:
//a[not(@href = following::a/@href)]
The logic in the XPath is that we make sure the href of a link does not match the href of any following link; if it does match, the link is considered a duplicate and the XPath does not return that element.
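As a quick way to try this out (a sketch, not part of the original answer; it assumes the sample HTML above was saved to C:\temp\sample.html and that chromedriver is on the PATH):
WebDriver driver = new ChromeDriver();
driver.get("file:///C:/temp/sample.html");
List<WebElement> distinctLinks = driver.findElements(By.xpath("//a[not(@href = following::a/@href)]"));
// Prints Yahoo, Google, MSN: the XPath keeps the last occurrence of each href
for (WebElement link : distinctLinks) {
    System.out.println(link.getText());
}
driver.quit();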
Stale Element:
So, now it's time to handle the stale element issue in your code.
The moment you click on Link 1, all the references stored in listOfLinks become invalid: Selenium assigns new references to the elements each time they are loaded on the page, and when you try to access an element through an old reference you get the stale element exception.
Here is a snippet of code that should give you an idea.
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
Thread.sleep(500);
pageSize = listOfLinks.size();
System.out.println("The number of links in the page is: " + pageSize);
// iterate through all the links on the page
for (int i = 0; i < pageSize; i++) {
    // ===> consider adding an explicit wait with WebDriverWait for the link elements located by the
    //      "//a[contains(@href,'Link')][not(@href = following::a/@href)]" xpath to be present; don't hard-code the sleep
    // ===> added this line: re-locate the i-th unique link on every iteration
    WebElement link = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")).get(i);
    System.out.println("Clicking on link: " + i);
    // ===> updated next 2 lines
    linkText = link.getText();
    link.click();
    // ===> consider adding an explicit wait using WebDriverWait to make sure the span exists before clicking
    driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
    // ===> check this answer (https://stackoverflow.com/questions/34548041/selenium-give-file-name-when-downloading/56570364#56570364)
    //      to make sure the download is completed before clicking browser back, rather than sleeping for x seconds
    driver.navigate().back();
    // ===> removed the hard-coded wait time (sleep)
}
Edit 1:
If you want to open each link in a new window, use the logic below.
WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.presenceOfAllElementsLocatedBy(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")));
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
for (WebElement link : listOfLinks) {
    // get the href
    String href = link.getAttribute("href");
    // open the link in a new tab
    js.executeScript("window.open('" + href + "')");
    // switch to the new tab
    ArrayList<String> tabs = new ArrayList<String>(driver.getWindowHandles());
    driver.switchTo().window(tabs.get(1));
    // click on download
    // close the new tab
    driver.close();
    // switch to the parent window
    driver.switchTo().window(tabs.get(0));
}
You can do it like this:
Save the index of each element in the list to a hashtable, keyed by the link text.
If the hashtable already contains that text, skip it.
Once done, the hashtable holds only unique links, i.e. the first-found ones.
The values of the hashtable are the indexes into listOfLinks.
Hashtable<String, Integer> hs1 = new Hashtable<String, Integer>();
for (int i = 0; i < listOfLinks.size(); i++) {
    String text = listOfLinks.get(i).getText();
    if (!hs1.containsKey(text)) {
        hs1.put(text, i);   // remember the index of the first occurrence
    }
}
for (int i : hs1.values()) {
    listOfLinks.get(i).click();
}
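Note that a Hashtable does not keep insertion order, so the clicks above may not happen in page order. If you want to click the links in the order they first appear on the page, a LinkedHashMap-based variant (a sketch, not part of the original answer) preserves that order:
// java.util.LinkedHashMap iterates in insertion order, i.e. the order in which each link text was first seen
Map<String, Integer> firstOccurrence = new LinkedHashMap<>();
for (int i = 0; i < listOfLinks.size(); i++) {
    firstOccurrence.putIfAbsent(listOfLinks.get(i).getText(), i);
}
for (int index : firstOccurrence.values()) {
    listOfLinks.get(index).click();
}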

How to get the text from the first column of a table?

I have tried the code below to obtain the text from a table and it works fine, but I want to pick the data from the first column alone. How can I grab the text from the first column?
WebElement tableContents = pubDriver.findElement(By.id("view_table"));
List<WebElement> rows = tableContents.findElements(By.tagName("tr"));
for (int rnum = 0; rnum < rows.size(); rnum++) {
    List<WebElement> columns = rows.get(rnum).findElements(By.tagName("td"));
    for (int cnum = 0; cnum < columns.size(); cnum++) {
        System.out.println(columns.get(cnum).getText());
    }
}
I have also tried the line below, but I didn't get the text from the first column:
columns.get(0).getText();
You can do it like this:
// Create a new instance of the Firefox driver
// Notice that the remainder of the code relies on the interface,
// not the implementation.
WebDriver driver = new FirefoxDriver();
// And now use this to visit the page with the table
driver.get("http://localhost:8081/TestXmlDisplay/tabletest.html");
WebElement tableContents = driver.findElement(By.tagName("table"));
List<WebElement> rows = tableContents.findElements(By.tagName("tr"));
for (int rnum = 0; rnum < rows.size(); rnum++) {
    List<WebElement> columns = rows.get(rnum).findElements(By.tagName("td"));
    // Only read the first td of each row
    System.out.println(columns.get(0).getText());
}
// driver.quit();
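Alternatively, a shorter sketch (assuming the same table structure, with the first-column cells as plain td elements) locates the first-column cells directly with a single XPath:
// Grab every first <td> of every row in one query
List<WebElement> firstColumnCells = driver.findElements(By.xpath("//table//tr/td[1]"));
for (WebElement cell : firstColumnCells) {
    System.out.println(cell.getText());
}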

Search an element in all pages in Selenium WebDriver (Pagination)

I need to search for a particular text in a table across all the pages. Say I have to search for the text "xxx", and this text is present in the 5th row of the table on the 3rd page.
I have tried the following code:
List<WebElement> allrows = table.findElements(By.xpath("//div[@id='table']/table/tbody/tr"));
List<WebElement> allpages = driver.findElements(By.xpath("//div[@id='page-navigation']//a"));
System.out.println("Total pages :" + allpages.size());
for (int i = 0; i <= (allpages.size()); i++) {
    for (int row = 1; row <= allrows.size(); row++) {
        System.out.println("Total rows :" + allrows.size());
        String name = driver.findElement(By.xpath("//div[@id='table']/table/tbody/tr[" + row + "]/td[1]")).getText();
        //System.out.println(name);
        System.out.println("Row loop");
        if (name.contains("xxxx")) {
            WebElement editbutton = table.findElement(By.xpath("//div[@id='table']/table/tbody/tr[" + row + "]/td[3]"));
            editbutton.click();
            break;
        } else {
            System.out.println("Element doesn't exist");
        }
        allpages = driver.findElements(By.xpath("//div[@id='page-navigation']//a"));
    }
    allpages = driver.findElements(By.xpath("//div[@id='page-navigation']//a"));
    driver.manage().timeouts().pageLoadTimeout(5, TimeUnit.SECONDS);
    allpages.get(i).click();
}
Sorry, I forgot to describe the error. The code executes properly: it checks for the text "xxx" in each row of every page and clicks on editbutton when it is found.
After that it moves to
allpages.get(i).click(); // the code that clicks through the pages
But it is unable to find any pagination, so it fails with the error "Element is not clickable at point (893, 731). Other element would receive the click...."
For every page loop you use the same table WebElement object, so I assume that after going to the next page you get a StaleElementReferenceException. I guess the solution could be to define the table on every page loop: move the line List<WebElement> allrows = table.findElements(By.xpath("//div[@id='table']/table/tbody/tr")); inside the loop, after for (int i = 0; i <= (allpages.size()); i++), too.
EDIT: And, by the way, at the line allpages.get(i).click() I think you must click the next page link, not the current one, as it seems you do now.
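A rough sketch of that restructuring (using the question's locators; whether allpages.get(i + 1) really is the next-page link depends on the pagination markup, so treat that as an assumption):
List<WebElement> allpages = driver.findElements(By.xpath("//div[@id='page-navigation']//a"));
search:
for (int i = 0; i < allpages.size(); i++) {
    // Re-locate the table and its rows on every page to avoid stale references
    WebElement table = driver.findElement(By.xpath("//div[@id='table']/table"));
    List<WebElement> allrows = table.findElements(By.xpath(".//tbody/tr"));
    for (int row = 1; row <= allrows.size(); row++) {
        String name = table.findElement(By.xpath(".//tbody/tr[" + row + "]/td[1]")).getText();
        if (name.contains("xxxx")) {
            table.findElement(By.xpath(".//tbody/tr[" + row + "]/td[3]")).click();
            break search;   // stop once the row is found and the edit button is clicked
        }
    }
    // Re-find the pagination links and click the NEXT page, not the current one
    allpages = driver.findElements(By.xpath("//div[@id='page-navigation']//a"));
    if (i + 1 < allpages.size()) {
        allpages.get(i + 1).click();
    }
}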

Selenium WebDriver Java locating two different table cells

I'm using Selenium WebDriver in Java. I have a table, and I would like to get hold of the last cell of the first row and the last cell of the last row. I manage to get one of them:
WebElement table = driver.findElement(By.className("dataTable"));
List<WebElement> rows = table.findElements(By.tagName("tr"));
WebElement firstrow = rows.get(0);
WebElement lastrow = rows.get(rows.size() - 1);
List<WebElement> firstcells = firstrow.findElements(By.tagName("td"));
List<WebElement> lastcells = lastrow.findElements(By.tagName("td"));
firstcells.get(6).getText();
This feels clumsy because I'm locating the td tags twice. Any hints on how to get both cells nicely? I have no identifiers in my rows or cells.
You can use XPath to get the elements:
WebElement lastCellInFirstRow = driver.findElement(By.xpath("//table[@class='dataTable']//tr[1]//td[last()]"));
WebElement lastCellInLastRow = driver.findElement(By.xpath("//table[@class='dataTable']//tr[last()]//td[last()]"));
Here's the xpath specification. You can play with xpath here.
You can try to do it with CSS selectors:
String cssLastCellFirstRow = "table[class='dataTable']>tr:first-child>td:last-child";
String cssLastCellLastRow = "table[class='dataTable']>tr:last-child>td:last-child";
It will be something like this:
driver.findElement(By.cssSelector(cssLastCellFirstRow)).getText();
driver.findElement(By.cssSelector(cssLastCellLastRow)).getText();
Another approach is using JavaScript:
// Note: this assumes jQuery ($) is available on the page
String getText(String cssSel) {
    JavascriptExecutor js = (JavascriptExecutor) driver;
    StringBuilder stringBuilder = new StringBuilder();
    stringBuilder.append("var x = $(\"" + cssSel + "\");");
    stringBuilder.append("return x.text().toString();");
    return (String) js.executeScript(stringBuilder.toString());
}
String text1 = getText(cssLastCellFirstRow);
String text2 = getText(cssLastCellLastRow);
But always make sure that you have located the elements properly (e.g. using the FirePath or Firebug add-ons in Firefox).
The TableDriver extension (https://github.com/jkindwall/TableDriver.Java) offers a nice clean way to handle things like this. If your table has headers, you can (and should) identify the cell column by its header text, but if you don't have headers, you can still do something like this:
Table table = Table.createWithNoHeaders(driver.findElement(By.className("dataTable")), 0);
WebElement firstRowLastCell = table.findCell(0, table.getColumnCount() - 1);
WebElement lastRowLastCell = table.findCell(table.getRowCount() - 1, table.getColumnCount() - 1);
