How to read YouTube comments using Selenium? - java

I'm trying to read YouTube video comments using the following code:
FirefoxDriver driver = new FirefoxDriver();
driver.get("https://www.youtube.com/watch?v=JcbBNpYkuW4");
WebElement element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText()); // this prints: loading..
// scroll down so that comments start to load
driver.executeScript("window.scrollBy(0,500)", "");
Thread.sleep(10000);
element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText());
The last statement prints an empty string. Why?

It's a little tricky because all the comments are written in a separate iframe tag inside the watch-discussion element. You will have to switch to that iframe first using driver.switchTo().frame("put ID or Name here"); but the iframe id is a random value. After switching to that iframe you can find all the comments in divs that have the class name 'Ct', so you can get them using XPath. See the working code below:
FirefoxDriver driver = new FirefoxDriver();
driver.get("https://www.youtube.com/watch?v=JcbBNpYkuW4");
WebElement element = driver.findElementByCssSelector("#watch-discussion");
System.out.println(element.getText()); // this prints: loading..
// scroll down so that comments start to load
driver.executeScript("window.scrollBy(0,500)", "");
Thread.sleep(20000);
List<WebElement> iframes = driver.findElements(By.xpath("//iframe"));
for (WebElement e : iframes) {
    if (e.getAttribute("id") != null && e.getAttribute("id").startsWith("I0_")) {
        // switch to the iframe which contains the comments
        driver.switchTo().frame(e);
        break;
    }
}
// fetch all comments
List<WebElement> comments = driver.findElements(By.xpath("//div[@class='Ct']"));
for (WebElement e : comments) {
    System.out.println(e.getText());
}

I suggest you try this API instead, which is very easy and reliable, rather than relying on the XPath of the elements. You cannot rely on XPath for dynamic pages/content.

Related

Selenium/Java No Such Element Exception for elements of a page after following a link from homepage

New to automation and could use some help here.
I am using Selenium WebDriver and Java on this website - Webdriver University - and so far this code has been throwing a NoSuchElementException at the element.click() step (i.e., it doesn't find the element on the page):
driver.manage().window().maximize();
driver.get("http://webdriveruniversity.com");
Thread.sleep(3000);
// Follow the link to another page
WebElement link = driver.findElementByXPath("(//div[@class=\"section-title\"])[6]");
link.click();
Thread.sleep(3000);
// Click on the element
WebElement element = driver.findElementByXPath("(//button[@class='accordion'])[1]");
element.click();
However, when I go to the linked page directly, it finds the element just fine:
driver.manage().window().maximize();
driver.get("http://webdriveruniversity.com/Accordion/index.html");
// Click on the element
WebElement element = driver.findElementByXPath("(//button[@class='accordion'])[1]");
element.click();
I've tried waiting for element visibility and Thread.sleep(); same results.
Any idea what could be the issue here?
Did you notice that when you click the link, the page opens in a new tab? That is your issue.
You need to switch to the new tab.
ArrayList<String> tabs = new ArrayList<String> (driver.getWindowHandles());
driver.switchTo().window(tabs.get(1)); // here you switch to the second tab
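One caveat worth adding: getWindowHandles() returns a Set, and Selenium does not guarantee its iteration order, so tabs.get(1) is not guaranteed to be the new tab. A safer approach is to pick whichever handle differs from the original window's handle. Here is a minimal, Selenium-free sketch of that selection logic (the class name and handle strings are made up for illustration):

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class HandlePicker {

    // Return the one handle that is not the original window's handle.
    // Works regardless of the Set's iteration order.
    static String newWindowHandle(Set<String> allHandles, String originalHandle) {
        return allHandles.stream()
                .filter(h -> !h.equals(originalHandle))
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("no new window was opened"));
    }

    public static void main(String[] args) {
        // hypothetical handle values; real ones come from driver.getWindowHandles()
        Set<String> handles = new LinkedHashSet<>(Set.of("CDwindow-AAA", "CDwindow-BBB"));
        System.out.println(newWindowHandle(handles, "CDwindow-AAA")); // CDwindow-BBB
    }
}
```

In real Selenium code you would capture String original = driver.getWindowHandle() before the click, then call driver.switchTo().window(newWindowHandle(driver.getWindowHandles(), original)).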
Hope the code below solves your issue. It uses getWindowHandles() to capture the handle of the newly opened tab and switch to that tab:
// Follow the link to another page
WebElement link = driver.findElement(By.xpath("(//div[@class=\"section-title\"])[6]"));
link.click();
Set<String> allWindow = driver.getWindowHandles();
Iterator<String> itr = allWindow.iterator();
while (itr.hasNext()) {
String wind = itr.next().toString();
driver.switchTo().window(wind);
}
Thread.sleep(3000);
// Click on the element
WebElement element = driver.findElement(By.xpath("(//button[@class='accordion'])[1]"));
element.click();
Thread.sleep(3000);
driver.close();

Eliminating duplicate links on the webpage and avoiding the stale-element error

I have a list of 20 links and some of them are duplicates. I click on the first link, which leads me to the next page, and I download some files from that page.
Page 1
Link 1
Link 2
Link 3
link 1
link 3
link 4
link 2
Link 1 (click) --> (opens) Page 2
Page 2 (click back button browser) --> (goes back to) Page 1
Now I click on Link 2 and repeat the same thing.
System.setProperty("webdriver.chrome.driver", "C:\\chromedriver.exe");
String fileDownloadPath = "C:\\Users\\Public\\Downloads";
//Set properties to suppress popups
Map<String, Object> prefsMap = new HashMap<String, Object>();
prefsMap.put("profile.default_content_settings.popups", 0);
prefsMap.put("download.default_directory", fileDownloadPath);
prefsMap.put("plugins.always_open_pdf_externally", true);
prefsMap.put("safebrowsing.enabled", "false");
//assign driver properties
ChromeOptions option = new ChromeOptions();
option.setExperimentalOption("prefs", prefsMap);
option.addArguments("--test-type");
option.addArguments("--disable-extensions");
option.addArguments("--safebrowsing-disable-download-protection");
option.addArguments("--safebrowsing-disable-extension-blacklist");
WebDriver driver = new ChromeDriver(option);
driver.get("http://www.mywebpage.com/");
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')]"));
Thread.sleep(500);
pageSize = listOfLinks.size();
System.out.println( "The number of links in the page is: " + pageSize);
//iterate through all the links on the page
for ( int i = 0; i < pageSize; i++)
{
System.out.println( "Clicking on link: " + i );
try
{
linkText = listOfLinks.get(i).getText();
listOfLinks.get(i).click();
}
catch(org.openqa.selenium.StaleElementReferenceException ex)
{
listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')]"));
linkText = listOfLinks.get(i).getText();
listOfLinks.get(i).click();
}
try
{
driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
}
catch (org.openqa.selenium.NoSuchElementException ee)
{
driver.navigate().back();
Thread.sleep(300);
continue;
}
Thread.sleep(300);
driver.navigate().back();
Thread.sleep(100);
}
The code is working fine and clicks on all the links and downloads the files. Now I need to improve the logic to omit the duplicate links. I tried to filter out the duplicates in the list, but then I'm not sure how I should handle the org.openqa.selenium.StaleElementReferenceException. The solution I am looking for is to click on the first occurrence of each link and avoid clicking on the link if it re-occurs.
(This is part of a complex logic to download multiple files from a portal that I don't have control over. Hence, please don't come back with questions like why there are duplicate links on the page in the first place.)
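One way to get first-occurrence-only behavior without fighting stale references at all is to collect the href strings once (a String never goes stale), deduplicate them while preserving order, and then navigate by URL instead of clicking. A minimal sketch of the dedup step, assuming the hrefs were already read with getAttribute("href") (the class name and sample values are made up):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class LinkDedup {

    // LinkedHashSet keeps insertion order, so the first occurrence
    // of each href wins and later duplicates are dropped.
    static List<String> firstOccurrences(List<String> hrefs) {
        return new ArrayList<>(new LinkedHashSet<>(hrefs));
    }

    public static void main(String[] args) {
        List<String> hrefs = List.of("Link1", "Link2", "Link3", "Link1", "Link3", "Link4", "Link2");
        System.out.println(firstOccurrences(hrefs)); // [Link1, Link2, Link3, Link4]
    }
}
```

In the Selenium loop you would then call driver.navigate().to(href) for each entry; since no WebElement is reused across a page load, a StaleElementReferenceException cannot occur.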
First, I don't suggest making repeated findElements requests to the WebDriver; you will see a lot of performance issues down this path, especially if you have a lot of links and pages.
Also, if you do everything in the same tab, you need to wait for two page loads per link (the page of links and the download page); if instead you open each link in a new tab, you only need to wait for the download page to load.
My suggestion: deduplicate the repeated links as @supputuri said and open each link in a NEW tab. This way you don't need to handle staleness, don't need to search the page for the links every time, and don't need to wait for the links page to reload on each iteration.
List<WebElement> uniqueLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
for ( int i = 0; i < uniqueLinks.size(); i++)
{
new Actions(driver)
.keyDown(Keys.CONTROL)
.click(uniqueLinks.get(i))
.keyUp(Keys.CONTROL)
.build()
.perform();
// if you want, you can create the list on this line instead of creating it inline below
driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(1));
//do your wait stuff.
driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
//do your wait stuff.
driver.close();
driver.switchTo().window(new ArrayList<>(driver.getWindowHandles()).get(0));
}
I'm not in a place where I can test this code properly right now; if there are any issues with it, just comment and I will update the answer, but the idea is right and it's pretty simple.
First, let's look at the XPath.
Sample HTML:
<!DOCTYPE html>
<html>
<body>
<div>
<a href='https://google.com'>Google</a>
<a href='https://yahoo.com'>Yahoo</a>
<a href='https://google.com'>Google</a>
<a href='https://msn.com'>MSN</a>
</div>
</body>
</html>
Let's see the XPath that gets the distinct links out of the above:
//a[not(@href = following::a/@href)]
The logic of this XPath is to make sure the href of each link does not match the href of any following link; if it matches, the link is considered a duplicate and the XPath does not return that element.
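You can verify this XPath without a browser, e.g. with Java's built-in javax.xml.xpath support against a small XHTML snippet mirroring the sample above (the class and method names here are made up for the demo):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DistinctHrefDemo {

    // Evaluate the dedup XPath against a small XHTML snippet and
    // return the text of each link it keeps, in document order.
    static List<String> distinctLinkTexts(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//a[not(@href = following::a/@href)]",
                          doc, XPathConstants.NODESET);
        List<String> texts = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            texts.add(nodes.item(i).getTextContent());
        }
        return texts;
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<body>"
                + "<a href='https://google.com'>Google</a>"
                + "<a href='https://yahoo.com'>Yahoo</a>"
                + "<a href='https://google.com'>Google</a>"
                + "<a href='https://msn.com'>MSN</a>"
                + "</body>";
        System.out.println(distinctLinkTexts(xhtml)); // [Yahoo, Google, MSN]
    }
}
```

Note that the following:: predicate keeps the last occurrence of each duplicated href; if you need the first occurrence instead, the mirrored predicate //a[not(@href = preceding::a/@href)] keeps the first one.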
Stale Element:
So, now it's time to handle the stale element issue in your code.
The moment you click on Link 1, all the references stored in listOfLinks become invalid, because Selenium assigns new references to the elements each time they load on the page. When you try to access the elements through the old references, you get the stale element exception.
Here is the snippet of code that should give you an idea.
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
Thread.sleep(500);
pageSize = listOfLinks.size();
System.out.println( "The number of links in the page is: " + pageSize);
//iterate through all the links on the page
for ( int i = 0; i < pageSize; i++)
{
// ===> consider adding a step that explicitly waits (WebDriverWait) for the link elements
// located by the "//a[contains(@href,'Link')][not(@href = following::a/@href)]" xpath to be present
// don't hard-code the sleep
// ===> added this line
WebElement link = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")).get(i);
System.out.println( "Clicking on link: " + i );
// ===> updated next 2 lines
linkText = link.getText();
link.click();
// ===> consider adding explicit wait using WebDriverWait to make sure the span exist before clicking.
driver.findElement(By.xpath("//span[contains(@title,'download')]")).click();
// ===> check this answer (https://stackoverflow.com/questions/34548041/selenium-give-file-name-when-downloading/56570364#56570364) to make sure the download is completed before clicking browser back, rather than sleeping for x seconds
driver.navigate().back();
// ===> removed hard coded wait time (sleep)
}
Edit1:
If you want to open the link in the new window then use the below logic.
WebDriverWait wait = new WebDriverWait(driver, 20);
wait.until(ExpectedConditions.presenceOfAllElementsLocatedBy(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]")));
List<WebElement> listOfLinks = driver.findElements(By.xpath("//a[contains(@href,'Link')][not(@href = following::a/@href)]"));
JavascriptExecutor js = (JavascriptExecutor) driver;
for (WebElement link : listOfLinks) {
// get the href
String href = link.getAttribute("href");
// open the link in new tab
js.executeScript("window.open('" + href +"')");
// switch to new tab
ArrayList<String> tabs = new ArrayList<String> (driver.getWindowHandles());
driver.switchTo().window(tabs.get(1));
//click on download
//close the new tab
driver.close();
// switch to parent window
driver.switchTo().window(tabs.get(0));
}
You can do it like this:
Save the index of each element in the list to a Hashtable
If the Hashtable already contains the link text, skip it
Once done, the Hashtable has only unique elements, i.e. the first ones found
The values of the Hashtable are the indexes from listOfLinks
Hashtable<String, Integer> ht = new Hashtable<>();
for (int i = 0; i < listOfLinks.size(); i++) {
    String text = listOfLinks.get(i).getText();
    if (!ht.containsKey(text)) {
        ht.put(text, i);
    }
}
for (int i : ht.values()) {
    listOfLinks.get(i).click();
}
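One caveat about this approach: Hashtable does not preserve insertion order, so iterating its values() may click the links in an arbitrary order. If document order matters, a LinkedHashMap with putIfAbsent does the same first-occurrence bookkeeping while keeping order; here is a Selenium-free sketch over plain link texts (the class and method names are made up):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FirstIndexDemo {

    // Map each distinct text to the index of its first occurrence,
    // preserving the order in which texts first appear.
    static Map<String, Integer> firstIndexByText(List<String> texts) {
        Map<String, Integer> first = new LinkedHashMap<>();
        for (int i = 0; i < texts.size(); i++) {
            first.putIfAbsent(texts.get(i), i); // later duplicates are ignored
        }
        return first;
    }

    public static void main(String[] args) {
        List<String> texts = List.of("Link 1", "Link 2", "Link 1", "Link 3");
        System.out.println(firstIndexByText(texts).values()); // [0, 1, 3]
    }
}
```

In the Selenium version you would build the texts list from listOfLinks via getText() and then click listOfLinks.get(i) for each stored index.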

While fetching all links, ignore the Logout link in the loop and continue navigation in Selenium Java

I am fetching all the links on the page and navigating to each of them.
One of the links is Logout.
How do I skip/ignore the Logout link in the loop?
I want to skip the Logout link and proceed.
List<WebElement> demovar = driver.findElements(By.tagName("a"));
System.out.println(demovar.size());
ArrayList<String> hrefs = new ArrayList<String>(); //List for storing all href values for 'a' tag
for (WebElement var : demovar) {
System.out.println(var.getText()); // used to get text present between the anchor tags
System.out.println(var.getAttribute("href"));
hrefs.add(var.getAttribute("href"));
System.out.println("*************************************");
}
int logoutlinkIndex = 0;
for (WebElement linkElement : demovar) {
if (linkElement.getText().equals("Log Out")) {
logoutlinkIndex = demovar.indexOf(linkElement);
break;
}
}
demovar.remove(logoutlinkIndex);
//Navigating to each link
int i=0;
for (String href : hrefs) {
    driver.navigate().to(href);
    System.out.println((++i) + ": navigated to URL with href: " + href);
    Thread.sleep(5000); // to check that the navigation is happening properly
    System.out.println("+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++");
}
If you want to leave the Logout link out of the loop, then instead of creating the list as driver.findElements(By.tagName("a")), you can use:
driver.findElements(By.xpath("//a[not(contains(.,'Log Out'))]"));
Reference
You can find a couple of relevant discussions in:
Protractor Conditional Selector
How to locate the button element using Selenium through Python
What does contains(., 'some text') refers to within xpath used in Selenium
How does dot(.) in xpath to take multiple form in identifying an element and matching a text
A Java approach to removing the "not interesting" link using the Stream.filter() function (note the negation, so that everything except the logout link is kept):
List<String> hrefs = driver.findElements(By.tagName("a"))
    .stream()
    .filter(link -> !link.getText().equals("Log Out"))
    .map(link -> link.getAttribute("href"))
    .collect(Collectors.toList());
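As a sanity check on the filter direction (the predicate must keep everything except the logout link), here is the same filter over plain strings with no Selenium involved (the class and method names are made up):

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterDemo {

    // Keep every link text except the logout link.
    static List<String> dropLogout(List<String> linkTexts) {
        return linkTexts.stream()
                .filter(text -> !text.equals("Log Out")) // negated: everything but "Log Out" survives
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dropLogout(List.of("Home", "Log Out", "Profile"))); // [Home, Profile]
    }
}
```

Without the negation the stream would keep only the logout link, which is the opposite of what the question asks for.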
Using the XPath != operator to collect only links whose text is not equal to Log Out:
List<String> hrefs = driver.findElements(By.xpath("//a[text() != 'Log Out']"))
.stream()
.map(link -> link.getAttribute("href"))
.collect(Collectors.toList());

Element not found in cache - Selenium

I have a list of URLs and I just want to open each URL in the same browser session. To do that I have written the code below, but it throws an error after opening the first URL, i.e. the second URL is not opened.
findElements = driver.findElements(By.xpath("//*[@id='search-user-found']//p/a"));
for (WebElement webElement : findElements)
{
Thread.sleep(200);
System.out.println(webElement.getAttribute("href"));
driver.navigate().to(webElement.getAttribute("href"));
Thread.sleep(200);
}
Error:
Exception in thread "main" org.openqa.selenium.StaleElementReferenceException: Element not found in the cache - perhaps the page has changed since it was looked up
Please assist.
When you navigate to another page, the DOM changes and the WebDriver loses the elements it previously located. That causes the StaleElementReferenceException. I suggest you save the links as strings and use those.
List<WebElement> findElements = driver.findElements(By.xpath("//*[@id='search-user-found']//p/a"));
List<String> hrefs = new ArrayList<String>();
for (WebElement webElement : findElements)
{
hrefs.add(webElement.getAttribute("href"));
}
for (String href : hrefs)
{
Thread.sleep(200);
System.out.println(href);
driver.navigate().to(href);
Thread.sleep(200);
}

Navigating through an array of profile links in Selenium + Java

I need a script that will navigate through online profiles and return. I have some code that shows me how many online-profile links are on the page:
driver.get("http://mygirlfund.com");
driver.findElement(By.id("email")).sendKeys("somemail");
driver.findElement(By.id("password")).sendKeys("somepass");
driver.findElement(By.id("btn-submit")).submit();
driver.findElement(By.xpath(".//*[@id='btn-2i']/a")).click();
// log in
List<WebElement> allLinks = driver.findElements(By.xpath("//img[@alt='Online Now!']/../..//a"));
// miracle: found the links of all online profiles
System.out.println(allLinks.size());
for (int i = 1; i < allLinks.size(); i++)
{
for (WebElement link : allLinks)
{
link.click();
driver.navigate().back();
// here write a message
}
i++;
// navigating through user profiles
}
So I need to click on a link and then return to the previous (main) page, but it only navigates to the first link and returns back.
What is the outer for-loop for? Why do you initialise i with 1 (instead of 0)? Why do you increment i twice? The inner loop should be sufficient:
List<WebElement> allLinks = driver.findElements(By.xpath("//img[@alt='Online Now!']/../..//a"));
for (WebElement link : allLinks) {
link.click();
driver.navigate().back();
}
Alternatively, you could retrieve the web elements one by one in a for loop like this (but this will throw an exception if there are fewer than 25 links):
for (int i = 0; i < 25; i++) {
String xpath = "//img[@alt='Online Now!']/../..//a[" + (i+1) + "]";
WebElement link = driver.findElement(By.xpath(xpath));
link.click();
//....
}
I have discovered that when the webpage updates, the sequence of profile links breaks down. So the decision was to open each profile link in a new window, do some action there, and close it.
As the guys above said, using two loops was a bad decision. This code works perfectly for me:
for (WebElement link : driver.findElements(By.xpath("//img[@alt='Online Now!']/../..//a"))) {
String originalWindow =driver.getWindowHandle();
System.out.println("Original handle is: "+ originalWindow);
//open link in new window
act.contextClick(link).perform();
act.sendKeys("w").perform();
Thread.sleep(4000);
for (String newWindow : driver.getWindowHandles())
{
driver.switchTo().window(newWindow);
System.out.println("NOW THE CURRENT Handle is: "+ newWindow);
}
Thread.sleep(2000);
//here write a message
driver.close();
driver.switchTo().window(originalWindow);
}
Note:
When I store the found links in a variable and use it in the loop:
List<WebElement> allLinks = driver.findElements(By.xpath("//img[@alt='Online Now!']/../..//a"));
//have found links of all online profiles
System.out.println(allLinks.size());
for (WebElement link : allLinks)
{
String originalWindow =driver.getWindowHandle();
System.out.println("Original handle is: "+ originalWindow);
//open link in new window
act.contextClick(link).perform();
act.sendKeys("w").perform();
Thread.sleep(4000);
//continue handling new window
My script opens just the first found link over and over.
Maybe it will be useful for someone. Thanks all!
