I need to get the download link in this table:
<table cellpadding="0" cellspacing="3" border="0">
<tr>
<td><img class="img" src="...path" /></td>
<td>File -
<a id="1569" class="tepLink" href="javascript:void(0);">[Click me]</a>
</td>
</tr>
</table>
and this is what I tried:
Element table = doc.select("table[cellpadding=\"0\" cellspacing=\"3\" border=\"0\"]").first();
Element dwlLink = table.select("td:has(a)").first();
String absPath = dwlLink.attr("abs:href");
//use download manager to download from string absPath
I always get a "null object reference", so something must be wrong with that code. What should I do?
Just select all anchor tags and then get the first element in the Elements object.
Elements anchorTags = doc.select("table[cellpadding=0][cellspacing=3][border=0] a");
if(anchorTags.isEmpty())
{
System.out.println("Not found");
}
else
{
System.out.println(anchorTags.first());
}
EDIT:
I changed the select method to include the cellpadding, cellspacing and border attributes since that seems like what you were after in one of your examples.
Also, the Elements.first() method returns null if the Elements list is empty. Always check for null when calling that method to prevent NullPointerExceptions.
table.select("td:has(a)").first(); will select the first <td> element that contains an anchor. It will not select the anchor <a> itself, so the href attribute is not found on it.
Here is what you can do:
Element aEl = doc.select("table[cellpadding] td a").first();
I'm new to jsoup and am trying to grab the value of the "data-original-title" attribute, but I'm getting an empty string. I want the value
Jul-30-2015 03:26:13 PM
<table class="table table-hover">
<thead>
<tr style="border-color: #E1E1E1; border-width: 1px; background-color: #F9F9F9; border-top-style: solid;">
<th>Height</th>
<th>Age</th>
<th>txn</th>
<th>Uncles</th>
<th>Miner</th>
<th>GasUsed</th>
<th>GasLimit</th>
<th>Avg.GasPrice</th>
<th>Reward</th>
</tr>
</thead>
<tbody>
<tr><td></td>
<td>
<span rel="tooltip" data-placement="bottom" title="" data-original-title="Jul-30-2015 03:26:13 PM">1149 days 18 hrs ago</span>
</td>
My code is
for (int i = total_pages; i >= 1; i--) {
System.out.println("\nDisplaying blocks on page " + i);
String newString = "https://etherscan.io/blocks?p=" + i;
Document d3 = Jsoup.connect(newString).get();
Elements e = d3.select("table.table-hover > tbody");
Elements r = e.get(0).select("tr");
for (Element cr : r) {
Elements test = d3.select("span");
System.out.println(test.attr("data-original-title"));
}
}
Any help would be appreciated. I modified the attribute value to get data placement value and it is being retrieved correctly. But the data-original-title still returns empty string.
Data attributes are a special kind of attribute, so accessing them is a bit different but still very easy.
Instead of
System.out.println(test.attr("data-original-title"));
use:
System.out.println(test.first().dataset().get("original-title"));
You can try to see if this works :
d3.select("span[data-original-title]").get(0).attr("data-original-title")
Explanation :
This looks for the first span containing attribute "data-original-title" and gets the value of that attribute.
I'm trying to extract some data (see HTML below). I would like to extract the people who are in HR, but only their first and last names.
HTML:
<tbody>
<tr>
<td>Peter</td>
<td>Smith</td>
<td>35</td>
<td>HR</td>
</tr>
<tr>
<td>Paul</td>
<td>Roberts</td>
<td>47</td>
<td>Legal</td>
</tr>
<tr>
<td>James</td>
<td>Griffin </td>
<td>23</td>
<td>HR</td>
</tr>
</tbody>
What I want to extract:
Peter Smith
James Griffin
What I have so far:
public class Extract {
public static void main(String[] args) throws IOException {
Document Page = Jsoup.connect("URL").get(); //pick up html
Element List = Page.select("tbody").first();
Elements Info = List.select("tr");
for(Element value: Info)
{
System.out.println(value.select("td").first()); //first <td> ... </td>
System.out.println(value.select("td").second() + "\n"); //??? Trying to take the second <td> ... </td>
}
}
}
I would suggest putting a class on every td that holds a first name or a last name, like:
<td class="first-name">Peter</td>
<td class="last-name">Smith</td>
<td>35</td>
<td>HR</td>
Then calling your JSoup select within the for loop like:
Elements firstNames = value.select(".first-name");
Elements lastNames = value.select(".last-name");
Or something along those lines (note that select returns Elements, not a single Element). The point is that selecting by class would be better and would ensure you get nothing but the names.
If you don't control the input, you can also select the cells by position:
Elements firstNames = value.select("td:eq(0)");
Elements lastNames = value.select("td:eq(1)");
However, this requires that you are sure the information is always in the same order.
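Neither suggestion above filters on the department column, which the question also asks for. As a rough sketch of the positional approach combined with that filter, the snippet below runs the question's rows through the JDK's built-in XPath API instead of jsoup. That is a swapped-in technique, chosen only so the example runs without extra libraries. The class name HrNamesSketch and the method hrNames are made up for illustration. Keep in mind that XPath positions are 1-based, whereas jsoup's :eq() index is 0-based.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HrNamesSketch {
    // The rows from the question, kept well-formed so an XML parser accepts them.
    static final String ROWS =
        "<tbody>"
        + "<tr><td>Peter</td><td>Smith</td><td>35</td><td>HR</td></tr>"
        + "<tr><td>Paul</td><td>Roberts</td><td>47</td><td>Legal</td></tr>"
        + "<tr><td>James</td><td>Griffin </td><td>23</td><td>HR</td></tr>"
        + "</tbody>";

    // Returns "first last" for every row whose fourth cell is HR.
    static List<String> hrNames() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(ROWS)));
            XPath xp = XPathFactory.newInstance().newXPath();

            // Keep only the rows whose fourth cell (1-based) reads HR.
            NodeList rows = (NodeList) xp.evaluate(
                "/tbody/tr[normalize-space(td[4]) = 'HR']", doc, XPathConstants.NODESET);

            List<String> names = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                // normalize-space trims stray whitespace such as the one after "Griffin".
                String first = xp.evaluate("normalize-space(td[1])", rows.item(i));
                String last = xp.evaluate("normalize-space(td[2])", rows.item(i));
                names.add(first + " " + last);
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        hrNames().forEach(System.out::println);
    }
}
```

With jsoup the same idea would presumably be a check on the text of the fourth cell before reading the first two, but that variant is not shown here.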
Here is the sample HTML Code :
<table width="100%" cellspacing="0" cellpadding="0" border="0">
<tbody>
<tr>
<tr class="tinyfont">
<tr height="2px">
<tr height="1px">
<tr height="1px">
<tr>
<tr height="2px">
<tr height="1px">
<tr height="1px">
<tr height="2px">
</tbody>
</table>
I am using Selenium WebDriver.
I get all the child elements with this code, but now I want to exclude one particular child element. How can I exclude one of the child elements from my list?
I want to exclude the tr[6] child element.
List<WebElement> list = driver.findElements(By.xpath("/html/body/table/tbody //*"));
ArrayList<String> al1 = new ArrayList<String>();
for(WebElement ele:list){
String className = ele.getAttribute("class");
System.out.println("Class name = "+className);
al1.add(className);
}
Thanks in Advance!!
Either omit the 6th table row, then select all descendants:
/html/body/table/tbody/tr[position() != 6]//*
or only select the table rows that are at position 1 or have an attribute (and then select their descendants):
/html/body/table/tbody/tr[position() = 1 or @*]//*
or to be more specific, also check the attribute name:
/html/body/table/tbody/tr[position() = 1 or @height or @class]//*
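These expressions can be tried outside the browser with the JDK's own XPath API. The sketch below is only an illustration under assumptions: the table is a made-up, well-formed stand-in for the question's markup (ten rows, rows 1 and 6 without attributes, one cell per row whose class records the row number), the paths start at /table rather than /html/body/table, and RowExclusionSketch and classesFor are hypothetical names. Attribute tests are written with @, as usual in XPath. In Selenium the same strings would be passed to driver.findElements(By.xpath(...)).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class RowExclusionSketch {
    // Hypothetical stand-in for the question's table. Rows 1 and 6 carry no attributes.
    static final String XML =
        "<table><tbody>"
        + "<tr><td class='c1'/></tr>"
        + "<tr class='tinyfont'><td class='c2'/></tr>"
        + "<tr height='2px'><td class='c3'/></tr>"
        + "<tr height='1px'><td class='c4'/></tr>"
        + "<tr height='1px'><td class='c5'/></tr>"
        + "<tr><td class='c6'/></tr>"
        + "<tr height='2px'><td class='c7'/></tr>"
        + "<tr height='1px'><td class='c8'/></tr>"
        + "<tr height='1px'><td class='c9'/></tr>"
        + "<tr height='2px'><td class='c10'/></tr>"
        + "</tbody></table>";

    // Evaluates an XPath expression and returns the class attribute of every matched element.
    static List<String> classesFor(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> classes = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                classes.add(((Element) nodes.item(i)).getAttribute("class"));
            }
            return classes;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Skip the sixth row by position.
        System.out.println(classesFor("/table/tbody/tr[position() != 6]//*"));
        // Skip it by keeping row 1 plus every row that has some attribute.
        System.out.println(classesFor("/table/tbody/tr[position() = 1 or @*]//*"));
    }
}
```

On this stand-in table both calls list the cells of every row except the sixth. The attribute-based variant only works because the row to skip happens to be the sole attribute-free row besides the first one.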
Is it always element 6 that you want to avoid? If it is, use a for loop with a counter and skip element 6 with an if statement.
int numOfElements = driver.findElements(By.xpath("/html/body/table/tbody/tr")).size();
ArrayList<String> al1 = new ArrayList<String>();
for(int i = 1; i<= numOfElements; i++)
{
if(i!=6)
{
String className = driver.findElement(By.xpath("/html/body/table/tbody/tr["+i+"]")).getAttribute("class");
System.out.println("Class name = "+className);
al1.add(className);
}
}
This may not sound like the solution you are looking for, but it is still a roundabout way to achieve what you want. Off the top of my head, I can't think of another way unless the rows have an attribute you can compare against or use to exclude them.
I am currently trying to drill down on a user in a table full of users using Selenium webdriver, I have worked out how to iterate through the table but I'm having trouble actually selecting the person I want.
Here is the HTML (modified with X's due to it not being my data)
<table id="XXXXXXXXX_list" cellspacing="0" cellpadding="0" style=" border:0px black solid;WIDTH:100%;">
<tbody>
<tr cellspacing="0" style="height: 16px;">
<tr>
<tr onclick="widgetListView_onClick('XXXX_list',1,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',2,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',3,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',4,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',5,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',6,this,event)">
<tr onclick="widgetListView_onClick('XXXX_list',7,this,event)">
<td class="listView_default_dataStyle" nowrap="" style="font-size:12px ;
font-family: sans-serif ;color: black ;background: #FFFFFF "
ondblclick="XXXXListView_onDblClick('XXXXX_list',17, event)">NAME</td>
<td class="listView_default_dataStyle" nowrap="" style="font-size:12px ;font-family: sans-serif;
color: black ;background: #FFFFFF " ondblclick="XXXXX_onDblClick('XXXX_list',17, event)"> </td>
</tr>
Here is the code I am writing to try and find the user going by NAME in the table.
WebElement table = driver.findElement(By.id("table_list"));
// Now get all the TR elements from the table
List<WebElement> allRows = table.findElements(By.tagName("tr"));
// And iterate over them, getting the cells
for (WebElement row : allRows) {
List<WebElement> cells = row.findElements(By.tagName("td"));
for (WebElement cell : cells) {
List<WebElement> Names = cell.findElements(By.xpath("//td[text()='NAME']"));
System.out.println(Names);
This just prints thousands of [] (the table is huge in the real application).
Essentially what I need is to stop when I find the correct name and create a web element out of that table row. Which I can then click and drill down on.
Sorry if any of this is a bit vague,
Well if each name in the table is unique, you don't need to complicate things so much. Just search for element with text matching your 'Name' then select the row accordingly. Look at the code below:
WebElement name = driver.findElement(By.xpath("//table[@id='XXXXXXXXX_list']//td[contains(text(),'NAME')]"));//Select td with text NAME in table with id XXXXXXXXX_list
WebElement rowWithName = name.findElement(By.xpath("./.."));//Select the parent node, i.e., tr, of the td with text NAME
/*
* Look into that row for other element or perform any action on the row.
*/
If the names are not unique, i.e., the same name appears more than once at the same level, the first instance will be picked each time. In that case the XPath has to be indexed so that it picks the correct instance of the matching name. Do ask if you have any further doubts :)
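The cell-then-parent lookup, including the indexed variant for duplicate names, can be sketched with the JDK's XPath API. Everything specific here is assumed for illustration: the table id users_list, the data-row attribute, the sample names, and the names RowByCellTextSketch and rowFor. In Selenium the two expressions would go through driver.findElement(By.xpath(...)) and name.findElement(By.xpath("./..")) instead.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class RowByCellTextSketch {
    // Hypothetical table in which the name Alice appears twice.
    static final String XML =
        "<table id='users_list'><tbody>"
        + "<tr data-row='1'><td>Alice</td><td>Admin</td></tr>"
        + "<tr data-row='2'><td>Bob</td><td>User</td></tr>"
        + "<tr data-row='3'><td>Alice</td><td>Guest</td></tr>"
        + "</tbody></table>";

    // Returns the data-row value of the row holding the n-th (1-based) cell
    // whose text contains the name, or null when there is no such cell.
    // The name is spliced into the expression, so it must not contain a single quote.
    static String rowFor(String name, int occurrence) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xp = XPathFactory.newInstance().newXPath();

            // Parenthesising the path makes [n] index the whole result in document order.
            String cellPath = "(//table[@id='users_list']//td[contains(text(),'"
                + name + "')])[" + occurrence + "]";
            Node cell = (Node) xp.evaluate(cellPath, doc, XPathConstants.NODE);
            if (cell == null) {
                return null;
            }

            // "./.." steps from the matched td up to its parent tr.
            Element row = (Element) xp.evaluate("./..", cell, XPathConstants.NODE);
            return row.getAttribute("data-row");
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rowFor("Alice", 1));
        System.out.println(rowFor("Alice", 2));
    }
}
```

An explicit index is only as stable as the table's row order, so matching on a second, distinguishing cell is likely more robust when one is available.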
This will help you out.
try{
// Keep the result as a List; findElements only promises a List, so casting to ArrayList is unsafe
List<WebElement> cells = driver.findElements(By.tagName("td"));
log4j.info("Value = "+input_type+" is stored in array from Webpage for "+keyword+" ");
for(WebElement type : cells)
{
// Compare the visible text; a td normally has no name attribute, so getAttribute("name") would return null
if("your correct name here".equals(type.getText())) {
type.sendKeys("ABC");
}
}
return true;
}catch(Throwable e){
return false;
}
Collect the cells into a list like this, then compare each one against the name you are looking for and fill in a value or perform any other operation on it, such as getText() or click().
Enjoy!
<table width="100%" border="0" cellpadding="0" cellspacing="1" class="table_border" id="center_table">
<tbody>
<tr>
<td width="25%" class="heading_table_top">S. No.</td>
<td width="45%" class="heading_table_top">
Booking Status (Coach No , Berth No., Quota)
</td>
<td width="30%" class="heading_table_top">
* Current Status (Coach No , Berth No.)
</td>
</tr>
</tbody>
</table>
I scrape a webpage and store the response in a string.
I then parse it into a jsoup document:
Document doc = Jsoup.parse(result);
Then I select the table using
Element table=doc.select("table[id=center_table]").first();
Now I need to replace the cell text "Booking Status (Coach No , Berth No., Quota)" with "Booking Status" using jsoup. Could anybody help?
I tried
table.children().text().replaceAll(RegEx to select the text?????, "Booking Status");
Elements tds=doc.select("table[id=center_table] td"); // select the tds from your table
for(Element td : tds) { // loop through them
if(td.text().contains("Booking Status")) { // found the one you want
td.text("Booking Status"); // Replace with your text
}
}
Then you can use doc.toString() to get the HTML back as text to save to disk, send to a WebView, or whatever else you want to do with it.
Elements tablecells=doc.select("table tbody tr td");
will give you 3 cells.
Use a loop to get each element with
Element e = tablecells.get(index);
Use e.text() to get the String.
Compare or replace strings with String.equals(), String.contains(), String.replace().