Below is the HTML structure of my page.
<tr>
<td class="checkCol">
<td align="center">
<td> 8 </td>
<td> Add </td>
<td>
<td> Route Translation </td>
<td title=""> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> Force Complete </td>
<td>
<td>
<td>
<td>
</tr>
I am using the code below to retrieve the td element values.
List<WebElement> numOfRows = sppOrder_table.findElements(By.tagName("tr"));
if (numOfRows.size() == 1) {
System.out.println("No Record");
} else {
// Excluding header row
for (int i = 1; i <= numOfRows.size() - 1; i++) {
List<WebElement> numOfColumns = ((WebElement) numOfRows.get(i)).findElements(By.tagName("td"));
for (WebElement td : numOfColumns) {
System.out.println("Column Value === "+td.getText());
}
}
}
My table XPath is correct. The code prints nothing with HtmlUnitDriver but works fine with Firefox. Please suggest a resolution for this issue.
It works with the latest version. Your example is missing the header tr element; with only one row in the table, the code prints "No Record".
The following prints the expected result:
<table id='myid'>
<tr></tr>
<tr>
<td class="checkCol">
<td align="center">
<td> 8 </td>
<td> Add </td>
<td>
<td> Route Translation </td>
<td title=""> </td>
<td> </td>
<td> </td>
<td> </td>
<td> </td>
<td> Force Complete </td>
<td>
<td>
<td>
<td>
</tr>
</table>
HtmlUnitDriver driver = new HtmlUnitDriver();
driver.get(the_url);
WebElement sppOrder_table = driver.findElement(By.id("myid"));
List<WebElement> numOfRows = sppOrder_table.findElements(By.tagName("tr"));
if (numOfRows.size() == 1) {
System.out.println("No Record");
} else {
// Excluding header row
for (int i = 1; i <= numOfRows.size() - 1; i++) {
List<WebElement> numOfColumns = ((WebElement) numOfRows.get(i)).findElements(By.tagName("td"));
for (WebElement td : numOfColumns) {
System.out.println("Column Value === "+td.getText());
}
}
}
I have a form with some inputs; each input returns a list of data that is displayed in a table on another HTML page. Each input has its own table to display its data. My task is to not display the table if the user did not enter the corresponding input.
Here is my code:
<!-- Country Table-->
<%for(int i = 0; i < countryList.length;i++){
if(countryList.length == 0)
break;
%>
<div class="box" align="center">
<table name="tab" align="center" class="gridtable">
<thead >
<tr>
<th style="width: 50%" scope="col">Entity Watch List Key</th>
<th style="width: 50%" scope="col">Watch List Name</th>
</tr>
</thead>
<tbody>
<tr>
<td style="width: 50%"><%out.println((String) (countryList[i].getEntityWatchListKey()));%></td>
<td style="width: 50%"><%out.println((String) (countryList[i].getEntityName()));%></td>
</tr>
</tbody>
</table>
</div>
<%}%>
I am using break to exit the loop so that the table is not displayed. Is that correct?
You can check this condition before the for loop:
if(countryList.length != 0)
or
if(countryList.length > 0)
and then you do not need the break statement.
Furthermore, the check inside your current loop never does anything useful. If the length of the array is 0, the loop condition i < countryList.length becomes 0 < 0, which is false, so the loop body is never entered. Your if(countryList.length == 0) check is therefore never reached with an empty array.
Please modify your code as follows:
<% if (countryList.length > 0) { %>
<div class="box" align="center">
<table name="tab" align="center" class="gridtable">
<thead >
<tr>
<th style="width: 50%" scope="col">Entity Watch List Key</th>
<th style="width: 50%" scope="col">Watch List Name</th>
</tr>
</thead>
<tbody>
<% for (int i = 0; i < countryList.length; i++) { %>
<tr>
<td style="width: 50%"><%out.println((String) (countryList[i].getEntityWatchListKey()));%></td>
<td style="width: 50%"><%out.println((String) (countryList[i].getEntityName()));%></td>
</tr>
<% } %>
</tbody>
</table>
</div>
<% } %>
As good practice, repeat the row, not the whole table.
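The point about the loop guard can be checked outside JSP with a small sketch in plain Java. The class `EmptyLoopDemo` and the helper `renderRows` are hypothetical names used only for illustration; the string array stands in for `countryList`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JSP loop: builds one row string per array entry.
final class EmptyLoopDemo {

    // Returns the rows the loop would render. For an empty array the
    // loop condition (0 < 0) is false at once, so the body never runs.
    static List<String> renderRows(String[] countryList) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < countryList.length; i++) {
            // Reaching this line already implies countryList.length > 0,
            // so a length check placed here could never be false.
            rows.add("<tr><td>" + countryList[i] + "</td></tr>");
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(renderRows(new String[0]).size());
        System.out.println(renderRows(new String[] {"A", "B"}).size());
    }
}
```

Because the empty case produces no rows at all, the only decision left is whether to print the surrounding table, which is why the length check belongs outside the loop.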
I want to parse an HTML table with jsoup.
Part of the HTML page I want to parse:
<tr>
<td class="dkHeading">A1</td>
<td class="dkHeading">A2</td>
<td class="dkHeading">A3</td>
<td class="dkHeading">A4</td>
<td class="dkHeading">A5</td>
<td class="dkHeading">A6</td>
<td class="dkHeading">A7</td>
</tr>
<tr id="RContents">
<td class="dkTextCenter">B1</td>
<td class="dkTextCenter">B2</td>
<td class="dkTextCenter">B3</td>
<td class="dkTextLeft">B4</td>
<td class="dkTextCenter">B5</td>
<td class="dkTextCenter">B6</td>
<td class="dkTextCenter">B7</td>
</tr>
<tr>
<td class="dkTextCenter">C1</td>
<td class="dkTextCenter">C2</td>
<td class="dkTextCenter">C3</td>
<td class="dkTextLeft">C4</td>
<td class="dkTextCenter">C5</td>
<td class="dkTextCenter">C6</td>
<td class="dkTextCenter">C7</td>
</tr>
<tr>
<td class="dkTextCenter">D1</td>
<td class="dkTextCenter">D2</td>
<td class="dkTextCenter">D3</td>
<td class="dkTextLeft">D4</td>
<td class="dkTextCenter">D5</td>
<td class="dkTextCenter">D6</td>
<td class="dkTextCenter">D7</td>
</tr>
How can I select all "tr" elements after (and including) the tr with id "RContents"?
I tried doc.select("tr[id=RContents] > tr"); but that didn't work.
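You can use the following-sibling combinator ~ to select the rows after it. That combinator does not match the RContents row itself, so add it with a comma:
doc.select("tr[id=RContents], tr[id=RContents] ~ tr");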
Alternatively, you can select all tr elements and loop through them. Since the elements are returned in document order, you can try something like this:
Document document = Jsoup.parse("YOURHTML");
Elements elements = document.select("tr");
boolean start=false;
for(Element e : elements){
if(e.hasAttr("id") && e.attr("id").equals("RContents")){
start=true;
}
if(start){
//all tr elements including id=RContents and after
}
}
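The same flag technique can be tried without jsoup on the classpath. The sketch below is plain Java under stated assumptions: the `Row` class is a hypothetical stand-in for jsoup's `Element`, holding only an id (possibly null) and the row text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FromMarkerDemo {

    // Hypothetical stand-in for a parsed <tr>: an optional id plus its text.
    static final class Row {
        final String id;
        final String text;

        Row(String id, String text) {
            this.id = id;
            this.text = text;
        }
    }

    // Collects the text of the first row whose id equals markerId
    // and of every row that follows it, in order.
    static List<String> textsFrom(List<Row> rows, String markerId) {
        List<String> result = new ArrayList<>();
        boolean started = false;
        for (Row row : rows) {
            if (markerId.equals(row.id)) {
                started = true; // switch on at the marker row and stay on
            }
            if (started) {
                result.add(row.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row(null, "A1..A7"),
                new Row("RContents", "B1..B7"),
                new Row(null, "C1..C7"),
                new Row(null, "D1..D7"));
        System.out.println(textsFrom(rows, "RContents"));
    }
}
```

If the marker id never occurs, the flag stays false and the result is empty, which matches what the loop over jsoup elements would do.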
I am new to the Jsoup library. I have HTML like this:
<tr class="srrowns">
<td class="num"> <a name="y2015"> </a> 1 </td>
<td nowrap>CVE-2015-4004</td>
<td>119</td>
<td class="num"> <b style="color:red"> </b> </td>
<td> DoS Overflow +Info </td>
<td>2015-06-07</td>
<td>2015-06-08</td>
<td>
<div class="cvssbox" style="background-color:#ff8000">
8.5
</div></td>
<td align="center">None</td>
<td align="center">Remote</td>
<td align="center">Low</td>
<td align="center">Not required</td>
<td align="center">Partial</td>
<td align="center">None</td>
<td align="center">Complete</td>
</tr>
When I run element.select("td"), it returns:
<td class="num"> <a name="y2015"> </a> 1 </td>
<td nowrap>CVE-2015-4004</td>
<td>119</td>
<td class="num"> <b style="color:red"> </b> </td>
<td> DoS Overflow +Info </td>
<td>2015-06-07</td>
<td>2015-06-08</td>
<td>
<div class="cvssbox" style="background-color:#ff8000">
8.5
</div></td>
<td align="center">None</td>
<td align="center">Remote</td>
<td align="center">Low</td>
<td align="center">Not required</td>
<td align="center">Partial</td>
<td align="center">Complete</td>
Obviously, the <td align="center">None</td> before "Complete" has been dropped. Is there any way to get all items from the Jsoup selector?
My code looks something like this in Scala:
val connection = Jsoup.connect(url).get()
val treelist = connection.select("tr.srrowns:contains(CVE-2015-4001)")
val tree = treelist.select("td")
I just saw that Jsoup's select is implemented using a LinkedHashSet. My goal is to extract the text from each tag using Jsoup's text(). Is there a workaround for this, or do I have to write a parser just to get all nodes (including duplicates)?
Thank you very much.
Try this CSS selector:
tr.srrowns:has(td:contains(CVE-2015-4004)) > td
DEMO
http://try.jsoup.org/~vAgiHQY6TIJ5MSUzR-m_Y1GD5_U
SAMPLE CODE
val cve = "CVE-2015-4004"
val doc = Jsoup.connect(url).get()
val tds = doc.select("tr.srrowns:has(td:contains(" + cve + ")) > td")
// Elements is a java.util.List, so walk it with its Java iterator
val it = tds.iterator()
while (it.hasNext) {
println(it.next().text())
}
I am trying to find and click 'Available' seats in the seat layout of a travel website. The challenge is that an available seat has no unique identifier, whereas a 'Blocked' (already booked) seat has one in the form of a 'title' attribute (please refer to the HTML). How can we make WebDriver skip any blocked seat and click an 'Available' seat wherever it occurs in the seat layout?
The HTML shows the structure of two blocked seats (L2, L4) and one available seat in between (L3):
<div style="max-width:695px;">
<div class="GXXXXXXX" style="display: none;" aria-hidden="true">
<div class="GXXXXXXX">
<div class="GXXXXXXX"> </div>
<div class="GXXXXXXX">
<table>
<colgroup>
<tbody>
<tr>
<tr>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
Blocked Seat
<div class="GDXXXXXX GDXXXXX0" style="overflow:hidden;position:static;margin: 0 5px 5px 0;" title="Seat Name: L2 | Fare: Rs. 300.0">L2</div>
</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td>
Available Seat
<div class="GXXXXXX GXXXXXX0" style="overflow:hidden;position:static;margin: 0 5px 5px 0;">L3</div>
</td>
</tr>
<tr>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
<td>
Blocked Seat
<div class="GXXXXXX GXXXXXXX" style="overflow:hidden;position:static;margin: 0 5px 5px 0;" title="Seat Name: L4 | Fare: Rs. 300.0">L4</div>
</td>
</tr>
<tr>
</tbody>
</table>
</div>
This is the logic: check whether the div has a title attribute. If it does not, the seat is available. Adapt the logic to your needs.
List<WebElement> seats = driver.findElements(By.cssSelector("div.GXXXXXX.GXXXXXXX"));
for (WebElement seat : seats) {
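String title = seat.getAttribute("title");
// A missing title may come back as null or as an empty string, so check both
if(title != null && !title.isEmpty()){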
System.out.println("Seat is not available");
}else{
System.out.println("Seat is available");
seat.click(); // break the loop if you wish
}
}
I have a document that contains <br/>, <p>, and <table> elements.
I have been trying to parse this HTML using Jsoup while preserving the line breaks.
I tried many methods from similar questions, but without success:
FileInputStream in = new FileInputStream("C:............xxx.htm");
String htmlText = IOUtils.toString(in);
File file = new File("C:............xxx.txt") ;
PrintWriter pr = new PrintWriter(file) ;
String text = Jsoup.parse(htmlText.replaceAll("(?i)<br[^>]*>", "br2n")).text();
System.out.println(text.replaceAll("br2n", "\n"));
pr.println(text.replaceAll("br2n", "\n"));
// for (String line : htmlText.split("\n")) {
// String stripped = Jsoup.parse(line).text();
//
// System.out.println(stripped);
// pr.println(stripped);
//
// }
pr.close();
Here is the representative part of my HTML file (the original file starts with <html> ...of course)
<table border="0" cellspacing="0" cellpadding="0" bgcolor="white"
width='650'>
<tr>
<td><font size="4"><br />
<b>The scientific explantion of the syndrom</b></font>
<table width='650' border="0" cellspacing="5" cellpadding="0">
<tr>
<td width='5%'> </td>
<td width='25%'> </td>
<td width='25%'> </td>
<td width='15%'> </td>
<td width='30%'> </td>
</tr>
<tr height="24">
<td align="left" nowrap="nowrap" colspan="3"><font size=
"3"><b>Recent Update</b></font></td>
<td align="left" nowrap="nowrap"><a name=
"9J003346248"></a><font size="3"><b>Issue:</b></font></td>
<td align="left"><font size="3">9569865248</font></td>
</tr>
<tr>
<td> </td>
<td align="left"><b>Locust:</b></td>
<td align="left" colspan="3">UYF78UIGK</td>
</tr>
</table>
<br/> The explanation above does not necc....... <p>
Blah ....
</p>
<table border="2" cellspacing="1" cellpadding="0" bgcolor="white"
width='750'>
<tr>
<td><font size="4"><br />
<b>Syndrom of the main ......</b></font>
<table width='650' border="0" cellspacing="5" cellpadding="0">
<tr>
<td width='5%'> </td>
<td width='25%'> </td>
<td width='25%'> </td>
<td width='15%'> </td>
<td width='30%'> </td>
</tr>
<tr height="24">
<td align="left" nowrap="nowrap" colspan="3"><font size=
"3"><b>Data</b></font></td>
<td align="left" nowrap="nowrap"><a name=
"9J003346248"></a><font size="3"><b>Issue:</b></font></td>
<td align="left"><font size="3">9509809248</font></td>
</tr>
<tr>
<td> </td>
<td align="left"><b>Locust:</b></td>
<td align="left" colspan="3">U344365GK</td>
</tr>
</table>
<br/> The explanation above does not necc....... <p>
Blah ....
</p>
I need to make sure that all rows in those tables appear one after another, the way they do in the original document. But I have multiple tables and other "line-breaking elements". How can I do this using Jsoup? Is it possible to parse the HTML and keep the lines more effectively using another API?
You almost had it right. All the replaceAll calls must be applied to the HTML string before it is passed to Jsoup.parse. Try this:
String text = Jsoup.parse(htmlText.replaceAll("(?i)</tr>", "</tr> br2n ").replaceAll("(?i)<br[^>]*>", "br2n").replaceAll("(?i)<p>", "<p> br2n ").replaceAll("(?i)</p>", "</p> br2n ")).text();
System.out.println(text.replaceAll("br2n", "\n"));
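The string handling in this approach can be tried on its own, without Jsoup. The sketch below is plain Java and covers only the two regex steps: inserting the `br2n` marker before parsing and turning it back into newlines afterwards. The class and method names (`LineMarkerDemo`, `addMarkers`, `restoreLines`) are hypothetical, and the Jsoup parse that would sit between the two steps is deliberately left out. One assumption worth checking against real output: an HTML5-style parser may relocate text that sits directly between table rows, so the `</tr>` marker might not end up where expected.

```java
final class LineMarkerDemo {

    // Placeholder token that survives text extraction; must not occur in the real text.
    static final String MARKER = "br2n";

    // Step 1: add the marker wherever a line break is wanted (case-insensitive).
    static String addMarkers(String html) {
        return html
                .replaceAll("(?i)<br[^>]*>", MARKER)
                .replaceAll("(?i)<p>", "<p> " + MARKER + " ")
                .replaceAll("(?i)</p>", "</p> " + MARKER + " ")
                .replaceAll("(?i)</tr>", "</tr> " + MARKER + " ");
    }

    // Step 2 (after the HTML has been reduced to plain text): marker -> newline.
    static String restoreLines(String text) {
        return text.replace(MARKER, "\n");
    }

    public static void main(String[] args) {
        String marked = addMarkers("first<br/>second<P>third</P>");
        System.out.println(marked);
        System.out.println(restoreLines("first br2n second"));
    }
}
```

Keeping the marker in one constant avoids the two steps drifting apart if the token is ever changed.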