I'm trying to get the text of the div with class class1 without getting the text of the inner div.
I just want what's directly inside <div class="class1">, i.e. "123456789", without "abcdefg".
<div class="class0">
<div class="class1">
<div class="class2">
abcdefg
</div>
123456789
</div>
</div>
I tried to run this, but the result always includes the text I don't want:
String text = doc.getElementsByClass("class1").html();
String text2 = text.replaceAll("</?div[^>]*>","");
Log.d("text2", text2 );
output:
abcdefg
123456789
but I just want 123456789.
How can I do this? Thank you all.
I have now solved it myself:
doc.select("div.class2").first().remove();
String text = doc.getElementsByClass("class1").html();
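A note on the underlying idea: what is wanted here is only the text nodes that are direct children of div.class1. In jsoup, Element.ownText() is documented to return exactly that, so doc.select("div.class1").first().ownText() may be enough without removing anything from the document. The sketch below is not jsoup; it shows the same idea with the JDK's built-in XML parser and XPath so it runs without extra dependencies. It assumes the markup is well-formed XML, and the class name OwnTextDemo and the helper ownText are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class OwnTextDemo {
    // Returns the text that belongs directly to the first div with the given
    // class attribute value, skipping text inside nested elements.
    // Assumption: the input is well-formed XML and the class value has no quotes.
    static String ownText(String xhtml, String cssClass) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            // text() selects only the text nodes that are direct children.
            String expr = "(//div[@class='" + cssClass + "'])[1]/text()";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expr, doc, XPathConstants.NODESET);
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                sb.append(nodes.item(i).getNodeValue());
            }
            return sb.toString().trim();
        } catch (Exception e) {
            throw new IllegalStateException("could not read own text", e);
        }
    }

    public static void main(String[] args) {
        String html = "<div class=\"class0\"><div class=\"class1\">"
                + "<div class=\"class2\">abcdefg</div>123456789</div></div>";
        System.out.println(ownText(html, "class1"));
    }
}
```

Unlike the regex approach in the question, this never looks at the inner div's text at all, so nothing has to be stripped afterwards.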
Try this (JavaScript in the browser, not jsoup):
text = document.getElementsByClassName("class1");
text[0].outerText.split('\n')[1];
Related
I'm trying to skip one item so that it is not parsed with Jsoup,
but the CSS selector :not is not working.
I don't understand what is wrong.
My code:
MangaList list = new MangaList();
Document document = getPage("https://3asq.org/");
MangaInfo manga;
for (Element o : document.select("div.page-item-detail:not(.item-thumb#manga-item-5520)")) {
manga = new MangaInfo();
manga.name = o.select("h3").first().select("a").last().text();
manga.path = o.select("a").first().attr("href");
try {
manga.preview = o.select("img").first().attr("src");
} catch (Exception e) {
manga.preview = "";
}
list.add(manga);
}
return list;
html code:
<div class="col-12 col-md-6 badge-pos-1">
<div class="page-item-detail manga">
<div id="manga-item-5520" class="item-thumb hover-details c-image-hover" data-post-id="5520">
<a href="https://3asq.org/manga/gosu/" title="Gosu">
<img width="110" height="150" src="https://3asq.org/wp-content/uploads/2020/03/IMG_4497-110x150.jpg" srcset="https://3asq.org/wp-content/uploads/2020/03/IMG_4497-110x150.jpg 110w, https://3asq.org/wp-content/uploads/2020/03/IMG_4497-175x238.jpg 175w" sizes="(max-width: 110px) 100vw, 110px" class="img-responsive" style="" alt="IMG_4497"/> </a>
</div>
<div class="item-summary">
<div class="post-title font-title">
<h3 class="h5">
<span class="manga-title-badges custom noal-manga">Noal-Manga</span> Gosu
</h3>
If I debug your code and print the HTML for the first match:
System.out.println(document.select("div.page-item-detail").get(0))
(hint: use the expression evaluator in IntelliJ IDEA, Alt+F8, for in-session, real-time debugging)
I get:
<div class="page-item-detail manga">
<div id="manga-item-2003" class="item-thumb hover-details c-image-hover" data-post-id="2003">
<a href="http...
...
</div>
</div>
</div>
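The id manga-item-5520 and the class item-thumb are on a child div, not on div.page-item-detail itself, so :not(...) applied to the outer div never excludes anything. It looks like you want to extract the next div tag down whose class contains item-thumb, but only if its id isn't manga-item-5520.
So here's what I did to remove that one item: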
document.select("div.page-item-detail div[class*=item-thumb][id!=manga-item-5520]")
Result size: 19
With the element included:
document.select("div.page-item-detail div[class*=item-thumb]")
Result size: 20
You can also try the following if you want to keep selecting the outer div tag rather than the inner div tag:
document.select("div.page-item-detail:has(div[class*=item-thumb][id!=manga-item-5520])")
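To see why the filter has to be placed on the child div, here is a small sketch of the same two queries. It is not jsoup: it uses the JDK's built-in XPath engine on a simplified, well-formed copy of the markup so it runs without dependencies. The class ExcludeByChildIdDemo, the helpers count and item, and the two sample ids are only illustrative.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ExcludeByChildIdDemo {
    // Every outer item, with no filtering.
    static final String ALL = "//div[contains(@class,'page-item-detail')]";
    // Outer items that have an item-thumb child whose id is not manga-item-5520.
    static final String WITHOUT_5520 = ALL
            + "[div[contains(@class,'item-thumb') and @id!='manga-item-5520']]";

    // Counts the nodes matched by an XPath expression in a well-formed document.
    static int count(String xhtml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + xpath, e);
        }
    }

    // Builds one simplified item; the id sits on the child, as in the question.
    static String item(String thumbId) {
        return "<div class=\"page-item-detail manga\">"
                + "<div id=\"" + thumbId + "\" class=\"item-thumb hover-details\"/>"
                + "</div>";
    }

    public static void main(String[] args) {
        String html = "<div class=\"row\">"
                + item("manga-item-5520") + item("manga-item-2003") + "</div>";
        System.out.println("all: " + count(html, ALL));
        System.out.println("filtered: " + count(html, WITHOUT_5520));
    }
}
```

The filtered expression plays the same role as the :has(...) selector above: it still returns the outer div, but the condition is evaluated against the child that actually carries the id.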
Hi, I could not get the text from this HTML. I want to get the text "This is a test text".
<div class="rehou">
<span class="tlid-t t">
<span title="" class="">This is a test text</span>
</span>
<span class="tlid-t-v" style="" role="button"></span>
</div>
My java:
Document doc = Jsoup.connect(url).get();
Elements ele= doc.select("span.tlid-t t");
textass = ele.text();
The span has the two different classes tlid-t and t. So if you want to use both classes in your select you should use span.tlid-t.t instead of span.tlid-t t.
Elements ele = doc.select("span.tlid-t.t");
String textass = ele.text();
System.out.println(textass);
Which would print This is a test text.
But this selects the outer span, so if the HTML changes, the content of textass changes too. If you only want the text of the inner span, use span.tlid-t.t span.
Elements ele = doc.select("span.tlid-t.t span");
String textass = ele.text();
System.out.println(textass);
This will also print This is a test text.
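The reason the space matters: a class attribute is a space-separated list of class names, so class="tlid-t t" puts two classes on one element, while a space inside a selector means "descendant". The plain-Java sketch below only illustrates the class-list part, i.e. the check that a compound selector like span.tlid-t.t has to pass. It does not use jsoup, and ClassTokensDemo and its methods are made-up names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ClassTokensDemo {
    // Splits a class attribute value into its individual class names.
    static Set<String> classTokens(String classAttr) {
        return new HashSet<>(Arrays.asList(classAttr.trim().split("\\s+")));
    }

    // True when the element's class attribute contains every wanted class,
    // which is the condition a compound selector such as .tlid-t.t expresses.
    static boolean hasAllClasses(String classAttr, String... wanted) {
        return classTokens(classAttr).containsAll(Arrays.asList(wanted));
    }

    public static void main(String[] args) {
        // Outer span from the question: carries both classes.
        System.out.println(hasAllClasses("tlid-t t", "tlid-t", "t"));
        // Second span from the question: a different, single class.
        System.out.println(hasAllClasses("tlid-t-v", "tlid-t", "t"));
    }
}
```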
Using Jsoup:
Element movie_div = doc.select("div.movie").first();
I get HTML like this:
<div class="movie">
<div>
<div>
<strong>Year:</strong> 2014
</div>
<div>
<strong>Country:</strong> USA
</div>
</div>
</div>
How can I use jsoup to extract the country and the year?
For the example html I want the extracted values to be "2014" and "USA".
Thanks.
Use
Element e = doc.select("div.movie").first().child(0);
List<TextNode> textNodes = e.child(0).textNodes();
String year = textNodes.get(textNodes.size()-1).text().trim();
textNodes = e.child(1).textNodes();
String country = textNodes.get(textNodes.size()-1).text().trim();
Did you try something like:
Element movie_div = doc.select("div.movie strong").first();
And to get the text value, try:
movie_div.text();
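One thing to watch with the strong selector: the strong element only contains the label ("Year:"), and the value is the text node that follows it. In jsoup, nextSibling() on the strong element should reach that node. The sketch below shows the same "text node after the label" idea with the JDK's XPath engine rather than jsoup, so it runs on its own; it assumes well-formed markup and labels without quote characters, and LabelValueDemo and valueAfterLabel are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LabelValueDemo {
    // Returns the trimmed text that directly follows <strong>label</strong>
    // inside div.movie, or an empty string when the label is not found.
    static String valueAfterLabel(String xhtml, String label) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            // following-sibling::text()[1] is the first text node after the label.
            String expr = "normalize-space(//div[@class='movie']//strong"
                    + "[normalize-space()='" + label + "']"
                    + "/following-sibling::text()[1])";
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not read value for " + label, e);
        }
    }

    public static void main(String[] args) {
        String html = "<div class=\"movie\"><div>"
                + "<div><strong>Year:</strong> 2014</div>"
                + "<div><strong>Country:</strong> USA</div>"
                + "</div></div>";
        System.out.println(valueAfterLabel(html, "Year:"));
        System.out.println(valueAfterLabel(html, "Country:"));
    }
}
```

Looking values up by label rather than by child index keeps working if the order of the inner divs changes.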
I need to capture the error messages that are displayed. I tried several approaches, but every one throws an "unable to find element" exception.
Please help with the code. These are the methods I tried. Also, there is no ID; it is a div element, something like this:
<div id="webformErrors" class="text" name="errorContent">
<div>
There were 4 errors:
<ul>
<li>
You did not enter a value for:
<b>First Name</b>
</li>
<li>
You did not enter a value for:
<b>Last Name</b>
</li>
<li>
<li>
//String errormsg;
errormsg = Hcd.findElement(By.xpath("//div[@id=webformErrors']/text()")).getText();
// WebElement divElement = Hcd.findElement(By.className("errorContent"));
// Hcd.findElement(By.name("There were 4 errors:")).isDisplayed();
String pstring = Hcd.findElement(By.id("webformErrors")).getText();
System.out.println(pstring);
You have mixed up the class name and the name attribute: the class name is "text" and the name is "errorContent".
WebElement divElement = Hcd.findElement(By.className("text"));
In a list of 8 elements, I want to select the one that contains the search text in a child div. I need this because the order of the list elements changes every time. Here I would like to select the one that contains the text "TITLE TO LISTEN". How do I iterate through the list and select the desired li?
Thanks in advance
Here is one li:
...
<li id="3636863298979137009" class="clearfix" data-size="1" data-fixed="1" data-side="r">
<div class="userContentWrapper">
<div class="jki">
<span class="userContent">
TITLE TO LISTEN
</div>
<div class="TimelineUFI uiContainer">
<form id="u_0_b0" class="able_item collapsed_s autoexpand_mode" onsubmit="return window.Event && E" action="/ajax/ufi/modify.php" method="post" >
<input type="hidden" value="1" name="data_only_response" autocomplete="off">
<div class="TimelineFeedbackHeader">
<a class="ction_link" role="button" title="Journal" data-ft="{"tn":"J","type":25}" rel="dialog" href="/ajax/" tabindex="0" rip-style-bordercolor-backup="" style="" rip-style-borderstyle-backup="" >LISTEN</a>
</div>
</form>
</div>
</div>
</li>
</ol>
</div>
...
I tried this code, but it doesn't work because the elements' ids change each time.
driver.findElement(By.xpath("//li[8]/div[2]/div/div[2]/form/div/div/span[2]/a")).click();
For example:
If text contain "TEXT TO LISTEN": li[3]/div[2]/div/div/div[2]/div/div/span
Link "listen" i want to click : li[3]/div[2]/div/div[2]/form/div/div/span[2]/a
Here the index is 3, but the order may change. I would first like to get that index and then click the right link.
Use this
driver.findElement(By.xpath("//li[contains(text(), 'Your text goes here')]"))
EDIT: I just realised this is a very old question and you have probably found an answer by now, so this is for others who are looking for an answer.
You could get a list of all li elements and then search for the specified text (C#):
for (int i = 0; i < listOfLiElements.Count; i++)
{
    // Compare the text of the userContent span inside each li
    if (listOfLiElements[i].FindElement(By.ClassName("userContent")).Text.Trim() == "TITLE TO LISTEN")
    {
        correctElement = listOfLiElements[i].FindElement(By.TagName("a"));
        break; // stop at the first match
    }
}
Well, then just iterate with a for-each loop and check whether the current element has the right text inside it.
List<WebElement> listOfLiTags = driver.findElement(By.id("yourUlId")).findElements(By.tagName("li"));
for (WebElement li : listOfLiTags) {
    String text = li.findElement(By.className("userContent")).getText();
    if (text.trim().equals("TITLE TO LISTEN")) {
        // do whatever you want and don't forget break
        break;
    }
}
Note that this is much easier with the CSS selector API.
List<WebElement> listOfSpans = driver.findElements(
        By.cssSelector("ul[id=yourId] li span[class=userContent]"));
Now just iterate and check for the right text :)
You can try this:
public void ClickLink()
{
    WebElement ol = driver.findElement(By.id("ol"));
    List<WebElement> lis = ol.findElements(By.tagName("li"));
    ArrayList<String> listFromGUI = new ArrayList<>();
    for (int i = 0; i < lis.size(); i++)
    {
        // XPath positions are 1-based, hence i + 1
        WebElement li = ol.findElement(By.xpath("//ol[@id='ol']/li[" + (i + 1) + "]/div[2]/div/div/div[2]/div/div/span"));
        if (li.getText().trim().equals("TEXT TO LISTEN"))
        {
            WebElement link = ol.findElement(By.xpath("//ol[@id='ol']/li[" + (i + 1) + "]/div[2]/div/div[2]/form/div/div/span[2]/a"));
            if (link.getText().trim().equals("LISTEN"))
            {
                link.click();
                break;
            }
        }
    }
}
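As an alternative to looping, the li can be matched by the text of its userContent span in a single XPath expression, which avoids depending on the position of the li at all. The sketch below checks such an expression with the JDK's own XPath 1.0 engine against a simplified, well-formed version of the markup, so it runs without a browser. With Selenium, the element part of the expression (without string() and /@href) would presumably be passed to By.xpath and clicked. FindListItemDemo, item and listenHref are hypothetical names, and the href values are invented sample data.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FindListItemDemo {
    // Builds one simplified li with a title span and a LISTEN link.
    static String item(String id, String title, String href) {
        return "<li id=\"" + id + "\"><div class=\"userContentWrapper\">"
                + "<div class=\"jki\"><span class=\"userContent\"> " + title + " </span></div>"
                + "<div class=\"TimelineUFI uiContainer\"><form>"
                + "<div class=\"TimelineFeedbackHeader\">"
                + "<a class=\"ction_link\" href=\"" + href + "\">LISTEN</a>"
                + "</div></form></div></div></li>";
    }

    // Returns the href of the LISTEN link inside the li whose userContent span
    // has the given title, or an empty string when no li matches.
    // Assumption: the title contains no quote characters.
    static String listenHref(String xhtml, String title) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            String expr = "string(//li[.//span[@class='userContent']"
                    + "[normalize-space()='" + title + "']]"
                    + "//a[normalize-space()='LISTEN']/@href)";
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String html = "<ol>"
                + item("1", "OTHER TITLE", "/ajax/1")
                + item("2", "TITLE TO LISTEN", "/ajax/2")
                + item("3", "THIRD TITLE", "/ajax/3")
                + "</ol>";
        System.out.println(listenHref(html, "TITLE TO LISTEN"));
    }
}
```

normalize-space() is used instead of an exact text comparison because the span text in the question is surrounded by whitespace.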