I can't use Jsoup in Java


I want to extract the four values I marked in the table in the picture, and the data that follows them, with Jsoup, but I couldn't work out which HTML elements to select.
Here are my code and the website:
https://www.ilan.gov.tr/ilan/kategori/9/ihale-duyurulari
Document doc = Jsoup.connect("https://www.ilan.gov.tr/ilan/kategori/9/ihale-duyurulari").get();
//System.out.println(doc.outerHtml());
for(Element row: doc.select("search-results-content row ng-tns-c146-3")) {
final String title = row.select(".list-desc ng-tns-c152-3").text();
final String title1 = row.select(".col col-4 col-lg-4 col-border ng-star-inserted").text();
System.out.println(title);
}
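One likely problem is the selector syntax. In a Jsoup (CSS) selector, words separated by spaces are read as nested tag names, so "search-results-content row ng-tns-c146-3" looks for a <ng-tns-c146-3> element inside a <row> inside a <search-results-content>. Class names need a leading dot, and several classes on the same element are chained without spaces. Below is a minimal sketch under stated assumptions: the HTML string is invented to mirror the class names in the question and is not the real page, and the class and method names are made up for illustration. The ng-tns-... classes look auto-generated (an assumption), so the sketch does not select on them.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ClassSelectorSketch {

    // Invented stand-in markup; the structure of the real page is an assumption.
    static final String SAMPLE_HTML =
        "<div class=\"search-results-content row ng-tns-c146-3\">"
      + "<div class=\"list-desc ng-tns-c152-3\">First notice</div>"
      + "</div>"
      + "<div class=\"search-results-content row ng-tns-c146-3\">"
      + "<div class=\"list-desc ng-tns-c152-3\">Second notice</div>"
      + "</div>";

    // Returns the text of the .list-desc element inside every result row.
    static List<String> titles(String html) {
        Document doc = Jsoup.parse(html);
        List<String> result = new ArrayList<>();
        // ".a.b" matches a single element that carries both classes.
        for (Element row : doc.select(".search-results-content.row")) {
            result.add(row.select(".list-desc").text());
        }
        return result;
    }

    public static void main(String[] args) {
        titles(SAMPLE_HTML).forEach(System.out::println);
    }
}
```

If the listing on the real site is filled in by JavaScript after the page loads, the document returned by Jsoup.connect(...).get() may not contain these rows at all, and no selector would fix that. Printing doc.outerHtml() (the commented-out line in the question) is a quick way to check.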

Related

Use JSoup to get all textual links

I'm using JSoup to grab content from web pages.
I want to get all the links on a page that contain some text (it doesn't matter what the text is; it just needs to be non-empty, not only an image, etc.).
Example of links I want:
Link to Some Page
Since it contains the text "Link to Some Page"
Links I don't want:
<img src="someimage.jpg"/>
My code looks like this. How can I modify it to only get the first type of link?
Document document = // I get my document object
Elements linksOnPage = document.select("a[href]");
for (Element page : linksOnPage) {
String link = page.attr("abs:href");
// I do stuff with the link
}

You could do something like this.
It does its job, though it's probably not the fanciest solution out there.
Note: the function text() returns clean text, so if there are any HTML fragments inside the link, it won't return them.
Document doc = // get the doc
Elements linksOnPage = doc.select("a");
for (Element pageElem : linksOnPage) {
    // skip links that have no visible text (for example, image-only links)
    if (pageElem.text().trim().isEmpty())
        continue;
    String link = pageElem.attr("abs:href");
    // do something with it
}

I am using this and it's working fine:
Document document = // I get my document object
Elements linksOnPage = document.select("a:matches(([^\\s]+))");
for (Element page : linksOnPage) {
String link = page.attr("abs:href");
// I do stuff with the link
}

Elements returns empty string

I am trying to scrape prices from a website with jSoup, but I only get an empty string.
I've tested my code with jSoup Online and I expect <meta itemprop="price" content="6,99"> to be printed when I use the following code:
Document doc = Jsoup.connect(URL).get();
Elements meta = doc.select("meta[itemprop=price]");
System.out.println("meta: " + meta.text());
price = meta.attr("content");
However, I just get an empty string and no error. What am I doing wrong here?
For the ones interested I am trying to scrape the price of this page

Try this:
Document doc = Jsoup.connect(URL).get();
Element meta = doc.select("meta[itemprop=price]").first();
System.out.println("meta: " + meta.text());
String price = meta.attr("content");

The webserver you are trying to access needs another user agent string to respond with the info you want. Try this:
Document doc = Jsoup.connect(URL).userAgent("Mozilla/5.0").get();
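A side note on the println in the question: a meta element has no text content, so text() on it is empty even when the selector matches. The value sits in the content attribute. Here is a minimal sketch on an inline HTML string; the meta tag is copied from the question, while the surrounding markup and the class and method names are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class MetaPriceSketch {

    // The meta tag is the one quoted in the question; the wrapper markup is invented.
    static final String SAMPLE_HTML =
        "<html><head><meta itemprop=\"price\" content=\"6,99\"></head><body></body></html>";

    // Reads the price from the content attribute; empty string if no such meta tag exists.
    static String price(String html) {
        Document doc = Jsoup.parse(html);
        Element meta = doc.select("meta[itemprop=price]").first();
        // first() returns null when nothing matched, so guard before using it.
        return meta == null ? "" : meta.attr("content");
    }

    // Shows what text() yields for the same element.
    static String metaText(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("meta[itemprop=price]").text();
    }

    public static void main(String[] args) {
        System.out.println("text():          '" + metaText(SAMPLE_HTML) + "'");
        System.out.println("attr(\"content\"): '" + price(SAMPLE_HTML) + "'");
    }
}
```

If attr("content") is also empty against the live site, the element is probably missing from the response, which is where the user agent suggestion comes in.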

Grabbing information from an html file

OK, I am trying to grab the data-title and href and assign them to variables in Java.
<tr class="pl-video yt-uix-tile " data-video-id="MBBWVgE0ewk" data-set-video-id="" data-title="Windows Command Line Tutorial - 1 - Introduction to the Command Prompt"><td class="pl-video-handle "></td><td class="pl-video-index"></td><td class="pl-video-thumbnail"><span class="pl-video-thumb ux-thumb-wrap contains-addto"><a href="/watch?v=MBBWVgE0ewk&index=1&list=PL6gx4Cwl9DGDV6SnbINlVUd0o2xT4JbMu"

If you don't mind including a dependency, there is a good library for this kind of thing called jsoup.
String html = ...
Document doc = Jsoup.parse(html);
Element tr = doc.select("tr").first();
Element link = tr.select("a").first();
String dataTitle = tr.attr("data-title");
String href = link.attr("href");
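The href in the snippet is relative (/watch?v=...). If an absolute URL is needed, it can be resolved against the address the page came from. Here is a small sketch using only java.net.URI; the base address is a hypothetical placeholder, since the question does not say where the file came from, and the class and method names are made up for illustration.

```java
import java.net.URI;

class HrefResolveSketch {

    // Hypothetical base address; replace it with the real location of the page.
    static final String BASE = "https://www.example.com/playlist";

    // Resolves a relative href against the base address and returns the absolute URL.
    static String toAbsolute(String base, String href) {
        return URI.create(base).resolve(href).toString();
    }

    public static void main(String[] args) {
        System.out.println(toAbsolute(BASE, "/watch?v=MBBWVgE0ewk&index=1"));
    }
}
```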

jSoup get title from img tag

I have a scenario where I need to pull the title from an img tag like the one below.
<img alt="Bear" border="0" src="/images/teddy/5433.gif" title="Bear"/>
I was able to get the image URL, but how do I get the title from the img tag?
In the example above, title = "Bear". I want to extract this.

Use Element#attr() to extract arbitrary element attributes.
Element img = selectItSomehow();
String title = img.attr("title");
// ...
See also:
Jsoup Cookbook - Extract attributes, text, and HTML from elements

String html = "<img alt='Bear' border='0' src='/images/teddy/5433.gif' title='Bear'/>";
Document doc = Jsoup.parse(html);
Element e = doc.select("img[title]").first();
String title = e.attr("title");
System.out.println(title);

How to extract Dynamic text from a webpage

I want to get some text from a web page that changes frequently. What technologies can I use for this? As an example, I want to extract a currency rate that changes every day from a web page and save it in a DB. Please let me know if anyone knows about this.
Thanks

You can use JSoup to parse the HTML.
Example:
String html = "<p>An <a href='http://example.com/'><b>example</b></a> link.</p>";
Document doc = Jsoup.parse(html);
Element link = doc.select("a").first();
String text = doc.body().text(); // "An example link."
String linkHref = link.attr("href"); // "http://example.com/"
String linkText = link.text(); // "example"
String linkOuterH = link.outerHtml();
// "<a href="http://example.com/"><b>example</b></a>"
String linkInnerH = link.html(); // "<b>example</b>"
You can look for a particular div or any other tag in the same way.
