Jsoup cannot read data with ID? - java

I am trying to read the element with ID main_LbStatsOpenProfit from http://www.zulutrade.com/trader/104769, but it is always empty. I have tested it on Try jsoup (http://try.jsoup.org/) and it works fine there.
I have read many other values by ID and those worked fine.
Any ideas why this isolated problem might occur?
Or is there any other parser that can read this?

You might be using html() on the selected Element? If so, it won't work: html() returns only the inner HTML, and the element in the HTML of the given URL is empty.
Try the code below:
doc.select("#main_LbStatsOpenProfit").first().toString()
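To make the difference concrete, here is a minimal sketch that parses an inline string instead of fetching the live page. The markup and the class name EmptyElementDemo are assumptions for illustration only (the real page may look different), and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class EmptyElementDemo {
    // Hypothetical markup: the element exists in the static HTML but has no content.
    static final String HTML =
        "<html><body><span id=\"main_LbStatsOpenProfit\"></span></body></html>";

    // Returns the first element with the given id selector, or null if nothing matches.
    static Element find(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("#main_LbStatsOpenProfit").first();
    }

    public static void main(String[] args) {
        Element el = find(HTML);
        System.out.println("html():      [" + el.html() + "]");      // inner HTML only
        System.out.println("text():      [" + el.text() + "]");      // text content only
        System.out.println("outerHtml(): [" + el.outerHtml() + "]"); // includes the tag itself
        System.out.println("toString():  [" + el + "]");             // same as outerHtml()
    }
}
```

If the value only shows up in a browser, it is probably filled in later by JavaScript, which jsoup does not execute.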

Related

Fetching data from another website with JSOUP

Basically, I need a table with all the books that exist, and I don't want to build it by hand, because I'm a very lazy person xD. So my question is: can I take a site I have in mind, cut away the parts I don't need, and keep only the search part (maybe with some layout changes), then run the search, find the book, and store in my database only the data that makes sense for me? Is that possible? I heard that JSOUP could help.
So, I just want some tips. (Thanks for reading.)
the site: http://www.isbn.bn.br/website/consulta/cadastro
Yes, you can do that using Jsoup. The main problem is that the URL you shared relies on JavaScript, which Jsoup does not execute, so you'll need Selenium to run the JS; alternatively, you can get the book URL and parse that page directly.
The way to parse a web page using Jsoup is:
Document document = Jsoup.connect("YOUR-URL-GOES-HERE")
.userAgent("Mozilla/5.0")
.get();
Then you have the whole HTML in a Document, so you can get any Element contained in the Document using CSS selectors. For example, if you want to retrieve the title of the page from the HTML, you can use:
Elements elements = document.select("title");
Do the same for every HTML tag that you want to retrieve information from. You can check the Jsoup documentation and some of the examples explained there.
I hope it helps you!

How to get value from inside quotation marks in HTML using Java and Selenium WebDriver

I have a piece of code that looks like this:
Of course I can get the link text using the command
System.out.print(link.getText());
but in this case I only get the value "Saab". I also need the date and image size that are inside the quotation marks.
Do you know how to do it?
The complete HTML would have helped us give you a better answer.
This will print: Saab <Date> <size>
System.out.print(link.findElement(By.xpath("./..")).getText());
Basically, I get the parent element of the link and read its complete inner text.

Identifying a link in selenium (no id or class provided)

I would like to know how to identify via webdriver the following html "node":
thank you <em>very much indeed</em> - Angielsko-Polski Słownik <b>...</b>
(It is just any result link on Google after running a search.)
I have googled it, but I only found cases where the id or the class was provided.
What about this case?
This is my failed attempt:
webdriver.findElement(By.xpath("//a[@href='http://www.google.pl/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&uact=8&ved=0CCcQFjAA&url=http%3A%2F%2Fpl.bab.la%2Fslownik%2Fangielski-polski%2Fthank-you-very-much-indeed&ei=Sia8U6LPCevB7AagwoCICg&usg=AFQjCNF6y7swYrp3axD0hNrCWfjovhcVPw&bvm=bv.70138588,d.bGE']")).click();
Thanks in advance.
There are several possibilities:
By.tagName("a")
However, chances are there is more than one a tag, and findElement will pick the first one it encounters. To be more specific, you can use:
By.xpath("(//a)[1]")
XPath positions start at 1, so (//a)[1] refers explicitly to the first a tag in the document; //a[0] matches nothing. However, to give a precise XPath answer, I would need to see more of your page code, as well as your exact requirements. You can also use:
By.partialLinkText("thank you very much indeed")
This works best if you have unique enclosed text.
You may also want to read through the rest of the locators in the API.
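To see how XPath positional predicates behave, here is a minimal sketch using the JDK's built-in javax.xml.xpath on a small inline snippet. This is not Selenium and not the Google page: the markup and the class name XPathIndexDemo are assumptions for illustration. Selenium hands the expression to the browser's XPath engine, where the same XPath 1.0 position rules should apply.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathIndexDemo {
    // Hypothetical markup: two links, each the only <a> inside its own <div>.
    static final String XML =
        "<body><div><a href=\"/one\">one</a></div><div><a href=\"/two\">two</a></div></body>";

    // Evaluates an XPath expression against XML and returns the matching nodes.
    static NodeList eval(String expr) {
        try {
            return (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(XML)), XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        // Compare a zero index, a per-parent index, and a document-wide index.
        for (String expr : new String[] {"//a[0]", "//a[1]", "(//a)[1]"}) {
            System.out.println(expr + " matches " + eval(expr).getLength() + " node(s)");
        }
    }
}
```

Note the difference between the last two expressions: //a[1] selects every a that is the first a child of its own parent, while (//a)[1] selects only the first a in the whole document.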
If you are not able to identify the link directly, you can try locating it through another element.
When an adjacent div or other element has a unique value, you can refer to the link relative to that.
WebElement element = driver.findElement(By.cssSelector("div#id a"));
This will get the link element inside the div that has an id value of "id".

Dynamic Content Parsing

I am working on content parsing and ran a sample program against a sample link.
Please visit the link below:
http://www.equitymaster.com/stockquotes/sector.asp?sector=0%2CSOFTL&utm_source=top-menu&utm_medium=website&utm_campaign=performance&utm_content=key-sector
From the above link I parsed the table data and stored it in a Java object.
BSE and NSE are not my exact requirement; I have just taken them as a sample. The page is built with tables and uses no ids or classes, so in my example I parsed the data using XPath.
This is my XPath:
/html/body/table[4]/tbody/tr/td/table[2]/tbody/tr[2]/td[2]/font/table[2]
Selecting and parsing with it works fine. The problem is that if they change the website structure in the future, my program will certainly stop working. Is there any other way to parse the data dynamically, store it in a database, and display results based on a condition, even if the page structure changes? I used the JSOUP API for this. Are there other APIs that support this kind of requirement better?
If you're trying to parse a page without any clear id/class to select your nodes, you have to rely on something else. Spelling out the whole tree path is indeed the weakest way of doing it: if anything is added or changed, everything collapses.
You could try relying on color: //table[@bgcolor="#c9d0e0"], the "GET MORE INFO" field: //table[tr/td//text()="GET MORE INFO"], the "More Info" that appears on every line: //table[.//td//text()="&nbspMore Info&nbsp"]...
The idea is to find something ideally unique (if you can't find any unique criterion, table[color condition selecting a few tables][2] is still stronger than walking the whole tree), present every time, and use that as an id.
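As a rough illustration of why a predicate survives layout changes better than an absolute path, here is a minimal sketch using the JDK's javax.xml.xpath. The two inline documents, the cell texts, and the class name RobustXPathDemo are assumptions made up for the example; they are not the real equitymaster markup.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class RobustXPathDemo {
    // Hypothetical page: a menu table followed by the data table we want.
    static final String ORIGINAL = "<html><body>"
        + "<table><tr><td>menu</td></tr></table>"
        + "<table bgcolor=\"#c9d0e0\"><tr><td>Company</td><td>More Info</td></tr></table>"
        + "</body></html>";

    // Same page after a hypothetical redesign: a banner table is inserted at the top.
    static final String CHANGED =
        ORIGINAL.replace("<body>", "<body><table><tr><td>banner</td></tr></table>");

    // Returns the text of the first <td> in the first table matched by expr, or null.
    static String firstCell(String xml, String expr) {
        try {
            Element table = (Element) XPathFactory.newInstance().newXPath()
                .evaluate(expr, new InputSource(new StringReader(xml)), XPathConstants.NODE);
            return table == null ? null : table.getElementsByTagName("td").item(0).getTextContent();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath: " + expr, e);
        }
    }

    public static void main(String[] args) {
        String[] exprs = {
            "/html/body/table[2]",               // positional path through the tree
            "//table[@bgcolor='#c9d0e0']",       // attribute predicate
            "//table[.//td//text()='More Info']" // text predicate
        };
        for (String expr : exprs) {
            System.out.println(expr + " -> original: " + firstCell(ORIGINAL, expr)
                + ", changed: " + firstCell(CHANGED, expr));
        }
    }
}
```

The positional path silently picks a different table once the banner is added, while the two predicates keep pointing at the data table as long as the color or the cell text stays the same.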

Jsoup is not finding my element

Perhaps I'm doing something wrong, but I'm trying to parse this page using jsoup, and for some reason it doesn't find the div I'm looking for:
doc = Jsoup.connect(params[0]).get();
content = doc.select("div.itemcontent").first().text();
Where am I going wrong here?
Thanks
The problem: with jsoup you get a different page than with a browser. I tried setting another user agent in jsoup, but no luck. Possibly the content is changed through JavaScript?
However, you can change the selector to match the page you actually get.
It's always a good idea to look at the document as it was parsed; a simple System.out.println(doc) is enough.
Here are some steps you can try:
1. Print your Document doc (e.g. using System.out)
2. Search for the required value(s) in there
3. Select those tags instead
I just played around a bit, but maybe you can use this snippet:
content = doc.select("description").first().text();
It seems to me, <description>...</description> is what you're looking for.
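The steps above can be sketched roughly as follows. The inline body, the choice of jsoup's XML parser, and the names InspectThenSelect and textOrNull are assumptions for illustration; the real response for that page is not shown here. It assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.parser.Parser;

class InspectThenSelect {
    // Hypothetical response body: feed-like markup instead of the expected div.
    static final String BODY =
        "<rss><channel><item><title>Post</title>"
        + "<description>Hello world</description></item></channel></rss>";

    // Parses the body with jsoup's XML parser (an assumption; HTML is jsoup's default).
    static Document parse(String body) {
        return Jsoup.parse(body, "", Parser.xmlParser());
    }

    // Returns the text of the first match, or null instead of throwing when nothing matches.
    static String textOrNull(Document doc, String cssQuery) {
        Element first = doc.select(cssQuery).first();
        return first == null ? null : first.text();
    }

    public static void main(String[] args) {
        Document doc = parse(BODY);
        System.out.println(doc); // step 1: look at what was actually parsed
        System.out.println("div.itemcontent: " + textOrNull(doc, "div.itemcontent"));
        System.out.println("description:     " + textOrNull(doc, "description"));
    }
}
```

The null check matters because first() returns null when the selector matches nothing, so calling text() directly on it throws a NullPointerException.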
