I am parsing the HTML of the following webpage using Jsoup. How do I get the value of the variable price_ourBase in this script:
<script type="text/javascript">
var price_ourBase = 279;
.
.
.
</script>
Jsoup code:
Element upperContainer_inner = document.select("div.upperContainer_inner").first();
Element table = upperContainer_inner.select("table.645.0.left.0.0").first();
Element script = table.select("script").first();
Element base_ourPrice = script.select("base_ourPrice").first();
price = (?, not sure what to put here or if there is more code needed).text();
I don't think Jsoup can parse JavaScript like that. However, you could select the contents of the script element with Jsoup and then do something like this:
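// data() returns the script's contents; toString() would also include the <script> tag itself
String[] result = script.data().trim().split("\\s+");
if (result.length > 3 && result[1].equals("price_ourBase")) {
    System.out.println("Our price is " + result[3].split(";")[0]);
}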
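Splitting on whitespace only works while the variable is the first statement in the script. A regular expression is one possible alternative that tolerates other statements before it. The following is a minimal sketch, not a definitive implementation: it assumes the script text has already been obtained as a String (for example from the selected script element), and the class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ScriptVariableExtractor {

    // Matches "var price_ourBase = <digits>" with flexible whitespace.
    private static final Pattern PRICE =
            Pattern.compile("\\bvar\\s+price_ourBase\\s*=\\s*(\\d+)");

    /** Returns the integer assigned to price_ourBase, or empty if it is absent. */
    static OptionalInt extractPrice(String scriptBody) {
        Matcher m = PRICE.matcher(scriptBody);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Assumption: in real use this string comes from the selected <script> element.
        String scriptBody = "\nvar price_ourBase = 279;\nvar somethingElse = 1;\n";
        OptionalInt price = extractPrice(scriptBody);
        if (price.isPresent()) {
            System.out.println("Our price is " + price.getAsInt());
        }
    }
}
```

The pattern only handles whole-number prices, which matches the snippet in the question; a decimal price would need a wider capture group.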
I have a question: I need to capture the value of an HTML input element using Jsoup.
For example:
<input type = "text" id = "national" value = "3.26" style = "width: 2.3em;">
I need to capture only the value "3.26"
I tried using the statement: Element mdolar = document.getElementById("national");
but it does not display any information.
What am I doing wrong?
Thank you.
The following test extracts the value from an input element with id=national.
This test passes using the HTML supplied in your question.
@Test
public void parseInputValueFromHtml() {
String html = "<input type = \"text\" id = \"national\" value = \"3.26\" style = \"width: 2.3em;\">";
Document document = Jsoup.parse(html);
Element mdolar = document.getElementById("national");
Assert.assertEquals("3.26", mdolar.attr("value"));
// you can also find this element with a CSS attribute selector:
Elements mdolars = document.select("input[id=national]");
Assert.assertEquals(1, mdolars.size());
Assert.assertEquals("3.26", mdolars.first().attr("value"));
}
I am trying to scrape prices from a website with Jsoup, but I only get an empty string.
I've tested my code with jSoup Online and I expect <meta itemprop="price" content="6,99"> to be printed when I use the following code:
Document doc = Jsoup.connect(URL).get();
Elements meta = doc.select("meta[itemprop=price]");
System.out.println("meta: " + meta.text());
price = meta.attr("content");
However, I just get an empty string and no error. What am I doing wrong here?
For the ones interested I am trying to scrape the price of this page
Try this:
Document doc = Jsoup.connect(URL).get();
Element meta = doc.select("meta[itemprop=price]").first();
System.out.println("meta: " + meta.text());
String price = meta.attr("content");
The webserver you are trying to access needs another user agent string to respond with the info you want. Try this:
Document doc = Jsoup.connect(URL).userAgent("Mozilla/5.0").get();
In the HTML below, I want to pick the "page=" value from the href attribute of the link, so that I can use it in my Selenium WebDriver script and have my loop run up to page 53.
Please tell me how to extract the value of "page=" from the href attribute.
<li>
<a id="quotes_content_left_lb_LastPage" class="pagerlink" href="http://www.abcd.com/symbol/ctsh/institutional-holdings?page=53">last >></a>
</li>
var arr = document.getElementById('quotes_content_left_lb_LastPage').href.split('=');
var value = arr[arr.length-1];
//value equals 53
Is this what you need?
Try with:
WebElement lnk = driver.findElement(By.id("quotes_content_left_lb_LastPage"));
int loopCount = Integer.parseInt(lnk.getAttribute("href").split("page=")[1]);
Basically, you can use the getAttribute method of the WebElement interface to get the value of any attribute; after that, it's pure Java.
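Splitting on "page=" works for the link in the question, but it would break if another parameter followed page in the URL. As a hedged sketch of the "pure Java" part, the query string can be parsed parameter by parameter. It assumes the href string has already been read from the element; the class and method names are hypothetical.

```java
import java.net.URI;

class PageParam {

    /** Returns the value of the "page" query parameter, or -1 if it is missing. */
    static int lastPage(String href) {
        String query = URI.create(href).getQuery();
        if (query == null) {
            return -1;
        }
        for (String pair : query.split("&")) {
            String[] kv = pair.split("=", 2); // split into name and value only
            if (kv.length == 2 && kv[0].equals("page")) {
                return Integer.parseInt(kv[1]);
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // The href value from the question's HTML.
        String href = "http://www.abcd.com/symbol/ctsh/institutional-holdings?page=53";
        System.out.println(lastPage(href));
    }
}
```

The returned number can then serve as the loop bound in the WebDriver script.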
I am able to get the response from a servlet and show it on a JSP page, but when I try to populate a drop-down with the same data, it does not work.
Servlet code
String sql = "SELECT records from department";
ResultSet rs = s.executeQuery(sql);
Map<String, String> options = new LinkedHashMap<String, String>();
while (rs.next()) {
options.put(rs.getString("records"),rs.getString("records"));
}
String json = new Gson().toJson(options);
response.setContentType("application/json");
response.setCharacterEncoding("UTF-8");
response.getWriter().write(json);
JS code in the JSP
<script type="text/javascript">
$(document).ready(function () { // When the HTML DOM has finished loading, execute the following function...
$('.btn-click').click(function () { // Locate HTML DOM elements with class "btn-click" and assign the following function to their "click" event...
$.get('/testservlet', function (responseJson) { // Execute Ajax GET request on URL "/testservlet" and execute the following function with the Ajax response JSON...
//alert(responseJson);
var $select = $('#maindiv'); // Locate HTML DOM element with ID "maindiv".
$select.find('option').remove(); // Find all child elements with tag name "option" and remove them (just to prevent duplicate options when the button is pressed again).
$.each(responseJson, function (key, value) { // Iterate over the JSON object.
$('<option>').val(key).text(value).appendTo($select); // Create HTML <option> element, set its value to the currently iterated key and its text content to the currently iterated value, and finally append it to the located element.
});
});
});
});
</script>
HTML code
<input type="button" class="btn-click" id="best" value="check"/>
<div id="maindiv" style="display: block"></div>
If I create a <ul> with <li> elements, I can show the data from the response on my page, but I am not able to create the select options. Any help on this would be great.
Try again after removing the leading /.
$.get('testservlet', function (responseJson)
The JSON string is not in the proper format. It should be something like this. Why are you using a JSON string here when you are passing only records as both key and value?
Simply return a comma-separated string from the servlet and split it in jQuery.
Find an example here for iterating over a comma-separated string
Sample code:
var items = responseJson.split(',');
for ( var i = 0; i < items.length; i++) {
$('<option>').val(items[i]).text(items[i]).appendTo($select);
}
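On the servlet side, the comma-separated response suggested above might be produced along these lines. This is a minimal Java sketch that assumes the records have already been read into a list and that no record contains a comma; the class name, method name, and sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DepartmentResponse {

    /** Joins the records into one comma-separated line for the Ajax response. */
    static String toCsv(List<String> records) {
        // Assumption: no record contains a comma; otherwise JSON is the safer format.
        return String.join(",", records);
    }

    public static void main(String[] args) {
        // Illustrative values only; in the servlet these come from the ResultSet.
        List<String> records = new ArrayList<>();
        records.add("Sales");
        records.add("Engineering");
        // In the servlet this string would be passed to response.getWriter().write(...).
        System.out.println(toCsv(records));
    }
}
```

With this approach the response content type would be plain text rather than application/json, so that jQuery hands the callback a string that can be split.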
I am trying to extract "Know your tractor" and "Shell Petroleum Company.1955". Bear in mind that this is just a snippet of the whole page and there is more than one H2/H3 tag; I would like to get the data from all the H2 and H3 tags.
Here's the HTML: http://i.stack.imgur.com/Pif3B.png
The code I have so far is:
ArrayList<String> arrayList = new ArrayList<String>();
Document doc = null;
try{
doc = Jsoup.connect("http://primo.abdn.ac.uk:1701/primo_library/libweb/action/search.do?dscnt=0&scp.scps=scope%3A%28ALL%29&frbg=&tab=default_tab&dstmp=1332103973502&srt=rank&ct=search&mode=Basic&dum=true&indx=1&tb=t&vl(freeText0)=tractor&fn=search&vid=ABN_VU1").get();
Elements heading = doc.select("h2.EXLResultTitle span");
for (Element src : heading) {
String j = src.text();
System.out.println(j); //check whats going into the array
arrayList.add(j);
}
} catch (IOException e) {
e.printStackTrace();
}
How would I extract "Know your tractor" and "Shell Petroleum Company.1955"? Thanks for your help!
Your selector only selects <span> elements that are inside <h2 class="EXLResultTitle">, while you actually need those <h2> elements themselves. So, just remove span from the selector:
Elements headings = doc.select("h2.EXLResultTitle");
for (Element heading : headings) {
System.out.println(heading.text());
}
You should be able to figure the selector for <h3 class="EXLResultAuthor"> yourself based on the lesson learnt.
See also:
Jsoup cookbook - CSS selectors
Jsoup Selector API documentation