I am trying to get the URLs of some images from a webpage, but I'm having problems. Using try.jsoup.org to parse the HTML with the CSS query img, I get this result:
<img src="https://d5nxst8fruw4z.cloudfront.net/atrk.gif?account=JwbPi1a4ZP00iy" style="display:none" height="1" width="1" alt="" />
<img src="http://ads.tamtay.vn/www/delivery/avw.php?zoneid=226&cb=INSET_RANDOM_NUMBER_HERE&n=aa2b62d0" border="0" alt="" />
<img src="http://a0.ttimg.vn/866392.ava" style="width: 100%;" />
I know getting these URLs is normally very easy with attr("abs:src"), but in this case it doesn't work and returns null.
I tried the same code against a different webpage and it works normally, so I think the problem comes from the webpage, not the code. Can anyone help?
Why did you use "abs"? Try with only "src".
JSOUP documentation
Here is the code:
private class Title extends AsyncTask<Void, Void, Void> {
    @Override
    protected void onPreExecute() {
        super.onPreExecute();
    }
    @Override
    protected Void doInBackground(Void... params) {
        try {
            // Connect to the web site
            Document document = Jsoup.connect("http://photo.tamtay.vn").get();
            Element image = document.select("img").first();
            Log.d("Image", image.attr("abs:src"));
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
    @Override
    protected void onPostExecute(Void result) {
    }
}
image.attr("abs:src") returns null.
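For context on the abs: prefix: in jsoup, attr("abs:src") resolves the src value against the document's base URI and, according to the jsoup documentation, gives back an empty string (not null) when no absolute URL can be built. The resolution step itself can be sketched with the JDK alone. This is only an illustration, not jsoup's implementation; the class name AbsUrlSketch and the path /images/a.png are made up for the example.

```java
import java.net.URI;

// Hypothetical helper, not part of jsoup: mimics what an "abs:" lookup does
// by resolving an attribute value against the page's base URI.
class AbsUrlSketch {
    static String resolve(String baseUri, String attrValue) {
        if (attrValue == null || attrValue.isEmpty()) {
            return ""; // nothing to resolve
        }
        try {
            return URI.create(baseUri).resolve(attrValue).toString();
        } catch (IllegalArgumentException e) {
            return ""; // base or value is not a valid URI reference
        }
    }

    public static void main(String[] args) {
        // The base URI and the relative path are assumptions for illustration.
        String base = "http://photo.tamtay.vn/index.html";
        System.out.println(resolve(base, "/images/a.png"));
        System.out.println(resolve(base, "http://a0.ttimg.vn/866392.ava"));
    }
}
```

An already absolute src, like the ones in the try.jsoup.org output above, passes through unchanged, so "src" and "abs:src" should agree for those images.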
I'm trying to get the whole page body from youtube.com, but for some reason I only get about a quarter of it.
Can somebody help me out here?
Here's the code:
private static String data;
@Override
protected Void doInBackground(Void... voids) {
    try {
        Document doc = Jsoup.connect("https://www.youtube.com/results?search_query=Mettalica").get();
        data = doc.body().html();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null; // the method is declared to return Void, so it must return null
}
@Override
protected void onPostExecute(Void aVoid) {
    // Basically print the HTML results of the YouTube search
    super.onPostExecute(aVoid);
    Log.d(TAG, data);
}
I think the Document object has the full HTML as well. You may need to dig deeper to find a better way, but doc.outerHtml() should do the job for you. The screenshot also shows this object's state in debug mode, for comparison with View Source for the URL.
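One thing worth ruling out before blaming Jsoup: Logcat truncates long log entries (roughly 4,000 characters each), so Log.d(TAG, data) can display only part of a string that is actually complete. Below is a minimal sketch of splitting the text before logging. LogChunks and the chunk size are illustrative assumptions, and System.out stands in for Log.d so the example runs off-device.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: cuts a long string into pieces small enough to log.
class LogChunks {
    static List<String> split(String text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<String> chunks = new ArrayList<>();
        for (int start = 0; start < text.length(); start += chunkSize) {
            int end = Math.min(text.length(), start + chunkSize);
            chunks.add(text.substring(start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // On Android, each piece would go to Log.d(TAG, piece) instead.
        for (String piece : split("abcdefghij", 4)) {
            System.out.println(piece);
        }
    }
}
```

If every chunk shows up in the log, the document was fetched in full and only the logging was cutting it short.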
I successfully retrieved specific text from a website with Jsoup. But is it possible to style the text with CSS? Below you find my code for retrieving text from a website.
public class connect extends AsyncTask<Void, Void, Void> {
    String string;
    @Override
    protected Void doInBackground(Void... voids) {
        try {
            Document document = Jsoup.connect("MY_URL").get();
            Elements elements = document.select("div.MY_DIV_CLASS");
            string = elements.text();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
    @Override
    protected void onPostExecute(Void aVoid) {
        super.onPostExecute(aVoid);
        webView.loadData(string, "text/html", "UTF-8");
    }
}
Thank you in advance.
Once you have selected an element or elements, you can style them by adding a new class:
elements.addClass("your-class");
or by adding your own style attribute:
elements.attr("style", "text-align: center; color: red;");
These changes are saved in the Document object, so to use the updated HTML you will probably want the output of document.html().
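Note that elements.text() in the question strips all markup, so an added class or style attribute only survives if the HTML itself (for example elements.outerHtml() or document.html()) is what reaches the WebView. As a rough sketch, the selected fragment could be wrapped in a small page that carries its own style block. HtmlWrapper, wrap, and the CSS rule are hypothetical names chosen for illustration.

```java
// Hypothetical helper: wraps an HTML fragment in a minimal page with inline CSS.
class HtmlWrapper {
    static String wrap(String fragmentHtml, String css) {
        return "<!DOCTYPE html><html><head><meta charset=\"UTF-8\">"
                + "<style>" + css + "</style></head><body>"
                + fragmentHtml + "</body></html>";
    }

    public static void main(String[] args) {
        // The fragment stands in for the output of elements.outerHtml().
        String fragment = "<div class=\"your-class\">Hello</div>";
        String css = ".your-class { text-align: center; color: red; }";
        System.out.println(wrap(fragment, css));
    }
}
```

The resulting string could then be passed to webView.loadData(page, "text/html", "UTF-8") in onPostExecute.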
I need to get the value "8.32" from the "rnicper", "36 mg" from "rnstr" and "20/80 PG/VG" from "nirat".
<div class="recline highlight" id="rnic">
<div class="rlab"><span class="nopr indic indic-danger"></span>Nicotine juice <span id="rnstr">36 mg</span> (<span id="nirat">20/80 PG/VG</span>)</div>
<div class="runit" id="rnicml">2.08</div>
<div class="rdrops" id="rnicdr">73</div>
<div class="rgrams" id="rnicg" style="display: none;">2.53</div>
<div class="rpercent" id="rnicper">8.32</div><br>
</div>
I tried various methods, but nothing happens.
doc.getElementById("rnicper").outerHtml();
doc.getElementById("rnicper").text();
doc.select("div#rnicper");
doc.getElementsByAttributeValue("id", "rnicper").text();
Tell me, please, how can I get this information using JSOUP?
Update for Chintak Patel
AsyncTask asyncTask = new AsyncTask() {
    @Override
    protected Object doInBackground(Object[] objects) {
        Document doc = null;
        try {
            doc = Jsoup.connect("http://e-liquid-recipes.com/recipe/2254223/RY4D%20Vanilla%20Swirl%20DL").get();
        } catch (IOException e) {
            e.printStackTrace();
        }
        String content = doc.select("div[id=rnicper]").text();
        Log.d("content", content);
        return null;
    }
};
asyncTask.execute();
The values you are trying to get are not part of the initial HTML; they are set by JavaScript after the page has loaded.
Jsoup only fetches the static HTML and does not execute JavaScript.
To get what you want, you can use a tool like HtmlUnit or Selenium.
HtmlUnit example:
try (final WebClient webClient = new WebClient()) {
    webClient.getOptions().setThrowExceptionOnScriptError(false);
    final HtmlPage page = webClient
            .getPage("http://e-liquid-recipes.com/recipe/2254223/RY4D%20Vanilla%20Swirl%20DL");
    System.out.println(page.getElementById("rnicper").asText());
}
Write the following class inside your Activity class and do the work with Jsoup. This code gets the current version from the Play Store website. You can change the URL and put div[id=rnicper] in the select() call, then handle the result in the onPostExecute() method.
private class GetVersionCode extends AsyncTask<Void, String, String> {
    @Override
    protected String doInBackground(Void... voids) {
        String newVersion = null;
        try {
            newVersion = Jsoup.connect("https://play.google.com/store/apps/details?id=" + MainActivity.this.getPackageName() + "&hl=en")
                    .timeout(30000)
                    .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                    .referrer("http://www.google.com")
                    .get()
                    .select("div[itemprop=softwareVersion]")
                    .first()
                    .ownText();
            return newVersion;
        } catch (Exception e) {
            return newVersion;
        }
    }
    @Override
    protected void onPostExecute(String onlineVersion) {
        super.onPostExecute(onlineVersion);
        if (onlineVersion != null && !onlineVersion.isEmpty()) {
            if (Float.valueOf(currentVersion) < Float.valueOf(onlineVersion)) {
                showAlertDialogForUpdate(currentVersion, onlineVersion);
            }
        }
        Log.e("update", "Current version " + currentVersion + "playstore version " + onlineVersion);
    }
}
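One caveat about the Float.valueOf comparison above: it throws NumberFormatException for versions with more than one dot (such as 1.2.3), and it orders 1.10 below 1.9. Here is a minimal sketch of a segment-by-segment comparison, assuming purely numeric, dot-separated version strings. VersionCompare is a hypothetical helper, not part of the original answer.

```java
// Hypothetical helper: compares dotted numeric versions one segment at a time.
class VersionCompare {
    // Returns a negative number, zero, or a positive number when a is
    // older than, equal to, or newer than b. Missing segments count as 0.
    static int compare(String a, String b) {
        String[] left = a.trim().split("\\.");
        String[] right = b.trim().split("\\.");
        int length = Math.max(left.length, right.length);
        for (int i = 0; i < length; i++) {
            int l = i < left.length ? Integer.parseInt(left[i]) : 0;
            int r = i < right.length ? Integer.parseInt(right[i]) : 0;
            if (l != r) {
                return Integer.compare(l, r);
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(compare("1.10", "1.9") > 0);
        System.out.println(compare("1.2", "1.2.0") == 0);
    }
}
```

In onPostExecute, the condition would then read compare(currentVersion, onlineVersion) < 0. Non-numeric segments (for example a "-beta" suffix) still throw NumberFormatException and would need separate handling.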
I am using GWT Java and I am trying to remove the PayPal Donate button (i.e., clear the RootPanel), which is part of an HTML form, when I move from the LoginView to another view. I found that I should use:
RootPanel.get("payPalDonate").clear();
RootPanel.get().clear();
RootPanel.get().getElement().setInnerHTML("");
This does clear the form so it does not appear on the next view; however, when the next view is displayed the buttons, hyperlinks and the browser back button on the view do not work.
The code is in the :
private void checkWithServerIfSessionIdIsStillLegal(String sessionID) {
    rpc = (DBConnectionAsync) GWT.create(DBConnection.class);
    ServiceDefTarget target = (ServiceDefTarget) rpc;
    String moduleRelativeURL = GWT.getModuleBaseURL() + "MySQLConnection";
    target.setServiceEntryPoint(moduleRelativeURL);
    AsyncCallback<Account> callback = new AuthenticationHandler<Account>();
    rpc.loginFromSessionServer(callback);
}
class AuthenticationHandler<T> implements AsyncCallback<Account> {
    @Override
    public void onFailure(Throwable caught) {
        RootPanel.get().add(new LoginView());
    }
    @Override
    public void onSuccess(Account result) {
        if (result == null) {
            RootPanel.get().add(new LoginView());
        } else {
            //if (result.getLoggedIn()) {
            RootPanel.get().clear();
            //RootPanel.get().add(new SelectPersonView());
            RootPanel.get().add(new LoginView());
            //} else {
            //RootPanel.get().add(new LoginView());
            //}
        }
    }
}
public void onValueChange(ValueChangeEvent<String> event) {
    RootPanel.get("payPalDonate").clear();
    RootPanel.get().clear();
    RootPanel.get().getElement().setInnerHTML("");
    //Get the historyToken value
    String historyToken = event.getValue();
    //Check the historyToken
    if (historyToken.startsWith("!"))
        historyToken = historyToken.substring(1);
    if (historyToken.length() == 0) {
        //Initial entry
        RootPanel.get().clear();
        RootPanel.get().add(new LoginView());
    } else if (historyToken.equals("login")) {
        RootPanel.get().clear();
        RootPanel.get().add(new LoginView());
    } else if (historyToken.equals("goToVideo")) {
        RootPanel.get().clear();
        Window.Location.replace("https://www.youtube.com/user/GlyndwrBartlett");
    } else if (historyToken.equals("goToMetawerx")) {
        RootPanel.get().clear();
        Window.Location.replace("https://www.metawerx.net/");
    } else if (historyToken.equals("goToPrivacy")) {
        RootPanel.get().clear();
        RootPanel.get().add(new SecurityAndPrivacyView());
    } else if ...
In the LoginView:
initWidget(verticalPanel);
RootPanel.get("payPalDonate");
In the html:
<div style="margin:auto" id="payPalDonate">
<form action="https://www.paypal.com/cgi-bin/webscr" method="post" target="_top">
<input type="hidden" name="cmd" value="_s-xclick">
<input type="image" src="https://www.paypalobjects.com/en_AU/i/btn/btn_donateCC_LG.gif" border="0" name="submit" alt="PayPal – The safer, easier way to pay online!">
<img alt="" border="0" src="https://www.paypalobjects.com/en_AU/i/scr/pixel.gif" width="1" height="1">
</form>
</div>
Typically in GWT the RootPanel is never cleared. When you start your app, you pass a container to the RootPanel, and then all the views are added to and removed from that container.
Personally, I use Activities and Places pattern for all of my apps. This link offers an example of how to change views within a main container.
I am reading the link provided by Andrei. In the meantime, I found the issues were being caused by:
RootPanel.get().getElement().setInnerHTML("");
I tried:
RootPanel.getBodyElement().removeChild(RootPanel.get("payPalDonate").getElement());
However, this caused the same issue. In the end I found this: https://groups.google.com/forum/#!topic/google-web-toolkit/zVvY39blkY4
So I replaced the offending code with:
RootPanel.get("payPalDonate").setVisible(false);
And I placed the code in the LoginView just before I pass control to another view. Not the most elegant; however, it works until I digest the information provided by Andrei.
I'm trying to parse data from this table. Let's say, for example, that I want to parse the second element from the second row (called SLO).
I can see there is a TR inside TR and the SLO word doesn't even have an ID or anything. How can I parse this?
This is the code:
class Title extends AsyncTask<Void, Void, Void> {
    @Override
    protected void onPreExecute() {
        super.onPreExecute();
        tw1.setText("Loading...");
    }
    @Override
    protected Void doInBackground(Void... params) {
        try {
            Document doc = Jsoup.connect("https://www.easistent.com/urniki/cc45c5d0d303f954588402a186f5cdba5edb51d6/razredi/16515").get();
            Elements eles = doc.select("");
            title = eles.toString();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
        return null;
    }
    @Override
    protected void onPostExecute(Void result) {
        super.onPostExecute(result);
        tw1.setText(title);
    }
}
I don't know what to put in the doc.select(""); because I've never parsed something like this. I've only parsed titles of webpages and such. Could someone help me with this?
There is plenty of information there for you to use, for example class names or title attributes. The URL you provided won't work for me, and I can't copy paste the HTML from your image so my example will show just the parsing of the span based on its title:
String html = "<span title='Slovenscina'>SLO</span>";
Document doc = Jsoup.parse(html);
Elements eles = doc.select("span[title=Slovenscina]");
String title = eles.text();
System.out.println(title);
Will output:
SLO
This will also work within the rest of the HTML that you provided. I suggest you read more about Jsoup's selector syntax.