Getting raw HTML / adding Javascript to WebView pages


I have a mobile registration flow that I want to use in my app, and I do not control its source.
I am looking for a way to grab a few pieces of data when the user finishes registering (confirmation number, etc.) and send them to my app, ideally via Android's addJavascriptInterface.
I am certain of one thing: the id of the element I need. The flow could change, and is already a few pages long, so I'm looking for a general solution. The basic idea I'm hoping for is this:
Inject a JavaScript snippet into each page during shouldOverrideUrlLoading, which automatically calls my JavaScript interface with the value of the field whose id I'm looking for (or just returns the entire HTML, and I'll do the rest in Java).
view.setWebViewClient(new WebViewClient() {
    @Override
    public boolean shouldOverrideUrlLoading(WebView view, String url) {
        // inject JavaScript here to get the value
        return false;
    }
});
I've seen tutorials on using addJavascriptInterface, but they all seem to assume some control or understanding of the 'single' page that will be navigated to. Since I have a potentially lengthy flow (3+ pages) and no control over the source, I am interested in hearing any suggestions.

Check the URL of the current page and, if it's the right one, insert JavaScript into the current page. With the help of a Java-JS binding you can get the data out. See my previous answer: Determine element of url from webview shouldOverRideUrlLoading
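As a rough sketch of the injection step: the helper below only builds the javascript: URL, so it runs without the Android SDK. The class name JsInjector, the bridge name HtmlOut, the method name onValue and the element id confirmationNumber are all made up for illustration. Note that shouldOverrideUrlLoading fires before the new page has loaded, so the usual place to inject is onPageFinished, as shown in the comments.

```java
import java.util.regex.Pattern;

public final class JsInjector {
    // Conservative whitelist so the id can be embedded in the script without escaping.
    private static final Pattern SAFE_ID = Pattern.compile("[A-Za-z][A-Za-z0-9_:.\\-]*");
    // Plain JavaScript identifier, used for the bridge object and its method.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    private JsInjector() {}

    /**
     * Builds a javascript: URL that looks up an element by id and, if it exists,
     * passes its value (or text content) to a method of an object that was
     * exposed to the page with addJavascriptInterface.
     */
    public static String buildExtractScript(String elementId, String bridgeName, String methodName) {
        if (!SAFE_ID.matcher(elementId).matches()) {
            throw new IllegalArgumentException("unsupported element id: " + elementId);
        }
        if (!IDENTIFIER.matcher(bridgeName).matches() || !IDENTIFIER.matcher(methodName).matches()) {
            throw new IllegalArgumentException("bridge and method must be plain identifiers");
        }
        return "javascript:(function(){"
                + "var e=document.getElementById('" + elementId + "');"
                + "if(e){window." + bridgeName + "." + methodName + "(e.value||e.textContent);}"
                + "})()";
    }
}

// Hypothetical Android wiring (needs the Android SDK, so it is left as a comment):
//
//   webView.getSettings().setJavaScriptEnabled(true);
//   // Bridge is your own class; its onValue(String) method must carry @JavascriptInterface.
//   webView.addJavascriptInterface(new Bridge(), "HtmlOut");
//   webView.setWebViewClient(new WebViewClient() {
//       @Override
//       public void onPageFinished(WebView view, String url) {
//           view.loadUrl(JsInjector.buildExtractScript("confirmationNumber", "HtmlOut", "onValue"));
//       }
//   });
```

Because the script does nothing when the element is missing, it can be injected on every page of the flow, and the bridge method is called only on the page that actually contains the id.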

Related

Document.select("a[href]") not getting all the href

I am using Jsoup to fetch documents from a website.
Below is my code:
String webPageUrl = "https://mwcc.ms.gov/#/electronicDataInterchange";
Document doc = Jsoup.connect(webPageUrl).get();
Elements links = doc.getElementsByAttribute("a[href]");
The line below is not working. It is supposed to return elements but doesn't:
doc.getElementsByAttribute("a[href]")
Can someone please point out the mistake in my code?

That page seems to be an Angular application, which means it loads some (probably all or most) of its content via JavaScript.
The fact that the URL contains the fragment separator # is already a strong indicator of that, because when you make an HTTP request, everything after that separator is cut off (i.e. not sent to the server), so the actual request will just be for https://mwcc.ms.gov/.
As far as I know Jsoup does not support running JavaScript, so you might need to look into a more involved scraping tool (possibly one running a full browser engine).
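To see the fragment being cut off, here is a small sketch using only java.net.URI. The class name FragmentDemo is made up; the URL is the one from the question.

```java
import java.net.URI;
import java.net.URISyntaxException;

public final class FragmentDemo {
    private FragmentDemo() {}

    /** Returns the same URI without its fragment, i.e. what actually goes over the wire. */
    public static URI withoutFragment(URI uri) {
        try {
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(), uri.getQuery(), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI page = URI.create("https://mwcc.ms.gov/#/electronicDataInterchange");
        System.out.println(page.getFragment());     // prints /electronicDataInterchange
        System.out.println(withoutFragment(page));  // prints https://mwcc.ms.gov/
    }
}
```

Separately from the JavaScript problem, getElementsByAttribute expects an attribute name such as "href"; a CSS query like "a[href]" belongs in doc.select(...). On this page, though, either call would only see the mostly empty shell that the server returns.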

I am coding in Android Studio, and I need to fetch and display a specific line of data from a specific webpage

I am very new to coding in Java/Android Studio. I have everything set up that I have been able to figure out so far. I have a button, and I need to put code inside the button's click event that will fetch information from a website, convert it to a string and display it. I figured I would have to use the HTML source to do this, so I have installed the Jsoup HTML parser. All of the help with Jsoup I have found only takes me as far as getting the HTML into a "Document", and I am not sure that is the best way to accomplish what I need. Can anyone tell me what code to use to fetch the HTML from the website, search through it for a specific match, and convert that match to a string? Or can anyone tell me if there is a better way to do this? I only need to grab one piece of information and display it.
Here is the piece of html code that contains the value I want:
writeBidRow('Wheat',-60,false,false,false,0.5,'01/15/2015','02/26/2015','All',' ',' ',60,'even','c=2246&l=3519&d=G15',quotes['KEH15'], 0-0);
I need to grab and display whatever value quotes['KEH15'] represents in that HTML.
Thank you in advance for your help.
Keith

Grabbing raw HTML is an extremely tedious way to access information from the web, bad practice, and difficult to maintain in the case that wherever you are fetching the info from changes their HTML.
I don't know your specific situation and what the data is that you are fetching, but if there is another way for you to fetch that data via an API, use that instead.
Since you say you are pretty new to Android and Java, let me explain something I wish had been explained to me very early on (although I am mostly self taught).
The way people access information across the Internet is traditionally through HTML and JavaScript (which is interpreted by your browser like Chrome or Firefox to look pretty), which are transferred over the internet using the protocol called HTTP. This is a great way for humans to communicate with computers that are far away, and the average person probably doesn't realize that there is more to the internet than this--your browser and the websites you can go to.
Although there are multiple methods, for the purpose of what I think you're looking for, applications communicate over the internet a slightly different way:
When an android application asks a server for some information, rather than returning HTML and JavaScript which is intended for human consumption, the server will (traditionally) return what's called JSON (or sometimes XML, which is very similar). JSON is a very simple way to get information about an object, and put it into a form that is readable easily by both humans (developers) and computers, and can be transmitted over the internet easily. For example, let's say you ask a server for some kind of "Video" object for an app that plays video, it may give you something like this:
{
  "name": "Gangnam Style",
  "metadata": {
    "url": "https://www.youtube.com/watch?v=9bZkp7q19f0",
    "views": 2000000000,
    "ageRestricted": false,
    "likes": 43434,
    "dislikes": 124
  },
  "comments": [
    {
      "username": "John",
      "comment": "10/10 would watch again"
    },
    {
      "username": "Jane",
      "comment": "12/10 with rice"
    }
  ]
}
That is very readable by us humans, but also by computers! We know the name is "Gangnam Style", the link of the video, etc.
A super helpful way to work with JSON in Java and Android is Google's Gson library, which lets you serialize a Java object to JSON or parse JSON into a Java object.
To get this information in the first place, you have to make a network call to an API, Application Programming Interface. Just a fancy term for communication between a server and a client. One very cool, free, and easy to understand API that I will use for this example is the OMDB API, which just spits back information about movies from IMDB. So how do you talk to the API? Well luckily they've got some nice documentation, which says that to get information on a movie we need to use some parameters in the url, like perhaps
http://www.omdbapi.com/?t=Interstellar
They want a title with the parameter "t". We could put a year, or return type, but this should be good to understand the basics. If you go to that URL in your browser, it spits back lots of information about Interstellar in JSON form. That stuff we were talking about! So how would you get this information from your Android application?
Well, you could use Android's built-in HttpURLConnection class and spend a few hours researching why your calls aren't working. But doesn't essentially every app now use networking? Why reinvent the wheel when virtually every valuable app out there has probably done this work before? Perhaps we can find some code online to do this work for us.
Or even better, a library! In particular, Retrofit, an open source library developed by Square. There are multiple libraries like it (go ahead and research them; it's best to find the right fit for your project), but the idea is that they do the hard work, like low-level network programming, for you. Following their guides, you can reduce a lot of code to just a few lines. So for our OMDB API example, we can set up our network calls like this:
// OMDB API (Retrofit 1.x)
import com.google.gson.JsonObject;
import retrofit.Callback;
import retrofit.RestAdapter;
import retrofit.http.GET;
import retrofit.http.Query;

public class ApiClient {
    // cached instance of the API interface
    private static OmdbApiInterface sOmdbApiInterface;

    // if the interface has already been built, return it; if not, build it and then return it
    public static OmdbApiInterface getOmdbApiClient() {
        if (sOmdbApiInterface == null) {
            RestAdapter restAdapter = new RestAdapter.Builder()
                    .setEndpoint("http://www.omdbapi.com")
                    .build();
            sOmdbApiInterface = restAdapter.create(OmdbApiInterface.class);
        }
        return sOmdbApiInterface;
    }

    public interface OmdbApiInterface {
        @GET("/")
        void getInfo(@Query("t") String title, Callback<JsonObject> callback);
    }
}
Once you have read their documentation and understand what's going on up there, you can use this class anywhere in your application to call the API:
// you could get a user input string and pass it in as movieName
ApiClient.getOmdbApiClient().getInfo(movieName, new Callback<JsonObject>() {
    // the nice thing here is that Retrofit deals with the JSON for you, so you can read values straight from the JSON object
    @Override
    public void success(JsonObject movie, Response response) {
        Log.i("TAG", "Movie name is " + movie.get("Title").getAsString());
    }

    @Override
    public void failure(RetrofitError error) {
        Log.e("TAG", error.getMessage());
    }
});
Now you've made an API call to get info from across the web! Congratulations! Now do what you want with the data. In this case we used Omdb but you can use anything that has this method of communication. For your purposes, I don't know exactly what data you are trying to get, but if it's possible, try to find a public API or something where you can get it using a method similar to this.
Let me know if you've got any questions.
Cheers!

As @caleb-allen said, if an API is available to you, it's better to use that.
However, I'm assuming that the web page is all you have to work with.
There are many libraries that can be used on Android to get the content of a URL.
Choices range from the bare-bones HttpURLConnection to the slightly higher-level HttpClient to robust libraries like Retrofit. I personally recommend Retrofit. Whatever you do, make sure that your HTTP access is asynchronous and not done on the UI thread. Retrofit will handle this for you by default.
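If you go the bare-bones route, a minimal sketch of an off-the-UI-thread fetch could look like the following. The class name PageFetcher and the 10-second timeouts are my own choices, UTF-8 is assumed for the page encoding, and CompletableFuture needs API level 24 or later on Android.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

public final class PageFetcher {
    private PageFetcher() {}

    /** Reads a stream to the end and decodes it as UTF-8 (assumed encoding). */
    public static String readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Blocking GET. Never call this directly on the UI thread. */
    public static String fetch(String pageUrl) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setConnectTimeout(10_000);
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return readAll(in);
        } finally {
            conn.disconnect();
        }
    }

    /** Runs fetch on the given background executor and completes the future with the page body. */
    public static CompletableFuture<String> fetchAsync(String pageUrl, Executor background) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return fetch(pageUrl);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, background);
    }
}
```

On Android you would still have to hop back to the main thread (for example with runOnUiThread) before touching any views with the result.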
For parsing the results, I've had good results in the past using the open-source HTMLCleaner library - see http://htmlcleaner.sourceforge.net
Similar to JSoup, it takes a possibly-badly-formed HTML document and creates a valid XML document from it.
Once you have a valid XML document, you can use HTMLCleaner's implementation of the XML DOM to parse the document to find what you need.
Here, for example, is a method that I use to parse the names of 'projects' from a <table> element on a web page where projects are links within the table:
import java.util.ArrayList;
import java.util.List;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

private List<Project> parseProjects(String html) throws Exception {
    List<Project> parsedProjects = new ArrayList<Project>();
    HtmlCleaner pageParser = new HtmlCleaner();
    TagNode node = pageParser.clean(html);
    // select the table that lists the projects
    String xpath = "//table[@class='listtable']";
    Object[] tables = node.evaluateXPath(xpath);
    TagNode tableNode;
    if (tables.length > 0) {
        tableNode = (TagNode) tables[0];
    } else {
        throw new Exception("projects table not found in html");
    }
    // every link inside the table is one project
    TagNode[] projectLinks = tableNode.getElementsByName("a", true);
    for (int i = 0; i < projectLinks.length; i++) {
        TagNode link = projectLinks[i];
        String projectName = link.getText().toString();
        String href = link.getAttributeByName("href");
        // the project id is whatever follows the first '=' in the link
        String projectIdString = href.split("=")[1];
        int projectId = Integer.parseInt(projectIdString);
        Project project = new Project(projectId, projectName);
        parsedProjects.add(project);
    }
    return parsedProjects;
}
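The line in the question sits inside a script call rather than in the markup, so a DOM or XPath query will not reach it. For that case, a regular expression over the raw page text may be enough. The sketch below uses only java.util.regex; the class name BidRowParser is made up, and it assumes the call keeps the writeBidRow(... quotes['KEY'] ...) shape shown in the question. It only pulls out the key (KEH15); the number behind that key is assigned elsewhere in the page's script, which the question does not show, so it would need a second pattern written against that code.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class BidRowParser {
    // Finds a writeBidRow(...) call and captures the key used inside quotes['...'].
    // [^;]*? keeps the match within one statement.
    private static final Pattern QUOTE_KEY =
            Pattern.compile("writeBidRow\\([^;]*?quotes\\['([^']+)'\\]");

    private BidRowParser() {}

    /** Returns the key of the first quotes['...'] argument found in a writeBidRow call, if any. */
    public static Optional<String> findQuoteKey(String html) {
        Matcher m = QUOTE_KEY.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Like any scraping code, this breaks as soon as the site changes how it writes that call, so treat it as a stopgap.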

If you have permission to edit the web page, you can add an anchor at the line you are interested in and jump straight to it.
First, in your page, mark the start of the line you want to jump to with an element whose id is target (the heading text is optional).
Then, in your app's click handler, load the page with that id as the URL fragment:
this.mWebView.loadUrl("https:#######.com.html#target");
Put the address of your web page to the left of the final # and the id after it; in this example the id is target.
Excuse me if my English isn't good.

Make gwt website crawlable without hash symbol?

In GWT we need to use # in a URL to navigate from one page to another, i.e. to create history entries, e.g. www.abc.com/#questions/10245857. Because of this I am facing a problem when sharing the URL: Google's crawlers read only the part of the URL before the #, i.e. www.abc.com.
Now I want to remove the # from my URL and keep it straight, as www.abc.com/question/10245857.
I am unable to do so. How can I do this?
When the user navigates the app I use the hash URLs and the History object (so as not to reload the pages). However, sometimes it's nice or necessary to have a pretty URL (e.g. for sharing, showing in public, etc.), so I would like to know how to provide the pretty URL of the same page.
Note:
We have to do this to make our web pages' URLs crawlable and to link the website with the outside world.

There are 3 issues here, and each can be solved:
1. The URL should appear prettier to the user.
2. Going directly to the pretty URL should work.
3. Web crawlers should be able to get the content.
These may all seem like the same issue, but they are quite distinct in this context.
Display Pretty URLs
Can be done with a small javascript file which uses HTML5 state methods. You can see a simple demo here, with source here. This makes all changes to "#" appear without the "#" (on modern browsers).
Relevant code from the fiddle:
var stateObj = {locationHash: hash};
history.replaceState(stateObj, "Page Title", baseURL + hash.substring(1));
Respond to Pretty URLs
This is relatively simple, as long as you already have a listener in GWT that loads content based on the "#" at page load. You can just put up a simple redirect servlet which reinserts the "#" where it belongs when requests come in.
For a servlet, listening for the pretty URL:
if(request.getPathInfo()!=null && request.getPathInfo().length()>1){
response.sendRedirect("#" + request.getPathInfo());
return;
}
Alternatively, you can serve up your GWT app directly from this servlet, and initialize it with parameters from the URL, but there's a bit of relative-path bookkeeping to be aware of.
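The mapping between the two URL forms is small enough to keep in one place. Here is a sketch in plain Java; the class name PrettyUrls is made up, and it assumes the app is served from the site root and that history tokens have no leading slash (as in www.abc.com/#questions/10245857 from the question), so it drops the slash that the redirect snippet above keeps.

```java
public final class PrettyUrls {
    private PrettyUrls() {}

    /** Maps servlet path info such as "/question/10245857" to "/#question/10245857". */
    public static String toHashUrl(String pathInfo) {
        if (pathInfo == null || pathInfo.length() <= 1) {
            return "/"; // nothing after the slash: just load the app
        }
        return "/#" + pathInfo.substring(1);
    }

    /** Maps a history token such as "#question/10245857" to "/question/10245857". */
    public static String toPrettyPath(String hash) {
        if (hash == null || hash.isEmpty() || hash.equals("#")) {
            return "/";
        }
        String token = hash.charAt(0) == '#' ? hash.substring(1) : hash;
        return "/" + token;
    }
}
```

In the redirect servlet this would be used as response.sendRedirect(PrettyUrls.toHashUrl(request.getPathInfo())), and toPrettyPath mirrors what the replaceState snippet does on the client.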
WebCrawlers
This is the trickiest one. Basically you can't get around having static(ish) pages here. That's not too hard if there is a finite set of simple states that you're indexing. One simple scheme is to have a separate servlet which returns the raw content you normally fetch with GWT, in minimally formatted HTML. This servlet can have a different URL pattern like "/indexing/". These pages aren't meant for humans, just for the web crawlers. You can attach a simple script in the <head> to redirect users to the pretty URL once the page loads.
Here's an example for the doGet method of such a servlet:
response.setContentType("text/html;charset=UTF-8");
response.setStatus(200);
PrintWriter pw = response.getWriter();
pw.println("<html>");
pw.println("<head><script>");
pw.println("window.location.href='http://www.example.com/#"
+ request.getPathInfo() + "';");
pw.println("</script></head>");
pw.println("<body>");
pw.println(getRawPageContent(request.getPathInfo()));
pw.println("</body>");
pw.println("</html>");
pw.flush();
pw.close();
return;
You should then just have some links to these indexing pages hidden somewhere on your main app URL (or behind a link on your main app URL).
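One caveat with the servlet above: it writes request.getPathInfo() straight into a script, so a crafted URL could inject JavaScript into the indexing page. A possible safeguard, sketched here with a made-up helper name (JsStrings.escape), is to escape everything outside a small whitelist before concatenating:

```java
public final class JsStrings {
    private JsStrings() {}

    /**
     * Escapes a value for use inside a quoted JavaScript string literal.
     * Letters, digits and a few URL-safe characters pass through unchanged;
     * everything else (quotes, angle brackets, backslashes, ...) becomes a
     * backslash-u escape with four hex digits.
     */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == '/' || c == '-' || c == '_' || c == '.';
            if (safe) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }
}
```

The redirect line would then concatenate JsStrings.escape(request.getPathInfo()) instead of the raw path. The same care applies to whatever getRawPageContent returns if it can contain user-supplied text.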

How do I invoke a template from another template in Play Framework?

I have a template accountlist.scala.html looking like this:
@(accounts: models.domain.AccountList)
@titlebar = {<p>some html</p>}
@content = {
    @for(account <- accounts) {
        <p>@account.name</p>
    }
}
@main(titlebar)(content)
... and another template account.scala.html like this:
@(account: models.domain.Account)
@titlebar = {<p>@account.name</p>}
@content = {
    @for(transaction <- account.getTransactions()) {
        <p>@transaction.detail</p>
    }
}
@main(titlebar)(content)
From both of them I am invoking the template main.scala.html.
I have access to the entire Account POJO in the first view, accountlist.scala.html, so there is really no need for me to call the server to get the details of the account when I go to the view that displays them. I would just like to change the view on the client side. How could I call the second view, account.scala.html, from accountlist.scala.html when a user clicks on an account in the list? I am ready to change the templates as needed.

I have provided a previous answer, which is still available at the end of this post. From your comments, however, I understand that you are asking for something else without understanding how dangerous it is.
There are three ways of handling your use case; let's start with the worst one.
A stateful web application
If you have loaded data into a POJO from some data source and you want to reuse that POJO across multiple requests, what you are really trying to do is keep client state on the server, such as a cache. Web applications were developed this way for a long time, and it was a major source of bugs. What happens if the underlying account data in the database is updated between one HTTP request and the next? Your client won't see it, because it uses cached data. Additionally, what happens when the user presses Back in the browser? How do you get notified on the server side so that you can keep track of where the user is in the navigation flow? For these and other reasons, Play! is designed to be stateless. If you are serious about web applications, you should probably read about the REST architectural style.
A stateless web application
In a stateless web application, you are not allowed to keep data between two HTTP requests, so you have two ways to handle it:
Generate the user interface in a single shot
This is the approach you can use when the account data is small. You embed all the necessary data for each account in the page and generate the detail views up front, keeping them hidden and showing one only when the user clicks. Note that you can either generate the HTML on the server and use JavaScript to make only certain parts of the DOM visible, or transfer just a JSON representation of your accounts and use a templating library to build the necessary UI directly on the client.
Generate the user interface when required
This approach becomes necessary when the account data structure contains too much information and you don't want to transfer all of it for every account to the client up front. For example, if you know the user is going to be interested in the details of only a few accounts, you want to request the details only when the user asks for them.
For example, in your list of accounts you will have a button associated with each account, called details, and you will use the account id to send a new request to the server.
@(accounts: models.domain.AccountList)
@titlebar = {<p>some html</p>}
@content = {
    @for(account <- accounts) {
        <p>@account.name <button class="details" href="@routes.Controllers.account(account.id)">details</button></p>
    }
}
Please note that you can also generate the user interface on the client side, but you will still need to retrieve the data structures from the server when the user clicks on the button. This ensures that the user sees the latest available state of the account.
Old answer
If your goal is to reuse your views, Play views are nothing more than Scala classes, so you can import them:
@import packagename._
and then use one in another template:
@for(account <- accounts) {
    @account(account)
}

The question reveals a misunderstanding of Play Framework templates. Templates are compiled and rendered on the server; what reaches the browser is only the resulting HTML, CSS and JavaScript.
You cannot "invoke" or link to another template showing the account transactions from an href attribute of your account row. However, you can do either of the following:
1. In case you have loaded all transactions for all accounts to the client in one go: extend the template to generate a separate <div> section for each account showing its transactions. Also generate JavaScript to 1) hide the overview div and 2) show the specific transaction div when the user clicks on one of the accounts in the overview. See the knockout library proposed by Edmondo1984, or the accordion or tabs in Twitter Bootstrap.
2. In case you only load the account overview from the server: generate a link such as href="@routes.Controllers.account(account.id)" (see Edmondo1984's answer) and make another template to view this data.
Since the question concerned a case in which you got all data from the server, go with option 1.

How do I overwrite the URL in an IE address bar using RFT?

I need to execute the following steps:
1. Start an IE browser window and open a URL (done using StartBrowser(final String URL))
2. Start a session (done by logging in)
3. Now, I want to enter a different URL in the same browser window which has the same session.
My question is about Step 3: how can I overwrite the URL in the existing IE window?
Note: I am using a keyword driven framework written in java.

From the IBM RFT online help: You can use the loadURL() method of the browser object.
If you do not have the browser object already 'learned' into your object map, just record a click on the browser toolbar. Then you can modify that line to be Browser_htmlBrowser().loadURL("http://stackoverflow.com");

Thanks Tom. I agree that loadURL has the implementation to do what I need.
There is one more aspect that may interest others looking at this question: how the appropriate browser object is captured. The easiest way is to use RFT's record-and-click approach with the appropriate recognition properties. The other way is to find the existing browser on the fly when the method is called, irrespective of recognition properties, which may be more useful for some scenarios or frameworks, as is done below.
RootTestObject root = getRootTestObject();
TestObject[] testobj = root.find(atDescendant(".class", "Html.HtmlBrowser"));
BrowserTestObject bto;
bto = new BrowserTestObject(testobj[0]);
bto.loadUrl(curParamOne);
