GWT interface to Solr index - java

I have a Solr index on a remote server and need to create a search page interface. I am using GWT to code the pages and XMLHttpRequest to query the index and receive the response. The problem is the same-origin policy: it won't let JavaScript retrieve the remote XML data. Is there a workaround for this, preferably without using JSON?

A similar problem: Make GWT interact with ASP.NET Web Service
The answers there should also apply here.
Depending on the type of data that you want to send (that is, how "public" it is), JSONP might not be the best option: it's not the safest method of transport (pure JSON is safer, but to overcome the SOP you need the padding).
If you have a Java server on the server side, I'd go with GWT <-> servlet (acting as a proxy, on the same domain as your main app) <-> web service (any domain). As far as I can tell, that gives the safest and cleanest code.
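A minimal sketch of the forwarding step such a proxy servlet could perform, assuming Java 11+ and Solr's standard `/select` handler with the `q` and `wt` parameters. The class name, method names and the example URL are made up for illustration; the servlet wiring itself is left out.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: the part of a same-domain proxy that talks to Solr. */
final class SolrProxySketch {

    /** Builds the Solr select URL; the user's query is URL-encoded. */
    static String buildSelectUrl(String solrCoreUrl, String userQuery) {
        String encoded = URLEncoder.encode(userQuery, StandardCharsets.UTF_8);
        return solrCoreUrl + "/select?q=" + encoded + "&wt=xml";
    }

    /** Fetches the XML on the server side, where the same-origin policy does not apply. */
    static String fetchXml(String solrCoreUrl, String userQuery)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(buildSelectUrl(solrCoreUrl, userQuery)))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }
}
```

A servlet on the same domain as the GWT app would call something like `fetchXml("http://solr.example.com:8983/solr/core1", query)` and write the result to its own response, so the browser only ever talks to its own origin.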

Related

Exposing a web site through web services

I know what I am asking is somewhat weird. There is a web application (whose source code we don't have access to), and we want to expose a few of its features as web services.
I was thinking of using something like Selenium WebDriver, so that I simulate clicks on the application according to the web service request.
I want to know whether there is a better solution or pattern to do this.
I should mention that the application is written using Java, Spring MVC (it is not an SPA) and Spring Security. And there is a CAS server providing SSO.
There are multiple ways to implement it. In my opinion Selenium/PhantomJS is not the best option: if the site is properly designed, you can interact with it using only the HTML it serves, or even some API, without needing all the CSS or executing the asynchronous JavaScript requests. As your page is not an SPA, it's quite likely that an "API" already exists in the form of GET/POST requests, and you might be lucky enough that there's no CSRF protection.
First of all, you need to solve the authentication against the CAS. There are multiple types of authentication in OAuth, but you should get an API token that gives you access to the application. This token should be added as an HTTP header or cookie in every single request. Ideally this token shouldn't expire; otherwise you'll need to implement re-authentication logic in your app.
Once the authentication part is resolved, you'll need quite a lot of patience: open the target website with the web inspector of your preferred browser, go to the Network panel and execute the actions that you want to run programmatically. There you'll find each request with all its headers and content, and the response.
That's what you need to code. There are plenty of libraries to achieve that in Java. You can have a look at Jsoup if you need to parse HTML, but to run plain GET/POST requests, go for RestTemplate (in Spring) or the JAX-RS/Jersey 2 Client.
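As a rough sketch of replaying a request captured in the Network panel, here is a form POST rebuilt with the JDK's built-in `java.net.http` client (Java 11+) rather than RestTemplate or Jersey, just to keep the example dependency-free. The class name, URL, cookie and form field are invented for illustration; the real values come from what you observe in the browser.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper that rebuilds a captured form submission. */
final class ReplayRequestSketch {

    /** Rebuilds a form POST with the session cookie copied from the browser. */
    static HttpRequest buildFormPost(String url, String sessionCookie, String formBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(url))
                // The token or session cookie obtained during authentication.
                .header("Cookie", sessionCookie)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody))
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, and the returned HTML handed to Jsoup if it needs parsing.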
You might consider implementing a cache layer to increase performance if the result of the query is stable over time, or if you can assume that within, let's say, 5 minutes the response to the same query will be the same.
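A minimal sketch of such a cache with a time-to-live, assuming a single JVM and no eviction beyond expiry. The class name and the injected clock are illustrative choices; in a Spring app you would more likely plug in a cache provider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical time-to-live cache: entries older than ttlMillis are reloaded. */
final class TtlCacheSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final long storedAtMillis;

        Entry(V value, long storedAtMillis) {
            this.value = value;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so that tests can control time

    TtlCacheSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value if still fresh; otherwise calls the loader and stores the result. */
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now - cached.storedAtMillis < ttlMillis) {
            return cached.value;
        }
        // Not atomic: two threads may both load the same key, which is acceptable for a sketch.
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now));
        return fresh;
    }
}
```

For the 5-minute assumption above, it could be created as `new TtlCacheSketch<String, String>(300_000L, System::currentTimeMillis)`, with the loader being the call to the target web app.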
You can create your app in your favourite language/framework. I'd recommend starting with Spring Boot + MVC + DevTools. That contains all you need, plus Jsoup if you need to parse some HTML. Later on you can add the cache provider if needed.
We do something similar to access web banking on behalf of a user, scrape his account data and obtain a credit score. In most cases, we have managed to reverse-engineer mobile apps and sniff traffic to use undocumented APIs. In others, we have to fall back to web scraping.
There are two types of applications you may have to scrape:
Data is essentially the same for any user, like product listings on Amazon
Data is specific to each user, like in a banking app.
In the first case, you could have your scraper running and populating a local database, and use your local data to provide the web service. In the latter case, you cannot do that and you need to scrape the site on the user's request.
I understand from your explanation that you are in the latter case.
When web scraping you can find really difficult web apps:
Some may require you to send data from previous requests to the next
Others render most data on the client with JavaScript
If either of these is your case, Selenium will make your implementation easier, though not performant.
Implementing the first without Selenium will require lots of trial and error to get things working, because you will be simulating the requests and you will need to know what data is expected from the client. Whereas if you use Selenium, you will be executing the same interactions that you do with the browser and hence sending the expected data.
Implementing the second case requires your scraper to support JavaScript. AFAIK the best support is provided by Selenium. HtmlUnit claims to provide fair support, and I think Jsoup provides no JavaScript support.
Finally, if your solution takes too much time, you can mitigate the problem by providing your web service with a notification mechanism, similar to Webhooks or REST Hooks:
A client of your web service would make a request for data, providing a URI at which they would like to be notified when the results are ready.
Your service would respond immediately with an id for the request and start scraping the necessary info in the background.
If you use a skinny payload model, when the scraping is done, you store the response in your data store with an id identifying the original request. This response will be exposed as a resource.
You would execute an HTTP POST on the URI provided by the client. In the body of the request you would add the URI of the response resource.
The client can now GET the response resource and because the request and response have the same id, the client can correlate both.
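The bookkeeping behind these steps can be sketched as follows. This is an in-memory illustration only: the class name, the `/results/{id}` path and the JSON field names are assumptions, and the actual scraping and the HTTP POST to the callback are left out.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry for asynchronous scrape requests. */
final class ScrapeJobsSketch {

    private final Map<String, String> callbackByRequestId = new ConcurrentHashMap<>();
    private final Map<String, String> resultByRequestId = new ConcurrentHashMap<>();
    private final String baseUri;

    ScrapeJobsSketch(String baseUri) {
        this.baseUri = baseUri;
    }

    /** Registers the client's callback URI and returns the request id immediately. */
    String accept(String callbackUri) {
        String requestId = UUID.randomUUID().toString();
        callbackByRequestId.put(requestId, callbackUri);
        return requestId;
    }

    /** Stores the scraped result under the original request id and returns its resource URI. */
    String complete(String requestId, String scrapedResult) {
        resultByRequestId.put(requestId, scrapedResult);
        return resultUri(requestId);
    }

    /** URI under which the response resource is exposed. */
    String resultUri(String requestId) {
        return baseUri + "/results/" + requestId;
    }

    /** Skinny payload to POST to the callback: only the id and the resource URI, not the data. */
    String notificationBody(String requestId) {
        return "{\"requestId\":\"" + requestId + "\",\"resultUri\":\"" + resultUri(requestId) + "\"}";
    }

    /** Callback URI the client registered for this request. */
    String callbackFor(String requestId) {
        return callbackByRequestId.get(requestId);
    }

    /** What a GET on the response resource would return. */
    String result(String requestId) {
        return resultByRequestId.get(requestId);
    }
}
```

Because the same id appears in the initial reply, the notification and the resource URI, the client can match the result to its original request without the service pushing the data itself.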
Selenium isn't the best way to consume web services. Selenium is primarily an automation tool, largely used for testing applications.
Assuming the services are already developed, the first thing we need to do is authenticate the user's request.
This can be done by adding an HTTP header with the key "Authorization" and the value "Basic " + Base64Encode(username + ":" + password).
If the user is valid (the user's login credentials match the credentials on the server), generate a unique token, store it on the server mapped to the user ID, and set the same token in a response header or create a cookie containing the token.
By doing this we can avoid validating credentials for the following requests from the same user by just looking for the token in the request header or cookie.
If the services are designed to check the login every time, the "Authorization" header needs to be set on every request.
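A small sketch of both halves, assuming UTF-8 credentials and an in-memory token store. The class and method names are made up, and a real service would also expire tokens and run over HTTPS.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper for the Basic header and the follow-up token. */
final class BasicAuthSketch {

    private final Map<String, String> userIdByToken = new ConcurrentHashMap<>();

    /** Client side: value for the "Authorization" request header. */
    static String basicAuthHeader(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Server side: after the credentials were verified, issue a token mapped to the user id. */
    String issueToken(String userId) {
        String token = UUID.randomUUID().toString();
        userIdByToken.put(token, userId);
        return token;
    }

    /** Server side: resolve the token from a later request; returns null if it is unknown. */
    String userForToken(String token) {
        return token == null ? null : userIdByToken.get(token);
    }
}
```

For example, `BasicAuthSketch.basicAuthHeader("user", "pass")` yields `Basic dXNlcjpwYXNz`.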
I think using a WebDriver is a lot of overhead, but it depends on what you really want to achieve. With the info you provided, I would rather go with a RestTemplate implementation sending the appropriate HTTP messages to the existing web app, wrap it in a nice @Service layer and build your web service (REST or SOAP) on top of it.
The authentication is a matter of configuration: you can pack this in a microservice with @EnableOAuth2Sso, and your RestTemplate bean, thanks to Spring Boot, will handle the underlying auth part for you.
May be overkill..... But RPA? http://windowsitpro.com/scripting/review-automation-anywhere-enterprise

Is it a good practice to use JSTL inside a script (javascript) tag?

I'm developing a web app using JSTL and JavaScript in Eclipse Juno. I've been reading questions like How to set the JSTL variable value in javascript? and my code works fine even though Eclipse shows an error:
But... Is it a good practice to use JSTL and Javascript like this?
Does it cause a low performance in the time of rendering the webpage?
Can this be done in other way?
Is it a good practice to use JSTL and Javascript like this?
It is neither good nor bad practice. The bad practice would be using JSTL to control the flow of JavaScript, which is plain wrong because JSTL runs on the server side and JavaScript on the client side.
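To illustrate the server/client split, here is a plain-Java stand-in for what a JSP with a `<c:if>` around a script line would do. This is not JSTL itself, just a hypothetical method that mimics the rendering step; the names are invented.

```java
/** Hypothetical stand-in for server-side rendering of a script block. */
final class ServerSideRenderSketch {

    /** Produces the text sent to the browser; the decision is made here, on the server. */
    static String renderScript(boolean loggedIn, int score) {
        StringBuilder out = new StringBuilder("<script>\n");
        if (loggedIn) { // plays the role of <c:if test="${loggedIn}">
            out.append("var score = ").append(score).append(";\n"); // plays the role of ${score}
        }
        out.append("</script>");
        return out.toString();
    }
}
```

By the time the browser runs the script, it sees either the `var score = ...;` line or nothing at all. The JavaScript never gets to evaluate the condition, which is why JSTL cannot react to anything that happens in the browser.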
Does it cause a low performance in the time of rendering the webpage?
JSTL only helps generate the HTML for the current view. JavaScript is not involved in generating the HTML on the server side, only on the client side, unless you work with Node.js or similar technologies.
Can this be done in other way?
This depends on what you're doing. A common way to load data when accessing a web page:
Application Server (AS) receives a GET request on http://www.foo.com/bar
AS preprocesses the GET request (loads data from a database or another data source, does pre-calculations, etc.)
AS creates the response for the GET request (applies the data to generate the HTML)
AS sends the response to the client.
The browser client renders the HTML.
Another way to do it:
Application Server (AS) receives a GET request on http://www.foo.com/bar
AS creates the response for the GET request (generate the HTML which contains JavaScript functions to load the data in the onload event).
AS sends the response to the client.
The browser client renders the HTML.
The onload event fires and loads the data through RESTful services. This way, the data interaction is handled on the client side only, but the data comes from the server side.
These are two very simple alternatives to handle the same problem. Which one to choose will depend entirely on your design; there's no definitive answer.

How to call GQLQuery

Is it possible to call a GQL query inside a GWT application, in the server's implementation, something like:
SELECT * FROM User order by score desc
to be able to retrieve a sorted list? I tested the query in the database viewer and it returned the sorted list. I was wondering if I can sort the entities automatically after updating them, so I can include the query in the application.
In your server part you can execute whatever you want. Of course, that is also true for any server service provided in the GWT kit: RPC and RF.
But GWT is mostly client-side, hence the only way to call a remote procedure on your server is using Ajax. So create a service method on your server side using RPC, RF, AutoBeans or any other JSON, XML, etc. approach. Then call that method from the client using Ajax (RPC, RF, RequestBuilder, etc.).
The data transferred must be in a format understandable on the client side. If you use RPC, RF or AutoBeans, that is almost out-of-the-box; for any other mechanism you are responsible for the data binding.
That said, as you can see, GWT uses the same architecture as any other JavaScript application, except that if you use Java on the server side you have some utilities for transferring data and calling remote methods.

JSONP or other alternatives?

I am developing a web site that communicates with a custom web server that I wrote in Java. The web site is made in PHP/JavaScript/jQuery running on Apache, and I made a simple second web server in Java to support some features I designed; this server runs on another port, XXXXX. The problem is that I want to make requests in jQuery to the second server, but the origin is different: the page runs on domain and the $.getJSON function calls domain:XXXXX, which is not allowed. I thought of using $.getJSONP, but I have some concerns. The connection between the two points is authenticated (I was thinking of passing a token alongside the callback generated by jQuery). The two points are supported by. Is it safe in this case to use $.getJSONP, or are there other alternatives, keeping browser support in mind (IE7+ and FF3+)?
Sorry for my English :)
Best regards lealoureiro
JSONP should work for your needs. However, your other option would be to have a proxy service on the server that serves your page (so it is same-origin for the browser), which would make the request to the second server on the server side. Your client-side code could then access all the data natively via JSON instead of JSONP.

GWT applications and the returned response from the server

I have a GWT application running on a server.
We are subscribing to a service that pings this application at regular intervals.
The point is, this service checks that the response returned from the server contains some predefined keywords.
But as you know, GWT returns a plain, empty HTML page, with the data contained in the .js file.
So the ping service will not be able to examine the predefined keywords. Is this statement true?
And if it is true, is there any workaround for this problem?
Thanks.
The problem you are facing is related to the crawlability of AJAX applications - Google has some pointers for you :) Generally, you need a headless browser on the server to generate the output you'd normally see in the browser; for example, see HtmlUnit.
Only the initial container page and the loader script that it embeds are HTML and JS. Afterwards, you use GWT's RPC mechanism to exchange Java objects with the server, or Ajax (e.g. RequestBuilder) to exchange any kind of data with the server: JSON, XML, plain text, etc.
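To make the asker's concern concrete, here is a sketch of the kind of check a ping service presumably performs on the raw response body. The class name is made up, and the real service's logic is unknown.

```java
import java.util.List;

/** Hypothetical version of a ping service's keyword check on the raw HTTP body. */
final class KeywordCheckSketch {

    /** True only if every keyword occurs literally in the response body. */
    static boolean containsAll(String responseBody, List<String> keywords) {
        return keywords.stream().allMatch(responseBody::contains);
    }
}
```

Run against a typical GWT host page, which is little more than a `<script>` tag pointing at the `.nocache.js` loader, a check like this fails for any keyword that only appears after the client-side code has rendered. That is why the output has to be produced by something that actually executes the JavaScript, such as a headless browser.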
