I know what I am asking is somewhat unusual. There is a web application (whose source code we don't have access to), and we want to expose a few of its features as web services.
I was thinking of using something like Selenium WebDriver to simulate clicks on the application in response to each web service request.
I want to know whether there is a better solution or pattern for this.
I should mention that the application is written in Java with Spring MVC (it is not an SPA) and Spring Security, and there is a CAS server providing SSO.
There are multiple ways to implement it. In my opinion Selenium/PhantomJS is not the best option: if the site is properly designed, you can interact with it using only the HTML it serves, or even some API, without loading all the CSS or executing the asynchronous JavaScript requests. As your application is not an SPA, it's quite likely that an "API" already exists in the form of plain GET/POST requests, and you might be lucky enough that there's no CSRF protection.
First of all, you need to solve authentication against the CAS server. There are multiple types of authentication in OAuth, but you should end up with an API token that gives you access to the application. This token should be sent as an HTTP header or cookie with every single request. Ideally the token shouldn't expire; otherwise you'll need to implement re-authentication logic in your app.
Once the authentication part is solved, you'll need quite a lot of patience. Open the target website with the web inspector of your preferred browser, go to the Network panel, and perform the actions that you want to run programmatically. There you'll find each request with all its headers and content, along with the response.
That's what you need to code. There are plenty of libraries for that in Java. Have a look at Jsoup if you need to parse HTML, but to run plain GET/POST requests, go for RestTemplate (in Spring) or the JAX-RS/Jersey 2 Client.
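As a rough illustration of replaying a request captured in the Network panel, here is a minimal sketch. It uses the JDK's own java.net.http API rather than RestTemplate only to stay dependency-free. The URL, the JSESSIONID cookie name and the form fields are assumptions made up for the example; use whatever your browser actually sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: rebuilds a form POST as observed in the browser's Network panel.
final class ReplayedRequest {

    // Encodes form fields the way a browser does for application/x-www-form-urlencoded.
    static String formEncode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Builds (but does not send) the POST, attaching the session cookie obtained after login.
    static HttpRequest buildPost(String url, String sessionCookie, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Cookie", sessionCookie)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) and the returned HTML handed to your parser.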
You might consider implementing a cache layer to improve performance if the result of a query stays stable over time, or if you can assume that within, let's say, 5 minutes the same query will return the same response.
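To make the caching idea concrete, here is a small time-to-live cache sketch in plain Java. The class name and the injectable clock are assumptions for illustration; in a Spring Boot app you would more likely plug in a real cache provider.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical TTL cache: an entry is reused until ttlMillis have elapsed, then reloaded.
final class TtlCache<K, V> {

    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, or calls the loader (the slow scrape) when missing or expired.
    // Note: two threads may both run the loader for the same key; acceptable for a sketch.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.expiresAt <= now) {
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

With a 5-minute TTL this would be created as new TtlCache<String, String>(300_000, System::currentTimeMillis).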
You can create your app in your favourite language/framework. I'd recommend starting with Spring Boot + MVC + DevTools. That contains all you need, plus Jsoup if you need to parse some HTML. Later on you can add a cache provider if needed.
We do something similar to access web banking on behalf of a user, scrape his account data and obtain a credit score. In most cases, we have managed to reverse-engineer mobile apps and sniff traffic to use undocumented APIs. In others, we have to fall back to web scraping.
There are two types of applications you can scrape:
Data is essentially the same for any user, like product listings in Amazon
Data is specific to each user, like in a banking app.
In the first case, you could have your scraper running and populating a local database, and use your local data to provide the web service. In the latter case you cannot do that, and you need to scrape the site on the user's request.
I understand from your explanation that you are in the latter case.
When web scraping you can find really difficult web apps:
Some may require you to send data from previous requests to the next
Others render most data on the client with JavaScript
If either of these is your case, Selenium will make your implementation easier, though not performant.
Implementing the first without Selenium will require a lot of trial and error to get things working, because you will be simulating the requests and you will need to know what data is expected from the client. If you use Selenium, by contrast, you will be executing the same interactions that you do with the browser and hence sending the expected data.
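For the first kind of app, carrying data from one response into the next request often means pulling a hidden form field (such as a CSRF token) out of the returned HTML. The sketch below uses a regular expression only to stay dependency-free; it assumes the name attribute appears before the value attribute and that both are double-quoted, which real pages may not respect. A proper HTML parser such as Jsoup is the sturdier choice. The field name _csrf in the usage note is Spring Security's default, but your target may use another.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the value of an <input> field from a fetched page.
final class HiddenFieldExtractor {

    // Returns the value attribute of the first <input> whose name matches fieldName.
    // Assumes name="..." precedes value="..." inside the same tag.
    static Optional<String> hiddenValue(String html, String fieldName) {
        Pattern pattern = Pattern.compile(
                "<input[^>]*name=\"" + Pattern.quote(fieldName) + "\"[^>]*value=\"([^\"]*)\"",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }
}
```

A call like HiddenFieldExtractor.hiddenValue(loginPageHtml, "_csrf") would give you the token to send back with the next POST.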
Implementing the second case requires your scraper to support JavaScript. AFAIK the best support is provided by Selenium. HtmlUnit claims to provide fair support, and I think Jsoup provides no JavaScript support.
Finally, if your solution takes too much time, you can mitigate the problem by providing your web service with a notification mechanism, similar to Webhooks or Resthooks:
A client of your web service would make a request for data, providing a URI at which they would like to be notified when the results are ready.
Your service would respond immediately with an id for the request and start scraping the necessary info in the background.
If you use the skinny payload model, when the scraping is done you store the response in your data store with an id identifying the original request. This response will be exposed as a resource.
You would execute an HTTP POST on the URI provided by the client. In the body of the request you would add the URI of the response resource.
The client can now GET the response resource and because the request and response have the same id, the client can correlate both.
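The flow above can be sketched in plain Java as follows. Everything here is illustrative: the class name, the /results/{id} resource path, and the scraper and notifier callbacks are assumptions. In a real service the notifier would perform the HTTP POST to the client's URI, and the executor would be a thread pool.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Hypothetical webhook-style service: answer immediately with an id, scrape in the background,
// store the result under that id, then notify the client's callback URI.
final class AsyncScrapeService {

    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final Executor executor;
    private final Function<String, String> scraper;      // query -> scraped data
    private final BiConsumer<String, String> notifier;   // (callbackUri, resultResourceUri)

    AsyncScrapeService(Executor executor,
                       Function<String, String> scraper,
                       BiConsumer<String, String> notifier) {
        this.executor = executor;
        this.scraper = scraper;
        this.notifier = notifier;
    }

    // Returns the request id right away; the scraping runs on the executor.
    String submit(String query, String callbackUri) {
        String id = UUID.randomUUID().toString();
        executor.execute(() -> {
            results.put(id, scraper.apply(query));
            // Skinny payload: the notification carries only the URI of the result resource.
            notifier.accept(callbackUri, "/results/" + id);
        });
        return id;
    }

    // Backs the GET on the result resource; null while the scrape is still running.
    String result(String id) {
        return results.get(id);
    }
}
```

Because the request id and the result resource share the same id, the client can correlate the notification with its original request.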
Selenium isn't the best way to consume web services. Selenium is primarily an automation tool, largely used for testing applications.
Assuming the services are already developed, the first thing we need to do is authenticate the user's request.
This can be done by adding an HTTP header with the key "Authorization" and the value "Basic " + Base64Encode(username + ":" + password).
If the user is valid (the user's login credentials match the credentials on the server), generate a unique token, store it on the server mapped to the user ID, and set the same token in the response header or create a cookie containing the token.
By doing this we can avoid validating credentials on subsequent requests from the same user by just looking for the token in the request header or cookie.
If the services are designed to check the login every time, the "Authorization" header needs to be set on every request.
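A minimal sketch of both pieces, assuming UTF-8 credentials and an in-memory token map (the class names are made up for the example, and a real server would also expire tokens):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Builds the value of the "Authorization" header for HTTP Basic authentication.
final class BasicAuth {
    static String headerValue(String username, String password) {
        String credentials = username + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }
}

// Hypothetical server-side store: maps an issued token back to the user it was issued for.
final class IssuedTokens {
    private final Map<String, String> userByToken = new ConcurrentHashMap<>();

    // Called once the credentials have been validated.
    String issue(String userId) {
        String token = UUID.randomUUID().toString();
        userByToken.put(token, userId);
        return token;
    }

    // Called on later requests; returns null for an unknown token.
    String userFor(String token) {
        return token == null ? null : userByToken.get(token);
    }
}
```

Basic credentials are only encoded, not encrypted, so this should always travel over HTTPS.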
I think using a WebDriver is a lot of overhead, but it depends on what you really want to achieve. With the info you provided I would rather go with a RestTemplate implementation that sends the appropriate HTTP messages to the existing web app, wrap it in a nice @Service layer, and build your web service (REST or SOAP) on top of it.
Authentication is a matter of configuration: you can pack this into a microservice with @EnableOAuth2Sso, and your RestTemplate bean will, thanks to Spring Boot, handle the underlying auth part for you.
Maybe overkill... but RPA? http://windowsitpro.com/scripting/review-automation-anywhere-enterprise
As the question states, my goal is to hide a GET route in Spring Boot from public access. I originally took a CORS approach, but that doesn't solve the actual viewing problem. Pretty much anyone could go to, say, https://my-api-url.com/employee/all and see a JSON record of all employees in my database.
END GOAL: I only want my front-end to have access to my API for displaying that information to an authorized user who is signed in, but I do NOT want just anyone to have access to the API. A CORS policy can handle the AJAX requests, but it doesn't seem like I can stop someone from simply viewing the GET URL.
How can I solve this problem?
You can use OAuth to register clients (frontend, Postman, or whatever you are using to test the API) that can access your resource server, but it might be overkill. For now, if you are worried that someone can view your API by typing it into the address bar (if that is your question), you can allow access for authenticated users only.
If you want to restrict usage and make it inconvenient for abusers to call your API, you can issue a token on page load (a CSRF token) and require that token to be present in each request to the API. That way the API will only be callable from a browser that initiated a page load.
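The token-on-page-load idea could look roughly like this in plain Java. The class is hypothetical and keeps tokens in memory per session id. In a Spring Boot app, Spring Security's built-in CSRF support would normally do this for you.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-session page token: issued when the page is rendered,
// required (e.g. in a request header) on every API call.
final class PageTokens {

    private final Map<String, String> tokenBySession = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    // Call while rendering the page; embed the returned token in the page for the frontend to send back.
    String issue(String sessionId) {
        byte[] bytes = new byte[32];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        tokenBySession.put(sessionId, token);
        return token;
    }

    // Call on each API request; rejects missing, unknown, or mismatched tokens.
    boolean isValid(String sessionId, String presentedToken) {
        String expected = tokenBySession.get(sessionId);
        if (expected == null || presentedToken == null) {
            return false;
        }
        // Constant-time comparison of the two token values.
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                presentedToken.getBytes(StandardCharsets.UTF_8));
    }
}
```

This raises the bar for casual callers but does not replace authentication: anyone who loads the page can still read the token.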
You can refer to this link: https://security.stackexchange.com/questions/246434/how-can-i-ensure-my-api-is-only-called-by-my-client
If your frontend is currently handling authentication, I'd suggest moving to Spring's authentication service. That way you could prevent unauthenticated users from accessing that specific API endpoint.
I'm running a webapp that checks if a user is logged in with UserService, then shows them their homepage if they are, or redirects them to a login screen if not. Once on the page, I would like to be able to update specific portions using AJAX when they click certain elements. Now, I have already written a REST API in the same GAE project using Cloud Endpoints that gets all the information I want, and so in the spirit of DRY I would rather use my own API than write new servlets to handle these requests.
The problem is that I need to generate an OAuth token in order to access the API. I can easily do this from the Google API JavaScript Client Library, but then my user needs to re-authenticate for the REST API, which is not only bad from a UX perspective but, more importantly, exposes my client id in the page's JavaScript and passes a token through HTTP (non-SSL) headers.
The only option I see is to write a servlet for each request and have duplicate work. But conceptually, I'm already logged in to Google, so I should just be able to access the API. How does one usually go about this? Am I thinking about it all wrong?
UserService and OAuth are two different authentication (and authorisation) mechanisms and you cannot combine them.
If you do need OAuth to access some of the APIs, then also use server-side OAuth. This way you can access the APIs and replace UserService all in one go.
I'm working on a Java REST server serving an iPhone app. Now we have to integrate with a third-party service exposed via the OAuth 2 protocol. This is new to me, so I've been reading and writing some "proof of concept" code, but I have a big problem or I fundamentally don't understand something...
I made a simple web page with a "log in with XXX" button that the user sees in a web view. When he clicks it, the login page of the third-party service opens and he can approve my app, at which point they redirect the user to a URL I've specified, with the authorization code as a parameter. This URL points to a REST service on my server.
The problem is that this URL must be exactly the same as the one I set up when registering my app with their service. Since I'm running a REST server, I have no way of knowing which user we are talking about when the redirection to my server happens (there is no session). I wanted to do this identification with some query or path param, but they are not allowing it.
Does any of this make sense to you, or am I implementing this the wrong way? The only possible solution I can imagine now would be with the help of cookies, but I'm not really fond of that...
Yes, that does make sense. You've got a few different options; try one of these:
Store a cookie with some user id and read it out after redirection
Use the state parameter of the authorization request for transmitting some user id. The provider is required to return it to you in its redirect.
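A sketch of the state-parameter option, under the assumption that you keep a short-lived server-side map from a random state value to your user id (rather than putting the raw user id in the URL, which would let anyone forge it). The class name, endpoint and client id below are placeholders.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical store that ties an OAuth 2 "state" value to the user who started the flow.
final class OAuthStateStore {

    private final Map<String, String> userByState = new ConcurrentHashMap<>();

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Builds the URL to open in the web view; remembers which user the random state belongs to.
    String authorizationUrl(String authorizeEndpoint, String clientId,
                            String redirectUri, String userId) {
        String state = UUID.randomUUID().toString();
        userByState.put(state, userId);
        return authorizeEndpoint
                + "?response_type=code"
                + "&client_id=" + enc(clientId)
                + "&redirect_uri=" + enc(redirectUri)
                + "&state=" + enc(state);
    }

    // Called by the redirect endpoint with the state the provider echoed back.
    // Each state can be used once; returns null if it is unknown or already consumed.
    String consume(String state) {
        return state == null ? null : userByState.remove(state);
    }
}
```

The redirect URI stays exactly as registered with the provider; only the state value varies per user.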
I keep facing this question from my manager: how will SSO work if the client disables cookies? I don't have an answer. We are currently using JOSSO for single sign-on. Is there any open source framework that supports single sign-on without using the cookie mechanism?
In the absence of cookies, you're going to have to embed some parameter in each url request. e.g. after logging in you assign some arbitrary id to a user and embed that in every link such as http://mydomain.com/main?sessionid=123422234235235. It could get pretty messy since every link would have to be fixed up before it went out the door which slows down your content. It also has security, logging and session history implications which are not such a huge deal when the state is in a cookie.
It may be simpler to do a simple cookie test on logged in users and send them off to an error page if they do not have cookies enabled.
The CAS project passes a "ticket" from the sign-on server to the consuming application as a URL query parameter. The consuming app then makes a back-channel request to the sign-on server to validate the ticket's authenticity. This negates the need for cookies and therefore works across domains; however, it is a bit "chatty".
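As an illustration of that back-channel step, the consuming application builds a validation URL from the ticket it received and its own service URL. The sketch below assumes the CAS 2.0 protocol's /serviceValidate endpoint; the host names are placeholders, and your CAS deployment may expose a different validation endpoint.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the URL the consuming app calls to validate a CAS service ticket.
final class CasTicketValidation {

    // casBaseUrl: e.g. https://cas.example.org/cas (no trailing slash)
    // serviceUrl: the exact URL of the consuming app that the ticket was issued for
    // ticket: the value of the "ticket" query parameter received on redirect
    static String validationUrl(String casBaseUrl, String serviceUrl, String ticket) {
        return casBaseUrl + "/serviceValidate"
                + "?service=" + URLEncoder.encode(serviceUrl, StandardCharsets.UTF_8)
                + "&ticket=" + URLEncoder.encode(ticket, StandardCharsets.UTF_8);
    }
}
```

The application then issues a GET to this URL and reads the authentication result from the CAS server's response.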
Another, arguably more robust, solution is to use a product based on SAML, which is an industry standard for cross-domain single sign-on. There are a couple of open source products out there which use SAML, and CAS itself has a SAML extension; however, they are typically quite complex to set up. Cloudseal is also based on SAML and is much simpler to use. The Cloudseal platform itself is delivered as a managed service, but all the client libraries are open source.
Of course, with all these solutions you are simply passing a security context from one server to another. The consuming application will no doubt create its own local session, so you would then need to use URL rewriting instead of cookies.
Disclaimer: I work for Cloudseal :)
I'm using GWT on my GlassFish server, and I'm attempting to make some of my RPC calls authenticated via cookies. Is this possible? Are there any examples out there of how to code it?
Depending only on the cookie for authentication will make your website/services vulnerable to Cross-Site Request Forgery (XSRF/CSRF) attacks - read more on that in Security for GWT Applications.
The best way would be to double-check the value you get from the cookie against one that has been transported to the server by some other means, as part of the request (a header, a custom field, etc.).
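The double check described above could be sketched like this. How the client copies the cookie value into the request (a custom header, an RPC field) is up to you; the helper only shows the server-side comparison and is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

// Hypothetical server-side check: the session value must arrive twice,
// once in the cookie and once in the request itself (header or payload field).
final class DoubleSubmitCheck {

    static boolean matches(String cookieValue, String requestValue) {
        if (cookieValue == null || requestValue == null || cookieValue.isEmpty()) {
            return false;
        }
        // Constant-time comparison of the two values.
        return MessageDigest.isEqual(
                cookieValue.getBytes(StandardCharsets.UTF_8),
                requestValue.getBytes(StandardCharsets.UTF_8));
    }
}
```

A forged cross-site request will carry the cookie automatically, but the attacker's page cannot read it and so cannot supply the matching second copy.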
Other than that, there are many tutorials covering the subject - just search for Java (servlet) authentication - it doesn't have to be GWT-specific. The Google Web Toolkit Group also has many threads about the subject.
I assume that you use GWT's RPC servlet for handling requests made by the client.
One option that comes to my mind is to write and configure a ServletFilter which can examine the cookie, before the request reaches GWT's servlet.
You might rethink using cookies, as they are a potential security hole.
Why not move your communication to HTTPS?
Can you not just use the standard 'session' scope, i.e.
request.getSession()
A pattern I use in GWT apps is to have a separate 'old fashioned' login form which sets up the session. The GWT app's host page is then displayed after they have successfully logged in.
If the necessary values aren't in the session, then the user isn't logged in. Your service should return an exception, maybe, which instructs the GWT app to redirect to the login page, or display an error.