I am writing Java code to get some information from a particular website by scraping it, but I am running into one problem while scraping.
When I follow links from one page to another, the site sometimes shows a security image page. I get the security string by using an API, but when I try to send it with a POST request in Java, I don't get the actual page source; I am redirected to the same security image page again. How can I solve this? How should I post the arguments to get past the security image?
Thanks!
From the security image implementor's point of view, this is usually implemented by saving the security string together with your session id, so when you post your security string attempt, the server can compare your answer to the one in the session object on the server. The session is usually managed by your web browser, most often through cookies.
So my question to you is - do you maintain some session with this particular website, or is every web request detached from the previous ones?
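If the answer is that every request is detached, a minimal sketch of keeping one session across requests could look like the following. It assumes Java 11 or later (java.net.http); the class name CaptchaSession is made up for illustration, and the URLs and form field names you pass in have to be read from the real site's form.

```java
import java.net.CookieManager;
import java.net.CookiePolicy;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.stream.Collectors;

class CaptchaSession {
    // One cookie jar shared by every request, so the server sees a single session.
    private final HttpClient client = HttpClient.newBuilder()
            .cookieHandler(new CookieManager(null, CookiePolicy.ACCEPT_ALL))
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();

    // Encodes form fields as application/x-www-form-urlencoded.
    static String formBody(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    // Fetches a page; any Set-Cookie in the response lands in the shared jar.
    String get(String url) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Posts form fields; cookies stored by earlier requests are sent automatically.
    String post(String url, Map<String, String> fields) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formBody(fields)))
                .build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```

The important part is that the page showing the security image and the POST carrying the answer go through the same CaptchaSession instance, so both requests carry the same session cookie.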
Related
I am having a little problem here. We are a group of 3 guys developing a web application. When I do a POST to one servlet that handles the login, and afterwards do a POST to another servlet where I try to use the attribute we stored in the session during login, it is as if it were using another session. I don't think there is a problem in the code, since the other guys can do this without any problems.
I'm using Fiddler2 as my REST client, whereas the others are using Cocoa as their clients. When I inspect the headers, the two different POSTs have two different session ids.
I've been trying to figure this out most of the day, but haven't found anything yet. I will be thankful for any advice.
Fiddler's Composer does not attempt to maintain any sort of cookie jar for you. If you want to send a cookie on a request using the Composer, you must add it yourself. You will find the value in the Set-Cookie response header on a previous response.
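The manual step described above, copying the value from Set-Cookie into the next request, can be sketched in Java as follows. The class and method names are made up for illustration, and the sketch ignores cookie attributes such as Path and Expires that a real cookie jar would honor.

```java
import java.util.List;
import java.util.stream.Collectors;

class CookieHeaders {
    // Turns Set-Cookie response header values into one Cookie request header value.
    // Attributes such as Path, Expires or HttpOnly are dropped; only name=value is sent back.
    static String toCookieHeader(List<String> setCookieValues) {
        return setCookieValues.stream()
                .map(v -> v.split(";", 2)[0].trim())
                .filter(pair -> pair.contains("="))
                .collect(Collectors.joining("; "));
    }
}
```

For a login response carrying "JSESSIONID=ABC123; Path=/app; HttpOnly", the next request would need the header Cookie: JSESSIONID=ABC123.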
In my Spring MVC application I submit a form using the POST method to perform a search with some parameters. The result of the search is a list and, for every entry, it is possible to navigate to a details page. This works fine, but when a user tries to go back to the results page with the browser's back button (or via JavaScript), I get an "Expire page" error. Refreshing the page re-executes the POST submit and it works fine.
To prevent this error I added
response.setHeader("Cache-Control", "no-cache");
to the search controller, and everything works fine in Safari and Firefox, but I get the same "Expire page" error when I run the web application in IE (8 and 9).
What's the right way to go to a detail page and come back without getting any error?
Thanks for your time! Andrea
The right way is to use GET instead of POST: searching is an idempotent operation which doesn't cause any change on the server, and such operations should be done with a GET.
Not expiring a POST request seems to undermine the very idea of a POST:
Per RFC 2616, the POST method should be used for any context in which a request is non-idempotent: that is, it causes a change in server state each time it is performed, such as submitting a comment to a blog post or voting in an online poll.
Instead, I would restructure how you view the detail records, maybe with an overlay DIV that doesn't refresh the underlying page that shows the results. With many modern web apps, using the BACK button can cause issues with page flow.
I'd like to embed an ajax application into a wordpress site. The ajax application will communicate with servlets running on tomcat. Now the servlets need a way to verify if a request originates from a user that is logged in to wordpress. How does this commonly get solved?
AFAIK, wordpress is stateless and does not use sessions, which makes me curious how a logged in user in wordpress can be tracked.
The second problem is, how can a servlet request wordpress to verify if a given user is still logged in?
Any advice is welcome,
Thank you.
The only thing that you can do is read the cookies. And that will work only if you are using the same domain (or subdomain and the cookies are valid for all subdomains). The session cookie might not give you sufficient information, however. You can't read a PHP session from a Java app, and generally, you can't mix two applications that way.
As a little workaround, you can check with JavaScript who the currently logged-in user is (by finding the username in the DOM) and send that with Ajax, but that is not secure at all.
I'm building an app to let users export data from a university system. Currently, they can log in and see the data in HTML, but I would like to let people download it as CSV.
I have an app where users supply their username and password. I would like to log in to the university system and HTML scrape the resulting page. How can I do this?
I'm building a GWT app. I could either do this in Java-transliterated-JS on the client, or Java on the server.
Update: Selenium might be nice, but it looks like overkill.
You're going to have to do this from the server unless the domains are the same. You'd need to determine what the POST transaction used by the other server for the login step looks like - parameter names etc. Then you'd perform that operation and do whatever you want with what comes back. If you need to see multiple pages, you need to maintain the appropriate session cookie too so that the server knows you're still logged in on the subsequent HTTP requests.
If you have to hit another site to validate the credentials, then I'm not so sure that people should feel comfortable providing those credentials to you. That is, if you don't have rights to check the credentials directly, why are you trustworthy to receive them? I know sometimes people need to integrate with a system they don't own, so this is just a question.
First, this has to be done server-side because of the limitations on client scripting due to the same origin policy.
The typical way of handling the "screen scraping" you mention is to treat the web page as if it was an XML service. First, examine the source code of the page, then using an internet/HTTP stack, craft a POST to the correct URL and read the response using a standard XML library. It will take some ingenuity to come up with a good way to dig into the XML to find the piece you need that will be as insulated as possible from changes to the page. Keep in mind that your system can break any time that the owners of the site change their page.
Sometimes, you can't just send the POST but have to request the blank page initially in order to get hidden form values that need to be returned in the POST. You'll have to experiment to find out what it requires.
Additionally, you probably have to handle cookies as well, since they usually are an integral part of the web site's authentication and session management (though you might get lucky that the session doesn't matter between the initial POST and the first response).
Last, you may be unlucky enough that the site uses javascript to do part of the authentication work, which may require additional digging to understand how the credentials are posted to the site.
There are other potential barriers such as the site checking to see that the referrer is their own site, possible use of SSL (HTTPS) and so on.
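The hidden-form-value step mentioned above can be sketched like this: request the blank login page, collect its hidden inputs, add the credentials, and send everything back in the POST with the same cookies. This is only a sketch using regular expressions on simple, well-formed markup; the class name is made up, the credential field names are site-specific, and a real HTML parser will be more robust against unusual markup.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HiddenFields {
    private static final Pattern INPUT =
            Pattern.compile("<input\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Reads one quoted attribute (e.g. name="token") out of a single tag; null if absent.
    static String attr(String tag, String name) {
        Matcher m = Pattern.compile("\\b" + name + "\\s*=\\s*([\"'])(.*?)\\1",
                Pattern.CASE_INSENSITIVE).matcher(tag);
        return m.find() ? m.group(2) : null;
    }

    // Collects name/value pairs of all <input type="hidden"> tags, in page order.
    static Map<String, String> extract(String html) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = INPUT.matcher(html);
        while (m.find()) {
            String tag = m.group();
            String name = attr(tag, "name");
            if ("hidden".equalsIgnoreCase(attr(tag, "type")) && name != null) {
                String value = attr(tag, "value");
                fields.put(name, value == null ? "" : value);
            }
        }
        return fields;
    }
}
```

The returned map can be extended with the username and password fields before it is form-encoded and posted to the form's action URL.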
I'm pretty sure that the protection against cross-site scripting in web browsers will mean that you can't log in to the university's app using javascript running in the web browser. So the part of your program that fetches data from the university will need to run on your server. Once you have the data, you can process it either on your server or in javascript in the browser, but I think it would be easier to do it on the server.
See http://en.wikipedia.org/wiki/Same_origin_policy
I'm not too sure about GWT, but in general, you would take the form data submitted by the user and check it against a database of usernames and hashed passwords. If the credentials check out, set a session cookie that says the user is logged in.
In your pages, check whether the session cookie says the user is logged in. If not, redirect to the login page; otherwise allow them to view the page.
I am using SWF Uploader to upload files, with Java on the server side.
Flash is invalidating the Java session automatically. The SWF team hasn't found a fix so far.
After some searching, I found this link, which discusses an idea for handling this problem in ASP.
In basic PHP we pass the session id as a POST parameter and manually restore the session.
In ASP.Net we also post the session id and use a Global.asax to catch the values
before the session is restored and dynamically add the right cookies.
Is there a similar option to restore the session in Java?
I have also gone through this StackOverflow post, but I am not able to understand exactly what they are saying. Maybe that is because I am not familiar enough with Java sessions.
Especially this line: upload_url: "Controller?action=33&JSESSIONID=<%=request.getSession().getId()%>". What is he achieving with that line? What are Controller and action=33?
Any suggestions for restoring the session from the client side or the server side would be much appreciated!
Thanks!
If I read the linked SO question correctly, the problem is not invalidation of the session id, but the way the server treats the flash object: It is considered an additional client, not as part of the rest of the browser window. Therefore, 2 separate sessions are created, causing the id to be different or null upon upload.
The solution is to manually look up the correct session id, or force the server to assign the correct id to a new session. This is done by forwarding the jsessionid to Flash as a variable, and later adding it as a GET parameter to the HTTP upload request, so it can be retrieved on the server and you can use it to look up the correct session.
In the example, the author uses Controller as the name of the servlet, and action=33 is probably used to invoke some method on it. This is specific to this particular application, but not important for your solution.
What matters to you is the end of the string: &JSESSIONID=<%=request.getSession().getId()%>
This JSP code essentially adds the java session id to a variable containing the upload request URL. You can do this in plain Java or any other language that has access to the correct session id - what matters is that it is transmitted to the Flash plugin first, then added to the upload request, then sent back to the server again, and then used to find or create the correct session id to process the upload with.
This is the code the author used to create a new session cookie:
// Re-issue the session id received as a request parameter as a JSESSIONID cookie.
String sessionId = request.getParameter("JSESSIONID");
if (sessionId != null) {
    Cookie userCookie = new Cookie("JSESSIONID", sessionId);
    response.addCookie(userCookie);
}