How to get the previous page URL? Referer does not work - java

To begin with, I have researched this extensively and could not find a solution. For example, I have read questions on this site such as:
How to get previous URL?
How to get the previous page URL from request in servlet after dispatcher.forward(request, response)
In those questions and others, people say that request.getHeader("Referer") is a good way to obtain the previous URL, but that it sometimes doesn't work. What I haven't found is what to do when it does not work.
I would like to obtain the URL of the previous page (e.g. google.com or a URL from my app) when someone accesses my app. The URL in question is the one entered in the browser. Any idea how to obtain it?
Why do I need this? When someone accesses my app with a specific URL (like localhost/page/something) while logged out, my app redirects them to the login page, and after a successful login it goes to the home page instead of the URL they originally entered. Sorry for my bad English.

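Easy: you can store the first requested URL in a session-scoped attribute and retrieve it when you need it, for example to redirect there after a successful login. Note that getRequestURL() returns a StringBuffer without the query string, so convert it to a String, and append request.getQueryString() if you need the parameters.
request.getSession().setAttribute("firstURL", request.getRequestURL().toString());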
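A minimal sketch of the bookkeeping around that idea, under stated assumptions. The class ReturnUrl and its method names are hypothetical, not part of the servlet API. It keeps the query string and only accepts local paths, so a stored value cannot send the user to another site.

```java
/** Hypothetical helper for remembering the page a user asked for before login. */
final class ReturnUrl {

    private ReturnUrl() {}

    /** Joins the request path and the optional query string into one target. */
    static String build(String requestUri, String queryString) {
        if (queryString == null || queryString.isEmpty()) {
            return requestUri;
        }
        return requestUri + "?" + queryString;
    }

    /** Accepts only local paths such as "/page/something". */
    static boolean isLocalPath(String target) {
        return target != null
                && target.startsWith("/")
                && !target.startsWith("//")
                && !target.contains("\\");
    }

    /** Returns the stored target if it is a local path, otherwise the fallback. */
    static String afterLogin(String storedTarget, String fallback) {
        return isLocalPath(storedTarget) ? storedTarget : fallback;
    }

    public static void main(String[] args) {
        String target = build("/page/something", "id=5");
        System.out.println(afterLogin(target, "/home"));
        System.out.println(afterLogin("https://example.com/", "/home"));
    }
}
```

In a servlet or filter, the inputs would come from request.getRequestURI() and request.getQueryString() just before redirecting to the login page, and the login handler would pass the stored session attribute to afterLogin(...) and hand the result to response.sendRedirect(...). The "/home" fallback is only an example value.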

Related

How to write Java code to read fields from a website that requires login and uses a POST request?

I need some help with fetching data from a website.
Previously, we had the following code in our application, and it fetched the required data. We simply read the required fields by forming a URL from the username, password and search parameter (DEA number). The same URL (with parameters) could also be opened directly in a browser to see the results. It was a simple GET request:
// Build the old GET URL; note that the credentials travel in the query string.
URL url = new URL(
        "http://www.deanumber.com/Websvc/deaWebsvc.asmx/GetQuery?UserName=" + getUsername()
        + "&Password=" + getPassword()
        + "&DEA=" + deaNumber
        + "&BAC=&BASC=&ExpirationDate=&Company=&Zip=&State=&PI=&MaxRows=");
Document document = parser.parse(url.toExternalForm());
// Ask the document for a list of all <DEA> tags it contains
NodeList sections = document.getElementsByTagName("DEA");
// Followed by a loop that reads each element via sections.item(index).getFirstChild() etc.
Now the website URL has changed to the following:
https://www.deanumber.com/RelId/33637/ISvars/default/Home.htm
I am able to log in at that URL with my credentials, go to the search page, enter the DEA number and search. The login page appears as a pop-up once I click the 'Login' link on the home page, and the final result also appears as a pop-up. This is a POST request, so I am unable to form a complete URL that I could use in my code.
I am not an expert in web services, but I think I need a web service URL like the one in the code above, and I am not sure how to get it. Even if I get the URL, I am not sure how to perform the login from Java code and search for the DEA number.
It would also be great if I could validate the URL manually before using it in Java. Let me know if there is a way to do that.
Alternatively, if there is another approach in Java, kindly suggest it.
Thanks in advance.
First of all, the previous approach offered by the website was wrong and insecure, because it passed the username and password as query string parameters in plain text. I think they realized this and changed their authentication mechanism.
It also looks like they have restricted direct URL-based requests from client applications like yours. For such requests they have published web services. Check this link. They also list the rates for web service request counts.
So you may need to open a formal communication channel with them to get authentication and other details for accessing their web services. Depending on what they use for web service client authentication, you can then code your client to access the web services.
I hope this helps.
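For the general question of how a POST differs from the old GET in Java, here is a hedged sketch using the JDK's java.net.http API (Java 11+). It only shows how form fields move from the query string into the request body. The endpoint https://example.invalid/search and the field names are placeholders, not the provider's real service; the actual URL, parameters and authentication scheme have to come from the provider.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper that builds a form-encoded POST request. */
final class FormPost {

    private FormPost() {}

    /** Encodes name/value pairs as application/x-www-form-urlencoded. */
    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds (but does not send) a POST request carrying the form body. */
    static HttpRequest build(URI endpoint, Map<String, String> fields) {
        return HttpRequest.newBuilder(endpoint)
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encode(fields)))
                .build();
    }

    public static void main(String[] args) {
        // Placeholder field names and values, for illustration only.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("UserName", "user@example.com");
        fields.put("DEA", "AB1234567");
        HttpRequest request = build(URI.create("https://example.invalid/search"), fields);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(encode(fields));
    }
}
```

Sending the request would be a matter of passing it to HttpClient.send(...) with a body handler. To validate a POST manually before coding it, the browser's developer tools (network tab) show the exact URL, headers and form body the site submits.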

Finding login page of website programmatically

I was trying to sign up with a WHOIS API provider to get the login page for a given website, but they don't seem to provide that kind of information. Is there a better way to figure that out?
I tried these sites:
https://jsonwhois.com/
http://domainr.build/
But the JSON they return doesn't seem to contain the login page.
For example, if I enter
google.com
I want a response that contains
https://accounts.google.com/ServiceLogin#identifier
If I enter
yahoo.com
I want a response of
https://login.yahoo.com
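That's simply because the redirection to the login page happens afterwards, when the site itself is requested; it is not part of the WHOIS data. It depends on whether you're looking for a redirect via a 301 response or a 401 response.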
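A rough sketch of what inspecting those responses could look like with the JDK's java.net.http client, assuming redirects are not followed automatically so the status code and Location header stay visible. The class LoginProbe and its labels are hypothetical. Many sites only expose their login page through a link in the HTML, in which case this finds nothing.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;

/** Hypothetical probe that looks at a single response for login hints. */
final class LoginProbe {

    private LoginProbe() {}

    /** Describes what a status code and Location header say about a login step. */
    static String classify(int status, Optional<String> location) {
        boolean redirect = status == 301 || status == 302 || status == 303
                || status == 307 || status == 308;
        if (redirect && location.isPresent()) {
            return "redirect to " + location.get();
        }
        if (status == 401) {
            return "authentication required";
        }
        return "no login hint";
    }

    /** Fetches a URL without following redirects and classifies the response. */
    static String probe(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NEVER)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<Void> response =
                client.send(request, HttpResponse.BodyHandlers.discarding());
        return classify(response.statusCode(), response.headers().firstValue("Location"));
    }

    public static void main(String[] args) {
        // No network access here; the values are made up for illustration.
        System.out.println(classify(302, Optional.of("https://login.example.invalid/")));
        System.out.println(classify(401, Optional.empty()));
    }
}
```

A redirect target is only a candidate: a 301 from example.com to www.example.com is a redirect too, so the Location value still has to be checked for whether it points at a login page.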

HtmlUnit: Request website from server in a specific language

I am looking for a clean and simple way in HtmlUnit to request a webpage from a server in a specific language.
As a test, I have been trying to request the "bankofamerica.com" homepage in Spanish instead of English.
This is what I have done so far:
I tried to set the "Accept-Language" header to "es" in the HTTP request. I did this using:
myWebClient.addRequestHeader("Accept-Language" , "es");
That did not work. I then created a web request with the following code:
URL myUrl = new URL("https://www.bankofamerica.com/");
WebRequest myRequest = new WebRequest(myUrl);
myRequest.setAdditionalHeader("Accept-Language", "es");
HtmlPage aPage = myWebClient.getPage(myRequest);
Since this failed too, I printed out the request object for this URL to check whether the headers were being set:
[<url="https://www.bankofamerica.com/", GET, EncodingType[name=application/x-www-form-urlencoded], [], {Accept-Language=es, Accept-Encoding=gzip, deflate, Accept=*/*}, null>]
So the server is being asked for a Spanish page, but it responds with the homepage in English (the response header Content-Language is set to en-US).
I did find a hack to retrieve the BOA page in Spanish. I visited this page and used the Chrome developer tools to get the cookie value from the request header. I used this value to do the following:
myRequest.setAdditionalHeader("Cookie", "TLTSID= ........._LOCALE_COOKIE=es-US; CONTEXT=es_US; INTL_LANG=es_US; LANG_COOKIE=es_US; hp_pf_anon=anon=((ct=+||st=+||fn=+||zc=+||lang=es_US));..........1870903; throttle_value=43");
I am guessing the answer lies somewhere here.
That leads to my next question: if I am writing a script to retrieve 100 different websites in Spanish (i.e. assuming they all have Spanish versions of their pages), is there a clean way in HtmlUnit to accomplish this?
(If cookies are indeed the solution, then to create them in HtmlUnit you need to specify the domain name, so one would have to create cookies for each of the 100 sites. As far as I know there is no way in HtmlUnit to do something like:
Cookie langCookie = new Cookie("All Domains","LANG_COOKIE","es_US");
myWebClient.getCookieManager().addCookie(langCookie);)
NOTE: I am using HtmlUnit 2.12 and setting BrowserVersion.CHROME in the WebClient.
Thanks.
Regarding your first concern, the clear/simple(/only?) way of requesting a webpage in a particular language is, as you said, to set the HTTP Accept-Language request header to the locale(s) you want. That is it.
Now, the fact that you request a page in a particular language doesn't mean that you will actually get a page in that language. The server has to be set up to process that HTTP header and respond accordingly. Even if a site has a whole section in Spanish, it doesn't mean that the site responds to the HTTP header.
A clear example of this is the page you provided. I performed a quick test on it and found that it clearly does not respond to the Accept-Language I set (which was es). Hitting the home page using es returned results in English. However, the page has a link labelled En Español, which means "In Spanish". Following it, the page does switch to Spanish and you get redirected to https://www.bankofamerica.com?request_locale=es_US.
So you might be tempted to think that the page handles the locale through a request parameter. However, that is not (only) the case, because if you then open the home page again (without the locale parameter) you will still see the Spanish version. That clearly shows the locale is being stored somewhere else, most likely in the session, which will most likely be handled by cookies.
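To check from code whether a server actually honoured the header, one option is to compare the response's Content-Language value with the language that was requested. A minimal sketch, assuming a hypothetical helper LanguageCheck.served; the header value would come from whatever response object you already have. Many servers omit Content-Language altogether, so a false result only means "not confirmed".

```java
import java.util.Locale;

/** Hypothetical helper comparing a requested language with a Content-Language header. */
final class LanguageCheck {

    private LanguageCheck() {}

    /** True when a Content-Language value such as "es-US" lists the requested language. */
    static boolean served(String requested, String contentLanguage) {
        if (requested == null || contentLanguage == null) {
            return false;
        }
        // Compare primary language subtags only ("es" for "es-US").
        String wanted = Locale.forLanguageTag(requested.trim()).getLanguage();
        if (wanted.isEmpty()) {
            return false;
        }
        // Content-Language may list several comma-separated tags.
        for (String tag : contentLanguage.split(",")) {
            String primary = Locale.forLanguageTag(tag.trim()).getLanguage();
            if (primary.equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(served("es", "es-US"));
        System.out.println(served("es", "en-US"));
    }
}
```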
That can easily be confirmed by opening a private session or clearing the cookies and confirming this behaviour (I've just done that).
I think that explains the mystery of the webpage existing in Spanish but being fetched in English. (Note how most bank webpages do not conform to basic standards such as responding to simple HTTP requests... and they are handling our money!)
Regarding your second question, it would be like asking "What is the recipe to never get ill?". It just doesn't depend on you. Also note that your first concern used the word request while your second used the word retrieve. I think it should be clear by now that you can only be 100% sure of what you request, not of what you retrieve.
Regarding setting a value in a cookie manually, that is technically possible. However, it is just like adding another parameter to a GET request: http://domain.com?login=yes. The parameter will only be processed by the server if it is expecting it. Otherwise, it will be ignored. That is what will happen to the value in your cookie.
Summary: there are standards to follow. You can try to use them, but if the party on the other side doesn't, you won't get the results you expect. Your best choice: do your best and follow the standards.

Trailing characters in URL after Facebook Login

I am going through the Facebook authentication process to log my users into my site. Once a user is logged in, I redirect to the profile page using:
resp.sendRedirect("/l/profile");
But when I get to the profile page, the URL ends with /profile#_=_
This seems to be at the end of the URL that Facebook redirects to when it returns a code. Why is it sticking around, and how do I get rid of it?
I'm guessing it's a byproduct of a new feature called "Authenticated Referrals", whereby they add a valid accessToken to the end of the URL as either an #anchor or a ?param (depending on whether you're reading it client- or server-side). You can read more about it here: http://developers.facebook.com/docs/opengraph/authentication/
At any rate, because it's appended on Facebook's side, you'll have to file a bug with them to fix it, although at the moment the tracker seems to be down for me.
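A hedged aside on why the fragment survives the redirect. A URL fragment is never sent to the server, and when a redirect's Location value has no fragment of its own, browsers carry over the fragment of the URL that produced the redirect (RFC 7231, section 7.1.2). That would explain why #_=_ is still there after resp.sendRedirect("/l/profile"). One possible workaround, sketched with a hypothetical helper RedirectTarget.withOwnFragment, is to give the redirect target its own (empty) fragment so there is nothing to inherit.

```java
/** Hypothetical helper that makes sure a redirect target carries its own fragment. */
final class RedirectTarget {

    private RedirectTarget() {}

    /** Appends an empty fragment unless the target already has one. */
    static String withOwnFragment(String target) {
        return target.contains("#") ? target : target + "#";
    }

    public static void main(String[] args) {
        System.out.println(withOwnFragment("/l/profile"));
    }
}
```

The call would then become resp.sendRedirect(RedirectTarget.withOwnFragment("/l/profile")). The address bar should then end in a bare # instead of #_=_; removing the fragment completely can only be done in the browser, since the server never sees it.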

