JSOUP Cookie ID without post - java

I'm creating an Android application which uses Jsoup to log in to a website. To log in, I'm simply using the following:
loginDoc = Jsoup.connect(loginURL).get();
This connects to the login URL, which contains the user's details. What I want to do is find out the session ID (cookie data) for this session. How do I do this? As you can see, I'm using a .get() request, and all of the examples I've seen on Stack Overflow and elsewhere use .post() requests. Does anyone have any ideas?
Thanks.

The .get() method returns a Document, but if you call .execute() instead, you get a Response object that exposes the cookies, headers, and so on.
For example:
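// execute() returns the full response instead of only the parsed document
Connection.Response res = Jsoup.connect(loginURL).execute();
// "sessionId" is a placeholder; use the cookie name the site actually sets
String sessionId = res.cookie("sessionId");
Document doc = res.parse();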

Related

How to send "post-request" to site and log in [JAVA]

I'm trying to register on the site URL = http://flyner.com/signup via Jsoup.
What I am doing:
Connection.Response loginForm = Jsoup.connect(URL)
        .userAgent("Mozilla/5.0")
        .execute();
document = Jsoup.connect(URL)
        .userAgent("Mozilla/5.0")
        .data("email", mail)
        .data("pass", password)
        .data("agree", "1")
        .cookies(loginForm.cookies())
        .post();
But nothing happens.
I think I also need to add "fkey", "skey", "dkey", and maybe "ts" to my data, but how can I get them?
Usually, you get all the relevant information (session ID, tokens, etc.) to send along with a login POST request from a preceding GET request to the URL that renders the login form. This is general advice; I did not look into the URL you mentioned, so I don't know which other parameters are needed. It helps to log all network traffic from the first GET to the final POST request. You can enable the option to preserve the network log in the developer tools of a browser of your choice.
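To make the GET-then-POST advice above concrete, here is a minimal sketch of what the final POST body carries once a hidden field from the login form is added to the visible ones. It uses only the JDK. The class name LoginFormBodyDemo, the sample e-mail address, and the "fkey" value are placeholders; which hidden fields a given site expects has to be read from its form or from the browser's network log.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

class LoginFormBodyDemo {

    /** Encodes form fields as application/x-www-form-urlencoded, in iteration order. */
    static String encode(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order they were added.
        Map<String, String> fields = new LinkedHashMap<>();
        // Visible fields the user fills in (sample values).
        fields.put("email", "user@example.com");
        fields.put("pass", "secret");
        fields.put("agree", "1");
        // Hidden field copied from the form returned by the GET request;
        // the name and value are placeholders, not taken from the real site.
        fields.put("fkey", "token-from-form");
        System.out.println(encode(fields));
    }
}
```

With Jsoup, the same fields would each go into a .data(name, value) call before .post(), as in the question's code.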

Cookies with Jsoup

I'm having issues with sending POST data to this site:
https://www.amazon.com/ap/signin?openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.assoc_handle=amzn_mturk_worker&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&_encoding=UTF8&openid.mode=checkid_setup&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.pape.max_auth_age=43200&marketplaceId=A384XSLT9ODACQ&clientContext=703ea210dfe6fd07defd5ab30ac8d9&openid.return_to=https%3A%2F%2Fwww.mturk.com%2Fmturk%2Fendsignin
I'm using Jsoup. I'm trying to reuse the "session-id" cookie for the GET request, but I'm still not logged in. This is my code:
Connection.Response res = Jsoup.connect("https://www.amazon.com/ap/signin?openid.ns=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0&openid.assoc_handle=amzn_mturk_worker&openid.ns.pape=http%3A%2F%2Fspecs.openid.net%2Fextensions%2Fpape%2F1.0&_encoding=UTF8&openid.mode=checkid_setup&openid.claimed_id=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.identity=http%3A%2F%2Fspecs.openid.net%2Fauth%2F2.0%2Fidentifier_select&openid.pape.max_auth_age=43200&marketplaceId=A384XSLT9ODACQ&clientContext=703ea210dfe6fd07defd5ab30ac8d9&openid.return_to=https%3A%2F%2Fwww.mturk.com%2Fmturk%2Fendsignin").data("email", "blah#gmail.com", "password", "blah").method(Connection.Method.POST).execute();
Document doc2 = res.parse();
sessionId = res.cookie("session-id");
Document doc = Jsoup.connect("https://www.mturk.com/mturk/searchbar?selectedSearchType=hitgroups&minReward=0.00&sortType=LastUpdatedTime%3A1&pageSize=50").cookie("SESSIONID", sessionId).get();
The e-mail and password are real values instead of "blah". I don't know whether my issue is in how I parse the cookie or in how I send the POST data in the first place.
Edit: So the site uses OpenID. Not sure if I should make a whole new question, but how would I go about it now? I basically need to log in and pull information off the site after login. Here is my POST data:
appActionToken:pj2FxGfwLZT6nheliE7BMxwZrTUKEj3D
appAction:SIGNIN
clientContext:ape:NzAzZWEyMTBkZmU2ZmQwN2RlZmQ1YWIzMGFjOGQ5
openid.pape.max_auth_age:ape:NDMyMDA=
openid.return_to:ape:aHR0cHM6Ly93d3cubXR1cmsuY29tL210dXJrL2VuZHNpZ25pbg==
prevRID:ape:S1kyUFNDUkhLVFZSSjRGMjBYUUo=
openid.identity:ape:aHR0cDovL3NwZWNzLm9wZW5pZC5uZXQvYXV0aC8yLjAvaWRlbnRpZmllcl9zZWxlY3Q=
openid.assoc_handle:ape:YW16bl9tdHVya193b3JrZXI=
openid.mode:ape:Y2hlY2tpZF9zZXR1cA==
openid.ns.pape:ape:aHR0cDovL3NwZWNzLm9wZW5pZC5uZXQvZXh0ZW5zaW9ucy9wYXBlLzEuMA==
openid.claimed_id:ape:aHR0cDovL3NwZWNzLm9wZW5pZC5uZXQvYXV0aC8yLjAvaWRlbnRpZmllcl9zZWxlY3Q=
pageId:ape:YW16bl9tdHVya193b3JrZXI=
openid.ns:ape:aHR0cDovL3NwZWNzLm9wZW5pZC5uZXQvYXV0aC8yLjA=
email: -Deleted-
create:0
password: -Deleted-
metadata1:+gLgZV5Fc5cBh44WnOrKTq5ofl6IhGvSbZGHfX7T5PuwmIl0Ep4bclt77iRlLPO1thRNg/9TylDw5H/9UPZnuOcF1OAHaECaWmK9H8pkW0elpz5QgEukM4aP6dPwSliw9Ggy+1/vQCk0MLm2TvkyS8uLslyh2aEw4H7hDmcF6lTgctZVE8B2KENH97L7hp4rcR2NHKMm4tEFdwpmVqv+pmLX5rUBo4p2QNUe3g0dNAifuK3RPXCVSQyQHpUzlBuFZTFK9xspwA2dgcdSZcgQzgzQKik/WEDrn0eP4sAVnO1ZGFUWKFAY55Lzgf6yd6WxCZ15yGTWENf0Km9wnXce+Ev5SMarXPJNQtfqY6tdp5snwFxpB8m/x72AvRgWJACoi5qcyqwO6dxroebIyB9uruApIkUk07AD8bJvhcf92+flN9TY4iXCkIoeSUN5aKp8rJbyhspySgsmQ9guu4964qeQRK0J092/sx1De6VmfGQ3nMrr0+McnC4/wZo2jUhGOr62ow==
The site you're trying to log in to makes use of JavaScript. Since Jsoup doesn't support JavaScript (Jsoup 1.8.3 as of this writing), I advise you to use a tool that does, such as ui4j or Selenium. The choice is yours.
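As a side note on the POST data listed in the question: the values prefixed with "ape:" look like Base64 encodings of the query-string parameters in the sign-in URL. That is an observation from the posted data, not documented behaviour of the site. A minimal sketch to check it with the JDK's Base64 decoder follows; the class name ApeValueDemo and the decode helper are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class ApeValueDemo {

    /** Decodes a value of the shape "ape:<base64>"; any other value is returned unchanged. */
    static String decode(String value) {
        String prefix = "ape:";
        if (!value.startsWith(prefix)) {
            return value;
        }
        byte[] raw = Base64.getDecoder().decode(value.substring(prefix.length()));
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Values copied from the question's POST data.
        System.out.println(decode("ape:NDMyMDA="));             // openid.pape.max_auth_age
        System.out.println(decode("ape:Y2hlY2tpZF9zZXR1cA==")); // openid.mode
        System.out.println(decode("SIGNIN"));                   // appAction, no prefix
    }
}
```

The two prefixed samples decode to 43200 and checkid_setup, which match the max_auth_age and mode parameters in the URL. Decoding these values does not by itself make the login work.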

authenticate user from java program - java

I am trying to crawl a web page which requires authentication. I can access the page in a browser when I am logged in. I am using the Jsoup library (http://jsoup.org/) to parse HTML pages.
public static void main(String[] args) throws IOException {
    // need http protocol
    Document doc = Jsoup.connect("http://www.secinfo.com/$/SEC/Filing.asp?T=r643.91Dx_2nx").get();
    // get page title
    String title = doc.title();
    System.out.println("title : " + title);
    // get all links
    Elements links = doc.select("a");
    for (Element link : links) {
        // get the value from href attribute
        System.out.println("\nlink : " + link.attr("href"));
    }
    System.out.println();
}
Output :
title : SEC Info - Sign In
This gets the content of the sign-in page, not of the URL I am passing. I am registered on secinfo.com, and while running this program I am logged in from my default browser, Firefox.
Being logged in through your default browser does not help. Your Java program is a separate process and does not share the browser's session.
On the other hand, secinfo requires authentication, and Jsoup lets you pass authentication details.
It works for me when I pass the authentication details.
Please check this answer: Jsoup connection with basic access authentication.
Jsoup's connect() also supports post() with method chaining, if your target site's login mechanism works with a POST request:
Document doc = Jsoup.connect("url")
        .data("aUserName", "myUserName")
        .data("aPassword", "myPassword")
        .userAgent("Mozilla")
        .timeout(3000)
        .post();
But what if the page you are trying to get requires a cookie to be sent with each subsequent request? Try using HttpURLConnection with POST and read the cookie from the HTTP response header. HttpClient makes this task easier. Use the library to fetch the web page as a string, then pass the string to Jsoup.parse() to get the document.
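Reading the cookie from the response header can be sketched with the JDK's java.net.HttpCookie parser. This is a minimal sketch, not a full cookie jar: it ignores expiry, domain, and path rules. The class name SetCookieDemo and the sample header values are hypothetical. With HttpURLConnection, the raw values would come from connection.getHeaderFields().get("Set-Cookie").

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.List;

class SetCookieDemo {

    /** Turns raw Set-Cookie header values into one value for the "Cookie" request header. */
    static String toCookieHeader(List<String> setCookieHeaders) {
        List<String> pairs = new ArrayList<>();
        for (String header : setCookieHeaders) {
            // HttpCookie.parse handles the attribute syntax (Path, HttpOnly, ...).
            for (HttpCookie cookie : HttpCookie.parse(header)) {
                pairs.add(cookie.getName() + "=" + cookie.getValue());
            }
        }
        return String.join("; ", pairs);
    }

    public static void main(String[] args) {
        // Hypothetical header values standing in for a real login response.
        List<String> headers = List.of(
                "JSESSIONID=abc123; Path=/; HttpOnly",
                "lang=en; Path=/");
        System.out.println(toCookieHeader(headers));
    }
}
```

The resulting string can be set on the next request with connection.setRequestProperty("Cookie", value).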
You have to sign in with a POST request and preserve the cookies you get back. That is where your session info is stored. I wrote an example here: Jsoup can't Login on Page.
The website in that example is an exception: it sets the session cookie on the login page itself. You can leave that step out if login works without it.
The exact POST request differs from website to website. You have to dig it out of the HTML, or install a browser plugin and intercept the POST requests.

jsoup record logged

I have a website whose contents can only be seen after logging in.
I use this code to log in:
doc = Jsoup.connect("http://46.137.207.181/Account/Login.aspx")
        .data("ctl00$MainContent$LoginUser$UserName", "1234")
        .data("ctl00$MainContent$LoginUser$Password", "123456")
        .data("__VIEWSTATE","/wEPDwULLTEyMDAyNTY1NjJkGAEFHl9fQ29udHJvbHNSZXF1aXJlUG9zdEJhY2tLZXlfXxYBBSZjdGwwMCRNYWluQ29udGVudCRMb2dpblVzZXIkUmVtZW1iZXJNZUHk9FMvtsvPHqlP3vAV+1oloaxe4Asr7RQX5XFptqGz")
        .data("__EVENTVALIDATION","/wEWBQLup8mjCgLFyvjkDwLQzbOWAgKVu47QDwKnwKnjBTL6Xsxc9zQnY8p9KVlFJ/8HIHqlOGl9uClF4ktcWYJ5")
        .data("ctl00$MainContent$LoginUser$LoginButton","2")
        .post();
Then I fetch the page that requires login:
doc2 = Jsoup.connect("http://46.137.207.181/Groups.aspx").get();
s = doc.title();
Elements kelime = doc.select("td");
for (Element link : kelime) {
    linkHref = link.attr("hh");
}
But the result still shows the not-logged-in screen.
How can I make this work?
What is happening in your example is that you are logging in with form data to Login.aspx and creating a session, but the request to Groups.aspx doesn't carry that session data, so the request is not authenticated.
Login.aspx will return a session cookie, and you need to pass that cookie onto the next request.
See the answers to this jsoup login question for good examples.
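The effect described above can be reproduced without Jsoup or the real site. The sketch below is only an illustration under stated assumptions: it swaps in the JDK's java.net.http.HttpClient (Java 11+) and a throwaway local server from com.sun.net.httpserver in place of Login.aspx and Groups.aspx. The class name, the paths /login and /groups, the cookie SESSION=abc123, and the 403 status are all made up. In Jsoup, the corresponding step is to pass the login response's cookies into the next request.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.CookieManager;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

class SessionCookieDemo {

    // Stand-in for the real site: /login hands out a cookie, /groups demands it.
    static HttpServer startFakeSite() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/login", exchange -> {
            exchange.getResponseHeaders().add("Set-Cookie", "SESSION=abc123; Path=/");
            byte[] body = "ok".getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.createContext("/groups", exchange -> {
            String cookie = exchange.getRequestHeaders().getFirst("Cookie");
            boolean loggedIn = cookie != null && cookie.contains("SESSION=abc123");
            byte[] body = (loggedIn ? "groups" : "sign in").getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(loggedIn ? 200 : 403, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    /** Logs in, then requests /groups; returns the status code of the second request. */
    static int loginThenFetch(boolean keepCookies) {
        HttpServer server = null;
        try {
            server = startFakeSite();
            String base = "http://127.0.0.1:" + server.getAddress().getPort();
            HttpClient.Builder builder = HttpClient.newBuilder()
                    .version(HttpClient.Version.HTTP_1_1);
            if (keepCookies) {
                // The cookie manager stores Set-Cookie values and replays them.
                builder.cookieHandler(new CookieManager());
            }
            HttpClient client = builder.build();

            HttpRequest login = HttpRequest.newBuilder(URI.create(base + "/login"))
                    .POST(HttpRequest.BodyPublishers.ofString("user=1234"))
                    .build();
            client.send(login, HttpResponse.BodyHandlers.ofString());

            HttpRequest groups = HttpRequest.newBuilder(URI.create(base + "/groups")).build();
            return client.send(groups, HttpResponse.BodyHandlers.ofString()).statusCode();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            if (server != null) {
                server.stop(0);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("with cookies:    " + loginThenFetch(true));
        System.out.println("without cookies: " + loginThenFetch(false));
    }
}
```

Only the run that keeps the cookie from the login response gets past the check on the second page, which is the situation in the question: the second request is made on a fresh connection that knows nothing about the session.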

jsoup connect parameter

I access a web page by passing the session ID and URL, and the output is an HTML response.
I want to use Jsoup to parse this response and get the tag elements.
I see that the Jsoup examples take a String for establishing the connection. How do I proceed?
pseudo code:
I tried the above method and got this exception
java.io.IOException: 401 error loading URL http://www.abc.com/index
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:387)
at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:364)
at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:143)
at org.jsoup.helper.HttpConnection.get(HttpConnection.java:132)
Basically the entity.getContent() has the HTML response which has to be passed as a String to the connect method. But it doesn't work.
Apache Commons HttpClient and Jsoup do not share the same cookie store. You basically need to pass the very same cookies as HttpClient has retrieved back through Jsoup's Connection. You can find some concrete examples here:
Sending POST request with username and password and save session cookie
how to maintain variable cookies and sessions with jsoup?
Alternatively, you can just continue using HttpClient for firing HTTP requests and maintaining the cookies, and instead feed its HttpResponse as a String through Jsoup#parse().
So this should do:
HttpResponse httpResponse = httpclient1.execute(httpget, httpContext);
String html = EntityUtils.toString(httpResponse.getEntity());
Document doc = Jsoup.parse(html, testUrl);
// ...
By the way, you do not necessarily need to create a whole new HttpClient for a subsequent request. Just reuse the httpclient you already created. Also, your way of obtaining the response as a String is clumsy. The second line in the above example shows the simplest way to do it.
It shows an HTTP error 401, which means:
Similar to 403 Forbidden, but specifically for use when authentication is possible but has failed or not yet been provided.
Therefore, I think you need to log in to the website from your Java code, or identify yourself by sending cookies from your code.
