I have a home integrated project working with google calendar...well, it was working. I've been using it for at least 6 months, maybe a year, I forget. Suddenly google changed the rules, and I can't figure out how to make things work now.
I don't want to use a whole library to do the extremely basic operations I need to do. I don't need a bunch of extra libraries in my Tomcat app.
Here is the full code sample that used to post a new calendar event, and get the id back so that we could later delete it if we wanted to for an update, etc.
I only get 403 errors back now, and the user/pass is OK, I can get my auth token, I can also login with a browser, I did the captcha unlock page, etc. It just stopped working on 11/18/2014. It was working on 11/17/2014.
Error:
java.io.IOException: Server returned HTTP response code: 403 for URL: https://www.google.com/calendar/feeds/myuser@gmail.com/private/full
Help? urlc.getInputStream() throws the exception.
I would be happy to use OAuth2 as well, but I can't get over the aspect that all the docs indicate to use a library, and that the user is going to be presented with the google page to accept. They can't be...they don't interact with this. This is an automated server side app building out calendar events. There is no user present or web browser. So I don't get what to do...they have the service account item, and I downloaded my private key, but I see nowhere that they tell you what you are supposed to do with the private key...
I'm happy to do CalDAV too, but again, OAuth keeps me from proceeding. I have no issues with the technical aspects after login, but I can't understand google's login architecture to get that far anymore.
--Ben
HttpURLConnection urlc = (HttpURLConnection)new URL("https://www.google.com/calendar/feeds/myuser@gmail.com/private/full").openConnection();
urlc.setDoOutput(true);
urlc.setFollowRedirects(false);
urlc.setRequestMethod("POST");
urlc.setRequestProperty("Content-Type", "application/atom+xml");
urlc.setRequestProperty("Authorization", "GoogleLogin auth=" + authToken);
OutputStream out = urlc.getOutputStream();
out.write(b);
out.close();
int code = urlc.getResponseCode();
String location = "";
for (int x=0; x<10; x++)
{
System.out.println(x+":"+urlc.getHeaderFieldKey(x)+":"+urlc.getHeaderField(x));
if (urlc.getHeaderFieldKey(x) != null && urlc.getHeaderFieldKey(x).equalsIgnoreCase("Location")) location = urlc.getHeaderField(x);
}
String result = consumeResponse(urlc.getInputStream());
System.out.println(result);
urlc.disconnect();
urlc = (HttpURLConnection)new URL(location).openConnection();
urlc.setDoOutput(true);
urlc.setFollowRedirects(false);
urlc.setRequestMethod("POST");
urlc.setRequestProperty("Content-Type", "application/atom+xml");
urlc.setRequestProperty("Authorization", "GoogleLogin auth=" + authToken);
out = urlc.getOutputStream();
out.write(b);
out.close();
code = urlc.getResponseCode();
result = consumeResponse(urlc.getInputStream());
System.out.println("Raw result:"+result);
gcal_id = result.substring(result.indexOf("gCal:uid value='")+"gCal:uid value='".length());
gcal_id = gcal_id.substring(0,gcal_id.indexOf("@google.com"));
System.out.println("Calendar ID:"+gcal_id);
So I am partially answering my own question...
The "solution" is having a refresh token. This can be used offline to get new access tokens on demand that are good for about 1 hour. You submit your refresh token to https://accounts.google.com/o/oauth2/token and it will give you back a "Bearer" access token to use for the next hour.
To get your refresh token though, you need to go to a URL in your browser to grant access, and your allowed redirect URLs must be configured to where you are going to 'redirect' to. It can be something invalid, so long as you can get the 'code' parameter it's going to give you. You will need this code to then get the refresh token.
Configure the allowed redirect URLs in your developer console. Find your own link to the dev console. I don't have the points to tell you apparently.
An example URL to go to is something like this:
https://accounts.google.com/o/oauth2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcalendar&state=&redirect_uri=url_encoded_url_to_redirect_to_that_is_in_developer_console&response_type=code&client_id=some_google_randomized_id.apps.googleusercontent.com&access_type=offline&approval_prompt=force
All of this info was pulled from:
https://developers.google.com/accounts/docs/OAuth2WebServer#refresh
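For anyone who would rather build that one-time consent URL in code than by hand, here is a minimal sketch. The class name ConsentUrl and the method build are made up for illustration, and the redirect URI in main is only a placeholder; the endpoint, scope and query parameters are the ones from the example URL above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: assembles the browser URL used once to obtain the 'code' parameter.
// ConsentUrl and build() are hypothetical names, not part of any Google library.
final class ConsentUrl {
    static final String AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/auth";
    static final String CALENDAR_SCOPE = "https://www.googleapis.com/auth/calendar";

    static String build(String clientId, String redirectUri) {
        return AUTH_ENDPOINT
                + "?scope=" + enc(CALENDAR_SCOPE)
                + "&redirect_uri=" + enc(redirectUri)   // must be listed in the developer console
                + "&response_type=code"
                + "&client_id=" + enc(clientId)
                + "&access_type=offline"                // offline access is what yields a refresh token
                + "&approval_prompt=force";             // always show the consent screen
    }

    private static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own client ID and redirect URI.
        System.out.println(build("some_google_randomized_id.apps.googleusercontent.com",
                                 "http://localhost/oauth2callback"));
    }
}
```

Open the printed URL in a browser, approve, and copy the 'code' value from the address you get redirected to.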
So with all of this, you can now do the normal calendar API calls directly and pass in the Bearer authorization header.
So in total, you need exactly 0 google libraries to do all of this, they just make it very difficult to get to the meat of what is really going on. Half the "examples" even on google's pages are referencing invalid things. Most spend the majority of the example telling you how to reconfigure your eclipse to do the example...
The other side effect is this also requires the json format for calendar entries, and not the former XML style gcal was using. Not really a downside, just a change.
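As a rough illustration of what the JSON-style call looks like without any Google library, here is a sketch that posts an event to the Calendar API v3 events endpoint with the Bearer header. The class and method names are hypothetical, the JSON is built by naive string concatenation (so it assumes the values contain nothing that needs escaping), and error handling is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: CalendarEventPost and its methods are illustrative names.
final class CalendarEventPost {
    // Calendar API v3 events collection for one calendar ("primary" or the calendar's e-mail style ID).
    static String eventsUrl(String calendarId) {
        return "https://www.googleapis.com/calendar/v3/calendars/"
                + URLEncoder.encode(calendarId, StandardCharsets.UTF_8) + "/events";
    }

    // Minimal event body. Assumes the arguments contain no quotes or backslashes.
    static String eventJson(String summary, String startRfc3339, String endRfc3339) {
        return "{\"summary\":\"" + summary + "\","
             + "\"start\":{\"dateTime\":\"" + startRfc3339 + "\"},"
             + "\"end\":{\"dateTime\":\"" + endRfc3339 + "\"}}";
    }

    // POSTs the event and returns the raw JSON response, which carries the new event's "id".
    static String insertEvent(String accessToken, String calendarId, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(eventsUrl(calendarId)).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        conn.setRequestProperty("Authorization", "Bearer " + accessToken);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } finally {
            conn.disconnect();
        }
    }
}
```

The "id" in the response plays the role the gCal:uid used to: keep it if you want to delete or update the event later.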
Until next year when it all breaks again...
https://apidata.googleusercontent.com/caldav/v2/calid/user
Where calid should be replaced by the "calendar ID" of the calendar to be accessed. This can be found through the Google Calendar web interface as follows: in the pull-down menu next to the calendar name, select Calendar Settings. On the resulting page the calendar ID is shown in a section labelled Calendar Address. The calendar ID for a user's primary calendar is the same as that user's email address.
Please refer to the link below:
https://developers.google.com/google-apps/calendar/caldav/v2/guide
I am working on an app that will help me log in to a website and view data that I need. While I have no trouble parsing that data and working with it properly, I did face an issue with logging in to the website. I tried sending a POST request, but that didn't work for some reason, so I started looking more closely at how the POST request to that website is sent in the browser, and here is what I got:
Picture
I also asked the guy who developed that website and he said that I should use two cookies, "ulogin" and "upassword", for my login. I tried using jsoup as shown here: https://jsoup.org/cookbook/input/load-document-from-url
I used .cookie("upassword", "10101010"), yet it didn't work, so it makes me think that there is a bit more to it than just adding a single line to a POST request.
Please, can someone explain to me how to use cookies to log in to a website, or at least point me in a direction where I can learn that? I am so close to making this app happen and being able to proceed further with its development, but this one step is where I am really stuck.
Here is an additional picture with Response and Request Headers from the Firefox. Picture
I managed to get it working a long time ago, yet didn't post an answer. So, here we go.
Cookies are just simple headers, therefore you should treat them as such. In my case, with the use of HttpURLConnection, here is a piece of working code:
Note: My original request was for Java; however, I have since moved to Kotlin, so this solution uses Kotlin, and the function is a "suspend" function, which means that it is designed to be used with Kotlin Coroutines.
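suspend fun httpRequest(): String {
    // url_profile is the java.net.URL of the page to fetch (defined elsewhere).
    val conn = url_profile.openConnection() as HttpURLConnection
    conn.requestMethod = "POST"
    conn.doOutput = true
    conn.doInput = true
    // All cookies travel in one "Cookie" header: "name1=value1; name2=value2"
    conn.setRequestProperty(
        "Cookie",
        "YOUR COOKIE DATA"
    )
    // Read the whole response body and close the reader afterwards.
    return BufferedReader(InputStreamReader(conn.inputStream)).use { it.readText() }
}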
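Since the question was originally about Java, here is a small sketch of the same idea there: the cookie names from the question are joined into a single Cookie header value. The class name CookieHeader and the "myuser" value are placeholders, not anything the site defines.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch only: CookieHeader is an illustrative helper, not a library class.
final class CookieHeader {
    // Joins name/value pairs as "name1=value1; name2=value2", the format of the Cookie request header.
    static String of(Map<String, String> cookies) {
        return cookies.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        Map<String, String> cookies = new LinkedHashMap<>();
        cookies.put("ulogin", "myuser");       // placeholder login
        cookies.put("upassword", "10101010");  // value from the question
        System.out.println(of(cookies));
        // With HttpURLConnection: conn.setRequestProperty("Cookie", CookieHeader.of(cookies));
    }
}
```

A LinkedHashMap keeps the cookies in insertion order, which makes the header easy to compare with what the browser sends.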
I am using jsoup to scrape a web page as follows:
public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";
String searchURL = GOOGLE_SEARCH_URL + "?q="+searchTerm+"&num="+num +
"&start=" + start;
Document doc = Jsoup.connect(searchURL)
.userAgent("Mozilla/5.0 Chrome/26.0.1410.64 Safari/537.31")
// .ignoreHttpErrors(true)
.maxBodySize(1024*1024*3)
.followRedirects(true)
.timeout(100000)
.ignoreContentType(true)
.get();
Elements results = doc.select("h3.r > a");
for (Element result : results) {
String linkHref = result.attr("href");
}
But my problem is that the code works well at first.
After a while, it stops and always gives me "HTTP error fetching URL. Status=503".
When I add .ignoreHttpErrors(true), it runs without any error but it does not scrape anything.
*searchTerm is any keyword I want to search for and num is the number of pages that I need to retrieve.
Could anyone help, please?
Does this mean that Google blocked my IP from scraping? If yes, is there any solution, or how can I scrape the Google search results?
I need help.
Thank you,
A 503 error usually means the website you are trying to scrape is blocking you because it doesn't want non-human users navigating the site. Google especially.
There are some things you can do, though, such as:
Using proxy rotator
Use chromedriver
Add some delays to your application after each page
Basically you need to be as human as possible to prevent sites blocking you.
EDIT:
I need to warn you that scraping Google search results is against their ToS and might be illegal depending on where you are.
What you can do
You can use a proxy rotating service to mask your requests, so Google will see them as requests from multiple regions. Google "proxy rotator service" if you are interested. It might be expensive, depending on what you do with the data.
Then code a module that changes the User-Agent on every request to make Google less suspicious of your requests.
Add a random delay after scraping each page. I suggest around 1-5 seconds. A randomized delay makes your requests look more human-like to Google.
Finally, if everything fails, you might want to look into the Google search API and use it instead of scraping their site.
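The random delay and the changing User-Agent could look roughly like the sketch below. The class name PoliteFetch is made up, and apart from the first User-Agent (taken from the question) the strings are placeholders you would replace with real browser values.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

// Sketch only: PoliteFetch and its members are illustrative names.
final class PoliteFetch {
    // The first entry comes from the question; the others are placeholders.
    static final List<String> USER_AGENTS = List.of(
            "Mozilla/5.0 Chrome/26.0.1410.64 Safari/537.31",
            "Mozilla/5.0 (placeholder agent B)",
            "Mozilla/5.0 (placeholder agent C)");

    // Uniformly random delay in [minMillis, maxMillis], both ends included.
    static long randomDelayMillis(long minMillis, long maxMillis) {
        return ThreadLocalRandom.current().nextLong(minMillis, maxMillis + 1);
    }

    // Cycles through the list so consecutive requests use different agents.
    static String userAgentFor(int requestIndex) {
        return USER_AGENTS.get(Math.floorMod(requestIndex, USER_AGENTS.size()));
    }

    public static void main(String[] args) {
        for (int page = 0; page < 3; page++) {
            long delay = randomDelayMillis(1000, 5000);
            // A real loop would fetch the page with this agent, then call Thread.sleep(delay).
            System.out.println("page " + page + ": UA=" + userAgentFor(page) + ", wait " + delay + " ms");
        }
    }
}
```

With jsoup, the chosen string would go into .userAgent(...) in the request from the question.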
I am tasked with checking whether some URLs are working correctly, I'm using Java to make HTTP get request to get the response code.
So what I did was this.
URL u = new URL("some URL");
HttpURLConnection huc = (HttpURLConnection) u.openConnection();
huc.setRequestMethod("GET");
huc.connect();
int code = huc.getResponseCode();
System.out.println(code + " " + huc.getURL());
The Problem: Some sites require you to login to access the page, but the page doesn't return a 401 code, but 200. Note that the web page doesn't show up until a username and password are provided. It asks for authentication in a pop up window.
So how do I catch these kind of links?
Also, how can I identify if a webpage shows a login page like http://www.example.com/login/? Is it sufficient to just check the URL for the word “login”?
There's no universal way to deal with this. You have to know how the site you're using does authentication - 401? separate login page? multi-factor auth (ie: using RSA token)? Checking for the substring "login" in the URL is a possible way of handling some, but not enough for a general way.
For example, a 401 will only happen when using basic authentication (or when trying to access protected resources directly). There are a lot of other ways to do auth.
John sums up the issue quite well in his comment:
If you have to deal with pages that roll their own custom authentication, then it follows that you probably have to write your own custom code to accommodate them. Depending on how the relevant sites work, you might be able to bypass authentication by sending an appropriate cookie in your request, as if you had already authenticated, or by some similar means
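To make the "no universal way" point concrete, here is a sketch of a purely heuristic check. It is an assumption-laden starting point, not a general solution: it treats a 401, or a redirect whose Location contains "login" or "signin", as "probably needs login". The names AuthCheck, Verdict, classify and check are made up. Redirect following is switched off so that a 3xx to a login page is not hidden behind the 200 of the login page itself.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Locale;

// Sketch only: a heuristic, per-site rules will still be needed for custom auth.
final class AuthCheck {
    enum Verdict { OK, PROBABLY_NEEDS_LOGIN, OTHER }

    // Classifies a response from its status code and (possibly null) Location header.
    static Verdict classify(int status, String location) {
        if (status == 401) {
            return Verdict.PROBABLY_NEEDS_LOGIN;
        }
        if (status >= 300 && status < 400 && location != null) {
            String l = location.toLowerCase(Locale.ROOT);
            if (l.contains("login") || l.contains("signin")) {
                return Verdict.PROBABLY_NEEDS_LOGIN;
            }
            return Verdict.OTHER;
        }
        if (status >= 200 && status < 300) {
            return Verdict.OK;
        }
        return Verdict.OTHER;
    }

    // Fetches the URL without following redirects and classifies the first response.
    static Verdict check(String url) throws IOException {
        HttpURLConnection huc = (HttpURLConnection) new URL(url).openConnection();
        huc.setInstanceFollowRedirects(false);
        huc.setRequestMethod("GET");
        try {
            return classify(huc.getResponseCode(), huc.getHeaderField("Location"));
        } finally {
            huc.disconnect();
        }
    }
}
```

Sites that render a login form directly with a 200 will still come back as OK, which is exactly the case that needs site-specific code.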
I'm trying to integrate Google APIs inside a project (Thesis project) and I have some doubts and questions. So, here it is the scenario:
I wrote a back-end application in Java that runs solely from the command line and has absolutely no interaction with a user. Its goal is to allow communication and interaction between sensors and actuators. Everything works great. Now I'd like to integrate something that lets the sensors back up data both periodically and when some threshold value is detected. So I thought, why not try Google Drive? The first very useful links have been:
https://developers.google.com/drive/web/quickstart/quickstart-java
https://developers.google.com/accounts/docs/OAuth2InstalledApp
The quick start examples work like a charm. However, they require quite a bit of setup: create a project inside the Developer Console (and therefore an account), enable the Drive API, then create a Client ID and a Client Secret. Once you've done these steps, you can hard-code the client ID and secret to form the request URL for Google Drive. Then you're kindly asked to enter the URL in a browser, log in if you aren't already, accept, and finally copy and paste the authorization code into your console to obtain an access token. Wow, quite a security process. But hey, I completely agree with it, above all in a scenario where we have a web app, a smartphone app or a web service that needs users' authentication and authorization in order to let the app do its job by accessing someone else's account. But in my case, I would just like the sensors to back up data to my Google Drive.
These facts lead to my first question: in order to use Google APIs (Drive in this case), do I have to create a project anyway? Or is there another approach? If I'm not wrong, there aren't other ways to create a client Id and secret without creating a project inside the Developer Console. This puzzles me a lot. Why should I create a project to use basically some libraries?
So, let's assume the previous points are justifiable constraints and move on to the real question: how to automate the authentication process? Given my scenario, where a sensor (simply a Java module) wants to back up data, it would be impossible to complete all those steps. The Google page about OAuth 2.0 has great explanations of the different scenarios where we can embed the authentication procedure, including one for "devices with limited input capabilities". Unluckily, this is more complicated than the others and requires that "The user switches to a device or computer with richer input capabilities, launches a browser, navigates to the URL specified on the limited-input device, logs in, and enters the code." (LOL)
So, I didn't give up and I ended up on this post that talks about the OAuth Playground: How do I authorise an app (web or installed) without user intervention? (canonical ?). It really looks like a solution for me, in particular when it says:
NB2. This technique works well if you want a web app which access
your own (and only your own) Drive account, without bothering to write
the authorization code which would only ever be run once. Just skip
step 1, and replace "my.drive.app" with your own email address in step
5.
However, if I'm not wrong, the OAuth Playground is just for helping to test and debug projects that use Google APIs, isn't it? Moreover, Google Drive classes such as GoogleAuthorizationCodeFlow and GoogleCredential (used inside the Java quick start example) always need a Client ID, Client Secret and so on, which brings me back to square one (create a project and do the whole graphical procedure).
In conclusion: is there a way to avoid the "graphical" authentication interaction and convert it into an automated process using only Drive's APIs without the user intervention? Thanks a lot, I would be grateful for any tip, hint, answer, pointer :-)
This is just a snippet of code that I wrote thanks to pinoyyid's suggestions. Just to recap what we should do in this case (when your program has no user interaction for completing the whole Google GUI authentication process). As reported in https://developers.google.com/drive/web/quickstart/quickstart-java
Go to the Google Developers Console.
Select a project, or create a new one.
In the sidebar on the left, expand APIs & auth. Next, click APIs. In the list of APIs, make sure the status is ON for the Drive API.
In the sidebar on the left, select Credentials.
In either case, you end up on the Credentials page and can create your project's credentials from here.
From the Credentials page, click Create new Client ID under the OAuth heading to create your OAuth 2.0 credentials. Your application's client ID, email address, client secret, redirect URIs, and JavaScript origins are in the Client ID for web application section.
The pinoyyid post is neater and gets straight to the point: How do I authorise a background web app without user intervention? (canonical ?)
Pay attention to step number 7
Finally the snippet of code is very simple, it's just about sending a POST request and it's possible to do that in many ways in Java. Therefore this is just an example and I'm sure there is room for improvements ;-)
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.net.ssl.HttpsURLConnection;
// JSONObject and JSONException are assumed to come from the org.json library
import org.json.JSONException;
import org.json.JSONObject;

// Used both to set the access token the first time we run the module and, in general, to refresh it.
// CLIENT_ID, CLIENT_SECRET, REFRESH_TOKEN and ACCESS_TOKEN are fields of the enclosing class.
public void sendPOST() {
    try {
        URL url = new URL("https://www.googleapis.com/oauth2/v3/token");
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("client_id", CLIENT_ID);
        params.put("client_secret", CLIENT_SECRET);
        params.put("refresh_token", REFRESH_TOKEN);
        params.put("grant_type", "refresh_token");

        // Build the form-encoded body: key1=value1&key2=value2...
        StringBuilder postData = new StringBuilder();
        for (Map.Entry<String, Object> param : params.entrySet()) {
            if (postData.length() != 0) postData.append('&');
            postData.append(URLEncoder.encode(param.getKey(), "UTF-8"));
            postData.append('=');
            postData.append(URLEncoder.encode(String.valueOf(param.getValue()), "UTF-8"));
        }
        byte[] postDataBytes = postData.toString().getBytes("UTF-8");

        HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        conn.setRequestProperty("Content-Length", String.valueOf(postDataBytes.length));
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(postDataBytes);
        }

        // Read the response body, which should be a JSON structure
        StringBuilder responseBody = new StringBuilder();
        try (BufferedReader in_rd = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
            String inputLine;
            while ((inputLine = in_rd.readLine()) != null) {
                responseBody.append(inputLine);
            }
        }

        // Parse the response. Pass the JSON text as a String, not the StringBuilder itself.
        JSONObject jsonResp = new JSONObject(responseBody.toString());
        // Replace the previous access token
        ACCESS_TOKEN = jsonResp.getString("access_token");
    }
    catch (MalformedURLException ex_URL) {
        System.out.println("An error occurred: " + ex_URL.getMessage());
    }
    catch (JSONException ex_json) {
        System.out.println("An error occurred: " + ex_json.getMessage());
    }
    catch (IOException ex_IO) {
        System.out.println("An error occurred: " + ex_IO.getMessage());
    }
} // end of sendPOST method
Hope this snippet of code will help others that will face the same situation !
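One possible refinement, sketched here under assumptions: since each access token only lives for a limited time (the token response reports it as "expires_in", in seconds), the module could cache the token and call the refresh request only when it is about to expire. CachedToken, Token and the 60-second safety margin are illustrative choices, and the refresher passed in stands for whatever performs the POST.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Sketch only: caches an access token and refreshes it shortly before it expires.
final class CachedToken {
    // Result of one refresh call: the token text and its lifetime in seconds.
    static final class Token {
        final String value;
        final long expiresInSeconds;

        Token(String value, long expiresInSeconds) {
            this.value = value;
            this.expiresInSeconds = expiresInSeconds;
        }
    }

    private static final long SAFETY_MARGIN_MILLIS = 60_000; // refresh one minute early

    private final Supplier<Token> refresher;   // performs the actual token request
    private final LongSupplier clockMillis;    // e.g. System::currentTimeMillis
    private String current;
    private long expiresAtMillis;

    CachedToken(Supplier<Token> refresher, LongSupplier clockMillis) {
        this.refresher = refresher;
        this.clockMillis = clockMillis;
    }

    // Returns the cached token, refreshing first if none exists or it is close to expiry.
    synchronized String get() {
        long now = clockMillis.getAsLong();
        if (current == null || now >= expiresAtMillis - SAFETY_MARGIN_MILLIS) {
            Token t = refresher.get();
            current = t.value;
            expiresAtMillis = now + t.expiresInSeconds * 1000;
        }
        return current;
    }
}
```

Passing the clock in as a parameter is only there to make the expiry logic easy to test; in the module it would simply be System::currentTimeMillis.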
I wrote the SO post at How do I authorise an app (web or installed) without user intervention? (canonical ?)
What it describes is indeed the solution to your use-case. The key bit you'd missed is step 7 where you enter the details of your own application into the OAuth Playground. From that point, the playground is impersonating your app and so you can do the one-time authorization and obtaining a refresh token.
I am looking for a clean/simple way in HtmlUnit to request a webpage from a server in a specific language.
To do this I have been trying to request the "bankofamerica.com" homepage in Spanish instead of English.
This is what I have done so far:
I tried to set the "Accept-Language" header to "es" in the HTTP request. I did this using:
myWebClient.addRequestHeader("Accept-Language" , "es");
It did not work. I then created a web request with the following code:
URL myUrl = new URL("https://www.bankofamerica.com/");
WebRequest myRequest = new WebRequest(myUrl);
myRequest.setAdditionalHeader("Accept-Language", "es");
HtmlPage aPage = myWebClient.getPage(myRequest);
Since this failed too, I printed out the request object for this URL to check whether these headers are being set.
[<url="https://www.bankofamerica.com/", GET, EncodingType[name=application/x-www-form-urlencoded], [], {Accept-Language=es, Accept-Encoding=gzip, deflate, Accept=*/*}, null>]
So the server is being asked for a Spanish page, but in response it is sending the homepage in English (the response header has Content-Language set to en-US).
I did find a hack to retrieve the BOA page in Spanish. I visited this page and used the Chrome developer tools to get the cookie value from the request header. I used this value to do the following:
myRequest.setAdditionalHeader("Cookie", "TLTSID= ........._LOCALE_COOKIE=es-US; CONTEXT=es_US; INTL_LANG=es_US; LANG_COOKIE=es_US; hp_pf_anon=anon=((ct=+||st=+||fn=+||zc=+||lang=es_US));..........1870903; throttle_value=43");
I am guessing the answer lies somewhere here.
Here lies my next question. If I am writing a script to retrieve 100 different websites in Spanish (i.e. assuming they all have Spanish versions of their pages), is there a clean way in HtmlUnit to accomplish this?
(If cookies are indeed the solution, then to create them in HtmlUnit you need to specify the domain name. One would then have to create cookies for each of the 100 sites. As far as I know there is no way in HtmlUnit to do something like:
Cookie langCookie = new Cookie("All Domains","LANG_COOKIE","es_US");
myWebClient.getCookieManager().addCookie(langCookie);)
NOTE: I am using HtmlUnit 2.12 and setting BrowserVersion.CHROME in the webclient
Thanks.
Regarding your first concern the clear/simple(/only?) way of requesting a webpage in a particular language is, as you said, to set the HTTP Accept-Language request header to the locale(s) you want. That is it.
Now the fact that you request a page in a particular language doesn't mean that you will actually get a page in that language. The server has to be set up to process that HTTP header and respond accordingly. Even if a site has a whole section in spanish it doesn't mean that the site is responding to the HTTP header.
A clear example of this is the page you provided. I performed a quick test on it and found that it is clearly not responding to the Accept-Language I set (which was es). Hitting the home page using es resulted in getting results in English. However, the page has a link that states En Español, which means In Spanish. Following it does switch the page to Spanish, and you get redirected to https://www.bankofamerica.com?request_locale=es_US.
So you might be tempted to think that the page handles the locale with a request parameter. However, that is not (only) the case, because if you then open the home page again (without the locale parameter) you will see the Spanish version again. That is clear proof that the locale is being stored somewhere else, most likely in the session, which will most likely be handled by cookies.
That can easily be confirmed by opening a private session or clearing the cookies and confirming this behaviour (I've just done that).
I think that explains the mystery of the webpage existing in Spanish but being fetched in English. (Note how most bank webpages do not conform to basic standards such as responding to simple HTTP requests... and they are handling our money!)
Regarding your second question, it would be like asking What is the recipe to never get ill?. It just doesn't depend on you. Also note that your first concern used the word request while your second concern used the word retrieve. I think it should be clear by now that you can only be 100% sure of what you request but not of what you retrieve.
Regarding setting a value in a cookie manually, that is technically possible. However, that is just like adding another parameter in a get request: http://domain.com?login=yes. The parameter will only be processed by the server if it is expecting it. Otherwise, it will be ignored. That is what will happen to the value in your cookie.
Summary: There are standards to follow. You can try to use them, but if the one on the other side doesn't, then you won't get the results you expect. Your best choice: do your best and follow the standards.
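For a script that walks through many sites, one way to act on the request/retrieve distinction is to send a weighted Accept-Language value and then look at the Content-Language response header to see what actually came back. The sketch below is plain Java rather than HtmlUnit; LanguageCheck, acceptLanguage and served are made-up names, and the decreasing q-values are just one reasonable choice.

```java
import java.util.List;
import java.util.Locale;

// Sketch only: helpers for requesting a language and checking what was served.
final class LanguageCheck {
    // Builds e.g. "es-US,es;q=0.9,en;q=0.8": the first tag has the implicit weight 1,
    // each following tag gets a weight 0.1 lower (never below 0.1).
    static String acceptLanguage(List<String> tags) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tags.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(tags.get(i));
            if (i > 0) {
                double q = Math.max(0.1, 1.0 - 0.1 * i);
                sb.append(";q=").append(String.format(Locale.ROOT, "%.1f", q));
            }
        }
        return sb.toString();
    }

    // True when a Content-Language value names the wanted language ("es" matches "es" and "es-US").
    // A missing header returns false here, although it really means "unknown".
    static boolean served(String contentLanguage, String wantedLang) {
        if (contentLanguage == null) {
            return false;
        }
        String wanted = wantedLang.toLowerCase(Locale.ROOT);
        for (String part : contentLanguage.split(",")) {
            String tag = part.trim().toLowerCase(Locale.ROOT);
            if (tag.equals(wanted) || tag.startsWith(wanted + "-")) {
                return true;
            }
        }
        return false;
    }
}
```

The string from acceptLanguage can be passed to addRequestHeader("Accept-Language", ...) as in the question, and served(...) can flag the sites that ignored it so they get handled separately.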