Google Query from Java? - java

I'm writing a Java program, and I want a function that, given a string, returns the number of Google hits a search formed from that query returns. How can I do this? (Bonus points for the same answer but with Bing instead.)
For instance, googleHits("Has anyone really been far even as decided to use even go want to do look more like?") would return 131,000,000. (or however many there are.)
Related: How can I programmatically access the "did you mean" suggestion? (eg searching "teh circuz" returns "did you mean the circus?")
found it: http://code.google.com/apis/ajaxsearch/documentation/#fonje
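For what it's worth, here is a minimal sketch of the parsing half of googleHits. It assumes the search endpoint returns JSON containing an estimatedResultCount field (that field name is my assumption about the response shape, so check it against the linked documentation). The class and method names are made up for illustration, and a real JSON parser would be more robust than a regex.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HitCountSketch {
    // Assumed field name; accepts the count as either a JSON string or a JSON number.
    private static final Pattern COUNT =
            Pattern.compile("\"estimatedResultCount\"\\s*:\\s*\"?(\\d+)\"?");

    // Returns the hit count found in the JSON text, or empty if the field is absent.
    static OptionalLong parseHitCount(String json) {
        Matcher m = COUNT.matcher(json);
        return m.find()
                ? OptionalLong.of(Long.parseLong(m.group(1)))
                : OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Hand-written sample response, not real search output.
        String sample = "{\"responseData\":{\"cursor\":{\"estimatedResultCount\":\"131000000\"}}}";
        System.out.println(parseHitCount(sample));
    }
}
```

Fetching the JSON itself is left out on purpose, since whether automated queries are permitted depends on the service's terms.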

The Google Terms of Service say this:
5.3 You agree not to access (or attempt to access) any of the Services by any means other than through the interface that is provided by Google, unless you have been specifically allowed to do so in a separate agreement with Google. You specifically agree not to access (or attempt to access) any of the Services through any automated means (including use of scripts or web crawlers) and shall ensure that you comply with the instructions set out in any robots.txt file present on the Services.
Google has ways of making life unpleasant for you / your company if you violate the Terms of Service ...
UPDATE: The second sentence is about the way that you use Google's services ... including their published APIs. It is not entirely clear from the wording what is allowed and what is forbidden; literally speaking "any automated means" is very broad. However a Java app that performed Google searches, screen-scraped the results and repackaged them to provide some value added service would (IMO) be a violation of the TOS. And using Google's published APIs to do the same thing would (IMO) also be a violation.
But that's my opinion, not Google's. And it is the Google opinion that matters. If anyone is thinking of doing something like this, they should contact Google and check that what they are proposing is OK.
The point is that Google is not going to assist people to subvert their search business model. Anyone who thinks they can get away with it based on some clever interpretation of the TOS is going to get burned.

For the first part of the answer, try reading the TOS; for the "did you mean" part, see: http://norvig.com/spell-correct.html
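To give a flavor of the idea in that article, here is a rough Java sketch: generate every string one edit away from the input and pick the candidate with the highest count in a word-frequency table. It is simplified (edit distance 1 only, lowercase ASCII letters only), and the class name and the tiny frequency table are invented for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SpellSketch {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // All strings one delete, transpose, replace, or insert away from word.
    static Set<String> edits1(String word) {
        Set<String> edits = new HashSet<>();
        for (int i = 0; i <= word.length(); i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            if (!right.isEmpty()) {
                edits.add(left + right.substring(1)); // delete one character
            }
            if (right.length() > 1) {
                // transpose two adjacent characters
                edits.add(left + right.charAt(1) + right.charAt(0) + right.substring(2));
            }
            for (char c : ALPHABET.toCharArray()) {
                if (!right.isEmpty()) {
                    edits.add(left + c + right.substring(1)); // replace one character
                }
                edits.add(left + c + right); // insert one character
            }
        }
        return edits;
    }

    // Returns the word itself if known, else the most frequent known candidate, else the input.
    static String correct(String word, Map<String, Integer> counts) {
        if (counts.containsKey(word)) {
            return word;
        }
        String best = word;
        int bestCount = 0;
        for (String candidate : edits1(word)) {
            int count = counts.getOrDefault(candidate, 0);
            if (count > bestCount) {
                best = candidate;
                bestCount = count;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy frequency table; a real one would be built from a large text corpus.
        Map<String, Integer> counts = Map.of("the", 100, "ten", 5, "circus", 3);
        System.out.println(correct("teh", counts));
        System.out.println(correct("circuz", counts));
    }
}
```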

You may be able to do it "legally" using the Google Java Client Library. I don't know for sure, but they may have some methods similar to what you're looking for, and you won't be violating their TOS.
Google Data APIs Library

You can legally access the Google AJAX Feed API through its RESTful interface:
http://code.google.com/apis/ajaxfeeds/documentation/#fonje
Bing still has a developer program where you can call their API in a JSON, XML or SOAP manner:
http://www.bing.com/developers

Related

Is it possible to have my app communicate with moodle?

I am thinking about building a student app, that would use Moodle data, and notify the user when a new file has been uploaded, and perhaps do something like checking your grades etc.
I'm quite new to Android programming and get easily confused by the technical terms. I've looked around the web and found that there is an API, but I don't really know 100% what that means, which is weird since I've communicated with APIs like the OpenWeatherMap one and uTorrent. Would Moodle's API do the same? Make it easier for me to get their data? Their descriptions are really technical and I cannot understand much.
Please note that "API" in moodle does not automatically refer to webservices like you are used to communicate with.
See https://docs.moodle.org/32/en/Mobile_web_services and https://docs.moodle.org/32/en/Using_web_services and https://docs.moodle.org/dev/Web_service_API_functions
These docs might be interesting for you.
Besides the existing webservice methods, you can also create your own Moodle plugin that provides the methods you need. Some info can be found here: https://docs.moodle.org/dev/Adding_a_web_service_to_a_plugin
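As a rough illustration of what calling one of those web service functions looks like, here is a small Java sketch that builds the request URL for Moodle's REST endpoint. The base URL and token are placeholders, the class name is made up, and the exact parameters should be checked against the docs linked above; the function name core_webservice_get_site_info is one of the standard functions, as far as I know.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class MoodleRestUrl {
    // Builds a GET URL for a Moodle web service function, asking for a JSON reply.
    static String build(String baseUrl, String token, String function) {
        return baseUrl + "/webservice/rest/server.php"
                + "?wstoken=" + encode(token)
                + "&wsfunction=" + encode(function)
                + "&moodlewsrestformat=json";
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder site and token; use your own Moodle site and a token issued by it.
        System.out.println(build("https://moodle.example.edu", "abc123",
                "core_webservice_get_site_info"));
    }
}
```

You would then fetch that URL with whatever HTTP client your app already uses and parse the JSON that comes back.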

Understand Twitter API

I'm working on a Java program that is (at least trying) to use the Twitter API. However, it is my first project using any type of API and I am a little confused. What is the benefit of using a Java library for the Twitter API such as Twitter4J, and how would one go about not using one? I'm a little fuzzy on the topic of APIs in general and I'm not finding anything in my searches that really makes it clear how to use one. Do I need to use a Java library or can I do it without one? What are the pros and cons of using one vs. not using one? I am relatively new to this and am having some issues. Any help?
First what an API is:
An application programming interface (API) is a particular set of rules ('code') and specifications that software programs can follow to communicate with each other. It serves as an interface between different software programs and facilitates their interaction, similar to the way the user interface facilitates interaction between humans and computers. An API can be created for applications, libraries, operating systems, etc., as a way of defining their "vocabularies" and resources request conventions (e.g. function-calling conventions). It may include specifications for routines, data structures, object classes, and protocols used to communicate between the consumer program and the implementer program of the API.
Using Twitter4J would allow you to easily call commands that do complex operations, such as getting tweets as they come in. For projects such as this, using a library like this is the best way to go about it. Either way, you will also need to get an access key, which gives you permission to use the API.
Examples using Twitter4J: http://twitter4j.org/en/code-examples.html
You need to distinguish between an "API" and a "Library"
You NEED the Twitter API: it's the thing that connects twitter to your code. You can use this to send a "post this to my account" command for instance.
You CAN use a library: it helps your code talk to the API by doing some of the work for you. You can call a function with only a string as a parameter, and this function calls the aforementioned send-to-twitter API.
You can of course say that the library has an API too, but this would confuse the situation a bit.
In the end it is quite nice to use the library because it helps you by writing code in your language.
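To make the difference concrete, here is a rough sketch of the "without a library" route: you assemble the HTTP request for the API yourself. The endpoint URL, the form field name and the Bearer-token header are all assumptions for illustration only and are not Twitter's actual interface; the class name is made up as well.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

class StatusClientSketch {
    // Hypothetical endpoint, used only for illustration.
    static final String ENDPOINT = "https://api.example.com/statuses/update";

    // Builds (but does not send) a form-encoded POST carrying the status text.
    static HttpRequest buildPostRequest(String accessToken, String text) {
        String body = "status=" + URLEncoder.encode(text, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Authorization", "Bearer " + accessToken) // assumed auth scheme
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildPostRequest("my-token", "hello world");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Sending it would be a call like HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), followed by parsing the response and handling errors yourself. A library hides all of that behind a single method call that takes the string.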

How to implement a social media/website monitoring service?

I would like to implement some kind of service my customers can use to find their company on
a. blogs, forums
b. facebook, twitter
c. review sites
a. blogs, forums
This can only be done by a crawler, right? A crawler looking for the robots.txt on a forum/blog and then optionally reading the content (and of course links) of the forum/blog.
But where to start? Can I use a set of sites to start crawling with? Do I have to predefine them, or can I use some other search engine first? E.g. searching Google for that company and then crawling the SERPs? Is that legal?
b. facebook, twitter
They have APIs, so that should not be a problem, I think.
c. review sites
I looked at some review sites' TOS and they state that crawling their sites with automated software is not permitted. On the other hand, the pages that are relevant to me are not disallowed in their robots.txt. What matters here?
Any other hints are welcome.
Thanks in advance :-)
Honestly, the easiest way to do it would be to start with the search engines. They all have APIs for doing automated searches, so that'd probably give you the highest return for your time on getting back links/mentions of your client's products or brand.
That won't handle things behind authentication, only public stuff (of course). But it'll give you a good baseline to start with. From there, you could (if you want) use APIs or custom-written bots that are given auth creds on the sites, but honestly I think at that point you're missing the core question.
Is the core question, "Where are we mentioned?" or is the core question really... "What sites are getting traffic to come to us?" In most cases, it's the latter, in which case you can ignore all of what I said previously and just use Google Analytics, or similar software on your client's site to determine where traffic's coming from.
Edit
OK, so if the question is "where are we mentioned?", I'd still start with the search engines as stated. Google's API is pretty easy to use, and it has a SOAP-based one that you can pull in as a web reference if you want.
Re: review sites. If the site's TOS says you can't use automated bots, then it's a good idea not to use automated bots. The robots.txt is not legally binding (it's sort of a good-neighbor thing), so I wouldn't take the lack of an exclusion there as permission. Some review sites (the more modern ones) might disallow automated scraping of their site but still publish RSS or Atom feeds, or have some other API that you can hook into. That's worth checking.
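For the crawler part, here is a minimal sketch of checking robots.txt before fetching a path. It is deliberately simplified: it only honors Disallow prefixes in the "User-agent: *" group and ignores Allow lines, wildcards and bot-specific groups. The class and method names are made up. As said above, passing this check is a courtesy test, not permission under a site's TOS.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class RobotsTxtCheck {
    // Collects the Disallow path prefixes that apply to "User-agent: *".
    static List<String> disallowedPrefixes(String robotsTxt) {
        List<String> prefixes = new ArrayList<>();
        boolean inWildcardGroup = false;
        boolean lastWasAgent = false;
        for (String rawLine : robotsTxt.split("\\R")) {
            // Drop comments and surrounding whitespace.
            int hash = rawLine.indexOf('#');
            String line = (hash >= 0 ? rawLine.substring(0, hash) : rawLine).trim();
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!lastWasAgent) {
                    inWildcardGroup = false; // a new group of rules starts here
                }
                if (value.equals("*")) {
                    inWildcardGroup = true;
                }
                lastWasAgent = true;
            } else {
                lastWasAgent = false;
                if (field.equals("disallow") && inWildcardGroup && !value.isEmpty()) {
                    prefixes.add(value);
                }
            }
        }
        return prefixes;
    }

    // True if no collected Disallow prefix matches the start of the path.
    static boolean isAllowed(String robotsTxt, String path) {
        for (String prefix : disallowedPrefixes(robotsTxt)) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String robots = "User-agent: *\nDisallow: /private/\n\nUser-agent: otherbot\nDisallow: /\n";
        System.out.println(isAllowed(robots, "/reviews/acme"));
        System.out.println(isAllowed(robots, "/private/notes"));
    }
}
```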

How do I use OAuth within my GWT application?

How do I use OAuth within my Java GWT application?
In particular, I want to get a list of users in my Google Apps domain, using this API:
http://code.google.com/googleapps/domain/profiles/developers_guide_protocol.html
I know this sounds like a question, that probably has been asked many times before, but I couldn't find any Java code on how to realize the OAuth steps described in the API above.
I would be glad if someone could share some code, or point me to the right docs.
This tutorial by Matt Raible is easily the best one I've seen so far on OAuth and GWT. He also has a very good picture depicting the authentication flow, which I always find helpful. However, as Matt himself says, the solution is not 100% reliable, but it might still get you part of the way.
With this in mind, it might be better to just go with a pure JavaScript implementation of it. You'll find one such implementation right here. This SO thread might come in handy if you choose that path.
Best of luck to you.
What do you mean in your GWT application?
Do you mean client-side only?
Because on the server you can easily use the Scribe OAuth library.
It has good documentation and is fairly simple to use.
For integrating OAuth and GWT, you should start with Scribe which handles the implementation of the OAuth:
https://github.com/fernandezpablo85/scribe-java
Next, you need to create a GWT widget that can handle the user's interactions to acquire permission to access their account. Then grab the response token, and make the API requests to the external site.
No point re-implementing OAuth when scribe already does it for you - you just need to. I'd probably aim to use a GWT Popup for doing the authentication:
http://gwt.google.com/samples/Showcase/Showcase.html#!CwBasicPopup

How to extract data from Java web application?

I need to extract data from a Java web application. To be specific I am looking to extract real time stock data from yahoo market tracker. Can anyone please suggest any method?
I'm not sure you can extract the data from Yahoo Market Tracker. Even if you can, you might not be allowed to - I can't see any obvious terms & conditions/licensing. I think (although I could be wrong, anyone got better info?) that you'll need to pay to get access to an API providing near realtime market data.
There is an HTTP-based Yahoo Stock Quote API you could use to get prices, described here. It is very simple and returns a comma-separated list of attributes for one or more stock symbols, for example:
http://finance.yahoo.com/d/quotes.csv?s=MSFT&f=snd1l1yr
It might not be realtime enough, but it might be the best you can do for free.
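Once you have a response line, splitting it is straightforward. Below is a minimal Java sketch of a splitter that copes with quoted fields containing commas. The sample line is invented, and I am assuming the f=snd1l1yr codes stand for symbol, name, last trade date, last trade price, dividend yield and P/E ratio, which is worth verifying against the API description. Escaped quotes inside fields are not handled.

```java
import java.util.ArrayList;
import java.util.List;

class QuoteCsvSketch {
    // Splits one CSV line into fields; commas inside double quotes do not split.
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes; // quote characters delimit a field and are dropped
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString()); // last field has no trailing comma
        return fields;
    }

    public static void main(String[] args) {
        // Invented sample line, not real market data.
        String sample = "\"MSFT\",\"Example Corp, Inc.\",\"6/1/2010\",25.89,2.01,13.5";
        List<String> fields = splitCsvLine(sample);
        System.out.println("symbol=" + fields.get(0) + " lastPrice=" + fields.get(3));
    }
}
```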
You can use the glorious HTTP protocol to do that. Use any language you are comfortable with (Java, C#, VB.NET, Python, Ruby, PHP) and crawl the website you are trying to get information from.
I need to extract data from a Java web application
From your standpoint, the fact that it is a Java Webapp or a PHP-one or static html pages doesn't change anything. It is not because Java is backing the webapp that suddenly you get a "Java-way" to extract the info.
Now in some cases there are APIs provided allowing you to interact with the data present on the website: but once again the fact that the Webapp is a Java one or not bears no importance.
