How to google programmatically? - java

I want to query google programmatically in Java, to get texts for relation extraction purposes.
For example, I want to write in Java:
result_list=googleAgent.search("Berlin Germany");
In result_list, I can get a list of sentences which contain "Berlin" and "Germany". Then I can do NLP analysis and extract the relation.
Can I do it at all? And how if so?

Google prohibits programmatic searches directly through their website (that's why they have a search API). If you insist on trying to do this, Google will eventually pop up a captcha that your client will have to solve. So now you'll be trying to do NLP while you're doing OCR ;)
However, their search API isn't that great. You're limited to 100 queries per day and to a restricted amount of information per result.

You can use Google's Custom Search API.
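As a rough sketch of what a Custom Search request looks like from Java, the helper below only builds the request URL for the Custom Search JSON API endpoint. The class name `CustomSearchUrl` is made up for illustration, and the API key and search engine ID (`cx`) are placeholders you would have to obtain from Google yourself.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds request URLs for the Custom Search JSON API. */
final class CustomSearchUrl {
    private static final String ENDPOINT = "https://www.googleapis.com/customsearch/v1";

    /**
     * @param apiKey   your API key (placeholder in this sketch)
     * @param engineId ID of your custom search engine, sent as the "cx" parameter
     * @param query    search terms, e.g. "Berlin Germany"
     */
    static String build(String apiKey, String engineId, String query) {
        return ENDPOINT
                + "?key=" + encode(apiKey)
                + "&cx=" + encode(engineId)
                + "&q=" + encode(query);
    }

    // Form-encodes a value so spaces and special characters are safe in a query string.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```

Calling `CustomSearchUrl.build("KEY", "CX", "Berlin Germany")` gives a URL you can fetch with any HTTP client. The response is JSON whose `items` entries carry a `title`, `link` and short `snippet` per result, so for relation extraction you would probably still need to download the linked pages to get full sentences.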

Related

Retrieve Stations from Shoutcast.com

I am making a radio app in Java, and I want to get as many URLs as possible of streaming radio stations.
Is there a way to retrieve everything that Shoutcast.com offers? I found this page here, but I can't figure out what to do next. It requests a Dev ID. I am searching the forums... the night is getting closer and closer... I need some help.
Also:
Is there some way I can find ready lists online with available URLs of streaming stations?
The Dev ID is almost certainly the Developer API Key that Shoutcast issues to authorized developers. It looks like you can request an API key here.
And in terms of listing stations that are already playing, the page you linked contains a number of different ways to query that information directly through the API. That's probably the best official source of stations you're going to get. Of course, that means that you do need to acquire your API key first, but once you have that, it's all there in the wiki.

Google Drive SDK - Sorting results when querying large number of documents

Until a few weeks ago, when using Drive SDK full-text querying, results were returned sorted by "last modified" by default. Now results are returned sorted by relevance, which has a large effect on an app that queries a very large number of files (so client-side sorting is not an option).
I can't seem to find any documentation on sort parameters, so could anyone help me with this? Are there sorting options not stated in the docs? Is there a workaround? (Before the Drive SDK, I used the Document List API, which has documented sorting options.)
Also, I've noticed that developers have been requesting this since mid-2012, so what can we do? Please, Google, tell us if we should give up using the Drive SDK and switch to another platform/API.
You are correct: the older Google Documents List API v2 allowed ordering as one of its default query parameters.
The current Drive REST API (which is the same as the Google Drive SDK) files.list does not offer this option among its standard query parameters.
You will need to sort the results in your code after you get them back from the API.
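A minimal sketch of that client-side sort, assuming all results have already been fetched and fit in memory (which, as the question notes, may not hold for very large result sets). `FileEntry` is a hypothetical holder for the title and modification time; it is not a Drive SDK class.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class DriveResultSorter {
    /** Hypothetical holder for the two fields this sketch needs. */
    static final class FileEntry {
        final String title;
        final Instant modified;

        FileEntry(String title, Instant modified) {
            this.title = title;
            this.modified = modified;
        }
    }

    /** Returns a new list ordered by modification time, newest first; the input is not changed. */
    static List<FileEntry> newestFirst(List<FileEntry> entries) {
        List<FileEntry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((FileEntry e) -> e.modified).reversed());
        return sorted;
    }
}
```

You would fill the list from each page of query results and call `newestFirst` once all pages have been collected.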

What are the steps to make a word search for a website?

I want to write a word search that connects to a specific (huge) website, takes a word from the user, searches the site, and returns the strings that contain the word. It should be written in Java, as an applet. I have read some tutorials and questions on this, and my understanding is that the steps are:
1. Connect to the website, fetch its content, and save it to a string. (This would be done with a web crawler built from my own code for connecting to the website and saving the content to a string, plus the jsoup library to parse the HTML.)
2. Save the data to a database (in my case a NoSQL database).
3. Index the data in the database.
4. Query the database to show the results.
5. Build a UI for showing the search results (I use Swing's JApplet).
Now my questions are:
1. Have I understood the required steps correctly? (Please explain in detail whether each step is necessary or not.)
2. Is it necessary to have a database?
Note: I want to implement it myself, without using ready-made tools such as Lucene, Nutch, Solr, ...
Edit: three people told me an applet is not suitable for such a thing, so what should the replacement be?
Many thanks for your help.
You should look at using Lucene, as it does most of what you want here.
You should not use applets.
For a small data set, a database should be sufficient. Databases like MySQL come with full-text search functions.
For a bigger data set, you might want to consider Lucene or Solr.
That is one way to implement this. Another (simpler) way would be to use an existing text search / indexing engine like Lucene / Solr. Going to the effort of reimplementing the "text search / indexing" wheel using database technology strikes me as a waste of effort, unless you have a sound technical reason for doing so.
You do need to have some kind of database, because indexing a website on the fly would simply not work. Lucene will handle that.
I think your choice of Java applets to build the UI is a bad idea. There are other technologies that give results that are as good or better ... without the security risk of a Java browser plugin.
Finally, another way to make your website searchable is to get Google to do it for you. Make your website content indexable, and then use Google's search APIs.
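For anyone who still wants to hand-roll the "index" and "query" steps from the question, here is a minimal in-memory inverted index as a sketch. It assumes the page text has already been crawled and stripped of HTML, keeps everything in memory instead of a database, and does only exact single-word lookups. The class name `TinyIndex` is made up for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TinyIndex {
    // Maps each lower-cased word to the URLs of the pages that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Splits the page text into lower-case words and records the page under each word. */
    void addPage(String url, String text) {
        // Anything that is not a letter or a decimal digit separates words.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!word.isEmpty()) {
                postings.computeIfAbsent(word, k -> new TreeSet<>()).add(url);
            }
        }
    }

    /** Returns the pages containing the word (case-insensitive), or an empty set if none do. */
    Set<String> search(String word) {
        Set<String> hits = postings.get(word.toLowerCase(Locale.ROOT));
        return hits == null ? Collections.emptySet() : Collections.unmodifiableSet(hits);
    }
}
```

After `index.addPage("a.html", "Java applets are deprecated.")`, a call to `index.search("java")` returns the set containing `a.html`. Persisting the map, ranking results and handling phrases are exactly the parts that Lucene or a database would take over.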

Searching with Java using Google

I am trying to use Java to search for a String on Google. I heard about a Google API but I wasn't able to find anything useful. It should look something like this:
I have a text file. Every line is a string that should be googled. If the first search result is from a specific site (for example: stackoverflow.com/**), the full link should be written to a new text file. Any ideas how to implement that?
Thanks.
You can search on Google with its Custom Search API. There's a Java Client Library for the Custom Search API available to simplify the work. Warning: "Usage is free for all users, up to 100 queries per day."
Google offers a RESTful service for running custom queries programmatically, called the Google Custom Search API.
It is free only up to a point, though: you can submit up to 100 queries per day.
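The "is the first result from a specific site" check from the question can be sketched independently of the search call itself. The helper below is a hypothetical example that assumes you already have the first result's link as a string (for instance from the API response); reading the input file line by line and writing the matches to the output file are left out.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

final class SiteFilter {
    /** Returns true if the link's host is the given site or one of its subdomains. */
    static boolean isFromSite(String link, String site) {
        try {
            String host = new URI(link).getHost();
            if (host == null) {
                return false; // relative or otherwise host-less link
            }
            host = host.toLowerCase(Locale.ROOT);
            String wanted = site.toLowerCase(Locale.ROOT);
            return host.equals(wanted) || host.endsWith("." + wanted);
        } catch (URISyntaxException e) {
            return false; // not a parseable link at all
        }
    }
}
```

Comparing the parsed host rather than searching the whole string avoids false matches on links that merely mention the site name in their path or query.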

Google page index - java

Does anyone have an idea how to get the Google page index in Java?
I have been googling for the last 2-3 days without success. Can anyone point me to an API for that, or give some suggestions on how to do it?
Many thanks in advance.
For example, if we search for facebook in Google, we get around 22,980,000,000 results. I want to fetch this number using Java.
Make a corresponding HTTP request from Java to Google, then parse the returned HTML. There is a div with the ID resultStats. This div contains the number of results.
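A sketch of just the parsing step, assuming the HTML has already been downloaded into a string and that the page still contains a `div` with the ID `resultStats` as described above. The markup may change at any time, and automated requests to the search page may be blocked, so treat this as illustration only. `ResultStats` is a made-up class name.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ResultStats {
    // Captures the content of <div id="resultStats">...</div> (markup assumed, see above).
    private static final Pattern STATS =
            Pattern.compile("<div[^>]*\\bid=\"resultStats\"[^>]*>(.*?)</div>", Pattern.DOTALL);
    // First run of digits, allowing ',' or '.' as thousands separators.
    private static final Pattern NUMBER = Pattern.compile("\\d[\\d,.]*");

    /** Returns the result count found in the resultStats div, or empty if it cannot be read. */
    static OptionalLong parseCount(String html) {
        Matcher div = STATS.matcher(html);
        if (!div.find()) {
            return OptionalLong.empty();
        }
        Matcher num = NUMBER.matcher(div.group(1));
        if (!num.find()) {
            return OptionalLong.empty();
        }
        try {
            return OptionalLong.of(Long.parseLong(num.group().replaceAll("[,.]", "")));
        } catch (NumberFormatException e) {
            return OptionalLong.empty(); // digit run too long for a long
        }
    }
}
```

A real HTML parser such as jsoup would be more robust than regular expressions here; the regex keeps the sketch dependency-free.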
I'm not sure what your real requirement is: what kind of index do you want? Google exposes a fair number of APIs via RESTful services, and some of them are packaged with a JavaScript library, like the Google Maps API. There is also a Java client library for OAuth authentication.
Information about the Custom Search API can be found at http://code.google.com/apis/customsearch/v1/overview.html. A comprehensive list of Google APIs can be accessed at https://code.google.com/apis/console
