I am working on my Master's project and I am looking for a substantial amount of financial data about a particular company.
Example: let's say "Apple". I want the historical prices, the current market price and ratios, quarterly results, and analyst calls.
I saw a couple of posts on Stack Overflow about YQL. I think I can get the current price and various ratios from Yahoo Finance for free. However, for other data there are companies like Thomson Reuters, Bloomberg, etc., but they seem to have closed systems.
Where can I get an API to fetch this kind of data? Is there anything that will help me get it? I am fine with raw data in any format, whatever I can get. Could you please suggest an API?
A Java library under development is IdylFin, which has convenience methods to download historical data.
Disclaimer: I am the author of this library.
Stephen is right on the money: if you really want a real wealth of data, you're probably going to have to pay for it.
However, I've been successful in my own private projects using the "API" spelled out here:
http://www.gummy-stuff.org/Yahoo-data.htm
I've pulled down all the stocks in the S&P 500 quite often, but if you ever publish that data, talk with Yahoo; you'll probably have to license it.
By the way, all this data is in CSV format, so get a CSV reader/converter etc.; they're easy to find.
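For illustration, here is a minimal Java sketch of that kind of CSV download. The exact URL and parameters are an assumption based on the page above (Yahoo has changed and retired these endpoints over the years), so treat it as a template rather than a guaranteed working endpoint:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class YahooHistoricalDownload {
        public static void main(String[] args) throws Exception {
            // Historical-quote endpoint of the kind described on the gummy-stuff page;
            // s = ticker, a/b/c = start month/day/year, d/e/f = end month/day/year, g=d for daily rows.
            // NOTE: the URL and parameter names are assumptions and may no longer be served by Yahoo.
            String url = "http://ichart.finance.yahoo.com/table.csv"
                       + "?s=AAPL&a=0&b=1&c=2010&d=11&e=31&f=2011&g=d";
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // Each row looks like: Date,Open,High,Low,Close,Volume,Adj Close
                    System.out.println(line);
                }
            }
        }
    }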
This is the Yahoo Finance historical data page for "Apple":
http://in.finance.yahoo.com/q/hp?s=AAPL
There is a link at the bottom to download the data. Maybe this will help.
I will suggest a couple of APIs that have financial data that is sometimes hard to find (e.g. quarterly results, analyst calls):
1) http://www.zacksdata.com/zacks-data-api
2) http://www.mergent.com/servius
Both have free trials available.
(Disclosure: My company manages both of these APIs)
A Java example that fetches data from Yahoo Finance is given here: Obba Tutorial: Using a Java class which fetches stock quotes from finance.yahoo.com.
I have tackled this problem in the past.
For price history data, I used Yahoo's API. When I say API, I mean I was making an HTTP GET request for a CSV file of price history data. Unfortunately, that only gets you data for one company at a time, for a time span you specify. So I first made a list of all the ticker symbols and iterated over that, calling Yahoo's API for each. You might be able to find a website that lists ticker symbols too, and just periodically download that list.
Do this too often and too fast, and their website just might block you. I added some code to limit how frequently I made HTTP requests. I also persisted my data so I would not have to fetch it again. I would always persist the raw/unprocessed form of the data; your code could change in ways that make anything else tough to use. Avro/Thrift might be an exception, since those support schema evolution.
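A rough Java sketch of that loop might look like the following. The ticker list, URL pattern, output layout, and one-second delay are all illustrative assumptions:

    import java.io.InputStream;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;
    import java.util.List;

    public class PriceHistoryFetcher {
        public static void main(String[] args) throws Exception {
            // Hypothetical ticker list; in practice, read it from a downloaded symbol file.
            List<String> tickers = List.of("AAPL", "MSFT", "GOOG");
            for (String ticker : tickers) {
                Path out = Paths.get("raw", ticker + ".csv");
                if (Files.exists(out)) {
                    continue; // already persisted: don't hit the server again
                }
                Files.createDirectories(out.getParent());
                // Assumed URL pattern (see the earlier example); adjust to whatever source you use.
                URL url = new URL("http://ichart.finance.yahoo.com/table.csv?s=" + ticker);
                try (InputStream in = url.openStream()) {
                    // Persist the raw, unprocessed response so it can be re-parsed later.
                    Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                }
                Thread.sleep(1000); // crude throttle so the site doesn't block you
            }
        }
    }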
For other kinds of data, you may not have any API that gives you nice CSV files. I had to cope with that problem many times. Here is my advice.
Sometimes a website calls a RESTful web service behind the scenes; you can discover that by using Firebug. Sometimes it will also require certain headers, which you can also discover using Firebug.
If you are forced to work with HTML, there are several Java libraries that can help you. Apache HttpClient is a library you can use to easily make HTTP requests and handle their responses. Google has an HTTP client library too, which is probably worth investigating.
The jsoup API is excellent at parsing HTML data, even when it is poorly formatted and not XHTML. It works with XML too. Instead of hand-traversing or visiting nodes in the jsoup hierarchy, learn a selector syntax (jsoup's CSS-style selectors, or XPath) and use it to select what you want. The website may periodically change the format of its web pages; that should be easy to cope with and fix if you're using jsoup, and tough to cope with otherwise.
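As a small illustration, here is a jsoup sketch; the page URL and the selectors are hypothetical placeholders, not a real site's markup:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class QuoteScraper {
        public static void main(String[] args) throws Exception {
            // Hypothetical page and selectors -- adjust both to the site you are actually scraping.
            Document doc = Jsoup.connect("http://example.com/quote/AAPL")
                                .userAgent("Mozilla/5.0")
                                .get();
            // jsoup selects with CSS-style selectors; newer versions also offer selectXpath().
            for (Element row : doc.select("table.prices tr")) {
                String date  = row.select("td.date").text();
                String close = row.select("td.close").text();
                System.out.println(date + " -> " + close);
            }
        }
    }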
If you have to work with JSON, use the Jackson library to parse it.
If you have to work with CSV, use the OpenCSV library to parse and handle it.
Also, always store the data in its raw form, and avoid making unnecessary HTTP requests so you don't get blocked. I have been blocked by Google Finance a couple of times; they can do it. Fortunately the block does expire. You might even want to add a random wait period between requests.
Have you tried the Google Finance API? (Please google it ;)) I am using it to track my portfolio. Could you try http://code.google.com/apis/finance/docs/finance-gadgets.html? There is an example of a custom widget, and it might tell you whether you are barking up the right tree.
You are really asking about a free financial data service ... rather than an API.
The problem is that the data is a valuable commodity. It probably has cost the providers a lot of money to set up their systems, and it costs them even more money to keep those systems running. Naturally, they want a return on their investment, and they do this (in part) by selling their data / services.
(In the case of Yahoo, Google, etc., the data is bought from someone else, and Yahoo/Google will be subject to restrictions on how they can use it. Those restrictions will be reflected in the respective ToS; e.g. you are only allowed to access the services "for personal use".)
I think your best bet would be to approach a number of the financial data providers, and ask if they can provide you with free access (subject to whatever restrictions they might want to impose) to their data services. You could get lucky ...
Good data is not free. It's as simple as that. The reason is that all data is ultimately licensed from an exchange like NYSE or NASDAQ.
If you can get some money, high-resolution historical data is available from Automated Trader.
You should also talk to the business school at your university. If they have finance master's/PhD students or a master's in financial engineering, they should have large repositories of high-resolution data for their students.
If you make your question more detailed I can provide a more detailed answer.
This is something that I kick myself for at least once a week. Way back when the internet consisted of Gopher and all that, you were able to log into FTP servers at NASDAQ and the NYSE and download all kinds of stock history files for free. I had done it, even imported it into a database and did some stuff with it... but that was probably 10 computers ago; it's LONG gone now.
I am working on IBM RTC and I need to import a .csv file to RTC using Java. Is there a way to do this? If yes, can someone help me with it?
Parsing CSV data is something that you definitely do not want to implement yourself; there are plenty of libraries for that (see here).
RTC offers a wide range of APIs that can be used for this; see:
rsjazz.wordpress.com or
jazz.net
In that sense: you can write Java code that reads CSV data, and RTC has a rich API that allows you push "content" into the system.
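For the CSV-reading half, here is a minimal sketch using the OpenCSV library; the file name and column layout are made up for illustration, and the RTC call itself is left as a placeholder (see the links above for that API):

    import com.opencsv.CSVReader;
    import java.io.FileReader;

    public class CsvImport {
        public static void main(String[] args) throws Exception {
            // workitems.csv is a hypothetical export; adjust the path and column layout to yours.
            try (CSVReader reader = new CSVReader(new FileReader("workitems.csv"))) {
                String[] header = reader.readNext(); // read (and skip) the header row
                String[] row;
                while ((row = reader.readNext()) != null) {
                    String summary = row[0];
                    String owner = row[1];
                    // Here you would call the RTC client API (see the links above)
                    // to create or update a work item from this row.
                    System.out.println("Would import: " + summary + " / " + owner);
                }
            }
        }
    }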
But a word of warning: I used that Java API some years ago to manipulate information within our RTC instance. That was a very painful experience. I found the APIs to be badly documented and extremely hard to use. It took me several days to arrive at working code that would make just a few small updates to our stories/tasks.
Maybe things have improved since then, but be prepared for, as I said, a painful experience.
EDIT, regarding your comment on "other options":
Well, I don't see any: you want to push data you have in CSV into your RTC instance. So, if you still want to do that, you have to use the means that are available to you! And don't let my words discourage you. A) It was some time back when I did my programming with RTC, so maybe their APIs are better structured and more intuitive today. B) There is some documentation out there (for example here). And I think everybody can register at jazz.net, so when you have further, specific questions, you might find "better" answers there!
All I wanted to say was: I know that other products such as Jenkins or SonarQube have great APIs, and working with them is all nice, easy, and fun. You can get things working with RTC, too; it's just that the path there may not be as nice and easy.
My personal recommendation: start with the RTC part first. Meaning: just try to write a small program that authenticates against the server and then pushes some example data into the system. If that works nicely for you, then spend the time on pulling/transforming the real data that you have in mind!
I would like to access some data from web pages that are arranged like a catalog/shop, from an Android app.
For a concrete example, this is the URL for Amazon's listing of Mark Twain's books:
http://www.amazon.com/s/ref=nb_sb_noss/180-5768314-5501168?url=search-alias%3Daps&field-keywords=mark+tain&x=0&y=0#/ref=nb_sb_noss_1?url=search-alias%3Daps&field-keywords=mark+twain&rh=i%3Aaps%2Ck%3Amark+twain
1) If I have the above URL, how do I obtain, e.g.,
the number of entries, and
for each entry the line with the title (and maybe the image)? This probably includes how to iterate through all the follow-up pages and access each entry.
What is the best (correct + compatible + efficient) way to do this?
I got the impression that jQuery might be of use. But so far my knowledge of HTML and JavaScript is just about basic.
2) How do I query for the URL for all of Mark Twain's books?
3) Any suggested readings on this and similar kinds of topics?
Thanks for your time and have a good day!
Thomas
You would be very well advised not to "screen scrape" other websites. Besides being difficult to maintain (as the website changes, etc.), this will actually be against the terms of use/service (ToS) for many websites.
Instead, see if the desired web sites offer a web service that you can use. These will return data in a much more consumable format, such as JSON or XML. You'll usually also get your own developer key (to track requests against), as well as other possible features that you wouldn't get if going directly against the HTML.
Amazon, in particular, certainly offers this. See https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html for details. (Don't be confused by the naming of "advertising".)
I need to extract data from a Java web application. To be specific, I am looking to extract real-time stock data from the Yahoo market tracker. Can anyone please suggest a method?
I'm not sure you can extract the data from Yahoo Market Tracker. Even if you can, you might not be allowed to - I can't see any obvious terms & conditions/licensing. I think (although I could be wrong, anyone got better info?) that you'll need to pay to get access to an API providing near realtime market data.
There is an HTTP-based Yahoo Stock Quote API you could use to get prices, described here. Very simple: it returns a comma-separated list of attributes for one or more stock symbols, for example:
http://finance.yahoo.com/d/quotes.csv?s=MSFT&f=snd1l1yr
It might not be realtime enough, but it might be the best you can do for free.
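A minimal Java sketch of calling that URL and splitting the result might look like this; the meaning of the f= format codes is only roughly annotated, and the naive comma split ignores quoted fields:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;

    public class YahooQuote {
        public static void main(String[] args) throws Exception {
            // URL taken from the answer above; f=snd1l1yr asks (roughly) for
            // symbol, name, last trade date, last trade price, dividend yield and P/E ratio.
            URL url = new URL("http://finance.yahoo.com/d/quotes.csv?s=MSFT&f=snd1l1yr");
            try (BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    // One comma-separated line per requested symbol; string fields come back quoted.
                    String[] fields = line.split(",");
                    String symbol = fields[0].replace("\"", "");
                    String lastTrade = fields[3];
                    System.out.println("Last trade for " + symbol + ": " + lastTrade);
                }
            }
        }
    }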
You can use the glorious HTTP protocol to do that. Use any language you are comfortable with (Java, C#, VB.NET, Python, Ruby, PHP) and crawl the website you are trying to get information from.
I need to extract data from a Java web application
From your standpoint, the fact that it is a Java webapp, a PHP one, or static HTML pages doesn't change anything. The fact that Java is backing the webapp does not suddenly give you a "Java way" to extract the info.
Now, in some cases there are APIs provided that allow you to interact with the data present on the website: but once again, whether or not the webapp is a Java one bears no importance.
I have used MS Money for several years now, and due to my "coding interest" it would be great to know where to start learning the basics for programming such an application. Better said: it's not about how to design and write an application, it's about the "bank details". (Just displaying the balance of a certain bank account would be a pleasant aim for me to begin with.)
I would like to do it in C++ or Java, since I'm used to these languages.
Will it be "too big" for a hobby project? I do not know much about all the security issues, the bank server interfaces/techniques, etc.
First of all, if the answer is "no", I need a reliable source for learning.
Most of the apps I've worked with read in a file exported from the bank's website, which is relatively straightforward.
If that's the road you're looking to go down, you'll need to write code to:
Log in to the bank's website and download the file via HTTPS (see the sketch after this list)
Either get specs for the file format or reverse-engineer it
Apply whatever business rules you choose to the resulting data
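Here is a minimal Java sketch of the download step only. The URL is hypothetical, and a real bank will require an authentication step first (typically a form login that sets session cookies), which is omitted:

    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.nio.file.StandardCopyOption;

    public class StatementDownload {
        public static void main(String[] args) throws Exception {
            // Hypothetical export URL -- replace with whatever your bank's export link actually is.
            URL url = new URL("https://bank.example.com/export/statement.csv");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // Session cookie obtained from the (omitted) login step; placeholder value only.
            conn.setRequestProperty("Cookie", "SESSION=placeholder");
            try (InputStream in = conn.getInputStream()) {
                // Keep the raw export; parse it (CSV, OFX, QIF, ...) in a separate step.
                Files.copy(in, Paths.get("statement.csv"), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }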
The first thing to remember: trying to programmatically interact with a banking website without express written permission from the bank will MOST LIKELY be a violation of the website use agreement, and may land you in more trouble than it's worth.
Second, you DON'T want to start 'learning' programming by trying to tackle something that massive and sensitive. Not that there is anything wrong with the eventual goal, but that's a journey of a thousand leagues and you need to take your first step.
I would say start with a simple programming environment, like Python or Perl. The reason: you don't have to worry about linking, libraries, code generation, etc. Get used to the basics of what you want to achieve functionally; then reimplementing that in C++ or Java would be the next step.
To begin with, focus on learning client-server programming.
Write a client, write a server, learn all about sockets, learn all about TCP programming,
then learn about the Secure Sockets Layer (SSL) and Transport Layer Security (TLS).
Once you've done this, try switching to C++ or Java and see if you can repeat the exercise.
There are TONS of tutorials on these topics.
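To make that concrete, here is a minimal plain-TCP echo example (in Java, since that is one of the target languages above). The port number is arbitrary, and the SSL/TLS follow-up step would swap in SSLServerSocket/SSLSocket from javax.net.ssl:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.ServerSocket;
    import java.net.Socket;

    public class EchoDemo {
        public static void main(String[] args) throws Exception {
            // Tiny echo server on an arbitrary local port, plus a client that talks to it.
            try (ServerSocket server = new ServerSocket(12345)) {
                Thread serverThread = new Thread(() -> {
                    try (Socket s = server.accept();
                         BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
                         PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                        out.println("echo: " + in.readLine()); // echo one line back to the client
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
                serverThread.start();

                try (Socket client = new Socket("localhost", 12345);
                     PrintWriter out = new PrintWriter(client.getOutputStream(), true);
                     BufferedReader in = new BufferedReader(new InputStreamReader(client.getInputStream()))) {
                    out.println("hello");
                    System.out.println(in.readLine()); // prints "echo: hello"
                }
                serverThread.join();
            }
        }
    }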
Once you have become used to that, learn what tools and libraries are already available to do most common things. For example libcurl is great for creating common internet application protocol clients (HTTP, HTTPS, FTP and the like).
See if you can create an interactive program that you can "log in to" using your web browser, which outputs data in XML and formats it using Cascading Style Sheets.
This should lead you into the JavaScript world, where there are powerful tools such as jQuery. If you mix and match these correctly, you will find that development can be a LOT of fun and quite rapid.
:-)
Happy journeying.
I think it's quite a reasonable hobby project; start with a simple ledger and then you can add features.
A few things I would do to begin such a project:
Decide on an initial feature set. A good start might be just one of the ledgers/accounts - basically balancing a checkbook. Make this general enough that you can have several.
Design a data model. What fields will your ledger have? What restrictions on the values of each? (A minimal sketch follows after this list.)
Choose technologies. What language do you want to program in? How will you persist the data? What GUI do you want - a fat client like MS Money or a web app?
From there, write up some design notes if warranted and start coding!
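As one possible (entirely hypothetical) starting point for that data model, here is a tiny checkbook-style ledger in Java; the field names and the sign convention for amounts are just illustrative choices:

    import java.math.BigDecimal;
    import java.time.LocalDate;
    import java.util.ArrayList;
    import java.util.List;

    // A hypothetical starting point for the "balancing a checkbook" ledger described above.
    public class Ledger {
        public record Entry(LocalDate date, String payee, String memo, BigDecimal amount) {}

        private final String accountName;
        private final List<Entry> entries = new ArrayList<>();

        public Ledger(String accountName) {
            this.accountName = accountName;
        }

        public void add(Entry e) {
            entries.add(e);
        }

        // Running balance: deposits are positive amounts, withdrawals negative.
        public BigDecimal balance() {
            return entries.stream()
                          .map(Entry::amount)
                          .reduce(BigDecimal.ZERO, BigDecimal::add);
        }

        public static void main(String[] args) {
            Ledger checking = new Ledger("Checking");
            checking.add(new Entry(LocalDate.of(2012, 1, 3), "Employer", "salary", new BigDecimal("2500.00")));
            checking.add(new Entry(LocalDate.of(2012, 1, 5), "Grocer", "food", new BigDecimal("-84.20")));
            System.out.println(checking.balance()); // prints 2415.80
        }
    }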
You might look into OFX4J, an implementation of the Open Financial Exchange specification, mentioned here and in a comment by @nicerobot.
Are you looking for something mint.com-ish? From my understanding of their security policy, this is how they do it: you give them your online account credentials, which they pass immediately to the bank, and they get back a "read-only" account login. They then throw away the credentials you provided and use the "read-only" credentials to update your metrics every 24 hours. I don't know how they do this or if they have a special relationship with the banks, but it is possible.
I don't think many (if any) banks provide APIs.
Online budget apps in Sweden seem to rely either on exporting transactions in some Excel format, or simply have you "mark all transactions in the bank system, Ctrl-C, then Ctrl-V into a textbox", which is then parsed.
I need an open-source, Java-based web crawler which I can extend for price comparison.
How do I do the price comparison?
Is there any open source code for that?
Take a look at Web-Harvest. You will have to use its slightly odd and peculiar syntax for processing web pages, but it should be fairly easy to extend it to do some price comparison:
http://web-harvest.sourceforge.net/samples.php?num=2
Building something that scrapes price information from a large number of different sites is going to be a lot of work, whether you scrape from the stores themselves or from existing comparison sites.
Everyone's website layout will be different, requiring you to configure your crawler separately for each one.
Some websites may present the price information in ways that make scraping difficult; e.g. using AJAX.
Some website owners will put the relevant pages into their robots.txt files to tell you to stay away. And if you ignore that, there are various things they can do to make life difficult for you.
Scraping lots of people's websites without permission is likely to make you unpopular. It might attract threats of lawsuits, or actual lawsuits from people who perceive that you are harming their business model. Or other responses ...
Are you really sure you want to do this? Really??
Any reason you can't just get your data from one of the hundreds of price comparison sites already out there? It seems like it would be simpler to scrape NexTag or Froogle or whatever instead of writing a crawler to scrape billions of store websites.
Nobody wants their site to get overloaded without getting any benefit. I think you should create a crawler for your needs. However, be aware that many sites may block you or slow down your responses; you need to behave as if you are not a crawler eating their bandwidth...
Someone here wrote about the legal issues. The legal issues are not simple. Stephen C wrote about lawsuits, but that goes both ways. There is a large body of law related to anti-competitive conduct. If someone wants their prices not to be reported because they are involved in price-fixing or making false claims, then those websites themselves face severe penalties. The law is not something to quote trivially. You can google price fixing and see the large fines already imposed on countless companies.