Holidays - is there a java implementation? - java

I'd like to know if there is a jar-file out there that could do the following:
DateMidnight dateInQuestion = new DateMidnight(2000, 12, 12);
DateChecker.isNationalHoliday(dateInQuestion, Locale.ITALY);
If there isn't, why? Surely there are lots of properly based rules for the holidays in 99% of the times.
Right now we're maintaining a table in our database with the countries, plus some custom code for holidays that don't fall on the same date every year. We have to extend that code for every new country in which we get customers.
Could we do this an easier way?
(If there is no such thing in the java sphere, can I port it from some other language?)

I have written the Jollyday API and I'm interested to know what is so 'rough' about it. How can I improve it? Would be great to hear from you. Send me an email to sdiedrichsen#yahoo.de if you like.
By the way, Jollyday is used just the way the questioner requests. Please look for yourself.
Cheers, Sven
P.S.: I found a free webservice from Ulrich Hilger which provides detailed worldwide holiday info. Look at api.daybase.eu

Nothing robust exists in Java as far as I know. That also makes sense, because this kind of information is extremely sensitive to change. Hardcoding it would make your code potentially break on every Java update, which may lead to a lot of maintenance and compatibility trouble. Currently the most that is hardcoded/maintained in Java SE is the time zone data, and even that alone has already led to many bugs.
Rather use a public webservice for that. E.g. http://www.bank-holidays.com
This site informs you of all the days when banks (as well as stock exchange & school holidays in a number of countries) are closed due to religious or public events. Major events (elections, announced strikes, trade fairs, festivals, sports events...) are also listed.
Our FREE SERVICE allows you to view the current calendar year (view only).
For 2 euros/country/year, our PAY SERVICE (click on credit card icon) gives you access to calendar years 2000-2070.
And write a Java wrapper around that. Or look for existing Java APIs which are in turn already backed by a webservice.

I also wrote a Java wrapper around the Ruby Holidays Gem. It's available on GitHub: https://github.com/gdepourtales/holidays. It works similarly to Jollyday (http://sourceforge.net/projects/jollyday/).

I found this interesting:
RESTful service provider - Holiday API
For Node projects - node-holidayapi
I know it is not strictly relevant to the question, which asked for a Java implementation. But if your project is RESTful (like most projects nowadays), then this would give you a place to start. ;)

My searching has brought up two results (in addition to what I listed in the comments). The first, the Holiday Client API, seems to be a dead project. The second, Jollyday, looks like a very rough, but active, work in progress.
As for why there is no good library, I'm with Tom. I suspect that your premise "Surely there are lots of properly based rules for the holidays in 99% of the times" is incorrect.

I think you'll have to do the work yourself.
National holidays are determined by, well, nations. They can change, and are, by definition, not universal, and can thus not be captured by some algorithm.
The only way to keep track of them, is to maintain them on a per-nation basis.
Perhaps someone actually already does that (maybe a webservice of sorts), but I doubt it, to be honest.
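To illustrate what "maintaining them on a per-nation basis" could look like, here is a minimal sketch using java.time. The DateChecker name and the isNationalHoliday signature are borrowed from the question's pseudocode and are hypothetical, and the Italian table is deliberately incomplete and only illustrative. Fixed-date holidays sit in a per-country table, and one movable holiday (Easter Monday) is derived with the well-known anonymous Gregorian Easter computation.

```java
import java.time.LocalDate;
import java.time.MonthDay;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; the names follow the question's pseudocode. */
final class DateChecker {

    // Illustrative and deliberately incomplete fixed-date table, keyed by ISO country code.
    private static final Map<String, Set<MonthDay>> FIXED = Map.of(
        "IT", Set.of(MonthDay.of(1, 1), MonthDay.of(12, 25), MonthDay.of(12, 26)));

    // Countries in this sketch that also observe Easter Monday.
    private static final Set<String> EASTER_MONDAY = Set.of("IT");

    static boolean isNationalHoliday(LocalDate date, Locale locale) {
        String country = locale.getCountry();
        if (FIXED.getOrDefault(country, Set.of()).contains(MonthDay.from(date))) {
            return true;
        }
        return EASTER_MONDAY.contains(country)
            && date.equals(easterSunday(date.getYear()).plusDays(1));
    }

    /** Gregorian Easter Sunday via the anonymous (Meeus/Jones/Butcher) algorithm. */
    static LocalDate easterSunday(int year) {
        int a = year % 19;
        int b = year / 100;
        int c = year % 100;
        int d = b / 4;
        int e = b % 4;
        int f = (b + 8) / 25;
        int g = (b - f + 1) / 3;
        int h = (19 * a + b - d - g + 15) % 30;
        int i = c / 4;
        int k = c % 4;
        int l = (32 + 2 * e + 2 * i - h - k) % 7;
        int m = (a + 11 * h + 22 * l) / 451;
        int n = h + l - 7 * m + 114;
        return LocalDate.of(year, n / 31, n % 31 + 1);
    }
}
```

A call such as DateChecker.isNationalHoliday(LocalDate.of(2000, 12, 12), Locale.ITALY) would then consult only the Italian entries. Every new country still means a new table entry, and one-off changes decided by a government still have to be entered by hand, which is exactly the maintenance burden described above.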

So far I haven't seen any.
What I could suggest is to try the Google Calendar API and fetch a holiday calendar from there through its calendar feed.
Then process the data and, if you want, save it into your database.
After all, as long as you have an active internet connection, you can use Java to connect to the relevant data.
Even for other languages, I don't think that such direct methods are available.

You can use my holiday api, there is also a docker container available to run the whole thing.
https://github.com/nager/Nager.Date
Webservice
Get the public holidays for Italy for the year 2000
https://date.nager.at/api/v2/PublicHolidays/2000/IT
You can find more information about the available API methods here
Java Example
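//https://github.com/FasterXML/jackson-databind/
import com.fasterxml.jackson.databind.ObjectMapper;
import java.net.URL;

ObjectMapper mapper = new ObjectMapper();
// Deserialize the JSON array returned by the service into PublicHoliday objects
PublicHoliday[] holidays = mapper.readValue(new URL("https://date.nager.at/api/v2/PublicHolidays/2000/IT"), PublicHoliday[].class);
PublicHoliday.class
public class PublicHoliday
{
// Field names match the lowercase JSON property names so that Jackson can bind them by default
public String date;
public String localName;
public String name;
public String countryCode;
public Boolean fixed;
public Boolean global;
public String[] counties;
public String type;
}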
Example JSON data retrieved
[
{
"date": "2000-01-01",
"localName": "Capodanno",
"name": "New Year's Day",
"countryCode": "IT",
"fixed": true,
"global": true,
"counties": null,
"type": "Public"
},
...
]

Related

need advice on mysql data base storing information

I'm using Java EE (JDBC, MVC, DAO) and MySQL.
This is my own project, so the whole architecture design is my responsibility.
I have a system called "Facultative" with an entity Facultative that stores information about a course: its lecturer, start date and duration.
Right now it also stores a "Status" field: Wait (not started), Started and Ended.
And this is where I have a problem: how should this information be updated?
Of course, it is possible to give this job to the admin, but that seems too easy and not efficient.
My idea is not to store the "status" field in the DB, but to compute the status in the model entity (by checking the start date and duration).
I'm using the MVC pattern and am not sure whether it is correct to add such a method to the entity class.
Thank you in advance.
This is really an issue of the "world" you are modeling. Ask yourself this:
Do courses ever fail to start at the scheduled time?
Do you want to explicitly model that?
If the answer to both of those questions is "yes", then you can't treat the status field as derived from (just) the start and end dates (and the current date). And similarly, automatically setting a (non-derived) status field based on the dates is dubious.
On the other hand ... setting the status administratively would be a bad idea too, since it needs to be done at a particular time; i.e. when the lecture actually starts.
But then ... actually modeling this accurately needs to acknowledge that there is a "gap" between the information in your database, and what is actually happening in the real world. It is (probably) impractical to ensure that the database is 100% accurate. So the pragmatic solution is to accept that: make it a "feature" of the system.
If you take the pragmatic view, then making status derived should be good enough. (Change its name to notional_status or something, and change the start and end fields to scheduled_start and scheduled_end or something.)
Storing the start date and end date (or duration) and deriving the status makes the most sense to me.
The main advantage is that the data won't need to be updated as the status transitions from Wait to Started and from Started to Ended; it just takes care of itself as time passes.
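A minimal sketch of the derived-status idea, assuming the entity keeps a scheduled start and a duration. The field and method names here are made up for illustration and are not taken from the question's code.

```java
import java.time.Duration;
import java.time.LocalDateTime;

/** Hypothetical model entity: the status is computed, never stored. */
final class Facultative {

    enum Status { WAIT, STARTED, ENDED }

    private final LocalDateTime scheduledStart;
    private final Duration duration;

    Facultative(LocalDateTime scheduledStart, Duration duration) {
        this.scheduledStart = scheduledStart;
        this.duration = duration;
    }

    /** Derives the notional status from the schedule and the given point in time. */
    Status statusAt(LocalDateTime now) {
        if (now.isBefore(scheduledStart)) {
            return Status.WAIT;
        }
        if (now.isBefore(scheduledStart.plus(duration))) {
            return Status.STARTED;
        }
        return Status.ENDED;
    }
}
```

Passing the current time in as a parameter (rather than calling LocalDateTime.now() inside the method) keeps the entity easy to test. The DAO then only needs to persist the start and the duration.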

Resume Parsing First Step

I have multiple resumes in a format like somebody sends to a company to apply for a job. I need to parse these resumes in Java.
Do I need to convert these resumes to XML first for parsing? Would the example below be a reasonable way to represent a resume in XML?
<Name>Varjhjh</Name>
<Experience>5</Experience>
<Age>7</Age>
.
.
.
Resume parsing isn't a trivial task. I remember implementing one strategy a couple of years ago; the main problem is that everybody structures their CV in their own way.
E.g. one writes Date of Birth, another DOB, the next Birth Date, so you have to use some kind of dictionary for these cases.
Another interesting problem is parsing names, especially if the candidate has a very long name, e.g. Frederick Gerald Hubert Irvim John Kenneth.
Or, for example, a user may list several phone numbers: a landline, a mobile, the phone of a first reference, of a second one, etc.
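The dictionary idea for label variants could be sketched roughly like this. The class name, the canonical field names and the synonym list are all made up for illustration; a real dictionary would be much larger.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lookup that maps the labels people write to one canonical field name. */
final class ResumeLabels {

    // Illustrative synonyms only.
    private static final Map<String, String> SYNONYMS = Map.of(
        "date of birth", "dateOfBirth",
        "dob", "dateOfBirth",
        "birth date", "dateOfBirth",
        "phone", "phone",
        "mobile", "phone",
        "landline", "phone");

    static Optional<String> canonicalField(String rawLabel) {
        String key = rawLabel.trim()
            .replaceAll("[:.\\s]+$", "")   // drop trailing colon, dot or whitespace
            .replaceAll("\\s+", " ")       // collapse inner whitespace
            .toLowerCase(Locale.ROOT);
        return Optional.ofNullable(SYNONYMS.get(key));
    }
}
```

With this, "DOB:", "Date of Birth" and "Birth Date" all resolve to the same field, while an unknown label yields an empty Optional that the caller can log and later add to the dictionary.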
I remember these guys parsed CVs quite well:
www.rchilli.com/
Other Parsing vendors include: Sovren, Daxtra, Burning Glass and Hireability
But I'm not sure whether they offer Java integration, and I'm not sure about their cost.
Anyway, good luck in parsing.
Full disclosure: I work for Sovren, which is a parsing vendor. Resume parsing is not a trivial task. Many companies, including Sovren, HireAbility, Daxtra and Burning Glass, offer installed and SaaS solutions for parsing. The typical workflow is to convert the non-image resume/CV to text and parse it, returning HR-XML, the industry standard.

Implementing twitter and facebook like hashtags

This might look really silly.. and a question with no research, but trust me it is not. I have done some research on it. One of them would be the following link:
http://www.quora.com/Twitter-1/How-does-Twitter-implement-hashtags
Also I am not looking for a complete solution here.. I will do my hard work, but I just need some guidance regarding this, just want to know which way should I approach?
I want to implement twitter and now even facebook like hashtags for my application.. So that users can add messages with hashtags and others can search over them.. like what is trending and what is relevant.
We are using MySQL, MongoDB and Elasticsearch in our storage tech stack. Any ideas how I could start working on this? Would I need another storage system? One way would be to store my hashtags in the DB and then do a text search for them in Elasticsearch.
What can people with more experience in this field suggest here?
A start with MongoDB would be to parse each message for hashtags the user used and put these into a sub-array of the document. Example status update:
Peter
April 29th 2014 12:28:34
Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee
This message would look like this in MongoDB:
{
author: "Peter",
date: ISODate("2014-04-29 12:28:34"),
text: "Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee",
hashtags: [
"tradeshow",
"washington",
"coffee"
]
}
When you then create an index on db.collection.hashtags you can quickly search for all messages which include one of these hashtags. You likely want to order and limit the results by date so the user sees the most recent results first. When you make it a compound index which also includes the date, you can also speed that up.
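The parsing step mentioned at the start could be sketched in Java roughly as follows. The class name is made up, and the pattern (a '#' followed by letters, digits or underscores) is a simplifying assumption, not the exact rule Twitter or Facebook apply.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that pulls the hashtags out of a message text. */
final class HashtagExtractor {

    // Simplified rule: '#' followed by one or more word characters (Unicode-aware).
    private static final Pattern HASHTAG =
        Pattern.compile("#(\\w+)", Pattern.UNICODE_CHARACTER_CLASS);

    /** Returns the distinct hashtags in order of first appearance, lowercased and without '#'. */
    static List<String> extract(String text) {
        Set<String> tags = new LinkedHashSet<>();
        Matcher matcher = HASHTAG.matcher(text);
        while (matcher.find()) {
            tags.add(matcher.group(1).toLowerCase(Locale.ROOT));
        }
        return List.copyOf(tags);
    }
}
```

The returned list is what would go into the hashtags array of the document before it is saved. Lowercasing at write time means a search for #Coffee and #coffee hits the same index entries; note that this simple pattern would also pick up a '#' inside a URL fragment.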
How to implement "trending" topics is quite a complex question. It is also very subjective, depending on what you would consider "trending". The exact algorithms Twitter or Facebook use to determine which topics are trending are not public. According to various social media analysts they also change them frequently, so we can assume that they are quite complex by now.
That means we cannot come up with an algorithm for you. But when you already have an algorithm in mind to calculate the "trendiness" of a hashtag, we could help you to find a good implementation.

Is the Google Calendar API appropriate for my problem?

I'm currently working with a team on a project that will serve as a campus-wide event calendar for my school. We're designing it to be a web application using JSP having a java back end and connected to a relational database located on a server. The database will store events and produce a calendar on the web page based on the events.
Users will also be able to conduct searches and we would like to return a calendar based on the search results (such as activities occurring during a particular a time frame). Potentially we would be creating 100's of calendars at a time to accommodate multiple user requests.
We don't want users to need any special account to use the site (except maybe an account with us). The users will not be editing the events and changing anything but we want a nice GUI interface for them.
Is this a possible task to achieve using the Google Calendar API?
Just to clarify, we will be performing sql queries to construct a list of "events" in a separate section of our application. With this in mind, we do NOT want a calendar that queries our database on its own. We would like a API that allows us to input this list of events, and would output a calendar GUI that provides a user with access to multiple views (daily, weekly, monthly, etc) in an easy-to-use format.
thanks!
It sounds like a decent use of the Google Calendar API to me. After browsing through the API docs for Java, it looks like you can create a calendar, add whatever events to it you wish, and pass a link to that calendar back to the user. In fact, the API page I linked mentions that "you can generate a public calendar for Google Calendar to display, based on your organization's event database". This sounds like exactly what you want to do. Try out some of the sample code there and see if it looks like it will meet your needs.
I totally agree with bta and have an additional idea:
You said:
The database will store events and
produce a calendar on the web page
based on the events.
You would benefit even more from using Google Calendar in this case. You wouldn't need a database to store the events, which has several advantages:
You would eliminate a possible bottleneck, because as you said there would be 100s of calendars generated at the same time.
You could have non-tech-savvy people manage the calendars (I believe Google's interface is pretty simple compared to the backend you would have to develop).
You would eliminate the need for a backend (or at least the part responsible for event CRUD).
You can always "wrap" Google Calendar using its API, so the GUI would be completely up to you.
These are just some of my thoughts, because I believe that simple is better. I hope this will be helpful.
Good luck developing your app!
P.S. If you could, please tell us which method you used and how it's working :)

How to webscrape scholar.google.com in Java?

I want to write a Java func grabTopResults(String f) such that grabTopResults("automata theory") returns me a list of the top 100 cited papers on scholar.google.com for "automata theory".
Does anyone have suggestions for what libraries will make my life easy?
Thanks!
As I'm sure Google can afford the bandwidth, I'll ignore the question of whether this is immoral/illegal/prohibited by Google's T&C.
First thing you need to do is figure out what HTTP request (or requests) you need to issue in order to obtain the page with the data you need. Once you've figured this out, use HttpClient to issue the same request from Java code. The previous link shows example code that explains how to do this.
Once you've downloaded the content of the relevant page, you'll need to use a HTML parser to extract the data you're interested in. The Jericho parser suggested by peperg is a good choice.
If the Google police come knocking, you've never heard of me, OK?
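The "issue the same request from Java code" step could look roughly like the sketch below. It swaps in the JDK's own java.net.http.HttpClient (Java 11+) instead of the HttpClient library linked above, and the class name, the q parameter and the User-Agent value are assumptions for illustration. Which URL and parameters the real page expects is something you have to find out as described, and the site's terms of use still apply.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: builds a search request and downloads the resulting page. */
final class PageFetcher {

    /** Builds a GET request for baseUrl?q=<url-encoded query>. */
    static HttpRequest buildSearchRequest(String baseUrl, String query) {
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "?q=" + encoded))
            .header("User-Agent", "example-client/0.1") // assumed value
            .GET()
            .build();
    }

    /** Sends the request and returns the response body as text. */
    static String fetch(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }
}
```

The string returned by fetch is what you would then hand to the HTML parser to pull out the titles.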
I use http://jericho.htmlparser.net/docs/index.html . Google Scholar doesn't have an API ( http://code.google.com/p/google-ajax-apis/issues/detail?id=109 ). Of course, scraping is not allowed by Google (read the terms of use: automated requests are forbidden).
Below is a bit of example code which gets the titles on the first page using the open source product TestPlan. It is a standalone product, but if you really need it I could help you integrate it into your Java code (it is written in Java itself).
GotoURL http://scholar.google.com/
SubmitForm with
%Params:q% automate theory
end
set %Items% as response //div[@class='gs_r']
foreach %Item% in %Items%
set %Title% as selectIn %Item% h3
Notice %Title%
end
This produces output like the below (my IP is in Germany, thus a German response). Obviously you could format it however you like, or write it to a file; this is just a rough test.
00000000-00 GOTOURL http://scholar.google.com/
00000001-00 SUBMITFORM default
00000002-00 NOTICE [ZITATION] Stochastic complexity in statistical inquiry theory
00000003-00 NOTICE AUTOMATED THEORY FORMATION IN MATHEMATICS1
00000004-00 NOTICE Constraint generation via automated theory formation
00000005-00 NOTICE [BUCH] Automated theorem proving: after 25 years
00000006-00 NOTICE [BUCH] Introduction to the Theory of Computation
00000007-00 NOTICE [ZITATION] Computer-controlled systems: theory and design
00000008-00 NOTICE [BUCH] … , randomness & incompleteness: papers on algorithmic information theory
00000009-00 NOTICE [BUCH] Automatic control systems
00000010-00 NOTICE [BUCH] VLSI physical design automation: theory and practice
00000011-00 NOTICE Singular Control Systems.
