How to make career guidance system intelligent - java

Well, at last I am working on my final-year project, an intelligent web-based career guidance system. The core functionality of my system is:
Recommendation System
Basically, our recommendation system will carefully examine user preferences, using interest tests and the user's academic record, and on the basis of this information it will give the user the best career options, i.e. a course such as BS Computer Science.
The input of the recommendation system will be the student's credentials and an interest test. The questions in the interest test will depend on the user's academic history and on the answers he gives during the test. So the test will not ask everyone the same questions; it will decide in real time what to ask each user, according to rules defined by the system.
Its output will be a list of suggested fields, decided on the basis of the interest test.
Problem
When I was defending my scope in front of the committee, they said "this is simple if-else; this system is not intelligent".
My question is: which AI technique or algorithm could be used to make this system intelligent? I have searched a lot, but the papers related to my system are superficial; they emphasize the idea, not the methodology.
I want to do all my work in Java. It would be great if the answer is technology specific.
You people can transfer my question to any other stackexchange site if it is not related to SO Q&A criteria.
Edit
After getting some ideas from the answers, I want to implement an expert system with a rule base and an inference engine. Now I want to be clearer on the technology for implementing the rule-based engine. After searching, I found Drools to be the best option, but is it also compatible with web applications? I also found Tohu to be the best dynamic form generator (which my project also needs). Can I use Tohu with Drools to build my web application? Is it easy to implement this type of system or not?

If you have a large number of questions, each of them can represent a feature. Assuming you are going to have a LOT of features, finding the series of if-else statements that fulfills the criteria is hard. (Recall that a full tree with n questions is going to have 2^n "leaves", representing the 2^n possible answer combinations, assuming each question is a yes/no question.)
Since hard-coding the above is not feasible for a large enough (and probably realistic) n, there is a place for heuristic solutions. One of those is machine learning, and specifically the classification problem. You can have a sample of people answer your survey, with an "expert" saying what the best career for each of them is, and let an algorithm find a classifier for the general problem. (If you want to convert it into a series of yes/no questions automatically, it can be done with a decision tree, using an algorithm like C4.5 to create the tree.)
It could also be important to determine which questions are actually relevant. Is gender relevant? Is height relevant? These questions can also be answered with ML, using feature selection algorithms for example (PCA, strictly a dimensionality reduction technique, is often used for this purpose).
Regarding the "technology" aspect: there is a nice Java library called Weka, which implements many of the classification algorithms out there.
One question you could ask (and try to answer in your project) is which classification algorithm is best for this problem. Some possibilities are the above-mentioned C4.5, Naive Bayes, Linear Regression, Neural Networks, KNN or SVM (which usually turned out best for me). You can back your choice of algorithm with statistical research and a statistical test showing which is better. The Wilcoxon test is the standard for this.
EDIT: more details on point 2:
Here, an "expert" can be a human classifier from the field of HR who reads the features and classifies the answers. Obtaining this data (usually called the "training data") is hard and sometimes expensive. If your university has an IE or HR faculty, maybe they will be willing to help.
The idea is: gather a bunch of people who first answer your survey. Then give the answers to a human classifier ("expert") who will choose the best career for each person, based on their answers. The data, together with the classification given by the expert, is the input of the learning algorithm; its output will be a classifier.
A classifier is itself a function that, given the answers to a survey, predicts the "classification" (suggested career) for the person who took the survey.
Note that once you have a classifier - you do not need to maintain the training data any more, the classifier alone is enough. However, you should have your list of questions and the answers for these questions will be the features provided to the classifier.
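To make the decision-tree idea a bit more concrete, here is a minimal sketch of how a tree learner might pick which yes/no question to ask first: it computes the information gain of each question over the expert-labelled training data and picks the most informative one. This is the simpler ID3-style criterion (C4.5 refines it into a gain ratio), and it is plain Java, not Weka. The class name, the data layout and the sample data are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: choose the most informative yes/no question (ID3-style).
class QuestionPicker {

    // Shannon entropy, in bits, of a list of career labels.
    static double entropy(List<String> labels) {
        if (labels.isEmpty()) return 0.0;
        Map<String, Integer> counts = new HashMap<>();
        for (String label : labels) counts.merge(label, 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / labels.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    // answers[i][q] is person i's answer to question q;
    // labels[i] is the career the expert chose for person i.
    static double informationGain(boolean[][] answers, String[] labels, int q) {
        List<String> all = new ArrayList<>();
        List<String> yes = new ArrayList<>();
        List<String> no = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            all.add(labels[i]);
            (answers[i][q] ? yes : no).add(labels[i]);
        }
        double n = labels.length;
        // Entropy before the split minus the weighted entropy after it.
        return entropy(all)
                - (yes.size() / n) * entropy(yes)
                - (no.size() / n) * entropy(no);
    }

    // Index of the question with the highest information gain.
    static int bestQuestion(boolean[][] answers, String[] labels) {
        int best = 0;
        for (int q = 1; q < answers[0].length; q++) {
            if (informationGain(answers, labels, q) > informationGain(answers, labels, best)) {
                best = q;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented sample: column 0 = "likes maths?", column 1 = "likes drawing?".
        boolean[][] answers = {{true, true}, {true, false}, {false, true}, {false, false}};
        String[] labels = {"CS", "CS", "Design", "Design"};
        System.out.println("Ask question " + bestQuestion(answers, labels) + " first");
    }
}
```

Applying the same step recursively to the "yes" and "no" subsets gives the tree, which is also a natural way to decide the next question in real time.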

All you have to do to satisfy them is create a simple learning system:
1. Change your thesis terminology so it is described as "learning the best career" instead of using the word "intelligent". Learning is a form of artificial intelligence.
2. Create a training regime. Do this by giving the questionnaire to people that already have careers and also ask questions to find out how satisfied they are with their career. That way your system can train on what makes a good career match and what makes a bad one.
3. Choose a learning system to absorb the data from (2). For example, one source of ideas might be this recent paper: http://journals.cluteonline.com/index.php/RBIS/article/download/4405/4493. Sum-product networks are cutting edge in AI and apply well to expert-system-like problems.
4. Finally, try to give a twist to whatever your technology is to make it specific to your problem.

In my final project, I had some experience with the Jena RDF inference engine. Basically, what you do with it is create a sort of knowledge base with rules like "if user chose this answer, he has that quality" and "if user has those qualities, he might be good for that job". Adding answers into the system will let you query his current status and adjust questions accordingly. It's pretty easy to create a proof of concept with it, and it's easier to do than a bunch of if-else. If your professors worship prolog-ish style things, they'll like it.
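The knowledge-base idea can be tried out without any library. The following is not Jena's API, just a plain-Java sketch with invented rule and fact names. It shows forward chaining: a rule fires as soon as all of its premises are known facts, and this repeats until nothing new can be derived.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of forward chaining; not Jena's API.
class ForwardChaining {

    // A rule: once every premise is a known fact, the conclusion becomes a fact too.
    static final class Rule {
        final Set<String> premises;
        final String conclusion;

        Rule(Set<String> premises, String conclusion) {
            this.premises = premises;
            this.conclusion = conclusion;
        }
    }

    // Keeps firing rules until a full pass over the rule list adds nothing new.
    static Set<String> infer(Set<String> initialFacts, List<Rule> rules) {
        Set<String> facts = new LinkedHashSet<>(initialFacts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                // Set.add returns true only if the conclusion was not known yet.
                if (facts.containsAll(rule.premises) && facts.add(rule.conclusion)) {
                    changed = true;
                }
            }
        }
        return facts;
    }

    public static void main(String[] args) {
        // Invented rules: answers imply qualities, qualities imply careers.
        List<Rule> rules = List.of(
                new Rule(Set.of("quality:analytical", "quality:likes-computers"),
                        "career:computer-science"),
                new Rule(Set.of("answer:enjoys-puzzles"), "quality:analytical"));
        Set<String> known = Set.of("answer:enjoys-puzzles", "quality:likes-computers");
        System.out.println(infer(known, rules));
    }
}
```

Each new answer just becomes another fact, so re-running `infer` after every question gives the "current status" that can drive the next question.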

As @amit suggested, Bayesian analysis can provide you guidance on the next question to ask. Another pitfall of dynamic tests is artificial thresholds ("if your score is 28, you are in this category, if your score is 27, you are not"), a problem which fuzzy logic can help address. Another benefit of fuzzy logic is that adding a new category is relatively easy, since the domain expert is only asked to contribute qualitative assessments, not quantitative thresholds.
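As a rough illustration of the threshold point, here is a minimal sketch of a fuzzy membership function: instead of a hard cut at 28, the degree to which a score counts as "high" rises gradually between two breakpoints. The class name and the breakpoints (20 and 30) are invented; a real system would get them from the domain expert.

```java
// Hypothetical sketch of a ramp-shaped fuzzy membership function.
class FuzzyScore {

    // Degree (0.0 to 1.0) to which a score belongs to the "high" category:
    // 0 at or below `low`, 1 at or above `high`, linear in between.
    static double rampUp(double score, double low, double high) {
        if (score <= low) return 0.0;
        if (score >= high) return 1.0;
        return (score - low) / (high - low);
    }

    public static void main(String[] args) {
        // Scores 27 and 28 now differ only slightly instead of flipping category.
        System.out.println(rampUp(27, 20, 30)); // prints 0.7
        System.out.println(rampUp(28, 20, 30)); // prints 0.8
    }
}
```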

A program is never more intelligent than the person who wrote it. So, I would first use the collective intelligence that has been built and open sourced already.
Pass your set of known data points as an input to Apache Mahout's PearsonCorrelationSimilarity and use the output to predict which course is the best match. In addition to being open source and scalable, you can also record the outcome and feed it back to the system to improve the accuracy over time. It is very hard to match this level of performance because it is a lot easier to tweak an out of the box algorithm or replace it with your own than it is to deal with a bunch of if else conditions.
I would suggest reading this book. It contains an example of how to use PearsonCorrelationSimilarity.
Mahout also has built in recommender algorithms like NearestNeighborClusterSimilarity that can simplify your solution further.
There's a good starter code in the book. You can build on it.
Student credentials, Interest Test Questions and answers are inputs. Career choice is the output that you can correlate with the input. Now that's a very simplistic approach but it might be ok to start with. Eventually, you will have to apply the classifier techniques that Amit has suggested and Mahout can help you with that as well.
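To see what a Pearson-based similarity does conceptually, here is a minimal plain-Java sketch. It does not use Mahout's API; the class, the method names and the sample answer vectors are invented. It computes the Pearson correlation between a new student's answers and those of past students, and picks the most similar past student, whose career could then be suggested.

```java
// Hypothetical sketch: Pearson correlation between answer vectors (not Mahout's API).
class Pearson {

    // Pearson correlation of two equally long vectors.
    // Yields NaN when either vector has zero variance.
    static double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length == 0) {
            throw new IllegalArgumentException("vectors must be non-empty and equally long");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;

        double cov = 0, varX = 0, varY = 0;
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Index of the known student whose answers correlate best with the newcomer's,
    // or -1 if no correlation could be computed.
    static int mostSimilar(double[] newcomer, double[][] known) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < known.length; i++) {
            double score = correlation(newcomer, known[i]);
            if (score > bestScore) { // a NaN score never wins this comparison
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Invented data: interest-test scores of past students and their careers.
        double[][] known = {{1, 2, 5, 4}, {4, 5, 2, 1}};
        String[] careers = {"Fine Arts", "Computer Science"};
        double[] newcomer = {5, 4, 1, 2};
        System.out.println("Suggested: " + careers[mostSimilar(newcomer, known)]);
    }
}
```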

Drools can be used via the web, but watch out; it can be a bit of a beast to configure and is likely serious overkill for your application. It is an 'enterprise' type of solution focused around rule management, rather than rule execution.
Drools is an "IF-THEN" system, and pretty much all rules engines use the Rete algorithm. http://en.wikipedia.org/wiki/Rete_algorithm - so if your original question is about how not to use an IF-THEN system, Drools is not the right choice. Now, there is a Solver and Planner part of Drools that are not IF-THEN algorithms, but this is not the main Drools algorithm.
That said, it seems like a reasonable choice for your application. Just don't expect it to be considered an 'intelligent' system by those who deem themselves as experts. Rules engines are typically used to codify (that is, make software of) the rules and regulations of business, such as 'should you be approved for a mortgage' or 'how much is your car insurance' and so on. 'what job you should do' is a reasonable application of the same.
If you want to add more AI-like intelligence, here are a few ideas:
Use machine learning to get feedback from the user about earlier recommendations. So, if someone likes or hates a suggestion, add that back in as a feature of the person. You are now doing some basic feedback/reinforcement learning (bayes, neural nets) to try to better classify the person to the career.
Consider the questions you ask the person. Do you need to ask all of the questions? If you can alter the flow of questions based on their responses (by estimating what kind of person they are) then you are trying to learn the series of questions that gives the most useful knowledge for a recommendation.
If you want specific software, look at Weka http://www.cs.waikato.ac.nz/ml/weka/ - it has many great algorithms for classifying. And it is a Java library, so you can easily use it within a web application.
Good luck.

Related

Context aware recommendation engine

I am looking for a context-aware (location, time, companion) recommendation system.
I found a bunch of good recommendation systems (Mahout, PredictionIO, easyrec).
But unfortunately I am not convinced by any of those.
On further googling I found CARSKit, based on librec.
I am looking for exactly that kind of library. At the same time, I would prefer to work with Mahout only.
Though Mahout does not quite suit me, with it we can at least ask for a number of recommendations, and the output is quite understandable.
As per my understanding, "context awareness" is missing in Mahout.
I will explain my dataset.
calendar_seq,user_id,date,dayofweek,timehh,timemm,location_name,location_lat,location_long,companion,event_name,is_recommended,is_accepted,show_in_cal
1,1,14/12/15,Monday,13,0,Office,1.1,2.2,Colleagues,lunch,true,true,true
2,1,14/12/15,Monday,18,0,Cinema,3.3,4.4,NA,Movie,false,true,true
3,1,15/12/15,Tuesday,13,0,Office,1.1,2.2,Colleagues,lunch,true,true,true
4,1,15/12/15,Tuesday,18,0,Meeting,3.3,4.4,Colleagues,meeting,false,true,true
5,1,16/12/15,Wednesday,13,0,Office,1.1,2.2,Colleagues,lunch,true,true,true
I will have the above five rows in the DB and will give them as training data.
Now I need a recommendation for User 1 on 16/12/15 in the evening, at 18:00.
It can recommend Cinema or Meeting for 16/12.
When I run the recommender again for 17/12, the previous day's recommendations will all become training data.
So again the recommender can give recommendations based on location, time, companion etc.
Can anyone suggest the best-suited recommendation wrapper on top of Mahout, or a new library that suits my requirement?
I prefer Java-based solutions for my problem.
This may be similar to your question.
A quote from this link: "Your input file may have multiple features like age, location etc. R could help you in applying K-Means clustering on multiple features. Apache Mahout implementation overwrite features instead of applying multiple features. And when you apply clustering on these multiple features, clusters would be formed based on all features instead of one. However, I am not sure about the use-case, So I am just discussing technical feasibility here. You may need to apply based on your use-case."
Hope this helps.

Very small programs to improve programming skills? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I realize that to become a better programmer, you need to program!
So obviously the more practice, the better you become.
My problem is this. I am currently in university, and I find my course load a bit daunting, so I don't have a lot of free time. I don't think I could really take on a big project; in particular, I don't think I would have the motivation to see it through. It would be easier for me to keep putting it off in favour of work that is due in school.
But I still want to practice.
So I am looking for any resources which have programming challenges which can be completed in a fairly small amount of time. Ideally something I could get done in under 10 hours of work (so just over an hour of work each day), if not smaller.
I have heard of Google Code Jam, but I am not sure the length of the programs it specifies, nor the skill level.
Does anyone have suggestions? Even perhaps a compendium of tutorials for different functions might be useful. For example, a tutorial on file IO would be worthwhile (if I didn't already know it), even though it can be a fairly small topic.
You should look into code katas, they do exactly what you are talking about. Short exercises that are designed to perfect your coding/thinking abilities.
Other references:
http://kata.coderdojo.com/wiki/Overview_of_Learning_Resources
Project Euler has some math/number related problems that are very interesting and range from easy to very challenging. You can pick your language of choice and submit only the solution (a large integer number). After you have submitted the correct solution, you have access to a forum/comment page where others posted their comments and solutions.
From experience I recommend finding a task that you do repetitively and turning it into a program. I also recommend, seriously, re-invent the wheel in order to get practice with programming. Don't let people tell you to not do something just because it exists already. If you don't know how it works, try to write it yourself.
I don't exactly know what programming level you are on, but don't try to do anything too crazy off the bat, that is just a demotivator (such as trying to write a game for the PS3).
If you already can navigate your way around with IO, then you should try to really learn how to use Collections effectively. I think one of the best practice assignments I have ever done was rewriting the Java TreeMap Class. It was a huge challenge and I learned a lot by doing it.
Here are some suggestions for practice assignments:
Take a text file that has a fair amount of information in it, grab anything, you can get something from here if you'd like: http://www.gutenberg.org/ and make a program that will do the following:
Read in the file
Create a collection of words and their occurrences
Create a collection of anagrams
Create a collection of words and the positions in which they occur (line#, word position)
Develop statistics on the words in the file - meaning - treating each word as an individual - which words occur before it and after it.
Remove all of the white space from the file
Write all of the above data to their own files
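As a possible starting point for the first few items in this list, here is a minimal sketch that builds the word-occurrence collection and the anagram collection. The class and method names are invented, words are simplified to runs of the letters a-z, and the file path is taken from the command line; the remaining items (positions, statistics, writing files) are left as the exercise.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch for the word-count and anagram exercises.
class WordStats {

    // Maps each lower-cased word to the number of times it occurs.
    static Map<String, Integer> wordCounts(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.toLowerCase().split("[^a-z]+")) {
            if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Groups words that share the same letters: the key is the word's sorted letters.
    static Map<String, Set<String>> anagramGroups(Collection<String> words) {
        Map<String, Set<String>> groups = new TreeMap<>();
        for (String word : words) {
            char[] letters = word.toCharArray();
            Arrays.sort(letters);
            groups.computeIfAbsent(new String(letters), k -> new TreeSet<>()).add(word);
        }
        return groups;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java WordStats.java some-book.txt
        String text = Files.readString(Path.of(args[0]));
        Map<String, Integer> counts = wordCounts(text);
        System.out.println(counts.size() + " distinct words");
        // Print only the groups that actually contain more than one word.
        anagramGroups(counts.keySet()).values().stream()
                .filter(group -> group.size() > 1)
                .forEach(System.out::println);
    }
}
```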
One of my favorite things to do is mess with web data, go to a polling website, find a page that has poll data in a tabular form and do the following:
Download the data
Parse through the data and turn the tabular data into a CSV file
Open it in excel without error
Or just look for any site and extract data from it, just make sure the site is robot friendly http://www.robotstxt.org/, you don't want any one site to feel like it is under attack. Most of the time though this isn't normally a problem because if you read the site's terms of use it clearly states you are allowed to download 1 copy of whatever it is you are viewing so long as you don't intend to sell it. Of course this changes for every site.
Go to a website and get all of the links off of the page programmatically.
Here is a fun one, the Susan Program (I don't remember why it is named Susan) which I initially wrote using a C program and two Bourne shell scripts in a Unix environment. The idea in this program is to fork 4 child processes and give them each a task like so:
Child 1: Reads in a file, creates a dictionary of each word and its position in the file, this is outputted to a file.
Child 2: Takes Child 1's output and reconstructs the document, this is outputted to a file.
Child 3: Takes Child 2's output and does what child 1 did again
Child 4: Takes Child 3's output and does what child 2 did again
The goal here is to have an exact replica of the original file once Child 4 outputs it. This is challenging and somewhat pointless, but the point of this exercise is to get the practice.
In your case, don't feel that you need to use different threads for this, you can just use a single program with two different functions and just call them in order.
Again, not sure if you are at this level yet, but try to replace any "for" or "foreach" loop you have in your program with recursion, just as practice. Recursion is a pain in the butt, but it is valuable to know and understand.
These are some suggestions which I think will really help you sharpen your skills.
Enjoy
I like SPOJ and Project Euler to take quick programming challenges and exercises.
Code Jam is a good programming contest, although, as you mentioned, most of the problems there aren't for beginners.
There's a good selection of problems from past topcoder algorithm competitions. (They are held ~2 times a month for almost 10 years already, so there're quite a lot.)
Difficulty range from very simple (but still interesting) problems in the 2nd division to very hard.
Additionally, there're editorials with solutions and live environment where you can submit and test your code. You can also learn from submissions by other people.
Check the problem listing.
Another advantage of topcoder is the regular online contests they hold. I find that competing against other people in realtime is a great boost for motivation.
There're many more problem archives, like SPOJ, UVA and Timus, although they rarely provide solutions or even hints.
http://codegolf.stackexchange.com might have some programming challenges to your liking. A lot of the answers on that site are golfed (they implement the program in the least number of characters) but there are definitely some interesting examples to learn from.
Try enrolling on any IT course on the following websites:
Coursera
Edx
Udacity
These websites offer free educational IT programs from prestigious schools, with a lot of challenging exercises to sharpen your programming skills. I've learned to program percolation, pattern recognition, bouncing ball and so many more interesting things because of this. You upload your program upon completing the exercises and you are graded accordingly (basically your program will be checked).
At the end of each course, you will even receive a certificate of completion. Cool Right?
It depends on the language, but in the past http://rubyquiz.com and http://pythonchallenge.com worked great for me. You can also join an open source project, because that usually gives you better chances of getting your code reviewed.
I've always thought that practicing with sample interview questions was a great way to sharpen one's skills and get exposed to types of problems that you normally wouldn't solve. Plus, if you're going to be looking for a job it helps you even more.
Here's a pretty simple one that I did for fun the other day:
Write a routine to print the numbers 1
to 100 and back to 1 again without
using any loops.
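One way to approach that question is plain recursion: print on the way down and again on the way back up. This is just one possible sketch (the class and method names are made up), and it collects the numbers in a string so the result is easy to check.

```java
// Hypothetical sketch: count from 1 up to max and back down without any loops.
class UpAndDown {

    // Appends current..max while descending, then (max - 1)..current while returning.
    static void countUpAndBack(int current, int max, StringBuilder out) {
        out.append(current).append(' ');
        if (current < max) {
            countUpAndBack(current + 1, max, out);
            out.append(current).append(' ');
        }
    }

    static String sequence(int max) {
        StringBuilder out = new StringBuilder();
        countUpAndBack(1, max, out);
        return out.toString();
    }

    public static void main(String[] args) {
        // sequence(3) is "1 2 3 2 1 " (with a trailing space).
        System.out.println(sequence(100).trim());
    }
}
```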
Glassdoor.com has a lot of good interview questions submitted by people who actually got them in an interview.
Since you are in University and looking to improve your coding skills the hard-copy book Cracking the Coding Interview might be a good fit for you. It's got great general programming questions and tidbits about interviewing with some of the best companies in tech. Not only are there great questions, but there are decent problem breakdowns as well.
[Disclosure: I own the book but otherwise have no association to it.]
If you like programming and want to improve your programmer skills, you must try cocode.co. It's a social young site, similar to StackOverflow but based on posting and solving programming challenges, instead of asking and answering questions. From very easy challenges to very hard ones.
You can try to solve ACM problems. There are thousands of problems there and you can find the difficulty level so you can choose which problems to do first. The official site for this is:
http://uva.onlinejudge.org/. You can learn more there.
regards
arefin
It may seem a little obvious, but I've noticed a real boost in my regular-expressions skills lately just from answering regex questions on Stack Overflow. Teaching forces you to break down problems into easily explainable pieces, and will also guide your research on those occasions where you know most, but not quite all, of a solution.
I suggest finding a topic you're already somewhat proficient in, since this type of thing isn't so good as a beginners' tutorial. Search SO for questions tagged with that topic and try to figure out the answers. Don't just code them in your head; go ahead and write them out, test them, and explain them. If you're not sure your answer is correct, just write it without posting it.

implement/invoke Excel Solver from java

I have an application in Java/JSF and I need to do some optimization calculations, like the Excel Solver add-in does. One option is certainly to write my own solver implementation, but I'm short of time, so I'm looking into existing libraries that can help me with this.
Can you recommend any libraries?
EDITED
I don't have the algorithm yet, but I know that I will have to do similar things as in Excel Solver: define parameters, the goal and restrictions, and calculate the MAX/MIN revenue.
Not a complete solution, but this may get you on the right track (what you are looking for is a non-linear parametric optimizer/solver):
http://jfuzzylogic.sourceforge.net/html/index.html
I did some Googling, and I was surprised that I wasn't able to find something right away...
Here is info about Excel's specific algorithm: http://support.microsoft.com/kb/82890 (again, not a solution, but certainly interesting information for anyone who does this sort of thing).
And here's the company that actually wrote the Excel solver: http://www.solver.com/sdkplatform2.htm
Not sure what your budget is, but if time is of the essence, it may make sense to license it (not sure if they have a Java version of their sdk or not).
And a related question at SO: Solving nonlinear equations numerically

Is a Java book enough or should I have to learn algorithms first? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
I am new to Java and am using the Head First book.
Do I have to learn algorithms to be able to write programs in Java?
Should I learn algorithms first, or will Java books like Effective Java, Java Puzzlers etc. be enough?
I want to be a successful enterprise developer. Which algorithms and data structures should I be well versed in? What books would you recommend?
To be a successful Java developer, do I need to know all the advanced algorithms such as those given in CLRS?
PS: I had C and C++ in my past semesters. I got good marks in them, but that was kinda mugging. I know the basics of programming; I am not a newbie. The thing is that I don't have sound knowledge of any language. Also, I want to be a developer, not an algorithmist.
If you want to become a successful developer you'll need to learn a lot of stuff. A main programming language (Java) and CS basics (e.g. algorithms) are just two of them. Others are: communication skills, testing (I'd say TDD but I don't want to start a fight), handling of databases, web stuff, OOD and many more. Also, a lot of stuff that you might not use directly will still help a lot by broadening your horizon (functional programming, advanced CS concepts).
There is no fixed order in which to learn that stuff. Since learning works best when you are motivated, just pick what you are interested in most or what is helping you most. Learn a little here, then learn a little there. Keep your eyes open and practice a lot and you will be fine.
In your current situation I'd recommend picking up an algorithms book and implementing the algorithms in it in Java. That will teach you both Java and algorithms.
Also read "Clean Code", "The Pragmatic Programmer" and the SOLID http://www.butunclebob.com/ArticleS.UncleBob.PrinciplesOfOod principles
My advice - learn a subset of Java, then learn some basic algorithms, then figure out where you want to go from there.
You don't need to know all of Java to experiment with implementing algorithms. You will need to be able to wrap a few methods (often only one) in a single class (and even that only because Java insists everything's a class). You'll need conditional and looping constructs and arrays. For data structures, you'll probably need to understand Java references, though you may be able to fake pointers using integers as array subscripts.
The point is that algorithms are important, but on a relatively small scale. Data structures typically need a few algorithms to make them work. But you don't need an understanding of larger-scale issues such as object-oriented design to experiment with algorithms.
EDIT
From comments, I see that you're already well into the "where to go from there" stage.
Don't forget algorithms (or discrete math) because you don't care about analysis, though. A broad (not necessarily in depth) understanding is a useful resource - they are great for problem-solving methods.
For example, I was recently confused in a dependency-ordering issue about how to deal with cycles. I'd forgotten about "strongly connected components" in digraphs. Once I was reminded, the problem became trivial - no point trying to order within a strongly connected component, but those components form an acyclic digraph. From there, the answer is just a topological sort away.
Knowing about topological sorts makes the last step trivial. Having forgotten about strongly connected components cost me a fair bit of time. Understanding how Tarjan's strongly connected components algorithm works... Wikipedia and a few minutes with pen and paper are enough, once you know what to look for.
Actually, I should confess - "I was reminded" means I looked up an old Dr. Dobb's article on topological sorting that used the same approach.
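For reference, the "topological sort away" step might look roughly like this. It is a sketch of Kahn's algorithm on an acyclic dependency graph, with an invented class name and graph representation. It does not do the strongly-connected-components part: if the graph still contains a cycle, it simply throws.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of Kahn's algorithm for topological sorting.
class TopoSort {

    // edges.get(a) lists every node that must come after a.
    static List<String> sort(Map<String, List<String>> edges) {
        // Count incoming edges for every node that appears anywhere in the graph.
        Map<String, Integer> inDegree = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : edges.entrySet()) {
            inDegree.putIfAbsent(e.getKey(), 0);
            for (String to : e.getValue()) inDegree.merge(to, 1, Integer::sum);
        }

        // Nodes without unprocessed prerequisites are ready to be emitted.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : inDegree.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String to : edges.getOrDefault(node, List.of())) {
                // Emitting a node satisfies one prerequisite of each successor.
                if (inDegree.merge(to, -1, Integer::sum) == 0) ready.add(to);
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("cycle detected: collapse the components first");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = Map.of(
                "fetch", List.of("compile"),
                "compile", List.of("test"),
                "test", List.of("package"));
        System.out.println(sort(deps)); // prints [fetch, compile, test, package]
    }
}
```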
I recommend you begin learning with Head First. functionx.com is also a good site. Most Java books describe algorithms as part of the book, so don't worry about algorithms. First of all learn Java.
Let me give you an easy way to find good books. Always look for higher rated books in amazon.com and then download them from wowebook.be.
I recommend "Data structures and algorithms in Java" by Michael T. Goodrich and Roberto Tamassia.
"Think Like a Programmer" is also a good algorithm-related book.
I'm with Steve314 for the most part but wanted to specify some things.
Learn enough Java (or whatever else) to be dangerous. Then the main data structures (lists, stacks, maps, trees, etc), which does include some algorithms for traversing them. Do the rest of your algorithms work after that. Here are some specific goals, if you want them, that more or less mirror the order of Data Structures & Algorithms in Java.
Understand Java and OO on a basic level. Specifically inheritance and polymorphism (as Java defines them) and how to use generics.
Have a basic understanding of Big-O (how it's defined and why you can drop lower order terms).
Be able to use, write, and trace recursive functions.
Code singly- and doubly-linked lists. They should implement Iterable and support adding and removing elements.
Use them to implement a stack and a queue.
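To give an idea of the size of these goals, here is a minimal sketch of just one of them: a singly-linked list that implements Iterable and is used as a stack. All names are invented; the doubly-linked list and the queue are left as the exercise.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: a stack backed by a singly-linked list.
class LinkedStack<T> implements Iterable<T> {

    // One list cell: a value plus a link to the cell below it.
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private Node<T> head; // top of the stack, null when empty
    private int size;

    // Adds an element on top in O(1).
    void push(T value) {
        head = new Node<>(value, head);
        size++;
    }

    // Removes and returns the top element in O(1).
    T pop() {
        if (head == null) throw new NoSuchElementException("stack is empty");
        T value = head.value;
        head = head.next;
        size--;
        return value;
    }

    int size() {
        return size;
    }

    // Iterates from the top of the stack to the bottom.
    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private Node<T> current = head;

            @Override
            public boolean hasNext() {
                return current != null;
            }

            @Override
            public T next() {
                if (current == null) throw new NoSuchElementException();
                T value = current.value;
                current = current.next;
                return value;
            }
        };
    }
}
```

Because the class implements Iterable, it works directly in a for-each loop, e.g. `for (String s : stack) { ... }`.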
A couple of things to know about once you do dive into a book like Algorithms:
MIT OCW has video lectures up from the class that Algorithms was written for, given by one of its authors.
The purpose of a book like Algorithms is not just to show you specific algorithms, but to teach you how to analyze their running times. You need a small amount of discrete math for that, basically sums and recurrence relations. Look on here for big-O questions to see what I mean.
Speaking of videos, I think this is a way better place to start: Programming Abstractions (Stanford, Julie Zelenski). I'm very happy that those exist.
(As it happens I am in the middle of implementing a map with a binary search tree.)
@Chankey: In my opinion you should take Java and algorithms hand in hand. First learn some basics of the Java programming language, like classes, data types, functions etc. Then learn some basic sorting and searching algorithms. Then apply your Java knowledge to implement these algorithms.
Cheers!!
Mukul Gupta

Design for a Debate club assignment application

For my university's debate club, I was asked to create an application to assign debate sessions, and I'm having some difficulty coming up with a good design for it. I will do it in Java. Here's what's needed:
What you need to know about BP debates: There are four teams of 2 debaters each and a judge. The four groups are assigned a specific position: gov1, gov2, op1, op2. There is no significance to the order within a team.
The goal of the application is to get as input the debaters who are present (for example, if there are 20 people, we will hold 2 debates) and assign them to teams and roles with regards to the history of each debater so that:
Each debater should debate with (be on the same team as) as many people as possible.
Each debater should uniformly debate in different positions.
The debate should be fair - debaters have different levels of experience and this should be as even as possible - i.e., there shouldn't be a team of two very experienced debaters and a team of junior debaters.
There should be an option for the user to restrict the assignment in various ways, such as:
Specifying that two people should debate together, in a specific position or not.
Specifying that a single debater should be in a specific position, regardless of the partner.
If anyone can try to give me some pointers for a design for this application, I'll be so thankful!
Also, I've never implemented a GUI before, so I'd appreciate some pointers on that as well, but it's not the major issue right now.
Also, there is the issue of keeping Debater information in file, which I also never implemented in Java, and would like some tips on that as well.
This seems like a textbook constraint problem. GUI notwithstanding, it'd be perfect for a technology like Prolog (ECLiPSe prolog has a couple of different Java integration libraries that ship with it).
But since you want this in Java, why not store the debaters' history in a SQL database, and use SQL to express the constraints? You can then wrap those SQL queries as Java methods.
There are two parts (three if you count entering and/or saving the data), the underlying algorithm and the UI.
For the UI, I'm weird. I use this technique (there is a link to my sourceforge project). A Java version would have to be done, which would not be too hard. It's weird because very few people have ever used it, but it saves an order of magnitude coding effort.
For the algorithm, the problem looks small enough that I would approach it with a simple tree search. I would have a scoring algorithm and just report the schedule with the best score.
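A rough sketch of what "tree search plus scoring" could mean here, under heavy simplifying assumptions: it only pairs debaters into teams and scores a pairing by the gap between the strongest and weakest team, using invented experience numbers. The other criteria from the question (partner history, positions, user restrictions) would become extra terms in the score. All names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: exhaustive search over team pairings with a fairness score.
class TeamSearch {

    // Score of a complete pairing: gap between the strongest and the weakest team.
    // Lower is fairer. partner[i] is the index of debater i's teammate.
    static int spread(int[] experience, int[] partner) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int i = 0; i < partner.length; i++) {
            int team = experience[i] + experience[partner[i]];
            min = Math.min(min, team);
            max = Math.max(max, team);
        }
        return max - min;
    }

    // Tries every way of pairing up the debaters and returns the fairest one.
    static int[] fairestPairing(int[] experience) {
        if (experience.length == 0 || experience.length % 2 != 0) {
            throw new IllegalArgumentException("need a positive, even number of debaters");
        }
        int[] partner = new int[experience.length];
        Arrays.fill(partner, -1); // -1 means "no teammate yet"
        int[][] best = new int[1][];
        search(experience, partner, best);
        return best[0];
    }

    private static void search(int[] experience, int[] partner, int[][] best) {
        // Find the first debater who has no teammate yet.
        int first = 0;
        while (first < partner.length && partner[first] != -1) first++;
        if (first == partner.length) {
            // Leaf of the search tree: everyone is paired, so score it.
            if (best[0] == null || spread(experience, partner) < spread(experience, best[0])) {
                best[0] = partner.clone();
            }
            return;
        }
        // Branch: try every remaining debater as the teammate, then undo.
        for (int other = first + 1; other < partner.length; other++) {
            if (partner[other] != -1) continue;
            partner[first] = other;
            partner[other] = first;
            search(experience, partner, best);
            partner[first] = -1;
            partner[other] = -1;
        }
    }

    public static void main(String[] args) {
        int[] experience = {5, 1, 4, 2}; // invented experience levels
        System.out.println(Arrays.toString(fairestPairing(experience)));
    }
}
```

The number of pairings grows quickly (105 for 8 debaters), so for a large club some pruning or a cheaper heuristic would probably be needed.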
That's a bird's-eye overview of how I would approach it.
