AIMA implementation bayesian networks - java

I would like to code bayesian networks in java to understand them better, and I have found some code of Artificial Intelligence A Modern Approach (3rd Edition), "AIMA"
Do you recommend I read the code there and adapt to a particular problem, or how do I start?
Could you please orient me where in how to use the code?
I found google has it here and here ,

I would say there is no need to look at existing code if you want to learn. You will probably learn more by doing it yourself.
A good start would be to write code that does the following:
Compute Condition Probabilities from Joint Probability table,
For example, from P(A,B,C) compute P(A|B)
Compute Joint Probability Table from complete set of Conditional Probabilities
For example, from P(A|B,C)*P(B)*P(C) compute P(A,B,C).
Given a DAG, compute if A is d-seperated from B
Do all of the above naively and then go back and try to make them efficient.
It should give you a good understanding of what Bayesian Networks are (conditional probability tables) and what they are used for (reasoning about probability).

Related

How to approach writing algorithm from a complex research paper

I thought of writing a piece of software which does Alpha Compositing. I didn't wanted ready made code off from internet so I tried to find research papers and other sources to understand the mathematical algorithms, and initiated to implement.
But, I got lost very quickly. So my question is,
How should I approach these papers to extract the necessary details from it in order to write algorithm based on it. Any specific set of steps which works well?
Desired answer :
Read ...
Extract ...
Understand ...
Implement ...
Note: This question is not limited to only Alpha Compositing, so more generalised approach will be helpful. I have tagged Java and C++, because thats my desired language to implement the image processing.
What I have done so far?
This is not a homework question but it is of course better to say what I know. I have read wiki of Alpha compositing, and few closely related Image compositing research papers. But, I stuck at the next step to take in order to go from understanding to implementation.
Wikipedia
Technical Memo, Image compositing
I'd recommend reading articles with complex formulas with a pencil and paper. Work through the math involved until you have a good grasp on it. Then, you'll be ready to code.
Start with identifying the steps needed to perform your algorithm on some image data. Include all of the steps from loading the image itself into memory all the way through the complex calculations that you may need to perform. Then structure that list into pseudocode. Once you have that, it should be rather easy to code up.
Write pseudocode. Ideally, the authors of the research papers would have done this, but often they don't. Write pseudocode for some simple language like Matlab or possibly Python, and hack away at writing a working implementation based on the psuedocode.
If you understand some parts of the algorithm but not others, then implement your pseudocode into real code for the parts you understand, and leaving comments for the places you don't.
The section from The Pragmatic Programmer on "Tracer Bullets" basically describes this idea. You want to quickly hack together something that takes your data into some form of an output, and then iterate on the body of the code to get it to slowly resemble the algorithm you're trying to produce.
My answer is necessarily somewhat vague. There's no magic bullet for something like this.
Have you implemented any image processing algorithms? Maybe start with something a little simpler, like desaturation/color intensification, reversal (side to side and upside down), rotating, scaling, and compositing images through a mask.
Once you have those figured out, you will be in a very good position to do an alpha composite.
I agree that academic papers seem to go out of their way to make implementation details muddy and uncertain. I find that large amounts of simplification to what is written is needed to begin to perform a practical implementation. In their haste to be general, writers excessively parameterize every aspect. To build useful, reliable software, it is necessary to start with something simple which actually works so that it can be a framework to add features. To do that, it is necessary to throw away 80–90 percent of the academic generality. Often much can be done with a raft of symbolic constants, but abandoning generality (say for four and five dimensional images) doesn't really lose anything in practice.
My suggestion is to first write the algorithm using Matlab to make sure that you understood all the steps and then try to implement using C++ or java.
To add to the good suggestions above, try to write your pseudocode in simple module (Object oriented style ) so has to have a deep understanding of each part of your code while not loosing the big picture. Writing everything in a procedural way is good a the beginning but as the code grow, it might get become hard to keep up will all you are trying to do.
This example cites one of the seminal works on the topic: Compositing Digital Images by Porter & Duff. The class java.awt.AlphaComposite implements the same rules.

ML technique for classification with probability estimates

I want to implement a OCR system. I need my program to not make any mistakes on the letters it does choose to recognize. It doesn't matter if it cannot recognize a lot of them (i.e high precision even with a low recall is Okay).
Can someone help me choose a suitable ML algorithm for this. I've been looking around and find some confusing things. For example, I found contradicting statements about SVM. In the scikits learn docs, it was mentioned that we cannot get probability estimates for SVM. Whereas, I found another post that says it is possible to do this in WEKA.
Anyway, I am looking for a machine learning algorithm that best suites this purpose. It would be great if you could suggest a library for the algorithm as well. I prefer Python based solutions, but I am OK to work with Java as well.
It is possible to get probability estimates from SVMs in scikit-learn by simply setting probability=True when constructing the SVC object. The docs only warn that the probability estimates might not be very good.
The quintessential probabilistic classifier is logistic regression, so you might give that a try. Note that LR is a linear model though, unlike SVMs which can learn complicated non-linear decision boundaries by using kernels.
I've seen people using neural networks with good results, but that was already a few years ago. I asked an expert colleague and he said that nowadays people use things like nearest-neighbor classifiers.
I don't know scikit or WEKA, but any half-decent classification package should have at least k-nearest neighbors implemented. Or you can implement it yourself, it's ridiculously easy. Give that one a try: it will probably have lower precision than you want, however you can make a slight modification where instead of taking a simple majority vote (i.e. the most frequent class among the neighbors wins) you require larger consensus among the neighbors to assign a class (for example, at least 50% of neighbors must be of the same class). The larger the consensus you require, the larger your precision will be, at the expense of recall.

Point me in the right direction on NLP datastructures and search algorithm

I've got a school assignment to make a language analyzer that's able to guess the language of an input. The assignment states this has to be done by pre-parsing language defined texts and making statistics about letters used, combinations of letter etc and then making a guess based on this data.
The data structure we're supposed to use is simple multi-dimensional hashtables but I'd like to take this opportunity to learn a bit more about implementing structures etc. What'd I'd like to know is what to read up about. My knowledge of algorithms is very limited but I'm keen on learning if someone could point me in the right direction.
Without any real knowledge and just reading up on different posts I'm currently planing on studying undirected graphs as a datastructure for letter combinations (and somehow storing the statistics within the graph as well) and boyer-moore for the per-word search algorithm.
Am I totally on the wrong track and these would be impossible to implement in this situation or is there something else superior for this problem?
If you can get your hands on a copy of Cormen et al. "Introduction to Algorithms"
http://www.amazon.com/Introduction-Algorithms-Second-Thomas-Cormen/dp/0262032937
It's a very very good book to read up on data structures and algorithms.
Language detection using character trigrams

Boosting my GA with Neural Networks and/or Reinforcement Learning

As I have mentioned in previous questions I am writing a maze solving application to help me learn about more theoretical CS subjects, after some trouble I've got a Genetic Algorithm working that can evolve a set of rules (handled by boolean values) in order to find a good solution through a maze.
That being said, the GA alone is okay, but I'd like to beef it up with a Neural Network, even though I have no real working knowledge of Neural Networks (no formal theoretical CS education). After doing a bit of reading on the subject I found that a Neural Network could be used to train a genome in order to improve results. Let's say I have a genome (group of genes), such as
1 0 0 1 0 1 0 1 0 1 1 1 0 0...
How could I use a Neural Network (I'm assuming MLP?) to train and improve my genome?
In addition to this as I know nothing about Neural Networks I've been looking into implementing some form of Reinforcement Learning, using my maze matrix (2 dimensional array), although I'm a bit stuck on what the following algorithm wants from me:
(from http://people.revoledu.com/kardi/tutorial/ReinforcementLearning/Q-Learning-Algorithm.htm)
1. Set parameter , and environment reward matrix R
2. Initialize matrix Q as zero matrix
3. For each episode:
* Select random initial state
* Do while not reach goal state
o Select one among all possible actions for the current state
o Using this possible action, consider to go to the next state
o Get maximum Q value of this next state based on all possible actions
o Compute
o Set the next state as the current state
End Do
End For
The big problem for me is implementing a reward matrix R and what a Q matrix exactly is, and getting the Q value. I use a multi-dimensional array for my maze and enum states for every move. How would this be used in a Q-Learning algorithm?
If someone could help out by explaining what I would need to do to implement the following, preferably in Java although C# would be nice too, possibly with some source code examples it'd be appreciated.
As noted in some comments, your question indeed involves a large set of background knowledge and topics that hardly can be eloquently covered on stackoverflow. However, what we can try here is suggest approaches to get around your problem.
First of all: what does your GA do? I see a set of binary values; what are they? I see them as either:
bad: a sequence of 'turn right' and 'turn left' instructions. Why is this bad? Because you're basically doing a random, brute-force attempt at solving your problem. You're not evolving a genotype: you're refining random guesses.
better: every gene (location in the genome) represents a feature that will be expressed in the phenotype. There should not be a 1-to-1 mapping between genome and phenotype!
Let me give you an example: in our brain there are 10^13ish neurons. But we have only around 10^9 genes (yes, it's not an exact value, bear with me for a second). What does this tell us? That our genotype does not encode every neuron. Our genome encodes the proteins that will then go and make the components of our body.
Hence, evolution works on the genotype directly by selecting features of the phenotype. If I were to have 6 fingers on each hand and if that would made me a better programmer, making me have more kids because I'm more successful in life, well, my genotype would then be selected by evolution because it contains the capability to give me a more fit body (yes, there is a pun there, given the average geekiness-to-reproducibily ratio of most people around here).
Now, think about your GA: what is that you are trying to accomplish? Are you sure that evolving rules would help? In other words -- how would you perform in a maze? What is the most successful thing that can help you: having a different body, or having a memory of the right path to get out? Perhaps you might want to reconsider your genotype and have it encode memorization abilities. Maybe encode in the genotype how much data can be stored, and how fast can your agents access it -- then measure fitness in terms of how fast they get out of the maze.
Another (weaker) approach could be to encode the rules that your agent uses to decide where to go. The take-home message is, encode features that, once expressed, can be selected by fitness.
Now, to the neural network issue. One thing to remember is that NNs are filters. They receive an input. perform operations on it, and return an output. What is this output? Maybe you just need to discriminate a true/false condition; for example, once you feed a maze map to a NN, it can tell you if you can get out from the maze or not. How would you do such a thing? You will need to encode the data properly.
This is the key point about NNs: your input data must be encoded properly. Usually people normalize it, maybe scale it, perhaps you can apply a sigma function to it to avoid values that are too large or too small; those are details that deal with error measures and performance. What you need to understand now is what a NN is, and what you cannot use it for.
To your problem now. You mentioned you want to use NNs as well: what about,
using a neural network to guide the agent, and
using a genetic algorithm to evolve the neural network parameters?
Rephrased like so:
let's suppose you have a robot: your NN is controlling the left and right wheel, and as input it receives the distance of the next wall and how much it has traveled so far (it's just an example)
you start by generating a random genotype
make the genotype into a phenotype: the first gene is the network sensitivity; the second gene encodes the learning ratio; the third gene.. so on and so forth
now that you have a neural network, run the simulation
see how it performs
generate a second random genotype, evolve second NN
see how this second individual performs
get the best individual, then either mutate its genotype or recombinate it with the loser
repeat
there is an excellent reading on the matter here: Inman Harvey Microbial GA.
I hope I did you some insight on such issues. NNs and GA are no silver bullet to solve all problems. In some they can do very much, in others they are just the wrong tool. It's (still!) up to us to get the best one, and to do so we must understand them well.
Have fun in it! It's great to know such things, makes everyday life a bit more entertaining :)
There is probably no 'maze gene' to find,
genetic algorithms are trying to setup a vector of properties and a 'filtering system' to decide by some kind of 'surival of the fittest' algorithm to find out which set of properties would do the best job.
The easiest way to find a way out of a maze is to move always left (or right) along a wall.
The Q-Algorithm seems to have a problem with local maxima this was workaround as I remember by kicking (adding random values to the matrix) if the results didn't improve.
EDIT: As mentioned above a backtracking algorithm suits this task better than GA or NN.
How to combine both algorithm is described here NeuroGen descibes how GA is used for training a NN.
Try using the free open source NerounDotNet C# library for your Neural networks instead of implementing it.
For Reinforcement Learning library, I am currently looking for one, especially for Dot NET framework..

Random maps/graphs and OSM

just wondering if you have any suggestions here. I need a lot of sample maps/graphs to test my shortest path search solution (I was told I should have >100 of them). My code is supposed to work in a simulator, which uses OpenStreetMap maps of urban setting, limiting the total number of junctions to a few thousand. the problem is, there are only two or three maps provided with the simulator. The way I see it, I have a few choices here:
Write my own random graph generator. Possibly lots of work (do you think? --I've never done it before) and reinventing the wheel.
Use off-the-shelf solution. I'm not aware of any that would generate me map-like graphs (well, at least I didn't find it in JUNG :-) )
In some automated way grab them from OSM. I don't really intend to myself go and pick out a 100+ urban maps that would satisfy <15000 nodes requirement. I don't think that would be easy to automate either, though.
I would assume that 3 would be tough to do. Any advice on some off-the-shelf solution? or comments about writing my own? I'm not an experienced programmer by any measure, but given a few days.
The first thought:
You have a known problem and you need to test its solution. Generate lots of test data, find correct solutions with verified algorithm, then run your algorithm against generated data set and compare results. (or just download verified dijkstra algorithm implementation, I believe that implementing this algorithm is your task)
The second thought:
Random-generated data set is not the best way to test algorithms. You need to think about cases when your algorithm can fail and create correspondent tests. For example, graph with 1 node, graph with cycles, linear graph i.e. N1---N2---N3-...-Nn, complete graph with maximum nodes number. I think if you create these 4 tests and 2-3 small random tests it'll be enough to be sure that your algorithm is implemented correctly.

Categories

Resources