So, I'm desperately trying to understand ray tracing, and I THINK I have it...? What I think it is: you have a camera location and rotation, and then you create a simulated entity that moves along a velocity given by a vector with a 0-1 (float/double) value per coordinate, in short time steps, and in the end, with a lot of collision checking, you find and return a location depending on how you handle the collision data? This is just my theory of how it works. My question is, did I get it right?
Depending on how much time you can invest in this, one idea is to get a working implementation, such as pbrt, and try to understand it from a higher level: feed it an example scene (also downloadable) and step through the code in a debugger to see how it parses the file, creates the scene and its geometry, samples the region and creates rays, and finally evaluates the value for those rays.
You could easily skip the details so as not to feel overwhelmed.
I'm trying to find if a scanned pdf form contains a signature (like making sure a check is signed).
The problem domain:
I will be receiving document packages (multi page pdf's with multiple forms). I have already put together document package classifiers that will check the package for all documents and scale the images to a common size. After that I know where the signatures should be and can scan the area of the document specifically. What I'm looking for is the best approach to making sure there is a signature present. I've considered just checking for a base threshold of dark pixels but that seems so clumsy. The trouble with signatures is that they are not really writing, more of a personal mark.
The only thing I can come up with is a machine learning method to look for loopiness? But I'm not all that familiar with machine learning and don't even know where to start with something like that. Suggestions for practical approaches would be very much appreciated.
I'm coding this in Java, if that's helpful at all.
What you asked was very broad so there isn't a lot of information that we can give you. However, I can point you to some helpful links:
http://java-ml.sourceforge.net/ --This is a library that you can download that has lots of useful algorithms and other code to include in your program
https://www.youtube.com/playlist?list=PLiaHhY2iBX9hdHaRr6b7XevZtgZRa1PoU --this is a series that explains neural networks (something you might want to look into for your machine learning)
So a big tip I have for your algorithm: instead of looking at exactly how long all of the loops and other strokes are, look at their relative distances.
"Relative distances from what?" you say. Well, this is where the next tip comes in handy: instead of keeping track of the lines, keep track of the tips of the loops and the order of these points. Then take the distances between all of them (relative distances, of course, meaning they are expressed against one chosen reference length). Along with the distances, you should also keep track of the angles. You can calculate the angle ABC by taking the distances between (A,B), (B,C), and (A,C) (A, B, and C being coordinates on the xy plane); these form a triangle between the points, which allows you to use trigonometry (the law of cosines) to calculate the angle.
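The angle calculation described above can be sketched in Java as follows. This is only a minimal illustration: the class name `LoopAngles` and its method names are made up for this example, and the points are assumed to be plain x/y pixel coordinates of the loop tips.

```java
final class LoopAngles {
    /** Euclidean distance between the points (x1, y1) and (x2, y2). */
    static double distance(double x1, double y1, double x2, double y2) {
        return Math.hypot(x2 - x1, y2 - y1);
    }

    /** Angle ABC in degrees (the angle at vertex B), using the law of cosines. */
    static double angleDegrees(double ax, double ay,
                               double bx, double by,
                               double cx, double cy) {
        double ab = distance(ax, ay, bx, by);
        double bc = distance(bx, by, cx, cy);
        double ac = distance(ax, ay, cx, cy);
        if (ab == 0 || bc == 0) {
            throw new IllegalArgumentException("B must differ from A and C");
        }
        double cos = (ab * ab + bc * bc - ac * ac) / (2 * ab * bc);
        // Clamp to [-1, 1] to guard against floating-point rounding.
        cos = Math.max(-1.0, Math.min(1.0, cos));
        return Math.toDegrees(Math.acos(cos));
    }

    public static void main(String[] args) {
        // Hypothetical loop tips A = (1, 0), B = (0, 0), C = (0, 1).
        System.out.println(angleDegrees(1, 0, 0, 0, 0, 1));
    }
}
```

Dividing every distance by one chosen reference distance (for example the first one) would give the relative distances, which do not depend on how large the signature was written.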
(I am assuming that you are also trying to detect whose signature it is, because that actually doesn't complicate things much at all.) When trying to match the detected signature against the stored signatures to see if they are the "same," don't require the distances and angles to be exact. Give a margin of error (for example, a % range above and below). Here is a tip: make the margin of error rather large. That way, a poorly written signature will still be detected. This raises the chances of more than one signature being picked up. Luckily, there is a simple solution to this: have the program run the algorithm again on the signatures that were found, but with a smaller margin of error (you of course don't do this manually). Continue decreasing the margin of error until only one signature remains.
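A rough sketch of that narrowing loop is shown below. Everything in it is an assumption made for illustration: each signature is reduced to a `double[]` of features (relative distances and angles in a fixed order), the margin of error is a fraction of the stored value, and the names `SignatureMatcher`, `withinTolerance`, and `narrow` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class SignatureMatcher {
    /** True if every observed feature lies within +/- tolerance (a fraction) of the stored one. */
    static boolean withinTolerance(double[] observed, double[] stored, double tolerance) {
        if (observed.length != stored.length) {
            return false;
        }
        for (int i = 0; i < observed.length; i++) {
            double allowed = Math.abs(stored[i]) * tolerance;
            if (Math.abs(observed[i] - stored[i]) > allowed) {
                return false;
            }
        }
        return true;
    }

    private static List<double[]> filter(double[] observed, List<double[]> pool, double tolerance) {
        List<double[]> matches = new ArrayList<>();
        for (double[] candidate : pool) {
            if (withinTolerance(observed, candidate, tolerance)) {
                matches.add(candidate);
            }
        }
        return matches;
    }

    /** Starts with a generous margin and shrinks it until at most one candidate remains. */
    static List<double[]> narrow(double[] observed, List<double[]> stored,
                                 double startTolerance, double shrinkFactor) {
        if (shrinkFactor <= 0 || shrinkFactor >= 1) {
            throw new IllegalArgumentException("shrinkFactor must be between 0 and 1");
        }
        double tolerance = startTolerance;
        List<double[]> candidates = filter(observed, stored, tolerance);
        while (candidates.size() > 1 && tolerance > 1e-6) {
            tolerance *= shrinkFactor;
            List<double[]> next = filter(observed, candidates, tolerance);
            if (next.isEmpty()) {
                break; // tightening removed everything: keep the previous, ambiguous set
            }
            candidates = next;
        }
        return candidates;
    }

    public static void main(String[] args) {
        double[] observed = {10.0, 90.0};
        List<double[]> stored = new ArrayList<>();
        stored.add(new double[] {10.5, 92.0});
        stored.add(new double[] {13.0, 100.0});
        List<double[]> result = narrow(observed, stored, 0.5, 0.5);
        System.out.println("Remaining candidates: " + result.size());
    }
}
```

If tightening the margin would eliminate every candidate at once, this sketch stops and returns the last non-empty set, so the caller can see that the match is ambiguous.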
I am hoping you already have ideas for detecting where the actual signature is, but checking for the difference in darkness of the pixels is the obvious start. Make sure the stroke is fairly continuous. Also take note of the fact that signatures are commonly written in black or blue, and sometimes in red or other fancy colors.
I would like to do some audio and video analysis in Java.
In a bit more detail, I would like to identify the points in audio/video that have either been monotonous for quite some time or have drastically changed compared to some previous state.
If you want to look at it in a mathematical way, I can try to explain it like this:
Example:
You have an audio file. You should extract the waveform of that
audio file. You could try to approximate that waveform with some
simpler function, that can be expressed as a closed formula. Let's
call that function f(t).
Now, to find out how your function behaves (is it increasing or decreasing) at some point or interval, I guess I could use the first derivative, f'(t). If I'd like even more information, I assume the second derivative, f''(t), would also come in handy.
So, if we assume we can do that, then I guess I'd have one piece of information about the audio.
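As a side note, with sampled audio there is no need for a closed formula before differentiating: f'(t) and f''(t) can be approximated directly from the samples with finite differences. The sketch below assumes the waveform is already available as a `double[]` of equally spaced samples, with `dt` being one over the sample rate; the class and method names are invented for this illustration.

```java
final class SignalDerivatives {
    /** Forward difference (f[i+1] - f[i]) / dt; the result has one element fewer than the input. */
    static double[] firstDerivative(double[] samples, double dt) {
        if (samples.length < 2) {
            return new double[0];
        }
        double[] result = new double[samples.length - 1];
        for (int i = 0; i < result.length; i++) {
            result[i] = (samples[i + 1] - samples[i]) / dt;
        }
        return result;
    }

    /** Second derivative, approximated by differencing twice. */
    static double[] secondDerivative(double[] samples, double dt) {
        return firstDerivative(firstDerivative(samples, dt), dt);
    }

    public static void main(String[] args) {
        // Hypothetical samples of f(t) = t^2 at t = 0, 1, 2, 3.
        double[] samples = {0, 1, 4, 9};
        System.out.println(java.util.Arrays.toString(firstDerivative(samples, 1.0)));
        System.out.println(java.util.Arrays.toString(secondDerivative(samples, 1.0)));
    }
}
```

A long run of near-zero first differences would correspond to a "monotonous" stretch, while large values would mark a drastic change. Real audio is noisy, so some smoothing or windowing beforehand is probably needed.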
However, if I'm not mistaken, audio files can also have spectrograms, so I'm unsure how they fall into all of this.
So, the real question goes here: Is there a way to do this in Java (efficiently)? I've been doing some digging and I've found MusicG, however, the last update date is July 2012, which leads me to believe this may be abandoned.
The second part refers to video files, but without their audio component.
This is where I'll have more questions, so I'm just gonna go and shoot them.
How do you identify points of change in "pace" in videos?
Here's an example:
Imagine the video shows a car driver's point of view while he's driving
on a long, straight road. Since the surroundings are mostly the same,
the pace could be described as "not changing much". At one point, the
road begins to curve but the driver, due to him falling asleep, is not
following the road that precisely, so the surroundings start to change
somewhat, and so does the pace. At the apex of that curve there is a
tree, which grows bigger and bigger as the car is approaching it.
Here, the POV (and the pace) is changing quite a lot, since the tree
is getting bigger and bigger. In the end, the car crashes into the tree,
all hell breaks loose, the car starts to roll uncontrollably, which
indicates a really intense pace.
I'm assuming one way could be to do image segmentation and somehow determine which portions of the frames are changing, and how big those portions are, to try to determine pace, but I'd like additional input.
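One much simpler baseline than segmentation (named plainly: frame differencing, not the segmentation idea above) is to measure how much consecutive frames differ on average. The sketch below assumes each frame has already been decoded into an `int[][]` of grayscale values from 0 to 255; `FrameChange` and `meanAbsDifference` are hypothetical names.

```java
final class FrameChange {
    /** Mean absolute per-pixel difference between two grayscale frames of identical size. */
    static double meanAbsDifference(int[][] previous, int[][] current) {
        if (previous.length == 0 || previous.length != current.length) {
            throw new IllegalArgumentException("frames must be non-empty and the same height");
        }
        long sum = 0;
        long pixels = 0;
        for (int y = 0; y < previous.length; y++) {
            if (previous[y].length != current[y].length) {
                throw new IllegalArgumentException("frames must be the same width");
            }
            for (int x = 0; x < previous[y].length; x++) {
                sum += Math.abs(current[y][x] - previous[y][x]);
                pixels++;
            }
        }
        if (pixels == 0) {
            throw new IllegalArgumentException("frames must contain pixels");
        }
        return (double) sum / pixels;
    }

    public static void main(String[] args) {
        int[][] a = {{0, 0}, {0, 0}};
        int[][] b = {{10, 20}, {30, 40}};
        System.out.println(meanAbsDifference(a, b));
    }
}
```

Plotting this value over time gives a crude "pace" curve: low and flat on the straight road, rising as the scene starts to change.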
If anyone has had prior experience doing any sort of related work in Java, what approaches did you explore and/or use? One thing that immediately comes to my mind is JavaCV, but as I said, with my limited experience, I'm unsure what to actually try.
I'm trying to compare multiple algorithms that are used to smooth GPS data. I'm wondering what should be the standard way to compare the results to see which one provides better smoothing.
I was thinking of a machine learning approach: create a car model based on a classifier and check which tracks give better behaviour.
For the guys who have more experience on this stuff, is this a good approach? Are there other ways to do this?
Generally, there is no universally valid way for comparing two datasets, since it completely depends on the applied/required quality criterion.
For your approach
I was thinking on a machine learning approach. To create a car model
based on a classifier and check on which tracks provides better
behaviour.
this means that you will need to define your term "better behavior" mathematically.
One possible quality criterion for your application is as follows (it consists of two parts that express opposing quality aspects):
First part (deviation from raw data): Compute the RMSE (root mean squared error) between the smoothed data and the raw data. This gives you a measure of the deviation of your smoothed track from the given raw coordinates. This means that the error (RMSE) increases if you smooth more, and it decreases if you smooth less.
Second part (track smoothness): Compute the mean absolute lateral acceleration that the car will experience along the track (second derivative). This will decrease if you smooth more, and it will increase if you smooth less. I.e., it behaves opposite to the RMSE.
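The two criteria could be computed roughly as sketched below. This is an illustration under several assumptions that are not part of the answer: the track points are already projected to local x/y coordinates in meters, they are sampled at a constant interval `dt` in seconds, and the lateral acceleration is estimated with finite differences as |v × a| / |v|. The names `TrackQuality`, `rmse`, and `meanAbsLateralAcceleration` are made up.

```java
final class TrackQuality {
    /** RMSE of the point-wise distances between two tracks of equal length; each point is {x, y}. */
    static double rmse(double[][] smoothed, double[][] raw) {
        if (raw.length == 0 || smoothed.length != raw.length) {
            throw new IllegalArgumentException("tracks must be non-empty and of equal length");
        }
        double sum = 0;
        for (int i = 0; i < raw.length; i++) {
            double dx = smoothed[i][0] - raw[i][0];
            double dy = smoothed[i][1] - raw[i][1];
            sum += dx * dx + dy * dy;
        }
        return Math.sqrt(sum / raw.length);
    }

    /** Mean absolute lateral acceleration over the interior points of a uniformly sampled track. */
    static double meanAbsLateralAcceleration(double[][] track, double dt) {
        double sum = 0;
        int count = 0;
        for (int i = 1; i < track.length - 1; i++) {
            // Central difference for velocity, second difference for acceleration.
            double vx = (track[i + 1][0] - track[i - 1][0]) / (2 * dt);
            double vy = (track[i + 1][1] - track[i - 1][1]) / (2 * dt);
            double ax = (track[i + 1][0] - 2 * track[i][0] + track[i - 1][0]) / (dt * dt);
            double ay = (track[i + 1][1] - 2 * track[i][1] + track[i - 1][1]) / (dt * dt);
            double speed = Math.hypot(vx, vy);
            if (speed > 0) {
                // Component of the acceleration perpendicular to the direction of travel.
                sum += Math.abs(vx * ay - vy * ax) / speed;
            }
            count++;
        }
        return count == 0 ? 0 : sum / count;
    }

    public static void main(String[] args) {
        double[][] raw = {{0, 0}, {1, 1}, {2, 0}};
        double[][] smoothed = {{0, 0.3}, {1, 0.4}, {2, 0.3}};
        System.out.println("RMSE: " + rmse(smoothed, raw));
        System.out.println("Lateral acceleration: " + meanAbsLateralAcceleration(smoothed, 1.0));
    }
}
```

Each smoothing algorithm then yields one (RMSE, acceleration) pair, which is what gets compared across algorithms.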
Result evaluation:
(1) Find a sequence in your data where you know that the underlying GPS track is a straight line or where the tracked object is not moving. Note that for those tracks, the (lateral) acceleration is zero by definition(!).
For these, compute the RMSE and the mean absolute lateral acceleration.
The RMSE of approaches that have (almost) zero acceleration results from measurement inaccuracies!
(2) Plot the results in a coordinate system with the RMSE on the x axis and the mean acceleration on the y axis.
(3) Pick all approaches that have an RMSE similar to what you found in step (1).
(4) From those approaches, pick the one(s) with the smallest acceleration. Those give you the smoothest track with an error explained through measurement inaccuracies!
(5) You're done :)
I have no experience with this topic, but I have a few things in mind that may help you.
You know it is a car. You know that the data is generated by a car, so you can define a set of properties of a car. For example, if a car is moving at a speed above 50 km/h, then the angle of a corner should be at least 110 degrees. I am absolutely guessing with the values, but if you do a little research I am sure you will be able to define such properties. The next thing you can do is test how well each approximation fits the car properties and choose the best one.
Raw data. I assume you are testing all methods on a part of a given road. You can generate a "raw GPS track": a track that best fits the movement of a car. Google Maps may help you generate such a track, or some GPS device with higher accuracy. Then you measure the distance between each approximation and your generated track; the one with the minimum distance wins.
I think you can easily match the coordinates after converting them to addresses,
because an address has a street, area, and city, so you can easily match at different radii.
Try this link
Take a look at this paper that discusses comparing machine learning algorithms:
"Choosing between two learning algorithms
based on calibrated tests" available at:
http://www.cs.waikato.ac.nz/ml/publications/2003/bouckaert-calibrated-tests.pdf
Also check out this paper:
"Bayesian Comparison of Machine Learning Algorithms on Single and
Multiple Datasets" available at:
http://www.jmlr.org/proceedings/papers/v22/lacoste12/lacoste12.pdf
Note: It is noted from the question that you are looking into the best way to compare the results for machine learning algorithms and are not looking for additional machine learning algorithms that may implement this feature.
Machine learning is not a well-suited approach for that task; you would have to define what good smoothing is...
In principle, your task cannot be solved by an algorithm that gives a general answer, because every smoothing destroys the original data by some amount and adds invented positions, and different systems/humans that use the smoothed data react differently to that changed data.
The question is: What do you want to achieve with smoothing?
Why do you need smoothing? (Have you forgotten to implement or enable a standstill filter that eliminates movement while the vehicle is standing still? GPS positions tend to jump around during a standstill.)
The GPS chip already has built-in (best possible?) real-time smoothing using a Kalman filter. On the one hand it has more information than a post-processing smoothing algorithm; on the other hand it has less.
So next you have to ask yourself: are you comparing post-processing smoothing algorithms or real-time algorithms? (Probably post-processing.) Comparing a real-time smoothing algorithm with a post-processing smoothing algorithm is not fair.
Again: what do you expect from smoothed data? That they look somewhat fine but unrealistic, like photoshopped models in TV advertisements?
What is good smoothing? Closeness to the real vehicle position, which nobody ever knows, or a curve with low acceleration?
I would prefer a smoothing algorithm that produces the curve nearest to the real (usually unknown) vehicle trajectory.
Or you might just think it should somehow look beautiful. In that case, overlay the curves in different colors, display them on a satellite image map, and let a team of humans (experts who at least own and drive a car) decide what looks good and realistic.
We humans have the best multi purpose pattern matching algorithm built in.
Again why smooth?: for display in a map to please humans that look at that map?
or to use the smoothed tracks to feed other algorithms that have problems with the original data?
To please humans I have given an answer above.
To please other algorithms:
What do they need? Nearer positions? Or a better course value / direction between points?
What attributes do you want to smooth: only the latitude, longitude coordinates, or also the speed value, and course value?
I have much professional experience with GPS tracks, and I recommend just removing every location recorded at under 7 km/h and keeping the rest as it is. In most cases there is no need for further smoothing.
Otherwise it gets expensive:
A possible solution:
1) You obtain a 2000€ reference GPS receiver delivered with a magnetic vehicle roof antenna (e.g. company Hemisphere 2000 GPS receiver) and use that as the reference.
2) You use a consumer GPS of the kind usually used for your task (smartphone, etc.).
With both mounted inside the car, drive some test tracks in good conditions (highways), but more tracks in very bad ones: sharp curves combined with big houses left and right. Also drive through tunnels, a straight one and a curved one, if you have them.
3) Apply the smoothing algorithms to the consumer GPS tracks.
4) Compare the smoothed track to the reference track by matching pairs of positions, and finally calculate the RMSE (root mean squared error).
Difficulties
Matching two positions: hopefully the timestamps can be matched exactly, which is usually not the case (an offset of 0.5 s is possible).
Think about what you do when you have a GPS outage.
Consider first to display a raw track and identify what kind of unsmoothed data is not suitable/ nice looking. (Probably later posting the pics here)
What about using the good old Kalman filter?
I've been writing a program to graph data points in Java. I need a lot of flexibility and speed, so I want to avoid using an existing library as much as possible. Right now, it essentially uses Graphics2D to draw lines and dots representing the points in a file of data.
My problem is that some of my datasets have upwards of 100,000 points. When the graph is rendered with full drag/zoom functionality, it gets quite slow.
My question is, how can I reduce this dataset or make a simplification of it so that I can display the graph without changing the overall shape?
I could only draw every third point, for instance, but what if that skipped over and didn't display an important outlier? I could try averaging groups of points, but that could have the same problem.
And for services like Google Finance, where they probably have millions of points to display, how do they deal with this?
You may want to check the difference in value between points before rendering them. Define a threshold: a point that stays within the threshold of the last rendered point does not need to be rendered again.
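One possible reading of this suggestion is sketched below: walk through the values in order and keep a point only when it differs from the last kept point by more than a threshold, so flat stretches collapse while outliers survive. The class name `PointThinner`, the method `thin`, and the choice to compare y values only are assumptions made for the sake of the example.

```java
import java.util.ArrayList;
import java.util.List;

final class PointThinner {
    /** Returns the indices of the points worth drawing; the first and last point are always kept. */
    static List<Integer> thin(double[] ys, double threshold) {
        List<Integer> kept = new ArrayList<>();
        if (ys.length == 0) {
            return kept;
        }
        kept.add(0);
        double lastKept = ys[0];
        for (int i = 1; i < ys.length; i++) {
            // Keep the point only if it moved noticeably away from the last drawn value.
            if (Math.abs(ys[i] - lastKept) > threshold) {
                kept.add(i);
                lastKept = ys[i];
            }
        }
        int lastIndex = ys.length - 1;
        if (kept.get(kept.size() - 1) != lastIndex) {
            kept.add(lastIndex);
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] ys = {0, 0.1, 0.05, 5, 0.1, 0.0};
        System.out.println(thin(ys, 1.0));
    }
}
```

The threshold would probably need to follow the zoom level, for example the data range covered by one pixel, so that zooming in reveals detail again.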
What would be the best algorithm in terms of speed for locating an object in a field?
The field consists of 18 by 18 squares with side length 30.48 cm. The robot is placed in the square (0,0) and its job is to reach the light source while avoiding obstacles along the way. To locate the light source, the robot does a 360 degree turn to find the angle with the highest light reading and then travels towards the source. It can reliably detect a light source from 100 cm.
The way I'm implementing this presently is by storing the information about each tile in a 2D array. The possible values of the tiles are unexplored (default), blocked (there's an obstacle), and empty (there's nothing there). I'm thinking of using the DFS algorithm where the children are at position (i+3,j) or (i,j+3). However, considering the fact that I will be doing a rotation to locate the angle with the highest light reading at each child, I think there may be an algorithm that can locate the light source faster than DFS. Also, I will only be travelling in the x and y directions, since the robot will be using the grid lines on the floor to make corrections to its x and y positions.
I would appreciate it if a fast and reliable algorithm could be suggested to accomplish this task.
This is a really broad question, and I'm not an expert so my answer is based on "first principles" thinking rather than experience in the field.
(I'm assuming that your robot has generally unobstructed line of sight and movement; i.e. it is an open area with scattered obstacles, not in a maze.)
The problem is interpreting the information that you get back from a 360 degree scan.
If the robot sees the light source, then traversing a route to the light source is either trivial, or a "simple" maze walking task.
The difficulty is when you don't see the source. It might mean that the source is not within the circle of visibility. But it could also mean that the light is behind an obstacle. And unfortunately, a simple sensor like you are describing cannot distinguish these two cases.
If your sensor system allowed you to see the obstacles, you could plot the locations of the "shadow" regions (regions behind obstacles), and use that to keep track of the places that are left to search. So your strategy would be to visit a small number of locations and do a scan at each, then methodically "tidy up" a small number of areas that were in shadow.
But since you cannot easily tell where the shadow areas are, you need an algorithm that (ultimately) searches everywhere. DFS is a general strategy that searches everywhere, but it does it by (in effect) looking in the nooks and crannies first. A better strategy is to do a breadth-first search, and only visit the nooks and crannies if the wide-scale scans didn't find the light source.
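A bare-bones breadth-first traversal over the tile grid might look like the sketch below. It is only an illustration: the `Tile` enum mirrors the three states from the question, the robot is assumed to move one tile at a time in the x and y directions, and the map is assumed to be known up front, whereas the real robot would discover blocked tiles and run its light scans while moving. `GridSearch` and `breadthFirstOrder` are invented names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

final class GridSearch {
    enum Tile { UNEXPLORED, BLOCKED, EMPTY }

    /** Returns the reachable tiles as {row, col} pairs in breadth-first order from the start tile. */
    static List<int[]> breadthFirstOrder(Tile[][] grid, int startRow, int startCol) {
        int rows = grid.length;
        int cols = grid[0].length;
        boolean[][] seen = new boolean[rows][cols];
        List<int[]> order = new ArrayList<>();
        Queue<int[]> queue = new ArrayDeque<>();
        queue.add(new int[] {startRow, startCol});
        seen[startRow][startCol] = true;
        int[][] moves = {{1, 0}, {0, 1}, {-1, 0}, {0, -1}}; // axis-aligned moves only
        while (!queue.isEmpty()) {
            int[] current = queue.remove();
            order.add(current); // a light scan would happen here
            for (int[] move : moves) {
                int r = current[0] + move[0];
                int c = current[1] + move[1];
                if (r < 0 || c < 0 || r >= rows || c >= cols) {
                    continue;
                }
                if (seen[r][c] || grid[r][c] == Tile.BLOCKED) {
                    continue;
                }
                seen[r][c] = true;
                queue.add(new int[] {r, c});
            }
        }
        return order;
    }

    public static void main(String[] args) {
        Tile[][] grid = new Tile[18][18];
        for (Tile[] row : grid) {
            java.util.Arrays.fill(row, Tile.UNEXPLORED);
        }
        grid[0][1] = Tile.BLOCKED;
        System.out.println("Tiles visited: " + breadthFirstOrder(grid, 0, 0).size());
    }
}
```

Note that a physical robot cannot jump between queue entries, so the travel cost between consecutive scan positions matters; this sketch only shows the visiting order, not the driving path.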
I would appreciate it if a fast and reliable algorithm could be suggested to accomplish this task.
I think you are going to need to develop one yourself. (Isn't this the point of the problem / task / competition?)
Although it may not seem like it, this looks more like a maze following problem than anything else. I suppose this is some kind of challenge or contest situation, where there's always a path from start to target, but suppose for a moment that there isn't. One of the successful results for a robot navigating to a beacon fully surrounded by obstacles would be a report with a description of a closed path of obstacles surrounding a signal. If there's no such closed path, then you can find a hole somewhere; this is why it looks like maze following.
So the basic algorithm I'd choose is to start with an inward-spiraling traversal, sweeping out a path narrow enough that you're sure to see a beacon if one is present. If there are no obstacles (a degenerate case), this finds the target in minimal time. (Hint: each turn reduces the number of cells your sensor can locate per step.)
Take the spiral traversal to be counter-clockwise. What you have then is related to the rule for solving mazes by keeping your right hand on the wall and following the generated path. In this case, you have the complication that, while the start of the maze is on the boundary, the end may not be. It's possible for the right-hand-touching path to fail in such a situation. Detecting this situation requires looking for "cavities" in the region swept out by adjacency to the wall.