How to do perspective fixing? - java

I'm searching for a fast way to fix the perspective of a given picture, in Java or any other language. Currently I really don't have any idea how to do it, nor can I find anything useful on Google.
Input:
Point[4], Color[][]
Output:
Perspective-Fixed Color[][]
By perspective fixing, I mean the kind of correction Photoshop does.
I'd appreciate it if you could explain how the code works, since I want to understand the logic.

The simple solution is to just remap coordinates from the original to the final image, copying pixels from one coordinate space to the other and rounding off as necessary -- which may result in some pixels being copied several times adjacent to each other, and other pixels being skipped, depending on whether you're stretching or shrinking (or both) in either dimension. Make sure your copy loop iterates through the destination space, so every destination pixel is covered even if some are painted more than once, rather than through the source, which may leave gaps in the output.
The better solution involves calculating the corresponding source coordinate without rounding, and then using its fractional position between pixels to compute an appropriate average of the (typically) four pixels surrounding that location. This is essentially a filtering operation, so you lose some resolution -- but the result looks a LOT better to the human eye; it does a much better job of retaining small details and avoids creating straight-line artifacts which humans find objectionable.
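To make the second approach concrete, here is a minimal Java sketch of the inverse-mapping loop with bilinear sampling. It assumes you have already computed a 3x3 homography h that maps destination coordinates back to source coordinates (deriving h from the four point correspondences is a separate step):

    import java.awt.image.BufferedImage;

    public class PerspectiveWarp {
        // Warp 'src' into a w×ht output using a 3×3 homography 'h' that maps
        // *destination* coordinates back into *source* coordinates.
        static BufferedImage warp(BufferedImage src, double[][] h, int w, int ht) {
            BufferedImage dst = new BufferedImage(w, ht, BufferedImage.TYPE_INT_RGB);
            for (int y = 0; y < ht; y++) {
                for (int x = 0; x < w; x++) {
                    // Project the destination pixel into source space.
                    double d  = h[2][0] * x + h[2][1] * y + h[2][2];
                    double sx = (h[0][0] * x + h[0][1] * y + h[0][2]) / d;
                    double sy = (h[1][0] * x + h[1][1] * y + h[1][2]) / d;
                    int x0 = (int) Math.floor(sx), y0 = (int) Math.floor(sy);
                    if (x0 < 0 || y0 < 0 || x0 + 1 >= src.getWidth() || y0 + 1 >= src.getHeight())
                        continue; // falls outside the source: leave the pixel black
                    double fx = sx - x0, fy = sy - y0;
                    // Average the four surrounding source pixels (bilinear filtering).
                    int rgb = blend(src.getRGB(x0, y0), src.getRGB(x0 + 1, y0),
                                    src.getRGB(x0, y0 + 1), src.getRGB(x0 + 1, y0 + 1), fx, fy);
                    dst.setRGB(x, y, rgb);
                }
            }
            return dst;
        }

        // Per-channel bilinear blend of four packed RGB pixels.
        static int blend(int p00, int p10, int p01, int p11, double fx, double fy) {
            int out = 0;
            for (int shift = 0; shift <= 16; shift += 8) {
                double top = ((p00 >> shift) & 0xFF) * (1 - fx) + ((p10 >> shift) & 0xFF) * fx;
                double bot = ((p01 >> shift) & 0xFF) * (1 - fx) + ((p11 >> shift) & 0xFF) * fx;
                out |= ((int) Math.round(top * (1 - fy) + bot * fy)) << shift;
            }
            return out;
        }
    }

Iterating over the destination and mapping backwards is exactly what guarantees every output pixel gets a value.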
Note that the same basic approach can be used to remap flat images onto any other shape, including 3D surface mapping.

Related

Improvement Suggestion for Color Measurement Algorithm

I've been working on an image-processing project for a while; it measures and classifies some types of sugar on the production line by its color. Until now, my biggest concern was finding and implementing appropriate mathematical techniques to calculate the distance between two colors (a reference color and the color being analysed), and then turning this value into something more meaningful, such as an industry-standard measure.
From this, I'm trying to figure out how I should reliably extract the average color value from an image, since the frame captured by the video camera may contain noise or dirt in the sugar (most likely near-black dots).
Language: Java with OpenCV library.
Current solution: before taking the average image value, I apply the fastNlMeansDenoisingColored function provided by OpenCV. It removes some white dots, at the cost of fine detail, but couldn't remove the black dots (not shown in the following images).
From there, I use the org.opencv.core.Core.mean function to compute the mean value of the array elements independently for each channel, so that I have a scalar value to use in my calculations.
I tried several kinds of image-thresholding filters to get rid of the black and white dots and then calculated the mean with a mask; that kind of works too. I also looked for a weighted-average function that could return scalar values, but without success.
I don't know whether these pre-processing techniques are robust enough for such an application, since the mean values can vary easily. Am I on the right track? Would you suggest a better way to get a reliable value that represents my sugar's color?
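As an illustration of the threshold-then-masked-mean idea, here is a minimal Java/OpenCV sketch; the inRange bounds (30 and 225) are assumed values meant to drop near-black dirt and near-white specks, and would need tuning for your camera:

    import org.opencv.core.Core;
    import org.opencv.core.Mat;
    import org.opencv.core.Scalar;
    import org.opencv.imgcodecs.Imgcodecs;
    import org.opencv.photo.Photo;

    public class SugarColor {
        static { System.loadLibrary(Core.NATIVE_LIBRARY_NAME); }

        public static void main(String[] args) {
            Mat frame = Imgcodecs.imread("sugar.png"); // hypothetical input frame
            Mat denoised = new Mat();
            Photo.fastNlMeansDenoisingColored(frame, denoised);

            // Keep only mid-range pixels: drop near-black (dirt) and
            // near-white (specular) dots. Bounds are assumed; tune per setup.
            Mat mask = new Mat();
            Core.inRange(denoised, new Scalar(30, 30, 30), new Scalar(225, 225, 225), mask);

            // Per-channel mean restricted to the masked region.
            Scalar mean = Core.mean(denoised, mask);
            System.out.println("Mean BGR: " + mean);
        }
    }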

Image processing - OpenCV, Identifying digits

I am new to image processing and to OpenCV in particular.
I am working on an OCR project in which I need to identify numbers.
This is my image to process:
Let's say I've already optimized the image; my questions are:
In the image, the numbers always appear several times. Let's say I've found the contours; how can I know which one is the best one to process?
How can I know at what angle I need to rotate each contour to make it straight?
In the image, the numbers always appear several times. Let's say I've found the contours; how can I know which one is the best one to process?
You always want the biggest number, because it is the least warped by perspective. So you always want the numbers in the middle of the image, because they are also in the middle of the ball.
How can I know at what angle I need to rotate each contour to make it straight?
Have a look at RotatedRect. I explained how to find the angle in this thread.
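For illustration, a small Java/OpenCV sketch of that idea: fit a minimum-area rectangle to the contour, read its angle, and deskew with warpAffine. The angle normalization shown is one common convention, not the only one:

    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    public class Deskew {
        // Straighten the region around one contour using its minimum-area rectangle.
        static Mat straighten(Mat src, MatOfPoint contour) {
            RotatedRect box = Imgproc.minAreaRect(new MatOfPoint2f(contour.toArray()));
            double angle = box.angle;
            if (angle < -45) angle += 90; // normalize OpenCV's (-90, 0] angle range
            Mat rot = Imgproc.getRotationMatrix2D(box.center, angle, 1.0);
            Mat dst = new Mat();
            Imgproc.warpAffine(src, dst, rot, src.size());
            return dst;
        }
    }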
Since you always have a perfectly centered ball, you should think about using a mapping to "unwarp" your ball (i.e., a projection like the one from a globe onto a flat map). Afterwards it should be pretty straightforward to find the numbers on the flat image.
Edit: Since you only have 10 numbers you might also "brute force" the solution with a big enough training set. So just throw all numbers you detect into a classifier and keep the most likely solution.
1) I agree with @Sebastian on the first part. Exploit the fact that in your scenario the numbers are placed on the surface of a ball, so first select the blobs inside a centered region of interest.
2) The contours shown in the image are not rotated (the numbers are). Instead of "rotating" these bounding boxes, which seems to be quite a headache, I'd rather use them combined with rotation invariant keypoints. I'll clarify this:
a) You know where your numbers are, so you don't have to search in the entire image. OK, keep these already selected regions in mind.
b) You can take "straight" samples of the numbers 0-9 and use them as ground truth.
c) You can perform a matching search between each "ground truth" image and each candidate region. Now, forget the scale/rotation: use scale/rotation invariant keypoints! Something like this:
Again, notice that you have already selected the region of interest, so in your case the search will consist of checking the number of matches (the number of blue lines) between each of the registered numbers and your candidate. I think it's worth a try! :)
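As a sketch of that match-counting step, here is one way to do it in Java/OpenCV using ORB (a free scale/rotation-invariant detector) with a brute-force Hamming matcher; the distance cutoff of 50 is an assumed value:

    import org.opencv.core.*;
    import org.opencv.features2d.DescriptorMatcher;
    import org.opencv.features2d.ORB;

    public class DigitMatcher {
        // Count "good" keypoint matches between a straight ground-truth digit
        // image and a candidate region.
        static int countMatches(Mat truth, Mat candidate) {
            ORB orb = ORB.create();
            MatOfKeyPoint kp1 = new MatOfKeyPoint(), kp2 = new MatOfKeyPoint();
            Mat d1 = new Mat(), d2 = new Mat();
            orb.detectAndCompute(truth, new Mat(), kp1, d1);
            orb.detectAndCompute(candidate, new Mat(), kp2, d2);
            if (d1.empty() || d2.empty()) return 0; // no features found

            DescriptorMatcher matcher =
                    DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
            MatOfDMatch matches = new MatOfDMatch();
            matcher.match(d1, d2, matches);

            // Keep only reasonably close matches; 50 is an assumed cutoff.
            int good = 0;
            for (DMatch m : matches.toArray()) if (m.distance < 50) good++;
            return good;
        }
    }

The candidate with the highest count of good matches against a given digit template would then be the one to keep.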
You can find more info on the different keypoints available in OpenCV here.
Hope that it helps!

Need Conceptual Help Rendering a Heat Map

I need to create a heatmap for android google maps. I have geolocation and points that have negative and positive weight attributed to them that I would like to visually represent. Unlike the majority of heatmaps, I want these positive and negative weights to destructively interfere; that is, when two points are close to each other and one is positive and the other is negative, the overlap of them destructively interferes, effectively not rendering areas that cancel out completely.
I plan on using the Android Google Maps TileOverlay/TileProvider classes, whose job is to create/render tiles for a given location and zoom. (I don't have an issue with this part.)
How should I go about rendering these tiles? I plan on using Java's Graphics class, but the best that I can think of is going through each pixel, calculating what color it should be based on the surrounding data points, and rendering that pixel. This seems very inefficient, however, and I was looking for suggestions on a better approach.
Edit: I've considered everything from using a non-Android Google Map inside a WebView, to using a TileOverlay, to using a GroundOverlay. What I am now considering is having a large two-dimensional array of "squares". Each square would have a longitude, a latitude, and a total +/- weight. When a new data point is added, instead of rendering it exactly where it is, it is added to the "square" it falls in. The data point's weight is added to the square's total, and I then use the GoogleMap Polygon object to render the square on the map. The ratio of + points to - points determines the rendered color, with a ratio close to 1:1 being clear, >1 blue (a cold point), and <1 red (a hot point).
Edit: a.k.a. clustering the data into small regional groups
I suggest trying
going through each pixel, calculating what color it should be based on the surrounding data points, and rendering that pixel.
Even if it is slow, it will work. There are not too many tiles on the screen, there are not too many pixels in each tile, and all of this is done on a background thread.
All of this is still followed by translating the Bitmap into a byte[]. The byte[] is a representation of a PNG or JPG file, so it's not a simple pixel dump of the Bitmap. This last operation takes some time too, and may well require more processing power than your whole algorithm.
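For reference, a minimal sketch of that per-pixel tile loop, including the Bitmap-to-byte[] encoding step; weightAt() and colorFor() are hypothetical helpers standing in for your data lookup and color ramp:

    import android.graphics.Bitmap;
    import android.graphics.Color;
    import com.google.android.gms.maps.model.Tile;
    import com.google.android.gms.maps.model.TileProvider;
    import java.io.ByteArrayOutputStream;

    public class HeatTileProvider implements TileProvider {
        private static final int SIZE = 256;

        @Override
        public Tile getTile(int x, int y, int zoom) {
            Bitmap bmp = Bitmap.createBitmap(SIZE, SIZE, Bitmap.Config.ARGB_8888);
            for (int px = 0; px < SIZE; px++) {
                for (int py = 0; py < SIZE; py++) {
                    // Positive and negative influences cancel inside weightAt().
                    bmp.setPixel(px, py, colorFor(weightAt(x, y, zoom, px, py)));
                }
            }
            // Encode the Bitmap to PNG bytes; as noted above, this step is
            // itself a significant part of the cost.
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            bmp.compress(Bitmap.CompressFormat.PNG, 100, out);
            return new Tile(SIZE, SIZE, out.toByteArray());
        }

        // Stub: sum the +/- weights of data points near this pixel.
        private double weightAt(int tileX, int tileY, int zoom, int px, int py) {
            return 0.0;
        }

        // Near-cancelled areas stay transparent; 0.05 is an assumed cutoff.
        private int colorFor(double w) {
            if (Math.abs(w) < 0.05) return Color.TRANSPARENT;
            return w > 0 ? Color.argb(128, 255, 0, 0) : Color.argb(128, 0, 0, 255);
        }
    }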
Edit (moved from comment):
What you describe in the edit sounds like simple clustering on LatLng. I can't say whether it's a better or worse idea, but it's something worth a try.
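If you go the clustering route, the bookkeeping can be as simple as this sketch, where CELL is an assumed cell size in degrees:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical bucketing of weighted points into fixed-size lat/lng squares.
    class WeightGrid {
        static final double CELL = 0.01; // assumed cell size in degrees
        final Map<Long, Double> totals = new HashMap<>();

        void add(double lat, double lng, double weight) {
            long row = Math.round(lat / CELL);
            long col = Math.round(lng / CELL);
            long key = (row << 32) ^ (col & 0xffffffffL);
            // Positive and negative weights cancel within a cell.
            totals.merge(key, weight, Double::sum);
        }
    }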

Find and Crop relevant image area automatically

We are trying to crop the relevant area of an image (photo) with a square aspect ratio (1:1), similar to what Facebook does when creating thumbnails.
In our case, it doesn't really matter whether the crop keeps the original height (or width, when the image orientation is portrait, h > w) of the image being processed, or whether the auto-crop resizes the result as well.
I'm thinking of algorithms that compare objects against the background, or measure focus, or something like a heat map combining colors and/or areas to find the most relevant part of the image. There could be several ideas/methods for finding the main part of the image, similar to face detection.
We are looking for a Java (Android)-based solution or anything that can be adapted for Java / Android. Any help or idea would be greatly appreciated! Thank you!
I would do this in two steps, where the initial step is more robust and the second could be based on, for example, entropy. For the first step, you can use SURF, which is relatively common nowadays, and I would expect to find Java implementations of it. SURF will give you a set of key points that it considers important for describing your image. From where these key points are in your image, you get a set of (x, y) coordinates which you can use to reduce the initial image to the area that encloses this set of points. Now, since these key points might be anywhere in your image, you will probably want to discard some of them (i.e., those that are too far from the others -- outliers). A very simple way to do this discarding step is to consider the convex hull of the initial set of key points; from there, you can peel this hull multiple times. Each time you "peel" it, you effectively discard the points on the current convex hull.
Here is a sample of this first step (the snippet below is Mathematica):
f = Import["http://fohn.net/duck-pictures-facts/mallard-duck.jpg"];
kp = ImageKeypoints[f, MaxFeatures -> 200];
Show[f, Graphics[{PointSize[Medium], Red, Point[kp]}]]
After peeling once the convex hull formed by the key points and trimming the image according to the bounding rectangle of the remaining points:
From the image above, you can decide which sub-region of it to pick based on some other method. One apparently common method is the one used by Reddit, which successively removes slices of lower entropy from the image. Quickly searching for it, I found one such implementation at https://github.com/christopherhan/pycrop/blob/master/pycrop.py#L33, and it is very simple.
Another method you might want to try is called seam carving. Also note that, depending on how large the initial image is, cropping a small piece of it may not give anything relevant. In those cases, it is more interesting to first resize the image and then apply the relevant methods.
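For the hull-peeling step, a rough Java/OpenCV sketch of one peel-and-crop iteration, assuming the key-point locations have already been extracted as a list of points:

    import java.util.*;
    import org.opencv.core.*;
    import org.opencv.imgproc.Imgproc;

    public class KeypointCrop {
        // One "peel": discard the key points lying on the current convex hull,
        // then crop the image to the bounding box of the remaining points.
        static Mat peelAndCrop(Mat image, List<Point> keypoints) {
            MatOfPoint pts = new MatOfPoint(keypoints.toArray(new Point[0]));
            MatOfInt hullIdx = new MatOfInt();
            Imgproc.convexHull(pts, hullIdx);

            Set<Integer> onHull = new HashSet<>();
            for (int i : hullIdx.toArray()) onHull.add(i);

            List<Point> inner = new ArrayList<>();
            for (int i = 0; i < keypoints.size(); i++)
                if (!onHull.contains(i)) inner.add(keypoints.get(i));

            Rect box = Imgproc.boundingRect(new MatOfPoint(inner.toArray(new Point[0])));
            return image.submat(box);
        }
    }

Calling this repeatedly peels the hull multiple times, as described above.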

Image caching and performance

I'm currently trying to improve the performance of a map-rendering library. In the case of point symbols, the library very often just draws the same image again and again at each location. The drawing process can be really complex, though, because the parametrization of the symbols is very rich. For each point, I have a tree structure that computes the image about to be drawn. When the parameters do not depend on the data I'm processing, as I said earlier, I just draw a complex symbol several times.
I've tried to implement a caching mechanism: I store the images that have already been drawn, and if I encounter a configuration that has already been met, I fetch the image and draw it again. The first test I made is with a very simple symbol: a circle whose outline and interior are both filled.
As I know the symbol will be the same at every location, I cache it and then draw it from the cached image. That works... but I face two important problems:
The quality of the drawn symbols is badly degraded.
More problematic: the time needed to render the map is really higher with caching than without. That's pretty disappointing for a cache ^_^
The core code when the caching mechanism is on is the following:
if (pc.isCached(map)) {
    // Cache hit: reuse the previously rendered symbol image.
    BufferedImage bi = pc.getCachedValue(map);
    drawCachedImageOnGeometry(g2, sds, fid, selected, mt, the_geom, bi);
} else {
    // Cache miss: render the symbol into an off-screen image,
    // draw it, then store it for later reuse.
    BufferedImage bi = g2.getDeviceConfiguration().createCompatibleImage(200, 200);
    Graphics2D tg2 = bi.createGraphics();
    graphic.draw(tg2, map, selected, mt, AffineTransform.getTranslateInstance(100, 100));
    drawCachedImageOnGeometry(g2, sds, fid, selected, mt, the_geom, bi);
    pc.cacheSymbol(map, bi);
}
The only interesting call made in drawCachedImageOnGeometry is
g2.drawRenderedImage(bi, AffineTransform.getTranslateInstance(x-100,y-100));
I've made some attempts to use VolatileImage instances rather than BufferedImage... but that causes deeper problems (I've not been able to ensure the image is rendered correctly each time it is needed).
I've done some profiling too, and it appears that when using my cache, the operations that take the longest are the rendering operations made in AWT.
That said, I guess my strategy is wrong... Consequently, my questions are :
Is there any efficient way to achieve the goal I've explained?
More precisely, would it be faster to store the AWT instructions used to draw my symbols and replay them with a translation as needed? I assume it may be possible to retrieve the "commands" used to build the symbol... I didn't find much information about that on the web, though... If it is possible, that would save me the computation time for the symbol (which can be really complex, as said earlier) while preserving the quality of my symbols.
Thanks in advance for all the informations and resources you'll give me :-)
Agemen.
EDIT: Here are some details about the graphics that can be rendered. According to the symbology model I'm implementing, graphics can be really simple (i.e., a filled square with its outline) as well as really complex (a label whose outline and fill are both drawn with hatching, for instance, possibly even with a halo around it). I want to use a cache because I'm sure that in most configurations I'll be able to:
differentiate the parameters that have been used to draw two different symbols from the same source, styled with the same style.
be sure that two sources with the same parameters (location excepted) will produce the same symbol for the same style, just at two different locations (only a translation will be needed).
Because of these two points, caching seems to be a good strategy. Moreover, there may be thousands of duplicated symbols to be drawn in the same image.
You are awfully vague about what kind of operations your drawing really entails, so all I can give you are some very general pointers.
1.) Drawing a pre-rendered image is not necessarily faster than drawing the same content using Graphics2D operations. It depends a lot on the complexity required to draw the image. As an extreme case, consider fillRect() vs. drawImage() with an Image containing the pre-rendered rectangle (fillRect just writes the destination pixels, whereas drawImage also needs to copy from a source).
2.) In most cases you never want to mess with VolatileImage directly. BufferedImage takes advantage of VolatileImage automatically unless you mess with the Image DataBuffer. If you have many pre-rendered images you may also run out of accelerated video memory and that degrades image drawing performance.
3.) On-the-fly scaling/rotating etc. of a pre-rendered image can be pretty costly (depending on the platform and current graphics transformations).
4.) The 'compatible image' you create may not really be compatible with the drawing target. You obtain an image compatible with the default screen device, which may not be compatible with the actual target in a multi-monitor setup. You may get better results using the actual target component's createImage() (see the sketch after this list).
EDIT:
5.) Translating the coordinates of a rendering operation may alter the destination pixels produced. An obvious case is when the coordinates are non-integers (either in the coordinates themselves or indirectly through the AffineTransform set on the graphics). Also, antialiasing of text and possibly other primitives may be influenced slightly by coordinates (subpixel rendering comes to mind).
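Point 4 as code, assuming targetComponent is the component actually being painted (types are from java.awt and java.awt.image):

    // Derive the off-screen image from the actual paint target instead of
    // the default screen device ('targetComponent' is an assumed reference).
    GraphicsConfiguration gc = targetComponent.getGraphicsConfiguration();
    BufferedImage bi = gc.createCompatibleImage(200, 200, Transparency.TRANSLUCENT);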
You could try an approach that differentiates between symbols that are presumably fast to render and those that are slow: render the fast ones directly, and cache the slow ones. The main problem here is deciding which ones are fast or slow; I expect this to be non-trivial.
Also, when you say there are thousands of symbols to be rendered, I imagine most of them should be clipped away, since only a small portion of the graph fits into a window/frame? If that's the case, don't bother much with caching. Drawing operations that are completely outside the current clip bounds are relatively cheap -- all the graphics target really does for them is detect that they are completely invisible, and then do nothing. If the goal is to produce an image to be saved to disk or printed, I wouldn't bother much with speeding up the rendering, since this is a relatively rare operation and the actual printing may by far exceed the time needed to render the graph anyway.
If none of the above applies to your case, be careful that your cache does not use more time/memory deciding whether a cached version exists than it actually saves in rendering time. You also need to take into account that building a cached image, instead of rendering directly to the target, costs you some time if that image is never reused. Caching can only gain you speed if the image is reused at least once, preferably many times.
If you build your symbols by combining primitive rendering-operation objects (say there is a Rectangle, a Halo and a Text rendering subclass), you may want to assign each of them a cost indicator and cache only those symbols that exceed some (to be determined) cost threshold. It may also be a good idea to implement hashCode() for each primitive operation, and for the symbol itself, for fast(er) equality checks (see the sketch below).
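A skeletal example of such a cost-gated cache with an explicit parameter key; the key's fields, the cost threshold and the off-screen rendering are placeholders for whatever actually parametrizes your symbols:

    import java.awt.image.BufferedImage;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Objects;

    // Hypothetical immutable key describing how a symbol is drawn.
    final class SymbolKey {
        final String type;   // e.g. "circle", "hatchedLabel"
        final int size;
        final int fillRgb;

        SymbolKey(String type, int size, int fillRgb) {
            this.type = type; this.size = size; this.fillRgb = fillRgb;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof SymbolKey)) return false;
            SymbolKey k = (SymbolKey) o;
            return size == k.size && fillRgb == k.fillRgb && type.equals(k.type);
        }

        @Override public int hashCode() { return Objects.hash(type, size, fillRgb); }
    }

    final class SymbolCache {
        private static final int COST_THRESHOLD = 10; // assumed tuning value
        private final Map<SymbolKey, BufferedImage> cache = new HashMap<>();

        // Returns a cached image for expensive symbols, or null to signal
        // "cheap enough, draw it directly".
        BufferedImage imageFor(SymbolKey key, int cost) {
            if (cost < COST_THRESHOLD) return null;
            return cache.computeIfAbsent(key, k -> renderOffscreen(k));
        }

        private BufferedImage renderOffscreen(SymbolKey k) {
            // Placeholder: render the symbol into a compatible off-screen image.
            return new BufferedImage(200, 200, BufferedImage.TYPE_INT_ARGB);
        }
    }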
