Java image library - turn grid image into array - java

If I have an image of a table of boxes, with some coloured in, is there an image processing library that can help me turn this into an array?
Thanks

You can use a thresholding function to binarize the image into dark/light pixels so dark pixels are 0 and light ones are 1.
Then you would want to remove image artifacts using dilation and erosion functions to remove noise (all these are well defined on Wikipedia).
Finally if you know where the boxes are, you can just get the value in the center of each box to determine the array value, or possibly use an area near the center and take the prevailing value (i.e. more 0's is a filled in square, more 1's is and empty square).
If you are scanning these boxes and there is a lot of variation in the position of the boxes, you will have to perform some level of image registration using known points, or fiducials.
As far as what tools to use to do this, I'd recommend first trying this manually using a tool like ImageJ, which has a UI and can also be used programatically since it is written all in Java.
Other good libraries for this include OpenCV and the Java Advanced Imaging API.
Your results will definitely vary depending on the input images and how consistenly lit and positioned they are.
The best way to see how it will do for your data is to try applying these processing steps manually to see where your threshold value should be, how much dilating/eroding you need to get consistent results.

Related

Improvement Suggestion for Color Measurement Algorithm

I'm working on an Image Processing project for a while, which consists in a way to measure and classify some types of sugar in the production line by its color. Until now, my biggest concern was searching and implementing the appropriate mathematical techniques to calculate distance between two colors (a reference color and the color being analysed), and then, turn this value into something more meaningful, as an industry standard measure.
From this, I'm trying to figure out how should I reliably extract the average color value from an image, once the frame captured by a video camera may contain noises or dirt in the sugar (most likely almost black dots).
Language: Java with OpenCV library.
Current solution: Before taking average image value, I'm applying the fastNlMeansDenoisingColored function, provided by OpenCV. It removes some white dots, at cost of more defined details. Couldn't remove black dots with it (not shown in the following images).
From there, I'm using the org.opencv.core.Core.mean function to computate the mean value of array elements independently for each channel, so that I can have a scalar value to use in my calculations.
I tried to use some kinds of image thresholding filters to get rid of black and white dots, and then calculate the mean with a mask, It kinda works too. Also, I tried to find any weighted average function which could return scalar values as well, but without success.
I don't know If those are robust enough pre-processing techniques to such application, mean values can vary easily. Am I in the right way? Would you suggest a better way to get reliable value that will represent my sugar's color?

Scalable clipping mask

I need to to clip variablesized images into puzzle shaped pices like this(not squares): http://www.fernando.com.ar/jquery-puzzle/
I have considered the posibility of doing this with a php library like Cairo or GD, but have little to no experience with these librays, and see no immidiate soulution for creating a clipping mask dynamicaly scalable for different sized images.
I'm looking for guidance/tips on which serverside programing language to use to accomplish this task, and preferably an approach to this problem.
You can create an image using GD with the size of the puzzle piece. and then copy the full image on that image with the right cropping to get the right part of the image.
Then you can just dynamically color in every part of the piece you want to remove with a distinct color (eg #0f0) and then use imagecolorallocatealpha to make that color transparent. Do it for each piece and you have your server side image pieces.
However, if I where you I would create the clipping mask of each puzzle peace in advance in the distinct color. That would make two images per connection (one with the "circle" connecter sticking out and one where this circle connector fits into). That way you can just copy these masks onto the image to create nice edges quickly.
GD is quite complicated, I've heard very good things about Image Magick for which there is a PHP version and lots of documentation on php.net. However, not all web servers would have this installed by default.
http://www.php.net/manual/en/book.imagick.php
If you choose to do it using PHP with GD then the code here may help:
http://php.amnuts.com/index.php?do=view&id=15&file=class.imagemask.php
Essentially what you need to do with GD is to start with a mask at a particular size and then use the imagecopyresampled function to copy the mask image resource to a larger or smaller size. To see what I mean, check out the _getMaskImage method class shown at the url above. A working example of the output can be seen at:
http://php.amnuts.com/demos/image-mask/
The problem with doing it via GD, as far as I can tell, is that you need to do it a pixel at a time if you want to achieve varying opacity levels, so processing a large image could take a few seconds. With ImageMagick this may not be the case.

Finding bounding box of text within JPG image

My question is similar to this one, but is more specific in scope.
In my card game application, I would like for users to be able to click on words located in a scanned jpeg image. Please see this sample Pokemon trading card.
In this case, the user should be able to hover his mouse over the text "Scratch", upon which a pulsing rectangular border will appear around the text, indicating that it is clickable. The problem is how to detect the border of the text. There will be an array of words KNOWN BEFOREHAND that the user may click on (these will be retrieved from a database on a card-by-card basis). To continue our example, the array in this case will be ["Scratch", "Live Coal"]. Once the user clicks on "Scratch", the application must know via a call-back that "Scratch" was chosen instead of "Live Coal".
I was thinking of using optical character recognition libraries to solve this problem, but the open-source options for this are poor in quality (e.g. GOCR) and/or not well-tested on multiple platforms (e.g. Tesseract). I only care about Windows and Mac compatibility. Am I missing an obvious/simpler solution/algorithm that does not require OCR? I cannot simply hand-code in bounding boxes for each card, as there will be thousands of scanned cards in my database. The user may also upload his own custom card scans with an accompanying array of clickable text.
Text color is not always black. See this panorama of different card and text styles that will be permitted. The black cards have white text, and the third-to-last card (Zekrom) has black text with a white outline.
Solutions in any programming language are appreciated. However, please note that I am looking for open-source algorithms and/or libraries. If there is a solution in Ruby or Java, even better, as my code is primarily in these two languages.
EDIT: I forgot to mention that the order of the words/phrases in the array will be the same as on the card. Thus, the array will be ["Scratch", "Live Coal"] instead of ["Live Coal", "Scratch"]. I am mentioning this because it can potentially simplify the task. Thus, for this example, I can simply look for black pixels (though I have to watch out for the black star in the white circle). However, there will be more difficult cases where there is descriptive text under the attack name in a smaller font (again, see the panorama for examples).
I would just write a program that allows you to visually draw a bounding box around your text for simplicity but could could do this buy detecting differences in pixel color. Since the text is black you could see where the upper-left most black pixel is without large indents and within the bottom half of the card.
When the cursor is stationary, check if there is a black pixel either underneath or to 4 pixels around the cursor. If it is, check the first three consecutive (because there still might be a non-black pixel between the letters) non-black pixels to the left of the cursor, to the right, to the top and at the bottom. If yes, use these locations to draw a square. You can use OpenCV.

Appending to an Image File

I have written a program that takes a 'photo' and for every pixel it chooses to insert an image from a range of other photos. The image chosen is the photo of which the average colour is closest to the original pixel from the photograph.
I have done this by firstly averaging the rgb values from every pixel in 'stock' image and then converting it to CIE LAB so i could calculate the how 'close' it is to the pixel in question in terms of human perception of the colour.
I have then compiled an image where each pixel in the original 'photo' image has been replaced with the 'closest' stock image.
It works nicely and the effect is good however the stock image size is 300 by 300 pixels and even with the virtual machine flags of "-Xms2048m -Xmx2048m", which yes I know is ridiculus, on 555px by 540px image I can only replace the stock images scaled down to 50 px before I get an out of memory error.
So basically I am trying to think of solutions. Firstly I think the image effect itself may be improved by averaging every 4 pixels (2x2 square) of the original image into a single pixel and then replacing this pixel with the image, as this way the small photos will be more visible in the individual print. This should also allow me to draw the stock images at a greater size. Does anyone have any experience in this sort of image manipulation? If so what tricks have you discovered to produce a nice image.
Ultimately I think the way to reduce the memory errors would be to repeatedly save the image to disk and append the next line of images to the file whilst continually removing the old set of rendered images from memory. How can this be done? Is it similar to appending a normal file.
Any help in this last matter would be greatly appreciated.
Thanks,
Alex
I suggest looking into the Java Advanced Imaging (JAI) API. You're probably using BufferedImage right now, which does keep everything in memory: source images as well as output images. This is known as "immediate mode" processing. When you call a method to resize the image, it happens immediately. As a result, you're still keeping the stock images in memory.
With JAI, there are two benefits you can take advantage of.
Deferred mode processing.
Tile computation.
Deferred mode means that the output images are not computed right when you call methods on the images. Instead, a call to resize an image creates a small "operator" object that can do the resizing later. This lets you construct chains, trees, or pipelines of operations. So, your work would build a tree of operations like "crop, resize, composite" for each stock image. The nice part is that the operations are just command objects so you aren't consuming all the memory while you build up your commands.
This API is pull-based. It defers computation until some output action pulls pixels from the operators. This quickly helps save time and memory by avoiding needless pixel operations.
For example, suppose you need an output image that is 2048 x 2048 pixels, scaled up from a 512x512 crop out of a source image that's 1600x512 pixels. Obviously, it doesn't make sense to scale up the entire 1600x512 source image, just to throw away 2/3 of the pixels. Instead, the scaling operator will have a "region of interest" (ROI) based on it's output dimensions. The scaling operator projects the ROI onto the source image and only computes those pixels.
The commands must eventually get evaluated. This happens in a few situations, mostly relating to output of the final image. So, asking for a BufferedImage to display the output on the screen will force all the commands to evaluate. Similarly, writing the output image to disk will force evaluation.
In some cases, you can keep the second benefit of JAI, which is tile based rendering. Whereas BufferedImage does all its work right away, across all pixels, tile rendering just operates on rectangular sections of the image at a time.
Using the example from before, the 2048x2048 output image will get broken into tiles. Suppose these are 256x256, then the entire image gets broken into 64 tiles. The JAI operator objects know how to work a tile at a tile. So, scaling the 512x512 section of the source image really happens 64 times on 64x64 source pixels at a time.
Computing a tile at a time means looping across the tiles, which would seem to take more time. However, two things work in your favor when doing tile computation. First, tiles can be evaluated on multiple threads concurrently. Second, the transient memory usage is much, much lower than immediate mode computation.
All of which is a long-winded explanation for why you want to use JAI for this type of image processing.
A couple of notes and caveats:
You can defeat tile based rendering without realizing it. Anywhere you've got a BufferedImage in the workstream, it cannot act as a tile source or sink.
If you render to disk using the JAI or JAI Image I/O operators for JPEG, then you're in good shape. If you try to use the JDK's built-in image classes, you'll need all the memory. (Basically, avoid mixing the two types of image manipulation. Immediate mode and deferred mode don't mix well.)
All the fancy stuff with ROIs, tiles, and deferred mode are transparent to the program. You just make API call on the JAI class. You only deal with the machinery if you need more control over things like tile sizes, caching, and concurrency.
Here's a suggestion that might be useful;
Try segregating the two main tasks into individual programs. Your first task is to decide which images go where, and that can be a simple mapping from coordinates to filenames, which can be represented as lines of text:
0,0,image123.jpg
0,1,image542.jpg
.....
After that task is done (and it sounds like you have it well handled), then you can have a separate program handle the compilation.
This compilation could be done by appending to an image, but you probably don't want to mess around with file formats yourself. It's better to let your programming environment do it by using a Java Image object of some sort. The biggest one you can fit in memory pixelwise will be 2GB leading to sqrt(2x10^9) maximum height and width. From this number and dividing by the number of images you have for height and width, you will get the overall pixels per subimage allowed., and can paint them into the appropriate places.
Every time you 'append' are you perhaps implicitly creating a new object with one more pixel to replace the old one (ie, a parallel to the classic problem of repeatedly appending to a String instead of using a StringBuilder) ?
If you post the portion of your code that does the storing and appending, someone will probably help you find an efficient way of recoding it.

Auto scale and rotate images

Given:
two images of the same subject matter;
the images have the same resolution, colour depth, and file format;
the images differ in size and rotation; and
two lists of (x, y) co-ordinates that correlate the images.
I would like to know:
How do you transform the larger image so that it visually aligns to the second image?
(Optional.) What are the minimum number of points needed to get an accurate transformation?
(Optional.) How far apart do the points need to be to get an accurate transformation?
The transformation would need to rotate, scale, and possibly shear the larger image. Essentially, I want to create (or find) a program that does the following:
Input two images (e.g., TIFFs).
Click several anchor points on the small image.
Click the several corresponding anchor points on the large image.
Transform the large image such that it maps to the small image by aligning the anchor points.
This would help align pictures of the same stellar object. (For example, a hand-drawn picture from 1855 mapped to a photograph taken by Hubble in 2000.)
Many thanks in advance for any algorithms (preferably Java or similar pseudo-code), ideas or links to related open-source software packages.
This is called Image Registration.
Mathworks discusses this, Matlab has this ability, and more information is in the Elastix Manual.
Consider:
Open source Matlab equivalents
IRTK
IRAF
Hugin
you can use the javax.imageio or Java Advanced Imaging api's for rotating, shearing and scaling the images once you found out what you want to do with them.
For a C++ implementation (without GUI), try the old KLT (Kanade-Lucas-Tomasi) tracker.
http://www.ces.clemson.edu/~stb/klt/

Categories

Resources