Get rectangle bounds for each letter in an image

I'm trying to fill an ArrayList<Rectangle> with the bounds of each letter in an image file.
For example, given this .png image:
I want to fill an ArrayList<Rectangle> with 14 Rectangle objects (one rectangle for each letter).
We can assume that the image contains only 2 colors, one for the background and one for the letters. In this case, pixels will be either white or red.
At first, I thought I could search for white columns between the letters. Once I found a completely white column, I could compute the bounds from the lowest and highest coordinates of the red pixels seen so far (for example, width = maxX - minX + 1), and so on:
x = minX;
y = minY;
w = maxX - minX + 1; // +1 because minX and maxX are both inclusive pixel coordinates
h = maxY - minY + 1;
letterBounds.add(new Rectangle(x, y, w, h));
The problem is that there's no space in between the letters, not even 1 pixel:
My next idea was, for each red pixel I find, to look for a neighbor that hasn't been seen yet. When I can't find any more neighbors, I have all the pixels I need to get the bounds of that letter. But with this approach I will get 2 rectangles for letters like "i". I could then write some algorithm to merge those rectangles, but I don't know how that will turn out with other multi-part letters, and before I try that I wanted to ask here for more ideas.
So do you guys have any ideas?

You can use the OpenCV cv2.findContours() function. Instead of drawing the contours with cv2.drawContours(), which only highlights the outline of each letter, you can compute the bounding box of each contour returned by cv2.findContours() with cv2.boundingRect() and draw it on the image with cv2.rectangle().
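If you would rather stay in plain Java, the neighbor-search idea from the question gives similar per-blob bounding boxes. Below is a minimal sketch, not OpenCV's contour algorithm: it assumes the image has already been converted to a boolean mask (mask[y][x] is true for letter pixels), and the class name ComponentBounds is made up for this example. It treats diagonal neighbors as connected, and it will still return two rectangles for a letter like "i".

```java
import java.awt.Rectangle;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

final class ComponentBounds {
    /** Returns one bounding rectangle per 8-connected blob of true pixels in mask[y][x]. */
    static List<Rectangle> find(boolean[][] mask) {
        int h = mask.length;
        int w = h == 0 ? 0 : mask[0].length;
        boolean[][] seen = new boolean[h][w];
        List<Rectangle> boxes = new ArrayList<>();
        ArrayDeque<int[]> stack = new ArrayDeque<>();
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (!mask[y][x] || seen[y][x]) continue;
                // New blob: flood-fill it with an explicit stack and track its extent.
                int minX = x, maxX = x, minY = y, maxY = y;
                seen[y][x] = true;
                stack.push(new int[] {x, y});
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    minX = Math.min(minX, p[0]);
                    maxX = Math.max(maxX, p[0]);
                    minY = Math.min(minY, p[1]);
                    maxY = Math.max(maxY, p[1]);
                    for (int dy = -1; dy <= 1; dy++) {
                        for (int dx = -1; dx <= 1; dx++) {
                            int nx = p[0] + dx, ny = p[1] + dy;
                            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
                            if (!mask[ny][nx] || seen[ny][nx]) continue;
                            seen[ny][nx] = true;
                            stack.push(new int[] {nx, ny});
                        }
                    }
                }
                // +1 because min and max are inclusive pixel coordinates.
                boxes.add(new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1));
            }
        }
        return boxes;
    }
}
```

Merging the dot of an "i" with its stem could then be tried by joining rectangles whose x-ranges overlap, but that heuristic is untested here.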

I think a two-step algorithm (histogram, then seam calculation) is enough to solve the problem if you are not using a library like OpenCV.
1. histogram
C.....C..C...
.C.C.C...C...
..C.C....CCCC
1111111003111
A dot (.) means the background color (white).
C means any color except the background color (in your case, red).
Counting the non-background pixels in each column generates the histogram.
         *
         *
*******..****
0123456789ABC
It is clear that a boundary exists at the empty columns 7 and 8.
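The histogram step can be sketched in Java as follows. This is only an illustration: the class name ColumnHistogram and the convention of passing the image as text rows ('.' for background, anything else for a letter pixel) are assumptions made for the example, and all rows are assumed to have the same length.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnHistogram {
    /** Counts the non-background pixels in each column; '.' is background. */
    static int[] histogram(String[] rows) {
        int[] counts = new int[rows[0].length()];
        for (String row : rows) {
            for (int x = 0; x < counts.length; x++) {
                if (row.charAt(x) != '.') counts[x]++;
            }
        }
        return counts;
    }

    /** Returns {startColumn, endColumnExclusive} for every run of non-empty columns. */
    static List<int[]> segments(int[] counts) {
        List<int[]> result = new ArrayList<>();
        int start = -1; // -1 means "not inside a run of filled columns"
        for (int x = 0; x <= counts.length; x++) {
            boolean filled = x < counts.length && counts[x] > 0;
            if (filled && start < 0) start = x;
            if (!filled && start >= 0) {
                result.add(new int[] {start, x});
                start = -1;
            }
        }
        return result;
    }
}
```

Each returned segment gives the x and width of a candidate letter; the y and height can be found the same way by counting per row inside that segment.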
2. seam calculation
Some cases, like "We", cannot be solved by the histogram because there are no empty vertical lines at all.
The seam carving algorithm gives us some hints:
https://en.wikipedia.org/wiki/Seam_carving
A more detailed implementation can be found at
princeton.edu - seamCarving.html
Energy calculation for a pixel
The red numbers are not color values for pixels, but energy values calculated from adjacent pixels.
The vertical paths with minimal energy give us the boundaries between characters.
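Finding a minimal-energy vertical path is a small dynamic-programming problem. The sketch below assumes the energy values have already been computed into a non-empty double[][] (how you compute them, for example from differences between adjacent pixels, is left open), and the class name VerticalSeam is invented for the example. A seam may move at most one column left or right per row.

```java
final class VerticalSeam {
    /** energy[y][x] is the energy of a pixel. Returns the seam's column for each row. */
    static int[] find(double[][] energy) {
        int h = energy.length, w = energy[0].length;
        // cost[y][x] = cheapest total energy of any seam from row 0 down to (x, y).
        double[][] cost = new double[h][w];
        cost[0] = energy[0].clone();
        for (int y = 1; y < h; y++) {
            for (int x = 0; x < w; x++) {
                cost[y][x] = energy[y][x] + cost[y - 1][cheapestNear(cost[y - 1], x)];
            }
        }
        // Pick the cheapest end point in the last row, then walk back up.
        int[] seam = new int[h];
        int best = 0;
        for (int x = 1; x < w; x++) {
            if (cost[h - 1][x] < cost[h - 1][best]) best = x;
        }
        seam[h - 1] = best;
        for (int y = h - 2; y >= 0; y--) {
            seam[y] = cheapestNear(cost[y], seam[y + 1]);
        }
        return seam;
    }

    /** Index of the smallest value among row[x-1], row[x], row[x+1], clipped to the array. */
    private static int cheapestNear(double[] row, int x) {
        int best = x;
        for (int c = Math.max(0, x - 1); c <= Math.min(row.length - 1, x + 1); c++) {
            if (row[c] < row[best]) best = c;
        }
        return best;
    }
}
```

To cut a group like "We" you would run this only on the columns of that group and split the group along the returned seam.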
3. One more...
Statistical data is required to decide whether to apply seam carving or not, for example:
the max and min width of characters.
Even if the histogram gives us vertical boundaries, it is not clear whether a group contains two or more characters.

Related

Maze Image Manipulation, Trimming whitespace

The problem I am having is the 2-pixel-wide pathways (the white parts).
In the top-left of the image (the darker black part) I have manually gone over the white parts that were 2 pixels in width/height.
There are two solutions that I can think of:
to programmatically edit it so that pathways are 1x1;
to find a way of dealing with paths that are larger than 1x1.
Any suggestions? The maze-solving algorithm (Tremaux) I have implemented works for 1x1 pathways, but I am trying to adapt it to this larger maze.
Preferably I am looking for a solution that adapts to a maze where the pathways can be any width, as I have already written a tool that takes an image and turns it into a monochrome int[][] array for maze solving.
Just looking for hints/steps in the right direction, since I'm not sure if I'm looking at this correctly or if I'm heading down the correct path (no pun intended).
Thanks

So your grid effectively repeats with a period of 3 pixels: 1 wall pixel followed by 2 path pixels. Just remove every 3rd row, then remove every 3rd column.

Think of the image as being divided up into 3x3 blocks, with the top-left corner being always wall, the top row and left column being the optional walls and the rest being path, like this:
W w w
w P P
w P P
W = always wall
w = possible wall
P = always path
You need to convert each of those 3x3 blocks into a 2x2 block like this:
W w
w P
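The block conversion above amounts to dropping the last row and the last column of every 3x3 block. Here is a rough sketch, assuming the maze is an int[][] with 1-pixel walls, that the block grid starts at index (0, 0), and that the class name MazeThinner and the period parameter are just names chosen for this example (period = 3 for 2-pixel paths, 1 + path width in general).

```java
import java.util.stream.IntStream;

final class MazeThinner {
    /** Keeps only the first two rows and columns of every period-sized block. */
    static int[][] thin(int[][] maze, int period) {
        int[] rows = keptIndices(maze.length, period);
        int[] cols = keptIndices(maze[0].length, period);
        int[][] out = new int[rows.length][cols.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < cols.length; c++) {
                out[r][c] = maze[rows[r]][cols[c]];
            }
        }
        return out;
    }

    /** Indices 0 (wall line) and 1 (first path line) of each block survive. */
    private static int[] keptIndices(int size, int period) {
        return IntStream.range(0, size).filter(i -> i % period < 2).toArray();
    }
}
```

The closing wall row and column at the far edge fall on index 0 of a new block, so they are kept as well.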

If you're trying to thin the paths just to make the maze easier to solve, then you don't need to bother. Finding the shortest path through the maze with BFS is about as fast as any path-thinning algorithm (except deleting every Nth row and column), and it will produce paths without any extra twists or turns.
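For reference, a BFS over the pixel grid does not care how wide the pathways are. This is a minimal sketch, assuming 0 marks a path pixel and anything else a wall in the int[][] from the question; the class name MazeBfs is hypothetical. It returns only the path length; storing a predecessor per cell would let you reconstruct the path itself.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

final class MazeBfs {
    /** Returns the number of steps on a shortest path, or -1 if the goal is unreachable. */
    static int shortestPath(int[][] grid, int startRow, int startCol, int goalRow, int goalCol) {
        int h = grid.length, w = grid[0].length;
        int[][] dist = new int[h][w];
        for (int[] row : dist) Arrays.fill(row, -1); // -1 means "not visited yet"
        ArrayDeque<int[]> queue = new ArrayDeque<>();
        dist[startRow][startCol] = 0;
        queue.add(new int[] {startRow, startCol});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (!queue.isEmpty()) {
            int[] cell = queue.poll();
            if (cell[0] == goalRow && cell[1] == goalCol) {
                return dist[goalRow][goalCol];
            }
            for (int[] m : moves) {
                int r = cell[0] + m[0], c = cell[1] + m[1];
                if (r < 0 || c < 0 || r >= h || c >= w) continue;
                if (grid[r][c] != 0 || dist[r][c] != -1) continue;
                dist[r][c] = dist[cell[0]][cell[1]] + 1;
                queue.add(new int[] {r, c});
            }
        }
        return -1;
    }
}
```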

Fill the hole in the image properly

I have to find the contours in the image. After that, I want to fill in the holes in the number characters, but not the other empty space. The image is the following.
http://i.stack.imgur.com/jlLYE.jpg
If that is not possible, is there any other method to segment this image using OpenCV on the Java platform? I want the resulting image to contain only the characters. Thank you.
http://i.stack.imgur.com/kY4Dh.png

Here is a simple method (but I am not sure it will work everywhere; test it yourself).
NB: Code is in Python, I don't do Java, sorry about that :(
1. Load the grayscale image
2. Apply Otsu's binarization
import cv2
import numpy as np
img = cv2.imread('test.png',0)
ret, thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
Below is the thresholded image:
Now you can try two methods:
3a. Median Blurring with a 3x3 kernel
res = cv2.medianBlur(thresh,3)
Result:
3b. Erosion with a 3x1 kernel (vertical). 3x1 because all the lines in your image are more or less horizontal. If there are vertical lines in other images, you may need to use a 3x3 kernel (not sure, check it).
kernel = np.ones((3,1), np.uint8)
cls = cv2.erode(thresh, kernel)
If you think you are also losing some parts of the digits, you can apply dilation after erosion, or replace both steps with a morphological opening.
Result:
Finally, find contours. This will also pick up some noise left in the preprocessed image, but you can filter it out by checking each contour's aspect ratio, area, etc.

There is a solution that is not too complicated. Think about the vertical run-length representation of the image: since you have only black and white there, each vertical line of the image is a list containing only 1 (black pixel) and 0 (white pixel), for instance 01111111110000000000000111. This can be compressed by keeping only the length of each run of identical values, so instead of 0000000001111111000111111111111 you will have 0 9 7 3 12. It starts with 0 because, let us say, you always start with the count of black pixels, and since there are no black pixels at the beginning you put a 0 (it is much easier to work like this).
Once you have this representation, find the maximal and the minimal length among the white runs (a run is just that count of consecutive white or black pixels) and go through all the white runs. For each of them, check whether its length is closer to the smallest white run than to the largest, and if so, just remove it.
This algorithm should work for the given image ;)
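The encoding step could look like this in Java. It is only a sketch of the run-length part, with the column passed as a string of '0' and '1' characters for readability; the class name RunLengths is made up, and removing the short white runs is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class RunLengths {
    /** Encodes a column of '1' (black) / '0' (white) pixels as alternating run lengths, black first. */
    static List<Integer> encode(String column) {
        List<Integer> runs = new ArrayList<>();
        char expected = '1'; // by convention the first run counts black pixels
        int length = 0;
        for (char pixel : column.toCharArray()) {
            if (pixel != expected) {
                // The current run ended (possibly with length 0 at the very start).
                runs.add(length);
                expected = pixel;
                length = 0;
            }
            length++;
        }
        runs.add(length);
        return runs;
    }
}
```

In the result, the entries at even positions are black runs and the entries at odd positions are white runs, which makes it easy to loop over the white ones only.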

Trying to create a random board layout for pre-school puzzle app

I'm making an app that has 4 puzzle pieces, each of which is a letter of the alphabet. The top of the screen will have 4 letters in black, placed at random locations. The child will drag the colored letters (at the bottom) on top of the corresponding black letters.
Right now the black letters will sometimes overlap or be very close to each other. I was trying to figure out a way to make random setups where the black letters are evenly distributed over the board area. Is there an easy way to do this? (i.e., determine x, y locations with a given width and height for each piece)

I would solve this in two steps:
1) Find an enclosing circle radius for the letters.
2) Use a random generator that produces points with a minimum distance between them. Basically, drop points that are too close to an existing one and generate a new one.
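The two steps can be sketched as simple rejection sampling. The class name SpacedPoints, the parameter names, and the attempt limit are assumptions for this example; minDistance would be twice the enclosing circle radius from step 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SpacedPoints {
    /**
     * Tries to place count points in a width x height area so that every pair is at
     * least minDistance apart. May return fewer points if maxAttempts runs out.
     */
    static List<double[]> generate(int count, double width, double height,
                                   double minDistance, Random rng, int maxAttempts) {
        List<double[]> points = new ArrayList<>();
        for (int attempt = 0; attempt < maxAttempts && points.size() < count; attempt++) {
            double[] candidate = {rng.nextDouble() * width, rng.nextDouble() * height};
            boolean tooClose = false;
            for (double[] p : points) {
                if (Math.hypot(p[0] - candidate[0], p[1] - candidate[1]) < minDistance) {
                    tooClose = true; // drop this candidate and draw a new one
                    break;
                }
            }
            if (!tooClose) points.add(candidate);
        }
        return points;
    }
}
```

The points are letter centers, so to keep whole letters on the board you would generate them in an area shrunk by the radius on each side and then offset them. The attempt limit keeps the loop from running forever when the board is too small for the requested spacing.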

Figure out how to detect when this happens, and move one of the offending letters when it does.

Image compositing of two Images

I am trying to solve a problem of compositing two images in Java. The program will take a part of the first image and paste it onto the second image. The goal is to make the boundary between the two images less visible. The boundary must be chosen in such a way that the difference between the two images at the boundary is small.
My Tasks:
To write a method to choose the boundary between the two images. The method will receive the overlapping parts of the input images. This must first be transformed so that the boundary always starts from the left-top corner to the right-bottom corner.
NOTE:
The returned image should not be the joined image but gives which parts of the two images were used.
The pixels of the boundary line can be marked with a constant(SEAM). Pixels of the first image can be marked with integer 0, pixels of the second image with integer 1. After choosing the boundary line, the floodfill algorithm can be used to fill the extra pixels with 0 or 1.
NOTE: The image can be represented as a graph whereby each pixel is connected with its left, right, top and bottom neighbor. So using the flood fill will be like depth-first search.
The shortest path algorithm must be used to choose the boundary in order to make it small.
NOTE: I can not use any java data structure except Arrays (not even ArrayList)
I am new to this area and am trying to solve it. What steps must I follow to solve this problem? A pointer to a tutorial would also help.

I would do it like this:
Choose the width of the border strip to check, as you see fit.
1. Find the maximal possible shift in pixels. Call it D.
2. For every possible shift in the square (+-D, +-D), compute k (the correlation coefficient) for the border. The border is taken in the middle of the shift.
3. The shift with the largest k is the best one. Fix it.
4. Now start moving the border, evaluating each position by k in the same way, and pick the best position. Done.
If D is large and the process takes too long, do it in 2 (or more) stages. In the early stages the step between evaluated shifts is large; the last stage uses a step of 1. You could also filter the images beforehand.
If the border or the relative position of the images can be rotated, the algorithm doesn't fundamentally change: also try slightly rotated positions, and later rotated borders, when searching for the best k.

How does a QuadTree work for non-square areas?

I understand how quad trees work on square images (by splitting the image until the section is a single colour, which is stored in the leaf node).
What happens if the image has one dimension longer than the other? You may end up with a 2x1 pixel area as the smallest sub-unit, making it difficult to use quadtree division methods to store a single colour. How would you solve this issue?

You could pad the image until it is square with a power-of-two side length. While this adds some extra memory requirements, the increase shouldn't be that large.
The 2x1 example would be padded to a standard 2x2; either store the real size, or use a special value for padded nodes, so you can restore the original size.
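A small sketch of the padding idea, assuming the image is an int[][] of colour values and that padValue is some value that never occurs as a real colour (both are assumptions for this example, as is the class name QuadTreePadding):

```java
import java.util.Arrays;

final class QuadTreePadding {
    /** Smallest power of two that is greater than or equal to n (for n >= 1). */
    static int nextPowerOfTwo(int n) {
        return n <= 1 ? 1 : Integer.highestOneBit(n - 1) << 1;
    }

    /** Side length of the square, power-of-two canvas that can hold the image. */
    static int paddedSide(int width, int height) {
        return nextPowerOfTwo(Math.max(width, height));
    }

    /** Copies image[y][x] into the top-left corner of a square canvas filled with padValue. */
    static int[][] pad(int[][] image, int padValue) {
        int side = paddedSide(image[0].length, image.length);
        int[][] canvas = new int[side][side];
        for (int[] row : canvas) Arrays.fill(row, padValue);
        for (int y = 0; y < image.length; y++) {
            System.arraycopy(image[y], 0, canvas[y], 0, image[y].length);
        }
        return canvas;
    }
}
```

Large padded regions are a single colour, so they collapse into a few leaf nodes when the tree is built.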

Why don't you allow empty leaves in your tree?
Edit:
Maybe I don't understand the question^^. Your problem is that you end up with non-square images like 2x1 and want to represent them as a quadtree node?
When you have a 2x2 square like
1 2
3 4
you would create a Quadnode with something like "new QuadNode(1,2,3,4)"
I would suggest handling a 2x1 rectangle like
1 2
with something like "new QuadNode(1,2,null,null)"
When you have bigger missing pieces you can use the same system. When you have a 4x2 picture like
1 2 3 4
5 6 7 8
you would get a "new QuadNode(new QuadNode(1,2,5,6),new QuadNode(3,4,7,8),null,null)"
This should also work with pieces with equal color instead of pixels.
Did I understand your problem and make myself clear?

A square is a special rectangle; quadtrees work on rectangles, too.
You just need a split method that returns 4 rectangles for a given one.
If the topmost root cell is a rectangle, just divide the width and height by 2.
In the case of pixels, this only makes sense if the root cell's width and height are both powers of 2.
So if the root cell is 2048 * 1024,
the split just divides both width and height by 2.
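Such a split method might look like the sketch below, using java.awt.Rectangle. The class name QuadSplit is made up, and rounding odd sizes up for the left and top halves is just one possible choice. It goes slightly beyond the power-of-two case: for odd or 1-pixel dimensions some children come back empty and can be skipped by the caller.

```java
import java.awt.Rectangle;

final class QuadSplit {
    /**
     * Returns {topLeft, topRight, bottomLeft, bottomRight}. Children can be empty
     * (width or height 0), e.g. the bottom two for a 2x1 cell. A 1x1 cell is a leaf
     * and should not be split further.
     */
    static Rectangle[] split(Rectangle r) {
        int leftW = (r.width + 1) / 2;   // left half gets the extra pixel for odd widths
        int rightW = r.width - leftW;
        int topH = (r.height + 1) / 2;   // top half gets the extra pixel for odd heights
        int bottomH = r.height - topH;
        int midX = r.x + leftW;
        int midY = r.y + topH;
        return new Rectangle[] {
            new Rectangle(r.x, r.y, leftW, topH),
            new Rectangle(midX, r.y, rightW, topH),
            new Rectangle(r.x, midY, leftW, bottomH),
            new Rectangle(midX, midY, rightW, bottomH)
        };
    }
}
```

For the 2048 * 1024 root this yields four 1024 * 512 cells; after ten more splits the cells are 1 pixel high, and from then on only the top two children are non-empty.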
