I am looking for library routines for the image enhancement of (scientific) plots and diagrams. Typical examples are shown in
http://www.jcheminf.com/content/pdf/1758-2946-4-11.pdf
and Figure 3 of http://en.wikipedia.org/wiki/Anti-aliasing
These have the features that:
They usually use a very small number of primitives (line, character, circle, rectangle)
They are usually monochrome (black/white) or have a very small number of block colours
The originals have no gradients or patterns.
I wish to reconstruct the primitives and am looking for an algorithm to restore clean lines in the image before the next stage of analysis (which may include line detection and OCR). The noise often comes from :
use of JPGs (the noise is often seen close to the original primitive)
antialiasing
I require Free/Open Source solutions and would ideally like existing Java libraries. If there are any which already do some of the job or reconstructing lines that would be a bonus! For characters recognition I would be happy to isolate each character at this stage and defer OCR, though pointers to that would also be appreciated.
UPDATE:
I am surprised that even with a bounty there have been no substantive replies to the question. I am therefore investigating it myself. I still invite answers but they should go beyond my own answer.
ANSWER TO OWN QUESTION
Since there there have been no answers after nearly a week here is what I now plan:
I found mention of the Canny edge-detection algorithm on another SO post and then found:
[http://www.tomgibara.com/computer-vision/canny-edge-detector][2]
from Tom Gibara.
This is very easy to use in default mode and the main program is:
public static void main(String[] args) throws Exception {
File file = new File("c.bmp");
//create the detector
CannyEdgeDetector detector = new CannyEdgeDetector();
//adjust its parameters as desired
detector.setLowThreshold(0.5f);
detector.setHighThreshold(1f);
//apply it to an image
BufferedImage img = ImageIO.read(file);
detector.setSourceImage(img);
detector.process();
BufferedImage edges = detector.getEdgesImage();
ImageIO.write(edges, "png", new File("c.png"));
}
Here ImageIO reads and writes bitmaps. The unprocessed image is read as a 24-bit BMP (ImageIO seems to fail with lower colour range). The defaults are Gibara's out-of-the-box.
The edge detection is very impressive and outlines all the lines and characters. This bitmap
is converted to the edges
So now I have two tasks:
fit straight lines to the outlines, which are essentially clean "tramlines". I expect this to be straightforward for clean diagrams. I'd be grateful for any mention of Java libraries to fit line primitives to outlines.
recognize the characters. Gibara has done an excellent job of separating them and so this is an exercise of recognising the individual glyphs. I can use the outlines to isolate the individual pixel maps for each glyph and then pass these to JavaOCR. Alternatively the outlines may be good enough to recognize the characters directly. I do NOT know what the font is, but most characters are in the 32-255 range and I believe I can build up heuristic maps.
See How do I properly load a BufferedImage in java? for loading bitmaps in Java
Java Library
OpenCV is the go-to library for computer vision tasks like this. There are Java bindings here: http://code.google.com/p/javacv/ . OpenCV covers everything from basic image processing filters to high-level object and motion detection algorithms.
Line Detection
For detecting straight lines, try the Hough Transform. The OpenCV Tutorials have a good explanation: http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html#how-does-it-work
The classical Hough transform outputs infinite lines, but OpenCV also implements a variant called the Probabilistic Hough Transform that outputs line segments. It should give what you need. The original academic paper is here: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.34.9440&rep=rep1&type=pdf
Once you detect line segments, you might want to detect linked line segments and join them together. For your simple images, you will probably do just fine with a brute-force comparison of all segment endpoints. If you detect more than one endpoint within a small radius, say 2 pixels, join them together to make sure your lines are continuous. You can also measure the angle between joined line segments to detect polygons.
Circle Detection
There is another version of the Hough transform that can detect circles, explained here: http://opencv.itseez.com/doc/tutorials/imgproc/imgtrans/hough_circle/hough_circle.html#hough-circle
I wish to reconstruct the primitives and am looking for an algorithm
to restore clean lines in the image before the next stage of analysis
(which may include line detection and OCR).
Have you looked at jaitools? ( http://code.google.com/p/jaitools/ ).
They have API for vectorizing graphics which are quite fast and flexible; see API and docs here: http://jaitools.org/
Related
I am working on a project where I want to detect the pores in a given skin image.
I have tried various methods(HoughCircles, BlobDetection and Contours) from OpenCv using Java, however, I am unable to proceed.
HoughCircles is showing me all false circles and same is the case with contours.
My current code uses blob detection technique which is also not showing what is required. Sample code is written below:
public void detectBlob() {
Mat orig = Highgui.imread("skin_pore.jpg",Highgui.IMREAD_GRAYSCALE);
Mat MatOut= new Mat();
FeatureDetector blobDetector;
blobDetector = FeatureDetector.create(FeatureDetector.SIFT);
MatOfKeyPoint keypoints1 = new MatOfKeyPoint();
blobDetector.detect(orig,keypoints1);
org.opencv.core.Scalar cores = new org.opencv.core.Scalar(0,0,255);
org.opencv.features2d.Features2d.drawKeypoints(orig,keypoints1,MatOut,cores,2);
Highgui.imwrite("PhotoOut.jpg", MatOut);
}
public static void main(String args[]) {
BlobDetection bd = new BlobDetection();
bd.detectBlob();
}
When I tried the same code using FeatureDetector.SIMPLEBLOB instead of FeatureDetector.SIFT it shows almost 0 blobs.
The output and source images are attached for the above code. Source Image
Output Image using SIFT
Is there any other algorithm which can help in achieving the result or what can be the appropriate approach to achieve this?
As you did not ask anything in your question I won't give you an answer. More some general advice.
That you tried to solve that problem using the named algorithms clearly shows that you have absolutely no clue what you are doing. You lack the very basics of image processing.
It's like trying to win vs decent chess player if you don't even know how the figures can move.
I highly recommend that you get yourself a beginners book, read it and make sure you understand its contents. Then do some more research on the algorithms you want to use, befor you use them.
You cannot take some arbitrary image, feed it into some random feature detection algorithm you find on the internet and expect to succeed.
Hough transform for cirles for example is good for finding circle shaped contours of a roughly known radius. If you know how it works internally you will know why it is not a good idea to use it on your image.
https://en.wikipedia.org/wiki/Circle_Hough_Transform
Blobdetection and contour based algorithms might work, but only after a lot of pre-processing. Your image is not very "segmentation-friendly"
https://en.wikipedia.org/wiki/Image_segmentation
https://en.wikipedia.org/wiki/Blob_detection
A SIFT detector usually has to be taught using reference images and reference keypoints. I don't see this either in your code.
https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
Please note that reading those wikipedia articles will only give you a first idea of what's going on. You have to read a lot more.
Always start at the beginning of your processing chain. Can you get better images? (Better means more suitable for what you want to detect). This is like 10% camera and 90% illumination. I don't think detecting skin pores is a classical task for shitty cellphone pictures so why not put a bit effort into your imaging setup?
First rule of image processing: crap in = crap out. You should at least change the angle of illumination or even better approach like shape from shading.
An image optimized for the detection you have to do is cruicial. It will make image processing so much easier.
Then pre-processing: How can you transform the image you have into something you can easily extract features from?
And so on...
We are trying to crop the relevant area of an image (photo) with a square aspect ratio (1:1), similar to what Facebook does when creating thumbnails.
In our case, it doesn't really matter if the crop has the original height (or width when the image orientation is portrait h>w) of the image to be processed or the auto-crop is resizing itself as well
I am thinking of algorithms like comparing objects with background or focus or something like a heat-map, combining colors and/or areas to find the most relevant part. There could be several ideas/methods to find the main part of the image to be used, similar to face detection.
We are looking for a Java (Android)-based solution or anything that can be adopted for Java / Android. Any help or idea would be greatly appreciated! Thank you!
I would do this in two steps, where the initial step is more robust and the second could be based on, for example, entropy. For the first step, you can use SURF which is relatively common nowadays and I would expect to find Java implementations of it. SURF will give a set of key points that it considers important to describe your image. Considering where these key points are in your image, you have a set of (x, y) coordinates from which you use to reduce the area of your initial image to that which encloses this set of points. Now, since these key points might be anywhere in your image, you will probably want to discard some of them (i.e., those that are too far from the others -- outliers). A very simple way to do this discarding step is considering the convex hull from the initial set of key points, from there, you can peel this hull multiple times. Each time you "peel" it, you are effectively discarding the points in the current convex hull.
Here is a sample for such first step:
f = Import["http://fohn.net/duck-pictures-facts/mallard-duck.jpg"];
kp = ImageKeypoints[f, MaxFeatures -> 200];
Show[f, Graphics[{PointSize[Medium], Red, Point[kp]}]]
After peeling once the convex hull formed by the key points and trimming the image according to the bounding rectangle of the remaining points:
From the image above, you can decide which sub-region of it to pick based on some other method. One that is apparently common is the one used by Reddit, which successively remove slices of lesser entropy from the image. Quickly searching for it, I found one such implementation at https://github.com/christopherhan/pycrop/blob/master/pycrop.py#L33, it is very simple.
Another different kind of method that you might wanna try is called Seam-Carving. Also note that depending on how large is the initial image, it is unlikely that cropping a small piece of it will give anything relevant. In those cases, it is more interesting to first resize the image and then apply the relevant methods.
What is the best way to identify an image's type? rwong's answer on this question suggests that Google segments images into the following groups:
Photo - continuous-tone
Clip art - smooth shading
Line drawing - bitonal
What is the best strategy for classifying an image into one of those groups? I'm currently using Java but any general approaches are welcome.
Thanks!
Update:
I tried the unique colour counting method that tyjkenn mentioned in a comment and it seems to work for about 90% of the cases that I've tried. In particular black and white photos are hard to correctly detect using unique colour count alone.
Getting the image histogram and counting the peeks alone doesn't seem like it will be a viable option. For example this image only has two peaks:
Here are two more images I've checked out:
Rather simple, but effective approaches to differentiate between drawings and photos. Use them in combination to achieve a the best accuracy:
1) Mime type or file extension
PNGs are typically clip arts or drawings, while JPEGs are mostly photos.
2) Transparency
If the image has an alpha channel, it's most likely a drawing. In case an alpha channel exists, you can additionally iterate over all pixels to check if transparency is indeed used. Here a Python example code:
from PIL import Image
img = Image.open('test.png')
transparency = False
if img.mode in ('RGBA', 'RGBa', 'LA') or (img.mode == 'P' and 'transparency' in img.info):
if img.mode != 'RGBA': img = img.convert('RGBA')
transparency = any(px for px in img.getdata() if px[3] < 220)
print 'Transparency:', transparency
3) Color distribution
Clip arts often have regions with identical colors. If a few color make up a significant part of the image, it's rather a drawing than a photo. This code outputs the percentage of the image area that is made from the ten most used colors (Python example):
from PIL import Image
img = Image.open('test.jpg')
img.thumbnail((200, 200), Image.ANTIALIAS)
w, h = img.size
print sum(x[0] for x in sorted(img.convert('RGB').getcolors(w*h), key=lambda x: x[0], reverse=True)[:10])/float((w*h))
You need to adapt and optimize those values. Is ten colors enough for your data? What percentage is working best for you. Find it out by testing a larger number of sample images. 30% or more is typically a clip art. Not for sky photos or the likes, though. Therefore, we need another method - the next one.
4) Sharp edge detection via FFT
Sharp edges result in high frequencies in a Fourier spectrum. And typically such features are more often found in drawings (another Python snippet):
from PIL import Image
import numpy as np
img = Image.open('test.jpg').convert('L')
values = abs(numpy.fft.fft2(numpy.asarray(img.convert('L')))).flatten().tolist()
high_values = [x for x in values if x > 10000]
high_values_ratio = 100*(float(len(high_values))/len(values))
print high_values_ratio
This code gives you the number of frequencies that are above one million per area. Again: optimize such numbers according to your sample images.
Combine and optimize these methods for your image set. Let me know if you can improve this - or just edit this answer, please. I'd like to improve it myself :-)
This problem can be solved by image classification and that's probably Google's solution to the problem. Basically, what you have to do is (i) get a set of images labeled into 3 categories: photo, clip-art and line drawing; (ii) extract features from these images; (iii) use the image's features and label to train a classifier.
Feature Extraction:
In this step you have to extract visual information that may be useful for the classifier to discriminate between the 3 categories of images:
A very basic yet useful visual feature is the image histogram and its variants. For example, the gray level histogram of a photo is probably smoother than a histogram of a clipart, where you have regions that may be all of the same color value.
Another feature that one can use is to convert the image to the frequency domain (e.g. using FFT or DCT) and measure the energy of high frequency components. Because line drawings will probably have sharp transitions of colors, its high frequency components will tend to accumulate more energy.
There's also a number of other feature extraction algorithms that may be used.
Training a Classifier:
After the feature extraction phase, we will have for each image a vector of numeric values (let's call it the image feature vector) and its tuple. That's a suitable input for a training a classifier. As for the classifier, one may consider Neural Networks, SVM and others.
Classification:
Now that we have a trained classifier, to classify an image (i.e. detect a image category) we simply have to extract its features and input it to the classifier and it will return its predicted category
Histograms would be a first way to do this.
Convert the color image to grayscale and calculate the histogram.
A very bi-modal histogram with 2 sharp peaks in black (or dark) and white (or right), probably with much more white, are a good indication for line-drawing.
If you have just a few more peaks then it is likely a clip-art type image.
Otherwise it's a photo.
In addition to color histograms, also consider edge information and the consistency of line widths throughout the image.
Photo - natural edges will have a variety of edge strengths, and it's less likely that there will be many parallel edges.
Clip art - A watershed algorithm could help identify large, connected regions of consistent brightness. In clip art and synthetic images designed for high visibility there are more likely to be perfectly straight lines and parallel lines. A histogram of edge strengths is likely to have a few very strong peaks.
Line drawing - synthetic lines are likely to have very consistent width. The Stroke Width Transform could help you identify strokes. (One of the basic principles is to find edge gradients that "point at" each other.) A histogram of edge strengths may have only one strong peak.
Here’s my task which I want to solve with as little effort as possible (preferrably with QT & C++ or Java): I want to use webcam video input to detect if there’s a (or more) crate(s) in front of the camera lens or not. The scene can change from "clear" to "there is a crate in front of the lens" and back while the cam feeds its video signal to my application. For prototype testing/ learning I have 2-3 images of the “empty” scene, and 2-3 images with one or more crates.
Do you know straightforward idea how to tackle this task? I found OpenCV, but isn't this framework too bulky for this simple task? I'm new to the field of computer vision. Is this generally a hard task or is it simple and robust to detect if there's an obstacle in front of the cam in live feeds? Your expert opinion is deeply appreciated!
Here's an approach I've heard of, which may yield some success:
Perform edge detection on your image to translate it into a black and white image, whereby edges are shown as black pixels.
Now create a histogram to record the frequency of black pixels in each vertical column of pixels in the image. The theory here is that a high frequency value in the histogram in or around one bucket is indicative of a vertical edge, which could be the edge of a crate.
You could also consider a second histogram to measure pixels on each row of the image.
Obviously this is a fairly simple approach and is highly dependent on "simple" input; i.e. plain boxes with "hard" edges against a blank background (preferable a background that contrasts heavily with the box).
You dont need a full-blown computer-vision library to detect if there is a crate or no crate in front of the camera. You can just take a snapshot and make a color-histogram (simple). To capture the snapshot take a look here:
http://msdn.microsoft.com/en-us/library/dd742882%28VS.85%29.aspx
Lots of variables here including any possible changes in ambient lighting and any other activity in the field of view. Look at implementing a Canny edge detector (which OpenCV has and also Intel Performance Primitives have as well) to look for the outline of the shape of interest. If you then kinda know where the box will be, you can perhaps sum pixels in the region of interest. If the box can appear anywhere in the field of view, this is more challenging.
This is not something you should start in Java. When I had this kind of problems I would start with Matlab (OpenCV library) or something similar, see if the solution would work there and then port it to Java.
To answer your question I did something similar by XOR-ing the 'reference' image (no crate in your case) with the current image then either work on the histogram (clustered pixels at right means large difference) or just sum the visible pixels and compare them with a threshold. XOR is not really precise but it is fast.
My point is, it took me 2hrs to install Scilab and the toolkits and write a proof of concept. It would have taken me two days in Java and if the first solution didn't work each additional algorithm (already done in Mat-/Scilab) another few hours. IMHO you are approaching the problem from the wrong angle.
If really Java/C++ are just some simple tools that don't matter then drop them and use Scilab or some other Matlab clone - prototyping and fine tuning would be much faster.
There are 2 parts involved in object detection. One is feature extraction, the other is similarity calculation. Some obvious features of the crate are geometry, edge, texture, etc...
So you can find some algorithms to extract these features from your crate image. Then comparing these features with your training sample images.
I'm capturing data from a tablet using Java (JPen library rocks) and would like to be able to paint a penstroke in a more natural way.
Currently I'm drawing the pen stroke as straight line segments each with a different Stroke thickness.
There has to be something in Java's Graphics Library that lets me to this more efficiently.
Right?
I've never done this, but here are a couple things you could try. First, you could implement a custom Stroke that creates skinny trapezoids. The width of the end caps would be a function of the pressure at the end points. If that works, you could try to make the line segments look more natural by using Bezier curves to form "curvy trapezoids". You might be able to use QuadCurve2D to help.
There's a more general solution available at least. The feature was added to Inkscape based on a recent algorithm. You can see it applied directly to your problem in some screenshots. It can extrude any shape brush along the path to mimic a paintbrush for example, but you'd have to port it to Java from the algorithm in the first link or from the Inkscape sources. Also, it's covered by patents so you'd have to release your code under the GPL (the author gives explicit permission) or buy a patent license.
PostScript RIPs often convert circles to curves and curves to a series of straight line segments. The number of segments depends on the flatness setting which defaults to one suitable for the raster display resolution.
A thick line or thick line segments can be converted to a skinny filled polygon.