I'm trying to tell if a given photo is blurry. I know that this is basically an impossible task, and that any metric will return undesirable results sometimes.
I'm wondering if there's a simple metric that at least tries to estimate blur that I can use though. Specifically, the task has a high tolerance for false positives. e.g. If I got something that eliminated 90% of blurry photos and 50% of non-blurry photos I would be very happy.
I'm trying to implement this in Java. I have an array of pixels (as ints). Please keep in mind I have a limited understanding of image processing techniques (fourier transforms, etc.), and I would love a very specific walkthrough of how to code a solution.
A very simple measure would be to apply a Sobel filter and investigate the overall energy of the filtered image. The more an image is blurred, the more edges vanish, the smaller the energy of the filtered image. Of course you'll run into problems with this approach when you try to determine a threshold for blurred vs. not blurred, but maybe this simple method will give you an idea.
Check Wikipedia for the Sobel filter; here is a code snippet that computes the edge-pixel ratio of an image. You can use these edge ratios to compare pairwise whether images have more or fewer edges. Still, keep in mind that this is a simple approach and that a.lasram's answer is definitely correct.
import java.awt.color.ColorSpace;
import java.awt.image.*;
import java.io.File;
import javax.imageio.ImageIO;

// Horizontal Sobel kernel: strong responses at edges, weak responses in flat (blurred) regions.
float[] sobelX = {
    -1, 0, 1,
    -2, 0, 2,
    -1, 0, 1
};

BufferedImage image = ImageIO.read(new File("test.jpg"));

// Convert to grayscale before filtering.
ColorConvertOp grayScaleOp = new ColorConvertOp(ColorSpace.getInstance(ColorSpace.CS_GRAY), null);
BufferedImage grayImage = grayScaleOp.filter(image, null);

// Apply the Sobel kernel.
BufferedImageOp op = new ConvolveOp(new Kernel(3, 3, sobelX));
BufferedImage result = op.filter(grayImage, null);

// Count pixels with a strong edge response, reading one row at a time.
WritableRaster r = result.getRaster();
int[] pixel = new int[r.getWidth()];
double countEdgePixels = 0;
for (int y = 0; y < r.getHeight(); y++) {
    r.getPixels(0, y, r.getWidth(), 1, pixel);
    for (int i = 0; i < pixel.length; i++) {
        if (pixel[i] > 128) {
            countEdgePixels++;
        }
    }
}

System.out.printf("Edge pixel ratio = %4.4f\n", countEdgePixels / (r.getWidth() * r.getHeight()));
ImageIO.write(result, "png", new File("out.png"));
As you've said you're not going to find a universal metric.
Also there are different types of blur: uniform, anisotropic, motion blur...
In general, blurred images tend to exhibit low frequencies. A possible descriptor is the sum of the magnitudes of the k highest frequencies; an image with a low sum is likely to be blurred overall.
The magnitudes can be obtained in N*log(N) time using the Fourier spectrum (high frequencies are far from the origin) or a Laplacian pyramid (high frequencies correspond to the first scales).
A wavelet transform is another possible descriptor.
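Not the Fourier or pyramid computation itself, but a cheap related measure that is easy to code in plain Java is the variance of a Laplacian-filtered image: sharp images give a large variance, heavily blurred ones a small one. A minimal sketch, assuming the image has already been converted to grayscale (and keeping in mind that the decision threshold still has to be tuned on real data):

import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.awt.image.Raster;

// Variance of the Laplacian response as a rough sharpness score.
static double laplacianVariance(BufferedImage gray) {
    float[] laplacian = {
         0, -1,  0,
        -1,  4, -1,
         0, -1,  0
    };
    BufferedImage response = new ConvolveOp(new Kernel(3, 3, laplacian)).filter(gray, null);

    Raster r = response.getRaster();
    double sum = 0, sumSq = 0;
    long n = (long) r.getWidth() * r.getHeight();
    for (int y = 0; y < r.getHeight(); y++) {
        for (int x = 0; x < r.getWidth(); x++) {
            int v = r.getSample(x, y, 0);
            sum += v;
            sumSq += (double) v * v;
        }
    }
    double mean = sum / n;
    return sumSq / n - mean * mean; // low values suggest a blurry image
}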
A bit of a late reply, but worth it for the next person who bumps into this question.
I found several papers on Google Scholar that talk about averaging the sum of all edges in the picture relative to their widths, as can be seen in these two articles: First and Second.
Here is another idea: what if we solve the inverse task? If we measure image sharpness, we get the opposite of a blur metric.
Some relevant articles:
A Perceptual Image Sharpness Metric Based on Local Edge Gradient Analysis
A No-Reference Objective Image Sharpness Metric Based on the Notion of Just Noticeable Blur (JNB)
Aberration correction by maximizing generalized sharpness metrics
Please note, I am a complete beginner in computer vision and OpenCV(Java).
My objective is to identify parking signs, and to draw bounding boxes around them. My problem is that the four signs from the top (with red borders) were not identified (see last image). I am also noticing that the Canny edge detection does not capture the edges of these four signs (see second image). I have tried with other images, and got the same results. My approach is as follows:
Load the image and convert it to gray scale
Pre-process the image by applying bilateralFilter and Gaussian blur
Execute Canny edge detection
Find all contours
Calculate the perimeter with arcLength and approximate the contour with approxPolyDP
If the approximated figure has 4 points, assume it is a rectangle and add the contour
Finally, draw the contours that have exactly 4 points.
import java.util.ArrayList;
import java.util.List;
import org.opencv.core.*;
import org.opencv.imgproc.Imgproc;
import android.widget.Toast;

// Grayscale conversion and noise reduction before edge detection.
Mat filtered = new Mat();
Mat edges = new Mat(src.size(), CvType.CV_8UC1);
Imgproc.cvtColor(src, edges, Imgproc.COLOR_RGB2GRAY);
Imgproc.bilateralFilter(edges, filtered, 11, 17, 17);
Imgproc.GaussianBlur(filtered, filtered, new Size(5, 5), 0);

// Canny edge detection and contour extraction.
Imgproc.Canny(filtered, filtered, 170, 200);
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Imgproc.findContours(filtered, contours, new Mat(), Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);

// Keep only contours whose polygon approximation has exactly 4 vertices.
List<MatOfPoint> rectangleContours = new ArrayList<MatOfPoint>();
for (MatOfPoint contour : contours) {
    MatOfPoint2f dst = new MatOfPoint2f();
    contour.convertTo(dst, CvType.CV_32F);

    double perimeter = Imgproc.arcLength(dst, true);
    double approximationAccuracy = 0.02 * perimeter;
    MatOfPoint2f approx = new MatOfPoint2f();
    Imgproc.approxPolyDP(dst, approx, approximationAccuracy, true);

    if (approx.total() == 4) {
        rectangleContours.add(contour);
        Toast.makeText(reactContext.getApplicationContext(),
                "Rectangle detected " + approx.total(), Toast.LENGTH_SHORT).show();
    }
}
Imgproc.drawContours(src, rectangleContours, -1, new Scalar(0, 255, 0), 5);
Very happy to get advice on how I could resolve this issue, even if it implies changing my strategy.
What about starting with OCR, Tesseract, in order to recognize big "P" and other parking-related text patterns?
(The Toast call suggests Android: How can I use Tesseract in Android?
General Tesseract for Java: https://www.geeksforgeeks.org/tesseract-ocr-with-java-with-examples/ )
Another example, in Python, but see the preprocessing and other tricks and ideas for making the letters recognizable when the image has gradients, lower contrast, small fonts etc.: How to obtain the best result from pytesseract?
Also, there could be filtering by color, since the colors of the signs are known. The conversion to grayscale throws away that valuable information; finding the edges is OK, but the colors can still be used. E.g. split the image into B, G, R channels and use each channel as a grayscale image, possibly boosting it. The red and blue borders would stand out.
It seems the contrast around the red borders is too low, while the blue signs are brighter compared to the black contour. Even without splitting, some of the color channels (e.g. the red one) could be amplified before converting to grayscale.
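A minimal sketch of that channel-splitting idea with OpenCV4Android/Java (the gain of 1.5 is an arbitrary placeholder, and the channel order depends on how src was created, e.g. BGR for imread vs. RGBA for Android camera frames):

import java.util.ArrayList;
import java.util.List;
import org.opencv.core.Core;
import org.opencv.core.Mat;

// Split the frame into its color planes and boost one of them before edge detection.
List<Mat> channels = new ArrayList<Mat>();
Core.split(src, channels);                      // e.g. channels.get(2) = R for a BGR Mat
Mat red = channels.get(2);
Mat boostedRed = new Mat();
Core.convertScaleAbs(red, boostedRed, 1.5, 0);  // amplify the chosen plane
// boostedRed can now be blurred and passed to Imgproc.Canny instead of the
// plain grayscale image, so the red sign borders stand out more.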
You could also search for big yellow/blue regions with low contrast in which text ("P" etc.) was found; Tesseract has a function that returns the bounding boxes of the text it detected.
Also, once you find one sign, or a bar of signs and their orientation, you could restrict the search to that area, vertically/horizontally.
You may try HoughLines as well; it may find the black borders around the signs (see the sketch below).
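For example, a rough sketch reusing the filtered Mat that already holds the Canny output (all numeric parameters are guesses that need tuning to the image size):

import org.opencv.core.Mat;
import org.opencv.imgproc.Imgproc;

// Probabilistic Hough transform on the Canny edge image.
Mat lines = new Mat();
Imgproc.HoughLinesP(filtered, lines, 1, Math.PI / 180, 80, 50, 10);
for (int i = 0; i < lines.rows(); i++) {
    double[] l = lines.get(i, 0); // x1, y1, x2, y2
    // Long, roughly horizontal or vertical segments are candidates for the
    // dark borders around the signs.
}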
Calculate the perimeter with arcLength and approximate the contour with approxPolyDP
If the approximated figure has 4 points, assume it is a rectangle and add the contour
IMO, requiring exactly 4 points (even after simplification of the polygon) is hard and may not be enough evidence on its own; there are also rounded corners etc. if contours are compared directly.
The angles between the vertices and the distances matter: are the lines parallel (within some tolerance), and so on.
The process could be iterative: gradually reduce the polygon detail while checking the area and perimeter, until the number of vertices gets down to 4 (or about that). If the area and perimeter don't change much after the approximation (simplifying the rounded corners etc.) while the number of points in the contour keeps dropping, the shape is probably close to a rectangle; the acceptable ratio has to be found empirically. I'd also try comparing against the bounding box and convex hull measurements, as in the sketch below.
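One rough way to express the bounding-box comparison inside the existing contour loop (the thresholds are made up and need tuning; for rotated signs, Imgproc.minAreaRect would give a tighter box than the axis-aligned one used here):

import org.opencv.core.MatOfPoint;
import org.opencv.core.Rect;
import org.opencv.imgproc.Imgproc;

// Compare the contour area with the area of its bounding box; a nearly
// rectangular contour fills most of the box.
double contourArea = Imgproc.contourArea(contour);
Rect box = Imgproc.boundingRect(contour);
double boxArea = (double) box.width * box.height;
double fill = contourArea / boxArea;      // close to 1.0 for rectangles
if (fill > 0.8 && boxArea > 1000) {       // placeholder thresholds
    rectangleContours.add(contour);
}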
If you only need to detect the parking signs, then treat this as a classic object detection problem (just like face detection). For the best results, you will need a deep-learning-based convolutional neural network model.
To start, you can train a YOLO model, which will give you much better results than anything you have tried with OpenCV. You need at least 500 images, and then you need to annotate them. This tutorial is a kick-start tutorial on YOLO. Give it a try.
Besides YOLO there are many other models, and all of them can be trained with a similar process. If you want to deploy your model on Android, I recommend choosing a TensorFlow-based model: train it on your PC and integrate the trained, serialized model into your app.
Hello I am an inexperienced programmer and this is my first question on Stack Overflow!
I am attempting to implement 'fog of war' in my Java game. This means most of my map begins black, and then, as one of my characters moves around, parts of the map are revealed. I have searched around, including here, found a few suggestions and tried tweaking them myself. Each of my approaches works; however, I run into significant runtime issues with each. For comparison, before any of my fog of war attempts I was getting 250-300 FPS.
Here is my basic approach:
Render my background and all objects on my JPanel
Create a black BufferedImage (fogofwarBI)
Work out which areas of my map need to be visible
Set the relevant pixels on my fogofwarBI to be fully transparent
Render my fogofwarBI, thus covering parts of the screen with black and in transparent sections allowing the background and objects to be seen.
For initialising the buffered image I have done the following in my FogOfWar() class:
private BufferedImage blackBI = loader.loadImage("/map_black_2160x1620.png");
private BufferedImage fogofwarBI = new BufferedImage(blackBI.getWidth(), blackBI.getHeight(), BufferedImage.TYPE_INT_ARGB);
public FogOfWar() {
    fogofwarBI.getGraphics().drawImage(blackBI, 0, 0, null);
}
In each of my attempts I start the character in the middle of 'visible' terrain, i.e. in a section of my map which has no fog (where my fogofwarBI will have fully transparent pixels).
Attempt 1: setRGB
First I find the 'new' coordinates in my character's field of vision if it has moved, i.e. not every pixel within the character's range of sight, but just the pixels at the edge of his range of vision in the direction he is moving. This is done with a for loop and covers up to 400 or so pixels.
I feed each of these x and y coordinates into my FogOfWar class.
I check whether these x,y coordinates are already visible (in which case I don't bother doing anything with them, to save time). I do this check by maintaining a Set of Lists, where each List contains two elements (an x and a y value) and the Set is the unique collection of those coordinate Lists. The Set begins empty, and I add x,y coordinates to it to represent transparent pixels. I use the Set to keep the collection unique and because I understand its contains method is a fast way of doing this check, and I store the coordinates in a List to avoid mixing up x and y.
If a given x,y position on my fogofwarBI is not currently visible, I set its RGB value to transparent using .setRGB and add the coordinate to my transparentPoints Set so it will not be edited again in future.
Set<List<Integer>> transparentPoints = new HashSet<List<Integer>>();
public void editFog(int x, int y) {
    if (!transparentPoints.contains(Arrays.asList(x, y))) {
        fogofwarBI.setRGB(x, y, 0); // 0 is fully transparent in ARGB
        transparentPoints.add(Arrays.asList(x, y));
    }
}
I then render it using
public void render(Graphics g, Camera camera) {
    g.drawImage(fogofwarBI, 0, 0, Game.v_WIDTH, Game.v_HEIGHT,
            camera.getX() - Game.v_WIDTH/2, camera.getY() - Game.v_HEIGHT/2,
            camera.getX() + Game.v_WIDTH/2, camera.getY() + Game.v_HEIGHT/2, null);
}
Where I am basically applying the correct part of my fogofwarBI to my JPanel (800*600) based on where my game camera is.
Results:
Works correctly.
FPS of 20-30 when moving through fog, otherwise normal (250-300).
This method is slow due to the .setRGB function being run up to 400 times each time my game 'ticks'.
Attempt 2: Raster
In this attempt I create a raster of my fogofwarBI to play with the pixels directly in an array format.
private BufferedImage blackBI = loader.loadImage("/map_black_2160x1620.png");
private BufferedImage fogofwarBI = new BufferedImage(blackBI.getWidth(), blackBI.getHeight(), BufferedImage.TYPE_INT_ARGB);
WritableRaster raster = fogofwarBI.getRaster();
DataBufferInt dataBuffer = (DataBufferInt)raster.getDataBuffer();
int[] pixels = dataBuffer.getData();
public FogOfWar() {
    fogofwarBI.getGraphics().drawImage(blackBI, 0, 0, null);
}
My editFog method then looks like this:
public void editFog(int x, int y) {
    if (!transparentPoints.contains(Arrays.asList(x, y))) {
        pixels[x + y * Game.m_WIDTH] = 0; // 0 is fully transparent in ARGB
        transparentPoints.add(Arrays.asList(x, y));
    }
}
My understanding is that the raster is in (constant?) communication with the pixels array, and so I render the BI in the same way as in attempt 1.
Results:
Works correctly.
A constant FPS of around 15.
I believe it is constantly this slow (regardless of whether my character is moving through fog or not) because whilst manipulating the pixels array is quick, the raster is constantly working.
Attempt 3: Smaller Raster
This is a variation on attempt 2.
I read somewhere that constantly resizing a BufferedImage using the 10-argument version of .drawImage is slow. I also thought that having a raster for a 2160*1620 BufferedImage might be slow.
Therefore I tried having my 'fog layer' only equal to the size of my view (800*600), and updating every pixel using a for loop, based on whether the current pixel should be black or visible from my standard transparentPoints Set and based on my camera position.
So now my editFog method just updates the Set of transparent pixels, and my render method looks like this:
public void render(Graphics g, Camera camera) {
    int xOffset = camera.getX() - Game.v_WIDTH/2;
    int yOffset = camera.getY() - Game.v_HEIGHT/2;
    for (int i = 0; i < Game.v_WIDTH; i++) {
        for (int j = 0; j < Game.v_HEIGHT; j++) {
            if (transparentPoints.contains(Arrays.asList(i + xOffset, j + yOffset))) {
                pixels[i + j * Game.v_WIDTH] = 0;
            } else {
                pixels[i + j * Game.v_WIDTH] = myBlackARGB;
            }
        }
    }
    g.drawImage(fogofwarBI, 0, 0, null);
}
So I am no longer resizing my fogofwarBI on the fly, but I am updating every single pixel every time.
Result:
Works correctly.
FPS: Constantly 1 FPS - worst result yet!
I guess that any savings of not resizing my fogofwarBI and having it smaller are massively outweighed by updating 800*600 pixels in the raster rather than around 400.
I have run out of ideas and none of my internet searching is getting me any further in trying to do this in a better way. I think there must be a way to do fog of war effectively, but perhaps I am not yet familiar enough with Java or the available tools.
Any pointers as to whether my current attempts could be improved, or whether I should be trying something else altogether, would be very much appreciated.
Thanks!
This is a good question. I am not familiar with AWT/Swing rendering, so I can only try to explain a possible solution to the problem.
From a performance standpoint, I think it is a better choice to chunk the FOW into bigger sections of the map rather than using a pixel-based system. That will reduce the number of checks per tick, and updating will also take fewer resources, as only a small portion of the window/map needs to update. The larger the grid, the fewer the checks, but there is a visual penalty the bigger you go.
Leaving it like that would make the FOW look blocky/pixelated, but it's not something you can't fix.
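A minimal sketch of such a grid, reusing the Game and Camera names from the question (CELL_SIZE and Game.m_HEIGHT are assumptions; bigger cells mean fewer checks but a blockier look):

import java.awt.Color;
import java.awt.Graphics;

// One boolean per fog cell instead of one state per pixel.
private static final int CELL_SIZE = 32;
private final boolean[][] revealed =
        new boolean[Game.m_WIDTH / CELL_SIZE][Game.m_HEIGHT / CELL_SIZE];

public void reveal(int worldX, int worldY) {
    revealed[worldX / CELL_SIZE][worldY / CELL_SIZE] = true;
}

public void render(Graphics g, Camera camera) {
    int xOffset = camera.getX() - Game.v_WIDTH / 2;
    int yOffset = camera.getY() - Game.v_HEIGHT / 2;
    // Draw a black rectangle only for cells that are still hidden.
    g.setColor(Color.BLACK);
    for (int cx = 0; cx < revealed.length; cx++) {
        for (int cy = 0; cy < revealed[cx].length; cy++) {
            if (!revealed[cx][cy]) {
                g.fillRect(cx * CELL_SIZE - xOffset, cy * CELL_SIZE - yOffset,
                        CELL_SIZE, CELL_SIZE);
            }
        }
    }
}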
For the player's direct surroundings, you can add a circle texture with the player at its center. You can then use blending (I believe the term in AWT/Swing is a composite) to 'override' the alpha where the circle overlaps the FOW texture. This way the pixel-based updating is done by the rendering API, which usually uses hardware-accelerated methods for these things. (For custom pixel-based rendering, something like shader programs is often used, if supported by the rendering API.)
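A sketch of that composite idea against the question's fogofwarBI (playerX, playerY and the radius are assumptions; AlphaComposite.DstOut erases the destination alpha wherever the opaque circle is drawn):

import java.awt.AlphaComposite;
import java.awt.Color;
import java.awt.Graphics2D;

// Punch a transparent circle of vision into the fog image around the player,
// letting the rendering pipeline do the per-pixel work instead of setRGB calls.
Graphics2D g2 = fogofwarBI.createGraphics();
g2.setComposite(AlphaComposite.DstOut);
g2.setColor(Color.WHITE);                 // any fully opaque colour; only its alpha matters here
int radius = 100;                         // placeholder vision radius
g2.fillOval(playerX - radius, playerY - radius, 2 * radius, 2 * radius);
g2.dispose();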
This is enough if you only need temporary vision in the FOW (if you don't need to 'remember' the map); you don't even need a texture grid for the FOW then. But I suspect you do want to 'remember' the map, so in that case:
The blocky/pixelated look can be fixed the way it is done with grid-based terrain: basically, add small additional textures/shapes based on the surroundings to make things look nice. The link below provides good examples and a detailed explanation of how to do these 'terrain transitions', as they are called.
https://www.gamedev.net/articles/programming/general-and-gameplay-programming/tilemap-based-game-techniques-handling-terrai-r934/
I hope this gives a better result. If you cannot get a better result, I would advise switching to something like OpenGL for the render engine, as it is meant for games, while the AWT/Swing API is primarily used for UI/application rendering.
I am trying to program a visualisation of the Mandelbrot set in Java, and there are a couple of things that I am struggling to program. I realize that questions around this topic have been asked a lot and there is a lot of documentation online, but a lot of it seems very complicated and I am relatively new to programming.
The first issue
The first issue I have is to do with zooming in on the fractal. My goal is to make an "infinite" zoom on the fractal (of course not infinite, as far as a regular computer allows it regarding calculation time and precision). The approach I am currently going for is the following on a timer:
Draw the set using some number of iterations on the range (-2, 2) on the real axis and (-2, 2) on the imaginary axis.
Change those ranges to zoom in.
Redraw that section of the set with the number of iterations.
It's the second step that I struggle with. This is my current code:
for (int Py = beginY; Py < endY; Py++) {
    for (int Px = beginX; Px < endX; Px++) {
        double x0 = map(Px, 0, height, -2, 2);
        double y0 = map(Py, 0, width, -2, 2);
Px and Py are the coordinates of the pixels in the image. The image is 1000x1000. The map function takes a number, in this case Px or Py, with a range of (0, 1000) and divides it evenly over the range (-2, 2), so it returns the corresponding value in that range.
I think that in order to zoom in, I'll have to change the -2 and 2 values in some way in the timer, but whatever I try, it doesn't seem to work. The zoom always ends up slowing down after a while, or it ends up zooming in on a part that is inside the set, rather than on the border. I tried multiplying the values by some scale factor every timer tick, but that doesn't really produce the result I was looking for.
Now I have two questions about this issue.
Is this the right approach to visualizing the set and zooming in(draw, change range, redraw)?
If it is, how do I zoom in properly on an area that is interesting and that will keep zooming in properly even after running for a minute?
The second issue
Of course when visualizing something, you need to get some actual visual thing. In this case I want to color the set in a way similar to what you see here: (https://upload.wikimedia.org/wikipedia/commons/f/fc/Mandel_zoom_08_satellite_antenna.jpg).
My guess is that you have to use the number of iterations a pixel went through before breaking out of the loop to give it some color value. However, I only really know how to do this with a black and white color scheme. I tried making a color array that holds as many different shades of gray as the maximum number of iterations, starting from black and ending in white. Here is my code:
Color[] colors = new Color[maxIterations + 2];
for (int i = 0; i < colors.length; i++) {
    colors[i] = new Color((int) map(i, 0, maxIterations + 2, 0, 255),
                          (int) map(i, 0, maxIterations + 2, 0, 255),
                          (int) map(i, 0, maxIterations + 2, 0, 255));
}
I then just looked up the color for the number of iterations in the array and assigned that color to the pixel. I have two questions about this:
Will this also work as we zoom into the fractal in the previously described manner?
How can I add my own color scheme in this, like in the picture? I've read some things about "linear interpolation" but I don't really understand what it is and in what way it can help me.
It sounds like you've made a good start.
Re the first issue: I believe there are ways to automatically choose an "interesting" portion of the set to zoom in on, but I don't know what they are. And I'm quite sure it involves more than just applying some linear function to your current bounding rectangle, which is what it sounds like you're doing.
So you could try to find out what these methods are (might get mathematically complicated), but if you're new to programming, you'll probably find it easier to let the user choose where to zoom. This is also more fun in the beginning, since you can run your program repeatedly and explore a new part of the set each time.
A simple way to do this is to let the user draw a rectangle over the image, and use your map function to convert the pixel coordinates of the drawn rectangle to the new real and imaginary coordinates of your zoom area.
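For example, a sketch of that conversion (rect is assumed to be the user-drawn java.awt.Rectangle in image coordinates, and realMin/realMax/imagMin/imagMax replace the hard-coded -2 and 2 as the current zoom bounds):

// Linear remap helper, as described in the question.
static double map(double v, double inMin, double inMax, double outMin, double outMax) {
    return outMin + (v - inMin) * (outMax - outMin) / (inMax - inMin);
}

// Turn the selected pixel rectangle into new bounds on the complex plane.
double newRealMin = map(rect.x,               0, width,  realMin, realMax);
double newRealMax = map(rect.x + rect.width,  0, width,  realMin, realMax);
double newImagMin = map(rect.y,               0, height, imagMin, imagMax);
double newImagMax = map(rect.y + rect.height, 0, height, imagMin, imagMax);
// Redraw with x0 = map(Px, 0, width, newRealMin, newRealMax) and
// y0 = map(Py, 0, height, newImagMin, newImagMax).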
You could also combine both approaches: once you've found somewhere you find interesting by manually selecting the zoom area, you can set this as your "final destination", and have the code gradually and smoothly zoom into it, to create a nice movie.
It will always get gradually slower though, as you start using ever more precise coordinates, until you reach the limits of precision with double and it becomes a pixellated mess. From there, if you want to zoom further, you'll have to look into arbitrary-precision arithmetic with BigDecimal - and it will continue to get slower and slower.
Re the second issue: starting off by calculating a value of numIterations / maxIterations (i.e. between 0 and 1) for each pixel is the right idea (I think this is basically what you're doing).
From there, there are all sorts of ways to convert this value to a colour, it's time to get creative!
A simple one is to have an array of a few very different colours. E.g. if you had white (0.0), red (0.25), green (0.5), blue (0.75), black (1.0), then if your calculated number was exactly one of the ones listed, you'd use the corresponding colour. If it's somewhere between, you blend the colours, e.g. for 0.3 you'd take:
((0.5-0.3)*red + (0.3-0.25)*green) / (0.5 - 0.25)
= 0.8*red + 0.2*green
Taking a weighted average of two colours is something I'll leave as an exercise ;)
(hint: take separate averages of the r, g, and b values. Playing with the alpha values could maybe also work).
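If you want a reference for that hint, a minimal per-channel blend could look like this (a sketch; t is the position between the two colour stops, in [0, 1]):

import java.awt.Color;

// Linear blend of two colours, channel by channel.
static Color blend(Color a, Color b, double t) {
    int r  = (int) Math.round(a.getRed()   * (1 - t) + b.getRed()   * t);
    int g  = (int) Math.round(a.getGreen() * (1 - t) + b.getGreen() * t);
    int bl = (int) Math.round(a.getBlue()  * (1 - t) + b.getBlue()  * t);
    return new Color(r, g, bl);
}

For the 0.3 example above, t = (0.3 - 0.25) / (0.5 - 0.25) = 0.2, so blend(red, green, 0.2) gives the 0.8*red + 0.2*green mix.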
Another one, if you want to get more mathsy, is to take an equation for a spiral and use it to calculate a point on a plane in HSB colour space (you can keep the brightness at some fixed value, say 1). In fact, any curve in 2D or 3D which you know how to write as an equation of one real variable can be used this way to give you smoothly changing colours, if you interpret the coordinates as points in some colour space.
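For instance, a sketch of the colour-space-curve idea using Java's built-in HSB support (the constants are arbitrary choices):

import java.awt.Color;

// Map t in [0, 1] onto a simple curve in HSB space with fixed brightness.
static Color hsbColor(double t) {
    float hue = (float) (0.7 * t);               // sweep part of the hue circle
    float saturation = (float) (0.4 + 0.6 * t);  // move outward as t grows
    return Color.getHSBColor(hue, saturation, 1.0f);
}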
Hope that's enough to keep you going! Let me know if it's not clear.
I'm trying to figure out how to use my thermal sensor to change colors to an overlay that I have over the Android camera. The problem is that the data I get back is in a 16x4 array. How do I resize this 16x4 grid to a different resolution? Such as 32x8, 48x12...etc.
Edit:
For instance, I have this as my draw method:
public void onDraw(Canvas canvas) {
    super.onDraw(canvas);
    for (int x = 0; x < 4; x++) {
        for (int y = 0; y < 16; y++) {
            // mapping the 2D array to the 1D sensor array
            int index = x * GRID_WIDTH + y;
            tempVal = currentTemperatureValues.get(index);
            // 68x68 bitmap squares to display temperature data
            if (tempVal >= 40.0)
                bitmaps[x][y].eraseColor(Color.RED);
            else if (tempVal < 40.0 && tempVal > 35.0)
                bitmaps[x][y].eraseColor(Color.YELLOW);
            else
                bitmaps[x][y].eraseColor(Color.BLUE);
        }
    }
    combinedBitmap = mergeBitmaps();
    // combinedBitmap = fastblur(combinedBitmap, 45);
    paint.setAlpha(alphaValue);
    canvas.drawBitmap(combinedBitmap, xBitmap, yBitmap, paint);
    Log.i(TAG, "Done drawing");
}
The current implementation is to draw to a 16x4 overlay over my camera preview, but resolution is very low, and I'd like to improve it the best I can.
The Bitmap class in the Android API (that's what I'm assuming you're using) has a static method called createScaledBitmap: http://developer.android.com/reference/android/graphics/Bitmap.html#createScaledBitmap%28android.graphics.Bitmap,%20int,%20int,%20boolean%29
This method accepts an already created Bitmap; you specify the final width and height dimensions, as well as a boolean flag called filter. Setting filter to false uses nearest-neighbour interpolation, while true uses bilinear interpolation.
As an example, given that you have a 2D array of Bitmaps, you could resize one like so:
Bitmap resize = Bitmap.createScaledBitmap(bitmaps[x][y], 32, 8, true);
The first parameter is the Bitmap you want resized, the second parameter is the width, the third parameter the height, and the last is the filter flag. The output (of course) is stored in resize and is your resized/scaled image. Currently, the Javadoc for this method (as you can see) provides no explanation of what filter does; I had to look at the Android source to figure out exactly what it was doing, and I also know from experience, as I have used the method before.
Generally, you set filter to false if you are shrinking the image, and to true if you are upscaling it. The reason is that when you interpolate an image from small to large, you are trying to create more information than was initially available. Doing this with nearest neighbour introduces blocking artifacts, so bilinear interpolation helps smooth them out. Going from large to small has no noticeable artifacts with either method, so you generally choose nearest neighbour as it's more computationally efficient. There will obviously be blurring as you resize to a larger image: the larger you go, the more blurriness you get, but that beats the blockiness you get with nearest neighbour.
For using just the Android API, this is the best and easiest solution you can get. If you want to get into more sophisticated interpolation techniques (Cubic, Lanczos, etc...), unfortunately you will have to implement that yourself. Try bilinear first and see what you get.
According to my research, the Canny edge detector is very useful for detecting the edges of an image. After putting a lot of effort into it, I found that an OpenCV function can do that, which is
Imgproc.Canny(Mat image, Mat edges, double threshold1, double threshold2)
But for the low and high thresholds, I know that different images need different thresholds. Is there a fast adaptive method that can automatically assign the low and high thresholds for each image?
This is relatively easy to do. Check out this older SO post on the subject.
A quick way is to compute the mean and standard deviation of the current image and apply +/- one standard deviation to the image.
The example in C++ would be something like:
Mat img = ...;
Scalar mu, sigma;
meanStdDev(img, mu, sigma);
Mat edges;
Canny(img, edges, mu.val[0] - sigma.val[0], mu.val[0] + sigma.val[0]);
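With OpenCV4Android / Java (the question's setting), the same idea could look roughly like this (a sketch; img is assumed to be a single-channel grayscale Mat):

import org.opencv.core.Core;
import org.opencv.core.Mat;
import org.opencv.core.MatOfDouble;
import org.opencv.imgproc.Imgproc;

// Mean +/- one standard deviation as the Canny thresholds.
MatOfDouble mu = new MatOfDouble();
MatOfDouble sigma = new MatOfDouble();
Core.meanStdDev(img, mu, sigma);
double mean = mu.get(0, 0)[0];
double std = sigma.get(0, 0)[0];
Mat edges = new Mat();
Imgproc.Canny(img, edges, Math.max(0, mean - std), mean + std);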
Another method is to compute the median of the image and target a ratio above and below the median (e.g., 0.66*medianValue and 1.33*medianValue).
Hope that helps!
OpenCV has an adaptive threshold function.
With OpenCV4Android it is like this:
Imgproc.adaptiveThreshold(src, dst, maxValue, adaptiveMethod, thresholdType, blockSize, C);
An example:
Imgproc.adaptiveThreshold(mInput, mInput, 255, Imgproc.ADAPTIVE_THRESH_MEAN_C, Imgproc.THRESH_BINARY_INV, 15, 4);
As for how to choose the parameters, you have to read the docs for more details. Choosing the right threshold for each image is a whole different question.