Reading Content of image or Image processing (Indoor Navigation) [duplicate] - java

This question already has answers here:
Floor Plan Edge Detection - Image Processing?
(3 answers)
Closed 7 years ago.
I am making a small indoor navigation application for my project. The main idea behind my application is that I will be given a PDF or AutoCAD file (a floor plan) for some area. I have to parse that image, or otherwise extract data from it, to find the open paths in the floor plan.
To determine the open paths from an image, I also have to map the image content into some data structure so that I can apply path-finding algorithms to it.
My problem is that I don't know how to break my image into pixels, or any other form, to get data from it in this initial phase. Do I need to apply some image processing using MATLAB, or can this be achieved with Java or Python libraries?

This is a rather broad question, so I can only give hints on some of the relevant points.
How to read single pixels from an image in Java:
BufferedImage bi = ImageIO.read(new File("pathToYourImage"));
bi.getRGB(0 , 0);
This way you can load an image in Java and get the value of a single pixel ((0, 0) in the example).
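The value returned by getRGB(x, y) is a packed ARGB integer; if you need the individual channels, they can be unpacked with bit shifts (a small sketch continuing the snippet above):
int argb = bi.getRGB(0 , 0);
// getRGB returns a packed ARGB int; unpack the channels with bit shifts:
int alpha = (argb >> 24) & 0xFF;
int red   = (argb >> 16) & 0xFF;
int green = (argb >> 8)  & 0xFF;
int blue  = argb & 0xFF;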
Data structure: The most common way of representing a floor plan, or any other kind of collection of paths, is a graph. There are several good graph libraries available, or you can implement one on your own.
Image processing: Since the image won't be black and white (I guess), you'll have to transform it before converting it into a graph - though the conversion to a graph isn't even strictly necessary. The most common approach is to simply convert the image into a b/w image where black pixels are walls. Since the pixels representing the floor might not all have exactly the same color, I added some imprecision (delta) to the comparison:
//comparison function
boolean isMatch(Color inp , Color toMatch)
{
    final int delta = 25;
    return (Math.abs(inp.getRed() - toMatch.getRed()) <= delta &&
            Math.abs(inp.getBlue() - toMatch.getBlue()) <= delta &&
            Math.abs(inp.getGreen() - toMatch.getGreen()) <= delta);
}

//color of pixels that don't represent obstacles
Color floor = getFloorColor();

//create a copy of the image for the transformation
BufferedImage floorPlan = new BufferedImage(getFloorPlan().getWidth() ,
        getFloorPlan().getHeight() , BufferedImage.TYPE_INT_RGB);
floorPlan.getGraphics().drawImage(getFloorPlan() , 0 , 0 ,
        floorPlan.getWidth() , floorPlan.getHeight() , null);

//color pixels that aren't walls or other obstacles white and obstacles/walls black
for(int i = 0 ; i < floorPlan.getWidth() ; i++)
    for(int j = 0 ; j < floorPlan.getHeight() ; j++)
        if(isMatch(new Color(floorPlan.getRGB(i , j)) , floor))
            floorPlan.setRGB(i , j , Color.WHITE.getRGB());
        else
            floorPlan.setRGB(i , j , Color.BLACK.getRGB());
This image can now easily be either transformed into a graph, or used directly as representation of the graph.
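One straightforward representation for path finding is a grid graph in which each white pixel is a walkable node. A minimal sketch of this (assuming the floorPlan image produced above; buildWalkableGrid is just a hypothetical helper name):
// Build a boolean grid from the b/w floor plan: true = walkable floor, false = wall.
boolean[][] buildWalkableGrid(BufferedImage floorPlan)
{
    boolean[][] walkable = new boolean[floorPlan.getWidth()][floorPlan.getHeight()];
    for(int i = 0 ; i < floorPlan.getWidth() ; i++)
        for(int j = 0 ; j < floorPlan.getHeight() ; j++)
            walkable[i][j] = (floorPlan.getRGB(i , j) == Color.WHITE.getRGB());
    return walkable;
}
Each walkable cell and its 4-neighbours then form the nodes and edges of an implicit graph, so BFS, Dijkstra or A* can run on it directly without building an explicit node/edge structure.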

Related

Background removal from images with OpenCV in Android

I want to remove the image background with OpenCV in Android. The code is working, but the output quality is not up to expectation. I followed the Java documentation for reference:
https://opencv-java-tutorials.readthedocs.io/en/latest/07-image-segmentation.html
Thanks
original Image
My output
Expected output
My code snippet in Android:
private fun doBackgroundRemoval(frame: Mat): Mat? {
    // init
    val hsvImg = Mat()
    val hsvPlanes: List<Mat> = ArrayList()
    val thresholdImg = Mat()
    var thresh_type = Imgproc.THRESH_BINARY_INV
    thresh_type = Imgproc.THRESH_BINARY

    // threshold the image with the average hue value
    hsvImg.create(frame.size(), CvType.CV_8U)
    Imgproc.cvtColor(frame, hsvImg, Imgproc.COLOR_BGR2HSV)
    Core.split(hsvImg, hsvPlanes)

    // get the average hue value of the image
    val threshValue: Double = getHistAverage(hsvImg, hsvPlanes[0])
    threshold(hsvPlanes[0], thresholdImg, threshValue, 78.0, thresh_type)
    Imgproc.blur(thresholdImg, thresholdImg, Size(1.toDouble(), 1.toDouble()))

    val kernel1 =
        Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, Size(11.toDouble(), 11.toDouble()))
    val kernel2 = Mat.ones(3, 3, CvType.CV_8U)

    // dilate to fill gaps, erode to smooth edges
    Imgproc.dilate(thresholdImg, thresholdImg, kernel1, Point(-1.toDouble(), -1.toDouble()), 1)
    Imgproc.erode(thresholdImg, thresholdImg, kernel2, Point(-1.toDouble(), -1.toDouble()), 7)
    threshold(thresholdImg, thresholdImg, threshValue, 255.0, Imgproc.THRESH_BINARY_INV)

    // create the new image
    val foreground = Mat(
        frame.size(), CvType.CV_8UC3, Scalar(
            255.toDouble(),
            255.toDouble(),
            255.toDouble()
        )
    )
    frame.copyTo(foreground, thresholdImg)

    val img_bitmap =
        Bitmap.createBitmap(foreground!!.cols(), foreground!!.rows(), Bitmap.Config.ARGB_8888)
    Utils.matToBitmap(foreground!!, img_bitmap)
    imageView.setImageBitmap(img_bitmap)
    return foreground
}
The task, as you have seen, is not trivial at all. OpenCV has a segmentation algorithm called "GrabCut" that tries to solve exactly this problem. The algorithm is pretty good at classifying background and foreground pixels; however, it needs very specific information to work. It can operate in two modes:
1st Mode (Mask Mode): Using a binary mask (same size as the original input) where 100% definite background pixels are marked, as well as 100% definite foreground pixels. You don't have to mark every pixel on the image, just a region where you are sure the algorithm will find either class of pixels.
2nd Mode (Foreground ROI): Using a bounding box that encloses 100% definite foreground pixels.
Now, I use the notation "100% definite" to label those pixels you are 100% sure correspond to either the background or the foreground. The algorithm classifies pixels into four possible classes: "Definite Background", "Probable Background", "Definite Foreground" and "Probable Foreground". It will predict both Probable Background and Probable Foreground pixels, but it needs a priori information about where to find at least "Definite Foreground" pixels.
With that said, we can use GrabCut in its 2nd mode (Rectangle ROI) to try and segment the input image. We can first get a rough binary mask of the input; this marks where we are sure the algorithm can find foreground pixels. We will feed this rough mask to the algorithm and check out the results. Now, the method is not easy and its automation is not straightforward; there is some manually set information that works particularly well for this input image. I don't know the Java implementation of OpenCV, so I'm giving you the solution in Python. Hopefully you will be able to port it. This is the general outline of the algorithm:
Get a first rough mask of the foreground object via thresholding
Detect contours on the rough mask to retrieve a bounding rectangle
The bounding rectangle will serve as input ROI for the GrabCut algorithm
Set the parameters needed for the GrabCut algorithm
Clean the segmentation mask obtained by GrabCut
Use the segmentation mask to finally segment the foreground object
This is the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "backgroundTest.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# (Optional) Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Adaptive Thresholding
windowSize = 31
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayscaleImage, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
The first step is to get the rough foreground mask using adaptive thresholding. Here, I've used the ADAPTIVE_THRESH_MEAN_C method, where the (local) threshold value is the mean of a neighborhood area of the input image. This yields the following image:
It's pretty rough, right? We can clean this up a little bit using some morphology. I use a Closing with a rectangular kernel of size 3 x 3 and 10 iterations to join the big blobs of white pixels. I've wrapped the OpenCV functions inside custom functions that save me the typing of some lines. These helper functions are presented at the end of this post. For now, this step is as follows:
# Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 10 iterations
binaryImage = morphoOperation(binaryImage, 3, 10, "Closing")
This is the rough mask after filtering:
A little bit better. Ok, we can now search for the bounding box of the biggest contour. A search for the outer contours via cv2.RETR_EXTERNAL will suffice for this example, as we can safely ignore children contours, like this:
# Find the EXTERNAL contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# This list will store the target bounding box
maskRect = []
Additionally, let's get a list ready where we will store the target bounding rectangle. Let's now search on the detected contours. I've also implemented an area filter in case some noise is present, so the pixels below a certain area threshold are ignored:
# Look for the outer bounding boxes (no children):
for i, c in enumerate(contours):
    # Get blob area:
    currentArea = cv2.contourArea(c)
    # Get the bounding rectangle:
    boundRect = cv2.boundingRect(c)
    # Set a minimum area
    minArea = 1000
    # Look for the target contour:
    if currentArea > minArea:
        # Found the target bounding rectangle:
        maskRect = boundRect
        # (Optional) Draw the rectangle on the input image:
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
        # (Optional) Set color and draw:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2 )
        # (Optional) Show image:
        cv2.imshow("Bounding Rectangle", inputImageCopy)
        cv2.waitKey(0)
Optionally you can draw the bounding box found by the algorithm. This is the resulting image:
It is looking good. Note that some obvious background pixels are also enclosed by the ROI. GrabCut will try to re-classify these pixels into their proper class, i.e., "Definitive Background". Alright, let's prepare the data for GrabCut:
# Create mask for Grab n Cut,
# The mask is a uint8 type, same dimensions as
# original input:
mask = np.zeros(inputImage.shape[:2], np.uint8)
# Grab n Cut needs two empty matrices of
# Float type (64 bits) and size 1 (rows) x 65 (columns):
bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)
We need to prepare three matrices/numpy arrays/whatever data type is used to represent images in Java. The first is where the segmentation mask obtained by GrabCut will be stored. This mask will have values from 0 to 3 to denote the class of each pixel on the original input. The bgModel and fgModel matrices are used internally by the algorithm to store the statistical model of the foreground and background. Be aware that both of these matrices are float matrices. Lastly, GrabCut is an iterative algorithm. It will run for n iterations. Ok, Let's run GrabCut:
# Run Grab n Cut on INIT_WITH_RECT mode:
grabCutIterations = 5
mask, bgModel, fgModel = cv2.grabCut(inputImage, mask, maskRect, bgModel, fgModel, grabCutIterations, mode=cv2.GC_INIT_WITH_RECT)
Ok, the classification is done. You can try and convert mask to an (image) visible type to check out the labels of each pixel. This is optional, but should you wish to do so, you'd get 4 matrices. Each one for each class. For example, for the "Definitive Background" class, GrabCut found these are the pixels belonging to such class (in white):
The pixels belonging to the "Probable Background" class are these:
That's pretty good, huh? Here are the pixels belonging to the "Probable Foreground" class:
Very nice. Let's create the final segmentation mask, because mask is not an image; it is just an array containing a label for each pixel. We will use the Definite Background and Probable Background pixels to set the final mask; we can then "normalize" the data range and convert it to uint8 to obtain an actual image:
# Set all definite background (0) and probable background pixels (2)
# to 0 while definite foreground and probable foreground pixels are
# set to 1
outputMask = np.where((mask == cv2.GC_BGD) | (mask == cv2.GC_PR_BGD), 0, 1)
# Scale the mask from the range [0, 1] to [0, 255]
outputMask = (outputMask * 255).astype("uint8")
This is the actual segmentation mask:
Alright, we can clean this image up a little bit, because there are some small holes produced by misclassifying foreground pixels as background pixels. Let's apply just another morphological closing, this time using 5 iterations:
# (Optional) Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 5 iterations:
outputMask = morphoOperation(outputMask, 3, 5, "Closing")
Finally, use this outputMask in an AND with the original image to produce the final segmented result:
# Apply a bitwise AND to the image using our mask generated by
# GrabCut to generate the final output image:
segmentedImage = cv2.bitwise_and(inputImage, inputImage, mask=outputMask)
cv2.imshow("Segmented Image", segmentedImage)
cv2.waitKey(0)
This is the final result:
If you need transparency on this image, it is very straightforward to use outputMask as an alpha channel. This is the helper function I used earlier:
# Applies a morpho operation:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform Operation:
    if opString == "Closing":
        op = cv2.MORPH_CLOSE
    else:
        print("Morpho Operation not defined!")
        return None
    outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    return outImage
The Code
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank

img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))
cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
The Output
The Explanation
Import the necessary libraries:
import cv2
import numpy as np
Define a function to process an image to make it fit for proper contour detection. In the function, first convert the image to grayscale, and then detect its edges using the Canny edge detector. With the edges detected, we can dilate and erode them once to give the edges more body:
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)
Define a function to generate a mask for the image. After finding the contours of the image, define a grayscale blank image with the shape of the image, and draw every contour (of area greater than 500, to filter out noise) filled in onto the blank image. I also approximated the contours to smoothen things out a bit:
def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank
Finally, read in the image, and mask the image using the cv2.bitwise_and method, along with the get_mask function we defined, which uses the process function we defined. Show the masked image in the end:
img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))
cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
Transparent Background
Instead of the cv2.bitwise_and method, you can use the cv2.merge method:
img = cv2.imread("crystal.jpg")
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
cv2.imwrite("masked_crystal.png", img_masked)
Resulting image (screenshot):
Explanation:
Keeping in mind that we have already imported the cv2 module and the numpy module as np, and defined the process and get_mask functions, we can read in the image:
img = cv2.imread("crystal.jpg")
The cv2.split method takes in an image array and returns a list of every individual channel present in the image. In our case, we only have 3 channels, and in order to make the image transparent, we need a fourth channel: the alpha channel. The cv2.merge method does the opposite of cv2.split; it takes in a list of individual channels and returns an image array with those channels. So next we get the BGR channels of the image in a list, and concatenate the mask of the image as the alpha channel:
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
Lastly we can write the four channel image into a file:
cv2.imwrite("masked_crystal.png", img_masked)
Here are some more example of the cv2.merge method: Python cv2.merge() Examples

Java Advanced Imaging: How to get the ImageLayout from a huge image?

I have a couple of huge images which can't be loaded into memory as a whole. I know that the images are tiled, and all the methods in the class ImageReader give me plausible non-zero return values for
getTileGridXOffset(int),
getTileGridYOffset(int),
getTileWidth(int) and
getTileHeight(int).
My problem now is that I want to read one tile only, to avoid having to load the entire image into memory, using the ImageReader.readTile(int, int, int) method. But how do I determine what the valid values for the tile coordinates are?
There are the methods getNumXTiles() and getNumYTiles() in the interface RenderedImage, but all attempts to create a rendered image from the source result in an out of memory / Java heap space error.
The tile coordinates can theoretically be anything, and I tried readTile(0, -1, -1), which also works for a few images I tested.
I also tried to reach the metadata for those images but I didn't find any useful information regarding the image layout.
Is there anyone who can tell me how to get the values for the tile coordinates without having to read the entire image into memory? Is there another way which does not require an instance of ImageLayout?
Thank you very much for your assistance.
First of all, you should check that the ImageReader in question supports tiling for the given image, using the isImageTiled(imageIndex). If it doesn't, you can't expect useful values from the other methods.
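For reference, a minimal sketch of obtaining an ImageReader for a file and performing that check (error handling is omitted and the file name is just a placeholder):
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.io.File;
import java.util.Iterator;

ImageInputStream input = ImageIO.createImageInputStream(new File("hugeImage.tif"));
Iterator<ImageReader> readers = ImageIO.getImageReaders(input);
if (!readers.hasNext()) {
    throw new IllegalArgumentException("No reader for input");
}
ImageReader reader = readers.next();
reader.setInput(input);

int imageIndex = 0;
// Only if this returns true are the tile-related methods meaningful:
boolean tiled = reader.isImageTiled(imageIndex);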
Then, if it does, all tiles for a given image must be equal in size (though the tiles in the last column/row may be truncated). This is the case for all tiled file formats that I know of (e.g. TIFF). So, using this knowledge, the number of tiles in both dimensions can be calculated:
// Calculate number of x tiles/y tiles:
int cols = (int) Math.ceil(reader.getWidth(imageIndex) / (double) reader.getTileWidth(imageIndex));
int rows = (int) Math.ceil(reader.getHeight(imageIndex) / (double) reader.getTileHeight(imageIndex));
You can then, loop over the tile indexes (the first tile is always 0,0):
for (int row = 0; row < rows; row++) {
    for (int col = 0; col < cols; col++) {
        BufferedImage tile = reader.readTile(imageIndex, col, row);
        // ...do more processing...
    }
}
Or, if you only want to get a single tile, you obviously don't need the double for loops. :-)
Note: For ImageReaders/images that don't support tiling, the getTileWidth and getTileHeight methods will just return the same as getWidth and getHeight, respectively.
Also, the readTile API docs says:
If the arguments are out of range, an IllegalArgumentException is thrown. If the image is not tiled, the values 0, 0 will return the entire image; any other values will cause an IllegalArgumentException to be thrown.
This means your example, readTile(0, -1, -1), should always throw an IllegalArgumentException regardless of the tiling... I suspect some implementations may disregard the tile coordinates completely and give you the entire image anyway.
PS: The RenderedImage interface could in theory help you. But it would require a special implementation in the ImageReader. In most cases you will just get a normal BufferedImage (which implements RenderedImage), which is a single (1x1) tile.

Java bufferstrategy graphics or integer array

When doing 2D game development in Java, most tutorials create a bufferstrategy to render. This makes perfect sense.
However, where people seem to skew off is the method of drawing the actual graphics to the buffer.
Some of the tutorials create a buffered image, then create an integer array to represent the individual pixel colors.
private BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
private int[] pixels = ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
Graphics g = bs.getDrawGraphics();
g.setColor(new Color(0x556B2F));
g.fillRect(0, 0, getWidth(), getHeight());
g.drawImage(image, 0, 0, getWidth(), getHeight(), null);
However, some other tutorials don't create the buffered image or draw pixels into an int array; instead they use the Graphics object of the BufferStrategy to draw their images directly to the buffer.
Graphics g = bs.getDrawGraphics();
g.setColor(new Color(0x556B2F));
g.fillRect(0, 0, getWidth(), getHeight());
g.drawImage(testImage.image, x*128, y*128, 128, 128, null);
I was just wondering, why create the entire int array, then draw it. This requires a lot more work in implementing rectangles, stretching, transparency, etc. The graphics component of the buffer strategy already has methods which can easily be called.
Is there some huge performance boost of using the int array?
I've looked this up for hours, and all the sites I've seen just explain what they're doing, and not why they chose to do it that way.
Let's be clear about one thing: both snippets of code do exactly the same thing - draw an Image. The snippets are rather incomplete, however - the second snippet does not show what 'testImage.image' actually is or how it is created. But they both ultimately call Graphics.drawImage(), and all variants of drawImage() in either Graphics or Graphics2D draw an Image, plain and simple. In the second case we simply don't know if it is a BufferedImage, a VolatileImage or even a Toolkit image.
So there is no difference in drawing actually illustrated here!
There is but one difference between the two snippets - the first one also obtains a direct reference to the integer array that is internally backing the Image instance. This gives direct access to the pixel data, rather than having to go through the (Buffered)Image API of using, for example, the relatively slow getRGB() and setRGB() methods. The reason for doing that can't be made specific in the context of this question; the array is obtained but never actually used in the snippet. So in order to give the following explanation any reason to exist, we must assume that someone wants to directly read or edit the pixels of the image, quite possibly for optimization reasons given the "slowness" of the (Buffered)Image API for manipulating data.
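To illustrate what that direct access looks like (a minimal sketch, not taken from either tutorial; the dimensions are example values):
int width = 320, height = 240; // example dimensions
BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
int[] pixels = ((DataBufferInt) image.getRaster().getDataBuffer()).getData();

// Through the API: one method call per pixel, with bounds checking and color conversion.
image.setRGB(10, 20, 0xFF00FF00);

// Through the backing array: a plain array write, row-major layout (y * width + x).
pixels[20 * width + 10] = 0xFF00FF00;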
And those optimization reasons may be a premature optimization that can backfire on you.
First of all, this code only works because the type of the image is INT_RGB, which gives the image a DataBufferInt. If it had been another type of image, e.g. 3BYTE_BGR, this code would fail with a ClassCastException, since the backing data buffer wouldn't be a DataBufferInt. This may not be much of a problem when you only create images manually and enforce a specific type, but images tend to be loaded from files created by external tools.
Secondly, there is another bigger downside to directly accessing the pixel buffer: when you do that, Java2D will refuse acceleration of that image since it cannot know when you will be making changes to it outside of its control. Just for clarity: acceleration is the process of keeping an unaltered image in video memory rather than copying it from system memory each time it is drawn. This is potentially a huge performance improvement (or loss if you break it) depending on how many images you work with.
How can I create a hardware-accelerated image with Java2D?
(As that related question shows you: you should use GraphicsConfiguration.createCompatibleImage() to construct BufferedImage instances).
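For reference, obtaining such a compatible image looks roughly like this (a sketch; the size and transparency are placeholder choices):
import java.awt.GraphicsConfiguration;
import java.awt.GraphicsEnvironment;
import java.awt.Transparency;
import java.awt.image.BufferedImage;

GraphicsConfiguration gc = GraphicsEnvironment.getLocalGraphicsEnvironment()
        .getDefaultScreenDevice()
        .getDefaultConfiguration();

// An image whose layout matches the screen, so Java2D can keep it accelerated:
BufferedImage sprite = gc.createCompatibleImage(128, 128, Transparency.TRANSLUCENT);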
So in essence: try to use the Java2D API for everything, don't access buffers directly. This off-site resource gives a good idea just what features the API has to support you in that without having to go low level:
http://www.pushing-pixels.org/2008/06/06/effective-java2d.html
First of all, there are lots of historical aspects. Early API was very basic, so the only way to do anything non-trivial was to implement all required primitives.
Raw data access is a bit old-fashioned, and we can try to do some "archeology" to find the reason such an approach was used. I think there are two main reasons:
1. Filter effects
Let's not forget that filter effects (various kinds of blurs, etc.) are simple to implement, very important for any game developer, and widely used.
The simplest way to implement such an effect with Java 1 was to use an int array and a filter defined as a matrix. Herbert Schildt, for example, used to have lots of such demos:
public class Blur {

    private int width;
    private int height;
    private int[] imgpixels;    // source pixels, packed ARGB
    private int[] newimgpixels; // destination pixels

    public void convolve() {
        for (int y = 1; y < height - 1; y++) {
            for (int x = 1; x < width - 1; x++) {
                int rs = 0;
                int gs = 0;
                int bs = 0;
                // sum the 3 x 3 neighborhood per channel
                for (int k = -1; k <= 1; k++) {
                    for (int j = -1; j <= 1; j++) {
                        int rgb = imgpixels[(y + k) * width + x + j];
                        int r = (rgb >> 16) & 0xff;
                        int g = (rgb >> 8) & 0xff;
                        int b = rgb & 0xff;
                        rs += r;
                        gs += g;
                        bs += b;
                    }
                }
                rs /= 9;
                gs /= 9;
                bs /= 9;
                newimgpixels[y * width + x] = (0xff000000
                        | rs << 16 | gs << 8 | bs);
            }
        }
    }
}
Naturally, you can implement that using getRGB, but raw data access is far more efficient. Later, Graphics2D provided a better abstraction layer:
public interface BufferedImageOp
This interface describes single-input/single-output operations performed on BufferedImage objects. It is implemented by AffineTransformOp, ConvolveOp, ColorConvertOp, RescaleOp, and LookupOp. These objects can be passed into a BufferedImageFilter to operate on a BufferedImage in the ImageProducer-ImageFilter-ImageConsumer paradigm.
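For example, the 3 x 3 box blur above collapses to a few lines with ConvolveOp (a sketch; src is assumed to be an existing BufferedImage):
import java.awt.image.BufferedImage;
import java.awt.image.BufferedImageOp;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

// 3 x 3 box blur: every weight is 1/9, same as the manual averaging loop above.
float[] weights = new float[9];
java.util.Arrays.fill(weights, 1f / 9f);

BufferedImageOp blur = new ConvolveOp(new Kernel(3, 3, weights), ConvolveOp.EDGE_NO_OP, null);
BufferedImage blurred = blur.filter(src, null);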
2. Double buffering
Another problem was related to flickering and really slow drawing. Double buffering eliminates ugly flickering and all of a sudden it provides an easy way to do filtering effects, because you have buffer already.
Something like a final conclusion :)
I would say the situation you've described is pretty common for any evolving technology. There are two ways to achieve the same goals:
use the legacy approach, write more code, etc.
rely on new abstraction layers, the provided techniques, etc.
There are also some useful extensions to simplify your life even more, so no need to use int[] :)

Are there OpenStreetMap/MapQuest-hosted map tile attributes I am unaware of?

I am in the process of changing how I get my map images, from the Google Maps API to the MapQuest-hosted map tiles, which use OpenStreetMap data. I am switching from Google Maps because I hit the daily request limit, which I wasn't expecting, and I am not using the OpenStreetMap API because, although their data is free, their tile servers have a usage limit, and all I need is an image. Therefore, here I am using the MapQuest-hosted map tiles.
I think I understand it, but there are some things that I would like to be able to do and cannot find any documentation on. For example, I would like to have an image size of 500x300 if possible, or at least 512x512 (double the 256x256 that the tiles come out to be). I would also like to be able to display a marker. Is this possible?
I used this code found here to convert my latitude and longitude data into x and y coordinates:
public class slippy {
    public static void main(String[] args) {
        int zoom = 9;
        double lat = 42.8549;
        double lon = -78.863;
        System.out.println("http://otile1.mqcdn.com/tiles/1.0.0/map/" + getTileNumber(lat, lon, zoom) + ".png");
    }

    public static String getTileNumber(final double lat, final double lon, final int zoom) {
        int xtile = (int)Math.floor( (lon + 180) / 360 * (1<<zoom) ) ;
        int ytile = (int)Math.floor( (1 - Math.log(Math.tan(Math.toRadians(lat)) + 1 / Math.cos(Math.toRadians(lat))) / Math.PI) / 2 * (1<<zoom) ) ;
        return("" + zoom + "/" + xtile + "/" + ytile);
    }
}
I used this code to generate two links to a map of Buffalo, one with a zoom of 9 (here) and one with a zoom of 10 (here), and the center seems to differ. Is this a result of using open source data, or is there an attribute I could use?
Of course the center differs. From one zoom level to the next, one tile is "split" into four other tiles. Consequently the center of the single tile will be located at the corners of the four tiles. Using the mentioned formula you will always get the tile which contains your coordinates. But due to the nature of tiles it won't be necessarily at the center of the tile. For each specific coordinate there is only one tile at a given zoom level containing it. Hence the coordinate can be anywhere on the tile and not necessarily at the center.
Still, I'm not quite sure what you actually want to achieve. For displaying tiles (and markers), all you need to do is use Leaflet or OpenLayers (or any other library supporting the tiles concept).
And keep in mind that MapQuest also has terms of use.
Edit:
An alternative would be to use a WMS service instead of a TMS which does the resizing and concatenation of the tiles for you. With a WMS you just have to define a bounding box around your center and an image size. The resulting image will always be centered around the coordinates. The OSM wiki has a list of OSM WMS servers.
Don't forget to get informed about the usage policy of the WMS service you choose.
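As a rough illustration of the WMS idea: a GetMap request is just a URL carrying a bounding box and an image size. The endpoint and layer name below are placeholders, and the half-extents of the box in degrees are values you would tune for the coverage you want:
// Sketch of building a WMS GetMap URL centered on a coordinate (endpoint/layer are hypothetical).
double lat = 42.8549, lon = -78.863;
double dLon = 0.35, dLat = 0.20;  // half-extents in degrees, chosen for the desired coverage
int width = 500, height = 300;

String bbox = (lon - dLon) + "," + (lat - dLat) + "," + (lon + dLon) + "," + (lat + dLat);
String url = "https://example.org/wms?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
        + "&LAYERS=osm&STYLES=&SRS=EPSG:4326&FORMAT=image/png"
        + "&BBOX=" + bbox + "&WIDTH=" + width + "&HEIGHT=" + height;
System.out.println(url);
A marker would then have to be drawn onto the returned image yourself (e.g. with Graphics2D), since a plain GetMap request only returns the map raster.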

Comparing two images for motion detecting purposes

I've started differentiating two images by counting the number of different pixels using a simple algorithm:
private int returnCountOfDifferentPixels(String pic1, String pic2)
{
    Bitmap i1 = loadBitmap(pic1);
    Bitmap i2 = loadBitmap(pic2);
    int count = 0;

    for (int y = 0; y < i1.getHeight(); ++y)
        for (int x = 0; x < i1.getWidth(); ++x)
            if (i1.getPixel(x, y) != i2.getPixel(x, y))
            {
                count++;
            }

    return count;
}
However, this approach seems ineffective in its initial form, as there is always a very high number of pixels which differ, even in very similar photos.
I was thinking of a way to determine whether two pixels are really THAT different.
Android's bitmap.getPixel(x, y) returns a packed color int.
How can I implement a proper comparison between two pixel colors to help with my motion detection?
You are right, because of noise and other factors there is usually a lot of raw pixel change in a video stream. Here are some options you might want to consider:
Blurring the image first, ideally with a Gaussian filter or with a simple box filter. This just means that you take the (weighted) average over the neighboring pixels and the pixel itself. This should reduce the sensor noise quite a bit already.
Only adding the difference to count if it's larger than some threshold. This has the effect of only considering pixels that have really changed a lot. This is very easy to implement and might already solve your problem alone.
Thinking about it, try these two options first. If they don't work out, I can give you some more options.
EDIT: I just saw that you're not actually summing up differences but just counting differing pixels. This is fine if you combine it with option 2. Option 1 still works, but it might be overkill.
Also, to find out the difference between two colors, use the methods of the Color class:
int p1 = i1.getPixel(x, y);
int p2 = i2.getPixel(x, y);
int totalDiff = Color.red(p1) - Color.red(p2) + Color.green(p1) - Color.green(p2) + Color.blue(p1) - Color.blue(p2);
Now you can come up with a threshold the totalDiff must exceed to contribute to count.
Of course, you can play around with these numbers in various ways. The above code, for example, only computes changes in pixel intensity (brightness). If you also wanted to take into account changes in hue and saturation, you would have to compute totalDiff like this:
int totalDiff = Math.abs(Color.red(p1) - Color.red(p2)) + Math.abs(Color.green(p1) - Color.green(p2)) + Math.abs(Color.blue(p1) - Color.blue(p2));
Also, have a look at the other methods of Color, for example RGBToHSV(...).
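Putting the per-pixel threshold together with your original counting loop might look roughly like this (a sketch; the two threshold constants are placeholder values you would tune for your camera and lighting):
private boolean detectMotion(Bitmap i1, Bitmap i2)
{
    final int PIXEL_THRESHOLD = 30;   // per-pixel difference that counts as "changed"
    final int COUNT_THRESHOLD = 1000; // number of changed pixels that counts as motion
    int count = 0;

    for (int y = 0; y < i1.getHeight(); ++y)
        for (int x = 0; x < i1.getWidth(); ++x)
        {
            int p1 = i1.getPixel(x, y);
            int p2 = i2.getPixel(x, y);
            int totalDiff = Math.abs(Color.red(p1) - Color.red(p2))
                    + Math.abs(Color.green(p1) - Color.green(p2))
                    + Math.abs(Color.blue(p1) - Color.blue(p2));
            if (totalDiff > PIXEL_THRESHOLD)
                count++;
        }

    return count > COUNT_THRESHOLD;
}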
I know that this is essentially very similar to another answer here, but I think that restating it in a different form might prove useful to those seeking a solution. This approach involves having more than two images over time. If you literally only have two, it will not work, but an equivalent method will.
Keep a history for all pixels, updated on each frame. For example, for each pixel:
history[x, y] = (history[x, y] * (w - 1) + get_pixel(x, y)) / w
Where w might be w = 20. The higher w is, the larger the spike for motion, but the longer motion has to be absent before it resets.
Then to determine if something has changed you can do this for each pixel:
changed_delta = abs(history[x, y] - get_pixel(x, y))
total_delta += changed_delta
You will find that it stabilizes most of the noise and when motion happens you will get a large difference. You are essentially taking many frames and detecting motion from the many against the newest frame.
Also, for detecting positions of motion consider breaking the image into smaller pieces and doing them individually. Then you can find objects and track them across the screen by treating a single image as a grid of separate images.
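A minimal Java sketch of this running-average idea, using a simple brightness value per pixel on Android bitmaps (the weight W and the use of brightness instead of separate channels are assumptions for illustration):
private static final int W = 20;   // history weight, as in the formula above
private float[][] history;         // running average per pixel

private float totalDelta(Bitmap frame)
{
    if (history == null)
        history = new float[frame.getWidth()][frame.getHeight()];

    float total = 0;
    for (int y = 0; y < frame.getHeight(); ++y)
        for (int x = 0; x < frame.getWidth(); ++x)
        {
            int p = frame.getPixel(x, y);
            // A simple brightness value; per-channel histories work the same way.
            float value = (Color.red(p) + Color.green(p) + Color.blue(p)) / 3f;
            total += Math.abs(history[x][y] - value);
            history[x][y] = (history[x][y] * (W - 1) + value) / W;
        }
    return total;
}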
