I've started differentiating two images by counting the number of different pixels using a simple algorithm:
private int returnCountOfDifferentPixels(String pic1, String pic2) {
    Bitmap i1 = loadBitmap(pic1);
    Bitmap i2 = loadBitmap(pic2);
    int count = 0;
    // Assumes both bitmaps have the same dimensions
    for (int y = 0; y < i1.getHeight(); ++y) {
        for (int x = 0; x < i1.getWidth(); ++x) {
            if (i1.getPixel(x, y) != i2.getPixel(x, y)) {
                count++;
            }
        }
    }
    return count;
}
However, this approach seems ineffective in its initial form, as there is always a very high number of differing pixels, even in very similar photos.
I was thinking of a way to determine whether two pixels are really THAT different.
Android's Bitmap.getPixel(x, y) returns the pixel's color as a packed int.
How can I implement a proper difference measure between two colors to help with my motion detection?
You are right: because of noise and other factors, there is usually a lot of raw pixel change in a video stream. Here are some options you might want to consider:
Blurring the image first, ideally with a Gaussian filter or with a simple box filter. This just means that you take the (weighted) average over the neighboring pixels and the pixel itself. This should already reduce the sensor noise quite a bit.
Only adding the difference to count if it's larger than some threshold. This has the effect of only considering pixels that have really changed a lot. It is very easy to implement and might already solve your problem on its own.
Thinking about it, try these two options first. If they don't work out, I can give you some more options.
EDIT: I just saw that you're not actually summing up differences but just counting different pixels. This is fine if you combine it with option 2. Option 1 still works, but it might be overkill.
Also, to find out the difference between two colors, use the methods of the Color class:
int p1 = i1.getPixel(x, y);
int p2 = i2.getPixel(x, y);
int totalDiff = Color.red(p1) - Color.red(p2) + Color.green(p1) - Color.green(p2) + Color.blue(p1) - Color.blue(p2);
Now you can come up with a threshold that totalDiff must exceed to contribute to count.
Of course, you can play around with these numbers in various ways. The above code, for example, only computes changes in pixel intensity (brightness). If you also wanted to take changes in hue and saturation into account, you would have to compute totalDiff like this:
int totalDiff = Math.abs(Color.red(p1) - Color.red(p2)) + Math.abs(Color.green(p1) - Color.green(p2)) + Math.abs(Color.blue(p1) - Color.blue(p2));
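Putting it together, a minimal sketch of the thresholded count (the threshold value is just an illustrative guess you would tune for your camera and lighting):
static final int THRESHOLD = 30; // hypothetical value; tune empirically

int p1 = i1.getPixel(x, y);
int p2 = i2.getPixel(x, y);
int totalDiff = Math.abs(Color.red(p1) - Color.red(p2))
        + Math.abs(Color.green(p1) - Color.green(p2))
        + Math.abs(Color.blue(p1) - Color.blue(p2));
if (totalDiff > THRESHOLD) {
    count++;
}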
Also, have a look at the other methods of Color, for example RGBToHSV(...).
I know this is essentially very similar to another answer here, but I think restating it in a different form might prove useful to those seeking a solution. This approach involves having more than two images over time; if you literally only have two, it will not work, but an equivalent method will.
Keep a history for all pixels, updated on each frame. For example, for each pixel:
history[x, y] = (history[x, y] * (w - 1) + get_pixel(x, y)) / w
Where w might be w = 20. The higher w is, the larger the spike for motion, but the longer motion has to be absent before it resets.
Then to determine if something has changed you can do this for each pixel:
changed_delta = abs(history[x, y] - get_pixel(x, y))
total_delta += changed_delta
You will find that it stabilizes most of the noise and when motion happens you will get a large difference. You are essentially taking many frames and detecting motion from the many against the newest frame.
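A minimal sketch of this idea in Java; Frame and getPixelIntensity are placeholders for whatever per-pixel access your capture source provides:
interface Frame {
    double getPixelIntensity(int x, int y); // placeholder accessor
}

static final int W = 20;   // history weight
double[][] history;        // one running average per pixel

double detectMotion(Frame frame, int width, int height) {
    if (history == null) {
        history = new double[width][height]; // lazily sized to the frame
    }
    double totalDelta = 0;
    for (int x = 0; x < width; x++) {
        for (int y = 0; y < height; y++) {
            double pixel = frame.getPixelIntensity(x, y);
            totalDelta += Math.abs(history[x][y] - pixel);
            // fold the newest frame into the running history
            history[x][y] = (history[x][y] * (W - 1) + pixel) / W;
        }
    }
    return totalDelta; // large values indicate motion
}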
Also, for detecting positions of motion consider breaking the image into smaller pieces and doing them individually. Then you can find objects and track them across the screen by treating a single image as a grid of separate images.
I have a couple of huge images which can't be loaded into memory whole. I know that the images are tiled, and all the methods in the class ImageReader give me plausible non-zero return values for
getTileGridXOffset(int),
getTileGridYOffset(int),
getTileWidth(int) and
getTileHeight(int).
My problem now is that I want to read only one tile, using the ImageReader.readTile(int, int, int) method, to avoid having to load the entire image into memory. But how do I determine what the valid values for the tile coordinates are?
There are the methods getNumXTiles() and getNumYTiles() in the interface RenderedImage, but all attempts to create a rendered image from the source result in an out of memory/Java heap space error.
The tile coordinates can theoretically be anything, and I tried readTile(0, -1, -1), which also works for a few images I tested.
I also tried to read the metadata for those images, but I didn't find any useful information regarding the image layout.
Is there anyone who can tell me how to get the values for the tile coordinates without having to read the entire image into memory? Is there another way which does not require an instance of ImageLayout?
Thank you very much for your assistance.
First of all, you should check that the ImageReader in question supports tiling for the given image, using isImageTiled(imageIndex). If it doesn't, you can't expect useful values from the other methods.
If it does, all tiles for a given image must be equal in size (though the last tile in each row/column may be truncated). This is also the case for all tiled file formats that I know of (e.g. TIFF). Using this knowledge, the number of tiles in both dimensions can be calculated:
// Calculate number of x tiles/y tiles:
int cols = (int) Math.ceil(reader.getWidth(imageIndex) / (double) reader.getTileWidth(imageIndex));
int rows = (int) Math.ceil(reader.getHeight(imageIndex) / (double) reader.getTileHeight(imageIndex));
You can then loop over the tile indexes (the first tile is always 0, 0):
for (int row = 0; row < rows; row++) {
    for (int col = 0; col < cols; col++) {
        BufferedImage tile = reader.readTile(imageIndex, col, row);
        // ...do more processing...
    }
}
Or, if you only want to get a single tile, you obviously don't need the double for loops. :-)
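For instance, a rough sketch of reading just the first tile of the first image in a file (the method name and the untiled fallback are my own):
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;

static BufferedImage readFirstTile(File file) throws IOException {
    try (ImageInputStream input = ImageIO.createImageInputStream(file)) {
        Iterator<ImageReader> readers = ImageIO.getImageReaders(input);
        if (!readers.hasNext()) {
            throw new IOException("No ImageReader for " + file);
        }
        ImageReader reader = readers.next();
        try {
            reader.setInput(input);
            int imageIndex = 0;
            if (!reader.isImageTiled(imageIndex)) {
                return reader.read(imageIndex); // untiled: read the whole image
            }
            return reader.readTile(imageIndex, 0, 0); // first tile is always (0, 0)
        } finally {
            reader.dispose();
        }
    }
}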
Note: For ImageReaders/images that don't support tiling, the getTileWidth and getTileHeight methods will just return the same as getWidth and getHeight, respectively.
Also, the readTile API docs say:
If the arguments are out of range, an IllegalArgumentException is thrown. If the image is not tiled, the values 0, 0 will return the entire image; any other values will cause an IllegalArgumentException to be thrown.
This means your example, readTile(0, -1, -1), should always throw an IllegalArgumentException regardless of the tiling... I suspect some implementations may disregard the tile coordinates completely and give you the entire image anyway.
PS: The RenderedImage interface could in theory help you, but it would require a special implementation in the ImageReader. In most cases you will just get a normal BufferedImage (which implements RenderedImage and is a single 1x1 tile).
I've been tinkering with some Minecraft Bukkit plugin development, and am currently working on something where I need to be able to define a "volume" of space and determine when an entity (player) moves from outside that volume to inside (or vice versa).
If I restrict the "volume" to boxes, it should be simple. The data structure can just maintain the X/Y/Z bounding integers (six in total), and calculating entry/exit given two points (movement from and movement to) should just be a matter of determining whether A) all three "to" values are within all three ranges, and B) at least one "from" value is outside its corresponding range.
(Though if there's a better, more performant way of storing and calculating this, I'm all ears.)
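In code, the box check I have in mind would look roughly like this (all names are placeholders):
// Inclusive axis-aligned bounding box over block coordinates
boolean contains(int[] min, int[] max, int[] p) {
    return p[0] >= min[0] && p[0] <= max[0]
        && p[1] >= min[1] && p[1] <= max[1]
        && p[2] >= min[2] && p[2] <= max[2];
}

// True when a move crosses the boundary inward
boolean entered(int[] min, int[] max, int[] from, int[] to) {
    return contains(min, max, to) && !contains(min, max, from);
}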
However, what if the "volume" isn't a simple box? Suppose I have an oddly-shaped room and want to enclose the volume of that room. I could arrange multiple "volumes" individually to fill the overall space; however, that would result in false positives when an entity moves from one to another.
Not having worked in gaming or 3D engines before, I'm drawing a blank on how I might be able to structure something like this. But it occurs to me that this is likely a problem which has been solved and has known established patterns. Essentially, I'm trying to:
Define a data structure which can represent an oddly-shaped volume of space (albeit at least based on block coordinates).
Define an algorithm which, given a source and destination of movement, can determine if the movement crossed a boundary of the defined space.
Are there established patterns and practices for this?
I don't know if this has been used in any kind of video game before, but the first thing that came to mind is the classic Sieve of Eratosthenes implementation; the only change would be to make the boolean array 3D and use the indices as coordinates. Obviously, though, as x and y values can be huge in Minecraft, you'd probably want to save space by storing an offset between the world's 0,0 position and your selection, something like this:
class OddArea {
    static final int MAX_SELECTION_SIZE = 64; // Or whatever
    public final int xOffset, yOffset;
    // 256 = chunk height
    public final boolean[][][] squares =
            new boolean[MAX_SELECTION_SIZE][MAX_SELECTION_SIZE][256];

    OddArea() {
        this(0, 0);
    }

    OddArea(final int xOffset, final int yOffset) {
        this.xOffset = xOffset;
        this.yOffset = yOffset;
    }

    void addBlock(final int x, final int y, final int z) {
        this.squares[x - this.xOffset][y - this.yOffset][z] = true;
    }

    boolean isInsideArea(final int x, final int y, final int z) {
        return this.squares[x - this.xOffset][y - this.yOffset][z];
    }
}
z doesn't require an offset as the Minecraft world is only 256 blocks high.
The only issue I can think of with this setup is that you'd have to know the lowest x,y coordinates before you start filling the object.
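Hypothetical usage, with made-up coordinates:
OddArea area = new OddArea(100, 200);              // offsets = lowest x,y of the selection
area.addBlock(110, 210, 64);                       // mark one block as part of the volume
boolean inside = area.isInsideArea(110, 210, 64);  // true
boolean outside = area.isInsideArea(111, 210, 64); // false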
In general you should be using a data structure similar to kd trees. You can represent your volume as a union of either cubes or spheres, and it should be easy to evaluate if an object enters the volume.
BTW, to calculate whether two spheres intersect, check if the distance between their centers is less than the sum of their radii.
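A minimal sketch of that test; comparing squared values avoids a square root:
boolean spheresIntersect(double x1, double y1, double z1, double r1,
                         double x2, double y2, double z2, double r2) {
    double dx = x2 - x1, dy = y2 - y1, dz = z2 - z1;
    double rSum = r1 + r2;
    // Equivalent to distance(centers) < r1 + r2, without the sqrt
    return dx * dx + dy * dy + dz * dz < rSum * rSum;
}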
I have been trying to solve this for a few hours, and the internet is pretty unfruitful on the subject.
I need help detecting and resolving collisions between rectangles; note that I need resolution, not just detection.
These are two boxes with x/y positions and widths/heights. I simply need to detect when they are overlapping and push one of the boxes out of the other smoothly.
Also, note that one box is stationary and the other is moving.
Does anyone have anything on this (or can give me an example)? I'd really appreciate it.
I need the boxes to be able to rest on top of each other as well.
Thank you!
I'm not sure what the context here is (Are these boxes moving or stationary? Are you looking for a physically accurate resolution, or simply a geometrically correct one?), but it seems like you could accomplish this in the following way:
1) Determine if there is a box collision.
2) Determine the intersection of the two boxes, which produces a third box. The width and height of that box are your penetration depths.
3) Move the center of one of the boxes by the penetration depth: (x - width, y - height).
This should cause the boxes to become disjoint.
FYI: The intersection of two boxes can be computed by taking the max of the mins and the min of the maxes from both boxes.
Here is some code from my engine for box intersection:
bool Bounds::IntersectsBounds(const Bounds &other) const
{
    return !(min.x > other.max.x || max.x < other.min.x
          || min.y > other.max.y || max.y < other.min.y);
}

bool Bounds::Intersection(const Bounds &other, Bounds &outBounds) const
{
    if (!this->IntersectsBounds(other)) {
        return false;
    }

    outBounds.min.x = std::max(min.x, other.min.x);
    outBounds.min.y = std::max(min.y, other.min.y);
    outBounds.max.x = std::min(max.x, other.max.x);
    outBounds.max.y = std::min(max.y, other.max.y);
    return true;
}
In this case, the "outBounds" variable is the intersection of the two boxes (which in this case is your penetration depth). You can use the width/height of this box to perform your collision resolution.
Yeah! This is a pretty common problem! You may want to check out the gamedev portion of the Stack Exchange network!
Detection
bool collide(float x1, float y1, float sx1, float sy1,
             float x2, float y2, float sx2, float sy2) {
    if (x1 + sx1 <= x2)
        return false;
    if (x2 + sx2 <= x1)
        return false;
    if (y1 + sy1 <= y2)
        return false;
    if (y2 + sy2 <= y1)
        return false;
    return true;
}
Resolution
As for resolution, this depends on the type of application you are going for: is it a sidescroller, top-down, or tile based? I'll assume something dynamic like a sidescroller or top-down action game.
The code is not difficult, but the implementation can be. If you have few objects moving on the screen, you can use a system similar to mine, which goes something like the following:
Get a list of objects you are currently colliding with, in order of distance from the current object.
Iterate through the objects and resolve collisions using the following method:
Check if the object has some special collision type (teleporter, etc.) by sending that object a message and checking the return value (a teleporter will take care of the collision resolution itself).
Check if the previous bottom position of our current object (A) was above the top side of the object in question (B); if so, you have had a bottom collision. Resolve by setting the y position of A to the y position of B minus the height of A.
(IF THE PREVIOUS FAILED) Check if the previous right side of A was to the left of the left side of B; if so, you have had a right-side collision. Resolve by setting the x position of A to B's x position minus A's width.
(IF THE PREVIOUS FAILED) Check if the previous left side of A was to the right of the right side of B; if so, you have had a left-side collision. Resolve by setting the x position of A to B's x position plus B's width.
(IF THE PREVIOUS FAILED) Check if the previous top side of A was below the bottom side of B; if so, you have had a top-side collision. Resolve by setting the y position of A to the y position of B plus B's height.
Whew. It is important that you have the objects sorted according to distance; it will catch on edges if you check collisions with an object that is farther away! A sketch of these steps follows.
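A rough sketch of those resolution steps, assuming y grows downward and a hypothetical Box class:
class Box {
    float x, y, width, height;
}

// prevX/prevY are A's position on the previous frame
void resolve(Box a, Box b, float prevX, float prevY) {
    if (prevY + a.height <= b.y) {
        a.y = b.y - a.height;           // bottom collision: A was above B
    } else if (prevX + a.width <= b.x) {
        a.x = b.x - a.width;            // right-side collision: A was left of B
    } else if (prevX >= b.x + b.width) {
        a.x = b.x + b.width;            // left-side collision: A was right of B
    } else if (prevY >= b.y + b.height) {
        a.y = b.y + b.height;           // top-side collision: A was below B
    }
}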
I hope that makes sense!
Edit: Apparently this doesn't work in Android.
https://stackoverflow.com/a/15515114/3492994
Using the available classes from the 2D Graphics API:
Rectangle r1 = new Rectangle(100, 100, 100, 100);
Line2D l1 = new Line2D.Float(0, 200, 200, 0);
System.out.println("l1.intsects(r1) = " + l1.intersects(r1));
What this doesn't tell you is where...
So I'm programming a recursive program that is supposed to draw Koch's snowflake using OpenGL, and I've got the program basically working except for one tiny issue. The deeper the recursion, the weirder two particular vertices get. Pictures at the bottom.
EDIT: I don't really care about the OpenGL aspect; I've got that part down. If you don't know OpenGL, all that glVertex does is draw a line between the two vertices specified in the two method calls. Pretend it's drawLine(v1, v2). Same difference.
I suspect that my method for finding points is to blame, but I can't find anything that looks incorrect.
I'm following the basically standard drawing method; here are the relevant code snippets
(V is for vertex: V1 is the bottom left corner, V2 is the bottom right corner, V3 is the top corner):
double dir = Math.PI;
recurse(V2, V1, n);
dir = Math.PI / 3;
recurse(V1, V3, n);
dir = (5. / 3.) * Math.PI;
recurse(V3, V2, n);
Recursive method:
public void recurse(Point2D v1, Point2D v2, int n) {
    double newLength = v1.distance(v2) / 3.;
    if (n == 0) {
        gl.glVertex2d(v1.getX(), v1.getY());
        gl.glVertex2d(v2.getX(), v2.getY());
    } else {
        Point2D p1 = getPointViaRotation(v1, dir, newLength);
        recurse(v1, p1, n - 1);
        dir += (Math.PI / 3.);
        Point2D p2 = getPointViaRotation(p1, dir, newLength);
        recurse(p1, p2, n - 1);
        dir -= (Math.PI * (2. / 3.));
        Point2D p3 = getPointViaRotation(p2, dir, newLength);
        recurse(p2, p3, n - 1);
        dir += (Math.PI / 3.);
        recurse(p3, v2, n - 1);
    }
}
I really suspect my math is the problem, but this looks correct to me:
public static Point2D getPointViaRotation(Point2D p1, double rotation, double length) {
    double xLength = length * Math.cos(rotation);
    double yLength = length * Math.sin(rotation);
    return new Point2D.Double(xLength + p1.getX(), yLength + p1.getY());
}
N = 0 (All is well):
N = 1 (Perhaps a little bendy, maybe)
N = 5 (WAT)
I can't see any obvious problem code-wise. I do however have a theory about what happens.
It seems like all points in the graph are based on the locations of the points that came before them. As such, any rounding errors that occur during this process accumulate, eventually sending the curve haywire and way off.
What I would do for starters is calculate the start and end points of each segment before recursing, to limit the impact of the rounding errors of the inner calls.
One thing about Koch's snowflake is that the algorithm will run into rounding issues at some point (it is recursive, and all rounding errors add up). The trick is to keep it going as long as possible. There are three things you can do:
If you want more detail, the only way is to go beyond the precision of double. You will need to use your own range of coordinates and transform them to screen coordinates every time you actually paint. Your own coordinate system should zoom in and show the last recursion step (the last triangle) in a coordinate system of, e.g., 100x100. Then calculate the three new triangles on top of that, transform them into screen coordinates, and paint.
Watch your literals: a division like 2/3 with two int literals truncates to 0. (The line dir = Math.PI/3; is actually safe, since Math.PI is a double and the 3 is promoted, but writing 3. makes the intent explicit.)
Make sure you use Point2D.Double everywhere. Your code should do so already, but I would explicitly write it everywhere.
You've won the game when you still have a nice snowflake but get a StackOverflowError.
So, it turns out I am the dumbest man alive.
Thanks everyone for trying, I appreciate the help.
This code is meant to handle an equilateral triangle; it's very specific about that (you can tell by the angles).
I put in a triangle with its height equal to its base (not equilateral). When I fixed the input triangle, everything worked great.
When doing 2D game development in Java, most tutorials create a bufferstrategy to render. This makes perfect sense.
However, where tutorials seem to diverge is the method of drawing the actual graphics to the buffer.
Some of the tutorials create a buffered image, then create an integer array to represent the individual pixel colors.
private BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
private int[] pixels = ((DataBufferInt) image.getRaster().getDataBuffer()).getData();
Graphics g = bs.getDrawGraphics();
g.setColor(new Color(0x556B2F));
g.fillRect(0, 0, getWidth(), getHeight());
g.drawImage(image, 0, 0, getWidth(), getHeight(), null);
However, some other tutorials don't create the buffered image and pixel array at all, and instead use the Graphics component of the BufferStrategy to draw their images directly to the buffer.
Graphics g = bs.getDrawGraphics();
g.setColor(new Color(0x556B2F));
g.fillRect(0, 0, getWidth(), getHeight());
g.drawImage(testImage.image, x*128, y*128, 128, 128, null);
I was just wondering: why create the entire int array and then draw it? This requires a lot more work to implement rectangles, stretching, transparency, etc. The Graphics component of the buffer strategy already has methods which can easily be called.
Is there some huge performance boost of using the int array?
I've looked this up for hours, and all the sites I've seen just explain what they're doing, and not why they chose to do it that way.
Let's be clear about one thing: both snippets of code do exactly the same thing, namely draw an Image. The snippets are rather incomplete, however; the second snippet does not show what 'testImage.image' actually is or how it is created. But they both ultimately call Graphics.drawImage(), and all variants of drawImage() in either Graphics or Graphics2D draw an Image, plain and simple. In the second case we simply don't know if it is a BufferedImage, a VolatileImage, or even a Toolkit image.
So there is no difference in drawing actually illustrated here!
There is but one difference between the two snippets: the first one also obtains a direct reference to the integer array that is internally backing the Image instance. This gives direct access to the pixel data, rather than having to go through the (Buffered)Image API of, for example, the relatively slow getRGB() and setRGB() methods. The reason for doing that can't be determined from this question, since the array is obtained but never actually used in the snippet. So, to give the following explanation any reason to exist, we must assume someone wants to directly read or edit the pixels of the image, quite possibly for optimization reasons, given the "slowness" of the (Buffered)Image API for manipulating data.
And that optimization may well be a premature one that can backfire on you.
First of all, this code only works because the type of the image is INT_RGB, which gives the image a DataBufferInt. If it had been another type of image, e.g. 3BYTE_BGR, this code would fail with a ClassCastException, since the backing data buffer wouldn't be a DataBufferInt. This may not be much of a problem when you only create images manually and enforce a specific type, but images tend to be loaded from files created by external tools.
Secondly, there is another, bigger downside to directly accessing the pixel buffer: when you do that, Java2D will refuse to accelerate that image, since it cannot know when you will be making changes to it outside of its control. Just for clarity: acceleration is the process of keeping an unaltered image in video memory rather than copying it from system memory each time it is drawn. This is potentially a huge performance improvement (or loss, if you break it), depending on how many images you work with.
How can I create a hardware-accelerated image with Java2D?
(As that related question shows: you should use GraphicsConfiguration.createCompatibleImage() to construct BufferedImage instances.)
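A minimal sketch of that approach (the method name and dimensions are my own):
import java.awt.GraphicsConfiguration;
import java.awt.GraphicsEnvironment;
import java.awt.Transparency;
import java.awt.image.BufferedImage;

static BufferedImage createAcceleratedImage(int width, int height) {
    GraphicsConfiguration gc = GraphicsEnvironment.getLocalGraphicsEnvironment()
            .getDefaultScreenDevice().getDefaultConfiguration();
    // An image in a format the current display pipeline can accelerate
    return gc.createCompatibleImage(width, height, Transparency.OPAQUE);
}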
So in essence: try to use the Java2D API for everything, don't access buffers directly. This off-site resource gives a good idea just what features the API has to support you in that without having to go low level:
http://www.pushing-pixels.org/2008/06/06/effective-java2d.html
First of all, there are lots of historical aspects. The early API was very basic, so the only way to do anything non-trivial was to implement all the required primitives yourself.
Raw data access is a bit old-fashioned, and we can do some "archeology" to find the reasons such an approach was used. I think there are two main ones:
1. Filter effects
Let's not forget that filter effects (various kinds of blurs, etc.) are simple, very important for any game developer, and widely used.
The simplest way to implement such an effect in Java 1 was to use an int array and a filter defined as a matrix. Herbert Schildt, for example, used to have lots of such demos:
public class Blur {
    private int width, height;   // image dimensions
    private int[] imgpixels;     // source ARGB pixels
    private int[] newimgpixels;  // destination ARGB pixels

    // 3x3 box blur: average each pixel with its eight neighbors
    public void convolve() {
        for (int y = 1; y < height - 1; y++) {
            for (int x = 1; x < width - 1; x++) {
                int rs = 0;
                int gs = 0;
                int bs = 0;
                for (int k = -1; k <= 1; k++) {
                    for (int j = -1; j <= 1; j++) {
                        int rgb = imgpixels[(y + k) * width + x + j];
                        int r = (rgb >> 16) & 0xff;
                        int g = (rgb >> 8) & 0xff;
                        int b = rgb & 0xff;
                        rs += r;
                        gs += g;
                        bs += b;
                    }
                }
                rs /= 9;
                gs /= 9;
                bs /= 9;
                newimgpixels[y * width + x] =
                        0xff000000 | rs << 16 | gs << 8 | bs;
            }
        }
    }
}
Naturally, you can implement that using getRGB, but raw data access is far more efficient. Later, Graphics2D provided a better abstraction layer:
public interface BufferedImageOp
This interface describes single-input/single-output operations performed on BufferedImage objects. It is implemented by AffineTransformOp, ConvolveOp, ColorConvertOp, RescaleOp, and LookupOp. These objects can be passed into a BufferedImageFilter to operate on a BufferedImage in the ImageProducer-ImageFilter-ImageConsumer paradigm.
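For example, the 3x3 box blur above collapses to a few lines with ConvolveOp (a sketch; the source image is assumed to come from elsewhere):
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;

BufferedImage blur(BufferedImage source) {
    float ninth = 1f / 9f;
    float[] kernel = {
        ninth, ninth, ninth,
        ninth, ninth, ninth,
        ninth, ninth, ninth
    };
    ConvolveOp op = new ConvolveOp(new Kernel(3, 3, kernel), ConvolveOp.EDGE_NO_OP, null);
    return op.filter(source, null);
}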
2. Double buffering
Another problem was flickering and really slow drawing. Double buffering eliminates the ugly flickering, and all of a sudden it provides an easy way to do filter effects, because you already have a buffer.
Something like a final conclusion :)
I would say the situation you've described is pretty common for any evolving technology. There are two ways to achieve the same goals:
use the legacy approach, write more code, etc.
rely on new abstraction layers, provided techniques, etc.
There are also some useful extensions to simplify your life even more, so there's really no need to use int[] :)