Optical Flow method returning unexpected values - java

These past two weeks I have been trying to create an Android app that tracks points in space as I move my Samsung Galaxy III's camera. In short, I use the OpenCV libraries to track those points with the two following methods:
public static void goodFeaturesToTrack(
Mat image,
MatOfPoint corners,
int maxCorners,
double qualityLevel,
double minDistance);
Where image is an 8-bit grayscale image, corners is the output containing the best points to use for tracking, maxCorners is the maximum number of points to return, qualityLevel is a fraction such that all returned points must have a quality >= BestQualityPointValue*qualityLevel, and minDistance is the minimum distance allowed between returned points. (Link)
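For reference, the call I make is roughly the following (a sketch using the constants and fields shown further down, not the exact line from my project):
// find up to maxCorners strong corners in the current gray frame
Imgproc.goodFeaturesToTrack(nextGray, corners, maxCorners, MIN_FEATURE_QUALITY, MIN_FEATURE_DISTANCE);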
public static void calcOpticalFlowPyrLK(
Mat prevImg,
Mat nextImg,
MatOfPoint2f prevPts,
MatOfPoint2f nextPts,
MatOfByte status,
MatOfFloat err);
Where prevImg is the 8-bit grayscale image at time t, nextImg is the 8-bit grayscale image at time t+dt, prevPts is the matrix containing the 2D (x, y) points to be tracked, nextPts is the OUTPUT matrix containing the NEW POSITION of those points, status indicates which points have been tracked (1) and which have not (0), AND err contains the error associated with each point whose displacement has been computed. (Link)
So far I have been successful with the goodFeaturesToTrack(...) method, but I am still unable to calculate the FLOW of the points using the calcOpticalFlowPyrLK(...) method.
Here is the chunk of code that takes care of initializing the variables and tracking the points:
private static final double MIN_FEATURE_QUALITY = 0.05;
private static final double MIN_FEATURE_DISTANCE = 4.0;
private Mat prevGray;
private MatOfPoint2f prev2D,next2D;
private MatOfPoint corners;
private MatOfByte status;
private MatOfFloat err;
private Scalar color;
private Size winSize;
private int maxCorners,maxLevel;
...
public void onCameraViewStarted(int width, int height) {
nextGray = new Mat(height, width, CvType.CV_8UC1); //unsigned char
Rscale = new Mat(height, width, CvType.CV_8UC1);
prevGray = new Mat(height, width, CvType.CV_8UC1);
prev2D = new MatOfPoint2f(new Point());
next2D = new MatOfPoint2f(new Point());
status = new MatOfByte();
err = new MatOfFloat();
corners = new MatOfPoint();
maxLevel = 0;
winSize = new Size(21,21);
color = new Scalar(0, 255, 0);
maxCorners = 1;
}
//THIS IS THE METHOD THAT TAKES CARE OF TRACKING THE POINTS
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
nextGray = inputFrame.gray();
Rscale = nextGray;
if(!corners.empty()){
Video.calcOpticalFlowPyrLK(
prevGray,
nextGray,
prev2D,
next2D,
status,
err,
winSize,
maxLevel);
System.out.println("status = " + status.toArray()[0]);
System.out.println("err = " + err.toArray()[0]);
}
prevGray = nextGray;
prev2D = next2D;
for(int i=0;i<next2D.toArray().length;i++){
Core.circle(Rscale, next2D.toArray()[i], 3, color,-1);
}
System.out.println("Prev2D = " + prev2D.toArray()[0].toString());
System.out.println("Next2D = " + next2D.toArray()[0].toString());
return Rscale;
}
THE PROBLEM:
As mentioned earlier, the status parameter tells the user whether the flow has been computed for each point. Using System.out.println(...), I check each point's status and they are all 1. Moreover, I also check the error and get error = 0.0. However, and this is what is killing me, the newly computed points are always the same as the input points (i.e. nextPts = prevPts). That said, sometimes the points change position by tiny, imperceptible amounts, but that rarely happens...

prevGray = nextGray;
This shallow copy makes both Mats point to the same pixel data. So, in the next iteration, when you say:
nextGray = inputFrame.gray();
prevGray gets updated to the very same pixels, too ;)
What you want is a deep copy:
prevGray = nextGray.clone();
prev2D = next2D.clone(); // same story..
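Put into your frame callback, the tail of onCameraFrame would then look roughly like this (a sketch of your own code with the deep copies applied, untested):
public Mat onCameraFrame(CvCameraViewFrame inputFrame) {
    nextGray = inputFrame.gray();
    Rscale = nextGray;
    if (!corners.empty()) {
        Video.calcOpticalFlowPyrLK(prevGray, nextGray, prev2D, next2D, status, err, winSize, maxLevel);
    }
    prevGray = nextGray.clone(); // deep copy: keep the previous frame's pixels
    prev2D = next2D.clone();     // deep copy: keep the previous point positions
    for (Point p : next2D.toArray()) {
        Core.circle(Rscale, p, 3, color, -1);
    }
    return Rscale;
}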

Related

copying an image onto another with JOCL/OpenCL

My goal is to use the GPU for my brand new Java project, which is to create a game and the game engine itself (I think it is a very good way to learn in depth how it all works).
I was using multi-threading on the CPU with java.awt.Graphics2D to display my game, but I observed on other PCs that the game was running below 40 FPS, so I decided to learn how to use the GPU (I will still be rendering all objects in a for loop and then drawing the image on screen).
For that reason, following the OpenCL documentation and the JOCL samples, I started to code a small, simple test that paints a texture onto the background image (let's admit that every entity has a texture).
This method is called on each render call, and it is given the background, the texture, and the position of the entity as arguments.
Both code listings below have been updated to follow @ProjectPhysX's recommendations.
public static void XXX(final BufferedImage output_image, final BufferedImage input_image, float x, float y) {
cl_image_format format = new cl_image_format();
format.image_channel_order = CL_RGBA;
format.image_channel_data_type = CL_UNSIGNED_INT8;
//allocate output pointer
cl_image_desc output_description = new cl_image_desc();
output_description.buffer = null; //must be null for 2D image
output_description.image_depth = 0; //is only used if the image is a 3D image
output_description.image_row_pitch = 0; //must be 0 if host_ptr is null
output_description.image_slice_pitch = 0; //must be 0 if host_ptr is null
output_description.num_mip_levels = 0; //must be 0
output_description.num_samples = 0; //must be 0
output_description.image_type = CL_MEM_OBJECT_IMAGE2D;
output_description.image_width = output_image.getWidth();
output_description.image_height = output_image.getHeight();
output_description.image_array_size = output_description.image_width * output_description.image_height;
cl_mem output_memory = clCreateImage(context, CL_MEM_WRITE_ONLY, format, output_description, null, null);
//set up first kernel arg
clSetKernelArg(kernel, 0, Sizeof.cl_mem, Pointer.to(output_memory));
//allocates input pointer
cl_image_desc input_description = new cl_image_desc();
input_description.buffer = null; //must be null for 2D image
input_description.image_depth = 0; //is only used if the image is a 3D image
input_description.image_row_pitch = 0; //must be 0 if host_ptr is null
input_description.image_slice_pitch = 0; //must be 0 if host_ptr is null
input_description.num_mip_levels = 0; //must be 0
input_description.num_samples = 0; //must be 0
input_description.image_type = CL_MEM_OBJECT_IMAGE2D;
input_description.image_width = input_image.getWidth();
input_description.image_height = input_image.getHeight();
input_description.image_array_size = input_description.image_width * input_description.image_height;
DataBufferInt input_buffer = (DataBufferInt) input_image.getRaster().getDataBuffer();
int input_data[] = input_buffer.getData();
cl_mem input_memory = clCreateImage(context, CL_MEM_READ_ONLY | CL_MEM_USE_HOST_PTR, format, input_description, Pointer.to(input_data), null);
//loads the input buffer to the gpu memory
long[] input_origin = new long[] { 0, 0, 0 };
long[] input_region = new long[] { input_image.getWidth(), input_image.getHeight(), 1 };
int input_row_pitch = input_image.getWidth() * Sizeof.cl_uint; //the length of each row in bytes
clEnqueueWriteImage(commandQueue, input_memory, CL_TRUE, input_origin, input_region, input_row_pitch, 0, Pointer.to(input_data), 0, null, null);
//set up second kernel arg
clSetKernelArg(kernel, 1, Sizeof.cl_mem, Pointer.to(input_memory));
//set up third and fourth kernel args
clSetKernelArg(kernel, 2, Sizeof.cl_float, Pointer.to(new float[] { x }));
clSetKernelArg(kernel, 3, Sizeof.cl_float, Pointer.to(new float[] { y }));
//blocks until all previously queued commands are issued
clFinish(commandQueue);
//enqueue the program execution
long[] globalWorkSize = new long[] { input_description.image_width, input_description.image_height };
clEnqueueNDRangeKernel(commandQueue, kernel, 2, null, globalWorkSize, null, 0, null, null);
//transfer the output result back to host
DataBufferInt output_buffer = (DataBufferInt) output_image.getRaster().getDataBuffer();
int output_data[] = output_buffer.getData();
long[] output_origin = new long[] { 0, 0, 0 };
long[] output_region = new long[] { output_description.image_width, output_description.image_height, 1 };
int output_row_pitch = output_image.getWidth() * Sizeof.cl_uint;
clEnqueueReadImage(commandQueue, output_memory, CL_TRUE, output_origin, output_region, output_row_pitch, 0, Pointer.to(output_data), 0, null, null);
//free pointers
clReleaseMemObject(input_memory);
clReleaseMemObject(output_memory);
}
And here is the kernel source run on the GPU:
const sampler_t sampler = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;
__kernel void drawImage(__write_only image2d_t dst_image, __read_only image2d_t src_image, float xoff, float yoff)
{
const int x = get_global_id(0);
const int y = get_global_id(1);
int2 in_coords = (int2) { x, y };
uint4 pixel = read_imageui(src_image, sampler, in_coords);
pixel = -16184301;
printf("%d, %d, %u\n", x, y, pixel);
const int sx = get_global_size(0);
const int sy = get_global_size(1);
int2 out_coords = (int2) { ((int) xoff + x) % sx, ((int) yoff + y) % sy};
write_imageui(dst_image, out_coords, pixel);
}
Without the call to write_imageui the background is painted black; otherwise it is white.
At the moment I am struggling to understand why pixel = 0 inside the kernel, but I think someone familiar with JOCL would spot my error in this code very quickly. I am thoroughly confused by it today, and I don't feel like I will ever catch the mistake myself. For that reason I am asking for your help in reviewing my code.
Try
const int sx = get_global_size(0);
const int sy = get_global_size(1);
int2 out_coords = (int2) { (xoff + x)%sx, (yoff + y)%sy};
to avoid errors or undefined behaviour. Right now you are writing into nirvana if coordinate + offset lies outside the image region. Also, there is no clEnqueueWriteImage before the kernel is called, so src_image on the GPU is undefined and may contain random values.
OpenCL requires kernel parameters to be declared in global memory space:
__kernel void drawImage(global image2d_t dst_image, global image2d_t src_image, global float xoff, global float yoff)
Also, as someone who has written a graphics engine in Java and C++ and GPU-parallelized it in OpenCL, let me give you some guidance. In the Java code you probably use the painter's algorithm: make a list of all drawn objects with their approximate z-coordinates, sort the objects by z-coordinate and draw them back-to-front in a single for loop. On the GPU, the painter's algorithm won't work because you cannot parallelize it. Instead you have a list of objects (lines/triangles) in 3D space, and you parallelize over this list: each GPU thread rasterizes a single triangle, all threads at the same time, and they all draw their pixels to the frame at the same time. To solve the drawing-order problem you use a z-buffer: an image holding one z-coordinate per pixel. During rasterization of a line/triangle you calculate the z-coordinate for every pixel, and only if it is larger than the one previously stored in the z-buffer at that pixel do you draw the new color.
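The z-buffer test itself is tiny; sketched here in plain Java (the names are illustrative, not from any particular engine):
class ZBuffer {
    final int width;
    final float[] depth; // one z value per pixel; larger z = closer to the camera here

    ZBuffer(int width, int height) {
        this.width = width;
        this.depth = new float[width * height];
        java.util.Arrays.fill(depth, Float.NEGATIVE_INFINITY);
    }

    // draw the pixel only if this fragment is closer than what is already stored there
    void plot(int x, int y, float z, int color, int[] frame) {
        int idx = y * width + x;
        if (z > depth[idx]) {
            depth[idx] = z;
            frame[idx] = color;
        }
    }
}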
Regarding performance: java.awt.Graphics2D is very efficient in terms of CPU usage, you can do ~40k triangles per frame at 60fps. With OpenCL, expect ~30M triangles per frame at 60fps.

OpenCV - find blackboard edges on video and images

UPDATE
You can find all the images I have for testing on my GitHub here:
GitHub repository with sources
There are also 2 videos that the detection should work on as well.
ORIGINAL QUESTION
I tried to use OpenCV 4.x.x to find the edges of a blackboard (image below), but somehow I cannot succeed. My code at the moment looks like this (Android with OpenCV and a live camera feed), where imgMat is a Mat from the camera feed:
Mat gray = new Mat();
Imgproc.cvtColor(imgMat, gray, Imgproc.COLOR_RGB2BGR);
Mat blurred = new Mat();
Imgproc.blur(gray, blurred, new org.opencv.core.Size(3, 3));
Mat canny = new Mat();
Imgproc.Canny(blurred, canny, 80, 230);
Mat kernel = Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new org.opencv.core.Size(2, 2));
Mat dilated = new Mat();
Imgproc.morphologyEx(canny, dilated, Imgproc.MORPH_DILATE, kernel, new Point(0, 0), 10);
Mat rectImage = new Mat();
Imgproc.morphologyEx(dilated, rectImage, Imgproc.MORPH_CLOSE, kernel, new Point(0, 0), 5);
Mat endproduct = new Mat();
Imgproc.Canny(rectImage, endproduct, 120, 230);
List<MatOfPoint> contours = new ArrayList<>();
Mat hierarchy = new Mat();
Imgproc.findContours(endproduct, contours, hierarchy, Imgproc.RETR_LIST, Imgproc.CHAIN_APPROX_SIMPLE);
double maxArea = 0;
boolean hasContour = false;
MatOfPoint2f biggestContour = new MatOfPoint2f();
Iterator<MatOfPoint> each = contours.iterator();
while (each.hasNext()) {
MatOfPoint wrapper = each.next();
double area = Imgproc.contourArea(wrapper);
if (area > maxArea) {
maxArea = area;
biggestContour = new MatOfPoint2f(wrapper.toArray());
hasContour = true;
}
}
if (hasContour) {
Mat output = imgMat.clone();
MatOfPoint2f approx = new MatOfPoint2f();
MatOfPoint poly = new MatOfPoint();
Imgproc.approxPolyDP(biggestContour, approx, Imgproc.arcLength(biggestContour, true) * .02, true);
approx.convertTo(poly, CvType.CV_32S);
Rect rect = Imgproc.boundingRect(poly);
}
Somehow I am not able to get it working, although the same code (written in Python) worked on my computer with a video. I take the output rectangle and display it on my mobile screen, where it flickers around a lot and does not work properly.
These are the images I tried the Python program on, and they worked:
What am I doing wrong? I am not able to constantly detect the edges of the blackboard.
Additional information about the blackboard:
always rectangular
may have different lighting
the text should be ignored, only the main board should be detected
the outer blackboard should be ignored as well
only the contour for the main board should be shown/returned
Thanks for any advice or code!
I used HSV because that's the easiest way to detect specific colors. I used an abundance test to automatically select the color threshold (so this will work for green or blue boards). However, this test will fail on white or black boards, since white and black count as all colors according to hue. Instead, in HSV, white and black are easiest to detect as very low saturation (white) or very low value (black).
I did a 3-way check for each and selected the mask that had the most pixels in it (I assume that the boards are the majority of the image). I'm not sure how this will work on other images since we only have one here, so this may or may not work for other boards.
I used approxPolyDP to cut down on the number of points in the contour until I had 4 points and used that to draw the shape.
import cv2
import numpy as np
# get unique colors (to speed up search) and return the most abundant mask
def getAbundantColor(channel, margin):
# get uniques
unique_colors, counts = np.unique(channel, return_counts=True);
# check for the most abundant color
most = None;
biggest_count = -1;
for col in unique_colors:
# count number of white pixels
mask = cv2.inRange(channel, int(col - margin), int(col + margin));
count = np.count_nonzero(mask);
# if bigger, set new "most"
if count > biggest_count:
biggest_count = count;
most = mask;
return most, biggest_count;
# load image
img = cv2.imread("blackboard.jpg");
# it's huge, scale down so that we can see the whole thing
h, w = img.shape[:2];
scale = 0.25;
h = int(scale*h);
w = int(scale*w);
img = cv2.resize(img, (w,h));
# hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV);
h,s,v = cv2.split(hsv);
# median blur to get rid of most of the text
h = cv2.medianBlur(h, 5);
s = cv2.medianBlur(s, 5);
v = cv2.medianBlur(v, 5);
# get most abundant color
color_margin = 30;
hmask, hcount = getAbundantColor(h, color_margin);
# detect white and black separately
light_margin = 30;
# white
wmask = cv2.inRange(s, 0, light_margin);
wcount = np.count_nonzero(wmask);
# black
bmask = cv2.inRange(v, 0, light_margin);
bcount = np.count_nonzero(bmask);
# check which is biggest
sorter = [[hcount, hmask], [wcount, wmask], [bcount, bmask]];
sorter.sort();
mask = sorter[-1][1];
# dilate and erode to close holes
kernel = np.ones((3,3), np.uint8);
mask = cv2.dilate(mask, kernel, iterations = 2);
mask = cv2.erode(mask, kernel, iterations = 4);
mask = cv2.dilate(mask, kernel, iterations = 2);
# get contours # OpenCV 3.4, in OpenCV 2* or 4* it returns (contours, _)
_, contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE);
# for each contour, approximate a simpler shape until we have 4 points
simplified = [];
for con in contours:
# go until we have 4 points
num_points = 999999;
step_size = 0.01;
percent = step_size;
while num_points >= 4:
# get number of points
epsilon = percent * cv2.arcLength(con, True);
approx = cv2.approxPolyDP(con, epsilon, True);
num_points = len(approx);
# increment
percent += step_size;
# step back and get the points
# there could be more than 4 points if our step size misses it
percent -= step_size * 2;
epsilon = percent * cv2.arcLength(con, True);
approx = cv2.approxPolyDP(con, epsilon, True);
simplified.append(approx);
cv2.drawContours(img, simplified, -1, (0,0,200), 2);
# print out the number of points
for points in simplified:
print("Num Points: " + str(len(points)));
# show image
cv2.imshow("Image", img);
cv2.imshow("Hue", h);
cv2.imshow("Mask", mask);
cv2.waitKey(0);
Edit: In order to accommodate the uncertainty in the board's color and appearance, I work on the assumption that the board itself will be the majority of the picture. The lines involving the sorter are looking for the most abundant color in the image. If the white wall behind the board takes up more space in the image, then that will be the color selected for the mask.
There are other ways to try and select just the board, but it's really difficult to come up with a catch-all solution. The rest of the code should do its job the same if you can come up with some way of masking the board. If you're willing to budge on the unknown color assumption and provide the original pictures of the failing cases then I can probably come up with an appropriate mask.
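Since your question itself is on Android/Java, the white/black part of that masking idea translates roughly as follows with OpenCV's Java bindings (a sketch; the threshold of 30 mirrors light_margin above, and the method name is just illustrative):
// build a mask for a white-ish or black-ish board, whichever covers more pixels
Mat boardMask(Mat bgr) {
    Mat hsv = new Mat();
    Imgproc.cvtColor(bgr, hsv, Imgproc.COLOR_BGR2HSV);
    List<Mat> channels = new ArrayList<>();
    Core.split(hsv, channels); // 0 = hue, 1 = saturation, 2 = value
    Mat white = new Mat();
    Mat black = new Mat();
    Core.inRange(channels.get(1), new Scalar(0), new Scalar(30), white); // low saturation -> white-ish
    Core.inRange(channels.get(2), new Scalar(0), new Scalar(30), black); // low value -> black-ish
    // keep whichever mask covers more pixels, mirroring the Python "sorter" step
    return Core.countNonZero(white) > Core.countNonZero(black) ? white : black;
}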

Detecting costs/variables resulting in unbounded optimization problem in ojAlgo

I am using the ojAlgo linear/quadratic solver via ExpressionsBasedModel to solve the layout of graphical elements in a plotting library so that they fit neatly into the screen boundaries. Specifically, I want to solve for scale and translation so that the coordinates of a scatter plot fill up the screen space. I do that by declaring scale and translation variables on the ExpressionsBasedModel, transforming the scatter plot coordinates to screen coordinates with those variables, and then constructing linear constraints requiring the transformed coordinates to project inside the screen. I also add a negative cost to the scale variables so that they are maximized and the scatter plot covers as much screen space as possible.
My problem is that in some special cases, for example if I have only one point to plot, this results in an unbounded problem where the scale goes towards infinity without any constraint becoming active. How can I detect the scale variables for which this would happen and fix them to some default values?
To illustrate the above problem, I constructed a toy plotting library (the full library that I am working on is too big to fit in this question). To help layout the graphical elements, I have a problem class:
class Problem {
private ArrayList<Variable> _scale_variables = new ArrayList<Variable>();
private ExpressionsBasedModel _model = new ExpressionsBasedModel();
Variable freeVariable() {
return _model.addVariable();
}
Variable scaleVariable() {
Variable x = _model.addVariable();
x.lower(0.0); // Negative scale not allowed
_scale_variables.add(x);
return x;
}
Expression expr() {
return _model.addExpression();
}
Result solve() {
for (Variable scale_var: _scale_variables) {
// This may result in an unbounded solution in degenerate cases.
Expression expr = _model.addExpression("Encourage-larger-scale");
expr.set(scale_var, -1.0);
expr.weight(1.0);
}
return _model.minimise();
}
}
It wraps an ExpressionsBasedModel and has some facilities to create variables. For the transform that I will use to map my scatter point coordinates to screen coordinates, I have this class:
class Transform2d {
Variable x_scale;
Variable y_scale;
Variable x_translation;
Variable y_translation;
Transform2d(Problem problem) {
x_scale = problem.scaleVariable();
y_scale = problem.scaleVariable();
x_translation = problem.freeVariable();
y_translation = problem.freeVariable();
}
void respectBounds(double x, double y, double marker_size,
double width, double height,
Problem problem) {
// Respect left and right screen bounds
{
Expression expr = problem.expr();
expr.set(x_scale, x);
expr.set(x_translation, 1.0);
expr.lower(marker_size);
expr.upper(width - marker_size);
}
// Respect top and bottom screen bounds
{
Expression expr = problem.expr();
expr.set(y_scale, y);
expr.set(y_translation, 1.0);
expr.lower(marker_size);
expr.upper(height - marker_size);
}
}
}
The respectBounds method adds the constraints for a single scatter plot point to the Problem class mentioned before. To add all the points of a scatter plot, I have this function:
void addScatterPoints(
double[] xy_pairs,
// How much space every marker occupies
double marker_size,
Transform2d transform_to_screen,
// Screen size
double width, double height,
Problem problem) {
int data_count = xy_pairs.length/2;
for (int i = 0; i < data_count; i++) {
int offset = 2*i;
double x = xy_pairs[offset + 0];
double y = xy_pairs[offset + 1];
transform_to_screen.respectBounds(x, y, marker_size, width, height, problem);
}
}
First, let's look at what a non-degenerate case looks like. I specify the screen size and the size of the markers used for the scatter plot. I also specify the data to plot, build the problem and solve it. Here is the code:
Problem problem = new Problem();
double marker_size = 4;
double width = 800;
double height = 600;
double[] data_to_plot = new double[] {
1.0, 2.0,
4.0, 9.3,
7.0, 4.5};
Transform2d transform = new Transform2d(problem);
addScatterPoints(data_to_plot, marker_size, transform, width, height, problem);
Result result = problem.solve();
System.out.println("Solution: " + result);
which prints out Solution: OPTIMAL -81.0958904109589 # { 0, 81.0958904109589, 795.99999999999966, -158.19178082191794 }.
This is what a degenerate case looks like, plotting two points with the same y-coordinate:
Problem problem = new Problem();
double marker_size = 4;
double width = 800;
double height = 600;
double[] data_to_plot = new double[] {
1, 1,
9, 1
};
Transform2d transform = new Transform2d(problem);
addScatterPoints(data_to_plot, marker_size, transform, width, height, problem);
Result result = problem.solve();
System.out.println("Solution: " + result);
It displays Solution: UNBOUNDED -596.0 # { 88.44444444444444, 596, 0, 0 }.
As mentioned before, my question is: how can I detect the scale variables whose negative cost would result in an unbounded solution and constrain them to some default value, so that my solution is not unbounded?
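Of course I can check the result state after the fact and fall back to defaults, roughly like this (a sketch; it detects the degenerate run but does not tell me which scale variable caused it), but I would prefer to identify the problematic scale variables up front:
Result result = problem.solve();
if (result.getState() == Optimisation.State.UNBOUNDED) {
    // degenerate layout: ignore this result and fix the scales to defaults instead
    System.out.println("Layout problem is unbounded: " + result);
}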

Transform cartesian pixel-data-array to lat/lon pixel-data-array

I have an image (basically, I get raw image data as 1024x1024 pixels) and the position in lat/lon of the center pixel of the image.
Each pixel represents the same fixed pixel scale in meters (e.g. 30m per pixel).
Now, I would like to draw the image onto a map which uses the coordinate reference system "EPSG:4326" (WGS84).
When I draw it by defining just the corners of the image in lat/lon, based on an "image size in pixels * pixel scale" calculation and converting the distances from the center point into lat/lon coordinates for each corner, the image is not drawn correctly onto the map.
By "not drawn correctly" I mean that the image appears shifted, and the contents of the image are not at the map locations where I expected them to be.
I suppose this is the case because I "mix" a pixel-scaled image with the "EPSG:4326" coordinate reference system.
Now, with the information given, can I transform the whole pixel matrix from its fixed pixel scale into a new pixel matrix in the "EPSG:4326" coordinate reference system using GeoTools?
Of course, the transformation must depend on the center position in lat/lon that I have been given, and on the pixel scale.
I wonder if using something like this would point me into the correct direction:
MathTransform transform = CRS.findMathTransform(DefaultGeocentricCRS.CARTESIAN, DefaultGeographicCRS.WGS84, true);
DirectPosition2D srcDirectPosition2D = new DirectPosition2D(DefaultGeocentricCRS.CARTESIAN, degreeLat.getDegree(), degreeLon.getDegree());
DirectPosition2D destDirectPosition2D = new DirectPosition2D();
transform.transform(srcDirectPosition2D, destDirectPosition2D);
double transX = destDirectPosition2D.x;
double transY = destDirectPosition2D.y;
int kmPerPixel = mapImage.getWidth() / 1024; // It is known to me that my map is 1024x1024km ...
double x = zeroPointX + ((transX * 0.001) * kmPerPixel);
double y = zeroPointY + (((transX * -1) * 0.001) * kmPerPixel);
(I got this code from another SO thread and already modified it a little, but I still wonder if this is the correct starting point for my problem.)
I only suppose that my original image coordinate reference system is of the type DefaultGeocentricCRS.CARTESIAN. Can someone confirm this?
And from here on, is this the correct start to use Geotools for this kind of problem solving, or am I on the complete wrong path?
Additionally, I would like to add that this would be used in quite a dynamic system: the image updates at about 10 Hz, and the transformations have to be performed correspondingly often.
Again, will this initial idea of mine lead to a solution, or do you have other approaches to solving my problem?
Thank you very much,
Kiamur
This is not as simple as it might sound. You are essentially trying to define an area on a sphere (ellipsoid, technically) using a flat square. As such there is no "correct" way to do it, so you will always end up with some distortion. Without knowing exactly where your image came from there is no way to answer this exactly, but the following code provides you with 3 different possible answers:
The first two make use of GeoTools' GeodeticCalculator to calculate the corner points using bearings and distances. These are the blue "square" and the green "square" above. The blue one calculates the corners directly, while the green one calculates the edges and infers the corners from the intersections (that's why it is squarer).
final int width = 1024, height = 1024;
GeometryFactory gf = new GeometryFactory();
Point centre = gf.createPoint(new Coordinate(0,51));
WKTWriter writer = new WKTWriter();
//direct method
GeodeticCalculator calc = new GeodeticCalculator(DefaultGeographicCRS.WGS84);
calc.setStartingGeographicPoint(centre.getX(), centre.getY());
double height2 = height/2.0;
double width2 = width/2.0;
double dist = Math.sqrt(height2*height2+width2 *width2);
double bearing = 45.0;
Coordinate[] corners = new Coordinate[5];
for (int i=0;i<4;i++) {
calc.setDirection(bearing, dist*1000.0 );
Point2D corner = calc.getDestinationGeographicPoint();
corners[i] = new Coordinate(corner.getX(),corner.getY());
bearing+=90.0;
}
corners[4] = corners[0];
Polygon bbox = gf.createPolygon(corners);
System.out.println(writer.write(bbox));
double[] edges = new double[4];
bearing = 0;
for(int i=0;i<4;i++) {
calc.setDirection(bearing, height2*1000.0 );
Point2D corner = calc.getDestinationGeographicPoint();
if(i%2 ==0) {
edges[i] = corner.getY();
}else {
edges[i] = corner.getX();
}
bearing+=90.0;
}
corners[0] = new Coordinate( edges[1],edges[0]);
corners[1] = new Coordinate( edges[1],edges[2]);
corners[2] = new Coordinate( edges[3],edges[2]);
corners[3] = new Coordinate( edges[3],edges[0]);
corners[4] = corners[0];
bbox = gf.createPolygon(corners);
System.out.println(writer.write(bbox));
Another way to do this is to transform the centre point into a projection that is "flatter", use simple addition to calculate the corners, and then reverse the transformation. To do this we can use the AUTO projection defined by the OGC WMS Specification to generate an Orthographic projection centred on our point; this gives the red "square", which is very similar to the blue one.
String code = "AUTO:42003," + centre.getX() + "," + centre.getY();
// System.out.println(code);
CoordinateReferenceSystem auto = CRS.decode(code);
// System.out.println(auto);
MathTransform transform = CRS.findMathTransform(DefaultGeographicCRS.WGS84,
auto);
MathTransform rtransform = CRS.findMathTransform(auto,DefaultGeographicCRS.WGS84);
Point g = (Point)JTS.transform(centre, transform);
width2 *=1000.0;
height2 *= 1000.0;
corners[0] = new Coordinate(g.getX()-width2,g.getY()-height2);
corners[1] = new Coordinate(g.getX()+width2,g.getY()-height2);
corners[2] = new Coordinate(g.getX()+width2,g.getY()+height2);
corners[3] = new Coordinate(g.getX()-width2,g.getY()+height2);
corners[4] = corners[0];
bbox = gf.createPolygon(corners);
bbox = (Polygon)JTS.transform(bbox, rtransform);
System.out.println(writer.write(bbox));
Which solution to use is a matter of taste and depends on where your image came from, but I suspect that either the red or the blue will be best. If you need to do this at 10 Hz then you will need to test them for speed, although I suspect that transforming the images will be the bottleneck.
Once you have your bounding box set up to your satisfaction, you can convert your (unreferenced) image to a georeferenced coverage using:
GridCoverageFactory factory = CoverageFactoryFinder.getGridCoverageFactory(null);
GridCoverage2D gc = factory.create("name", image, new ReferencedEnvelope(bbox.getEnvelopeInternal(),DefaultGeographicCRS.WGS84));
String fileName = "myImage.tif";
AbstractGridFormat format = GridFormatFinder.findFormat(fileName);
File out = new File(fileName);
GridCoverageWriter writer = format.getWriter(out);
try {
writer.write(gc, null);
writer.dispose();
} catch (IllegalArgumentException | IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

Exception in BufferedImage.getData().getPixels()

I was looking to convert a BufferedImage to its corresponding pixel value array. I found code for that:
public static double[] createArrFromIm(BufferedImage im){
int imWidth = im.getWidth();
int imHeight = im.getHeight();
double[] imArr = new double[imWidth* imHeight];
im.getData().getPixels(0, 0, imWidth, imHeight, imArr);
return imArr;
}
The original author who wrote this code block also gave some sample images which work perfectly with this block. However, when I try to run this block against my images (the images are always 125x150), the block throws an ArrayIndexOutOfBoundsException at this line:
im.getData().getPixels(0, 0, imWidth, imHeight, imArr);
This seems very arcane to me. Any help or suggestion would be much appreciated. Thanks.
As @FiReTiTi says, you should use the getRaster() method instead of the getData() method, unless you really want a copy of the image data.
However, that is not the cause of the exception. The problem is that your double array only allocates space for a single band (similarly, FiReTiTi's version works because it explicitly passes 0 as the last parameter, requesting only the first band). This is fine for single-band (grayscale) images, but I assume you are using RGB, CMYK or another color model with multiple bands.
The fix is to multiply the allocated space with the number of bands, as below:
public static double[] createArrFromIm(BufferedImage im) {
int imWidth = im.getWidth();
int imHeight = im.getHeight();
int imBands = im.getRaster().getNumBands(); // typically 3 or 4, depending on RGB or ARGB
double[] imArr = new double[imWidth * imHeight * imBands];
im.getRaster().getPixels(0, 0, imWidth, imHeight, imArr);
return imArr;
}
Do it using the raster:
public static double[] createArrFromIm(BufferedImage im){
int imWidth = im.getWidth();
int imHeight = im.getHeight();
double[] imArr = new double[imWidth* imHeight];
for (int y=0, nb=0 ; y < imHeight ; y++)
for (int x=0 ; x < imWidth ; x++, nb++)
imArr[nb] = im.getRaster().getSampleDouble(x, y, 0) ;
return imArr;
}
As pointed out by @haraldK, getData() also works, but it returns a copy of the raster, so it's slower.
There is a faster way using the DataBuffer, but then you have to manage the BufferedImage encoding because you have a direct access to the pixel values.
And here is why what you did didn't work: im.getData().getPixels() returns an array; it does not fill the array you pass as a parameter. The array you pass just determines the return type. So if you want to use getData() (it's not the best option), you have to do:
double[] imArr = im.getData().getPixels(0, 0, imWidth, imHeight, (double[])null) ;
