I am encountering some problems using OpenCV on Android without NDK.
Currently I am doing a project at my university, and my supervisor tells me that I should avoid camera calibration when reconstructing 3D objects from 2D images.
So far I have two 2D images and have all the feature points, the matches, the good_matches, the fundamental matrix and the homography matrix. In addition I have calculated the disparity map using StereoBM. The next step should be getting a 3D point cloud from all those values.
I checked the internet and found
Calib3d.reprojectImageTo3D(disparity, _3dImage, Q, false);
Using this method, I should be able to recreate the 3D point cloud ... the current problem is that I do not have the matrix Q.
I think I will get this from the method
stereoRectify(...);
But as I should avoid camera calibration for this specific case, I cannot use this method. The alternative
stereoRectifyUncalibrated(...);
does not provide Q...
Can someone please help me and show me how I can get Q or the point cloud in an easier way?
Thanks
To answer your question, the Q matrix required by reprojectImageTo3D represents the mapping from a pixel position and associated disparity (i.e. of the form [u; v; disp; 1]) to the corresponding 3D point [X; Y; Z; 1]. Unfortunately, you cannot derive this relation without knowing the cameras' intrinsics (matrix K) and extrinsics (rotation & translation between the two camera poses).
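For reference, this is exactly what stereoRectify computes for calibrated cameras. In the notation of the OpenCV documentation, with rectified focal length f, principal point (c_x, c_y) of the first camera, c'_x of the second, and baseline T_x, Q has the form

$$Q = \begin{bmatrix} 1 & 0 & 0 & -c_x \\ 0 & 1 & 0 & -c_y \\ 0 & 0 & 0 & f \\ 0 & 0 & -1/T_x & (c_x - c'_x)/T_x \end{bmatrix}$$

Every non-trivial entry is a calibrated quantity, which is why it cannot be recovered from the fundamental matrix alone.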
Camera calibration is the usual way to estimate those. Your supervisor said that it is not an option; however, there are several different techniques (e.g. using a chessboard, or via autocalibration) with various requirements and possibilities. Hence, investigating exactly why calibration is off the table may enable you to find a method appropriate to your application.
If you really have no way of estimating the intrinsics, a possible solution could be Bundle Adjustment, using more than just 2 images. However, without the intrinsics, the 3D reconstruction will likely not be very useful. Which leads us to my second point.
There are several types of 3D reconstruction, the main types being: projective, metric and Euclidean. (For more details on this, see §10.2, p. 264 in "Multiple View Geometry in Computer Vision" by Hartley & Zisserman, 2nd Edition.)
The Euclidean reconstruction is what most people mean by "3D reconstruction", though not necessarily what they need: a model of the scene which relates to the true model only by a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system). Hence, angles which are orthogonal in the scene are orthogonal in such a model, and a distance of 1 meter in the scene corresponds to 1 meter in the model. In order to obtain such a Euclidean 3D reconstruction, you need to know the intrinsics of at least some cameras AND the true distance between two given points in the scene.
The metric or similarity reconstruction is good enough most of the time and refers to a 3D model of the scene which relates to the true model by a similarity transform, in other words a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system) plus an overall scaling. In order to obtain such a metric reconstruction, you need to know the intrinsics of at least some of the cameras.
The projective reconstruction is what you will obtain if you have no knowledge about the scene or the cameras' intrinsics. Such a 3D model is not up to scale with respect to the observed scene, and angles which are orthogonal in the scene will probably not be orthogonal in the model.
Hence, knowing the intrinsic parameters of (some of) the cameras is crucial if you want an accurate reconstruction.
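To make that dependency concrete: if you could obtain even rough intrinsics (for example a focal length from EXIF data) and a baseline estimate, a minimal sketch with the OpenCV Java bindings could look like the following. All numeric values are assumptions for illustration only; none of them can be derived from the fundamental matrix alone.

import org.opencv.calib3d.Calib3d;
import org.opencv.core.CvType;
import org.opencv.core.Mat;

public class PointCloudSketch {
    // disparity: the disparity map you already computed with StereoBM
    static Mat reproject(Mat disparity) {
        double f  = 700.0;             // focal length in pixels (assumed)
        double cx = 320.0, cy = 240.0; // principal point, assumed to be the image centre
        double tx = 0.1;               // baseline between the two camera positions, in metres (assumed)

        Mat q = Mat.zeros(4, 4, CvType.CV_64F);
        q.put(0, 0, 1.0); q.put(0, 3, -cx);
        q.put(1, 1, 1.0); q.put(1, 3, -cy);
        q.put(2, 3, f);
        q.put(3, 2, -1.0 / tx);        // Q[3][3] = 0, assuming both cameras share the same cx

        Mat points3d = new Mat();      // one 3-channel (X, Y, Z) entry per pixel
        Calib3d.reprojectImageTo3D(disparity, points3d, q, false);
        return points3d;
    }
}

Without meaningful values for f, cx, cy and the baseline, the resulting cloud will only be a projective reconstruction, as described above.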
Related
I was looking around a lot but only found how to check collisions for 2D objects. For my current project I want to check collisions in 3D (I'm using obj models). I could probably figure out something myself; the problem is that I only know the center point of each object.
Is there a way to get the borders of an object so I can check whether it touches the borders of another object? What would be the best way to get this information?
Edit: Some more information that might help:
I'm using lwjgl 2.8,
my objects are obj files,
I can get the position, scale and rotation of an object
EDIT: This is what I found on youtube:
https://www.youtube.com/watch?v=Iu6nAXFm2Wo&list=PLEETnX-uPtBXm1KEr_2zQ6K_0hoGH6JJ0&index=4
What you have is called a triangle soup[1]: you have vertex position (and texture and normal) information coupled with triangle information. What you can do is intersect each triangle from one mesh with the other mesh. You can do this either brute-force (by testing against all the other triangles each time) or by building a space partitioning data structure to speed up your intersections.
E.g. build an octree per mesh and iterate over the leaves in one of them: for each of those leaves, test its bounding box against the leaves in the other tree, and for each of those colliding pairs brute-force test the triangles within one leaf against the triangles within the other (or only test those from mesh A against those in mesh B, if you don't care about self-intersections).
There are libraries for these sorts of algorithms like OpenMesh or Bullet for example. But I only know of one port to Java: JBullet.
EDIT: If you're only interested in approximate collisions you can throw away all information about the triangles and build a bounding volume out of your vertices: an axis-aligned box is just the min and the max over all vertex positions; an oriented box is built similarly, but you have to find a good enough orientation first; a sphere is a bit more involved; and finally there is a convex mesh, which uses the same sorts of intersection tests as a normal mesh but is smaller than the original and convex, allowing for some optimisations of the general intersection test. A minimal sketch of the axis-aligned case follows below.
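Here is a hedged sketch of the axis-aligned bounding box idea, assuming the vertex positions have already been transformed into world space with your position/scale/rotation (the class and method names are made up):

public class Aabb {
    public final float[] min = { Float.MAX_VALUE, Float.MAX_VALUE, Float.MAX_VALUE };
    public final float[] max = { -Float.MAX_VALUE, -Float.MAX_VALUE, -Float.MAX_VALUE };

    // vertices: x0, y0, z0, x1, y1, z1, ... in world space
    public static Aabb fromVertices(float[] vertices) {
        Aabb box = new Aabb();
        for (int i = 0; i < vertices.length; i += 3) {
            for (int axis = 0; axis < 3; axis++) {
                box.min[axis] = Math.min(box.min[axis], vertices[i + axis]);
                box.max[axis] = Math.max(box.max[axis], vertices[i + axis]);
            }
        }
        return box;
    }

    // Two AABBs overlap iff their intervals overlap on every axis.
    public boolean intersects(Aabb other) {
        for (int axis = 0; axis < 3; axis++) {
            if (max[axis] < other.min[axis] || other.max[axis] < min[axis]) return false;
        }
        return true;
    }
}

Rebuild the boxes whenever an object moves; a collision check is then just boxA.intersects(boxB), and for more precision you only fall back to the triangle tests described above for pairs whose boxes overlap.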
[1]: There are other kinds of 3D representations that record information about how the triangles are connected. You might imagine finding just one pair of intersecting triangles (one from mesh A, one from mesh B) and then searching for further intersecting triangles only in the neighbourhood of the initial intersection... These algorithms are much more involved, but for meshes that deform they have the advantage that you don't have to rebuild the space partitioning data structure each time, as you do with triangle soups.
I see that there are some fairly similar questions, but those are mostly about performance. I would be very grateful if someone could take the time to explain how I could feed what I currently have into the vertex shader (for animation, as the title states).
I have a simple .FBX file reader which extracts the following:
Vertex coordinates (X, Y, Z);
Vertex indices;
Normal coordinates (X, Y, Z);
As far as bones go:
Bone names
Indices of the vertices attached to the bones;
The weight of each of the vertices;
A 4x4 matrix array of the bone. (Not sure how this is laid out, please explain! Wouldn't the matrix only hold the position of 1 end?)
Help appreciated!
Looks like you may be missing a few things. Most notably, the bone hierarchy.
The way it typically works is that you have a root node whose transformations propagate down to the next level of "bones", whose transformations propagate further, and so on until you reach the "leaf" bones, i.e. bones with no children. This transformation chain is done using matrix multiplication.
There are a few ways to then transform the vertices. It's sometimes done with an "influence" that is decided on beforehand and is usually loaded from a 3D application such as 3ds Max.
There's another way of doing this which is simple and straightforward: you simply calculate the distance from the vertex to the bone node. The influence of the bone's transformation is determined by this distance (the closer the vertex, the stronger the influence).
The 4x4 matrix you speak of holds much more than just position. It holds rotational data as well. It has the potential to hold scaling data as well but this typically isn't used in skinning/bone applications.
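To make the transformation chain and the weighted influences concrete, here is a hedged CPU-side sketch using LWJGL's Matrix4f/Vector4f (any 4x4 matrix class would do). The field and method names are illustrative, not part of the FBX format; the same weighted sum is what a skinning vertex shader computes per vertex, with the bone matrices passed in as uniforms.

import org.lwjgl.util.vector.Matrix4f;
import org.lwjgl.util.vector.Vector4f;

class Bone {
    Bone parent;              // null for the root
    Matrix4f local;           // transform relative to the parent for the current pose
    Matrix4f inverseBind;     // moves a vertex from model space into this bone's space
    Matrix4f global;          // filled in by computeGlobal()

    void computeGlobal() {    // call on the root first, then on its children, and so on
        global = (parent == null)
                ? new Matrix4f(local)
                : Matrix4f.mul(parent.global, local, null); // parent chain * local
    }
}

class Skinner {
    // skinned = sum_i weight_i * (global_i * inverseBind_i * bindPosition)
    static Vector4f skin(Vector4f bindPosition, Bone[] bones, float[] weights) {
        Vector4f result = new Vector4f(0, 0, 0, 0);
        for (int i = 0; i < bones.length; i++) {
            Matrix4f skinMatrix = Matrix4f.mul(bones[i].global, bones[i].inverseBind, null);
            Vector4f posed = Matrix4f.transform(skinMatrix, bindPosition, null);
            result.x += weights[i] * posed.x;
            result.y += weights[i] * posed.y;
            result.z += weights[i] * posed.z;
            result.w += weights[i] * posed.w;
        }
        return result;
    }
}

The vertex indices and weights you extracted from the FBX file decide which bones and which weights go into that call for each vertex.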
Getting the orchestra to play together nicely requires a thorough understanding of matrix math/coordinate systems. I'd grab a book on the subject of 3D math for game programming. Any book worth its weight will have an entire section devoted to this topic.
Say you have a collection of points with coordinates on a Cartesian coordinate system.
You want to plot another point, and you know its coordinates in the same Cartesian coordinate system.
However, the plot you're drawing on is distorted from the original. Imagine taking the original plane, printing it on a rubber sheet, and stretching it in some places and pinching it in others, in an asymmetrical way (no overlapping or anything complex).
You know the stretched and unstretched coordinates of each of your set of points, but not the underlying stretch function. You know the unstretched coordinates of a new point.
How can you estimate where to plot the new point in the stretched coordinates based on the stretched positions of nearby points? It doesn't need to be exact, since you can't determine the actual stretch function from a set of remapped points unless you have more information.
other possible keywords: warped distorted grid mesh plane coordinate unwarp
Ok, so this sounds like image warping. This is what you should do:
Create a Delaunay triangulation of your unwarped grid and use your knowledge of the correspondences between the warped and unwarped grid to create the triangulation for the warped grid. Now you know the corresponding triangles in each image and since there is no overlapping, you should be able to perform the next step without much difficulty.
Now, to find the point corresponding to A in the warped image:
Find the triangle A lies in and use the transformation between the triangle in the unwarped grid and the warped grid to figure out the new position.
This is explained in detail here.
Another (much more complicated) method is the Thin Plate Spline (which is also explained in the slides above).
I understand that you have a one-to-one correspondence between the warped and unwarped grid points. And I assume that the deformation is not so extreme that you might have intersecting grid lines (like the image you show).
The strategy is exactly what Jacob suggests: Triangulate the two grids such that there is a one-to-one correspondence between triangles, locate the point to be mapped in the triangulation and then use barycentric coordinates in the corresponding triangle to compute the new point location.
Preprocess
Generate the Delaunay triangulation of the points of the warped grid; let's call it WT.
For every triangle in WT add a triangle between the corresponding vertices in the unwarped grid. This gives a triangulation UWT of the unwarped points.
Map a point p into the warped grid
Find the triangle T(p1,p2,p3) in the UWT which contains p.
Compute the barycentric coordinates (b1,b2,b3) of p in T(p1,p2,p3)
Let Tw(q1,q2,q3) be the triangle in WT corresponding to T(p1,p2,p3). The new position is b1 * q1 + b2 * q2 + b3 * q3.
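A small plain-Java sketch of those mapping steps, with 2D points passed as {x, y} arrays (the method name is made up): the barycentric coordinates are computed in the unwarped triangle and then reused as weights on the warped triangle's vertices.

static double[] mapThroughTriangles(double[] p,
                                    double[] p1, double[] p2, double[] p3,   // triangle in UWT containing p
                                    double[] q1, double[] q2, double[] q3) { // corresponding triangle in WT
    double det = (p2[1] - p3[1]) * (p1[0] - p3[0]) + (p3[0] - p2[0]) * (p1[1] - p3[1]);
    double b1 = ((p2[1] - p3[1]) * (p[0] - p3[0]) + (p3[0] - p2[0]) * (p[1] - p3[1])) / det;
    double b2 = ((p3[1] - p1[1]) * (p[0] - p3[0]) + (p1[0] - p3[0]) * (p[1] - p3[1])) / det;
    double b3 = 1.0 - b1 - b2;
    // p lies inside the triangle iff b1, b2 and b3 are all in [0, 1].
    return new double[] {
        b1 * q1[0] + b2 * q2[0] + b3 * q3[0],
        b1 * q1[1] + b2 * q2[1] + b3 * q3[1]
    };
}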
Remarks
This gives a deformation function that is a linear spline. For smoother behavior one could use the same triangulation but do a higher-order approximation, which would lead to a somewhat more complicated computation than the barycentric coordinates.
The other answers are great. The only thing I'd add is that you might want to take a look at Free form deformation as a way of describing the deformations.
If that's useful, then it's quite possible to fit a deformation grid/lattice to your known pairs, and then you have a very fast method of deforming future points.
A lot depends on how many existing points you have. If you have only one, there's not really much you can do with it -- you can offset the second point by the same amount in the same direction, but you don't have enough data to really do any better than that.
If you have a fair number of existing points, you can do a surface fit through those points, and use that to approximate the proper position of the new point. Given N points, you can always get a perfect fit using an order N polynomial, but you rarely want to do that -- instead, you usually guess that the stretch function is a fairly low-order function (e.g. quadratic or cubic) and fit a surface to the points on that basis. You then place your new point based on the function for your fitted surface.
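As one concrete version of that idea: fit the warped x and the warped y coordinates separately as, say, quadratic functions of the unwarped (x, y) by least squares. A hedged plain-Java sketch, solving the normal equations directly (all names are made up, and a proper linear-algebra library would be preferable for anything serious):

// Fits f(x, y) = a0 + a1*x + a2*y + a3*x*x + a4*x*y + a5*y*y to the samples by
// least squares and returns the 6 coefficients.
static double[] fitQuadratic(double[] x, double[] y, double[] target) {
    int n = x.length, m = 6;
    double[][] ata = new double[m][m];
    double[] atb = new double[m];
    for (int i = 0; i < n; i++) {
        double[] row = { 1, x[i], y[i], x[i] * x[i], x[i] * y[i], y[i] * y[i] };
        for (int j = 0; j < m; j++) {
            atb[j] += row[j] * target[i];
            for (int k = 0; k < m; k++) ata[j][k] += row[j] * row[k];
        }
    }
    return solve(ata, atb); // normal equations: (A^T A) a = A^T b
}

// Gaussian elimination with partial pivoting; modifies a and b in place.
static double[] solve(double[][] a, double[] b) {
    int m = b.length;
    for (int col = 0; col < m; col++) {
        int pivot = col;
        for (int r = col + 1; r < m; r++)
            if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
        double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
        double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
        for (int r = col + 1; r < m; r++) {
            double f = a[r][col] / a[col][col];
            for (int c = col; c < m; c++) a[r][c] -= f * a[col][c];
            b[r] -= f * b[col];
        }
    }
    double[] sol = new double[m];
    for (int r = m - 1; r >= 0; r--) {
        double s = b[r];
        for (int c = r + 1; c < m; c++) s -= a[r][c] * sol[c];
        sol[r] = s / a[r][r];
    }
    return sol;
}

You would call fitQuadratic twice, once with the known warped x coordinates as the target and once with the warped y coordinates, and then evaluate both polynomials at the new point's unwarped (x, y).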
I want to find a circular object (the iris of an eye; I have used a Haar cascade with the Viola-Jones algorithm). I found that the Hough circle transform would be the correct way to do it. Can anybody explain how to implement the Hough circle transform in Java, or point to any other easy way to find an iris with Java?
Thanks,
Duda and Hart (1971) has a pretty clear explanation of the Hough transform and a worked example. It's not difficult to produce an implementation directly from that paper, so it's a good place for you to start.
ImageJ provides a Hough Circle plugin. I've played around with it several times in the past.
You could take a look at the source code if you want or need to modify it.
If you want to find an iris you should be straightforward about this. The boundary of the iris you are after is actually called the limbus. Also note that the contrast of the limbus is much lower than that of the pupil, so if image resolution permits, the pupil is a better target. Java is not a great choice of programming language here since 1. it is slow while the processing is intense; and 2. the classic Hough circle transform requires a 3D accumulator, and since Java probably means running on a cell phone, the memory requirements will be tough.
What you can do is use the fact that there is probably a single limbus (or only a few) in the image. The first thing to do is to reduce the dimensionality of the problem from 3 to 2 by using oriented edges: extract horizontal and vertical edges that together represent edge orientation (they can be considered the horizontal and vertical components of an edge vector). The simple idea is that the dominant intersection of edge vectors is the center of your limbus. To find an intersection you only need two oriented edges instead of the three points that define a circle, hence the dimensionality reduction from 3 to 2.
You also don't need a classical Hough circle transform with a huge accumulator and numerous calculations to find this intersection. A randomized Hough transform will be much faster. Here is how it works (similar to RANSAC): you select a minimal number of oriented edges at random (in your case 2), find their intersection, and then find all the edges whose lines pass through approximately the same location. These are the inliers. You iterate 10-30 times, choosing a different random sample of 2 edges each time, and settle on the sample with the maximum number of inliers. Hopefully these inliers lie on the limbus. The median of the inlier ray intersections gives you the center of the circle, and the median distance from the center to the inliers gives the radius.
In the picture below, bright colors correspond to inliers and orientation is shown with small line segments. The set of original edges is shown in the middle (horizontal only). While the original edges lie along an ellipse, the Hough edges were transformed by an affine transform to make those belonging to the limbus lie on a circle. Also note that the edge orientations are pretty noisy.
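Here is a rough, hedged sketch of that randomized scheme in plain Java. It assumes the oriented edges (position plus unit gradient direction, e.g. from Sobel filters) are already extracted; the class and method names and the parameter values are illustrative only.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// One oriented edge: position (x, y) plus unit gradient direction (dx, dy).
// On a circle the gradient at an edge point is aligned with the ray from the
// centre, so the centre lies (approximately) on the line p + t * d.
class OrientedEdge {
    double x, y, dx, dy;
    OrientedEdge(double x, double y, double dx, double dy) {
        this.x = x; this.y = y; this.dx = dx; this.dy = dy;
    }
}

class RandomizedCircleFinder {
    // Distance from a candidate centre to the edge's gradient line (|d x (c - p)| for unit d).
    static double lineDistance(OrientedEdge e, double cx, double cy) {
        return Math.abs(e.dx * (cy - e.y) - e.dy * (cx - e.x));
    }

    // Intersection of the gradient lines of two edges; null if nearly parallel.
    static double[] intersect(OrientedEdge a, OrientedEdge b) {
        double det = a.dy * b.dx - a.dx * b.dy;
        if (Math.abs(det) < 1e-9) return null;
        double t = (b.dx * (b.y - a.y) - b.dy * (b.x - a.x)) / det;
        return new double[] { a.x + t * a.dx, a.y + t * a.dy };
    }

    // ~RANSAC: sample pairs of edges and keep the centre supported by the most edges.
    static double[] findCircle(List<OrientedEdge> edges, int iterations, double tolerance) {
        if (edges.size() < 2) return null;
        Random rng = new Random();
        double[] bestCentre = null;
        int bestInliers = -1;
        for (int it = 0; it < iterations; it++) {
            OrientedEdge a = edges.get(rng.nextInt(edges.size()));
            OrientedEdge b = edges.get(rng.nextInt(edges.size()));
            double[] centre = intersect(a, b);
            if (centre == null) continue;
            int inliers = 0;
            for (OrientedEdge e : edges)
                if (lineDistance(e, centre[0], centre[1]) < tolerance) inliers++;
            if (inliers > bestInliers) { bestInliers = inliers; bestCentre = centre; }
        }
        // Radius: median distance from the chosen centre to the inlier edge positions.
        List<Double> dists = new ArrayList<>();
        for (OrientedEdge e : edges)
            if (lineDistance(e, bestCentre[0], bestCentre[1]) < tolerance)
                dists.add(Math.hypot(e.x - bestCentre[0], e.y - bestCentre[1]));
        Collections.sort(dists);
        return new double[] { bestCentre[0], bestCentre[1], dists.get(dists.size() / 2) };
    }
}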
I need a suggestion/idea for how to create a 3D tag cloud in Java (Swing)
(exactly like shown here: http://www.adesblog.com/2008/08/27/wp-cumulus-plugin/)
Could you help, please?
I'd go with either Swing and Java2D or OpenGL (JOGL).
I have used OpenGL a few times, and drawing text is easy using JOGL's extensions (TextRenderer).
If you choose Swing, then the hard part will be implementing the 3D transformation. You'd have to write some sort of particle system, where the particles reside on a 3D sphere. You personally would be responsible for doing all 3D transformations, but using an orthogonal projection that is trivial. So it's a nice exercise - what you need is here: Wikipedia's spherical coordinate system article and its 3D-to-2D projection article.
After you have done all of the transformations, only the drawing is left, and Java2D and Swing have a very convenient API for this. It boils down to picking a font size and drawing text at the given coordinates. A custom JPanel with an overridden paintComponent method would be enough to start and finish.
As for the second choice, the hardest part is the OpenGL API itself. It's procedural, so if you're familiar mostly with Java you may have a hard time using non-OO stuff. It can be gotten used to and, to be honest, can be quite rewarding, since you can do a lot with it. If you picked OpenGL you would get all the 3D transformations for free, but you would still have to transform from the spherical coordinate system to Cartesian yourself (the first wiki article is still helpful). After that it's just a matter of using some text drawing class, such as the TextRenderer that comes with the JOGL distribution.
So OpenGL helps you with the view projection calculations and is hardware accelerated. Java2D requires more math on your side, but in my opinion this approach seems a bit easier. Oh, and by the way - Java2D tries to use whatever graphics acceleration there is (OpenGL or DirectDraw) internally, so you are shielded from certain low-level problems.
For both options you also need to bind the mouse coordinates to the rotational speed of the sphere. Whether it's Java2D or OpenGL, the code will look very similar: just map the mouse coordinates relative to the center of the panel to some speed vector, and at drawing time use that vector to rotate the sphere accordingly.
And one more thing: if you want to try OpenGL, I'd recommend the Processing language, created at MIT especially for rich graphics applets. Its 3D API, not so coincidentally, is almost the same as OpenGL, but without much of the cruft. So if you want the quickest prototype, that's the best bet. Consult this discussion thread for an actual example. Note: Processing is written in Java.
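For the Swing/Java2D route, here is a minimal sketch of the whole pipeline (spherical coordinates to Cartesian, rotation around one axis, orthogonal projection, depth-based font size and gray shade). Class and field names are made up, and the rotation angle would be driven by your mouse-based speed vector.

import javax.swing.JPanel;
import java.awt.Color;
import java.awt.Font;
import java.awt.Graphics;

public class TagCloudPanel extends JPanel {
    private final String[] tags;
    private final double[][] unit;   // one unit vector (point on the sphere) per tag
    private double angle = 0;        // rotation around the Y axis

    public TagCloudPanel(String[] tags) {
        this.tags = tags;
        this.unit = new double[tags.length][3];
        for (int i = 0; i < tags.length; i++) {
            // Roughly even distribution in spherical coordinates (polar angle phi, azimuth theta).
            double phi = Math.acos(1 - 2.0 * (i + 0.5) / tags.length);
            double theta = Math.PI * (1 + Math.sqrt(5)) * i;
            unit[i][0] = Math.sin(phi) * Math.cos(theta);
            unit[i][1] = Math.cos(phi);
            unit[i][2] = Math.sin(phi) * Math.sin(theta);
        }
    }

    public void setAngle(double radians) { angle = radians; repaint(); }

    @Override
    protected void paintComponent(Graphics g) {
        super.paintComponent(g);
        int cx = getWidth() / 2, cy = getHeight() / 2;
        double radius = Math.min(cx, cy) * 0.8;
        for (int i = 0; i < tags.length; i++) {
            // Rotate the sphere around the Y axis.
            double x = unit[i][0] * Math.cos(angle) + unit[i][2] * Math.sin(angle);
            double y = unit[i][1];
            double z = -unit[i][0] * Math.sin(angle) + unit[i][2] * Math.cos(angle);
            // Orthogonal projection: drop z, use it only to fake depth.
            double depth = (z + 1) / 2;                     // 0 = far, 1 = near
            int shade = (int) (200 - depth * 200);          // nearer = darker
            g.setColor(new Color(shade, shade, shade));
            g.setFont(new Font("SansSerif", Font.PLAIN, 10 + (int) (depth * 14)));
            g.drawString(tags[i], cx + (int) (x * radius), cy + (int) (y * radius));
        }
    }
}

A mouse listener would then map the cursor offset from the panel centre to a speed, and a Swing timer would advance the angle and repaint.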
That's not really 3D. There are no perspective transformations or mapping of the text onto some 3D shape (such as, say, a sphere). What you have is a bunch of strings where each string has an associated depth (or Z order). Strings "closer" to you are painted with a stronger shade of gray and a larger font size.
The motion of each string as you move the mouse is indeed a 3D shape which looks like a slanted circle around a fixed center, with the slant depending on where the mouse cursor is. That's simple math: if you figure it out for one string, you figure it out for all. And then the last piece would be to scatter the strings so that they don't overlap too much, and give each one an initial weight based on its frequency.
That's what most of the code is doing. So you need to either do the math, or translate the ActionScript to Java2D blindly. And no, there is no need for JOGL.
Why don't you just download the source code, and have a look? Even if you can't write PHP, it should still be possible to read it and figure out how the algorithm works.