I have a Java application that, when I press a button, records a point cloud's XYZ coordinates together with the corresponding pose.
What I want is to pick an object, record a point cloud from the front and one from the back, then merge the two clouds.
Obviously, to get a reasonable result I need to translate and rotate one or both of the recorded clouds. But I'm new to Project Tango and there must be something I'm missing.
I have read about this in this post.
There, @Jason Guo talks about these matrices:
start_service_T_device, imu_T_device, imu_T_depth
How can I get them?
Should I use getMatrixTransformAtTime()?
The first matrix is from the start of service to the device, but I'm using area learning, so my base frame is TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION.
Is it possible to use the same strategy in my case as well?
Just replacing start_service_T_device with something like area_description_T_device?
Side question
I want to extend this approach to the 3D reconstruction of objects.
I want to capture several point clouds of different views of the same object, then rotate and translate them with respect to some fixed axes. I'll then assume that two points (x,y,z) and (x',y',z') are the same point if x~=x' && y~=y' && z~=z' (a rough sketch of what I mean follows these questions).
This way I should be able to get a point cloud of the entire object, am I right?
Is this approach suitable?
Are there better alternatives?
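To make the side question concrete, the naive merge I have in mind would look roughly like this; epsilon is the tolerance for x~=x', and the class and method names are just for illustration.

import java.util.ArrayList;
import java.util.List;

public class NaiveCloudMerge {
    // Appends to the result every point of cloudB that is not already within
    // 'epsilon' (per axis) of some point of cloudA. Points are float[3] {x, y, z}.
    public static List<float[]> merge(List<float[]> cloudA, List<float[]> cloudB, float epsilon) {
        List<float[]> merged = new ArrayList<>(cloudA);
        for (float[] q : cloudB) {
            boolean duplicate = false;
            for (float[] p : cloudA) {
                if (Math.abs(p[0] - q[0]) < epsilon
                        && Math.abs(p[1] - q[1]) < epsilon
                        && Math.abs(p[2] - q[2]) < epsilon) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                merged.add(q);
            }
        }
        return merged;
    }
}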
The original post is a little bit out of date. Previously, we didn't have getMatrixTransformAtTime(), so you had to use Tango.getPoseAtTime to query each transformation and then chain them up with matrix multiplication.
But now, with getMatrixTransformAtTime, you can directly query area_description_T_depth, even in the OpenGL frame. To transform a point cloud into the ADF frame in OpenGL, you can use the following code (pseudo code):
TangoSupport.TangoMatrixTransformData transform =
        TangoSupport.getMatrixTransformAtTime(pointCloud.timestamp,
                TangoPoseData.COORDINATE_FRAME_AREA_DESCRIPTION,
                TangoPoseData.COORDINATE_FRAME_CAMERA_DEPTH,
                TangoSupport.TANGO_SUPPORT_ENGINE_OPENGL,
                TangoSupport.TANGO_SUPPORT_ENGINE_TANGO);
// Convert it into the matrix format you use in your renderer.
// This is a pure data structure conversion; the transform is
// already in the OpenGL world frame.
Matrix4x4 model_matrix = ConvertMatrix(transform);
foreach (Point p in pointCloud) {
    p = model_matrix * p;
}
// Now each p is in the OpenGL world frame.
But note that you need a valid area description frame in order to query a pose based on the area description, that is, after relocalizing against an ADF or while in learning mode.
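To make the pseudo code above concrete in Java, here is a minimal sketch of the per-point transform step. It assumes transform.matrix is the column-major float[16] expected by android.opengl.Matrix and that the depth points arrive as packed x, y, z triples in a FloatBuffer, so adapt it to the exact point cloud structure you receive.

import android.opengl.Matrix;
import java.nio.FloatBuffer;

public class PointCloudTransformer {
    // Applies a 4x4 column-major transform to every point and returns the
    // transformed points as packed x, y, z triples in the target (OpenGL world) frame.
    public static float[] transformPoints(float[] transformMatrix,
                                          FloatBuffer points, int numPoints) {
        float[] result = new float[numPoints * 3];
        float[] in = new float[4];
        float[] out = new float[4];
        for (int i = 0; i < numPoints; i++) {
            in[0] = points.get(i * 3);
            in[1] = points.get(i * 3 + 1);
            in[2] = points.get(i * 3 + 2);
            in[3] = 1f; // homogeneous coordinate
            Matrix.multiplyMV(out, 0, transformMatrix, 0, in, 0); // out = M * in
            result[i * 3] = out[0];
            result[i * 3 + 1] = out[1];
            result[i * 3 + 2] = out[2];
        }
        return result;
    }
}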
I want to make the cells of a palette clickable in Vuforia (without Unity) by tapping on the screen:
I found the Dominoes example with similar functionality and did the following:
create one plate object and multiple cell objects
call the isTapOnSetColor function with parameters x, y (the tap coordinates) on tap and get the coordinates,
the coordinates are correct, but the id/name of the tapped object is wrong
I think the problem is with this line:
boolean bool = checkIntersectionLine(matrix44F, lineStart, lineEnd);
In the Dominoes example this was:
bool intersection = checkIntersectionLine(domino->pickingTransform, lineStart, lineEnd);
But I don't know what domino->pickingTransform does, so instead of it I pass the modelViewMatrix (Tool.convertPose2GLMatrix(trackableResult.getPose()).getData()).
Full code of my touch function: http://pastebin.com/My4CkxHa
Can you help me get the taps working, or suggest another way (without Unity) to do this?
Basically, domino->pickingTransform is pretty much the final matrix each domino object is drawn with. The domino sample works by checking, for each object (domino), whether the projected screen touch intersects that object's matrix. The picking matrix is not exactly the same as the drawing matrix: to make the touch detection more forgiving, you make it a little wider.
You said you are getting a wrong id, but the question is: is it always the same id for different cells? If not, this is probably a small calculation error in your matrix transformations. I would suggest doing some visual debugging: add a graphical indication of the detected id, so you can see which cell the application thinks you clicked. This should help you progress towards the solution.
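As a rough illustration of that idea (not taken from the sample): each cell gets its own transform, the same one it is drawn with, slightly enlarged for picking, and the touch ray is tested against each one in turn. The checkIntersectionLine stub stands in for your existing helper from the Dominoes port, and the per-cell matrices are assumed to be built from Tool.convertPose2GLMatrix plus your per-cell translation.

import android.opengl.Matrix;

public class CellPicker {
    // cellTransforms holds, per cell, the same model-view matrix used to draw that
    // cell (trackable pose * per-cell translation/scale). Returns the index of the
    // first cell hit by the touch ray, or -1 if none.
    public int pickCell(float[][] cellTransforms, float[] lineStart, float[] lineEnd) {
        for (int i = 0; i < cellTransforms.length; i++) {
            float[] picking = cellTransforms[i].clone();
            // Make the picking volume ~20% wider than the drawn cell so taps near
            // the edges still register.
            Matrix.scaleM(picking, 0, 1.2f, 1.2f, 1.2f);
            if (checkIntersectionLine(picking, lineStart, lineEnd)) {
                return i;
            }
        }
        return -1;
    }

    // Placeholder for your existing intersection test from the Dominoes port.
    private boolean checkIntersectionLine(float[] transform,
                                          float[] lineStart, float[] lineEnd) {
        // ... your unchanged implementation ...
        return false;
    }
}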
I wanted to use Tango's RGB camera along with its depth data to create a point cloud involving only one color, but I'm not sure how to approach this.
What I ultimately want to do is reconstruct an object in Blender based on its XYZ values, and the way I'm trying to extract this object from its background is based on color, because it doesn't have any depth of its own, like a drawing on a 3D object.
I would recommend checking the examples in the Tango C API. It should be possible to do it all in Java, but the C example called cpp_rgb_depth_sync_example should give you several ideas.
Check the code in https://github.com/googlesamples/tango-examples-c
This example projects the point cloud information onto the color image... you just want to do the inverse!
For each point cloud:
- Gather the previous color image (the one closest in time)
- Using the camera intrinsics (see the example above), you can link each point of the point cloud to a pixel in the image (see the sketch below).
- Once you have the color for each point, you can remove the points you are not interested in.
One thing to keep in mind is that the color image is in a YUV format (you might want to convert it to RGB).
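For the projection step, here is a minimal sketch of the standard pinhole model u = fx * x / z + cx, v = fy * y / z + cy. It assumes the point is already expressed in the color camera frame and ignores lens distortion; the class and method names are mine, not from the sample.

import com.google.atap.tangoservice.TangoCameraIntrinsics;

public class PointColorLookup {
    // Returns the pixel index (row * width + col) in the color image for a 3D
    // point, or -1 if it projects outside the image.
    public static int projectToPixel(TangoCameraIntrinsics intrinsics,
                                     float x, float y, float z) {
        if (z <= 0f) {
            return -1; // behind the camera
        }
        int width = (int) intrinsics.width;
        int height = (int) intrinsics.height;
        int u = (int) Math.round(intrinsics.fx * x / z + intrinsics.cx);
        int v = (int) Math.round(intrinsics.fy * y / z + intrinsics.cy);
        if (u < 0 || u >= width || v < 0 || v >= height) {
            return -1;
        }
        return v * width + u;
    }
}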
I hope this will help.
I am encountering some problems using OpenCV on Android without NDK.
Currently I am doing a project at my university and my supervisors tell me that I should avoid camera calibration when reconstructing 3D objects from 2D images.
So far I have two 2D images and all the feature points, matches, good_matches, the fundamental matrix and the homography matrix. In addition, I have calculated the disparity map using StereoBM. The next step should be getting a 3D point cloud from all those values.
I checked the internet and found
Calib3d.reprojectImageTo3D(disparity, _3dImage, Q, false);
Using this method, I should be able to recreate the 3D point cloud... the current problem is that I do not have the matrix Q.
I think I will get this from the method
stereoRectify(...);
But as I should avoid camera calibration in this specific case, I cannot use this method. The alternative,
stereoRectifyUncalibrated(...);
does not provide Q...
Can someone please help me and show me how I can get Q or the point cloud in an easier way?
Thanks
To answer your question, the Q matrix required by reprojectImageTo3D represents the mapping from a pixel position and associated disparity (i.e. of the form [u; v; disp; 1]) to the corresponding 3D point [X; Y; Z; 1]. Unfortunately, you cannot derive this relation without knowing the cameras' intrinsics (matrix K) and extrinsics (rotation & translation between the two camera poses).
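For reference, with rectified cameras the Q matrix that stereoRectify would produce has the following form (f is the focal length, (c_x, c_y) the principal point of the first camera, c_x' that of the second, and T_x the baseline), which makes it explicit why the intrinsics and the baseline cannot be avoided:

\[
\begin{bmatrix} X \\ Y \\ Z \\ W \end{bmatrix}
= Q \begin{bmatrix} u \\ v \\ \mathrm{disp} \\ 1 \end{bmatrix},
\qquad
Q = \begin{bmatrix}
1 & 0 & 0 & -c_x \\
0 & 1 & 0 & -c_y \\
0 & 0 & 0 & f \\
0 & 0 & -1/T_x & (c_x - c_x')/T_x
\end{bmatrix}
\]

The reconstructed 3D point is then (X/W, Y/W, Z/W).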
Camera calibration is the usual way to estimate those. Your supervisor said that it is not an option; however, there are several different techniques (e.g. using a chessboard, or via autocalibration) with various requirements and possibilities. Hence, investigating exactly why calibration is off the table may help you find a method appropriate to your application.
If you really have no way of estimating the intrinsics, a possible solution could be Bundle Adjustment, using more than just 2 images. However, without the intrinsics, the 3D reconstruction will likely not be very useful. Which leads us to my second point.
There are several types of 3D reconstruction, the main ones being projective, metric and Euclidean. (For more details on this, see §10.2, p. 264 in "Multiple View Geometry in Computer Vision" by Hartley & Zisserman, 2nd edition.)
The Euclidean reconstruction is what most people mean by "3D reconstruction", though not necessarily what they need: a model of the scene which relates to the true model only by a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system). Hence, orthogonal angles in the scene are orthogonal in such a model, and a distance of 1 meter in the scene corresponds to 1 meter in the model. In order to obtain such a Euclidean 3D reconstruction, you need to know the intrinsics of at least some cameras AND the true distance between two given points in the scene.
The metric or similarity reconstruction is most of the time good enough, and refers to a 3D model of the scene which relates to the true model by a similarity transform, in other words by a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system) plus an overall scaling. In order to obtain such a metric reconstruction, you need to know the intrinsics of at least some cameras.
The projective reconstruction is what you will obtain if you have no knowledge about the scene or camera's intrinsics. Such a 3D model is not up-to-scale with respect to the observed scene, and angles which are orthogonal in the scene will probably not be orthogonal in the model.
Hence, knowing the intrinsic parameters of (some of) the cameras is crucial if you want an accurate reconstruction.
I am having some performance problems with OpenGL. I essentially want to create a grid of squares. I first tried an implementation where, for each square, I would translate to where I wanted it, multiply the model and view matrices, pass the result into the shader program and draw the square. I would do this for each square. After creating about 50 squares the frame rate would start to drop below what I desire.
I then tried a VBO method where I basically would generate a vertex buffer each time the squares change location. Frame rate increased dramatically with this approach, but I have too much latency when something changes because it has to regenerate all the vertex locations.
What I think I need is a matrix stack... I used OpenGL 1.1 before and would use push/pop. I don't really understand the concepts behind what that was doing, though, or how to reproduce it. Does anyone know of a good example of a matrix stack that I can use as a reference? Or possibly just a good explanation of one?
You can check this tutorial; it basically does the same thing you want to achieve, but with cubes instead of squares. It uses a VBO as well:
http://www.learnopengles.com/android-lesson-seven-an-introduction-to-vertex-buffer-objects-vbos/
Regarding the matrices: in OpenGL ES 2.0 you don't have any matrix-related functions anymore, but you can use the glm (OpenGL Mathematics) library, which does the same (and much more):
http://glm.g-truc.net/
It's a header-only library, so you just need to copy it somewhere and include it where you need it.
I'm not sure I completely understand your objective, but I guess you could copy the data of one square to the graphics card (using a VBO) and then repeatedly update the model matrix for every square.
The concept of a matrix stack makes sense if your squares have some kind of hierarchy between them (for instance, if one of them moves, the one to its left has to move accordingly).
You can imagine it as a skeleton made out of squares. If the shoulder moves, all the pieces in the arm will move as well (hands, fingers, and so on).
You can emulate that by using a matrix stack. You can create some kind of tree with all your squares, so that every square has a list of "descendants" which apply the same transformation as the parent. Then you can render all the squares recursively like this:
Apply the transform to the root square(s)
Push the transform onto a stack
Call the same render function for every child
Every child reads the matrix on top of the stack, multiplies it by its own transformation, pushes the new matrix onto the stack and calls its children
After that, every child pops the matrix it pushed before
Using glm is quite easy; you just need to create a stack (a std::vector in this case) of matrices:
std::vector<glm::mat4> matrixStack;
And then for every child:
glm::mat4 modelMatrix = matrixStack.back();
glm::mat4 nodeTransform = /* your node's local transform here */;
glm::mat4 newMatrix = modelMatrix * nodeTransform;
matrixStack.push_back(newMatrix);
/* Pass the new matrix to the shader and call glDrawArrays (or whatever) to render your square */
for (every child) {
    render();
}
matrixStack.pop_back();
For the drawing part, I guess you could bind the vertex array with the square vertices, and then update the model matrix in the shader for every child, before calling glDrawArrays.
I've imported some 3D models into Java3D and I want to change the pivot point of my model from the origin to a specific point.
Please don't just say to translate to the origin, rotate and then translate back;
I want to know the exact way.
This helped me. The idea of translation is actually good; do it as follows (a short sketch is given after the steps):
Create a TransformGroup, "tg" for example, containing the node you would like to rotate and/or translate.
Make sure you translate it so that the point you want as your pivot ends up at the origin.
Then create a new TransformGroup containing tg, and rotate it.
Then translate back (translate by the same vector multiplied by -1).
(Reference: "Rotation around a specific point (e.g., rotate around 0,0,0)"; this is what helped me.)
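Here is a minimal sketch of that nested-TransformGroup idea (the class and variable names are mine, not from any loader): the inner group moves the model so the desired pivot sits at the origin, the middle group applies the rotation, and the outer group translates back.

import javax.media.j3d.BranchGroup;
import javax.media.j3d.Transform3D;
import javax.media.j3d.TransformGroup;
import javax.vecmath.Vector3d;

public class PivotRotation {
    // Wraps 'model' so that a rotation of 'angleRad' around the Y axis happens
    // about 'pivot' instead of the origin. Attach the returned group to your scene.
    public static TransformGroup rotateAroundPivot(BranchGroup model,
                                                   Vector3d pivot,
                                                   double angleRad) {
        // Inner group: shift the model so that 'pivot' coincides with the origin.
        Transform3D toOrigin = new Transform3D();
        toOrigin.setTranslation(new Vector3d(-pivot.x, -pivot.y, -pivot.z));
        TransformGroup tg = new TransformGroup(toOrigin);
        tg.addChild(model);

        // Middle group: the actual rotation, now effectively around the pivot.
        Transform3D rot = new Transform3D();
        rot.rotY(angleRad);
        TransformGroup rotation = new TransformGroup(rot);
        rotation.addChild(tg);

        // Outer group: translate back so the model keeps its original position.
        Transform3D back = new Transform3D();
        back.setTranslation(pivot);
        TransformGroup outer = new TransformGroup(back);
        outer.addChild(rotation);
        return outer;
    }
}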
If I understand what you mean, you should traverse the scene graph produced by the model loader, find any GeometryArrays in it, and translate all the coordinates in the GeometryArrays (this is no simple task -- coordinates can be stored in a number of ways). That way a simple rotation transform would rotate around a different pivot point than before.
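A heavily simplified sketch of that traversal (it only handles the common case of a by-copy GeometryArray; by-reference or interleaved geometry needs more work, as noted above, and the ALLOW_COORDINATE_READ/WRITE capability bits must be set before the scene graph is made live):

import java.util.Enumeration;
import javax.media.j3d.Geometry;
import javax.media.j3d.GeometryArray;
import javax.media.j3d.Group;
import javax.media.j3d.Node;
import javax.media.j3d.Shape3D;
import javax.vecmath.Point3d;
import javax.vecmath.Vector3d;

public class PivotShifter {
    // Subtracts 'pivot' from every vertex so the desired pivot becomes the origin.
    public static void shiftPivot(Node node, Vector3d pivot) {
        if (node instanceof Group) {
            Enumeration<?> children = ((Group) node).getAllChildren();
            while (children.hasMoreElements()) {
                shiftPivot((Node) children.nextElement(), pivot);
            }
        } else if (node instanceof Shape3D) {
            Geometry geom = ((Shape3D) node).getGeometry();
            if (geom instanceof GeometryArray) {
                GeometryArray ga = (GeometryArray) geom;
                Point3d p = new Point3d();
                for (int i = 0; i < ga.getVertexCount(); i++) {
                    ga.getCoordinate(i, p);
                    p.sub(pivot); // move the pivot to the origin
                    ga.setCoordinate(i, p);
                }
            }
        }
    }
}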