Confusion regarding skeletal animation in the vertex shader - java

I see that there are some fairly similar questions, but those are mostly about performance. I would be very grateful if someone could take the time to explain how I could implement what I currently have in the vertex shader (for animation, as the title states).
I have a simple .FBX file reader which extracts the following:
Vertex coordinates (X, Y, Z);
Vertex indices;
Normal coordinates (X, Y, Z);
As far as bones go:
Bone names
Indices of the vertices attached to the bones;
The weight of each of the vertices;
A 4x4 matrix array for the bones. (Not sure how this is laid out, please explain! Wouldn't the matrix only hold the position of one end?)
Help appreciated!

Looks like you may be missing a few things. Most notably, the bone hierarchy.
The way it typically works is that you have a root node, whose transformations propagate down to the next level of "bones", whose transformations propagate further, and so on until you reach the "leaf" bones, i.e. bones with no children. This transformation chain is done using matrix multiplication.
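As a rough sketch of that chain (assuming a hypothetical Bone class and JOML's Matrix4f for the 4x4 matrices; any matrix class from your math library would do the same job), you walk the tree from the root and multiply each bone's local matrix by its parent's accumulated matrix:

import org.joml.Matrix4f;
import java.util.ArrayList;
import java.util.List;

// Hypothetical bone node: name and local matrix come from the FBX file,
// the parent/child links are the hierarchy described above.
class Bone {
    String name;
    Matrix4f localTransform = new Matrix4f();   // this bone relative to its parent
    Matrix4f globalTransform = new Matrix4f();  // this bone in model space
    List<Bone> children = new ArrayList<>();
}

public class BoneHierarchy {
    // Each bone's global transform is parentGlobal * localTransform.
    static void updateGlobalTransforms(Bone bone, Matrix4f parentGlobal) {
        parentGlobal.mul(bone.localTransform, bone.globalTransform);
        for (Bone child : bone.children) {
            updateGlobalTransforms(child, bone.globalTransform);
        }
    }

    public static void main(String[] args) {
        Bone root = new Bone();
        Bone arm = new Bone();
        root.children.add(arm);
        arm.localTransform.translation(0f, 1f, 0f);   // example local offset
        updateGlobalTransforms(root, new Matrix4f()); // identity at the root
        System.out.println(arm.globalTransform);
    }
}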
There are a few ways to then transform the vertices. It's sometimes done with an "influence" (a per-vertex weight) that is decided on beforehand and is usually loaded from a 3D application such as 3dsmax.
There's another way of doing this which is simple and straightforward. You simply calculate the distance from the vertex to the bone node. The influence of the bone's transformation is directly related to this distance.
The 4x4 matrix you speak of holds much more than just position. It holds rotation data as well. It can also hold scaling data, but that typically isn't used in skinning/bone applications.
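Putting the matrices and weights together, the skinning itself is just a weighted sum of the vertex transformed by each attached bone. Here is a minimal CPU-side sketch of the same math a vertex shader would do, assuming the bone matrices already include the inverse bind pose and that the weights per vertex sum to 1 (again using JOML types):

import org.joml.Matrix4f;
import org.joml.Vector4f;

public class SkinningSketch {
    // bindPosition: the vertex position from the FBX file (w = 1).
    // boneMatrices: final skinning matrices; boneIndices/weights: per-vertex bone data.
    static Vector4f skinVertex(Vector4f bindPosition,
                               Matrix4f[] boneMatrices,
                               int[] boneIndices,
                               float[] weights) {
        Vector4f result = new Vector4f(0f, 0f, 0f, 0f);
        for (int i = 0; i < boneIndices.length; i++) {
            // Transform the bind-pose position by this bone, scale by its weight, accumulate.
            Vector4f transformed = boneMatrices[boneIndices[i]]
                    .transform(bindPosition, new Vector4f());
            result.add(transformed.mul(weights[i]));
        }
        return result;
    }
}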
Getting the orchestra to play together nicely requires a thorough understanding of matrix math/coordinate systems. I'd grab a book on the subject of 3D math for game programming. Any book worth its weight will have an entire section devoted to this topic.

Related

Java collision detection with 3d obj models

I have looked around a lot but only found how to check collision for 2D objects. For my current project I want to check collisions in 3D (I'm using OBJ models). I could probably figure out something myself; the problem is that I only know the center point of each object.
Is there a way to get the borders of the object so I can check whether it touches the borders of another object? What would be the best way to get this information?
Edit: Some more information that might help:
I'm using lwjgl 2.8,
my objects are obj files,
I can get position, scale and the rotation of an object
EDIT: This is what I found on youtube:
https://www.youtube.com/watch?v=Iu6nAXFm2Wo&list=PLEETnX-uPtBXm1KEr_2zQ6K_0hoGH6JJ0&index=4
What you have is called a triangle soup[1]: you have vertex position (plus texture and normal) information coupled with triangle information. What you can do is intersect each triangle from one mesh with another mesh. You can do this either brute-force (by testing all the other triangles each time) or by building a space-partitioning data structure to speed up your intersections.
E.g. build an octree per mesh and iterate over the leaves in one of them: for each of those leaves, test its bounding box against the leaves in the other tree, and for each of those collision pairs brute-force test each triangle in one leaf against each triangle in the other (or only test those from mesh A against those in mesh B, if you don't care about self-intersections).
There are libraries for these sorts of algorithms like OpenMesh or Bullet for example. But I only know of one port to Java: JBullet.
EDIT: If you're only interested in approximate collisions you can throw away all the triangle information and build a bounding volume out of your vertices. An axis-aligned box is just the min and the max over all vertex positions; an oriented box is built similarly, but you have to find a good enough orientation first; a sphere is a bit more involved; and finally there is the convex hull, which uses the same sorts of intersection tests as a normal mesh, but is smaller than the original and convex, allowing for some optimisations of the general intersection test.
[1]: There are other kinds of 3D representations that record information about how the triangles are connected. You might imagine finding just two intersecting triangles in mesh A and mesh B and then searching for further intersecting triangles only in the neighbourhood of the initial intersection... these algorithms are much more involved, but for meshes that deform they have the advantage that you don't have to rebuild the space-partitioning data structure each time, as you do with triangle soups.
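As a concrete illustration of the axis-aligned box from the edit above, here is a minimal sketch, assuming the OBJ loader hands you a flat float array of x, y, z positions already in world space:

public class Aabb {
    float minX = Float.POSITIVE_INFINITY, minY = Float.POSITIVE_INFINITY, minZ = Float.POSITIVE_INFINITY;
    float maxX = Float.NEGATIVE_INFINITY, maxY = Float.NEGATIVE_INFINITY, maxZ = Float.NEGATIVE_INFINITY;

    // The box is just the min and the max of all vertex positions.
    static Aabb fromVertices(float[] vertices) {
        Aabb box = new Aabb();
        for (int i = 0; i < vertices.length; i += 3) {
            box.minX = Math.min(box.minX, vertices[i]);
            box.maxX = Math.max(box.maxX, vertices[i]);
            box.minY = Math.min(box.minY, vertices[i + 1]);
            box.maxY = Math.max(box.maxY, vertices[i + 1]);
            box.minZ = Math.min(box.minZ, vertices[i + 2]);
            box.maxZ = Math.max(box.maxZ, vertices[i + 2]);
        }
        return box;
    }

    // Two AABBs overlap exactly when they overlap on all three axes.
    static boolean intersects(Aabb a, Aabb b) {
        return a.minX <= b.maxX && a.maxX >= b.minX
            && a.minY <= b.maxY && a.maxY >= b.minY
            && a.minZ <= b.maxZ && a.maxZ >= b.minZ;
    }
}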

reprojectImageTo3D - Where do I get Q

I am encountering some problems using OpenCV on Android without NDK.
Currently I am doing a project at my university and my supervisor tells me that I should avoid camera calibration when reconstructing 3D objects from 2D images.
So far I have two 2D images and have all the feature points, matches, good matches, the fundamental matrix and the homogeneous matrix. In addition I have calculated the disparity map using StereoBM. The next step should be getting a 3D point cloud from all those values.
I checked the internet and found
Calib3d.reprojectImageTo3D(disparity, _3dImage, Q, false);
Using this method, I should be able to recreate the 3D point cloud... the current problem is that I do not have the matrix Q.
I think I will get this from the method
stereoRectify(...);
But as I should avoid camera calibration in this specific case, I cannot use this method. The alternative,
stereoRectifyUncalibrated(...);
does not provide Q...
Can someone please help me and show me how I can get Q or the point cloud in an easier way?
Thanks
To answer your question, the Q matrix required by reprojectImageTo3D represents the mapping from a pixel position and associated disparity (i.e. of the form [u; v; disp; 1]) to the corresponding 3D point [X; Y; Z; 1]. Unfortunately, you cannot derive this relation without knowing the cameras' intrinsics (matrix K) and extrinsics (rotation & translation between the two camera poses).
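For reference, the Q that stereoRectify produces for rectified cameras has the standard form below, where f is the focal length, (c_x, c_y) the principal point, T_x the baseline and c_x' the principal point of the second camera; this is exactly why the intrinsics and extrinsics are needed:

Q =
\begin{pmatrix}
1 & 0 & 0 & -c_x \\
0 & 1 & 0 & -c_y \\
0 & 0 & 0 & f \\
0 & 0 & -1/T_x & (c_x - c_x')/T_x
\end{pmatrix},
\qquad
Q \begin{pmatrix} u \\ v \\ \mathrm{disp} \\ 1 \end{pmatrix}
= \begin{pmatrix} X \\ Y \\ Z \\ W \end{pmatrix},
\qquad
\text{3D point} = (X/W,\; Y/W,\; Z/W).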
Camera calibration is the usual way to estimate those. Your supervisor said that it is not an option; however, there are several different techniques (e.g. using a chessboard, or via autocalibration) with various requirements and possibilities. Hence, investigating exactly why calibration is off the table may help you find a method appropriate to your application.
If you really have no way of estimating the intrinsics, a possible solution could be Bundle Adjustment, using more than just 2 images. However, without the intrinsics, the 3D reconstruction will likely not be very useful. Which leads us to my second point.
There are several types of 3D reconstruction, the main ones being projective, metric and Euclidean. (For more details on this, see §10.2, p. 264 in "Multiple View Geometry in Computer Vision" by Hartley & Zisserman, 2nd edition.)
The Euclidean reconstruction is what most people mean by "3D reconstruction", though not necessarily what they need: a model of the scene which relates to the true model only by a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system). Hence, angles which are orthogonal in the scene are orthogonal in such a model, and a distance of 1 meter in the scene corresponds to 1 meter in the model. In order to obtain such a Euclidean 3D reconstruction, you need to know the intrinsics of at least some cameras AND the true distance between two given points in the scene.
The metric or similarity reconstruction is most of the time good enough and refers to a 3D model of the scene which relates to the true model by a similarity transform, in other words by a 3D rotation and a 3D translation (i.e. a change of the 3D coordinate system) plus an overall scaling. In order to obtain such a metric reconstruction, you need to know the intrinsics of at least some cameras.
The projective reconstruction is what you will obtain if you have no knowledge about the scene or the cameras' intrinsics. Such a 3D model is not up to scale with respect to the observed scene, and angles which are orthogonal in the scene will probably not be orthogonal in the model.
Hence, knowing the intrinsics parameters of (some of) the cameras is crucial if you want an accurate reconstruction.

3D Game Geometry

I have a simple game that uses a 3D grid representation, something like:
Blocks grid[10][10][10];
The person in the game is represented by a point and a sight vector:
double x,y,z, dx,dy,dz;
I draw the grid with 3 nested for loops:
for(...) for(...) for(...)
draw(grid[i][j][k]);
The obvious problem with this is that when the size of the grid grows into the hundreds, the FPS drops dramatically. With some intuition, I realized that:
Blocks that are hidden by other blocks in the grid don't need to be rendered
Blocks that are not within the person's field of vision also don't need to be rendered (i.e. blocks behind the person)
My question is, given a grid[][][], a person's x,y,z, and a sight vector dx,dy,dz, how could I figure out which blocks need to be rendered and which don't?
I looked into JMonkeyEngine, a 3D game engine, a while back and looked at some of the techniques it employs. From what I remember, it uses something called culling. It builds a tree structure of everything that exists in the 'world'. The idea then is that you have a subset of this tree that represents the visible objects at any given time. In other words, these are the things that need to be rendered. Say, for example, that I have a room with objects in it. The room is on the tree and the objects in the room are children of the room's node. If I am outside of the room, then I prune (remove) this branch of the tree, which means I don't render it. The reason this works so well is that I don't have to evaluate EVERY object in the world to see if it should be rendered; instead I quickly prune whole portions of the world that I know shouldn't be rendered.
Even better, then when I step inside the room, I trim the entire rest of the world from the tree and then only render the room and all its descendants.
I think a lot of the design decisions that the JMonkeyEngine team made were based on things in David Eberly's book, 3D Game Engine Design. I don't know the technical details of how to implement an approach like this, but I bet this book would be a great starting point for you.
Here is an interesting article on some different culling algorithms:
View Frustum Culling
Back-face Culling
Cell-based occlusion culling
PVS-based arbitrary geometry occlusion culling
Others
First you need a spatial partitioning structure; if you are using uniform block sizes, probably the most effective structure will be an octree. Then you will need to write an algorithm that can calculate whether a box is on a particular side of (or intersecting) a plane. Once you have that you can work out which leaf nodes of the octree are inside the six planes of your view frustum - that's view culling. Also using the octree you can determine which blocks occlude others (sometimes called frustum masking), but get the first part working first.
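A minimal sketch of that box-vs-plane step (a standard AABB/plane classification; the plane is assumed to be given as a unit normal (nx, ny, nz) and offset d, and a box is inside the frustum if it is not entirely behind any of the six inward-facing planes):

public class BoxPlaneTest {
    // Returns +1 if the box is entirely in front of the plane,
    // -1 if entirely behind, 0 if it intersects the plane.
    static int classify(float nx, float ny, float nz, float d,
                        float minX, float minY, float minZ,
                        float maxX, float maxY, float maxZ) {
        // Center and half-extents of the box.
        float cx = (minX + maxX) * 0.5f, cy = (minY + maxY) * 0.5f, cz = (minZ + maxZ) * 0.5f;
        float hx = (maxX - minX) * 0.5f, hy = (maxY - minY) * 0.5f, hz = (maxZ - minZ) * 0.5f;
        // Signed distance of the box center from the plane.
        float dist = nx * cx + ny * cy + nz * cz + d;
        // Radius of the box projected onto the plane normal.
        float radius = Math.abs(nx) * hx + Math.abs(ny) * hy + Math.abs(nz) * hz;
        if (dist > radius) return 1;
        if (dist < -radius) return -1;
        return 0;
    }
}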
It sounds like you're going for a minecraft-y type thing.
Take a look at this android minecraft level renderer.
The points to note are:
You only have to draw the faces of blocks that are shared with transparent blocks, e.g. don't bother drawing the face between two opaque blocks - the player will never see it (see the sketch after this list).
You'll probably want to batch up your visible block geometry into chunklets (and stick each one into a VBO) and determine visibility on a per-chunklet basis. Finding exactly which blocks can be seen will probably take longer than just flinging the VBO at the GPU and accepting the overdraw.
A flood fill works pretty well to determine which chunklets are visible - limit the fill using the view frustum, the view direction (if you're facing in the +ve x direction, don't flood in the -ve x direction), and simple analyses of the chunklet data (e.g. if an entire face of a chunklet is opaque, don't flood through that face).
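A rough sketch of the face-culling rule from the first point above, assuming a boolean opacity grid like the one in the question and hypothetical isOpaque/emitFace helpers standing in for the game's own block test and mesher:

public class FaceCulling {
    static final int SIZE = 10;
    static boolean[][][] grid = new boolean[SIZE][SIZE][SIZE]; // true = opaque block

    static boolean isOpaque(int x, int y, int z) {
        // Outside the grid counts as transparent, so boundary faces still get drawn.
        if (x < 0 || y < 0 || z < 0 || x >= SIZE || y >= SIZE || z >= SIZE) return false;
        return grid[x][y][z];
    }

    static void emitFace(int x, int y, int z, int dx, int dy, int dz) {
        // In a real renderer this would append the face's vertices to a chunklet's VBO.
        System.out.printf("face of block (%d,%d,%d) towards (%d,%d,%d)%n", x, y, z, dx, dy, dz);
    }

    static void buildVisibleFaces() {
        int[][] dirs = { {1,0,0}, {-1,0,0}, {0,1,0}, {0,-1,0}, {0,0,1}, {0,0,-1} };
        for (int x = 0; x < SIZE; x++)
            for (int y = 0; y < SIZE; y++)
                for (int z = 0; z < SIZE; z++) {
                    if (!isOpaque(x, y, z)) continue; // nothing to draw for transparent blocks
                    for (int[] d : dirs) {
                        // Only faces bordering a transparent neighbour can ever be seen.
                        if (!isOpaque(x + d[0], y + d[1], z + d[2])) {
                            emitFace(x, y, z, d[0], d[1], d[2]);
                        }
                    }
                }
    }
}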

Algorithm for drawing a graph structure?

I have a directed graph G=(V,E) that I would like to redraw because it is currently very messy. It is a flow chart being visualized, and since |V|>1000 and each v in V has more than one outgoing edge, it is very hard to trace by eye. For instance, a node in the lower left corner may be connected by an edge to a node in the upper right corner; it would be better if those two nodes were placed next to each other. There are too many edges and it is a pain to trace each of them.
I have access to and can change the (x,y) coordinates of all the vertices. I would like to redraw G while maintaining its current structure, in a way that is more human-friendly. I thought that minimizing the number of intersecting edges might be a place to start.
Is there an algorithm that can help me redraw this graph?
My question is: how do I assign (x,y) coordinates to each v in V so that the drawing is better organized and easier to trace and read? How do I express these requirements formally? Should I go with a heuristic if the exact problem is NP-hard? Here is an example of a somewhat organized graph, and this is something messy (although much smaller than what I'm dealing with).
Any help will be greatly appreciated. Thanks.
EDIT: I'm still looking for a to-the-point answer. I've researched planar straight-line and orthogonal drawing methods, but what I've found is lengthy research papers. What I'm seeking is an implementation, pseudo-code, or at least something to get me started.
EDIT 2: I'm not trying to display the graph. The input to the algorithm shall be the graph G (composed of V and E) and the output shall be {(xi, yi) for each vi in V}
You want to look at graphviz.org; this is a difficult problem on which there has been a lot of research, and reinventing the wheel is not the right way to go.
You'll probably have to get your Java code to write out a data file which a tool like 'dot' can read and use for the graph layout.
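A minimal sketch of such an export, assuming your edges are stored as pairs of vertex names; 'dot' can then compute the layout, and its -Tplain output format even gives you the node coordinates back if you want to apply them in your own program:

import java.io.IOException;
import java.io.PrintWriter;
import java.util.List;

public class DotExport {
    // Writes the directed graph as a Graphviz .dot file; each edge is a {from, to} pair.
    static void writeDot(List<String[]> edges, String path) throws IOException {
        try (PrintWriter out = new PrintWriter(path)) {
            out.println("digraph G {");
            for (String[] e : edges) {
                out.printf("  \"%s\" -> \"%s\";%n", e[0], e[1]);
            }
            out.println("}");
        }
    }
    // Then lay it out with, for example:
    //   dot -Tsvg graph.dot -o graph.svg   (rendered drawing)
    //   dot -Tplain graph.dot              (text output including node coordinates)
}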
That messy one seems to be drawn using splines; try a planar straight-line algorithm instead. Indeed this is a very difficult problem and I always use GraphViz as my backend graph-drawing tool; you can generate the kind of graph you want with the -Gsplines=line option.

How does Affine Transform really work in Java?

I have been using AffineTransform to rotate a String in my Java project, and I am not an experienced programmer yet, so it has taken me a long time to do a seemingly small task: rotating a string.
Now I have finally gotten it to work more or less as I had hoped, except it is not done as precisely as I want... yet.
Since it took a lot of trial and error and reading the description of AffineTransform, I am still not quite sure what it really does. What I think I know at the moment is that I take a string and define the center of the string (or the point I want to rotate around), but where do matrices come into this? (Apparently I do not know that, hehe.)
Could anyone try to explain to me how AffineTransform works, in other words than the Javadoc? Maybe it can help me tweak my implementation, and also, I would just really like to know :)
Thanks in advance.
To understand what an affine transform is and how it works, see the Wikipedia article.
In general, it is a linear transformation (like scaling or reflecting), which can be implemented as a multiplication by a specific matrix, followed by a translation (moving), which is done by adding a vector. So to calculate the new location of each pixel [x,y], you multiply it by a specific matrix (the linear transform) and then add a specific vector (the translation).
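In symbols, and (to the best of my knowledge) this is also how Java's AffineTransform stores it, namely as a single homogeneous matrix with entries m00...m12:

\begin{pmatrix} x' \\ y' \end{pmatrix}
= \begin{pmatrix} m_{00} & m_{01} \\ m_{10} & m_{11} \end{pmatrix}
  \begin{pmatrix} x \\ y \end{pmatrix}
+ \begin{pmatrix} m_{02} \\ m_{12} \end{pmatrix}
\qquad\Longleftrightarrow\qquad
\begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix}
= \begin{pmatrix} m_{00} & m_{01} & m_{02} \\ m_{10} & m_{11} & m_{12} \\ 0 & 0 & 1 \end{pmatrix}
  \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}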
In addition to the other answers, a higher level view:
Points on the screen have an x and a y coordinate, i.e. they can be written as a vector (x,y). More complex geometric objects can be thought of as being described by a collection of points.
Vectors (points) can be multiplied by a matrix, and the result is another vector (point).
There are special (i.e. cleverly constructed) matrices that, when multiplied with a vector, produce a result equivalent to rotating, scaling, skewing or, with a bit of trickery, translating the input point.
That's all there is to it, basically. There are a few more fancy features of this approach:
If you multiply 2 matrices you get a matrix again (at least in this case; stop nit-picking ;-) ).
If you multiply 2 matrices that are equivalent to 2 geometric transformations, the resulting matrix is equivalent to doing the 2 geometric transformations one after the other (the order matters btw).
This means you can encode an arbitrary chain of these geometric transformations in a single matrix. And you can create this matrix by multiplying the individual matrices.
Btw this also works in 3D.
For more details see the other answers.
Apart from the answers already given by others, I want to show a practical tip, namely a pattern I usually apply when rotating strings or other objects:
move the point of rotation (x,y) to the origin of space by applying translate(-x,-y).
do the rotation rotate(angle) (possibly scaling will also be done here)
move everything back to the original point by translate(x,y).
Remember that you have to apply these steps in reverse order (see trashgod's answer).
For strings, with the first translation I normally move the center of the bounding box to the origin, and with the last translation I move the string to the actual point on screen where the center should appear. Then I can simply draw the string at whatever position I like.
Rectangle2D r = g.getFontMetrics().getStringBounds(text, g); // bounding box of the text
g.translate(final_x, final_y);                 // step 3: move to the final position on screen
g.rotate(-angle);                              // step 2: rotate around the origin
g.translate(-r.getCenterX(), -r.getCenterY()); // step 1: move the string's center to the origin
g.drawString(text, 0, 0);
or alternatively
Rectangle2D r = g.getFontMetrics().getStringBounds(text, g);
AffineTransform trans = AffineTransform.getTranslateInstance(final_x, final_y);
trans.concatenate(AffineTransform.getRotateInstance(-angle));
trans.concatenate(AffineTransform.getTranslateInstance(-r.getCenterX(), -r.getCenterY()));
g.setTransform(trans);
g.drawString(text, 0, 0);
As a practical matter, I found two things helpful in understanding AffineTransform:
You can transform either a graphics context, Graphics2D, or any class that implements the Shape interface, as discussed here.
Concatenated transformations have an apparent last-specified-first-applied order, also mentioned here.
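A small example illustrating both points, with made-up numbers and only standard java.awt.geom classes:

import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;

public class ShapeTransformDemo {
    public static void main(String[] args) {
        AffineTransform at = new AffineTransform();
        at.translate(100, 100);          // specified first, applied to the shape last
        at.rotate(Math.toRadians(45));   // applied second
        at.scale(2, 2);                  // specified last, applied to the shape first

        Shape square = new Rectangle2D.Double(0, 0, 10, 10);
        Shape transformed = at.createTransformedShape(square); // transform the Shape directly
        System.out.println(transformed.getBounds2D());
    }
}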
Here is a purely mathematical video guide on how to design a transformation matrix for your needs: http://www.khanacademy.org/video/linear-transformation-examples--scaling-and-reflections?topic=linear-algebra
You will probably have to watch the previous videos to understand how and why these matrices work, though. Anyhow, it's a good resource for learning linear algebra if you have enough patience.
