Thank you for the help in advance!
I am making a 2D game in Java with LWJGL, and I am separating the renderer and game logic into separate threads.
To do so, I have to copy the world data that is currently in view into a buffer, which then gets passed to the renderer thread.
The data is made up of the world, which is static and can be passed by reference, and the entities, which are too dynamic for that. The maximum number of entities would be a couple hundred to a few thousand.
Since the renderer only draws sprites, I want to fill the buffer with a data structure holding the sprites and the coordinates to draw them at, which the renderer can read from. This happens at 60 FPS.
I could use a LinkedList or an ArrayList, but the varying element count and the constant creation and deletion may cause too much overhead. I have also seen other buffer types used in other code, though I didn't understand them, so I suspect there are more options; I'm also not very experienced with the performance limitations of the basic ones.
What would be a good way to build my buffer?
While I understand the concept of LOD, I'm having a little trouble implementing it. Assume I have a number of models at different LODs that I want to store in my Mesh class. What do I need to change (I already have a Mesh supporting one model)? Do I have multiple VBOs (an array, with the index dictating the level, perhaps), buffer each model into its own VBO, and bind the correct one when rendering? Or am I completely missing the idea?
What do I need to change
It really depends on how you structure your vertices and indices. There are a thousand ways to achieve level of detail for models. You could, as s3rius described, hold each LOD model in its own vertex buffer. Alternatively, depending on how you order your triangles or vertices, you can keep one vertex buffer and skip along the indices when you draw the triangles for varying quality.
For example, when you want full level of detail, you would render every vertex of the model. When you want half as much detail, you would render every other vertex.
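Concretely, the per-LOD-VBO variant could look something like this minimal LWJGL sketch (LWJGL 3-style GL15 bindings and a fixed-function, position-only vertex layout are assumed; the lodVbos/vertexCounts names and constructor shape are illustrative, not a prescribed API):

import java.nio.FloatBuffer;
import org.lwjgl.opengl.GL11;
import org.lwjgl.opengl.GL15;

public class Mesh {
    private final int[] lodVbos;       // one buffer object per LOD level
    private final int[] vertexCounts;  // how many vertices each level has

    public Mesh(FloatBuffer[] lodVertexData, int[] vertexCounts) {
        this.vertexCounts = vertexCounts;
        this.lodVbos = new int[lodVertexData.length];
        for (int i = 0; i < lodVertexData.length; i++) {
            lodVbos[i] = GL15.glGenBuffers();
            GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, lodVbos[i]);
            GL15.glBufferData(GL15.GL_ARRAY_BUFFER, lodVertexData[i], GL15.GL_STATIC_DRAW);
        }
    }

    public void render(int lod) {
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, lodVbos[lod]);
        GL11.glVertexPointer(3, GL11.GL_FLOAT, 0, 0L);
        GL11.glEnableClientState(GL11.GL_VERTEX_ARRAY);
        GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, vertexCounts[lod]);
        GL11.glDisableClientState(GL11.GL_VERTEX_ARRAY);
    }
}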
Check out some of the links posted here.
"LOD in modern games" is also worth a read; always read up on modern practices.
The game is tile-based, but the tiles are really only for terrain and path-finding purposes. Sprite movement is free-form (i.e., the player can be halfway through a tile).
The maps in this game are very large. At normal zoom, tiles are 32x32 pixels, and map sizes can be up to 2000x2000 or larger (4 million tiles!). Currently, a map is an array of tiles, and the tile object looks like this:
import java.util.ArrayList;

public class Tile {
    public byte groundType;             // index of the terrain texture
    public byte featureType;            // full-tile map object (tree, large rock, ...)
    public ArrayList<Sprite> entities;  // items, creatures, etc. currently on this tile

    public Tile() {
        groundType = -1;
        featureType = -1;
        entities = null;                // left null until something enters the tile
    }
}
Where groundType is the texture, and featureType is a map object that takes up an entire tile (such as a tree or a large rock). These features are quite common, so I have opted to give them their own variable rather than store them in entities, which is a list of the objects on the tile (items, creatures, etc.). Entities are saved to a tile for performance reasons.
The problem I am having is that if entities is not initialized to null, Java runs out of heap space. But setting it to null and only creating the list when something moves onto the tile seems like a bad solution to me. If a creature were moving across otherwise empty tiles, lists would constantly need to be created and set back to null. Isn't this poor memory management? What would be a better solution?
Have a single structure (start with an ArrayList) containing all of your sprites.
If you're running a game loop and cycling through the sprite list, say, once every 30-50 milliseconds, and there are up to, say, 200 sprites, you shouldn't have a performance hit from this structure per se.
Later on, for other purposes such as collision detection, you may well need to revise the single-ArrayList structure. I would suggest starting with the simple, naive solution to get your game logic sorted out, then optimising as necessary.
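As a concrete starting point for the sprite buffer in the original question, something like the sketch below could work. The SpriteDraw type and the publish/latest hand-off between the game thread and the render thread are my own illustration of the "keep it simple" advice, not part of this answer:

import java.util.ArrayList;
import java.util.List;

public final class RenderBuffer {
    // What the renderer needs for one sprite: which sprite, and where to draw it.
    public static final class SpriteDraw {
        public final int spriteId;
        public final float x, y;
        public SpriteDraw(int spriteId, float x, float y) {
            this.spriteId = spriteId;
            this.x = x;
            this.y = y;
        }
    }

    private volatile List<SpriteDraw> latestFrame = new ArrayList<SpriteDraw>();

    // Game thread: build a fresh list each tick, hand it over, and never touch it again.
    public void publish(List<SpriteDraw> frame) {
        latestFrame = frame;
    }

    // Render thread: draw whatever was most recently published.
    public List<SpriteDraw> latest() {
        return latestFrame;
    }
}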
For your tiles, if space is a concern, then rather than having a special Tile object, consider packing the information for each tile into a single byte, short or int, if each tile doesn't actually need much specific information. Remember that every Java object you create has some overhead (for the sake of argument, let's say in the order of 24-32 bytes per object, depending on the VM and on 32- vs 64-bit processors). An array of 4 million bytes is "only" 4MB, 4 million ints "only" 16MB.
Another solution for your tile data, if packing a tile's specification into a single primitive isn't practical, is to declare a large ByteBuffer, with each tile's data stored at index (say) tileNo * 16 if each tile needs 16 bytes of data.
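A minimal sketch of that ByteBuffer layout, assuming a hypothetical 16-byte record per tile with the ground type stored in byte 0 (the stride and offsets are illustrative):

import java.nio.ByteBuffer;

public class TileBuffer {
    private static final int BYTES_PER_TILE = 16;
    private final ByteBuffer data;

    public TileBuffer(int tileCount) {
        data = ByteBuffer.allocateDirect(tileCount * BYTES_PER_TILE);
    }

    public void setGroundType(int tileNo, byte ground) {
        data.put(tileNo * BYTES_PER_TILE, ground);   // byte 0 of this tile's record
    }

    public byte getGroundType(int tileNo) {
        return data.get(tileNo * BYTES_PER_TILE);
    }
}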
You could consider not actually storing all of the tiles in memory. Whether this is appropriate will depend on your game. I would say that 2000x2000 is still within the realm that you could sensibly keep the whole data in memory if each individual tile does not need much data.
If you're thinking the last couple of points defeat the whole point of an object-oriented language, then yes, you're right. So you need to weigh up at what point you opt for the "extreme" solution to save heap space, and whether you can "get away with" using more memory for the sake of a better programming paradigm. Having an object per tile might use, say, on the order of a few hundred megabytes. In some environments that will be ridiculous; in others, where several gigabytes are available, it might be entirely reasonable.
I am developing a tile-based physics game like Falling Sand Game. I am currently using a static VBO for the vertices and a dynamic VBO for the colors associated with each block type. With this type of game the data in the color VBO changes very frequently (every block change). Currently I am calling glBufferSubDataARB for each block change. I have found this to work, yet it doesn't scale well with resolution (it gets much slower with each increase in resolution). I was hoping that I could double my current playable resolution (256x256).
Should I call BufferSubData very frequently or BufferData once a frame? Should I drop the VBO and go with vertex array?
What can be done about video cards that do not support VBOs?
(Note: Each block is larger than one pixel)
First of all, you should stop using both of those functions. Buffer objects have been core OpenGL functionality since OpenGL 1.5 in 2003; there is no reason to use the extension form of them. You should be using glBufferData and glBufferSubData, not the ARB versions.
Second, if you want high-performance buffer object streaming, tips can be found on the OpenGL wiki. But in general, calling glBufferSubData many times per frame on the same memory isn't helpful. It would likely be better to map the buffer and modify it directly.
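For instance, a hedged sketch of that once-per-frame mapping approach with the core (non-ARB) LWJGL bindings might look like this (LWJGL 3-style GL15 signatures and the ColorStream name are assumptions, not code from the question):

import java.nio.ByteBuffer;
import org.lwjgl.opengl.GL15;

public final class ColorStream {
    private final int colorVbo;

    public ColorStream(int sizeInBytes) {
        colorVbo = GL15.glGenBuffers();
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, colorVbo);
        // Allocate the storage once; the contents get rewritten every frame.
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, sizeInBytes, GL15.GL_STREAM_DRAW);
    }

    // Called once per frame instead of many small glBufferSubData calls.
    public void upload(byte[] cpuSideColors) {
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, colorVbo);
        ByteBuffer mapped = GL15.glMapBuffer(GL15.GL_ARRAY_BUFFER, GL15.GL_WRITE_ONLY);
        if (mapped != null) {
            mapped.put(cpuSideColors);   // write the whole CPU-side copy in one go
            GL15.glUnmapBuffer(GL15.GL_ARRAY_BUFFER);
        }
    }
}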
To your last question, I would say this: why should you care? As previously stated, buffer objects are old. It's like asking what you should do for hardware that only supports D3D 5.0.
Ignore it; nobody will care.
You should preferably keep the frequently changing color information in your own copy in RAM and hand the data to the GL in one operation, once per frame, preferably at the end of the frame, just before swapping buffers (this means you need to do it once out of line for the very first frame).
glBufferSubData can be faster than glBufferData since it does not reallocate the memory on the server, and since it possibly transfers less data. In your case, however, it is likely slower, because it needs to be synchronized with the data that is still being drawn. Also, since data could change at any random location, the gains from uploading only a subrange won't be great, and uploading the whole buffer once per frame should be no trouble bandwidth-wise.
The best strategy would be to call glDraw(Elements|Arrays|Whatever) followed by glBufferData(...NULL). This tells OpenGL that you don't care about the buffer any more, and that it can throw the contents away as soon as it's done drawing. (When you map this buffer or copy into it now, OpenGL will secretly give you a new buffer without telling you; that way, you can work on the new buffer while the old one has not finished drawing, which avoids a stall.)
Now you run your physics simulation, and modify your color data any way you want. Once you are done, either glMapBuffer, memcpy, and glUnmapBuffer, or simply use glBufferData (mapping is sometimes better, but in this case it should make little or no difference). This is the data you will draw the next frame. Finally, swap buffers.
That way, the driver has time to do the transfer while the card is still processing the last draw call. Also, if vsync is enabled and your application blocks waiting for vsync, this time is available to the driver for data transfers. You are thus practically guaranteed that whenever a draw call is made (the next frame), the data is ready.
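Put together, the frame ordering described above might look roughly like the following (LWJGL 3-style bindings are assumed; the SandColors name, the RGBA-per-vertex GL_QUADS layout, and the simulate() stub are illustrative stand-ins, not code from the question):

import java.nio.ByteBuffer;
import org.lwjgl.BufferUtils;
import org.lwjgl.opengl.GL11;
import org.lwjgl.opengl.GL15;

public final class SandColors {
    private final int colorVbo;
    private final int vertexCount;
    private final ByteBuffer cpuColors;  // CPU-side copy; the simulation writes here

    public SandColors(int vertexCount) {
        this.vertexCount = vertexCount;
        this.cpuColors = BufferUtils.createByteBuffer(vertexCount * 4);  // RGBA per vertex
        this.colorVbo = GL15.glGenBuffers();
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, colorVbo);
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, cpuColors, GL15.GL_STREAM_DRAW);
    }

    public void frame() {
        // 1. Draw with last frame's colors (the static position VBO from the
        //    question is assumed to be bound and configured elsewhere).
        GL15.glBindBuffer(GL15.GL_ARRAY_BUFFER, colorVbo);
        GL11.glColorPointer(4, GL11.GL_UNSIGNED_BYTE, 0, 0L);
        GL11.glEnableClientState(GL11.GL_COLOR_ARRAY);
        GL11.glDrawArrays(GL11.GL_QUADS, 0, vertexCount);

        // 2. Orphan the buffer (the LWJGL equivalent of passing NULL with a size):
        //    the old contents may be discarded as soon as the draw has finished.
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, cpuColors.capacity(), GL15.GL_STREAM_DRAW);

        // 3. Run the physics step; it updates cpuColors using absolute put(index, value)
        //    calls so the buffer's position stays at 0.
        simulate();

        // 4. Hand the whole color array to the GL in one operation
        //    (glMapBuffer/glUnmapBuffer would also work here).
        GL15.glBufferData(GL15.GL_ARRAY_BUFFER, cpuColors, GL15.GL_STREAM_DRAW);

        // 5. Swap buffers here (e.g. Display.update() in LWJGL 2), giving the driver
        //    time to transfer the new data before the next frame's draw call.
    }

    private void simulate() { /* falling-sand update goes here */ }
}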
As for cards that do not support VBOs: these do not really exist (well, they do, but not really). VBO is more a programming model than a hardware feature. If you use plain, normal vertex arrays, the driver still has to somehow transfer a block of data to the card eventually. The only difference is that you own a vertex array, whereas the driver owns a VBO.
This means that in the case of a VBO, the driver need not ask you when to do what. In the case of vertex arrays, it can only rely on the data being valid at the exact time you call glDrawElements. With a VBO, it always knows the data is valid, because you can only modify it through an interface controlled by the driver. As a result it can manage memory and transfers much more optimally, and can better pipeline drawing.
There do of course exist implementations that don't support VBOs, but those would need to be truly old (like 10+ years old) drivers. It's not something to worry about, realistically.
I am rendering a rather heavy object consisting of about 500k triangles. I use an OpenGL display list, and in the render method I just call glCallList. I thought that once the graphics primitives are compiled into a display list, the CPU's work is done and it just tells the GPU to draw. But now one CPU core is loaded up to 100%.
Could you give me some clues as to why this happens?
UPDATE: I have checked how long it takes to run glCallList; it's fast, about 30 milliseconds.
Most likely you are hitting the limits on list length, which are around 64k vertices per list. Try to split your 500k triangles (1500k vertices?) into smaller chunks and see what you get.
By the way, which graphics chip are you using? If the vertices are processed on the CPU, that might also be a problem.
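A rough sketch of that chunking, assuming flat XYZ vertex data and a chunk size that is a multiple of three so no triangle is split across lists (the class and parameter names are illustrative):

import org.lwjgl.opengl.GL11;

public final class ChunkedModel {
    private final int baseList;
    private final int listCount;

    public ChunkedModel(float[] xyz, int chunkVertices) {
        int totalVertices = xyz.length / 3;
        listCount = (totalVertices + chunkVertices - 1) / chunkVertices;
        baseList = GL11.glGenLists(listCount);
        for (int i = 0; i < listCount; i++) {
            GL11.glNewList(baseList + i, GL11.GL_COMPILE);
            GL11.glBegin(GL11.GL_TRIANGLES);
            int end = Math.min((i + 1) * chunkVertices, totalVertices);
            for (int v = i * chunkVertices; v < end; v++) {
                GL11.glVertex3f(xyz[v * 3], xyz[v * 3 + 1], xyz[v * 3 + 2]);
            }
            GL11.glEnd();
            GL11.glEndList();
        }
    }

    public void render() {
        for (int i = 0; i < listCount; i++) {
            GL11.glCallList(baseList + i);
        }
    }
}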
It's a bit of a myth that display lists magically offload everything to the GPU. If that were really the case, texture objects and vertex buffers wouldn't have needed to be added to OpenGL. All a display list really is, is a convenient way of replaying a sequence of OpenGL calls and hopefully saving some of the function call/data conversion overhead (see here). None of the PC hardware implementations I've used seem to have done anything more than that, as far as I can tell; maybe it was different back in the days of SGI workstations, but these days buffer objects are the way to go. (Modern OpenGL books like OpenGL Distilled give glBegin/glEnd etc. only the briefest of mentions before getting stuck into the new stuff.)
The one place I have seen display lists make a huge difference is the GLX/X11 case where your app is running remotely to your display (X11 "server"); in that case using a display list really does push all the display-list state to the display side just once, whereas a non-display-list immediate-mode app needs to send a bunch of stuff again each frame using lots more bandwidth.
However, display lists aside, you should be aware of some issues around vsync and busy waiting (or the illusion of it)... see this question/answer.
In an OpenGL ES 1.x Android application, I generate a circle (from triangles) and then translate it about one hundred times to form a level. Everything works except when a certain event causes about 15 objects to be added to the ArrayList that stores the circles' coordinates all at once. When this event happens two or more times in quick succession, all the circles in the list disappear for about 1/5th of a second. Apart from this, the circles animate smoothly.
The program runs well as a Java SE app using the same synchronization techniques, and I have tried half a dozen or so other synchronization approaches to no avail, so I suspect the problem lies with the OpenGL implementation. Any suggestions?
Do you really have to store the vertex data in client memory? If you don't modify it, I suggest you use a VBO instead. Just upload it into graphics memory once, then draw from there. It will be much faster (not requiring you to send all the vertex data for each draw), and I'm pretty sure you won't run into the problem you described.
Transformations can be done as much as you like, then you only have to give the draw command for each instance of your circle.
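A hedged sketch of that suggestion for GL ES 1.1 on Android, using the android.opengl.GLES11 bindings (the CircleVbo class, the x/y-only vertex layout, and the triangle-fan assumption are illustrative, not taken from the question):

import android.opengl.GLES11;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

public final class CircleVbo {
    private final int vboId;
    private final int vertexCount;

    public CircleVbo(float[] circleXY) {  // x,y pairs describing one circle
        vertexCount = circleXY.length / 2;
        FloatBuffer data = ByteBuffer.allocateDirect(circleXY.length * 4)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        data.put(circleXY);
        data.position(0);

        int[] ids = new int[1];
        GLES11.glGenBuffers(1, ids, 0);
        vboId = ids[0];
        GLES11.glBindBuffer(GLES11.GL_ARRAY_BUFFER, vboId);
        // Uploaded into graphics memory once; no per-draw vertex transfer needed.
        GLES11.glBufferData(GLES11.GL_ARRAY_BUFFER, circleXY.length * 4, data,
                GLES11.GL_STATIC_DRAW);
    }

    // Draw one instance; call after setting up the model-view transform for each circle.
    public void draw() {
        GLES11.glBindBuffer(GLES11.GL_ARRAY_BUFFER, vboId);
        GLES11.glEnableClientState(GLES11.GL_VERTEX_ARRAY);
        GLES11.glVertexPointer(2, GLES11.GL_FLOAT, 0, 0);  // offset into the bound VBO
        GLES11.glDrawArrays(GLES11.GL_TRIANGLE_FAN, 0, vertexCount);
        GLES11.glDisableClientState(GLES11.GL_VERTEX_ARRAY);
    }
}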
So the list is being modified under your nose? It sounds like you need to do any modification to this list on the OpenGL rendering thread. Try GLSurfaceView.queueEvent(Runnable), where the Runnable implements your own code. Possibly.
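For example, a small sketch of handing the mutation over to the rendering thread (the circles list, its element type, and the helper are placeholders):

import android.opengl.GLSurfaceView;
import java.util.List;

public final class GlThreadHelper {
    // Queues the list mutation onto the rendering thread via GLSurfaceView.
    static void addCircleOnGlThread(GLSurfaceView view, final List<float[]> circles,
                                    final float[] circleCoords) {
        view.queueEvent(new Runnable() {
            @Override
            public void run() {
                // Runs on the GL thread, so the renderer never sees the list mid-update.
                circles.add(circleCoords);
            }
        });
    }
}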