Efficiently gathering data from a game board - java

Say I have a Connect 4 board, a 7x6 grid, and I want to store which piece occupies which spot on that board. Using a 2D array would be nice, because I can easily visualize it as a board, but I worry about the efficiency of looping through an array to gather data so often.
What would be the most efficient way of 1) storing that game board and 2) gathering the data from said game board?
Thanks.

The trite answer is that at 7x6 it's not going to be a big deal: unless you're on a microcontroller, this won't make a practical difference. However, if you're thinking about this from an algorithmic perspective, you can reason about the operations; "storing" and "gathering" are not quite specific enough. You'll need to think through exactly which operations you're trying to support, and how they would scale if you had thousands of columns and millions of pieces. The operations might be:
Read whether a piece exists, and what color it is, given its x and y coordinates.
When you add a piece to a column, it will "fall". Given a column, how far does it fall, or what would the new y value be for a piece added to column x? At most this will be height times whatever the cost of reading is, since you could just scan the column.
Add a piece at the given x and y coordinate.
Scan through all pieces, which is at most width times height times the cost of reading.
Of course, all of this has to fit on your computer as well, so you care about storage space as well as time.
Let's list out some options:
Array, such as game[x][y] or game[n] where n is something like x * height + y: Constant time (O(1)) to read/write given x and y, but O(width * height) to scan and count, and O(height) time to figure out how far a piece drops. Constant space of O(width * height). Perfectly reasonable for 7x6, might be a bad idea if you had a huge grid (e.g. 7 million x 6 million).
Array such as game[n] where each piece is added to the board and each piece contains its x and y coordinate: O(pieces) time to find/add/delete a piece given x and y, O(pieces) scan time, O(pieces) space. Probably good for an extremely sparse grid (e.g. 7 million x 6 million), but needlessly slow for 7x6.
HashMap as Grant suggests, where the key is a Point data object you write that contains x and y. O(1) to read/write, O(height) to see how far a piece drops, O(pieces) time to scan, O(pieces) space. Slightly better than an array, because you don't need an empty array slot per blank space on the board. There's a little extra memory per piece entry for the HashMap key object, but you could effectively make an enormous board with very little extra cost, which makes this slightly better than option 1 if you don't mind writing the extra Point class.
An array of resizable column arrays, e.g. an array of Lists. This is similar to an array of fixed arrays, but because a List stores its size and can allocate only as much memory as needed, you can store the state very efficiently, including how far a piece needs to fall. Constant read/write/add, constant "fall" time, O(pieces) + O(width) scan time, O(pieces) + O(width) space, because you don't need to scan/store the cells you know are empty.
Given those options, I think that an array of Lists (#4) is the most scalable solution, but unless I knew it needed to scale I would probably choose the array of arrays (#1) for ease of writing and understanding.
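As a rough sketch of option #4 in Java (the Piece enum and class name are illustrative, not from the question):

import java.util.ArrayList;
import java.util.List;

// Sketch of option #4: one resizable list per column. Names are illustrative.
enum Piece { RED, YELLOW }

class Board {
    private final List<List<Piece>> columns = new ArrayList<>();

    Board(int width) {
        for (int x = 0; x < width; x++) columns.add(new ArrayList<>());
    }

    // O(1) "fall": a new piece lands at the column's current height.
    int drop(int x, Piece piece) {
        List<Piece> column = columns.get(x);
        column.add(piece);
        return column.size() - 1; // the y coordinate the piece landed on
    }

    // O(1) read: anything at or above the column's height is empty.
    Piece get(int x, int y) {
        List<Piece> column = columns.get(x);
        return y < column.size() ? column.get(y) : null;
    }
}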

I may be wrong, but I think you're looking for a HashMap (a form of hash table) if you want efficiency.
Here's the documentation:
https://docs.oracle.com/javase/8/docs/api/java/util/Hashtable.html
HashMap provides expected constant-time O(1) performance for most operations, such as put(), get() and remove().
Since you're using a 7x6 board, you can simply name your keys A1 through G6, for example, with the piece in each cell as the value.
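A minimal sketch of that idea, assuming string keys in an "A1" ... "G6" scheme with the piece color as the value:

import java.util.HashMap;
import java.util.Map;

// Sketch: cells keyed by strings like "A1" .. "G6"; the naming scheme is illustrative.
Map<String, String> board = new HashMap<>();
board.put("A1", "red");                     // place a piece: expected O(1)
String piece = board.get("A1");             // read a cell: expected O(1)
boolean occupied = board.containsKey("B3"); // check for a piece: expected O(1)
board.remove("A1");                         // clear a cell: expected O(1)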

Related

optimizing a grid-based particle system

I've implemented a game somewhat similar to this one in Java and currently find that I'm hitting a ceiling number of particles of ~80k. My game board is a 2D array of references to 'Particle' objects, each of which must be updated every frame. Different kinds of 'Particle' have different behaviors and may move or change their state in response to environmental conditions such as wind or adjacent particles.
Some possible 'rules' that might be in effect:
If a Particle of type lava is adjacent to a Particle of type water, both disappear, and the lava is replaced by obsidian
If a gas Particle is adjacent to a Lava, Fire, Ember, etc. Particle, it will ignite, and produce fire and smoke
If a sufficient number of dust particles are stacked on top of one another, those at lower levels, as if under pressure, can become sedimentary rock
I've searched around and haven't been able to find any algorithms or data structures that seem particularly well-suited to speeding up the task. It seems that some kind of memoization might be useful? Would a quad tree be of any use here? I've seen them used in the somewhat similar Conway's Game of Life with the Hashlife algorithm. Or, is it the case that I'm not going to be able to do too much to increase the speed?
Hashlife will work in principle, but there are two reasons why you might not get as much out of it as Conway's Life does.
Firstly, it relies on recurring patterns. The more cell states you have and the less structured the plane, the fewer cache hits you'll encounter and the more you'll be working with brute force.
Secondly, as another poster noted, rules that involve non-local effects mean your primitives (4x4 in Conway's Life) will need to be bigger, so you will have to abandon divide and conquer at, say, 8x8 or 16x16, or whatever size guarantees you can correctly calculate the middle portion in n/2 time.
That's made worse by the diversity of states. In Conway's Life it's common to pre-calculate all 4x4 grids, or at least have nearly all the relevant ones in cache.
With 2 states there are only 2^16 = 65536 possible 4x4 grids (peanuts on modern platforms), but with just 3 states there are 3^16 = 43,046,721.
If you have to have 8x8 primitives it gets very big very quickly, beyond any realistic storage. So the larger the primitive and the more states you have, the more quickly that becomes unrealistic.
One way to address that primitive size is to have the rock rule propagate pressure: a Rock+n (n representing pressure) becomes Rock+(n+1) in the next generation if it has a Rock+m with m >= n above it, up to some threshold k where it turns into sedimentary Rock.
That means cells are still only dependent on their immediate neighbours, but it again multiplies up the number of states.
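A sketch of what that rule might look like as a per-cell update; the state encoding and the threshold are made up for illustration:

// Illustrative encoding: ROCK_BASE + n means "Rock under pressure n".
static final int ROCK_BASE = 1, SEDIMENTARY = 100, K = 8;

static boolean isRock(int cell) { return cell >= ROCK_BASE && cell < SEDIMENTARY; }

static int nextState(int cell, int cellAbove) {
    if (!isRock(cell)) return cell;
    int n = cell - ROCK_BASE;
    // Pressure rises only when the rock above carries at least as much pressure.
    if (isRock(cellAbove) && (cellAbove - ROCK_BASE) >= n) {
        return (n + 1 >= K) ? SEDIMENTARY : cell + 1;
    }
    return cell;
}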
If you have cell types like the 'Bird' in the example given, and velocities that you don't keep to a minimum (say -1, 0, 1 in either direction), you'll totally collapse memoization. Even then, the chaotic nature of such rules may make cache hits in those areas vanishingly small.
If your rules don't lead to steady states (or repeating cycles), as Conway's Life often does, the return on memoization will be limited unless your plane is mostly empty.
I don't understand your problem clearly, but I think CUDA or OpenGL (GPU programming in general) can easily handle this. Ref link: https://dan-ball.jp/en/javagame/dust/
I'd use a fixed NxN grid for this mainly because there are too many points moving around each frame to benefit from the recursive subdividing nature of the quad-tree. This is a case where a straightforward data structure with the data representations and memory layouts tuned appropriately can make all the difference in the world.
The main thing I'd do in Java here is avoid modeling each particle as an object. Particles should just be plain old data: floats or ints. You want contiguity guarantees, for spatial locality during sequential processing, and you don't want to pay for padding and the per-object header overhead of class instances. Split cold fields away from hot fields.
For example, you don't necessarily need to know a particle's color to move it around and apply physics. As a result, you don't want an array-of-structures (AoS) representation here, which would load a particle's color into cache lines during the sequential physics pass only to evict it unused. Cram as much relevant memory for a particular pass into each cache line as you can by separating it from the irrelevant memory.
Each cell in the grid should just store the index of one particle, with each particle storing the index of the next particle in the same cell (a singly-linked list, but an intrusive one that requires allocating no nodes and just uses indices into arrays). A -1 can be used to indicate the end of the list as well as an empty cell.
To find collisions between particles of interest, look in the same cell as the particle you're testing, and you can do this in parallel where each thread handles one or more cells worth of particles.
The NxN grid should be very fine given the boatload of moving particles you can have per frame. Play with how many cells you create to find something optimal for your input sizes. You might even have multiple grids. If certain particles don't interact with each other, don't put them in the same grid. Don't worry about the memory usage of the grid here. If each grid cell just stores a 32-bit index to the first particle in the cell, then a 200x200 grid only takes 160 kilobytes with a 32-bit next index overhead per particle.
I made something similar to this some years back in C using the technique above (though not with as many interesting particle interactions as the demo game); it could handle about 10 million particles before it started to drop below 30 FPS, on older hardware with only 2 cores. It did use C as well as SIMD and multithreading, but I think you can get a very speedy solution in Java handling a boatload of particles at once if you do the above.
Data structure:
As particles move from one cell to the next, all you do is manipulate a couple of integers to move them from one cell to the other. Cells don't "own memory" or allocate any. They're just 32-bit indices.
To figure out which cell a particle occupies, just do:
cell_x = (int)(particle_x[particle_index] / cell_size)
cell_y = (int)(particle_y[particle_index] / cell_size)
cell_index = cell_y * num_cols + cell_x
...a much cheaper, constant-time operation than traversing a tree structure and having to rebalance it as particles move around.
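A compact Java sketch of the structure described above, using parallel primitive arrays and index-based intrusive lists; all the names are illustrative:

// SoA particle storage plus a uniform grid of intrusive singly-linked lists.
// -1 marks "no particle": an empty cell, or the end of a cell's list.
float[] particleX, particleY; // hot data for the physics pass
int[]   nextInCell;           // index of the next particle in the same cell, or -1
int[]   cellHead;             // index of the first particle in each cell, or -1
int     numCols;
float   cellSize;

int cellOf(int p) {
    int cx = (int) (particleX[p] / cellSize);
    int cy = (int) (particleY[p] / cellSize);
    return cy * numCols + cx;
}

void insert(int p) {          // O(1): push p onto the front of its cell's list
    int c = cellOf(p);
    nextInCell[p] = cellHead[c];
    cellHead[c] = p;
}

void remove(int p, int c) {   // unlink p from cell c's list
    int prev = -1, cur = cellHead[c];
    while (cur != p) { prev = cur; cur = nextInCell[cur]; }
    if (prev == -1) cellHead[c] = nextInCell[p];
    else            nextInCell[prev] = nextInCell[p];
}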

Is there a way to use limitless lists in java?

I'm trying to make a randomly generated 2D game, which I plan to do with a list of terrain to the right of the spawn point and a list of terrain to the left of it. However, I need these lists to not have a length limit, as I want the world to be infinite. If I can't find a way, I will make the world "round", but infinite would be preferable. Is this possible?
An ArrayList is infinite... until memory runs out. But I guess that was not the question.
Update: right, this is limited after all, though I'd argue nobody will notice the world restarting after two billion units.
Thought about that again. What you need is a random function that produces the same value again and again when you give it the seed and the current position. Then you do not store the world; you recalculate it on the fly.
So you only need an infinite counter for the position in your world. The only challenge will be storing event results, such as eaten mushrooms and destroyed bridges.
Storing all the data in a list will have a lot of limitations.
If you use an ArrayList, you can't have infinite elements.
If you use a LinkedList, you lose random access, so speed is a lot slower.
And for any list, RAM is an issue.
You'd be better off splitting generated areas into chunks, then storing those on the hard drive.
You'd still want a list of loaded areas, but it will be limited in scope: if you're 2 game-miles to the east of some town, there's no point keeping the town's information referenced (I hope).
One very popular game that does this is Minecraft. Attempting to load the entire Minecraft world into your RAM won't happen, yet it still has the potential for infinite worlds.
If the world is going to be huge, I wouldn't store it in an ArrayList or a LinkedList. Instead you can make the whole world depend on a randomly selected long seed. The terrain at position i can then be found using new Random(seed ^ i).nextInt() (or something like it). That way the world will be effectively infinite and you won't have to keep the terrain in memory. Whenever you return to a previously visited part of the world, it will be the same as it was before. The number of different worlds is 2^64, so you'd have to live a very long time before you saw the same world twice.
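A minimal sketch of that approach (the terrain codes are illustrative); as the earlier answer notes, only deviations from the generated world, like eaten mushrooms and destroyed bridges, would need storing explicitly:

import java.util.Random;

// Deterministic terrain: the same seed and position always yield the same tile,
// so nothing needs to be stored for untouched parts of the world.
static int terrainAt(long seed, long position) {
    Random r = new Random(seed ^ position);
    return r.nextInt(4); // e.g. 0 = grass, 1 = water, 2 = rock, 3 = sand
}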
An ArrayList can contain at most 2^31 - 1 values, because the length of the backing array is an int, a signed 4-byte integer.
A LinkedList, however, is limitless; the only limit is the memory of the JVM.

Optimal data structures for a tile-based RPG in Java

The game is tile-based, but the tiles are really only for terrain and path-finding purposes. Sprite movement is free-form (ie, the player can be half way through a tile).
The maps in this game are very large. At normal zoom, tiles are 32x32 pixels, and map sizes can be up to 2000x2000 or larger (4 million tiles!). Currently, a map is an array of tiles, and the tile object looks like this:
public class Tile {
    public byte groundType;
    public byte featureType;
    public ArrayList<Sprite> entities;

    public Tile() {
        groundType = -1;
        featureType = -1;
        entities = null;
    }
}
Where groundType is the texture, and featureType is a map object that takes up an entire tile (such as a tree, or large rock). These types of features are quite common so I have opted to make them their own variable rather than store them in entities, which is a list of objects on the tile (items, creatures, etc). Entities are saved to a tile for performance reasons.
The problem I am having is that if entities is not initialized to null, Java runs out of heap space. But setting it to null and only initializing when something moves into the tile seems to me a bad solution. If a creature were moving across otherwise empty tiles, lists would constantly need to be initialized and set back to null. Is this not poor memory management? What would be a better solution?
Have a single structure (start with an ArrayList) containing all of your sprites.
If you're running a game loop and cycling through the sprite list, say, once every 30-50 milliseconds, and there are up to, say, 200 sprites, you shouldn't have a performance hit from this structure per se.
Later on, for other purposes such as collision detection, you may well need to revise the structure beyond just a single ArrayList. I would suggest starting with the simple, noddyish solution to get your game logic sorted out, then optimising as necessary.
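As a minimal sketch of that starting point (it assumes the question's Sprite type has some per-frame update method):

import java.util.ArrayList;
import java.util.List;

List<Sprite> sprites = new ArrayList<>(); // the single master list of sprites

void updateAll() {                        // called once per game-loop tick
    for (Sprite s : sprites) {
        s.update();                       // move, animate, etc. (illustrative)
    }
}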
For your tiles, if space is a concern, then rather than having a special Tile object, consider packing the information for each tile into a single byte, short or int, if not much specific information per tile is actually required. Remember that every Java object you create has some overhead (for the sake of argument, let's say in the order of 24-32 bytes per object, depending on the VM and 32- vs 64-bit processor). An array of 4 million bytes is "only" 4MB; 4 million ints, "only" 16MB.
Another solution for your tile data, if packing a tile's specification into a single primitive isn't practical, is to declare a large ByteBuffer, with each tile's data stored at index (say) tileNo * 16 if each tile needs 16 bytes of data.
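A sketch of both variants, assuming a tile only needs its ground and feature types; the field layout and per-tile size are illustrative:

import java.nio.ByteBuffer;

// Variant 1: pack a whole tile into one short (two unsigned bytes).
static short packTile(int groundType, int featureType) {
    return (short) ((groundType & 0xFF) << 8 | (featureType & 0xFF));
}
static int groundOf(short tile)  { return (tile >> 8) & 0xFF; }
static int featureOf(short tile) { return tile & 0xFF; }

// Variant 2: one flat ByteBuffer, a fixed number of bytes per tile.
static final int BYTES_PER_TILE = 16;
ByteBuffer tiles = ByteBuffer.allocateDirect(2000 * 2000 * BYTES_PER_TILE);

int groundType(int tileNo) {
    return tiles.get(tileNo * BYTES_PER_TILE); // byte 0 of the tile's record
}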
You could consider not actually storing all of the tiles in memory. Whether this is appropriate will depend on your game. I would say that 2000x2000 is still within the realm that you could sensibly keep the whole data in memory if each individual tile does not need much data.
If you're thinking the last couple of points defeat the whole point of an object-oriented language, then yes you're right. So you need to weigh up at what point you opt for the "extreme" solution to save heap space, or whether you can "get away with" using more memory for the sake of a better programming paradigm. Having an object per tile might use (say) in the order of a few hundred megabytes. In some environments that will be ridiculous. In others where several gigabytes are available, it might be entirely reasonable.

Algorithm to find peaks in 2D array

Let's say I have a 2D accumulator array in Java, int[][] array. The array could look like this:
(The x and z axes represent indexes in the array and the y axis represents values; the two images showed sample surface plots of an int[56][56] with values from 0 to ~4500.)
What I need to do is find the peaks in the array: there are 2 peaks in the first image and 8 in the second. These peaks are always 'obvious' (there's always a gap between peaks), but they don't have to look like the ones in these images; they can be more or less random (the images are not based on real data, just samples). The real array can have a size like 5000x5000, with peak values from thousands to several hundred thousand... The algorithm has to be universal: I don't know how big the array or the peaks can be, and I don't know how many peaks there are. But I do know a threshold of sorts, in that the peaks can't be smaller than a given value.
The problem is that one peak can consist of several smaller peaks nearby (first image), the heights can be quite random, and the sizes can differ significantly within one array (by size I mean the number of units a peak occupies in the array: one peak can consist of 6 units and another of 90). It also has to be fast (all done in 1 iteration); the array can be really big.
Any help is appreciated - I don't expect code from you, just the right idea :) Thanks!
edit: You asked about the domain, but it's quite complicated and imho it can't help with the problem. It's actually an array of ArrayLists of 3D points, like ArrayList<Point3D>[][], and the value in question is the size of each ArrayList. Each peak contains points that belong to one cluster (a plane, in this case); this array is the result of an algorithm that segments a point cloud. I need to find the highest value in each peak so I can fit the points from the 'biggest' ArrayList to a plane, compute some parameters from it, and then properly cluster most of the points from the peak.
He's not interested in estimating the global maximum using some sort of optimization heuristic - he just wants to find the maximum values within each of a number of separate clusters.
These peaks are always 'obvious' (there's always a gap between peaks)
Based on your images, I assume you mean there's always some 0-values separating clusters? If that's the case, you can use a simple flood-fill to identify the clusters. You can also keep track of each cluster's maximum while doing the flood-fill, so you both identify the clusters and find their maximum simultaneously.
This is also as fast as you can get, without relying on heuristics (which could return the wrong answer), since the maximum of each cluster could potentially be any value in the cluster, so you have to check them all at least once.
Note that this will iterate through every item in the array. This is also necessary, since (from the information you've given us) it's potentially possible for any single item in the array to be its own cluster (which would also make it a peak). With around 25 million items in the array, this should only take a few seconds on a modern computer.
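A sketch of that single pass, assuming clusters are separated by zeros; it flood-fills each cluster with an explicit stack and records the cluster maximum along the way:

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Returns the maximum of each zero-separated cluster (one peak per cluster).
static List<Integer> clusterMaxima(int[][] a) {
    int h = a.length, w = a[0].length;
    boolean[][] seen = new boolean[h][w];
    List<Integer> maxima = new ArrayList<>();
    for (int i = 0; i < h; i++) {
        for (int j = 0; j < w; j++) {
            if (a[i][j] == 0 || seen[i][j]) continue;
            int max = 0;                       // flood-fill one cluster from (i, j)
            Deque<int[]> stack = new ArrayDeque<>();
            stack.push(new int[] { i, j });
            seen[i][j] = true;
            while (!stack.isEmpty()) {
                int[] c = stack.pop();
                max = Math.max(max, a[c[0]][c[1]]);
                int[][] nbrs = { { c[0] + 1, c[1] }, { c[0] - 1, c[1] },
                                 { c[0], c[1] + 1 }, { c[0], c[1] - 1 } };
                for (int[] n : nbrs) {
                    if (n[0] >= 0 && n[0] < h && n[1] >= 0 && n[1] < w
                            && a[n[0]][n[1]] != 0 && !seen[n[0]][n[1]]) {
                        seen[n[0]][n[1]] = true;
                        stack.push(n);
                    }
                }
            }
            maxima.add(max);
        }
    }
    return maxima;
}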
This might not be an optimal solution, but since the problem sounds somewhat fluid too, I'll write it down.
Construct a list of all the values (and coordinates) that are over your minimum threshold.
Sort it in descending order of height.
The first element will be the biggest peak, add it to the peak list.
Then descend down the list, if the current element is further than the minimum distance from all the existing peaks, add it to the peak list.
This is a linear description but all the steps (except 3) can be trivially parallelised. In step 4 you can also use a coverage map: a 2D array of booleans that show which coordinates have been "covered" by a nearby peak.
(Caveat emptor: once you refine the criteria, this solution might become completely unfeasible, but in general it works.)
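A rough sketch of steps 1-4 (the threshold and minimum peak distance are the knobs described above; distance is checked per axis here for simplicity):

import java.util.ArrayList;
import java.util.List;

static class Candidate {
    final int x, y, value;
    Candidate(int x, int y, int value) { this.x = x; this.y = y; this.value = value; }
}

static List<Candidate> findPeaks(int[][] a, int threshold, int minDist) {
    // Step 1: collect every cell over the threshold.
    List<Candidate> candidates = new ArrayList<>();
    for (int x = 0; x < a.length; x++)
        for (int y = 0; y < a[0].length; y++)
            if (a[x][y] >= threshold) candidates.add(new Candidate(x, y, a[x][y]));
    // Step 2: sort in descending order of height.
    candidates.sort((p, q) -> Integer.compare(q.value, p.value));
    // Steps 3 and 4: keep each candidate far enough from all accepted peaks.
    List<Candidate> peaks = new ArrayList<>();
    for (Candidate c : candidates) {
        boolean farEnough = true;
        for (Candidate p : peaks) {
            if (Math.abs(p.x - c.x) < minDist && Math.abs(p.y - c.y) < minDist) {
                farEnough = false;
                break;
            }
        }
        if (farEnough) peaks.add(c);
    }
    return peaks;
}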
Simulated annealing or hill climbing are what immediately come to mind. These algorithms, though, will not guarantee that all peaks are found.
However, if your "peaks" are separated by values of 0 as the gap, maybe a connected-components analysis would help. You would label a region as "connected" if its values are greater than 0 (or, if you have a certain threshold, label regions as connected when they are over that threshold); then your number of components is your number of peaks. You could then do another pass over the array to find the max of each component.
I should note that connected components can be done in linear time, and finding the peak values can also be done in linear time.

Programming a 2D grid in Java

What is the best data structure to use when programming a 2-dimensional grid of tiles in Java? Tiles on the grid should be easily referenced by their location, so that neighbors and paths can be efficiently computed. Should it be a 2D array? An ArrayList? Something else?
If you're not worrying about speed or memory too much, you can simply use a 2D array - this should work well enough.
If speed and/or memory are issues for you then this depends on memory usage and the access pattern.
A single-dimensional array is the way to go if you need high performance. You compute the proper index as y * width + x. There are 2 potential problems with this: cache misses and memory usage.
If you know that your access pattern is such that you fetch neighbours of an element most of the time, then mapping a 2D space into a 1D array as described above may cause cache misses - you want the neighbours to be close in memory, and neighbours from 2 different rows are not. You may have to map your 2d tiles in a different order to your 1d array. See Hilbert curves for example.
For better memory usage, if you know that most of your tiles are always the same (e.g. always grass), you might want to implement a sparse array or a quad tree. Both can be implemented quite efficiently, with cache awareness in mind (the sparse array link is good example for this). Another benefit is that these can be dynamically extended. However, you will always have to pay extra levels of indirection in the end for this to work.
NOTE: Be careful with using generic classes such as HashMaps with the key type being some primitive type or a special location class if you're worried about performance - you will either have to allocate an object each time you index the hash map or pay the price of boxing/unboxing. In addition to this, hash maps will not allow you efficient spatial queries (e.g. give me all objects existing in the radius R of a given object - quad trees are better for this).
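A minimal sketch of the single-array layout with row-major indexing (the wrapper class and the int tile representation are illustrative):

// One contiguous block backing a 2D grid: index = y * width + x.
class Grid {
    final int width, height;
    final int[] tiles; // contiguous storage is friendly to the CPU cache

    Grid(int width, int height) {
        this.width = width;
        this.height = height;
        this.tiles = new int[width * height];
    }

    int get(int x, int y)         { return tiles[y * width + x]; }
    void set(int x, int y, int v) { tiles[y * width + x] = v; }
}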
If you have a fixed dimension for your grid, use a 2D array. If you need the size to be dynamic, use an ArrayList of ArrayLists.
A 2D array seems like a good bet if you plan on inserting stuff into specific locations, as long as it's a fixed size.
The data structure to use really depends on the type of operations you will perform:
In case the number of meaningful positions (non-zero/non-default) in the grid is rather low (<< n x m), it might be more space-efficient to use a HashMap that maps (x, y) positions to specific tiles. You can also iterate over the meaningful positions a lot more efficiently. In addition, you could store references to neighboring tiles in each tile to speed up path/neighborhood traversal.
If your grid is densely filled with "information", you should consider using a 2D array or an ArrayList (if you will at some point have generic types involved as the "tile type", you have to use ArrayLists, since Java does not allow arrays of a generic type).
If you simply need to iterate over the grid and random addressing of cells, then MyCellType[][] should be fine. This is most efficient in terms of space and (one would expect) time for these use-cases.
