I am working on a project that needs to iterate over a collection of clients for updating, with each client receiving an update packet for every other client within proximity. I want this to be fast, since updates will happen for a large number of clients at frequent intervals.
My original plan of attack was to create regions based on client locations, updating each client only with the other clients in their region. This would entail a LinkedList<Region>, with the Region having its own list of clients which would update among each other. One problem with this method was that some regions could have 1 client, while others could have 1000. Another level of difficulty arose from the fact that clients will constantly be moving (thus changing location and Region). These problems could be avoided if there was a way to modify the list while iterating through it, possibly splitting elements when a region gets too large.
Next I thought of creating one large List<Client> that held all players and was kept sorted by location. To update the client at index n with the closest 20 clients, I would only iterate from n-10 to n+10. I don't like this method as much: if there were a 21st client nearby, it could be ignored even though it is the same distance from the client at n as the one at n+10. It also seems slow to have to re-sort all the clients every tick.
In terms of speed, which of these methods provides better performance? Additionally, are there any other Java collections I should consider? Thanks!
I strongly prefer the first method. Sorting the entire list every tick is going to end up being a very bad idea time-wise, which rules out the second method.
To solve the concurrency issues, you should make a copy of the LinkedList<Region> before updating it in a thread. That way Clients can change their Region at the same time as updates are being pushed out to each Region.
Another note: if you plan on retrieving an arbitrary Region from the LinkedList<Region> (for example, when you move a Client from one Region to another), you should look into a hash-based collection such as a HashMap or HashSet. Retrieving an arbitrary element from the middle of a linked list takes time proportional to the list's length, whereas a hash lookup takes expected constant time, so the difference grows as the list gets large.
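A minimal sketch of the hash-based lookup suggested above, assuming regions are fixed-size square cells on a 2D grid. `RegionGrid`, `CELL_SIZE`, and the integer client IDs are illustrative assumptions, not part of the original design, and the class is not thread-safe on its own.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: regions stored in a HashMap keyed by grid cell. */
class RegionGrid {
    static final int CELL_SIZE = 100; // assumed region edge length in world units

    // Cell key -> IDs of the clients currently inside that cell.
    private final Map<Long, Set<Integer>> regions = new HashMap<>();

    /** Packs the two cell coordinates into one long so it can serve as a map key. */
    static long cellKey(double x, double y) {
        long cx = (long) Math.floor(x / CELL_SIZE);
        long cy = (long) Math.floor(y / CELL_SIZE);
        return (cx << 32) ^ (cy & 0xFFFFFFFFL);
    }

    void add(int clientId, double x, double y) {
        regions.computeIfAbsent(cellKey(x, y), k -> new HashSet<>()).add(clientId);
    }

    /** Moves a client; finding either region is a hash lookup, not a list scan. */
    void move(int clientId, double oldX, double oldY, double newX, double newY) {
        long from = cellKey(oldX, oldY);
        long to = cellKey(newX, newY);
        if (from == to) {
            return; // still in the same region
        }
        Set<Integer> old = regions.get(from);
        if (old != null) {
            old.remove(clientId);
            if (old.isEmpty()) {
                regions.remove(from); // drop regions that became empty
            }
        }
        regions.computeIfAbsent(to, k -> new HashSet<>()).add(clientId);
    }

    /** Returns the clients in the region containing (x, y), or an empty set. */
    Set<Integer> clientsAt(double x, double y) {
        return regions.getOrDefault(cellKey(x, y), Collections.<Integer>emptySet());
    }
}
```

Because every cell has the same size, a crowded area still produces a crowded region; this sketch only addresses the lookup cost, not the uneven region sizes mentioned in the question.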
I was thinking about how I would go about implementing a thread-safe RingBuffer in Java and Android, as for some reason there is none, even after all these years, not even a circular queue. So, no (Circular/Ring)ByteBuffer, nor (Circular/Ring)(Buffer/Queue).
Even the majority of third-party RingBuffer implementations are said not to be thread-safe, which makes me think it really isn't as simple as I think it is going to be. What I was thinking about was doing something like this:
Have an Object (say RingBufferPosition) that encapsulates both the Head and Tail position.
Have the RingBuffer maintain an AtomicReference to the RingBufferPosition
When a thread adds something, it will create a temporary (unfortunately, I don't know enough of Java to determine this, but "Stack-allocated") object, which will be recycled over and over, updating it with the new updated head and tail, until it can CAS successfully.
When a thread removes something, it will do similar to adding something.
Everything is accessed in an array allocated to the max length, hence, the head and tail can access/update the current element in O(1) time.
Would this work, and better yet, would it yield any benefits over simply synchronizing access to the collection?
A small code sample/pseudocode (has not been run yet, and I do not even know how to remotely test an atomic data structure, I plan on using it for buffering/streaming media but I haven't gotten that far yet as I need to create this first) can be found here. I have comments/documentation that details my concerns there.
Lastly, to address a possible "Why" question, as in "Why do you need such performance", I'll be truthful. I have always found data structures, especially atomic/lock-free data structures very interesting, and I found this as a very good exercise to learn, plus I always wanted to create a Ring Buffer. I could have just "synchronized" everything, however I do also value performance.
Multiple reader/multiple writer ring buffers are tricky.
Your way doesn't work, because you can't update that start/end position AND the array contents atomically. Consider adding to the buffer: If you update the end position first, then there is a moment before you update the array when the buffer contains an invalid item. If you update the array first, then there's nothing to stop simultaneous additions from stomping on the same array element.
There are lots of ways to deal with these problems, but the various ways have different trade-offs, and you have better options available if you can get rid of the multiple reader or multiple writer requirement.
If I had to guess at why we don't have a concurrent ring buffer in the standard library, I'd say it's because there is no one best way to implement it that is going to be good for most scenarios. The data structure used for ConcurrentLinkedQueue, in contrast, is simple and elegant and an obvious choice when a concurrent linked list is required.
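To illustrate how much simpler things get once the multiple reader and multiple writer requirements are dropped, here is a hedged sketch of a bounded ring buffer that is only safe with exactly one producer thread and one consumer thread. The class name `SpscRingBuffer` and its methods are assumptions made for this example; it is not the asker's design and not a general multi-threaded solution.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: bounded ring buffer for ONE producer thread and ONE consumer thread only. */
class SpscRingBuffer<T> {
    private final Object[] slots;
    private final AtomicLong head = new AtomicLong(); // next index to read; written only by the consumer
    private final AtomicLong tail = new AtomicLong(); // next index to write; written only by the producer

    SpscRingBuffer(int capacity) {
        slots = new Object[capacity];
    }

    /** Producer side: returns false when the buffer is full. */
    boolean offer(T item) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false; // full
        }
        slots[(int) (t % slots.length)] = item; // write the slot first...
        tail.set(t + 1);                        // ...then publish it with a volatile write
        return true;
    }

    /** Consumer side: returns null when the buffer is empty. */
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null; // empty
        }
        int i = (int) (h % slots.length);
        T item = (T) slots[i];
        slots[i] = null; // let the element be garbage collected
        head.set(h + 1); // hand the slot back to the producer
        return item;
    }
}
```

Each index has a single writer, so no compare-and-set loop is needed: the slot is filled before `tail` is advanced, which avoids the "invalid item" window described above. With several producers this ordering alone would no longer be enough.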
My application has a number of objects in an internal list, and I need to be able to log them (e.g. once a second) and later recreate the state of the list at any time by querying the log file.
The current implementation logs the entire list every second, which is great for retrieval because I can simply load the log file, scan through it until I reach the desired time, and load the stored list.
However, the majority of my objects (~90%) rarely change, so it is wasteful in terms of disk space to continually log them at a set interval.
I am considering switching to a "delta" based log where only the changed objects are logged every second. Unfortunately this means it becomes hard to find the true state of the list at any one recorded time, without "playing back" the entire file to catch those objects that had not changed for a while before the desired recall time.
An alternative could be to store (every second) both the changed objects and the last-changed time for each unchanged object, so that a log reader would know where to look for them. I'm worried I'm reinventing the wheel here though — this must be a problem that has been encountered before.
Existing comparable techniques, I suppose, are those used in version control systems, but I'd like a native object-aware Java solution if possible — running git commit on a binary file once a second seems like it's abusing the intention of a VCS!
So, is there a standard way of solving this problem that I should be aware of? If not, any pitfalls that I might encounter when developing my own solution?
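One common way to bound the "playing back" cost described above is to mix the two schemes: write a full snapshot every so often and deltas in between. The following is only an in-memory sketch under assumptions: `DeltaLog`, the `String` keys and values, and the `snapshotEvery` parameter are hypothetical, deletions are not modelled, and a real implementation would write to a file instead of maps.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a delta log with periodic full snapshots. */
class DeltaLog {
    private final int snapshotEvery;
    // time -> full state (snapshot), and time -> changed entries only (delta)
    private final TreeMap<Integer, Map<String, String>> snapshots = new TreeMap<>();
    private final TreeMap<Integer, Map<String, String>> deltas = new TreeMap<>();
    private final Map<String, String> live = new HashMap<>();

    DeltaLog(int snapshotEvery) {
        this.snapshotEvery = snapshotEvery;
    }

    /** Records one tick: a full copy every snapshotEvery ticks, otherwise only the changes. */
    void record(int time, Map<String, String> changed) {
        live.putAll(changed);
        if (time % snapshotEvery == 0) {
            snapshots.put(time, new HashMap<>(live));
        } else if (!changed.isEmpty()) {
            deltas.put(time, new HashMap<>(changed));
        }
    }

    /** Rebuilds the state at a time: nearest earlier snapshot, then replay the later deltas. */
    Map<String, String> stateAt(int time) {
        Map<String, String> state = new HashMap<>();
        int from = Integer.MIN_VALUE;
        Map.Entry<Integer, Map<String, String>> base = snapshots.floorEntry(time);
        if (base != null) {
            state.putAll(base.getValue());
            from = base.getKey();
        }
        // Only deltas strictly after the snapshot and up to the requested time are replayed.
        for (Map<String, String> delta : deltas.subMap(from, false, time, true).values()) {
            state.putAll(delta);
        }
        return state;
    }
}
```

With this layout a reader never replays more than `snapshotEvery - 1` deltas, so the interval trades disk space against recall time.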
I am currently designing around a big in-memory index structure (several gigabytes). The index is actually an RTree whose leaves are BTrees (don't ask). It supports a special query and pushes it to the logical limit.
Since those nodes are solely search nodes, I ask myself how best to make it parallel.
I know of six solutions so far:
Block reads when a write is scheduled. The tree is completely blocked until the last read has finished, then the write is performed, and after the write the tree can again be used for multiple reads. (Reads need no locking.)
Clone the nodes to be changed and reuse existing nodes (including leaves), then switch between the two versions by again briefly stopping reads and switching. Since leaf pointers must also be altered, the leaf pointers might become their own collection, making it possible to switch modifications atomically; changes can be redone on a second version to avoid copying the pointers on each insert.
Use independent copies of the index, like double buffering. Update one copy of the index, then switch to it. Once no one reads the old index, alter it in the same way. This way the change can be made without blocking existing reads. If another insert hits the tree within a reasonable amount of time, those changes can be applied as well.
Use a serial shared-nothing architecture so each search thread has its own copy. Since a thread can only alter its tree after a single read has been performed, this would also be lock-free and simple. Because reads are spread evenly across the worker threads (each bound to a certain core), throughput would not be harmed.
Use read/write locks on each node about to be written, and block only a subtree during a write. This would involve additional operations on the tree, since splitting and merging propagate upwards and therefore require a second pass of the insert (expanding locks upwards, towards the parent, would introduce the chance of a deadlock). Since splits and merges are not that frequent with a larger page size, this would also be a good way. My current BTree implementation actually uses a similar mechanism: it splits a node and then reinserts the value, unless no split is needed (which is not optimal but simpler).
Use a double buffer for each node, like the shadow cache in databases, where each page is switched between two versions. Every time a node is modified, a copy is modified, and a read uses either the old version or the new one. Each node gets a version number, and the version closest to the active version (latest change) is chosen. To switch between two versions, only an atomic change to the root information is needed. This way the tree can be altered and used at the same time. The switch can be done at any time, but it must be ensured that no read is still using the old version when it is overwritten with the new one. This method may avoid interfering with cache locality when linking leaves and the like. It requires twice the amount of memory, since a back buffer must be present, but it saves allocation time and might be good for a high frequency of changes.
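As a rough illustration of the "update a copy, then switch" idea behind the double-buffering options above, the sketch below publishes a new version with a single atomic reference write. It is deliberately simplified and rests on assumptions: a `TreeMap` stands in for the real index, the whole structure is copied on every insert (which would be far too expensive for a multi-gigabyte tree), and `SnapshotIndex` is a hypothetical name.

```java
import java.util.Collections;
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch: readers use an immutable version; a writer swaps in a new one atomically. */
class SnapshotIndex {
    // A TreeMap stands in for the real index structure (assumption for illustration).
    private final AtomicReference<NavigableMap<Integer, String>> current =
            new AtomicReference<NavigableMap<Integer, String>>(
                    Collections.unmodifiableNavigableMap(new TreeMap<Integer, String>()));

    /** Readers never lock: they take whichever version is current and keep using it. */
    String lookup(int key) {
        return current.get().get(key);
    }

    /** Writers are serialized; each builds the next version off to the side, then switches. */
    synchronized void insert(int key, String value) {
        TreeMap<Integer, String> next = new TreeMap<Integer, String>(current.get());
        next.put(key, value);
        current.set(Collections.unmodifiableNavigableMap(next));
    }
}
```

A read that started before the switch keeps working on the old version, which stays valid until the garbage collector reclaims it. A variant that reuses two long-lived buffers, as described in the list, additionally has to track when the last reader has left the old buffer.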
With all those thoughts, what is best? I know it depends, but what is done in the wild? If there are 10 read threads (or even more) being blocked by a single write operation, I guess that is not something I really want.
Also, how about the L3, L2 and L1 caches, and scenarios with multiple CPUs? Any issues there? The beauty of double buffering is the chance that reads hitting the old version are still working with the correct cached version.
The option of creating a fresh copy of a node is not very appealing. So what is used in the wild in today's database landscape?
[update]
On rereading the post, I wonder whether, instead of using write locks for split and merge, it would be better to create replacement nodes. For a split or a merge I need to copy about half of the elements around anyway, and those operations are very rare, so copying a node completely and then replacing it in the parent node, which is a simple and fast operation, would do the trick. This way reads would be blocked only very briefly: since we create copies anyway, the blocking only happens while the new nodes are swapped in. Since leaves may not be altered during that access, it is unimportant, because the information density has not changed. But again, every access to a node then needs an increment and decrement of a read lock and a check for intended write locks. All of this is overhead, and all of it blocks further reads.
[Update2]
Solution 7. (currently favored)
Currently we favor a double buffer for the internal (non-leaf) nodes and use something similar to row locking.
Our logical tables that we try to decompose using those index structures (which is all an index does) result in applying set algebra to that information. I noticed that this set algebra is linear (O(m+n) for intersection and union), which gives us the chance to lock each entry taking part in such an operation.
By double buffering the internal nodes (which is not hard to implement, nor does it cost much: less than about 1% memory overhead) we can live problem-free on that issue without blocking too many read operations.
Since we batch modifications in a certain way, a given column is very rarely updated, but once it is, it takes more time, since the modifications for that single entry might number in the thousands.
So the goal is to alter the set algebra so that columns currently being modified are simply intersected later on. Since only one column is modified at a time, such an operation would block only once. And the write operation has to wait for everyone currently reading the column. And guess what: while a write operation waits, it usually lets another write operation take place on another column that is not busy. We calculate the probability of such a block to be very, very low, so we don't need to care.
The locking mechanism works as follows: check for write, check for write intention, add read, check for write again, and proceed with the read. So there is no explicit object locking. We access fixed areas of bytes, and once the structure is settled, everything critical is planned to move into a C++ version to make it somewhat faster (2x, we guess; this should only take one person one or two weeks, especially with a Java-to-C++ translator).
The only remaining effect that might be important is caching, since modifications invalidate the L1 caches and maybe L2 too. So we plan to collect all modifications to such a table/index and schedule them to run within a time slice of one or more minutes, evenly distributed so that the system has no performance hiccups.
If you know of anything that helps us please go ahead.
As no one replied, I would like to summarize what we (I) finally did. The structure is now separated. We have an RTree whose leaves are actually tables. Those tables can even be remote, so we have a distribution mechanism that is mostly transparent thanks to RMI and proxies.
The rest was easy. The RTree can advise a table to split, and the result of the split is again a table. The split is done on a single machine and transferred to another if it has to be remote. Merging is almost the same.
This remoting also applies to threads bound to different CPUs, to avoid cache issues.
The in-memory modification works as I already suggested: we duplicated internal nodes, turned the table by 90°, and adapted the set-algebra algorithms to handle locked columns efficiently. The test for a single table is simple and, compared to the thousands of entries per column, not a performance issue after all. Deadlocks are also impossible, since one column is used at a time, so there is only one lock per thread. We are experimenting with processing columns in parallel, which would improve the response time. We are also thinking about binding columns to a given virtual core so that no locking is needed at all, since the column is isolated and the modifications can again be serialized.
This way one can utilize 20 cores and more per CPU and also avoid cache misses.
I am making a program in Java in which a ball bounces around on the screen. The user can add other balls, and they all bounce off each other. My question is about the storage of the added balls. At the moment I am using an ArrayList to store them, and every time the space bar is pressed, a new ball object is created and added to the ArrayList. Is this the most efficient way of doing things? I don't specify the size of the ArrayList at the beginning, so is it inefficient to have to allocate new space in the array every time the user wants a new ball, even if the ball count will get up into the hundreds? Is there another class I could use to handle this in a more efficient manner?
Thanks!
EDIT:
Sorry, I should have been more clear. I iterate through the balls every 30 milliseconds, using nested for loops to see if they are intersecting with each other. I do access one ball the most often (the ball which the user can control with the arrow keys, another feature of the game), but the user can choose to switch control balls. Balls are never removed. So, I am performing some fairly complex calculations (I use my own vector class to move them off of each other every time there is a collision) on the balls very often.
Measure it and find out! In all seriousness, often times the best way to get answers to these questions is to set up a benchmark and swap in different collection types.
I can tell you that it won't allocate new space every time you add a new item to the ArrayList. Extra space is allocated so that it has room to grow.
LinkedList is another List option. It is super cheap to add items, but random access (list.get(10)) is expensive. Sets could also be good if you don't need ordered access (though there are ordered sets, too), and you want a Map implementation if you're accessing them by some sort of key/id. It really all depends on how you're using the collection.
Update based on added details
It sounds like you are mostly doing sequential reads through the entire list. In that scenario, a LinkedList is probably your best choice. Though again, if you only expose the List interface to the rest of your code (or even a more general Collection), you can easily swap in different implementations and actually measure the difference.
ArrayList is a highly optimized and very efficient wrapper on top of a plain Java array. A small timing overhead comes from copying array elements, which happens when the allocated size is less than required number of elements. When you grow the array into a few hundreds of items, the copying will happen less than ten times, so the price you pay for not knowing the size in advance is very small. You can further reduce that overhead by suggesting an initial size for your ArrayList.
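Suggesting an initial size can be done through the `ArrayList(int initialCapacity)` constructor. A minimal sketch follows; the `Ball` and `BallStore` classes and the figure of 256 are assumptions for illustration, not taken from the question.

```java
import java.util.ArrayList;
import java.util.List;

class BallStore {
    /** Hypothetical stand-in for the asker's ball class. */
    static class Ball {
        final double x;
        final double y;

        Ball(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    // 256 is an assumed estimate; the list still grows automatically beyond it.
    private final List<Ball> balls = new ArrayList<>(256);

    /** Appending is amortized O(1); the backing array is not reallocated on every add. */
    void addBall(double x, double y) {
        balls.add(new Ball(x, y));
    }

    int count() {
        return balls.size();
    }
}
```

The capacity is only a hint for the backing array: `size()` still reports the number of balls actually added.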
Removing from the middle of the ArrayList does take linear time. If you plan to remove items and/or insert them in the middle of the list frequently, this may become an issue. Note, however, that the overhead is not going to be worse than that for a plain array.
I iterate through the balls every 30 milliseconds, using nested for loops to see if they are intersecting with each other.
This does not have much to do with the collection in which the balls are stored. You could use a spatial index to improve the speed of finding intersections.
For ArrayList in Java, removing from the end and appending one element are both amortized O(1), so it is efficient in most cases. (In the rare case where an append triggers a resize of the backing array, that single call is O(n).)
But you should think more carefully about your design before choosing your data structure.
How many objects will your collection usually hold? If the number is small, you are free to choose whichever data structure you find easiest to work with; it will cost your code almost no performance.
If you often need to find one particular ball among all of your balls, another data structure such as HashMap or HashSet would be better.
Or, if you often delete from the middle of your list, LinkedList may be an appropriate choice :)
I'd recommend working out the way in which you need to access the balls, and picking an appropriate interface (not implementation). E.g. if you're only accessing them sequentially, use a List. If you need to look up a ball by ID, think of a Map. The interface should match your requirements in terms of functionality, not in terms of speed/efficiency.
Then pick an implementation, e.g. HashMap or TreeMap, and write your code.
Afterwards, profile it. Is the ball access code inefficient? If so, try to optimise by switching to an alternative implementation that's more appropriate to your needs.
I have written the following game server and want to provide a groups feature. Groups will allow players who are "nearby" on screen to be grouped together. In fast action games, a group would change quickly, since players will be moving in and out of their zones constantly.
Since each player would need to listen to events from other players in the group, players will subscribe to the group.
This brings me to my question. What is the appropriate datastructure or java collection class which can be used in this scenario to hold the changing set of event listeners on a group? The number of listeners to a group would rarely exceed 20 in my opinion, and should be lesser than that in most scenarios. It is a multi-threaded environment.
The one I am planning to use is a CopyOnWriteArrayList. But since there will be a reasonable number of updates (due to changing subscriptions), is this class appropriate? What other class would be good to use? If you have any custom implementation using arrays etc., please share.
Unless you have millions of changes per second (which seems unlikely in your scenario) a CopyOnWriteArrayList should be good enough for what you need. If I were you, I would use that.
IF you notice a performance issue AND you have profiled your application AND you have identified that the CopyOnWriteArrayList is the bottleneck, then you can find a better structure. But I doubt it will be the case.
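For reference, a minimal sketch of how a CopyOnWriteArrayList could hold the listeners of one group. `ListenerGroup` and the use of `Consumer<String>` as the listener type are assumptions for illustration; the actual event and listener types of the game server are not shown in the question.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical group whose listeners may come and go while an event is being dispatched. */
class ListenerGroup {
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> listener) {
        listeners.add(listener); // copies the backing array: O(n) per change
    }

    void unsubscribe(Consumer<String> listener) {
        listeners.remove(listener);
    }

    /** Iterates over a snapshot, so concurrent changes never throw ConcurrentModificationException. */
    int publish(String event) {
        int delivered = 0;
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
            delivered++;
        }
        return delivered;
    }
}
```

With around 20 listeners per group, each subscription change copies an array of about 20 references, which is why the write cost is unlikely to matter here.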
Do players have integer IDs? If so then I have a lightweight, immutable array-based set class that might make sense for you:
http://code.google.com/p/mikeralib/source/browse/trunk/Mikera/src/main/java/mikera/persistent/IntSet.java
This was written for similar kinds of situations in game engines.
However I also have an alternative approach to consider: If you are updating the groups automatically based on vicinity, then you might want to consider not tracking groups at all. Instead, consider using a spatial data structure that allows you to quickly search for nearby players whenever an event occurs, and directly send the event to nearby players.
Typically you could use a 2D or 3D grid or octree with the smallest division size set to be equal to the max range for your groups. Then a vicinity search will only need to check 9 (2D case) or 27 (3D case) locations in order to find all nearby players. I think doing this search whenever needed will be faster and simpler than the overhead of maintaining lists of groups and listeners all the time....
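The 2D grid search described above could look roughly like the following. This is a sketch under assumptions: `PlayerGrid` and `Player` are hypothetical names, positions are plain doubles, the grid is rebuilt or updated by the caller as players move, and no thread safety is provided.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical 2D grid whose cell size equals the maximum group range. */
class PlayerGrid {
    static final class Player {
        final int id;
        final double x;
        final double y;

        Player(int id, double x, double y) {
            this.id = id;
            this.x = x;
            this.y = y;
        }
    }

    private final double cellSize;
    private final Map<Long, List<Player>> cells = new HashMap<>();

    PlayerGrid(double maxRange) {
        this.cellSize = maxRange;
    }

    private long cellOf(double v) {
        return (long) Math.floor(v / cellSize);
    }

    private static long key(long cx, long cy) {
        return (cx << 32) ^ (cy & 0xFFFFFFFFL);
    }

    void insert(Player p) {
        cells.computeIfAbsent(key(cellOf(p.x), cellOf(p.y)), k -> new ArrayList<>()).add(p);
    }

    /** Checks only the 3x3 block of cells around (x, y), then filters by exact distance. */
    List<Integer> nearby(double x, double y) {
        List<Integer> result = new ArrayList<>();
        long cx = cellOf(x);
        long cy = cellOf(y);
        for (long dx = -1; dx <= 1; dx++) {
            for (long dy = -1; dy <= 1; dy++) {
                List<Player> cell = cells.get(key(cx + dx, cy + dy));
                if (cell == null) {
                    continue;
                }
                for (Player p : cell) {
                    double ex = p.x - x;
                    double ey = p.y - y;
                    if (ex * ex + ey * ey <= cellSize * cellSize) {
                        result.add(p.id);
                    }
                }
            }
        }
        return result;
    }
}
```

Because the cell size equals the maximum range, any player within range must lie in one of the nine cells examined, so no other cells need to be visited.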
From what I've gathered, you have choice between CopyOnWriteArrayList and ConcurrentHashMap:
CopyOnWriteArrayList:
The computational cost of an add/remove operation is linear in the size of the list. It may happen multiple times during a single iteration (group notification).
Simpler data structure with constant read time.
ConcurrentHashMap:
Add/remove is a constant-time operation. Additions or removals of subscribers do not affect an iteration already in progress, and blocking is minimized.
Larger data structure that requires slightly longer read time.
Creating a custom solution is possible when it comes to efficiency but probably not as safe when it comes to thread safety. I'm leaning towards ConcurrentHashMap but the winner will probably depend heavily on how your game turns out.
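If the ConcurrentHashMap route is taken, the subscribers can be kept in the concurrent set view returned by `ConcurrentHashMap.newKeySet()`. The sketch below assumes subscribers are identified by integer player IDs; `SubscriberSet` and `notifyAllSubscribers` are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.IntConsumer;

/** Hypothetical subscriber set backed by a ConcurrentHashMap. */
class SubscriberSet {
    // newKeySet() returns a thread-safe Set view backed by a ConcurrentHashMap.
    private final Set<Integer> subscribers = ConcurrentHashMap.newKeySet();

    /** Returns false if the player was already subscribed. */
    boolean subscribe(int playerId) {
        return subscribers.add(playerId);
    }

    /** Returns false if the player was not subscribed. */
    boolean unsubscribe(int playerId) {
        return subscribers.remove(playerId);
    }

    /** Iteration is weakly consistent: it is safe while other threads add or remove. */
    int notifyAllSubscribers(IntConsumer send) {
        int notified = 0;
        for (int id : subscribers) {
            send.accept(id);
            notified++;
        }
        return notified;
    }
}
```

Unlike the copy-on-write list, an iteration here may or may not see subscribers added or removed while it is running, which is usually acceptable for proximity events.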