Low-latency collection for edits - Java

Here is the situation: I need to maintain an unbounded, single-writer collection of string ids in Java 7.
As new records come in with a particular flag set (its meaning is immaterial), I attempt to insert.
If I find a pre-existing record in the collection, I alert and overwrite anyway.
If new records come in with the flag unset, I attempt to remove the record, if one exists.
Two-step lookups are to be avoided for performance.
Insert/update/remove should be as close to O(1) as possible.
Would HashSet be the most apt collection for this?

If your code is single-threaded, HashSet is a good match: add, remove, and contains are all expected O(1), and add doubles as both the insert and the duplicate check, so no separate lookup is needed. (The diamond operator below is available since Java 7.) Sample implementation:

Set<String> ids = new HashSet<>();

void processRecord(Record record) {
    if (record.hasFlag()) {
        // add() returns false if the id was already present --
        // that is your duplicate alert, with no extra lookup
        if (!ids.add(record.getId())) {
            alertDuplicate(record);
        }
    } else {
        // remove() is a no-op if the id is absent
        ids.remove(record.getId());
    }
}

Note that for equal string ids there is nothing to "overwrite" in a Set; if your records carry data that needs replacing, use a HashMap<String, Record> and put() instead.
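To make the single-call semantics concrete, here is a tiny self-contained check of the add/remove return values (the id "A42" is just an example):

```java
import java.util.HashSet;
import java.util.Set;

public class IdSetDemo {
    public static void main(String[] args) {
        Set<String> ids = new HashSet<>();

        // First insert succeeds: add() returns true for a new element
        boolean first = ids.add("A42");
        // Second insert of the same id returns false -- this is the
        // duplicate signal, obtained without a separate lookup
        boolean second = ids.add("A42");

        System.out.println(first);   // true
        System.out.println(second);  // false

        // Removal is also a single O(1) call; it returns whether the id existed
        System.out.println(ids.remove("A42")); // true
        System.out.println(ids.remove("A42")); // false
    }
}
```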

Related

Handling additional data in Apache ServiceComb compensation methods

I'm currently looking at implementations of the saga pattern for distributed transactions, and I found that Apache ServiceComb Pack might work for me.
However, the limitation that compensating methods must have the same signature as the methods they compensate may be a bottleneck.
From Apache's example:
@Compensable(compensationMethod = "cancel")
void order(CarBooking booking) {
    booking.confirm();
    bookings.put(booking.getId(), booking);
}

void cancel(CarBooking booking) {
    Integer id = booking.getId();
    if (bookings.containsKey(id)) {
        bookings.get(id).cancel();
    }
}
You can see that we have the same declaration for both methods.
But what if I need additional information to compensate my transaction? For instance, I make a call to an external system to update some flag to "true". When I need to compensate it, how do I make the "cancel" method know what the original value of this flag was?
Things get more tricky when we update a whole object. How do I pass the object's pre-modification state to the cancel method?
This limitation doesn't look very promising. Do you know of any approaches to work around it?
You can save the localTxId and the flag in your application, then use the localTxId in the compensation method to look the flag up:
Map<String, String> extmap = new HashMap<>(); // consider a ConcurrentHashMap if transactions run concurrently

@Autowired
OmegaContext omegaContext;

@Compensable(compensationMethod = "cancel")
void order(CarBooking booking) {
    booking.confirm();
    bookings.put(booking.getId(), booking);
    // save the flag, keyed by the current transaction id
    extmap.put(omegaContext.localTxId(), "your flag");
}

void cancel(CarBooking booking) {
    // recover the flag saved by order()
    String flag = extmap.get(omegaContext.localTxId());
    Integer id = booking.getId();
    if (bookings.containsKey(id)) {
        bookings.get(id).cancel();
    }
}
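One practical wrinkle with stashing per-transaction state in a map: if entries are only ever read in cancel(), the map grows for every transaction that commits and is never compensated. A minimal self-contained sketch of the same idea, with the entry removed during compensation; OmegaContext is stubbed here and all names are illustrative, not part of ServiceComb's API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FlagStashDemo {
    // Stand-in for ServiceComb Pack's OmegaContext, stubbed for this sketch
    static class FakeOmegaContext {
        String localTxId() { return "tx-1"; }
    }

    static final FakeOmegaContext omegaContext = new FakeOmegaContext();

    // One entry per in-flight transaction; ConcurrentHashMap because
    // several transactions may run concurrently
    static final Map<String, Boolean> flagsByTxId = new ConcurrentHashMap<>();

    static void order() {
        // forward path: remember the flag's pre-modification value
        flagsByTxId.put(omegaContext.localTxId(), Boolean.TRUE);
    }

    static void cancel() {
        // compensation path: recover the saved flag and drop the entry
        // so the map does not leak
        Boolean originalFlag = flagsByTxId.remove(omegaContext.localTxId());
        if (originalFlag != null) {
            // ...restore the external system's flag to originalFlag here
        }
    }

    public static void main(String[] args) {
        order();
        cancel();
        System.out.println(flagsByTxId.isEmpty()); // true
    }
}
```

Entries for transactions that commit successfully still need cleanup (e.g. on the success path or via a scheduled sweep); that is left out of this sketch.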

ConcurrentModificationException while calling size() on a sublist

I am trying to save a bunch of SQL transactions. I am in the context of ESB routes transferring from a SQL source to a SQL target, and the order of the SQL transactions is not guaranteed, so you can have a SQL update before the object was inserted.
Due to the architecture, I'm saving these SQL transactions 1000 by 1000 (I'm using a message queue). Some of them can fail, and I re-route those in order to retry or reject them. To improve on the older system, where if the batch of 1000 fails you save the records one by one, I want to implement a dichotomy: if the save fails, split the list in half and try each half again, recursively. I am also tracking an attribute of my objects in a second list (objectsNo) for further operations.
However, I am getting a ConcurrentModificationException in my first recursive call, when calling objectsList.size(). How can I avoid it? I would also be very thankful for any solution that improves efficiency some other way than dichotomy (and would thereby bypass my issue).
Suppressed: java.util.ConcurrentModificationException: null
at java.util.ArrayList$SubList.checkForComodification(ArrayList.java:1231)
at java.util.ArrayList$SubList.size(ArrayList.java:1040)
at fr.company.project.esbname.mariadb.MariaDbDatabase.saveObjectWithDichotomie(MariaDbDatabase.java:398)
at fr.company.project.esbname.mariadb.MariaDbDatabase.saveObjectWithDichotomie(MariaDbDatabase.java:404)
at fr.company.project.esbname.mariadb.MariaDbDatabase.saveObject(MariaDbDatabase.java:350)
at sun.reflect.GeneratedMethodAccessor324.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.camel.component.bean.MethodInfo.invoke(MethodInfo.java:472)
at org.apache.camel.component.bean.MethodInfo$1.doProceed(MethodInfo.java:291)
at org.apache.camel.component.bean.MethodInfo$1.proceed(MethodInfo.java:264)
at org.apache.camel.component.bean.BeanProcessor.process(BeanProcessor.java:178)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:541)
... 22 common frames omitted
I tried to understand, but there should not be any mistake. Even though I use recursion, it stays single-threaded. I considered that the issue could be with Hibernate (some requests from the failed save could stay in the cache and lock modification), but the issue is with size(), which is called on a sublist of the original list.
private List<String> saveObjectWithDichotomie(List<Object> objects,
                                              List<String> objectsNo,
                                              Exchange exchange) throws JsonProcessingException {
    try {
        objectRepository.save(objects);
        return objectsNo;
    } catch (DataIntegrityViolationException e) {
        if (objects.size() == 1) {
            objectsNo.clear();
            errorProcessor.sendErrorToRejets(objects.get(0), exchange, e);
            return objectsNo;
        } else {
            List<Object> objectsFirstHalf = objects.subList(0, objects.size() / 2);
            List<Object> objectsSecondHalf = objects.subList(objects.size() / 2, objects.size());
            List<String> objectsNoFirstHalf = objectsNo.subList(0, objectsNo.size() / 2);
            List<String> objectsNoSecondHalf = objectsNo.subList(objectsNo.size() / 2, objectsNo.size());
            objectsNo.clear();
            objectsNo.addAll(
                saveObjectWithDichotomie(objects, objectsNoFirstHalf, exchange)
            );
            objectsNo.addAll(
                saveObjectWithDichotomie(objects, objectsNoSecondHalf, exchange)
            );
            return objectsNo;
        }
    }
}
If you read the documentation of subList, it clearly says:
The returned list is backed by this list, so non-structural changes in the returned list are reflected in this list, and vice-versa.
That is the reason for your exception (no multiple threads are needed for this to happen). So when you create the halves, copy them into new lists:
List<Object> objectsFirstHalf = new ArrayList<>(objects.subList(0, objects.size()/2));
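To illustrate why the copy helps: copying the sublist into a new ArrayList detaches it from the original, so structurally modifying the original no longer invalidates it. A small self-contained example (contents are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class SublistCopyDemo {
    public static void main(String[] args) {
        List<Integer> original = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            original.add(i);
        }

        // A copy is an independent list, not a view backed by the original
        List<Integer> firstHalf = new ArrayList<>(original.subList(0, 5));

        // Structurally modifying the original no longer affects the copy
        original.clear();

        // No ConcurrentModificationException here
        System.out.println(firstHalf.size()); // 5
    }
}
```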
Two things:
ConcurrentModificationException does not mean the list was modified by another thread; it means something accessed the list expecting one state while it had been structurally changed in the meantime.
subList does not create a new list; it creates a view of the original list. That means you cannot structurally change the original list without invalidating the retrieved sublist.
So,
objectsNo.clear();
is your problem.
See this MCVE:
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class Sublist {
    public static void main(String[] args) {
        List<String> list = new ArrayList<>(
                IntStream.range(0, 100).mapToObj(Integer::toString).collect(Collectors.toList()));
        List<String> sublist = list.subList(10, 20);
        // outputs "15"
        System.out.println(sublist.get(5));
        list.clear();
        // throws ConcurrentModificationException
        System.out.println(sublist.get(5));
    }
}

Refreshing cache without impacting latency to access the cache

I have cache refresh logic and want to make sure it is thread-safe and done the correct way.
public class Test {
    Set<Integer> cache = Sets.newConcurrentHashSet(); // Guava

    public boolean contain(int num) {
        return cache.contains(num);
    }

    public void refresh() {
        cache.clear();
        cache.addAll(getNums());
    }
}
So I have a background thread periodically calling refresh, and multiple threads calling contain at the same time. I was trying to avoid synchronized in the method signatures because refresh could take some time (imagine getNums makes network calls and parses huge data), and contain would then be blocked.
I think this code is not good enough, because if contain is called between clear and addAll, it always returns false.
What is the best way to achieve cache refreshing without impacting significant latency to contain call?
The best way would be a functional approach with immutable state: instead of adding and removing elements in the Set, you create an entirely new Set every time you want to change it (Java 9's Set.of factory methods make building immutable sets convenient, but the pattern works on older versions too).
Achieving this can be awkward or infeasible in legacy code, however. So instead you can keep a single volatile Set field that contain reads, and assign it a fully built new instance in the refresh method:
public class Test {
    volatile Set<Integer> cache = new HashSet<>();

    public boolean contain(int num) {
        return cache.contains(num);
    }

    public void refresh() {
        Set<Integer> privateCache = new HashSet<>();
        privateCache.addAll(getNums());
        cache = privateCache; // publish the fully built set in one volatile write
    }
}
Edit: we don't want or need a concurrent set here. That would be for adding and removing elements from multiple threads at the same time, which is not what we're doing; we just swap the old Set for a new one, so a volatile field is enough to guarantee readers see either the old or the new fully built set, never a half-updated one.
And as mentioned at the start, if you never modify collections and instead build a new one for each update (with persistent/immutable collections this can even be cheap, since the new set can share structure with the old one), you never need to worry about concurrency at all, because there is no mutable shared state between threads.
How would you make sure your cache doesn't contain stale entries when calling contains? Furthermore, you'd need to call refresh every time the result of getNums() changes, which is inefficient. It would be better to control the changes to the underlying data and update the cache accordingly. The cache might look like:
public class MyCache {
    // a ConcurrentHashMap so that putIfAbsent is available
    private final ConcurrentHashMap<Integer, Boolean> cache = new ConcurrentHashMap<>();

    public boolean contains(Integer num) {
        return cache.containsKey(num); // containsKey, not contains (which checks values)
    }

    public void add(Integer num) {
        cache.putIfAbsent(num, true);
    }

    public void clear() {
        cache.clear();
    }

    public void remove(Integer num) {
        cache.remove(num);
    }
}
Update
As @schmosel made me realize, mine was a wasted effort: it is in fact enough to initialize a completely new HashSet<> with your values in the refresh method, assuming of course that the cache field is marked volatile. In short, @Snickers3192's answer points out what you seek.
Old answer
You can also use a slightly different system.
Keep two Set<Integer> instances, one of which is always empty. When you refresh the cache, asynchronously re-initialize the second one and then just switch the pointers. Other threads accessing the cache won't see any particular overhead; from an external point of view, they are always accessing the same cache.
private volatile int currentCache; // 0 or 1
private final Set<Integer>[] caches = new HashSet[2]; // two caches; one is always empty, so little extra memory
private volatile Set<Integer> cachePointer = null; // pointer to the current cache, must be volatile

// initialize
{
    this.caches[0] = new HashSet<>(0);
    this.caches[1] = new HashSet<>(0);
    this.currentCache = 0;
    this.cachePointer = caches[this.currentCache]; // point to cache zero from the beginning
}
Your refresh method may look like this:
public void refresh() {
    // store the current cache index
    final int previousCache = this.currentCache;
    final int nextCache = getNextPointer();
    // the fill can run asynchronously;
    // in the meantime, external threads still access the current cache
    CompletableFuture.runAsync(() -> {
        // fill the unused cache
        caches[nextCache].addAll(getNums());
        // then switch the pointer to the just-filled cache;
        // from this point on, threads access the new cache
        switchCachePointer();
        // empty the other cache, still on the async thread
        caches[previousCache].clear();
    });
}
where the utility methods are:
public boolean contains(final int num) {
    return this.cachePointer.contains(num);
}

private int getNextPointer() {
    return (this.currentCache + 1) % this.caches.length;
}

private void switchCachePointer() {
    // make cachePointer point to the new cache
    this.currentCache = this.getNextPointer();
    this.cachePointer = caches[this.currentCache];
}

Iteration over a Set

I'm having issues iterating over (and modifying) a Set containing objects. I've tried many ways of iterating (4), but none of them seem to work; they all throw java.util.ConcurrentModificationException.
[Code is written in Groovy]
private void replaceRock() {
    ObjectNodeManager.OBJECTS.each {
        System.out.println("Going...");
        if (it.getPosition().withinDistance(player.getPosition(), 30)) {
            System.out.println("Found...");
            Position position = it.getPosition();
            ObjectNode newRock = new ObjectNode(439, position, ObjectDirection.NORTH, ObjectType.DEFAULT);
            ObjectNodeManager.unregister(it);
            ObjectNodeManager.register(newRock);
            it.remove();
        }
    }
}
I've tried synchronization to prevent access from other Threads, but this also didn't work. Please help me, I'm very desperate.
First find them (this gives you, essentially, a list of references) and then deal with them:
ObjectNodeManager.OBJECTS.findAll {
    it.getPosition().withinDistance(player.getPosition(), 30)
}.each {
    ObjectNode newRock = new ObjectNode(439, it.position, ObjectDirection.NORTH, ObjectType.DEFAULT)
    ObjectNodeManager.unregister(it)
    ObjectNodeManager.register(newRock)
    it.remove()
}
On a random side note: I'd add a replace method to ObjectNodeManager that combines unregister, register, and remove. Also, working with class methods and properties like this is not the best thing to do (but since it looks like a game...)
The problem is that you are modifying the set of objects while you are looping through it.
Try iterating through a copy of it instead:
ArrayList<YourType> copy = new ArrayList<YourType>(ObjectNodeManager.OBJECTS);
copy.each(...)
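If ObjectNodeManager.OBJECTS is a regular java.util collection, an alternative to copying is to delete through the collection's own Iterator, which is the one sanctioned way to remove elements mid-iteration. A minimal Java sketch (set contents are illustrative):

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

public class IteratorRemoveDemo {
    public static void main(String[] args) {
        Set<Integer> objects = new HashSet<>();
        for (int i = 0; i < 10; i++) {
            objects.add(i);
        }

        // Calling objects.remove(...) inside a for-each loop would throw
        // ConcurrentModificationException; Iterator.remove() is safe
        for (Iterator<Integer> it = objects.iterator(); it.hasNext(); ) {
            if (it.next() % 2 == 0) {
                it.remove();
            }
        }

        System.out.println(objects.size()); // 5: only the odd numbers remain
    }
}
```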

synchronization, remove from collection

I am working on a website with games, and I have a map of players and their virtual tables:
private final ConcurrentMap<Player, List<Table>> tableOfPlayers = new ConcurrentHashMap<>();
and a method to remove a table:
private void removeTable(Player player, Table table) {
    if (tableOfPlayers.get(player).size() == 1) {
        tableOfPlayers.remove(player);
    } else {
        tableOfPlayers.get(player).remove(table);
    }
}
Is there any good way to solve this check-then-act idiom? As it stands, it isn't thread-safe.
I know that I can synchronize both the add and remove methods, but I am wondering if it is possible to do better. The reason I check whether the size equals 1 is that if a player has only one active table left and I remove it, I no longer need that player in my map.
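One lock-free option, assuming Java 8+, is ConcurrentHashMap.compute: it runs the remapping function atomically per key, so the size check and the removal cannot interleave with another writer on the same player, and returning null drops the mapping entirely. A minimal sketch with String stand-ins for Player and Table (this stays safe only if every writer mutates the table list inside compute):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class TableMapDemo {
    static final ConcurrentMap<String, List<String>> tablesOfPlayers = new ConcurrentHashMap<>();

    // The remapping function runs atomically for the given key
    static void removeTable(String player, String table) {
        tablesOfPlayers.compute(player, (p, tables) -> {
            if (tables == null) {
                return null;                          // unknown player, nothing to remove
            }
            tables.remove(table);
            return tables.isEmpty() ? null : tables;  // null drops empty players from the map
        });
    }

    public static void main(String[] args) {
        List<String> tables = new ArrayList<>();
        tables.add("t1");
        tablesOfPlayers.put("alice", tables);

        removeTable("alice", "t1");
        System.out.println(tablesOfPlayers.containsKey("alice")); // false
    }
}
```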
