I'm wondering why this happens and could use some help.
When I add values to my set they are added, and after 5 minutes they are removed again. That part works.
But users.size() still returns 1 even though there are no values inside.
public class Test implements Runnable {
    private ExpiringSet<User> users; // User is just my object class
    public Test() {
        users = new ExpiringSet<User>(5, TimeUnit.MINUTES); // entries expire after 5 minutes
        users.add(new User("John"));
    }
    @Override
    public void run() {
        // This always prints 1, even after waiting an hour; after a while it even printed 2,
        // although there are no values in users
        System.out.println("Size: " + users.size());
        // But this loop prints nothing after 5 minutes
        for (Iterator<User> iter = users.iterator(); iter.hasNext(); ) {
            User user = iter.next();
            System.out.println("User " + user.getName());
        }
    }
}
public class ExpiringSet<E> extends ForwardingSet<E> {
    private final Set<E> setView;
    public ExpiringSet(long duration, TimeUnit unit) {
        Cache<E, Boolean> cache = CaffeineFactory.newBuilder().expireAfterAccess(duration, unit).build();
        this.setView = Collections.newSetFromMap(cache.asMap());
    }
    @Override
    protected Set<E> delegate() {
        return this.setView;
    }
}
Basically, I just want users.size() to be 0 when there are no values in users.
It seems weird that it reports 1 when the set is empty.
By default, expired entries are removed lazily during routine maintenance, which is triggered by a few reads or a write. As a result, size() exposes the size of the underlying map, including entries that have expired but have not been evicted yet, whereas those entries are hidden when queried for and are evicted at some later point. Cache.cleanUp() can be called to run the maintenance work.
A background thread is required if you want expired entries to be removed promptly regardless of cache activity. This is done by setting a Scheduler on the builder. Java 9+ includes a built-in JVM-wide scheduling thread, so the cache can use that rather than spinning up a new one and can be configured as,
LoadingCache<Key, Graph> graphs = Caffeine.newBuilder()
    .scheduler(Scheduler.systemScheduler())
    .expireAfterWrite(10, TimeUnit.MINUTES)
    .build(key -> createExpensiveGraph(key));
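To make the lazy-removal behaviour concrete, here is a minimal toy model written with the JDK only. It is not Caffeine's implementation; the class LazyExpiringSet, its rawSize() and cleanUp() methods, and the injected clock are all hypothetical names chosen for illustration. It only shows why a raw size can still count an entry that lookups already hide.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Toy model of lazy expiration; NOT how Caffeine is implemented.
class LazyExpiringSet<E> {
    private final Map<E, Long> deadlines = new ConcurrentHashMap<>();
    private final long ttl;
    private final LongSupplier clock; // injectable so the demo needs no sleeping

    LazyExpiringSet(long ttl, LongSupplier clock) {
        this.ttl = ttl;
        this.clock = clock;
    }

    void add(E e) {
        deadlines.put(e, clock.getAsLong() + ttl);
    }

    // Lookups hide expired entries even though they are still stored.
    boolean contains(E e) {
        Long deadline = deadlines.get(e);
        return deadline != null && deadline - clock.getAsLong() > 0;
    }

    // Raw size of the backing map: may still count expired entries.
    int rawSize() {
        return deadlines.size();
    }

    // Explicit maintenance pass that drops everything already expired.
    void cleanUp() {
        long now = clock.getAsLong();
        deadlines.values().removeIf(deadline -> deadline - now <= 0);
    }
}

class LazyExpiryDemo {
    public static void main(String[] args) {
        long[] now = {0};
        LazyExpiringSet<String> users = new LazyExpiringSet<>(5, () -> now[0]);
        users.add("John");
        now[0] = 10; // time passes beyond the TTL
        System.out.println(users.contains("John")); // false
        System.out.println(users.rawSize());        // 1
        users.cleanUp();
        System.out.println(users.rawSize());        // 0
    }
}
```

With the real cache, the analogous step would be calling Cache.cleanUp() before reading the size, or configuring a scheduler so that maintenance runs without cache activity.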
My requirement is to find all the objects that match given criteria.
So I created a custom predicate and passed it the matching condition and the source IDs that should be matched against the objects. If the condition matches, the predicate internally populates a map to hold the details.
When the stream is completed, this map should hold the details of all matches.
I use this map for later computation.
Most of the time it works, but it fails randomly.
My custom predicate is:
public class IdentifierMapPredicate<T extends Identifiable, V> implements Predicate<T> {
    private static Logger logger = Logger
    private Collection<V> sourceIds;
    private BiPredicate<T, V> matchingCondition;
    private Map<V, Collection<Identifier>> identifierMap;
    public Map<V, Collection<Identifier>> getIdentifierMap() {
        return identifierMap;
    }
    public IdentifierMapPredicate(Collection<V> sourceIds,
            BiPredicate<T, V> matchingCondition) {
        this.sourceIds = sourceIds;
        this.matchingCondition = matchingCondition;
        identifierMap = new HashMap<>(sourceIds.size());
    }
    @Override
    public boolean test(T t) {
        for (V id : sourceIds) {
            if (matchingCondition.test(t, id)) {
                //logger.debug("t :{}, id :{}", t.getIdentifier(), id);
                Collection<Identifier> ids = identifierMap.get(id);
                if (ids == null) {
                    ids = new HashSet<>();
                    identifierMap.put(id, ids);
                }
                logger.debug("identifierMap :{}", identifierMap);
                return true;
            }
        }
        return false;
    }
}
I added log statements for when the criteria match and the map is updated. Sometimes the map has only 1 element (even though I am expecting 2), and sometimes it has 0.
2019-11-24 09:31:52.574 PST DEBUG ForkJoinPool.commonPool-worker-5 IdentifierMapPredicate:52 - identifierMap :{}
2019-11-24 09:31:52.574 PST DEBUG ForkJoinPool.commonPool-worker-7 IdentifierMapPredicate:52 -identifierMap :{}
Is there anything wrong with the above implementation?
Would changing it to a ConcurrentHashMap help?
Most of the time it works, but starts failing randomly
Yeah, a classic race condition.
You're using IdentifierMapPredicate in a multi-threaded context (the log shows ForkJoinPool worker threads, so the stream runs in parallel). If IdentifierMapPredicate.test is called concurrently, the threads also access identifierMap simultaneously, and HashMap is not thread-safe.
It looks like a ConcurrentHashMap will solve the issue, provided the check-then-put is made atomic as well and the value collections are themselves thread-safe. Alternatively, you could mark the test method as synchronized, but I'd assume that gives you less throughput and more locking. On the other hand it also means less headache; it's a trade-off where I'd gladly accept less headache if there aren't tight performance requirements. Just my personal preference.
By the way, nowadays you can use Map.computeIfAbsent to create new entries instead of
Collection<Identifier> ids = identifierMap.get(id);
if (ids == null) {
    ids = new HashSet<>();
    identifierMap.put(id, ids);
}
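As a minimal sketch of that suggestion, the example below combines ConcurrentHashMap.computeIfAbsent with a concurrent set as the value, so that both creating the entry and adding to it are safe from a parallel stream. The class name and the grouping by remainder are made up for illustration and stand in for the identifier map from the question.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class ComputeIfAbsentDemo {
    // Groups the numbers 0..n-1 by their remainder modulo 3, from a parallel stream.
    static Map<Integer, Set<Integer>> groupByRemainder(int n) {
        Map<Integer, Set<Integer>> groups = new ConcurrentHashMap<>();
        IntStream.range(0, n).parallel().forEach(i ->
            // computeIfAbsent creates the set atomically; the set itself is concurrent too
            groups.computeIfAbsent(i % 3, key -> ConcurrentHashMap.newKeySet()).add(i));
        return groups;
    }

    public static void main(String[] args) {
        Map<Integer, Set<Integer>> groups = groupByRemainder(3000);
        groups.forEach((k, v) -> System.out.println(k + " -> " + v.size()));
    }
}
```

Collecting with Collectors.groupingByConcurrent would be another option that avoids side effects inside the predicate altogether.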
I have some cache refresh logic and want to make sure that it is thread-safe and done the correct way.
public class Test {
    Set<Integer> cache = Sets.newConcurrentHashSet();
    public boolean contain(int num) {
        return cache.contains(num);
    }
    public void refresh() {
        cache.clear();
        cache.addAll(getNums());
    }
}
So I have a background thread refreshing the cache by periodically calling refresh, and multiple threads are calling contain at the same time. I was trying to avoid synchronized in the method signatures because refresh could take some time (imagine that getNums makes network calls and parses huge amounts of data), and contain would then be blocked.
I think this code is not good enough, because if contain is called between clear and addAll, it always returns false.
What is the best way to refresh the cache without adding significant latency to contain calls?
The best way would be to use the functional programming paradigm, whereby you have immutable state (in this case a Set): instead of adding elements to and removing elements from that set, you create an entirely new Set every time you want to change it. Java 9 supports this with immutable collection factories such as Set.of.
It can be awkward or infeasible to apply this approach to legacy code, however. Instead, you could keep the set that contain reads in a volatile field and assign it a freshly built instance in the refresh method.
public class Test {
    volatile Set<Integer> cache = new HashSet<>();
    public boolean contain(int num) {
        return cache.contains(num);
    }
    public void refresh() {
        Set<Integer> privateCache = new HashSet<>();
        // populate privateCache here, before publishing it
        cache = privateCache;
    }
}
Edit: We don't want or need a ConcurrentHashSet; that is for adding and removing elements in a collection concurrently, which in my opinion is a pretty useless thing to do here. What you want is to swap the old Set for a new one, which is why a volatile variable is all you need: it guarantees that readers see either the old set or the fully built new one.
As I mentioned at the start of my answer, if you never modify collections but instead make new ones each time you want to update them, you never need to worry about concurrent modification, as there is no shared mutable state between threads. (With persistent collection implementations this is also cheap, because the new collection reuses most of the old one internally.)
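A minimal sketch of this swap, assuming the fresh values are handed to refresh as a parameter (in the question they would come from the slow getNums() call). The class name SnapshotCache is hypothetical.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SnapshotCache {
    // Readers see either the old or the new complete set, never a half-filled one.
    private volatile Set<Integer> cache = Collections.emptySet();

    boolean contain(int num) {
        return cache.contains(num); // a single volatile read, no locking
    }

    // freshValues stands in for the result of the slow getNums() call (an assumption).
    void refresh(Collection<Integer> freshValues) {
        // build the new set privately, then publish it with one volatile write
        Set<Integer> next = Collections.unmodifiableSet(new HashSet<>(freshValues));
        cache = next;
    }

    public static void main(String[] args) {
        SnapshotCache c = new SnapshotCache();
        c.refresh(Arrays.asList(1, 2, 3));
        System.out.println(c.contain(2)); // true
        c.refresh(Arrays.asList(4));
        System.out.println(c.contain(2)); // false
    }
}
```

Because the published set is never mutated after the assignment, a plain HashSet wrapped as unmodifiable is enough; no concurrent collection is needed.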
How would you make sure your cache doesn't contain invalid entries when contains is called? Furthermore, you'd need to call refresh every time the result of getNums() changes, which is pretty inefficient. It would be best to control the changes to getNums() yourself and update the cache accordingly. The cache might look like:
public class MyCache {
    // a ConcurrentHashMap, to be able to use putIfAbsent
    final ConcurrentHashMap<Integer, Boolean> cache = new ConcurrentHashMap<>();
    public boolean contains(Integer num) {
        return cache.containsKey(num); // ConcurrentHashMap.contains(Object) would test values, not keys
    }
    public void add(Integer num) {
        cache.putIfAbsent(num, true);
    }
    public void clear() {
        cache.clear();
    }
    public void remove(Integer num) {
        cache.remove(num);
    }
}
As @schmosel made me realize, my effort was wasted: it is in fact enough to initialize a completely new HashSet<> with your values in the refresh method, assuming of course that the cache field is marked volatile. In short, @Snickers3192's answer is what you are looking for.
Old answer
You can also use a slightly different system.
Keep two Set<Integer>, one of which will always be empty. When you refresh the cache, you can asynchronously re-initialize the second one and then just switch the pointers. Other threads accessing the cache won't see any particular overhead in this.
From an external point of view, they will always be accessing the same cache.
private volatile int currentCache; // 0 or 1
private final Set<Integer> caches[] = new HashSet[2]; // use two caches; either one will always be empty, so not much memory consumed
private volatile Set<Integer> cachePointer = null; // just a pointer to the current cache, must be volatile
// initialize
this.caches[0] = new HashSet<>(0);
this.caches[1] = new HashSet<>(0);
this.currentCache = 0;
this.cachePointer = caches[this.currentCache]; // point to cache one from the beginning
Your refresh method may look like this:
public void refresh() {
    // store current cache pointer
    final int previousCache = this.currentCache;
    final int nextCache = getNextPointer();
    // you can easily compute it asynchronously
    // in the meantime, external threads will still access the normal cache
    CompletableFuture.runAsync( () -> {
        // fill the unused cache
        caches[nextCache].addAll(getNums());
        // then switch the pointer to the just-filled cache
        // from this point on, threads are accessing the new cache
        switchCachePointer();
        // empty the other cache still on the async thread
        caches[previousCache].clear();
    });
}
where the utility methods are:
public boolean contains(final int num) {
    return this.cachePointer.contains(num);
}
private int getNextPointer() {
    return ( this.currentCache + 1 ) % this.caches.length;
}
private void switchCachePointer() {
    // make cachePointer point to a new cache
    this.currentCache = this.getNextPointer();
    this.cachePointer = caches[this.currentCache];
}
I'm working with Akka (version 2.4.17) to build an observation Flow in Java (let's say of elements of type <T> to stay generic).
My requirement is that this Flow should be customizable to deliver a maximum number of observations per unit of time as soon as they arrive. For instance, it should be able to deliver at most 2 observations per minute (the first that arrive, the rest can be dropped).
I looked very closely at the Akka documentation, and in particular at this page, which details the built-in stages and their semantics.
So far, I tried the following approaches.
With throttle and shaping() mode (to not close the stream when the limit is exceeded):
new FiniteDuration(1, TimeUnit.MINUTES),
With groupedWithin and an intermediary custom method:
final int nbObsMax = 2;
.groupedWithin(Integer.MAX_VALUE, new FiniteDuration(1, TimeUnit.MINUTES))
.map(list -> {
    List<T> listToTransfer = new ArrayList<>();
    for (int i = list.size()-nbObsMax ; i>0 && i<list.size() ; i++) {
        listToTransfer.add(new T(list.get(i)));
    }
    return listToTransfer;
})
.mapConcat(elem -> elem) // Splitting List<T> in a Flow of T objects
Both approaches give me the correct number of observations per unit of time, but the observations are retained and only delivered at the end of the time window (so there is an additional delay).
To give a more concrete example, if the following observations arrive in my Flow:
[Obs1 t=0s] [Obs2 t=45s] [Obs3 t=47s] [Obs4 t=121s] [Obs5 t=122s]
It should only output the following ones as soon as they arrive (processing time can be neglected here):
Window 1: [Obs1 t~0s] [Obs2 t~45s]
Window 2: [Obs4 t~121s] [Obs5 t~122s]
Any help will be appreciated, thanks for reading my first StackOverflow post ;)
I cannot think of a solution out of the box that does what you want. Throttle will emit in a steady stream because of how it is implemented with the bucket model, rather than having a permitted lease at the start of every time period.
To get the exact behavior you are after you would have to create your own custom rate-limit stage (which might not be that hard). You can find the docs on how to create custom stages here: http://doc.akka.io/docs/akka/2.5.0/java/stream/stream-customize.html#custom-linear-processing-stages-using-graphstage
One design that could work is an allowance counter that says how many elements may still be emitted and that you reset every interval. For every incoming element, you subtract one from the counter and emit; when the allowance is used up, you keep pulling from upstream but discard the elements rather than emitting them. Using TimerGraphStageLogic as the GraphStageLogic allows you to set a timed callback that resets the allowance.
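The allowance idea can be sketched without Akka as a plain Java class. This is only a model of the counting logic, not a GraphStage: the class AllowanceGate and its methods are hypothetical, arrival times are passed in explicitly instead of using a timer, and it assumes that a window starts with the first element that arrives after the previous window has ended.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the allowance counter; not an Akka stage.
class AllowanceGate {
    private final int maxPerWindow;
    private final long windowMillis;
    private long windowStart;
    private int used;
    private boolean started;

    AllowanceGate(int maxPerWindow, long windowMillis) {
        this.maxPerWindow = maxPerWindow;
        this.windowMillis = windowMillis;
    }

    // Returns true if the element arriving at timeMillis may be emitted immediately.
    boolean tryAcquire(long timeMillis) {
        if (!started || timeMillis - windowStart >= windowMillis) {
            // assumption: a new window opens with the first element after the old one ended
            started = true;
            windowStart = timeMillis;
            used = 0;
        }
        if (used < maxPerWindow) {
            used++;
            return true;
        }
        return false; // allowance used up: drop the element
    }

    // Keeps only the arrival times that pass the gate.
    static List<Long> filter(long[] arrivals, int maxPerWindow, long windowMillis) {
        AllowanceGate gate = new AllowanceGate(maxPerWindow, windowMillis);
        List<Long> emitted = new ArrayList<>();
        for (long t : arrivals) {
            if (gate.tryAcquire(t)) {
                emitted.add(t);
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        // arrival times from the question, in milliseconds; at most 2 per minute
        long[] arrivals = {0, 45_000, 47_000, 121_000, 122_000};
        System.out.println(filter(arrivals, 2, 60_000)); // [0, 45000, 121000, 122000]
    }
}
```

For the arrivals in the question this lets Obs1, Obs2, Obs4 and Obs5 through as soon as they arrive and drops Obs3.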
I think this is exactly what you need: http://doc.akka.io/docs/akka/2.5.0/java/stream/stream-cookbook.html#Globally_limiting_the_rate_of_a_set_of_streams
Thanks to @johanandren's answer, I've successfully implemented a custom time-based GraphStage that meets my requirements.
I'm posting the code below in case anyone is interested:
import akka.stream.Attributes;
import akka.stream.FlowShape;
import akka.stream.Inlet;
import akka.stream.Outlet;
import akka.stream.stage.*;
import scala.concurrent.duration.FiniteDuration;
public class CustomThrottleGraphStage<A> extends GraphStage<FlowShape<A, A>> {
    private final FiniteDuration silencePeriod;
    private int nbElemsMax;
    public CustomThrottleGraphStage(int nbElemsMax, FiniteDuration silencePeriod) {
        this.silencePeriod = silencePeriod;
        this.nbElemsMax = nbElemsMax;
    }
    public final Inlet<A> in = Inlet.create("TimedGate.in");
    public final Outlet<A> out = Outlet.create("TimedGate.out");
    private final FlowShape<A, A> shape = FlowShape.of(in, out);
    @Override
    public FlowShape<A, A> shape() {
        return shape;
    }
    @Override
    public GraphStageLogic createLogic(Attributes inheritedAttributes) {
        return new TimerGraphStageLogic(shape) {
            private boolean open = false;
            private int countElements = 0;
            {
                setHandler(in, new AbstractInHandler() {
                    @Override
                    public void onPush() throws Exception {
                        A elem = grab(in);
                        if (open || countElements >= nbElemsMax) {
                            pull(in); // we drop all incoming observations since the rate limit has been reached
                        }
                        else {
                            if (countElements == 0) { // we schedule the next instant to reset the observation counter
                                scheduleOnce("resetCounter", silencePeriod);
                            }
                            push(out, elem); // we forward the incoming observation
                            countElements += 1; // we increment the counter
                        }
                    }
                });
                setHandler(out, new AbstractOutHandler() {
                    @Override
                    public void onPull() throws Exception {
                        pull(in); // downstream demand is passed on to upstream
                    }
                });
            }
            @Override
            public void onTimer(Object key) {
                if (key.equals("resetCounter")) {
                    open = false;
                    countElements = 0;
                }
            }
        };
    }
}
I have a stream of objects which I would like to collect the following way.
Let's say we are handling forum posts:
class Post {
    private Date time;
    private Data data;
}
I want to create a list which groups posts by a period. If there were no posts for X minutes, create a new group.
class PostsGroup {
    List<Post> posts = new ArrayList<>();
}
I want to get a List<PostsGroup> containing the posts grouped by the interval.
Example: interval of 10 minutes.
[{time:x, data:{}}, {time:x + 3, data:{}}, {time:x + 12, data:{}}, {time:x + 45, data:{}}]
I want to get a list of post groups:
[{posts : [{time:x, data:{}}, {time:x + 3, data:{}}, {time:x + 12, data:{}}]},
{posts : [{time:x + 45, data:{}}]}]
Notice that the first group lasted until x + 22. Then a new post was received at x + 45.
Is this possible?
This problem could be easily solved using the groupRuns method of my StreamEx library:
long MAX_INTERVAL = TimeUnit.MINUTES.toMillis(10);
.groupRuns((p1, p2) -> p2.time.getTime() - p1.time.getTime() <= MAX_INTERVAL)
I assume that you have a constructor
class PostsGroup {
    private List<Post> posts;
    public PostsGroup(List<Post> posts) {
        this.posts = posts;
    }
}
The StreamEx.groupRuns method takes a BiPredicate which is applied to two adjacent input elements and returns true if they must be grouped together. This method creates the stream of lists where each list represents the group. This method is lazy and works fine with parallel streams.
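For readers who cannot add the library, the same adjacent-pair grouping can be sketched with the JDK alone. This is not StreamEx's implementation; the helper below is a hypothetical, eager, sequential version that works on a list already sorted by time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class GroupRuns {
    // Splits a list into runs of adjacent elements for which sameGroup(previous, current) holds.
    static <T> List<List<T>> groupRuns(List<T> items, BiPredicate<T, T> sameGroup) {
        List<List<T>> groups = new ArrayList<>();
        List<T> current = null;
        T previous = null;
        for (T item : items) {
            if (current == null || !sameGroup.test(previous, item)) {
                // first element, or the gap to the previous element is too large
                current = new ArrayList<>();
                groups.add(current);
            }
            current.add(item);
            previous = item;
        }
        return groups;
    }

    public static void main(String[] args) {
        // minutes x, x + 3, x + 12, x + 45 from the question, with x = 0 and a 10 minute gap
        List<Long> minutes = Arrays.asList(0L, 3L, 12L, 45L);
        System.out.println(groupRuns(minutes, (a, b) -> b - a <= 10)); // [[0, 3, 12], [45]]
    }
}
```

Each inner list could then be passed to the PostsGroup constructor assumed above.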
You need to retain state between stream entries and write yourself a grouping classifier. Something like this would be a good start.
class Post {
    private final long time;
    private final String data;
    public Post(long time, String data) {
        this.time = time;
        this.data = data;
    }
    @Override
    public String toString() {
        return "Post{" + "time=" + time + ", data=" + data + '}';
    }
}
public void test() {
    long t = 0;
    List<Post> posts = Arrays.asList(
        new Post(t, "One"),
        new Post(t + 1000, "Two"),
        new Post(t + 10000, "Three")
    );
    // Start a new group whenever more than 5 seconds pass between posts.
    Map<Long, List<Post>> grouped = posts
        .stream()
        .collect(Collectors.groupingBy(new ClassifyByTimeBetween(5000)));
    grouped.entrySet().stream().forEach((e) -> {
        System.out.println(e.getKey() + " -> " + e.getValue());
    });
}
class ClassifyByTimeBetween implements Function<Post, Long> {
    final long delay;
    long currentGroupBy = -1;
    long lastDateSeen = -1;
    public ClassifyByTimeBetween(long delay) {
        this.delay = delay;
    }
    @Override
    public Long apply(Post p) {
        if (lastDateSeen >= 0) {
            if (p.time > lastDateSeen + delay) {
                // Grab this one.
                currentGroupBy = p.time;
            }
        } else {
            // First time - start there.
            currentGroupBy = p.time;
        }
        lastDateSeen = p.time;
        return currentGroupBy;
    }
}
Since no one has provided a solution with a custom collector as it was required in the original problem statement, here is a collector-implementation that groups Post objects based on the provided time-interval.
Date class mentioned in the question is obsolete since Java 8 and not recommended to be used in new projects. Hence, LocalDateTime will be utilized instead.
Post & PostGroup
For testing purposes, I've used Post implemented as a Java 16 record (if you substitute it with a class, the overall solution will be fully compliant with Java 8):
public record Post(LocalDateTime dateTime) {}
Also, I've enhanced the PostsGroup object. My idea is that it should be capable of deciding whether an offered Post should be added to its list of posts or rejected, as the Information Expert principle suggests (in short: all manipulations of the data should happen only inside the class to which that data belongs).
To facilitate this, two extra fields were added: interval, of type Duration from the java.time package, which represents the maximum interval between the earliest and the latest post in a group, and intervalBound, of type LocalDateTime, which is initialized when the first post is added and is later used internally by the method isWithinInterval() to check whether an offered post fits into the interval.
public class PostsGroup {
    private Duration interval;
    private LocalDateTime intervalBound;
    private List<Post> posts = new ArrayList<>();
    public PostsGroup(Duration interval) {
        this.interval = interval;
    }
    public boolean tryAdd(Post post) {
        if (posts.isEmpty()) {
            intervalBound = post.dateTime().plus(interval);
            return posts.add(post);
        } else if (isWithinInterval(post)) {
            return posts.add(post);
        }
        return false;
    }
    public boolean isWithinInterval(Post post) {
        return post.dateTime().isBefore(intervalBound);
    }
    @Override
    public String toString() {
        return "PostsGroup{" + posts + '}';
    }
}
I'm making two assumptions:
All posts in the source are sorted by time (if it is not the case, you should introduce sorted() operation in the pipeline before collecting the results);
Posts need to be collected into the minimum number of groups, as a consequence of this it's not possible to split this task and execute stream in parallel.
Building a Custom Collector
We can create a custom collector either inline by using one of the versions of the static method Collector.of() or by defining a class that implements the Collector interface.
These parameters have to be provided while creating a custom collector:
Supplier: Supplier<A> is meant to provide a mutable container that stores the elements of the stream. In this case, ArrayDeque (as an implementation of the Deque interface) is handy as a container because it gives convenient access to the most recently added element, i.e. the latest PostsGroup.
Accumulator: BiConsumer<A,T> defines how to add elements to the container provided by the supplier. For this task, it holds the logic that determines whether the next element from the stream (i.e. the next Post) should go into the last PostsGroup in the Deque or whether a new PostsGroup needs to be allocated for it.
Combiner: BinaryOperator<A> establishes a rule for merging two containers obtained while executing the stream in parallel. Since this operation is treated as not parallelizable, the combiner is implemented to throw an AssertionError in case of parallel execution.
Finisher: Function<A,R> is meant to produce the final result by transforming the mutable container. The finisher function in the code below turns the container, a deque containing the result, into an immutable list.
Note: the Java 16 method toList() is used inside the finisher function. On Java 10+ it can be replaced with collect(Collectors.toUnmodifiableList()), and on Java 8 with collect(Collectors.toList()).
Characteristics: these allow providing additional information. For instance, Collector.Characteristics.UNORDERED denotes that the order in which partial results of the reduction are produced while executing in parallel is not significant. In this case, the collector doesn't require any characteristics.
The method below is responsible for generating the collector based on the provided interval.
public static Collector<Post, ?, List<PostsGroup>> groupPostsByInterval(Duration interval) {
    return Collector.of(
        ArrayDeque::new,
        (Deque<PostsGroup> deque, Post post) -> {
            // if no groups have been created yet or if adding the post into the most recent group fails
            if (deque.isEmpty() || !deque.getLast().tryAdd(post)) {
                PostsGroup postsGroup = new PostsGroup(interval);
                postsGroup.tryAdd(post);
                deque.addLast(postsGroup);
            }
        },
        (Deque<PostsGroup> left, Deque<PostsGroup> right) -> { throw new AssertionError("should not be used in parallel"); },
        (Deque<PostsGroup> deque) -> deque.stream().toList());
}
main() - demo
public static void main(String[] args) {
    List<Post> posts =
        List.of(new Post(LocalDateTime.of(2022,4,28,15,0)),
                new Post(LocalDateTime.of(2022,4,28,15,3)),
                new Post(LocalDateTime.of(2022,4,28,15,5)),
                new Post(LocalDateTime.of(2022,4,28,15,8)),
                new Post(LocalDateTime.of(2022,4,28,15,12)),
                new Post(LocalDateTime.of(2022,4,28,15,15)),
                new Post(LocalDateTime.of(2022,4,28,15,18)),
                new Post(LocalDateTime.of(2022,4,28,15,27)),
                new Post(LocalDateTime.of(2022,4,28,15,48)),
                new Post(LocalDateTime.of(2022,4,28,15,54)));
    Duration interval = Duration.ofMinutes(10);
    List<PostsGroup> postsGroups = posts.stream()
        .collect(groupPostsByInterval(interval));
    postsGroups.forEach(System.out::println);
}
Output:
PostsGroup{[Post[dateTime=2022-04-28T15:00], Post[dateTime=2022-04-28T15:03], Post[dateTime=2022-04-28T15:05], Post[dateTime=2022-04-28T15:08]]}
PostsGroup{[Post[dateTime=2022-04-28T15:12], Post[dateTime=2022-04-28T15:15], Post[dateTime=2022-04-28T15:18]]}
PostsGroup{[Post[dateTime=2022-04-28T15:27]]}
PostsGroup{[Post[dateTime=2022-04-28T15:48], Post[dateTime=2022-04-28T15:54]]}
My problem
Let's say I want to hold my messages in some sort of datastructure for longpolling application:
1. "dude"
2. "where"
3. "is"
4. "my"
5. "car"
Asking for messages from index[4,5] should return:
Next let's assume that after a while I would like to purge old messages because they aren't useful anymore and I want to save memory. Let's say after time x messages[1-3] became stale. I assume that it would be most efficient to just do the deletion once every x seconds. Next my datastructure should contain:
4. "my"
5. "car"
My solution?
I was thinking of using a ConcurrentSkipListSet or a ConcurrentSkipListMap, and of deleting the old messages from a task scheduled on a newSingleThreadScheduledExecutor. I would like to know how you would implement this efficiently and thread-safely, or whether you would use a library.
The big concern, as I gather it, is how to let certain elements expire after a period. I had a similar requirement and I created a message class that implemented the Delayed Interface. This class held everything I needed for a message and (through the Delayed interface) told me when it has expired.
I used instances of this object within a concurrent collection, you could use a ConcurrentMap because it will allow you to key those objects with an integer key.
I reaped the collection once every so often, removing items whose delay has passed. We test for expiration by using the getDelay method of the Delayed interface:
I used a normal thread that would sleep for a period then reap the expired items. In my requirements it wasn't important that the items be removed as soon as their delay had expired. It seems that you have a similar flexibility.
If you needed to remove items as soon as their delay expired, then instead of sleeping a set period in your reaping thread, you would sleep for the delay of the message that will expire first.
Here's my delayed message class:
class DelayedMessage implements Delayed {
    long endOfDelay;
    Date requestTime;
    String message;
    public DelayedMessage(String m, int delay) {
        requestTime = new Date();
        endOfDelay = System.currentTimeMillis()
                + delay;
        this.message = m;
    }
    @Override
    public long getDelay(TimeUnit unit) {
        // the remaining time is computed in milliseconds, then converted to the requested unit
        long delay = unit.convert(
                endOfDelay - System.currentTimeMillis(),
                TimeUnit.MILLISECONDS);
        return delay;
    }
    @Override
    public int compareTo(Delayed o) {
        DelayedMessage that = (DelayedMessage) o;
        if (this.endOfDelay < that.endOfDelay) {
            return -1;
        }
        if (this.endOfDelay > that.endOfDelay) {
            return 1;
        }
        return this.requestTime.compareTo(that.requestTime);
    }
    @Override
    public String toString() {
        return message;
    }
}
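One JDK option for the reaping step described above is java.util.concurrent.DelayQueue, which only hands out elements whose delay has elapsed. The sketch below is an alternative to the map-plus-sleeping-thread setup from this answer, not a reproduction of it; ExpiringMessage and Reaper are hypothetical names, and the message class is redefined here so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

class ExpiringMessage implements Delayed {
    final String text;
    private final long expiresAtNanos;

    ExpiringMessage(String text, long ttlMillis) {
        this.text = text;
        this.expiresAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(ttlMillis);
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(expiresAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
    }
}

class Reaper {
    // Removes and returns the text of every message whose delay has already elapsed.
    static List<String> reapExpired(DelayQueue<ExpiringMessage> queue) {
        List<ExpiringMessage> expired = new ArrayList<>();
        queue.drainTo(expired); // DelayQueue.drainTo only removes expired elements
        List<String> texts = new ArrayList<>();
        for (ExpiringMessage m : expired) {
            texts.add(m.text);
        }
        return texts;
    }

    public static void main(String[] args) {
        DelayQueue<ExpiringMessage> queue = new DelayQueue<>();
        queue.add(new ExpiringMessage("stale", 0));                        // expires immediately
        queue.add(new ExpiringMessage("fresh", TimeUnit.HOURS.toMillis(1)));
        System.out.println(reapExpired(queue)); // [stale]
        System.out.println(queue.size());       // 1
    }
}
```

A reaping thread that must remove items as soon as they expire could block on queue.take() instead of sleeping for a fixed period.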
I'm not sure if this is what you want, but it looks like you need a NavigableMap<K,V> to me.
import java.util.*;
public class NaviMap {
    public static void main(String[] args) {
        NavigableMap<Integer,String> nmap = new TreeMap<Integer,String>();
        nmap.put(1, "dude");
        nmap.put(2, "where");
        nmap.put(3, "is");
        nmap.put(4, "my");
        nmap.put(5, "car");
        System.out.println(nmap);
        // prints "{1=dude, 2=where, 3=is, 4=my, 5=car}"
        System.out.println(nmap.subMap(4, true, 5, true).values());
        // prints "[my, car]"               ^inclusive^
        nmap.subMap(1, true, 3, true).clear();
        System.out.println(nmap);
        // prints "{4=my, 5=car}"
        // wrap into synchronized SortedMap
        SortedMap<Integer,String> ssmap = Collections.synchronizedSortedMap(nmap);
        System.out.println(ssmap.subMap(4, 5));
        // prints "{4=my}"                 ^exclusive upper bound!
        System.out.println(ssmap.subMap(4, 5+1));
        // prints "{4=my, 5=car}"          ^ugly but "works"
    }
}
Unfortunately, before Java 8 there was no easy way to get a synchronized version of a NavigableMap<K,V> (Java 8 added Collections.synchronizedNavigableMap). A SortedMap does have a subMap method, but only one overload, in which the upper bound is strictly exclusive.
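If a thread-safe navigable map is wanted without external synchronization, the ConcurrentSkipListMap mentioned in the question is one option, since it implements ConcurrentNavigableMap and offers the inclusive subMap overload. A minimal sketch, with a hypothetical MessageLog wrapper around it:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

class MessageLog {
    private final ConcurrentNavigableMap<Integer, String> messages = new ConcurrentSkipListMap<>();

    void put(int index, String message) {
        messages.put(index, message);
    }

    // Messages with an index in [from, to], both bounds inclusive.
    List<String> range(int from, int to) {
        return new ArrayList<>(messages.subMap(from, true, to, true).values());
    }

    // Drops every message with an index below firstKept.
    void purgeBefore(int firstKept) {
        messages.headMap(firstKept, false).clear();
    }

    int size() {
        return messages.size();
    }

    public static void main(String[] args) {
        MessageLog log = new MessageLog();
        String[] words = {"dude", "where", "is", "my", "car"};
        for (int i = 0; i < words.length; i++) {
            log.put(i + 1, words[i]);
        }
        System.out.println(log.range(4, 5)); // [my, car]
        log.purgeBefore(4);
        System.out.println(log.size());      // 2
    }
}
```

The purge could be run periodically from a single-threaded scheduled executor, as the question proposes, while readers keep calling range concurrently.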