Pre-load values for a Guava Cache

I have a requirement where we are loading static data from a database for use in a Java application. Any caching mechanism should have the following functionality:
Load all static data from the database (once loaded, this data will not change)
Load new data from the database (data present in the database at start-up will not change but it is possible to add new data)
Lazy loading of all the data isn't an option as the application will be deployed to multiple geographical locations and will have to communicate with a single database. Lazy loading the data will make the first request for a specific element too slow where the application is in a different region to the database.
I have been using the MapMaker API in Guava with success but we are now upgrading to the latest release and I can't seem to find the same functionality in the CacheBuilder API; I can't seem to find a clean way of loading all data at start-up.
One way would be to load all keys from the database and load those through the Cache individually. This would work but would result in N+1 calls to the database, which isn't quite the efficient solution I'm looking for.
public void loadData() {
    List<String> keys = getAllKeys();
    for (String s : keys) {
        cache.get(s); // one database call per key
    }
}
The other solution is to use a ConcurrentHashMap implementation and handle all of the threads and missing entries myself. I'm not keen on doing this, as the MapMaker and CacheBuilder APIs provide the key-based thread locking for free without my having to provide extra testing. I'm also pretty sure the MapMaker/CacheBuilder implementations will have some efficiencies that I don't know about/haven't got time to investigate.
public Element get(String key) {
    Lock lock = getObjectLock(key);
    lock.lock();
    try {
        Element ret = map.get(key);
        if (ret == null) {
            ret = getElement(key); // database call
            map.put(key, ret);
        }
        return ret;
    } finally {
        lock.unlock();
    }
}
Can anyone think of a better solution to my two requirements?
Feature Request
I don't think pre-loading a cache is an uncommon requirement, so it would be nice if the CacheBuilder provided a configuration option to pre-load the cache. I think providing an Interface (much like CacheLoader) which will populate the cache at start-up would be an ideal solution, such as:
CacheBuilder.newBuilder().populate(new CachePopulator<String, Element>() {
    @Override
    public Map<String, Element> populate() throws Exception {
        return getAllElements();
    }
}).build(new CacheLoader<String, Element>() {
    @Override
    public Element load(String key) throws Exception {
        return getElement(key);
    }
});
This implementation would allow the Cache to be pre-populated with all relevant Element objects, whilst keeping the underlying CustomConcurrentHashMap non-visible to the outside world.

In the short-term I would just use Cache.asMap().putAll(Map<K, V>).
Once Guava 11.0 is released you can use Cache.getAll(Iterable<K>), which will issue a single bulk request for all absent elements.
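As a rough illustration of the bulk pre-load idea using only the JDK (this is not Guava's API), the sketch below fills a ConcurrentHashMap with one bulk call at start-up and falls back to a per-key load for entries added later. PreloadedCache, bulkLoader and singleLoader are hypothetical names; the two loaders stand in for the getAllElements() and getElement(key) database calls from the question.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch: bulk pre-load at start-up, per-key load for later additions.
class PreloadedCache<K, V> {
    private final ConcurrentHashMap<K, V> map = new ConcurrentHashMap<>();
    private final Function<K, V> singleLoader; // stands in for getElement(key)

    PreloadedCache(Supplier<Map<K, V>> bulkLoader, Function<K, V> singleLoader) {
        this.singleLoader = singleLoader;
        // One bulk call instead of N+1 individual calls.
        map.putAll(bulkLoader.get());
    }

    V get(K key) {
        // computeIfAbsent applies the loader at most once per absent key,
        // even when several threads ask for the same key concurrently.
        return map.computeIfAbsent(key, singleLoader);
    }

    int size() {
        return map.size();
    }
}
```

Unlike a Guava cache, this sketch never evicts anything, which matches the "static data" requirement but would not suit a bounded cache.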

I'd load all static data from the DB, and store it in the Cache using cache.asMap().put(key, value) (Guava 10.0.1 allows write operations on the Cache.asMap() view).
Of course, this static data might get evicted, if your cache is configured to evict entries...
The CachePopulator idea is interesting.

Related

Spring - Storing volatile data in memory

I'm developing a SpringBoot web application for managing gaming servers.
I want to have a cronjob that queries the servers, checks whether they have crashed and collects relevant data, such as the number of players online etc. This data needs to be stored and shared among services that require it. Since this data will change often and will become invalid after the whole application stops, I don't want to persist these stats in the database, but in the application memory.
Current implementation
Currently, my implementation is pretty naive - having a collection as a member field of the corresponding Spring service and storing the server statuses there. However I feel this is a really bad solution, as the services should be stateless and also I don't take concurrency into account.
Example code:
@Service
public class ServersServiceImpl implements ServersService {
    private final Map<Long, ServerStats> stats = new HashMap<>(); // Map server ID -> stats
    ...
    public void startServer(Long id) {
        // ... call service to actually start server process
        serverStats.setRunning(true);
        stats.put(id, serverStats);
    }
    ...
}
Alternative: Using @Repository classes
I could move the collection with the data to classes with the @Repository annotation, which would be semantically more correct. There, I would implement thread-safe logic for storing the data in a Java collection. Then I would inject this repository into the relevant services.
@Repository
public class ServerStatsRepository {
    private final Map<Long, ServerStats> stats = new ConcurrentHashMap<>();
    ...
    public ServerStats getServerStats(Long id) {
        return stats.get(id);
    }
    public ServerStats updateServerStats(Long id, ServerStats serverStats) {
        return stats.put(id, serverStats);
    }
    ...
}
Using Redis also came to mind, but I don't want to add too much complexity to the app.
Is my proposed solution a valid approach? Would there be any better option of handling this problem?
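One way the repository variant could be sketched, using only the JDK: keep immutable snapshots in a ConcurrentHashMap and use compute for read-modify-write updates, so readers never observe a half-updated object. The ServerStats fields and the ServerStatsStore name are assumptions made for illustration; in a Spring application the store would carry the @Repository (or @Component) annotation.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable snapshot of one server's state.
final class ServerStats {
    final boolean running;
    final int playersOnline;

    ServerStats(boolean running, int playersOnline) {
        this.running = running;
        this.playersOnline = playersOnline;
    }
}

// Hypothetical in-memory store; would be a Spring bean in the real application.
class ServerStatsStore {
    private final ConcurrentMap<Long, ServerStats> stats = new ConcurrentHashMap<>();

    Optional<ServerStats> find(Long id) {
        return Optional.ofNullable(stats.get(id));
    }

    // Replaces the whole snapshot in one step.
    void save(Long id, ServerStats snapshot) {
        stats.put(id, snapshot);
    }

    // Atomic read-modify-write for a single key; keeps the last player count.
    ServerStats markRunning(Long id, boolean running) {
        return stats.compute(id, (key, old) ->
                new ServerStats(running, old == null ? 0 : old.playersOnline));
    }
}
```

Because each snapshot is immutable, the cron job and the services can share the store without extra locking.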

is there a Cacheable in C# similar to Java?

In Java Spring Boot, I can easily enable caching using the annotation @EnableCaching and make methods cache their results using @Cacheable. This way, any call to my method with the exact same parameters will NOT invoke the method, but return immediately with the cached result.
Is there something similar in C#?
What I did in the past was implement my own caching class and my own data structures, which is a big hassle. I just want an easy way for the program to cache the result and return the exact result if the input parameters are the same.
EDIT: I don't want to use any third-party stuff, so no Memcached, no Redis, no RabbitMQ, etc. Just looking for a very simple and elegant solution like Java's @Cacheable.

Caches
A cache is a relatively small area of memory that can be accessed very quickly. It stores information that is likely to be used again. For example, web browsers typically use a cache to make web pages load faster by storing a copy of the page files locally on your computer.
Caching
Caching is the process of storing data in a cache. Caching in C# is straightforward: System.Runtime.Caching.dll provides the classes for working with caches. This illustration uses the following classes:
ObjectCache: the abstract base type in System.Runtime.Caching that defines the methods and properties for accessing an object cache. Individual entries are represented by the CacheItem class, which can include regions via the RegionName property.
MemoryCache: also in System.Runtime.Caching; the type that implements an in-memory cache.
CacheItemPolicy: represents a set of eviction and expiration details for a specific cache entry.
.NET provides
System.Web.Caching.Cache - the default caching mechanism in ASP.NET. You can get an instance of this class via the property Controller.HttpContext.Cache, or via the singleton HttpContext.Current.Cache. This class is not meant to be created explicitly because under the hood it uses another caching engine that is assigned internally. The simplest way to make your code work is the following:
public class DataController : System.Web.Mvc.Controller {
    public System.Web.Mvc.ActionResult Index() {
        List<object> list = new List<Object>();
        HttpContext.Cache["ObjectList"] = list; // add
        list = (List<object>)HttpContext.Cache["ObjectList"]; // retrieve
        HttpContext.Cache.Remove("ObjectList"); // remove
        return new System.Web.Mvc.EmptyResult();
    }
}
System.Runtime.Caching.MemoryCache - this class can be constructed in user code. It has a different interface and more features, such as update/remove callbacks, regions and monitors. To use it you need to reference the System.Runtime.Caching library. It can also be used in an ASP.NET application, but you will have to manage its lifetime yourself.
var cache = new System.Runtime.Caching.MemoryCache("MyTestCache");
cache["ObjectList"] = list; // add
list = (List<object>)cache["ObjectList"]; // retrieve
cache.Remove("ObjectList"); // remove

You can write a decorator with a get-or-create functionality. First, try to get value from cache, if it doesn't exist, calculate it and store in cache:
public static class CacheExtensions
{
    public static async Task<T> GetOrSetValueAsync<T>(this ICacheClient cache, string key, Func<Task<T>> function)
        where T : class
    {
        // try to get value from cache
        var result = await cache.JsonGet<T>(key);
        if (result != null)
        {
            return result;
        }
        // cache miss, run function and store result in cache
        result = await function();
        await cache.JsonSet(key, result);
        return result;
    }
}
ICacheClient is the interface you're extending. Now you can use:
await _cacheClient.GetOrSetValueAsync(key, () => Task.FromResult(value));

How to populate entries into Loading Cache guava?

I have a use case where I want to populate entries into a data structure from multiple threads and after a particular size is reached start dropping old records. So I decided to use Guava Loading Cache for this.
I want to populate entries into my Loading Cache from multiple threads, and I am using size-based eviction as the eviction policy.
private final ScheduledExecutorService executorService = Executors
        .newSingleThreadScheduledExecutor();
private final LoadingCache<String, DataBuilder> cache =
        CacheBuilder.newBuilder().maximumSize(10000000)
                .removalListener(RemovalListeners.asynchronous(new CustomListener(), executorService))
                .build(new CacheLoader<String, DataBuilder>() {
                    @Override
                    public DataBuilder load(String key) throws Exception {
                        // what I should do here?
                        // return
                    }
                });
// this will be called from multiple threads to populate the cache
public void addToCache(String key, DataBuilder dataBuilder) {
    // what I should do here?
    //cache.get(key).
}
My addToCache method will be called from multiple threads to populate the cache. I am confused about what I should be doing inside the addToCache method to fill the cache, and also what my load method should look like.
Here DataBuilder is my builder pattern.

Obviously your problem is that you don't get the main purpose of a CacheLoader.
A CacheLoader is used to automatically load the value of a given key (which doesn't exist in the cache yet) when calling get(K key) or getUnchecked(K key), in such a way that even if several threads try to get the value of the same key at the same time, only one thread will actually load the value, and once it is done all calling threads will get the same value.
This is typically useful when the value to load takes some time, like for example when it is the result of a database access or a long computation, because the longer it takes the higher is the probability to have several threads trying to load the same value at the same time which would waste resources without a mechanism that ensures that only one thread will load the data for all calling threads.
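The "only one thread loads" behaviour described above can be observed with a small JDK-only experiment. It uses ConcurrentHashMap.computeIfAbsent as a stand-in for a loading cache (it is not Guava's implementation, but it gives the same at-most-once guarantee per key); SingleLoadDemo and its method names are made up for this sketch.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SingleLoadDemo {
    // Returns how many times the slow loader ran when `threads` threads
    // requested the same key at the same time.
    static int loadsForConcurrentGets(int threads) {
        ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
        AtomicInteger loads = new AtomicInteger();
        CountDownLatch start = new CountDownLatch(1);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                try {
                    start.await(); // line all threads up before the first get
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                cache.computeIfAbsent("key", k -> {
                    loads.incrementAndGet();
                    sleepQuietly(50); // simulate a slow database call
                    return "value for " + k;
                });
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return loads.get();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Calling SingleLoadDemo.loadsForConcurrentGets(8) reports a single load: the other threads block until the first one finishes and then reuse its value.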
So, if your DataBuilder instances take a long time to build, or you simply need to ensure that all threads get the same instance for a given key, you do need a CacheLoader, and it would look like this:
new CacheLoader<String, DataBuilder>() {
    @Override
    public DataBuilder load(String key) throws Exception {
        return callSomeMethodToBuildItFromTheKey(key); // could be new DataBuilder(key)
    }
}
Thanks to the CacheLoader, you no longer need to call put explicitly, as your cache will be populated behind the scenes by the threads calling cache.get(myKey) or cache.getUnchecked(myKey).
If you want to populate your cache manually, you can simply use the put(K key, V value) method, as with any Cache:
public void addToCache(String key, DataBuilder dataBuilder) {
cache.put(key, dataBuilder);
}
If you intend to populate the cache yourself, you don't need a CacheLoader, you can simply call build() instead of build(CacheLoader<? super K1,V1> loader) to build your Cache instance (it won't be a LoadingCache anymore).
Your code would then be:
private final Cache<String, DataBuilder> cache =
        CacheBuilder.newBuilder().maximumSize(10000000)
                .removalListener(
                        RemovalListeners.asynchronous(new CustomListener(), executorService)
                ).build();

Clean Architecture and Cache Invalidation

I have an app that tries to follow the Clean Architecture and I need to do some cache invalidation but I don't know in which layer this should be done.
For the sake of this example, let's say I have an OrderInteractor with 2 use cases : getOrderHistory() and sendOrder(Order).
The first use case uses an OrderHistoryRepository and the second one uses an OrderSenderRepository. These repositories are interfaces with multiple implementations (MockOrderHistoryRepository and InternetOrderHistoryRepository for the first one). The OrderInteractor only interacts with these repositories through the interfaces in order to hide the real implementation.
The Mock version is very dummy but the Internet version of the history repository is keeping some data in cache to perform better.
Now, I want to implement the following : when an order is sent successfully, I want to invalidate the cache of the history but I don't know where exactly I should perform the actual cache invalidation.
My first guess is to add an invalidateCache() method to the OrderHistoryRepository and call it at the end of the sendOrder() method inside the interactor. In the InternetOrderHistoryRepository, I would just have to implement the cache invalidation and I would be good. But I would be forced to implement the method inside the MockOrderHistoryRepository as well, and it exposes to the outside the fact that some cache management is performed by the repository. I think that the OrderInteractor should not be aware of this cache management because it is an implementation detail of the Internet version of the OrderHistoryRepository.
My second guess would be to perform the cache invalidation inside the InternetOrderSenderRepository when it knows that the order was sent successfully, but that would force this repository to know the InternetOrderHistoryRepository in order to get the cache key used by that repository for its cache management. And I don't want my OrderSenderRepository to have a dependency on the OrderHistoryRepository.
Finally, my third guess is to have some sort of CacheInvalidator (whatever the name) interface with a Dummy implementation used when the repository is mocked and a Real implementation when the Interactor is using the Internet repositories. This CacheInvalidator would be injected into the Interactor, and the selected implementation would be provided by a Factory that builds both the repository and the CacheInvalidator. This means that I would have a MockedOrderHistoryRepositoryFactory (building the MockedOrderHistoryRepository and the DummyCacheInvalidator) and an InternetOrderHistoryRepositoryFactory (building the InternetOrderHistoryRepository and the RealCacheInvalidator). But here again, I don't know if this CacheInvalidator should be used by the Interactor at the end of sendOrder() or directly by the InternetOrderSenderRepository (even though I think the latter is better because, again, the interactor should probably not know that there is some cache management under the hood).
What would be your preferred way of architecting this?
Thank you very much.
Pierre

Your 2nd guess is correct because caching is a detail of the persistence mechanism. E.g. if the repository would be a file based repository caching might not be an issue (e.g. a local ssd).
The interactor (use case) should not know about caching at all. This will make it easier to test because you don't need a real cache or mock for testing.
You wrote: "My second guess would be to perform the cache invalidation inside the InternetOrderSenderRepository when it knows that the order was sent successfully but it will force this repository to know the InternetOrderHistoryRepository in order to get the cache key used by this repo for the cache management."
It seems that your cache key is a composite of multiple order properties and therefore you need to encapsulate the cache key creation logic somewhere for reuse.
In this case, you have the following options:
One implementation for both interfaces
You can create a class that implements both the OrderSenderRepository and the OrderHistoryRepository interface. In this case, you can extract the cache key generation logic into a private method and reuse it.
Use a utility class for the cache key creation
Simply extract the cache key creation logic into a utility class and use it in both repositories.
Create a cache key class
A cache key is just an arbitrary object, because a cache only has to check whether a key exists, which means using the equals method that every object has. But to be more type-safe, most caches use a generic type for the key so that you can define one.
Thus you can put the cache key logic and validation in its own class. This has the advantage that you can easily test that logic.
import java.util.Objects;

public class OrderCacheKey {
    private final Integer orderId;
    private final int version;

    public OrderCacheKey(Integer orderId, int version) {
        this.orderId = Objects.requireNonNull(orderId);
        if (version < 0) {
            throw new IllegalArgumentException("version must not be negative");
        }
        this.version = version;
    }

    public OrderCacheKey(Order order) {
        this(order.getId(), order.getVersion());
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null || getClass() != obj.getClass())
            return false;
        OrderCacheKey other = (OrderCacheKey) obj;
        return orderId.equals(other.orderId) && version == other.version;
    }

    @Override
    public int hashCode() {
        return Objects.hash(orderId, version);
    }
}
You can use this class as the key type of your cache: Cache<OrderCacheKey, Order>. Then you can use the OrderCacheKey class in both repository implementations.
Introduce an order cache interface to hide caching details
You can apply the interface segregation principle and hide the complete caching details behind a simple interface. This will make your unit tests easier because you have less to mock.
public interface OrderCache {
    public void add(Order order);
    public Order get(Integer orderId, int version);
    public void remove(Order order);
    public void removeByKey(Integer orderId, int version);
}
You can then use the OrderCache in both repository implementations and you can also combine the interface segregation with the cache key class above.
How to apply
You can use aspect-oriented programming and one of the options above to implement the caching
You can create a wrapper (or delegate) for each repository that applies caching and delegates to the real repositories when needed. This is very similar to the aspect-oriented way. You just implement the aspect manually.
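A minimal sketch of the wrapper option, combined with the "one implementation for both interfaces" idea: a single decorator caches the history and invalidates it after a successful send, so the interactor never sees the cache. The interface shapes below are assumptions (orders are reduced to plain strings to keep the example self-contained), and CachingOrderRepository is a hypothetical name.

```java
import java.util.List;

// Hypothetical ports, named after the interfaces in the question.
interface OrderHistoryRepository {
    List<String> getOrderHistory();
}

interface OrderSenderRepository {
    void sendOrder(String order);
}

// Decorator around both real repositories; the cache stays an internal detail.
class CachingOrderRepository implements OrderHistoryRepository, OrderSenderRepository {
    private final OrderHistoryRepository history;
    private final OrderSenderRepository sender;
    private List<String> cachedHistory; // guarded by "this"

    CachingOrderRepository(OrderHistoryRepository history, OrderSenderRepository sender) {
        this.history = history;
        this.sender = sender;
    }

    @Override
    public synchronized List<String> getOrderHistory() {
        if (cachedHistory == null) {
            cachedHistory = history.getOrderHistory(); // cache miss: ask the real repository
        }
        return cachedHistory;
    }

    @Override
    public synchronized void sendOrder(String order) {
        sender.sendOrder(order); // an exception here leaves the cache untouched
        cachedHistory = null;    // invalidate only after a successful send
    }
}
```

The interactor keeps depending on the two interfaces only; whether it receives the decorator or a mock is decided where the objects are wired together.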

Spring cache, TTL unless service is down

I have an interesting task where I need to cache the results of my method, which is really simple with the Spring cache abstraction:
@Cacheable(...)
public String getValue(String key) {
    return restService.getValue(key);
}
restService.getValue() calls a REST service, which may or may not answer, depending on whether the endpoint is down.
I need to set a specific TTL for the cached value, let's say 5 minutes, but if the server is down I need to return the last value, even if it is older than 5 minutes.
I was thinking about having a second cacheable method which has no TTL and always returns the last value; it would be called from getValue if restService returns nothing. But maybe there is a better way?

I've been interested in doing this for a while too. Sorry to say, I have not found any trivial way of doing this. Spring will not do this for you; it's more a question of whether the cache implementation Spring is wrapping can do it. I assume you are using the EhCache implementation. Unfortunately, this functionality does not come out of the box as far as I know.
There are various ways to achieve something similar, depending on your problem:
1) Use an eternal cache time and have a second Thread class which periodically loops over the cached data, refreshing it. I have not done exactly this, but the Thread class would need to look something like this:
@Autowired
EhCacheCacheManager ehCacheCacheManager;
...
// in the infinite loop
Ehcache ehcache = (Ehcache) ehCacheCacheManager.getCache("test").getNativeCache();
List keys = ehcache.getKeys();
for (int i = 0; i < keys.size(); i++) {
    Object key = keys.get(i);
    Element item = ehcache.get(key);
    // get the data based on some info in the value, and if no exceptions
    // store it (newValue is a placeholder for the refreshed data)
    ehcache.put(new Element(item.getKey(), newValue));
}
The benefit is that this is very fast for the @Cacheable caller; the downside is that your server might get more hits than necessary.
2) You could make a CacheListener that listens for the eviction event and stores the data temporarily. Should the server call fail, use that data and return it from the method.
the ehcache.xml
<cacheEventListenerFactory class="caching.MyCacheEventListenerFactory"/>
</cache>
</ehcache>
The factory:
import net.sf.ehcache.event.CacheEventListener;
import net.sf.ehcache.event.CacheEventListenerFactory;
import java.util.Properties;
public class MyCacheEventListenerFactory extends CacheEventListenerFactory {
    @Override
    public CacheEventListener createCacheEventListener(Properties properties) {
        return new CacheListener();
    }
}
The Pseudo-implementation
import net.sf.ehcache.CacheException;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.Element;
import net.sf.ehcache.event.CacheEventListener;
import java.util.concurrent.ConcurrentHashMap;
public class CacheListener implements CacheEventListener {
    // prob bad practice to use a global static here - but it's just for demo purposes
    public static ConcurrentHashMap<Object, Object> myMap = new ConcurrentHashMap<>();

    @Override
    public void notifyElementPut(Ehcache ehcache, Element element) throws CacheException {
        // we can remove it since the put happens after a method return
        myMap.remove(element.getKey());
    }

    @Override
    public void notifyElementExpired(Ehcache ehcache, Element element) {
        // expired item, we should store this
        myMap.put(element.getKey(), element.getValue());
    }
    //....
}
A challenge here is that the key is not very useful, you might need to store something about the key in the returned value to be able to pick it up if the server call fails. This feels a bit hacky, and I have not determined if this is exactly bullet proof. It might need some testing.
3) A lot of effort but works:
@Cacheable("test")
public MyObject getValue(String data) {
    try {
        MyObject result = callServer(data);
        storeResultSomewhereLikeADatabase(result);
        return result;
    } catch (Exception ex) {
        return getStoredResult(data);
    }
}
A pro here is that it will work across server restarts, and you can easily extend it to allow shared caches between clustered servers.
I had a version in a 12-node clustered environment where each node checked the database first to see if any other node had already fetched the "expensive" data, and then reused that rather than making the server call.
A slight variant would be to use a second @Cacheable method together with @CachePut, rather than a DB, to store the data. But this would mean doubling up on memory usage. That might be acceptable depending on your result sizes.
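The "TTL unless the service is down" behaviour can also be sketched without Spring or EhCache, which may help when reasoning about the variants above. The class below is an illustration only: StaleOnErrorCache is a hypothetical name, the loader stands in for the REST call, and the clock is injected so that expiry can be tested. It does not coordinate concurrent refreshes, so several threads may call the loader at once after expiry.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical sketch: values expire after a TTL, but the last known value
// is served when reloading fails.
class StaleOnErrorCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long loadedAtMillis;

        Entry(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final ConcurrentHashMap<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for restService.getValue(key)
    private final long ttlMillis;
    private final LongSupplier clock;    // e.g. System::currentTimeMillis

    StaleOnErrorCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
            return cached.value; // still fresh
        }
        try {
            V fresh = loader.apply(key);
            entries.put(key, new Entry<>(fresh, now));
            return fresh;
        } catch (RuntimeException e) {
            if (cached != null) {
                return cached.value; // service down: serve the last known value
            }
            throw e; // nothing cached yet, so the failure has to surface
        }
    }
}
```

With a five-minute TTL this would be constructed as new StaleOnErrorCache<>(restCall, 300_000L, System::currentTimeMillis), where restCall is whatever function performs the remote request.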

Maybe you can use SpEL to switch the cache that is used (one with a TTL and one without) depending on whether a condition (is the service up?) is true or false. I've never used SpEL this way (I used it to change the key based on some request params), but I think it could work:
@Cacheable(value = "T(com.xxx.ServiceChecker).checkService()",...)
where checkService() is a static method that returns the name of the cache that should be used.
