Here is an example of how to start Hazelcast without a network. For my purposes I need to run Hazelcast embedded, purely as a simple cache. The original question was about testing, though: can I use that code in production when I do not need a separate Hazelcast server?
Hazelcast is designed to be a distributed system. It wasn't designed to be an in-process cache. Because of its distributed nature, many design decisions make it a poor candidate for your use case. You will see serialization and network overhead (even in local, single-node embedded mode).
We're planning to improve this situation by providing optimizations for the local cache use case, but there is no ETA at this point. You will see some features related to this use case in the next couple of releases.
I would suggest taking a look at Caffeine. It has JCache and Spring Boot integration. I would suggest sticking to the JCache integration because it will make your code portable. If you decide to go distributed in the future, you just need to replace the Caffeine jars with Hazelcast.
Feel free to ask if you have any questions.
Thank you
Related
We have multiple application servers behind a reverse proxy. We want a single cache on another host that all application servers can easily use, so the cache has to have some kind of network support. Furthermore, the setup should be easy, preferably supporting Docker, but this is not a must. The cache duration is about 1d. The API should be as easy and standardized as possible (JCache?).
At a later stage we want to prepopulate the cache.
Which options do I have?
Background: As a first step we want to reduce load on the backend systems, which mainly provide SOAP services. So we want to cache the SOAP responses (JAX-WS). The cache hit rate will probably be about 25% in a first stage.
Later we want to use the same cache for JPA as well (we already have in-memory caching enabled for each application server and use a cache coordination strategy).
To use even more caching we will need some sort of cache categories.
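The requirements above (entries that live for about a day, grouped into categories) can be sketched with nothing but the JDK. This is only an illustration: the class name `TtlCache`, the category and key strings, and the injectable clock are assumptions, and a real setup would more likely sit behind a JCache provider or a cache server.

```java
import java.time.Duration;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical in-process cache with a fixed time-to-live and named categories.
final class TtlCache<V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;
        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Map<String, Entry<V>>> categories = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    TtlCache(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    void put(String category, String key, V value) {
        categories.computeIfAbsent(category, c -> new ConcurrentHashMap<>())
                  .put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    Optional<V> get(String category, String key) {
        Map<String, Entry<V>> entries = categories.get(category);
        if (entries == null) return Optional.empty();
        Entry<V> entry = entries.get(key);
        if (entry == null) return Optional.empty();
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            entries.remove(key, entry); // drop the stale entry lazily on read
            return Optional.empty();
        }
        return Optional.of(entry.value);
    }

    // Throw away a whole category, e.g. all cached SOAP responses.
    void clearCategory(String category) {
        categories.remove(category);
    }
}
```

In production code the clock would simply be `System::currentTimeMillis`. Note that this sketch never evicts entries that are not read again, so it is not bounded in size.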
In general: The question is too broad, and you are actually asking for a product recommendation. Please take a look at the Stack Overflow question guidelines.
About your question:
There is no "single cache" for any purpose. Furthermore, there can be many variants in software and system architecture, with a single cache product, too. The best solution depends not on the application but on the type of data access you want to cache. Some questions that come to my mind:
Do you have a mostly read or a read/write usage pattern?
What is the type of access: point, range, or a full scan?
What type of operations do you do on the data?
What is the object count and typical object size?
Are there hot spots?
How many application servers do you have?
Is there a memory limit in the application servers?
How costly is it to generate the data in the backend (latency and resource costs)?
One general recommendation: If you only have a few application servers, I would start with local caching in the application servers and ignore the fact that there may be redundant requests on the backend from different application servers. This way you can keep the existing system architecture. Putting in a separate cache server or servers needs a lot of planning and a lot of consideration for staging, deploying and operating your application.
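A minimal sketch of such a local cache, using only the JDK. `LocalReadThroughCache` and the loader function are hypothetical names; the loader stands in for the expensive backend call (for example the SOAP request).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical per-application-server cache: each server keeps its own copy,
// so the backend may still see one request per key from every server.
final class LocalReadThroughCache<K, V> {
    private final ConcurrentMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> backendLoader;

    LocalReadThroughCache(Function<K, V> backendLoader) {
        this.backendLoader = backendLoader;
    }

    V get(K key) {
        // computeIfAbsent only calls the loader when the key is not cached yet
        return entries.computeIfAbsent(key, backendLoader);
    }

    void invalidate(K key) {
        entries.remove(key);
    }
}
```

The sketch has no size limit or expiry, which a real local cache would need once the memory limit of the application servers matters.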
A second general recommendation: You write that "the cache hit rate will be probably about 25% in a first stage". A cache with this hit rate will be pretty useless. It may happen that you don't get any performance gain from the cache at all. There may be reasons to do it anyway, e.g. to protect the application against flash crowds. This needs some more detailed elaboration. Double-check your numbers!
I am looking forward to more detailed questions :)
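To put a rough number on the hit rate argument, here is a small back-of-the-envelope helper. The model is an assumption for illustration: every request pays the cache lookup, and a miss additionally pays the backend call. `HitRateEstimate` and its parameters are hypothetical names, and the latencies in the usage note are made up.

```java
// Hypothetical helper: expected latency per request for a given cache hit rate.
final class HitRateEstimate {
    private HitRateEstimate() {}

    static double expectedLatencyMillis(double hitRate, double cacheMillis, double backendMillis) {
        if (hitRate < 0.0 || hitRate > 1.0) {
            throw new IllegalArgumentException("hitRate must be between 0 and 1");
        }
        // hits are served by the cache; misses pay the cache lookup plus the backend call
        return hitRate * cacheMillis + (1.0 - hitRate) * (cacheMillis + backendMillis);
    }
}
```

With a 25% hit rate, an assumed 1 ms cache lookup and an assumed 100 ms backend call, `expectedLatencyMillis(0.25, 1.0, 100.0)` gives 76 ms against 100 ms without a cache. A slower remote cache lookup or a lower hit rate shrinks that gain further.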
What about using the cache server from Ehcache?
It provides a RESTful interface and can run on a dedicated server.
I'm looking for a distributed cache for key-value pairs with these features -
Persistence to disk
Open Source
Java Interface
Fast read/write with minimum memory utilisation
Easy to add more machines to the database (Horizontally Scalable)
What are the databases that fit the bill?
Redisson framework also provides distributed cache abilities based on Redis
There are a lot of options that you can make use of.
Redis - the one you've mentioned yourself. It's a separate process, very fast, key-value for sure, but it's not "in-memory with your application"; I mean that you'll always do socket I/O in order to reach the Redis process.
It's not written in Java, but it provides a decent Java driver to work with, and there is also a Spring integration.
If you want a java based solution consider the following:
memcached - a distributed cache (not written in Java itself, but Java clients exist)
Hazelcast - it's a data grid, so it's much more than a simple key-value store, but you might be interested in this as well.
Infinispan - folks from JBoss have created this one
EHCache - a popular distributed cache
Hope this helps
I'm getting ready to start working on performance in an application which will eventually be running distributed, but currently is in [greenfield] development.
I'd like to be able to introduce caching without either selecting or committing to a specific library, so I am wondering whether there is a caching facade library (analogous to slf4j for logging) already in existence that will allow me to make that decision at a later date.
There is also a Java standard: JSR 107: JCACHE - Java Temporary Caching API. It looked pretty much dead, but there was some movement half a year ago, and quite a lot is happening in the source repository. EhCache supports this JSR natively.
If you are using Spring, it has a great caching abstraction.
If you are using Spring it has a cache abstraction.
Have a look at the blog entry here too which introduced me to the concept.
One of the popular cache implementations is EhCache. You can also take a look at the Terracotta cache (Terracotta has a lot of sub-projects - see the cache).
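If neither JSR 107 nor Spring is an option yet, the facade the question asks about can be as small as one interface of your own. The following is a sketch under that assumption; `CacheFacade`, `InMemoryCache` and `PriceService` are made-up names, not part of any library.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical facade: application code depends only on this interface.
interface CacheFacade<K, V> {
    Optional<V> get(K key);
    void put(K key, V value);
    void evict(K key);
}

// Default in-process implementation; an adapter for a real cache library
// could implement the same interface later without touching the callers.
final class InMemoryCache<K, V> implements CacheFacade<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    @Override
    public Optional<V> get(K key) { return Optional.ofNullable(entries.get(key)); }

    @Override
    public void put(K key, V value) { entries.put(key, value); }

    @Override
    public void evict(K key) { entries.remove(key); }
}

// Example caller that never names a concrete cache library.
final class PriceService {
    private final CacheFacade<String, Integer> cache;

    PriceService(CacheFacade<String, Integer> cache) {
        this.cache = cache;
    }

    int priceInCents(String sku) {
        return cache.get(sku).orElseGet(() -> {
            int computed = sku.length() * 100; // stand-in for an expensive lookup
            cache.put(sku, computed);
            return computed;
        });
    }
}
```

Keeping the interface this narrow (get, put, evict) is a deliberate choice: almost every cache product can back it, which is what makes postponing the decision cheap.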
Between the transitions of the web app I use a Session object to save my objects in.
I've heard there's a program called memcached but there's no compiled version of it on the site,
besides some people think there are real disadvantages of it.
Now I wanna ask you.
What are alternatives, pros and cons of different approaches?
Is memcached painful for sysadmins to install? Is it difficult to integrate into the existing infrastructure from a sysadmin's perspective?
What about using a database to hold temporary data between web app transitions?
Is it a normal practice?
"What about using a database to hold temporary data between web app transitions? Is it a normal practice?"
Databases do indeed have a cache already. A well-designed application should try to leverage it to reduce disk IO.
The database cache works at the data level. That's why other caching mechanisms can be used to address different levels. At the Java level, you can use the 2nd level cache of Hibernate, which can cache entities and query results. This can notably reduce the network IO between the app server and the database.
Then you may want to address horizontal scalability, that is, adding servers to manage the load. In this case, the 2nd level cache needs to be distributed across the nodes. This exists (see JBoss Cache), but can get slightly complicated to manage.
Distributed caches tend to work better if they have a simpler scheme based on key/value. That's what memcached is, but there are also other similar solutions. The biggest problem with distributed caches is invalidation of outdated entries, which can itself turn into a performance bottleneck.
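To make the invalidation problem concrete, here is a single-process sketch of the usual pattern: reads fill the cache, writes remove the cached entry. The class `InvalidatingStore` and its in-memory "database" map are assumptions for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache-aside wrapper: reads fill the cache, writes invalidate it.
final class InvalidatingStore {
    private final Map<String, String> database = new ConcurrentHashMap<>(); // stand-in for the real store
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) return cached;
        String fresh = database.get(key);
        if (fresh != null) cache.put(key, fresh);
        return fresh;
    }

    void write(String key, String value) {
        database.put(key, value);
        cache.remove(key); // invalidate so the next read sees the new value
    }

    boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

In one process the `remove` is a cheap map operation. In a distributed cache the same step has to reach every node that may hold the key, which is where the bottleneck mentioned above comes from. The sketch also ignores the race between a concurrent read and write, which a real implementation would have to handle.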
Don't think that you can use a distributed cache as-is to make your performance problems vanish. Designing a scalable distributed architecture requires experience and is always a matter of trade-off between what to optimize and not.
To come back to your question: for a regular application, there is IMHO no need for a distributed cache. Decent disk IO and network IO usually lead to decent performance.
EDIT
For non-persistent objects, you have several options:
The HttpSession. Objects need to implement Serializable. The exact way the session is managed depends on the container. In a cluster, the session is usually replicated twice, so that if one node crashes you still have one copy. There is then session affinity to route the request to the server that has the session in memory.
Distributed cache. A system like memcached may indeed make sense, but I don't know the details.
Database. You could of course dump any Serializable object in the database in a BLOB. Can be an option if the web servers are not as reliable as the database server.
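The HttpSession and the BLOB option both rest on the same requirement that the object is Serializable. A minimal sketch of turning such an object into bytes and back, using only the JDK; `BlobCodec` is a hypothetical helper name, and the JDBC code that would actually write the bytes to a BLOB column is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper: converts a Serializable object to bytes and back.
final class BlobCodec {
    private BlobCodec() {}

    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject(); // yields a copy, not the original instance
        }
    }
}
```

Only deserialize bytes you wrote yourself; Java deserialization of untrusted data is a known security risk.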
Again, for a regular application, I would try to go as far as possible with the HttpSession.
How about Ehcache? It's an easy to use pure Java solution ready to plug in to Hibernate. As far as I remember it's supported by containers.
It's quite painless in my experience.
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html#performance-cache
This page should have everything that you need (hopefully!)
Never used a cache like this before. The problem is that I want to load 500,000 + records out of a database and do some selecting/filtering wicked fast.
I'm thinking about using a cache, and preliminarily found EHCache and OSCache, any opinions?
Judging by their releases page, OSCache has not been actively maintained since 2007. This is not a good thing. EhCache, on the other hand, is under constant development. For that reason alone, I would choose EhCache.
Edit Nov 2013: OSCache, like the rest of OpenSymphony, is dead.
They're both pretty solid projects. If you have pretty basic caching needs, either one of them will probably work as well as the other.
You may also wish to consider doing the filtering in a database query if it's feasible. Often, using a tuned query that returns a smaller result set will give you better performance than loading 500,000 rows into memory and then filtering them.
I've used JCS (http://jakarta.apache.org/jcs/) and it seems solid and easy to use programmatically.
It sort of depends on your needs. If you're doing the work in memory on one machine, then Ehcache will work perfectly, assuming you have enough RAM or a fast enough hard disk so that the overflow doesn't cause disk paging/thrashing. If you find you need to achieve scalability even with this particular operation happening a lot, then you'll probably want to do clustering. JGroups/TreeCache from JBoss support this, so does Ehcache (I think), and I know it definitely works if you use Ehcache with Terracotta, which is a very slick integration.
This answer doesn't speak directly to the merits of Ehcache and OSCache, so here's that answer: Ehcache seems to have the most inertia (used to be the default, well known, active development, including a new cache server), and OSCache seemed (at least at one point) to have slightly more features, but I think that with the options mentioned above those advantages are moot/superseded. Ah, the other thing I forgot to mention is that transactionality of the data is important, and your requirements will refine the list of valid choices.
Choose a cache which complies with JSR 107, which will make your job easy when you want to migrate from one implementation to another. To be specific to the question, go for Ehcache, which is the more popular and widely used Java caching solution. We are using Ehcache extensively and it works for us.
Other answers discuss pros/cons of caches, but I am wondering whether you would actually benefit from a cache at all. It is not quite clear exactly what you plan on doing here, and why a cache would be beneficial: if you have the data set at your disposal, just access that. A cache only helps to reuse things between otherwise independent tasks. If this is what you are doing, yes, caching can help. But if it is one big task that can carry its data set along, caching would add no value.
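If the data set really is just loaded once and then filtered many times, a plain index built in one pass may be all that is needed. A sketch under that assumption; `Customer`, its fields and `CustomerIndex` are hypothetical stand-ins for the real rows and the real filter criterion.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical row type standing in for a database record.
final class Customer {
    final long id;
    final String country;

    Customer(long id, String country) {
        this.id = id;
        this.country = country;
    }
}

// Index built once over all loaded rows; no cache library involved.
final class CustomerIndex {
    private final Map<String, List<Customer>> byCountry;

    CustomerIndex(List<Customer> rows) {
        // one pass over all rows; afterwards each lookup is a single hash probe
        this.byCountry = rows.stream().collect(Collectors.groupingBy(c -> c.country));
    }

    List<Customer> inCountry(String country) {
        return byCountry.getOrDefault(country, Collections.emptyList());
    }
}
```

This only pays off when the same field is filtered on repeatedly; for one-off or ad hoc filters, a tuned database query is usually the simpler route.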
Either way, I recommend using them with Spring Modules.
The cache can be transparent to the application, and cache implementations are trivially easy to swap.
In addition to OSCache and EHCache, Spring Modules also support Gigaspaces and JBoss cache.
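What "transparent to the application" means can be shown without Spring Modules, using a JDK dynamic proxy. This is not how Spring Modules is implemented; it is only a sketch of the same idea, and `QuoteService` and `CachingProxy` are made-up names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical service interface; callers only ever see this type.
interface QuoteService {
    String quote(String symbol);
}

final class CachingProxy {
    private CachingProxy() {}

    // Wraps any implementation of iface so that repeated calls with the
    // same method name and arguments are answered from a map.
    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> iface, T target) {
        Map<List<Object>, Object> results = new ConcurrentHashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            Object[] safeArgs = (args == null) ? new Object[0] : args;
            // cache key: method name plus argument values
            List<Object> key = Arrays.asList(method.getName(), Arrays.asList(safeArgs));
            Object cached = results.get(key);
            if (cached != null) return cached;
            Object result;
            try {
                result = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the target's own exception
            }
            if (result != null) results.put(key, result); // null results are not cached
            return result;
        };
        return (T) Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    }
}
```

A caller would write `QuoteService cached = CachingProxy.wrap(QuoteService.class, realService);` and use `cached` exactly like the original. The sketch caches every method of the interface and never expires anything, which a real AOP-based setup would make configurable.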
As to comparisons....
OSCache is easier to configure
EHCache has more configuration options
They are both rock solid, both support mirroring cache, both work with Terracotta, both support in-memory and to-disk caching.
I have used oscache on several spring projects with spring-modules, using the aop based configuration.
Recently I looked to use oscache + spring modules on a Spring 3.x project, but found spring-modules annotation-based caching is not supported (even by the fork).
I recently found out about this project -
http://code.google.com/p/ehcache-spring-annotations/
Which supports spring 3.x with declarative annotation-based caching using ehcache.
I mainly use EhCache because it used to be the default cache provider for Hibernate. There is a list of caching solutions on Java-Source.net.
I used to have a link that compared the main caching solutions. If I find it I will update this answer.
OSCache is pretty much dead, as it was abandoned a few years ago. You may take a look at Cacheonix; it's being actively developed and we've just released v.2.2.2 with support for caching in the web tier. I'm a committer, so you can reach out if you have any questions.