I'm working on an application which was, until now, running on JBoss AS on a single server. Now there is a requirement for failover, so we're adding another server and creating a JBoss cluster. Here is the problem:
Until now the application has been using a HashMap to store about 2 million records fetched from the database.
Now I have to replicate this data to the second node (there may be more nodes added in the future).
The data we need to store is now more likely to be something like 5 million records.
I just want to have an opinion on what's the best approach for storing this key/value type data and replicating it on all the server nodes.
I've been wondering whether Redis or memcached would be an appropriate solution. How about JBoss Cache? I know it is a distributed cache and replicates to all nodes in the cluster.
Here are the things I'm MOST worried about:
effect on performance - replication could cause network latency
data quality - want to avoid working with stale data
impact on memory - once the data is loaded in the HashMap/cache it should not expire. There may be some additions or deletions of records, and these changes will have to be replicated to all the nodes.
scalability - as I mentioned, more nodes could be added
Any thoughts on this are highly appreciated.
Since you are working in a Java environment, I would suggest you have a look at Hazelcast (http://www.hazelcast.com/). We are using it to synchronize several portal servers and it works very nicely!
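For illustration, here is a minimal sketch of what swapping the HashMap for a Hazelcast map could look like; the map name "records" and the key/value types are placeholders, not from the original setup.

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import java.util.Map;

public class RecordCache {
    public static void main(String[] args) {
        // Each cluster node starts an embedded Hazelcast member; members discover
        // each other and share the map across the cluster.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Near drop-in replacement for the local HashMap: entries are partitioned
        // across the nodes (with backups) rather than fully copied to each one.
        Map<Long, String> records = hz.getMap("records");
        records.put(1L, "first record");
        System.out.println(records.get(1L));
    }
}

Entries put on one node are visible from the others, and by default map entries do not expire unless you configure a TTL, which matches the "should not expire" requirement above.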
Let's say I have an array of memcache servers. The memcache client will make sure a cache entry is only on a single memcache server and all clients will always ask that server for the cache entry... right?
Now consider two scenarios:
[1] The web servers are getting lots of different requests (different URLs), so the cache entries will be distributed among the memcache servers and requests will fan out across the memcache cluster.
In this case the memcache strategy of keeping a single cache entry on a single server works.
[2] The web servers are getting lots of requests for the same resource, so all requests from the web servers will land on a single memcache server, which is not desired.
What I am looking for is a distributed cache in which:
[1] Each web server can specify which cache node to use to cache stuff.
[2] If any web server invalidates a cache entry, then it should be invalidated on all caching nodes.
Can memcache fulfill this use case?
PS: I don't have a ton of resources to cache, but I have a small number of resources with a lot of traffic asking for a single resource at once.
Memcache is a great distributed cache. To understand where the value is stored, it's a good idea to think of the memcache cluster as a hashmap, with each memcached process being precisely one pigeon hole in the hashmap (of course each memcached is also an 'inner' hashmap, but that's not important for this point). For example, the memcache client determines the memcache node using this pseudocode:
index = hash(key) mod len(servers)
value = servers[index].get(key)
This is how the client can always find the correct server. It also highlights how important the hash function is, and how keys are generated - a bad hash function might not uniformly distribute keys over the different servers. The default hash function should work well in almost any practical situation, though.
Now you bring up in [2] the condition where the requests for resources are non-random, specifically favouring one or a few servers. If this is the case, it is true that the respective nodes are probably going to get a lot more requests, but this is relative. In my experience, memcache will be able to handle a vastly higher number of requests per second than your web server. It easily handles hundreds of thousands of requests per second on old hardware. So, unless you have 10-100x more web servers than memcache servers, you are unlikely to have issues. Even then, you could probably resolve the issue by upgrading the individual nodes to have more CPUs or more powerful CPUs.
But let us assume the worst case - you can still achieve this with memcache by the following steps (sketched in code after the list):
Install each memcache as a single server (i.e. not as a distributed cache)
In your web server, you are now responsible for managing the connections to each of these servers
You are also responsible for determining which memcached process to pass each key/value to, achieving goal 1
If a web server detects a cache invalidation, it should loop over the servers invalidating the cache on each, thereby achieving goal 2
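Here is a rough sketch of those steps using the spymemcached client, treating every memcached process as its own single-node "cluster"; the class and method names are illustrative, not a standard API.

import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;
import net.spy.memcached.MemcachedClient;

public class ManualShardingCache {
    private final List<MemcachedClient> nodes = new ArrayList<MemcachedClient>();

    // One client per memcached process, each treated as a standalone server.
    public ManualShardingCache(List<InetSocketAddress> servers) throws IOException {
        for (InetSocketAddress server : servers) {
            nodes.add(new MemcachedClient(server));
        }
    }

    // Goal 1: the application decides which node caches a given key.
    public void set(int nodeIndex, String key, Object value) {
        nodes.get(nodeIndex).set(key, 3600, value); // 3600s expiry is just an example
    }

    public Object get(int nodeIndex, String key) {
        return nodes.get(nodeIndex).get(key);
    }

    // Goal 2: invalidation is broadcast to every node.
    public void invalidate(String key) {
        for (MemcachedClient node : nodes) {
            node.delete(key);
        }
    }
}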
I personally have reservations about this - you are, by specification, disabling the distributed aspect of your cache, and the distribution is a key feature and benefit of the service. Also, your application code would start to need to know about the individual cache servers to be able to treat each differently, which is undesirable architecturally and introduces a large number of new configuration points.
The idea of any distributed cache is to remove the ownership of the location(*) from the client. Because of this, distributed caches and DBs do not allow the client to specify the server where the data is written.
In summary, unless your system is expecting 100,000k or more requests per second, it's doubtful that you will hit this specific problem in practice. If you do, scale the hardware. If that doesn't work, then you're going to be writing your own distribution logic, duplication, flushing and management layer over memcache. And I'd only do that if really, really necessary. There's an old saying in software development:
There are only two hard things in Computer Science: cache invalidation
and naming things.
--Phil Karlton
(*) Some distributed caches duplicate entries to improve performance and (additionally) resilience if a server fails, so data may be on multiple servers at the same time.
We use EclipseLink and WebLogic.
We have two WebSphere clusters, with 2 servers in each.
Right now an app in one cluster uses RMI for cache coordination to keep those 2 servers in sync.
When we add a new app in the new cluster to the mix, we will have to sync the cache across the 2 clusters.
How do I achieve this?
Can I still use JPA cache coordination? Using RMI? JMS?
Should I look into using Coherence as the L2 cache?
I don't need highly scalable grid configurations. All I need to make sure is that the cache has no stale data.
Nothing is a sure thing to prevent stale data, so I hope you are using a form of optimistic locking where needed. You will have to evaluate what is the better solution for your 4-server architecture, but RMI, JMS, and even just turning off the second-level cache where stale data cannot be tolerated are valid options and would work. I recommend setting up simple tests that match your use cases and the expected load, and evaluating whether the network traffic and overhead of having to merge and maintain changes on the second-level caches outweigh the cost of removing the second-level cache. For highly volatile entities, that tipping point might come sooner, in which case you might see more benefit from disabling the shared cache for that entity.
In my experience, JMS has been easier to configure for cache coordination, as it is a central point all servers can connect to, whereas RMI requires each server to maintain connections to every other server.
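For reference, switching EclipseLink cache coordination to JMS is mostly a matter of persistence-unit properties; here is a rough sketch setting them programmatically. The JNDI names and persistence-unit name are placeholders for whatever your application server exposes.

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class CoordinationBootstrap {
    public static EntityManagerFactory create() {
        Map<String, String> props = new HashMap<String, String>();
        // Switch cache coordination from RMI to JMS: all four servers subscribe
        // to one topic instead of holding point-to-point connections to each other.
        props.put("eclipselink.cache.coordination.protocol", "jms");
        // Placeholder JNDI names; use whatever your JMS provider defines.
        props.put("eclipselink.cache.coordination.jms.topic", "jms/EclipseLinkTopic");
        props.put("eclipselink.cache.coordination.jms.factory", "jms/EclipseLinkTopicConnectionFactory");
        return Persistence.createEntityManagerFactory("myPersistenceUnit", props);
    }
}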
I have been experimenting with Hibernate 3.6 and I am wondering about the capabilities of the provided Infinispan distributed cache.
I have a requirement to have database replication between my main site and my disaster recovery site.
While it is possible to configure PostgreSQL to replicate, I am thinking that it might cause the same data to be sent twice from the main to the DR site. My application is expected to have a lot of updates, so that's something to keep in mind. Since this would be over a constrained WAN link, it feels like a lot of data would be sent and that just doesn't look like a good idea.
Can infinispan be configured to replicate between the two sites such that the underlying database doesn't need to ever be replicated itself?
If so, how? How bandwidth intensive would it be?
PostgreSQL >= 9.0 has very good replication; use it. You shouldn't replicate the cache to the DR center if you have a lot of updates. You shouldn't replicate anything other than the data needed for recovery.
A DR center is a backup site, not a load-balancing setup.
"The Hibernate cache will keep lots of data in memory that won't be flushed"
It depends. You can flush data manually, or you can set the flush mode to AUTO, COMMIT, etc.
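A small illustration of those options with plain JPA; the entity and persistence details are omitted.

import javax.persistence.EntityManager;
import javax.persistence.FlushModeType;

public class FlushExample {
    void saveAndFlush(EntityManager em, Object entity) {
        // COMMIT defers SQL until the transaction commits; AUTO (the default)
        // also flushes before queries that might need the pending changes.
        em.setFlushMode(FlushModeType.COMMIT);
        em.persist(entity);
        em.flush(); // push pending changes to the database right now
    }
}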
I am currently in need of a high-performance Java storage mechanism.
This means:
1) I have 10,000+ objects with one-to-many relationships.
2) The objects are updated every 5 seconds, with the most recent updates persistent in the case of system failure.
3) The objects need to be queryable in a reasonable time (1-5 seconds). (i.e. give me all of the objects with this timestamp, or give me all of the objects within these location boundaries).
4) The objects need to be available across various Glassfish installs.
Currently:
I have been using JMS to distribute the objects, Hibernate as an ORM, and HSQLDB to provide the needed recoverability.
I am not exactly happy with the performance. Especially the JMS part of this.
After doing some Stack Overflow research, I am wondering if this would be a better solution. Keep in mind that I have no experience with what Terracotta gives me.
I would use Terracotta to distribute objects around the system, and something else would need to give me the ability to "query" for attributes of those objects.
Does this sound reasonable? Would it meet these performance constraints? What other solutions should I consider?
I know it's not what you asked, but you may want to start by switching from HSQLDB to H2. H2 is a relatively new, pure Java DB. It is written by the same guy who wrote HSQLDB and he claims the performance is much better. I've been using it for some time now and I'm very happy with it. It should be a very quick transition (add a jar, change the connection string, create the database) so it's worth a shot.
In general, I believe in trying to get the most of what I have before rewriting the application in a different architecture. Try profiling it to identify the bottleneck first.
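To illustrate how small the HSQLDB-to-H2 switch is, here is roughly what the connection change looks like; the file path and credentials are placeholders.

import java.sql.Connection;
import java.sql.DriverManager;

public class H2Switch {
    public static Connection open() throws Exception {
        // Old HSQLDB file URL, for comparison (driver org.hsqldb.jdbcDriver):
        //   jdbc:hsqldb:file:data/appdb
        // H2 equivalent: only the driver jar and the URL change.
        Class.forName("org.h2.Driver");
        return DriverManager.getConnection("jdbc:h2:file:./data/appdb", "sa", "");
    }
}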
First off, Lucene isn't your friend here (its indexes are effectively read-only for this kind of update rate).
Terracotta is for scaling at the logical layer! Your problem doesn't seem to be related to the processing logic; it's more about the storage/communication side.
Identify your bottleneck! Benchmark the Storage/Logic/JMS processing time and overhead!
Kill JMS issues with a good JMS framework (e.g. ActiveMQ) and a good/tuned configuration.
Maybe a distributed key=>value store is your friend. Try Project Voldemort!
If you'd like to stay with Hibernate and HSQL, check out the Hibernate 2nd-level cache and connection pooling (c3p0, container-driven...)!
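As a rough illustration of that last point, both the second-level cache and c3p0 pooling are mostly property switches; the Ehcache region factory and the pool sizes below are just examples, not a recommendation.

import org.hibernate.cfg.Configuration;

public class CacheAndPoolConfig {
    public static Configuration configure() {
        Configuration cfg = new Configuration(); // mappings/connection settings omitted
        // Second-level cache (Hibernate 3.3+ region-factory style), here with Ehcache.
        cfg.setProperty("hibernate.cache.use_second_level_cache", "true");
        cfg.setProperty("hibernate.cache.use_query_cache", "true");
        cfg.setProperty("hibernate.cache.region.factory_class",
                "net.sf.ehcache.hibernate.EhCacheRegionFactory");
        // c3p0 connection pooling.
        cfg.setProperty("hibernate.c3p0.min_size", "5");
        cfg.setProperty("hibernate.c3p0.max_size", "20");
        cfg.setProperty("hibernate.c3p0.timeout", "300");
        return cfg;
    }
}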
Several Terracotta users have built systems like this in the past, so I can tell you by proof of existence that it can be done. :)
Compass does have support for clustering with Terracotta so that might help you. I suspect you might get further faster by just being careful with how you create your clustered data structures.
Regarding your requirements and Terracotta:
1) 10k objects is quite small from a Terracotta perspective
2) 5 sec update rate doesn't seem like an issue. Might depend how many nodes there are and whether there is any natural partitioning you can take advantage of. All updates will be persistent.
3) 1-5 second query time seems quite easy. Building your own well-organized data structures for lookup is the tricky part (see the sketch at the end of this answer). Obviously you want to avoid scanning all the data.
4) Terracotta currently supports Glassfish v1 and v2.
If you post on the Terracotta forums, you could probably get more Terracotta eyeballs on the problem.
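As a rough illustration of point 3, a timestamp-bucketed secondary index lets range queries avoid scanning the whole data set; in a Terracotta setup both maps would be the clustered roots. The class is illustrative, not a Terracotta API.

import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class TimestampIndex<V> {
    // Primary store: id -> object.
    private final Map<String, V> byId = new ConcurrentHashMap<String, V>();
    // Secondary index: timestamp -> ids, kept sorted so range lookups are cheap.
    private final ConcurrentNavigableMap<Long, Set<String>> idsByTimestamp =
            new ConcurrentSkipListMap<Long, Set<String>>();

    public void put(String id, long timestamp, V value) {
        byId.put(id, value);
        Set<String> bucket = idsByTimestamp.get(timestamp);
        if (bucket == null) {
            Set<String> fresh = Collections.synchronizedSet(new HashSet<String>());
            bucket = idsByTimestamp.putIfAbsent(timestamp, fresh);
            if (bucket == null) {
                bucket = fresh;
            }
        }
        bucket.add(id);
    }

    // "Give me all objects with timestamps in [from, to]" without a full scan.
    public Set<V> between(long from, long to) {
        Set<V> result = new HashSet<V>();
        for (Set<String> bucket : idsByTimestamp.subMap(from, true, to, true).values()) {
            for (String id : bucket) {
                result.add(byId.get(id));
            }
        }
        return result;
    }
}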
I am currently working on writing the client for a very (very) fast key/value distributed hash DB that provides set + list semantics. The DB is C99 and requires GCC, and right now I'm battling with good old Java network IO to break my current 30,000 gets/sets per second barrier. Hope to be done within the week. Drop me a line through my account and I'll get back when it's show time.
With such a high update rate, Lucene is almost definitely not what you're looking for, since there is no way to update a document once it's indexed. You'd have to keep all the object versions in the index and select the one with the latest time stamp, which will kill your performance.
I'm no DB expert, but I think you should look into any one of the distributed DB solutions that have been in the news lately (CouchDB, Cassandra).
Maybe you should take a look at Prevayler.
Your objects are always in memory.
The "changes" to your objects are persisted.
From time to time you are able to take a snapshot: every object is persisted.
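A rough sketch of the Prevayler pattern using the classic (non-generic) API, written from memory, so treat the exact signatures as an assumption to verify against the docs; the store and transaction classes are placeholders.

import java.io.Serializable;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import org.prevayler.Prevayler;
import org.prevayler.PrevaylerFactory;
import org.prevayler.Transaction;

// The prevalent system: all objects live in memory.
class ObjectStore implements Serializable {
    final Map<Long, String> objects = new HashMap<Long, String>();
}

// Every change is expressed as a transaction, which Prevayler journals to disk.
class UpdateObject implements Transaction {
    private final long id;
    private final String value;
    UpdateObject(long id, String value) { this.id = id; this.value = value; }
    public void executeOn(Object prevalentSystem, Date executionTime) {
        ((ObjectStore) prevalentSystem).objects.put(id, value);
    }
}

public class PrevaylerDemo {
    public static void main(String[] args) throws Exception {
        Prevayler prevayler = PrevaylerFactory.createPrevayler(new ObjectStore(), "journal-dir");
        prevayler.execute(new UpdateObject(1L, "updated every 5 seconds"));
        // prevayler.takeSnapshot() persists the whole object graph from time to time.
    }
}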
You don't say what vendor you are using for JMS, but it wouldn't surprise me if you have a bottleneck there. I couldn't get more than 100 messages a second from ActiveMQ, and whatever I tried in terms of configuring acknowledgement, queue size, etc., we were unable to soak the CPU beyond a few percent.
The solution was to batch many queries into one JMS message. We had a simple class that sent a batch either when it got to 200 queries or when a timeout was reached (we used 20 ms), which gave us a dramatic increase in message throughput.
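Here is a rough sketch of that batching wrapper; the 200-query and 20 ms thresholds mirror the values above, and BatchSender stands in for whatever actually builds and sends the single combined JMS message.

import java.util.ArrayList;
import java.util.List;

// Accumulates individual queries and hands them off as one batch, either when
// 200 have piled up or when 20 ms have passed since the first one arrived.
public class QueryBatcher {
    public interface BatchSender { void send(List<String> batch); }

    private static final int MAX_BATCH = 200;
    private static final long MAX_WAIT_MS = 20;

    private final BatchSender sender;
    private final List<String> pending = new ArrayList<String>();
    private long firstArrival;

    public QueryBatcher(BatchSender sender) { this.sender = sender; }

    public synchronized void add(String query) {
        if (pending.isEmpty()) {
            firstArrival = System.currentTimeMillis();
        }
        pending.add(query);
        if (pending.size() >= MAX_BATCH
                || System.currentTimeMillis() - firstArrival >= MAX_WAIT_MS) {
            flush();
        }
    }

    // A scheduled task should also call this periodically so that a quiet period
    // still flushes a partially filled batch within the timeout.
    public synchronized void flush() {
        if (!pending.isEmpty()) {
            sender.send(new ArrayList<String>(pending));
            pending.clear();
        }
    }
}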
Guaranteed messaging is going to be much slower than volatile messaging. Given that every object is updated every few seconds, you might consider batching your updates (into, say, 500 changes, or by time, say 1-10 ms' worth), sending them over volatile messaging, and batching your transactions. In this case you are more likely to be limited by bandwidth. Tuning for your use case, you may find smaller batch sizes also work efficiently. If bandwidth is critical (say you have a 10 MB connection or slower), you could use compression over JMS.
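If you go the compression route, gzipping the serialized batch before it goes into a JMS BytesMessage is straightforward; a minimal sketch, where the batch type is just an example (it must be something serializable like an ArrayList of changes).

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPOutputStream;

public class BatchCompressor {
    // Serialize a batch of updates and gzip it; the resulting byte[] can be
    // written into a JMS BytesMessage to save bandwidth on a slow link.
    public static byte[] compress(Serializable batch) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes));
        try {
            out.writeObject(batch);
        } finally {
            out.close(); // closing also finishes the gzip stream
        }
        return bytes.toByteArray();
    }
}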
You can achieve much higher performance with a custom solution (which might also be simpler), e.g. Hazelcast and JGroups are free (you can add a node or nodes which handle the database synchronization so your main app doesn't slow down). There are commercial products which handle on the order of half a million durable messages/sec.
Terracotta + jofti = queryable persistent clustered data structures
Search Google for "terracotta querymap" or visit tusharkhairnar.blogspot.com for the querymap blog post.
You may want to integrate timasync as well to update your database. The database is your system of record; use Terracotta as the caching and database-offloading mechanism. You can even batch async updates to make it faster, so that the DB contains fairly recent data.
Tushar
tusharkhairnar.blogspot.com
I have this in mind:
On each server (they are all set up identically):
A free database like MySQL or PostgreSQL.
Tomcat 6.x for hosting Servlet based Java applications
Hibernate 3.x as the ORM tool
Spring 2.5 for the business layer
Wicket 1.3.2 for the presentation layer
I place a load balancer in front of the servers and a replacement load balancer in case my primary load balancer goes down.
I use Terracotta to have the session information replicated between the servers. If a server goes down the user should be able to continue their work at another server, ideally as if nothing happened.
What is left to "solve" (as I haven't actually tested this and for example do not know what I should use as a load balancer) is the database replication which is needed.
If a user interacts with the application and the database changes, then that change must be replicated to the database servers on the other server machines. How should I go about doing that? Should I use MySQL, PostgreSQL, or something else (ideally something free, as we have a limited budget)? Do the other things above sound sensible?
Clarification: I cluster to get high availability first and foremost and I want to be able to add servers and use them all at the same time to get high scalability.
Since you're already using Terracotta, and you believe that a second DB is a good idea (agreed), you might consider expanding Terracotta's role. We have customers who use Terracotta for database replication. Here's a brief example/description, though I think they have stopped supporting clients for this product:
http://www.terracotta.org/web/display/orgsite/TCCS+Asynchronous+Data+Replication
You are trying to create multi-master replication, which is a very bad idea, as any change to any database has to replicate to every other database. This is terribly slow - on one server you can get several hundred transactions per second using a couple of fast disks and RAID1 or RAID10. It can be much more if you have a good RAID controller with battery-backed cache. If you add the overhead of communicating with all your servers, you'll get at most tens of transactions per second.
If you want high availability you should go for a warm standby solution, where you have a server which is replicated but not used; when the main server dies, a replacement takes over. You can lose some recent transactions if your main server dies.
You can also go for one-master, multiple-slaves asynchronous replication. Every change to the database will have to be performed on the one master server, but you can have several read-only slave servers. Data on these slave servers can be several transactions behind the master, so you can also lose some recent transactions in case of server death.
PostgreSQL does have both types of replication - warm standby using log shipping, and one master with multiple slaves using Slony.
Only if you have a very small number of writes can you go for synchronous replication. This can also be set up for PostgreSQL using PgPool-II or Sequoia.
Please read High Availability, Load Balancing, and Replication chapter in Postgres documentation for more.
For my (Perl-driven) website, I am using MySQL on two servers with database replication. Each MySQL server is slave and master at the same time. I did this for redundancy, not for performance, but the setup has worked fine for the past 3 years; we had almost no downtime at all during this period.
Regarding Kent's question / comment: I am using the standard replication that comes with MySQL.
Regarding the failover mechanism: I am using DNSMadeEasy.com's failover functionality. I have a Perl script run every 5 minutes via cron that checks if replication is still running (and also lots of other things such as server load, HDD sanity, RAM usage, etc.). During normal operation, the faster of the two servers delivers all web pages. If the script detects that something is wrong with the server (or if the server is just plain down), DNSMadeEasy switches DNS entries so that the secondary server becomes primary. Once the "real" primary server is back up, MySQL automatically catches up on missing database changes and DNSMadeEasy automatically switches back.
Here's an idea. Read Theo Schlossnagle's book Scalable Internet Architectures.
What you're proposing is not the best idea.
Load balancers are expensive and not as valuable as they would appear. Use something simpler for distributing the load between your servers (something like Wackamole).
Rather than fool around with DB replication, spend your money on a reliable DB server separate from your front-end web servers. Do regular backups and in the very unlikely event of DB failure, get back running as quickly as possible from ordinary backups.
AFAIK, MySQL does a better job of being scalable. See the documentation:
http://dev.mysql.com/doc/mysql-ha-scalability/en/ha-overview.html
And there is a blog, where you can take a look at real life examples:
http://highscalability.com/tags/mysql