I have a problem with a product that I am currently working on. Essentially, there is some very commonly used (and very seldom updated) information that is retrieved from the database on server startup. We do not want to query the database every time this information is needed, because it is needed very frequently. There is a way to update this information through the application (admin only). When this method is used, the data in the database is updated and the cached data on that single server (1 of 4) is updated. Unfortunately, if a user hits any of the other servers, they will not see the updated information. Restarting the cluster remedies the problem; however, that is not a feasible solution for our production environment. Now that I have explained the situation, I am open to suggestions. Thank you for your time.
For a simple solution, you can go to the cluster in the admin console and ripple start it. That stops and starts the nodes gracefully, one at a time. The only impact is a 25% reduction in capacity (one of your four servers) while it is working.
IBM WebSphere Application Server has a Dynamic Cache that you can use to store Java objects. The cache can be set up to use replication over a replication domain so it can be shared across a cluster.
Your code would use the DistributedMap interface to interact with the cache. All settings for the dynamic cache can be included with your application or it can be pre-configured. Examples are included in the javadoc link.
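A minimal sketch of the read-through pattern this implies. `DistributedMap` extends `java.util.Map`, so the sketch is written against plain `Map` and runs outside WebSphere. The class name `ReferenceDataCache`, the loader function and the keys are assumptions made for illustration, not part of the WebSphere API. Inside WebSphere the map would normally come from a JNDI lookup (the default instance is commonly bound at `services/cache/distributedmap`), and entries only reach the other cluster members if replication is enabled for that cache instance.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: a read-through cache over any java.util.Map.
// In WebSphere, the Map passed in would be the DistributedMap instance.
final class ReferenceDataCache {
    private final Map<Object, Object> cache;
    private final Function<String, String> loader; // stands in for the database query

    ReferenceDataCache(Map<Object, Object> cache, Function<String, String> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    // Returns the cached value, loading it once on a miss.
    String get(String key) {
        Object cached = cache.get(key);
        if (cached != null) {
            return (String) cached;
        }
        String loaded = loader.apply(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }

    // Called from the admin update path after the database write succeeds.
    void update(String key, String value) {
        cache.put(key, value);
    }
}
```

With a replicated cache instance, the `update` call is the only place the admin code has to touch; the other servers see the new value through their own `get` calls.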
(Similar to Java EE Application-scoped variables in a clustered environment (Websphere)?)
That is, I think the standard answer would be a "Distributed Object Store". But a crude alternative (that we use) would be to configure a list of server:port combinations to contact to inform each cluster member to update their own copy of the data.
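The crude alternative could look roughly like the sketch below: after the admin update succeeds, the server that handled it posts to a refresh URL on every configured member. The class name `CacheRefreshNotifier`, the `host:port` list and the refresh path are assumptions for illustration; each member would need its own (suitably protected) endpoint that reloads the data from the database.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical notifier: asks every configured cluster member to reload its local copy.
final class CacheRefreshNotifier {
    private final List<String> members; // "host:port" entries taken from configuration
    private final HttpClient client = HttpClient.newBuilder()
            .version(HttpClient.Version.HTTP_1_1)
            .connectTimeout(Duration.ofSeconds(2))
            .build();

    CacheRefreshNotifier(List<String> members) {
        this.members = List.copyOf(members);
    }

    // Sends an empty POST to each member; returns the members that failed
    // (unreachable, or answered with a non-2xx status) so the caller can retry or alert.
    List<String> notifyMembers(String path) {
        List<String> failed = new ArrayList<>();
        for (String member : members) {
            HttpRequest request = HttpRequest.newBuilder(URI.create("http://" + member + path))
                    .timeout(Duration.ofSeconds(5))
                    .POST(HttpRequest.BodyPublishers.noBody())
                    .build();
            try {
                HttpResponse<Void> response =
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() / 100 != 2) {
                    failed.add(member);
                }
            } catch (IOException e) {
                failed.add(member);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failed.add(member);
            }
        }
        return failed;
    }
}
```

The weak point of this design is the static member list: it has to be kept in sync with the cluster, and a member that is down during the call stays stale until it is notified again or restarted.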
I have the following problem: I have a Java application (Spring Boot) that uses Angular in the frontend. This application needs to store some data on the client side; however, this data is lost when the client changes browsers or opens a private (incognito) browser tab.
I need an alternative, other than linking data to the user in the database. Something that is implemented in Java itself.
Is there any way I can store data in Java, even though I know it will be volatile? We can assume that my application server will be up 100% of the time.
Edit:
My server runs on an OpenShift platform with multiple pods, and the server's load balancer is configured for non-sticky sessions. That's why we can assume that my server will be available 100% of the time.
This really depends on the design of your server. For example, why is it guaranteed to be up 100% of the time? Do you have multiple redundant instances? In that case you need to coordinate that "storage" between all instances; you may even want to deal with a quorum of instances keeping the state etc. Doesn't seem to be trivial. Or do you have just one single instance? But how do you guarantee 100% uptime?
I strongly recommend using some kind of data store or at least distributed cache.
We perform maintenance on our server once a week.
Sometimes the customer wants us to change some settings that are already cached in the server.
My colleague always writes some JSP code to change these settings, which are stored in memory.
Is this a good approach?
If our project is not a web container, which tools can help me?
Usually, in my experience, the server configuration is not stored only in the server's memory:
What happens if, after a configuration change, the server is restarted or simply goes down for some system reason?
What happens if you have more than one instance of the same server to work on (a cluster of servers in other words)?
So, usually, people opt for various "externalized configuration" options, which range from file-based configuration plus a redeploy of the whole cluster on each configuration change, to configuration management servers (such as Consul or etcd). There are also some solutions that came from (and are used in) the Java world: Apache ZooKeeper and Spring Cloud Config Server, to name a few. In addition, it is sometimes convenient to store the configuration in a database.
Now to your question: If your project is not a web container and you don't care that configuration will "disappear" after a server restart and you're not running a distributed cluster of servers, then, using JSP indeed doesn't seem appropriate in this case.
Maybe you should take a look at JMX (Java Management Extensions), which has a built-in solution for this, so you will probably be able to get rid of the web container (which, from your description, your team seems to use only for these JSP modifications).
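As a rough illustration of the JMX route, the sketch below exposes one in-memory setting as a standard MBean on the platform MBean server. The setting `MaxUploadMb`, the class names and the object name `com.example.app:type=Settings` are made up for the example; only the `javax.management` calls are real API.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical holder for settings that live only in memory.
final class JmxSettingsDemo {

    // Standard MBean convention: the interface is public and is named
    // <implementation class name> + "MBean".
    public interface SettingsMBean {
        int getMaxUploadMb();
        void setMaxUploadMb(int value);
    }

    public static final class Settings implements SettingsMBean {
        private volatile int maxUploadMb = 10;

        @Override public int getMaxUploadMb() { return maxUploadMb; }
        @Override public void setMaxUploadMb(int value) { this.maxUploadMb = value; }
    }

    // Registers the settings object on the platform MBean server and returns its name.
    static ObjectName register(Settings settings) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("com.example.app:type=Settings"); // assumed name
        if (server.isRegistered(name)) {
            server.unregisterMBean(name); // keeps the sketch re-runnable
        }
        server.registerMBean(settings, name);
        return name;
    }
}
```

Once registered, the `MaxUploadMb` attribute can be changed at runtime from a JMX client such as JConsole, with no JSP involved. The change is still lost on restart, so this does not replace externalized configuration.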
You basically need an in-memory cache. There are multiple solutions in the other answers, which include creating your own implementation or using an existing Java library. You can also get the data from the database and add a cache over the database layer.
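For the "own implementation" option, a minimal sketch might look like the following: a map whose entries expire after a fixed time and are reloaded on the next read. The class name `ExpiringCache` and the injectable clock are assumptions made so the example is self-contained and testable; an existing caching library would cover the same ground with more care.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical minimal cache: entries are reloaded once they are older than ttlMillis.
final class ExpiringCache<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Returns the cached value, or loads and stores it if missing or expired.
    // Two threads that miss at the same time may both call the loader.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || entry.expiresAtMillis <= now) {
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }

    // Drops everything, e.g. after an admin changes the underlying data.
    void clear() {
        entries.clear();
    }
}
```

In production code the clock would be `System::currentTimeMillis`, and the loader would be the database query.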
I'm trying to find the best indexing solution for implementing a search-engine in my clustered webapp, and I cannot find a clear answer to my questions in official documentations.
My Java/Java EE backend will be deployed among several load-balanced instances. The search-engine will require near-real-time availability of indexed data (i.e. less than 5 seconds between the indexation and the retrievability).
Hibernate Search can work in a clustered environment with JGroups, but the documentation also says that near-real-time, as a tradeoff, requires a non-clustered and non-shared index.
Does that mean that NRTIndexManager cannot be used in a JGroups Slave/Master setup, i.e. that it can only be used with a single node?
Does that mean that with such a setup, the availability of indexed data depends only on the refresh period (period of index copy to slave nodes) ?
With the standard IndexManager, you only see the latest changes when they are written to the disk and you reopen your IndexSearcher.
By default, Hibernate Search writes to disk and opens a new IndexSearcher for each query so you're sure your searches are always in sync with your database.
The NRTIndexManager is different from the standard one because it allows you to search on the latest changes indexed without an explicit write on disk. It's typically used when you need a high throughput and you can't write everything on the disk right away. So it's not really correlated to the fact that you will see your changes right away or not: it's an optimization when you can allow some index data loss - the latest changes might be lost.
As mentioned in the documentation here http://docs.jboss.org/hibernate/search/5.5/reference/en-US/html_single/#jgroups-backend , you can use a synchronous JGroups backend, with Hibernate Search blocking until all the indexes are in sync. So it can work for your case.
Note that we are currently working on an Elasticsearch backend for 5.6, which might be of some interest to you as it's typically designed for your case. It's still in beta but it's already in pretty good shape. You might want to take a look at it: http://docs.jboss.org/hibernate/search/5.6/reference/en-US/html/ch11.html .
What are the possibilities to distribute data selectively?
I'll explain my question with an example.
Consider a central database that holds all the data. This database is located in a certain geographical location.
Application A needs a subset of the information present in the central database. Also, application A may be located in a geographical location different (and maybe far) from the one where the central database is located.
So, I thought about creating a new database at the same location of application A that would contain a subset of information of the central database.
Which technology/product allows me to deploy such a configuration?
Thanks
Look for database replication. SQL Server can do this for sure, others (Oracle, MySQL, ...) should have it, too.
The idea is that the other location maintains a (subset) copy. Updates are exchanged incrementally. The way to treat conflicts depends on your application.
Most major database software, such as MySQL and SQL Server, can do the job, but it is not a good model. As the application grows (traffic and users), not only will you create load on the central database server (which might be serving other applications), but you will also be abusing your network bandwidth to transfer data between the faraway database and the application server.
A better model is to keep your data close to the application server, and use the faraway database for backup and recovery purposes only. You can use an FC\IP SAN (or any other storage network architecture) as your storage network model, based on your applications' needs.
One big question that you didn't address is if Application A needs read-only access to the data or if it needs to be read-write.
The immediate concept that comes to mind when reading your requirements is sharding. In MySQL, this can be accomplished with partitioning. That being said, before you jump into partitions, make sure you read up on their pros and cons. There are instances where partitioning can slow things down if your indexes are not well chosen, or your partitioning scheme is not well thought out.
If your needs are read-only, then this should be a fairly simple solution. You can use MySQL in a Master-Slave context, and use App A off a slave. If you need read-write, then this becomes much more complex.
Depending on your write needs, you can split your reads to your slave, and your writes to the master, but that significantly adds complexity to your code structure (need to deal with multiple connections to multiple dbs). The advantage of this kind of layout is that you don't need to have complex DB infrastructure.
On the flip side, you can keep your code as is, and use Master-Master replication in MySQL. Although not officially supported by Oracle, a lot of people have had success with this. A quick Google search will find you a huge list of blogs, howtos, etc. Just keep in mind that your code has to be properly written to support this (e.g. you cannot use auto-increment fields for PKs, etc).
If you have cash to spend, then you can look at some of the more commercial offerings. Oracle DB and SQL Server both support this.
You can also use block-based data replication, such as DRBD (and MySQL on DRBD), to handle the replication between your nodes, but the problem you will always encounter is what happens if the link between the two nodes fails.
The biggest issue you will encounter is how to handle conflicting updates in 2 separate DB nodes. If your data is geographically dependent, then this may not be an issue for you.
Long story short, this is not an easy (or inexpensive) problem to resolve.
It's important to address the possibility of conflicts at the design phase anytime you are talking about replicating databases.
Moving on from that, SAP's Sybase Replication Server will allow you to do just that, either with Sybase databases or with third-party databases.
In Sybase's world this is frequently called a corporate roll-up environment. There may be multiple geographically separated databases, each with a subset of data over which it has primary control. At the HQ, there is a server that contains all the various subsets in one repository. You can choose to replicate whole tables, or replicate based on values in individual rows/columns.
This keeps the databases in a loosely consistent state. Transaction rates, geographic separation, and the latency inherent to the network will affect how quickly updates move from one database to another. If a network connection is temporarily down, Sybase Replication Server will queue up transactions and send them as soon as the link comes back up, but the reliability and stability of the replication system will be affected by the stability of the network connection.
Again, as others have stated, it's not cheap, but it's relatively straightforward to implement and maintain.
Disclaimer: I have worked for Sybase, and am still part of the SAP family of companies.
I have an administrative page in a web application that resets the cache, but it only resets the cache on the current JVM.
The web application is deployed as a cluster to two WAS servers.
Any way that I can elegantly have the "clear cache" button on each server trigger the method on both JVMs?
Edit:
The original developer just wrote a singleton holding a HashMap to be the cache in question. Lightweight and (previously) worked just fine for the requirements. It caches content pulled from six or seven web services for specified amounts of time.
Edit:
The entire application in question is three pages, so the elegant solution might well be the lightest solution.
Since the Cache is internal to your application you are going to need to provide an interface to clear it within your application. Quotidian says to use a JMS queue. That will not work because only one instance will pick up the message assuming you have clustered MQ Queues.
If you want to reuse the same implementation then the only way to do this is to write some JMX that you will be able to interact with at the instance level.
If not, you can either use the built-in WAS cache (which is JMX-enabled) or use a distributed cache like Ehcache.
In the past I have created a subclassed LinkedHashMap that was linked to all instances on the network using JBoss JGroups. Of course, reinventing the wheel is always painful.
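If you keep the existing singleton cache, the JMX route could be sketched as below: each JVM registers an MBean with a `clear` operation, and the admin page invokes that operation on every instance. The names `CacheControl`, `clearEverywhere` and the object name in the usage are hypothetical. How the remote `MBeanServerConnection` objects are obtained is left out; with plain JMX they would come from `JMXConnectorFactory.connect(...)`, and WebSphere has its own admin client for this.

```java
import java.util.List;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

final class ClusterCacheDemo {

    // Standard MBean convention: public interface named <class name> + "MBean".
    public interface CacheControlMBean {
        void clear();
        int getSize();
    }

    // Wraps the existing in-memory cache (e.g. the singleton's HashMap).
    public static final class CacheControl implements CacheControlMBean {
        private final Map<String, Object> cache;

        public CacheControl(Map<String, Object> cache) {
            this.cache = cache;
        }

        @Override public void clear() { cache.clear(); }
        @Override public int getSize() { return cache.size(); }
    }

    // Invokes "clear" on the MBean in every JVM; returns how many calls succeeded.
    static int clearEverywhere(List<? extends MBeanServerConnection> connections,
                               ObjectName name) {
        int cleared = 0;
        for (MBeanServerConnection connection : connections) {
            try {
                connection.invoke(name, "clear", new Object[0], new String[0]);
                cleared++;
            } catch (Exception e) {
                // Instance unreachable or MBean not registered: skip it and let
                // the caller compare the count with connections.size().
            }
        }
        return cleared;
    }
}
```

For a three-page application this may still be more machinery than the problem deserves, but it reuses the existing cache as is.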
I tend to use a JMS queue for doing exactly that.