Upgrade Hazelcast 3.5.* to a newer version without losing data - java

From the Hazelcast official documentation, rolling upgrade is supported starting from version 3.8.
Provided my server version is 3.5, is there a way to create a successful cluster with new boxes running newer versions of Hazelcast?
Naively upgrading to 3.6.* resulted in two separate clusters: the old boxes still running 3.5, and a new cluster of 3.6 boxes that obviously has no data, since it was never able to join the existing members.
My deployment process is as follows:
create a new set of boxes
remove the existing boxes one by one
repeat with a second batch of boxes
My thoughts have gone towards storing a snapshot on disk or in a DB and remounting the partition / loading from the DB at rollout time, but this might not even be supported, and I'm hopeful there is a better way.

What data structures do you use? For IMaps, ICaches and ILists, you can use Hazelcast Jet: it connects to the old cluster and pumps the data into the new cluster.
This works as long as your new cluster is on a 3.x version; 3.x -> 4.x isn't possible this way. Use a Jet 3.x version for it.
See https://docs.hazelcast.org/docs/jet/3.2.2/manual/manual.html#connector-imdg
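Below is a minimal sketch of such a copy job, assuming the Jet 3.x pipeline API (drawFrom/drainTo) and the remote map source/sink connectors; the group names, addresses and map name are placeholders, not taken from the question:

    import com.hazelcast.client.config.ClientConfig;
    import com.hazelcast.jet.Jet;
    import com.hazelcast.jet.JetInstance;
    import com.hazelcast.jet.pipeline.Pipeline;
    import com.hazelcast.jet.pipeline.Sinks;
    import com.hazelcast.jet.pipeline.Sources;

    public class ClusterCopyJob {
        public static void main(String[] args) {
            // Client config pointing at the old cluster (group name and address are placeholders)
            ClientConfig oldCluster = new ClientConfig();
            oldCluster.getGroupConfig().setName("old-cluster");
            oldCluster.getNetworkConfig().addAddress("10.0.0.1:5701");

            // Client config pointing at the new cluster
            ClientConfig newCluster = new ClientConfig();
            newCluster.getGroupConfig().setName("new-cluster");
            newCluster.getNetworkConfig().addAddress("10.0.0.2:5701");

            // Read every entry of the remote IMap and write it to the same map on the new cluster
            Pipeline p = Pipeline.create();
            p.drawFrom(Sources.remoteMap("my-map", oldCluster))
             .drainTo(Sinks.remoteMap("my-map", newCluster));

            JetInstance jet = Jet.newJetInstance();
            try {
                jet.newJob(p).join();
            } finally {
                Jet.shutdownAll();
            }
        }
    }

You would run one such job per map (or parameterize the map name) while both clusters are still up, then cut traffic over to the new cluster.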

Related

Objectify v5 and v6 at the same time in a google app engine java 8 standard project

We want to do a zero downtime migration of a google app engine java 8 standard project to another region.
Unfortunately google does not support this, so it has to be done manually.
One could export the datastore and import it again, but there must be no downtime and the data must always remain consistent.
So the idea came up to create the project in the new region, and embed objectify 5 there with all entities (definitions, not data) used in the old project. Any new data goes in the "new datastore" attached to this new project.
All data not found on this new datastore shall be queried (if necessary) using objectify 6 connected to the "old" project using datastore api.
The advantage would be not having to export any data manually at all and only migrating the most important data on the fly, using the mechanism above (there's a lot of unused garbage we never did housekeeping for, but also some very vital data that must end up on the new system).
Is this a valid approach? I know I'll probably have to integrate Objectify at the source level and change package names to avoid conflicts on the "code side".
If there is a better approach to migrating a project to another region, we're happy to hear it.
We searched for hours without finding a proper solution.
Edit: I'm aware that we must instantly stop requests to the old service / disable writes there. We'd solve this by redirecting traffic (HTTP) from the old project to the new one and disabling writes.
This is a valid approach for the migration: traffic hitting the new project can continue to do reads from the old Datastore and writes to the new one. I would like to add one more point.
However, soon after this switchover you should also plan a data migration from the old Datastore to the new one through a mass export and import. The app will then have to be pointed at the new Datastore even for reads. https://cloud.google.com/datastore/docs/export-import-entities
This can be done gracefully by introducing proxy connection logic in Java for the new Datastore: during the data migration, add a condition in OFY6 that first checks the new Datastore for the entity and, if it is not found, reads the data from the old one. This ensures zero downtime, and in the background you can silently and safely turn off the old Datastore, assuming you already have its full export.
Reads from both the old data source and new data source is a valid way to do migrations.
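A minimal sketch of that fallback read, assuming the two Objectify/Datastore sessions (one per project) are wired up elsewhere; Customer and oldProjectOfy() are hypothetical names, not from the question:

    import com.googlecode.objectify.Objectify;
    import com.googlecode.objectify.ObjectifyService;

    public abstract class CustomerRepository {

        // Supplied by the application: a session connected to the old project's Datastore
        protected abstract Objectify oldProjectOfy();

        public Customer load(long id) {
            // 1. Try the new project's Datastore first
            Customer c = ObjectifyService.ofy().load().type(Customer.class).id(id).now();
            if (c != null) {
                return c;
            }
            // 2. Fall back to the old project's Datastore
            c = oldProjectOfy().load().type(Customer.class).id(id).now();
            if (c != null) {
                // 3. Optionally copy it into the new Datastore so the next read is served locally
                ObjectifyService.ofy().save().entity(c).now();
            }
            return c;
        }
    }

The on-the-fly copy in step 3 is what lets you later switch reads entirely to the new Datastore once the bulk export/import has caught up.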

Elastic Search Stable Version

I'm new to Elasticsearch. We are building a Spring Boot application with Elasticsearch.
Currently, we are bound to use Spring Boot 2.1.3.RELEASE.
After a lot of R&D, I've decided to use elasticsearch-rest-high-level-client to integrate Elasticsearch with my Spring Boot application.
I'm thinking of setting up the latest Elasticsearch version, 7.7.1.
Is it fine to proceed with the latest version, or should I go with a previous version of Elasticsearch for any reason?
As you are just starting a new project, I would recommend going with a stable version, so at least you can be sure that problems in your application do not come from Elasticsearch itself.
Later in the development process, as you analyse the changelog between releases, you can decide whether you need the fixes and new features of a newer release and test the upgrade in a separate branch.
I suspect that by your first go-live you will end up with a completely different Spring release and a different Elastic Stack release anyway, if you are building something substantial.
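Whichever Elasticsearch version you settle on, the usual advice is to keep the elasticsearch-rest-high-level-client dependency on the same (or a compatible, not older) version as the cluster; with Spring Boot 2.1.x this typically means overriding the managed elasticsearch.version property. A minimal sketch of creating the client (host and port are placeholders):

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;

    public class EsClientFactory {

        // Build a high-level REST client against a single node; add more HttpHosts for a real cluster
        public static RestHighLevelClient create() {
            return new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")));
        }
    }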

Replicating Couchbase to ElasticSearch (w/ multiple indices)

Currently we're using Couchbase and Elasticsearch (2.x) and replicating data from CB to ES successfully using the elasticsearch-transport-couchbase plugin.
The problems began while upgrading to ES 5.6.4. Up until now we used a single index in ES, and because Elasticsearch no longer recommends this approach, we are now trying to create multiple indices in ES (one index per type).
That means that we need a way to replicate data from CB (A single bucket) to ES (multiple indices).
What is the best way to approach this?
Possible solutions:
Continue using the elasticsearch-transport-couchbase plugin, but then we'll have to create a lot (~150) of XDCR replications, one replication per type. I doubt this will scale.
Write our own solution using Spark or Kafka (neither of them is in our technology stack, so implementation would take time; not the most favourable solution).
Any help would be appreciated.
Version 4 of the Couchbase Elasticsearch Connector supports the new "index-per-type" model (and other features, including support for ES 6, secure connections, and replication checkpoint management tools). If you'd like to try it out, your feedback would be invaluable.
Disclaimer: I am a Couchbase employee developing the Elasticsearch connector.

CouchbaseClient VS CouchbaseCluster

I am trying to implement couchbase in my application.
I am confused with
com.couchbase.client.CouchbaseClient
AND
com.couchbase.client.java.CouchbaseCluster.
I tried to google CouchbaseClient vs CouchbaseCluster but couldn't find which one is better, or their pros and cons.
I know there are three types of Couchbase clients; one is vBucket-aware, another is the traditional old client which supports auto clustering via a Moxi server.
Can someone who has already used Couchbase provide me with links or detailed information about these two Java clients?
I have done some homework on CouchbaseClient and CouchbaseCluster, such as inserting, updating and deleting documents via both.
With CouchbaseClient the stored documents are serialized and you cannot view or edit them via the Couchbase Admin Console, whereas documents such as StringDocument, JsonDocument and JsonArrayDocument stored via CouchbaseCluster can be viewed and edited in the Couchbase Admin Console.
My requirement is a Couchbase client that is auto-configurable (vBucket-aware): if I add new nodes to a cluster it should detect them automatically, and if any node fails it should detect that too and not throw any exception. Further, if I add a new cluster, I'd like it to be auto-detected and used. I don't want to modify the application code for any of this.
There are now two generations of official Couchbase Java SDKs:
generation 1 (currently 1.4.x, not sure of the patch version) is derived from an old Memcached client, Spymemcached... it is now bugfix-only, and it's the one where CouchbaseClient is the primary API.
generation 2 is a rewrite, layered into a core artifact and java-client artifact in Maven. Current version is 2.1.3. This is the one where you deal with CouchbaseCluster.
In the old one, you'd have to instantiate one CouchbaseClient for each bucket you deal with.
In the new generation, the notions of cluster and bucket are first-class citizens and you can (and should) reuse the same Cluster instance to open references to different Buckets. The Buckets should also be reused (don't open the same bucket several times); resources are better shared this way.
Also, the new generation has more coherent APIs, uses RxJava for asynchronous processing, etc... It is cluster-aware and will get updates of the topology of the cluster (new nodes, failing nodes, etc...).
Note that the two generations are different artifacts in Maven (the old one is couchbase-client while the new one is java-client).
There's no way you can get such a notification if you "add new cluster", but that operation doesn't really make sense to me...
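For reference, here is a minimal sketch of the generation-2 (java-client 2.x) API described above; host names, bucket name and document contents are placeholders:

    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.Cluster;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.document.JsonDocument;
    import com.couchbase.client.java.document.json.JsonObject;

    public class CouchbaseExample {

        public static void main(String[] args) {
            // One Cluster instance per application; it tracks the topology and follows node changes
            Cluster cluster = CouchbaseCluster.create("node1.example.com", "node2.example.com");
            // Open each bucket once and reuse the reference
            Bucket bucket = cluster.openBucket("default");

            // JsonDocuments are stored as JSON, so they stay viewable and editable in the Admin Console
            bucket.upsert(JsonDocument.create("user::1", JsonObject.create().put("name", "alice")));
            JsonDocument loaded = bucket.get("user::1");
            System.out.println(loaded.content());

            cluster.disconnect();
        }
    }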

Replicating a standalone HBase 0.2 deployment

As far as I'm aware the current stable release of HBase, 0.2, does not support replication, although it is being built into the next version.
How would you recommend replicating a standalone (non-distributed) deployment of HBase (0.2) ?
I want the secondary instance to be used as a working backup i.e. read-only. I can afford asynchronous backups with "eventual consistency", and a small amount of loss (the data is non-critical).
So far my only thought was to manually update the secondary instance, asynchronously, after writing to the primary instance.
HBase natively tolerates node failure/failover (assuming that you are running on HDFS), so it's not really necessary to maintain a replica like you would with an RDBMS.
What's wrong with just using HDFS replication?
EDIT: In this case, you would switch from standalone to distributed, and just have 2 nodes with a replication factor of 2.
