Cassandra setup in 3 data-center (dc1, dc2 & dc3) forming a cluster
Running a Java Application on dc1.
dc1 application has Cassandra connectors pointed to dc1 (ips of cassandra in dc1 alone given to the application)
turning off the dc1 cassandra nodes application throws exception in application like
All host(s) tried for query failed (no host was tried)
More Info:
cassandra-driver-core-3.0.8.jar
netty-3.10.5.Final.jar
netty-buffer-4.0.37.Final.jar
netty-codec-4.0.37.Final.jar
netty-common-4.0.37.Final.jar
netty-handler-4.0.37.Final.jar
netty-transport-4.0.37.Final.jar
Keyspace : Network topology
Replication : dc1:2, dc2:2, dc3:2
Cassandra Version : 3.11.4
Here are some things I have found out with connections and Cassandra (and BTW, I believe Cassandra has one of the best HA configurations of any database I've worked with over the past 25 years).
1) Ensure you have all of the components specified in your connection connection. Here is an example of some of the connection components, but there are others as well (maybe you've already done this):
cluster = Cluster.builder()
.addContactPoints(nodes.split(","))
.withCredentials(username, password)
.withPoolingOptions(poolingOptions)
.withLoadBalancingPolicy(
new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder()
.withLocalDc("MYLOCALDC")
.withUsedHostsPerRemoteDc(1)
.allowRemoteDCsForLocalConsistencyLevel()
.build()
)
).build();
2) Unless the entire DC you're "working in" is down, you could receive errors. Cassandra doesn't fail over to alternate DCs unless every node is down in the DC. If less than all nodes are down and your client can't satisfy the client CL settings, you will receive errors. I was actually hoping, when I did testing a while back, that if you couldn't achieve client CL in the LOCAL DC (even if some nodes in the current DC were up) and alternate DCs could, that it would automatically fail over, but this is not the case (since I last tested).
Maybe that helps?
-Jim
Related
I'm using Spring and Ignite Spring to run a pretty simple cluster consisting of one server node and multiple client nodes with varying domain logic.
Ignite configuration:
All nodes connect via TcpCommuncationSpi, TcpDiscoverySpi and TcpDiscoveryVmIpFinder. TcpDiscoveryVmIpFinder.setAddresses only contain the server node.
The server node has serveral CacheJdbcPojoStore configured, the database data is loaded after calling Ignition.start(cfg) using ignite.cache("mycachename").loadCache(null).
After a client node is connected to the server node, it does some domain specific checks to verify data integrity. This works very well if I first start the server node, wait for all data to be loaded and then start the client nodes.
My problem: if I first start the client nodes, they can't connect to the server node as it is not yet started. They patiently wait for the server node to come up. If I now start the server node, the client nodes connect directly after Ignition.start(cfg) is done on the server node, but BEFORE the CacheJdbcPojoStores are done loading their data. Thus the domain specific integrity checks do fail as there is no data present in the caches yet.
My goal: I need a way to ensure the client nodes are only able to connect to the server node AFTER all data is loaded in the server node. This would simply the deployment process as well as local development a lot, as there would be no strict ordering of starting the nodes.
What I tried so far: Fiddling around with ignite.cluster().state(ClusterState.INACTIVE) as well as setting up a manually controlled BaselineTopology. Sadly, both methods simple declare the cluster as not being ready yet and thus I can't even create the caches, let alone load data.
My question: Is there any way to achieve either:
hook up into the the startup process of the server node so I can load the data in BEFORE it joins the cluster
Declare the server node "not yet ready" until the data is loaded.
Use one of the available data structures like AtomicLong or Semaphore to simulate readiness state. There is no need for internal hooks.
Added an example below:
Client:
while (true) {
// try get "myAtomic" and check its value
IgniteAtomicLong atomicLong = ignite.atomicLong("myAtomic", 0, false);
if (atomicLong != null && atomicLong.get() == 1) {
// initialization is complete
break;
}
// not ready
Thread.sleep(1000);
}
Server:
...
//loading is complete, create atomic sequence and set its value to 1
ignite.atomicLong("myAtomic", 1, true);
...
UPDATE#2
instead of having a self-written-loop including Thread.sleep, latch.await() can be called. See: countdownlatch
I'm using Hazelcast Cache for my application.
I have two nodes of Jboss on two different Machines.
Each nodes have two deployments.
Each deployment file has their own hazelcast cache.
I want to cluster between two nodes for each application and below is my configurations,
Config config = new Config();
config.setClusterName("uniqueClusterName");
config.getNetworkConfig().getJoin().getTcpIpConfig().addMember("10.100.101.82,10.100.101.83").setEnabled(true);
config.getNetworkConfig().getJoin().getMulticastConfig().setEnabled(false);
manager = Hazelcast.newHazelcastInstance(config);
My above configuration is working fine and both the nodes are making cluster on each application.
But I have found below logs, and these logs are printing continuously
INFO [com.hazelcast.internal.cluster.impl.operations.SplitBrainMergeValidationOp] (hz.cocky_jackson.priority-generic-operation.thread-0) [10.100.101.82]:5702 [losce_qa] [4.1] Ignoring join check from [10.100.101.83]:5702, because this node is not master...
INFO [com.hazelcast.internal.cluster.impl.operations.SplitBrainMergeValidationOp] (hz.hungry_hofstadter.priority-generic-operation.thread-0) [10.100.101.82]:5701 [losce_qa] [4.1] Ignoring join check from [10.100.101.83]:5702, because this node is not master...
INFO [com.hazelcast.internal.cluster.impl.operations.SplitBrainMergeValidationOp] (hz.cocky_jackson.generic-operation.thread-1) [10.100.101.82]:5702 [losce_qa] [4.1] Ignoring join check from [10.100.101.83]:5702, because this node is not master...
Any work around? How to avoid these logs or I'm doing something wrong here?
TIA
Two clusters sharing the same hardware isn't ideal, as they contend for machine resources.
But if you do, you don't want them clashing, which is what will happen with the default port allocation. The default being to try to listen on port 5701, if this is busy try 5702 and so on. And to try to find other cluster members assuming they are on 5701 also.
To make it work:
(1) Give them unique names, as you've done
config.setClusterName("uniqueClusterName");
&
config.setClusterName("uniqueClusterName2");
As they have different cluster names, members from one cluster won't be able to
join the other. This won't stop them trying, which is causing unwanted log messages.
(2) Assign predictable ports
Try
config.getNetworkConfig().setPort(6701);
&
config.getNetworkConfig().setPort(7701);
They will both try to find ports starting from different offsets, which will allow for predictability.
Without this, both clusters will try to use the default 5701 as the first port, and whichever cluster starts first will success.
With this, the first cluster's member will try and should succeed to get 6701. The second cluster's member will try and should succeed to get 7701.
(3) Specify addresses and ports for connectivity attempts
Try
config.getNetworkConfig().getJoin().getTcpIpConfig()
.addMember("10.100.101.82:6701,10.100.101.83:6701")
and
config.getNetworkConfig().getJoin().getTcpIpConfig()
.addMember("10.100.101.82:7701,10.100.101.83:7701")
I want to give multiple Cassandra endpoints from the config file to my Java application.
Ex:
cassandra host: "host1, host2"
I tried addContactPoints(host), but it did not work. If one of the Cassandra node goes down, I don't want my application to go down.
cluster = Cluster.builder()
.withClusterName(cassandraConfig.getClusterName())
.addContactPoints(cassandraConfig.getHostName())
.withSocketOptions(new SocketOptions().setConnectTimeoutMillis(30000).setReadTimeoutMillis(30000))
.withPoolingOptions(poolingOptions).build();
The java driver is resilient to one of the contact points provided not being available. Contact points are used for establishing an initial connection [*]. As long as the driver is able to communicate with one contact point, it should be able to query the system.peers and system.local table to discover the rest of the nodes in the cluster.
* They are also added to a list of initial hosts in the cluster, but typically the contact points provided map to a node in the system.peers table.
For some reasons I need to query a particular datacenter within my cassandra cluster. According to the documentation, I can use the LOCAL_QUORUM consistency level:
Returns the record after a quorum of replicas in the current
datacenter as the coordinator has reported. Avoids latency of
inter-datacenter communication.
Do I correctly understand, that in order to specify a particular datacenter for the current query, I have to build a cluster on the given endpoint belonging to this particular DC?
Say, I have two DC's with the following nodes:
DC1: 172.0.1.1, 172.0.1.2
DC1: 172.0.2.1, 172.0.2.2
So, to work with DC1, I build a cluster as:
Cluster cluster = Cluster.builder().addContactPoint("172.0.1.1").build();
Session session = cluster.connect();
Statement statement = session.prepare("select * from ...").bind().setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
ResultSet resultSet = session.execute(session);
Is it a proper way to do that?
By itself, DCAwwareRoundRobinPolicy will pick the data center that it finds with the "least network distance" algorithm. To ensure it connects where you want, you should specify the DC as a parameter.
Here is how I tell our dev teams to do it:
Builder builder = Cluster.builder()
.addContactPoints(nodes)
.withQueryOptions(new QueryOptions()
.setConsistencyLevel(ConsistencyLevel.LOCAL_ONE))
.withLoadBalancingPolicy(new TokenAwarePolicy(
new DCAwareRoundRobinPolicy.Builder()
.withLocalDc("DC1").build()))
.withPoolingOptions(options);
Note: this may or may not be applicable to your situation, but do I recommend using the TokenAwarePolicy with the DCAwareRoundRobin nested inside it (specifying the local DC). That way any operation specifying the partition key will automatically route to the correct node, skipping the need for an extra hop required with a coordinator node.
According to the Cluster class documentation:
A cluster object maintains a permanent connection to one of the
cluster nodes which it uses solely to maintain information on the
state and current topology of the cluster
Also, because a default load balancing policy is DCAwareRoundRobinPolicy this approach should work fine as expected.
I have setup a Cassandra cluster with two nodes recently. The replication factor is set to 2 and they both seem to be working well if both the nodes are turned on.
Now how can I use hector in such a way so that it keeps working as far as atleast one node is up? As of now I have something like following.
CassandraHostConfigurator cassandraHostConfigurator = new CassandraHostConfigurator(
"localhost:9160,xx.xx.13.22:9160");
cassandraHostConfigurator.setMaxActive(20);
cassandraHostConfigurator.setMaxIdle(5);
cassandraHostConfigurator.setCassandraThriftSocketTimeout(3000);
cassandraHostConfigurator.setMaxWaitTimeWhenExhausted(4000);
Cluster cluster = HFactory.getOrCreateCluster("structspeech",
cassandraHostConfigurator);
Keyspace keyspace = HFactory.createKeyspace("structspeech", cluster);
....
Let's say if host xx.xx.13.22 goes down then I am getting the following message in my console and all my inserts are failing untill that node comes up.
Downed xx.xx.13.22(xx.xx.13.22):9160 host still appears to be down: Unable to open transport to xx.xx.13.22(xx.xx.13.22):9160 , java.net.ConnectException: Connection refused: connect
This is how my keyspace is defined
update keyspace structspeech with placement_strategy =
'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options =[{replication_factor:2}];
I am sure I am missing something very trivial, any help will be greatly appreciated.
Thanks
By default Hector uses a consistency level of Quorum so if one of your nodes is down this level cannot be satisfied.
When RF = 2 quorum means you need to read and write to both nodes, so if one of them is down you can't execute.
Here's a nice online tool that demonstrates NRW (N = replication factor, R = read consistency and W = write consistency) http://www.ecyrd.com/cassandracalculator/
To change the consistency level while writing/reading use, for example AllOneConsistencyLevelPolicy HFactory.createKeyspace(String, Cluster, ConsistencyLevelPolicy)
What consistency level are you using when you insert? If you are writing at QUORUM or ALL, you need both nodes to be up to write with a replication factor of 2 (a quorum for 2 nodes is 2, that's why typical cassandra clusters use an odd number for replication factor)