I'm testing Couchbase Server 2.5. I have a cluster with 7 nodes and 3 replicas. Under normal conditions, the system works fine.
But I failed with this test case:
The Couchbase cluster is serving 40,000 ops and I stop the Couchbase service on one server, so one node is down. After that, the performance of the entire cluster drops painfully; it can only serve fewer than 1,000 ops. When I click fail-over, the entire cluster returns to a healthy state.
I thought that when a node goes down, only some of the requests would be affected. Is that right?
Or does one node going down really make a big impact on the entire cluster?
Update:
I wrote a load-test tool using spymemcached. The tool creates multiple threads that connect to the Couchbase cluster. Each thread sets a key and immediately gets it back to verify it; if that succeeds, it continues with the next set/get. If it fails, the thread retries the set/get and skips the key after 5 failed attempts.
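For context, this is roughly what each worker thread does (a minimal sketch assuming a plain spymemcached MemcachedClient pointed at the cluster; the key/value names and the MAX_RETRIES constant are illustrative):

import java.util.concurrent.TimeUnit;
import net.spy.memcached.MemcachedClient;

public class SetGetWorker implements Runnable {
    private static final int MAX_RETRIES = 5;   // skip a key after 5 failed attempts
    private final MemcachedClient client;

    public SetGetWorker(MemcachedClient client) {
        this.client = client;
    }

    public void run() {
        for (long i = 0; ; i++) {
            String key = "test_key_" + i;
            String value = "payload_" + i;
            for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
                try {
                    // Set, then Get immediately to verify, as described above
                    boolean stored = client.set(key, 0, value).get(2, TimeUnit.SECONDS);
                    if (stored && value.equals(client.get(key))) {
                        break;                   // success: move on to the next key
                    }
                } catch (Exception e) {
                    // timed-out / cancelled operations end up here when a node is down
                }
                // otherwise retry; after MAX_RETRIES the key is skipped
            }
        }
    }
}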
This is the log for a key whose set/get failed:
2014-04-16 16:22:20.405 INFO net.spy.memcached.MemcachedConnection: Reconnection due to exception handling a memcached operation on {QA sa=/10.0.0.23:11234, #Rops=2, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 2660829 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800, topWop=null, toWrite=0, interested=1}. This may be due to an authentication failure.
OperationException: SERVER: Internal error
at net.spy.memcached.protocol.BaseOperationImpl.handleError(BaseOperationImpl.java:192)
at net.spy.memcached.protocol.binary.OperationImpl.getStatusForErrorCode(OperationImpl.java:244)
at net.spy.memcached.protocol.binary.OperationImpl.finishedPayload(OperationImpl.java:201)
at net.spy.memcached.protocol.binary.OperationImpl.readPayloadFromBuffer(OperationImpl.java:196)
at net.spy.memcached.protocol.binary.OperationImpl.readFromBuffer(OperationImpl.java:139)
at net.spy.memcached.MemcachedConnection.readBufferAndLogMetrics(MemcachedConnection.java:825)
at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:804)
at net.spy.memcached.MemcachedConnection.handleReadsAndWrites(MemcachedConnection.java:684)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:647)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:418)
at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1400)
2014-04-16 16:22:20.405 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=/10.0.0.23:11234, #Rops=2, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 2660829 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800, topWop=null, toWrite=0, interested=1}, attempt 0.
2014-04-16 16:22:20.406 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: Cmd: 1 Opaque: 2660829 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800
2014-04-16 16:22:20.406 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: Cmd: 0 Opaque: 2660830 Key: test_key_2681412
Cancelled
2014-04-16 16:22:20.407 ERROR net.spy.memcached.protocol.binary.StoreOperationImpl: Error: Internal error
2014-04-16 16:22:20.407 INFO net.spy.memcached.MemcachedConnection: Reconnection due to exception handling a memcached operation on {QA sa=/10.0.0.24:11234, #Rops=2, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 2660831 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800, topWop=null, toWrite=0, interested=1}. This may be due to an authentication failure.
OperationException: SERVER: Internal error
at net.spy.memcached.protocol.BaseOperationImpl.handleError(BaseOperationImpl.java:192)
at net.spy.memcached.protocol.binary.OperationImpl.getStatusForErrorCode(OperationImpl.java:244)
at net.spy.memcached.protocol.binary.OperationImpl.finishedPayload(OperationImpl.java:201)
at net.spy.memcached.protocol.binary.OperationImpl.readPayloadFromBuffer(OperationImpl.java:196)
at net.spy.memcached.protocol.binary.OperationImpl.readFromBuffer(OperationImpl.java:139)
at net.spy.memcached.MemcachedConnection.readBufferAndLogMetrics(MemcachedConnection.java:825)
at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:804)
at net.spy.memcached.MemcachedConnection.handleReadsAndWrites(MemcachedConnection.java:684)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:647)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:418)
at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:1400)
2014-04-16 16:22:20.407 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=/10.0.0.24:11234, #Rops=2, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 2660831 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800, topWop=null, toWrite=0, interested=1}, attempt 0.
2014-04-16 16:22:20.408 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: Cmd: 1 Opaque: 2660831 Key: test_key_2681412 Cas: 0 Exp: 0 Flags: 0 Data Length: 800
2014-04-16 16:22:20.408 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: Cmd: 0 Opaque: 2660832 Key: test_key_2681412
Cancelled
You should find that 6/7 (i.e. roughly 85%) of your operations continue to run at the same performance. However, the remaining ~15% of operations, which are directed at the vBuckets owned by the now-downed node, will never complete and will likely time out, so depending on how your application handles these timeouts you may see a greater overall performance drop.
How are you benchmarking / measuring the performance?
Update: OP's extra details
I wrote a load-test tool using spymemcached. The tool creates multiple threads that connect to the Couchbase cluster. Each thread sets a key and immediately gets it back to verify it; if that succeeds, it continues with the next set/get. If it fails, the thread retries the set/get and skips the key after 5 failed attempts.
The Java SDK is designed to make use of asynchronous operations for maximum performance, and this is particularly true when the cluster is degraded and some operations will time out. I'd suggest starting with a single thread but using Futures to handle the get after the set. For example:
client.set("key", document).addListener(new OperationCompletionListener() {
#Override
public void onComplete(OperationFuture<?> future) throws Exception {
System.out.println("I'm done!");
}
});
This is an extract from the Understanding and Using Asynchronous Operations section of the Java Developer guide.
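To carry the set-then-verify pattern from the load test over to this style, a rough sketch (using the same client and listener API as the extract above; the nesting of the callbacks is my illustration, not taken from the guide) could chain the get inside the set's completion callback:

import net.spy.memcached.internal.GetCompletionListener;
import net.spy.memcached.internal.GetFuture;
import net.spy.memcached.internal.OperationCompletionListener;
import net.spy.memcached.internal.OperationFuture;

client.set("key", document).addListener(new OperationCompletionListener() {
    @Override
    public void onComplete(OperationFuture<?> setFuture) throws Exception {
        if (!setFuture.getStatus().isSuccess()) {
            return;   // the set failed or timed out; record it and move on
        }
        // verify asynchronously instead of blocking a thread on the get
        client.asyncGet("key").addListener(new GetCompletionListener() {
            @Override
            public void onComplete(GetFuture<?> getFuture) throws Exception {
                System.out.println("verified: " + getFuture.getStatus().isSuccess());
            }
        });
    }
});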
Given the right code, there's essentially no reason why your performance with 85% of the nodes up shouldn't be close to 85% of the maximum for a short downtime.
Note that if a node is down for a long time, the replication queues on the other nodes will start to back up, and that can impact performance; hence the recommendation to use auto-failover / rebalance to get back to 100% active vBuckets and to re-create replicas so that any further node failures don't cause data loss.
Related
When the application has just started and is not yet warmed up (the JIT needs time), it cannot process the expected RPS.
The problem is in the incoming queue. While the IO thread keeps processing requests, many requests pile up in the queue and cannot be cleaned up by the GC. After the survivor generation overflows, the GC starts performing major pauses, which slows down request execution even more, and after some time the application dies with an OOM.
My application has a self-warming readinessProbe (3k random requests).
I tried to configure the thread counts and the queue size:
application.yml
micronaut:
  server:
    port: 8080
    netty:
      parent:
        threads: 2
      worker:
        threads: 2
  executors:
    io:
      n-threads: 1
      parallelism: 1
      type: FIXED
    scheduled:
      n-threads: 1
      parallelism: 1
      corePoolSize: 1
And some system properties:
System.setProperty("io.netty.eventLoop.maxPendingTasks", "16")
System.setProperty("io.netty.eventexecutor.maxPendingTasks", "16")
System.setProperty("io.netty.eventLoopThreads", "1")
But the queue keeps filling up.
I want to find some way to restrict the input queue size in Micronaut, so that the application does not fail under high load.
Below is a description of a problem we faced in production. Please note that I could not reproduce the issue in a test or local environment and therefore cannot provide you with test code.
We have a Hazelcast cluster with two members, M1 and M2, and three clients, C1, C2 and C3. The Hazelcast version is 3.9.
Clients use the IMap.tryLock() method with a timeout of 10 seconds. After acquiring the lock, critical and long-running operations are performed, and finally the lock is released using the IMap.unlock() method, roughly as sketched below.
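A stripped-down sketch of that pattern (the map and key names are placeholders):

import java.util.concurrent.TimeUnit;
import com.hazelcast.core.IMap;

public class CriticalSection {
    void process(IMap<String, Object> map, String key) throws InterruptedException {
        if (map.tryLock(key, 10, TimeUnit.SECONDS)) {   // wait up to 10 seconds for the lock
            try {
                // critical, long-running work guarded by the per-key lock
            } finally {
                map.unlock(key);                        // released by the same client that acquired it
            }
        } else {
            // lock not acquired within 10 seconds
        }
    }
}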
The problem that occurred in production is as follows:
At some time instant t, we first saw a heartbeat failure to M2 at client C2. Afterwards there were errors fetching the cluster partition table, caused by com.hazelcast.spi.exception.TargetDisconnectedException:
[hz.client_0.internal-2 ] WARN [] HeartbeatManager - hz.client_0 [mygroup] [3.9] HeartbeatManager failed to connection: .....
[hz.client_0.internal-3 ] WARN [] ClientPartitionService - hz.client_0 [mygroup] [3.9] Error while fetching cluster partition table!
java.util.concurrent.ExecutionException: com.hazelcast.spi.exception.TargetDisconnectedException: Heartbeat timed out to owner connection ClientConnection{alive=true, connectionId=1, ......
Around 250 ms after the initial heartbeat failure, the client gets disconnected and then reconnects within 20 ms.
[hz.client_0.cluster- ] INFO [] LifecycleService - hz.client_0 [mygroup] [3.9] HazelcastClient 3.9 (20171023 - b29f549) is CLIENT_DISCONNETED
[hz.client_0.cluster- ] INFO [] LifecycleService - hz.client_0 [mygroup] [3.9] HazelcastClient 3.9 (20171023 - b29f549) is CLIENT_CONNECTED
The problem we are having is that, for some keys previously locked by C2, clients C1 and C3 cannot acquire the lock even though it appears to have been released by C2. C2 itself can still get the lock, but that introduces unacceptable delays into the application. All clients should be able to acquire the lock once it has been released.
We were notified of the problem after receiving complaints, and then restarted the client application C2.
As documented at http://docs.hazelcast.org/docs/latest-development/manual/html/Distributed_Data_Structures/Lock.html, locks acquired by a restarted member (C2 in my case) seem to be removed after the restart.
Currently the issue seems to have gone away, but we are not sure whether it will recur.
Do you have any suggestions about the probable cause and, more importantly, do you have any recommendations?
Would enabling redo-operation in the client help in this case?
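For clarity, this is roughly what enabling redo operation on the client would look like (just to show the setting I mean; whether it helps with the stuck locks is exactly my question):

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

ClientConfig config = new ClientConfig();
config.getGroupConfig().setName("mygroup");           // group name as seen in the logs above
config.getNetworkConfig().setRedoOperation(true);     // retry operations after a lost connection
HazelcastInstance client = HazelcastClient.newHazelcastClient(config);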
As I tried to explain, the client seems to recover from the problem, but the keys remain locked in the cluster, and this is fatal for my application.
Thanks
It looks like the client lost ownership of the lock because of its disconnection from the cluster. You can use the IMap#forceUnlock API in cases such as the one you faced. It releases the lock regardless of the lock owner; it always unlocks successfully, never blocks, and returns immediately.
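A minimal sketch of that recovery step (the map name is a placeholder; deciding when it is safe to break the lock is up to your application):

import com.hazelcast.core.IMap;

IMap<String, Object> map = client.getMap("orders");   // the same map used for the regular locking

// Releases the lock for the key regardless of the current owner; it never blocks
// and returns immediately, so it can be used when the original owner (C2 here)
// has been disconnected and left the key locked.
map.forceUnlock(key);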
I set up a private Ethereum network with go-ethereum, and I configured EthereumJ to connect to the private network. I can see EthereumJ's information in go-ethereum's console:
> admin.peers
[{
    caps: ["eth/62", "eth/63"],
    id: "e084894a3b72e8a990710a8f84b2d6f99ac15c0a1d0d7f1a6510769633b64067f9c2df2074e920a4e46fc7d7eb1b211c06f189e5325f0856d326e32d87f49d20",
    name: "Ethereum(J)/v1.5.0/Windows/Dev/Java/Dev",
    network: {
      localAddress: "127.0.0.1:30303",
      remoteAddress: "127.0.0.1:18499"
    },
    protocols: {
      eth: {
        difficulty: 7746910281,
        head: "0x97568a8b38cce14776d5daee5169954f76007a79d7329f71e48c673e6e533215",
        version: 63
      }
    }
}]
>
Then I run the contract-deployment sample (CreateContractSample.java) while go-ethereum is mining on the private network, but I get the following output:
14:10:37.969 INFO [sample] [v] Available Eth nodes found.
14:10:37.969 INFO [sample] Searching for peers to sync with...
14:10:40.970 INFO [sample] [v] At least one sync peer found.
14:10:40.970 INFO [sample] Current BEST block: #10105 (0fc0c0 <~ e6a78f) Txs:0, Unc: 0
14:10:40.970 INFO [sample] Waiting for blocks start importing (may take a while)...
14:10:46.973 INFO [sample] [v] Blocks import started.
14:10:46.973 INFO [sample] Waiting for the whole blockchain sync (will take up to several hours for the whole chain)...
14:10:56.974 INFO [sample] [v] Sync complete! The best block: #10109 (90766e <~ 46ebc3) Txs:0, Unc: 0
14:10:56.974 INFO [sample] Compiling contract...
14:10:57.078 INFO [sample] Sending contract to net and waiting for inclusion
cd2a3d9f938e13cd947ec05abc7fe734df8dd826
14:10:57.093 INFO [sample] <=== Sending transaction: TransactionData [hash= nonce=00, gasPrice=104c533c00, gas=2dc6c0, receiveAddress=, sendAddress=cd2a3d9f938e13cd947ec05abc7fe734df8dd826, value=, data=6060604052346000575b6096806100176000396000f300606060405263ffffffff60e060020a600035041663623845d88114602c5780636d4ce63c14603b575b6000565b3460005760396004356057565b005b3460005760456063565b60408051918252519081900360200190f35b60008054820190555b50565b6000545b905600a165627a7a72305820f4c00cf17626a18f19d4bb01d62482537e347bbc8c2ae2b0a464dbf1794f7c260029, signatureV=28, signatureR=33b12df9ac0351f1caa816161f5cc1dec30e288d97c02aedd3aafc59e9faafd1, signatureS=457954d2d34e88cdd022100dde72042c79e918ff6d5ddd372334a614a03a331a]
... (some unnecessary output omitted) ...
java.lang.RuntimeException: The transaction was not included during last 16 blocks: ed2b6b59
at org.ethereum.samples.CreateContractSample.waitForTx(CreateContractSample.java:142)
at org.ethereum.samples.CreateContractSample.sendTxAndWait(CreateContractSample.java:116)
at org.ethereum.samples.CreateContractSample.onSyncDone(CreateContractSample.java:73)
at org.ethereum.samples.BasicSample.run(BasicSample.java:148)
at java.lang.Thread.run(Thread.java:745)
I want to know the reason for this error.
I have set up a replica set using three machines (192.168.122.21, 192.168.122.147 and 192.168.122.148) and I am interacting with the MongoDB cluster using the Java driver:
ArrayList<ServerAddress> addrs = new ArrayList<ServerAddress>();
addrs.add(new ServerAddress("192.168.122.21", 27017));
addrs.add(new ServerAddress("192.168.122.147", 27017));
addrs.add(new ServerAddress("192.168.122.148", 27017));
this.mongoClient = new MongoClient(addrs);
this.db = this.mongoClient.getDB(this.db_name);
this.collection = this.db.getCollection(this.collection_name);
After the connection is established I do multiple inserts of a simple test document:
for (int i = 0; i < this.inserts; i++) {
    try {
        this.collection.insert(new BasicDBObject(String.valueOf(i), "test"));
    } catch (Exception e) {
        System.out.println("Error on inserting element: " + i);
        e.printStackTrace();
    }
}
When simulating a node crash of the master server (power-off), the MongoDB cluster does a successful failover:
19:08:03.907+0100 [rsHealthPoll] replSet info 192.168.122.21:27017 is down (or slow to respond):
19:08:03.907+0100 [rsHealthPoll] replSet member 192.168.122.21:27017 is now in state DOWN
19:08:04.153+0100 [rsMgr] replSet info electSelf 1
19:08:04.154+0100 [rsMgr] replSet couldn't elect self, only received -9999 votes
19:08:05.648+0100 [conn15] replSet info voting yea for 192.168.122.148:27017 (2)
19:08:10.681+0100 [rsMgr] replSet not trying to elect self as responded yea to someone else recently
19:08:10.910+0100 [rsHealthPoll] replset info 192.168.122.21:27017 heartbeat failed, retrying
19:08:16.394+0100 [rsMgr] replSet not trying to elect self as responded yea to someone else recently
19:08:22.876+.
19:08:22.912+0100 [rsHealthPoll] replset info 192.168.122.21:27017 heartbeat failed, retrying
19:08:23.623+0100 [SyncSourceFeedbackThread] replset setting syncSourceFeedback to 192.168.122.148:27017
19:08:23.917+0100 [rsHealthPoll] replSet member 192.168.122.148:27017 is now in state PRIMARY
This is also recognized by the MongoDB Driver on the Client Side:
Dec 01, 2014 7:08:16 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: Read timed out
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: couldn't connect to [/192.168.122.21:27017] bc:java.net.SocketTimeoutException: connect timed out
Dec 01, 2014 7:08:36 PM com.mongodb.DBTCPConnector setMasterAddress
WARNING: Primary switching from /192.168.122.21:27017 to /192.168.122.148:27017
But it still keeps trying to connect to the old node (forever):
Dec 01, 2014 7:08:50 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException - message: couldn't connect to [/192.168.122.21:27017] bc:java.net.NoRouteToHostException: No route to host
.....
Dec 01, 2014 7:10:43 PM com.mongodb.ConnectionStatus$UpdatableNode update
WARNING: Server seen down: /192.168.122.21:27017 - java.io.IOException -message: couldn't connect to [/192.168.122.21:27017] bc:java.net.NoRouteToHostException: No route to host
The document count in the database stays the same from the moment the primary fails through the election of the new primary. Here is the output from the same node during the process:
"rs0":SECONDARY> db.test_collection.find().count() 12260161
"rs0":PRIMARY> db.test_collection.find().count() 12260161
Update:
With WriteConcern UNACKNOWLEDGED it works as designed: insert operations continue against the new primary, and all operations issued during the election process are lost.
With WriteConcern ACKNOWLEDGED it seems that the operation waits indefinitely for an acknowledgement from the crashed primary. This could explain why the program continues once the crashed server boots up again and rejoins the cluster as a secondary. But in my case I don't want the driver to wait forever; it should raise an error after a certain time.
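For reference, this is roughly how the two write concerns are selected per insert with the 2.x driver (the constants are WriteConcern.UNACKNOWLEDGED and WriteConcern.ACKNOWLEDGED; the snippet mirrors the insert loop above):

import com.mongodb.BasicDBObject;
import com.mongodb.WriteConcern;

// Fire-and-forget: the driver does not wait for any acknowledgement, so writes
// issued while no primary is reachable are silently dropped.
this.collection.insert(new BasicDBObject(String.valueOf(i), "test"), WriteConcern.UNACKNOWLEDGED);

// Waits for an acknowledgement from the primary; with the primary powered off
// this is where the insert appears to hang indefinitely.
this.collection.insert(new BasicDBObject(String.valueOf(i), "test"), WriteConcern.ACKNOWLEDGED);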
Update:
WriteConcern ACKNOWLEDGED also works as expected when killing the mongod process on the primary. In this case the failover only takes ~3 seconds. During this time no inserts are performed, and after the new primary is elected the insert operations continue.
So I only get the problem when simulating a node failure (power off/network down). In this case the operation hangs until the failed node starts up again.
Does your app still work? Since that server is still in your seed list, the driver will try to connect to it as far as I know. Your app should still work so long as any of the other servers in your seed list can gain primary status.
Explicitly specifying a connection timeout value solved the error. See also: http://api.mongodb.org/java/2.7.0/com/mongodb/MongoOptions.html
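A sketch of what that can look like with the MongoClientOptions builder (the timeout values here are examples; the older MongoOptions class linked above exposes the same connectTimeout/socketTimeout settings):

import java.util.Arrays;
import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;

MongoClientOptions options = MongoClientOptions.builder()
        .connectTimeout(5000)    // give up connecting to an unreachable node after 5 s
        .socketTimeout(10000)    // fail a hanging read/write after 10 s instead of waiting forever
        .build();

MongoClient mongoClient = new MongoClient(Arrays.asList(
        new ServerAddress("192.168.122.21", 27017),
        new ServerAddress("192.168.122.147", 27017),
        new ServerAddress("192.168.122.148", 27017)), options);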
I found out that when I attach a debugger to the application and start debugging, the connection to the Terracotta server is lost (?) and the following messages appear in the Terracotta server logs:
2012-03-30 13:45:06,758 [L2_L1:TCComm Main Selector Thread_R (listen 0.0.0.0:9510)] WARN com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. DSO Server - 127.0.0.1:55112 might be in Long GC. GC count since last ping reply : 1
2012-03-30 13:45:27,761 [L2_L1:TCComm Main Selector Thread_R (listen 0.0.0.0:9510)] WARN com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. DSO Server - 127.0.0.1:55112 might be in Long GC. GC count since last ping reply : 1
2012-03-30 13:45:31,761 [L2_L1:TCComm Main Selector Thread_R (listen 0.0.0.0:9510)] WARN com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. DSO Server - 127.0.0.1:55112 might be in Long GC. GC count since last ping reply : 2
...
2012-03-30 13:46:37,768 [L2_L1:TCComm Main Selector Thread_R (listen 0.0.0.0:9510)] ERROR com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. DSO Server - 127.0.0.1:55112 might be in Long GC. GC count since last ping reply : 10. But its too long. No more retries
2012-03-30 13:46:38,768 [HealthChecker] INFO com.tc.net.protocol.transport.ConnectionHealthCheckerImpl. DSO Server - 127.0.0.1:55112 is DEAD
2012-03-30 13:46:38,768 [HealthChecker] ERROR com.tc.net.protocol.transport.ConnectionHealthCheckerImpl: DSO Server - Declared connection dead ConnectionID(1.0b1994ac80f14b7191080bdc3f38582a) idle time 45317ms
2012-03-30 13:46:38,768 [L2_L1:TCWorkerComm # 0_R] WARN com.tc.net.protocol.transport.ServerMessageTransport - ConnectionID(1.0b1994ac80f14b7191080bdc3f38582a): CLOSE EVENT : com.tc.net.core.TCConnectionJDK14#5158277: connected: false, closed: true local=127.0.0.1:9510 remote=127.0.0.1:55112 connect=[Fri Mar 30 13:34:22 BST 2012] idle=2001ms [207584 read, 229735 write]. STATUS : DISCONNECTED
...
2012-03-30 13:46:38,799 [L2_L1:TCWorkerComm # 0_R] INFO com.tc.objectserver.persistence.sleepycat.SleepycatPersistor - Deleted client state for ChannelID=[1]
2012-03-30 13:46:38,801 [WorkerThread(channel_life_cycle_stage, 0)] INFO com.tc.objectserver.handler.ChannelLifeCycleHandler - : Received transport disconnect. Shutting down client ClientID[1]
2012-03-30 13:46:38,801 [WorkerThread(channel_life_cycle_stage, 0)] INFO com.tc.objectserver.persistence.impl.TransactionStoreImpl - shutdownClient() : Removing txns from DB : 0
After this happens, any cache operation, such as getWithLoader, simply does not respond until the Terracotta server is restarted.
Question: how can this be fixed or reconfigured? I assume it can also happen in production (and actually sometimes does) if, for any reason, the application hangs or stalls.
This is just to get you started.
TC connections between server and client are considered dead when the applicable HealthCheck fails. The default values for the HealthCheck assume a very stable and performant network. I recommend you familiarize yourself with the details and the calculations at
http://www.terracotta.org/documentation/3.5.2/terracotta-server-array/high-availability#85916
So typically you begin with
a) making sure your network doesn't hiccup occasionally
b) setting the TC HealthCheck values a bit higher
If the problem persists, I'd recommend posting directly on the TC forums (they'll help you even if you only use the open-source edition; it may take a few days to get a reply, though).