com.hazelcast.cp.exception.NotLeaderException no leader election on 3 nodes

com.hazelcast.cp.exception.NotLeaderException no leader election on 3 nodes - java

I am using hazelcast-4.2.1 – not enterprise –d on 3 nodes. I try enable cp-subsystem with raft consensus protocol by changing config to on each node and restart hazelcast:
<cp-subsystem>
<cp-member-count>3</cp-member-count>
<group-size>3</group-size>
<session-time-to-live-seconds>300</session-time-to-live-seconds>
<session-heartbeat-interval-seconds>5</session-heartbeat-interval-seconds>
<missing-cp-member-auto-removal-seconds>14400</missing-cp-member-auto-removal-seconds>
<fail-on-indeterminate-operation-state>false</fail-on-indeterminate-operation-state>
<raft-algorithm>
<leader-election-timeout-in-millis>2000</leader-election-timeout-in-millis>
<leader-heartbeat-period-in-millis>5000</leader-heartbeat-period-in-millis>
<max-missed-leader-heartbeat-count>5</max-missed-leader-heartbeat-count>
<append-request-max-entry-count>100</append-request-max-entry-count>
<commit-index-advance-count-to-snapshot>10000</commit-index-advance-count-to-snapshot>
<uncommitted-entry-count-to-reject-new-appends>100</uncommitted-entry-count-to-reject-new-appends>
<append-request-backoff-timeout-in-millis>100</append-request-backoff-timeout-in-millis>
</raft-algorithm>
</cp-subsystem>
The code of app is simple:
hazelcastInstance.getCPSubsystem().getLock().lock();
But I was warned on each of 3 nodes "Leader N/A":
2021-11-23 11:45:45 INFO [MetadataRaftGroupManager] - [172.18.20.166]:5701 [dev] [4.2.1] CP Subsystem is waiting for 3 members to join the cluster. Current member count: 1
2021-11-23 11:45:48 INFO [ClusterService] - [172.18.20.166]:5701 [dev] [4.2.1]
Members {size:3, ver:3} [
Member [172.18.20.166]:5701 - 66ef130c-a666-4ec0-8e99-cec4cd504bac this
Member [172.18.20.167]:5701 - 5039c2ea-22dd-4f7c-a134-9367fab4e767
Member [172.18.20.168]:5701 - 47dd80d4-b983-4ae7-b6bc-c2352416eada
]
2021-11-23 11:45:48 INFO [PartitionStateManager] - [172.18.20.166]:5701 [dev] [4.2.1] Initializing cluster partition table arrangement...
2021-11-23 11:45:49 INFO [RaftService] - [172.18.20.166]:5701 [dev] [4.2.1] RaftNode[CPGroupId{name='METADATA', seed=0, groupId=0}] is created with [RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, RaftEndpoint{uuid='5039c2ea-22dd-4f7c-a134-9367fab4e767'}, RaftEndpoint{uuid='66ef130c-a666-4ec0-8e99-cec4cd504bac'}]
2021-11-23 11:45:49 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Status is set to: ACTIVE
2021-11-23 11:45:49 INFO [AuthenticationMessageTask] - [172.18.20.166]:5701 [dev] [4.2.1] Received auth from Connection[id=12, /172.18.20.166:5701->/172.18.20.166:53635, qualifier=null, endpoint=[172.18.20.166]:53635, alive=true, connectionType=MCJVM, planeIndex=-1], successfully authenticated, clientUuid: 7c86a343-8b52-4652-8235-7c0dfdb2f5ad, client version: 4.2
2021-11-23 11:45:51 INFO [PreVoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Granted pre-vote for PreVoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, nextTerm=1, lastLogTerm=0, lastLogIndex=0}
2021-11-23 11:45:51 INFO [VoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Moving to new term: 1 from current term: 0 after VoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, term=1, lastLogTerm=0, lastLogIndex=0, disruptive=false}
2021-11-23 11:45:51 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1]
CP Group Members {groupId: METADATA(0), size:3, term:1, logIndex:0} [
CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701}
CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}
CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701} - FOLLOWER this
]
2021-11-23 11:45:51 INFO [VoteRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Granted vote for VoteRequest{candidate=RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}, term=1, lastLogTerm=0, lastLogIndex=0, disruptive=false}
2021-11-23 11:45:51 INFO [AppendRequestHandlerTask(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1] Setting leader: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'}
2021-11-23 11:45:51 INFO [RaftNode(METADATA)] - [172.18.20.166]:5701 [dev] [4.2.1]
CP Group Members {groupId: METADATA(0), size:3, term:1, logIndex:0} [
CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701} - LEADER
CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}
CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701} - FOLLOWER this
]
2021-11-23 11:45:53 INFO [MetadataRaftGroupManager] - [172.18.20.166]:5701 [dev] [4.2.1] CP Subsystem is initialized with: [CPMember{uuid=47dd80d4-b983-4ae7-b6bc-c2352416eada, address=[172.18.20.168]:5701}, CPMember{uuid=5039c2ea-22dd-4f7c-a134-9367fab4e767, address=[172.18.20.167]:5701}, CPMember{uuid=66ef130c-a666-4ec0-8e99-cec4cd504bac, address=[172.18.20.166]:5701}]
2021-11-23 11:45:54 INFO [HealthMonitor] - [172.18.20.166]:5701 [dev] [4.2.1] processors=2, physical.memory.total=5.5G, physical.memory.free=844.4M, swap.space.total=2.0G, swap.space.free=1.2G, heap.memory.used=101.1M, heap.memory.free=857.4M, heap.memory.total=958.5M, heap.memory.max=958.5M, heap.memory.used/total=10.55%, heap.memory.used/max=10.55%, minor.gc.count=2, minor.gc.time=34ms, major.gc.count=2, major.gc.time=96ms, load.process=100.00%, load.system=100.00%, load.systemAverage=1.01, thread.count=47, thread.peakCount=53, cluster.timeDiff=0, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.client.query.size=0, executor.q.client.blocking.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operations.size=0, executor.q.priorityOperation.size=0, operations.completed.count=116, executor.q.mapLoad.size=0, executor.q.mapLoadAllKeys.size=0, executor.q.cluster.size=0, executor.q.response.size=0, operations.running.count=0, operations.pending.invocations.percentage=0.00%, operations.pending.invocations.count=0, proxy.count=1, clientEndpoint.count=10, connection.active.count=12, client.connection.count=10, connection.count=12
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=309878934, partitionId=185, replicaIndex=0, callId=3364, invocationTime=1637649972379 (2021-11-23 11:46:12.379), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl#0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936372, firstInvocationTime='2021-11-23 11:45:36.372', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.168]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=11, /172.18.20.166:5701->/172.18.20.168:51871, qualifier=null, endpoint=[172.18.20.168]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=1764520567, partitionId=185, replicaIndex=0, callId=3366, invocationTime=1637649972380 (2021-11-23 11:46:12.380), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl#0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936136, firstInvocationTime='2021-11-23 11:45:36.136', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.168]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=11, /172.18.20.166:5701->/172.18.20.168:51871, qualifier=null, endpoint=[172.18.20.168]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='47dd80d4-b983-4ae7-b6bc-c2352416eada'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A
2021-11-23 11:46:12 WARN [Invocation] - [172.18.20.166]:5701 [dev] [4.2.1] Retrying invocation: Invocation{op=com.hazelcast.cp.internal.operation.DefaultRaftReplicateOp{serviceName='hz:core:raft', identityHash=1136581224, partitionId=185, replicaIndex=0, callId=3416, invocationTime=1637649972878 (2021-11-23 11:46:12.878), waitTimeout=-1, callTimeout=60000, tenantControl=com.hazelcast.spi.impl.tenantcontrol.NoopTenantControl#0, groupId=CPGroupId{name='default', seed=0, groupId=6960}, op=com.hazelcast.cp.internal.session.operation.HeartbeatSessionOp{serviceName='hz:core:raftSession', sessionId=10}}, tryCount=250, tryPauseMillis=500, invokeCount=100, callTimeoutMillis=60000, firstInvocationTimeMs=1637649936875, firstInvocationTime='2021-11-23 11:45:36.875', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 05:00:00.000', target=[172.18.20.167]:5701, pendingResponse={VOID}, backupsAcksExpected=-1, backupsAcksReceived=0, connection=Connection[id=10, /172.18.20.166:5701->/172.18.20.167:36039, qualifier=null, endpoint=[172.18.20.167]:5701, alive=true, connectionType=MEMBER, planeIndex=0]}, Reason: com.hazelcast.cp.exception.NotLeaderException: RaftEndpoint{uuid='5039c2ea-22dd-4f7c-a134-9367fab4e767'} is not LEADER of CPGroupId{name='default', seed=0, groupId=6960}. Known leader is: N/A
And flooding inovocation with com.hazelcast.cp.exception.NotLeaderException to end of log.
Question: how setup cp-subsystem enable for raft with 3 nodes?

If restart all members of cluster, it change groupId because state be not consistent without persistents mode, that only in enterprise version. And you need restart clients too.
https://github.com/hazelcast/hazelcast/issues/17436

Related

Apache Ignite: Slow Node join and failure

We have a Ignite setup with 3 Servers and Persistence and therefore Baselining enabled. From time to time we have the issue that the Servers take a long time to rebuild the cluster after all Nodes are restarted. Ignite runs embedded in the application.
20.11.2020 08:18:17.678 WARN [main] org.apache.ignite.internal.util.typedef.G:290 - Ignite work directory is not provided, automatically resolved to: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work
20.11.2020 08:18:17.709 WARN [main] org.apache.ignite.internal.util.typedef.G:295 - Consistent ID is not set, it is recommended to set consistent ID for production clusters (use IgniteConfiguration.setConsistentId property)
20.11.2020 08:18:18.053 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Config URL: n/a
20.11.2020 08:18:18.084 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - IgniteConfiguration [igniteInstanceName=null, pubPoolSize=8, svcPoolSize=8, callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=4, dataStreamerPoolSize=8, utilityCachePoolSize=8, utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, sqlQryHistSize=1000, dfltQryTimeout=0, igniteHome=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite, igniteWorkDir=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work, mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer#78c03f1f, nodeId=0e60d50b-ee2e-46ed-8d76-5cb51791011b, marsh=BinaryMarshaller [], marshLocJobs=false, daemon=false, p2pEnabled=true, netTimeout=5000, netCompressionLevel=1, sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0, marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false], segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, commSpi=TcpCommunicationSpi [connectGate=null, connPlc=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$FirstConnectionPolicy#522ba524, chConnPlc=null, enableForcibleNodeKill=false, enableTroubleshootingLog=false, locAddr=null, locHost=null, locPort=47100, locPortRange=100, shmemPort=-1, directBuf=true, directSndBuf=false, idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=0, slowClientQueueLimit=0, nioSrvr=null, shmemSrv=null, usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true, filterReachableAddresses=false, ackSndThreshold=32, unackedMsgsBufSize=0, sockWriteTimeout=2000, boundTcpPort=-1, boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, ctxInitLatch=java.util.concurrent.CountDownLatch#29c5ee1d[Count = 1], stopping=false, metricsLsnr=null], evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi#15cea7b0, colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [], indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi#1e6cc850, addrRslvr=null, encryptionSpi=org.apache.ignite.spi.encryption.noop.NoopEncryptionSpi#7e7f0f0a, clientMode=false, rebalanceThreadPoolSize=4, rebalanceTimeout=10000, rebalanceBatchesPrefetchCnt=3, rebalanceThrottle=0, rebalanceBatchSize=524288, txCfg=TransactionConfiguration [txSerEnabled=false, dfltIsolation=REPEATABLE_READ, dfltConcurrency=PESSIMISTIC, dfltTxTimeout=0, txTimeoutOnPartitionMapExchange=0, deadlockTimeout=10000, pessimisticTxLogSize=0, pessimisticTxLogLinger=10000, tmLookupClsName=null, txManagerFactory=null, useJtaSync=false], cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED, p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100, timeSrvPortRange=100, failureDetectionTimeout=10000, sysWorkerBlockedTimeout=null, clientFailureDetectionTimeout=30000, metricsLogFreq=0, hadoopCfg=null, connectorCfg=ConnectorConfiguration [jettyPath=null, host=null, port=11211, noDelay=true, directBuf=false, sndBufSize=32768, rcvBufSize=32768, idleQryCurTimeout=600000, idleQryCurCheckFreq=60000, sndQueueLimit=0, selectorCnt=4, idleTimeout=7000, sslEnabled=false, sslClientAuth=false, sslCtxFactory=null, sslFactory=null, portRange=100, threadPoolSize=8, msgInterceptor=null], odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration [seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null, grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration [sysRegionInitSize=10485760, sysRegionMaxSize=52428800, pageSize=0, concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=default, maxSize=858886144, initSize=10485760, swapPath=null, pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, metricsEnabled=true, metricsSubIntervalCount=5, metricsRateTimeInterval=60000, persistenceEnabled=true, checkpointPageBufSize=0, lazyMemoryAllocation=true], dataRegions=DataRegionConfiguration[] [DataRegionConfiguration [name=persistent, maxSize=52428800, initSize=10485760, swapPath=null, pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, metricsEnabled=true, metricsSubIntervalCount=5, metricsRateTimeInterval=60000, persistenceEnabled=true, checkpointPageBufSize=0, lazyMemoryAllocation=true]], storagePath=null, checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL, walHistSize=20, maxWalArchiveSize=1073741824, walSegments=4, walSegmentSize=10485760, walPath=db/wal, walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY, walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory#59429fac, metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, walCompactionEnabled=false, walCompactionLevel=1, checkpointReadLockTimeout=null, walPageCompression=DISABLED, walPageCompressionLevel=null], activeOnStart=true, autoActivation=true, longQryWarnTimeout=3000, sqlConnCfg=null, cliConnCfg=ClientConnectorConfiguration [host=null, port=10800, portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, handshakeTimeout=10000, jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false, useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null, thinCliCfg=ThinClientConfiguration [maxActiveTxPerConn=100]], mvccVacuumThreadCnt=2, mvccVacuumFreq=5000, authEnabled=false, failureHnd=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]], commFailureRslvr=null]
20.11.2020 08:18:18.084 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Daemon mode: off
...
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Remote Management [restart: off, REST: on, JMX (remote: on, port: 8071, auth: off, ssl: off)]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Logger: JavaLogger [quiet=true, config=null]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - IGNITE_HOME=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - VM arguments: [-Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=8071, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -Djava.rmi.server.hostname=127.0.0.1, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=log/dump.hprof, -XX:+UseG1GC, -XX:+UseStringDeduplication, --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED, --add-exports=java.base/sun.nio.ch=ALL-UNNAMED, --add-exports=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED, --add-exports=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED, --add-exports=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED, --illegal-access=permit, -Xmx500m]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - System cache's DataRegion size is configured to 10 MB. Use DataStorageConfiguration.systemRegionInitialSize property to change the setting.
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Configured caches [in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
20.11.2020 08:18:18.100 WARN [main] org.apache.ignite.internal.IgniteKernal:295 - Peer class loading is enabled (disable it in production for performance and deployment consistency reasons)
20.11.2020 08:18:18.100 WARN [main] org.apache.ignite.internal.IgniteKernal:295 - Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - 3-rd party licenses can be found at: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\libs\licenses
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_VERSION=2.1.4]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [NODE_NAME=EESRV-LBXC03]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_NUMBER=848]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [NODE_TYPE=LABBOX]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [VERSION=0]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_TIME=1604577743000]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [APPLICATION_NAME=Labbox]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_GIT_HASH=ff2f1f3]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [KEY=_OL2;f~.C3n}yo6p<Zx=BE4I2P:lDL"f]
20.11.2020 08:18:18.163 WARN [pub-#19] org.apache.ignite.internal.GridDiagnostic:295 - This operating system has been tested less rigorously: Windows Server 2012 R2 6.3 amd64. Our team will appreciate the feedback if you experience any problems running ignite in this environment.
20.11.2020 08:18:18.163 WARN [pub-#22] org.apache.ignite.internal.GridDiagnostic:295 - Initial heap size is 64MB (should be no less than 512MB, use -Xms512m -Xmx512m).
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - Configured plugins:
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - ^-- Authentication 1.0.0
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - ^-- null
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 -
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.processors.failure.FailureProcessor:285 - Configured failure handler: [hnd=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]
20.11.2020 08:18:18.600 INFO [main] o.a.i.s.communication.tcp.TcpCommunicationSpi:285 - Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
20.11.2020 08:18:18.678 WARN [main] o.a.i.s.communication.tcp.TcpCommunicationSpi:295 - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
20.11.2020 08:18:18.694 WARN [main] o.a.i.spi.checkpoint.noop.NoopCheckpointSpi:295 - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
20.11.2020 08:18:18.741 WARN [main] o.a.i.i.m.collision.GridCollisionManager:295 - Collision resolution is disabled (all jobs will be activated upon arrival).
20.11.2020 08:18:18.741 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Security status [authentication=off, tls/ssl=off]
20.11.2020 08:18:18.866 INFO [main] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=0e60d50b-ee2e-46ed-8d76-5cb51791011b]
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.p.filename.PdsFoldersResolver:285 - Successfully locked persistence storage folder [D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5]
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.p.filename.PdsFoldersResolver:285 - Consistent ID used for local node is [1dbddb2c-ef76-4811-b7d3-46da82061bc5] according to persistence data storage folders
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.b.CacheObjectBinaryProcessorImpl:285 - Resolved directory for serialized binary metadata: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\binary_meta\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5
20.11.2020 08:18:19.631 INFO [main] o.a.i.i.p.c.p.file.FilePageStoreManager:285 - Resolved page store work directory: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5
20.11.2020 08:18:19.694 INFO [main] o.a.i.i.p.c.p.w.f.FileHandleManagerImpl:285 - Initialized write-ahead log manager [mode=LOG_ONLY]
20.11.2020 08:18:19.772 WARN [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:295 - DataRegionConfiguration.maxWalArchiveSize instead DataRegionConfiguration.walHistorySize would be used for removing old archive wal files
20.11.2020 08:18:19.803 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Configured data regions initialized successfully [total=5]
20.11.2020 08:18:19.834 INFO [main] o.a.i.i.p.c.d.d.t.PartitionsEvictManager:285 - Evict partition permits=2
20.11.2020 08:18:19.850 INFO [main] o.a.i.i.p.odbc.ClientListenerProcessor:285 - Client connector processor has started on TCP port 10800
20.11.2020 08:18:20.006 INFO [main] o.a.i.i.p.r.protocols.tcp.GridTcpRestProtocol:285 - Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
20.11.2020 08:18:20.115 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Non-loopback local IPs: 192.168.92.177, fe80:0:0:0:6859:37c8:f543:8087%eth4
20.11.2020 08:18:20.115 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Enabled local MACs: 00000000000000E0, 005056BD5072
20.11.2020 08:18:20.131 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:20.147 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.147 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Checking memory state [lastValidPos=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:20.225 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#5853495b, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.256 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#21f459fc, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.256 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Found last checkpoint marker [cpId=8b5aaf2a-7867-47b0-879c-85791363041f, pos=FileWALPointer [idx=512, fileOff=3672982, len=99269]]
20.11.2020 08:18:20.350 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:20.365 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#6c15e8c7, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Finished applying WAL changes [updatesApplied=0, time=31 ms]
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Restoring partition state for local groups.
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Finished restoring partition state for local groups [groupsProcessed=0, partitionsProcessed=0, time=0ms]
20.11.2020 08:18:20.412 INFO [main] o.a.i.i.p.cluster.GridClusterStateProcessor:285 - Restoring history for BaselineTopology[id=12]
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.c.DistributedBaselineConfiguration:285 - Baseline parameter 'baselineAutoAdjustEnabled' was changed from 'null' to 'true'
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.c.DistributedBaselineConfiguration:285 - Baseline parameter 'baselineAutoAdjustTimeout' was changed from 'null' to '300000'
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.p.c.p.file.FilePageStoreManager:285 - Cleanup cache stores [total=1, left=0, cleanFiles=false]
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Configured data regions started successfully [total=5]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Starting binary memory restore for: [166757441, -1947899996, -8785046, -2100569601, 1793235927, -499392514, 30677022, 129211407, 1139332309, 1725334265]
20.11.2020 08:18:21.334 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:21.334 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Checking memory state [lastValidPos=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:21.365 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#317e9c3c, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.397 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#31a3f4de, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.397 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Found last checkpoint marker [cpId=8b5aaf2a-7867-47b0-879c-85791363041f, pos=FileWALPointer [idx=512, fileOff=3672982, len=99269]]
20.11.2020 08:18:21.412 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Binary memory state restored at node startup [restoredPtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.428 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:21.568 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=license, id=166757441, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.584 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=819,1 MiB, pages=203256, tableSize=15,8 MiB, checkpointBuffer=256,0 MiB]
20.11.2020 08:18:21.584 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=commservices, id=-8785046, dataRegionName=default, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=ignite-sys-cache, id=-2100569601, dataRegionName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinespecifications, id=1793235927, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=nxisPorts, id=-499392514, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=datastructures_ATOMIC_PARTITIONED_1#labqueue, id=1205724040, group=labqueue, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=ignite-sys-atomic-cache#labqueue, id=-327698687, group=labqueue, dataRegionName=default, mode=PARTITIONED, atomicity=TRANSACTIONAL, backups=1, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinemaxbatchno, id=30677022, dataRegionName=persistent, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machineconfiguration, id=129211407, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=specimentracer, id=1139332309, dataRegionName=persistent, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinestatus, id=1725334265, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Binary recovery performed in 1109 ms.
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:21.662 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:21.693 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Finished applying WAL changes [updatesApplied=0, time=31 ms]
20.11.2020 08:18:21.693 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Restoring partition state for local groups.
20.11.2020 08:18:21.943 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Finished restoring partition state for local groups [groupsProcessed=10, partitionsProcessed=5220, time=235ms]
20.11.2020 08:18:22.021 INFO [main] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Connection check threshold is calculated: 10000
20.11.2020 08:19:19.373 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.175, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery spawning a new thread for connection [rmtAddr=/192.168.92.175, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-sock-reader-[]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Started serving remote node connection [rmtAddr=/192.168.92.175:56962, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-sock-reader-[9f44068b 192.168.92.175:56962 client]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Initialized connection with remote client node [nodeId=9f44068b-b8ca-4d8b-bb32-efd2e2a1940c, rmtAddr=/192.168.92.175:56962]
20.11.2020 08:19:19.498 INFO [tcp-disco-sock-reader-[9f44068b 192.168.92.175:56962 client]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Finished serving remote node connection [rmtAddr=/192.168.92.175:56962, rmtPort=56962
20.11.2020 08:20:21.287 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.176, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery spawning a new thread for connection [rmtAddr=/192.168.92.176, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Started serving remote node connection [rmtAddr=/192.168.92.176:55941, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[6a50abff 192.168.92.176:55941]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Initialized connection with remote server node [nodeId=6a50abff-8cfd-4b3a-b894-54fa9d405d36, rmtAddr=/192.168.92.176:55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[6a50abff 192.168.92.176:55941]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Finished serving remote node connection [rmtAddr=/192.168.92.176:55941, rmtPort=55941
20.11.2020 08:20:26.239 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.175, rmtPort=56996]
... it continues like that till the join or failure
The logs log the same on all servers. In this case server 1 and 2 create a cluster after 7 minutes. server 3 fails after 9 minutes due to incompatible baseline topology. After reseting the failed server it can rejoin the cluster. The behavior only happens sometimes. Most of the time the servers rebuild the cluster without problem.

Connect to an azure iot hub from inside a kubernetes cluster via amqp over websockets

we are trying to communicate to an azure iothub via amqp over websocket from a java docker container inside an azure kubernetes cluster. Sadly it seems, that the container cant establish a connection while locally or even on another virtual machine (where only docker is installed) the container run successfully.
The network policies rules should allow all necessary protocols and ports to communicate with eventhub endpoint of the iot hub.
Does anybody know which switch we have to pull to "allow" the container from the cluster the communication with the iothub?
The only logs we have are this:
13:10:26.688 [main] DEBUG reactor.util.Loggers$LoggerFactory - Using Slf4j logging framework
13:10:26.851 [main] INFO com.azure.messaging.eventhubs.EventHubClientBuilder - connectionId[REDACTED]: Emitting a single connection.
13:10:26.901 [main] DEBUG com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Connection state: UNINITIALIZED
13:10:26.903 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Setting next AMQP channel.
13:10:26.903 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Next AMQP channel received, updating 0 current subscribers
13:10:26.920 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Creating and starting connection to REDACTED:443
13:10:26.940 [main] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Starting reactor.]
13:10:26.955 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionInit hostname[REDACTED], connectionId[REDACTED]
13:10:26.956 [single-1] INFO com.azure.core.amqp.implementation.handler.ReactorHandler - connectionId[REDACTED] reactor.onReactorInit
13:10:26.956 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionLocalOpen hostname[REDACTED:443], connectionId[REDACTED], errorCondition[null], errorDescription[null]
13:10:26.975 [main] DEBUG com.azure.core.amqp.implementation.ReactorSession - Connection state: UNINITIALIZED
13:10:26.991 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - Emitting new response channel. connectionId: REDACTED. entityPath: $management. linkName: mgmt.
13:10:26.991 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: Setting next AMQP channel.
13:10:26.991 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: Next AMQP channel received, updating 0 current subscribers
13:10:26.993 [main] INFO com.azure.messaging.eventhubs.implementation.ManagementChannel - Management endpoint state: UNINITIALIZED
13:10:27.032 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - Upstream connection publisher was completed. Terminating processor.
13:10:27.033 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: AMQP channel processor completed. Notifying 0 subscribers.
13:10:27.040 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubReactorAmqpConnection - connectionId[REDACTED]: Disposing of connection.
13:10:27.040 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - Upstream connection publisher was completed. Terminating processor.
13:10:27.040 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: AMQP channel processor completed. Notifying 0 subscribers.
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Disposing of ReactorConnection.
13:10:27.041 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Channel is disposed.
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Removing session 'mgmt-session'
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorSession - sessionId[mgmt-session]: Disposing of session.
13:10:27.043 [main] INFO com.azure.core.amqp.implementation.AmqpExceptionHandler - Shutdown received: ReactorExecutor.close() was called., isTransient[false], initiatedByClient[true]
13:10:27.089 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SessionHandler - onSessionLocalOpen connectionId[REDACTED], entityName[mgmt-session], condition[Error{condition=null, description='null', info=null}]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.SendLinkHandler - onLinkLocalClose connectionId[REDACTED], linkName[mgmt:sender], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ReceiveLinkHandler - onLinkLocalClose connectionId[REDACTED], linkName[mgmt:receiver], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SessionHandler - onSessionLocalClose connectionId[mgmt-session], entityName[REDACTED], condition[Error{condition=null, description='null', info=null}]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionLocalClose hostname[REDACTED:443], connectionId[REDACTED], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionBound hostname[REDACTED], connectionId[REDACTED]
13:10:27.098 [single-1] DEBUG com.azure.core.amqp.implementation.handler.WebSocketsConnectionHandler - connectionId[REDACTED] Adding web sockets transport layer for hostname[REDACTED:443]
13:10:27.125 [single-1] DEBUG com.azure.core.amqp.implementation.handler.DispatchHandler - Running task for event: %s
13:10:27.126 [single-1] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Processing all pending tasks and closing old reactor.]
13:10:27.126 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SendLinkHandler - onLinkLocalOpen connectionId[REDACTED], linkName[mgmt:sender], localTarget[Target{address='$management', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=null}]
13:10:27.126 [single-1] INFO com.azure.core.amqp.implementation.handler.ReceiveLinkHandler - onLinkLocalOpen connectionId[REDACTED], linkName[mgmt:receiver], localSource[Source{address='$management', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=null, capabilities=null}]
13:10:27.127 [single-1] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Stopping the reactor because thread was interrupted or the reactor has no more events to process.]

The failure was not the networking at all.
My mistake was, that i assume if i can run the container manual via docker run -it * it should also work in a kubernetes cluster. But with the -it argument the container stays open und and pseudo-tty was attached. But in the kubernetes cluster this of cause will not happen, so we have to adjust the loop logic of the java application and it works after.
Thanks # all

Hazelcast web clustered session service: Retrying the connection

hazelcast unable to connect, the message receive is as follows
2018-03-03 10:27:51,074 INFO c.h.i.DefaultAddressPicker [LOCAL] [dev] [3.6] Picked Address[127.0.0.1]:5703, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5703], bind any local is true
2018-03-03 10:28:01.078 INFO 19478 --- [.ensureInstance] c.hazelcast.web.ClusteredSessionService : Retrying the connection!!
2018-03-03 10:28:01,078 INFO c.h.w.ClusteredSessionService Retrying the connection!!
2018-03-03 10:28:01.079 INFO 19478 --- [.ensureInstance] com.hazelcast.config.XmlConfigLocator : Loading 'hazelcast-default.xml' from classpath.
2018-03-03 10:28:01,079 INFO c.h.c.XmlConfigLocator Loading 'hazelcast-default.xml' from classpath.
2018-03-03 10:28:01.085 INFO 19478 --- [.ensureInstance] c.hazelcast.web.HazelcastInstanceLoader : Creating a new HazelcastInstance for session replication
2018-03-03 10:28:01,085 INFO c.h.w.HazelcastInstanceLoader Creating a new HazelcastInstance for session replication
2018-03-03 10:28:01.086 INFO 19478 --- [.ensureInstance] c.h.instance.DefaultAddressPicker : [LOCAL] [dev] [3.6] Picked Address[127.0.0.1]:5703, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5703], bind any local is true

hazelcast-kubernetes Network Discovery: How to use with Multiple Nodes

We have a Kubernetes cluster which spins up 4 instances of our application. We'd like to have it share a Hazelcast data grid and keep in synch between these nodes. According to https://github.com/hazelcast/hazelcast-kubernetes the configuration is straightforward. We'd like to use the DNS approach rather than the kubernetes api.
With DNS we are supposed to be able to add the DNS name of our app as described here. So this would be something like myservice.mynamespace.svc.cluster.local.
The problem is that although we have 4 VMs spun up, only one Hazelcast network member is found; thus we see the following in the logs:
Members [1] {
Member [192.168.187.3]:5701 - 50056bfb-b710-43e0-ad58-57459ed399a5 this
}
It seems that there aren't any errors, it just doesn't see any of the other network members.
Here's my configuration. I've tried both using an xml file, like the example on the hazelcast-kubernetes git repo, as well as programmatically. Neither attempt appear to work.
I'm using hazelcast 3.8.
Using hazelcast.xml:
<hazelcast>
<properties>
<!-- only necessary prior Hazelcast 3.8 -->
<property name="hazelcast.discovery.enabled">true</property>
</properties>
<network>
<join>
<!-- deactivate normal discovery -->
<multicast enabled="false"/>
<tcp-ip enabled="false" />
<!-- activate the Kubernetes plugin -->
<discovery-strategies>
<discovery-strategy enabled="true"
class="com.hazelcast.HazelcastKubernetesDiscoveryStrategy">
<properties>
<!-- configure discovery service API lookup -->
<property name="service-dns">myapp.mynamespace.svc.cluster.local</property>
<property name="service-dns-timeout">10</property>
</properties>
</discovery-strategy>
</discovery-strategies>
</join>
</network>
</hazelcast>
Using the XmlConfigBuilder to construct the instance.
Properties properties = new Properties();
XmlConfigBuilder builder = new XmlConfigBuilder();
builder.setProperties(properties);
Config config = builder.build();
this.instance = Hazelcast.newHazelcastInstance(config);
And Programmatically (personal preference if I can get it to work):
Config cfg = new Config();
NetworkConfig networkConfig = cfg.getNetworkConfig();
networkConfig.setPort(hazelcastNetworkPort);
networkConfig.setPortAutoIncrement(true);
networkConfig.setPortCount(100);
JoinConfig joinConfig = networkConfig.getJoin();
joinConfig.getMulticastConfig().setEnabled(false);
joinConfig.getTcpIpConfig().setEnabled(false);
DiscoveryConfig discoveryConfig = joinConfig.getDiscoveryConfig();
HazelcastKubernetesDiscoveryStrategyFactory factory = new HazelcastKubernetesDiscoveryStrategyFactory();
DiscoveryStrategyConfig strategyConfig = new DiscoveryStrategyConfig(factory);
strategyConfig.addProperty("service-dns", kubernetesSvcsDnsName);
strategyConfig.addProperty("service-dns-timeout", kubernetesSvcsDnsTimeout);
discoveryConfig.addDiscoveryStrategyConfig(strategyConfig);
this.instance = Hazelcast.newHazelcastInstance(cfg);
Is anyone farmiliar with this setup? I have ports 5701 - 5800 open. It seems kubernetes starts up and recognizes that discovery mode is on, but only finds the one (local) node.
Here's a snippet from the logs for what it's worth. This was while using the xml file for config:
2017-03-15 08:15:33,688 INFO [main] c.h.c.XmlConfigLocator [StandardLoggerFactory.java:49] Loading 'hazelcast-default.xml' from classpath.
2017-03-15 08:15:33,917 INFO [main] c.g.a.c.a.u.c.HazelcastCacheClient [HazelcastCacheClient.java:112] CONFIG: Config{groupConfig=GroupConfig [name=dev, password=********], properties={}, networkConfig=NetworkConfig{publicAddress='null', port=5701, portCount=100, portAutoIncrement=true, join=JoinConfig{multicastConfig=MulticastConfig [enabled=true, multicastGroup=224.2.2.3, multicastPort=54327, multicastTimeToLive=32, multicastTimeoutSeconds=2, trustedInterfaces=[], loopbackModeEnabled=false], tcpIpConfig=TcpIpConfig [enabled=false, connectionTimeoutSeconds=5, members=[127.0.0.1, 127.0.0.1], requiredMember=null], awsConfig=AwsConfig{enabled=false, region='us-west-1', securityGroupName='hazelcast-sg', tagKey='type', tagValue='hz-nodes', hostHeader='ec2.amazonaws.com', iamRole='null', connectionTimeoutSeconds=5}, discoveryProvidersConfig=com.hazelcast.config.DiscoveryConfig#3c153a1}, interfaces=InterfacesConfig{enabled=false, interfaces=[10.10.1.*]}, sslConfig=SSLConfig{className='null', enabled=false, implementation=null, properties={}}, socketInterceptorConfig=SocketInterceptorConfig{className='null', enabled=false, implementation=null, properties={}}, symmetricEncryptionConfig=SymmetricEncryptionConfig{enabled=false, iterationCount=19, algorithm='PBEWithMD5AndDES', key=null}}, mapConfigs={default=MapConfig{name='default', inMemoryFormat=BINARY', backupCount=1, asyncBackupCount=0, timeToLiveSeconds=0, maxIdleSeconds=0, evictionPolicy='NONE', mapEvictionPolicy='null', evictionPercentage=25, minEvictionCheckMillis=100, maxSizeConfig=MaxSizeConfig{maxSizePolicy='PER_NODE', size=2147483647}, readBackupData=false, hotRestart=HotRestartConfig{enabled=false, fsync=false}, nearCacheConfig=null, mapStoreConfig=MapStoreConfig{enabled=false, className='null', factoryClassName='null', writeDelaySeconds=0, writeBatchSize=1, implementation=null, factoryImplementation=null, properties={}, initialLoadMode=LAZY, writeCoalescing=true}, mergePolicyConfig='com.hazelcast.map.merge.PutIfAbsentMapMergePolicy', wanReplicationRef=null, entryListenerConfigs=null, mapIndexConfigs=null, mapAttributeConfigs=null, quorumName=null, queryCacheConfigs=null, cacheDeserializedValues=INDEX_ONLY}}, topicConfigs={}, reliableTopicConfigs={default=ReliableTopicConfig{name='default', topicOverloadPolicy=BLOCK, executor=null, readBatchSize=10, statisticsEnabled=true, listenerConfigs=[]}}, queueConfigs={default=QueueConfig{name='default', listenerConfigs=null, backupCount=1, asyncBackupCount=0, maxSize=0, emptyQueueTtl=-1, queueStoreConfig=null, statisticsEnabled=true}}, multiMapConfigs={default=MultiMapConfig{name='default', valueCollectionType='SET', listenerConfigs=null, binary=true, backupCount=1, asyncBackupCount=0}}, executorConfigs={default=ExecutorConfig{name='default', poolSize=16, queueCapacity=0}}, semaphoreConfigs={default=SemaphoreConfig{name='default', initialPermits=0, backupCount=1, asyncBackupCount=0}}, ringbufferConfigs={default=RingbufferConfig{name='default', capacity=10000, backupCount=1, asyncBackupCount=0, timeToLiveSeconds=0, inMemoryFormat=BINARY, ringbufferStoreConfig=RingbufferStoreConfig{enabled=false, className='null', properties={}}}}, wanReplicationConfigs={}, listenerConfigs=[], partitionGroupConfig=PartitionGroupConfig{enabled=false, groupType=PER_MEMBER, memberGroupConfigs=[]}, managementCenterConfig=ManagementCenterConfig{enabled=false, url='http://localhost:8080/mancenter', updateInterval=3}, securityConfig=SecurityConfig{enabled=false, memberCredentialsConfig=CredentialsFactoryConfig{className='null', implementation=null, properties={}}, memberLoginModuleConfigs=[], clientLoginModuleConfigs=[], clientPolicyConfig=PermissionPolicyConfig{className='null', implementation=null, properties={}}, clientPermissionConfigs=[]}, liteMember=false}
2017-03-15 08:15:33,949 INFO [main] c.h.i.DefaultAddressPicker [StandardLoggerFactory.java:49] [LOCAL] [dev] [3.8] Prefer IPv4 stack is true.
2017-03-15 08:15:33,960 INFO [main] c.h.i.DefaultAddressPicker [StandardLoggerFactory.java:49] [LOCAL] [dev] [3.8] Picked [192.168.187.3]:5701, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5701], bind any local is true
2017-03-15 08:15:34,000 INFO [main] c.h.system [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Hazelcast 3.8 (20170217 - d7998b4) starting at [192.168.187.3]:5701
2017-03-15 08:15:34,001 INFO [main] c.h.system [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Copyright (c) 2008-2017, Hazelcast, Inc. All Rights Reserved.
2017-03-15 08:15:34,001 INFO [main] c.h.system [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Configured Hazelcast Serialization version : 1
2017-03-15 08:15:34,507 INFO [main] c.h.s.i.o.i.BackpressureRegulator [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Backpressure is disabled
2017-03-15 08:15:35,170 INFO [main] c.h.i.Node [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Creating MulticastJoiner
2017-03-15 08:15:35,339 INFO [main] c.h.s.i.o.i.OperationExecutorImpl [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Starting 8 partition threads
2017-03-15 08:15:35,342 INFO [main] c.h.s.i.o.i.OperationExecutorImpl [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Starting 5 generic threads (1 dedicated for priority tasks)
2017-03-15 08:15:35,351 INFO [main] c.h.c.LifecycleService [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] [192.168.187.3]:5701 is STARTING
2017-03-15 08:15:37,463 INFO [main] c.h.system [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8] Cluster version set to 3.8
2017-03-15 08:15:37,466 INFO [main] c.h.i.c.i.MulticastJoiner [StandardLoggerFactory.java:49] [192.168.187.3]:5701 [dev] [3.8]
Members [1] {
Member [192.168.187.3]:5701 - 50056bfb-b710-43e0-ad58-57459ed399a5 this
}

Could you try with service dns name as:
myapp.mynamespace.endpoints.cluster.local
Please reply it's work or not and also post your full log.

I know it's kinda happened long time ago. But the problem here was using wrong class name for discovery strategy.
It should be com.hazelcast.kubernetes.HazelcastKubernetesDiscoveryStrategy

Cassandra Java Driver Cold to Hot in 500ms?

I experience a cold to hot (first use) of cluster and session to a local data source (Cassandra) to take 640ms. Any additional connect takes 80 to 100ms so the overhead of the first connect is about 500+ms. Is that normal and is there anything I can do to get this figure down somehow? I use a T410 (i5 2.5GHz).
[Update]
23:27:11.453 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NEW_NODE_DELAY_SECONDS is undefined, using default value 1
23:27:11.460 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NON_BLOCKING_EXECUTOR_SIZE is undefined, using default value 4
23:27:11.463 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NOTIF_LOCK_TIMEOUT_SECONDS is undefined, using default value 60
23:27:11.607 [main] DEBUG com.datastax.driver.core.Cluster - Starting new cluster with contact points [localhost/127.0.0.1:9042]
23:27:11.905 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-1, inFlight=0, closed=false] Transport initialized and ready
23:27:11.906 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:11.969 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing schema
23:27:12.016 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.051 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Successfully connected to localhost/127.0.0.1:9042
23:27:12.052 [main] INFO c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
23:27:12.053 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host localhost/127.0.0.1:9042 added
23:27:12.076 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-2, inFlight=0, closed=false] Transport initialized and ready
23:27:12.077 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Session - Added connection pool for localhost/127.0.0.1:9042
23:27:12.097 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-2, inFlight=0, closed=true] closing connection
23:27:12.103 [main] DEBUG com.datastax.driver.core.Cluster - Shutting down
23:27:12.105 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-1, inFlight=0, closed=true] closing connection
23:27:12.123 [main] DEBUG com.datastax.driver.core.Cluster - Starting new cluster with contact points [/127.0.0.1:9042]
23:27:12.132 [main] DEBUG com.datastax.driver.core.Connection - Connection[/127.0.0.1:9042-1, inFlight=0, closed=false] Transport initialized and ready
23:27:12.132 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.138 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing schema
23:27:12.168 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.192 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Successfully connected to /127.0.0.1:9042
23:27:12.192 [main] INFO c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
23:27:12.192 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host /127.0.0.1:9042 added
23:27:12.201 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/127.0.0.1:9042-2, inFlight=0, closed=false] Transport initialized and ready
23:27:12.202 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Session - Added connection pool for /127.0.0.1:9042
As one can see the first connection attempt uses up to 600ms and more depending how one might read the figures.

My guess is this has to do with connection initialization. In all currently released versions of the java driver connections are initialized 1 after another synchronously. Fortunately, individual host pools are initialized in parallel, but the connections are not. If you are using 2.0.9, which has a default # of core connections of 8 that could explain why you are seeing slow initialization times. Also if you are using password authentication, that will slow things down quite a bit as well (from ~0-10ms per connection to ~60-120ms).
In java driver 2.0.10, which will be released soon, all connections are initialized in parallel which greatly improves Session initialization. For information see JAVA-701.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

com.hazelcast.cp.exception.NotLeaderException no leader election on 3 nodes - java

If restart all members of cluster, it change groupId because state be not consistent without persistents mode, that only in enterprise version. And you need restart clients too. https://github.com/hazelcast/hazelcast/issues/17436

Related

Apache Ignite: Slow Node join and failure

Connect to an azure iot hub from inside a kubernetes cluster via amqp over websockets

Hazelcast web clustered session service: Retrying the connection

hazelcast-kubernetes Network Discovery: How to use with Multiple Nodes

Cassandra Java Driver Cold to Hot in 500ms?

Categories

Resources