Apache Ignite: Slow Node join and failure - java
We have an Ignite setup with 3 servers, with persistence and therefore baselining enabled. From time to time we have the issue that the servers take a long time to rebuild the cluster after all nodes are restarted. Ignite runs embedded in the application.
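For context, the node is started embedded, roughly like the following minimal sketch (class name and values are illustrative, not our production code; note that we do not set a consistent ID, which the startup log below warns about):

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class EmbeddedIgniteStartup {
    public static Ignite start() {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // cfg.setConsistentId("node-1"); // recommended for persistent clusters, currently not set

        // Persistence on the default data region, which makes the cluster baseline-aware.
        DataStorageConfiguration storage = new DataStorageConfiguration();
        storage.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
        cfg.setDataStorageConfiguration(storage);

        return Ignition.start(cfg);
    }
}

The startup log from one of the servers: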
20.11.2020 08:18:17.678 WARN [main] org.apache.ignite.internal.util.typedef.G:290 - Ignite work directory is not provided, automatically resolved to: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work
20.11.2020 08:18:17.709 WARN [main] org.apache.ignite.internal.util.typedef.G:295 - Consistent ID is not set, it is recommended to set consistent ID for production clusters (use IgniteConfiguration.setConsistentId property)
20.11.2020 08:18:18.053 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Config URL: n/a
20.11.2020 08:18:18.084 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - IgniteConfiguration [igniteInstanceName=null, pubPoolSize=8, svcPoolSize=8, callbackPoolSize=8, stripedPoolSize=8, sysPoolSize=8, mgmtPoolSize=4, igfsPoolSize=4, dataStreamerPoolSize=8, utilityCachePoolSize=8, utilityCacheKeepAliveTime=60000, p2pPoolSize=2, qryPoolSize=8, sqlQryHistSize=1000, dfltQryTimeout=0, igniteHome=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite, igniteWorkDir=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work, mbeanSrv=com.sun.jmx.mbeanserver.JmxMBeanServer#78c03f1f, nodeId=0e60d50b-ee2e-46ed-8d76-5cb51791011b, marsh=BinaryMarshaller [], marshLocJobs=false, daemon=false, p2pEnabled=true, netTimeout=5000, netCompressionLevel=1, sndRetryDelay=1000, sndRetryCnt=3, metricsHistSize=10000, metricsUpdateFreq=2000, metricsExpTime=9223372036854775807, discoSpi=TcpDiscoverySpi [addrRslvr=null, sockTimeout=0, ackTimeout=0, marsh=null, reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false], segPlc=STOP, segResolveAttempts=2, waitForSegOnStart=true, allResolversPassReq=true, segChkFreq=10000, commSpi=TcpCommunicationSpi [connectGate=null, connPlc=org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$FirstConnectionPolicy#522ba524, chConnPlc=null, enableForcibleNodeKill=false, enableTroubleshootingLog=false, locAddr=null, locHost=null, locPort=47100, locPortRange=100, shmemPort=-1, directBuf=true, directSndBuf=false, idleConnTimeout=600000, connTimeout=5000, maxConnTimeout=600000, reconCnt=10, sockSndBuf=32768, sockRcvBuf=32768, msgQueueLimit=0, slowClientQueueLimit=0, nioSrvr=null, shmemSrv=null, usePairedConnections=false, connectionsPerNode=1, tcpNoDelay=true, filterReachableAddresses=false, ackSndThreshold=32, unackedMsgsBufSize=0, sockWriteTimeout=2000, boundTcpPort=-1, boundTcpShmemPort=-1, selectorsCnt=4, selectorSpins=0, addrRslvr=null, ctxInitLatch=java.util.concurrent.CountDownLatch#29c5ee1d[Count = 1], stopping=false, metricsLsnr=null], evtSpi=org.apache.ignite.spi.eventstorage.NoopEventStorageSpi#15cea7b0, colSpi=NoopCollisionSpi [], deploySpi=LocalDeploymentSpi [], indexingSpi=org.apache.ignite.spi.indexing.noop.NoopIndexingSpi#1e6cc850, addrRslvr=null, encryptionSpi=org.apache.ignite.spi.encryption.noop.NoopEncryptionSpi#7e7f0f0a, clientMode=false, rebalanceThreadPoolSize=4, rebalanceTimeout=10000, rebalanceBatchesPrefetchCnt=3, rebalanceThrottle=0, rebalanceBatchSize=524288, txCfg=TransactionConfiguration [txSerEnabled=false, dfltIsolation=REPEATABLE_READ, dfltConcurrency=PESSIMISTIC, dfltTxTimeout=0, txTimeoutOnPartitionMapExchange=0, deadlockTimeout=10000, pessimisticTxLogSize=0, pessimisticTxLogLinger=10000, tmLookupClsName=null, txManagerFactory=null, useJtaSync=false], cacheSanityCheckEnabled=true, discoStartupDelay=60000, deployMode=SHARED, p2pMissedCacheSize=100, locHost=null, timeSrvPortBase=31100, timeSrvPortRange=100, failureDetectionTimeout=10000, sysWorkerBlockedTimeout=null, clientFailureDetectionTimeout=30000, metricsLogFreq=0, hadoopCfg=null, connectorCfg=ConnectorConfiguration [jettyPath=null, host=null, port=11211, noDelay=true, directBuf=false, sndBufSize=32768, rcvBufSize=32768, idleQryCurTimeout=600000, idleQryCurCheckFreq=60000, sndQueueLimit=0, selectorCnt=4, idleTimeout=7000, sslEnabled=false, sslClientAuth=false, sslCtxFactory=null, sslFactory=null, portRange=100, threadPoolSize=8, 
msgInterceptor=null], odbcCfg=null, warmupClos=null, atomicCfg=AtomicConfiguration [seqReserveSize=1000, cacheMode=PARTITIONED, backups=1, aff=null, grpName=null], classLdr=null, sslCtxFactory=null, platformCfg=null, binaryCfg=null, memCfg=null, pstCfg=null, dsCfg=DataStorageConfiguration [sysRegionInitSize=10485760, sysRegionMaxSize=52428800, pageSize=0, concLvl=0, dfltDataRegConf=DataRegionConfiguration [name=default, maxSize=858886144, initSize=10485760, swapPath=null, pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, metricsEnabled=true, metricsSubIntervalCount=5, metricsRateTimeInterval=60000, persistenceEnabled=true, checkpointPageBufSize=0, lazyMemoryAllocation=true], dataRegions=DataRegionConfiguration[] [DataRegionConfiguration [name=persistent, maxSize=52428800, initSize=10485760, swapPath=null, pageEvictionMode=DISABLED, evictionThreshold=0.9, emptyPagesPoolSize=100, metricsEnabled=true, metricsSubIntervalCount=5, metricsRateTimeInterval=60000, persistenceEnabled=true, checkpointPageBufSize=0, lazyMemoryAllocation=true]], storagePath=null, checkpointFreq=180000, lockWaitTime=10000, checkpointThreads=4, checkpointWriteOrder=SEQUENTIAL, walHistSize=20, maxWalArchiveSize=1073741824, walSegments=4, walSegmentSize=10485760, walPath=db/wal, walArchivePath=db/wal/archive, metricsEnabled=false, walMode=LOG_ONLY, walTlbSize=131072, walBuffSize=0, walFlushFreq=2000, walFsyncDelay=1000, walRecordIterBuffSize=67108864, alwaysWriteFullPages=false, fileIOFactory=org.apache.ignite.internal.processors.cache.persistence.file.AsyncFileIOFactory#59429fac, metricsSubIntervalCnt=5, metricsRateTimeInterval=60000, walAutoArchiveAfterInactivity=-1, writeThrottlingEnabled=false, walCompactionEnabled=false, walCompactionLevel=1, checkpointReadLockTimeout=null, walPageCompression=DISABLED, walPageCompressionLevel=null], activeOnStart=true, autoActivation=true, longQryWarnTimeout=3000, sqlConnCfg=null, cliConnCfg=ClientConnectorConfiguration [host=null, port=10800, portRange=100, sockSndBufSize=0, sockRcvBufSize=0, tcpNoDelay=true, maxOpenCursorsPerConn=128, threadPoolSize=8, idleTimeout=0, handshakeTimeout=10000, jdbcEnabled=true, odbcEnabled=true, thinCliEnabled=true, sslEnabled=false, useIgniteSslCtxFactory=true, sslClientAuth=false, sslCtxFactory=null, thinCliCfg=ThinClientConfiguration [maxActiveTxPerConn=100]], mvccVacuumThreadCnt=2, mvccVacuumFreq=5000, authEnabled=false, failureHnd=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]], commFailureRslvr=null]
20.11.2020 08:18:18.084 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Daemon mode: off
...
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Remote Management [restart: off, REST: on, JMX (remote: on, port: 8071, auth: off, ssl: off)]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Logger: JavaLogger [quiet=true, config=null]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - IGNITE_HOME=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - VM arguments: [-Dcom.sun.management.jmxremote, -Dcom.sun.management.jmxremote.port=8071, -Dcom.sun.management.jmxremote.authenticate=false, -Dcom.sun.management.jmxremote.ssl=false, -Djava.rmi.server.hostname=127.0.0.1, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=log/dump.hprof, -XX:+UseG1GC, -XX:+UseStringDeduplication, --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED, --add-exports=java.base/sun.nio.ch=ALL-UNNAMED, --add-exports=java.management/com.sun.jmx.mbeanserver=ALL-UNNAMED, --add-exports=jdk.internal.jvmstat/sun.jvmstat.monitor=ALL-UNNAMED, --add-exports=java.base/sun.reflect.generics.reflectiveObjects=ALL-UNNAMED, --illegal-access=permit, -Xmx500m]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - System cache's DataRegion size is configured to 10 MB. Use DataStorageConfiguration.systemRegionInitialSize property to change the setting.
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Configured caches [in 'sysMemPlc' dataRegion: ['ignite-sys-cache']]
20.11.2020 08:18:18.100 WARN [main] org.apache.ignite.internal.IgniteKernal:295 - Peer class loading is enabled (disable it in production for performance and deployment consistency reasons)
20.11.2020 08:18:18.100 WARN [main] org.apache.ignite.internal.IgniteKernal:295 - Please set system property '-Djava.net.preferIPv4Stack=true' to avoid possible problems in mixed environments.
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - 3-rd party licenses can be found at: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\libs\licenses
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_VERSION=2.1.4]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [NODE_NAME=EESRV-LBXC03]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_NUMBER=848]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [NODE_TYPE=LABBOX]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [VERSION=0]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_TIME=1604577743000]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [APPLICATION_NAME=Labbox]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [BUILD_GIT_HASH=ff2f1f3]
20.11.2020 08:18:18.100 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Local node user attribute [KEY=_OL2;f~.C3n}yo6p<Zx=BE4I2P:lDL"f]
20.11.2020 08:18:18.163 WARN [pub-#19] org.apache.ignite.internal.GridDiagnostic:295 - This operating system has been tested less rigorously: Windows Server 2012 R2 6.3 amd64. Our team will appreciate the feedback if you experience any problems running ignite in this environment.
20.11.2020 08:18:18.163 WARN [pub-#22] org.apache.ignite.internal.GridDiagnostic:295 - Initial heap size is 64MB (should be no less than 512MB, use -Xms512m -Xmx512m).
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - Configured plugins:
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - ^-- Authentication 1.0.0
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 - ^-- null
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.p.plugin.IgnitePluginProcessor:285 -
20.11.2020 08:18:18.334 INFO [main] o.a.i.i.processors.failure.FailureProcessor:285 - Configured failure handler: [hnd=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]]
20.11.2020 08:18:18.600 INFO [main] o.a.i.s.communication.tcp.TcpCommunicationSpi:285 - Successfully bound communication NIO server to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0, selectorsCnt=4, selectorSpins=0, pairedConn=false]
20.11.2020 08:18:18.678 WARN [main] o.a.i.s.communication.tcp.TcpCommunicationSpi:295 - Message queue limit is set to 0 which may lead to potential OOMEs when running cache operations in FULL_ASYNC or PRIMARY_SYNC modes due to message queues growth on sender and receiver sides.
20.11.2020 08:18:18.694 WARN [main] o.a.i.spi.checkpoint.noop.NoopCheckpointSpi:295 - Checkpoints are disabled (to enable configure any GridCheckpointSpi implementation)
20.11.2020 08:18:18.741 WARN [main] o.a.i.i.m.collision.GridCollisionManager:295 - Collision resolution is disabled (all jobs will be activated upon arrival).
20.11.2020 08:18:18.741 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Security status [authentication=off, tls/ssl=off]
20.11.2020 08:18:18.866 INFO [main] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0, locNodeId=0e60d50b-ee2e-46ed-8d76-5cb51791011b]
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.p.filename.PdsFoldersResolver:285 - Successfully locked persistence storage folder [D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5]
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.p.filename.PdsFoldersResolver:285 - Consistent ID used for local node is [1dbddb2c-ef76-4811-b7d3-46da82061bc5] according to persistence data storage folders
20.11.2020 08:18:18.866 INFO [main] o.a.i.i.p.c.b.CacheObjectBinaryProcessorImpl:285 - Resolved directory for serialized binary metadata: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\binary_meta\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5
20.11.2020 08:18:19.631 INFO [main] o.a.i.i.p.c.p.file.FilePageStoreManager:285 - Resolved page store work directory: D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5
20.11.2020 08:18:19.694 INFO [main] o.a.i.i.p.c.p.w.f.FileHandleManagerImpl:285 - Initialized write-ahead log manager [mode=LOG_ONLY]
20.11.2020 08:18:19.772 WARN [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:295 - DataRegionConfiguration.maxWalArchiveSize instead DataRegionConfiguration.walHistorySize would be used for removing old archive wal files
20.11.2020 08:18:19.803 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Configured data regions initialized successfully [total=5]
20.11.2020 08:18:19.834 INFO [main] o.a.i.i.p.c.d.d.t.PartitionsEvictManager:285 - Evict partition permits=2
20.11.2020 08:18:19.850 INFO [main] o.a.i.i.p.odbc.ClientListenerProcessor:285 - Client connector processor has started on TCP port 10800
20.11.2020 08:18:20.006 INFO [main] o.a.i.i.p.r.protocols.tcp.GridTcpRestProtocol:285 - Command protocol successfully started [name=TCP binary, host=0.0.0.0/0.0.0.0, port=11211]
20.11.2020 08:18:20.115 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Non-loopback local IPs: 192.168.92.177, fe80:0:0:0:6859:37c8:f543:8087%eth4
20.11.2020 08:18:20.115 INFO [main] org.apache.ignite.internal.IgniteKernal:285 - Enabled local MACs: 00000000000000E0, 005056BD5072
20.11.2020 08:18:20.131 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:20.147 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.147 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Checking memory state [lastValidPos=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:20.225 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#5853495b, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.256 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#21f459fc, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.256 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Found last checkpoint marker [cpId=8b5aaf2a-7867-47b0-879c-85791363041f, pos=FileWALPointer [idx=512, fileOff=3672982, len=99269]]
20.11.2020 08:18:20.350 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:20.365 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#6c15e8c7, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Finished applying WAL changes [updatesApplied=0, time=31 ms]
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Restoring partition state for local groups.
20.11.2020 08:18:20.381 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Finished restoring partition state for local groups [groupsProcessed=0, partitionsProcessed=0, time=0ms]
20.11.2020 08:18:20.412 INFO [main] o.a.i.i.p.cluster.GridClusterStateProcessor:285 - Restoring history for BaselineTopology[id=12]
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.c.DistributedBaselineConfiguration:285 - Baseline parameter 'baselineAutoAdjustEnabled' was changed from 'null' to 'true'
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.c.DistributedBaselineConfiguration:285 - Baseline parameter 'baselineAutoAdjustTimeout' was changed from 'null' to '300000'
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.p.c.p.file.FilePageStoreManager:285 - Cleanup cache stores [total=1, left=0, cleanFiles=false]
20.11.2020 08:18:20.522 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Configured data regions started successfully [total=5]
20.11.2020 08:18:20.537 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Starting binary memory restore for: [166757441, -1947899996, -8785046, -2100569601, 1793235927, -499392514, 30677022, 129211407, 1139332309, 1725334265]
20.11.2020 08:18:21.334 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:21.334 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Checking memory state [lastValidPos=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:21.365 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#317e9c3c, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.397 WARN [main] o.a.i.i.p.c.p.wal.FileWriteAheadLogManager:290 - WAL segment tail reached. [idx=512, isWorkDir=true, serVer=org.apache.ignite.internal.processors.cache.persistence.wal.serializer.RecordV2Serializer#31a3f4de, actualFilePtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.397 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Found last checkpoint marker [cpId=8b5aaf2a-7867-47b0-879c-85791363041f, pos=FileWALPointer [idx=512, fileOff=3672982, len=99269]]
20.11.2020 08:18:21.412 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Binary memory state restored at node startup [restoredPtr=FileWALPointer [idx=512, fileOff=3772251, len=0]]
20.11.2020 08:18:21.428 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=50,0 MiB, pages=12404, tableSize=988,2 KiB, checkpointBuffer=50,0 MiB]
20.11.2020 08:18:21.568 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=license, id=166757441, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.584 INFO [main] o.a.i.i.p.c.p.pagemem.PageMemoryImpl:285 - Started page memory [memoryAllocated=819,1 MiB, pages=203256, tableSize=15,8 MiB, checkpointBuffer=256,0 MiB]
20.11.2020 08:18:21.584 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=commservices, id=-8785046, dataRegionName=default, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=ignite-sys-cache, id=-2100569601, dataRegionName=sysMemPlc, mode=REPLICATED, atomicity=TRANSACTIONAL, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinespecifications, id=1793235927, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.615 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=nxisPorts, id=-499392514, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=datastructures_ATOMIC_PARTITIONED_1#labqueue, id=1205724040, group=labqueue, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=ignite-sys-atomic-cache#labqueue, id=-327698687, group=labqueue, dataRegionName=default, mode=PARTITIONED, atomicity=TRANSACTIONAL, backups=1, mvcc=false]
20.11.2020 08:18:21.631 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinemaxbatchno, id=30677022, dataRegionName=persistent, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machineconfiguration, id=129211407, dataRegionName=persistent, mode=REPLICATED, atomicity=ATOMIC, backups=2147483647, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=specimentracer, id=1139332309, dataRegionName=persistent, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Started cache in recovery mode [name=machinestatus, id=1725334265, dataRegionName=default, mode=PARTITIONED, atomicity=ATOMIC, backups=1, mvcc=false]
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Binary recovery performed in 1109 ms.
20.11.2020 08:18:21.646 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Read checkpoint status [startMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-START.bin, endMarker=D:\IntegrationSolutions\Services\LabDeviceHUB\Labbox\.\..\userdata\labbox\ignite\work\db\node00-1dbddb2c-ef76-4811-b7d3-46da82061bc5\cp\1605855371041-8b5aaf2a-7867-47b0-879c-85791363041f-END.bin]
20.11.2020 08:18:21.662 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Applying lost cache updates since last checkpoint record [lastMarked=FileWALPointer [idx=512, fileOff=3672982, len=99269], lastCheckpointId=8b5aaf2a-7867-47b0-879c-85791363041f]
20.11.2020 08:18:21.693 INFO [main] o.a.i.i.p.c.p.GridCacheDatabaseSharedManager:285 - Finished applying WAL changes [updatesApplied=0, time=31 ms]
20.11.2020 08:18:21.693 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Restoring partition state for local groups.
20.11.2020 08:18:21.943 INFO [main] o.a.i.i.processors.cache.GridCacheProcessor:285 - Finished restoring partition state for local groups [groupsProcessed=10, partitionsProcessed=5220, time=235ms]
20.11.2020 08:18:22.021 INFO [main] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Connection check threshold is calculated: 10000
20.11.2020 08:19:19.373 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.175, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery spawning a new thread for connection [rmtAddr=/192.168.92.175, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-sock-reader-[]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Started serving remote node connection [rmtAddr=/192.168.92.175:56962, rmtPort=56962]
20.11.2020 08:19:19.389 INFO [tcp-disco-sock-reader-[9f44068b 192.168.92.175:56962 client]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Initialized connection with remote client node [nodeId=9f44068b-b8ca-4d8b-bb32-efd2e2a1940c, rmtAddr=/192.168.92.175:56962]
20.11.2020 08:19:19.498 INFO [tcp-disco-sock-reader-[9f44068b 192.168.92.175:56962 client]-#4] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Finished serving remote node connection [rmtAddr=/192.168.92.175:56962, rmtPort=56962
20.11.2020 08:20:21.287 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.176, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery spawning a new thread for connection [rmtAddr=/192.168.92.176, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Started serving remote node connection [rmtAddr=/192.168.92.176:55941, rmtPort=55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[6a50abff 192.168.92.176:55941]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Initialized connection with remote server node [nodeId=6a50abff-8cfd-4b3a-b894-54fa9d405d36, rmtAddr=/192.168.92.176:55941]
20.11.2020 08:20:21.287 INFO [tcp-disco-sock-reader-[6a50abff 192.168.92.176:55941]-#5] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - Finished serving remote node connection [rmtAddr=/192.168.92.176:55941, rmtPort=55941
20.11.2020 08:20:26.239 INFO [tcp-disco-srvr-[:47500]-#3] o.a.ignite.spi.discovery.tcp.TcpDiscoverySpi:285 - TCP discovery accepted incoming connection [rmtAddr=/192.168.92.175, rmtPort=56996]
... it continues like that until the join or the failure.
The logs look the same on all servers. In this case servers 1 and 2 formed a cluster after 7 minutes; server 3 failed after 9 minutes due to an incompatible baseline topology. After resetting the failed server, it could rejoin the cluster. This behavior only occurs sometimes; most of the time the servers rebuild the cluster without problems.
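For reference, once the healthy servers are up and the failed one has been reset, the baseline can be re-pinned to the current topology either with the control.sh/control.bat --baseline commands or programmatically. A minimal sketch of the Java variant (hypothetical helper; assumes Ignite 2.8+, which matches the baseline auto-adjust messages in the log):

import org.apache.ignite.Ignite;

public class BaselineHelper {
    /** Pins the baseline topology to the servers currently in the cluster. */
    public static void pinBaselineToCurrentTopology(Ignite ignite) {
        // The baseline can only be changed on an active cluster.
        if (!ignite.cluster().active())
            ignite.cluster().active(true);

        // Registers the current server topology version as the new baseline,
        // so a node that was reset becomes a baseline member again.
        ignite.cluster().setBaselineTopology(ignite.cluster().topologyVersion());
    }
}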
Related
Connect to an Azure IoT Hub from inside a Kubernetes cluster via AMQP over WebSockets
We are trying to communicate with an Azure IoT Hub via AMQP over WebSockets from a Java Docker container inside an Azure Kubernetes cluster. Sadly, it seems that the container can't establish a connection, while locally, or even on another virtual machine (where only Docker is installed), the container runs successfully. The network policy rules should allow all protocols and ports necessary to communicate with the Event Hubs endpoint of the IoT Hub. Does anybody know which switch we have to pull to "allow" the container in the cluster to communicate with the IoT Hub? The only logs we have are these:
13:10:26.688 [main] DEBUG reactor.util.Loggers$LoggerFactory - Using Slf4j logging framework
13:10:26.851 [main] INFO com.azure.messaging.eventhubs.EventHubClientBuilder - connectionId[REDACTED]: Emitting a single connection.
13:10:26.901 [main] DEBUG com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Connection state: UNINITIALIZED
13:10:26.903 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Setting next AMQP channel.
13:10:26.903 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Next AMQP channel received, updating 0 current subscribers
13:10:26.920 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Creating and starting connection to REDACTED:443
13:10:26.940 [main] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Starting reactor.]
13:10:26.955 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionInit hostname[REDACTED], connectionId[REDACTED]
13:10:26.956 [single-1] INFO com.azure.core.amqp.implementation.handler.ReactorHandler - connectionId[REDACTED] reactor.onReactorInit
13:10:26.956 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionLocalOpen hostname[REDACTED:443], connectionId[REDACTED], errorCondition[null], errorDescription[null]
13:10:26.975 [main] DEBUG com.azure.core.amqp.implementation.ReactorSession - Connection state: UNINITIALIZED
13:10:26.991 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - Emitting new response channel. connectionId: REDACTED. entityPath: $management. linkName: mgmt.
13:10:26.991 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: Setting next AMQP channel.
13:10:26.991 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: Next AMQP channel received, updating 0 current subscribers
13:10:26.993 [main] INFO com.azure.messaging.eventhubs.implementation.ManagementChannel - Management endpoint state: UNINITIALIZED
13:10:27.032 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - Upstream connection publisher was completed. Terminating processor.
13:10:27.033 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: AMQP channel processor completed. Notifying 0 subscribers.
13:10:27.040 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubReactorAmqpConnection - connectionId[REDACTED]: Disposing of connection.
13:10:27.040 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - Upstream connection publisher was completed. Terminating processor.
13:10:27.040 [main] INFO class com.azure.core.amqp.implementation.RequestResponseChannel<mgmt-session> - namespace[REDACTED] entityPath[$management]: AMQP channel processor completed. Notifying 0 subscribers.
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Disposing of ReactorConnection.
13:10:27.041 [main] INFO com.azure.messaging.eventhubs.implementation.EventHubConnectionProcessor - namespace[REDACTED] entityPath[REDACTED]: Channel is disposed.
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorConnection - connectionId[REDACTED]: Removing session 'mgmt-session'
13:10:27.041 [main] INFO com.azure.core.amqp.implementation.ReactorSession - sessionId[mgmt-session]: Disposing of session.
13:10:27.043 [main] INFO com.azure.core.amqp.implementation.AmqpExceptionHandler - Shutdown received: ReactorExecutor.close() was called., isTransient[false], initiatedByClient[true]
13:10:27.089 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SessionHandler - onSessionLocalOpen connectionId[REDACTED], entityName[mgmt-session], condition[Error{condition=null, description='null', info=null}]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.SendLinkHandler - onLinkLocalClose connectionId[REDACTED], linkName[mgmt:sender], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ReceiveLinkHandler - onLinkLocalClose connectionId[REDACTED], linkName[mgmt:receiver], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SessionHandler - onSessionLocalClose connectionId[mgmt-session], entityName[REDACTED], condition[Error{condition=null, description='null', info=null}]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionLocalClose hostname[REDACTED:443], connectionId[REDACTED], errorCondition[null], errorDescription[null]
13:10:27.090 [single-1] INFO com.azure.core.amqp.implementation.handler.ConnectionHandler - onConnectionBound hostname[REDACTED], connectionId[REDACTED]
13:10:27.098 [single-1] DEBUG com.azure.core.amqp.implementation.handler.WebSocketsConnectionHandler - connectionId[REDACTED] Adding web sockets transport layer for hostname[REDACTED:443]
13:10:27.125 [single-1] DEBUG com.azure.core.amqp.implementation.handler.DispatchHandler - Running task for event: %s
13:10:27.126 [single-1] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Processing all pending tasks and closing old reactor.]
13:10:27.126 [single-1] DEBUG com.azure.core.amqp.implementation.handler.SendLinkHandler - onLinkLocalOpen connectionId[REDACTED], linkName[mgmt:sender], localTarget[Target{address='$management', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, capabilities=null}]
13:10:27.126 [single-1] INFO com.azure.core.amqp.implementation.handler.ReceiveLinkHandler - onLinkLocalOpen connectionId[REDACTED], linkName[mgmt:receiver], localSource[Source{address='$management', durable=NONE, expiryPolicy=SESSION_END, timeout=0, dynamic=false, dynamicNodeProperties=null, distributionMode=null, filter=null, defaultOutcome=null, outcomes=null, capabilities=null}]
13:10:27.127 [single-1] INFO com.azure.core.amqp.implementation.ReactorExecutor - connectionId[REDACTED], message[Stopping the reactor because thread was interrupted or the reactor has no more events to process.]
The failure was not the networking at all. My mistake was assuming that if I can run the container manually via docker run -it * it should also work in a Kubernetes cluster. But with the -it argument the container stays open and a pseudo-TTY is attached. In the Kubernetes cluster this of course will not happen, so we had to adjust the loop logic of the Java application, and afterwards it works. Thanks to all.
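For anyone hitting the same thing, the adjustment boils down to blocking the main thread on something other than an attached TTY. A minimal sketch (hypothetical Main class; startIotHubClient stands in for the actual client wiring):

import java.util.concurrent.CountDownLatch;

public final class Main {
    public static void main(String[] args) throws InterruptedException {
        CountDownLatch shutdown = new CountDownLatch(1);
        Runtime.getRuntime().addShutdownHook(new Thread(shutdown::countDown));

        startIotHubClient(); // assumed: wires up the AMQP/WebSocket client asynchronously

        // Block here instead of reading from stdin, so the process stays alive
        // even when no pseudo-TTY is attached (as in a Kubernetes pod).
        shutdown.await();
    }

    private static void startIotHubClient() {
        // application-specific wiring (omitted)
    }
}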
Apache Storm 1.1.0 not running and giving Unable to read additional data from client sessionid
I am running a simple Hello World kind of application in Apache Storm 1.1.0. The application has a random integer spout and a bolt which prints the tuple output, but somehow I am not able to get it working on my Windows system. I am new to Apache Storm and following a tutorial. I have looked for answers on Stack Overflow, but I was not able to find any solved question about this. The following is my topology code:

public static void runTopology() {
    //String filePath = "./src/main/resources/operations.txt";
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("randomNumberSpout", new RandomIntSpout());
    builder.setBolt("printingBolt", new PrintingBolt()).shuffleGrouping("randomNumberSpout");
    Config config = new Config();
    config.setDebug(true);
    LocalCluster cluster = new LocalCluster();
    try {
        cluster.submitTopology("Test", config, builder.createTopology());
    } finally {
        cluster.shutdown();
    }
}

Bolt code:

public class PrintingBolt extends BaseBasicBolt {
    private static final long serialVersionUID = 1L;

    public void execute(Tuple tuple, BasicOutputCollector basicOutputCollector) {
        System.out.println("Printing Tupple!!!!");
        System.out.println(tuple);
        System.out.println("Tupple processed " + tuple.getInteger(1));
        basicOutputCollector.emit(new Values(tuple.getInteger(1)));
    }

    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        outputFieldsDeclarer.declare(new Fields("TestOutput"));
    }
}

Spout code:

public class RandomIntSpout extends BaseRichSpout {
    private static final long serialVersionUID = 1L;
    private Random random;
    private SpoutOutputCollector outputCollector;

    /*@Override
    public void open(Map<String,Object> map, TopologyContext topologyContext, SpoutOutputCollector spoutOutputCollector) {
        random = new Random();
        outputCollector = spoutOutputCollector;
    }*/

    public void nextTuple() {
        Utils.sleep(1000);
        outputCollector.emit(new Values(random.nextInt(), System.currentTimeMillis()));
    }

    public void declareOutputFields(OutputFieldsDeclarer outputFieldsDeclarer) {
        outputFieldsDeclarer.declare(new Fields("randomInt", "timestamp"));
    }

    public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
        random = new Random();
        outputCollector = collector;
    }
}

I can provide the rest of the code as well, but I don't think that will be required; if it is, please mention it in the comments. I get the following error whenever I try to run the application:

10620 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Initiating client connection, connectString=localhost:2000/storm sessionTimeout=20000 > watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#31b0f02 10625 [main-SendThread(0:0:0:0:0:0:0:1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Opening socket connection to server > 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2000.
Will not attempt to authenticate using SASL (unknown error) 10627 [main-SendThread(0:0:0:0:0:0:0:1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Socket connection established to > 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2000, initiating session 10627 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - Accepted socket connection from /> 0:0:0:0:0:0:0:1:56905 10628 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - Client attempting to establish new session at /> 0:0:0:0:0:0:0:1:56905 10631 [main-SendThread(0:0:0:0:0:0:0:1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Session establishment complete on server > 0:0:0:0:0:0:0:1/0:0:0:0:0:0:0:1:2000, sessionid = 0x16a8e5abd97000d, negotiated timeout = 20000 10631 [SyncThread:0] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - Established session 0x16a8e5abd97000d with negotiated timeout 20000 for client /> 0:0:0:0:0:0:0:1:56905 10632 [main-EventThread] INFO o.a.s.s.o.a.c.f.s.ConnectionStateManager - State change: CONNECTED 10635 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Got user-level KeeperException when processing sessionid:0x16a8e5abd97000d type:create cxid:0x2 zxid:0x26 txntype:-1 reqpath:n/a Error Path:/storm/blobstoremaxkeysequencenumber > Error:KeeperErrorCode = NoNode for /storm/blobstoremaxkeysequencenumber 10655 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 10657 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd97000d 10659 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 10659 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd97000d closed 10661 [main] INFO o.a.s.cluster - setup-path/blobstore/Test-1-1557166474-stormconf.ser/IBMT450PC053RLV.Corp.CVS.com:6627-1 10660 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd97000d, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 10671 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /> 0:0:0:0:0:0:0:1:56905 which had sessionid 0x16a8e5abd97000d 10746 [main] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - Starting 10747 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Initiating client connection, connectString=localhost:2000/storm sessionTimeout=20000 > watcher=org.apache.storm.shade.org.apache.curator.ConnectionState#73893ec1 10755 [main-SendThread(127.0.0.1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Opening socket connection to server 127.0.0.1/127.0.0.1:2000. 
Will not > attempt to authenticate using SASL (unknown error) 10756 [main-SendThread(127.0.0.1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Socket connection established to 127.0.0.1/127.0.0.1:2000, initiating > session 10758 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - Accepted socket connection from /127.0.0.1:56908 10759 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - Client attempting to establish new session at > /127.0.0.1:56908 10766 [main-SendThread(127.0.0.1:2000)] INFO o.a.s.s.o.a.z.ClientCnxn - Session establishment complete on server 127.0.0.1/127.0.0.1:2000, > sessionid = 0x16a8e5abd97000e, negotiated timeout = 20000 10766 [SyncThread:0] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - Established session 0x16a8e5abd97000e with negotiated timeout 20000 for client > /127.0.0.1:56908 10767 [main-EventThread] INFO o.a.s.s.o.a.c.f.s.ConnectionStateManager - State change: CONNECTED 10778 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 10781 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd97000e 10785 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd97000e closed 10785 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56908 > which had sessionid 0x16a8e5abd97000e 10785 [main] INFO o.a.s.cluster - setup-path/blobstore/Test-1-1557166474-stormcode.ser/IBMT450PC053RLV.Corp.CVS.com:6627-1 10786 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 10821 [main] INFO o.a.s.d.nimbus - desired replication count 1 achieved, current-replication-count for conf key = 1, current-replication-count > for code key = 1, current-replication-count for jar key = 1 11042 [main] INFO o.a.s.d.nimbus - Activating Test: Test-1-1557166474 11058 [main] INFO o.a.s.d.nimbus - Shutting down master 11064 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11066 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970003 11068 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11069 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd970003, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11068 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970003 closed 11069 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56875 > which had sessionid 0x16a8e5abd970003 11069 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11072 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970004 11074 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread 
shut down 11075 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /> 0:0:0:0:0:0:0:1:56878 which had sessionid 0x16a8e5abd970004 11074 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970004 closed 11077 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11079 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970000 11081 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970000 closed 11081 [main] INFO o.a.s.zookeeper - closing zookeeper connection of leader elector. 11082 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11082 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56866 > which had sessionid 0x16a8e5abd970000 11082 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11084 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970001 11086 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970001 closed 11087 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11087 [main] INFO o.a.s.d.nimbus - Shut down master 11087 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11089 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd970001, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11089 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56869 > which had sessionid 0x16a8e5abd970001 11090 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970006 11092 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970006 closed 11093 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11093 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd970006, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11094 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56884 > which had sessionid 0x16a8e5abd970006 11095 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11095 
[ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd970008 11098 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd970008 closed 11098 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11099 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1024,5,main] assignment to null 11099 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1025,5,main] assignment to null 11099 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd970008, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11099 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1026,5,main] assignment to null 11099 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1024,5,main] to be EMPTY, currently EMPTY 11099 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1025,5,main] to be EMPTY, currently EMPTY 11099 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /> 0:0:0:0:0:0:0:1:56890 which had sessionid 0x16a8e5abd970008 11099 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1026,5,main] to be EMPTY, currently EMPTY 11099 [main] INFO o.a.s.d.s.Supervisor - Shutting down supervisor 009e412c-0d39-400c-8302-08296524c703 11100 [Thread-10] INFO o.a.s.e.EventManagerImp - Event manager interrupted 11102 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11103 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd97000a 11105 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd97000a closed 11105 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1027,5,main] assignment to null 11105 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11105 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1028,5,main] assignment to null 11105 [main] INFO o.a.s.d.s.ReadClusterState - Setting Thread[SLOT_1029,5,main] assignment to null 11106 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1027,5,main] to be EMPTY, currently EMPTY 11106 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1028,5,main] to be EMPTY, currently EMPTY 11106 [main] INFO o.a.s.d.s.ReadClusterState - Waiting for Thread[SLOT_1029,5,main] to be EMPTY, currently EMPTY 11106 [main] INFO o.a.s.d.s.Supervisor - Shutting down supervisor 5daf8496-451f-43ca-b176-b16055d6183c 11106 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd97000a, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at 
org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11106 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /> 0:0:0:0:0:0:0:1:56896 which had sessionid 0x16a8e5abd97000a 11106 [Thread-14] INFO o.a.s.e.EventManagerImp - Event manager interrupted 11108 [Curator-Framework-0] INFO o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl - backgroundOperationsLoop exiting 11109 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Processed session termination for sessionid: > 0x16a8e5abd97000c 11112 [main] INFO o.a.s.s.o.a.z.ZooKeeper - Session: 0x16a8e5abd97000c closed 11112 [main-EventThread] INFO o.a.s.s.o.a.z.ClientCnxn - EventThread shut down 11112 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] WARN o.a.s.s.o.a.z.s.NIOServerCnxn - caught end of stream exception org.apache.storm.shade.org.apache.zookeeper.server.ServerCnxn$EndOfStreamException: Unable to read additional data from client sessionid > 0x16a8e5abd97000c, likely client has closed socket at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228) [storm-core-1.1.0.jar:1.1.0] at org.apache.storm.shade.org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208) [storm-core-1.1.0.jar:1.1.0] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_60] 11113 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxn - Closed socket connection for client /127.0.0.1:56902 > which had sessionid 0x16a8e5abd97000c 11114 [main] INFO o.a.s.testing - Shutting down in process zookeeper 11115 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2000] INFO o.a.s.s.o.a.z.s.NIOServerCnxnFactory - NIOServerCnxn factory exited run method 11116 [main] INFO o.a.s.s.o.a.z.s.ZooKeeperServer - shutting down 11116 [main] INFO o.a.s.s.o.a.z.s.SessionTrackerImpl - Shutting down 11116 [main] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - Shutting down 11117 [main] INFO o.a.s.s.o.a.z.s.SyncRequestProcessor - Shutting down 11117 [SyncThread:0] INFO o.a.s.s.o.a.z.s.SyncRequestProcessor - SyncRequestProcessor exited! 11117 [ProcessThread(sid:0 cport:-1):] INFO o.a.s.s.o.a.z.s.PrepRequestProcessor - PrepRequestProcessor exited loop! 11117 [main] INFO o.a.s.s.o.a.z.s.FinalRequestProcessor - shutdown of request processor complete 11118 [main] INFO o.a.s.testing - Done shutting down in process zookeeper 11118 [main] INFO o.a.s.testing - Deleting temporary path C:\Users\AKHAND~1\AppData\Local\Temp\ae4119b4-70b3-4d04-9aee-5bfae4c4775b 11203 [main] INFO o.a.s.testing - Deleting temporary path C:\Users\AKHAND~1\AppData\Local\Temp\a78b8c79-b9b3-438d-8df6-5d7bd74281fc 11215 [main] INFO o.a.s.testing - Unable to delete file: > C:\Users\AKHAND~1\AppData\Local\Temp\a78b8c79-b9b3-438d-8df6-5d7bd74281fc\version-2\log.1 11215 [main] INFO o.a.s.testing - Deleting temporary path C:\Users\AKHAND~1\AppData\Local\Temp\0e4fbadc-ad33-4577-9784-4cc163a778fa 11255 [main] INFO o.a.s.testing - Deleting temporary path C:\Users\AKHAND~1\AppData\Local\Temp\456d6b1d-eb21-4b76-98f1-a2bb44b2aa5e 12197 [SessionTracker] INFO o.a.s.s.o.a.z.s.SessionTrackerImpl - SessionTrackerImpl exited loop! I am not able to understand why client socket is closed and why session is closed? I am not able to get it working. Please help.
I think you might need to add a sleep here:

try {
    cluster.submitTopology("Test", config, builder.createTopology());
    Utils.sleep(10000); // e.g. org.apache.storm.utils.Utils, already used in your spout
} finally {
    cluster.shutdown();
}

Currently you are submitting the topology and immediately shutting down. Unless you sleep for a bit, your topology doesn't get a chance to run.
Hazelcast web clustered session service: Retrying the connection
Hazelcast is unable to connect; the message received is as follows:
2018-03-03 10:27:51,074 INFO c.h.i.DefaultAddressPicker [LOCAL] [dev] [3.6] Picked Address[127.0.0.1]:5703, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5703], bind any local is true
2018-03-03 10:28:01.078 INFO 19478 --- [.ensureInstance] c.hazelcast.web.ClusteredSessionService : Retrying the connection!!
2018-03-03 10:28:01,078 INFO c.h.w.ClusteredSessionService Retrying the connection!!
2018-03-03 10:28:01.079 INFO 19478 --- [.ensureInstance] com.hazelcast.config.XmlConfigLocator : Loading 'hazelcast-default.xml' from classpath.
2018-03-03 10:28:01,079 INFO c.h.c.XmlConfigLocator Loading 'hazelcast-default.xml' from classpath.
2018-03-03 10:28:01.085 INFO 19478 --- [.ensureInstance] c.hazelcast.web.HazelcastInstanceLoader : Creating a new HazelcastInstance for session replication
2018-03-03 10:28:01,085 INFO c.h.w.HazelcastInstanceLoader Creating a new HazelcastInstance for session replication
2018-03-03 10:28:01.086 INFO 19478 --- [.ensureInstance] c.h.instance.DefaultAddressPicker : [LOCAL] [dev] [3.6] Picked Address[127.0.0.1]:5703, using socket ServerSocket[addr=/0:0:0:0:0:0:0:0,localport=5703], bind any local is true
Riak java client, execute() never returns
I've set up a Riak server on Ubuntu. http://192.168.0.102:8098/ping returns "OK". I'm trying to connect to it remotely using the Riak Java client (2.1.1) with the following code. client.execute() never returns. I'm attaching the log as well.

    public class Testing {
        public static void main(String[] args) throws ExecutionException, InterruptedException, UnknownHostException {
            RiakClient client = RiakClient.newClient(8098, "192.168.0.102");

            // put some stuff
            Namespace ns = new Namespace("TestBucket");
            Location location = new Location(ns, "TestKey");
            String myData = "TestValue";
            StoreValue store = new StoreValue.Builder(myData)
                    .withLocation(location).build();
            Response rv = client.execute(store); // << NEVER GETS PAST THIS
            System.out.println("write done");

            // get some stuff
            FetchValue fv = new FetchValue.Builder(location).build();
            FetchValue.Response response = client.execute(fv);
            String obj = response.getValue(String.class);
            System.out.println(obj);
            System.out.println("fetch done");
        }
    }

Log on the console is:

17:19:40.841 [main] DEBUG i.n.u.i.l.InternalLoggerFactory - Using SLF4J as the default logging framework
17:19:40.865 [main] DEBUG i.n.c.MultithreadEventLoopGroup - -Dio.netty.eventLoopThreads: 16
17:19:40.891 [main] DEBUG i.n.util.internal.PlatformDependent0 - java.nio.Buffer.address: available
17:19:40.892 [main] DEBUG i.n.util.internal.PlatformDependent0 - sun.misc.Unsafe.theUnsafe: available
17:19:40.893 [main] DEBUG i.n.util.internal.PlatformDependent0 - sun.misc.Unsafe.copyMemory: available
17:19:40.894 [main] DEBUG i.n.util.internal.PlatformDependent0 - direct buffer constructor: available
17:19:40.894 [main] DEBUG i.n.util.internal.PlatformDependent0 - java.nio.Bits.unaligned: available, true
17:19:40.894 [main] DEBUG i.n.util.internal.PlatformDependent0 - java.nio.DirectByteBuffer.<init>(long, int): available
17:19:40.896 [main] DEBUG io.netty.util.internal.Cleaner0 - java.nio.ByteBuffer.cleaner(): available
17:19:40.896 [main] DEBUG i.n.util.internal.PlatformDependent - Platform: Windows
17:19:40.897 [main] DEBUG i.n.util.internal.PlatformDependent - Java version: 8
17:19:40.897 [main] DEBUG i.n.util.internal.PlatformDependent - -Dio.netty.noUnsafe: false
17:19:40.897 [main] DEBUG i.n.util.internal.PlatformDependent - sun.misc.Unsafe: available
17:19:40.898 [main] DEBUG i.n.util.internal.PlatformDependent - -Dio.netty.noJavassist: false
17:19:40.899 [main] DEBUG i.n.util.internal.PlatformDependent - Javassist: unavailable
17:19:40.899 [main] DEBUG i.n.util.internal.PlatformDependent - You don't have Javassist in your class path or you don't have enough permission to load dynamically generated classes. Please check the configuration for better performance.
17:19:40.899 [main] DEBUG i.n.util.internal.PlatformDependent - -Dio.netty.tmpdir: C:\Users\Rakesh\AppData\Local\Temp (java.io.tmpdir)
17:19:40.900 [main] DEBUG i.n.util.internal.PlatformDependent - -Dio.netty.bitMode: 32 (sun.arch.data.model)
17:19:40.900 [main] DEBUG i.n.util.internal.PlatformDependent - -Dio.netty.noPreferDirect: false
17:19:40.900 [main] DEBUG i.n.util.internal.PlatformDependent - io.netty.maxDirectMemory: 259522560 bytes
17:19:40.921 [main] DEBUG io.netty.channel.nio.NioEventLoop - -Dio.netty.noKeySetOptimization: false
17:19:40.921 [main] DEBUG io.netty.channel.nio.NioEventLoop - -Dio.netty.selectorAutoRebuildThreshold: 512
17:19:40.922 [main] DEBUG i.n.util.internal.PlatformDependent - org.jctools-core.MpscChunkedArrayQueue: available
17:19:41.039 [main] DEBUG io.netty.channel.DefaultChannelId - -Dio.netty.processId: 2924 (auto-detected)
17:19:41.041 [main] DEBUG io.netty.util.NetUtil - -Djava.net.preferIPv4Stack: false
17:19:41.041 [main] DEBUG io.netty.util.NetUtil - -Djava.net.preferIPv6Addresses: false
17:19:41.162 [main] DEBUG io.netty.util.NetUtil - Loopback interface: lo (Software Loopback Interface 1, 127.0.0.1)
17:19:41.163 [main] DEBUG io.netty.util.NetUtil - \proc\sys\net\core\somaxconn: 200 (non-existent)
17:19:41.321 [main] DEBUG io.netty.channel.DefaultChannelId - -Dio.netty.machineId: e4:b3:18:ff:fe:6c:52:eb (auto-detected)
17:19:41.321 [main] DEBUG i.n.util.internal.ThreadLocalRandom - -Dio.netty.initialSeedUniquifier: 0xb620b93d4006e503
17:19:41.333 [main] DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.level: simple
17:19:41.333 [main] DEBUG io.netty.util.ResourceLeakDetector - -Dio.netty.leakDetection.maxRecords: 4
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numHeapArenas: 2
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.numDirectArenas: 2
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.pageSize: 8192
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxOrder: 11
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.chunkSize: 16777216
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.tinyCacheSize: 512
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.smallCacheSize: 256
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.normalCacheSize: 64
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.maxCachedBufferCapacity: 32768
17:19:41.355 [main] DEBUG i.n.buffer.PooledByteBufAllocator - -Dio.netty.allocator.cacheTrimInterval: 8192
17:19:41.364 [main] DEBUG io.netty.buffer.ByteBufUtil - -Dio.netty.allocator.type: pooled
17:19:41.365 [main] DEBUG io.netty.buffer.ByteBufUtil - -Dio.netty.threadLocalDirectBufferSize: 65536
17:19:41.365 [main] DEBUG io.netty.buffer.ByteBufUtil - -Dio.netty.maxThreadLocalCharBufferSize: 16384
17:19:41.406 [main] INFO com.basho.riak.client.core.RiakNode - RiakNode started; 192.168.0.102:8098
17:19:41.407 [main] INFO c.basho.riak.client.core.RiakCluster - RiakCluster is starting.
17:19:41.408 [main] INFO c.b.r.c.core.util.DefaultCharset - No desired charset found in system properties, the default charset 'windows-1252' will be used
17:19:41.408 [main] INFO c.b.r.c.core.util.DefaultCharset - Initializing client charset to: windows-1252
17:19:41.443 [main] DEBUG com.basho.riak.client.core.RiakNode - Attempting to acquire channel permit
17:19:41.445 [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.maxCapacityPerThread: 32768
17:19:41.445 [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.maxSharedCapacityFactor: 2
17:19:41.445 [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.linkCapacity: 16
17:19:41.445 [main] DEBUG io.netty.util.Recycler - -Dio.netty.recycler.ratio: 8
17:19:41.447 [main] DEBUG com.basho.riak.client.core.RiakNode - Operation 28144878 being executed on RiakNode 192.168.0.102:8098
17:19:41.461 [nioEventLoopGroup-2-10] DEBUG io.netty.buffer.AbstractByteBuf - -Dio.netty.buffer.bytebuf.checkAccessible: true
17:19:41.463 [nioEventLoopGroup-2-10] DEBUG i.n.util.ResourceLeakDetectorFactory - Loaded default ResourceLeakDetector: io.netty.util.ResourceLeakDetector#1536e36

Call stack of the suspended thread:

    Thread [main] (Suspended)
        Unsafe.park(boolean, long) line: not available [native method]
        LockSupport.park(Object) line: not available
        CountDownLatch$Sync(AbstractQueuedSynchronizer).parkAndCheckInterrupt() line: not available
        CountDownLatch$Sync(AbstractQueuedSynchronizer).doAcquireSharedInterruptibly(int) line: not available
        CountDownLatch$Sync(AbstractQueuedSynchronizer).acquireSharedInterruptibly(int) line: not available
        CountDownLatch.await() line: not available
        StoreOperation(FutureOperation<T,U,S>).await() line: 387
        GenericRiakCommand$1(CoreFutureAdapter<T2,S2,T,S>).await() line: 90
        StoreValue(RiakCommand<T,S>).execute(RiakCluster) line: 92
        RiakClient.execute(RiakCommand<T,S>) line: 355
        Testing.main(String[]) line: 29
A simple code addition after the following line of your code should fix things for you:

    Response rv = client.execute(store);

add:

    client.shutdown();

to release that connection and continue execution. Note that you will need to create a new connection for your next request against the database, since you closed the client, or use .executeAsync() in place of .execute().
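Put together, a minimal sketch of that suggestion applied to the code from the question (same imports and values as above; the try/finally placement is one way to guarantee the shutdown runs) would be:

    RiakClient client = RiakClient.newClient(8098, "192.168.0.102");
    try {
        StoreValue store = new StoreValue.Builder("TestValue")
                .withLocation(new Location(new Namespace("TestBucket"), "TestKey"))
                .build();
        StoreValue.Response rv = client.execute(store);
        System.out.println("write done");
    } finally {
        // Release the client's resources; a new client is needed for further requests.
        client.shutdown();
    }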
It appears you are expecting the Riak Java client to connect using the HTTP API. The Riak Java client only connects using Protocol Buffers; using the HTTP address and port will freeze.
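In concrete terms, point the client at the Protocol Buffers listener instead. A one-line sketch, assuming the default Protocol Buffers port of 8087 (the actual port is whatever your riak.conf sets for listener.protobuf.internal):

    // Protocol Buffers port (8087 by default), not the HTTP port 8098
    RiakClient client = RiakClient.newClient(8087, "192.168.0.102");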
You have to use this; it works fine:

    public static void main(String[] args) throws ExecutionException, InterruptedException, UnknownHostException {
        RiakClient client = RiakClient.newClient(8087, "192.168.0.65");

        // put some stuff
        Namespace ns = new Namespace("TestBucket");
        Location location = new Location(ns, "TestKey");
        String myData = "TestValue";
        StoreValue store = new StoreValue.Builder(myData)
                .withLocation(location).build();
        client.execute(store); // now returns, since 8087 is the Protocol Buffers port
        System.out.println("write done");

        // get some stuff
        FetchValue fv = new FetchValue.Builder(location).build();
        FetchValue.Response response = client.execute(fv);
        String obj = response.getValue(String.class);
        System.out.println(obj);
        System.out.println("fetch done");
    }

Hope this helps!
Cassandra Java Driver Cold to Hot in 500ms?
Creating a Cluster and Session against a local data source (Cassandra) cold, on first use, takes 640 ms. Any additional connect takes 80 to 100 ms, so the overhead of the first connect is about 500+ ms. Is that normal, and is there anything I can do to get this figure down? I use a T410 (i5, 2.5 GHz).

[Update]

23:27:11.453 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NEW_NODE_DELAY_SECONDS is undefined, using default value 1
23:27:11.460 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NON_BLOCKING_EXECUTOR_SIZE is undefined, using default value 4
23:27:11.463 [main] DEBUG c.d.driver.core.SystemProperties - com.datastax.driver.NOTIF_LOCK_TIMEOUT_SECONDS is undefined, using default value 60
23:27:11.607 [main] DEBUG com.datastax.driver.core.Cluster - Starting new cluster with contact points [localhost/127.0.0.1:9042]
23:27:11.905 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-1, inFlight=0, closed=false] Transport initialized and ready
23:27:11.906 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:11.969 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing schema
23:27:12.016 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.051 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Successfully connected to localhost/127.0.0.1:9042
23:27:12.052 [main] INFO c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
23:27:12.053 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host localhost/127.0.0.1:9042 added
23:27:12.076 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-2, inFlight=0, closed=false] Transport initialized and ready
23:27:12.077 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Session - Added connection pool for localhost/127.0.0.1:9042
23:27:12.097 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-2, inFlight=0, closed=true] closing connection
23:27:12.103 [main] DEBUG com.datastax.driver.core.Cluster - Shutting down
23:27:12.105 [main] DEBUG com.datastax.driver.core.Connection - Connection[localhost/127.0.0.1:9042-1, inFlight=0, closed=true] closing connection
23:27:12.123 [main] DEBUG com.datastax.driver.core.Cluster - Starting new cluster with contact points [/127.0.0.1:9042]
23:27:12.132 [main] DEBUG com.datastax.driver.core.Connection - Connection[/127.0.0.1:9042-1, inFlight=0, closed=false] Transport initialized and ready
23:27:12.132 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.138 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing schema
23:27:12.168 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Refreshing node list and token map
23:27:12.192 [main] DEBUG c.d.driver.core.ControlConnection - [Control connection] Successfully connected to /127.0.0.1:9042
23:27:12.192 [main] INFO c.d.d.c.p.DCAwareRoundRobinPolicy - Using data-center name 'datacenter1' for DCAwareRoundRobinPolicy (if this is incorrect, please provide the correct datacenter name with DCAwareRoundRobinPolicy constructor)
23:27:12.192 [main] INFO com.datastax.driver.core.Cluster - New Cassandra host /127.0.0.1:9042 added
23:27:12.201 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Connection - Connection[/127.0.0.1:9042-2, inFlight=0, closed=false] Transport initialized and ready
23:27:12.202 [Cassandra Java Driver worker-0] DEBUG com.datastax.driver.core.Session - Added connection pool for /127.0.0.1:9042

As one can see, the first connection attempt uses up to 600 ms or more, depending on how one reads the figures.
My guess is this has to do with connection initialization. In all currently released versions of the Java driver, connections are initialized one after another, synchronously. Fortunately, individual host pools are initialized in parallel, but the connections within a pool are not. If you are using 2.0.9, which has a default of 8 core connections per host, that could explain why you are seeing slow initialization times. If you are using password authentication, that will slow things down quite a bit as well (from ~0-10 ms per connection to ~60-120 ms). In Java driver 2.0.10, which will be released soon, all connections are initialized in parallel, which greatly improves Session initialization. For more information, see JAVA-701.
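Until 2.0.10 lands, one workaround consistent with that explanation is to open fewer connections up front, so fewer synchronous initializations happen on first use. A minimal sketch against the 2.0.x PoolingOptions API (the contact point, class name, and pool sizes are illustrative; the right sizes depend on your load):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.HostDistance;
    import com.datastax.driver.core.PoolingOptions;
    import com.datastax.driver.core.Session;

    public class ColdStartTest {
        public static void main(String[] args) {
            // Fewer core connections per host means fewer synchronous
            // connection initializations during the first connect.
            PoolingOptions pooling = new PoolingOptions()
                    .setCoreConnectionsPerHost(HostDistance.LOCAL, 1)
                    .setMaxConnectionsPerHost(HostDistance.LOCAL, 2);

            Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1")
                    .withPoolingOptions(pooling)
                    .build();
            try {
                long start = System.nanoTime();
                Session session = cluster.connect();
                System.out.printf("first connect took %d ms%n",
                        (System.nanoTime() - start) / 1_000_000);
            } finally {
                cluster.close();
            }
        }
    }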