I'm currently working on a project that will poll an email inbox for a daily report.
I have a working implementation using spring-integration-mail with an imapInboundAdapter against a mailbox hosted on Amazon WorkMail.
When there is a single unread, unflagged email in the inbox, it receives that email and properly emits only a single message.
However, when I changed the email server to one hosted by Outlook, I instead receive the email twice in the same poll.
As far as I can tell, the seen/flagged status of the email isn't updated until the second polling attempt on Outlook, but is updated on the first attempt on WorkMail.
The second attempt on Outlook retrieves an email that should already have been processed.
IntegrationFlows
    .from(
        Mail.imapInboundAdapter(format("imaps://%s:%s/INBOX", source.getHost(), source.getPort()))
            .javaMailAuthenticator(authenticator)
            .maxFetchSize(10),
        // poll every 15 seconds, draining all available messages in each poll
        e -> e.poller(p -> p.cron("*/15 * * ? * *").maxMessagesPerPoll(-1)))
    .log(INFO, m -> "Received email: " + m)
This is the logging that occurs when hitting WorkMail:
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap : IMAPProtocol noop
com.sun.mail.imap.connectionpool : releaseStoreProtocol()
com.sun.mail.imap.connectionpool : getStoreProtocol() borrowing a connection
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap.connectionpool : releaseFolderStoreProtocol()
o.s.integration.mail.ImapMailReceiver : opening folder [imaps://[SNIPPED]@imap.mail.us-west-2.awsapps.com:993/INBOX]
com.sun.mail.imap : connection available -- size: 1
com.sun.mail.imap.messagecache : create cache of size 8
o.s.integration.mail.ImapMailReceiver : attempting to receive mail from folder [INBOX]
o.s.integration.mail.ImapMailReceiver : This email server does not support RECENT or USER flags. System flag 'Flag.FLAGGED' will be used to prevent duplicates during email fetch.
com.sun.mail.imap.messagecache : create message number 8
o.s.integration.mail.ImapMailReceiver : found 1 new messages
o.s.integration.mail.ImapMailReceiver : Received 1 messages
o.s.integration.mail.ImapMailReceiver : USER flags are not supported by this mail server. Flagging message with system flag
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap : added an Authenticated connection -- size: 1
o.s.i.mail.MailReceivingMessageSource : received mail message [org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@a101a27]
o.s.integration.handler.LoggingHandler : Received email: GenericMessage [payload=org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@a101a27, headers={id=20fb1886-eeda-c1e2-7ce0-8a9aae4f7ebc, timestamp=1624590243455}]
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap : IMAPProtocol noop
com.sun.mail.imap.connectionpool : releaseStoreProtocol()
com.sun.mail.imap.connectionpool : getStoreProtocol() borrowing a connection
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap.connectionpool : releaseFolderStoreProtocol()
o.s.integration.mail.ImapMailReceiver : opening folder [imaps://[SNIPPED]@imap.mail.us-west-2.awsapps.com:993/INBOX]
com.sun.mail.imap : connection available -- size: 1
com.sun.mail.imap.messagecache : create cache of size 8
o.s.integration.mail.ImapMailReceiver : attempting to receive mail from folder [INBOX]
o.s.integration.mail.ImapMailReceiver : This email server does not support RECENT or USER flags. System flag 'Flag.FLAGGED' will be used to prevent duplicates during email fetch.
o.s.integration.mail.ImapMailReceiver : found 0 new messages
o.s.integration.mail.ImapMailReceiver : Received 0 messages
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap : added an Authenticated connection -- size: 1
This is the logging that occurs when hitting Outlook:
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap : IMAPProtocol noop
com.sun.mail.imap.connectionpool : releaseStoreProtocol()
com.sun.mail.imap.connectionpool : getStoreProtocol() borrowing a connection
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap.connectionpool : releaseFolderStoreProtocol()
o.s.integration.mail.ImapMailReceiver : opening folder [imaps://[SNIPPED]@outlook.office365.com:993/INBOX]
com.sun.mail.imap : connection available -- size: 1
com.sun.mail.imap.messagecache : create cache of size 2
o.s.integration.mail.ImapMailReceiver : attempting to receive mail from folder [INBOX]
o.s.integration.mail.ImapMailReceiver : This email server does not support RECENT or USER flags. System flag 'Flag.FLAGGED' will be used to prevent duplicates during email fetch.
com.sun.mail.imap.messagecache : create message number 2
o.s.integration.mail.ImapMailReceiver : found 1 new messages
o.s.integration.mail.ImapMailReceiver : Received 1 messages
o.s.integration.mail.ImapMailReceiver : USER flags are not supported by this mail server. Flagging message with system flag
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap : added an Authenticated connection -- size: 1
o.s.i.mail.MailReceivingMessageSource : received mail message [org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@5dffc30b]
o.s.integration.handler.LoggingHandler : Received email: GenericMessage [payload=org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@5dffc30b, headers={id=9d158e52-d46e-fd82-38b7-b438a4899a1e, timestamp=1624589763164}]
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap : IMAPProtocol noop
com.sun.mail.imap.connectionpool : releaseStoreProtocol()
com.sun.mail.imap.connectionpool : getStoreProtocol() borrowing a connection
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap.connectionpool : releaseFolderStoreProtocol()
o.s.integration.mail.ImapMailReceiver : opening folder [imaps://[SNIPPED]@outlook.office365.com:993/INBOX]
com.sun.mail.imap : connection available -- size: 1
com.sun.mail.imap.messagecache : create cache of size 2
o.s.integration.mail.ImapMailReceiver : attempting to receive mail from folder [INBOX]
o.s.integration.mail.ImapMailReceiver : This email server does not support RECENT or USER flags. System flag 'Flag.FLAGGED' will be used to prevent duplicates during email fetch.
com.sun.mail.imap.messagecache : create message number 2
o.s.integration.mail.ImapMailReceiver : found 1 new messages
o.s.integration.mail.ImapMailReceiver : Received 1 messages
o.s.integration.mail.ImapMailReceiver : USER flags are not supported by this mail server. Flagging message with system flag
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap : added an Authenticated connection -- size: 1
o.s.i.mail.MailReceivingMessageSource : received mail message [org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@59c04c6c]
o.s.integration.handler.LoggingHandler : Received email: GenericMessage [payload=org.springframework.integration.mail.AbstractMailReceiver$IntegrationMimeMessage@59c04c6c, headers={id=002e6b83-caf5-86cd-b090-44aa346df119, timestamp=1624589766210}]
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap : IMAPProtocol noop
com.sun.mail.imap.connectionpool : releaseStoreProtocol()
com.sun.mail.imap.connectionpool : getStoreProtocol() borrowing a connection
com.sun.mail.imap.connectionpool : getStoreProtocol() - connection available -- size: 1
com.sun.mail.imap.connectionpool : getStoreProtocol() -- storeConnectionInUse
com.sun.mail.imap.connectionpool : releaseFolderStoreProtocol()
o.s.integration.mail.ImapMailReceiver : opening folder [imaps://[SNIPPED]@outlook.office365.com:993/INBOX]
com.sun.mail.imap : connection available -- size: 1
com.sun.mail.imap.messagecache : create cache of size 2
o.s.integration.mail.ImapMailReceiver : attempting to receive mail from folder [INBOX]
o.s.integration.mail.ImapMailReceiver : This email server does not support RECENT or USER flags. System flag 'Flag.FLAGGED' will be used to prevent duplicates during email fetch.
o.s.integration.mail.ImapMailReceiver : found 0 new messages
o.s.integration.mail.ImapMailReceiver : Received 0 messages
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap.connectionpool : connection pool current size: 0 pool size: 1
com.sun.mail.imap : added an Authenticated connection -- size: 1
The issue turned out to be directly related to the Windows Outlook mail client.
If you add an account via IMAP in the Outlook client, marking an email as unread through the client causes the duplication (possibly related to its folder handling).
I'm not sure of the exact cause, but adding the account as a full Outlook account (instead of via IMAP) resolved the issue.
Otherwise, you can mark emails as unread in the web browser instead of in the Outlook client.
It sounds like setFlag() is asynchronous on Outlook. You may consider not fetching all the messages at once: .maxFetchSize(1). Another way is to use a custom SearchTermStrategy or a selector to skip messages you have already read.
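For reference, a minimal sketch of the SearchTermStrategy idea, assuming the goal is to skip anything already SEEN or already FLAGGED (FLAGGED being the system flag the adapter uses for de-duplication, per the logs above); the variable name is illustrative:
import javax.mail.Flags;
import javax.mail.search.AndTerm;
import javax.mail.search.FlagTerm;

import org.springframework.integration.mail.SearchTermStrategy;

// Only fetch messages that are neither SEEN nor FLAGGED, so a message the
// previous poll already flagged is skipped even if the server applies the
// flag late.
SearchTermStrategy unseenAndUnflagged = (supportedFlags, folder) ->
        new AndTerm(
                new FlagTerm(new Flags(Flags.Flag.SEEN), false),
                new FlagTerm(new Flags(Flags.Flag.FLAGGED), false));
It should then be possible to plug this into the adapter spec alongside .maxFetchSize(...) via .searchTermStrategy(unseenAndUnflagged).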
When I start HBase on my cluster, the HMaster and HQuorumPeer processes start on the master node, while only the HQuorumPeer process starts on the slaves.
In the GUI console, in the task section, I can see the master (node0) in the state RUNNING and the status "Waiting for region servers count to settle; currently checked in 0, slept for 250920 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms".
In the software attributes section I can find all my nodes in the zookeeper quorum with the description "Addresses of all registered ZK servers".
So it seems that ZooKeeper is working, but the log files suggest there is a problem.
Log hbase-clusterhadoop-master:
2016-09-08 12:26:14,875 INFO [main-SendThread(node0:2181)] zookeeper.ClientCnxn: Opening socket connection to server node0/192.168.1.113:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
2016-09-08 12:26:14,882 WARN [main-SendThread(node0:2181)] zookeeper.ClientCnxn: Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
2016-09-08 12:26:14,994 WARN [main] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=node3:2181,node2:2181,node1:2181,node0:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase
........
2016-09-08 12:32:53,063 INFO [master:node0:60000] zookeeper.ZooKeeper: Initiating client connection, connectString=node3:2181,node2:2181,node1:2181,node0:2181 sessionTimeout=90000 watcher=replicationLogCleaner0x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase
2016-09-08 12:32:53,064 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Opening socket connection to server node3/192.168.1.112:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
2016-09-08 12:32:53,065 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Socket connection established to node3/192.168.1.112:2181, initiating session
2016-09-08 12:32:53,069 INFO [master:node0:60000-SendThread(node3:2181)] zookeeper.ClientCnxn: Session establishment complete on server node3/192.168.1.112:2181, sessionid = 0x357095a4b940001, negotiated timeout = 90000
2016-09-08 12:32:53,072 INFO [master:node0:60000] zookeeper.RecoverableZooKeeper: Node /hbase/replication/rs already exists and this is not a retry
2016-09-08 12:32:53,072 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner
2016-09-08 12:32:53,075 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotLogCleaner
2016-09-08 12:32:53,076 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.HFileLinkCleaner
2016-09-08 12:32:53,077 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.snapshot.SnapshotHFileCleaner
2016-09-08 12:32:53,078 DEBUG [master:node0:60000] cleaner.CleanerChore: initialize cleaner=org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner
2016-09-08 12:32:53,078 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-09-08 12:32:54,607 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1529 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2016-09-08 12:32:56,137 INFO [master:node0:60000] master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 3059 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
Log hbase-clusterhadoop-zookeeper-node0 (master):
2016-09-08 12:26:18,315 WARN [WorkerSender[myid=0]] quorum.QuorumCnxManager: Cannot open channel to 1 at election address node1/192.168.1.156:3888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
at java.net.Socket.connect(Socket.java:527)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
at java.lang.Thread.run(Thread.java:695)
Log hbase-clusterhadoop-regionserver-node1 (one of the slaves):
2016-09-08 12:33:32,690 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Opening socket connection to server node3/192.168.1.112:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
2016-09-08 12:33:32,691 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Socket connection established to node3/192.168.1.112:2181, initiating session
2016-09-08 12:33:32,692 INFO [regionserver60020-SendThread(node3:2181)] zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect
2016-09-08 12:33:32,793 WARN [regionserver60020] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=node3:2181,node2:2181,node1:2181,node0:2181, exception=org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
2016-09-08 12:33:32,794 ERROR [regionserver60020] zookeeper.RecoverableZooKeeper: ZooKeeper exists failed after 4 attempts
2016-09-08 12:33:32,794 WARN [regionserver60020] zookeeper.ZKUtil: regionserver:600200x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase Unable to set watcher on znode /hbase/master
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,794 ERROR [regionserver60020] zookeeper.ZooKeeperWatcher: regionserver:600200x0, quorum=node3:2181,node2:2181,node1:2181,node0:2181, baseZNode=/hbase Received unexpected KeeperException, re-throwing exception
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,795 FATAL [regionserver60020] regionserver.HRegionServer: ABORTING region server node1,60020,1473330794709: Unexpected exception during initialization, aborting
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:1045)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:222)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.watchAndCheckExists(ZKUtil.java:427)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperNodeTracker.start(ZooKeeperNodeTracker.java:77)
at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:778)
at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:751)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:884)
at java.lang.Thread.run(Thread.java:695)
2016-09-08 12:33:32,798 FATAL [regionserver60020] regionserver.HRegionServer: RegionServer abort: loaded coprocessors are: []
2016-09-08 12:33:32,798 INFO [regionserver60020] regionserver.HRegionServer: STOPPED: Unexpected exception during initialization, aborting
2016-09-08 12:33:32,867 INFO [regionserver60020-SendThread(node0:2181)] zookeeper.ClientCnxn: Opening socket connection to server node0/192.168.1.113:2181. Will not attempt to authenticate using SASL (java.lang.SecurityException: Unable to locate a login configuration)
Log hbase-clusterhadoop-zookeeper-node1:
2016-09-08 12:33:32,075 WARN [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0%0:2181] quorum.Learner: Unexpected exception, tries=0, connecting to node3/192.168.1.112:2888
java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:382)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:241)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:228)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:431)
at java.net.Socket.connect(Socket.java:527)
at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2016-09-08 12:33:32,227 INFO [node1/192.168.1.156:3888] quorum.QuorumCnxManager: Received connection request /192.168.1.113:49844
2016-09-08 12:33:32,233 INFO [WorkerReceiver[myid=1]] quorum.FastLeaderElection: Notification: 1 (message format version), 0 (n.leader), 0x10000002d (n.zxid), 0x1 (n.round), LOOKING (n.state), 0 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-09-08 12:33:32,239 INFO [WorkerReceiver[myid=1]] quorum.FastLeaderElection: Notification: 1 (message format version), 3 (n.leader), 0x10000002d (n.zxid), 0x1 (n.round), LOOKING (n.state), 0 (n.sid), 0x1 (n.peerEpoch) FOLLOWING (my state)
2016-09-08 12:33:32,725 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxnFactory: Accepted socket connection from /192.168.1.111:49534
2016-09-08 12:33:32,725 WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: Exception causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not running
2016-09-08 12:33:32,725 INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: Closed socket connection for client /192.168.1.111:49534 (no session established for client)
The conf file hbase-site.xml:
<configuration>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.rootdir</name>
<value>hdfs://node0:9000/hbase</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>node0,node1,node2,node3</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/Users/clusterhadoop/usr/local/zookeeper</value>
</property>
<property>
<name>hbase.tmp.dir</name>
<value>/Users/clusterhadoop/usr/local/hbtmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>hbase.master</name>
<value>node0:60000</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.maxClientCnxns</name>
<value>1000</value>
</property>
</configuration>
Hosts file:
127.0.0.1 localhost
127.0.0.1 node3
192.168.1.112 node3
192.168.1.156 node1
192.168.1.111 node2
192.168.1.113 node0
Any idea what the problem is and how to solve it?
I have problems connecting Cassandra to Spark. I can connect to Cassandra via cqlsh, but when I launch my program:
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapColumnTo;

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.spark.connector.japi.rdd.CassandraJavaPairRDD;
import org.apache.commons.lang3.StringUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public static void main(String[] args) {
    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
    Session session = cluster.connect();
    SparkConf conf = new SparkConf().setAppName("CassandraExamples").setMaster("local[1]")
            .set("spark.cassandra.connection.host", "9.168.86.84");
    JavaSparkContext sc = new JavaSparkContext("spark://9.168.86.84:9042", "CassandraExample", conf);
    CassandraJavaPairRDD<String, String> rdd1 = javaFunctions(sc).cassandraTable("keyspace", "table",
            mapColumnTo(String.class), mapColumnTo(String.class)).select("row1", "row2");
    System.out.println("Data fetched: \n" + StringUtils.join(rdd1.toArray(), "\n"));
}
I'm getting this error:
15/06/11 11:41:15 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@9.168.86.84:9042]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /9.168.86.84:9042
15/06/11 11:41:34 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@9.168.86.84:9042/user/Master...
15/06/11 11:41:35 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@9.168.86.84:9042: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@9.168.86.84:9042
15/06/11 11:41:35 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@9.168.86.84:9042]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /9.168.86.84:9042
15/06/11 11:41:54 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster@9.168.86.84:9042/user/Master...
15/06/11 11:41:55 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster@9.168.86.84:9042: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster@9.168.86.84:9042
15/06/11 11:41:55 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@9.168.86.84:9042]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: Connection refused: no further information: /9.168.86.84:9042
15/06/11 11:42:14 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/06/11 11:42:14 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
15/06/11 11:42:14 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
cassandra.yaml has these properties:
listen_address: 9.168.86.84
start_native_transport: true
rpc_address: 0.0.0.0
native_transport_port: 9042
rpc_port: 9160
Can someone tell me what's wrong?
How can I solve this problem? When I configured the Flume server, it produced the messages below.
2014-10-20 22:24:01,480 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /ip:57063 => /ip:34001] OPEN
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /ip:57063 => /ip:34001] BOUND: /ip:34001
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /10.182.4.70:57063 => /ip:34001] CONNECTED: /ip:57063
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /ip:57063 :> /ip:34001] DISCONNECTED
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /ip:57063 :> /10.182.4.79:34001] UNBOUND
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: [id: 0x2fe09f1a, /10.182.4.70:57063 :> /10.182.4.79:34001] CLOSED
2014-10-20 22:24:01,481 INFO org.apache.avro.ipc.NettyServer: Connection to /10.182.4.70:57063 disconnected.
2014-10-20 22:24:01,481 WARN org.apache.avro.ipc.NettyServer: Unexpected exception from downstream.
java.nio.channels.ClosedChannelException
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.cleanUpWriteBuffer(AbstractNioWorker.java:673)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.writeFromUserCode(AbstractNioWorker.java:400)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.handleAcceptedSocket(NioServerSocketPipelineSink.java:120)
at org.jboss.netty.channel.socket.nio.NioServerSocketPipelineSink.eventSunk(NioServerSocketPipelineSink.java:59)
at org.jboss.netty.channel.Channels.write(Channels.java:733)
at org.jboss.netty.channel.Channels.write(Channels.java:694)
at org.jboss.netty.handler.codec.compression.ZlibEncoder.finishEncode(ZlibEncoder.java:380)
at org.jboss.netty.handler.codec.compression.ZlibEncoder.handleDownstream(ZlibEncoder.java:316)
at org.jboss.netty.handler.codec.oneone.OneToOneEncoder.handleDownstream(OneToOneEncoder.java:55)
at org.jboss.netty.channel.Channels.close(Channels.java:821)
And the flume.conf is as below:
instance_35001.channels.channel1.checkpointDir=editlog/checkpoint
instance_35001.channels.channel1.dataDirs=editlog/data
instance_35001.channels.channel1.capacity=200000000
instance_35001.channels.channel1.transactionCapacity=1000000
instance_35001.channels.channel1.checkpointInterval=10000
instance_35001.sources=source1
instance_35001.sources.source1.type=avro
instance_35001.sources.source1.bind=0.0.0.0
instance_35001.sources.source1.port=34001
instance_35001.sources.source1.compression-type=deflate
instance_35001.sources.source1.channels=channel1
instance_35001.sources.source1.interceptors = inter1
instance_35001.sources.source1.interceptors.inter1.type = host
instance_35001.sources.source1.interceptors.inter1.hostHeader = servername
instance_35001.sinks=sink1
instance_35001.sinks.sink1.type=hdfs
instance_35001.sinks.sink1.hdfs.path=hdfs://address:5000/user/admin/%{appname}/%Y/%m/%d/
instance_35001.sinks.sink1.hdfs.filePrefix=%{appname}-%{hostname}-%{servername}.34001
instance_35001.sinks.sink1.hdfs.rollInterval=0
instance_35001.sinks.sink1.hdfs.rollCount=0
instance_35001.sinks.sink1.hdfs.rollSize=21521880492
The environment is CDH5, and the sink writes to HDFS. The log usually looks normal, but the sink is very slow. So please help me. Thanks.
One thing I can see here is that your roll size is significantly larger than the channel capacity. So before the file is rolled, everything is stored in the channel, which fills up after a point and starts throwing errors.
instance_35001.channels.channel1.capacity=200000000
instance_35001.sinks.sink1.hdfs.rollSize=21521880492
Keep your roll size around the block size you have set for HDFS. Also, the HDFS sink has a default batch size of 100; change it to some larger value and see how it behaves.
Capacity is measured in number of events while rollSize is in actual bytes, so it is difficult to correlate the two directly.
However, you want your roll size to be close to your HDFS block size (default 128 MB).
rollSize = 21521880492 -> ~21 GB
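Putting both suggestions together, a minimal sketch of the adjusted sink settings, assuming the default 128 MB HDFS block size (the batch size here is only an illustrative value to tune from):
# roll at 128 MB, matching the assumed HDFS block size
instance_35001.sinks.sink1.hdfs.rollSize=134217728
# HDFS sink batch size; the default is 100, a larger value reduces flush overhead
instance_35001.sinks.sink1.hdfs.batchSize=1000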
My Hadoop version is 0.20.203.0. The namenode running on my Hadoop cluster was shut down. I checked the logs, and found this error message only in the secondary namenode logs:
2014-09-27 22:18:54,930 WARN org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Checkpoint done. New Image Size: 29552383
2014-09-27 22:19:42,792 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@8135daf JVM BUG(s) - injecting delay2 times
2014-09-27 22:19:42,792 INFO org.mortbay.log: org.mortbay.io.nio.SelectorManager$SelectSet@8135daf JVM BUG(s) - recreating selector 2 times, canceled keys 38 times
2014-09-27 23:18:55,508 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 0 Total time for transactions(ms): 0Number of transactions batched in Syncs: 0 Number of syncs: 0 SyncTimes(ms): 0
2014-09-27 23:18:55,508 FATAL org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Fatal Error : All storage directories are inaccessible.
2014-09-27 23:18:55,509 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: SHUTDOWN_MSG:
Another error message appeared in one of my datanodes:
2014-09-27 01:03:58,535 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(10.75.6.51:50010, storageID=DS-532990984-10.75.6.51-50010-1343295370699, infoPort=50075, ipcPort=50020):DataXceiver
org.apache.hadoop.util.DiskChecker$DiskOutOfSpaceException: No space left on device
at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:770)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:475)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:528)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:397)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:107)
at java.lang.Thread.run(Thread.java:662)
I'm not sure whether this is the root cause of the namenode shutting down.
A new error was raised when I tried to restart the namenode:
2014-09-28 11:25:06,202 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.io.IOException: Incorrect data format. logVersion is -31 but writables.length is 0.
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:542)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1009)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:827)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:365)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:97)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:379)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:353)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:254)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:434)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1153)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1162)
Does anyone know about this? Is it possible to fix the fsimage and edits files, as I don't want to lose the data?