I am using Cassandra 2.0.7 on a remote server, listening on a non-default port:
<code>
---cassandra.yaml
rpc_address: 0.0.0.0
rpc_port: 6543
</code>
I am trying to connect to the server using Titan 0.4.4 (Java API; I also tried Rexster) with the following config:
<code>
storage.hostname=172.182.183.215
storage.backend=cassandra
storage.port=6543
storage.keyspace=abccorp
</code>
It does not connect, and I see the exceptions below. However, if I use cqlsh on the same host from which I am trying to run my code/Rexster, I can connect without any issues. Has anybody seen this?
<code>
0 [main] INFO com.netflix.astyanax.connectionpool.impl.ConnectionPoolMBeanManager - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=ClusterTitanConnectionPool,ServiceType=connectionpool
49 [main] INFO com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor - AddHost: 172.182.183.215
554 [main] INFO com.netflix.astyanax.connectionpool.impl.ConnectionPoolMBeanManager - Registering mbean: com.netflix.MonitoredResources:type=ASTYANAX,name=KeyspaceTitanConnectionPool,ServiceType=connectionpool
555 [main] INFO com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor - AddHost: 172.182.183.215
999 [main] INFO com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor - AddHost: 127.0.0.1
1000 [main] INFO com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor - RemoveHost: 172.182.183.215
2366 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Initiated backend operations thread pool of size 16
41523 [RingDescribeAutoDiscovery] WARN com.netflix.astyanax.impl.RingDescribeHostSupplier - Failed to get hosts from abccorp via ring describe. Will use previously known ring instead
61522 [RingDescribeAutoDiscovery] WARN com.netflix.astyanax.impl.RingDescribeHostSupplier - Failed to get hosts from abccorp via ring describe. Will use previously known ring instead
63080 [main] INFO com.thinkaurelius.titan.diskstorage.util.BackendOperation - Temporary storage exception during backend operation. Attempting backoff retry
com.thinkaurelius.titan.diskstorage.TemporaryStorageException: Temporary failure in storage backend
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxOrderedKeyColumnValueStore.getNamesSlice(AstyanaxOrderedKeyColumnValueStore.java:138)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxOrderedKeyColumnValueStore.getSlice(AstyanaxOrderedKeyColumnValueStore.java:88)
at com.thinkaurelius.titan.graphdb.configuration.KCVSConfiguration$1.call(KCVSConfiguration.java:70)
at com.thinkaurelius.titan.graphdb.configuration.KCVSConfiguration$1.call(KCVSConfiguration.java:64)
at com.thinkaurelius.titan.diskstorage.util.BackendOperation.execute(BackendOperation.java:30)
at com.thinkaurelius.titan.graphdb.configuration.KCVSConfiguration.getConfigurationProperty(KCVSConfiguration.java:64)
at com.thinkaurelius.titan.diskstorage.Backend.initialize(Backend.java:277)
at com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration.getBackend(GraphDatabaseConfiguration.java:1174)
at com.thinkaurelius.titan.graphdb.database.StandardTitanGraph.<init>(StandardTitanGraph.java:75)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:40)
at com.thinkaurelius.titan.core.TitanFactory.open(TitanFactory.java:29)
at com.abccorp.grp.graphorm.GraphORM.<init>(GraphORM.java:23)
at com.abccorp.grp.graphorm.GraphORM.getInstance(GraphORM.java:47)
at com.abccorp.grp.utils.dataloader.MainLoader.main(MainLoader.java:150)
Caused by: com.netflix.astyanax.connectionpool.exceptions.NoAvailableHostsException: NoAvailableHostsException: [host=None(0.0.0.0):0, latency=0(0), attempts=0]No hosts to borrow from
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.<init>(RoundRobinExecuteWithFailover.java:30)
at com.netflix.astyanax.connectionpool.impl.TokenAwareConnectionPoolImpl.newExecuteWithFailover(TokenAwareConnectionPoolImpl.java:83)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$4.execute(ThriftColumnFamilyQueryImpl.java:519)
at com.thinkaurelius.titan.diskstorage.cassandra.astyanax.AstyanaxOrderedKeyColumnValueStore.getNamesSlice(AstyanaxOrderedKeyColumnValueStore.java:136)
... 13 more
91522 [RingDescribeAutoDiscovery] WARN com.netflix.astyanax.impl.RingDescribeHostSupplier - Failed to get hosts from abccorp via ring describe. Will use previously known ring instead
121522 [RingDescribeAutoDiscovery] WARN com.netflix.astyanax.impl.RingDescribeHostSupplier - Failed to get hosts from abccorp via ring describe. Will use previously known ring instead
</code>
Any help is greatly appreciated. I am evaluating Titan on Cassandra and am a bit stuck on this, since previously I was running Cassandra (same version) on localhost and everything worked fine.
Thanks
Changing listen_address to 172.182.183.215 in cassandra.yaml did the trick. Initially it was not clear that setting rpc_address alone would not be enough: the ring describe reports listen_address, which matches the AddHost: 127.0.0.1 / RemoveHost: 172.182.183.215 lines in the log above.
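For reference, the relevant cassandra.yaml settings would then look roughly like this (a sketch using the addresses and port from the question):
<code>
# cassandra.yaml (excerpt)
listen_address: 172.182.183.215   # address advertised to other nodes and returned by ring describe
rpc_address: 0.0.0.0              # Thrift/RPC bind address
rpc_port: 6543                    # non-default Thrift port used above
</code>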
Thrift and the drivers that support Thrift are deprecated as of C* 1.2. You should switch to the DataStax Java Driver (currently at 2.0.2).
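For code that talks to Cassandra directly (note that Titan 0.4.4 itself still goes through Thrift via Astyanax), a connection with the DataStax Java Driver 2.0 looks roughly like the sketch below; the contact point and keyspace are taken from the question, and 9042 is the driver's default native-protocol port, not the Thrift rpc_port:
<code>
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class CqlConnectSketch {
    public static void main(String[] args) {
        // Connect over the native protocol (default port 9042), not Thrift
        Cluster cluster = Cluster.builder()
                .addContactPoint("172.182.183.215")
                .withPort(9042)
                .build();
        Session session = cluster.connect("abccorp");
        ResultSet rs = session.execute("SELECT release_version FROM system.local");
        System.out.println("Connected to Cassandra " + rs.one().getString("release_version"));
        cluster.close();
    }
}
</code>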
Alternatively, ensure that this is set properly in cassandra.yaml:
start_rpc: true
Related
I'm using the Apache Flink Kubernetes operator to deploy a standalone job on an Application cluster setup.
I have set up the following files using the official Flink documentation - Link
jobmanager-application-non-ha.yaml
taskmanager-job-deployment.yaml
flink-configuration-configmap.yaml
jobmanager-service.yaml
I have not changed any of the configurations in these files and am trying to run a simple WordCount example from the Flink examples using the Apache Flink Operator.
After running the kubectl commands to set up the job manager and the task manager, the job manager goes into a NotReady state while the task manager goes into a CrashLoopBackOff loop.
NAME READY STATUS RESTARTS AGE
flink-jobmanager-28k4b 1/2 NotReady 2 (4m24s ago) 16m
flink-kubernetes-operator-6585dddd97-9hjp4 2/2 Running 0 10d
flink-taskmanager-6bb88468d7-ggx8t 1/2 CrashLoopBackOff 9 (2m21s ago) 15m
The job manager logs look like this
org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout
at org.apache.flink.runtime.jobmaster.slotpool.PhysicalSlotRequestBulkCheckerImpl.lambda$schedulePendingRequestBulkWithTimestampCheck$0(PhysicalSlotRequestBulkCheckerImpl.java:86) ~[flink-dist-1.16.0.jar:1.16.0]
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) ~[?:?]
at java.util.concurrent.FutureTask.run(Unknown Source) ~[?:?]
at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:453) ~[flink-rpc-akka_be40712e-8b2e-47cd-baaf-f0149cf2604d.jar:1.16.0]
at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) ~[flink-rpc-akka_be40712e-8b2e-47cd-baaf-f0149cf2604d.jar:1.16.0]
The task manager, it seems, cannot connect to the job manager:
2023-01-28 19:21:47,647 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Connecting to ResourceManager akka.tcp://flink#flink-jobmanager:6123/user/rpc/resourcemanager_*(00000000000000000000000000000000).
2023-01-28 19:21:57,766 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink#flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink#flink-jobmanager:6123/user/rpc/resourcemanager_*.
2023-01-28 19:22:08,036 INFO akka.remote.transport.ProtocolStateActor [] - No response from remote for outbound association. Associate timed out after [20000 ms].
2023-01-28 19:22:08,057 WARN akka.remote.ReliableDeliverySupervisor [] - Association with remote system [akka.tcp://flink#flink-jobmanager:6123] has failed, address is now gated for [50] ms. Reason: [Association failed with [akka.tcp://flink#flink-jobmanager:6123]] Caused by: [No response from remote for outbound association. Associate timed out after [20000 ms].]
2023-01-28 19:22:08,069 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not resolve ResourceManager address akka.tcp://flink#flink-jobmanager:6123/user/rpc/resourcemanager_*, retrying in 10000 ms: Could not connect to rpc endpoint under address akka.tcp://flink#flink-jobmanager:6123/user/rpc/resourcemanager_*.
2023-01-28 19:22:08,308 WARN akka.remote.transport.netty.NettyTransport [] - Remote connection to [null] failed with org.jboss.netty.channel.ConnectTimeoutException: connection timed out: flink-jobmanager/100.127.18.9:6123
The flink-configuration-configmap.yaml looks like this
flink-conf.yaml: |+
jobmanager.rpc.address: flink-jobmanager
taskmanager.numberOfTaskSlots: 2
blob.server.port: 6124
jobmanager.rpc.port: 6123
taskmanager.rpc.port: 6122
queryable-state.proxy.ports: 6125
jobmanager.memory.process.size: 1600m
taskmanager.memory.process.size: 1728m
parallelism.default: 2
This is what the pom.xml looks like - Link
You deployed the Kubernetes Operator in the namespace, but you did not create the custom resources (FlinkDeployment/FlinkSessionJob) that the Operator acts on. Instead you tried to create a standalone Flink Kubernetes cluster.
The Flink Operator makes it a lot easier to deploy your Flink jobs: you only need to deploy the operator itself and then a FlinkDeployment or FlinkSessionJob resource, and the operator will manage your deployment after that.
Please use this documentation for the Kubernetes Operator: Link
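For instance, a minimal FlinkDeployment for the WordCount example would look roughly like the sketch below, based on the operator's basic example; the name, image tag, resource sizes, and jar path are assumptions you would adapt:
<code>
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: wordcount-example
spec:
  image: flink:1.16
  flinkVersion: v1_16
  flinkConfiguration:
    taskmanager.numberOfTaskSlots: "2"
  serviceAccount: flink
  jobManager:
    resource:
      memory: "1600m"
      cpu: 1
  taskManager:
    resource:
      memory: "1728m"
      cpu: 1
  job:
    # WordCount jar shipped inside the official Flink image (path assumed)
    jarURI: local:///opt/flink/examples/streaming/WordCount.jar
    parallelism: 2
    upgradeMode: stateless
</code>
Applying it with kubectl apply -f lets the operator create and manage the JobManager and TaskManager pods for you.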
I have configured Log4j2 to write the logs to my Mongo Atlas cluster (4.4.8).
The configuration seems OK (I use the connection string given by Atlas), and the console logs say that the connection to MongoDB is fine: the database and the collection are retrieved correctly.
But then, when it tries to write a log to the DB, it times out after 30000ms saying:
Timed out after 30000 ms while waiting to connect. Client view of cluster state is {type=UNKNOWN, servers=[]
I also can see several messages saying:
INFO org.mongodb.driver.cluster - Cluster description not yet available. Waiting for 30000 ms before timing out
What I don't understand is that, using the very same driver and the same connection string, all the operations I perform on this same MongoDB while managing the connection myself (I have a MongoDBService class where I build the Mongo connection etc., normal stuff) work with no problem. So it leads me to think that it is Log4j that handles the connection to MongoDB in a bad way...
Any help is appreciated!
I finally found the problem in my configuration. Maybe it works for you too.
I had multiple appenders on the root logger, so the MongoDB driver itself was trying to log something like "hey, I'm going to log" after the RollingFileAppender had been initialized but before the MongoAppender was. You can see it below:
Root:
level: info
AppenderRef:
- ref: ConsoleAppender
- ref: RollingFileAppender
- ref: MongoAppender
Just by moving the Mongo appender to its own logger, everything worked for me:
logger:
- name: com.sinansoft
level: info
additivity: false
AppenderRef:
- ref: MongoAppender
Root:
level: info
AppenderRef:
- ref: ConsoleAppender
- ref: RollingFileAppender
Let me know if you want more configuration details in this case.
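In a complete log4j2.yaml, those fragments sit under Configuration → Loggers; a rough sketch of the overall layout is below. The appender definitions, connection string, and patterns are placeholders (I am assuming the log4j-mongodb4 NoSql appender since Atlas 4.4 is involved), so only the Loggers section mirrors the fix above:
<code>
Configuration:
  status: warn
  Appenders:
    Console:
      name: ConsoleAppender
      target: SYSTEM_OUT
      PatternLayout:
        pattern: "%d %p %c - %m%n"
    RollingFile:
      name: RollingFileAppender
      fileName: logs/app.log
      filePattern: "logs/app-%d{yyyy-MM-dd}.log.gz"
      PatternLayout:
        pattern: "%d %p %c - %m%n"
      Policies:
        TimeBasedTriggeringPolicy: {}
    NoSql:
      name: MongoAppender
      MongoDb4:
        connection: "mongodb+srv://user:password@cluster0.example.mongodb.net/logs.entries"
  Loggers:
    logger:
      - name: com.sinansoft
        level: info
        additivity: false
        AppenderRef:
          - ref: MongoAppender
    Root:
      level: info
      AppenderRef:
        - ref: ConsoleAppender
        - ref: RollingFileAppender
</code>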
I have a 4 GB RAM Ubuntu server on DigitalOcean.
I am using Cassandra 3.9.
After going through the setup process detailed here,
cqlsh and nodetool status both return this message:
nodetool: Failed to connect to '127.0.0.1:7199' - ConnectException: 'Connection refused (Connection refused)'.
I have read several similar issues and they all suggested a 4 GB minimum RAM size; I have that but still get the same error >>
Nodetool status connection refused
Some suggest setting listen_address and rpc_address to the DigitalOcean-assigned IP in cassandra.yaml; I also tried that, but the problem persists.
Some suggest looking at the debug and system logs. There are a lot of [INFO] and [DEBUG] lines, plus some [WARN] lines which don't terminate the execution; it finally terminates at an [ERROR] line.
Warnings
...
WARN [main] 2018-03-13 12:06:52,359 DatabaseDescriptor.java:563 - Small commitlog volume detected at /var/lib/cassandra/commitlog; setting commitlog_total_space_in_mb to 6158. You can override this in cassandra.yaml
WARN [main] 2018-03-13 12:06:52,361 DatabaseDescriptor.java:590 - Small cdc volume detected at /var/lib/cassandra/cdc_raw; setting cdc_total_space_in_mb to 3079. You can override this in cassandra.yaml
WARN [main] 2018-03-13 12:06:52,365 DatabaseDescriptor.java:643 - Only 22.102GiB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
...
WARN [main] 2018-03-13 12:06:52,530 StartupChecks.java:123 - jemalloc shared library could not be preloaded to speed up memory allocations
WARN [main] 2018-03-13 12:06:52,530 StartupChecks.java:156 - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
INFO [main] 2018-03-13 12:06:52,533 SigarLibrary.java:44 - Initializing SIGAR library
WARN [main] 2018-03-13 12:06:52,554 SigarLibrary.java:174 - Cassandra server running in degraded mode. Is swap disabled? : true, Address space adequate? : true, nofile limit adequate? : true, nproc limit adequate? : false
Error
...
ERROR [main] 2018-03-13 12:06:55,808 CassandraDaemon.java:747 - Exception encountered during startup
java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/ser$
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_161]
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_161]
at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_161]
at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:106) ~[apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:145) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:219) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:601) [apache-cassandra-3.9.jar:3.9]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:730) [apache-cassandra-3.9.jar:3.9]
I am not sure what to do with that error message.
I suspect many of you have had this issue, and some of you solved it;
please clearly detail for the rest of us (and future folks) how you sorted it out.
I am considering trying an earlier version of Cassandra; maybe this problem is specific to version 3.9 and not earlier ones.
This is a known issue (CASSANDRA-14173). Either downgrade Java to Java 8 update 152 (8u152), or upgrade Cassandra.
I am getting this error when trying to launch Cassandra. I ran rm -rf * on /var/log/cassandra and /var/lib/cassandra and tried to run Cassandra again, with no success. Any idea? I have been looking at similar cases, but none at the launch of Cassandra, and nothing I found helped me solve this problem.
root#test # cassandra
(...)
16:38:38.551 [MemtableFlushWriter:1] ERROR o.a.c.service.CassandraDaemon - Exception in thread Thread[MemtableFlushWriter:1,5,main]
java.lang.RuntimeException: Insufficient disk space to write 572 bytes
at org.apache.cassandra.db.Directories.getWriteableLocation(Directories.java:349) ~[apache-cassandra-2.2.7.jar:2.2.7]
at org.apache.cassandra.db.Memtable.flush(Memtable.java:324) ~[apache-cassandra-2.2.7.jar:2.2.7]
at org.apache.cassandra.db.ColumnFamilyStore$Flush.run(ColumnFamilyStore.java:1165) ~[apache-cassandra-2.2.7.jar:2.2.7]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_65]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[na:1.8.0_65]
at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_65]
16:38:41.785 [ScheduledTasks:1] INFO o.a.cassandra.locator.TokenMetadata - Updating topology for all endpoints that have changed
FYI, if I try to run Cassandra again right after, I get:
[main] ERROR o.a.c.service.CassandraDaemon - Port already in use: 7199; nested exception is:
java.net.BindException: Address already in use
So it seems that "CassandraDaemon" is alive; but when I run cqlsh I get this error:
root#test # cqlsh
Warning: custom timestamp format specified in cqlshrc, but local timezone could not be detected.
Either install Python 'tzlocal' module for auto-detection or specify client timezone in your cqlshrc.
Connection error: ('Unable to connect to any servers', {'127.0.0.1': error(111, "Tried connecting to [('127.0.0.1', 9042)]. Last error: Connection refused")})
Finally, the free -m command gives me:
total used free shared buff/cache available
Mem: 15758 8308 950 1440 6499 5576
Swap: 4095 2135 1960
Thanks for your help!
EDIT: Here are the WARN messages I got during the Cassandra launch:
11:36:14.622 [main] WARN o.a.c.config.DatabaseDescriptor - Small commitlog volume detected at /var/lib/cassandra/commitlog; setting commitlog_total_space_in_mb to 2487. You can override this in cassandra.yaml
11:36:14.623 [main] WARN o.a.c.config.DatabaseDescriptor - Only 17 MB free across all data volumes. Consider adding more capacity to your cluster or removing obsolete snapshots
11:36:14.943 [main] WARN o.a.cassandra.service.StartupChecks - jemalloc shared library could not be preloaded to speed up memory allocations
11:36:14.943 [main] WARN o.a.cassandra.service.StartupChecks - JMX is not enabled to receive remote connections. Please see cassandra-env.sh for more info.
11:36:14.943 [main] WARN o.a.cassandra.service.StartupChecks - OpenJDK is not recommended. Please upgrade to the newest Oracle Java release
11:36:14.954 [main] WARN o.a.cassandra.utils.SigarLibrary - Cassandra server running in degraded mode. Is swap disabled? : false, Address space adequate? : true, nofile limit adequate? : false, nproc limit adequate? : true
As I said in a previous comment, it seems the issue was triggered by the /var partition filling up, because NiFi, which I use for ingesting data, generates large logs in my configuration. Removing useless logs and keeping an eye on the /var folder seems to prevent the error.
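If /var keeps filling up, another option is to point Cassandra's data and commitlog directories at a larger volume in cassandra.yaml; the paths below are purely illustrative and assume a bigger mount at /data:
<code>
# cassandra.yaml (excerpt, illustrative paths)
data_file_directories:
    - /data/cassandra/data
commitlog_directory: /data/cassandra/commitlog
saved_caches_directory: /data/cassandra/saved_caches
</code>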
While adding a document using the SolrCloud server, I'm getting the following exception:
60 [main] INFO org.apache.solr.common.cloud.ConnectionManager - Waiting for client to connect to ZooKeeper
65 [main-SendThread(jmajeed.ibsorb.com:8982)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server jmajeed.ibsorb.com/192.168.70.91:8982. Will not attempt to authenticate using SASL (unknown error)
69 [main-SendThread(jmajeed.ibsorb.com:8982)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to jmajeed.ibsorb.com/192.168.70.91:8982, initiating session
Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper 192.168.70.91:8982/#/hotelcontent within 10000 ms
Does anybody have any idea why this is happening?
Thanks.
Have you disturbed the default configuration of the Solr nodes? By default, if you do not specify a port, the first node in the cluster starts on port 8983, so check this first. If that is not the problem, check whether the cluster is up by accessing the SolrCloud admin UI, and then see whether all the shards in the cluster are alive by clicking on the Cloud tab.
If everything is fine and you are still facing the above problem, and you are trying to access a remote SolrCloud server, it may be a firewall issue.
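For reference, a minimal SolrJ 4.x sketch for adding a document to a SolrCloud collection looks roughly like this; the ZooKeeper address, chroot, collection name, and field values are assumptions based on the question:
<code>
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrCloudAddSketch {
    public static void main(String[] args) throws Exception {
        // zkHost is the ZooKeeper ensemble address, optionally followed by a chroot path
        CloudSolrServer server = new CloudSolrServer("192.168.70.91:8982/hotelcontent");
        server.setZkConnectTimeout(10000);        // fail fast if ZooKeeper is unreachable
        server.setDefaultCollection("hotelcontent"); // collection name assumed
        server.connect();

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "hotel-1");
        doc.addField("name", "Example Hotel");    // field assumed to exist in the schema
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}
</code>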