Cassandra read timeout during paging query - Java

I have 2 Cassandra nodes with replication_factor=2. I am trying to run select().all() from my code, and I used setFetchSize(50000). When I start iterating over the result, after some time it throws a ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses were required but only 0 replica responded). Could anyone please give me some suggestions?
I am creating the cluster using the code below:
PoolingOptions poolingOptions = new PoolingOptions();
poolingOptions.setCoreConnectionsPerHost(HostDistance.LOCAL, 52)
        .setMaxConnectionsPerHost(HostDistance.LOCAL, 80)
        .setMaxRequestsPerConnection(HostDistance.LOCAL, 500);

SocketOptions socketOption = new SocketOptions();
socketOption.setReadTimeoutMillis(600000)
        .setReceiveBufferSize(1024 * 512)
        .setSendBufferSize(1024 * 512)
        .setKeepAlive(true)
        .setConnectTimeoutMillis(1800000);

cluster = Cluster.builder()
        .addContactPoints(cassandraHosts.get("HOST_1"), cassandraHosts.get("HOST_2"))
        .withPoolingOptions(poolingOptions)
        .withPort(cassandraPort)
        .withSocketOptions(socketOption)
        .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
        .build();

Session session = cluster.connect(cassandraDB);
Cassandra version: 2.2.1
Java 7
Is there any other way to execute a select-all query without hitting the read timeout exception?
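One thing worth noting: setReadTimeoutMillis only raises the client-side timeout, while the coordinator still aborts range reads after the server-side read_request_timeout_in_ms / range_request_timeout_in_ms configured in cassandra.yaml. A pattern that often helps is a much smaller fetch size, so each page stays comfortably within the server-side timeout while the driver pages through the table transparently. A minimal sketch, assuming driver 2.x, hypothetical keyspace/table names, and a placeholder process() helper:

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Statement;
import com.datastax.driver.core.querybuilder.QueryBuilder;

Statement stmt = QueryBuilder.select().all()
        .from("my_keyspace", "my_table") // hypothetical names
        .setFetchSize(1000);             // small pages: one server round trip each

ResultSet rs = session.execute(stmt);
for (Row row : rs) {
    // the driver fetches the next page transparently when this one is exhausted
    process(row); // process() is a placeholder for your per-row logic
}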

Related

How to handle RecordTooLargeException without the Flink job restarting

Is there any way to ignore oversized messages without the Flink job restarting?
If I try to produce (using KafkaSink) a message which is too large (greater than max.message.bytes), then a RecordTooLargeException occurs, the Flink job restarts, and this "exception & restart" cycle repeats endlessly!
I don't need to increase message size limits such as max.message.bytes (Kafka topic config) and max.request.size (Flink producer config); they are good, they are already big. I just want to handle the situation when an unrealistically large message is about to be produced. In that case, the big message should be ignored, an error should be logged, no runtime exception should occur, and the endless restart loop should NOT start.
I tried to use a ProducerInterceptor -> it cannot intercept/reject a message; it can only modify it.
I tried to ignore oversized messages in the SerializationSchema (implemented a custom wrapper of SerializationSchema) -> it cannot discard the message either.
I am trying to override the KafkaWriter and KafkaSink classes, but it seems to be challenging.
I will be grateful for any advice!
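One workaround worth sketching here: measure the serialized size in a filter placed in front of the sink, and drop anything over the limit. This is a hedged sketch, not an official KafkaSink feature; MAX_RECORD_BYTES is an assumption mirroring max.request.size, and a real check would also account for key and header overhead:

import java.nio.charset.StandardCharsets;
import org.apache.flink.api.common.functions.FilterFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hedged sketch: a filter in front of the KafkaSink that drops oversized
// records instead of letting the producer throw RecordTooLargeException.
public class OversizeFilter implements FilterFunction<String> {
    private static final Logger LOG = LoggerFactory.getLogger(OversizeFilter.class);
    private static final int MAX_RECORD_BYTES = 1048576; // assumption: mirrors max.request.size

    @Override
    public boolean filter(String value) {
        int size = value.getBytes(StandardCharsets.UTF_8).length;
        if (size > MAX_RECORD_BYTES) {
            LOG.error("Dropping oversized record ({} bytes)", size);
            return false; // record is ignored, the job keeps running
        }
        return true;
    }
}

// usage: stream.filter(new OversizeFilter()).sinkTo(sink);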
A few quick environment details:
Kafka version is 2.8.1
Flink code is Java code based on the newer KafkaSource/KafkaSink API, not the older KafkaConsumer/KafkaProducer API.
The flink-clients and flink-connector-kafka version is 1.15.0
Code sample which throws the RecordTooLargeException:
int numberOfRows = 1;
int rowsPerSecond = 1;
DataStream<String> stream = environment.addSource(
        new DataGeneratorSource<>(
                RandomGenerator.stringGenerator(1050000), // max.message.bytes=1048588
                rowsPerSecond,
                (long) numberOfRows),
        TypeInformation.of(String.class))
        .setParallelism(1)
        .name("string-generator");

KafkaSinkBuilder<String> builder = KafkaSink.<String>builder()
        .setBootstrapServers("localhost:9092")
        .setDeliverGuarantee(DeliveryGuarantee.AT_LEAST_ONCE)
        .setRecordSerializer(
                KafkaRecordSerializationSchema.builder()
                        .setTopic("test.output")
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build());

KafkaSink<String> sink = builder.build();
stream.sinkTo(sink).setParallelism(1).name("output-producer");
Exception Stack Trace:
2022-06-02/14:01:45.066/PDT [flink-akka.actor.default-dispatcher-4] INFO output-producer: Writer -> output-producer: Committer (1/1) (a66beca5a05c1c27691f7b94ca6ac025) switched from RUNNING to FAILED on 271b1b90-7d6b-4a34-8116-3de6faa8a9bf # 127.0.0.1 (dataPort=-1).
org.apache.flink.util.FlinkRuntimeException: Failed to send data to Kafka null with FlinkKafkaInternalProducer{transactionalId='null', inTransaction=false, closed=false}
    at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.throwException(KafkaWriter.java:440) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
    at org.apache.flink.connector.kafka.sink.KafkaWriter$WriterCallback.lambda$onCompletion$0(KafkaWriter.java:421) ~[flink-connector-kafka-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMailsNonBlocking(MailboxProcessor.java:353) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:317) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:201) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:804) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:753) ~[flink-streaming-java-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:948) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:741) ~[flink-runtime-1.15.0.jar:1.15.0]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) ~[flink-runtime-1.15.0.jar:1.15.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1050088 bytes when serialized which is larger than 1048576, which is the value of the max.request.size configuration.

Cassandra connection has been closed error when querying via Spark

I'm trying to access a remote Cassandra cluster using Spark in Java. However, when I try to execute an aggregation function (count), the following error occurs:
Exception in thread "main" com.datastax.driver.core.exceptions.TransportException: [/192.168.1.103:9042] Connection has been closed
at com.datastax.driver.core.exceptions.TransportException.copy(TransportException.java:38)
at com.datastax.driver.core.exceptions.TransportException.copy(TransportException.java:24)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
I already set the timeouts in cassandra.yaml to a big value.
Here is my code:
SparkConf conf = new SparkConf();
conf.setAppName("Test");
conf.setMaster("local[*]");
conf.set("spark.cassandra.connection.host", "host");
Spark app = new Spark(conf);
app.run();
.
.
.
CassandraConnector connector = CassandraConnector.apply(sc.getConf());
// Prepare the schema
try (Session session = connector.openSession()) {
    session.execute("USE keyspace0");
    ResultSet results = session.execute("SELECT count(*) FROM table0");
}
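A single CQL SELECT count(*) forces one coordinator to scan the whole table inside a single request, which is exactly the kind of query that dies with a timeout or a closed connection. A hedged alternative sketch, assuming a spark-cassandra-connector version whose Java API exposes cassandraCount() (keyspace0/table0 as in the question):

import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;

// Hedged sketch: count through the connector instead of one big CQL query.
// Each Spark task counts its own token range, so no single Cassandra
// request has to scan the entire table.
long count = javaFunctions(sc)
        .cassandraTable("keyspace0", "table0")
        .cassandraCount();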

Invalid value of TX counter when deleting vertices with OrientDB 2.1.x

While deleting vertices from a class with more than 1,600,000 instances using the console:
delete vertex program batch 5000
I encounter this error:
com.orientechnologies.orient.core.exception.OStorageException: Invalid value of TX counter
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.rollback(OTransactionOptimistic.java:175)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2595)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.close(ODatabaseDocumentTx.java:1137)
at com.orientechnologies.orient.server.network.protocol.http.command.post.OServerCommandPostCommand.execute(OServerCommandPostCommand.java:107)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.service(ONetworkProtocolHttpAbstract.java:180)
at com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.execute(ONetworkProtocolHttpAbstract.java:627)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:77)
I don't run inside a transaction, and I was able to reproduce the error with both plocal and remote connections, with Enterprise version 2.1.4.
The database was created with another server instance, but with the same major version (2.1.x).
Update
My database is really huge (~29 GB), but I managed to reproduce the issue with a smaller graph by starting the console with a small heap (-Xmx32M).
OrientGraphFactory factory = ...
OrientBaseGraph graph = factory.getTx();
graph.createVertexType("Program");
graph.createVertexType("Variable");
graph.getRawGraph().getLocalCache().setEnable(false);
graph.declareIntent(new OIntentMassiveInsert());
try {
    Vertex vr = graph.addVertex("class:Program");
    for (int i = 0; i < 20000; i++) {
        Vertex v = graph.addVertex("class:Variable");
        graph.addEdge(null, v, vr, "has");
    }
} finally {
    graph.shutdown();
}
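As a workaround sketch (hedged: the OrientGraph returned by factory.getTx() is transactional, and the batch size is an assumption mirroring the console's "batch 5000"), committing periodically keeps the optimistic transaction small enough for a constrained heap:

OrientGraph graph = factory.getTx();
int batchSize = 5000; // assumption, mirrors "batch 5000" from the console
try {
    Vertex vr = graph.addVertex("class:Program");
    for (int i = 1; i <= 20000; i++) {
        Vertex v = graph.addVertex("class:Variable");
        graph.addEdge(null, v, vr, "has");
        if (i % batchSize == 0) {
            graph.commit();                   // flush the tx before it grows too large
            vr = graph.getVertex(vr.getId()); // re-load the vertex after the commit
        }
    }
    graph.commit();
} finally {
    graph.shutdown();
}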
Update 2
With OrientDB Enterprise 2.1.5, the error is a little bit different. I ran with a remote connection (because I wasn't able to configure logging in the client console) and with the same memory restriction (-Xmx32M):
2015-11-02 11:50:49:936 FINE {db=db1} Deleted record #13:468 v.2 [OLocalPaginatedStorage]
{db=db1} Error on transaction commit
com.orientechnologies.orient.core.exception.OTransactionException: Transaction was committed more times than it is started.
at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:160)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2653)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2622)
at com.tinkerpop.blueprints.impls.orient.OrientTransactionalGraph.commit(OrientTransactionalGraph.java:161)
at com.orientechnologies.orient.graph.sql.OCommandExecutorSQLDeleteVertex.end(OCommandExecutorSQLDeleteVertex.java:279)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLSelect.execute(OCommandExecutorSQLSelect.java:437)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:90)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1538)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1519)
at com.orientechnologies.orient.core.sql.query.OSQLQuery.run(OSQLQuery.java:72)
at com.orientechnologies.orient.core.query.OQueryAbstract.execute(OQueryAbstract.java:33)
at com.orientechnologies.orient.graph.sql.OCommandExecutorSQLDeleteVertex$2.call(OCommandExecutorSQLDeleteVertex.java:212)
at com.orientechnologies.orient.graph.sql.OCommandExecutorSQLDeleteVertex$2.call(OCommandExecutorSQLDeleteVertex.java:207)
at com.orientechnologies.orient.graph.sql.OGraphCommandExecutorSQLFactory.runInTx(OGraphCommandExecutorSQLFactory.java:130)
at com.orientechnologies.orient.graph.sql.OCommandExecutorSQLDeleteVertex.execute(OCommandExecutorSQLDeleteVertex.java:207)
at com.orientechnologies.orient.core.sql.OCommandExecutorSQLDelegate.execute(OCommandExecutorSQLDelegate.java:90)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.executeCommand(OAbstractPaginatedStorage.java:1538)
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.command(OAbstractPaginatedStorage.java:1519)
at com.orientechnologies.orient.core.command.OCommandRequestTextAbstract.execute(OCommandRequestTextAbstract.java:63)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.command(ONetworkProtocolBinary.java:1319)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:396)
at com.orientechnologies.orient.server.network.protocol.binary.OBinaryNetworkProtocolAbstract.execute(OBinaryNetworkProtocolAbstract.java:223)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:77)
2015-11-02 11:50:54:101 FINE {db=db1} Sent run-time exception to the client /127.0.0.1:37943: com.orientechnologies.orient.core.exception.OStorageException: Invalid value of TX counter [ONetworkProtocolBinary]

Hibernate + enabling SQLite Pragmas to gain speed on Windows 7 machine

Used software:
Hibernate 3.6
SQLite JDBC 3.6.0
Java JRE 1.6.x
I have a problem with transferring data over a TCP connection (20,000 entries):
create an SQLite database with the help of Hibernate
use Hibernate views and Hibernate annotations to create queries
Hibernate properties are also used
storing 20,000 entries with Hibernate and NO SQLite pragmas enabled takes nearly 6 minutes (~330 sec) on Windows 7
storing 20,000 entries without Hibernate and with all relevant SQLite pragmas enabled takes about 2 minutes (~109 sec) on Windows 7
tests with Hibernate and SQLite without pragmas on Windows XP and Windows Vista run fast, but on Windows 7 it takes nearly 3 times as long (~330 sec) as on the XP machine
on Windows 7 we want to activate SQLite pragmas to gain a speed boost
relevant pragmas are:
PRAGMA cache_size = 400000;
PRAGMA synchronous = OFF;
PRAGMA count_changes = OFF;
PRAGMA temp_store = MEMORY;
PRAGMA auto_vacuum = NONE;
Problem: we must use Hibernate (no NHibernate!)
Questions:
How can these pragmas be enabled for a Hibernate SQLite connection, if that is possible at all?
Is it possible to do so using Hibernate?
I was also looking for a way to set another pragma, PRAGMA foreign_keys = ON, for Hibernate connections. I didn't find anything on the subject, and the only solution I came up with is to decorate the SQLite JDBC driver and set the required pragma each time a new connection is retrieved. See the sample code below:
@Override
public Connection connect(String url, Properties info) throws SQLException {
    final Connection connection = originalDriver.connect(url, info);
    initPragmas(connection);
    return connection;
}

private void initPragmas(Connection connection) throws SQLException {
    // Enabling foreign keys
    connection.prepareStatement("PRAGMA foreign_keys = ON;").execute();
}
Full sample is here: https://gist.github.com/52dbc7066787684de634. Then, when initializing the hibernate.connection.driver_class property, just set it to your package.DriverDecorator.
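For readers who don't want to follow the gist link, a minimal self-contained sketch of such a decorator might look like this (hedged: it assumes org.sqlite.JDBC as the wrapped driver, and the method set matches the Java 6-era java.sql.Driver interface used by the question's JRE 1.6):

import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.util.Properties;

// Hedged sketch of a decorating JDBC driver: delegates everything to the
// real SQLite driver and applies pragmas to every new connection.
public class DriverDecorator implements Driver {
    private final Driver originalDriver = new org.sqlite.JDBC();

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        Connection connection = originalDriver.connect(url, info);
        if (connection != null) {
            connection.prepareStatement("PRAGMA foreign_keys = ON;").execute();
        }
        return connection;
    }

    @Override
    public boolean acceptsURL(String url) throws SQLException {
        return originalDriver.acceptsURL(url);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) throws SQLException {
        return originalDriver.getPropertyInfo(url, info);
    }

    @Override
    public int getMajorVersion() { return originalDriver.getMajorVersion(); }

    @Override
    public int getMinorVersion() { return originalDriver.getMinorVersion(); }

    @Override
    public boolean jdbcCompliant() { return originalDriver.jdbcCompliant(); }
}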
One-by-one inserts can be very slow; you may want to consider batching (see the sketch below). Please see my answer to this other post.
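A minimal sketch of that batching pattern with Hibernate 3.x (hedged: sessionFactory and entries are assumptions, and hibernate.jdbc.batch_size would be set to a matching value in the Hibernate properties):

// Hedged sketch: flush and clear the org.hibernate.Session periodically so
// the first-level cache doesn't accumulate all 20,000 entities at once.
Session session = sessionFactory.openSession(); // sessionFactory: assumption
Transaction tx = session.beginTransaction();
for (int i = 0; i < entries.size(); i++) {      // entries: assumption
    session.save(entries.get(i));
    if (i % 50 == 0) { // 50 matches hibernate.jdbc.batch_size
        session.flush();
        session.clear();
    }
}
tx.commit();
session.close();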
For the PRAGMA foreign_keys = ON equivalent:
hibernate.connection.foreign_keys=true
or
<property name="connection.foreign_keys">true</property>
depending on your configuration strategy.

Basics of Hector & Cassandra

I'm working with Cassandra 0.8.2.
I am working with the most recent version of Hector, and my Java version is 1.6.0_26.
I'm very new to Cassandra & Hector.
What I'm trying to do:
1. connect to an up & running instance of Cassandra on a different server. I know it's running because I can ssh from my terminal into the server running this Cassandra instance and run the CLI with full functionality.
2. then I want to connect to a keyspace, create a column family, and then add a value to that column family through Hector.
I think my problem is that this running instance of Cassandra on this server might not be configured to accept commands that are not local. I think my next step will be to add a local instance of Cassandra on the machine I'm working on and try to do this locally. What do you think?
Here's my Java code:
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
import me.prettyprint.hector.api.ddl.ComparatorType;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;
public class MySample {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "xxx.xxx.x.41:9160");
        Keyspace keyspace = HFactory.createKeyspace("apples", cluster);
        ColumnFamilyDefinition cf = HFactory.createColumnFamilyDefinition("apples", "ColumnFamily2", ComparatorType.UTF8TYPE);
        StringSerializer stringSerializer = StringSerializer.get();
        Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
        mutator.insert("jsmith", "Standard1", HFactory.createStringColumn("first", "John"));
    }
}
My ERROR is:
16:22:19,852 INFO CassandraHostRetryService:37 - Downed Host Retry service started with queue size -1 and retry delay 10s
16:22:20,136 INFO JmxMonitor:54 - Registering JMX me.prettyprint.cassandra.service_Test Cluster:ServiceType=hector,MonitorType=hector
Exception in thread "main" me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:Keyspace apples does not exist)
at me.prettyprint.cassandra.connection.HThriftClient.getCassandra(HThriftClient.java:70)
at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:226)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:102)
at me.prettyprint.cassandra.service.KeyspaceServiceImpl.batchMutate(KeyspaceServiceImpl.java:108)
at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:222)
at me.prettyprint.cassandra.model.MutatorImpl$3.doInKeyspace(MutatorImpl.java:219)
at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
at me.prettyprint.cassandra.model.MutatorImpl.execute(MutatorImpl.java:219)
at me.prettyprint.cassandra.model.MutatorImpl.insert(MutatorImpl.java:59)
at org.cassandra.examples.MySample.main(MySample.java:25)
Caused by: InvalidRequestException(why:Keyspace apples does not exist)
at org.apache.cassandra.thrift.Cassandra$set_keyspace_result.read(Cassandra.java:5302)
at org.apache.cassandra.thrift.Cassandra$Client.recv_set_keyspace(Cassandra.java:481)
at org.apache.cassandra.thrift.Cassandra$Client.set_keyspace(Cassandra.java:456)
at me.prettyprint.cassandra.connection.HThriftClient.getCassandra(HThriftClient.java:68)
... 11 more
Thank you in advance for your help.
The exception you are getting is:
why:Keyspace apples does not exist
In your code, this line does not actually create the keyspace:
Keyspace keyspace = HFactory.createKeyspace("apples", cluster);
As described here, this is the code you need to define your keyspace:
int replicationFactor = 1; // replicationFactor was undefined in the original sample; 1 is an example value
ColumnFamilyDefinition cfDef = HFactory.createColumnFamilyDefinition("MyKeyspace", "ColumnFamilyName", ComparatorType.BYTESTYPE);
KeyspaceDefinition newKeyspace = HFactory.createKeyspaceDefinition("MyKeyspace", ThriftKsDef.DEF_STRATEGY_CLASS, replicationFactor, Arrays.asList(cfDef));
// Add the schema to the cluster.
// "true" as the second param means that Hector will block until all nodes see the change.
cluster.addKeyspace(newKeyspace, true);
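If the keyspace may already exist (for example, on a second run), a defensive variant is possible. This is a hedged sketch, assuming Hector's Cluster.describeKeyspace returns null for an unknown keyspace:

// Hedged sketch: only create the keyspace when it is not already defined.
KeyspaceDefinition existing = cluster.describeKeyspace("MyKeyspace");
if (existing == null) {
    cluster.addKeyspace(newKeyspace, true);
}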
We also have a getting-started guide up on the wiki which might be of some help.
