I am getting Timeout exceptions even though there is not much load on the Couchbase server.
net.spy.memcached.OperationTimeoutException: Timeout waiting for value
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1003)
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1018)
at com.eos.cache.CacheClient.get(CacheClient.java:280)
at com.eos.cache.GenericCacheAccessObject.get(GenericCacheAccessObject.java:55)
...
...
Caused by: net.spy.memcached.internal.CheckedOperationTimeoutException: Timed out waiting for operation - failing node: /192.168.4.12:11210
at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:157)
at net.spy.memcached.internal.GetFuture.get(GetFuture.java:62)
at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:997)
...30 more
This is how I am creating the client.
List<URI> uris = new ArrayList<URI>();
String[] serverTokens = getServers().split(" ");
for (int index = 0; index < serverTokens.length; index++) {
uris.add(new URI(serverTokens[index]));
}
CouchbaseConnectionFactoryBuilder ccfb = new CouchbaseConnectionFactoryBuilder();
ccfb.setProtocol(Protocol.BINARY);
ccfb.setOpTimeout(10000); // wait up to 10 seconds for an operation to
// succeed
ccfb.setOpQueueMaxBlockTime(5000); // wait up to 5 seconds when trying
// to enqueue an operation
ccfb.setMaxReconnectDelay(1500);
CouchbaseConnectionFactory cf = ccfb.buildCouchbaseConnection(uris, bucket, "");
CouchbaseClient client = new CouchbaseClient(cf);
I am maintaining a pool of persistent clients in our web server. And we are not even touching the max conn limit which has been set to 15 only.
Pls help me guys in solving this.
Related
I am using the CloseableHttpClient with the PoolingHttpClientConnectionManager. I am using this client to make POST requests to a single URL. PoolingHttpClientConnectionManager is set with a "max total" of 3 connections, TTL set to 5 seconds, connection/socket timeout 5 secs. Here is what I see (note that the requests are made sequentially not concurrently):
POST request #1: 1 connection in the pool
POST request #2: 2 connections in the pool
POST request #3: 3 connections in the pool
POST request #4: last used connection is closed, a new connection is created
I'm not sure why the last used connection is closed and a new connection is established with the fourth request. Why doesn't the connection manager reuse of the existing connections?
Here is what I see in the log:
2020-05-20 22:34:12 DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://xxxxx.com:443][total kept alive: 2; total allocated: 2 of 3]
2020-05-20 22:34:12 DEBUG o.a.h.i.c.LoggingManagedHttpClientConnection - http-outgoing-3: Close connection
I traced it down to the following code in HttpComponent's AbstractConnPool.java (httpcore-4.4.4.jar) file (function getPoolEntryBlocking):
int maxPerRoute = this.getMax(route);
int excess = Math.max(0, pool.getAllocatedCount() + 1 - maxPerRoute);
int totalUsed;
if (excess > 0) {
for(totalUsed = 0; totalUsed < excess; ++totalUsed) {
E lastUsed = pool.getLastUsed();
if (lastUsed == null) {
break;
}
lastUsed.close();
this.available.remove(lastUsed);
pool.remove(lastUsed);
}
}
Can someone explain why it needs to close the connection (lastUsed.close())?
I need to send a customized email to 400 clients.
I am doing this :
for (Client c : clients){
setUpEmail(c);
sendMail(c);
}
My problem is that my email provider authorizes me to send only 10 emails per minute. How could I do that in the loop?
Thanks.
Use Guava's RateLimiter.
If you already have Guava in your library path, or if you're interested in adding it, you can use this solution:
RateLimiter rateLimiter = RateLimiter.create(10/60d); // 10 permits per 60 seconds.
for (Client c : clients){
setUpEmail(c);
rateLimiter.acquire(1);
sendMail(c);
}
Your kind of problem is exactly why RateLimiter was created.
Use a counter and wait for a minute when ten mails were sent:
int counter = 0;
for (Client c : clients){
counter++;
setUpEmail(c);
sendMail(c);
if(counter%10==0){
Thread.sleep(60*1000); // wait a minute
}
}
This is not ideal since you may lost some time, e.g. when sending ten mails needs 20 seconds, you only may wait 40 seconds before starting a new bulk.
Another option would be to wait between each mail so that the time for 10 mails is at least 60 seconds:
for (Client c : clients){
setUpEmail(c);
sendMail(c);
Thread.sleep(6*1000); // wait 6 seconds
}
And a more sophisticated one:
int counter = 0;
long start = System.currentTimeMillis();
for (Client c : clients){
counter++;
setUpEmail(c);
sendMail(c);
if(counter%10==0){
long needed = System.currentTimeMillis() - start; // ms needed for ten mails
Thread.sleep(60*1000 - needed); // wait rest of the minute
start = System.currentTimeMillis(); // start of the next bulk
}
}
Deque clientsDeque = new ArrayDeque(clients);
ScheduledExecutorService executor = Executors.newScheduledThreadPool(1);
Runnable task = () => {
for (int i=0; i<10; i++){
Client c = clientsDeque.poll();
setUpEmail(c);
sendMail(c);
}
}
executor.schedule(task, 60, TimeUnit.SECONDS);
I am trying to export a lot of data (2 TB, 30kkk rows) from Cassandra to BigQuery. All my infrastructure is on GCP. My Cassandra cluster have 4 nodes (4 vCPUs, 26 GB memory, 2000 GB PD (HDD) each). There is one seed node in the cluster. I need to transform my data before writing to BQ, so I am using Dataflow. Worker type is n1-highmem-2. Workers and Cassandra instances are at the same zone europe-west1-c. My limits for Cassandra:
Part of my pipeline code responsible for reading transform is located here.
Autoscaling
The problem is that when I don't set --numWorkers, the autoscaling set number of workers in such manner (2 workers average):
Load balancing
When I set --numWorkers=15 the rate of reading doesn't increase and only 2 workers communicate with Cassandra (I can tell it from iftop and only these workers have CPU load ~60%).
At the same time Cassandra nodes don't have a lot of load (CPU usage 20-30%). Network and disk usage of the seed node is about 2 times higher than others, but not too high, I think:
And for the not seed node here:
Pipeline launch warnings
I have some warnings when pipeline is launching:
WARNING: Size estimation of the source failed:
org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#7569ea63
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: /10.132.9.101:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.101:9042] Cannot connect), /10.132.9.102:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.102:9042] Cannot connect), /10.132.9.103:9042 (com.datastax.driver.core.exceptions.TransportException: [/10.132.9.103:9042] Cannot connect), /10.132.9.104:9042 [only showing errors of first 3 hosts, use getErrors() for more details])
My Cassandra cluster is in GCE local network and it seams that some queries are made from my local machine and cannot reach the cluster (I am launching pipeline with Dataflow Eclipse plugin as described here). These queries are about size estimation of tables. Can I specify size estimation by hand or launch pipline from GCE instance? Or can I ignore these warnings? Does it have effect on rate of read?
I'v tried to launch pipeline from GCE VM. There is no more problem with connectivity. I don't have varchar columns in my tables but I get such warnings (no codec in datastax driver [varchar <-> java.lang.Long]). :
WARNING: Can't estimate the size
com.datastax.driver.core.exceptions.CodecNotFoundException: Codec not found for requested operation: [varchar <-> java.lang.Long]
at com.datastax.driver.core.CodecRegistry.notFound(CodecRegistry.java:741)
at com.datastax.driver.core.CodecRegistry.createCodec(CodecRegistry.java:588)
at com.datastax.driver.core.CodecRegistry.access$500(CodecRegistry.java:137)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:246)
at com.datastax.driver.core.CodecRegistry$TypeCodecCacheLoader.load(CodecRegistry.java:232)
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3628)
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2336)
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2295)
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2208)
at com.google.common.cache.LocalCache.get(LocalCache.java:4053)
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
at com.datastax.driver.core.CodecRegistry.lookupCodec(CodecRegistry.java:522)
at com.datastax.driver.core.CodecRegistry.codecFor(CodecRegistry.java:485)
at com.datastax.driver.core.CodecRegistry.codecFor(CodecRegistry.java:467)
at com.datastax.driver.core.AbstractGettableByIndexData.codecFor(AbstractGettableByIndexData.java:69)
at com.datastax.driver.core.AbstractGettableByIndexData.getLong(AbstractGettableByIndexData.java:152)
at com.datastax.driver.core.AbstractGettableData.getLong(AbstractGettableData.java:26)
at com.datastax.driver.core.AbstractGettableData.getLong(AbstractGettableData.java:95)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl.getTokenRanges(CassandraServiceImpl.java:279)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl.getEstimatedSizeBytes(CassandraServiceImpl.java:135)
at org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource.getEstimatedSizeBytes(CassandraIO.java:308)
at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.startDynamicSplitThread(BoundedReadEvaluatorFactory.java:166)
at org.apache.beam.runners.direct.BoundedReadEvaluatorFactory$BoundedReadEvaluator.processElement(BoundedReadEvaluatorFactory.java:142)
at org.apache.beam.runners.direct.TransformExecutor.processElements(TransformExecutor.java:146)
at org.apache.beam.runners.direct.TransformExecutor.run(TransformExecutor.java:110)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Pipeline read code
// Read data from Cassandra table
PCollection<Model> pcollection = p.apply(CassandraIO.<Model>read()
.withHosts(Arrays.asList("10.10.10.101", "10.10.10.102", "10.10.10.103", "10.10.10.104")).withPort(9042)
.withKeyspace(keyspaceName).withTable(tableName)
.withEntity(Model.class).withCoder(SerializableCoder.of(Model.class))
.withConsistencyLevel(CASSA_CONSISTENCY_LEVEL));
// Transform pcollection to KV PCollection by rowName
PCollection<KV<Long, Model>> pcollection_by_rowName = pcollection
.apply(ParDo.of(new DoFn<Model, KV<Long, Model>>() {
#ProcessElement
public void processElement(ProcessContext c) {
c.output(KV.of(c.element().rowName, c.element()));
}
}));
Number of splits (Stackdriver log)
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
W Number of splits is less than 0 (0), fallback to 1
I Number of splits is 1
What I'v tried
No effect:
set read consistency level to ONE
nodetool setstreamthroughput 1000, nodetool setinterdcstreamthroughput 1000
increase Cassandra read concurrency (in cassandra.yaml): concurrent_reads: 32
setting different number of workers 1-40.
Some effect:
1. I'v set numSplits = 10 as #jkff proposed. Now I can see in logs:
I Murmur3Partitioner detected, splitting
W Can't estimate the size
W Can't estimate the size
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#6d83ee93 produced 10 bundles with total serialized response size 20799
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#25d02f5c produced 10 bundles with total serialized response size 19359
I Splitting source [0, 1) produced 1 bundles with total serialized response size 1091
I Murmur3Partitioner detected, splitting
W Can't estimate the size
I Splitting source [0, 0) produced 0 bundles with total serialized response size 76
W Number of splits is less than 0 (0), fallback to 10
I Number of splits is 10
I Splitting source org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#2661dcf3 produced 10 bundles with total serialized response size 18527
But I'v got another exception:
java.io.IOException: Failed to start reading from source: org.apache.beam.sdk.io.cassandra.Cassandra...
(5d6339652002918d): java.io.IOException: Failed to start reading from source: org.apache.beam.sdk.io.cassandra.CassandraIO$CassandraSource#5f18c296
at com.google.cloud.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:582)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation$SynchronizedReaderIterator.start(ReadOperation.java:347)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:183)
at com.google.cloud.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:148)
at com.google.cloud.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:68)
at com.google.cloud.dataflow.worker.DataflowWorker.executeWork(DataflowWorker.java:336)
at com.google.cloud.dataflow.worker.DataflowWorker.doWork(DataflowWorker.java:294)
at com.google.cloud.dataflow.worker.DataflowWorker.getAndPerformWork(DataflowWorker.java:244)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.doWork(DataflowBatchWorkerHarness.java:135)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:115)
at com.google.cloud.dataflow.worker.DataflowBatchWorkerHarness$WorkerThread.call(DataflowBatchWorkerHarness.java:102)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:53 mismatched character 'p' expecting '$'
at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:58)
at com.datastax.driver.core.exceptions.SyntaxError.copy(SyntaxError.java:24)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:245)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:68)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:43)
at org.apache.beam.sdk.io.cassandra.CassandraServiceImpl$CassandraReaderImpl.start(CassandraServiceImpl.java:80)
at com.google.cloud.dataflow.worker.WorkerCustomSources$BoundedReaderIterator.start(WorkerCustomSources.java:579)
... 14 more
Caused by: com.datastax.driver.core.exceptions.SyntaxError: line 1:53 mismatched character 'p' expecting '$'
at com.datastax.driver.core.Responses$Error.asException(Responses.java:144)
at com.datastax.driver.core.DefaultResultSetFuture.onSet(DefaultResultSetFuture.java:179)
at com.datastax.driver.core.RequestHandler.setFinalResult(RequestHandler.java:186)
at com.datastax.driver.core.RequestHandler.access$2500(RequestHandler.java:50)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.setFinalResult(RequestHandler.java:817)
at com.datastax.driver.core.RequestHandler$SpeculativeExecution.onSet(RequestHandler.java:651)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1077)
at com.datastax.driver.core.Connection$Dispatcher.channelRead0(Connection.java:1000)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:293)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:267)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:341)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:363)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:349)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:565)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:479)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
... 1 more
Maybe there is a mistake: CassandraServiceImpl.java#L220
And this statement looks like mistype: CassandraServiceImpl.java#L207
Changes I'v done to CassandraIO code
As #jkff proposed, I've change CassandraIO in the way I needed:
#VisibleForTesting
protected List<BoundedSource<T>> split(CassandraIO.Read<T> spec,
long desiredBundleSizeBytes,
long estimatedSizeBytes) {
long numSplits = 1;
List<BoundedSource<T>> sourceList = new ArrayList<>();
if (desiredBundleSizeBytes > 0) {
numSplits = estimatedSizeBytes / desiredBundleSizeBytes;
}
if (numSplits <= 0) {
LOG.warn("Number of splits is less than 0 ({}), fallback to 10", numSplits);
numSplits = 10;
}
LOG.info("Number of splits is {}", numSplits);
Long startRange = MIN_TOKEN;
Long endRange = MAX_TOKEN;
Long startToken, endToken;
String pk = "$pk";
switch (spec.table()) {
case "table1":
pk = "table1_pk";
break;
case "table2":
case "table3":
pk = "table23_pk";
break;
}
endToken = startRange;
Long incrementValue = endRange / numSplits - startRange / numSplits;
String splitQuery;
if (numSplits == 1) {
// we have an unique split
splitQuery = QueryBuilder.select().from(spec.keyspace(), spec.table()).toString();
sourceList.add(new CassandraIO.CassandraSource<T>(spec, splitQuery));
} else {
// we have more than one split
for (int i = 0; i < numSplits; i++) {
startToken = endToken;
endToken = startToken + incrementValue;
Select.Where builder = QueryBuilder.select().from(spec.keyspace(), spec.table()).where();
if (i > 0) {
builder = builder.and(QueryBuilder.gte("token(" + pk + ")", startToken));
}
if (i < (numSplits - 1)) {
builder = builder.and(QueryBuilder.lt("token(" + pk + ")", endToken));
}
sourceList.add(new CassandraIO.CassandraSource(spec, builder.toString()));
}
}
return sourceList;
}
I think this should be classified as a bug in CassandraIO. I filed BEAM-3424. You can try building your own version of Beam with that default of 1 changed to 100 or something like that, while this issue is being fixed.
I also filed BEAM-3425 for the bug during size estimation.
I am struggling with manual the transaction management. Background: I need to run quarz crons which run batch processes. It is recommended for batch processing to manually decide when to flush to the db to not slow down the application to much.
I have a pooled hibernate connection as the following
dataSource {
pooled = true
driverClassName = "com.mysql.jdbc.Driver"
dialect = "org.hibernate.dialect.MySQL5InnoDBDialect"
properties {
maxActive = 50
maxIdle = 25
minIdle = 1
initialSize = 1
minEvictableIdleTimeMillis = 60000
timeBetweenEvictionRunsMillis = 60000
numTestsPerEvictionRun = 3
maxWait = 10000
testOnBorrow = true
testWhileIdle = true
testOnReturn = false
validationQuery = "SELECT 1"
validationQueryTimeout = 3
validationInterval = 15000
jmxEnabled = true
maxAge = 10 * 60000
// http://tomcat.apache.org/tomcat-7.0-doc/jdbc-pool.html#JDBC_interceptors
jdbcInterceptors = "ConnectionState;StatementCache(max=200)"
}
}
hibernate {
cache.use_second_level_cache = false
cache.use_query_cache = false
cache.region.factory_class = 'net.sf.ehcache.hibernate.EhCacheRegionFactory'
show_sql = false
logSql = false
}
the cron job calls a service in the service i run do the following:
for(int g=0; g<checkResults.size() ;g++) {
def tmpSearchTerm = SearchTerm.findById((int)results[g+i][0])
tmpSearchTerm.count=((String)checkResults[g]).toInteger()
batch.add(tmpSearchTerm)
}
//increase counter
i+=requestSizeTMP
if (i%(requestSize*4)==0 || i+1==results.size()){
println "PREPARATION TO WRITE:" + i
SearchTerm.withSession{
def tx = session.beginTransaction()
for (SearchTerm s: batch) {
s.save()
}
batch.clear()
tx.commit()
println ">>>>>>>>>>>>>>>>>>>>>writing: ${i}<<<<<<<<<<<<<<<<<<<<<<"
}
session.flush()
session.clear()
}
}
So I am adding things to a batch until I have enough (4x the request size or the last item) and then I am trying to write it to the db.
Everything works fine.. but somehow the code seems to open hibernate transactions and does not close them. I don't really understand why but I am getting a hard error and tomcat crashes with too many connections. I have 2 Problems with that, which i do not understand:
1) If the dataSource is pooled and the maxActive is 50 how can i get a too many connection errors if the limit of tomcat is 500.
2) How do I explicitly terminate the transaction so that i do not have so many open connections?
You can use withTransaction because it will manage transaction.
For example
Account.withTransaction { status ->
def source = Account.get(params.from)
def dest = Account.get(params.to)
int amount = params.amount.toInteger()
if (source.active) {
source.balance -= amount
if (dest.active) {
dest.amount += amount
}
else {
status.setRollbackOnly()
}
}
}
You can look about withTransaction in http://grails.org/doc/latest/ref/Domain%20Classes/withTransaction.html
You can see the difference between withSession and withTransaction in https://stackoverflow.com/a/19692615/1610918
------UPDATE-----------
But I would prefer you to use service and it can be called from job.
I am trying to filter results in HBase this way:
List<Filter> andFilterList = new ArrayList<>();
SingleColumnValueFilter sourceLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
sourceLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter sourceUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("source"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
sourceUpperFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetLowerFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.GREATER, Bytes.toBytes(lowerLimit));
targetLowerFilter.setFilterIfMissing(true);
SingleColumnValueFilter targetUpperFilter = new SingleColumnValueFilter(Bytes.toBytes("cf"), Bytes.toBytes("target"), CompareFilter.CompareOp.LESS_OR_EQUAL, Bytes.toBytes(upperLimit));
targetUpperFilter.setFilterIfMissing(true);
andFilterList.add(sourceUpperFilter);
andFilterList.add(targetUpperFilter);
FilterList andFilter = new FilterList(FilterList.Operator.MUST_PASS_ALL, andFilterList);
List<Filter> orFilterList = new ArrayList<>();
orFilterList.add(sourceLowerFilter);
orFilterList.add(targetLowerFilter);
FilterList orFilter = new FilterList(FilterList.Operator.MUST_PASS_ONE, orFilterList);
FilterList fl = new FilterList(FilterList.Operator.MUST_PASS_ALL);
fl.addFilter(andFilter);
fl.addFilter(orFilter);
Scan edgeScan = new Scan();
edgeScan.setFilter(fl);
ResultScanner edgeScanner = table.getScanner(edgeScan);
Result edgeResult;
logger.info("Writing edges...");
while ((edgeResult = edgeScanner.next()) != null) {
// Some code
}
This code launchs this error:
org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:402)
at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.writeFile(RDF2Subdue.java:150)
at org.deustotech.internet.phd.framework.rdf2subdue.RDF2Subdue.run(RDF2Subdue.java:39)
at org.deustotech.internet.phd.Main.main(Main.java:32)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
at java.lang.Thread.run(Thread.java:745)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:95)
at org.apache.hadoop.hbase.protobuf.ProtobufUtil.getRemoteException(ProtobufUtil.java:285)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:204)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:59)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:114)
at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:90)
at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:354)
... 9 more
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException): org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 178 number_of_rows: 100 close_scanner: false next_call_seq: 0
at org.apache.hadoop.hbase.regionserver.HRegionServer.scan(HRegionServer.java:3098)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29497)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:168)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:39)
at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:111)
at java.lang.Thread.run(Thread.java:745)
at org.apache.hadoop.hbase.ipc.RpcClient.call(RpcClient.java:1453)
at org.apache.hadoop.hbase.ipc.RpcClient.callBlockingMethod(RpcClient.java:1657)
at org.apache.hadoop.hbase.ipc.RpcClient$BlockingRpcChannelImplementation.callBlockingMethod(RpcClient.java:1715)
at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:29900)
at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:174)
... 13 more
The RPC timeout is set to 600000. I have tried to remove some filters given these results:
sourceUpperFilter && (sourceLowerFilter || targetLowerFilter) --> Success
targetUpperFilter && (sourceLowerFilter || targetLowerFilter) --> Success
(sourceUpperFilter && targetUpperFilter) && (sourceLowerFilter) --> Fail
(sourceUpperFilter && targetUpperFilter) && (targetLowerFilter) --> Fail
Any help would be appreciated. Thank you.
I solve this problem by setting hbase.client.scanner.caching
see also
Client and RS maintain a nextCallSeq number during the scan. Every next() call from client to server will increment this number in both sides. Client passes this number along with the request and at RS side both the incoming nextCallSeq and its nextCallSeq will be matched. In case of a timeout this increment at the client side should not happen. If at the server side fetching of next batch of data was over, there will be mismatch in the nextCallSeq number. Server will throw OutOfOrderScannerNextException and then client will reopen the scanner with startrow as the last successfully retrieved row.
Since the problem is caused by the client-side overtime, then the corresponding reduction in client cache (hbase.client.scanner.caching) size or increase rpc timeout time (hbase.rpc.timeout) can be.
Hope this answer helps.
Reason :Looking for few rows from a big region. It takes time to fill the #rows
as requested by client side. By this time the client gets an rpc timeout.
So client side will retry the call on same scanner. Remember with this next
call client says give me next N rows from where you are. The old failed
call was in progress and would have advanced some rows. So this retry call
will miss those rows.... to avoid this and to distinguish this case we have
this scan seqno and this exception. On seeing this the client will close
the scanner and create a new one with proper start row . But this retry way
happens only one more time. Again this call also might be timing out.
So we
have to adjust the timeout and/or scan caching value.
heart beat mechanism avoids such timeout for long running scans.
In our case where data is huge in hbase we have used
RPC time out = 1800000 and lease period = 1800000 and we have used fuzzy row filters and also scan.setCaching(xxxx)// value need to be adjusted ;
Note : value filters are slow(since full table scan will take long for execution) than row filter
With all above precautions we are successful to query huge data from hbase with mapreduce.
Hope this explanation helps.