Redisson client: RedisTimeoutException issue - Java

I am using a Google Cloud managed Redis cluster (v5) via Redisson (3.12.5).
The following are my single-server configurations in my YAML file:
singleServerConfig:
  idleConnectionTimeout: 10000
  connectTimeout: 10000
  timeout: 3000
  retryAttempts: 3
  retryInterval: 1500
  password: null
  subscriptionsPerConnection: 5
  clientName: null
  address: "redis://127.0.0.1:6379"
  subscriptionConnectionMinimumIdleSize: 1
  subscriptionConnectionPoolSize: 50
  connectionMinimumIdleSize: 40
  connectionPoolSize: 250
  database: 0
  dnsMonitoringInterval: 5000
threads: 0
nettyThreads: 0
codec: !<org.redisson.codec.JsonJacksonCodec> {}
I am getting the following exceptions when I increase the load on my application:
org.redisson.client.RedisTimeoutException: Unable to acquire connection! Increase connection pool size and/or retryInterval settings Node source: NodeSource
org.redisson.client.RedisTimeoutException: Command still hasn't been written into connection! Increase nettyThreads and/or retryInterval settings. Payload size in bytes: 34. Node source: NodeSource
It seems there is no issue on the Redis cluster side, so I think I need to tweak my client-side connection pooling settings (above) to make this work.
Please suggest the changes I need to make to my configuration.
I am also curious whether I should close the Redis connection after making get/set calls. I have tried to find an answer, but found nothing conclusive on how to close Redis connections.
One last thing I want to ask: is there any mechanism in Redisson to get connection pool stats (active connections, idle connections, etc.)?
Edit 1:
I have tried changing the following values in 3 different iterations.
Iteration 1:
idleConnectionTimeout: 30000
connectTimeout: 30000
timeout: 30000
Iteration 2:
nettyThreads: 0
Iteration 3:
connectionMinimumIdleSize: 100
connectionPoolSize: 750
I have tried all of these, but nothing has worked for me.
Any help is appreciated.
Thanks in advance

Assuming you are getting low-memory alerts on your cache JVM, you may have to analyze the traffic and determine two things:
1. Too many parallel cache persists.
2. A huge chunk of data being persisted.
Both can be determined from the traffic on your server.
For option 1, configuring the pool size should solve your issue, but for option 2 you may have to refactor your code to persist data in smaller chunks, along the lines of the sketch below.
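A minimal sketch of option 2 (assuming Redisson as in the question; the map name, batch size, and types are illustrative assumptions), splitting one huge write into smaller batches:

import java.util.HashMap;
import java.util.Map;
import org.redisson.api.RMap;
import org.redisson.api.RedissonClient;

public class ChunkedPersist {
    // Flushes the payload to Redis in batches of 100 entries instead of one big write.
    static void persistInChunks(RedissonClient redisson, Map<String, String> data) {
        RMap<String, String> target = redisson.getMap("bigPayload"); // hypothetical map name
        Map<String, String> batch = new HashMap<>();
        for (Map.Entry<String, String> e : data.entrySet()) {
            batch.put(e.getKey(), e.getValue());
            if (batch.size() == 100) {
                target.putAll(batch); // one round trip per 100 entries
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            target.putAll(batch); // flush the remainder
        }
    }
}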

Try setting nettyThreads: 64 in your configuration.
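For illustration, a minimal sketch (assuming Redisson 3.12.x; the values are starting points to tune under load, not recommendations) of setting these knobs programmatically instead of via YAML:

import org.redisson.Redisson;
import org.redisson.api.RedissonClient;
import org.redisson.config.Config;

public class RedissonSetup {
    public static void main(String[] args) {
        Config config = new Config();
        config.setNettyThreads(64); // more Netty I/O threads; the second exception suggests raising this
        config.useSingleServer()
              .setAddress("redis://127.0.0.1:6379")
              .setConnectionMinimumIdleSize(40)
              .setConnectionPoolSize(250)
              .setTimeout(3000)
              .setRetryInterval(1500);
        RedissonClient redisson = Redisson.create(config);
        // Redisson pools connections internally: there is nothing to close after get/set calls;
        // call shutdown() once, when the application stops.
        redisson.shutdown();
    }
}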

Related

Memory issue with App Engine and Firestore

I'm developing a MS (microservice) with Kotlin and Micronaut which accesses a Firestore database. When I run this MS locally, I can make it work with 128 MB because it's very simple: it just reads and writes small amounts of data to Firestore, really small data like this:
{
  "project": "DUMMY",
  "columns": [
    {
      "name": "TODO",
      "taskStatus": "TODO"
    },
    {
      "name": "IN_PROGRESS",
      "taskStatus": "IN_PROGRESS"
    },
    {
      "name": "DONE",
      "taskStatus": "DONE"
    }
  ],
  "tasks": {}
}
I'm running this in App Engine Standard on an F1 instance (256 MB, 600 MHz) with these properties in my app.yaml:
runtime: java11
instance_class: F1 # 256 MB 600 MHz
entrypoint: java -Xmx200m -jar MY_JAR.jar
service: data-connector
env_variables:
  JAVA_TOOL_OPTIONS: "-Xmx230m"
  GAE_MEMORY_MB: 128M
automatic_scaling:
  max_instances: 1
  max_idle_instances: 1
I know all those memory-handling properties are not necessary, but I was desperate to make this work and tried a lot of solutions, because my first error message was:
Exceeded soft memory limit of 256 MB with 263 MB after servicing 1 requests total. Consider setting a larger instance class in app.yaml.
The error below is not fixed by the properties in app.yaml; now, every time I make a call that returns that JSON, I get this error:
2020-04-10 12:09:15.953 CEST
While handling this request, the process that handled this request was found to be using too much memory and was terminated. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may have a memory leak in your application or may be using an instance with insufficient memory. Consider setting a larger instance class in app.yaml.
The first request always takes longer, I think due to some Firestore configuration, but the thing is that I cannot make this work; I always get the same error.
Do you have any idea what I could be doing wrong or what I need to do to fix this?
TL;DR: The problem was that I tried to use a very small instance for a simple application, and even then I needed more memory.
OK, a friend helped me with this. I was using a very small instance, and even when I didn't get the memory-limit error, it was a memory problem.
Updating my instance to an F2 (512 MB, 1.2 GHz) solved the problem, and testing my app with siege showed very nice performance:
Transactions: 5012 hits
Availability: 100.00 %
Elapsed time: 59.47 secs
Data transferred: 0.45 MB
Response time: 0.30 secs
Transaction rate: 84.28 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 24.95
Successful transactions: 3946
Failed transactions: 0
Longest transaction: 1.08
Shortest transaction: 0.09
My sysops friend tells me that these instances are more for Python scripting code and things like that, not JVM REST servers.

How to solve network and memory issues in Kafka brokers?

When using Kafka, I get two intermittent network-related errors:
1. Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest: Connection to broker was disconnected before the response was read
2. Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest: Connection to broker1 (id: 1 rack: null) failed
[Configuration environment]
Brokers: 5 / server.properties: "kafka_manager_heap_s=1g", "kafka_manager_heap_x=1g", "offsets.commit.required.acks=1", "offsets.commit.timeout.ms=5000"; most settings are the defaults.
Zookeepers: 3
Servers: 5
Kafka: 0.10.1.2
Zookeeper: 3.4.6
Both of these errors are caused by a loss of network communication.
When these errors occur, Kafka expands or shrinks the ISR for the partition several times.
Expanding example) INFO Partition [my-topic,7] on broker 1: Expanding ISR for partition [my-topic,7] from 1,2 to 1,2,3
Shrinking example) INFO Partition [my-topic,7] on broker 1: Shrinking ISR for partition [my-topic,7] from 1,2,3 to 1,2
I understand that these errors are caused by network problems, but I'm not sure why the network breaks occur.
And if this network disconnection persists, I got the following additional error
: Error when handling request(topics=null} java.lang.OutOfMemoryError: Java heap space
I wonder what causes these and how can I improve this?
The network error tells you that one of the brokers is not running, which means the others cannot connect to it. In my experience, the minimum heap size you should assign to a broker is 2 GB.

Hazelcast Operation Heartbeat Timeouts appearing sporadically

We have a Hazelcast client (3.7.4):
// Initializes the Hazelcast client config
ClientConfig aHazelcastClientConfig = new ClientConfig();
String aHazelcastUrl = this.getHost() + ":" + this.getPort().toString();
ClientNetworkConfig aHazelcastNetworkConfig =
        aHazelcastClientConfig.getNetworkConfig();
aHazelcastNetworkConfig.addAddress(aHazelcastUrl);
GroupConfig group = new GroupConfig(getGroupName(), getGroupPassword());
aHazelcastClientConfig.setGroupConfig(group);
HazelcastInstance aHazelcastClient =
        HazelcastClient.newHazelcastClient(aHazelcastClientConfig);
...
IMap aMonitoredMap = aHazelcastClient.getMap(getMonitoredMap());
that periodically checks one HZ server (3.7.4), and we have observed that the following exceptions sometimes appear on the client side:
InitializeDistributedObjectOperation invocation failed to complete due to operation-heartbeat-timeout. Current time: 2017-02-07 18:07:30.329. Total elapsed time: 120189 ms. Last operation heartbeat: never. Last operation heartbeat from member: 2017-02-07 18:05:37.489. Invocation{op=com.hazelcast.spi.impl.proxyservice.impl.operations.InitializeDistributedObjectOperation{serviceName='hz:impl:mapService', identityHash=9759664, partitionId=-1, replicaIndex=0, callId=0, invocationTime=1486487130140 (2017-02-07 18:05:30.140), waitTimeout=-1, callTimeout=60000}, tryCount=1, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=60000, firstInvocationTimeMs=1486487130140, firstInvocationTime='2017-02-07 18:05:30.140', lastHeartbeatMillis=0, lastHeartbeatTime='1970-01-01 01:00:00.000', target=[10.118.152.82]:5720, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=Connection[id=7, /172.22.191.200:5720->/10.118.152.82:42563, endpoint=[10.118.152.82]:5720, alive=true, type=MEMBER]}
It seems the maximum call waiting timeout (60000 ms by default) is being reached. In the above example, the total elapsed time is more than 2 minutes (120189 ms).
This problem appears sporadically, without any regular pattern.
The network seems to be working correctly when it appears, so we can rule out a network connectivity issue.
Any hint or recommendation about what could be causing it?
Thanks a lot.
Best Regards,
Jorge
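One knob worth checking (a minimal sketch, assuming Hazelcast 3.7 client property names; the value is an assumption to tune, not a confirmed fix for the heartbeat timeout above): the client-side invocation timeout, whose 120 s default matches the ~120189 ms elapsed time in the log, can be widened:

import com.hazelcast.client.HazelcastClient;
import com.hazelcast.client.config.ClientConfig;
import com.hazelcast.core.HazelcastInstance;

public class ClientTimeoutSetup {
    public static void main(String[] args) {
        ClientConfig clientConfig = new ClientConfig();
        // Total time the client keeps retrying an invocation before giving up (default 120 s)
        clientConfig.setProperty("hazelcast.client.invocation.timeout.seconds", "240");
        HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);
    }
}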

Oracle Coherence eviction not working

I am working on implementing an Oracle Coherence replicated cache. The implementation is as follows:
<?xml version="1.0"?>
<!DOCTYPE cache-config SYSTEM "cache-config.dtd">
<cache-config>
  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>EntryList</cache-name>
      <scheme-name>ENTRY_ITEMS</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>
  <caching-schemes>
    <replicated-scheme>
      <scheme-name>ENTRY_ITEMS</scheme-name>
      <backing-map-scheme>
        <local-scheme>
          <scheme-name>ENTRY_ITEMS</scheme-name>
          <unit-calculator>FIXED</unit-calculator>
          <expiry-delay>60m</expiry-delay> <!-- expire after 60 minutes -->
          <high-units>2000</high-units>
          <eviction-policy>LFU</eviction-policy>
        </local-scheme>
      </backing-map-scheme>
      <autostart>true</autostart>
    </replicated-scheme>
  </caching-schemes>
</cache-config>
tangosol-coherence-override.xml:
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
           xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
                               coherence-operational-config.xsd">
  <cluster-config>
    <member-identity>
      <cluster-name>clusterName</cluster-name>
      <!-- Name of the first member of the cluster -->
      <role-name>RoleName</role-name>
    </member-identity>
    <unicast-listener xml-override="coherence-environment.xml"/>
  </cluster-config>
</coherence>
coherence-environment.xml:
<unicast-listener xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                  xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
                  xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
                                      coherence-operational-config.xsd">
  <well-known-addresses>
    <socket-address id="1">
      <address>member1</address>
      <port>7777</port>
    </socket-address>
    <socket-address id="2">
      <address>member2</address>
      <port>7777</port>
    </socket-address>
  </well-known-addresses>
</unicast-listener>
This is implemented and tested to be working perfectly.
We were testing the eviction policy of the cache. To ease testing, I did the following:
I set the cache size to 4 by setting high-units to 4, then added 4 entries to fill the cache completely.
Now, if I add one more entry (number 5), I expect the least frequently used entry to be kicked out of the cache to make room for it.
The next time I access the cache for the new entry number 5, I should get a cache HIT.
But that's not happening; I always get a cache MISS.
I ran my Java code in debug mode, and I can see that the code puts entry number 5 into the cache, but this put operation is not reflected in the cache.
Now, I am definitely not the first person to test Coherence cache eviction policies. Am I missing anything in the configuration? Am I testing eviction the wrong way? Any inputs are welcome.
Thanks.
Try to isolate the problem:
Change <expiry-delay>1</expiry-delay> (1 ms).
Add <low-units>0</low-units> (the default value is 75%, which is 3 entries).
Try another policy: <eviction-policy>LRU</eviction-policy>.
If those don't help, try adding a custom eviction policy class to see whether eviction is triggered; see here:
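As a lighter-weight check than a full custom eviction policy (a sketch, assuming the classic com.tangosol.* API; the key/value names are illustrative assumptions), a MapListener shows whether entries are actually evicted when the 5th entry goes in:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;
import com.tangosol.util.MapEvent;
import com.tangosol.util.MultiplexingMapListener;

public class EvictionProbe {
    public static void main(String[] args) throws InterruptedException {
        NamedCache cache = CacheFactory.getCache("EntryList");
        // Evictions surface as delete events on the cache
        cache.addMapListener(new MultiplexingMapListener() {
            @Override
            protected void onMapEvent(MapEvent evt) {
                System.out.println(evt);
            }
        });
        for (int i = 1; i <= 5; i++) {
            cache.put("key-" + i, "value-" + i); // with high-units=4, the 5th put should evict an entry
        }
        Thread.sleep(5000); // allow the event dispatcher to deliver the eviction event
        CacheFactory.shutdown();
    }
}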
I have tried your example with 3 as the high-units value. My observation: eviction works as soon as I put the 4th entry item. So, it works!
You can start a Coherence server (coherence.sh) for command-line monitoring with the same override and cache config files. See the details that get printed when I enter the following command to inspect the cache:
Map (?): cache EntryList

Cache Configuration: EntryList
  SchemeName: ENTRY_ITEMS
  AutoStart: true
  ServiceName: ReplicatedCache
  ServiceDependencies
    EventDispatcherThreadPriority: 10
    ThreadPriority: 10
    WorkerThreadsMax: 2147483647
    WorkerPriority: 5
    EnsureCacheTimeout: 30000
  BackingMapScheme
    InnerScheme (LocalScheme)
      SchemeName: ENTRY_ITEMS
      UnitCalculatorBuilder
        Calculator: FIXED
      EvictionPolicyBuilder
        Policy: LFU
      ExpiryDelay: 1h
      HighUnits
        Units: 3
        UnitFactor: 1

How to resolve the MongoDB client "Timeout waiting for a pooled item after 120000 MILLISECONDS" exception

I have a Java class (not Java Spring or a server) which:
1) inserts documents into one collection,
2) reads documents from another collection,
3) inserts documents into another collection, and
4) deletes documents from another collection.
All 4 of the above operations happen across 3 collections.
I get the following error:
Exception in thread "pool-1-thread-240" com.mongodb.MongoTimeoutException: Timeout waiting for a pooled item after 120000 MILLISECONDS
at com.mongodb.ConcurrentPool.get(ConcurrentPool.java:113)
at com.mongodb.PooledConnectionProvider.get(PooledConnectionProvider.java:75)
at com.mongodb.DefaultServer.getConnection(DefaultServer.java:73)
at com.mongodb.BaseCluster$WrappedServer.getConnection(BaseCluster.java:221)
at com.mongodb.DBTCPConnector$MyPort.getConnection(DBTCPConnector.java:508)
at com.mongodb.DBTCPConnector$MyPort.get(DBTCPConnector.java:456)
at com.mongodb.DBTCPConnector.getPrimaryPort(DBTCPConnector.java:414)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:176)
at com.mongodb.DBCollectionImpl.insert(DBCollectionImpl.java:159)
at com.mongodb.DBCollection.insert(DBCollection.java:93)
at com.mongodb.DBCollection.insert(DBCollection.java:78)
at com.mongodb.DBCollection.insert(DBCollection.java:120)
at MyProgram$MyClass.run(MyProgram.java:149)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
a) How can I fix this?
I am using mongod 2.6.3 on Mac OS.
b) Should I increase the MongoDB connection pool size on my client side?
c) If yes, how should I do it?
d) What is the maximum value I can set it to?
I get this problem on the line in my Java code where I do the insert operation.
The default connection pool size is 100 and the maximum pool size is 20 for a single MongoDB instance.
To resolve your problem, take this article into consideration.
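For (b) and (c), a minimal sketch (assuming the legacy 2.x Java driver that the stack trace points to; the host, port, and pool values are illustrative assumptions) of raising the pool size via MongoClientOptions:

import com.mongodb.MongoClient;
import com.mongodb.MongoClientOptions;
import com.mongodb.ServerAddress;

public class PooledClient {
    public static void main(String[] args) {
        MongoClientOptions options = MongoClientOptions.builder()
                .connectionsPerHost(200)   // pool size; size it toward your worker-thread count
                .maxWaitTime(120000)       // ms a thread waits for a pooled connection (the 120000 in the error)
                .build();
        MongoClient client = new MongoClient(new ServerAddress("localhost", 27017), options);
        System.out.println(client.getDatabaseNames()); // quick connectivity check
        client.close();
    }
}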
