EhCache - Expired elements won't evict - java

I'm using EhCache 2.9 in my Spring Boot application, and I've configured the cache to expire after 300 seconds (5 minutes).
When I run the application and request the element for the first time, it gets cached, and after that it never expires.
However, when I do @CachePut it gets updated successfully and the updated element is then returned.
What is wrong in my configuration?
Here is my ehcache.xml file:
<?xml version="1.0" encoding="UTF-8"?>
<ehcache>
<defaultCache maxElementsInMemory="500" eternal="false"
overflowToDisk="false" memoryStoreEvictionPolicy="LFU" />
<diskStore path="java.io.tmpdir"/>
<cache name="appointments"
maxElementsInMemory="5000"
eternal="false"
timeToIdleSeconds="0"
timeToLiveSeconds="300"
overflowToDisk="false"
memoryStoreEvictionPolicy="LFU" />
</ehcache>
And here is how I request the cache:
@Cacheable("appointments")
public List<Event> getEvents(String eventsForUser, Date startDate, Date endDate) throws Exception {
return fetchEventsFromTheServer(eventsForUser, startDate, endDate);
}
@CachePut("appointments")
public List<Event> refreshEventsCache(String eventsForUser, Date startDate, Date endDate) throws Exception {
return fetchEventsFromTheServer(eventsForUser, startDate, endDate);
}
Any suggestions?

Flush – To move a cache entry to a lower tier. Flushing is used to free up resources while still keeping data in the cluster. Entry E1 is shown to be flushed from the L1 off-heap store to the Terracotta Server Array (TSA).
Fault – To copy a cache entry from a lower tier to a higher tier. Faulting occurs when data is required at a higher tier but is not resident there. The entry is not deleted from the lower tiers after being faulted. Entry E2 is shown to be faulted from the TSA to L1 heap.
Eviction – To remove a cache entry from the cluster. The entry is deleted; it can only be reloaded from a source outside the cluster. Entries are evicted to free up resources. Entry E3, which exists only on the L2 disk, is shown to be evicted from the cluster.
Expiration – A status based on Time To Live and Time To Idle settings. To maintain cache performance, expired entries may not be immediately flushed or evicted. Entry E4 is shown to be expired but still in the L1 heap.
Pinning – To force data to remain in certain tiers. Pinning can be set on individual entries or an entire cache, and must be used with caution to avoid exhausting a resource such as heap. E5 is shown pinned to L1 heap.
http://www.ehcache.org/documentation/2.7/configuration/data-life.html
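The data-life definitions above explain the behaviour in the question: Ehcache expiration is lazy, so an expired element can stay resident until something inspects it, but a read after the TTL will not return it. Here is a minimal, self-contained sketch of the same lazy-TTL idea (plain Java with a hypothetical class, not the Ehcache API):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of lazy TTL expiration, the strategy the Ehcache docs describe:
// expired entries are not proactively removed; they are filtered out on access.
class LazyTtlCache<K, V> {
    private static class Entry<V> {
        final V value;
        final long expiresAtMillis;
        Entry(V value, long ttlMillis) {
            this.value = value;
            this.expiresAtMillis = System.currentTimeMillis() + ttlMillis;
        }
    }

    private final Map<K, Entry<V>> store = new HashMap<>();
    private final long ttlMillis;

    LazyTtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void put(K key, V value) { store.put(key, new Entry<>(value, ttlMillis)); }

    // The entry may still be resident after its TTL, but get() never returns it.
    V get(K key) {
        Entry<V> e = store.get(key);
        if (e == null) return null;
        if (System.currentTimeMillis() >= e.expiresAtMillis) {
            store.remove(key);  // evicted lazily, on access
            return null;
        }
        return e.value;
    }

    int rawSize() { return store.size(); }  // counts expired-but-not-yet-evicted entries

    public static void main(String[] args) throws InterruptedException {
        LazyTtlCache<String, String> cache = new LazyTtlCache<>(50); // 50 ms TTL
        cache.put("a", "1");
        System.out.println(cache.get("a"));   // fresh entry: returned
        Thread.sleep(80);                     // let the TTL pass
        System.out.println(cache.rawSize());  // expired entry still resident
        System.out.println(cache.get("a"));   // filtered out and removed on access
        System.out.println(cache.rawSize());
    }
}
```

So with timeToLiveSeconds="300" the cached list is not necessarily removed at the 5-minute mark, but the next lookup after expiry should miss and re-fetch; if it genuinely never expires, verify that this ehcache.xml is the one actually picked up by Spring.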

Related

Flink rocksdb compaction filter not working

I have a Flink cluster. I enabled the compaction filter and I'm using state TTL, but the RocksDB compaction filter does not free state from memory.
I have about 300 records/s in my Flink pipeline.
My state TTL config:
@Override
public void open(Configuration parameters) throws Exception {
    ListStateDescriptor<ObjectNode> descriptor = new ListStateDescriptor<ObjectNode>(
            "my-state",
            TypeInformation.of(new TypeHint<ObjectNode>() {})
    );
    StateTtlConfig ttlConfig = StateTtlConfig
            .newBuilder(Time.seconds(600))
            .cleanupInRocksdbCompactFilter(2)
            .build();
    descriptor.enableTimeToLive(ttlConfig);
    myState = getRuntimeContext().getListState(descriptor);
}
flink-conf.yaml:
state.backend: rocksdb
state.backend.rocksdb.ttl.compaction.filter.enabled: true
state.backend.rocksdb.block.blocksize: 16kb
state.backend.rocksdb.compaction.level.use-dynamic-size: true
state.backend.rocksdb.thread.num: 4
state.checkpoints.dir: file:///opt/flink/checkpoint
state.backend.rocksdb.timer-service.factory: rocksdb
state.backend.rocksdb.checkpoint.transfer.thread.num: 2
state.backend.local-recovery: true
state.backend.rocksdb.localdir: /opt/flink/rocksdb
jobmanager.execution.failover-strategy: region
rest.port: 8081
state.backend.rocksdb.memory.managed: true
# state.backend.rocksdb.memory.fixed-per-slot: 20mb
state.backend.rocksdb.memory.write-buffer-ratio: 0.9
state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
taskmanager.memory.managed.fraction: 0.6
taskmanager.memory.network.fraction: 0.1
taskmanager.memory.network.min: 500mb
taskmanager.memory.network.max: 700mb
taskmanager.memory.process.size: 5500mb
taskmanager.memory.task.off-heap.size: 800mb
metrics.reporter.influxdb.class: org.apache.flink.metrics.influxdb.InfluxdbReporter
metrics.reporter.influxdb.host: ####
metrics.reporter.influxdb.port: 8086
metrics.reporter.influxdb.db: ####
metrics.reporter.influxdb.username: ####
metrics.reporter.influxdb.password: ####
metrics.reporter.influxdb.consistency: ANY
metrics.reporter.influxdb.connectTimeout: 60000
metrics.reporter.influxdb.writeTimeout: 60000
state.backend.rocksdb.metrics.estimate-num-keys: true
state.backend.rocksdb.metrics.num-running-compactions: true
state.backend.rocksdb.metrics.background-errors: true
state.backend.rocksdb.metrics.block-cache-capacity: true
state.backend.rocksdb.metrics.block-cache-pinned-usage: true
state.backend.rocksdb.metrics.block-cache-usage: true
state.backend.rocksdb.metrics.compaction-pending: true
Monitoring by Influxdb and Grafana:
As the name of this TTL cleanup implies (cleanupInRocksdbCompactFilter), it relies on the custom RocksDB compaction filter, which runs only during compactions. More details in the docs.
The metrics in the screenshot show that there have been no running compactions at any point. I suppose the size of the data is simply not big enough yet to trigger any compaction.
Compaction Filter does not free states from memory.
I assume that the main RAM memory is meant by saying 'from memory'. If so, the compaction is not running there at all. The size of data, kept by RocksDB in main memory, is always limited. It is basically a cache and the expired untouched state should just get evicted from it eventually. The rest is periodically spilled to disk and gets compacted over time. This is when this TTL cleanup is supposed to remove the expired state from the system.
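To make that concrete, here is a small, self-contained toy model (plain Java, not the Flink or RocksDB API) of the behaviour described above: reads treat expired state as absent right away, but the storage is only reclaimed when a compaction pass actually runs.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Toy model of compaction-filter-style TTL cleanup: expired entries are
// invisible to readers immediately, but physically dropped only during
// a "compaction", which is exactly when the RocksDB compact filter runs.
class CompactionTtlStore {
    private final Map<String, long[]> store = new HashMap<>(); // {value, writeTime}
    private final long ttlMillis;
    private long now = 0; // simulated clock, for determinism

    CompactionTtlStore(long ttlMillis) { this.ttlMillis = ttlMillis; }

    void advanceClock(long millis) { now += millis; }

    void put(String key, long value) { store.put(key, new long[]{value, now}); }

    Long get(String key) {
        long[] e = store.get(key);
        if (e == null || now - e[1] >= ttlMillis) return null; // expired reads filtered
        return e[0];
    }

    int residentEntries() { return store.size(); }

    // Expired data is physically dropped only here, during "compaction".
    void compact() {
        Iterator<Map.Entry<String, long[]>> it = store.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue()[1] >= ttlMillis) it.remove();
        }
    }

    public static void main(String[] args) {
        CompactionTtlStore s = new CompactionTtlStore(600);
        s.put("k", 42);
        s.advanceClock(1000);                    // TTL has passed
        System.out.println(s.get("k"));          // expired for readers
        System.out.println(s.residentEntries()); // but still occupying storage
        s.compact();                             // compaction finally reclaims it
        System.out.println(s.residentEntries());
    }
}
```

Until enough data accumulates for RocksDB to schedule a compaction, the situation looks like the state between the second and third prints: expired, unreadable, but still resident.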

How can I make Jgroups reconnect even after a long period of time?

We have a problem where a penetration checker run for something like 12 hours causes JGroups to disconnect: the slave doesn't rejoin the cluster, we get split brain and other symptoms of lost replication, and it doesn't recover.
<config xmlns="urn:org:jgroups"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="urn:org:jgroups http://www.jgroups.org/schema/JGroups-3.6.xsd">
<TCP bind_addr="NON_LOOPBACK"
bind_port="${infinispan.jgroups.bindPort}"
enable_diagnostics="false"
thread_naming_pattern="pl"
send_buf_size="640k"
sock_conn_timeout="300"
thread_pool.min_threads="${jgroups.thread_pool.min_threads:2}"
thread_pool.max_threads="${jgroups.thread_pool.max_threads:30}"
thread_pool.keep_alive_time="60000"
thread_pool.queue_enabled="false"
internal_thread_pool.min_threads="${jgroups.internal_thread_pool.min_threads:5}"
internal_thread_pool.max_threads="${jgroups.internal_thread_pool.max_threads:20}"
internal_thread_pool.keep_alive_time="60000"
internal_thread_pool.queue_enabled="true"
internal_thread_pool.queue_max_size="500"
oob_thread_pool.min_threads="${jgroups.oob_thread_pool.min_threads:20}"
oob_thread_pool.max_threads="${jgroups.oob_thread_pool.max_threads:200}"
oob_thread_pool.keep_alive_time="60000"
oob_thread_pool.queue_enabled="false"
/>
<TCPPING async_discovery="true"
initial_hosts="${infinispan.jgroups.tcpping.initialhosts}"
port_range="1"/>
<MERGE3 min_interval="10000"
max_interval="30000"
/>
<FD_SOCK />
<FD />
<VERIFY_SUSPECT />
<pbcast.NAKACK2 use_mcast_xmit="false"
xmit_interval="1000"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
max_msg_batch_size="100"
resend_last_seqno="true"
/>
<UNICAST3 xmit_interval="500"
xmit_table_num_rows="50"
xmit_table_msgs_per_row="1024"
xmit_table_max_compaction_time="30000"
max_msg_batch_size="100"
conn_expiry_timeout="0"
/>
<pbcast.STABLE stability_delay="500"
desired_avg_gossip="5000"
max_bytes="1M"
/>
<pbcast.GMS print_local_addr="true" join_timeout="15000"/>
<pbcast.FLUSH />
<FRAG2 />
</config>
Versions:
JGroups 3.6.13
Infinispan 8.1.0
Hibernate Search 5.3
I'm wondering if we can change our JGroups configuration so that the cluster node will eventually be able to rejoin, even after 12 hours of "attack", so that we don't have to restart the servers.
Define disconnect for me first, please!
Regarding your stack, I have a few suggestions / questions:
In general, I suggest starting from the tcp.xml shipped with the version you use and then modifying it according to your needs
TCPPING: does initial_hosts contain all cluster members?
Replace FD with FD_ALL
STABLE: desired_avg_gossip of 5s is a bit small; this generates more traffic than needed
GMS.join_timeout of 15s is quite high; this is the startup time of the first member, and it also influences discovery time
What do you need FLUSH for?

Oracle Coherence eviction not working

I am working on implementing Oracle Coherence replicated cache. The implementation is as follows:
<?xml version="1.0"?>
<!DOCTYPE cache-config SYSTEM "cache-config.dtd">
<cache-config>
<caching-scheme-mapping>
<cache-mapping>
<cache-name>EntryList</cache-name>
<scheme-name>ENTRY_ITEMS</scheme-name>
</cache-mapping>
</caching-scheme-mapping>
<caching-schemes>
<replicated-scheme>
<scheme-name>ENTRY_ITEMS</scheme-name>
<backing-map-scheme>
<local-scheme>
<scheme-name>ENTRY_ITEMS</scheme-name>
<unit-calculator>FIXED</unit-calculator>
<expiry-delay>60m</expiry-delay> <!-- expire after 60 minutes -->
<high-units>2000</high-units>
<eviction-policy>LFU</eviction-policy>
</local-scheme>
</backing-map-scheme>
<autostart>true</autostart>
</replicated-scheme>
</caching-schemes>
</cache-config>
tangosol-coherence-override.xml
<coherence xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
coherence-operational-config.xsd">
<cluster-config>
<member-identity>
<cluster-name>clusterName</cluster-name>
<!-- Name of the first member of the cluster -->
<role-name>RoleName</role-name>
</member-identity>
<unicast-listener xml-override="coherence-environment.xml"/>
</cluster-config>
</coherence>
coherence-environment.xml
<unicast-listener xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://xmlns.oracle.com/coherence/coherence-operational-config"
xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-operational-config
coherence-operational-config.xsd">
<well-known-addresses>
<socket-address id="1">
<address>member1</address>
<port>7777</port>
</socket-address>
<socket-address id="2">
<address>member2</address>
<port>7777</port>
</socket-address>
</well-known-addresses>
</unicast-listener>
This is implemented and tested to be working perfectly.
We were testing the eviction policy of the cache. To ease out testing I did the following:
I keep the size of the cache at 4 by setting high-units to 4. Now I add 4 entries to the cache. This should fill the cache completely.
Now, if I make one more entry (number 5) in the cache, I was expecting the least frequently used entry to be kicked out of the cache to make room for entry number 5.
The next time I access the cache for the new entry number 5, I should get a cache HIT.
But that's not happening; I always get a cache MISS.
I ran my Java code in debug mode and I see that the code puts entry number 5 in the cache, but this put operation is not reflected in the cache.
Now I am definitely not the first person testing the Coherence cache eviction policies. Am I missing anything in the configuration? Am I testing the eviction in the wrong way? Any inputs are welcome.
Thanks.
Try to isolate the problem:
change <expiry-delay>1</expiry-delay> (1 ms)
add <low-units>0</low-units> (the default value is 75%, which is 3 entries).
try another policy: <eviction-policy>LRU</eviction-policy>
If those don't help, try adding a custom eviction policy class to see whether eviction is triggered.
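As a sanity check on the test procedure itself, here is a plain-Java sketch (a hypothetical class, not the Coherence API) of what LFU eviction with a high-units limit should do on the 5th put: the entry with the fewest recorded hits is dropped, and the new entry stays resident.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of LFU eviction with a high-units limit (not Coherence API):
// when a put would exceed highUnits, the least frequently used entry is dropped.
class LfuSketch<K, V> {
    private final int highUnits;
    private final Map<K, V> values = new LinkedHashMap<>();
    private final Map<K, Integer> hits = new HashMap<>();

    LfuSketch(int highUnits) { this.highUnits = highUnits; }

    V get(K key) {
        V v = values.get(key);
        if (v != null) hits.merge(key, 1, Integer::sum);
        return v;
    }

    void put(K key, V value) {
        if (!values.containsKey(key) && values.size() >= highUnits) {
            // evict the entry with the fewest recorded hits
            K victim = null;
            int fewest = Integer.MAX_VALUE;
            for (K k : values.keySet()) {
                int h = hits.getOrDefault(k, 0);
                if (h < fewest) { fewest = h; victim = k; }
            }
            values.remove(victim);
            hits.remove(victim);
        }
        values.put(key, value);
    }

    boolean contains(K key) { return values.containsKey(key); }

    public static void main(String[] args) {
        LfuSketch<Integer, String> cache = new LfuSketch<>(4);     // high-units = 4
        for (int i = 1; i <= 4; i++) cache.put(i, "entry" + i);    // fill the cache
        for (int i = 2; i <= 4; i++) cache.get(i);                 // entry 1 is now least used
        cache.put(5, "entry5");                                    // should evict entry 1
        System.out.println(cache.contains(1));
        System.out.println(cache.contains(5));                     // a later get(5) is a HIT
    }
}
```

If the real cache behaves differently (entry 5 missing after the put), the problem is upstream of the eviction policy, e.g. the configuration not being applied to the cache you think it is.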
I have tried your example with 3 as High Units. My observation:
Eviction works as soon as I put the 4th entry. So, it works!
You can start the Coherence server (coherence.sh) for command-line monitoring with the same override and cache config files. See the details printed when I issue the following command to inspect the cache:
Map (?): cache EntryList
Cache Configuration: EntryList
SchemeName: ENTRY_ITEMS
AutoStart: true
ServiceName: ReplicatedCache
ServiceDependencies
EventDispatcherThreadPriority: 10
ThreadPriority: 10
WorkerThreadsMax: 2147483647
WorkerPriority: 5
EnsureCacheTimeout: 30000
BackingMapScheme
InnerScheme (LocalScheme)
SchemeName: ENTRY_ITEMS
UnitCalculatorBuilder
Calculator: FIXED
EvictionPolicyBuilder
Policy: LFU
ExpiryDelay: 1h
HighUnits
Units: 3
UnitFactor: 1

How to trace and prevent the deadlock appearing in c3p0 when running in separate processes?

I have a very simple computation which produces letter matrices and finds probably all the words in the matrix. The letters of a word are adjacent cells.
for (int i = 0; i < 500; i++) {
    System.out.println(i);
    Matrix matrix = new Matrix(4);
    matrix.scanWordsRandomly(9);
    matrix.printMatrix();
    System.out.println(matrix.getSollSize());
    matrix.write_to_db();
}
Here is the persisting code.
public void write_to_db() {
    Session session = null;
    try {
        session = HibernateUtil.getSessionFactory().openSession();
        session.beginTransaction();
        Matrixtr onematrixtr = new Matrixtr();
        onematrixtr.setDimension(dimension);
        onematrixtr.setMatrixstr(this.toString());
        onematrixtr.setSolsize(getSollSize());
        session.save(onematrixtr);
        for (Map.Entry<Kelimetr, List<Cell>> sollution : sollutions.entrySet()) {
            Kelimetr kelimetr = sollution.getKey();
            List<Cell> solpath = sollution.getValue();
            Solstr onesol = new Solstr();
            onesol.setKelimetr(kelimetr);
            onesol.setMatrixtr(onematrixtr);
            onesol.setSoltext(solpath.toString().replace("[", "").replace("]", "").replace("true", "").replace("false", ""));
            session.save(onesol);
        }
        session.getTransaction().commit();
        session.close();
    }
    catch (HibernateException he) {
        System.out.println("DB Error : " + he.getMessage());
        session.close();
    }
    catch (Exception ex) {
        System.out.println("General Error : " + ex.getMessage());
    }
}
Here is the hibernate configuration file.
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE hibernate-configuration PUBLIC "-//Hibernate/Hibernate Configuration DTD 3.0//EN" "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
<session-factory>
<property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
<property name="hibernate.connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="hibernate.connection.url">jdbc:mysql://localhost:3306/kelimegame_db_dev?autoReconnect=true&amp;useUnicode=true&amp;characterEncoding=UTF-8</property>
<property name="hibernate.connection.username">root</property>
<property name="hibernate.connection.password">!.Wlu9RrCA</property>
<property name="hibernate.show_sql">false</property>
<property name="hibernate.query.factory_class">org.hibernate.hql.classic.ClassicQueryTranslatorFactory</property>
<property name="hibernate.format_sql">false</property>
<!-- Use the C3P0 connection pool provider -->
<property name="hibernate.c3p0.acquire_increment">50</property>
<property name="hibernate.c3p0.min_size">10</property>
<property name="hibernate.c3p0.max_size">100</property>
<property name="hibernate.c3p0.timeout">300</property>
<property name="hibernate.c3p0.max_statements">5</property>
<property name="hibernate.c3p0.idle_test_period">3000</property>
<mapping resource="kelimegame/entity/Progress.hbm.xml"/>
<mapping resource="kelimegame/entity/Solstr.hbm.xml"/>
<mapping resource="kelimegame/entity/Kelimetr.hbm.xml"/>
<mapping resource="kelimegame/entity/User.hbm.xml"/>
<mapping resource="kelimegame/entity/Achievement.hbm.xml"/>
<mapping resource="kelimegame/entity/Matrixtr.hbm.xml"/>
</session-factory>
</hibernate-configuration>
After finding all possible solutions I persist the matrix and the solutions using Hibernate. I am also using the c3p0 library. I am not spawning any threads; all the work is done in a very simple iterative way. But I am running the jar in separate processes.
From different terminals I am executing this :
java -jar NewDB.jar
I got a deadlock as follows :
Apr 25, 2013 8:38:05 PM com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector run
WARNING: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7f0c09f9 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
Apr 25, 2013 9:08:23 PM com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector run
WARNING: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7f0c09f9 -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@2933f261
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#1
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@116dd369
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#0
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@41529b6f
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#2
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@165ab5ea
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@1d5d211d
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@4d2905fa
Pool thread stack traces:
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#1,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:662)
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#0,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:662)
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#2,5,main]
com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:662)
Apr 25, 2013 9:41:29 PM com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector run
WARNING: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7f0c09f9 -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
Apr 25, 2013 9:55:18 PM com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector run
WARNING: com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@7f0c09f9 -- APPARENT DEADLOCK!!! Complete Status:
Managed Threads: 3
Active Threads: 3
Active Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5a337b7d
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#0
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@69f079ce
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#1
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@2accf9b8
on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#2
Pending Tasks:
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@771eb4fb
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@fc07d6
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@2266731b
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@740f0341
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@59edbee
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@78e924
com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@2123aba
com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@7acd8a65
Pool thread stack traces:
Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx8uibeyqevbbapc|4045cf35]-HelperThread-#0,5,main]
java.text.NumberFormat.getInstance(NumberFormat.java:769)
java.text.NumberFormat.getInstance(NumberFormat.java:393)
java.text.MessageFormat.subformat(MessageFormat.java:1262)
java.text.MessageFormat.format(MessageFormat.java:860)
java.text.Format.format(Format.java:157)
java.text.MessageFormat.format(MessageFormat.java:836)
com.mysql.jdbc.Messages.getString(Messages.java:106)
com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2552)
com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3002)
com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:2991)
com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3532)
com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:943)
com.mysql.jdbc.MysqlIO.secureAuth411(MysqlIO.java:4113)
com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1308)
com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2336)
com.mysql.jdbc.ConnectionImpl.connectWithRetries(ConnectionImpl.java:2176)
com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2158)
com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:792)
com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
sun.reflect.GeneratedConstructorAccessor7.newInstance(Unknown Source)
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
java.lang.reflect.Constructor.newInstance(Constructor.java:525)
com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:381)
com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:134)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:183)
com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:172)
com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:188)
Killed
caglar@ubuntu:~/NetBeansProjects/NewDB/dist$
My questions are as follows:
Can this deadlock in c3p0 happen because I am running the program in separate processes?
Should I use one process and multiple threads inside that process?
How can I trace this deadlock and understand its cause? Is there a way to trace multiple JVMs causing deadlocks?
this is an interesting one.
you've published two distinct APPARENT DEADLOCKS. the first one is being caused by c3p0 attempting to close() Connections, and those close() operations are neither succeeding nor failing with an Exception in a timely manner. the second APPARENT DEADLOCK shows problems with Connection acquisition: c3p0 is attempting to acquire new Connections, and those attempts are neither succeeding nor failing with an Exception in a timely manner. the fact that very different operations are freezing suggests that it might be a more general problem with your dbms locking up under the stress of what you are doing or somesuch. it should be no problem to run multiple processes against your database, but you need to stay cognizant of limits.
there are a few interesting things about your configuration:
1) hibernate.c3p0.max_statements=5 is a very bad idea, on almost any pool and particularly on pools this large. you've got up to 100 Connections, and you're only allowing a total of 5 Statements to be cached between all of them. this might stress both the pool and the DBMS, as you will constantly be churning through PreparedStatements and the statement cache does a lot of bookkeeping about that. you may have meant that to be 5 cached statements per connection, but that's not what you have configured. you have set a global maximum for your pool. maybe try hibernate.c3p0.maxStatementsPerConnection=5 instead? or set max_statements to zero to turn statement caching off, at least until you resolve your deadlock. see http://www.mchange.com/projects/c3p0/#configuring_statement_pooling
2) if you are running your computation in multiple processes rather than multiple Threads, do you really need each process to hold 50 - 100 Connections? things may well be freezing up simply because you are stressing the dbms with too many Connections outstanding as each of your multiple processes acquire lots of resource-heavy Connections. you don't need more Connections in any process than you might have client Threads running concurrently within that process. i'd set hibernate.c3p0.acquire_increment and probably hibernate.c3p0.max_size to much smaller values.
3) if you really do need all those Connections running simultaneously, you can reduce the vulnerability of your pools to deadlock by increasing the config parameter numHelperThreads to some value greater than its default of 3. you probably want numHelperThreads to be something like twice the number of cores available on your machine. given that you are running multiple processes though, you might find that you are saturating your CPU, and that is freezing things up. so watch for that.
basically, try updating your configuration so that you are using resources -- file handles, network connections, CPU -- as efficiently as possible and so that you are not unnecessarily stressing the pool / statement cache / dbms more than you need to be.
if these suggestions don't resolve the problem, please post the full config of your pools. c3p0 dumps its config at INFO level on pool initialization.
good luck!
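To make point 1 above concrete, here is a toy simulation (plain Java, not c3p0 internals; the workload numbers are illustrative) contrasting a global statement-cache cap of 5 shared by 100 connections with a per-connection cap of 5:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy illustration of why a *global* cap of 5 cached statements across many
// connections churns constantly, while a *per-connection* cap of 5 serves the
// same workload almost entirely from cache.
class StatementCacheChurn {
    // simple LRU cache that counts hits and misses
    static class Lru extends LinkedHashMap<String, Boolean> {
        final int cap;
        int hits = 0, misses = 0;
        Lru(int cap) { super(16, 0.75f, true); this.cap = cap; }
        @Override protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
            return size() > cap;
        }
        void prepare(String sql) {
            if (containsKey(sql)) { hits++; get(sql); } else { misses++; put(sql, true); }
        }
    }

    public static void main(String[] args) {
        int connections = 100, statementsPerConn = 5, rounds = 10;

        Lru global = new Lru(5);                 // like hibernate.c3p0.max_statements=5
        Lru[] perConn = new Lru[connections];    // like maxStatementsPerConnection=5
        for (int c = 0; c < connections; c++) perConn[c] = new Lru(statementsPerConn);

        for (int r = 0; r < rounds; r++) {
            for (int c = 0; c < connections; c++) {
                for (int s = 0; s < statementsPerConn; s++) {
                    String sql = "conn" + c + "-stmt" + s; // each conn reuses its own 5 stmts
                    global.prepare(sql);
                    perConn[c].prepare(sql);
                }
            }
        }
        int perConnHits = 0, perConnMisses = 0;
        for (Lru l : perConn) { perConnHits += l.hits; perConnMisses += l.misses; }
        System.out.println("global: hits=" + global.hits + " misses=" + global.misses);
        System.out.println("per-connection: hits=" + perConnHits + " misses=" + perConnMisses);
    }
}
```

With the global cap, every prepare is a miss, so statements are constantly closed and re-prepared; with the per-connection cap, only the first round misses.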

Tomcat memory management

I'm running Tomcat7, the server is quite powerful, 8 GB RAM 8-core.
My problem is that the RES memory is getting higher and higher, until the server just doesn't respond anymore, not even calling OnOutOfMemoryError.
Tomcat configuration :
-Xms1024M
-Xmx2048M
-XX:PermSize=256m
-XX:MaxPermSize=512m
-XX:+UseConcMarkSweepGC
-XX:OnOutOfMemoryError='/var/tomcat/conf/restart_tomcat.sh'
Memory informations :
Memory: Non heap memory = 106 Mb (Perm Gen, Code Cache),
Loaded classes = 14,055,
Garbage collection time = 47,608 ms,
Process cpu time = 4,296,860 ms,
Committed virtual memory = 6,910 Mb,
Free physical memory = 4,906 Mb,
Total physical memory = 8,192 Mb,
Free swap space = 26,079 Mb,
Total swap space = 26,079 Mb
Perm Gen memory: 88 Mb / 512 Mb
Free disk space: 89,341 Mb
The memory used by Tomcat doesn't look that high compared to the top command.
I also had java.net.SocketException: No buffer space available when trying to connect to SMTP server or when trying to connect to facebook servers.
I use Hibernate, with c3p0 connection pool with this configuration :
<property name="hibernate.connection.driver_class">com.mysql.jdbc.Driver</property>
<property name="hibernate.connection.url">jdbc:mysql://urldb/schema?autoReconnect=true</property>
<property name="hibernate.connection.username">username</property>
<property name="hibernate.dialect">org.hibernate.dialect.MySQL5InnoDBDialect</property>
<property name="hibernate.connection.password"></property>
<property name="connection.characterEncoding">UTF-8</property>
<property name="hibernate.c3p0.acquire_increment">1</property>
<property name="hibernate.c3p0.idle_test_period">300</property>
<property name="hibernate.c3p0.timeout">5000</property>
<property name="hibernate.c3p0.max_size">50</property>
<property name="hibernate.c3p0.min_size">1</property>
<property name="hibernate.c3p0.max_statement">0</property>
<property name="hibernate.c3p0.preferredTestQuery">select 1;</property>
<property name="hibernate.connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
I couldn't find anything... does someone have a hint of where I should be looking?
Thanks!
[UPDATE 1] HEAP DUMP:
HEAP HISTOGRAM (class, instance count, total size in bytes):
class [C 269780 34210054
class [B 5600 33836661
class java.util.HashMap$Entry 221872 6212416
class [Ljava.util.HashMap$Entry; 23797 6032056
class java.lang.String 271170 5423400
class org.hibernate.hql.ast.tree.Node 103588 4972224
class net.bull.javamelody.CounterRequest 28809 2996136
class org.hibernate.hql.ast.tree.IdentNode 23461 2205334
class java.lang.Class 14677 2113488
class org.hibernate.hql.ast.tree.DotNode 13045 1852390
class [Ljava.lang.String; 48506 1335600
class [Ljava.lang.Object; 12997 1317016
Instance Counts for All Classes (excluding platform) :
103588 instances of class org.hibernate.hql.ast.tree.Node
33366 instances of class antlr.ANTLRHashString
28809 instances of class net.bull.javamelody.CounterRequest
24436 instances of class org.apache.tomcat.util.buf.ByteChunk
23461 instances of class org.hibernate.hql.ast.tree.IdentNode
22781 instances of class org.apache.tomcat.util.buf.CharChunk
22331 instances of class org.apache.tomcat.util.buf.MessageBytes
13045 instances of class org.hibernate.hql.ast.tree.DotNode
10024 instances of class net.bull.javamelody.JRobin
9084 instances of class org.apache.catalina.loader.ResourceEntry
7931 instances of class org.hibernate.hql.ast.tree.SqlNode
[UPDATE 2] server.xml :
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443"
URIEncoding="UTF-8"
maxThreads="150"
minSpareThreads="25"
maxSpareThreads="75"
enableLookups="false"
acceptCount="1024"
server="unknown"
address="public_ip"
/>
[UPDATE 3] Output from log files:
2012-06-04 06:18:24,152 [http-bio-ip-8080-exec-3500] ERROR org.apache.catalina.core.ContainerBase.[Catalina].[localhost].[/api].[Jersey REST Service]- Servlet.service() for servlet [Jersey REST Service] in context with path [/socialapi] threw exception
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:532)
at org.apache.coyote.http11.InternalInputBuffer.fill(InternalInputBuffer.java:501)
at org.apache.coyote.http11.InternalInputBuffer$InputStreamInputBuffer.doRead(InternalInputBuffer.java:563)
at org.apache.coyote.http11.filters.IdentityInputFilter.doRead(IdentityInputFilter.java:118)
at org.apache.coyote.http11.AbstractInputBuffer.doRead(AbstractInputBuffer.java:326)
at org.apache.coyote.Request.doRead(Request.java:422)
[UPDATE 4] ServletContext
I use a ServletContextListener in my application to instantiate controllers and keep a reference to them with event.getServletContext().setAttribute. Those controllers load configurations and translations (the 88 Mb in Perm).
Then, to use the database, I do:
SessionFactory sf = dbManager.getSessionFactory(DatabaseManager.DB_KEY_DEFAULT);
Session session = sf.openSession();
Transaction tx = null;
try {
    tx = session.beginTransaction();
    //Do stuff
    tx.commit();
} catch (Exception e){
    //Do something
} finally {
    session.close();
}
Could this be the source of a leak?
Why shouldn't I use manual transaction/session management, and how would you do it instead?
Try with these parameters:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dump.log
Also try with a lower initial memory setting (-Xms).
Then you can inspect the dump to see if the problem was object allocation.
While running try
jps
That will output all Java processes; let's say Tomcat is PID 4444:
jmap -dump:format=b,file=heapdump 4444
And
jhat heapdump
If you run out of memory while executing jhat just add more memory. From there you can inspect the heap of your application.
Another way to go is to enable Hibernate statistics to check that you are not retrieving more objects than you should. And although there is room to do better there, a full garbage collection every hour should not be a problem in itself.
-verbose:gc -Xloggc:/opt/tomcat/logs/gc.out -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
And with GCViewer, for example, take a look at every space of memory (tenured, eden, survivors, perm).
Another handy tool:
jstack 4444 > stack.txt
That will retrieve a full stack trace of every thread running inside the java process with pid 4444.
Bear in mind that you need privileges if you started Tomcat as root or another user: jps won't output processes for which you have no privileges, and therefore you cannot connect to them.
Since I don't know what your application is about (and therefore I don't know its requirements), 3 million instances looks like a lot.
With Hibernate statistics you can see which classes you instantiate the most.
Then tuning the proportions of your eden and tenured spaces can make garbage collection more efficient.
Newly instantiated objects go to eden. When it fills up, a minor GC triggers. What is not deleted goes to a survivor space; when that fills up, objects are promoted to tenured. A full GC arises when tenured is full.
In this picture (which is simplified) I leave aside Strings that become interned and memory-mapped files (which are not on the heap). Take a look at which classes you instantiate most; intensive use of String might quickly fill up perm.
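Those spaces and their collectors can also be inspected from inside the JVM with the standard management API; a small stdlib sketch (pool names vary by JVM and collector):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

// List the memory pools (eden, survivor, tenured/old, perm/metaspace) and the
// garbage collectors the running JVM exposes, with their current usage.
class MemoryPools {
    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            long used = pool.getUsage().getUsed();
            System.out.println(pool.getName() + ": used=" + (used / 1024) + " KB");
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + " time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Exposing this from a servlet (or just watching the same beans through JConsole/VisualVM) shows which generation grows between collections.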
I guess you already do, but use a managed session factory, such as Spring's (if it is in your stack), and avoid manual management of transactions and sessions.
Keep in mind that objects are deleted by the GC only when no other object refers to them. So as long as an object is reachable in your application, the object remains.
If your ServletContextListener instantiates controllers and stores them in the ServletContext, make sure you completely remove the references afterwards; if you keep a reference, the objects won't be deleted, since they are still reachable.
If you manage your own transactions and sessions (which is fine if you cannot use a framework), then you must deal with code maintenance and the bugs that Spring-tx, for instance, has already solved.
I personally would take advantage of FOSS. But of course sometimes you cannot enlarge the stack.
If you are using Hibernate, I would take a look at Spring-orm and Spring-tx to manage transactions and sessions. Also take a look at the Hibernate pattern Open Session In View.
I'd also recommend that you download Visual VM 1.3.3, install all the plugins, and attach it to the Tomcat PID so you can see what's happening in real time. Why wait for a thread dump? It'll also tell you CPU, threads, all heap generations, which objects consume the most memory, etc.
