I recently switched my app to Spring Boot 2. I rely on Spring Data JPA to handle all transactions, and I noticed a huge speed difference compared to my old configuration: storing around 1000 elements used to take about 6 seconds, and now it takes over 25 seconds. I have seen SO posts about batching with Spring Data JPA, but none of them worked.
Let me show you the 2 configurations:
The entity (common to both):
@Entity
@Table(name = "category")
public class CategoryDB implements Serializable
{
    private static final long serialVersionUID = -7047292240228252349L;

    @Id
    @Column(name = "category_id", length = 24)
    private String category_id;

    @Column(name = "category_name", length = 50)
    private String name;

    @Column(name = "category_plural_name", length = 50)
    private String pluralName;

    @Column(name = "url_icon", length = 200)
    private String url;

    @Column(name = "parent_category", length = 24)
    @JoinColumn(name = "parent_category", referencedColumnName = "category_id")
    private String parentID;

    //Getters & Setters
}
Old Repository (showing an insert only):
@Override
public Set<String> insert(Set<CategoryDB> element)
{
    Set<String> ids = new HashSet<>();
    Transaction tx = session.beginTransaction();
    for (CategoryDB category : element)
    {
        String id = (String) session.save(category);
        ids.add(id);
    }
    tx.commit();
    return ids;
}
Old Hibernate XML Config File:
<property name="show_sql">true</property>
<property name="format_sql">true</property>
<!-- connection information -->
<property name="hibernate.connection.driver_class">com.mysql.cj.jdbc.Driver</property>
<property name="hibernate.dialect">org.hibernate.dialect.MySQLDialect</property>
<!-- database pooling information -->
<property name="connection_provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
<property name="hibernate.c3p0.min_size">5</property>
<property name="hibernate.c3p0.max_size">100</property>
<property name="hibernate.c3p0.timeout">300</property>
<property name="hibernate.c3p0.max_statements">50</property>
<property name="hibernate.c3p0.idle_test_period">3000</property>
Old Statistics:
18949156 nanoseconds spent acquiring 2 JDBC connections;
5025322 nanoseconds spent releasing 2 JDBC connections;
33116643 nanoseconds spent preparing 942 JDBC statements;
3185229893 nanoseconds spent executing 942 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
3374152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
New Repository:
@Repository
public interface CategoryRepository extends JpaRepository<CategoryDB, String>
{
    @Query("SELECT cat.parentID FROM CategoryDB cat WHERE cat.category_id = :#{#category.category_id}")
    String getParentID(@Param("category") CategoryDB category);
}
And I'm calling saveAll() in my service; a minimal sketch of how that looks is below.
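For context only, a hedged sketch of such a service; the class name, method name, and transaction boundary are assumptions rather than code from the question:

import java.util.List;
import java.util.Set;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Hypothetical wrapper; only the saveAll() call itself is from the question.
@Service
public class CategoryService
{
    private final CategoryRepository categoryRepository;

    public CategoryService(CategoryRepository categoryRepository)
    {
        this.categoryRepository = categoryRepository;
    }

    @Transactional
    public List<CategoryDB> storeAll(Set<CategoryDB> categories)
    {
        // Spring Data JPA persists the whole collection in one transaction;
        // Hibernate decides how (and whether) to batch the resulting inserts.
        return categoryRepository.saveAll(categories);
    }
}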
New application.properties:
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
spring.datasource.hikari.connection-timeout=6000
spring.datasource.hikari.maximum-pool-size=10
spring.jpa.properties.hibernate.show_sql=true
spring.jpa.properties.hibernate.format_sql=true
spring.jpa.properties.hibernate.generate_statistics=true
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.MySQLDialect
spring.jpa.properties.hibernate.jdbc.batch_size=50
spring.jpa.properties.hibernate.order_inserts=true
New Statistics:
24543605 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
136919170 nanoseconds spent preparing 942 JDBC statements;
5457451561 nanoseconds spent executing 941 JDBC statements;
19985781508 nanoseconds spent executing 19 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
20256178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
Probably I'm misconfiguring something on the Spring side. This is a huge performance difference and I'm at a dead end. Any hints on what is going wrong here are much appreciated.
Let's merge the statistics so they can be compared easily.
Old rows are prefixed with o, new ones with n.
Rows with a count of 0 are ignored.
Nanosecond measurements are formatted so that the milliseconds part appears before a space.
o: 18 949156 nanoseconds spent acquiring 2 JDBC connections;
n: 24 543605 nanoseconds spent acquiring 1 JDBC connections;
o: 33 116643 nanoseconds spent preparing 942 JDBC statements;
n: 136 919170 nanoseconds spent preparing 942 JDBC statements;
o: 3185 229893 nanoseconds spent executing 942 JDBC statements;
n: 5457 451561 nanoseconds spent executing 941 JDBC statements; // losing ~2 sec
o: 0 nanoseconds spent executing 0 JDBC batches;
n: 19985 781508 nanoseconds spent executing 19 JDBC batches; // losing ~20 sec
o: 3374 152568 nanoseconds spent executing 1 flushes (flushing a total of 941 entities and 0 collections);
n: 20256 178886 nanoseconds spent executing 3 flushes (flushing a total of 2823 entities and 0 collections); // losing ~20 sec, processing 3 times the entities
o: 6485 nanoseconds spent executing 1 partial-flushes (flushing a total of 0 entities and 0 collections)
n: 0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
The following seem to be the relevant points:
The new version executes 19 batches taking 20 sec, which don't exist in the old version at all.
The new version has 3 flushes instead of 1, which together take 20 sec more, or about 6 times as long. This is probably more or less the same extra time as the batches, since they are almost certainly part of these flushes.
Although batches are supposed to make things faster, there are reports of them making things slower, especially with MySQL: Why Spring's jdbcTemplate.batchUpdate() so slow?
This brings us to a couple of things you can try/investigate:
Disable batching, in order to test whether you are actually suffering from some kind of slow-batch problem.
Use the linked SO post in order to speed up batching.
Log the SQL statements that actually get executed in order to find the difference. Since this will result in rather lengthy logs to manipulate, try extracting only the SQL statements into two files and comparing them with a diff tool.
Log flushes in order to get an idea of why the extra flushes are triggered.
Use breakpoints and a debugger, or extra logging, to find out which entities are getting flushed and why you have three times as many entities in the second variant.
All the proposals above operate at the JPA level.
But your statistics and question content suggest that you are doing simple inserts into a single table or a few tables.
Doing this with plain JDBC, e.g. with a JdbcTemplate, might be more efficient and at least easier to understand.
You can also use JdbcTemplate directly; it is much faster than Spring Data JPA. A sketch follows below.
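As a hedged illustration of that suggestion, a minimal batch insert via JdbcTemplate could look like the following. The table and column names come from the question's mapping; the wrapper class, method shape, and getter names are assumptions:

import java.util.List;
import org.springframework.jdbc.core.JdbcTemplate;

public class CategoryJdbcWriter
{
    private final JdbcTemplate jdbcTemplate;

    public CategoryJdbcWriter(JdbcTemplate jdbcTemplate)
    {
        this.jdbcTemplate = jdbcTemplate;
    }

    // Sketch only: plain JDBC batching; getter names are assumed from the
    // CategoryDB fields shown in the question.
    public void insertAll(List<CategoryDB> categories)
    {
        jdbcTemplate.batchUpdate(
            "INSERT INTO category (category_id, category_name, category_plural_name, url_icon, parent_category) "
                + "VALUES (?, ?, ?, ?, ?)",
            categories,
            50, // JDBC batch size, mirroring hibernate.jdbc.batch_size
            (ps, category) -> {
                ps.setString(1, category.getCategoryId());
                ps.setString(2, category.getName());
                ps.setString(3, category.getPluralName());
                ps.setString(4, category.getUrl());
                ps.setString(5, category.getParentID());
            });
    }
}

Note that with MySQL, rewriteBatchedStatements=true on the JDBC URL is usually needed for the driver to turn such batches into multi-row inserts.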
I'm trying to add batch updates to my Spring Boot project, but multiple individual queries are still executed when I check the SQL logs and Hibernate stats.
Hibernate stats
290850400 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
3347700 nanoseconds spent preparing 19 JDBC statements;
5919028800 nanoseconds spent executing 19 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
2635900 nanoseconds spent executing 1 flushes (flushing a total of 1 entities and 0 collections);
19447300 nanoseconds spent executing 19 partial-flushes (flushing a total of 18 entities and 18 collections)
Versions
Spring Boot v2.7.1
Spring v5.3.21
Java 17.0.3.1
Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
application.yml
spring:
  datasource:
    url: jdbc:oracle:thin:@localhost:1521/db
    username: user
    password: password
    driver-class-name: oracle.jdbc.OracleDriver
  jpa:
    database-platform: org.hibernate.dialect.Oracle12cDialect
    hibernate:
      naming:
        physical-strategy: org.hibernate.boot.model.naming.PhysicalNamingStrategyStandardImpl
      ddl-auto: update
    show-sql: true
    properties:
      hibernate:
        dialect: org.hibernate.dialect.Oracle12cDialect
        format_sql: false
        jdbc:
          fetch_size: 100
          batch_size: 5
        order_updates: true
        batch_versioned_data: true
        generate_statistics: true
Snapshot entity
@Entity
@Table(name = "SNAPSHOT", schema = "SYSTEM", catalog = "")
public class Snapshot {

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "ID")
    private long id;

    @Basic
    @Column(name = "CREATED_ON")
    private String createdOn;
    ...
}
SnapshotRepository
@Repository
public interface SnapshotRepository extends JpaRepository<Snapshot, Long> {
    @Modifying
    @Query("UPDATE Snapshot s SET s.fieldValue = ?1, s.createdOn = ?2 WHERE s.id = ?3 AND s.fieldName = ?4")
    int updateSnapshot(String fieldValue, String createdOn, String id, String fieldName);
}
And this repository method is called from the service class.
for (Map.Entry<String, String> entry : res.getValues().entrySet()) {
    snapshotRepository.updateSnapshot(entry.getValue(), createdOn, id, entry.getKey());
}
pom.xml
<dependency>
<groupId>com.oracle.database.jdbc</groupId>
<artifactId>ojdbc11</artifactId>
<version>21.7.0.0</version>
</dependency>
In the application.yml, I think I'm configuring all the properties required to activate batch updates, but still no luck.
Please let me know what I'm doing incorrectly.
Batching only works when Hibernate does the flushing of the entities. If you are executing manual queries, Hibernate can't batch them.
The way you are implementing this, Hibernate will reuse the same prepared statement on the database side, but there is no JDBC batching. A sketch of the entity-based alternative follows below.
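For illustration, a hedged sketch of letting Hibernate do the flushing, so the hibernate.jdbc.batch_size and order_updates settings from the configuration above can take effect. findAllById() is standard JpaRepository API; the service class, setters, and the shape of the input map (simplified to one value per snapshot id) are assumptions:

import java.util.List;
import java.util.Map;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

// Assumed service; only the repository and the entity come from the question.
@Service
public class SnapshotBatchService {

    private final SnapshotRepository snapshotRepository;

    public SnapshotBatchService(SnapshotRepository snapshotRepository) {
        this.snapshotRepository = snapshotRepository;
    }

    @Transactional
    public void updateSnapshots(Map<Long, String> newValues, String createdOn) {
        List<Snapshot> snapshots = snapshotRepository.findAllById(newValues.keySet());
        for (Snapshot snapshot : snapshots) {
            snapshot.setFieldValue(newValues.get(snapshot.getId())); // setter assumed
            snapshot.setCreatedOn(createdOn);
        }
        // No explicit save needed: managed entities are dirty-checked and
        // written out as batched, ordered UPDATE statements at flush/commit time.
    }
}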
I would like to improve the performance of my PostgreSQL inserts with JPA batch inserts.
I'm using:
spring-boot-starter-data-jpa 2.1.3.RELEASE
postgresql 42.2.5 (jdbc driver).
Database is PostgreSQL 9.6.2
I have managed to activate JPA's batch inserts, but performance has not improved at all.
I use @GeneratedValue(strategy = GenerationType.SEQUENCE) in my entities
I use reWriteBatchedInserts=true in my JDBC connection string
I set the following properties:
spring.jpa.properties.hibernate.jdbc.batch_size=100
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.generate_statistics=true
I use the saveAll(collection) method
I tried flushing and clearing my entityManager after each batch.
I tried with a batch size of 100 and then 1000, flushing after each batch, with no noticeable change.
I can see in the logs that Hibernate does use batch inserts, but I am unsure whether my database does (I'm trying to fetch the logs; folder permission is pending).
@Service
@Configuration
@Transactional
public class SecteurGeographiqueServiceImpl implements SecteurGeographiqueService {

    private static final Logger logger = LoggerFactory.getLogger(SecteurGeographiqueServiceImpl.class);

    @Value("${spring.jpa.properties.hibernate.jdbc.batch_size}")
    private int batchSize;

    @PersistenceContext
    private EntityManager entityManager;

    @Autowired
    private SecteurGeographiqueRepository secteurGeographiqueRepository;

    @Override
    public List<SecteurGeographique> saveAllSecteurGeographiquesISOs(List<SecteurGeographique> listSecteurGeographiques) {
        logger.warn("BATCH SIZE : " + this.batchSize);
        final List<SecteurGeographique> tempList = new ArrayList<>();
        final List<SecteurGeographique> savedList = new ArrayList<>();
        for (int i = 0; i < listSecteurGeographiques.size(); i++) {
            if ((i % this.batchSize) == 0) {
                savedList.addAll(this.secteurGeographiqueRepository.saveAll(tempList));
                tempList.clear();
                this.entityManager.flush();
                this.entityManager.clear();
            }
            tempList.add(listSecteurGeographiques.get(i));
        }
        savedList.addAll(this.secteurGeographiqueRepository.saveAll(tempList));
        tempList.clear();
        this.entityManager.flush();
        this.entityManager.clear();
        return savedList;
    }
}
...
@Entity
public class SecteurGeographique {

    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE)
    @Column(name = "id")
    public Long id;
    ...
My repository implementation is:
org.springframework.data.jpa.repository.JpaRepository<SecteurGeographique, Long>
application.properties (connection part):
spring.datasource.url=jdbc:postgresql://xx.xx.xx.xx:5432/bddname?reWriteBatchedInserts=true
spring.jpa.properties.hibernate.default_schema=schema
spring.datasource.username=xxxx
spring.datasource.password=xxxx
spring.datasource.driverClassName=org.postgresql.Driver
spring.jpa.properties.hibernate.dialect=org.hibernate.dialect.PostgreSQLDialect
spring.jpa.properties.hibernate.jdbc.lob.non_contextual_creation=true
spring.jpa.properties.hibernate.jdbc.batch_size=100
spring.jpa.properties.hibernate.order_inserts=true
spring.jpa.properties.hibernate.generate_statistics=true
And in the logs, after my 16073 entities are inserted (this test does not include flushing):
13:31:40.882 [restartedMain] INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
15721506 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
121091067 nanoseconds spent preparing 16074 JDBC statements;
240144821872 nanoseconds spent executing 16073 JDBC statements;
3778202166 nanoseconds spent executing 161 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
4012929596 nanoseconds spent executing 1 flushes (flushing a total of 16073 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
Note that this is just one table, with no constraints or foreign keys. Just flat basic data in a table, nothing fancy.
From the logs it does look like there is a problem:
240144821872 nanoseconds spent executing 16073 JDBC statements;
3778202166 nanoseconds spent executing 161 JDBC batches;
Shouldn't it be "executing 161 JDBC statements" if everything is in the batches?
Tests with flushes, and batch sizes 100 then 1000:
15:32:17.612 [restartedMain] WARN f.g.j.a.r.s.i.SecteurGeographiqueServiceImpl - BATCH SIZE : 100
15:36:46.206 [restartedMain] INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
15416324 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
105369002 nanoseconds spent preparing 16234 JDBC statements;
262388696401 nanoseconds spent executing 16073 JDBC statements;
3669253410 nanoseconds spent executing 161 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
3956493726 nanoseconds spent executing 161 flushes (flushing a total of 16073 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
15:43:54.155 [restartedMain] WARN f.g.j.a.r.s.i.SecteurGeographiqueServiceImpl - BATCH SIZE : 1000
15:48:22.335 [restartedMain] INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
15676227 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
111370586 nanoseconds spent preparing 16090 JDBC statements;
265089247563 nanoseconds spent executing 16073 JDBC statements;
599946208 nanoseconds spent executing 17 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
866452023 nanoseconds spent executing 17 flushes (flushing a total of 16073 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
Each time I get a 4 min 30 sec execution time. It feels enormous for batch inserts.
What am I missing or misinterpreting?
After trying a batch size of 1000 with a PostgreSQL server on localhost (https://gareth.flowers/postgresql-portable/, v10.1.1), the execution runs in under 3 seconds. So it seems the code or configuration is not to blame here.
Unfortunately I cannot investigate why it was taking so much time on the remote PostgreSQL (hosted on AWS); I can only conclude this was a network or database issue.
As of today I cannot access the remote PostgreSQL logs, but if you have any advice on what to look for on the PostgreSQL instance, I'm all ears.
Logs with batching (1000) and flush+clear:
16:20:52.360 [restartedMain] WARN f.g.j.a.r.s.i.SecteurGeographiqueServiceImpl - BATCH SIZE : 1000
16:20:54.844 [restartedMain] INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
523125 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
44649191 nanoseconds spent preparing 16090 JDBC statements;
1311557995 nanoseconds spent executing 16073 JDBC statements;
204225325 nanoseconds spent executing 17 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
381230968 nanoseconds spent executing 17 flushes (flushing a total of 16073 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
Logs WITHOUT batching, flush or clean :
16:57:34.426 [restartedMain] INFO o.h.e.i.StatisticalLoggingSessionEventListener - Session Metrics {
725069 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
55763008 nanoseconds spent preparing 32146 JDBC statements;
2816525053 nanoseconds spent executing 32146 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
1796451447 nanoseconds spent executing 1 flushes (flushing a total of 16073 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
This comparison shows a 46% gain in the overall JDBC statement execution time.
I have a Spring Boot application with embedded Jetty, and its configuration is:
Jetty's minThread: 50
Jetty's maxThread: 500
Jetty's maxQueueSize: 25000 (I changed the default queue to a LinkedBlockingQueue)
I didn't change acceptors and selectors (since I don't believe in hard-coding the values)
With the above configuration, I am getting the JMeter test results below:
Concurrent Users: 60
summary = 183571 in 00:01:54 = 1611.9/s Avg: 36 Min: 3 Max: 1062 Err: 0 (0.00%)
Concurrent Users: 75
summary = 496619 in 00:05:00 = 1654.6/s Avg: 45 Min: 3 Max: 1169 Err: 0 (0.00%)
If I increase the number of concurrent users, I don't see any improvement. I want to increase concurrency. How can I achieve this?
===========================================================================
Update on 29-March-2019
I spent more effort on improving the business logic, but there still wasn't much improvement. Then I decided to develop a hello-world Spring Boot project, i.e.:
spring-boot (1.5.9)
jetty 9.4.15
a REST controller with one GET endpoint
code below:
@GetMapping
public String index() {
    return "Greetings from Spring Boot!";
}
Then I tried to benchmark it using ApacheBench:
75 concurrent users:
ab -t 120 -n 1000000 -c 75 http://10.93.243.87:9000/home/
Server Software:
Server Hostname: 10.93.243.87
Server Port: 9000
Document Path: /home/
Document Length: 27 bytes
Concurrency Level: 75
Time taken for tests: 37.184 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 143000000 bytes
HTML transferred: 27000000 bytes
Requests per second: 26893.28 [#/sec] (mean)
Time per request: 2.789 [ms] (mean)
Time per request: 0.037 [ms] (mean, across all concurrent requests)
Transfer rate: 3755.61 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 1 23.5 0 3006
Processing: 0 2 7.8 1 404
Waiting: 0 2 7.8 1 404
Total: 0 3 24.9 2 3007
100 concurrent users:
ab -t 120 -n 1000000 -c 100 http://10.93.243.87:9000/home/
Server Software:
Server Hostname: 10.93.243.87
Server Port: 9000
Document Path: /home/
Document Length: 27 bytes
Concurrency Level: 100
Time taken for tests: 36.708 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 143000000 bytes
HTML transferred: 27000000 bytes
Requests per second: 27241.77 [#/sec] (mean)
Time per request: 3.671 [ms] (mean)
Time per request: 0.037 [ms] (mean, across all concurrent requests)
Transfer rate: 3804.27 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 2 35.7 1 3007
Processing: 0 2 9.4 1 405
Waiting: 0 2 9.4 1 405
Total: 0 4 37.0 2 3009
500 concurrent users:
ab -t 120 -n 1000000 -c 500 http://10.93.243.87:9000/home/
Server Software:
Server Hostname: 10.93.243.87
Server Port: 9000
Document Path: /home/
Document Length: 27 bytes
Concurrency Level: 500
Time taken for tests: 36.222 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 143000000 bytes
HTML transferred: 27000000 bytes
Requests per second: 27607.83 [#/sec] (mean)
Time per request: 18.111 [ms] (mean)
Time per request: 0.036 [ms] (mean, across all concurrent requests)
Transfer rate: 3855.39 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 14 126.2 1 7015
Processing: 0 4 22.3 1 811
Waiting: 0 3 22.3 1 810
Total: 0 18 129.2 2 7018
1000 concurrent users:
ab -t 120 -n 1000000 -c 1000 http://10.93.243.87:9000/home/
Server Software:
Server Hostname: 10.93.243.87
Server Port: 9000
Document Path: /home/
Document Length: 27 bytes
Concurrency Level: 1000
Time taken for tests: 36.534 seconds
Complete requests: 1000000
Failed requests: 0
Write errors: 0
Total transferred: 143000000 bytes
HTML transferred: 27000000 bytes
Requests per second: 27372.09 [#/sec] (mean)
Time per request: 36.534 [ms] (mean)
Time per request: 0.037 [ms] (mean, across all concurrent requests)
Transfer rate: 3822.47 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 30 190.8 1 7015
Processing: 0 6 31.4 2 1613
Waiting: 0 5 31.4 1 1613
Total: 0 36 195.5 2 7018
From the above test runs, I achieved ~27K requests per second with just 75 users, but it looks like increasing the number of users also increases the latency. We can also clearly note that the connect time is increasing.
My application is required to support 40k concurrent users (assume all are using their own separate browsers), and requests should finish within 250 milliseconds.
Please help me with this.
You can try increasing or decreasing the number of Jetty threads, but the application performance will depend on the application logic. If your current bottleneck is the database query, you will see hardly any improvement by tuning the HTTP layer, especially when testing over a local network.
Find the bottleneck in your application, attempt to improve it, and then measure again to confirm it's better. Repeat these three steps until you achieve the desired performance. Do not tune performance blindly; it's a waste of time.
I started using Cassandra a few days ago, and here is what I am trying to do.
I have about 2 million+ objects which maintain user profiles. I convert these objects to JSON, compress them, and store them in a blob column. The average compressed JSON size is about 10 KB. This is how my table looks in Cassandra:
Table:
dev.userprofile (uid varchar primary key, profile blob);
Select Query:
select profile from dev.userprofile where uid='';
Update Query:
update dev.userprofile set profile='<bytebuffer>' where uid = '<uid>'
Every hour, I get events from a queue which I apply to my userprofile objects. Each event corresponds to one userprofile object. I get about 1 million such events, so I have to update around 1M userprofile objects within a short time, i.e. update the object in my application, compress the JSON, and update the Cassandra blob. I have to finish updating all 1 million user profile objects, preferably within a few minutes, but I notice it's taking longer now.
While running my application, I notice that I can update around 400 profiles/second on average. I already see a lot of CPU iowait (70%+) on the Cassandra instance. Also, the load is initially pretty high, around 16 (on an 8-vCPU instance), and then drops off to around 4.
What am I doing wrong? When I was updating smaller objects of 2 KB, I noticed Cassandra operations/sec were much faster: I was able to get about 3000 ops/sec. Any thoughts on how I should improve the performance?
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-extras</artifactId>
<version>3.1.0</version>
</dependency>
I just have one node of Cassandra set up on an m4.2xlarge AWS instance for testing:
Single node Cassandra instance
m4.2xlarge aws ec2
500 GB General Purpose (SSD)
IOPS - 1500 / 10000
nodetool cfstats output
Keyspace: dev
Read Count: 688795
Read Latency: 27.280683695439137 ms.
Write Count: 688780
Write Latency: 0.010008401811899301 ms.
Pending Flushes: 0
Table: userprofile
SSTable count: 9
Space used (live): 32.16 GB
Space used (total): 32.16 GB
Space used by snapshots (total): 0 bytes
Off heap memory used (total): 13.56 MB
SSTable Compression Ratio: 0.9984539538554672
Number of keys (estimate): 2215817
Memtable cell count: 38686
Memtable data size: 105.72 MB
Memtable off heap memory used: 0 bytes
Memtable switch count: 6
Local read count: 688807
Local read latency: 29.879 ms
Local write count: 688790
Local write latency: 0.012 ms
Pending flushes: 0
Bloom filter false positives: 47
Bloom filter false ratio: 0.00003
Bloom filter space used: 7.5 MB
Bloom filter off heap memory used: 7.5 MB
Index summary off heap memory used: 2.07 MB
Compression metadata off heap memory used: 3.99 MB
Compacted partition minimum bytes: 216 bytes
Compacted partition maximum bytes: 370.14 KB
Compacted partition mean bytes: 5.82 KB
Average live cells per slice (last five minutes): 1.0
Maximum live cells per slice (last five minutes): 1
Average tombstones per slice (last five minutes): 1.0
Maximum tombstones per slice (last five minutes): 1
nodetool cfhistograms output
Percentile SSTables Write Latency Read Latency Partition Size Cell Count
(micros) (micros) (bytes)
50% 3.00 9.89 2816.16 4768 2
75% 3.00 11.86 43388.63 8239 2
95% 4.00 14.24 129557.75 14237 2
98% 4.00 20.50 155469.30 17084 2
99% 4.00 29.52 186563.16 20501 2
Min 0.00 1.92 61.22 216 2
Max 5.00 74975.55 4139110.98 379022 2
Dstat output
---load-avg--- --io/total- ---procs--- ------memory-usage----- ---paging-- -dsk/total- ---system-- ----total-cpu-usage---- -net/total-
1m 5m 15m | read writ|run blk new| used buff cach free| in out | read writ| int csw |usr sys idl wai hiq siq| recv send
12.8 13.9 10.6|1460 31.1 |1.0 14 0.2|9.98G 892k 21.2G 234M| 0 0 | 119M 3291k| 63k 68k| 1 1 26 72 0 0|3366k 3338k
13.2 14.0 10.7|1458 28.4 |1.1 13 1.5|9.97G 884k 21.2G 226M| 0 0 | 119M 3278k| 61k 68k| 2 1 28 69 0 0|3396k 3349k
12.7 13.8 10.7|1477 27.6 |0.9 11 1.1|9.97G 884k 21.2G 237M| 0 0 | 119M 3321k| 69k 72k| 2 1 31 65 0 0|3653k 3605k
12.0 13.7 10.7|1474 27.4 |1.1 8.7 0.3|9.96G 888k 21.2G 236M| 0 0 | 119M 3287k| 71k 75k| 2 1 36 61 0 0|3807k 3768k
11.8 13.6 10.7|1492 53.7 |1.6 12 1.2|9.95G 884k 21.2G 228M| 0 0 | 119M 6574k| 73k 75k| 2 2 32 65 0 0|3888k 3829k
Edit
Switched to LeveledCompactionStrategy and disabled compression on the sstables; I don't see a big improvement.
There was a bit of improvement in profiles/sec updated; it's now 550-600 profiles/sec. But the CPU spikes remain, i.e. the iowait.
nodetool gcstats output
Interval (ms)   Max GC Elapsed (ms)   Total GC Elapsed (ms)   Stdev GC Elapsed (ms)   GC Reclaimed (MB)   Collections   Direct Memory Bytes
755960 83 3449 8 73179796264 107 -1
Dstat output
---load-avg--- --io/total- ---procs--- ------memory-usage----- ---paging-- -dsk/total- ---system-- ----total-cpu-usage---- -net/total-
1m 5m 15m | read writ|run blk new| used buff cach free| in out | read writ| int csw |usr sys idl wai hiq siq| recv send
7.02 8.34 7.33| 220 16.6 |0.0 0 1.1|10.0G 756k 21.2G 246M| 0 0 | 13M 1862k| 11k 13k| 1 0 94 5 0 0| 0 0
6.18 8.12 7.27|2674 29.7 |1.2 1.5 1.9|10.0G 760k 21.2G 210M| 0 0 | 119M 3275k| 69k 70k| 3 2 83 12 0 0|3906k 3894k
5.89 8.00 7.24|2455 314 |0.6 5.7 0|10.0G 760k 21.2G 225M| 0 0 | 111M 39M| 68k 69k| 3 2 51 44 0 0|3555k 3528k
5.21 7.78 7.18|2864 27.2 |2.6 3.2 1.4|10.0G 756k 21.2G 266M| 0 0 | 127M 3284k| 80k 76k| 3 2 57 38 0 0|4247k 4224k
4.80 7.61 7.13|2485 288 |0.1 12 1.4|10.0G 756k 21.2G 235M| 0 0 | 113M 36M| 73k 73k| 2 2 36 59 0 0|3664k 3646k
5.00 7.55 7.12|2576 30.5 |1.0 4.6 0|10.0G 760k 21.2G 239M| 0 0 | 125M 3297k| 71k 70k| 2 1 53 43 0 0|3884k 3849k
5.64 7.64 7.15|1873 174 |0.9 13 1.6|10.0G 752k 21.2G 237M| 0 0 | 119M 21M| 62k 66k| 3 1 27 69 0 0|3107k 3081k
You can notice the CPU spikes.
My main concern is the iowait before I increase the load further. Is there anything specific I should be looking for that's causing this? Because 600 profiles/sec (i.e. 600 reads + writes) seems low to me.
Can you try LeveledCompactionStrategy? With 1:1 reads/writes on large objects like this, the IO saved on reads will probably counter the IO spent on the more expensive compactions.
If you're already compressing the data before sending it, you should turn off compression on the table. Cassandra breaks the blob into 64 KB chunks which will be largely dominated by only 6 values, which won't get much compression (as shown in the horrible SSTable Compression Ratio: 0.9984539538554672).
ALTER TABLE dev.userprofile
WITH compaction = { 'class' : 'LeveledCompactionStrategy' }
AND compression = { 'sstable_compression' : '' };
400 profiles/second is very, very slow though, and there may be some work to do on your client, which could potentially be the bottleneck as well. If you have a load of 4 on an 8-core system, it may not be Cassandra slowing things down. Make sure you're parallelizing your requests and issuing them asynchronously; sending requests sequentially is a common issue (see the sketch below).
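As a hedged sketch of that parallelization, using the async API of the DataStax 3.x driver from the question's pom; the class shape, the concurrency cap, and the profiles map are assumptions:

import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.Semaphore;

// Sketch only: issue the updates asynchronously, with a bounded number in flight.
public class ParallelProfileWriter {

    private final Session session;
    private final PreparedStatement update;
    private final Semaphore inFlight = new Semaphore(128); // assumed concurrency cap

    public ParallelProfileWriter(Session session) {
        this.session = session;
        this.update = session.prepare("UPDATE dev.userprofile SET profile = ? WHERE uid = ?");
    }

    public void writeAll(Map<String, ByteBuffer> profiles) throws InterruptedException {
        for (Map.Entry<String, ByteBuffer> entry : profiles.entrySet()) {
            inFlight.acquire(); // block while too many requests are outstanding
            ResultSetFuture future =
                    session.executeAsync(update.bind(entry.getValue(), entry.getKey()));
            Futures.addCallback(future, new FutureCallback<ResultSet>() {
                public void onSuccess(ResultSet rs) { inFlight.release(); }
                public void onFailure(Throwable t) { inFlight.release(); } // log/retry here
            });
        }
    }
}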
With larger blobs there is going to be an impact on GCs, so monitoring them and adding that information can be helpful. I would be surprised if 10 KB objects affected them that much, but it's something to look out for and may require more JVM tuning.
If that helps, from there I would recommend tuning the heap and upgrading to at least 3.7, or the latest in the 3.0 line.
I'm getting a 404 on an available route, which is a PUT request.
The message I'm getting is:
6_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36" 93
INFO [2015-02-16 02:38:13,326] org.hibernate.engine.internal.StatisticalLoggingSessionEventListener: Session Metrics {
42205 nanoseconds spent acquiring 1 JDBC connections;
0 nanoseconds spent releasing 0 JDBC connections;
0 nanoseconds spent preparing 0 JDBC statements;
0 nanoseconds spent executing 0 JDBC statements;
0 nanoseconds spent executing 0 JDBC batches;
0 nanoseconds spent performing 0 L2C puts;
0 nanoseconds spent performing 0 L2C hits;
0 nanoseconds spent performing 0 L2C misses;
0 nanoseconds spent executing 0 flushes (flushing a total of 0 entities and 0 collections);
0 nanoseconds spent executing 0 partial-flushes (flushing a total of 0 entities and 0 collections)
}
127.0.0.1 - - [16/Feb/2015:02:38:13 +0000] "PUT /users/2/rules/current HTTP/1.1" 404 - "-" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.122 Safari/537.36" 15
Is there a way to debug or fix this, if the above information is enough?
I have checked the route and its parameters, and they all look fine.