I'm developing a microservice with Kotlin and Micronaut which accesses a Firestore database. When I run this microservice locally I can make it work with 128 MB because it's very simple: it just reads and writes small amounts of data to Firestore, really small documents like this:
{
  "project": "DUMMY",
  "columns": [
    {
      "name": "TODO",
      "taskStatus": "TODO"
    },
    {
      "name": "IN_PROGRESS",
      "taskStatus": "IN_PROGRESS"
    },
    {
      "name": "DONE",
      "taskStatus": "DONE"
    }
  ],
  "tasks": {}
}
I'm running this in App Engine Standard on an F1 instance (256 MB, 600 MHz) with these properties in my app.yaml:
runtime: java11
instance_class: F1 # 256 MB 600 MHz
entrypoint: java -Xmx200m -jar MY_JAR.jar
service: data-connector
env_variables:
JAVA_TOOL_OPTIONS: "-Xmx230m"
GAE_MEMORY_MB: 128M
automatic_scaling:
max_instances: 1
max_idle_instances: 1
I know all those properties for handling memory are not necessary, but I was desperate to make this work and just tried a lot of solutions, because my first error message was:
Exceeded soft memory limit of 256 MB with 263 MB after servicing 1 requests total. Consider setting a larger instance class in app.yaml.
The error below is not fixed by the properties in the app.yaml; now, every time I make a call to return that JSON, I get this error:
2020-04-10 12:09:15.953 CEST
While handling this request, the process that handled this request was found to be using too much memory and was terminated. This is likely to cause a new process to be used for the next request to your application. If you see this message frequently, you may have a memory leak in your application or may be using an instance with insufficient memory. Consider setting a larger instance class in app.yaml.
The first request always takes longer, I think due to some Firestore initialization, but the thing is that I cannot make it work; I always get the same error.
Do you have any idea what I could be doing wrong, or what I need to do to fix this?
TL;DR The problem was that I tried to use a very small instance for a simple application, but even with that I needed more memory.
OK, a friend helped me with this. I was using a very small instance, and even when I didn't get the memory-limit error, it was still a memory problem.
Updating my instance to an F2 (512 MB, 1.2 GHz) solved the problem, and testing my app with siege resulted in very nice performance:
Transactions: 5012 hits
Availability: 100.00 %
Elapsed time: 59.47 secs
Data transferred: 0.45 MB
Response time: 0.30 secs
Transaction rate: 84.28 trans/sec
Throughput: 0.01 MB/sec
Concurrency: 24.95
Successful transactions: 3946
Failed transactions: 0
Longest transaction: 1.08
Shortest transaction: 0.09
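For completeness, this is roughly how the relevant part of my app.yaml looks after the change; the entrypoint is the same as before, and the memory-related environment variables from the question shouldn't be needed:
runtime: java11
instance_class: F2 # 512 MB 1.2 GHz
entrypoint: java -Xmx200m -jar MY_JAR.jar
service: data-connector
automatic_scaling:
  max_instances: 1
  max_idle_instances: 1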
My sysops friend tells me that these instances are more suited to Python scripts and the like, not JVM REST servers.
I have a third-party application named oVirt, and we need to connect to this application using the REST API it exposes.
We have 4 GB of RAM in the VM and have allocated 1 GB for the third-party application.
Also, this is the CPU configuration:
[root@test ~]# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 2
On-line CPU(s) list: 0,1
Thread(s) per core: 1
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 23
Model name: Pentium(R) Dual-Core CPU E5200 @ 2.50GHz
Stepping: 10
CPU MHz: 1200.000
CPU max MHz: 2500.0000
CPU min MHz: 1200.0000
BogoMIPS: 4999.77
L1d cache: 32K
L1i cache: 32K
L2 cache: 2048K
NUMA node0 CPU(s): 0,1
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl est tm2 ssse3 cx16 xtpr pdcm xsave lahf_lm dtherm
NOTE: The oVirt application itself exposes JMX support.
Then we started to hit the REST API of the application with 500 requests, of which 30 were sent in parallel. I could see that it scaled successfully, and that memory usage did not even reach 600 MB during the run.
Then we increased the concurrency to 32 (still with 500 requests) and it failed with nothing but a timeout error.
I increased the RAM for the third-party app to 2 GB; it still fails at 35 concurrent requests, and sometimes it fails at 32 again.
I have one more environment with 2 GB of RAM running the same setup with a different CPU configuration:
lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 58
Model name: Intel(R) Core(TM) i5-3450 CPU @ 3.10GHz
Stepping: 9
CPU MHz: 1600.132
BogoMIPS: 6200.36
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
It can handle 63 concurrent requests, so what makes the difference? I don't understand the scaling issue.
I observed this log in their application:
2018-03-02 10:58:39,821+05 INFO [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread pool 'engineScheduled' is using 0 threads out of 100 and 1 tasks are waiting in the queue.
2018-03-02 10:58:39,822+05 INFO [org.ovirt.engine.core.bll.utils.ThreadPoolMonitoringService] (EE-ManagedThreadFactory-engineThreadMonitoring-Thread-1) [] Thread pool 'engineThreadMonitoring' is using 1 threads out of 1 and 0 tasks are waiting in the queue.
How do I analyse this? Could somebody explain the threading output below?
EDIT: Added the JMX response:
[standalone#127.0.0.1:8706 /] ls /core-service=platform-mbean/type=threading
all-thread-ids=[559L,558L,557L,556L,555L,554L,553L,552L,455L,399L,326L,325L,302L,301L,300L,299L,298L,297L,296L,295L,294L,293L,292L,289L,288L,287L,286L,285L,284L,283L,282L,281L,280L,279L,278L,277L,276L,275L,274L,273L,272L,271L,269L,264L,263L,262L,261L,260L,259L,258L,257L,256L,255L,254L,252L,251L,242L,237L,236L,235L,234L,233L,232L,231L,230L,227L,226L,225L,224L,223L,222L,221L,220L,219L,218L,217L,216L,215L,214L,213L,212L,211L,209L,208L,206L,205L,204L,203L,202L,201L,200L,199L,198L,197L,196L,195L,194L,193L,192L,191L,190L,189L,188L,187L,186L,185L,184L,183L,182L,181L,180L,179L,178L,177L,176L,175L,174L,173L,172L,171L,168L,167L,166L,165L,164L,163L,162L,161L,160L,159L,158L,157L,155L,154L,153L,152L,151L,150L,149L,148L,147L,146L,145L,144L,143L,142L,141L,140L,139L,135L,134L,132L,133L,131L,130L,129L,128L,127L,126L,125L,124L,123L,122L,121L,120L,119L,118L,117L,116L,114L,113L,112L,111L,110L,109L,108L,107L,106L,105L,104L,103L,102L,101L,99L,98L,97L,96L,84L,83L,80L,77L,76L,75L,74L,73L,72L,70L,71L,69L,68L,67L,66L,65L,62L,60L,64L,44L,43L,42L,41L,39L,38L,18L,17L,15L,14L,13L,12L,8L,4L,3L,2L]
thread-contention-monitoring-supported=true
thread-cpu-time-supported=true
current-thread-cpu-time-supported=true
object-monitor-usage-supported=true
synchronizer-usage-supported=true
thread-contention-monitoring-enabled=false
thread-cpu-time-enabled=true
thread-count=222
peak-thread-count=223
total-started-thread-count=551
daemon-thread-count=147
current-thread-cpu-time=62992810
current-thread-user-time=50000000
object-name=java.lang:type=Threading
[standalone#127.0.0.1:8706 /] ls /core-service=platform-mbean/type=memory
heap-memory-usage={"init" => 1073741824L,"used" => 801489408L,"committed" => 2016411648L,"max" => 2016411648L}
non-heap-memory-usage={"init" => 2555904L,"used" => 194310080L,"committed" => 212074496L,"max" => -1L}
object-name=java.lang:type=Memory
object-pending-finalization-count=0
verbose=true
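For context, those attributes are just the standard java.lang.management platform MBeans; a minimal sketch that prints the same counters for whatever JVM it runs in (not oVirt-specific):

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JvmStats {
    public static void main(String[] args) {
        // Same counters the CLI shows under /core-service=platform-mbean/type=threading
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("thread-count=" + threads.getThreadCount());
        System.out.println("peak-thread-count=" + threads.getPeakThreadCount());
        System.out.println("total-started-thread-count=" + threads.getTotalStartedThreadCount());
        System.out.println("daemon-thread-count=" + threads.getDaemonThreadCount());

        // And the numbers behind /core-service=platform-mbean/type=memory
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("heap-memory-usage=" + memory.getHeapMemoryUsage());
        System.out.println("non-heap-memory-usage=" + memory.getNonHeapMemoryUsage());
    }
}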
I have installed Docker Toolbox for Mac OS X and am running several containers inside it. The first two I created were Cassandra containers and were running fine. After that I created 2 Debian containers and connected to bash through the Docker terminal with the purpose of installing Oracle JDK 8.
At the point when I was about to extract Java from the tarball, I got a ton of "Cannot write: No space left on device" error messages during the execution of the tar command.
I've checked the space:
$ docker ps -s
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES SIZE
9d8029e21918 debian:latest "/bin/bash" 54 minutes ago Up 54 minutes deb-2 620.5 MB (virtual 744 MB)
49c7a0e37475 debian:latest "/bin/bash" 55 minutes ago Up 55 minutes deb-1 620 MB (virtual 743.5 MB)
66a17af83ca3 cassandra "/docker-entrypoint.s" 4 hours ago Up 4 hours 7000-7001/tcp, 7199/tcp, 9042/tcp, 9160/tcp node-2 40.16 MB (virtual 412.6 MB)
After seeing that output I noticed that one of my Cassandra nodes was missing. I went to check in Kitematic and found out that it is in the DOWN state and I can't start it; "Cannot write node . No space left on device" is the error message shown for that attempt.
Are there any limits that Docker imposes on running containers?
When I remove all my Cassandra containers and leave just the couple of Debian ones, Java can be extracted from the tar. So the issue is definitely in some Docker setting related to sizing.
What is the correct way to resolve the issue with space limits here?
UPDATE.
$ docker images
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
cassandra latest 13ea610e5c2b 11 hours ago 374.8 MB
debian jessie 23cb15b0fcec 2 weeks ago 125.1 MB
debian latest 23cb15b0fcec 2 weeks ago 125.1 MB
The output of df -hi
$ df -hi
Filesystem Inodes IUsed IFree IUse% Mounted on
none 251K 38K 214K 15% /
tmpfs 251K 18 251K 1% /dev
tmpfs 251K 12 251K 1% /sys/fs/cgroup
tmpfs 251K 38K 214K 15% /etc/hosts
shm 251K 1 251K 1% /dev/shm
$ df -h
Filesystem      Size  Used Avail Use% Mounted on
none            1.8G  1.8G     0 100% /
tmpfs          1002M     0 1002M   0% /dev
tmpfs          1002M     0 1002M   0% /sys/fs/cgroup
tmpfs           1.8G  1.8G     0 100% /etc/hosts
shm              64M     0   64M   0% /dev/shm
Appreciate help.
I have managed to resolve this issue in Docker.
By default, the memory for the Docker machine is set to 2048 MB.
The first step I performed was stopping my Docker machine:
$ docker-machine stop default
Then I went to the $HOME/.docker/machine/machines/default/config.json file and changed the "Memory" setting to a higher value, i.e. 4096:
{
"ConfigVersion": 3,
"Driver": {
"VBoxManager": {},
"IPAddress": "192.168.99.102",
"MachineName": "default",
"SSHUser": "docker",
"SSHPort": 59177,
"SSHKeyPath": "/Users/lenok/.docker/machine/machines/default/id_rsa",
"StorePath": "/Users/lenok/.docker/machine",
"SwarmMaster": false,
"SwarmHost": "tcp://0.0.0.0:3376",
"SwarmDiscovery": "",
"CPU": 1,
"Memory": 4096,
"DiskSize": 204800,
"Boot2DockerURL": "",
"Boot2DockerImportVM": "",
"HostDNSResolver": false,
"HostOnlyCIDR": "192.168.99.1/24",
"HostOnlyNicType": "82540EM",
"HostOnlyPromiscMode": "deny",
"NoShare": false,
"DNSProxy": false
},
"DriverName": "virtualbox",
"HostOptions": {
"Driver": "",
"Memory": 0,
"Disk": 0,
"EngineOptions": {
"ArbitraryFlags": [],
"Dns": null,
"GraphDir": "",
"Env": [],
"Ipv6": false,
"InsecureRegistry": [],
"Labels": [],
"LogLevel": "",
"StorageDriver": "",
"SelinuxEnabled": false,
"TlsVerify": true,
"RegistryMirror": [],
"InstallURL": "https://get.docker.com"
},
"SwarmOptions": {
"IsSwarm": false,
"Address": "",
"Discovery": "",
"Master": false,
"Host": "tcp://0.0.0.0:3376",
"Image": "swarm:latest",
"Strategy": "spread",
"Heartbeat": 0,
"Overcommit": 0,
"ArbitraryFlags": [],
"config.json" [noeol] 75L, 2560C
"Overcommit": 0,
"ArbitraryFlags": [],
"Env": null
},
"AuthOptions": {
"CertDir": "/Users/lenok/.docker/machine/certs",
"CaCertPath": "/Users/lenok/.docker/machine/certs/ca.pem",
"CaPrivateKeyPath": "/Users/lenok/.docker/machine/certs/ca-key.pem",
"CaCertRemotePath": "",
"ServerCertPath": "/Users/lenok/.docker/machine/machines/default/server.pem",
"ServerKeyPath": "/Users/lenok/.docker/machine/machines/default/server-key.pem",
"ClientKeyPath": "/Users/lenok/.docker/machine/certs/key.pem",
"ServerCertRemotePath": "",
"ServerKeyRemotePath": "",
"ClientCertPath": "/Users/lenok/.docker/machine/certs/cert.pem",
"ServerCertSANs": [],
"StorePath": "/Users/lenok/.docker/machine/machines/default"
}
},
"Name": "default"
}
Finally, I started my Docker machine again:
$ docker-machine start default
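To double-check that the new value took effect, something along these lines should print the configured memory back; the Go template path simply mirrors the Driver.Memory field in the config.json above:
$ docker-machine inspect --format '{{.Driver.Memory}}' default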
Issue 18869 refers to a docker-machine memory allocation problem.
This can be tested on the fly with
vboxmanage controlvm default 4096
Since drivers/virtualbox/virtualbox.go#L344-L352 reloads the settings from $HOME/.docker/machine/machines/default/config.json, it is best to record the new value in that file (as mentioned in this answer).
That "No space left on device" was seen in docker/machine issue 2285, where the vmdk image created is a dynamically allocated/grow at run-time (default), creating a smaller on-disk foot-print initially, therefore even when creating a ~20GiB vm, with --virtualbox-disk-size 20000 requires on about ~200MiB of free space on-disk to start with.
And the default memory is quite low.
Make sure that you don't have:
any exited containers that you could remove:
docker rm -v $(docker ps --filter status=exited -q 2>/dev/null) 2>/dev/null
any dangling images:
docker rmi $(docker images --filter dangling=true -q 2>/dev/null) 2>/dev/null
(Those are the result of rebuilds, which leave intermediate images unused.)
See also "How to remove old and unused Docker images"
Then make sure you don't have an inode exhaustion problem, as in issue 10613.
Check df -hi (with i for inodes)
connected to bash through the Docker terminal with the purpose of installing Oracle JDK 8.
Try instead to specify the installation in a Dockerfile and build an image with the JDK installed.
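For example, a minimal sketch of such a Dockerfile; it uses Debian's packaged JDK purely as an illustration (for Oracle JDK 8 you would instead copy and extract the downloaded tarball in the Dockerfile):

# Illustration only: bake a JDK into the image at build time instead of
# installing it by hand inside a running container.
FROM debian:latest
RUN apt-get update && \
    apt-get install -y --no-install-recommends default-jdk && \
    rm -rf /var/lib/apt/lists/*
CMD ["bash"]

Then build it once and start containers from the resulting image (the tag name is arbitrary):
$ docker build -t debian-jdk .
$ docker run -it debian-jdk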
We are seeing some performance issues with Kafka + Storm + Trident + OpaqueTridentKafkaSpout
Our setup details are mentioned below.
Storm topology:
Broker broker = Broker.fromString("localhost:9092")
GlobalPartitionInformation info = new GlobalPartitionInformation()
if (args[4]) {
    int partitionCount = args[4].toInteger()
    for (int i = 0; i < partitionCount; i++) {
        info.addPartition(i, broker)
    }
}
StaticHosts hosts = new StaticHosts(info)
TridentKafkaConfig tridentKafkaConfig = new TridentKafkaConfig(hosts,"test")
tridentKafkaConfig.scheme = new SchemeAsMultiScheme(new StringScheme())
OpaqueTridentKafkaSpout kafkaSpout = new OpaqueTridentKafkaSpout(tridentKafkaConfig)
TridentTopology topology = new TridentTopology()
Stream st = topology.newStream("spout1", kafkaSpout).parallelismHint(args[2].toInteger())
        .each(kafkaSpout.getOutputFields(), new NEO4JTridentFunction(), new Fields("status"))
        .parallelismHint(args[1].toInteger())
Map conf = new HashMap()
conf.put(Config.TOPOLOGY_WORKERS, args[3].toInteger())
conf.put(Config.TOPOLOGY_DEBUG, false)
if (args[0] == "local") {
    LocalCluster cluster = new LocalCluster()
    cluster.submitTopology("mytopology", conf, topology.build())
} else {
    StormSubmitter.submitTopology("mytopology", conf, topology.build())
    NEO4JTridentFunction.getGraphDatabaseService().shutdown()
}
The storm.yaml we are using is as below:
########### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
- "localhost"
# - "server2"
#
storm.zookeeper.port : 2999
storm.local.dir: "/opt/mphrx/neo4j/stormdatadir"
nimbus.childopts: "-Xms2048m"
ui.childopts: "-Xms1024m"
logviewer.childopts: "-Xmx512m"
supervisor.childopts: "-Xms1024m"
worker.childopts: "-Xms2600m -Xss256k -XX:MaxPermSize=128m -XX:PermSize=96m
-XX:NewSize=1000m -XX:MaxNewSize=1000m -XX:MaxTenuringThreshold=1 -XX:SurvivorRatio=6
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled
-XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
-server -XX:+AggressiveOpts -XX:+UseCompressedOops -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true
-Xloggc:logs/gc-worker-%ID%.log -verbose:gc
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m
-XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps -XX:+PrintClassHistogram
-XX:+PrintTenuringDistribution -XX:-PrintGCApplicationStoppedTime -XX:-PrintGCApplicationConcurrentTime
-XX:+PrintCommandLineFlags -XX:+PrintFlagsFinal"
java.library.path: "/usr/lib/jvm/jdk1.7.0_25"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
topology.trident.batch.emit.interval.millis: 100
topology.message.timeout.secs: 300
#topology.max.spout.pending: 10000
Size of each message produced in Kafka: 11 KB
Execution time of each bolt (NEO4JTridentFunction) to process the data: 500 ms
No. of Storm workers: 1
Parallelism hint for the spout (OpaqueTridentKafkaSpout): 1
Parallelism hint for the bolt/function (NEO4JTridentFunction): 50
We are seeing a throughput of around 12 msgs/sec from the spout.
Rate of messages produced in Kafka: 150 msgs/sec
Both Storm and Kafka are single-node deployments.
We have read about much higher throughput from Storm but are unable to reproduce it. Please suggest how to tune the Storm + Kafka + OpaqueTridentKafkaSpout configuration to achieve higher throughput. Any help in this regard would be immensely appreciated.
Thanks,
You should set the spout parallelism to the same value as the partition count of the topic in question.
By default, Trident processes one batch per execution; you should increase this count by changing the topology.max.spout.pending property. Also, since Trident forces ordered transaction management, your execution method (NEO4JTridentFunction) must be fast in order to reach the desired throughput.
In addition, you can play with tridentKafkaConfig.fetchSizeBytes; by increasing it, you can ingest more data on each emit call in your spout.
Also, you should check your garbage collection log; it will give you a clue about the real bottleneck.
You can enable garbage collection logging by adding "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:{path}/gc-storm-worker-%ID%.log" to the worker.childopts setting in your worker config.
Last but not least, you can use G1GC if your young generation ratio is higher than in the normal case.
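As a rough sketch of where those two knobs live, reusing the variable names from the topology in the question (the values below are only illustrative starting points):

// Illustrative values only: pull bigger chunks from Kafka per fetch,
// and allow more than one Trident batch to be pending at a time.
tridentKafkaConfig.fetchSizeBytes = 1024 * 1024;
conf.put(Config.TOPOLOGY_MAX_SPOUT_PENDING, 5);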
Please set your worker.childopts based on your system configuration. Use SpoutConfig.fetchSizeBytes to increase the number of bytes being pulled into the topology. Increase your parallelism hint.
My calculations: with 8 cores and 500 ms per bolt execution, each core can handle about 2 messages per second, so -> ~16 messages/sec in total.
If you optimize the bolt, you will see improvements.
Also, for CPU-bound bolts, try a parallelism hint equal to the total number of cores,
and increase topology.trident.batch.emit.interval.millis to the amount of time it takes to process an entire batch, divided by 2.
Set topology.max.spout.pending to 1.
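In storm.yaml terms, the last two suggestions would look something like this (the interval value is only an example; per the advice above it should be roughly half your measured batch processing time):
topology.max.spout.pending: 1
topology.trident.batch.emit.interval.millis: 500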