Solr: java heap space continually growing - java

We have a Solr server (Solr 4.5) with a custom schema and configuration set up for a project under development. I have observed that when running our integration tests, the memory usage of the Solr server grows continually. These tests (JUnit) each post a set of 100 randomly generated records to the server, query around a bit, and delete them.
The deletion policy is set to
<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>
Even when the index no longer contains any documents, no memory is freed. Every run of the tests increases the used memory by a certain amount (about 40 MB, while the index itself is about 7 kB), until the server eventually dies with an OutOfMemoryError.
The Solr installation runs on Tomcat 6.0.35.0 with Java 1.7.0_17 and -Xmx12g. The OS is Linux.
How can that be? Where can I tweak the memory handling of Solr?

As it turns out, I had set the cache values and Xmx too high for the test machine, since a Java process uses more memory than the heap it is assigned (overhead). With reduced sizes, Solr has now been running stably for two days and through a lot of unit test runs (which it did not do before).
The test scenario (fill the index with random values, then clean it out completely) filled the caches to their maximum with records. It seems that removing a record from the index does not necessarily remove it from the caches...
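A minimal sketch of what reduced cache settings in solrconfig.xml can look like (these are the standard Solr 4.x caches; the sizes here are illustrative placeholders, not the exact values used in this setup):

<filterCache class="solr.FastLRUCache" size="256" initialSize="256" autowarmCount="0"/>
<queryResultCache class="solr.LRUCache" size="256" initialSize="256" autowarmCount="0"/>
<documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>

Setting autowarmCount to 0 also keeps a new searcher from pre-populating these caches from the old one on each commit.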

Related

Large dataset: None of the configured nodes are available for large dataset

There are a lot of questions about this error, but none for this condition.
I am running Elasticsearch 5.4.1 with a Java client (Java 1.8) which uses the API to make Elasticsearch calls. It runs on a Mac. I have a large number of documents (~8000) which I have to search on, insert, and merge.
I get this exception
None of the configured nodes are available: [{#transport#-1}{gNzkHZmURzaabE336I-T4w}{localhost}{127.0.0.1:9300}]
The thing is, it works with a smaller number of entries (e.g. 5000 documents). So the connection seems to be going down for some reason? Should I allocate more memory or more nodes to it? Is there some weird garbage collection going on?
According to the Mac's Activity Monitor, memory pressure is fine.
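For context, an Elasticsearch 5.4 transport client pointed at the address in that exception is typically set up along these lines (a minimal sketch; the cluster name and the surrounding class are assumptions, not taken from the actual code):

import java.net.InetAddress;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class EsClientSketch {
    public static void main(String[] args) throws Exception {
        // Assumed default cluster name; must match the cluster.name of the node.
        Settings settings = Settings.builder()
                .put("cluster.name", "elasticsearch")
                .build();
        // 127.0.0.1:9300 matches the address listed in the exception above.
        TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new InetSocketTransportAddress(
                        InetAddress.getByName("localhost"), 9300));
        // "None of the configured nodes are available" is thrown when the
        // client can no longer reach any of the addresses added here, e.g.
        // if the node stops responding.
        client.close();
    }
}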

Java application Performance issue

Issue at hand:
I have a Java application which takes twice as long to run on the DEV and QA servers as on my local machine. When running the job on DEV and QA I’m getting times of around 1:45 – 2:30 (hh:mm), compared to about 0:45 – 1:10 on my local machine. I’m trying to determine what could be causing the slow performance of a Java application on the servers.
What I have looked into so far, none of which provided an answer:
Testing with the same max heap size.
Observing stress on the CPU. DEV is idle about 75% of the time when running the batch application, so I don’t think this is an issue.
Observing RAM on DEV. DEV has more than enough memory to give the JVM the specified max heap (128 MB). If I understand correctly, the machine’s available memory doesn’t matter as long as it can provide the max heap size, correct?
Ensuring the Java version isn’t causing the issue.
Setting the logging level to the same value: “INFO”.
Processor: the servers have a 2.67 GHz processor; my local machine only has 2.19 GHz.
Additional Information.
Server OS: Linux
Local Computer OS: Windows
Single threaded Java application.
The application reads and writes text files and also makes calls to a database (Hibernate with c3p0). These are my most expensive (really, my only expensive) operations.
I’ve scoured dozens of sites to determine a root cause, but I haven’t been able to nail down what is causing the issue. Any help will be much appreciated.
Application is reading and writing to text files and also has calls to a database(hibernate c3p0). These are my most/only expensive operations
Most likely your server has slower access to the database it is using; e.g. if you run the database on the same machine, access can be a lot faster than across a network. I would look at the time it takes to perform some simple Hibernate operations locally and on your server. If performance is a concern, I suggest looking at removing Hibernate, or even the database, and your program could run 10x - 100x faster.
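One quick way to compare the two environments is to time a batch of trivial database round trips from each machine; the difference is mostly network latency to the database. A rough sketch (the JDBC URL and credentials are placeholders; use whatever driver and URL your Hibernate configuration points at):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class DbLatencyCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; substitute your real driver/URL/credentials.
        String url = "jdbc:mysql://dbhost:3306/mydb";
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement()) {
            long start = System.nanoTime();
            for (int i = 0; i < 100; i++) {
                // "SELECT 1" is a near no-op on most databases; adjust for yours.
                try (ResultSet rs = st.executeQuery("SELECT 1")) {
                    rs.next();
                }
            }
            long elapsedMs = (System.nanoTime() - start) / 1000000;
            // Compare this number on your local machine and on the DEV/QA servers.
            System.out.println("100 round trips took " + elapsedMs + " ms");
        }
    }
}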
I’m trying to determine what could be causing slow performance of a java application on servers
The server I was testing on happened to be in the DMZ (outside the network), while the database and my local computer (when working in the office) are inside the network. This was the case I failed to evaluate.

Java - issue with memory

Need some help from the experts!
We have a project here (still in dev) that needs to run 50 Java processes (for now; it will probably double or triple in the future) at the same time every 5 minutes. I set -Xmx50m for every process, and our server has only 4 GB of RAM, so I know that would really slow our server down. What I have in mind is to upgrade our RAM. My question is: do I have any other options to prevent our server from becoming slow when running that many Java processes?
Since you have 50 processes, and by your own estimate they need about 2.5 GB of heap to run (50 x 50 MB), keep in mind that each JVM also uses memory beyond its heap (thread stacks, permgen/metaspace, code cache), so the real footprint will be noticeably higher.
To prevent your server from becoming slow you can follow some best practices for setting Java memory parameters, e.g. set -Xms and -Xmx to the same value and determine proper values based on your processes' actual usage. You can also profile your processes at runtime to ensure that everything is OK.

SOLR suddenly crashes with OutOfMemoryException

I ran into a problem with Solr going OutOfMemory. The situation is as follows: we had 2 Amazon EC2 small instances (3.5 GB) each running a Spring/BlazeDS backend in Tomcat 6 (behind a load balancer). Each instance has its own local Solr instance. The index size on disk is about 500 MB. The JVM settings had been unchanged for months (-Xms512m, -Xmx768m). We use Solr to find people based on properties they entered in their profile and documents they uploaded. We're not using the Solr update handler, only select. Updates are done using delta imports: the Spring app in each Tomcat instance has a job that triggers the /dataimport?command=delta-import handler every 30 seconds.
This worked well for months, maybe even for over a year if I'm correct (I haven't been on the project that long). CPU load was minimal, with only the occasional peak.
The past week we suddenly had OutOfMemory crashes of Solr on both machines. I reviewed my changes over the past few weeks, but none of them seemed related to Solr: bug fixes in the UI, something email related, but again, nothing in the Solr schema or queries.
Today we changed the EC2 instances to m1.large (7.5 GB) and the Solr JVM settings to -Xms2048m / -Xmx3072m. This helped a bit; they run for 3 to 4 hours, but eventually they crash too.
Oh, and the dataset (number of rows, documents, entities, etc.) did not change significantly. There is constant growth, but it doesn't make sense to me that it still crashes after I triple the JVM memory...
The question: do you have any directions to point me to?
Measure, don't guess. Instead of guessing what has changed and could be leading to your problems, you would do better to attach a memory leak detection tool, e.g. Plumbr. Run your Solr with the tool attached and see whether it tells you the exact cause of the memory leak.
Take a look at your Solr cache settings. Reducing the size of the document cache helped us stabilize a Solr 3.6 server that was also experiencing OutOfMemory errors. The query result cache size may also be relevant in your case; it was not in mine.
You can see your Solr cache usage on the admin page for your core:
http://localhost:8983/solr/core0/admin/stats.jsp#cache
(Replace core0 with the name of your Solr core)
documentCache
https://wiki.apache.org/solr/SolrCaching#documentCache
queryResultCache
https://wiki.apache.org/solr/SolrCaching#queryResultCache

Java-mysql highload application crash

I have a problem with my HTML scraper. The scraper is a multithreaded application written in Java using HtmlUnit; by default it runs with 128 threads. In short, it works as follows: it takes a site URL from a big text file, pings the URL, and, if it is accessible, parses the site, finds specific HTML blocks, saves all URL and block info including the HTML code into the corresponding tables in the database, and moves on to the next site. The database is MySQL 5.1; there are 4 InnoDB tables and 4 views. The tables have numeric indexes on the fields used for joining. I also have a web interface for browsing and searching the parsed data (for searching I use Sphinx with delta indexes), written in CodeIgniter.
Server configuration:
CPU: Type Xeon Quad Core X3440 2.53GHz
RAM: 4 GB
HDD: 1TB SATA
OS: Ubuntu Server 10.04
Some mysql config:
key_buffer = 256M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 128
max_connections = 400
table_cache = 64
query_cache_limit = 2M
query_cache_size = 128M
The JVM runs with default parameters except for the following options: -Xms1024m -Xmx1536m -XX:-UseGCOverheadLimit -XX:NewSize=500m -XX:MaxNewSize=500m -XX:SurvivorRatio=6 -XX:PermSize=128M -XX:MaxPermSize=128m -XX:ErrorFile=/var/log/java/hs_err_pid_%p.log
When the database was empty, the scraper processed 18 URLs per second and was stable enough. But after 2 weeks, now that the urls table contains 384929 records (~25% of all processed URLs) and takes 8.2 GB, the Java application works very slowly and crashes every 1-2 minutes. I guess the cause is MySQL, which cannot handle the growing load (the parser performs 2 + 4*BLOCK_NUMBER queries for every processed URL; Sphinx updates its delta indexes every 10 minutes; I don't count the web interface, because it's used by only one person). Maybe it rebuilds indexes very slowly? But the MySQL and scraper logs (which also contain all uncaught exceptions) are empty. What do you think?
I'd recommend running the following just to check a few status things; putting that output here would help as well:
dmesg
top (check the resident vs. virtual memory per process)
So the application becomes non-responsive? (Not the same as a crash at all.) I would check that all your resources are free, e.g. do a jstack to check whether any threads are tied up.
Check in MySQL that you have the expected number of connections. If you continuously create connections in Java and don't clean them up, the database will run slower and slower.
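One thing worth grepping the scraper for is connections that are opened per URL but never closed. A sketch of the safe shape (the JDBC URL, table, and column names here are made up for illustration; close() in finally works on Java 6 as well):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BlockDao {
    // Hypothetical helper that stores one parsed block for a site.
    static void saveBlock(String siteUrl, String html) throws Exception {
        Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/scraper", "user", "password");
        try {
            PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO blocks (site_url, html) VALUES (?, ?)");
            try {
                ps.setString(1, siteUrl);
                ps.setString(2, html);
                ps.executeUpdate();
            } finally {
                ps.close();
            }
        } finally {
            // Always release the connection, even when the insert throws;
            // otherwise 128 scraper threads will slowly exhaust max_connections.
            con.close();
        }
    }
}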
Thank you all for your advice; MySQL was indeed the cause of the problem. By enabling the slow query log in my.cnf I could see that one of the queries, which executes on every iteration, took 300 s (one field used for searching was not indexed).
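For reference, the slow query log mentioned above can be switched on in my.cnf with option names along these lines (valid for MySQL 5.1; the threshold and log path are just examples):

slow_query_log = 1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time = 1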
