Java application running in Eclipse, random "fatal errors"

I have written a short application that converts files (ECGs) from their raw data to XML. I have about 350,000 files to convert, and the conversion itself is done via a library that I got from the manufacturer of the ECG devices. To make use of the multiple processors and cores in the machine I'm using, I wrote a "wrapper application" that creates a pool of threads, which is then used to do the conversions in separate threads. It works reasonably well, but unfortunately I get random errors that cause the whole application to stop (85k files have been converted over the past 3-4 days, and I have had four of those errors):
A fatal error has been detected by the Java Runtime Environment:
EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x71160a6c, pid=1468, tid=1396
JRE version: Java(TM) SE Runtime Environment (8.0_20-b26) (build 1.8.0_20-b26)
Java VM: Java HotSpot(TM) Client VM (25.20-b23 mixed mode windows-x86 )
Problematic frame:
C [msvcr100.dll+0x10a6c]
I would suspect that it's the library I'm using that causes these, so I don't think I can do all that much to fix it. When that error happens, I rerun the program and let it start where it left off before the crash. Right now I have to do that manually, but I was hoping there is some way to let Eclipse restart the program (with an argument giving the filename where it should start). Does anyone know if there is some way to do that?
Thanks!

It is not entirely clear, but I think you are saying that you have a 3rd party Java library (with a native code component) that you are running within one JVM using multiple threads.
If so, I suspect that the problem is that the native part of the 3rd-party application is not properly multi-threaded, and that is the root cause of the crashes. (I don't expect that you want to track down the cause of the problem ...)
Instead of using one JVM with multiple converter threads, use multiple JVMs each with a single converter thread. You can spread the conversions across the JVMs either by partitioning the work statically, or by some form of queueing mechanism.
Or ... you could modify your existing wrapper so that each thread launches the converter in a separate JVM using ProcessBuilder. If a converter JVM crashes, the wrapper thread that launched it could simply launch it again. Alternatively, it could note the failed conversion and move on to the next one. (You need to be a bit careful with retrying, in case something about the particular file being converted is what triggers the JVM crash.)
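The ProcessBuilder approach might look something like this sketch; the jar name (converter.jar), the main class (ConverterMain), and the exit-code convention are placeholders for whatever your actual converter uses:

```java
import java.io.IOException;

public class ConverterLauncher {

    // Launch one conversion in a child JVM. The jar and main class
    // names here are hypothetical stand-ins for the real converter.
    public static int convertInChildJvm(String inputFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(
                "java", "-cp", "converter.jar", "ConverterMain", inputFile);
        pb.inheritIO(); // forward the child's output to our console
        Process p = pb.start();
        return p.waitFor(); // non-zero means the child JVM failed or crashed
    }

    public static void main(String[] args) throws Exception {
        for (String file : args) {
            int exit = convertInChildJvm(file);
            if (exit != 0) {
                // Log and move on; don't retry blindly, in case this
                // particular file is what triggers the native crash.
                System.err.println("Conversion failed (exit " + exit + "): " + file);
            }
        }
    }
}
```

With this structure a native crash only takes down one child JVM; the wrapper records the failure and continues with the next file.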
For the record, I don't know of an existing "off the shelf" solution.

It seems that you are using the x86 (32-bit) version of Java. Maybe you could try it with the x64 (64-bit) version. That has sometimes worked for me in the past.
The problem seems to be in the native library, but maybe if you try it with 64-bit Java, it will use a 64-bit version of the native library?

Processing Sketch- Why is 32-bit and 64-bit both created?

I've recently been experimenting with Processing (https://processing.org/).
It's a sort of IDE used to make GUI design in Java easier. Since I'm not a fan of swing or AWT, I found it quite fun to use.
Something interesting to note though. When I "export" the Application for windows, it creates both a 32-bit and 64-bit version.
I'm a little confused, as I thought that after Java source code is compiled to Java bytecode, it can be run anywhere as long as that place has a JVM ("write once, run anywhere").
So why are both a 32 bit and 64 bit version of the app created? Shouldn't the bytecode be platform independent and only be translated using Just-In-Time compilation to whichever architecture the JVM is on, during runtime? At least, I know that's how .NET does it with the CLR.
I'm going to attempt to answer my own question by saying that since the applications created are .exe files, the translation to native architecture happened already, since Windows was specified as a target platform... I guess to increase efficiency?
Otherwise, I'm confused. The only time I've seen a compilation happen twice is when I was programming C++, and needed to compile twice for 32-bit and 64-bit.
Thank you!
Processing is built on top of JOGL which is (basically) a Java wrapper of OpenGL, which is a device-specific graphics library.
Also, Processing (can) include a whole JVM with its exported applications, so end users don't have to worry about downloading Java. The JVM itself is OS-dependent, so the exported application is as well.
You can confirm this by taking a look at the files that Processing creates. Specifically, notice these files:
jogl-rt-natives-windows-amd64.jar
jogl-all-natives-windows-amd64.jar
These .jar files contain the native files required by JOGL.
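If you want to confirm which architecture a given JVM (and therefore its bundled native libraries) runs under, a small sketch like this can help; note that sun.arch.data.model is a HotSpot-specific property and not guaranteed on every JVM:

```java
public class JvmBitness {
    public static void main(String[] args) {
        // "os.arch" reports the JVM's own architecture, not the OS's:
        // a 32-bit JVM on 64-bit Windows still reports "x86".
        System.out.println("os.arch = " + System.getProperty("os.arch"));
        // HotSpot-specific; may be absent on other JVMs, hence the default.
        System.out.println("data model = "
                + System.getProperty("sun.arch.data.model", "unknown"));
    }
}
```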

Java application performance changing based on how it is executed

Hopefully this is an easy and quick question. I recently developed a CPU-intensive Java application in NetBeans. It uses A* pathfinding tens of thousands of times per second to solve a tile-matching game. The application is finished, and it runs pretty fast (I've been testing in NetBeans the whole time). I've clocked it at 700 attempts per second (each attempt is probably 20 or so pathfinds).
When I build the project it creates a jar, and I can run this outside of NetBeans. If I use the command line (Windows 7) and run java -jar theFile.jar, I clock it at 1000 attempts per second. This is understandable, since the IDE was probably using a bit of CPU power and holding it back. (My application is multithreaded; you can set the number of threads. I usually use 3/4 so it doesn't slow my system too much.)
Now, the confusing part. Obviously I don't want the user to have to use the command line every time they want to run this application on Windows. They should just be able to click the jar. The problem is that when I double-click the jar file, the program runs at a sickly 300 attempts per second!
Why on earth would these three ways of running the exact same program, all else being constant, have such a massive impact on performance? Is my fix to create a script that runs the .jar from the command line, or do you recognize what's going on here? Thanks very much!
Edit: New Information
I made a batch file with the command: java -jar theFile.jar
When this is executed, it runs at the same speed as it would if I ran it in the console (so, 1000 att/sec)
However, I also made an executable with a simple C++ program. The program had just a couple of lines: System("java -jar theFile.jar"); and return 0;. Unbelievably, this runs at the speed of double-clicking the jar file, about 300 att/sec. How bizarre! It could very well be different JVM parameters, but I'm not sure how to check the default parameters, or how to modify them for this particular jar.
You may be running into the differences between the client and server versions of the HotSpot VM. From this article:
On platforms typically used for client applications, the JDK comes with a VM implementation called the Java HotSpot Client VM (client VM). The client VM is tuned for reducing start-up time and memory footprint. It can be invoked by using the -client command-line option when launching an application.
On all platforms, the JDK comes with an implementation of the Java virtual machine called the Java HotSpot Server VM (server VM). The server VM is designed for maximum program execution speed. It can be invoked by using the -server command-line option when launching an application.
I'm guessing that clicking the jar file may be invoking the client VM, unless you set the -server flag. This article provides some more details:
What's the difference between the -client and -server systems?
These two systems are different binaries. They are essentially two different compilers (JITs) interfacing to the same runtime system. The client system is optimal for applications which need fast startup times or small footprints; the server system is optimal for applications where the overall performance is most important. In general the client system is better suited for interactive applications such as GUIs. Some of the other differences include the compilation policy, heap defaults, and inlining policy.
Where do I get the server and client systems?
Client and server systems are both downloaded with the 32-bit Solaris and Linux downloads. For 32-bit Windows, if you download the JRE, you get only the client; you'll need to download the SDK to get both systems.
For 64-bit, only the server system is included. On Solaris, the 64-bit JRE is an overlay on top of the 32-bit distribution. However, on Linux and Windows, it's a completely separate distribution.
I would like java to default to -server. I have a lot of scripts which I cannot change (or do not want to change). Is there any way to do this?
Since Java SE 5.0, with the exception of 32-bit Windows, the server VM will automatically be selected on server-class machines. The definition of a server-class machine may change from release to release, so please check the appropriate ergonomics document for the definition for your release. For 5.0, it's Ergonomics in the 5.0 Java[tm] Virtual Machine.
Should I warm up my loops first so that HotSpot will compile them?
Warming up loops for HotSpot is not necessary. HotSpot contains On Stack Replacement technology, which will compile a running (interpreted) method and replace it while it is still running in a loop. There is no need to waste your application's time warming up seemingly infinite (or very long running) loops in order to get better application performance.
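A quick way to check which HotSpot variant a particular launch method actually gives you is to print the java.vm.name property; this sketch assumes a HotSpot JVM, where the name typically contains "Client" or "Server":

```java
public class WhichVm {
    public static void main(String[] args) {
        // e.g. "Java HotSpot(TM) Client VM" when a jar association uses
        // the client VM, "Java HotSpot(TM) 64-Bit Server VM" otherwise.
        String vm = System.getProperty("java.vm.name");
        System.out.println("java.vm.name = " + vm);
        System.out.println("server VM? " + vm.contains("Server"));
    }
}
```

Running this once from the command line and once via a double-click would show directly whether the two launch paths pick different VMs.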

Linux identifier removed

I've encountered an interesting problem when running the following piece of Java code:
File.createTempFile("temp.cnt.ent", "cnt.feat.tmp", directory);
The following exception is thrown:
Exception in thread "main" java.io.IOException: Identifier removed
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.checkAndCreate(File.java:1704)
at java.io.File.createTempFile(File.java:1792)
I have never had this problem before and Google doesn't seem to have much for me. The system runs Scientific Linux release 5.8 (Linux 2.6.18-274.3.1.el5 x86_64) and the Java version is
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
The file system (Lustre) has 80TB of free space.
Any suggestions are greatly appreciated.
You are encountering synchronisation errors between the various instances. Lustre doesn't support file locking, which is probably what java.io.UnixFileSystem.createFileExclusively uses to avoid concurrency woes. (I say "probably" because it doesn't appear to be documented anywhere.)
Without locking it's only a matter of time until file operations interfere with each other. Reducing the number of instances is not a solution because it just makes it less likely to occur.
I believe the solution is to ensure that each instance creates files in a different sub-directory.
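A minimal sketch of that idea, using a random per-instance directory name (a UUID is just one convenient way to get a unique name; the process id would work too):

```java
import java.io.File;
import java.io.IOException;
import java.util.UUID;

public class PerInstanceTempDir {
    public static File createTempIn(File baseDir) throws IOException {
        // Give this JVM instance its own sub-directory, so the exclusive
        // create never races against other instances on a filesystem
        // (like Lustre) where file locking is unreliable.
        File instanceDir = new File(baseDir, "inst-" + UUID.randomUUID());
        if (!instanceDir.isDirectory() && !instanceDir.mkdirs()) {
            throw new IOException("Could not create " + instanceDir);
        }
        return File.createTempFile("temp.cnt.ent", "cnt.feat.tmp", instanceDir);
    }
}
```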
I guess that you see an EIDRM. At least the error message looks like that. The IOException wraps an error message from the underlying native libraries.
This is not a real answer to your problem, but maybe a useful hint.
http://docs.oracle.com/cd/E19455-01/806-1075/msgs-1432/index.html has some information and additional pointers.
The problem seems to be related to having too many instances of the application running at a time (each in a separate VM). For some unknown reason the OS refuses to allow the creation of a temp file. Workaround: run fewer instances.

How serious is the Java7 "Solr/Lucene" bug?

Apparently Java7 has some nasty bug regarding loop optimization: Google search.
From the reports and bug descriptions I find it hard to judge how significant this bug is (unless you use Solr or Lucene).
What I'd like to know:
How likely is it that my (any) program is affected?
Is the bug deterministic enough that normal testing will catch it?
Note: I can't make users of my program use -XX:-UseLoopPredicate to avoid the problem.
The problem with any HotSpot bug is that you need to reach the compilation threshold (e.g. 10,000 invocations) before it can get you: so if your unit tests are "trivial", you probably won't catch it.
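As a rough illustration of why trivial tests stay below the threshold, this sketch runs a tiny method well past the default server-VM threshold; launching it with -XX:+PrintCompilation (assuming default HotSpot settings) shows the method being compiled only after thousands of calls:

```java
public class WarmupDemo {
    // Trivial on purpose: only repeated invocation, not complexity,
    // makes HotSpot compile it.
    static int work(int x) {
        return x * 31 + 7;
    }

    public static void main(String[] args) {
        int acc = 0;
        // Run with -XX:+PrintCompilation and watch WarmupDemo::work
        // show up only once the invocation count crosses the threshold.
        for (int i = 0; i < 20000; i++) {
            acc = work(acc ^ i);
        }
        System.out.println("done: " + acc);
    }
}
```

A unit test that calls such a method only a handful of times never exercises the compiled (and potentially buggy) version at all.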
For example, we caught the incorrect results issue in lucene, because this particular test creates 20,000 document indexes.
In our tests we randomize different interfaces (e.g. different Directory implementations) and indexing parameters and such, and the test only fails 1% of the time; of course it's then reproducible with the same random seed. We also run checkindex on every index that the tests create, which does some sanity checks to ensure the index is not corrupt.
For the test we found, if you have a particular configuration: e.g. RAMDirectory + PulsingCodec + payloads stored for the field, then after it hits the compilation threshold, the enumeration loop over the postings returns incorrect calculations, in this case the number of returned documents for a term != the docFreq stored for the term.
We have a good number of stress tests, and it's important to note that the normal assertions in this test actually pass; it's the checkindex part at the end that fails.
The big problem with this, is that lucene's incremental indexing fundamentally works by merging multiple segments into one: because of this, if these enums calculate invalid data, this invalid data is then stored into the newly merged index: aka corruption.
I'd say this bug is much sneakier than previous loop optimizer HotSpot bugs we have hit (e.g. the sign-flip stuff, https://issues.apache.org/jira/browse/LUCENE-2975). In that case we got wacky negative document deltas, which made it easy to catch. We also only had to manually unroll a single method to dodge it. On the other hand, the only "test" we had initially for that was a huge 10GB index of http://www.pangaea.de/, so it was painful to narrow it down to this bug.
In this case, I spent a good amount of time (e.g. every night last week) trying to manually unroll/inline various things, trying to create some workaround so we could dodge the bug and not have the possibility of corrupt indexes being created. I could dodge some cases, but there were many more cases I couldn't... and I'm sure if we can trigger this stuff in our tests there are more cases out there...
Simple way to reproduce the bug: open Eclipse (Indigo in my case) and go to Help/Search. Enter a search string and you will notice that Eclipse crashes. Have a look at the log.
# Problematic frame:
# J org.apache.lucene.analysis.PorterStemmer.stem([CII)Z
#
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
#
--------------- T H R E A D ---------------
Current thread (0x0000000007b79000): JavaThread "Worker-46" [_thread_in_Java, id=264, stack(0x000000000f380000,0x000000000f480000)]
siginfo: ExceptionCode=0xc0000005, reading address 0x00000002f62bd80e
Registers:
The problem still exists as of Dec 2, 2012, in both the Oracle JDK
java -version
java version "1.7.0_09"
Java(TM) SE Runtime Environment (build 1.7.0_09-b05)
Java HotSpot(TM) 64-Bit Server VM (build 23.5-b02, mixed mode)
and OpenJDK
java version "1.7.0_09-icedtea"
OpenJDK Runtime Environment (fedora-2.3.3.fc17.1-x86_64)
OpenJDK 64-Bit Server VM (build 23.2-b09, mixed mode)
Strangely, either of the options
-XX:-UseLoopPredicate or -XX:LoopUnrollLimit=1
individually prevents the bug from happening, but when they are used together the JDK fails;
see e.g.
https://bugzilla.redhat.com/show_bug.cgi?id=849279
Well it's two years later and I believe this bug (or a variation of it) is still present in 1.7.0_25-b15 on OSX.
Through very painful trial and error I have determined that using Java 1.7 with Solr 3.6.2 and autocommit <maxTime>30000</maxTime> seems to cause index corruption. It only seems to happen with 1.7 and maxTime at 30000; if I switch to Java 1.6, I have no problems. If I lower maxTime to 3000, I have no problems.
The JVM does not crash, but it causes RSolr to die with the following stack trace in Ruby:
https://gist.github.com/armhold/6354416. It does this reliably after saving a few hundred records.
Given the many layers involved here (Ruby, Sunspot, Rsolr, etc) I'm not sure I can boil this down into something that definitively proves a JVM bug, but it sure feels like that's what's happening here. FWIW I have also tried JDK 1.7.0_04, and it also exhibits the problem.
As I understand it, this bug is only found in the server JVM. If you run your program on the client JVM, you are in the clear. If you run your program on the server JVM, how serious the problem can be depends on the program.

Why does Java Web Start not work with 64-bit Java environments?

Java Web Start does not come with 64-bit builds of the JDK. Why is this? What is lacking that keeps it from building and working?
Apparently, there is no reason, since it's in JRE 6u12: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4626735
Thought you'd might want to know the new update is out:
http://java.sun.com/javase/6/webnotes/6u12.html
64-Bit Browser Support for Java Plugin and Java Web Start
This release supports the new Java Plugin and Java Web Start on AMD64 architecture, on Windows platforms. A Java offline installer (JRE and JDK) is provided. Note: if you use 32-bit and 64-bit browsers interchangeably, you will need to install both 32-bit and 64-bit JREs in order to have the Java Plug-In for both browsers.
Mostly it's a lack of demand. You only really need a 64-bit version if you intend to use more than about 1200 MB of memory for a Web Start client. Otherwise it doesn't make much difference.
Do you know of any examples of a web start application which uses this much memory?
Yes, Java Web Start is not just used to start an app from a simple visit to the web via your browser; it is also used for JNLP deployment.
And applications that need more than 2-3 GB really do exist. Java Web Start did work in the past to start in 64-bit mode.
But now javaws no longer honors the -d64 flag given in the command-line parameters of the VM. What is even worse is that we are limited to about 247 MB even if we pass the -Xmx VM parameter (which is no longer honored either!).
Using JNLP applications is now impossible. We need full support of 64-bit mode (and a way to pass VM creation parameters). This is not just a limitation but a severe return to the old days, with the Java VM becoming extremely slow and swapping.
It looks like the Java documentation is wrong now, or the supported parameter for it has changed. If you have ever installed some tool that provides associations of JNLP with a command, it is possible that it has changed the mapping using an installation of the 32-bit-only version of Java Web Start in \windows\syswow64 (which the Java control panel will not detect and not update), while at the same time you have the latest update installed in \windows\system32, with a 64-bit javaws launcher that supports both 32-bit and 64-bit VMs.
To start Java in 64-bit mode when you have installed both 32-bit and 64-bit versions, you need to check the shortcut created on the desktop or start menu to see that it effectively uses the correct path in \windows\system32 and that the parameter "-J-d64" is present (as well as "-J-Xmx3048m" if you want to increase the maximum size of the VM). Otherwise your VM will be 32-bit and limited to 247 MB!
I hate those tools that install and change the Java installation somewhere else, without properly registering it in the Windows registry using a supported installation method (not just for their own use, but also trying to change associations of file types).
Anyway, there's a bug in the Java control panel for Windows if it does not help restore the file associations and does not detect that another JRE has been set up (most often an outdated version!). And the documentation still incorrectly states that we must use the VM parameter "-D64" when it should be "-d64" (the former just defines a system property with an unset value).
