Regular expression to parse a log file and find stacktraces

Regular expression to parse a log file and find stacktraces - java

I'm working with a legacy Java app that has no logging and just prints all information to the console. Most exceptions are also "handled" by just doing a printStackTrace() call.
In a nutshell, I've just redirected the System.out and System.error streams to a log file, and now I need to parse that log file. So far all good, but I'm having problems trying to parse the log file for stack traces.
Some of the code is obscufated as well, so I need to run the stacktraces through a utility app to de-obscufate them. I'm trying to automate all of this.
The closest I've come so far is to get the initial Exception line using this:
.+Exception[^\n]+
And finding the "at ..(..)" lines using:
(\t+\Qat \E.+\s+)+
But I can't figure out how to put them together to get the full stacktrace.
Basically, the log files looks something like the following. There is no fixed structure and the lines before and after stack traces are completely random:
Modem ERROR (AT
Owner: CoreTalk
) - TIMEOUT
IN []
Try Open: COM3
javax.comm.PortInUseException: Port currently owned by CoreTalk
at javax.comm.CommPortIdentifier.open(CommPortIdentifier.java:337)
...
at UniPort.modemService.run(modemService.java:103)
Handling file: C:\Program Files\BackBone Technologies\CoreTalk 2006\InputXML\notify
java.io.FileNotFoundException: C:\Program Files\BackBone Technologies\CoreTalk 2006\InputXML\notify (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
...
at com.gobackbone.Store.a.a.handle(Unknown Source)
at com.jniwrapper.win32.io.FileSystemWatcher.fireFileSystemEvent(FileSystemWatcher.java:223)
...
at java.lang.Thread.run(Unknown Source)
Load Additional Ports
... Lots of random stuff
IN []
[Fatal Error] .xml:6:114: The entity name must immediately follow the '&' in the entity reference.
org.xml.sax.SAXParseException: The entity name must immediately follow the '&' in the entity reference.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
...
at com.gobackbone.Store.a.a.run(Unknown Source)

Looks like you just need to paste them together (and use a newline as glue):
.+Exception[^\n]+\n(\t+\Qat \E.+\s+)+
But I would change your regex a bit:
^.+Exception[^\n]++(\s+at .++)+
This combines the whitespace between the at... lines and uses possessive quantifiers to avoid backtracking.

We have been using ANTLR to tackle the parsing of logfiles (in a different application area). It's not trivial but if this is a critical task for you it will be better than using regexes.

I get good results using
perl -n -e 'm/(Exception)|(\tat )/ && print' /var/log/jboss4.2/debian/server.log
It dumps all lines which have Exception or \tat in them. Since the match is in the same time the order is kept.

Related

Java Thread Dump - Negative Line Numbers

I was just trying to understand some blocked items from a thread dump:
"Thread-65" : 151 : RUNNABLE : cpu=36796875000 : cpuLoad= 0.29151857
BlockedCount:94117 BlockedTime:-1 LockName:null LockOwnerID:-1 LockOwnerName:null
WaitedCount:16 WaitedTime:-1 InNative:false IsSuspended:false at java.util.zip.ZipFile.open(ZipFile.java:-2)
at java.util.zip.ZipFile.(ZipFile.java:219)
at java.util.zip.ZipFile.(ZipFile.java:149)
at java.util.jar.JarFile.(JarFile.java:166)
at java.util.jar.JarFile.(JarFile.java:130)
at org.apache.tomcat.util.compat.JreCompat.jarFileNewInstance(JreCompat.java:188)
at org.apache.tomcat.util.compat.JreCompat.jarFileNewInstance(JreCompat.java:173)
at org.apache.catalina.webresources.AbstractArchiveResourceSet.openJarFile(AbstractArchiveResourceSet.java:316)
at org.apache.catalina.webresources.AbstractSingleArchiveResourceSet.getArchiveEntry(AbstractSingleArchiveResourceSet.java:96)
at org.apache.catalina.webresources.AbstractArchiveResourceSet.getResource(AbstractArchiveResourceSet.java:265)
at org.apache.catalina.webresources.StandardRoot.getResourceInternal(StandardRoot.java:281)
at org.apache.catalina.webresources.CachedResource.validateResource(CachedResource.java:97)
at org.apache.catalina.webresources.Cache.getResource(Cache.java:69)
at org.apache.catalina.webresources.StandardRoot.getResource(StandardRoot.java:216)
at org.apache.catalina.webresources.StandardRoot.getClassLoaderResource(StandardRoot.java:225)
at org.apache.catalina.loader.WebappClassLoaderBase.getResourceAsStream(WebappClassLoaderBase.java:1067)
at java.lang.Class.getResourceAsStream(Class.java:2223)
at bsh.BshClassManager.getResourceAsStream(null:-1)
at bsh.classpath.ClassManagerImpl.getResourceAsStream(null:-1)
at bsh.BshClassManager.loadSourceClass(null:-1)
at bsh.classpath.ClassManagerImpl.classForName(null:-1)
at bsh.NameSpace.classForName(null:-1)
at bsh.NameSpace.getImportedClassImpl(null:-1)
at bsh.NameSpace.getClassImpl(null:-1)
at bsh.NameSpace.getClass(null:-1)
at bsh.Name.consumeNextObjectField(null:-1)
at bsh.Name.toObject(null:-1)
at bsh.Name.toObject(null:-1)
at bsh.NameSpace.get(null:-1)
at bsh.Interpreter.get(null:-1)
at bsh.Interpreter.getu(null:-1)
at bsh.Interpreter.(null:-1)
at bsh.Interpreter.(null:-1)
at bsh.Interpreter.(null:-1)
What I a not getting is the negative line number. Does it mean that that the source cannot be found?

Those are all beanshell classes. I'm guessing they do that to keep it from being confusing.
I'm guessing that they are using this as a "Trick" to remove those lines in some conditions (such as when you are running in the beanshell repl).
You could use that same trick to remove those lines from your dump.
This possibly related question How to get the complete log for bean shell scripts in jmeter
Suggested adding the debug() directive to beanshell get a less limited stack trace--perhaps if you did that then it would fill in those bsh stack lines?

java.lang.UnsatisfiedLinkError when writing using crunch MemPipeline

I am using com.cloudera.crunch version: '0.3.0-3-cdh-5.2.1'.
I have a small program that reads some AVROs and filters out invalid data based on some criteria. I am using pipeline.write(PCollection, AvroFileTarget) to write the invalid data output. It works fine in production run.
For unit testing this piece of code, I use MemPipeline instance.
But, it fails while writing the output in that case.
I get error:
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(II[BI[BIILjava/lang/String;JZ)V
at org.apache.hadoop.util.NativeCrc32.nativeComputeChunkedSumsByteArray(Native Method)
at org.apache.hadoop.util.NativeCrc32.calculateChunkedSumsByteArray(NativeCrc32.java:86)
at org.apache.hadoop.util.DataChecksum.calculateChunkedSums(DataChecksum.java:428)
at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunks(FSOutputSummer.java:197)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:163)
at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:144)
at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:78)
at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:50)
at java.io.DataOutputStream.writeBytes(DataOutputStream.java:276)
at com.cloudera.crunch.impl.mem.MemPipeline.write(MemPipeline.java:159)
Any idea what's wrong?

Hadoop environment variable should be configured properly along with hadoop.dll and winutils.exe.
Also pass the JVM argument while executing MR job/application
-Djava.library.path=HADOOP_HOME/lib/native

Why doesn't printStackTrace work in Clojure?

In both the Joy of Clojure and on Alex Miller's Pure Danger Tech blog-post it is recommended that you can print the last stack using something like the following:
(use 'clojure.stacktrace)
(java.util.Date. "foo")
(.printStackTrace *e 5)
But I can't get any of their examples to work, and instead just get
java.lang.NullPointerException: null
Reflector.java:26 clojure.lang.Reflector.invokeInstanceMethod
(Unknown Source) jtown$eval9755.invoke
What's up with this? .printStackTrace seems to be a Java function from the looks of it, so I am not sure why I am bringing clojure.stacktrace into my namespace, in the first place. I read through the clojure.stacktrace API, though, and see an e function, which seems similar too but is not the *e function, which is in core and is supposed to be binding to the last exception, but isn't. Could somebody straighten me out on the best way to check stack-traces?

There are some special vars available when using the REPL and
*e - holds the result of the last exception.
For instance:
core=> (java.util.Date. "foo")
IllegalArgumentException java.util.Date.parse (Date.java:615)
core=> (class *e)
java.lang.IllegalArgumentException
core=> (.printStackTrace *e)
java.lang.IllegalArgumentException
at java.util.Date.parse(Date.java:615)
<not included.....>
You are right, .printStackTrace is the java method that is invoked on the exception class. This is not very straightforward (since its java interop) so clojure.stacktrace namespace has some utilities about working with stack traces
So after
(use 'clojure.stacktrace)
you can use the stacktrace library instead of java interop:
core=> (print-stack-trace *e)
java.lang.IllegalArgumentException: null
at java.util.Date.parse (Date.java:615)
<not included.....>
Obviously in an app, instead of *e, you can do a try - catch and use the related functions as necessary

I use
(.printStackTrace *e *out*)
That seems to work.

YUICompressor crashes - stackoverflow error

I frequently get what appears to be a stackoverflow error ;-) from YUICompressor. The following is the first part of thousands of error lines that come from attempting to compress a 24074 byte css stylesheet (not the "Caused by java.lang.StackOverflowError about 8 lines down):
iMac1:src jas$ min ../style2.min.css style2.css
Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at com.yahoo.platform.yui.compressor.Bootstrap.main(Bootstrap.java:21)
Caused by: java.lang.StackOverflowError
at java.lang.Character.codePointAt(Character.java:2335)
at java.util.regex.Pattern$CharProperty.match(Pattern.java:3344)
at java.util.regex.Pattern$Branch.match(Pattern.java:4114)
... (plus 1021 more error lines)
The errors happen usually after adding a couple of lines to the file getting compressed. The css is fine, and works perfectly in the uncompressed format. I don't see a particular pattern to the types of selectors added to the file that cause the errors. In this case, adding the following selector to a previously compressible file resulted in the errors:
#thisisatest
{
margin-left:87px;
}
I am wondering if there is perhaps a flag to java to enlarge the stack that might help. Or if that is not the problem, what is?
EDIT:
As I was posting this question, it dawned on me that I should check the java command to see if there was a parameter to enlarge the stack. Turns out that it is -Xssn, where "n" is a parameter to indicate the stack size. Its default value is 512k. So I tried 1024k but that still led to the stackoverflow. Trying 2048k works however, and I think this could be the solution.
EDIT 2:
While I no longer use this method for minification any longer, to be more specific here is the full command (which I have set up as a shell alias), showing how the -Xss2048k parameter is used:
java -Xss2048k -jar ~/Documents/RepHunter/Website\ Materials/Code/Third\ Party\ Libraries/YUI\ Compressor/yuicompressor-2.4.8.jar --type css -o

As posted in my edit, the solution was to add the parameters to the java command. The clue was the error line at the 5-th "at" line, as follows:
at com.yahoo.platform.yui.compressor.Bootstrap.main(Bootstrap.java:21)
Caused by: java.lang.StackOverflowError
Seeing that the issue was a "StackOverlowError" ;-) gave the suggestion to try to increase the stack size. The default is 512k. My first try of 1024k did not work. However increasing it to 2048k did work, and I have had no further issues.

java.lang.ArrayIndexOutOfBoundsException when creating a connection to an Oracle database

It appears that Oracle's java client has a bug - if the tnsnames.ora file has misplaced spaces/tabs/new-lines in particular places, you get an exception with the following trace:
java.lang.ArrayIndexOutOfBoundsException: <some number>
at oracle.net.nl.NVTokens.parseTokens(Unknown Source)
at oracle.net.nl.NVFactory.createNVPair(Unknown Source)
at oracle.net.nl.NLParamParser.addNLPListElement(Unknown Source)
at oracle.net.nl.NLParamParser.initializeNlpa(Unknown Source)
at oracle.net.nl.NLParamParser.<init>(Unknown Source)
at oracle.net.resolver.TNSNamesNamingAdapter.loadFile(Unknown Source)
at oracle.net.resolver.TNSNamesNamingAdapter.checkAndReload(Unknown Source)
at oracle.net.resolver.TNSNamesNamingAdapter.resolve(Unknown Source)
at oracle.net.resolver.NameResolver.resolveName(Unknown Source)
at oracle.net.resolver.AddrResolution.resolveAndExecute(Unknown Source)
at oracle.net.ns.NSProtocol.establishConnection(Unknown Source)
at oracle.net.ns.NSProtocol.connect(Unknown Source)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1037)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:282)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:468)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:165)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:35)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:839)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
If you take a C++ application and try to connect with it to the database with the same tnsnames.ora in use - it works fine. Same goes for sqlplus. Also tnsping which should parse this file has no problem resolving any service name. Seems like Oracle were too lazy to .trim() the values or something - and it is the same problem with Oracle client versions 9, 10 and 11.
Any idea why this problem exists and what is the exact problem with the tnsnames.ora format? (I just remove all white-spaces to resolve it)

I tried the advice from GriffeyDog but unfortunately it did not solve the problem - so eventually I too the check for your self approach:
Oracle's documentation states that the structure of a record in the tnsnames.ora file should be as such:
net_service_name=
(DESCRIPTION=
(ADDRESS=...)
(ADDRESS=...)
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)))
Ours was:
net_service_name=
(DESCRIPTION=
(ADDRESS=...)
(ADDRESS=...)
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com)))
Apparently the indentation is crucial - if any of the lines in the block of a single net_service_name start at index 1 - this exception is thrown.
Only once you add indentation to all (can be spaces or tab) - it works. It doesn't have to look good, but has to have an offset of some sort.
Important note - the only problem is with '(', indentation rules don't apply to ')'.
E.g. the below example is perfectly fine:
net_service_name=
(DESCRIPTION=
(ADDRESS=...
)
(ADDRESS=...
)
(CONNECT_DATA=
(SERVICE_NAME=sales.us.example.com))
)
After searching for this issue to be documented - I finally found out that indeed it is documented at http://download.oracle.com/docs/cd/A57673_01/DOC/net/doc/NWUS233/apb.htm
And here is the important excerpt:
Even if you do not choose to indent your files in this way, you must indent a wrapped line by at least one space, or it will be misread as a new parameter. The following layout is acceptable:
(ADDRESS=(COMMUNITY=tcpcom.world)(PROTOCOL=tcp)
(HOST=max.world)(PORT=1521))
The following layout is not acceptable:
(ADDRESS=(COMMUNITY=tcpcom.world)(PROTOCOL=tcp)
(HOST=max.world)(PORT=1521))

I've seen similar issues arise when a text file is saved with Unix-style end-of-lines (LF) versus DOS/Windows-style (CR/LF), or vice-versa. You might try opening your tnsnames.ora file with an editor that supports saving in both formats to see if you can correct the problem that way.

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Regular expression to parse a log file and find stacktraces - java

Looks like you just need to paste them together (and use a newline as glue): .+Exception[^\n]+\n(\t+\Qat \E.+\s+)+ But I would change your regex a bit: ^.+Exception[^\n]++(\s+at .++)+ This combines the whitespace between the at... lines and uses possessive quantifiers to avoid backtracking.

We have been using ANTLR to tackle the parsing of logfiles (in a different application area). It's not trivial but if this is a critical task for you it will be better than using regexes.

I get good results using perl -n -e 'm/(Exception)|(\tat )/ && print' /var/log/jboss4.2/debian/server.log It dumps all lines which have Exception or \tat in them. Since the match is in the same time the order is kept.

Related

Java Thread Dump - Negative Line Numbers

java.lang.UnsatisfiedLinkError when writing using crunch MemPipeline

Why doesn't printStackTrace work in Clojure?

YUICompressor crashes - stackoverflow error

java.lang.ArrayIndexOutOfBoundsException when creating a connection to an Oracle database

Categories

Resources