Compiling ISO SQL-2003 ANTLR Grammar - java

I am trying to compile the ISO-SQL 2003 grammar from here
http://www.antlr3.org/grammar/1304304798093/SQL2003_Grammar.zip. All three versions of it can be found here http://www.antlr3.org/grammar/list.html.
These are the steps I followed,
java -jar antlr-3.3-complete.jar -Xmx8G -Xwatchconversion sql2003Lexer.g
java -jar antlr-3.3-complete.jar -Xmx8G -Xwatchconversion sql2003Parser.g
javac ANTLRDemo.java
ANTLRDemo.java file:
import org.antlr.runtime.*;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class ANTLRDemo {
    static String readFile(String path) throws IOException
    {
        byte[] encoded = Files.readAllBytes(Paths.get(path));
        return new String(encoded, "UTF-8");
    }
    public static void main(String[] args) throws Exception {
        ANTLRStringStream in = new ANTLRStringStream( readFile(args[0]) );
        sql2003Lexer lexer = new sql2003Lexer(in);
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        sql2003Parser parser = new sql2003Parser(tokens);
        parser.eval();
    }
}
First two steps work fine, then while compiling my main class I get a lot of errors related to Java syntax like these:
./sql2003Parser.java:96985: error: not a statement
$UnsignedInteger.text == '1'
./sql2003Parser.java:96985: error: ';' expected
$UnsignedInteger.text == '1'
./sql2003Parser.java:102659: error: unclosed character literal
if ( !(((Unsigned_Integer3887!=null?Unsigned_Integer3887.getText():null) == '01')) ) {
Please let me know if I am doing something wrong in setting up the parser. It would be helpful if someone could show me exactly how to set up this grammar using ANTLR.
Edit: After a little more fiddling, I think that these errors are caused by the actions present in lexer and parser rules. Is there a safe way to overcome this?

You are not doing anything wrong. ANTLR has never been able to generate a working Java parser from these grammar files.
According to a post by Douglas Godfrey to antlr-interest in Oct 2011:
I generated a C parser and lexer. they both generate and compile successfully on my machine with 8GB heap allocated to Antlr.
...
I don't believe that it will ever be possible to get a working parser in Java. A C language parser on the other hand is quite possible.

Yes, basically you’re right: the grammar is broken. But there is also an error in your ANTLRDemo.java, as there’s no eval() method in the Parser class. You should call the method named after one of the rules of the parser grammar, e.g. query_specification(). The grammar itself had some errors that look like typos, some calls to an undefined Java error() method, and skip() calls in the parser that are only valid in a lexer. You can see all the fixes in this commit. I’ve published my research in this GitHub repository.
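The compile errors quoted in the question (== '1' and == '01') come from actions that use single-quoted text and ==, neither of which works for string comparison in Java. A minimal sketch of the Java form, assuming those actions are meant to test whether the token text equals a literal; TokenTextPredicates and textEquals are made-up names for illustration, not part of the grammar or of ANTLR:

```java
// Hypothetical helper for illustration; not part of the generated parser.
final class TokenTextPredicates {
    private TokenTextPredicates() {}

    /** Null-safe test that a token's text equals the expected literal. */
    static boolean textEquals(String tokenText, String expected) {
        // Java string literals use double quotes, and equals() compares
        // contents; == would only compare object references.
        return expected.equals(tokenText);
    }

    public static void main(String[] args) {
        System.out.println(textEquals("1", "1"));
        System.out.println(textEquals("01", "1"));
        System.out.println(textEquals(null, "1"));
    }
}
```

Calling equals() on the literal rather than on the token text keeps the check safe when getText() yields null, which the generated code in the question explicitly allows for.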
I started by fixing the obvious errors in the grammar, which led to the compilation errors in the generated Java code. I had the same errors that you posted. Eventually I fixed all the Java syntax errors but ran into another one that is impossible to fix directly because it originates from a limitation of the JVM: the compilation error "code too large". On the ANTLR mailing list there was a hint to extract some static members of the huge classes into separate interfaces and “implement” them to get a sort of multiple inheritance. By trial and error I ended up with 6 interfaces “implemented” by the parser in sql2003Parser.java.
But there are still 2 problems:
Wrong start rule. Douglas Godfrey wrote the grammar so that it starts with the sql2003Parser rule. Unfortunately, if you call the parser with this start rule, it won’t correctly parse even the simplest select a from b. So I call the parser with the query_specification rule to parse the SELECT clause only.
Some other errors in the grammar. I didn’t dig too deep into the grammar, but query_specification fails to parse some random complex SQL statements.
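The interface workaround for "code too large" described above can be sketched roughly as follows. javac reports that error when a single method, including a class's static initializer, exceeds the JVM's 64 KB bytecode limit. All names here (FollowSetsPart1, FollowSetsPart2, SplitConstantsParser and the FOLLOW_ fields) are invented for illustration and are not taken from the actual repository:

```java
// Hypothetical names throughout; the real generated parser is far larger.

// Each interface owns a slice of the constants, so each slice is set up in
// that interface's own static initializer rather than in the parser's.
interface FollowSetsPart1 {
    long[] FOLLOW_select_list = {0x0000000000000002L};
    long[] FOLLOW_table_expression = {0x0000000000000010L};
}

interface FollowSetsPart2 {
    long[] FOLLOW_where_clause = {0x0000000000000400L};
}

// Implementing the interfaces makes their constants visible by simple name,
// so code that refers to them does not have to change.
final class SplitConstantsParser implements FollowSetsPart1, FollowSetsPart2 {
    static long firstWord(long[] set) {
        return set[0];
    }

    public static void main(String[] args) {
        System.out.println(firstWord(FOLLOW_select_list));
        System.out.println(firstWord(FOLLOW_where_clause));
    }
}
```

Interface fields are implicitly public static final, which is why this only works for static members and not for instance state.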

Related

InOut.readInt() only working in Windows Java Editor

At school, I write Java programs with Windows’ Java Editor (console mode). There, InOut.readInt() (used for user input) works without problems and I don’t have to import anything.
Now, I have a Java homework for the holidays, and I try to write Java programs on my Mac. In online console Java editors, the line InOut.readInt() causes this error:
/IntBetween10And100.java:8: error: cannot find symbol
int input = InOut.readInt("Integer --> ");
^
symbol: variable InOut
location: class IntBetween10And100
1 error
I already tried import lines (placed before the class) like:
import java.*
import java.util.*
import java.util.InOut
import java.io.BufferedStreamReader
import java.util.*;
public class IntBetween10And100 {
    public static void main(String args[]) {
        int input = InOut.readInt("Integer --> ");
    }
}
int input = InOut.readInt("Integer --> ");
should produce the line
Integer -->
but instead, the error message (seen above) appears.
OK, so you are using the "Java-Editor" tool on Windows for writing your Java code.
It turns out that Java-Editor includes a class called InOut as an example class. (You can see it here: http://javaeditor.org/doku.php?id=en:examples).
For various reasons, it is not suitable for use in production code of any kind:
It is not part of the Java SE class library, or any 3rd-party libraries.
It is a class in the default package.
It has limited functionality, even compared to the real System.in and System.out
It would interfere with any application or 3rd party library code that uses System.in in the normal way. (It creates its own BufferedReader to wrap System.in. That is liable to capture "type-ahead" input.)
You don't really need to use it for educational purposes either. It is only a wrapper class ...
However, if you want to use InOut outside of the Java-Editor context, you could simply download the source code from the page above and add it to your project. I can't tell you exactly how, but adding classes should be explained in the documentation of the tool you are using now! (If you are serious about learning Java, I would actually recommend that you download and install a real Java JDK and a real Java IDE on your own computer.)
The authors have neglected to include an explicit copyright notice in the InOut.java file. However, the Java-Editor site as a whole is licensed as "CC Attribution-Share Alike 4.0 International".
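If you would rather not depend on InOut at all, the prompt-and-read behaviour can be approximated with the standard library. This is only a sketch under assumptions: ConsoleInput is a made-up class, it is not the InOut class shipped with Java-Editor, and the retry-on-bad-input behaviour is my own choice rather than something InOut is known to do.

```java
import java.util.Scanner;

// Hypothetical stand-in; not the InOut class shipped with Java-Editor.
final class ConsoleInput {
    private ConsoleInput() {}

    /** Prints the prompt, then reads lines until one parses as an int. */
    static int readInt(String prompt, Scanner in) {
        while (true) {
            System.out.print(prompt);
            // nextLine() throws NoSuchElementException once input is exhausted.
            String line = in.nextLine();
            try {
                return Integer.parseInt(line.trim());
            } catch (NumberFormatException e) {
                System.out.println("Not an integer, please try again.");
            }
        }
    }

    public static void main(String[] args) {
        Scanner stdin = new Scanner(System.in);
        int input = readInt("Integer --> ", stdin);
        System.out.println("You entered " + input);
    }
}
```

Passing the Scanner in as a parameter means the program creates a single reader for System.in and reuses it for every read.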

Using ANTLR for static analysis of Java source file

Does anyone have a complete implementation (possibly on GitHub or Google Code) that uses an ANTLR grammar file and Java source code to analyze Java source? For example, I want to simply be able to count the number of variables, methods, etc.
Ideally it would use a recent version of ANTLR.
I thought I'd take a crack at this over my lunch break. This may not completely solve your problem, but it might give you a place to start. The example assumes you're doing everything in the same directory.
Download the ANTLR source from GitHub. The pre-compiled "complete" JAR from the ANTLR site contains a known bug. The GitHub repo has the fix.
Extract the ANTLR tarball.
% tar xzf antlr-antlr3-release-3.4-150-g8312471.tar.gz
Build the ANTLR "complete" JAR.
% cd antlr-antlr3-8312471
% mvn -N install
% mvn -Dmaven.test.skip=true
% mvn -Dmaven.test.skip=true package assembly:assembly
% cd -
Download a Java grammar. There are others, but I know this one works.
Compile the grammar to Java source.
% mkdir -p com/habelitz/jsobjectizer/unmarshaller/antlrbridge/generated
% mv *.g com/habelitz/jsobjectizer/unmarshaller/antlrbridge/generated
% java -classpath antlr-antlr3-8312471/target/antlr-master-3.4.1-SNAPSHOT-completejar.jar org.antlr.Tool -o com/habelitz/jsobjectizer/unmarshaller/antlrbridge/generated Java.g
Compile the Java source.
% javac -classpath antlr-antlr3-8312471/target/antlr-master-3.4.1-SNAPSHOT-completejar.jar com/habelitz/jsobjectizer/unmarshaller/antlrbridge/generated/*.java
Add the following source file, Main.java.
import java.io.IOException;
import java.util.List;
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import com.habelitz.jsobjectizer.unmarshaller.antlrbridge.generated.*;
public class Main {
    // Usage: java Main <TOKEN_TYPE_NAME> <source file>
    public static void main(String... args) throws NoSuchFieldException, IllegalAccessException, IOException, RecognitionException {
        JavaLexer lexer = new JavaLexer(new ANTLRFileStream(args[1], "UTF-8"));
        JavaParser parser = new JavaParser(new CommonTokenStream(lexer));
        CommonTree tree = (CommonTree)(parser.javaSource().getTree());
        // Look up the token type constant (e.g. VAR_DECLARATOR) by name via reflection.
        int type = JavaParser.class.getDeclaredField(args[0]).getInt(null);
        System.out.println(count(tree, type));
    }
    // Recursively counts the nodes of the given type in the tree.
    private static int count(CommonTree tree, int type) {
        int count = 0;
        List<?> children = tree.getChildren();
        if (children != null) {
            for (Object child : children) {
                count += count((CommonTree)(child), type);
            }
        }
        return ((tree.getType() != type) ? count : count + 1);
    }
}
Compile.
% javac -classpath .:antlr-antlr3-8312471/target/antlr-master-3.4.1-SNAPSHOT-completejar.jar Main.java
Select a type of Java source that you want to count; for example, VAR_DECLARATOR, FUNCTION_METHOD_DECL, or VOID_METHOD_DECL.
% cat com/habelitz/jsobjectizer/unmarshaller/antlrbridge/generated/Java.tokens
Run on any file, including the recently created Main.java.
% java -classpath .:antlr-antlr3-8312471/target/antlr-master-3.4.1-SNAPSHOT-completejar.jar Main VAR_DECLARATOR Main.java
6
This is imperfect, of course. If you look closely, you may have noticed that the local variable of the enhanced for statement wasn't counted. For that, you'd need to use the type FOR_EACH, rather than VAR_DECLARATOR.
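Main.java counts one node type per run. If you want all the counts at once, the same recursive walk can fill a map instead. The sketch below uses a made-up Node class in place of CommonTree so that it runs without the ANTLR JAR; NodeTypeTally, Node and tally are hypothetical names. With the real tree you would walk getChildren() and key on getType(), as Main.java does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class NodeTypeTally {
    /** Hypothetical stand-in for an AST node; ANTLR's CommonTree is not used here. */
    static final class Node {
        final String type;
        final List<Node> children = new ArrayList<>();

        Node(String type, Node... kids) {
            this.type = type;
            for (Node kid : kids) {
                children.add(kid);
            }
        }
    }

    /** Walks the tree once and counts how many nodes carry each type name. */
    static Map<String, Integer> tally(Node root) {
        Map<String, Integer> counts = new TreeMap<>();
        collect(root, counts);
        return counts;
    }

    private static void collect(Node node, Map<String, Integer> counts) {
        counts.merge(node.type, 1, Integer::sum);
        for (Node child : node.children) {
            collect(child, counts);
        }
    }

    public static void main(String[] args) {
        Node tree = new Node("CLASS",
            new Node("VAR_DECLARATOR"),
            new Node("VOID_METHOD_DECL", new Node("VAR_DECLARATOR")));
        System.out.println(tally(tree));
    }
}
```

The same limitation applies as before: this counts declarations by node type and does nothing to resolve references.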
You'll need a good understanding of the elements of Java source, and be able to take reasonable guesses at how those match to the definitions of this particular grammar. You also won't be able to do counts of references. Declarations are easy, but counting uses of a field, for example, requires reference resolution. Does p.C.f refer to a static field f of a class C inside a package p, or does it refer to an instance field f of the object stored by a static field C of a class p? Basic parsers don't resolve references for languages as complex as Java, because the general case can be very difficult. If you want this level of control, you'll need to use a compiler (or something closer to it). The Eclipse compiler is a popular choice.
I should also mention that you have other options besides ANTLR. JavaCC is another parser generator. The static analysis tool PMD, which uses JavaCC as its parser generator, allows you to write custom rules that could be used for the kinds of counts you indicated.

Issues of using "SequenceFilesFromDirectory" in my code

I am trying to write a sample program that calls the main method of "SequenceFilesFromDirectory", which converts a set of files into sequence file format.
public class TestSequenceFileConverter {
    public static void main(String args[]) {
        String inputDir = "inputDir";
        String outputDir = "outoutDir";
        SequenceFilesFromDirectory.main(new String[] {"--input",
            inputDir.toString(), "--output", outputDir.toString(), "--chunkSize",
            "64", "--charset", Charsets.UTF_8.name()});
    }
}
But Eclipse tells me that what I did was wrong, with the following error message:
Multiple markers at this line
- Syntax error on token "main", = expected after this
token
- Syntax error on token(s), misplaced construct(s)
- SequenceFilesFromDirectory cannot be resolved
I think I did not use this method correctly, but I don't know how to fix it. Thanks a lot.
The API documentation for SequenceFilesFromDirectory, which shows how it is defined, is at http://search-lucene.com/jd/mahout/utils/org/apache/mahout/text/SequenceFilesFromDirectory.html
My guess is that you're missing an import line from the first section of your file:
import org.apache.mahout.text.SequenceFilesFromDirectory;
I think your purpose in using SequenceFilesFromDirectory is to convert doc files to sequence files. If so, it is better to call the run()/runSequential()/runMapReduce() methods after creating a SequenceFilesFromDirectory object, because SequenceFilesFromDirectory.main() internally calls Hadoop's ToolRunner.run() method for processing,
whereas the run methods of SequenceFilesFromDirectory do the actual processing.

jpype+pdfbox class not found

I'm attempting to use JPype to call Apache Pdfbox from Python, and am having some difficulty actually importing the classes. It doesn't seem to be able to read them from the jar file in the class path.
from jpype import java, startJVM, shutdownJVM, JPackage, JClass, getDefaultJVMPath, nio
import sys, os, codecs
pdfbox_lib = "lib/pdfbox-1.6.0.jar"
classpath = '-Djava.class.path=' + pdfbox_lib + os.pathsep + '.'
startJVM(getDefaultJVMPath(), '-Xmx512m', classpath)
stream = java.io.FileInputStream(java.io.File("test.pdf"))
pdfparser = JPackage('org.apache.pdfbox.pdfparser')
parser = JClass('org.apache.pdfbox.pdfparser.PDFParser')
At this point, the script errors out with the following:
java.lang.ExceptionPyRaisable: java.lang.Exception: Class org.apache.pdfbox.pdfparser.PDFParser not found
I'm running on Linux with Python 2.7, and I know there's nothing wrong with the JPype installation (if there were, the stream declaration would error out). I've also tried various permutations of the class path statement and the JPackage/JClass statements, and nothing seems to matter. Any suggestions would be greatly appreciated!
I figured it out. Three additional jars need to be added to the class path: fontbox-x.x.x.jar, jempbox-x.x.x.jar, and commons-logging.jar.

Weird error with Locale.getISOCountries()

I'm using this code:
for (final String code : Locale.getISOCountries())
{
    //stuff here
}
But on compile I get this error:
[ERROR] Line 21: No source code is available for type java.util.Locale; did you forget to inherit a required module?
And then a stack trace of compiler errors.
I'm doing both of these imports at the beginning of the class:
package com.me.example;
import java.util.Locale;
import java.util.*;
What can be wrong?
In NetBeans I see the autocomplete options and no syntax error for the Locale object...
Something is screwy with your setup; the following program works fine for me.
import java.util.*;
import java.util.Locale;
public class Donors {
    public static void main (String [] args) {
        for (final String code : Locale.getISOCountries()) {
            System.out.println (code);
        }
    }
}
The fact that it's asking for source code leads me to believe that it's trying to compile or run it in some sort of debugging mode. You shouldn't need the source code for java.util.* to compile, that's just bizarre.
See if my simple test program works in your environment, then try looking for something along those lines (debugging options). Final step: compile your code with the baseline javac (not NetBeans).
UPDATE:
Actually, I have found something. If you are creating GWT applications, I don't think java.util.Locale is available on the client side (only the server side). All of the references on the web to this error message point to GWT and its limitations on the client side, where the code is, after all, converted to JavaScript goodies and so cannot be expected to support the entire set of Java libraries.
This page here shows how to do i18n on GWT apps and there's no mention of java.util.Locale except on the server side.
Looks like there might be something fishy in your build environment, as Locale.getISOCountries() should work just fine. Try compiling a small test program manually and see if you get the same error.
Definitely try to boil this down to a minimum, three-line program (or so), compile from the command-line, then put that class into your IDE and see if you still get the error, and if not, then change/add one line at a time until you have the original failing program, looking for what causes the problem. I'm thinking maybe some other import in your code is importing a Locale class? Why in the world would it be looking for source code?
See what happens when you compile this from the command-line:
import java.util.*;
public class LocaleTest {
    public static void main(String[] args) {
        Locale.getISOCountries();
    }
}
