Parsing / reading C-Header files using Java

I have a C-Header file defining a couple of structs, each containing multiple char arrays.
I'd like to parse these files using Java. Is there a library for reading C-Header files either into a structure or is there a stream parser that understands C-Header files?
Just for more background (I'm just looking for a C-Header parser, not a solution for this particular problem):
I have a text file containing data and a C-Header file explaining the structure. Both are a bit dynamic, so I don't want to generate Java class files.
example:
#ifndef TYPE1
#define TYPE1
typedef struct type1
{
char name1[10];
char name2[5];
} type1;
#endif
Type2, Type3 etc are similar.
Data structure:
type1ffffffffffaaaaa

You can use an existing C parser for Java. It does a lot more than parsing header files, of course, but that shouldn't hurt you.
We use the parser from the Eclipse CDT project. This is an Eclipse plugin, but we successfully use it outside of Eclipse; we just have to bundle 3 JAR files of Eclipse with the parser JAR.
To use the CDT parser, start with an implementation of org.eclipse.cdt.core.model.ILanguage, for example org.eclipse.cdt.core.dom.ast.gnu.c.GCCLanguage. You can call getTranslationUnit on it, passing the code and some helper stuff. A code file is represented by a org.eclipse.cdt.core.parser.FileContent instance (at least in CDT7, this seems to change a lot). The easiest way to create such an object is FileContent.createForExternalFileLocation(filename) or FileContent.create(filename, content). This way you don't need to care about the Eclipse IFile stuff, which seems to work only within projects and workspaces.
The IASTTranslationUnit you get back represents the whole AST of the file. All the nodes therein are instances of IASTSomething types, for example IASTDeclaration etc. You can implement your own subclass of org.eclipse.cdt.core.dom.ast.ASTVisitor to iterate through the AST using the visitor pattern. If you need further help, just ask.
The JAR files we use are org.eclipse.cdt.core.jar, org.eclipse.core.resources.jar, org.eclipse.equinox.common.jar, and org.eclipse.osgi.jar.
Edit: I had found a paper which contains source code snippets for this:
"Using the Eclipse C/C++ Development Tooling as a Robust, Fully Functional, Actively Maintained, Open Source C++ Parser", but it is no longer available online (only as a shortened version).

Example using Eclipse CDT with only 2 jars.
- https://github.com/ricardojlrufino/eclipse-cdt-standalone-astparser
The example contains one class that displays the structure of the source file as a tree, and another that shows how to interact with the API.
One useful detail is that with this API (the Eclipse CDT parser) you can parse from a string held in memory.
Another example of usage is:
https://github.com/ricardojlrufino/cplus-libparser
Library for metadata extraction (information about classes, methods, variables) of source code in C / C ++.
See file:
https://github.com/ricardojlrufino/cplus-libparser/blob/master/src/main/java/br/com/criativasoft/cpluslibparser/SourceParser.java

As mentioned already, CDT is perfect for this task. But unlike what is described above, I used it from within a plugin and was able to use IFiles, which makes everything much easier. To get the "ITranslationUnit" just do:
ITranslationUnit tu = (ITranslationUnit) CoreModel.getDefault().create(myIFile);
IASTTranslationUnit ias = tu.getAST();
For example, I was looking for a particular #define, so I could just call:
IASTPreprocessorStatement[] ppc = ias.getAllPreprocessorStatements();
This returns all preprocessor statements, one statement per array element. Perfectly easy.

You can try to use ANTLR. There should be already some existing C grammar available for it.
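If the headers stay as simple as the example in the question, a full C grammar may be more than is needed. The following is only a minimal sketch under that assumption: it recognises nothing but `char name[N];` fields (no nested structs, macros, or other types) and slices a fixed-width record accordingly. The class and method names (SimpleStructReader, parseFields, readRecord) are hypothetical and not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SimpleStructReader {
    // Matches field declarations of the form: char name[10];
    private static final Pattern CHAR_ARRAY =
            Pattern.compile("char\\s+(\\w+)\\s*\\[\\s*(\\d+)\\s*\\]\\s*;");

    /** Returns field name -> declared length, in declaration order. */
    static Map<String, Integer> parseFields(String header) {
        Map<String, Integer> fields = new LinkedHashMap<>();
        Matcher m = CHAR_ARRAY.matcher(header);
        while (m.find()) {
            fields.put(m.group(1), Integer.parseInt(m.group(2)));
        }
        return fields;
    }

    /** Slices a fixed-width record into named values, starting at the given offset. */
    static Map<String, String> readRecord(Map<String, Integer> fields, String record, int offset) {
        Map<String, String> values = new LinkedHashMap<>();
        int pos = offset;
        for (Map.Entry<String, Integer> field : fields.entrySet()) {
            int end = pos + field.getValue();
            values.put(field.getKey(), record.substring(pos, end));
            pos = end;
        }
        return values;
    }

    public static void main(String[] args) {
        String header = "typedef struct type1\n{\nchar name1[10];\nchar name2[5];\n}";
        Map<String, Integer> fields = parseFields(header);
        // Assumption: the record begins with the 5-character type tag "type1".
        System.out.println(readRecord(fields, "type1ffffffffffaaaaa", 5));
    }
}
```

Anything beyond flat char arrays (preprocessor conditionals, typedef chains, padding rules) is where a real parser such as CDT or an ANTLR grammar starts to pay off.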

Related

Efficient java library for text templating?

I've got a simple string coming in from a UI component as The device id is %{test}. Assume %{test} is a dynamic variable and the values for it are being assigned from the backend code. The final string should look like:
The device id is some text
Here "some text" is the value that replaces %{test} in the final string.
I've read a bit and tried out some of the libraries which were pointed out here, such as Velocity and FreeMarker. But I don't know how those libraries compare in terms of efficiency and performance.
Hope I could get some insights on this since I'm pretty new to this. Any help could be appreciated.

I suggest you take a look at Arco Template Engine: it compiles the template at compile time, producing a .java (or .class) file, so the expansion at run time is very fast.
The templates should be coded in JSP format. Thus, all variable references must be written ${variable} (not %{variable}).
The only thing to take into account is that templates must be statically defined (so that they can be processed at compile time).
(Read the FAQ and the examples).
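For a single placeholder like the one in the question, a template engine may not be needed at all. As a rough sketch using only the JDK (this is not Arco, Velocity, or FreeMarker), the %{name} syntax can be expanded with java.util.regex. The class name PlaceholderTemplate and the choice to leave unknown variables untouched are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PlaceholderTemplate {
    // Matches %{name} and captures the variable name.
    private static final Pattern PLACEHOLDER = Pattern.compile("%\\{(\\w+)\\}");

    static String render(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Unknown variables are left as-is instead of becoming "null".
            String value = values.getOrDefault(m.group(1), m.group());
            // quoteReplacement keeps '$' and '\' in the value from being treated as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = Collections.singletonMap("test", "some text");
        // prints: The device id is some text
        System.out.println(render("The device id is %{test}", values));
    }
}
```

The pattern is compiled once and reused, so each call costs a single pass over the template string.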

How to use header file (.h) with Android

I've read a lot of posts about this topic, but I'm still not sure whether I can use this type of file with Android Java classes.
I would like to load a big array of floats to use OpenGL to show a 3D model.
To do this, I've a .h file that contains the different arrays. This file is like this:
unsigned int numVerts= 123456;
float verts[] = {
//a lot of data here
...
}
So, is there any way to load this array into an Android class?

You can't do this directly. .h files are part of the C language and cannot be used in Java.
There are two real possibilities:
If you need to use the constants defined in the .h file, you can create a Java file that defines similar constants. In fact it is pretty easy, using your favorite scripting language, to do this automatically.
Use the NDK. If you need more than the constants -- you are going to call OpenGL functions directly or some such thing -- then you can write your code in C, refer to the .h file, and call the C functions you define from your Java code.
The guy that downvoted this answer insists that there is a third possibility, which I include only for completeness. You could, from your Android program, read in the .h file, like any other file, and parse it, to get the information that you want. Doing that would be completely crazy.
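The first option (generating a Java file from the header automatically) could look roughly like the sketch below, written in Java rather than a scripting language and meant to run at build time, not on the device. It assumes the header contains only `float name[] = { ... }` definitions with plain decimal literals and // comments; the class name HeaderToJava and the method convert are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HeaderToJava {
    // Matches: float name[] = { ... }  (body captured up to the first closing brace)
    private static final Pattern FLOAT_ARRAY =
            Pattern.compile("float\\s+(\\w+)\\s*\\[\\s*\\]\\s*=\\s*\\{(.*?)\\}", Pattern.DOTALL);
    // Matches a decimal literal; an optional C float suffix is consumed but not captured.
    private static final Pattern NUMBER =
            Pattern.compile("([-+]?(?:\\d+\\.?\\d*|\\.\\d+)(?:[eE][-+]?\\d+)?)[fF]?");

    /** Returns Java source text declaring one float[] constant per C float array. */
    static String convert(String header, String className) {
        // Drop // line comments so numbers inside them are not picked up.
        String code = header.replaceAll("//[^\\n]*", "");
        StringBuilder src = new StringBuilder("public final class " + className + " {\n");
        Matcher arrays = FLOAT_ARRAY.matcher(code);
        while (arrays.find()) {
            StringBuilder values = new StringBuilder();
            Matcher nums = NUMBER.matcher(arrays.group(2));
            while (nums.find()) {
                if (values.length() > 0) {
                    values.append(", ");
                }
                // Normalise every literal to Java float syntax, e.g. 1 -> 1.0f.
                values.append(Float.parseFloat(nums.group(1))).append('f');
            }
            src.append("    public static final float[] ").append(arrays.group(1))
               .append(" = {").append(values).append("};\n");
        }
        return src.append("}\n").toString();
    }

    public static void main(String[] args) {
        String header = "float verts[] = {\n// sample data\n1, 2.5f, -0.25\n};";
        System.out.print(convert(header, "ModelData"));
    }
}
```

One caveat: a very large array literal may exceed the JVM's 64 KB limit on the size of a single method (which includes the static initializer), so for a model with many thousands of vertices, writing the values to a binary asset file and loading that is likely the safer variant of the same idea.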

Transforming .mm file to readably form by java

I am developing a multi-mode resource-constrained project scheduling solver in Java. I was looking for test instances but found only this. It is in a .mm file, which is an extension used by C++ compilers. Is there any way to transform this data into something easily readable by Java, like XML or JSON?

As suggested you could of course parse the file as a text file. Alternatively the two other main approaches would be:
Use clang/llvm's abstract syntax tree (AST) to interpret the data in the file.
Use an Objective-C++ grammar for a compiler generator like yacc or, since you're using Java, JavaCC. This will also yield a syntax tree that you can then walk and extract information from.

How to identify the file type even though the file-extension has been changed?

Files are categorized by file extension. So my question is: how can I identify the file type even when the file extension has been changed?
For example, I have a video file named myVideo.mp4 and I rename it to myVideo.txt. If I double-click it, the default text editor will open the file and will not show the real content. But if I open myVideo.txt in a video player, the video plays without any problem.
I was thinking of developing an application that determines the type of a file without checking the file extension and suggests software for opening it. I would like to develop the application in Java.

One of the best libraries to do this is Apache Tika. It doesn't only read the file's header, it's also capable of performing content analysis to detect the file type. Using Tika is very simple, here's an example of detecting a file's type:
import java.io.IOException;
import java.net.URL;
import org.apache.tika.Tika; // Including Tika
public class TestTika {
public static void main(String[] args) throws IOException {
Tika tika = new Tika();
String fileType = tika.detect(new URL("http://example.com/someFile.jpg"));
System.out.println(fileType);
}
}

Structure, magic numbers, metadata, strings and regular expressions, heuristics and statistical analysis... the tool will only be as good as the database of rules behind it.
Try DROID (Digital Record Object IDentification tool) for identifying file types; Java, Net BSD-licensed. It is a free project of the National Archives UK, unrelated to Android. Source is available on Github and Sourceforge. The DROID documentation is good, there's also a getting started guide from the Digital Preservation Coalition.
See also Darwinsys file and libmagic.

There's a tool called TrID that does what you are after - it currently supports 5033 different file types - and can be trained to add new types. On *nix systems, there's also the file command, which does something similar.

Well, it's like having a database of file formats in your app, so that you can identify a file without looking at its extension, exactly as Linux does. Whenever you open a file, you check it against the file-format database to see which type it belongs to. I'm not sure how well this works for every file type, but most files have a fixed header format (zip, pdf, mpg, avi, png, etc.), so this approach should work.
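A minimal sketch of this header-lookup idea, assuming a tiny hand-written signature table with only three formats (real tools such as Tika, DROID, or libmagic ship far larger rule sets). The class name MagicNumberSniffer and the fallback type are illustrative choices; the PNG, PDF, and ZIP signatures are the standard leading bytes of those formats.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class MagicNumberSniffer {
    // MIME type -> leading bytes ("magic number") of that format.
    private static final Map<String, byte[]> SIGNATURES = new LinkedHashMap<>();
    static {
        SIGNATURES.put("image/png", new byte[] {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A});
        SIGNATURES.put("application/pdf", new byte[] {'%', 'P', 'D', 'F', '-'});
        SIGNATURES.put("application/zip", new byte[] {'P', 'K', 0x03, 0x04});
    }

    /** Returns the first type whose signature prefixes the data, or a generic fallback. */
    static String detect(byte[] head) {
        for (Map.Entry<String, byte[]> entry : SIGNATURES.entrySet()) {
            if (startsWith(head, entry.getValue())) {
                return entry.getKey();
            }
        }
        return "application/octet-stream";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads only the first few bytes of the file; the file name is never inspected. */
    static String detect(Path file) throws IOException {
        byte[] head = new byte[16];
        int read;
        try (InputStream in = Files.newInputStream(file)) {
            read = in.read(head); // may return fewer than 16 bytes, or -1 for an empty file
        }
        return detect(Arrays.copyOf(head, Math.max(read, 0)));
    }
}
```

Container formats complicate this: many document and archive formats are ZIP files underneath, so a header check alone would report all of them as application/zip.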

You could try MimeUtil2, but it's quite old and no longer up to date. The best way is still the file extension.
But the solution from Adam is not as bad as you think. You could build a platform-independent solution as a wrapper around command-line calls. I think you will get much better results using this method.

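The following code snippet retrieves information about the file type. Note that MimetypesFileTypeMap looks the type up from the file name's extension, so it will not recognise a renamed file.
import java.io.File;
import javax.activation.MimetypesFileTypeMap;
final File file = new File("file.txt");
System.out.println("File type is: " + new MimetypesFileTypeMap().getContentType(file));
Hopefully this helps.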

Convert xml to xsd using java

I am looking for a tool or java code or class library/API that can generate XSD from XML files. (Something like the xsd.exe utility in the .NET Framework sdk)

These tools can provide a good starting point, but they aren't a substitute for thinking through what the actual schema constraints ought to be. You get the opportunity for two kinds of errors: (1) allowing XML that shouldn't be allowed and (2) disallowing XML that should be ok.
As an example, pretend that you want to infer an XSD from a few thousand patient records that include a 'gender' tag (I used to work on medical records software). The tool would likely encounter 'M' and 'F' as values and might deduce that the element is an enumeration. However, other valid (although rarely used) values are B (both), U (unknown), or N (none). These are rare, of course. So, if you used your derived schema as an input validator, it would perform well until a patient with multiple sex organs was admitted to the hospital.
Conversely, to avoid this error, an XSD generator might not add enumerated type restrictions (I can't remember what these are called in schemas), and your application would work well until it encountered an errant record with gender=X.
So, beware. It's best to use these tools only as a starting point. Also, they tend to produce verbose and redundant schemas because they can't figure out patterns as well as humans.

Check Castor, I think it has the functionality you are looking for. They also provide you with an ant task that creates XSD schemas from XML files.
PS: I suggest you add more specific tags in the future. For instance, using xml, xsd and java will increase your chances of getting answers.

You can use xsd-gen-0.2.0-jar-with-dependencies.jar file to convert xml to xsd.
The command for it is "java -jar xsd-gen-VERSION-jar-with-dependencies.jar /path/to/xml.xml > /path/to/my.xsd"

Try the xsd-gen project from Google.
https://code.google.com/p/xsd-gen/
