Reading bytecode from unloaded classes in external jarfiles - java

In my Java application, I wish to read bytecode contents from class files that aren't actually loaded, in jar files which also aren't loaded. As in, I need to be able to take any given jarfile, and find all classes inside it, ideally. So take the following situation:
My application (which is really a library) is asked to 'check' a jar at some path, and the application using my library supplies various patterns to look for (such as constant pool similarities). My library therefore needs to go through all the class files in the jar. Obviously I could hardcode the list of classes or load it from a file, but I'd much rather go through the bytecode of every file in the jar and match against that.

You should use the JarFile API and iterate over the files in it.
It shouldn't be hard to do. This article might be a good start.
As for the bytecode, you could just treat each (uncompressed) class file as a byte array and calculate a hash of it, for example an MD5 hash, and compare that to a previously computed hash.
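A minimal sketch of that approach, assuming Java 9 or later (for InputStream.readAllBytes). The names JarClassHasher and hashClasses are made up for illustration. It walks every entry of a jar with JarFile, skips anything that is not a .class file, and maps each entry name to the MD5 hex digest of its uncompressed bytes. Nothing is loaded into the JVM as a class.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of any library.
final class JarClassHasher {

    /** Maps each .class entry name in the jar to the hex MD5 of its bytes. */
    static Map<String, String> hashClasses(String jarPath)
            throws IOException, NoSuchAlgorithmException {
        Map<String, String> hashes = new TreeMap<>();
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (entry.isDirectory() || !entry.getName().endsWith(".class")) {
                    continue; // manifests, resources, directories
                }
                try (InputStream in = jar.getInputStream(entry)) {
                    // The stream returned by JarFile is already decompressed.
                    byte[] bytecode = in.readAllBytes();
                    hashes.put(entry.getName(), toHex(md5.digest(bytecode)));
                }
            }
        }
        return hashes;
    }

    private static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

A hash only tells you whether two class files are byte-for-byte identical. For pattern matching such as constant pool similarities, the same loop applies, but the byte array would be handed to a class file parser instead of a digest.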

Related

Are there any Java Class Library "header files" containing all method descriptors in the standard library?

In order to create a valid .class file, every method has to have a full internal name and type descriptors associated with it. When procedurally creating these, is there some sort of lookup table one can use (outside of Java, where a ClassLoader can be used) to get these type descriptors from a method name? For example, how would one go from Scanner.hasNextByte to boolean java.util.Scanner.hasNextByte(int) / boolean java.util.Scanner.hasNextByte() (or even from java.util.Scanner.hasNextByte to boolean java.util.Scanner.hasNextByte(int) / boolean java.util.Scanner.hasNextByte())? The above example has overloading in it, which is another problem a human- but mostly computer-readable declarations file would hopefully address.
I've found many sources of human-readable documentation like https://docs.oracle.com/javase/8/docs/api/index.html containing uses of each method, hyperlinks to other places, etc. but never a simple text file or collection of files containing just declarations in any format. If there's no such file(s) don't worry about it, I can try and scrape some annoying HTML files, but if there is it would save a lot of time. Thanks!
The short answer is No.
There isn't a "header file" containing the class and method signatures for the Java class libraries. The Java tool chain has no need for such a thing. Nor do 3rd-party Java compilers, or compilers for other languages that rely on the Java SE class libraries.
AFAIK, there isn't a 3rd-party tool that builds such a file or an equivalent database or in-memory data structures.
You could create one though.
You could choose an existing Java parsing library, and use it to build parse trees for all of the source files in the class library, and emit the information that you need.
You could potentially create a custom Javadoc "doclet" plugin to emit the information.
Having said that, I don't understand why you would need such a mapping. Surely, your IDE does this already ... and exposes the information via some internal API. And if this is not for an IDE plugin, what is it for?
You commented:
I'm making a compiler for a JVM-based programming language ....
Ah ... so your compiler should do what other compilers do. Get the information from the ".class" file. You can either load the class using a standard or custom class loader, or you can use a library like asm or bcel or javassist ... which can read a ".class" file without loading it.
(I haven't checked, but I think the standard javac compiler uses an internal API to do this.)
Note that your proposed approaches won't work for interfacing with 3rd-party Java libraries where the source code is not available and/or the javadoc is not scrapable.
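To illustrate what "read a .class file without loading it" involves, here is a minimal sketch that lists the name and descriptor of every method declared in a class file, using only the JDK. It assumes the class file layout from the JVM specification (constant pool tags up to 20) and is not how javac, ASM, BCEL or Javassist do it internally; ClassFileMethods and its method names are made up for illustration. A real compiler would more likely use one of those libraries.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: a bare-bones class file reader.
final class ClassFileMethods {

    /** Returns name + descriptor (e.g. "hasNextByte(I)Z") for each declared method. */
    static List<String> list(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor_version
        in.readUnsignedShort(); // major_version

        int count = in.readUnsignedShort();
        String[] utf8 = new String[count]; // only Utf8 entries are kept
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;            // Utf8
                case 3: case 4: case 9: case 10: case 11:
                case 12: case 17: case 18: skip(in, 4); break;    // 4-byte entries
                case 5: case 6: skip(in, 8); i++; break;          // Long/Double take two slots
                case 7: case 8: case 16: case 19: case 20:
                    skip(in, 2); break;                           // 2-byte entries
                case 15: skip(in, 3); break;                      // MethodHandle
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        skip(in, 6);                              // access_flags, this_class, super_class
        skip(in, 2 * in.readUnsignedShort());     // interfaces
        readMembers(in, utf8, null);              // fields (ignored)
        List<String> methods = new ArrayList<>();
        readMembers(in, utf8, methods);           // methods
        return methods;
    }

    private static void readMembers(DataInputStream in, String[] utf8, List<String> out)
            throws IOException {
        int n = in.readUnsignedShort();
        for (int i = 0; i < n; i++) {
            in.readUnsignedShort();               // access_flags
            String name = utf8[in.readUnsignedShort()];
            String descriptor = utf8[in.readUnsignedShort()];
            if (out != null) {
                out.add(name + descriptor);
            }
            int attributes = in.readUnsignedShort();
            for (int a = 0; a < attributes; a++) {
                in.readUnsignedShort();           // attribute_name_index
                skip(in, in.readInt());           // attribute_length, then the payload
            }
        }
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]); // fails with EOFException rather than skipping short
    }
}
```

Fed the bytes of java/util/Scanner.class, the returned list would include both overloads from the question as separate entries, hasNextByte()Z and hasNextByte(I)Z, so overloading is resolved by the descriptor.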
What about building it from the source files for the standard library?
The Oracle Java 8 API web pages you referenced were created by Javadoc processing of the source files for the Java standard library.
If you use an IDE with a debugger, there is a good chance you already have much of the standard library source code downloaded. After all, if you set a break point, and then follow the program step-by-step with "Step into", you can trace the execution of the program into standard library methods. The source files would be part of the JDK.
However, some parts of the standard library source might not be available, due to licensing restrictions.

Is there an automated way to compare two .jar files to get Java-related information?

I know that I can just extract a .jar contents and thus compare it to another one.
Nevertheless this process has two flaws, in my opinion:
1) It is tedious: I have to manually compare them.
2) It does not give me important information like, for example, the JDK version the .class files were compiled against or, in the case of .war files, any specific environment they were optimized for.
So I ask: is there any program which actually takes in .jar/.war files and tells me the differences in such a manner?

Java: any way to get a ZipFile (or anything with a direct getEntry method) from a byte array?

I have the contents of a zip file in a byte array. The file contains a number of entries (typically about 12), but I only care about three of them.
I would like to somehow get this into a ZipFile object, so I can pull those specific three ZipEntrys out using ZipFile.getEntry. I'm open to using something other than ZipFile that has a similar look-up-by-name method like getEntry.
My initial investigation suggests that I'm out of luck. ZipFile requires a real file in the file subsystem (which I cannot and do not want to access) and so I can't get there from here, and no means other than ZipFile exists that allows extracting particular entries by name; but I wanted to check. In languages like C# and Python, this is pretty straightforward (in C# I go from byte array to MemoryStream to ZipArchive; in Python I just wrap it in StringIO and treat like a file), so I wanted to make sure I'm not missing something obvious.
My Plan B is to use ZipInputStream and repeated calls to getNextEntry to go through all dozen or so entries, and throw away all except the three I care about, but that just smells bad to me.
A ZipInputStream can be instantiated for any InputStream ... including a ByteArrayInputStream.
Apart from that you are out of luck ... if you stick with Java SE classes.
The root of the problem (from an API design perspective) is that ZipFile is a wrapper for functionality that is implemented in native code. The native code opens the input stream for itself, and it uses a native filename / pathname.
The main reason for a native ZIP implementation that works that way is that the JVM needs to load code from ZIP files as part of the bootstrap procedures. This happens before the native implementation has loaded classes such as InputStream. Indeed, it has to.
There are a number of 3rd party libraries. Start by reading this Q&A - What is a good Java library to zip/unzip files?
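Sticking with Java SE, the ZipInputStream route mentioned above can be sketched as follows, assuming Java 9 or later (for readAllBytes). InMemoryZip and extract are made-up names for illustration. The sketch scans the entries once, keeps only the wanted names, and stops as soon as all of them have been found, which gives a look-up-by-name result without touching the file system.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of any library.
final class InMemoryZip {

    /** Returns the contents of the wanted entries that exist in the zip bytes. */
    static Map<String, byte[]> extract(byte[] zipBytes, Set<String> wanted)
            throws IOException {
        Map<String, byte[]> found = new HashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            // Stop early once every wanted entry has been collected.
            while (found.size() < wanted.size() && (entry = zin.getNextEntry()) != null) {
                if (wanted.contains(entry.getName())) {
                    // Reads up to the end of the current entry only.
                    found.put(entry.getName(), zin.readAllBytes());
                }
            }
        }
        return found;
    }
}
```

With about a dozen entries, the cost of skipping the unwanted ones is small; entries that are never asked for are not buffered.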

Is there a way to pass the contents of class/jar files to a JVM without saving them explicitly on disk?

Suppose that I want to prevent trivial disassembly of jar/class files.
A JVM is started from a C++ application that can descramble the jar/class files that are stored within its own executable. Is there a way of somehow streaming the contents of such files to a JVM without saving them on disk?
I'm looking for a solution on both windows and unix platforms.
You can create a ClassLoader which gets its class data from anywhere. You could even have it call native methods to obtain the bytecode for a class. Have a look at URLClassLoader, which is widely used; it can obtain its classes from files on disk, from the network, or from any supported URL.
I think part of what you're after is supplied by the JarInputStream class.
You'd need some custom class-loading behavior as well. You may need to create a ClassLoader implementation that loads your classes if you go that route. It might be simpler to use URLClassLoader, depending on your circumstances.
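A minimal sketch of such a ClassLoader, assuming the bytecode has already been obtained as byte arrays (from a native method, a JarInputStream, or a descrambling step; that part is left out). InMemoryClassLoader and its add method are made-up names for illustration. Classes are defined straight from memory with defineClass, so nothing is written to disk.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical loader: serves classes from byte arrays held in memory.
final class InMemoryClassLoader extends ClassLoader {

    private final Map<String, byte[]> bytecodeByName = new ConcurrentHashMap<>();

    InMemoryClassLoader(ClassLoader parent) {
        super(parent);
    }

    /** Registers bytecode under its binary name, e.g. "com.example.Foo". */
    void add(String binaryName, byte[] bytecode) {
        bytecodeByName.put(binaryName, bytecode.clone());
    }

    // Called by loadClass only after the parent loader failed to find the class.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytecode = bytecodeByName.get(name);
        if (bytecode == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytecode, 0, bytecode.length);
    }
}
```

Usage would be along the lines of creating the loader, calling add for each class, then loader.loadClass("com.example.Main") and invoking its entry point reflectively. When the JVM is embedded in a C++ program, the JNI function DefineClass offers a similar way to hand over a buffer of bytecode directly from native code.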

How to encrypt a .jar file

I'm working on a project where we need to encrypt the .jar file so that no one can access the .class files inside it. Is there any Java code which can help me encrypt the .jar file?
Even if you encrypt the jar file, it must be decrypted before the JVM is able to run it, so you'll need another jar file containing classes that decrypt it and load it into the JVM.
Since this second jar file cannot itself be encrypted, a malicious user wanting to see your class files can simply look at the classes in this second jar file, then decrypt your super-secret jar file and have access to it.
Maybe you can increase the security of your code using an obfuscator, but that only protects your class files from decompilation (making it harder, not impossible), not from being used.
If obfuscation is not enough, you could consider compiling your jar file to a DLL for windows or a SO for unix/linux, that will make it much harder to decompile, but it's not always possible to do that correctly and it's generally a PITA. GCJ is able to do this somehow, and there are other commercial products that will actually compile .class/.jar directly to machine code.
However please consider that it does not matter how much security you put in it, since the client computer MUST be able to execute it, it must be able to read it, so no matter what your code will be exposed, you can only make it harder.
If you really have an algorithm so secret you don't want to disclose no matter what, consider converting it to a web service, hosting it on your server, so that you don't have to send the actual code to the client machines and can also better prevent unauthorized copies of your application by checking access to that vital part of it.
I assume you are aware of the fact that any skilled Java coder can reverse-engineer the Java tool you use (or write) and still decode the app's jars? Custom classloaders which read your "encrypted" code can also be decompiled, and a tool could be written to bypass them.
Even with obfuscation and bytecode modification and custom classloaders, java is hackable/decompileable and the source can almost always be brought to a somewhat readable state.
You want to obfuscate, not encrypt, the jar file.
A popular choice for doing this in Java is ProGuard.
No. Since your program needs to be able to run the code it would be pointless anyway.
You can obfuscate your code though so decompiling the .class files results in less readable code (meaningless variable/class names etc).
As far as I know this is not supported by the standard JVM. But you can do the following. Separate your application into two parts. The first will not be encrypted. It will be a simple loader that instantiates the rest using a custom class loader. This class loader will get classes as arrays of bytes, decrypt them and load them.
If you don't want to provide access to the class files inside the jar, why should you supply your jar with the application?
It feels like your question is kind of wrong conceptually...
If you need some custom way of loading the classes, consider using a custom classloader.
If you are packaging as a jar, just rename it to jarname.ABCD or any other misleading extension, or even take off the extension, and specify the jar name in your application accordingly.
I prefer jCrypt!
It is a simple tool with which you can encrypt the classes (and resources).
