I am trying to access a Java package loaded into memory and dump it to a file. Here is how the protection works: there is an exe packed with Themida that contains the Java main class code to be loaded. At runtime the Themida exe loads the clean main class Java code into memory. The loader is contained within the exe, but several external libraries can access the packages contained within the exe. So the exe contains com.mysoft.mainloader, but the clean jar library Mylib.jar can call functions within com.mysoft.mainloader. How do I dump com.mysoft.mainloader to a jar file? Can I modify Mylib.jar to dump it, since it also has access to the package once it is loaded?
There is no supported Java SE mechanism to read / retrieve a ".class" that has been loaded by a classloader. So your options would be:
Modify the custom classloader you are using to capture the ".class" before (or after) the classloader calls defineClass.
Burrow into the JVM data structures to try and figure out whether the entire ".class" stream is captured somewhere and then retrieve it.
Modify the JVM ...
Any of these could be feasible. All will be relatively difficult.
It is possible to get loaded classes at runtime using the Dynamic Attach mechanism and the Instrumentation API.
The idea is to inject a Java agent into the running application.
The agent gets an array of all loaded classes with the Instrumentation.getAllLoadedClasses method, then obtains their bytecode by registering a ClassFileTransformer and calling Instrumentation.retransformClasses.
The working implementation can be found in the class-file-extractor project.
Usage:
java -jar extractor.jar <pid> mainloader.jar com.mysoft.mainloader
where
<pid> is the process ID of the target JVM application;
mainloader.jar is the output file name;
com.mysoft.mainloader is the name prefix of the classes to extract.
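The agent side of this idea can be sketched roughly as follows. This is not the code of the class-file-extractor project; the class names DumpAgent and CapturingTransformer are made up for illustration, the agent argument is assumed to be the package prefix, and writing the captured bytes into a jar is left out. The agent jar's manifest would also need Agent-Class and Can-Retransform-Classes: true entries.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical agent class; names are illustrative only.
public class DumpAgent {

    /** Records the bytes of matching classes without modifying them. */
    static final class CapturingTransformer implements ClassFileTransformer {
        final String internalPrefix; // e.g. "com/mysoft/mainloader"
        final Map<String, byte[]> captured = new ConcurrentHashMap<>();

        CapturingTransformer(String packagePrefix) {
            // The JVM passes class names in internal form (slashes, not dots).
            this.internalPrefix = packagePrefix.replace('.', '/');
        }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain domain, byte[] classfileBuffer) {
            if (className != null && className.startsWith(internalPrefix)) {
                captured.put(className, classfileBuffer.clone());
            }
            return null; // null tells the JVM to leave the class unchanged
        }
    }

    // Entry point invoked when the agent is attached to a running JVM.
    // Assumption: the agent argument is the package prefix to capture.
    public static void agentmain(String args, Instrumentation inst) throws Exception {
        CapturingTransformer transformer = new CapturingTransformer(args);
        inst.addTransformer(transformer, true); // true = usable for retransformation
        try {
            List<Class<?>> targets = new ArrayList<>();
            for (Class<?> c : inst.getAllLoadedClasses()) {
                if (c.getName().startsWith(args) && inst.isModifiableClass(c)) {
                    targets.add(c);
                }
            }
            if (!targets.isEmpty()) {
                // Triggers transform() for each target, handing us its bytecode.
                inst.retransformClasses(targets.toArray(new Class<?>[0]));
            }
        } finally {
            inst.removeTransformer(transformer);
        }
        // transformer.captured now maps internal class names to class bytes;
        // writing them to a jar (e.g. with JarOutputStream) is omitted here.
    }
}
```

Returning null from transform is what keeps the target application untouched: the transformer only observes the bytes that the JVM hands over during retransformation.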
I read about CDS in Oracle doc https://docs.oracle.com/javase/8/docs/technotes/guides/vm/class-data-sharing.html
What I understood is that the system class files needed for starting the JVM are parsed, verified and then stored in an archive at jre/lib/[arch]/client/classes.jsa. The archive also contains their memory mapping, so the JVM maps the memory directly according to the mapping information given in the archive. This reduces the overhead of class loading every time a JVM instance starts. Please correct me if I am wrong.
Now, coming to Java 10, how can I achieve this for my application code?
Secondly, would the complete application code be eligible for CDS, or are there some restrictions?
There are three essential steps to creating and using an archive with application class-data (for more details, read my post about application class-data sharing):
Creating a list of classes to include in the archive:
java -XX:+UseAppCDS \
    -XX:DumpLoadedClassList=classes.lst \
    -jar app.jar
Creating an archive:
java -XX:+UseAppCDS -Xshare:dump \
    -XX:SharedClassListFile=classes.lst \
    -XX:SharedArchiveFile=app-cds.jsa \
    --class-path app.jar
Using the archive:
java -XX:+UseAppCDS -Xshare:on \
    -XX:SharedArchiveFile=app-cds.jsa \
    -jar app.jar
Keep the following in mind:
you can’t use wildcards or exploded JARs for the class path when creating the archive
the class path used to launch the application must have the one used to create the archive as a prefix
if you have any problems, use -Xlog:class+load (more on -Xlog) to get more information
The JEP for AppCDS has an example showcasing how to add your application classes to the shared archive.
As for the restrictions, there are a few:
Plain class files (.class) present in a directory on the class path cannot be added to the shared archive. See this thread.
Classes loaded by custom class loaders cannot be added to the shared archive. See this thread.
There are other practical considerations to be aware of when using CDS/AppCDS, such as:
If you update the jar files on the file system, then you will have to recreate the shared archive.
If you are using Java or JVMTI agents that modify, retransform or redefine class files at runtime, then the shared archive won't be useful: the classes will be loaded from disk, since the agents need the actual class file data, which I believe is not stored in the shared archive.
Another nice and detailed article on CDS and AppCDS is https://simonis.github.io/cl4cds/.
The author of the article has also written a tool that allows sharing of application classes even if they are loaded by a custom class loader.
If you are interested in using CDS, you can also try the OpenJ9 JVM, which has had this feature for a long time and whose implementation is much more mature and complete. Read more about it here.
I am a little bit confused...
I know that classes are loaded by the class loader only when they are needed, that is, when we use their static variables or create an instance of the class. So if we have, for example, 3 classes in our program and we use only one, then only that particular class will be loaded and the rest will not. But when we run the Java compiler, it creates 3 .class files. I know these 3 .class files are bytecode files, but what is this bytecode, and what is the difference between loading a class and generating the bytecode of a class? Where is this bytecode used? If we are not going to use a particular class, what is the need for generating bytecode for it?
Java is a compiled language. The purpose of compiling into bytecode is to allow the code to run on the JVM on any platform. Platform independence is a feature built into Java.
Furthermore, you don't have to compile all three classes unless they depend on each other. You can specify which files to compile on the javac command line. If you are using an IDE, check your settings or exclude the undesired class from the project.
Loading a class happens at runtime, when the program is about to use the class for the first time (for example, to read a static field or create an instance).
Generating the bytecode of a class happens at compile time. This allows the code to be run on the virtual machine.
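The difference can be seen in a small sketch (the class names are made up for illustration). Compiling this one source file produces three .class files, one per class, yet at runtime only the class that is actually touched runs its static initializer. Strictly speaking the static block shows initialization rather than loading, but it is the easiest part of the process to observe.

```java
import java.util.ArrayList;
import java.util.List;

// javac emits LazyLoadingDemo.class, LazyLoadingDemo$Used.class and
// LazyLoadingDemo$Unused.class, whether or not the classes are ever used.
public class LazyLoadingDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Used {
        static { EVENTS.add("Used initialized"); }
        static int value() { return 1; }
    }

    static class Unused {
        static { EVENTS.add("Unused initialized"); }
        static int value() { return 2; }
    }

    public static List<String> run() {
        EVENTS.add("start");
        Used.value();        // first use: Used is initialized here
        EVENTS.add("end");   // Unused is never touched, so never initialized
        return EVENTS;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [start, Used initialized, end]
    }
}
```

The bytecode for Unused still exists on disk because the compiler cannot know which classes a later run will need; the JVM simply never reads it in this run.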
Java is a compiled language, and it runs on top of the Java Virtual Machine. Compiling to bytecode translates higher-level code (be it Java, Scala, or Clojure) into machine-independent instructions to be read by the JVM. This is why your program will generally run without modification on Linux, Windows, and Mac OS X.
The Java compiler compiles the source files you give it plus any classes they depend on that it can find on the source path, so if you have a class that nothing uses, chances are it will not be compiled. Build tools may override that, so if you find yourself not using a class, remove it so that unnecessary bytecode is not generated.
The difference between languages like C++ and Java is bytecode. C++ binaries (compiled, assembled, linked) contain the machine opcodes for the platform they were compiled for. In the case of Java, bytecode is the target, and it contains opcodes for the JVM. The JVM in turn makes the respective OS calls. So bytecode and the JVM together make Java programs independent of the OS.
Regarding class loading, it happens when the program needs the class, at runtime. The JIT compiler performs a second compilation of the class's code, from bytecode to native code, when needed.
When we compile a .java file we get a .class file.
The content of the .class file is called bytecode.
Bytecode in Java is nothing but the contents of a .class file, which is binary and not readable by humans (e.g. 00110011). These .class files are generated only when the .java file is compiled.
These .class files can be run on any platform.
I'm writing a tool which is supposed to determine what classes have changed when a system is upgraded. What I have as input are:
1 - the jar file used with the pre-upgraded system
2 - the jar file used with the post-upgraded system
3 - a list of classes and their purpose that I care about
asset init package1.myasset.class
asset terminate package1.assetterminate.class
etc.
What I want to do is be able to load the jar that comes with the new system and the jar that came with the old system and determine if a given class has the same parent after the change that it did before the change. i.e. if in the new system asset init
1 - is still package1.myasset.class and
2 - is still an extension of package0.generalasset.class
I assume I can use reflection, but I'm not sure how to determine what a given class's parent is in each jar to see if it changed.
The jars are not necessarily used directly (runtime) by my tool - in fact they shouldn't really be used except as an input to the analysis of the system change.
You need two URLClassLoaders, one for each JAR file.
Process your list of classes and load each one in turn via both classloaders.
Do not attempt to cast them, they are just Class<?> at this stage.
Call Class.getSuperclass() on each, and then getName() on each parent class.
Compare.
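A minimal sketch of these steps, assuming each jar is available as a local file and class names are given in binary form (package1.myasset, without the .class suffix). The class and method names here are made up for illustration. The loaders get a null parent so that only the jar and the JDK's bootstrap classes are visible, not the tool's own class path; if a parent class lives in some other jar, that jar would have to be added to the URL array.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

public class SuperclassDiff {

    /** Name of the superclass as seen by the given loader, or a marker string. */
    static String superclassName(ClassLoader loader, String className) {
        try {
            // initialize=false: load the class without running static initializers
            Class<?> type = Class.forName(className, false, loader);
            Class<?> parent = type.getSuperclass();
            return parent == null ? "(none)" : parent.getName();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable)";
        }
    }

    /** True if the class has the same parent in both jars. */
    static boolean sameParent(Path oldJar, Path newJar, String className) {
        try (URLClassLoader oldLoader =
                 new URLClassLoader(new URL[] { oldJar.toUri().toURL() }, null);
             URLClassLoader newLoader =
                 new URLClassLoader(new URL[] { newJar.toUri().toURL() }, null)) {
            return superclassName(oldLoader, className)
                    .equals(superclassName(newLoader, className));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Comparing names rather than Class objects is deliberate: the same class loaded by two different loaders yields two distinct Class objects, so equals on the Class objects themselves would always report a difference.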
Although there are many possibly important changes to a class other than its (single) parent, if that's really all you need I wouldn't bother with code, I'd just put both jars on a system with JDK and do:
javap -classpath jar1 aclassname |grep extends
javap -classpath jar2 aclassname |grep extends
# or substitute findstr on unimproved Windows
# repeat for multiple classes as desired
If you want to look for any change in the "public API", the full javap output is a good start.
This question goes a little bit further than my previous question:
Obtaining a list of all Java classes used from all JVM's?
Now I need to know the physical location those classes are loaded from. I have checked the jcmd help for other commands, but it wasn't useful to me. I also looked in jvisualvm, but the information is not there either. Can anyone help me with this?
EDIT:
This is my situation: my company has several individual Java projects (jars) which we can start and stop from our own custom-built web interface. Each such process gets a process ID (PID) when started, and then runs in the background.
My need: I need a list of all classes loaded by each running Java process (per PID). I already have jcmd <pid> GC.class_histogram, but this only contains a list of which classes are loaded. I also want to know where the classes are actually loaded from (which jar, location on the file system).
Classes are loaded through java.lang.ClassLoader's loadClass(String name) method, which in turn calls findClass(String name). A custom ClassLoader usually overrides findClass to retrieve class definitions using a specific protocol and location. Classes may be loaded from a database or from a network location, and that location may be generated dynamically, so you cannot always know the location of every Java class. The best example is AppletClassLoader, which loads classes from a network stream or from a remote location.
Bootstrap ClassLoader - core library package such as rt.jar present in JRE lib folder
Extension ClassLoader - jar files present in ext folder or as specified in the environment variable for ext
System ClassLoader - Application's classpath or as specified in environment variable for classpath or through JVM's startup argument parameter for -cp or -classpath
Custom ClassLoader - according to the class loader's own loading policy (mostly defined in its findClass() method)
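For classes that do come from a jar or directory, code running inside the target JVM (for example an agent, which is an assumption here, not something the question already has) can usually ask a class for its origin through its CodeSource. The class name ClassOrigin below is made up for illustration. Bootstrap classes report no code source, and custom loaders are free to leave it unset.

```java
import java.security.CodeSource;

public class ClassOrigin {
    static final String UNKNOWN = "(no code source)";

    /** URL of the jar or directory the class was defined from, if recorded. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return UNKNOWN; // typical for bootstrap classes such as java.lang.String
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        System.out.println(String.class.getName() + " <- " + locationOf(String.class));
        System.out.println(ClassOrigin.class.getName() + " <- " + locationOf(ClassOrigin.class));
    }
}
```

From outside the process, starting the JVM with -verbose:class (or -Xlog:class+load on newer JVMs) logs each class as it is loaded together with its source, which may be enough if restarting the processes is acceptable.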
We have a java based (jersey+grizzly) REST Server which exposes calls like
foo.com/{game}/rules
foo.com/{game}/players
foo.com/{game}/matches
There can be arbitrary number of games
Each game has different implementations for rules, players, matches
For historical reasons, we would want separate jars for each game implementation
So there is REST Server
as and when there is a call like foo.com/tennis/rules
we want the REST Server to dynamically load 'tennis' jar. The jar does its operation. Then jar should be unloaded
If the next call was for foo.com/football/players
we want the REST Server to dynamically load 'football' jar. The jar does its operation. Then jar should be unloaded
Is there a technique to do this?
Apparently there is a very old question about this: "java: is there a framework that allows dynamically loading and unloading of jars (but not osgi)?"
I don't know how it works on Java 8, but unloading a class in Java 7 requires unloading not only the class but also its class loader, and dropping every reference that other objects hold to the class, its instances, or the loader.
Once all of those references are gone, the class becomes eligible for unloading at the next garbage collection (System.gc() can request one). If other objects are still holding references, the GC will not unload it.
OSGi (as suggested by @Joop Eggen) is a viable option. JRebel is not.
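Without a framework, the load, call, release cycle can be sketched with a short-lived URLClassLoader. This is only a sketch under assumptions: the class name IsolatedJarCall is made up, the entry point in the game jar is assumed to be a public static no-argument method, and the result is assumed to be a JDK type such as String, so that nothing from the jar stays referenced after the call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;

public class IsolatedJarCall {

    /**
     * Loads className from the given jars in a fresh loader, invokes a public
     * static no-arg method on it, and closes the loader afterwards.
     */
    static Object callStatic(URL[] jars, String className, String methodName) {
        try (URLClassLoader loader =
                 new URLClassLoader(jars, IsolatedJarCall.class.getClassLoader())) {
            Class<?> type = Class.forName(className, true, loader);
            Method method = type.getMethod(methodName);
            // Return only JDK types: an object of a class from the jar would
            // keep the loader (and every class it defined) reachable.
            return method.invoke(null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Call to " + className + "." + methodName + " failed", e);
        }
    }
}
```

Closing the loader releases the jar file, but the classes themselves are only unloaded once the loader, its classes and all their instances are unreachable and a garbage collection has run. Creating a loader per request also repeats class loading and JIT warm-up each time, so caching one loader per game and discarding it only on redeploy may be worth considering.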
proxy-object library
Loads Java jar files dynamically in an isolated class loader to avoid dependency conflicts and enable modular updates. All jar files matching [main jar folder]/lib/myLib/2.0/*.jar will be loaded.
Code Examples
Create Object from a JAR file located in the myLib/2.0 folder:
File libDir = new File("myLib/2.0");
ProxyCallerInterface caller = ObjectBuilder.builder()
.setClassName("net.proxy.lib.test.LibClass")
.setArtifact(DirArtifact.builder()
.withClazz(ObjectBuilderTest.class)
.withVersionInfo(newVersionInfo(libDir))
.build())
.build();
String version = caller.call("getLibVersion").asString();