I have a library that writes data in either a text or binary format. It has the following three components:
1. common data structures
2. text writer (depends on 1)
3. binary writer (depends on 1)
The obvious way to distribute this is as 3 .jar files, so that users can include only what they need.
However, the "common data structures" component is really just two small classes so I'm considering creating only two .jar files and including the common .class files in both.
My question: What are the potential problems with doing this?
The potential version mismatch others mentioned is actually one case of a larger set of classloading problems you might face if you deploy the same class(es) in different jars.
Classloading bugs are most likely to bite you in an application server, EJB container, or similar setup, where multiple components or apps are loaded by a hierarchy of classloaders. If the same class is loaded by two different classloaders, the JVM treats the results as totally distinct classes. This can cause runtime errors such as LinkageError (e.g. if two different versions of the same class definition collide, as described in other answers) or ClassCastException (if a cast is attempted between two class definitions loaded by different classloaders). Believe me, classloading hell is a place you don't want to see.
I would put the whole library into a single jar to minimize that risk.
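The point about classloader identity can be reproduced in a few lines. The following is a minimal sketch, not production code: the names ClassLoaderIdentityDemo, Shared and IsolatingLoader are made up for illustration, and it assumes the compiled .class files can be read as resources from the classpath.

```java
import java.io.IOException;
import java.io.InputStream;

public class ClassLoaderIdentityDemo {

    /** Hypothetical stand-in for a "common data structure" shipped in two jars. */
    public static class Shared { }

    /** Defines Shared from its bytecode itself instead of delegating to the parent. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader() {
            super(ClassLoaderIdentityDemo.class.getClassLoader());
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Shared.class.getName())) {
                return super.loadClass(name, resolve); // normal parent delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> loaded = findLoadedClass(name);
                if (loaded != null) {
                    return loaded;
                }
                String resource = name.replace('.', '/') + ".class";
                try (InputStream in = getParent().getResourceAsStream(resource)) {
                    if (in == null) {
                        throw new ClassNotFoundException(name);
                    }
                    byte[] bytes = in.readAllBytes();
                    return defineClass(name, bytes, 0, bytes.length);
                } catch (IOException e) {
                    throw new ClassNotFoundException(name, e);
                }
            }
        }
    }

    public static Class<?> loadIsolated() throws ClassNotFoundException {
        return new IsolatingLoader().loadClass(Shared.class.getName());
    }

    /** Two loaders, same bytes, same name: still two distinct Class objects. */
    public static boolean sameNameDifferentClass() throws ClassNotFoundException {
        Class<?> a = loadIsolated();
        Class<?> b = loadIsolated();
        return a.getName().equals(b.getName()) && a != b;
    }

    /** Casting an instance from the isolated loader to the application's Shared fails. */
    public static boolean castFails() throws Exception {
        Object foreign = loadIsolated().getDeclaredConstructor().newInstance();
        try {
            Shared local = (Shared) foreign;
            return local == null; // not reached when the cast throws
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same name, different class: " + sameNameDifferentClass());
        System.out.println("cast fails: " + castFails());
    }
}
```

In a plain command-line application both jars usually end up on one classpath with one classloader, so the first copy found wins and this exact failure does not show up. It becomes relevant once containers or plugin systems give each jar its own loader.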
If you release a new version of your library, a user who uses both jars (one old, one new) could get runtime exceptions when the new jar picks up a class from 1 out of the old jar (or vice versa, if the change is not backwards compatible). The easiest fix is to release everything in one jar, which avoids this version issue.
One potential problem is that the data structure versions in the two jar files could become inconsistent in the future (say, because of a bug fix or a minor release). If an application then needs both jars, you could get a NoClassDefFoundError or a similar linkage error. I would recommend either splitting it into three jar files or shipping just one bigger one.
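If you do ship the common classes in both jars, it helps to be able to check which classes are duplicated. Here is a rough sketch of such a check; DuplicateClassFinder, writeJar and the lib/... entry names are invented for the example, and the jars written in main contain only empty placeholder entries.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class DuplicateClassFinder {

    /** Maps every .class entry found in more than one jar to the jars that contain it. */
    public static Map<String, List<Path>> findDuplicates(List<Path> jars) throws IOException {
        Map<String, List<Path>> owners = new TreeMap<>();
        for (Path jar : jars) {
            try (JarFile jarFile = new JarFile(jar.toFile())) {
                jarFile.stream()
                       .map(JarEntry::getName)
                       .filter(name -> name.endsWith(".class"))
                       .forEach(name -> owners.computeIfAbsent(name, k -> new ArrayList<>()).add(jar));
            }
        }
        owners.values().removeIf(list -> list.size() < 2); // keep only real duplicates
        return owners;
    }

    /** Demo helper (hypothetical): writes a temporary jar with empty entries of the given names. */
    public static Path writeJar(String prefix, String... entryNames) throws IOException {
        Path jar = Files.createTempFile(prefix, ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out)) {
            for (String name : entryNames) {
                jos.putNextEntry(new JarEntry(name));
                jos.closeEntry();
            }
        }
        return jar;
    }

    public static void main(String[] args) throws IOException {
        Path text = writeJar("text-writer", "lib/Record.class", "lib/TextWriter.class");
        Path binary = writeJar("binary-writer", "lib/Record.class", "lib/BinaryWriter.class");
        findDuplicates(List.of(text, binary))
            .forEach((cls, where) -> System.out.println(cls + " appears in " + where));
    }
}
```

This only tells you that a class name is duplicated. It does not compare the bytes, so it cannot say whether the two copies are the same version.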
I recommend using JarJar which is a library for packaging a number of jar files in a single jar file.
There is an Ant task for integrating it into your build, so your build environment can keep the raw jars while you ship a simple single-jar deployment (but remember to include the license.txt files from the various libraries with your distribution).
Related
Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
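To see why reflection is a problem for this kind of minimization, consider a lookup where the class name is only assembled at runtime. This is an illustrative sketch with made-up names (ReflectiveLookupDemo, Codec, TextCodec, BinaryCodec); it is not taken from the Shade plugin or any real library.

```java
public class ReflectiveLookupDemo {

    public interface Codec {
        String name();
    }

    public static class TextCodec implements Codec {
        public String name() { return "text"; }
    }

    public static class BinaryCodec implements Codec {
        public String name() { return "binary"; }
    }

    /**
     * Builds the class name from a string at runtime. No method body here
     * mentions TextCodec or BinaryCodec, so a tool that only follows
     * references in the bytecode has nothing that points at them.
     */
    public static Codec load(String kind) throws ReflectiveOperationException {
        String simpleName = Character.toUpperCase(kind.charAt(0)) + kind.substring(1) + "Codec";
        String className = ReflectiveLookupDemo.class.getName() + "$" + simpleName;
        return (Codec) Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws ReflectiveOperationException {
        // If a shrinker had removed BinaryCodec, this call would fail
        // with ClassNotFoundException only when it is actually executed.
        System.out.println(load("binary").name());
    }
}
```

The value of kind could just as well come from a configuration file, which is why such dependencies usually have to be listed for the tool by hand.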
Automating this may not be a good approach, because an application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way I can think of is to remove the JARs one by one and check whether the application still runs as expected. Even then, all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency, and must remove unwanted dependencies once his/her piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List of all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done parts 1 and 2 myself, so I can help you with those.
1) Find out all the classes that are loaded. You need 100 % code coverage (I am not talking about tests, but production). So run all possible scenarios, so all the classes your app needs will be loaded and logged.
To log loaded classes, try several approaches: reflection, the -verbose:class flag, or a Java agent. An agent can observe and modify classes as they are loaded at runtime.
2) To find all the classes available in a jar, you can write a program. You need to know all the places where application jars are located. Loop through these jars (you can use ZipFile), loop through the ZipEntry entries, and collect all classes.
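A Java agent for step 1 could look roughly like the sketch below. The class name LoadedClassLogger is made up. It assumes the agent is packaged in a jar whose manifest names it as Premain-Class and that the application is started with -javaagent pointing at that jar. Classes loaded before the transformer is registered (most of the JDK core) are not seen.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentSkipListSet;

public class LoadedClassLogger implements ClassFileTransformer {

    // Thread-safe and sorted, since classes are loaded from many threads.
    private final Set<String> seen = new ConcurrentSkipListSet<>();

    /** Called by the JVM before main() when started with -javaagent:<agent jar>. */
    public static void premain(String agentArgs, Instrumentation inst) {
        LoadedClassLogger logger = new LoadedClassLogger();
        inst.addTransformer(logger);
        // Dump the collected names when the application exits.
        Runtime.getRuntime().addShutdownHook(new Thread(() ->
            logger.loadedClasses().forEach(System.out::println)));
    }

    @Override
    public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain, byte[] classfileBuffer) {
        if (className != null) {
            seen.add(className.replace('/', '.')); // internal form uses slashes
        }
        return null; // null means "leave the bytecode unchanged"
    }

    public Set<String> loadedClasses() {
        return seen;
    }
}
```

The output is only as complete as the scenarios you ran, which is why the 100 % coverage requirement above matters.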
3) After that write a script or program that reassembles your application. For example, now you can create a new jar file for each library and put there only needed classes.
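Steps 2 and 3 can be sketched with java.util.zip alone. The class and method names below (JarShrinker, listClasses, copyOnly) are hypothetical, and the sketch ignores signed jars, multi-release jars and classes that are only reached through reflection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class JarShrinker {

    /** Step 2: dotted class names for every .class entry in the jar. */
    public static Set<String> listClasses(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(toClassName(name));
                }
            }
        }
        return names;
    }

    /** Step 3: copies the jar, skipping .class entries whose class is not in keep. */
    public static void copyOnly(Path source, Path target, Set<String> keep) throws IOException {
        try (ZipFile zip = new ZipFile(source.toFile());
             OutputStream out = Files.newOutputStream(target);
             ZipOutputStream zos = new ZipOutputStream(out)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.endsWith(".class") && !keep.contains(toClassName(name))) {
                    continue; // class never loaded in any recorded run
                }
                zos.putNextEntry(new ZipEntry(name));
                if (!entry.isDirectory()) {
                    try (InputStream in = zip.getInputStream(entry)) {
                        in.transferTo(zos);
                    }
                }
                zos.closeEntry();
            }
        }
    }

    private static String toClassName(String entryName) {
        return entryName.substring(0, entryName.length() - ".class".length()).replace('/', '.');
    }
}
```

The keep set would come from the log produced in step 1. Non-class resources are copied unchanged, because the class log says nothing about which of them are needed.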
You may also use a tool (again, you are a programmer, so write a program) that checks the code for class dependencies. You do not want to remove classes that are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
I have also done what @Gokul Nath KP describes: I manually removed Gradle and Maven dependencies one by one, running a full regression test each time. It took me a week (and our application was small compared to modern enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!
You can skip the wall of text and go straight to the questions listed below, if you are so inclined.
Some background:
I'm currently doing some work on a large scale, highly modular Spring application. The application consists of multiple stand-alone Maven projects which are built separately. When compiling the whole application, these projects are pulled in as dependencies and overlaid onto the resulting 'super WAR' file.
The issue:
The build process (shortly) described in the preceding paragraph works well, but is very slow, even when all dependencies are already compiled and can be fetched from the local maven repository.
Some simple testing reveals that build-time of the 'super WAR' is cut in ~half when jar-compression is turned off entirely, at the cost of a comparatively small (~10%) increase in file size.
This is no surprise, really, as the build requires all the dependencies to be built/compressed and later decompressed, overlaid, and then compressed again (as a huge, unified war file).
Adding to this, a fair few of the "sub-projects" are pure web applications which contain no Java code needing compilation (or compression) at all (only static resources).
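The size side of this trade-off is easy to measure in isolation. The sketch below writes the same payload into an in-memory jar at different Deflater levels; JarCompressionDemo and the entry name static/page.html are invented for the example, and the repetitive HTML payload is an assumption that flatters compression.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.Deflater;

public class JarCompressionDemo {

    /** Size in bytes of an in-memory jar holding one entry written at the given level. */
    public static int jarSize(byte[] payload, int level) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(buffer)) {
            jar.setLevel(level); // Deflater.NO_COMPRESSION .. Deflater.BEST_COMPRESSION
            jar.putNextEntry(new JarEntry("static/page.html"));
            jar.write(payload);
            jar.closeEntry();
        }
        return buffer.size();
    }

    public static void main(String[] args) throws IOException {
        // Repetitive markup standing in for static web resources.
        byte[] payload = "<div class=\"row\">hello</div>\n".repeat(2000)
                .getBytes(StandardCharsets.UTF_8);
        System.out.println("payload bytes:  " + payload.length);
        System.out.println("no compression: " + jarSize(payload, Deflater.NO_COMPRESSION));
        System.out.println("best:           " + jarSize(payload, Deflater.BEST_COMPRESSION));
    }
}
```

Already-compressed content such as images or nested jars gains far less from a second pass, which may be part of why the overall size increase in a WAR stays small.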
Questions:
What are the advantages of jar (war, really) compression, except for the (negligibly) reduced file size?
In the case of Java EE or Spring web applications, are there other (performance) issues introduced when turning off compression entirely? I'd think it has the potential to help both build time and JVM-startup.
Any suggestions on how to handle the build process of non-java applications with maven more efficiently are welcome as well. I've considered bundling them as resources, but am not sure how to achieve this while ensuring they are still buildable as stand-alone war files.
Besides the sometimes negligible reduction in the file size and the simplicity of having to manage only one file instead of an entire directory tree, there are still a few advantages:
Reduced copy time, as per this answer: https://superuser.com/a/360532/145340 I can also back this up by personal experience, copying or moving many small files is much slower than copying or moving an equally large single file.
Portability: The JAR file format is clearly defined, leaving no room for incompatible implementations.
Security: You can digitally sign the contents of a JAR file, ensuring the integrity and authenticity of the contents.
Package Sealing: Enforce version consistency, since all classes defined in a package must be found in the same JAR file.
Package Versioning: A JAR file can hold data such as vendor and version information.
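Sealing and versioning both live in the JAR manifest. As a small sketch, the manifest attributes can be built with java.util.jar.Manifest; the class name ManifestDemo and the vendor and version values are placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

public class ManifestDemo {

    /** Builds a manifest that seals the jar and records vendor and version. */
    public static Manifest build(String vendor, String version) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.SEALED, "true"); // all packages in the jar are sealed
        main.put(Attributes.Name.IMPLEMENTATION_VENDOR, vendor);
        main.put(Attributes.Name.IMPLEMENTATION_VERSION, version);
        return manifest;
    }

    /** Renders the manifest as the text that would end up in META-INF/MANIFEST.MF. */
    public static String render(Manifest manifest) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        manifest.write(out);
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.print(render(build("Example Corp", "1.2.0")));
    }
}
```

At runtime, the version and vendor recorded this way can be read back through Package.getImplementationVersion() and Package.getImplementationVendor() for classes loaded from that jar.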
I am working on a desktop application that uses Hibernate and HSQLDB. When I package my application as a runnable jar file, the file size is bigger than I expected. I see that the biggest part comes from Hibernate and its dependencies. I am not sure whether I need all of the Hibernate features. Is there a way to get rid of the parts of Hibernate and its dependency libraries that I don't use?
Under the /lib/ folder in the Hibernate zip you will see a folder called /required/. For very basic Hibernate apps that's all you will need, though you may need additional JARs for things such as JPA. I would start by including only the JARs in the lib/required/ directory and see if your project works. If it doesn't, add what you need to get your project working again.
Perhaps you could use a tool to analyse your classes and dependencies (e.g. http://www.dependency-analyzer.org/). Here is another post about it: How do I find out what jar files are actually used when compiling a java project.
The other way is to remove some jars (or even single class files) and test whether your application still works. But I don't think this is a very good way.
I can't think of a better tool for this than ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.
I have several applications that differ mostly in their resources. As of now, I'm copying the code around to each application. This can be problematic: for example, I might fix a bug in one application and forget to update the others.
I don't think creating a JAR is appropriate for this situation, as these are application specific UI classes, (actually android activity classes in specific) including the actual app start-up code.
It may be possible to include these source files into several packages, but then I have the problem that each file specifies a specific package name on the first line.
Most of the code is related to the UI and Activity processing. (The actual common code is already in a library). A similar question is posted here.
Are there any elegant solutions to this situation?
A jar is absolutely appropriate for this situation. You should split your application into layers, separating the application-specific classes from the shared code.
I solved this by going with Android Library projects. (Not sure of the details, perhaps they are ultimately jars) Check out details here, specifically the section 'Setting up a Library Project'.
I basically put in all my activity classes (except for the start-up one) into the library.
For true non-UI-bound code, JARs do seem to be the way to go.
I agree with artbristol.
I also recommend to use Maven and:
release the common jars to a corporate Maven repository
declare a dependency with specific versions on these jar artifacts
That way you don't break applications when you make incompatible changes.
I am developing a Java web application that includes an applet. That applet depends on two jar files:
JFreeChart (for plotting graphs on the client side): 1.7 MB
MySqlJdbcConnector (for storing data captured on the client side in a remote database): 0.7 MB
Now, the problem is the size of these two jar files. The total size of my applet jar (myApplet.jar) is 2.5 MB, of which 2.4 MB comes from the two jar files above. I am not using all the classes in those jar files. For JFreeChart in particular, I am using a very small number of classes from the library.
Questions
Q1. For creating myApplet.jar file, what I have done is I have unzipped both of the jar files (jfreechart and mySQLJdbcConnector) and then packed the unzipped version of the jar files with the source code of my applet code to create one single jar file (i.e myApplet.jar). Is it the correct way of packing third party jar files with your applet code? Is there any way by which I can optimize this?
Q2. I tried to find the dependencies of the classes of jfreechart library which I am using in my application so as to pack only those dependencies in myApplet.jar. For that purpose, I used DependencyAnalyzer to find the dependencies of all the classes. But later I found it difficult to do so manually because every class (class of jfreechart that I am using in my application) has lot of dependencies and I am using some 15 classes of jfreechart so doing this for every class will be very difficult. So any suggestion on this?
Q3. Is this a common situation for developers, or am I missing something that forces me to do this?
I'd suggest trying out ProGuard. You can exclude parts of jar files you're not using.
Yes you can save space by creating a JAR containing only the classes that your applet requires. (I've seen this called an uber-JAR.)
There are various tools for doing this; e.g. ProGuard, Zelix ClassMaster, a Maven plugin whose name I forget and so on.
There are however a couple of issues, at least in the general case:
If your code uses dynamic loading (e.g. by calling Class.forName(className)), these tools generally cannot detect the dependency. So to avoid dynamically loaded classes being left out of the final JAR, you need to tell the tool the names of all classes that your application might explicitly load that way.
You need to take a look at the license of the third party library. IIRC, some licenses require you to include the library in your distributed artifacts in a way that allows people to substitute a different version of the library. One could argue that an uber-JAR makes this hard to do, and therefore could be problematic.
JFreeChart is LGPL, and LGPL is a license that has the requirement above. However MySQL is GPL, which trumps LGPL, and this means that your applet must be GPL'ed ... if you distribute it.
Finally, if you want to minimize the size of your applet JAR, you should NOT include your source code in the JAR. Source code should be in a separate JAR (or ZIP, TAR or whatever) file.
A1:
You can create an Ant script or use Eclipse or any other IDE to package your applet automatically. But your way is correct, too.
A2:
I wouldn't do these things manually. Finding transitive dependencies is very complex. Maybe darioo's answer is a better way to do this.
A3:
This is very common indeed. A couple of hints:
You can always re-build those third party libraries without debug information. That should slightly decrease the size of those libraries.
On the other hand, maybe you shouldn't have a direct connection from your applet to a database. You could create an RMI interface (or something similar) to transfer your SQL and result data to an application server, which actually executes your SQL. This is an important security aspect for your applet, if you don't run this in a safe intranet.
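A remote interface along those lines might look like the sketch below. All names (MeasurementServiceSketch, MeasurementService, InMemoryMeasurementService) are hypothetical, the in-memory map merely stands in for the real SQL access on the server, and exporting the object and registering it with an RMI registry are left out.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MeasurementServiceSketch {

    /** Contract the applet would call instead of opening a JDBC connection itself. */
    public interface MeasurementService extends Remote {
        void store(String series, double value) throws RemoteException;

        List<Double> fetch(String series) throws RemoteException;
    }

    /** Server-side implementation; a real one would run the SQL here. */
    public static class InMemoryMeasurementService implements MeasurementService {
        private final Map<String, List<Double>> data = new ConcurrentHashMap<>();

        @Override
        public synchronized void store(String series, double value) {
            data.computeIfAbsent(series, k -> new ArrayList<>()).add(value);
        }

        @Override
        public synchronized List<Double> fetch(String series) {
            // Return a serializable copy, since RMI ships the result to the client.
            return new ArrayList<>(data.getOrDefault(series, List.of()));
        }
    }

    public static void main(String[] args) throws RemoteException {
        MeasurementService service = new InMemoryMeasurementService();
        service.store("temperature", 21.5);
        service.store("temperature", 22.0);
        System.out.println(service.fetch("temperature"));
    }
}
```

With this split the applet needs neither the JDBC driver jar nor the database credentials, which addresses both the size and the security concern.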