Jar sizes different for different people - java

So, I have an interesting question. Three of us use the same Ant build XML file to create a jar file for an Eclipse project that hasn't changed in two months. We each run a build with this file, and we each get a different-sized jar (62 KB, 78 KB, and 101 KB). Also, when I compare them with WinMerge, they are dramatically different.
What could cause this difference?

First thing to try: copy them all onto the same computer, unpack them in different directories, and run WinDiff (or whatever) on the uncompressed version. That will make it much more obvious what's going on.
Other possibilities - different versions of Java using different compression levels by default?
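The comparison of unpacked contents can also be done without extracting anything, by reading each archive's entry names and CRC-32 checksums. The following is only a sketch: the class name JarContentDiff and the wording of the report are invented for illustration, and it assumes all jars have been copied to the same machine. Because the CRC-32 in a zip entry is computed over the uncompressed data, a different default compression level alone would not show up as a difference here.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of Ant or the JDK tools.
public class JarContentDiff {

    // Maps each file entry in the archive to the CRC-32 of its uncompressed data.
    static Map<String, Long> entryChecksums(String jarPath) throws IOException {
        Map<String, Long> checksums = new TreeMap<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            zip.stream()
               .filter(entry -> !entry.isDirectory())
               .forEach(entry -> checksums.put(entry.getName(), entry.getCrc()));
        }
        return checksums;
    }

    // Lists entries that exist on one side only or whose checksums differ.
    static List<String> diff(Map<String, Long> first, Map<String, Long> second) {
        Set<String> names = new TreeSet<>(first.keySet());
        names.addAll(second.keySet());
        List<String> report = new ArrayList<>();
        for (String name : names) {
            Long left = first.get(name);
            Long right = second.get(name);
            if (left == null) {
                report.add("only in second: " + name);
            } else if (right == null) {
                report.add("only in first: " + name);
            } else if (!left.equals(right)) {
                report.add("content differs: " + name);
            }
        }
        return report;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java JarContentDiff first.jar second.jar");
            return;
        }
        List<String> report = diff(entryChecksums(args[0]), entryChecksums(args[1]));
        if (report.isEmpty()) {
            System.out.println("same entries, same contents");
        }
        report.forEach(System.out::println);
    }
}
```

Entries reported as "only in first" or "only in second" would point toward stale files or differing dependencies, while "content differs" on .class files would point toward different compilers or settings.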

In addition to Jon's suggestion, are there user preferences that the Ant script may pick up?
E.g. a build.properties file that can live in the project directory, the user's home directory, etc.
It may be the case that there are different customizations of the project on each person's workstation.

The only possible causes I can see are:
Difference in environment variables resulting in difference in compilers. Does your ant script use any other utility in addition to the regular javac for compilation or packaging? Ex: does it use AspectJ or some other assembling utility? Is that dependent on environmental variables that are different for different machines?
Difference in the size of dependent jars (one of you might have commons-logging-1.8 while the other may have a different version for example)
Do you invoke any other build utility from ant that does dependency management such as ivy for instance?
You said winmerge shows dramatic differences. Are these w.r.t. the size of the various components inside the jar or are there structural differences (folder structures, different files etc.)? The latter would be more perplexing for sure.

Sounds like the folders being jarred together are not empty when the process begins.

Related

How to know if a .jar built on AMD64 will run flawlessly on ARM?

I built a .jar in docker on the ARM architecture and one on AMD64.
The two .jar files are identical in size, but vbindiff says their contents are quite different.
I tested both .jar files on my AMD64 computer and the reversed slogan "build anywhere, run once" holds.
My hypothesis is that this has to do with Java Native Interface (JNI). The .jar is a Spring Boot Webflux backend. Unfortunately, I don't know if it or any other dependency uses JNI.
I noted that the ARM image has JDK 17.0.3 installed, while the AMD64 image has JDK 17.0.2. But this should not be a problem, since I built both .jars using the Gradle wrapper, which specifies the exact toolchain to be downloaded and used to build the project:
kotlin("jvm") version "1.6.10"
What could be the reason for the difference? Can I assume that either .jar can be used on any platform that has a compatible JVM?
EDIT: I followed Thomas' advice and used diff -r to compare the extracted contents of the .jar files. They are identical.
However, diff confirms that the .jar files themselves are different.
I just learned that the .jar format is based on .zip, which can use various compression methods, as well as include extra information in file headers, such as 'last modified' or optional OS-specific attributes. Mystery solved.
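The effect of per-entry metadata can be reproduced in a few lines of Java. This is a minimal sketch, not a description of what Gradle does: the class name ZipMetadataDemo, the entry name hello.txt, and the two timestamps are arbitrary choices for illustration. It builds two in-memory archives with the same payload and compression level, differing only in the entry's 'last modified' time.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class; names and timestamps are illustrative only.
public class ZipMetadataDemo {

    // Builds a one-entry archive in memory with the given timestamp and compression level.
    static byte[] buildArchive(byte[] content, long modifiedMillis, int level) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.setLevel(level);
            ZipEntry entry = new ZipEntry("hello.txt");
            entry.setTime(modifiedMillis); // stored in the entry header
            zip.putNextEntry(entry);
            zip.write(content);
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] content = "same payload in both archives".getBytes(StandardCharsets.UTF_8);
        byte[] first = buildArchive(content, 1_600_000_000_000L, Deflater.BEST_COMPRESSION);
        byte[] second = buildArchive(content, 1_700_000_000_000L, Deflater.BEST_COMPRESSION);
        // The payload is identical, but the stored timestamps make the archive bytes differ.
        System.out.println("archives byte-identical: " + Arrays.equals(first, second));
    }
}
```

Two builds run at different times would therefore be expected to differ at the byte level even when every extracted file matches.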
You can list the files in the .jar using jar tf myfile.jar. If the only contents are .class files and the manifest data in META-INF/, then chances are good that it's 100% portable. If you see other files like .so, .dll or .dylib, then there is native code in there which might give trouble.
Here's how you can list all the files which might warrant a closer look:
jar tf myfile.jar | grep -Pv '^META-INF/|(\.class|/)$'
Since you have already built the .jars on two different platforms, you can also extract their contents using jar xf myfile.jar and use diff -r to compare them recursively. This is a more robust way to detect differences than comparing the archives directly, although I imagine that .class files might not be byte-wise identical either even if they're semantically the same.
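The same filter as the grep command above can be written in Java, which may be handy on machines without GNU grep. This is a sketch only: the class name JarEntryFilter and its method names are made up, and the rule it applies is just the one from the command above (skip META-INF/, .class files, and directory entries).

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper mirroring: jar tf myfile.jar | grep -Pv '^META-INF/|(\.class|/)$'
public class JarEntryFilter {

    // True for entries that are not manifest data, compiled classes, or directories.
    static boolean needsCloserLook(String entryName) {
        return !(entryName.startsWith("META-INF/")
                || entryName.endsWith(".class")
                || entryName.endsWith("/"));
    }

    // Keeps only the entry names that deserve a manual check, in their original order.
    static List<String> suspicious(List<String> entryNames) {
        return entryNames.stream()
                .filter(JarEntryFilter::needsCloserLook)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JarEntryFilter myfile.jar");
            return;
        }
        try (JarFile jar = new JarFile(args[0])) {
            List<String> names = jar.stream()
                    .map(JarEntry::getName)
                    .collect(Collectors.toList());
            suspicious(names).forEach(System.out::println);
        }
    }
}
```

Most of what it prints will be harmless resources such as property or YAML files; entries ending in .so, .dll or .dylib are the ones that indicate native code.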

Executable JAR execution different from eclipse execution

I have created an app in Java that processes Excel files using Apache POI. When I run the code from Eclipse, it works fine. But when I build an executable jar for the app (using Eclipse's export option for runnable jars), the jar runs, yet the results are different; even the size of the produced Excel file is different.
I did a lot of research, but I did not find a suitable solution.
Ah yes. I have had that same experience too a few years back.
When creating the runnable .jar in Eclipse, you can choose how .class files from libraries (such as Apache POI in this case) are handled:
Package required classes (.class files) into jar file
Package required libraries (.jar files) into jar file
Copy libraries into a sub-folder
Interestingly, with Apache POI, the three different ways of packing create HUGE differences:
In startup speed
In execution speed
In memory requirements (RAM)
In the resulting output files
I cannot recall which gave me the expected results.
So you have to try them out yourself. (Judging by how Eclipse starts Java projects, it should be #3, libs in subfolder, that gets you the same results). But: try the others anyhow; as I said, HUGE differences ahead.
TBH Apache POI is a 'good' example of how software should NOT be written.
It's awfully bloated and mega RAM hungry and has quite an interesting/odd behavior.
So I wrote my own lib for the newer .xls file format which is just a 100 times faster, smaller and more reliable. And does string caching and cell format operation optimization a lot better. So a 1000000 times better :-P
The upside is that the POI dev team knows the limitations and shortcomings of their project and offers multiple modes of processing files, to overcome said shortcomings. So, after all, kudos to them!

Standard location to put jar files for command line java programs

Several developers have created standalone Java command-line programs. These programs share libraries, such as the SQL Server jar. By convention, what is the preferred or standard location for these shared external jar files?
/usr/local/lib
/opt
/var/lib
The location doesn't matter as much as standardizing the location. Another solution would be to use a dependency management system like Maven and package the dependencies in the jar. However, this would be inefficient if you are reusing the jars across multiple projects but it does ensure that the dependencies are present and isn't susceptible to someone swapping the version of the dependency in the shared folder with a newer version that breaks other applications.
Java provides several ways to manage class and library loading, but looking for one location based on the OS is not something that is in line with keeping Java platform independent. Instead try defining a common location for your project based on how Java finds classes.
Also, if you are executing your command line app from jars, Maven has some nice plugins to help bundle your java classes as an executable jar. It is much cleaner and encapsulates the libraries within the scope of your individual applications.

How do I include java stuff in .jar files?

Okay. So here's my question: I am making a data parser in Clojure. One part of my program is that it has to be able to graph the data. I figure, I'll use jFreeChart. However, I have absolutely NO IDEA how to include stuff in JAR files. What I mean is: if I have a app.jar file in my classpath, I don't seem to be able to do:
import app.thing.thing2
without changing the classpath to be inside the jar file.
The idea here is that I don't think I can change my classpath since I need to set it to run Clojure (Or do I?). The global classpath is currently /usr/share/java.
And please don't ask me to use Maven, Ant or any project-building tool unless it is the only way to do this. This is a script for personal use that doesn't need or want a whole lot of overhead.
I wonder if I should just unpack every JAR file, so that I can reference the directory structure? Is this bad?
Let me know if you need any clarifications!
The content of the (Java) CLASSPATH environment variable is available to Clojure, so if you add your jar to the global classpath before running Clojure, you'll "see" it:
export CLASSPATH=/path/to/jfreechart.jar:$CLASSPATH
But, in my opinion, this is not the "clean" way to add a jar to Clojure's classpath (because this makes the library visible to any Java program and may not be desired). Instead, you should use the CLOJURE_EXT environment variable. This is how this variable is documented:
# CLOJURE_EXT The path to a directory containing (either directly or as
# symbolic links) jar files and/or directories whose paths
# should be in Clojure's classpath. The value of the
# CLASSPATH environment variable for Clojure will be a list
# of these paths followed by the previous value of CLASSPATH
# (if any).
On my system, it is defined as below:
export CLOJURE_EXT=~/.clojure
So, to add jfreechart.jar (or any other library) to Clojure's classpath, copy it (or add a symlink pointing to it) into the directory defined in the CLOJURE_EXT variable.
And by the way (I'm sorry, but your question is not that clear), if you want to bundle some Java classes into a jar, the command is something like this:
$ jar cf myjarfile.jar *.class
You'll find documentation of jar - the Java Archive Tool - here.
I completely respect your desire not to use a project management tool, though I just spent longer typing this sentence than it takes to set up Leiningen. For your one-off script any tool is going to be overkill, and Pascal Thivent's answer covers this very well. For people reading this question who perhaps want to produce a jar file, or easily load their Clojure into emacs/slime-swank, I can't recommend Leiningen strongly enough.
If you are going back to basics, you can inline your classpath to include the hardcoded location of your jars, so if you are on Windows it will look something like
java -cp .;%CLASSPATH%;C:/here/it/is/foo.jar com.foo.MyClass
Not sure how clojure is run, but don't you just add the jar file to the classpath?
i.e.
/usr/share/java:/home/user/myjarfile.jar
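To check what the JVM actually ended up with, the effective classpath can be printed from inside a running program. This is a small sketch under assumptions: the class name ClasspathDump is invented, and it only reads the standard java.class.path system property; it does not know anything about how a particular Clojure launcher script assembles its classpath.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic class for illustration.
public class ClasspathDump {

    // Splits a classpath string on the given separator (":" on Unix, ";" on Windows).
    static List<String> splitClasspath(String classpath, String separator) {
        return Arrays.asList(classpath.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        String classpath = System.getProperty("java.class.path");
        for (String entry : splitClasspath(classpath, File.pathSeparator)) {
            // Flag entries that do not exist on disk, a common cause of missing classes.
            boolean exists = new File(entry).exists();
            System.out.println(entry + (exists ? "" : "  (missing)"));
        }
    }
}
```

One thing this makes visible: a plain directory entry such as /usr/share/java only exposes .class files under that directory. Jar files inside it are not picked up unless each jar is listed by name or the entry is written with a wildcard, as in /usr/share/java/*.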

Eclipse Java project folder organization

I am coming to Java and Eclipse from a C#/Visual Studio background. In the latter, I would normally organize a solution like so:
\MyProjects\MyApp\MyAppsUtilities\LowerLevelStuff
where MyApp would contain a project to build a .exe, MyAppsUtilities would make an assembly DLL called by the .exe, and LowerLevelStuff would probably build an assembly containing classes used by the higher-level utilities DLL.
In Eclipse (Ganymede, but could be convinced to switch to Galileo) I have:
\MyProjects\workspace\MyApp
That is what I get when I create my initial project. There is an option to put source and build files in the same folder, but my .java files are created on a path that reflects my package hierarchy:
\MyProjects\workspace\MyApp\src\com\mycompany\myapp\MyApp.java
My question is this: when I create subprojects (is that the right Java/Eclipse term?) for .jar files that will be analogous to the above MyAppsUtilities and LowerLevelStuff assembly DLLs in .NET, can (should) I organize the folders equivalently? E.g.:
\MyProjects\workspace\MyApp\src\com\mycompany\myapp\myapputilities\MyAppsUtilities.java
What is the standard/right way to organize this stuff, and how is it specifically done in the IDE?
Think of Java source code packages as one big hierarchical namespace. Commercial applications typically live under 'com.mycompany.myapp' (the website for this application might be 'http://myapp.mycompany.com' although this is obviously not always the case).
How you organize stuff under your myapp package is largely up to you. The distinction you make for C# between executables (.exe), DLLs and low-level classes does not exist in the same form in Java. All Java source code is compiled into .class files (the contents of which are called 'bytecode'), which can be executed by a Java Virtual Machine (JVM) on many platforms. So there is no inherent distinction between high-level and low-level classes, unless you attribute such levels via your packaging. A common way of packaging is:
com.mycompany.myapp: main class; MyApp (with a main method)
com.mycompany.myapp.model: domain model classes; Customer, Order, etc.
com.mycompany.myapp.ui: user interface (presentation or view) code
com.mycompany.myapp.service: services within your application, i.e. 'business logic'
com.mycompany.myapp.util: helper classes used in several places
This suggests a standalone Java app; it might be different for a webapp using one of the many frameworks.
These packages correspond to a directory hierarchy in your project. When using Eclipse, the root of such a hierarchy is called a 'source directory'. A project can define multiple source directories, commonly a 'main' and a 'test' source directory.
Example of files in your project:
src/test/java/com/acme/foo/BarTest.java
src/main/java/com/acme/foo/Bar.java
lib/utilities_1_0.jar
And inside utilities_1_0.jar:
com/acme/foo/BarUtils.class
BarUtils.class is a compiled Java class, so it is in platform-independent bytecode form that can be run on any JVM. Usually jar files only contain the compiled classes, although you can sometimes download a version of the jar that also contains the source (.java) files. This is useful if you want to be able to read the original source code of a jar file you are using.
In the example above Bar, BarTest and BarUtils are all in the same package com.acme.foo but physically reside in different locations on your harddisk.
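The mapping between package names and locations on disk is mechanical, which a few lines of Java can illustrate. This is purely a sketch: the class PackagePaths and its two methods are invented for this example and are not part of Eclipse or the JDK.

```java
// Hypothetical helper showing how a qualified class name maps to file locations.
public class PackagePaths {

    // Where the source file lives, relative to the project, for a given source directory.
    static String sourcePath(String sourceRoot, String qualifiedName) {
        return sourceRoot + "/" + qualifiedName.replace('.', '/') + ".java";
    }

    // The entry name the compiled class has inside a jar file.
    static String classEntry(String qualifiedName) {
        return qualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // prints src/main/java/com/acme/foo/Bar.java
        System.out.println(sourcePath("src/main/java", "com.acme.foo.Bar"));
        // prints src/test/java/com/acme/foo/BarTest.java
        System.out.println(sourcePath("src/test/java", "com.acme.foo.BarTest"));
        // prints com/acme/foo/BarUtils.class
        System.out.println(classEntry("com.acme.foo.BarUtils"));
    }
}
```

All three classes share the package com.acme.foo; only the root they hang under (a source directory or a jar) differs.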
Classes that reside directly in a source directory are in the 'default package'. It is usually not a good idea to keep classes there: it is not clear which company and application the class belongs to, and you can get name conflicts if any jar file you add to your classpath contains a class with the same name in the default package.
Now if you deploy this application, it would normally be compiled into .class files and bundled in a .jar (which is basically a fancy name for a .zip file plus some manifest info).
Making a .jar is not necessary to run the application, but handy when deploying/distributing your application. Using the manifest info you can make a .jar file 'executable', so that a user can easily run it, see [a].
Usually you will also be using several libraries, i.e. existing .jar files you obtained from the Internet. Very common examples are log4j (a logging framework) or JDBC libraries for accessing a database etc. Also you might have your own sub-modules that are deployed in separate jarfiles (like 'utilities_1_0.jar' above). How things are split over jarfiles is a deployment/distribution matter, they still all share the universal namespace for Java source code. So in effect, you could unzip all the jarfiles and put the contents in one big directory structure if you wanted to (but you generally don't).
When running a Java application that uses or consists of multiple libraries, you run into what is commonly referred to as 'classpath hell', one of the biggest drawbacks of Java as we know it (note: help is supposedly on the way). To run a Java application on the command line (i.e. not from Eclipse) you have to specify every single .jar file location on the classpath. When you are using one of Java's many frameworks (Maven, Spring, OSGi, Gradle) there is usually some form of support to alleviate this pain. If you are building a web application, you would generally just have to adhere to its layering/deployment conventions to be able to easily deploy the thing in the web container of your choice (Tomcat, Jetty, Glassfish).
I hope this gives some general insight in how things work in Java!
[a] To make an executable jar of the MyApp application you need a JDK on your path. Then use the following command line in your compile (bin or target) directory:
jar cvfe myapp.jar com.mycompany.myapp.MyApp com\mycompany\myapp
You can then execute it from the command line with:
java -jar myapp.jar
or by double-clicking the jar file. Note you won't see the Java console in that case so this is only useful for applications that have their own GUI (like a Swing app) or that may run in the background (like a socket server).
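What the e option of the jar command adds is a Main-Class line in the jar's manifest. The sketch below generates the same kind of manifest text with the JDK's java.util.jar API, just to show what ends up in META-INF/MANIFEST.MF. The class name ManifestSketch is made up for illustration, and the main class com.mycompany.myapp.MyApp is taken from the example above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical demo class; it only renders manifest text and does not build a jar.
public class ManifestSketch {

    // Renders the manifest text that marks the given class as the jar's entry point.
    static String manifestFor(String mainClass) {
        Manifest manifest = new Manifest();
        Attributes main = manifest.getMainAttributes();
        // Set Manifest-Version explicitly; the manifest format expects it.
        main.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        main.put(Attributes.Name.MAIN_CLASS, mainClass);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            manifest.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(manifestFor("com.mycompany.myapp.MyApp"));
    }
}
```

When java -jar myapp.jar is run, the JVM reads the Main-Class attribute from the manifest to decide which class's main method to invoke.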
Maven has a well-thought-out standard directory layout. Even if you are not using Maven directly, you can think of this as a de facto standard. Maven "multi module" projects are a fair analogy to the .NET multiple-assembly layout that you described.
Typically you would create related/sub projects as different Projects in Eclipse.
There are two things you need to clarify before this question can be answered:
Which source code repository will you use?
Which build system will you use to automatically build artifacts outside of Eclipse?
The answers will strongly influence your options.
We have opted for "one Eclipse project per component", where a component may be either a library or a finished runnable/executable jar. This has made it easy to automate with Hudson. Our usage of CVS is also easier, since single projects do not have multiple responsibilities.
Note, each project may contain several source folders separating e.g. test code from configuration from Java source. That is not as important as simplifying your structure.
