Why would I not use a wildcard in my classpath?

Are there any disadvantages to using Java 6 wildcards in my classpath? For example:
C:> set CLASSPATH=.\lib\*
I can see that if two jars both contain a class with the same fully qualified name, using a wildcard may lead to results that are hard to track down.
But other than that, is there anything else to be aware of?

If it's what you want to do, then do it, as long as you are aware of the consequences. Keep in mind that if anyone else has to maintain the project, they may copy a bunch of jars into that folder without realizing that they'll be picked up by default. It shouldn't take them too long to see what's going on, though.
I generally try to minimize the number of jar files I use, and link them all in manually. I realize this is personal preference.

You might load undesired classes by doing so, and if there are two versions of the same library on the classpath, well, kaboom. (Note that the order in which jars within a wildcard entry are searched is unspecified, so which version wins is anyone's guess.)

An explicit classpath can serve as documentation of what libraries (and perhaps what versions thereof!) the application depends on.
You lose this if you use wildcards - if it's not documented elsewhere, then if someone gets a copy of the app without the lib folder (or you delete it accidentally), they'll have a very hard time tracking down all the dependencies by repeatedly running the app, looking at ClassNotFoundErrors, and hoping that all the libraries used sensible package names.

You are potentially giving the JVM a lot of places to search, which might add some overhead as classes are loaded - though I would guess a clever JVM handles this efficiently.

My first reflex was "don't use the CLASSPATH environment variable", but on second thought, while considering why one shouldn't, I started liking the idea, at least for a local development and test environment (and at least for a while).
The advantage of this approach is that you can keep a local folder with all your common libraries (log4j, dom4j, Joda-Time, Google Collections, the Apache Commons zoo, ...). That way you can compile and execute all your applications from the shell without wasting time typing long classpath arguments.
And you're still free to use a -cp argument, because it replaces the global CLASSPATH setting.
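For example (a sketch; the package and jar names here are made up), with the global wildcard set as above:
C:> java com.example.Main
picks up everything in .\lib, while an explicit
C:> java -cp .\build;.\lib\log4j-1.2.jar com.example.Main
ignores the CLASSPATH variable entirely and uses only the listed entries.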
I would never use it on a production system. The risk is just too high that someone changes the content of that folder or the CLASSPATH variable and my application stops working.
So for production, no global 'CLASSPATH' and no wildcards in the classpath string.
A disadvantage of using the wildcard path in an environment like the one above: after a while, too many projects depend on the single library folder. You don't know the side effects of updating a library or deleting an old one. And for a large application it might be hard to find out which libraries from the pool are really needed. You might end up shipping unused libs with the product just because you're unsure whether the application will run without them.
So my conclusion: a nice shortcut for development, testing, and prototyping, but risky for production. For production I'd prefer (auto-generated) classpath strings without wildcards.

Related

Library does not find its own "sublibrary"

I'm trying to make an address list in Java which saves its contents in an SQLite database.
For that (and for other future uses), I tried to create my own library for all kinds of database connections ("PentagonsDatabaseConnector-1.0.jar"). It currently supports SQLite and MySQL.
It references other libraries that provide the JDBC drivers ("mysql-connector-java-8.0.16.jar" and "sqlite-jdbc-3.30.1.jar").
Problem: my library works just fine when I access it from its own project folder, but as soon as I compile it and add it to the "Adressliste" project, it can no longer find the JDBC drivers (I can access the rest of my self-written library without problems, though). Also, as shown in the screenshot, "PentagonsDatabaseConnector-1.0.jar" brings the JDBC libraries with it in its "lib" folder.
LINK TO THE SCREENSHOT
Do you guys have an idea what's wrong?
Thank you for your help!
PS: Sorry for the bad English, I'm German :)
Java cannot read jars-in-jars.
Dependencies come in a few flavours. In this case, PentagonsDC is a normal dependency; it must be there at runtime, and also be there at compile time.
The JDBC libraries are a bit special; they are runtime-only deps. You don't need them to be around at compile time. You want this, because JDBC libraries are, as a concept, pluggable.
Okay, so what do I do?
Using a build system to manage your dependencies is the answer 90%+ of Java programmers go with, and it is what I recommend you do here. Particularly for someone starting out, I advise Maven. There you just put the names of your dependencies in a text file and Maven takes care of the rest, at least at compile time.
For the runtime aspect, you have a few options. It depends on how your java app runs.
Some examples:
Manifest-based classpaths
You run your Java application 'stand alone': you wrote the public static void main(String[]) method that starts the app, and you distribute it everywhere it needs to run. In this case the usual strategy is to have an installer (you need a JVM on the client to run your application, and neither Oracle nor any OS vendor supports maintaining a functioning JVM on end users' systems anymore; it is now your job, which is unfortunately non-trivial). Given that you have one, you should deploy your jars such that they contain in the manifest (jars are zips; the manifest ends up at META-INF/MANIFEST.MF):
Main-Class: com.of.yourproj.Main
Class-Path: lib/sqlite-jdbc.jar lib/mysql-jdbc.jar lib/guava.jar
And then have a directory structure like so:
C:\Program Files\yourapp\yourapp.jar
C:\Program Files\yourapp\lib\sqlite-jdbc.jar
C:\Program Files\yourapp\lib\mysql-jdbc.jar
Or the equivalent on any other OS. The classpath entries in the manifest are space-separated and resolved relative to the directory that yourapp.jar is in. Done this way, you can run yourapp.jar from anywhere, and it, along with all entries listed in Class-Path, is available to it.
Build tools can make this manifest for you.
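If you want to build such a jar by hand (a sketch; manifest.txt and the classes directory are assumptions), the jar tool that ships with the JDK can attach the manifest for you:
C:> jar cfm yourapp.jar manifest.txt -C classes .
where manifest.txt contains the Main-Class and Class-Path lines shown above.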
Shading / Uberjars
Shading is the notion of packing everything into a single giant jar: not jars-in-jars, but unpacking the contents of your dependency jars into the main app jar. This can be quite slow in the build (if you have a few hundred MB worth of deps, those need to be packed in, and all class files need analysis for the shade rewrite; that's a lot of bits to process, so it always takes some time). The general idea behind shading is that deployment 'is as simple as transferring one jar file', but this is not actually practical, given that you can no longer assume end users have a JVM installed, and even if they do, you cannot rely on it being properly up to date. I mention it here because you may hear about it from others, but I wouldn't recommend it.
If you really do want to go for this, the only option is build systems: They have a plugin to do it; there is no command line tool that ships with java itself that can do this. There are also caveats about so-called 'signed jars' which cannot just be unpacked into a single uberjar.
App container
Not all Java apps are standalone apps where you provide the main. If you're writing a web service, for example, you have no main at all; the framework does. Instead of a single entry point ('main', the place where your code initially begins execution), web services have tons of entry points: one for every URL you want to respond to. The framework takes care of invoking them, and these frameworks usually have their own documentation and specs for how dependencies are loaded. Usually it is a matter of putting a jar in one place and its dependencies in a subdirectory named 'lib', or you build a so-called war file; but really, there are many web frameworks and many options for how they do this. The good news is, it's usually simple and the tutorial for the framework in question will cover it.
This advice applies to any 'app container' system; those are usually web frameworks, but there are non-web related frameworks that take care of launching your app.
Don't do these
Don't force your users to manually supply the -classpath option or mess with the CLASSPATH environment variable.
Don't try to write a custom classloader that loads jars-in-jars.
NB: SQLite is rather awkward for Java; it doesn't get you many of the benefits that the 'lite' is supposed to bring, as it is a native dependency. The simple, works-everywhere solution in the Java sphere is H2, which is written entirely in Java, so shipping the whole H2 engine as part of the Java app is possible with zero native components.
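If you do go with H2, connecting is plain JDBC. A minimal sketch (the database path ./data/addresses and the table layout are made up for illustration; you still ship the H2 jar via one of the mechanisms above):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class AddressDb {
    public static void main(String[] args) throws SQLException {
        // Embedded H2: creates ./data/addresses.mv.db on first use; no native code involved
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./data/addresses", "sa", "")) {
            conn.createStatement().execute(
                "CREATE TABLE IF NOT EXISTS contacts(id INT PRIMARY KEY, name VARCHAR(100))");
        }
    }
}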

Find out which Java classes are actually loaded and reduce jar

Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all the features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option, minimizeJar, for when it creates an uber-JAR of your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others have already pointed out, this is quite dangerous, as it regularly fails to detect class accesses done via reflection or other dynamic references.
It may not be a good approach to automate, as the application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way that I can think of is to remove the JARs one by one and check whether the application runs as expected. Then again, with this approach all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency, and should remove unwanted dependencies after his/her piece of code is done.
Global strategy:
1) Find all the classes that are loaded during runtime.
2) List all the classes available in the classpath.
3) Reduce your classpath by creating copies of jars containing only the classes you need.
I have done parts 1 and 2, so I can help you with those.
1) Find out all the classes that are loaded. You need 100% code coverage (I am not talking about tests, but production). So run all possible scenarios, so that all the classes your app needs are loaded and logged.
To log loaded classes, try several approaches: reflection, the -verbose:class flag; you can also learn about Java agents, which allow you to inspect and modify classes as they are loaded at runtime. This is an example of some Java agent code, or another Java agent example.
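A minimal sketch of such an agent (the names are made up; you would package this in a jar whose MANIFEST.MF declares Premain-Class: ClassLoggerAgent, then start the app with -javaagent:classlogger.jar). For a quick start, java -verbose:class gives similar output with no code at all:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

public final class ClassLoggerAgent {
    // Called by the JVM before main() when started with -javaagent:classlogger.jar
    public static void premain(String args, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain domain, byte[] classfileBuffer) {
                // className is the JVM-internal name, e.g. com/foo/Bar
                System.err.println("LOADED " + className);
                return null; // null = leave the bytecode unchanged
            }
        });
    }
}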
2) To find all the classes available in a jar, you can write a program. You need to know all the places where application jars reside. Loop through these jars (you can use ZipFile), loop through the ZipEntry entries, and collect all the classes.
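A sketch of part 2 (JarClassLister and classesIn are made-up names):

import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public class JarClassLister {
    // Returns every class in one jar, in binary-name form (com.foo.Bar)
    static List<String> classesIn(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jarPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    // com/foo/Bar.class -> com.foo.Bar
                    names.add(name.substring(0, name.length() - ".class".length())
                                  .replace('/', '.'));
                }
            }
        }
        return names;
    }
}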
3) After that, write a script or program that reassembles your application. For example, you can create a new jar file for each library and put only the needed classes in it.
Also, you may use a tool (again, you are a programmer, so write a program) that checks the code for class dependencies. You do not want to remove classes that are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
As @Gokul Nath KP notes, I did this before. I manually changed Gradle and Maven dependencies, removing them one by one, and then ran a full regression test. It took me a week (our application was small compared to modern enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!

Extract a reference graph while compiling Java codebase?

Background:
I'm working with (for me) a reasonably large codebase (e.g. I've only got a few of the related projects checked out at the moment, and it's > 11,000 classes).
The build is Ant, tests are JUnit, CI is Jenkins.
Running all tests before check-in is not an option; it takes Jenkins hours. Even for some of the individual apps it can be 45 minutes.
There are some tests that don't reference the tested methods directly, relying on reflection, and in some cases don't even directly reference the class of the tested methods, as they interrogate an aggregator class and are aware of the patterns of pass-through methods in use here. As it's a big codebase with > 10 developers, and I'm not in charge, this is something I cannot change for now.
What I want is the ability, before check-in, to print out a list of all test classes that are two degrees away (Kevin-Bacon-wise) from any class in the git diff list. That way I can run them all and cut down on angry emails from Jenkins when something I missed eventually gets run and fails.
The easiest way I can think of to achieve this is to code it myself with a Ruby script or similar, which lets me account for some of the patterns we're using; but to do it I need to be able to query "which classes reference class X?"
I could parse .java or (easier) .class files to get this info, but I'd rather not :) Is there a way I can make Javac export it in a simple format as it compiles?
Is there a way I can make Javac export it in a simple format as it compiles?
AFAIK, no.
However, there are other ways to get a list of the dependencies:
How do I get a list of Java class dependencies for a main class?.
(Note however that you are unlikely to get a static tool to extract dependencies resulting from Class.forName(), etcetera. Also note that you cannot infer the complete set of dependencies from bytecode files because of the way that "compile time constants" are handled.)
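One concrete option if you are on JDK 8 or later (a sketch; app.jar is a placeholder): the jdeps tool that ships with the JDK prints class-level dependencies, which you can invert to answer "which classes reference class X?":
jdeps -verbose:class app.jar
The caveat above still applies: jdeps only sees static references, not Class.forName() lookups.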
It strikes me that there are a few problems here:
It sounds to me like your build, and indeed your project structure, is monolithic. If you could restructure the codebase into large-scale modules that build separately (according to their dependencies) and are version-controlled separately, then you would only need to do a full build and run all unit tests when there is a change high up, in a module that everything else depends on. (Can I suggest the "Maven" word? It really helps for a large codebase, and 11,000 classes is large.)
It sounds like you may be suffering from the "branches are hard" problem of classic VCS systems.
It sounds like you may need a beefier CI system. If you've got more cores and the build framework is right, you should be able to get faster CI builds. (And if you modularize so that you rebuild less ...)
I think it might be easier to address your slow build/test cycle that way rather than via extra (possibly bespoke) tooling to do dependency analysis.
But I recognize that it may not be up to you to make those decisions.

Hibernate: is it possible to reduce file size of jar?

I am working on a desktop application that uses Hibernate and HSQLDB. When I package my application as a runnable jar file, it has a bigger file size than I expected. I can see that the biggest part comes from Hibernate and its dependencies. I am not sure I need all of the Hibernate features. Is there a way to get rid of the parts of Hibernate and its dependency libraries that I don't use?
Under the /lib/ folder in the Hibernate zip you will see a folder called /required/. For very basic Hibernate apps, that's all you will need, though you may need additional JARs for things such as JPA. I would start by including only the JARs in the lib/required/ directory, see if your project works, and if it doesn't, add what you need to get your project working again.
Perhaps you could use a tool to analyse your classes and dependencies (e.g. http://www.dependency-analyzer.org/). Here is another post about it: How do I find out what jar files are actually used when compiling a java project.
The other way is to remove some jars (or even single class files) and try whether your application still works. But I don't think this is a very good way...
I can't think of a better tool for this than ProGuard.
ProGuard is a free Java class file shrinker, optimizer, obfuscator, and preverifier. It detects and removes unused classes, fields, methods, and attributes. It optimizes bytecode and removes unused instructions. It renames the remaining classes, fields, and methods using short meaningless names. Finally, it preverifies the processed code for Java 6 or for Java Micro Edition.

A tool to detect broken JAR dependencies on class and method signature level

The problem scenario is as follows (note: this is not a cross-jar dependency issue, so tools like JarAnalyzer, ClassDep, or Tattletale would not help. Thanks).
I have a big project which is compiled into 10 or more jar artifacts. All jars depend on each other and form a dependency hierarchy.
Whenever I need to modify one of the jars, I check out the relevant source code and the source code for the projects that depend on it, modify the code, compile, and repackage the jars. So far so good.
The problem is: I may forget to check one of the dependent projects, because inter-jar dependency chains can be quite long and may change over time. If this happens, some jars may go "out of sync" and I will eventually get a NoSuchMethodError or some other class-incompatibility issue at run-time, which is what I want to avoid.
The only solution I can think of, and the most straightforward one, is to check out all projects and recompile the bunch. But this takes time, especially if I rebuild on every small change. I do have a continuous integration server that could do this for me, but it's shared with other developers, so seeing whether the build breaks is not an option for me.
However, I do have all the jars, so hypothetically it should be possible to verify whether the jars that depend on the code I modified have an inconsistency in method signatures, class names, etc. But how could I perform such a check?
Has anyone faced a similar problem before? If so, how did you solve it? Any tools or methodologies would be appreciated.
Let me know if you need clarification. Thanks.
EDIT:
I would like to clarify my question a little bit.
The ultimate goal of this task is to check that the changes I have made will still compile against the whole project. I am looking for a tool/technique that would help me perform such a check.
Consider this example:
You have 2 projects: A and B which are deployed as A.jar and B.jar respectively. A depends on B.
You wish to modify B, so you check it out and change a method signature that A happens to depend on. You can compile B and run all of its tests without any problems, because B itself does not depend on anything. So you happily commit your changes.
In a few hours the complete project integration fails because A could not be compiled!
How do I avoid this?
The kind of tool I am looking for would take A.jar and check that all of A's dependencies on the newly modified B are still satisfied - like catching the compilation error that would occur if I recompiled the A and B sources together.
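To make the failure mode concrete (class and method names are made up):

// B.jar, old version, which A.jar was compiled against:
public class B {
    public int find(String key) { return 0; }
}
// B.jar, new version, after your signature change:
public class B {
    public int find(String key, boolean exact) { return 0; }
}
// A.jar, unchanged, still contains the call site:
//     new B().find("x");
// A links against the old signature, so running A with the new B.jar
// throws NoSuchMethodError even though each jar compiled cleanly on its own.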
Another solution, as was suggested by many of you, is to set up a local continuous integration system that would recompile the whole project locally. I don't mind doing this, but I want to avoid doing it inside my workspace. On the other hand, if I check-out all sources to another temporary workspace, then I need to mirror my local changes to the temporary workspace.
This is quite a big issue in my team, as builds break very often because somebody forgot to check out (or open in Eclipse) the right set of projects. I tried persuading people to check out the source and recompile the bunch before committing, but not only does it take time, it requires running quite a few commands, so most people find it too troublesome. If the technique is not easy or automated, it's unusable.
If you do not want to use your shared continuous integration server, you should set up a local one on your developer machine and perform the rebuild there on every change.
I know Jenkins; it is easy to set up (just start it) on a local machine, and I would advise running it locally if nothing provided in the IT infrastructure fits your needs.
Checking signatures is unfortunately not enough. Having the correct signatures does not mean it'll work; it's all about contracts, not just signatures. I mean, what happens if the new version of a library has the same method signature but now expects the elements of an ArrayList parameter in reversed order? You will run into issues sooner or later. You might consider implementing tools like Ivy or Maven:
http://ant.apache.org/ivy/
http://maven.apache.org/
Yes, it can be a pain to implement, but once you have it, it will "guard" your versions forever; you should never run into such an issue again. But even those build tools are not 100% accurate. The only proper way of dealing with incompatible libraries (I know you won't like my answer) is extensive regression testing. For this you need a bunch of testing tools. There are plenty of them out there: from very basic unit testing (JUnit) to database testing (JDBC Proxy) and UI testing frameworks like SWTBot (which one depends on whether your app is a web app or a thick client).
Please note that if your project gets really huge and you have a large number of dependencies, you are never using all of the code in there. Trying to check all interfaces and all signatures is way too much; it's not necessary to test it all when your code uses, say, 30% of the library code. What you need is to test what you really use, and that can only be done with extensive regression testing.
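To illustrate the "same signature, different contract" point (a made-up example):

// Library version 1:
public static int span(int from, int to) { return to - from; }
// Library version 2 quietly swaps the meaning of the parameters:
public static int span(int to, int from) { return to - from; }
// The signature (int, int) is unchanged, so every caller still compiles and
// links, but now gets the negated result. No signature checker can catch
// this; only regression tests can.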
I have finally found a whole treasure box of answers at this post. Thanks for help, everyone!
The bounty goes to K. Claszen for the quickest and most helpful input.
I also think that just setting up a local Jenkins is the best idea. What tool do you use for the build? Maybe you can improve your situation by switching to Maven as the build tool? It is smarter about this and doesn't recompile the full project unless you ask it to directly. But switching to it can be a HUGE pain in the neck; it very much depends on how your project is organized now...
And about VCS: there is a Mercurial/SVN bridge, so you can use local Mercurial for your development...
Check this link: https://www.mercurial-scm.org/wiki/WorkingWithSubversion
There is a solution, jarjar, which allows different versions of the same library to be included multiple times in the dependency graph.
I use IntelliJ, not Eclipse, so maybe my answer is too IDE-specific. But in IntelliJ I would simply include the modules from B in the A project, so that when I make changes to B, compilation in the IDE breaks A immediately. Modules can belong to multiple projects, so this is not duplication; it's just adding references in the IDE to modules in other projects.
