In my root project, I have many sub-projects:
common
dependant-1
dependant-2
...
standalone
In this scenario, common is a shared library. All other projects depend on it, with the exception of standalone (which is actually a standalone client JAR).
What I would like to do is get 5 classes from common into the jar file produced in standalone. I only need those exact 5 classes (out of ~200) and want to avoid bringing common as a full dependency (along with all of common's dependencies). Granted it's an unusual setup, but I don't want to include classes that the client has no business with and the classes I am including just contain enum or static final.
So far, I have tried the following in the build.gradle for standalone:
jar {
    manifest.attributes(
        // .... Removed
    )
    with project(':common').jar {
        include('com/classpath/ClassA.class')
        include('com/classpath/ClassB.class')
        include('com/classpath/ClassC.class')
        include('com/classpath/ClassD.class')
        include('com/classpath/ClassE.class')
    }
}
This works very well for the standalone project, but tramples the other projects that fully depend on common: common.jar will now only ever contain the classes listed above, regardless of which dependent project uses it.
(I'm guessing this is expected behaviour: in Gradle's configuration phase, it sees the specific configuration I have for common in standalone and applies that to common).
So, in short, what is the neatest solution to this? I'm thinking that I may need to provide a configuration in the build.gradle for the common project. I'm not sure how to do this yet (RTFM). I just wanted to check that there isn't a better approach to this?
EDIT: to answer some of alexvetter's questions.
I hadn't yet tried to include them from build/classes, but this did the trick (see the accepted answer).
I did consider creating a new common-base project. If there were much more than 5 classes that I needed, I absolutely would have gone this way. And yes, I only need the classes at runtime.
Just to clarify the last point... the standalone jar file is actually a JAX-WS client jar that we provide to another team. The reason I needed to include these specific classes in the client JAR is so that the client knows what values to give to certain web service method calls. Ideally, I would have replaced all of the included classes with enum (like I mentioned, they are literally just static final definitions) and JAX-WS would have looked after everything. (I actually did this to replace one of the class files that I was including.) If I had done this for all the classes I needed to include, it would have triggered many more code changes for them (why this was a problem is a whole different story ;)
The following should do the trick:
from("${project(':common').buildDir}/classes/main") {
    include('com/classpath/ClassA.class')
    include('com/classpath/ClassB.class')
    include('com/classpath/ClassC.class')
    include('com/classpath/ClassD.class')
    include('com/classpath/ClassE.class')
}
But as I already stated in my comments, these classes are only available at runtime, and you should probably create a new project (e.g. base) which is a dependency of both common and standalone.
Related
Is there a way to automatically find out which Java classes are actually loaded (either during compile time, as far as that's possible, or during the runtime of an application), and to throw out all other classes from a JAR to create a smaller JAR? Does that actually make sense in practice?
I am talking about the application classes for an application JAR. Usually there are lots of libraries in an application, and an application rarely needs all features of those libraries. So I suspect that would make a considerably smaller application. In theory that might be done, for example, via a Java agent that logs which classes and resources are read by one or several runs of an application (or even just by java -verbose:class), and a Maven plugin that throws out all other classes from a jar-with-dependencies. Is there already something like that?
Clarification: I am not talking about unused dependencies (JARs that are not used at all), but about removing unused parts of each included JAR.
Well, the Maven Shade Plugin has an option minimizeJar when creating an Uber-JAR for your application:
https://maven.apache.org/plugins/maven-shade-plugin/
But, as others already pointed out, this is quite dangerous, as it regularly fails to detect class accesses which are done via Reflection or other dynamic references.
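To make the reflection problem concrete, here is a minimal sketch (the class name ReflectiveLoad and the method create are made up for illustration). The target class name arrives as a plain string at runtime, so there is no bytecode reference for a static minimizer to follow, and the class could be stripped even though the application needs it.

```java
public class ReflectiveLoad {

    // The class name is just data (from a config file, a CLI argument, ...),
    // so a static analysis of this class sees no reference to the target type.
    static Object create(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical usage: the name could just as well come from a properties file.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        System.out.println(create(name).getClass().getName());
    }
}
```

If a library is used this way, its classes would have to be kept explicitly when minimizing the JAR.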
It may not be a good idea to automate this, as an application can use reflection to initialise objects, or one JAR may depend on another JAR.
The only way that I can think of is to remove the JARs one by one and check whether the application still runs as expected. Even with this approach, all modules of the application have to be tested, since one module may work without a particular dependency while another may not.
A better solution is to take care while developing. The application developer must be careful when adding a dependency, and should remove unwanted dependencies once their piece of code is done.
Global strategy.
1) Find all the classes that are loaded during runtime.
2) List of all the classes available in the classpath.
3) Reduce your class path by creating copies of jars containing only classes you need.
I have done parts 1 and 2, so I can help you with those.
1) Find out all the classes that are loaded. You need 100 % code coverage (I am not talking about tests, but production). So run all possible scenarios, so all the classes your app needs will be loaded and logged.
To log loaded classes, you can try several approaches: reflection, the -verbose:class flag, or a Java agent, which allows you to modify methods at runtime. This is an example of some Java agent code, or another Java agent example.
2) To find all the classes available in a jar, you can write a program. You need to know all the places where the application jars are located. Loop through these jars (you can use ZipFile), loop through the ZipEntry entries, and collect all classes.
3) After that write a script or program that reassembles your application. For example, now you can create a new jar file for each library and put there only needed classes.
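Steps 2 and 3 above can be sketched roughly as follows. This is only an illustration under a few assumptions: the class name JarTrimmer and both method names are made up, it uses JarFile (a ZipFile subclass) rather than plain ZipFile, it needs Java 9+ for InputStream.transferTo, and the set of classes to keep is expected to come from whatever logging was done in step 1. Inner classes appear as separate entries (Outer$Inner), so they have to be listed in the keep set as well.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarTrimmer {

    /** Step 2: collect the fully qualified names of all classes in one JAR. */
    static Set<String> listClasses(Path jar) throws IOException {
        Set<String> names = new TreeSet<>();
        try (JarFile jf = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jf.entries();
            while (entries.hasMoreElements()) {
                String n = entries.nextElement().getName();
                if (n.endsWith(".class")) {
                    names.add(toClassName(n));
                }
            }
        }
        return names;
    }

    /** Step 3: copy a JAR, keeping all non-class entries but only the classes in 'keep'. */
    static void copyOnly(Path source, Path target, Set<String> keep) throws IOException {
        try (JarFile in = new JarFile(source.toFile());
             JarOutputStream out = new JarOutputStream(Files.newOutputStream(target))) {
            Enumeration<JarEntry> entries = in.entries();
            while (entries.hasMoreElements()) {
                JarEntry e = entries.nextElement();
                String n = e.getName();
                if (n.endsWith(".class") && !keep.contains(toClassName(n))) {
                    continue; // class was never seen at runtime: drop it
                }
                out.putNextEntry(new JarEntry(n));
                if (!e.isDirectory()) {
                    try (InputStream is = in.getInputStream(e)) {
                        is.transferTo(out);
                    }
                }
                out.closeEntry();
            }
        }
    }

    // "com/example/Foo.class" -> "com.example.Foo"
    private static String toClassName(String entryName) {
        return entryName.substring(0, entryName.length() - ".class".length()).replace('/', '.');
    }
}
```

Resources and the manifest are copied unchanged here; whether that is what you want (for example with signed JARs) depends on the library.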
You may also use a tool (again, you are a programmer, so write a program) that checks the code for class dependencies. You do not want to remove classes that are needed for compilation. When I was a student, I wrote a code analyzer that builds a directed graph of class dependencies.
As @Gokul Nath KP notes, I have done this before. I manually changed the Gradle and Maven dependencies, removing them one by one, and then ran a full regression test. It took me a week (our application was small compared to modern enterprise systems created by hundreds of developers).
So, be creative, and in case of success, your project will be used by millions!
When I write code in nodejs, I can have multiple versions of the same library because 'require'-ing a library is not global.
The lookup path is different for each library: each one looks at its own node_modules directory.
However, in Java, I cannot seem to have multiple versions of the same library on the classpath, as the classpath is global.
Is there any way to make java behave more like node in that sense - by making each classloader have a different classpath?
why is my question different than that one
That question is based on "given a classpath, ... question ..", while I am not assuming anything about the classpath. Quite the contrary: I would like to have a different classpath for each library if possible.
Ok, I read a lot about it.. I have a long answer and a short one.
The long one - I will write a blog post about it and link it here..
Here is the short one
Assumptions
dep_1.jar - is a dependency I have, and is not dependent on any other jar. (a single jar packaging)
The API is available in my classpath. I only want to separate the implementation's classpath.
The class I want to instantiate has an empty constructor
Let's talk about a specific use case
// jar dep_1-api.jar
package guymograbi;
public interface IMyMessage {
    public String getMessage();
}
// jar dep_1.jar
package guymograbi;
public class MyMessage implements IMyMessage {
    public String getMessage() { return "my message"; }
}
Code
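// Load the implementation from its own jar; the cast works because IMyMessage is on the regular classpath.
ClassLoader classloader = new URLClassLoader(new URL[]{ new File("java_modules/dep_1.jar").toURI().toURL() });
return (IMyMessage) classloader.loadClass("guymograbi.MyMessage").getDeclaredConstructor().newInstance();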
This will return an IMyMessage implementation without changing my classpath.
I can then load another implementation of the same interface from another jar, etc.. etc..
This solution is:
small enough to embed in any library that wants to support it.
small enough to quickly wrap any existing library with it.
almost zero learning curve.. just a need to apply to an easy standard.
Further reading
So it seems there are many ways in which you can write this solution.
You can use Spring, Guice and ServiceLoader to do the actual implementation.
ServiceLoader - I kinda liked this solution as it has something similar to the main entry in package.json. AND - obviously - it does not require a dependency that will mess with my classpath!
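As a rough sketch of the ServiceLoader variant (the class JarServiceLoader and the method loadFirst are hypothetical names, not part of any library): the implementation jar is assumed to contain a provider-configuration file under META-INF/services/ named after the fully qualified interface, listing the implementation class. The API interface stays on the normal classpath and only the implementation jar gets its own class loader.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.Optional;
import java.util.ServiceLoader;

public class JarServiceLoader {

    /**
     * Looks up the first provider of 'api' visible through a class loader
     * dedicated to 'implJar'. The jar is expected to declare its provider in
     * META-INF/services/<fully qualified name of api>.
     */
    static <T> Optional<T> loadFirst(Class<T> api, Path implJar) throws MalformedURLException {
        URL[] urls = { implJar.toUri().toURL() };
        // Parent is the loader that defined the API, so both sides share one interface type.
        // The loader is deliberately left open: the returned instance may load more classes later.
        ClassLoader loader = new URLClassLoader(urls, api.getClassLoader());
        Iterator<T> providers = ServiceLoader.load(api, loader).iterator();
        return providers.hasNext() ? Optional.of(providers.next()) : Optional.empty();
    }
}
```

With the example from this answer, a call might look like loadFirst(IMyMessage.class, Paths.get("java_modules/dep_1.jar")), assuming dep_1.jar ships a META-INF/services/guymograbi.IMyMessage file containing the line guymograbi.MyMessage.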
I also recommend checking io.github.lukehutch:fast-classpath-scanner that allows you to easily find all classes that implement a specific interface.
Next Step
The next step for me is to construct a proper java_modules folder and allow dep_1 to be a folder containing index.jar and its own java_modules folder and cascade the solution downwards..
I plan to do so using the answer from: How to get access to Maven's dependency hierarchy within a plugin
And then write a maven plugin (like assembly) to pack it all up properly.
I am building a tool from several different open source libraries. My buildpath is in the following order:
My first JAR file, stanford-corenlp-3.3.0.jar, contains a package called edu.stanford.nlp.process, which has the Morphology.class class.
My second JAR file, ark-tweet-nlp-0.3.2.jar, contains an identical package name (edu.stanford.nlp.process), and an identical class name Morphology.class.
In both JARs, inside their respective Morphology classes, there exists a method called stem(). However, the signatures of these methods are different. I want to use the stem(String, String) method from my second JAR file, but since the import statement (import edu.stanford.nlp.process.Morphology;) does not specify which JAR to use, I get an error, since it thinks the first JAR on the buildpath is the one I want to use.
I don't want to change the order of my buildpath since it would throw off my other method calls.
How can I specify which JAR's Morphology class to use? Is there an import statement that specifies the JAR, along with the package.class?
EDIT: What about a way to combine my two JARs so that the two Morphology classes merge, giving me two methods with different signatures?
As several others pointed out above, it is possible to tweak Java's classloader mechanism to load classes from certain places… but this is not what you are looking for, believe me.
You hit a known problem. Instead of worrying how to tell Java to use a class from one JAR and not from the other, you should consider using a different version of ArkTweet.
Fetch the ArkTweet JAR from Maven Central. It does not contain Stanford classes.
When you notice that people package third-party classes in their JARs, I'd recommend pointing out to them that this is generally not a good idea and to encourage them to refrain from doing so. If a project provides a runnable fat-jar including all dependencies, that is fine. But, it should not be the only JAR they provide. A plain JAR or set of JARs without any third-party code should also be offered. In the rare cases that third-party code was modified and must be included, it should be done under the package namespace of the provider, not of the original third-party.
Finally, for real solutions to building modular Java applications and handling classloader isolation, check out one of the several OSGi implementations or project Jigsaw.
The default ClassLoader will only load one of the jars, ignoring the second one, so this can't be done out of the box. Maybe a custom ClassLoader can help.
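For illustration only, a custom-loader approach could look like the sketch below (IsolatedLoader and loadFrom are made-up names; ClassLoader.getPlatformClassLoader() requires Java 9+). The loader's parent is the platform loader rather than the application loader, so the application classpath is never consulted and the class is taken from the given JAR even if another JAR on the classpath contains a class with the same name.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;

public class IsolatedLoader {

    /**
     * Loads 'className' through a loader that sees only the given JAR plus the JDK.
     * The application classpath is skipped because the parent is the platform loader.
     */
    static Class<?> loadFrom(Path jar, String className) throws Exception {
        URL[] urls = { jar.toUri().toURL() };
        ClassLoader loader = new URLClassLoader(urls, ClassLoader.getPlatformClassLoader());
        return Class.forName(className, true, loader);
    }
}
```

A class obtained this way is a different type from the one your code imports, so it can only be used through reflection (for example loadFrom(...).getMethod("stem", String.class, String.class)) or through types both sides share, such as JDK classes. Whether stem is static or needs an instance is not something this sketch assumes.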
For more info about ClassLoaders start from here.
Good luck!
EDIT: We are looking at some horrible packaging choices that cause this JAR hell as a side effect. The author of this "Ark Twitter" library decided it was a good idea to release a JAR artifact that includes a third-party library (the Stanford NLP library). This leads to unnecessarily tight coupling between Ark Twitter and the specific version of the Stanford NLP library used by it. This is a very bad practice that should be discouraged in any case: it violates the whole idea of transitive dependencies.
EDIT (continued): One possible (and hopefully working) solution is to rebuild the Ark Twitter JAR so that it does not include the aforementioned library but only its own code (basically the cmu.arktweetnlp package only) and hoping that the version of NLP required by your project works with Ark Twitter. Ideally you should submit a pull request to the author of the library but in the meantime you can get away with un-jarring and re-jarring the existing JAR file.
EDIT 2: Looking at the JAR file again, it's much worse than I originally thought: ALL the dependencies are repackaged in the released JAR file. This is really the worst possible solution for releasing a library. Good luck.
I think your problem can be solved simply by using the lemma(String word, String tag) method in the current CoreNLP's Morphology class:
String word = ...;
String tag = ...;
String lemma = morphology.lemma(word, tag);
WordTag wt = new WordTag(lemma, tag);
When the class was revised a couple of years ago, the method you're looking for was deleted. The feeling was that with most of the Stanford NLP code moving to using CoreLabels, methods that return WordTag are less useful (though deleting all such methods is still a work in progress).
No, there isn't. This is a weakness of Java that cannot be solved simply. You should use only one of the libraries. Having both on the classpath will make Java always select the first one.
This problem is known as JAR hell.
The order in the buildpath generally determines the order in which the classloader will search for the class. In general, though, you don't want duplicates of the same class in your build path--and it sure doesn't seem like ark-tweet-nlp-0.3.2.jar should have a edu.stanford package within it.
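To check which copies are on the classpath and which one actually wins, something along these lines can help. It is a small diagnostic sketch with made-up names (DuplicateClassFinder, locations, loadedFrom) that uses only standard JDK calls.

```java
import java.io.IOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.Collections;
import java.util.List;

public class DuplicateClassFinder {

    /** Every classpath location providing the resource, in the order the loader reports them. */
    static List<URL> locations(String resourcePath) throws IOException {
        return Collections.list(ClassLoader.getSystemClassLoader().getResources(resourcePath));
    }

    /** The JAR or directory a loaded class came from; null for core JDK classes such as String. */
    static URL loadedFrom(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }
}
```

For the case discussed here, locations("edu/stanford/nlp/process/Morphology.class") should list one URL per JAR that contains the class, and loadedFrom(Morphology.class) shows which of them the class loader picked.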
When you load a class, it's loaded at a given address, and that address is then placed in the header of objects created from the class, so that (among other things) the methods in the class can be located.
So if you somehow load ClassA, with method abc(String), from zip file XYZ.zip, that loads into address 12345. Then (using a class loader trick) you load another ClassA, with method abc(String, String), from zip file ZYX.zip, and that loads into address 67890.
Now create an instance of the first ClassA. In its header will be the class address 12345. If you could somehow attempt to invoke the method abc(String,String) on that object, that method would not be found in the class at 12345. (In actuality, you will not even be able to attempt the call, since the verifier will stop you because, to it, the two classes are entirely different and you're trying to use one where the other is called for, just as if their names were entirely different.)
I have a feeling I will get downvoted. Please pardon my ignorance on this subject, as I need to get this working soon.
Basically, I have two dependencies: same groupId, same version, but different artifactId.
There are duplicated Classes in these two artifacts, but one of them has some beta features.
How do I ensure that part of my code uses classes from Jar1 and the rest of my code uses classes from Jar2?
My bad. There is no overlap. Sorry for all the confusion :-(
What about creating two projects? One depends on artifactID1, the other depends on artifactID2. One project would then depend on the other project. This is predicated on all the code using one version of the jar not being dependent on code that uses the other version.
I have several applications that differ mostly in their resources. As of now, I'm copying the code around to each application. This can be problematic: for example, fixing a bug in one and forgetting to update the others.
I don't think creating a JAR is appropriate for this situation, as these are application specific UI classes, (actually android activity classes in specific) including the actual app start-up code.
It may be possible to include these source files into several packages, but then I have the problem that each file specifies a specific package name on the first line.
Most of the code is related to the UI and Activity processing. (The actual common code is already in a library). A similar question is posted here.
Are there any elegant solutions to this situation?
A jar is absolutely appropriate for this situation. You should split your application into layers, separating the application-specific classes from the shared code.
I solved this by going with Android Library projects. (Not sure of the details, perhaps they are ultimately jars) Check out details here, specifically the section 'Setting up a Library Project'.
I basically put in all my activity classes (except for the start-up one) into the library.
For true non-UI-bound code, JARs do seem to be the way to go.
I agree with artbristol.
I also recommend using Maven and:
releasing the common jars to a corporate Maven repository
declaring a dependency with specific versions on these jar artifacts
This way, you don't break applications when you make incompatible changes.