To start off, I am still a student and trying to learn things, but I am quite stuck on how to get all the dependencies of each package from Maven. Initially, I got all the packages from https://libraries.io/, but without their dependencies, as I would like to construct a temporal graph in which I could display those dependencies using time. Libraries.io does not take time into account and therefore I only downloaded the packages.
Now, I'm finding it quite hard to download the dependencies of each package. Initially, I thought parsing the .pom files from the central repo https://repo1.maven.org/maven2/ would be enough to get the dependencies of each package, but later on I found out that there are also external dependencies that you can only get by resolving the .jar file (I think?). I do not really understand that, and I am trying to learn how I could get all the data in this case, as parsing alone does not seem to be entirely accurate.
Also, I should mention that I am trying to do this in golang, as it was a new language I wanted to try out and it seemed fitting for the task.
EDIT 1: For example, I would like to have this for each package from Maven:
{
  "name": "react-dom",
  "versions": {
    "1.00": {
      "timestamp": "06-05-2022T10:00:01",
      "dependencies": {
        "name-1": "^1.0.2",
        "name-2": "^2.1.2"
      }
    }
  }
}
EDIT 2: Yes, this is for research purposes. This is the description of the task: https://imgur.com/a/D0LcbzF. Initially, I thought this research was not this complex, but after the first meeting I was told that we basically have to do what https://libraries.io/ does, but make it accurate by adding a time component. From what I understood from the professor, what libraries.io does not take into account is a scenario like this:
Library A releases a version at time 1, named A v1.1.
Library B, which depends on library A's latest version at the current time, releases a version at time 2. Therefore, library B depends on library A v1.1.
Library A releases a version at time 3, named A v1.2.
Library C, which depends on library B, releases a version at time 4. Therefore, it also depends on library A.
Library C should depend on the latest version of library A, which is A v1.2, even though library B depends on A v1.1.
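The scenario above boils down to one query: "which version of A was the latest at time t?". Here is a minimal sketch of that lookup in Java. The class and method names (`ReleaseHistory`, `addRelease`, `latestAt`) are made up for illustration, and it assumes that "latest" means "most recently released", not "highest version number".

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

/** Release history of one library, queried by point in time (hypothetical helper). */
final class ReleaseHistory {
    // Release time -> version label, kept sorted by time.
    private final TreeMap<Instant, String> releases = new TreeMap<>();

    void addRelease(String version, Instant releasedAt) {
        releases.put(releasedAt, version);
    }

    /** Most recently released version at or before t, or empty if nothing was released yet. */
    Optional<String> latestAt(Instant t) {
        // floorEntry returns the entry with the greatest key <= t, or null if there is none.
        Map.Entry<Instant, String> entry = releases.floorEntry(t);
        return entry == null ? Optional.empty() : Optional.of(entry.getValue());
    }
}
```

With A v1.1 registered at time 1 and A v1.2 at time 3, a query at time 2 (when B is released) gives v1.1, and a query at time 4 (when C is released) gives v1.2, which matches the example.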
So, to summarize, I have been tasked with getting the packages from https://libraries.io/api, and get the dependencies by resolving the pom files somehow (I cannot tell you more because I am very confused, I am not knowledgeable enough on this matter, I have just started)
Professor sent me this: "WRT maven dependency listing: I was thinking something along the lines of mvn -DgroupId=junit -DartifactId=junit -Dversion=4.13.1 dependency:get but this only works for retrieving the pom/jar for downloading particular dependency into .m2". Maybe that helps you understand something.
Afterward, once the data is available, I should build a temporal graph that fits the example above and, finally, see what measures we can find to determine which software is used the most.
To me, and from what you have already said, this seems very out of reach and I am literally lost on what to do, as the professor does not really guide me in any way.
Some papers I found: https://www.researchgate.net/publication/335499638_The_Maven_Dependency_Graph_A_Temporal_Graph-Based_Representation_of_Maven_Central
The part of your comment about using the data to see which packages are used the most relies on data you don't have.
For that you would need the download statistics of the central repository, which you don't have access to.
Even if you had the download statistics of the central repository, they would not represent how much an artifact is being used, because many companies use repository managers, which means an artifact is downloaded exactly once but used a lot internally.
Furthermore, an artifact being downloaded does not really mean it is used. Some artifacts are downloaded as transitive dependencies, or are just added to a pom file without really being used.
Technically you can download all artifacts, or at least the pom files, and analyse the dependencies, but you still lack the download statistics of the central repository.
In practice, downloading all the artifacts is not feasible either, because you would be blocked from the central repository.
The size of central is, as an educated guess of mine, ca. 5 TiB+.
Another thing is that the central repository is not the only one: there are a lot of other Maven repositories available and in use.
I have a Spring module that has:
1) a dependency on org.hibernate-validator 6...
2) a transitive dependency on org.hibernate-validator 5...
3) an uber-transitive dependency bundled (hard-coded as files) in a fat jar (gwt-user) of the vaadin dependency, with org.hibernate-validator <6
They seem to be incompatible and not interchangeable.
The problem is that they do not conflict and raise an exception.
But at compile time they get mixed up, seemingly at random(?) (the build keeps following the bad artifact until a successful rebuild).
And the wrong version(?) is used for retrieving the validation error message text.
This results in bad output, because versions <=5 do not have javax.validation.constraints.NotBlank.message:
Object: ***, error: {javax.validation.constraints.NotBlank.message}
instead of the correct:
Object: ***, error: must not be blank
I can't really remove anything.
I need to somehow distinguish them and use the appropriate version in each place.
First of all, I want to confine the fat jar's validator so that it never leaks out of vaadin =)
Thanks a lot in advance for any directions to dig.
As always, you should look for a normal, healthy dependency setup,
where you can exclude some transitive dependencies (<exclusion> in Maven) and include them explicitly with the right version.
Also check whether this dependency is really needed.
In rare, extreme cases, look at classloaders, so that you can load the same classes in different versions for different consumers.
In my concrete case, I found that the fat jar was unused and could simply be deleted =)
Thanks for recommendations.
What is the best way to find the right dependency for a class I use, when that class is part of the online Maven repository?
As far as I can see, the approach is this:
Look up the import (e.g. org.whatever.X;) from your code in the online Maven repository (search.maven.org).
Pick one entry from the result list and include it in the dependencies section of the POM.
Hope that the chosen artifact and version of the dependency match your requirements (compile time, runtime). If not, try another artifact or version.
I'd like to share my way of doing it. What do you mean by "finding the ... for a used class that are part of the ..."? Do you mean that the dependency is already used somewhere else, or that you only know the name of the package you may need?
I would first check which version I need for the current project.
If I'm working on a team project and someone has used the dependency somewhere else, I would check their pom (to ensure we are using the same dependency).
Then I would look up the dependency in the Maven repo and include it in my pom.
Hope this helps.
Essentially, yes this is what you have to do to obtain libraries/modules for your project.
Something that's helped me out with this specific problem, though: versioning. You can set the version you need for each of your dependencies under <properties>, e.g. <gson.version>2.8.1</gson.version>, and reference it from the dependency declaration as ${gson.version}. That way, you can keep your build in line with the requirements of the class or code you're trying to use.
Maven doc ref: https://maven.apache.org/pom.html#Properties
Java 8 here.
Say there is an old version of the widget library, with Maven coordinates widgetmakers:widget:1.0.4, that has a class defined in it like so:
public class Widget {
    private String meow;
    // constructor, getters, setters, etc.
}
Years pass. The maintainers of this widget library decide that a Widget should never meow, rather, that it should in fact bark. And so a new release is made, with Maven coordinates widgetmakers:widget:2.0.0 and with Widget looking like:
public class Widget {
    private Bark bark;
    // constructor, getters, setters, etc.
}
So now I go to build my app, myapp. And, wanting to use the latest stable versions of all my dependencies, I declare my dependencies like so (inside of build.gradle):
dependencies {
    compile(
        'org.slf4j:slf4j-api:1.7.20',
        'org.slf4j:slf4j-simple:1.7.20',
        'bupo:fizzbuzz:3.7.14',
        'commons-cli:commons-cli:1.2',
        'widgetmakers:widget:2.0.0'
    )
}
Now let's say that this (fictional) fizzbuzz library has always depended on a 1.x version of the widget library, where Widget would meow.
So now, I'm specifying 2 versions of widget on my compile classpath:
widgetmakers:widget:1.0.4 which is pulled in by the fizzbuzz library, as a dependency of it; and
widgetmakers:widget:2.0.0 which I am referencing directly
So obviously, depending on which version of Widget gets classloaded first, we will either have a Widget#meow or a Widget#bark.
Does Gradle provide any facilities for helping me out here? Is there any way to pull in multiple versions of the same class, and configure fizzbuzz classes to use the old version of Widget, and my classes to use the new version? If not, the only solutions I can think of are:
I might be able to accomplish some kind of shading- and/or fatjar-based solution, where perhaps I pull in all my dependencies as packages under myapp/bin and then give them different version prefixes. Admittedly I don't see a clear solution here, but am sure something is feasible (yet totally hacky/nasty). Or...
Carefully inspect my entire dependency graph and just make sure that all of my transitive dependencies don't conflict with each other. In this case for me, this means either submitting a pull-request to the fizzbuzz maintainers to upgrade it to the latest widget version, or, sadly, downgrading myapp to use the older widget version.
But Gradle (so far) has been magic for me. So I ask: is there any Gradle magic that can avail me here?
I don't know the specifics of Gradle, as I'm a Maven person, but this is more generic anyway. You basically have two options (and both are hacky):
ClassLoader magic. Somehow, you need to convince your build system to load two versions of the library (good luck with that), then at runtime load the classes that use the old version with a ClassLoader that has the old version. I have done this, but it's a pain. (Tools like OSGi may take away some of this pain.)
Package shading. Repackage the library A that uses the old version of library B, so that B is actually inside A, but with a B-specific package prefix. This is common practice, e.g. Spring ships its own version of asm. On the Maven side, the maven-shade-plugin does this; there is probably a Gradle equivalent. Or you can use ProGuard, the 800-pound gorilla of jar manipulation.
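To make the ClassLoader option a bit more concrete, here is a minimal sketch of its core: one isolated loader per jar. `IsolatedLoaders`, `forJar`, and `loadOrNull` are made-up names, and the jar paths in the usage note are placeholders. The hard part in practice, wiring the fizzbuzz classes to the right loader, is not shown.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

final class IsolatedLoaders {
    /**
     * Loader that sees only the given jar plus the JDK bootstrap classes.
     * The null parent keeps the application classpath out of the lookup.
     */
    static URLClassLoader forJar(String jarPath) {
        try {
            URL url = Paths.get(jarPath).toUri().toURL();
            return new URLClassLoader(new URL[] {url}, null);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable jar path: " + jarPath, e);
        }
    }

    /** Loads a class by name, or returns null if the loader cannot find it. */
    static Class<?> loadOrNull(ClassLoader loader, String className) {
        try {
            return loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

A class named Widget loaded through forJar("libs/widget-1.0.4.jar") and one loaded through forJar("libs/widget-2.0.0.jar") would be two distinct Class objects. Code that touches both therefore has to go through reflection or through an interface loaded by a common parent, which is a large part of why this approach is painful.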
Gradle will only set up the classpath with your dependencies; it doesn't provide its own runtime to encapsulate a dependency and its transitive dependencies. The version active at runtime will be the one chosen by the classloading rules, which I believe means the first jar in classpath order that contains the class. OSGi provides a runtime that can deal with situations like this, and so will the upcoming module system.
EDIT: Bjorn is right that Gradle will try to resolve conflicts between different versions; it assembles the classpath based on its resolution strategies, so the order in which you list your dependencies in the file doesn't matter. However, you still only get one class per class name, so it won't resolve the OP's issue.
If you have different versions of a library with otherwise equal coordinates, Gradle's conflict resolution mechanism comes into play.
The default resolution strategy is to use the newest requested version of the library. You will not get multiple versions of the same library in your dependency graph.
If you really need different versions of the same library at runtime, you would have to either do some ClassLoader magic, which is definitely possible, or do some shading for one or both of the libraries.
Regarding conflict resolution, Gradle has two built-in strategies: the newest strategy, which is the default, and a fail strategy, which fails the build if different versions are in the dependency graph, so that you have to explicitly resolve version conflicts in your build files.
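As a rough illustration of what "newest wins" means for the widget example, here is a small Java sketch. It is not Gradle's actual algorithm: `NewestWins` is a made-up name, and the comparison assumes purely numeric dotted versions such as 1.0.4, so a qualifier like -SNAPSHOT would make it throw.

```java
import java.util.List;

final class NewestWins {
    /** Compares dotted numeric versions part by part; missing parts count as 0. */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Picks the single version that ends up on the classpath: the highest one requested. */
    static String resolve(List<String> requestedVersions) {
        return requestedVersions.stream()
                .max(NewestWins::compareVersions)
                .orElseThrow(() -> new IllegalArgumentException("no versions requested"));
    }
}
```

For the question's case, the requested versions are 1.0.4 (via fizzbuzz) and 2.0.0 (declared directly), so resolve picks 2.0.0 and fizzbuzz runs against the Widget that barks.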
A worse case is when the same class appears in multiple jars. This is more insidious: look at the metrics jars from Codahale and Dropwizard, which contain incompatible versions of the same class in the two jars.
The gradle classpath-hell plugin can detect this horror.
A project runs on Google App Engine. The project has a dependency that uses a class that can't be invoked on App Engine due to security constraints (it's not on the whitelist). My (very hacky) solution was to just copy a modified version of that class, one that doesn't need the restricted class, into my project (matching the original class's name and package). This works on both dev and live, I assume because my source appears in the classpath before my external dependencies.
To make it a bit cleaner, I decided to put my modified version of that class into its own project that can be packaged up in a jar and published for anyone else to use should they face this problem.
Here's my build.gradle:
dependencies {
    // my jar that has the 'fixed' version of the class
    compile files('path/to/my-hack-0.0.1.jar')
    // dependency that includes the class that won't run on App Engine
    compile 'org.elasticsearch:elasticsearch:1.4.4'
}
On my local dev server, this works fine, the code finds my hacked version of the class first at runtime. On live, for some unknown reason, the version in the elasticsearch dependency is loaded first.
I know having two versions of the same class in the classpath isn't ideal but I was hoping I could reliably force my version to be at the start of the classpath. Any ideas? Alternatively, is there a better way to solve this problem?
I'm not really sure if this is what people visiting this question were looking for, but this was my problem and the solution I arrived at.
Jar A: contains class XYZ
Jar B: also contains class XYZ
My Project needs Jar B on the classpath before Jar A to be able to get compiled.
The problem is that Gradle sorts the dependencies alphabetically after resolving them, which meant Jar B came after Jar A in the generated classpath, leading to a compilation error.
Solution:
Declare a custom configuration and patch the compileClasspath. This is what the relevant portion of build.gradle might look like.
configurations {
    priority
}
sourceSets.main.compileClasspath = configurations.priority + sourceSets.main.compileClasspath
dependencies {
    priority 'org.blah:JarB:2.3'
    compile 'org.blah:JarA:2.4'
    ...
}
It's the app engine classloader I should have been investigating, not gradle...
App Engine allows you to customise the class loader JAR ordering with a little bit of xml in your appengine-web.xml. In my case:
<class-loader-config>
<priority-specifier filename="my-hack-0.0.1.jar"/>
</class-loader-config>
This places my-hack-0.0.1.jar as the first JAR file to be searched for classes, barring those in the directory war/WEB-INF/classes/.
...Thanks to a nudge in the right direction from @Danilo Tommasina :)
UPDATE 2020:
I just hit the same problem again and came across my own question... This time, live appengine was loading a different version of org.json than was being loaded in dev. Very frustrating and no amount of fiddling the build script would fix it. For future searchers, if you're getting this:
java.lang.NoSuchMethodError: org.json.JSONObject.keySet()Ljava/util/Set;
It's because it's loading an old org.json dependency from god-knows-where. I fixed it by adding this to my appengine-web.xml:
<class-loader-config>
<priority-specifier filename="json-20180130.jar"/>
</class-loader-config>
You'll also need a matching dependency in build.gradle if you don't already have one:
compile 'org.json:json:20180130'
According to gradle dependencies documentation, the order of dependencies defines the order in the classpath. So, we can simply put the libraries in the correct order in "dependencies".
But beware! There are two rules with higher priority:
For a dynamic version, a 'higher' static version is preferred over a 'lower' version.
Modules declared by a module descriptor file (Ivy or POM file) are preferred over modules that have an artifact file only.
I would like to be able to determine what versions I am running of a dependency at runtime as well as the version of the web application itself.
Each web application I deploy is packaged with a pom.xml which I can read from, that part is trivial. The next part is parsing the pom without much effort.
As the web application is running, I want to be able to understand what version I am, and what versions my dependencies are.
Ideally, I would like to do something like:
MavenPom pom = new MavenPom(webApplicationPomInputStream);
pom.getVersion();
pom.getArtifactId();
pom.getGroupId();
for (Dependency dependency : pom.getDependencies()) {
    dependency.getVersion();
    dependency.getArtifactId();
    dependency.getGroupId();
}
Should I just use XPath notation here, or is there a library I can call to do this type of thing?
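For reference, a plain-JDK version of the wished-for API could look roughly like the sketch below. `SimplePom` and `Coordinates` are made-up names, not an existing library. It only reads what is literally written under the top-level dependencies element, so it does not resolve parent POMs, dependencyManagement, or property placeholders such as ${gson.version}. It also assumes the POM uses unprefixed element names.

```java
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class SimplePom {
    /** groupId:artifactId:version; any part is null when the POM omits it. */
    static final class Coordinates {
        final String groupId;
        final String artifactId;
        final String version;

        Coordinates(Element e) {
            groupId = childText(e, "groupId");
            artifactId = childText(e, "artifactId");
            version = childText(e, "version");
        }

        @Override
        public String toString() {
            return groupId + ":" + artifactId + ":" + version;
        }
    }

    final Coordinates self;
    final List<Coordinates> dependencies = new ArrayList<>();

    private SimplePom(Element project) {
        self = new Coordinates(project);
        // Only the direct <dependencies> child of <project>, not plugin or managed dependencies.
        Element deps = child(project, "dependencies");
        if (deps != null) {
            for (Node n = deps.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element && "dependency".equals(n.getNodeName())) {
                    dependencies.add(new Coordinates((Element) n));
                }
            }
        }
    }

    static SimplePom parse(InputStream pomXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations, in case the POM comes from an untrusted source.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            return new SimplePom(factory.newDocumentBuilder().parse(pomXml).getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse POM", e);
        }
    }

    /** First direct child element with the given name, or null. */
    private static Element child(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    private static String childText(Element parent, String name) {
        Element c = child(parent, name);
        return c == null ? null : c.getTextContent().trim();
    }
}
```

Usage would be SimplePom.parse(webApplicationPomInputStream), then reading self and dependencies. Because nothing is resolved, a dependency whose version is inherited shows up with a null version, which is one reason a real resolver (or mvn dependency:tree) gives more accurate results than parsing alone.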
After these posts, I am thinking the quickest/most reliable way is to generate a text file with the dependency tree in it: mvn dependency:tree. Then I will parse the text file, separate the groupId, artifactId, and version, and then determine the structure by the indentation level.
If I do that, can I export to XML instead of text? I can then use JAXB and easily parse that file without doing any/much work.
It is a hack, but looks promising.
Walter
I will just use the mvn dependency:tree plugin to generate a text file with the dependency tree. Then I will parse that in and create the dependency tree/graph from that. I will get the scope of the artifact, groupId, artifactId, version, and its parent.
I successfully implemented this type of lookup. It simply takes the dependency:tree output, parses it, and organizes the dependencies using the indentation level, nothing fancy. The group, artifact, version, and scope are easily parsed since the separator is a colon (:).
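The original implementation is not shown, so the following is only a sketch of how such an indentation-based parser might look. `DependencyTreeParser` and `Dep` are made-up names. It assumes the usual text layout of mvn dependency:tree: three characters of tree-drawing prefix per depth level, coordinates of the form groupId:artifactId:type[:classifier]:version:scope, and a root line of the form groupId:artifactId:packaging:version.

```java
import java.util.ArrayList;
import java.util.List;

final class DependencyTreeParser {
    static final class Dep {
        final String groupId;
        final String artifactId;
        final String version;
        final String scope;  // null for the root project line
        final Dep parent;    // null for the root
        final List<Dep> children = new ArrayList<>();

        Dep(String[] f, Dep parent) {
            this.groupId = f[0];
            this.artifactId = f[1];
            // Root has 4 fields ending in the version; dependencies end with version:scope.
            this.version = f.length == 4 ? f[3] : f[f.length - 2];
            this.scope = f.length == 4 ? null : f[f.length - 1];
            this.parent = parent;
            if (parent != null) {
                parent.children.add(this);
            }
        }
    }

    /** Returns the root node(s); lines that do not look like coordinates are skipped. */
    static List<Dep> parse(List<String> lines) {
        List<Dep> roots = new ArrayList<>();
        List<Dep> path = new ArrayList<>(); // path.get(d) = last node seen at depth d
        for (String raw : lines) {
            String line = raw.startsWith("[INFO] ") ? raw.substring(7) : raw;
            // Skip the tree-drawing prefix made of '|', '+', '\', '-' and spaces.
            int start = 0;
            while (start < line.length() && "|+\\- ".indexOf(line.charAt(start)) >= 0) {
                start++;
            }
            // Ignore trailing annotations after the coordinates, such as "(optional)".
            String[] fields = line.substring(start).split(" ")[0].split(":");
            if (fields.length < 4) {
                continue;
            }
            int depth = Math.min(start / 3, path.size());
            while (path.size() > depth) {
                path.remove(path.size() - 1);
            }
            Dep node = new Dep(fields, depth == 0 ? null : path.get(depth - 1));
            if (depth == 0) {
                roots.add(node);
            }
            path.add(node);
        }
        return roots;
    }
}
```

Keeping only the current path from the root, rather than a full lookup table, is enough here because a line's parent is always the most recent line one level shallower.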
Walter
Maven has of course such an API. Have a look at org.apache.maven.project.MavenProject. But, to be honest, I don't think it will be that easy to create a MavenProject instance. The source code will be helpful here, check for example MavenProjectTest or maybe the Maven Plugin API (actually, this task would be much, really much, simpler to achieve from a Mojo) for some guidance.
I'd suggest searching for this question on the Maven mailing lists, or asking it there; org.apache.maven.dev would be appropriate here IMHO.