NiFi custom processors with shared references use the oldest version - java

Been working with NiFi for a few months and I've noticed a dependency pattern that doesn't make sense; hopefully I can describe it clearly in case someone can shed light.
We've been prototyping several related but distinct custom processors, and these processors all use some common jar libraries.
For example,
Processors A and B use Library1
ProcessorA is developed along with some of the code in Library1; we build Library1, then build ProcessorA for testing on the NiFi server, and all is good.
ProcessorB is also developed along with some of the code in Library1, and Library1 is rebuilt before building ProcessorB as a deployable NAR.
Then, when testing ProcessorB on the NiFi server (including attaching to the process to step through), we find we need to update the relevant code in Library1.
But despite this update to the library, ProcessorB still executes the old Library1 code. A specific example was a for loop; it would still initialize int i = 1 after I changed the code to int i = 0.
Only after I did a clean/rebuild of ALL the processors (both A and B) did ProcessorB start recognizing the updated library code.
This is surprising, since every processor NAR package should have its own copy of the library jar, but it acts as if the earliest version takes precedence.
My question, finally: is this expected behavior when using shared code libraries among NAR packages, or is there a better practice for architecting these? Or am I missing some wisdom from the Java/Maven realm?
NiFi is v1.8, but I also experienced this in 1.5.
Many TIAs

Related

Best practice for deploying different versions of project for different other projects

We have many Maven projects, and one of these projects is included as a dependency in every other project. The problem is that when we deploy a new version of this dependency, every other project gets this new version, which could lead to problems.
Of course, I could manually change the version every time I deploy the project, but that could lead to problems as well, e.g. when forgetting to change the version before deploying it.
I also saw the solution of using a ${version} placeholder in the "version" tag, but that would mean I have to specify the version every time I run a Maven command.
Is there a solution for such problems, where you have a dependency used in many other projects and need a different version in every one of these projects?
The first thing I see:
"The problem is that when we deploy a new version of this dependency, every other project gets this new version, which could lead to problems."
This points to a big issue: you seem to be violating the foundational rule of immutable releases.
As a consequence, the version ends up being useless, because it is always the same and does not convey any information.
First of all, you should follow semantic versioning.
You should also use the maven-release-plugin to increment the version numbers automatically (that by itself will not decide whether a change is a minor or major release, but there are tools to identify such things). This should be handled by a CI/CD setup (Jenkins, etc.).
Tools to check changes for compatibility include RevAPI, JAPI-Checker, and the like. There is also some useful information here.
Furthermore, you can do that via different setups. The mentioned ${version} placeholder is simply wrong and will not work in several ways.
Upgrading a larger number of projects can be done with something like Renovate or Dependabot, or even with existing Maven plugins that can be run automatically (scheduled) via CI/CD, which you should do for security scans and the like anyway. In other words, automation is the keyword here.

IntelliJ, Gradle "implementation" doesn't work, transitive dependencies leak

Note: title updated; the problem seems to be in IntelliJ.
I've created a simple repo to test this problem:
https://github.com/fvigotti/gradle-implementation-error
Gradle 5.4.1, but I also tested with previous versions before upgrading.
Expected Behavior
implementation should not leak "implementation libraries"
Current Behavior
Everything is leaked.
Context
I've created an empty ad-hoc sample project to demonstrate this:
https://github.com/fvigotti/gradle-implementation-error
Steps to Reproduce
gradle clean build publishToMavenLocal
then go to another project and import:
implementation "net.me:library-sample:1.0-SNAPSHOT"
This should not be exposed, but it is: import org.apache.commons.codec.Decoder
I've been testing more sources found on GitHub, and it seems that almost everyone uses from components.java without worrying about implementation dependency leakage... I'm sure I'm missing something here.
thank you,
Francesco
UPDATE
Here is a video of the issue:
https://vimeo.com/334392418
You say in your bug report to IntelliJ, „I'm surprised no one surfaced this". Probably the reason it hasn't surfaced is that what you are encountering is not a problem in the real world.
I think the real problem might be a matter of understanding the notoriously arcane nature of class loading in Java.
Class loading must be able to work at both the system level and the application level. Class.forName(String) is really working at the system level. But you're sort of driving the system level from the application level with that call on Class.
Gradle's api/implementation constraints apply at the application level only. Not at the system level. Outside of the context of a Gradle build run, Gradle can't enforce constraints on how the Java class loading system itself is designed to operate.
It's like I shouldn't be able to access my computer's CPU instruction register from a word processor application. But that doesn't mean that the operating system should be forbidden from interfacing with the CPU somehow.
In your case, your net.me.consumer.Main class is the „word processor“. org.apache.commons.csv.CSVFormat and org.apache.commons.codec.Decoder are analogous to instructions in the CPU. And java.lang.Class is analogous to an operating system.
You can't access the CPU directly from your word processor. However, you can do things in a word processor that cause the OS to interface with the CPU. You're doing something analogous to that with Class.forName(decoderClazzName).
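To make that concrete, here is a minimal sketch of my own (not code from your repo) showing why the implementation/api distinction never comes into play for a reflective lookup:

public class ReflectiveLoadDemo {
    public static void main(String[] args) throws Exception {
        // Compile time: javac only needs java.lang.Class and java.lang.String here,
        // so Gradle's api/implementation separation has nothing to constrain.
        Class<?> decoderClass = Class.forName("org.apache.commons.codec.Decoder");
        // Run time: the class loader searches the whole runtime classpath,
        // and commons-codec is on it because library-sample needs it to run.
        System.out.println("Loaded: " + decoderClass.getName());
    }
}

The lookup succeeds or fails purely based on what is on the runtime classpath, which is exactly the "system level" behavior described above.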
The same „leakage" your project demonstrates is reproducible in Eclipse too, and also in Visual Studio Code. In fact, I can add this to both the Main and MainTest classes of your consumer project:
org.apache.commons.codec.Decoder decoder = new Decoder() {
    @Override
    public Object decode(Object source) throws DecoderException {
        // TODO Auto-generated method stub
        return null;
    }
};
Eclipse not only allows it, it automatically creates it for me. Plus, it adds the necessary import statements without me even asking it to, and it very snappily and happily compiles it with no problems or complaints, even though there is no dependency on commons-codec defined in your consumer's build.gradle.
And though I haven't tried it in NetBeans, I suspect that you would get the same results in that or any other Java IDE.
The way you've filed a bug with IntelliJ, you would also have to file a bug with every Java IDE, every Java-based application server, every JVM-based compiler, and so on.
I don't think it is reasonable to expect every software vendor out there in the Java ecosystem to comply with the constraints defined by one single dependency management tool. Regardless of how elephantine they like to think they are ;)

How do you resolve several-levels-indirect missing dependency hell in Maven/Gradle?

We have a project which depends on Aspose Words' com.aspose:aspose-words:16.10.0:jdk16.
The POM for aspose-words declares no dependencies, but this turns out to be a lie. It actually uses jai-core, the latest version of which is javax.media:jai-core:1.1.3.
The POM for jai-core, though, also lies: it declares no dependencies, but actually depends on jai-codec, which is at com.sun.media:jai-codec:1.1.3.
Getting these projects to fix things seems impractical. JAI is basically a dead project, and Maven Central has no idea who added that POM, so there is nobody responsible for fixing the metadata. Aspose refuse to fix things without a test reproducing the problem, even if you can show them their own code doing it wrong; and even if they fixed it, they would then add their dependency on jai-core:1.1.3, which only fixes half the problem anyway.
If I look at our entire tree of dependencies, this is only one example of the problem. Others are lurking, masked out by other dependency chains coincidentally pulling in the missing dependency. In some cases, we have even reported POM issues to projects, only for them to say that the dependency "isn't real", despite their classes clearly referring to a class in the other library.
I can think of a few equally awkward options:
Create jai-core:1.1.3.1 and aspose-words:16.10.0.1 and fix their POMs to include the missing dependencies, but whoever updates them in the future will have to do the same thing. Plus, any other library I don't know about which happens to depend on jai-core would also have to be updated.
Add a dependency from our own project, even though it really isn't one.
Edit the POMs for the versions which are there now to fix the problem directly, the only caveat being that people might have cached the wrong one.
So I guess I have two related questions about this:
Is there no proper way to resolve this? It seems like any non-toy project would eventually hit this problem, so there not being an obviously correct way to deal with it is worrying.
Is there a way to stop incorrect dependency metadata getting into the artifact server in the first place? It's getting kind of out of hand, because other devs on the team are adding the dependencies without checking things properly, and then I'm left to clean up their error when something breaks a year later.
Tunaki has already given many good approaches. Let me add the following:
We had to deal with a lot of legacy jars which are old or strange versions of jars already on Maven Central. We gave them a special kind of version number (like 1.2.3-companyname) and created a POM for them that fitted our purposes. This is, more or less, your first "awkward option", and it is what I would go for in your case; additionally, I would define the version in dependencyManagement, so that Maven dependency mediation will not set it to some other version.
If a new version of your jar comes around, you can check if it still has the same problems (if they did a correct Maven build, they should have all dependencies inside the POM). If so, you need to fix it again.
I wouldn't change POMs for already existing versions, because it confuses people and may lead to inconsistency problems, since Maven will not fetch the new POM if an old version is already in the local repository. Adding the dependency to your own project is an option if you have very few projects to manage, so that you can still see what is going on (a proper comment on those dependencies in the POM would make it clearer).
JAI is optional for Aspose.Words for Java. Aspose.Words for Java uses JAI image encoders and decoders only if they are available, and it will work fine without JAI.
The codecs complement the standard Java ImageIO encoders/decoders. The most notable addition is support for TIFF.
JAI (Java Advanced Imaging) is not a usual library. First of all, it is a native library, i.e. it has separate distributions for different platforms. It also has a "portable" pure-Java distribution, but if you want the full power of JAI, you should stick with the native option.
Another thing: usually you run the installer for the JAI native distribution on the host system, i.e. it is installed like a desktop application, not like a usual Java library. Again, the JAI codec does not act like a usual library: if it is installed on the system, it plugs into ImageIO regardless of the classpath.
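As a small illustration of that plug-in behavior (a sketch of my own, not Aspose code; what it prints depends entirely on what is installed or registered at run time), you can ask ImageIO's service registry whether a TIFF reader is currently available:

import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

public class TiffSupportCheck {
    public static void main(String[] args) {
        // ImageIO discovers codecs through its service registry at run time,
        // so an installed JAI distribution can add TIFF support without any
        // change to the application's declared dependencies.
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName("tiff");
        System.out.println(readers.hasNext()
                ? "A TIFF reader is registered"
                : "No TIFF reader available");
    }
}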
So I don't know a good way to install JAI using Maven; it is like using Maven to install Skype or any other desktop application. But that is just IMHO, I am not a great specialist in Maven :)

two web projects sharing same base source code in eclipse

I am creating a project that has two flavors: one for enterprises and one for families.
The differences are simple, really: changed labels (properties), changed colors (CSS), a few deleted features...
I don't want to start a new project and copy the whole enterprise version over to the family version, because that would mean each bug must be solved in both.
Technology used: JSF, jQuery, Hibernate, and Eclipse as the IDE; I am deploying to JBoss.
I don't know where to start, but I have the following thoughts:
Solution attempt 1: a long time ago, when I used to develop J2ME applications on NetBeans, we had something called configurations, where we could choose to include different segments of the code depending on the configuration (device), so we would end up with multiple executable files.
Solution attempt 2: choose which code is common, move it to a separate location, and include it in both projects (this would be painful to implement, and I would also still need to replicate some fixes sometimes).
Is there any good solution to this issue?
Thanks
Consider using a repository (Git, SVN): create a branch for each of the versions, and then when you fix something on one of the versions, merge it into the other. With Git you can have a local repo just for your own use on your local machine, and it's really simple to set up.

A tool to detect broken JAR dependencies on class and method signature level

The problem scenario is as follows (note: this is not a cross-jar dependency issue, so tools like JarAnalyzer, ClassDep or Tattletale would not help; thanks).
I have a big project which is compiled into 10 or more jar artifacts. All jars depend on each other and form a dependency hierarchy.
Whenever I need to modify one of the jars, I would check out the relevant source code and the source code for projects that depend on it. Modify the code, compile, repackage the jars. So far so good.
The problem is: I may forget to check out one of the dependent projects, because inter-jar dependency chains can be quite long and may change over time. If this happens, some jars may go "out of sync" and I will eventually get a NoSuchMethodException or some other class incompatibility issue at run time, which is what I want to avoid.
The only solution I can think of, and the most straightforward one, is to check out all projects and recompile the bunch. But this takes time, especially if I rebuild on every small change. I do have a continuous integration server that could do this for me, but it's shared with other developers, so using it to see whether the build breaks is not an option for me.
However, I do have all the jars, so hypothetically it should be possible to verify whether the jars which depend on the code I modified have an inconsistency in method signatures, class names, etc. But how could I perform such a check?
Has anyone faced a similar problem before? If so, how did you solve it? Any tools or methodologies would be appreciated.
Let me know if you need clarification. Thanks.
EDIT:
I would like to clarify my question a little bit.
The ultimate goal of this task is to check that the changes I have made will compile against the whole project. I am looking for a tool/technique that would help me perform such a check.
Consider this example:
You have 2 projects: A and B which are deployed as A.jar and B.jar respectively. A depends on B.
You wish to modify B, so you check it out and modify a method signature that A happens to depend on. You can compile B and run all tests by itself without any problems because B itself does not depend on anything. So you happily commit your changes.
In a few hours the complete project integration fails because A could not be compiled!
How do I avoid this?
The kind of tool I am looking for would take A.jar and check that all of A's dependencies on the newly modified B are still fine; essentially, it would surface the compilation error that would happen if I were to recompile the A and B sources together.
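To be concrete, something along these lines is what I have in mind; this is only a sketch, and the paths and class name are hypothetical placeholders for the real layout:

import java.nio.file.Files;
import java.nio.file.Paths;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class CrossJarCheck {
    public static void main(String[] args) throws Exception {
        // Throwaway output directory so the real workspace is never touched.
        Files.createDirectories(Paths.get("/tmp/cross-jar-check"));
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        // Recompile A's sources against the new B.jar; signature mismatches
        // show up as ordinary compile errors (non-zero exit code).
        int result = javac.run(null, null, null,
                "-classpath", "/path/to/new/B.jar",           // hypothetical path to the rebuilt jar
                "-d", "/tmp/cross-jar-check",
                "/path/to/A/src/com/example/UsesB.java");     // hypothetical source file of A
        System.out.println(result == 0
                ? "A still compiles against the new B"
                : "Incompatible change detected");
    }
}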
Another solution, as suggested by many of you, is to set up a local continuous integration system that would recompile the whole project locally. I don't mind doing this, but I want to avoid doing it inside my workspace. On the other hand, if I check out all sources to another temporary workspace, then I need to mirror my local changes to that temporary workspace.
This is quite a big issue in my team, as builds break very often because somebody forgot to check out (or open in Eclipse) the right set of projects. I tried persuading people to check out the source and recompile the bunch before committing, but not only does it take time, it requires running quite a few commands, so most people just find it too troublesome. If the technique is not easy or automated, it's unusable.
If you do not want to use your shared continuous integration server, you should set up a local one on your developer machine, where you run the rebuild on every change.
I know Jenkins: it is easy to set up (just start it) on a local machine, and I would advise running it locally if nothing that fits your needs is provided in the IT infrastructure.
Checking signatures is unfortunately not enough. Having the correct signatures does not mean it will work; it's all about contracts, not just signatures. What happens if the new version of a library has the same method signature, but now expects an ArrayList parameter in reversed order? You will run into issues sooner or later. I suggest you consider adopting build tools like Ivy or Maven:
http://ant.apache.org/ivy/
http://maven.apache.org/
Yes, it can be a pain to implement, but once you have it, it will "guard" your versions forever; you should never run into such an issue again. But even those build tools are not 100% accurate. The only proper way of dealing with incompatible libraries (I know you won't like my answer) is extensive regression testing. For this you need a bunch of testing tools. There are plenty of them out there, from very basic unit testing (JUnit) to database testing (JDBC Proxy) and UI testing frameworks like SWTBot (depending on whether your app is a web app or a thick client).
Please note that if your project gets really huge and you have a large number of dependencies, you are never using all of the code in them. Trying to check all interfaces and all signatures is way too much; it's not necessary to test everything when your code uses, say, 30% of the library code. What you need is to test what you really use, and that can only be done with extensive regression testing.
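To make the contract point concrete, here is a contrived sketch (the class and method names are invented for illustration): the two "versions" have identical signatures, so no signature-level check would flag the change, yet callers that rely on the old ordering silently break:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical library class. In "version 1", entries() returned oldest-first;
// in "version 2", only the line marked below changed, so the public signature
// is untouched while the behavioral contract is broken.
public class AuditLog {
    private final List<String> entries = new ArrayList<>();

    public void record(String entry) {
        entries.add(entry);
    }

    public List<String> entries() {
        List<String> copy = new ArrayList<>(entries);
        Collections.reverse(copy);   // the only change between the two "versions"
        return copy;
    }
}

Only a regression test that asserts on the ordering would catch an incompatibility like this.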
I have finally found a whole treasure box of answers in this post. Thanks for the help, everyone!
The bounty goes to K. Claszen for being the quickest and giving the most input.
I also think that just setting up a local Jenkins is the best idea. What tool do you use for your build? Maybe you can improve your situation by switching to Maven as the build tool? It is smarter about this and doesn't recompile the full project if you don't ask it to directly. But switching to it can be a HUGE pain in the neck; it very much depends on how your project is organized now...
And about VCS: there is a Mercurial/SVN bridge, so you can use a local Mercurial repository for your development.
Check this link: https://www.mercurial-scm.org/wiki/WorkingWithSubversion
There is a solution, jarjar, which allows different versions of the same library to be included multiple times in the dependency graph.
I use IntelliJ, not Eclipse, so maybe my answer is too IDE-specific. But in IntelliJ, I would simply include the modules from B into A's project, so that when I make changes to B, it breaks A immediately when compiling in the IDE. Modules can belong to multiple projects, so this is not duplication; it's just adding references in the IDE to modules in other projects.
