I have zero experience with Java, but when trying to understand a certain "apocalyptic" vulnerability, I ended up with a fundamental question about imports in Java, so please bear with me.
My question is, as given in the title, why a Java package can not be updated with a single central patch.
For comparison, here are two hypothetical, diametrically opposed cases that I think I understand reasonably well:
If, say, a Python library had some vulnerability, then it should suffice (on well-maintained systems that use centralized libraries located on PYTHONPATH) to update that single library and any code that imports it should, in general, be fixed.
On the other hand, if a C library had a vulnerability, then it would be necessary to replace every single binary whose source includes the vulnerable library with a patched binary.
Now, as far as I could tell, Java is actually closer to the former category of languages, where external imports are not included in compiled sources.
If this is the case, then why can't a single patch be applied to fix an entire system (au contraire, our IT department forwarded a gigantic list of software for us to check individually)? Is it because of multiple decentralized copies of identical libraries being installed, or is there some other reason? Or am I misunderstanding the issue?
Java applications themselves are separate processes. In principle, all these processes can use different VMs. This is often the case for larger applications, which are tested against a specific VM. In principle, Java runtimes (J2SE implementations) should remain as compatible as possible with each other, but it is certainly possible for developers or libraries to muck this up, e.g. by using "Sun" internal classes or by assuming things not specified for the API calls. Personally, I hate this kind of bundled J2SE runtime; I'd rather have applications that are written to remain compatible.
Smaller applications usually just run on one of the installed JREs. However, they usually still need additional libraries or components - say, for instance, Log4J from Apache. These are often offered as separate .jar files (or "artifacts" in Maven speak). These libraries may also get updates; there is, however, no common way of updating them on most systems. There is no single "application" set of shared libraries, although it is certainly possible to create one. On Linux, for instance, there may be a set of libraries in /usr/share/java (by version, with generic names pointing to the latest one).
Many web applications - i.e. those running on a specific application server such as Tomcat, Glassfish etc. - do share a common "classpath", where application-specific .jar files are put in a specific folder. In that case an update of a library in the shared folder will affect all applications.
Java has had a framework for specific class-loaders, and in principle any framework can define their own set, so where the libraries are stored can depend on the framework. Java is very flexible and doesn't really have one single way of handling applications.
All this has precious little to do with import statements. These are just used as a shorthand notation, basically. You might as well write java.util.List everywhere instead of import java.util.List followed by List further down in the code. Class files contain references to other classes (etc.), and those are resolved (found and loaded) at runtime; see the description from Oracle here.
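To make the point about imports concrete, here is a minimal sketch. The class name ImportDemo and the helper describeOrigin are made up for illustration. It shows that a fully qualified name and an imported name refer to the same class, and it asks the running JVM where a class was actually loaded from, which is decided at runtime by the class path and class loaders, not by the import statement.

```java
import java.security.CodeSource;
import java.util.ArrayList;

public class ImportDemo {
    // Hypothetical helper: reports where the JVM found a class at runtime.
    public static String describeOrigin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (java.base) have no code source.
        return source == null ? "Java runtime" : String.valueOf(source.getLocation());
    }

    public static void main(String[] args) {
        // Same class, written once with the full name and once via the import.
        java.util.List<String> qualified = new java.util.ArrayList<>();
        java.util.List<String> imported = new ArrayList<>();
        System.out.println(qualified.getClass() == imported.getClass());

        // The origin depends on what is installed where, not on the source code.
        System.out.println(describeOrigin(ArrayList.class));
        System.out.println(describeOrigin(ImportDemo.class));
    }
}
```

Running the same check on a library class (say, one from a logging .jar) would print the path of whichever copy that particular application happened to load, which is why each application has to be checked individually.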
Can I make a Java program to generate another java application at runtime.
I want to make a "installer" program, which takes user input and generates an application as per user requirement, instead of just configuring the pre-built application according to the user needs.
I came across this solution - how to compile & run java program in another java program?, but I don't want to make clients install the JDK on their computers.
Dynamically create table and Java classes at runtime -
which also needs the JDK, but I found a workaround:
ToolProvider.getSystemJavaCompiler() returns null - usable with only JRE installed?
Can I make a complete application using above methods?
Is it a bad idea to generate such program?
Can I make Spring and Hibernate applications like that?
Or is there any existing framework for doing so?
(If possible, it should create tables in the db and generate html files as well. I came across http://velocity.apache.org/, so is it possible to generate Java code using that?)
Your goal doesn't make a lot of sense from a practical perspective. I hope that my answer will help you to understand why.
Can I make a java program to generate another java application at runtime.
Yes you can. But it is a lot of work, especially if the application is complicated.
I want to make a "installer" program, which takes user input and generates an application as per user requirement, instead of just configuring the pre-built application according to the user needs.
That is possible ... in theory.
The problem is that you have to write a program that is capable of reading and understanding the user's requirements, and can then convert those requirements into code. Normally ... this is what a programmer does. Writing a program to do what a programmer does is not practical. (My guess is that it is 20 or more years beyond the "state of the art" of artificial intelligence to do such a thing.)
Now if the problem domain was sufficiently restricted, and the requirements were tightly specified in an unambiguous notation, then it might be feasible to do this. However, the benefits of generating a program rather than configuring an existing one (based on the same requirement notation) are pretty small, and probably not worth the effort.
... but I don't want to make clients install JDK on their computer.
If you are generating Java programs you need a Java compiler. So if you insist on using a JRE (in Java 8), you need to include a 3rd party Java compiler in your application.
However, for Java 9 onward this is moot:
Oracle no longer provides JRE distributions for Java 9+ so you would need to get your client to use a 3rd-party source for their JRE.
You could (should) be using the Java 9+ jlink utility to produce a custom JRE for your application, and that can include the standard Java compiler.
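The compiler dependency described above can be sketched with the standard javax.tools API. This is only an illustration under stated assumptions: the class names RuntimeCompileSketch and GeneratedGreeter are invented, and the "generator" is naive string concatenation rather than anything that understands requirements. The key point is that ToolProvider.getSystemJavaCompiler() returns null when the runtime ships without a compiler (a plain Java 8 JRE, or a jlink image built without the jdk.compiler module).

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Supplier;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class RuntimeCompileSketch {
    /**
     * Generates, compiles, and runs a tiny class. Returns its greeting, or
     * null when the running Java runtime does not bundle a compiler.
     */
    public static String generateAndRun(String name) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            return null; // no compiler in this runtime
        }
        try {
            Path dir = Files.createTempDirectory("generated");
            Path source = dir.resolve("GeneratedGreeter.java");
            // Naive templating; assumes 'name' contains no quotes or backslashes.
            String code =
                "public class GeneratedGreeter implements java.util.function.Supplier<String> {\n"
              + "    public String get() { return \"Hello, " + name + "\"; }\n"
              + "}\n";
            Files.write(source, code.getBytes(StandardCharsets.UTF_8));

            // Exit status 0 means the compilation succeeded.
            int status = compiler.run(null, null, null, "-d", dir.toString(), source.toString());
            if (status != 0) {
                throw new IllegalStateException("compilation failed");
            }
            // Load the freshly compiled class from the temporary directory.
            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Class<?> generated = loader.loadClass("GeneratedGreeter");
                @SuppressWarnings("unchecked")
                Supplier<String> greeter =
                    (Supplier<String>) generated.getDeclaredConstructor().newInstance();
                return greeter.get();
            }
        } catch (Exception e) {
            throw new IllegalStateException("code generation failed", e);
        }
    }

    public static void main(String[] args) {
        String result = generateAndRun("world");
        System.out.println(result == null ? "no compiler available" : result);
    }
}
```

Even this toy version needs a compiler, a temporary directory, and a class loader, which hints at how much more machinery a generator for a complete application would require.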
If you are trying to generate code at the bytecode level, your problem is immediately ten times harder.
Sorry, I am using Java 8
Are you aware that Java 8 is "end of life" for commercial use? That is likely to affect your clients.
Can I make a complete application using above methods?
Maybe yes, maybe no. It depends on the problem domain. The more complicated it is, and the more diverse / general the requirements, the harder it will be.
Is it a bad idea to generate such program?
Yes. It is a bad idea. It is a lot more work than writing an application that is configured in the conventional way. (Noting that the configuration could include writing plugins in Java, rules in some scripting language, and so on.)
I would advise only generating source code or bytecodes if you already have a conventional application with most / all of the required functionality that you can use as a prototype for the generated code. (If you can't write such a prototype by hand, then writing a generator that will create one is not realistic.)
And even when it is feasible, I would question the wisdom of building a generator. There doesn't seem to be a significant pay-off for the extra effort. (For example, where is the benefit for the end user?)
Can I make Spring and Hibernate applications like that?
I don't see why you couldn't generate such an application. But see 1) and 2).
Or is there any existing framework for doing so?
There are frameworks that could be used in some cases:
Templating frameworks like Velocity [1] can be used to generate Java source code.
Bytecode engineering frameworks could be used to generate code directly.
[1] Indeed, I have used Velocity for Java source code generation. It worked, though I'm not convinced it was an ideal solution.
Sure you can. You can also leverage a project like GraalVM to generate native binaries for a given platform.
However, it is a lot of work, and the end result probably won't be as useful as you think. Any use case you have in mind will probably be a lot better served by an app that you just configure to do different tasks, so your efforts are probably best spent in that direction.
We're currently migrating from Java 8 to Java 11. However, upgrading our services was less painful than we anticipated. We basically only had to change the version number in our build.gradle file and the services were happily up and running. We upgraded libraries as well as (micro) services that use those libs. No problems until now.
Is there any need to actually switch to modules? This would generate needless costs IMHO. Any suggestion or further reading material is appreciated.
To clarify, are there any consequences if Java 9+ code is used without introducing modules? E.g. can it become incompatible with other code?
No.
There is no need to switch to modules.
There has never been a need to switch to modules.
Java 9 and later releases support traditional JAR files on the
traditional class path, via the concept of the unnamed module, and will
likely do so until the heat death of the universe.
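As a small illustration of the unnamed module (the class name UnnamedModuleCheck is made up, and Java 9 or later is assumed), code started from the traditional class path can ask which module its classes belong to:

```java
public class UnnamedModuleCheck {
    // Returns the module name, or a placeholder for class-path code.
    public static String moduleOf(Class<?> type) {
        Module module = type.getModule();
        return module.isNamed() ? module.getName() : "unnamed module";
    }

    public static void main(String[] args) {
        // JDK classes live in named modules such as java.base.
        System.out.println(moduleOf(String.class));
        // A class launched from the class path reports the unnamed module.
        System.out.println(moduleOf(UnnamedModuleCheck.class));
    }
}
```

When this class is compiled and run without a module-info.java, the second line is expected to report the unnamed module, and the program works the same way it would have on Java 8.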
Whether to start using modules is entirely up to you.
If you maintain a large legacy project that isn’t changing very much,
then it’s probably not worth the effort.
If you work on a large project that’s grown difficult to maintain over
the years then the clarity and discipline that modularization brings
could be beneficial, but it could also be a lot of work, so think
carefully before you begin.
If you’re starting a new project then I highly recommend starting with
modules if you can. Many popular libraries have, by now, been upgraded
to be modules, so there’s a good
chance that all of the dependencies that you need are already available
in modular form.
If you maintain a library then I strongly recommend that you
upgrade it to be a module if you haven’t done so already, and if all of
your library’s dependencies have been converted.
All this isn’t to say that you won’t encounter a few stumbling blocks
when moving past Java 8. Those that you do encounter will, however,
likely have nothing to do with modules per se. The most common
migration problems that we’ve heard about since we released Java 9 in
2017 have to do with changes to the syntax of the version
string and to the removal or
encapsulation of internal APIs
(e.g., sun.misc.BASE64Decoder) for which public, supported
replacements have been available for years.
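For the internal-API case mentioned above, a typical fix is to switch to the supported java.util.Base64 class, available since Java 8. The sketch below uses an invented class name (Base64Migration). The lenient variant is an assumption about what legacy callers may need: the MIME decoder ignores line breaks in the input, which code written against the old internal decoder sometimes relied on.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class Base64Migration {
    // Strict decoding with the supported API.
    public static String decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Lenient decoding: the MIME decoder skips line separators in the input.
    public static String decodeLenient(String encoded) {
        byte[] raw = Base64.getMimeDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static String encode(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(encode("hello"));
        System.out.println(decode("aGVsbG8="));
        System.out.println(decodeLenient("aGVs\r\nbG8="));
    }
}
```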
I can only tell you my organization's opinion on the matter. We are in the process of moving to modules, for every single project that we are working on. What we are building is basically micro-services + some client libraries. For micro-services the transition to modules is somewhat lower priority: the code there is already somewhat isolated in the docker container, so "adding" modules in there does not seem (to us) very important. This work is being picked up slowly, but it's low priority.
On the other hand, client libraries are an entirely different story. I can not tell you the mess we have sometimes. I'll explain one point that I hated before jigsaw. You expose an interface to clients, for everyone to use. Automatically that interface is public - exposed to the world. Usually, what I do is then have some package-private classes, not exposed to the clients, that use that interface. I don't want clients to use those; they are internal. Sounds good? Wrong.
The first problem is that when those package-private classes grow, and you want more classes, the only way to keep everything hidden is to create classes in the same package:
package abc:
-- /* non-public */ Usage.java
-- /* non-public */ HelperUsage.java
-- /* non-public */ FactoryUsage.java
....
When it grows (in our cases it does), those packages are way too big. Moving to a separate package you say? Sure, but then that HelperUsage and FactoryUsage will be public and we tried to avoid that from the beginning.
Problem number two: any user/caller of our clients can create the same package name and extend those hidden classes. It happened a few times to us already, fun times.
Modules solve this problem in a beautiful way: public is not really public anymore; I can have friend access via the "exports ... to" directive. This makes our code lifecycle and management much easier. And we get away from classpath hell. Of course maven/gradle handle that for us, mainly, but when there is a problem, the pain will be very real. There could be many other examples, too.
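A minimal sketch of such a module descriptor follows. All module and package names are hypothetical; they stand in for a client library with a public API package, and an internal package that only one "friend" module may use.

```java
// module-info.java for a hypothetical client library
module com.example.client {
    // Visible to every module that reads this one.
    exports com.example.client.api;

    // Qualified export: public types in this package are visible
    // only to the named friend module, not to ordinary callers.
    exports com.example.client.internal to com.example.client.extras;
}
```

With this in place the helper classes can be public and spread over several packages without becoming part of the library's public surface.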
That said, the transition is (still) not easy. First of all, everyone on the team needs to be aligned; second, there are hurdles. The two biggest I still see are these. First: how do you separate each module, and based on what, specifically? I don't have a definite answer yet. The second is split packages, oh the beautiful "same class is exported by different modules". If this happens with your own libraries, there are ways to mitigate it; but if these are external libraries... not that easy.
If you depend on jarA and jarB (separate modules), but they both export abc.def.Util, you are in for a surprise. There are ways to solve this, though. Somewhat painful, but solvable.
Overall, since we migrated to modules (and still do), our code has become much cleaner. And if your company is a "code-first" company, this matters. On the other hand, I have been involved in companies where this was seen as "too expensive" with "no real benefit" by senior architects.
I'd like to use Kotlin & Scala together in projects, and maybe some other languages, but I've seen no good way of doing it. The only way I thought of was compiling one language and decompiling it into Java to work with the other. Are there any alternatives?
For the sake of completeness and not putting words into someone else's mouth, I wanted to weigh in.
I agree with the last sentence of ziggystar's answer. The right thing to do is to take a component-based approach and not try to combine multiple languages in one component or project.
From a technical perspective, each of the JVM languages has its own compiler. Some, such as Scala's, can compile both Scala and Java files. However, this may or may not be true for other compilers. In order to avoid strange build processes, a good approach would be to use a single language for every built module.
Since you're sticking to JVM languages, every language can be compiled into a JAR, so you can easily distribute your executable binary as a single JAR file, with all of the components wrapped up inside it. This is the Fat JAR approach (see this question on Stack Overflow, this post on Java Code Geeks).
From a human readability perspective, this should also make your software more easily understood. Not only have you decomposed it into logical building blocks (each component), but someone making modifications only needs to understand the language that the component they are working on is written in and the public interface of the components they need to interact with. There's no mental context switching between languages.
You can use Scala and Java simultaneously, since scalac understands and compiles Java files. The same probably holds for other languages. Problems might arise when using multiple alternative JVM languages, since, e.g., the Kotlin compiler probably can't understand the Scala files and vice versa.
I think the best way would be to split the project into different modules, and use at most one alternative language per module.
What do I mean by module?
By module, I mean a set of source files that gets translated into one (binary) artifact, i.e. a jar file. Under different circumstances I would simply call a "module" a project. Note that a module may depend on other modules on the binary level (e.g. it has some jar files as dependencies).
Multi module support in IDEs
I think it should be possible with most major IDEs to work on different modules simultaneously, even if each module uses a different language. Terminology varies across IDEs.
Terminology
For IntelliJ IDEA, one of my modules is called "module". For Eclipse it would be called "project".
Java's package management system always seemed simple and effective to me. It is heavily used by the JDK itself. We have been using it to mimic the concept of namespaces and modules.
What is Project Jigsaw (aka Java Platform Module System) trying to fill in?
From the official site:
The goal of this Project is to design and implement a standard module
system for the Java SE Platform, and to apply that system to the
Platform itself and to the JDK.
Jigsaw and OSGi are trying to solve the same problem: how to allow coarser-grained modules to interact while shielding their internals.
In Jigsaw's case, the coarser-grained modules include Java classes, packages, and their dependencies.
Here's an example: Spring and Hibernate. Both have a dependency on a 3rd party JAR, CGLIB, but they use different, incompatible versions of that JAR. What can you do if you rely on the standard JDK? Including the version that Spring wants breaks Hibernate and vice versa.
But, if you have a higher-level model like Jigsaw you can easily manage different versions of a JAR in different modules. Think of them as higher-level packages.
If you build Spring from the GitHub source you'll see it, too. They've redone the framework so it consists of several modules: core, persistence, etc. You can pick and choose the minimal set of module dependencies that your application needs and ignore the rest. It used to be a single Spring JAR, with all the .class files in it.
Update: Five years later - Jigsaw might still have some issues to resolve.
AFAIK The plan is to make the JRE more modular. I.e. have smaller jars which are optional and/or you can download/upgrade only the functionality you need.
It's to make it less bloated and give you the option of dropping legacy modules which perhaps most people don't use.
Based on Mark Reinhold's keynote speech at Devoxx Belgium, Project Jigsaw is going to address two main pain points:
Classpath
Massive Monolithic JDK
What's wrong with Classpath?
We all know about JAR Hell. This term describes all the various ways in which the classloading process can end up not working. The best-known limitations of the classpath are:
It's hard to tell if there are conflicts. Build tools like Maven can do a pretty good job based on artifact names, but if the artifacts themselves have different names but the same contents, there could be a conflict.
The fundamental problem with jar files is that they are not components. They're just a bunch of file containers that will be searched linearly. The classpath is a way to look up classes regardless of what components they're in, what packages they're in or their intended use.
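The linear search described above can be observed with a small diagnostic. This is a sketch with invented names (DuplicateClassFinder, locationsOf); it asks the system class loader for every place on the class path that provides a given class file. When more than one location is returned, the class path loader normally uses the first one and silently ignores the rest.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class DuplicateClassFinder {
    // Lists every class-path location that contains the given class.
    public static List<URL> locationsOf(String className) {
        String resource = className.replace('.', '/') + ".class";
        try {
            return Collections.list(ClassLoader.getSystemClassLoader().getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass a fully qualified class name as the first argument.
        String className = args.length > 0 ? args[0] : "DuplicateClassFinder";
        List<URL> locations = locationsOf(className);
        for (URL location : locations) {
            System.out.println(location);
        }
        if (locations.size() > 1) {
            System.out.println("Warning: " + className + " is provided more than once");
        }
    }
}
```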
Massive Monolithic JDK
The big monolithic nature of JDK causes several problems:
It doesn't fit on small devices. Even though small IoT-type devices have processors capable of running an SE-class VM, they do not necessarily have the memory to hold all of the JDK, especially when the application uses only a small part of it.
It's even a problem in the cloud. The cloud is all about optimizing the use of hardware; if you have thousands of images containing the whole JDK but the applications use only a small part of it, that is a waste.
Modules: The Common Solution
To address the above problems, we treat modules as a fundamental new kind of Java program component. A module is a named, self-describing collection of code and data. Its code is organized as a set of packages containing types, i.e., Java classes and interfaces; its data includes resources and other kinds of static information.
To control how its code refers to types in other modules, a module declares which other modules it requires in order to be compiled and run. To control how code in other modules refers to types in its packages, a module declares which of those packages it exports.
The module system locates required modules and, unlike the class-path mechanism, ensures that code in a module can only refer to types in the modules upon which it depends. The access-control mechanisms of the Java language and the Java virtual machine prevent code from accessing types in packages that are not exported by their defining modules.
Apart from being more reliable, modularity could improve performance. When code in a module refers to a type in a package then that package is guaranteed to be defined either in that module or in precisely one of the modules read by that module. When looking for the definition of a specific type there is, therefore, no need to search for it in multiple modules or, worse, along the entire class path.
JEPs to Follow
Jigsaw is an enormous project that has been ongoing for quite a few years. It has an impressive number of JEPs, which are great places to learn more about the project. Some of these JEPs are:
JEP 200: The Modular JDK: Use the Java Platform Module System (JPMS) to modularize the JDK
JEP 201: Modular Source Code: Reorganize the JDK source code into modules, enhance the build system to compile modules, and enforce module boundaries at build time
JEP 261: Module System: Implement the Java Platform Module System, as specified by JSR 376, together with related JDK-specific changes and enhancements
JEP 220: Modular Run-Time Images: Restructure the JDK and JRE run-time images to accommodate modules and to improve performance, security, and maintainability
JEP 260: Encapsulate Most Internal APIs: Make most of the JDK's internal APIs inaccessible by default but leave a few critical, widely-used internal APIs accessible, until supported replacements exist for all or most of their functionality
JEP 282: jlink: The Java Linker: Create a tool that can assemble and optimize a set of modules and their dependencies into a custom run-time image as defined in JEP 220
Closing Remarks
In the initial edition of The State of the Module System report, Mark Reinhold describes the specific goals of the module system as follows:
Reliable configuration, to replace the brittle, error-prone class-path mechanism with a means for program components to declare explicit dependences upon one another, along with
Strong encapsulation, to allow a component to declare which of its public types are accessible to other components, and which are not.
These features will benefit application developers, library developers, and implementors of the Java SE Platform itself directly and, also, indirectly, since they will enable a scalable platform, greater platform integrity, and improved performance.
For the sake of argument, let's assert that Java 8 (and earlier) already has a "form" of modules (jars) and module system (the classpath). But there are well-known problems with these.
By examining the problems, we can illustrate the motivation for Jigsaw. (The following assumes we are not using OSGi, JBoss Modules, etc, which certainly offer solutions.)
Problem 1: public is too public
Consider the following classes (assume both are public):
com.acme.foo.db.api.UserDao
com.acme.foo.db.impl.UserDaoImpl
At Foo.com, we might decide that our team should use UserDao and not use UserDaoImpl directly. However, there is no way to enforce that on the classpath.
In Jigsaw, a module contains a module-info.java file which allows us to explicitly state what is public to other modules. That is, public has nuance. For example:
// com.acme.foo.db.api.UserDao is accessible, but
// com.acme.foo.db.impl.UserDaoImpl is not
module com.acme.foo.db {
    exports com.acme.foo.db.api;
}
Problem 2: reflection is unbridled
Given the classes in #1, someone could still do this in Java 8:
Class<?> c = Class.forName("com.acme.foo.db.impl.UserDaoImpl");
Object obj = c.getConstructor().newInstance();
That is to say: reflection is powerful and essential, but if unchecked, it can be used to reach into the internals of a module in undesirable ways. Mark Reinhold has a rather alarming example. (The SO post is here.)
In Jigsaw, strong encapsulation offers the ability to deny access to a class, including reflection. (This may depend on command-line settings, pending the revised tech spec for JDK 9.) Note that because Jigsaw is used for the JDK itself, Oracle claims that this will allow the Java team to innovate the platform internals more quickly.
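The effect on reflection can be probed with a short experiment. This is a sketch only: the class name EncapsulationProbe is invented, and the outcome depends on the JDK version and on command-line options such as --add-opens, so no particular result is promised here.

```java
import java.lang.reflect.Field;

public class EncapsulationProbe {
    /** Returns true if deep reflection into a JDK-internal field was permitted. */
    public static boolean canOpenStringInternals() {
        try {
            // 'value' is a private implementation detail of java.lang.String.
            Field value = String.class.getDeclaredField("value");
            value.setAccessible(true); // may be refused under strong encapsulation
            return true;
        } catch (NoSuchFieldException e) {
            return false; // the field name is not part of any API and may change
        } catch (RuntimeException e) {
            // On Java 9+ a refusal surfaces as InaccessibleObjectException.
            return false;
        }
    }

    public static String describe() {
        return canOpenStringInternals()
            ? "deep reflection into java.base was allowed"
            : "deep reflection into java.base was denied";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```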
Problem 3: the classpath erases architectural relationships
A team typically has a mental model about the relationships between jars. For example, foo-app.jar may use foo-services.jar which uses foo-db.jar. We might assert that classes in foo-app.jar should not bypass "the service layer" and use foo-db.jar directly. However, there is no way to enforce that via the classpath. Mark Reinhold mentions this here.
By comparison, Jigsaw offers an explicit, reliable accessibility model for modules.
Problem 4: monolithic run-time
The Java runtime is in the monolithic rt.jar. On my machine, it is 60+ MB with 20k classes! In an age of micro-services, IoT devices, etc, it is undesirable to have CORBA, Swing, XML, and other libraries on disk if they aren't being used.
Jigsaw breaks up the JDK itself into many modules; e.g. java.sql contains the familiar SQL classes. There are several benefits to this, but a new one is the jlink tool. Assuming an app is completely modularized, jlink generates a distributable run-time image that is trimmed to contain only the modules specified (and their dependencies). Looking ahead, Oracle envisions a future where the JDK modules are compiled ahead-of-time into native code. Though jlink is optional, and AOT compilation is experimental, they are major indications of where Oracle is headed.
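To see the modular JDK from inside a running program, one can list the modules in the boot layer. The sketch below assumes Java 9 or later, and the class name RuntimeModules is invented. On a full JDK the list is long; on a jlink-trimmed image only the modules that were linked in should appear.

```java
import java.util.Set;
import java.util.TreeSet;

public class RuntimeModules {
    // Collects the names of all modules in the boot layer, sorted.
    public static Set<String> bootModuleNames() {
        Set<String> names = new TreeSet<>();
        for (Module module : ModuleLayer.boot().modules()) {
            names.add(module.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Set<String> names = bootModuleNames();
        System.out.println(names.size() + " modules in this runtime image");
        // java.sql is present on a full JDK but may be absent from a trimmed image.
        System.out.println("java.sql present: " + names.contains("java.sql"));
    }
}
```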
Problem 5: versioning
It is well-known that the classpath does not allow us to use multiple versions of the same jar: e.g. bar-lib-1.1.jar and bar-lib-2.2.jar.
Jigsaw does not address this problem; Mark Reinhold states the rationale here. The gist is that Maven, Gradle, and other tools represent a large ecosystem for dependency management, and another solution will be more harmful than beneficial.
It should be noted that other solutions (e.g. OSGi) do indeed address this problem (and others, aside from #4).
Bottom Line
Those are some key points for Jigsaw, motivated by specific problems.
Note that explaining the controversy between Jigsaw, OSGi, JBoss Modules, etc is a separate discussion that belongs on another Stack Exchange site. There are many more differences between the solutions than described here. What's more, there was sufficient consensus to approve the Public Review Reconsideration Ballot for JSR 376.
This article explains in detail the problems which both OSGi and JPMS/Jigsaw try to solve:
"Java 9, OSGi and the Future of Modularity" [22 SEP 2016]
It also goes thoroughly into the approaches of both OSGi and JPMS/Jigsaw.
As of now, it appears the authors listed almost no practical pros for JPMS/Jigsaw compared with the mature (16-year-old) OSGi.
I'm trying to code an application which runs on different Java platforms like J2SE, J2ME, Android, etc. I already know that I'll have to rewrite most of the UI for each platform, but want to reuse the core logic.
Keeping this core portable involves three drawbacks that I know of:
(1) Keeping to the old Java 1.4 syntax, not using any of the nice language features of Java 5.0
(2) only using external libraries that are known to work on those platforms (that is: don't use JNI and don't have dependencies on other libs which violate these rules)
(3) only using the classes which are present on all those platforms
I know of ways to overcome (1): code in 5.0 style and automatically convert it to 1.4 (retroweaver - haven't tried it yet, but seems ok).
I think (2) is a problem that I just have to accept.
Now I'd like to know what's the best workaround for (3), especially collection classes, which I miss the most. I can think of those:
Most programmers I know just don't use Set, Map, List, etc. and fall back to Vector and plain arrays. I think this makes code ugly in the first place. But I also know that the right choice between TreeSet/HashSet or LinkedList/ArrayList is crucial for performance, and always using Vector and arrays can't be right.
I could code my own implementations of those classes. This seems to be reinventing the wheel, and I think I could not do it as well as others have done.
Since Java is open source, I could grab the source code of the J2SE Collections framework and include it in my application when building for J2ME. I don't know if this is a good idea, though. Perhaps there are good reasons not to do this.
Maybe there already are libraries out there which rebuild the most important features of the collections framework, but are optimized for low-end systems, perhaps by not implementing functionality that is used infrequently. Do you know any?
Thanks for your answers and opinions!
Edit: I finally found a (complex, but nice) solution, and I thought by providing my own answer and accepting it, the solution would become visible at the top. But to the contrary, my answer is still at the very bottom.
J2ME is brutal, and you're just going to have to resign yourself to doing without some of the niceties of other platforms. Get used to Hashtable and Vector, and writing your own wrappers on top of those. Also, don't make the mistake of assuming that J2ME is standard either, as each manufacturer's JVM can do things in profoundly different ways. I wouldn't worry much about performance initially, as just getting correctness on J2ME is enough of a challenge. It is possible to write an app that runs across J2ME, J2SE and Android, as I've done it, but it takes a lot of work. One suggestion that I'd have is that you write the core of your application logic and keep it strictly to java.lang, java.util and java.io. Anywhere where you're going to be doing something that might interact with the platform, such as the file system or network, you can create an interface that your core application code interacts with, that you have different implementations for the different environments. For example, you can have an interface that wraps up HTTP stuff, and uses javax.microedition.io.HttpConnection with J2ME and java.net.HttpURLConnection on Android. It's a pain, but if you want to maintain an app running on all three of those environments, it can get you there. Good luck.
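The interface-per-platform idea described above might look roughly like this. All names (PortableCoreSketch, HttpFetcher, StatusChecker, CannedFetcher) are invented for illustration, and the canned implementation is only a stand-in: a real build would supply one implementation based on javax.microedition.io.HttpConnection for J2ME and another based on java.net.HttpURLConnection for Android or J2SE.

```java
import java.io.IOException;

public class PortableCoreSketch {
    /** Platform-neutral contract; the core only ever sees this interface. */
    public interface HttpFetcher {
        String fetch(String url) throws IOException;
    }

    /** Core logic, restricted to java.lang and java.io types. */
    public static class StatusChecker {
        private final HttpFetcher fetcher;

        public StatusChecker(HttpFetcher fetcher) {
            this.fetcher = fetcher;
        }

        public boolean isServiceUp(String url) {
            try {
                return fetcher.fetch(url).indexOf("OK") >= 0;
            } catch (IOException e) {
                return false; // treat network failures as "down"
            }
        }
    }

    /** Stand-in implementation that returns a fixed body without any network access. */
    public static class CannedFetcher implements HttpFetcher {
        private final String body;

        public CannedFetcher(String body) {
            this.body = body;
        }

        public String fetch(String url) {
            return body;
        }
    }

    public static void main(String[] args) {
        StatusChecker checker = new StatusChecker(new CannedFetcher("status: OK"));
        System.out.println(checker.isServiceUp("http://example.invalid/health"));
    }
}
```

The core class never mentions a platform API, so the same compiled logic can be packaged with whichever implementation the target environment supports.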
It's been a while since I asked this question, and a while since I found a nice, working solution for the problem, but I had since forgotten to tell you.
My main focus was the Java Collections Framework, which is part of the java.util package.
I've finally taken the source code of Sun's Java 6.0 and copied all the classes that belong to the Collections framework into a project of my own. This was a Java 6.0 project, but I used the jars from J2ME as classpath. Most of those classes that I copied depend on other J2SE classes, so there are broken dependencies. Anyway, it was quite easy to cut those dependencies by leaving out everything that deals with serialization (which is not a priority for me) and making some minor adjustments.
I compiled the whole thing with a Java 6 compiler, and retrotranslator was used to port the resulting bytecode back to Java 1.2.
Next problem is the package name, because you can't deliver classes from java.util with a J2ME application and load them - the bootstrap class loader won't look into the application's jar file, the other class loaders aren't allowed to load something with that package name, and on J2ME you can't define custom class loaders. Retrotranslator not only converts bytecode, it also helps to change name references in existing bytecode. I had to move and rename all classes in my project, e.g. java.util.TreeMap became my.company.backport.java.util.TreeMap_.
I was then able to write the actual J2ME application in a second Java 6.0 project which referenced the usual java.util.TreeMap, using the generic syntax to create type-safe collections, compile that app to Java 6.0 byte code, and run it through retrotranslator to create Java 1.2 code that now references my.company.backport.java.util.TreeMap_. Note that TreeMap is just an example; it actually works for the whole collections framework and even for 3rd party J2SE jars that reference that framework.
The resulting app can be packaged as a jar and jad file, and runs fine on both J2ME emulators and actual devices (tested on a Sony Ericsson W880i).
The whole process seems rather complex, but since I used Ant for build automation, and I needed retrotranslator anyway, there was only a one-time overhead to set up the collection framework backport.
As stated above, I did this nearly a year ago and am writing this mostly from memory, so I hope there are no errors in it. If you are interested in more details, leave me a comment. I've got a few pages of German documentation about that process, which I could provide if there is any demand.
We faced exactly this situation in developing zxing. If J2ME is in your list of targets, this is your limiting factor by far. We targeted MIDP 2.0 / CLDC 1.1. If you have a similar requirement, you need to stick to Java 1.2. Java 1.4 language features are definitely not present (like assert) and in general you won't find anything after 1.2 in J2ME.
We did not use external libraries, but, you could package them into your deployed .jar file with little trouble. It would make the resulting .jar bigger, and that could be an issue. (Then you can try optimizers/shrinkers like ProGuard to mitigate that.)
I did end up reimplementing something like Collections.sort() and Comparator since we needed them and they are not in J2ME. So yeah you might consider doing this in cases, though only where necessary.
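A reimplementation along those lines could look like the following sketch. It is not the zxing code; the names VectorSort and Comparer are invented. It sticks to Vector methods that also exist in CLDC (elementAt, setElementAt, size) and uses raw types, since generics are not available there. The @SuppressWarnings annotation is only there to keep a modern compiler quiet and would have to be removed for a pre-Java-5 toolchain.

```java
import java.util.Vector;

public class VectorSort {
    /** Minimal stand-in for java.util.Comparator, which CLDC lacks. */
    public interface Comparer {
        int compare(Object left, Object right);
    }

    /** In-place insertion sort: stable and adequate for small lists. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void sort(Vector items, Comparer comparer) {
        for (int i = 1; i < items.size(); i++) {
            Object current = items.elementAt(i);
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && comparer.compare(items.elementAt(j), current) > 0) {
                items.setElementAt(items.elementAt(j), j + 1);
                j--;
            }
            items.setElementAt(current, j + 1);
        }
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void main(String[] args) {
        Vector names = new Vector();
        names.addElement("pear");
        names.addElement("apple");
        names.addElement("fig");
        sort(names, new Comparer() {
            public int compare(Object left, Object right) {
                return ((String) left).compareTo((String) right);
            }
        });
        System.out.println(names); // prints [apple, fig, pear]
    }
}
```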
We used Vector and Hashtable and arrays since there is no other choice, really, in J2ME. I would just use them unless you have a reason not to, and that would be performance I guess. In theory JVM makers are already optimizing their implementation but that doesn't mean you couldn't make a better one... I guess I would be surprised if it is worth it in the vast majority of cases. Just make sure you really need to do this before putting in the effort.
To answer part of your question: another collections library would be Javolution, which can be built for J2ME.