Use application configuration over Hadoop configuration - java

I have built a Java application using Maven. It is packaged as an executable jar using the Maven Shade plugin. This application does several things - one of those is to upload data to a Hadoop cluster. I execute the program using the following:
$ hadoop jar <app_name>.jar <app_arg1> <app_arg2> ...
My application uses SLF4J with the Log4J bindings for logging - and so does Hadoop.
When using the hadoop jar command, Hadoop's own Log4J configuration file overrides my application's Log4J configuration file.
How can I prevent my application's Log4J configuration file from being overridden?
NOTES:
Relevant dependencies: hadoop-core:1.2.1, slf4j-api:1.7.12, and slf4j-log4j12:1.7.12.
I'm using the hadoop jar command, instead of java -jar. My application code that interacts with the Hadoop cluster only works when using the hadoop jar command. I've outlined this issue in a previous SO question.
EDIT 1: (10/02/2015)
I've done a few things.
First, I changed the name of my Log4J configuration file to avoid the name collision with the default log4j.properties file that Hadoop uses:
log4j-<app_name>.properties
Second, I set the HADOOP_OPTS environment variable to tell Log4J the name of my configuration file:
HADOOP_OPTS=-Dlog4j.configuration=log4j-<app_name>.properties
Third, I set the HADOOP_CLASSPATH environment variable to ensure my configuration file that is packaged within the uber jar is picked up by the hadoop jar command:
HADOOP_CLASSPATH=/absolute/path/to/<app_name>.jar
With these changes, my application now uses its own Log4J configuration file as intended. It feels like a hack (I would have preferred to use the java -jar command), but it resolved my issue.
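To confirm which configuration file actually wins on the classpath, a small diagnostic can help. The following is a minimal sketch, not part of Log4J or Hadoop: the class name Log4jConfigProbe and its methods are hypothetical. It assumes Log4J 1.2's default initialization, which reads the log4j.configuration system property and otherwise falls back to log4j.properties (the real lookup also tries log4j.xml first, which is omitted here).

```java
import java.net.URL;

public final class Log4jConfigProbe {

    /**
     * Returns the resource name to look for, given the value of
     * -Dlog4j.configuration (null when the property is unset).
     * Assumption: the fallback name is log4j.properties.
     */
    public static String effectiveName(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return "log4j.properties";
        }
        return propertyValue.trim();
    }

    /** Returns the first classpath location that provides the resource, or null. */
    public static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        return cl.getResource(resourceName);
    }

    public static void main(String[] args) {
        String name = effectiveName(System.getProperty("log4j.configuration"));
        URL url = locate(name);
        System.out.println(name + " -> " + (url == null ? "not found on classpath" : url));
    }
}
```

Running this class through the same hadoop jar command (with HADOOP_OPTS and HADOOP_CLASSPATH set as above) should show whether the URL points into the uber jar or into Hadoop's own conf directory.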

By default, Hadoop framework jars appear before the user's jars in the classpath. You can give your (user) jars precedence by passing the -Dmapreduce.job.user.classpath.first=true parameter in the command. The new command looks like this:
hadoop jar <app_name>.jar -Dmapreduce.job.user.classpath.first=true <app_arg1> <app_arg2> ...
Or you can put the following configuration in your mapred-site.xml to always give preference to the user classpath:
<property>
  <name>mapreduce.job.user.classpath.first</name>
  <value>true</value>
</property>
You can also set this programmatically in the job configuration:
job.getConfiguration().set("mapreduce.job.user.classpath.first", "true");
You can set this in any of these ways.

Related

How to set different logging levels for vertx shadow jar in runtime?

The Vert.x starter uses the Shadow plugin to package a fat jar, so
log4j2.xml is put into the jar file.
How can I run the jar with different log levels? (With Spring Boot I can set -Dspring.profiles.active=test to use application-test.yml to switch on debug logging.)
You can make Log4j2 use another configuration file by setting the log4j2.configurationFile system property, as explained in the configuration section of the documentation:
java -Dlog4j2.configurationFile=/path/to/log4j2.xml -jar myapp.jar

What are auto executable jars?

I was going through the spring-boot-maven-plugin documentation and came across the term "auto executable jar".
Could someone please explain what an auto executable jar is, how it differs from a normal jar file, and how it is auto executed?
The spring-boot-maven-plugin documentation mentions the term but does not go on to explain it:
repackage: create a jar or war file that is auto-executable. It can replace the regular artifact or can be attached to the build lifecycle with a separate classifier.
Could someone please explain me what is an auto executable jar
A fully executable jar can be executed like any other executable
binary or it can be registered with init.d or systemd. This makes it
very easy to install and manage Spring Boot applications in common
production environments.
So, in conclusion, an executable jar can be run like any other executable.
how is it different than normal jar files and how they are auto executed?
A normal jar file has to be run with java -jar.
From Spring Docs
The Maven build of a Spring Boot application first builds your own application and packs it into a JAR file.
In the second stage (repackage) it wraps that jar, together with all the jar files from the dependency tree, into a new wrapper jar archive. It also generates a manifest file (also in the wrapper jar) that defines the application's main class.
After mvn package you can see two jar files in your target directory: the original file and the wrapped jar file.
You can start a Springboot application with a simple command like:
java -jar my-springboot-app.jar
I would suggest that "auto executable" means that you supplied a main method so that the jar can be launched with java -jar; otherwise it may be just a Java library.
Here is a quote from https://docs.spring.io/spring-boot/docs/current/maven-plugin/repackage-mojo.html
Repackages existing JAR and WAR archives so that they can be executed from the command line using java -jar. With layout=NONE can also be used simply to package a JAR with nested dependencies (and no main class, so not executable).
Executable jar - the one that has main class declared in manifest and can be run with java -jar yourJarFile.jar command
Other jars - jars without a declared main class. They can be anything - application, library, etc. You can still run an application by providing a fully.qualified.class.name as the entry point, like java -cp yourJarFile.jar my.bootstrap.BootstrapClass
Autoexecutable jars - never heard about it :)
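The distinction drawn in these answers (a jar is launchable with java -jar only if its manifest declares a main class) can be checked with the standard java.util.jar API. This is a minimal sketch under that assumption; the class name MainClassCheck and the helper writeJar are hypothetical and only exist to make the example self-contained.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public final class MainClassCheck {

    /** Returns the Main-Class manifest attribute, or null if the jar declares none. */
    public static String mainClassOf(Path jar) {
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Manifest manifest = jarFile.getManifest();
            return manifest == null
                    ? null
                    : manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: writes a jar containing only a manifest, optionally naming a main class. */
    public static Path writeJar(String mainClass) {
        try {
            Manifest manifest = new Manifest();
            // Manifest-Version must be present or the attributes are not written.
            manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            if (mainClass != null) {
                manifest.getMainAttributes().put(Attributes.Name.MAIN_CLASS, mainClass);
            }
            Path jar = Files.createTempFile("demo", ".jar");
            try (OutputStream out = Files.newOutputStream(jar);
                 JarOutputStream jos = new JarOutputStream(out, manifest)) {
                // No entries needed; the constructor writes the manifest.
            }
            return jar;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mainClassOf(writeJar("my.bootstrap.BootstrapClass")));
        System.out.println(mainClassOf(writeJar(null)));
    }
}
```

A jar for which mainClassOf returns null can still be started with java -cp and an explicit class name, as described above.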

Take logback.xml to outside of the jar

I am using logback with slf4j in my Maven Java project. Currently logback config file (logback.xml) is in src -> main -> resources folder. And it is working fine.
My issue is, I need to give my client the ability to configure logging as he prefers. For that logback.xml should be outside the jar when I build it. But as xml is inside src folder it is inside the jar and no one can change it after build.
How to achieve this?
Specifying the location of the default configuration file as a system property
You may specify the location of the default configuration file with a system property named "logback.configurationFile". The value of this property can be a URL, a resource on the class path or a path to a file external to the application.
java -Dlogback.configurationFile=/path/to/config.xml -jar myapp.jar
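The three accepted forms of the property value (URL, classpath resource, external file path) can be illustrated with a small resolver. This is only a sketch of the idea using the JDK, not Logback's actual code; the class name ConfigLocationResolver and the order of the checks are assumptions made for illustration.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

public final class ConfigLocationResolver {

    /**
     * Resolves a configuration location given as a URL, a classpath
     * resource name, or a file system path. Returns null if none matches.
     */
    public static URL resolve(String location) {
        // 1. A well-formed URL such as file:///etc/myapp/logback.xml
        try {
            return new URL(location);
        } catch (MalformedURLException notAUrl) {
            // Not a URL; try the other forms.
        }
        // 2. A resource on the class path
        URL onClasspath = ConfigLocationResolver.class.getClassLoader().getResource(location);
        if (onClasspath != null) {
            return onClasspath;
        }
        // 3. A plain file outside the application
        File file = new File(location);
        if (file.isFile()) {
            try {
                return file.toURI().toURL();
            } catch (MalformedURLException e) {
                return null;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String location = args.length > 0 ? args[0] : "logback.xml";
        System.out.println(location + " -> " + resolve(location));
    }
}
```

For the client-editable setup asked about here, the third form is the relevant one: the file lives next to the jar and is never packaged into it.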
From the official docs
Logback config file location can be specified in application.properties or application.yml.
application.yml
logging:
  config: logback-spring.xml
This allows you to place the jar and logback-spring.xml in the same folder.
Please note that the logback-spring.xml file in your project folder should not be included in your jar. This can be configured in build.gradle or pom.xml.
build.gradle
bootJar {
    archiveName 'your-project.jar'
    exclude("*.xml")
}
The logback.xml file needs to be on the classpath, but it doesn't need to be inside any specific jar. The details of how you want to do this depend on the exact deployment mechanism that's being used: how does whatever's starting this application set the classpath? Whatever that mechanism is, you should be able to configure it to include the location where you're putting the logback.xml file, and then just don't include it in src/main/resources, so that it isn't embedded in the jar file.
Depending on the complexity of what you're going for, you may find the maven-assembly-plugin useful for creating your distribution of dependencies.
Using Scala SBT (1.2.1) on Windows:
Batch file:
@cd %~dp0
@set JAVA_OPTS=-Dlogback.configurationFile=logback.xml
@sbt clean run
worked for me (strange ...)

Spring Boot Executable Jar with Classpath

I am building a software system to interact with an enterprise software system, using Spring Boot. My system depends on some jars and *.ini files from that enterprise system, so I cannot pack all dependencies in Maven. I would like to be able to run Spring Boot as Executable Jar with embedded Tomcat. I would also like to be able to set the classpath via the command line. So something like:
java -classpath /home/sleeper/thirdparty/lib -jar MyApp.jar
However, -classpath and -jar cannot co-exist. I have tried "-Dloader.path". It was able to load all the jar files under the folder, but not other things, like *.ini files in the folder.
So is there a way to make -classpath work with a Spring executable jar with embedded Tomcat?
If you just want to add external libraries, you can use the loader.path property.
java -Dloader.path="your-lib/" -jar your-app.jar
UPDATE
If you also need to read additional files from the classpath you have to create/change the manifest file of your application.
Let's assume that you are initializing your Spring Boot context from the class de.app.Application. Your MANIFEST.MF should look as follows:
Manifest-Version: 1.0
Main-Class: de.app.Application
Class-Path: your-lib/
And then you can simply start your app with java -Dloader.path="your-lib/" -jar MyApp.jar.
For more information about the MANIFEST.MF please see Working with Manifest Files: The Basics.
On Linux:
java -cp MyApp.jar:/home/sleeper/thirdparty/lib -Dloader.main=myMainApplicationClass org.springframework.boot.loader.PropertiesLauncher
On Windows:
java -cp MyApp.jar;/home/sleeper/thirdparty/lib -Dloader.main=myMainApplicationClass org.springframework.boot.loader.PropertiesLauncher
This will avoid messing with the manifest or the Spring Boot Maven plugin configuration as in the other answers. It will launch your app with the PropertiesLauncher, which allows you to specify the main class in loader.main.
As mentioned earlier, for some reason if you use PropertiesLauncher with loader.path, it will not add resource files to the classpath. This works around the issue by using -cp instead of -jar.
EDIT
As mentioned by Pianosaurus in the comment, use ":" instead of ";" as separator in the classpath on Linux
You mentioned that you needed to load *.ini files from an external folder. I had to do something similar, load CSV files from an external folder.
My file structure looked like this
./myapp.jar
./config/file.csv
I was using the ResourceLoader to load the files as:
Resource res = resourceLoader.getResource("classpath:file.csv");
File csvFile = res.getFile();
Start script:
java -Dloader.path="config" -jar your-app.jar
The resource was not loading from the "config" folder as expected. After some research I found out that I had to change my Maven plugin configuration to use ZIP layout.
<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
  <configuration>
    <layout>ZIP</layout>
  </configuration>
</plugin>
This will direct Spring Boot to use PropertiesLauncher, which allows loading external resources from "loader.path".
See this excellent article for more detail.
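Why a directory entry matters for files such as *.ini or *.csv can be shown with plain JDK class loading: a resource in a folder is only found through the classpath if the folder itself (not just the jars inside it) is a classpath entry. The sketch below assumes nothing about Spring Boot's launchers; the class name ExternalDirOnClasspath and the helper demoConfigDir are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public final class ExternalDirOnClasspath {

    /** Looks up a resource through a class loader whose only entry is the given directory. */
    public static URL findIn(Path directory, String resourceName) {
        // toUri() on an existing directory ends with '/', which marks it as a directory entry.
        try (URLClassLoader loader =
                     new URLClassLoader(new URL[] { directory.toUri().toURL() }, null)) {
            return loader.getResource(resourceName);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Demo helper: creates a temporary "config" directory containing file.csv. */
    public static Path demoConfigDir() {
        try {
            Path dir = Files.createTempDirectory("config");
            Files.write(dir.resolve("file.csv"), "a,b\n".getBytes(StandardCharsets.UTF_8));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = demoConfigDir();
        System.out.println("file.csv    -> " + findIn(dir, "file.csv"));
        System.out.println("missing.ini -> " + findIn(dir, "missing.ini"));
    }
}
```

In this sketch the lookup succeeds for file.csv because the directory is a classpath entry, which is the same effect the answers above achieve with -cp or with a launcher that honours loader.path for resources.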
java -cp C:\jar-path\your-jar-1.2.0.jar -Dloader.main=<package-and-main-class> -Dloader.path=<external-dependency-jar-path> -Dspring.profiles.active=<profiles, e.g. default,test> org.springframework.boot.loader.PropertiesLauncher --spring.config.location=<external-properties-file-name>
If you want to set the heap size, use
java -Xms8g -Xmx8g -cp
java -cp
Differences between "java -cp" and "java -jar"?
-Dloader.main
Spring Boot’s org.springframework.boot.loader.PropertiesLauncher comes with a JVM argument to let you override the logical main-class called loader.main:
-Dloader.path
Tell the PropertiesLauncher that it should pick up any libraries found in the “lib”
org.springframework.boot.loader.PropertiesLauncher
Spring Boot’s org.springframework.boot.loader.PropertiesLauncher comes with a JVM argument to let you override the logical main-class called loader.main:
java -cp bootApp.jar -Dloader.main=org.khan.DemoApplication org.springframework.boot.loader.PropertiesLauncher
-Dspring.profiles.active
If you are using Spring profiles, you need to set the active profile first:
set SPRING_PROFILES_ACTIVE=default,test
or, on Windows, open the Run dialog, type "envi" to find the environment variables settings, and add
spring_profiles_active
default,test
--spring.config.location
If a directory is specified, that is where application.properties is searched for.
https://docs.spring.io/spring-boot/docs/current/reference/html/boot-features-external-config.html
Just to add a simple solution without PropertiesLauncher or too many arguments.
1 - Build your standard executable springboot jar (my-spring-boot-app.jar)
2 - then run it without using the -jar option and using the JarLauncher class as the main class
java -cp "/path/to/jars/*:/path/to/app/my-spring-boot-app.jar" org.springframework.boot.loader.JarLauncher
(relative paths are also perfectly valid)
that's all
The standard way to add dependencies in a Spring Boot project is to place the jar files in BOOT-INF/lib. The dependencies are then copied into the generated jar or war file, and classpath.idx is updated as well.
You can see the official documentation here
The documentation says:
Application classes should be placed in a nested BOOT-INF/classes directory. Dependencies should be placed in a nested BOOT-INF/lib directory
I already do that with external jar files and everything went OK.
One solution that worked for me was to insert the jar with the external classes into the MANIFEST.MF's Class-Path. That's because the -jar switch ignores the -classpath option and the CLASSPATH environment variable.
Procedure:
install the maven-jar-plugin into the POM;
add the lines:
<configuration>
  <archive>
    <manifestEntries>
      <Class-Path>/my/external/jar/absolute/path.jar</Class-Path>
    </manifestEntries>
  </archive>
</configuration>
Build and run with java -jar myapp.jar. Its manifest will contain the line:
Class-Path: /my/external/jar/absolute/path.jar
This way the external jar will be searched at runtime and not at compile time (it won't be copied into BOOT-INF/lib).
Sources:
post 1
post 2

How is a Maven application deployed in the real world?

I have a Java console application that until now was developed in the NetBeans IDE. When NetBeans builds the application, it creates a dist directory, builds the app into this directory as a jar archive, and copies all dependencies into dist/lib. This directory can then be copied to its final destination and run.
Now I'm trying to move this project to Maven. Everything goes OK: I can compile and package my app, and a jar is created in the target directory. I use maven-jar-plugin to set the main class in the manifest and maven-shade-plugin to package all sources into one jar file.
I would like to ask how such a Maven project is deployed in the real world. Should I take the whole target directory, copy it to the final destination and run it, as I used to do with NetBeans? What are the consequences when I don't use maven-shade-plugin - where are all the libraries defined as dependencies located? I am asking because in my test project these libraries don't exist in the target directory.
My question - I have a Java console application "A" packaged via Maven (without maven-shade-plugin) and a Linux server "S" where this application should run. Can I copy the whole target directory manually to server "S", or is there a better / more automatic way this is solved in the real world?
Simply copying over the target directory will not solve your problem. I have packaged many standalone applications using Maven and I have used Maven Assembly Plugin for it. You can create a distribution archive (zip, tar.gz) using the assembly plugin which your customer can unzip and start running.
How you structure your target application directory (release) is up to you. I usually end up with something like
bin/
conf/
lib/
log/
The bin directory contains a shell / batch script to run your program by calling your main class, setting appropriate classpath, providing relevant memory settings etc. I prefer using classworlds (which is used by Maven) to bootstrap my application and simplify writing of start scripts.
The conf directory contains configuration files for your application as well as logging configuration files for log4j etc. I add this directory to the classpath to make it easier to access configuration resources at runtime.
The lib directory contains all the dependency jars as well as the jar file for your code.
log is where your logging configuration will point to output log files.
Note that this structure is good for standalone server like applications. Also having a bin directory and run scripts allows you to add this directory to PATH on Windows / Linux to ensure you can run the application from anywhere.
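What the start script in bin/ has to compute for this layout can be sketched in a few lines: the conf directory goes on the classpath as a directory, followed by every jar in lib. This is a hypothetical helper (the class name LayoutClasspath is not from any plugin or tool mentioned here), shown in Java rather than shell only to keep it platform-neutral.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public final class LayoutClasspath {

    /** Builds "conf<sep>lib/a.jar<sep>lib/b.jar..." for an application home directory. */
    public static String build(Path appHome) {
        List<String> entries = new ArrayList<>();
        // conf first, so configuration files there are visible as classpath resources
        entries.add(appHome.resolve("conf").toString());
        Path lib = appHome.resolve("lib");
        if (Files.isDirectory(lib)) {
            try (Stream<Path> files = Files.list(lib)) {
                files.filter(p -> p.getFileName().toString().endsWith(".jar"))
                     .sorted()
                     .forEach(p -> entries.add(p.toString()));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // File.pathSeparator is ':' on Linux and ';' on Windows
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        Path home = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(build(home));
    }
}
```

The resulting string is what a script would pass to java -cp, followed by the main class name.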
If you are packaging a command line utility, a simple shaded jar may work for you. Personally, I am not the biggest fan of java -jar application.jar
The question is too broad to be answered comprehensively, but I would like to provide an example of real-world maven deployment.
There are maven plugins for all major application servers. They have defined targets for local and remote deployment. One such plugin is the jboss-as-maven plugin. You can define the deployment properties (IP, port etc.) in your .pom or directly from command line, e.g.
mvn jboss-as:deploy -Dpassword=mypassword
There is also the cargo plugin that specializes in application deployment.
