I have a jar application that works with an external properties file. The file is used so that the user can override default properties. The default and core properties are part of the build so no problem there.
Normally I would do something like one of these:
java -jar MyAwesomeApp.jar user.properties
OR
java -jar MyAwesomeApp.jar -Dmyapp.userproperties=user.properties
But with hadoop, the jar gets executed inside the hadoop framework like this
/bin/hadoop jar MyAwesomeApp.jar input output
And no matter where I put the -D I can't get the value via System.getProperty(...). The property is not set. The Hadoop documentation says that -D is a GENERIC OPTION and goes after the command. But if I put it there I get an error that -D is not a valid jar file (duh...)
I aim to keep the application as clean as possible...so I only want to pass the user configuration as a parameter as a last resort, i.e.
/bin/hadoop jar MyAwesomeApp.jar input output user.properties
I hope someone can tell me what I have to do to get the -D working :/
Hadoop is running pseudo distributed so I am actually using HDFS too...
Related
I am trying to run a simple Java application in Unix. My Java application reads a config file from a directory at run-time. I placed the files in /tmp/paddy/. I created a simple bash script to run the application.
I tried the command below, and it gives me a "no main manifest attribute, in app.jar" error:
#!/bin/bash
java -cp ".:./config/*.*" -jar "app.jar" com.test.MainClass
Then I tried the command below. This time my application runs, but it couldn't find the config file, so it throws a NullPointerException (since it couldn't load the config file):
#!/bin/bash
java -cp app.jar com.test.MainClass
What is the correct way to override the classpath with the java -cp command? I was searching over the internet but couldn't find any good answers. I don't have any issues running in Windows, only in Linux, and I am pretty new to the Linux environment.
You have four separate issues here.
-jar and -cp don't work together
If you use the -jar switch, the classpath is taken from the Class-Path manifest entry in the jar's manifest, and that is all that will happen - the -cp switch (and the CLASSPATH environment variable) are completely ignored. The solution is to fix your jarfile, which ought to have that classpath entry.
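For illustration, a manifest for the jar in question might look something like this (a sketch; the exact Class-Path entries depend on your layout, and entries are space-separated paths relative to the jar's own location):

```
Main-Class: com.test.MainClass
Class-Path: config/ lib/some-dependency.jar
```

With a manifest like that, plain `java -jar app.jar` picks up both the main class and the classpath, and no -cp switch is needed.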
That's not how bash works.
Separate from that issue, your -cp parameter is broken.
*.* in... Linux? That's late-90s DOS, mate!
It's java itself that parses that *, which is unusual: on Linux, glob expansion is normally bash's job. That doesn't work for classpaths, because bash separates the expanded names with spaces, while java needs colons (or semicolons on Windows), which is why java does it itself. The catch is that java is rather limited and only understands a single * - and bash will expand it first and mess things up unless you stop it. So there is really only one way to do this.
Single quotes.
One star.
For example:
java -cp '.:./config/*' com.test.MainClass
You don't seem to understand how classpaths work
Each individual entry in a classpath must be either:
A directory which contains classfiles.
A jar file
Note how it specifically cannot be 'a directory that contains jar files', and also cannot be 'a class file'; that is not a thing. The * is the one exception: it takes every jar file in the directory you suffixed with /* and adds each of them as a classpath entry.
So, if you write: java -cp ., that will not include app.jar. If you write java -cp './config/*', that will not include any class or config files hanging off of ./config (only jar files located there).
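To make the entry rules concrete, here is a small sketch (plain Java; the paths are hypothetical) of how a classpath string is assembled the way java expects it - individual entries joined with the platform separator, with a directory of jars covered by a single *:

```java
import java.io.File;
import java.util.List;

public class ClasspathDemo {
    // Join individual entries (directories of classfiles, jar files,
    // or dir/* wildcards) with the platform separator:
    // ':' on Linux, ';' on Windows.
    static String buildClasspath(List<String> entries) {
        return String.join(File.pathSeparator, entries);
    }

    public static void main(String[] args) {
        // Hypothetical layout: app classes in '.', all jars under ./config
        String cp = buildClasspath(List.of(".", "./config/*"));
        System.out.println(cp);
    }
}
```

On Linux this prints `.:./config/*`, i.e. exactly the string you would pass (single-quoted) to -cp.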
That's not how config files work
Including config files on the classpath is not how it's done. You can, of course, but it doesn't do anything whatsoever unless you are using ClassLoader.getResource or some other variant of getResource (those are no good, you should be using SomeClass.class.getResource or SomeClass.class.getResourceAsStream, but I digress), in which case: don't do that. Those aren't intended for config files; they are for static files (files that never change, such as, say, a 'save to cloud' icon for your Swing user interface). If you are doing that, you'd need to include ./config (and not './config/*') in your classpath, but it would be a better idea to fix your code.
Config files should be in the user's home directory - System.getProperty("user.home"). You should consider the directory that contains the jar file(s) as the place where the executables live; those are not necessarily editable by the user, and surely the point of a config file is that you can edit it. Hence using the classpath for config files is not how it is done.
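A common pattern along those lines - sketched below with hypothetical file names - is to load built-in defaults from the jar via getResourceAsStream, then overlay a user file from the home directory if one exists:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class Config {
    // Load built-in defaults packaged inside the jar, then overlay the
    // user's own settings from e.g. ~/.myapp.properties if that file exists.
    static Properties load() {
        Properties props = new Properties();
        try (InputStream defaults =
                 Config.class.getResourceAsStream("/default.properties")) {
            if (defaults != null) {
                props.load(defaults);
            }
        } catch (IOException ignored) {
            // fall through with whatever was loaded so far
        }
        Path userFile = Path.of(System.getProperty("user.home"), ".myapp.properties");
        if (Files.exists(userFile)) {
            try (InputStream in = Files.newInputStream(userFile)) {
                props.load(in); // user values override the packaged defaults
            } catch (IOException ignored) {
            }
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(load());
    }
}
```

This keeps the executable directory read-only and puts the editable file where the user can actually edit it.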
I have written a small application to parse a large XML file using SAX with Intellij.
I pass the -DentityExpansionLimit=0 option to my application by going to Run\Edit Configurations... and setting VM options.
It works perfectly when I run the application from IntelliJ, but when I create the artifact with IntelliJ it doesn't work, and I get the very error that option is supposed to prevent. Obviously the option is not passed along with the created jar file.
How should I achieve this goal?
Is there a command I can put in a batch file or something to set this option for my user? Is there a settings file I can modify to set this option for my machine? (I use Windows 10)
Usually, to pass system properties to a jar, the command is something like this:
java -DentityExpansionLimit=0 -jar thejar.jar
You are mixing up two things here:
the JVM command line command, and the fact that you can pass arguments to your application, or properties to the JVM itself
your deployment artefact (probably a JAR file)
Meaning: It seems like you want to either pass command line arguments (to some main function) or properties to your application. But the JAR file doesn't have support for that.
JAR files are just a container of class files. You can add some meta information via the manifest (such as which class to run), but that is about it. You can't magically push your IntelliJ "runtime configuration settings" into the JAR.
In other words: IntelliJ has no way of putting these values into your JAR.
When you invoke java -jar Your.jar ... then you (or some other tooling) has to add the required values to the command line.
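The split is easy to see in code. In a sketch like this (class name hypothetical), -D values arrive via System.getProperty, and everything after the jar name arrives in the main method's args:

```java
import java.util.Arrays;

public class ShowInputs {
    // Reads the system property the JVM was started with, or a fallback.
    // Only set if -DentityExpansionLimit=... came BEFORE -jar.
    static String limit() {
        return System.getProperty("entityExpansionLimit", "<not set>");
    }

    public static void main(String[] args) {
        // Invoked as: java -DentityExpansionLimit=0 -jar thejar.jar foo bar
        // -> limit() is "0", args is [foo, bar]
        System.out.println("entityExpansionLimit = " + limit());
        System.out.println("args = " + Arrays.toString(args));
    }
}
```

So the fix for the artifact is not inside the jar at all: whoever launches it (a batch file, a shortcut, a wrapper script) must supply the -D flag on the java command line.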
I have built a jar file which has a log4j.properties file in it (mvn package put it there by default from the resources directory). But when I run this jar file, I want to pass a different logging config, so I add -Dlog4j.configuration=file:{path to file}. The issue that bugs me is that the order matters here as follows:
When I run java -jar {path to jar} -Dlog4j.configuration=file:{path to file} then it reads the log file packaged in the jar.
When I run java -Dlog4j.configuration=file:{path to file} -jar {path to jar}, then it reads the config from the file I pass in the parameters.
I have a rough understanding of how classpaths work in Java, and that if I were to load several Java classes with the same name, the order would make a difference. But here I am passing a config parameter with a -D prefix, so the way I expect this to work is for some code in the log4j library to check whether -Dlog4j.configuration is set and, if so, load the config from there, otherwise try to find it on the classpath.
Any ideas on what I am missing?
If you provide anything after naming the JAR file, it is treated as an argument to your main method. For Log4j you actually have to define a system property, and that needs to be done before you specify -jar.
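This matches how such libraries resolve their configuration. Roughly - this is a simplification for illustration, not Log4j's actual code - the lookup checks the system property first and falls back to the classpath:

```java
import java.net.URL;

public class ConfigLocator {
    // Simplified version of the lookup Log4j-style libraries perform:
    // an explicit -Dlog4j.configuration=... value wins; otherwise fall
    // back to a log4j.properties found on the classpath (e.g. inside the jar).
    static String resolve() {
        String fromProperty = System.getProperty("log4j.configuration");
        if (fromProperty != null) {
            return fromProperty; // only present if -D came before -jar
        }
        URL onClasspath = ConfigLocator.class.getResource("/log4j.properties");
        return onClasspath != null ? onClasspath.toString() : null;
    }
}
```

With -D after the jar name, fromProperty is null (the JVM never saw the flag), so the packaged file on the classpath wins - exactly the behavior observed in the question.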
I am trying to externalize the properties file of my project.
Steps to run:
Created a jar file without properties file.
Run these script file from command prompt.
java -jar read-apis.jar --spring.config.location=classpath:..\config\application.properties,classpath:..\config\sql-config.properties,classpath:..\config\error-config.properties,classpath:..\config\messgae-config.properties,classpath:..\config\validation-config.properties
OR
java -cp ..\config\application.properties, -cp ..\config\sql-config.properties, -cp ..\config\error-config.properties, -cp ..\config\messgae-config.properties, -cp ..\config\validation-config.properties -jar read-apis.jar
It's not working for me, please help me.
I am basing this on the Spring Boot documentation and my own experience. From what I can tell you have a configuration directory at ../config. You can either:
Put the config directory at the location where you run the application from. If the config directory is located at . instead of .. it will be picked up without any additional parameters.
OR leave it there and use something similar to your first form like this: "spring.config.location=file:..\config\application.properties". Since it is not in the jar you will need to use "file" instead of "classpath".
Give that a try and see if it works. It looks like you are trying to put multiple files in the search list; that may work, but I am not certain. If so, the first bullet above may not be enough, since only application.properties would be searched for in the config directory. You could always add the other files using the config property, since it looks like the default paths are always searched as well.
java -Dspring.config.location=application.properties,sql-config.properties,error-config.properties -jar read-api.jar
This works out for me.
I am trying to figure out how to set a classpath that references HDFS. I cannot find any reference for this.
java -cp "how to reference to HDFS?" com.MyProgram
If I cannot reference the Hadoop file system, then I have to copy all the referenced third-party libs/jars somewhere under $HADOOP_HOME on each Hadoop machine... but I want to avoid this by putting the files in the Hadoop file system. Is this possible?
Example Hadoop command line for the program to run (my expectation is like this, maybe I am wrong):
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.0.3.jar -input inputfileDir -output outputfileDir -mapper /home/nanshi/myprog.java -reducer NONE -file /home/nanshi/myprog.java
However, within the command line above, how do I add the Java classpath? Something like:
-cp "/home/nanshi/wiki/Lucene/lib/lucene-core-3.6.0.jar:/home/nanshi/Lucene/bin"
What I suppose you are trying to do is include third-party libraries in your distributed program. There are several options.
Option 1) The easiest option I have found is to put all the jars in the $HADOOP_HOME/lib (e.g. /usr/local/hadoop-0.22.0/lib) directory on all nodes and restart your jobtracker and tasktracker.
Option 2) Use the -libjars option; the command for this is
hadoop jar -libjars comma_seperated_jars
Option 3) Include the jars in the lib directory of your jar. You will have to do that while creating your jar.
Option 4) Install all the jars on your machines and include their locations in the classpath.
Option 5) You can try by putting those jars in distributed cache.
You cannot add an HDFS path to your classpath. The java executable wouldn't be able to interpret something like:
hdfs://path/to/your/file
But adding third-party libraries to the classpath of each task needing them can be done using the -libjars option. This means you need a so-called driver class (implementing Tool) which sets up and starts your job, and you use the -libjars option on the command line when running that driver class.
The Tool, in turn, uses GenericOptionsParser to parse your command line arguments (including -libjars) and, with the help of the JobClient, does all the necessary work to ship your libs to all the machines needing them and to put them on the classpath of those machines.
Besides that, in order to run a MR job you should use the hadoop script located in the bin/ directory of your distribution.
Here is an example (using a jar containing your job and the driver class):
hadoop jar jarfilename.jar DriverClassInTheJar
-libjars comma-separated-list-of-libs <input> <output>
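Such a driver class might look roughly like this - a sketch against the classic org.apache.hadoop.util.Tool API, with the class and job names being hypothetical:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyDriver extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // By the time run() is called, ToolRunner/GenericOptionsParser has
        // already consumed generic options such as -libjars and -D, so
        // args contains only your own arguments (e.g. input/output paths).
        Job job = Job.getInstance(getConf(), "my job");
        job.setJarByClass(MyDriver.class);
        // ... set mapper/reducer classes and input/output paths from args ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        // ToolRunner wires up GenericOptionsParser so -libjars works.
        System.exit(ToolRunner.run(new Configuration(), new MyDriver(), args));
    }
}
```

The key point is routing main through ToolRunner.run: without it, the generic options (including -libjars) are never parsed and simply land in your args array.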
You can specify the jar path as
-libjars hdfs://namenode/path_to_jar
I have used this with Hive.