Unable to execute hadoop fs -put command from Java

I am trying to execute hadoop fs -put <source> <destination> from Java code. The command works fine when I run it directly from the terminal, but when I run it from within Java using
String[] str = {"/usr/bin/hadoop","fs -put", source, dest};
Runtime.getRuntime().exec(str);
I get the error Error: Could not find or load main class fs. Non-Hadoop commands such as ls and mkdir work fine from Java, but the hadoop commands fail even though they work from the terminal.
What could be the reason for this, and how can I solve it?
JAVA API TRY: I also tried to use the Java API to perform the copy operation, but I get an error. The Java code is:
String source = "/home/tmpe/file1.csv";
String dest = "/user/tmpe/file1.csv";
Configuration conf = new Configuration();
conf.set("fs.defaultFS", "hdfs://node1:8020");
FileSystem fs = FileSystem.get(conf);
Path targetPath = new Path(dest);
Path sourcePath = new Path(source);
fs.copyFromLocalFile(false,true,sourcePath,targetPath);
The error which I get is:
Exception in thread "main" java.io.IOException: Mkdirs failed to create /user/tmpe
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:378)
at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:364)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:564)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:545)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:452)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:229)
at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1230)
I have already created the /user/tmpe folder and it has full read-write permissions, but the error still occurs. I have been unable to resolve the issue.

I guess you probably do not have a HADOOP_HOME environment variable set.
But since you're in Java, why on earth would you want to run hadoop fs -put in an external process when the Java API is even friendlier than the shell?
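If the external command is still wanted, the error message is also consistent with "fs -put" reaching the hadoop launcher as a single argument: Runtime.exec(String[]) passes each array element through untouched and does not split on spaces. A minimal sketch, assuming the /usr/bin/hadoop path and the file paths from the question (the class and method names are made up for illustration), gives each word its own element:

```java
import java.io.IOException;
import java.util.List;

final class HadoopPutSketch {

    // Every word the shell would separate by whitespace becomes its own element.
    static List<String> putCommand(String source, String dest) {
        return List.of("/usr/bin/hadoop", "fs", "-put", source, dest);
    }

    // Runs the command and returns its exit code, or -1 if it could not be run.
    static int run(List<String> command) {
        try {
            Process p = new ProcessBuilder(command)
                    .inheritIO()   // forward the child's stdout/stderr to this JVM's console
                    .start();
            return p.waitFor();
        } catch (IOException e) {
            return -1;             // program missing or not executable
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }

    public static void main(String[] args) {
        // Paths taken from the question.
        int exit = run(putCommand("/home/tmpe/file1.csv", "/user/tmpe/file1.csv"));
        System.out.println("hadoop exited with " + exit);
    }
}
```

Forwarding the child's output with inheritIO() also makes any message from the hadoop script itself visible, which helps when the cause turns out to be something else.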

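I came across this old post, but in case you haven't tried it already: run the program with hadoop jar app_name.jar instead of java -jar. That way, if your jar's classpath does not contain all the Hadoop jars, they are picked up from the predefined $HADOOP_CLASSPATH. The ChecksumFileSystem frames in the stack trace suggest the copy is targeting the local filesystem rather than HDFS, which would fit the Hadoop jars or configuration not being found.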

Related

Getting No such file or directory exception on Ubuntu when trying to compile a Java project from java.lang.Process

I recently switched from Windows 10 Pro v2004 to Ubuntu v20.04 for performance reasons. On Windows I could compile a Java project from another Java program by writing:
String pathToCompiler = "\"C:/Program Files/Java/jdk-14/bin/javac\"";
Process compileProcess = Runtime.getRuntime().exec(pathToCompiler+" -d bin #.sources", null, new File("ProjectPath"));
where the sources file contains the list of classes in the project.
The code above works on Windows 10.
But on Linux (Ubuntu), if I set the variable pathToCompiler to
pathToCompiler = "\"/usr/lib/jvm/java-11-openjdk-amd64/bin/javac\"";
the exception below is raised and the program running the command exits:
"/usr/lib/jvm/java-11-openjdk-amd64/bin/javac" -d bin #.sources
java.io.IOException: Cannot run program ""/usr/lib/jvm/java-11-openjdk-amd64/bin/javac"" (in directory "/home/arham/Documents/Omega Projects/Project0"): error=2, No such file or directory
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1128)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1071)
at java.base/java.lang.Runtime.exec(Runtime.java:592)
at java.base/java.lang.Runtime.exec(Runtime.java:416)
at ide.utils.systems.BuildView.lambda$3(BuildView.java:267)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.io.IOException: error=2, No such file or directory
at java.base/java.lang.ProcessImpl.forkAndExec(Native Method)
at java.base/java.lang.ProcessImpl.<init>(ProcessImpl.java:340)
at java.base/java.lang.ProcessImpl.start(ProcessImpl.java:271)
at java.base/java.lang.ProcessBuilder.start(ProcessBuilder.java:1107)
The problem is that the file actually exists, yet the error says No such file or directory.
The program that compiles the project is a Java IDE that I am creating.
Could someone please tell me how to fix this?
The Runtime.exec method has several problems that make it difficult to use, and this is one of them. Use the newer ProcessBuilder class instead.
String pathToCompiler = "C:/Program Files/Java/jdk-14/bin/javac";
Process compileProcess = new ProcessBuilder(pathToCompiler, "-d", "bin", "#.sources")
        .directory(new File("ProjectPath"))
        .start();
The differences are:
Remove the extra quotes from around the path to the executable. If quoting is needed, the system takes care of it.
Pass each command-line argument as a separate string. This way you don't have to worry about quoting.
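To see why the embedded quotes matter on Linux, here is a small sketch, assuming a Unix-like system where /bin/sh exists (the class and method names are hypothetical). The program name is handed to the operating system verbatim, so quote characters are treated as part of the file name:

```java
import java.io.IOException;

final class QuotedPathSketch {

    // Returns true if the OS could start `program`; the name is used exactly as given.
    static boolean canStart(String program) {
        try {
            Process p = new ProcessBuilder(program, "-c", "exit 0").start();
            p.waitFor();
            return true;
        } catch (IOException e) {
            return false;          // e.g. error=2, No such file or directory
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canStart("/bin/sh"));       // plain path
        System.out.println(canStart("\"/bin/sh\""));   // quotes are part of the name here
    }
}
```

The same reasoning applies to the javac path in the question: no shell is involved, so nothing removes the quotes before the lookup.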
Update the path to the following, without the embedded quotes:
String pathToCompiler = "/usr/lib/jvm/java-11-openjdk-amd64/bin/javac";

Error: Could not find or load main class in Python

I am trying to run this command in Python:
java JSHOP2.InternalDomain logistics
It works well when I run it in cmd.
I wrote this in Python:
args = ['java',
        r"-classpath",
        r".;./JSHOP2.jar;./antlr.jar",
        r"JSHOP2.InternalDomain",
        thisDir+"/logistics"
        ]
proc = subprocess.Popen(args, stdout=subprocess.PIPE)
proc.communicate()
I have the jar files in the current directory,
but I get this error:
Error: Could not find or load main class JSHOP2.InternalDomain
Does anyone know what the problem is? Can't it find the jar files?
You can't count on the current working directory always being the same when your Python code runs. Explicitly set a working directory using the cwd argument:
proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                        cwd='/directory/containing/jarfiles')
If the jar files live in thisDir, pass that:
proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                        cwd=thisDir)
Alternatively, use absolute paths in your -classpath command-line argument.

Running a YARN job from a Java program using ProcessBuilder gives a file does not exist error

I am trying to run a YARN job from a Java wrapper program. The MapReduce jar takes two inputs:
A header file: I don't know the file's name, only its location and extension, and there is only one file at that location
An input files directory
Apart from these, I have an output directory.
The ProcessBuilder code looks like:
HEADER_PATH = INPUT_DIRECTORY+"/HEADER/*.tsv";
INPUT_FILES = INPUT_DIRECTORY+"/DATA/";
OUTPUT_DIRECTORY = OUTPUT_DIRECTORY+"/";
ProcessBuilder mapRProcessBuilder = new ProcessBuilder("yarn","jar",JAR_LOCATION,"-Dmapred.job.queue.name=name","-Dmapred.reduce.tasks=500",HEADER_PATH,INPUT_DIRECTORY,OUTPUT_DIRECTORY);
System.out.println(mapRProcessBuilder.command().toString());
Process mapRProcess = mapRProcessBuilder.start();
When I run it, I get the following error:
Exception in thread "main" java.io.FileNotFoundException: Requested file /input/path/dir1/HEADER/*.tsv does not exist.
But when I run the same command from the shell as:
yarn jar jarfile.jar -Dmapred.job.queue.name=name -Dmapred.reduce.tasks=500 /input/path/dir1/HEADER/*.tsv /input/Dir /output/Dir/
it works fine.
What could be causing this when the command is run from Java?
The * is being treated as a literal character here rather than as a wildcard. Globs are expanded by the shell, and ProcessBuilder starts the program directly without one, so the pattern is never expanded to the path you want.
If there is only one file in the directory, why not look up its path and pass that as the argument instead? For example:
File dir = new File(INPUT_DIRECTORY + "/HEADER");
File[] headerFiles = dir.listFiles(); // null if dir is not a readable directory
if (headerFiles != null && headerFiles.length > 0) {
    HEADER_PATH = headerFiles[0].getAbsolutePath();
}
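When only the extension is known, the pattern can also be expanded in Java before the command is built. The following is a sketch for a local directory using java.nio; the class and method names are hypothetical. The paths in the question may well live on HDFS, which java.nio cannot see; there Hadoop's FileSystem.globStatus would play the same role.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class GlobSketch {

    // Returns the entries of `dir` whose names match `glob` (for example "*.tsv"), sorted.
    // Returns an empty list if the directory cannot be read.
    static List<Path> expand(Path dir, String glob) {
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                matches.add(p);
            }
        } catch (IOException e) {
            return Collections.emptyList();
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) {
        // Directory taken from the error message in the question, treated as a local path.
        List<Path> headers = expand(Path.of("/input/path/dir1/HEADER"), "*.tsv");
        if (headers.size() == 1) {
            System.out.println("header file: " + headers.get(0));
        } else {
            System.out.println("expected exactly one match, found " + headers.size());
        }
    }
}
```

The resolved path can then be passed to ProcessBuilder in place of the pattern.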

Unable to load dependent .so file in Linux

I am new to Linux. I am trying to load a .so file in Ubuntu using Java. The file I pass to System.load("/home/ab/Downloads/libtesseract.so") loads fine, but its dependent .so file, which sits in the same directory as libtesseract.so, is not found. The error is an UnsatisfiedLinkError saying that "liblept.so.4" could not be found.
When I place liblept.so.4 in /lib, it is loaded from there. So my understanding is that Java does not load the dependent .so; the operating system has to. I then tried a simple application after setting the PATH variable to the location of the .so file: I exported the Java code into a jar and ran the jar from the terminal, since the PATH variable is not persistent for the entire system. That worked fine. So I tried to do the same thing programmatically with the code below, but it is not working. Please advise. TIA
Code:
ProcessBuilder pb = new ProcessBuilder("/bin/sh");
Map<String, String> envMap = pb.environment();
envMap.put("LD_LIBRARY_PATH", "/home/ab/Downloads");
envMap.put("PATH", "/home/ab/Downloads");
Set<String> keys = envMap.keySet();
for (String key : keys) {
    System.out.println(key + " ==> " + envMap.get(key));
}
System.load("/home/ab/Downloads/libtesseract.so");
As far as I know, you can't modify the environment variables of the running JVM from Java. That means you should set both LD_LIBRARY_PATH and PATH before starting java.
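The map returned by ProcessBuilder.environment() only applies to child processes started from that builder; the question's code never starts one, and System.load runs in the current JVM. A small sketch of the difference, assuming a Unix-like system with /bin/sh (the class, method, and variable names are made up for illustration):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class ChildEnvSketch {

    // Starts a shell with name=value set only for that child and returns what
    // the child prints for the variable, or null if the child could not be run.
    static String valueSeenByChild(String name, String value) {
        try {
            ProcessBuilder pb = new ProcessBuilder("/bin/sh", "-c", "printf %s \"$" + name + "\"");
            pb.environment().put(name, value);   // affects the child only
            pb.redirectErrorStream(true);
            Process p = pb.start();
            String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            return out;
        } catch (IOException e) {
            return null;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }

    public static void main(String[] args) {
        String name = "SKETCH_LIBRARY_PATH";     // hypothetical variable name
        System.out.println("child sees:    " + valueSeenByChild(name, "/home/ab/Downloads"));
        System.out.println("this JVM sees: " + System.getenv(name));
    }
}
```

Since the dynamic linker reads LD_LIBRARY_PATH when a process starts, the variable has to be in place before the JVM that calls System.load is launched, for example by exporting it in the shell or by starting that JVM as a child through a builder like the one above.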

ProcessBuilder can't find perl

I'm trying to execute a perl script from java with the following code:
ProcessBuilder script =
new ProcessBuilder("/opt/alert-ssdb.pl");
Process tmp = script.start();
But when I execute it, it throws:
java.io.IOException: Cannot run program "/opt/alert-ssdb.pl": java.io.IOException: error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:488)
at scripttest.main(scripttest.java:11)
Caused by: java.io.IOException: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.<init>(UNIXProcess.java:164)
at java.lang.ProcessImpl.start(ProcessImpl.java:81)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:470)
... 1 more
About the file:
ls -l alert-ssdb.pl
-rwxr-xr-x. 1 root root alert-ssdb.pl
I tried running /usr/bin/perl/ with the script as an argument, and it failed with the same exception.
/bin/ls and other simple commands run without a problem, though.
Also, the first line of the script is #!/usr/bin/perl
and it works when run from the command line.
What am I missing?
//Update:
The big picture is that I'm trying to call the script from a Storm bolt, and it fails at that point.
I managed to make it work by defining a Python script as a bolt
using
super(python,myscript.py)
(myscript imports the storm library), and from myscript I call the Perl script.
I haven't tried it yet, but I suppose that if I modify the Perl script to be a Storm bolt, it will run nicely.
Try changing
new ProcessBuilder("/opt/alert-ssdb.pl");
to:
new ProcessBuilder("/usr/bin/perl", "/opt/alert-ssdb.pl");
I've had past experiences where not all my environment variables from the shell exist when using ProcessBuilder.
Edited to reflect @dcsohl's comment.
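When a script is launched directly, error=2 can refer either to the script itself or to the interpreter named on its #! line, so it can help to check both from inside the JVM that actually runs the code. A diagnostic sketch, with hypothetical class and method names and the paths from the question:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ScriptLaunchSketch {

    // Runs `interpreter script` and returns the combined stdout/stderr of the child.
    // If a precondition fails or the launch throws, returns a short diagnostic instead.
    static String runWithInterpreter(String interpreter, String script) {
        if (!Files.isExecutable(Path.of(interpreter))) {
            return "interpreter not executable: " + interpreter;
        }
        if (!Files.isReadable(Path.of(script))) {
            return "script not readable: " + script;
        }
        try {
            Process p = new ProcessBuilder(interpreter, script)
                    .redirectErrorStream(true)   // merge stderr into stdout
                    .start();
            String output = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
            p.waitFor();
            return output;
        } catch (IOException e) {
            return "launch failed: " + e.getMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    public static void main(String[] args) {
        // Paths taken from the question.
        System.out.print(runWithInterpreter("/usr/bin/perl", "/opt/alert-ssdb.pl"));
    }
}
```

Because the checks run in the same process that launches the script, they reflect the machine and user the code really runs as, which may differ from the interactive shell.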
