How to run Hadoop HDFS command from java code - java

I'm new to Hadoop. How can I run HDFS commands from Java code? I have successfully tested MapReduce from Java code, and HDFS commands directly from the Cloudera VM's terminal, but now I'd like to learn how to run those commands from Java code.
I've been looking for learning materials but haven't found any yet.
Thanks

I think this may help you.
I use it to execute shell commands and it works well. Here is the Java example:
public class JavaRunShell {
    public static void main(String[] args) {
        try {
            // Placeholder: replace with the actual command, e.g. "hdfs dfs -ls /"
            String shpath = "your command";
            Process ps = Runtime.getRuntime().exec(shpath);
            ps.waitFor();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

As mentioned by Jagrut, you can use the FileSystem API in your Java code to interact with HDFS. Below is sample code that checks whether a particular directory exists in HDFS and, if it does, removes it.
Configuration conf = new Configuration();
// No MapReduce Job object is needed just to talk to HDFS.
FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path("/user/cloudera/hdfsPath");
if (fs.exists(outputPath)) {
    // true = delete recursively; the one-argument delete(Path) is deprecated
    fs.delete(outputPath, true);
}
You can also refer to these pages for further reference:
https://dzone.com/articles/working-with-the-hadoop-file-system-api
https://hadoop.apache.org/docs/r2.8.2/api/org/apache/hadoop/fs/FileSystem.html
https://blog.knoldus.com/2017/04/16/working-with-hadoop-filesystem-api/

You can use the FileSystem API in your Java code to interact with HDFS.

You can use the FileSystem API in Java code to perform HDFS operations.
https://hadoop.apache.org/docs/r2.8.2/api/org/apache/hadoop/fs/FileSystem.html
Please find sample code below.
package com.hadoop.FilesystemClasses;
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.log4j.Logger;
import com.hadoop.Constants.Constants;
public class HdfsFileSystemTasks {
public static Logger logger = Logger.getLogger(HdfsFileSystemTasks.class
.getName());
public FileSystem configureFilesystem(String coreSitePath,
String hdfsSitePath) {
FileSystem fileSystem = null;
try {
Configuration conf = new Configuration();
Path hdfsCoreSitePath = new Path(coreSitePath);
Path hdfsHDFSSitePath = new Path(hdfsSitePath);
conf.addResource(hdfsCoreSitePath);
conf.addResource(hdfsHDFSSitePath);
fileSystem = FileSystem.get(conf);
return fileSystem;
} catch (Exception ex) {
ex.printStackTrace();
return fileSystem;
}
}
public String writeToHDFS(FileSystem fileSystem, String sourcePath,
String destinationPath) {
try {
Path inputPath = new Path(sourcePath);
Path outputPath = new Path(destinationPath);
fileSystem.copyFromLocalFile(inputPath, outputPath);
return Constants.SUCCESS;
} catch (IOException ex) {
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String readFileFromHdfs(FileSystem fileSystem, String hdfsStorePath,
String localSystemPath) {
try {
Path hdfsPath = new Path(hdfsStorePath);
Path localPath = new Path(localSystemPath);
fileSystem.copyToLocalFile(hdfsPath, localPath);
return Constants.SUCCESS;
} catch (IOException ex) {
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String deleteHdfsDirectory(FileSystem fileSystem,
String hdfsStorePath) {
try {
Path hdfsPath = new Path(hdfsStorePath);
if (fileSystem.exists(hdfsPath)) {
// true = delete recursively; the one-argument delete(Path) is deprecated
fileSystem.delete(hdfsPath, true);
logger.info("Directory " + hdfsPath + " deleted successfully");
} else {
logger.info("Input directory " + hdfsPath + " does not exist");
}
return Constants.SUCCESS;
} catch (Exception ex) {
System.out
.println("Some exception occurred while deleting the HDFS directory");
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String deleteLocalDirectory(FileSystem fileSystem,
String localStorePath) {
try {
Path localPath = new Path(localStorePath);
if (fileSystem.exists(localPath)) {
// true = delete recursively; the one-argument delete(Path) is deprecated
fileSystem.delete(localPath, true);
logger.info("Input directory " + localPath + " deleted successfully");
} else {
logger.info("Input directory " + localPath + " does not exist");
}
return Constants.SUCCESS;
} catch (Exception ex) {
System.out
.println("Some exception occurred while deleting the local directory");
ex.printStackTrace();
return Constants.FAILURE;
}
}
public void closeFileSystem(FileSystem fileSystem) {
try {
fileSystem.close();
} catch (Exception ex) {
ex.printStackTrace();
System.out.println("Unable to close Hadoop filesystem : " + ex);
}
}
}
package com.hadoop.FileSystemTasks;
import com.hadoop.Constants.HDFSParameters;
import com.hadoop.Constants.HdfsFilesConstants;
import com.hadoop.Constants.LocalFilesConstants;
import com.hadoop.FilesystemClasses.HdfsFileSystemTasks;
import org.apache.hadoop.fs.FileSystem;
import org.apache.log4j.Logger;
public class ExecuteFileSystemTasks {
public static Logger logger = Logger.getLogger(ExecuteFileSystemTasks.class
.getName());
public static void main(String[] args) {
HdfsFileSystemTasks hdfsFileSystemTasks = new HdfsFileSystemTasks();
FileSystem fileSystem = hdfsFileSystemTasks.configureFilesystem(
HDFSParameters.CORE_SITE_XML_PATH,
HDFSParameters.HDFS_SITE_XML_PATH);
logger.info("File System Object: " + fileSystem);
String fileWriteStatus = hdfsFileSystemTasks.writeToHDFS(fileSystem,
LocalFilesConstants.SALES_DATA_LOCAL_PATH,
HdfsFilesConstants.HDFS_SOURCE_DATA_PATH);
logger.info("File Write Status: " + fileWriteStatus);
String filereadStatus = hdfsFileSystemTasks.readFileFromHdfs(
fileSystem, HdfsFilesConstants.HDFS_DESTINATION_DATA_PATH
+ "/MR_Job_Res2/part-r-00000",
LocalFilesConstants.MR_RESULTS_LOCALL_PATH);
logger.info("File Read Status: " + filereadStatus);
String deleteDirStatus = hdfsFileSystemTasks.deleteHdfsDirectory(
fileSystem, HdfsFilesConstants.HDFS_DESTINATION_DATA_PATH
+ "/MR_Job_Res2");
hdfsFileSystemTasks.closeFileSystem(fileSystem);
}
}

@HbnKing I tried running your code but I kept getting errors. This is the error I got:
java.io.IOException: Cannot run program "your": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at jrs.main(jrs.java:5)
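The error above is what Runtime.exec reports when the placeholder string is passed unchanged: the string is split on whitespace and the JVM tries to launch a program named "your". A minimal sketch of running a real command with ProcessBuilder and capturing its output might look like the following. The class name CommandRunner and the example command hdfs dfs -ls / are illustrative assumptions, and the hdfs client must be on the PATH of the machine running the JVM.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Hadoop): runs an external command and collects its output.
class CommandRunner {

    /** Exit code and combined stdout/stderr lines of a finished process. */
    static final class Result {
        final int exitCode;
        final List<String> lines;

        Result(int exitCode, List<String> lines) {
            this.exitCode = exitCode;
            this.lines = lines;
        }
    }

    static Result run(String... command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // merge stderr into stdout so one reader drains both
        Process process = pb.start();

        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return new Result(process.waitFor(), lines);
    }

    public static void main(String[] args) throws Exception {
        // Assumed example command; each argument is passed as a separate element.
        Result result = run("hdfs", "dfs", "-ls", "/");
        result.lines.forEach(System.out::println);
        System.out.println("exit code: " + result.exitCode);
    }
}
```

Passing the command as separate arguments avoids the whitespace splitting that Runtime.exec(String) performs, and reading the output before waitFor() keeps the child process from blocking on a full pipe.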

Related

java.nio.file.ProviderNotFoundException: Provider "zip" not found

I am trying to use the java.nio API to traverse a .zip file, but I get a ProviderNotFoundException when I call FileSystems.newFileSystem.
I have also tried changing zip:file: to jar:file:, but I get the same kind of exception, except that the message says Provider "jar" not found.
I have also tried calling FileSystems.newFileSystem(Path, null) directly, without creating a URI first.
Output:
Reading zip-file: /home/pyknic/Downloads/Walking.zip
Exception in thread "main" java.nio.file.ProviderNotFoundException: Provider "zip" not found
at java.base/java.nio.file.FileSystems.newFileSystem(FileSystems.java:364)
at java.base/java.nio.file.FileSystems.newFileSystem(FileSystems.java:293)
at com.github.pyknic.zipfs.Main.main(Main.java:19)
Main.java
package com.github.pyknic.zipfs;
import java.io.IOException;
import java.net.URI;
import java.nio.file.*;
import java.util.stream.StreamSupport;
import static java.lang.String.format;
import static java.util.Collections.singletonMap;
public class Main {
public static void main(String... args) {
final Path zipFile = Paths.get(args[0]);
System.out.println("Reading zip-file: " + zipFile);
final URI uri = URI.create("zip:file:" + zipFile.toUri().getPath().replace(" ", "%20"));
try (final FileSystem fs = FileSystems.newFileSystem(uri, singletonMap("create", "true"))) {
final long entriesRead = StreamSupport.stream(fs.getRootDirectories().spliterator(), false)
.flatMap(root -> {
try {
return Files.walk(root);
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error traversing zip file system '%s', root: '%s'.",
zipFile, root), ex);
}
}).mapToLong(file -> {
try {
Files.lines(file).forEachOrdered(System.out::println);
return 1;
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error modifying DAE-file '%s' in zip file system '%s'.",
file, zipFile), ex);
}
}).sum();
System.out.format("A total of %,d entries read.%n", entriesRead);
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error reading zip-file '%s'.", zipFile
), ex);
}
}
}
How do I get access to the file system of a zip-file with the Java Nio-APIs?

Local jar is included in WEB-INF/lib, but ClassNotFoundException is still thrown

This is a web app, and everything is packaged inside a WAR.
I was able to install a local jar called remote_proxy-1.0.0.jar into my Maven project (it is included in the WEB-INF/lib directory of the produced artifact). This jar contains the interfaces Task and TaskRunner, and the TaskRunner implementation class TaskRunnerRemoteObject. Everything compiles, and the app can be deployed. Below is the bootstrapping code used to start the RMI server:
package rmi;
import java.io.*;
import java.nio.file.*;
import java.util.logging.*;
import static java.util.stream.Collectors.joining;
import java.util.stream.Stream;
import javax.enterprise.concurrent.ManagedExecutorService;
public class ProcessInit {
public static Process startRMIServer(ManagedExecutorService pool, String WEBINF, int port, String jar) {
ProcessBuilder pb = new ProcessBuilder();
Path wd = Paths.get(WEBINF);
pb.directory(wd.resolve("classes").toFile());
Path lib = wd.resolve("lib");
String cp = Stream.of("javabuilder.jar", "remote_proxy.jar", jar)
.map(e -> lib.resolve(e).toString())
.collect(joining(File.pathSeparator));
pb.command("java", "-cp", "." + File.pathSeparator + cp, "rmi.BootStrap", String.valueOf(port));
while (true) {
try {
Process p = pb.start();
pool.execute(() -> flushIStream(p.getInputStream()));
pool.execute(() -> flushIStream(p.getErrorStream()));
return p;
} catch (Exception ex) {
ex.printStackTrace();
System.out.println("Retrying....");
}
}
}
private static void flushIStream(InputStream is) {
try (BufferedReader br = new BufferedReader(new InputStreamReader(is))) {
br.lines().forEach(System.out::println);
} catch (IOException ex) {
Logger.getLogger(ProcessInit.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
public class BootStrap {
public static void main(String[] args) {
int port = Integer.parseInt(args[0]);
System.out.println("Instantiating a task runner implementation on port: " + port );
try {
System.setProperty("java.rmi.server.hostname", "localhost");
TaskRunner runner = new TaskRunnerRemoteObject();
TaskRunner stub = (TaskRunner)UnicastRemoteObject.exportObject(runner, 0);
Registry reg = LocateRegistry.createRegistry(port);
reg.rebind("runner" + port, stub);
} catch (RemoteException ex) {
Logger.getLogger(BootStrap.class.getName()).log(Level.SEVERE, null, ex);
}
}
}
Inside WEB-INF/lib
When this class is executed, it throws the error below. I don't understand why: the local jar remote_proxy-1.0.0.jar is included in the classpath and I don't get any compile errors, yet I still get this exception.
Caused by: java.lang.ClassNotFoundException: remote_proxy.TaskRunnerRemoteObject

Last updated time for a class inside a jar

I'm trying to find the creation time of a .class file inside a jar.
But when I use this piece of code, I get the jar's creation time instead of the .class file's creation time.
URL url = TestMain.class.getResource("/com/oracle/determinations/types/CommonBuildTime.class");
url.getPath();
try {
System.out.println(" Time modified :: "+ new Date(url.openConnection().getLastModified()));
} catch (IOException e) {
e.printStackTrace();
}
But when I open the jar I can see that the .class creation time differs from the jar creation time.
Could you please try the following solution:
import java.io.IOException;
import java.util.Date;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
public class Test {
public static void main(String[] args) throws IOException {
String classFilePath = "/com/mysql/jdbc/AuthenticationPlugin.class";
String jarFilePath = "D:/jars/mysql-connector-java-5.1.34.jar";
Test test=new Test();
Date date = test.getLastUpdatedTime(jarFilePath, classFilePath);
System.out.println("getLastModificationDate returned: " + date);
}
/**
* Returns the last update time of a class file inside a jar file.
* @param jarFilePath path of the jar file
* @param classFilePath path of the class file inside the jar file, with a leading slash
* @return the last update time, or null if it cannot be determined
*/
public Date getLastUpdatedTime(String jarFilePath, String classFilePath) {
JarFile jar = null;
try {
jar = new JarFile(jarFilePath);
Enumeration<JarEntry> enumEntries = jar.entries();
while (enumEntries.hasMoreElements()) {
JarEntry file = (JarEntry) enumEntries.nextElement();
if (file.getName().equals(classFilePath.substring(1))) {
long time=file.getTime();
return time==-1?null: new Date(time);
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (jar != null) {
try {
jar.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return null;
}
}
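The same lookup can be written more compactly, since JarFile can fetch an entry by name and works with try-with-resources. The following is a sketch rather than a drop-in replacement; the class name JarEntryTime is made up for illustration, and the paths in main are the sample values from the answer above. Entry names inside a jar have no leading slash, so one is stripped if present.

```java
import java.io.IOException;
import java.util.Date;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Illustrative class name; only JDK classes are used.
class JarEntryTime {

    /**
     * Returns the last-modified time recorded for one entry of a jar,
     * or null if the entry is missing or carries no timestamp.
     */
    static Date lastModified(String jarFilePath, String entryName) throws IOException {
        // Entry names have no leading slash, so strip one if the caller passed it.
        String name = entryName.startsWith("/") ? entryName.substring(1) : entryName;
        try (JarFile jar = new JarFile(jarFilePath)) {
            JarEntry entry = jar.getJarEntry(name);
            if (entry == null) {
                return null;
            }
            long time = entry.getTime();
            return time == -1 ? null : new Date(time);
        }
    }

    public static void main(String[] args) throws IOException {
        Date date = lastModified("D:/jars/mysql-connector-java-5.1.34.jar",
                "/com/mysql/jdbc/AuthenticationPlugin.class");
        System.out.println("Last modified: " + date);
    }
}
```

The value returned is the timestamp stored in the archive for that entry, which is typically what archive tools display next to each file.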

Is it possible to run a loop when a new file is created in a folder?

I have to write a Java program that runs in the background, watches for a new .dat file, and runs a .bat file to load the data into a database when one appears. So far I have a program that watches for file creation, modification, and deletion. I also have code that runs the .bat file and loads the data into the database. Now I just need to connect the two, but I am not sure how to go about it. If someone could point me in the right direction I would greatly appreciate it.
Below is the code I have so far.
import static java.nio.file.LinkOption.NOFOLLOW_LINKS;
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;
import java.io.*;
import java.util.*;
import java.io.File;
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchEvent.Kind;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
public class Order_Processing {
public static void watchDirectoryPath(Path path)
{
try {
Boolean isFolder = (Boolean) Files.getAttribute(path,
"basic:isDirectory", NOFOLLOW_LINKS);
if (!isFolder)
{
throw new IllegalArgumentException("Path: " + path
+ " is not a folder");
}
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
System.out.println("Watching path: "+ path);
FileSystem fs = path.getFileSystem();
try (WatchService service = fs.newWatchService())
{
path.register(service, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);
WatchKey key = null;
while (true)
{
key = service.take();
Kind<?> kind = null;
for (WatchEvent<?> watchEvent : key.pollEvents())
{
kind = watchEvent.kind();
if (OVERFLOW == kind)
{
continue;
}
else if (ENTRY_CREATE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New Path Created: " + newPath);
}
else if (ENTRY_MODIFY == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New path modified: "+ newPath);
}
else if (ENTRY_DELETE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New path deleted: "+ newPath);
}
}
if (!key.reset())
{
break;
}
}
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
catch (InterruptedException ie)
{
ie.printStackTrace();
}
}
public static void main(String[] args)
throws FileNotFoundException
{
File dir = new File("C:\\Paradigm");
watchDirectoryPath(dir.toPath());
//below is the script that runs the .bat file and it works if by itself
//with out all the other watch code.
try {
String[] command = {"cmd.exe", "/C", "Start", "C:\\Try.bat"};
Process p = Runtime.getRuntime().exec(command);
}
catch (IOException ex) {
}
}
}
This doesn't work because watchDirectoryPath(...) contains a while (true) loop. The loop makes sense, because you want to keep listening continuously; however, the .bat call in main is never reached while watchDirectoryPath(...) is still running. To solve this, pull the rest of main out into its own method, like so:
public static void executeBat() {
    try {
        String[] command = {"cmd.exe", "/C", "Start", "C:\\Try.bat"};
        Process p = Runtime.getRuntime().exec(command);
    } catch (IOException ex) {
        // You should do something with this.
        // DON'T JUST IGNORE FAILURES
    }
}
so that you can call the .bat script whenever a file is created:
...
else if (ENTRY_CREATE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent).context();
executeBat();
}
...
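Since the question only cares about new .dat files, the create branch can also check the file name before launching the script. Below is a minimal sketch of that idea. The directory C:\Paradigm and the command come from the question; the class name DatFileWatcher and its method names are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.Locale;

// Illustrative class name; only JDK classes are used.
class DatFileWatcher {

    /** True if the file name ends in .dat, ignoring case. */
    static boolean isDatFile(Path file) {
        Path name = file.getFileName();
        return name != null && name.toString().toLowerCase(Locale.ROOT).endsWith(".dat");
    }

    /** Blocks, starting the given command once for every new .dat file in dir. */
    static void watch(Path dir, String... command) throws IOException, InterruptedException {
        try (WatchService service = dir.getFileSystem().newWatchService()) {
            dir.register(service, StandardWatchEventKinds.ENTRY_CREATE);
            WatchKey key;
            do {
                key = service.take(); // waits until at least one event is queued
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() != StandardWatchEventKinds.ENTRY_CREATE) {
                        continue; // OVERFLOW can arrive even though it was not registered
                    }
                    Path created = (Path) event.context();
                    if (isDatFile(created)) {
                        // inheritIO lets the script's output show up in this console
                        new ProcessBuilder(command).inheritIO().start();
                    }
                }
            } while (key.reset());
        }
    }

    public static void main(String[] args) throws Exception {
        watch(Paths.get("C:\\Paradigm"), "cmd.exe", "/C", "Start", "C:\\Try.bat");
    }
}
```

One caveat worth keeping in mind: the create event can fire before the writing program has finished the file, so the script may need to tolerate a partially written .dat file or be started after a short delay.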

Failed to move file to another directory using renameTo

import java.io.File;
import org.apache.commons.io.FilenameUtils;
public class Tester {
public static void main(String[] args) {
String rootPath = "F:\\Java\\Java_Project";
File fRoot = new File(rootPath);
File[] fsSub = fRoot.listFiles();
for (File file : fsSub) {
if(file.isDirectory()) continue;
String fileNewPath = FilenameUtils.removeExtension(file.getPath()) + "\\" + file.getName();
File fNew = new File(fileNewPath);
try {
file.renameTo(fNew);
} catch (Exception e) {
e.printStackTrace();
}
}
}
}
I am trying to move each file into a directory named after it. For instance, if the file path is
"C:\out.txt"
then I want to move it to
"C:\out\out.txt"
When I print the original path and the new path, both look correct, but the files are not actually moved.
I suggest trying Java 7 NIO.2:
Files.move(Path source, Path target, CopyOption... options)
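One likely reason the renameTo call fails here is that the target directory (C:\out in the example) does not exist yet, and renameTo only returns false in that case instead of throwing. A sketch using NIO.2 that creates the directory first could look like this; the class and method names are illustrative, and the root path is the one from the question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative class name; only JDK classes are used.
class MoveIntoOwnFolder {

    /** True if the file name has a non-empty base name followed by an extension. */
    static boolean hasExtension(Path file) {
        return file.getFileName().toString().lastIndexOf('.') > 0;
    }

    /** Moves dir/name.ext to dir/name/name.ext and returns the new location. */
    static Path moveIntoOwnFolder(Path file) throws IOException {
        if (!hasExtension(file)) {
            // Without an extension the target folder would collide with the file itself.
            throw new IllegalArgumentException("File has no extension: " + file);
        }
        String fileName = file.getFileName().toString();
        String baseName = fileName.substring(0, fileName.lastIndexOf('.'));
        Path targetDir = file.resolveSibling(baseName);
        Files.createDirectories(targetDir); // does nothing if the directory already exists
        return Files.move(file, targetDir.resolve(fileName), StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get("F:\\Java\\Java_Project");
        List<Path> files;
        // Collect the candidates first so the listing is closed before files are moved.
        try (Stream<Path> entries = Files.list(root)) {
            files = entries.filter(Files::isRegularFile)
                    .filter(MoveIntoOwnFolder::hasExtension)
                    .collect(Collectors.toList());
        }
        for (Path file : files) {
            System.out.println("Moved to " + moveIntoOwnFolder(file));
        }
    }
}
```

Unlike renameTo, Files.move throws an IOException that describes the failure, which makes problems such as a missing directory or a locked file much easier to diagnose.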
