I am trying to use the java.nio-API for traversing a .zip-file, but I get a ProviderNotFoundException when I try to call FileSystems.newFileSystem.
I have also tried changing the zip:file: to jar:file:, but I get the same kind of exception except that the message says Provider "jar" not found.
I have also tried using FileSystems.newFileSystem(Path, null) directly without creating a URI first.
Output:
Reading zip-file: /home/pyknic/Downloads/Walking.zip
Exception in thread "main" java.nio.file.ProviderNotFoundException: Provider "zip" not found
at java.base/java.nio.file.FileSystems.newFileSystem(FileSystems.java:364)
at java.base/java.nio.file.FileSystems.newFileSystem(FileSystems.java:293)
at com.github.pyknic.zipfs.Main.main(Main.java:19)
Main.java
package com.github.pyknic.zipfs;
import java.io.IOException;
import java.net.URI;
import java.nio.file.*;
import java.util.stream.StreamSupport;
import static java.lang.String.format;
import static java.util.Collections.singletonMap;
public class Main {
public static void main(String... args) {
final Path zipFile = Paths.get(args[0]);
System.out.println("Reading zip-file: " + zipFile);
final URI uri = URI.create("zip:file:" + zipFile.toUri().getPath().replace(" ", "%20"));
try (final FileSystem fs = FileSystems.newFileSystem(uri, singletonMap("create", "true"))) {
final long entriesRead = StreamSupport.stream(fs.getRootDirectories().spliterator(), false)
.flatMap(root -> {
try {
return Files.walk(root);
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error traversing zip file system '%s', root: '%s'.",
zipFile, root), ex);
}
}).mapToLong(file -> {
try {
Files.lines(file).forEachOrdered(System.out::println);
return 1;
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error modifying DAE-file '%s' in zip file system '%s'.",
file, zipFile), ex);
}
}).sum();
System.out.format("A total of %,d entries read.%n", entriesRead);
} catch (final IOException ex) {
throw new RuntimeException(format(
"Error reading zip-file '%s'.", zipFile
), ex);
}
}
}
How do I get access to the file system of a zip-file with the Java Nio-APIs?
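One possible direction, sketched under assumptions: the JDK's built-in zip provider registers the URI scheme "jar" (there is no "zip" scheme), and on Java 9 and later it lives in the jdk.zipfs module, which has to be present in the runtime. If "jar:" fails as well, as reported above, a missing module is a plausible cause, although nothing in the question confirms it. The class and method names below are illustrative only.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipFsSketch {

    /** Lists the regular files inside a zip archive through the NIO zip file system. */
    public static List<String> listEntries(Path zipFile) throws IOException {
        // Path.toUri() already escapes spaces; the provider scheme is "jar".
        URI uri = URI.create("jar:" + zipFile.toUri());
        List<String> names = new ArrayList<>();
        try (FileSystem fs = FileSystems.newFileSystem(uri, Collections.<String, Object>emptyMap())) {
            for (Path root : fs.getRootDirectories()) {
                // Close the walk stream before the file system is closed.
                try (Stream<Path> walk = Files.walk(root)) {
                    walk.filter(Files::isRegularFile).forEach(p -> names.add(p.toString()));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny archive so the sketch runs without external files.
        Path zip = Files.createTempFile("sketch", ".zip");
        try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(zip))) {
            out.putNextEntry(new ZipEntry("hello.txt"));
            out.write("hi".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        System.out.println(listEntries(zip));
        Files.delete(zip);
    }
}
```

When the application is launched from the module path or from a custom jlink image, adding jdk.zipfs explicitly (for example with --add-modules jdk.zipfs) may be needed for the provider to be found.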
Please take a look at the code I have so far and if possible explain what I'm doing wrong. I'm trying to learn.
I made a little program to search for a type of file in a directory and all its sub-directories and copy them into another folder.
Code
import java.util.ArrayList;
import java.util.List;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
public class FandFandLoop {
public static void main(String[] args) {
final File folder = new File("C:/Users/ina/src");
List<String> result = new ArrayList<>();
search(".*\\.txt", folder, result);
File to = new File("C:/Users/ina/dest");
for (String s : result) {
System.out.println(s);
File from = new File(s);
try {
copyDir(from.toPath(), to.toPath());
System.out.println("done");
}
catch (IOException ex) {
ex.printStackTrace();
}
}
}
public static void copyDir(Path src, Path dest) throws IOException {
Files.walk(src)
.forEach(source -> {
try {
Files.copy(source, dest.resolve(src.relativize(source)),
StandardCopyOption.REPLACE_EXISTING);
} catch (IOException e) {
e.printStackTrace();
}
});
}
public static void search(final String pattern, final File folder, List<String> result) {
for (final File f : folder.listFiles()) {
if (f.isDirectory()) {
search(pattern, f, result);
}
if (f.isFile()) {
if (f.getName().matches(pattern)) {
result.add(f.getAbsolutePath());
}
}
}
}
}
It works, but what it actually does is to take my .txt files and write them into another file named dest without extension.
And at some point, it deletes the folder dest.
The deletion happens because of StandardCopyOption.REPLACE_EXISTING, if I understand this, but what I would have liked to obtain was that if several files had the same name then only one copy of it should be kept.
There is no need to call Files.walk on the matched source files.
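To illustrate why the original code appears to overwrite dest: copyDir is called with a single file as src, so Files.walk(src) yields only that file, src.relativize(source) is the empty path, and dest.resolve of an empty path is dest itself. With REPLACE_EXISTING the copy then replaces dest. A minimal sketch of a flat copy that keeps one file per name, assuming that "last one wins" is acceptable for duplicates (the class and method names are made up for this example):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatCopySketch {

    /** Copies every *.txt file found under src directly into dest, without sub-directories. */
    public static void copyFlat(Path src, Path dest) throws IOException {
        Files.createDirectories(dest);
        List<Path> matches;
        // Collect first so the directory stream is closed before copying starts.
        try (Stream<Path> files = Files.walk(src)) {
            matches = files.filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".txt"))
                    .collect(Collectors.toList());
        }
        for (Path file : matches) {
            // Resolve against the file name, so the target is always inside dest.
            // Files with the same name overwrite each other; only one copy remains.
            Files.copy(file, dest.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        copyFlat(Paths.get("C:/Users/ina/src"), Paths.get("C:/Users/ina/dest"));
    }
}
```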
You can improve this code by switching completely to using java.nio.file.Path and not mixing string paths and File objects. Additionally instead of calling File.listFiles() recursively you can use Files.walk or even better Files.find.
So you could instead use the following:
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.CopyOption;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Objects;
import java.util.function.BiPredicate;
import java.util.stream.Stream;
public class CopyFiles {
public static void copyFiles(Path src, Path dest, PathMatcher matcher, CopyOption... copyOptions) throws IOException {
// Argument validation
if (!Files.isDirectory(src)) {
throw new IllegalArgumentException("Source '" + src + "' is not a directory");
}
if (!Files.isDirectory(dest)) {
throw new IllegalArgumentException("Destination '" + dest + "' is not a directory");
}
Objects.requireNonNull(matcher);
Objects.requireNonNull(copyOptions);
BiPredicate<Path, BasicFileAttributes> filter = (path, attributes) -> attributes.isRegularFile() && matcher.matches(path);
// Use try-with-resources to close the stream as soon as it is no longer needed
try (Stream<Path> files = Files.find(src, Integer.MAX_VALUE, filter)) {
files.forEach(file -> {
Path destFile = dest.resolve(src.relativize(file));
try {
copyFile(file, destFile, copyOptions);
}
// Stream methods do not allow checked exceptions, have to wrap it
catch (IOException ioException) {
throw new UncheckedIOException(ioException);
}
});
}
// Wrap UncheckedIOException; cannot unwrap it to get actual IOException
// because then information about the location where the exception was wrapped
// will get lost, see Files.find doc
catch (UncheckedIOException uncheckedIoException) {
throw new IOException(uncheckedIoException);
}
}
private static void copyFile(Path srcFile, Path destFile, CopyOption... copyOptions) throws IOException {
Path destParent = destFile.getParent();
// Parent might be null if dest is empty path
if (destParent != null) {
// Create parent directories before copying file
Files.createDirectories(destParent);
}
Files.copy(srcFile, destFile, copyOptions);
}
public static void main(String[] args) throws IOException {
Path srcDir = Paths.get("path/to/src");
Path destDir = Paths.get("path/to/dest");
// Could also use FileSystem.getPathMatcher
PathMatcher matcher = file -> file.getFileName().toString().endsWith(".txt");
copyFiles(srcDir, destDir, matcher);
}
}
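The comment in main mentions FileSystem.getPathMatcher as an alternative. A small sketch of what that could look like follows; it assumes the glob should be applied to the file name only (a glob such as "*.txt" does not cross directory boundaries, so it would not match a full path). The helper name byFileName is invented for this example.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class GlobMatcherSketch {

    /** Builds a matcher that applies a glob such as "*.txt" to the last path element only. */
    public static PathMatcher byFileName(String glob) {
        PathMatcher nameMatcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return path -> {
            Path name = path.getFileName();
            // getFileName() is null for root paths and for the empty path.
            return name != null && nameMatcher.matches(name);
        };
    }

    public static void main(String[] args) {
        PathMatcher txt = byFileName("*.txt");
        System.out.println(txt.matches(Paths.get("some/dir/notes.txt")));
        System.out.println(txt.matches(Paths.get("some/dir/data.dat")));
    }
}
```

The resulting matcher can be passed to copyFiles in place of the lambda used above.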
I'm new to Hadoop! How can I run HDFS commands from Java code? I've successfully tested MapReduce with Java code and run HDFS commands directly from the Cloudera VM's terminal, but now I'd like to learn how to do the latter from Java code.
I've been looking for material to learn from, but I haven't found any yet.
Thanks
I think this may help you.
I use it to execute shell commands and it works well. Here is the Java example:
public class JavaRunShell {
public static void main(String[] args){
try {
String shpath=" your command";
Process ps = Runtime.getRuntime().exec(shpath);
ps.waitFor();
}
catch (Exception e) {
e.printStackTrace();
}
}
}
As mentioned by Jagrut, you can use the FileSystem API in your Java code to interact with HDFS. Below is sample code that checks whether a particular directory exists in HDFS and, if it does, removes it.
Configuration conf = new Configuration();
Job job = new Job(conf,"HDFS Connect");
FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path("/user/cloudera/hdfsPath");
if(fs.exists(outputPath))
// delete(Path) is deprecated; pass true to delete the directory recursively
fs.delete(outputPath, true);
You can also refer to given blogs for further reference -
https://dzone.com/articles/working-with-the-hadoop-file-system-api, https://hadoop.apache.org/docs/r2.8.2/api/org/apache/hadoop/fs/FileSystem.html
https://blog.knoldus.com/2017/04/16/working-with-hadoop-filesystem-api/
You can use the FileSystem API in your Java code to interact with HDFS.
You can use the FileSystem API in Java code to perform HDFS operations.
https://hadoop.apache.org/docs/r2.8.2/api/org/apache/hadoop/fs/FileSystem.html
Please find the following sample code.
package com.hadoop.FilesystemClasses;
import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.log4j.Logger;
import com.hadoop.Constants.Constants;
public class HdfsFileSystemTasks {
public static Logger logger = Logger.getLogger(HdfsFileSystemTasks.class
.getName());
public FileSystem configureFilesystem(String coreSitePath,
String hdfsSitePath) {
FileSystem fileSystem = null;
try {
Configuration conf = new Configuration();
Path hdfsCoreSitePath = new Path(coreSitePath);
Path hdfsHDFSSitePath = new Path(hdfsSitePath);
conf.addResource(hdfsCoreSitePath);
conf.addResource(hdfsHDFSSitePath);
fileSystem = FileSystem.get(conf);
return fileSystem;
} catch (Exception ex) {
ex.printStackTrace();
return fileSystem;
}
}
public String writeToHDFS(FileSystem fileSystem, String sourcePath,
String destinationPath) {
try {
Path inputPath = new Path(sourcePath);
Path outputPath = new Path(destinationPath);
fileSystem.copyFromLocalFile(inputPath, outputPath);
return Constants.SUCCESS;
} catch (IOException ex) {
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String readFileFromHdfs(FileSystem fileSystem, String hdfsStorePath,
String localSystemPath) {
try {
Path hdfsPath = new Path(hdfsStorePath);
Path localPath = new Path(localSystemPath);
fileSystem.copyToLocalFile(hdfsPath, localPath);
return Constants.SUCCESS;
} catch (IOException ex) {
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String deleteHdfsDirectory(FileSystem fileSystem,
String hdfsStorePath) {
try {
Path hdfsPath = new Path(hdfsStorePath);
if (fileSystem.exists(hdfsPath)) {
// delete(Path) is deprecated; pass true to delete the directory recursively
fileSystem.delete(hdfsPath, true);
logger.info("Directory " + hdfsPath + " deleted successfully");
} else {
logger.info("Input directory " + hdfsPath + " does not exist");
}
return Constants.SUCCESS;
} catch (Exception ex) {
System.out
.println("Some exception occurred while deleting the directory from HDFS");
ex.printStackTrace();
return Constants.FAILURE;
}
}
public String deleteLocalDirectory(FileSystem fileSystem,
String localStorePath) {
try {
Path localPath = new Path(localStorePath);
if (fileSystem.exists(localPath)) {
// delete(Path) is deprecated; pass true to delete the directory recursively
fileSystem.delete(localPath, true);
logger.info("Input directory " + localPath + " deleted successfully");
} else {
logger.info("Input directory " + localPath + " does not exist");
}
return Constants.SUCCESS;
} catch (Exception ex) {
System.out
.println("Some exception occurred while deleting the local directory");
ex.printStackTrace();
return Constants.FAILURE;
}
}
public void closeFileSystem(FileSystem fileSystem) {
try {
fileSystem.close();
} catch (Exception ex) {
ex.printStackTrace();
System.out.println("Unable to close Hadoop filesystem : " + ex);
}
}
}
package com.hadoop.FileSystemTasks;
import com.hadoop.Constants.HDFSParameters;
import com.hadoop.Constants.HdfsFilesConstants;
import com.hadoop.Constants.LocalFilesConstants;
import com.hadoop.FilesystemClasses.HdfsFileSystemTasks;
import org.apache.hadoop.fs.FileSystem;
import org.apache.log4j.Logger;
public class ExecuteFileSystemTasks {
public static Logger logger = Logger.getLogger(ExecuteFileSystemTasks.class
.getName());
public static void main(String[] args) {
HdfsFileSystemTasks hdfsFileSystemTasks = new HdfsFileSystemTasks();
FileSystem fileSystem = hdfsFileSystemTasks.configureFilesystem(
HDFSParameters.CORE_SITE_XML_PATH,
HDFSParameters.HDFS_SITE_XML_PATH);
logger.info("File System Object {} " + fileSystem);
String fileWriteStatus = hdfsFileSystemTasks.writeToHDFS(fileSystem,
LocalFilesConstants.SALES_DATA_LOCAL_PATH,
HdfsFilesConstants.HDFS_SOURCE_DATA_PATH);
logger.info("File Write Status{} " + fileWriteStatus);
String filereadStatus = hdfsFileSystemTasks.readFileFromHdfs(
fileSystem, HdfsFilesConstants.HDFS_DESTINATION_DATA_PATH
+ "/MR_Job_Res2/part-r-00000",
LocalFilesConstants.MR_RESULTS_LOCALL_PATH);
logger.info("File Read Status{} " + filereadStatus);
String deleteDirStatus = hdfsFileSystemTasks.deleteHdfsDirectory(
fileSystem, HdfsFilesConstants.HDFS_DESTINATION_DATA_PATH
+ "/MR_Job_Res2");
hdfsFileSystemTasks.closeFileSystem(fileSystem);
}
}
@HbnKing I tried running your code, but I kept getting errors. This is the error I got:
java.io.IOException: Cannot run program "your": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at jrs.main(jrs.java:5)
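The error above suggests that the placeholder string " your command" was run literally, so the system looked for a program called "your". A hedged sketch of running a real command with ProcessBuilder follows; it assumes the hdfs launcher is installed and on the PATH of the machine running the code (it would not be on a plain Windows box without Hadoop). The class and method names are illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class RunCommandSketch {

    /** Runs a command and returns its combined stdout/stderr; throws on a non-zero exit code. */
    public static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Merge stderr into stdout so one reader drains both and the process cannot block.
        builder.redirectErrorStream(true);
        Process process = builder.start();
        StringBuilder output = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                output.append(line).append(System.lineSeparator());
            }
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Command " + command + " exited with code " + exitCode);
        }
        return output.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: "hdfs" is on the PATH. Each argument is a separate list element.
        System.out.print(run(Arrays.asList("hdfs", "dfs", "-ls", "/")));
    }
}
```

Passing the arguments as a list avoids the tokenizing that Runtime.exec(String) performs on a single command string.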
EDIT: This does not seem to be possible, see https://bugs.openjdk.java.net/browse/JDK-8039910.
I have a helper class that provides a Stream<Path>. This code just wraps Files.walk and sorts the output:
public Stream<Path> getPaths(Path path) {
return Files.walk(path, FOLLOW_LINKS).sorted();
}
As symlinks are followed, in case of loops in the filesystem (e.g. a symlink x -> .) the code used in Files.walk throws an UncheckedIOException wrapping an instance of FileSystemLoopException.
In my code I would like to catch such exceptions and, for example, just log a helpful message. The resulting stream could/should just stop providing entries as soon as this happens.
I tried adding .map(this::checkException) and .peek(this::checkException) to my code, but the exception is not caught in this stage.
Path checkException(Path path) {
try {
logger.info("path.toString() {}", path.toString());
return path;
} catch (UncheckedIOException exception) {
logger.error("YEAH");
return null;
}
}
How, if at all, can I catch an UncheckedIOException in my code giving out a Stream<Path>, so that consumers of the path do not encounter this exception?
As an example, the following code should never encounter the exception:
List<Path> paths = getPaths().collect(toList());
Right now, the exception is triggered by code invoking collect (and I could catch the exception there):
java.io.UncheckedIOException: java.nio.file.FileSystemLoopException: /tmp/junit5844257414812733938/selfloop
at java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88)
at java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104)
at java.util.Iterator.forEachRemaining(Iterator.java:115)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at ...
EDIT: I provided a simple JUnit test class. In this question I ask you to fix the test by just modifying the code in provideStream.
package somewhere;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import static java.nio.file.FileVisitOption.FOLLOW_LINKS;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.is;
import static org.hamcrest.Matchers.nullValue;
import static org.hamcrest.core.IsNot.not;
import static org.junit.Assert.fail;
public class StreamTest {
@Rule
public TemporaryFolder temporaryFolder = new TemporaryFolder();
@Test
public void test() throws Exception {
Path rootPath = Paths.get(temporaryFolder.getRoot().getPath());
createSelfloop();
Stream<Path> stream = provideStream(rootPath);
assertThat(stream.collect(Collectors.toList()), is(not(nullValue())));
}
private Stream<Path> provideStream(Path rootPath) throws IOException {
return Files.walk(rootPath, FOLLOW_LINKS).sorted();
}
private void createSelfloop() throws IOException {
String root = temporaryFolder.getRoot().getPath();
try {
Path symlink = Paths.get(root, "selfloop");
Path target = Paths.get(root);
Files.createSymbolicLink(symlink, target);
} catch (UnsupportedOperationException x) {
// Some file systems do not support symbolic links
fail();
}
}
}
You can make your own walking stream factory:
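import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.BiConsumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;
public class FileTree {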
public static Stream<Path> walk(Path p) {
Stream<Path> s=Stream.of(p);
if(Files.isDirectory(p)) try {
DirectoryStream<Path> ds = Files.newDirectoryStream(p);
s=Stream.concat(s, StreamSupport.stream(ds.spliterator(), false)
.flatMap(FileTree::walk)
.onClose(()->{ try { ds.close(); } catch(IOException ex) {} }));
} catch(IOException ex) {}
return s;
}
// in case you don’t want to ignore exceptions silently
public static Stream<Path> walk(Path p, BiConsumer<Path,IOException> handler) {
Stream<Path> s=Stream.of(p);
if(Files.isDirectory(p)) try {
DirectoryStream<Path> ds = Files.newDirectoryStream(p);
s=Stream.concat(s, StreamSupport.stream(ds.spliterator(), false)
.flatMap(sub -> walk(sub, handler))
.onClose(()->{ try { ds.close(); }
catch(IOException ex) { handler.accept(p, ex); } }));
} catch(IOException ex) { handler.accept(p, ex); }
return s;
}
// and with depth limit
public static Stream<Path> walk(
Path p, int maxDepth, BiConsumer<Path,IOException> handler) {
Stream<Path> s=Stream.of(p);
if(maxDepth>0 && Files.isDirectory(p)) try {
DirectoryStream<Path> ds = Files.newDirectoryStream(p);
s=Stream.concat(s, StreamSupport.stream(ds.spliterator(), false)
.flatMap(sub -> walk(sub, maxDepth-1, handler))
.onClose(()->{ try { ds.close(); }
catch(IOException ex) { handler.accept(p, ex); } }));
} catch(IOException ex) { handler.accept(p, ex); }
return s;
}
}
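As an alternative sketch (not the approach of the answer above): Files.walkFileTree reports a detected cycle to visitFileFailed with a FileSystemLoopException instead of throwing, so the paths can be collected eagerly and streamed afterwards. This assumes that materializing the whole tree in a list is acceptable, which gives up the laziness of Files.walk. The class name and the onError callback are hypothetical.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.EnumSet;
import java.util.List;
import java.util.function.BiConsumer;

public class TolerantWalkSketch {

    /** Collects all reachable paths, following links; failures go to onError instead of aborting. */
    public static List<Path> collectPaths(Path root, BiConsumer<Path, IOException> onError)
            throws IOException {
        List<Path> found = new ArrayList<>();
        Files.walkFileTree(root, EnumSet.of(FileVisitOption.FOLLOW_LINKS), Integer.MAX_VALUE,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                        found.add(dir);
                        return FileVisitResult.CONTINUE;
                    }

                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        found.add(file);
                        return FileVisitResult.CONTINUE;
                    }

                    @Override
                    public FileVisitResult visitFileFailed(Path file, IOException exc) {
                        // Symlink loops arrive here as FileSystemLoopException.
                        onError.accept(file, exc);
                        return FileVisitResult.CONTINUE;
                    }
                });
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        collectPaths(root, (path, ex) -> System.err.println("Skipped " + path + ": " + ex))
                .forEach(System.out::println);
    }
}
```

A caller that needs a Stream<Path> can call stream() on the returned list; consumers of that stream never see the UncheckedIOException.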
I'm trying to find the creation time of a .class file inside a jar.
But when I use this piece of code, I get the jar's creation time instead of the .class file's creation time.
URL url = TestMain.class.getResource("/com/oracle/determinations/types/CommonBuildTime.class");
url.getPath();
try {
System.out.println(" Time modified :: "+ new Date(url.openConnection().getLastModified()));
} catch (IOException e) {
e.printStackTrace();
}
But when I open the jar I can see the .class creation time is different from that of the jar creation time.
Could you please try the following solution:
import java.io.IOException;
import java.util.Date;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
public class Test {
public static void main(String[] args) throws IOException {
String classFilePath = "/com/mysql/jdbc/AuthenticationPlugin.class";
String jarFilePath = "D:/jars/mysql-connector-java-5.1.34.jar";
Test test=new Test();
Date date = test.getLastUpdatedTime(jarFilePath, classFilePath);
System.out.println("getLastModificationDate returned: " + date);
}
/**
* Returns last update time of a class file inside a jar file
* @param jarFilePath - path of jar file
* @param classFilePath - path of class file inside the jar file with leading slash
* @return the entry's last modification time, or null if it is unknown or the entry is not found
*/
public Date getLastUpdatedTime(String jarFilePath, String classFilePath) {
JarFile jar = null;
try {
jar = new JarFile(jarFilePath);
Enumeration<JarEntry> enumEntries = jar.entries();
while (enumEntries.hasMoreElements()) {
JarEntry file = (JarEntry) enumEntries.nextElement();
if (file.getName().equals(classFilePath.substring(1))) {
long time=file.getTime();
return time==-1?null: new Date(time);
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (jar != null) {
try {
jar.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return null;
}
}
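A shorter variant of the same idea, as a sketch: JarFile.getJarEntry looks an entry up by name directly, so the enumeration loop is not strictly needed, and try-with-resources closes the jar. It assumes the entry name is given without a leading slash; the class and method names are illustrative.

```java
import java.io.IOException;
import java.util.Date;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class JarEntryTimeSketch {

    /**
     * Returns the recorded modification time of one entry, or null if the
     * entry does not exist or carries no timestamp.
     */
    public static Date entryTime(String jarFilePath, String entryName) throws IOException {
        try (JarFile jar = new JarFile(jarFilePath)) {
            // Entry names inside a jar have no leading slash, e.g. "com/example/A.class".
            JarEntry entry = jar.getJarEntry(entryName);
            if (entry == null) {
                return null;
            }
            long time = entry.getTime();
            return time == -1 ? null : new Date(time);
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JarEntryTimeSketch <path-to-jar> <entry-name>
        System.out.println(entryTime(args[0], args[1]));
    }
}
```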
I have to write a Java program that runs automatically in the background and watches for a new .dat file; when it sees one, it should run a .bat file that loads the data into a database. So far I have a program that watches for file creation, modification, and deletion. I also have a script that runs the .bat file and loads the data into the database. Now I just need to connect the two, but I am not sure how to go about it. If someone could point me in the right direction, I would greatly appreciate it.
Below is the code I have so far.
import static java.nio.file.LinkOption.NOFOLLOW_LINKS;
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.OVERFLOW;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;
import java.io.*;
import java.util.*;
import java.io.File;
import java.io.IOException;
import java.nio.file.FileSystem;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchEvent.Kind;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
public class Order_Processing {
public static void watchDirectoryPath(Path path)
{
try {
Boolean isFolder = (Boolean) Files.getAttribute(path,
"basic:isDirectory", NOFOLLOW_LINKS);
if (!isFolder)
{
throw new IllegalArgumentException("Path: " + path
+ " is not a folder");
}
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
System.out.println("Watching path: "+ path);
FileSystem fs = path.getFileSystem();
try (WatchService service = fs.newWatchService())
{
path.register(service, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);
WatchKey key = null;
while (true)
{
key = service.take();
Kind<?> kind = null;
for (WatchEvent<?> watchEvent : key.pollEvents())
{
kind = watchEvent.kind();
if (OVERFLOW == kind)
{
continue;
}
else if (ENTRY_CREATE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New Path Created: " + newPath);
}
else if (ENTRY_MODIFY == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New path modified: "+ newPath);
}
else if (ENTRY_DELETE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent)
.context();
System.out.println("New path deleted: "+ newPath);
}
}
if (!key.reset())
{
break;
}
}
}
catch (IOException ioe)
{
ioe.printStackTrace();
}
catch (InterruptedException ie)
{
ie.printStackTrace();
}
}
public static void main(String[] args)
throws FileNotFoundException
{
File dir = new File("C:\\Paradigm");
watchDirectoryPath(dir.toPath());
//below is the script that runs the .bat file and it works if by itself
//with out all the other watch code.
try {
String[] command = {"cmd.exe", "/C", "Start", "C:\\Try.bat"};
Process p = Runtime.getRuntime().exec(command);
}
catch (IOException ex) {
}
}
}
This doesn't work because you have a while (true) loop. The loop makes sense, because you are listening and want that to happen continuously; however, the bat call will never be executed because watchDirectoryPath(...) never returns. To solve this, pull the rest of main out into its own function, like so:
public static void executeBat() {
try {
String[] command = {"cmd.exe", "/C", "Start", "C:\\Try.bat"};
Process p = Runtime.getRuntime().exec(command);
}
catch (IOException ex) {
// You should do something with this.
// DON'T JUST IGNORE FAILURES
ex.printStackTrace();
}
}
so that upon file creation, you can call that bat script
...
else if (ENTRY_CREATE == kind)
{
Path newPath = ((WatchEvent<Path>) watchEvent).context();
executeBat();
}
...
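Putting the two pieces together could look roughly like the sketch below. It assumes that only newly created files whose name ends in ".dat" should trigger the script, and it takes the action as a Runnable so the watch loop stays independent of the .bat call. The class name and the shouldTrigger helper are invented for this example; the directory and script paths are the ones from the question.

```java
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;

import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

public class DatWatcherSketch {

    /** Decides whether an event should start the load: a created entry named *.dat. */
    public static boolean shouldTrigger(WatchEvent.Kind<?> kind, Path fileName) {
        return kind == ENTRY_CREATE
                && fileName != null
                && fileName.toString().endsWith(".dat");
    }

    /** Blocks and runs the action for each new .dat file until the watch key becomes invalid. */
    public static void watch(Path dir, Runnable action) throws IOException, InterruptedException {
        try (WatchService service = dir.getFileSystem().newWatchService()) {
            dir.register(service, ENTRY_CREATE);
            WatchKey key;
            do {
                key = service.take();
                for (WatchEvent<?> event : key.pollEvents()) {
                    // OVERFLOW events carry no Path context and are skipped here.
                    Object context = event.context();
                    if (context instanceof Path && shouldTrigger(event.kind(), (Path) context)) {
                        action.run();
                    }
                }
            } while (key.reset());
        }
    }

    public static void main(String[] args) throws Exception {
        watch(Paths.get("C:\\Paradigm"), () -> {
            try {
                new ProcessBuilder("cmd.exe", "/C", "Start", "C:\\Try.bat").start();
            } catch (IOException ex) {
                ex.printStackTrace();
            }
        });
    }
}
```

Note that ENTRY_CREATE can fire before the writer has finished the file, so the script may need to tolerate a partially written .dat file.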