File copying in Java (cannot create file) with Hadoop

I want to copy a file from HDFS to my local computer. I have done most of the work with an input stream and an output stream, but then I ran into the following issue:
Java I/O exception: Mkdirs failed to create file
After some research, I figured out that the cause lies in my use of
FileSystem.create() (a Hadoop function)
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileSystem.html#create(org.apache.hadoop.fs.Path,%20org.apache.hadoop.util.Progressable)
The behavior is as follows:
If I set my path to a non-existent folder, the folder is created and the file I download is placed inside it.
If I set my path to an existing folder (say, the current directory), the I/O exception above occurs.
Assuming I already have the path and the input stream right, what should I use (preferably from the FileSystem library) to get around this problem?
My code:
//src and dst are the path input and output
Configuration conf = new Configuration();
FileSystem inFS = FileSystem.get(URI.create(src), conf);
FileSystem outFS = FileSystem.get(URI.create(dst), conf);
FSDataInputStream in = null;
FSDataOutputStream out = null;
in = inFS.open(new Path(src));
out = outFS.create(new Path(dst),
new Progressable() {
/*
* Print a dot whenever 64 KB of data has been written to
* the datanode pipeline.
*/
public void progress() {
System.out.print(".");
}
});

The File class has a method called
createNewFile() that creates a new file only if one does not already exist.
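One way to sidestep the existing-directory case is to append the source file name whenever the destination turns out to be a directory. The sketch below does that with plain java.nio.file on the local side, not with the Hadoop API. The names LocalCopyTarget, resolveTarget and copyTo are made up for illustration, and the InputStream stands in for whatever inFS.open(...) returns. Hadoop's FileSystem also has copyToLocalFile(Path src, Path dst), which may be worth checking before writing this by hand.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Hadoop.
class LocalCopyTarget {
    // If dst is an existing directory, place the file inside it under the
    // source file's name; otherwise treat dst as the target file itself.
    static Path resolveTarget(Path dst, String sourceFileName) {
        return Files.isDirectory(dst) ? dst.resolve(sourceFileName) : dst;
    }

    // Copies the stream to the resolved target, creating missing parent
    // directories and replacing an existing file of the same name.
    static Path copyTo(InputStream in, Path dst, String sourceFileName) throws IOException {
        Path target = resolveTarget(dst, sourceFileName);
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }
}
```

With this, passing the current directory as dst writes ./<source name> instead of trying to create a file on top of the directory.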

Related

Overwriting HDFS file/directory through Spark

Problem
I have a file saved in HDFS, and all I want to do is run my Spark application, compute a result JavaRDD, and use saveAsTextFile() to store the new "file" in HDFS.
However, Spark's saveAsTextFile() does not work if the file already exists; it does not overwrite it.
What I tried
I searched for a solution and found that one possible way to make it work is to delete the file through the HDFS API before saving the new one.
I added this code:
FileSystem hdfs = FileSystem.get(new Configuration());
Path newFolderPath = new Path("hdfs://node1:50050/hdfs/" +filename);
if(hdfs.exists(newFolderPath)){
System.out.println("EXISTS");
hdfs.delete(newFolderPath, true);
}
filerdd.saveAsTextFile("/hdfs/" + filename);
When I ran my Spark application, the file was deleted, but I got a FileNotFoundException.
Since this exception occurs when something tries to read a file from a path where no file exists, this makes no sense to me: after the file is deleted, no code tries to read it.
Part of my code:
JavaRDD<String> filerdd = sc.textFile("/hdfs/" + filename); // load the file here
...
...
// Transformations here
filerdd = filerdd.map(....);
...
...
// Delete old file here
FileSystem hdfs = FileSystem.get(new Configuration());
Path newFolderPath = new Path("hdfs://node1:50050/hdfs/" +filename);
if(hdfs.exists(newFolderPath)){
System.out.println("EXISTS");
hdfs.delete(newFolderPath, true);
}
// Write new file here
filerdd.saveAsTextFile("/hdfs/" + filename);
I am trying to do the simplest thing here, but I have no idea why this does not work. Maybe filerdd is somehow connected to the path?
The problem is that you use the same path for input and output. Spark evaluates RDDs lazily: sc.textFile() and map() only record what has to be done, and nothing is actually read until an action such as saveAsTextFile() runs. At that point you have already deleted newFolderPath, so filerdd fails because its input no longer exists.
In any case, you should not use the same path for input and output.
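The "different path for input and output" advice can be illustrated outside Spark. The sketch below is a plain java.nio.file analogue on the local filesystem, not Spark or HDFS code: it reads the input completely, writes the result to a temporary sibling file, and only then replaces the original. RewriteInPlace and upperCaseLines are hypothetical names, and the upper-casing is just a stand-in for your transformations.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical local-filesystem analogue of "write elsewhere, then swap".
class RewriteInPlace {
    static void upperCaseLines(Path file) throws IOException {
        // Step 1: consume the input fully before touching the original.
        List<String> transformed = Files.readAllLines(file, StandardCharsets.UTF_8)
                .stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());

        // Step 2: write the result to a different path in the same directory.
        Path dir = file.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "rewrite", ".tmp");
        Files.write(tmp, transformed, StandardCharsets.UTF_8);

        // Step 3: the input is no longer needed, so replacing it is safe.
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

In the Spark case the equivalent would be to save to a new path first and to delete or rename the old one only after the save has finished.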

Unable to find the file kept inside project

I have an Excel file which I have kept in a subfolder of my main package.
I want to read that file. When I read it using an InputStream, the file is found without trouble, but when I read it using FileInputStream or File file = new File(filepath), I get an error that the file is not found.
Can anyone help me read the file using FileInputStream or File file = new File(filepath)?
The code I wrote to read the file is
File file = new File("upgradeworkbench/Resources/workbookOut.xlsm");
and
FileInputStream inp = new FileInputStream("upgradeworkbench/Resources/workbookOut.xlsm");
I tried with a leading / in the path, but it still didn't work.
When working with the File class, you need to provide either an absolute or a relative path. An absolute path is the full file path, e.g. C:\workbookOut.xlsm
Relative paths are resolved against the working directory, which is represented by a . (dot); everything else is relative to it.
Try giving either the full path or the relative path.
File file = new File("./upgradeworkbench/Resources/workbookOut.xlsm");
If your file is on the classpath, try the code below:
package mypack;
import java.io.*;
public class TestPath
{
    public static void main(String[] args) throws Exception
    {
        // A leading "/" makes the lookup start at the root of the classpath
        InputStream stream = TestPath.class.getResourceAsStream("/workbookOut.xlsm");
        System.out.println(stream != null);
        // ClassLoader lookups always start at the root, so no leading "/"
        stream = TestPath.class.getClassLoader()
                .getResourceAsStream("workbookOut.xlsm");
        System.out.println(stream != null);
    }
}
If your file is in the same package as the class, the lines below should work:
URL url = getClass().getResource("workbookOut.xlsm");
File file = new File(url.getPath());

File not being Created

I am creating a file like this (I am passing args[0] as the name of the file to be created).
No file is created. I searched through the source folder of the project and found nothing. Why?
import java.io.File;
public class Test {
public static void main (String [] args)
{
File f=new File(args[0]);
}
}
Try with
File f=new File(args[0]);
f.createNewFile();
File is just a representation of the path. You need to actually open an output stream for that file for it to be created.
This is normal.
A File is an abstract object. It may, or may not, refer to an existing resource on the filesystem.
But since this is 2015, drop File, use java.nio.file instead:
final Path path = Paths.get(args[0]);
Files.createFile(path);
But really, you shouldn't use File in 2015. Seriously. Yes, .createNewFile() exists on File but... Well, read the page. In short: returns a boolean, need to check the return value, if false, SOL, you can't even diagnose.
Just creating a File object does not create a physical file on disk. The actual file is created with f.createNewFile(), as shown in the demo below.
When you do File file = new File(args[0]); the variable file just represents a Java object, not a file on the file system.
Here is a demo of basic file create and delete operations:
import java.io.File;

public class FileDemo {
    public static void main(String[] args) {
        File f = null;
        try {
            // create a File object (nothing is written to disk yet)
            f = new File("test.txt");
            // tries to create the new file on the file system
            f.createNewFile();
            // deletes the file from the file system
            f.delete();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Executing new File(...) does not create a file in the file system. A File object is just a way of representing the path for a file system object that may or may not exist.
The typical way to create a file is to open a FileOutputStream or FileWriter for the file. The file is created even if you don't write anything. Other alternatives are to call File.createNewFile() or File.createTempFile(...).
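To make the difference between the two creation styles concrete, here is a minimal sketch using java.nio.file. Files.createFile reports "already exists" as a FileAlreadyExistsException and every other failure as an IOException with a message, where File.createNewFile() only returns false for the former. CreateFileDemo and createIfAbsent are names invented for this example.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration.
class CreateFileDemo {
    // Returns true if this call created the file, false if it already existed.
    // Other failures (missing parent directory, permissions, ...) propagate
    // as an IOException that describes what went wrong.
    static boolean createIfAbsent(Path path) throws IOException {
        try {
            Files.createFile(path);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        }
    }
}
```

Called as CreateFileDemo.createIfAbsent(Paths.get(args[0])), this creates the empty file on the first run and simply returns false on later runs.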

FileDescriptor of Directory in Java

Is there a way to open a directory stream in Java like in C? I need a FileDescriptor of an opened directory. Well, actually just the number of the fd.
I am trying to implement checkpoint/restore functionality in Java with the help of CRIU. To do this, I need to make an RPC call to the CRIU service. There I have to provide the integer value of the FD of an already opened directory, where the image files of the process will be stored.
Thank you in advance!
is there a way to open a directory stream in Java like in C?
No, there isn't. Not without resorting to native code.
If you want to "read" a directory in (pure) Java, you can do it using one of the following:
File.list() - gives you the names of the directory entries as strings.
File.list(FilenameFilter) - ditto, but only directory entries that match are returned.
File.listFiles() - like list() but returning File objects.
etcetera
Files.newDirectoryStream(Path) gives you an iterator for the Path objects for the entries in a directory.
The last one could be "close" to what you are trying to achieve, but it does not entail application code getting hold of a file descriptor for a directory, or the application doing a low-level "read" on the directory.
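For reference, a minimal sketch of the Files.newDirectoryStream(Path) option from the list above. It iterates over the entries of a directory and closes the underlying handle when done, but as noted it never exposes a file descriptor. ListDirectory and entryNames are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration.
class ListDirectory {
    // Returns the sorted file names of the entries directly inside 'dir'.
    static List<String> entryNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        // try-with-resources closes the directory stream when the loop ends
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                names.add(entry.getFileName().toString());
            }
        }
        // Iteration order is not guaranteed, so sort for a stable result
        Collections.sort(names);
        return names;
    }
}
```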
You don't need an FD in Java. All you need is a reference to that file, which you can simply acquire using File file = new File("PathToYourFile");
To read and write, Java has streams. You can use
BufferedReader fileReader = new BufferedReader(new FileReader(new File("myFile.txt")));
PrintWriter fileWriter = new PrintWriter(new FileWriter(new File("myFile.txt")));
Even a directory is a file. You can call isDirectory() on the File object to check whether it is a directory or a regular file.
private FileDescriptor openFile(String path)
        throws FileNotFoundException, IOException {
    File file = new File(path);
    // Note: this works for a regular file only; FileOutputStream throws
    // FileNotFoundException if the path is a directory
    FileOutputStream fos = new FileOutputStream(file);
    // remember the 'fos' reference somewhere so it can be closed later
    fos.write((new Date() + " Beginning of process...").getBytes());
    return fos.getFD();
}

How to read file from relative path in Java project? java.io.File cannot find the path specified

I have a project with 2 packages:
tkorg.idrs.core.searchengines
tkorg.idrs.core.searchengines
In package (2) I have a text file ListStopWords.txt, and in package (1) I have a class FileLoader. Here is the code in FileLoader:
File file = new File("properties\\files\\ListStopWords.txt");
But I have this error:
The system cannot find the path specified
Can you give a solution to fix it?
If it's already in the classpath, then just obtain it from the classpath instead of from the disk file system. Don't fiddle with relative paths in java.io.File. They are dependent on the current working directory over which you have totally no control from inside the Java code.
Assuming that ListStopWords.txt is in the same package as your FileLoader class, then do:
URL url = getClass().getResource("ListStopWords.txt");
File file = new File(url.getPath());
Or if all you're ultimately after is actually an InputStream of it:
InputStream input = getClass().getResourceAsStream("ListStopWords.txt");
This is certainly preferred over creating a new File(), because the URL does not necessarily represent a disk file system path. It could also represent a virtual file system path (which may happen when the JAR is expanded into memory instead of into a temp folder on the disk file system) or even a network path, neither of which the File constructor can handle.
If the file is, as the package name hints, actually a full-fledged properties file (containing key=value lines) with just the "wrong" extension, then you can feed the InputStream directly to the load() method.
Properties properties = new Properties();
properties.load(getClass().getResourceAsStream("ListStopWords.txt"));
Note: when you're trying to access it from inside static context, then use FileLoader.class (or whatever YourClass.class) instead of getClass() in above examples.
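One detail worth guarding against in the snippets above: getResourceAsStream() returns null when the resource is not found, which otherwise shows up later as a NullPointerException. A minimal sketch of a wrapper that fails early with a clear message follows. ResourceOpener and open are hypothetical names, and the anchor parameter is whichever class the resource path should be resolved against (for example FileLoader.class).

```java
import java.io.FileNotFoundException;
import java.io.InputStream;

// Hypothetical helper for illustration.
class ResourceOpener {
    // Resolves 'name' relative to the package of 'anchor' (or from the
    // classpath root if 'name' starts with "/") and fails loudly if the
    // resource is missing instead of returning null.
    static InputStream open(Class<?> anchor, String name) throws FileNotFoundException {
        InputStream in = anchor.getResourceAsStream(name);
        if (in == null) {
            throw new FileNotFoundException("Resource not found on classpath: " + name);
        }
        return in;
    }
}
```

Usage would look like try (InputStream input = ResourceOpener.open(FileLoader.class, "ListStopWords.txt")) { ... }, which also takes care of closing the stream.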
Relative paths work in Java using the . specifier.
. means the current working directory of the running program.
.. means the parent folder of the current working directory.
So the question is: how do you know which directory Java is currently looking in?
Do a small experiment:
File directory = new File("./");
System.out.println(directory.getAbsolutePath());
Observe the output and you will know the current directory where Java is looking. From there, simply use the ./ specifier to locate your file.
For example, if the output is
G:\JAVA8Ws\MyProject\content.
and your file is in the folder "MyProject", simply use
File resourceFile = new File("../myFile.txt");
Hope this helps.
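The same experiment can be done with java.nio.file, which also lets you see what a relative path turns into once . and .. are resolved. This is a small sketch with hypothetical names (WorkingDirDemo, resolveAgainst). In real code the base would usually be Paths.get(System.getProperty("user.dir")), the directory the JVM was started from.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration.
class WorkingDirDemo {
    // Resolves a relative path against an explicit base directory and
    // removes "." and ".." segments from the result.
    static Path resolveAgainst(Path base, String relative) {
        return base.resolve(relative).normalize();
    }

    // Convenience variant that uses the JVM's working directory as the base.
    static Path resolveAgainstWorkingDir(String relative) {
        return resolveAgainst(Paths.get(System.getProperty("user.dir")), relative);
    }
}
```

Printing WorkingDirDemo.resolveAgainstWorkingDir("../myFile.txt") shows the absolute location the program would actually try to open.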
The following line can be used if we want to specify the relative path of the file.
File file = new File("./properties/files/ListStopWords.txt");
InputStream in = FileLoader.class.getResourceAsStream("<relative path from this class to the file to be read>");
try {
    BufferedReader reader = new BufferedReader(new InputStreamReader(in));
    String line = null;
    while ((line = reader.readLine()) != null) {
        System.out.println(line);
    }
} catch (Exception e) {
    e.printStackTrace();
}
try .\properties\files\ListStopWords.txt
I could have commented but I have less rep for that.
Samrat's answer did the job for me. It's better to see the current directory path through the following code.
File directory = new File("./");
System.out.println(directory.getAbsolutePath());
I simply used it to fix an issue I was facing in my project. Note that ./ refers to the current directory itself (../ goes up to its parent), so a path like the following is resolved from the current directory:
./test/conf/appProperties/keystore
While the answer provided by BalusC works for this case, it will break when the file path contains spaces, because in a URL these are encoded as %20, which does not match the real file name. If you construct the File object from a URI rather than a String, whitespace is handled correctly:
URL url = getClass().getResource("ListStopWords.txt");
File file = new File(url.toURI());
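The difference is easy to see by putting the two conversions side by side. In this sketch (UrlToFile, viaUri and viaPath are names made up for the example) the same URL is turned into a File both ways. For a directory called "my dir" the first yields a path containing "my dir", the second a path containing "my%20dir".

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;

// Hypothetical helper for illustration.
class UrlToFile {
    // Goes through URI, which decodes percent-escapes such as %20.
    static File viaUri(URL url) throws URISyntaxException {
        return new File(url.toURI());
    }

    // Uses the raw URL path, so percent-escapes are kept verbatim.
    static File viaPath(URL url) {
        return new File(url.getPath());
    }
}
```

Keep in mind that both variants only make sense when the URL really points at the disk file system (a file: URL). For resources inside a JAR, stick to getResourceAsStream().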
Assuming you want to read from resources directory in FileSystem class.
String file = "dummy.txt";
var path = Paths.get("src/com/company/fs/resources/", file);
System.out.println(path);
System.out.println(Files.readString(path));
Note: Leading . is not needed.
I wanted to parse 'command.json' inside src/main//js/Simulator.java. For that I copied the JSON file into the src folder and gave the relative path like this:
Object obj = parser.parse(new FileReader("./src/command.json"));
For me, the actual problem was that the File object resolves relative paths from <project folder path> rather than from ./src, so using File file = new File("./src/xxx.txt"); solved my problem.
For me it worked with:
StringBuilder token = new StringBuilder();
File fileName = new File("filename.txt").getAbsoluteFile();
// try-with-resources closes the Scanner and avoids a NullPointerException
// when the file cannot be opened
try (Scanner inFile = new Scanner(fileName)) {
    while (inFile.hasNext()) {
        token.append(inFile.next());
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
}
System.out.println("file contents" + token);
If the text file is not being read, try using a path that is closer to the absolute path (or, if you wish,
the complete absolute path), like this:
FileInputStream fin=new FileInputStream("\\Dash\\src\\RS\\Test.txt");
assuming that the absolute path is:
C:\\Folder1\\Folder2\\Dash\\src\\RS\\Test.txt
String basePath = new File("myFile.txt").getAbsolutePath();
You can then use this basePath as the correct path of your file.
If you want to load a properties file from a resources folder inside the src folder, use this:
String resourceFile = "resources/db.properties";
InputStream resourceStream = ClassLoader.getSystemClassLoader().getResourceAsStream(resourceFile);
Properties p=new Properties();
p.load(resourceStream);
System.out.println(p.getProperty("db"));
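The db.properties file contains the key and value db=sybase
If you are trying to call getClass() from a static method or static block, then you can do it the following way.
You can call getClass() on the Properties object you are loading into, although a class literal such as YourClass.class is the more direct choice because it resolves resources relative to your own class.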
public static Properties pathProperties = null;
static {
    pathProperties = new Properties();
    String pathPropertiesFile = "/file.xml";
    // Now go for getClass() method
    InputStream paths = pathProperties.getClass().getResourceAsStream(pathPropertiesFile);
}
