Deleting files (weird extension) - java

I have a problem when I want to delete files in a directory using Java. What I want to do is check whether a directory exists; if it does, I want to delete all the files it contains.
Here is my code:
File f = new File(this.pathToFolder);
if (f.exists() && f.isDirectory()) {
    for (int i = 0; i < f.listFiles().length; i++) {
        f.listFiles()[i].delete();
    }
} else {
    f.mkdir();
}
But the thing is that sometimes it doesn't delete the files. My guess is that it is because the files I want to delete have a weird name such as: 1.20.36579.55485875
They are files that I download and I can't choose their names.
Now I tried something like this:
File f = new File(this.pathToFolder);
if (f.exists() && f.isDirectory()) {
    int i = 0;
    while (f.listFiles().length != 0) {
        boolean remove = f.listFiles()[i].delete();
        System.out.println(remove);
    }
} else {
    f.mkdir();
}
In the console I get a lot of "false" (about 15) and then eventually a true.
I don't understand why it is acting like this. Maybe some of you will have an idea.
Thank you,
Vaclav

The problem with your code is that each call to listFiles() returns a fresh array, and because you delete files as you go, that array shrinks on every iteration while your index keeps growing.
Try this:
for (File file : f.listFiles()) {
    file.delete();
}

Try to iterate over the files using an iterator:
for (Iterator<File> it = Arrays.stream(f.listFiles()).iterator(); it.hasNext();) {
    File file = it.next();
    file.delete();
}

The problem with your first approach (the for loop) was already addressed.
Regarding your second approach, I'm guessing a bit. I would assume (I didn't verify it) that the deletion is not yet finished when you reach the next iteration, so listFiles()[0] may still be the same file as in the previous iteration.
If your code prints more output lines (true or false) than there are files in the directory, I'd assume my guess is right.
Anyway the solution suggested by Joakim Danielson ...
for (File file : f.listFiles()) {
    file.delete();
}
... addresses both issues by calling listFiles() only once and then iterating over the resulting array.
You could achieve the same (in a less elegant way) by amending your code slightly:
File f = new File(this.pathToFolder);
if (f.exists() && f.isDirectory()) {
    File[] files = f.listFiles(); // calling listFiles only once ...
    for (int i = 0; i < files.length; i++) { // ... and then operating on the resulting array
        files[i].delete();
    }
} else {
    f.mkdir();
}
... not tested - I hope I made no mistake.
Btw the javadoc for listFiles says:
There is no guarantee that the name strings in the resulting array will appear in any specific order; they are not, in particular, guaranteed to appear in alphabetical order.
That's another good reason to not call listFiles() more than once while processing a directory.
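As a side note, the newer java.nio.file API (Java 7+) makes this kind of failure diagnosable: Files.delete throws an IOException explaining why a file could not be removed, instead of just returning false like File.delete. A minimal sketch of the same check-then-empty logic (the class and method names are just for illustration):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class EmptyDirectory {
    // Deletes every regular file directly inside dir. Unlike File.delete,
    // Files.delete throws an IOException describing the cause of a failure
    // (e.g. the file is still locked by a download in progress).
    static void emptyDirectory(Path dir) throws IOException {
        if (Files.notExists(dir)) {
            Files.createDirectories(dir);
            return;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    Files.delete(p);
                }
            }
        }
    }
}
```

The directory listing is obtained once per call, so it avoids the repeated-listFiles() problem entirely.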

Related

How to create a index file in java

Can someone tell me what the use of -1 is here? I can't understand why it has to be there.
public void indexFile(File file) throws IOException {
    int fileno = files.indexOf(file.getPath());
    if (fileno == -1) {
        files.add(file.getPath());
        fileno = files.size() - 1;
    }
In this scenario, the -1 means that the string file.getPath() does not exist in the list files
I don't know exactly what is going on because I don't know what "files" stands for, but I can hazard a guess based on naming conventions.
This method does not create an index file; it indexes a given file in a List<String> called files, which accumulates the paths of all previously indexed files. If the file is already in the list, indexOf returns its position and fileno != -1. If fileno == -1, the path is added to the list and fileno is set to the index of the newly added entry (files.size() - 1).
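The -1 convention is easy to see in isolation; a tiny illustration (the paths are made up):

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfDemo {
    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        files.add("/tmp/a.txt");
        // indexOf returns -1 when the element is not present ...
        System.out.println(files.indexOf("/tmp/b.txt")); // -1
        // ... and the zero-based position when it is
        System.out.println(files.indexOf("/tmp/a.txt")); // 0
    }
}
```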

Further clarification on "sorting files by last modified in java- efficient way"

This question has already been asked here many times. It's just a reference to one of the solutions provided at Finding the 3 most recently modified files in a long list of files.
I tried adding a comment to that solution, but I don't have enough reputation points to comment, so I am asking here. The solution provides a method for sorting files by last modified.
public static void sortFilesDesc(File[] files) {
    File firstMostRecent = null;
    File secondMostRecent = null;
    File thirdMostRecent = null;
    for (File file : files) {
        if ((firstMostRecent == null)
                || (firstMostRecent.lastModified() < file.lastModified())) {
            thirdMostRecent = secondMostRecent;
            secondMostRecent = firstMostRecent;
            firstMostRecent = file;
        } else if ((secondMostRecent == null)
                || (secondMostRecent.lastModified() < file.lastModified())) {
            thirdMostRecent = secondMostRecent;
            secondMostRecent = file;
        } else if ((thirdMostRecent == null)
                || (thirdMostRecent.lastModified() < file.lastModified())) {
            thirdMostRecent = file;
        }
    }
}
The method takes an array of files as its argument and is supposed to sort them according to last-modified time.
What I don't understand is how this method modifies the input array so that its elements end up sorted. The code only changes the values of the local variables firstMostRecent, secondMostRecent and thirdMostRecent, so how does the array get modified? There might be something missing from my understanding, but I am not getting it. Please clarify my confusion.
The input array is not modified.
This code only finds the first, second and third most recently modified files. That's it.
After the for loop executes, you have the top 3 most recently modified files in the local variables, but the array itself is untouched.
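If what you actually need is the whole array sorted by last-modified time, the standard library does that directly. A sketch (not from the linked answer, just an illustration) using Arrays.sort with a comparator:

```java
import java.io.File;
import java.util.Arrays;
import java.util.Comparator;

class SortByModified {
    // Sorts the array in place, most recently modified first.
    static void sortFilesDesc(File[] files) {
        Arrays.sort(files, Comparator.comparingLong(File::lastModified).reversed());
    }
}
```

Unlike the snippet in the question, Arrays.sort really does reorder the array that is passed in.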

A method to get Directory size returns different results

I'm trying to get the size of a folder on Android. The problems are:
Each time the function returns something different.
It doesn't scan all files.
Can someone tell me why?
private long dirSize(File dir) {
    long result = 0;
    File[] fileList = dir.listFiles();
    for (int i = 0; i < fileList.length; i++) {
        // Recursive call if it's a directory
        if (fileList[i].isDirectory()) {
            result += dirSize(fileList[i]);
        } else {
            // Sum the file size in bytes
            result += fileList[i].length();
        }
    }
    return result / 1024 / 1024; // return the file size
}
I got this function from another thread. How can I improve this code?
Get rid of the division at the end. Because of the recursion, each subdirectory's size is rounded down to the nearest MB and then added to the byte count of the current directory, so the result mixes units and undercounts. Return bytes and handle the conversion in the caller.
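To make the fix concrete, here is a sketch of the method with the division removed and a null check added (listFiles() returns null if dir is not a directory or an I/O error occurs); the names are unchanged from the question:

```java
import java.io.File;

class DirSize {
    // Returns the total size in bytes of all files under dir, recursively.
    static long dirSize(File dir) {
        long result = 0;
        File[] fileList = dir.listFiles();
        if (fileList == null) { // not a directory, or an I/O error
            return 0;
        }
        for (File f : fileList) {
            result += f.isDirectory() ? dirSize(f) : f.length();
        }
        return result; // bytes; divide by 1024 * 1024 in the caller for MB
    }
}
```
<imports>
import java.nio.file.Files;
import java.nio.file.Path;
</imports>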

Java max file caching solution

I'm trying to write many files to a directory, and when the directory reaches X number of files, I want the least recently accessed file to be deleted before writing the new file. I don't really want to roll my own solution to this because I'd imagine someone else has already done this before. Are there existing solutions to this? Note this is for a windows application.
This is related to my question Java ehcache disk store, but I'm asking this question separately since now I'm focusing on a file caching solution.
Thanks,
Jeff
I would roll my own, because the problem sounds so easy that writing it yourself is probably easier than trying to learn and adopt an existing library :-)
If it's a low number of files and/or your cache is accessed from multiple processes, call the following method before writing a file:
void deleteOldFiles(String dir, long maxFileCount) {
    while (true) {
        File oldest = null;
        long oldestTime = 0;
        File[] list = new File(dir).listFiles();
        if (list.length < maxFileCount) {
            break;
        }
        for (File f : list) {
            long m = f.lastModified();
            if (oldest == null || oldestTime > m) {
                oldestTime = m;
                oldest = f;
            }
        }
        oldest.delete();
    }
}
If you only access the cache from one process, you could write something more efficient using LinkedHashMap or LinkedHashSet.
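For the single-process case, an access-ordered LinkedHashMap gives the LRU bookkeeping almost for free via removeEldestEntry. A minimal sketch (the class name is illustrative; actually deleting the evicted file would go inside the eviction hook):

```java
import java.util.LinkedHashMap;
import java.util.Map;

class LruIndex<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruIndex(int maxEntries) {
        // accessOrder = true: iteration order is least- to most-recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Returning true evicts the least recently accessed entry.
        // This is where you would also delete the corresponding file on disk.
        return size() > maxEntries;
    }
}
```

Every get() and put() refreshes an entry's position, so the eldest entry is always the least recently accessed one.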
Update
Check the number of files instead of the total file size.
You can try this before creating a new file:
void deleteOldFiles(String dir, int maxFiles) {
    File fdir = new File(dir);
    while (true) {
        // Check number of files. Also do nothing if maxFiles == 0
        File[] files = fdir.listFiles();
        if (maxFiles == 0 || files.length < maxFiles)
            break;
        // Delete oldest
        File oldest = files[0];
        for (int i = 1; i < files.length; i++) {
            if (files[i].lastModified() < oldest.lastModified()) {
                oldest = files[i];
            }
        }
        oldest.delete();
    }
}
This would not be efficient for a large number of files, though. In that case I would keep a list of files in the directory, sorted by creation time.
Although all of this gets into the 'roll my own category'...
If you were using Cacheonix you could hook up to the cache events API and remove the files when receiving the notification that a cache entry was evicted by the LRU algorithm.

Counting the number of files in a directory using Java

How do I count the number of files in a directory using Java? For simplicity, let's assume that the directory doesn't have any sub-directories.
I know the standard method of:
new File(<directory path>).listFiles().length
But this will effectively go through all the files in the directory, which might take long if the number of files is large. Also, I don't care about the actual files in the directory unless their number is greater than some fixed large number (say 5000).
I am guessing, but doesn't the directory (or its i-node in case of Unix) store the number of files contained in it? If I could get that number straight away from the file system, it would be much faster. I need to do this check for every HTTP request on a Tomcat server before the back-end starts doing the real processing. Therefore, speed is of paramount importance.
I could run a daemon every once in a while to clear the directory. I know that, so please don't give me that solution.
Ah... the rationale for not having a straightforward method in Java to do that is file storage abstraction: some filesystems may not have the number of files in a directory readily available... that count may not even have any meaning at all (see for example distributed, P2P filesystems, fs that store file lists as a linked list, or database-backed filesystems...).
So yes,
new File(<directory path>).list().length
is probably your best bet.
Since Java 8, you can do that in three lines:
try (Stream<Path> files = Files.list(Paths.get("your/path/here"))) {
    long count = files.count();
}
Regarding the 5000 child nodes and inode aspects:
This method will iterate over the entries but as Varkhan suggested you probably can't do better besides playing with JNI or direct system commands calls, but even then, you can never be sure these methods don't do the same thing!
However, let's dig into this a little:
Looking at JDK8 source, Files.list exposes a stream that uses an Iterable from Files.newDirectoryStream that delegates to FileSystemProvider.newDirectoryStream.
On UNIX systems (decompiled sun.nio.fs.UnixFileSystemProvider.class), it loads an iterator: A sun.nio.fs.UnixSecureDirectoryStream is used (with file locks while iterating through the directory).
So, there is an iterator that will loop through the entries here.
Now, let's look to the counting mechanism.
The actual count is performed by the count/sum reducing API exposed by Java 8 streams. In theory, this API can perform parallel operations without much effort (with multithreading). However, the stream is created with parallelism disabled, so it's a no-go.
The good side of this approach is that it won't load the array in memory as the entries will be counted by an iterator as they are read by the underlying (Filesystem) API.
Finally, for the record: conceptually, a directory node in a filesystem is not required to hold the number of files it contains; it can just hold the list of its child nodes (a list of inodes). I'm not an expert on filesystems, but I believe UNIX filesystems work just like that. So you can't assume there is a way to get this information directly (there can always be some list of child nodes hidden somewhere).
Unfortunately, I believe that is already the best way (although list() is slightly better than listFiles(), since it doesn't construct File objects).
This might not be appropriate for your application, but you could always try a native call (using jni or jna), or exec a platform-specific command and read the output before falling back to list().length. On *nix, you could exec ls -1a | wc -l (note - that's dash-one-a for the first command, and dash-lowercase-L for the second). Not sure what would be right on windows - perhaps just a dir and look for the summary.
Before bothering with something like this I'd strongly recommend you create a directory with a very large number of files and just see if list().length really does take too long. As this blogger suggests, you may not want to sweat this.
I'd probably go with Varkhan's answer myself.
Since you don't really need the total number, and in fact want to perform an action after a certain number (in your case 5000), you can use java.nio.file.Files.newDirectoryStream. The benefit is that you can exit early instead having to go through the entire directory just to get a count.
public boolean isOverMax() {
    Path dir = Paths.get("C:/foo/bar");
    int i = 0; // counter starts at 0 so the first entry makes it 1
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
        for (Path p : stream) {
            // larger than max files, exit
            if (++i > MAX_FILES) {
                return true;
            }
        }
    } catch (IOException ex) {
        ex.printStackTrace();
    }
    return false;
}
The interface doc for DirectoryStream also has some good examples.
If you have directories containing really (>100'000) many files, here is a (non-portable) way to go:
String directoryPath = "a path";
// -f flag is important, because this way ls does not sort its output,
// which is way faster
String[] params = { "/bin/sh", "-c",
        "ls -f " + directoryPath + " | wc -l" };
Process process = Runtime.getRuntime().exec(params);
BufferedReader reader = new BufferedReader(new InputStreamReader(
        process.getInputStream()));
// subtract 2 to account for the "." and ".." entries
int fileCount = Integer.parseInt(reader.readLine().trim()) - 2;
reader.close();
System.out.println(fileCount);
Using sigar should help. Sigar has native hooks to get the stats
new Sigar().getDirStat(dir).getTotal()
This method works for me very well.
// Recursive method to recover files and folders and to print the information
public static void listFiles(String directoryName) {
    File file = new File(directoryName);
    File[] fileList = file.listFiles(); // List files inside the main dir
    String extension;
    String fileName;
    if (fileList != null) {
        for (int i = 0; i < fileList.length; i++) {
            extension = "";
            if (fileList[i].isFile()) {
                fileName = fileList[i].getName();
                if (fileName.lastIndexOf(".") != -1 && fileName.lastIndexOf(".") != 0) {
                    extension = fileName.substring(fileName.lastIndexOf(".") + 1);
                    System.out.println("THE " + fileName + " has the extension = " + extension);
                } else {
                    extension = "Unknown";
                    System.out.println("extension2 = " + extension);
                }
                filesCount++;
                allStats.add(new FilePropBean(filesCount, fileList[i].getName(), fileList[i].length(), extension,
                        fileList[i].getParent()));
            } else if (fileList[i].isDirectory()) {
                filesCount++;
                extension = "";
                allStats.add(new FilePropBean(filesCount, fileList[i].getName(), fileList[i].length(), extension,
                        fileList[i].getParent()));
                listFiles(String.valueOf(fileList[i]));
            }
        }
    }
}
Unfortunately, as mmyers said, File.list() is about as fast as you are going to get using Java. If speed is as important as you say, you may want to consider doing this particular operation using JNI. You can then tailor your code to your particular situation and filesystem.
public void shouldGetTotalFilesCount() {
    Integer reduce = of(listRoots()).parallel().map(this::getFilesCount).reduce(0, ((a, b) -> a + b));
}

private int getFilesCount(File directory) {
    File[] files = directory.listFiles();
    return Objects.isNull(files) ? 1 : Stream.of(files)
            .parallel()
            .reduce(0, (Integer acc, File p) -> acc + getFilesCount(p), (a, b) -> a + b);
}
Count files in directory and all subdirectories.
var path = Path.of("your/path/here");
var count = Files.walk(path).filter(Files::isRegularFile).count();
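If, as in the original question, you only care whether the count exceeds some threshold, the stream variant can also exit early by capping the walk with limit. A sketch (the class and method names are illustrative):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

class FileCountCheck {
    // Returns true if dir contains more than max regular files (recursively).
    // limit(max + 1) lets the walk stop as soon as the threshold is crossed,
    // instead of traversing the whole directory tree just to get an exact count.
    static boolean isOverMax(Path dir, long max) throws IOException {
        try (Stream<Path> walk = Files.walk(dir)) {
            return walk.filter(Files::isRegularFile).limit(max + 1).count() > max;
        }
    }
}
```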
In Spring Batch I did the below:
private int getFilesCount() throws IOException {
    ResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
    Resource[] resources = resolver.getResources("file:" + projectFilesFolder + "/**/input/splitFolder/*.csv");
    return resources.length;
}
