Sort files in numeric order

I made a program to combine all files in a folder together.
Here's part of my code:
File folder = new File("c:/some directory");
File[] listOfFiles = folder.listFiles();
for (File file : listOfFiles){
if (file.isFile()){
System.out.println(file.getName());
File f = new File("c:/some directory"+file.getName());
However, I want my files to be ordered like:
job1.script, job2.script, .....
but I get:
job1.script, job10.script, job11.script, ... so that 10, 11, 12, ... come before 2.
I am looking for efficient code that avoids this problem.

Time to get rid of all the clumsy code and use Java 8! This answer also uses the Path class, which has been part of Java since version 7 but seems to be heavily improved in Java 8.
The code:
private void init() throws IOException {
Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
Files.list(directory)
.filter(path -> Files.isRegularFile(path))
.filter(path -> path.getFileName().toString().startsWith("job"))
.filter(path -> path.getFileName().toString().endsWith(".script"))
.sorted(Comparator.comparingInt(this::pathToInt))
.map(path -> path.getFileName())
.forEach(System.out::println);
}
private int pathToInt(final Path path) {
return Integer.parseInt(path.getFileName()
.toString()
.replace("job", "")
.replace(".script", "")
);
}
The explanation of pathToInt:
From a given Path, obtain the String representation of the file.
Remove "job" and ".script".
Try to parse the String as an Integer.
The explanation of init, the main method:
Obtain a Path to the directory where the files are located.
Obtain a lazily populated Stream of Paths in the directory. Be aware: these Paths are still fully qualified!
Keep files that are regular files.
Keep files of which the last part of the Path, thus the filename (for example job1.script) starts with "job". Be aware that you need to first obtain the String representation of the Path before you can check it, else you will be checking if the whole Path starts with a directory called "job".
Do the same for files ending with ".script".
Now comes the fun part. Here we sort the file list based on a Comparator that compares the integers we obtain by calling pathToInt on each Path. Here I am using a method reference: the method comparingInt(ToIntFunction<? super T> keyExtractor) expects a function that maps a T, in this case a Path, to an int. This is exactly what pathToInt does, hence it can be used as a method reference.
Then I map every Path to the Path only consisting of the filename.
Lastly, for each element of the Stream<Path>, I call System.out.println(Path.toString()).
It may seem like this code could be written more simply, but I have purposefully written it more verbosely. My design here is to keep the full Path intact at all times. The very last part of the code violates that principle: shortly before the forEach, each Path gets mapped to only its file name, so you cannot process the full Path at a later point.
This code is also designed to be fail-fast: it expects file names of the form job(\d+).script and will throw a NumberFormatException if that is not the case.
Example output:
job1.script
job2.script
job10.script
job11.script
An arguably better alternative features the power of regular expressions:
private void init() throws IOException {
Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
Files.list(directory)
.filter(path -> Files.isRegularFile(path))
.filter(path -> path.getFileName().toString().matches("job\\d+\\.script"))
.sorted(Comparator.comparingInt(this::pathToInt))
.map(path -> path.getFileName())
.forEach(System.out::println);
}
private int pathToInt(final Path path) {
return Integer.parseInt(path.getFileName()
.toString()
.replaceAll("job(\\d+)\\.script", "$1")
);
}
Here I use the regular expression "job\\d+\\.script", which matches a string starting with "job", followed by one or more digits, followed by ".script". The dot is escaped because an unescaped . would match any character.
I use almost the same expression in the pathToInt method, but there I add a capturing group (the parentheses) and use $1 to refer to what that group captured.
I will also provide a concise way to read the contents of the files in one big file, as you have also asked in your question:
private void init() throws IOException {
Path directory = Paths.get("C:\\Users\\Frank\\Downloads\\testjob");
try (BufferedWriter writer = Files.newBufferedWriter(directory.resolve("masterjob.script"))) {
Files.list(directory)
.filter(path -> Files.isRegularFile(path))
.filter(path -> path.getFileName().toString().matches("job\\d+\\.script"))
.sorted(Comparator.comparingInt(this::pathToInt))
.flatMap(this::wrappedLines)
.forEach(string -> wrappedWrite(writer, string));
}
}
private int pathToInt(final Path path) {
return Integer.parseInt(path.getFileName()
.toString()
.replaceAll("job(\\d+)\\.script", "$1")
);
}
private Stream<String> wrappedLines(final Path path) {
try {
return Files.lines(path);
} catch (IOException ex) {
//swallow
return null;
}
}
private void wrappedWrite(final BufferedWriter writer, final String string) {
try {
writer.write(string);
writer.newLine();
} catch (IOException ex) {
//swallow
}
}
Please note that the functional interfaces used by the Stream API do not declare checked exceptions, so a lambda or method reference passed to them cannot throw one. Hence there is a necessity to write wrappers around the code that decide what to do with the errors. Swallowing the exceptions is rarely a good idea; I am just doing it here for code simplicity.
The real big change here is that instead of printing out the names, I map every file to its contents and write those to a file.
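As a hedged alternative to swallowing, the two wrappers could rethrow the checked IOException as a java.io.UncheckedIOException, so that a failure still stops the stream. This is only a minimal sketch: the class name UncheckedWrappers is hypothetical, and the method names simply mirror the wrappers above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper class; the method names mirror the wrappers above.
final class UncheckedWrappers {

    // Opens the file as a stream of lines; an IOException is rethrown unchecked.
    static Stream<String> wrappedLines(final Path path) {
        try {
            return Files.lines(path);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Writes one line plus a line separator; an IOException is rethrown unchecked.
    static void wrappedWrite(final BufferedWriter writer, final String string) {
        try {
            writer.write(string);
            writer.newLine();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

The caller can then catch UncheckedIOException around the whole stream pipeline and decide in one place how to report the error.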

If your file names always look like jobNumber.script, you could sort the array with a custom comparator:
Arrays.sort(listOfFiles, new Comparator<File>(){
@Override
public int compare(File f1, File f2) {
String s1 = f1.getName().substring(3, f1.getName().indexOf("."));
String s2 = f2.getName().substring(3, f2.getName().indexOf("."));
return Integer.valueOf(s1).compareTo(Integer.valueOf(s2));
}
});
public static void main(String[] args) throws Exception{
File folder = new File(".");
File[] listOfFiles = folder.listFiles(new FilenameFilter() {
@Override
public boolean accept(File arg0, String arg1) {
return arg1.endsWith(".script");
}
});
System.out.println(Arrays.toString(listOfFiles));
Arrays.sort(listOfFiles, new Comparator<File>(){
@Override
public int compare(File f1, File f2) {
String s1 = f1.getName().substring(3, f1.getName().indexOf("."));
String s2 = f2.getName().substring(3, f2.getName().indexOf("."));
return Integer.valueOf(s1).compareTo(Integer.valueOf(s2));
}
});
System.out.println(Arrays.toString(listOfFiles));
}
Prints:
[.\job1.script, .\job1444.script, .\job4.script, .\job452.script, .\job77.script]
[.\job1.script, .\job4.script, .\job77.script, .\job452.script, .\job1444.script]

The easiest solution is to zero-pad all numbers lower than 10. Like
job01.script
instead of
job1.script
This assumes no more than 100 files. With more, simply add more zeros.
Otherwise, you'll need to analyze and break down each file name, and then order the names numerically. Currently, they are being ordered character by character.

The simplest method to solve this problem is to prefix your names with 0s. This is what I did when I had the same problem. So basically you choose the biggest number you have (for example 433234) and prefix all numbers with biggestLength - currentNumLength zeroes.
An example:
Biggest number is 12345: job12345.script.
This way the first job becomes job00001.script.
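The padding described above can be sketched with String.format. This is an illustration only: the class name ZeroPadDemo and the helper paddedName are hypothetical, and the width is assumed to be the digit count of the biggest number.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class showing why zero-padded names sort correctly as plain strings.
final class ZeroPadDemo {

    // Builds a name such as job00001.script; width is the digit count of the biggest number.
    static String paddedName(int number, int width) {
        return String.format("job%0" + width + "d.script", number);
    }

    public static void main(String[] args) {
        int biggest = 12345;
        int width = String.valueOf(biggest).length(); // 5 digits
        List<String> names = new ArrayList<>();
        for (int n : new int[] {10, 2, 12345, 1}) {
            names.add(paddedName(n, width));
        }
        Collections.sort(names); // plain lexicographic sort
        // prints [job00001.script, job00002.script, job00010.script, job12345.script]
        System.out.println(names);
    }
}
```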

Related

Regex filter for file search Java

I'm quite new to using regex, so I'm having problems with my current code. I created an abstract file searcher that returns a List of Files. I would like this searcher to be filtered by a regex (for example, so that the extensions it looks for are determined by a regex filter).
The code of my Abstract Searcher:
public abstract class AbstractFileDiscoverer implements IDiscoverer {
private final Path rootPath;
AbstractFileDiscoverer(final Path rootPath) {
super();
this.rootPath = rootPath;
}
protected List<File> findFiles() throws IOException {
if (!Files.isDirectory(this.rootPath)) {
throw new IllegalArgumentException("Path must be a directory");
}
List<File> result;
try (Stream<Path> walk = Files.walk(this.rootPath)) {
result = walk.filter(p -> !Files.isDirectory(p)).map(p -> p.toFile())
.filter(f -> f.toString().toLowerCase().endsWith("")).collect(Collectors.toList());
}
return result;
}
@Override
public String getName() {
// TODO Auto-generated method stub
return null;
}
}
I would like the following part to be filtered by the regex, so that only the files the regex matches (.bat and .sql files) are collected.
result = walk.filter(p -> !Files.isDirectory(p)).map(p -> p.toFile())
.filter(f -> f.toString().toLowerCase().endsWith("")).collect(Collectors.toList());
Could anyone help me achieve this?
FIRST EDIT:
I'm aware that toString().toLowerCase().endsWith("") always returns true. I actually need the regex there instead of a String with the extension. I forgot to mention that.

Try this website: https://regexr.com/ and paste the regex .+(?:\.sql|\.bat)$ for an explanation. The dots before the extensions are escaped so that they match a literal dot rather than any character.
In code it'd look like this:
Stream.of("file1.json", "init.bat", "init.sql", "file2.txt")
.filter(filename -> filename.matches(".+(?:\\.sql|\\.bat)$"))
.forEach(System.out::println);
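A possible variant of the snippet above compiles the pattern once and makes it case-insensitive, which may be closer to the toLowerCase() call in the question. This is a sketch under assumptions: the class name ExtensionFilterDemo and the method keepScripts are hypothetical, and the literal dot is escaped so that a name such as "notsql" is not accepted.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class: filters file names with a precompiled regex.
final class ExtensionFilterDemo {

    // Matches any name ending in .sql or .bat, ignoring case; the dot is escaped.
    static final Pattern SCRIPT_EXTENSION =
            Pattern.compile(".+\\.(?:sql|bat)$", Pattern.CASE_INSENSITIVE);

    static List<String> keepScripts(Stream<String> fileNames) {
        return fileNames
                .filter(name -> SCRIPT_EXTENSION.matcher(name).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [init.bat, INIT.SQL]
        System.out.println(keepScripts(
                Stream.of("file1.json", "init.bat", "INIT.SQL", "file2.txt", "notsql")));
    }
}
```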

There is a famous quote from Jamie Zawinski about using regular expressions when simpler non-regex code will do.
In your case, I would avoid using a regular expression and would just write a private method:
private static boolean hasMatchingExtension(Path path) {
String filename = path.toString().toLowerCase();
return filename.endsWith(".bat") || filename.endsWith(".sql");
}
Then you can use it in your stream:
result = walk.filter(p -> !Files.isDirectory(p))
.filter(p -> hasMatchingExtension(p))
.map(p -> p.toFile())
.collect(Collectors.toList());
(Consider returning List<Path> instead. The Path class is the modern replacement for the File class, some of whose methods that actually operate on files have design issues.)

Extract FileName from getAbsolutePath() method

I'm using a method from this site to read all the files that exist on the system hard drives. It's working fine, but I need to check whether a certain file exists while searching.
To make a long story short, here is the line of code that reads the files:
parseAllFiles(f.getAbsolutePath());
How can I assign the output of this method to a string so I can search inside this string for the file I want? Or is there any way to change this statement to get the file name directly in a string?
public static void parseAllFiles(String parentDirectory){
File[] filesInDirectory = new File(parentDirectory).listFiles();
if(filesInDirectory != null){
for(File f : filesInDirectory){
if(f.isDirectory()){
parseAllFiles(f.getAbsolutePath()); // get full path
}
System.out.println("Current File -> " + f);
}
}
}

Use objects rather than strings since they tend to offer useful functionality that strings don’t offer. In your case pass a File object or a Path object to your recursive method. I take it that you start out from a string, so have your public method accept a string and construct the first object.
public static void parseAllFiles(String parentDirectory) {
parseAllFiles(new File(parentDirectory));
}
private static void parseAllFiles(File dir) {
File[] filesInDirectory = dir.listFiles();
if (filesInDirectory != null) {
for (File f : filesInDirectory) {
String fileName = f.getName();
String fullPathName = f.getAbsolutePath();
System.out.println("Current File -> " + fileName);
System.out.println("Current path -> " + fullPathName);
if (f.isDirectory()) {
parseAllFiles(f);
}
}
}
}
I didn’t get whether you wanted only the file name or the full path name of the file, but the code shows how to extract each into a string. You may then search inside these two strings for whatever you are looking for.
java.nio since Java 7
I routinely use java.nio.file for file system operations. For everyday purposes I don’t find it better to work with than the older File class, but it does offer a wealth of options that the older class doesn’t offer. @Shahar Rotshtein in a comment mentioned the FileVisitor interface from java.nio.file. Depending on your exact requirements Files.walkFileTree passing your own FileVisitor may be the best option for you. I have not understood your real requirements well enough to offer a code example.
Links
java.nio.file documentation
Section Walking the file tree in the Oracle tutorial: Basic I/O
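For readers who want to see the Files.walkFileTree approach mentioned above, here is one possible sketch. It assumes the goal is to collect every file with a given name; the class name FileNameSearch and the method findByName are hypothetical and not part of the question.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: searches a directory tree for files with a given name.
final class FileNameSearch {

    // Returns every non-directory entry below root whose file name equals wantedName.
    static List<Path> findByName(Path root, String wantedName) throws IOException {
        List<Path> matches = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                Path name = file.getFileName();
                if (name != null && name.toString().equals(wantedName)) {
                    matches.add(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip entries that cannot be read instead of aborting the walk.
                return FileVisitResult.CONTINUE;
            }
        });
        return matches;
    }
}
```

Overriding visitFileFailed matters when walking whole drives, because the default implementation rethrows the exception and stops at the first unreadable entry.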

Stream processing - searching for a file/dir in a given directory by name

I'm currently working on an assignment which consists of creating a utility class with a method allowing to search for files/directories by name in a given (as a parameter) directory.
The drill is that I am obligated to do this within the realms of functional programming / stream processing.
I have tried to achieve this using .walk() and .find() but it wouldn't work
public static List<File> findFile(Path path, String name) throws IOException{
return Files.walk(path)
.filter(n -> n.getFileName().toString().equals(name))
.map(n -> n.toFile())
.collect(Collectors.toList());
}

Check out this question
File dir= new File("path");
File[] fileList = dir.listFiles(new FilenameFilter()
{
public boolean accept(File dir, String foundFileName)
{
return name.equalsIgnoreCase(foundFileName);
}
});
That said, it seems you're just trying to search for one specific file, so this isn't the best code for the task. Anyway, your file should be in the first position of the array, or you could do something nasty like dir.listFiles(...)[0].

You can instead use Files.list(Path dir)
List<File> listOfFiles = Files.list(Paths.get(path))
.filter(e -> e.getFileName().toString().endsWith(name)) // compare as String; Path.endsWith matches whole name elements
.map(n -> n.toFile()).collect(Collectors.toList());
Note:
Tweak the filter condition as per your requirement. I have assumed name to have a value like .txt, so the listOfFiles will contain all files that end with .txt

The walk method will work for you, but converting to File at the beginning lets you use the getName method, which returns the file name as a String.
public static List<File> findFile(Path path, String name) throws IOException{
return Files
.walk(path)
.map(Path::toFile) //.map(p -> p.toFile())
.filter(File::isFile) //.filter(f -> f.isFile())
.filter(f -> f.getName().equals(name))
.collect(Collectors.toList());
}
Note that the provided name must exactly match the file name (including its extension), or the function won't find it.
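Since the stream returned by Files.walk holds open directory handles, its documentation recommends closing it, for example with try-with-resources. The following sketch shows the same search in that form. It returns Paths rather than Files, which is an assumption about what the caller needs, and the class name FileFinder is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical utility class for the name search described above.
final class FileFinder {

    // Same search as above, but the stream is closed when the block exits.
    static List<Path> findFile(Path root, String name) throws IOException {
        try (Stream<Path> walk = Files.walk(root)) {
            return walk
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals(name))
                    .collect(Collectors.toList());
        }
    }
}
```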

Recursion: Checking for files in Directories and reading them

Before you speculate something like "This guy is asking for homework help", I'll go ahead and clear any doubts you may have and say yes, this is related to homework. However, I hope that does not take away from the learning that this question provides to me and/or anyone who reads this in the future.
Background: We're currently working on recursion and our assignment asks that we write a program that uses command arguments to check a directory and its file contents for a string(that is also a command argument). We must use recursion for this.
-I want to make this clear that I UNDERSTAND WHAT THE ASSIGNMENT IS ASKING
I am simply asking, how would this work recursively because I just don't get it.
We did a problem where we had to find the size of a directory and it made sense, but I don't get how to check if something is a directory or file and based on that we read its contents or go deeper into the directory until we find a file.
Here's what I've currently done. Not too sure how wrong this is as I'm basing entirely off of the 'check the size of a directory' assignment we previously did:
The folder that I'm checking is something like this:
Directory ---> files --inside main directory --->> Two directories ----> files within both of those directories
public class SearchingForStrings {
public static void main(String[] args) {
String path = "."; // default location of this project
File sf = new File(path);
String mysteriesDirectory = args[0];
String keyString = args[1];
countLinesWithString(sf, mysteriesDirectory, keyString);
}
public static int countLinesWithString(File startPath, String mysteriesDirectory, String keyString) {
if(!startPath.exists()) {
throw new IllegalArgumentException("File " + startPath + " does not exist!");
} else if(startPath.isFile()) {
return Integer.parseInt(startPath.getAbsolutePath()); // Just to show where the file is I located the parsing is just to stop an error from flagging on this part; Going to ask professor if it's okay with him
// this is where we would begin reading the contents of the files
} else if(startPath.isDirectory()) {
// This is where our recursion would take place: essentially
// we will be going 'deeper' into the directory until we find a file
//File[] subFiles = startPath.listFiles();
countLinesWithString(startPath, mysteriesDirectory, keyString);
} else {
throw new IllegalStateException("Unknown file type: " + startPath);
}
}
}
In short: Could someone explain how recursion would work if you wanted to go deeper into a director(y/ies)?

I'll give this a try. It's something that is easier to explain than to understand.
The recursive method, on which you have made a decent start, might be documented as follows:
"For a given directory: for each file in the directory, count all the lines which contain a given string; for each directory in the directory, recurse".
The recursion is possible - and useful - because your original target is a container, and one of the types of things it can contain is another container.
So think of the counting method like this:
int countLines(dir, string) // the string could be an instance variable, also, and not passed in
{
var countedLines = 0;
for each item in dir:
if item is file, countedLines += matchedLinesInFile(item, string);
else if item is dir, countedLines += countLines(item, string);
else throw up; // or throw an exception -- your choice
return countedLines;
}
then call countLines from an exterior method with the original dir to use, plus the string.
One of the things that trips people up about recursion is that, after you get it written, it doesn't seem possible that it can do all that it does. But think through the above for different scenarios. If the dir passed in has files and no dirs, it will accumulate countedLines for each file in the dir, and return the result. That's what you want.
If the dir does contain other dirs, then for each one of those, you're going to call the routine and start on that contained dir. The call will accumulate countedLines for each file in that dir, and call itself for each dir recursively down the tree, until it reaches a dir that has no dirs in it. And it still counts lines in those, it just doesn't have any further down to recurse.
At the lowest level, it is going to accumulate those lines and return them. Then the second-lowest level will get that total to add to its total, and start the return trips back up the recursion tree.
Does that explain it any better?
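The pseudocode above could be turned into Java roughly as follows. This is a sketch, not the assignment's required solution: the class name LineCounter is hypothetical, the files are assumed to be UTF-8 text, and unreadable directories are silently counted as zero.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

// Hypothetical class implementing the recursive count described above.
final class LineCounter {

    // Counts lines containing key in every file at or below start.
    static int countLines(File start, String key) throws IOException {
        if (start.isFile()) {
            return matchedLinesInFile(start, key); // base case: a plain file
        }
        int countedLines = 0;
        File[] children = start.listFiles(); // null if start is not a readable directory
        if (children != null) {
            for (File child : children) {
                countedLines += countLines(child, key); // recursive case: go one level deeper
            }
        }
        return countedLines;
    }

    // Counts the lines of a single file that contain key.
    static int matchedLinesInFile(File file, String key) throws IOException {
        int matches = 0;
        List<String> lines = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);
        for (String line : lines) {
            if (line.contains(key)) {
                matches++;
            }
        }
        return matches;
    }
}
```

Each call handles only one level: a file is counted directly, a directory hands each of its entries back to the same method and adds up what comes back.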

To help you get started with recursion, check this:
It will recursively go from base directory printing all the folders and files.
Modify this to your requirements. Try and let us know.
import java.io.File;
public class Test {
public static void getResource(final String resourcePath) {
File file = new File(resourcePath);
if (file.isFile()) {
System.out.println("File Name : " + file.getName());
return;
} else {
File[] listFiles = file.listFiles();
if (listFiles != null) {
for (File resourceInDirectory : listFiles) {
if (!resourceInDirectory.isFile()) {
System.out.println("Folder "
+ resourceInDirectory.getAbsolutePath());
getResource(resourceInDirectory.getAbsolutePath());
} else {
getResource(resourceInDirectory.getAbsolutePath());
}
}
}
}
}
public static void main(String[] args) {
final String folderPath = "C:/Test";
getResource(folderPath);
}
}

Is there a way in Java to determine if a path is valid without attempting to create a file?

I need to determine if a user-supplied string is a valid file path (i.e., if createNewFile() will succeed or throw an Exception) but I don't want to bloat the file system with useless files, created just for validation purposes.
Is there a way to determine if the string I have is a valid file path without attempting to create the file?
I know the definition of "valid file path" varies depending on the OS, but I was wondering if there was any quick way of accepting C:/foo or /foo and rejecting banana.
A possible approach may be attempting to create the file and eventually deleting it if the creation succeeded, but I hope there is a more elegant way of achieving the same result.

The Path class introduced in Java 7 adds new alternatives, like the following:
/**
* <pre>
* Checks if a string is a valid path.
* Null safe.
*
* Calling examples:
* isValidPath("c:/test"); //returns true
* isValidPath("c:/te:t"); //returns false
* isValidPath("c:/te?t"); //returns false
* isValidPath("c/te*t"); //returns false
* isValidPath("good.txt"); //returns true
* isValidPath("not|good.txt"); //returns false
* isValidPath("not:good.txt"); //returns false
* </pre>
*/
public static boolean isValidPath(String path) {
try {
Paths.get(path);
} catch (InvalidPathException | NullPointerException ex) {
return false;
}
return true;
}
Edit:
Note Ferrybig's comment: "The only disallowed character in a file name on Linux is the NUL character, this does work under Linux."
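To illustrate the platform dependence that the comment points at, the following sketch repeats the check and runs it on three inputs. The class name PathSyntaxCheck is hypothetical. To the best of my knowledge the default file systems on both Windows and Linux reject the NUL character, while a name containing ? is only rejected on Windows, so the last result depends on where the code runs.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

// Hypothetical demo class repeating the syntax check from above.
final class PathSyntaxCheck {

    // Returns true if the default file system accepts the string as a path.
    static boolean isValidPath(String path) {
        try {
            Paths.get(path);
            return true;
        } catch (InvalidPathException | NullPointerException ex) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValidPath("good.txt"));      // accepted
        System.out.println(isValidPath("bad\0name.txt")); // rejected: contains NUL
        System.out.println(isValidPath("c:/te?t"));       // platform-dependent
    }
}
```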

This would check for the existence of the directory as well.
File file = new File("c:\\cygwin\\cygwin.bat");
if (!file.isDirectory())
file = file.getParentFile();
if (file.exists()){
...
}
It seems like file.canWrite() does not give you a clear indication if you have permissions to write to the directory.

File.getCanonicalPath() is quite useful for this purpose. IO exceptions are thrown for certain types of invalid filenames (e.g. CON, PRN, *?* in Windows) when resolving against the OS or file system. However, this only serves as a preliminary check; you will still need to handle other failures when actually creating the file (e.g. insufficient permissions, lack of drive space, security restrictions).

A number of things can go wrong when you try and create a file:
You lack the requisite permissions;
There is not enough space on the device;
The device experiences an error;
Some custom security policy prohibits you from creating a file of a particular type;
etc.
More to the point, those conditions can change between the moment you check whether you can create the file and the moment you actually create it. In a multithreaded environment this is one of the primary causes of race conditions and can be a real vulnerability in some programs.
Basically you just have to try to create it and see if it works. And that's the correct way to do it. It's why classes like ConcurrentHashMap have a putIfAbsent() method, so that the check and the insert form one atomic operation that doesn't suffer from race conditions. Exactly the same principle is in play here.
If this is just part of some diagnostic or install process, just do it and see if it works. Again there's no guarantee that it'll work later however.
Basically your program has to be robust enough to die gracefully if it can't write a relevant file.
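One way to "just try it" is Files.createFile, whose documentation states that the existence check and the creation happen as a single atomic operation. The sketch below is an illustration under assumptions: the class name CreateAttempt and the Outcome enum are hypothetical, and a real program would probably report the exception instead of collapsing it into FAILED.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: attempts the creation instead of checking beforehand.
final class CreateAttempt {

    enum Outcome { CREATED, ALREADY_EXISTS, FAILED }

    // Files.createFile checks for existence and creates the file in one atomic step.
    static Outcome tryCreate(Path path) {
        try {
            Files.createFile(path);
            return Outcome.CREATED;
        } catch (FileAlreadyExistsException ex) {
            return Outcome.ALREADY_EXISTS;
        } catch (IOException | SecurityException ex) {
            // For example: missing parent directory, no permission, or a full disk.
            return Outcome.FAILED;
        }
    }
}
```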

boolean canWrite(File file) {
if (file.exists()) {
return file.canWrite();
}
else {
try {
file.createNewFile();
file.delete();
return true;
}
catch (Exception e) {
return false;
}
}
}

Here's something you can do that works across operating systems: use a regex match to check for known invalid characters.
if (newName.matches(".*[/\n\r\t\0\f`?*\\<>|\":].*")) {
System.out.println("Invalid!");
} else {
System.out.println("Valid!");
}
Pros
This works across operating systems
You can customize it whatever way you want by editing that regex.
Cons
This might not be a complete list and need more research to fill in more invalid patterns or characters.

Just do it (and clean up after yourself)
A possible approach may be attempting to create the file and eventually deleting it if the creation succeeded, but I hope there is a more elegant way of achieving the same result.
Maybe that's the most robust way.
Below is canCreateOrIsWritable that determines whether your program is able to create a file and its parent directories at a given path, or, if there's already a file there, write to it.
It does so by actually creating the necessary parent directories as well as an empty file at the path. Afterwards, it deletes them (if there existed a file at the path, it's left alone).
Here's how you might use it:
var myFile = new File("/home/me/maybe/write/here.log");
if (canCreateOrIsWritable(myFile)) {
// We're good. Create the file or append to it
createParents(myFile);
appendOrCreate(myFile, "new content");
} else {
// Let's pick another destination. Maybe the OS's temporary directory:
var tempDir = System.getProperty("java.io.tmpdir");
var alternative = Paths.get(tempDir, "second_choice.log");
appendOrCreate(alternative, "new content in temporary directory");
}
The essential method with a few helper methods:
static boolean canCreateOrIsWritable(File file) {
boolean canCreateOrIsWritable;
// The non-existent ancestor directories of the file.
// The file's parent directory is first
List<File> parentDirsToCreate = getParentDirsToCreate(file);
// Create the parent directories that don't exist, starting with the one
// highest up in the file system hierarchy (closest to root, farthest
// away from the file)
reverse(parentDirsToCreate).forEach(File::mkdir);
try {
boolean wasCreated = file.createNewFile();
if (wasCreated) {
canCreateOrIsWritable = true;
// Remove the file and its parent dirs that didn't exist before
file.delete();
parentDirsToCreate.forEach(File::delete);
} else {
// There was already a file at the path → Let's see if we can
// write to it
canCreateOrIsWritable = java.nio.file.Files.isWritable(file.toPath());
}
} catch (IOException e) {
// File creation failed
canCreateOrIsWritable = false;
}
return canCreateOrIsWritable;
}
static List<File> getParentDirsToCreate(File file) {
var parentsToCreate = new ArrayList<File>();
File parent = file.getParentFile();
while (parent != null && !parent.exists()) {
parentsToCreate.add(parent);
parent = parent.getParentFile();
}
return parentsToCreate;
}
static <T> List<T> reverse(List<T> input) {
var reversed = new ArrayList<T>();
for (int i = input.size() - 1; i >= 0; i--) {
reversed.add(input.get(i));
}
return reversed;
}
static void createParents(File file) {
File parent = file.getParentFile();
if (parent != null) {
parent.mkdirs();
}
}
Keep in mind that between calling canCreateOrIsWritable and creating the actual file, the contents and permissions of your file system might have changed.
