I want to write a Java application that validates files and directories according to certain naming standards. The program would let you pick a directory and would analyze it recursively, giving a list of files/directories that do not match the given rules.
Eventually I want the user to be able to input rules, but for now they would be hard coded. Oh, and this would need to be cross-platform.
I have a working knowledge of basic Java constructs but have no experience with libraries and have not had much luck finding demos/code samples for this type of thing.
I would love suggestions for what trees to start barking up, pseudo-code -- whatever you feel would be helpful.
EDIT: I'm not trying to remove anything here, just get a recursive listing of any names that break certain rules (e.g. no spaces or special characters, no directories that start with uppercase) in the chosen directory.
I would suggest using Commons IO; I think DirectoryWalker will help you.
Here is the sample for checking and removing ".svn" directories:
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import org.apache.commons.io.DirectoryWalker;
public class FileCleaner extends DirectoryWalker {
public FileCleaner() {
super();
}
public List clean(File startDirectory) throws IOException {
List results = new ArrayList();
walk(startDirectory, results);
return results;
}
protected boolean handleDirectory(File directory, int depth, Collection results) {
// delete svn directories and then skip
if (".svn".equals(directory.getName())) {
directory.delete();
return false;
} else {
return true;
}
}
protected void handleFile(File file, int depth, Collection results) {
// delete file and add to list of deleted
file.delete();
results.add(file);
}
}
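For the listing-only case described in the edit to the question, here is a minimal sketch that uses only the JDK (java.nio.file, Java 8 or later) instead of Commons IO. The class name NameRuleChecker, the allowed-character pattern and the two rules are assumptions made up for illustration; they stand in for whatever naming standard you actually need. Nothing is deleted, the offending paths are only collected.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class NameRuleChecker {
    // Hypothetical hard-coded rule: only letters, digits, dot, underscore and hyphen.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9._-]+");

    /** Returns true if the name of the given file or directory breaks a rule. */
    static boolean breaksRules(Path path) {
        Path fileName = path.getFileName();
        if (fileName == null) {
            return false; // a file system root has no name to check
        }
        String name = fileName.toString();
        if (!ALLOWED.matcher(name).matches()) {
            return true; // contains a space or a special character
        }
        // Second hypothetical rule: directories must not start with an uppercase letter.
        return Files.isDirectory(path) && Character.isUpperCase(name.charAt(0));
    }

    /** Walks the tree below start and collects every offending path. */
    static List<Path> findViolations(Path start) throws IOException {
        try (Stream<Path> paths = Files.walk(start)) {
            return paths
                    .filter(p -> !p.equals(start)) // do not judge the chosen directory itself
                    .filter(NameRuleChecker::breaksRules)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path start = Paths.get(args.length > 0 ? args[0] : ".");
        findViolations(start).forEach(System.out::println);
    }
}
```

Because it relies only on java.nio.file, the sketch should behave the same on Windows, Linux and macOS. User-defined rules could later replace the hard-coded pattern.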
Let's say you have a task to read all files that are saved in some folder and process every single file. For simplicity's sake, let's say that all files are HTML files and you want to extract the HTML content from them.
In Java 8 there is the Files.walk API that allows us to do something like that. Here is an example:
try (Stream<Path> paths = Files.walk(Paths.get("/home/you/Desktop"))) {
paths
.filter(Files::isRegularFile)
.forEach(System.out::println);
}
This sounds really good if you have to process a small number of folders and files, but if you have millions of files distributed across several network drives, this process will take ages and obviously needs to be parallelised. Any ideas how to do parallelism in this case?
I don't think there is a simple general algorithm to solve your problem.
In fact, the general idea when dealing with a large amount of data distributed across many nodes is to let each node collect its own data and then process those partial results on a single node.
Doing all the scanning from a single system is going to be hard.
To do some real optimization you cannot treat all the folders in the same way.
What you could do is create a Collection of Paths that can be scanned in parallel.
So instead of walking from a single root node, you could start several walks from several folders (possibly one for each network drive).
For this to work you need to know which path is a network path and which is a local one.
If, for example, you have a folder where each child folder is a mounted network drive, you could easily collect all those folders and then run your walk in parallel for each.
I would do something similar to the following code:
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Stream;
public class ParallelWalks {
ExecutorService executor = Executors.newCachedThreadPool();
ExecutorService singleThreadExecutor = Executors.newSingleThreadExecutor();
public static void main(String[] args) {
new ParallelWalks().exec();
}
public ExecutorService executorSelector(Path path) {
if(isNetworkDrive(path)) {
return executor;
}else {
return singleThreadExecutor;
}
}
private boolean isNetworkDrive(Path path) {
// Here goes the logic to choose which path should go on a different
// thread.
return path.toString().contains("srv");
}
private void exec() {
Path path = Paths.get("/home/you/Desktop");
try (Stream<Path> files = Files.list(path)) {
files.forEach(this::taskRunner);
} catch (IOException e) {
// Do something with the exception
}
}
private void taskRunner(final Path path) {
executorSelector(path)
.submit(() -> doWalk(path));
}
private void doWalk(Path path) {
try (Stream<Path> paths = Files.walk(path)) {
paths.filter(Files::isRegularFile).forEach(System.out::println);
} catch (IOException e) {
// Do something with the exception
}
}
}
This way all your local directories will be processed sequentially, and each network drive will be processed on its own thread.
This works only if all (or most) of your network drives share the same parent mount point.
Otherwise you should implement your own walk.
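If the drives do not share a parent folder, one possible variation is to pass the roots in explicitly. The following is only a sketch under that assumption: the class MultiRootWalk and its method names are hypothetical, it starts one thread per root, and it waits for all walks to finish before shutting the pool down.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MultiRootWalk {
    /** Lists the regular files below one root. */
    static List<Path> regularFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).collect(Collectors.toList());
        }
    }

    /** Walks every root on its own thread and merges the results. */
    static List<Path> walkAll(List<Path> roots)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, roots.size()));
        try {
            List<Callable<List<Path>>> tasks = new ArrayList<>();
            for (Path root : roots) {
                tasks.add(() -> regularFiles(root));
            }
            List<Path> all = new ArrayList<>();
            // invokeAll blocks until every walk has finished
            for (Future<List<Path>> result : pool.invokeAll(tasks)) {
                all.addAll(result.get()); // a failed walk surfaces as ExecutionException
            }
            return all;
        } finally {
            pool.shutdown(); // let the JVM exit once the work is done
        }
    }
}
```

Whether this is actually faster depends on where the time goes. It only helps when the roots sit on independent devices or network shares.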
I want to create an application that shows a user how many times he opened or used the software. For this I have created the code below. But it is not showing correct output: when I run the application first it is showing 1 and then the second time I run it it is also showing 1.
public Founder() {
initComponents();
int c=0;
c++;
jLabel1.setText(""+c);
return;
}
I’m unsure whether I’m helping you or giving you a load of new problems and unanswered questions. The following will store the count of times the class Founder has been constructed in a file called useCount.txt in the program’s working directory (probably the root binary directory, where your .class files are stored). Next time you run the program, it will read the count from the file, add 1 and write the new value back to the file.
static final Path counterFile = FileSystems.getDefault().getPath("useCount.txt");
public Founder() throws IOException {
initComponents();
// read use count from file
int useCount;
if (Files.exists(counterFile)) {
List<String> line = Files.readAllLines(counterFile);
if (line.size() == 1) { // one line in file as expected
useCount = Integer.parseInt(line.get(0));
} else { // not the right file, ignore lines from it
useCount = 0;
}
} else { // program has never run before
useCount = 0;
}
useCount++;
jLabel1.setText(String.valueOf(useCount));
// write new use count back to file
Files.write(counterFile, Arrays.asList(String.valueOf(useCount)));
}
It’s not the most elegant nor robust solution, but it may get you started. If you run the program on another computer, it will not find the file and will start counting over from 0.
When you are running your code the first time, the data related to it will be stored in your system's RAM. Then when you close your application, all the data related to it will be deleted from the RAM (for simplicity let's just assume it will be deleted, although in reality it is a little different).
Now when you are opening your application second time, new data will be stored in the RAM. This new data contains the starting state of your code. So the value of c is set to 0 (c=0).
If you want to remember the data, you have to store it in the permanent storage (your system hard drive for example). But I think you are a beginner. These concepts are pretty advanced. You should do some basic programming practice before trying such things.
Here you need to store it on a permanent basis.
Refer to the Properties class to store data permanently: https://docs.oracle.com/javase/7/docs/api/java/util/Properties.html
You can also use data files, e.g. *.txt or *.csv.
Serialization also provides a way to store data persistently.
You can create a class that implements Serializable with a field for each piece of data you want to store. Then you can write the entire object out to a file and read it back in later. Learn about serialization here: https://www.tutorialspoint.com/java/java_serialization.htm
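As an illustration of the Properties option, here is a minimal sketch of a persistent counter. The class name UseCounter, the key "useCount" and the comment written into the file are assumptions chosen for this example, not part of any existing code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

class UseCounter {
    // "useCount" is a hypothetical key name chosen for this sketch.
    private static final String KEY = "useCount";

    /** Reads the stored count, adds one, writes it back and returns the new value. */
    static int increment(Path store) throws IOException {
        Properties props = new Properties();
        if (Files.exists(store)) {
            try (InputStream in = Files.newInputStream(store)) {
                props.load(in);
            }
        }
        int count;
        try {
            count = Integer.parseInt(props.getProperty(KEY, "0").trim());
        } catch (NumberFormatException e) {
            count = 0; // unreadable value: start counting again
        }
        count++;
        props.setProperty(KEY, String.valueOf(count));
        try (OutputStream out = Files.newOutputStream(store)) {
            props.store(out, "application usage counter");
        }
        return count;
    }
}
```

In the Founder constructor this could be used as jLabel1.setText(String.valueOf(UseCounter.increment(Paths.get("useCount.properties")))), with the file name again being just an example.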
I have seen two similar questions here on this topic, but none really helped me grasp the steps to solve this.
Given a queue and a rootNode, how can I iterate through it? I understand that first I have to enqueue the node I start with, but I am confused about how to implement the next() method. It also has to be a breadth-first traversal.
For the next() method I have this:
public File next() {
    while (peek() is a directory ("parent")) {
        use list() to return the array of files
        iterate through those and add them to the queue
        remove the first node
    }
    return peek();
}
It seems to work if I have a single file directory.
Also, I am looking for pseudocode, not code. I am just confused about whether I am on the right path or not.
If, for some reason, you insist on a non-recursive solution (although FileVisitor is definitely the way to go for this in Java), breadth-first search can be implemented non-recursively.
This is the general definition; marking is used to avoid circular references, although you won't have those in this case:
Enqueue root of directories and mark root as discovered
while queue is not empty
dequeue and process element
discover adjacent edges - children
for every child, if not marked already and is a directory, mark and queue
To get children you need: String[] directories = file.list().
To make sure you are queuing directories and not files
call file.isDirectory() and enqueue only directories.
You do not need to do marking before queuing as you won't have circular reference between directories.
Edit:
Here is a recursive breadth-first search; you can turn it into an iterative one using the pseudocode above.
import java.io.File;
import java.util.LinkedList;
import java.util.Queue;

public class BFSRecursive {
    public static void main(String[] args) {
        File file = new File("D:/Tomcat");
        Queue<File> files = new LinkedList<>();
        files.add(file);
        bfs(files);
    }

    private static void bfs(Queue<File> files) {
        if (files.isEmpty())
            return;
        File file = files.poll();
        System.out.println(file.getName());
        // list() returns null if file is not a directory or cannot be read
        String[] directories = file.list();
        if (directories != null) {
            for (String child : directories) {
                File childFile = new File(file, child);
                // enqueue only directories, as described above
                if (childFile.isDirectory())
                    files.add(childFile);
            }
        }
        bfs(files);
    }
}
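Since the question is about implementing next(), here is one possible iterative shape, written as an Iterator over a queue. It is a sketch, not the only way to do it: the class name BreadthFirstFileIterator is made up, and unlike the recursive example it returns plain files as well as directories.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Queue;

class BreadthFirstFileIterator implements Iterator<File> {
    private final Queue<File> queue = new ArrayDeque<>();

    BreadthFirstFileIterator(File root) {
        queue.add(root); // enqueue the node we start with
    }

    @Override
    public boolean hasNext() {
        return !queue.isEmpty();
    }

    @Override
    public File next() {
        File current = queue.poll(); // dequeue the oldest element
        if (current == null) {
            throw new NoSuchElementException();
        }
        // listFiles() returns null for plain files or on an I/O error
        File[] children = current.listFiles();
        if (children != null) {
            for (File child : children) {
                queue.add(child); // children wait behind everything already queued
            }
        }
        return current;
    }
}
```

The key point is that each call to next() removes exactly one element and enqueues only that element's children. Because the queue is first-in, first-out, everything at one depth is returned before anything deeper.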
I tried some code.
Well, the first part is the main method that basically lays down the directory structure. I tried to delete a directory that contains other directories using the rmdirs method I wrote below.
public static void rmdirs(File k)
{
String[] y= k.list();
int i;
File f;
for(i=0;i<y.length;i++)
{
f= new File(k,y[i]);
if(f.isDirectory() && f.list().length>0)
{
rmdirs(f);
}
else
{
f.delete();
}
}
k.delete();
}
The rmdirs method is working and seems to be doing what I expected, but how do I add this program to a library so that I can use it repeatedly by importing something?
Also, the above program does something like
rmdirs(f2);
to delete a file.
I would like it to be something like
f2.rmdirs();
And I am wondering how I can do it. I tried something like
import java.io.*;
public class RFile extends File
{
public RFile(String p)
{
super(p);
}
public RFile(File f1,String p1)
{
super(f1,p1);
}
public void rmdirs()
{
RFile k=this;
String[] y= k.list();
int i;
RFile f;
for(i=0;i<y.length;i++)
{
f= new RFile(k,y[i]);
if(f.isDirectory() && f.list().length>0)
{
f.rmdirs();
}
else
{
f.delete();
}
}
k.delete();
}
}
But then the tester class or main class becomes one in which I have to use RFile and not File.
This is a problem. Also, as I asked before, how do I add all of this to a library so that importing java.io.RFile or something like that will do the job?
You don't extend java.io.File (unless you have a very good reason, and this is not such a reason).
One solution is to create a class like "FileUtils" which has a static method "remove" so you can call:
FileUtils.remove(myFile);
It's a general design philosophy that you can find in, for example, the Apache libraries (e.g. http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/FileUtils.html).
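A static utility along those lines could be sketched as follows. The name MyFileUtils is hypothetical and was chosen to avoid clashing with the Apache class of the same idea. The sketch does not treat symbolic links specially, so use it with care on trees that contain links.

```java
import java.io.File;
import java.io.IOException;

final class MyFileUtils {
    private MyFileUtils() {
        // utility class, not meant to be instantiated
    }

    /** Deletes a file, or a directory together with everything below it. */
    static void remove(File target) throws IOException {
        // listFiles() returns null unless target is a readable directory
        File[] children = target.listFiles();
        if (children != null) {
            for (File child : children) {
                remove(child); // empty the directory before deleting it
            }
        }
        if (!target.delete() && target.exists()) {
            throw new IOException("Could not delete " + target);
        }
    }
}
```

Callers keep using plain java.io.File objects and write MyFileUtils.remove(f2). To reuse it across projects, the class can be put in its own package and packaged into a jar that other projects add to their classpath.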
UPDATE
A library is simply a reusable collection of code with a specific purpose.
Apache is a foundation that manages a lot of open source projects (many, if not all, of them Java-based). They provide high quality (though in a few cases outdated) software that can be reused. While you're at it you might want to take a peek at "Apache Maven", which handles the lifecycle of a project and makes library management easy (believe it or not, there is a whole repository with more than 600,000 libraries in it for you to use: http://mvnrepository.com/).
And this is just one (although the largest) repository...
Design philosophy is an enormous subject with as many opinions as there are coders. However there are some best practices that everyone adheres to.
Apache usually has pretty high quality code, so you can look at their projects, if not for the code itself, then at least for good examples of how to write libraries. Other than that I can only point you towards books and Google to find your way.
Writing maintainable code is more of an art than a science and it takes a lot of reading and practice to master it.
I am getting a list of files using the method File.listFiles() in java.io.File, but it returns some system files (such as .sys files). I need to exclude all system-related files (Windows, Linux, Mac) from the returned list. Can anyone solve my issue?
I'd implement a simple FileFilter with the logic to determine whether a file is a system file or not, and use an instance of it the way AlexR showed in his answer. Something like this (the rules are for demonstration purposes only!):
import java.io.File;
import java.io.FileFilter;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class IgnoreSystemFileFilter implements FileFilter {
    Set<String> systemFileNames = new HashSet<String>(Arrays.asList("sys", "etc"));

    @Override
    public boolean accept(File aFile) {
        // in my scenario: each hidden file starting with a dot is a "system file"
        if (aFile.getName().startsWith(".") && aFile.isHidden()) {
            return false;
        }
        // exclude known system files
        if (systemFileNames.contains(aFile.getName())) {
            return false;
        }
        // more rules / other rules
        // no rule matched, so this is not a system file
        return true;
    }
}
I don't think there is a general solution to this. For a start, operating systems such as Linux and MacOS don't have a clear notion of a "system file" or any obvious way to distinguish a system file from a non-system file.
I think your best bet is to decide what you mean by a system file, and write your own code to filter them out.
Generally, filtering of file lists is done by using a file filter.
new java.io.File("dir").listFiles(new FileFilter() {
    @Override
    public boolean accept(File pathname) {
        // add logic here that identifies the system files and returns false for them
        return true;
    }
});
The problem is how you define system files. If, for example, you want to filter out all files with the extension .sys, it is simple. If not, please define your criteria. If you have difficulty implementing your criteria, please ask a specific question.
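For the simple extension case, a minimal sketch could look like this. The class and method names are hypothetical, and treating ".sys" as the marker of a system file is only the example criterion mentioned above.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Locale;

class ExtensionFilterDemo {
    /** Accepts every file whose name does not end with the given extension, ignoring case. */
    static FileFilter excludeExtension(String extension) {
        String suffix = extension.toLowerCase(Locale.ROOT);
        return file -> !file.getName().toLowerCase(Locale.ROOT).endsWith(suffix);
    }

    public static void main(String[] args) {
        File[] visible = new File(".").listFiles(excludeExtension(".sys"));
        if (visible != null) { // listFiles returns null if the directory cannot be read
            for (File file : visible) {
                System.out.println(file.getName());
            }
        }
    }
}
```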
As others have pointed out, some operating systems do not have a definition for "system file".
However, if you are using Java 7 or later, there is a new API called NIO.2 (java.nio.file) which might help you on Windows:
Path srcFile = Paths.get("test");
DosFileAttributes dfa = Files.readAttributes(srcFile, DosFileAttributes.class);
System.out.println("isSystem? " + dfa.isSystem());
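Files.readAttributes is documented to throw UnsupportedOperationException when the requested attribute type is not supported, so on other platforms the call above may fail. A defensive wrapper, sketched here with the hypothetical names SystemFileCheck and isDosSystemFile, treats "cannot tell" as "not a system file". That fallback is an assumption you may want to change.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.DosFileAttributes;

class SystemFileCheck {
    /** Returns true only if the DOS "system" attribute can be read and is set. */
    static boolean isDosSystemFile(Path path) {
        try {
            return Files.readAttributes(path, DosFileAttributes.class).isSystem();
        } catch (UnsupportedOperationException | IOException e) {
            // The file system has no DOS attributes, or they could not be read.
            return false;
        }
    }
}
```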