Is there built-in Java code that will scan a given folder and search it for .txt files?
You can use the listFiles() method provided by the java.io.File class.
import java.io.File;
import java.io.FilenameFilter;
public class Filter {
public File[] finder( String dirName){
File dir = new File(dirName);
return dir.listFiles(new FilenameFilter() {
public boolean accept(File dir, String filename)
{ return filename.endsWith(".txt"); }
} );
}
}
Try:
List<String> textFiles(String directory) {
List<String> textFiles = new ArrayList<String>();
File dir = new File(directory);
for (File file : dir.listFiles()) {
if (file.getName().endsWith((".txt"))) {
textFiles.add(file.getName());
}
}
return textFiles;
}
If you want a case-insensitive search, use this condition instead:
if (file.getName().toLowerCase().endsWith((".txt"))) {
If you want to search recursively through a directory tree for text files, you should be able to adapt the above as either a recursive function or an iterative function using a stack.
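The iterative, stack-based variant mentioned above could look roughly like the sketch below. The class and method names (TextFileSearch, findTextFiles) are made up for illustration, and the case-insensitive ".txt" check is carried over from the previous snippet.

```java
import java.io.File;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of any library.
final class TextFileSearch {

    /** Collects every file under root whose name ends with ".txt", ignoring case. */
    static List<File> findTextFiles(File root) {
        List<File> result = new ArrayList<>();
        Deque<File> pending = new ArrayDeque<>(); // directories still to be scanned
        pending.push(root);
        while (!pending.isEmpty()) {
            File dir = pending.pop();
            File[] children = dir.listFiles(); // null if dir is not a readable directory
            if (children == null) {
                continue;
            }
            for (File child : children) {
                if (child.isDirectory()) {
                    pending.push(child);
                } else if (child.getName().toLowerCase(Locale.ROOT).endsWith(".txt")) {
                    result.add(child);
                }
            }
        }
        return result;
    }
}
```

Calling TextFileSearch.findTextFiles(new File("some/dir")) returns the matches in no particular order. The sketch follows directory symlinks, so a link that points back up the tree could make it loop; guarding against that is left out here.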
import org.apache.commons.io.filefilter.WildcardFileFilter;
.........
.........
File dir = new File(fileDir);
FileFilter fileFilter = new WildcardFileFilter("*.txt");
File[] files = dir.listFiles(fileFilter);
The code above works great for me
It's really useful, I used it with a slight change:
filename=directory.list(new FilenameFilter() {
public boolean accept(File dir, String filename) {
return filename.startsWith(ipro);
}
});
I based my solution on posts I found here via Google, and I see no harm in posting it even though this is an old thread.
Its only advantage is that it also iterates through subdirectories.
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.commons.io.filefilter.DirectoryFileFilter;
import org.apache.commons.io.filefilter.WildcardFileFilter;
Method is as follows:
List <File> exploreThis(String dirPath){
File topDir = new File(dirPath);
List<File> directories = new ArrayList<>();
directories.add(topDir);
List<File> textFiles = new ArrayList<>();
List<String> filterWildcards = new ArrayList<>();
filterWildcards.add("*.txt");
filterWildcards.add("*.doc");
FileFilter typeFilter = new WildcardFileFilter(filterWildcards);
while (directories.isEmpty() == false)
{
List<File> subDirectories = new ArrayList<>();
for(File f : directories)
{
subDirectories.addAll(Arrays.asList(f.listFiles((FileFilter)DirectoryFileFilter.INSTANCE)));
textFiles.addAll(Arrays.asList(f.listFiles(typeFilter)));
}
directories.clear();
directories.addAll(subDirectories);
}
return textFiles;
}
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
public class FileFinder extends SimpleFileVisitor<Path> {
private PathMatcher matcher;
public ArrayList<Path> foundPaths = new ArrayList<>();
public FileFinder(String pattern) {
matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
}
@Override
public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
Path name = file.getFileName();
if (matcher.matches(name)) {
foundPaths.add(file);
}
return FileVisitResult.CONTINUE;
}
}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
public class Main {
public static void main(String[] args) throws IOException {
Path fileDir = Paths.get("files");
FileFinder finder = new FileFinder("*.txt");
Files.walkFileTree(fileDir, finder);
ArrayList<Path> foundFiles = finder.foundPaths;
if (foundFiles.size() > 0) {
for (Path path : foundFiles) {
System.out.println(path.toRealPath(LinkOption.NOFOLLOW_LINKS));
}
} else {
System.out.println("No files were found!");
}
}
}
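On Java 8 or later, the visitor-based search above can probably be written more compactly with Files.walk. This is only a sketch: GlobWalk and find are hypothetical names, and the glob is matched against the file name alone, as in FileFinder.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
final class GlobWalk {

    /** Returns all regular files under root whose file name matches the glob pattern. */
    static List<Path> find(Path root, String pattern) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        // Files.walk keeps directory handles open, so the stream must be closed.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> matcher.matches(p.getFileName()))
                    .collect(Collectors.toList());
        }
    }
}
```

For example, GlobWalk.find(Paths.get("files"), "*.txt") would cover the same case as the Main class above. Unlike a FileVisitor, Files.walk gives no hook for skipping a directory that cannot be read; an access error surfaces as an exception while the stream is consumed.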
import org.apache.commons.io.FileUtils;
// FileUtils.listFiles returns a Collection<File>, so copy it instead of casting it to List.
List<File> htmFileList = new ArrayList<File>(
FileUtils.listFiles(new File(srcDir), new String[]{"txt", "TXT"}, true));
This is my latest code for collecting all text files from a directory.
Here is my platform-specific code (Unix):
public static List<File> findFiles(String dir, String... names)
{
LinkedList<String> command = new LinkedList<String>();
command.add("/usr/bin/find");
command.add(dir);
List<File> result = new LinkedList<File>();
if (names.length > 1)
{
List<String> newNames = new LinkedList<String>(Arrays.asList(names));
String first = newNames.remove(0);
command.add("-name");
command.add(first);
for (String newName : newNames)
{
command.add("-or");
command.add("-name");
command.add(newName);
}
}
else if (names.length > 0)
{
command.add("-name");
command.add(names[0]);
}
try
{
ProcessBuilder pb = new ProcessBuilder(command);
Process p = pb.start();
// Read the output before waiting: waiting first can block once find fills the pipe buffer.
InputStream is = p.getInputStream();
InputStreamReader isr = new InputStreamReader(is);
BufferedReader br = new BufferedReader(isr);
String line;
while ((line = br.readLine()) != null)
{
// System.err.println(line);
result.add(new File(line));
}
p.waitFor();
p.destroy();
}
catch (Exception e)
{
e.printStackTrace();
}
return result;
}
I am trying to read a folder containing .sql files. I can already get those files into a list; now I need to read every file and look for a particular word, such as join. If join is present in a file, I want to return the file name, otherwise discard it. Can someone please help me with this?
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;
public class Filter {
public static List<String> textFiles(String directory) {
List<String> textFiles = new ArrayList<String>();
File dir = new File(directory);
for (File file : dir.listFiles()) {
if (file.getName().endsWith((".sql"))) {
textFiles.add(file.getName());
}
}
return textFiles;
}
public static void getfilename(String directory) throws IOException {
List<String> textFiles = textFiles(directory);
for (String string : textFiles) {
Path path = Paths.get(string);
try (Stream<String> streamOfLines = Files.lines(path)) {
Optional<String> line = streamOfLines.filter(l -> l.contains("join")).findFirst();
if (line.isPresent()) {
System.out.println(path.getFileName());
} else
System.out.println("Not found");
} catch (Exception e) {
}
}
}
public static void main(String[] args) throws IOException {
getfilename("/home/niteshb/wave1-master/wave1/sql/scripts");
}
}
You can search for a word in a file as shown below; pass in the path of the file:
try(Stream <String> streamOfLines = Files.lines(path)) {
Optional <String> line = streamOfLines.filter(l ->
l.contains(searchTerm))
.findFirst();
if(line.isPresent()){
System.out.println(line.get()); // you can add return true or false
}else
System.out.println("Not found");
}catch(Exception e) {}
}
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;
public class Filter {
public static List<String> textFiles(String directory) {
List<String> textFiles = new ArrayList<String>();
File dir = new File(directory);
for (File file : dir.listFiles()) {
if (file.getName().endsWith((".sql"))) {
textFiles.add(file.getAbsolutePath());
}
}
System.out.println(textFiles.size());
return textFiles;
}
public static String getfilename(String directory) throws IOException {
List<String> textFiles = textFiles(directory);
for (String string : textFiles) {
Path path = Paths.get(string);
try (Stream<String> streamOfLines = Files.lines(path)) {
Optional<String> line = streamOfLines.filter(l -> l.contains("join")).findFirst();
if (line.isPresent()) {
System.out.println(path.getFileName());
} else
System.out.println("");
} catch (Exception e) {
}
}
return directory;
}
public static void main(String[] args) throws IOException {
getfilename("/home/wave1-master/wave1/sql/");
}
}
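The code above prints each matching name. If the names should be returned instead, as the question asks, a sketch along the following lines may help. SqlKeywordSearch and filesContaining are hypothetical names, and matching the word case-insensitively is an assumption made here because SQL keywords are often written in upper case.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper, not part of any library.
final class SqlKeywordSearch {

    /** Returns the .sql files directly inside dir that contain term, ignoring case. */
    static List<Path> filesContaining(Path dir, String term) throws IOException {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> sqlFiles = Files.newDirectoryStream(dir, "*.sql")) {
            for (Path file : sqlFiles) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip directories that happen to be named *.sql
                }
                // Files.lines decodes as UTF-8 and stops reading at the first match.
                try (Stream<String> lines = Files.lines(file)) {
                    if (lines.anyMatch(l -> l.toLowerCase(Locale.ROOT).contains(needle))) {
                        matches.add(file);
                    }
                }
            }
        }
        return matches;
    }
}
```

This is a plain substring test, so "join" also matches words such as "joined". Errors propagate to the caller instead of being swallowed by an empty catch block, which makes problems such as a wrong path visible.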
I would like to group specific files from multiple paths based on their file names. I have followed this Stack Overflow link, but after I start streaming a path I have not been able to loop through each file to find that specific file name.
Here are the file paths:
/var/tmp/data_sample1/data2_first_example.set.csv
/var/tmp/data_sample1/data3_first_example.set.csv
/var/tmp/data_sample1/data1_first_example.set.csv
/var/tmp/data_sample2/data2_second_example.set.csv
/var/tmp/data_sample2/data1_second_example.set.csv
/var/tmp/data_sample2/data3_second_example.set.csv
/tmp/csv_files/data_sample3/data2_third_example.set.csv
/tmp/csv_files/data_sample3/data1_third_example.set.csv
/tmp/csv_files/data_sample3/data3_third_example.set.csv
Enum Class:
enum PersonType {
A,
B
}
FileName.java
import java.util.Arrays;
import java.util.List;
public class FileName {
private final String first = "_first_sample";
private final String second = "_second_sample";
private final String third = "_third_sample";
private final List<String> filenames = Arrays.asList(first, second, third);
public List<String> getFilenames() {
return filenames;
}
}
CSVFiles.java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.*;
import java.util.stream.Collectors;
public class CSVFiles {
private PersonType personType;
private List<String> fileNames = new ArrayList<>();
private List<File> firstSample = new ArrayList<>();
private List<File> secondSample = new ArrayList<>();
private List<File> thirdSample = new ArrayList<>();
public CSVFiles(PersonType personType, List<String> paths) {
if (personType == PersonType.A) {
this.personType = personType;
FileName fileName = new FileName();
this.fileNames = fileName.getFilenames();
setCSVFiles(paths);
}
}
public List<File> setCSVFiles(List<String> paths) {
List<Path> collect = paths.stream()
.flatMap(path -> {
try {
return Files.find(Paths.get(path), Integer.MAX_VALUE,
(p, attrs) -> attrs.isRegularFile()
&& p.toString().contains(".set")
&& p.toString().endsWith(".csv")
);
} catch (IOException ex) {
throw new UncheckedIOException(ex);
}
}).collect(Collectors.toList());
return collect.stream()
.map(Path::toFile)
.filter(file -> {
if (file.getName().contains("_first_sample")) {
firstSample.add(file);
return true;
} else if (file.getName().contains("_second_sample")) {
secondSample.add(file);
return true;
} else if (file.getName().contains("_third_sample")) {
thirdSample.add(file);
return true;
}
return false;
})
.collect(Collectors.toList());
}
}
CSVFilesTest.java
import org.junit.Test;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.*;
public class CSVFilesTest {
@Test
public void test() {
String data_sample1 = "/var/tmp/data_sample1";
String data_sample2 = "/var/tmp/data_sample2";
String data_sample3 = "/tmp/csv_files/data_sample3";
List<String> paths = Arrays.asList(data_sample1, data_sample2, data_sample3);
System.out.println(paths);
CSVFiles csvFiles = new CSVFiles(PersonType.A, paths);
}
}
Desired Output:
firstSample: [data1_first_example.set.csv, data2_first_example.set.csv, data3_first_example.set.csv]
secondSample: [data1_second_example.set.csv, data2_second_example.set.csv, data3_second_example.set.csv]
thirdSample: [data1_third_example.set.csv, data2_third_example.set.csv, data3_third_example.set.csv]
Any feedback is appreciated!
Solution thanks to "sync it" comments:
public Map<String, List<String>> setCSVFiles(List<String> paths) {
List<Path> collect = paths.stream()
.flatMap(path -> {
try {
return Files.find(Paths.get(path), Integer.MAX_VALUE,
(p, attrs) -> attrs.isRegularFile()
&& p.toString().contains(".set")
&& p.toString().endsWith(".csv")
);
} catch (IOException ex) {
throw new UncheckedIOException(ex);
}
}).collect(Collectors.toList());
return collect.stream()
.map(Path::toString)
.collect(Collectors.groupingBy(path ->
path.substring(path.lastIndexOf("/")+1)
));
}
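To get groups shaped like the desired output (one list per sample), the grouping key could be derived from the file name rather than being the whole file name. The sketch below works on path strings only. SampleGrouping, sampleKey and groupBySample are hypothetical names, and it assumes every file name follows the dataN_<sample>_example.set.csv pattern listed in the question.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of any library.
final class SampleGrouping {

    /** Strips the directory part, e.g. "/var/tmp/x/data1_first_example.set.csv" -> "data1_first_example.set.csv". */
    static String fileName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Assumed key: the token between the first and second underscore, e.g. "first". */
    static String sampleKey(String name) {
        String[] parts = name.split("_");
        return parts.length >= 3 ? parts[1] : "unknown";
    }

    /** Groups file names by sample key; keys and names come back in sorted order. */
    static Map<String, List<String>> groupBySample(List<String> paths) {
        return paths.stream()
                .map(SampleGrouping::fileName)
                .sorted()
                .collect(Collectors.groupingBy(
                        SampleGrouping::sampleKey,
                        TreeMap::new,
                        Collectors.toList()));
    }
}
```

With the nine paths from the question this yields the keys first, second and third, each mapped to its three data1, data2 and data3 file names. The list of paths can come from the Files.find step shown above after mapping each Path with Path::toString.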
I have the comma-separated string below, and I want to check whether each given file or directory exists.
private static String dir = "/Users/swapnil.kotwal/Swapnil/myproject/build/WEB-INF/classes/test/";
private static String csvConnClasses = dir + "FirstTest*.class,"+ dir+"SecondTest.class,"+dir+"abcd/";
I tried the piece of code below, but when I run it through Ant I get the exception java.lang.NoClassDefFoundError: org/aspectj/lang/Signature:
File dir = new File(cls.substring(0, cls.lastIndexOf("/")));
String[] splits = dir.getAbsolutePath().split(dir.getPath());
String basePath = splits[0] + "build/WEB-INF/classes/" + dir.getPath();
dir = new File(basePath);
if (dir.exists() && dir.isDirectory() && dir.list().length > 0) {
final String className = getClassName(new File(cls));
File[] files = dir.listFiles(new FileFilter() {
public boolean accept(File file) {
System.out.println("File Name >>> " + file.getName());
return (file.getName().startsWith(className) && file.getName().endsWith(".class"));
}
});
if (files.length == 0) {
throw new BuildException(cls + " class not found - ");
}
if (classSet.contains(cls)) {
dups.add(cls);
}
classSet.add(cls);
} else
throw new BuildException(cls + " directory not found - ");
}
Can somebody suggest an implementation using PathMatcher or a regex to check whether the given files and folders exist?
I'm planning to use a Java NIO program that searches file entries with a glob pattern:
package com.test.inspector;
import java.io.File;
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.StringTokenizer;
public class SearchFile {
private static String dir = "/Users/swapnil.kotwal/Swapnil/myproject/build/WEB-INF/classes/test/";
private static String csvConnClasses = dir + "FirstTest*.class,"+ dir+"SecondTest.class,"+dir+"abcd/";
public static class SearchFileVisitor extends SimpleFileVisitor<Path> {
private final PathMatcher pathMatcher;
private int matchCount = 0;
SearchFileVisitor(String globPattern) {
pathMatcher = FileSystems.getDefault().getPathMatcher(
"glob:" + globPattern);
}
@Override
public FileVisitResult visitFile(Path filePath,
BasicFileAttributes basicFileAttrib) {
if (pathMatcher.matches(filePath.getFileName())) {
matchCount++;
System.out.println(filePath);
}
return FileVisitResult.CONTINUE;
}
@Override
public FileVisitResult preVisitDirectory(Path directoryPath,
BasicFileAttributes basicFileAttrib) {
if (pathMatcher.matches(directoryPath.getFileName())) {
matchCount++;
System.out.println(directoryPath);
}
return FileVisitResult.CONTINUE;
}
public int getMatchCount() {
return matchCount;
}
}
public static void main(String[] args) throws IOException {
if (null != csvConnClasses) {
StringTokenizer st = new StringTokenizer(csvConnClasses, ",");
while (st.hasMoreTokens()) {
String cls = st.nextToken();
// Removes all whitespaces and non-visible characters like tab,
// \n etc.
cls = cls.replaceAll("\\s+", "");
Path rootPath = FileSystems.getDefault().getPath( cls.substring(0, cls.lastIndexOf("/")) );
String globPattern = (new File(cls)).getName();
SearchFileVisitor searchFileVisitor = new SearchFileVisitor(globPattern);
Files.walkFileTree(rootPath, searchFileVisitor);
System.out.println("Match Count: " + searchFileVisitor.getMatchCount());
}
}
}
}
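If only a yes/no answer per entry is needed, walking the whole tree may be more than necessary. A possible shortcut, sketched below, is to list just the parent directory with a glob through Files.newDirectoryStream. EntryCheck and exists are hypothetical names, and the sketch assumes entries use "/" as the separator and that a trailing "/" marks a directory, as in csvConnClasses.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library.
final class EntryCheck {

    /**
     * Returns true if entry names an existing directory (trailing "/"),
     * or if at least one entry of its parent directory matches the glob in its last segment.
     */
    static boolean exists(String entry) throws IOException {
        String trimmed = entry.trim();
        if (trimmed.endsWith("/")) {
            return Files.isDirectory(Paths.get(trimmed));
        }
        int slash = trimmed.lastIndexOf('/');
        Path parent = Paths.get(slash >= 0 ? trimmed.substring(0, slash + 1) : ".");
        String glob = trimmed.substring(slash + 1);
        if (!Files.isDirectory(parent)) {
            return false;
        }
        // Only the parent directory is listed; subdirectories are not searched.
        try (DirectoryStream<Path> matches = Files.newDirectoryStream(parent, glob)) {
            return matches.iterator().hasNext();
        }
    }
}
```

Each token from the StringTokenizer loop above could be passed to EntryCheck.exists. A plain name such as SecondTest.class is treated as a glob too, which works as long as the name contains no glob metacharacters such as [ or {.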
I have code here that reads every .txt file in one folder, prints the content of each file to the console, and then moves the files to a new folder.
The problem is that the files are read in arbitrary order, but I want to read them by timestamp, so that the most recently edited file is read first.
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class Basic {
public static void main(String[] args) throws IOException {
String source = "C:\\Users\\NN\\Documents\\Test1";
String target = "C:\\Users\\NN\\Documents\\Test2";
List<Path> filePaths = filePathsList(source); // Step 1: get all files from a directory
List<Path> filteredFilePaths = filter(filePaths); // Step 2: filter by ".txt"
Map<Path, List<String>> contentOfFiles = getContentOfFiles(filteredFilePaths); // Step 3: get content of files
move(filteredFilePaths, target); // Step 4: move files to destination
printToConsole(contentOfFiles); // Step 5: print content of files to console
}
public static List<Path> filePathsList(String directory) throws IOException {
List<Path> filePaths = new ArrayList<>();
DirectoryStream<Path> directoryStream = Files.newDirectoryStream(FileSystems.getDefault().getPath(directory));
for (Path path : directoryStream) {
filePaths.add(path);
}
return filePaths;
}
private static List<Path> filter(List<Path> filePaths) {
List<Path> filteredFilePaths = new ArrayList<>();
for (Path filePath : filePaths) {
if (filePath.getFileName().toString().endsWith(".txt")) {
filteredFilePaths.add(filePath);
}
}
return filteredFilePaths;
}
private static Map<Path, List<String>> getContentOfFiles(List<Path> filePaths) throws IOException {
Map<Path, List<String>> contentOfFiles = new HashMap<>();
for (Path filePath : filePaths) {
contentOfFiles.put(filePath, new ArrayList<>());
Files.readAllLines(filePath).forEach(contentOfFiles.get(filePath)::add);
}
return contentOfFiles;
}
private static void move(List<Path> filePaths, String target) throws IOException {
Path targetDir = FileSystems.getDefault().getPath(target);
if (!Files.isDirectory(targetDir)) {
targetDir = Files.createDirectories(Paths.get(target));
}
for (Path filePath : filePaths) {
System.out.println("Moving " + filePath.getFileName() + " to " + targetDir.toAbsolutePath());
Files.move(filePath, Paths.get(target, filePath.getFileName().toString()), StandardCopyOption.ATOMIC_MOVE);
}
}
private static void printToConsole(Map<Path, List<String>> contentOfFiles) {
System.out.println("Content of files:");
contentOfFiles.forEach((k,v) -> v.forEach(System.out::println));
}
}
You could use File.lastModified() and sort the files by that timestamp.
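Since the question's code works with Path rather than File, the same idea can be sketched with Files.getLastModifiedTime, the NIO counterpart of File.lastModified(). ModifiedOrder and newestFirst are hypothetical names, and "most recently modified first" is the reading of the question assumed here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any library.
final class ModifiedOrder {

    /** Returns a new list ordered from most recently modified to least recently modified. */
    static List<Path> newestFirst(List<Path> paths) throws IOException {
        // Read each timestamp once up front so the comparator does no I/O.
        Map<Path, FileTime> modified = new HashMap<>();
        for (Path p : paths) {
            modified.put(p, Files.getLastModifiedTime(p));
        }
        List<Path> sorted = new ArrayList<>(paths);
        sorted.sort(Comparator.comparing((Path p) -> modified.get(p)).reversed());
        return sorted;
    }
}
```

In the Basic class above this would be applied to filteredFilePaths before the contents are read. For the printed output to follow that order, contentOfFiles would also need to be a LinkedHashMap, because HashMap does not preserve insertion order.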
I have a directory with files, directories, subdirectories, etc. How can I get the list of absolute paths to all files and directories using the Apache Hadoop API?
Using the HDFS API:
package org.myorg.hdfsdemo;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
public class HdfsDemo {
public static void main(String[] args) throws IOException {
Configuration conf = new Configuration();
conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/core-site.xml"));
conf.addResource(new Path("/Users/miqbal1/hadoop-eco/hadoop-1.1.2/conf/hdfs-site.xml"));
FileSystem fs = FileSystem.get(conf);
System.out.println("Enter the directory name :");
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
Path path = new Path(br.readLine());
displayDirectoryContents(fs, path);
}
private static void displayDirectoryContents(FileSystem fs, Path rootDir) {
// TODO Auto-generated method stub
try {
FileStatus[] status = fs.listStatus(rootDir);
for (FileStatus file : status) {
if (file.isDir()) {
System.out.println("This is a directory:" + file.getPath());
displayDirectoryContents(fs, file.getPath());
} else {
System.out.println("This is a file:" + file.getPath());
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
}
Write a recursive function that takes a file and checks whether it is a directory. If it is, list all the entries in it and, in a for loop, recurse into each entry that is itself a directory; otherwise add the file to the returned list.
Something like the code below, though not exactly the same (here I return only .java files):
private static List<File> recursiveDir(File file) {
    List<File> returnList = new ArrayList<File>();
    // listFiles() returns null if file is not a directory or cannot be read
    File[] files = file.listFiles();
    if (files == null) {
        return returnList;
    }
    for (File f : files) {
        if (f.isDirectory()) {
            // descend into the subdirectory and keep whatever it contains
            returnList.addAll(recursiveDir(f));
        } else if (f.getName().endsWith(".java")) {
            returnList.add(f);
        }
    }
    return returnList;
}
With HDFS you can use hadoop fs -lsr . (newer Hadoop releases deprecate -lsr in favor of hadoop fs -ls -R).