I want to split a text file into multiple text files - Java

Hi, I have a text file containing tag-based data, and I want to split it into multiple text files.
The main text file has data like this:
==========110CYL067.txt============
<Entity Text>Cornell<Entity Type>Person
<Entity Text>Donna<Entity Type>Person
<Entity Text>Sherry<Entity Type>Person
<Entity Text>Goodwill<Entity Type>Organization
==========110CYL068.txt============
<Entity Text>Goodwill Industries Foundation<Entity Type>Organization
<Entity Text>Goodwill<Entity Type>Organization
NOTE: Here 110CYL067.txt and 110CYL068.txt are the names of the text files.
I want to split this file into 110CYL067.txt, 110CYL068.txt, and so on.
The ============ pattern is fixed: each header has the form ============ FileName ============,
and the file name could be anything. Does anyone have any insight?

I don't want to write the code for you, but here is the approach: read the file line by line using a BufferedReader wrapped around a FileReader. Whenever you see a line starting with ======= (or containing .txt), create a new File and write the lines that follow to it with any file writer.
When you encounter the next such line, close the previous file and repeat the process.
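The approach described above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the class name HeaderSplitter, the helper headerName, and the outDir parameter are made up for illustration, and it assumes that a header line both starts and ends with "=" and that the file names contain no path separators. Lines that appear before the first header are ignored.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names here are illustrative only.
final class HeaderSplitter {

    // Returns the file name from a header such as
    // "==========110CYL067.txt============", or null if the line is not a header.
    static String headerName(String line) {
        if (!line.startsWith("=") || !line.endsWith("=")) {
            return null;
        }
        // Strip the runs of '=' at both ends of the line.
        String name = line.replaceAll("^=+|=+$", "").trim();
        return name.isEmpty() ? null : name;
    }

    // Reads input line by line. Each header closes the current output file
    // and opens a new one, named after the header, inside outDir.
    static void split(Path input, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        BufferedWriter out = null;
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String name = headerName(line);
                if (name != null) {
                    if (out != null) {
                        out.close();
                    }
                    out = Files.newBufferedWriter(outDir.resolve(name), StandardCharsets.UTF_8);
                } else if (out != null) {
                    out.write(line);
                    out.newLine();
                }
            }
        } finally {
            // Close the last output file, if any was opened.
            if (out != null) {
                out.close();
            }
        }
    }
}
```

You would call it with something like HeaderSplitter.split(Paths.get("main.txt"), Paths.get("out")), where both paths are placeholders for your own locations.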

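Done. The other suggestions seemed too complicated, so I just did it quick and dirty.
import java.io.*;
import java.util.*;
// The class name is arbitrary; the original snippet omitted the declaration.
public class SplitFile {
public static List<String> lines = new ArrayList<String>();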
public static String pattern = "==========";
public static void main(String[] args) throws IOException {
addLines(importFile());
}
private static List<String> importFile() throws FileNotFoundException, IOException {
BufferedReader br = new BufferedReader(new FileReader("C:\\temp\\test.txt"));
try {
StringBuilder sb = new StringBuilder();
String line = br.readLine();
while (line != null) {
lines.add(line.replaceFirst(pattern, ";") + "\n");
line = br.readLine();
}
} finally {
br.close();
}
return lines;
}
private static void addLines(List<String> list) throws IOException {
String FilesString = list.toString();
System.out.println(FilesString);
String[] FilesArray = FilesString.split(";");
for (String string : FilesArray) {
createFile(string);
}
}
private static void createFile(String content) throws IOException {
String[] Lines = content.replaceAll("=", "").split("\n");
File file = new File("C:\\temp\\" + Lines[0]);
file.createNewFile();
FileWriter writer = new FileWriter(file);
Lines[0] = null;
for (String Line : Lines) {
if (Line != null) {
writer.append(Line.replace(",", "")+"\n");
}
}
writer.flush();
writer.close();
}
}

Also quick and dirty, not using regex. I don't really recommend doing it like this because the for loop in main is quite confusing and could break, but it might be beneficial to use this for ideas.
import java.io.*;
import java.util.*;
class splitFiles {
public static void main(String[] args){
try {
List<String> fileRead = readFiles("some.txt");
for(int i=0; i<fileRead.size(); i++){
if(fileRead.get(i).startsWith("=")){ // startsWith is safe on empty lines, unlike charAt(0)
PrintWriter writer = new PrintWriter(getFileName(fileRead.get(i)), "UTF-8");
for(int j=i+1; j<fileRead.size(); j++){
if(fileRead.get(j).startsWith("=")){
break;
} else {
writer.println(fileRead.get(j));
}
}
writer.close();
}
}
} catch (Exception e){
e.printStackTrace(); // report failures instead of silently swallowing them
}
}
public static String getFileName(String fileLine){
String[] split = fileLine.split("=");
for(String e: split){
if(e.isEmpty()){
continue;
} else {
return e;
}
}
return "No file name found";
}
public static ArrayList<String> readFile(String path){
try {
Scanner s = new Scanner(new File(path));
ArrayList<String> list = new ArrayList<String>();
while(s.hasNext()){
list.add(s.next());
}
s.close();
return list;
} catch (FileNotFoundException f){
System.out.println("File not found.");
}
return null;
}
static List<String> readFiles(String fileName) throws IOException {
List<String> words = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new FileReader(fileName));
String line;
while ((line = reader.readLine()) != null) {
words.add(line);
}
reader.close();
return words;
}
}

How to remove row which contains blank cell from csv file in Java

I'm trying to clean a dataset, by which I mean removing any row that contains NaN, a duplicate value, or an empty cell. My code is below.
The dataset looks like this:
Sno Country noofDeaths
1 32432
2 Pakistan NaN
3 USA 3332
3 USA 3332
public class data_reader {
String filePath="src\\abc.csv";
public void readData() {
BufferedReader br = null;
String line = "";
HashSet<String> lines = new HashSet<>();
try {
br = new BufferedReader(new FileReader(filePath));
while ((line = br.readLine()) != null) {
if(!line.contains("NaN") || !line.contains("")) {
if (lines.add(line)) {
System.out.println(line);
}
}
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (br != null) {
try {
br.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
}
}
It works fine for NaN values and duplicate rows, but not for empty cells. Please help me fix this.
!line.contains("")
This check is not working.
The condition !line.contains("") doesn't make sense, because every string contains the empty string, so the check is always false.
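To make that concrete, here is a small sketch. The class name BlankCellCheck and the helper hasBlankCell are hypothetical, and it assumes the cells are separated by commas, as the .csv extension suggests. It shows that contains("") is true for any string, and one way to test the individual cells instead.

```java
// Hypothetical demo class; names are illustrative only.
final class BlankCellCheck {

    // True when any comma-separated cell is empty or whitespace-only.
    // The -1 limit keeps trailing empty cells that split(",") would drop.
    static boolean hasBlankCell(String line) {
        for (String cell : line.split(",", -1)) {
            if (cell.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Every string contains the empty string, so this prints true.
        System.out.println("3,USA,3332".contains(""));
        // Checking the cells one by one gives the intended answer.
        System.out.println(hasBlankCell("1,,32432"));   // true: the country is missing
        System.out.println(hasBlankCell("3,USA,3332")); // false: all cells are filled
    }
}
```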
General suggestions:
don't hard code file-path, code must be reusable;
use try with resources;
camel-case names.
public class DataReader {
public static void main(String[] args) {
new DataReader().readData("src\\abc.csv");
}
public void readData(String filePath) {
try(BufferedReader br = new BufferedReader(new FileReader(filePath))) {
HashSet<String> lines = new HashSet<>();
String line = null;
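while ((line = br.readLine()) != null) {
// A row is dropped if any cell is blank; the -1 limit keeps trailing empty cells
boolean hasBlankCell = false;
for (String cell: line.split(",", -1)) {
if (cell.isBlank()) {
hasBlankCell = true;
}
}
// Print each clean row once: no NaN, no blank cell, not seen before
if(!line.contains("NaN") && !hasBlankCell && lines.add(line)) {
System.out.println(line);
}
}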
} catch (IOException e) {
e.printStackTrace();
}
}
}
Seems to me this is a pretty easy problem to solve. Given a CSV file with an empty row
foo,bar,baz
1,One,123
,,
2,Two,456
3,Three,789
You can read the lines and define an empty line as one which contains empty strings separated by commas. You could read the contents of the file, store the populated lines into a string buffer, and then save the contents of the buffer once the empty lines are extracted out. The code below accomplishes this:
public static void main(String[] args) throws IOException {
String file ="test.csv";
BufferedReader reader = new BufferedReader(new FileReader(file));
String line = null;
StringBuilder sbuff = new StringBuilder();
while ((line = reader.readLine()) != null) {
String[] tokens = line.split(",");
if (containsText(tokens)) {
sbuff.append(line + "\n");
}
}
reader.close();
System.out.println(sbuff.toString());
// save file here
}
public static boolean containsText(String[] tokens) {
for (String token: tokens) {
if (token.length() > 0)
return true;
}
return false;
}
After running the code, the output is:
foo,bar,baz
1,One,123
2,Two,456
3,Three,789
This same code can be used to determine if a cell is empty with a simple method:
public static boolean isCellEmpty(String[] tokens) {
for (String token: tokens) {
if (token.isBlank())
return true;
}
return false;
}
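One caveat when combining isCellEmpty with line.split(","): String.split with no limit discards trailing empty strings, so an empty last cell is never seen. The sketch below illustrates this. The class name TrailingCells is made up, and it uses trim().isEmpty() in place of isBlank() only so that it also compiles on Java versions before 11.

```java
import java.util.Arrays;

// Hypothetical demo class; names are illustrative only.
final class TrailingCells {

    // Same idea as isCellEmpty above: true if any token is blank.
    static boolean isCellEmpty(String[] tokens) {
        for (String token : tokens) {
            if (token.trim().isEmpty()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String row = "3,Three,";
        String[] dropped = row.split(",");   // the trailing empty cell is discarded
        String[] kept = row.split(",", -1);  // a negative limit keeps it
        System.out.println(Arrays.toString(dropped) + " -> " + isCellEmpty(dropped));
        System.out.println(Arrays.toString(kept) + " -> " + isCellEmpty(kept));
    }
}
```

With the row above, the first call reports no empty cell and the second one does, so passing -1 as the limit is probably what you want when empty cells matter.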

Creating multiple ArrayLists from a file in Java

I am new to Java, so I need some help.
I have a file which contains:
Model
A
T
ENMDL
Model
A
T
ENMDL
...and so on, repeated multiple times. I need to write a program that separates these blocks and stores each one in a different ArrayList.
Can anyone help?
public ArrayList<String> GetAllFile(String File) throws IOException
{
FileReader fr=new FileReader(File);
BufferedReader br=new BufferedReader(fr);
String rowData;
ArrayList<String> allFile = new ArrayList<String>();
while((rowData=br.readLine())!=null)
if(rowData.startsWith("MODEL"))
allFile.add(rowData);
fr.close();
return allFile;
}
}
Change your return type.
public static List<List<String>> fileToArrayList(String fileName) {
Create the outer container.
List<List<String>> allFile = new ArrayList<>();
Then outside of your loop.
List<String> modelLines = new ArrayList<>();
Then the condition inside of your loop should be.
if(rowData.startsWith("Model")){
modelLines = new ArrayList<>();
allFile.add(modelLines);
} else{
modelLines.add(rowData);
}
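Put together, the pieces above might look like the following sketch. The class name ModelGrouper and the method groupByModel are hypothetical, and it takes the lines as a List so it can be tried without a file. Two things here are my own assumptions rather than part of the snippets above: the "ENMDL" marker lines are dropped, and any lines before the first "Model" line are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
final class ModelGrouper {

    // Starts a new inner list at every "Model" line and collects the
    // lines that follow it, skipping the "ENMDL" end markers (an assumption).
    static List<List<String>> groupByModel(List<String> rows) {
        List<List<String>> allFile = new ArrayList<>();
        List<String> modelLines = null;
        for (String rowData : rows) {
            if (rowData.startsWith("Model")) {
                modelLines = new ArrayList<>();
                allFile.add(modelLines);
            } else if (modelLines != null && !rowData.equals("ENMDL")) {
                modelLines.add(rowData);
            }
        }
        return allFile;
    }
}
```

For the sample file in the question, this would give two inner lists, each holding A and T.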
Here is a solution that might suit you:
public class FileToArrayList {
public static void main(String[] args) {
// Get the file as a List.
List<String> fileAsList = FileToArrayList.fileToArrayList("SomeFile.txt");
// Print the lines.
for (String oneLine : fileAsList) {
System.out.println(oneLine);
}
}
public static List<String> fileToArrayList(String fileName) {
// Container for the lines.
List<String> lines = new ArrayList<>();
// Try with resources, it will close it automatically afterwards.
try(FileReader fr = new FileReader(new File(fileName))) {
BufferedReader br = new BufferedReader(fr);
String line;
// line = br.readLine() is an expression which will return line, therefore
// we can check if that expression is not null, because
// when its null, we reached EOF (end of file)
while((line = br.readLine()) != null) {
lines.add(line);
}
} catch(IOException e) {
e.printStackTrace();
}
return lines;
}
}

Reading file to ArrayList

I'm trying to read a .dat file into an ArrayList. The file contains lines formatted like this: #name#,#number#.
Scanner s = new Scanner(new File("file.dat"));
while(s.hasNext()){
String string = s.next();
names.add(string.split(",")[0]);
numbers.add(Integer.parseInt(string.split(",")[1]));
}
When I print the results to check, all I get is the first line.
With standard Java libraries (full code example):
BufferedReader in = null;
List<String> myList = new ArrayList<String>();
try {
in = new BufferedReader(new FileReader("myfile.txt"));
String str;
while ((str = in.readLine()) != null) {
myList.add(str);
//Or split your read string here as you wish.
}
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
} finally {
if (in != null) {
in.close();
}
}
With other common libraries:
A one-liner with commons-io:
List<String> lines = FileUtils.readLines(new File("/path/to/file.txt"), "utf-8");
The same with guava:
List<String> lines =
Files.readLines(new File("/path/to/file.txt"), Charset.forName("utf-8"));
Then you can iterate over the read lines and split each String to your desired ArrayLists.
Instead of using a Scanner, use a BufferedReader. The BufferedReader provides a method to read one line at a time. Using this, you can process every line individually by splitting them (line.split(",")) , stripping the trailing hashes, then pushing them into your ArrayLists.
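A rough sketch of that BufferedReader approach is below. The class name HashedLineParser and the helper stripHashes are made up for illustration, and it assumes every non-empty line really has the form #name#,#number# with an integer number. It accepts any Reader so you can try it without a file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
final class HashedLineParser {
    final List<String> names = new ArrayList<>();
    final List<Integer> numbers = new ArrayList<>();

    // Removes one leading and one trailing '#', e.g. "#Alice#" becomes "Alice".
    static String stripHashes(String s) {
        return s.trim().replaceAll("^#|#$", "");
    }

    // Reads one line at a time and splits it into name and number.
    void read(Reader source) throws IOException {
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] parts = line.split(",", 2);
                names.add(stripHashes(parts[0]));
                numbers.add(Integer.parseInt(stripHashes(parts[1])));
            }
        }
    }
}
```

For a real file you would pass something like new FileReader("file.dat") to read.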
This is how I read a file and turn it into an ArrayList:
public List<String> readFile(File file){
try{
List<String> out = new ArrayList<String>();
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(file)));
String line;
while((line = reader.readLine()) != null){
if(line != null){
out.add(line);
}
}
reader.close();
return out;
}
catch(IOException e){
e.printStackTrace(); // report the failure; the method then returns null
}
return null;
}
Hope it helps.
Maybe this is a lengthy way, but it works:
text file:
susheel,1134234
testing,1342134
testing2,123455
Main class:
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.ArrayList;
import java.util.List;
public class Equal {
public static void main(String[] args) {
List<Pojo> data= new ArrayList<Pojo>();
String currentLine;
try {
BufferedReader br = new BufferedReader(new FileReader("E:\\test.dat"));
while ((currentLine = br.readLine()) != null) {
String[] arr = currentLine.split(",");
Pojo pojo = new Pojo();
pojo.setName(arr[0]);
pojo.setNumber(Long.parseLong(arr[1]));
data.add(pojo);
}
for(Pojo i : data){
System.out.print(i.getName()+" "+i.getNumber()+"\n");
}
} catch (Exception e) {
System.out.print(e.getMessage());
}
}
}
POJO class:
public class Pojo {
String name;
long number;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public long getNumber() {
return number;
}
public void setNumber(long number) {
this.number = number;
}
}

Sorting a text file in Java

I have a text file with a list of words that I need to sort in alphabetical order using Java. The words are on separate lines.
How would I go about this? Read them into an ArrayList and then sort that?
This is a simple four-step process, with three of the four steps already addressed by Stack Overflow questions:
Read each line and turn it into a Java String
Store each Java String in an array (I don't think you need a reference for this one)
Sort your array
Write out each Java String in your array
Here is an example using Collections sort:
public static void sortFile() throws IOException
{
FileReader fileReader = new FileReader("C:\\words.txt");
BufferedReader bufferedReader = new BufferedReader(fileReader);
List<String> lines = new ArrayList<String>();
String line = null;
while ((line = bufferedReader.readLine()) != null) {
lines.add(line);
}
bufferedReader.close();
Collections.sort(lines, Collator.getInstance());
FileWriter writer = new FileWriter("C:\\wordsnew.txt");
for(String str: lines) {
writer.write(str + "\r\n");
}
writer.close();
}
You can also use your own collation like this:
Locale lithuanian = new Locale("lt", "LT"); // language and country are separate arguments
Collator lithuanianCollator = Collator.getInstance(lithuanian);
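To see why a Collator matters, here is a small sketch comparing it with the default String ordering. The class name CollatorSortDemo and the method sortedWith are hypothetical, and it uses Locale.ENGLISH purely as an example locale.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class; names are illustrative only.
final class CollatorSortDemo {

    // Returns a sorted copy of words, ordered by the given collator.
    static List<String> sortedWith(Collator collator, List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, collator);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "apple", "Cherry");

        // Natural String order compares char values, so uppercase sorts first.
        List<String> natural = new ArrayList<>(words);
        Collections.sort(natural);
        System.out.println(natural); // [Cherry, apple, banana]

        // A collator orders words the way a dictionary for that locale would.
        System.out.println(sortedWith(Collator.getInstance(Locale.ENGLISH), words));
        // [apple, banana, Cherry]
    }
}
```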
import java.io.*;
import java.util.*;
public class example
{
TreeSet<String> tree=new TreeSet<String>();
public static void main(String args[])
{
new example().go();
}
public void go()
{
getlist();
System.out.println(tree);
}
void getlist()
{
try
{
File myfile= new File("C:/Users/Rajat/Desktop/me.txt");
BufferedReader reader=new BufferedReader(new FileReader(myfile));
String line=null;
while((line=reader.readLine())!=null){
addnames(line);
}
reader.close();
}
catch(Exception ex)
{
ex.printStackTrace();
}
}
void addnames(String a)
{
// A TreeSet keeps its elements sorted, so adding is all that is needed
tree.add(a);
}
}
public List<String> readFile(String filePath) {
List<String> txtLines = new ArrayList<>();
// try-with-resources closes the reader even if reading fails
try (BufferedReader reader = new BufferedReader(new FileReader(filePath))) {
String line;
while ((line = reader.readLine()) != null) {
txtLines.add(line);
}
} catch (IOException e) {
// FileNotFoundException is a subclass of IOException, so one catch covers both
e.printStackTrace();
}
return txtLines.stream().sorted().collect(Collectors.toList());
}

Appending multiple files into one

I have 4 different files in some locations, like:
D:\1.txt
D:\2.txt
D:\3.txt and
D:\4.txt
I need to create a new file, NewFile.txt, that contains all the contents of the above files (1.txt, 2.txt, 3.txt, 4.txt, and so on).
All the data should be present in the single new file (NewFile.txt).
Please suggest how to do this in Java or Groovy.
Here's one way to do it in Groovy:
// Get a writer to your new file
new File( '/tmp/newfile.txt' ).withWriter { w ->
// For each input file path
['/tmp/1.txt', '/tmp/2.txt', '/tmp/3.txt'].each { f ->
// Get a reader for the input file
new File( f ).withReader { r ->
// And write data from the input into the output
w << r << '\n'
}
}
}
The advantage of doing it this way (over calling getText on each of the source files) is that it will not need to load the entire file into memory before writing its contents out to newfile. If one of your files was immense, the other method could fail.
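For comparison, the same streaming idea can be sketched in Java. This is only a sketch: the class name StreamingMerge and the method merge are hypothetical, and like the Groovy version it writes a newline after each source file. Files.copy(Path, OutputStream) copies the bytes through a buffer, so no source file has to fit into memory as a whole.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
final class StreamingMerge {

    // Appends each source file to target in order, one newline after each.
    // The target is created, or truncated if it already exists.
    static void merge(List<Path> sources, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path source : sources) {
                Files.copy(source, out); // streams the bytes, no full read into memory
                out.write('\n');
            }
        }
    }
}
```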
This is in Groovy:
def allContentFile = new File("D:/NewFile.txt")
def fileLocations = ['D:/1.txt' , 'D:/2.txt' , 'D:/3.txt' , 'D:/4.txt']
fileLocations.each{ allContentFile.append(new File(it).getText()) }
Here is a way to do it in Java:
public class Readdfiles {
public static void main(String args[]) throws Exception
{
String []filename={"C:\\WORK_Saurabh\\1.txt","C:\\WORK_Saurabh\\2.txt"};
File file=new File("C:\\WORK_Saurabh\\new.txt");
FileWriter output=new FileWriter(file);
try
{
for(int i=0;i<filename.length;i++)
{
BufferedReader objBufferedReader = new BufferedReader(new FileReader(getDictionaryFilePath(filename[i])));
String line;
while ((line = objBufferedReader.readLine())!=null )
{
// Write each line unchanged, restoring the line break that readLine() strips
output.write(line);
output.write(System.lineSeparator());
}
objBufferedReader.close();
}
output.close();
}
catch (Exception e)
{
throw new Exception (e);
}
}
public static String getDictionaryFilePath(String filename) throws Exception
{
String dictionaryFolderPath = null;
File configFolder = new File(filename);
try
{
dictionaryFolderPath = configFolder.getAbsolutePath();
}
catch (Exception e)
{
throw new Exception (e);
}
return dictionaryFolderPath;
}
}
Tell me if you have any questions.
I tried solving this and found it's quite easy if you copy the contents into an ArrayList and write that list to a different file:
public class Fileread
{
public static File read(File f,File f1) throws FileNotFoundException
{
File file3=new File("C:\\New folder\\file3.txt");
PrintWriter output=new PrintWriter(file3);
ArrayList arr=new ArrayList();
Scanner sc=new Scanner(f);
Scanner sc1=new Scanner(f1);
while(sc.hasNext())
{
arr.add(sc.next());
}
while(sc1.hasNext())
{
arr.add(sc1.next());
}
output.print(arr);
output.close();
return file3;
}
/**
*
* @param args
* @throws FileNotFoundException
*/
public static void main(String[] args) {
try
{
File file1=new File("C:\\New folder\\file1.txt");
File file2=new File("C:\\New folder\\file2.txt");
File file3=read(file1,file2);
Scanner sc=new Scanner(file3);
while(sc.hasNext())
System.out.print(sc.next());
}
catch(Exception e)
{
System.out.printf("Error :%s",e);
}
}
}
You can do something like this in Java. Hope it helps you resolve your problem:
import java.io.*;
class FileRead {
public void readFile(String[] args) {
for (String textfile : args) {
try{
// Open the file that is the first
// command line parameter
FileInputStream fstream = new FileInputStream(textfile);
// Get the object of DataInputStream
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
//Read File Line By Line
while ((strLine = br.readLine()) != null) {
// Print the content on the console
System.out.println (strLine);
// Write to the new file
FileWriter filestream = new FileWriter("Combination.txt",true);
BufferedWriter out = new BufferedWriter(filestream);
out.write(strLine);
out.newLine(); // readLine() strips the line break, so add it back
//Close the output stream
out.close();
}
//Close the input stream
in.close();
}catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}
}
}
public static void main(String args[]) {
FileRead myReader = new FileRead();
String fileArray[] = {"file1.txt", "file2.txt", "file3.txt", "file4.txt"};
myReader.readFile(fileArray);
}
}
One liner example:
def out = new File(".all_profiles")
['.bash_profile', '.bashrc', '.zshrc'].each {out << new File(it).text}
OR
['.bash_profile', '.bashrc', '.zshrc'].collect{new File(it)}.each{out << it.text}
Tim's implementation is better if you have big files.
public static void main(String[] args) throws IOException {
List<String> files=new ArrayList<String>();
for(int i=10;i<14;i++)
files.add("C://opt/Test/test"+i+".csv");
String destFile ="C://opt/Test/test.csv";
System.out.println("TO "+destFile);
long st=System.currentTimeMillis();
mergefiles(files, destFile);
System.out.println("DONE."+(System.currentTimeMillis()-st)); // elapsed milliseconds
}
public static void mergefiles(List<String> files,String destFile){
Path outFile = Paths.get(destFile);
try(FileChannel out=FileChannel.open(outFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
for(String file:files) {
Path inFile=Paths.get(file);
System.out.println(inFile);
try(FileChannel in=FileChannel.open(inFile, StandardOpenOption.READ)) {
for(long p=0, l=in.size(); p<l; )
p+=in.transferTo(p, l-p, out);
}catch (IOException e) {
System.out.println("ERROR:: "+e.getMessage());
}
out.write(ByteBuffer.wrap("\n".getBytes()));
}
} catch (IOException e) {
System.out.println("ERROR:: "+e.getMessage());
}
}
