I have an ARFF file that needs to be modified each time I run the code, while keeping the same file structure.
For example, I have the following ARFF file:
#relation australian
#attribute A1 numeric
#attribute A2 numeric
#attribute A3 numeric
#attribute A4 numeric
#attribute A5 numeric
#attribute A6 numeric
#attribute A7 {0,1}
#data
1,3,5,2,4,3,1
3,5,1,2,5,6,0
6,1,4,2,3,4,1
I need to replace the three lines of data with another three lines each time I run the code.
I use the following code, but it appends the new data to the old data instead of replacing it.
BufferedReader reader = new BufferedReader(new FileReader("aa.txt"));
String toWrite = "";
String line = null;
while ((line = reader.readLine()) != null) {
    toWrite += line;
    // System.out.println(toWrite);
}
FileWriter fw = new FileWriter("colon.arff",true);
fw.write(toWrite);
fw.close();
To clear up a couple things:
FileWriter fw = new FileWriter("colon.arff", true);
In your FileWriter declaration you pass true as the append flag, which makes every write go to the end of the existing file. That is why the new data ends up after the old data instead of replacing it. Since you want the output file to keep exactly the same format as the file being read, leave the flag out (or pass false) so the file is overwritten and nothing can be appended to distort the original layout.
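To make the effect of the append flag concrete, here is a minimal sketch. The class name AppendFlagDemo, the method writeTwice and the temporary file are made up for this illustration; only the two FileWriter constructors come from the standard library.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class AppendFlagDemo {

    // Writes "first" (overwriting), then writes "second" with the given
    // append flag, and returns what the file contains afterwards.
    public static String writeTwice(Path file, boolean append) throws IOException {
        try (FileWriter fw = new FileWriter(file.toFile())) {
            fw.write("first");
        }
        try (FileWriter fw = new FileWriter(file.toFile(), append)) {
            fw.write("second");
        }
        return new String(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("append-demo", ".txt"); // hypothetical scratch file
        System.out.println(writeTwice(tmp, true));  // prints "firstsecond"
        System.out.println(writeTwice(tmp, false)); // prints "second"
        Files.delete(tmp);
    }
}
```

With the flag set to true the old content survives, which matches the behaviour described in the question.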
toWrite += line;
Concatenating strings like this inside a loop is rarely a good idea (yes, I still do it from time to time for simple things and demo purposes). A simple concatenation outside of a loop is fine, since the compiler optimizes it for you anyway. Inside a loop it is better to use the StringBuilder class yourself, for the following reason:
In Java, string objects are immutable, which means once it is created, you cannot change it. So when we concatenate one string with another, a new string is created, and the older one is marked for the garbage collector. Let's say we need to concatenate a million strings. Then, we are creating 1 million extra strings which will eventually be garbage collected.
To solve this problem, the StringBuilder class is used. It works like a mutable String object. The StringBuilder#append() method helps to avoid all the copying required in string concatenation. To utilize StringBuilder in your case you would declare the builder above the while loop:
StringBuilder toWrite = new StringBuilder();
and then within the loop:
toWrite.append(line).append(System.lineSeparator());
Notice the additional append(System.lineSeparator())? BufferedReader#readLine() strips the line break from each line it returns, so when you write the text back with FileWriter you have to add a line break ("\n" or "\r\n" depending on the OS) yourself. Here the whole string is built first and written to the file in a single write, so every appended line needs its own line break. The System.lineSeparator() method returns the operating-system-dependent line break character(s).
The code below will do what you're asking:
// Demo data to replace in file...
String[] newData = {"4,2,13,1,4,2,0",
                    "1,3,3,5,2,4,1",
                    "7,7,2,1,5,8,1"};
// 'Try With Resources' is used here to auto-close the reader and writer.
try (BufferedReader reader = new BufferedReader(new FileReader("aa.txt"));
     FileWriter fw = new FileWriter("colon.arff")) {
    String ls = System.lineSeparator();          // The line break used by the OS.
    StringBuilder toWrite = new StringBuilder(); // Builds the new file content.
    int skip = 0;       // Number of old data rows that still have to be skipped.
    String line = null; // Holds the file line currently read (one at a time).
    // Start reading file...
    while ((line = reader.readLine()) != null) {
        /* If skip is greater than 0 then drop this old data row and
           decrement skip by 1. Old rows are skipped one-for-one, so
           if the file contains more rows of data than what you are
           replacing, the extra old rows are kept. */
        if (skip > 0) {
            skip--;
            continue;
        }
        // Append the file line read into the StringBuilder object
        toWrite.append(line).append(ls);
        // If the file line read equals "#data"
        if (line.trim().equals("#data")) {
            /* Append the new data to the toWrite variable here,
               for example: if the new data was in a string array
               named newData (see above declaration)... */
            for (int i = 0; i < newData.length; i++) {
                /* Perform new data validation...
                   Make sure all values are string representations
                   of numerical data and that the 7th column of data
                   is either 0 or 1. */
                String[] ndParts = newData[i].split("\\s*,\\s*"); // Split the current data row
                boolean isValid = true; // flag
                for (int v = 0; v < ndParts.length; v++) {
                    if (!ndParts[v].matches("\\d+") ||
                            (v == 6 && (!ndParts[v].equals("0") &&
                                        !ndParts[v].equals("1")))) {
                        isValid = false;
                        System.err.println("Invalid numerical value supplied on Row " +
                                (i+1) + " in Column " + (v+1) + ". (Data: " + newData[i] + ")" +
                                ls + "Not writing data line to file!");
                        break;
                    }
                }
                /* If the current new data row is valid then append
                   it to the build and increment skip by 1. */
                if (isValid) {
                    toWrite.append(newData[i]).append(ls);
                    skip++;
                }
            }
        }
    }
    // Write the entire built string to file.
    fw.write(toWrite.toString());
}
catch (FileNotFoundException ex) {
    System.err.println(ex.getMessage());
}
catch (IOException ex) {
    System.err.println(ex.getMessage());
}
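If every old data row should be dropped regardless of how many new rows there are, the same idea can be sketched on an in-memory list of lines. This is only an illustration under that assumption: the class ArffDataReplacer and the method replaceData are hypothetical names, and unlike the code above it does no validation and discards all rows after the "#data" marker.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArffDataReplacer {

    // Keeps every line up to and including the "#data" marker and
    // replaces everything after it with newRows.
    public static List<String> replaceData(List<String> lines, List<String> newRows) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(line);
            if (line.trim().equals("#data")) {
                result.addAll(newRows);
                return result; // old rows after the marker are dropped
            }
        }
        // No "#data" marker found: return the lines unchanged rather than guess.
        return result;
    }

    public static void main(String[] args) {
        List<String> original = Arrays.asList(
                "#relation australian", "#attribute A1 numeric", "#data", "1,3,5", "3,5,1");
        List<String> updated = replaceData(original, Arrays.asList("4,2,13", "7,7,2"));
        updated.forEach(System.out::println);
    }
}
```

The lines could be obtained with Files.readAllLines and written back with Files.write, which overwrites the target file by default.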
I have a problem and haven't found a solution yet. The problem is as follows: I have to read a CSV file that looks like this:
First Name,Second Name,Age,
Lucas,Miller,17,
Bob,Jefferson,55,
Andrew,Washington,31,
The assignment is to read this CSV File with JAVA and display it like this:
First Name: Lucas
Second Name: Miller
Age: 17
The attribute names are not always the same, so it could also be:
Street,Number,Postal Code,ID,
Schoolstreet,93,20000,364236492,
("," has to be replaced with ";")
Also, the file address (path) is not always the same.
I already have the view etc. I only need the MODEL.
Thanks for your help. :))
I already have a FileChooser class in Controller, which returns an URI.
If your CSV file(s) always contain a header line which indicates the table column names, then it's just a matter of catching this line and splitting it so as to place those column names into a String array (or collection, or whatever). The length of this array determines the amount of data expected to be available for each record data line. Once you have the column names, it gets relatively easy from there.
How you acquire your CSV file path and its format type is obviously up to you, but here is a general concept of how to carry out the task at hand:
public static void readCsvToConsole(String csvFilePath, String csvDelimiter) {
    String line;                            // To hold each valid data line.
    String[] columnNames = new String[0];   // To hold Header names.
    int dataLineCount = 0;                  // Count the non-blank file lines.
    StringBuilder sb = new StringBuilder(); // Used to build the output String.
    String ls = System.lineSeparator();     // Use System Line Separator for output.
    /* The Regular Expression used in the String#split() method allows
       optional spacing around the delimiter. Pattern.quote() makes sure
       a delimiter such as "|" is treated literally. */
    String splitRegex = "\\s*" + java.util.regex.Pattern.quote(csvDelimiter) + "\\s*";
    // 'Try With Resources' to auto-close the reader
    try (BufferedReader br = new BufferedReader(new FileReader(csvFilePath))) {
        while ((line = br.readLine()) != null) {
            // Skip Blank Lines (if any).
            if (line.trim().equals("")) {
                continue;
            }
            dataLineCount++;
            // Deal with the Header Line. Line 1 in most CSV files is the Header Line.
            if (dataLineCount == 1) {
                columnNames = line.split(splitRegex);
                continue; // Don't process this line anymore. Continue loop.
            }
            // Split the file data line into its respective columnar slot.
            String[] lineParts = line.split(splitRegex);
            /* Iterate through the Column Names and build a String
               using each column name and its respective data along
               with a line break after each Column/Data line. A row
               with fewer values than columns gets empty values. */
            for (int i = 0; i < columnNames.length; i++) {
                String value = (i < lineParts.length) ? lineParts[i] : "";
                sb.append(columnNames[i]).append(": ").append(value).append(ls);
            }
            // Display the data record in Console.
            System.out.println(sb.toString());
            // Clear the StringBuilder object to prepare for the next record.
            sb.setLength(0);
        }
    }
    // Trap these Exceptions
    catch (FileNotFoundException ex) {
        System.err.println(ex.getMessage());
    }
    catch (IOException ex) {
        System.err.println(ex.getMessage());
    }
}
With this method you can have 1 to thousands of columns, it doesn't matter (not that you would ever have thousands of data columns in any given record but hey....you never know... lol). And to use this method:
// Read CSV To Console Window.
readCsvToConsole("test.csv", ",");
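Since the question asks for a MODEL rather than console output, the same parsing can return data instead of printing it. The following is a rough sketch, not a full CSV parser: the class name CsvModel and the method parse are invented for this example, and it assumes values never contain the delimiter or quotes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class CsvModel {

    // Parses CSV text into one ordered map (column name -> value) per record.
    public static List<Map<String, String>> parse(Reader source, String delimiter) throws IOException {
        List<Map<String, String>> records = new ArrayList<>();
        String splitRegex = "\\s*" + Pattern.quote(delimiter) + "\\s*";
        try (BufferedReader br = new BufferedReader(source)) {
            String[] columnNames = null;
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                String[] parts = line.trim().split(splitRegex);
                if (columnNames == null) {
                    columnNames = parts; // first non-blank line is the header
                    continue;
                }
                Map<String, String> record = new LinkedHashMap<>();
                for (int i = 0; i < columnNames.length; i++) {
                    // A short row yields an empty value instead of an exception.
                    record.put(columnNames[i], i < parts.length ? parts[i] : "");
                }
                records.add(record);
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        String csv = "First Name,Second Name,Age,\nLucas,Miller,17,\nBob,Jefferson,55,\n";
        for (Map<String, String> record : parse(new StringReader(csv), ",")) {
            record.forEach((name, value) -> System.out.println(name + ": " + value));
            System.out.println();
        }
    }
}
```

A view can then iterate over the returned list, and a file can be parsed by passing a FileReader for the path chosen in the controller.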
Here is some code that I recently worked on for an interview that might help: https://github.com/KemarCodes/ms3_csv/blob/master/src/main/java/CSVProcess.java
If you always have 3 attributes, I would read the first line of the csv and set values in an object that has three fields: attribute1, attribute2, and attribute3. I would create another class to hold the three values and read all the lines after, creating a new instance each time and reading them in an array list. To print I would just print the values in the attribute class each time alongside each set of values.
I am trying to go over a bunch of files, read each of them, and remove all stopwords from a specified list with such words. The result is a disaster - the content of the whole file copied over and over again.
What I tried:
- Saving the file as String and trying to look with regex
- Saving the file as String and going over line by line and comparing tokens to the stopwords that are stored in a LinkedHashSet, I can also store them in a file
- tried to twist the logic below in multiple ways, getting more and more ridiculous output.
- tried looking into text / line with the .contains() method, but no luck
My general logic is as follows:
for every word in the stopwords set:
    while(file has more lines):
        save current line into String
        while (current line has more tokens):
            assign current token into String
            compare token with current stopword:
                if(token equals stopword):
                    write in the output file "" + " "
                else: write in the output file the token as is
Tried what's in this question and many other SO questions, but just can't achieve what I need.
Real code below:
private static void removeStopWords(File fileIn) throws IOException {
    File stopWordsTXT = new File("stopwords.txt");
    System.out.println("[Removing StopWords...] FILE: " + fileIn.getName() + "\n");
    // create file reader and go over it to save the stopwords into the Set data structure
    BufferedReader readerSW = new BufferedReader(new FileReader(stopWordsTXT));
    Set<String> stopWords = new LinkedHashSet<String>();
    for (String line; (line = readerSW.readLine()) != null; readerSW.readLine()) {
        // trim() eliminates leading and trailing spaces
        stopWords.add(line.trim());
    }
    File outp = new File(fileIn.getPath().substring(0, fileIn.getPath().lastIndexOf('.')) + "_NoStopWords.txt");
    FileWriter fOut = new FileWriter(outp);
    Scanner readerTxt = new Scanner(new FileInputStream(fileIn), "UTF-8");
    while(readerTxt.hasNextLine()) {
        String line = readerTxt.nextLine();
        System.out.println(line);
        Scanner lineReader = new Scanner(line);
        for (String curSW : stopWords) {
            while(lineReader.hasNext()) {
                String token = lineReader.next();
                if(token.equals(curSW)) {
                    System.out.println("---> Removing SW: " + curSW);
                    fOut.write("" + " ");
                } else {
                    fOut.write(token + " ");
                }
            }
        }
        fOut.write("\n");
    }
    fOut.close();
}
What happens most often is that it looks for the first word from the stopWords set and that's it. The output contains all the other words even if I manage to remove the first one. And the first will be there in the next appended output in the end.
Part of my stopword list
about
above
after
again
against
all
am
and
any
are
as
at
With tokens I mean words, i.e. getting every word from the line and comparing it to the current stopword
After a while of debugging I believe I have found the solution. This problem is tricky because you have to use several different scanners and file readers. Here is what I did:
I changed how you added to your stopWords set, as it wasn't adding them correctly: the update clause of your for loop calls readLine() a second time, so every other line of the stop word file was skipped. I used a BufferedReader to read each line, then a Scanner to read each word, and added each word to the set.
Then, for the comparison, I got rid of one of your loops, since you can simply use the set's .contains() method to check whether a word is a stop word.
I left the part of writing to the file without the stop words to you, as I'm sure you can figure that out now that everything else is working.
- My sample stop words txt file:
Stop words
Words
- My sample input file was exactly the same, so it should catch all three words.
The code:
// create file reader and go over it to save the stopwords into the Set data structure
BufferedReader readerSW = new BufferedReader(new FileReader("stopWords.txt"));
Set<String> stopWords = new LinkedHashSet<String>();
String stopWordsLine = readerSW.readLine();
while (stopWordsLine != null) {
    // The Scanner splits the line on whitespace, so no trim() is needed
    Scanner words = new Scanner(stopWordsLine);
    while (words.hasNext()) {         // hasNext() also copes with blank lines
        stopWords.add(words.next());  // Add the stop word to the set
    }
    words.close();
    stopWordsLine = readerSW.readLine();
}
readerSW.close();
// read the input file and report every word that is a stop word
BufferedReader input = new BufferedReader(new FileReader("Words.txt"));
String line = input.readLine();
while (line != null) {
    Scanner lineReader = new Scanner(line);
    while (lineReader.hasNext()) {    // If there's another word, read it
        String token = lineReader.next();
        if (stopWords.contains(token)) {
            System.out.println("removing " + token);
        }
    }
    lineReader.close();
    line = input.readLine();
}
input.close();
Output:
removing Stop
removing words
removing Words
Let me know if I can elaborate any more on my code or why I did something!
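For the part that was left open, writing the text without the stop words, one possible approach is to filter each line before writing it. This is only a sketch: the class StopWordFilter and the method removeStopWords are names chosen for this example, and it assumes tokens are separated by whitespace and compared case-sensitively, as in the code above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.StringJoiner;

public class StopWordFilter {

    // Returns the line with every token that appears in stopWords removed.
    // Tokens are split on whitespace and compared case-sensitively.
    public static String removeStopWords(String line, Set<String> stopWords) {
        StringJoiner kept = new StringJoiner(" ");
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty() && !stopWords.contains(token)) {
                kept.add(token);
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        Set<String> stopWords = new LinkedHashSet<>(
                Arrays.asList("about", "above", "after", "all", "and"));
        System.out.println(removeStopWords("all cats and dogs ran about", stopWords));
        // prints "cats dogs ran"
    }
}
```

Each filtered line can then be written to the output file followed by a line break. The key difference from the original attempt is that every token is checked once against the whole set, instead of looping over the stop words around the token loop.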
I am relatively new to programming, especially in Java, so bear that in mind when answering.
I'm programming a simple collectible card game deck building program, but file reading/writing proved to be problematic.
Here is the code for "addDeck" method that I'm trying to get working:
/**
 * Adds a deckid and a deckname to decks.dat file.
 */
public static void AddDeck() throws IOException {
    // Opens the decks.dat file.
    File file = new File("./files/decks.dat");
    BufferedReader read = null;
    BufferedWriter write = null;
    try {
        read = new BufferedReader(new FileReader(file));
        write = new BufferedWriter(new FileWriter(file));
        String line = read.readLine();
        String nextLine = read.readLine();
        String s = null; // What will be written to the end of the file as a new line.
        String newDeck = "Deck ";
        int newInd = 00; // Counter index to indicate the new deckid number.
        // If there are already existing deckids in the file,
        // this will be the biggest existing deckid number + 1.
        // If the first line (i.e. the whole file) is initially empty,
        // the following line will be created: "01|Deck 01", where the
        // number before the '|' sign is deckid, and the rest is the deckname.
        if (line == null) {
            s = "01" + '|' + newDeck + "01";
            write.write(s);
        }
        // If the first line of the file isn't empty, the following happens:
        else {
            // A loop to find the last line and the biggest existing deckid of the file.
            while (line != null) {
                // The following if clause should determine whether or not the next
                // line is the last line of the file.
                if ((nextLine = read.readLine()) == null) {
                    // Now the reader should be at the last line of the file.
                    for (int i = 0; Character.isDigit(line.charAt(i)); i++) {
                        // Checks the deckid number of the last line and stores it.
                        s += line.charAt(i);
                    }
                    // The value of the last existing deckid +1 will be stored to newInd.
                    // Also, the divider sign '|' and the new deckname will be added.
                    // e.g. If the last existing deckid of decks.dat file is "12",
                    // the new line to be added would read "13|Deck 13".
                    newInd = (Integer.parseInt(s)) + 1;
                    s += '|' + newDeck + newInd;
                    write.newLine();
                    write.write(s);
                }
                else {
                    // If the current line isn't the last line of the file:
                    line = nextLine;
                    nextLine = read.readLine();
                }
            }
        }
    } finally {
        read.close();
        write.close();
    }
}
The addDeck method should make the decks.dat file longer by one line each time it is invoked. But no matter how many times I invoke this method, decks.dat has only one line that reads "01|Deck 01".
Also, I need to make a method removeDeck, which removes one whole line from the decks.dat file, and I'm even more at a loss there.
I would be so very grateful for any help!
For starters, this line empties decks.dat every time the method runs, because a FileWriter opened without the append flag truncates the file as soon as it is constructed. (Creating the File object itself does not touch the file on disk.)
write = new BufferedWriter(new FileWriter(file));
As a result, the first read.readLine() finds an empty file, if (line == null) { always evaluates to true, and you always end up with "01|Deck 01" in the file.
To solve this problem, do not open the writer before you have finished reading. Open only the BufferedReader first, like so:
read = new BufferedReader(new FileReader("./files/decks.dat"));
More generally, reading and writing the same file at the same time is asking for trouble, so you should not open write the way you did. I suggest you collect the updated content in a variable (a StringBuilder works well), close the reader, and only then write that content to the decks.dat file.
Once you work on these issues, you should be able to make progress with what you intend to do.
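As a rough illustration of "read everything first, then write", here is one way it might look. The class name DeckFile and the use of java.nio.file.Files are my own choices, not the asker's code, and the sketch assumes every existing line has the form "id|name". Starting from an int also avoids the String s = null start value, which would otherwise turn s += ... into "null12" and make Integer.parseInt fail.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class DeckFile {

    // Appends a new "id|Deck id" line whose id is one more than the last line's id.
    public static void addDeck(Path file) throws IOException {
        // Read everything first; the file is closed again before any writing starts.
        List<String> lines = Files.exists(file)
                ? new ArrayList<>(Files.readAllLines(file, StandardCharsets.UTF_8))
                : new ArrayList<>();
        lines.removeIf(l -> l.trim().isEmpty()); // ignore blank lines

        int lastId = 0;
        if (!lines.isEmpty()) {
            // Assumption: each line looks like "12|Deck 12".
            String last = lines.get(lines.size() - 1);
            lastId = Integer.parseInt(last.substring(0, last.indexOf('|')).trim());
        }
        String id = String.format("%02d", lastId + 1);
        lines.add(id + "|Deck " + id);

        // Only now is the file rewritten, in a single call.
        Files.write(file, lines, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("decks", ".dat"); // hypothetical scratch file
        addDeck(file);
        addDeck(file);
        Files.readAllLines(file, StandardCharsets.UTF_8).forEach(System.out::println);
        Files.delete(file);
    }
}
```

A removeDeck method could follow the same pattern: read all lines, drop the one whose id matches, and write the remaining lines back.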
I have an application that needs to read only specific content from a text file. I have to read the text from 10,000 different text files arranged in a folder and have to populate the content from all those text files into a single CSV file.
My application runs fine, but it only reads up to file number 999. There is no error; it simply does not read any file after 999.
Any ideas?
public void calculate(String location) throws IOException{
    String mylocation = location;
    File rep = new File(mylocation);
    File f2 = new File (mylocation + "\\" + "metricvalue.csv");
    FileWriter fw = new FileWriter(f2);
    BufferedWriter bw = new BufferedWriter (fw);
    if(rep.exists() && rep.isDirectory()){
        File name[] = rep.listFiles();
        for(int j = 0; j < name.length; j++){
            if(name[j].isFile()){
                String filename = name[j].getPath();
                String nameinfo = name[j].getName();
                File f1= new File (filename);
                FileReader fr = new FileReader(f1);
                BufferedReader br = new BufferedReader (fr);
                String line = null;
                while((line = br.readLine()) != null){
                    if(line.contains(" | #1 #2 % Correct")){
                        bw.write(nameinfo + ",");
                        while((line=br.readLine()) != null) {
                            if((line.indexOf("#" ) != -1)){
                                String info[] = line.split("\\s+");
                                String str = info[2] + "," + info[3] + ",";
                                bw.write(str);
                            }
                        }
                    }
                }
                bw.newLine();
                br.close();
            }
        }
    }
    bw.close();
}
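Your platform may limit the number of files a process can keep open at once. You may need to raise that limit, or make sure each FileReader is always closed, even when an exception is thrown (note that br.close() also closes the FileReader it wraps):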
fr.close();
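A try-with-resources block is one way to guarantee that every reader is closed on each iteration. The following sketch uses a made-up class CountLines that simply counts lines instead of extracting metric values; it only illustrates the closing pattern.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class CountLines {

    // Counts the lines of all plain files directly inside dir.
    public static int countLines(File dir) throws IOException {
        File[] files = dir.listFiles();
        if (files == null) {
            throw new IOException("Not a readable directory: " + dir);
        }
        int total = 0;
        for (File f : files) {
            if (!f.isFile()) {
                continue; // skip sub-directories and other non-files
            }
            // The reader, and the FileReader it wraps, is closed at the end of
            // this block on every iteration, even if readLine() throws.
            try (BufferedReader br = new BufferedReader(new FileReader(f))) {
                while (br.readLine() != null) {
                    total++;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the folder to scan is passed as the first argument.
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(countLines(dir) + " lines in " + dir.getPath());
    }
}
```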
How to debug:
Put a breakpoint at File name[] = rep.listFiles();
Open variables when Eclipse pauses and check that your array contains all of the file names you want. This will tell you if your problem is there or in your parsing.
You need to debug your code. Here are a couple of pointers to get you started:
File name[] = rep.listFiles();
for(int j =0;j<name.length; j++) {
if(name[j].isFile()) {
What is the size of the array? Figure it out. If there are 10000 elements in the array, that's how many iterations your loop will do; there is simply no other way. Just adding System.out.println(name.length); will answer this question for you.
If the array is shorter than 10000, that's your answer: you simply counted your files incorrectly. If it is not, then your problem must be that one of the "files" isn't really a file (and the test of the if statement fails). Add an else statement to it, and print out the name. Or better yet, remove this if altogether (in general, avoid nested conditionals encompassing the entire body of an outer structure, especially huge ones like this; they make your code fragile and the logic very hard to follow), and replace it with
if(!name[j].isFile()) {
    System.out.println("Skipping " + name[j] + " because it is not a plain file.");
    continue;
}
This will tell you which of the 10000 files you are skipping. If it does not print anything, that means you do in fact read all 10000 files, as you expect, and the actual problem causing the symptom you are investigating is elsewhere.
I want to read strings from a file. When a certain string (><) is found, I want to start reading integers instead, and convert them to binary strings.
My program reads the strings and saves them in an ArrayList successfully, but it does not recognise the >< symbol, so the reading of the binary strings never succeeds.
The Code
try {
    FileInputStream fstream = new FileInputStream(fc.getSelectedFile().getPath());
    // Get the object of DataInputStream
    DataInputStream ino = new DataInputStream(fstream);
    BufferedReader br = new BufferedReader(new InputStreamReader(ino));
    String ln;
    String str, next;
    int line, c =0;
    while ((ln = br.readLine()) != null) {
        character = ln;
        System.out.println(character);
        iname.add(ln); // arraylist that holds the strings
        if (iname.get(c).equals("><")) {
            break; // break and moves
            // on with the following while loop to start reading binary strings instead.
        }
        c++;
    }
    String s = "";
    // System.out.println("SEQUENCE of bytes");
    while ((line = ino.read()) != -1) {
        String temp = Integer.toString(line, 2);
        arrayl.add(temp);
        System.out.println("telise? oxii");
        System.out.println(line);
    }
    ino.close();
} catch (Exception exc) { }
The file I'm trying to read is for example:
T
E
a
v
X
L
A
.
x
"><"
sequence of bytes.
The last part is saved as bytes and appears like that in the text file. No worries, this bit works. Each string is saved on its own line.
>< is two characters, while each entry returned by iname.get(c) here is only one character.
What you should do is test whether ln equals > and then test whether the next character equals <. If both tests pass, break out of the loop.
You will have to be careful, though.
Use a Scanner. It allows you to specify a delimiter, and has methods for reading input tokens as String or int.
Could you not do something like:
while ((ln = br.readLine()) != null){
    character=ln;
    System.out.println(character);
    //
    // Look for magic characters >< and stop reading if found
    //
    if (character.indexOf("><") >= 0) {
        break;
    }
    iname.add(ln);
}
This would work if you didn't want to add the magic symbol to your ArrayList. Your code sample is incomplete - if you're still having trouble you'd need to post the whole class.