Counting Words and Newlines In A File Using Java?

I am writing a small Java app which will scan a text file for any instances of a particular word. It needs a feature whereby it can report, for example, that an instance of the word was found as the 14th word in the file, on the third line.
For this I tried to use the following code, which I thought would check whether the input was a newline (\n) character and then increment a line variable that I created:
FileInputStream fileStream = new FileInputStream("src/file.txt");
DataInputStream dataStream = new DataInputStream(fileStream);
BufferedReader buffRead = new BufferedReader(new InputStreamReader(dataStream));
String strLine;
String Sysnewline = System.getProperty("line.separator");
CharSequence newLines = Sysnewline;
int lines = 1;
while ((strLine = buffRead.readLine()) != null)
{
if(strLine.contains(newLines))
{
System.out.println("Line Found");
lines++;
}
}
System.out.println("Total Number Of Lines In File: " + lines);
This does not work for me; it simply displays 0 at the end of the file. I know the data is being placed into strLine during the while loop, because if I change the code slightly to output the line, it successfully gets each line from the file.
Would anyone happen to know the reason why the above code does not work?

Read the javadocs for readLine.
Returns:
A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached

readLine() strips newlines. Just increment every iteration of the loop. Also, you're overcomplicating your file reading code. Just do new BufferedReader(new FileReader("src/file.txt"))
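The advice above can be put together as a minimal sketch. It assumes that words are separated by whitespace and that a match means an exact, case-sensitive match. The class name WordLocator and the method locate are made up for illustration; the method takes a Reader so the same code works with a FileReader or a StringReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class WordLocator {
    /** Returns one message per match, e.g. "word 4, line 2" (hypothetical helper). */
    static List<String> locate(Reader source, String target) throws IOException {
        List<String> hits = new ArrayList<>();
        int lineNumber = 0;
        int wordNumber = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // readLine() strips the terminator, so count every iteration instead
                lineNumber++;
                for (String word : line.trim().split("\\s+")) {
                    if (word.isEmpty()) {
                        continue; // a blank line splits into a single empty token
                    }
                    wordNumber++;
                    if (word.equals(target)) {
                        hits.add("word " + wordNumber + ", line " + lineNumber);
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        String text = "one two\nthree two four\n";
        System.out.println(locate(new StringReader(text), "two"));
    }
}
```

For the file in the question, pass new FileReader("src/file.txt") as the source.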

Related

Keep new lines when reading in a file

I'm trying to read in a file and modify the text, but I need to keep new lines when doing so. For example, if I were to read in a file that contained:
This is some text.
This is some more text.
It would just read in as
This is some text.This is some more text.
How do I keep that line break? I think it has something to do with the /n escape character. I've seen BufferedReader and FileReader used for this, but we haven't learned those in my class yet, so is there another way? What I've tried is something like this:
if (ch == 10)
{
ch = '\n';
fileOut.print(ch);
}
10 is the ASCII table code for a new line, so I thought Java could recognize it as that, but it doesn't.
In Java 8:
You can read lines using:
List<String> yourFileLines = Files.readAllLines(Paths.get("your_file"));
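Then collect the strings (StringUtils here comes from Apache Commons Lang; the " " delimiter joins the lines with spaces, so use System.lineSeparator() as the delimiter if the line breaks must survive):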
String collect = yourFileLines.stream().filter(StringUtils::isNotBlank).collect(Collectors.joining(" "));
The problem is that you (possibly) want to read your file a line at a time, and then write it back a line at a time (keeping empty lines).
The following source does that: it reads the input file one line at a time and writes it back one line at a time (keeping empty lines).
The only problem is that it may change the line terminator: you might read a Unix file and write a DOS file, or vice versa, depending on the system you are running on and the origin of the file you are reading.
Keeping the original newline can introduce a lot of complexity; read the BufferedReader and PrintWriter API docs for more information.
public void process(File input, File output) {
    // Declaring the reader and writer as resources closes them automatically;
    // closing the PrintWriter also flushes its buffer, so no output is lost.
    try (BufferedReader reader = new BufferedReader(
             new InputStreamReader(new FileInputStream(input), "utf-8"));
         PrintWriter writer = new PrintWriter(
             new OutputStreamWriter(new FileOutputStream(output), "utf-8"))) {
        String line;
        while ((line = reader.readLine()) != null) {
            String processed = proces(line);
            // println appends the platform line separator that readLine() removed
            writer.println(processed);
        }
    } catch (IOException e) {
        // Some exception management
    }
}
public String proces(String line) {
    return line;
}
/n should be \n
if (ch == 10)
{
ch = '\n';
fileOut.print(ch);
}
Is that a typo?
ch = '/n';
otherwise use
ch = '\n';
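Since the question's reading code is not shown, the following is only a sketch under the assumption that the file is being read one character at a time. It illustrates that a newline needs no special handling: the value 10 returned by read() already is '\n', so writing every character back unchanged keeps the original line breaks, including any \r\n pairs. The class name CharCopy is made up for illustration.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

class CharCopy {
    /** Copies every character, line terminators included, from in to out. */
    static void copy(Reader in, Writer out) throws IOException {
        int ch;
        while ((ch = in.read()) != -1) {
            // '\n' (code 10) and '\r' (code 13) pass through like any other character
            out.write(ch);
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        copy(new StringReader("This is some text.\nThis is some more text.\n"), out);
        System.out.print(out);
    }
}
```

Any per-character modification would go inside the loop, before the call to write.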

Deleting first five lines of a text file

I'm trying to delete the first 5 lines of a text file that match five values stored in an array. Here's what I have so far...
void write(String[] activecode) throws IOException
{
File productcodes = new File("productcodes.txt");
String charset = "UTF-8";
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(productcodes), charset));
File temp = File.createTempFile("productcodes", ".txt", productcodes.getParentFile());
PrintWriter writer = new PrintWriter(new OutputStreamWriter(new FileOutputStream(temp), charset));
int counter = 0;
for (String line; (line = reader.readLine()) != null && counter != activecode.length;)
{
line = line.replace(activecode[counter], "");
writer.println(line);
counter++;
}
reader.close();
writer.close();
productcodes.delete();
temp.renameTo(productcodes);
}
Also for reference, here is what the text file looks like...
BH390311ED6911-D8P8-BG7X
BH390311ED6912-GXKQ-BQ9V
BH390311ED6913-B6JF-55YG
BH390311ED6914-7B56-W37Y
BH390311ED6915-HPDW-V949
BH390311ED6916-3XX4-NDSN
BH390311ED6917-JH4M-PK6B
BH390311ED6918-WQKJ-5TKG
BH390311ED6919-TKS3-WHG3
BH390311ED6920-QTJV-9F43
BH390311ED6921-D45V-GHNG
BH390311ED6922-JH5F-4KXM
BH390311ED6923-6NQM-WSWF
BH390311ED6924-DMFD-BTN6
BH390311ED6925-7883-JG67
BH390311ED6926-3GRN-W7YT
BH390311ED6927-CBKB-47RW
The array is already saved as the first five values of the text file.
Has anyone got any ideas why the output is the text file with only the first three values remaining? I'm very new to Java (as you can probably tell :D)
EDIT:
The contents of the array activecode[] is:
BH390311ED6911-D8P8-BG7X
BH390311ED6912-GXKQ-BQ9V
BH390311ED6913-B6JF-55YG
BH390311ED6914-7B56-W37Y
BH390311ED6915-HPDW-V949
My desired output would be:
BH390311ED6916-3XX4-NDSN
BH390311ED6917-JH4M-PK6B
BH390311ED6918-WQKJ-5TKG
BH390311ED6919-TKS3-WHG3
BH390311ED6920-QTJV-9F43
BH390311ED6921-D45V-GHNG
BH390311ED6922-JH5F-4KXM
BH390311ED6923-6NQM-WSWF
BH390311ED6924-DMFD-BTN6
BH390311ED6925-7883-JG67
BH390311ED6926-3GRN-W7YT
BH390311ED6927-CBKB-47RW
Which is the original file minus the contents of the array.
This is the loop in your code:
for (String line; (line = reader.readLine()) != null && counter != activecode.length;)
{
line = line.replace(activecode[counter], "");
writer.println(line);
counter++;
}
Here's what it does: For counter = 0, 1, 2, 3, and 4, it reads a line from the input file. line.replace(activecode[counter],"") will look for the code from activecode in the input line. If it finds it, it removes it from the line. If the entire line equals the entire activecode[counter] element, then the line is replaced by an empty string "".
But then you write the line to the file. If your intent was to delete the lines from the file, this doesn't do that. line is now (probably) an empty string, and writer.println(line) will write an empty string to the output file. I'm not entirely sure what your needs are; it may be something like
if (!line.equals(activecode[counter]))
writer.println(line);
which will write out the line unless it equals activecode[counter], and if they're equal, it will skip the line and not write anything out. However, I'm not clear on what the exact requirements are--for instance, if the third line in the file equals activecode[0], what's supposed to happen? So I don't know whether the above is the correct solution. I think you'll need to define (at least for yourself) exactly what the program is supposed to do.
Finally, after this loop is done, your program doesn't read any more of the input. That is, it only reads the first five lines. Then it closes the input and output files. If you need to read the rest of the input file and copy it to the output file, you'll need to write another loop to do that.
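Here is a minimal sketch of one possible reading of the requirements. It assumes that every line exactly equal to one of the codes should be dropped wherever it appears, and that all other lines are copied unchanged; the question does not confirm this. The class name CodeFilter and the method copyWithout are made up for illustration, and the method works on a Reader and a Writer so it can wrap the file streams from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CodeFilter {
    /** Copies in to out, skipping lines that equal any entry of activecode. */
    static void copyWithout(Reader in, Writer out, String[] activecode) throws IOException {
        Set<String> toRemove = new HashSet<>(Arrays.asList(activecode));
        BufferedReader reader = new BufferedReader(in);
        PrintWriter writer = new PrintWriter(out);
        String line;
        // A single loop reads the whole input, not just the first few lines
        while ((line = reader.readLine()) != null) {
            if (!toRemove.contains(line)) {
                writer.println(line);
            }
        }
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        String input = "BH390311ED6911-D8P8-BG7X\nBH390311ED6916-3XX4-NDSN\n";
        copyWithout(new StringReader(input), out, new String[] {"BH390311ED6911-D8P8-BG7X"});
        System.out.print(out);
    }
}
```

Using a set means the order of the codes in the array does not have to match the order of the lines in the file.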

Java StringBuffer: replace the content from a starting index to the end of a line

I have the content of a file in a StringBuffer. The content spans many lines (it is not on a single line). I want to edit the content of a line from index 4 (just for example) to the end of that line. I use replace() to edit the content of the StringBuffer.
The point is that the replace method takes a starting index and an ending index, but I don't know the ending index, since each line has a different number of characters.
I thought of using str.indexOf("\n") to find the ending index of the line, but the file has many lines, so it would return incorrect results.
Here is readFile(), in case you need to read the code.
Thank you
public StringBuffer readFile(){ //read file line by line
File f = getFilePath(fileName);
StringBuffer sb = new StringBuffer();
String textinLine;
try {
FileInputStream fs = new FileInputStream(f);
InputStreamReader in = new InputStreamReader(fs);
BufferedReader br = new BufferedReader(in);
while (true){
textinLine = br.readLine();
if (textinLine == null) break;
sb.append(textinLine+ "\n");
}
fs.close();
in.close();
br.close();
} ... // just some catch statements heres
}
Use indexOf() as you indicated, but pass in the starting position, e.g. indexOf("\n", 4); StringBuffer provides indexOf(String str, int fromIndex) for this.
I agree with Jim's idea, why not process string before appending it to StringBuffer.
By the way, I think you can use the indexOf(String str, int fromIndex) function to walk through the StringBuffer: each time you find a "\n", keep its position as an offset, and start the next search from just past that offset.
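The two suggestions above could be combined roughly as follows. This is a sketch, assuming lines are separated by "\n" (as readFile() in the question produces) and that line and column numbers are 0-based; the class name LineTailReplacer and the method replaceLineTail are made up for illustration.

```java
class LineTailReplacer {
    /** Replaces the text from column to the end of line lineIndex (both 0-based). */
    static void replaceLineTail(StringBuffer sb, int lineIndex, int column, String replacement) {
        // Walk from newline to newline to find where the requested line starts
        int lineStart = 0;
        for (int i = 0; i < lineIndex; i++) {
            int newline = sb.indexOf("\n", lineStart);
            if (newline < 0) {
                throw new IllegalArgumentException("no line " + lineIndex);
            }
            lineStart = newline + 1;
        }
        // The line ends at the next newline, or at the end of the buffer
        int lineEnd = sb.indexOf("\n", lineStart);
        if (lineEnd < 0) {
            lineEnd = sb.length();
        }
        int from = lineStart + column;
        if (from > lineEnd) {
            throw new IllegalArgumentException("line is shorter than column " + column);
        }
        sb.replace(from, lineEnd, replacement);
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("first line\nsecond line\nthird\n");
        replaceLineTail(sb, 1, 4, "XY");
        System.out.print(sb);
    }
}
```

Because the end index is searched from the start of the chosen line, it does not matter how many other lines the buffer holds or how long they are.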

Java: Copy strings from a file to another without losing the 'newline format'

Sorry in advance if the title is misleading/wrong but this is the best I can do after a really long day spent practicing with Java. (my brain is melting)
I put this code together to read a file and copy it into another file, skipping any line that begins with a given string (BeginOfTheLineToRemove). It actually works and removes the desired line but, for some reason, it forgets about the \n (newline). Spacing and symbols are copied. I can't figure it out. I really hope someone will help. Cheers from a Java newb from Italy ;)
public void Remover(String file, String BeginOfTheLineToRemove) {
File StartingFile = new File(file);
File EndingFile = new File(StartingFile.getAbsolutePath() + ".tmp");
BufferedReader br = new BufferedReader(new FileReader(file));
PrintWriter pw = new PrintWriter(new FileWriter(EndingFile));
String line;
while ((line = br.readLine()) != null) {
if (line.startsWith(LineToRemoveThatBeginWithThis)) {
continue;
}
pw.write(line);
}
pw.close();
br.close();
}
Use pw.println instead of pw.write. println adds new line character after it writes content.
You are using PrintWriter.write() to write the lines - This does not by default write newline at the end. Use println() instead.
This will probably help you.
The BufferedReader.readLine() method does not return any line-termination characters, so your line will not contain them.
BufferedReader#readLine documentation says:
Returns: A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
That is, the reader strips the line termination characters from your Strings, so you need to manually add them again:
// \n on Linux/Mac, \r\n on Windows
String lineSep = System.getProperty("line.separator");
pw.write(line);
pw.write(lineSep);
BufferedReader.readLine() uses the newline to identify the end of the line, and the string that it returns does not contain this newline. The newline is a separator, so it is not considered part of the data.
To compensate for this, you can add a newline to your output, like so:
while((line = br.readLine()) != null) {
if(line.startsWith(LineToRemoveThatBeginWithThis)) continue;
pw.write(line);
pw.println();
}
The extra call to PrintWriter.println() will print a newline after you write out your line of text.
Outside the loop, get the system's line separator:
String lineSeparator = System.getProperty("line.separator");
Then append that to the line you've read in:
pw.write(line+lineSeparator);
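For comparison, here is a sketch of the same task using the java.nio.file.Files helpers available since Java 7 and 8 instead of explicit readers and writers. It assumes the file is UTF-8 text and small enough to hold in memory; the class name PrefixLineRemover and the method removeLines are made up for illustration. Files.write terminates each line with the platform line separator, so the newlines stripped on reading are added back on writing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PrefixLineRemover {
    /** Copies source to target, leaving out every line that starts with prefix. */
    static void removeLines(Path source, Path target, String prefix) throws IOException {
        List<String> kept = Files.readAllLines(source, StandardCharsets.UTF_8).stream()
                .filter(line -> !line.startsWith(prefix))
                .collect(Collectors.toList());
        // Each element is written as one line, followed by the platform line separator
        Files.write(target, kept, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("remover", ".txt");
        Path target = Files.createTempFile("remover", ".tmp");
        Files.write(source, Arrays.asList("keep me", "# drop me", "keep me too"),
                StandardCharsets.UTF_8);
        removeLines(source, target, "#");
        System.out.println(Files.readAllLines(target, StandardCharsets.UTF_8));
    }
}
```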

Java - Reading a csv file line by line - stuck with weird non-existent characters being read!

Hello fellow Java developers. I'm having a very strange issue.
I'm trying to read a csv file line by line. I'm at the point where I'm just testing out the reading of the lines. Only, each time I read a line, the line contains square characters between each character of text. I even saved the file as a txt file in WordPad and Notepad with no change.
Thus I must be doing something stupid...
I have a csv file, a standard csv file, yes, a text file with commas in it. I try to read a line of text, but the text is all messed up and I cannot find the phrase within the text.
Any advice? Code below.
//open csv
File filReadMe = new File(strRoot + "data2.csv");
BufferedReader brReadMe = new BufferedReader
(new InputStreamReader(new FileInputStream(filReadMe)));
String strLine = brReadMe.readLine();
//for all lines
while (strLine != null){
//if line contains "(see also"
if (strLine.toLowerCase().contains("(see also")){
//write line from "(see also" to ")"
int iBegin = strLine.toLowerCase().indexOf("(see also");
String strTemp = strLine.substring(iBegin);
int iLittleEnd = strTemp.indexOf(")");
System.out.println(strLine.substring(iBegin, iBegin + iLittleEnd));
}
//update line
strLine = brReadMe.readLine();
} //end for
brReadMe.close();
I can only think that this is an inconsistent character encoding. Open the file in notepad, choose Save As, and select UTF-8 in the drop down for "encoding". Then add "UTF-8" as a second parameter to InputStreamReader, e.g.
BufferedReader brReadMe = new BufferedReader
(new InputStreamReader(new FileInputStream(filReadMe), "UTF-8"));
That should sort out any inconsistencies with encoding.
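To illustrate why the charset passed to InputStreamReader matters, here is a small sketch. It rests on an assumption the question does not confirm: that the file was saved as UTF-16 (the "Unicode" option in Notepad), which stores a zero byte next to every ASCII character. Decoded with a single-byte charset, those zero bytes show up as stray characters between the letters, and contains() no longer finds the phrase. The class name CharsetDemo and the method firstLine are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    /** Decodes data with the given charset and returns its first line. */
    static String firstLine(byte[] data, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the bytes of a file saved as UTF-16 (an assumption)
        byte[] data = "(see also x)".getBytes(StandardCharsets.UTF_16LE);

        String wrong = firstLine(data, StandardCharsets.ISO_8859_1);
        String right = firstLine(data, StandardCharsets.UTF_16LE);

        System.out.println(wrong.contains("(see also")); // false
        System.out.println(right.contains("(see also")); // true
    }
}
```

Whichever encoding the file really uses, the charset given to InputStreamReader has to match it; re-saving the file as UTF-8 and reading it as "UTF-8" is one way to make the two agree.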
