How to skip rest of line using BufferedReader - java

In my program users frequently search a .txt file for certain information. To know if the right bit of data has been found I first check each line to see if it starts with a special character signalling the start of a group of data, something like this:
//one character has so far been read
if(character == '#'){
//continue to examine data
}else{
//skip the rest of the line
}
The problem I'm having is how to actually "skip the rest of the line", if the line did not start with my special character of choice.
As per complaints about insufficient information: I am indeed using a while loop to read each line

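You can just do the action inside the if. Each readLine() call consumes a whole line, so lines that do not match are skipped automatically:
BufferedReader csvFile = new BufferedReader(
        new InputStreamReader(inputStream));
String csvLine;
while ((csvLine = csvFile.readLine()) != null) {
    // startsWith is safe on empty lines, where charAt(0) would throw
    if (csvLine.startsWith("#")) {
        // do # data action here
    }
}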
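If the first character has already been consumed with read(), as in the question, the rest of the line can be discarded with a single readLine() call. The following is a minimal sketch under that assumption; the class name RestOfLineSkipper and the method groupHeaders are hypothetical, and a StringReader stands in for the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class RestOfLineSkipper {
    // Collects the text after '#' for every line that starts with '#'.
    static List<String> groupHeaders(BufferedReader reader) throws IOException {
        List<String> headers = new ArrayList<>();
        int first;
        while ((first = reader.read()) != -1) {
            if (first == '\n' || first == '\r') {
                // Empty line (or one half of "\r\n"): there is nothing left to skip.
                continue;
            }
            // readLine() consumes everything up to and including the line terminator.
            String rest = reader.readLine();
            if (first == '#') {
                // rest is null only if the stream ended right after the '#'.
                headers.add(rest == null ? "" : rest);
            }
            // Otherwise the rest of the line has simply been discarded.
        }
        return headers;
    }

    public static void main(String[] args) throws IOException {
        String sample = "#group A\nvalue 1\n\n#group B\nvalue 2\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(groupHeaders(reader));
        }
    }
}
```

The empty-line check matters: without it, a readLine() call after reading a bare line terminator would swallow the following line.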

Use the Scanner class and its nextLine() method; it will help you a lot.
If that seems a bit difficult, read the file line by line and use a regex pattern to check each line for the pattern you need.
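A rough sketch of that suggestion, combining Scanner.nextLine() with a regex check per line. The pattern, the class name ScannerLineFilter, and the method matchingLines are illustrative assumptions, not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

class ScannerLineFilter {
    // Assumed pattern: a line whose first character is '#'.
    private static final Pattern GROUP_START = Pattern.compile("^#.*");

    // Returns the lines of text that match GROUP_START; all other lines are skipped.
    static List<String> matchingLines(String text) {
        List<String> result = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (GROUP_START.matcher(line).matches()) {
                    result.add(line);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(matchingLines("#group A\nvalue 1\n#group B\nvalue 2"));
    }
}
```

For a real file, new Scanner(new File(...)) could replace the String-based constructor used here.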

Related

.contains not working when reading from a text file?

I recently started Java and have been trying to make a database sort of program which reads from a preset text file; the user can search for a definition using either the term or key terms within the definition itself. Searching by term works fine, but searching by key term always outputs "Not Found".
FileReader fr = new FileReader("text.txt");
BufferedReader br = new BufferedReader(fr);
boolean found = false;
String line = br.readLine(); // first line so the term itself
String lineTwo = br.readLine(); // second line which is the definition
do {
if (lineTwo.toLowerCase().contains(keyterm.toLowerCase())) {
found = true;
System.out.println("Found "+keyterm);
System.out.println(line);
System.out.println(lineTwo);
}
} while ((br.readLine()!=null)&(!found));
if (!found) {
System.out.println("Not Found");
}
br.close();
fr.close();
This is the method I use to check for the key term. It works partially: it seems to handle only the first two lines, so it outputs the definition of the first term if the key term is in it, but it doesn't work for any of the other terms.
edit
The text file it reads from looks something like this:
term
definition
term
definition
Each have their own line.
Edit 2
Thanks to @Matthew Kerian it now checks through the whole file, changing the end of the do while loop to
while (((lineTwo = br.readLine())!=null)&(!found));
It now finds the actual definition but is now outputting the wrong term with it.
Edit 3 The key term is defined by the users input
Edit 4 If it wasn't clear the output in the end I am looking for is either the definition of the term/key term if it is in the txt file or just not found if its not found.
Edit 5 Tried to look at what it was outputting and noticed it was outputting "array" (the first term in the text file) after every lineTwo; it seems as though line is not updating.
Final Edit Managed to crudely solve the problem by making another text file with the order flipped: instead of term then definition, it now goes definition then term, which lets me read the next line once the definition is found, so it outputs properly.
lineTwo is not being refreshed with new data. Something like this would work better:
do {
if (lineTwo.toLowerCase().contains(keyterm.toLowerCase())) {
found = true;
System.out.println("Found "+keyterm);
System.out.println(line);
System.out.println(lineTwo);
}
} while (((lineTwo = br.readLine())!=null)&(!found));
We're still checking for EOF by testing for null, but by assigning the result of readLine() to lineTwo we refresh it with new data on every iteration.
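Note that this loop reads one line per iteration and never reassigns line, which matches the symptom in Edit 5. Since the file alternates term and definition, one possible fix is to read both lines on each pass. This is only a sketch under that file-layout assumption; the class name GlossarySearch and the method findByKeyterm are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class GlossarySearch {
    // Returns {term, definition} for the first definition containing keyterm
    // (case-insensitive), or null if no definition matches.
    static String[] findByKeyterm(BufferedReader reader, String keyterm) throws IOException {
        String needle = keyterm.toLowerCase();
        String term;
        while ((term = reader.readLine()) != null) {
            String definition = reader.readLine();
            if (definition == null) {
                break; // a term without a definition at the end of the file
            }
            if (definition.toLowerCase().contains(needle)) {
                return new String[] {term, definition};
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        String sample = "array\na list of values\nloop\nrepeats a block of code\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            String[] hit = findByKeyterm(reader, "block");
            System.out.println(hit == null ? "Not Found" : hit[0] + ": " + hit[1]);
        }
    }
}
```

Because term and definition are read together, the term printed always belongs to the definition that matched.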

JAVA Unrecognized Character of the first character in the first line

I have lines of code to read the content of the file in Java. Basically I am using FileReader and BufferedReader. I am reading the lines correctly, however, the first character of the first line seems to be an undefined symbol. I have no idea where I got this symbol since the content of the input file is correct.
Here is the code:
FileReader readFile = new FileReader(chosenFile);
BufferedReader input = new BufferedReader(readFile);
while((line = input.readLine()) != null) {
System.out.println(line);
}
If it appears only in the first line, this is probably a BOM (Byte Order Mark). Most modern text editors recognize it and do not display it as part of the text. When you save the text file, there should be an option to save with or without it.
If you wish to handle the BOM in Java, see the question "Reading UTF-8 - BOM marker".
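One way to drop a leading BOM when reading is to peek at the first decoded character and discard it if it is U+FEFF. The sketch below assumes the file is UTF-8 encoded; the class name BomSkippingReader and the method open are hypothetical, and a byte array stands in for the file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BomSkippingReader {
    private static final char BOM = '\uFEFF';

    // Wraps the stream in a UTF-8 reader and consumes a leading BOM if one is present.
    static BufferedReader open(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1);
        if (reader.read() != BOM) {
            reader.reset(); // not a BOM: put the first character back
        }
        return reader;
    }

    public static void main(String[] args) throws IOException {
        // The bytes EF BB BF are the UTF-8 encoding of the BOM.
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i', '\n'};
        try (BufferedReader reader = open(new ByteArrayInputStream(data))) {
            System.out.println(reader.readLine());
        }
    }
}
```

Opening the file with an explicit charset also avoids depending on the platform default encoding, which a plain FileReader uses on older Java versions.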

Scanner's nextLine(), Only fetching partial

So, using something like:
for (int i = 0; i < files.length; i++) {
if (!files[i].isDirectory() && files[i].canRead()) {
try {
Scanner scan = new Scanner(files[i]);
System.out.println("Generating Categories for " + files[i].toPath());
while (scan.hasNextLine()) {
count++;
String line = scan.nextLine();
System.out.println(" ->" + line);
line = line.split("\t", 2)[1];
System.out.println("!- " + line);
JsonParser parser = new JsonParser();
JsonObject object = parser.parse(line).getAsJsonObject();
Set<Entry<String, JsonElement>> entrySet = object.entrySet();
exploreSet(entrySet);
}
scan.close();
// System.out.println(keyset);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
}
as one goes over a Hadoop output file, one of the JSON objects in the middle is breaking... because scan.nextLine() is not fetching the whole line before it brings it to split. ie, the output is:
->0 {"Flags":"0","transactions":{"totalTransactionAmount":"0","totalQuantitySold":"0"},"listingStatus":"NULL","conditionRollupId":"0","photoDisplayType":"0","title":"NULL","quantityAvailable":"0","viewItemCount":"0","visitCount":"0","itemCountryId":"0","itemAspects":{ ... "sellerSiteId":"0","siteId":"0","pictureUrl":"http://somewhere.com/45/x/AlphaNumeric/$(KGrHqR,!rgF!6n5wJSTBQO-G4k(Ww~~
!- {"Flags":"0","transactions":{"totalTransactionAmount":"0","totalQuantitySold":"0"},"listingStatus":"NULL","conditionRollupId":"0","photoDisplayType":"0","title":"NULL","quantityAvailable":"0","viewItemCount":"0","visitCount":"0","itemCountryId":"0","itemAspects":{ ... "sellerSiteId":"0","siteId":"0","pictureUrl":"http://somewhere.com/45/x/AlphaNumeric/$(KGrHqR,!rgF!6n5wJSTBQO-G4k(Ww~~
Most of the above data has been sanitized (not the URL (for the most part) however... )
and the URL continues as:
$(KGrHqZHJCgFBsO4dC3MBQdC2)Y4Tg~~60_1.JPG?set_id=8800005007
in the file....
So it's slightly miffing.
This is also entry #112, and I have had other files parse without errors... but this one is screwing with my mind, mostly because I don't see how scan.nextLine() isn't working...
By debug output, the JSON error is caused by the string not being split properly.
And almost forgot, it also works JUST FINE if I attempt to put the offending line in its own file and parse just that.
EDIT:
Also blows up if I remove the offending line in about the same place.
Attempted with JVM 1.6 and 1.7
Workaround Solution:
BufferedReader scan = new BufferedReader(new FileReader(files[i]));
instead of scanner....
Based on your code, the best explanation I can come up with is that the line really does end after the "~~" according to the criteria used by Scanner.nextLine().
The criteria for an end-of-line are:
Something that matches this regex: "\r\n|[\n\r\u2028\u2029\u0085]" or
The end of the input stream
You say that the file continues after the "~~", so lets put EOF aside, and look at the regex. That will match any of the following:
The usual line separators:
<CR>
<NL>
<CR><NL>
... and three unusual forms of line separator that Scanner also recognizes.
0x0085 is the <NEL> or "next line" control code in the "ISO C1 Control" group
0x2028 is the Unicode "line separator" character
0x2029 is the Unicode "paragraph separator" character
My theory is that you've got one of the "unusual" forms in your input file, and this is not showing up in .... whatever tool it is that you are using to examine the files.
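To illustrate the theory, the sketch below feeds the same text to Scanner and to BufferedReader. The sample string with an embedded U+0085 is an assumption made up for the demonstration, not data from the real file, and the class name LineSeparatorDemo is hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class LineSeparatorDemo {
    // Splits text the way Scanner.nextLine() does.
    static List<String> scannerLines(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            while (scanner.hasNextLine()) {
                lines.add(scanner.nextLine());
            }
        }
        return lines;
    }

    // Splits text the way BufferedReader.readLine() does (only \n, \r, \r\n).
    static List<String> readerLines(String text) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        String text = "first~~\u0085rest\nsecond";
        System.out.println(scannerLines(text).size()); // 3: Scanner breaks at U+0085
        System.out.println(readerLines(text).size());  // 2: BufferedReader does not
    }
}
```

If the file does contain such a character, this difference would also be consistent with the BufferedReader workaround reading the line in full.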
I suggest that you examine the input file using a tool that can show you the actual bytes of the file; e.g. the od utility on a Linux / Unix system. Also, check that this isn't caused by some kind of character encoding mismatch ... or trying to read or write binary data as text.
If these don't help, then the next step should be to run your application using your IDE's Java debugger, and single-step it through the Scanner.hasNextLine() and nextLine() calls to find out what the code is actually doing.
"And almost forgot, it also works JUST FINE if I attempt to put the offending line in its own file and parse just that."
That's interesting. But if the tool you are using to extract the line is the same one that is not showing the (hypothesized) unusual line separator, then this evidence is not reliable. The process of extraction may be altering the "stuff" that is causing the problems.

New lines in Java source files: How to test for them using Character class?

Writing a lexer of .java source files in Java. I have a stream of characters and I trying to make the lexer skip single-line comments.
I loop through each char, and my hypothesis is that it should be possible to first detect the // of the comment and then skip subsequent chars until the next newline character. But it does not work: I cannot detect any newline character. This is my code:
//is it a single line comment?
if(currentChar == '/') {
//loop through char:s until next new line
while(inComment == true) {
//increment loop
i++;
//extract next char
currentChar = stringInput.charAt(i);
//check if current character is a new line
if(( currentChar == '\n' ) || ( currentChar == '\r' )) {
inComment = false;
System.out.println("End Of Line Comment.");
}
}
}
So, do .java source files have newline characters? Is it possible to detect them using the Character class or in any other way?
Many thanks in advance!
SOLUTION:
The newline characters seem to have been lost while reading the code from the .java source file using a BufferedReader and appending the lines to a StringBuilder. The problem was solved by instead reading the .java file using readFileToString() from org.apache.commons.io.FileUtils, which worked like a charm!
How do you read stringInput? If you're using readLine, why not just follow this pseudo-code:
if (stringInput starts with "//")
readNextLine()
Much shorter and easier to follow. Hint: Read through the String API.
That should work: the comparison currentChar == '\n' is fine and returns true when you reach the end of the line.
Are you sure that your line breaks don't get lost already when reading in the file, e.g. by using BufferedReader.readLine()? If that could be the case, try another way to read the file into a String, e.g. use FileUtils.readFileToString from Apache Commons IO.
Are you setting inComment to true between the if condition and the while loop? Also, you have to check for two slashes.
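Putting these points together, here is a minimal sketch that keeps the line breaks when reading with readLine() and checks for two slashes before skipping to the end of the line. The class name LineCommentSkipper and both method names are hypothetical, and the scan deliberately ignores string literals and block comments, which a real lexer would have to handle.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class LineCommentSkipper {
    // readLine() drops the line terminator, so append '\n' after each line.
    static String readKeepingNewlines(BufferedReader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }

    // Copies input, dropping everything from "//" up to (not including) the line break.
    static String stripLineComments(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            boolean startsComment =
                    c == '/' && i + 1 < input.length() && input.charAt(i + 1) == '/';
            if (startsComment) {
                // Skip characters until the next line break or the end of the input.
                while (i < input.length()
                        && input.charAt(i) != '\n' && input.charAt(i) != '\r') {
                    i++;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String source = "int a; // first\nint b;";
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            System.out.print(stripLineComments(readKeepingNewlines(reader)));
        }
    }
}
```

The bounds checks inside both loops also avoid the StringIndexOutOfBoundsException that charAt(i) would throw if a comment ran to the end of the input.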

I Am Not Getting the Result I Expect Using readLine() in Java

I am using the code snippet below, however it's not working quite as I understand it should.
public static void main(String[] args) {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
String line;
try {
line = br.readLine();
while(line != null) {
System.out.println(line);
line = br.readLine();
}
} catch (IOException e) {
e.printStackTrace();
}
}
From reading the Javadoc about readLine() it says:
Reads a line of text. A line is considered to be terminated by any one of a line feed (\n), a carriage return (\r), or a carriage return followed immediately by a linefeed.
Returns:
A String containing the contents of the line, not including any line-termination characters, or null if the end of the stream has been reached
Throws:
IOException - If an I/O error occurs
From my understanding of this, readLine should return null the first time no input is entered other than a line termination, like \r. However, this code just ends up looping infinitely. After debugging, I have found that instead of null being returned when just a termination character is entered, it actually returns an empty string (""). This doesn't make sense to me. What am I not understanding correctly?
"From my understanding of this, readLine should return null the first time no input is entered other than a line termination, like '\r'."
That is not correct. readLine will return null only if the end of the stream is reached: for example, if you are reading a file and the file ends, or if you're reading from a socket and the socket closes.
But if you're simply reading the console input, hitting the return key on your keyboard does not constitute an end of stream. It simply ends the current line (with \n or \r\n depending on your OS), and since readLine strips the terminator, you get back an empty string.
So, if you want to stop on both the empty string and the end of the stream, you should do:
while (line != null && !line.equals(""))
Also, your current program should work as expected if you pipe some file directly into it, like so:
java -cp . Echo < test.txt
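A self-contained version of the loop with both checks might look like the sketch below. The class name EchoUntilBlank and the method readUntilBlank are hypothetical; the method takes any BufferedReader so it can be tried without a console.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class EchoUntilBlank {
    // Reads lines until the stream ends (null) or an empty line is entered.
    static List<String> readUntilBlank(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null && !line.isEmpty()) {
            lines.add(line);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
        for (String line : readUntilBlank(br)) {
            System.out.println(line);
        }
    }
}
```

Unlike the original program, this version collects the lines first and echoes them once input stops, which keeps the reading logic separate from the printing.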
No input is not the same as the end of the stream. You can usually simulate the end of the stream in a console by pressing Ctrl+D (on Windows, Ctrl+Z followed by Enter). But I guess this is not what you want, so it is better to test for empty strings in addition to null.
There's a nice Apache Commons Lang library which has a good API for common :) actions. You could statically import StringUtils.isNotEmpty(String), which returns false for both null and the empty string, to get:
while(isNotEmpty(line)) {
System.out.println(line);
line = br.readLine();
}
It might be useful someday:) There are also other useful classes in this lib.
