I'm writing to a RandomAccessFile like this (in a subclass of LinkedList):
file.setLength(0);
for (Person person : this)
file.writeUTF(person.getBlob());
Person.getBlob() returns a string of constant length containing only basic alphanumeric characters, spaces and CRs (one-byte characters only). At this point the file contains exactly 100 records (confirmed with a hex editor).
Then I try to read that file:
int counter = 0;
while (true) {
try {
add(Person.fromBlob(file.readUTF()));
} catch (EOFException e) {
System.out.println(counter + " records read from file.");
break;
} catch (Exception exception) {
throw new DBException(exception);
}
counter++;
}
I always end up with one record read correctly and an EOFException. What's wrong with this code?
I've found the solution. That class had a custom add() method that rewrote the file every time something was added. At the beginning of the loop there were 100 entries in the file, but after the loop body had executed once only one entry was left. There was also some extra code that always re-added the missing 99 entries afterwards, which is why the file looked intact.
Replacing add() with super.add() solved the problem.
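For anyone who hits the same thing, here is a minimal sketch of the situation. The names RecordList, save() and load() are hypothetical, and plain strings stand in for the Person blobs; the only point is that load() calls super.add() so that reading does not trigger the file-rewriting add().

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.util.LinkedList;

// Hypothetical stand-in for the list class in the question; records are plain strings.
public class RecordList extends LinkedList<String> {
    private final RandomAccessFile file;

    public RecordList(RandomAccessFile file) {
        this.file = file;
    }

    // Rewrites the whole file from the in-memory list.
    public void save() throws IOException {
        file.setLength(0);
        file.seek(0);
        for (String record : this) {
            file.writeUTF(record);
        }
    }

    // Persisting add(): fine in normal use, destructive if called while loading,
    // because it truncates the file that is still being read.
    @Override
    public boolean add(String record) {
        boolean changed = super.add(record);
        try {
            save();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return changed;
    }

    // Reads every record back; super.add() bypasses the file rewrite.
    public int load() throws IOException {
        clear();
        file.seek(0);
        int counter = 0;
        while (true) {
            try {
                super.add(file.readUTF());
                counter++;
            } catch (EOFException e) {
                return counter; // end of file reached: all records read
            }
        }
    }
}
```

If load() called add() instead of super.add(), the first iteration would truncate the file to a single record and the second readUTF() would hit EOF, which matches the symptom described above.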
I have this code:
public static void main(String[] args) {
List<Valtozas> lista = new ArrayList<Valtozas>();
try {
File fajl = new File("c://data//uzemanyag.txt");
Scanner szkenner = new Scanner(fajl, "UTF8");
while (szkenner.hasNext()) {
String sor = szkenner.nextLine();
String [] darabok = sor.split(";");
String szoveg = darabok[0];
Valtozas valtozas = new Valtozas(Integer.valueOf(darabok[0]), Integer.valueOf(darabok[1]), Integer.valueOf(darabok[2]));
lista.add(valtozas);
}
}
catch (Exception e) {
}
//3.FELADAT
System.out.println("3.Feladat: Változások száma: "+lista.size());
}
}
Here I want to convert the String to an int, but I can't. I tried Integer.valueOf(darabok[0]), but it's not working. Nothing happens: the code builds, but it drops out of the while loop.
Based on the source code you have shown us, the problem is a format mismatch between the input file and the program that is trying to read it.
The program reads the file a line at a time and splits each line into fields using a single semicolon (with no whitespace!) as the field separator. Then it tries to parse the first three fields of each split line as integers.
For this to work the following must be true:
Every line must have at least three fields. (Otherwise you will get an ArrayIndexOutOfBoundsException.)
The three fields must match the following regex: -?[0-9]+ i.e. an optional minus sign followed by one or more decimal digits.
The resulting number must be between Integer.MIN_VALUE and Integer.MAX_VALUE.
Elaborating on the above:
Leading and trailing whitespace characters are not allowed.
Embedded decimal markers and grouping characters are not allowed.
Scientific notation is not allowed.
Numbers that are too large or too small are not allowed.
Note that if any of the above constraints is not met, then the runtime system will throw an exception. Unfortunately, you surround your code with this:
try {
...
}
catch (Exception e) {
}
That basically ignores all exceptions by catching them and doing nothing in the handler block. You are saying to Java "if anything goes wrong, don't tell me about it!". So, naturally, Java doesn't tell you what is going wrong.
DON'T EVER DO THIS. This is called exception squashing, and it is a really bad idea [1].
There are two ways to address this:
Print or log the stacktrace; e.g.
catch (Exception e) {
e.printStackTrace();
}
Remove the try / catch, and add throws IOException to the signature of your main method.
(You can't just remove the try / catch because new Scanner(fajl, "UTF8") is declared to throw FileNotFoundException, which is a subclass of IOException. That's a checked exception, so it must be handled or declared in the method signature.)
Once you have dealt with the exception properly you will get an exception message and stacktrace that tells you what is actually going wrong. Read it, understand it, and fix your program.
1 - It is like taping over the "annoying" red light that indicates that your car's oil level is low!
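To make the constraints above concrete, here is a small sketch that parses one line at a time and reports what is wrong instead of swallowing the exception. FieldParser and parseLine are hypothetical names, and trimming each field is an assumption that goes beyond the code in the question (which does not tolerate whitespace).

```java
import java.util.Arrays;

public class FieldParser {
    // Parses the first three semicolon-separated fields of a line as ints.
    // Trimming each field is an assumption; the code in the question does not trim.
    public static int[] parseLine(String line) {
        String[] parts = line.split(";");
        if (parts.length < 3) {
            throw new IllegalArgumentException("expected at least 3 fields: \"" + line + "\"");
        }
        int[] values = new int[3];
        for (int i = 0; i < 3; i++) {
            // Throws NumberFormatException for input such as "1.5", "1e3" or "99999999999".
            values[i] = Integer.parseInt(parts[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        String[] lines = {"1;2;3", " 4 ; 5 ; 6 ", "7;8", "1.5;2;3"};
        for (String line : lines) {
            try {
                System.out.println(Arrays.toString(parseLine(line)));
            } catch (IllegalArgumentException e) {
                // NumberFormatException is a subclass of IllegalArgumentException,
                // so both kinds of bad line end up here and are reported.
                System.err.println("Bad line \"" + line + "\": " + e);
            }
        }
    }
}
```

Reporting the offending line together with the exception usually points straight at the format mismatch.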
I have a program that reads from a txt file each line and I'm supposed to handle the error when the line has more than 50 characters. I'm not very familiar with exceptions in Java, but is it ok if I just use an 'if' condition like this:
if(line.length() > 50) {
System.out.println("over 50 characters on this line");
return;
}
or should I declare a function like this:
static void checkLineLength(int lineLength) {
if(lineLength > 50) {
throw new ArithmeticException("over 50 characters");
}
}
and call it inside the main function?
checkLineLength(line.length());
Later edit: I've changed the exception handling block a bit:
static void checkLineLength(int lineLength) {
if(lineLength > 50) {
try {
throw new Exception("over 50 ch");
} catch (Exception e) {
// TODO Auto-generated catch block
e.printStackTrace();
System.exit(1);
}
}
}
Is it better? I see it works but I want to know if it's the professional way to do it.
The other answers so far concentrate on throwing and handling exceptions (with very good advice), but don't discuss whether exceptions are the best way of handling the long-text-line situation at all.
You write:
I'm supposed to handle the error when the line has more than 50
characters.
The wording "handle the error" needs interpretation / clarification. What are you supposed to do if a single line from the text file exceeds the 50-characters limit?
Use the first 50 characters and silently ignore the trailing rest?
Ignore the single line as faulty, but read the other lines?
Abort the whole file as unreadable because of syntax error, but keep the program running, e.g. to allow the user to select a different file?
Abort the whole program?
Depending on the answer to this question, exceptions might or might not be the answer for your problem.
Let's suppose the method in question looks like this:
public List<FancyObject> readFromTextFile(File file) { ... }
It reads the text file line by line and puts one FancyObject per line into the result List.
In Java, a method can only either return a result or throw an exception. So in the first and second case, where you want to get a result (at least from the short lines), you can't throw an exception.
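A rough sketch of what the first two options could look like without any exception. LineLimitPolicies and its method names are made up for illustration, and plain strings stand in for FancyObject:

```java
import java.util.ArrayList;
import java.util.List;

public class LineLimitPolicies {
    static final int MAX_LENGTH = 50;

    // First option: keep the first 50 characters and silently drop the rest.
    public static List<String> truncateLongLines(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            result.add(line.length() > MAX_LENGTH ? line.substring(0, MAX_LENGTH) : line);
        }
        return result;
    }

    // Second option: skip over-long lines as faulty, keep all the others.
    public static List<String> skipLongLines(List<String> lines) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (line.length() <= MAX_LENGTH) {
                result.add(line);
            }
        }
        return result;
    }
}
```

Both variants always return a result, which is exactly why an exception has no place in them.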
In the third case, I'd recommend throwing an exception as soon as you find a line longer than 50 characters (just as eddySoft suggested).
Even in the fourth case, I wouldn't put the System.exit() into the readFromTextFile() method, but into some higher-level method that's responsible for controlling the whole application, e.g. main(). It's a matter of readability, or the "principle of least surprise": nobody expects a method named readFromTextFile() to be able to shut down the whole Java Virtual Machine. So, even in this case, I'd have the method throw its LineLimitException, and have main() catch that, inform the user and do the System.exit().
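Roughly, that split of responsibilities could look like the sketch below. LineLimitDemo, readLines and the "input.txt" default are assumptions for illustration; the reading method only throws, and only main() decides to terminate.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LineLimitDemo {
    // Hypothetical unchecked exception for over-long lines.
    public static class LineLimitException extends RuntimeException {
        public LineLimitException(String message) {
            super(message);
        }
    }

    // Reads all lines; aborts the whole read as soon as one line exceeds 50 characters.
    public static List<String> readLines(BufferedReader reader) throws IOException {
        List<String> result = new ArrayList<>();
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            if (line.length() > 50) {
                throw new LineLimitException(
                        "line " + lineNumber + " has " + line.length() + " characters (limit 50)");
            }
            result.add(line);
        }
        return result;
    }

    public static void main(String[] args) {
        String path = args.length > 0 ? args[0] : "input.txt"; // hypothetical default file name
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(path))) {
            System.out.println(readLines(reader).size() + " lines read");
        } catch (LineLimitException | IOException e) {
            // Only the top-level method informs the user and ends the program.
            System.err.println("Cannot read " + path + ": " + e.getMessage());
            System.exit(1);
        }
    }
}
```

For the third case, the same readLines() can be reused unchanged: the caller catches LineLimitException and asks the user for a different file instead of exiting.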
As long as you throw an exception, both approaches are fine. The only advantage of writing a separate method is that you can reuse it.
Just one thing: System.out.println("over 50 characters on this line"); only logs the message to the console and silently moves on.
Throwing an exception, e.g. throw new ArithmeticException("over 50 characters");, interrupts the normal flow.
EDIT:
METHOD 1:
You can use this piece of code:
static void checkLineLength(int lineLength) {
if(lineLength > 50) {
throw new ArithmeticException("over 50 characters");
}
}
OR
METHOD 2:
static void checkLineLength(int lineLength) {
if(lineLength > 50) {
throw new ArithmeticException("over 50 characters");
}
}
and call this method from somewhere in your code and put it in a try block:
try{
checkLineLength(line.length()); // call to the method
}
catch(Exception e){
e.printStackTrace(); // print the stacktrace if exception occurs
System.exit(1);
}
Define your own exception class, for example:
class LineLimitException extends RuntimeException{
public LineLimitException(String message){
super(message);
}
}
Then use your exception class in your logic:
if(lineLength > 50) {
throw new LineLimitException("over 50 characters");
}
I'm trying to read all integers from a file into an ArrayList in the @BeforeClass of a Java JUnit test. For testing purposes, I am then simply trying to print all values of the ArrayList to the screen. Nothing is being output, however. Any input would be greatly appreciated.
public class CalcAverageTest
{
static List<Integer> intList = new ArrayList<Integer>();
@BeforeClass
public static void testPrep() {
try {
Scanner scanner = new Scanner(new File("gradebook.txt"));
while (scanner.hasNextInt()) {
intList.add(scanner.nextInt());
}
for (int i=0;i<intList.size();i++) {
System.out.println(intList.get(i));
}
} catch (IOException e) {
e.printStackTrace();
} catch (NumberFormatException ex) {
ex.printStackTrace();
}
}
}
(promoting a comment to an answer)
If gradebook.txt is an empty file, or starts with something that does not parse as an int, such as text or comments at the top of the file, then scanner.hasNextInt() will immediately return false, and intList will remain empty. The for loop will then loop over the empty list zero times, and no output will be generated, as observed.
I have some strings to skip over before the integers.
scanner.nextLine() can be used to skip over comment lines before the numbers. If it is not a set number of lines that need skipping, or if there are words on the line before the numbers, we would need to see a sample of the input to advise the best strategy for finding the numbers in the input file.
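Without a sample of the input, one generic approach is to walk over every token and keep only those that parse as an int. This is just a sketch (IntExtractor and readInts are hypothetical names), and it has an obvious caveat: a number that appears inside a comment or label is collected too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class IntExtractor {
    // Collects every whitespace-separated token that parses as an int and skips all others.
    public static List<Integer> readInts(Scanner scanner) {
        List<Integer> result = new ArrayList<>();
        while (scanner.hasNext()) {
            if (scanner.hasNextInt()) {
                result.add(scanner.nextInt());
            } else {
                scanner.next(); // discard a non-integer token and keep going
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Inline sample input; in the test class this would be new Scanner(new File("gradebook.txt")).
        Scanner scanner = new Scanner("# gradebook\nAlice 90\nBob 85\n");
        System.out.println(readInts(scanner));
    }
}
```

The important difference from the loop in the question is the condition: hasNext() keeps the loop alive past non-numeric tokens, whereas hasNextInt() stops at the first one.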
You need to iterate over the file until the last line, so you will need to change the condition of the loop and use .hasNextLine() instead of .hasNextInt()
while (scanner.hasNextLine()) {
    String currLine = scanner.nextLine();
    if (currLine != null && currLine.trim().length() > 0 && currLine.matches("^[0-9]*$"))
        intList.add(Integer.parseInt(currLine));
}
Here, we read each line into currLine and add it to intList only if it consists solely of digits; any other line is skipped. ^[0-9]*$ is a regex that matches a string made up only of digits (the separate length check rules out the empty string).
From the docs, hasNextLine()
Returns true if there is another line in the input of this scanner.
This method may block while waiting for input. The scanner does not
advance past any input.
I need to read a text file using Java. Not a problem. But I need to reject the empty lines at the end of the file. The file is quite large, around a million or so lines. I need to process each line, one at a time, even if it is empty.
But if the empty lines are at the end of the file, then I need to reject them. Note that there can be multiple empty lines at the end of the file.
Any quick solutions? I almost want to write a FileUtility.trimEmptyLinesAndEnd(File input). But I can't help feeling that someone might have written something like this already.
Any help appreciated.
Note:
I have read this link.
Java: Find if the last line of a file is empty.
But this is not what I am trying to do. I need to reject multiple
empty lines.
When you find an empty line, increment a counter for the number of empty lines. If the next line is also empty, increment the counter. If you reach the end of the file, just continue on with what you want to do (ignoring the empty lines you found). If you reach a non-empty line, first do whatever you do to process an empty line, repeating it for each empty line you counted. Then process the non-empty line as normal, and continue through the file. Also, don't forget to reset the empty line counter to zero.
Pseudo code:
emptyLines = 0;
while (the file has a next line) {
if (line is empty) {
emptyLines++;
} else {
if (emptyLines > 0) {
for (i = 0; i < emptyLines; i++) {
process empty line;
}
emptyLines = 0;
}
process line;
}
}
You have to read all lines in your file. You can introduce a guard that stores the index of the last non-empty line. At the end, return the subset of lines from zero to that guard.
In case you process the file as a stream:
read line
if empty
increase empty lines counter
else
if there was some empty lines
yield fake empty lines that counter store
reset counter
yield line
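The streaming idea above could be sketched in Java roughly as follows. TrailingBlankLines and processLines are hypothetical names, and treating whitespace-only lines as empty is an assumption. Instead of a bare counter, the pending blank lines themselves are buffered, so the handler receives them exactly as they were read; memory use is bounded by the longest run of blank lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class TrailingBlankLines {
    // Passes every line to the handler, except blank lines at the end of the input.
    public static void processLines(BufferedReader reader, Consumer<String> handler)
            throws IOException {
        List<String> pendingBlank = new ArrayList<>(); // blank lines not yet known to be trailing
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                pendingBlank.add(line);
            } else {
                // Content follows, so the buffered blank lines were not trailing.
                pendingBlank.forEach(handler);
                pendingBlank.clear();
                handler.accept(line);
            }
        }
        // Anything still pending here is trailing and is dropped.
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("a\n\n\nb\n\n\n"));
        processLines(reader, l -> System.out.println("[" + l + "]"));
    }
}
```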
Thanks for all the responses. I think both Vash - Damian Leszczyński and forgivenson cracked the pseudocode for this problem. I have taken that forward and am providing here the Java code for people who come looking for an answer after me.
@Test
public void test() {
BufferedReader br = null;
try {
String sCurrentLine;
StringBuffer fileContent = new StringBuffer();
int consecutiveEmptyLineCounter = 0;
br = new BufferedReader(new FileReader("D:\\partha\\check.txt"));
while ((sCurrentLine = br.readLine()) != null) {
// if this is not an empty line
if (!(sCurrentLine.trim().length() == 0)) {
// if there are no empty lines before this line.
if (!(consecutiveEmptyLineCounter > 0)) {
// It is a non empty line, with non empty line prior to this
// Or it is the first line of the file.
// Don't do anything special with it.
// Appending "|" at the end just for ease of debug.
System.out.println(sCurrentLine + "|");
} else {
// This is a non empty line, but there were empty lines before this.
// The consecutiveEmptyLineCounter is > 0
// The "fileContent" already has the previous empty lines.
// Add this non empty line to "fileContent" and spit it out.
fileContent.append(sCurrentLine);
System.out.println(fileContent.toString() + "#");
// and by the way, the counter of consecutive empty lines has to be reset.
// "fileContent" has to start from a clean slate.
consecutiveEmptyLineCounter = 0;
fileContent = new StringBuffer();
}
} else {
// this is an empty line
// Don't execute anything on it.
// Just keep it in temporary "fileContent"
// And count up the consecutiveEmptyLineCounter
fileContent.append(sCurrentLine);
fileContent.append(System.getProperty("line.separator"));
consecutiveEmptyLineCounter++;
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (br != null)
br.close();
} catch (IOException ex) {
ex.printStackTrace();
}
}
}
Thanks for all the help.
And, what I have provided here is just one solution. If someone comes across something more clever, please share. I can't shake the feeling off that there should be some FileUtils.trimEmptyLinesAtEnd() sort of method somewhere.
Just read the File backwards. Starting from the first line you read, refrain from processing all blank lines you encounter.
Starting with the first non-blank line you encounter, and thereafter, process all lines whether or not they're blank.
The problem is "intractable" with respect to a neat solution if you read the File forward, since you can never know whether, at some point after a long run of blank lines, there might yet be a non-blank line.
If the processing of lines in order, first-to-last, matters, then there is no neat solution and anything like what you have now is about what there is.
Aside: I am using the penn.txt file for the problem. The link here is to my Dropbox but it is also available in other places such as here. However, I've not checked whether they are exactly the same.
Problem statement: I would like to do some word processing on each line of the penn.txt file which contains some words and syntactic categories. The details are not relevant.
Actual "problem" faced: I suspect that the file has some consecutive blank lines (which should ideally not be present), which I think the code verifies but I have not verified it by eye, because the number of lines is somewhat large (~1,300,000). So I would like my Java code and conclusions checked for correctness.
I've used a slightly modified version of the code for converting a file to a String and counting the number of lines in a string. I'm not sure about the efficiency of splitting, but it works well enough for this case.
File file = new File("final_project/penn.txt"); //location
System.out.println(file.exists());
//converting file to String
byte[] encoded = null;
try {
encoded = Files.readAllBytes(Paths.get("final_project/penn.txt"));
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
String mystr = new String(encoded, StandardCharsets.UTF_8);
//splitting and checking "consecutiveness" of \n
for(int j=1; ; j++){
String split = new String();
for(int i=0; i<j; i++){
split = split + "\n";
}
if(mystr.split(split).length==1) break;
System.out.print("("+mystr.split(split).length + "," + j + ") ");
}
//counting using Scanner
int count=0;
try {
Scanner reader = new Scanner(new FileInputStream(file));
while(reader.hasNext()){
count++;
String entry = reader.next();
//some word processing here
}
reader.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
System.out.println(count);
The number of lines in Gedit--if I understand correctly--matched the number of \n characters found at 1,283,169. I have verified (separately) that the number of \r and \r\n (combined) characters is 0 using the same splitting idea. The total splitting output is shown below:
(1283169,1) (176,2) (18,3) (13,4) (11,5) (9,6) (8,7) (7,8) (6,9) (6,10) (5,11) (5,12) (4,13) (4,14) (4,15) (4,16) (3,17) (3,18) (3,19) (3,20) (3,21) (3,22) (3,23) (3,24) (3,25) (2,26) (2,27) (2,28) (2,29) (2,30) (2,31) (2,32) (2,33) (2,34) (2,35) (2,36) (2,37) (2,38) (2,39) (2,40) (2,41) (2,42) (2,43) (2,44) (2,45) (2,46) (2,47) (2,48) (2,49) (2,50)
Please answer whether the following statements are correct or not:
From this, what I understand is that there is one instance of 50 consecutive \n characters and because of that there are exactly two instances of 25 consecutive \n characters and so on.
The last count (using Scanner) reading gives 1,282,969 which is an exact difference of 200. In my opinion, what this means is that there are exactly 200 (or 199?) empty lines floating about somewhere in the file.
Is there any way to separately verify this "discrepancy" of 200? (something like a set-theoretic counting of intersections maybe)
A partial answer to the question (the last part) is as follows:
(Assuming the two statements in the question are true)
If, instead of printing the number of split parts, you print the number of occurrences of \n repeated j times, you'll get (simply subtracting 1):
(1283168,1) (175,2) (17,3) (12,4) (10,5) (8,6) (7,7) (6,8) (5,9) (5,10) (4,11) (4,12) (3,13) (3,14) (3,15) (3,16) (2,17) (2,18) (2,19) (2,20) (2,21) (2,22) (2,23) (2,24) (2,25) (1,26) (1,27) (1,28) (1,29) (1,30) (1,31) (1,32) (1,33) (1,34) (1,35) (1,36) (1,37) (1,38) (1,39) (1,40) (1,41) (1,42) (1,43) (1,44) (1,45) (1,46) (1,47) (1,48) (1,49) (1,50)
Note that for j>3, product of both numbers is <=50, which is your maximum. What this means is that there is a place with 50 consecutive \n characters and all the hits you are getting from 4 to 49 are actually part of the same.
However for 3, the maximum multiple of 3 less than 50 is 48 which gives 16 while you have 17 occurrences here. So there is an extra \n\n\n somewhere with non-\n character on both its 'sides'.
Now for 2 (\n\n), we can subtract 25 (coming from the 50 \ns) and 1 (coming from the separate \n\n\n) to obtain 175-26 = 149.
To account for the discrepancy, we should sum (2-1)*149 + (3-1)*1 + (50-1)*1, the -1 coming in because the first \n in each of these runs is already accounted for in the Scanner count. This sum is 200.
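An independent way to check this arithmetic is to scan the text once and tally the runs of consecutive \n characters directly, rather than inferring them from split counts. The sketch below uses hypothetical names (NewlineRuns, runLengths, emptyLines) and assumes every run is preceded by some text; a run at the very start of the file would contribute one more empty line than counted here.

```java
import java.util.Map;
import java.util.TreeMap;

public class NewlineRuns {
    // Maps each run length of consecutive '\n' characters to the number of such runs.
    public static Map<Integer, Integer> runLengths(String text) {
        Map<Integer, Integer> runs = new TreeMap<>();
        int current = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                current++;
            } else if (current > 0) {
                runs.merge(current, 1, Integer::sum);
                current = 0;
            }
        }
        if (current > 0) {
            runs.merge(current, 1, Integer::sum); // run at the very end of the text
        }
        return runs;
    }

    // A run of n newlines that follows some text contains n - 1 empty lines.
    public static long emptyLines(Map<Integer, Integer> runs) {
        long total = 0;
        for (Map.Entry<Integer, Integer> e : runs.entrySet()) {
            total += (long) (e.getKey() - 1) * e.getValue();
        }
        return total;
    }
}
```

Calling runLengths(mystr) on the string from the question and passing the result to emptyLines should reproduce the 200 if the reasoning above holds; with 149 runs of length 2, one of length 3 and one of length 50, emptyLines gives 149 + 2 + 49 = 200.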