FileOutputStream switching between appending and overwriting - java

I have made a simple data file which contains a header of only 4 bytes. These 4 bytes define how many records are stored inside the file. Both the header and record are of pre-defined sizes and cannot vary.
EDIT: The record also contains only 4 bytes, which define just an integer number.
LINE_SEPERATOR is the byte array { '\r', '\n' }.
My problem is that each time I add a new record (append), I need to overwrite the header (not append), because the record count should be increased by one. However, the program refuses to switch between the two modes and just sticks with non-append mode.
addRecord code:
public void addRecord(ISaveable record)
        throws IllegalArgumentException, FileNotFoundException, IOException
{
    if (record == null)
    {
        throw new IllegalArgumentException("The given record may not be null");
    }
    this.header.increaseRecordCount();
    writeHeader();

    FileOutputStream oStream = null;
    try
    {
        oStream = new FileOutputStream(DEFAULT_DIR, true);
        long offset = calculateOffset(this.header.getRecordCount() - 1, record.getSize());
        System.out.println("writing record # " + offset);
        oStream.getChannel().position(offset);
        PacketOutputStream pOut = new PacketOutputStream();
        record.Save(pOut);
        pOut.writeBytes(END_LINE);
        oStream.write(pOut.Seal());
    }
    catch (FileNotFoundException ex)
    {
        throw ex;
    }
    catch (IOException ex)
    {
        throw ex;
    }
    finally
    {
        if (oStream != null)
        {
            oStream.flush();
            oStream.close();
        }
    }
}
The writeHeader code:
private void writeHeader()
        throws IOException
{
    FileOutputStream oStream = null;
    try
    {
        oStream = new FileOutputStream(DEFAULT_DIR, false);
        oStream.getChannel().position(0);
        PacketOutputStream pOut = new PacketOutputStream();
        this.header.Save(pOut);
        pOut.writeBytes(END_LINE);
        oStream.write(pOut.Seal());
    }
    catch (IOException ex)
    {
        throw ex;
    }
    finally
    {
        if (oStream != null)
        {
            oStream.flush();
            oStream.close();
        }
    }
}
As you can see, I am using the booleans in the FileOutputStream constructor correctly: writeHeader gets false (because I want to overwrite the existing header) and the record gets true (because it should be added to the end of the file). Please ignore the fact that setting append to true automatically seeks to the end; the calculateOffset method is for future implementations.
I have done experiments where I only write the header every time. It works perfectly when set to not append, and as expected, when set to append it adds multiple headers.
The result I'm getting from my file right now, after trying to add 4 records, is only 2 lines. The header is perfect; there's nothing wrong with it. However, all 4 records are written on the next line, overwriting each other.
The resulting debug text:
writing record # 6
writing record # 12
writing record # 18
writing record # 24
reading record # 6
3457
All record positions are correct; however, the '3457' is the result of all 4 records overwriting each other on the same line.

The records overwrite each other because writeHeader() opens the file with append=false, which truncates it to zero length, so every previously written record is discarded before the next one is appended. If you want to write to multiple points in a file, you should really consider using RandomAccessFile, which was designed for this purpose.
Update: You should also use the same RandomAccessFile instance for all writes instead of creating one separately every time you update the header or the contents.
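A minimal sketch of that approach (class and method names here are hypothetical, and the question's CRLF line separators are omitted for brevity): open the file once in "rw" mode, rewrite the 4-byte header in place, and seek to the record's offset to write it.

import java.io.IOException;
import java.io.RandomAccessFile;

public class RecordFile implements AutoCloseable {

    private static final int HEADER_SIZE = 4; // 4-byte record count
    private static final int RECORD_SIZE = 4; // each record is a 4-byte int

    private final RandomAccessFile file;

    public RecordFile(String path) throws IOException {
        // "rw" opens for reading and writing without truncating existing contents
        this.file = new RandomAccessFile(path, "rw");
    }

    public void addRecord(int value) throws IOException {
        // read the current count (0 for a fresh file), bump it, rewrite it in place
        file.seek(0);
        int count = (file.length() >= HEADER_SIZE) ? file.readInt() : 0;
        file.seek(0);
        file.writeInt(count + 1);

        // write the new record at its computed offset
        file.seek(HEADER_SIZE + (long) count * RECORD_SIZE);
        file.writeInt(value);
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}

Because the same instance stays open in "rw" mode, updating the header can never truncate the records that were already written.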

Related

Move from string to array but after that select by the first character (Record Type 1, 2, 5)

I need your help; I am new to Java.
I need to read a flat file with 5 different types of records.
The way to differentiate each record is by its first characters; after that, my idea is to move the records into 5 different arrays to play with the data inside.
Example:
120220502Name Last Name1298843984 $1.50
120220501other client 8989899889 $23.89
2Toronto372 Yorkland drive 1 year Ontario
512345678Transfer Stove Pay
522457839Pending Microwave Interactive
Any help will be quite appreciated.
Break the problem into chunks. The first problem is reading the file:
try (BufferedReader reader = new BufferedReader(new FileReader("path/to/file"))) {
    parseData(reader); // method to do the work
} catch (IOException e) {
    e.printStackTrace();
}
Then you need to decide what kind of record it is:
public void parseData(BufferedReader input) throws IOException {
    for (String line = input.readLine(); line != null; line = input.readLine()) {
        if (line.startsWith("1")) {
            parseType1(line);
        } else if (line.startsWith("2")) {
            parseType2(line);
        } else if (line.startsWith("5")) {
            parseType5(line);
        } else {
            // must be an IOException (or an unchecked exception): a plain checked
            // Exception would not compile with "throws IOException"
            throw new IOException("Unknown record type: " + line.charAt(0));
        }
    }
}
Then you'll need to create the various parseTypeX methods to handle turning the text into usable chunks and then into classes.
public Type1Record parseType1(String data) {
    // create a Type1Record
    Type1Record record = new Type1Record();
    // split the string, something like this
    String[] fields = data.split("\\s+");
    // assign those chunks to the record
    record.setId(fields[0]);
    record.setFirstName(fields[1]);
    record.setLastName(fields[2]);
    record.setTotal(fields[3]); // if you want this to be a real number, you'll need to remove the $
    return record;              // don't forget to return it
}
Repeat the process with the other record types. You'll likely need to group records together, but that should be easy enough.
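A sketch of that grouping (assuming the parseTypeX methods return their respective record classes, as parseType1 does above):

// collect each record type into its own list while parsing
List<Type1Record> type1Records = new ArrayList<>();
List<Type2Record> type2Records = new ArrayList<>();
List<Type5Record> type5Records = new ArrayList<>();

public void parseData(BufferedReader input) throws IOException {
    for (String line = input.readLine(); line != null; line = input.readLine()) {
        if (line.startsWith("1")) {
            type1Records.add(parseType1(line));
        } else if (line.startsWith("2")) {
            type2Records.add(parseType2(line));
        } else if (line.startsWith("5")) {
            type5Records.add(parseType5(line));
        } else {
            throw new IOException("Unknown record type: " + line.charAt(0));
        }
    }
}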

OutputStreamWriter only writing one item into file

I have used the following code to write elements from an ArrayList into a file, to be retrieved later using StringTokenizer. It works perfectly for 3 other ArrayLists, but somehow for this particular one it throws an exception when reading with .nextToken(), and further troubleshooting with .countTokens() shows that there is only 1 token in the file. The delimiter for both write and read is the same - "," - as for the other ArrayLists as well.
I'm puzzled why it doesn't work the way it does with the other arrays, when I have not changed the code structure.
=================Writing to file==================
public static void copy_TimeZonestoFile(ArrayList<AL_TimeZone> timezones, Context context) {
    try {
        FileOutputStream fileOutputStream = context.openFileOutput("TimeZones.dat", Context.MODE_PRIVATE);
        OutputStreamWriter writerFile = new OutputStreamWriter(fileOutputStream);
        int TZsize = timezones.size();
        for (int i = 0; i < TZsize; i++) {
            writerFile.write(
                    timezones.get(i).getRegion() + "," +
                    timezones.get(i).getOffset() + "\n"
            );
        }
        writerFile.flush();
        writerFile.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
==========Reading from file (nested in thread/runnable combo)===========
public void run() {
    if (fileTimeZones.exists()) {
        System.out.println("Timezone file exists. Loading.. File size is : " + fileTimeZones.length());
        try {
            savedTimeZoneList.clear();
            BufferedReader reader = new BufferedReader(new InputStreamReader(openFileInput("TimeZones.dat")));
            String lineFromTZfile = reader.readLine();
            while (lineFromTZfile != null) {
                StringTokenizer token = new StringTokenizer(lineFromTZfile, ",");
                AL_TimeZone timeZone = new AL_TimeZone(token.nextToken(),
                        token.nextToken());
                savedTimeZoneList.add(timeZone);
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
===================Trace======================
I/System.out: Timezone file exists. Loading.. File size is : 12373
W/System.err: java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
at com.cryptotrac.trackerService$1R_loadTimeZones.run(trackerService.java:215)
W/System.err: at java.lang.Thread.run(Thread.java:764)
It appears that this line of your code is causing the java.util.NoSuchElementException to be thrown.
AL_TimeZone timeZone = new AL_TimeZone(token.nextToken(), token.nextToken());
That probably means that at least one of the lines in file TimeZones.dat does not contain precisely two strings separated by a single comma.
This can be easily checked by making sure that the line that you read from the file is a valid line before you try to parse it.
Using method split, of class java.lang.String, is preferable to using StringTokenizer. Indeed the javadoc of class StringTokenizer states the following.
StringTokenizer is a legacy class that is retained for compatibility reasons although its use is discouraged in new code. It is recommended that anyone seeking this functionality use the split method of String or the java.util.regex package instead.
Try the following.
String lineFromTZfile = reader.readLine();
while (lineFromTZfile != null) {
    String[] tokens = lineFromTZfile.split(",");
    if (tokens.length == 2) {
        // valid line, proceed to handle it
    } else {
        // optionally handle an invalid line - maybe write it to the app log
    }
    lineFromTZfile = reader.readLine(); // read the next line in the file
}
There are probably multiple things wrong, because I'd actually expect you to run into an infinite loop: you are only reading the first line of the file and then repeatedly parsing it.
You should check the following things:
Make sure that you are writing the file correctly. What does the written file exactly contain? Are there new lines at the end of each line?
Make sure that the data written (in this case, "region" and "offset") never contain a comma, otherwise parsing will break. I expect that there is a very good chance that "region" contains a comma.
When reading files you always need to assume that the file (format) is broken. For example, assume that readLine will return an empty line or something that contains more or less than one comma.
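On the delimiter point, a small sketch of one way to keep commas out of the data when writing (this assumes getRegion() may itself contain commas; replacing them with ';' is an arbitrary choice):

// inside the write loop: strip the delimiter from the data itself,
// so each line always splits into exactly two fields on read
String region = timezones.get(i).getRegion().replace(",", ";");
writerFile.write(region + "," + timezones.get(i).getOffset() + "\n");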

Reading file >4GB file in java

I have a mainframe data file which is greater than 4 GB. I need to read and process the data every 500 bytes. I have tried using FileChannel, however I am getting an error with the message Integer.MAX_VALUE exceeded.
public void getFileContent(String fileName) {
    RandomAccessFile aFile = null;
    FileChannel inChannel = null;
    try {
        aFile = new RandomAccessFile(Paths.get(fileName).toFile(), "r");
        inChannel = aFile.getChannel();
        ByteBuffer buffer = ByteBuffer.allocate(500 * 100000);
        while (inChannel.read(buffer) > 0) {
            buffer.flip();
            for (int i = 0; i < buffer.limit(); i++) {
                byte[] data = new byte[500];
                buffer.get(data);
                processData(new String(data));
                buffer.clear();
            }
        }
    } catch (Exception ex) {
        // TODO
    } finally {
        try {
            inChannel.close();
            aFile.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Can you help me out with a solution?
The worst problem of your code is the
catch (Exception ex) {
// TODO
}
part, which implies that you won’t notice any exceptions thrown by your code. Since there is nothing in the JRE printing an “Integer.MAX_VALUE exceeded” message, that problem must be connected to your processData method.
It might be worth noting that this method will be invoked way too often with repeated data.
Your loop
for (int i = 0; i < buffer.limit(); i++) {
implies that you iterate as many times as there are bytes within the buffer, up to 500 * 100000 times. You are extracting 500 bytes from the buffer in each iteration, processing a total of up to 500 * 500 * 100000 bytes after each read, but since you have a misplaced buffer.clear(); at the end of the loop body, you will never experience a BufferUnderflowException. Instead, you will invoke processData each of the up to 500 * 100000 times with the first 500 bytes of the buffer.
But the whole conversion from bytes to a String is unnecessarily verbose and contains unnecessary copy operations. Instead of implementing this yourself, you can and should just use a Reader.
Besides that, your code makes a strange detour. It starts with a Java 7 API, Paths.get, to convert it to a legacy File object, create a legacy RandomAccessFile to eventually acquire a FileChannel. If you have a Path and want a FileChannel, you should open it directly via FileChannel.open. And, of course, use a try(…) { … } statement to ensure proper closing.
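For illustration, opening the channel directly from the Path would look like this (a sketch; StandardOpenOption.READ is the default when reading):

// no File/RandomAccessFile detour: open the channel directly from the Path;
// try-with-resources guarantees the channel is closed, even on exceptions
try (FileChannel channel = FileChannel.open(Paths.get(fileName), StandardOpenOption.READ)) {
    // ... read from the channel here
}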
But, as said, if you want to process the contents as Strings, you surely want to use a Reader instead:
public void getFileContent(String fileName) {
    try (Reader reader = Files.newBufferedReader(Paths.get(fileName))) {
        CharBuffer buffer = CharBuffer.allocate(500 * 100000);
        while (reader.read(buffer) > 0) {
            buffer.flip();
            while (buffer.remaining() > 500) {
                processData(buffer.slice().limit(500).toString());
                buffer.position(buffer.position() + 500);
            }
            buffer.compact();
        }
        // there might be a remaining chunk of less than 500 characters
        if (buffer.position() > 0) {
            processData(buffer.flip().toString());
        }
    } catch (Exception ex) {
        // the *minimum* to do:
        ex.printStackTrace();
        // TODO real exception handling
    }
}
There is no problem with processing files >4GB; I just tested it with an 8GB file. Note that the code above uses the UTF-8 encoding. If you want to retain the behavior of your original code of using whatever happens to be your system’s default encoding, you may create the Reader using
Files.newBufferedReader(Paths.get(fileName), Charset.defaultCharset())
instead.

.txt Reader/Writer displaying wrong size in bytes of txt (and more)

I wanted to make a program in Java that checks if src.txt exists (if not, it throws a FileNotFoundException),
copies the contents of src.txt to des.txt,
and prints the sizes of the two files at opening and closing.
The output is:
src.txt is in current directory
Before opening files:Size of src.txt:43 Bytes Size of des.txt:0 Bytes
After closing files:Size of src.txt:43 Bytes Size of des.txt:0 Bytes
After src.txt's contents are written to des.txt, des.txt should be 43 bytes.
First, I would like to ask if I can omit the File declaration by writing
PrintWriter outStream = new PrintWriter(new FileWriter("des.txt"));
Secondly, I would like to ask how to adapt the following switch case (system-independent newline)
in order to add a newline after the one read.
Thirdly, I would like to ask about the importance of the try/catch block while closing a file.
Terribly sorry for this type of question, but in C there was (I think) no error handling; close() was certain to work.
I am sorry for these types of questions, but I am a beginner in Java.
import java.io.*;

public class Main
{
    public static void main(String[] args) throws FileNotFoundException
    {
        File src = new File("src.txt");
        if (src.exists())
            System.out.println("src.txt is in current directory");
        else
            throw new FileNotFoundException("src.txt is not in current directory");

        BufferedReader inStream = null;
        PrintWriter outStream = null;
        try {
            File des = new File("des.txt");
            inStream = new BufferedReader(new FileReader(src));
            outStream = new PrintWriter(new FileWriter(des));
            System.out.print("Before opening files:Size of src.txt:" + src.length() + " Bytes\t");
            System.out.println("Size of des.txt:" + des.length() + " Bytes");
            int c;
            while ((c = inStream.read()) != -1) {
                switch (c) {
                    case ' ':
                        outStream.write('#');
                        break;
                    case '\r':
                    case '\n':
                        outStream.write('\n');
                        outStream.write('\n');
                        break;
                    default:
                        outStream.write(c);
                }
            }
            System.out.print("After closing files:Size of src.txt:" + src.length() + " Bytes\t");
            System.out.println("Size of des.txt:" + des.length() + " Bytes");
        } catch (IOException io) {
            System.out.println("Read/Write Error:" + io.toString());
        } finally {
            try {
                if (inStream != null) {
                    inStream.close();
                }
                if (outStream != null) {
                    outStream.close();
                }
            } catch (IOException io) {
                System.out.println("Error while closing Files:" + io.toString());
            }
        }
    }
}
You have 3 questions inside your main question.
The problem of the file sizes not being correct is caused by buffering of the file contents: by default some data is buffered to avoid many short writes to the hard disk, which would lower performance. Check the size of your file after you have closed it, and the .length() call will report the correct size.
You can use
PrintWriter outStream = new PrintWriter(new FileWriter("des.txt"));
inside your code, since FileWriter accepts a String argument in its constructor.
It is recommended practice to close file handles/streams explicitly, since they are not automatically closed the moment you are done with them: the garbage collector does not run all the time, only when there is a need for it. This can cause problems such as undeletable files that are still in use by a stream you can no longer reach but which is still loaded in memory. It also interacts with the fact that some streams delay their writes using buffers; if they are never closed, you get exactly the kind of problem you describe as your first one.
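On the third question: since Java 7 you can let a try-with-resources statement do the closing for you. A sketch of the same copy loop, keeping the question's '#'-for-space and doubled-newline logic:

// try-with-resources closes (and therefore flushes) both streams automatically,
// even if an exception is thrown, so des.txt shows its real size afterwards
try (BufferedReader in = new BufferedReader(new FileReader("src.txt"));
     PrintWriter out = new PrintWriter(new FileWriter("des.txt"))) {
    int c;
    while ((c = in.read()) != -1) {
        switch (c) {
            case ' ':  out.write('#'); break;
            case '\r':                 // fall through: handle \r like \n
            case '\n': out.write('\n'); out.write('\n'); break;
            default:   out.write(c);
        }
    }
} catch (IOException io) {
    System.out.println("Read/Write Error:" + io);
}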

How / when to delete a file in java?

The problem is, the user clicks a button in a JSP, which exports the displayed data. What I am doing is creating a temp file, writing the contents into it [ resultSet >> xml >> csv ], and then writing the contents to the ServletResponse. After closing the response output stream, I try to delete the file, but it returns false every time.
Code:
public static void writeFileContentToResponse(HttpServletResponse response, String fileName) throws IOException {
    ServletOutputStream responseoutputStream = response.getOutputStream();
    File file = new File(fileName);
    if (file.exists()) {
        file.deleteOnExit();
        DataInputStream dis = new DataInputStream(new FileInputStream(file));
        response.setContentType("text/csv");
        int size = (int) file.length();
        response.setContentLength(size);
        response.setHeader("Content-Disposition",
                "attachment; filename=\"" + file.getName() + "\"");
        response.setHeader("Pragma", "public");
        response.setHeader("Cache-control", "must-revalidate");
        if (size > Integer.MAX_VALUE) {
        }
        byte[] bytes = new byte[size];
        dis.read(bytes);
        FileCopyUtils.copy(bytes, responseoutputStream);
    }
    responseoutputStream.flush();
    responseoutputStream.close();
    file.delete();
}
I have used file.deleteOnExit() and file.delete(), but neither of them is working.
file.deleteOnExit() isn't going to produce the result you want here - its purpose is to delete the file when the JVM exits - if this is called from a servlet, that means the file is deleted when the server shuts down.
As for why file.delete() isn't working - all I see in this code is reading from the file and writing to the servlet's output stream - is it possible that when you wrote the data to the file, you left the file's input stream open? Files won't be deleted while they're currently in use.
Also, even though your method throws IOException you still need to clean up things if there's an exception while accessing the file - put the file operations in a try block, and put the stream.close() into a finally block.
Don't create that file.
Write your data directly from your resultset to your CSV responseoutputStream.
That saves time, memory, disk space and headache.
If you really need the file, try the File.createTempFile() method.
Combined with deleteOnExit(), such files are removed when your VM stops normally if they haven't been deleted before.
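For example (a sketch; the "export" prefix and ".csv" suffix are arbitrary):

// creates a uniquely named file in the default temp-file directory
File tempFile = File.createTempFile("export", ".csv");
tempFile.deleteOnExit(); // fallback cleanup on normal JVM exit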
I'm assuming you have some sort of concurrency issue going on here. Consider making this method non-static, and use a unique name for your temp file (like appending the current time, or using a GUID for the filename). Chances are that you're opening the file, then someone else opens it, so the first delete fails.
As I see it, you are not closing the DataInputStream dis - this results in the false status when you then want to delete the file. Also, you should handle the streams in a try-catch-finally block and close them within finally. The code is a bit rough, but it is safe:
DataInputStream dis = null;
try
{
    dis = new DataInputStream(new FileInputStream(file));
    ... // your other code
}
catch (FileNotFoundException P_ex)
{
    // catch only exceptions you want, and react to them
}
finally
{
    if (dis != null)
    {
        try
        {
            dis.close();
        }
        catch (IOException P_ex)
        {
            // handle the exception, again reacting only to exceptions that must be reacted on
        }
    }
}
How are you creating the file? You probably need to use createTempFile.
You should be able to delete a temporary file just fine (no need for deleteOnExit). Are you sure the file isn't in use when you are trying to delete it? You should have one file per user request (that is another reason to avoid temp files and store everything in memory).
You can try piped input and output streams. Such a pipe needs two threads: one to feed the pipe (the exporter) and the other (the servlet) to consume data from the pipe and write it to the response output stream.
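A sketch of that pipe setup (exportResults is a placeholder for the actual export code, and responseoutputStream is the servlet stream from the question):

PipedInputStream in = new PipedInputStream();
PipedOutputStream out = new PipedOutputStream(in); // connect the two ends

// exporter thread: feeds CSV data into the pipe
new Thread(() -> {
    try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
        exportResults(writer); // placeholder: writes the ResultSet as CSV
    } catch (IOException e) {
        e.printStackTrace();
    }
}).start();

// servlet thread: drains the pipe into the response output stream
byte[] buffer = new byte[8192];
for (int n; (n = in.read(buffer)) != -1; ) {
    responseoutputStream.write(buffer, 0, n);
}

This way no temp file is needed and only one buffer's worth of data is in flight at a time.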
You really don't want to create a temporary file for a request. Keep the resulting CSV in memory if at all possible.
You may need to tie the writing of the file in directly with the output. So parse a row of the result set, write it out to the response stream, parse the next row, and so on. That way you only keep one row in memory at a time. The problem there is that the response could time out.
If you want a shortcut method, take a look at Display tag library. It makes it very easy to show a bunch of results in a table and then add pre-built export options to said table. CSV is one of those options.
You don't need a temporary file. The byte buffer which you're creating there based on the file size may also cause OutOfMemoryError. It's all plain inefficient.
Just write the data of the ResultSet immediately to the HTTP response while iterating over the rows. Basically: writer.write(resultSet.getString("columnname")). This way you don't need to write it to a temporary file or to gobble everything in Java's memory.
Further, most JDBC drivers will by default cache everything in Java's memory before handing anything to ResultSet#next(). This is also inefficient. You'd like it to hand over the data row by row, immediately, by setting Statement#setFetchSize(). How to do this properly depends on the JDBC driver used. In the case of, for example, MySQL, you can read up on it in its JDBC driver documentation.
Here's a kickoff example, assuming that you're using MySQL:
protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
    response.setContentType("text/csv");
    response.setCharacterEncoding("UTF-8");
    Connection connection = null;
    Statement statement = null;
    ResultSet resultSet = null;
    PrintWriter writer = response.getWriter();

    try {
        connection = database.getConnection();
        statement = connection.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        statement.setFetchSize(Integer.MIN_VALUE);
        resultSet = statement.executeQuery("SELECT col1, col2, col3 FROM tbl");
        while (resultSet.next()) {
            writer.append(resultSet.getString("col1")).append(',');
            writer.append(resultSet.getString("col2")).append(',');
            writer.append(resultSet.getString("col3")).println();
            // Note: don't forget to escape quotes/commas as per RFC 4180.
        }
    } catch (SQLException e) {
        throw new ServletException("Retrieving CSV rows from DB failed", e);
    } finally {
        if (resultSet != null) try { resultSet.close(); } catch (SQLException logOrIgnore) {}
        if (statement != null) try { statement.close(); } catch (SQLException logOrIgnore) {}
        if (connection != null) try { connection.close(); } catch (SQLException logOrIgnore) {}
    }
}
That's it. This way, effectively only one database row is kept in memory at a time.
