How are these methods allowing / causing data to be lost on disk? - java

I have a program that writes its settings and data out to disk every so often (15 seconds or so).
If the program is running and the computer is shut off abruptly -- for example, with the power being cut at the wall -- somehow all of my data files on disk are changed to empty files.
Here is my code, which I thought I designed to protect against this failure, but based on testing the failure still exists:
SaveAllData -- Called every so often, and also when JavaFX.Application.stop() is called.
public void saveAllData () {
    createNecessaryFolders();
    saveAlbumsAndTracks();
    saveSources();
    saveCurrentList();
    saveQueue();
    saveHistory();
    saveLibraryPlaylists();
    saveSettings();
    saveHotkeys();
}
CreateNecessaryFolders
private void createNecessaryFolders () {
    if ( !playlistsDirectory.exists() ) {
        boolean playlistDir = playlistsDirectory.mkdirs();
    }
}
Save Functions -- they all look just like this
public void saveCurrentList () {
    File tempCurrentFile = new File ( currentFile.toString() + ".temp" );
    try ( ObjectOutputStream currentListOut = new ObjectOutputStream( new FileOutputStream( tempCurrentFile ) ) ) {
        currentListOut.writeObject( player.getCurrentList().getState() );
        currentListOut.flush();
        currentListOut.close();
        Files.move( tempCurrentFile.toPath(), currentFile.toPath(), StandardCopyOption.REPLACE_EXISTING );
    } catch ( Exception e ) {
        LOGGER.warning( e.getClass().getCanonicalName() + ": Unable to save current list to disk, continuing." );
    }
}
GitHub repository, at the commit where this problem exists. See Persister.java.
As I said, when the power is cut abruptly, all settings files saved by this method are blanked. This makes no sense to me, particularly since the saves are called in sequence and I make sure each file is written to disk and flushed before calling move().
Any idea how this could be happening? I thought that by calling flush, close, and then move, I would ensure the data is on disk before overwriting the old data. Somehow this isn't the case, and I am clueless. Any suggestions?
Note: these files are only written to by these functions, and only read from by the corresponding load() functions. There is no other access to the files anywhere else in my program.
Note 2: I am experiencing this on Ubuntu Linux 16.10. I have not tested it on other platforms yet.

Adding StandardCopyOption.ATOMIC_MOVE to the Files.move() call solves the problem:
public void saveCurrentList () {
    File tempCurrentFile = new File ( currentFile.toString() + ".temp" );
    try ( ObjectOutputStream currentListOut = new ObjectOutputStream( new FileOutputStream( tempCurrentFile ) ) ) {
        currentListOut.writeObject( player.getCurrentList().getState() );
        currentListOut.flush();
        currentListOut.close();
        Files.move( tempCurrentFile.toPath(), currentFile.toPath(), StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE );
    } catch ( Exception e ) {
        LOGGER.warning( e.getClass().getCanonicalName() + ": Unable to save current list to disk, continuing." );
    }
}
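Independent of the move flags, the usual belt-and-braces fix for the blank-file-after-power-cut symptom is to force the temp file's bytes to the physical device before the rename, so a power cut can never promote an empty temp file over the good copy. A minimal sketch of that variant, reusing player, currentFile, and LOGGER from the question; the getChannel().force( true ) call is the only real addition:

public void saveCurrentListDurably () {
    File tempCurrentFile = new File ( currentFile.toString() + ".temp" );
    try ( FileOutputStream fileOut = new FileOutputStream( tempCurrentFile );
          ObjectOutputStream currentListOut = new ObjectOutputStream( fileOut ) ) {
        currentListOut.writeObject( player.getCurrentList().getState() );
        currentListOut.flush();
        // Flush the OS cache to the physical device before the rename can happen.
        fileOut.getChannel().force( true );
    } catch ( Exception e ) {
        LOGGER.warning( e.getClass().getCanonicalName() + ": Unable to save current list to disk, continuing." );
        return;
    }
    try {
        Files.move( tempCurrentFile.toPath(), currentFile.toPath(),
                StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE );
    } catch ( IOException e ) {
        LOGGER.warning( "Unable to replace current list on disk, continuing." );
    }
}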

Related

How to read/copy a (partially locked) log file, or at least the unlocked parts?

I am working on a utility that zips up a number of files (for diagnostic purposes). At its core, it uses the following function:
private void write(ZipOutputStream zipStream, String entryPath, ByteSource content) throws IOException {
    try (InputStream contentStream = content.openStream()) {
        zipStream.putNextEntry(new ZipEntry(entryPath));
        ByteStreams.copy(contentStream, zipStream);
        zipStream.closeEntry();
    }
}
But one of the files I want to read is a log file that another application is running and keeps locked. Because that file is locked, I get an IOException:
java.io.IOException: The process cannot access the file because another process has locked a portion of the file
    at java.base/java.io.FileInputStream.readBytes(Native Method)
    at java.base/java.io.FileInputStream.read(FileInputStream.java:257)
    at com.google.common.io.ByteStreams.copy(ByteStreams.java:112)
If I am willing to accept that I might get some garbage because of conflicts between my reads and the other application's writes, what is the best/easiest way to work around this? Is there a file reader that ignores locks, or perhaps one that reads only the unlocked sections?
Update -- To clarify, I am looking to read a log file, or as much of it as possible. So I could just start reading the file, wait until I hit a block I can't read, catch the error, mark the end of the output, and move on. Notepad++ and other programs can read files that are partially locked. I'm just looking for a way to do that without re-inventing the ByteStreams.copy function to create a "copy as much as I can" function.
I should have perhaps asked "How to read all the unlocked parts of a log file" and I will update the title.
One possible answer (which I don't like) is to create a method almost identical to ByteStreams.copy(), which I call "copyUntilLock", that catches any IOException and checks whether the exception is because another process has locked a portion of the file.
If that is the case, then simply stop writing and return the number of bytes copied so far. If it's some other exception, go ahead and throw it. (You could also write a note to the stream like "READING FAILED DUE TO LOCK".)
Still looking for a better answer. Code included below.
private static long copyUntilLock (InputStream from, OutputStream to) throws IOException {
    checkNotNull(from);
    checkNotNull(to);
    byte[] buf = createBuffer();
    long total = 0;
    try {
        while (true) {
            int r = from.read(buf);
            if (r == -1) {
                break;
            }
            to.write(buf, 0, r);
            total += r;
        }
        return total;
    } catch (IOException iox) {
        if (iox.getMessage() != null && iox.getMessage().contains("another process has locked a portion of the file")) {
            return total;
        } else {
            throw iox;
        }
    }
}
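A sketch of how the write() helper from the question might use it (checkNotNull and createBuffer mirror Guava's internal helpers; outside Guava, a plain new byte[8192] buffer works just as well):

private void write(ZipOutputStream zipStream, String entryPath, ByteSource content) throws IOException {
    try (InputStream contentStream = content.openStream()) {
        zipStream.putNextEntry(new ZipEntry(entryPath));
        // Copies until EOF or until a locked region stops the read; either
        // way the entry holds everything that was readable at the time.
        long copied = copyUntilLock(contentStream, zipStream);
        zipStream.closeEntry();
    }
}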

Fast non-blocking read/writes using MappedByteBuffer?

I am processing messages from a vendor as a stream of data and want to store msgSeqNum in a local file. Reason:
They send msgSeqNum to uniquely identify each message. And they provide a 'sync-and-stream' functionality to stream messages on reconnecting from a given sequence number. Say if the msgSeqNum starts from 1 and my connection went down at msgSeqNum 50 and missed the next 100 messages (vendor server's current msgSeqNum is now 150), then when I reconnect to the vendor, I need to call 'sync-and-stream' with msgSeqNum=50 to get the missed 100 messages.
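In illustrative pseudocode, the reconnect flow looks like this (the vendor object and its message API are hypothetical stand-ins, not the vendor's actual interface):

// 'vendor', 'msg.seqNum()', 'process', and the read/write helpers are hypothetical.
AtomicLong lastProcessed = new AtomicLong( readLocalSeqNum() );  // e.g. 50
vendor.syncAndStream( lastProcessed.get(), msg -> {
    if ( msg.seqNum() <= lastProcessed.get() ) return;  // ignore duplicates
    process( msg );
    lastProcessed.set( msg.seqNum() );
    writeLocalSeqNum( msg.seqNum() );  // persist after each message
} );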
So I want to understand how I can persist the msgSeqNum locally for fast access. I assume
1) Since the reads/writes happen frequently, i.e. while processing every message (read to ignore dups, write to update msgSeqNum after processing a msg), I think it's best to use Java NIO's 'MappedByteBuffer'?
2) Could someone confirm if the below code is best for this, where I expose the mapped byte buffer object to be reused for reads and writes and leave the FileChannel open for the lifetime of the process? Sample JUnit code below:
I know this could be achieved with general Java file operations to read and write into a file, but I need something fast, equivalent to non-IO, as I am using a single-writer pattern and want to process these messages quickly in a non-blocking manner.
private FileChannel fileChannel = null;
private MappedByteBuffer mappedByteBuffer = null;
private Charset utf8Charset = null;
private CharBuffer charBuffer = null;

@Before
public void setup() {
    try {
        charBuffer = CharBuffer.allocate( 24 ); // Long.MAX_VALUE/MIN_VALUE fit in 20 chars anyway
        System.out.println( "charBuffer length: " + charBuffer.length() );
        Path pathToWrite = getFileURIFromResources();
        // Assign the field (not a shadowing local), so destroy() can close it.
        fileChannel = (FileChannel) Files
                .newByteChannel( pathToWrite, EnumSet.of(
                        StandardOpenOption.READ,
                        StandardOpenOption.WRITE,
                        StandardOpenOption.TRUNCATE_EXISTING ));
        mappedByteBuffer = fileChannel
                .map( FileChannel.MapMode.READ_WRITE, 0, charBuffer.length() );
        utf8Charset = Charset.forName( "utf-8" );
    } catch ( Exception e ) {
        // handle it
    }
}

@After
public void destroy() {
    try {
        fileChannel.close();
    } catch ( IOException e ) {
        // handle it
    }
}

@Test
public void testWriteAndReadUsingSharedMappedByteBuffer() {
    if ( mappedByteBuffer != null ) {
        mappedByteBuffer.put( utf8Charset.encode( charBuffer.wrap( "101" ) )); // TODO reuse one buffer instead of creating a new one
    } else {
        System.out.println( "mappedByteBuffer null" );
        fail();
    }
    mappedByteBuffer.flip();
    assertEquals( "101", utf8Charset.decode( mappedByteBuffer ).toString() );
}
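A design note on the test above: since the persisted value is a single sequence number, it may be simpler and cheaper to skip the charset round-trip and store the long directly at a fixed offset. A minimal sketch, assuming the same READ_WRITE mapping as above (which at 24 bytes is comfortably larger than Long.BYTES):

// Persist a single sequence number at a fixed offset of the mapping.
long msgSeqNum = 150L;                      // value to persist (example)
mappedByteBuffer.putLong( 0, msgSeqNum );   // absolute put: no flip() needed
mappedByteBuffer.force();                   // optionally force the dirty page to disk

// On restart, read the last persisted value back:
long lastSeen = mappedByteBuffer.getLong( 0 );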

How to guarantee atomic move or exception of a file in Java?

In one of my projects I have concurrent write access to one single file within one JRE, and I want to handle that by first writing to a temporary file and afterwards moving that temp file to the target using an atomic move. I don't care about the order of the write access or such; all I need to guarantee is that at any given time the single file is usable. I'm already aware of Files.move and such. My problem is that I had a look at at least one implementation of that method, and it raised some doubts about whether implementations really guarantee atomic moves. Please look at the following code:
Files.move on GrepCode for OpenJDK
FileSystemProvider provider = provider(source);
if (provider(target) == provider) {
    // same provider
    provider.move(source, target, options);
} else {
    // different providers
    CopyMoveHelper.moveToForeignTarget(source, target, options);
}
The problem is that the option ATOMIC_MOVE does not appear to be considered in all cases; the only thing checked first is whether the source and target paths belong to the same provider. That's not what I want and not how I understand the documentation:
If the move cannot be performed as an atomic file system operation then
AtomicMoveNotSupportedException is thrown. This can arise, for example, when the target
location is on a different FileStore and would require that the file be copied, or target
location is associated with a different provider to this object.
The above code clearly violates that documentation, because it falls back to a copy-delete strategy without recognizing ATOMIC_MOVE at all. An exception would be perfectly OK in my case, because then whoever hosts our service could change their setup to use only one file system which supports atomic moves, as that's what we expect in the system requirements anyway. What I don't want to deal with is things silently failing just because an implementation uses a copy-delete strategy, which may result in data corruption in the target file. So, from my understanding, it is simply not safe to rely on Files.move for atomic operations, because it doesn't always fail when they are not supported; implementations may fall back to a copy-delete strategy.
Is such behaviour a bug in the implementation and needs to get filed or does the documentation allow such behaviour and I'm understanding it wrong? Does it make any difference at all if I now already know that such maybe broken implementations are used out there? I would need to synchronize the write access on my own in that case...
You are looking at the wrong place. When the file system providers are not the same, the operation is delegated to moveToForeignTarget, as you have seen in the code snippet you’ve posted. The method moveToForeignTarget, however, uses the method convertMoveToCopyOptions (note the speaking name…) to get the necessary copy options for the translated operation. And convertMoveToCopyOptions will throw an AtomicMoveNotSupportedException if it encounters the ATOMIC_MOVE option, as there is no way to convert that move option into a valid copy option.
So there’s no reason to worry, and in general it’s recommended to avoid hasty conclusions drawn from fewer than ten lines of code (especially when not having tried a single test)…
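That is also easy to verify empirically; a quick sketch that attempts an atomic move across two file stores (the target path is a placeholder; point it at a different mount or drive on your machine):

import java.nio.file.*;

public class AtomicMoveCheck {
    public static void main(String[] args) throws Exception {
        Path source = Files.createTempFile("atomic-check", ".tmp");
        Path target = Paths.get("/mnt/otherfs/atomic-check.tmp"); // placeholder: a different file store
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
            System.out.println("moved atomically");
        } catch (AtomicMoveNotSupportedException e) {
            // Thrown instead of silently falling back to copy+delete.
            System.out.println("atomic move not supported: " + e);
        }
    }
}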
The standard Java library does not provide a way to perform an atomic move in all cases.
Files.move() does not guarantee atomic move. You can pass ATOMIC_MOVE as an option, but if the move cannot be performed as an atomic operation, AtomicMoveNotSupportedException is thrown (this is the case when target location is on a different FileStore and would require that the file be copied).
You have to implement it yourself if you really need that. One solution can be to catch AtomicMoveNotSupportedException and then do this: try to move the file without the ATOMIC_MOVE option, but catch exceptions and remove the target if an error occurred during the copy.
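A sketch of that fallback, under the assumption that a non-atomic replace is acceptable whenever the file store can't do better:

static void moveAtomicallyIfPossible(Path source, Path target) throws IOException {
    try {
        Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
    } catch (AtomicMoveNotSupportedException e) {
        try {
            // Non-atomic fallback: may be implemented as copy+delete.
            Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException copyFailure) {
            // Remove a possibly half-written target so readers never see a torn file.
            Files.deleteIfExists(target);
            throw copyFailure;
        }
    }
}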
I came across a similar problem to be solved:
One process frequently updates a file via 'save to tempfile -> move tempfile to final file' using Files.move(tmp, out, ATOMIC_MOVE, REPLACE_EXISTING);
One or more other processes read that file - completely, all at once - and close it immediately. The file is rather small - less than 50k.
And it just does not work reliably, at least on Windows. Under heavy load the reader occasionally gets NoSuchFileException - which means Files.move is not that ATOMIC even on the same file system :(
My env: Windows 10 + java 11.0.12
Here is the code to play with:
import org.junit.Test;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.ByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;

import static java.nio.charset.StandardCharsets.UTF_8;
import static java.nio.file.StandardCopyOption.ATOMIC_MOVE;
import static java.nio.file.StandardCopyOption.REPLACE_EXISTING;
import static java.util.Locale.US;

public class SomeTest {
    static int nWrite = 0;
    static int nRead = 0;
    static int cErrors = 0;
    static boolean writeFinished;
    static boolean useFileChannels = true;
    static String filePath = "c:/temp/test.out";

    @Test
    public void testParallelFileAccess() throws Exception {
        new Writer().start();
        new Reader().start();
        while( !writeFinished ) {
            Thread.sleep(10);
        }
        System.out.println("cErrors: " + cErrors);
    }

    static class Writer extends Thread {
        public Writer() {
            setDaemon(true);
        }

        @Override
        public void run() {
            File outFile = new File(filePath);
            File outFileTmp = new File(filePath + "tmp");
            byte[] bytes = "test".getBytes(UTF_8);
            for( nWrite = 1; nWrite <= 100000; nWrite++ ) {
                if( (nWrite % 1000) == 0 )
                    System.out.println("nWrite: " + nWrite + ", cReads: " + nRead);
                try( FileOutputStream fos = new FileOutputStream(outFileTmp) ) {
                    fos.write(bytes);
                }
                catch( Exception e ) {
                    logException("write", e);
                }
                int maxAttempts = 10;
                for( int i = 0; i <= maxAttempts; i++ ) {
                    try {
                        Files.move(outFileTmp.toPath(), outFile.toPath(), ATOMIC_MOVE, REPLACE_EXISTING);
                        break;
                    }
                    catch( IOException e ) {
                        try {
                            Thread.sleep(1);
                        }
                        catch( InterruptedException ex ) {
                            break;
                        }
                        if( i == maxAttempts )
                            logException("move", e);
                    }
                }
            }
            System.out.println("Write finished ...");
            writeFinished = true;
        }
    }

    static class Reader extends Thread {
        public Reader() {
            setDaemon(true);
        }

        @Override
        public void run() {
            File inFile = new File(filePath);
            Path inPath = inFile.toPath();
            byte[] bytes = new byte[100];
            ByteBuffer buffer = ByteBuffer.allocateDirect(100);
            try { Thread.sleep(100); } catch( InterruptedException e ) { }
            for( nRead = 0; !writeFinished; nRead++ ) {
                if( useFileChannels ) {
                    try ( ByteChannel channel = Files.newByteChannel(inPath, Set.of()) ) {
                        channel.read(buffer);
                    }
                    catch( Exception e ) {
                        logException("read", e);
                    }
                }
                else {
                    try( InputStream fis = Files.newInputStream(inFile.toPath()) ) {
                        fis.read(bytes);
                    }
                    catch( Exception e ) {
                        logException("read", e);
                    }
                }
            }
        }
    }

    private static void logException(String action, Exception e) {
        cErrors++;
        System.err.printf(US, "%s: %s - wr=%s, rd=%s:, %s%n", cErrors, action, nWrite, nRead, e);
    }
}
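Given that observation, a practical mitigation is to retry on the reader side too; a minimal sketch (the retry count and back-off are arbitrary; NoSuchFileException is java.nio.file.NoSuchFileException):

static byte[] readWithRetry(Path path, int maxAttempts) throws IOException, InterruptedException {
    for( int attempt = 1; ; attempt++ ) {
        try {
            return Files.readAllBytes(path);
        }
        catch( NoSuchFileException e ) {
            // On Windows the target can be briefly missing while it is being
            // replaced; back off and try again a few times before giving up.
            if( attempt == maxAttempts )
                throw e;
            Thread.sleep(1);
        }
    }
}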

Serialization Overwriting Data

I have a program I'm making for a small business which implements Serializable on a LinkedList to save data. This all works fine, until two staff members try to add more data to the list and one ends up overwriting the other.
JButton btnSaveClientFile = new JButton("Save Client File");
btnSaveClientFile.addMouseListener(new MouseAdapter() {
    @Override
    public void mouseClicked(MouseEvent arg0) {
        // add new items to list
        jobList.add(data);
        // ...
        Controller.saveData();
    }
});
btnSaveClientFile.setBounds(10, 229, 148, 23);
frame.getContentPane().add(btnSaveClientFile);
This method results in one overwriting the other, so I tried doing it like this
JButton btnSaveClientFile = new JButton("Save Client File");
btnSaveClientFile.addMouseListener(new MouseAdapter() {
    @Override
    public void mouseClicked(MouseEvent arg0) {
        Controller.retrieveData();
        // add new items to list
        jobList.add(data);
        // ...
        Controller.saveData();
    }
});
btnSaveClientFile.setBounds(10, 229, 148, 23);
frame.getContentPane().add(btnSaveClientFile);
And when I use this one, I get no data added to the list at all. Here are my Serialization methods. This one is used to save my data.
// methods to serialize data
public static void saveData() {
    System.out.println("Saving...");
    FileOutputStream fos = null;
    ObjectOutputStream oos = null;
    try {
        fos = new FileOutputStream("Data.bin");
        oos = new ObjectOutputStream(fos);
        oos.writeObject(myOLL);
        oos.close();
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}
And this one is used to collect my data
public static void retrieveData() {
    // Get data from disk
    System.out.println("Loading...");
    FileInputStream fis = null;
    ObjectInputStream ois = null;
    try {
        fis = new FileInputStream("Data.bin");
        ois = new ObjectInputStream(fis);
        myOLL = (OrderedLinkedList) ois.readObject();
        ois.close();
    } catch (Exception ex) {
        System.err.println("File cannot be found");
        ex.printStackTrace();
    }
}
How do I make it so I can save data to my file from two different computers at a similar time, without one overwriting the other?
This is a demo (not meant to be used in this crude way) of how to acquire a lock on the file /tmp/data.
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public static void main( String[] args ) throws Exception {
    RandomAccessFile raf = new RandomAccessFile( "/tmp/data", "rw" );
    FileChannel chan = raf.getChannel();
    FileLock lock = null;
    while( (lock = chan.tryLock() ) == null ){
        System.out.println( "waiting for file" );
        Thread.sleep( 1000 );
    }
    System.out.println( "using file" );
    Thread.sleep( 3000 );
    System.out.println( "done" );
    lock.release();
    raf.close();
}
Clearly, reading a sequential file, mulling over it for some time and then rewriting or not is prohibitive if you require a high level of concurrency. That's why such applications typically use database systems, the client-server paradigm. A free-for-all on the file system isn't tolerable except in rare circumstances. Your organization may be able to assign updates of the data to one person at a time, which would simplify matters.
add more data to the list and one ends up overwriting the other.
This is how files work by default; in fact, ObjectOutputStream doesn't support an "append" mode. Once you have closed the stream, you can't alter it.
How do I make it so I can save data to my file from two different computers at a similar time, without one overwriting the other?
You have two problems here
how to write to a file twice without losing information.
how to co-ordinate writes between processes without one impacting the other.
For the first part, you need to read the contents of the list first, add the entries you want to add, and write out the contents again. Or you can change the file format to one which supports appending.
For the second part, you need locking of some kind. A simple way to do this is to create a lock file: you create a second file atomically, e.g. file.lock, and the process which succeeds in creating the file holds the lock; that process alters the data file and then deletes the lock file when finished. Some care needs to be taken to ensure you always remove the lock (a sketch follows below).
Another approach is to use file locks. You have to take care not to delete the file in the process; however, this has the benefit that the OS will clean up the lock if your process dies.
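A sketch of the lock-file variant, reusing the question's Controller methods (the lock path is an example and the sleep interval is arbitrary; the java.nio.file classes are Path, Paths, Files, and FileAlreadyExistsException):

static void saveWithLockFile() throws IOException, InterruptedException {
    Path lockFile = Paths.get("Data.bin.lock");  // example lock path
    // Files.createFile is atomic: exactly one process succeeds, the rest retry.
    while( true ) {
        try {
            Files.createFile(lockFile);
            break;  // we hold the lock now
        } catch (FileAlreadyExistsException e) {
            Thread.sleep(100);  // someone else holds it; wait and retry
        }
    }
    try {
        Controller.retrieveData();  // read the current list first
        // ... add the new entries to the list here ...
        Controller.saveData();      // then write everything back
    } finally {
        Files.deleteIfExists(lockFile);  // always release the lock
    }
}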

Testing writing into a csv file using JUnit [duplicate]

This question already has an answer here:
JUnit testing for IO
(1 answer)
Closed 8 years ago.
I am new to JUnit testing and I want to write unit tests. The method does not return anything; it takes a list of signals and writes them to a CSV file. I am not sure how to test methods with void return types.
Can anyone help me?
public void createCSV ( final ArrayList< Signal > messages, File file )
{
    try
    {
        // FileWriter with append = false, i.e. overwrite the file
        csvOutput = new MyWriter( new FileWriter( file, false ), ',' );
        // Create Header for CSV
        csvOutput.writeRecord( "Message Source" );
        csvOutput.writeRecord( "Message Name" );
        csvOutput.writeRecord( "Component" );
        csvOutput.writeRecord( "Occurance" );
        csvOutput.writeRecord( "Message Payload with Header" );
        csvOutput.writeRecord( "Bandwidth(with Header %)" );
        csvOutput.writeRecord( "Message Payload" );
        csvOutput.writeRecord( "Bandwidth(%)" );
        csvOutput.endOfRecord();
        for ( Signal signal : messages )
        {
            csvOutput.writeRecord( signal.getSource() );
            csvOutput.writeRecord( signal.getName() );
            csvOutput.writeRecord( signal.getComponent() );
            csvOutput.writeRecord( Integer.toString( signal.getOccurance() ) );
            csvOutput.writeRecord( Integer.toString( signal.getSizewithHeader() ) );
            csvOutput.writeRecord( Float.toString( signal.getBandwidthWithHeader() ) );
            csvOutput.writeRecord( Integer.toString( signal.getSize() ) );
            csvOutput.writeRecord( Float.toString( signal.getBandwidth() ) );
            csvOutput.endOfRecord();
        }
    }
    catch ( IOException e )
    {
        logger.error( "Error in writing CSV file for messages", e );
    }
    finally
    {
        try
        {
            if ( csvOutput != null )
            {
                csvOutput.flush();
                csvOutput.close();
            }
            messages.clear();
        }
        catch ( IOException ex )
        {
            ex.printStackTrace();
        }
    }
}
One method takes a map and sorts it.
Pass in a map with known, unsorted values. Verify the map has been sorted after the method was called.
The other takes the sorted map and writes it to a CSV file. I am not sure how to test methods with void return types.
Two options:
Pass in a temporary file path, e.g. see JUnit temporary folders, then read that file after the method has been called and test it for correctness.
Adjust your method to accept an OutputStream instead of a File. Then you can pass a ByteArrayOutputStream and verify its contents by calling toByteArray() and inspecting the bytes.
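A sketch of the first option with JUnit 4's TemporaryFolder rule (createCSV and Signal come from the question; CsvWriter is a hypothetical name for the class containing createCSV, and the empty list keeps the sketch independent of Signal's constructor):

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;
import java.io.File;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

public class CsvWriterTest {
    @Rule
    public TemporaryFolder tempFolder = new TemporaryFolder();

    @Test
    public void createCSV_writesHeaderRow() throws Exception {
        File csv = tempFolder.newFile( "signals.csv" );

        // With real Signal instances you would also assert the data rows.
        new CsvWriter().createCSV( new ArrayList< Signal >(), csv );

        List< String > lines = Files.readAllLines( csv.toPath() );
        assertFalse( "expected at least the header row", lines.isEmpty() );
        assertTrue( lines.get( 0 ).contains( "Message Source" ) );
    }
}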
Unit test for File
If you don't want to change the src code:
In the unit test I would pass a file in a temp path, call that createCSV method, and then open the file. Depending on how much effort you want to invest, check:
1) that the file exists (use a generated filename that contains the current time)
2) that the length is more than 0 bytes
3) the first and last line for expected content
But in most cases, an OutputStream is more flexible than a File parameter.
In production code you pass a FileOutputStream, in your unit test a ByteArrayOutputStream, which you can parse using a ByteArrayInputStream.
This is the cleaner solution, since it does not create files which would have to be cleaned up, and it runs faster.
Unit test for sorting
Just create an unsorted map, call your sort, and check that the result is sorted:
Iterate and check that each element is, e.g., greater than the previous one (or smaller, depending on the sort order).
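For example (a sketch; sortByKey is a hypothetical stand-in for the method under test, assumed to sort by key ascending):

@Test
public void sortProducesAscendingKeys() {
    Map< String, Integer > input = new LinkedHashMap<>();
    input.put( "b", 2 );
    input.put( "a", 1 );
    input.put( "c", 3 );

    Map< String, Integer > sorted = sortByKey( input );  // hypothetical method under test

    String previous = null;
    for ( String key : sorted.keySet() ) {
        if ( previous != null ) {
            assertTrue( previous + " should come before " + key, previous.compareTo( key ) < 0 );
        }
        previous = key;
    }
}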
