Faster way of copying data in Java? - java

I have been given the task of copying data from a server. I am using a BufferedInputStream and an output stream to copy the data, and I am doing it byte by byte. It runs, but it takes ages to copy the data, as some of the files are in the hundreds of MBs, so this is clearly not going to work. Can anyone suggest an alternative to byte-by-byte copying so that my code can handle files that are a few hundred MBs in size?
The buffer size is 2048.
Here is what my code looks like:
static void copyFiles(SmbFile[] files, String parent) throws IOException {
    SmbFileInputStream input = null;
    FileOutputStream output = null;
    BufferedInputStream buf_input = null;
    try {
        for (SmbFile f : files) {
            System.out.println("Working on files :" + f.getName());
            if (f.isDirectory()) {
                File folderToBeCreated = new File(parent + f.getName());
                if (!folderToBeCreated.exists()) {
                    folderToBeCreated.mkdir();
                    System.out.println("Folder name " + parent
                            + f.getName() + "has been created");
                } else {
                    System.out.println("exists");
                }
                copyFiles(f.listFiles(), parent + f.getName());
            } else {
                input = (SmbFileInputStream) f.getInputStream();
                buf_input = new BufferedInputStream(input, BUFFER);
                File t = new File(parent + f.getName());
                if (!t.exists()) {
                    t.createNewFile();
                }
                output = new FileOutputStream(t);
                int c;
                int count;
                byte data[] = new byte[BUFFER];
                while ((count = buf_input.read(data, 0, BUFFER)) != -1) {
                    output.write(data, 0, count);
                }
            }
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (input != null) {
            input.close();
        }
        if (output != null) {
            output.close();
        }
    }
}

Here is a link to an excellent post explaining how to use nio channels to make copies of streams. It introduces a helper method ChannelTools.fastChannelCopy that lets you copy streams like this:
final InputStream input = new FileInputStream(inputFile);
final OutputStream output = new FileOutputStream(outputFile);
final ReadableByteChannel inputChannel = Channels.newChannel(input);
final WritableByteChannel outputChannel = Channels.newChannel(output);
ChannelTools.fastChannelCopy(inputChannel, outputChannel);
inputChannel.close();
outputChannel.close();
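The helper itself is not reproduced here, but a copy helper of that shape typically looks something like the following (a sketch; the actual ChannelTools in the linked post may differ in details):
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public final class ChannelTools {
    public static void fastChannelCopy(ReadableByteChannel src, WritableByteChannel dest)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024);
        while (src.read(buffer) != -1) {
            buffer.flip();      // prepare the buffer to be drained
            dest.write(buffer); // write whatever was read
            buffer.compact();   // keep any bytes the write did not consume
        }
        buffer.flip();          // EOF reached: drain what is left in the buffer
        while (buffer.hasRemaining()) {
            dest.write(buffer);
        }
    }
}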

Well, since you're using a BufferedInputStream, you aren't actually reading byte by byte, but rather in blocks of the buffer size. You could just try increasing the buffer size.
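For example (the exact size is a guess; tune it to your disk and network):
// 64 KiB instead of the 2 KiB BUFFER, with a matching read block
buf_input = new BufferedInputStream(input, 64 * 1024);
byte[] data = new byte[64 * 1024];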

Reading/writing byte by byte is definitely going to be slow, even though the actual reading/writing is done in chunks of the buffer size. One way to speed it up is to read/write in blocks. Have a look at the read(byte[] b, int off, int len) method of BufferedInputStream. However, it probably won't give you enough of an improvement.
What would be much better is to use the nio package (New I/O) to copy the data using nio channels. Have a look at the nio documentation for more info.
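For a plain file-to-file copy, for example, a FileChannel transfer avoids the manual loop entirely (a minimal sketch, not tied to the SmbFile case in the question):
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;

static void copyWithChannels(String src, String dst) throws IOException {
    try (FileChannel in = new FileInputStream(src).getChannel();
         FileChannel out = new FileOutputStream(dst).getChannel()) {
        long position = 0;
        long size = in.size();
        // transferTo may copy less than requested in one call, so loop until done
        while (position < size) {
            position += in.transferTo(position, size - position, out);
        }
    }
}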

I would suggest using FileUtils from org.apache.commons.io. It has enough utility methods to perform file operations.
See the org.apache.commons.io.FileUtils API.
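For example (a minimal sketch, assuming commons-io is on the classpath; the stream variant also accepts an SmbFileInputStream):
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.io.FileUtils;

static void copyLocal(File src, File dst) throws IOException {
    // handles buffering and the copy loop internally
    FileUtils.copyFile(src, dst);
}

static void copyStreamToFile(InputStream in, File target) throws IOException {
    // creates parent directories as needed and copies the whole stream
    FileUtils.copyInputStreamToFile(in, target);
}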


Error where FileOutputStream only writes to file after the program has been terminated

I've had this error in the past but never fully understood it. Closing an OutputStream, regardless of where the Java file is located or how it is called, completely breaks all subsequent runs or attempts to write to another file, even if a different method of writing to a file is used. For this reason I avoid closing streams, even though that is a horrible habit. In my program, I was trying a test case that had a close statement, which broke all of my previous streams; for some reason they now only write to their files after the program has terminated.
I kept the file location open, and the text is written to the text file at the appropriate time; however, the "Preview" panel in Windows no longer detects it (which it used to). Note that this all worked perfectly before the stream was accidentally closed. Is there a way to reset the stream? I've tried flushing it during the process, but it still does not behave as it did before.
Here is the method used to create the file:
protected void createFile(String fileName, String content) {
    try {
        String fileLoc = PATH + fileName + ".txt";
        File f = new File(fileLoc);
        if (!f.isFile())
            f.createNewFile();
        FileOutputStream outputStream = new FileOutputStream(fileLoc);
        byte[] strToBytes = content.getBytes();
        outputStream.write(strToBytes);
    } catch (IOException e) {
        e.printStackTrace();
        return;
    }
}
as well as the method used to read the file:
protected String readFile(String fileName) {
    try {
        StringBuilder sb = new StringBuilder("");
        String fileLoc = PATH + fileName + ".txt";
        File f = new File(fileLoc);
        if (!f.exists())
            return "null";
        Scanner s = new Scanner(f);
        int c = 0;
        while (s.hasNext()) {
            String str = s.nextLine();
            sb.append(str);
            if (s.hasNext())
                sb.append("\n");
        }
        return sb.toString();
    } catch (Exception e) {
        e.printStackTrace();
        return "null";
    }
}
I'd be happy to answer any clarification questions if needed. Thank you for the assistance.
Without try-with-resources, you need to close the stream in a finally clause to make sure nothing leaks. Or use flush() on the stream if you need a more 'in-time' update.
} catch (IOException e) {
    e.printStackTrace();
    return;
} finally {
    outputStream.close();
}
You need to call flush() on the stream to write the bytes to the stream.
You're currently calling write() by itself, like this:
FileOutputStream outputStream = new FileOutputStream(fileLoc);
outputStream.write(content.getBytes());
What you want to do is this:
FileOutputStream outputStream = new FileOutputStream(fileLoc);
outputStream.write(content.getBytes());
outputStream.flush();
From the Javadoc (https://docs.oracle.com/javase/8/docs/api/java/io/OutputStream.html#flush--) for OutputStream (where FileOutputStream is an OutputStream), this is what it says for flush():
Flushes this output stream and forces any buffered output bytes to be written out. The general contract of flush is that calling it is an indication that, if any bytes previously written have been buffered by the implementation of the output stream, such bytes should immediately be written to their intended destination.
Even better would be to close the stream in a finally block, so that no matter what happens, your code always tries to free up any open resources, like this:
FileOutputStream outputStream = null;
try {
    outputStream = new FileOutputStream(fileLoc);
    outputStream.write(content.getBytes());
    outputStream.flush();
} finally {
    if (outputStream != null) {
        outputStream.close();
    }
}
or use automatic resource management, like this:
try (FileOutputStream outputStream = new FileOutputStream(fileLoc)) {
    outputStream.write(content.getBytes());
    outputStream.flush();
}
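Applied to the createFile method from the question, a minimal corrected version might look like this (assuming PATH is defined as in the original code):
protected void createFile(String fileName, String content) {
    String fileLoc = PATH + fileName + ".txt";
    // try-with-resources closes (and therefore flushes) the stream even if an exception is thrown
    try (FileOutputStream outputStream = new FileOutputStream(fileLoc)) {
        outputStream.write(content.getBytes());
    } catch (IOException e) {
        e.printStackTrace();
    }
}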

compressing file by independent chunks and then concat them into one valid archive

I wonder if it is possible to compress an arbitrary file (or folder, or any other file structure) by independent chunks and then get a valid archive (e.g. gzip) by concatenating them together. Some requirements:
java 8
chunks <= 16MB
folder structure does not change during the process
chunks are compressed independently, but order is preserved
each compressed chunk is appended to the end of the resulting archive
resulting archive should be valid and decompressable by any standard tool
It looks like, to achieve that, I would need to create an archive header first and then just append compressed blocks to it (https://www.rfc-editor.org/rfc/rfc1952); however, I'm not sure whether that is supported by any of the standard Java utils or third-party libraries. Does anybody have any ideas on where to start?
Some background:
I have a client-server app which allows the user to upload files to cloud storage. Communication is via a REST API; the client side is going to be responsible for dividing files into chunks and uploading them one by one. It is possible to do the compression in the browser, but I wonder if we can move that load to the backend.
Yes. A concatenation of gzip files is a valid gzip file, per the standard (RFC 1952). gzip certainly handles this.
You are correct to be concerned that some code out there might not support it, since it is not very common to have concatenated gzip members. If you want to be super-safe, you can combine the gzip files into a single gzip member, without having to recompress. You do however need to read through all of the compressed data, effectively decompressing it in memory (which is still much faster than compressing). You can find an example of that in gzjoin.c.
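If you want to convince yourself from Java, here is a small self-contained check (recent JDKs' java.util.zip.GZIPInputStream reads across concatenated gzip members):
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatGzipDemo {
    static byte[] gzip(byte[] data) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(data);
        }
        return bos.toByteArray(); // one complete gzip member
    }

    public static void main(String[] args) throws IOException {
        // Compress two "chunks" independently and simply concatenate the results.
        ByteArrayOutputStream archive = new ByteArrayOutputStream();
        archive.write(gzip("first chunk ".getBytes(StandardCharsets.UTF_8)));
        archive.write(gzip("second chunk".getBytes(StandardCharsets.UTF_8)));

        // Decompressing the concatenation yields the original data back to back.
        ByteArrayOutputStream restored = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(archive.toByteArray()))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                restored.write(buf, 0, n);
            }
        }
        System.out.println(restored.toString("UTF-8")); // prints "first chunk second chunk"
    }
}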
You can try something like this for tar + gzip:
Maven dependency:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.18</version>
</dependency>
Java code to compress into chunks:
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.compress.utils.IOUtils;

import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;

[..]

private static final int MAX_CHUNK_SIZE = 16000000;

public void compressTarGzChunks(String inputDirPath, String outputDirPath) throws Exception {
    PipedInputStream in = new PipedInputStream();
    final PipedOutputStream out = new PipedOutputStream(in);
    new Thread(() -> {
        try {
            int chunkIndex = 0;
            int n = 0;
            byte[] buffer = new byte[8192];
            do {
                String chunkFileName = String.format("archive-part%d.tar.gz", chunkIndex);
                try (OutputStream fOut = Files.newOutputStream(Paths.get(outputDirPath, chunkFileName));
                     BufferedOutputStream bOut = new BufferedOutputStream(fOut);
                     GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(bOut)) {

                    int currentChunkSize = 0;
                    if (chunkIndex > 0) {
                        gzOut.write(buffer, 0, n);
                        currentChunkSize += n;
                    }
                    while ((n = in.read(buffer)) != -1 && currentChunkSize + n < MAX_CHUNK_SIZE) {
                        gzOut.write(buffer, 0, n);
                        currentChunkSize += n;
                    }
                    chunkIndex++;
                }
            } while (n != -1);
            in.close();
        } catch (IOException e) {
            // logging and exception handling should go here
        }
    }).start();

    try (TarArchiveOutputStream tOut = new TarArchiveOutputStream(out)) {
        compressTar(tOut, inputDirPath, "");
    }
}

private static void compressTar(TarArchiveOutputStream tOut, String path, String base)
        throws IOException {
    File file = new File(path);
    String entryName = base + file.getName();
    TarArchiveEntry tarEntry = new TarArchiveEntry(file, entryName);
    tarEntry.setSize(file.length());
    tOut.putArchiveEntry(tarEntry);

    if (file.isFile()) {
        try (FileInputStream in = new FileInputStream(file)) {
            IOUtils.copy(in, tOut);
            tOut.closeArchiveEntry();
        }
    } else {
        tOut.closeArchiveEntry();
        File[] children = file.listFiles();
        if (children != null) {
            for (File child : children) {
                compressTar(tOut, child.getAbsolutePath(), entryName + "/");
            }
        }
    }
}
Java code to concatenate the chunks into a single archive:
public void concatTarGzChunks(List<InputStream> sortedTarGzChunks, String outputFile) throws IOException {
    try {
        try (FileOutputStream fos = new FileOutputStream(outputFile)) {
            for (InputStream in : sortedTarGzChunks) {
                int len;
                byte[] buf = new byte[1024 * 1024];
                while ((len = in.read(buf)) != -1) {
                    fos.write(buf, 0, len);
                }
            }
        }
    } finally {
        sortedTarGzChunks.forEach(is -> {
            try {
                is.close();
            } catch (IOException e) {
                // logging and exception handling should go here
            }
        });
    }
}

Android FileOutputStream Seems to Fail

I am trying to transfer a video file from an RPi hotspot to a directory on my phone over WiFi. I have been able to successfully create a folder in my storage, connect to the RPi server, and receive data. However, the file that appears after being written isn't correct. In fact, when I try to open it, it just opens a separate, unrelated app on my phone. Very weird!
Here is the code in question:
try {
    BufferedInputStream myBis = new BufferedInputStream(mySocket.getInputStream());
    DataInputStream myDis = new DataInputStream(myBis);

    byte[] videoBuffer = new byte[4096*2];
    int i = 0;
    while (mySocket.getInputStream().read(videoBuffer) != -1) {
        Log.d(debugStr, "while loop");
        videoBuffer[videoBuffer.length-1-i] = myDis.readByte();
        Log.d(debugStr, Arrays.toString(videoBuffer));
        i++;
    }
    Log.d(debugStr, "done with while loop");

    // create a File object for the parent directory
    File testDirectory = new File(Environment.getExternalStorageDirectory() + File.separator, "recordFolder");
    Log.d(debugStr, "path made?");
    if (!testDirectory.exists()) {
        testDirectory.mkdirs();
    }
    Log.d(debugStr, "directory made");

    // create a File object for the output file
    File outputFile = new File(testDirectory.getPath(), "recording1");
    Log.d(debugStr, "outputfile made");

    // now attach the OutputStream to the file object, i
    FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
    Log.d(debugStr, "write to file object made");
    fileOutputStream.write(videoBuffer);
    Log.d(debugStr, "video written");
    fileOutputStream.close();
    Log.d(debugStr, "done");
} catch (IOException e1) {
    e1.printStackTrace();
}
The video is initially in .h264 format and is being sent as a byte array. The file is 10MB in size. In my while loop, I print out the value of the array as a string, and it prints a lot of data. Enough data for me to suspect that all the data is being sent. When I navigate to the folder it should be in, there is a file with the name I gave it, "recording1", but it is only 8KB in size.
Any ideas on what is going on? Any help is greatly appreciated!
Android FileOutputStream seems to fail
No it doesn't. Your code seems to fail. That's because your code makes no sense. You're throwing away large chunks of data, more or less accumulating only 1 out of every 8192 bytes; you're using both buffered and unbuffered reads; you're limiting the input to 8192 bytes; and you're never closing the input. And if the input is larger than 8192*8193 you can get an ArrayIndexOutOfBoundsException.
Throw it all away and use this:
try {
    File testDirectory = new File(Environment.getExternalStorageDirectory() + File.separator, "recordFolder");
    if (!testDirectory.exists()) {
        testDirectory.mkdirs();
    }
    File outputFile = new File(testDirectory, "recording1");
    try (OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile));
         BufferedInputStream in = new BufferedInputStream(mySocket.getInputStream())) {
        byte[] buffer = new byte[8192]; // or more, whatever you like > 0
        int count;
        // Canonical Java copy loop
        while ((count = in.read(buffer)) > 0) {
            out.write(buffer, 0, count);
        }
    }
} catch (IOException e1) {
    e1.printStackTrace();
}

How to read file part by part while writing it?

I'm working on an app that records video, and I need to send the already-written data in the video file to a server as a Base64 string without stopping the recording process. Does anyone know how to do this with low memory consumption?
For now I'm doing it this way:
private void sendNewVideos(String path) {
    try {
        Log.i(TAG, "VIDEO PATH - " + path);
        FileWriter fileWriter = new FileWriter(new File(pathToFolder + "/temp.txt"));
        String base64String = new String();
        File file = new File(path);
        Long size = 0L;
        base64String = Base64.encodeToString(readFile(file, size), Base64.DEFAULT);
        fileWriter.append(base64String);
        fileWriter.flush();
        boolean flag = true;
        while (flag) {
            if (size < file.length()) {
                base64String = Base64.encodeToString(readFile(file, size), Base64.DEFAULT);
                fileWriter.append(base64String);
                fileWriter.flush();
                size = file.length();
            }
        }
        fileWriter.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

private byte[] readFile(File file, Long size) {
    try {
        RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r");
        randomAccessFile.seek(size);
        FileChannel fileChannel = randomAccessFile.getChannel();
        ByteBuffer buffer = ByteBuffer.allocate(1024 * 1024 * 2);
        while (fileChannel.read(buffer) > 0) {
            buffer.flip();
            byte[] temp = new byte[buffer.limit()];
            for (int i = 0; i < buffer.limit(); i++) {
                temp[i] = buffer.get(i);
            }
            buffer.clear();
            return temp;
        }
        fileChannel.close();
        randomAccessFile.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
Writing to a file is just to check how it works. But after some time the recording stops. Sometimes LogCat shows something like this:
I/art: Thread[3,tid=23425,WaitingInMainSignalCatcherLoop,Thread*=0x7fe42c410800,peer=0x22c08080,"Signal Catcher"]: reacting to signal 3
I/art: Wrote stack traces to '/data/anr/traces.txt'
I think that's because of either a memory leak or just an out-of-memory problem.
Some possible solutions:
Don't use Base64 to encode the video for sending over the network (even Wi-Fi), as it increases the amount of data by roughly a third, which is not good for the battery and could kill or hang your process/service.
Avoid reading a file that is still in the process of being written, as it could (and will) slow down IO.
If you still need to send data from such a file, use something like the following algorithm (a rough sketch follows this list):
get access to the file (for example with a buffered input stream);
read part of the file into a buffer;
do as little work with it as possible; for example, send the buffer to the server in a separate thread with HttpURLConnection. You can find an example here.
Control the memory you use, otherwise the system will try to kill your process.
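For illustration only, here is a rough sketch of those steps, assuming a hypothetical upload(chunk, offset) method for the HTTP part and sending the bytes raw rather than Base64-encoded:
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ChunkSender {
    private static final int CHUNK_SIZE = 1024 * 1024; // 1 MB per read
    private final ExecutorService uploader = Executors.newSingleThreadExecutor();
    private long position = 0;

    // Call this periodically while the recorder keeps appending to the file.
    public void sendNewData(File video) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(video, "r")) {
            raf.seek(position);
            byte[] buffer = new byte[CHUNK_SIZE];
            int read;
            while ((read = raf.read(buffer)) > 0) {
                final byte[] chunk = java.util.Arrays.copyOf(buffer, read);
                final long offset = position;
                position += read;
                // Hand the raw chunk to a worker thread; upload(chunk, offset) is
                // a placeholder for your own HttpURLConnection code.
                uploader.execute(() -> upload(chunk, offset));
            }
        }
    }

    private void upload(byte[] chunk, long offset) {
        // POST the bytes as-is (no Base64) to the server, tagged with the offset.
    }
}
Each call picks up where the previous one stopped, so the file can keep growing between calls.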

Reading a block of bytes from one file and writing to other until all blocks are read?

I am working on a project in which I have to play with some file reading and writing tasks. I have to read 8 bytes from a file at a time, perform some operations on that block, and then write the block to a second file, repeating the cycle until the first file has been completely read in chunks of 8 bytes; after manipulation, the data should be appended to the second file. However, in doing so, I am facing some problems. The following is what I am trying:
private File readFromFile1(File file1) {
    int offset = 0;
    long message = 0;
    try {
        FileInputStream fis = new FileInputStream(file1);
        byte[] data = new byte[8];
        file2 = new File("file2.txt");
        FileOutputStream fos = new FileOutputStream(file2.getAbsolutePath(), true);
        DataOutputStream dos = new DataOutputStream(fos);
        while (fis.read(data, offset, 8) != -1) {
            message = someOperation(data); // operation according to business logic
            dos.writeLong(message);
        }
        fos.close();
        dos.close();
        fis.close();
    } catch (IOException e) {
        System.out.println("Some error occurred while reading from File:" + e);
    }
    return file2;
}
I am not getting the desired output this way. Any help is appreciated.
Consider the following code:
private File readFromFile1(File file1) {
    int offset = 0;
    long message = 0;
    File file2 = null;
    try {
        FileInputStream fis = new FileInputStream(file1);
        byte[] data = new byte[8];     // Read buffer
        byte[] tmpbuf = new byte[8];   // Temporary chunk buffer
        file2 = new File("file2.txt");
        FileOutputStream fos = new FileOutputStream(file2.getAbsolutePath(), true);
        DataOutputStream dos = new DataOutputStream(fos);
        int readcnt; // Read count
        int chunk;   // Chunk size to write to tmpbuf
        while ((readcnt = fis.read(data, 0, 8)) != -1) {
            //// POINT A ////
            // Skip the chunking system if an 8-byte octet is read directly.
            if (readcnt == 8 && offset == 0) {
                message = someOperation(data); // operation according to business logic
                dos.writeLong(message);
                continue;
            }
            //// POINT B ////
            chunk = Math.min(tmpbuf.length - offset, readcnt); // Determine how much to add to the temp buf.
            System.arraycopy(data, 0, tmpbuf, offset, chunk);  // Copy bytes to temp buf
            offset = offset + chunk;                           // Set the offset into the temp buf
            if (offset == 8) {
                message = someOperation(tmpbuf); // operation according to business logic
                dos.writeLong(message);
                if (chunk < readcnt) {
                    System.arraycopy(data, chunk, tmpbuf, 0, readcnt - chunk);
                    offset = readcnt - chunk;
                } else {
                    offset = 0;
                }
            }
        }
        //// POINT C ////
        // Process remaining bytes here...
        // message = foo(tmpbuf);
        // dos.writeLong(message);
        fos.close();
        dos.close();
        fis.close();
    } catch (IOException e) {
        System.out.println("Some error occurred while reading from File:" + e);
    }
    return file2;
}
In this excerpt of code, what I did was:
Modify your reading code to include the amount of bytes actually read from the read() method (noted readcnt).
Added a byte chunking system (the processing does not happen until there are at least 8 bytes in the chunking buffer).
Allowed for separate processing of the final bytes (that do not make up a 8 byte octet).
As you can see from the code, the data being read is first stored in a chunking buffer (denoted tmpbuf) until at least 8 bytes are available. This will happen only if 8 bytes are not always available (If 8 bytes are available directly and nothing is chunked, directly process. See "Point A" in code). This is done as a form of optimization to prevent excess array copies.
The chunking system uses offsets which increment every time bytes are written to tmpbuf until it reaches a value of 8 (it will not go over as the Math.min() method used in the assignment of 'chunk' will limit the value). Upon offset == 8, proceed to execute the processing code.
If that particular read produced more bytes than actually processed, continue writing them to tmpbuf, from the beginning again, whilst setting offset appropriately, otherwise set offset to 0.
Repeat cycle.
The code will leave the last few bytes of data that do not fit in an octet in the array tmpbuf with the offset variable indicating how much has actually been written. This data can then be processed separately at point C.
This seems a lot more complicated than it should be, and there probably is a better solution (possibly using existing Java library methods), but off the top of my head, this is what I've got. Hope this is clear enough for you to understand.
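For completeness, one simpler route (a minimal sketch, not a drop-in replacement): DataInputStream.readFully blocks until it has read exactly the requested number of bytes, which sidesteps the short-read problem entirely; only the trailing partial block (fewer than 8 bytes) needs separate handling. someOperation below is the method from the question.
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

static void copyInOctets(File in, File out) throws IOException {
    try (DataInputStream dis = new DataInputStream(new BufferedInputStream(new FileInputStream(in)));
         DataOutputStream dos = new DataOutputStream(new FileOutputStream(out))) {
        byte[] block = new byte[8];
        while (true) {
            try {
                dis.readFully(block);        // always fills all 8 bytes, or throws EOFException
            } catch (EOFException partialOrEnd) {
                break;                       // fewer than 8 bytes left; handle them separately if needed
            }
            long message = someOperation(block); // business logic from the question
            dos.writeLong(message);
        }
    }
}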
You could use the following; it uses NIO and especially the ByteBuffer class for the long handling. You can of course implement it the standard Java way, but since I am a NIO fan, here is a possible solution.
The major problem in your code is that while(fis.read(data, offset, 8) != -1) will read up to 8 bytes, not always 8 bytes; plus, reading in such small portions is not very efficient.
I have put some comments in my code, if something is unclear please leave a comment. My someOperation(...) function just copies the next long value from the buffer.
Update:
added finally block to close the files.
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

public class TestFile {

    static final int IN_BUFFER_SIZE = 1024 * 8;
    static final int OUT_BUFFER_SIZE = 1024 * 9; // make the out-buffer > in-buffer, I am lazy and don't want to check for overruns
    static final int MIN_READ_BYTES = 8;
    static final int MIN_WRITE_BYTES = 8;

    private File readFromFile1(File inFile) {
        final File outFile = new File("file2.txt");
        final ByteBuffer inBuffer = ByteBuffer.allocate(IN_BUFFER_SIZE);
        final ByteBuffer outBuffer = ByteBuffer.allocate(OUT_BUFFER_SIZE);

        FileChannel readChannel = null;
        FileChannel writeChannel = null;
        try {
            // open a file channel for reading and writing
            readChannel = FileChannel.open(inFile.toPath(), StandardOpenOption.READ);
            writeChannel = FileChannel.open(outFile.toPath(), StandardOpenOption.CREATE, StandardOpenOption.WRITE);

            long totalReadByteCount = 0L;
            long totalWriteByteCount = 0L;

            boolean readMore = true;
            while (readMore) {
                // read bytes into the in-buffer until it is full or end-of-file is reached
                int readOp = 0;
                while (inBuffer.hasRemaining() && (readOp = readChannel.read(inBuffer)) != -1) {
                    totalReadByteCount += readOp;
                } // while

                // prepare the in-buffer to be consumed
                inBuffer.flip();

                // check if end-of-file was reached
                if (readOp == -1) {
                    // end of file reached, read no more
                    readMore = false;
                } // if

                // now consume the in-buffer while there are at least MIN_READ_BYTES in the buffer
                while (inBuffer.remaining() >= MIN_READ_BYTES) {
                    // add data to the write buffer
                    outBuffer.putLong(someOperation(inBuffer));
                } // while

                // compact the in-buffer and prepare for the next read, if we need to read more.
                // that way the possible remaining bytes of the in-buffer can be consumed after leaving the loop
                if (readMore) inBuffer.compact();

                // prepare the out-buffer to be consumed
                outBuffer.flip();

                // write the out-buffer until the buffer is empty
                while (outBuffer.hasRemaining())
                    totalWriteByteCount += writeChannel.write(outBuffer);

                // prepare the out-buffer for filling again
                outBuffer.clear();
            } // while

            // error handling
            if (inBuffer.hasRemaining()) {
                System.err.println("Truncated data! Not a long value! bytes remaining: " + inBuffer.remaining());
            } // if

            System.out.println("read total: " + totalReadByteCount + " bytes.");
            System.out.println("write total: " + totalWriteByteCount + " bytes.");
        } catch (IOException e) {
            System.out.println("Some error occurred while reading from File: " + e);
        } finally {
            if (readChannel != null) {
                try {
                    readChannel.close();
                } catch (IOException e) {
                    System.out.println("Could not close read channel: " + e);
                } // catch
            } // if
            if (writeChannel != null) {
                try {
                    writeChannel.close();
                } catch (IOException e) {
                    System.out.println("Could not close write channel: " + e);
                } // catch
            } // if
        } // finally

        return outFile;
    }

    private long someOperation(ByteBuffer bb) {
        // consume the buffer, do whatever you want with the buffer.
        return bb.getLong(); // consumes 8 bytes of the buffer.
    }

    public static void main(String[] args) {
        TestFile testFile = new TestFile();
        File source = new File("input.txt");
        testFile.readFromFile1(source);
    }
}
