Issue in chunking the file using RandomAccessFile and FileChannel


I wrote this piece of code to split a file into multiple chunks. The program works fine for a 12KB file with a chunk size of 8KB. However, when I give it an input file of 2980144 bytes, it spins forever and never terminates.
Does this have something to do with the size of the input file and the way FileChannel accesses it? I want to use this program to split larger (binary) files into multiple chunks for easy transport over the network. I have kept the chunk size as a parameter so that I can configure it as required.
public static void main(String[] args) {
int chunkSize = 8000;
long offset = 0;
while (offset >= 0) {
offset = splitter.GetNextChunk(offset);
}
}
public long GetNextChunk(long offset) {
long bytesRead = 0;
ByteBuffer tmpBuf = ByteBuffer.allocate(chunkSize);
RandomAccessFile outFile = null;
RandomAccessFile inFile = null;
FileChannel inFC = null;
FileChannel outFC = null;
try {
inFile = new RandomAccessFile(inFileName, "r");
inFC = inFile.getChannel();
tmpBuf.clear();
// Seek to the offset in the file
inFC.position(offset);
// Read the specified number of bytes into the buffer.
do {
bytesRead = inFC.read(tmpBuf);
} while (bytesRead != -1 && tmpBuf.hasRemaining());
// Write the copied bytes into a new file (chunk).
String outFileName = outFolder + File.separator + "Chunk" + String.valueOf(chunkCounter++) + ".dat";
outFile = new RandomAccessFile(outFileName, "rw");
outFC = outFile.getChannel();
outFC.position(0);
tmpBuf.flip();
while(tmpBuf.hasRemaining()) {
outFC.write(tmpBuf);
}
// Reposition the buffer to 0.
tmpBuf.rewind();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} finally {
try {
if (inFC != null)
inFile.close();
if (outFC != null)
outFile.close();
if (inFC != null)
inFC.close();
if (outFC != null)
outFC.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
return bytesRead;
}

Found the issue. The loop was faulty. Below is the correct loop.
while (bytesRead >= 0) {
bytesRead = splitter.GetNextChunk(offset);
if (bytesRead == -1)
break;
offset += bytesRead;
System.out.println("Byte offset is: " + offset);
}
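For comparison, here is a minimal sketch of the same splitting idea that opens the input once and stops when nothing more can be read, so no offset bookkeeping is needed in the caller. The class name ChunkSplitterSketch and the method split are made up for illustration; only the "Chunk" + counter + ".dat" naming comes from the question.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChunkSplitterSketch {

    /** Splits 'input' into Chunk0.dat, Chunk1.dat, ... inside 'outFolder' and returns the chunk count. */
    public static int split(Path input, Path outFolder, int chunkSize) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(chunkSize);
        int chunkCounter = 0;
        // Open the input once instead of reopening it for every chunk.
        try (FileChannel in = FileChannel.open(input, StandardOpenOption.READ)) {
            while (true) {
                buf.clear();
                // Fill the buffer; a single read() may return fewer bytes than requested.
                while (buf.hasRemaining() && in.read(buf) != -1) {
                    // keep reading until the buffer is full or the file ends
                }
                buf.flip();
                if (!buf.hasRemaining()) {
                    break; // nothing was read: end of file reached
                }
                Path chunk = outFolder.resolve("Chunk" + chunkCounter++ + ".dat");
                try (FileChannel out = FileChannel.open(chunk,
                        StandardOpenOption.CREATE,
                        StandardOpenOption.TRUNCATE_EXISTING,
                        StandardOpenOption.WRITE)) {
                    while (buf.hasRemaining()) {
                        out.write(buf);
                    }
                }
            }
        }
        return chunkCounter;
    }
}
```

Because the channel keeps its own position between reads, the loop ends on its own at end of file, and a file whose size is an exact multiple of the chunk size does not produce an empty trailing chunk.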

It's not nearly as hard as you're making it. Your code is about ten times as long as it needs to be. Try this:
while (in.read(buffer) > 0 || buffer.position() > 0)
{
buffer.flip();
out.write(buffer);
buffer.compact();
}
If 'out' is a SocketChannel this will send the file over the network at maximum speed.
You don't need a monstro buffer, but you should always use powers of 2. I generally use 8192.
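To make the loop above concrete, here is a small self-contained sketch that wraps it in a method. The class name ChannelCopySketch, the method copy and the in-memory channels in main are assumptions added for the demo; the loop itself is the one from this answer and assumes blocking channels.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public class ChannelCopySketch {

    /** Copies everything from 'in' to 'out' and returns the number of bytes written. */
    public static long copy(ReadableByteChannel in, WritableByteChannel out) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192);
        long total = 0;
        // Continue while data can still be read, or while unwritten bytes remain in the buffer.
        while (in.read(buffer) > 0 || buffer.position() > 0) {
            buffer.flip();              // switch from filling to draining
            total += out.write(buffer); // may write only part of the buffer
            buffer.compact();           // keep unwritten bytes, switch back to filling
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[20000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(Channels.newChannel(new ByteArrayInputStream(data)),
                           Channels.newChannel(sink));
        System.out.println(copied + " bytes copied");
    }
}
```

The compact() call is what makes partial writes safe: any bytes the output did not accept stay at the front of the buffer for the next pass.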

Related

File md5 hash changes when chunking it (for netty transfer)

Question at the bottom
I'm using netty to transfer a file to another server.
I limit my file chunks to 1024*64 bytes (64KB) because of the WebSocket protocol. The following method is a local example of what happens to the file:
public static void rechunck(File file1, File file2) {
FileInputStream is = null;
FileOutputStream os = null;
try {
byte[] buf = new byte[1024*64];
is = new FileInputStream(file1);
os = new FileOutputStream(file2);
while(is.read(buf) > 0) {
os.write(buf);
}
} catch (IOException e) {
Controller.handleException(Thread.currentThread(), e);
} finally {
try {
if(is != null && os != null) {
is.close();
os.close();
}
} catch (IOException e) {
Controller.handleException(Thread.currentThread(), e);
}
}
}
The file is read from the InputStream into a byte[] buffer and written directly to the OutputStream.
The content of the file should not change during this process.
To get the MD5 hashes of the files, I wrote the following method:
public static String checksum(File file) {
InputStream is = null;
try {
is = new FileInputStream(file);
MessageDigest digest = MessageDigest.getInstance("MD5");
byte[] buffer = new byte[8192];
int read = 0;
while((read = is.read(buffer)) > 0) {
digest.update(buffer, 0, read);
}
return new BigInteger(1, digest.digest()).toString(16);
} catch(IOException | NoSuchAlgorithmException e) {
Controller.handleException(Thread.currentThread(), e);
} finally {
try {
is.close();
} catch(IOException e) {
Controller.handleException(Thread.currentThread(), e);
}
}
return null;
}
So, in theory, it should return the same hash, shouldn't it? The problem is that it returns two different hashes, and they are the same on every run. The file size stays the same, and so does the content.
When I run the method once with in: file-1, out: file-2 and again with in: file-2, out: file-3, the hashes of file-2 and file-3 are the same! This means the method probably changes the file the same way every time.
1. 58a4a9fbe349a9e0af172f9cf3e6050a
2. 7b3f343fa1b8c4e1160add4c48322373
3. 7b3f343fa1b8c4e1160add4c48322373
Here is a little test that compares all buffers if they are equivalent. Test is positive. So there aren't any differences.
File file1 = new File("controller/templates/Example.zip");
File file2 = new File("controller/templates2/Example.zip");
try {
byte[] buf1 = new byte[1024*64];
byte[] buf2 = new byte[1024*64];
FileInputStream is1 = new FileInputStream(file1);
FileInputStream is2 = new FileInputStream(file2);
boolean run = true;
while(run) {
int read1 = is1.read(buf1), read2 = is2.read(buf2);
String result1 = Arrays.toString(buf1), result2 = Arrays.toString(buf2);
boolean test = result1.equals(result2);
System.out.println("1: " + result1);
System.out.println("2: " + result2);
System.out.println("--- TEST RESULT: " + test + " ----------------------------------------------------");
if(!(read1 > 0 && read2 > 0) || !test) run = false;
}
} catch (IOException e) {
e.printStackTrace();
}
Question: Can you help me chunk the file without changing the hash?

while(is.read(buf) > 0) {
os.write(buf);
}
The read() method with the array argument returns the number of bytes read from the stream. When the file length is not an exact multiple of the byte array length, the last return value will be smaller than the byte array length because you reached the end of the file.
However your os.write(buf); call will write the whole byte array to the stream, including the remaining bytes after the file end. This means the written file gets bigger in the end, therefore the hash changed.
Interestingly you didn't make the mistake when you updated the message digest:
while((read = is.read(buffer)) > 0) {
digest.update(buffer, 0, read);
}
You just have to do the same when you "rechunk" your files.

Your rechunk method has a bug in it. Since you use a fixed-size buffer, the file is split into byte-array-sized parts, but the last part of the file can be smaller than the buffer, which is why you write too many bytes to the new file. That is why the checksum no longer matches. The error can be fixed like this:
public static void rechunck(File file1, File file2) {
FileInputStream is = null;
FileOutputStream os = null;
try {
byte[] buf = new byte[1024*64];
is = new FileInputStream(file1);
os = new FileOutputStream(file2);
int length;
while((length = is.read(buf)) > 0) {
os.write(buf, 0, length);
}
} catch (IOException e) {
Controller.handleException(Thread.currentThread(), e);
} finally {
try {
if(is != null)
is.close();
if(os != null)
os.close();
} catch (IOException e) {
Controller.handleException(Thread.currentThread(), e);
}
}
}
Thanks to the length variable, the write method knows that only the first length bytes of the array belong to the file. Anything after that is stale data left over from a previous read and must not be written.
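As a variation on the fix, here is a minimal sketch that uses try-with-resources, so both streams are closed even if one close() throws. The class and method names are made up for illustration. The md5Hex helper pads the result to 32 hex digits, because new BigInteger(1, bytes).toString(16) drops leading zeros.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.math.BigInteger;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RechunkSketch {

    /** Copies 'source' to 'target' in 64KB pieces, writing only the bytes actually read. */
    public static void rechunk(Path source, Path target) throws IOException {
        byte[] buf = new byte[1024 * 64];
        try (InputStream is = Files.newInputStream(source);
             OutputStream os = Files.newOutputStream(target)) {
            int length;
            while ((length = is.read(buf)) > 0) {
                os.write(buf, 0, length); // never write past what read() returned
            }
        }
    }

    /** Returns the MD5 of 'file' as a zero-padded, 32-character hex string. */
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        try (InputStream is = Files.newInputStream(file)) {
            int read;
            while ((read = is.read(buffer)) > 0) {
                digest.update(buffer, 0, read);
            }
        }
        return String.format("%032x", new BigInteger(1, digest.digest()));
    }
}
```

With this version, md5Hex(source) and md5Hex(target) should match for any file size, including sizes that are not a multiple of 64KB.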

Read all InputStream values at once into a byte[] array

Is there a way to read all InputStream values at once without using some Apache IO library?
I am reading an IR signal and saving it from the InputStream into a byte[] array. While debugging, I noticed that it works only if I put a delay in, so that all bytes are read at once and then processed.
Is there a smarter way to do it?
CODE:
public void run() {
Log.i(TAG, "BEGIN mConnectedThread");
byte[] buffer = new byte[100];
int numberOfBytes;
removeSharedPrefs("mSharedPrefs");
// Keep listening to the InputStream while connected
while (true) {
try {
// Read from the InputStream
numberOfBytes = mmInStream.read(buffer);
Thread.sleep(700); //If I stop it here for a while, all works fine, because array is fully populated
if (numberOfBytes > 90){
// GET AXIS VALUES FROM THE SHARED PREFS
String[] refValues = loadArray("gestureBuffer", context);
if (refValues!=null && refValues.length>90) {
int incorrectPoints;
if ((incorrectPoints = checkIfGesureIsSameAsPrevious(buffer, refValues, numberOfBytes)) < 5) {
//Correct
} else {
//Incorrect
}
}
saveArray(buffer, numberOfBytes);
}else{
System.out.println("Transmission of the data was corrupted.");
}
buffer = new byte[100];
// Send the obtained bytes to the UI Activity
mHandler.obtainMessage(Constants.MESSAGE_READ, numberOfBytes, -1, buffer)
.sendToTarget();
} catch (IOException e) {
Log.e(TAG, "disconnected", e);
connectionLost();
// Start the service over to restart listening mode
BluetoothChatService.this.start();
break;
} catch (InterruptedException e) {
e.printStackTrace();
}
}
}

Edit:
My old answer is wrong; see EJP's comment! Please don't use it. The behaviour of ByteChannels depends on whether the InputStreams are blocking or not.
So this is why I would suggest, you just copy IOUtils.read from Apache Commons:
public static int read(final InputStream input, final byte[] buffer) throws IOException {
int remaining = buffer.length;
while (remaining > 0) {
final int location = buffer.length - remaining;
final int count = input.read(buffer, location, remaining);
if (count == -1) { // EOF
break;
}
remaining -= count;
}
return buffer.length - remaining;
}
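To show why the loop matters, here is a small sketch with a made-up TrickleInputStream that hands out at most 3 bytes per call, roughly like a slow Bluetooth or serial link. The stream class and its 3-byte limit are assumptions for the demo only; readFully is the same loop as the helper above, written with a running counter.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ShortReadSketch {

    /** Hypothetical stream that returns at most 3 bytes per read(), to simulate a slow link. */
    public static class TrickleInputStream extends InputStream {
        private final InputStream delegate;

        public TrickleInputStream(byte[] data) {
            this.delegate = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return delegate.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            return delegate.read(b, off, Math.min(len, 3));
        }
    }

    /** Keeps reading until the buffer is full or the stream ends; returns the bytes stored. */
    public static int readFully(InputStream input, byte[] buffer) throws IOException {
        int filled = 0;
        while (filled < buffer.length) {
            int count = input.read(buffer, filled, buffer.length - filled);
            if (count == -1) {
                break; // EOF
            }
            filled += count;
        }
        return filled;
    }

    public static void main(String[] args) throws IOException {
        byte[] signal = new byte[100];
        int single = new TrickleInputStream(signal).read(new byte[100]);
        int looped = readFully(new TrickleInputStream(signal), new byte[100]);
        System.out.println("single read: " + single + ", looped read: " + looped);
    }
}
```

A single read() only returns what has arrived so far, which is why the Thread.sleep(700) in the question appeared to help; the loop removes the need for the delay.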
Old answer:
You can use ByteChannels and read into a ByteBuffer:
ReadableByteChannel c = Channels.newChannel(inputstream);
ByteBuffer buf = ByteBuffer.allocate(numBytesExpected);
int numBytesActuallyRead = c.read(buf);
This read method is attempting to read as many bytes as there is remaining space in the buffer. If the stream ends before the buffer is fully filled, the number of bytes actually read is returned. See JavaDoc.

Getting game ID of psx game

I was wondering what the best way is to get the title from a disk image in .iso or .cue+.bin format.
Is there any Java library that can do this, or should I read it from the file header?
UPDATE:
I managed to do it. I was particularly interested in the title of PSX ISOs. It's 10 bytes long, and this is sample code to read it:
File f = new File("cdimage2.bin");
FileInputStream fin = new FileInputStream(f);
fin.skip(37696);
int i = 0;
while (i < 10) {
System.out.print((char) fin.read());
i++;
}
System.out.println();
UPDATE2: This method is better:
private String getPSXId(File f) {
FileInputStream fin;
try {
fin = new FileInputStream(f);
fin.skip(32768);
byte[] buffer = new byte[4096];
long start = System.currentTimeMillis();
while (fin.read(buffer) != -1) {
String buffered = new String(buffer);
if (buffered.contains("BOOT = cdrom:\\")) {
String tmp = "";
int lidx = buffered.lastIndexOf("BOOT = cdrom:\\") + 14;
for (int i = 0; i < 11; i++) {
tmp += buffered.charAt(lidx + i);
}
long elapsed = System.currentTimeMillis() - start;
// System.out.println("BOOT = cdrom:\\" + tmp);
tmp = tmp.toUpperCase().replace(".", "").replace("_", "-");
fin.close();
return tmp;
}
}
fin.close();
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return null;
}

Just start reading after 32768 bytes (unused by ISO9660) in 2048-byte chunks (each one a Volume Descriptor). The first byte determines the type of the descriptor, and 1 means Primary Volume Descriptor, which contains the title after the first 7 bytes (which are always \x01CD001\x01). The next byte is a NUL (\x00), and it is followed by 32 bytes of system identifier and 32 bytes of volume identifier, the latter usually known and displayed as the title. See http://alumnus.caltech.edu/~pje/iso9660.html for a more detailed description.
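The layout described above can be sketched roughly as follows. This assumes a plain .iso with 2048-byte sectors; raw .bin images usually store larger sectors with extra header bytes, so the offsets differ there. The class and method names are made up for illustration. The check for descriptor type 255 (the set terminator) comes from the ISO 9660 layout, not from this answer.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

public class IsoVolumeIdSketch {
    private static final int SECTOR_SIZE = 2048;    // assumption: plain .iso, not a raw .bin
    private static final int FIRST_DESCRIPTOR = 16; // 16 * 2048 = 32768 unused bytes come first

    /** Returns the volume identifier from the Primary Volume Descriptor, or null if none is found. */
    public static String readVolumeId(Path iso) throws IOException {
        byte[] sector = new byte[SECTOR_SIZE];
        try (RandomAccessFile file = new RandomAccessFile(iso.toFile(), "r")) {
            long sectors = file.length() / SECTOR_SIZE;
            for (long n = FIRST_DESCRIPTOR; n < sectors; n++) {
                file.seek(n * SECTOR_SIZE);
                file.readFully(sector);
                // Bytes 1..5 of every volume descriptor hold the magic "CD001".
                String magic = new String(sector, 1, 5, StandardCharsets.US_ASCII);
                if (!magic.equals("CD001")) {
                    return null; // not a volume descriptor
                }
                int type = sector[0] & 0xFF;
                if (type == 255) {
                    return null; // descriptor set terminator, no primary descriptor seen
                }
                if (type == 1) {
                    // Byte 7 is unused, bytes 8..39 are the system id, bytes 40..71 the volume id.
                    return new String(sector, 40, 32, StandardCharsets.US_ASCII).trim();
                }
            }
        }
        return null;
    }
}
```

The identifier is padded with spaces on disk, hence the trim() before returning it.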

Downloading with BufferInputStream not working properly

The following code doesn't work to download a file (btw clen is file's length):
int pos = 0, total_pos = 0;
byte[] buffer = new byte[BUFFER_SIZE];
while (pos != -1) {
pos = in.read(buffer, 0, BUFFER_SIZE);
total_pos += pos;
out.write(buffer);
setProgress((int) (total_pos * 100 / clen));
}
...but this works fine:
int buf;
while ((buf = in.read()) != -1)
out.write(buf);
I'm wondering why, even though the second code segment works quickly. On that note, is there any particular reason to use a byte[] buffer (since it doesn't seem to be faster, and BufferedInputStream already uses a buffer of its own....?)

Here's how it should be done.
public static void copyStream(InputStream is, OutputStream os)
{
byte[] buff = new byte[4096];
int count;
try {
while((count = is.read(buff)) > 0)
os.write(buff, 0, count);
}catch (Exception e) {
e.printStackTrace();
}finally {
try {
if(is != null)
is.close();
} catch (IOException e) {
e.printStackTrace();
}
try {
if(os != null)
os.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}

I've tried to make the minimum changes necessary to your code to get it working. st0le did a good job of providing a neater version of stream copying.
public class Test {
private static final String FORMAT = "UTF-8";
private static final int BUFFER_SIZE = 10; // for demonstration purposes.
public static void main(String[] args) throws Exception {
String string = "This is a test of the public broadcast system";
int clen = string.length();
ByteArrayInputStream in = new ByteArrayInputStream(string.getBytes(FORMAT));
OutputStream out = System.out;
int pos = 0, total_pos = 0;
byte[] buffer = new byte[BUFFER_SIZE];
while (pos != -1) {
pos = in.read(buffer, 0, BUFFER_SIZE);
if (pos > 0) {
total_pos += pos;
out.write(buffer, 0, pos);
setProgress((int) (total_pos * 100 / clen));
}
}
}
private static void setProgress(int i) {
}
}
You were ignoring the value of pos when you were writing out the buffer to the output stream.
You also need to re-check the value of pos because it may have just read the end of the file. You don't increment the total_pos in that case (although you should probably report that you are 100% complete)
Be sure to handle your resources correctly with close()s in the appropriate places.
-edit-
The general reason for using an array as a buffer is so that the output stream can do as much work as it can with a larger set of data.
When writing to a console there might not be much of a delay, but the target might be a network socket or some other slow device. As the JavaDoc states:
"The write method of OutputStream calls the write method of one argument on each of the bytes to be written out. Subclasses are encouraged to override this method and provide a more efficient implementation."
The benefit of using an array when you already use a Buffered Input/Output Stream is probably minimal.
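One more thing to watch in the progress calculation: if total_pos and clen are both int, total_pos * 100 overflows once roughly 21 million bytes have been read (Integer.MAX_VALUE / 100 is 21,474,836). Here is a minimal sketch that keeps the counters as long. The class name and the IntConsumer callback standing in for setProgress are assumptions, not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.function.IntConsumer;

public class ProgressCopySketch {

    /**
     * Copies 'in' to 'out' and reports progress as a percentage of 'clen'.
     * The callback is a stand-in for the setProgress(int) method in the question.
     */
    public static long copy(InputStream in, OutputStream out, long clen, IntConsumer setProgress)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0; // long, so total * 100 cannot overflow for realistic file sizes
        int count;
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count); // write only the bytes that were read
            total += count;
            if (clen > 0) {
                setProgress.accept((int) (total * 100 / clen));
            }
        }
        return total;
    }
}
```

The -1 returned at end of stream never reaches the total, so the last reported value is 100 when clen matches the real length.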

Socket Programming : Inputstream Stuck in loop - read() always return 0

Server side code
public static boolean sendFile() {
int start = Integer.parseInt(startAndEnd[0]) - 1;
int end = Integer.parseInt(startAndEnd[1]) - 1;
int size = (end - start) + 1;
try {
bos = new BufferedOutputStream(initSocket.getOutputStream());
bos.write(byteArr,start,size);
bos.flush();
bos.close();
initSocket.close();
System.out.println("Send file to : " + initSocket);
} catch (IOException e) {
System.out.println(e.getLocalizedMessage());
disconnected();
return false;
}
return true;
}
Client Side
public boolean receiveFile() {
int current = 0;
try {
int bytesRead = bis.read(byteArr,0,byteArr.length);
System.out.println("Receive file from : " + client);
current = bytesRead;
do {
bytesRead =
bis.read(byteArr, current, (byteArr.length-current));
if(bytesRead >= 0) current += bytesRead;
} while(bytesRead != -1);
bis.close();
bos.write(byteArr, 0 , current);
bos.flush();
bos.close();
} catch (IOException e) {
System.out.println(e.getLocalizedMessage());
disconnected();
return false;
}
return true;
}
The client side is multithreaded; the server side is not. I have pasted only the code that causes the problem; if you want to see all of it, please tell me.
After debugging the code, I found that whatever I set the maximum thread count to, the first thread always gets stuck in this loop: bis.read(....) always returns 0. Even though the server has closed the stream, the client never leaves the loop. I don't know why, but the other threads work correctly.
do {
bytesRead =
bis.read(byteArr, current, (byteArr.length-current));
if(bytesRead >= 0) current += bytesRead;
} while(bytesRead != -1);

How large is your input file (the one you send), and how large is "byteArr"?
Also, by the time you check how many bytes were read, you have already called bis.read(..) twice:
int bytesRead = bis.read(byteArr,0,byteArr.length);
You probably want to read/send files larger than your buffer, so you probably want to do something like this:
byte [] buffer = new byte[4096];
int bytesRead;
int totalLength = 0;
while(-1 != (bytesRead = is.read(buffer))) {
bos.write(buffer, 0, bytesRead);
totalLength += bytesRead;
}
bos.close();
is.close();
"is" would be a plain InputStream. Peter is right: you do not need to buffer it.

read() will return 0 when you give it a buffer with no room left. (Which appears to be the case here)
I would suggest you use a DataInputStream.readFully() which does this for you.
dis.readFully(byteArr); // keeps reading until the byte[] is full.
If you are only writing large byte[] or only writing one piece of data, using a Buffered Stream just adds overhead. You don't need it.
BTW: When you call close() it will call flush() for you.
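Here is a small sketch of both points: read() returning 0 when the array has no room left, and a receive loop that does not depend on a fixed-size array. The class name ReceiveSketch, the readAll helper and the in-memory stream are assumptions for the demo; with a real socket, 'in' would come from socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReceiveSketch {

    /** Reads until the sender closes the stream, growing the result as needed. */
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int bytesRead;
        while ((bytesRead = in.read(buffer)) != -1) {
            collected.write(buffer, 0, bytesRead);
        }
        return collected.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[10]);
        byte[] byteArr = new byte[4];
        int current = in.read(byteArr, 0, byteArr.length);
        // The array is now full, so the requested length is 0 and read() returns 0, not -1.
        int next = in.read(byteArr, current, byteArr.length - current);
        System.out.println("read with no room left returned " + next);
        System.out.println("readAll collected " + readAll(in).length + " remaining bytes");
    }
}
```

If the incoming data is larger than byteArr, the loop in the question never sees -1 because it keeps asking for zero bytes; collecting into a growable buffer, or sizing the array from a length sent up front and using readFully(), avoids that.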
