How to use a chunk delimiter in a raw data file? - java

I want to save raw data chunks to a file and later read those chunks back one by one. This is no big deal except for the following doubt:
Which exact bytes should I use as a delimiter, i.e. to identify the end of one chunk and the beginning of the next, given that the chunk data might also contain that byte sequence by chance?
Notes: the chunks are of variable size and contain arbitrary data. They are actually JPEG images.

You could first write the length of the chunk to the file as a fixed-size value, e.g. a 4-byte integer, followed by the data itself:
public void appendChunk(byte[] data, File file) throws IOException {
DataOutputStream stream = null;
try {
stream = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file, true)));
stream.writeInt(data.length);
stream.write(data);
} finally {
if (stream != null) {
try {
stream.close();
} catch (IOException e) {
// ignore
}
}
}
}
If you later have to read the chunks back from that file, you start by reading the length of the first chunk. You now can decide whether to read the chunk data, or whether to skip it and continue with the next chunk.
public void processChunks(File file) throws IOException {
DataInputStream stream = null;
try {
stream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
while (true) {
try {
int length = stream.readInt();
byte[] data = new byte[length];
stream.readFully(data);
// todo: do something with the data
} catch (EOFException e) {
// end of file reached
break;
}
}
} finally {
if (stream != null) {
try {
stream.close();
} catch (IOException e) {
// ignore
}
}
}
}
You can also add other meta-data about the chunks, like writing the original name of the file with stream.writeUTF(...). You only have to make sure that you write and read the same data in the same order.
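As a rough illustration of the metadata idea, here is a minimal sketch that stores a name in front of each length-prefixed chunk. The class name NamedChunks and the record layout (name, length, payload) are assumptions made for this example, not something the format requires.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical record layout: name (writeUTF), length (writeInt), then the raw bytes.
final class NamedChunks {

    static void writeChunk(DataOutputStream out, String name, byte[] data) throws IOException {
        out.writeUTF(name);        // metadata first
        out.writeInt(data.length); // then the length prefix
        out.write(data);           // then the payload
    }

    // Reads every record and returns the names in file order; payloads are read and discarded.
    static List<String> readNames(DataInputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        while (true) {
            String name;
            try {
                name = in.readUTF(); // read in the same order as written
            } catch (EOFException e) {
                return names;        // clean end of stream between two records
            }
            int length = in.readInt();
            byte[] data = new byte[length];
            in.readFully(data);      // an EOFException here would mean a truncated record
            names.add(name);
        }
    }
}
```

Catching EOFException only around the first read of a record is a deliberate choice in this sketch: an end of file in the middle of a record then propagates as an error instead of being mistaken for a normal end.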

Create a second file in which you save the byte ranges of your chunks in the chunk file, or add that information to the header of your chunk file. I did something similar once; don't forget that the byte ranges then have the additional offset of the length of the header.
long startByte = 0;
long lastByte = 0; // inclusive end of the current chunk
int chunkCount = 0;
File chunkFile;
File structureFile;
for (every chunk) {
append chunk to chunkFile
lastByte = startByte + chunk.sizeInBytes() - 1
append to structureFile: chunkCount startByte lastByte
chunkCount++;
startByte = lastByte + 1
}
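The index idea can be sketched in runnable form as follows. To keep the example self-contained, an in-memory ByteArrayOutputStream stands in for the chunk file and a list stands in for the structure file; the names ChunkIndex and Range are made up for this illustration, and the ranges use an exclusive end instead of an inclusive last byte.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ChunkIndex {

    // One index entry: the byte range [start, endExclusive) inside the chunk file.
    static final class Range {
        final long start;
        final long endExclusive;

        Range(long start, long endExclusive) {
            this.start = start;
            this.endExclusive = endExclusive;
        }
    }

    private final ByteArrayOutputStream chunkFile = new ByteArrayOutputStream(); // stand-in for the chunk file
    private final List<Range> index = new ArrayList<>();                       // stand-in for the structure file

    void append(byte[] chunk) {
        long start = chunkFile.size();            // current end of the chunk file
        chunkFile.write(chunk, 0, chunk.length);
        index.add(new Range(start, start + chunk.length));
    }

    // Returns the bytes of the chunk with the given number, located via the index.
    byte[] read(int chunkNumber) {
        Range r = index.get(chunkNumber);
        return Arrays.copyOfRange(chunkFile.toByteArray(), (int) r.start, (int) r.endExclusive);
    }
}
```

With real files, the same lookup would typically seek to the start offset (for example with RandomAccessFile.seek) instead of copying from an in-memory array.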

Related

Unable to convert bytes array into audio AAC file

I am struggling with finding a solution to write my bytes array to a playable AAC audio file.
From my Flutter.io front-end, I am encoding my .aac audio files as a list of UInt8List and sending it to my Spring-Boot server. Then I am able to convert them to a proper bytes array where I then attempt to write it back to a .aac file as seen below:
public void writeToAudioFile(ArrayList<Double> audioData) {
byte[] byteArray = new byte[1024];
Iterator<Double> iterator = audioData.iterator();
System.out.println(byteArray);
while (iterator.hasNext()) {
// for some reason my list came in as a list of doubles
// so I am making sure to get these values back to an int
Integer i = iterator.next().intValue();
byteArray[i] = i.byteValue();
}
try {
File someFile = new File("test.aac");
FileOutputStream fos = new FileOutputStream(someFile);
fos.write(byteArray);
fos.flush();
fos.close();
System.out.println("File created");
} catch (Exception e) {
// TODO: handle exception
System.out.println("Error: " + e);
}
}
I am able to write my bytes array back to an audio file; however, it is unplayable. So I am wondering whether this approach is possible and whether my issue lies in Java.
I have been doing extensive research, and I think I may need to declare that this file is a specific type of media file. Or maybe the encoded audio file is already corrupt when it reaches my server?
Your conversion loop
while (iterator.hasNext()) {
// for some reason my list came in as a list of doubles
// so I am making sure to get these values back to an int
Integer i = iterator.next().intValue();
byteArray[i] = i.byteValue();
}
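takes the value i from the iterator and then writes it at position i in byteArray. The value is used as the index, so the bytes end up ordered by value rather than by their position in the stream, repeated values overwrite each other, and the output is always 1024 bytes regardless of the input length.
A working function that converts List<Double> to byte[] would look something like this: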
byte[] inputToBytes(List<Double> audioData) {
byte[] result = new byte[audioData.size()];
for (int i = 0; i < audioData.size(); i++) {
result[i] = audioData.get(i).byteValue();
}
return result;
}
Then you could use it in writeToAudioFile():
void writeToAudioFile(ArrayList<Double> audioData) {
try (FileOutputStream fos = new FileOutputStream("test.aac")) {
fos.write(inputToBytes(audioData));
System.out.println("File created");
} catch (Exception e) {
// TODO: handle exception
System.out.println("Error: " + e);
}
}
This should produce a playable file, provided audioData holds valid AAC bytes. The contents and the extension should be enough for the OS or player to recognize the format.
If this doesn’t work, I would look into the data received to see if it is correct.
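One way to inspect the received data is to print its first few bytes as hex and compare them with the start of the original file. The helper below is a hypothetical sketch (the name HexPeek is made up). As a hint, assuming the file is ADTS-framed AAC, it would normally start with the 12-bit sync pattern 0xFFF, i.e. the first byte is ff and the second byte starts with f.

```java
final class HexPeek {

    // Formats the first `limit` bytes as lowercase hex pairs separated by spaces.
    static String firstBytesHex(byte[] data, int limit) {
        StringBuilder sb = new StringBuilder();
        int n = Math.min(limit, data.length);
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", data[i] & 0xFF)); // mask so the byte is treated as unsigned
        }
        return sb.toString();
    }
}
```

If the hex output of the received bytes differs from that of the original file, the corruption happens before the bytes are written, for example during encoding or transport.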

Java-Shifting Characters from Binary Files

I looked at some previous threads about binary files, and I am using a DataInputStream as they suggest, but I am not sure why mine isn't working, since I think I am doing the same thing those threads describe. My goal is to write a method that takes the name of a .bin file and a shift integer, and creates a new .bin file with the characters shifted. Only uppercase and lowercase letters should be shifted. I don't know the length of the binary file being read, and the method needs to go through all of its characters. The file will only have one line, though. I have a method that gives me the number of characters on that line and a method that creates a file. I know the program does create the file correctly. What happens is that it creates the file and then gives me an EOFException at the line: char currentChar=data.readChar();
Here is my code:
private static void cipherShifter(String file, int shift) {
String newFile=file+"_cipher";
createFile(newFile);
int numChar;
try {
FileInputStream stream=new FileInputStream(file);
DataInputStream data=new DataInputStream(stream);
FileOutputStream streamOut=new FileOutputStream(newFile);
DataOutputStream dataOut=new DataOutputStream(streamOut);
numChar=readAllInts(data);
for (int i=0;i<numChar;++i) {
char currentChar=data.readChar();
if (((currentChar>='A')&&(currentChar<='Z'))||((currentChar>='a')&&(currentChar<='z'))) {
currentChar=currentChar+=shift;
dataOut.writeChar(currentChar);
}
else {
dataOut.writeChar(currentChar);
}
}
data.close();
dataOut.flush();
dataOut.close();
} catch(IOException error) {
error.printStackTrace();
}
}
private static void createFile(String fileName) {
File file=new File(fileName);
if (file.exists()) {
//Do nothing
}
else {
try {
file.createNewFile();
} catch (IOException e) {
//Do nothing
}
}
}
private static int readAllInts(DataInputStream din) throws IOException {
int count = 0;
while (true) {
try {
din.readInt(); ++count;
} catch (EOFException e) {
return count;
}
}
}
I do not think this error should be happening, because I have the correct data type and I am telling it to read just one character. Any help would be great. Thanks in advance.
Based on the description above, your error is reported at the data.readChar() method invocation and not inside the readAllInts method. I simulated the code near your error and got the same Exception on a text file at the same location.
I used the readByte method to read one byte at a time since you are mainly interested in ASCII bytes. I also changed readAllInts to be readAllBytes so I work with total byte count.
private static void cipherShifter(String file, int shift) {
String newFile=file+"_cipher";
createFile(newFile);
int numBytes;
try {
FileInputStream stream=new FileInputStream(file);
DataInputStream data=new DataInputStream(stream);
FileOutputStream streamOut=new FileOutputStream(newFile);
DataOutputStream dataOut=new DataOutputStream(streamOut);
numBytes=readAllBytes(data);
stream.close();
data.close();
stream=new FileInputStream(file);
data=new DataInputStream(stream);
for (int i=0;i<numBytes;++i) {
byte currentByte=data.readByte();
if (((currentByte>=65)&&(currentByte<=90))||((currentByte>=97)&&(currentByte<=122))) {
currentByte=currentByte+=shift; //need to ensure no overflow beyond a byte
dataOut.writeByte(currentByte);
}
else {
dataOut.writeByte(currentByte);
}
}
data.close();
dataOut.flush();
dataOut.close();
} catch(IOException error) {
error.printStackTrace();
}
}
private static void createFile(String fileName) {
File file=new File(fileName);
if (file.exists()) {
//Do nothing
}
else {
try {
file.createNewFile();
} catch (IOException e) {
//Do nothing
}
}
}
private static int readAllBytes(DataInputStream din) throws IOException {
int count = 0;
while (true) {
try {
din.readByte(); ++count;
} catch (EOFException e) {
return count;
}
}
}
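The overflow mentioned in the comment next to the shift could be handled by wrapping around within the alphabet, as in a classic Caesar cipher. Whether wrap-around is the behaviour the assignment wants is an assumption; the helper name shiftLetter is made up for this sketch.

```java
final class LetterShift {

    // Shifts ASCII letters by `shift` positions, wrapping within A-Z or a-z.
    // Any other byte is returned unchanged.
    static byte shiftLetter(byte b, int shift) {
        if (b >= 'A' && b <= 'Z') {
            // floorMod keeps the result in 0..25 even for negative shifts
            return (byte) ('A' + Math.floorMod(b - 'A' + shift, 26));
        }
        if (b >= 'a' && b <= 'z') {
            return (byte) ('a' + Math.floorMod(b - 'a' + shift, 26));
        }
        return b;
    }
}
```

Inside the loop, dataOut.writeByte(shiftLetter(currentByte, shift)) would then replace both branches of the if/else.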
It looks like you're getting the EOFException because you pass the DataInputStream object to your readAllInts method, read through the whole stream, and then try to read from it again inside your for loop. The problem is that the position that keeps track of where you are in the stream is already at the end of the stream when readAllInts returns: readInt() consumes any trailing bytes before it throws its EOFException, so the first readChar() call finds nothing left to read and throws EOFException immediately.
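To solve that problem, you could call data.mark(readlimit) before passing the stream to the readAllInts method and then call data.reset() after that method returns; that would move the position back to the marked point at the beginning of the stream. (This assumes data.markSupported() is true. A DataInputStream wrapped directly around a FileInputStream does not support mark, so you would need a BufferedInputStream in between.)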
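A minimal sketch of the mark/reset approach follows. The method name countAndRewind and the readLimit parameter are assumptions for this example; readLimit has to be at least as large as the number of bytes read before reset(), so for large files it may be simpler to reopen the file instead.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

final class MarkResetDemo {

    // Counts the bytes in the stream, then rewinds so the caller can read it again.
    // readLimit must be at least the number of bytes that will be read before reset().
    static int countAndRewind(DataInputStream data, int readLimit) throws IOException {
        if (!data.markSupported()) {
            throw new IOException("mark/reset not supported; wrap the source in a BufferedInputStream");
        }
        data.mark(readLimit);   // remember the current position
        int count = 0;
        try {
            while (true) {
                data.readByte();
                count++;
            }
        } catch (EOFException e) {
            // end of stream reached, count is complete
        }
        data.reset();           // jump back to the marked position
        return count;
    }
}
```

For a file, the stream would be built as new DataInputStream(new BufferedInputStream(new FileInputStream(file))) so that mark is supported.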
You also have the problem we talked about above: your counter reads four bytes at a time, while your character reader reads two at a time. Your suggested method of doubling the return value of readAllInts would help (you could also count with readChar() instead of readInt()).
You still need to think about how you're going to handle the case of binary files that are odd-numbered bytes long. There are a variety of ways you could handle that one. I'm too beat to write up a code sample tonight, but if you're still stuck tomorrow, add a comment and I'll see what I can do to help.

Reading a block of bytes from one file and writing to other until all blocks are read?

I am working on a project in which I have to do some file reading and writing. I have to read 8 bytes from a file at a time, perform some operations on that block, and then write the block to a second file, repeating the cycle until the first file has been completely read in chunks of 8 bytes. After manipulation, the data should be appended to the second file. However, I am facing some problems in doing so. The following is what I am trying:
private File readFromFile1(File file1) {
int offset = 0;
long message= 0;
try {
FileInputStream fis = new FileInputStream(file1);
byte[] data = new byte[8];
file2 = new File("file2.txt");
FileOutputStream fos = new FileOutputStream(file2.getAbsolutePath(), true);
DataOutputStream dos = new DataOutputStream(fos);
while(fis.read(data, offset, 8) != -1)
{
message = someOperation(data); // operation according to business logic
dos.writeLong(message);
}
fos.close();
dos.close();
fis.close();
} catch (IOException e) {
System.out.println("Some error occurred while reading from File:" + e);
}
return file2;
}
I am not getting the desired output this way. Any help is appreciated.
Consider the following code:
private File readFromFile1(File file1) {
int offset = 0;
long message = 0;
File file2 = null;
try {
FileInputStream fis = new FileInputStream(file1);
byte[] data = new byte[8]; //Read buffer
byte[] tmpbuf = new byte[8]; //Temporary chunk buffer
file2 = new File("file2.txt");
FileOutputStream fos = new FileOutputStream(file2.getAbsolutePath(), true);
DataOutputStream dos = new DataOutputStream(fos);
int readcnt; //Read count
int chunk; //Chunk size to write to tmpbuf
while ((readcnt = fis.read(data, 0, 8)) != -1) {
//// POINT A ////
//Skip chunking system if an 8 byte octet is read directly.
if(readcnt == 8 && offset == 0){
message = someOperation(data); // the full 8 bytes are in the read buffer here; tmpbuf is untouched
dos.writeLong(message);
continue;
}
//// POINT B ////
chunk = Math.min(tmpbuf.length - offset, readcnt); //Determine how much to add to the temp buf.
System.arraycopy(data, 0, tmpbuf, offset, chunk); //Copy bytes to temp buf
offset = offset + chunk; //Sets the offset to temp buf
if (offset == 8) {
message = someOperation(tmpbuf); // operation according to business logic
dos.writeLong(message);
if (chunk < readcnt) {
System.arraycopy(data, chunk, tmpbuf, 0, readcnt - chunk);
offset = readcnt - chunk;
} else {
offset = 0;
}
}
}
//// POINT C ////
//Process remaining bytes here...
//message = foo(tmpbuf);
//dos.writeLong(message);
fos.close();
dos.close();
fis.close();
} catch (IOException e) {
System.out.println("Some error occurred while reading from File:" + e);
}
return file2;
}
In this excerpt of code, what I did was:
Modified your reading code to capture the number of bytes actually read by the read() method (named readcnt).
Added a byte chunking system (the processing does not happen until there are 8 bytes in the chunking buffer).
Allowed for separate processing of the final bytes (those that do not make up a full 8-byte block).
As you can see from the code, the data being read is first stored in a chunking buffer (named tmpbuf) until 8 bytes are available. This happens only when a read returns fewer than 8 bytes; if 8 bytes are available directly and nothing is pending in the chunking buffer, they are processed directly (see "Point A" in the code). This is an optimization to avoid unnecessary array copies.
The chunking system uses an offset that is incremented every time bytes are written to tmpbuf until it reaches 8 (it cannot go over, because the Math.min() call used in the assignment of chunk limits the value). When offset == 8, the processing code is executed.
If that particular read produced more bytes than were processed, the remaining bytes are written to the beginning of tmpbuf and offset is set accordingly; otherwise offset is set to 0.
The cycle then repeats.
The code leaves the last few bytes that do not fill an 8-byte block in tmpbuf, with the offset variable indicating how many have been written. This data can then be processed separately at Point C.
This seems a lot more complicated than it should be, and there is probably a better solution (possibly using existing Java library methods), but off the top of my head, this is what I came up with. I hope it is clear enough to understand.
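A possibly simpler variant of the same idea is to loop on read() until a full 8-byte block has been collected, and only then process it. This is a sketch under assumptions: the names BlockReader, readBlock and copyBlocks are made up, and someOperation is a placeholder that just interprets the block as a big-endian long.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

final class BlockReader {

    // Fills `block` as far as possible; returns the number of bytes actually stored.
    static int readBlock(InputStream in, byte[] block) throws IOException {
        int filled = 0;
        while (filled < block.length) {
            int n = in.read(block, filled, block.length - filled);
            if (n == -1) {
                break; // end of stream before the block was full
            }
            filled += n;
        }
        return filled;
    }

    // Processes every complete 8-byte block; returns the number of leftover bytes (0..7).
    static int copyBlocks(InputStream in, DataOutputStream out) throws IOException {
        byte[] block = new byte[8];
        int filled;
        while ((filled = readBlock(in, block)) == block.length) {
            out.writeLong(someOperation(block));
        }
        return filled;
    }

    // Placeholder for the business logic: interprets the block as a big-endian long.
    static long someOperation(byte[] block) {
        return ByteBuffer.wrap(block).getLong();
    }
}
```

Wrapping the FileInputStream in a BufferedInputStream before passing it in keeps the many small reads cheap.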
You could use the following; it uses NIO, and in particular the ByteBuffer class for the long handling. You can of course implement it with standard java.io, but since I am an NIO fan, here is a possible solution.
The major problem in your code is that while(fis.read(data, offset, 8) != -1) reads up to 8 bytes, not always exactly 8 bytes. In addition, reading in such small portions is not very efficient.
I have put some comments in my code; if something is unclear, please leave a comment. My someOperation(...) function just copies the next long value from the buffer.
Update:
added finally block to close the files.
import java.io.File;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;
public class TestFile {
static final int IN_BUFFER_SIZE = 1024 * 8;
static final int OUT_BUFFER_SIZE = 1024 *9; // make the out-buffer > in-buffer, i am lazy and don't want to check for overruns
static final int MIN_READ_BYTES = 8;
static final int MIN_WRITE_BYTES = 8;
private File readFromFile1(File inFile) {
final File outFile = new File("file2.txt");
final ByteBuffer inBuffer = ByteBuffer.allocate(IN_BUFFER_SIZE);
final ByteBuffer outBuffer = ByteBuffer.allocate(OUT_BUFFER_SIZE);
FileChannel readChannel = null;
FileChannel writeChannel = null;
try {
// open a file channel for reading and writing
readChannel = FileChannel.open(inFile.toPath(), StandardOpenOption.READ);
writeChannel = FileChannel.open(outFile.toPath(), StandardOpenOption.CREATE, StandardOpenOption.WRITE);
long totalReadByteCount = 0L;
long totalWriteByteCount = 0L;
boolean readMore = true;
while (readMore) {
// fill the in-buffer; stop when it is full or the end of the file is reached
int readOp = 0;
while (inBuffer.hasRemaining() && (readOp = readChannel.read(inBuffer)) != -1) {
totalReadByteCount += readOp;
} // while
// prepare the in-buffer to be consumed
inBuffer.flip();
// check whether the end of the file was reached
if (readOp == -1) {
// end of file reached, read no more
readMore = false;
} // if
// now consume the in-buffer while there are at least MIN_READ_BYTES in the buffer
while (inBuffer.remaining() >= MIN_READ_BYTES) {
// add data to the write buffer
outBuffer.putLong(someOperation(inBuffer));
} // while
// compact the in-buffer and prepare for the next read, if we need to read more.
// that way the possible remaining bytes of the in-buffer can be consumed after leaving the loop
if (readMore) inBuffer.compact();
// prepare the out-buffer to be consumed
outBuffer.flip();
// write the out-buffer until the buffer is empty
while (outBuffer.hasRemaining())
totalWriteByteCount += writeChannel.write(outBuffer);
// reset the out-buffer so it can be filled again
outBuffer.clear();
} // while
// error handling
if (inBuffer.hasRemaining()) {
System.err.println("Truncated data! Not a long value! bytes remaining: " + inBuffer.remaining());
} // if
System.out.println("read total: " + totalReadByteCount + " bytes.");
System.out.println("write total: " + totalWriteByteCount + " bytes.");
} catch (IOException e) {
System.out.println("Some error occurred while reading from File: " + e);
} finally {
if (readChannel != null) {
try {
readChannel.close();
} catch (IOException e) {
System.out.println("Could not close read channel: " + e);
} // catch
} // if
if (writeChannel != null) {
try {
writeChannel.close();
} catch (IOException e) {
System.out.println("Could not close write channel: " + e);
} // catch
} // if
} // finally
return outFile;
}
private long someOperation(ByteBuffer bb) {
// consume the buffer, do whatever you want with the buffer.
return bb.getLong(); // consumes 8 bytes of the buffer.
}
public static void main(String[] args) {
TestFile testFile = new TestFile();
File source = new File("input.txt");
testFile.readFromFile1(source);
}
}

Faster way of copying data in Java?

I have been given the task of copying data from a server. I am using a BufferedInputStream and an output stream to copy the data, and I am doing it byte by byte. It runs, but it takes ages to copy the data, as some of the files are hundreds of MB in size, so this is definitely not going to work. Can anyone suggest an alternative to byte-by-byte copying so that my code can copy files of a few hundred MB?
BUFFER is 2048.
Here is what my code looks like:
static void copyFiles(SmbFile[] files, String parent) throws IOException {
SmbFileInputStream input = null;
FileOutputStream output = null;
BufferedInputStream buf_input = null;
try {
for (SmbFile f : files) {
System.out.println("Working on files :" + f.getName());
if (f.isDirectory()) {
File folderToBeCreated = new File(parent+f.getName());
if (!folderToBeCreated.exists()) {
folderToBeCreated.mkdir();
System.out.println("Folder name " + parent
+ f.getName() + "has been created");
} else {
System.out.println("exists");
}
copyFiles(f.listFiles(), parent + f.getName());
} else {
input = (SmbFileInputStream) f.getInputStream();
buf_input = new BufferedInputStream(input, BUFFER);
File t = new File(parent + f.getName());
if (!t.exists()) {
t.createNewFile();
}
output = new FileOutputStream(t);
int c;
int count;
byte data[] = new byte[BUFFER];
while ((count = buf_input.read(data, 0, BUFFER)) != -1) {
output.write(data, 0, count);
}
}
}
} catch (IOException e) {
e.printStackTrace();
} finally {
if (input != null) {
input.close();
}
if (output != null) {
output.close();
}
}
}
Here is a link to an excellent post explaining how to use nio channels to make copies of streams. It introduces a helper method ChannelTools.fastChannelCopy that lets you copy streams like this:
final InputStream input = new FileInputStream(inputFile);
final OutputStream output = new FileOutputStream(outputFile);
final ReadableByteChannel inputChannel = Channels.newChannel(input);
final WritableByteChannel outputChannel = Channels.newChannel(output);
ChannelTools.fastChannelCopy(inputChannel, outputChannel);
inputChannel.close();
outputChannel.close();
Well since you're using a BufferedInputStream, you aren't reading byte by byte, but rather the size of the buffer. You could just try increasing the buffer size.
Reading/writing byte-by-byte is definitely going to be slow, even though the actual reading/writing is done by chunks of the buffer size. One way to speed it up is to read/write by blocks. Have a look at read(byte[] b, int off, int len) method of BufferedInputStream. However it probably won't give you enough of the improvement.
What would be much better is to use nio package (new IO) to copy data using nio channels. Have a look at nio documentation for more info.
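A channel-based copy loop could look roughly like the sketch below. It is not necessarily the code from the post mentioned earlier; the class name ChannelCopy and the 16 KB direct buffer are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

final class ChannelCopy {

    // Copies everything from src to dest and returns the number of bytes written.
    static long copy(ReadableByteChannel src, WritableByteChannel dest) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024); // buffer size is an arbitrary choice
        long total = 0;
        while (src.read(buffer) != -1) {
            buffer.flip();                 // switch the buffer from filling to draining
            total += dest.write(buffer);   // may write only part of the buffer
            buffer.compact();              // keep unwritten bytes and make room for more
        }
        buffer.flip();
        while (buffer.hasRemaining()) {    // drain whatever is left after end of stream
            total += dest.write(buffer);
        }
        return total;
    }
}
```

The channels can be obtained with Channels.newChannel(inputStream) and Channels.newChannel(outputStream). For the plain stream-to-file case on Java 7 or later, java.nio.file.Files.copy(InputStream, Path, CopyOption...) may already be enough.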
I would suggest using FileUtils from org.apache.commons.io (Apache Commons IO). It has plenty of utility methods for file operations.

Why is my image coming out garbled?

I've got some Java code using a servlet and Apache Commons FileUpload to upload a file to a set directory. It's working fine for character data (e.g. text files) but image files are coming out garbled. I can open them but the image doesn't look like it should. Here's my code:
Servlet
protected void doPost(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
try {
String customerPath = "\\leetest\\";
// Check that we have a file upload request
boolean isMultipart = ServletFileUpload.isMultipartContent(request);
if (isMultipart) {
// Create a new file upload handler
ServletFileUpload upload = new ServletFileUpload();
// Parse the request
FileItemIterator iter = upload.getItemIterator(request);
while (iter.hasNext()) {
FileItemStream item = iter.next();
String name = item.getFieldName();
if (item.isFormField()) {
// Form field. Ignore for now
} else {
BufferedInputStream stream = new BufferedInputStream(item
.openStream());
if (stream == null) {
LOGGER
.error("Something went wrong with fetching the stream for field "
+ name);
}
byte[] bytes = StreamUtils.getBytes(stream);
FileManager.createFile(customerPath, item.getName(), bytes);
stream.close();
}
}
}
} catch (Exception e) {
throw new UploadException("An error occured during upload: "
+ e.getMessage());
}
}
StreamUtils.getBytes(stream) looks like:
public static byte[] getBytes(InputStream src, int buffsize)
throws IOException {
ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
byte[] buff = new byte[buffsize];
while (true) {
int nBytesRead = src.read(buff);
if (nBytesRead < 0) {
break;
}
byteStream.write(buff);
}
byte[] result = byteStream.toByteArray();
byteStream.close();
return result;
}
And finally FileManager.createFile looks like:
public static void createFile(String customerPath, String filename,
byte[] fileData) throws IOException {
customerPath = getFullPath(customerPath + filename);
File newFile = new File(customerPath);
if (!newFile.getParentFile().exists()) {
newFile.getParentFile().mkdirs();
}
FileOutputStream outputStream = new FileOutputStream(newFile);
outputStream.write(fileData);
outputStream.close();
}
Can anyone spot what I'm doing wrong?
Cheers,
Lee
One thing I don't like is here in this block from StreamUtils.getBytes():
1 while (true) {
2 int nBytesRead = src.read(buff);
3 if (nBytesRead < 0) {
4 break;
5 }
6 byteStream.write(buff);
7 }
At line 6, it writes the entire buffer, no matter how many bytes were read. read() is not guaranteed to fill the buffer, so this is not always correct. It would be more correct like this:
1 while (true) {
2 int nBytesRead = src.read(buff);
3 if (nBytesRead < 0) {
4 break;
5 } else {
6 byteStream.write(buff, 0, nBytesRead);
7 }
8 }
Note the 'else' on line 5, along with the two additional parameters (array index start position and length to copy) on line 6.
I could imagine that for larger files, like images, read() returns before the buffer is filled (maybe while it is waiting for more data). That means you'd be unintentionally writing stale data left over in the tail end of the buffer. This almost certainly happens at EOF in any case, assuming a buffer larger than 1 byte, but extra data at EOF is probably not the cause of your corruption; it is just not desirable.
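The effect of partial reads can be reproduced with a small self-contained sketch. The trickle() stream below is a made-up stand-in for a slow network source that hands out at most 3 bytes per call; getBytes() is the corrected loop that writes only the bytes actually read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class PartialReadDemo {

    // Hypothetical stream that returns at most 3 bytes per read, like a slow network source.
    static InputStream trickle(byte[] data) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    // Corrected loop: writes exactly the number of bytes each read() delivered.
    static byte[] getBytes(InputStream src, int buffSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buff = new byte[buffSize];
        int n;
        while ((n = src.read(buff)) >= 0) {
            out.write(buff, 0, n);
        }
        return out.toByteArray();
    }
}
```

With the original byteStream.write(buff) call, each 3-byte read in this setup would append the whole buffer, so the result would be longer than the input and interleaved with stale bytes.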
I'd just use Commons IO. Then you could just do an IOUtils.copy(InputStream, OutputStream);
It's got lots of other useful utility methods.
Are you sure that the image isn't already garbled when it comes in, or that you aren't dropping some packets on the way in?
I don't know what difference it makes, but there seems to be a mismatch of method signatures. The getBytes() method called in your doPost() method has only one argument:
byte[] bytes = StreamUtils.getBytes(stream);
while the method source you included has two arguments:
public static byte[] getBytes(InputStream src, int buffsize)
Hope that helps.
Can you compute a checksum of your original file and of the uploaded file and see whether there are any immediate differences?
If there are, you can then perform a diff to determine the exact part(s) of the file that are missing or changed.
Things that come to mind are the beginning or end of the stream, or endianness.
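A checksum comparison could be sketched as follows, using SHA-256 from the JDK's MessageDigest. The class and method names are made up for this example, and any other digest algorithm would serve the same purpose.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Checksums {

    // Returns the SHA-256 digest of the data as a lowercase hex string.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] digest = md.digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xFF)); // mask so the byte is treated as unsigned
        }
        return sb.toString();
    }
}
```

For files that fit in memory, the bytes can be loaded with java.nio.file.Files.readAllBytes(path) and the two hex strings compared with equals(); comparing the file lengths first is a cheap initial check.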
