Breaking binary files - java

Following up on my previous question, I have written the program with the following approach:
The program first reads 2 KB of data from the file and stores it in a byte array.
Then the data to be added to each packet is also stored in an array, and both are added to an array list.
The array list is then written to an output stream for the file.
Here is the code:
File bin=chooser.getSelectedFile();
int filesize=(int)bin.length();
int pcount=filesize/2048;
byte[] file=new byte[filesize];
byte[] meta=new byte[12];
int arraysize=pcount*12+filesize;
byte[] rootfile=new byte[46];
ArrayList al = new ArrayList();
String root;
prbar.setVisible(true);
int mark=0;
String metas;
try{
FileInputStream fis=new FileInputStream(bin);
FileOutputStream fos=new FileOutputStream(bin.getName().replace(".bin", ".xyz"));
ObjectOutputStream os=new ObjectOutputStream(fos);
root="46kb"+"5678"+"0000"+pcount+"MYBOX"+"13"+"S208";
rootfile=root.getBytes();
for(int n=0;n<=pcount;n++)
{
fis.read(file, 0, 2048);
mark=mark+2048;
int v=(mark/filesize)*100;
prbar.setValue(v);
metas="02KB"+"1234"+n;
meta=metas.getBytes();
al.add(rootfile);
al.add(meta);
al.add(file);
}
os.writeObject(al.toArray());
}
catch(Exception ex){
erlabel.setText(ex.getMessage());
}
The program runs without any errors, but the file is not created correctly.
Either the approach or the code is wrong.
Please help.

You appear to be writing your own binary format, but you are using ObjectOutputStream, which has its own header. writeObject does not write raw data; it writes an object in a form that lets a Java process deserialize it, e.g. with its class hierarchy and field names.
For binary, I suggest you use a plain DataOutputStream with a BufferedOutputStream which will be more efficient and do what you want.
I also suggest you write the data as you generate it rather than using an ArrayList. This will use less memory, make the code simpler and be faster.
I would write the code more like this
File bin = chooser.getSelectedFile();
int filesize = (int) bin.length();
int pcount = (filesize + 2048 - 1) / 2048;
byte[] file = new byte[2048];
FileInputStream fis = new FileInputStream(bin);
String name2 = bin.getName().replace(".bin", ".xyz");
OutputStream os = new BufferedOutputStream(new FileOutputStream(name2));
byte[] rootfile = ("46kb" + "5678" + "0000" + pcount + "MYBOX" + "13" + "S208").getBytes("UTF-8");
for (int n = 0; n < pcount; n++) {
os.write(rootfile);
byte[] metas = ("02KB" + "1234" + n).getBytes("UTF-8");
os.write(metas);
int len = fis.read(file);
os.write(file, 0, len);
int percent = 100 * n / pcount;
prbar.setValue(percent);
}
os.close();
fis.close();

Starting with the smallest thing:
int v=(mark/filesize)*100;
This uses integer division, so I think it always yields 0 (until mark reaches filesize). Multiply first instead:
int v = mark * 100 / filesize;
Also, the byte[] object (file, for instance) is created once and added to the list many times.
You get n references to the same array, all showing the last overwrite.
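Both points can be checked in isolation. The following is only a sketch: the class and method names are invented for the demonstration, and the use of long arithmetic is an addition of mine to avoid int overflow when mark * 100 is computed for larger files.

```java
import java.util.ArrayList;
import java.util.List;

public class ProgressAndAliasing {
    // Integer division happens first, so this is 0 for every mark < filesize.
    static int percentWrong(int mark, int filesize) {
        return (mark / filesize) * 100;
    }

    // Multiplies first; long arithmetic avoids int overflow once mark exceeds about 21 MB.
    static int percentRight(long mark, long filesize) {
        return (int) (mark * 100 / filesize);
    }

    // Adds the same array object twice: both list entries show the last overwrite.
    static List<byte[]> sharedBuffer() {
        List<byte[]> list = new ArrayList<>();
        byte[] buf = new byte[1];
        buf[0] = 1;
        list.add(buf);
        buf[0] = 2;
        list.add(buf);
        return list;
    }

    // Adds a copy each time, so earlier entries keep their own contents.
    static List<byte[]> copiedBuffer() {
        List<byte[]> list = new ArrayList<>();
        byte[] buf = new byte[1];
        buf[0] = 1;
        list.add(buf.clone());
        buf[0] = 2;
        list.add(buf.clone());
        return list;
    }

    public static void main(String[] args) {
        System.out.println(percentWrong(2048, 8192)); // 0
        System.out.println(percentRight(2048, 8192)); // 25
        System.out.println(sharedBuffer().get(0)[0]); // 2
        System.out.println(copiedBuffer().get(0)[0]); // 1
    }
}
```

Writing each chunk straight to the output stream, as suggested in the other answer, sidesteps the shared-array problem entirely.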

Related

Trying to use BufferedInputStream and Base64 to Encode a large file in Java

I am new to the Java I/O so please help.
I am trying to process a large file(e.g. a pdf file of 50mb) using the apache commons library.
At first I try:
byte[] bytes = FileUtils.readFileToByteArray(file);
String encodeBase64String = Base64.encodeBase64String(bytes);
byte[] decoded = Base64.decodeBase64(encodeBase64String);
But knowing that FileUtils.readFileToByteArray in org.apache.commons.io will load the whole file into memory, I tried to use BufferedInputStream to read the file piece by piece:
BufferedInputStream bis = new BufferedInputStream(inputStream);
StringBuilder pdfStringBuilder = new StringBuilder();
int byteArraySize = 10;
byte[] tempByteArray = new byte[byteArraySize];
while (bis.available() > 0) {
if (bis.available() < byteArraySize) { // reaching the end of file
tempByteArray = new byte[bis.available()];
}
int len = Math.min(bis.available(), byteArraySize);
int read = bis.read(tempByteArray, 0, len);
if (read != -1) {
pdfStringBuilder.append(Base64.encodeBase64String(tempByteArray));
} else {
System.err.println("End of file reached.");
}
}
byte[] bytes = Base64.decodeBase64(pdfStringBuilder.toString());
However, the two decoded byte arrays don't look the same. In fact, the second one only gives 10 bytes, which is my temp array size.
Can anyone please help:
What am I doing wrong when reading the file piece by piece?
Why does the decoded byte array contain only 10 bytes in the second solution?
Thanks in advance :)
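After some digging, it turns out that the temp array's size has to be a multiple of 3 in order to avoid padding. Base64 encodes every 3 input bytes as 4 characters, so each 10-byte chunk ends in "=" padding, and the decoder apparently stops at the first padding it meets in the concatenated string. With a temp array size that is a multiple of 3, the program works.
I simply changed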
int byteArraySize = 10;
to be
int byteArraySize = 1024 * 3;
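For reference, here is a minimal sketch of the chunked approach. It swaps in java.util.Base64 from the standard library (Java 8+) in place of the commons-codec class used above, and the class and method names are made up for the example. It also fills each chunk completely before encoding, because a single read call may return fewer bytes than requested, which would otherwise produce padding in the middle of the output again.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

public class ChunkedBase64 {
    // A multiple of 3, so only the final chunk can end in '=' padding.
    static final int CHUNK = 3 * 1024;

    static String encode(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        byte[] buf = new byte[CHUNK];
        int filled;
        while ((filled = fill(in, buf)) > 0) {
            // Encode only the bytes actually read, not the whole buffer.
            byte[] chunk = Arrays.copyOf(buf, filled);
            sb.append(Base64.getEncoder().encodeToString(chunk));
        }
        return sb.toString();
    }

    // Reads until buf is full or the stream ends; returns the number of bytes stored.
    static int fill(InputStream in, byte[] buf) throws IOException {
        int off = 0;
        while (off < buf.length) {
            int n = in.read(buf, off, buf.length - off);
            if (n < 0) {
                break;
            }
            off += n;
        }
        return off;
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[10_000];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        String encoded = encode(new ByteArrayInputStream(original));
        byte[] decoded = Base64.getDecoder().decode(encoded);
        System.out.println(Arrays.equals(original, decoded)); // true
    }
}
```

Note that the encoded string still lives entirely in memory here; for a very large file it may be better to write each encoded chunk to an output stream instead of a StringBuilder.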

Add ByteArray to integer

In the following java code-snippet you'll see this line packetLengthMax += bytes.toByteArray()[43];
My question is: How does this work?
byte[] dataBuffer = new byte[265];
int packetLength = 0;
int packetLengthMax = 44;
ByteArrayOutputStream bytes = new ByteArrayOutputStream();
DataOutputStream outMessage = new DataOutputStream(bytes);
/* Client = Socket*/
DataInputStream clientIn = new DataInputStream(Client.getInputStream());
while (packetLength < packetLengthMax) {
packetLength += clientIn.read(dataBuffer);
outMessage.write(dataBuffer);
if (packetLength >= 43) {
packetLengthMax += bytes.toByteArray()[43];
}
}
My explanation:
First a socket (Client) is passed to the code. Then it does the setup of all variables. In the while loop, it reads all data that comes from the socket. Then it also writes this data to the DataOutputStream.
But in the if statement - it adds a byte array to an integer.
How does it work? I don't get that point. Thank you for helping!
It's not adding the whole byte array; it's adding only the byte at index 43 (i.e. the 44th byte in the array). That single byte is widened to an int before the addition.
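One detail worth knowing: Java bytes are signed, so a length byte above 127 is added as a negative number unless it is masked. Whether the protocol in the question treats that byte as unsigned is an assumption on my part; the sketch below (with made-up names) just shows the difference.

```java
public class ByteToInt {
    // The byte at data[index] is widened to int with sign extension.
    static int addSigned(int total, byte[] data, int index) {
        return total + data[index];
    }

    // Masking with 0xFF treats the byte as an unsigned value in 0..255.
    static int addUnsigned(int total, byte[] data, int index) {
        return total + (data[index] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] data = new byte[44];
        data[43] = (byte) 200; // stored as -56 in a signed byte
        System.out.println(addSigned(44, data, 43));   // -12
        System.out.println(addUnsigned(44, data, 43)); // 244
    }
}
```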

java: read large binary file

I need to read out a given large file that contains 500000001 binaries. Afterwards I have to translate them into ASCII.
My problem occurs while trying to store the binaries in a large array. I get this error at the definition of the array ioBuf:
"The literal 16000000032 of type int is out of range."
I have no clue how to store these numbers to work with them. Does anybody have an idea?
Here is my code:
public byte[] read(){
try{
BufferedInputStream in = new BufferedInputStream(new FileInputStream("data.dat"));
ByteArrayOutputStream bs = new ByteArrayOutputStream();
BufferedOutputStream out = new BufferedOutputStream(bs);
byte[] ioBuf = new byte[16000000032];
int bytesRead;
while ((bytesRead = in.read(ioBuf)) != -1){
out.write(ioBuf, 0, bytesRead);
}
out.close();
in.close();
return bs.toByteArray();
}
The maximum length of an array is Integer.MAX_VALUE, and 16000000032 is greater than that:
Integer.MAX_VALUE = 2^31-1 = 2147483647
2147483647 < 16000000032
You could work around this by checking whether the array is full, then creating another one and continuing to read.
But I'm not sure your approach is the best way to do this; byte[Integer.MAX_VALUE] is huge ;)
Maybe you can split the input file into smaller chunks and process them one at a time.
EDIT: This is how you could read a single int of your file. You can resize the buffer's size to the amount of data you want to read. But you tried to read the whole file at once.
//Allocate buffer with 4byte = 32bit = Integer.SIZE
byte[] ioBuf = new byte[4];
int bytesRead;
while ((bytesRead = in.read(ioBuf)) != -1){
//if bytesRead == 4 you read 1 int
//do your stuff
}
If you need to declare a large constant, append an 'L' to it, which tells the compiler that it is a long constant. However, as mentioned in another answer, you can't declare arrays that large.
I suspect the purpose of the exercise is to learn how to use the java.nio.Buffer family of classes.
I made some progress by starting from scratch, but I still have a problem.
My idea is to read the first 32 bytes and convert them to an int, then the next 32 bytes, and so on. Unfortunately I only get the first number and don't know how to proceed.
I found the following method for converting these bytes to an int:
public static int byteArrayToInt(byte[] b){
final ByteBuffer bb = ByteBuffer.wrap(b);
bb.order(ByteOrder.LITTLE_ENDIAN);
return bb.getInt();
}
so now I have:
BufferedInputStream in=null;
byte[] buf = new byte[32];
try {
in = new BufferedInputStream(new FileInputStream("ndata.dat"));
in.read(buf);
System.out.println(byteArrayToInt(buf));
in.close();
} catch (IOException e) {
System.out.println("error while reading ndata.dat file");
}
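Note that an int is 32 bits, which is 4 bytes, not 32 bytes; getInt() consumes only the first 4 bytes of the buffer. A possible way to proceed is to read 4 bytes at a time in a loop until the stream ends. The sketch below assumes the file is a plain sequence of little-endian 32-bit integers (as byteArrayToInt above suggests); the class and method names are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class LittleEndianInts {
    // Reads consecutive 4-byte little-endian ints until the stream ends.
    static List<Integer> readInts(InputStream in) throws IOException {
        List<Integer> values = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        byte[] buf = new byte[4]; // one int = 4 bytes = 32 bits
        while (true) {
            try {
                data.readFully(buf); // keeps reading until all 4 bytes are filled
            } catch (EOFException end) {
                break; // no complete int left in the stream
            }
            values.add(ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN).getInt());
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {1, 0, 0, 0, 0, 1, 0, 0}; // 1 and 256 in little-endian
        System.out.println(readInts(new ByteArrayInputStream(sample))); // [1, 256]
    }
}
```

For a file with hundreds of millions of values, collecting them in a list would use a lot of memory; it is probably better to process each value inside the loop and wrap the FileInputStream in a BufferedInputStream.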

Update data to file each amount of bytes

I want to write my data into a file after every 10 KB of the file's content.
What I tried:
FileInputStream is;
FileOutputStream out;
File input = new File(filePath);
long fileLength = input.length();
int len = 0;
while (len < fileLength){
len += is.read(buff);
// write my data
out.write(data, 0, data.length);
// how to move is to read next 10kb???
}
Is there any way to move the read cursor forward to the next block of bytes? Or am I missing something?
Update: Thanks to @DThought, here is my implementation:
File input = new File(filePath);
long fileLength = input.length();
byte[] data;
byte[] buff = new byte[data.length];
long JUMP_LENGTH = 10 * 1024;
RandomAccessFile raf = new RandomAccessFile(input, "rw");
long step = JUMP_LENGTH + data.length;
for (long i = 0; i < fileLength; i += step) {
// read to buffer
raf.seek(i);
raf.read(buff);
raf.seek(i); // make sure it move to correct place after reading
raf.write(data);
}
raf.close();
And it worked well.
Try RandomAccessFile (http://developer.android.com/reference/java/io/RandomAccessFile.html) instead of FileOutputStream.
This will enable you to seek to arbitrary positions:
byte[] data=new byte[1024];
RandomAccessFile file=new RandomAccessFile(new File("name"),"rw");
file.seek(10*1024);
file.write(data);
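To repeat the write at every 10 KB boundary, the seek and write can go in a loop. The following is a minimal sketch under my own assumptions: the names are invented, and the marker is written at fixed offsets 0, 10 KB, 20 KB and so on. Keep in mind that write replaces the bytes already at that position; it does not insert new bytes and shift the rest of the file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

public class WriteAtIntervals {
    static final long JUMP = 10 * 1024; // distance between write positions

    // Overwrites the bytes at offsets 0, 10 KB, 20 KB, ... with marker.
    static void stamp(File target, byte[] marker) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(target, "rw")) {
            long length = raf.length();
            for (long pos = 0; pos + marker.length <= length; pos += JUMP) {
                raf.seek(pos);     // offset is measured from the start of the file
                raf.write(marker); // replaces existing bytes, does not insert
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("stamp", ".bin");
        tmp.deleteOnExit();
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.setLength(25 * 1024); // extend the empty file to 25 KB
        }
        stamp(tmp, new byte[] {'X', 'Y'});
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "r")) {
            raf.seek(JUMP);
            System.out.println((char) raf.read()); // X
        }
    }
}
```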
With a plain output stream you can't jump to a specific position in the file, so you can't skip the 10 KB; instead, you can write an empty array or spaces to that portion.
For example:
OutputStream os = new FileOutputStream(new File("D:/a.txt"));
byte[] emptyByte = new byte[10 * 1024];
Arrays.fill(emptyByte, " ".getBytes()[0]); // fill the array with spaces
os.write(yourData, 0, yourData.length);
os.write(emptyByte, 0, emptyByte.length);
// Write the filler after each block of data to leave a gap of 10 KB
Note: I'm not sure this matches your 10 KB layout exactly; it is just an example that fills that portion of the file with spaces, and you can adapt it to your requirements. I think you can't jump directly to a specific position this way, but you can fill the gap with empty data.
I guess the seek method of RandomAccessFile may also help you here, as suggested by DThought, but note that its offset is measured from the beginning of the file.

Reading a binary input stream into a single byte array in Java

The documentation says that one should not use available() method to determine the size of an InputStream. How can I read the whole content of an InputStream into a byte array?
InputStream in; //assuming already present
byte[] data = new byte[in.available()];
in.read(data);//now data is filled with the whole content of the InputStream
I could read multiple times into a buffer of a fixed size, but then, I will have to combine the data I read into a single byte array, which is a problem for me.
The simplest approach IMO is to use Guava and its ByteStreams class:
byte[] bytes = ByteStreams.toByteArray(in);
Or for a file:
byte[] bytes = Files.toByteArray(file);
Alternatively (if you didn't want to use Guava), you could create a ByteArrayOutputStream, and repeatedly read into a byte array and write into the ByteArrayOutputStream (letting that handle resizing), then call ByteArrayOutputStream.toByteArray().
Note that this approach works whether you can tell the length of your input or not - assuming you have enough memory, of course.
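The ByteArrayOutputStream alternative described above can be sketched as follows. The class and method names and the 8 KB buffer size are arbitrary choices for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadAll {
    // Copies the stream into a growing in-memory buffer, then returns its contents.
    static byte[] toByteArray(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes read in this pass
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[20_000];
        byte[] copy = toByteArray(new ByteArrayInputStream(source));
        System.out.println(copy.length); // 20000
    }
}
```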
Please keep in mind that the answers here assume that the length of the file is less than or equal to Integer.MAX_VALUE(2147483647).
If you are reading in from a file, you can do something like this:
File file = new File("myFile");
byte[] fileData = new byte[(int) file.length()];
DataInputStream dis = new DataInputStream(new FileInputStream(file));
dis.readFully(fileData);
dis.close();
UPDATE (May 31, 2014):
Java 7 adds some new features in the java.nio.file package that can be used to make this example a few lines shorter. See the readAllBytes() method in the java.nio.file.Files class. Here is a short example:
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
// ...
Path p = FileSystems.getDefault().getPath("", "myFile");
byte [] fileData = Files.readAllBytes(p);
Android has support for this starting in Api level 26 (8.0.0, Oreo).
You can use Apache commons-io for this task:
Refer to this method:
public static byte[] readFileToByteArray(File file) throws IOException
Update:
Java 7 way:
byte[] bytes = Files.readAllBytes(Paths.get(filename));
and if it is a text file and you want to convert it to String (change encoding as needed):
StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString()
You can read it by chunks (byte buffer[] = new byte[2048]) and write the chunks to a ByteArrayOutputStream. From the ByteArrayOutputStream you can retrieve the contents as a byte[], without needing to determine its size beforehand.
I believe buffer length needs to be specified, as memory is finite and you may run out of it
Example:
File fileFileName = new File(strFileName);
InputStream in = new FileInputStream(fileFileName);
long length = fileFileName.length();
if (length > Integer.MAX_VALUE) {
throw new IOException("File is too large!");
}
byte[] bytes = new byte[(int) length];
int offset = 0;
int numRead = 0;
while (offset < bytes.length && (numRead = in.read(bytes, offset, bytes.length - offset)) >= 0) {
offset += numRead;
}
if (offset < bytes.length) {
throw new IOException("Could not completely read file " + fileFileName.getName());
}
in.close();
The maximum array length is Integer.MAX_VALUE, which is about 2 GB for a byte array (2^31 - 1 = 2 147 483 647).
Your input stream can be bigger than 2 GB, so you have to process the data in chunks, sorry.
InputStream is;
final byte[] buffer = new byte[512 * 1024 * 1024]; // 512 MB
while(true) {
final int read = is.read(buffer);
if ( read < 0 ) {
break;
}
// do processing
}
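As an illustration of what "do processing" might look like, here is a sketch that computes a CRC-32 checksum over a stream of any size while holding only one buffer in memory. The checksum is just a stand-in for whatever per-chunk work is needed, and the names and buffer size are my own choices.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class ChunkedChecksum {
    // Feeds the stream to the checksum one buffer at a time.
    static long crc32(InputStream is, int bufferSize) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[bufferSize];
        int read;
        while ((read = is.read(buffer)) >= 0) {
            crc.update(buffer, 0, read); // process only the bytes read in this pass
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = "123456789".getBytes(StandardCharsets.US_ASCII);
        long value = crc32(new ByteArrayInputStream(sample), 4);
        System.out.println(Long.toHexString(value)); // cbf43926
    }
}
```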
