I want to read the images inside a .CBZ archive and store them in an ArrayList. I have tried the following solution, but it has at least two problems:
1. I get an OutOfMemoryError after adding 10-15 images to the ArrayList.
2. There must be a better way to get the images into the ArrayList than writing each one to a temp file and reading it back in.
public class CBZHandler {
    final int BUFFER = 2048;
    ArrayList<BufferedImage> images = new ArrayList<BufferedImage>();

    public void extractCBZ(ZipInputStream tis) throws IOException {
        ZipEntry entry;
        BufferedOutputStream dest = null;
        if (!images.isEmpty())
            images.clear();
        while ((entry = tis.getNextEntry()) != null) {
            System.out.println("Extracting " + entry.getName());
            int count;
            FileOutputStream fos = new FileOutputStream("temp");
            dest = new BufferedOutputStream(fos, BUFFER);
            byte data[] = new byte[BUFFER];
            while ((count = tis.read(data, 0, BUFFER)) != -1) {
                dest.write(data, 0, count);
            }
            dest.flush();
            dest.close();
            BufferedImage img = ImageIO.read(new FileInputStream("temp"));
            images.add(img);
        }
        tis.close();
    }
}
The "OutOfMemoryError" may or may not be inherent in the amount of data you're trying to store in memory. You may need to change your maximum heap size. However, you can certainly avoid writing to disk - just write to a ByteArrayOutputStream instead, then you can get at the data as a byte array - potentially creating a ByteArrayInputStream round it if you need to. Do you definitely need to add them in your list as BufferedImage rather than (say) keeping each as a byte[]?
Note that if you're able to use Guava it makes the "extract data from an InputStream" bit very easy:
byte[] data = ByteStreams.toByteArray(tis);
Each BufferedImage will typically require significantly more memory than the byte[] from which it is constructed. Cache the byte[] and stamp each one out to an image as needed.
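A minimal sketch of that caching idea (the PageCache name and its methods are illustrative, not from the original post):

class PageCache {
    // The compressed bytes are small; the decoded BufferedImages are large.
    private final List<byte[]> pages = new ArrayList<>();

    void add(byte[] entryBytes) {
        pages.add(entryBytes);
    }

    // Decode a page only at the moment it is actually displayed.
    BufferedImage decode(int index) throws IOException {
        return ImageIO.read(new ByteArrayInputStream(pages.get(index)));
    }
}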
In Java, how do you split a binary file into multiple parts while only loading a small portion of the File into memory at one time?
So I have a file FullFile that is large. I need to upload it to cloud storage but it's so large that it often times out.
I can make this problem less likely if I split the file and upload in chunks.
So I need to split FullFile into files of chunk size MaxChunkSize.
List<File> fileSplit(File fullFile, int maxChunkSize)
File fileJoin(List<File> splitFiles)
Most code snippets I can find assume the file is text, but in my case the files are compressed binary.
What would be the best way to implement these methods?
Here is a full answer:
The maxChunkSize represents the size, in bytes, of a file chunk.
In the example below I read a 5 MB zip file, split it into five 1 MB chunks, and later join them back using the fileJoin function.
The method stageLocally stages the files locally, but you can modify it to work with any cloud storage (better to abstract this out so you can switch between storage implementations).
You can tweak maxChunkSize based on the amount of data you want to hold in memory at a given time.
The IOUtils.copy() method comes from Apache Commons IO (available via Maven). You can also use Files.copy() in lieu of it; Files.copy() comes from the java.nio.file package, so it needs no external dependency.
I have omitted exception handling for brevity.
public static void main(String[] args) throws IOException {
    File input = new File(_5_MB_FILE_PATH);
    File outPut = fileJoin(split(input, 1_024_000));
    System.out.println(IOUtils.contentEquals(Files.newInputStream(input.toPath()),
            Files.newInputStream(outPut.toPath())));
}

public static List<File> split(File largeFile, int maxChunkSize) throws IOException {
    List<File> list = new ArrayList<>();
    final byte[] buffer = new byte[maxChunkSize];
    // try-with-resources so the input stream is closed even on failure
    try (InputStream in = Files.newInputStream(largeFile.toPath())) {
        int dataRead = in.read(buffer);
        while (dataRead > -1) {
            list.add(stageLocally(buffer, dataRead));
            dataRead = in.read(buffer);
        }
    }
    return list;
}

private static File stageLocally(byte[] buffer, int length) throws IOException {
    File outPutFile = File.createTempFile("temp-", "split", new File(TEMP_DIRECTORY));
    try (FileOutputStream fos = new FileOutputStream(outPutFile)) {
        fos.write(buffer, 0, length);
    }
    return outPutFile;
}

public static File fileJoin(List<File> list) throws IOException {
    File outPutFile = File.createTempFile("temp-", "unsplit", new File(TEMP_DIRECTORY));
    try (FileOutputStream fileOutputStream = new FileOutputStream(outPutFile)) {
        for (File file : list) {
            try (InputStream in = Files.newInputStream(file.toPath())) {
                IOUtils.copy(in, fileOutputStream);
            }
        }
    }
    return outPutFile;
}
Let me know if this helps.
How can I split a file into parts larger than 2GB? A byte array takes an int rather than a long as its size, so I can't just allocate a buffer the size of a whole part. Any solution?
public void splitFile(SplitFile file) throws IOException {
    int partCounter = 1;
    int sizeOfFiles = (int) value;
    byte[] buffer = new byte[sizeOfFiles];
    File f = file.getFile();
    String fileName = f.getName();
    try (FileInputStream fis = new FileInputStream(f);
         BufferedInputStream bis = new BufferedInputStream(fis)) {
        int bytesAmount = 0;
        while ((bytesAmount = bis.read(buffer)) > 0) {
            String filePartName = fileName + partCounter + file.config.options.getExtension();
            partCounter++;
            File newFile = new File(f.getParent(), filePartName);
            try (FileOutputStream out = new FileOutputStream(newFile)) {
                out.write(buffer, 0, bytesAmount);
            }
        }
    }
}
Don't read the entire file into memory, obviously, or even an entire 'part file'.
Your code as pasted splits the file into however many pieces the read() calls happen to produce, which is unreliable: read() is specced to be allowed to return as little as a single byte per call.
Don't make a new part file for every call to read(). Instead, separate the two sizes: each read() call gets anywhere from 1 to <BUFFER_SIZE> bytes, and each part is <PART_SIZE> bytes large; these two numbers do not have to be the same, and you shouldn't write the code as if they were.
Once you have an open FileOutputStream you can call .write(buffer, 0, bytesAmount) on it any number of times; you can even call .write(buffer, 0, theSmallerOfBytesLeftToWriteInThisPartAndBytesAmount), then open the FileOutputStream for the next part and call .write(buffer, whereWeLeftOff, remainder) on that one, as in the sketch below.
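Putting those points together, here is a minimal sketch; the PART_SIZE and BUFFER_SIZE values and the numeric part-name suffix are my choices, not something from the question. Note that the part size is a long, so parts beyond 2GB are no problem; only the read buffer has to fit in an int-sized array.

static final long PART_SIZE = 3L * 1024 * 1024 * 1024; // 3GB per part
static final int BUFFER_SIZE = 64 * 1024;              // 64KB per read

public static void split(String inPath) throws IOException {
    byte[] buffer = new byte[BUFFER_SIZE];
    int partCounter = 1;
    long writtenInPart = 0;
    try (InputStream in = new BufferedInputStream(new FileInputStream(inPath))) {
        OutputStream out = new FileOutputStream(inPath + "." + partCounter);
        try {
            int n;
            while ((n = in.read(buffer)) > 0) {
                int offset = 0;
                while (offset < n) {
                    if (writtenInPart == PART_SIZE) {
                        // This part is full; roll over to the next part file.
                        out.close();
                        partCounter++;
                        out = new FileOutputStream(inPath + "." + partCounter);
                        writtenInPart = 0;
                    }
                    // Write whichever is smaller: what's left in the buffer,
                    // or what's left in the current part.
                    int toWrite = (int) Math.min(n - offset, PART_SIZE - writtenInPart);
                    out.write(buffer, offset, toWrite);
                    offset += toWrite;
                    writtenInPart += toWrite;
                }
            }
        } finally {
            out.close();
        }
    }
}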
I'm trying to copy a file from one FileChannel to another in parts (in effect writing a new file equal to the first one).
So I'm reading chunks of 256 KB, then putting them into the other channel:
static void openfile(String str) throws FileNotFoundException, IOException {
    int size = 262144;
    FileInputStream fis = new FileInputStream(str);
    FileChannel fc = fis.getChannel();
    byte[] barray = new byte[size];
    ByteBuffer bb = ByteBuffer.wrap(barray);
    FileOutputStream fos = new FileOutputStream(str + "2" /**/);
    FileChannel fo = fos.getChannel();
    StringBuilder sb;
    while (fc.read(bb) != -1) {
        fo.write(bb /**/);
        bb.clear();
    }
}
The problem is that fo.write (I think) writes again from the beginning of the channel, so the new file is made up of only the last chunk read.
I tried fo.write(bb, bb.position()), but it didn't work as I expected (does the pointer return to the beginning of the channel?), and I tried FileOutputStream(str+"2", true), thinking it would append to the end of the new file, but it didn't.
I need to work with chunks of 256 KB, so I can't change the structure of the program much (unless I'm doing something terribly wrong).
Resolved with bb.flip();
while (fc.read(bb) != -1) {
    bb.flip();
    fo.write(bb);
    bb.clear();
}
This is a very old question, but I stumbled upon it and thought I might add another answer that has potentially better performance, using FileChannel.transferTo or FileChannel.transferFrom. As per the javadoc:
This method is potentially much more efficient than a simple loop that reads from the source channel and writes to this channel. Many operating systems can transfer bytes directly from the source channel into the filesystem cache without actually copying them.
public static void copy(FileChannel src, FileChannel dst) throws IOException {
    long size = src.size();
    long transferred = 0;
    do {
        // Resume from where the previous call left off; transferTo may
        // transfer fewer bytes than requested in a single call.
        transferred += src.transferTo(transferred, size - transferred, dst);
    } while (transferred < size);
}
In most cases a simple src.transferTo(0, src.size(), dst); will work, provided neither of the channels is non-blocking.
The canonical way to copy between channels is as follows:
while (in.read(bb) > 0 || bb.position() > 0) {
    bb.flip();
    out.write(bb);
    bb.compact();
}
The simplified version in your edited answer doesn't work in all circumstances, e.g. when 'out' is non-blocking.
I am trying to copy a file of about 80 megabytes from the assets folder of an Android application to the SD card.
The file is another apk. For various reasons I have to do it this way and can't simply link to an online apk or put it on the Android market.
The application works fine with smaller apks but for this large one I am getting an out of memory error.
I'm not sure exactly how this works, but I assume the code below ends up trying to hold the full 80 megabytes in memory.
try {
    int length = 0;
    newFile.createNewFile();
    InputStream inputStream = ctx.getAssets().open("myBigFile.apk");
    FileOutputStream fOutputStream = new FileOutputStream(newFile);
    byte[] buffer = new byte[inputStream.available()];
    while ((length = inputStream.read(buffer)) > 0) {
        fOutputStream.write(buffer, 0, length);
    }
    fOutputStream.flush();
    fOutputStream.close();
    inputStream.close();
} catch (Exception ex) {
    if (ODP_App.getInstance().isInDebugMode())
        Log.e(TAG, ex.toString());
}
I found this interesting -
A question about an out of memory issue with Bitmaps
Unless I've misunderstood, in the case of Bitmaps there appears to be a way to subsample the stream to reduce memory usage, using BitmapFactory.Options.
Is this do-able in my scenario or is there any other possible solution?
The trick is not to try to read the whole file in one go, but rather to read it in small chunks and write each chunk before reading the next one into the same buffer. The following version reads it in 1 KB chunks. It's an example only; you need to determine the right chunk size.
try {
    int length = 0;
    newFile.createNewFile();
    InputStream inputStream = ctx.getAssets().open("myBigFile.apk");
    FileOutputStream fOutputStream = new FileOutputStream(newFile);
    // note the following line
    byte[] buffer = new byte[1024];
    while ((length = inputStream.read(buffer)) > 0) {
        fOutputStream.write(buffer, 0, length);
    }
    fOutputStream.flush();
    fOutputStream.close();
    inputStream.close();
} catch (Exception ex) {
    if (ODP_App.getInstance().isInDebugMode())
        Log.e(TAG, ex.toString());
}
Do not read the whole file into memory; read 64 KB at a time, write them, and repeat until you reach the end of the file. Or use IOUtils from Apache Commons IO, as sketched below.
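For the Commons IO route, a short sketch (reusing ctx and newFile from the snippets above); IOUtils.copy streams through a small internal buffer, so the whole file never sits in memory:

try (InputStream in = ctx.getAssets().open("myBigFile.apk");
     OutputStream out = new FileOutputStream(newFile)) {
    IOUtils.copy(in, out);
}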
AssetManager mngr = getAssets();
test_file = mngr.open("sample.txt");
The test_file variable above is an InputStream. Is there any way to calculate the file size of sample.txt from it?
I have an alternative way to get the size of a file in assets, using an AssetFileDescriptor:
AssetFileDescriptor fd = getAssets().openFd("test.png");
long size = fd.getLength();
Hope it helps.
test_file.available() is not a reliable way to get the file length, as stated in the docs.
fd.getLength(), using the AssetFileDescriptor as shown by Ayublin, is!
His answer should be promoted to the correct answer.
inputStream.available() might match the file size if the file is very small, but for larger files it isn't expected to match.
For a compressed asset, the only way to get the size reliably is to copy it to the filesystem, e.g. to context.getCacheDir(), and then read the length of the file from there. Here's some sample code that does this. After that, it probably also makes sense to use the file from the cache dir rather than the asset.
String filename = "sample.txt";
InputStream in = context.getAssets().open(filename);
File outFile = new File(context.getCacheDir(), filename);
OutputStream out = new FileOutputStream(outFile);
try {
    int len;
    byte[] buff = new byte[1024];
    while ((len = in.read(buff)) > 0) {
        out.write(buff, 0, len);
    }
} finally {
    out.close();
    in.close();
}
long theRealFileSizeInBytes = outFile.length();
You should also delete the file from the cache dir when you are done with it (and the entire cache dir will also be deleted automatically when uninstalling the app).
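For example (outFile as in the snippet above):

// Remove the staged copy once you no longer need it.
outFile.delete();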