I have an InputStream, plus the corresponding file name and size.
I need to access/read some random (increasing) positions in the InputStream. These positions are stored in an integer array (named offsets).
InputStream inputStream = ...
String fileName = ...
int fileSize = (int) ...
int[] offsets = new int[]{...}; // the random (increasing) offsets array
Now, given an InputStream, I've found only two possible solutions to jump to random (increasing) positions of the file.
The first one is to use the skip() method of the InputStream (note that I actually use a BufferedInputStream, since I will need to mark() and reset() the file pointer).
//Open a BufferedInputStream:
BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);

byte[] bytes = new byte[1];
int curFilePointer = 0;
long numBytesSkipped = 0;
long numBytesToSkip = 0;
int numBytesRead = 0;

//Check the file size:
if ( fileSize < offsets[offsets.length-1] ) { // the last (biggest) offset is bigger than the file size...
    //Debug:
    Log.d(TAG, "The file is too small!\n");
    return;
}

for (int i=0, k=0; i < offsets.length; i++, k=0) { // for each offset I have to jump...
    try {
        //Jump to the offset [i]:
        while ( (curFilePointer < offsets[i]) && (k < 10) ) { // until the correct offset is reached (at most 10 tries)
            numBytesToSkip = offsets[i] - curFilePointer;
            numBytesSkipped = bufferedInputStream.skip(numBytesToSkip);
            curFilePointer += numBytesSkipped; // move the file pointer forward
            //Debug:
            Log.d(TAG, "FP: " + curFilePointer + "\n");
            k++;
        }
        if ( curFilePointer != offsets[i] ) { // it did NOT jump properly... (what's going on?!)
            //Debug:
            Log.d(TAG, "InputStream.skip() DID NOT JUMP PROPERLY!!!\n");
            break;
        }
        //Read the content of the file at the offset [i]:
        numBytesRead = bufferedInputStream.read(bytes, 0, bytes.length);
        curFilePointer += numBytesRead; // move the file pointer forward
        //Debug:
        Log.d(TAG, "READ [" + curFilePointer + "]: " + bytes[0] + "\n");
    }
    catch ( IOException e ) {
        e.printStackTrace();
        break;
    }
    catch ( IndexOutOfBoundsException e ) {
        e.printStackTrace();
        break;
    }
}

//Close the BufferedInputStream:
bufferedInputStream.close();
The problem is that, during my tests, for some (usually big) offsets, it cycled 5 or more times before skipping the correct number of bytes. Is that normal? And, above all, can/should I trust skip()? (That is: are 10 cycles enough to be SURE it will ALWAYS arrive at the correct offset?)
The only alternative I've found is to create a RandomAccessFile from the InputStream, through File.createTempFile(prefix, suffix, directory) and the following function.
public static RandomAccessFile toRandomAccessFile(InputStream inputStream, File tempFile, int fileSize) throws IOException {
    RandomAccessFile randomAccessFile = new RandomAccessFile(tempFile, "rw");
    byte[] buffer = new byte[fileSize];
    int numBytesRead = 0;
    while ( (numBytesRead = inputStream.read(buffer)) != -1 ) {
        randomAccessFile.write(buffer, 0, numBytesRead);
    }
    randomAccessFile.seek(0);
    return randomAccessFile;
}
Having a RandomAccessFile is actually a much better solution, but the performance is far worse (above all because I will have more than a single file).
EDIT: Using byte[] buffer = new byte[fileSize] speeds up the RandomAccessFile creation a lot!
//Create a temporary RandomAccessFile:
File tempFile = File.createTempFile(fileName, null, context.getCacheDir());
RandomAccessFile randomAccessFile = toRandomAccessFile(inputStream, tempFile, fileSize);

byte[] bytes = new byte[1];
int numBytesRead = 0;

//Check the file size:
if ( fileSize < offsets[offsets.length-1] ) { // the last (biggest) offset is bigger than the file size...
    //Debug:
    Log.d(TAG, "The file is too small!\n");
    return;
}

for (int i=0, k=0; i < offsets.length; i++, k=0) { // for each offset I have to jump...
    try {
        //Jump to the offset [i]:
        randomAccessFile.seek(offsets[i]);
        //Read the content of the file at the offset [i]:
        numBytesRead = randomAccessFile.read(bytes, 0, bytes.length);
        //Debug:
        Log.d(TAG, "READ [" + (randomAccessFile.getFilePointer() - numBytesRead) + "]: " + bytes[0] + "\n");
    }
    catch ( IOException e ) {
        e.printStackTrace();
        break;
    }
    catch ( IndexOutOfBoundsException e ) {
        e.printStackTrace();
        break;
    }
}

//Delete the temporary RandomAccessFile:
randomAccessFile.close();
tempFile.delete();
Now, is there a better (or more elegant) solution to have a "random" access from an InputStream?
It's a bit unfortunate you have an InputStream to begin with, but in this situation buffering the stream in a file is of no use if you are always skipping forward. And you don't have to count the number of times you have called skip(); that's not really of interest.
What you do have to check is whether the stream has already ended, to prevent an infinite loop. Going by the source of the default skip() implementation, I'd say you'll have to keep calling skip() until it returns 0. That will indicate the end of the stream has been reached. The JavaDoc is a bit unclear about this, for my taste.
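A minimal sketch of that idea (treating a 0 return from skip() as end of stream, per the reasoning above):

static void skipFully(InputStream in, long n) throws IOException {
    while (n > 0) {
        long skipped = in.skip(n);
        if (skipped == 0) // per the default implementation, the stream has ended
            throw new EOFException("stream ended " + n + " bytes short of the target");
        n -= skipped;
    }
}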
You can't. An InputStream is a stream, that is to say a sequential construct. Your question embodies a contradiction in terms.
Related
I was bitten by this in some unit tests.
I want to decompress some ZLIB-compressed data, using Inflater, where the raw data length is known in advance.
This straightforward version works as expected:
/*
 * Decompresses a zlib-compressed buffer, with a given size of raw data.
 * All data is fed and inflated in full (one step).
 */
public static byte[] decompressFull(byte[] comp, int len) throws Exception {
    byte[] res = new byte[len]; // result (uncompressed)
    Inflater inf = new Inflater();
    inf.setInput(comp);
    int n = inf.inflate(res, 0, len);
    if (n != len)
        throw new RuntimeException("didn't inflate all data");
    System.out.println("Data done (full). bytes in: " + inf.getBytesRead()
            + " out=" + inf.getBytesWritten()
            + " finished: " + inf.finished());
    // done - the next part is not needed, just for checking...
    // try a final inflate just in case (might trigger ZLIB crc check)
    byte[] buf2 = new byte[6];
    int nx = inf.inflate(buf2); // should give 0
    if (nx != 0)
        throw new RuntimeException("nx=" + nx + " " + Arrays.toString(buf2));
    if (!inf.finished())
        throw new RuntimeException("not finished?");
    inf.end();
    return res;
}
Now, the compressed input can come in arbitrarily-sized chunks. The following code emulates the case where the compressed input is fed in full except for the last 4 bytes, and then the remaining bytes are fed one at a time.
(As I understand it, the last 4 or 5 bytes of the zlib stream are not needed to decompress the full data, but they are needed to check its integrity, via the Adler-32 checksum.)
public static byte[] decompressBytexByte(byte[] comp, int len) throws Exception {
    byte[] res = new byte[len]; // result (uncompressed)
    Inflater inf = new Inflater();
    inf.setInput(comp, 0, comp.length - 4);
    int n = inf.inflate(res, 0, len);
    if (n != len)
        throw new RuntimeException("didn't inflate all data");
    // inf.setInput(comp, comp.length - 4, 4);
    // !!! works if I uncomment the line before and comment out the next for
    for (int p = comp.length - 4; p < comp.length; p++)
        inf.setInput(comp, p, 1);
    System.out.println("Data done (decompressBytexByte). bytes in: " + inf.getBytesRead()
            + " out=" + inf.getBytesWritten() + " finished: " + inf.finished());
    // all data fed... try a final inflate (might -should?- trigger ZLIB crc check)
    byte[] buf2 = new byte[6];
    int nx = inf.inflate(buf2); // should give 0
    if (nx != 0)
        throw new RuntimeException("nx=" + nx + " " + Arrays.toString(buf2));
    if (!inf.finished())
        throw new RuntimeException("not finished?");
    inf.end();
    return res;
}
Well, this doesn't work for me (Java 1.8.0_181). The inflater is not finished and the Adler-32 check does not seem to be performed; more than that, it seems the bytes are not fed into the inflater at all.
Even more strange: it works if the trailing 4 bytes are fed in one call.
You can try it here: https://repl.it/#HernanJJ/Inflater-Test
Even stranger things happen when I feed the whole input one byte at a time: sometimes the line int nx = inf.inflate(buf2); //should give 0 returns non-zero (when all data has already been inflated).
Is this expected behaviour? Am I missing something?
As @SeanBright already noticed, you are supposed to feed it new input only when Inflater.needsInput() returns true.
A subsequent call to setInput() overrides the previously passed input.
Javadoc of Inflater.needsInput():
Returns true if no data remains in the input buffer. This can be used to determine if #setInput should be called in order to provide more input.
As long as you feed it byte by byte, that is always the case, so you can probably skip the check itself.
You could replace the input-setting part of the decompressBytexByte method with this (for complete byte-by-byte feeding):

byte[] res = new byte[len];
Inflater inf = new Inflater();
int a = 0; // number of bytes that have already been obtained
for (int p = 0; p < comp.length; p++) {
    inf.setInput(comp, p, 1);
    a += inf.inflate(res, a, len - a);
}
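Putting it together, a minimal complete sketch of the byte-by-byte variant (decompressByteByByte is a hypothetical name; the trailing checks mirror the question's):

public static byte[] decompressByteByByte(byte[] comp, int len) throws Exception {
    byte[] res = new byte[len]; // result (uncompressed)
    Inflater inf = new Inflater();
    int a = 0; // number of bytes inflated so far
    for (int p = 0; p < comp.length; p++) {
        inf.setInput(comp, p, 1);          // feed exactly one byte
        a += inf.inflate(res, a, len - a); // once a == len this inflates 0 bytes but still consumes the trailer
    }
    if (a != len)
        throw new RuntimeException("didn't inflate all data");
    if (!inf.finished())
        throw new RuntimeException("not finished?");
    inf.end();
    return res;
}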
Hi, I need to calculate the entropy of order m of a file, where m is the number of bits (m <= 16).
So:
H_m(X) = -\sum_{i=0}^{2^m-1} p_{i,m} \log_2(p_{i,m})
So, I thought to create an input stream to read the file and then calculate the probability of each sequence composed of m bits.
For m = 8 it's easy, because I consider one byte at a time; a minimal sketch of that case is shown below.
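For instance (entropyOrder8 is a hypothetical name; it assumes the file has already been read into a byte[]):

static double entropyOrder8(byte[] data) {
    long[] freq = new long[256];
    for (byte b : data)
        freq[b & 0xFF]++; // histogram of byte values
    double h = 0.0;
    for (long f : freq) {
        if (f == 0)
            continue;
        double p = (double) f / data.length;
        h -= p * (Math.log(p) / Math.log(2)); // log base 2
    }
    return h;
}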
Since m <= 16, I thought to treat the data as the primitive type short, save each short of the file in a short[] array, and then manipulate the bits with bitwise operators to obtain all the m-bit sequences in the file.
Is this a good idea?
Anyway, I'm not able to create a stream of shorts. This is what I've done:
public static void main(String[] args) {
    readFile(FILE_NAME_INPUT);
}

public static void readFile(String filename) {
    short[] buffer = null;
    File a_file = new File(filename);
    try {
        File file = new File(filename);
        FileInputStream fis = new FileInputStream(filename);
        DataInputStream dis = new DataInputStream(fis);
        int length = (int) file.length() / 2;
        buffer = new short[length];
        int count = 0;
        while (dis.available() > 0 && count < length) {
            buffer[count] = dis.readShort();
            count++;
        }
        System.out.println("length=" + length);
        System.out.println("count=" + count);
        for (int i = 0; i < buffer.length; i++) {
            System.out.println("buffer[" + i + "]: " + buffer[i]);
        }
        fis.close();
    }
    catch (EOFException eof) {
        System.out.println("EOFException: " + eof);
    }
    catch (FileNotFoundException fe) {
        System.out.println("FileNotFoundException: " + fe);
    }
    catch (IOException ioe) {
        System.out.println("IOException: " + ioe);
    }
}
But I lose a byte this way, and I don't think this is the best way to proceed.
This is what I think to do using bitwise operators (pseudocode):

int[] list = new int[l];
foreach n in buffer {
    for (int i = 16 - m; i >= 0; i -= m) {
        list.add( (n >> i) & ((1 << m) - 1) ); // mask keeps the lowest m bits
    }
}
I'm assuming in this case that I use shorts.
If I use bytes instead, how can I write a cycle like that for m > 8?
That cycle doesn't work then, because I have to concatenate multiple bytes, varying each time the number of bits to be joined...
Any ideas?
Thanks
I think you just need to have a byte array:

public static void readFile(String filename) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    try {
        FileInputStream fis = new FileInputStream(filename);
        int b; // read() returns an int: the next byte, or -1 at end of stream
        while ((b = fis.read()) != -1) {
            outputStream.write(b);
        }
        byte[] byteData = outputStream.toByteArray();
        fis.close();
    }
    catch (IOException ioe) {
        System.out.println("IOException: " + ioe);
    }
}

Then you can manipulate byteData as per your bitwise operations.
--
If you want to work with shorts, you can combine the bytes read this way:

short[] buffer = new short[(int) (byteData.length / 2.) + 1];
int j = 0;
for (int i = 0; i < byteData.length - 1; i += 2) {
    // & 0xFF avoids sign extension when widening a byte to an int
    buffer[j] = (short) (((byteData[i] & 0xFF) << 8) | (byteData[i + 1] & 0xFF));
    j++;
}

To check for an odd byte at the end, do this:

short last = 0;
if ((byteData.length % 2) == 1)
    last = (short) (byteData[byteData.length - 1] & 0xFF); // high byte is zero
last is a short, so it could be placed in buffer[buffer.length - 1]. I'm not sure whether that last position in buffer is free or already occupied; I think it is free, but you need to check j after exiting the loop: if j's value is buffer.length - 1, the slot is free; otherwise there might be a problem.
Then manipulate buffer.
The second approach, working with bytes directly, is more involved; it's really a question of its own, so try the above first. A rough sketch of the idea follows anyway.
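A hedged sketch of that byte-based approach (mBitSequences is a hypothetical name; it accumulates bits in a long and emits one m-bit symbol at a time, dropping trailing bits that don't fill a whole symbol):

static int[] mBitSequences(byte[] data, int m) {
    int total = (data.length * 8) / m; // number of complete m-bit symbols
    int[] out = new int[total];
    long acc = 0;  // bit accumulator; only the lowest nBits bits are valid
    int nBits = 0; // number of valid bits currently in acc
    int k = 0;
    for (byte b : data) {
        acc = (acc << 8) | (b & 0xFF);
        nBits += 8;
        while (nBits >= m && k < total) {
            out[k++] = (int) ((acc >>> (nBits - m)) & ((1L << m) - 1));
            nBits -= m;
        }
    }
    return out;
}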
I wrote this piece of code to split files into multiple chunks. The program works fine for a file of size 12KB with a chunk size of 8KB. However, when I give it an input file of 2980144 bytes, it spins forever and never comes out.
Does this have something to do with the size of the input file and how FileChannel accesses it? I want to use this program to split larger (binary) files into multiple chunks for easier transport over the network. I have kept the chunk size as a parameter, so that I can configure it as per requirement.
public static void main(String[] args) {
    int chunkSize = 8000;
    long offset = 0;
    while (offset >= 0) {
        offset = splitter.GetNextChunk(offset);
    }
}

public long GetNextChunk(long offset) {
    long bytesRead = 0;
    ByteBuffer tmpBuf = ByteBuffer.allocate(chunkSize);
    RandomAccessFile outFile = null;
    RandomAccessFile inFile = null;
    FileChannel inFC = null;
    FileChannel outFC = null;
    try {
        inFile = new RandomAccessFile(inFileName, "r");
        inFC = inFile.getChannel();
        tmpBuf.clear();
        // Seek to the offset in the file.
        inFC.position(offset);
        // Read the specified number of bytes into the buffer.
        do {
            bytesRead = inFC.read(tmpBuf);
        } while (bytesRead != -1 && tmpBuf.hasRemaining());
        // Write the copied bytes into a new file (chunk).
        String outFileName = outFolder + File.separator + "Chunk" + String.valueOf(chunkCounter++) + ".dat";
        outFile = new RandomAccessFile(outFileName, "rw");
        outFC = outFile.getChannel();
        outFC.position(0);
        tmpBuf.flip();
        while (tmpBuf.hasRemaining()) {
            outFC.write(tmpBuf);
        }
        // Reposition the buffer to 0.
        tmpBuf.rewind();
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        try {
            if (inFC != null)
                inFile.close();
            if (outFC != null)
                outFile.close();
            if (inFC != null)
                inFC.close();
            if (outFC != null)
                outFC.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return bytesRead;
}
Found the issue. The loop was faulty. Below is the correct loop.
while (bytesRead >= 0) {
    bytesRead = splitter.GetNextChunk(offset);
    if (bytesRead == -1)
        break;
    offset += bytesRead;
    System.out.println("Byte offset is: " + offset);
}
It's not nearly as hard as you're making it. Your code is about ten times as long as it needs to be. Try this:
while (in.read(buffer) > 0 || buffer.position() > 0)
{
    buffer.flip();
    out.write(buffer);
    buffer.compact();
}
If 'out' is a SocketChannel, this will send the file over the network at maximum speed.
You don't need a monster buffer, but you should always use a power of 2. I generally use 8192.
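For context, here is a self-contained sketch of that loop as a copy helper (ChannelCopy and its copy method are hypothetical names; in and out can be any channels, e.g. a FileChannel and a SocketChannel):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

public final class ChannelCopy {
    // Pumps everything from in to out, flushing whatever remains in the buffer at EOF.
    public static void copy(ReadableByteChannel in, WritableByteChannel out) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocate(8192); // a power of 2, as suggested above
        while (in.read(buffer) > 0 || buffer.position() > 0) {
            buffer.flip();    // switch the buffer from filling to draining
            out.write(buffer);
            buffer.compact(); // keep any unwritten bytes, switch back to filling
        }
    }
}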
Here is my code. imageFile is a PDF file; the intent is to get a Base64-encoded file for it. I am using Java 6, with no possibility to upgrade to Java 7.
Base64InputStream is of type org.apache.commons.codec.binary.Base64InputStream.
private File toBase64(File imageFile) throws Exception {
    LOG.info(this.getClass().getName() + " toBase64 method is called");
    System.out.println("toBase64 is called");
    Base64InputStream in = new Base64InputStream(new FileInputStream(imageFile), true);
    File f = new File("/root/temp/" + imageFile.getName().replaceFirst("[.][^.]+$", "") + "_base64.txt");
    Writer out = new FileWriter(f);
    copy(in, out);
    return f;
}

private void copy(InputStream input, Writer output) throws IOException {
    InputStreamReader in = new InputStreamReader(input);
    copy(in, output);
}

private int copy(Reader input, Writer output) throws IOException {
    long count = copyLarge(input, output);
    if (count > Integer.MAX_VALUE) {
        return -1;
    }
    return (int) count;
}

private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

private long copyLarge(Reader input, Writer output) {
    char[] buffer = new char[DEFAULT_BUFFER_SIZE];
    long count = 0;
    int n = 0;
    try {
        while (-1 != (n = input.read(buffer))) {
            output.write(buffer, 0, n);
            count += n;
            System.out.println("Count: " + count);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return count;
}
I was using IOUtils.copy(InputStream input, Writer output). But for some PDF files (note: not all) it throws an exception. So, in the process of debugging, I copied the IOUtils.copy code locally, and the exception is thrown after Count: 2630388. This is the stack trace:
Root Exception stack trace:
java.io.IOException: Underlying input stream returned zero bytes
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:268)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:306)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:158)
Under what circumstances can the block below throw that exception?

while (-1 != (n = input.read(buffer))) {
    output.write(buffer, 0, n);
    count += n;
    System.out.println("Count: " + count);
}

Please help me understand the cause and how I can fix it.
You should not use Reader/Writer here: they are text-oriented, not binary, and they always use an encoding, either one explicitly given or the default OS encoding (which is unportable). PDF is binary.
For the InputStream, use readFully.
And always do a close(). The copy method, maybe leaving the close to the callers, could at least call flush() in that case.
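A minimal sketch of that close/flush discipline, reusing the names from the question (f, in, copy):

Writer out = null;
try {
    out = new FileWriter(f);
    copy(in, out);
    out.flush(); // push buffered chars to the file before closing
} finally {
    if (out != null)
        out.close();
    in.close();  // also releases the underlying FileInputStream
}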
In Java 7 there already exists a copy method, but it needs a Path and an extra option:
private File toBase64(File imageFile) throws Exception {
    LOG.info(this.getClass().getName() + " toBase64 method is called");
    System.out.println("toBase64 is called");
    Base64InputStream in = new Base64InputStream(new FileInputStream(imageFile), true);
    File f = new File("/root/temp/" + imageFile.getName().replaceFirst("[.][^.]+$", "") + "_base64.txt");
    Files.copy(in, f.toPath(), StandardCopyOption.REPLACE_EXISTING);
    in.close();
    return f;
}
I need to create a BMP (bitmap) image from a database using Java. The problem is that I have huge sets of integers, with values ranging from 10 to 100.
I would like to represent the whole database as a BMP. The amount of data, 10000x10000 values per table (and growing), exceeds what I can handle in int arrays.
Is there a way to write the BMP directly to the hard drive, pixel by pixel, so I don't run out of memory?
A file would work (I definitely wouldn't do a per-pixel call; you'll be waiting hours for the result). You just need a buffer. Break the application apart along the lines of ->
int[] buffer = new int[BUFFER_SIZE];
ResultSet data = ....; // forward-paging result set
while (true)
{
    for (int i = 0; i < BUFFER_SIZE; i++)
    {
        // Read result set into buffer
    }
    // Write buffer to cache (HEAP/File whatever)
    if (resultSetDone)
        break;
}
Read the documentation for your database driver, but any major database will optimize your ResultSet object so you can use a cursor and not worry about memory; a sketch of that follows.
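For instance, a hedged sketch of cursor-style streaming (it assumes the JDBC driver honors setFetchSize, and the table and column names are made up):

Statement stmt = connection.createStatement();
stmt.setFetchSize(1024); // hint the driver to stream rows instead of materializing them all
ResultSet rs = stmt.executeQuery("SELECT pixel_value FROM pixels ORDER BY id");
while (rs.next()) {
    int value = rs.getInt(1);
    // ... append value to the current buffer; flush the buffer to disk when full ...
}
rs.close();
stmt.close();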
All that being said, an int[10000][10000] isn't why you're running out of memory. It's probably what you're doing with those values in your algorithm. Example:
public class Test
{
    public static void main(String... args)
    {
        int[][] ints = new int[10000][];
        System.out.println(System.currentTimeMillis() + " Start");
        for (int i = 0; i < 10000; i++)
        {
            ints[i] = new int[10000];
            for (int j = 0; j < 10000; j++)
                ints[i][j] = i * j % Integer.MAX_VALUE / 2;
            System.out.print(i);
        }
        System.out.println();
        System.out.println(Integer.valueOf(ints[500][999]) + " <- value");
        System.out.println(System.currentTimeMillis() + " Stop");
    }
}
Output ->
1344554718676 Start
//not even listing this
249750 <- value
1344554719322 Stop
Edit: Or, if I misinterpreted your question, try this ->
http://www.java2s.com/Code/Java/Database-SQL-JDBC/LoadimagefromDerbydatabase.htm
I see... well, take a look around; I'm rusty, but this seems to be a way to do it. I'd double-check my buffering...
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class Test
{
    public static void main(String... args)
    {
        // 2 ^ 24 bytes, streams can be bigger, but this works...
        int size = Double.valueOf(Math.floor(Math.pow(2.0, 24.0))).intValue();
        byte[] bytes = new byte[size];
        for (int i = 0; i < size; i++)
            bytes[i] = (byte) (i % 255);
        ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
        File file = new File("test.io"); // kill the hard disk

        // Crappy error handling; you'd actually want to catch exceptions and recover
        BufferedInputStream in = new BufferedInputStream(stream);
        BufferedOutputStream out = null;
        byte[] buffer = new byte[1024 * 8];
        try
        {
            // You do need to check the buffer as it will have crap in it on the last read
            out = new BufferedOutputStream(new FileOutputStream(file));
            while (in.available() > 0)
            {
                int total = in.read(buffer);
                out.write(buffer, 0, total);
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        finally
        {
            if (out != null)
                try
                {
                    out.flush();
                    out.close();
                }
                catch (IOException e)
                {
                    e.printStackTrace();
                }
        }
        System.out.println(System.currentTimeMillis() + " Start");
        System.out.println();
        System.out.println(Integer.valueOf(bytes[bytes.length - 1]) + " <- value");
        System.out.println("File size is-> " + file.length());
        System.out.println(System.currentTimeMillis() + " Stop");
    }
}
You could save it as a file, which is conceptually just a sequence of bytes.
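A minimal sketch of that idea (fetchRow and height are hypothetical; only one row of pixels is ever held in memory):

BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream("pixels.raw"));
try {
    for (int row = 0; row < height; row++) {
        out.write(fetchRow(row)); // fetchRow returns one row of pixel bytes from the database
    }
} finally {
    out.close();
}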