I am writing some text to disk as bytes. I need to maximize performance, so I want to write in complete pages.
Does anybody know the optimal page size in bytes when writing to disk?
If you use a BufferedWriter or buffered streams, you should be good. Java uses an 8K buffer by default, which is sufficient for most usage patterns. Is there anything specific about your use case (for example, fixed-length records that need to be written and fetched from disk in a single shot) that makes you want to optimize beyond what Java already provides?
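For illustration, a minimal sketch of plain buffered writing (the file name and content are placeholders):

    import java.io.BufferedWriter;
    import java.io.FileWriter;
    import java.io.IOException;

    public class BufferedWrite {
        public static void main(String[] args) throws IOException {
            // BufferedWriter's default buffer is 8192 chars, so the data
            // hits the disk in large blocks instead of one write per call.
            try (BufferedWriter out = new BufferedWriter(new FileWriter("out.txt"))) {
                for (int i = 0; i < 100_000; i++) {
                    out.write("some text line\n");
                }
            } // close() flushes whatever is still buffered
        }
    }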
We are emulating a p2p network in Java, so we divide the file into chunks (with checksums) so that the individual chunks can be reassembled into the original file once we have all the parts. What is the best way to store the individual parts while they are being downloaded?
I was thinking of just storing each chunk as a separate file... but if there are 20000 chunks, it would create as many files. Is this the best way?
Thanks
Either keep the chunks in memory or in files; there is not much else to discuss. Find the right balance between chunk count and chunk size to suit your needs.
Files sound more reasonable, as the data would not be totally lost if the application crashes, and the download could be resumed.
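A rough sketch of the file-per-chunk approach (class and file names are made up; a real implementation would also persist the checksums somewhere):

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;

    public class ChunkStore {
        private final Path dir;

        public ChunkStore(Path dir) throws IOException {
            // One directory holds all the partial downloads.
            this.dir = Files.createDirectories(dir);
        }

        // The chunk index in the file name lets the chunks be
        // reassembled in order once everything has arrived.
        public void saveChunk(int index, byte[] data) throws IOException {
            Files.write(dir.resolve(String.format("chunk-%05d.part", index)), data);
        }

        public boolean hasChunk(int index) {
            return Files.exists(dir.resolve(String.format("chunk-%05d.part", index)));
        }
    }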
I would write to memory until some threshold is reached, at which point you dump your memory to disk and keep reading into memory. When the file transfer completes, you can take what is currently stored in memory and concatenate it with what may have been stored on disk.
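Something along these lines, assuming a single scratch file and an arbitrary 16 MB threshold:

    import java.io.ByteArrayOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public class SpillingBuffer {
        private static final int THRESHOLD = 16 * 1024 * 1024; // arbitrary
        private final ByteArrayOutputStream memory = new ByteArrayOutputStream();
        private final OutputStream disk;

        public SpillingBuffer(String scratchFile) throws IOException {
            this.disk = new FileOutputStream(scratchFile);
        }

        public void append(byte[] chunk) throws IOException {
            memory.write(chunk);
            if (memory.size() >= THRESHOLD) {
                memory.writeTo(disk); // dump the in-memory data to disk
                memory.reset();       // and keep reading into memory
            }
        }

        public void finish() throws IOException {
            memory.writeTo(disk);     // append whatever is still in memory
            disk.close();
        }
    }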
I want to continuously write data to disk.
And I want to flush the data to disk frequently (for example, after every 64 MB chunk). What solution can you propose?
I think a standard OutputStream might be a better choice than nio.channels because it is more straightforward.
If you are writing a continuous stream of data, for example appending to the end of a file, a regular OutputStream with flush() called once in a while is just as good as or better than nio. Where nio could give you a big advantage is writing many small chunks spread over different regions of a file; in that case you could use a memory-mapped file, and that could be an improvement over old-style writes. However, from the question I understand you are dealing with a continuous stream of data. I suggest you implement the regular solution, which gives you the code you find nicer, and only look for alternatives if you find performance lacking. In this case I wouldn't expect nio to make a noticeable difference.
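A minimal sketch of that regular solution (file name, chunk source, and loop bounds are placeholders):

    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStream;

    public class PeriodicFlushWriter {
        public static void main(String[] args) throws IOException {
            final long FLUSH_EVERY = 64L * 1024 * 1024; // 64 MB, as in the question
            long sinceFlush = 0;
            byte[] chunk = new byte[8192]; // stand-in for the real incoming data

            // Append mode, so the stream keeps adding to the end of the file.
            try (OutputStream out = new BufferedOutputStream(
                    new FileOutputStream("data.bin", true))) {
                for (int i = 0; i < 100_000; i++) {
                    out.write(chunk);
                    sinceFlush += chunk.length;
                    if (sinceFlush >= FLUSH_EVERY) {
                        out.flush(); // hand the buffered bytes to the OS
                        sinceFlush = 0;
                    }
                }
            }
        }
    }

Keep in mind that flush() only pushes the buffered bytes to the operating system; if the data must be physically on the platters, you'd need FileChannel.force() or FileDescriptor.sync() instead.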
I'm creating a compression algorithm in Java;
to use my algorithm I require a lot of information about the structure of the target file.
After collecting the data, I need to reread the file. <- But I don't want to.
While rereading the file, I make it a good target for compression by 'converting' its data to a rather peculiar format; then I compress it.
The problems now are:
I don't want to open a new FileInputStream for rereading the file.
I don't want to save the converted file, which is usually 150% the size of the target file, to the disk.
Are there any ways to 'reset' a FileInputStream so it moves back to the start of the file, and how would I store the huge amount of 'converted' data efficiently without writing it to disk?
You can use one or more RandomAccessFiles. You can memory-map them as ByteBuffers, which don't consume heap (the buffer objects themselves use only about 128 bytes each) or direct memory, but can be accessed randomly.
Your temporary data can be stored in direct ByteBuffers or in more memory-mapped files. Since you have random access to the original data, you may not need to duplicate as much of it in memory as you think.
This way you can access the whole data set with just a few KB of heap.
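A minimal sketch of the mapping idea (the file name is a placeholder; note that a single mapping is limited to 2 GB, so a larger file needs several):

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MappedRead {
        public static void main(String[] args) throws IOException {
            try (RandomAccessFile raf = new RandomAccessFile("target.dat", "r");
                 FileChannel ch = raf.getChannel()) {
                // The mapped region lives outside the heap; only the small
                // MappedByteBuffer object itself occupies heap memory.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());

                // First pass: gather statistics about the file structure.
                // ... buf.get(i) gives random access to any position ...

                // "Rewinding" costs nothing: just reset the buffer's position.
                buf.position(0);
                // Second pass: convert and compress.
            }
        }
    }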
There's the mark/reset mechanism, but you need to wrap the FileInputStream in a BufferedInputStream, since FileInputStream itself doesn't support it.
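For example (with the caveat that BufferedInputStream keeps everything read since mark() in its internal buffer, so rereading a whole large file this way costs about as much memory as reading it in at once):

    import java.io.BufferedInputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;

    public class MarkResetDemo {
        public static void main(String[] args) throws IOException {
            try (InputStream in = new BufferedInputStream(new FileInputStream("target.dat"))) {
                // The readlimit must cover every byte read before reset().
                in.mark(Integer.MAX_VALUE);
                // ... first pass over the stream ...
                in.reset(); // back to the marked position, i.e. the start
                // ... second pass ...
            }
        }
    }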
You could use RandomAccessFile, or maybe a java.nio ByteBuffer is what you are looking for (I'm not sure).
Resources might be saved by using pipes/streams: writing immediately to a compressing stream.
To answer your question on reset: it's not possible. The base class InputStream has provisions for mark and reset-to-mark, but FileInputStream does not support them; it was made optimal for several operating systems and does purely sequential input. Closing and reopening the file is your best bet.
For content type "text/plain", which of the following is more efficient if I have to send huge data?
ServletOutputStream sos = response.getOutputStream();
sos.write(byte[])
//or
sos.println("")
Thanks
That depends on the format you have your source data in.
If it's a String, you're likely going to get better performance using response.getWriter().print() - and it's most certainly going to be safer as far as encoding is concerned.
If it's a byte array then ServletOutputStream.write(byte[]) is likely the fastest as it won't do any additional conversions.
The real answer, however, to this and all other "which is faster" questions is - measure it :-)
After quickly looking at Sun's implementation of both OutputStream.write(byte[]) and ServletOutputStream.println(String), I'd say there's no real difference. But as ChssPly76 put it, that can only be verified by measuring it.
The most efficient is writing from an InputStream (which is NOT a ByteArrayInputStream).
Simply because each byte of a byte[] eats exactly one byte of the JVM's memory, and each character of a String eats two bytes of it. So imagine you have 128MB of available heap memory, the "huge" text/plain file is 1.28MB in size, and 100 users concurrently request the file: your application will crash with an OutOfMemoryError. Not really professional.
Have the "huge" data somewhere in a database or on the disk file system, obtain it as an InputStream the "default way" (i.e. from a DB by ResultSet#getBinaryStream() or from disk by FileInputStream), and write it to the OutputStream through a byte buffer and/or a BufferedInputStream/BufferedOutputStream.
An example of such a servlet can be found here.
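Roughly along these lines (the resource path and buffer size are placeholders):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    import javax.servlet.ServletException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Streams a file from disk in 10 KB blocks, so heap usage stays constant
    // regardless of the file size or the number of concurrent requests.
    public class FileServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest request, HttpServletResponse response)
                throws ServletException, IOException {
            response.setContentType("text/plain");
            try (InputStream in = getServletContext().getResourceAsStream("/WEB-INF/huge.txt");
                 OutputStream out = response.getOutputStream()) {
                byte[] buffer = new byte[10240];
                int length;
                while ((length = in.read(buffer)) > 0) {
                    out.write(buffer, 0, length);
                }
            }
        }
    }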
Good luck.
What is the best way to change a single byte in a file using Java? I've implemented this in several ways. One uses pure byte-array manipulation, but it is highly sensitive to the amount of available memory and doesn't scale past 50 MB or so (i.e. I can't allocate 100MB worth of byte[] without getting OutOfMemoryErrors). I also implemented it another way which works and scales, but it feels quite hacky.
If you're a Java IO guru and you had to contend with very large files (200-500MB), how might you approach this?
Thanks!
I'd use RandomAccessFile, seek to the position I wanted to change and write the change.
If all I wanted to do was change a single byte, I wouldn't bother reading the entire file into memory. I'd use a RandomAccessFile, seek to the byte in question, write it, and close the file.
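In sketch form (the path and offset are whatever your application needs):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    public class PatchByte {
        // Overwrites the single byte at `offset` without touching the rest
        // of the file; memory use is constant regardless of file size.
        static void setByte(String path, long offset, byte value) throws IOException {
            try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
                raf.seek(offset);
                raf.write(value);
            }
        }
    }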