Create file with given size in Java - java

Is there an efficient way to create a file with a given size in Java?
In C it can be done with ftruncate (see that answer).
Most people would just write n dummy bytes into the file, but there must be a faster way. I'm thinking of ftruncate and also of Sparse files…

Create a new RandomAccessFile and call the setLength method, specifying the desired file length. The underlying JRE implementation should use the most efficient method available in your environment.
The following program
import java.io.*;
class Test {
public static void main(String args[]) throws Exception {
RandomAccessFile f = new RandomAccessFile("t", "rw");
f.setLength(1024 * 1024 * 1024);
}
}
on a Linux machine will allocate the space using the ftruncate(2)
6070 open("t", O_RDWR|O_CREAT, 0666) = 4
6070 fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
6070 lseek(4, 0, SEEK_CUR) = 0
6070 ftruncate(4, 1073741824) = 0
while on a Solaris machine it will use the the F_FREESP64 function of the fcntl(2) system call.
/2: open64("t", O_RDWR|O_CREAT, 0666) = 14
/2: fstat64(14, 0xFE4FF810) = 0
/2: llseek(14, 0, SEEK_CUR) = 0
/2: fcntl(14, F_FREESP64, 0xFE4FF998) = 0
In both cases this will result in the creation of a sparse file.

Since Java 8, this method works on Linux and Windows :
final ByteBuffer buf = ByteBuffer.allocate(4).putInt(2);
buf.rewind();
final OpenOption[] options = { StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW , StandardOpenOption.SPARSE };
final Path hugeFile = Paths.get("hugefile.txt");
try (final SeekableByteChannel channel = Files.newByteChannel(hugeFile, options);) {
channel.position(HUGE_FILE_SIZE);
channel.write(buf);
}

You can open the file for writing, seek to offset (n-1), and write a single byte. The OS will automatically extend the file to the desired number of bytes.

Related

Parsing files over 2.15 GB in Java using Kaitai Struct

I'm parsing large PCAP files in Java using Kaitai-Struct. Whenever the file size exceeds Integer.MAX_VALUE bytes I face an IllegalArgumentException caused by the size limit of the underlying ByteBuffer.
I haven't found references to this issue elsewhere, which leads me to believe that this is not a library limitation but a mistake in the way I'm using it.
Since the problem is caused by trying to map the whole file into the ByteBuffer I'd think that the solution would be mapping only the first region of the file, and as the data is being consumed map again skipping the data already parsed.
As this is done within the Kaitai Struct Runtime library it would mean to write my own class extending fom KatiaiStream and overwrite the auto-generated fromFile(...) method, and this doesn't really seem the right approach.
The auto-generated method to parse from file for the PCAP class is.
public static Pcap fromFile(String fileName) throws IOException {
return new Pcap(new ByteBufferKaitaiStream(fileName));
}
And the ByteBufferKaitaiStream provided by the Kaitai Struct Runtime library is backed by a ByteBuffer.
private final FileChannel fc;
private final ByteBuffer bb;
public ByteBufferKaitaiStream(String fileName) throws IOException {
fc = FileChannel.open(Paths.get(fileName), StandardOpenOption.READ);
bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
}
Which in turn is limitted by the ByteBuffer max size.
Am I missing some obvious workaround? Is it really a limitation of the implementation of Katiati Struct in Java?
There are two separate issues here:
Running Pcap.fromFile() for large files is generally not a very efficient method, as you'll eventually get all files parsed into memory array at once. A example on how to avoid that is given in kaitai_struct/issues/255. The basic idea is that you'd want to have control over how you read every packet, and then dispose of every packet after you've parsed / accounted it somehow.
2GB limit on Java's mmaped files. To mitigate that, you can use alternative RandomAccessFile-based KaitaiStream implementation: RandomAccessFileKaitaiStream — it might be slower, but it should avoid that 2GB problem.
This library provides a ByteBuffer implementation which uses long offset. I haven't tried this approach but looks promising. See section Mapping Files Bigger than 2 GB
http://www.kdgregory.com/index.php?page=java.byteBuffer
public int getInt(long index)
{
return buffer(index).getInt();
}
private ByteBuffer buffer(long index)
{
ByteBuffer buf = _buffers[(int)(index / _segmentSize)];
buf.position((int)(index % _segmentSize));
return buf;
}
public MappedFileBuffer(File file, int segmentSize, boolean readWrite)
throws IOException
{
if (segmentSize > MAX_SEGMENT_SIZE)
throw new IllegalArgumentException(
"segment size too large (max " + MAX_SEGMENT_SIZE + "): " + segmentSize);
_segmentSize = segmentSize;
_fileSize = file.length();
RandomAccessFile mappedFile = null;
try
{
String mode = readWrite ? "rw" : "r";
MapMode mapMode = readWrite ? MapMode.READ_WRITE : MapMode.READ_ONLY;
mappedFile = new RandomAccessFile(file, mode);
FileChannel channel = mappedFile.getChannel();
_buffers = new MappedByteBuffer[(int)(_fileSize / segmentSize) + 1];
int bufIdx = 0;
for (long offset = 0 ; offset < _fileSize ; offset += segmentSize)
{
long remainingFileSize = _fileSize - offset;
long thisSegmentSize = Math.min(2L * segmentSize, remainingFileSize);
_buffers[bufIdx++] = channel.map(mapMode, offset, thisSegmentSize);
}
}
finally
{
// close quietly
if (mappedFile != null)
{
try
{
mappedFile.close();
}
catch (IOException ignored) { /* */ }
}
}
}

Is there any way to make image compression and saving faster on Android?

The situation
I should show 200-350 frames animation in my application. Images have 500x300ish resolution. If user wants to share animation, i have to convert it to Video. For convertion i am using ffmpeg command.
ffmpeg -y -r 1 -i /sdcard/videokit/pic00%d.jpg -i /sdcard/videokit/in.mp3 -strict experimental -ar 44100 -ac 2 -ab 256k -b 2097152 -ar 22050 -vcodec mpeg4 -b 2097152 -s 320x240 /sdcard/videokit/out.mp4
To convert images to video ffmpeg wants actual files not Bitmap or byte[].
Problem
Compressing bitmaps to image files taking to much time. 210 image convertion takes about 1 minute to finish on average device(HTC ONE m7). Converting image files to mp4 takes about 15 seconds on the same device. All together user have to wait about 1.5 minutes.
What i have tried
I changed comrpession format form PNG to JPEG(1.5 minute result is
achieved with JPEG compression(quality=80),with PNG it takes about
2-2.5 minutes) success
Tried to find how pass byte[] or bitmap to ffmpeg - no succes.
QUESTION
Is there any way(library (even native)) to make saving process faster.
Is there any way to pass byte[] or Bitmap objects (i mean png file decompressed to Android Bitmap Class Object) to ffmpeg library video creating method
Is there any other working library which will create mp4(or any supported format(supported by main Social Networks)) from byte[] or Bitmap objects in about 30 seconds(for 200 frames).
You can convert Bitmap (or byte[]) to YUV format quickly, using renderscript (see https://stackoverflow.com/a/39877029/192373). You can pass these YUV frames to ffmpeg library (as suggests halfelf), or use the built-in native MediaCodec which uses dedicated hardware on modt devices (but compression options are less flexible than all-software ffmpeg).
There are two steps slow us down. Compressing image to PNG/JPG and writing them to disk. Both can be skipped if we directly code against ffmpeg libs, instead of calling ffmpeg command. (There are other improvements too, such like GPU encoding and multithreading, but much more complicated.)
Some approaches to code:
Only use C/C++ NDK for android programming. FFmpeg will happily work. But I guess it's not an option here.
Build it from scratch by Java JNI. Not much experience here. I only know this could link java to c/c++ libs.
Some java wrapper. Luckily I found javacpp-presets. (There are others too, but this one is good enough and up to date.)
This library includes a good example ported from famous dranger's ffmpeg tutorial, though it is a demuxing one.
We can try to write a muxing one, following ffmpeg's muxing.c example.
import java.io.*;
import org.bytedeco.javacpp.*;
import static org.bytedeco.javacpp.avcodec.*;
import static org.bytedeco.javacpp.avformat.*;
import static org.bytedeco.javacpp.avutil.*;
import static org.bytedeco.javacpp.swscale.*;
public class Muxer {
public class OutputStream {
public AVStream Stream;
public AVCodecContext Ctx;
public AVFrame Frame;
public SwsContext SwsCtx;
public void setStream(AVStream s) {
this.Stream = s;
}
public AVStream getStream() {
return this.Stream;
}
public void setCodecCtx(AVCodecContext c) {
this.Ctx = c;
}
public AVCodecContext getCodecCtx() {
return this.Ctx;
}
public void setFrame(AVFrame f) {
this.Frame = f;
}
public AVFrame getFrame() {
return this.Frame;
}
public OutputStream() {
Stream = null;
Ctx = null;
Frame = null;
SwsCtx = null;
}
}
public static void main(String[] args) throws IOException {
Muxer t = new Muxer();
OutputStream VideoSt = t.new OutputStream();
AVOutputFormat Fmt = null;
AVFormatContext FmtCtx = new AVFormatContext(null);
AVCodec VideoCodec = null;
AVDictionary Opt = null;
SwsContext SwsCtx = null;
AVPacket Pkt = new AVPacket();
int GotOutput;
int InLineSize[] = new int[1];
String FilePath = "/path/xxx.mp4";
avformat_alloc_output_context2(FmtCtx, null, null, FilePath);
Fmt = FmtCtx.oformat();
AVCodec codec = avcodec_find_encoder_by_name("libx264");
av_format_set_video_codec(FmtCtx, codec);
VideoCodec = avcodec_find_encoder(Fmt.video_codec());
VideoSt.setStream(avformat_new_stream(FmtCtx, null));
AVStream stream = VideoSt.getStream();
VideoSt.getStream().id(FmtCtx.nb_streams() - 1);
VideoSt.setCodecCtx(avcodec_alloc_context3(VideoCodec));
VideoSt.getCodecCtx().codec_id(Fmt.video_codec());
VideoSt.getCodecCtx().bit_rate(5120000);
VideoSt.getCodecCtx().width(1920);
VideoSt.getCodecCtx().height(1080);
AVRational fps = new AVRational();
fps.den(25); fps.num(1);
VideoSt.getStream().time_base(fps);
VideoSt.getCodecCtx().time_base(fps);
VideoSt.getCodecCtx().gop_size(10);
VideoSt.getCodecCtx().max_b_frames();
VideoSt.getCodecCtx().pix_fmt(AV_PIX_FMT_YUV420P);
if ((FmtCtx.oformat().flags() & AVFMT_GLOBALHEADER) != 0)
VideoSt.getCodecCtx().flags(VideoSt.getCodecCtx().flags() | AV_CODEC_FLAG_GLOBAL_HEADER);
avcodec_open2(VideoSt.getCodecCtx(), VideoCodec, Opt);
VideoSt.setFrame(av_frame_alloc());
VideoSt.getFrame().format(VideoSt.getCodecCtx().pix_fmt());
VideoSt.getFrame().width(1920);
VideoSt.getFrame().height(1080);
av_frame_get_buffer(VideoSt.getFrame(), 32);
// should be at least Long or even BigInteger
// it is a unsigned long in C
int nextpts = 0;
av_dump_format(FmtCtx, 0, FilePath, 1);
avio_open(FmtCtx.pb(), FilePath, AVIO_FLAG_WRITE);
avformat_write_header(FmtCtx, Opt);
int[] got_output = { 0 };
while (still_has_input) {
// convert or directly copy your Bytes[] into VideoSt.Frame here
// AVFrame structure has two important data fields:
// AVFrame.data (uint8_t*[]) and AVFrame.linesize (int[])
// data includes pixel values in some formats and linesize is size of each picture line.
// For example, if formats is RGB. linesize should has 3 valid values equaling to `image_width * 3`. And data will point to three arrays containing rgb values.
// But I guess we'll need swscale() to convert pixel format here. From RGB to yuv420p (or other yuv family formats).
Pkt = new AVPacket();
av_init_packet(Pkt);
VideoSt.getFrame().pts(nextpts++);
avcodec_encode_video2(VideoSt.getCodecCtx(), Pkt, VideoSt.getFrame(), got_output);
av_packet_rescale_ts(Pkt, VideoSt.getCodecCtx().time_base(), VideoSt.getStream().time_base());
Pkt.stream_index(VideoSt.getStream().index());
av_interleaved_write_frame(FmtCtx, Pkt);
av_packet_unref(Pkt);
}
// get delayed frames
for (got_output[0] = 1; got_output[0] != 0;) {
Pkt = new AVPacket();
av_init_packet(Pkt);
avcodec_encode_video2(VideoSt.getCodecCtx(), Pkt, null, got_output);
if (got_output[0] > 0) {
av_packet_rescale_ts(Pkt, VideoSt.getCodecCtx().time_base(), VideoSt.getStream().time_base());
Pkt.stream_index(VideoSt.getStream().index());
av_interleaved_write_frame(FmtCtx, Pkt);
}
av_packet_unref(Pkt);
}
// free c structs
avcodec_free_context(VideoSt.getCodecCtx());
av_frame_free(VideoSt.getFrame());
avio_closep(FmtCtx.pb());
avformat_free_context(FmtCtx);
}
}
For porting C code, normally several changes should be done:
Mostly the work is to replace every C struct member access (. and ->) to java getter/setter.
Also there are many C address-of operators &, just delete them.
Change C NULL macro and C++ nullptr pointer to Java null object.
C codes used to check bool result of an int type in if, for, while. Have to compare them with 0 in java.
And there may be other API changes, as long as referencing to javacpp-presets docs, it'll be ok.
Note that I omitted all error handling codes here. It may be needed in real development/production.
Really I don't want to make publicity but to use pkzip and its SDK may be a good
solution. Pkzip compress file to 95% as they say.
The Smartcrypt SDK is available in all major programming languages, including C++, Java, and C#, and can be used to encrypt both structured and unstructured data. Changes to existing applications typically consist of two or three lines of code.

Java―good size for a char buffer

I am coding a little java based tool to process mysqldump files, which can become quite large (up to a gigabyte for now). I am using this code to read and process the file:
BufferedReader reader = getReader();
BufferedWriter writer = getWriter();
char[] charBuffer = new char[CHAR_BUFFER_SIZE];
int readCharCout;
StringBuffer buffer = new StringBuffer();
while( ( readCharCout = reader.read( charBuffer ) ) > 0 )
{
buffer.append( charBuffer, 0, readCharCout );
//processing goes here
}
What is a good size for the charBuffer? At the moment it is set to 1000, but my code will run with an arbitrary size, so what is best practice or can this size be calculated depending on the file size?
Thanks in ahead,
greetings philipp
It should always be a power of 2. The optimal value is based on the OS and disk format. In code I've seen 4096 is often used, but the bigger the better.
Also, there are better ways to load a file into memory.

How do you write to disk (with flushing) in Java and maintain performance?

Using the following code as a benchmark, the system can write 10,000 rows to disk in a fraction of a second:
void withSync() {
int f = open( "/tmp/t8" , O_RDWR | O_CREAT );
lseek (f, 0, SEEK_SET );
int records = 10*1000;
clock_t ustart = clock();
for(int i = 0; i < records; i++) {
write(f, "012345678901234567890123456789" , 30);
fsync(f);
}
clock_t uend = clock();
close (f);
printf(" sync() seconds:%lf writes per second:%lf\n", ((double)(uend-ustart))/(CLOCKS_PER_SEC), ((double)records)/((double)(uend-ustart))/(CLOCKS_PER_SEC));
}
In the above code, 10,000 records can be written and flushed out to disk in a fraction of a second, output below:
sync() seconds:0.006268 writes per second:0.000002
In the Java version, it takes over 4 seconds to write 10,000 records. Is this just a limitation of Java, or am I missing something?
public void testFileChannel() throws IOException {
RandomAccessFile raf = new RandomAccessFile(new File("/tmp/t5"),"rw");
FileChannel c = raf.getChannel();
c.force(true);
ByteBuffer b = ByteBuffer.allocateDirect(64*1024);
long s = System.currentTimeMillis();
for(int i=0;i<10000;i++){
b.clear();
b.put("012345678901234567890123456789".getBytes());
b.flip();
c.write(b);
c.force(false);
}
long e=System.currentTimeMillis();
raf.close();
System.out.println("With flush "+(e-s));
}
Returns this:
With flush 4263
Please help me understand what is the correct/fastest way to write records to disk in Java.
Note: I am using the RandomAccessFile class in combination with a ByteBuffer as ultimately we need random read/write access on this file.
Actually, I am surprised that test is not slower. The behavior of force is OS dependent but broadly it forces the data to disk. If you have an SSD you might achieve 40K writes per second, but with an HDD you won't. In the C example its clearly isn't committing the data to disk as even the fastest SSD cannot perform more than 235K IOPS (That the manufacturers guarantee it won't go faster than that :D )
If you need the data committed to disk every time, you can expect it to be slow and entirely dependent on the speed of your hardware. If you just need the data flushed to the OS and if the program crashes but the OS does not, you will not loose any data, you can write data without force. A faster option is to use memory mapped files. This will give you random access without a system call for each record.
I have a library Java Chronicle which can read/write 5-20 millions records per second with a latency of 80 ns in text or binary formats with random access and can be shared between processes. This only works this fast because it is not committing the data to disk on every record, but you can test that if the JVM crashes at any point, no data written to the chronicle is lost.
This code is more similar to what you wrote in C. Takes only 5 msec on my machine. If you really need to flush after every write, it takes about 60 msec. Your original code took about 11 seconds on this machine. BTW, closing the output stream also flushes.
public static void testFileOutputStream() throws IOException {
OutputStream os = new BufferedOutputStream( new FileOutputStream( "/tmp/fos" ) );
byte[] bytes = "012345678901234567890123456789".getBytes();
long s = System.nanoTime();
for ( int i = 0; i < 10000; i++ ) {
os.write( bytes );
}
long e = System.nanoTime();
os.close();
System.out.println( "outputstream " + ( e - s ) / 1e6 );
}
Java equivalent of fputs is file.write("012345678901234567890123456789"); , you are calling 4 functions and just 1 in C, delay seems obvious
i think this is most similar to your C version. i think the direct buffers in your java example are causing many more buffer copies than the C version. this takes about 2.2s on my (old) box.
public static void testFileChannelSimple() throws IOException {
RandomAccessFile raf = new RandomAccessFile(new File("/tmp/t5"),"rw");
FileChannel c = raf.getChannel();
c.force(true);
byte[] bytes = "012345678901234567890123456789".getBytes();
long s = System.currentTimeMillis();
for(int i=0;i<10000;i++){
raf.write(bytes);
c.force(true);
}
long e=System.currentTimeMillis();
raf.close();
System.out.println("With flush "+(e-s));
}

java.lang.OutOfMemoryError when doing javax.imageio.ImageIO.read("filename")

I want to compress the "JPG" images,which are about 4M or more.here is my codes:
public static void Compress(String sourceFolder,String destFolder,double proportion) throws IOException
{
File source=new File(sourceFolder);
File[] sourceFiles=null;
if(source.isDirectory())
{
sourceFiles=source.listFiles();
for(int i=0;i<sourceFiles.length;i++)
{
String name="";
javax.imageio.ImageIO.setUseCache(false);
Image src = javax.imageio.ImageIO.read(sourceFiles[i]);
name=sourceFiles[i].getName();
int width=src.getWidth(null);
int height=src.getHeight(null);
destWidth=(int) (height*proportion);
destHeight=(int) (width*proportion);
BufferedImage tag=new BufferedImage(destWidth,destHeight,BufferedImage.TYPE_INT_RGB);
Graphics g = tag.getGraphics();
g.drawImage(src, 0, 0, destWidth, destHeight, null);
src.flush();
src=null;
FileOutputStream out = new FileOutputStream(destFolder+"/"+name);
JPEGImageEncoder encoder = JPEGCodec.createJPEGEncoder(out);
encoder.encode(tag);
out.close();
}
}
else
System.exit(0);
}
When it runs
Image src = javax.imageio.ImageIO.read("filename");
Exception occured:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.awt.image.DataBufferByte.<init>(DataBufferByte.java:58)
at java.awt.image.ComponentSampleModel.createDataBuffer(ComponentSampleModel.java:397)
at java.awt.image.Raster.createWritableRaster(Raster.java:938)
at javax.imageio.ImageTypeSpecifier.createBufferedImage(ImageTypeSpecifier.java:1056)
at javax.imageio.ImageReader.getDestination(ImageReader.java:2879)
at com.sun.imageio.plugins.jpeg.JPEGImageReader.readInternal(JPEGImageReader.java:943)
at com.sun.imageio.plugins.jpeg.JPEGImageReader.read(JPEGImageReader.java:915)
at javax.imageio.ImageIO.read(ImageIO.java:1422)
at javax.imageio.ImageIO.read(ImageIO.java:1282)
at functions.CompressImage.Compress(CompressImage.java:50)
at functions.CompressImage.main(CompressImage.java:24)
I tried the run arguments(-Xms=1g),it still doesn't work!
Who knows the solution? Please help me,thank you!
you need to get heap dump and analyze it. so the simplest way is to add JVM params like
-XX:+HeapDumpOnOutOfMemoryError
This will automatically create heap dump/ Later you can analyze what's wrong using java profilers(yourkit,jprofiler, etc)
A 4MB JPG will result in a huge BitMap File. I think, it will just need a lot of memory. I ofte read about large memory-sonsumption in javax.imagio.
To get the bitmap size, calculate image_X * image_Y * (8 to 10 bit * 3(colors))
update
Some Math:
I assume 8bit per colorchannel:
7000 * 4900 * 8 * 3 = 1029000000 bit
= 122MB
I beleave, there has to be a byte[] of 122MB within the Memory. If the operating system (not the JVM) can't create that memory block, you'll get that exception.

Categories

Resources