Testing disk performance: results differ with and without Java

I've been asked to measure current disk performance, as we are planning to replace local disk with network attached storage on our application servers. Since our applications which write data are written in Java, I thought I would measure the performance directly in Linux, and also using a simple Java test. However I'm getting significantly different results, particularly for reading data, using what appear to me to be similar tests. Directly in Linux I'm doing:
dd if=/dev/zero of=/data/cache/test bs=1048576 count=8192
dd if=/data/cache/test of=/dev/null bs=1048576 count=8192
My Java test looks like this:
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class TestDiskSpeed {
private byte[] oneMB = new byte[1024 * 1024];
public static void main(String[] args) throws IOException {
new TestDiskSpeed().execute(args);
}
private void execute(String[] args) throws IOException {
long size = Long.parseLong(args[1]);
testWriteSpeed(args[0], size);
testReadSpeed(args[0], size);
}
private void testWriteSpeed(String filePath, long size) throws IOException {
File file = new File(filePath);
BufferedOutputStream writer = null;
long start = System.currentTimeMillis();
try {
writer = new BufferedOutputStream(new FileOutputStream(file), 1024 * 1024);
for (int i = 0; i < size; i++) {
writer.write(oneMB);
}
writer.flush();
} finally {
if (writer != null) {
writer.close();
}
}
long elapsed = System.currentTimeMillis() - start;
String message = "Wrote " + size + "MB in " + elapsed + "ms at a speed of " + calculateSpeed(size, elapsed) + "MB/s";
System.out.println(message);
}
private void testReadSpeed(String filePath, long size) throws IOException {
File file = new File(filePath);
BufferedInputStream reader = null;
long start = System.currentTimeMillis();
try {
reader = new BufferedInputStream(new FileInputStream(file), 1024 * 1024);
for (int i = 0; i < size; i++) {
reader.read(oneMB);
}
} finally {
if (reader != null) {
reader.close();
}
}
long elapsed = System.currentTimeMillis() - start;
String message = "Read " + size + "MB in " + elapsed + "ms at a speed of " + calculateSpeed(size, elapsed) + "MB/s";
System.out.println(message);
}
private double calculateSpeed(long size, long elapsed) {
double seconds = ((double) elapsed) / 1000L;
double speed = ((double) size) / seconds;
return speed;
}
}
This is being invoked with "java TestDiskSpeed /data/cache/test 8192"
Both of these should be creating 8GB files of zeros, 1MB at a time, measuring the speed, and then reading it back and measuring again. Yet the speeds I'm consistently getting are:
Linux: write - ~650MB/s
Linux: read - ~4.2GB/s
Java: write - ~500MB/s
Java: read - ~1.9GB/s
Can anyone explain the large discrepancy?

When I run the test below using NIO on my system (Ubuntu 15.04 with an i7-3970X):
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class Main {
static final int SIZE_GB = Integer.getInteger("sizeGB", 8);
static final int BLOCK_SIZE = 64 * 1024;
public static void main(String[] args) throws IOException {
ByteBuffer buffer = ByteBuffer.allocateDirect(BLOCK_SIZE);
File tmp = File.createTempFile("delete", "me");
tmp.deleteOnExit();
int blocks = (int) (((long) SIZE_GB << 30) / BLOCK_SIZE);
long start = System.nanoTime();
try (FileChannel fc = new FileOutputStream(tmp).getChannel()) {
for (int i = 0; i < blocks; i++) {
buffer.clear();
while (buffer.remaining() > 0)
fc.write(buffer);
}
}
long mid = System.nanoTime();
try (FileChannel fc = new FileInputStream(tmp).getChannel()) {
for (int i = 0; i < blocks; i++) {
buffer.clear();
while (buffer.remaining() > 0)
fc.read(buffer);
}
}
long end = System.nanoTime();
long size = tmp.length();
System.out.printf("Write speed %.1f GB/s, read Speed %.1f GB/s%n",
(double) size/(mid-start), (double) size/(end-mid));
}
}
prints
Write speed 3.8 GB/s, read Speed 6.8 GB/s

You may get better performance if you drop the BufferedXxxStream. It's not helping, since you're doing 1 MB reads/writes, and it causes an extra memory copy of the data.
Better yet, you should be using the NIO classes instead of the regular IO classes.
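For illustration, here is a minimal sketch of the write half of the question's test with the BufferedOutputStream dropped, so each 1 MB block goes straight to the FileOutputStream (the path and size arguments follow the question's example invocation):
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch only: the question's write test without the buffering layer.
public class TestUnbufferedWrite {
    private static final byte[] ONE_MB = new byte[1024 * 1024];

    public static void main(String[] args) throws IOException {
        File file = new File(args[0]);          // e.g. /data/cache/test
        long sizeMB = Long.parseLong(args[1]);  // e.g. 8192
        long start = System.currentTimeMillis();
        try (FileOutputStream out = new FileOutputStream(file)) {
            for (long i = 0; i < sizeMB; i++) {
                out.write(ONE_MB);              // each 1 MB block goes straight to the OS
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Wrote " + sizeMB + " MB in " + elapsed + " ms");
    }
}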
try-finally
You should clean up your try-finally code.
// Original code
BufferedOutputStream writer = null;
try {
writer = new ...;
// use writer
} finally {
if (writer != null) {
writer.close();
}
}
// Cleaner code
BufferedOutputStream writer = new ...;
try {
// use writer
} finally {
writer.close();
}
// Even cleaner, using try-with-resources (since Java 7)
try (BufferedOutputStream writer = new ...) {
// use writer
}
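Applied to the question's code, the read test with try-with-resources might look like this sketch; oneMB and calculateSpeed are the field and method from the TestDiskSpeed class above:
// Sketch: the question's testReadSpeed rewritten with try-with-resources (Java 7+).
private void testReadSpeed(String filePath, long size) throws IOException {
    File file = new File(filePath);
    long start = System.currentTimeMillis();
    try (BufferedInputStream reader =
            new BufferedInputStream(new FileInputStream(file), 1024 * 1024)) {
        for (long i = 0; i < size; i++) {
            reader.read(oneMB); // return value ignored, as in the original
        }
    }
    long elapsed = System.currentTimeMillis() - start;
    System.out.println("Read " + size + "MB in " + elapsed + "ms at a speed of "
            + calculateSpeed(size, elapsed) + "MB/s");
}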

To complement Peter's great answer, I am adding the code below. It compares head-to-head the performance of the good-old java.io with NIO. Unlike Peter, instead of just reading data into a direct buffer, I do a typical thing with it: transfer it into an on-heap byte array. This steals surprisingly little from the performance: where I was getting 7.5 GB/s with Peter's code, here I get 6.0 GB/s.
For the java.io approach I can't have a direct buffer, but instead I call the read method directly with my target on-heap byte array. Note that this array is smallish and has an awkward size of 555 bytes. Nevertheless I retrieve almost identical performance: 5.6 GB/s. The difference is so small that it would evaporate completely in normal usage, and even in this artificial scenario if I wasn't reading directly from the disk cache.
As a bonus, I include at the bottom a method which can be used on Linux and Mac to purge the disk caches. You'll see a dramatic change in performance if you call it between the write and the read steps.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public final class MeasureIOPerformance {
static final int SIZE_GB = Integer.getInteger("sizeGB", 8);
static final int BLOCK_SIZE = 64 * 1024;
static final int blocks = (int) (((long) SIZE_GB << 30) / BLOCK_SIZE);
static final byte[] acceptBuffer = new byte[555];
public static void main(String[] args) throws IOException {
for (int i = 0; i < 3; i++) {
measure(new ChannelRw());
measure(new StreamRw());
}
}
private static void measure(RW rw) throws IOException {
File file = File.createTempFile("delete", "me");
file.deleteOnExit();
System.out.println("Writing " + SIZE_GB + " GB " + " with " + rw);
long start = System.nanoTime();
rw.write(file);
long mid = System.nanoTime();
System.out.println("Reading " + SIZE_GB + " GB " + " with " + rw);
long checksum = rw.read(file);
long end = System.nanoTime();
long size = file.length();
System.out.printf("Write speed %.1f GB/s, read Speed %.1f GB/s%n",
(double) size/(mid-start), (double) size/(end-mid));
System.out.println(checksum);
file.delete();
}
interface RW {
void write(File f) throws IOException;
long read(File f) throws IOException;
}
static class ChannelRw implements RW {
final ByteBuffer directBuffer = ByteBuffer.allocateDirect(BLOCK_SIZE);
@Override public String toString() {
return "Channel";
}
@Override public void write(File f) throws IOException {
FileChannel fc = new FileOutputStream(f).getChannel();
try {
for (int i = 0; i < blocks; i++) {
directBuffer.clear();
while (directBuffer.remaining() > 0) {
fc.write(directBuffer);
}
}
} finally {
fc.close();
}
}
@Override public long read(File f) throws IOException {
ByteBuffer buffer = ByteBuffer.allocateDirect(BLOCK_SIZE);
FileChannel fc = new FileInputStream(f).getChannel();
long checksum = 0;
try {
for (int i = 0; i < blocks; i++) {
buffer.clear();
while (buffer.hasRemaining()) {
fc.read(buffer);
}
buffer.flip();
while (buffer.hasRemaining()) {
buffer.get(acceptBuffer, 0, Math.min(acceptBuffer.length, buffer.remaining()));
checksum += acceptBuffer[acceptBuffer[0]];
}
}
} finally {
fc.close();
}
return checksum;
}
}
static class StreamRw implements RW {
final byte[] buffer = new byte[BLOCK_SIZE];
@Override public String toString() {
return "Stream";
}
@Override public void write(File f) throws IOException {
FileOutputStream out = new FileOutputStream(f);
try {
for (int i = 0; i < blocks; i++) {
out.write(buffer);
}
} finally {
out.close();
}
}
@Override public long read(File f) throws IOException {
FileInputStream in = new FileInputStream(f);
long checksum = 0;
try {
for (int i = 0; i < blocks; i++) {
for (int remaining = acceptBuffer.length, read;
(read = in.read(buffer)) != -1 && (remaining -= read) > 0; )
{
in.read(acceptBuffer, acceptBuffer.length - remaining, remaining);
}
checksum += acceptBuffer[acceptBuffer[0]];
}
} finally {
in.close();
}
return checksum;
}
}
public static void purgeCache() throws IOException, InterruptedException {
if (System.getProperty("os.name").startsWith("Mac")) {
new ProcessBuilder("sudo", "purge")
// .inheritIO()
.start().waitFor();
} else {
new ProcessBuilder("sudo", "su", "-c", "echo 3 > /proc/sys/vm/drop_caches")
// .inheritIO()
.start().waitFor();
}
}
}
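If you want cold-cache numbers, here is a hedged sketch of a measure variant that calls purgeCache between the write and the read step. It assumes it lives inside the class above and that sudo works without a password prompt:
// Sketch: same flow as measure(), but the page cache is dropped before reading,
// so the read hits the disk instead of memory.
private static void measureColdRead(RW rw) throws IOException, InterruptedException {
    File file = File.createTempFile("delete", "me");
    file.deleteOnExit();
    rw.write(file);
    purgeCache();                                   // defined at the bottom of the class
    long start = System.nanoTime();
    long checksum = rw.read(file);
    long end = System.nanoTime();
    System.out.printf("Cold read speed %.1f GB/s (checksum %d)%n",
            (double) file.length() / (end - start), checksum);
    file.delete();
}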

Related

Limit file size while writing in java

I need to limit the file size to 1 GB while writing, preferably using BufferedWriter.
Is it possible using BufferedWriter, or do I have to use other libraries? Something like:
try (BufferedWriter writer = Files.newBufferedWriter(path)) {
//...
writer.write(lines.stream());
}
You can always write your own OutputStream to limit the number of bytes written.
The following assumes you want to throw an exception if the size is exceeded.
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public final class LimitedOutputStream extends FilterOutputStream {
private final long maxBytes;
private long bytesWritten;
public LimitedOutputStream(OutputStream out, long maxBytes) {
super(out);
this.maxBytes = maxBytes;
}
@Override
public void write(int b) throws IOException {
ensureCapacity(1);
super.write(b);
}
@Override
public void write(byte[] b) throws IOException {
ensureCapacity(b.length);
super.write(b);
}
@Override
public void write(byte[] b, int off, int len) throws IOException {
ensureCapacity(len);
super.write(b, off, len);
}
private void ensureCapacity(int len) throws IOException {
long newBytesWritten = this.bytesWritten + len;
if (newBytesWritten > this.maxBytes)
throw new IOException("File size exceeded: " + newBytesWritten + " > " + this.maxBytes);
this.bytesWritten = newBytesWritten;
}
}
You will of course now have to set up the Writer/OutputStream chain manually.
final long SIZE_1GB = 1073741824L;
try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
new LimitedOutputStream(Files.newOutputStream(path), SIZE_1GB),
StandardCharsets.UTF_8))) {
//
}
Hitting exactly 1 GB is very difficult when you are writing lines, because each line may contain an unknown number of bytes. I am assuming you want to write data to the file line by line.
However, you can check how many bytes a line has before writing it to the file; another approach is to check the file size after writing each line.
The following basic example writes the same line each time. Here the text "This is just a test !" takes 21 bytes on disk in UTF-8 encoding. After 49 writes it reaches 1029 bytes and stops writing.
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class Test {
private static final int ONE_KB = 1024;
public static void main(String[] args) {
File file = new File("D:/test.txt");
try (BufferedWriter writer = Files.newBufferedWriter(file.toPath())) {
while (file.length() < ONE_KB) {
writer.write("This is just a test !");
writer.flush();
}
System.out.println("1 KB Data is written to the file.!");
} catch (IOException e) {
e.printStackTrace();
}
}
}
As you can see, we have already gone past the 1 KB limit: the program above writes 1029 bytes, not 1024 bytes or fewer.
The second approach is to check the byte count, according to the specific encoding, before writing to the file.
import java.io.BufferedWriter;
import java.io.File;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.nio.file.Files;

public class Test {
private static final int ONE_KB = 1024;
public static void main(String[] args) throws UnsupportedEncodingException {
File file = new File("D:/test.txt");
String data = "This is just a test !";
int dataLength = data.getBytes("UTF-8").length;
try (BufferedWriter writer = Files.newBufferedWriter(file.toPath())) {
while (file.length() + dataLength < ONE_KB) {
writer.write(data);
writer.flush();
}
System.out.println("1 KB Data written to the file.!");
} catch (IOException e) {
e.printStackTrace();
}
}
}
In this approach we check the length in bytes before writing to the file, so it writes 1008 bytes and then stops.
Problems with both approaches:
Write and Check: you may end up with some extra bytes, and the file size may cross the limit.
Check and Write: you may end up with fewer bytes than the limit if the next line has a lot of data in it. You should be careful about the encoding.
However, there are other ways to do these validations with a third-party library like Apache Commons IO, and I find that more cumbersome than the conventional Java ways.
Another option (assuming the lines are available up front as a List<String>) is to pre-compute how many lines fit within the limit before writing:
int maxSize = 1_000_000_000;
Charset charset = StandardCharsets.UTF_8;
long size = 0;
int lineCount = 0;
while (lineCount < lines.size()) {
    long size2 = size + (lines.get(lineCount) + "\r\n").getBytes(charset).length;
    if (size2 > maxSize) {
        break;
    }
    size = size2;
    ++lineCount;
}
List<String> linesToWrite = lines.subList(0, lineCount);
Path path = Paths.get("D:/test.txt");
Files.write(path, linesToWrite, charset);
Or a bit faster, encoding each line only once:
int lineCount = 0;
try (FileChannel channel = new RandomAccessFile("D:/test.txt", "rw").getChannel()) {
    MappedByteBuffer buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, maxSize);
    lineCount = lines.size();
    for (int i = 0; i < lines.size(); i++) {
        byte[] line = (lines.get(i) + "\r\n").getBytes(charset);
        if (line.length > buffer.remaining()) {
            lineCount = i;
            break;
        }
        buffer.put(line);
    }
}
IIUC, there are various ways to do it.
Keep writing data in chunks and flushing it, and keep checking the file size after every flush.
Use log4j (or some other logging framework), which can roll over to a new file after a certain size, time, or some other trigger point.
While BufferedWriter is great, there are some newer APIs in Java which could make it faster; see Fastest way to write huge data in text file Java. A hedged sketch follows below.
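As an illustration only (not taken from the linked post), this is a minimal sketch of writing lines through the NIO.2 Files API instead of a hand-built Writer chain; the path and the contents are made up:
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Sketch only: Files.write handles the stream setup, encoding, and closing for you.
public class NioWriteSketch {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get("D:/test.txt");
        List<String> lines = Arrays.asList("This is just a test !", "Another line");
        Files.write(path, lines, StandardCharsets.UTF_8);
    }
}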

Java - Writing huge file using ByteArrayOutputStream

I am trying to write a file of a size between 1 KB and 10 GB using ByteArrayOutputStream, but the exception below is thrown. I am using JDK 6. Please suggest a better high-performance API. I am using the same network box to read and write.
Exception in thread "main" java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Unknown Source)
at java.io.ByteArrayOutputStream.grow(Unknown Source)
at java.io.ByteArrayOutputStream.ensureCapacity(Unknown Source)
at java.io.ByteArrayOutputStream.write(Unknown Source)
at java.io.OutputStream.write(Unknown Source)
at
Code:
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
public class PrepareFile {
/**
* @param args
* @throws Exception
*/
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
new PrepareFile().constructFile("f:\\hello","f:\\output",10000000);
}
//Writes a large file of 10 GB using input file data of small size by duplicating
public void constructFile(String fileName, String outPath, int multiplier) throws Exception {
BufferedOutputStream fos = null;
FileInputStream fis = null;
final File inputFile = new File(fileName);
String path = inputFile.getParent();
if (outPath != null && !outPath.isEmpty()) {
path = outPath;
}
fis = new FileInputStream(fileName);
try {
// read the transactions in the input file.
byte[] txnData = new byte[(int) inputFile.length()];
fis.read(txnData);
final File outFile = new File(path, "Myfile");
fos = new BufferedOutputStream(new FileOutputStream(outFile));
final ByteArrayOutputStream baos = new ByteArrayOutputStream();
final ByteArrayOutputStream baos1 = new ByteArrayOutputStream();
//multiplier if input file size is 1 KB and output file is 10 GB, then multiplier value is (1024*1024)
for (long i = 1; i <= multiplier; i++) {
if(i >=40000 && i % 40000==0){
System.out.println("i value now: "+i);
baos.writeTo(fos);
baos.reset();
//baos.write(txnData);
}
// write transactions
baos.write(txnData);
baos1.write(txnData); //Exception is coming at this line
}
int Padding = myCustomMethod(baos1.toByteArray());
// write all out data to the output stream
baos.writeTo(fos);
baos.flush();
baos1.flush();
} catch(Exception e){
e.printStackTrace();
}finally {
fos.close();
fis.close();
}
}
public int myCustomMethod(byte[] b){
//Need complete bytes to prepare the file trailer
return 0;
}
}
You can't have a buffer of 2 GB or more in a ByteArrayOutputStream, as the size is a 32-bit signed int.
If you want performance, I would process the file progressively and avoid such large memory copies, as they are really expensive.
BTW I have a library, Chronicle Bytes, which supports buffers larger than 2 GB; it can use native memory and be mapped to files to avoid using the heap, and the buffers can be larger than main memory.
However, if you process the data progressively, you won't need such a large buffer.
I also suggest you use Java 8, as it performs 64-bit operations better than Java 6 (which was released ten years ago).
EDIT: Based on your code, there is no need to use ByteArrayOutputStream; you can prepare the file progressively.
//Writes a large file of 10 GB using input file data of small size by duplicating
public void constructFile(String fileName, String outFileName, int multiplier) throws IOException {
byte[] bytes;
try (FileInputStream fis = new FileInputStream(fileName)) {
bytes = new byte[fis.available()];
fis.read(bytes);
}
try (FileOutputStream fos = new FileOutputStream(outFileName)) {
for (int i = 0; i < multiplier; i++) {
fos.write(bytes);
}
}
// now process the file "outFileName"
// how depends on what you are trying to do.
// NOTE: It is entirely possible the file should be processed as it is written.
}
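A hedged usage note: if the method above replaces the original constructFile inside PrepareFile, it can be invoked much as the original main method did (the output file name here is assumed):
// Sketch: calling the progressive version; "f:\\hello" is the small input file from the question.
public static void main(String[] args) throws IOException {
    new PrepareFile().constructFile("f:\\hello", "f:\\output\\Myfile", 10000000);
}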
Although extreme, you can make a Super ByteArrayOutputStream which hides several ByteArrayOutputStreams inside (the example below uses 3 of them with maximum capacity 6 GB):
import java.io.IOException;
import java.io.OutputStream;

public class LargeByteArrayOutputStream extends OutputStream {
private DirectByteArrayOutputStream b1 = new DirectByteArrayOutputStream(Integer.MAX_VALUE - 8);
private DirectByteArrayOutputStream b2 = new DirectByteArrayOutputStream(Integer.MAX_VALUE - 8);
private DirectByteArrayOutputStream b3 = new DirectByteArrayOutputStream(Integer.MAX_VALUE - 8);
private long posWrite = 0;
private long posRead = 0;
@Override
public void write(int b) throws IOException {
if (posWrite < b1.getArray().length) {
b1.write(b);
} else if (posWrite < ((long)b1.getArray().length + (long)b2.getArray().length)) {
b2.write(b);
} else {
b3.write(b);
}
posWrite++;
}
public long length() {
return posWrite;
}
/** Probably you will want to read it back afterwards */
public int read() throws IOException
{
if (posRead >= posWrite) {
return -1;
} else {
} else {
byte b = 0;
if (posRead < b1.getArray().length) {
b = b1.getArray()[(int)posRead];
} else if (posRead < ((long)b1.getArray().length + (long)b2.getArray().length)) {
b = b2.getArray()[(int)(posRead - b1.getArray().length)];
} else {
b = b3.getArray()[(int)(posRead - ((long)b1.getArray().length + (long)b2.getArray().length))];
}
posRead++;
return b & 0xFF; // avoid confusing a negative byte value with the -1 end marker
}
}
}
public class DirectByteArrayOutputStream extends java.io.ByteArrayOutputStream {
public DirectByteArrayOutputStream(int size) {
super(size);
}
/**
* Reference to the byte array that backs this buffer.
*/
public byte[] getArray() {
return buf;
}
protected void finalize() throws Throwable
{
super.finalize();
}
}
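A short hedged usage sketch of the class above; note that constructing it eagerly allocates three arrays of roughly 2 GB each, so it needs a large heap (for example -Xmx8g):
// Sketch: using the composite stream like any other OutputStream, then reading back.
public class LargeBufferDemo {
    public static void main(String[] args) throws java.io.IOException {
        LargeByteArrayOutputStream out = new LargeByteArrayOutputStream();
        byte[] chunk = new byte[8192];              // made-up payload
        for (int i = 0; i < 1000; i++) {
            out.write(chunk);                       // inherited OutputStream.write(byte[])
        }
        System.out.println("bytes written: " + out.length());
        System.out.println("first byte read back: " + out.read());
    }
}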

Splitting files into chunks with size bigger than 127

I'm trying to make a simplified HDFS (Hadoop Distributed File System) for a final project in a Distributed System course.
So, the first thing that I'm trying is to write a program which splits an arbitrary file into blocks (chunks) of an arbitrary size.
I found this useful example, whose code is:
package javabeat.net.io;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
/**
* Split File Example
*
* @author Krishna
*
*/
public class SplitFileExample {
private static String FILE_NAME = "TextFile.txt";
private static byte PART_SIZE = 5;
public static void main(String[] args) {
File inputFile = new File(FILE_NAME);
FileInputStream inputStream;
String newFileName;
FileOutputStream filePart;
int fileSize = (int) inputFile.length();
int nChunks = 0, read = 0, readLength = PART_SIZE;
byte[] byteChunkPart;
try {
inputStream = new FileInputStream(inputFile);
while (fileSize > 0) {
if (fileSize <= 5) {
readLength = fileSize;
}
byteChunkPart = new byte[readLength];
read = inputStream.read(byteChunkPart, 0, readLength);
fileSize -= read;
assert (read == byteChunkPart.length);
nChunks++;
newFileName = FILE_NAME + ".part"
+ Integer.toString(nChunks - 1);
filePart = new FileOutputStream(new File(newFileName));
filePart.write(byteChunkPart);
filePart.flush();
filePart.close();
byteChunkPart = null;
filePart = null;
}
inputStream.close();
} catch (IOException exception) {
exception.printStackTrace();
}
}
}
But I think that there is a big issue: the value of PART_SIZE cannot be greater than 127, otherwise a "possible loss of precision" error will occur.
How can I solve this without totally changing the code?
The problem is that PART_SIZE is a byte; its maximum value is therefore indeed 127.
The code you have at the moment, however, is full of other problems as well; for one, incorrect resource handling.
Here is a version using java.nio.file:
private static final String FILENAME = "TextFile.txt";
private static final int PART_SIZE = xxx; // HERE
public static void main(final String... args)
throws IOException
{
final Path file = Paths.get(FILENAME).toRealPath();
final String filenameBase = file.getFileName().toString();
final byte[] buf = new byte[PART_SIZE];
int partNumber = 0;
Path part;
int bytesRead;
byte[] toWrite;
try (
final InputStream in = Files.newInputStream(file);
) {
while ((bytesRead = in.read(buf)) != -1) {
part = file.resolveSibling(filenameBase + ".part" + partNumber);
toWrite = bytesRead == PART_SIZE ? buf : Arrays.copyOf(buf, bytesRead);
Files.write(part, toWrite, StandardOpenOption.CREATE_NEW);
partNumber++;
}
}
}
List<PDDocument> pages = new ArrayList<PDDocument>();
PDDocument document = PDDocument.load(new File(filePath));
try {
    Splitter splitter = new Splitter();
    splitter.setSplitAtPage(NoOfPagesDocumentWillContain);
    pages = splitter.split(document);
} catch (Exception e) {
    e.printStackTrace();
}

Performance of MappedByteBuffer vs ByteBuffer

I'm trying to do a few performance enhancements and am looking to use memory mapped files for writing data. I did a few tests and surprisingly, MappedByteBuffer seems slower than allocating direct buffers. I'm not able to clearly understand why this would be the case. Can someone please hint at what could be going on behind the scenes? Below are my test results:
I'm allocating 32 KB buffers. I've already created the files, 3 GB each, before starting the tests, so growing the file isn't the issue.
I'm adding the code that I used for this performance test. Any input / explanation about this behavior is much appreciated.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
public class MemoryMapFileTest {
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
for (int i = 0; i < 10; i++) {
runTest();
}
}
private static void runTest() throws IOException {
// TODO Auto-generated method stub
FileChannel ch1 = null;
FileChannel ch2 = null;
ch1 = new RandomAccessFile(new File("S:\\MMapTest1.txt"), "rw").getChannel();
ch2 = new RandomAccessFile(new File("S:\\MMapTest2.txt"), "rw").getChannel();
FileWriter fstream = new FileWriter("S:\\output.csv", true);
BufferedWriter out = new BufferedWriter(fstream);
int[] numberofwrites = {1,10,100,1000,10000,100000};
//int n = 10000;
try {
for (int j = 0; j < numberofwrites.length; j++) {
int n = numberofwrites[j];
long estimatedTime = 0;
long mappedEstimatedTime = 0;
for (int i = 0; i < n ; i++) {
byte b = (byte)Math.random();
long allocSize = 1024 * 32;
estimatedTime += directAllocationWrite(allocSize, b, ch1);
mappedEstimatedTime += mappedAllocationWrite(allocSize, b, i, ch2);
}
double avgDirectEstTime = (double)estimatedTime/n;
double avgMapEstTime = (double)mappedEstimatedTime/n;
out.write(n + "," + avgDirectEstTime/1000000 + "," + avgMapEstTime/1000000);
out.write("," + ((double)estimatedTime/1000000) + "," + ((double)mappedEstimatedTime/1000000));
out.write("\n");
System.out.println("Avg Direct alloc and write: " + estimatedTime);
System.out.println("Avg Mapped alloc and write: " + mappedEstimatedTime);
}
} finally {
out.write("\n\n");
if (out != null) {
out.flush();
out.close();
}
if (ch1 != null) {
ch1.close();
} else {
System.out.println("ch1 is null");
}
if (ch2 != null) {
ch2.close();
} else {
System.out.println("ch2 is null");
}
}
}
private static long directAllocationWrite(long allocSize, byte b, FileChannel ch1) throws IOException {
long directStartTime = System.nanoTime();
ByteBuffer byteBuf = ByteBuffer.allocateDirect((int)allocSize);
byteBuf.put(b);
ch1.write(byteBuf);
return System.nanoTime() - directStartTime;
}
private static long mappedAllocationWrite(long allocSize, byte b, int iteration, FileChannel ch2) throws IOException {
long mappedStartTime = System.nanoTime();
MappedByteBuffer mapBuf = ch2.map(MapMode.READ_WRITE, iteration * allocSize, allocSize);
mapBuf.put(b);
return System.nanoTime() - mappedStartTime;
}
}
You're testing the wrong thing. This is not how to write the code in either case. You should allocate the buffer once, and just keep updating its contents. You're including allocation time in the write time. Not valid.
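For illustration, a hedged sketch of the direct-buffer case with the allocation hoisted out of the timed loop; the file name and write count are made up:
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

// Sketch only: allocate the direct buffer once and reuse it for every write,
// so the timed loop measures the writes rather than buffer allocation.
public class DirectWriteOnceAllocated {
    public static void main(String[] args) throws IOException {
        final int blockSize = 32 * 1024;   // 32 KB, as in the question
        final int writes = 10000;          // hypothetical count
        ByteBuffer buf = ByteBuffer.allocateDirect(blockSize);
        try (FileChannel ch = new RandomAccessFile("MMapTest1.txt", "rw").getChannel()) {
            long start = System.nanoTime();
            for (int i = 0; i < writes; i++) {
                buf.clear();               // position = 0, limit = capacity
                buf.put(0, (byte) i);      // update contents in place
                ch.write(buf);             // write the full 32 KB block
            }
            long elapsed = System.nanoTime() - start;
            System.out.printf("avg write: %.3f us%n", elapsed / 1000.0 / writes);
        }
    }
}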
Swapping data to disk is the main reason for MappedByteBuffer to be slower than DirectByteBuffer.
The cost of allocation and deallocation is high with direct buffers, including MappedByteBuffer, and this cost is incurred in both examples. The only remaining difference is writing to disk, which happens with MappedByteBuffer but not with the direct ByteBuffer.

What's the difference between DataOutputStream and ObjectOutputStream?

I'm learning about socket programming in Java. I've seen client/server app examples with some using DataOutputStream, and some using ObjectOutputStream.
What's the difference between the two?
Is there a performance difference?
DataInput/OutputStream generally performs better because it is much simpler. It can only read/write primitive types and Strings.
ObjectInput/OutputStream can read/write any object type as well as primitives. It is less efficient, but much easier to use if you want to send complex data.
I would assume that the Object*Stream is the best choice until you know that its performance is an issue.
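A minimal hedged sketch of the difference; the record fields and file names are made up:
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Sketch only: DataOutputStream writes primitives/Strings field by field;
// ObjectOutputStream can serialize a whole object graph in one call.
public class StreamComparison {
    static class Point implements Serializable {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) throws IOException {
        try (DataOutputStream data = new DataOutputStream(new FileOutputStream("point.dat"))) {
            data.writeInt(3);                       // each field written explicitly
            data.writeInt(4);
        }
        try (ObjectOutputStream obj = new ObjectOutputStream(new FileOutputStream("point.ser"))) {
            obj.writeObject(new Point(3, 4));       // object plus class metadata serialized
        }
    }
}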
This might be useful for people still looking for answers several years later... According to my tests on a recent JVM (1.8_51), ObjectOutput/InputStream is, surprisingly, almost 2x faster than DataOutput/InputStream for reading/writing a huge array of doubles!
Below are the results for writing a 10-million-item array (for 1 million the results are essentially the same). I also included the text format (BufferedWriter/Reader) for the sake of completeness:
TestObjectStream written 10000000 items, took: 409ms, or 24449.8778 items/ms, filesize 80390629b
TestDataStream written 10000000 items, took: 727ms, or 13755.1582 items/ms, filesize 80000000b
TestBufferedWriter written 10000000 items, took: 13700ms, or 729.9270 items/ms, filesize 224486395b
Reading:
TestObjectStream read 10000000 items, took: 250ms, or 40000.0000 items/ms, filesize 80390629b
TestDataStream read 10000000 items, took: 424ms, or 23584.9057 items/ms, filesize 80000000b
TestBufferedWriter read 10000000 items, took: 6298ms, or 1587.8057 items/ms, filesize 224486395b
I believe Oracle has heavily optimized the JVM for ObjectStreams in recent Java releases, as this is the most common way of writing/reading data (including serialization), and it is thus on the Java performance-critical path.
So it looks like today there isn't much reason to use DataStreams anymore. "Don't try to outsmart the JVM"; just use the most straightforward way, which is ObjectStreams :)
Here's the code for the test:
class Generator {
private int seed = 1235436537;
double generate(int i) {
seed = (seed + 1235436537) % 936855463;
return seed / (i + 1.) / 524323.;
}
}
class Data {
public final double[] array;
public Data(final double[] array) {
this.array = array;
}
}
class TestObjectStream {
public void write(File dest, Data data) {
try (ObjectOutputStream out = new ObjectOutputStream(new BufferedOutputStream(new FileOutputStream(dest)))) {
for (int i = 0; i < data.array.length; i++) {
out.writeDouble(data.array[i]);
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
public void read(File dest, Data data) {
try (ObjectInputStream in = new ObjectInputStream(new BufferedInputStream(new FileInputStream(dest)))) {
for (int i = 0; i < data.array.length; i++) {
data.array[i] = in.readDouble();
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
}
class TestDataStream {
public void write(File dest, Data data) {
try (DataOutputStream out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(dest)))) {
for (int i = 0; i < data.array.length; i++) {
out.writeDouble(data.array[i]);
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
public void read(File dest, Data data) {
try (DataInputStream in = new DataInputStream(new BufferedInputStream(new FileInputStream(dest)))) {
for (int i = 0; i < data.array.length; i++) {
data.array[i] = in.readDouble();
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
}
class TestBufferedWriter {
public void write(File dest, Data data) {
try (BufferedWriter out = new BufferedWriter(new FileWriter(dest))) {
for (int i = 0; i < data.array.length; i++) {
out.write(Double.toString(data.array[i]));
out.newLine();
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
public void read(File dest, Data data) {
try (BufferedReader in = new BufferedReader(new FileReader(dest))) {
String line = in.readLine();
int i = 0;
while (line != null) {
if(!line.isEmpty()) {
data.array[i++] = Double.parseDouble(line);
}
line = in.readLine();
}
} catch (IOException e) {
throw new RuntimeIoException(e);
}
}
}
@Test
public void testWrite() throws Exception {
int N = 10000000;
double[] array = new double[N];
Generator gen = new Generator();
for (int i = 0; i < array.length; i++) {
array[i] = gen.generate(i);
}
Data data = new Data(array);
Map<Class, BiConsumer<File, Data>> subjects = new LinkedHashMap<>();
subjects.put(TestDataStream.class, new TestDataStream()::write);
subjects.put(TestObjectStream.class, new TestObjectStream()::write);
subjects.put(TestBufferedWriter.class, new TestBufferedWriter()::write);
subjects.forEach((aClass, fileDataBiConsumer) -> {
File f = new File("test." + aClass.getName());
long start = System.nanoTime();
fileDataBiConsumer.accept(f, data);
long took = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
System.out.println(aClass.getSimpleName() + " written " + N + " items, took: " + took + "ms, or " + String.format("%.4f", (N / (double)took)) + " items/ms, filesize " + f.length() + "b");
});
}
@Test
public void testRead() throws Exception {
int N = 10000000;
double[] array = new double[N];
Data data = new Data(array);
Map<Class, BiConsumer<File, Data>> subjects = new LinkedHashMap<>();
subjects.put(TestDataStream.class, new TestDataStream()::read);
subjects.put(TestObjectStream.class, new TestObjectStream()::read);
subjects.put(TestBufferedWriter.class, new TestBufferedWriter()::read);
subjects.forEach((aClass, fileDataBiConsumer) -> {
File f = new File("test." + aClass.getName());
long start = System.nanoTime();
fileDataBiConsumer.accept(f, data);
long took = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
System.out.println(aClass.getSimpleName() + " read " + N + " items, took: " + took + "ms, or " + String.format("%.4f", (N / (double)took)) + " items/ms, filesize " + f.length() + "b");
});
}
DataOutputStream and ObjectOutputStream: when handling basic types, there is no difference apart from the header that ObjectOutputStream creates.
With the ObjectOutputStream class, instances of a class that implements Serializable can be written to the output stream, and can be read back with ObjectInputStream.
DataOutputStream can only handle basic types.
Only objects that implement the java.io.Serializable interface can be written to streams using ObjectOutputStream. Primitive data types can also be written to the stream using the appropriate methods from DataOutput, and Strings can be written using the writeUTF method. DataOutputStream, on the other hand, lets an application write primitive Java data types to an output stream in a portable way.
