I'm trying to make a few performance enhancements and am looking at using memory-mapped files for writing data. I ran a few tests and, surprisingly, MappedByteBuffer seems slower than allocating direct buffers. I'm not able to clearly understand why this would be the case. Can someone hint at what could be going on behind the scenes? Below are my test results:
I'm allocating 32KB buffers. I had already created the files, 3 GB each, before starting the tests, so growing the file isn't the issue.
I'm adding the code that I used for this performance test. Any input / explanation about this behavior is much appreciated.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

public class MemoryMapFileTest {

    /**
     * @param args
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        for (int i = 0; i < 10; i++) {
            runTest();
        }
    }

    private static void runTest() throws IOException {
        FileChannel ch1 = null;
        FileChannel ch2 = null;
        ch1 = new RandomAccessFile(new File("S:\\MMapTest1.txt"), "rw").getChannel();
        ch2 = new RandomAccessFile(new File("S:\\MMapTest2.txt"), "rw").getChannel();
        FileWriter fstream = new FileWriter("S:\\output.csv", true);
        BufferedWriter out = new BufferedWriter(fstream);
        int[] numberofwrites = {1, 10, 100, 1000, 10000, 100000};
        try {
            for (int j = 0; j < numberofwrites.length; j++) {
                int n = numberofwrites[j];
                long estimatedTime = 0;
                long mappedEstimatedTime = 0;
                for (int i = 0; i < n; i++) {
                    byte b = (byte) Math.random();
                    long allocSize = 1024 * 32;
                    estimatedTime += directAllocationWrite(allocSize, b, ch1);
                    mappedEstimatedTime += mappedAllocationWrite(allocSize, b, i, ch2);
                }
                double avgDirectEstTime = (double) estimatedTime / n;
                double avgMapEstTime = (double) mappedEstimatedTime / n;
                out.write(n + "," + avgDirectEstTime / 1000000 + "," + avgMapEstTime / 1000000);
                out.write("," + ((double) estimatedTime / 1000000) + "," + ((double) mappedEstimatedTime / 1000000));
                out.write("\n");
                System.out.println("Avg Direct alloc and write: " + estimatedTime);
                System.out.println("Avg Mapped alloc and write: " + mappedEstimatedTime);
            }
        } finally {
            out.write("\n\n");
            if (out != null) {
                out.flush();
                out.close();
            }
            if (ch1 != null) {
                ch1.close();
            } else {
                System.out.println("ch1 is null");
            }
            if (ch2 != null) {
                ch2.close();
            } else {
                System.out.println("ch2 is null");
            }
        }
    }

    private static long directAllocationWrite(long allocSize, byte b, FileChannel ch1) throws IOException {
        long directStartTime = System.nanoTime();
        ByteBuffer byteBuf = ByteBuffer.allocateDirect((int) allocSize);
        byteBuf.put(b);
        ch1.write(byteBuf);
        return System.nanoTime() - directStartTime;
    }

    private static long mappedAllocationWrite(long allocSize, byte b, int iteration, FileChannel ch2) throws IOException {
        long mappedStartTime = System.nanoTime();
        MappedByteBuffer mapBuf = ch2.map(MapMode.READ_WRITE, iteration * allocSize, allocSize);
        mapBuf.put(b);
        return System.nanoTime() - mappedStartTime;
    }
}
You're testing the wrong thing. This is not how to write the code in either case. You should allocate the buffer once, and just keep updating its contents. You're including allocation time in the write time. Not valid.
Swapping data out to disk is the main reason for the MappedByteBuffer to be slower than the direct ByteBuffer here.
The cost of allocation and deallocation is high for direct buffers, and a MappedByteBuffer is a direct buffer too, so that cost is accrued in both examples. What differs is the writing to disk, which happens in the MappedByteBuffer case but not with the direct ByteBuffer.
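For reference, here is a minimal sketch of what the corrected test might look like, with the allocation and the mapping hoisted out of the timed loops so that only the writes are measured. File names and sizes are carried over from the question; everything else is illustrative:

import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;

public class ReuseBuffersTest {
    public static void main(String[] args) throws Exception {
        final int n = 10000;
        final int allocSize = 1024 * 32;
        try (FileChannel ch1 = new RandomAccessFile("S:\\MMapTest1.txt", "rw").getChannel()) {
            ByteBuffer direct = ByteBuffer.allocateDirect(allocSize); // allocated once
            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                direct.clear();
                direct.put(0, (byte) i); // update the contents, don't reallocate
                ch1.write(direct);
            }
            System.out.println("direct: " + (System.nanoTime() - start) / n + " ns/write");
        }
        try (FileChannel ch2 = new RandomAccessFile("S:\\MMapTest2.txt", "rw").getChannel()) {
            MappedByteBuffer mapped = ch2.map(MapMode.READ_WRITE, 0, (long) allocSize * n); // mapped once
            long start = System.nanoTime();
            for (int i = 0; i < n; i++) {
                mapped.put(i * allocSize, (byte) i); // write straight into the mapping
            }
            System.out.println("mapped: " + (System.nanoTime() - start) / n + " ns/write");
        }
    }
}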
I have a project where I am to write data (strings and ints) into a binary random access file, and read the data back in a separate class. The problem is that I'm trying to iterate through the file and read the data in a specific order (int, String, String, int); however, the Strings vary in byte size.
I am getting an EOFException but cannot figure out why.
Here is the class which writes the data. Part of the requirements is to limit the number of bytes for the Strings and catch a user defined exception if they are exceeded.
import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.ArrayList;
import java.io.File;

public class QuestionBank {
    private RandomAccessFile file;
    private ArrayList<Questions> listQuestions;

    public QuestionBank() {
        file = null;
        listQuestions = new ArrayList<Questions>();
    }

    public void storeQuestion(Questions ques) throws IOException {
        ques = new Questions(ques.getQuesIDNum(), ques.getQuestion(), ques.getAnswer(), ques.getValue());
        listQuestions.add(ques);
        byte[] quesBytes = ques.getQuestion().getBytes("UTF-8");
        byte[] ansBytes = ques.getAnswer().getBytes("UTF-8");
        try {
            file = new RandomAccessFile(new File("Question.bin"), "rw");
            long fileSize = file.length();
            file.seek(fileSize);
            file.writeInt(ques.getQuesIDNum());
            file.writeUTF(ques.getQuestion());
            for (int i = 0; i <= 50 - ques.getQuestion().length(); i++) {
                file.writeByte(50);
            }
            if (quesBytes.length > 50) {
                throw new ByteSizeException("Question has too many bytes");
            }
            file.writeUTF(ques.getAnswer());
            for (int i = 0; i <= 20 - ques.getAnswer().length(); i++) {
                file.writeByte(20);
            }
            if (ansBytes.length > 20) {
                throw new ByteSizeException("Answer has too many bytes");
            }
            file.writeInt(ques.getValue());
            file.close();
        } catch (IOException e) {
            System.out.println("I/O Exception Found");
        } catch (ByteSizeException eb) {
            System.out.println("String has too many bytes");
        }
    }
}
Here is the class which reads the file.
import java.util.ArrayList;
import java.util.Random;
import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.FileNotFoundException;
import java.io.File;

public class TriviaGame {
    public static final int RECORD = 78;
    private ArrayList<Questions> quesList;
    private int IDNum;
    private String question;
    private String answer;
    private int points;

    public TriviaGame() {
        quesList = new ArrayList<Questions>();
        IDNum = 0;
        question = "";
        answer = "";
        points = 0;
    }

    public void read() {
        try {
            RandomAccessFile file;
            file = new RandomAccessFile(new File("Question.bin"), "r");
            long fileSize = file.length();
            long numRecords = fileSize / RECORD;
            file.seek(0);
            for (int i = 0; i < numRecords; i++) {
                IDNum = file.readInt();
                question = file.readUTF();
                answer = file.readUTF();
                points = file.readInt();
                System.out.println("ID: " + IDNum + " Question: " + question + " Answer: " + answer + " Points: " + points);
            }
            file.close();
        } catch (IOException e) {
            System.out.println(e.getClass());
            System.out.println("I/O Exception found");
        }
    }
}
Thanks
file.writeUTF(ques.getQuestion());
Here you have written the question.
for (int i = 0; i <= 50 - ques.getQuestion().length(); i++) {
    file.writeByte(50);
}
if (quesBytes.length > 50) {
    throw new ByteSizeException("Question has too many bytes");
}
Here for some unknown reason you are padding the question to 50 bytes. Remove. Same with the answer. You are using readUTF() to read both of these, so all you need is writeUTF() to write them. No padding required.
Or, if you insist on this padding, you have to skip over it when reading: after the first readUTF(), you need to skip over the padding.
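If you keep it, a reader along these lines would work. This is a sketch, assuming the strings are plain ASCII (the writer pads by character count, which only equals byte count for ASCII); note the writer's loops use <=, so they emit 51 - length and 21 - length padding bytes:

import java.io.IOException;
import java.io.RandomAccessFile;

public class PaddedRecordReader {
    public static void main(String[] args) throws IOException {
        RandomAccessFile file = new RandomAccessFile("Question.bin", "r");
        while (file.getFilePointer() < file.length()) {
            int id = file.readInt();
            String question = file.readUTF();
            file.skipBytes(51 - question.length()); // skip the question padding
            String answer = file.readUTF();
            file.skipBytes(21 - answer.length());   // skip the answer padding
            int points = file.readInt();
            System.out.println("ID: " + id + " Question: " + question
                    + " Answer: " + answer + " Points: " + points);
        }
        file.close();
    }
}

Looping on getFilePointer() also avoids the fixed RECORD size of 78, which can't be right anyway once writeUTF() makes the records variable-length.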
I've been asked to measure current disk performance, as we are planning to replace local disk with network attached storage on our application servers. Since our applications which write data are written in Java, I thought I would measure the performance directly in Linux, and also using a simple Java test. However I'm getting significantly different results, particularly for reading data, using what appear to me to be similar tests. Directly in Linux I'm doing:
dd if=/dev/zero of=/data/cache/test bs=1048576 count=8192
dd if=/data/cache/test of=/dev/null bs=1048576 count=8192
My Java test looks like this:
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class TestDiskSpeed {
    private byte[] oneMB = new byte[1024 * 1024];

    public static void main(String[] args) throws IOException {
        new TestDiskSpeed().execute(args);
    }

    private void execute(String[] args) throws IOException {
        long size = Long.parseLong(args[1]);
        testWriteSpeed(args[0], size);
        testReadSpeed(args[0], size);
    }

    private void testWriteSpeed(String filePath, long size) throws IOException {
        File file = new File(filePath);
        BufferedOutputStream writer = null;
        long start = System.currentTimeMillis();
        try {
            writer = new BufferedOutputStream(new FileOutputStream(file), 1024 * 1024);
            for (int i = 0; i < size; i++) {
                writer.write(oneMB);
            }
            writer.flush();
        } finally {
            if (writer != null) {
                writer.close();
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        String message = "Wrote " + size + "MB in " + elapsed + "ms at a speed of " + calculateSpeed(size, elapsed) + "MB/s";
        System.out.println(message);
    }

    private void testReadSpeed(String filePath, long size) throws IOException {
        File file = new File(filePath);
        BufferedInputStream reader = null;
        long start = System.currentTimeMillis();
        try {
            reader = new BufferedInputStream(new FileInputStream(file), 1024 * 1024);
            for (int i = 0; i < size; i++) {
                reader.read(oneMB);
            }
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        String message = "Read " + size + "MB in " + elapsed + "ms at a speed of " + calculateSpeed(size, elapsed) + "MB/s";
        System.out.println(message);
    }

    private double calculateSpeed(long size, long elapsed) {
        double seconds = ((double) elapsed) / 1000L;
        double speed = ((double) size) / seconds;
        return speed;
    }
}
This is being invoked with "java TestDiskSpeed /data/cache/test 8192"
Both of these should be creating 8GB files of zeros, 1MB at a time, measuring the speed, and then reading it back and measuring again. Yet the speeds I'm consistently getting are:
Linux: write - ~650MB/s
Linux: read - ~4.2GB/s
Java: write - ~500MB/s
Java: read - ~1.9GB/s
Can anyone explain the large discrepancy?
When I run this using NIO on my system (Ubuntu 15.04 with an i7-3970X):
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class Main {
    static final int SIZE_GB = Integer.getInteger("sizeGB", 8);
    static final int BLOCK_SIZE = 64 * 1024;

    public static void main(String[] args) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(BLOCK_SIZE);
        File tmp = File.createTempFile("delete", "me");
        tmp.deleteOnExit();
        int blocks = (int) (((long) SIZE_GB << 30) / BLOCK_SIZE);
        long start = System.nanoTime();
        try (FileChannel fc = new FileOutputStream(tmp).getChannel()) {
            for (int i = 0; i < blocks; i++) {
                buffer.clear();
                while (buffer.remaining() > 0)
                    fc.write(buffer);
            }
        }
        long mid = System.nanoTime();
        try (FileChannel fc = new FileInputStream(tmp).getChannel()) {
            for (int i = 0; i < blocks; i++) {
                buffer.clear();
                while (buffer.remaining() > 0)
                    fc.read(buffer);
            }
        }
        long end = System.nanoTime();
        long size = tmp.length();
        System.out.printf("Write speed %.1f GB/s, read Speed %.1f GB/s%n",
                (double) size / (mid - start), (double) size / (end - mid));
    }
}
prints
Write speed 3.8 GB/s, read Speed 6.8 GB/s
You may get better performance if you drop the BufferedXxxStream. It's not helping since you're doing 1MB reads and writes, and it causes an extra memory copy of the data.
Better yet, you should be using the NIO classes instead of the regular IO classes.
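As a sketch, the write test might look like this with a channel and a direct buffer that is allocated once and reused (the method mirrors the original's name and 1MB block size; the read side would be symmetrical):

import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class TestDiskSpeedNio {
    private final ByteBuffer oneMB = ByteBuffer.allocateDirect(1024 * 1024);

    public static void main(String[] args) throws IOException {
        new TestDiskSpeedNio().testWriteSpeed(args[0], Long.parseLong(args[1]));
    }

    private void testWriteSpeed(String filePath, long size) throws IOException {
        long start = System.currentTimeMillis();
        try (FileChannel channel = new FileOutputStream(filePath).getChannel()) {
            for (long i = 0; i < size; i++) {
                oneMB.clear();
                while (oneMB.hasRemaining()) { // a channel write may be partial, so loop
                    channel.write(oneMB);
                }
            }
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("Wrote " + size + "MB in " + elapsed + "ms");
    }
}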
try-finally
You should clean up your try-finally code.
// Original code
BufferedOutputStream writer = null;
try {
    writer = new ...;
    // use writer
} finally {
    if (writer != null) {
        writer.close();
    }
}

// Cleaner code
BufferedOutputStream writer = new ...;
try {
    // use writer
} finally {
    writer.close();
}

// Even cleaner, using try-with-resources (since Java 7)
try (BufferedOutputStream writer = new ...) {
    // use writer
}
To complement Peter's great answer, I am adding the code below. It compares head-to-head the performance of the good-old java.io with NIO. Unlike Peter, instead of just reading data into a direct buffer, I do a typical thing with it: transfer it into an on-heap byte array. This steals surprisingly little from the performance: where I was getting 7.5 GB/s with Peter's code, here I get 6.0 GB/s.
For the java.io approach I can't have a direct buffer, but instead I call the read method directly with my target on-heap byte array. Note that this array is smallish and has an awkward size of 555 bytes. Nevertheless I retrieve almost identical performance: 5.6 GB/s. The difference is so small that it would evaporate completely in normal usage, and even in this artificial scenario if I wasn't reading directly from the disk cache.
As a bonus I include at the bottom a method which can be used on Linux and Mac to purge the disk caches. You'll see a dramatic turn in performance if you decide to call it between the write and the read step.
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public final class MeasureIOPerformance {
    static final int SIZE_GB = Integer.getInteger("sizeGB", 8);
    static final int BLOCK_SIZE = 64 * 1024;
    static final int blocks = (int) (((long) SIZE_GB << 30) / BLOCK_SIZE);
    static final byte[] acceptBuffer = new byte[555];

    public static void main(String[] args) throws IOException {
        for (int i = 0; i < 3; i++) {
            measure(new ChannelRw());
            measure(new StreamRw());
        }
    }

    private static void measure(RW rw) throws IOException {
        File file = File.createTempFile("delete", "me");
        file.deleteOnExit();
        System.out.println("Writing " + SIZE_GB + " GB " + " with " + rw);
        long start = System.nanoTime();
        rw.write(file);
        long mid = System.nanoTime();
        System.out.println("Reading " + SIZE_GB + " GB " + " with " + rw);
        long checksum = rw.read(file);
        long end = System.nanoTime();
        long size = file.length();
        System.out.printf("Write speed %.1f GB/s, read Speed %.1f GB/s%n",
                (double) size / (mid - start), (double) size / (end - mid));
        System.out.println(checksum);
        file.delete();
    }

    interface RW {
        void write(File f) throws IOException;
        long read(File f) throws IOException;
    }

    static class ChannelRw implements RW {
        final ByteBuffer directBuffer = ByteBuffer.allocateDirect(BLOCK_SIZE);

        @Override public String toString() {
            return "Channel";
        }

        @Override public void write(File f) throws IOException {
            FileChannel fc = new FileOutputStream(f).getChannel();
            try {
                for (int i = 0; i < blocks; i++) {
                    directBuffer.clear();
                    while (directBuffer.remaining() > 0) {
                        fc.write(directBuffer);
                    }
                }
            } finally {
                fc.close();
            }
        }

        @Override public long read(File f) throws IOException {
            ByteBuffer buffer = ByteBuffer.allocateDirect(BLOCK_SIZE);
            FileChannel fc = new FileInputStream(f).getChannel();
            long checksum = 0;
            try {
                for (int i = 0; i < blocks; i++) {
                    buffer.clear();
                    while (buffer.hasRemaining()) {
                        fc.read(buffer);
                    }
                    buffer.flip();
                    while (buffer.hasRemaining()) {
                        buffer.get(acceptBuffer, 0, Math.min(acceptBuffer.length, buffer.remaining()));
                        checksum += acceptBuffer[acceptBuffer[0]];
                    }
                }
            } finally {
                fc.close();
            }
            return checksum;
        }
    }

    static class StreamRw implements RW {
        final byte[] buffer = new byte[BLOCK_SIZE];

        @Override public String toString() {
            return "Stream";
        }

        @Override public void write(File f) throws IOException {
            FileOutputStream out = new FileOutputStream(f);
            try {
                for (int i = 0; i < blocks; i++) {
                    out.write(buffer);
                }
            } finally {
                out.close();
            }
        }

        @Override public long read(File f) throws IOException {
            FileInputStream in = new FileInputStream(f);
            long checksum = 0;
            try {
                for (int i = 0; i < blocks; i++) {
                    for (int remaining = acceptBuffer.length, read;
                         (read = in.read(buffer)) != -1 && (remaining -= read) > 0; ) {
                        in.read(acceptBuffer, acceptBuffer.length - remaining, remaining);
                    }
                    checksum += acceptBuffer[acceptBuffer[0]];
                }
            } finally {
                in.close();
            }
            return checksum;
        }
    }

    public static void purgeCache() throws IOException, InterruptedException {
        if (System.getProperty("os.name").startsWith("Mac")) {
            new ProcessBuilder("sudo", "purge")
                    // .inheritIO()
                    .start().waitFor();
        } else {
            new ProcessBuilder("sudo", "su", "-c", "echo 3 > /proc/sys/vm/drop_caches")
                    // .inheritIO()
                    .start().waitFor();
        }
    }
}
I'm trying to find a way to use
copyInputStreamToFile(InputStream source, File destination)
to make a small progress bar in the console by file size. Is there a way to do this?
The short answer: you can't. Look at the source code of this method; tracking its execution path leads to this method in the IOUtils class:
public static long copyLarge(final InputStream input, final OutputStream output, final byte[] buffer)
        throws IOException {
    long count = 0;
    int n = 0;
    while (EOF != (n = input.read(buffer))) {
        output.write(buffer, 0, n);
        count += n;
    }
    return count;
}
So this functionality is encapsulated by the API, with no progress hook exposed.
The long answer: you can implement the download method yourself, reusing the relevant parts of IOUtils and FileUtils and adding code that prints the percentage of the downloaded file to the console:
This is a working kick-off example:
package apache.utils.custom;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;

public class Downloader {
    private static final int EOF = -1;
    private static final int DEFAULT_BUFFER_SIZE = 1024 * 4;

    public static void copyInputStreamToFileNew(final InputStream source, final File destination, int fileSize) throws IOException {
        try {
            final FileOutputStream output = FileUtils.openOutputStream(destination);
            try {
                final byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];
                long count = 0;
                int n = 0;
                while (EOF != (n = source.read(buffer))) {
                    output.write(buffer, 0, n);
                    count += n;
                    System.out.println("Completed " + count * 100 / fileSize + "%");
                }
                output.close(); // don't swallow close Exception if copy completes normally
            } finally {
                IOUtils.closeQuietly(output);
            }
        } finally {
            IOUtils.closeQuietly(source);
        }
    }
}
You need to provide the expected file size to this method, which you can obtain with this code:
URL url = new URL(urlString);
URLConnection urlConnection = url.openConnection();
urlConnection.connect();
int file_size = urlConnection.getContentLength();
Of course, the better idea is to encapsulate the whole functionality in a single method.
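For example, a sketch of such a method (downloadWithProgress is a made-up name, not part of commons-io; it reuses the copyInputStreamToFileNew method from above):

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

public class DownloadExample {
    public static void downloadWithProgress(String urlString, File destination) throws IOException {
        URL url = new URL(urlString);
        URLConnection connection = url.openConnection();
        connection.connect();
        int fileSize = connection.getContentLength(); // -1 if the server doesn't report it
        InputStream source = connection.getInputStream(); // closed inside the copy method
        Downloader.copyInputStreamToFileNew(source, destination, fileSize);
    }
}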
Hope this helps.
I was trying to read from one file and write the bytes read to another file, using the classes specified in the title. I did it successfully, but while trying different things I came across a problem I don't understand.
Here is the code
import java.io.*;

public class FileInputStreamDemo {
    public static void main(String[] args) throws Exception {
        int size;
        InputStream f = new FileInputStream(
                "G:/Eclipse Workspace/FileInputStream Demo/src/FileInputStreamDemo.java");
        System.out.println("Total available bytes: " + (size = f.available()));
        /*
        int n = size / 40;
        System.out.println("first " + n + " bytes of file one read() at a time");
        for (int i = 0; i < n; i++) {
            System.out.print((char) f.read());
        }
        System.out.println("\n Still available: " + f.available());
        System.out.println("reading the next" + n + "with one read(b[])");
        byte b[] = new byte[n];
        */
        /*
        for (int i = 0; i < size; i++) {
            System.out.print((char) f.read());
        }
        */
        OutputStream f1 = new FileOutputStream(
                "G:/Eclipse Workspace/FileInputStream Demo/test.txt");
        for (int count = 0; count < size; count++) {
            f1.write(f.read());
        }
        for (int i = 0; i < size; i++) {
            System.out.print(f.read());
        }
        f.close();
        f1.close();
    }
}
The problem is this: when I first read from the FileInputStream f (with f.read()) and write the bytes to the FileOutputStream f1, it does what it is meant to do, but when I try to read from f again it returns -1. Why?
Once an InputStream reaches the end of the file, read() returns -1, and a FileInputStream cannot be rewound. Use RandomAccessFile and its seek(0) method to go back to the beginning.
RandomAccessFile file = new RandomAccessFile(new File("G:/Eclipse Workspace/FileInputStream Demo/src/FileInputStreamDemo.java"), "r");
Here is sample code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.RandomAccessFile;

public class FileInputStreamDemo {
    public static void main(String[] args) throws Exception {
        long size;
        File file = new File("D:/Workspace/JavaProject/src/com/test/FileInputStreamDemo.java");
        RandomAccessFile f = new RandomAccessFile(file, "r");
        System.out.println("Total available bytes: " + (size = file.length()));
        OutputStream f1 = new FileOutputStream(new File(
                "D:/Workspace/JavaProject/resources/test.txt"));
        for (int count = 0; count < size; count++) {
            f1.write(f.read());
        }
        f.seek(0);
        for (int i = 0; i < size; i++) {
            System.out.print((char) f.read());
        }
        f.close();
        f1.close();
    }
}
I need to create a BMP (bitmap) image from a database using Java. The problem is that I have huge sets of integers ranging from 10 to 100.
I would like to represent the whole database as a bmp. The amount of data 10000x10000 per table (and growing) exceeds the amount of data I can handle with int arrays.
Is there a way to write the BMP directly to the hard drive, pixel by pixel, so I don't run out of memory?
A file would work (I definitely wouldn't do a per-pixel call; you'd be waiting hours for the result). You just need a buffer. Break the application apart along the lines of ->
int[] buffer = new int[BUFFER_SIZE];
ResultSet data = ....; // Forward paging result set
while (true) {
    for (int i = 0; i < BUFFER_SIZE; i++) {
        // Read result set into buffer
    }
    // write buffer to cache (HEAP/File whatever)
    if (resultSetDone)
        break;
}
Read the documentation on your database driver, but any major database is going to optimize your ResultSet object so you can use a cursor and not worry about memory.
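For example, cursor-style streaming typically comes down to the sketch below; setFetchSize is only a hint to the driver, and the table and column names here are made up:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class CursorRead {
    public static void streamValues(String jdbcUrl) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
                                                   ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(10000); // ask the driver to stream rows instead of buffering them all
            try (ResultSet rs = stmt.executeQuery("SELECT value FROM pixels")) {
                while (rs.next()) {
                    int value = rs.getInt(1);
                    // feed value into the write buffer here
                }
            }
        }
    }
}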
All that being said... an int[10000][10000] isn't why you're running out of memory. It's probably what you're doing with those values and your algorithm. Example:
public class Test {
    public static void main(String... args) {
        int[][] ints = new int[10000][];
        System.out.println(System.currentTimeMillis() + " Start");
        for (int i = 0; i < 10000; i++) {
            ints[i] = new int[10000];
            for (int j = 0; j < 10000; j++)
                ints[i][j] = i * j % Integer.MAX_VALUE / 2;
            System.out.print(i);
        }
        System.out.println();
        System.out.println(Integer.valueOf(ints[500][999]) + " <- value");
        System.out.println(System.currentTimeMillis() + " Stop");
    }
}
Output ->
1344554718676 Start
//not even listing this
249750 <- value
1344554719322 Stop
Edit -- Or, if I misinterpreted your question, try this ->
http://www.java2s.com/Code/Java/Database-SQL-JDBC/LoadimagefromDerbydatabase.htm
I see... well, take a look around. I'm rusty, but this seems to be a way to do it. I'd double-check my buffering...
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class Test {
    public static void main(String... args) {
        // 2 ^ 24 bytes, streams can be bigger, but this works...
        int size = Double.valueOf((Math.floor((Math.pow(2.0, 24.0))))).intValue();
        byte[] bytes = new byte[size];
        for (int i = 0; i < size; i++)
            bytes[i] = (byte) (i % 255);
        ByteArrayInputStream stream = new ByteArrayInputStream(bytes);
        File file = new File("test.io"); // kill the hard disk
        // Crappy error handling, you'd actually want to catch exceptions and recover
        BufferedInputStream in = new BufferedInputStream(stream);
        BufferedOutputStream out = null;
        byte[] buffer = new byte[1024 * 8];
        try {
            // You do need to check the buffer as it will have crap in it on the last read
            out = new BufferedOutputStream(new FileOutputStream(file));
            while (in.available() > 0) {
                int total = in.read(buffer);
                out.write(buffer, 0, total);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (out != null) {
                try {
                    out.flush();
                    out.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
        System.out.println(System.currentTimeMillis() + " Start");
        System.out.println();
        System.out.println(Integer.valueOf(bytes[bytes.length - 1]) + " <- value");
        System.out.println("File size is-> " + file.length());
        System.out.println(System.currentTimeMillis() + " Stop");
    }
}
You could save it as a file, which is conceptually just a sequence of bytes.
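To make that concrete, below is a minimal sketch of a 24-bit BMP writer that streams one padded row at a time, so memory use stays at a single row no matter how big the image gets. The grey-scale mapping of values 10-100 and the stubbed-out database read are my assumptions; in real code the inner loop would pull from the ResultSet:

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class StreamingBmpWriter {
    // BMP headers are little-endian, so multi-byte fields are written byte by byte.
    private static void writeIntLE(OutputStream out, int v) throws IOException {
        out.write(v); out.write(v >> 8); out.write(v >> 16); out.write(v >> 24);
    }

    private static void writeShortLE(OutputStream out, int v) throws IOException {
        out.write(v); out.write(v >> 8);
    }

    public static void main(String[] args) throws IOException {
        int width = 10000, height = 10000;
        int rowSize = (width * 3 + 3) & ~3;         // each row padded to a multiple of 4 bytes
        int imageSize = rowSize * height;
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream("table.bmp"), 1 << 16)) {
            // BITMAPFILEHEADER (14 bytes)
            out.write('B'); out.write('M');
            writeIntLE(out, 14 + 40 + imageSize);   // total file size
            writeIntLE(out, 0);                     // reserved
            writeIntLE(out, 54);                    // offset of the pixel data
            // BITMAPINFOHEADER (40 bytes)
            writeIntLE(out, 40);                    // header size
            writeIntLE(out, width);
            writeIntLE(out, height);                // positive height = rows stored bottom-up
            writeShortLE(out, 1);                   // colour planes
            writeShortLE(out, 24);                  // bits per pixel
            writeIntLE(out, 0);                     // no compression
            writeIntLE(out, imageSize);
            writeIntLE(out, 0); writeIntLE(out, 0); // pixels per metre (unused here)
            writeIntLE(out, 0); writeIntLE(out, 0); // palette info (unused for 24-bit)
            // Pixel data: one reused row buffer; the first row written is the bottom of the image.
            byte[] row = new byte[rowSize];
            for (int y = 0; y < height; y++) {
                for (int x = 0; x < width; x++) {
                    int value = 10 + ((x + y) % 91);        // stand-in for a database value 10..100
                    byte grey = (byte) (value * 255 / 100);
                    row[x * 3] = grey;                      // blue
                    row[x * 3 + 1] = grey;                  // green
                    row[x * 3 + 2] = grey;                  // red
                }
                out.write(row);
            }
        }
    }
}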