ChannelInputStream skip method is very slow - java

I have the following piece of test code:
try {
InputStream is;
Stopwatch.start("FileInputStream");
is = new FileInputStream(imageFile.toFile());
is.skip(1024*1024*1024);
is.close();
Stopwatch.stop();
Stopwatch.start("Files.newInputStream");
is = Files.newInputStream(imageFile);
is.skip(1024*1024*1024);
is.close();
Stopwatch.stop();
}
catch(Exception e)
{
}
and I get the following output:
Start: FileInputStream
FileInputStream : 0 ms
Start: Files.newInputStream
Files.newInputStream : 3469 ms
Do you have any idea what is going on? Why is skip so slow in the second case?
I need to use InputStreams acquired from channels because my tests have shown that the best approach for my task is to have two threads reading from the file simultaneously (and I only see an improvement when I am using streams from channels).
During tests I figured out that I can do something like this:
SeekableByteChannel sbc = Files.newByteChannel(imageFile);
sbc.position(1024*1024*1024);
is = Channels.newInputStream(sbc);
which takes only 28 ms on average, but that does not help me much because using it would require major API changes.
My platform:
Linux galileo 3.11.0-13-generic #20-Ubuntu SMP Wed Oct 23 07:38:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)

Looking at the source, it appears that the default implementation of skip() is actually reading through (and discarding) the stream content until it reaches the target position:
public long skip(long n) throws IOException {
long remaining = n;
int nr;
if (n <= 0) {
return 0;
}
int size = (int)Math.min(MAX_SKIP_BUFFER_SIZE, remaining);
byte[] skipBuffer = new byte[size];
while (remaining > 0) {
nr = read(skipBuffer, 0, (int)Math.min(size, remaining));
if (nr < 0) {
break;
}
remaining -= nr;
}
return n - remaining;
}
The SeekableByteChannel#position() method probably just updates an offset pointer, which doesn't actually require any I/O. Presumably, FileInputStream overrides the skip() method with a similar optimization. The documentation supports this theory:
This method may skip more bytes than are remaining in the backing file. This produces no exception and the number of bytes skipped may include some number of bytes that were beyond the EOF of the backing file. Attempting to read from the stream after skipping past the end will result in -1 indicating the end of the file.
On platter disks or network storage, this could have a significant impact.
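One way to keep an InputStream-based API while avoiding the read-and-discard loop is to wrap the channel in a small stream that overrides skip(). The following is a minimal sketch, not JDK code: the class name SeekingChannelInputStream and its open helper are hypothetical, and it assumes the stream is the only user of the channel's position.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.channels.SeekableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical wrapper: an InputStream over a SeekableByteChannel whose
// skip() moves the channel position instead of reading and discarding bytes.
public class SeekingChannelInputStream extends InputStream {
    private final SeekableByteChannel channel;

    public SeekingChannelInputStream(SeekableByteChannel channel) {
        this.channel = channel;
    }

    // Convenience factory (an assumption, not part of any JDK API).
    public static InputStream open(Path file) throws IOException {
        return new SeekingChannelInputStream(Files.newByteChannel(file));
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n;
        do {
            n = read(one, 0, 1);
        } while (n == 0); // retry if the channel returned no bytes
        return n < 0 ? -1 : one[0] & 0xFF;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        return channel.read(ByteBuffer.wrap(b, off, len));
    }

    @Override
    public long skip(long n) throws IOException {
        if (n <= 0) {
            return 0;
        }
        long pos = channel.position();
        // Never report more skipped bytes than actually remain in the file.
        long remaining = Math.max(0, channel.size() - pos);
        long skipped = Math.min(n, remaining);
        channel.position(pos + skipped);
        return skipped;
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Callers would only swap Files.newInputStream(imageFile) for SeekingChannelInputStream.open(imageFile); with two reader threads, each thread should probably open its own channel so that the positions do not interfere.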

Try setting the range with GetObjectRequest.setRange to get the same behavior as skip.
GetObjectRequest req = new GetObjectRequest(BUCKET_NAME, "myfile.zip");
req.setRange(1024); // start the download, skipping the first 1024 bytes
S3ObjectInputStream in = client.getObject(req).getObjectContent();
// read "in" while not eof
I used this to work around SocketTimeoutException in my implementation.
Each time I got a SocketTimeoutException, I restarted the download using setRange to skip the bytes that I had already downloaded.


How do I read system information in Java? [duplicate]

I'm currently building a Java app that could end up being run on many different platforms, but primarily variants of Solaris, Linux and Windows.
Has anyone been able to successfully extract information such as the current disk space used, CPU utilisation and memory used in the underlying OS? What about just what the Java app itself is consuming?
Preferably I'd like to get this information without using JNI.
You can get some limited memory information from the Runtime class. It really isn't exactly what you are looking for, but I thought I would provide it for the sake of completeness. Here is a small example. Edit: You can also get disk usage information from the java.io.File class. The disk space usage stuff requires Java 1.6 or higher.
public class Main {
public static void main(String[] args) {
/* Total number of processors or cores available to the JVM */
System.out.println("Available processors (cores): " +
Runtime.getRuntime().availableProcessors());
/* Total amount of free memory available to the JVM */
System.out.println("Free memory (bytes): " +
Runtime.getRuntime().freeMemory());
/* This will return Long.MAX_VALUE if there is no preset limit */
long maxMemory = Runtime.getRuntime().maxMemory();
/* Maximum amount of memory the JVM will attempt to use */
System.out.println("Maximum memory (bytes): " +
(maxMemory == Long.MAX_VALUE ? "no limit" : maxMemory));
/* Total memory currently available to the JVM */
System.out.println("Total memory available to JVM (bytes): " +
Runtime.getRuntime().totalMemory());
/* Get a list of all filesystem roots on this system */
File[] roots = File.listRoots();
/* For each filesystem root, print some info */
for (File root : roots) {
System.out.println("File system root: " + root.getAbsolutePath());
System.out.println("Total space (bytes): " + root.getTotalSpace());
System.out.println("Free space (bytes): " + root.getFreeSpace());
System.out.println("Usable space (bytes): " + root.getUsableSpace());
}
}
}
The java.lang.management package does give you a whole lot more info than Runtime - for example it will give you heap memory (ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()) separate from non-heap memory (ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()).
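As a small illustration of the two calls just mentioned, the sketch below prints heap and non-heap usage. The class name MemoryReport and the describe helper are made up for this example; the MXBean calls are the standard java.lang.management ones.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class MemoryReport {
    // Formats one MemoryUsage snapshot; getMax() is -1 when no maximum is defined.
    static String describe(String label, MemoryUsage usage) {
        return String.format("%s: used=%d committed=%d max=%d",
                label, usage.getUsed(), usage.getCommitted(), usage.getMax());
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe("Heap", memory.getHeapMemoryUsage()));
        System.out.println(describe("Non-heap", memory.getNonHeapMemoryUsage()));
    }
}
```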
You can also get process CPU usage (without writing your own JNI code), but you need to cast the java.lang.management.OperatingSystemMXBean to a com.sun.management.OperatingSystemMXBean. This works on Windows and Linux, I haven't tested it elsewhere.
For example (call the getCpuUsage() method more frequently to get more accurate readings):
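import static java.lang.management.ManagementFactory.getOperatingSystemMXBean;
import com.sun.management.OperatingSystemMXBean;
public class PerformanceMonitor {
private int availableProcessors = getOperatingSystemMXBean().getAvailableProcessors();
private long lastSystemTime = 0;
private long lastProcessCpuTime = 0;
public synchronized double getCpuUsage()
{
if ( lastSystemTime == 0 )
{
// First call: record a baseline; there is no interval to measure yet
baselineCounters();
return 0.0;
}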
long systemTime = System.nanoTime();
long processCpuTime = 0;
if ( getOperatingSystemMXBean() instanceof OperatingSystemMXBean )
{
processCpuTime = ( (OperatingSystemMXBean) getOperatingSystemMXBean() ).getProcessCpuTime();
}
double cpuUsage = (double) ( processCpuTime - lastProcessCpuTime ) / ( systemTime - lastSystemTime );
lastSystemTime = systemTime;
lastProcessCpuTime = processCpuTime;
return cpuUsage / availableProcessors;
}
private void baselineCounters()
{
lastSystemTime = System.nanoTime();
if ( getOperatingSystemMXBean() instanceof OperatingSystemMXBean )
{
lastProcessCpuTime = ( (OperatingSystemMXBean) getOperatingSystemMXBean() ).getProcessCpuTime();
}
}
}
I think the best method out there is to use the SIGAR API by Hyperic. It works for most of the major operating systems (darn near anything modern) and is very easy to work with. The developer(s) are very responsive on their forum and mailing lists. I also like that it is Apache licensed. They provide a ton of examples in Java too!
SIGAR == System Information, Gathering And Reporting tool.
There's a Java project that uses JNA (so no native libraries to install) and is in active development. It currently supports Linux, OSX, Windows, Solaris and FreeBSD and provides RAM, CPU, Battery and file system information.
https://github.com/oshi/oshi
For Windows I went this way.
com.sun.management.OperatingSystemMXBean os = (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();
long physicalMemorySize = os.getTotalPhysicalMemorySize();
long freePhysicalMemory = os.getFreePhysicalMemorySize();
long freeSwapSize = os.getFreeSwapSpaceSize();
long commitedVirtualMemorySize = os.getCommittedVirtualMemorySize();
Here is the link with details.
You can get some system-level information by using System.getenv(), passing the relevant environment variable name as a parameter. For example, on Windows:
System.getenv("PROCESSOR_IDENTIFIER")
System.getenv("PROCESSOR_ARCHITECTURE")
System.getenv("PROCESSOR_ARCHITEW6432")
System.getenv("NUMBER_OF_PROCESSORS")
For other operating systems the presence/absence and names of the relevant environment variables will differ.
Add OSHI dependency via maven:
<dependency>
<groupId>com.github.dblock</groupId>
<artifactId>oshi-core</artifactId>
<version>2.2</version>
</dependency>
Get the remaining battery capacity as a percentage:
SystemInfo si = new SystemInfo();
HardwareAbstractionLayer hal = si.getHardware();
for (PowerSource pSource : hal.getPowerSources()) {
System.out.println(String.format("%n %s # %.1f%%", pSource.getName(), pSource.getRemainingCapacity() * 100d));
}
Have a look at the APIs available in the java.lang.management package. For example:
OperatingSystemMXBean.getSystemLoadAverage()
ThreadMXBean.getCurrentThreadCpuTime()
ThreadMXBean.getCurrentThreadUserTime()
There are loads of other useful things in there as well.
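A minimal sketch of calling the three methods listed above follows. The class name ManagementSnapshot and its helper methods are hypothetical; the thread CPU time calls are guarded because not every JVM supports them.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ManagementSnapshot {
    // System load average for the last minute; negative if not available on this platform.
    static double systemLoadAverage() {
        return ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
    }

    // Total CPU time of the calling thread in nanoseconds, or -1 if unsupported or disabled.
    static long currentThreadCpuNanos() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.isCurrentThreadCpuTimeSupported() ? threads.getCurrentThreadCpuTime() : -1L;
    }

    // User-mode CPU time of the calling thread in nanoseconds, or -1 if unsupported or disabled.
    static long currentThreadUserNanos() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return threads.isCurrentThreadCpuTimeSupported() ? threads.getCurrentThreadUserTime() : -1L;
    }

    public static void main(String[] args) {
        System.out.println("Load average: " + systemLoadAverage());
        System.out.println("Thread CPU time (ns): " + currentThreadCpuNanos());
        System.out.println("Thread user time (ns): " + currentThreadUserNanos());
    }
}
```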
Usually, to get low level OS information you can call OS specific commands which give you the information you want with Runtime.exec() or read files such as /proc/* in Linux.
CPU usage isn't straightforward -- java.lang.management via com.sun.management.OperatingSystemMXBean.getProcessCpuTime comes close (see Patrick's excellent code snippet above), but note that it only gives access to the time the CPU spent in your process. It won't tell you about CPU time spent in other processes, or even CPU time spent doing system activities related to your process.
For instance, I have a network-intensive Java process -- it's the only thing running and the CPU is at 99%, but only 55% of that is reported as "processor CPU".
Don't even get me started on "load average", as it's next to useless despite being the only CPU-related item on the MX bean. If only Sun in their occasional wisdom had exposed something like "getTotalCpuTime"...
For serious CPU monitoring, SIGAR (mentioned by Matt) seems the best bet.
On Windows, you can run the systeminfo command and retrieve its output, for instance with the following code:
private static class WindowsSystemInformation
{
static String get() throws IOException
{
Runtime runtime = Runtime.getRuntime();
Process process = runtime.exec("systeminfo");
BufferedReader systemInformationReader = new BufferedReader(new InputStreamReader(process.getInputStream()));
StringBuilder stringBuilder = new StringBuilder();
String line;
while ((line = systemInformationReader.readLine()) != null)
{
stringBuilder.append(line);
stringBuilder.append(System.lineSeparator());
}
return stringBuilder.toString().trim();
}
}
If you are using the JRockit VM, here is another way of getting VM CPU usage. The Runtime bean can also give you CPU load per processor. I have used this only on Red Hat Linux to observe Tomcat performance. You have to enable JMX remote in catalina.sh for this to work.
JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://my.tomcat.host:8080/jmxrmi");
JMXConnector jmxc = JMXConnectorFactory.connect(url, null);
MBeanServerConnection conn = jmxc.getMBeanServerConnection();
ObjectName name = new ObjectName("oracle.jrockit.management:type=Runtime");
Double jvmCpuLoad =(Double)conn.getAttribute(name, "VMGeneratedCPULoad");
It is still under development, but you can already use jHardware.
It is a simple library that scrapes system data using Java. It works on both Linux and Windows.
ProcessorInfo info = HardwareInfo.getProcessorInfo();
//Get named info
System.out.println("Cache size: " + info.getCacheSize());
System.out.println("Family: " + info.getFamily());
System.out.println("Speed (Mhz): " + info.getMhz());
//[...]
One simple way to get OS-level information, which I tested on my Mac and which works well:
OperatingSystemMXBean osBean =
(OperatingSystemMXBean)ManagementFactory.getOperatingSystemMXBean();
return osBean.getProcessCpuLoad();
You can find many relevant metrics of the operating system here
To get the system load average over 1 minute, 5 minutes and 15 minutes from Java code, you can execute the command cat /proc/loadavg and interpret its output as below:
Runtime runtime = Runtime.getRuntime();
BufferedReader br = new BufferedReader(
new InputStreamReader(runtime.exec("cat /proc/loadavg").getInputStream()));
String avgLine = br.readLine();
System.out.println(avgLine);
List<String> avgLineList = Arrays.asList(avgLine.split("\\s+"));
System.out.println(avgLineList);
System.out.println("Average load 1 minute : " + avgLineList.get(0));
System.out.println("Average load 5 minutes : " + avgLineList.get(1));
System.out.println("Average load 15 minutes : " + avgLineList.get(2));
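Since /proc/loadavg is an ordinary readable file on Linux, the same values can probably be obtained without spawning a cat process. The sketch below reads the file directly; the class name LoadAverage and its methods are made up for this example, and it assumes the first three whitespace-separated fields are the 1, 5 and 15 minute averages, as in the snippet above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class LoadAverage {
    // Parses the first three fields of a /proc/loadavg line into doubles.
    static double[] parse(String line) {
        String[] fields = line.trim().split("\\s+");
        return new double[] {
            Double.parseDouble(fields[0]),
            Double.parseDouble(fields[1]),
            Double.parseDouble(fields[2])
        };
    }

    // Reads /proc/loadavg directly (Linux only).
    static double[] read() throws IOException {
        String line = Files.readAllLines(Paths.get("/proc/loadavg"), StandardCharsets.UTF_8).get(0);
        return parse(line);
    }

    public static void main(String[] args) throws IOException {
        double[] avg = read();
        System.out.println("Average load 1 minute : " + avg[0]);
        System.out.println("Average load 5 minutes : " + avg[1]);
        System.out.println("Average load 15 minutes : " + avg[2]);
    }
}
```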
To get the physical system memory, execute the command free -m and interpret its output as below:
Runtime runtime = Runtime.getRuntime();
BufferedReader br = new BufferedReader(
new InputStreamReader(runtime.exec("free -m").getInputStream()));
String line;
String memLine = "";
int index = 0;
while ((line = br.readLine()) != null) {
if (index == 1) {
memLine = line;
}
index++;
}
// total used free shared buff/cache available
// Mem: 15933 3153 9683 310 3097 12148
// Swap: 3814 0 3814
List<String> memInfoList = Arrays.asList(memLine.split("\\s+"));
int totalSystemMemory = Integer.parseInt(memInfoList.get(1));
int totalSystemUsedMemory = Integer.parseInt(memInfoList.get(2));
int totalSystemFreeMemory = Integer.parseInt(memInfoList.get(3));
System.out.println("Total system memory in mb: " + totalSystemMemory);
System.out.println("Total system used memory in mb: " + totalSystemUsedMemory);
System.out.println("Total system free memory in mb: " + totalSystemFreeMemory);
You can do this with Java/COM integration. By accessing WMI features you can get all the information.
Not exactly what you asked for, but I'd recommend checking out ArchUtils and SystemUtils from commons-lang3. These also contain some relevant helper facilities, e.g.:
import static org.apache.commons.lang3.ArchUtils.*;
import static org.apache.commons.lang3.SystemUtils.*;
System.out.printf("OS architecture: %s\n", OS_ARCH); // OS architecture: amd64
System.out.printf("OS name: %s\n", OS_NAME); // OS name: Linux
System.out.printf("OS version: %s\n", OS_VERSION); // OS version: 5.18.16-200.fc36.x86_64
System.out.printf("Is Linux? - %b\n", IS_OS_LINUX); // Is Linux? - true
System.out.printf("Is Mac? - %b\n", IS_OS_MAC); // Is Mac? - false
System.out.printf("Is Windows? - %b\n", IS_OS_WINDOWS); // Is Windows? - false
System.out.printf("JVM name: %s\n", JAVA_VM_NAME); // JVM name: Java HotSpot(TM) 64-Bit Server VM
System.out.printf("JVM vendor: %s\n", JAVA_VM_VENDOR); // JVM vendor: Oracle Corporation
System.out.printf("JVM version: %s\n", JAVA_VM_VERSION); // JVM version: 11.0.12+8-LTS-237
System.out.printf("Username: %s\n", getUserName()); // Username: johndoe
System.out.printf("Hostname: %s\n", getHostName()); // Hostname: garage-pc
var processor = getProcessor();
System.out.printf("CPU arch: %s\n", processor.getArch()); // CPU arch: BIT_64
System.out.printf("CPU type: %s\n", processor.getType()); // CPU type: X86

Slow service response Times : Java SecureRandom & /dev/random [duplicate]

I am trying to debug a few slow responses served by an app deployed on Tomcat.
Right now I am focussing on SecureRandom and /dev/random (some of the other probable causes have been investigated and ruled out).
The pattern is as follows:
The first call takes exactly 30.0xy seconds after Tomcat restart (even if the request arrives 4 minutes after the Startup)
Later, some calls take exactly 15.0pq seconds (there was no specific pattern that I could establish, pq being the approximate time taken in TP99)
The service call involves encryption and decryption (AES/ECB/PKCS5Padding).
Is it possible that SecureRandom init/repopulating is leading to this?
(Although, there is a log written in catalina.log that says "Creation of SecureRandom instance for session ID generation using [SHA1PRNG] took [28,760] milliseconds.")
Also, in order to check whether /dev/random or /dev/urandom is being used, I used the test from this question. To my surprise, I didn't see reads from either of them unlike the way it happens in the linked question.
These are the last few lines from the strace log:
3561 lstat("/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/jsse.jar", {st_mode=S_IFREG|0644, st_size=258525, ...}) = 0
3561 open("/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre/lib/jsse.jar", O_RDONLY) = 6
3561 stat("/dev/random", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 8), ...}) = 0
3561 stat("/dev/urandom", {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 9), ...}) = 0
3561 open("/dev/random", O_RDONLY) = 7
3561 open("/dev/urandom", O_RDONLY) = 8
3561 unlink("/tmp/hsperfdata_xxxx/3560") = 0
What is then being used for seeding SecureRandom?
fyi, java -version
java version "1.6.0_32"
OpenJDK Runtime Environment (IcedTea6 1.13.4) (rhel-7.1.13.4.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)
I could not check your exact OpenJDK version, but I could check jdk6-b33.
SecureRandom uses SeedGenerator to get the seed bytes
public byte[] engineGenerateSeed(int numBytes) {
byte[] b = new byte[numBytes];
SeedGenerator.generateSeed(b);
return b;
}
SeedGenerator gets the seedSource (String) from SunEntries
String egdSource = SunEntries.getSeedSource();
SunEntries first tries to get the source from the system property java.security.egd. If that is not found, it tries the property securerandom.source from the java.security properties file. If that property is not found either, it returns a blank string.
// name of the *System* property, takes precedence over PROP_RNDSOURCE
private final static String PROP_EGD = "java.security.egd";
// name of the *Security* property
private final static String PROP_RNDSOURCE = "securerandom.source";
final static String URL_DEV_RANDOM = "file:/dev/random";
final static String URL_DEV_URANDOM = "file:/dev/urandom";
private static final String seedSource;
static {
seedSource = AccessController.doPrivileged(
new PrivilegedAction<String>() {
public String run() {
String egdSource = System.getProperty(PROP_EGD, "");
if (egdSource.length() != 0) {
return egdSource;
}
egdSource = Security.getProperty(PROP_RNDSOURCE);
if (egdSource == null) {
return "";
}
return egdSource;
}
});
}
SeedGenerator checks this value to initialize its instance:
// Static instance is created at link time
private static SeedGenerator instance;
private static final Debug debug = Debug.getInstance("provider");
final static String URL_DEV_RANDOM = SunEntries.URL_DEV_RANDOM;
final static String URL_DEV_URANDOM = SunEntries.URL_DEV_URANDOM;
// Static initializer to hook in selected or best performing generator
static {
String egdSource = SunEntries.getSeedSource();
// Try the URL specifying the source
// e.g. file:/dev/random
//
// The URL file:/dev/random or file:/dev/urandom is used to indicate
// the SeedGenerator using OS support, if available.
// On Windows, the causes MS CryptoAPI to be used.
// On Solaris and Linux, this is the identical to using
// URLSeedGenerator to read from /dev/random
if (egdSource.equals(URL_DEV_RANDOM) || egdSource.equals(URL_DEV_URANDOM)) {
try {
instance = new NativeSeedGenerator();
if (debug != null) {
debug.println("Using operating system seed generator");
}
} catch (IOException e) {
if (debug != null) {
debug.println("Failed to use operating system seed "
+ "generator: " + e.toString());
}
}
} else if (egdSource.length() != 0) {
try {
instance = new URLSeedGenerator(egdSource);
if (debug != null) {
debug.println("Using URL seed generator reading from "
+ egdSource);
}
} catch (IOException e) {
if (debug != null)
debug.println("Failed to create seed generator with "
+ egdSource + ": " + e.toString());
}
}
// Fall back to ThreadedSeedGenerator
if (instance == null) {
if (debug != null) {
debug.println("Using default threaded seed generator");
}
instance = new ThreadedSeedGenerator();
}
}
If the source is
final static String URL_DEV_RANDOM = "file:/dev/random";
or
final static String URL_DEV_URANDOM = "file:/dev/urandom";
it uses the NativeSeedGenerator. On Windows this tries to use the native CryptoAPI; on Linux the class simply extends SeedGenerator.URLSeedGenerator:
package sun.security.provider;
import java.io.IOException;
/**
* Native seed generator for Unix systems. Inherit everything from
* URLSeedGenerator.
*
*/
class NativeSeedGenerator extends SeedGenerator.URLSeedGenerator {
NativeSeedGenerator() throws IOException {
super();
}
}
and calls the superclass constructor, which loads /dev/random by default:
URLSeedGenerator() throws IOException {
this(SeedGenerator.URL_DEV_RANDOM);
}
So OpenJDK uses /dev/random by default unless you set another value in the system property java.security.egd or in the securerandom.source property of the security properties file.
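To see which value a given JVM would pick up, the lookup order described above can be reproduced with public APIs. This is only a diagnostic sketch that mirrors the quoted SunEntries initializer; the class name SeedSourceCheck is hypothetical, and the real lookup lives in internal sun.security.provider classes that may differ between JDK versions.

```java
import java.security.Security;

public class SeedSourceCheck {
    // Mirrors the quoted lookup: the java.security.egd system property wins,
    // then the securerandom.source security property, else an empty string.
    static String resolveSeedSource() {
        String egdSource = System.getProperty("java.security.egd", "");
        if (egdSource.length() != 0) {
            return egdSource;
        }
        egdSource = Security.getProperty("securerandom.source");
        return egdSource == null ? "" : egdSource;
    }

    public static void main(String[] args) {
        String source = resolveSeedSource();
        System.out.println("Configured seed source: "
                + (source.isEmpty() ? "(none, threaded fallback)" : source));
    }
}
```

Running it with, for example, java -Djava.security.egd=file:/dev/./urandom SeedSourceCheck shows the override taking effect. Per the quoted SeedGenerator initializer, both file:/dev/random and file:/dev/urandom take the NativeSeedGenerator branch in this JDK version, which is presumably why a differently spelled URL such as file:/dev/./urandom is commonly suggested when the non-blocking device is wanted.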
If you want to see the read results using strace you can change the command line and add the trace=open,read expression
sudo strace -o a.strace -f -e trace=open,read java class
Then you can see something like this (I did the test with Oracle JDK 6):
13225 open("/dev/random", O_RDONLY) = 8
13225 read(8, "#", 1) = 1
13225 read(3, "PK\3\4\n\0\0\0\0\0RyzB\36\320\267\325u\4\0\0u\4\0\0 \0\0\0", 30) = 30
....
....
The Tomcat Wiki section on faster startup suggests using a non-blocking entropy source like /dev/urandom if you are experiencing delays during startup.
More info: https://wiki.apache.org/tomcat/HowTo/FasterStartUp#Entropy_Source
Hope this helps.
The problem is not SecureRandom per se but that /dev/random blocks if it doesn't have enough data. You can use urandom instead but that might not be a good idea if you need cryptographically strong random seeds.
On headless Linux systems you can install the haveged daemon. This keeps /dev/random topped up with enough data so that calls don't have to wait for the required entropy to be generated.
I've done this on a Debian AWS instance and watched SecureRandom generateBytes calls drop from 25 seconds to sub-millisecond (OpenJDK 1.7 something, can't remember the specific version).

How do you write to disk (with flushing) in Java and maintain performance?

Using the following code as a benchmark, the system can write 10,000 rows to disk in a fraction of a second:
void withSync() {
int f = open( "/tmp/t8" , O_RDWR | O_CREAT );
lseek (f, 0, SEEK_SET );
int records = 10*1000;
clock_t ustart = clock();
for(int i = 0; i < records; i++) {
write(f, "012345678901234567890123456789" , 30);
fsync(f);
}
clock_t uend = clock();
close (f);
printf(" sync() seconds:%lf writes per second:%lf\n", ((double)(uend-ustart))/(CLOCKS_PER_SEC), ((double)records)/((double)(uend-ustart))/(CLOCKS_PER_SEC));
}
In the above code, 10,000 records can be written and flushed out to disk in a fraction of a second, output below:
sync() seconds:0.006268 writes per second:0.000002
In the Java version, it takes over 4 seconds to write 10,000 records. Is this just a limitation of Java, or am I missing something?
public void testFileChannel() throws IOException {
RandomAccessFile raf = new RandomAccessFile(new File("/tmp/t5"),"rw");
FileChannel c = raf.getChannel();
c.force(true);
ByteBuffer b = ByteBuffer.allocateDirect(64*1024);
long s = System.currentTimeMillis();
for(int i=0;i<10000;i++){
b.clear();
b.put("012345678901234567890123456789".getBytes());
b.flip();
c.write(b);
c.force(false);
}
long e=System.currentTimeMillis();
raf.close();
System.out.println("With flush "+(e-s));
}
Returns this:
With flush 4263
Please help me understand what is the correct/fastest way to write records to disk in Java.
Note: I am using the RandomAccessFile class in combination with a ByteBuffer as ultimately we need random read/write access on this file.
Actually, I am surprised that test is not slower. The behavior of force is OS dependent, but broadly it forces the data to disk. If you have an SSD you might achieve 40K writes per second, but with an HDD you won't. The C example clearly isn't committing the data to disk, as even the fastest SSD cannot perform more than 235K IOPS (the manufacturers guarantee it won't go faster than that :D).
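The numbers in the question can be re-derived to see the scale of the gap. The C program prints records / ticks / CLOCKS_PER_SEC, which is why it reports 0.000002 rather than a rate; dividing the record count by the reported seconds gives the figure to compare against the IOPS limit above. Note also that clock() measures processor time rather than elapsed time, so time spent blocked inside fsync may not be included in the 0.006268 s at all. A small sketch (the class and method names are made up for illustration):

```java
public class WriteRate {
    // Records written per second of measured time.
    static double writesPerSecond(long records, double seconds) {
        return records / seconds;
    }

    public static void main(String[] args) {
        // Timings taken from the outputs quoted in the question.
        System.out.printf("C version (as timed by clock()): %.0f writes/s%n",
                writesPerSecond(10_000, 0.006268));
        System.out.printf("Java FileChannel.force version:  %.0f writes/s%n",
                writesPerSecond(10_000, 4.263));
    }
}
```

The first figure comes out at roughly 1.6 million writes per second, well above the 235K IOPS mentioned above, while the Java figure is roughly 2,300 per second.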
If you need the data committed to disk every time, you can expect it to be slow and entirely dependent on the speed of your hardware. If you just need the data flushed to the OS, so that no data is lost if the program crashes but the OS does not, you can write data without force. A faster option is to use memory mapped files. This will give you random access without a system call for each record.
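A minimal sketch of the memory mapped approach follows, under a few assumptions: the class name MappedRecordWriter is hypothetical, records are the fixed 30-byte string from the question, the whole file fits in one mapping (under 2 GB), and a single force() at the end is acceptable instead of one per record.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;

public class MappedRecordWriter {
    static final byte[] RECORD =
            "012345678901234567890123456789".getBytes(StandardCharsets.US_ASCII);

    // Writes `records` fixed-size records through a memory mapping and
    // returns the resulting file size in bytes.
    static long writeRecords(String path, int records) throws IOException {
        long size = (long) records * RECORD.length;
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
             FileChannel channel = raf.getChannel()) {
            raf.setLength(size); // size the file up front
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
            for (int i = 0; i < records; i++) {
                map.put(RECORD); // plain memory copy, no system call per record
            }
            map.force(); // ask the OS to write the mapped pages to the storage device
            return channel.size();
        }
    }

    public static void main(String[] args) throws IOException {
        long start = System.nanoTime();
        long size = writeRecords("/tmp/t5-mapped", 10_000);
        System.out.println("Wrote " + size + " bytes in "
                + (System.nanoTime() - start) / 1e6 + " ms");
    }
}
```

Random access works the same way through the absolute put(int, byte) and get(int) methods on the mapped buffer.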
I have a library, Java Chronicle, which can read/write 5-20 million records per second with a latency of 80 ns in text or binary formats, with random access, and can be shared between processes. It only works this fast because it is not committing the data to disk on every record, but you can verify that if the JVM crashes at any point, no data written to the chronicle is lost.
This code is more similar to what you wrote in C. Takes only 5 msec on my machine. If you really need to flush after every write, it takes about 60 msec. Your original code took about 11 seconds on this machine. BTW, closing the output stream also flushes.
public static void testFileOutputStream() throws IOException {
OutputStream os = new BufferedOutputStream( new FileOutputStream( "/tmp/fos" ) );
byte[] bytes = "012345678901234567890123456789".getBytes();
long s = System.nanoTime();
for ( int i = 0; i < 10000; i++ ) {
os.write( bytes );
}
long e = System.nanoTime();
os.close();
System.out.println( "outputstream " + ( e - s ) / 1e6 );
}
The Java equivalent of fputs is file.write("012345678901234567890123456789"); you are calling four functions in Java and just one in C, so the delay seems obvious.
I think this is most similar to your C version. I think the direct buffers in your Java example are causing many more buffer copies than the C version. This takes about 2.2 s on my (old) box.
public static void testFileChannelSimple() throws IOException {
RandomAccessFile raf = new RandomAccessFile(new File("/tmp/t5"),"rw");
FileChannel c = raf.getChannel();
c.force(true);
byte[] bytes = "012345678901234567890123456789".getBytes();
long s = System.currentTimeMillis();
for(int i=0;i<10000;i++){
raf.write(bytes);
c.force(true);
}
long e=System.currentTimeMillis();
raf.close();
System.out.println("With flush "+(e-s));
}

FileChannel.transferTo for large file in windows

Using Java NIO you can copy files faster. I found mainly two methods on the internet for doing this.
public static void copyFile(File sourceFile, File destinationFile) throws IOException {
if (!destinationFile.exists()) {
destinationFile.createNewFile();
}
FileChannel source = null;
FileChannel destination = null;
try {
source = new FileInputStream(sourceFile).getChannel();
destination = new FileOutputStream(destinationFile).getChannel();
destination.transferFrom(source, 0, source.size());
} finally {
if (source != null) {
source.close();
}
if (destination != null) {
destination.close();
}
}
}
In 20 very useful Java code snippets for Java Developers I found a different comment and trick:
public static void fileCopy(File in, File out) throws IOException {
FileChannel inChannel = new FileInputStream(in).getChannel();
FileChannel outChannel = new FileOutputStream(out).getChannel();
try {
// inChannel.transferTo(0, inChannel.size(), outChannel); // original -- apparently has trouble copying large files on Windows
// magic number for Windows, (64Mb - 32Kb)
int maxCount = (64 * 1024 * 1024) - (32 * 1024);
long size = inChannel.size();
long position = 0;
while (position < size) {
position += inChannel.transferTo(position, maxCount, outChannel);
}
} finally {
if (inChannel != null) {
inChannel.close();
}
if (outChannel != null) {
outChannel.close();
}
}
}
But I could not find or understand the meaning of
"magic number for Windows, (64Mb - 32Kb)"
It says that inChannel.transferTo(0, inChannel.size(), outChannel) has a problem on Windows. Is 67076096 bytes (= (64 * 1024 * 1024) - (32 * 1024)) the optimum for this method?
Windows has a hard limit on the maximum transfer size, and if you exceed it you get a runtime exception. So you need to tune. The second version you give is superior because it doesn't assume the file was transferred completely with one transferTo() call, which agrees with the Javadoc.
Setting the transfer size more than about 1MB is pretty pointless anyway.
EDIT Your second version has a flaw. You should decrement size by the amount transferred each time. It should be more like:
while (size > 0) { // we still have bytes to transfer
long count = inChannel.transferTo(position, size, outChannel);
if (count > 0)
{
position += count; // seeking position to last byte transferred
size-= count; // {count} bytes have been transferred, remaining {size}
}
}
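Putting the pieces of this answer together, a complete copy method might look like the sketch below. It is an illustration rather than a definitive implementation: the class name ChannelCopy is hypothetical, the 1 MB chunk size is an assumption taken from the remark above that larger transfers are pointless, and it stops if transferTo reports no progress so that a truncated source cannot cause an endless loop.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class ChannelCopy {
    // Assumed chunk size; well below the Windows limit discussed above.
    static final long CHUNK = 1024 * 1024;

    // Copies src to dst in bounded transferTo() calls and returns the bytes copied.
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                long count = in.transferTo(position, Math.min(CHUNK, size - position), out);
                if (count <= 0) {
                    break; // no progress, e.g. the source shrank while copying
                }
                position += count;
            }
            return position;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copy(Paths.get(args[0]), Paths.get(args[1])) + " bytes copied");
    }
}
```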
I have read that it is for compatibility with the Windows 2000 operating system.
Source: http://www.rgagnon.com/javadetails/java-0064.html
Quote: In win2000, the transferTo() does not transfer files > than 2^31-1 bytes. it throws an exception of "java.io.IOException: Insufficient system resources exist to complete the requested service is thrown." The workaround is to copy in a loop 64Mb each time until there is no more data.
There appears to be anecdotal evidence that attempts to transfer more than 64MB at a time on certain Windows versions result in a slow copy. Hence the check: this appears to be the result of some detail of the underlying native code that implements the transferTo operation on Windows.

Create file with given size in Java

Is there an efficient way to create a file with a given size in Java?
In C it can be done with ftruncate (see that answer).
Most people would just write n dummy bytes into the file, but there must be a faster way. I'm thinking of ftruncate and also of Sparse files…
Create a new RandomAccessFile and call the setLength method, specifying the desired file length. The underlying JRE implementation should use the most efficient method available in your environment.
The following program
import java.io.*;
class Test {
public static void main(String args[]) throws Exception {
RandomAccessFile f = new RandomAccessFile("t", "rw");
f.setLength(1024 * 1024 * 1024);
}
}
on a Linux machine will allocate the space using the ftruncate(2) system call:
6070 open("t", O_RDWR|O_CREAT, 0666) = 4
6070 fstat(4, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
6070 lseek(4, 0, SEEK_CUR) = 0
6070 ftruncate(4, 1073741824) = 0
while on a Solaris machine it will use the F_FREESP64 function of the fcntl(2) system call.
/2: open64("t", O_RDWR|O_CREAT, 0666) = 14
/2: fstat64(14, 0xFE4FF810) = 0
/2: llseek(14, 0, SEEK_CUR) = 0
/2: fcntl(14, F_FREESP64, 0xFE4FF998) = 0
In both cases this will result in the creation of a sparse file.
Since Java 8, this method works on Linux and Windows:
final ByteBuffer buf = ByteBuffer.allocate(4).putInt(2);
buf.rewind();
final OpenOption[] options = { StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW , StandardOpenOption.SPARSE };
final Path hugeFile = Paths.get("hugefile.txt");
try (final SeekableByteChannel channel = Files.newByteChannel(hugeFile, options);) {
channel.position(HUGE_FILE_SIZE);
channel.write(buf);
}
You can open the file for writing, seek to offset (n-1), and write a single byte. The OS will automatically extend the file to the desired number of bytes.
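A minimal sketch of this seek-and-write approach is shown below. The class name SeekAndWrite and the createFile helper are hypothetical, and whether the resulting file is actually sparse depends on the file system.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

public class SeekAndWrite {
    // Extends the file at `path` to `size` bytes by writing one zero byte at offset size - 1.
    static void createFile(String path, long size) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "rw")) {
            if (size > 0) {
                file.seek(size - 1);
                file.write(0);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        createFile("t", 1024L * 1024 * 1024);
    }
}
```

Note that this only grows a file; an existing file that is already longer than the requested size keeps its length, which is one reason setLength may be the simpler choice.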
