Buffer vs Unsafe - Outside JVM - java

I have a requirement to use a region of RAM over which the GC has no control. I read a few articles on the subject, which introduced me to two approaches. Both are shown in the following code.
package com.directmemory;

import java.lang.reflect.Field;
import java.nio.ByteBuffer;

import sun.misc.Unsafe;

public class DirectMemoryTest {

    public static void main(String[] args) {
        // Approach 1: direct ByteBuffer
        ByteBuffer directByteBuffer = ByteBuffer.allocateDirect(8);
        directByteBuffer.putDouble(1.0);
        directByteBuffer.flip();
        System.out.println(directByteBuffer.getDouble());

        // Approach 2: sun.misc.Unsafe
        Unsafe unsafe = getUnsafe();
        long pointer = unsafe.allocateMemory(8);
        unsafe.putDouble(pointer, 2.0);
        unsafe.putDouble(pointer + 8, 3.0);   // writes past the 8 bytes allocated above
        System.out.println(unsafe.getDouble(pointer));
        System.out.println(unsafe.getDouble(pointer + 8));
        System.out.println(unsafe.getDouble(pointer + 16)); // reads past the allocation
    }

    public static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }
}
I have a few questions:
1) Why should I even consider approach 1, given that, as I understand it, ByteBuffer.allocateDirect() cannot return a buffer with a capacity greater than 2 GB? If my requirement is to store, say, 3 GB of data, I have to create several buffers, which means that apart from storing the data I have the additional responsibility of identifying which of the 'n' buffers (each holding a pointer to direct memory) contains a given piece of data.
2) Isn't approach 2 a little faster than approach 1? I don't have to first find the right buffer and then the data; I just need an indexing mechanism for an object's fields and can call getDouble/getInt with the absolute address.
3) Is the allocation of direct memory (is it right to call it off-heap memory?) tied to a PID? If I have two Java processes on one machine, will allocateMemory calls in PID 1 and PID 2 always give me non-intersecting memory blocks?
4) Why does the last sysout statement not print 0.0? The idea is that every double uses 8 bytes, so I store 2.0 at the address returned by allocateMemory, 3.0 at that address + 8, and then stop. Shouldn't the value at address + 16 default to 0.0?

One point to consider is that sun.misc.Unsafe is not a supported API; it will be replaced by something else (http://openjdk.java.net/jeps/260).
1) If your code must run unchanged on Java 8 through Java 10 (and later), approach 1 with ByteBuffers is the way to go.
If you're ready to replace the use of sun.misc.Unsafe with whatever replaces it in Java 9 / Java 10, you may well go with sun.misc.Unsafe.
2) For very large data structures of more than 2 GB, approach 2 might be faster due to the additional indirection required in approach 1. However, without a solid (micro)benchmark I would not bet anything on it.
3) The allocated memory is always bound to the currently running JVM. So with two JVMs running on the same machine you will not get intersecting memory.
4) You allocated 8 bytes of uninitialized memory, so those 8 bytes are the only memory you may legally access. No guarantees are made about the memory beyond your allocated size.
4a) You are writing 8 bytes beyond your allocated memory (unsafe.putDouble(pointer+8, 3.0);), which already leads to memory corruption and can lead to a JVM crash on the next memory allocation.
4b) You are reading 16 bytes beyond your allocated memory, which (depending on your processor architecture and operating system and previous memory use) can lead to an immediate JVM crash.
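For completeness, here is a corrected sketch of approach 2 that stays inside the allocated block and releases it afterwards (it reuses the getUnsafe() helper from the question; three doubles need 3 * 8 = 24 bytes):
Unsafe unsafe = getUnsafe();
long base = unsafe.allocateMemory(3 * 8);
try {
    unsafe.putDouble(base, 1.0);
    unsafe.putDouble(base + 8, 2.0);
    unsafe.putDouble(base + 16, 3.0);
    System.out.println(unsafe.getDouble(base));      // 1.0
    System.out.println(unsafe.getDouble(base + 8));  // 2.0
    System.out.println(unsafe.getDouble(base + 16)); // 3.0
} finally {
    unsafe.freeMemory(base); // memory from allocateMemory is never garbage collected
}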

Related

What is an overhead for creating Java objects from lines of csv file

The code reads the lines of a CSV file like this:
Stream<String> strings = Files.lines(Paths.get(filePath));
then it maps each line in the mapper:
String[] tokens = line.split(",");
return new UserModel(tokens[0], tokens[1], tokens[2], tokens[3]);
and finally collects it:
Set<UserModel> current = currentStream.collect(toSet());
File size is ~500MB
I've connected to the server using jconsole and saw that the heap size grew from 200MB to 1.8GB while processing.
I can't understand where this 3x memory usage comes from - I expected a spike of around 500MB or so.
My first impression was that it's because there is no throttling and the garbage collector simply doesn't have enough time for cleanup.
But I tried using a Guava rate limiter to give the garbage collector time to do its job, and the result is the same.
Tom Hawtin made good points - I just want to expand on them and provide a bit more detail.
Java Strings take at least 40 bytes of memory (that's for an empty string) because of the Java object header overhead (see later) and an internal byte array.
That means the minimal size for a non-empty string (1 or more characters) is 48 bytes.
Nowadays the JVM uses Compact Strings, which means that ASCII-only strings occupy only 1 byte per character - before that it was a minimum of 2 bytes per char.
That means that if your file contains characters beyond the ASCII set, memory usage can grow significantly.
Streams also have more overhead compared to plain iteration with arrays/lists (see here Java 8 stream objects significant memory usage)
I guess your UserModel object adds at least 32 bytes of overhead on top of each line, because:
the minimum size of a Java object is 16 bytes, where the first 12 bytes are the JVM "overhead": the object's class reference (4 bytes when Compressed Oops are used) plus the Mark word (used for the identity hash code, biased locking and the garbage collectors);
the next 4 bytes are used by the reference to the first "token";
the next 12 bytes are used by the 3 references to the second, third and fourth "token";
and the last 4 bytes are required because of Java object alignment at 8-byte boundaries (on 64-bit architectures).
That being said, it's not clear whether you even use all the data that you read from the file - you parse 4 tokens from a line but maybe there are more?
Moreover, you didn't mention how exactly the heap size "grew" - whether it was the committed size or the used size of the heap. The used portion is what is actually occupied by live objects; the committed portion is what has been allocated by the JVM at some point but could be garbage-collected later; used < committed in most cases.
You'd have to take a heap snapshot to find out how much memory the resulting set of UserModel objects actually occupies - and it would be interesting to compare that to the size of the file.
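As a sketch of how one could measure this (my own suggestion, not something the original poster did): the JOL library (org.openjdk.jol:jol-core) can print the per-instance layout and the deep size of a parsed row. The UserModel constructor arguments below are placeholders.
import org.openjdk.jol.info.ClassLayout;
import org.openjdk.jol.info.GraphLayout;

public class UserModelFootprint {
    public static void main(String[] args) {
        // Shallow layout: object header, the four String references, padding.
        System.out.println(ClassLayout.parseClass(UserModel.class).toPrintable());

        // Deep size of one row, including the token Strings it references.
        UserModel row = new UserModel("id", "name", "email", "country"); // placeholder values
        System.out.println(GraphLayout.parseInstance(row).totalSize() + " bytes (deep)");
    }
}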
It may be that the String implementation is using UTF-16 whereas the file may be using UTF-8. That would be double the size, assuming all US-ASCII characters. However, I believe JVMs tend to use a compact form for Strings nowadays.
Another factor is that Java objects tend to be allocated on a nice round address. That means there's extra padding.
Then there's memory for the actual String object, in addition to the actual data in the backing char[] or byte[].
Then there's your UserModel object. Each object has a header, and references are usually 8 bytes (may be 4).
Lastly not all the heap will be allocated. GC runs more efficiently when a fair proportion of the memory isn't, at any particular moment, being used. Even C malloc will end up with much of the memory unused once a process is up and running.
Your code reads the full file into memory. Then you split each line into an array, and then you create an object of your custom class for each line. So basically you have 3 different pieces of "memory usage" for each line in your file!
While enough memory is available, the JVM might simply not waste time running the garbage collector while turning your 500 megabytes into three different representations. You are therefore likely to "triple" the number of bytes of your file, at least until the GC kicks in and throws away the no-longer-needed file lines and split arrays.
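A rough back-of-the-envelope calculation illustrates this (my own sketch, assuming an ASCII file and pre-Java-9 UTF-16 strings at 2 bytes per char; with compact strings the String costs roughly halve, and per-object headers are ignored; the row count is hypothetical):
long fileBytes    = 500L * 1024 * 1024;   // ~500 MB of characters on disk
long lineStrings  = fileBytes * 2;        // char[] data backing the line Strings
long tokenStrings = fileBytes * 2;        // roughly the same characters copied into the split() tokens
long modelObjects = 10_000_000L * 32;     // e.g. ~10 million rows at ~32 bytes of UserModel overhead each
System.out.printf("~%.1f GB before the GC reclaims the intermediates%n",
        (lineStrings + tokenStrings + modelObjects) / 1e9);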

Declaring multiple arrays with 64 elements 1000 times faster than declaring array of 65 elements

Recently I noticed declaring an array containing 64 elements is a lot faster (>1000 fold) than declaring the same type of array with 65 elements.
Here is the code I used to test this:
public class Tests {
    public static void main(String[] args) {
        double start = System.nanoTime();
        int job = 100000000; // 100 million
        for (int i = 0; i < job; i++) {
            double[] test = new double[64];
        }
        double end = System.nanoTime();
        System.out.println("Total runtime = " + (end - start) / 1000000 + " ms");
    }
}
This runs in approximately 6 ms, if I replace new double[64] with new double[65] it takes approximately 7 seconds. This problem becomes exponentially more severe if the job is spread across more and more threads, which is where my problem originates from.
This problem also occurs with different types of arrays such as int[65] or String[65].
This problem does not occur with large strings: String test = "many characters";, but does start occurring when this is changed into String test = i + "";
I was wondering why this is the case and if it is possible to circumvent this problem.
You are observing behavior caused by the optimizations performed by the JIT compiler of your Java VM. This behavior is reproducibly triggered for scalar arrays of up to 64 elements, and is not triggered for arrays larger than 64 elements.
Before going into details, let's take a closer look at the body of the loop:
double[] test = new double[64];
The body has no effect (no observable behavior). That means it makes no difference outside of the program execution whether this statement is executed or not. The same is true for the whole loop. So it might happen that the code optimizer translates the loop into something (or nothing) with the same functional behavior but different timing behavior.
For benchmarks you should at least adhere to the following two guidelines. If you had done so, the difference would have been significantly smaller.
Warm-up the JIT compiler (and optimizer) by executing the benchmark several times.
Use the result of every expression and print it at the end of the benchmark.
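For illustration, a minimal JMH sketch of such a benchmark (my own addition; it assumes the org.openjdk.jmh:jmh-core dependency and its annotation processor are set up). JMH handles the warm-up, and pushing every allocated array into the Blackhole keeps the allocations observable, so the 64-vs-65 gap shrinks dramatically:
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.infra.Blackhole;

public class ArrayAllocationBenchmark {

    @Benchmark
    public void allocate64(Blackhole bh) {
        bh.consume(new double[64]); // consuming the array prevents dead-code elimination
    }

    @Benchmark
    public void allocate65(Blackhole bh) {
        bh.consume(new double[65]);
    }
}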
Now let's go into details. Not surprisingly, there is an optimization that is triggered for scalar arrays no larger than 64 elements. The optimization is part of escape analysis: it puts small objects and small arrays onto the stack instead of allocating them on the heap - or, even better, optimizes them away entirely. You can find some information about it in the following article by Brian Goetz, written in 2005:
Urban performance legends, revisited: Allocation is faster than you think, and getting faster
The optimization can be disabled with the command line option -XX:-DoEscapeAnalysis. The magic value 64 for scalar arrays can also be changed on the command line. If you execute your program as follows, there will be no difference between arrays with 64 and 65 elements:
java -XX:EliminateAllocationArraySizeLimit=65 Tests
Having said that, I strongly discourage using such command-line options. I doubt it makes a huge difference in a realistic application. I would only use it if I were absolutely convinced of the necessity - and not based on the results of some pseudo-benchmarks.
There are any number of ways that there can be a difference, based on the size of an object.
As nosid stated, the JITC may be (most likely is) allocating small "local" objects on the stack, and the size cutoff for "small" arrays may be at 64 elements.
Allocating on the stack is significantly faster than allocating on the heap, and, more to the point, stack allocations do not need to be garbage collected, so GC overhead is greatly reduced. (And for this test case GC overhead is likely 80-90% of the total execution time.)
Further, once the value is stack-allocated, the JITC can perform "dead code elimination": it determines that the result of the new is never used anywhere and, after assuring itself that no side effects would be lost, eliminates the entire new operation and then the (now empty) loop itself.
Even if the JITC does not do stack allocation, it's entirely possible for objects smaller than a certain size to be allocated in the heap differently (e.g., from a different "space") than larger objects. (Normally this would not produce quite so dramatic a timing difference, though.)
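A cheap way to see whether the allocation and GC really are the difference (my own suggestion) is to watch the collector while the test runs:
java -verbose:gc Tests
With new double[64] you will typically see no (or almost no) GC output, because the allocations are eliminated or stack-allocated; with new double[65] the young generation is collected constantly, which is where most of the 7 seconds go. Exact behavior depends on the JVM version and flags.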

Does OutputStream.write(buf, offset, size) have memory leak on Linux?

I wrote a piece of Java code to create 500K small files (averaging 40K each) on CentOS. The original code is like this:
package MyTest;

import java.io.*;

public class SimpleWriter {
    public static void main(String[] args) {
        String dir = args[0];
        int fileCount = Integer.parseInt(args[1]);
        String content = "##$% SDBSDGSDF ASGSDFFSAGDHFSDSAWE^#$^HNFSGQW%##&$%^J#%##^$#UHRGSDSDNDFE$T##$UERDFASGWQR!#%!#^$##YEGEQW%!#%!!GSDHWET!^";

        // Build a ~40 KB payload by repeating the content string.
        StringBuilder sb = new StringBuilder();
        int count = 40 * 1024 / content.length();
        int remainder = (40 * 1024) % content.length();
        for (int i = 0; i < count; i++) {
            sb.append(content);
        }
        if (remainder > 0) {
            sb.append(content.substring(0, remainder));
        }
        byte[] buf = sb.toString().getBytes();

        for (int j = 0; j < fileCount; j++) {
            String path = String.format("%s%sTestFile_%d.txt", dir, File.separator, j);
            try {
                BufferedOutputStream fs = new BufferedOutputStream(new FileOutputStream(path));
                fs.write(buf);
                fs.close();
            } catch (FileNotFoundException fe) {
                System.out.printf("Hit file not found exception %s", fe.getMessage());
            } catch (IOException ie) {
                System.out.printf("Hit IO exception %s", ie.getMessage());
            }
        }
    }
}
You can run this by issuing the following command:
java -jar SimpleWriter.jar my_test_dir 500000
I thought this was simple code, but then I realized that it was using up to 14G of memory. I know that because when I used free -m to check the memory, the free memory kept dropping until my VM with 15G of memory had only 70 MB free left. I compiled this using Eclipse, against JDK 1.6 and then JDK 1.7; the result is the same. The funny thing is that if I comment out fs.write() and just open and close the stream, the memory stabilizes at a certain point. Once I put fs.write() back, the memory allocation just goes wild. 500K 40KB files is about 20G. It seems Java's stream writer never deallocates its buffers during the operation.
I once thought the Java GC does not have time to clean up. But that makes no sense, since I close the file stream for every file. I even ported my code to C# and ran it under Windows; the same code producing 500K 40KB files kept memory stable at a certain point, not 14G as under CentOS. At least C#'s behavior is what I expected, but I could not believe Java performs this way. I asked my colleagues who are experienced in Java. They could not see anything wrong in the code, but could not explain why this happened. And they admitted nobody had tried to create 500K files in a loop without stopping.
I also searched online and everybody says that the only thing you need to pay attention to is closing the stream, which I did.
Can anyone help me to figure out what's wrong?
Can anybody also try this and tell me what you see?
BTW, some people in this community tried the code on Windows and it seemed to work fine. I didn't try it on Windows; I only tried it on Linux, as I thought that's where people use Java. So it seems this issue happens on Linux.
I also did the following to limit the JVM heap, but it had no effect:
java -Xmx2048m -jar SimpleWriter.jar my_test_dir 500000
I tried to test your program on Win XP, JDK 1.7.25, and immediately got OutOfMemoryExceptions.
While debugging, with only 3000 as the count (args[1]), the count variable from this code:
int count = 40 * 1024 * 1024 / content.length();
int remainder = (40 * 1024 * 1024) % content.length();
for (int i = 0; i < count; i++) {
    sb.append(content);
}
count is 355449. So the String you are trying to create will be 355449 copies of content long, or, as you calculated, 40Mb long. I ran out of memory when i was 266587 and sb was 31457266 chars long, at which point each file I get is 30Mb.
The problem does not seem to be with memory or GC, but with the way you create the string.
Did you see files created or was memory eating up before any file was created?
I think your main problem is the line:
int count = 40 * 1024 * 1024 / content.length();
should be:
int count = 40 * 1024 / content.length();
to create 40K files, not 40Mb files.
[Edit2: The original answer is left in italics at the end of this post]
After your clarifications in the comments, I ran your code on a Windows machine (Java 1.6) and here are my findings (numbers are from VisualVM, OS memory as seen from the task manager):
Example with 40K size, writing to 500K files (no parameters to JVM):
Used Heap: ~4M, Total Heap: 16M, OS memory: ~16M
Example with 40M size, writing to 500 files (JVM parameters -Xms128m -Xmx512m; without parameters I get an OutOfMemory error when creating the StringBuilder):
Used Heap: ~265M, Heap size: ~365M, OS memory: ~365M
Especially from the second example you can see that my original explanation still stands. Yes, one would expect most of the memory to be freed, since the byte[] arrays of the BufferedOutputStream reside in the young generation (short-lived objects), but this a) does not happen immediately and b) when the GC decides to kick in (it actually does in my case), it will try to clear memory, but it can clear as much memory as it sees fit, not necessarily all of it. The GC does not provide any guarantees that you can count on.
So generally speaking, you should give the JVM as much memory as you feel comfortable with. If you need to keep memory usage low for specific functionality, you should try a strategy like the code example I give below in my original answer, i.e. just don't create all those byte[] objects.
Now, in your case with CentOS, it does seem that the JVM behaves strangely. Perhaps we could talk about a buggy or bad implementation. To classify it as a leak/bug, though, you should try to use -Xmx to restrict the heap. Also, please try what Peter Lawrey suggested: don't create the BufferedOutputStream at all (in the small-file case), since you just write all the bytes at once.
If it still exceeds the memory limit, then you have encountered a leak and should probably file a bug. (You could still complain, though, and they may optimize it in the future.)
[Edit1: The answer below assumed that the OP's code performed as many read operations as write operations, so the memory usage was justifiable. The OP clarified this is not the case, so his question is not answered.
"...my 15G memory VM..."
If you give the JVM that much memory, why should it try to run GC? As far as the JVM is concerned, it is allowed to take as much memory from the system as it likes and to run GC only when it thinks it is appropriate to do so.
Each new BufferedOutputStream will allocate an 8K buffer by default. The JVM will try to reclaim that memory only when it needs to. This is the expected behaviour.
Do not confuse the memory that you see as free from the system's point of view with free memory from the JVM's point of view. As far as the system is concerned, the memory is allocated and will be released when the JVM shuts down. As far as the JVM is concerned, all the byte[] arrays allocated by the BufferedOutputStreams are no longer in use; it is "free" memory and will be reclaimed if needed.
If for some reason you don't want this behaviour, you could try the following: extend the BufferedOutputStream class (e.g. create a ReusableBufferedOutputStream class) and add a new method, e.g. reUseWithStream(OutputStream os). This method would then clear the internal byte[], flush and close the previous stream, reset any variables used, and set the new stream. Your code would then become as below:
// Initialize once
ReusableBufferedOutputStream fs = new ReusableBufferedOutputStream();
for (int i = 0; i < fileCount; i++) {
    String path = String.format("%s%sTestFile_%d.txt", dir, File.separator, i);
    // Point the buffered stream at the next file
    fs.reUseWithStream(new FileOutputStream(path));
    fs.write(this._buf, 0, this._buf.length); // this._buf was allocated once, 40K long, containing the text
}
fs.close(); // Close the stream after we are done
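For reference, a minimal sketch of what such a ReusableBufferedOutputStream might look like (this is an illustration of the idea, not the answerer's actual class; it relies on BufferedOutputStream's protected buf/count fields and FilterOutputStream's protected out field):
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ReusableBufferedOutputStream extends BufferedOutputStream {

    public ReusableBufferedOutputStream() {
        // Start with a do-nothing stream; the internal byte[] buffer is allocated once here.
        super(new OutputStream() {
            @Override public void write(int b) { /* discarded until a real stream is set */ }
        });
    }

    // Flush any buffered bytes to the previous stream, close it, and attach
    // the new stream, reusing the same internal byte[] buffer.
    public void reUseWithStream(OutputStream os) throws IOException {
        flush();        // writes out anything still buffered for the old stream
        out.close();    // 'out' is the protected underlying stream from FilterOutputStream
        count = 0;      // 'count' is the protected buffer position from BufferedOutputStream
        out = os;
    }
}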
Using the above approach you will avoid creating many byte[] arrays. However, I don't see any problem with the expected behaviour, nor do you mention any problem other than "I see it takes too much memory" - you have configured it to use that memory, after all.]

File size vs. in memory size in Java

If I take an XML file that is around 2kB on disk and load the contents as a String into memory in Java and then measure the object size it's around 33kB.
Why the huge increase in size?
If I do the same thing in C++ the resulting string object in memory is much closer to the 2kB.
To measure the memory in Java I'm using Instrumentation.
For C++, I take the length of the serialized object (e.g string).
I think there are multiple factors involved.
First of all, as Bruce Martin said, objects in Java have an overhead of 16 bytes per object; C++ objects do not.
Second, Strings in Java might use 2 bytes per character instead of 1.
Third, it could be that Java reserves more memory for its Strings than C++'s std::string does.
Please note that these are just ideas where the big difference might come from.
Assuming that your XML file contains mainly ASCII characters and uses an encoding that represents them as single bytes, you can expect the in-memory size to be at least double, since Java uses UTF-16 internally (I've heard of some JVMs that try to optimize this, though). Added to that will be the overhead of 2 objects (the String instance and an internal char array) with some fields, IIRC about 40 bytes overall.
So your "object size" of 33kb is definitely not correct, unless you're using a weird JVM. There must be some problem with the method you use to measure it.
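For a rough sanity check (my own arithmetic, assuming a pre-Java-9 JVM that backs Strings with a UTF-16 char[]; exact header sizes vary by JVM):
long chars     = 2 * 1024;        // ~2 kB of ASCII -> ~2048 characters
long charArray = 16 + 2 * chars;  // array header + 2 bytes per char
long stringObj = 24;              // String object header plus its few fields
System.out.println((charArray + stringObj) + " bytes"); // ~4 kB - nowhere near 33 kB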
In Java, a String object carries some extra data that increases its size.
There is the object data, the array data and some other variables - an array reference, offset, length, etc.
Visit http://www.javamex.com/tutorials/memory/string_memory_usage.shtml for details.
String: a String's memory growth tracks its internal char array's growth. However, the String class adds another 24 bytes of overhead.
For a nonempty String of size 10 characters or less, the added overhead cost relative to useful payload (2 bytes for each char plus 4 bytes for the length), ranges from 100 to 400 percent.
More:
What is the memory consumption of an object in Java?
Yes, you should run the GC and give it time to finish: just call System.gc(); and print Runtime.getRuntime().totalMemory() in the loop. You would also be better off creating a million string copies in an array (measure the empty array size first, then the array filled with strings), to be sure that you measure the size of the strings and not of other service objects that may be present in your program. A String alone cannot take 32 kB, but a hierarchy of XML objects can.
That said, I cannot resist the irony that nobody cares about memory (and cache hits) in the Java world. We know that the JIT is improving and can outperform native C++ code in some cases, so there is no need to bother with memory optimization; premature optimization is the root of all evil.
As stated in other answers, Java's String adds overhead. If you need to store a large number of strings in memory, I suggest you store them as byte[] instead. That way the size in memory should be the same as the size on disk.
String -> byte[]:
String a = "hello";
byte[] aBytes = a.getBytes(StandardCharsets.UTF_8); // specify the charset explicitly
byte[] -> String:
String b = new String(aBytes, StandardCharsets.UTF_8);

Using too much Ram in Java

I'm writing a program in Java which has to make use of a large hash table - the bigger the hash table can be, the better (it's a chess program :P). Basically, as part of my hash table I have a long[] array, a short[] array, and two byte[] arrays. All of them should be the same size. When I set my table size to ten million, however, it crashes and says "java heap out of memory". This makes no sense to me. Here's how I see it:
1 long + 1 short + 2 bytes = 12 bytes
x 10,000,000 = 120,000,000 bytes
/ 1024 = 117,187.5 kB
/ 1024 = 114.4 MB
Now, 114 MB of RAM doesn't seem like too much to me. My Mac has 4GB of RAM in total, and I have an app called FreeMemory which shows how much RAM I have free, and it's around 2GB while running this program. Also, I set the Java preferences to -Xmx1024m, so Java should be able to use up to a gig of memory. So why won't it let me allocate just 114MB?
You predicted that it should use 114 MB, and if I run this (on a Windows box with 4 GB):
public static void main(String... args) {
    long used1 = memoryUsed();
    int Hash_TABLE_SIZE = 10000000;
    long[] pos = new long[Hash_TABLE_SIZE];
    short[] vals = new short[Hash_TABLE_SIZE];
    byte[] depths = new byte[Hash_TABLE_SIZE];
    byte[] flags = new byte[Hash_TABLE_SIZE];
    long used2 = memoryUsed() - used1;
    System.out.printf("%,d MB used%n", used2 / 1024 / 1024);
}

private static long memoryUsed() {
    return Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
}
prints
114 MB used
I suspect you are doing something else which is the cause of your problem.
I am using Oracle HotSpot Java 7 update 10
This does not take into account that each object reference also uses memory, along with other "hidden things"... We must also take alignment into account... a byte is not always a byte ;-)
Java Objects Memory Structure
How much memory is used by Java
To see how much memory is really in use, you can use a profiler:
visualvm
If you are using a standard HashMap (or similar from the JDK), each "long" is boxed/unboxed and really takes more than 8 bytes; you can use this as a base (it uses less memory):
NativeIntHashMap
From what I have read about BlueJ (and serious technical information is almost impossible to find), the BlueJ VM quite likely does not support primitive types at all; your arrays would actually be arrays of boxed primitives. BlueJ uses a subset of Java features, with an emphasis on object orientation.
If that is the case, plus taking into consideration that performance and efficiency are quite low on the BlueJ VM's list of priorities, you may actually be using quite a bit more memory than you think: a whole order of magnitude is quite imaginable.
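If boxing really is involved, the gap is easy to demonstrate on a standard JVM with the memoryUsed() helper from the first answer (my own sketch; exact sizes depend on the JVM, and GC activity can blur the numbers):
long before = memoryUsed();
long[] primitives = new long[10_000_000];             // ~80 MB: 8 bytes per element
System.out.printf("long[] : %,d MB%n", (memoryUsed() - before) / 1024 / 1024);

before = memoryUsed();
Long[] boxed = new Long[10_000_000];
for (int i = 0; i < boxed.length; i++) {
    boxed[i] = (long) i;                               // each element becomes its own small object
}
System.out.printf("Long[] : %,d MB%n", (memoryUsed() - before) / 1024 / 1024);
// Expect roughly 3-4x more for the boxed version: a reference per slot plus an object header per value.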
I believe one approach would be to clean up the heap memory after each execution; one link is here:
Java heap space out of memory
