Consider the following method:
@Override
public String toString()
{
    final StringBuilder sb = new StringBuilder();
    for (final Room room : map)
    {
        sb.append(room.toString());
        sb.append(System.getProperty("line.separator")); // THIS IS IMPORTANT
    }
    return sb.toString();
}
System.getProperty("line.separator") is called on every iteration.
Should I cache this value in a field, e.g. public static final String lineSeparator = System.getProperty("line.separator"),
and later use only lineSeparator?
Or is System.getProperty("line.separator") as fast as using a static field?
I see your question as presenting a false dichotomy. I would neither call getProperty every time, nor declare a static field for it. I'd simply extract it to a local variable in toString.
@Override
public String toString()
{
    final StringBuilder sb = new StringBuilder();
    final String newline = System.getProperty("line.separator");
    for (final Room room : map) sb.append(room.toString()).append(newline);
    return sb.toString();
}
BTW I have benchmarked the call. The code:
public class GetProperty
{
    static char[] ary = new char[1];

    @GenerateMicroBenchmark
    public void everyTime() {
        for (int i = 0; i < 100_000; i++) ary[0] = System.getProperty("line.separator").charAt(0);
    }

    @GenerateMicroBenchmark
    public void cache() {
        final char c = System.getProperty("line.separator").charAt(0);
        for (int i = 0; i < 100_000; i++) ary[0] = (char)(c | ary[0]);
    }
}
The results:
Benchmark Mode Thr Cnt Sec Mean Mean error Units
GetProperty.cache thrpt 1 3 5 10.318 0.223 ops/msec
GetProperty.everyTime thrpt 1 3 5 0.055 0.000 ops/msec
The cached approach is more than two orders of magnitude faster.
Do note that the overall impact of getProperty call against all that string building is very, very unlikely to be noticeable.
You do not need to fear that the line separator will change while your code is running, so I see no reason against caching it.
Caching a value is certainly faster than executing a call over and over, but the difference will probably be negligible.
If you have become aware of a performance problem that you know relates to this, yes.
If you haven't, then no, the lookup is unlikely to have enough overhead to matter.
This would fall under either or both of the general categories "micro-optimization" and "premature optimization." :-)
But if you're worried about efficiency, you probably have a much bigger opportunity in that your toString method is regenerating the string every time. If toString will be called a lot, rather than caching the line terminator, cache the generated string, and clear that whenever your map of rooms changes. E.g.:
@Override
public String toString()
{
    if (cachedString == null)
    {
        final StringBuilder sb = new StringBuilder();
        final String ls = System.getProperty("line.separator");
        for (final Room room : map)
        {
            sb.append(room.toString());
            sb.append(ls);
        }
        cachedString = sb.toString();
    }
    return cachedString;
}
...and when your map changes, do
cachedString = null;
That's a lot more bang for the buck (the buck being the overhead of an extra field). Granted it's per-instance rather than per-class, so (reference earlier comment about efficiency) only do it if you have a good reason to.
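A minimal sketch of the invalidation side, assuming a hypothetical addRoom mutator on the same class (Room, map, and cachedString are the names used above):
// Any method that mutates the map must also drop the cached string.
public void addRoom(final Room room)
{
    map.add(room);        // assumes map is a mutable collection of Room
    cachedString = null;  // forces toString() to rebuild on the next call
}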
Since it's so easy to do, why not? At the very least the implementation of System.getProperty() will have to do a hash table lookup (even if cached internally) to find the property you are requesting, then the virtual method getString() will be called on the resulting Object. None of these are very expensive but will need to be called multiple times. Not to mention many String temporaries will be created and need GCing after.
If you move this out to the top of your loop and reuse the same value, you avoid all of these problems. So why not?
If the system property is guaranteed to remain constant while the application runs, it can be cached, but in general caching means you lose the property's defining feature: changing the behavior by changing its value.
For instance, a text generator could use the property to generate text for Windows or for Linux and allow the property to be changed dynamically while the application runs, so why not?
In general, caching a property makes setProperty useless.
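A small sketch of that dynamic behavior; the class name and the replacement value are illustrative only:
public class SeparatorDemo {
    // Cached once when the class is initialized.
    private static final String CACHED = System.getProperty("line.separator");

    public static void main(String[] args) {
        // Code that re-reads the property sees the new value; the cached copy does not.
        System.setProperty("line.separator", "|");
        String fresh = System.getProperty("line.separator");
        System.out.println("re-read value changed: " + !fresh.equals(CACHED));
    }
}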
Related
IdentityHashMap is a special Map implementation in Java which compares object references (==) instead of calling equals(), and uses System.identityHashCode() instead of hashCode(). In addition, it uses a linear-probe hash table instead of chained Entry lists.
Map<String, String> map = new HashMap<>();
Map<String, String> iMap = new IdentityHashMap<>();
Does that mean that, for String keys, IdentityHashMap will usually be faster if tuned correctly?
See this example:
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class Dictionary {
    public static void main(String[] args) throws IOException {
        BufferedReader br = new BufferedReader(new FileReader("/usr/share/dict/words"));
        String line;
        ArrayList<String> list = new ArrayList<String>();
        while ((line = br.readLine()) != null) {
            list.add(line);
        }
        System.out.println("list.size() = " + list.size());

        Map<String, Integer> iMap = new IdentityHashMap<>(list.size());
        Map<String, Integer> hashMap = new HashMap<>(list.size());

        long iMapTime = 0, hashMapTime = 0;
        long time;
        for (int i = 0; i < list.size(); i++) {
            time = System.currentTimeMillis();
            iMap.put(list.get(i), i);
            time = System.currentTimeMillis() - time;
            iMapTime += time;

            time = System.currentTimeMillis();
            hashMap.put(list.get(i), i);
            time = System.currentTimeMillis() - time;
            hashMapTime += time;
        }
        System.out.println("iMapTime = " + iMapTime + " hashMapTime = " + hashMapTime);
    }
}
I tried a very basic performance check. I read dictionary words (~235K) and pushed them into both maps. It prints:
list.size() = 235886
iMapTime = 101 hashMapTime = 617
I think this is too good an improvement to ignore, unless I am doing something wrong here.
How does IdentityHashMap<String,?> work?
To make IdentityHashMap<String,?> work for arbitrary strings, you'll have to String.intern() both the keys you put() and potential keys you pass to get(). (Or use an equivalent mechanism.)
Note: unlike what is stated in @m3th0dman's answer, you don't need to intern() the values.
Either way, interning a string ultimately requires looking it up in some kind of hash table of already interned strings. So unless you had to intern your strings for some other reason anyway (and thus already paid the cost), you won't get much of an actual performance boost out of this.
So why does the test show a speedup, then?
Where your test is unrealistic is that you keep the exact list of keys you used with put() and iterate over them one by one in list order, so every lookup probes with the very same object reference that was inserted. (Note that much the same could be achieved by inserting the elements into a LinkedHashMap and simply calling iterator() on its entry set.)
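To see why that matters, here is a small sketch (not the original benchmark) of how the lookup behaves once the probe key is an equal but distinct instance:
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityLookupDemo {
    public static void main(String[] args) {
        Map<String, Integer> iMap = new IdentityHashMap<>();
        String key = "aardvark";
        iMap.put(key, 1);

        // Probing with the very same reference works, which is what the benchmark does.
        System.out.println(iMap.get(key)); // 1

        // Probing with an equal but distinct instance (e.g. a word re-read from a file) fails.
        System.out.println(iMap.get(new String("aardvark"))); // null

        // A HashMap finds it either way.
        Map<String, Integer> hMap = new HashMap<>(iMap);
        System.out.println(hMap.get(new String("aardvark"))); // 1
    }
}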
What's the point of IdentityHashMap then?
There are scenarios where it is guaranteed (or practically guaranteed) that object identity is the same as equals(). Imagine trying to implement your own ThreadLocal class for example, you'll probably write something like this:
public final class ThreadLocal<T> {
    private final IdentityHashMap<Thread, T> valueMap;
    ...
    public T get() {
        return valueMap.get(Thread.currentThread());
    }
}
Because you know that threads have no notion of equality beyond identity. Same goes if your map keys are enum values and so on.
You will see significantly faster performance from IdentityHashMap; however, that comes at a substantial cost.
You must be absolutely sure that you will never ever have objects added to the map that have the same value but different identities.
That's hard to guarantee both now and for the future, and a lot of people make mistaken assumptions.
For example
String t1 = "test";
String t2 = "test";
t1==t2 will return true.
String t1 = "test";
String t2 = new String("test");
t1==t2 will return false.
Overall, my recommendation is this: unless you absolutely, critically need the performance boost, know exactly what you are doing, and heavily lock down and comment access to the class, using IdentityHashMap opens you up to a massive risk of very hard-to-track-down bugs in the future.
Technically you can do something like this to make sure you have the same instance of the string representation:
public class StringIdentityHashMap extends IdentityHashMap<String, String>
{
    @Override
    public String put(String key, String value)
    {
        return super.put(key.intern(), value.intern());
    }

    @Override
    public void putAll(Map<? extends String, ? extends String> m)
    {
        m.entrySet().forEach(entry -> put(entry.getKey().intern(), entry.getValue().intern()));
    }

    @Override
    public String get(Object key)
    {
        if (!(key instanceof String)) {
            throw new IllegalArgumentException();
        }
        return super.get(((String) key).intern());
    }

    //implement the rest of the methods in the same way
}
But this won't help you very much, since intern() performs an equals()-based lookup to check whether the given String already exists in the String pool, so you end up with the performance of a typical HashMap.
This will, however, only help you improve memory usage, not CPU. There is no way to achieve better CPU usage while remaining sure your program is correct (short of relying on internal JVM knowledge that might change), because a String may or may not be in the String pool, and you cannot know whether it is without, at least implicitly, calling equals().
Interestingly, IdentityHashMap can be SLOWER. I am using Class objects as keys, and seeing a ~50% performance INCREASE with HashMap over IdentityHashMap.
IdentityHashMap and HashMap are different internally, so if the equals() method of your keys is really fast, HashMap seems better.
I am doing a Java code inspection. Here is a function (snippet):
String getValue() {
    String res;
    StringBuilder strBuilder = new StringBuilder();
    // More code here that sets strBuilder
    return res = strBuilder.toString();
}
First, there is a warning that the value of res is not used. Secondly, I don't understand the return. Why don't they just write return strBuilder.toString();? Is there some sort of advantage?
res is not used, so there is no reason to return like that. You can remove it:
String getValue() {
    StringBuilder bs = new StringBuilder();
    //
    // More code here that sets bs
    return bs.toString();
}
That sort of code can sometimes result from incomplete removal of debug artifacts:
String getValue() {
    String res;
    StringBuilder bs = new StringBuilder();
    //
    // More code here that sets bs
    res = bs.toString();
    // Test and/or display res here
    return res;
}
It certainly seems like a good candidate for the next round of refactoring and clean-up.
Just guessing, but some (most?) IDEs don't allow you to directly inspect the value of function returns. With this scheme, you could put a breakpoint at the end of the method, and mouse over "res" to get the return value.
You're absolutely right; the assignment to res makes no sense; return bs.toString(); would do the same.
P.S. +1 for not ignoring compiler warnings.
You can do either
String res = strBuilder.toString();
return res;
or, directly,
return strBuilder.toString();
As for the benefits you asked about, I always prefer returning directly. My reasoning is simple:
You write one line less (and you don't have to think about the variable's name, conflicts, and other such silly matters).
The value is not kept in a local variable waiting for the GC to collect it, so slightly less memory is used.
Writing to a variable and then reading it back just to return it is an extra read/write, isn't it?
None of these things is a big deal; I mention them only because you asked.
Can also be written as:
String getValue() {
    return new StringBuilder().toString();
}
I'm creating a StringBuilder to collect strings that I periodically flush to a server. If the flush fails, I want to keep the strings to try again next time, although in the mean time I might get additional strings to send which must be added to the StringBuilder.
What I want to know is what the most efficient way to do this would be, as this is being done in an Android app where battery usage and thus CPU usage is a big concern. Does calling StringBuilder's toString() function store the resulting string it returns internally so that a subsequent call doesn't have to do the work of copying all the original strings over? Or if the call fails, should I create a new StringBuilder initialized with the return value from toString()?
Here is the OpenJDK source code for StringBuilder:
public String toString() {
    // Create a copy, don't share the array
    return new String(value, 0, count);
}
The source for the String constructor with those parameters is:
public String(char value[], int offset, int count) {
    if (offset < 0) {
        throw new StringIndexOutOfBoundsException(offset);
    }
    if (count < 0) {
        throw new StringIndexOutOfBoundsException(count);
    }
    // Note: offset or count might be near -1>>>1.
    if (offset > value.length - count) {
        throw new StringIndexOutOfBoundsException(offset + count);
    }
    this.offset = 0;
    this.count = count;
    this.value = Arrays.copyOfRange(value, offset, offset + count);
}
So yes, it does create a new String every time, and yes, it makes a copy of the char[] every time.
It's important to note that this is one implementation of toString, and another implementation may obviously be different.
It is an implementation detail. Since Java strings are immutable, a correct implementation of StringBuilder.toString() is free either to share the underlying array or to create a new String, even when a copy is not strictly needed.
As everyone says, you can test to see if this is indeed a real performance issue for you. If it is, one (clunky) workaround is to wrap StringBuilder and cache the resulting string, using a dirty flag to indicate the content was modified, as in the sketch below.
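A rough sketch of that wrapper; the class and method names are made up, and only the operations relevant here are shown:
// Caches the last toString() result until the buffer is modified again.
public class CachingStringBuilder {
    private final StringBuilder sb = new StringBuilder();
    private String cached; // null means the buffer changed since the last toString()

    public CachingStringBuilder append(CharSequence s) {
        sb.append(s);
        cached = null; // mark dirty
        return this;
    }

    @Override
    public String toString() {
        if (cached == null) {
            cached = sb.toString(); // the copy happens only when dirty
        }
        return cached;
    }
}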
The StringBuilder.toString API documentation says that a new String object is allocated and initialized to contain the character sequence currently represented by the builder.
I'm the lead author of ORMLite which uses Java annotations on classes to build database schemas. A big startup performance problem for our package turns out to be the calling of annotation methods under Android 1.6. I see the same behavior up through 3.0.
We are seeing that the following simple annotation code is incredibly GC intensive and a real performance problem. 1000 calls to an annotation method takes almost a second on a fast Android device. The same code running on my Macbook Pro can do 28 million (sic) calls in the same time. We have an annotation that has 25 methods in it and we'd like to do more than 50 of these a second.
Does anyone know why this is happening and if there is any work around? There are certainly things that ORMLite can do in terms of caching this information but is there anything that we can do to "fix" annotations under Android? Thanks.
public void testAndroidAnnotations() throws Exception {
    Field field = Foo.class.getDeclaredField("field");
    MyAnnotation myAnnotation = field.getAnnotation(MyAnnotation.class);
    long before = System.currentTimeMillis();
    for (int i = 0; i < 1000; i++)
        myAnnotation.foo();
    Log.i("test", "in " + (System.currentTimeMillis() - before) + "ms");
}
@Target(FIELD) @Retention(RUNTIME)
private static @interface MyAnnotation {
    String foo();
}

private static class Foo {
    @MyAnnotation(foo = "bar")
    String field;
}
This results in the following log output:
I/TestRunner( 895): started: testAndroidAnnotations
D/dalvikvm( 895): GC freed 6567 objects / 476320 bytes in 85ms
D/dalvikvm( 895): GC freed 8951 objects / 599944 bytes in 71ms
D/dalvikvm( 895): GC freed 7721 objects / 524576 bytes in 68ms
D/dalvikvm( 895): GC freed 7709 objects / 523448 bytes in 73ms
I/test ( 895): in 854ms
EDIT:
After @candrews pointed me in the right direction, I did some poking around the code. The performance problem looks to be caused by some terrible, gross code in Method.equals(). It calls toString() on both methods and then compares the resulting strings. Each toString() uses a StringBuilder with a bunch of append() calls and no good initial size. Doing the equals by comparing fields would be significantly faster (see the sketch below).
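For illustration, a field-by-field comparison might look roughly like this; this is a sketch of the idea, not the actual Android/Harmony code:
import java.lang.reflect.Method;
import java.util.Arrays;

final class MethodCompare {
    // Compares the pieces that identify a method instead of its toString() text.
    static boolean methodsEqual(Method a, Method b) {
        return a.getDeclaringClass() == b.getDeclaringClass()
                && a.getName().equals(b.getName())
                && a.getReturnType() == b.getReturnType()
                && Arrays.equals(a.getParameterTypes(), b.getParameterTypes());
    }
}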
EDIT:
An interesting reflection performance improvement was given to me. We are now using reflection to peek inside the AnnotationFactory class to read the list of fields directly. This makes the reflection class 20 times faster for us since it bypasses the invoke which is using the method.equals() call. It is not a generic solution but here's the Java code from ORMLite SVN repository. For a generic solution, see yanchenko's answer below.
Google has acknowledged the issue and fixed it "post-Honeycomb"
https://code.google.com/p/android/issues/detail?id=7811
So at least they know about it and have supposedly fixed it for some future version.
Here's a generic version of Gray's & user931366's idea:
import java.lang.annotation.Annotation;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;

public class AnnotationElementsReader {

    private static Field elementsField;
    private static Field nameField;
    private static Method validateValueMethod;

    public static HashMap<String, Object> getElements(Annotation annotation)
            throws Exception {
        HashMap<String, Object> map = new HashMap<String, Object>();
        InvocationHandler handler = Proxy.getInvocationHandler(annotation);
        if (elementsField == null) {
            elementsField = handler.getClass().getDeclaredField("elements");
            elementsField.setAccessible(true);
        }
        Object[] annotationMembers = (Object[]) elementsField.get(handler);
        for (Object annotationMember : annotationMembers) {
            if (nameField == null) {
                Class<?> cl = annotationMember.getClass();
                nameField = cl.getDeclaredField("name");
                nameField.setAccessible(true);
                validateValueMethod = cl.getDeclaredMethod("validateValue");
                validateValueMethod.setAccessible(true);
            }
            String name = (String) nameField.get(annotationMember);
            Object val = validateValueMethod.invoke(annotationMember);
            map.put(name, val);
        }
        return map;
    }
}
I've benchmarked an annotation with 4 elements.
Millisecond times for 10000 iterations of either getting values of all of them or calling the method above:
Device Default Hack
HTC Desire 2.3.7 11094 730
Emulator 4.0.4 3157 528
Galaxy Nexus 4.3 1248 392
Here's how I've integrated it into DroidParts: https://github.com/yanchenko/droidparts/commit/93fd1a1d6c76c2f4abf185f92c5c59e285f8bc69.
To follow up on this, there's still a problem here when calling methods on annotations. The bug listed above by candrews fixes the getAnnotation() slowness, but calling a method on the annotation is still a problem due to the Method.equals() issues.
Couldn't find a bug report for Method.equals() so I created one here:
https://code.google.com/p/android/issues/detail?id=37380
Edit:
So my workaround for this (thanks for the ideas @Gray) is actually pretty simple.
(This is truncated code; some caching and such is omitted.)
annotationFactory = Class.forName("org.apache.harmony.lang.annotation.AnnotationFactory");
getElementDesc = annotationFactory.getMethod("getElementsDescription", Class.class);
Object[] members = (Object[]) getElementDesc.invoke(annotationFactory, clz); // these are AnnotationMember[]

Object element = null;
for (Object e : members) { // AnnotationMembers
    Field f = e.getClass().getDeclaredField("name");
    f.setAccessible(true);
    String fname = (String) f.get(e);
    if (methodName.equals(fname)) {
        element = e;
        break;
    }
}
if (element == null) throw new Exception("Element was not found");
Method m = element.getClass().getMethod("validateValue");
return m.invoke(element, args);
Your mileage will vary based on use, but in my case this was about 15-20 times faster than doing it the "right way".
I think if you manage to change the RUNTIME retention policy, it should not be that slow.
EDIT: I know, for your project that may not be an option. Perhaps it is more a problem of what you are doing with that annotation rather than bad performance in general.
Is it better, performance-wise, to use arrays or HashMaps when the indexes of the array are known? Keep in mind that the 'objects array/map' in the example is just an example; in my real project it is generated by another class, so I can't use individual variables.
ArrayExample:
SomeObject[] objects = new SomeObject[2];
objects[0] = new SomeObject("Obj1");
objects[1] = new SomeObject("Obj2");

void doSomethingToObject(String Identifier) {
    SomeObject object;
    if (Identifier.equals("Obj1")) {
        object = objects[0];
    } else if (Identifier.equals("Obj2")) {
        object = objects[1];
    }
    //do stuff
}
HashMapExample:
HashMap objects = new HashMap();
objects.put("Obj1", new SomeObject());
objects.put("Obj2", new SomeObject());

void doSomethingToObject(String Identifier) {
    SomeObject object = (SomeObject) objects.get(Identifier);
    //do stuff
}
The HashMap one looks much much better but I really need performance on this so that has priority.
EDIT: Well, arrays it is then; suggestions are still welcome.
EDIT: I forgot to mention: the size of the array/HashMap is always the same (6).
EDIT: It appears that HashMaps are faster:
Array: 128ms
Hash: 103ms
When using fewer cycles the HashMap was even twice as fast.
test code:
import java.util.HashMap;
import java.util.Random;
public class Optimizationsest {
    private static Random r = new Random();
    private static HashMap<String, SomeObject> hm = new HashMap<String, SomeObject>();
    private static SomeObject[] o = new SomeObject[6];
    private static String[] Indentifiers = {"Obj1", "Obj2", "Obj3", "Obj4", "Obj5", "Obj6"};
    private static int t = 1000000;

    public static void main(String[] args) {
        CreateHash();
        CreateArray();
        long loopTime = ProcessArray();
        long hashTime = ProcessHash();
        System.out.println("Array: " + loopTime + "ms");
        System.out.println("Hash: " + hashTime + "ms");
    }

    public static void CreateHash() {
        for (int i = 0; i <= 5; i++) {
            hm.put("Obj" + (i + 1), new SomeObject());
        }
    }

    public static void CreateArray() {
        for (int i = 0; i <= 5; i++) {
            o[i] = new SomeObject();
        }
    }

    public static long ProcessArray() {
        StopWatch sw = new StopWatch();
        sw.start();
        for (int i = 1; i <= t; i++) {
            checkArray(Indentifiers[r.nextInt(6)]);
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    private static void checkArray(String Identifier) {
        SomeObject object;
        if (Identifier.equals("Obj1")) {
            object = o[0];
        } else if (Identifier.equals("Obj2")) {
            object = o[1];
        } else if (Identifier.equals("Obj3")) {
            object = o[2];
        } else if (Identifier.equals("Obj4")) {
            object = o[3];
        } else if (Identifier.equals("Obj5")) {
            object = o[4];
        } else if (Identifier.equals("Obj6")) {
            object = o[5];
        } else {
            object = new SomeObject();
        }
        object.kill();
    }

    public static long ProcessHash() {
        StopWatch sw = new StopWatch();
        sw.start();
        for (int i = 1; i <= t; i++) {
            checkHash(Indentifiers[r.nextInt(6)]);
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    private static void checkHash(String Identifier) {
        SomeObject object = (SomeObject) hm.get(Identifier);
        object.kill();
    }
}
HashMap uses an array underneath so it can never be faster than using an array correctly.
Random.nextInt() is many times slower than what you are testing; even using an array to pick the identifiers is going to bias your results.
The reason your array benchmark is so slow is due to the equals comparisons, not the array access itself.
HashTable is usually much slower than HashMap because it does much the same thing but is also synchronized.
A common problem with micro-benchmarks is the JIT, which is very good at removing code that doesn't do anything. If you are not careful, you will only be testing whether you have confused the JIT enough that it cannot work out that your code doesn't do anything.
This is one of the reasons you can write micro-benchmarks which outperform C++ systems: Java is a simpler language, easier to reason about, and thus it is easier to detect code which does nothing useful. This can lead to tests which show that Java does "nothing useful" much faster than C++ ;)
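One common way to keep the JIT from discarding the work is to make each iteration's result observable, for example by accumulating it into a field that is printed at the end. A minimal sketch, not a full harness; the map and key array stand in for the ones in the question:
import java.util.Map;

public class BenchmarkSink {
    static int sink; // written every iteration so the JIT cannot drop the loop body

    static long timeLookups(Map<String, ?> map, String[] keys, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Object o = map.get(keys[i % keys.length]);
            sink += System.identityHashCode(o); // consume the result so it is not dead code
        }
        System.out.println(sink); // make the accumulated value observable
        return System.nanoTime() - start;
    }
}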
Arrays, when the indexes are known, are faster (HashMap uses an array of linked lists behind the scenes, which adds a bit of overhead on top of the array accesses, not to mention the hashing operations that need to be done).
And FYI, HashMap<String, SomeObject> objects = new HashMap<String, SomeObject>(); makes it so you won't have to cast.
For the example shown, HashTable wins, I believe. The problem with the array approach is that it doesn't scale. I imagine you want to have more than two entries in the table, and the conditional branch tree in doSomethingToObject will quickly get unwieldy and slow.
Logically, HashMap is definitely a fit in your case. From a performance standpoint it also wins, since with arrays you would need to do a number of string comparisons (in your algorithm), while with a HashMap you just use the hash code, provided the load factor is not too high. Both the array and the HashMap will need to be resized if you add many elements, but with a HashMap the elements also need to be redistributed; in that use case the HashMap loses.
Arrays will usually be faster than Collections classes.
PS. You mentioned HashTable in your post:
"The HashTable one looks much much better"
HashTable has even worse performance than HashMap, so I assume the mention of HashTable was a typo.
The example is strange. The key problem is whether your data is dynamic. If it is, you could not write your program that way (as in the array case). In other words, comparing your array and hash implementations is not fair: the hash implementation works for dynamic data, but the array implementation does not.
If you only have static data (6 fixed objects), either an array or a hash just works as a data holder. You could even define static objects, as in the sketch below.
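A small sketch of that static approach, assuming the SomeObject class and the "ObjN" identifiers from the question:
import java.util.HashMap;
import java.util.Map;

public class ObjectHolder {
    // Built once at class-load time; later lookups are a single hash probe.
    private static final Map<String, SomeObject> OBJECTS = new HashMap<>();
    static {
        for (int i = 1; i <= 6; i++) {
            OBJECTS.put("Obj" + i, new SomeObject());
        }
    }

    static SomeObject lookup(String identifier) {
        return OBJECTS.get(identifier);
    }
}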