Using collection size in for loop comparison - java

Is there a compiler optimization for the size() methods of Collections in Java?
Consider the following code:
for(int i=0;i<list.size();i++)
...some operation.....
There is a call to the size() method for every i. Wouldn't it be better to find the size once and reuse it? (Method calls have overhead.)
final int len = list.size();
for(int i=0;i<len;i++)
...some operation.....
However, when I timed both these code pieces there was no significant time difference, even for i as high as 10000000.
Am I missing something here?
Update1: I understand that the size is not computed again unless the collection changes. But there has to be some overhead associated with a method call. Is it the case that the compiler always inlines these (See Esko's answer)?
Update 2: My curiosity has been fueled further. From the answers given, I see that good JIT compilers will often inline this function call. But they will still have to determine whether the collection was modified or not. I am not accepting an answer in the hope that someone will give me pointers regarding how this is handled by compilers.
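For anyone re-running this comparison, a JMH harness avoids the usual pitfalls of hand-rolled timing such as dead-code elimination and missing warm-up. A minimal sketch, assuming JMH is on the classpath; the list size and element type are arbitrary choices for illustration:
import java.util.ArrayList;
import java.util.List;
import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
public class SizeInLoopBenchmark {
    List<Integer> list;

    @Setup
    public void setup() {
        list = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) list.add(i);
    }

    @Benchmark
    public void sizeInCondition(Blackhole bh) {
        // size() evaluated in the loop condition on every iteration
        for (int i = 0; i < list.size(); i++) bh.consume(list.get(i));
    }

    @Benchmark
    public void sizeHoisted(Blackhole bh) {
        // size() read once into a local before the loop
        final int len = list.size();
        for (int i = 0; i < len; i++) bh.consume(list.get(i));
    }
}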

Okay, here is an excerpt from the JDK sources (src.zip in the JDK folder):
public int size() {
return size;
}
This is from ArrayList, but I think other collections have similar implementations. Now if we imagine that the compiler inlines the size() call (which would make perfect sense), your loop turns into this:
for(int i=0;i<list.size;i++)
// ...
(Well, let's forget that the size field is private.) How does the compiler check whether the collection was modified? The answer is that it doesn't, and doesn't need to, because the size is already available in that field; all the loop has to do is read the size field on each iteration, and reading an int field is a very fast operation. It probably computes the field's address once, so it doesn't even have to dereference list on each iteration.
What happens when the collection is modified, say, by the add() method?
public boolean add(E e) {
ensureCapacity(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
As you can see, it just increments the size field. So the compiler doesn't actually need to do anything special to see the latest size. The only exception: if you modify the collection from another thread, you need to synchronize; otherwise the loop thread may see a locally cached value of size, which may or may not be up to date.

The value returned by a collection's .size() method is usually kept in a field that is updated as the collection is modified (elements added or removed), rather than recalculated on every call.
Instead of comparing loop-control variants, try using the for-each loop, since it uses an Iterator, which in some collection implementations is a lot faster than iterating by index.
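A minimal sketch of the for-each form; the element type and contents are placeholders:
import java.util.Arrays;
import java.util.List;

List<String> list = Arrays.asList("a", "b", "c");
// The enhanced for loop compiles down to an Iterator-based traversal for any Iterable:
for (String item : list) {
    System.out.println(item);
}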

Calling the size() method of a collection just returns an integer value that is already being kept track of. There isn't much of a time difference because size() isn't actually counting the items; the count is maintained as you add or remove them.

The Java Language Specification says that the loop's condition expression is evaluated on each iteration step. In your example, list.size() is called 10,000,000 times.
This doesn't matter in your case, because list implementations (usually) have a private field that stores the current list size. But it can cause trouble if the evaluation really takes time. In those cases it's advisable to store the result of the expression in a local variable.

Related

Does java compiler insert free when pointer is allocated and go out of scope in a block?

I am scratching my head trying to understand the point of the following code
Map<String, Set<MyOtherObj>> myMap = myapi.getMyMap();
final MyObj[] myObjList;
{
final List<MyObj> list = new ArrayList<>(myMap.size());
for (Entry<String, Set<MyOtherObj>> entry : myMap.entrySet()) {
final int myCount = MyUtility.getCount(entry.getValue());
if (myCount <= 0)
continue;
list.add(new MyObj(entry.getKey(), myCount));
}
if (list.isEmpty())
return;
myObjList = list.toArray(new MyObj[list.size()]);
}
Which can be rewritten into the following
Map<String, Set<MyOtherObj>> myMap = myapi.getMyMap();
final List<MyObj> list = new ArrayList<>(myMap.size());
for (Entry<String, Set<MyOtherObj>> entry : myMap.entrySet()) {
final int myCount = MyUtility.getCount(entry.getValue());
if (myCount <= 0)
continue;
list.add(new MyObj(entry.getKey(), myCount));
}
if (list.isEmpty())
return;
The only reasons I can think of why we put the ArrayList in a block and then reassign the content to an array are
The size of the ArrayList is bigger than the size of the list, so reassigning the ArrayList to an array saves space
There is some sort of compiler magic or GC magic that deallocates and reclaims the memory used by the ArrayList immediately after the block scope ends (e.g. like Rust); otherwise we are now sitting on up to 2 times the amount of space until GC kicks in.
So my question is, does the first code sample make sense, is it more efficient?
This code currently executes 20k message per second.
As stated in this answer:
Scope is a language concept that determines the validity of names. Whether an object can be garbage collected (and therefore finalized) depends on whether it is reachable.
So, no, the scope is not relevant to garbage collection, but for maintainable code, it’s recommended to limit the names to the smallest scope needed for their purpose. This, however, does not apply to your scenario, where a new name is introduced to represent the same thing that apparently still is needed.
You suggested the possible motivation
The size of the ArrayList is bigger than the size of the list, so reassigning the ArrayList to an array saves space
but you can achieve the same by declaring the variable list as ArrayList<MyObj> rather than List<MyObj> and calling trimToSize() on it after populating it.
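A minimal sketch of that alternative, reusing the names from the question:
final ArrayList<MyObj> list = new ArrayList<>(myMap.size());
// ... populate list exactly as in the original loop ...
list.trimToSize(); // replaces the backing array with one whose length equals list.size()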
There’s another possible reason, the idea that subsequently using a plain array was more efficient than using the array encapsulated in an ArrayList. But, of course, the differences between these constructs, if any, rarely matter.
Speaking of esoteric optimizations, specifying an initial array size when calling toArray was believed to be an advantage, until someone measured and analyzed it and found that myObjList = list.toArray(new MyObj[0]); is actually more efficient in real life.
Anyway, we can’t look into the author’s mind, which is the reason why any deviation from straightforward code should be documented.
Your alternative suggestion:
There is some sort of compiler magic or GC magic that deallocates and reclaims the memory used by the ArrayList immediately after the block scope ends (e.g. like Rust); otherwise we are now sitting on up to 2 times the amount of space until GC kicks in.
is missing the point. Any space optimization in Java is about minimizing the amount of memory occupied by objects still alive. It doesn’t matter whether unreachable objects have been identified as such, it’s already sufficient that they are unreachable, hence, potentially reclaimable. The garbage collector will run when there is an actual need for memory, i.e. to serve a new allocation request. Until then, it doesn’t matter whether the unused memory contains old objects or not.
So the code may be motivated by a space saving attempt and in that regard, it’s valid, even without an immediate freeing. As said, you could achieve the same in a simpler fashion by just calling trimToSize() on the ArrayList. But note that if the capacity does not happen to match the size, trimToSize()’s shrinking of the array doesn’t work differently behind the scenes, it implies creating a new array and letting the old one become subject to garbage collection.
But the fact that there’s no immediate freeing, and rarely a need for immediate freeing, should allow the conclusion that space saving attempts like this only matter in practice when the resulting object is supposed to persist for a very long time. When the lifetime of the copy is shorter than the time to the next garbage collection, it didn’t save anything, and all that remains is the unnecessary creation of a copy. Since we can’t predict the time to the next garbage collection, we can only make a rough categorization of the object’s expected lifetime (long or not so long)…
The general approach is to assume that in most cases, the higher capacity of an ArrayList is not a problem and the performance gain matters more. That’s why this class maintains a higher capacity in the first place.
No, it is done for the same reason that empty lines are added to code.
The variables in the block are scoped to that block and can no longer be used after it, so the reader does not need to pay attention to those block variables afterwards.
So this is more readable:
A a;
{ B b; C c; ... }
...
Than:
A a;
B b;
C c;
...
...
It is an attempt to structure the code to be more readable. For instance, above one can read it as "a declaration of A a, and then a block, probably filling a".
Lifetime analysis in the JVM is fine. Just as there is absolutely no need to set variables to null at the end of their usage.
Sometimes blocks are also abused to repeat blocks with the same local variables:
A a1;
{ B b; C c; ... a1 ... }
A a2;
{ B b; C c; ... a2 ... }
A a3;
{ B b; C c; ... a3 ... }
Needless to say, this is the opposite of better style.

Iterator versus Stream of Java 8

To take advantage of the wide range of query methods included in java.util.stream of JDK 8, I am attempting to design domain models where getters of relationships with * multiplicity (zero or more instances) return a Stream<T> instead of an Iterable<T> or Iterator<T>.
My doubt is whether there is any additional overhead incurred by the Stream<T> in comparison to the Iterator<T>.
So, is there any disadvantage of compromising my domain model with a Stream<T>?
Or instead, should I always return an Iterator<T> or Iterable<T>, and leave to the end-user the decision of choosing whether to use a stream, or not, by converting that iterator with the StreamUtils?
Note that returning a Collection is not a valid option because in this case most of the relationships are lazy and with unknown size.
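For context, wrapping an Iterator into a Stream needs no third-party helper; the JDK can do it directly. A minimal sketch, with a placeholder iterator standing in for the lazy source:
import java.util.Arrays;
import java.util.Iterator;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

Iterator<String> iterator = Arrays.asList("a", "b", "c").iterator(); // stand-in for the lazy source
Stream<String> stream = StreamSupport.stream(
        Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED),
        false); // false = sequential stream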
There's lots of performance advice here, but sadly much of it is guesswork, and little of it points to the real performance considerations.
@Holger gets it right by pointing out that we should resist the seemingly overwhelming tendency to let the performance tail wag the API design dog.
While there are a zillion considerations that can make a stream slower than, the same as, or faster than some other form of traversal in any given case, there are some factors that point to streams having a performance advantage where it counts -- on big data sets.
There is some additional fixed startup overhead of creating a Stream compared to creating an Iterator -- a few more objects before you start calculating. If your data set is large, it doesn't matter; it's a small startup cost amortized over a lot of computation. (And if your data set is small, it probably also doesn't matter -- because if your program is operating on small data sets, performance is generally not your #1 concern either.) Where this does matter is when going parallel; any time spent setting up the pipeline goes into the serial fraction of Amdahl's law; if you look at the implementation, we work hard to keep the object count down during stream setup, but I'd be happy to find ways to reduce it as that has a direct effect on the breakeven data set size where parallel starts to win over sequential.
But, more important than the fixed startup cost is the per-element access cost. Here, streams actually win -- and often win big -- which some may find surprising. (In our performance tests, we routinely see stream pipelines which can outperform their for-loop over Collection counterparts.) And, there's a simple explanation for this: Spliterator has fundamentally lower per-element access costs than Iterator, even sequentially. There are several reasons for this.
The Iterator protocol is fundamentally less efficient. It requires calling two methods to get each element. Further, because Iterators must be robust to things like calling next() without hasNext(), or hasNext() multiple times without next(), both of these methods generally have to do some defensive coding (and generally more statefulness and branching), which adds to inefficiency. On the other hand, even the slow way to traverse a spliterator (tryAdvance) doesn't have this burden. (It's even worse for concurrent data structures, because the next/hasNext duality is fundamentally racy, and Iterator implementations have to do more work to defend against concurrent modifications than do Spliterator implementations.)
Spliterator further offers a "fast-path" iteration -- forEachRemaining -- which can be used most of the time (reduction, forEach), further reducing the overhead of the iteration code that mediates access to the data structure internals. This also tends to inline very well, which in turn increases the effectiveness of other optimizations such as code motion, bounds check elimination, etc.
Further, traversal via Spliterator tends to have many fewer heap writes than with Iterator. With Iterator, every element causes one or more heap writes (unless the Iterator can be scalarized via escape analysis and its fields hoisted into registers). Among other issues, this causes GC card mark activity, leading to cache line contention for the card marks. On the other hand, Spliterators tend to have less state, and industrial-strength forEachRemaining implementations tend to defer writing anything to the heap until the end of the traversal, instead storing their iteration state in locals that naturally map to registers, resulting in reduced memory bus activity.
Summary: don't worry, be happy. Spliterator is a better Iterator, even without parallelism. (They're also generally just easier to write and harder to get wrong.)
Let’s compare the common operation of iterating over all elements, assuming that the source is an ArrayList. Then, there are three standard ways to achieve this:
Collection.forEach
final E[] elementData = (E[]) this.elementData;
final int size = this.size;
for (int i=0; modCount == expectedModCount && i < size; i++) {
action.accept(elementData[i]);
}
Iterator.forEachRemaining
final Object[] elementData = ArrayList.this.elementData;
if (i >= elementData.length) {
throw new ConcurrentModificationException();
}
while (i != size && modCount == expectedModCount) {
consumer.accept((E) elementData[i++]);
}
Stream.forEach which will end up calling Spliterator.forEachRemaining
if ((i = index) >= 0 && (index = hi) <= a.length) {
for (; i < hi; ++i) {
@SuppressWarnings("unchecked") E e = (E) a[i];
action.accept(e);
}
if (lst.modCount == mc)
return;
}
As you can see, the inner loop of the implementation code, where these operations end up, is basically the same, iterating over indices and directly reading the array and passing the element to the Consumer.
Similar things apply to all standard collections of the JRE; all of them have adapted implementations for all of these approaches, even if you are using a read-only wrapper. In the latter case, the Stream API would even slightly win: Collection.forEach has to be called on the read-only view in order to delegate to the original collection’s forEach, and similarly the iterator has to be wrapped to protect against attempts to invoke the remove() method. In contrast, spliterator() can directly return the original collection’s Spliterator, as it has no modification support. Thus, the stream of a read-only view is exactly the same as the stream of the original collection.
Though all these differences are hardly noticeable when measuring real-life performance, since, as said, the inner loop, which is the most performance-relevant part, is the same in all cases.
The question is which conclusion to draw from that. You still can return a read-only wrapper view of the original collection, as the caller still may invoke stream().forEach(…) to directly iterate in the context of the original collection.
Since the performance isn’t really different, you should rather focus on the higher level design like discussed in “Should I return a Collection or a Stream?”

Value does not update in while loop unless printed out [duplicate]

This question already has an answer here:
Loop doesn't see value changed by other thread without a print statement
(1 answer)
Closed 7 years ago.
Ok, so I have a monitoring thread that checks an ArrayList's size and does something once that size grows past a certain number. The problem I am having right now is that the size value is never updated unless I have a print statement in my loop. Here is some code to show exactly what I have going on.
while(working) {
// Get size function just returns the size of my list in my t class
int size = t.getSize();
if (size >= 10) {
//DO STUFF
}
}
The above code does not work. It never goes into the if statement. However, this works fine:
while(working) {
// Get size function just returns the size of my list in my t class
int size = t.getSize();
System.out.println(size);
if (size >= 10) {
//DO STUFF
}
}
EDIT: getSize() code:
public ArrayList<byte[]> myQueue = new ArrayList<byte[]>();
public int getSize() {
return myQueue.size();
}
NOTE: I have another thread running that is updating and adding to my list in my t class.
Any help? It is really annoying to have it spitting out numbers when I am trying to debug in the console.
If the only thing changing between your working and non-working code is the println statement, then you almost certainly have a threading issue. The System.out.println() call adds a small pause, which may coincidentally cause it to behave, but it is not solving the issue. You could do something like:
try {
Thread.sleep(10);
} catch (InterruptedException ex) {
}
...in place of this and check for the same behaviour; if indeed it is the same, then this pretty much confirms the threading issue. However, as pointed out below, println() also does a few other things, such as acting as a memory barrier, so this isn't a foolproof test. Another, perhaps better check would be to temporarily swap out ArrayList for Vector, which is thread safe - though this is a legacy collection, so it is not recommended for the final code.
If this is the case, it sounds like you're not synchronising on the ArrayList properly - ArrayList is not a thread-safe collection. Whenever you read or write the list, do it inside a synchronized block, synchronizing on the list:
synchronized(list) {
list.whatever();
}
...which will ensure that only one thread can access the ArrayList at once, hopefully solving the threading issue.
ArrayList is not synchronized, or otherwise prepared for use in two threads at once. In JLS terms, there is no happens-before relationship between the addition of elements in one thread and a size() call in another thread.
The cleanest solution would be to use a synchronized List implementation. You can either use Vector, or create a synchronized wrapper around your ArrayList using Collections.synchronizedList().
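A minimal sketch of that wrapper approach, reusing the field and method names from the question:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public List<byte[]> myQueue = Collections.synchronizedList(new ArrayList<byte[]>());

public int getSize() {
    // Every access goes through the wrapper's lock, so this read sees the writer thread's add() calls.
    return myQueue.size();
}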
Your reader is starving the writer, so it never gets a chance to run and never adds anything to your list.
By adding an I/O call (System.out.print) or a Thread.sleep, you put the reader thread in a blocking state, which allows the other one to run.
Generally, loops that consume 100% CPU like this are bad. At least add a short sleep/yield somewhere in the loop.
Without seeing any other code there are a number of things that could be going on here. If you give us a short reproducible example, that may help.
As far as 'a number of things' goes: it is possible, for instance, that the compiler hoists the read of size out of the while loop
int size = t.getSize();
while(working){
if(size >= 10){
}
}
But again that is just speculation at this point.

Hashtable: why is get method synchronized?

I know a Hashtable is synchronized, but why is its get() method synchronized?
It is only a read method, isn't it?
If reads were not synchronized, then the Hashtable could be modified during the execution of a read. New elements could be added, the underlying array could become too small and be replaced by a bigger one, and so on. Without sequential execution, it is difficult to deal with these situations.
However, even if get would not crash when the Hashtable is modified by another thread, there is another important aspect of the synchronized keyword, namely cache synchronization. Let's use a simplified example:
class Flag {
boolean value;
boolean get() { return value; } // WARNING: not synchronized
synchronized void set(boolean value) { this.value = value; }
}
set is synchronized, but get isn't. What happens if two threads A and B simultaneously read from and write to this class?
1. A calls get
2. B calls set
3. A calls get
Is it guaranteed at step 3 that A sees the modification made by thread B?
No, it isn't: A could be running on a different core, which uses a separate cache where the old value is still present. Thus, we have to force B to publish its write to memory and force A to fetch the new data.
How can we enforce this? Every time a thread enters and leaves a synchronized block, an implicit memory barrier is executed. A memory barrier forces the cache to be updated. However, both the writer and the reader have to execute a memory barrier; otherwise, the information is not properly communicated.
In our example, thread B already uses the synchronized method set, so its data modification is published at the end of that method. However, A still does not see the modified data. The solution is to make get synchronized as well, so it is forced to fetch the updated data.
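So the corrected version of the sketch above would look like this; for a simple flag like this, declaring the field volatile would be an alternative that also gives the required visibility, though it does not help with compound operations:
class Flag {
    boolean value;
    synchronized boolean get() { return value; }                  // the reader now crosses the memory barrier too
    synchronized void set(boolean value) { this.value = value; }
}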
Have a look at the Hashtable source code and you can think of lots of race conditions that could cause problems in an unsynchronized get().
(I am reading the JDK 6 source code.)
For example, rehash() creates an empty array, assigns it to the instance field table, and then puts the entries from the old table into the new one. Therefore, if your get occurs after the empty array assignment but before the entries have actually been put into it, you cannot find your key even though it is in the table.
Another example: get() iterates through the linked list at the table index. If a rehash happens in the middle of your iteration, you may also fail to find the entry even though it exists in the hashtable.
Hashtable is synchronized, meaning the whole class is thread-safe.
Inside Hashtable, not only the get() method is synchronized; many other methods are as well, and in particular the put() method is synchronized, like Tom said.
A read method must be synchronized just like a write method, because it ensures the visibility and the consistency of the variable.

How to read the last X entries of a vector while being thread safe?

I have a singleton logger that contains a vector. Objects from outside can append information to this vector by calling singletonLogger.append(String data) and read the whole vector by calling singletonLogger.getLogEntries(), which returns a string.
It would be nice to overload the getLogEntries method with an int parameter, e.g. getLogEntries(int x), to be able to get only the last x entries instead of the whole log.
Without regard to multiple threads, this would be easy, something like:
String getLogEntries(int x) {
    StringBuilder sb = new StringBuilder();
    int size = vector.size();
    for (int i = size - 1; i >= size - x && i >= 0; i--) {
        sb.append(vector.elementAt(i));
    }
    return sb.toString();
}
But of course, this is not really safe when taking multiple threads into account. Imagine the vector gets cleared by another thread shortly after its size was determined by the method above; the loop will crash.
On the other hand, I do not want to mark the whole method as synchronized, because the loop processing could last 5 - 10 seconds. This would block all the code that is trying to call the logger's methods, right?
Is there another way to reliably get the last x elements of a vector?
Thanks
Vector has a subList method that is synchronized, but that doesn't solve someone clearing the Vector from another thread. You could use a ReadWriteLock: take the readLock() when reading from the end of the Vector using subList(), and the writeLock() (which guarantees exclusive access) when clear() needs to be called. If your background thread is writing the log entries to disk or something, it should count the number of lines written, then take the writeLock() and remove that many entries from the front of the list instead of calling clear(). That would limit the time spent under the lock and be more efficient.
You might also consider maintaining your own internal queue so you can control the synchronization specifically. This may make it easier to clear the earlier entries from the queue. Then again, you may need a ReadWriteLock for that as well.
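A sketch of the ReadWriteLock idea described above; the names are illustrative, and since the lock now guards every access, a plain ArrayList replaces the Vector:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class SingletonLogger {
    private final List<String> entries = new ArrayList<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void append(String data) {
        lock.writeLock().lock();                 // writers get exclusive access
        try {
            entries.add(data);
        } finally {
            lock.writeLock().unlock();
        }
    }

    String getLogEntries(int x) {
        lock.readLock().lock();                  // many readers may hold the read lock at once
        try {
            int size = entries.size();
            StringBuilder sb = new StringBuilder();
            for (String entry : entries.subList(Math.max(0, size - x), size)) {
                sb.append(entry).append('\n');
            }
            return sb.toString();
        } finally {
            lock.readLock().unlock();
        }
    }

    void clear() {
        lock.writeLock().lock();                 // waits until all readers have released the read lock
        try {
            entries.clear();
        } finally {
            lock.writeLock().unlock();
        }
    }
}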
Did you consider copying the relevant elements to a new collection in a synchronized block and then handling them outside of it?
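That could look roughly like this; a sketch that assumes vector is the question's Vector<String> field. Vector's own methods synchronize on the Vector instance, so the block below serializes with them, and the slow string building happens without holding the lock:
String getLogEntries(int x) {
    java.util.List<String> copy;
    synchronized (vector) {                      // consistent snapshot of the last x entries
        int size = vector.size();
        copy = new java.util.ArrayList<>(vector.subList(Math.max(0, size - x), size));
    }
    StringBuilder sb = new StringBuilder();      // slow formatting done outside the lock
    for (String entry : copy) {
        sb.append(entry).append('\n');
    }
    return sb.toString();
}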
