I have the following method, which adds an element to a size-limited ArrayList. If the size of the ArrayList exceeds the limit, the oldest elements are removed (FIFO, "first in, first out") (version 1):
// adds the "item" into "list" and satisfies the "limit" of the list
public static <T> void add(List<T> list, final T item, int limit) {
    var size = list.size() + 1;
    if (size > limit) {
        var exceeded = size - limit;
        for (var i = 0; i < exceeded; i++) {
            list.remove(0);
        }
    }
    list.add(item);
}
The version 1 method works. However, I wanted to improve this method by using subList (version 2):
public static <T> void add(List<T> list, final T item, int limit) {
    var size = list.size() + 1;
    if (size > limit) {
        var exceeded = size - limit;
        list.subList(0, exceeded).clear();
    }
    list.add(item);
}
Both methods work. However, I want to know whether "version 2" is also more performant than "version 1".
EDIT:
improved "Version 3":
public static <T> void add(List<T> list, final T item, int limit) {
    var size = list.size() + 1;
    if (size > limit) {
        var exceeded = size - limit;
        if (exceeded > 1) {
            list.subList(0, exceeded).clear();
        } else {
            list.remove(0);
        }
    }
    list.add(item);
}
It seems you have the ArrayList implementation in mind where remove(0) imposes the cost of copying all remaining elements in the backing array, repeatedly if you invoke remove(0) repeatedly.
In this case, using subList(0, number).clear() is a significant improvement, as you’re paying the cost of copying elements only once instead of number times.
Since the copying costs of remove(0) and subList(0, number).clear() are identical when number is one, the 3rd variant would save the cost of creating a temporary object for the sublist in that case. This, however, is a tiny impact that doesn't depend on the size of the list (or any other aspect of the input) and usually isn't worth the more complex code. See also this answer for a discussion of the costs of a single temporary object. It's even possible that the cost of the sublist construction gets removed by the JVM's runtime optimizer. Hence, such a conditional should only be used when you experience an actual performance problem, the profiler traces the problem back to this point, and benchmarks prove that the more complicated code has a positive effect.
But this is all moot when you use an ArrayDeque instead. This class has no copying costs when removing its head element, hence you can simply remove excess elements in a loop.
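As an illustration, here is a minimal sketch of that ArrayDeque variant (the class and method names are my own; only the evict-in-a-loop idea comes from the paragraph above):

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BoundedAdd {
    // Adds "item" to "deque", evicting head elements so at most "limit" remain
    public static <T> void add(Deque<T> deque, T item, int limit) {
        while (deque.size() >= limit) {
            deque.pollFirst(); // O(1) on ArrayDeque: no element copying
        }
        deque.addLast(item);
    }

    public static void main(String[] args) {
        Deque<Integer> d = new ArrayDeque<>();
        for (int i = 1; i <= 5; i++) {
            add(d, i, 3);
        }
        System.out.println(d); // [3, 4, 5]
    }
}
```

Each eviction is constant-time, so the loop costs only as many O(1) steps as there are excess elements.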
Question 1: The problem is this line:
list = list.subList(exeeded, list.size());
You're reassigning the variable list, which does not change the object passed as an argument; it only changes the local reference.
Question 2: The sublist will (on an ArrayList) still need to recreate the backing array at some point. If you don't want that, you could use a LinkedList. But as a general rule the ArrayList will still perform better on the whole. Since the underlying array only has to be recreated when the capacity is exceeded, it usually doesn't matter a lot.
You could also try to actually shift the array: move every element to the next slot in the array. That way you would have to move all elements whenever a new one is added, but you wouldn't need to recreate the array. You would avoid the extra heap allocation, which is usually the biggest performance cost.
Related
I am trying to write a method that efficiently wraps each element in the List passed to it and returns an ArrayList with the wrapped elements.
According to the documentation:
The size(), isEmpty(), get(), set(), iterator(), and listIterator() operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.
Do I understand it right that if I create an ArrayList and pass the initial capacity to the constructor, the elements in the ArrayList won't be reallocated in memory when new ones are added?
Example:
public static <T> ArrayList<RequestToExternalSource<T>> wrapExternalSources(List<ExternalSource<T>> externalSources, BiConsumer<Integer, T> publishResult) {
    ArrayList<RequestToExternalSource<T>> requests = new ArrayList<>(externalSources.size());
    ListIterator<ExternalSource<T>> externalSourcesIterator = externalSources.listIterator();
    int index = 0;
    while (externalSourcesIterator.hasNext()) {
        requests.add(new RequestToExternalSource<>(
                index++,
                externalSourcesIterator.next(),
                publishResult));
    }
    return requests;
}
To answer this, we can look directly at the source code of ArrayList#add. We first see the following method:
public boolean add(E e) {
    modCount++;
    add(e, elementData, size);
    return true;
}
The method above calls the following private, overloaded add method:
private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1;
}
We can see that elementData (the Object[] that holds the data) will only grow when s (the size parameter, equal to ArrayList#size in our case) equals the length of the data array. For this reason, elementData is not grown even if we add n elements to an ArrayList initialized with a capacity of n, which is good!
Do I understand it right that if I create an ArrayList and pass the initial capacity to the constructor, the elements in the ArrayList won't be reallocated in memory when new ones are added?
For these reasons, yes, you're correct, until you add more elements than the capacity specified.
I have a java.util.LinkedList containing data logically like
1 > 2 > 3 > 4 > 5 > null
and I want to remove elements from 2 to 4 and make the LinkedList like this
1 > 5 > null
In reality we should be able to achieve this in O(n) complexity, considering you only have to break the chain at 2 and reconnect it at 5, a single operation.
In Java's LinkedList I am not able to find any function that removes a chain of elements using from and to indices in a single O(n) operation.
It only provides the option to remove the elements individually (making each operation O(n)).
Is there any way I can achieve this in just a single operation (without writing my own List)?
One solution provided here solves the problem using single line of code, but not in single operation.
list.subList(1, 4).clear();
The question was more about algorithmics and performance. When I checked the performance, this is actually slower than removing the elements one by one. I am guessing this solution does not actually remove an entire sublist in O(n) but removes the elements one by one (each removal being O(n)), plus the extra computation to create the sublist.
Average of 1000000 computations in ms:
Without sublist = 1414
With the provided sublist solution = 1846
The way to do it in one step is
list.subList(1, 4).clear();
as documented in the Javadoc for java.util.LinkedList#subList(int, int).
Having checked the source code, I see that this ends up removing the elements one at a time. subList is inherited from AbstractList. This implementation returns a List that simply calls removeRange on the backing list when you invoke clear on it. removeRange is also inherited from AbstractList and the implementation is
protected void removeRange(int fromIndex, int toIndex) {
    ListIterator<E> it = listIterator(fromIndex);
    for (int i = 0, n = toIndex - fromIndex; i < n; i++) {
        it.next();
        it.remove();
    }
}
As you can see, this removes the elements one at a time. listIterator is overridden in LinkedList; it finds the first node by following links from either the start or the end of the list (depending on whether fromIndex is in the first or second half of the list). This means that list.subList(i, j).clear() has time complexity
O(j - i + min(i, list.size() - i)).
Apart from the case where you are better off starting from the end and removing the elements in reverse order, I am not convinced there is a solution that is noticeably faster. Testing the performance of code is not easy, and it is easy to be drawn to false conclusions.
There is no way of using the public API of the LinkedList class to remove all the elements in the middle in one go. This surprised me, as about the only reason for using a LinkedList rather than an ArrayList is that you are supposed to be able to insert and remove elements from the middle efficiently, so I thought this case worth optimising (especially as it's so easy to write).
If you absolutely need the O(1) performance that you should be able to get from a call such as
list.subList(1, list.size() - 1).clear();
you will either have to write your own implementation or do something fragile and unwise with reflection like this:
public static void main(String[] args) {
    LinkedList<Integer> list = new LinkedList<>();
    for (int a = 0; a < 5; a++)
        list.add(a);
    removeRange_NEVER_DO_THIS(list, 2, 4);
    System.out.println(list); // [0, 1, 4]
}

public static void removeRange_NEVER_DO_THIS(LinkedList<?> list, int from, int to) {
    try {
        Method node = LinkedList.class.getDeclaredMethod("node", int.class);
        node.setAccessible(true);
        Object low = node.invoke(list, from - 1);
        Object hi = node.invoke(list, to);
        Class<?> clazz = low.getClass();
        Field nextNode = clazz.getDeclaredField("next");
        Field prevNode = clazz.getDeclaredField("prev");
        nextNode.setAccessible(true);
        prevNode.setAccessible(true);
        nextNode.set(low, hi);
        prevNode.set(hi, low);
        Field size = LinkedList.class.getDeclaredField("size");
        size.setAccessible(true);
        size.set(list, list.size() - to + from);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
To remove the middle elements in a single operation (method call) you could subclass java.util.LinkedList and then expose a call to the protected AbstractList.removeRange(int, int):
list.removeRange(1, 4);
(Credit to the person who posted this answer then removed it. :)) However, even this method calls ListIterator.remove() n times.
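As a sketch, such a subclass could look like this (the class name is my own invention; removeRange is protected in AbstractList, so a subclass may widen it to public):

```java
import java.util.LinkedList;

public class RangeRemovableLinkedList<E> extends LinkedList<E> {
    // Widen the protected AbstractList.removeRange to public
    @Override
    public void removeRange(int fromIndex, int toIndex) {
        super.removeRange(fromIndex, toIndex);
    }

    public static void main(String[] args) {
        RangeRemovableLinkedList<Integer> list = new RangeRemovableLinkedList<>();
        for (int i = 1; i <= 5; i++) list.add(i);
        list.removeRange(1, 4); // removes the elements 2, 3, 4
        System.out.println(list); // [1, 5]
    }
}
```

As noted above, this is only a nicer API, not a faster one: the inherited implementation still removes the elements one at a time.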
I do not believe there is a way to remove n consecutive entries from a java.util.LinkedList without performing n operations under the hood.
In general removing n consecutive items from any linked list seems to require O(n) operations as one must traverse from the start index to the end index one item at a time - inherently - in order to find the next list entry in the modified list.
The question is pretty much as stated in the title. I'm in an algorithms course and the professor and I disagree regarding whether or not operations performed on an ArrayList sublist (a sublist generated by ArrayList.sublist) can be considered 'in place'. To my read of the Java API:
Returns a view of the portion of this list between the specified fromIndex, inclusive, and toIndex, exclusive. (If fromIndex and toIndex are equal, the returned list is empty.) The returned list is backed by this list, so non-structural changes in the returned list are reflected in this list, and vice-versa. The returned list supports all of the optional list operations.
you are still manipulating the 'master' ArrayList directly. To his view, you are copying references from the 'master' array into a new sub-array which means employing ArrayList.subList is not considered 'in place'. Obviously, for the purposes of the course what he says goes (that is, if I want to pass :-/) but I would like to know either way for my own growth as a programmer. Code is below - and thank you!
public static int findK(int findME, int mVal, ArrayList<Integer> arr) {
    // pre-stage return variable
    int returnVal = -1;
    // make a subarray consisting of indexes 0 - ((m-2)+(m-1)).
    // this because the relationship between m, p, and q
    // is p>q>m - therefore the max value of q is m-1 and the
    // max value of p is m-2.
    int newArrSize = (mVal - 2) + (mVal - 1);
    ArrayList<Integer> subArr = new ArrayList<Integer>(arr.subList(0, newArrSize));
    // make the list smaller by looking at only the last [mVal]
    // elements. this because we know what we're looking for
    // has to be in the second section of the array, and that
    // section can't possibly be larger than mVal
    int fromIndex = subArr.size() - mVal;
    subArr = new ArrayList<Integer>(subArr.subList(fromIndex, subArr.size()));
    // at this point we can do a simple binary search, which on
    // a sorted array of size mVal is lg(m)
    while (subArr.size() > 1) {
        // get midpoint value
        int midPointIndex = subArr.size() / 2;
        int midPointValue = subArr.get(midPointIndex);
        // check for case where midpoint value is in the first
        // region of the array
        // check for case where the midpoint is less than the
        // findME value
        //
        // if true, discard first half of the array
        if ((midPointValue == 9000) || (midPointValue < findME)) {
            subArr = new ArrayList<Integer>(subArr.subList(midPointIndex, subArr.size()));
            continue;
        }
        // else if midpoint is greater than findME, discard the
        // second half of the array
        else if (midPointValue > findME) {
            subArr = new ArrayList<Integer>(subArr.subList(0, midPointIndex));
            continue;
        }
        // if we're here, we've found our value!
        returnVal = midPointValue;
        break;
    }
    // check for match and return result to caller
    // only perform check if we haven't already found the value
    // we're looking for
    if (returnVal == -1) returnVal = (subArr.get(0) == findME) ? (subArr.get(0)) : (-1);
    return returnVal;
}
I assume in this answer, that by "in place" actually "uses constant additional memory" is meant.
The sublist function creates a view of the original list. This uses only O(1) memory.
However you allocate a new list (Indices were replaced with my own names here, for simplicity):
subArr = new ArrayList<Integer> (subArr.subList(index1, index2));
What you do with such a statement is:
create a subList view (uses O(1) memory)
copy the sublist (uses O(sublist size) = O(index2 - index1) memory).
delete reference to subList (and by that the reference to the old list too)
Note that the garbage collector cannot reclaim the memory of the old list until all references to it are gone. The sublist view contains a reference to the old list, so the old list cannot be reclaimed until all references to the sublist view are gone. This means that, for a short while, you use O(index2 - index1) more memory than the list itself held at the beginning. Since binary search halves the list in every step, you use O(subArr.size()) additional memory, not O(1).
Lines like these:
subArr = new ArrayList<Integer> (subArr.subList(fromIndex, subArr.size()));
That's your "copy". The new ArrayList is indeed a copy of the data from the subList.
If you were using the subList "raw", then it could be better argued that you are "in place", because then the subList is simply a set of offsets into the original array.
But with the creation of new ArrayLists, you are definitely copying.
Of course, the easiest way to check is that when you find your value, change the value in the list to some sentinel (9999 or whatever). Then dump your original list. If 9999 shows up, it's in place. If not, then, it's not.
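That distinction is easy to see without running the full algorithm: mutating a raw subList view writes through to the original list, while mutating a copied ArrayList does not. A small demo (values are made up):

```java
import java.util.ArrayList;
import java.util.List;

public class SubListViewDemo {
    public static void main(String[] args) {
        List<Integer> master = new ArrayList<>(List.of(10, 20, 30, 40));

        // Raw view: writes go through to the backing list
        List<Integer> view = master.subList(1, 3);
        view.set(0, 9999);
        System.out.println(master); // [10, 9999, 30, 40]

        // Copy: writes stay in the copy, master is untouched
        List<Integer> copy = new ArrayList<>(master.subList(1, 3));
        copy.set(0, -1);
        System.out.println(master); // [10, 9999, 30, 40]
    }
}
```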
"In-place" isn't a term that applies to binary-search, as it almost always refers to how modifications are made (e.g. for sorting, like quicksort (in-place) and mergesort (not in-place)). So it's not something you need to worry about for binary search, as searching makes no modifications.
As for whether ArrayList#subList() copies data, a look at the source code should prove that that is incorrect. Unfortunately, the source is a tad long for a SO answer, but I'll do my best to summarize.
The subList() method is this:
public List<E> subList(int fromIndex, int toIndex) {
    subListRangeCheck(fromIndex, toIndex, size);
    return new SubList(this, 0, fromIndex, toIndex);
}
Where SubList is defined as the inner class:
private class SubList extends AbstractList<E> implements RandomAccess
with instance fields
private final AbstractList<E> parent;
private final int parentOffset;
private final int offset;
int size;
Note how none of those fields are a type of array or list.
Looking at the implementations of mutators shows that all work is delegated to the parent class. For example, here is set():
public E set(int index, E e) {
    rangeCheck(index);
    checkForComodification();
    E oldValue = ArrayList.this.elementData(offset + index);
    ArrayList.this.elementData[offset + index] = e;
    return oldValue;
}
Notice how ArrayList.this is used to refer to the containing ArrayList instance, which means that the source ArrayList implementation is modified and not any (nonexistent) copy.
The add() method shows something similar:
public void add(int index, E e) {
    rangeCheckForAdd(index);
    checkForComodification();
    parent.add(parentOffset + index, e);
    this.modCount = parent.modCount;
    this.size++;
}
parent.add() is used here, where parent is also the containing instance. So again, it is the source list that is modified, and not any (nonexistent) copy.
And so on and so forth.
However, as pointed out by Will Hartung, all this is moot if you pass the resulting SubList into the constructor of a new ArrayList<>(), as the constructor:
public ArrayList(Collection<? extends E> c) {
    elementData = c.toArray(); // <------------ This line
    size = elementData.length;
    // c.toArray might (incorrectly) not return Object[] (see 6260652)
    if (elementData.getClass() != Object[].class)
        elementData = Arrays.copyOf(elementData, size, Object[].class);
}
makes a copy of the internal array (through toArray()), which is the copy your professor/TA were likely talking about.
What is the best way to do a resizable array in Java? I tried using Vector, but that shifts all elements over when you do an insert, and I need an array that can grow but the elements stay in place. I'm sure there's a simple answer for this, but I'm still not quite sure.
As an alternative, you could use an ArrayList. It is a resizable-array implementation of the List interface.
Usage (using String):
List<String> myList = new ArrayList<String>();
myList.add("a");
myList.add("c");
myList.add("b");
The order will be just like you put them in: a, c, b.
You can also get an individual item like this:
String myString = myList.get(0);
Which will give you the 0th element: "a".
As Sanjo pointed out, "an array is a static data structure, so it can't grow". The List interface can be backed by an array (for example ArrayList, as Kevin pointed out in his post). When the list structure is full and a new item has to be added, the structure first creates a new array that can hold the old elements plus the new one.
The List interface has different implementations, which all have their pros and cons, and you should pick the one that best solves your problem. Below I try to give a short summary of when to use which implementation:
Not thread-safe implementations:
ArrayList: Resizable-array implementation of the List interface. The size, isEmpty, get, set, iterator, and listIterator operations run in constant time, and add runs in amortized constant time (adding n elements requires O(n) time). I think you should use this implementation when doing more lookups (get()) than insertions (add()).
LinkedList: This implementation is not backed by an array but "links" its nodes together. In my opinion you should use it when you are doing more add() than get().
Thread-safe implementations:
Be aware that the list implementations above aren't thread-safe, which means it is possible to get race conditions when accessing them from multiple threads. If you want to use List implementations from multiple threads, I would advise you to study the java.util.concurrent package and use an implementation from there.
You probably should use ArrayList instead of Vector for reasons explained in other answers.
However ...
I tried using Vector, but that shifts all elements over when you do an insert, and I need an array that can grow but the elements stay in place.
When you call insertElementAt(elem, pos), you have specifically asked for the element shifting. If you don't want the elements to be shifted, you should use set(pos, elem) instead. Or if you want to add the element at the end of the vector, you can also use add(elem).
Incidentally, the previous paragraph applies to all implementations of List, not just Vector, though the implementation details and performance vary across the different kinds of List.
I tried using Vector, but that shifts all elements over when you do an insert, and I need an array that can grow but the elements stay in place.
You probably want to use ArrayList instead of Vector.
They both provide about the same interface, and you can replace elements with both of them by calling set(idx, element). That does not do any shifting around. It also does not allow you to grow the array, though: you can only set at already occupied positions (not beyond the current size of the array); to add new elements at the end you have to use add(element).
The difference between ArrayList and Vector is that Vector has synchronization code which you most likely do not need, which makes ArrayList a little faster.
If you want to operate on the array data only after all elements have been inserted or deleted, you can use a LinkedList or ArrayList, which resize trivially. After the data input is finished, you can transfer the ArrayList to an array and then do all the things you normally do with an array.
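A small sketch of that workflow, using toArray for the final transfer (the values are made up):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListToArray {
    public static void main(String[] args) {
        // Build up the data in a resizable list
        List<Integer> list = new ArrayList<>();
        list.add(1);
        list.add(2);
        list.add(3);

        // Once inserts/deletes are done, move to a fixed-size array
        Integer[] arr = list.toArray(new Integer[0]);
        System.out.println(Arrays.toString(arr)); // [1, 2, 3]
    }
}
```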
ArrayList and LinkedList
Space Complexity:
a) ArrayList:
Allocates a chunk of memory when you initialize it, and doubles the capacity every time it reaches its maximum as you add elements dynamically.
b) LinkedList:
It allocates memory every time you add an item to the list.
Runtime Complexity:
a) ArrayList:
Search is faster, insertion and deletion is slower compared to linked list
b) LinkedList:
Insertion and deletion is faster, search is slower compared to array list
An array cannot be resized dynamically in Java. The solution to this is to use an ArrayList, or to create another temporary array and then assign it.
You can find tutorials about ArrayList, but if you just want a custom resizable array in Java, here it is. It is NOT recommended for real use! It's just a fake resizable array, and heap allocations pile up when you create too many objects. This is just to show you the idea.
The Interface
public interface Resizable<T> {
    void add(T data);
    int delete(int index);
    int size();
    void print();
}
Implementation Class
public class ResizeableImpl<T> implements Resizable<T> {
    private Object[] temp = null;
    private Object[] originals = new Object[0];

    @Override
    public void add(T data) {
        Object[] temp = new Object[originals.length + 1];
        for (int i = 0; i < originals.length; i++) {
            temp[i] = originals[i];
        }
        temp[originals.length] = data;
        originals = temp;
    }

    @Override
    public int delete(int index) {
        int success = 0;
        switch (originals.length) {
            case 0: // no data to delete
                success = 0;
                break;
            case 1: // one element is deleted, so no data remains
                originals = new Object[0];
                success = 1;
                break;
            default: // >= 2
                int count = 0;
                originals[index] = null;
                temp = new Object[originals.length - 1];
                for (int i = 0; i < originals.length; i++) {
                    if (originals[i] != null)
                        temp[count++] = originals[i];
                }
                originals = temp;
                success = 1;
        }
        return success;
    }

    @Override
    public int size() {
        return originals.length;
    }

    @Override
    public void print() {
        StringBuilder sb = null;
        if (originals.length == 0) {
            System.out.println("No data available!");
            return;
        }
        for (int i = 0; i < originals.length; i++) {
            if (sb == null) {
                sb = new StringBuilder();
                sb.append(originals[i]);
            } else {
                sb.append(", " + originals[i]);
            }
        }
        sb.append(".");
        System.out.println(sb.toString());
    }
}
Main method
public class App {
    public static void main(String[] args) {
        // Program to interfaces, not implementations
        Resizable<Integer> obj = new ResizeableImpl<>();
        obj.add(13);
        obj.add(20);
        obj.add(17);
        obj.add(25);
        obj.add(100);
        obj.add(12);
        obj.print();
        int result = obj.delete(2); // this will delete 17
        if (result == 1) {
            System.out.println("Deletion is successful!");
        }
        obj.print();
        obj.delete(3); // this will delete 100
        obj.print();
    }
}
Output
13, 20, 17, 25, 100, 12.
Deletion is successful!
13, 20, 25, 100, 12.
13, 20, 25, 12.
Use either ArrayList or LinkedList.
Using the classes in the Collections framework is better than using arrays.
But in case your question is from a "quizzing" perspective, here is what you should do.
Create your own resize method such as:
int[] oldArray = {1, 2, 3};
int newSize = 5; // target size (was left undefined in the original snippet)
int oldSize = java.lang.reflect.Array.getLength(oldArray);
Class<?> elementType = oldArray.getClass().getComponentType();
Object newArray = java.lang.reflect.Array.newInstance(elementType, newSize);
int preserveLength = Math.min(oldSize, newSize);
if (preserveLength > 0)
    System.arraycopy(oldArray, 0, newArray, 0, preserveLength);
oldArray = (int[]) newArray; // cast back to the array's concrete type
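For what it's worth, when the element type is known at compile time, Arrays.copyOf performs the same grow-and-copy (internally using the System.arraycopy call shown above) in a single call; the sizes here are made up for illustration:

```java
import java.util.Arrays;

public class Resize {
    public static void main(String[] args) {
        int[] oldArray = {1, 2, 3};
        int newSize = 5;
        // Copies min(oldArray.length, newSize) elements; pads with zeros when growing
        int[] newArray = Arrays.copyOf(oldArray, newSize);
        System.out.println(Arrays.toString(newArray)); // [1, 2, 3, 0, 0]
    }
}
```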
I am calculating a large number of possible resulting combinations of an algorithm. To sort these combinations I rate them with a double value and store them in a PriorityQueue. Currently, there are about 200k items in that queue, which is pretty memory-intensive. Actually, I only need, let's say, the best 1000 or 100 of all items.
So I just started to ask myself if there is a way to have a priority queue with a fixed size in Java. It should behave like this:
Is the item better than one of the already stored ones? If yes, insert it at the according position and throw away the element with the worst rating.
Does anyone have an idea? Thanks very much again!
Marco
que.add(d);
if (que.size() > YOUR_LIMIT)
    que.poll();
or did I missunderstand your question?
edit: forgot to mention that for this to work you probably have to invert your compareTo function, since each cycle the queue throws away its head, which is the least element according to the comparator. (If a is "better" than b, compare(a, b) should return a positive number.)
Example: to keep the biggest numbers, use something like this:
public int compare(Double first, Double second) {
    // keep the biggest values: the smallest sits at the head and gets polled away
    return Double.compare(first, second);
}
MinMaxPriorityQueue, Google Guava
There is indeed a class for maintaining a queue that, when adding an item that would exceed the maximum size of the collection, compares the items to find an item to delete and thereby create room: MinMaxPriorityQueue found in Google Guava as of version 8.
EvictingQueue
By the way, if you merely want deleting the oldest element without doing any comparison of the objects’ values, Google Guava 15 gained the EvictingQueue class.
There is a fixed size priority queue in Apache Lucene: http://lucene.apache.org/java/2_4_1/api/org/apache/lucene/util/PriorityQueue.html
It has excellent performance based on my tests.
Use SortedSet:
SortedSet<Item> items = new TreeSet<Item>(new Comparator<Item>(...));
...
void addItem(Item newItem) {
    items.add(newItem);
    if (items.size() > 100) {
        // evict the lowest-rated item so the set never exceeds the limit
        items.remove(items.first());
    }
}
Just poll() the queue if its least element is less than (in your case, has worse rating than) the current element.
static <V extends Comparable<? super V>>
PriorityQueue<V> nbest(int n, Iterable<V> valueGenerator) {
    PriorityQueue<V> values = new PriorityQueue<V>();
    for (V value : valueGenerator) {
        if (values.size() == n && value.compareTo(values.peek()) > 0)
            values.poll(); // remove least element, current is better
        if (values.size() < n) // we removed one or haven't filled up, so add
            values.add(value);
    }
    return values;
}
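A quick usage sketch (the nbest method is repeated so the example runs standalone; the input values are made up):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

public class NBestDemo {
    static <V extends Comparable<? super V>>
    PriorityQueue<V> nbest(int n, Iterable<V> valueGenerator) {
        PriorityQueue<V> values = new PriorityQueue<>();
        for (V value : valueGenerator) {
            if (values.size() == n && value.compareTo(values.peek()) > 0)
                values.poll(); // remove least element, current is better
            if (values.size() < n) // we removed one or haven't filled up, so add
                values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        // Keep the 3 largest values out of the stream 5, 1, 4, 2, 3
        PriorityQueue<Integer> best = nbest(3, List.of(5, 1, 4, 2, 3));
        List<Integer> sorted = new ArrayList<>(best);
        Collections.sort(sorted);
        System.out.println(sorted); // [3, 4, 5]
    }
}
```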
This assumes that you have some sort of combination class that implements Comparable that compares combinations on their rating.
Edit: Just to clarify, the Iterable in my example doesn't need to be pre-populated. For example, here's an Iterable<Integer> that will give you all natural numbers an int can represent:
Iterable<Integer> naturals = new Iterable<Integer>() {
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            int current = 0;

            @Override
            public boolean hasNext() {
                return current >= 0;
            }

            @Override
            public Integer next() {
                return current++;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
};
Memory consumption is very modest, as you can see - for over 2 billion values, you need two objects (the Iterable and the Iterator) plus one int.
You can of course rather easily adapt my code so it doesn't use an Iterable - I just used it because it's an elegant way to represent a sequence (also, I've been doing too much Python and C# ☺).
A better approach would be to moderate more tightly what goes into the queue, removing and appending to it as the program runs. It sounds like there would be some room to exclude some of the items before you add them to the queue. That would be simpler than reinventing the wheel, so to speak.
It seems natural to just keep the top 1000 each time you add an item, but PriorityQueue doesn't offer anything to achieve that gracefully. Maybe you can, instead of using a PriorityQueue, do something like this in a method:
List<Double> list = new ArrayList<Double>();
...
list.add(newOutput);
Collections.sort(list);
list = list.subList(0, 1000);
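One caveat with that idea: subList returns a view backed by the original (possibly large) list, so copying the slice into a fresh ArrayList releases the rest for garbage collection. A small variant with made-up sizes (note that the sort order may need reversing, depending on whether higher ratings are better):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KeepTopN {
    public static void main(String[] args) {
        List<Double> list = new ArrayList<>(List.of(3.0, 1.0, 2.0, 5.0, 4.0));
        Collections.sort(list, Collections.reverseOrder()); // highest rating first
        int limit = 3; // stand-in for the 1000 in the answer above
        // Copy the view so the trimmed list no longer pins the big backing list
        list = new ArrayList<>(list.subList(0, Math.min(limit, list.size())));
        System.out.println(list); // [5.0, 4.0, 3.0]
    }
}
```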