Best way to traverse List in Java? - java

I have a concurrent List used in a multi-threaded environment. Once the List is built, the main operation is traversing it. I am wondering which of the following two methods is more efficient, and what the cost of creating a new List is compared with using synchronized. Or maybe there are better ways?
List<Object> list = new CopyOnWriteArrayList<Object>();
public int[] getAllValue1() {
List<Object> list2 = new ArrayList<Object>(list);
int[] data = new int[list2.size()];
int i = 0;
for (Object obj : list2) {
data[i++] = obj.getValue();
}
return data;
}
public int[] getAllValue2() {
synchronized (list) {
int[] data = new int[list.size()];
int i = 0;
for (Object obj : list) {
data[i++] = obj.getValue();
}
return data;
}
}
UPDATE
getAllValue1(): It is thread-safe, because it takes a snapshot of the CopyOnWriteArrayList, which is itself a thread-safe List. However, as sharakan points out, the cost is iterating over two lists and creating a local ArrayList, which could be costly if the original list is large.
getAllValue2(): It is also thread-safe inside the synchronized block. (Assume other functions synchronize properly.) The reason for the synchronized block is that I want to pre-allocate the array, so the .size() call must be consistent with the iteration. (The iteration itself is thread-safe, because it uses a CopyOnWriteArrayList.) However, the cost here is contention on the synchronized block: if 1 million clients call getAllValue2(), each one has to wait.
So I guess the answer really depends on how many concurrent users need to read the data. With few concurrent users, method 2 is probably better; otherwise, method 1 is better. Agree?
In my usage I have a couple of concurrent clients, so method 2 is probably preferred. (BTW, my list has about 10k elements.)

getAllValue1 looks good given that you need to return an array of primitive types based on a field of your objects. It'll be two iterations, but it is consistent and you won't cause any contention between reader threads. You haven't posted any profiling results, but unless your list is quite large I'd be more worried about contention in a multithreaded environment than the cost of two complete iterations.
You could remove one iteration if you change the API. Easiest way to do that is to return a Collection instead, as follows:
public Collection<Integer> getAllValue1() {
List<Integer> list2 = new ArrayList<Integer>(list.size());
for (Object obj : list) { // iterate over the shared list, not the new one
list2.add(obj.getValue());
}
return list2;
}
If you can change your API that way, that'd be an improvement.

I think the second one is more efficient. The first one creates another list as a local copy, so if the original list contains a lot of data, all of it gets copied. With millions of elements, that will be an issue.
However, there is the list.toArray() method.
The Collections class also contains some useful methods:
Collection synchronizedCollection = Collections.synchronizedCollection(list);
or
List synchronizedList = Collections.synchronizedList(list);
If you need each object's value rather than the object itself, go with your second code. Otherwise, you can replace the appropriate parts of the second code with the methods above.
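As a hedged illustration of the toArray() suggestion: on a CopyOnWriteArrayList, toArray() copies the current backing array in one step, so the result array can be sized from that snapshot without a synchronized block. The Item class and the SnapshotValues holder below are hypothetical stand-ins for the question's element type and enclosing class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotValues {
    // Hypothetical element type standing in for the question's objects.
    static final class Item {
        private final int value;
        Item(int value) { this.value = value; }
        int getValue() { return value; }
    }

    private final List<Item> list = new CopyOnWriteArrayList<>();

    void add(Item item) {
        list.add(item);
    }

    int[] getAllValues() {
        // One consistent snapshot; later writes do not affect it.
        Object[] snapshot = list.toArray();
        int[] data = new int[snapshot.length];
        for (int i = 0; i < snapshot.length; i++) {
            data[i] = ((Item) snapshot[i]).getValue();
        }
        return data;
    }

    public static void main(String[] args) {
        SnapshotValues holder = new SnapshotValues();
        holder.add(new Item(3));
        holder.add(new Item(7));
        System.out.println(Arrays.toString(holder.getAllValues()));
    }
}
```

This still walks the data twice (once for the copy, once for the mapping), but readers never block each other.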

Edit (again):
Since you are using a copy on write array list (should've been more observant) I would get the iterator and use that to initialize your array. Since the iterator is a snapshot of the array at the time you ask for it you can synchronize on the list to get the size and then subsequently iterate without fear of ConcurrentModificationException or having the iterator change.
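public int[] getAllValue1() {
int[] data;
Iterator<Object> it;
synchronized (list) { // assumes writers also synchronize on list
data = new int[list.size()];
it = list.iterator(); // snapshot taken while the size is still valid
}
int idx = 0;
while (it.hasNext()) {
data[idx++] = it.next().getValue();
}
return data;
}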

Related

Good programming practice while returning empty list

List<String> doSomething(String input){
if(input == null){
return Collections.emptyList();
}
List<String> lst = getListfromSomewhereElse(input);
if(lst.isEmpty()){
return Collections.emptyList(); //Approach 1
// return lst; //Return same empty list
}
// do some more processing on lst
return lst;
}
I prefer approach 1 because it's more readable and explicit. Which is better, approach 1 or 2?
The question is: if the list is empty, should I return the same list, or explicitly create a new empty list and return that?
Collections.emptyList() returns a shared constant member of Collections, so it costs no extra time (the JIT can optimize it) or memory.
On the other hand, returning the result of getListfromSomewhereElse may keep alive an empty list created by other code. You could get any list class, and it could hold on to some memory. Generally that's not a problem when the method is written, reviewed and tested by your own team, but who knows what happens in external libraries?
For example, getListfromSomewhereElse could read a really large file into memory and then remove all elements from the list. The empty list would then still hold capacity for thousands of elements unless you (or they) know its structure and trim the excess capacity. Approach 1 simply avoids this by using the already existing constant list.
As a side note, if you process list elements in Java 8 stream style, you naturally get a new list from the .collect(Collectors.toList()) step. The JDK developers don't force emptyList in this case.
So, unless you are sure about getListfromSomewhereElse, you had better return Collections.emptyList() (or new ArrayList() or whatever list type your method contract calls for).
I would prefer
List<String> doSomething(String input) {
List<String> list = new ArrayList<String>();
if (input != null) {
List<String> listFromSomewhereElse = getListfromSomewhereElse(input);
list.addAll(listFromSomewhereElse);
}
return list;
}
Keep in mind that Collections.emptyList() is unmodifiable. Depending on the result of getListFromSomewhereElse a client of doSomething might be confused that it can sometimes modify the list it gets and under some other situation it throws an UnsupportedOperationException. E.g.
List<String> list = someClass.doSomething(null);
list.add("A");
will throw UnsupportedOperationException
while
List<String> list = someClass.doSomething("B");
list.add("A");
might work depending on the result of getListFromSomewhereElse
It's very seldom necessary to do (pseudocode):
if(input list is empty) {
return an empty list
} else {
map each entry in input list to output list
}
... because every mainstream way of mapping an input list to an output list produces an empty list "automatically" for an empty input. For example:
List<String> input = Collections.emptyList();
List<String> output = new ArrayList<>();
for(String entry : input) {
output.add(entry.toLowerCase());
}
return output;
... will return an empty list. To treat an empty list as a special case makes for wasted code, and less expressive code.
Likewise, the modern Java approach of using Streams does the same:
List<String> output = input.stream()
.map( s -> s.toLowerCase())
.collect(Collectors.toList());
... will create an empty List in output, with no "special" handling for an empty input.
Collections.emptyList() returns an instance of a class that specifically implements an immutable, empty list. It has a very simple implementation; for example, its size() is just return 0;.
But this means your caller won't be able to modify the returned list, though only when it's empty. Although immutability is a good thing, it's inconsistent to sometimes return an immutable list and other times not, and this could result in errors you detect late. If you want to enforce immutability, do it always, by wrapping the response in Collections.unmodifiableList() or using an immutable list from a library like Guava.
You also test whether the input is null. Consider whether this is necessary. Who's going to be calling the method? If it's just you, then don't do that! If you know you're not going to do it, your code needn't check for it.
If it's a public API for other programmers, you might need to handle nulls gracefully, but in many cases it's entirely appropriate to document that the input mustn't be null, and just let it throw a NullPointerException if it happens (or you can force one early by starting your method with Objects.requireNonNull(input)).
Conclusion: my recommendation is:
List<String> doSomething(String input){
Objects.requireNonNull(input); // or omit this and let an
// exception happen further down
return doMoreProcessingOn(getListfromSomewhereElse(input));
}
It's best if doMoreProcessingOn() produces a new List, rather than modifying input.
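To make the consistency point concrete, here is a minimal sketch in which the result is always unmodifiable, whether or not it is empty, and no special case is written for empty input. The fetch method is a hypothetical stand-in for getListfromSomewhereElse (it simply splits on commas), and the lower-casing stands in for the "more processing" step.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

class ConsistentResult {
    // Hypothetical stand-in for getListfromSomewhereElse.
    static List<String> fetch(String input) {
        List<String> result = new ArrayList<>();
        for (String part : input.split(",")) {
            if (!part.isEmpty()) {
                result.add(part);
            }
        }
        return result;
    }

    static List<String> doSomething(String input) {
        Objects.requireNonNull(input, "input");
        // Build a new list instead of modifying the fetched one.
        List<String> out = new ArrayList<>();
        for (String s : fetch(input)) {
            out.add(s.toLowerCase());
        }
        // Same mutability contract for empty and non-empty results.
        return Collections.unmodifiableList(out);
    }

    public static void main(String[] args) {
        System.out.println(doSomething("A,B"));
        System.out.println(doSomething(""));
    }
}
```

A caller that tries to add to the result gets an UnsupportedOperationException in every case, so the mistake shows up on the first test rather than only on empty input.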

Java convert list of int[] to array of int[] - concurrent safe

I have a List<int[]> and I would like to convert it to an int[][]
I have tried like this:
private List<int[]> pos = new LinkedList<>();
...
public int[][] getPos() {
int[][] ret = new int[pos.size()][2];
int i = 0;
for (Object o : pos) {
ret[i] = Arrays.stream((int[]) o).toArray();
i++;
}
return ret;
}
The big problem in my case is that another thread is changing the List during the conversion, so I get a ConcurrentModificationException in my foreach.
Is there another way of doing this, for example with Streams, that would be concurrency-safe?
Regards
LinkedList is not threadsafe so it is not possible to make your program thread safe without using locks/synchronization.
But if you change it to, e.g., CopyOnWriteArrayList, it would be easier: you could iterate over it safely while another thread writes to it. You will copy only the list state as it was when you started the foreach/iterator; later changes to the List won't be visible until you start another foreach/iterator.
A better solution would be to use a concurrent queue such as ConcurrentLinkedQueue. This way you can be adding values while you are consuming them. Note: you won't be able to copy any int[] produced after you have done the copy.
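A minimal sketch of the CopyOnWriteArrayList suggestion, assuming writes are rare enough that copying the backing array on each write is acceptable. PositionStore and its add method are hypothetical names; toArray takes a consistent snapshot, and each row is cloned so callers cannot alter the stored arrays.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class PositionStore {
    private final List<int[]> pos = new CopyOnWriteArrayList<>();

    void add(int x, int y) {
        pos.add(new int[] {x, y});
    }

    int[][] getPos() {
        // Snapshot of the list at this instant; concurrent adds cannot
        // cause a ConcurrentModificationException here.
        int[][] snapshot = pos.toArray(new int[0][]);
        for (int i = 0; i < snapshot.length; i++) {
            snapshot[i] = snapshot[i].clone(); // copy each row
        }
        return snapshot;
    }

    public static void main(String[] args) {
        PositionStore store = new PositionStore();
        store.add(1, 2);
        store.add(3, 4);
        System.out.println(Arrays.deepToString(store.getPos()));
    }
}
```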

'Collections.unmodifiableCollection' in constructor

From time to time during code reviews I see constructors like that one:
Foo(Collection<String> words) {
this.words = Collections.unmodifiableCollection(words);
}
Is this proper way of protecting internal state of the class? If not, what's the idiomatic approach to create proper defensive copy in constructors?
It should be, but it isn't correct because the caller can still modify the underlying list.
Instead of wrapping the list, you should make a defensive copy, for example by using Guava's ImmutableList instead.
Foo(Collection<String> words) {
if (words == null) {
throw new NullPointerException( "words cannot be null" );
}
this.words = ImmutableList.copyOf(words);
if (this.words.isEmpty()) { //example extra precondition
throw new IllegalArgumentException( "words can't be empty" );
}
}
So the correct way of establishing the initial state for the class is:
Check the input parameter for null.
If the input type isn't guaranteed to be immutable (as is the case with Collection), make a defensive copy. In this case, because the element type is immutable (String), a shallow copy will do, but if it wasn't, you'd have to make a deeper copy.
Perform any further precondition checking on the copy. (Performing it on the original could leave you open to TOCTTOU attacks.)
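The three steps can be sketched with the JDK alone, using new ArrayList plus Collections.unmodifiableList in place of Guava's ImmutableList. WordHolder and its words() accessor are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

class WordHolder {
    private final List<String> words;

    WordHolder(Collection<String> words) {
        // Step 1: reject null input.
        Objects.requireNonNull(words, "words cannot be null");
        // Step 2: defensive (shallow) copy; String elements are immutable.
        List<String> copy = new ArrayList<>(words);
        // Step 3: check preconditions on the copy, not on the original.
        if (copy.isEmpty()) {
            throw new IllegalArgumentException("words can't be empty");
        }
        this.words = Collections.unmodifiableList(copy);
    }

    List<String> words() {
        return words;
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(Arrays.asList("foo"));
        WordHolder holder = new WordHolder(source);
        source.add("bar"); // does not leak into the holder
        System.out.println(holder.words());
    }
}
```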
Collections.unmodifiableCollection(words);
only creates a wrapper through which you can't modify words, but it doesn't mean that words can't be modified elsewhere. For example:
List<String> words = new ArrayList<>();
words.add("foo");
Collection<String> fixed = Collections.unmodifiableCollection(words);
System.out.println(fixed);
words.add("bar");
System.out.println(fixed);
result:
[foo]
[foo, bar]
If you want to preserve the current state of words in an unmodifiable collection, you need to create your own copy of the elements of the passed collection and then wrap it with Collections.unmodifiableCollection(wordsCopy);
for example, if you only want to preserve the order of words:
this.words = Collections.unmodifiableCollection(new ArrayList<>(words));
// separate list holding current words ---------^^^^^^^^^^^^^^^^^^^^^^
No, that doesn't protect it fully.
The idiom I like to use to make sure that the contents is immutable:
public Breaker(Collection<String> words) {
this.words = Collections.unmodifiableCollection(
Arrays.asList(
words.toArray(
new String[words.size()]
)
)
);
}
The downside here, though, is that if you pass in a HashSet or TreeSet, the copy loses the fast lookup. You could do something other than converting it to a fixed-size List if you cared about hash or tree characteristics.

What type of List/Map should I use for categorizing data but maintaining the order?

I have the following data objects:
MyObject {
priority (e.g. HIGH, LOW, ...)
information
}
I need to save them in the correct order for iterating over it if necessary.
I also need to get only data with priority HIGH or LOW sometimes (also in correct order).
If I use a List (e.g. ArrayList) I would have to iterate over every single data-object to search for my priorities.
If I use Map<Priority, List<Information>> I would lose the order between information within two different priorities.
Sample data input:
LOW, "Hello1"
HIGH, "Hello2"
LOW, "World3"
HIGH, "World4"
Desired results:
printData() -> Hello1, Hello2, World3, World4
printLow() -> Hello1, World3
printHigh() -> Hello2, World4
What data structure would fulfill my requirements at the best? (Java)
If iterating over the list is really too slow, then maintain two parallel collections:
a List<Information> to iterate over all the informations in order,
and a Map<Priority, List<Information>> to iterate over the informations of a given priority.
I would only do that if I had a proven performance problem and I have proven that it was caused by the iteration over the list of all the informations. Otherwise, it's premature optimization that makes the code harder to maintain and make correct, especially if the collection is mutable.
Use a HashMap and a separate list as below:
public enum Priority { .... };
Map<Priority, List<Information>> map = new HashMap<Priority, List<Information>>();
map.put(Priority.HIGH, new LinkedList<Information>());
map.put(Priority.MID, new LinkedList<Information>());
map.put(Priority.LOW, new LinkedList<Information>());
List<Information> infoOrderedList = new LinkedList<Information>();
public void putInfo(MyObject myObject) {
List<Information> infoList = map.get(myObject.getPriority());
infoList.add(myObject.getInformation());
infoOrderedList.add(myObject.getInformation());
}
public void removeInfo(MyObject myObject) {
List<Information> infoList = map.get(myObject.getPriority());
infoList.remove(myObject.getInformation());
infoOrderedList.remove(myObject.getInformation());
}
You may avoid explicit iteration using lambda and filtering on a List.
For instance if you want to get a list of high priority items just type:
List<MyObject> high = list.stream().filter(o -> o.priority == Priority.HIGH).collect(Collectors.toList());
Using an ArrayList you keep the sorting.
For very large lists you may try parallelStream() instead of stream(), but measure first: for small lists the parallel version is usually slower.
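Here is a small runnable sketch of the filtering approach applied to the sample data from the question. The concrete MyObject class and the informationFor helper are assumptions made for the example; only the stream pipeline itself comes from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class PriorityFilterDemo {
    enum Priority { HIGH, LOW }

    // Hypothetical concrete form of the question's MyObject.
    static final class MyObject {
        final Priority priority;
        final String information;

        MyObject(Priority priority, String information) {
            this.priority = priority;
            this.information = information;
        }
    }

    static List<MyObject> sample() {
        List<MyObject> all = new ArrayList<>();
        all.add(new MyObject(Priority.LOW, "Hello1"));
        all.add(new MyObject(Priority.HIGH, "Hello2"));
        all.add(new MyObject(Priority.LOW, "World3"));
        all.add(new MyObject(Priority.HIGH, "World4"));
        return all;
    }

    // Filters one priority while keeping insertion order.
    static List<String> informationFor(List<MyObject> all, Priority wanted) {
        return all.stream()
                .filter(o -> o.priority == wanted)
                .map(o -> o.information)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<MyObject> all = sample();
        System.out.println(informationFor(all, Priority.LOW));
        System.out.println(informationFor(all, Priority.HIGH));
    }
}
```

A sequential stream over an ArrayList preserves encounter order, so the filtered results come out in the order the items were added.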

How can I make a resizable array in Java?

What is the best way to do a resizable array in Java? I tried using Vector, but that shifts all elements over by one when you do an insert, and I need an array that can grow but the elements stay in place. I'm sure there's a simple answer for this, but I'm still not quite sure.
As an alternative, you could use an ArrayList. It is a resizable-array implementation of the List interface.
Usage (using String):
List<String> myList = new ArrayList<String>();
myList.add("a");
myList.add("c");
myList.add("b");
The order will be just like you put them in: a, c, b.
You can also get an individual item like this:
String myString = myList.get(0);
Which will give you the 0th element: "a".
As Sanjo pointed out: "An array is a static data structure, so it can't grow". The List interface can be backed by an array (for example ArrayList, as Kevin pointed out in his post). When the backing array is full and a new item has to be added, the list first creates a new array that can hold the old elements plus the new one, and copies the old elements over.
The List interface has different implementations, which all have their pros and cons, and you should pick the one that best solves your problem. Below is a short summary of when to use which implementation:
Not thread-safe implementations:
ArrayList: Resizable-array implementation of the List interface. The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. I think you should use this implementation when doing more lookups (get()) than adding items to the list (add()).
LinkedList: This implementation is not backed by an array but "links" the nodes together. In my opinion you should use this implementation when you are doing more add() than get().
Thread-safe implementations:
Be aware that the list implementations above aren't thread-safe, which means race conditions are possible when accessing them from multiple threads. If you want to use List implementations from multiple threads, I would advise you to study the java.util.concurrent package and use an implementation from that package.
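One detail worth a sketch when picking a thread-safe list: a wrapper from Collections.synchronizedList guards individual calls, but the caller still has to wrap iteration in a synchronized block on the list (the Javadoc requires this). The class and method names below are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SynchronizedIteration {
    static int sum(List<Integer> shared) {
        int total = 0;
        // Hold the list's own lock for the whole traversal so that no
        // other thread can modify it mid-iteration.
        synchronized (shared) {
            for (int value : shared) {
                total += value;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> shared = Collections.synchronizedList(new ArrayList<>());
        shared.add(1); // single calls are already synchronized by the wrapper
        shared.add(2);
        shared.add(3);
        System.out.println(sum(shared));
    }
}
```

With a CopyOnWriteArrayList the synchronized block is not needed for iteration, at the price of copying the backing array on every write.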
You probably should use ArrayList instead of Vector for reasons explained in other answers.
However ...
I tried using Vector, but that shifts all elements over by one when you do an insert, and I need an array that can grow but the elements stay in place.
When you do an insertElementAt(pos, elem), you have specifically asked for the element shifting. If you don't want the elements to be shifted, you should use set(pos, elem) instead. Or if you want to add the element at the end of the vector, you can also use add(elem).
Incidentally, the previous paragraph applies to all implementations of List, not just Vector, though the implementation details and performance vary across the different kinds of List.
I tried using Vector, but that shifts all elements over by one when you do an insert, and I need an array that can grow but the elements stay in place.
You probably want to use ArrayList instead of Vector.
They both provide about the same interface, and you can replace elements with both of them by calling set(idx, element). That does not do any shifting around. It also does not allow you to grow the array, though: You can only insert at already occupied positions (not beyond the current size of the array), to add new elements at the end you have to use add(element).
The difference between ArrayList and Vector is that Vector has synchronization code which you most likely do not need, which makes ArrayList a little faster.
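The difference between replacing and inserting is easy to see in a short example. This is only an illustration with made-up values; the same three calls behave the same way on Vector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SetVersusInsert {
    static List<String> demo() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        list.set(1, "B"); // replaces index 1 in place, nothing shifts
        list.add(1, "x"); // inserts at index 1, later elements shift right
        list.add("d");    // appends at the end, nothing shifts
        return list;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```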
If you want to work on the data as an array once all elements have been inserted or deleted, one way is to build a LinkedList or ArrayList, which resize themselves; when the data input is finished, convert the list to an array and then do whatever you would normally do with an array.
ArrayList and LinkedList
Space Complexity:
a) ArrayList:
Backed by an array with spare capacity. When an add would exceed the current capacity, a larger array is allocated (about 1.5 times the old size in OpenJDK) and the elements are copied over.
b) LinkedList:
It allocates memory for one node each time you add an item to the list.
Runtime Complexity:
a) ArrayList:
Search is faster, insertion and deletion is slower compared to linked list
b) LinkedList:
Insertion and deletion is faster, search is slower compared to array list
An array cannot be resized dynamically in Java. The solution to this is using ArrayList or creating another temporary array and then assign it.
You can find tutorials about ArrayList, but if you just want a custom resizable array in Java, here it is. It is NOT recommended for real use, though! It's just a FAKE resizable array, and heap usage will increase when you create too many objects. This is just to show you the idea.
The Interface
public interface Resizable<T> {
void add(T data);
int delete(int index);
int size();
void print();
}
Implementation Class
public class ResizeableImpl<T> implements Resizable<T> {
private Object[] temp = null;
private Object[] originals = new Object[0];
@Override
public void add(T data) {
Object[] temp = new Object[originals.length+1];
for (int i=0; i<originals.length; i++) {
temp[i]=originals[i];
}
temp[originals.length]=data;
originals=temp;
}
@Override
public int delete(int index) {
int success=0;
switch (originals.length) {
case 0: //No Data to delete
success=0;
break;
case 1: //One Data is delete and so no data, too!
originals = new Object[0];
success = 1;
break;
default: //>=2
int count=0;
originals[index]=null;
temp = new Object[originals.length-1];
for (int i=0; i<originals.length; i++) {
if (originals[i]!=null)
temp[count++]=originals[i];
}
originals = temp;
success = 1;
}
return success;
}
@Override
public int size() {
return originals.length;
}
@Override
public void print() {
StringBuilder sb = null;
if (originals.length==0) {
System.out.println("No data available!");
return;
}
for (int i=0; i<originals.length; i++) {
if (sb==null) {
sb = new StringBuilder();
sb.append(originals[i]);
}
else {
sb.append(", "+originals[i]);
}
}
sb.append(".");
System.out.println(sb.toString());
}
}
Main method
public class App {
public static void main(String[] args) {
//Program to interfaces, not implementations
Resizable<Integer> obj = new ResizeableImpl<>();
obj.add(13);
obj.add(20);
obj.add(17);
obj.add(25);
obj.add(100);
obj.add(12);
obj.print();
int result = obj.delete(2); //This will delete 17.
if (result==1) {
System.out.println("Deletion is successful!");
}
obj.print();
obj.delete(3); //This will delete 100.
obj.print();
}
}
Output
13, 20, 17, 25, 100, 12.
Deletion is successful!
13, 20, 25, 100, 12.
13, 20, 25, 12.
Use either ArrayList or LinkedList.
Using the wonderful classes in the Collections framework is better than using arrays.
But in case your question is from a "quizzing" perspective, here is what you should do.
Create your own resize method such as:
int[] oldArray = {1,2,3};
int newSize = 5; // example target size
int oldSize = java.lang.reflect.Array.getLength(oldArray);
Class<?> elementType = oldArray.getClass().getComponentType();
Object newArray = java.lang.reflect.Array.newInstance(
elementType,newSize);
int preserveLength = Math.min(oldSize,newSize);
if (preserveLength > 0)
System.arraycopy (oldArray,0,newArray,0,preserveLength);
oldArray = (int[]) newArray; // cast back to the concrete array type
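When the element type is known at compile time, the same resize can be written without reflection using Arrays.copyOf, which allocates the new array and copies the overlapping prefix in one call. The resize helper below is a hypothetical wrapper added for illustration.

```java
import java.util.Arrays;

class ResizeWithCopyOf {
    // Returns a new array of length newSize; extra slots are zero-filled,
    // and elements beyond newSize are dropped.
    static int[] resize(int[] oldArray, int newSize) {
        return Arrays.copyOf(oldArray, newSize);
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        data = resize(data, 5);
        System.out.println(Arrays.toString(data));
    }
}
```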
