I have to optimize an algorithm, and I noticed that we have a loop like this:
while (!floorQueues.values().stream().allMatch(List::isEmpty))
It seems like on each iteration it checks if all of the lists in this map are empty.
The data in the map is taken from a two-dimensional array like this:
int currentFloorNumber = 0;
for (int[] que : queues) {
List<Integer> list = Arrays.stream(que).boxed().collect(Collectors.toList());
floorQueues.put(currentFloorNumber, list);
currentFloorNumber++;
}
I thought it would be more optimal if I took the count of elements in the arrays when transforming the data, and then used the number of times I have deleted from the lists as the condition to end the loop:
while (countOfDeletedElements < totalCountOfElements)
but when I tested the code, it ran slower than before. So I wonder how isEmpty works behind
the scenes to be faster than my solution.
This depends on the implementation of List that is being used.
An ArrayList simply checks if there are 0 elements:
/**
* Returns <tt>true</tt> if this list contains no elements.
*
* @return <tt>true</tt> if this list contains no elements
*/
public boolean isEmpty() {
return size == 0;
}
A distinct non-answer: Java performance and Java benchmarking don't work this way.
You can't look at these 5 lines of source code to understand what exactly will happen at runtime. Or to be precise: you have to understand that streams are a very advanced, aka complex thing. Stream code might create plenty of objects at runtime, that are only used once, and then thrown away. In order to assess the true performance impacts, you really have to understand what that code is doing. And that could be: a lot!
The serious answer is: if you really need to understand what is going on, then you will need to:
study the stream implementation in great detail
and worse: you will need to look into the activities of the Just-in-Time compiler within the JVM (to see, for example, how that "source" code gets translated and optimised to machine code).
and most likely: you will need to apply a real profiler, and run plenty of experiments.
In any case, you should start reading here to ensure that the numbers you are measuring make sense in the first place.
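For illustration, this is roughly what a fair measurement could look like with JMH (a sketch, assuming the JMH annotations and runtime are on the classpath; the map contents are made up):
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class EmptyCheckBenchmark {

    Map<Integer, List<Integer>> floorQueues = new HashMap<>();

    @Setup
    public void setup() {
        // a few non-empty queues; both checks short-circuit at the first one
        for (int floor = 0; floor < 10; floor++) {
            floorQueues.put(floor, new ArrayList<>(Arrays.asList(1, 2, 3)));
        }
    }

    @Benchmark
    public boolean streamAllMatch() {
        return floorQueues.values().stream().allMatch(List::isEmpty);
    }

    @Benchmark
    public boolean plainLoop() {
        for (List<Integer> queue : floorQueues.values()) {
            if (!queue.isEmpty()) {
                return false;
            }
        }
        return true;
    }
}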
Normally we have two ways to check whether a list is empty: list.size() > 0 or !list.isEmpty().
For an ArrayList both are O(1), since the size is kept in a field. But for some collection implementations, size() has to traverse the whole structure and count every element, which can take long when the collection holds a big number of elements. isEmpty(), on the other hand, only has to check whether a first element exists (O(1)) and return true/false based on that, which is obviously fast.
From a performance perspective, we should always use isEmpty() as a best practice.
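For example (a queue rather than a List, but it illustrates the point): java.util.concurrent.ConcurrentLinkedQueue keeps no element count, so size() traverses all nodes while isEmpty() just looks at the head:
import java.util.concurrent.ConcurrentLinkedQueue;

public class EmptyCheckDemo {
    public static void main(String[] args) {
        ConcurrentLinkedQueue<Integer> queue = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 1_000_000; i++) {
            queue.add(i);
        }

        // size() counts the nodes by traversing the whole queue: O(n)
        System.out.println(queue.size() > 0);

        // isEmpty() only checks whether a first node exists: O(1)
        System.out.println(!queue.isEmpty());
    }
}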
I have a task about building a pyramid using a list of numbers, but there is a problem with one test. In my task I need to sort a list. I use Collections.sort():
Collections.sort(inputNumbers, (o1, o2) -> {
if (o1 != null && o2 != null) {
return o1.compareTo(o2);
} else {
throw new CannotBuildPyramidException("Unable to build a pyramid");
}
});
But this test fails
@Test(expected = CannotBuildPyramidException.class)
public void buildPyramid8() {
// given
List<Integer> input = Collections.nCopies(Integer.MAX_VALUE - 1, 0);
// run
int[][] pyramid = pyramidBuilder.buildPyramid(input);
// assert (exception)
}
with OutOfMemoryError instead of my own CannotBuildPyramidException (it will be thrown in another method after sorting). I understand that it is because of TimSort in the Collections.sort() method. I tried to use HeapSort, but I couldn't even swap elements because my input list was initialized as Arrays.asList(), and when I use the set() method I get UnsupportedOperationException. Then I tried to convert my list to a common ArrayList
ArrayList<Integer> list = new ArrayList<>(inputNumbers);
but I got OutOfMemoryError again. It's not allowed to edit the tests. I don't know what to do with this problem. I'm using Java 8 and the IntelliJ IDEA SDK.
Note that the list created by Collections.nCopies(Integer.MAX_VALUE - 1, 0) uses a tiny amount of memory and is immutable. The documentation says "Returns an immutable list consisting of n copies of the specified object. The newly allocated data object is tiny (it contains a single reference to the data object)". And if you look at the implementation, you'll see it does exactly what one would expect from that description. It returns a List object that only pretends to be large, only holding the size and the element once and returning that element when asked about any index.
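A quick way to convince yourself of this (a small demo, not from the original question):
import java.util.Collections;
import java.util.List;

public class NCopiesDemo {
    public static void main(String[] args) {
        // allocates instantly: only the size and a single reference are stored
        List<Integer> list = Collections.nCopies(1_000_000_000, 0);
        System.out.println(list.size());     // 1000000000
        System.out.println(list.get(12345)); // 0 - no backing array exists

        // the list is immutable, hence the UnsupportedOperationException
        try {
            list.set(0, 1);
        } catch (UnsupportedOperationException expected) {
            System.out.println("set() is not supported");
        }
    }
}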
The problem with Collections.sort is then two-fold:
The list must not be immutable, but that list is. That btw also explains the UnsupportedOperationException you got when you tried to set().
For performance reasons, it "obtains an array containing all elements in this list, sorts the array, [and writes back to the list]". So at this point the tiny pretend-list does get blown up and causes your memory problem.
So you need to find some other way to sort. One that works in-place and doesn't swap anything for this input (which is correct, as the list is already sorted). You could for example use bubble sort, which takes O(n) time and O(1) space on this input and doesn't attempt any swaps here.
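A sketch of such a sort (the early-exit flag is the key; on already-sorted input it performs a single read-only pass and never calls set()):
import java.util.List;

static void bubbleSort(List<Integer> list) {
    boolean swapped = true;
    for (int end = list.size() - 1; swapped && end > 0; end--) {
        swapped = false;
        for (int i = 0; i < end; i++) {
            if (list.get(i) > list.get(i + 1)) {
                // only mutates the list when two elements are out of order
                int tmp = list.get(i);
                list.set(i, list.get(i + 1));
                list.set(i + 1, tmp);
                swapped = true;
            }
        }
    }
}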
Btw, about getting the memory problem "because of TimSort": Timsort is really not to blame. You don't even get to the Timsort part, as it's the preparatory copy-to-array that causes the memory problem. And furthermore, Timsort is smart and would detect that the data is already sorted and then wouldn't do anything. So if you actually did get to the Timsort part, or if you could directly apply it to the list, Timsort wouldn't cause a problem.
This list is too huge to sort! Collections.nCopies(Integer.MAX_VALUE - 1, 0) gives us a list of 2^31 - 2 elements (2,147,483,646). The list object itself is tiny, but as soon as Collections.sort copies it into an array, each element takes about 4 bytes in memory (this is the "simplified" size of an Integer reference). If we multiply it, we'll need about 8.59 GB of memory to store all those numbers. Are you sure you have enough memory to store it?
I believe this test is written in a very bad manner - one should never try to create such a huge List.
I've got an ArrayList that can be anywhere from 0 to 5000 items long (pretty big objects, too).
At one point I compare it against another ArrayList, to find their intersection. I know this is O(n^2).
Is creating a HashMap alongside this ArrayList, to achieve constant-time lookup, a valid strategy here, in order to reduce the complexity to O(n)? Or is the overhead of another data structure simply not worth it? I believe it would take up no additional space (besides the references).
(I know, I'm sure 'it depends on what I'm doing', but I'm seriously wondering if there's any drawback that makes it pointless, or if it's actually a common strategy to use. And yes, I'm aware of the quote about prematurely optimizing. I'm just curious from a theoretical standpoint).
First of all, a short side note:
And yes, I'm aware of the quote about prematurely optimizing.
What you are asking about here is not "premature optimization"!
You are not talking about replacing a multiplication with some odd bitwise operations "because they are faster (on a 90's PC, in a C-program)". You are thinking about the right data structure for your application pattern. You are considering the application cases (though you did not tell us many details about them). And you are considering the implications that the choice of a certain data structure will have on the asymptotic running time of your algorithms. This is planning, or maybe engineering, but not "premature optimization".
That being said, and to tell you what you already know: It depends.
To elaborate on this a bit: It depends on the actual operations (methods) that you perform on these collections, how frequently you perform them, how time-critical they are, and how memory-sensitive the application is.
(For 5000 elements, the latter should not be a problem, as only references are stored - see the discussion in the comments)
In general, I'd also be hesitant to really store the Set alongside the List, if they are always supposed to contain the same elements. This wording is intentional: You should always be aware of the differences between both collections. Primarily: A Set can contain each element only once, whereas a List may contain the same element multiple times.
For all hints, recommendations and considerations, this should be kept in mind.
But even if it is given for granted that the lists will always contain elements only once in your case, then you still have to make sure that both collections are maintained properly. If you really just stored them, you could easily cause subtle bugs:
private Set<T> set = new HashSet<T>();
private List<T> list = new ArrayList<T>();
// Fine
void add(T element)
{
set.add(element);
list.add(element);
}
// Fine
void remove(T element)
{
set.remove(element);
list.remove(element); // May be expensive, but ... well
}
// Added later, 100 lines below the other methods:
void removeAll(Collection<T> elements)
{
set.removeAll(elements);
// Ooops - something's missing here...
}
To avoid this, one could even consider creating a dedicated collection class - something like a FastContainsList that combines a Set and a List, and forwards the contains call to the Set. But you'll quickly notice that it will be hard (or maybe impossible) to not violate the contracts of the Collection and List interfaces with such a collection, unless the clause that "You may not add elements twice" becomes part of the contract...
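A minimal sketch of such a FastContainsList (reads are delegated; add refuses duplicates, which is exactly the contract bending described above; removal is left out):
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class FastContainsList<T> extends AbstractList<T> {
    private final List<T> list = new ArrayList<T>();
    private final Set<T> set = new HashSet<T>();

    @Override
    public boolean add(T element) {
        if (!set.add(element)) {
            return false; // duplicate: reject, so Set and List stay in sync
        }
        return list.add(element);
    }

    @Override
    public boolean contains(Object object) {
        return set.contains(object); // O(1) instead of O(n)
    }

    @Override
    public T get(int index) {
        return list.get(index);
    }

    @Override
    public int size() {
        return list.size();
    }
}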
So again, all this depends on what you want to do with these methods, and which interface you really need. If you don't need the indexed access of List, then it's easy. Otherwise, referring to your example:
At one point I compare it against another ArrayList, to find their intersection. I know this is O(n^2).
You can avoid this by creating the sets locally:
static <T> List<T> computeIntersection(List<T> list0, List<T> list1)
{
Set<T> set0 = new LinkedHashSet<T>(list0);
Set<T> set1 = new LinkedHashSet<T>(list1);
set0.retainAll(set1);
return new ArrayList<T>(set0);
}
This will have a running time of O(n). Of course, if you do this frequently but rarely change the contents of the lists, there may be options to avoid the copies, but for the reason mentioned above, maintaining the required data structures may become tricky.
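For example (the order of list0 is kept, and duplicates collapse):
List<String> list0 = Arrays.asList("x", "y", "y", "z");
List<String> list1 = Arrays.asList("w", "y", "z");
System.out.println(computeIntersection(list0, list1)); // prints [y, z]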
I have a short (12 elements) LinkedList of short strings (7 characters each).
I need to search through this list both by index and by content (i.e. search a particular string and get its index in the list).
I thought about making a copy of the LinkedList as an array at runtime (just once, since the LinkedList is a static member of my class), so I can access the strings by index more quickly.
Given that the LinkedList is never changed at runtime, is this bad programming practice or is this an idea worth considering?
IMPORTANT EDIT: the array can't be sorted, I need it to map specific strings to specific numbers.
Instead of a LinkedList just use an ArrayList - you can look up fast based on an index, and you can easily search through it.
What problem are you trying to solve here? Are you worried that accessing elements by index is too slow in LinkedList? If so, you might want to use ArrayList instead.
But for a 12-element list, the improvement probably won't make any measurable difference. Unless this is something you're accessing several hundred times a second, I wouldn't waste any time on trying to optimize it.
Another idea you might want to consider is using a Map:
Map<Integer, String> someMap = new HashMap<Integer, String>();
It's easy to search a map by key, and searching by value is a simple scan over the entries; see the sketch below.
Might also not be the best idea, but at least better than creating 2 lists with the same values =)
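A sketch of both lookups (hypothetical data; note that the by-content search is still a linear scan over the entries):
import java.util.HashMap;
import java.util.Map;

Map<Integer, String> strings = new HashMap<Integer, String>();
strings.put(0, "STRING0");
strings.put(1, "STRING1");

// by index: constant-time lookup
String byIndex = strings.get(1);

// by content: scan the entries for the matching value
int byContent = -1;
for (Map.Entry<Integer, String> entry : strings.entrySet()) {
    if (entry.getValue().equals("STRING1")) {
        byContent = entry.getKey();
        break;
    }
}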
The question is, why are you using a LinkedList in the first place?
The main reason to choose a LinkedList over an ArrayList is if you need to make a number of insertions/deletions in the middle of the list, or if you don't know the exact size of the list and don't want to make a number of reallocations of the array.
The main reason to choose an ArrayList over a LinkedList is if you need to have random access to each of the elements.
(There are other advantages/disadvantages to each, but those are probably the main ones that come to mind)
It looks like you do need random access to the list, so why did you pick a LinkedList over an ArrayList?
I would say it depends on your intention and the effect it really has.
With only 12 elements it seems unlikely to me that converting the LinkedList to an array has an impact on performance. So it could make the code unnecessarily (slightly) harder to understand for other people. From this point of view it could be considered a non optimal programming style.
If the number of elements increases, e.g. you need to pre-process some data, which would require a dynamic data structure, and an indexed lookup performs much better for later use, this wouldn't be bad programming style, but rather a required improvement.
Given that you know the exact number of elements you are going to be using, why not use an array from the start?
String[] myArray = new String[12]; // 12 elements, as described in the question
// Add your data
Arrays.sort(myArray);                            // Sort your strings
int value = Arrays.binarySearch(myArray, "key"); // Search your array
Or, since you can't sort the array, you could just write a linear search method:
public int search(String[] array, String key)
{
    for (int i = 0; i < array.length; i++)
    {
        if (key.equals(array[i]))
            return i;
    }
    return -1;
}
Edit: After re-loading the page and reading peoples responses I agree that ArrayList should be exactly what you need.
I want to write a program to implement an array-based stack, which accepts integer numbers entered by the user. The program will then identify any occurrences of a given value from the user and remove the repeated values from the stack (using the Java programming language).
I just need your help with writing the method that removes values.
e.g.
input: 6 2 3 4 3 8
output: 6 2 4 8
Consider Collection.contains (possibly in conjunction with Arrays.asList, if you are so unfortunate), HashMap, or Set.
It really depends on what you have, where you are really going, and what silly restrictions the homework/teacher mandates. Since you say "implement an array-based stack" I am assuming there are some silly mandates in which case I would consider writing a custom arrayContains helper* method and/or using a secondary data-structure (Hash/Set) to keep track of 'seen'.
If you do the check upon insertion it's just (meta code, it's your homework :-):
function addItem(i) begin
    if not contains(stack, i) then
        push(stack, i)
    end if
end
*You could use the above asList/contains if you don't mind being "not very efficient", but Java comes with very little nice support for Arrays and thus the recommendation for the helper which is in turn just a loop over the array returning true if the value was found, false otherwise. (Or, perhaps return the index found or -1... your code :-)
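That helper is nothing more than this sketch (arr and count are assumed names for the backing array and the number of used slots of your stack):
private static boolean arrayContains(int[] arr, int count, int value) {
    for (int i = 0; i < count; i++) {
        if (arr[i] == value) {
            return true; // the value is already on the stack
        }
    }
    return false;
}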
Assuming that the "no-duplicates" logic is a part of the stack itself, I would do the following:
1) Implement a helper method:
private boolean remove(int item)
This method should scan the array, and if it finds the item it should shrink the array by moving all subsequent items one position backwards. The returned value indicates whether a removal took place.
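A sketch of that helper, assuming (as in the push method below) that the stack keeps its elements in arr and the next free position in topPos:
private boolean remove(int item) {
    for (int i = 0; i < topPos; i++) {
        if (arr[i] == item) {
            // shrink: move all subsequent items one position backwards
            System.arraycopy(arr, i + 1, arr, i, topPos - i - 1);
            topPos--;
            return true;
        }
    }
    return false;
}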
2) Now it is easy to implement the push method:
public void push(int item) {
if (!remove(item)) {
arr[topPos++] = item;
}
}
Note that my solution assumes there is always enough space in the array. A proper implementation should take care of resizing the array when necessary.
The question is an interesting (or troubling) one in that it breaks the spirit of the stack to enforce such a constraint. A pure stack can only be queried about its top element.
As a result, doing this operation necessarily requires treating the stack not as a stack but as some other data structure, or at least transferring all of the data in the stack to a different, intermediate data structure.
If you want to accomplish this from within the stack class itself, others' replies will prove useful.
If you want to accomplish this from outside of the stack, using only the traditional methods of a stack interface (push() and pop()), your algorithm might look something like this:
Create a Set of Integers to keep track of values encountered so far.
Create a second stack to hold the values temporarily.
While the stack isn't empty,
Pop off the top element.
If the set doesn't contain that element yet, add it to the set and push it onto the second stack.
If the set does contain the element, that means you've already encountered it and this is a duplicate. So ignore it.
While the second stack isn't empty,
Pop off the top element
Push the element back onto the original stack.
There are various other ways to do this, but I believe all would require some auxiliary data structure that is not a stack.
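For illustration, a direct translation of those steps, using Deque as the stack type (all names here are mine, not from the question):
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

static void removeDuplicates(Deque<Integer> stack) {
    Set<Integer> seen = new HashSet<Integer>();      // values encountered so far
    Deque<Integer> temp = new ArrayDeque<Integer>(); // second, temporary stack

    while (!stack.isEmpty()) {
        Integer top = stack.pop();
        if (seen.add(top)) {    // add() returns false for duplicates
            temp.push(top);     // first encounter: keep it
        }                       // duplicate: ignore it
    }
    while (!temp.isEmpty()) {
        stack.push(temp.pop()); // restore the original order
    }
}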
Override the push method and have it run through the stack to determine whether the value already exists. If so, return false, otherwise true (if you want to use a boolean return value).
Basically, this is in the spirit of the answer posted by Mr. Schneider, but why shrink the array or modify the stack at all if you can just determine whether a new item is a duplicate or not? If it's a duplicate, don't add it and the array does not need to be modified. Am I missing something?
I have a java list
List<myclass> myList = myClass.selectFromDB("where clause");
//myClass.selectFromDB returns a list of objects from DB
But I want a different list, specifically.
List<Integer> goodList = new ArrayList<Integer>();
for (int i = 0; i < myList.size(); i++) {
    goodList.add(myList.get(i).getGoodInteger());
}
Yes, I could do a different query from the DB in the initial myList creation, but assume for now I must use that as the starting point and no other DB queries. Can I replace the for loop with something much more efficient?
Thank you very much for any input, apologies for my ignorance.
In order to extract a field from the "myclass", you're going to have to loop through the entire contents of the list. Whether you do that with a for loop, or use some sort of construct that hides the for loop from you, it's still going to take approximately the same time and use the same amount of resources.
An important question is: why do you want to do this? Are you trying to make your code cleaner? If so, you could write a method along these lines:
public static List<Integer> extractGoodInts (List<myClass> myList) {
List<Integer> goodInts = new ArrayList<Integer>();
for(int i = 0; i < myList.size(); i++){
goodInts.add(myList.get(i).getGoodInteger());
}
return goodInts;
}
Then, in your code, you can just go:
List<myClass> myList = myClass.selectFromDB("where clause");
List<Integer> goodInts = myClass.extractGoodInts(myList);
However, if you're trying to make your code more efficient and you're not allowed to change the query, you're out of luck; somehow or another, you're going to need to individually grab each int from the list, which means you're going to be running in O(n) time no matter what clever tricks you can come up with.
There are really only two ways I can think of that you can make this more "efficient":
Somehow split this up between multiple cores so you can do the work in parallel. Of course, this assumes that you've got other cores, that they aren't doing anything useful already, and that there's enough processing going on that the overhead of doing this is even worth it. My guess is that (at least) the last point isn't true in your case, given that you're just calling a getter. If you wanted to do this you'd try to have a number of threads (I'd probably actually use an Executor and Futures for this) equal to the number of cores, and then give roughly equal amounts of work to each of them (probably just by slicing your list into roughly equal-sized pieces).
If you believe that you'll only be accessing a small subset of the resulting List, but are unsure of exactly which elements, you could try doing things lazily. The easiest way to do that would be to use a pre-built lazy mapping List implementation. There's one in Google Collections Library. You use it by calling Lists.transform(). It'll immediately return a List, but it'll only perform your transformation on elements as they are requested. Again, this is only more efficient if it turns out that you only ever look at a small fraction of the output List. If you end up looking at the entire thing this will not be more efficient, and will probably work out to be less efficient.
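As a rough sketch, the lazy version with the Google Collections Library would look like this (assuming the element type is called MyClass, as in the answer below):
import com.google.common.base.Function;
import com.google.common.collect.Lists;

List<Integer> goodInts = Lists.transform(myList, new Function<MyClass, Integer>() {
    public Integer apply(MyClass myObj) {
        return myObj.getGoodInteger(); // applied lazily, on each get()
    }
});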
Not sure what you mean by efficient. As the others said, you have to call the getGoodInteger method on every element of that list one way or another. About the best you can do is avoid checking the size every time:
List<Integer> goodInts = new ArrayList<Integer>();
for (MyClass myObj : myList) {
goodInts.add(myObj.getGoodInteger());
}
I also second jboxer's suggestion of making a function for this purpose.
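On Java 8 and later the same O(n) loop can also be written as a stream pipeline; it is not more efficient, just more compact:
import java.util.List;
import java.util.stream.Collectors;

List<Integer> goodInts = myList.stream()
        .map(MyClass::getGoodInteger)
        .collect(Collectors.toList());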