Simple question here -- mostly about APIs.
I want to iterate through an array in random order.
It is easy enough to:
1. Fill a List with the numbers 0 to N.
2. Shuffle the List with Collections.shuffle.
3. Use this shuffled list to guide my array iteration.
However, I was wondering if step 1 (generating the list of numbers from 0 to N) exists somewhere in prewritten code.
For instance, could it be a convenience method in Guava's XYZ class?
The closest thing in Guava would be
ContiguousSet.create(Range.closedOpen(0, n), DiscreteDomains.integers())
...but, frankly, it is probably more readable just to write the for loop yourself.
Noting specifically your emphasis on 'quick', I can't imagine there'd be much quicker than
List<Integer> list = new ArrayList<Integer>(range);
and then iterating and populating each entry. Note that I set the capacity in order to avoid having the list resize under the covers.
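As a rough illustration of the three steps from the question (fill, shuffle, iterate), here is a minimal sketch. The class name ShuffledIndices, the method shuffledIndices and the sample array are made up for this example and are not from any library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper class, named only for this illustration.
final class ShuffledIndices {

    // Returns the indices 0..n-1 in random order.
    static List<Integer> shuffledIndices(int n, Random rnd) {
        // Pre-size the list so it never has to grow while being filled.
        List<Integer> indices = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rnd);
        return indices;
    }

    public static void main(String[] args) {
        String[] data = {"a", "b", "c", "d"};
        // Visit every array element exactly once, in random order.
        for (int index : shuffledIndices(data.length, new Random())) {
            System.out.println(data[index]);
        }
    }
}
```

Passing the Random in explicitly is a design choice that makes the order reproducible in tests; Collections.shuffle(list) without a second argument works just as well otherwise.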
You may want to check out Apache Commons Math, which, among many other useful functions, implements a nextPermutation method in its RandomDataGenerator class.
This is obviously something much bigger than a method for populating a List or array, but the Commons libraries are really powerful and offer many more good methods for mathematical computations.
Java doesn't allow you to auto-populate your values. See this question for the ways to populate an array in Java:
"Creating an array of numbers without looping?"
If you skip step 1 and just do the shuffling immediately, I think you will have the fastest solution.
int range = 1000;
List<Integer> arr = new ArrayList<Integer>(range);
for (int i = 0; i < range; i++) {
    // Insert i at a random index in 0..i (i equals the current size, so appending is allowed).
    arr.add((int) (Math.random() * (i + 1)), i);
}
In Java, I have an ArrayList with a list of objects. Each object has a date field that is just a long data type. The ArrayList is sorted by the date field. I want to insert a new object into the ArrayList so that it appears in the correct position with regard to its date. The only solution I can see is to iterate through all the items and compare the date field of the object being inserted to the objects being iterated on and then insert it once I reach the correct position. This will be a performance issue if I have to insert a lot of records.
What are some possible ways to improve this performance? Maybe an ArrayList is not the best solution?
I would say that you are correct in making the statement:
Maybe an ArrayList is not the best solution
Personally, I think a tree structure would be better suited for this, specifically a binary search tree ordered by the object's date. Once the tree is built, both lookup and insertion take O(log n) time, provided the tree stays balanced.
Whether or not binary search + O(n) insertion is bad for you depends on at least these things:
size of the list,
access pattern (mostly insert or mostly read),
space considerations (ArrayList is far more compact than the alternatives).
Given the existence of these factors and their quite complex interactions you should not switch over to a binary search tree-based solution until you find out how bad your current design is—through measurements. The switch might even make things worse for you.
I would consider using TreeSet and make your item Comparable. Then you get everything out of the box.
If this is not possible I would search for the index via Collections.binarySearch(...).
EDIT: Make sure performance is an issue before you start optimizing
First, sort the ArrayList using:
ArrayList<Integer> arr = new ArrayList<>();
...
Collections.sort(arr);
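Then your answer is (a negative result, -(insertion point) - 1, means the value is absent and tells you where to insert it):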
int index = Collections.binarySearch(arr , 5);
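The Collections.binarySearch(...) suggestion above can be sketched as follows. This is only an illustration, assuming the dates are plain Long timestamps held in an already sorted list; the class SortedInsert and the method insertSorted are hypothetical names. Finding the position takes O(log n), but the ArrayList insertion itself still shifts elements in O(n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named only for this illustration.
final class SortedInsert {

    // Inserts value into an already sorted list so that it stays sorted.
    static void insertSorted(List<Long> sorted, long value) {
        int pos = Collections.binarySearch(sorted, Long.valueOf(value));
        if (pos < 0) {
            // Not found: binarySearch returned -(insertion point) - 1.
            pos = -(pos + 1);
        }
        sorted.add(pos, value);
    }

    public static void main(String[] args) {
        List<Long> dates = new ArrayList<>(Arrays.asList(10L, 20L, 30L));
        insertSorted(dates, 25L);
        System.out.println(dates); // prints [10, 20, 25, 30]
    }
}
```

For objects with a date field rather than bare Long values, the binarySearch overload that takes a Comparator can be used in the same way.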
Is there a faster way to check whether the items in a list are greater than, less than, or equal to a certain number?
Or do you just have to loop through it? I'm just curious whether there are pre-built functions for this.
Example:
List contains 5, 5, 10, 15, 15, 20.
I want to check how many items are >= 5. So the answer is 6. If I want to check >= 15, the answer would be 3.
Step 1: sort the list
Step 2: find the index of the first element that is >= the desired value
Step 3: print length - index
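The three steps above can be sketched like this. Note that Collections.binarySearch makes no promise about which of several equal elements it finds, so this sketch uses a hand-written lower-bound search instead; CountAtLeast, lowerBound and countAtLeast are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class, named only for this illustration.
final class CountAtLeast {

    // Index of the first element >= key in a sorted list (size() if there is none).
    static int lowerBound(List<Integer> sorted, int key) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid) < key) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of elements >= key; the list must already be sorted ascending.
    static int countAtLeast(List<Integer> sorted, int key) {
        return sorted.size() - lowerBound(sorted, key);
    }

    public static void main(String[] args) {
        List<Integer> list = new ArrayList<>(Arrays.asList(15, 5, 20, 5, 10, 15));
        Collections.sort(list);                     // step 1
        System.out.println(countAtLeast(list, 5));  // prints 6
        System.out.println(countAtLeast(list, 15)); // prints 3
    }
}
```

Sorting costs O(n log n), so this only pays off when the same list is queried several times; for a single query a plain loop is cheaper.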
I don't see any such methods in the documentation, so I would say NO. You have to iterate through the List. If it is sorted, you can do binary search for faster results.
You need to loop to check the condition for each element.
Are you trying to sort? If you are dealing with arrays, you can use Arrays.sort(); if you are dealing with collections, you can use Collections.sort().
No, there is no pre-built function AFAIK. If the order of your list items is not critical (i.e. you are not building a priority list or a LIFO/FIFO structure), you can speed up the search by sorting the list before looking for the element.
You can sort the list; comparing the number with the first and last elements then tells you immediately whether all or none of the items qualify.
That is why I am asking whether such a method already exists within the Collections API.
There is no such method in the standard collection APIs.
Write a loop. It should be quicker to code and test a 5 line method to do this than to scour the internet for a 3rd party library. And your code will most likely be faster ... and certainly no slower.
Just do it. (I'd write the code for you myself, but it sounds like you need the practice ...)
I have two huge ArrayList<Long> instances, with about 500,000 elements in each. I tried a for loop that uses list.contains(object), but it takes too much time. I also tried splitting one list and comparing in multiple threads, but that gave no real improvement.
I need the number of elements that are the same in both lists.
Is there an optimized way?
Let l1 be the first list and l2 the second list. In Big O notation, the contains() loop runs in O(l1*l2).
Another approach could be to insert one list into a HashSet, then test each element of the other list for membership in the HashSet. This would cost roughly 2*l1+l2 operations, i.e. O(l1+l2).
Have you considered putting your elements into a HashSet instead? This would make the lookups much faster. This would of course only work if you don't have duplicates.
If you have duplicates, you could construct a HashMap that has the value as the key and the count as the value.
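A minimal sketch of the HashSet idea described above, assuming that "same in both lists" means distinct values that occur in both; the class CommonCount and the method countCommon are hypothetical names. If duplicates should be counted separately, the HashMap variant with per-value counts would be needed instead.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, named only for this illustration.
final class CommonCount {

    // Counts the distinct values that occur in both lists, in O(a.size() + b.size()) expected time.
    static int countCommon(List<Long> a, List<Long> b) {
        Set<Long> remaining = new HashSet<>(a);
        int count = 0;
        for (Long value : b) {
            // remove() returns true only the first time a shared value is seen,
            // so repeated values in b are not counted twice.
            if (remaining.remove(value)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Long> first = Arrays.asList(1L, 2L, 3L, 4L);
        List<Long> second = Arrays.asList(3L, 4L, 4L, 5L);
        System.out.println(countCommon(first, second)); // prints 2
    }
}
```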
General mechanism would be to sort both lists and then iterate the sorted lists looking for matches.
A list isn't an efficient data structure when you have many elements; you need a data structure that is more efficient to search.
For example, a tree or a hash map!
Let us assume that list one has m elements and list two has n elements, with m > n. If the elements are not numerically ordered, and it seems that they are not, the total number of comparison steps (that is, the cost of the method) is on the order of m x n - n^2/2. In this case the cost factor is about 50000 x 49999.
Keeping both lists ordered is the optimal solution. If the lists are ordered, the cost of comparing them is on the order of m. In this case that is about 50000. This result is achieved when both lists are iterated with two cursors. This method can be represented in code as follows:
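// list1 and list2 are the two List<Long> instances, both sorted ascending.
int i = 0, j = 0;
int count = 0;
while (i < list1.size() && j < list2.size())
{
    // Compare by value; == on Long objects would compare references.
    int cmp = list1.get(i).compareTo(list2.get(j));
    if (cmp == 0)
    {
        count++;
        i++;
    }
    else if (cmp < 0)
        i++;
    else
        j++;
}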
If it is possible for you to keep the lists ordered all the time, this method will make a difference. Also, I don't think it is possible to split and compare unless the lists are ordered.
How can I optimize the following:
final String[] longStringArray = {"1","2","3".....,"9999999"};
String searchingFor = "9999998";
for(String s : longStringArray)
{
if(searchingFor.equals(s))
{
//After 9999998 iterations finally found it
// Do the rest of stuff here (not relevant to the string/array)
}
}
NOTE: The longStringArray is only searched once per run, is not sorted, and is different every other time I run the program.
I'm sure there is a way to improve the worst-case performance here, but I can't seem to find it...
P.S. I would also appreciate a solution that handles the case where the string searchingFor does not exist in the array longStringArray.
Thank you.
Well, if you have to use an array, and you don't know if it's sorted, and you're only going to do one lookup, it's always going to be an O(N) operation. There's nothing you can do about that, because any optimization step would be at least O(N) to start with - e.g. populating a set or sorting the array.
Other options though:
If the array is sorted, you could perform a binary search. This will turn each lookup into an O(log N) operation.
If you're going to do more than one search, consider using a HashSet<String>. This will turn each lookup into an O(1) operation (assuming few collisions).
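To make the trade-offs above concrete, here is a small sketch with all three variants side by side. The class StringLookup, the method containsLinear and the tiny sample array are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class, named only for this illustration.
final class StringLookup {

    // One-off search in an unsorted array: O(N), stops at the first match,
    // and simply returns false when the target is absent.
    static boolean containsLinear(String[] array, String target) {
        for (String s : array) {
            if (target.equals(s)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] array = {"3", "1", "2"};

        // Single lookup: nothing beats the plain scan.
        System.out.println(containsLinear(array, "2"));

        // Many lookups: pay O(N) once to build the set, then O(1) expected per lookup.
        Set<String> index = new HashSet<>(Arrays.asList(array));
        System.out.println(index.contains("2"));
        System.out.println(index.contains("9"));

        // Sorted data: binary search, O(log N) per lookup.
        String[] sorted = array.clone();
        Arrays.sort(sorted);
        System.out.println(Arrays.binarySearch(sorted, "2") >= 0);
    }
}
```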
import org.apache.commons.lang.ArrayUtils;
ArrayUtils.indexOf(array, string);
ArrayUtils documentation
You can create a second array with the hash codes of the strings and binary search on that.
You will have to sort the hash array and move the elements of the original array accordingly. Since different strings can share a hash code, confirm each hit with equals(). This gives you extremely fast searches, but the arrays must be kept ordered, so inserting new elements takes resources.
The best option would be to implement a binary tree or a B-tree; if you really have that much data and have to handle inserts, it's worth it.
Arrays.asList(longStringArray).contains(searchingFor)
I have a variable number of ArrayLists that I need to find the intersection of. A realistic cap on the number of sets of strings is probably around 35 but could be more. I don't want any code, just ideas on what could be efficient. I have an implementation that I'm about to start coding but want to hear some other ideas.
Currently, just thinking about my solution, it looks like I should have an asymptotic run-time of Θ(n^2).
Thanks for any help!
tshred
Edit: To clarify, I really just want to know whether there is a faster way to do it. Faster than Θ(n^2).
Set.retainAll() is how you find the intersection of two sets. If you use HashSet, then converting your ArrayLists to Sets and using retainAll() in a loop over all of them is actually O(n).
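A minimal sketch of the retainAll() loop described above, for any number of lists. The class IntersectAll and the method intersectAll are hypothetical names; the early exit on an empty result is an extra assumption of this sketch, not something the answer prescribes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, named only for this illustration.
final class IntersectAll {

    // Returns the elements that occur in every one of the given lists.
    static Set<String> intersectAll(List<List<String>> lists) {
        if (lists.isEmpty()) {
            return new HashSet<>();
        }
        Set<String> result = new HashSet<>(lists.get(0));
        // Stop early once nothing is left to intersect.
        for (int i = 1; i < lists.size() && !result.isEmpty(); i++) {
            // Wrapping the list in a HashSet keeps each contains() check O(1) expected.
            result.retainAll(new HashSet<>(lists.get(i)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> lists = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("b", "c", "d"),
                Arrays.asList("c", "b", "e"));
        System.out.println(intersectAll(lists)); // contains exactly "b" and "c"
    }
}
```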
The accepted answer is just fine. As an update: since Java 8 there is a slightly more efficient way to find the intersection of two Sets.
Set<String> intersection = set1.stream()
.filter(set2::contains)
.collect(Collectors.toSet());
The reason it is slightly more efficient is that the original approach had to add elements of set1 that it then had to remove again if they weren't in set2. This approach only adds to the result set what needs to be in there.
Strictly speaking you could do this pre Java 8 as well, but without Streams the code would have been quite a bit more laborious.
If the two sets differ considerably in size, prefer streaming over the smaller one.
There is also the static method Sets.intersection(set1, set2) in Google Guava that returns an unmodifiable view of the intersection of two sets.
One more idea - if your arrays/sets are different sizes, it makes sense to begin with the smallest.
The best option would be to use HashSet to store the contents of these lists instead of ArrayList. If you can do that, you can create a temporary HashSet to which you add the elements to be intersected (use the addAll(..) method). Do tempSet.retainAll(storedSet) and tempSet will contain the intersection.
Sort them (n lg n) and then do binary searches (lg n).
You can use a single HashSet. Its add() method returns false when the object is already in the set. Adding objects from the lists and counting the false return values gives you the union in the set plus data for a histogram (the objects whose count+1 equals the number of lists are your intersection; this assumes no list contains the same object twice). If you put the counts into a TreeSet, you can detect an empty intersection early.
If you only need to know whether two sets intersect at all, I use the following snippet on Java 8+:
set1.stream().anyMatch(set2::contains)