Maximum amount of tasks running in parallel? - java

The problem:
We are given a set of n tasks, each having an integer start time and
end time. What is the maximum number of tasks running in parallel at
any given time?
The algorithm should run in O(n log n) time.
This is a school assignment, so I don't need a direct answer, but any code snippets are welcome as long as they are in Java or Scala (the assignment is supposed to be written in Scala).
Some of the hints say that I should take advantage of priority queues. I read the documentation, but I'm not really sure how to use them, so any code snippets are welcome.
The input data could, for instance, be Array[Pair[Int,Int]] = Array((1000,2000),(1500,2200)) and so on.
I'm really struggling to set the Ordering of the priority queue, so if nothing else, I hope someone can help me with that.
PS:
The priority queue is supposed to be initialized with PriorityQueue()(ord).
Edit: I came up with a solution using priority queues, but thank you for all the answers. You guys helped me figure out the logic!

Solution without using a priority queue.
Consider the array of tasks as follows:
[(1,2), (1,5), (2,4), ....] // (a,b) : (start_time, end_time)
Step 1: Construct an array that holds the start_time and end_time values together.
[1,2,1,5,2,4....]
Step 2: Maintain a second array that records whether the time at index i is a start_time or an end_time.
[S,E,S,E,S,E...] // S: start_time, E: end_time
Step 3: Sort the first array, applying the same reordering to the second array so that each time keeps its S/E marker. When an end_time and a start_time are equal, decide which comes first: putting E before S means a task that ends at time t does not overlap one that starts at t.
Step 4: Maintain two variables, parallel_ryt_now and max_parallel_till_now (both initially 0), and traverse the second array as follows:
    for i in 0 .. len(second_array) - 1:
        if second_array[i] == "S":
            parallel_ryt_now++
        else:
            parallel_ryt_now--
        if parallel_ryt_now > max_parallel_till_now:
            max_parallel_till_now = parallel_ryt_now
Logic:
While traversing the sorted array, encountering a start_time means a task has started, so increment parallel_ryt_now; encountering an end_time means a task has completed, so decrement parallel_ryt_now.
This way, at every moment parallel_ryt_now holds the number of tasks currently running in parallel.
Time complexity = sort + traverse = O(n log n) + O(n) = O(n log n)
Space complexity = O(n) (to store the extra array that records whether the time at index i is a start_time or an end_time)
I hope it helped.
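For reference, here is a minimal Java sketch of the sort-and-sweep idea from the steps in this answer. It is only one way to write it: the class name MaxParallel and the int[]{start, end} task representation are my own assumptions, and it merges the two arrays into a single list of {time, delta} events. It also assumes that a task ending at time t does not overlap one starting at t, so ends are processed before starts at equal times.

```java
import java.util.ArrayList;
import java.util.List;

class MaxParallel {
    // Returns the largest number of tasks running at the same time.
    // Each task is an int[]{start, end} (assumed representation).
    static int maxParallel(int[][] tasks) {
        // One event per endpoint: {time, delta}, +1 for a start, -1 for an end.
        List<int[]> events = new ArrayList<>();
        for (int[] t : tasks) {
            events.add(new int[] {t[0], +1});
            events.add(new int[] {t[1], -1});
        }
        // Sort by time; at equal times, ends (-1) come before starts (+1).
        events.sort((a, b) -> a[0] != b[0]
                ? Integer.compare(a[0], b[0])
                : Integer.compare(a[1], b[1]));

        int running = 0;
        int best = 0;
        for (int[] e : events) {
            running += e[1];
            best = Math.max(best, running);
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] tasks = { {1000, 2000}, {1500, 2200}, {2000, 3000} };
        System.out.println(maxParallel(tasks));
    }
}
```

If touching endpoints should count as overlapping in your assignment, flip the tie-break so that starts sort before ends.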

Related

Condition to terminate BFS

I have this assignment where given a list of tuples where each tuple contains 2 Strings like this :
[ ("...","...") , ("...","...") , ("...","...") ... ]
I have to calculate the shortest path which will lead to an extreme-string.
An extreme-string is defined as a tuple of 2 strings where the first string is equal to the second string.
I know this might sound confusing so let me set an example.
Given :
The list [("0","100") , ("01","00") , ("110","11")]
With indices 0,1,2
The shortest path is : [2,1,2,0]
The extreme-string is equal to : "110011100"
Step by step explanation :
Starting with tuple of index 2 the initial string is : "110","11"
Appending tuple of index 1 next string is : "11001","1100"
Appending tuple of index 2 next string is : "11001110","110011"
Appending tuple of index 0 final string is : "110011100","110011100"
So say you begin with tuple ("X","Y") and then you pick tuple ("A","B") then result is ("XA","YB").
The way I approached this problem was using BFS which I already implemented and sounds right to me but there is an issue I am dealing with.
If the input is something like :
[("1","111")]
then the algorithm will never terminate as it will always be in the state "111..." - "111111111..." .
Checking for this specific input is not a good idea, as there are many inputs that can reproduce this result.
Having an upper bound for the iterations is also not a good idea because in some cases a finite result may actually exist after the iterations bound.
Any insight would be really useful.
Since it's an assignment I can't really solve it for you, but I'll try to give tips:
BFS sounds great to me as well.
One thing that differentiates BFS from, say, DFS is that you place the elements of level N into a queue (as opposed to a stack). Since a queue is FIFO, you'll process the elements of level N before the elements of level N + 1. So the search visits paths in order of length and finds a shortest solution whenever one exists (although it might occupy a lot of memory). If no solution exists, as with [("1","111")], plain BFS keeps generating new states and does not stop on its own.
The interesting part is what exactly you put into the queue and how you organize the traversal algorithm. This is something that I feel you've already solved or at least you have a direction. So think about my previous paragraph and hopefully you'll come to the solution ;)
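To make the "what goes into the queue" part concrete, here is a rough Java sketch, not a full solution to the assignment. The names ExtremeStringBfs, Node, and shortestPath are made up for illustration. Each queue entry stores only the overhang (the part of the longer side that is not yet matched) plus the indices chosen so far; branches where neither side is a prefix of the other are pruned, and an overhang that was already seen is not queued again. The maxDepth parameter is my own assumption, added so the sketch always terminates: this task is the Post correspondence problem, which is undecidable in general, so as far as I know no search can both avoid a cutoff and be guaranteed to stop on every unsolvable input.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

class ExtremeStringBfs {
    // BFS node: unmatched overhang of each side (one is always empty) and the path so far.
    static final class Node {
        final String top;
        final String bottom;
        final List<Integer> path;
        Node(String top, String bottom, List<Integer> path) {
            this.top = top;
            this.bottom = bottom;
            this.path = path;
        }
    }

    // Returns the shortest index sequence that makes both sides equal,
    // or an empty list if none is found within maxDepth steps (hypothetical cutoff).
    static List<Integer> shortestPath(String[][] pairs, int maxDepth) {
        Queue<Node> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(new Node("", "", new ArrayList<>()));
        while (!queue.isEmpty()) {
            Node cur = queue.remove();
            if (cur.path.size() >= maxDepth) continue;
            for (int i = 0; i < pairs.length; i++) {
                String top = cur.top + pairs[i][0];
                String bottom = cur.bottom + pairs[i][1];
                // Prune: if neither side is a prefix of the other, they can never match.
                if (!top.startsWith(bottom) && !bottom.startsWith(top)) continue;
                int common = Math.min(top.length(), bottom.length());
                top = top.substring(common);
                bottom = bottom.substring(common);
                List<Integer> path = new ArrayList<>(cur.path);
                path.add(i);
                if (top.isEmpty() && bottom.isEmpty()) return path; // both sides are equal
                // Same overhang means same possible futures; keep only the first (shortest) path.
                String key = top.isEmpty() ? "B:" + bottom : "T:" + top;
                if (seen.add(key)) queue.add(new Node(top, bottom, path));
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        String[][] pairs = { {"0", "100"}, {"01", "00"}, {"110", "11"} };
        System.out.println(shortestPath(pairs, 20)); // prints [2, 1, 2, 0]
    }
}
```

With the input [("1","111")] the overhang grows by two characters at every step, so every state is new and only the cutoff stops the search.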

Java interview question: get entry by two fields in O(log(n)) time

I had an interview task. The idea is to store elements with the fields id, name, and updateTime.
There should be methods add(Element), getElement(id), and getLastUpdatedElements().
Requirements:
The code should be in Java
It should be thread safe
The upper bound of computational complexity for all these methods should be O(log(n))
Notes
The update time of any element can change at runtime
getLastUpdatedElements - returns the elements updated in the last minute
My thoughts
I cannot use CopyOnWriteArrayList, because finding the last updated elements takes O(N) if the key is id, which breaks the requirement.
To fit O(log(N)) complexity for getLastUpdatedElements(), I can use a ConcurrentSkipListSet with a comparator on updateTime, but then it takes O(N) to get an element by id. (Please note that in this case add(Element) is O(log(N)), since we know updateTime for newly created elements.)
I can use two trees, the first with a comparator on id and the second with a comparator on updateTime, but then I have to make all access methods synchronized, which makes my program effectively single-threaded.
I think I'm close and just need to find a way to get an element by id in O(log(N)), but I'm running out of ideas.
I hope I understood you correctly.
If you need to store the elements with "add" and "get" times as low as O(log(N)), that sounds like a classic hash map (since Java 8, I believe, HashMap keeps colliding entries in a linked list and converts a bucket into a balanced tree once it grows past a certain threshold),
so the worst case is O(log(N)).
For the "get last updated" function: you can store each updated element in a stack (not really a stack, just a list you keep appending to), and when the function is called, perform a binary search on the list. When you reach the first item that was updated in the last minute, just return the index of that item.
That way you only perform a binary search (O(log(N))).
Oh, and of course guard those two data structures with a lock.
If you really want to dig into it performance-wise, you can implement two locks: one for inserting/updating entries, and one just for reading them,
similar to the "readers-writers problem" like so: https://www.tutorialspoint.com/readers-writers-problem
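Here is a minimal sketch of the two-structures-plus-lock idea, under some assumptions of mine. It swaps the append-only list for a TreeSet ordered by (updateTime, id), because an element's update time can change and a TreeSet lets the stale entry be removed in O(log n). The names ElementStore and Element are illustrative, elements are treated as immutable (an update is an add with the same id), and the current time is passed in as a parameter to keep the example testable.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class ElementStore {
    // Immutable element; updating means adding a new Element with the same id.
    static final class Element {
        final int id;
        final String name;
        final long updateTime; // milliseconds
        Element(int id, String name, long updateTime) {
            this.id = id;
            this.name = name;
            this.updateTime = updateTime;
        }
    }

    private final Map<Integer, Element> byId = new HashMap<>();
    // Ordered by update time; id breaks ties so distinct elements never compare equal.
    private final TreeSet<Element> byTime = new TreeSet<>(
            Comparator.comparingLong((Element e) -> e.updateTime)
                      .thenComparingInt(e -> e.id));
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    void add(Element e) {
        lock.writeLock().lock();
        try {
            Element old = byId.put(e.id, e);
            if (old != null) byTime.remove(old); // drop the stale time entry
            byTime.add(e);
        } finally {
            lock.writeLock().unlock();
        }
    }

    Element getElement(int id) {
        lock.readLock().lock();
        try {
            return byId.get(id);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Elements whose updateTime is within the last minute before nowMillis.
    List<Element> getLastUpdatedElements(long nowMillis) {
        lock.readLock().lock();
        try {
            Element probe = new Element(Integer.MIN_VALUE, "", nowMillis - 60_000);
            return new ArrayList<>(byTime.tailSet(probe, true));
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Locating the start of the last-minute range is O(log n); copying the k matching elements adds O(k) on top. Readers can run concurrently under the read lock, and only add takes the write lock.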

Traversal of Giant LinkedList

For a project I am required to write a method that times the traversal of a LinkedList filled with 5 million random Integers using a listIterator, then with LinkedList's get(index) method.
I had no problem traversing it with the listIterator and it completed in around 75ms. HOWEVER, after trying the get method traversal on 5 million Integers, I just stopped the run at around 1.5 hours.
The getTraverse method I used is something like the code below for example (however mine was grouped with other methods in a class and was non-static, but works the same way).
public static long getTraverse(LinkedList<Integer> list) {
    long start = System.currentTimeMillis();
    // Each get(i) walks the list from one end, so this loop is O(n^2) overall.
    for (int i = 0; i < list.size(); i++) {
        list.get(i);
    }
    long stop = System.currentTimeMillis();
    return stop - start;
}
This worked perfectly fine for LinkedLists of Integers of sizes 50, 500, 5000, 50000, and took quite a while but completed for 500000.
My professor tends to be extremely vague with instructions and very unhelpful when approached with questions. So, I don't know if my code is broken, or if he got carried away with the Integers in the guidelines. Any input is appreciated.
Think about how a LinkedList is implemented - as a chain of nodes - and you'll see that to get to a particular node you have to start at the head and traverse to that node.
You're calling .get() on a LinkedList n times, which requires traversing the list until it reaches that index. This means your getTraverse() method takes O(n^2) (or quadratic) time, because for each element it has to traverse (part of) the list.
As Elliott Frisch said, I suspect you're discovering exactly what your instructor wanted you to discover - that different algorithms can have drastically different runtimes, even if in principle they do the same thing.
A LinkedList is optimised for insertion: adding at either end, or at a position an iterator already points to, is a constant-time operation.
Searching a LinkedList requires you to iterate over every element to find the one you want. You provide the index to the get method, but under the covers it is traversing the list to that index.
If you add some print statements, you'll probably see that the first X elements are retrieved pretty fast and it slows down over time as you index elements further down the list.
An ArrayList (backed by an array) is optimised for retrieval and can index the desired element in constant time. Try changing your code to use an ArrayList and see how much faster get runs.
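If you want to see the difference for yourself, here is a small sketch that runs both traversals on the same list. The class and method names are made up, and the list size of 50,000 is just a value I picked so the get(i) loop finishes quickly; the timings will vary from machine to machine.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class TraversalTiming {
    // O(n): the iterator remembers its position between calls.
    static long sumWithIterator(List<Integer> list) {
        long sum = 0;
        ListIterator<Integer> it = list.listIterator();
        while (it.hasNext()) {
            sum += it.next();
        }
        return sum;
    }

    // O(n^2) on a LinkedList: every get(i) walks the chain of nodes again.
    static long sumWithGet(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        LinkedList<Integer> list = new LinkedList<>();
        for (int i = 0; i < 50_000; i++) {
            list.add(i);
        }
        long t0 = System.nanoTime();
        long a = sumWithIterator(list);
        long t1 = System.nanoTime();
        long b = sumWithGet(list);
        long t2 = System.nanoTime();
        System.out.println("iterator: " + (t1 - t0) / 1_000_000 + " ms, sum " + a);
        System.out.println("get(i):   " + (t2 - t1) / 1_000_000 + " ms, sum " + b);
    }
}
```

Both methods accept any List, so passing an ArrayList to the same two methods shows the get(i) loop dropping back to linear time.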

In java, when sorting, what's the solution to a specific issue to sort when I have equal values

So I have this problem. I want to code a non-preemptive priority scheduling algorithm, and my approach is to sort the jobs, since the algorithm says the highest priority goes first. Suppose I have priority values inside an array, for example: job1 = 2; job2 = 5; job3 = 2; job4 = 4.
The rule is that when two or more jobs with equal priority are present, the processor is allocated to the one "who arrived first". For the example above, the jobs should be sorted this way (descending order): job2 - job4 - job1 - job3.
Since job1 and job3 have the same priority, I want job1 to come before job3.
Now my problem is this: how do I make the sort put job1 first and not job3? Or does the built-in sort already handle this automatically? I have never tested whether job3 ends up first or last.
A priority queue data structure already exists in Java (PriorityQueue), and you can use that. The thread-safe version is PriorityBlockingQueue.
You can define a custom comparator that keeps the queue ordered by priority and falls back to insertion order when priorities are equal.
Lots of examples are here - Java: How do I use a PriorityQueue?
Other comparator strategies listed here
Refer this one too
Hope it helps !!
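A small sketch of such a comparator, with assumptions spelled out: java.util.PriorityQueue does not preserve insertion order among equal elements by itself, so this example stamps every job with an arrival counter and uses it as the tie-breaker. The names JobQueue, Job, submit, and drain are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class JobQueue {
    static final class Job {
        final String name;
        final int priority;
        final long arrival; // sequence number assigned on submit
        Job(String name, int priority, long arrival) {
            this.name = name;
            this.priority = priority;
            this.arrival = arrival;
        }
    }

    private long nextArrival = 0;
    // Highest priority first; among equal priorities, earliest arrival first.
    private final PriorityQueue<Job> queue = new PriorityQueue<>(
            Comparator.comparingInt((Job j) -> j.priority).reversed()
                      .thenComparingLong(j -> j.arrival));

    void submit(String name, int priority) {
        queue.add(new Job(name, priority, nextArrival++));
    }

    // Removes all jobs and returns their names in scheduling order.
    List<String> drain() {
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);
        }
        return order;
    }

    public static void main(String[] args) {
        JobQueue q = new JobQueue();
        q.submit("job1", 2);
        q.submit("job2", 5);
        q.submit("job3", 2);
        q.submit("job4", 4);
        System.out.println(q.drain()); // prints [job2, job4, job1, job3]
    }
}
```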
You are talking about stable sorting, which keeps the order of equal value elements. Merge sort is stable, while quick sort is not. Collections.sort uses merge sort and should do the job in O(nlogn).
However, if time complexity is an issue and since the number of priorities is limited, radix sort should sort, generally (though not guaranteed), in O(sn), when n integer keys of size s are used.
Stable sorting is the answer to your question, assuming that jobs are appended to the back of the array when they arrive. So if Job 1 = 2 is inputted, then some other jobs, then Job 3 = 2, the array will look something like this: [Job 1, Job x, Job y, ..., Job 3]. Stable sorting, by definition, means that if two elements in the array have the same value, then the original ordering of the two elements is preserved. As myin528 states, merge sort is stable, as are radix sort, which could be faster depending on the values in your array, and insertion sort, which is fine if your array is small.
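To illustrate with the jobs from the question, here is a minimal sketch that relies on List.sort being documented as stable. The class and method names are hypothetical, and it assumes the arrays are already in arrival order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class StableJobSort {
    // names[i] has priority priorities[i]; both arrays are in arrival order.
    static List<String> schedule(String[] names, int[] priorities) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            indices.add(i);
        }
        // Sort by priority, highest first. Because the sort is stable,
        // jobs with equal priority stay in their original (arrival) order.
        indices.sort(Comparator.comparingInt((Integer i) -> priorities[i]).reversed());

        List<String> order = new ArrayList<>();
        for (int i : indices) {
            order.add(names[i]);
        }
        return order;
    }

    public static void main(String[] args) {
        String[] names = { "job1", "job2", "job3", "job4" };
        int[] priorities = { 2, 5, 2, 4 };
        System.out.println(schedule(names, priorities)); // prints [job2, job4, job1, job3]
    }
}
```

Note that the comparator only looks at the priority; it returns 0 for job1 and job3, and the stable sort does the rest.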

Find K max values from a N List

I have these requirements:
1. I have random values in a List/Array and I need to find the 3 max values.
2. I have a pool of values, and this pool gets updated roughly every 5 seconds. After every update, I need to find the 3 max values from the pool.
I thought of using Math.max three times on the list, but I don't think that is a very optimized approach.
> Won't any sorting mechanism be costly, as I only care about the top 3 max values? Why sort all of them?
Please suggest the best way to do it in Java.
Sort the list, get the 3 max values. If you don't want the expense of the sort, iterate and maintain the n largest values.
Maintain the pool as a sorted collection.
Update: FYI Guava has an Ordering class with a greatestOf method to get the n max elements in a collection. You might want to check out the implementation.
Ordering.greatestOf
Traverse the list once, keeping an ordered array of three largest elements seen so far. This is trivial to update whenever you see a new element, and instantly gives you the answer you're looking for.
A priority queue should be the data structure you need in this case.
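A sketch of how that could look, assuming the pool is any Iterable of Integer values: keep a min-heap that never holds more than k elements, so the smallest of the current top k sits at the head and is the one to evict. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class TopK {
    // Returns up to k largest values, largest first, in O(n log k) time.
    static List<Integer> largest(Iterable<Integer> values, int k) {
        List<Integer> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }
        PriorityQueue<Integer> heap = new PriorityQueue<>(); // min-heap by natural order
        for (int v : values) {
            if (heap.size() < k) {
                heap.add(v);
            } else if (v > heap.peek()) {
                heap.poll(); // evict the smallest of the current top k
                heap.add(v);
            }
        }
        result.addAll(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }

    public static void main(String[] args) {
        System.out.println(largest(List.of(5, 1, 9, 3, 9, 7), 3)); // prints [9, 9, 7]
    }
}
```

Duplicates are kept as separate values here, and a pool with fewer than k values simply returns everything it has.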
First, it would be wise to never say again, "I dont think it as a very optimized approach." You will not know which part of your code is slowing you down until you put a profiler on it.
Second, the easiest way to do what you're trying to do -- and what will be most clear to someone later if they are trying to see what your code does -- is to use Collections.sort() and pick off the last three elements. Then anyone who sees the code will know, "oh, this code takes the three largest elements." There is so much value in clear code that it will likely outweigh any optimization that you might have done. It will also keep you from writing bugs, like giving a natural meaning to what happens when someone puts the same number into the list twice, or giving a useful error message when there are only two elements in the list.
Third, if you really get data so large that O(n log n) operations are too slow, you should rewrite the data structure which holds the data in the first place. java.util.NavigableSet, for example, offers a .descendingIterator() method which you can probe for its first three elements; those would be the three maximum numbers. If you really want, a heap data structure can be used: the maximum is available in constant time and the top 3 elements can be pulled off with a few O(log n) removals, at the cost of making each add an O(log n) operation.
