Recently I came across this question, Bellman-Ford and Some Facts, which goes as follows:
We know that the Bellman-Ford algorithm checks all edges in each step, and for each edge, if d(v) > d(u) + w(u,v) holds, then d(v) is updated. Here w(u,v) is the weight of edge (u, v) and d(u) is the length of the best path found so far for vertex u. If at some step no vertex is updated, the algorithm terminates.
For finding all shortest paths from vertex s in a graph G with n vertices, this algorithm terminates after k < n iterations.
The following fact is true.
The number of edges in any shortest path from s is at most k - 1.
In this book we have 3 implementations (with some optimizations) of Bellman-Ford. My question is: if we relax edges simultaneously, which of these algorithms should be used, and does the above fact hold when using it? Or is the above fact not true in general?
The final algorithm, BellmanFordFinal(s), is the optimised version of all 3 of the mentioned algorithms.
OPTIMIZATION 1:
In the book, the legendary Prof. Jeff Erickson explains how the original algorithm presented by Bellman is optimised by removing the indentation of the last 3 lines of the algorithm.
Because each iteration of the outermost loop considers every edge u->v exactly once, the order in which the edges get processed does not matter.
OPTIMIZATION 2:
The indices i-1 have been changed to i in the last 2 lines, which lets the algorithm run faster while still computing the shortest-path values correctly.
OPTIMIZATION 3:
The 2-dimensional DP array is reduced to 1 dimension, as the extra dimension was unnecessary.
Therefore, use the final algorithm, titled BellmanFordFinal(s).
The fact that we need to run the outermost loop N-1 times is always true and independent of the implementation, because the longest possible shortest path from a source to a destination has N-1 edges, as in a linear (path) graph, where N is the number of nodes in the graph.
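To make the shape of that final version concrete, here is a minimal Java sketch of a single-array Bellman-Ford with N-1 full passes (my own illustration with an assumed edge-list representation, not the book's pseudocode):

    // Hedged sketch: single-array Bellman-Ford with N-1 full passes.
    // edges[j] = {u, v, w} represents an edge u -> v with weight w.
    static int[] bellmanFordFinal(int n, int[][] edges, int s) {
        final int INF = Integer.MAX_VALUE / 2;   // large sentinel, safe against overflow
        int[] dist = new int[n];
        java.util.Arrays.fill(dist, INF);
        dist[s] = 0;
        for (int i = 1; i <= n - 1; i++) {       // N - 1 passes always suffice
            for (int[] e : edges) {
                int u = e[0], v = e[1], w = e[2];
                if (dist[u] + w < dist[v]) {
                    dist[v] = dist[u] + w;       // relax edge u -> v
                }
            }
        }
        return dist;
    }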
Answer to your concern about k-1
The number of edges in any shortest path from s is at most k - 1.
The above statement is dependent on the implementation of your algorithm.
If you add an if condition that breaks the outermost loop when no edge is relaxed any further, then this statement is false.
Otherwise it is true.
Have a look at the following implementation I found on https://cp-algorithms.com/graph/bellman_ford.html:
// n = number of vertices, m = number of edges, v = the source vertex,
// e = list of edges where e[j].a is the tail, e[j].b the head and e[j].cost the weight,
// INF = a large sentinel value; these are declared elsewhere on the linked page.
void solve()
{
    vector<int> d(n, INF);
    d[v] = 0;
    for (;;)
    {
        bool any = false;
        for (int j = 0; j < m; ++j)
            if (d[e[j].a] < INF)
                if (d[e[j].b] > d[e[j].a] + e[j].cost)
                {
                    d[e[j].b] = d[e[j].a] + e[j].cost;
                    any = true;
                }
        if (!any) break;   // no edge was relaxed in this pass, so the distances are final
    }
    // display d, for example, on the screen
}
The variable any is used to check whether any relaxation was done in an iteration. If none was done, the loop breaks.
Therefore, it might happen that the loop terminates earlier, for example when k = 2 while the number of edges in the longest shortest path is 3. Then 3 <= 2 - 1 does not hold.
Related
The following code prints all permutations of a string:
void permutation(String str) {
    permutation(str, "");
}

void permutation(String str, String prefix) {
    if (str.length() == 0) {
        System.out.println(prefix);
    } else {
        for (int i = 0; i < str.length(); i++) {
            String rem = str.substring(0, i) + str.substring(i + 1);
            permutation(rem, prefix + str.charAt(i));
        }
    }
}
GeeksForGeeks analyzes the time complexity of the code by determining:
the function gets called n! times in its base case
the for-loop runs n times
as a result, there will be no more than n * n! nodes in the recursion tree.
each function call corresponds to O(n) work, so the total time complexity is O(n^2 * n!).
I know that the time complexity can be estimated by multiplying the number of nodes in the recursion tree by the amount of work each function call does. If I use the formula branches^depth to estimate the number of nodes in the recursion tree, I get n^n nodes, which is quite different from n * n!.
I'd like to know why branches^depth isn't a tight bound for this problem, and in which cases I shouldn't use O(branches^depth) to estimate the time complexity of a function that makes multiple calls.
Recursion Call Tree
The formula branches ^ depth is applicable for time-complexity analysis when each recursive call spawns the same number of branches, as in, for instance, a recursive implementation of the N-th Fibonacci number, where every call either terminates or creates exactly 2 new branches of execution, and the time complexity is O(2^n).
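As a hedged illustration of that Fibonacci case (my own minimal version): every non-base call below spawns exactly two further calls, so branches ^ depth gives a call tree with O(2^n) nodes.

    // Naive recursive Fibonacci: each non-base call branches into exactly two
    // recursive calls, so the recursion tree has O(2^n) nodes.
    static long fib(int n) {
        if (n <= 1) return n;           // base cases: fib(0) = 0, fib(1) = 1
        return fib(n - 1) + fib(n - 2); // two branches per call
    }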
The tree of recursive method calls for the code listed in the question is shown below (remaining characters on the left, prefix on the right).
As you can see, the number of branches decreases from top to bottom (from n to 1), because the number of characters that haven't been used yet decreases.
Permutations
The number of permutations of the given String of length n is n!.
Let's explore why with a simple example.
Imagine that you need to pick 1 card from a standard deck of 52 cards, i.e. there are 52 possibilities to choose from.
After the first card has been picked, you need to choose the next one, and there are 51 ways to make this choice.
That means that the total number of ways of how 2 cards can be picked is 52*51.
Then, for the third card, we have only 50 cards remaining in the deck.
That means that there are 52*51*50 ways to choose 3 cards from the deck.
For four cards, that number will be 52*51*50*49, for five - 52*51*50*49*48, and so on.
That is, informally, a proof by induction that there are 52*51*50*49*48*...*3*2*1 (which is a factorial: 52!) ways to pick all 52 cards from the deck.
Now, you might think of each character in a string as if it were a card and the string itself as a deck of size n. Similarly, there will be n! ways to pick all n characters from this string.
Algorithm - costs of operations
Since the number of permutations is n! that means that we need to generate n! strings of length n.
Each string has to be constructed letter by letter:
prefix + str.charAt(i) // append the character at position `i` to the `prefix` string
In order to create a string of length n, we need to pick a character n times in a loop. Each step of iteration will spawn a recursive method call. Therefore, n*n! method calls will be required to construct all n! permutations.
There's an additional cost that contributes to the total time complexity.
Creation of the remaining string (the characters that have not been picked yet) requires invoking
str.substring(0, i) + str.substring(i + 1)
at each method call. And that costs O(n) time, because in order to create a substring we have to iterate over the source string and copy up to n - 1 characters from its underlying array into the new string.
Therefore, the overall time complexity will be O(n^2*n!).
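As a sanity check on the call-count estimate, here is a hedged, instrumented copy of the code from the question (the counter, the class wrapper and the main method are my additions; printing of the permutations is omitted to keep the output small). It prints the observed number of method calls next to the n * n! figure used in the analysis.

    // Instrumented copy of the permutation code, used only to count method calls.
    public class PermutationCount {
        static long calls = 0;

        static void permutation(String str, String prefix) {
            calls++;                                    // count every invocation
            if (str.length() == 0) {
                // base case reached: `prefix` holds one complete permutation
            } else {
                for (int i = 0; i < str.length(); i++) {
                    String rem = str.substring(0, i) + str.substring(i + 1);
                    permutation(rem, prefix + str.charAt(i));
                }
            }
        }

        public static void main(String[] args) {
            String s = "";
            for (int n = 1; n <= 8; n++) {
                s += (char) ('a' + n - 1);              // "a", "ab", "abc", ...
                calls = 0;
                permutation(s, "");
                long factorial = 1;
                for (int k = 2; k <= n; k++) factorial *= k;
                // compare the observed call count with the n * n! estimate
                System.out.println("n=" + n + "  calls=" + calls + "  n*n!=" + (n * factorial));
            }
        }
    }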
As #rici suggested in the comments, branches^depth doesn't seem to be a good estimate when the number of branches for a node varies with the recursion depth, as seen in this problem.
For this problem, it seems that a better way to estimate the number of nodes is to use the formula described by #Shubham here:
Complexity = length of tree from root node to leaf node * number of leaf nodes
When I draw the recursion tree of this function for a string of length n, I see that there are n! leaf nodes. The depth of the recursion is n, so the total number of nodes is n * n!. As mentioned in the question, since each function call corresponds to O(n) work, the total time complexity is O(n^2 * n!).
Would it be O(26^n), where 26 is the number of letters of the alphabet and n is the number of levels of the trie? For example, this is the code to print a trie:
public void print()
{
    for (int i = 0; i < 26; i++)
    {
        if (this.next[i] != null)
        {
            this.next[i].print();   // recurse into the child for letter i
        }
    }
    if (this.word != null)
    {
        System.out.println(this.word.getWord());   // this node ends a stored word
    }
}
So looking at this code makes me think that my approximation of the time complexity is correct in the worst case, which would be all 26 children present at every node for n levels.
Would it be O(26^n), where 26 is the number of letters of the alphabet and n is the number of levels of the trie?
No. Each node in the trie must be visited, and O(1) work performed for each one (ignoring the work attributable to processing the children, which is accounted separately). The number of children does not matter on a per-node basis, as long as it is bounded by a constant (e.g. 26).
How many nodes are there altogether? Generally more than the number of words stored in the trie, and possibly a lot more. For a naively-implemented, perfectly balanced, complete trie with n levels below the root, each level has 26 times as many nodes as the previous one, so the total number of nodes is 1 + 26 + 26^2 + ... + 26^n = (26^(n+1) - 1) / 25. That is O(26^(n+1)) == O(26^n), i.e. 2^(O(n)), or "exponential" in the number of levels, which also corresponds to the length of the longest word stored within.
But one is more likely to be interested in measuring the complexity in terms of the number of words stored in the trie. With a careful implementation, it is possible to have nodes only for those words and for each maximal initial substring that is common to two or more of those words. In that event, every node has either zero children or at least two, so for w total words, the total number of nodes is bounded by w + w/2 + w/4 + ..., which converges to 2w. Therefore, a traversal of a trie with those structural properties costs O(2w) == O(w).
Moreover, with a little more thought, it is possible to conclude that the particular structural property I described is not really necessary to have O(w) traversal.
I am not familiar with a trie, but big O notation is mainly meant to describe approximately how quickly the running time or resource consumption grows relative to the input size. The way I think of it is that it refers to the general shape of the curve on a graph rather than exact points on the graph. O(1) looks like a flat line, while O(n) looks like a line at a 45-degree angle, etc.
source: https://medium.com/dataseries/how-to-calculate-time-complexity-with-big-o-notation-9afe33aa4c46
Now for the algorithm in the question. I am not familiar with a trie, but at first glance I would say it is O(1) (constant time), because the number of iterations of the loop is constant (always 26). However, inside the loop it calls this.next[i].print(), which could completely change the answer depending on its complexity, and which uncovers an important question we need to answer: what is n?
I am going to assume that this.next[i] is of the same type as this, making this.next[i].print() effectively a recursive call. In such a scenario the time it takes to finish executing depends on the number of instances (nodes) that have to be visited. This algorithm resembles depth-first search but does not safeguard against infinite recursion; that may be acceptable because of additional information known about the next[i] instances, such as that an instance is only ever referenced by at most one other instance. In this case the runtime complexity would be on the order of O(n), where n is the number of instances or nodes.
... assuming that this.word.getWord() runs in constant time as well. If it depends on some other input, such as the length of the word, the runtime may instead be O(n * w), where n is the number of nodes and w is the size of the words.
I have created the following simple algorithm in Java that computes Pascal's triangle recursively, in the form of a 2D array of ints:
public class PascalTriangleRec {
    private final int[][] points;

    public PascalTriangleRec(int size) {
        points = new int[size][];
        for (int i = 0; i < size; i++) {
            int[] row = new int[i + 1];
            for (int j = 0; j <= i; j++) {
                row[j] = getValueAtPoint(i, j);
            }
            points[i] = row;
        }
    }

    public static int getValueAtPoint(int row, int col) {
        if (col == 0 || col == row) return 1;
        else return getValueAtPoint(row - 1, col - 1) + getValueAtPoint(row - 1, col);
    }
}
I need to know the time complexity of this algorithm. I found another question on StackOverflow that gives the time complexity of the getValueAtPoint function as O(2^n/sqrt(n)). I figured that since this function is embedded in two nested for loops, the time complexity of the entire Pascal triangle is O(sqrt(n^3)*2^n). I am pretty sure this reasoning is correct.
On the other hand I devised a completely different way to think about this problem, which goes as follows:
There is a property of Pascal's triangle called Pascal's Corollary 8. This property states that the sum of all the coefficients in a given row r is equal to 2^r, with r starting at 0.
One can also note that the getValueAtPoint function from my code sample keeps calling itself recursively until it returns 1 at some point. This means that every coefficient in the Pascal triangle is formed by adding 1 as many times as the value of that coefficient.
Since adding 1s takes constant time, one can say that the time needed to compute a given row of the triangle is equal to some constant multiplied by the combined value of all the coefficients in that row. This means that the time complexity of computing a given row r of the triangle must be 2^r.
The time needed to compute the entire triangle is equal to the sum of the time needed to calculate all the rows in the triangle. This results in a geometric series, which computes the sum of all 2^r for r going from 0 to n-1.
Using the summation formula for a geometric series, this sum can be written in closed form: 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1.
This means that the time complexity of the algorithm according to this last derivation is O(2^n).
These two approaches yield different results, even though they both seem logical and correct to me. My question is, first of all, whether both of these approaches are correct, and whether both can be considered correct at the same time. As I see it, both are correct, but the second one is more accurate, since the first one takes the worst-case cost of getValueAtPoint and applies it to all coefficients, which is clearly not what happens in reality. Does this mean that the first one becomes incorrect, even though the logic behind it is sound, just because a better estimate exists?
The simple answer is "too many variables". First of all, your analysis is exactly correct: the complexity depends on the sum of all the values computed. The same logic underlies the answer that got you O(2^n/sqrt(n)).
There are two problems:
Little problem: Stirling's approximation is just that, an approximation: some terms are elided. I think they fall out when you combine all the loops, but I'd have to work through the nasty details to be sure.
Big problem: the values of n you combine are not the same n. That last n value you incorporated is i running from 0 to size; each value of i becomes n for an initial call to getValueAtPoint.
Try doing the sum from 0 to n on your previous complexity, and see what you get?
I want to find the shortest path between two vertices with an additional constraint: at most n vertices can be visited. The graph is directed, connected, has non-negative weights and may contain cycles.
Example:
Shortest path 0->2 with n = 2 is 18
Shortest path 0->3 with n = 3 is 22
Shortest path 0->3 with n = 4 is 9
So far I've implemented Dijkstra's algorithm to get the simple shortest path, and my idea was to keep a counter of the vertices visited so far; if it exceeds n, take one or more steps back and try another path. But as far as I know, Dijkstra's algorithm can't be used for backtracking, as explained here.
Another idea is to somehow store every path between every pair of nodes in a table. But I'm not really sure how Dijkstra can discover the path 0->2 with weight 18, since it is not really a shortest path...
Does anyone have any ideas how to tackle this problem?
Split each vertex into n vertices; that is, for a vertex u, create n vertices denoted (u, 1) ... (u, n), where the second number is the number of steps taken to reach the vertex. For each edge from u to v in the original graph, create an edge from (u, i) to (v, i+1) for 1 <= i <= n-1 in the new graph. Now, if you want to calculate the shortest path between u and v with at most n vertices, just run Dijkstra from (u, 1); the answer is min(result(v, i) | 1 <= i <= n).
The total number of vertices can be n*n, so the complexity is about O(n^2*log(n^2)).
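A hedged Java sketch of this layered construction (the adjacency-list format and all names are my own assumptions): a state (u, k) means "we are at vertex u having visited k vertices so far", and we run an ordinary Dijkstra over these states without building the expanded graph explicitly.

    import java.util.*;

    class BoundedDijkstra {
        // graph.get(u) holds int[]{v, w} pairs: an edge u -> v with weight w.
        // Returns the cheapest s -> t path visiting at most maxVertices vertices,
        // or -1 if no such path exists.
        static long shortestPath(List<List<int[]>> graph, int s, int t, int maxVertices) {
            int n = graph.size();
            long[][] dist = new long[n][maxVertices + 1];    // dist[u][k]: best cost of reaching u via k vertices
            for (long[] row : dist) Arrays.fill(row, Long.MAX_VALUE);
            dist[s][1] = 0;                                  // the source itself counts as one visited vertex
            PriorityQueue<long[]> pq =                       // queue entries: {cost, vertex, verticesVisited}
                new PriorityQueue<>(Comparator.comparingLong((long[] a) -> a[0]));
            pq.add(new long[]{0, s, 1});
            while (!pq.isEmpty()) {
                long[] cur = pq.poll();
                long cost = cur[0];
                int u = (int) cur[1], k = (int) cur[2];
                if (cost > dist[u][k]) continue;             // stale queue entry
                if (k == maxVertices) continue;              // the path may not grow any further
                for (int[] edge : graph.get(u)) {
                    int v = edge[0], w = edge[1];
                    if (cost + w < dist[v][k + 1]) {
                        dist[v][k + 1] = cost + w;
                        pq.add(new long[]{cost + w, v, k + 1});
                    }
                }
            }
            long best = Long.MAX_VALUE;
            for (int k = 1; k <= maxVertices; k++) best = Math.min(best, dist[t][k]);
            return best == Long.MAX_VALUE ? -1 : best;
        }
    }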
Let COST_TO(v,n) be the total weight of the minimum path to vertex v with n edges or less.
When n=0, the answer is easy:
for all v, COST_TO(v,0) = 0 if v is the source vertex and infinity otherwise
For n>0, COST_TO(v,n) is the minimum of COST_TO(v,n-1) and all COST_TO(w,n-1)+WEIGHT(w,v), where there is an edge from w to v
So, for n = 0 to N, keep track of all the vertices with COST_TO(v,n) < infinity along with their costs, and calculate the costs for n from the values for n-1.
At the same time you can keep track of the minimum weight path to each v -- every time you update the cost to v with the edge rule, the new path to v is the path to w plus that edge. A reverse-singly-linked list is handy for this.
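A hedged Java sketch of this recurrence (the edge-list format and names are my own assumptions; the path bookkeeping from the last paragraph is omitted): the costs for n edges are computed layer by layer from the costs for n-1 edges.

    // COST_TO recurrence: the result[v] is the cheapest path to v using at most maxEdges edges.
    static long[] costWithAtMostNEdges(int vertices, int[][] edges, int source, int maxEdges) {
        final long INF = Long.MAX_VALUE / 2;
        long[] prev = new long[vertices];                // COST_TO(v, n-1)
        java.util.Arrays.fill(prev, INF);
        prev[source] = 0;                                // COST_TO(source, 0) = 0
        for (int n = 1; n <= maxEdges; n++) {
            long[] cur = prev.clone();                   // start from COST_TO(v, n-1)
            for (int[] e : edges) {                      // e = {w, v, weight}: an edge w -> v
                int w = e[0], v = e[1];
                long weight = e[2];
                if (prev[w] + weight < cur[v]) {
                    cur[v] = prev[w] + weight;           // COST_TO(w, n-1) + WEIGHT(w, v)
                }
            }
            prev = cur;
        }
        return prev;                                     // prev[v] = COST_TO(v, maxEdges)
    }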
Let's assume that you want to find the shortest path from the source vertex S to a destination vertex T consisting of at most K edges. I picked K, because the n in your question is misleading - if you want to find the shortest path visiting at most n vertices, where n is the total number of vertices, you can just run Dijkstra, because any simple path has at most n vertices - I assume that this is not what you want here.
Then, if you want a simple implementation, Bellman-Ford is a good choice here:
After the i-th iteration of the outer loop, the algorithm has computed the shortest path from the source vertex S to any other vertex consisting of at most i edges, so it consists of at most i + 1 vertices. So in order to solve your problem, run K - 1 outer loops of Bellman-Ford and check whether the distance to the destination vertex T is well defined (different from infinity); if it is, you have your result. Otherwise, T is not reachable from S visiting K or fewer vertices.
Maybe try a BFS and check the number of vertices against the maximum. Save the best of the paths that fulfill the constraint.
Here's a video about it:
https://youtu.be/TvHV3PB8ANs
NIST.gov has some algorithms as well.
I'm studying linked lists, and the question is: write a function to print the middle term of a given linked list (assume that the LL has an odd number of nodes).
Method 1 - Traverse the LL and count the number of nodes using a counter. Add 1 (to make it an even number) and divide the counter by 2 (ignoring minor discrepancies in the math). Traverse the LL again, but this time only up to the counter-th term, and return it.
void GetMiddleTermMethod1(){
    //Count the number of nodes
    int counter = 0;
    Node n = FirstNode;
    while (n.next != null){
        counter = counter + 1;
        n = n.next;
    }
    counter = counter + 1/2;
    //now counter is equal to the half of the number of nodes
    //now a loop to return the nth term of a LL
    Node temp = FirstNode;
    for(int i=2; i<=counter; i++){
        temp = temp.next;
    }
    System.out.println(temp.data);
}
Method 2 - Initialize 2 references to nodes. One traverses 2 nodes at a time and the other only traverses 1. When the fast reference reaches null (the end of the LL), the slow one will have reached the middle; return it.
void GetMiddleTermMethod2(){
    Node n = FirstNode;
    Node mid = FirstNode;
    while(n.next != null){
        n = n.next.next;
        mid = mid.next;
    }
    System.out.println(mid.next.data);
}
I have 3 questions -
Q1 - How can I know which algorithm is more efficient, in case I'm asked this in a job interview? I mean, both functions traverse the LL one and a half times (the second one does it in one loop instead of 2, but it still traverses the LL one and a half times)...
Q2 - Since both algorithms have the Big O of O(n), what parameters will decide which one is more efficient?
Q3 - What is the general method of calculating the efficiency of such algorithms? I'd really appreciate it if you could point me to a suitable tutorial...
Thanks
Well, there is no really simple answer to that; the result may differ depending on compiler optimizations, JIT optimization, and the actual machine that runs the program (which might, for some reason, be better optimized for one of the algorithms).
The truth is, other than the theoretical big O notation that gives us asymptotic behavior, there is seldom a "clean, theoretical" way to determine that Algorithm A is faster than Algorithm B under conditions (1),(2),...,(k).
However, that doesn't mean there is nothing you can do: you can benchmark the code by creating various random data sets and timing how long each algorithm takes on them. It is very important to do this more than once. How much more? Until you reach statistical significance under some known and accepted statistical test, such as the Wilcoxon signed-rank test.
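A rough sketch of such a timing harness in Java (the helper name, repetition count and the use of System.nanoTime are arbitrary choices for illustration; for a fair comparison you would also remove the println calls from the methods being measured):

    // Average the wall-clock time of a task over many repetitions.
    static long averageNanos(Runnable task, int repetitions) {
        long start = System.nanoTime();
        for (int i = 0; i < repetitions; i++) {
            task.run();                     // run the candidate implementation
        }
        return (System.nanoTime() - start) / repetitions;
    }

    // Usage idea: build random lists of several sizes, collect both averages,
    // and feed the paired samples into a statistical test.
    // long avg1 = averageNanos(list::GetMiddleTermMethod1, 1000);
    // long avg2 = averageNanos(list::GetMiddleTermMethod2, 1000);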
In addition, in many cases an insignificant performance difference isn't worth the time spent optimizing the code, and it is even worse if the optimization makes the code less readable and thus harder to maintain.
I just implemented your solution in Java and tested it on a LinkedList of 1,111,111 random integers up to 1000. The results are very much the same:
Method 1:
time: 162ms
Method 2:
time: 171ms
Furthermore, I wanted to point out that you have two major flaws in your methods:
Method 1:
Change counter = counter + 1 / 2; to counter = (counter + 1) / 2; otherwise you end up at the end of the list, since integer division makes 1/2 equal to 0 and counter just remains counter :)
Method 2:
Change System.out.println(mid.next.data); to System.out.println(mid.data);