Maximum N-noded connected subgraph in a node-weighted graph


Take this node-weighted graph for example:
The maximum subgraph containing exactly 1 node (and the 'entry point') would be 14.
The maximum subgraph containing exactly 2 nodes (and the 'entry point') would be 14 / 9.
The maximum subgraph containing exactly 3 nodes (and the 'entry point') would be 3 / 19 / 15.
The maximum subgraph containing exactly 4 nodes (and the 'entry point') would be 14 / 1 / 7 / 240.
I can't think of a better method than brute force to find the maximum subgraph.
And if there is no known efficient algorithm, would a genetic algorithm be fine in that case (the crossovers seem tricky)?
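For reference, the brute-force baseline mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are made up, nodes are numbered 0..n-1, the entry point's own weight is not counted (as in the examples above), and the running time is exponential, so it is only usable on small graphs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: exhaustive search over connected node sets grown from the entry point.
final class MaxConnectedSubgraph {
    /** Builds an undirected adjacency list for nodes 0..n-1 from {u, v} pairs. */
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    /**
     * Best total weight of a connected set made of the entry point plus exactly n other nodes.
     * Returns Integer.MIN_VALUE if no such set exists.
     */
    static int best(List<List<Integer>> adj, int[] weight, int entry, int n) {
        boolean[] in = new boolean[weight.length];
        in[entry] = true;
        return grow(adj, weight, in, n, 0);
    }

    // Tries every node adjacent to the current set, adds it, recurses, then backtracks.
    private static int grow(List<List<Integer>> adj, int[] weight, boolean[] in,
                            int remaining, int sum) {
        if (remaining == 0) return sum;
        int best = Integer.MIN_VALUE;
        for (int u = 0; u < in.length; u++) {
            if (!in[u]) continue;
            for (int v : adj.get(u)) {
                if (in[v]) continue;
                in[v] = true;
                best = Math.max(best, grow(adj, weight, in, remaining - 1, sum + weight[v]));
                in[v] = false;
            }
        }
        return best;
    }
}
```

The same node set is reached in several orders, so this does redundant work; memoizing visited sets would cut that down but does not change the exponential worst case.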

I think you can modify Dijkstra's algorithm to solve this problem.
With Dijkstra's algorithm you are solving the shortest path problem. Just change that to find the maximum path. To restrict it to only 1, 2, or 3 nodes in the graph, keep track of the number of nodes it takes to get to each node when you "visit" it. Stop when there are no other nodes with a count less than the number of nodes you are looking for.

Related

How priority queue is used with heap to solve min distance

Please bear with me, I am very new to data structures.
I am confused about how a priority queue is used to solve min distance. For example, if I have a matrix and want to find the min distance from the source to the destination, I know that I would run Dijkstra's algorithm, in which, with a queue, I can easily find the distance between the source and all elements in the matrix.
However, I am confused about how a heap + priority queue is used here. For example, say that I start at (1,1) on a grid and want to find the min distance to (3,3). I know how to implement the algorithm in the sense of finding the neighbours, checking the distances, and marking cells as visited. But I have read about priority queues and min heaps and want to implement that.
Right now, my only understanding is that a priority queue has a key to position elements. My issue is that when I insert the first neighbours (1,0),(0,0),(2,1),(1,2), they are inserted in the pq based on a key (which would be distance in this case). So then the next search would be the element in the matrix with the shortest distance. But with the pq, how can a heap be used here with more than 2 neighbours? For example, the children of (1,1) are the 4 neighbours stated above. This would go against the 2*i and 2*i + 1 and i/2.
In conclusion, I don't understand how a min heap + priority queue works for finding the min of something like distance.
     0 1 2 3
     _ _ _ _
0 - |2|1|3|2|
1 - |1|3|5|1|
2 - |5|2|1|4|
3 - |2|4|2|1|

You need the priority queue to get the minimum weight at every move, so a MinPQ is a good fit for this.
Internally, a MinPQ uses the heap technique to put elements in the right position, with operations such as sink() and swim().
So the MinPQ is the data structure, and the heap is how it is implemented internally.
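To make the sink() and swim() idea a bit more concrete, here is a minimal sketch of an array-based binary min heap. It is an illustration only: the class name IntMinHeap is made up, it stores plain ints, and it uses 1-based indices so that the children of i sit at 2*i and 2*i + 1 and the parent at i/2, as in the question.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical minimal min-priority queue of ints backed by a binary heap.
final class IntMinHeap {
    private int[] a = new int[16]; // a[0] is unused; the root lives at index 1
    private int size = 0;

    void insert(int key) {
        if (size + 1 == a.length) a = Arrays.copyOf(a, a.length * 2);
        a[++size] = key; // put the new key at the bottom...
        swim(size);      // ...and let it move up to its place
    }

    int removeMin() {
        if (size == 0) throw new NoSuchElementException("heap is empty");
        int min = a[1];
        a[1] = a[size--]; // move the last key to the root...
        sink(1);          // ...and let it move down to its place
        return min;
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Moves a[i] up while it is smaller than its parent a[i / 2].
    private void swim(int i) {
        while (i > 1 && a[i / 2] > a[i]) {
            swap(i, i / 2);
            i /= 2;
        }
    }

    // Moves a[i] down while it is larger than its smaller child.
    private void sink(int i) {
        while (2 * i <= size) {
            int child = 2 * i;
            if (child < size && a[child + 1] < a[child]) child++;
            if (a[i] <= a[child]) break;
            swap(i, child);
            i = child;
        }
    }

    private void swap(int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }
}
```

In real code you would normally just use java.util.PriorityQueue, which is also a binary heap internally.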

If I'm interpreting your question correctly, you're getting stuck at this point:
But with the pq, how can a heap be used here with more than 2 neighbours? For example the children of (1,1) are the 4 neighbours stated above. This would go against the 2*i and 2*i + 1 and i/2
It sounds like what's tripping you up is that there are two separate concepts here that you may be combining. First, there's the notion of "two places in a grid might be next to one another." In that world, you have (up to) four neighbors for each location. Next, there's the shape of the binary heap, in which each node has two children whose locations are given by certain arithmetic computations on array indices. Those are completely independent of one another: the binary heap has no idea that the items it's storing come from a grid, and the grid has no idea that there's an array where each node has two children stored at particular positions.
For example, suppose you want to store locations (0, 0), (2, 0), (0, -2) and (0, 2) in a binary heap, and that the weights of those locations are 1, 2, 3, and 4, respectively. Then the shape of the binary heap might look like this:
          (0, 0)
         Weight 1
         /      \
    (2, 0)      (0, 2)
   Weight 2    Weight 4
     /
(0, -2)
Weight 3
This tree still gives each node two children; those children just don't necessarily map back to the relative positions of nodes in the grid.
More generally, treat the priority queue as a black box. Imagine that it's just a magic device that says "you can give me some new thing to store" and "I can give you the cheapest thing you've given me so far" and that's it. The fact that, internally, it happens to be implemented as a binary heap is essentially irrelevant.
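As a rough sketch of that black-box usage, here is Dijkstra on a grid with java.util.PriorityQueue. Several things here are my assumptions rather than anything from the question: the class and method names, reading coordinates as (row, column), and the cost model, where stepping into a cell costs that cell's value and the start cell costs nothing.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Hypothetical example: the queue only ever sees {cost, row, col} triples.
final class GridDijkstra {
    /** Cheapest cost from (sr, sc) to (tr, tc), or -1 if the target is unreachable. */
    static int minCost(int[][] grid, int sr, int sc, int tr, int tc) {
        int rows = grid.length, cols = grid[0].length;
        int[][] dist = new int[rows][cols];
        for (int[] row : dist) Arrays.fill(row, Integer.MAX_VALUE);
        dist[sr][sc] = 0;

        // "Give me the cheapest thing so far": ordered by the cost stored in slot 0.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        pq.add(new int[] {0, sr, sc});
        int[][] moves = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};

        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int d = cur[0], r = cur[1], c = cur[2];
            if (d > dist[r][c]) continue;      // stale entry, a cheaper one was already handled
            if (r == tr && c == tc) return d;  // first time the target is polled, d is final
            for (int[] m : moves) {
                int nr = r + m[0], nc = c + m[1];
                if (nr < 0 || nc < 0 || nr >= rows || nc >= cols) continue;
                int nd = d + grid[nr][nc];
                if (nd < dist[nr][nc]) {
                    dist[nr][nc] = nd;
                    pq.add(new int[] {nd, nr, nc});
                }
            }
        }
        return -1;
    }
}
```

Note that all four neighbours are simply added to the queue; where they end up inside the heap is the queue's business, not the grid's.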
Hope this helps!

Dijkstra - find destinations within budget

I was given a graph with a start vertex, other vertices, and edges that represent the cost of going from one vertex to another. I need to find the set of destination vertices that I can travel to from the start vertex. The budget is a certain amount of dollars, and the total travel cost should be within the budget. How can I apply Dijkstra's algorithm to this problem? I think we usually use Dijkstra to find the shortest path between two fixed vertices, but I am not sure how to apply it to this budget problem. If someone can give some ideas, that would really help!

To my understanding, Dijkstra's algorithm solves the single-source shortest path problem. This means that the resulting data structure is a tree rooted at the starting node. Typically, an implementation stores the minimum cost to reach each node from the starting node. The set of vertices that can be reached within the budget is then exactly the set of vertices whose shortest-path cost does not exceed the budget. The algorithm itself does not need to be modified; in an additional step, the nodes that can be reached within budget have to be returned as the output.
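A minimal sketch of that idea, assuming non-negative integer costs and vertices numbered 0..n-1 (the class name, the {neighbor, cost} edge encoding, and the choice to leave the start vertex out of the result are my own assumptions):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: plain Dijkstra, then keep the vertices whose distance fits the budget.
final class BudgetReach {
    /** Builds an undirected adjacency list from {u, v, cost} triples. */
    static List<List<int[]>> undirected(int n, int[][] edges) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return adj;
    }

    /** Vertices other than start whose cheapest path from start costs at most budget. */
    static Set<Integer> reachable(List<List<int[]>> adj, int start, int budget) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[start] = 0;
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        pq.add(new int[] {0, start}); // entries are {distance, vertex}
        Set<Integer> result = new TreeSet<>();
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int d = cur[0], u = cur[1];
            if (d > dist[u]) continue; // stale entry
            if (u != start) result.add(u);
            for (int[] e : adj.get(u)) {
                int v = e[0], nd = d + e[1];
                // Paths that already exceed the budget are never queued.
                if (nd <= budget && nd < dist[v]) {
                    dist[v] = nd;
                    pq.add(new int[] {nd, v});
                }
            }
        }
        return result;
    }
}
```

Skipping over-budget paths is only an optimization; running unmodified Dijkstra and filtering the distance array afterwards gives the same set.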

If you use Dijkstra's algorithm, you may end up with the scenarios below:
Assume a budget of 50 and a graph of 4 nodes (start, node 1, node 2, node 3).
Start Node -> node 1 (15) -> node 2 (10): so total cost is 25
Start Node -> node 3 (15): total cost is 15
So now what is your expected result? Should you go to node 1, then node 2, and ignore node 3 (since you cannot return to start and then go to node 3, as the return will cost 30 too)? Or should you go to node 1, back to start, then go to node 3 (total cost of 45, the largest cost you can utilise)?
What you need is not the shortest path but a way to cover all destinations, which is the Floyd-Warshall algorithm: https://en.wikipedia.org/wiki/Floyd–Warshall_algorithm

Calculating the height of a minimum spanning tree

I need Java code that could help me find the height of the minimum spanning tree.
Basically I'm looking for an extension of Prim's/Kruskal's algorithm that not only gives the MST but also gives its height.
Thanks in advance.

Take one of the tree's centers as the root vertex and calculate the maximal distance from the chosen center to the leaf nodes.
The height can then be calculated in the following way:
set the height to 0
while there are at least 3 remaining vertices:
    delete all leaf vertices
    height := height+1
if 2 vertices remain:
    height := height+1
the remaining vertices are the centers of the tree.
The time complexity is O(n).
A practical way of calculating the height would be to combine all leaf nodes into a single node that serves as a root, then calculate the MST of that modification and the height as the maximum distance from the generated root node to the newly generated leaf nodes.
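The leaf-stripping procedure above could look roughly like this in Java. It is a sketch under assumptions: the input is already a tree (connected, no cycles) given as an adjacency list over vertices 0..n-1, and the class and method names are made up.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper: height of a tree when it is rooted at one of its centers.
final class TreeCenterHeight {
    /** Builds an undirected adjacency list from {u, v} pairs. */
    static List<List<Integer>> undirected(int n, int[][] edges) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    static int height(List<List<Integer>> tree) {
        int n = tree.size();
        int[] deg = new int[n];
        Deque<Integer> leaves = new ArrayDeque<>();
        for (int v = 0; v < n; v++) {
            deg[v] = tree.get(v).size();
            if (deg[v] <= 1) leaves.add(v);
        }
        int remaining = n, height = 0;
        while (remaining > 2) {
            int layer = leaves.size(); // all current leaves are deleted in one round
            remaining -= layer;
            for (int k = 0; k < layer; k++) {
                int leaf = leaves.poll();
                for (int nb : tree.get(leaf)) {
                    if (--deg[nb] == 1) leaves.add(nb); // nb becomes a leaf for the next round
                }
            }
            height++;
        }
        if (remaining == 2) height++;
        return height;
    }
}
```

Each vertex and edge is touched a constant number of times, which matches the O(n) bound.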

I am not writing the code here, just giving you a hint.
In every node, keep a variable that stores the height/depth of that node.
So, the depth of the starting node will be 0. Now, whenever you add an edge to the MST, set the depth of the new node to the depth of the node it is attached to, plus 1.
Say you currently have the following discovered nodes with the following depths:
a 0
b 1
c 1
Now suppose you want to add the edge from c to d, so the depth of node d will be depth(c)+1, i.e. 1+1=2.
Also, you can keep track of the maximum depth among all nodes at each step of the algorithm.
So the final answer will be the maximum depth among all the nodes of the tree.
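For what it's worth, here is how that hint might look on top of Prim's algorithm. Treat it as a sketch: the names are made up, the graph is assumed connected and undirected with integer weights, and the result is the height relative to the chosen root (a different root, or a different MST when weights tie, can give a different height).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper: Prim's algorithm that records each node's depth as it joins the tree.
final class PrimWithDepth {
    /** Builds an undirected adjacency list from {u, v, weight} triples. */
    static List<List<int[]>> undirected(int n, int[][] edges) {
        List<List<int[]>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(new int[] {e[1], e[2]});
            adj.get(e[1]).add(new int[] {e[0], e[2]});
        }
        return adj;
    }

    /** Height of the MST grown from root, i.e. the maximum depth of any node. */
    static int mstHeight(List<List<int[]>> adj, int root) {
        boolean[] inTree = new boolean[adj.size()];
        int[] depth = new int[adj.size()];
        // Queue entries are {weight, from, to}, cheapest edge first.
        PriorityQueue<int[]> pq = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        inTree[root] = true;
        for (int[] e : adj.get(root)) pq.add(new int[] {e[1], root, e[0]});
        int maxDepth = 0;
        while (!pq.isEmpty()) {
            int[] cur = pq.poll();
            int from = cur[1], to = cur[2];
            if (inTree[to]) continue;    // this edge would close a cycle
            inTree[to] = true;
            depth[to] = depth[from] + 1; // the hint: depth(new node) = depth(parent) + 1
            maxDepth = Math.max(maxDepth, depth[to]);
            for (int[] e : adj.get(to)) {
                if (!inTree[e[0]]) pq.add(new int[] {e[1], to, e[0]});
            }
        }
        return maxDepth;
    }
}
```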

Best algorithm to get all combination pairs of nodes in an undirected graph (need to improve time complexity)

I have an undirected graph A that has no multiple links between any two nodes and no self-connected nodes; there can be some isolated nodes (nodes with degree 0).
I need to go through all the possible pairs of nodes in graph A to assign some kind of score to the non-existent links (let's say my graph has k nodes and n links, then the number of such pairs should be k*(k-1)/2 - n). The score I assign is based on the common neighbor nodes of the 2 nodes in the pair.
Ex: score between A-D should be 1, score between G-D should be 0 ...
The biggest problem is that my graph has more than 100,000 nodes, and my first attempt, which handled almost 10^10 combinations, was too slow.
My second thought is that since the algorithm is based on the common neighbors of the nodes, I might only need to look at the neighbors to assign the scores that differ from 0. The rest can be set to 0 with no need to compute anything. But this could possibly repeat a combination.
Any idea for approaching this is appreciated. Please keep in mind that the actual network has more than 100,000 nodes.

If you represent your graph as an adjacency list (rather than an adjacency matrix), you can make use of the fact that your graph has only 600,000 edges to hopefully reduce the computation time.
Let's take a node V[j] with neighbors V[i] and V[k]:
V[i] ---- V[j] ---- V[k]
To find all such pairs of neighbors you can take the list of nodes adjacent to V[j] and find all combinations of those nodes. To avoid duplicates you will have to generate the combinations rather than the permutations of the end nodes V[i] and V[k] by requiring that i < k.
Alternatively, you can start with node V[i] and find all of the nodes that have a distance of 2 from V[i]. Let S be the set of all the nodes adjacent to V[i]. For each node V[j] in S, create a path V[i]-V[j]-V[k] where:
V[k] is adjacent to V[j]
V[k] is not an element of S (to avoid directly connected nodes)
k != i (to avoid cycles)
k > i (to avoid duplications)
I personally like this approach better because it completes the adjacency list for one node before moving on to the next.
Given that you have ~600,000 edges in a graph with ~100,000 nodes, assuming an even distribution of edges across all of the nodes, each node would have an average degree of 12. The number of possible paths for each node is then on the order of 10^2. Over 10^5 nodes that gives on the order of 10^7 total paths rather than the theoretical limit of 10^10 for a complete graph. Still a large number, but a thousand times faster than before.
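A sketch of the first variant (iterate over each middle node V[j] and pair up its neighbors) might look like this. The names are hypothetical, nodes are assumed to be numbered 0..n-1 with neighbor sets stored as hash sets, and pairs that never show up in the map have score 0.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: counts common neighbors for every unlinked pair at distance 2.
final class CommonNeighborScores {
    /** Builds undirected neighbor sets from {u, v} pairs. */
    static List<Set<Integer>> undirected(int n, int[][] edges) {
        List<Set<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new HashSet<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        return adj;
    }

    /** Encodes the unordered pair {i, k} as one long so it can be used as a map key. */
    static long key(int i, int k, int n) {
        return (long) Math.min(i, k) * n + Math.max(i, k);
    }

    static Map<Long, Integer> scores(List<Set<Integer>> adj) {
        int n = adj.size();
        Map<Long, Integer> scores = new HashMap<>();
        for (int j = 0; j < n; j++) {
            List<Integer> nb = new ArrayList<>(adj.get(j));
            // Every unordered pair of neighbors of j has j as a common neighbor.
            for (int a = 0; a < nb.size(); a++) {
                for (int b = a + 1; b < nb.size(); b++) {
                    int i = nb.get(a), k = nb.get(b);
                    if (adj.get(i).contains(k)) continue; // link already exists, no score needed
                    scores.merge(key(i, k, n), 1, Integer::sum);
                }
            }
        }
        return scores;
    }
}
```

Each pair is visited once per common neighbor, so the repetition is exactly what produces the count rather than a problem to avoid.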

graph representation with minimum cost (time and space)

I have to represent a graph in Java, but neither as an adjacency list nor as an adjacency matrix.
The basic idea is that if deg[i] is the exit degree (out-degree) of vertex i, then its neighbors can be stored in edges[i][j] where i <= j <= deg[i]. But given that edges[][] must be initialized with some values, I don't know how to make it differ from an adjacency matrix.
Any suggestions?
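One way to read the deg[] / edges[][] idea is as a jagged array: in Java the rows of a two-dimensional array can have different lengths, so row i only needs deg[i] slots. The sketch below is just an illustration of that reading; the class name, the directed {from, to} input format, and the use of 0-based positions 0 <= j < deg[i] are my assumptions.

```java
// Hypothetical sketch: row i of edges has exactly deg[i] entries, unlike an n x n matrix.
final class JaggedGraph {
    final int[] deg;     // deg[i] = out-degree of vertex i
    final int[][] edges; // edges[i][j] = j-th neighbor of vertex i, for 0 <= j < deg[i]

    JaggedGraph(int n, int[][] directedEdges) {
        deg = new int[n];
        for (int[] e : directedEdges) deg[e[0]]++;    // first pass: count out-degrees

        edges = new int[n][];                         // only the outer array is allocated here
        for (int i = 0; i < n; i++) edges[i] = new int[deg[i]];

        int[] next = new int[n];                      // next free slot in each row
        for (int[] e : directedEdges) edges[e[0]][next[e[0]]++] = e[1];
    }
}
```

Total space is one int per edge plus two arrays of length n, which is closer to an adjacency list than to a matrix.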

To my knowledge, there are only two ways to represent a graph in a language:
either use an adjacency matrix
or use an incidence matrix.
You can make an incidence matrix like this:
     E1  E2  E3  E4
V1    1   2   1   1
V2    2   1   2   1
V3    1   1   1   2
V4    1   1   2   1

You are fighting against lower bounds on this question. The two main representations of a graph are already very good for their respective uses.
Adjacency list: minimizes space. You will be hard pressed to use less memory than 1 pointer per edge. Space: O(V+E). Search: O(V).
Adjacency matrix: very fast, at the cost of V^2 space. Space: O(V^2). Search: O(1).
So, to make something that is better for both space and time, you will have to combine the ideas from both. Also realize that this will only give better practical performance; theoretically you will not improve on O(1) search or O(V+E) size.
My idea would be to store all the graph nodes in one array. Then, for each node, have an adjacency list represented as a bit vector. This would essentially be a matrix-like representation, but only for those nodes that exist in the graph, giving you a smaller size than a matrix. Search would be slightly improved over an adjacency list, as the query node can be tested against the bit vector.
Also check out sparse matrix representations.
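A rough sketch of the bit-vector idea using java.util.BitSet, with one bit set per neighbor. The class name and methods are hypothetical, and the graph is assumed undirected with nodes numbered 0..n-1.

```java
import java.util.BitSet;

// Hypothetical sketch: one BitSet per node, bit v of rows[u] is set when u-v is an edge.
final class BitSetGraph {
    private final BitSet[] rows;

    BitSetGraph(int n) {
        rows = new BitSet[n];
        for (int i = 0; i < n; i++) rows[i] = new BitSet(n);
    }

    void addEdge(int u, int v) {
        rows[u].set(v);
        rows[v].set(u);
    }

    /** Constant-time edge test, like a matrix lookup. */
    boolean hasEdge(int u, int v) {
        return rows[u].get(v);
    }

    int degree(int u) {
        return rows[u].cardinality();
    }

    /** Neighbors of u in increasing order. */
    int[] neighbors(int u) {
        return rows[u].stream().toArray();
    }
}
```

Each row still costs about n bits, so this saves a constant factor over an int or boolean matrix rather than changing the O(V^2) bound.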
