I have a directed graph G=(V,E) in which each vertex v has two properties:
r, indicating its worthiness
m, indicating the highest r among the vertices v' reachable from v (v itself included).
I need to find m for all vertices in O(|V|+|E|) time.
For example,
Initial G
A(r = 1, m = 1) → B(r = 3, m = 3) ← C(r = 2, m = 2)
↓
D(r = 4, m = 4)
has to be
A(r = 1, m = 4) → B(r = 3, m = 3) ← C(r = 2, m = 3)
↓
D(r = 4, m = 4)
I searched SO and found a similar question here, but one of the answers gives no time bound and another is very badly explained. Is there a simpler idea?
In practice, I would use the algorithm from Ehsan's answer, but it's not quite O(V+E). If you really need that complexity, then you can do this:
1. Divide the graph into strongly connected components using, e.g., Tarjan's algorithm. This is O(V+E).
2. Make a graph of the SCCs. Every node in an SCC is reachable from every other one, so the node for each SCC in the new graph gets the highest r value in the SCC. You can do this in O(V+E) too.
3. The graph of SCCs is acyclic, so you can do a topological sort. All the popular algorithms for that are O(V+E).
4. Process the SCC nodes in reverse topological order, calculating each m from the node's own value and the m values of its neighbors. Because all the edges point from later to earlier nodes, the inputs for each node will be finished by the time you get to it. This is O(V+E) too.
5. Go through the original graph, setting every node's m to the value for its component in the SCC graph. This is O(V).
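The steps above can be sketched in Python as follows. This is only an illustration under some assumptions: the graph is a dict that maps every vertex to a list of its successors, r is a dict of worth values, and the function name max_reachable is made up for this example. It relies on the fact that Tarjan's algorithm emits components in reverse topological order, so the topological sort and the reverse-order pass are folded into the moment a component is completed.

```python
import sys

def max_reachable(graph, r):
    """Return {vertex: highest r reachable from it}, in O(V+E)."""
    sys.setrecursionlimit(max(10000, 2 * len(graph) + 100))
    index, low = {}, {}
    stack, on_stack = [], set()
    comp_of = {}    # vertex -> id of its strongly connected component
    comp_best = []  # component id -> best r reachable from that component

    def dfs(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                dfs(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop its members off the stack
            members = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                members.append(w)
                if w == v:
                    break
            cid = len(comp_best)
            for w in members:
                comp_of[w] = cid
            best = max(r[w] for w in members)
            # every edge leaving this component ends in a finished component
            for w in members:
                for x in graph[w]:
                    if comp_of[x] != cid:
                        best = max(best, comp_best[comp_of[x]])
            comp_best.append(best)

    for v in graph:
        if v not in index:
            dfs(v)
    return {v: comp_best[comp_of[v]] for v in graph}

# The graph from the question: A -> B <- C, A -> D
graph = {"A": ["B", "D"], "B": [], "C": ["B"], "D": []}
r = {"A": 1, "B": 3, "C": 2, "D": 4}
print(max_reachable(graph, r))
```

The recursive DFS keeps the sketch short; for very deep graphs an explicit stack would be the safer choice.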
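Use the following O(E+V*log(V)) algorithm:
- Reverse all edge directions
- while |V| > 0 do
    find the vertex v_max with the highest r among the remaining vertices in V
    run a DFS from v_max in the reversed graph and set m = r(v_max) for every vertex it reaches
    remove all updated vertices from V
The time complexity of this algorithm is O(V*log(V)+E), which is close to what you asked for: the log factor comes from repeatedly finding the maximum, and each vertex and edge is visited once.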
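A minimal Python sketch of this idea, assuming the graph is a dict that maps every vertex to a list of its successors and r is a dict of worth values (the function name is hypothetical). Sorting the vertices once by r stands in for the repeated "find max" step.

```python
def max_reachable_by_sorting(graph, r):
    """Return {vertex: highest r reachable from it} in O(V*log(V) + E)."""
    # Build the reversed graph: an edge v -> w becomes w -> v.
    reverse = {v: [] for v in graph}
    for v, successors in graph.items():
        for w in successors:
            reverse[w].append(v)

    m = {}
    # Visit start vertices from highest r to lowest.
    for start in sorted(graph, key=lambda v: r[v], reverse=True):
        if start in m:
            continue  # already claimed by a vertex with a higher r
        m[start] = r[start]
        stack = [start]
        while stack:
            v = stack.pop()
            # Every unclaimed vertex that can reach start gets r[start].
            for u in reverse[v]:
                if u not in m:
                    m[u] = r[start]
                    stack.append(u)
    return m

graph = {"A": ["B", "D"], "B": [], "C": ["B"], "D": []}
r = {"A": 1, "B": 3, "C": 2, "D": 4}
print(max_reachable_by_sorting(graph, r))
```

Because a vertex is never revisited once it has a value, each edge of the reversed graph is followed at most once.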
How to solve the problem?
Reachable vertices in a directed graph
Which vertices can a given vertex visit?
Which vertices can visit the given vertex?
We are dealing with directed graphs, so we need to find the strongly connected components to answer questions like the ones above efficiently.
Once we know the strongly connected components, we can deal with the highest worthiness part.
In every strongly connected component, what is the highest worthiness value? Update accordingly.
Both steps are possible in O(V + E). With some care, I believe both steps can be done in a single pass.
How to find strongly connected components?
Kosaraju's algorithm
Tarjan's algorithm
Path-based strong component algorithm
If you are looking for something simple, go for Kosaraju's algorithm. To me, it is the simplest of the above three.
If you are looking for efficiency, Kosaraju's algorithm takes two depth-first traversals but the other two algorithms accomplish the same within 1 depth-first traversal.
A Space-Efficient Algorithm for Finding Strongly Connected Components mentions that Tarjan’s algorithm requires at most v(2 + 5w) bits of storage, where w is the machine’s word size. The improvement described in the paper reduces the space requirement to v(1 + 3w) bits in the worst case.
Implementation:
Apparently, you are looking for some type of implementation.
For the mentioned 3 ways of finding strongly connected components, you can find java implementation here.
There are multiple Path-based strong component algorithms. To my knowledge, Gabow's algorithm is much simpler to understand than Tarjan's algorithm and the latest in path-based strong component algorithms. You can find java implementation for Gabow's algorithm here.
I am adding this answer, although there are already correct, upvoted answers, only because you tagged Java and Python. So I will add a Java implementation now, and if needed a Python implementation will follow.
The algorithm
This is a tweak on the classic topological sort:
foreach vertex:
foreach neighbour:
if didn't yet calculate m, calculate.
Take the maximum of yourself and neighbours. Mark yourself as visited, and if asked again for m, return the calculated.
It is implemented at calculateMostValuableVertex.
Time complexity
1. foreach vertex (O(|V|))
2. foreach edge (O(|E|) in total, as each edge is eventually traversed once):
3. If not yet computed, compute m.
Please note that m is calculated for each vertex either in stage 1 or in stage 3, never twice, since the visited check happens before the calculation.
Therefore the time complexity of this algorithm is O(|V| + |E|)
Assumptions
This solution relies heavily on the fact that HashMap in Java performs operations such as add/update in O(1). That is true on average, but if that is not enough, the same idea can be implemented with arrays only, which makes the solution O(|V|+|E|) in the worst case.
Implementation
Let's first define the basic classes:
Vertex:
import java.util.Objects;

class Vertex {
    final String label;
    public int r; // Worthiness
    public int m; // Highest worthiness reachable from this vertex

    Vertex(String label, int r, int m) {
        this.label = label;
        this.r = r;
        this.m = m;
    }

    // Identity is the label only: m is updated after the vertex has been
    // used as a HashMap key, so it must not take part in hashCode/equals.
    @Override
    public int hashCode() {
        return Objects.hashCode(label);
    }

    @Override
    public boolean equals(final Object obj) {
        if (this == obj)
            return true;
        if (obj == null || getClass() != obj.getClass())
            return false;
        return Objects.equals(label, ((Vertex) obj).label);
    }

    @Override
    public String toString() {
        return "Vertex{" +
                "label='" + label + '\'' +
                ", r=" + r +
                ", m=" + m +
                '}';
    }
}
It is important to define the methods equals and hashCode so that the HashMap lookups later on work as expected. They should depend only on label: m is modified while the vertex is already stored as a map key, and a key whose hash changes can no longer be found.
Graph:
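import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
class Graph {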
private final Map<Vertex, List<Vertex>> adjVertices = new HashMap<>();
private final Map<String, Vertex> nameToVertex = new HashMap<>();
private final List<Vertex> vertices = new ArrayList<>();
void addVertex(String label, int r, int m) {
Vertex vertex = new Vertex(label, r, m);
adjVertices.putIfAbsent(vertex, new ArrayList<>());
nameToVertex.putIfAbsent(label, vertex);
vertices.add(vertex);
}
void addEdge(String label1, String label2) {
adjVertices.get(nameToVertex.get(label1)).add(nameToVertex.get(label2));
}
public void calculateMostValuableVertex() {
Map<Vertex, Boolean> visitedVertices = new HashMap<>();
for (Vertex vertex : vertices) {
visitedVertices.put(vertex, false);
}
for (Vertex vertex : vertices) {
if (visitedVertices.get(vertex)) {
continue;
}
calculateMostValuableVertexInternal(vertex, visitedVertices);
}
}
public void calculateMostValuableVertexInternal(Vertex vertex, Map<Vertex, Boolean> visitedVertices) {
List<Vertex> neighbours = adjVertices.get(vertex);
visitedVertices.put(vertex, true);
int max = vertex.r;
for (Vertex neighbour: neighbours) {
if (visitedVertices.get(neighbour)) {
max = Math.max(max, neighbour.m);
} else {
calculateMostValuableVertexInternal(neighbour, visitedVertices);
max = Math.max(max, neighbour.m);
}
}
vertex.m = max;
}
@Override
public String toString() {
StringBuilder sb = new StringBuilder();
Iterator<Map.Entry<Vertex, List<Vertex>>> iter = adjVertices.entrySet().iterator();
while (iter.hasNext()) {
Map.Entry<Vertex, List<Vertex>> entry = iter.next();
sb.append(entry.getKey());
sb.append('=').append('"');
sb.append(entry.getValue());
sb.append('"');
if (iter.hasNext()) {
sb.append(',').append('\n');
}
}
return "Graph{" +
"adjVertices=\n" + sb +
'}';
}
}
Finally, to run the above logic, you can do:
Graph g = new Graph();
g.addVertex("A", 1, 1);
g.addVertex("B", 3, 3);
g.addVertex("C", 2, 2);
g.addVertex("D", 4, 4);
g.addEdge("A", "B");
g.addEdge("C", "B");
g.addEdge("A", "D");
g.calculateMostValuableVertex();
System.out.println(g);
The output of the above is:
Graph{adjVertices=
Vertex{label='A', r=1, m=4}="[Vertex{label='B', r=3, m=3}, Vertex{label='D', r=4, m=4}]",
Vertex{label='D', r=4, m=4}="[]",
Vertex{label='B', r=3, m=3}="[]",
Vertex{label='C', r=2, m=3}="[Vertex{label='B', r=3, m=3}]"}
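as expected. The code also runs on graphs with cycles, with one caveat: a vertex that reaches an ancestor through a back edge reads that ancestor's m before it is final, so the result is only guaranteed to be correct on acyclic graphs (or after collapsing strongly connected components). For the following cyclic example the values do come out right. The output of: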
Graph g = new Graph();
g.addVertex("A", 1, 1);
g.addVertex("B", 3, 3);
g.addVertex("C", 2, 2);
g.addVertex("D", 4, 4);
g.addVertex("E", 5, 5);
g.addVertex("F", 6, 6);
g.addVertex("G", 7, 7);
g.addEdge("A", "B");
g.addEdge("C", "B");
g.addEdge("A", "D");
g.addEdge("A", "E");
g.addEdge("E", "F");
g.addEdge("F", "G");
g.addEdge("G", "A");
g.calculateMostValuableVertex();
System.out.println(g);
is:
Graph{adjVertices=
Vertex{label='A', r=1, m=7}="[Vertex{label='B', r=3, m=3}, Vertex{label='D', r=4, m=4}, Vertex{label='E', r=5, m=7}]",
Vertex{label='B', r=3, m=3}="[]",
Vertex{label='C', r=2, m=3}="[Vertex{label='B', r=3, m=3}]",
Vertex{label='D', r=4, m=4}="[]",
Vertex{label='E', r=5, m=7}="[Vertex{label='F', r=6, m=7}]",
Vertex{label='F', r=6, m=7}="[Vertex{label='G', r=7, m=7}]",
Vertex{label='G', r=7, m=7}="[Vertex{label='A', r=1, m=7}]"}
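One caveat is worth checking for the cyclic case. The small Python port below mirrors the recursion of calculateMostValuableVertexInternal (the function name and the dict-based graph are assumptions made for this sketch, not part of the Java code). On a graph where a back edge is followed before a more valuable branch has been explored, the vertex inside the cycle keeps a value that is too low.

```python
def memoized_dfs_max(graph, r):
    """Port of the recursive approach: m[v] = max of r[v] and neighbours' m."""
    m = dict(r)       # m starts out equal to r, as in the Java example
    visited = set()

    def visit(v):
        visited.add(v)
        best = r[v]
        for w in graph[v]:
            if w not in visited:
                visit(w)
            # For a back edge, m[w] is still the ancestor's initial value.
            best = max(best, m[w])
        m[v] = best

    for v in graph:
        if v not in visited:
            visit(v)
    return m

# A <-> B form a cycle, and A also points to the valuable vertex C.
graph = {"A": ["B", "C"], "B": ["A"], "C": []}
r = {"A": 1, "B": 2, "C": 10}
result = memoized_dfs_max(graph, r)
print(result)  # B ends up with 2, although B -> A -> C reaches r = 10
```

Collapsing strongly connected components first, as described in the other answers, avoids this, because the recursion then only ever runs on an acyclic graph.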
I implemented my answer from the linked question in Python. The lines that don't reference minreach closely follow Wikipedia's description of Tarjan's SCC algorithm.
import random


def random_graph(n):
    return {
        i: {random.randrange(n) for j in range(random.randrange(n))} for i in range(n)
    }


class SCC:
    def __init__(self, graph):
        self.graph = graph
        self.index = {}
        self.lowlink = {}
        self.stack = []
        self.stackset = set()
        self.minreach = {}
        self.components = []

    def dfs(self, v):
        self.lowlink[v] = self.index[v] = len(self.index)
        self.stack.append(v)
        self.stackset.add(v)
        self.minreach[v] = v
        for w in self.graph[v]:
            if w not in self.index:
                self.dfs(w)
                self.lowlink[v] = min(self.lowlink[v], self.lowlink[w])
            elif w in self.stackset:
                self.lowlink[v] = min(self.lowlink[v], self.index[w])
            self.minreach[v] = min(self.minreach[v], self.minreach[w])
        if self.lowlink[v] == self.index[v]:
            component = set()
            while True:
                w = self.stack.pop()
                self.stackset.remove(w)
                self.minreach[w] = self.minreach[v]
                component.add(w)
                if w == v:
                    break
            self.components.append(component)

    def scc(self):
        for v in self.graph:
            if v not in self.index:
                self.dfs(v)
        return self.components, self.minreach


if __name__ == "__main__":
    g = random_graph(6)
    print(g)
    components, minreach = SCC(g).scc()
    print(components)
    print(minreach)
For this program, I read in an Excel file that lays out a map of towns, towns adjacent to them, and the distance between them, which looks like:
Bourke Nyngan 200
Brewarrina Walgett 134
Broken Hill Mildura 266
Broken Hill Wilcannia 195
Bungendore Queanbeyan 54
etc.
And that's working great. Everything seems perfect. I'm trying to make a program that if I give it two towns, it returns the shortest possible path between the two.
I can get my program to correctly read in the file, and set up everything, so I know that this issue isn't anything to do with the setup. To the best of my knowledge this part of my program works, as my program can get through it without throwing errors:
//returns the Vertex with the smallest distance from a list of Vertices
public static Vertex minDist(List<Vertex> Q){
int min = 2147483647;
Vertex closest = new Vertex("Closest");
for(Vertex v : Q){
if(v.distance < min){
closest = v;
}
}
return closest;
}
//used to relax (change the distance of a town)
public static void relax(Vertex u, Vertex v, int w){
if(v.distance > u.distance + w){
v.distance = u.distance + w;
v.predecessor = u;
}
}
public static void Dijkstra(Graph G, Vertex s){
Vertex u = new Vertex("not good");
List<Vertex> Q = V;
Vertex v = new Vertex("oh no");
//while Q is not empty:
while(!Q.isEmpty()){
//the vertex in Q that has the smallest distance (at first s with 0, then we relax and that changes things)
u = minDist(Q);
if(u.name.equals("Closest")){
//Q.remove(u);
return;
}
Q.remove(u);
S.add(u);
//for each edge e in u's adjacencyList:
for(Edge e : u.roadList){
if(e != null && !e.finish.name.equals(u.name) ){
v = e.finish;
relax(u,v,w(u,v)); //w(u,v) returns the distance between u and v
}
}
}
System.out.println("Q is null");
}
So I have that, and things look okay to me. I know it's a bit Frankenstein'ed together, but I got it to at least run without errors, because the ConcurrentModificationException gets thrown AFTER this method returns in my main method.
This is where my Dijkstra method gets called in my main method. I never reach the line in my code that prints "SHOULD REACH HERE" because the program throws the ConcurrentModificationException.
//if both towns exist and are unique, find the shortest route between them.
if(isTown(town1,V) && isTown(town2,V) && !town1.equals(town2)){
for(Vertex f : V){
if(f.name.equals(town2)){
destination = f;
}
}
System.out.println("Traveling...");
Graph G = new Graph(V,E);
for(Vertex s : V){
if(s.name.equals(town1)){
//////////////////DIJKSTRA STUFF GOES HERE///////////////////
initialize(G,s);
Dijkstra(G, s);
System.out.println("FINISHED DIJKSTRA");
//Print out the things in the vertex array S with their distances.
for(Vertex b : S){
System.out.println(b.name + " (" + b.distance + ")");
}
///////////////////////////////////////////////
}
}
System.out.println("SHOULD REACH HERE");
}
I have never seen a ConcurrentModificationException, my lab TA has never seen a ConcurrentModificationException, and even my professor has never seen a ConcurrentModificationException. Can I get some help with avoiding this? A person in a higher class said that he has only seen this happening when working with multiple threads, and I don't even know what that really means so I assume my program doesn't do that.
If I run the program with town1 = Grafton and town2 = Bathurst, then the output should be:
First town: Grafton
Second town: Bathurst
Bathurst (820)
Lithgow (763)
Windsor (672)
Singleton (511)
Muswellbrook (463)
Tamworth (306)
Bendemeer (264)
Uralla (218)
Armidale (195)
Ebor (106)
Grafton
But is instead
First town: Grafton
Second town: Bathurst
Grafton (0)
Glen Innes (158)
Inverell (225)
Warialda (286)
Coffs Harbour (86)
I/O error: java.util.ConcurrentModificationException
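You're getting this because you're removing from Q while iterating over it. List<Vertex> Q = V; does not copy the list: Q and V refer to the same object, so every Q.remove(u) inside Dijkstra also modifies V, and the for(Vertex s : V) loop in main fails on its next step, right after Dijkstra returns. Working on a copy, for example List<Vertex> Q = new ArrayList<>(V);, avoids that. See Iterating through a Collection, avoiding ConcurrentModificationException when removing in loop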
Dijkstra's algorithm has a step that says "choose the node with the shortest path". I realize that this step is unnecessary if we don't throw a node out of the graph/queue. As far as I know this works well, with no known disadvantage. Here is the code. Please tell me whether it fails, and if so, how. [EDIT => THIS CODE IS TESTED AND WORKS WELL, BUT THERE IS A CHANCE MY TEST CASES WERE NOT EXHAUSTIVE, THUS POSTING IT ON STACKOVERFLOW]
public Map<Integer, Integer> findShortest(int source) {
final Map<Integer, Integer> vertexMinDistance = new HashMap<Integer, Integer>();
final Queue<Integer> queue = new LinkedList<Integer>();
queue.add(source);
vertexMinDistance.put(source, 0);
while (!queue.isEmpty()) {
source = queue.poll();
List<Edge> adjlist = graph.getAdj(source);
int sourceDistance = vertexMinDistance.get(source);
for (Edge edge : adjlist) {
int adjVertex = edge.getVertex();
if (vertexMinDistance.containsKey(adjVertex)) {
int vertexDistance = vertexMinDistance.get(adjVertex);
if (vertexDistance > (sourceDistance + edge.getDistance())) {
//previous bug
//vertexMinDistance.put(adjVertex, vertexDistance);
vertexMinDistance.put(adjVertex, sourceDistance + edge.getDistance());
}
} else {
queue.add(adjVertex);
vertexMinDistance.put(adjVertex, edge.getDistance());
}
}
}
return vertexMinDistance;
}
Problem 1
I think there is a bug in the code where it says:
int vertexDistance = vertexMinDistance.get(adjVertex);
if (vertexDistance > (sourceDistance + edge.getDistance())) {
vertexMinDistance.put(adjVertex, vertexDistance);
}
because this has no effect (vertexMinDistance for adjVertex is set back to its original value).
Better would be something like:
int vertexDistance = vertexMinDistance.get(adjVertex);
int newDistance = sourceDistance + edge.getDistance();
if (vertexDistance > newDistance ) {
vertexMinDistance.put(adjVertex, newDistance );
}
Problem 2
You also need to add the adjVertex into the queue using something like:
int vertexDistance = vertexMinDistance.get(adjVertex);
int newDistance = sourceDistance + edge.getDistance();
if (vertexDistance > newDistance ) {
vertexMinDistance.put(adjVertex, newDistance );
queue.add(adjVertex);
}
If you don't do this then you will get an incorrect answer for graphs such as:
A->B (1)
A->C (10)
B->C (1)
B->D (10)
C->D (1)
The correct path is A->B->C->D of weight 3, but without the modification I believe your algorithm can choose a longer path, for example when C is taken from the queue before B, as it doesn't re-examine C once it has found a shorter path to it.
High level response
With these modifications I think this approach is basically sound, but you should be careful about the computational complexity.
Dijkstra will only need to go round the main loop V times (where V is the number of vertices in the graph), while your algorithm may need many more loops for certain graphs.
You will still get the correct answer, but it may take longer.
Although the worst-case complexity will be much worse than Dijkstra, I would be interested in how well it performs in practice. My guess is that it will work well for sparse almost tree-like graphs, but less well for dense graphs.
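To make the two fixes concrete, here is a small Python sketch of the queue-based relaxation with both modifications applied. It is an illustration only: the dict-of-lists graph and the function name find_shortest are assumptions, edge weights are assumed non-negative, and a newly discovered vertex is given the distance of its predecessor plus the edge weight, which is an assumption about the intended behaviour.

```python
from collections import deque

def find_shortest(graph, source):
    """Shortest distances from source using a plain FIFO queue."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v, weight in graph.get(u, []):
            new_dist = dist[u] + weight
            if v not in dist or new_dist < dist[v]:
                dist[v] = new_dist   # Problem 1: store the improved distance
                queue.append(v)      # Problem 2: re-examine v's outgoing edges
    return dist

# The example graph from above.
graph = {
    "A": [("B", 1), ("C", 10)],
    "B": [("C", 1), ("D", 10)],
    "C": [("D", 1)],
    "D": [],
}
print(find_shortest(graph, "A"))  # {'A': 0, 'B': 1, 'C': 2, 'D': 3}
```

A vertex can enter the queue several times here, which is where the extra loop iterations compared with Dijkstra come from.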
I'm trying to make a graph implementation for an assignment, which has Graph(GraphImp) objects and Node(NodeImp) objects.
Node objects contain a reference to their Graph, x & y co-ordinates and a name.
The Graph object contains a linked list of its Nodes.
The problem occurs when I try to add a Node into the middle of the List of Nodes (appending to the end works fine). The program runs out of heap space. I'm not sure why this is occurring though, since the complexity of inserting into a LinkedList should be O(1), and Java (I believe) uses pointers rather than the objects themselves. I've also tried an ArrayList.
Making the heap larger is not an option in this instance, and (as far as I understand) should not be the source of the problem.
Thanks in advance.
Here is the error:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.LinkedList.addBefore(LinkedList.java:795)
at java.util.LinkedList.add(LinkedList.java:361)
at pt.graph.GraphImp.addNode(GraphImp.java:79)
at pt.graph.NodeImp.<init>(NodeImp.java:25)
at pt.graph.Graphs.newNode(Solution.java:68)
Here is the Code:
class Graphs
{
static Node newNode(Graph g, double xpos, double ypos, String name) throws InvalidGraphException,InvalidLabelException
{
if(g==null || !(g instanceof GraphImp)){ //Checking validity of inputs
throw new InvalidGraphException();
}
if(name==null){
throw new InvalidLabelException();
}
NodeImp[] existNodes = ((GraphImp)g).getNodes(); //Get all Nodes already present in the Graph
for(int i=0;i<existNodes.length;i++){
if(existNodes[i].getXPos() == xpos && existNodes[i].getYPos() == ypos){ //If node already present at this position, throw InvalidLabelException()
throw new InvalidLabelException();
}
}
Node n = new NodeImp((GraphImp)g, xpos, ypos, name); //If all inputs are valid, create new node
return n;
}
}
class NodeImp extends Node //Node Class
{
private Object flags = null;
private GraphImp g = null;
private double xpos = 0.0;
private double ypos = 0.0;
private String name = "";
NodeImp(GraphImp g, double xpos, double ypos, String name){
this.g = g;
this.xpos = xpos;
this.ypos = ypos;
this.name = name;
g.addNode(this); // Add Node to the Graph
}
}
class GraphImp extends Graph
{
private LinkedList<NodeImp> nodes = new LinkedList<NodeImp>(); //LinkedList of all Nodes in the Graph
GraphImp(){
}
NodeImp[] getNodes(){ //Returns an array of all Nodes
NodeImp[] nArr = new NodeImp[nodes.size()];
return nodes.toArray(nArr);
}
int countNodes(){ //Returns number of Nodes
return nodes.size();
}
void addNode(NodeImp n){ //Add a Node to the LinkedList in order
boolean added = false;
for(int i = 0;i<nodes.size();i++){
if(n.compareTo(nodes.get(i))<=0 ){
nodes.add(i,n); //fails here
}
}
if(!added){
nodes.add(n);
}
return;
}
}
The problem is that you are not exiting your loop after inserting the new node in the middle of the list. Your code will try to insert the same node an infinite number of times, hence the OOM.
Try this:
for(int i = 0;i<nodes.size();i++){
if(n.compareTo(nodes.get(i))<=0 ){
nodes.add(i,n);
added = true;
break;
}
}
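As an aside, your insertion is pretty inefficient. Since you know the list is already sorted you could use a binary search to find the insertion point rather than an O(n) scan of the list. Your current implementation needs O(n^2) comparisons to insert n items, but it could be O(n log n). Note that this only pays off on a random-access list such as an ArrayList, because nodes.get(i) on a LinkedList has to walk the list.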
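The binary-search idea can be sketched in Python with the standard bisect module. The tuple representation of a node and the function name add_node are assumptions made for the example; in Java the search step would be done on a random-access list.

```python
import bisect

def add_node(nodes, node):
    """Insert node into the already-sorted list nodes and keep it sorted."""
    # bisect_left finds the first position whose element is >= node,
    # which matches the "compareTo(...) <= 0" test in the original loop.
    position = bisect.bisect_left(nodes, node)
    nodes.insert(position, node)  # a single insertion, then we are done
    return position

nodes = []
for node in [(2.0, 1.0, "c"), (0.0, 0.0, "a"), (1.0, 5.0, "b")]:
    add_node(nodes, node)
print(nodes)
```

The search takes O(log n) comparisons; the insertion itself still shifts elements, so the gain is in the number of comparisons rather than in memory moves.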
It's hard to diagnose the exact cause of your OOM without the whole program, but here's one observation:
getNodes() is pretty inefficient. You toArray the LinkedList simply to traverse it and look for a particular instance. Why not just use .contains() properly? No need to copy all of the elements then. Or just do what you were doing before but do it on the List instead of an array copy:
for(NodeImp n : existingNodes){
if(n.getXPos() == xpos && n.getYPos() == ypos){
throw new InvalidLabelException();
}
}
My guess is that the 'old' approach of adding to the end was likely to hit an OOM as well but for some heisenbug reason it hasn't manifested itself. Have you run with a profiler?