Best Way to Implement an Edge-Weighted Graph in Java

First of all, I'm dealing with graphs that have more than 1000 edges, and I traverse adjacency lists as well as vertices more than 100 times per second. Therefore, I really need an efficient implementation that fits my goals.
My vertices are integers and my edges are undirected, weighted.
I've seen this code.
However, it models the adjacency lists with Edge objects, which means I have to spend O(|adj|) time whenever I want the neighbors of a vertex, where |adj| is the number of its neighbors.
On the other hand, I'm considering modeling my adjacency lists as Map<Integer, Double>[] adj.
With this approach I would just read adj[v], where v is the vertex, and directly get the neighbors of the vertex to iterate over.
The other method requires something like:
public Set<Integer> adj(int v)
{
    Set<Integer> adjacents = new HashSet<>();
    for (Edge e : adj[v])
        adjacents.add(e.other(v));
    return adjacents;
}
My goals are:
I want to be able to sort a subset of the vertices by their connectivity (number of neighbors) at any time.
I also need to sort the neighbors of a vertex by the weights of the edges that connect it to them.
I want to do this without using so much space that it slows down the operations. Should I consider using an adjacency matrix?
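For concreteness, here is roughly what I have in mind for the map-based version (just a sketch; the sorting helpers are only illustrative):

import java.util.*;

// Sketch of the Map-based adjacency idea: adj[v] maps each neighbor of v
// to the weight of the edge connecting them.
public class WeightedGraph
{
    private final Map<Integer, Double>[] adj;

    @SuppressWarnings("unchecked")
    public WeightedGraph(int n)
    {
        adj = new HashMap[n];
        for (int v = 0; v < n; v++)
            adj[v] = new HashMap<>();
    }

    public void addEdge(int u, int v, double weight)
    {
        adj[u].put(v, weight);   // undirected: store the edge in both maps
        adj[v].put(u, weight);
    }

    public Set<Integer> adj(int v)   // O(1), no copying
    {
        return adj[v].keySet();
    }

    public int degree(int v)
    {
        return adj[v].size();
    }

    // sort a subset of vertices by connectivity (descending degree)
    public void sortByDegree(List<Integer> vertices)
    {
        vertices.sort(Comparator.comparingInt((Integer v) -> degree(v)).reversed());
    }

    // neighbors of v ordered by the weight of the edge to v
    public List<Integer> neighborsByWeight(int v)
    {
        List<Integer> neighbors = new ArrayList<>(adj[v].keySet());
        neighbors.sort(Comparator.comparingDouble(w -> adj[v].get(w)));
        return neighbors;
    }
}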

I've used the JGraphT library for a variety of my own graph representations. They have a weighted graph implementation here: http://jgrapht.org/javadoc/org/jgrapht/graph/SimpleWeightedGraph.html
That seems to cover a lot of what you're looking for. I've used it to represent graphs with up to around 2000 vertices, and it performed reasonably well for my needs, though I don't remember my access rate.
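Basic usage looks roughly like this (just a sketch; exact class and method names can vary between JGraphT versions):

import org.jgrapht.Graphs;
import org.jgrapht.graph.DefaultWeightedEdge;
import org.jgrapht.graph.SimpleWeightedGraph;

import java.util.List;

public class JGraphTExample
{
    public static void main(String[] args)
    {
        // undirected, weighted, no self-loops or parallel edges
        SimpleWeightedGraph<Integer, DefaultWeightedEdge> g =
                new SimpleWeightedGraph<>(DefaultWeightedEdge.class);

        g.addVertex(1);
        g.addVertex(2);
        g.addVertex(3);

        DefaultWeightedEdge e12 = g.addEdge(1, 2);
        g.setEdgeWeight(e12, 4.5);
        g.setEdgeWeight(g.addEdge(2, 3), 1.0);

        List<Integer> neighbors = Graphs.neighborListOf(g, 2);
        System.out.println(neighbors);            // [1, 3]
        System.out.println(g.degreeOf(2));        // 2
        System.out.println(g.getEdgeWeight(e12)); // 4.5
    }
}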

Related

Java Adjacency list implementation of graph with directed weighted edges

I am trying to implement a directed, weighted graph in Java using adjacency lists. It consists of an array whose size equals the number of vertices; each entry of the array is a LinkedList of the successors of that particular vertex.
I want to add a weight to each edge; I was thinking of doing this by attaching a weight label to each successor object in the LinkedList. Furthermore, I want to add other per-vertex variables for future use. If I do this, I would have to create one data structure for the vertices and a separate one for the adjacency lists. What would be an efficient design that combines both into a single data structure?
You should represent your graph as a HashMap where the keys are the vertex labels and the values are the Vertex objects.
HashMap<String,Vertex> graph = new HashMap<String,Vertex>();
Vertex is a class encapsulating the vertex attributes. It has a HashMap attribute holding the adjacent vertices together with the edge weights:
HashMap<Vertex,Integer> adjListWithWeights = new HashMap<Vertex,Integer>();
You can add more functionality and attributes to your graph through the Vertex class.
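Put together, a minimal sketch of that design could look like this (the names are only illustrative):

import java.util.HashMap;
import java.util.Map;

// Each Vertex stores its outgoing edges (successor -> weight)
// plus any extra per-vertex attributes you need later.
class Vertex {
    final String label;
    final Map<Vertex, Integer> adjListWithWeights = new HashMap<>();
    // add further per-vertex attributes here (color, distance, ...)

    Vertex(String label) {
        this.label = label;
    }

    void addEdge(Vertex successor, int weight) { // directed edge
        adjListWithWeights.put(successor, weight);
    }
}

class GraphDemo {
    public static void main(String[] args) {
        Map<String, Vertex> graph = new HashMap<>();
        graph.put("A", new Vertex("A"));
        graph.put("B", new Vertex("B"));
        graph.get("A").addEdge(graph.get("B"), 7); // A -> B with weight 7
    }
}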

Java Graph Implementation without data structures

I had an interview with Samsung. I'm a fresh graduate, and they asked me to program an application such that:
It's a graph of cities where each city can reach every other city (all vertices can reach each other), and there are some vertices that, if removed (deleted), affect the graph (it gets split, and some vertices can no longer reach each other).
The question is: determine the vertices whose removal disconnects the graph (i.e. prevents one or more vertices from reaching all the others),
*without using any data structures (no queue, no ArrayList); only plain arrays are allowed.
The input is like:
V = number of vertices.
E = number of edges
and the edges in the input file are given like this:
1 2 3 4 3 6 4 8
Here "1 2" is one relation, between vertices 1 and 2, and it is bidirectional, so it goes both ways.
How can this question be solved, and do you think it's hard for a fresh graduate?
You can define a class Vertex (or Node) and model an edge between two vertices as an association between two instances. This is a bit naive, but it works if you only need simple operations such as removal.
public class Vertex {
    int number;
    Vertex[] vertices; // you have to check yourself that the array is big enough
}
Now each vertex has an array of the vertices it is connected to. You get a bidirectional connection by adding a vertex to this vertices array and then adding the calling vertex to the array of the added vertex:
vertex.add(new Vertex(1)); // internally, it checks whether the array is big enough and then the add method does its work
...
public void add(Vertex v) {
    // add v into this.vertices (skip the rest if it is already there, to avoid endless mutual recursion)
    ...
    v.add(this);
}
To remove a vertex, the remove operation should delete the references to this vertex from the vertices arrays of the other Vertex objects.
A more conceptual solution would be to provide a second class Edge that has a start and an end vertex.
Maybe that helps you a bit.
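As for actually finding the vertices whose removal disconnects the graph while respecting the "arrays only" restriction, one brute-force sketch (an adjacency matrix plus an iterative DFS that uses a plain array as its stack; all names are illustrative) is to skip each vertex in turn and check whether the remaining vertices can still all reach each other:

// Sketch: find vertices whose removal disconnects the graph, using only arrays.
// adj is an adjacency matrix; vertices are numbered 0..n-1.
public class CutVertices {

    // counts how many vertices are reachable from 'start' when 'skipped' is ignored
    static int reachable(boolean[][] adj, int start, int skipped) {
        int n = adj.length;
        boolean[] visited = new boolean[n];
        int[] stack = new int[n];          // plain array used as a DFS stack
        int top = 0, count = 0;
        stack[top++] = start;
        visited[start] = true;
        while (top > 0) {
            int u = stack[--top];
            count++;
            for (int v = 0; v < n; v++) {
                if (adj[u][v] && v != skipped && !visited[v]) {
                    visited[v] = true;
                    stack[top++] = v;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 4;
        boolean[][] adj = new boolean[n][n];
        int[][] edges = {{0, 1}, {1, 2}, {2, 3}};   // the path 1-2, 2-3, 3-4 in 1-based input
        for (int[] e : edges) {
            adj[e[0]][e[1]] = true;
            adj[e[1]][e[0]] = true;                 // bidirectional
        }
        for (int skipped = 0; skipped < n; skipped++) {
            int start = (skipped == 0) ? 1 : 0;     // any vertex other than the skipped one
            if (reachable(adj, start, skipped) < n - 1) {
                System.out.println("Removing vertex " + (skipped + 1) + " disconnects the graph");
            }
        }
    }
}

There is also the classic linear-time articulation-point algorithm, but the brute-force check above is much easier to produce in an interview setting.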

Traverse a tree represented by its edges

My tree is represented by its edges and the root node. The edge list is undirected.
char[][] edges = new char[][]{
        new char[]{'D', 'B'},
        new char[]{'A', 'C'},
        new char[]{'B', 'A'}
};
char root = 'A';
The tree is
    A
   / \
  B   C
 /
D
How do I do depth first traversal on this tree? What is the time complexity?
I know time complexity of depth first traversal on linked nodes is O(n). But if the tree is represented by edges, I feel the time complexity is O(n^2). Am I wrong?
Example code is appreciated, although I know this looks like a homework assignment.
The general template behind DFS looks something like this:
function DFS(node) {
    if (!node.visited) {
        node.visited = true;
        for (each edge {node, v}) {
            DFS(v);
        }
    }
}
If you have your edges represented as a list of all the edges in the graph, then you could implement the for loop by iterating across all the edges in the graph and, every time you find one with the current node as its source, following the edge to its endpoint and running a DFS from there. If you do this, then you'll do O(m) work per node in the graph (here, m is the number of edges), so the runtime will be O(mn), since you'll do this at most once per node in the graph. In a tree, the number of edges is always O(n), so for a tree the runtime is O(n^2).
That said, if you have a tree and there are only n edges, you can speed this up in a bunch of ways. First, you could consider doing an O(n log n) preprocessing step to sort the array of edges. Then, you can find all the edges leaving a given node by doing a binary search to find the first edge leaving the node, then iterating across the edges starting there to find just the edges leaving the node. This improves the runtime quite a bit: you do O(log n) work per node for the binary search, and then every edge gets visited only once. This means that the runtime is O(n log n). Since you've mentioned that the edges are undirected, you'll actually need to create two different copies of the edges array - one that's the original one, and one with the edges reversed - and should sort each one independently. The fact that DFS marks visited nodes along the way means that you don't need to do any extra bookkeeping here to figure out which direction you should go at each step, and this doesn't change the overall time complexity, though it does increase the space usage.
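A compact sketch of that sort-and-binary-search idea for the edge array above (illustrative names; both directions of each undirected edge are added before sorting):

import java.util.Arrays;
import java.util.Comparator;

// Sketch of the O(n log n) approach: sort directed copies of the edges by
// their source, then binary-search for the block of edges leaving a node.
public class SortedEdgeDFS {
    static char[][] directed;                    // both directions of every undirected edge
    static boolean[] visited = new boolean[256]; // indexed by node character

    static void dfs(char node) {
        if (visited[node]) return;
        visited[node] = true;
        System.out.println(node);
        int i = firstEdgeFrom(node);
        while (i < directed.length && directed[i][0] == node) {
            dfs(directed[i][1]);
            i++;
        }
    }

    // binary search for the index of the first edge whose source is 'node'
    static int firstEdgeFrom(char node) {
        int lo = 0, hi = directed.length;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (directed[mid][0] < node) lo = mid + 1; else hi = mid;
        }
        return lo;
    }

    public static void main(String[] args) {
        char[][] edges = {{'D', 'B'}, {'A', 'C'}, {'B', 'A'}};
        directed = new char[edges.length * 2][];
        for (int i = 0; i < edges.length; i++) {
            directed[2 * i] = edges[i];
            directed[2 * i + 1] = new char[]{edges[i][1], edges[i][0]};
        }
        Arrays.sort(directed, Comparator.comparingInt((char[] e) -> e[0]));
        dfs('A');   // prints A, C, B, D for this input
    }
}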
Alternatively, you could use a hashing-based solution. Before doing the DFS, iterate across the edges and convert them into a hash table whose keys are the nodes and whose values are lists of the edges leaving that node. This will take expected time O(n). You can then implement the "for each edge" step quite efficiently by just doing a hash table lookup to find the edges in question. This reduces the time to (expected) O(n), though the space usage goes up to O(n) as well. Since your edges are undirected, as you populate the table, just be sure to insert the edge in each direction.
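And a minimal sketch of the hash-table variant for the same tree (illustrative names):

import java.util.*;

// Sketch: build an adjacency list from the undirected edge array, then DFS.
public class EdgeListDFS {
    public static void main(String[] args) {
        char[][] edges = {{'D', 'B'}, {'A', 'C'}, {'B', 'A'}};
        char root = 'A';

        // O(n) preprocessing: node -> neighbors, each edge inserted in both directions
        Map<Character, List<Character>> adj = new HashMap<>();
        for (char[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new ArrayList<>()).add(e[0]);
        }

        dfs(root, adj, new HashSet<>());   // prints A, C, B, D for this input
    }

    static void dfs(char node, Map<Character, List<Character>> adj, Set<Character> visited) {
        if (!visited.add(node)) return;    // already seen
        System.out.println(node);
        for (char next : adj.getOrDefault(node, Collections.emptyList())) {
            dfs(next, adj, visited);
        }
    }
}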

How to generate random graphs?

I want to be able to generate random, undirected, and connected graphs in Java. In addition, I want to be able to control the maximum number of vertices in the graph. I am not sure what would be the best way to approach this problem, but here are a few I can think of:
(1) Generate a number between 0 and n and let that be the number of vertices. Then, somehow randomly link vertices together (maybe generate a random number per vertex and let that be the number of edges coming out of said vertex). Traverse the graph starting from an arbitrary vertex (say with Breadth-First-Search) and let our random graph G be all the visited nodes (this way, we make sure that G is connected).
(2) Generate a random square matrix (of 0's and 1's) with side length between 0 and n (somehow). This would be the adjacency matrix for our graph (the diagonal of the matrix should then either be all 1's or all 0's). Make a data structure from the graph and traverse the graph from any node to get a connected list of nodes and call that the graph G.
Any other way to generate a sufficiently random graph is welcome. Note: I do not need a purely random graph, i.e., the graph you generate doesn't have to have any special mathematical properties (like uniformity of some sort). I simply need lots and lots of graphs for testing purposes of something else.
Here is the Java Node class I am using:
public class Node<T> {
    T data;
    ArrayList<Node> children = new ArrayList<Node>();
    ...
}
Here is the Graph class I am using (you can tell why I am only interested in connected graphs at the moment):
public class Graph {
    Node mainNode;
    ArrayList<Node> V = new ArrayList<Node>();

    public Graph(Node node) {
        mainNode = node;
    }
    ...
}
As an example, this is how I make graphs for testing purposes right now:
// The following makes a "kite" graph G (with "a" as the main node).
/* a-b
   |/|
   c-d
*/
Node<String> a = new Node<>("a");
Node<String> b = new Node<>("b");
Node<String> c = new Node<>("c");
Node<String> d = new Node<>("d");
a.addChild(b);
a.addChild(c);
b.addChild(a);
b.addChild(c);
b.addChild(d);
c.addChild(a);
c.addChild(b);
c.addChild(d);
d.addChild(c);
d.addChild(b);
Graph G1 = new Graph(a);
Whatever you want to do with your graph, I guess its density is also an important parameter. Otherwise, you'd just generate a set of small cliques (complete graphs) using random sizes, and then connect them randomly.
If I'm correct, I'd advise you to use the Erdős-Rényi model: it's simple, not far from what you originally proposed, and allows you to control the graph density (so, basically: the number of links).
Here's a short description of this model:
Define a probability value p (the higher p, the denser the graph: 0 = no links, 1 = fully connected graph);
Create your n nodes (as objects, as an adjacency matrix, or anything that suits you);
Each pair of nodes is connected with an (independent) probability p. So you have to decide on the existence of a link between them using this probability p. For example, you could randomly draw a value q between 0 and 1 and create the link iff q < p. Then do the same thing for each possible pair of nodes in the graph.
With this model, if your p is large enough, then it's highly probable your graph is connected (cf. the Wikipedia reference for details). In any case, if you have several components, you can also force its connectedness by creating links between nodes of distinct components. First, you have to identify each component by performing breadth-first searches (one for each component). Then, you select pairs of nodes in two distinct components, create a link between them and consider both components as merged. You repeat this process until you've got a single component remaining.
The only tricky part is ensuring that the final graph is connected. To do that, you can use a disjoint-set data structure. Keep track of the number of components, initially n. Repeatedly pick pairs of random vertices u and v, adding the edge (u, v) to the graph and to the disjoint-set structure, and decrementing the component count when that structure tells you u and v belonged to different components. Stop when the component count reaches 1. (Note that using an adjacency matrix simplifies the case where the edge (u, v) is already present in the graph: adj[u][v] is simply set to 1 a second time, which, as desired, has no effect.)
If you find this creates graphs that are too dense (or too sparse), then you can use another random number to add edges only k% of the time when the endpoints are already part of the same component (or when they are part of different components), for some k.
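A minimal sketch of this scheme (Erdős-Rényi edges on an adjacency matrix, then extra random edges tracked with a small disjoint-set structure until a single component remains; the names are illustrative):

import java.util.Random;

// Sketch: G(n, p) random graph on an adjacency matrix, then extra random
// edges (tracked with a disjoint-set forest) until the graph is connected.
public class RandomConnectedGraph {
    final boolean[][] adj;
    final int[] parent;      // disjoint-set forest
    int components;

    RandomConnectedGraph(int n, double p, Random rnd) {
        adj = new boolean[n][n];
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        components = n;

        // Erdős-Rényi: each pair (u, v) gets an edge with independent probability p
        for (int u = 0; u < n; u++)
            for (int v = u + 1; v < n; v++)
                if (rnd.nextDouble() < p) addEdge(u, v);

        // force connectedness: keep adding random edges until one component is left
        while (components > 1) {
            addEdge(rnd.nextInt(n), rnd.nextInt(n));
        }
    }

    void addEdge(int u, int v) {
        if (u == v) return;
        adj[u][v] = adj[v][u] = true;   // re-setting an existing edge has no effect
        int ru = find(u), rv = find(v);
        if (ru != rv) {                 // u and v were in different components: merge them
            parent[ru] = rv;
            components--;
        }
    }

    int find(int x) {                   // find with path halving
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    public static void main(String[] args) {
        RandomConnectedGraph g = new RandomConnectedGraph(10, 0.2, new Random());
        System.out.println("components left: " + g.components);   // always 1
    }
}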
The following paper proposes an algorithm that uniformly samples connected random graphs with a prescribed degree sequence, together with an efficient implementation. It is available in several libraries, such as Networkit or igraph.
Fast generation of random connected graphs with prescribed degrees.
Fabien Viger, Matthieu Latapy
Be careful when you make simulations on random graphs: if they are not sampled uniformly, then they may have hidden properties that impact simulations; alternatively, uniformly sampled graphs may be very different from the ones your code will meet in practice...

k nearest neighbors graph implementation in Java

I am implementing a Flame clustering algorithm as a way of learning a bit more about graphs and graph traversal, and one of the first steps is constructing a k-nearest-neighbors graph. I'm wondering what the fastest way would be to run through a list of nodes and connect each one only to, say, its nearest five neighbors. My thought was to start at a node, iterate through the list of other nodes, and keep the closest ones in an array, making sure that everything past the top n is discarded. Now, I could do this by just sorting a list and keeping the top n entries, but I would much rather keep fewer things in memory, so I was wondering whether there is a way to maintain just the final array and update it as I iterate, or whether there is a more efficient way of generating a k-nearest-neighbors graph.
Also, please note, this is NOT a duplicate of K-Nearest Neighbour Implementation in Java. KNNG is distinct from KNN.
Place the first n nodes, sorted, in a List. Then iterate through the rest of the nodes; if one fits in the current list (i.e. it is a top-n node), place it at the corresponding position in the list and discard the list's last element. If it doesn't fit in the top-n list, discard it.
for each neighborNode
    double dist = distanceMetric(neighborNode, currentNode);
    for (int i = 0; i < topNList.size(); i++) {
        if (dist < topNList.get(i).distance) {      // closer than the i-th entry (list kept sorted by ascending distance)
            neighborNode.setDistance(dist);
            topNList.add(i, neighborNode);          // insert, keeping the list sorted
            topNList.remove(topNList.size() - 1);   // drop what is now the (n+1)-th element
            break;
        }
    }
I think the most efficient way would be to use a bounded priority queue, like
https://github.com/tdebatty/java-graphs#bounded-priority-queue
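If you'd rather not pull in a dependency, the same "keep only the k best" behaviour can be sketched with java.util.PriorityQueue used as a max-heap on distance (raw distances are queued here just to keep the sketch short; in practice you would store the neighbor nodes together with their distances):

import java.util.Comparator;
import java.util.PriorityQueue;

// Sketch: keep only the k smallest distances seen so far.
// Max-heap on distance, so the worst of the current k is always on top.
public class KNearest {
    public static void main(String[] args) {
        int k = 3;
        double[] distances = {4.2, 1.0, 3.3, 0.5, 2.8, 9.1};

        PriorityQueue<Double> nearest = new PriorityQueue<>(Comparator.reverseOrder());

        for (double d : distances) {
            if (nearest.size() < k) {
                nearest.offer(d);
            } else if (d < nearest.peek()) {   // closer than the current worst
                nearest.poll();                // drop the worst
                nearest.offer(d);
            }
        }
        System.out.println(nearest);           // contains 0.5, 1.0 and 2.8 (in heap order)
    }
}

Each update is O(log k), so a full pass over all pairs of nodes stays at O(n^2 log k), with the n^2 distance computations dominating in practice.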
