Is there any simple way of comparing two graphs in Neo4j (Java)?

My back-end generates a graph that contains a node (let's call it Node 1). The graph looks as follows:
Level 1: (TOPNODE)
/ \
Level 2: 1 2
/ \
Level 3: 3 4
/ \
Level 4: 5 6
The top node contains the date the graph was generated.
After that, all the even levels (level 2 and level 4, which contain nodes 1, 2, 5 and 6) contain a unique name and a value, e.g. a phone number.
All of the odd levels (level 3, i.e. nodes 3 and 4) contain their parent's name and their children's information.
In my service I can edit parts of the graph. For example, I can change the value in a node (not the name), or I can delete nodes. But I can only access the edited information by regenerating that part of the graph as a subgraph.
So my question is: can I load the full graph into Java, compare only that subgraph with the new subgraph that was just generated, and then create a new version of the old graph with the changes applied?
What I have tried is:
Pulling the whole graph into Java as JSON and comparing that to the smaller graph. This works, but I don't know whether there is a more efficient way, or whether there is any way to get the nodes in Java as actual nodes instead of JSON. To get the graph as JSON I did the following:
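// Fetch every relationship together with its start node (n) and end node (m).
Session session = driver.session();
String message = "START n=node(*) MATCH (n)-[r]->(m) RETURN n,r,m;";
StatementResult result = session.run(message);
Gson gson = new Gson(); // one instance can be reused for every record
while ( result.hasNext() ) {
    Record record = result.next();
    // Serialize the whole record (n, r, m) once and reuse the string.
    String recordJson = gson.toJson(record.asMap());
    System.out.println(recordJson);
    String m = gson.toJson(record.asMap().get("n"));
    JSONObject json = new JSONObject(recordJson);
    convert(json,m); // convert(...) is defined elsewhere (not shown)
}
session.close();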
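Since the named nodes carry a unique name and a value, the comparison step itself can be reduced to diffing two name-to-value maps. The following is only a minimal sketch under that assumption: the class `SubgraphDiff`, its field names, and the sample names and values are hypothetical, and it presumes the old map has already been restricted to the same part of the graph that was regenerated (otherwise everything outside the subgraph would show up as removed).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares two name -> value snapshots of the same subgraph. */
final class SubgraphDiff {
    final Set<String> removed = new TreeSet<>();               // names only in the old snapshot
    final Set<String> added = new TreeSet<>();                 // names only in the new snapshot
    final Map<String, String> changed = new LinkedHashMap<>(); // name -> new value

    static SubgraphDiff between(Map<String, String> oldNodes, Map<String, String> newNodes) {
        SubgraphDiff diff = new SubgraphDiff();
        for (Map.Entry<String, String> old : oldNodes.entrySet()) {
            String name = old.getKey();
            if (!newNodes.containsKey(name)) {
                diff.removed.add(name);                        // node was deleted
            } else if (!Objects.equals(old.getValue(), newNodes.get(name))) {
                diff.changed.put(name, newNodes.get(name));    // value was edited
            }
        }
        for (String name : newNodes.keySet()) {
            if (!oldNodes.containsKey(name)) {
                diff.added.add(name);                          // node is new
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the old and regenerated subgraph.
        Map<String, String> oldNodes = new LinkedHashMap<>();
        oldNodes.put("node5", "555-0100");
        oldNodes.put("node6", "555-0101");

        Map<String, String> newNodes = new LinkedHashMap<>();
        newNodes.put("node5", "555-0199"); // value edited; node6 was deleted

        SubgraphDiff diff = between(oldNodes, newNodes);
        System.out.println("removed=" + diff.removed
                + " added=" + diff.added
                + " changed=" + diff.changed);
    }
}
```

The resulting three collections map directly onto the edits the service allows: `changed` entries would become property updates and `removed` entries would become deletions when writing the new version of the graph.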

Related

How to find graph schema in Gremlin?

I want to find all node and edge properties in a graph. How can I list the node (or edge) properties that exist in the graph?
For example, if nodes have 3 non-reserved properties such as NAME, education and gender, I want a method like
g.V().schema().toList();
// result: [ID, LABEL, NAME, GENDER, EDUCATION]
Gremlin itself has no notion of schema. This was a deliberate design choice, as the capabilities and behavior around schema APIs are quite different from one graph system implementation to the next, and forming an appropriate abstraction in Apache TinkerPop for that is quite difficult. In this way it is quite akin to TinkerPop 2.x's attempt to build a general index API, which ended up being too generic to be useful to anyone; had more complexity been added, it would have been more than what was required for most cases. In the end, like indexing APIs, ideas for generalizing schema were left out of TinkerPop 3.x.
If you use a graph that allows for schema definition like JanusGraph or DSE Graph you should simply use the underlying Schema API of that graph system to get all of your schema values. If you aren't using that type of graph then you will need to do something along the lines of what has been offered in the other answers thus far and iterate through all of the vertices (or edges) and get the unique property keys. Here's my version:
gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().properties().key().dedup()
==>name
==>age
==>lang
The problem here is that to do this type of traversal, you will require a full graph scan, which will be problematic if you have a large graph. In those cases you will need to use an OLAP-based traversal with Spark or the like.
If all nodes have the same properties, we can find the properties of the first vertex and generalize them to all nodes:
TinkerGraph tg = TinkerGraph.open() ;
tg.io(IoCore.graphml()).readGraph("src\\main\\resources\\air-routes.graphml");
GraphTraversalSource g = tg.traversal();
g.V().propertyMap().select(Column.keys).next();
// result = {LinkedHashSet#1831} size = 12
// 0 = "country"
// 1 = "code"
// 2 = "longest"
// 3 = "city"
// 4 = "elev"
// 5 = "icao"
// 6 = "lon"
// 7 = "type"
// 8 = "region"
// 9 = "runways"
// 10 = "lat"
// 11 = "desc"
But if there is no guarantee that each node has the same set of properties, I don't see any solution other than retrieving all properties as a list of maps and finding the distinct property keys with Java collection methods (outside Gremlin).
The last two statements in the JUnit test case might be closer to what you want.
See also:
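The "outside Gremlin" step can be sketched in plain Java as follows. This is a minimal illustration, not part of the answer above: the class `PropertyKeys` and its method are hypothetical names, and the hand-built maps only stand in for per-vertex property maps that would be fetched from the graph.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: collects the distinct keys across many property maps. */
final class PropertyKeys {

    static Set<String> distinctKeys(List<? extends Map<String, ?>> propertyMaps) {
        // LinkedHashSet drops duplicates while keeping first-seen order.
        Set<String> keys = new LinkedHashSet<>();
        for (Map<String, ?> properties : propertyMaps) {
            keys.addAll(properties.keySet());
        }
        return keys;
    }

    public static void main(String[] args) {
        // Stand-ins for two vertices with different property sets.
        Map<String, Object> person = new LinkedHashMap<>();
        person.put("name", "marko");
        person.put("age", 29);

        Map<String, Object> software = new LinkedHashMap<>();
        software.put("name", "lop");
        software.put("lang", "java");

        List<Map<String, Object>> maps = new ArrayList<>();
        maps.add(person);
        maps.add(software);

        System.out.println(distinctKeys(maps)); // prints [name, age, lang]
    }
}
```

Like the dedup traversal shown earlier, this still has to look at every vertex, so it does not avoid the full scan; it only moves the deduplication to the client.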
https://github.com/BITPlan/com.bitplan.simplegraph/blob/master/simplegraph-core/src/test/java/com/bitplan/simplegraph/core/TestTinkerPop3.java
graph.traversal().V().next().properties()
.forEachRemaining(prop -> System.out.println(String.format("%s=%s",
prop.label(), prop.value().getClass().getSimpleName())));
graph.traversal().V().next().edges(Direction.OUT)
.forEachRemaining(edge -> System.out.println(
String.format("%s->%s", edge.label(), edge.outVertex().label())));
producing:
name=String
age=Integer
created->person
knows->person
JUnit Test Case
@Test
public void testSchema() {
Graph graph = TinkerFactory.createModern();
graph.traversal().V().next().properties()
.forEachRemaining(prop -> System.out.println(String.format("%s=%s",
prop.label(), prop.value().getClass().getSimpleName())));
graph.traversal().V().next().edges(Direction.OUT)
.forEachRemaining(edge -> System.out.println(
String.format("%s->%s", edge.label(), edge.outVertex().label())));
}

neo4j: Replace multiple nodes with same property by one node

Let's say I have a property "name" on nodes in Neo4j. Now I want to enforce that there is at most one node for a given name by merging all nodes with the same name. More precisely: if there are three nodes where name is "dog", I want them to be replaced by just one node with name "dog", which:
Gathers all properties from all the original three nodes.
Has all arcs that were attached to the original three nodes.
The background for this is the following: in my graph there are often several nodes with the same name which should be considered "equal" (although some have richer property information than others). Putting a.name = b.name in a WHERE clause is extremely slow.
EDIT: I forgot to mention that my Neo4j is of version 2.3.7 currently (I cannot update it).
SECOND EDIT: There is a known list of labels for the nodes and for the possible arcs. The type of the nodes is known.
THIRD EDIT: I want to call above "node collapse" procedure from Java, so a mixture of Cypher queries and procedural code would also be a useful solution.
I have made a testcase with following schema:
CREATE (n1:TestX {name:'A', val1:1})
CREATE (n2:TestX {name:'B', val2:2})
CREATE (n3:TestX {name:'B', val3:3})
CREATE (n4:TestX {name:'B', val4:4})
CREATE (n5:TestX {name:'C', val5:5})
MATCH (n6:TestX {name:'A', val1:1}) MATCH (m7:TestX {name:'B', val2:2}) CREATE (n6)-[:TEST]->(m7)
MATCH (n8:TestX {name:'C', val5:5}) MATCH (m10:TestX {name:'B', val3:3}) CREATE (n8)<-[:TEST]-(m10)
This results in the following output:
Here the B nodes are really the same node. And here is my solution:
//copy all properties
MATCH (n:TestX), (m:TestX) WHERE n.name = m.name AND ID(n)<ID(m) WITH n, m SET n += m;
//copy all outgoing relations
MATCH (n:TestX), (m:TestX)-[r:TEST]->(endnode) WHERE n.name = m.name AND ID(n)<ID(m) WITH n, collect(endnode) as endnodes
FOREACH (x in endnodes | CREATE (n)-[:TEST]->(x));
//copy all incoming relations
MATCH (n:TestX), (m:TestX)<-[r:TEST]-(endnode) WHERE n.name = m.name AND ID(n)<ID(m) WITH n, collect(endnode) as endnodes
FOREACH (x in endnodes | CREATE (n)<-[:TEST]-(x));
//delete duplicates
MATCH (n:TestX), (m:TestX) WHERE n.name = m.name AND ID(n)<ID(m) detach delete m;
The resulting output looks like this:
Note that you have to know the types of the various relationships.
All the properties are copied from the nodes with "higher" IDs to the nodes with the "lower" IDs.
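Because the question also allows a mixture of Cypher and procedural code, the property-merging rule can be illustrated in plain Java. This is only a sketch of the rule on in-memory data, not a replacement for the queries above: `NodeCollapse`, `NodeData` and `props` are hypothetical names, and the node IDs in `main` are made up to mirror the test case.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: merges the properties of nodes that share a name. */
final class NodeCollapse {

    /** In-memory stand-in for a node fetched from the database. */
    static final class NodeData {
        final long id;
        final Map<String, Object> props;

        NodeData(long id, Map<String, Object> props) {
            this.id = id;
            this.props = props;
        }
    }

    /** Builds a property map from alternating key/value arguments. */
    static Map<String, Object> props(Object... keyValues) {
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put((String) keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    /**
     * Returns one merged property map per name. Nodes are visited in ID order,
     * so on a key clash the value from the higher ID overwrites the earlier one.
     */
    static Map<String, Map<String, Object>> collapseByName(List<NodeData> nodes) {
        List<NodeData> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingLong((NodeData n) -> n.id));

        Map<String, Map<String, Object>> merged = new LinkedHashMap<>();
        for (NodeData node : sorted) {
            String name = (String) node.props.get("name");
            merged.computeIfAbsent(name, k -> new LinkedHashMap<>()).putAll(node.props);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<NodeData> nodes = new ArrayList<>();
        nodes.add(new NodeData(0, props("name", "A", "val1", 1)));
        nodes.add(new NodeData(1, props("name", "B", "val2", 2)));
        nodes.add(new NodeData(2, props("name", "B", "val3", 3)));
        nodes.add(new NodeData(3, props("name", "B", "val4", 4)));
        nodes.add(new NodeData(4, props("name", "C", "val5", 5)));

        // The single "B" entry ends up with val2, val3 and val4.
        System.out.println(collapseByName(nodes));
    }
}
```

In a real run the merged map would still have to be written back to the surviving node and the relationships re-attached, which is what the Cypher statements above do.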
I think you need something like a synonym of nodes.
1) Go through all nodes and create a node synonym:
MATCH (N)
WITH N
MERGE (S:Synonym {name: N.name})
MERGE (S)<-[:hasSynonym]-(N)
RETURN count(S);
2) Remove the synonyms with only one node:
MATCH (S:Synonym)
WITH S
MATCH (S)<-[:hasSynonym]-(N)
WITH S, count(N) as count
WITH S WHERE count = 1
DETACH DELETE S;
3) Transport properties and relationships for the remaining synonyms (with apoc):
MATCH (S:Synonym)
WITH S
MATCH (S)<-[:hasSynonym]-(N)
WITH [S] + collect(N) as nodesForMerge
CALL apoc.refactor.mergeNodes( nodesForMerge ) YIELD node
RETURN count(node);
4) Remove Synonym label:
MATCH (S:Synonym)<-[:hasSynonym]-(N)
CALL apoc.create.removeLabels( [S], ['Synonym'] ) YIELD node
RETURN count(node);

Neo4j-ogm query path

In my Java code I have a query to match the shortest path from root to a leaf in my tree.
String query = "Match path = (p:Root)-[*1..100]-(m:Leaf) "
+ "WITH p,m,path ORDER BY length(path) LIMIT 1 RETURN path";
However, when I try to query this as follows
SessionFactory sessionFactory = new SessionFactory("incyan.Data.Neo4j.Models");
Session session = sessionFactory.openSession("http://localhost:7474");
Object o = session.query(query, new HashMap<String,Object>());
o contains an ArrayList of LinkedHashMaps instead of mapped objects.
I cannot even determine the labels of the path elements and the start and end nodes of the relations.
What am I doing wrong?
The current neo4j-ogm release does not map query results to domain entities. Returning a path will only give you the properties of the nodes and relationships in that path (in order, so you can infer each relationship's start and end). IDs aren't returned by the Neo4j REST API currently used by the OGM for this particular operation, which is why they are missing. You may instead have to extract the IDs and return them as part of your query.
Mapping individual query result columns to entities will be available in a Neo4j-OGM 2.0 release.
I'm not sure about the Java bit, but if you use the shortestPath function (keyword?) your query should be more efficient:
MATCH path=shortestPath((p:Root)-[*1..100]-(m:Leaf))
RETURN path
Also, I don't know what your data model is like, but I would expect the labels on the nodes of your tree (I'm assuming it's a tree) to all be the same. You can tell if a node is a root or a leaf using Cypher:
MATCH path=shortestPath((root:Element)-[*1..100]-(leaf:Element))
WHERE NOT((root)-[:HAS_PARENT]->()) AND NOT(()-[:HAS_PARENT]->(leaf))
RETURN path

java.lang.IllegalStateException: No primary SDN label exists in Spring Data Neo4j when using imported data

Trying to retrieve nodes in SDN, imported using the Neo4j CSV Batch Importer, gives the java.lang.IllegalStateException:
java.lang.IllegalStateException: No primary SDN label exists .. (i.e one with starting with _)
This is after having added a new label through a cypher query:
match (n:Movie) set n:_Movie;
Inspecting nodes created through SDN shows they have the same labels. The result when running
match (n) where id(n)={nodeId} return labels(n) as labels;
as found in LabelBasedStrategyCypher.java is the same for both:
["Movie","_Movie"]
Saving and retrieving nodes through SDN works without any issues. I must be missing something, as I got the impression that setting the appropriate labels should be enough.
I'm running SDN 3.0.2.RELEASE and neo4j 2.0.3 on arch linux x64 with Oracle Java 1.8.0_05
Edit: My CSV file looks like this. The appId is only used to ensure the node is the same one that we stored earlier, as internal nodeIds are recycled and new nodes could get old nodeIds after the old ones are deleted. The nodeId is used for actual lookups, for connecting relationships and so on.
appId:int   l:label   title:string:movies           year:int
1           Movie     Dinosaur Planet               2003
2           Movie     Isle of Man TT 2004 Review    2004
Edit2:
I have made more tests, checking the source of LabelBasedNodeTypeRepresentationStrategy to see what is going wrong. Running the readAliasFrom() method that the Exception is thrown from does not return any errors:
String query = "start n=node({id}) return n";
Node node = null;
for(Node n : neo4jTemplate.query(query,params).to(Node.class)){
node = n;
}
// when running the readAliasFrom method manually the label is returned correctly
LabelBasedNodeTypeRepresentationStrategy strategy = new
LabelBasedNodeTypeRepresentationStrategy(neo4jTemplate.getGraphDatabase());
System.out.println("strategy returns: " +(String)strategy.readAliasFrom(node));
// trying to convert the node to a movie object, however throws the Illegal State Exception
Movie movie = null;
movie = neo4jTemplate.convert(node,Movie.class);
So, the _Movie label exists, running the readAliasFrom() method manually doesn't throw Exceptions but trying to convert the node into a Movie does. Nodes created from SDN do not have these issues, even if they look identical to me.

How should I get all the existing relationships in a graph in neo4j by java?

How can I get all the existing relationships between every pair of nodes in a Neo4j graph from Java?
I want the results which this cypher query returns:
start r=rel(*) return r
so that later I can change or delete some of them based on my conditions, or get their start or end nodes.
This is what I have done so far:
Iterable<Relationship> rels=GlobalGraphOperations.at(db).getAllRelationships();
for (Relationship rel: rels )
{}
But I get an error on this line: for (Relationship rel: rels )
The error says that rels is unknown, and the suggested fix is to create a class for it.
I used this for indexing and it worked:
GlobalGraphOperations ggo = GlobalGraphOperations.at(db);
for (Relationship r : ggo.getAllRelationships()) {
//indexing code
}
Try to get the relationships of a single node and check the result, e.g.
Iterable<BatchRelationship> _itlRelationship= _neo.getRelationships(_empNodeId);
Iterator<BatchRelationship> _itRelationship= _itlRelationship.iterator();
while (_itRelationship.hasNext()) {}
