JanusGraph Remote Traversal with Java

I am building a Java application that needs to connect to a remote JanusGraph server and create graphs on the fly.
I have installed/configured a single node JanusGraph Server with a Berkeley database backend and ConfigurationManagementGraph support so that I can create/manage multiple graphs on the server.
In a Gremlin console I can connect to the remote server, create graphs, create vertices, etc. Example:
gremlin> :remote connect tinkerpop.server conf/remote.yaml session
gremlin> :remote console
gremlin> map = new HashMap<String, Object>();
gremlin> map.put("storage.backend", "berkeleyje");
gremlin> map.put("storage.directory", "db/test");
gremlin> ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));
gremlin> ConfiguredGraphFactory.create("test");
gremlin> graph = ConfiguredGraphFactory.open("test");
gremlin> g = graph.traversal();
gremlin> g.addV("person").property("name", "peter");
gremlin> g.tx().commit();
gremlin> graph.vertices().size();
==>1
gremlin> g.V();
==>v[4288]
gremlin> g.V().count();
==>1
gremlin> g.close();
So far so good. On the Java side, I can connect to the remote server and issue commands via the Client.submit() method. In the following example, I connect to the remote server and create a new graph called "test2":
Cluster cluster = Cluster.build()
.addContactPoint(host)
.port(port)
.serializer(Serializers.GRYO_V3D0)
.create();
String name = "test2";
String sessionId = UUID.randomUUID().toString();
Client client = cluster.connect(sessionId);
client.submit("map = new HashMap<String, Object>();");
client.submit("map.put(\"storage.backend\", \"berkeleyje\");");
client.submit("map.put(\"storage.directory\", \"db/" + name + "\");");
client.submit("ConfiguredGraphFactory.createTemplateConfiguration(new MapConfiguration(map));");
client.submit("ConfiguredGraphFactory.create(\"" + name + "\");");
I can confirm that the graph was created, and I can also list the existing graphs programmatically using the client.submit() method:
ResultSet results = client.submit("ConfiguredGraphFactory.getGraphNames()");
Iterator<Result> it = results.iterator();
while (it.hasNext()){
Result result = it.next();
String graphName = result.getString();
System.out.println(graphName);
}
Next I want to connect to a graph and traverse the nodes programmatically (in Java). However, I can't seem to figure out how to do this. From what I've read, it should be something as simple as this:
DriverRemoteConnection conn = DriverRemoteConnection.using(client, name); //"_traversal"
GraphTraversalSource g = AnonymousTraversalSource.traversal().withRemote(conn);
These commands don't raise any errors but the GraphTraversalSource appears to be empty:
System.out.println(g.getGraph()); //always returns emptygraph[empty]
System.out.println(g.V()); //appears to be empty [GraphStep(vertex,[])]
Iterator<Vertex> it = g.getGraph().vertices(); //empty
Any suggestions on how to get a GraphTraversalSource for a remote JanusGraph server in Java? I suspect that my issue has something to do with ConfigurationManagementGraph, but I can't put my finger on it. Again, client.submit() works. It would be cool if I could do something like this:
GraphTraversalSource g = (GraphTraversalSource) client.submit("ConfiguredGraphFactory.open(\"" + name + "\");").iterator().next();
...but of course, that doesn't work
UPDATE
Looking at the code, it appears that the graph name (remoteTraversalSourceName) passed to the DriverRemoteConnection is being ignored.
Starting with the DriverRemoteConnection:
DriverRemoteConnection conn = DriverRemoteConnection.using(client, name);
Under the hood, the graph name (remoteTraversalSourceName) is simply used to set an alias (e.g. client.alias(name);)
Next, in the AnonymousTraversalSource.traversal().withRemote() method
GraphTraversalSource g = AnonymousTraversalSource.traversal().withRemote(conn);
Under the hood, withRemote() is calling:
traversalSourceClass.getConstructor(RemoteConnection.class).newInstance(remoteConnection);
Where traversalSourceClass is GraphTraversalSource.class
Which is the same as this:
g = GraphTraversalSource.class.getConstructor(RemoteConnection.class).newInstance(conn);
Finally, the constructor for the GraphTraversalSource looks like this:
public GraphTraversalSource(final RemoteConnection connection) {
this(EmptyGraph.instance(), TraversalStrategies.GlobalCache.getStrategies(EmptyGraph.class).clone());
this.connection = connection;
this.strategies.addStrategies(new RemoteStrategy(connection));
}
As you can see, the graph variable in the GraphTraversalSource is only ever set to the EmptyGraph instance.
I suspect that either (a) I shouldn't be using the AnonymousTraversalSource or (b) that I need to instantiate the GraphTraversalSource some other way, perhaps using a Graph object.

Updated answer
Most likely the channelizer that you use is TinkerPop's org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer. Replacing it with JanusGraph's org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer properly binds the graph to the connection.
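For reference, the channelizer is set in gremlin-server.yaml; a minimal excerpt (only the relevant key is shown, the rest of the file stays as shipped with JanusGraph) would be:
channelizer: org.janusgraph.channelizers.JanusGraphWsAndHttpChannelizer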
Older answer
This is the workaround that was used while the channelizer was still org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer.
I'm having the same issue. For now, the workaround that I found is to bind the traversals during JanusGraph startup.
In addition to gremlin-server.yaml and janusgraph.properties, I also override empty-sample.groovy with the following content:
def globals = [:]
ConfiguredGraphFactory.getGraphNames().each { name ->
globals << [ (name + "_traversal") : ConfiguredGraphFactory.open(name).traversal()]
}
Now the graph that you create is available as yourgraphname_traversal:
import org.apache.tinkerpop.gremlin.driver.Cluster
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
import org.apache.tinkerpop.gremlin.driver.ser.Serializers
import org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource
...
Cluster cluster = Cluster.build()
.addContactPoint("your_load_balancer_host")
.port(8182)
.serializer(Serializers.GRAPHBINARY_V1D0.simpleInstance())
.create();
DriverRemoteConnection remoteConnection = DriverRemoteConnection.using(cluster, "yourgraphname_traversal");
GraphTraversalSource g = AnonymousTraversalSource.traversal().withRemote(remoteConnection);
The solution is not ideal, because it requires all the JanusGraph nodes to be restarted in order to update the bindings. Assuming that graphs are rarely created, it is at least a workable stopgap.

Related

How to create an index using the Java API for OrientDB 3.0?

I am using the OrientDB multi-model Java API with the OVertex and OEdge classes to store my documents. They inherit from the OElement class, which does not appear to expose a createIndex() method. I know this is possible when using OClass to create classes and hold documents.
How do I create an index using the multi-model API if I am using the OVertex and OEdge classes?
I am missing the link [OVertex,OEdge]--inherits-from-->[OElement]--(?)-->[OClass]
If you are using the Java multi-model API, the cleanest way I found was:
// create the connection pool for orientdb
OrientDB orient = new OrientDB(orientUrl, OrientDBConfig.defaultConfig());
OrientDBConfigBuilder poolCfg = OrientDBConfig.builder();
poolCfg.addConfig(OGlobalConfiguration.DB_POOL_MIN, 2);
poolCfg.addConfig(OGlobalConfiguration.DB_POOL_MAX, 5);
ODatabasePool pool = new ODatabasePool(orientUrl, databaseName, orientUser, orientPass, poolCfg.build());
// acquire the orient pool connection
try (ODatabaseSession db = pool.acquire()) {
// check and create vertex/edge class
if (db.getClass("className") == null) {
// create the class if it does not exist in the DB
OClass orientClass = db.createVertexClass("className");
// OR db.createEdgeClass("className");
orientClass.createProperty("id", OType.STRING);
orientClass.createIndex("id", OClass.INDEX_TYPE.UNIQUE, "id");
}
// now create the OVertex/OEdge/OElement
OVertex vertex = db.newVertex("className");
// add properties to your document
vertex.setProperty("id", id);
...
// release the connection back to the pool
} finally {
orient.close();
}
I did not find this in the documentation yet, so maybe it helps someone.
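As a quick usage check, here is a sketch of a lookup by the indexed property, reusing the pool, the "className" class, and the id variable from the snippet above; the SELECT below is served by the unique index created earlier:
try (ODatabaseSession db = pool.acquire();
     OResultSet rs = db.query("SELECT FROM className WHERE id = ?", id)) {
    while (rs.hasNext()) {
        OResult row = rs.next();
        Object idValue = row.getProperty("id"); // the property backed by the unique index
        System.out.println(idValue);
    }
}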

Cannot read Gremlin data from remote after writing

I use Java to connect to a "remote" (localhost:8182) Gremlin Server traversal source g this way:
traversalSource = traversal().withRemote(DriverRemoteConnection.using("localhost", 8182, "g"));
Then, I write some node like this:
traversalSource.addV("TenantProfile");
From the Gremlin console, connected to the same Gremlin Server, I can see all the created nodes and edges:
gremlin> g
==>graphtraversalsource[tinkergraph[vertices:42 edges:64], standard]
and queries work, but if I read the graph from Java it comes back empty, so a query like
traversalSource.V()
.has("label", TENANT_PROFILE_LABEL)
.has("fiscal id", "04228480408")
.out(OWNS_LABEL)
.has("type", "SH")
.values("description")
.toList();
returns an empty list.
Could anyone help me solve this mystery, please?
Thanks.
In reply to Stephen, here are the last instructions before iterate():
for (final Map<String, String> edgePropertyMap : edgePropertyTable) {
edgeTraversal = traversalSource
.V(vertices.get(edgePropertyMap.get(FROM_KEY)))
.addE(edgeLabel)
.to(vertices.get(edgePropertyMap.get(TO_KEY)));
final Set<String> edgePropertyNames = edgePropertyMap.keySet();
for (final String nodePropertyName : edgePropertyNames)
if ((!nodePropertyName.equals(FROM_KEY)) && (!nodePropertyName.equals(TO_KEY))) {
final String edgePropertyValue = edgePropertyMap.get(nodePropertyName);
edgeTraversal = edgeTraversal.property(nodePropertyName, edgePropertyValue);
}
edgeTraversal.as(edgePropertyMap.get(IDENTIFIER_KEY)).iterate();
}
Anyway, if no iterate() were present, how could the nodes and edges be visible from inside the console? How could they have been "finalized" on the remote server?
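For context on the iterate() point: a remote traversal is lazy and is only sent to the server when a terminal step such as iterate(), next(), or toList() is applied, so an addV() written on its own never reaches the server. A minimal sketch of the write from the question with an explicit terminal step (same host, port, traversal source name, and label as above):
import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;
import org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.structure.Vertex;

GraphTraversalSource traversalSource =
    traversal().withRemote(DriverRemoteConnection.using("localhost", 8182, "g"));
// next() submits the traversal to the server and returns the created vertex;
// without it (or iterate()), nothing is written remotely
Vertex v = traversalSource.addV("TenantProfile").next();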

Neo4j ExecutionEngine does not return valid results

I am trying to use a similar example from the sample code found here.
My sample function is:
void query()
{
String nodeResult = "";
String rows = "";
String resultString;
String columnsString;
System.out.println("In query");
// START SNIPPET: execute
ExecutionEngine engine = new ExecutionEngine( graphDb );
ExecutionResult result;
try ( Transaction ignored = graphDb.beginTx() )
{
result = engine.execute( "start n=node(*) where n.Name =~ '.*79.*' return n, n.Name" );
// END SNIPPET: execute
// START SNIPPET: items
Iterator<Node> n_column = result.columnAs( "n" );
for ( Node node : IteratorUtil.asIterable( n_column ) )
{
// note: we're grabbing the name property from the node,
// not from the n.name in this case.
nodeResult = node + ": " + node.getProperty( "Name" );
System.out.println("In for loop");
System.out.println(nodeResult);
}
// END SNIPPET: items
// START SNIPPET: columns
List<String> columns = result.columns();
// END SNIPPET: columns
// the result is now empty, get a new one
result = engine.execute( "start n=node(*) where n.Name =~ '.*79.*' return n, n.Name" );
// START SNIPPET: rows
for ( Map<String, Object> row : result )
{
for ( Entry<String, Object> column : row.entrySet() )
{
rows += column.getKey() + ": " + column.getValue() + "; ";
System.out.println("nested");
}
rows += "\n";
}
// END SNIPPET: rows
resultString = engine.execute( "start n=node(*) where n.Name =~ '.*79.*' return n.Name" ).dumpToString();
columnsString = columns.toString();
System.out.println(rows);
System.out.println(resultString);
System.out.println(columnsString);
System.out.println("leaving");
}
}
When I run this query in the web console I get many results (as there are multiple nodes whose Name attribute contains the pattern 79). Yet running this code returns no results, and the debug print statements 'In for loop' and 'nested' never print either. This must mean that no results are found by the Iterator, yet that doesn't make sense.
And yes, I already checked and made sure that the graphDb variable is the same as the path for the web console. I have other code earlier that uses the same variable to write to the database.
EDIT - More info
If I place the contents of query() in the same function that creates my data, I get the correct results. If I run the query by itself, it returns nothing. It's almost as if the query works only in the instance where I add the data, and not if I come back to the database cold in a separate instance.
EDIT2 -
Here is a snippet of code that shows the bigger context of how it is being called while sharing the same NeoHandle:
package ContextEngine;
import ContextEngine.NeoHandle;
import java.util.LinkedList;
/*
* Class to handle streaming data from any coded source
*/
public class Streamer {
private NeoHandle myHandle;
private String contextType;
Streamer()
{
}
public void openStream(String contextType)
{
myHandle = new NeoHandle();
myHandle.createDb();
}
public void streamInput(String dataLine)
{
Context context = new Context();
/*
* get database instance
* write to database
* check for errors
* report errors & success
*/
System.out.println(dataLine);
//apply rules to data (make ContextRules do this, send type and string of data)
ContextRules contextRules = new ContextRules();
context = contextRules.processContextRules("Calls", dataLine);
//write data (using linked list from contextRules)
NeoProcessor processor = new NeoProcessor(myHandle);
processor.processContextData(context);
}
public void runQuery()
{
NeoProcessor processor = new NeoProcessor(myHandle);
processor.query();
}
public void closeStream()
{
/*
* close database instance
*/
myHandle.shutDown();
}
}
Now, if I call streamInput AND query in the same instance (parent calls), the query returns results. If I only call query and do not enter ANY data in that instance (yet the web console shows data for the same query), I get nothing. Why would I have to create the nodes and enter them into the database at runtime just to get a valid query result? Shouldn't I ALWAYS get the same results with such a query?
You mention that you are using the Neo4j Browser, which comes with Neo4j. However, the example you posted is for Neo4j Embedded, which is the in-process version of Neo4j. Are you sure you are talking to the same database when you try your query in the Browser?
In order to talk to Neo4j Server from Java, I'd recommend looking at the Neo4j JDBC driver, which has good support for connecting to the Neo4j server from Java.
http://www.neo4j.org/develop/tools/jdbc
You can set up a simple connection by adding the Neo4j JDBC jar to your classpath, available here: https://github.com/neo4j-contrib/neo4j-jdbc/releases Then just use Neo4j as any JDBC driver:
Connection conn = DriverManager.getConnection("jdbc:neo4j://localhost:7474/");
ResultSet rs = conn.executeQuery("start n=node({id}) return id(n) as id", map("id", id));
while(rs.next()) {
System.out.println(rs.getLong("id"));
}
Refer to the JDBC documentation for more advanced usage.
To answer your question on why the data is not durably stored, it may be one of many reasons. I would attempt to incrementally scale back the complexity of the code to try and locate the culprit. For instance, until you've found your problem, do these one at a time:
1. Instead of looping through the result, print it using System.out.println(result.dumpToString()); (see the sketch after this list).
2. Instead of the regex query, try just MATCH (n) RETURN n, to return all data in the database.
3. Make sure the data you are seeing in the browser is not "old" data inserted earlier on, but really is an insert from your latest run of the Java program. You can verify this by deleting the data via the browser before running the Java program, using MATCH (n) OPTIONAL MATCH (n)-[r]->() DELETE n, r.
4. Make sure you are actually working against the same database directory. You can verify this by leaving the server running: if you can still start your Java program (and it does not use the Neo4j REST bindings), you are not using the same directory, because two Neo4j instances cannot run against the same database directory simultaneously.
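A minimal sketch of steps 1 and 2 combined, using the same deprecated ExecutionEngine API as the question and assuming graphDb is the embedded database instance used above:
try ( Transaction ignored = graphDb.beginTx() )
{
    ExecutionEngine engine = new ExecutionEngine( graphDb );
    // return everything in the database, bypassing the regex filter, and dump it
    ExecutionResult result = engine.execute( "MATCH (n) RETURN n" );
    System.out.println( result.dumpToString() );
}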

Relationships don't show in the Neo4j web browser when created by the API (Neo4j 2.0.1)

I have created a simple test in the Java API (see below). I then point neo4j-server.properties at that database and restart Neo4j. I'm expecting to see two nodes and a relationship in the web browser at localhost:7474/browser.
When the code below is executed, the for loop does detect a relationship; however, the relationship is not displayed in the browser (nor is it returned by a Cypher query).
I'm using 2.0.1 for both the Java code and the Neo4j server. Is my expectation in error?
Transaction tx = gdb.beginTx();
try
{
Label courseLabel = DynamicLabel.label( "Book" );
Label courseLabelP = DynamicLabel.label( "Person" );
Node a = gdb.createNode(courseLabel), b = gdb.createNode(courseLabelP);
Relationship rel = a.createRelationshipTo( b, CourseRelTypes.HAS_AUTHOR );
for(Relationship r : b.getRelationships(CourseRelTypes.HAS_AUTHOR)) {
System.out.println("has rel");
}
}
finally {
tx.close();
}
You need to call tx.success() to commit that transaction.
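A minimal sketch of the block from the question with that change applied (same labels and relationship type as above):
Transaction tx = gdb.beginTx();
try
{
    Label courseLabel = DynamicLabel.label( "Book" );
    Label courseLabelP = DynamicLabel.label( "Person" );
    Node a = gdb.createNode(courseLabel), b = gdb.createNode(courseLabelP);
    a.createRelationshipTo( b, CourseRelTypes.HAS_AUTHOR );
    // mark the transaction as successful so tx.close() commits instead of rolling back
    tx.success();
}
finally {
    tx.close();
}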

Exception using QueryEngine inside a Transaction

I'm using Neo4j 1.9.M01 with java-rest-binding 1.8.M07, and I have a problem with this code, which aims to get a node whose "URL" property is "ARREL" from a Neo4j database, using the query language via REST. The problem seems to happen only inside a transaction, where it throws an exception; otherwise it works well:
RestGraphDatabase graphDb = new RestGraphDatabase("http://localhost:7474/db/data");
RestCypherQueryEngine queryEngine = new RestCypherQueryEngine(graphDb.getRestAPI());
Node nodearrel = null;
Transaction tx0 = graphDb.beginTx();
try{
final String queryStringarrel = ("START n=node(*) WHERE n.URL =~{URL} RETURN n");
QueryResult<Map<String, Object>> retornar = queryEngine.query(queryStringarrel, MapUtil.map("URL","ARREL"));
for (Map<String,Object> row : retornar)
{
nodearrel = (Node)row.get("n");
System.out.println("Arrel: "+nodearrel.getProperty("URL")+" id : "+nodearrel.getId());
}
tx0.success();
}
(...)
But an exception happens (exception tx0: Error reading as JSON '') on every execution, at the line that returns the QueryResult object.
I have also tried to do it with the ExecutionEngine (inside a transaction):
ExecutionEngine engine = new ExecutionEngine( graphDb );
String ARREL = "ARREL";
ExecutionResult result = engine.execute("START n=node(*) WHERE n.URL =~{"+ARREL+"} RETURN n");
Iterator<Node> n_column = result.columnAs("n");
Node arrelat = (Node) n_column.next();
for ( Node node : IteratorUtil.asIterable( n_column ) )
(...)
But it also fails: n_column.next() returns a null object, which then throws an exception.
The problem is that I need to use transactions to optimize the queries, because otherwise processing all the queries I need to do takes too much time. Should I try to combine several operations into one query to avoid using transactions?
Try adding single quotes:
START n=node(*) WHERE n.URL =~ '{URL}' RETURN n
Can you update your java-rest-binding to the latest version (1.8)? In between there was a version that automatically applied REST batch operations to places with transaction semantics.
So the transactions you see are not real transactions; they just record your operations to be executed as batch REST operations on tx.success()/finish().
Execute the queries within the transaction, but only access the results after the tx is finished. Then your results will be there.
This is for instance useful to send many cypher queries in one go to the server and have the results available all in one go afterwards.
And yes, as @ulkas suggested, use parameters, but not like that:
START n=node(*) WHERE n.URL =~ {URL} RETURN n
params: { "URL" : "http://your.url" }
No quotes are necessary when using params, just like SQL prepared statements.
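Putting both points together, a sketch against the question's RestCypherQueryEngine: the parameter is passed via the map, and the results are only iterated after the transaction has finished, per the batching behaviour described above (tx.finish() is the 1.9-era close call):
final String query = "START n=node(*) WHERE n.URL =~ {URL} RETURN n";
Transaction tx = graphDb.beginTx();
QueryResult<Map<String, Object>> result;
try {
    // inside the tx this call is only recorded as a batch REST operation
    result = queryEngine.query(query, MapUtil.map("URL", "ARREL"));
    tx.success();
} finally {
    tx.finish();
}
// only after the tx has finished is the batch executed and the result populated
for (Map<String, Object> row : result) {
    Node n = (Node) row.get("n");
    System.out.println(n.getProperty("URL") + " id : " + n.getId());
}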
