Neo4j embedded-mode use of GDS - Java

I am attempting to use GDS 1.8.2 as part of a system running an embedded Neo4j 4.4.3 server. The embedded server has been an operational component for several years, across several versions of Neo4j, so that component on its own is time-tested and functions great. This is the first attempt to add support for graph algorithms to that component.
My first test is simply to submit a CQL query:
CALL gds.graph.create("someNamedGraph", ["SomeNodeLabel"], ["SomeRelationshipType"])
In the course of getting this to work, I found I had to register the org.neo4j.gds.catalog.GraphCreateProc class in the GlobalProcedures registry of the graph database. This seems to have been successful: while I was initially encountering a CQL exception saying the gds.graph.create procedure is unknown, it now appears to execute without exception. However, I am now seeing that the transaction doesn't produce the named graph (verified by checking the graph database using the out-of-the-box Neo4j Community Edition server mode). It also runs for only about 0.1 seconds, versus several seconds when done through Neo4j Community Edition server mode, where it works just fine.
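For reference, the registration was done along these lines (a minimal sketch, assuming graphDb is the embedded GraphDatabaseService; the dependency-resolver route is the usual way to reach GlobalProcedures in embedded mode, though exact package names vary across Neo4j versions):
import org.neo4j.exceptions.KernelException;
import org.neo4j.gds.catalog.GraphCreateProc;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.kernel.api.procedure.GlobalProcedures;
import org.neo4j.kernel.internal.GraphDatabaseAPI;

static void registerGdsProcedures(GraphDatabaseService graphDb) throws KernelException
{
    // Resolve the procedure registry from the embedded database and
    // register the GDS catalog procedure with it.
    GlobalProcedures procedures = ((GraphDatabaseAPI) graphDb)
            .getDependencyResolver()
            .resolveDependency(GlobalProcedures.class);
    procedures.registerProcedure(GraphCreateProc.class);
}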
What I now see is that the Query Execution Type (as indicated in the Result object coming back from the execution) is marked as READ_ONLY. There are no exceptions, notifications, etc. I have verified that a subsequent write transaction in the same test code, which creates a simple node (as a test), succeeds in writing a node, and the Result object provides all the verifying information for that transaction.
Can anyone suggest why the gds.graph.create procedure would seem to execute with no exceptions yet somehow get marked as a READ_ONLY transaction? Is this even the reason why the named graph isn't getting created?
Thank you for suggestions or tips! I'm happy to provide more details if anyone has exploratory questions that might help unearth the root cause for this.

Providing an answer to my own question, as this was resolved with an assist from Mats Rydberg. The issue was that the call alone does not execute the operation. The result has to be iterated.
So a more fitting way to do this in embedded mode would be:
CALL gds.graph.create("someNamedGraph", ["someNodeLabel"], ["someRelationshipType"]) YIELD graphName
And, on the server-side:
try (Transaction tx = graphDb.beginTx())
{
    Result resultSet = tx.execute(cql);
    // The procedure only actually runs as its Result is consumed, so the
    // rows must be iterated even though their content is not used here.
    while (resultSet.hasNext())
    {
        resultSet.next();
    }
}
which indeed produces the named graph.
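To double-check from the same embedded code, the graph catalog can be queried in the same iterate-the-result style (a sketch, assuming the gds.graph.exists procedure class, org.neo4j.gds.catalog.GraphExistsProc, has also been registered):
try (Transaction tx = graphDb.beginTx())
{
    Result result = tx.execute("CALL gds.graph.exists(\"someNamedGraph\") YIELD exists");
    while (result.hasNext())
    {
        // Prints true once the named graph is in the catalog.
        System.out.println(result.next().get("exists"));
    }
}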
The root problem, to me, is that my original query (with no YIELD clause) works correctly in the built-in browser app for Neo4j Community Edition server mode, suggesting there are things happening behind the scenes (presumably the browser consumes the result itself), which obscures how that ends up working. Regardless, the solution to the problem is now understood, and hopefully, in the future, there will be more documentation about making GDS work in embedded mode specifically.

Related

Does deleting from a table of an H2 database handled by Hibernate corrupt the table?

Here is a quick description of the system:
A Java 7 REST client receives JSONs and writes their parsed content into an H2 database via Hibernate.
Some Pentaho Kettle Spoon 4 ETLs connect directly to the same database to read and delete a lot of entries at once.
This solution worked fine in our test environment, but in production (where the traffic is really higher, because of course it is) the ETLs often fail with the following error:
Error inserting/updating row
General error: "java.lang.ArrayIndexOutOfBoundsException: -1"; SQL statement:
DELETE FROM TABLE_A
WHERE COLUMN_A < ? [50000-131]
and if I navigate the database I can indeed see that the table is not readable (apparently because it thinks its length is -1?). The error code 50000 is for "Generic", so it is of no use.
Apart from the trivial "maybe H2 is not good for an event handler", I've been thinking that the corruption could possibly be caused by a conflict between Kettle and Hibernate, or in other words that no one should delete from a Hibernate-handled database without Hibernate knowing.
My questions to those more experienced than me with Hibernate are:
Is my supposition correct?
Should I re-design my solution to also perform deletes through the same RESTful Hibernate application?
Should I give up using H2 for such a system?
Thanks for the help!
EDIT:
The database is created by a simple sh script that runs the following command, which basically uses the provided Shell tool to connect to a non-existing DB, which by default creates it.
$JAVA_HOME/bin/java -cp *thisIsAPath*/h2database/h2/main/h2-1.3.168-redhat-2.jar org.h2.tools.Shell -user $DB_USER -password $DB_PASSWORD -url jdbc:h2:$DB_FOLDER/Temp_SD_DS_EventAgent<<END
So all its parameters are set to version 1.3.168's defaults. Unfortunately, while I can find the current URL settings, I can't find where to look up that version's defaults and experimental options.
I also found the following:
According to the tutorial, "When using Hibernate, try to use the H2Dialect if possible", which I didn't.
The tutorial also says, "Please note MVCC is enabled in version 1.4.x by default, when using the MVStore." Does that mean concurrency is disabled/unsupported by default in this older version, and that this is the problem?
The database is created with H2 version 1.3.168, but the consumer uses 1.4.197. Is this a big deal?
I cannot comment on the reliability of the H2 DB.
But from an application perspective, I think you should use a locking mechanism - optimistic or pessimistic locking. This will avoid the conflict situations. Hope this answer helps point you in the correct direction.
Article on Optimistic and Pessimistic locking
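As an illustration, optimistic locking with Hibernate/JPA usually amounts to adding a version column to the mapped entity (a minimal sketch; the TableA entity and its fields are hypothetical):
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

// Hypothetical mapping for TABLE_A, shown only to illustrate @Version.
@Entity
public class TableA
{
    @Id
    private Long id;

    private String payload;

    // Hibernate adds this column to the WHERE clause of every UPDATE and
    // DELETE it issues, and raises an OptimisticLockException when a
    // concurrent writer has already bumped the version.
    @Version
    private int version;

    // getters and setters omitted
}
Note that this only protects writes that go through Hibernate; the Kettle ETLs, which talk to the database directly, would still bypass it.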

Couchbase insert and query latency

I am writing some integration tests for an app I built that uses Couchbase.
I am using Java. My question is, if I insert a document using:
bucket.insert(rawJsonDocument);
And then immediately run an N1qlQuery, should I expect to get a response when querying for the document that I just inserted, or is there some expected time period/delay before the document is actually persisted in the bucket?
I am seeing cases where my tests fail intermittently because the document isn't found, but when I re-run the test, it works sporadically.
Does the Couchbase bucket object have something similar to what an EntityManager in JPA has with its flush operation? I know flush has a totally different meaning in Couchbase, but I'm trying to nail down why I'm seeing this behavior.
I've verified the query syntax using the query tool in the console.
There are a couple of scan consistency options you can choose from, though you need to use them carefully and understand how they will impact your application's performance.
The options are:
- Not bounded (default)
- RequestPlus - waits for all document changes and index updates
- AtPlus - allows 'read your own write'
You can read a discussion of them in this blog post:
https://blog.couchbase.com/new-to-couchbase-4-5-atplus/
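For instance, with the 2.x Java SDK the consistency level can be set per query via N1qlParams (a sketch, assuming bucket is the already-opened Bucket; the bucket name and statement are hypothetical):
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.consistency.ScanConsistency;

// REQUEST_PLUS makes the query wait until the indexer has caught up with
// all mutations made before the query was issued, so a just-inserted
// document becomes visible at the cost of higher query latency.
N1qlParams params = N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS);
N1qlQueryResult result = bucket.query(
        N1qlQuery.simple("SELECT * FROM `myBucket` WHERE type = 'myType'", params));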

Calling an Oracle stored procedure in a Quartz job in a Spring webapp - the SP does not give consistent results

I have a Spring webapp which is basically a monitoring application. There are various KPIs defined in the application, and I am using the Quartz framework to make these KPIs run at regular intervals. One of the Quartz jobs calls an Oracle stored procedure (with one input and one output parameter). I am using a JDBC CallableStatement to call the SP from the job. The stored procedure is written so that it returns 0 in case of any exception.
The issue I am facing is that the call from Java to this stored procedure is not giving consistent results. Sometimes the procedure returns correct results and sometimes it returns 0. However, if I run the SP directly in SQL Developer, it runs perfectly fine.
I have debugged quite a lot on this but didn't find anything. Is it something related to synchronous/asynchronous running of the SP, or is it some issue with the connection object used to execute the SP? I simply need some pointers which I can use to continue my debugging.
Thanks.
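For context, such a call typically looks like this (a sketch; the procedure and parameter names are hypothetical, and dataSource stands for the webapp's configured DataSource):
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.Types;

try (Connection con = dataSource.getConnection();
     CallableStatement cs = con.prepareCall("{call COMPUTE_KPI(?, ?)}"))
{
    cs.setString(1, kpiName);                  // IN parameter
    cs.registerOutParameter(2, Types.NUMERIC); // OUT parameter
    cs.execute();
    int kpiValue = cs.getInt(2);
    // Because the SP maps every exception to 0, a 0 here is ambiguous:
    // it could be a legitimate result or a swallowed error.
}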
Just a two-cents suggestion: instead of returning 0 in case of any error, why not return the real error code (with a negative sign)?
For example, ORA-01401 would just return -1401.
This way, you will be able to identify the real reason for errors (and whether they come from the server or the client side).
Regards.
Christian (I'm French, sorry for my Frenglish ;-) )

A data migration issue in HSQL: the new database is missing a few tables

I am currently responsible for migrating data for our application, for an upgrade to a new version. I am trying to migrate from HSQL to HSQL; later we will move on to other combinations.
So I have a standalone utility to do this. I am using MockServletContext to initialize my services (this migration is to be done without starting the servers).
The problem is that all the tables are migrated except for 2-3 tables, the number depending on the size of the data migrated. On extensive debugging I found nothing wrong: all the data gets migrated when debugging via Eclipse, but on a normal run it fails to complete for the last 3 tables.
Any clue where to look at?
In a normal run I have put loggers to see if we are reading all the data from the source database, and indeed the logs prove we do.
The only place where I am unable to put logs is when it calls a method in the driver.
In the last step we call the PreparedStatement object's executeBatch()/executeUpdate() methods (tried with both, but exactly the same result).
I am completely clueless what to do and where to look. Any suggestions?
Thanks
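One thing worth checking at that last step is the return value of the batch call and the commit, since a silently failed or uncommitted batch would look exactly like this (a sketch; connection and preparedStatement stand for the objects in the migration utility):
import java.sql.Statement;

// Disable auto-commit so the whole batch is one unit of work.
connection.setAutoCommit(false);
int[] counts = preparedStatement.executeBatch();
for (int i = 0; i < counts.length; i++)
{
    if (counts[i] == Statement.EXECUTE_FAILED)
    {
        throw new IllegalStateException("Batch entry " + i + " failed");
    }
}
connection.commit();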
In a normal run I have put loggers to see if we are reading all the data from the source database, and indeed the logs prove we do. The only place where I am unable to put logs is when it calls a method in the driver.
If you suspect something wrong there, try wrapping your driver in log4jdbc. It will show the SQL issued to the DB. Good luck!
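If I remember its usage correctly, log4jdbc is enabled by loading its spy driver and prefixing the JDBC URL (a sketch; the HSQL file path and credentials are hypothetical):
import java.sql.Connection;
import java.sql.DriverManager;

// The spy driver delegates to the real HSQL driver underneath and logs
// every statement it forwards (an slf4j binding must be on the classpath).
Class.forName("net.sf.log4jdbc.DriverSpy");
Connection con = DriverManager.getConnection(
        "jdbc:log4jdbc:hsqldb:file:/path/to/targetDb", "sa", "");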

How to diagnose performance problems with SQL Server Views and JDBC

I have a view defined in SQL Server 2008 that joins 4 tables together. Executing this view in SQL Server Management Studio takes roughly 3 seconds to run and returns about 45,000 records. My application is written in Java, using Hibernate to simply do a "from MyViewObject" query in HQL. When this is run, the execution time is consistently around 45 seconds. I have also tried simply using JDBC to run this query and received the same level of performance, so I've assumed it has nothing to do with Hibernate.
My question: What can I do to diagnose this problem? There is obviously something different between how Management Studio is running the query vs how my application is running the query but I have not been able to come up with much.
The only thing I've come up with as a potentially viable explanation is an issue with the jtds library that contains the driver for SQL Server in Java.
Any guidance here would be greatly appreciated.
UPDATE
I went back to trying pure JDBC and tried adding the selectMethod and responseBuffering attributes to my connection string, but didn't get any improvements. I also took my JDBC code from my application and ran it from a test program containing nothing but my JDBC code, and it ran in the expected 3 seconds. So to me this seems environmental to the application.
My application is a Google Web Toolkit (GWT) based app, and the JDBC code is being run in my primary RPC servlet. Essentially, the RPC method receives the call and immediately executes the JDBC code. Nothing in this setup gives me much indication of why the performance is terrible, though. I am going to try the JDBC 3.0 driver and see if that works any better, but it doesn't feel like that will fix the issue to me quite yet.
My goal for the moment is to get my query working live with JDBC and then switch it back over to Hibernate so I can keep the testing simple enough. Thanks for the help so far!
UPDATE 2
I'm finally starting to zero in on the source of the problem, though I still have no idea what the actual issue is. I opened up the view in SQL Server, copied the SQL statement (rather large) exactly into my code, and executed it using JDBC instead of pulling the data from the view, and most of the performance issues are gone. It seems that some combination of GWT, SQL Server views, and JDBC is not working properly here. I don't see keeping a very large hand-written query in my code as a long-term solution, but it does offer a bit more insight.
<property name="hibernate.show_sql">true</property>
Setting this will show you the SQL query generated by Hibernate. Analyze the query and make sure you are not missing a relationship.
Reply to Updates 1 and 2:
As you mentioned, you ran the raw SQL query yourself and it seems fast. So another thing to remember about Hibernate is that it creates the objects that are returned by your query (of course, this depends on whether you initialize objects lazily; I don't remember what it is called - lazy loading). How many objects does your query return? You can also do a simple benchmark of where the issue is.
For example, before running the query, sysout the current time, and then sysout the current time after. Do this for all the places that you suspect are slowing your application down.
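A minimal version of that timing approach might look like this (a sketch; the view name is hypothetical, and connection stands for the app's JDBC connection):
import java.sql.ResultSet;
import java.sql.Statement;

long start = System.currentTimeMillis();
try (Statement st = connection.createStatement();
     ResultSet rs = st.executeQuery("SELECT * FROM MyView"))
{
    int rows = 0;
    while (rs.next())
    {
        rows++; // iterate fully so fetch time is measured, not just execution
    }
    System.out.println(rows + " rows in "
            + (System.currentTimeMillis() - start) + " ms");
}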
To analyze the problem, you should look in your manual for tools that display the query or execution plan. Maybe you're missing an index on a join column.
