couchbase insert and query latency - java

I am writing some integration tests for an app I built that uses couchbase.
I am using Java. My question is, if I insert a document using:
bucket.insert(rawJsonDocument);
And then immediately run an N1qlQuery, should I expect to get a response when querying for the document that I just inserted, or is there some expected time period/delay before the document is actually persisted in the bucket?
I am seeing cases where my tests fail intermittently because the document isn't found, but when I re-run the test it sometimes passes.
Does the couchbase bucket object have something similar to what an EntityManager in JPA has with its flush operation? I know flush has a totally different meaning in couchbase, but I'm trying to nail down why I'm seeing this behavior.
I've verified the query syntax using the query tool in the console.

By default, N1QL queries are only eventually consistent with the index, so a query issued immediately after an insert may not see the new document yet - which matches the intermittent failures you describe. There are a couple of scan-consistency options you can choose from, though you need to use them carefully and understand how they will impact your application's performance.
The options are:
- Not bounded (the default) - the query runs against whatever the index currently contains
- RequestPlus - waits for all document changes and index updates made before the query was issued
- AtPlus - allows 'read your own write' by waiting only for the specific mutations you pass in
You can read a discussion of them in this blog post:
https://blog.couchbase.com/new-to-couchbase-4-5-atplus/
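As an illustration, a minimal sketch of forcing RequestPlus consistency with the 2.x Java SDK (which your bucket.insert/N1qlQuery usage suggests) might look like this - the cluster address, bucket name, document and query are placeholders:

import com.couchbase.client.java.Bucket;
import com.couchbase.client.java.CouchbaseCluster;
import com.couchbase.client.java.document.RawJsonDocument;
import com.couchbase.client.java.query.N1qlParams;
import com.couchbase.client.java.query.N1qlQuery;
import com.couchbase.client.java.query.N1qlQueryResult;
import com.couchbase.client.java.query.consistency.ScanConsistency;

// Placeholder connection details - adjust to your environment.
CouchbaseCluster cluster = CouchbaseCluster.create("localhost");
Bucket bucket = cluster.openBucket("myBucket");

bucket.insert(RawJsonDocument.create("doc-1", "{\"name\":\"test\"}"));

// REQUEST_PLUS makes the query block until the index has caught up with
// all mutations made before the query was issued.
N1qlParams params = N1qlParams.build().consistency(ScanConsistency.REQUEST_PLUS);
N1qlQueryResult result = bucket.query(N1qlQuery.simple(
        "SELECT META().id FROM `myBucket` WHERE name = 'test'", params));

For integration tests this is usually acceptable; in production code AtPlus is typically cheaper, since it only waits for your own mutations rather than everything.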

Neo4j embedded mode use of GDS

I am attempting to use GDS 1.8.2 as part of a system running an embedded Neo4j 4.4.3 server. The embedded server has been an operational component for several years, and several versions of Neo4j, so that component on its own has been time-tested and functions great. This is the first attempt to add support for graph algorithms into that component.
My first test is simply to submit a CQL query:
CALL gds.graph.create("someNamedGraph", ["SomeNodeLabel"], ["SomeRelationshipType"])
In the course of getting this to work, I found I had to register the org.neo4j.gds.catalog.GraphCreateProc class in the GlobalProcedures registry of the graph database. This seems to have been successful because, while I was initially encountering a CQL exception saying the gds.graph.create procedure is unknown, now it appears to execute without exception. However, I am now seeing that the transaction doesn't produce the named graph (verified by checking the graph database using out-of-the-box Neo4j Community Edition server mode). It only runs for perhaps 0.1 seconds (vs several seconds when done through the Neo4j Community Edition server mode where it works just fine).
What I now see is that the Query Execution Type (as indicated in the Result object coming back from the execution) is marked as READ_ONLY. There are no exceptions, notifications etc. I have verified that a subsequent write transaction in the same test code, which creates a simple node (as a test), succeeds in writing a node and the Result object provides all the verifying information for that transaction.
Can anyone suggest why the gds.graph.create procedure would seem to execute with no exceptions yet somehow is getting marked as a READ_ONLY transaction? Is this even the reason why the named graph isn't getting created?
Thank you for suggestions or tips! I'm happy to provide more details if anyone has exploratory questions that might help unearth the root cause for this.
Providing an answer to my own question, as this was resolved with an assist from Mats Rydberg. The issue was that the call alone does not execute the operation; the result has to be iterated.
So a more fitting way to do this in embedded mode would be:
CALL gds.graph.create("someNamedGraph", ["someNodeLabel"], ["someRelationshipType"]) YIELD graphName
And, on the server-side:
try (Transaction tx = graphDb.beginTx())
{
    Result resultSet = tx.execute(cql);
    while (resultSet.hasNext())
    {
        resultSet.next();
    }
}
which indeed produces the named graph.
The root problem, to me, is that my original query (with no YIELD clause) works correctly in the built-in browser app of the Neo4j Community Edition server, which suggests something extra is happening behind the scenes there, so it's obscured how that ends up working. Regardless, the solution to the problem is now understood and hopefully, in the future, there will be more documentation about making GDS work in embedded mode specifically.
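For reference, a rough end-to-end sketch of the embedded setup described above might look like the following. The GraphCreateProc class is the one named in the question; the builder, registry and constant names are based on Neo4j 4.4.x / GDS 1.8.x and should be treated as assumptions to verify against your exact versions:

import java.nio.file.Path;
import org.neo4j.configuration.GraphDatabaseSettings;
import org.neo4j.dbms.api.DatabaseManagementService;
import org.neo4j.dbms.api.DatabaseManagementServiceBuilder;
import org.neo4j.gds.catalog.GraphCreateProc;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Result;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.api.procedure.GlobalProcedures;
import org.neo4j.kernel.internal.GraphDatabaseAPI;

DatabaseManagementService dbms =
        new DatabaseManagementServiceBuilder(Path.of("data")).build();
GraphDatabaseService graphDb =
        dbms.database(GraphDatabaseSettings.DEFAULT_DATABASE_NAME);

// Register the GDS catalog procedure so gds.graph.create is known to Cypher.
((GraphDatabaseAPI) graphDb).getDependencyResolver()
        .resolveDependency(GlobalProcedures.class)
        .registerProcedure(GraphCreateProc.class);

String cql = "CALL gds.graph.create(\"someNamedGraph\", "
        + "[\"someNodeLabel\"], [\"someRelationshipType\"]) YIELD graphName";

try (Transaction tx = graphDb.beginTx())
{
    Result resultSet = tx.execute(cql);
    // The procedure only actually runs as the result stream is consumed.
    while (resultSet.hasNext())
    {
        resultSet.next();
    }
}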

Does deleting from a table of an H2 database handled by Hibernate corrupt the table?

Here is a quick description of the system:
A Java 7 REST client receives JSONs and writes their parsed content into an H2 database via Hibernate.
Some Pentaho Kettle Spoon 4 ETLs connect directly to the same database to read and delete a lot of entries at once.
This solution worked fine in our test environment, but in production (where the traffic is really higher, because of course it is) the ETLs often fail with the following error:
Error inserting/updating row
General error: "java.lang.ArrayIndexOutOfBoundsException: -1"; SQL statement:
DELETE FROM TABLE_A
WHERE COLUMN_A < ? [50000-131]
and if I navigate the database I can indeed see that that table is no longer readable (apparently because it thinks its length is -1?). The error code 50000 stands for "Generic", so it is no help.
Apart from the trivial "maybe H2 is not good for an event handler", I've been thinking that the corruption could possibly be caused by a conflict between Kettle and Hibernate - or, in other words, that no one should delete from a Hibernate-handled database behind Hibernate's back.
My questions to those more experienced with Hibernate than me are:
Is my supposition correct?
Should I re-design my solution to also perform the deletes through the same RESTful Hibernate layer?
Should I give up on using H2 for such a system?
Thanks for the help!
EDIT:
The database is created by a simple sh script that runs the following command, which basically uses the provided Shell tool to connect to a non-existing DB, which by default creates it.
$JAVA_HOME/bin/java -cp *thisIsAPath*/h2database/h2/main/h2-1.3.168-redhat-2.jar org.h2.tools.Shell -user $DB_USER -password $DB_PASSWORD -url jdbc:h2:$DB_FOLDER/Temp_SD_DS_EventAgent<<END
So all its parameters are set to version 1.3.168's defaults. Unfortunately, while I can find the current URL settings, I can't find where to look up that version's defaults and experimental options.
I also found the following:
According to the tutorial, "When using Hibernate, try to use the H2Dialect if possible", which I didn't.
The tutorial also says "Please note MVCC is enabled in version 1.4.x by default, when using the MVStore". Does that mean concurrency is disabled/unsupported by default in this older version, and that this is the problem?
The database is created with H2 version 1.3.168 but the consumer uses 1.4.197. Is this a big deal?
I cannot comment on the reliability of the H2 database itself.
But from the application's perspective, I think you should use a locking mechanism - an optimistic or pessimistic lock. This will avoid the conflict situations. Hope this answer helps point you in the correct direction.
Article on Optimistic and Pessimistic locking
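As an illustration, optimistic locking in JPA/Hibernate is usually enabled by adding a version column to the entity. A minimal sketch, with a made-up entity name standing in for the table the ETL deletes from:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.LockModeType;
import javax.persistence.Version;

@Entity
public class EventRecord   // hypothetical entity
{
    @Id
    private Long id;

    // Hibernate checks and increments this column on every update/delete and
    // fails with an OptimisticLockException if another writer changed the row first.
    @Version
    private Long version;
}

// Pessimistic variant: take a database row lock while reading.
// EventRecord locked = entityManager.find(EventRecord.class, id, LockModeType.PESSIMISTIC_WRITE);

Note, though, that JPA-level locking only covers access that goes through Hibernate; a Kettle job deleting directly over JDBC bypasses it, so database-level locking or coordinating the two writers may still be needed.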

How to find slow queries in MySQL (mostly in MariaDB)

I have few questions:
1) I am a newbie to performance testing. As a starting assignment I have to investigate slow queries in MariaDB (version: 10.0.17-MariaDB MariaDB Server).
I tried with these settings in the /etc/my.cnf.d/server.cnf
[mysqld]
long_query_time=1
log-slow-queries=/var/log/mysql/log-slow-queries.log
And after doing that I could not start the database. I just get a simple
starting MySQL.... [FAILED] message.
I came across the Slow query log overview for MariaDB, which made only a little sense to me :(
Can anyone point me to a tutorial on how this should be done?
2) In my application we already use Hibernate for the data layer. Does it even make sense to look for the slow query log in the above-mentioned way?
3) How can I achieve the same thing in MongoDB, e.g. listing the most frequently used queries and the slow queries?
Any help would be appreciated.
Converting comment to answer:
When MySQL won't start, the first thing you should check is the MySQL error log (probably /var/log/(mysql/)mysqld.log) for the exact error.
In your case, "log-slow-queries" is a startup option name (and a deprecated one at that); you should use slow_query_log with a boolean value and slow_query_log_file for the filename.
slow_query_log=1 means ENABLE logging
long_query_time=1 means IF ENABLED log queries longer than 1 second
Then there is also:
- log_queries_not_using_indexes=0/1, which, if enabled, will log even queries faster than 1 second if they do not use indexes to locate rows
All of these and others can be found, with descriptions, in the MySQL manual.
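Putting that together, the [mysqld] section from the question would look something like this (the log path is the asker's own; the exact option names are worth double-checking against the MariaDB 10.0 manual):

[mysqld]
slow_query_log=1
slow_query_log_file=/var/log/mysql/log-slow-queries.log
long_query_time=1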
For MongoDB there seems to be a profiler, which is described in the answers to this question: How to find queries not using indexes or slow in mongodb

A data migration issue in HSQL: the new database is missing a few tables

I am currently responsible for migrating our application's data for an upgrade to a new version. I am trying to migrate from HSQL to HSQL; later we will move on to other combinations.
So I have a stand-alone utility to do this. I am using MockServletContext to initialize my services (this migration is to be done without starting the servers).
The problem is that all the tables are migrated except for 2-3 of them, the exact number depending on the size of the data migrated. On extensive debugging I found nothing wrong; that is, all the data gets migrated when debugging via Eclipse, but on a normal run it fails to complete the last 3 tables.
Any clue where to look?
In a normal run I have put in loggers to see whether we are reading all the data from the source database, and indeed the logs prove we do.
The only place where I am unable to put logs is where it calls a method in the driver.
In the last step we call the PreparedStatement object's executeBatch()/executeUpdate() methods (tried with both, but with exactly the same result).
I am completely clueless about what to do and where to look. Any suggestions?
Thanks
In normal run I have put loggers to see if we are reading all the data from the source database and indeed the logs prove we do. The only place where I am unable to put logs is when it calls a method in driver.
If you suspect something is wrong there, try wrapping your driver in log4jdbc. It will show the SQL issued to the DB. Good luck!
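Roughly, the wiring might look like this (the JDBC URL and credentials are placeholders; log4jdbc's spy driver delegates to the real HSQLDB driver and logs each statement via SLF4J, so the underlying driver and a logging binding must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;

// Load the spy driver, which wraps the real HSQLDB driver.
Class.forName("net.sf.log4jdbc.DriverSpy");

// Prefix the normal JDBC URL with "log4jdbc:" - everything else stays the same.
Connection con = DriverManager.getConnection(
        "jdbc:log4jdbc:hsqldb:file:/path/to/target/db", "SA", "");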

Testing SQL query on Oracle which includes a remote database

Our development databases (Oracle 9i) use a remote database link to a remote shared database.
This decision was made years ago when it wasn't practical to put some of the database schemas on a development machine - they were too big.
We have certain schemas on the development machines and we make the remote schemas look local by using Oracle's database links, together with some synonyms on the development machines.
The problem I have is that I would like to test a piece of SQL which joins tables in schemas on either side of the database link.
e.g. (a simplified case):
select a.col, b.col
from a, b
where a.b_id = b.id
a is on the local database
b is on the remote database
I have a synonym on the local DB so that 'b' actually points at b#remotedb.
Running the query takes ages in the development environment because of the link. The queries run fine in production (I don't think the Oracle cost-based optimiser copes very well with database links).
We have not been very good at writing unit tests for these types of queries in the past - probably due to the poor performance - so I'd like to start creating some tests for them.
Does anyone have any strategies for writing a unit test for such a query, so as to avoid the performance problems of using the database link?
I'd normally be looking at ways of mocking out the remote service, but since all this is in a SQL query I can't see any way of easily mocking out the remote database.
You should create exact copies of all the schemas you need from production on development, but without all the data. Populate the schemas with just enough data so you can do a proper test. You can also make the optimizer on the test system behave like production by exporting the optimizer statistics from the production server and importing them into the development database for the schemas you are duplicating. That way the query will run against the data set you've made, but it will be optimized with plans similar to those in production. Then you can estimate, theoretically, how it will scale in production.
Copy the relevant data into your development database and create the tables locally.
Ideally, just build a test case which tells you:
The SQL is correct (it parses)
It operates correctly with a few rows of test data
Don't fall for the "let's copy everything" because that means you'll have no idea what you're testing anymore (and what you're missing).
If in doubt, create a table b with just a single record. If you get an error in this area, add more rows as you learn where it can fail.
If you want to take this to the edge, create the test table (with all data) in a unit test. This way, you can document the test data you're using.
[EDIT] What you need is a test database. Don't run tests against a database which can change. Ideally, the tests should tear down the whole database and recreate it from scratch (tables, indexes, data, everything) as the first step.
In this test database, only keep well defined test data that only changes by defining new tests (and not by someone "just doing something"). If you can, try to run your tests against an in-memory database.
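As a tiny illustration of that approach, a test might spin up an in-memory database, create local stand-ins for a and b with a row each, and run the join. HSQLDB is used here purely as an example engine (an assumption; any in-memory database works) and obviously won't cover Oracle-specific syntax:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.junit.Assert;
import org.junit.Test;

public class JoinQueryTest
{
    @Test
    public void joinFindsTheMatchingRow() throws Exception
    {
        // In-memory database; created empty, torn down when the JVM exits.
        try (Connection con = DriverManager.getConnection("jdbc:hsqldb:mem:testdb", "SA", ""))
        {
            try (Statement st = con.createStatement())
            {
                st.execute("CREATE TABLE b (id INT PRIMARY KEY, col VARCHAR(20))");
                st.execute("CREATE TABLE a (b_id INT, col VARCHAR(20))");
                st.execute("INSERT INTO b VALUES (1, 'remote')");
                st.execute("INSERT INTO a VALUES (1, 'local')");

                try (ResultSet rs = st.executeQuery(
                        "SELECT a.col, b.col FROM a, b WHERE a.b_id = b.id"))
                {
                    Assert.assertTrue(rs.next());
                }
            }
        }
    }
}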
I would suggest materialized views. These are views that store remote data locally.
In theory, to do the unit testing you can work with any set of controlled data created and designed around your test cases. It doesn't have to be your live or development system - assuming your unit is portable enough. You would test it against your current databases/application when you come to integration testing, which might as well be on the live system anyway (so no DB links will be required - I understand your live databases are in one place).
What I'm trying to say is that you can/should test your unit (i.e. your component, query, or whatever you define as a unit) on a controlled set of data that simulates the different use cases, and once you complete your testing with satisfactory results, you can proceed to integration and run the integration tests.
Integration tests - you could run these in the live environment, but only after you've proved by unit testing that your component is 'bullet-proof' (if that's OK with your company's approach/philosophy :) - sys admin's reaction: "Are you flippin' crazy?!").
If you are trying to go back in time and test already-implemented units, then why bother? If they've been in production use for some time without any incidents, then I would argue that they're OK. However, there's always a chance that your unit/query has some 'slowly ticking time bomb' side effect (a cumulative effect over time). Well, analysing the impact is the answer.
