I found the commit log (.log) files in the folder and would like to analyze them. For example, I want to know which queries were executed in the machine's history. Is there any code to do that?
Commit log files are specific to a Cassandra version, and you may need to tinker with CommitLogReader, etc. You can find more information in the documentation on Change Data Capture.
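If you do go down that road, here is a rough sketch of what the tinkering looks like, assuming the 4.x CommitLogReader/CommitLogReadHandler API (the signatures vary between versions, and the schema of the mutated tables must already be loaded in the JVM, so treat this as a starting point only, not a working tool):

import java.io.File;
import java.io.IOException;

import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.db.commitlog.CommitLogDescriptor;
import org.apache.cassandra.db.commitlog.CommitLogReadHandler;
import org.apache.cassandra.db.commitlog.CommitLogReader;

public class CommitLogDumper implements CommitLogReadHandler {

    @Override
    public void handleMutation(Mutation m, int size, int entryLocation,
                               CommitLogDescriptor descriptor) {
        // You get the modified data (a Mutation), never the original CQL text.
        System.out.println(m.getKeyspaceName() + " " + m.key());
    }

    @Override
    public boolean shouldSkipSegmentOnError(CommitLogReadException e) {
        return false; // give up instead of skipping damaged segments
    }

    @Override
    public void handleUnrecoverableError(CommitLogReadException e) throws IOException {
        throw e;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of a single commit log segment on the command line.
        new CommitLogReader().readCommitLogSegment(
                new CommitLogDumper(), new File(args[0]), false);
    }
}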
But the main issue for you is that the commit log doesn't contain the queries that were executed; it contains the data that was modified. What you really need is audit functionality, and here you have several choices:
It's built into the upcoming Cassandra 4.0 - see the documentation on how to use it
Use the ecAudit plugin open-sourced by Ericsson - it supports Cassandra 2.2, 3.0 & 3.11
If you use DataStax Enterprise (DSE), it has built-in support for audit logging
Here is a quick description of the system:
A Java 7 REST client receives JSONs and writes their parsed content into an H2 database via Hibernate.
Some Pentaho Kettle Spoon 4 ETLs directly connect to the same database to read and delete a lot of entries at once.
This solution worked fine in our test environment, but in production (where the traffic is, of course, much higher) the ETLs often fail with the following error:
Error inserting/updating row
General error: "java.lang.ArrayIndexOutOfBoundsException: -1"; SQL statement:
DELETE FROM TABLE_A
WHERE COLUMN_A < ? [50000-131]
and if I navigate the database I can indeed see that the table is not readable (apparently because it thinks its length is -1?). The error code 50000 is for "Generic", so it's of no use.
Apart from the trivial "maybe H2 is not good for an event handler", I've been thinking that the corruption could possibly be caused by a conflict between Kettle and Hibernate; in other words, that nothing should delete from a Hibernate-managed database behind Hibernate's back.
My questions to those more experienced than me with Hibernate are:
Is my supposition correct?
Should I re-design my solution to also use the same RESTful Hibernate service to perform deletes?
Should I give up on using H2 for such a system?
Thanks for the help!
EDIT:
The database is created by a simple sh script that runs the following command, which basically uses the provided Shell tool to connect to a non-existing DB, which by default creates it.
$JAVA_HOME/bin/java -cp *thisIsAPath*/h2database/h2/main/h2-1.3.168-redhat-2.jar org.h2.tools.Shell -user $DB_USER -password $DB_PASSWORD -url jdbc:h2:$DB_FOLDER/Temp_SD_DS_EventAgent<<END
So all its parameters are set to version 1.3.168's defaults. Unfortunately, while I can find the current URL setting, I can't find where to look up that version's defaults and experimental settings.
I also found the following:
According to the tutorial, "When using Hibernate, try to use the H2Dialect if possible," which I didn't.
The tutorial also says "Please note MVCC is enabled in version 1.4.x by default, when using the MVStore." Does that mean concurrency is disabled/unsupported by default in this older version, and that this is the problem?
The database is created with H2 version 1.3.168, but the consumer uses 1.4.197. Is this a big deal?
I cannot comment on the reliability of H2.
But from the application perspective, I think you should use a locking mechanism - optimistic or pessimistic locking. This will avoid the conflict situations. I hope this answer helps point you in the right direction.
Article on Optimistic and Pessimistic locking
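For illustration, here is a minimal sketch of optimistic locking with JPA/Hibernate - the entity and column names are made up, not taken from your schema. Hibernate adds the version column to the WHERE clause of every UPDATE, so a concurrent modification makes the statement match zero rows and the transaction fails instead of silently corrupting data:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class Event {

    @Id
    private Long id;

    private String payload;

    // Hibernate increments this on every update; a stale value raises
    // an OptimisticLockException instead of overwriting newer data.
    @Version
    private long version;

    // getters and setters omitted for brevity
}

Note that this only protects writes going through Hibernate; external tools like Kettle would have to respect the version column too.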
Currently I need to move my database from PostgreSQL to Google Cloud SQL. I use pg_cron to remove stale records like this:
SELECT cron.schedule('30 3 * * 6', $$DELETE FROM events WHERE event_time < now() - interval '1 week'$$);
I've read the following article: https://cloud.google.com/sql/docs/postgres/extensions
and found nothing related to pg_cron.
I've also read Does Cloud SQL Postgres have any cron-like tool?, but it looks like over-engineering for my task.
Is there a simpler way?
As of November 2021 pg_cron is available:
https://cloud.google.com/sql/docs/release-notes#November_19_2021
The flags to enable and configure pg_cron are described at the following link:
https://cloud.google.com/sql/docs/postgres/flags
Unfortunately, pg_cron is not supported by Cloud SQL. In order to run this extension you need to do so as SUPERUSER. As mentioned in the post you found, Cloud SQL is a fully-managed service. This means that while operations such as setting up, maintaining, managing, and administering your databases are handled for you, you don't have the SUPERUSER privilege.
There is an open Feature Request regarding this improvement but there is no estimated time of arrival for this.
If the workaround given in the same post is not suitable for you, you can always create a Compute Engine VM instance and set up PostgreSQL there. This will allow you to fully manage your database.
I want to capture Dropwizard Metrics of my Cassandra cluster in my Java program (I don't want to use JMX) and pass those values as JSON to some other server (which will use these values to generate alarms, etc.). I'm new to Java and would really appreciate some guidance. Are there any native Dropwizard APIs for collecting these metrics? Can you provide sample Java code that uses such an API to fetch a metric? The reason for not using JMX is that I've read here that it's not recommended to gather metrics from a production environment that way, as JMX's RPC API is fragile.
You can send metrics using the available plugins for the Metrics library, such as Graphite or Ganglia.
To do this, you need to put the .jar file for the corresponding plugin into Cassandra's lib directory, add the corresponding configuration file for the plugin, and add the following line to Cassandra's jvm.options file:
-Dcassandra.metricsReporterConfigFile=<reporting-configuration>.yaml
and restart Cassandra to pickup the changes.
There are several blog posts on configuring Cassandra to use custom metrics plugins that could provide more details: 1, 2.
You may also try to set up the standard Metrics Servlets to query the metrics - they are configured in almost the same way: add the library and provide the configuration.
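For reference, the sketch below shows roughly what such a reporter plugin does under the hood, using the plain Dropwizard metrics-graphite module. The host name and the registry here are placeholders; inside Cassandra, the plugin attaches a reporter like this to the server's own registry via the YAML file above:

import java.net.InetSocketAddress;
import java.util.concurrent.TimeUnit;

import com.codahale.metrics.MetricRegistry;
import com.codahale.metrics.graphite.Graphite;
import com.codahale.metrics.graphite.GraphiteReporter;

public class GraphiteReporterDemo {
    public static void main(String[] args) throws Exception {
        // A standalone registry just to show the wiring.
        MetricRegistry registry = new MetricRegistry();
        registry.counter("example.requests").inc();

        // graphite.example.com:2003 is a placeholder for your Graphite host.
        Graphite graphite = new Graphite(new InetSocketAddress("graphite.example.com", 2003));
        GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
                .prefixedWith("cassandra")
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build(graphite);
        reporter.start(30, TimeUnit.SECONDS); // push metrics every 30 seconds

        Thread.sleep(60000); // keep the demo alive long enough to report once
    }
}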
Now that H2 1.4 is out of beta, I'd like to migrate my old 1.3.175 database to 1.4.195.
Background info:
In the docs, database upgrade does not mention 1.4 yet.
The roadmap still lists "Automatic migration from 1.3 databases to 1.4." as "planned change".
The current state of MVStore is still labeled as "experimental".
So, what's the recommended way to migrate?
Additional aspects/bonus questions:
Should I enable MVStore or stick with PageStore (pros/cons)? Which one delivers better performance (multithreading is not important for me), which one better stability, especially resilience against OutOfMemoryErrors?
A database created with 1.3.175 can be read and opened with 1.4.195 without any additional work. H2 will automatically detect that it is using the Page Store and treat it as such. There will be no problems with doing this.
The advantage to doing this is that while the MVStore was being developed, the Page Store continued to receive performance improvements and bug fixes. Consequently H2 with the Page Store has become an extremely stable database store.
There is as yet no automatic upgrade procedure for converting a database from using the Page Store to using the MVStore. If you do want to do this, you'll need to do it manually. With the latest H2 Jar, use H2's SCRIPT command to export SQL from your 1.3 database, then use RUNSCRIPT into a freshly created db with 1.4.195.
If your H2 JDBC URL doesn't explicitly specify ;mv_store=false (e.g. jdbc:h2:~/mydb;mv_store=false), note that H2 will first look to see if a page store database already exists. If it doesn't, then it will create an MVStore database. This will appear seamless to you, your app, and your users. The only superficial difference you'll notice is that the database file on disk has a different file extension.
Finally, a suggestion. If your customer databases are large, consider only using the page store. I'm a heavy user of H2. (My commercial product built on H2 has thousands of users who typically have databases many gigabytes in size.) I still use the page store for all my customers, even though I use the latest H2 Jar. There are still some performance issues with the MVStore that start to appear as databases get large. With time, I expect the cause of the problems to be identified and fixed.
@Steve McLeod's answer is on point. For completeness, here are the exact commands:
//Do a backup of current .h2.db file
//Connect to current DB with same URL as always
SCRIPT TO 'fileName'
//Rename the .h2.db to something else, possibly another backup
//Connect to database with same URL as before. The new MVStore engine will be chosen by default, and the .mv.db file will be created
RUNSCRIPT FROM 'fileName'
Documentation, H2 Grammar
Moreover, if you prefer using the H2 jar for this, refer to Thomas's answers (1 and 2). Concretely, the corresponding classes are org.h2.tools.Script and org.h2.tools.RunScript.
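If you'd rather drive it from Java than type SQL in a console, a minimal sketch using those two tool classes follows; the URLs, user, and file name are placeholders:

import org.h2.tools.RunScript;
import org.h2.tools.Script;

public class H2StoreMigration {
    public static void main(String[] args) throws Exception {
        // Export everything from the old page-store database as SQL
        // (run with the new 1.4 jar on the classpath).
        Script.main("-url", "jdbc:h2:/data/olddb", "-user", "sa",
                "-password", "", "-script", "backup.sql");

        // Replay the SQL into a new database; with a 1.4 jar and no
        // ;mv_store=false in the URL, the new database uses the MVStore.
        RunScript.main("-url", "jdbc:h2:/data/newdb", "-user", "sa",
                "-password", "", "-script", "backup.sql");
    }
}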
To highlight another alternative for similar requests, I would like to mention a tool which allows automated migration of an old H2 database into a new H2 database:
https://github.com/manticore-projects/H2MigrationTool
I'm looking for a general solution for upgrading database schema with ORM tools, like JPOX or Hibernate. How do you do it in your projects?
The first solution that comes to my mind is to create my own mechanism for upgrading databases, with SQL scripts doing all the work. But in this case I'll have to remember to create new scripts every time the object mappings are updated. And I'll still have to deal with low-level SQL queries, instead of just defining mappings and letting the ORM tools do all the work...
So the question is how to do it properly. Maybe some tools simplify this task (for example, I heard that Rails has such a mechanism built in); if so, please help me decide which ORM tool to choose for my next Java project.
LiquiBase is an interesting open source library for handling database refactorings (upgrades). I have not used it, but will definitely give it a try on my next project where I need to upgrade a db schema.
I don't see why ORM-generated schemas are any different from other DB schemas - the problem is the same. Assuming your ORM will spit out a generation script, you can use an external tool to do the diff.
I've not tried it, but Google came back with SQLCompare as one option - I'm sure there are others.
We hand code SQL update scripts and we tear down the schema and rebuild it applying the update scripts as part of our continuous build process. If any hibernate mappings do not match the schema, the build will fail.
You can check this feature comparison of some database schema upgrade tools.
A comparison of the number of questions tagged on Stack Overflow for some of those tools:
mybatis (1049 questions tagged)
Liquibase (663 questions tagged)
Flyway (400 questions tagged)
DBDeploy (24 questions tagged).
DbMaintain can also help here.
I think your best bet is to use an ORM-tool that includes database migration like SubSonic:
http://subsonicproject.com/2-1-pakala/subsonic-using-migrations/
We ended up making update scripts each time we changed the database. So there's a script from version 10 to 11, from 11 to 12, etc. Then we can run any consecutive set of scripts to go from any existing version to the new version. We stored the current version in the database so we could detect it on startup; a sketch of the mechanism follows below.
Yes, this involved database-specific code - one of the main problems with Hibernate!
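A minimal sketch of that versioning loop, assuming JDBC, an H2 URL, and a hypothetical schema_version table (all of these are illustrative, not from the original setup):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class SchemaMigrator {
    static final int TARGET_VERSION = 12; // the version this release expects

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:h2:./appdb", "sa", "")) {
            // Read the version currently stored in the database.
            int current;
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT version FROM schema_version")) {
                rs.next();
                current = rs.getInt(1);
            }
            // Apply consecutive scripts, e.g. upgrade-10-to-11.sql, upgrade-11-to-12.sql,
            // bumping the stored version after each one.
            for (int v = current; v < TARGET_VERSION; v++) {
                String sql = new String(Files.readAllBytes(
                        Paths.get("upgrade-" + v + "-to-" + (v + 1) + ".sql")));
                try (Statement st = conn.createStatement()) {
                    st.execute(sql); // assumes the driver accepts multi-statement scripts
                    st.executeUpdate("UPDATE schema_version SET version = " + (v + 1));
                }
            }
        }
    }
}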
When working with Hibernate, I use an installer class that runs from the command line and has options for creating the database schema, inserting base data, and dynamically updating the database schema using SchemaUpdate. I find it to be extremely useful. It also gives me a place to put one-off scripts that will be run when a new version is launched, for example to populate a new field in an existing DB table.
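For reference, the core of such an installer is small. This sketch uses the pre-5.x Hibernate API (SchemaUpdate moved packages in Hibernate 5, so adjust for newer versions):

import org.hibernate.cfg.Configuration;
import org.hibernate.tool.hbm2ddl.SchemaUpdate;

public class DbInstaller {
    public static void main(String[] args) {
        // Reads hibernate.cfg.xml (and the mapped entities) from the classpath.
        Configuration cfg = new Configuration().configure();

        // First flag: print the generated DDL; second flag: execute it against the DB.
        new SchemaUpdate(cfg).execute(true, true);
    }
}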