Getting events from a database - java

I am not very familiar with databases and what they offer outside of the CRUD operations.
My research has led me to triggers. Basically it looks like triggers offer this type of functionality:
(from Wikipedia)
There are typically three triggering events that cause triggers to "fire":
INSERT event (as a new record is being inserted into the database).
UPDATE event (as a record is being changed).
DELETE event (as a record is being deleted).
My question is: is there some way I can be notified in Java (preferably including the data that changed) by the database when a record is Updated/Deleted/Inserted using some sort of trigger semantics?
What might be some alternate solutions to this problem? How can I listen to database events?
The main reason I want to do this is a scenario like this:
I have 5 client applications all in different processes/existing across different PCs. They all share a common database (Postgres in this case).
Let's say one client changes a record in the DB that all 5 of the clients are "interested" in. I am trying to think of ways for the clients to be "notified" of the change (preferably with the affected data attached) instead of them querying for the data at some interval.

Using Oracle you can set up a trigger on a table and then have the trigger send a JMS message. Oracle has two different JMS implementations. You can then have a process that will 'listen' for the message using the JDBC driver. I have used this method to push changes out to my application instead of polling.
If you are using a Java database (H2) you have additional options. In my current application (SIEM) I have triggers in H2 that publish change events using JMX.

Don't mix up the database (which contains the data), and events on that data.
Triggers are one way, but normally you will have a persistence layer in your application. This layer can choose to fire off events when certain things happen - say to a JMS topic.
Triggers are a last-ditch option, because at that point you're operating on relational items rather than "events" on the data. (For example, an "update" could in reality map to a "company changed legal name" event.) If you rely on the db, you'll have to map the inserts and updates back to real-life events, which you already knew about!
You can then layer other stuff on top of these notifications - like event stream processing - to find events that others are interested in.
James
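A minimal sketch of the idea above, in which the persistence layer itself announces a business-level event instead of leaving consumers to decode row changes. The class names (CompanyRenamed, CompanyRepository) are hypothetical, an in-memory map stands in for the database, and plain listeners stand in for a JMS topic:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** A business-level event, as opposed to a raw "row updated" notification. */
final class CompanyRenamed {
    final long companyId;
    final String oldName;
    final String newName;

    CompanyRenamed(long companyId, String oldName, String newName) {
        this.companyId = companyId;
        this.oldName = oldName;
        this.newName = newName;
    }
}

/** Hypothetical persistence layer that publishes an event after each rename. */
final class CompanyRepository {
    // Stands in for the database table in this sketch.
    private final Map<Long, String> names = new HashMap<>();
    // Stands in for a JMS topic; listeners are called synchronously.
    private final List<Consumer<CompanyRenamed>> listeners = new CopyOnWriteArrayList<>();

    void onRenamed(Consumer<CompanyRenamed> listener) {
        listeners.add(listener);
    }

    void create(long id, String name) {
        names.put(id, name);
    }

    void rename(long id, String newName) {
        // In a real application this would be an UPDATE statement.
        String oldName = names.put(id, newName);
        CompanyRenamed event = new CompanyRenamed(id, oldName, newName);
        for (Consumer<CompanyRenamed> listener : listeners) {
            listener.accept(event);
        }
    }
}
```

A client would register with repo.onRenamed(...) and receive the old and new name directly, without having to work out what an UPDATE on the company table meant.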

Hmm. So you're using PostgreSQL and you want to "listen" for events and be "notified" when they occur?
http://www.postgresql.org/docs/8.3/static/sql-listen.html
http://www.postgresql.org/docs/8.3/static/sql-notify.html
Hope this helps!
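A rough sketch of how a Java client might consume these notifications. Standard JDBC has no call for receiving them, so the driver-specific part is hidden behind a hypothetical NotificationSource interface; with the PostgreSQL JDBC driver it would typically be backed by PGConnection.getNotifications. The channel name and the class names are assumptions. Note that NOTIFY carries a payload only from PostgreSQL 9.0 onward; with the 8.3 behavior described in the links above, the client learns that something changed and has to query for the details.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical adapter for the driver-specific call that fetches pending
 * notifications. With the PostgreSQL JDBC driver it would typically wrap
 * conn.unwrap(org.postgresql.PGConnection.class).getNotifications(timeoutMillis)
 * and return each PGNotification's getParameter() value, mapping a null
 * result to an empty list.
 */
interface NotificationSource {
    List<String> poll(int timeoutMillis) throws SQLException;
}

final class ChannelListener {
    private ChannelListener() {}

    /** Subscribes this session to a channel. LISTEN takes an identifier, not a bind parameter. */
    static void listen(Connection conn, String channel) throws SQLException {
        // Validate the name because it is concatenated into the statement.
        if (!channel.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Invalid channel name: " + channel);
        }
        try (Statement st = conn.createStatement()) {
            st.execute("LISTEN " + channel);
        }
    }

    /** Waits up to timeoutMillis, hands each payload to the handler, and returns how many arrived. */
    static int dispatchOnce(NotificationSource source, int timeoutMillis,
                            Consumer<String> handler) throws SQLException {
        List<String> payloads = source.poll(timeoutMillis);
        for (String payload : payloads) {
            handler.accept(payload);
        }
        return payloads.size();
    }
}
```

The client would call listen once on a dedicated connection and then call dispatchOnce in a loop. On the database side, a trigger (or any session) raises the event with NOTIFY on the same channel.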

Calling external processes from the database is very vendor specific.
Just off the top of my head:
SQLServer can call CLR programs from triggers,
postgresql can call arbitrary C functions loaded dynamically,
MySQL can call arbitrary C functions, but they must be compiled in,
Sybase can make system calls if set up to do so.

The simplest thing to do is to have the insert/update/delete triggers make an entry in some log table, and have your Java program monitor that table. Good columns to have in your log table would be things like EVENT_CODE, LOG_DATETIME, and LOG_MSG.
Unless you require very high performance or need to handle 100Ks of records, that is probably sufficient.
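A minimal sketch of the monitoring side of this approach. The trigger is not shown; the poller remembers the highest id it has processed and asks only for newer rows. The table name change_log and the auto-incrementing id column are assumptions added for illustration (the answer names only EVENT_CODE, LOG_DATETIME, and LOG_MSG), and LogEntry and LogTablePoller are hypothetical names:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** One row of the log table written by the triggers. */
final class LogEntry {
    final long id;
    final String eventCode;
    final String message;

    LogEntry(long id, String eventCode, String message) {
        this.id = id;
        this.eventCode = eventCode;
        this.message = message;
    }
}

/** Polls the log table and returns only rows it has not seen before. */
final class LogTablePoller {
    // ASSUMPTION: table "change_log" with an increasing "id" column is hypothetical.
    private static final String QUERY =
        "SELECT id, event_code, log_msg FROM change_log WHERE id > ? ORDER BY id";

    private long lastSeenId;

    LogTablePoller(long startAfterId) {
        this.lastSeenId = startAfterId;
    }

    List<LogEntry> poll(Connection conn) throws SQLException {
        List<LogEntry> fresh = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setLong(1, lastSeenId);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    LogEntry entry = new LogEntry(
                        rs.getLong("id"), rs.getString("event_code"), rs.getString("log_msg"));
                    fresh.add(entry);
                    lastSeenId = entry.id; // advance the high-water mark
                }
            }
        }
        return fresh;
    }

    long lastSeenId() {
        return lastSeenId;
    }
}
```

Each client keeps its own LogTablePoller and calls poll on a timer. The high-water mark lives only in memory here, so a restarted client would need to persist it or start again from a known id.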

I think you're confusing two things. They are both highly db vendor specific.
The first I shall call "triggers". I am sure there is at least one DB vendor who thinks triggers are different than this, but bear with me. A trigger is a server-side piece of code that can be attached to a table. For instance, you could run a PSQL stored procedure on every update in table X. Some databases allow you to write these in real programming languages, others only in their variant of SQL. Triggers are typically reasonably fast and scalable.
The other I shall call "events". These are triggers that fire in the database and allow you to define an event handler in your client program, i.e., any time there are updates to the clients database, fire updateClientsList in your program. For instance, using Python and Firebird, see http://www.firebirdsql.org/devel/python/docs/3.3.0/beyond-python-db-api.html#database-event-notification
I believe the previous suggestion to use a monitor is an equivalent way to implement this using some other database. Maybe Oracle? MSSQL Notification Services, mentioned in another answer, is another implementation of this as well.
I would go so far as to say you'd better REALLY know why you want the database to notify your client program, otherwise you should stick with server side triggers.

What you're asking completely depends on both the database you're using and the framework you're using to communicate with your database.
If you're using something like Hibernate as your persistence layer, it has a set of listeners and interceptors that you can use to monitor records going in and out of the database.

There are a few different techniques here depending on the database you're using. One idea is to poll the database (which I'm sure you're trying to avoid). Basically you could check for changes every so often.
Another solution (if you're using SQL Server 2005) is to use Notification Services, although this technology is supposedly being replaced in SQL 2008 (we haven't seen a pure replacement yet, but Microsoft has talked about it publicly).

This is usually what the standard client/server application is for. If all inserts/updates/deletes go through the server application, which then modifies the database, then client applications can find out much more easily what changes were made.

If you are using PostgreSQL, a JDBC client can listen for its notifications.

I would suggest using a timestamp column, last updated, together with possibly the user updating the record, and then let the clients check their local record timestamp against that of the persisted record.
The added complexity of adding a callback/trigger functionality is just not worth it in my opinion, unless supported by the database backend and the client library used, like for instance the notification services offered for SQL Server 2005 used together with ADO.NET.

Related

Real time data consumption from mysql

I have a use case in which my data is present in Mysql.
For each new row insert in Mysql, I have to perform analytics for the new data.
How I am currently solving this problem is:
My application is a Spring Boot application, in which I use a scheduler that checks for new rows in the database every 2 seconds.
The problem with the current approach is:
Even if there is no new data available in the MySQL table, the scheduler fires a MySQL query to check whether new data is available.
One way to solve this type of problem in any SQL database is triggers.
But so far I have not succeeded in creating MySQL triggers that can call a Java-based Spring application or a simple Java application.
My question is:
Is there any better way to solve my use case? I am even open to changing to another storage (database) system if one is built for this type of use case.
This fundamentally sounds like an architecture issue. You're essentially using a database as an API which, as you can see, causes all kinds of issues. Ideally, this db would be wrapped in a service that can manage the notification of systems that need to be notified. Let's look at a few different options going forward.
Continue to poll
You didn't outline what the actual issue is with your current polling approach. Is running the job when it's not needed causing an issue of some kind? I'd be a proponent for just leaving it unless you're interested in making a larger change.
Database Trigger
While I'm unaware of a way to launch a java process via a db trigger, you can do an HTTP POST from one. With that in mind, you can have your batch job staged in a web app that uses a POST to launch the job when the trigger fires.
Wrap existing datastore in a service
This is, IMHO, the best option. It gives you a system of record that provides an API that can be versioned, etc. Any logic around who to notify would also be encapsulated in this service.
Replace data store with something that allows for better notifications
Without any real information on what the data being stored is, it's hard to say how practical this is. But using something such as Apache Kafka or Apache Geode would both be options that provide the ability to be notified when new data is persisted (Kafka by listening to the topic, Geode via a continuous query).
For the record, I'd advocate for the wrapping of the existing database in a service. That service would be the only way into the db and take on responsibility for any notifications required.
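For the "Database Trigger" option above, the receiving end could be sketched as a tiny HTTP endpoint built on the JDK's bundled com.sun.net.httpserver. This is only an illustration: the path /row-inserted, the class name TriggerWebhook, and the idea that the request body identifies the new row are all assumptions, and how the trigger issues the POST on the database side is not covered here.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

final class TriggerWebhook {
    private TriggerWebhook() {}

    /**
     * Starts an HTTP server on the given port (0 picks a free port).
     * Each POST to /row-inserted passes the request body to onRow.
     */
    static HttpServer start(int port, Consumer<String> onRow) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/row-inserted", exchange -> {
            try {
                if (!"POST".equals(exchange.getRequestMethod())) {
                    exchange.sendResponseHeaders(405, -1); // method not allowed, no body
                    return;
                }
                String body;
                try (InputStream in = exchange.getRequestBody()) {
                    body = new String(in.readAllBytes(), StandardCharsets.UTF_8);
                }
                onRow.accept(body); // run the analytics job for this row
                exchange.sendResponseHeaders(204, -1); // success, no body
            } finally {
                exchange.close();
            }
        });
        server.start();
        return server;
    }
}
```

In this sketch the job runs on the request thread, so whatever issued the POST waits until it finishes. If the analytics are slow, handing the body to an executor and responding immediately would keep the insert from being held up.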

Can you place a listener on a db table or similar

I have a Java backend system. We need to integrate with a third party. I need to return results to a client. Currently we are using a view (SQL Server) that the third party writes to, and we keep a tracker of the unique ID somewhere.
I have a Spring-wired poller that runs every 10 minutes; it returns everything from the last row sent up to the end and updates the tracker table with the new ID. Nothing complicated.
I would like to know if there is a simple way to almost "listen" on the table. If any new rows are added, grab them and return them via my service. And if there is, is it advisable?
Disclaimer: I have not exactly done this, yet, and SQL Server is not my playground. However, combining triggers with non-SQL commands should be possible nowadays, and searching the internet for 'sql server notification' yields a section on 'Query Notifications in SQL Server':
https://msdn.microsoft.com/en-us/library/t9x04ed2(v=vs.110).aspx
In general, as long as it is possible to somehow send a command to a socket from inside a trigger then you could use a PUB-SUB message queue (RabbitMQ, NSQ etc.) to send a notification that you can retrieve in your Java program. Of course, you would have to install triggers on any tables that you want to monitor. Whether it is possible to monitor a schema (or database) for changes in general - this might only be possible if there is some logging inside the database that you have access to. This might not be there, out of the box. The trigger is probably the cleaner way because it resides in the schema/database itself and does not need to access system tables.
EDIT: also found this SO question/answer about socket connection inside a trigger: Creating socket inside a SQL-CLR trigger or stored procedure
You could use triggers on the view, but that won't help in java land as most DBs aren't going to let you run java code there. Your choices are: 1) polling like you are doing or 2) create a web service (or JMS queue or something) that the third party calls to push the update/insert of data. Then you are in java land and Hibernate/Spring can handle the insert and do whatever processing you need.
You could use triggers on the tables, if performance is not an issue

Multiprocessing on web hosting

I have a java dynamic web app. I am exposing RESTful webservices for my android application.
The thing is that there are some services that do DB updates. Now, I want to host the application on public domain. I was wondering how parallel processing works on web hosting.
Say my service /updateDB updates the database. If two users hit the same service at the same time, will the two requests run concurrently? That would cause inconsistency in the data. How exactly does the whole thing work?
Do I need to take care of synchronisation in my code?
What kind of database are you using?
Certain database engines already have mechanisms in place to allow a transaction to be completed before another request overwrites data. Most web developers do not have to worry about this because the application server (WebSphere, WebLogic) and database (MySQL, Oracle) take care of these things for you.
(I am going to overly simplify this for you.)
A request to the web service may perform one or more actions on the DB. These actions can be grouped together and called a transaction. A transaction can include one or more INSERT, UPDATE, DELETE, etc. statements. For example, when a new customer registers for your web service, the following actions take place, and together they can be treated as one transaction:
Insert a new customer username password in the Customer table
Insert customers address in Address table
Update total customer count in Summary table
All the above actions can be completed as one transaction. If any of them fails, all of the actions are rolled back automatically. Similarly, if two customers register simultaneously, the database makes sure that they do not overwrite each other's data.
We can configure the database to make sure that every transaction should be completed before another transaction can dirty the data in a row.
In a database they are called ACID properties.
A - Atomicity - Every transaction must complete as a whole; if anything in a transaction fails, the transaction is not completed and every previous action within it is reverted.
C - Consistency - Every transaction updates the database in a predefined manner, e.g. after every customer registration, all the actions within it have been executed.
I - Isolation - If more than one request comes in at the same time, their transactions are executed so that they do not interfere with each other.
D - Durability - After a transaction completes, its changes remain in place.
For example Mysql Database with the InnoDB engine supports this. There are other databases which support this as well.
You can read more here
http://java.dzone.com/articles/beginners-guide-acid-and
This is a very vast topic in databases.
Programming languages have APIs that help you write code in this manner. But the basic takeaway is that databases and application servers will do most of the work for you. You just have to design the code structure so that it identifies transactions and commits them appropriately.
Java and other programming languages are aware of ACID properties in DB and will help you achieve that goal.
Read more here about how you use Java to achieve things we mentioned above.
http://docs.oracle.com/javase/tutorial/jdbc/basics/transactions.html
Similarly other languages have similar functionality and APIs.
Search Google for "java database transaction" or "<your favorite language> database transaction".
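As a rough illustration of the customer registration example, here is how the three actions might be grouped into one JDBC transaction. The table and column names are made up for this sketch, and TxWork and Transactions are hypothetical helper names; only the java.sql calls (setAutoCommit, commit, rollback) are standard JDBC.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

/** Work to run inside one transaction; a hypothetical helper type for this sketch. */
interface TxWork {
    void run(Connection conn) throws SQLException;
}

final class Transactions {
    private Transactions() {}

    /** Runs the work as one unit: commit if it succeeds, roll back if it throws. */
    static void inTransaction(Connection conn, TxWork work) throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start grouping statements
        try {
            work.run(conn);
            conn.commit();
        } catch (SQLException | RuntimeException e) {
            conn.rollback(); // undo every statement issued so far
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }

    /** The three registration steps from the example, executed atomically. */
    static void registerCustomer(Connection conn, String username,
                                 String passwordHash, String address) throws SQLException {
        inTransaction(conn, c -> {
            // ASSUMPTION: table and column names are illustrative only.
            update(c, "INSERT INTO customer (username, password_hash) VALUES (?, ?)",
                   username, passwordHash);
            update(c, "INSERT INTO address (username, address) VALUES (?, ?)",
                   username, address);
            update(c, "UPDATE summary SET total_customers = total_customers + 1");
        });
    }

    private static void update(Connection c, String sql, String... params) throws SQLException {
        try (PreparedStatement ps = c.prepareStatement(sql)) {
            for (int i = 0; i < params.length; i++) {
                ps.setString(i + 1, params[i]);
            }
            ps.executeUpdate();
        }
    }
}
```

Two simultaneous calls to /updateDB would each use their own connection and therefore their own transaction. How strictly the database keeps them apart depends on the isolation level configured for the connection.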

How to detect database events with, for example, Java

Is there a way to detect database events, e.g. insert, update and delete, comparable to file access monitors like JNotify (can detect read, create, modify of files and directories)?
Looking for something like database event listeners because I don't want to do polling.
Thanks!
In general no. AFAIK, no such facility exists in JDBC or in the SQL standards.
It might be possible for certain databases / configurations using database specific functionality. For example, if the database can be configured to run arbitrary Java code in a trigger, you might be able to get it to send an event into a pubsub system that will deliver it to your application code.
But I think it would be better to modify your application code-base to generate the events itself.
It depends on the database you are using. Some let you run code from a trigger that makes a call to a server, but then you take a performance hit on these modifications, so you would want to call a web service that doesn't send any information back, to limit the performance hit.
It also depends on if you are using a server, as then you could use AspectJ to monitor any of the update queries.

PostgreSQL and JMS (Or other Pub-Sub/Callback Mechanism)

I want to have my PostgreSQL server send out notifications when a piece of data changes, preferably over JMS, but also considering any other Pub-Sub mechanism or Callback.
Any ideas if this is possible?
Are there any available Java Add-on Packages that replicate this sort of functionality?
EDIT: I've been informed that PostgreSQL does support stored procedures in Java. That means the following approach becomes feasible:
Essentially, the way I would go is to put a trigger on whatever it is you want to watch, and then call a stored procedure from that. The stored procedure then needs to communicate with the world outside the DB server; I once did an SP like this in Java that opened up a socket connection to a process on the same server listening on a port. If worst came to worst, you could maybe write a file and have something like imon monitoring that file, or you could start up a program in an exec() shell of its own... something like that.
The simplest approach is to use the LISTEN/NOTIFY interface: write your own program that connects to the database, issues some LISTENs, and does whatever you want when it gets a notification, for example sending information over JMS, or simply doing what should be done without adding an additional transport layer.
You can certainly create a Java-language stored procedure and put it into PostgreSQL. But why not keep it simple and debuggable until you know you have your messaging scheme working perfectly? If I were doing this (I am actually doing something similar) here's what I'd do.
(1) create an "outbound message" table with columns for the payload and other info for your JMS messages. I'd put a timestamp column in each row.
(2) write a database trigger for each item that you want to generate a message. Have the trigger INSERT a row into your "outbound message" table.
(3) unit test (1) and (2) by looking at the contents of your outbound message table as you change stuff in your database that should generate messages.
(4) write yourself a simple but high-performance Java JDBC client program that will query this outbound message table, send a JMS message for each row, and then DELETE it. Order the rows in your query by timestamp to preserve your message order. To get it to be high performance you'll need to do a good job with PreparedStatement objects and other aspects of heap management.
(5) Unit test (4) by running it a few times while message-generating changes are happening to your data base.
(6) set up this program to repeat operation (4) several times a minute, while using a single persistent JDBC connection. Queries to a small or empty table aren't very expensive, so this won't smack down your table server.
(7) system test this whole setup.
(8) figure out how to start your Java program from your crontab or your startup script.
When you get all this working you'll have a functioning messaging / notification system ready for systems integration. More importantly, you'll know exactly what you want your Java message-originating software to do. Once you're up and running, if the latency of your messages or the database overhead proves to be a big problem, then you can migrate your Java program into a stored procedure.
Note also that there are message-origination bindings for Perl and other languages built into the Apache ActiveMQ package, so you have some choices about how you implement your message-originating agent.
This approach happens to have two advantages: you aren't critically dependent on postgreSQL's distinctive stored-procedure scheme, and you aren't putting code with external communications dependencies into your table server.
Good luck.
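Step (4) above might look roughly like this. It is a sketch under assumptions, not the answerer's program: the table outbound_message and its id, payload, and created_at columns are invented names, MessageSender is a hypothetical stand-in for a JMS producer, and the connection is assumed to be in autocommit mode so that each DELETE takes effect immediately.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

/** Hypothetical stand-in for a JMS producer (or any other transport). */
interface MessageSender {
    void send(String payload) throws Exception;
}

final class OutboxDrainer {
    // ASSUMPTION: table and column names are illustrative, not from the answer.
    private static final String SELECT_SQL =
        "SELECT id, payload FROM outbound_message ORDER BY created_at, id";
    private static final String DELETE_SQL =
        "DELETE FROM outbound_message WHERE id = ?";

    private OutboxDrainer() {}

    /**
     * Sends every pending row in timestamp order and deletes each row after
     * it has been sent. Returns the number of messages sent.
     */
    static int drain(Connection conn, MessageSender sender) throws Exception {
        int sent = 0;
        try (PreparedStatement select = conn.prepareStatement(SELECT_SQL);
             PreparedStatement delete = conn.prepareStatement(DELETE_SQL);
             ResultSet rs = select.executeQuery()) {
            while (rs.next()) {
                sender.send(rs.getString("payload"));
                // Delete only after a successful send, so a failure leaves the row queued.
                delete.setLong(1, rs.getLong("id"));
                delete.executeUpdate();
                sent++;
            }
        }
        return sent;
    }
}
```

Because a row is deleted only after it is sent, a crash between the two steps means the message goes out again on the next run, so consumers should tolerate duplicates. For step (6), drain could be scheduled with a ScheduledExecutorService using scheduleWithFixedDelay on one long-lived connection.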
If LISTEN/NOTIFY isn't accessible via JDBC, perhaps you could implement a long-polling HTTP comet-like mechanism via the LOCK statement, or plain "SELECT ... FOR UPDATE" and "SELECT ... FOR SHARE" or other similar queries from within a transaction that'd cause other transactions to block.
The message-writing party could e.g. start a transaction, perform "SELECT ... FOR UPDATE", wait (java code) until either something changes, or a timer expires (say after 30 seconds or so), update the locked row to indicate if data (elsewhere?) is available and commit the transaction to unblock others. Then repeat with a new transaction with "SELECT ... FOR UPDATE" immediately.
The message-reading party would perform a "SELECT ... FOR SHARE" which would block while a "SELECT ... FOR UPDATE" initiated elsewhere is active. It'd return an indication of message availability or the message data itself when the message-writing party's transaction ends.
Hopefully PostgreSQL queues the parties fairly, so that there's no risk of live-lock continuously blocking the message-reading party.
I would install PL/Java to Postgres and write a stored procedure based trigger for the data you are interested in, which then calls JMS when being called. PL/Java documentation covers the trigger + stored procedure part pretty nicely btw.
I haven't used the JMS from the trigger code, but I'm pretty certain that there are no reasons why it wouldn't be doable, as this is standard Java code and my quick recheck on the documentation also didn't indicate anything suspicious.
Another possibility would be to call JMS through a proxy service using Perl, Python, or any other language that is available for Postgres stored procedure development. Since JMS doesn't have a standard wire protocol, you have to write a proxy service that does the translation.
Since the original question mentions JMS, consider using Apache ActiveMQ. ActiveMQ can use an SQL database for message persistence and supports Postgres in this way, see:
https://activemq.apache.org/jdbc-support
If you don't want to run the ActiveMQ broker as a separate service, it can be run in embedded mode as described here:
https://activemq.apache.org/how-do-i-embed-a-broker-inside-a-connection
Though I'm not sure if there are any limitations when running it embedded
