Spring and Mixing SQL and NoSQL db

The answers to my previous post encouraged me to mix SQL and NoSQL databases.
What is the best-practice implementation of two databases in terms of application context configuration and DAO creation?
Let's choose Derby as the SQL db and Cassandra as the other one. What I am searching for is e.g. an example appcontext.xml, two DAOs, one implementing CRUD on Derby and the other one on Cassandra, and one (not two) sample unit test using both DAOs simultaneously.
Tutorials, sample (maven ;-) ) projects, book recommendations etc. welcome.

Try Spring Data: http://www.springsource.org/spring-data (an introductory reference video here)
I think Spring Data JPA provides a repository programming model that starts with an interface per managed domain object. Maybe we can switch or point a domain to different data stores. I haven't tried this myself. You can check getting-started-with-spring-data-jpa.
But combining NoSQL and relational stores will be complex. One can persist certain types of data into one store or the other based on how the data is segregated. I am not sure transaction isolation is possible across multiple data stores, say if you store documents in one store and relational data that requires data integrity in another.
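
As a rough illustration of the "interface per domain object" idea, here is a minimal sketch in plain Java. It is not Spring Data code: CrudDao, CustomerDao, EventDao and CustomerActivityService are hypothetical names, and both DAOs are in-memory stand-ins for what would be a Derby-backed and a Cassandra-backed implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Store-neutral contract; callers never see which database is behind it.
interface CrudDao<K, V> {
    void save(K id, V value);
    Optional<V> findById(K id);
    void delete(K id);
}

// Stand-in for a Derby-backed DAO (a real one would use JDBC or JPA).
class CustomerDao implements CrudDao<Long, String> {
    private final Map<Long, String> rows = new HashMap<>();
    public void save(Long id, String name) { rows.put(id, name); }
    public Optional<String> findById(Long id) { return Optional.ofNullable(rows.get(id)); }
    public void delete(Long id) { rows.remove(id); }
}

// Stand-in for a Cassandra-backed DAO (a real one would use a Cassandra driver).
class EventDao implements CrudDao<String, String> {
    private final Map<String, String> columns = new HashMap<>();
    public void save(String key, String payload) { columns.put(key, payload); }
    public Optional<String> findById(String key) { return Optional.ofNullable(columns.get(key)); }
    public void delete(String key) { columns.remove(key); }
}

// Uses both DAOs side by side; note there is no shared transaction between them.
class CustomerActivityService {
    private final CrudDao<Long, String> customers;
    private final CrudDao<String, String> events;

    CustomerActivityService(CrudDao<Long, String> customers, CrudDao<String, String> events) {
        this.customers = customers;
        this.events = events;
    }

    void register(long id, String name) {
        customers.save(id, name);               // relational side
        events.save("registered:" + id, name);  // NoSQL side
    }

    String describe(long id) {
        String name = customers.findById(id).orElse("unknown");
        String event = events.findById("registered:" + id).isPresent() ? "registered" : "no events";
        return name + " (" + event + ")";
    }
}
```

Because the service only depends on the interface, a single unit test can wire in both DAOs at once. If the second save fails, the first one is not rolled back, which is the isolation concern mentioned above.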

Related

Create schema and tables on demand at the runtime

So as the title suggests - I need to create an application (preferably Spring Boot), which will create schemas and tables based on user input. Basically, a rest endpoint will be offered to the clients where they would upload their data model in json format. I'll be parsing the json and constructing the db artifacts (schema and tables) in runtime. And once all the tables are created, provide a rest endpoint (with unique identifier), to the client, to perform CRUD operations on their schema.
The approach I am considering currently is -
Create a super user in the db before deploying the app, which will have privileges to create new schemas and databases
Create prepared statements to invoke schema/table creation on demand. The prepared statements will have place holders to take the schema name and table definition.
After proper authentication, allow users to upload their data model definition in json.
Clean the json and invoke the schema/table creation prepared statements.
Few questions that I had in mind -
Since all these DB operations will be invoked from a single super user's account, is it safe ?
The schemas and tables will be realized using native SQL queries instead of Hibernate's ORM capabilities. Is it safe/efficient ?
For the CRUD operations, is it possible to switch the db connection from super user to the client specific schema created in the earlier steps ? Or should I continue using the same super user for the CRUD operations?
It would be nice if it is possible to switch schemas in runtime using Hibernate/Spring-Boot.
What I would like is a general approach to this problem. I do not need any code.

A typical web application already has permissions to DELETE all the data for all the users.
JPA makes your queries slower, not faster. JPA can help with caching, but it doesn’t seem you need this.
Yes, you can have multiple datasources in Spring Boot. Look at this for example: https://www.baeldung.com/spring-abstract-routing-data-source
Be aware that your database might not like having millions of tables. Query planning, maintenance jobs, backups, etc. all incur performance penalties. Basically, databases are not designed for your use case.
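
One detail worth adding to the plan in the question: JDBC placeholders bind values, not identifiers, so schema and table names cannot be passed as prepared-statement parameters. They have to be validated and then concatenated into the DDL. Below is a minimal sketch of that validation step. DdlBuilder, the identifier pattern and the list of allowed column types are assumptions for illustration, not part of any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

class DdlBuilder {
    // Allow-list: a lowercase letter first, then letters, digits or underscores (max 30 chars).
    private static final Pattern IDENTIFIER = Pattern.compile("[a-z][a-z0-9_]{0,29}");
    // Hypothetical set of column types that clients may request.
    private static final Set<String> ALLOWED_TYPES =
            Set.of("INTEGER", "BIGINT", "VARCHAR(255)", "BOOLEAN", "TIMESTAMP");

    // Rejects anything that is not a plain identifier, e.g. "x; DROP TABLE users".
    static String requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Illegal identifier: " + name);
        }
        return name;
    }

    static String createSchema(String schema) {
        return "CREATE SCHEMA " + requireIdentifier(schema);
    }

    // Builds a CREATE TABLE statement from a column-name to column-type map.
    static String createTable(String schema, String table, Map<String, String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("A table needs at least one column");
        }
        StringBuilder sql = new StringBuilder("CREATE TABLE ")
                .append(requireIdentifier(schema)).append('.')
                .append(requireIdentifier(table)).append(" (");
        String separator = "";
        for (Map.Entry<String, String> column : columns.entrySet()) {
            if (!ALLOWED_TYPES.contains(column.getValue())) {
                throw new IllegalArgumentException("Illegal column type: " + column.getValue());
            }
            sql.append(separator)
               .append(requireIdentifier(column.getKey())).append(' ').append(column.getValue());
            separator = ", ";
        }
        return sql.append(')').toString();
    }
}
```

The resulting string would then be executed with a plain Statement. Row values in the later CRUD operations can still go through normal prepared-statement parameters.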

unifying query language for accessing data from heterogeneous database

In my current project, I am trying to unify the query language for accessing heterogeneous databases. Heterogeneous here means that the databases use different query languages for accessing data. For instance, SQL is the query language for accessing data in Apache Derby, while MongoDB uses a non-SQL query language.
My question is: "Is there any domain-specific language that has been proposed to unify heterogeneous databases?"
Please feel free to direct me to other efforts as well.

That's quite an interesting question. There is at least one proposed solution called UnQL (Unstructured Data Query Language) - http://www.couchbase.com/press-releases/unql-query-language.
I suppose that out of the box UnQL will work at least for CouchDB and SQLite. This seems to be a great step ahead.
Personally, I would say such a task is a tricky one because of the conceptual differences between the structured and unstructured data approaches. Anyway, it should be relatively easy to develop such a DSL for the well-defined SQL and NoSQL data models used by a particular application.

There is a project called Hibernate OGM, which aims to generalize JPQL to NoSQL databases.
From their web page:
Hibernate Object/Grid Mapper (OGM) aims at providing Java Persistence (JPA) support for NoSQL solutions. It reuses Hibernate Core's engine but persists entities into a NoSQL data store instead of a relational database. It reuses the Java Persistence Query Language (JP-QL) to search their data.
I haven't tried it out myself, so I cannot say how useful it is.

JSONiq can process data from different SQL and NoSQL products.
The open source implementation of JSONiq has connectors for Couchbase, Oracle NoSQL, SQLite, and JDBC.
For instance, the following slide deck showcases the same query being executed on both Couchbase and MongoDB: https://speakerdeck.com/wcandillon/jsoniq-the-sql-of-nosql

SPARQL is a W3C-standardized query language that works on top of an abstract data model (RDF), rather than a specific type of database, which makes it very suitable as an enabler for heterogeneous database querying.
Implementations of SPARQL exist on top of various NoSQL databases, including native RDF databases (often referred to as triplestores), as well as on top of relational databases.

Support JPA and MongoDB in the same application

There is a requirement that sounds quite simple: support a couple of RDBMSs (which I intend to do using JPA) and MongoDB (spring-data-mongodb is preferred) for persistence. More precisely, either one or the other has to be configured and used; I'm not talking about a cross-store setup.
The procedure shall be the following: code the application, deliver the .war to the customer, and have the customer put the persistence information, such as the database URL (i.e. either mongodb:localhost/test or jdbc:oracle:thin:1521#foo), in a config file.
Additionally, it would be nice to be able to extend the implementation to further datastores like CouchDB.
Is there a best practice, or at least a low-overhead solution that is not too dirty?

Is Eclipselink an option? The latest supports JPA for both RDBMS and NOSQL (including Mongo)
https://blogs.oracle.com/theaquarium/entry/jpa_and_nosql_using_eclipselink

I am currently developing a project with similar needs. I can advise you according to my experience.
I believe that the major concern here is not the technology but rather how you will structure the data. For that I advise you to use the AbstractFactory and FactoryMethod design patterns. Regarding technology, I am using Morphia for MongoDB and JPA for MySQL (as an example) and it's working like a charm.
So the easiest way is to create interfaces for all the objects you want to persist, and then do an implementation for MongoDB with Morphia tags and another with JPA tags. Create one factory for MongoDB that will deal with all the CRUD operations in the MongoDB objects and do the same with a JPA factory.
When the application is starting, you only have to verify the user choice for persistence and then initialize the corresponding factory.
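
A minimal sketch of that startup selection, assuming the choice is derived from the configured URL prefix. All names here (UserStore, PersistenceFactory, PersistenceFactories and so on) are hypothetical, and the stores are in-memory stand-ins where the real implementations would carry Morphia or JPA mappings.

```java
import java.util.HashMap;
import java.util.Map;

// Store-neutral contract the application codes against.
interface UserStore {
    void save(String id, String name);
    String find(String id);
    String backend();
}

// One factory per datastore family (Abstract Factory).
interface PersistenceFactory {
    UserStore createUserStore();
}

// In-memory stand-in; a real store would talk to MongoDB or to a JPA EntityManager.
class InMemoryUserStore implements UserStore {
    private final String backend;
    private final Map<String, String> data = new HashMap<>();
    InMemoryUserStore(String backend) { this.backend = backend; }
    public void save(String id, String name) { data.put(id, name); }
    public String find(String id) { return data.get(id); }
    public String backend() { return backend; }
}

class MongoPersistenceFactory implements PersistenceFactory {
    public UserStore createUserStore() { return new InMemoryUserStore("mongodb"); }
}

class JpaPersistenceFactory implements PersistenceFactory {
    public UserStore createUserStore() { return new InMemoryUserStore("jpa"); }
}

class PersistenceFactories {
    // Picks the factory once at startup, based on the configured URL prefix.
    static PersistenceFactory forUrl(String url) {
        if (url.startsWith("mongodb:")) {
            return new MongoPersistenceFactory();
        }
        if (url.startsWith("jdbc:")) {
            return new JpaPersistenceFactory();
        }
        throw new IllegalArgumentException("Unsupported datastore URL: " + url);
    }
}
```

Supporting a further datastore then means adding one more factory and one more branch in forUrl; the rest of the application keeps using UserStore.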

DataNucleus JPA allows you to persist to RDBMS, MongoDB and a host of other datastores (LDAP, HBase, AppEngine, Neo4j, etc), with a simple change to the connection URL, and has done so for quite some time

JPA or JDBC, how are they different?

I am learning Java EE and I downloaded the eclipse with glassfish for the same. I saw some examples and also read the Oracle docs to know all about Java EE 5. Connecting to a database was very simple. I opened a dynamic web project, created a session EJB , I used EntityManager and with the get methods could access the stored data table.
For my next project I had to create a simple class and then access some DB table. The very first problem I encountered was that the PersistenceUnit attribute would only be recognized by an EJB, Servlet etc. and not by a simple Java class. So I could not use the EntityManager way (or can I?).
I was asked to go the "JDBC" way. The very first problem I encountered was getting the connection to the DB. It seems all this must be hardcoded. With JPA I had a persistence.xml with which I could easily configure the database connection. Even setting up a driver for the DB was easy. Also, there are no get/set methods in JDBC for accessing table entities.
How do I understand JPA and persistence in relation to JDBC? What was JPA thought for? Why is there set/get methods? Can someone throw some light on the essence of these two and what are the pros/cons without "jargons"?? Please also suggest some links. A simple google search for JPA and JDBC differences led me to some sites full of "terminology" I couldn't follow :(

In layman's terms:
JDBC is a standard for Database Access
JPA is a standard for ORM
JDBC is a standard for connecting to a DB directly and running SQL against it - e.g SELECT * FROM USERS, etc. Data sets can be returned which you can handle in your app, and you can do all the usual things like INSERT, DELETE, run stored procedures, etc. It is one of the underlying technologies behind most Java database access (including JPA providers).
One of the issues with traditional JDBC apps is that you can often have some crappy code where lots of mapping between data sets and objects occur, logic is mixed in with SQL, etc.
JPA is a standard for Object Relational Mapping. This is a technology which allows you to map between objects in code and database tables. This can "hide" the SQL from the developer so that all they deal with are Java classes, and the provider allows you to save them and load them magically. Mostly, XML mapping files or annotations on getters and setters can be used to tell the JPA provider which fields on your object map to which fields in the DB. The most famous JPA provider is Hibernate, so it's a good place to start for concrete examples.
Other examples include OpenJPA, TopLink, etc.
Under the hood, Hibernate and most other providers for JPA write SQL and use JDBC to read and write from and to the DB.

Main difference between JPA and JDBC is level of abstraction.
JDBC is a low level standard for interaction with databases. JPA is higher level standard for the same purpose. JPA allows you to use an object model in your application which can make your life much easier. JDBC allows you to do more things with the Database directly, but it requires more attention. Some tasks can not be solved efficiently using JPA, but may be solved more efficiently with JDBC.

JDBC is a much lower-level (and older) specification than JPA. In its bare essentials, JDBC is an API for interacting with a database using pure SQL: sending queries and retrieving results. It has no notion of objects or hierarchies. When using JDBC, it's up to you to translate a result set (essentially a row/column matrix of values from one or more database tables, returned by your SQL query) into Java objects.
Now, to understand and use JDBC it's essential that you have some understanding and working knowledge of SQL. With that also comes a required insight into what a relational database is, how you work with it and concepts such as tables, columns, keys and relationships. Unless you have at least a basic understanding of databases, SQL and data modelling you will not be able to make much use of JDBC since it's really only a thin abstraction on top of these things.
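
To make the "translate a result set into Java objects" point concrete, here is a small sketch. To stay runnable without a database it uses a Map as a stand-in for one row of a ResultSet, and the reflection-based mapper is only a toy illustration of the kind of work an ORM automates; it is not how JPA providers are actually implemented. User and RowMapping are hypothetical names.

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Plain object we want to fill from a database row.
class User {
    long id;
    String name;
}

class RowMapping {
    // JDBC style: the developer copies every column into the object by hand.
    static User mapByHand(Map<String, Object> row) {
        User user = new User();
        user.id = (Long) row.get("id");
        user.name = (String) row.get("name");
        return user;
    }

    // ORM style (toy version): the column-to-field mapping is derived from the class,
    // assuming each field is named exactly like its column.
    static <T> T mapByReflection(Map<String, Object> row, Class<T> type) {
        try {
            T target = type.getDeclaredConstructor().newInstance();
            for (Field field : type.getDeclaredFields()) {
                if (field.isSynthetic()) {
                    continue; // skip compiler- or tool-generated fields
                }
                field.setAccessible(true);
                field.set(target, row.get(field.getName()));
            }
            return target;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot map row to " + type.getName(), e);
        }
    }
}
```

With real JDBC the by-hand version would read from a ResultSet (getLong, getString) inside a loop over rows; with JPA the mapping is declared once through annotations or XML and the provider does the copying.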

JDBC is the predecessor of JPA.
JDBC is a bridge between the Java world and the databases world. In JDBC you need to expose all dirty details needed for CRUD operations, such as table names, column names, while in JPA (which is using JDBC underneath), you also specify those details of database metadata, but with the use of Java annotations.
So JPA creates update queries for you and manages the entities that you looked up or created/updated (it does more as well).
If you want to do JPA without a Java EE container, then Spring and its libraries may be used with the very same Java annotations.

The difference between JPA and JDBC is often the deciding factor when choosing a persistence technology, as the two take very different approaches to working with persistent data. JPA allows developers to build database-driven Java programs using object-oriented semantics, whereas JDBC works at the level of SQL statements and result sets.
JPA is database-agnostic, meaning that the same code can be used with a variety of databases with few modifications. JPA serves as a layer of abstraction that hides the low-level JDBC calls from the developer, making database coding considerably easier.
Hibernate is an implementation of JPA.
You can see further details about JPA queries in Hibernate here.

JDBC is a layer of abstraction on top of vendor-specific relational DB drivers. Without JDBC you would have to deal with peculiarities of a specific DB (not much fun). JDBC, however, is too low-level and entails a lot of boilerplate code.
JPA is a specification of an ORM (just an interface). It's useless without an implementation.
ORM is a kind of framework concerned with saving and retrieving objects to/from the relational DB. There are many ORMs out there with different levels of abstraction. Some of them require manually-written SQL.
Some of ORMs implement JPA (Hibernate or EclipseLink, for example). Most of them are built on top of JDBC.
Such ORMs provide the maximum level of abstraction to the point you almost never have to write SQL queries. Some people love JPA-based ORMs (they reduce boilerplate), some hate (abstraction is leaky, specification is overly complex and there are lots of corner cases).
Java analogy:
class ORM extends JDBC implements JPA {
}

Persistence layers have protocol versions, so the abstractions on top of them also have versions, and you therefore need to track ranges of supported versions. It is version hell.

Does Hibernate have to drive database design?

I spent all of yesterday reading various articles/tutorials on Hibernate and although I am blown-away by how powerful it is, I have one major concern with it.
It seems that the standard practice is to allow Hibernate to design/generate your DB schema for you, which is a new and scary concept that I am choking on. From the tutorials I read, you just add a new entity to your hibernate.cfg.xml config file, annotate any POJO you want with @Entity, and voila - Hibernate creates the tables for you. Although this is very cool, it has me wondering about a handful of scenarios:
What if you already have a DB schema and the one Hibernate wants to generate for you does not conform to it? What if you have a crazy DBA that refuses to budge on the pre-defined (non-Hibernate) schema?
What if you have reference tables with tens of thousands of records in it (like all the cities in the world)? Would you have to instantiate and save() tens of thousands of unique POJOs or is there a way to configure Hibernate so it will honor and not overwrite data already existing in your tables?
What if you want to do perf tuning on your schema/tables? This includes indexing, normalizing above and beyond what Hibernate creates automatically?
What if you want to add constraints or triggers to your tables? Indexes?
I guess at the root of this is the following:
It looks like Hibernate creates and forces a particular schema/config on your DB. I am wondering how this agenda will conflict with our platform standards, our DBA philosophies, and our ability to perf tune/tweak tables that Hibernate interacts with.
Thanks in advance.

I think you're attributing too much power to Hibernate.
Hibernate does have an idiom that may influence database implementation.
Hibernate does not generate a schema for you unless you ask it to do so. It's possible to start with an existing schema and map it to Java objects using Hibernate. But it might not be possible or optimal if the schema conflicts with Hibernate requirements.
If the DBA won't budge - as they shouldn't - or Hibernate can't accommodate you, then you have your answer: you can't use Hibernate.
Your DBA might consent, but your app might find that the dynamic SQL that's generated for you by Hibernate isn't what you want.
Fortunately for you, it's not the only game in town.
I don't think implementations have to be all or none. If you use simple JDBC to access reference data, what's the harm?
Database design considerations should be independent of Hibernate. Constraints, triggers, normalization, and indexes should be driven by business needs, not your middleware choices.
If you don't have a solid object model, or the schema can't accommodate it, then you should reconsider Hibernate. There's straight JDBC, stored procedures, Spring JDBC, and iBatis as alternatives.

Hibernate comes with a default way to map objects to tables - like several tools/libraries, it favours convention over configuration for simplicity.
However, if you want to map the entities to database tables differently, you can explicitly tell Hibernate how these are mapped (from simple attributes such as changing the table name, through to redefining the foreign-key relationships between related entities and how this is persisted).
If you do this correctly, you don't need to instantiate and save existing data, as this would be pointless - the database already contains the information about the entities in exactly the form that Hibernate understands. (Think about it - to load and then immediately save an entity should always be a no-op, and so can be skipped altogether.)
So the short answer to your question is "no". If you don't care for designing tables, you can let Hibernate adopt a reasonable default. If you do want to design your schema explicitly though, you can do this and then describe that exact schema to Hibernate.

As someone who's worked on Java and Hibernate in the enterprise for a long time, I have seen very few projects which use this capability. You'll see some build tools and other things do this, but for a real enterprise app, I've never seen it.
Most DBA's won't let the application user create tables. They rely on a privileged user to do those things, and the user that the app connects as would have r/w privs on the data but not the schema itself.
As a result, you write the SQL yourself, and you do the hibernate mappings to match. It doesn't mean your object design won't influence your SQL, but you should still always create your schema upfront.

No. You can use hibernate tools to generate the entities from existing database.
There are two ways you can go about using Hibernate. If you have a good DBA or database designer, then it is better to design the database first and then map it in Hibernate.
On the other hand, if you don't have a DBA but have a good developer, then let Hibernate generate the database for you.
The concept behind Hibernate is to map between the database and objects, which is why it is called an ORM (Object-Relational Mapping) tool.
Read here about the object-relational impedance mismatch.

This is the preferred way for a quick'n dirty prototype or a simple tutorial, but it's far from being the preferred way for any production application. I largely prefer designing the database independently, using scripts to generate the schema, tables, views, indexes, etc., and map the schema to entities.
As long as the mapping finds the tables and columns in the database, everything is fine.
As soon as you have data in your database and the schema must change, you'll have to write migration scripts anyway. You can't just drop everything and restart from scratch. The tutorials are written for developers starting with Hibernate and who must discover Hibernate as quick as possible, without dealing with complex SQL scripts.

What if you already have a DB schema ...
I don't know where you get that impression. Hibernate can use existing schema. It is quite flexible.
What if you have reference tables ...
Make the relationship LAZY, and it won't load automatically. Only changed objects will be saved.
What if you want to do perf tuning ...
Just don't use the generated schema. It is just a starting point. You can customize as you need.
What if you want to add constraints or triggers to your tables? Indexes?
Same as above.

You can use hibernate with an existing database schema.
You can use various annotations to map to existing tables and columns, for example:
@Table(name = "dbschema.dbTable") - should be placed before your class declaration to map it to a table
@Column(name = "colName") - to map a column
Just be sure that Hibernate is configured with this option:
hibernate.hbm2ddl.auto=update
If you set this to create, Hibernate will drop and re-create the schema, so do not do this in your case. Note that update can still alter an existing schema, whereas validate only checks the mappings against it.

Use Hibernate/JPA when appropriate. A common practice when designing apps is to extract the generated draft schema and alter it manually as needed (indices etc.). However, it will be a pain for you if you change the db layout away from the Hibernate way of doing things. Lots of the beauty of JPA will be lost. For tasks which require heavy performance tuning and full control, just go for regular JDBC.

Some answers:
A. It is possible to add an index annotation : see the table annotation.
B. If you have reference tables, you may choose lazy fetching or eager fetching (e.g. if your tables represent a person and their books, whether to load a person without their books or with them).
C. Hibernate can be used to work on an existing schema. The schema might not be trivial to work with, but as others have said, you should design the db only according to business needs, and not according to framework conventions.
D. I would also encourage you to read about what Hibernate does "under the hood": it makes heavy use of proxies, which hurts performance, and you must understand well the scope of the session and the usage of the 1st-level and 2nd-level caches.
E. Following what I wrote in section D: working with triggers will cause your DB to change "under the hood" as far as Hibernate is concerned. Consider a case where updating a record creates (using a trigger) an entry in some archiving table, and let's say this table is also mapped via Hibernate: your Hibernate cache will not be aware of the change that happened outside of the application scope.
F. It is important to me to state that I'm not against Hibernate, but you should not use it for all solutions; this is a mistake I made in the past. I now work with Spring-JDBC and I'm quite pleased (for our application needs it would be hard to use Hibernate, and I assume we will consider it only if we need to support more than one DB flavor).
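
To illustrate the lazy fetching mentioned in B and the proxy idea in D, here is a minimal plain-Java sketch. It is not Hibernate code: Lazy and Person are hypothetical classes, and the loader stands in for the query Hibernate would issue on first access.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Defers an expensive load until first access, then remembers the result.
// Not thread-safe; intended only to show the idea.
class Lazy<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) { this.loader = loader; }

    public T get() {
        if (!loaded) {
            value = loader.get(); // the "query" runs here, once
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() { return loaded; }
}

// A person whose books are only fetched when somebody asks for them.
class Person {
    final String name;
    final Lazy<List<String>> books;

    Person(String name, Supplier<List<String>> bookLoader) {
        this.name = name;
        this.books = new Lazy<>(bookLoader);
    }
}

class LazyDemo {
    public static void main(String[] args) {
        Person person = new Person("Ada", () -> Arrays.asList("Notes"));
        System.out.println(person.books.isLoaded()); // nothing fetched yet
        System.out.println(person.books.get());      // first access triggers the load
    }
}
```

Eager fetching would correspond to calling the loader in the constructor. The cached value also shows the problem from E in miniature: once loaded, the object does not notice later changes made in the database behind its back.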
