The system I am currently working on requires some role-based security, which is well catered for in the Java EE stack. The system intends to be a framework for business domain experts to write their code on top of.
However, there is also a requirement for data security. That is, what information is visible to an end user.
This effectively means reducing visibility to rows (and perhaps even columns) in the database.
We are using Hibernate for our persistence. However, we are using our own annotations so as not to expose our persistence choice to the business domain experts.
For row-based security, could we add an annotation such as @Secured at the entity level, which would cause an extra column to be added to the underlying table to constrain our selects?
For column-based security, could we perhaps use @Secured either to assist in query generation, or to drive an aspect that filters the information returned?
I'm curious to know how this might affect hibernate's caching mechanisms as well?
I'm sure a lot of others will have had the same issue, and I was wondering how you approached this?
Much appreciated...
Hibernate has a filter mechanism that may work for you. The filters will rewrite the queries hibernate generates to include an additional clause to limit the rows returned. I'm not aware of anything in hibernate to mask/hide columns.
Your database may also have support for this functionality. Oracle, for example, has the Virtual Private Database (VPD) which will rewrite your queries at the database level. This solution has the added benefit that any external program (e.g. reporting tools) that goes against your db will have your security restrictions enforced. VPD also has support to mask restricted columns with NULLs.
Unfortunately, the above solutions have not been adequate to support the security requirements for the types of projects I typically work on. There is usually some sort of context that cannot be easily expressed in the above solutions. For example, users can view data that they have created, that has been marked as public, or that belongs to a project which they manage.
We typically create query/finder/DAO objects where we pass in the values required to enforce the security and then create the query accordingly.
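A minimal sketch of such a finder, assuming a hypothetical document table with created_by, is_public and project_id columns (none of these names come from the original post). The caller passes in the security values, and the finder returns the SQL text plus bind values for a PreparedStatement:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical finder: builds a parameterized query restricted by the caller's security context. */
final class DocumentFinder {

    /** SQL text plus bind values, ready to be handed to a PreparedStatement. */
    static final class Query {
        final String sql;
        final List<Object> params;

        Query(String sql, List<Object> params) {
            this.sql = sql;
            this.params = Collections.unmodifiableList(params);
        }
    }

    /**
     * A user may see rows they created, rows marked public,
     * or rows that belong to a project they manage.
     */
    static Query visibleTo(long userId, List<Long> managedProjectIds) {
        StringBuilder sql = new StringBuilder(
            "SELECT d.* FROM document d WHERE (d.created_by = ? OR d.is_public = 1");
        List<Object> params = new ArrayList<>();
        params.add(userId);

        // Only add the project clause when the user manages at least one project.
        if (!managedProjectIds.isEmpty()) {
            sql.append(" OR d.project_id IN (");
            for (int i = 0; i < managedProjectIds.size(); i++) {
                sql.append(i == 0 ? "?" : ", ?");
                params.add(managedProjectIds.get(i));
            }
            sql.append(")");
        }
        sql.append(")");
        return new Query(sql.toString(), params);
    }
}
```

Keeping the security values as explicit method parameters makes it hard to call the finder without them, which is the main point of the approach.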
I hope this helps
When using Hibernate filters you need to be aware that the additional restrictions will not be applied to SQL statements generated by the load() or get() methods.
Related
I need to build a reporting section of my site that consists of some decently complicated queries including things like UNION, GROUP_CONCAT, etc. JPA integration with my entities has maintained database independence so far. Currently the system uses MsSQL, but we want to be sure later we can switch to Postgres or MySQL if needed.
What's a good approach to take with these reports so that without too much work I can make it work on MySQL or Postgres?
The site also uses Spring
As I see it, your question is more or less "how can I make use of vendor specific features without becoming tied to that vendor?".
There is no easy answer; probably the most flexible option would be to stick to JPA and accept the performance hit.
Other possibilities:
Define the reports as a component that publishes a set of interfaces. Use CDI to inject the implementation that matches your DB of choice.
A variation of the above: set up your own DAO interfaces for data access, backed for example by another ORM framework that, being more specific, can perform better. Build the reports on top of that.
If your business allows for it, choose one RDBMS to work with for reports. During the night (or even on demand, if there is not too much data), dump your production database into it.
The best option is to use data access objects, one per database, each conforming to a common interface. Any client code can use the common DAO interface without needing to know about the underlying database.
With Spring it is easy to swap between the DAO implementation classes as a configuration option, e.g. if you have a CustomerDao interface that has Oracle and DB2 implementations, then use either:
<bean id="customerDao" class="my.package.customer.OracleCustomerDao"/>
or
<bean id="customerDao" class="my.package.customer.Db2CustomerDao"/>
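As a rough sketch of the same idea applied to the reporting question, assume a hypothetical post_tag table and a report that joins tag names per post. The interface and class names are made up for illustration; only the aggregate functions (GROUP_CONCAT in MySQL, string_agg in PostgreSQL) are real:

```java
import java.util.Locale;

/** Common interface the reporting code depends on. */
interface TagReportDao {
    /** SQL that lists each post id with its tag names joined by commas. */
    String tagsPerPostSql();
}

/** MySQL variant: uses GROUP_CONCAT. */
final class MySqlTagReportDao implements TagReportDao {
    @Override
    public String tagsPerPostSql() {
        return "SELECT post_id, GROUP_CONCAT(tag_name SEPARATOR ',') AS tags "
             + "FROM post_tag GROUP BY post_id";
    }
}

/** PostgreSQL variant: uses string_agg. */
final class PostgresTagReportDao implements TagReportDao {
    @Override
    public String tagsPerPostSql() {
        return "SELECT post_id, string_agg(tag_name, ',') AS tags "
             + "FROM post_tag GROUP BY post_id";
    }
}

/** Stand-in for the Spring bean definition: picks an implementation from a config value. */
final class TagReportDaos {
    static TagReportDao forDatabase(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "mysql":
                return new MySqlTagReportDao();
            case "postgresql":
                return new PostgresTagReportDao();
            default:
                throw new IllegalArgumentException("No report DAO for database: " + name);
        }
    }
}
```

Client code only ever sees TagReportDao, so switching databases means adding one implementation class and changing the configuration value.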
If you want your SQL to work across multiple databases, then here's a plan you can follow:
Test your SQL across multiple databases
To get portability, you need to write portable SQL. The only way to ensure that your SQL is portable is to check it for portability.
If you stick to using standard SQL, then this should be fairly straightforward; you won't be able to use database-specific features, but there is a huge amount you can do without them (they're mostly syntactic sugar, or for stepping outside the relational model, which you hopefully won't need to do). If you've already strayed into using nonstandard SQL, then it may be very hard to get to the point where you can do this, but if you start off working in a disciplined way, I would be optimistic about your ability to stick to standards.
If you're working on SQL Server, then PostgreSQL would be a good choice for a second database to test against, as it's free, easy to set up, and very capable.
I spent all of yesterday reading various articles/tutorials on Hibernate and although I am blown-away by how powerful it is, I have one major concern with it.
It seems that the standard practice is to allow Hibernate to design/generate your DB schema for you, which is a new and scary concept that I am choking on. From the tutorials I read, you just add a new entity to your hibernate.cfg.xml config file, annotate any POJO you want with @Entity, and voila - Hibernate creates the tables for you. Although this is very cool, it has me wondering about a handful of scenarios:
What if you already have a DB schema and the one Hibernate wants to generate for you does not conform to it? What if you have a crazy DBA that refuses to budge on the pre-defined (non-Hibernate) schema?
What if you have reference tables with tens of thousands of records in it (like all the cities in the world)? Would you have to instantiate and save() tens of thousands of unique POJOs or is there a way to configure Hibernate so it will honor and not overwrite data already existing in your tables?
What if you want to do perf tuning on your schema/tables? This includes indexing, normalizing above and beyond what Hibernate creates automatically?
What if you want to add constraints or triggers to your tables? Indexes?
I guess at the root of this is the following:
It looks like Hibernate creates and forces a particular schema/config on your DB. I am wondering how this agenda will conflict with our platform standards, our DBA philosophies, and our ability to perf tune/tweak tables that Hibernate interacts with.
Thanks in advance.
I think you're attributing too much power to Hibernate.
Hibernate does have an idiom that may influence database implementation.
Hibernate does not generate a schema for you unless you ask it to do so. It's possible to start with an existing schema and map it to Java objects using Hibernate. But it might not be possible or optimal if the schema conflicts with Hibernate requirements.
If the DBA won't budge - as they shouldn't - or Hibernate can't accommodate you, then you have your answer: you can't use Hibernate.
Your DBA might consent, but your app might find that the dynamic SQL that's generated for you by Hibernate isn't what you want.
Fortunately for you, it's not the only game in town.
I don't think implementations have to be all or none. If you use simple JDBC to access reference data, what's the harm?
Database design considerations should be independent of Hibernate. Constraints, triggers, normalization, and indexes should be driven by business needs, not your middleware choices.
If you don't have a solid object model, or the schema can't accommodate it, then you should reconsider Hibernate. There's straight JDBC, stored procedures, Spring JDBC, and iBatis as alternatives.
Hibernate comes with a default way to map objects to tables - like several tools/libraries, it favours convention over configuration for simplicity.
However, if you want to map the entities to database tables differently, you can explicitly tell Hibernate how these are mapped (from simple attributes such as changing the table name, through to redefining the foreign-key relationships between related entities and how this is persisted).
If you do this correctly, you don't need to instantiate and save existing data, as this would be pointless - the database already contains the information about the entities in exactly the form that Hibernate understands. (Think about it - to load and then immediately save an entity should always be a no-op, and so can be skipped altogether.)
So the short answer to your question is "no". If you don't care for designing tables, you can let Hibernate adopt a reasonable default. If you do want to design your schema explicitly though, you can do this and then describe that exact schema to Hibernate.
As someone who's worked on Java and Hibernate in the enterprise for a long time, I have seen very few projects which use this capability. You'll see some build tools and other things do this, but for a real enterprise app, I've never seen this.
Most DBA's won't let the application user create tables. They rely on a privileged user to do those things, and the user that the app connects as would have r/w privs on the data but not the schema itself.
As a result, you write the SQL yourself, and you do the hibernate mappings to match. It doesn't mean your object design won't influence your SQL, but you should still always create your schema upfront.
No. You can use hibernate tools to generate the entities from existing database.
There are two ways you can go about using Hibernate. If you have a good DBA or database designer, then it is better to design the database first and then map it in Hibernate.
On the other hand, if you don't have a DBA but have a good developer, then let Hibernate generate the database for you.
The concept behind Hibernate is to map between the database and the objects. That is why it is called an ORM (Object-Relational Mapping) tool.
Read here for Object Relational Impedance.
This is the preferred way for a quick'n dirty prototype or a simple tutorial, but it's far from being the preferred way for any production application. I largely prefer designing the database independently, using scripts to generate the schema, tables, views, indexes, etc., and map the schema to entities.
As long as the mapping finds the tables and columns in the database, everything is fine.
As soon as you have data in your database and the schema must change, you'll have to write migration scripts anyway. You can't just drop everything and restart from scratch. The tutorials are written for developers starting with Hibernate and who must discover Hibernate as quick as possible, without dealing with complex SQL scripts.
What if you already have a DB schema ...
I don't know where you get that impression. Hibernate can use existing schema. It is quite flexible.
What if you have reference tables ...
Make the relationship LAZY, and it won't load automatically. Only changed objects will be saved.
What if you want to do perf tuning ...
Just don't use the generated schema. It is just a starting point. You can customize as you need.
What if you want to add constraints or triggers to your tables? Indexes?
Same as above.
You can use hibernate with an existing database schema.
You can use various annotations to map to existing tables and columns, for example:
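@Table(name = "dbschema.dbTable") - placed on the entity class to map it to a table
@Column(name = "colName") - to map a column
Just be sure that Hibernate is configured with this option:
hibernate.hbm2ddl.auto=update
If you set this to create, Hibernate will drop and recreate the schema, so do not do this in your case. Note that update can still add missing tables and columns; validate only checks the mappings against the existing schema without changing it.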
Use Hibernate/JPA when appropriate. A common practice when designing apps is to extract the generated draft schema and alter it manually as needed (indices etc.). However, it will be a pain if you move the DB layout away from the Hibernate way of doing things; a lot of the beauty of JPA will be lost. For tasks which require heavy performance tuning and full control, just go for regular JDBC.
Some answers:
A. It is possible to add an index annotation: see the table annotation.
B. If you have reference tables, you may choose lazy fetching or eager fetching (e.g. if your tables represent a person and their books, whether to load a person without the books or with them).
C. Hibernate can be used to work on an existing schema. The schema might not be trivial to work with, but as others have said, you should design the DB only according to business needs, and not according to framework conventions.
D. I would also encourage you to read up on what Hibernate does "under the hood": it makes heavy use of proxies, which can hurt performance, and you must understand well the scope of the session and the uses of the 1st-level and 2nd-level caches.
E. Following on from point D: working with triggers will cause your DB to change "under the hood" as far as Hibernate is concerned. Consider a case where updating a record creates (via a trigger) an entry in some archiving table, and this table is also mapped via Hibernate: your Hibernate cache will not be aware of the change that happened outside of the application's scope.
F. It is important to me to state that I'm not against Hibernate, but you should not use it for every solution; this is a mistake I made in the past. I now work with Spring JDBC and I'm quite pleased (for our application's needs it would be hard to use Hibernate, and I assume we will consider it only if we need to support more than one DB flavor).
I have an existing Java EE 6 application (deployed in Glassfish v 3.1) and want to support multiple tenants. Technologies/APIs I'm currently using in my app are
EJB (including the EJB timer service)
JPA 2.0 (EclipseLink)
JSF 2.0
JMS
JAX-RS
I plan to use CDI as well
As far as I know, adding multi-tenancy support affects only the persistence layer. My question: Has anybody done this before? What are the steps to convert the application? Will this affect other layers other than persistence?
There will be a high number of tenants, therefore, all data will reside in the same DB schema.
Persistence Layer
Start with the persistence layer. Roll upwards through your architecture once you have that done.
The Schema that you are proposing would have an ID that identifies the tenant (eg. TenantId). Each table would have this ID. In all of your queries you would have to ensure that the TenantId matches the logged in User's TenantId.
The difficulty with this is that it is a very manual process.
If you go with Hibernate as your JPA provider then there are some tools that will help with this; namely Hibernate Filters.
These are commonly used to restrict access on multi-tenant Schemas (see here and here for some more)
I haven't used EclipseLink but it does look like it has good support for Multi-Tenancy as well. The DiscriminatorColumn looks like a very similar concept to Hibernate Filters.
Service Layer
I assume that you're using JAX-RS and JMS for a Service Layer. If so then you will also need to think about how you are going to pass the tenantId around and authenticate your Tenants. How are you going to prevent one tenant from accessing the REST service of another? Same thing for JMS.
UI Layer
You are going to have to hook up your login in your UI to a Bean (Hibernate or Eclipselink) that sets the TenantId for the Filter/Discriminator.
Tell us about the number and the degree of separation and customization necessary for different tenants.
If you have a small number of tenants, I would propose creating a customizable "white-label" product. This gives you the opportunity to create some specific things for one tenant without overcomplicating matters. Plus, separating the applications per tenant helps you with maintenance. We did this for a product with a handful of different tenants.
If you have many tenants, this is of course no longer practical. We did a generic version of the same product. All we did then was distinguish tenants by id after login, thus separating each tenant's data from the others. But still, there was nothing to do in terms of changing the application or a layer within it: the id was all that was needed to separate the data, and the workflow is automatically separated by having different instances of beans or other managed objects.
There are several ways you can go with this, depending on the level of separation you want to achieve and how many concurrent tenants you want to support. At one extreme, you can create a new schema for each tenant and therefore ensure database-level isolation of data. For most practical purposes it's usually sufficient to have a logical partitioning of your data by assigning a tenant_id to every entity in your domain model and maintaining foreign-key constraints. Of course this means you'll probably want to always pass your current session's tenant_id in to every query / finder method so that it can restrict the data set based on that. You'll want to make sure that users cannot access another tenant's data by entering a tenant id (or an entity id) that does not belong to them in the URL.
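A small sketch of what "always pass the tenant_id to every finder" can look like. The Invoice and InvoiceRepository names are hypothetical, and the in-memory list stands in for a table queried with a WHERE tenant_id = ? clause:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical entity: every row carries the id of the tenant that owns it. */
final class Invoice {
    final long id;
    final long tenantId;
    final String label;

    Invoice(long id, long tenantId, String label) {
        this.id = id;
        this.tenantId = tenantId;
        this.label = label;
    }
}

/** In-memory stand-in for a DAO; a real one would add "WHERE tenant_id = ?" to each query. */
final class InvoiceRepository {
    private final List<Invoice> rows = new ArrayList<>();

    void add(Invoice invoice) {
        rows.add(invoice);
    }

    /** Every finder takes the tenant id; there is deliberately no tenant-less variant. */
    List<Invoice> findAll(long tenantId) {
        List<Invoice> result = new ArrayList<>();
        for (Invoice row : rows) {
            if (row.tenantId == tenantId) {
                result.add(row);
            }
        }
        return result;
    }

    /** Looking up by id still checks the tenant, so a guessed id from a URL returns nothing. */
    Optional<Invoice> findById(long tenantId, long invoiceId) {
        for (Invoice row : rows) {
            if (row.id == invoiceId && row.tenantId == tenantId) {
                return Optional.of(row);
            }
        }
        return Optional.empty();
    }
}
```

The tenant id should come from the authenticated session, never from a request parameter, otherwise the check above protects nothing.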
Go message oriented.
If you choose messaging as the strategic approach and refactor (if necessary) business logic around JMS, then other options remain viable and locally applicable.
With this approach, you pay a specific fixed cost (refactor) in your existing (single tenant) system. You then can apply approaches of various degrees of complexity, ranging from simple sharding (@Geziefer's id based association) to a full blown shared-core-schema + extended-tenant-specific-schemas approach, without impacting system architecture and additional refactoring.
You will further have orthogonal control over your system data flows via the messaging layer (applying routers, filters, special processing paths, etc.)
[edit per request]
There is nothing per se in M.T. that explicitly suggests message orientation. But as a general problem, we are looking at widening interfaces and enriched data flows. With an API based approach, you would need to carefully inject the appropriate tenant discriminant in all required interfaces (e.g. methods). A message based approach (or alternatively a context based API approach) allows for a normative (stable) interface (e.g. message.send()) and at the same time allows for explicit specialized data flows. If switching to a message based backbone is not on the table, I strongly suggest you consider injecting a uniform context (e.g. "RequestContext") param in your APIs. This single extension should cover all your future specialization needs.
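One possible shape for such a uniform context parameter, as a sketch only. The answer gives just the name "RequestContext"; the fields, the with() method and the OrderService interface are assumptions made for illustration:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Immutable per-request context; new needs are added here, not to every method signature. */
final class RequestContext {
    private final long tenantId;
    private final String userName;
    private final Map<String, String> attributes;

    RequestContext(long tenantId, String userName) {
        this(tenantId, userName, Collections.<String, String>emptyMap());
    }

    private RequestContext(long tenantId, String userName, Map<String, String> attributes) {
        this.tenantId = tenantId;
        this.userName = userName;
        this.attributes = attributes;
    }

    long tenantId() { return tenantId; }

    String userName() { return userName; }

    /** Returns null when the attribute was never set. */
    String attribute(String key) { return attributes.get(key); }

    /** Returns a copy carrying one extra attribute; the original is left untouched. */
    RequestContext with(String key, String value) {
        Map<String, String> copy = new HashMap<>(attributes);
        copy.put(key, value);
        return new RequestContext(tenantId, userName, Collections.unmodifiableMap(copy));
    }
}

/** Hypothetical service: the context is always the first parameter. */
interface OrderService {
    String describeOrder(RequestContext ctx, long orderId);
}

/** Trivial implementation that only echoes what it was given. */
final class EchoOrderService implements OrderService {
    @Override
    public String describeOrder(RequestContext ctx, long orderId) {
        return "tenant=" + ctx.tenantId() + " user=" + ctx.userName()
             + " order=" + orderId + " locale=" + ctx.attribute("locale");
    }
}
```

The service signature stays stable when a new discriminant (locale, region, feature flag) is introduced later, which is the property the answer is after.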
I am currently evaluating authentication / authorization frameworks.
Apache Shiro seems to be very nice but I am missing row-level security features.
E.g. there might be special rows in a database which should only be visible and accessible to users with special privileges.
To avoid unnecessary round-trips, we currently modify the SQL queries to join with our authorization data to get only the visible rows for the current user.
But this concept doesn't feel 'right' to me, because we mix business code with security-related code, which should be orthogonal to and independent of each other.
What solutions are available/possible?
How do you implement row-level security (especially in combination with jpa)?
UPDATE:
Target database is mostly Oracle 10g/11g
- but a database independent solution would be preferred if there are no big drawbacks
Row level security is really best done in the database itself. The database has to be told what your user context is when you grab a connection. That user is associated with one or more security groups. The database then automatically appends filters to user supplied queries to filter out what can't be seen from the security groups. This of course means that this is a per database-type solution.
Oracle has pretty good Row Level Security support, see http://www.orafusion.com/art_fgac.htm as an example.
We implemented it as JDBC wrapper.
This wrapper simply parses and transforms SQL.
A Hibernate filter is a good idea too, but we have many reports and ad-hoc queries, and Hibernate is not the only tool used to access data in our applications.
jsqlparser is an excellent open source SQL parser, but we had to fork it to fix some issues and to add support for some advanced SQL features, e.g. ROLLUP for reporting purposes: https://github.com/jbaliuka/sql-analytic
This reporting tool is also available on github but there is no dependency on row level security infrastructure https://github.com/jbaliuka/x4j-analytic
There is a helpful article: http://mattfleming.com/node/243
The idea is that you can implement row level functionality in two ways: directly setting restrictions in your repository or binding the restrictions via AOP. The latter is preferred because security layer should be separated from business logic (orthogonal concerns).
In Hibernate you can use the concept of filters which are applied transparently and repository doesn't know about them. You can add such filters via AOP. The other way is intercepting session.createCriteria() and adding Restrictions to the Criteria transparently using AOP.
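To illustrate the interception idea without pulling in Hibernate or an AOP framework, here is a sketch that uses a JDK dynamic proxy as a stand-in for the aspect. SimpleCriteria and CriteriaFactory are hypothetical simplifications, not the Hibernate Criteria API:

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, much simplified stand-in for a criteria object. */
final class SimpleCriteria {
    private final List<String> restrictions = new ArrayList<>();

    SimpleCriteria add(String restriction) {
        restrictions.add(restriction);
        return this;
    }

    String toWhereClause() {
        return String.join(" AND ", restrictions);
    }
}

/** What the repository code calls; it knows nothing about security. */
interface CriteriaFactory {
    SimpleCriteria createCriteria();
}

/** Wraps a factory so that every criteria it hands out already carries a security restriction. */
final class SecuredCriteriaFactory {
    static CriteriaFactory wrap(CriteriaFactory target, String securityRestriction) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result;
            try {
                result = method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow what the target actually threw
            }
            if (result instanceof SimpleCriteria) {
                ((SimpleCriteria) result).add(securityRestriction);
            }
            return result;
        };
        return (CriteriaFactory) Proxy.newProxyInstance(
            CriteriaFactory.class.getClassLoader(),
            new Class<?>[] { CriteriaFactory.class },
            handler);
    }
}
```

The repository adds its own business restrictions as usual; the security predicate is attached by the wrapper, so the two concerns stay in separate classes.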
I have a thick client, java swing application with a schema of 25 tables and ~15 JInternalFrames (data entry forms for the tables). I need to make a design choice of straight JDBC or ORM (hibernate with spring framework in this case) for DBMS interaction. Build out of the application will occur in the future.
Would hibernate be overkill for a project of this size? An explanation of either yes or no answer would be much appreciated (or even a different approach if warranted).
TIA.
Good question with no single simple answer.
I used to be a big fan of Hibernate after using it in multiple projects over multiple years.
I used to believe that any project should default to hibernate.
Today I am not so sure.
Hibernate (and JPA) is great for some things, especially early in the development cycle.
It is much faster to get to something working with Hibernate than it is with JDBC.
You get a lot of features for free - caching, optimistic locking and so on.
On the other hand it has some hidden costs. Hibernate is deceptively simple when you start. Follow some tutorial, put some annotations on your class - and you've got yourself persistence. But it's not simple, and writing good code with it requires a good understanding of both its internal workings and database design. If you are just starting you may not be aware of some issues that may bite you later on, so here is an incomplete list.
Performance
The runtime performance is good enough; I have yet to see a situation where Hibernate was the reason for poor performance in production. The problem is the startup performance and how it affects your unit test time and development speed. When Hibernate loads, it analyzes all entities and does a lot of pre-caching - this can take 5 to 15 seconds for a not very big application. So your 1 second unit test is going to take 11 seconds now. Not fun.
Database Independency
It is very cool as long as you don't need to do some fine tuning on the database.
In-memory Session
For every transaction Hibernate will store an object in memory for every database row it "touches". It's a nice optimization when you are doing some simple data entry. If you need to process lots of objects for some reason though, it can seriously affect performance, unless you explicitly and carefully clean up the in-memory session on your own.
Cascades
Cascades allow you to simplify working with object graphs. For example if you have a root object and some children and you save root object, you can configure hibernate to save children as well. The problem starts when your object graph grow complex. Unless you are extremely careful and have a good understanding of what goes on internally, it's easy to mess this up. And when you do it is very hard to debug those problems.
Lazy Loading
Lazy Loading means that every time you load an object, Hibernate will not load all its related objects but will instead provide placeholders, which are resolved as soon as you try to access them. Great optimization, right? It is, except you need to be aware of this behaviour, otherwise you will get cryptic errors. Google "LazyInitializationException" for an example. And be careful with performance. Depending on the order in which you load your objects and on your object graph, you may hit the "n+1 selects problem". Google it for more information.
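The "n+1 selects problem" can be shown with a toy simulation. This is not Hibernate code: CountingDb is a made-up class that only counts how many queries would be issued when children are loaded one parent at a time versus in one batch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Fake database that counts every query issued against it. */
final class CountingDb {
    int queries = 0;
    private final Map<Long, List<String>> booksByAuthor = new LinkedHashMap<>();

    CountingDb() {
        booksByAuthor.put(1L, Arrays.asList("A", "B"));
        booksByAuthor.put(2L, Arrays.asList("C"));
        booksByAuthor.put(3L, new ArrayList<String>());
    }

    /** One query for the parent rows. */
    List<Long> findAuthorIds() {
        queries++;
        return new ArrayList<>(booksByAuthor.keySet());
    }

    /** One query per parent: this is what a lazily resolved collection does on first access. */
    List<String> findBooksOf(long authorId) {
        queries++;
        return booksByAuthor.get(authorId);
    }

    /** One query for the children of all given parents (like a join or an IN clause). */
    Map<Long, List<String>> findBooksOfAll(Collection<Long> authorIds) {
        queries++;
        Map<Long, List<String>> result = new LinkedHashMap<>();
        for (Long id : authorIds) {
            result.put(id, booksByAuthor.get(id));
        }
        return result;
    }
}

final class NPlusOneDemo {
    /** 1 query for the authors plus 1 per author. */
    static int lazyStyle(CountingDb db) {
        for (Long id : db.findAuthorIds()) {
            db.findBooksOf(id);
        }
        return db.queries;
    }

    /** 1 query for the authors plus 1 for all their books. */
    static int batchedStyle(CountingDb db) {
        db.findBooksOfAll(db.findAuthorIds());
        return db.queries;
    }
}
```

With three authors the lazy style issues 4 queries and the batched style 2; the gap grows linearly with the number of parent rows.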
Schema Upgrades
Hibernate allows easy schema changes by just refactoring java code and restarting. It's great when you start. But then you release version one. And unless you want to lose your customers you need to provide them schema upgrade scripts. Which means no more simple refactoring as all schema changes must be done in SQL.
Views and Stored Procedures
Hibernate requires exclusive write access to the data it works with. This means you can't really use views, stored procedures and triggers, as those can cause changes to data without Hibernate being aware of them. You can have some external processes writing data to the database in separate transactions. But if you do, your cache will have invalid data, which is one more thing to care about.
Single Threaded Sessions
Hibernate sessions are single threaded. Any object loaded through a session can only be accessed (including reading) from the same thread. This is acceptable for server side applications but might complicate things unnecessarily if you are building a GUI based application.
I guess my point is that there are no free meals.
Hibernate is a good tool, but it's a complex tool, and it requires time to understand it properly. If you or your team members don't have such knowledge it might be simpler and faster to go with pure JDBC (or Spring JDBC) for a single application. On the other hand, if you are willing to invest time into learning it (including learning by doing and debugging), then in the future you will be able to understand the tradeoffs better.
Hibernate can be good but it and other JPA ORMs tend to dictate your database structure to a degree. For example, composite primary keys can be done in Hibernate/JPA but they're a little awkward. There are other examples.
If you're comfortable with SQL I would strongly suggest you take a look at Ibatis. It can do 90%+ of what Hibernate can but is far simpler in implementation.
I can't think of a single reason why I'd ever choose straight JDBC (or even Spring JDBC) over Ibatis. Hibernate is a more complex choice.
Take a look at the Spring and Ibatis Tutorial.
No doubt Hibernate has its complexity.
But what I really like about the Hibernate approach (and some others too) is that the conceptual model you can get in Java is better. Although I don't think of OO as a panacea, and I don't look for theoretical purity of the design, I have found many times that OO does in fact simplify my code. As you asked specifically for details, here are some examples:
the added complexity is not in the model and entities, but in your framework for manipulating all entities for example. For maintainers, the hard part is not a few framework classes but your model, so Hibernate allows you to keep the hard part (the model) at its cleanest.
if a field (like an id, or audit fields, etc) is used in all your entities, then you can create a superclass with it. Therefore :
you write less code, but more importantly ...
there are fewer concepts in your model (the unique concept is unique in the code)
for free, you can write code more generic, that provided with an entity (unknown, no type-switching or cast), allows you to access the id.
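A sketch of the shared-superclass point, with hypothetical Customer and Supplier entities. In a JPA mapping the base class would typically be annotated with @MappedSuperclass; the annotations are left out here so the example runs on the JDK alone:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** The id concept exists exactly once in the code. */
abstract class BaseEntity {
    private Long id;

    Long getId() { return id; }

    void setId(Long id) { this.id = id; }
}

final class Customer extends BaseEntity {
    final String name;

    Customer(String name) { this.name = name; }
}

final class Supplier extends BaseEntity {
    final String name;

    Supplier(String name) { this.name = name; }
}

/** Generic helpers that work for any entity, with no casts or type switches. */
final class Entities {
    static <T extends BaseEntity> Map<Long, T> indexById(List<T> entities) {
        Map<Long, T> index = new LinkedHashMap<>();
        for (T entity : entities) {
            index.put(entity.getId(), entity);
        }
        return index;
    }

    static String describe(BaseEntity entity) {
        return entity.getClass().getSimpleName() + "#" + entity.getId();
    }
}
```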
Hibernate also has many features to deal with other model characteristics you might need (now or later; add them only as needed). Take it as an extensibility quality for your design.
You might replace inheritance (subclassing) by composition (several entities having a same member, that contains a few related fields that happen to be needed in several entities).
There can be inheritance between a few of your entities. It often happens that you have two tables that have pretty much the same structure (but you don't want to store all data in one table, because you would lose referential integrity to a different parent table).
With reuse between your entities (but only appropriate inheritance, and composition), there is usually some additional advantages to come. Examples :
there is often some way to read the data of the entities that is similar but different. Suppose I read the "title" field for three entities, but for some I replace the result with a differing default value if it is null. It is easy to have a signature "getActualTitle" (in a superclass or an interface), and implement the default value handling in the three implementations. That means the code out of my entities just deals with the concept of an "actual title" (I made this functional concept explicit), and the method inheritance takes care of executing the correct code (no more switch or if, no code duplication).
...
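The "getActualTitle" example above could look roughly like this. The three entity classes and their default values are invented for illustration; only the method name comes from the answer:

```java
import java.util.ArrayList;
import java.util.List;

/** The functional concept "actual title" is made explicit in one interface. */
interface Titled {
    /** Raw stored value, may be null. */
    String getTitle();

    /** Value the rest of the code should use. */
    String getActualTitle();
}

final class Article implements Titled {
    private final String title;

    Article(String title) { this.title = title; }

    @Override public String getTitle() { return title; }

    @Override public String getActualTitle() {
        return title != null ? title : "Untitled article";
    }
}

final class Photo implements Titled {
    private final long id;
    private final String title;

    Photo(long id, String title) { this.id = id; this.title = title; }

    @Override public String getTitle() { return title; }

    /** Different default: falls back to a label derived from the id. */
    @Override public String getActualTitle() {
        return title != null ? title : "Photo " + id;
    }
}

final class Chapter implements Titled {
    private final String title;

    Chapter(String title) { this.title = title; }

    @Override public String getTitle() { return title; }

    /** No default here: the stored value is used as is. */
    @Override public String getActualTitle() { return title; }
}

/** Client code deals only with the concept, never with the per-entity rules. */
final class TitleList {
    static List<String> actualTitles(List<? extends Titled> items) {
        List<String> titles = new ArrayList<>();
        for (Titled item : items) {
            titles.add(item.getActualTitle());
        }
        return titles;
    }
}
```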
Over time, the requirements evolve. There will be a point where your database structure has problems. With JDBC alone, any change to the database must impact the code (i.e. double cost). With Hibernate, many changes can be absorbed by changing only the mapping, not the code. The same happens the other way around: Hibernate lets you change your code (between versions for example) without altering your database (by changing the mapping, although that is not always sufficient). To summarize, Hibernate lets you evolve your database and your code independently.
For all these reasons, I would choose Hibernate :-)
I think either is a fine choice, but personally I would use hibernate. I don't think hibernate is overkill for a project of that size.
Where Hibernate really shines for me is dealing with relationships between entities/tables. Doing JDBC by hand can take a lot of code if you deal with modifying parent and children (grandchildren, siblings, etc) at the same time. Hibernate can make this a breeze (often a single save of the parent entity is enough).
There are certainly complexities when dealing with Hibernate though, such as understanding how the Session flushing works, and dealing with lazy loading.
Straight JDBC would fit the simplest cases at best.
If you want to stay within Java and OOD then going Hibernate or Hibernate/JPA or any-other-JPA-provider/JPA should be your choice.
If you are more comfortable with SQL then having Spring for JDBC templates and other SQL-oriented frameworks won't hurt.
In contrast, besides transactional control, there is not much help from having Spring when working with JPA.
Hibernate is best suited to middleware applications. Suppose we build a middleware layer on top of the database and around 20 applications access it; in that case a single Hibernate layer can satisfy the requirements of all 20 applications.
In JDBC, when we open a database connection we have to do the work in a try block, handle any exception in the catch block, and use the finally block to close the connection.
In JDBC the exceptions are checked exceptions (SQLException), so we must write try/catch blocks or throws clauses, but in Hibernate we only have unchecked exceptions.
Here, as programmers, we must close the connection ourselves, or we risk running out of connections.
If we don't close the connection in the finally block, JDBC takes no responsibility for closing it.
In JDBC we need to write SQL commands in various places. If the table structure is modified after the program has been written, the JDBC program no longer works, and we need to modify, recompile and redeploy it, which is tedious.
JDBC reports database-specific error codes when an exception occurs, but Java programmers often don't know what these error codes mean.
If we insert a record into a table that doesn't exist in the database, JDBC raises a database error such as "table or view does not exist" and throws an exception. Hibernate, if schema generation is enabled (for example via hibernate.hbm2ddl.auto), can create the missing table for us.
JDBC has no notion of lazy or eager loading; Hibernate supports both lazy and eager loading of associations.
Hibernate supports Inheritance, Associations, Collections
In Hibernate, if we save a derived class object, the fields inherited from its base class are also stored in the database; this is what is meant by Hibernate supporting inheritance.
Hibernate supports relationships like One-To-Many, One-To-One, Many-To-Many and Many-To-One.
Hibernate supports a caching mechanism. With it, the number of round trips between an application and the database can be reduced, which can improve the application's performance.
Getting pagination in hibernate is quite simple.
Hibernate has capability to generate primary keys automatically while we are storing the records into database
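On the earlier point about closing JDBC connections in a finally block: since Java 7, Connection, Statement and ResultSet implement AutoCloseable, so try-with-resources closes them for you. The sketch below uses a made-up TrackedResource instead of real JDBC objects so that it runs without a database driver; it only demonstrates the closing order:

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a JDBC Connection or Statement; records when it is closed. */
final class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> log;

    TrackedResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
    }

    @Override
    public void close() {
        log.add("closed " + name);
    }
}

final class ResourceDemo {
    /** Returns the sequence of events, with or without a simulated failure. */
    static List<String> run(boolean fail) {
        List<String> log = new ArrayList<>();
        // Resources are closed automatically, in reverse order of declaration,
        // before any catch block runs.
        try (TrackedResource conn = new TrackedResource("connection", log);
             TrackedResource stmt = new TrackedResource("statement", log)) {
            if (fail) {
                throw new IllegalStateException("query failed");
            }
            log.add("query ok");
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }
}
```

The same try (...) shape works with real java.sql objects, which removes most of the manual finally-block bookkeeping described above.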
... In-memory Session ... LazyInitializationException ...
You could look at Ebean ORM which doesn't use session objects ... and where lazy loading just works. Certainly an option, not overkill, and will be simpler to understand.
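If a very large number of users request the same data from our app or website, plain JDBC executes the query for every request, whereas Hibernate's caching (the second-level and query caches, when enabled) can serve repeated reads without going to the database each time. This is one of the most important and easiest advantages of Hibernate over JDBC.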