High Level Java Client selection for Apache Cassandra [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
There are four high level APIs to access Cassandra and I do not have time to try them all. So I hoped to find somebody who could help me to choose the proper one.
I'll try to write down my findings about them:
Datanucleus-Cassandra-Plugin
pros:
supports JPA1, JPA2, JDO1 - JDO3 - as I read in a review, JDO scales better than Hibernate with JPA
all the pros as mentioned in kundera?
cons:
no experience with JDO up to now (relevant only for me of course ;)
documentation not found!
kundera
pros:
JPA 1.0 annotations with all advantages (standard conform, no boilerplate code, ...)
promise of the following features in the near future: JPA listeners (@PrePersist, @PostPersist etc.) - Relationships (@OneToMany, @ManyToMany etc.) - Transactional support (@Transactional)
cons:
early development stage of the plugin?
bugs?
no possibility to fix problems in the JDO / JPA framework?
s7 pelops
pros:
pure java api --> finer control over persistence?
cons:
pure java api --> boilerplate code
hector 0.7
pros:
mavenized
spring integration --> dependency injection
pure java api --> finer control over persistence?
jmx monitoring?
managing of nodes seems to be easy and flexible
cons:
pure java api (no annotations) --> boilerplate code
Conclusion so far
As I am familiar with RDBMS, Hibernate, JPA and Spring, and no longer up to date with EJB, my first impression was that kundera would be the right choice. But after reading some posts about JDO and DataNucleus, I am not sure anymore. As the learning curve for DataNucleus should be steep (also for experienced JPA developers?), I am not sure whether I should go for it.
My major concern is the status of the plugin, and also the forum support/help for JDO and the Datanucleus-Cassandra-Plugin, as it is not as widespread, as far as I understood.
Is anybody out there who has experience with some of the frameworks already and can give me a hint? Maybe a mixed strategy would make sense as well: in cases (if they exist) where JDO is not flexible/sufficient/whatever enough for my needs, could I fall back to one of the easier APIs of pelops or hector? Is this possible? Is there an approach like in JPA to get an SQL connection and fetch/put data?
After reading a bit more, I found the following additional information:
Datanucleus-Cassandra-Plugin is based on pelops, which can also be accessed directly for more flexibility and more performance (?). Pelops should be used on the column families with a lot of data; JDO/JPA access should only be used on "administrative" data, where performance is not so important and the data volume is not overwhelming.
Which still leaves open the question of whether to start with hector or pelops:
pelops for its later Datanucleus-Cassandra-Plugin extensibility, or
hector for its better support for node handling.

I tried most of these solutions and find hector the best. Even when you have some problem you can always reach the people who wrote hector in #cassandra on freenode, and the code is more mature as far as I am concerned. In a cassandra client the most critical part is connection pooling management (since all the clients do mostly the same operations through thrift, connection pooling is what makes a high level client worthwhile). In that respect I would vote for hector, since I have been using it in production for over a year now with no visible problem (one reconnect issue was fixed as soon as I discovered it and sent an email about it).
I am still using cassandra 0.6 though.

The author of the datanucleus plugin, Todd Nine, is working on the next-gen JPA support in Hector now.

The Hector client was the API that we chose because of the following things that it had:
Connection Pooling (huge performance gain when sharing a connection to a node)
Complete Custom Configuration using interfaces for most everything.
Auto Discovery Hosts
Custom Load Balancing Policy definitions (LeastActiveBalancingPolicy or RoundRobinBalancingPolicy or implement LoadBalancingPolicy)
Light-weight adapter on top of the Thrift API.
Great examples: See hector-examples
Built in JMX support.
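To illustrate what the two named load-balancing policies decide, here is a minimal sketch in plain Java. It does not use Hector's API: NodePool, PickPolicy, RoundRobinPick and LeastActivePick are hypothetical stand-ins invented for this example, and Hector's real LoadBalancingPolicy interface is shaped differently.

```java
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a pooled connection set to one Cassandra node.
final class NodePool {
    final String host;
    final AtomicInteger activeConnections = new AtomicInteger();

    NodePool(String host) {
        this.host = host;
    }
}

// Hypothetical policy interface; not Hector's LoadBalancingPolicy.
interface PickPolicy {
    NodePool pick(List<NodePool> pools);
}

// Cycles through the pools in order, one request per pool.
final class RoundRobinPick implements PickPolicy {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public NodePool pick(List<NodePool> pools) {
        int index = Math.floorMod(counter.getAndIncrement(), pools.size());
        return pools.get(index);
    }
}

// Chooses the pool that currently has the fewest connections in use.
final class LeastActivePick implements PickPolicy {
    @Override
    public NodePool pick(List<NodePool> pools) {
        return pools.stream()
                .min(Comparator.comparingInt((NodePool p) -> p.activeConnections.get()))
                .orElseThrow(() -> new IllegalStateException("no pools configured"));
    }
}
```

Round robin spreads requests evenly regardless of load, while the least-active variant reacts to slow nodes because their connections stay checked out longer.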
Downside of Hector:
Documentation not bad, but the Java Docs are lacking a bit. That could easily be a Git fork / pull request by the user community.
The ORM support was a bit limited, but that was not urgent in our case. I couldn't get some of the one-to-many associations to work easily, and there was no description of which type of Cassandra model is used (super columns or column families for associated collections). There is also a lack of Java examples (maybe there are some, please post if you find some).
Also, I tried using kundera with very little success. There are not many examples to use or try, and very little forum support. It appears to be maintained by one person, which makes it even harder to choose a tool like that. Based on the SVN activity, it appears to be migrating to Hadoop instead, or adding support for it as well.

Kundera 2.0.4 released.
Major Changes in this release:
Cross-datastore persistence (easy to migrate an existing MySQL app to NoSQL)
Support for relational databases (e.g. MySQL)
Replaced Solandra with Lucene-based indexing.
Support added for bi-directional associations.
Performance improvement fixes.

I would also propose Astyanax. I'm working with it and I'm quite happy; only the documentation is not really good.
Astyanax API
Astyanax implements a fluent API which guides the caller to narrow or
customize the query via a set of well defined interfaces. We've also
included some recipes that will be executed efficiently and as close
to the low level RPC layer as possible. The client also makes heavy
use of generics and overloading to almost eliminate the need to
specify serializers.
Some key features of the API include:
Key and column types are defined in a ColumnFamily class which
eliminates the need to specify serializers.
Multiple column family key types in the same keyspace. Annotation based composite column names.
Automatic pagination.
Parallelized queries that are token aware.
Configurable consistency level per operation.
Configurable retry policy per operation.
Pin operations to specific node.
Async operations with a single timeout using Futures.
Simple annotation based object mapping.
Operation result returns host, latency, attempt count.
Tracer interfaces to log custom events for operation failure and success.
Optimized batch mutation.
Completely hide the clock for the caller, but provide hooks to customize it.
Simple CQL support.
RangeBuilders to simplify constructing simple as well as composite column ranges.
Composite builders to simplify creating composite column names.
Recipes
Recipes for some common use cases:
CSV importer.
JSON exporter to convert any query result to JSON with a wide range of
customizations.
Parallel reverse index search.
Key unique constraint validation.
http://techblog.netflix.com/2012/01/announcing-astyanax.html

I suggest you give Kundera-2.0.1 a try. It has undergone major changes since its inception and I see a lot of new features getting added and bugs being fixed. Currently it supports JPA 1.0 and Cassandra 0.7.6, but they are planning to add support for Cassandra 0.8 and JPA 2.0 very soon. There is a pretty good example here: https://github.com/impetus-opensource/Kundera/wiki/Getting-Started-in-5-minutes

You can try Achilles, a new Entity Manager I've developed that supports all CQL3 features.
Entity mapping
JPA style operations
Limited support for join
Mapping of clustered entities using compound primary key
Queries (native, typed, slice)
Support for counters
Support for Consistency level
TTL & timestamp
JUnit 4 Rule to start embedded Cassandra server for testing
And more ...
There are 2 implementations: Thrift & CQL
The Thrift version relies on Hector under the hood.
The CQL version pulls the brand new Java Driver Core from Datastax for all operations
Quick reference here

Related

JDBC VS Hibernate [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
We have been using JDBC for a very long time in our web applications. The main reason we use it is that we have 100% control over the code and the SQL, and can fix things by hand. Apart from that, we use triggers inside the database, and the database is developed separately by DB experts.
However many now recommend using Hibernate so we also thought about using it. But, we found the below issues.
Hibernate cannot connect to an "existing" database. It always tries to create one of its own.
Our database might be accessed by the same application running on different platforms (cloud, server, VPS, personal computer). Hibernate's caching can cause problems in this situation.
We never like to give the "table creating work" to the Java code. We create tables manually, always.
We might have to use very long and complex SQL statements. Last time we used a statement with more than 150 lines, joining more than 20 tables. We doubt whether Hibernate can cope with this.
Our SQL code is nice and standard. Hibernate-generated code seems a bit dirty to us.
We always use MySQL. We never use any other DB.
The application we create requires maximum security, as it is medical. If even one data record is leaked, we are done.
There are a lot of foreign keys, primary keys, composite keys, unique keys etc. in the database. In forums, some complained that Hibernate messed with those.
We decided to try Hibernate because some people claim, "Are you software engineers? You are using already-dead JDBC!"
Considering these, please let me know whether the above points are actually true (as I said, I got to know them via googling, discussions etc.) or not. And what are the pros and cons of Hibernate vs. Java JDBC?
Answering issues listed above:
1. Hibernate cannot connect with an "Existing" database. It always try to create a one of its own.
This is wrong. Hibernate can connect to an existing database, and it doesn't always try to recreate it. You just need to turn off settings like hbm2ddl.auto.
2. Our database might access by same application which is in different platforms (cloud, server, VPS, Personal Computer). Hibernate can make problems because of its caching in this situation.
Hibernate has an adjustable cache, so this is also not a problem.
3. We never like to give the "table creating work" to the java code. We create tables manually, always.
No problem. See point 1 above. Furthermore, there are several convenient libraries for managing table creation and updates separately (e.g. liquibase), which work perfectly alongside Hibernate.
4. We might have to use very long and complex SQL statements. Last time we used an statement with more than 150 lines, joining more than 20 tables. We doubt whether we will face troubles in this when it comes to Hibernate.
You can always use direct JDBC calls and invoke native SQL queries via Hibernate, if needed.
5. Our SQL code is nice and standard. Hibernate generated code seems to be bit dirty for us.
Again, if you need to run complicated hand-written SQL instead of the SQL Hibernate generates, you can.
6. We always use MySQL. Never use any other DB.
Not a problem at all. Hibernate has special MySQL dialect support: org.hibernate.dialect.MySQLDialect.
7. The application we create require max security, related to medical. If at least one data record is leaked, we are done.
Security issues aren't related to ORM techniques. Hibernate is just a logical and convenient object-oriented layer between raw JDBC calls and the programmer's tools. It has no influence on general network security.
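For points 1, 3 and 6 above, the relevant settings can be sketched as a plain java.util.Properties object. The two property keys and the dialect class are the standard Hibernate names; the class and method names (HibernateSettingsSketch, forExistingSchema) are made up for this example, the choice of "validate" is an assumption about what a hand-managed schema would want, and connection settings are omitted.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys and values are Hibernate's own.
final class HibernateSettingsSketch {

    // Builds settings for a MySQL schema that is created and changed by hand.
    static Properties forExistingSchema() {
        Properties props = new Properties();
        // "validate" only checks the mappings against the existing tables;
        // Hibernate does not create or alter anything in this mode.
        props.setProperty("hibernate.hbm2ddl.auto", "validate");
        // MySQL dialect named in point 6.
        props.setProperty("hibernate.dialect", "org.hibernate.dialect.MySQLDialect");
        return props;
    }
}
```

Equivalent entries can go into hibernate.properties or hibernate.cfg.xml instead of being built in code.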
Hibernate is a great tool and you'll find plenty of documentation, books, and blog articles about it.
I will address all your concerns:
Hibernate cannot connect with an "Existing" database. It always tries to create one of its own.
Hibernate should use a separate database schema management procedure even for integration testing. You should use an incremental versioning tool like FlywayDB to manage your schema changes.
Our database might access by same application which is in different platforms (cloud, server, VPS, Personal Computer). Hibernate can make problems because of its caching in this situation.
You don't have to use the 2nd level cache, which uses 3rd party caching implementations. All caching solutions may break transactional consistency. The first level cache guarantees session-level repeatable reads and with the optimistic locking in place you can prevent lost updates.
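The lost-update protection mentioned here can be illustrated without Hibernate. The following is only a sketch of the idea behind a version column (what JPA expresses with @Version); VersionedRow and OptimisticStore are hypothetical in-memory stand-ins, not Hibernate classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical row with a version counter, similar in spirit to a JPA @Version field.
final class VersionedRow {
    final String value;
    final long version;

    VersionedRow(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// In-memory stand-in for a table. An ORM would instead issue something like
// "UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?".
final class OptimisticStore {
    private final Map<Long, VersionedRow> rows = new ConcurrentHashMap<>();

    void insert(long id, String value) {
        rows.put(id, new VersionedRow(value, 0));
    }

    VersionedRow read(long id) {
        return rows.get(id);
    }

    // Succeeds only if the stored row is still the one the caller read;
    // a stale copy no longer matches, so the second writer is rejected.
    boolean update(long id, VersionedRow expected, String newValue) {
        VersionedRow next = new VersionedRow(newValue, expected.version + 1);
        return rows.replace(id, expected, next);
    }
}
```

If two users read the same row and both try to write, the first update wins and the second returns false instead of silently overwriting it; the caller can then reload and retry.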
We never like to give the "table creating work" to the java code. We create tables manually, always.
The DB should be separated from your ORM tool. That's a best practice anyway.
We might have to use very long and complex SQL statements. Last time we used an statement with more than 150 lines, joining more than 20 tables. We doubt whether we will face troubles in this when it comes to Hibernate.
Hibernate is great for write operations and for concurrency control. You still need to use native SQL for advanced queries (window functions, CTE). But Hibernate allows you to run native queries.
Our SQL code is nice and standard. Hibernate generated code seems to be bit dirty for us.
You don't need, and probably shouldn't use, the hbm2ddl utility anyway.
We always use MySQL. Never use any other DB.
That's even better. You can therefore use advanced native queries without caring about database portability issues.
The application we create require max security, related to medical. If at least one data record is leaked, we are done.
Hibernate doesn't prevent you from securing your database or the data access code. You can still use database security measures with Hibernate too. You can even use Jasypt to enable all sorts of security-related features:
advanced password hashing
two-way encryption
There are lot of foreign keys, Primary Keys, Composite Keys, Unique Keys etc etc in database. In forums, some complained that Hibernate messed with those.
All of those are supported by Hibernate. Aside from the JPA conventions, Hibernate also offers specific mappings for exotic cases.
We decided to try hibernate because some people claims, "Are you Software Engineers? You are using already dead JDBC !!. "
That's not the right argument for switching from a library you already master. If you think you can benefit from using Hibernate then that's the only compelling reason for switching from JDBC.
Using plain old JDBC does not mean you are lagging behind the IT industry; Hibernate itself uses JDBC in its underlying layer.
What we should look at is the advantages it gives us:
1.) Cache Mechanism.
2.) Managing sessions, transactions etc.
3.) Reduced effort in writing queries, plus Hibernate utilities like the Query API, Criteria API and HQL
The questions that you have raised are more or less covered in Hibernate docs.
Also, there are many more caching options available (ehcache, infinispan), depending on the server we are deploying to (JBoss, WebLogic, Tomcat etc.) and on the environment (cloud, distributed cache etc.).
Hibernate still provides you with the option of turning off automatic schema creation and pointing it to the schema created by you.
Here are the quick answers that I know
1) You can connect to an existing database. But yeah as stated here
If you don't have a solid object model, I'd say that Hibernate is a
terrible choice.
2) As your database is accessed from different applications, you can maintain locks. On the other hand, you can turn off caching as done here.
3) You can create tables manually and connect it using .hbm.xml file.
4) You can use any type of query in Hibernate, such as plain SQL queries or criteria.
5) You can directly use SQL code in Hibernate, if you want. Other option is to use criteria.
6) Hibernate is NOT DB specific. You can go for any Database and connect it with hibernate.
7) Using locks and giving rights in database you can maintain security.
8) Agreed that foreign keys are messy in Hibernate if you do not handle them well. So use an OO approach and maintain cascades well; then Hibernate will be a good choice.

hibernate vs ebean as scalable, performant ORM

We are going to write a service for which we are evaluating the technology stack. For the ORM we are thinking of using Hibernate, but I heard about Ebean from one of my colleagues. We don't have any experience with Ebean.
So my question is: are there any disadvantages associated with Hibernate, any scalability or performance bottlenecks? And what advantages does Ebean bring to the table?
What does Ebean bring to the table?
In short, Ebean brings a full-featured ORM that is a lot easier to use and, most importantly, to optimize (optimization is easy to do by hand but can also be done automatically via profiling).
A query language designed to optimise object graph construction via good support for Partial Objects and built in avoidance of N + 1
A "Sessionless" ORM ... architected to not have attach/detach semantics (So this makes it easier to use / fast to master).
Ebean now has SQL2011 History support and ElasticSearch integration. You could argue Hibernate has similar features.
Reference links:
ElasticSearch http://ebean-orm.github.io/docs/features/elasticsearch/
Automatic query tuning http://ebean-orm.github.io/docs/query/autotune
N + 1 http://ebean-orm.github.io/docs/query/nplus1
There are a lot of issues with Hibernate, and basically with any implementation of JPA, in large and very scalable applications. You should consider using another solution altogether. The issues are well described in the article Large Application Model issues, and how the model should look is described in the article Model for large applications.
As mentioned before, Ebean is a sessionless ORM, so you don't need to think about sessions. Hibernate has a first level cache which is impossible to disable. This means that if you query an item through the ORM and then delete it directly with SQL, it stays in the cache. You can explicitly clear the cache to get the most recent results from the database, but unfortunately such behavior may bring errors like "detached entity passed to persist".

Hibernate multiple users, dynamically changing

There are technically two questions here, but they are tightly coupled :)
I'm using Hibernate in a new project. It's a POS project.
It uses Oracle database.
We have decided to use Hibernate because the project is large, and because it provides (the most popular) ORM capabilities.
Spring is, for now, out of the question - the reason being: the project is a Swing client-server application, and it adds needless complexity. And, also, Spring is supposed to be very hungry on the hardware resources.
There is a possibility to throw away Hibernate, and to use JDBC. Why? The project requirement is precise database interaction. Meaning, we should have complete control over the connections, sessions and transactions(and, yes, going as low as unoptimized queries).
The first question is - what are your opinions on the mentioned requirement?
The second question revolves around Hibernate.
We developed a simple Hibernate pilot project.
Another project requirement is - one database user / one connection per user / one session per user / transactions are flexible (we can end them when we want, like sessions).
Multiple users can log in to the application at the same time.
We achieved something like that. To be precise, we achieved the full described functionality without the multiple users requirement.
Now, looking at the available resources, I came to a conclusion that if we are to have multiple users on the database(on the same schema), we will end up using multiple SessionFactory, implementing a dynamic ConnectionProvider for new user connections. Why?
The users hashed passwords are in the database, so we need to dynamically add a user to the list of current users.
The second question is - can this be done a little easier, it seems weird that Hibernate doesn't support such configurations.
Thank you.
If you're pondering whether to use Hibernate or JDBC, honestly, go for JDBC. If your domain model is not too complex, you don't really get a lot of advantages from using Hibernate. On the other hand, using JDBC will greatly improve performance, as you have better control over your queries, and you get A LOT less memory usage from not having all the Hibernate overhead. Balance this by making as detailed a first sketch of your model as possible. If you're able to sketch it all from the start (no parts that are likely to change wildly throughout the project), and if said model doesn't look too involved, JDBC will be your friend.
About your users and sessions, I think you might be mistaken (though it could just be me): I don't think you need multiple SessionFactories to have multiple sessions. A SessionFactory is a heavy object to initialize, but once you have one you can get multiple Hibernate Session objects from it, which are lightweight.
As a final remark, if you truly stick with an ORM solution (for whatever reason), choose the EclipseLink JPA2 implementation if possible. JPA2 has more features than Hibernate and the EclipseLink implementation is less buggy than Hibernate.
So, as far as Hibernate goes, I still don't know if the only way to dynamically change database users (change database connections) is to create multiple session factories, but I presume it is.
We have lowered our requirements and decided to use Hibernate with only one user on the database (one connection) and one session per user (multiple sessions / multiple "logical" users). We created a couple of Java classes to wrap that functionality. Resources on how this can be done can be found here.
Why did we use Hibernate eventually? Using JDBC is more precise and more flexible, but the effort of once again mapping the ResultSet values into objects amounts to the same ORM approach, done manually.
For example, if I have a GUI that needs to save a Page, first I have to fetch all the Page's Articles and then, after I save the Page, update every Article's FK to that Page. Notice that I'm speaking in nouns (objects), and I don't see any other way to wrap the Page/Articles, except using global state. This is the one thing I wouldn't like to see in my application, and we are, after all, using Java, an OO language.
When we already have an ORM mapper that can be configured (forced would be the more precise word in this particular example) to process these things itself, why program it ourselves?
Also, we decided to use Google Guice - it's much faster, typesafe, and could significantly simplify our development/maintenance/testing.

An alternative to Hibernate or TopLink? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Is there a viable alternative to Hibernate? Preferably something that doesn't base itself on JPA.
Our problem is that we are building a complex (as in, many objects refer to each other) stateful RIA system. It seems that Hibernate is designed to be used mainly on one-off applications - JSF and the like.
The problem is mainly that of lazy loading. Since there can be several HTTP requests between the initialization and actually loading lazy collections, a session per transaction is out of the question. A long-lived session (one per application) doesn't work well either, because once a transaction hits a snag and throws an exception, the whole session is invalidated, thus the lazy loaded objects break. Then there's all kinds of stuff that just don't work for us (like implicit data persisting of data from outside an initialized transaction).
My poor explanations aside, the bottom line is that Hibernate does magic we don't like. It seems like TopLink isn't any better, it also being written on top of EJB.
So, a stateless persistence layer (or even bright-enough object-oriented database abstraction layer) is what we would need the most.
Any thoughts, or am I asking for something that doesn't exist?
Edit: I'm sorry for my ambiguous terminology, and thank you all for your corrections and insightful answers. Those who corrected me, you are all correct, I meant JPA, not EJB.
If you're after another JPA provider (Hibernate is one of these) then take a look at EclipseLink. It's far more fully-featured than the JPA 1.0 reference implementation of TopLink Essentials. In fact, EclipseLink will be the JPA 2.0 reference implementation shipped with Glassfish V3 Final.
JPA is good because you can use it both inside and outside a container. I've written Swing clients that use JPA to good effect. It doesn't have the same stigma and XML baggage that EJB 2.0/2.1 came with.
If you're after an even lighter weight solution then look no further than ibatis, which I consider to be my persistence technology of choice for the Java platform. It's lightweight, relies on SQL (it's amazing how much time ORM users spend trying to make their ORM produce good SQL) and does 90-95% of what JPA does (including lazy loading of related entities if you want).
Just to correct a couple of points:
JPA is the persistence layer of EJB, not built on EJB;
Any decent JPA provider has a whole lot of caching going on and it can be hard to figure it all out (this would be a good example of "Why is Simplicity So Complex?"). Unless you're doing something you haven't indicated, exceptions shouldn't be an issue for your managed objects. Runtime exceptions typically roll back transactions (if you use Spring's transaction management, and who doesn't do that?). The provider will maintain cached copies of loaded or persisted objects. This can be problematic if you want to update outside of the entity manager (requiring an explicit cache flush or use of EntityManager.refresh()).
As mentioned, JPA <> EJB, they're not even related. EJB 3 happens to leverage JPA, but that's about it. We have a bunch of stuff using JPA that doesn't even come close to running EJB.
Your problem is not the technology, it's your design.
Or, I should say, your design is not an easy fit on pretty much ANY modern framework.
Specifically, you're trying to keep transactions alive over several HTTP requests.
Naturally, most every common idiom is that each request is in itself one or more transactions, rather than each request being a portion of a larger transaction.
There is also obvious confusion when you used the term "stateless" and "transaction" in the same discussion, as transactions are inherently stateful.
Your big issue is simply managing your transactions manually.
If your transaction is occurring over several HTTP requests, AND those HTTP requests happen to be running "very quickly", right after one another, then you shouldn't really be having any real problem, save that you WILL have to ensure that your HTTP requests are using the same DB connection in order to leverage the database's transaction facility.
That is, in simple terms, you get a connection to the DB, stuff it in the session, and make sure that for the duration of the transaction, all of your HTTP requests go through not only that same session, but in such a way that the actual Connection is still valid. Specifically, I don't believe there is an off the shelf JDBC connection that will actually survive failover or load balancing from one machine to another.
So, simply, if you want to use DB transactions, you need to ensure that you're using the same DB Connection.
Now, if your long running transaction has "user interactions" within it, i.e. you start the DB transaction and wait for the user to "do something", then, quite simply, that design is all wrong. You DO NOT want to do that, as long lived transactions, especially in interactive environments, are just simply Bad. Like "Crossing The Streams" Bad. Don't do it. Batch transactions are different, but interactive long lived transactions are Bad.
You want to keep your interactive transactions as short lived as practical.
Now, if you can NOT ensure you will be able to use the same DB connection for your transaction, then, congratulations, you get to implement your own transactions. That means you get to design your system and data flows as if you have no transactional capability on the back end.
That essentially means that you will need to come up with your own mechanism to "commit" your data.
A good way to do this would be where you build up your data incrementally into a single "transaction" document, then feed that document to a "save" routine that does much of the real work. Like, you could store a row in the database, and flag it as "unsaved". You do that with all of your rows, and finally call a routine that runs through all of the data you just stored, and marks it all as "saved" in a single transaction mini-batch process.
Meanwhile, all of your other SQL "ignores" data that is not "saved". Throw in some time stamps and have a reaper process scavenge these dead "unsaved" rows, as these are "uncommitted" transactions (if you really want to bother: it may well be cheaper to just leave dead rows in the DB, depending on volume).
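The "unsaved"/"saved" flag scheme can be sketched in plain Java with an in-memory list standing in for the table. Everything named here (StagedRow, StagingTable, the batch id and the millisecond timestamps) is a hypothetical illustration of the idea, not code from any framework; in a real system the commit step would be a single UPDATE inside one short DB transaction.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical row carrying the "saved" flag and a creation time stamp.
final class StagedRow {
    final String batchId;
    final String payload;
    final long createdAtMillis;
    boolean saved; // stays false until the whole batch is committed

    StagedRow(String batchId, String payload, long createdAtMillis) {
        this.batchId = batchId;
        this.payload = payload;
        this.createdAtMillis = createdAtMillis;
    }
}

// In-memory stand-in for the table that collects rows across HTTP requests.
final class StagingTable {
    private final List<StagedRow> rows = new ArrayList<>();

    // Called from each request: the row is stored but not yet visible.
    synchronized void stage(String batchId, String payload, long createdAtMillis) {
        rows.add(new StagedRow(batchId, payload, createdAtMillis));
    }

    // The final "save" step: marks every row of the batch as saved in one go.
    synchronized int commit(String batchId) {
        int flipped = 0;
        for (StagedRow row : rows) {
            if (row.batchId.equals(batchId) && !row.saved) {
                row.saved = true;
                flipped++;
            }
        }
        return flipped;
    }

    // What all other queries see: saved rows only.
    synchronized List<String> visiblePayloads() {
        List<String> visible = new ArrayList<>();
        for (StagedRow row : rows) {
            if (row.saved) {
                visible.add(row.payload);
            }
        }
        return visible;
    }

    // Reaper: removes unsaved rows created before the cutoff.
    synchronized int reapUnsavedOlderThan(long cutoffMillis) {
        int before = rows.size();
        rows.removeIf(row -> !row.saved && row.createdAtMillis < cutoffMillis);
        return before - rows.size();
    }
}
```

An abandoned batch simply never gets its commit call, so its rows stay invisible until the reaper removes them.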
It's not as bad as it sounds. If you truly want a stateless environment, which is what it sounds like to me, then you'll need to do something like this.
Mind, in all of this the persistence tech really has nothing to do with it. The problem is how you use your transactions, rather than the tech so much.
I think you should have a look at Apache Cayenne, which is a very good alternative to "big" frameworks. With its decent modeler and good documentation, the learning curve is short.
I've looked at SimpleORM last year, and was very impressed by its lightweight no-magic design. Now there seems to be a version 3, but I don't have any experience with that one.
Ebean ORM (http://www.avaje.org)
It is a simpler more intuitive ORM to use.
Uses JPA Annotations for Mapping (@Entity, @OneToMany etc)
Sessionless API - No Hibernate Session or JPA Entity Manager
Lazy loading just works
Partial Object support for greater performance
Automatic Query tuning via "Autofetch"
Spring Integration
Large Query Support
Great support for Batch processing
Background fetching
DDL Generation
You can use raw SQL if you like (as good as Ibatis)
LGPL licence
Rob.
BEA Kodo (formerly Solarmetric Kodo) is another alternative. It supports JPA, JDO, and EJB3. It is highly configurable and can support aggressive pre-fetching, detaching/attaching of objects, etc.
Though, from what you've described, Toplink should be able to handle your problems. Mostly, it sounds like you need to be able to attach/detach objects from the persistence layer as requests start and end.
Just for reference, here is why the OP's design is his biggest problem: spanning transactions across multiple user requests means you can have as many open transactions at a given time as there are users connected to your app, because a transaction keeps its connection busy until it is committed or rolled back. With thousands of simultaneously connected users, this can potentially mean thousands of connections. Most databases don't support this.
Neither Hibernate nor Toplink (EclipseLink) is based on EJB; they are both POJO persistence frameworks (ORM).
I agree with the previous answer: iBatis is a good alternative to ORM frameworks: full control over sql, with a good caching mechanism.
One other option is Torque, I am not saying it is better than any of the options mentioned above but just that it is another option to look at.
It is getting quite old now but may fit some of your requirements.
Torque
When I was myself looking for a replacement to Hibernate I stumbled upon DataNucleus Access Platform, which is an Apache2-licensed ORM. It isn't just ORM as it provides persistence and retrieval of data also in other datasources than RDBMS, like LDAP, DB4O and XML. I don't have any usage experience, but it looks interesting.
Consider breaking your paradigm completely with something like tox. If you need Java classes you could load the XML result into JDOM.

What JDBC tools do you use for synchronization of data sources?

I'm hoping to find out what tools folks use to synchronize data between databases. I'm looking for a JDBC solution that can be used as a command-line tool.
There used to be a tool called Sync4J that used the SyncML framework but this seems to have fallen by the wayside.
I have heard that the Data Replication Service provided by Db4O is really good. It allows you to use Hibernate to back onto a RDBMS - I don't think it supports JDBC tho (http://www.db4o.com/about/productinformation/drs/Default.aspx?AspxAutoDetectCookieSupport=1)
There is an open source project called Daffodil, but I haven't investigated it at all. (https://daffodilreplicator.dev.java.net/)
The one I am currently considering using is called SymmetricDS (http://symmetricds.sourceforge.net/)
There are others, they each do it slightly differently. Some use triggers, some poll, some use intercepting JDBC drivers. You need to decide what technical limitations you are under to determine which one you really want to use.
Wikipedia provides a nice overview of different techniques (http://en.wikipedia.org/wiki/Multi-master_replication) and also provides a link to another alternative DBReplicator (http://dbreplicator.org/).
If you have a model and DAO layer that exists already for your codebase, you can just create your own sync framework, it isn't hard.
Copying data is as simple as:
read an object from database A
remove database metadata (uuid, etc)
insert into database B
Syncing has some level of knowledge about what has been synced already. You can either do it at runtime by getting a list of uuids from TableInA and TableInB and working out which entries are new, or you can have a table of items that need to be synced (populate with a trigger upon insert/update in TableInA), and run from that. Your tool can be a TimerTask so databases are kept synced at the time granularity that you desire.
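The first approach (comparing the uuids found in both tables at runtime) can be sketched as follows. The maps stand in for TableInA and TableInB, keyed by uuid; SyncDiffSketch and its method names are hypothetical, and the sketch only copies rows that are new in A, without handling updates, deletes or conflicts.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; the maps are in-memory stand-ins for the two tables.
final class SyncDiffSketch {

    // Returns the uuids present in the source but not yet in the target.
    static Set<String> missingInTarget(Set<String> sourceIds, Set<String> targetIds) {
        Set<String> missing = new LinkedHashSet<>(sourceIds);
        missing.removeAll(targetIds);
        return missing;
    }

    // Copies every row that is new in tableA into tableB and reports how many were copied.
    static int syncNewRows(Map<String, String> tableA, Map<String, String> tableB) {
        Set<String> missing = missingInTarget(tableA.keySet(), tableB.keySet());
        for (String uuid : missing) {
            tableB.put(uuid, tableA.get(uuid));
        }
        return missing.size();
    }
}
```

Calling syncNewRows from a TimerTask would give the periodic sync described above; running it twice in a row copies nothing the second time.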
However there is probably some tool out there that does it all without any of this implementation faff, and each implementation would be different based on business needs anyway. In addition at the database level there will be replication tools.
True synchronization requires some data that I hope your database schema has (you can read the SyncML doc to see how they proceed). Sync4J won't help you much, it's really high-level and XML oriented. If you don't foresee any conflicts (which means: really easy synchronisation), you could try with a lightweight ETL like Enhydra Octopus.
I'm primarily using Oracle at the moment, and the most full-featured route I've come across is Red Gate's Data Compare:
http://www.red-gate.com/products/oracle-development/data-compare-for-oracle/
This old blog gives a good summary of the solution routes available:
http://www.novell.com/coolsolutions/feature/17995.html
The JDBC-specific offerings I've come across have been very basic. The solution mentioned by Aidos seems the most feature complete if you want to go down the publish-subscribe route:
http://symmetricds.codehaus.org/
Hope this helps.
