"Select or create" from database, in Java

"Select or create" from database, in Java - java

I've done all my database development for the past few years in Ruby, mostly using ActiveRecord. Now I'm stuck using Java for a project, and it feels so verbose and hamfisted, I'm wondering if I'm doing things wrong.
In an ORM paradigm, if I want to insert into related tables, I'd do something like
# Joe Bob got a new car
p = Person.find_or_create_by_name("Joe Bob");
Car.new({:make=>"Toyota", :plate=>"ABC 123", :owner=>p});
In Java, at least using JDBC directly, I'm going to have to do the Person lookup by hand, insert if it doesn't exist, then create the Car entry.
Of course, in real life, it's more than just 2 tables and the pain scales exponentially. Surely there's a better way?

You can use ORM solutions for Java - there are various solutions available.
Links worth looking at:
Hibernate - http://www.hibernate.org/ - probably the leading Java ORM solution
SO Question - Hibernate, iBatis, Java EE or other Java ORM tool
Having said that, I've usually found that for complex applications an ORM frequently causes more trouble than it is worth (and yes, this does include Ruby projects with ActiveRecord). Sometimes it really does make sense to just get at the data directly via SQL rather than attempt to force an object-oriented facade on top of it.

The better way is to learn SQL! The ORM you like so much writes SQL for you behind the scenes.
So you can make a quick helper function that tries to select the record, and if it doesn't exist creates it for you.
In MySQL you can use INSERT IGNORE ..... which will insert the row only if it doesn't exist.
And here is a special bit of SQL you may like (MySQL only):
INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE id=LAST_INSERT_ID(id), c=3;
This tries to insert the record. If the row doesn't exist yet, the insert sets the auto_increment ID as usual, and you retrieve it in your program.
But if it does exist, the row is updated instead (only c is set in that case). The cool part is that id=LAST_INSERT_ID(id) makes LAST_INSERT_ID() return the existing row's ID, just like it would after an insert.
So either way you get the ID field, all in a single bit of SQL.
SQL is a very nice language - you should learn it and not rely on the pseudo-language of an ORM.

If you are using JDBC, you need to look up the person yourself and create it if it does not exist. There is no better way with plain JDBC.
But you can use Hibernate; it saves you from writing the O-R mapping yourself and reduces the boilerplate.
Since you come from Ruby, if you find it painful to write all the SQL queries and JDBC boilerplate, then the better way is to use an ORM. I recommend one of the following:
Hibernate
JPA (If you want to change the ORM implementation then use JPA)
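For comparison, the manual lookup-then-insert that plain JDBC requires can be sketched as below. This is only a minimal sketch under assumptions: the `person` table, its `id` and `name` columns, and the `PersonLookup` class are hypothetical, and the `id` column is assumed to be auto-generated by the database.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; table and column names are assumptions, not from the question.
final class PersonLookup {
    static final String SELECT_SQL = "SELECT id FROM person WHERE name = ?";
    static final String INSERT_SQL = "INSERT INTO person (name) VALUES (?)";

    // Returns the id of the person with the given name, inserting a row first if none exists.
    static long findOrCreate(Connection conn, String name) throws SQLException {
        try (PreparedStatement select = conn.prepareStatement(SELECT_SQL)) {
            select.setString(1, name);
            try (ResultSet rs = select.executeQuery()) {
                if (rs.next()) {
                    return rs.getLong(1); // row already exists
                }
            }
        }
        // No row found: insert one and ask the driver for the generated key.
        try (PreparedStatement insert =
                 conn.prepareStatement(INSERT_SQL, Statement.RETURN_GENERATED_KEYS)) {
            insert.setString(1, name);
            insert.executeUpdate();
            try (ResultSet keys = insert.getGeneratedKeys()) {
                if (keys.next()) {
                    return keys.getLong(1);
                }
                throw new SQLException("No generated key returned for person " + name);
            }
        }
    }
}
```

Note that a select followed by an insert is not atomic. Two concurrent callers could both miss the row and both insert, so a unique constraint on the name column (or a transaction with suitable isolation) would likely be needed in practice.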

Sormula contains an active record package. The save method will update an existing record or insert if no record exists.
See the active record example on the website.
Also see org.sormula.tests.active.SaveTest.java within the project:
SormulaTestAR record = new SormulaTestAR();
record.attach(getActiveDatabase()); // record needs to know data source
record.setId(8002);
record.setType(8);
record.setDescription("Save one AR 2");
record.save();

Looks like I'm late here, however, ActiveJDBC will do what you want in Java:
Person p = Person.findOrCreateIt("name", "Joe Bob");
Car car = Car.createIt("make", "Toyota", "plate", "ABC 123", "owner", p);
There is a ton more it can do; check it out at: http://javalite.io/

Related

dynamic sql generation design

Good Afternoon,
We are building a web application, and as part of it a search feature. We have a design question about that search functionality.
The field names in the UI and in the DB are different, i.e. a field called "Number" in the UI is called Text10 in the DB. These are the two issues:
How do we generate the SQL when the user gives UI field names? We have a table in the DB where we maintain the configuration (UI name to DB name).
The user selects the columns he wants to search, say the three fields "Number, Description, Price". Once the SQL is generated, how do we know which data corresponds to which column? Do we have to maintain an index capturing the position, or a bean?
What is the better way to gather the data from the result set?
Thanks

A solution that promotes commonality between UI and database column names would be nice but probably not feasible.
Some sort of mapping table that captures the following will work:
META-DB-TABLE-NAME
META-DB-COLUMN-NAME
META-UI-COLUMN-NAME
Personally I would prefer to keep this mapping meta-data as close to the database as possible.
User-defined meta data is nicely described here from an Oracle perspective:
http://docs.oracle.com/cd/B28359_01/appdev.111/b28369/xdb_repos_meta.htm
Do some research on this and keep us informed with what you find. Very interesting question!
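One way to picture how such a mapping could drive SQL generation is the sketch below. It is only an illustration under assumptions: the `SearchQueryBuilder` class, the `ITEMS` table, and the `Text11` and `Num3` columns are hypothetical (only `Text10` comes from the question), and in a real application the map would be loaded from the configuration table. Each DB column is aliased with its UI name, so the result set can be read by UI name instead of by position.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: translates UI field names into an aliased SELECT list.
final class SearchQueryBuilder {
    private final String table;
    private final Map<String, String> uiToDb = new LinkedHashMap<>();

    SearchQueryBuilder(String table) {
        this.table = table;
    }

    // Registers one UI-name to DB-column pair (normally read from the config table).
    SearchQueryBuilder map(String uiName, String dbColumn) {
        uiToDb.put(uiName, dbColumn);
        return this;
    }

    // Builds "SELECT <col> AS "<ui name>", ... FROM <table>" for the requested UI fields.
    String buildSelect(List<String> requestedUiFields) {
        if (requestedUiFields.isEmpty()) {
            throw new IllegalArgumentException("At least one field must be requested");
        }
        List<String> parts = new ArrayList<>();
        for (String uiName : requestedUiFields) {
            String dbColumn = uiToDb.get(uiName);
            if (dbColumn == null) {
                // Only names present in the mapping ever reach the SQL text.
                throw new IllegalArgumentException("Unknown UI field: " + uiName);
            }
            parts.add(dbColumn + " AS \"" + uiName + "\"");
        }
        return "SELECT " + String.join(", ", parts) + " FROM " + table;
    }
}
```

With the aliases in place, a JDBC `ResultSet` can be read with `rs.getString("Number")`, so no separate position index is needed. Double-quoted aliases are standard SQL, but quoting rules differ between databases, so the quoting here is an assumption to adjust per dialect.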

In such a dynamic SQL scenario, query builders like jOOQ really shine. See for example the jOOQ manual section about dynamic SQL.
In your specific case, assuming you're using generated code in jOOQ (which isn't a must, but certainly recommended), you'll be maintaining some sort of lookup between UI fields and SQL fields, such as:
Map<UIField, Field<?>> lookup = ...
lookup.put(UI.NUMBER, TABLE.NUMBER);
lookup.put(UI.DESCRIPTION, TABLE.DESCRIPTION);
lookup.put(UI.PRICE, TABLE.PRICE);
You can then construct your query dynamically according to user needs:
List<UIField> userRequestedFields = ...
List<Field<?>> queryFields = userRequestedFields
.stream()
.map(lookup::get)
.toList();
And then:
ctx.select(queryFields)
.from(TABLE)
.where(...)
.fetch();
There are other query builders, even JPA has the criteria API for these purposes. You could also roll your own, though you'll be re-inventing a lot of wheels.
Disclaimer: I work for the company behind jOOQ.

Java: JOOQ persistence framework performance and feed back [closed]

I've stumbled over a nice SQL builder framework called JOOQ. BTW, in Russian JOOQ sounds like a noun meaning "bug" (as an insect), "beetle" ;)
If you have any feedback about JOOQ, its performance and such, please share. Links to blogs about JOOQ are also welcome.

I think I should answer here also because I started using jooq one and a half months ago so I have some experience with it.
I wanted to use a tool like jooq because:
ORM is an overkill in my current project (distributed calculations platform for cluster) since I need to read and write only separate fields from db, not complete table rows and some of my queries are complex enough not to be executed by simple and lightweight ORMs.
I wanted syntax autocomplete for my queries so that I don't need to keep my whole DB in mind
I wanted to be able to write queries directly in Java so that compiler could check basic query syntax on build.
I wanted my queries to be type-safe so that I couldn't accidentally pass a variable of one type, where another one is expected.
I wanted SQL, but I wanted it very convenient and easy to use
Well, with jooq I was able to achieve all that. My main requirement was for jooq to handle complex enough queries (nested, with grouping etc.). That was fulfilled.
I also wanted to run queries in as few lines of code as possible, and jooq's fluent API, which allows jQuery-like chained calls for SELECTs, made that possible.
While using jooq I reported one or two bugs, and I must say that they were fixed surprisingly fast.
I also missed some features, and again I must say that almost all of them have since been added.
What I liked very much is that jooq now uses SLF4J to report some very interesting data about its performance, as well as to output the actual queries it has built. It really helped me with debugging.
Jooq even generates Java artifacts for stored procedures, UDFs and updatable recordsets, though I don't use them currently.
What's important, jooq transparently supports DB2, Derby, H2, HSQLDB, MySQL, Oracle, PostgreSQL, SQLite, SQL Server, Sybase SQL Anywhere. Pretty extensive list, I think.
Jooq has a support forum on Google Groups where Lukas is ready day and night to answer even the stupidest of my questions.
Jooq supports Maven, and that's a great relief for me since all my Java projects are Maven-based. We still miss a Maven plugin for the generator, but that's not important since running the generator is a piece of cake.
Writing my queries with jooq, I suddenly discovered that they became really portable, because I almost never used any MySQL-specific feature in the code, since jooq tries to be as portable as possible. For those who can't live without such peculiarities, as far as I know support for SQL extensions is also under way.
What does jooq lack at the moment, from my point of view?
Well, there is no fluent API for statements other than SELECT. This complicates the code a little and makes UPDATE/DELETE statements a little more complicated to write. But I think this will be added soon. Just implemented in 1.5.9! Ha! Too quick for me ;)
And one more thing. Jooq has a good manual, but... I don't know. Maybe I just don't understand its structure or architecture... When I started using jooq for the first time, I opened one page after another looking for the feature I needed. For example, try to guess from the table of contents where in the jooq manual UPDATE and DELETE statements are described... But that's really subjective, I believe. I also cannot even explain what's wrong with the manual from my point of view. When I can, I will post a ticket or two ;)
The manual also is not really well-navigable, since Trac has no automatic "here, there and back"-like links.
Also, for me in Moscow (Russia), Trac pages don't open fast, so reading the manual is a little boring.
The manual also lacks a good architecture description of jooq for contributors. Jooq seems to follow the design-by-contract principle, and when I wanted to learn how a certain feature is implemented by using my usual Ctrl-Click on some method name in the IDE, I ended up inside a dull interface with no implementation ;) Not that I'm smart enough to start improving jooq right away, but I would certainly enjoy understanding how exactly jooq is architected from the ground up.
It's a pity also that we cannot contribute to the jooq manual. I expected it to be in some kind of wiki.
What I would also want to improve is the way news is reported. I would prefer a link to the manual there, or examples of how this or that new feature works.
The release notes link in the manual is really just a roadmap. I think I will do that one myself tomorrow...
Jooq also has a relatively small community currently, but I am glad to report that it doesn't affect code quality or the way new features are introduced.
Jooq is really a good project. I will stick to it for my future projects as well. I really like it.

You can also take a look at MentaBean, a lightweight ORM and SQL builder that lets you stay as close as possible to SQL while offering a lot of help with the boilerplate code. Here is an example:
Programmatic Configuration:
private BeanConfig getUserBeanConfig() {
    // programmatic configuration for the bean... (no annotation or XML)
    BeanConfig config = new BeanConfig(User.class, "Users");
    config.pk("id", DBTypes.AUTOINCREMENT);
    config.field("username", DBTypes.STRING);
    config.field("birthdate", "bd", DBTypes.DATE); // note that the database column name is different
    config.field("status", new EnumValueType(User.Status.class));
    config.field("deleted", DBTypes.BOOLEANINT);
    config.field("insertTime", "insert_time", DBTypes.TIMESTAMP).defaultToNow("insertTime");
    return config;
}
// create table Users(id integer primary key auto_increment,
//     username varchar(25), bd datetime, status varchar(20),
//     deleted tinyint, insert_time timestamp)
A simple SQL join query:
Post p = new Post(1);
StringBuilder query = new StringBuilder(256);
query.append("select ");
query.append(session.buildSelect(Post.class, "p"));
query.append(", ");
query.append(session.buildSelect(User.class, "u"));
query.append(" from Posts p join Users u on p.user_id = u.id");
query.append(" where p.id = ?");
stmt = conn.prepareStatement(query.toString());
stmt.setInt(1, p.getId());
rset = stmt.executeQuery();
if (rset.next()) {
    session.populateBean(rset, p, "p");
    u = new User();
    session.populateBean(rset, u, "u");
    p.setUser(u);
}

If you are looking for only a SQL builder solution: I have a project which is an ORM framework for Java. It is still immature and in continuous development, but it handles many basic database usages. https://github.com/ahmetalpbalkan/orman
There is no documentation at this stage; however, it can build safe queries using only chained Java methods and can handle many SQL operations. It can also map classes and fields to tables and columns respectively.
Here's a sample query-building operation for the query
SELECT COUNT(*) FROM sailors WHERE
rating>4 AND rating<9 GROUP BY rating HAVING AVG(age)>20;
Java code:
QueryBuilder qb = QueryBuilder.getBuilder(QueryType.SELECT);
System.out.println(qb
    .from("sailors")
    .where(
        C.and(
            C.gt("rating", 4), // matches rating>4 in the SQL above
            C.lt("rating", 9)))
    .groupBy("rating")
    .having(
        C.gt(
            new OperationalField(QueryFieldOperation.AVG,
                "age").toString(), 20)
    ).getQuery());
(LOL just give up developing that framework!)
Most probably that won't work for you but just wanted to announce my project :P

Java ORM related question - SQL Vs Google DB (Big Table?) GAE

I was wondering about the following two options when one is not using SQL tables but ORM based DBs (Example - when you are using GAE)
Would the second option be less efficient?
Requirement:
There is an object. The object has a collection of similar items. I need to store this object. Example, say the object is a tree and it has a collection of leaves.
Option 1:
Traditional SQL type structure:
Table for the Tree (with TreeId as the identifier for a row in the Table.)
Table for the Leaves (where each leaf has a TreeId and to show the leaves of a tree, I query all leaves where the TreeId is the Id of the tree.)
Here, the Tree structure DOES NOT have a field with leaves.
Option 2:
ORM / GAE Tables:
Using the same example above,
I have an object for Tree where the object has a collection (Set/List in Java/C++) of leaves.
I store and retrieve the Tree together with the leaves (as the leaves are implemented as a Set in the Tree object)
My question is, will the second option be less efficient than the first?
If so, why? Are there other alternatives?
Thank you!

It would be better to use Hibernate (for Java) or another ORM framework on top of an RDBMS than an ORM DB.
1. ORM DBs are mostly amateur.
2. No one appreciates them. You will be a much better specialist if you know PostgreSQL with an ORM framework than just some ORM DB.
3. There are many standards in the world of RDBMSs, and no standards for ORM DBs.
4. RDBMS support and community make this choice safer in the long term.
5. Efficiency is a tricky question. I'm almost 80% sure that if you want to find a row with "name = 'Alex'", it will be faster in an RDBMS than in an ORM DB, because the ORM DB will need to unpack the object for this operation.
PS: I understand my post is almost off-topic, but I think it contains some good stuff to think about.

When to 'IN' and when not to?

Let's presume that you are writing an application for a retail store chain. So, you would design your object model such that you would define 'Store' as the core business object and lots of supporting objects. Let's say 'Store' looks like follows:
class Store implements Validatable {
    int storeNo;
    String storeName;
    ... etc....
}
So, your client tells you that you have to import the store schedule from an Excel sheet into the application, and you have to run a series of validations on the stores. For instance, 'StoreIsInSameCountry', 'StoreIsValid', etc. So, you would design a Rule interface for checking all business conditions. Something like this:
interface Rule<T extends Validatable> {
    public Error check(T value) throws Exception;
}
Now, here comes the question. I am uploading 2000 stores from this Excel sheet, so I would end up running each rule defined for a store that many times. With 4 rules, that is 8000 queries to the database, i.e. 16000 hits to the connection pool. For a simple check where I just have to check whether the store exists or not, the query would be:
SELECT STORE_ATTRIB1, STORE_ATTRIB2... from STORE where STORE_ID = ?
That way I would obtain my 'Store' object. When I don't get anything from the database, then that store doesn't exist. So, for such a simple check, I would have to hit the database 2000 times for 2000 stores.
Alternatively, I could just do:
SELECT STORE_ATTRIB1, STORE_ATTRIB2... from STORE where STORE_ID in (1,2,3..... )
This query would actually return much faster than doing the one above it 2000 times.
However, it doesn't go well with the design that a Rule can be run for a single store only.
I know using IN is not a suggested methodology. So, what do you think I should be doing? Should I go ahead and use IN here, because it gives better performance in this scenario? Or should I change my design?
What would you do if you were in my shoes, and what is the best practice?

That way I would obtain my 'Store' object from the database. When I don't get anything from the database, then that store doesn't exist. So, for such a simple check, I would have to hit the database 2000 times for 2000 stores.
This is what you should not do.
Create a temporary table, fill the table with your values and JOIN this table, like this:
SELECT STORE_ATTRIB1, STORE_ATTRIB2...
FROM temptable tt
JOIN STORE s
ON s.STORE_ID = tt.id
or this:
SELECT STORE_ATTRIB1, STORE_ATTRIB2...
FROM STORE s
WHERE s.STORE_ID IN
(
SELECT id
FROM temptable tt
)
I know using IN is not a suggested methodology. So, what do you think I should be doing? Should I go ahead and use IN here, because it gives better performance in this scenario? Or should I change my design?
IN filters duplicates out.
If you want each eligible row to be selected for each duplicate value in the list, use JOIN.
IN is in no way a "not suggested methodology".
In fact, there was a time when some databases did not support IN queries efficiently; that's why folk wisdom still advises against using it.
But if your store_id is indexed properly (and it most probably is, if it's a PRIMARY KEY, which it looks like), then all modern versions of major databases (that is Oracle, SQL Server, MySQL and PostgreSQL) will use an efficient plan to perform this query.
See this article in my blog for performance details in SQL Server:
IN vs. JOIN vs. EXISTS
Note, that in a properly designed database, validation rules are also set-based.
I. e. you implement your validation rules as queries against the temptable.
However, to support legacy rules, you can select values from temptable row-by-agonizing-row, apply the rules, and delete values which did not pass validation.
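If the IDs are sent from Java as a literal IN list rather than through a temporary table, the statement can still be parameterized. The sketch below is only an illustration: the `InClauseBatcher` class and the chunk size are assumptions, and the query text simply reuses the STORE table from the question. It builds one `?` per ID and splits long lists into chunks, since some databases cap the length of an IN list (Oracle, for example, rejects more than 1000 expressions).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for building parameterized IN queries in batches.
final class InClauseBatcher {

    // Builds a SELECT with exactly 'count' placeholders in the IN list.
    static String selectByIds(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be positive");
        }
        String placeholders = String.join(", ", Collections.nCopies(count, "?"));
        return "SELECT STORE_ID FROM STORE WHERE STORE_ID IN (" + placeholders + ")";
    }

    // Splits the ID list into consecutive chunks of at most 'size' elements.
    static <T> List<List<T>> chunk(List<T> ids, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += size) {
            chunks.add(new ArrayList<>(ids.subList(i, Math.min(i + size, ids.size()))));
        }
        return chunks;
    }
}
```

Each chunk would then be bound to a `PreparedStatement` created from `selectByIds(chunk.size())`, so 2000 store IDs with a chunk size of 1000 become two queries instead of 2000.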

SELECT store_id FROM store WHERE store_active = 1
or even
SELECT store_id FROM store
will tell you all the active stores (or, with the second query, all stores) in a single query. You can now conduct the other tests on stores you know to exist, and you've saved yourself 1,999 hits to the database.
If you've got relatively uncontested database access, and no time constraint on how long the whole thing is going to take then you've no real need to worry about hitting the connection pool over and over again. That's what it's designed for, after all!
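One possible way to combine the single up-front query with a rule that still checks one store at a time is sketched below. This is an assumption-laden illustration, not the asker's design: the `StoreExistsRule` class is hypothetical, the set of IDs is assumed to have been loaded once by a query like the ones above, and errors are plain strings rather than the `Error` type from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical rule: checks one store at a time against IDs fetched once up front.
final class StoreExistsRule {
    private final Set<Integer> knownStoreIds;

    StoreExistsRule(Set<Integer> knownStoreIds) {
        this.knownStoreIds = Set.copyOf(knownStoreIds); // defensive, immutable copy
    }

    // Returns null when the store exists, otherwise an error message.
    String check(int storeNo) {
        return knownStoreIds.contains(storeNo) ? null : "Store " + storeNo + " does not exist";
    }

    // Runs the single-store check over a whole import without touching the database.
    List<String> checkAll(List<Integer> storeNos) {
        List<String> errors = new ArrayList<>();
        for (int storeNo : storeNos) {
            String error = check(storeNo);
            if (error != null) {
                errors.add(error);
            }
        }
        return errors;
    }
}
```

The rule keeps its one-store-at-a-time shape, while the database is hit once per import instead of once per store.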

I think it's more of a business question, with parameters such as how often the client runs the import, how long it would take you to implement either of the solutions, and how expensive your time is per hour.
If it's something that runs once in a while, a bit of bad performance is acceptable in my opinion, especially if you can get the job done quickly using clean code.

...a Rule can be run for a single store only.
Managing business rules along with performance is a tricky task, so there is a library ("Persistence Layer") that does exactly that. You define rules, then execute a batch of commands; the library then fetches from the DB whatever the rules require in a single query (using temp tables rather than 'IN') and passes it to the rules.
There is an example of a validator in here.

Adding Java Objects to database

For a university assignment, I have a Prize object which contains either text, image, or video content. I would like to persist this information into a BLOB field within an Apache Derby database (which will be running on a low-powered PDA). How can I add this data to the database?
Thanks in advance.

In this article Five Steps to Managing Unstructured Data with Derby
you can read how to do this.
It describes how to insert binary data into a column with the BLOB datatype in Apache Derby using JDBC.

I assume you'll be connecting via JDBC. If so, simply write your SQL and look at the setBlob method of a PreparedStatement. Should be pretty straightforward.

Serialization is the easy way to do it. However, if possible, you could make it look like a real database table with a structure containing id (bigint), datatype (smallint), creationdate (date) and data (blob), and have the client code explicitly save the object's data there. This way you could do searches like "get all video prizes created between January 1st 2008 and January 15th 2009", and old data would not become unreadable if your class changed so much that deserialization stopped working.
This sort of solution would also be easy to extend in the future if the need arises. I understand this is a school assignment and such a need most likely won't ever surface, but if your teacher/professor knows his stuff, I bet he's willing to give an extra point or two for doing this exercise in this way, since it takes a bit more time and shows that you can prepare in advance for the ever-changing landscape of software development.
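The "easy way" mentioned above, plain Java serialization, can be sketched roughly as follows. The `Prize` fields and the `PrizeCodec` class are hypothetical, since the assignment's actual class is not shown; the sketch only turns an object into a byte array and back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical Prize class; the real one from the assignment may look different.
final class Prize implements Serializable {
    private static final long serialVersionUID = 1L;
    final String title;
    final byte[] content; // text, image, or video bytes

    Prize(String title, byte[] content) {
        this.title = title;
        this.content = content;
    }
}

final class PrizeCodec {
    // Serializes the prize into a byte array suitable for a binary column.
    static byte[] toBytes(Prize prize) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(prize);
        }
        return buffer.toByteArray();
    }

    // Restores a prize from bytes previously produced by toBytes.
    static Prize fromBytes(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (Prize) in.readObject();
        }
    }
}
```

The resulting bytes could then be written to the BLOB column through JDBC, for example by wrapping them in a `ByteArrayInputStream` and passing that to `PreparedStatement.setBinaryStream`. As noted above, serialized data is tied to the class layout, which is the main argument for the structured table instead.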

If you are using NetBeans (I assume Eclipse has similar functionality), you can set up your database schema and then create new Java entity classes from the database, and it will generate the appropriate JPA classes for you.
http://hendrosteven.wordpress.com/2008/03/06/simple-jpa-application-with-netbeans/
This is nice as it allows you to focus on your code rather than the database glue code.

The best solution is to use Derby, because the application then remains a multi-platform app developed entirely in Java.
