So, I'm using JDBC to talk to a MySQL DB. For many tables and for many queries/views, I have created one class that encapsulates one row of the table or of the query/view result. Accesses to the DB return one object of such a class (when I know there's exactly one matching row) or a Vector of such objects.
Each class features a factory method that builds an object from a row of a ResultSet. A lot of ResultSet.getXXX() methods are needed, as is finicky bookkeeping about which value is in which column, especially after changes to the table/query/view layout.
Creating and maintaining these objects is a boring, work-intensive and mind-numbing task. In other words, the sort of task that should be done by a tool. It should read SQL (the MySQL variant, alas) and generate Java code. Or, at least, give me a representation (XML? DOM?) of the table/query/view, allowing me to do the Java code generation myself.
Can you name this tool?
I'm a little confused by your question. Why don't you use an object-relational mapping framework like Hibernate?
I used to have the same problem of having to read and write a lot of SQL directly. Eventually I started writing newer projects with Hibernate and haven't looked back. It takes care of building the actual tables and running the SQL in the background for me, and I can mostly work with the Java objects.
If you are looking for a simple framework to help with the drudge work of writing SQL, I would recommend iBATIS SQL Maps. This framework does basically exactly what you want.
Hibernate is also a good option, but it seems a bit heavyweight for a simple problem like yours.
You might also have a look at the Spring Framework. It aims to provide a simple environment for writing Java applications and has a very usable SQL abstraction as well. But be careful with Spring: you might start to like the framework and spend too many happy hours with it 8)
As to your concern about reflection: Java no longer has major problems with the performance overhead of reflection (at least since version 1.4, and certainly not with O/R mapping tools).
In my experience, it is better to care about well-written and easily understandable code than about some largely theoretical performance overhead it might cost.
In most cases performance problems will not show up where you expect them, and can only be identified by running measurement tools on your code after it has been written. The most common performance problems are I/O related or are caused by errors in your own code (e.g. needlessly creating huge numbers of new instances, or loops with millions of unnecessary iterations), not by the JDK itself.
I created a mini-framework like that years ago, but it was for prototyping and not for production.
The idea follows, and it is VERY simple to do. The tradeoff is the cost of using reflection, although Hibernate and other ORM tools pay this cost too.
The idea is very simple.
You have a Dao class where you execute the query.
Read the ResultSet metadata, from which you can grab the table name, fields, types, etc.
Find in the classpath a class that matches the table name and/or has the same number/types of fields.
Set the values using reflection.
Return this object and cast it in the other side and you're done.
It might seem absurd to find the class at runtime, and it may look too risky as well, because the query may change or the table structure may change. But think about it: when that happens, you have to update your mappings anyway to match the new structure. So instead you just update the matching class and live happily with that.
I'm not aware of how ORM tools work to reduce the reflection call cost (the mapping only helps find the matching class). In my version the lookup among about 30,000 classes (I added jars from other places to test it) took only 0.30 ms or so. I cached that class, so the second time I didn't have to make the lookup.
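Just to illustrate the caching part, a minimal sketch (the map and method names are made up; findClass and toCamelCase are the helpers used in the code further down, and you'd need java.util.Map / java.util.concurrent.ConcurrentHashMap imports):
// Hypothetical cache so the expensive classpath scan happens only once per table.
private static final Map<String, Class<?>> CLASS_CACHE = new ConcurrentHashMap<>();

static Class<?> classForTable( String tableName ) throws ClassNotFoundException {
    Class<?> cached = CLASS_CACHE.get( tableName );
    if ( cached == null ) {
        cached = Class.forName( findClass( toCamelCase( tableName ) ) ); // slow lookup happens only once
        CLASS_CACHE.put( tableName, cached );
    }
    return cached;
}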
If you're interested ( and still reading ) I'll try to find the library in my old PC.
At the end my code was something like this:
Employee e = ( Employee ) MagicDataSource.find( "select * from employee where id = 1 ");
or
Employee[] emps = ( Employee[] ) MagicDataSource.findAll("select * from employee ");
Inside it was like:
// (needs java.sql.* and java.util.* imports; findClass/toCamelCase are helper methods of the lib)
Object[] findAll( String query ) throws Exception {
    ResultSet rs = getConnection().prepareStatement( query ).executeQuery();
    ResultSetMetaData md = rs.getMetaData();
    String tableName = md.getTableName( 1 );                  // table of the first column
    String className = findClass( toCamelCase( tableName ) ); // search in a list where all the class names were loaded
    Class<?> clazz = Class.forName( className );
    List<Object> result = new ArrayList<>();
    while ( rs.next() ) {
        Object object = clazz.getDeclaredConstructor().newInstance();
        for ( int i = 1; i <= md.getColumnCount(); i++ ) {
            // for each column, find the matching setter by name and invoke it via reflection
        }
        result.add( object );
    }
    return result.toArray();
}
If anyone knows how ORM tools deal with the reflection cost, please let me know. The code I have read from open source projects doesn't even attempt to do anything about it.
In the end it let me create quick small programs for system monitoring and stuff like that. I don't do that job anymore and that lib is now in oblivion.
Apart from the ORMs...
If you're using the rs.getString and rs.getInt routines, then you can certainly ease your maintenance burden if you rely on named columns rather than numbered columns.
Specifically rs.getInt("id") rather than rs.getInt(1), for example.
It's been rare that I've had an actual column change data type, so future SQL maintenance is little more than adding the new columns that were added to the table, and those can simply be tacked on to the end of your monster bind list in each of your little DAO objects.
Next, you take that idiom of using column names and extend it to a plan of using consistent names and, at the same time, "unique" names. The intent is that each column in your database has a unique name associated with it. In theory it can be as simple (albeit verbose) as tablename_columnname, thus if you have a "member" table, the column name is "member_id" for the id column.
What does this buy you?
It buys you being able to use your generic DAOs on any "valid" result set.
A "valid" result set is a result set with the columns named using your unique naming spec.
So, you get "select id member_id, name member_name from member where id = 1".
Why would you want to do that? Why go to that bother?
Because then your joins become trivial.
PreparedStatement = con.prepareStatement("select m.id member_id, m.name member_name, p.id post_id, p.date post_date, p.subject post_subject from member m, post p where m.id = p.member_id and m.id = 123");
ResultSet rs = ps.executeQuery();
Member m = null;
Post p = null;
while(rs.next()) {
if (m == null) {
m = MemberDAO.createFromResultSet(rs);
}
p = PostDAO.createFromResultSet(rs);
m.addPost(p);
}
See, here the binding logic doesn't depend on the full contents of the result set, since each DAO only looks at the columns it cares about.
In your DAOs, you make them slightly clever about the ResultSet. Turns out if you do 'rs.getInt("member_id")' and member_id doesn't happen to actually BE in the result set, you'll get a SQLException.
But with a little work, using ResultSetMetaData, you can do a quick pre-check (by fetching all of the column names up front), then rather than calling "rs.getInt" you can call "baseDAO.getInt" which handles those details for you so as not to get the exception.
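A rough sketch of what that could look like (BaseDAO, Member and the helper names here are just illustrative, not a prescribed API):
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.HashSet;
import java.util.Set;

public abstract class BaseDAO {
    // Collect the column labels actually present in this particular result set.
    protected static Set<String> columnsOf(ResultSet rs) throws SQLException {
        ResultSetMetaData md = rs.getMetaData();
        Set<String> names = new HashSet<>();
        for (int i = 1; i <= md.getColumnCount(); i++) {
            names.add(md.getColumnLabel(i).toLowerCase());
        }
        return names;
    }

    // Like rs.getInt, but quietly returns 0 when the column isn't in the result set.
    protected static int getInt(ResultSet rs, Set<String> present, String column) throws SQLException {
        return present.contains(column.toLowerCase()) ? rs.getInt(column) : 0;
    }
}

// In another file: a DAO that binds only the columns it finds.
class MemberDAO extends BaseDAO {
    public static Member createFromResultSet(ResultSet rs) throws SQLException {
        Set<String> present = columnsOf(rs);
        Member m = new Member();
        m.setId(getInt(rs, present, "member_id"));
        // ... bind the remaining member_* columns the same way ...
        return m;
    }
}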
The beauty here is that once you do that, you can fetch incomplete DAOs easily.
PreparedStatement = con.prepareStatement("select m.id member_id from member m where m.id = 123");
ResultSet rs = ps.executeQuery();
Member m = null;
if (rs.next()) {
m = MemberDAO.createFromResultSet(rs);
}
Finally, it's really (really) a trivial bit of scripting (using, say, AWK) that can take the properties of a bean and convert them into a proper blob of binding code for an initial DAO. A similar script can readily take a SQL table statement and convert it into a Java Bean (at least the base members) that your IDE then converts into a flurry of getters/setters.
By centralizing the binding code into the DAO, maintenance is really hardly anything at all, since it's changed in one place. Using partial binding, you can abuse them mercilessly.
PreparedStatement = con.prepareStatement("select m.name member_name, max(p.date) post_date from member m, post p where post.member_id = m.id and m.id = 123");
ResultSet rs = ps.executeQuery();
Member m = null;
Post p = null;
if (rs.next()) {
m = MemberDAO.createFromResultSet(rs);
p = PostDAO.createFromResultSet(rs);
}
System.out.println(m.getName() + " latest post was on " + p.getDate());
Your burden moving forward is mostly writing the SQL, but even that's not horrible. There's not much difference between writing SQL and EQL. Mind, it does kind of suck having to write a select statement with a zillion columns in it, since you can't (and shouldn't anyway) use "select * from ..." (select * always (ALWAYS) leads to trouble, IME).
But that's just the reality. I have found, though, that (unless you're doing reporting) that problem simply doesn't happen a lot. It happens at least once for most every table, but it doesn't happen over and over and over. And, naturally, once you have it once, you can either "cut and paste" your way to glory, or refactor it (i.e. sql = "select " + MemberDAO.getAllColumns() + ", " + PostDAO.getAllColumns() + " from member m, post p").
Now, I like JPA and ORMs, I find them useful, but I also find them a PITA. There is a definite love/hate relationship going on there. And when things are going smooth, boy, is it smooth. But when it gets rocky -- hoo boy. Then it can get ugly. As a whole, however, I do recommend them.
But if you're looking for a "lightweight" non-framework, this technique is useful, practical, low overhead, and gives you a lot of control over your queries. There's simply no black magic or dark matter between your queries and your DB, and when things don't work, it's not some arcane misunderstanding of the framework or an edge-case bug in someone else's 100K lines of code, but rather, odds are, a bug in your SQL -- where it belongs.
Edit: Nevermind. While searching for a solution to my own problem, I forgot to check the date on this thing. Sorry. You can ignore the following.
@millermj - Are you doing that for fun, or because there's a need? Just curious, because that sounds exactly like what Java IDEs like Eclipse and NetBeans already provide (using the Java Persistence API) with New->JPA->Entity Classes from Tables functionality.
I could be missing the point, but if someone just needs classes that match their tables and are persistable, the JPA plus some IDE "magic" might be just enough.
Related
I need to build a query in such a way as to prevent the possibility of an SQL injection attack.
I know of two ways to build a query.
String query = new StringBuilder("select * from tbl_names where name = '").append(name).append("';").toString();
String query = "select * from tbl_names where name = ? ";
In the first case, all I do is a connection.prepareStatement(query)
In the second case I do something like:
PreparedStatement ps = connection.prepareStatement(query)
ps.setString(1,name);
I want to know what the industry standard is. Do you use the string-append way to build the query and then prepare the statement, or do you prepare the statement first and pass the parameters later?
Your first fragment of code is unsafe and vulnerable to SQL injection. You should not use that form.
To make your first fragment safe, you would need to manually escape the value to prevent SQL injection. That is hard to do correctly, and choosing the wrong way of handling values could potentially reduce performance depending on the underlying database (eg some database systems will not use an index if you supply a string literal for an integer column).
The second fragment is the standard way. It protects you against SQL injection. Use this form.
Using a prepared statement with parameter placeholders is far simpler, and it also allows you to reuse the compiled statement with different sets of values. In addition, depending on the database, this can have additional performance advantages for reusing query plans across connections.
You could also use the OWASP ESAPI library. It includes validators, encoders and many other helpful things.
For example, you can do
ESAPI.encoder().encodeForSQL(Codec,input);
More codecs are under development. Currently, MySQL and Oracle are supported. One of those might be helpful in your case.
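Since MySQL is one of the supported codecs, usage could look roughly like this (the exact codec constructor varies between ESAPI versions, so treat this as a sketch):
import org.owasp.esapi.ESAPI;
import org.owasp.esapi.codecs.MySQLCodec;

// Escape a user-supplied value for use inside a MySQL statement.
String safeName = ESAPI.encoder().encodeForSQL(new MySQLCodec(MySQLCodec.Mode.STANDARD), name);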
As a newbie to Servlet programming, I think I may not have gotten something right here: I understand the concept of Java Beans and little ORM helper classes like org.apache.commons.dbutils.DbUtils. I can convert a ResultSet into an instance of my JavaBean-object with a ResultSetHandler and a BeanHandler. But isn't there any convenient way to do it the other way round, other than hardcoding the SQL string? Something like
QueryRunner run = new QueryRunner(datasource);
int result = run.update("UPDATE " + tableName + " SET " + [and now some Handler sets all the columns from the JavaBean]);
At least, I didn't find anything like that! Or did I get it wrong? Help appreciated.
You did not get it wrong, you will still need a hard-coded SQL string as shown in this answer. Sql2o also requires a hard-coded SQL string but it will let you bind a POJO which gets you half-way there, see here (bottom of the page).
I think you will always need a hard-coded SQL string of some form because these are JDBC helper libraries and not "object relational mappers". Before the insert is done it is not known which properties are auto-generated, have a default value, are foreign keys, allow null values, etc. All this information is required to prepare a proper insert statement based on a POJO/JavaBean, and that goes beyond the scope of the helper libraries. On the plus side: specifying a SQL string is explicit (there is no magic behind the scenes) and keeps you in full control.
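For illustration, with DbUtils' QueryRunner the SQL stays hard-coded, but the bean's properties can at least be supplied as parameters (the person table and the Person bean with its getters are assumed here, not part of DbUtils):
import org.apache.commons.dbutils.QueryRunner;

QueryRunner run = new QueryRunner(datasource);
// 'person' is an assumed, already-populated JavaBean.
int rows = run.update(
        "UPDATE person SET name = ?, age = ? WHERE id = ?",
        person.getName(), person.getAge(), person.getId());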
Abstract:
An application I work on uses TopLink, and I'm having trouble finding out if and when TopLink automatically uses bind variables.
Problem Description:
Let's say I need to do something akin to validating whether a vehicle full of people can travel somewhere, where each person could invalidate the trip, and providing error messages so each person can get their restrictions removed before the trip starts. A simple way to do that is to validate each member of the list and display a list of errors. Let's say their info is stored in an Oracle database and I query each rider's info using their unique id; this query will be executed for each member in the list. A naïve implementation would cause a hard parse, a new execution path, for each execution, despite only the unique id changing.
I've been reading about bind variables in SQL, and how they allow reuse of an execution path, avoiding CPU-intensive hard parses.
A couple links on them are:
http://www.akadia.com/services/ora_bind_variables.html
https://oracle-base.com/articles/misc/literals-substitution-variables-and-bind-variables
An application I work on uses TopLink and does something similar to the situation described above. I'm looking to make the validation faster, without changing the implementation much.
If I do something like the following:
Pseudo-code
public class UserValidator {
    private static final DataReadQuery GET_USER_INFO;
    static {
        GET_USER_INFO = new DataReadQuery("select * from schema.userInfo ui where ui.id = #accountId");
        GET_USER_INFO.bindAllParameters();
        GET_USER_INFO.cacheStatement();
        GET_USER_INFO.addArgument("accountId", String.class);
    }

    void validate() {
        List<String> listOfUserAccountIds = getUserAccountIdList();
        List<String> args;
        for (String userAccountId : listOfUserAccountIds) {
            args = new ArrayList<>(1);
            args.add(userAccountId);
            doSomethingWithInfo(getUnitOfWork().executeQuery(GET_USER_INFO, args));
        }
    }
}
The Question:
Will a new execution path be parsed for each execution of GET_USER_INFO?
What I have found:
If I understand the bindAllParameters function inside the DatabaseQuery class correctly, it is simply a type validation to stop SQL injection attacks.
There is also a shouldPrepare function in the same class; however, that seems to have more to do with allowing dynamic SQL usage where the number of arguments is variable. A prepared DatabaseQuery has its SQL written once, with just the values of the variables changing based on the argument list passed in, which sounds like simple substitution and not bind variables.
So I'm at a loss.
This seems to be answered by the TopLink documentation:
By default, TopLink enables parameterized SQL but not prepared statement caching.
So prepared statements are used by default, just not cached. This means subsequent queries will have the added cost of re-preparing statements if not optimized by the driver. See this for more information on optimizations within TopLink.
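The query in the question already calls cacheStatement() per query; the other knob is the session's login itself. Roughly (method names as I recall them from the TopLink docs, so verify against your version):
// Cache prepared statements for every query executed through this login.
session.getLogin().setShouldCacheAllStatements(true);
session.getLogin().setStatementCacheSize(50);   // size of the internal statement cache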
I'm trying to use setDistinct(true) as it is described in the guide: http://mybatis.github.io/generator/generatedobjects/exampleClassUsage.html
I've written in this way:
testExample ae = new testExample();
testExample.Criteria criteriatest = ae.createCriteria();
ae.setDistinct(true);
criteriatest.andIDENTLAVEqualTo(Long.parseLong(cert.getCODINDIVID()));
ae.or(criteriatest);
List<test> listtest = testMapper.selectByExample(ae);
but the setDistinct(true) doesn't affect the results.
Where should I add the setDistinct line?
It looks like the link you referenced is for an extremely old version of MyBatis. On that page, it lists the following:
Version: 1.3.3-SNAPSHOT
The latest version is:
mybatis-3.3.0-SNAPSHOT
Grepping the 3.x code for setDistinct does not return anything:
https://github.com/mybatis/mybatis-3/search?q=setDistinct
I'm surprised you don't get a compile-time error about the method not being found. Are you using version 1.3.3 (or 1.x)?
I would recommend doing the DISTINCT right in the query. Since MyBatis is generally a sort of close-to-the-SQL-metal mapping framework, I think it's best to add it in the mapper file's query itself. Plus, that way you can choose specifically what to DISTINCT by. The setDistinct method does not seem to provide any way to specify the target.
For MyBatis 3, I think the analogous style of query would be this:
http://mybatis.github.io/mybatis-3/statement-builders.html
This seems to be analogous to a jOOQ-style DSL. It has a SELECT_DISTINCT method. I personally find it easier to code/read the pure SQL with some XML markup as needed for dynamic SQL in a mapper file, but this is certainly a viable option in MyBatis 3.
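As a rough illustration of that builder style (the test table and IDENT_LAV column are inferred from your snippet, and TestSqlProvider is a made-up provider class you would reference from a @SelectProvider-annotated mapper method):
import org.apache.ibatis.jdbc.SQL;

public class TestSqlProvider {
    // Builds: SELECT DISTINCT IDENT_LAV FROM test WHERE (IDENT_LAV = #{identLav})
    public String selectDistinctByIdentLav() {
        return new SQL() {{
            SELECT_DISTINCT("IDENT_LAV");
            FROM("test");
            WHERE("IDENT_LAV = #{identLav}");
        }}.toString();
    }
}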
Edit:
So, I did some more digging, and the reason I couldn't find the code in the MyBatis 3 git repo is that setDistinct is in the mybatis-generator code base.
I think part of the issue here stems from MyBatis Generator's own description on GitHub:
MBG seeks to make a major impact on the large percentage of database
operations that are simple CRUD (Create, Retrieve, Update, Delete).
So, it provides a way to do simple DISTINCTs, but with limited control.
The code resides in the addClassElements method of the ProviderSelectByExampleWithoutBLOBsMethodGenerator class. Searching for setDistinct won't show up on a Github search since it's an automatically generated setter.
This is the relevant code snippet:
boolean distinctCheck = true;
for (IntrospectedColumn introspectedColumn : getColumns()) {
    if (distinctCheck) {
        method.addBodyLine("if (example != null && example.isDistinct()) {"); //$NON-NLS-1$
        method.addBodyLine(String.format("%sSELECT_DISTINCT(\"%s\");", //$NON-NLS-1$
                builderPrefix,
                escapeStringForJava(getSelectListPhrase(introspectedColumn))));
        method.addBodyLine("} else {"); //$NON-NLS-1$
        method.addBodyLine(String.format("%sSELECT(\"%s\");", //$NON-NLS-1$
                builderPrefix,
                escapeStringForJava(getSelectListPhrase(introspectedColumn))));
        method.addBodyLine("}"); //$NON-NLS-1$
    } else {
        method.addBodyLine(String.format("%sSELECT(\"%s\");", //$NON-NLS-1$
                builderPrefix,
                escapeStringForJava(getSelectListPhrase(introspectedColumn))));
    }
    distinctCheck = false;
}
So, essentially, this looks like it's wrapping the SELECT_DISTINCT method I mentioned originally, and it attempts to introspect the columns and apply the DISTINCT to all of the ones it gets back.
Digging a bit deeper, it ultimately calls this code to get the columns:
/**
 * Returns all columns in the table (for use by the select by primary key
 * and select by example with BLOBs methods)
 *
 * @return a List of ColumnDefinition objects for all columns in the table
 */
public List<IntrospectedColumn> getAllColumns() {
    List<IntrospectedColumn> answer = new ArrayList<IntrospectedColumn>();
    answer.addAll(primaryKeyColumns);
    answer.addAll(baseColumns);
    answer.addAll(blobColumns);
    return answer;
}
So, this is definitely essentially an all-or-nothing DISTINCT (whereas Postgres itself allows DISTINCT on just certain columns).
Try moving the setDistinct to the very last line before you actually invoke the ae object. Perhaps subsequent calls are affecting the column set (although from the code, it doesn't seem like it should -- basically once the columns are set, the setDistinct should use them).
The other thing that would be interesting would be to see what SQL it is actually generating with and without setDistinct.
Check this link out for more detail on debug/logging:
http://mybatis.github.io/generator/reference/logging.html
I'd recommend perhaps trying out the XML-based mapper file definitions, which interleave SQL with XML tags for dynamic-ness. IMO, it's much easier to follow than the MyBatis Generator code snippet above. I suppose that's one of the main tradeoffs with a generator -- easier to create initially, but more difficult to read/maintain later.
For super-dynamic queries, I could see some more advantages, but then that sort of goes against their self-description of it being for simple CRUD operations.
As we know, the best way to avoid SQL injection is to use a prepared statement with bind variables. But I have a question: what if I use just a prepared statement without bind variables, like below, where the customer id is coming from the user interface?
String query ="select * from customer where customerId="+customerId;
PreparedStatement stmt = con.prepareStatement(query); //line1
Does line 1 take care of restricting SQL injection even when I have not used bind variables?
I agree the best way is the one below, but if the above approach also takes care of restricting SQL injection, then I would prefer the above one (as it's a legacy project).
String query ="select * from customer where customerId=?";
PreparedStatement stmt = con.prepareStatement(query);
stmt.setInt(1, 100);
Is a prepared statement without bind variables sufficient to make sure SQL injection is not possible?
One has to distinguish several matters.
Using a prepared statement won't help just by itself.
Nor is there any harm in using the non-prepared way in general.
The thing matters only when you need to insert a dynamic part into the query.
So, in this latter case such a dynamic part has to go into the query via a placeholder only, whose actual value has to be bound later (a placeholder is a ? or any other mark that represents the actual data in the query).
The very term "prepared statement" implies using placeholders for all the dynamic data that goes into the query. So,
if you have no dynamic parts in the query, there is obviously no injection at all, even without using prepared statements.
if you're using a prepared statement but inject values directly into the query instead of binding them, it is wide open to injection.
So, again: a prepared statement works only with placeholders for all dynamic data. And it works because:
every dynamic value has to be properly formatted
a prepared statement makes proper formatting (or handling) inevitable.
a prepared statement does the proper formatting (or handling) in the only proper place - right before query execution, not somewhere else, so our safety doesn't rely on such unreliable sources as
some 'magic' feature which is more likely to spoil the data than make it safe.
the good will of one (or several) programmers, who may decide to format (or not to format) our variable somewhere in the program flow. That's a point of great importance.
a prepared statement affects the very value that goes into the query, but not the source variable, which remains intact and can be used later in the code (to be sent via email or shown on-screen).
a prepared statement can make application code dramatically shorter, doing all the formatting behind the scenes (only if the driver permits).
Line 1 will not check whether the developer wants to drop a table or not. If you write a query, it is assumed to be OK.
The goal of SQL injection is to craft values that allow an additional SQL query to be run without the will or knowledge of the developer, by querying your website with fake values in its parameters.
Example:
id = "'); DROP ALL TABLES; --";
query = "select * from customer where customerId="+id;
PreparedStatement ensures that special symbols (like ' or ") added to the query using setInt/setString/etc. will not interfere with the SQL query.
I know this is an older post; I just wanted to add that you avoid injection attacks if you can make sure you are only allowing integers into your query in line 1. String inputs are where the injection attacks happen. In the sample above, it is unclear what type the variable 'customerId' is, although it looks like an int. Since the question is tagged as Java, you can't do an injection attack with an int, so you should be fine.
If it is a string in line 1, you need to be confident that 'customerId' comes from a secure source that guarantees it is an integer. If it comes from a POST form or other user-generated field, then you can either try to escape it or convert it to an integer to be sure. If it is a string, parse it to an integer and you will not need to bind params.
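For instance, a sketch of that conversion (customerIdParam is an assumed request parameter); once the value is an int, concatenating it can no longer inject anything:
int customerId = Integer.parseInt(customerIdParam);   // throws NumberFormatException for non-numeric input
String query = "select * from customer where customerId=" + customerId;
PreparedStatement stmt = con.prepareStatement(query);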