TL;DR: What's the recommended approach for reusing PreparedStatement objects in Java, without making a mess out of the code?
If I want to reuse them I have to define them all early in the code, and it becomes a mess.
If I don't create them until I need to, I can keep the code tidy and clean, but then I can't reuse the objects.
I have a method like this:
PreparedStatement psQuery = conn.prepareStatement("select ...");
PreparedStatement psChild = conn.prepareStatement("select ... where parent = ? and ...");
ResultSet rsQuery = psQuery.executeQuery();
while (rsQuery.next()) {
psChild.setInt(1, rsQuery.getInt("id"));
psChild.executeQuery();
...
}
I create the two PreparedStatement objects first, and then reuse them every time I need to execute those specific SQL queries. I don't create psChild inside my loop because then I'd be preparing a new statement in each iteration instead of reusing one.
Now, my real code is much more complex. I'm actually using 13 different PreparedStatement instances, and the code spans a few hundred lines. I'd very much like to split it into different methods, but I'm not sure how to do it properly. I can think of two options. The first one is what I'm doing right now, only split into methods:
PreparedStatement psQuery = conn.prepareStatement("select ...");
PreparedStatement psChild = conn.prepareStatement("select ... where parent = ? and ...");
ResultSet rsQuery = psQuery.executeQuery();
while (rsQuery.next()) {
processChildren(rsQuery.getInt("id"), psChild);
}
The problem is, I end up with a processChildren with this signature:
private static void processChildren(
...,
final PreparedStatement psFoo,
final PreparedStatement psBar,
final PreparedStatement psDoc,
final PreparedStatement psGrumpy,
final PreparedStatement psHappy,
final PreparedStatement psSleepy,
final PreparedStatement psDopey,
final PreparedStatement psBashful,
final PreparedStatement psSneezy,
final PreparedStatement psYetAnotherOne,
final PreparedStatement psAndAnotherOne,
final PreparedStatement psLastOne,
...) {
Not exactly great.
The other option would be to create each prepared statement in the method where I'll need it. That would be much cleaner, but it's the same as creating them inside the loop: I wouldn't be reusing them.
There is yet another option: declare the variables as class attributes. That way I could create them first and then reuse them without cluttering the signatures of the "children" methods. But this feels even more wrong, in the same way that using a global variable would. Worse, it would be 13 "global" variables, all of the same class and with very similar names. No way I'm doing that!
How could I proceed?
Note: I'm aware of much better persistence solutions, such as JPA. I'm not looking for an alternative to prepared statements, I only want to know what's the usual approach in cases like mine.
Edit: It seems like I oversimplified my example. This is closer to what I need to do:
- Retrieve all the records from database 1.
- For each one of them (first loop):
  - Check if it exists in database 2.
  - If it doesn't, create it in database 2, and:
    - Retrieve all its children from database 1.
    - For each of the children (second loop):
      - Check if the child exists in database 2.
      - If it doesn't, insert it.
So I have two levels of nested loops which I can't get rid of. And creating the prepared statements over and over inside of the loops seems like a poor idea.
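One way to keep the reuse without the long signatures is to hand a single small helper object to the methods instead of 13 statements. The sketch below is only an illustration: the class name StatementCache and its methods are made up for this example, and it assumes all statements belong to one Connection. It prepares each SQL string on first use and returns the same PreparedStatement afterwards, so a method can ask for its statement where it needs it and still get reuse inside the loops.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: prepares each SQL string once per connection and reuses it afterwards. */
final class StatementCache implements AutoCloseable {
    private final Connection conn;
    private final Map<String, PreparedStatement> statements = new LinkedHashMap<>();

    StatementCache(Connection conn) {
        this.conn = conn;
    }

    /** Returns the cached statement for this SQL text, preparing it on first use. */
    PreparedStatement get(String sql) throws SQLException {
        PreparedStatement ps = statements.get(sql);
        if (ps == null) {
            ps = conn.prepareStatement(sql);
            statements.put(sql, ps);
        }
        return ps;
    }

    /** Number of distinct statements prepared so far. */
    int size() {
        return statements.size();
    }

    /** Closes every cached statement; the connection is left open for the caller. */
    @Override
    public void close() throws SQLException {
        SQLException first = null;
        for (PreparedStatement ps : statements.values()) {
            try {
                ps.close();
            } catch (SQLException e) {
                if (first == null) {
                    first = e; // remember the first failure, keep closing the rest
                }
            }
        }
        statements.clear();
        if (first != null) {
            throw first;
        }
    }
}
```

With this, processChildren(id, cache) takes one extra argument, and the caller wraps the whole job in try (StatementCache cache = new StatementCache(conn)) { ... }. One thing to watch: re-executing a statement closes its previous ResultSet, so the outer loop and the inner loop must not share the same SQL text through the cache.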
I need to build a query in such a way as to prevent the possibility of an SQL injection attack.
I know of two ways to build a query:
String query = new StringBuilder("select * from tbl_names where name = '").append(name).append("'").toString();
String query = "select * from tbl_names where name = ?";
In the first case, all I do is call connection.prepareStatement(query).
In the second case I do something like:
PreparedStatement ps = connection.prepareStatement(query);
ps.setString(1, name);
What is the industry standard? Do you build the query by appending strings and then prepare the statement, or do you prepare the statement with placeholders and pass the parameters later?
Your first fragment of code is unsafe and vulnerable to SQL injection. You should not use that form.
To make your first fragment safe, you would need to manually escape the value to prevent SQL injection. That is hard to do correctly, and choosing the wrong way of handling values could reduce performance depending on the underlying database (e.g., some database systems will not use an index if you supply a string literal for an integer column).
The second fragment is the standard way. It protects you against SQL injection. Use this form.
Using a prepared statement with parameter placeholders is far simpler, and it also allows you to reuse the compiled statement with different sets of values. In addition, depending on the database, this can have additional performance advantages for reusing query plans across connections.
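To make the difference concrete, here is a small sketch that only builds strings and touches no database. The class name InjectionDemo and the hostile input are invented for illustration; the table and column come from the question. It shows what SQL text the concatenating form produces when the value contains a quote.

```java
final class InjectionDemo {
    /** The safe form: the SQL text is constant and the value is bound later with ps.setString(1, name). */
    static final String PARAMETERIZED = "select * from tbl_names where name = ?";

    /** The unsafe form: the value becomes part of the SQL text itself. */
    static String concatenated(String name) {
        return "select * from tbl_names where name = '" + name + "'";
    }

    public static void main(String[] args) {
        String hostile = "x' or '1'='1";
        // prints: select * from tbl_names where name = 'x' or '1'='1'
        System.out.println(concatenated(hostile));
        // The parameterized text never changes, whatever the value is.
        System.out.println(PARAMETERIZED);
    }
}
```

In the concatenated version the input has changed the structure of the WHERE clause, which now matches every row. With the placeholder form the same input is just an unusual name to compare against.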
You could also use the OWASP ESAPI library. It includes validators, encoders and many other helpful things.
For example, you can do
ESAPI.encoder().encodeForSQL(Codec,input);
More codecs are under development. Currently, MySQL and Oracle are supported. One of those might be helpful in your case.
The Prepared Statement is a slightly more powerful version of a Statement, and should always be at least as quick and easy to handle as a Statement.
The Prepared Statement may be parameterized.
Most relational databases handle a JDBC / SQL query in four steps:
(1) Parse the incoming SQL query
(2) Compile the SQL query
(3) Plan/optimize the data acquisition path
(4) Execute the optimized query / acquire and return data
A Statement will always proceed through the four steps above for each SQL query sent to the database. A Prepared Statement pre-executes steps (1) - (3) in the execution process above. Thus, when creating a Prepared Statement some pre-optimization is performed immediately. The effect is to lessen the load on the database engine at execution time.
Now my question is this:
"Is there any other advantage of using Prepared Statement?"
Advantages of a PreparedStatement:
Precompilation and DB-side caching of the SQL statement leads to overall faster execution and the ability to reuse the same SQL statement in batches.
Automatic prevention of SQL injection attacks by builtin escaping of quotes and other special characters. Note that this requires that you use any of the PreparedStatement setXxx() methods to set the values
preparedStatement = connection.prepareStatement("INSERT INTO Person (name, email, birthdate, photo) VALUES (?, ?, ?, ?)");
preparedStatement.setString(1, person.getName());
preparedStatement.setString(2, person.getEmail());
preparedStatement.setTimestamp(3, new Timestamp(person.getBirthdate().getTime()));
preparedStatement.setBinaryStream(4, person.getPhoto());
preparedStatement.executeUpdate();
and thus don't inline the values in the SQL string by string-concatenating.
preparedStatement = connection.prepareStatement("INSERT INTO Person (name, email) VALUES ('" + person.getName() + "', '" + person.getEmail() + "')");
preparedStatement.executeUpdate();
Eases setting of non-standard Java objects in a SQL string, e.g. Date, Time, Timestamp, BigDecimal, InputStream (Blob) and Reader (Clob). On most of those types you can't "just" do a toString() as you would do in a simple Statement. You could even refactor it all to using PreparedStatement#setObject() inside a loop as demonstrated in the utility method below:
public static void setValues(PreparedStatement preparedStatement, Object... values) throws SQLException {
for (int i = 0; i < values.length; i++) {
preparedStatement.setObject(i + 1, values[i]);
}
}
Which can be used as below:
preparedStatement = connection.prepareStatement("INSERT INTO Person (name, email, birthdate, photo) VALUES (?, ?, ?, ?)");
setValues(preparedStatement, person.getName(), person.getEmail(), new Timestamp(person.getBirthdate().getTime()), person.getPhoto());
preparedStatement.executeUpdate();
They are pre-compiled (once), so faster for repeated execution of dynamic SQL (where parameters change)
Database statement caching boosts DB execution performance
Databases store caches of execution plans for previously executed statements. This allows the database engine to reuse the plans for statements that have been executed previously. Because a PreparedStatement uses parameters, it appears as the same SQL each time it is executed, so the database can reuse the previous access plan and reduce processing. Statements "inline" the parameters into the SQL string and so do not appear as the same SQL to the DB, preventing cache usage.
Binary communications protocol means less bandwidth and faster comms calls to DB server
Prepared statements are normally executed through a non-SQL binary protocol. This means that there is less data in the packets, so communication with the server is faster. As a rule of thumb, network operations are an order of magnitude slower than disk operations, which are an order of magnitude slower than in-memory CPU operations. Hence, any reduction in the amount of data sent over the network will have a good effect on overall performance.
They protect against SQL injection, by escaping text for all the parameter values provided.
They provide stronger separation between the query code and the parameter values (compared to concatenated SQL strings), boosting readability and helping code maintainers quickly understand inputs and outputs of the query.
In Java, you can call getMetaData() and getParameterMetaData() to reflect on the result set fields and the parameter fields, respectively
In Java, it intelligently accepts Java objects as parameter types via setObject, setBoolean, setByte, setDate, setDouble, setFloat, setInt, setLong, setShort, setTime, setTimestamp - it converts them into a JDBC type format that the DB can understand (not just the toString() format).
In Java, it accepts SQL ARRAYs as a parameter type via the setArray method
In Java, it accepts CLOBs, BLOBs, InputStreams and Readers as parameter "feeds" via the setClob/setNClob, setBlob, setBinaryStream, setCharacterStream/setAsciiStream/setNCharacterStream methods, respectively
In Java, it allows DB-specific values to be set for SQL DATALINK, SQL ROWID, SQL XML, and NULL via the setURL, setRowId, setSQLXML and setNull methods
In java, inherits all methods from Statement. It inherits the addBatch method, and additionally allows a set of parameter values to be added to match the set of batched SQL commands via addBatch method.
In java, a special type of PreparedStatement (the subclass CallableStatement) allows stored procedures to be executed - supporting high performance, encapsulation, procedural programming and SQL, DB administration/maintenance/tweaking of logic, and use of proprietary DB logic & features
PreparedStatement is a very good defense (but not foolproof) in preventing SQL injection attacks. Binding parameter values is a good way to guard against "little Bobby Tables" making an unwanted visit.
Some of the benefits of PreparedStatement over Statement are:
PreparedStatement helps us in preventing SQL injection attacks because it automatically escapes the special characters.
PreparedStatement allows us to execute dynamic queries with parameter inputs.
PreparedStatement provides different types of setter methods to set the input parameters for the query.
PreparedStatement is faster than Statement. This becomes more visible when we reuse the PreparedStatement or use its batch processing methods for executing multiple queries.
PreparedStatement helps us in writing object Oriented code with setter methods whereas with Statement we have to use String Concatenation to create the query. If there are multiple parameters to set, writing Query using String concatenation looks very ugly and error prone.
Read more about SQL injection issue at http://www.journaldev.com/2489/jdbc-statement-vs-preparedstatement-sql-injection-example
nothing much to add,
1 - if you want to execute a query in a loop (more than once), a prepared statement can be faster, because of the optimization that you mentioned.
2 - a parameterized query is a good way to avoid SQL injection. Parameterized queries are only available with PreparedStatement.
Statement is static and PreparedStatement is dynamic.
Statement is suitable for DDL and PreparedStatement for DML.
Statement is slower while PreparedStatement is faster.
more differences (archived)
Can't do CLOBs in a Statement.
And: (OraclePreparedStatement) ps
As Quoted by mattjames
The use of a Statement in JDBC should be 100% localized to being used
for DDL (ALTER, CREATE, GRANT, etc) as these are the only statement
types that cannot accept BIND VARIABLES. PreparedStatements or
CallableStatements should be used for EVERY OTHER type of statement
(DML, Queries). As these are the statement types that accept bind
variables.
This is a fact, a rule, a law -- use prepared statements EVERYWHERE.
Use STATEMENTS almost no where.
Statement will be used for executing static SQL statements and it can't accept input parameters.
PreparedStatement will be used for executing SQL statements many times dynamically. It will accept input parameters.
SQL injection attempts are neutralized by a prepared statement, so security is increased.
It's easier to read
You can easily make the query string a constant
Statement interface executes static SQL statements without parameters
PreparedStatement interface (extending Statement) executes a precompiled SQL statement with/without parameters
Efficient for repeated executions
It is precompiled so it's faster
Another characteristic of Prepared or Parameterized Query: Reference taken from this article.
Prepared statements are a feature of the database system that lets the same SQL statement be executed repeatedly with high efficiency. A prepared statement is a kind of template that the application uses with different parameters.
The statement template is prepared and sent to the database system, which parses, compiles and optimizes the template and stores it without executing it.
Some parameters, such as the values in the WHERE clause, are not passed when the template is created. The application sends them later, and the database system uses the stored template to execute the statement as requested.
Prepared statements are very useful against SQL injection because the application can supply the parameters through a different mechanism and protocol than the SQL text.
When the amount of data grows and the indexes change frequently, prepared statements might not work well, because such a situation requires a new query plan.
Don't get confused; simply remember:
Statement is used for static queries such as DDL (create, drop, alter), and PreparedStatement is used for dynamic queries such as DML.
With Statement the query is not precompiled, while with PreparedStatement it is, which makes PreparedStatement more time-efficient.
A PreparedStatement takes its SQL when it is created, while a Statement is created without any SQL.
For example, if you want to create a table and insert rows:
Create the table (static) with a Statement and insert the rows (dynamic) with a PreparedStatement.
I followed the answers to this question to change working legacy code that used Statement (and was open to SQL injection) into code that uses PreparedStatement. The result was much slower, because I had a poor understanding of the semantics of Statement.addBatch(String sql) versus PreparedStatement.addBatch().
So I am listing my scenario here so that others don't make the same mistake.
My scenario was
Statement statement = connection.createStatement();
for (Object object : objectList) {
//Create a query which would be different for each object
// Add this query to statement for batch using - statement.addBatch(query);
}
statement.executeBatch();
In the code above, I had thousands of different queries, all added to the same statement. It ran fast because not caching the statements was a good thing here, and the code executed rarely in the app.
To fix the SQL injection, I changed the code to:
List<PreparedStatement> pStatements = new ArrayList<>();
for (Object object : objectList) {
//Create a query which would be different for each object
PreparedStatement pStatement =connection.prepareStatement(query);
// This query can't be added to batch because its a different query so I used list.
//Set parameter to pStatement using object
pStatements.add(pStatement);
}// Object loop
// In place of statement.executeBatch(); , I had to loop around the list & execute each update separately
for (PreparedStatement ps : pStatements) {
ps.executeUpdate();
}
So you see, I started creating thousands of PreparedStatement objects and still could not use batching, because my scenario involves thousands of UPDATE or INSERT queries that all happen to be different.
Fixing the SQL injection was mandatory, but without any performance degradation, and I don't think that is possible with PreparedStatement in this scenario.
Also, when you use the built-in batching facility you only have to worry about closing one Statement, but with this List approach you need to close each statement before reuse (see "Reusing a PreparedStatement").
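Whether this helps depends on how the queries differ. If many of them share the same SQL text once the values are replaced by ? placeholders, they can be grouped so that each distinct text is prepared once and executed as one batch. The following is a sketch under that assumption; GroupedBatch and its nested Update class are hypothetical names, not part of JDBC. Note that grouping reorders the updates across different SQL texts, so it only fits when that order does not matter.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupedBatch {
    /** Hypothetical value type: one parameterized SQL text plus the values for its placeholders. */
    static final class Update {
        final String sql;
        final Object[] params;

        Update(String sql, Object... params) {
            this.sql = sql;
            this.params = params;
        }
    }

    /** Groups updates by SQL text, in first-seen order, so each distinct text is prepared once. */
    static Map<String, List<Object[]>> groupBySql(List<Update> updates) {
        Map<String, List<Object[]>> groups = new LinkedHashMap<>();
        for (Update u : updates) {
            groups.computeIfAbsent(u.sql, k -> new ArrayList<>()).add(u.params);
        }
        return groups;
    }

    /** Runs one JDBC batch per distinct SQL text; try-with-resources closes each statement. */
    static void executeAll(Connection conn, List<Update> updates) throws SQLException {
        for (Map.Entry<String, List<Object[]>> group : groupBySql(updates).entrySet()) {
            try (PreparedStatement ps = conn.prepareStatement(group.getKey())) {
                for (Object[] params : group.getValue()) {
                    for (int i = 0; i < params.length; i++) {
                        ps.setObject(i + 1, params[i]); // JDBC parameter indices start at 1
                    }
                    ps.addBatch();
                }
                ps.executeBatch();
            }
        }
    }
}
```

If every query really has a different shape, grouping gains nothing, but only one statement is open at a time and each is closed as soon as its batch has run.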
As we know, the best way to avoid SQL injection is to use a prepared statement with bind variables. But what if I use just a prepared statement without bind variables, like below, where the customer id comes from the user interface?
String query ="select * from customer where customerId="+customerId;
PreparedStatement stmt = con.prepareStatement(query); //line1
Does line 1 prevent SQL injection even though I have not used bind variables?
I agree the best way is the one below, but if the approach above also prevents SQL injection then I would prefer it (as this is a legacy project):
String query ="select * from customer where customerId=?";
PreparedStatement stmt = con.prepareStatement(query);
stmt.setInt(1, 100);
Is a prepared statement without bind variables sufficient to make sure SQL injection is not possible?
One has to distinguish several matters.
Using a prepared statement won't help just by itself.
Likewise, there is no harm in using the non-prepared way in general.
Prepared statements matter only when you need to insert a dynamic part into the query.
In that case, the dynamic part has to go into the query via a placeholder only, whose actual value is bound later (a placeholder is a ? or any other mark that represents the actual data in the query).
The very term "prepared statement" implies using placeholders for all the dynamic data that goes into the query. So,
- if you have no dynamic parts in the query, there is obviously no injection at all, even without prepared statements.
- if you're using a prepared statement but put values directly into the query instead of binding them, it is wide open to injection.
So, again: a prepared statement works only with placeholders for all dynamic data. And it works because:
- every dynamic value has to be properly formatted.
- a prepared statement makes proper formatting (or handling) inevitable.
- a prepared statement does proper formatting (or handling) in the only proper place, right before query execution and not somewhere else, so our safety doesn't rely on unreliable sources like
  - some 'magic' feature that would rather spoil the data than make it safe.
  - the good will of one (or several) programmers, who may decide to format (or not to format) a variable somewhere in the program flow. That's the point of great importance.
- a prepared statement affects only the value that goes into the query, not the source variable, which remains intact and can be used in later code (to be sent via email or shown on-screen).
- a prepared statement can make application code dramatically shorter, doing all the formatting behind the scenes (*only if the driver permits).
Line 1 does not check whether the developer wants to drop a table or not. Whatever SQL you write is assumed to be intended.
The goal of SQL injection is to craft values that add extra SQL to a query without the developer's intent or knowledge, for example by querying your website with fake values in its parameters.
Example:
id = "'); DROP ALL TABLES; --";
query = "select * from customer where customerId="+id;
PreparedStatement ensures that special symbols (like ' or ") added to query using setInt/setString/etc will not interfere with sql query.
I know this is an older post; I just wanted to add that you avoid injection attacks if you can make sure that only integers are allowed into your query on line 1. String inputs are where injection attacks happen. In the sample above it is unclear which type the variable 'customerId' has, although it looks like an int. Since the question is tagged as Java, and you can't do an injection attack with an int, you should be fine in that case.
If it is a String on line 1, you need to be confident that 'customerId' comes from a secure source that can only produce an integer. If it comes from a posted form or another user-generated field, you can either try to escape it or convert it to an integer to be sure. If it is a String, parse it into an int (for example with Integer.parseInt) and you will not need to bind params.
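As a sketch of the "convert it to an integer" idea: Integer.parseInt rejects anything that is not a plain integer by throwing NumberFormatException, so hostile text never reaches the query. The class and method names below (CustomerLookup, parseCustomerId, exists) are hypothetical; the table and column come from the question. The example still binds the parsed value, since that costs one line and keeps user input out of the SQL text altogether.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class CustomerLookup {
    /** Parses raw UI input; anything that is not a plain integer throws NumberFormatException. */
    static int parseCustomerId(String raw) {
        return Integer.parseInt(raw.trim());
    }

    /** Parses the id and binds it anyway, so the SQL text never contains user input. */
    static boolean exists(Connection con, String rawCustomerId) throws SQLException {
        int id = parseCustomerId(rawCustomerId);
        try (PreparedStatement stmt = con.prepareStatement(
                "select * from customer where customerId = ?")) {
            stmt.setInt(1, id);
            try (ResultSet rs = stmt.executeQuery()) {
                return rs.next(); // true if at least one row matched
            }
        }
    }
}
```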
In many programming languages something like this is possible for prepared statements:
PreparedStatement statement = connection.prepareStatement(
"SELECT id FROM Company WHERE name LIKE ${name}");
statement.setString("name", "IBM");
But not with java.sql.PreparedStatement. In Java one has to use parameter indices:
PreparedStatement statement = connection.prepareStatement(
"SELECT id FROM Company WHERE name LIKE ?");
statement.setString(1, "IBM");
Is there a solution to work with string variables like in the first example?
Is "${.*}" used anywhere else in the SQL language, or are there any conflicts? If not, I would implement it myself (parsing the SQL string, replacing every variable with "?" and then doing it the Java way).
Regards,
Kai
Standard JDBC PreparedStatements don't have this ability. Spring JDBC provides this functionality through NamedParameterJdbcTemplate.
As kd304 mentioned in the comment to my posting, this is a very nice solution if you don't want to incorporate another 3rd party library (like Spring) into your project: Javaworld Article: Named Parameters for PreparedStatement
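For a rough idea of what such a do-it-yourself layer involves, here is a minimal sketch using the ${name} syntax from the question. NamedSql is a made-up class for this example, not the code from the linked article, and it is deliberately naive: it does not skip placeholders that appear inside SQL string literals or comments. It rewrites each ${name} to ? and remembers which name belongs to which JDBC index, so a name may appear more than once.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NamedSql {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{(\\w+)\\}");

    final String sql;         // SQL text with every ${name} replaced by ?
    final List<String> names; // names.get(i) belongs to JDBC parameter index i + 1

    private NamedSql(String sql, List<String> names) {
        this.sql = sql;
        this.names = names;
    }

    /** Replaces each ${name} with ? and records the names in order of appearance. */
    static NamedSql parse(String template) {
        Matcher m = PLACEHOLDER.matcher(template);
        List<String> names = new ArrayList<>();
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(template, last, m.start()).append('?');
            names.add(m.group(1));
            last = m.end();
        }
        out.append(template, last, template.length());
        return new NamedSql(out.toString(), names);
    }

    /** Binds values by name onto a statement that was prepared from this.sql. */
    void bind(PreparedStatement ps, Map<String, ?> values) throws SQLException {
        for (int i = 0; i < names.size(); i++) {
            String name = names.get(i);
            if (!values.containsKey(name)) {
                throw new IllegalArgumentException("No value for parameter " + name);
            }
            ps.setObject(i + 1, values.get(name));
        }
    }
}
```

Usage would be along the lines of NamedSql q = NamedSql.parse("SELECT id FROM Company WHERE name LIKE ${name}"); then connection.prepareStatement(q.sql), followed by q.bind(statement, values) with a map that contains "name".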
Using a raw PreparedStatement, this is not possible, as you say. It is possible with CallableStatement, but that requires a stored procedure rather than just a SQL statement.
ORM layers like Hibernate also provide named parameter substitution, and Hibernate also allows you to execute native SQL, bypassing the OR mapping functionality completely.
So if you were really keen to use named parameters, you could employ Hibernate as a way of doing this; you'd only be using a tiny fraction of its functionality.