expose database queries to users - java

Suppose I have a table with columns 'name', 'age', 'city', 'country'. I would like to expose the database query interface to users, that is, they should be able to perform all kinds of queries that SQL lets us do.
The only way I can think of doing this is to have a row for each column where each row is of the form:
column name | operator | value
An example query in an activity would be:
name | = | Bob
age | > | 25
Then with that information I could have a method perform the query and return the result.
For this simple example it would work. But there are more interesting things one can ask of SQL, and this approach would fail at a lot of those queries.
What can I do about this?

First of all, you need to think about which operations you are going to allow your users to perform. The most common SQL statements are SELECT, INSERT, UPDATE and DELETE. Once you have the list of operations you are going to provide, think about the parameters the user can choose for each of them. For example, if the user wants to fetch (SELECT) some data, what can they provide as input? In your case, that could be the age of the person. The same goes for the other statements.
Once you have that information, you need to convert it into a query. This can be done with a utility or helper class that takes the user's chosen operation and parameters and forms the SQL query, which you then execute on your database.
For example, let's take the StackOverflow Jobs page (https://stackoverflow.com/jobs?med=site-ui&ref=jobs-tab). Here the user can fetch data based on multiple parameters such as keyword, location, remote, etc. Taking this example into consideration, your utility would take the user-selected parameters and generate a SELECT query of the form
SELECT * FROM jobs WHERE is_remote=<user_param> AND tech=<user_param> AND compensation=<user_param>
This is just an overview of what needs to be done. There might be some changes based on your exact use case. But this will just about cover what you need to achieve.
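As a rough illustration of such a helper, here is a minimal sketch that turns the question's "column | operator | value" rows into a SELECT with bind placeholders. The class name FilterQueryBuilder, the table name "people" and the operator list are assumptions made for the example; only the four column names come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class FilterQueryBuilder {

    /** One user-supplied row of the form: column | operator | value. */
    static final class Filter {
        final String column;
        final String operator;
        final Object value;

        Filter(String column, String operator, Object value) {
            this.column = column;
            this.operator = operator;
            this.value = value;
        }
    }

    /** SQL text with ? placeholders, plus the values to bind to them in order. */
    static final class BuiltQuery {
        final String sql;
        final List<Object> params;

        BuiltQuery(String sql, List<Object> params) {
            this.sql = sql;
            this.params = params;
        }
    }

    // Assumed table name; the columns are the ones listed in the question.
    private static final String TABLE = "people";
    private static final Set<String> COLUMNS = Set.of("name", "age", "city", "country");
    private static final Set<String> OPERATORS = Set.of("=", "<>", "<", "<=", ">", ">=", "LIKE");

    static BuiltQuery build(List<Filter> filters) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(TABLE);
        List<Object> params = new ArrayList<>();
        for (Filter f : filters) {
            // Column names and operators cannot be bound as parameters, so they
            // are only accepted when they appear in the allowlists above.
            if (!COLUMNS.contains(f.column)) {
                throw new IllegalArgumentException("Unknown column: " + f.column);
            }
            if (!OPERATORS.contains(f.operator)) {
                throw new IllegalArgumentException("Unsupported operator: " + f.operator);
            }
            sql.append(params.isEmpty() ? " WHERE " : " AND ")
               .append(f.column).append(' ').append(f.operator).append(" ?");
            params.add(f.value); // the value is bound later, never concatenated
        }
        return new BuiltQuery(sql.toString(), params);
    }
}
```

The resulting sql would be passed to Connection.prepareStatement, and each entry of params bound with PreparedStatement.setObject(i + 1, params.get(i)). This only covers AND-combined comparisons; anything richer (OR, grouping, joins) needs a richer input model.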

Related

Limiting SQL Injection when query is almost entirely configurable

I have a requirement to perform a scheduled dump of a SQL query from a web application. Initially it was an entire table (only the table name was configurable), but then the addition of a configurable WHERE clause was raised, along with a subset of columns.
The configurable options now required are:
columns
table name
where clause
At this point, it might as well just be the entire query, right?!
I know that SQLi can be mitigated somewhat by java.sql.PreparedStatement, but as far as I can tell, that relies on knowing the columns and datatypes at compile time.
The configurable items will not be exposed to end users. They will sit in a properties file within WEB-INF/classes, so the users I am defending against here are sysadmins who are not as good as they think they are.
Am I being overcautious here?
If nothing else, can java.sql.PreparedStatement prevent multiple queries from being executed if, say, the WHERE clause was Robert'); DROP TABLE students;--?
A prepared statement will not handle this for you. With a prepared statement you can only safely add parameters to your query, not table names, column names or entire where clauses.
Especially the latter makes it virtually impossible to prevent injection if there are no constraints whatsoever. Column and table name parameters could be checked against a list of valid values, either statically defined or derived dynamically from your database structure. You could do some basic regex checking on the where parameter, but that will only really help against obvious SQL injection.
With the flexibility you intend to offer in the form of SELECT FROM WHERE you could have queries like this:
SELECT mycolumn FROM mytable WHERE id = 1 AND 'username' in (SELECT username FROM users)
You could look at something like JOOQ to offer safe dynamic query building while still being able to constrain the things your users are allowed to query for.
Constraining your users in one way or another is key here. Not doing that means you have to worry not just about SQL injection but also about performance issues, for instance. One option is to provide them with a visual (drag-and-drop) query builder.
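The check of configured table and column names against a list of valid values, mentioned above, could look roughly like the following sketch. The class name SelectListBuilder and the hard-coded table and column names are made up for illustration; in a real application the map might be filled from the database metadata instead.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

final class SelectListBuilder {

    // Hypothetical schema description: table name -> columns that may be dumped.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
            "students", Set.of("id", "name", "year"),
            "courses", Set.of("id", "title"));

    /** Builds "SELECT c1, c2 FROM table" only from names found in the allowlist. */
    static String buildSelect(String table, List<String> columns) {
        Set<String> allowedColumns = ALLOWED.get(table);
        if (allowedColumns == null) {
            throw new IllegalArgumentException("Table not allowed: " + table);
        }
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("At least one column is required");
        }
        StringJoiner select = new StringJoiner(", ", "SELECT ", " FROM " + table);
        for (String column : columns) {
            if (!allowedColumns.contains(column)) {
                throw new IllegalArgumentException("Column not allowed: " + column);
            }
            select.add(column);
        }
        return select.toString();
    }
}
```

Because every identifier must match an allowlisted name exactly, a configured value such as "students; DROP TABLE students" is rejected before any SQL is assembled. This does nothing for a free-form WHERE clause, which is the part that remains hard to constrain.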
"It all depends".
If you have an application where users can type in the where clause as free text, then yes, they can construct SQL Injection attacks. They can also grind your server to a halt by selecting huge cartesian joins.
You could create a visual query builder - use the schema metadata to show a list of tables, and once the table is selected the columns, and for each column the valid comparisons. You can then construct the query as a parameterized query, and limit the human input to the comparison values, which you can in turn use as parameters.
It's a lot of work, though, and in most production systems of any scale, letting users run this kind of query is usually not particularly useful...
It's insecure to allow users to execute arbitrary queries. This is the kind of thing you'd see at Equifax. You don't want to allow it.
Prepared statements don't help make SQL expressions safe. Using parameters in prepared statements helps make values safe. You can use a parameter only in the place where you would normally put a constant value, like a number, a quoted string, or a quoted date.
The easiest solution would be to NOT allow arbitrary queries or expressions on demand.
Instead, allow users to submit their custom query for review.
The query is reviewed by a human being, who may authorize the stored query to be run by the user (or other users). If you think you can develop some kind of automatic validator, be my guest, but IMHO that's bound to be a lot more work than just having a qualified database administrator review it.
Subsequently, the user is allowed to run the stored query on demand, but only by its id.
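A minimal in-memory sketch of that submit, review, run-by-id workflow might look like this. The class StoredQueryRegistry and its method names are invented for the example, and a real system would persist the entries and record who approved what.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class StoredQueryRegistry {

    private static final class Entry {
        final String sql;
        volatile boolean approved; // stays false until a reviewer signs off

        Entry(String sql) {
            this.sql = sql;
        }
    }

    private final Map<Integer, Entry> entries = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger(1);

    /** A user submits a query for review and gets back its id. */
    int submit(String sql) {
        int id = nextId.getAndIncrement();
        entries.put(id, new Entry(sql));
        return id;
    }

    /** Called by the human reviewer once the query has been checked. */
    void approve(int id) {
        Entry entry = entries.get(id);
        if (entry == null) {
            throw new IllegalArgumentException("No such query: " + id);
        }
        entry.approved = true;
    }

    /** Returns the SQL text only for known, approved ids. */
    Optional<String> approvedSql(int id) {
        Entry entry = entries.get(id);
        return (entry != null && entry.approved) ? Optional.of(entry.sql) : Optional.empty();
    }
}
```

The code that actually executes queries would accept only an id and call approvedSql(id), so user input never reaches the database as SQL text.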
Here's another alternative idea: users who want to run custom queries can apply to get a replica of the database, to host on their own computer. They will get a dump of the subset of data they are authorized to view. Then if they run queries that trash the data, or melt their computer, that's their business.

Hibernate limit amount of result but check for more

As the title states, I want to retrieve a maximum of, for example, 1000 rows, but if the query's result would be 1001 rows, I would like to know in some way. I have seen examples that check the number of rows in the result with a second query, but I would like to have it in the query I use to get the 1000 rows. I am using Hibernate and Criteria to retrieve my results from the database. The database is MS SQL.
What you want is not possible in a generic way.
The 2 usual patterns for pagination are:
use 2 queries: a first one that counts, then one that gets a page of results
use only one query, where you fetch one more result than you show on the page
With the first pattern, your pagination has more functionality because you can display the total number of pages and let users jump directly to the page they want, but you get this at the cost of an additional SQL query.
With the second pattern you can only tell the user whether there is one more page of data or not. The user can then only move to the next page (or to any previous page they have already seen).
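The second pattern can be sketched as a small helper that is independent of how the rows are fetched. The class PageSlice is an illustrative name; the assumption is that the query itself was run with a maximum of limit + 1 results, for example via setMaxResults(limit + 1) on the Hibernate Criteria.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PageSlice<T> {
    final List<T> rows;     // at most 'limit' rows, the ones to show
    final boolean hasMore;  // true if the query found at least one extra row

    private PageSlice(List<T> rows, boolean hasMore) {
        this.rows = rows;
        this.hasMore = hasMore;
    }

    /**
     * Expects the result of a query that asked for limit + 1 rows.
     * The extra row, if present, is dropped and only reported through hasMore.
     */
    static <T> PageSlice<T> fromOverfetched(List<T> fetched, int limit) {
        boolean hasMore = fetched.size() > limit;
        List<T> visible = hasMore ? fetched.subList(0, limit) : fetched;
        return new PageSlice<>(Collections.unmodifiableList(new ArrayList<>(visible)), hasMore);
    }
}
```

With limit = 1000, a fetched list of 1001 rows yields 1000 visible rows and hasMore == true, while a list of 1000 or fewer yields hasMore == false. It cannot tell how many more rows exist, only that there is at least one.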
You want two pieces of information that result from two distinct queries:
select count(*) from ...
select col1, col2 from ...
You cannot do it in a single executed Criteria or JPQL query.
But you can do it with a native SQL query (using a subquery), in a way that differs according to the DBMS used.
Doing so would make your code more complex and more dependent on a specific DBMS, and you would probably not gain much in terms of performance.
I think you should rather use a count query and a second query to get the rows.
And if you later want to use the result of the count to fetch the next results, you should favor the pagination mechanisms provided by Hibernate rather than doing it in a custom way.

Hibernate dynamic number of results per row

I'm working on a feature which allows some users to define their own SQL queries and run them on the database.
Basically a query could look like this:
1. SELECT first_name, last_name FROM user;
2. SELECT first_name, last_name, id, address, email FROM user.
As you can see there may be a different number of columns in the result table.
Is there a way to handle this in Hibernate?
For instance, the basic usage displayed below does not help me in any way because I cannot be sure that each result row has at least 2 columns.
Query query = session.getSession().createSQLQuery(queryStr);
ScrollableResults results = query.scroll(ScrollMode.FORWARD_ONLY);
while (results.next()) {
    data.put("firstName", String.valueOf(results.get(0)));
    data.put("lastName", String.valueOf(results.get(1)));
}
Furthermore, I don't think I can use the select new map because the users have to run native MySQL queries.
Is there any solution to this?
Thanks in advance for your time and suggestions!
First of all, I do not think that it is a good idea to expose an interface that allows SQL input to users.
Anyway, in your case you could program the SQL result set extraction yourself. There is no need for object-relational mapping if you do not map into objects. You could then just check which columns exist and map them straight into your data structure. It is a nice way to learn plain JDBC. In case you use Hibernate for other parts of your application, you can even mix the approaches.
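A minimal sketch of that plain-JDBC extraction is shown below. It uses the standard java.sql.ResultSetMetaData to discover the columns at run time; the class name ResultSetRows and the choice of one map per row are assumptions for the example.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ResultSetRows {

    /** Reads every row into a map keyed by column label, whatever the column count is. */
    static List<Map<String, Object>> readAll(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columnCount = meta.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            // LinkedHashMap keeps the columns in SELECT order.
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= columnCount; i++) { // JDBC columns are 1-based
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

If a query returns two columns with the same label, the later one overwrites the earlier one in the map, so a list of values per row may be a better fit for arbitrary user queries.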

How to validate if a record exists when issuing a REST update request using spring jdbcTemplate?

I have a simple database table users with 3 columns:
| id | username | nationality |
| 1 | John | American |
| 2 | Doe | English |
I want to issue an update via a POST request to http://mysite/users/2/nationality
Now my initial approach was to do a single query
UPDATE users SET nationality="French" WHERE id=2; followed by a query for the updated object SELECT * FROM users WHERE id=2; then return the updated object in the response.
The problem is the id passed in the request may not exist in my database. How should I validate if a user exists in the database?
1. Should I just check if the query returns an object?
2. Should I validate the update first for the affected rows (affected rows will be zero if no change was made to the data to be updated, so I can't throw a UserNotFoundException in that case)?
3. Is it better to issue a query before the update just to check if the row exists, then update, then query the updated row?
public void updateRecord(Long id, String username) {
    String updateSql = "UPDATE users SET username = ? WHERE id = ?";
    JdbcTemplate template = new JdbcTemplate(dataSource);
    Object[] params = { username, id };
    int[] types = { Types.VARCHAR, Types.BIGINT };
    int rows = template.update(updateSql, params, types);
    System.out.println(rows + " row(s) updated.");
}
If you always need the update to return the updated object in the response, then option 1 seems like a reasonable way to check if the update matched an existing user. Although if you aren't using transactions, you should be aware that the user may not exist at the time of the update, but a separate connection could insert the user before your select.
That said, without transactions there is always a chance that the select will return the object in a different state from the update you just performed. It is slightly worse in this case, though, because technically the update should have failed.
If you don't need the update to return the updated object in the response, then option 2 seems like a much better solution. For this to work, though, you need the update to return the number of matched rows rather than the number of changed rows (so if the update matches an existing user, but the field you are updating doesn't change, you'll still get a non-zero result).
Usually you would have to set a connection attribute to make this work for MySQL (for example, in PHP's PDO driver there is the MYSQL_ATTR_FOUND_ROWS attribute). However, my understanding is that this option is already enabled in JDBC so executeUpdate should return the number of matched rows. I can't confirm that at the moment, but it should be easy enough for you to test.
The best approach in your case is to run a select query against the given id in order to verify that a corresponding record exists in your database. If the record exists, you can proceed with the success flow and run the update and select queries you mentioned above. Otherwise, if the record does not exist, you can proceed with the failure flow (throw exceptions etc.).
I'd go with option 3 for the reusability and sanity aspect, unless you're worried about a couple of extra queries for (very strict and not obvious) performance reasons.
Since you'll likely reuse the code to retrieve the user in other places, I'd first retrieve the user, and return a 404 if they are not found. Then I'd call update and sanity check the number of rows changed. Finally, I'd call the retrieval method to get the user and marshal it into the response body. It's simple, it works, it's readable, it's predictable, it's testable, and it's most likely fast enough. And you've just reused your retrieval method.
I had a similar issue. Here is how I tackled it.
Whenever it's a new user, mark the id with a default number (e.g. 51002122); here 51002122 is never an id in the db. So the page shows "/51002122/user". Whenever the id of the user is 51002122, I do an insert into the db. After the insert, I render the page with the id from the db. E.g. after insertion, the page would be "/27/user".
For all ids other than 51002122 (e.g. /12/user or /129/user) I do an update in the db because I know that this user exists in the db.
Not sure if this is the right approach, but it works. Can someone suggest a better or more correct approach?
I think the safest way is:
SELECT EXISTS(
    SELECT *
    FROM users
    WHERE id = 3 ) as columnCount;
This returns 1 if a row with id = 3 exists and 0 otherwise (EXISTS yields a boolean, not a row count). You can then check whether columnCount is 1 and, if so, execute the update statement; otherwise do something else.
Before arriving at a solution, a few things need to be considered.
This is a REST API call, which calls for simplicity from the usage perspective.
The server-side code should also consider the performance implications of the chosen implementation.
The API should be robust, meaning that, come what may, the request should always follow the flows (happy/exception) conceived in the design.
Based on these considerations, I would suggest the following approach.
In the DAO, define two different methods, namely updateRecord(Long id, String username) and getRecord(Long id).
Mark these methods with a transaction attribute (@Transactional) as follows:
Mark the transaction attribute for updateRecord as REQUIRED.
Mark the transaction attribute for getRecord as NOT_SUPPORTED, since this is purely a read call.
Note that in all cases, at least one DB call is required.
From controller, call updateRecord method first. This method will return an integer.
If the returned value is nonzero, then call getRecord to retrieve the updated record from database.
If the returned value is zero, that indicates the user does not exist and there is no need to call getRecord. An appropriate error response (404 Not Found) should be returned to the calling client.
In this approach, you will save on one database call when the user does not exist.
Overall this approach is neat, less cluttered and, most importantly, simple and efficient (we are limiting the transaction boundary to the update call only). Moreover, getRecord can be used independently as another API to retrieve a record (without a transaction).
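The controller-side flow described in this answer (update first, fetch only if a row was matched, otherwise report not found) could be sketched as follows. The UserDao interface, the exception class and the in-memory DAO are hypothetical stand-ins so the example runs without Spring or a database; a real DAO would delegate to JdbcTemplate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

final class UserUpdateFlow {

    /** Hypothetical DAO; a real one would wrap JdbcTemplate.update and a SELECT by id. */
    interface UserDao {
        int updateNationality(long id, String nationality); // number of rows matched
        Optional<String> findNationality(long id);
    }

    /** A REST layer would typically translate this into a 404 response. */
    static final class UserNotFoundException extends RuntimeException {
        UserNotFoundException(long id) {
            super("No user with id " + id);
        }
    }

    /** Update first; only read the record back when the update matched a row. */
    static String updateAndFetch(UserDao dao, long id, String nationality) {
        int rows = dao.updateNationality(id, nationality);
        if (rows == 0) {
            throw new UserNotFoundException(id); // no second database call needed
        }
        return dao.findNationality(id).orElseThrow(() -> new UserNotFoundException(id));
    }

    /** In-memory stand-in for the database, for demonstration only. */
    static final class InMemoryUserDao implements UserDao {
        private final Map<Long, String> nationalityById = new HashMap<>();

        void insert(long id, String nationality) {
            nationalityById.put(id, nationality);
        }

        @Override
        public int updateNationality(long id, String nationality) {
            // Counts matched rows: 1 if the id exists, even when the value is unchanged.
            return nationalityById.replace(id, nationality) != null ? 1 : 0;
        }

        @Override
        public Optional<String> findNationality(long id) {
            return Optional.ofNullable(nationalityById.get(id));
        }
    }
}
```

The sketch relies on the update reporting matched rather than changed rows; if the driver reports changed rows instead, an update that sets the same value again would wrongly look like a missing user.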
I had a similar issue where I was supposed to update a table but first needed to check whether the id exists. I used OpenJPA and wrote a method verifyUser(id), where id was the one I needed to check. OpenJPA's findById returns the record. On update it returns the complete record, whereas on add it returns the new primary key under which the record was added. I am not sure how it works for Hibernate, but there are many similarities between JPA and Hibernate.

build oracle sql query dynamically from java application

How do I build an Oracle PL/SQL query dynamically from a Java application? The user will be presented with a bunch of columns that are present in different tables in the database. The user can select any set of columns and the application should build the complete select query using only the tables that contain the selected columns.
For example, let's consider that there are 3 tables in the database. The user selects col11, col22. In this case, the application should build the query using Tabl1 and Tabl2 only.
How do I achieve this?
Tabl1
- col11
- col12
- col13
Tabl2
- fkTbl1
- col21
- col22
- col23
Tabl3
- col31
- col32
- col33
- fkTbl1
Ad hoc reporting is an old favourite. It frequently appears as a one-liner at the end of the Reports Requirements section: "Users must be able to define and run their own reports". The only snag is that ad hoc reporting is an application in its own right.
You say
"The user will be presented with a
bunch of columns that are present in
different tables in the database."
You can avoid some of the complexities I discuss below if the "bunch of columns" (and the spread of tables) is preselected and tightly controlled. Alas, it is in the nature of ad hoc reporting that users will want pretty much all columns from all tables.
Let's start with your example. The user has selected col11 and col22, so you need to generate this query:
SELECT tabl1.col11
     , tabl2.col22
FROM tabl1 JOIN tabl2
    ON (tabl1.id = tabl2.fkTbl1)
/
That's not too difficult. You just need to navigate the data dictionary views USER_CONSTRAINTS and USER_CONS_COLUMNS to establish the columns in the join condition - providing you have defined foreign keys (please have foreign keys!).
Things become more complicated if we add a fourth table:
Tabl4
- col41
- col42
- col43
- fkTbl2
Now when the user chooses col11 and col42 you need to navigate the data dictionary to establish that Tabl2 acts as an intermediary table to join Tabl4 and Tabl1 (presuming you are not using composite primary keys, as most people don't). But suppose the user selects col31 and col41. Is that a legitimate combination? Let's say it is. Now you have to join Tabl4 to Tabl2 to Tabl1 to Tabl3. Hmmm...
And what if the user selects columns from two completely unrelated tables - Tabl1 and Tabl23? Do you blindly generate a CROSS JOIN or do you hurl an exception? The choice is yours.
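One way to work out which intermediary tables are needed is to treat the foreign keys as edges of a graph and search for a path between the two tables. The sketch below uses a breadth-first search over hand-registered foreign keys; the class JoinPathFinder is an illustrative name, and in practice the edges would be loaded from USER_CONSTRAINTS and USER_CONS_COLUMNS rather than added by hand.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class JoinPathFinder {

    // Undirected adjacency: a foreign key lets us join in either direction.
    private final Map<String, Set<String>> neighbours = new HashMap<>();

    void addForeignKey(String childTable, String parentTable) {
        neighbours.computeIfAbsent(childTable, k -> new LinkedHashSet<>()).add(parentTable);
        neighbours.computeIfAbsent(parentTable, k -> new LinkedHashSet<>()).add(childTable);
    }

    /** Shortest chain of tables from 'from' to 'to', or an empty list if unrelated. */
    List<String> findPath(String from, String to) {
        Map<String, String> cameFrom = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        cameFrom.put(from, null); // marks the start table as visited
        queue.add(from);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            if (current.equals(to)) {
                // Walk the predecessor links back to the start table.
                LinkedList<String> path = new LinkedList<>();
                for (String t = to; t != null; t = cameFrom.get(t)) {
                    path.addFirst(t);
                }
                return path;
            }
            for (String next : neighbours.getOrDefault(current, Collections.<String>emptySet())) {
                if (!cameFrom.containsKey(next)) {
                    cameFrom.put(next, current);
                    queue.add(next);
                }
            }
        }
        return Collections.emptyList(); // caller decides: CROSS JOIN or exception
    }
}
```

With the foreign keys of the example schema registered (Tabl2 to Tabl1, Tabl3 to Tabl1, Tabl4 to Tabl2), a search from Tabl4 to Tabl1 passes through Tabl2. An empty result corresponds to the unrelated-tables case, where the application has to choose between a CROSS JOIN and an error. When more than two tables are selected, or several foreign keys connect the same pair, more logic is needed than this sketch provides.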
Going back to that first query, it will return all the rows in both tables. Almost certainly your users will want the option to restrict the result set. So you need to offer them the ability to add filters to the WHERE clause. Gotchas here include:
ensuring that supplied values are of an appropriate data-type (no strings for a number, no numbers for a date)
providing look-ups to reference data values
handling multiple values (IN list rather than equals)
ensuring date ranges are sensible (opening bound before closing bound)
handling free text searches (are you going to allow it? do you need to use TEXT indexes or will you run the risk of users executing LIKE '%whatever%' against some CLOB column?)
The last point highlights one risk inherent in ad hoc reporting: if the users can assemble a query from any tables with any filters they can assemble a query which can drain all the resources from your system. So it is a good idea to apply profiles to prevent that happening. Also, as I have already mentioned, it is possible for the users to build nonsensical queries. Bear in mind that you don't need very many tables in your schema to generate too many permutations to test.
Finally there is the tricky proposition of security policies. If users are restricted to seeing subsets of data on the basis of their department or their job role, then you will need to replicate those rules. In such cases the automatic application of policies through Row Level Security is a real boon.
All of which might lead you to conclude that the best solution would be to persuade your users to acquire an off-the-shelf product instead. Although that approach isn't without its own problems.
The way that I've done this kind of thing in the past is to simply construct the SQL query on the fly using a StringBuilder and then execute it using a non-prepared JDBC statement. This is rather inefficient since the Oracle DB has to repeat all of the query analysis and optimization work for each query.
