How to SELECT items from an array (IN clause details)? - java

I would like to do something in Java (using iBatis, JDBC, etc., really in SQL) like:
SELECT SUM(rowName) FROM myTable WHERE id = [myArrayOfIds]
Where myArrayOfIds can be almost any length. Now I know you can do:
SELECT SUM(rowName) FROM myTable WHERE id IN (x, y, z)
but what happens for longer lists? For example, my list could hold anywhere from a few items to hundreds or more. How can I do this?

I think it depends on your flavour of SQL. For instance, Oracle does not allow more than 1000 values in an IN() list. Other flavours may vary.

One alternative would be to insert those ids into a table, then do a join:
SELECT SUM(rowName) FROM myTable ta inner join tempTable tb on ta.id = tb.id

Oracle definitely allows more than 1000 items in the IN clause. It's your persistence tool that is limiting this. iBatis or Hibernate, whatever. Use Oracle Sqlplus and you'll see this is not an Oracle limit.
Suggestion from BlackTigerX would work, or you could call the query multiple times, passing 1000 items at a time and aggregating the results. Either way, you're just working around your persistence tool limitation.
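The "1000 items at a time" idea could be sketched roughly as below. This is only a sketch: InClauseBatcher, partition and sumQuery are made-up names (not part of iBatis or JDBC), and the table and column names are taken from the question. It only splits the ids and builds the statement text with one ? placeholder per id; running the statements and adding up the partial sums is left to whatever persistence tool is in use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not part of iBatis or JDBC.
class InClauseBatcher {

    // Splits ids into consecutive sublists of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    // Builds the query text with one '?' placeholder per id in the batch.
    // Table and column names are the ones used in the question.
    static String sumQuery(int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be positive");
        }
        String placeholders = String.join(", ", Collections.nCopies(idCount, "?"));
        return "SELECT SUM(rowName) FROM myTable WHERE id IN (" + placeholders + ")";
    }
}
```

For 2500 ids and a batch size of 1000 this gives three statements, with 1000, 1000 and 500 placeholders. With plain JDBC each batch would be bound to a PreparedStatement and the partial sums added together in Java.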

Related

PostgreSQL multiple 'WHERE' conditions (1000+) request

I'm not a pro in SQL at all :)
Having a very critical performance issue.
Here is the info directly related to problem.
I have 2 tables in my DB: table condos and table items.
Table condos has the fields:
id (PK)
name
city
country
table items:
id (PK)
name
multiple fields not related to issue
condo_id (FK)
I have 1000+ entities in the condos table and 1000+ in the items table.
The problem is how I perform the items search.
Currently it works like this:
For example, I want to get all the items for city = Sydney:
1. Perform a SELECT condos.condo_id FROM public.condos WHERE city = 'Sydney'
2. Make a SELECT * FROM public.items WHERE items.condo_id = ? for each condo_id I get in step 1.
The issue is that once I have 1000+ entities in the condos table, the request is performed 1000+ times, once for each condo_id that belongs to 'Sydney'. Executing this takes more than 2 minutes, which is a critical performance issue.
So, the question is:
What is the best way for me to perform such a search? Should I put 1000+ ids in a single WHERE clause? Or something else?
For additional info, I use PostgreSQL 9.4 and Spring MVC.
Use a table join so that you do not need to perform an additional query. In your case you can join condos and items by condo_id, something like:
SELECT i.*
FROM public.items i join public.condos c on i.condo_id = c.condo_id
WHERE c.city = 'Sydney'
Note that performance tuning is a broad topic. It can vary from environment to environment, depending on how you structure the data in your tables and how you organize the data in your code.
Here are some other suggestions that may also help:
Try adding an index to the fields you use for sorting and searching, e.g. city in condos and condo_id in items. There is a good answer that explains how indexing works.
I also recommend running EXPLAIN to see the query plan for your query and check whether there is a full table scan that may cause a performance issue.
Hope this helps.
Essentially, what you need is to eliminate the N+1 query problem and at the same time ensure that your city field is indexed. You have 3 mechanisms to choose from. One, already stated in one of the other answers you have received, is the SUBSELECT approach. Beyond this approach you have another two.
You can use what you have stated:
SELECT condos.condo_id FROM public.condos WHERE city = 'Sydney'
SELECT *
FROM public.items
WHERE items.condo_id IN (up to 1000 ids here)
The reason I am stating up to 1000 is that some SQL providers have limitations.
You can also use a join as a way to eliminate the N+1 selects:
SELECT *
FROM public.items join public.condos on items.condo_id=condos.condo_id and condos.city='Sydney'
Now, what is the difference between the 3 queries?
Pros of the Subselect query: you get everything at once.
Cons: if you have too many elements, performance may suffer.
Pros of the simple IN clause: it effectively solves the N+1 problem.
Cons: it may lead to some extra queries compared to the Subselect.
Pros of the Joined query: you can initialize both Condo and Item in one go.
Cons: it leads to some data duplication on the Condo side.
If we look at a framework like Hibernate, we find that in most cases the fetch strategy used is either Join or IN. Subselect is used rarely.
Also, if performance is critical, you may consider reading everything into memory and serving it from there. Judging from the content of these two tables, it should be fairly easy to just load it into a Map.
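As a rough illustration of the in-memory idea, something like the following could work. CondoItemCache and its methods are hypothetical names, and items are reduced to a name string to keep the sketch short; a real version would be filled once from the two tables (for example with the joined query) and refreshed whenever the data changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory index; class, method and field names are illustrative only.
class CondoItemCache {

    // condo id -> city, filled from the condos table
    private final Map<Integer, String> cityByCondoId = new HashMap<>();
    // city -> names of all items whose condo is in that city
    private final Map<String, List<String>> itemsByCity = new HashMap<>();

    void addCondo(int condoId, String city) {
        cityByCondoId.put(condoId, city);
    }

    // Condos must be registered before their items.
    void addItem(String itemName, int condoId) {
        String city = cityByCondoId.get(condoId);
        if (city == null) {
            throw new IllegalArgumentException("unknown condo " + condoId);
        }
        itemsByCity.computeIfAbsent(city, k -> new ArrayList<>()).add(itemName);
    }

    // A single map lookup replaces the per-condo queries.
    List<String> itemsIn(String city) {
        return itemsByCity.getOrDefault(city, Collections.emptyList());
    }
}
```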
Effectively, anything that solves your N+1 query problem is a solution in your case, if we are talking about just 2 times 1000 queries. All three options are solutions.
You could use the first query as a subquery for the IN operator in the second query:
SELECT *
FROM public.items
WHERE items.condo_id IN (SELECT condos.condo_id
FROM public.condos
WHERE city = 'Sydney')

Is there a way to make query return a ResultSet?

I have the following query:
@Select("SELECT * FROM "+MyData.TABLE_NAME+" where data_date = #{refDate}")
public List<MyData> getMyData(@Param("refDate") Date refDate);
This table data is HUGE! Loading so many rows in memory is not the best way!
Is it possible to have this same query return a resultset so that I can just iterate over one item?
edit:
I tried adding:
@ResultType(java.sql.ResultSet.class)
public ResultSet getMyData(@Param("refDate") Date refDate);
but it gives me:
nested exception is org.apache.ibatis.reflection.ReflectionException: Error instantiating interface java.sql.ResultSet with invalid types () or values (). Cause: java.lang.NoSuchMethodException: java.sql.ResultSet.<init>()
I'd suggest you use LIMIT in your query. The limit X, Y syntax should work for you. Try it.
If the table is huge, that query will become slower and slower as the offset grows. Then the best way to iterate is to filter on id and use LIMIT,
such as
select * from table where id>0 limit 100 and then
select * from table where id>100 limit 100 etc
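The id-based paging loop might look like the sketch below. KeysetPager, HasId and PageFetcher are invented names, not MyBatis types. The fetcher is assumed to run a query like the ones above with an added ORDER BY id, and ids are assumed to be positive. Note that the next page starts after the last id actually returned, so the ids do not have to be contiguous.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of id-based (keyset) paging; not a MyBatis API.
class KeysetPager {

    // Assumed row shape: only the id matters for paging.
    interface HasId {
        long getId();
    }

    // Assumed to run something like:
    //   SELECT * FROM my_table WHERE id > ? ORDER BY id LIMIT ?
    interface PageFetcher<T extends HasId> {
        List<T> fetch(long afterId, int limit);
    }

    // Visits every row while holding only one page in memory.
    // Returns the number of rows visited.
    static <T extends HasId> long forEachRow(PageFetcher<T> fetcher, int pageSize,
                                             Consumer<? super T> action) {
        long lastId = 0; // assumes ids are positive, as in the id>0 example
        long count = 0;
        while (true) {
            List<T> page = fetcher.fetch(lastId, pageSize);
            if (page.isEmpty()) {
                return count;
            }
            for (T row : page) {
                action.accept(row);
                count++;
            }
            // Continue after the last id actually seen, not after a fixed offset.
            lastId = page.get(page.size() - 1).getId();
        }
    }
}
```

In a MyBatis setup the fetcher would typically delegate to a mapper method that takes the last id and the page size as parameters.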
You have multiple options:
Use pagination on the database side
I will just assume the database is Oracle; however, other DB vendors would also work. In Oracle you have ROWNUM, with which you can limit the number of records returned. To return the desired number of records you need to write a WHERE clause using ROWNUM. Now, the question is how to supply a dynamic ROWNUM in a query. This is where MyBatis dynamic SQL comes into play. You can pass the ROWNUM values in a parameter map and then use them in your query inside a mapper XML with the #{} syntax. With this approach you filter the records at the DB level and only create the Java objects that are needed for the current page.
Use pagination on the MyBatis side
The MyBatis select method on SqlSession takes a RowBounds parameter. Populate it as needed and it will bring you only that number of records. Here you are limiting the number of records on the MyBatis side, whereas in the first approach the same was done on the database side, which performs better.
Use a ResultHandler
MyBatis will give you control of the actual JDBC result set, so you can iterate over the results one by one right there.
See this blog entry for more details.

Good Practices For Generating Variations of SQL Queries

I'm working on a (Java) project that requires different variations of SQL queries depending on the filters the user wants to use.
I have 4+ queries right now that all use the same tables, require the same table joins, and use the same "order by". Currently I have hard coded these queries into the code and I'm not happy with that. I would like to dynamically generate them but I'm having trouble figuring out a solution or if I even should bother to generate them.
Note: I can not use stored procedures.
EXAMPLE:
SELECT t1.column1, t2.column2, t3.column3 FROM
(SELECT column1, column2, sum(column3) FROM t1
WHERE X = Y
GROUP BY column1, column2
ORDER BY column1)
LEFT JOIN t2 on t1.column1 = t2.column1
LEFT JOIN t3 on t1.column2 = t3.column2
WHERE Y = Z AND A = B
ORDER BY t1.column1
The differences are in the WHERE, SELECT, and GROUP BY statements. I could put nested if-statements between the dynamic parts but that seems too messy.
if ()
"SELECT A"
else
"SELECT B"
+ "FROM T1"
if ()
"WHERE x = y"
"LEFT JOIN ..."
etc.
Doing something like this feels wrong. Should I just stick to hard coding them or is there a better solution?
EDIT: I included it in the tags but I wanted to note up here that I'm using Oracle.
I've had the same type of problem on projects I've done. In some of these cases I've used the Builder pattern to create dynamic SQL statements. One advantage of using a Builder is that you can unit test your Builder for all the combinations of your criteria. Yes, you will still have some conditional logic, but it will all be encapsulated in your SQL Builder.
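A minimal sketch of what such a builder could look like is below. ReportQueryBuilder and its methods are hypothetical, and the tables, joins and ORDER BY are simplified from the example in the question. The joins and ordering that never change live in one place, optional filters are skipped when their value is null, and filter values are collected as bind parameters instead of being concatenated into the SQL.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder; table and column names loosely follow the question's example.
class ReportQueryBuilder {
    private final List<String> columns = new ArrayList<>();
    private final List<String> conditions = new ArrayList<>();
    private final List<Object> parameters = new ArrayList<>();

    ReportQueryBuilder select(String column) {
        columns.add(column);
        return this;
    }

    // Adds "column = ?" and remembers the value to bind.
    // A null value means the user did not set this filter, so it is skipped.
    ReportQueryBuilder whereEquals(String column, Object value) {
        if (value != null) {
            conditions.add(column + " = ?");
            parameters.add(value);
        }
        return this;
    }

    String toSql() {
        if (columns.isEmpty()) {
            throw new IllegalStateException("no columns selected");
        }
        // The fixed part (tables, joins, ordering) is written exactly once.
        StringBuilder sql = new StringBuilder("SELECT ")
                .append(String.join(", ", columns))
                .append(" FROM t1")
                .append(" LEFT JOIN t2 ON t1.column1 = t2.column1")
                .append(" LEFT JOIN t3 ON t1.column2 = t3.column2");
        if (!conditions.isEmpty()) {
            sql.append(" WHERE ").append(String.join(" AND ", conditions));
        }
        return sql.append(" ORDER BY t1.column1").toString();
    }

    // Values to bind, in the same order as the '?' placeholders.
    List<Object> parameters() {
        return new ArrayList<>(parameters);
    }
}
```

Column names are concatenated as-is here, so they should come from constants in the code and never from user input. Only the filter values go through the bind parameters.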
There is more than one good solution for this: there are several tools out there in Java for creating typesafe and easily manipulable SQL strings.
But since you are using Oracle, the best way to go is probably prepared statements.
http://en.wikipedia.org/wiki/Prepared_statement
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html
If it is only a few (less than, say, ten) combinations, you could have the hard-coded SQL strings to choose from. For more than that, you want to programmatically build the statements on the fly. There is no good way to do this with just JDBC, you might want to invest the time to look at some database access libraries.
To avoid having to manipulate SQL strings when dynamically building database queries, have a look at jOOQ. Or, if you want to go that way, an ORM (such as JPA) will also have a CriteriaBuilder.
You will still have the same conditional statements and logic, but at least you can work on Java objects instead of having to manipulate strings and worry that you get all keywords in the right order and all parens balanced.

Connect By in HSQL DB

I was writing test cases for a query that uses the CONNECT BY hierarchical clause.
It seems that there is no support for this clause in HSQLDB.
Are there any alternatives for testing the query, or for writing a different query that does the same thing?
The query is simple
SELECT seq.nextval
FROM DUAL
CONNECT BY level <= ?
Thanks.
You don't need a recursive query for that.
To generate a sequence of numbers you can use sequence_array
select *
from unnest(sequence_array(1, ?, 1))
More details are in the manual:
http://hsqldb.org/doc/2.0/guide/builtinfunctions-chapt.html#N14088
If you need that to advance a sequence a specific number of entries, you can use something like this:
select NEXT VALUE FOR seq
from unnest(sequence_array(1, 20, 1));
If you need that to set the sequence to a new value, this is much easier in HSQLDB:
ALTER SEQUENCE seq restart with 42;
If you are looking for a recursive query, then HSQLDB supports the ANSI SQL standard for that: recursive common table expressions, which are documented in the manual:
http://hsqldb.org/doc/2.0/guide/dataaccess-chapt.html#dac_with_clause
According to this 2-year-old ticket, only Oracle and a database called CUBRID have CONNECT BY capability. If you really want it, maybe you could vote on the ticket. However, as far as I have been able to tell, there are only two people working on the project, so don't hold your breath.

Issue happened when union\minus on Oracle clob

I have a table named "preference" which has more than 100 columns in Oracle. I wrote a somewhat complicated SQL query which needs to use the keywords UNION/INTERSECT/MINUS.
Take a simple example:
select a.* from preference a where a.id = ? union
select a.* from preference a where a.id = ?
The business case has changed: a string of unlimited length must now be stored, so one column needs to be redefined as CLOB type. Oracle doesn't allow UNION on the CLOB type, so a.* can no longer be used here.
I changed the SQL to something like the below:
select a.a,a.b,a.c... from preference a where a.id = ? union
select a.a,a.b,a.c... from preference a where a.id = ?
It lists all columns except the CLOB, and then I have to do another select to append the CLOB value. Is that a good idea?
Another issue arising from the above: as I mentioned, this table has a large number of columns, and listing them all makes the SQL much longer. Is there an expression that selects all columns except a specific one?
When dealing with LOBs, Oracle does not allow UNION/MINUS but does allow UNION ALL. Maybe you can rewrite your query using UNION ALL; in the select clause you can then use a.* or list every column.
After reading your question, my main concern is memory usage in Java. Are you using an ORM to load the data, or are you using the JDBC API?
If you are loading all the CLOBs into strings, you could end up with an OutOfMemoryError. My advice is to load the CLOB only for the rows you need to show to the user (or for the rows where the CLOB field has to be processed).
Can you give more insight into your application (the number of rows it has to process) and your data (especially the CLOB size)?
