Streaming join multiple resultSet - java

I have a SQL performance problem: my tables have so many rows that this query takes a long time to run.
SELECT * FROM A JOIN B ON A.id = B.id where ...
So I changed it to two separate queries:
SELECT * FROM A where A= a...
SELECT * FROM B where B= b...
This gives me two result sets, one per query.
How can I join the two result sets with the best performance?
I had to split the join into two queries because the database has about 10 million records.

Select col1, col2 ...
from
(
-- first query
) as tab1
join
(
-- second query
) as tab2 on tab1.colx = tab2.coly
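If the join really has to happen in Java rather than in the database, one common approach is an in-memory hash join: index one result set by the join key, then stream the other one past it. The following is only a minimal sketch under stated assumptions. `RowA`, `RowB`, `Joined`, and their fields are hypothetical stand-ins for whatever the two result sets contain, and the indexed side is assumed to fit in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HashJoinSketch {
    // Hypothetical row types standing in for rows read from each ResultSet.
    record RowA(int id, String a) {}
    record RowB(int id, String b) {}
    record Joined(int id, String a, String b) {}

    // Inner join on id: index the B side (ideally the smaller one), then stream A past it.
    static List<Joined> join(Iterable<RowA> as, Iterable<RowB> bs) {
        Map<Integer, List<RowB>> byId = new HashMap<>();
        for (RowB b : bs) {
            byId.computeIfAbsent(b.id(), k -> new ArrayList<>()).add(b);
        }
        List<Joined> out = new ArrayList<>();
        for (RowA a : as) {
            // Rows of A without a match in B are dropped, as in an inner join.
            for (RowB b : byId.getOrDefault(a.id(), List.of())) {
                out.add(new Joined(a.id(), a.a(), b.b()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<RowA> as = List.of(new RowA(1, "a1"), new RowA(2, "a2"));
        List<RowB> bs = List.of(new RowB(2, "b2"), new RowB(3, "b3"));
        System.out.println(join(as, bs)); // only id 2 appears on both sides
    }
}
```

Only the indexed side is held in memory; the other side can be consumed row by row as it arrives. With appropriate indexes, letting the database perform the join as shown above is usually still the first thing to try.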

Related

Union Alternative of HQL

I am trying to rewrite this query to improve performance given that it takes more than 10 mins to execute. I believe the issue is mainly due to the large Inbox table. As union isn't an option in HQL, what can I do to improve this performance? Thanks in advance.
select distinct p from PrReq p, Inbox i where i.documentNo in
(select distinct kdn.kdnId from PrKdn kdn where kdn.prReqId = p ) or
i.documentNo in (select distinct grn.grnId from PrGrn grn where
grn.prReqId = p) and i.inboxStatus = 0 and i.moduleStatus =
:currentStatus and p.recvDept.code = :dept and p.organization = :org
order by p.billToDept, p.prId ASC, p.createDate desc

SELECT DISTINCT rows FROM one table HAVING MIN(value) FROM another table

I have one table (table1) with columns:
ID  NAME
1   A
2   B
3   C
4   D
and another table (table2) with columns:
ID  table1.ID  DATE         STATUS
1   1          21-JUL-2020  INACTIVE
2   1          22-JUL-2022  ACTIVE
3   1          23-JUL-2022  ACTIVE
4   2          21-JAN-2022  ACTIVE
5   2          22-JAN-2022  INACTIVE
6   2          23-JAN-2022  ACTIVE
7   3          20-JAN-2022  INACTIVE
8   3          20-JAN-2022  INACTIVE
I am trying to write a query that will return distinct rows from table1 where status from table2 is ACTIVE and results should be ordered by min DATE from table2.
Desired result:
ID  NAME
2   B
1   A
I tried with the following:
select t1, min(t2.date)
from table1 t1 join t1.t2List t2 -- table1 Entity has OneToMany t2List defined
where t2.status = 'ACTIVE'
group by t1.id
order by t2.date desc;
Problem here is that I can't use my Entity class table1 and I would like to avoid creating a new class that will hold this additional aggregated result (min date).
Also tried using HAVING clause but could not get it working.
select t1 from table1 t1
where t1.id in (
select table1.id from table2 t2 where t2.status = 'ACTIVE'
group by t1.id, t2.date
having t2.date = min(t2.date));
I'd appreciate any help here!
Join table1 to a query that aggregates in table2 and returns all table1_ids with an ACTIVE row:
SELECT t1.*
FROM table1 t1
INNER JOIN (
SELECT table1_id, MIN(date) date
FROM table2
WHERE status = 'ACTIVE'
GROUP BY table1_id
) t2 ON t2.table1_id = t1.id
ORDER BY t2.date;
See the demo.
There are multiple ways of solving this problem; @forpas has already answered using a subquery in the JOIN clause.
The same results can be achieved using this query:
SELECT table1.ID , table1.name
FROM table1
INNER JOIN table2 ON table2.table1_ID = table1.ID
WHERE status = 'ACTIVE'
GROUP BY table1.ID , table1.name
ORDER BY MIN(table2.statusDate)
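For readers who want to check the logic outside the database, the same filter, group, and order steps can be sketched in plain Java. This is only an illustration: the `T1` and `T2` records and the method name are hypothetical, and the rows are the sample data from the question.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class MinDateOrderSketch {
    // Hypothetical in-memory stand-ins for rows of table1 and table2.
    record T1(int id, String name) {}
    record T2(int id, int table1Id, LocalDate date, String status) {}

    static List<T1> activeOrderedByMinDate(List<T1> table1, List<T2> table2) {
        // WHERE status = 'ACTIVE' ... GROUP BY table1_id, keeping MIN(date) per group.
        Map<Integer, LocalDate> minDate = table2.stream()
            .filter(r -> r.status().equals("ACTIVE"))
            .collect(Collectors.toMap(T2::table1Id, T2::date,
                (x, y) -> x.isBefore(y) ? x : y));
        // INNER JOIN back to table1, then ORDER BY the aggregated date.
        return table1.stream()
            .filter(r -> minDate.containsKey(r.id()))
            .sorted(Comparator.comparing((T1 r) -> minDate.get(r.id())))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<T1> table1 = List.of(new T1(1, "A"), new T1(2, "B"), new T1(3, "C"), new T1(4, "D"));
        List<T2> table2 = List.of(
            new T2(1, 1, LocalDate.of(2020, 7, 21), "INACTIVE"),
            new T2(2, 1, LocalDate.of(2022, 7, 22), "ACTIVE"),
            new T2(3, 1, LocalDate.of(2022, 7, 23), "ACTIVE"),
            new T2(4, 2, LocalDate.of(2022, 1, 21), "ACTIVE"),
            new T2(5, 2, LocalDate.of(2022, 1, 22), "INACTIVE"),
            new T2(6, 2, LocalDate.of(2022, 1, 23), "ACTIVE"),
            new T2(7, 3, LocalDate.of(2022, 1, 20), "INACTIVE"),
            new T2(8, 3, LocalDate.of(2022, 1, 20), "INACTIVE"));
        System.out.println(activeOrderedByMinDate(table1, table2));
    }
}
```

With the sample data this yields row B (earliest ACTIVE date 21-JAN-2022) followed by row A (22-JUL-2022), which matches the desired result in the question.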

Need to know the right syntax of a count query to count rows by id

I got these two tables and I want to have a query to count the amount of cars by each brand and insert this count to a column in the brand table
I've tried many queries but can't get it right.
First table,
Second table,
Use JOIN.
Query
select t1.car_brand_id, t2.brand_name, count(t1.car_name) as total_count
from table1 t1
join table2 t2
on t1.car_brand_id = t2.brand_id
group by t1.car_brand_id, t2.brand_name;
You need a join, a count, and a GROUP BY.
This select shows the count per brand_name:
select b.brand_name, count(*)
from table_one a
inner join table_two b on b.brand_id = a.brand_id
group by b.brand_name
Once you have added the column you need to table_two (for example with an ALTER TABLE command adding my_count_col),
you could use an update like this:
update table_two
inner join (
select b.brand_name, count(*) my_count
from table_one a
inner join table_two b on b.brand_id = a.brand_id
group by b.brand_name ) t on t.brand_name = table_two.brand_name
set table_two.my_count_col = t.my_count

Getting a count for each duplicate record in response from SQL query

I have four columns: FIRST_NAME and LAST_NAME in table PersonalInfo, and ADD1 and ADD2 in table AddressDetails. I want to know the count of duplicate records for each row, considering all four columns. How can I do that?
I have two approaches so far:
1. Iterate over each row of the response and compare it with the remaining rows.
2. Do something in the query.
I know the first option is the worst because it will take so much time. Is there something I can do in the query?
I searched and found this:
SELECT LAST_NAME, count(LAST_NAME) FROM SchemaName.PersonalInfo S GROUP by LAST_NAME;
This basic version works for a single column but not for multiple columns.
How can I do it? Please suggest.
Group by all the columns, not just the last name.
Working fiddle : http://sqlfiddle.com/#!9/a76e7e/3
SELECT P.fname,P.lname,A.add1,A.add2, count(*)
FROM PersonalInfo P, AddressDetails A
Where P.id = A.id
GROUP by P.fname,P.lname,A.add1,A.add2
having count(*) > 1;
Join your tables on the id, as in the example above.
I suppose your table PersonalInfo has a primary key and AddressDetails has a foreign key referencing it?
select t1.FIRST_NAME , t1.LAST_NAME, t2.ADD1 , t2.ADD2, count(*) NbPossibility
from PersonalInfo t1 inner join AddressDetails t2
on t1.idPersonalInfo=t2.idPersonalInfo
group by t1.FIRST_NAME , t1.LAST_NAME, t2.ADD1 , t2.ADD2
having count(*)>1
If you have no keys for joining these tables (which would be abnormal):
select t1.FIRST_NAME , t1.LAST_NAME, t2.ADD1 , t2.ADD2, count(*) NbPossibility
from PersonalInfo t1 cross join AddressDetails t2
group by t1.FIRST_NAME , t1.LAST_NAME, t2.ADD1 , t2.ADD2
having count(*)>1
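For comparison with the first approach in the question (comparing rows in application code), the same grouping can be done in Java in a single pass rather than by comparing every row with every other row. This is a minimal sketch: the `Person` record is a hypothetical holder for the four already-joined columns, and it relies on records deriving `equals` and `hashCode` from all of their fields.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class DuplicateCountSketch {
    // Hypothetical joined row; a record compares by all four fields,
    // so it can serve directly as a composite grouping key.
    record Person(String firstName, String lastName, String add1, String add2) {}

    // GROUP BY all four columns, then keep only groups HAVING count(*) > 1.
    static Map<Person, Long> duplicates(List<Person> rows) {
        return rows.stream()
            .collect(Collectors.groupingBy(p -> p, Collectors.counting()))
            .entrySet().stream()
            .filter(e -> e.getValue() > 1)
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<Person> rows = List.of(
            new Person("Ann", "Lee", "1 Main St", "Apt 2"),
            new Person("Ann", "Lee", "1 Main St", "Apt 2"),
            new Person("Bob", "Ray", "9 Oak Ave", ""));
        System.out.println(duplicates(rows)); // only the repeated Ann Lee row is reported
    }
}
```

This still requires pulling every row out of the database first, so the GROUP BY queries above remain the better fit for large tables.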

Unique row in DB

I have a table with a huge amount of data. I am storing logging details of REST calls, where sentTime and a combination of three fields, say (COL1, COL2, COL3), are unique.
I need to get the last call for each REST call.
For example, if API1, API2, and API3 are called 10 times each, I have around 30 rows in my table. I need the last call of each of the 3 APIs, so I should get 3 rows, one per API.
I am using the following query:
SELECT tb.id
FROM Table1 (nolock) tb
INNER JOIN (
SELECT col1, col2, col3, MAX(sentTime) as lastSentTime
FROM Table1 (nolock) GROUP BY col1, col2, col3) a
ON a.col1 = tb.col1 AND
a.col2 = tb.col2 AND
a.col3 = tb.col3 AND
a.lastSentTime = tb.sentTime
But it doesn't work as expected.
For example:
id  Name    Sent_Time       Temp_id  Temp_id2
1   Delete  04/03/16 17:54  AB       2222701
2   Update  04/03/16 17:54  UD       6900001
3   Create  04/03/16 17:54  EL       2017301
4   Read    04/03/16 17:54  AB       2670001
5   Update  08/03/16 17:54  UD       1069501
6   Create  08/03/16 17:54  EL       3490801
Except there are millions of rows.
The combination of name, Temp_id and Temp_id2 is unique.
In Java I have loaded all the data into a HashMap keyed by name + Temp_id + Temp_id2, so that each key is unique. Is it possible to get the same data through a query?
You could try this. I guess each distinct REST call has its own values of col1, col2, and col3 and you want the most recent.
SELECT MAX(sentTime) mostrecent_sentTime,
col1, col2, col3
FROM Table1
GROUP BY col1, col2, col3
If you want all the columns in the row, then use window functions:
select t.*
from (select t.*,
row_number() over (partition by col1, col2, col3 order by sentTime desc) as seqnum
from t
) t
where seqnum = 1;
If your table has only four columns (or you only care about four columns), then aggregation as suggested by @OllieJones is perhaps more reasonable.
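For reference, the HashMap approach described in the question amounts to keeping the newest row per composite key, which is the same thing `row_number() ... where seqnum = 1` computes in the database. Below is a minimal sketch of that client-side logic. The `Call` and `Key` records and their field names are hypothetical, and a record key is used instead of a concatenated string so that values such as "AB" + "1" and "A" + "B1" cannot collide.

```java
import java.time.LocalDateTime;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LatestCallSketch {
    // Hypothetical row and composite-key types modeled on the sample table.
    record Call(int id, String name, LocalDateTime sentTime, String tempId, String tempId2) {}
    record Key(String name, String tempId, String tempId2) {}

    // Keep, for every (name, tempId, tempId2) combination, the call with the latest sentTime.
    static Collection<Call> latestPerKey(Iterable<Call> calls) {
        Map<Key, Call> latest = new HashMap<>();
        for (Call c : calls) {
            latest.merge(new Key(c.name(), c.tempId(), c.tempId2()), c,
                (old, cur) -> cur.sentTime().isAfter(old.sentTime()) ? cur : old);
        }
        return latest.values();
    }

    public static void main(String[] args) {
        List<Call> calls = List.of(
            new Call(1, "Update", LocalDateTime.of(2016, 3, 4, 17, 54), "UD", "6900001"),
            new Call(2, "Update", LocalDateTime.of(2016, 3, 8, 17, 54), "UD", "6900001"),
            new Call(3, "Create", LocalDateTime.of(2016, 3, 4, 17, 54), "EL", "2017301"));
        System.out.println(latestPerKey(calls)); // one row per key; id 2 wins for the Update key
    }
}
```

With millions of rows this means transferring the whole table to the client, which is why the window-function or aggregation queries above are usually preferable.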
