I am using a MySQL database for my project.
I have one table with 6 million records, of which I am updating around 5 million. My table structure and query are below.
Table
CREATE TABLE `temp` (
`ref_id_1` int(11) NOT NULL,
`ref_id_2` varchar(32) NOT NULL,
`dna_id` int(11) DEFAULT NULL,
`product_id` varchar(16) DEFAULT NULL,
PRIMARY KEY (`ref_id_1`,`ref_id_2`),
KEY `product_id` (`product_id`),
KEY `dna_id` (`dna_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Update Query
UPDATE temp SET dna_id = 8 WHERE product_id ='Dr_23' AND ref_id_1 = 4;
There are 5 million records for product ID Dr_23.
The query above takes around 2 minutes to execute. I have 32 GB of RAM and an SSD.
Does anyone know how to optimize this query?
Environment:
mariadb-java-client-2.7.0
DB : MariaDB 10.5.7
ojdbc8 - Oracle 11.2.0.3.0 JDBC 4.0
DB : Oracle Database 11g
Hibernate 4.3.8
Code :
Session session = sessionFactory.openSession();
Criteria fetchCriteria = session.createCriteria("Student");
Disjunction disjunction = Restrictions.disjunction();
for (int i = 1; i <= 10000; i++) {
Conjunction conjunction = Restrictions.conjunction();
conjunction.add(Restrictions.eq("RollNumber", i+""));
disjunction.add(conjunction);
}
fetchCriteria.add(disjunction);
long start1 = System.currentTimeMillis();
List resultList = fetchCriteria.setFirstResult(0).setResultTransformer(Criteria.ALIAS_TO_ENTITY_MAP).list();
long end1 = System.currentTimeMillis();
System.out.println("Time took :"+(end1-start1) +"ms");
Issue
If I run the above code with Hibernate 4.3.8 + Oracle (ojdbc8), it takes less than 5,000 milliseconds.
If I run the above code with Hibernate 4.3.8 + mariadb-java-client-2.7.0, it takes more than 40,000 milliseconds.
Extra Configuration :
I have set hibernate.jdbc.fetch_size to 100 in hibernate.cfg.xml,
along with the JDBC URL, username and password.
Findings:
The query generated is the same in both cases. If I execute it
with a SQL client, it takes 10-11 seconds for Oracle and 41-42 seconds for MariaDB.
If I run the same generated query from a plain JDBC
program (for both Oracle and MariaDB), it takes approximately 600 milliseconds.
Note: Both tables (Oracle and MariaDB) have 15,000 records.
Can anyone help me understand why MariaDB is taking so long,
or whether some extra settings are required to improve MariaDB performance?
I have tried defaultFetchSize, which is mentioned in https://mariadb.com/kb/en/about-mariadb-connector-j/, but no luck.
SQL query generated for both databases:
select this_.rollNo as RollNo1_0_0_, this_.VersionID as Version2_0_0_,
this_.Name as Name3_0_0_, this_.dept as dept4_0_0_,
this_.favSubj as favSubj5_0_0_,
this_.ID as ID33_0_0_
from Student this_
where ((this_.ID='1')
or (this_.ID='2')
or (this_.ID='3')
or ....
or (this_.ID='10000'))
MariaDB DDL
CREATE TABLE `student` (
`RollNo` bigint(20) NOT NULL ,
`VersionID` bigint(20) NOT NULL,
`Name` varchar(100) COLLATE ucs2_bin DEFAULT NULL,
`dept` varchar(100) COLLATE ucs2_bin DEFAULT NULL,
`favSubj` varchar(100) COLLATE ucs2_bin DEFAULT NULL,
`ID` varchar(100) COLLATE ucs2_bin DEFAULT NULL,
PRIMARY KEY (`RollNo`),
UNIQUE KEY `UK_student` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=20258138 DEFAULT CHARSET=ucs2 COLLATE=ucs2_bin
Oracle DDL
CREATE TABLE student (
RollNo NUMBER(19,0),
VersionID NUMBER(19,0) NOT NULL ENABLE,
Name VARCHAR2(100),
dept VARCHAR2(100),
favSubj VARCHAR2(100),
ID VARCHAR2(100),
PRIMARY KEY (RollNo),
CONSTRAINT "UK_student" UNIQUE ("ID")
)
MariaDB explain select query output
id | select_type | table | type  | possible_keys | key        | key_len | ref  | rows  | Extra
1  | SIMPLE      | this_ | range | UK_Student    | UK_Student | 203     | NULL | 10000 | Using index condition
An OR with 10K items takes a long time to parse. Faster would be an IN:
where this_.ID IN ('1', '2', ..., '10000')
However, even that is likely to take a long time to run.
In the case of MariaDB, I think the Optimizer will say, "Oh, that's too many items for me to look up each one, so I will, instead, simply scan the table, checking each row for an ID in that list (using some kind of efficient lookup in the 10K-long list)."
However, if there are 20M rows in the table, that will take a long time.
Can you provide the query plan (EXPLAIN) so we can confirm what I am hypothesizing?
This seems logical and faster, but will not work correctly:
where this_.ID BETWEEN '1' AND '10000'
because ID is a VARCHAR, so the comparison is done on strings rather than numbers (for example, '2' sorts after '10000').
For performance, make ID an INT, not a VARCHAR!
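To illustrate the IN suggestion, here is a minimal plain-JDBC sketch, not the asker's actual code. The class name InClauseSketch, its method names and the shortened column list are made up for the example. In the Hibernate Criteria code from the question, the equivalent change would presumably be a single Restrictions.in("ID", ids) in place of the 10,000-element Disjunction, but that is untested here.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: builds one IN (...) predicate instead of thousands of ORs.
class InClauseSketch {

    // Returns "?, ?, ?" with one JDBC placeholder per value.
    public static String placeholders(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    // Illustrative query text; the real query selects more columns.
    public static String buildQuery(int idCount) {
        return "select this_.ID from Student this_ where this_.ID in ("
                + placeholders(idCount) + ")";
    }

    // Binds each ID as a string, because ID is a VARCHAR in the question's DDL.
    public static void bindIds(PreparedStatement ps, List<String> ids) throws SQLException {
        for (int i = 0; i < ids.size(); i++) {
            ps.setString(i + 1, ids.get(i)); // JDBC parameters are 1-based
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(3));
    }
}
```

Whether this is actually faster on MariaDB still needs to be confirmed with EXPLAIN and timing on the real table.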
Java 11. PostgreSQL.
I have the following table in the database:
CREATE TABLE public.account (
id bigserial NOT NULL,
account_id varchar(100) NOT NULL,
display_name varchar(100) NOT NULL,
is_deleted bool NULL DEFAULT false
);
There are about 1000 rows in this table. In the code I have a static method that returns a random string: Helper.getRandomName()
Using JDBC, how can I replace the "display_name" value of every row in this table (public.account) with a value from Helper.getRandomName()?
This is a SQL question. You need to run an update query:
UPDATE public.account set display_name = ?
And provide the new name as the parameter. The absence of a WHERE clause means that all rows will be affected.
If you want to do this for each row individually, then it's harder. You'll want to do a select statement to find all the IDs, and then you can prepare a batch of updates using JDBC, adding a where clause for each ID.
JDBC is just a thin Java wrapper around plain SQL execution.
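The per-row variant described above could look roughly like the sketch below. The class name AccountRenamer and the method renameAll are made up for the example. It assumes an already open java.sql.Connection and takes the name generator as a Supplier, so that Helper.getRandomName from the question can be passed in.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper: gives every row in public.account its own generated name.
class AccountRenamer {
    public static final String SELECT_IDS = "SELECT id FROM public.account";
    public static final String UPDATE_ONE =
            "UPDATE public.account SET display_name = ? WHERE id = ?";

    // Returns the number of rows for which an update was queued.
    public static int renameAll(Connection conn, Supplier<String> nameSource)
            throws SQLException {
        // Step 1: collect all primary keys.
        List<Long> ids = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(SELECT_IDS)) {
            while (rs.next()) {
                ids.add(rs.getLong("id"));
            }
        }
        // Step 2: queue one UPDATE per ID and send them as a single batch.
        try (PreparedStatement ps = conn.prepareStatement(UPDATE_ONE)) {
            for (long id : ids) {
                ps.setString(1, nameSource.get());
                ps.setLong(2, id);
                ps.addBatch();
            }
            ps.executeBatch();
        }
        return ids.size();
    }
}
```

A call would then be AccountRenamer.renameAll(conn, Helper::getRandomName). With about 1000 rows a single batch should be fine; for much larger tables you would probably want to execute the batch in chunks.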
I have created a table in Oracle that uses a field auto-incremented through a sequence.
Here is the SQL:
CREATE TABLE Users(
user_ID INT NOT NULL,
user_name VARCHAR (20) NOT NULL,
user_password VARCHAR (20) NOT NULL,
user_role INT NOT NULL,
PRIMARY KEY (user_ID)
);
ALTER TABLE Users
ADD FOREIGN KEY (user_role) REFERENCES User_Roles (role_ID);
CREATE SEQUENCE seq_users
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10;
Now I need to insert data into the table from a Java program. Is there any way to avoid writing the query like this:
Insert into User_Roles values (seq_user_roles.nextval,'system admin');
User Role Table:
CREATE TABLE User_Roles(
role_ID INT NOT NULL,
role_name VARCHAR (20) NOT NULL,
PRIMARY KEY (role_ID)
);
CREATE SEQUENCE seq_user_roles
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10;
I want to insert the data from a Java program without having to specify the name of the sequence.
Create the trigger below after creating the sequence. It will populate the role_ID column automatically, so there is no need to mention the role_ID column in your INSERT statement.
CREATE OR REPLACE TRIGGER TRG_User_Roles_BRI
BEFORE INSERT
ON User_Roles
REFERENCING NEW AS NEW OLD AS OLD
FOR EACH ROW
BEGIN
:NEW.role_ID := seq_user_roles.NEXTVAL;
END;
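With the trigger in place, the Java side no longer has to name the sequence. A minimal sketch follows; the class name UserRoleInserter and the method insertRole are made up for the example, and it assumes an open java.sql.Connection. Asking for the generated key by column name is standard JDBC, but whether your Oracle driver version returns it this way is an assumption you should verify.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper: inserts a role without referencing seq_user_roles.
class UserRoleInserter {
    public static final String INSERT_ROLE =
            "INSERT INTO User_Roles (role_name) VALUES (?)";

    // Inserts the role and returns the role_ID filled in by the trigger.
    public static long insertRole(Connection conn, String roleName) throws SQLException {
        // "ROLE_ID" is upper case because Oracle stores unquoted identifiers that way.
        try (PreparedStatement ps =
                     conn.prepareStatement(INSERT_ROLE, new String[] {"ROLE_ID"})) {
            ps.setString(1, roleName);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("No generated key returned");
                }
                return keys.getLong(1);
            }
        }
    }
}
```

If you do not need the new ID back, a plain conn.prepareStatement(INSERT_ROLE) followed by executeUpdate() is enough.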
I have two tables. The first holds the total values of some shopping lists and the second holds the products in each list. When a shopping list is done, the total value is added to the totals table together with some information such as the list number (nrList, which is a kind of list ID) and the number of products on that list (nrProducts), while the products go into the listproducts table. Let's say there are 3 products: tomatoes, oranges and apples. They will all share the same nrList which, as mentioned before, is something like the list ID.
First table totals:
CREATE TABLE IF NOT EXISTS `totals` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`nrList` int(11) NOT NULL,
`nrProducts` int(11) DEFAULT NULL,
`total` double NOT NULL,
`data` date DEFAULT NULL,
`ora` time DEFAULT NULL,
`dataora` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
`Operator` varchar(50) DEFAULT NULL,
`anulat` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
)
Second table listproducts:
CREATE TABLE IF NOT EXISTS `listproducts` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`nrList` int(11) DEFAULT NULL,
`product` varchar(50) DEFAULT NULL,
`quantity` double DEFAULT NULL,
`price` double DEFAULT NULL,
`data` date DEFAULT NULL,
`operator` varchar(50) NOT NULL,
`anulat` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
)
Now, I have two things I want to do, and they are very similar.
Let's say I have a list with 3 products. In the totals table there will be a row with some info and with total=10$, nrProducts=3 and nrList=1. In the listproducts table I will have 3 rows, all having nrList=1, with price=3$, 3$ and 4$ respectively.
Now, I want to check the following:
1. If nrProducts=3, that there are 3 products for that list in the other table.
2. That the total in the first table is equal to the sum of the products in the second table (the SUM of quantity*price).
I've made some progress, but I don't know what to do next.
I managed to get the number of products for each list from the second table by using this:
SELECT nrList,operator,COUNT(*) as count FROM listproducts GROUP BY nrList
But I don't know how to check whether the values are equal without doing two queries.
For the second check it is the same: I know how to get the sum, but I don't know how to compare the values without doing two separate queries.
SELECT SUM(price*quantity) FROM `listproducts` WHERE nrList='10' and operator like '%x%'
I can also group this by list, as in the other select; that is not the issue.
The issue is that I don't know how to do these checks in a single select instead of doing two and comparing the results. I'm doing this in Java, so I can compare there, but I'd like to know if and how I can do it in a single query.
Thanks, and sorry for the long post.
You can try something like this:
SELECT totals.nrList,
IF (totals.nrProducts = t.nrProductsActual, 'yes', 'no') AS matchNrProducts,
IF (totals.total = t.totalActual, 'yes', 'no') AS matchTotal
FROM totals INNER JOIN
(SELECT nrList,
COUNT(*) AS nrProductsActual,
SUM(quantity*price) AS totalActual
FROM listproducts
GROUP BY nrList) AS t ON totals.nrList = t.nrList
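Since the question mentions Java, here is a rough sketch of running one joined query from JDBC and collecting the lists that do not match. The class name ListCheck and its methods are made up for the example. Because total is a DOUBLE, the sketch compares totals with a small tolerance in Java rather than with exact equality; that is a design choice of this sketch, not something the query above does.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: finds lists whose stored count or total disagrees with listproducts.
class ListCheck {
    public static final String CHECK_SQL =
            "SELECT totals.nrList, totals.nrProducts, totals.total, "
          + "t.nrProductsActual, t.totalActual "
          + "FROM totals INNER JOIN "
          + "(SELECT nrList, COUNT(*) AS nrProductsActual, "
          + "SUM(quantity*price) AS totalActual "
          + "FROM listproducts GROUP BY nrList) AS t ON totals.nrList = t.nrList";

    // True when the two amounts differ by no more than the tolerance.
    public static boolean totalsMatch(double stored, double actual, double tolerance) {
        return Math.abs(stored - actual) <= tolerance;
    }

    // Returns the nrList values whose product count or total does not match.
    public static List<Integer> findMismatches(Connection conn, double tolerance)
            throws SQLException {
        List<Integer> mismatches = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(CHECK_SQL)) {
            while (rs.next()) {
                int storedCount = rs.getInt("nrProducts");
                boolean countMissing = rs.wasNull(); // nrProducts is nullable
                boolean countOk = !countMissing
                        && storedCount == rs.getInt("nrProductsActual");
                boolean totalOk = totalsMatch(
                        rs.getDouble("total"), rs.getDouble("totalActual"), tolerance);
                if (!countOk || !totalOk) {
                    mismatches.add(rs.getInt("nrList"));
                }
            }
        }
        return mismatches;
    }
}
```

Note that the INNER JOIN only returns lists that have at least one row in listproducts, so a totals row with no products at all will not show up in either version.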
I want to design a page tracker database table, but I am facing a few issues with it.
create table pageTracker(
ID bigint(20) NOT NULL,
TrackerID bigint(20) NOT NULL,
SessionID varchar(100) NOT NULL,
pageViews bigint,
pageVisits bigint,
primary key(ID)
);
If I update pageViews and pageVisits for a specific SessionID, I cannot query pageViews and pageVisits within a specific time interval.
create table pageTracker(
ID bigint(20) NOT NULL,
TrackerID bigint(20) NOT NULL,
SessionID varchar(100) NOT NULL,
pageViews bigint,
pageVisits bigint,
time TimeStamp,
primary key(ID)
);
But if I add an extra time column and insert each pageViews and pageVisits change as a new entry for a specific time, it creates a huge number of entries in the table.
Is there an efficient way to do this?
I am assuming that you want to update pageViews and pageVisits every time for a given SessionID. In this case, the first insert will have, say:
Session ID = 23R4E11, pageViews = 1, pageVisits = 1
Now, if the same user revisits the same page, you will update the existing row to:
Session ID = 23R4E11, pageViews = 2, pageVisits = 1
In this case, to keep all the updates, you can create one more table called pageTrackerHistory and write a trigger that inserts an entry into pageTrackerHistory whenever an update is made on pageTracker.
By doing this, your operational table pageTracker contains minimal rows, while pageTrackerHistory holds the large volume of audit records.
Hope this will give you some direction. :-)