I have written a web application for a payroll system which can insert, update and delete records in a MySQL database.
I want to know:
how many transactions happened in the MySQL database?
how many transactions happened in the MySQL database between a given start_time and end_time?
MySQL has command counters. They can be seen with SHOW GLOBAL STATUS LIKE 'Com\_%'. Each execution of a command increments the counter associated with it. The transaction-related counters are Com_begin, Com_commit and Com_rollback. Uptime is the number of seconds since server start. Reading and graphing these values, or their deltas, yields the information you are asking for.
There are also counters for Com_insert, Com_update, Com_delete and variations thereof. You might want to graph these as well.
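For example, here is a rough sketch of sampling these counters over JDBC and computing the delta between two snapshots (the connection details and the 60-second window are just placeholders):

    // Sketch: read MySQL command counters twice and print the per-window delta.
    import java.sql.*;
    import java.util.*;

    public class TxCounterSample {
        static Map<String, Long> sample(Connection conn) throws SQLException {
            Map<String, Long> counters = new HashMap<>();
            try (Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(
                     "SHOW GLOBAL STATUS WHERE Variable_name IN " +
                     "('Com_begin','Com_commit','Com_rollback','Uptime')")) {
                while (rs.next()) {
                    counters.put(rs.getString(1), rs.getLong(2));
                }
            }
            return counters;
        }

        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://db.foo.com/webapplicationdb", "user", "password")) {
                Map<String, Long> start = sample(conn);   // snapshot at start_time
                Thread.sleep(60_000);                     // wait for the window you care about
                Map<String, Long> end = sample(conn);     // snapshot at end_time
                // The delta is the number of commands executed during the window.
                for (String key : start.keySet()) {
                    System.out.println(key + ": " + (end.get(key) - start.get(key)));
                }
            }
        }
    }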
Not sure if this is the answer you are looking for, but I've heard that the following JDBC logger is very useful for tracking what an application is doing to a database. It should show where your application is opening and committing transactions. You should then be able to write some scripts that process the logs to determine the number of transactions.
http://code.google.com/p/log4jdbc/
It basically sits between your application and the real database driver. You add a log4jdbc prefix to your JDBC URL. For example, if your normal JDBC URL is
jdbc:mysql://db.foo.com/webapplicationdb
then you would change it to:
jdbc:log4jdbc:mysql://db.foo.com/webapplicationdb
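Here is a rough sketch of wiring it up, assuming the classic log4jdbc driver class net.sf.log4jdbc.DriverSpy and an SLF4J binding on the classpath (names may differ between log4jdbc versions):

    // Sketch: wrap the real MySQL driver with log4jdbc and use the connection as usual.
    import java.sql.Connection;
    import java.sql.DriverManager;

    public class Log4jdbcExample {
        public static void main(String[] args) throws Exception {
            Class.forName("net.sf.log4jdbc.DriverSpy");   // register the wrapping driver
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:log4jdbc:mysql://db.foo.com/webapplicationdb", "user", "password")) {
                // Use the connection exactly as before; log4jdbc logs statements,
                // commits and rollbacks to its SLF4J loggers (e.g. jdbc.sqltiming, jdbc.audit).
                conn.setAutoCommit(false);
                // ... run your payroll inserts/updates/deletes here ...
                conn.commit();
            }
        }
    }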
First of all, I know it's odd to rely on a manual vacuum from the application layer, but this is how we decided to run it.
I have the following stack :
HikariCP
JDBC
Postgres 11 in AWS
Now here is the problem. When we start fresh with brand-new tables and autovacuum=off, the manual vacuum works fine. I can see the number of dead tuples growing up to the threshold and then going back to 0. The tables are being updated heavily from parallel connections (HOT is being used as well). At some point the number of dead rows sits at around 100k, jumps up to the threshold, and drops back to 100k again; n_dead_tuples slowly creeps up.
Now, worst of all: when you issue VACUUM from a psql console, ALL the dead tuples are cleaned up, but oddly enough, when the application issues VACUUM it completes successfully yet only cleans roughly a "threshold amount" of records, not all of them. Why?
Now I am pretty sure about the following:
Neither ANALYZE nor autovacuum is running
There are no long running transactions
No replication is going on
These tables are "private"
What is the difference between issuing a VACUUM from the console with auto-commit on vs. over JDBC? Why does the vacuum issued from the console clean ALL the tuples, whereas the vacuum issued over JDBC cleans them only partially?
The JDBC vacuum is run on a fresh connection from the pool with the default isolation level. Yes, there are updates going on in parallel, but that is also the case when a vacuum is executed from the console.
Is the connection from the pool somehow corrupted and unable to see the updates? Is the isolation level the problem?
Visibility Map corruption?
Index referencing old tuples?
Side note: I have observed the same behavior with autovacuum on and the cost limit through the roof (4000-8000), threshold at the default + 5%. At first n_dead_tuples stays close to 0 for 4-5 hours... The next day the table is 86 GB with millions of dead tuples. All the other tables are vacuumed and fine...
PS: I will try to log VACUUM VERBOSE output from the JDBC connection.
PS2: Since we are running in AWS, could a backup be causing it to stop cleaning?
PS3: When referring to vacuum I mean plain VACUUM, not VACUUM FULL. We are not issuing VACUUM FULL.
The main problem was that the vacuum was being run as a different user. The "vacuuming" I was seeing was actually the HOT updates plus the selects running over that data, resulting in on-the-fly cleanup of the pages.
Next: vacuuming is affected by long-running transactions ACROSS ALL schemas and tables. Yes, ALL schemas and tables. Changing to the correct user fixed the vacuum, but its cleanup still gets held back if there is an open transaction against any other schema.table.
maintenance_work_mem helps, but in the end, when the system is under heavy load, all vacuuming stalls.
So we upgraded the DB's resources a bit and added a monitor to help us if there are any issues.
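For reference, here is a rough sketch of the kind of check such a monitor could run: a query against pg_stat_activity listing transactions that have been open long enough to hold back vacuum. It assumes PostgreSQL 9.2+ column names (state, query, xact_start) and an arbitrary 5-minute cutoff:

    // Sketch: list transactions open longer than 5 minutes, which hold back VACUUM.
    import java.sql.*;

    public class OpenTransactionMonitor {
        public static void main(String[] args) throws Exception {
            String sql =
                "SELECT pid, usename, state, xact_start, query " +
                "FROM pg_stat_activity " +
                "WHERE xact_start IS NOT NULL " +
                "  AND now() - xact_start > interval '5 minutes' " +
                "ORDER BY xact_start";
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://db.example.com/mydb", "monitor", "secret");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    // Any row here is a transaction old enough to hold back VACUUM's
                    // cleanup horizon, regardless of which schema or table it touches.
                    System.out.printf("pid=%d user=%s state=%s since=%s%n    %s%n",
                            rs.getInt("pid"), rs.getString("usename"),
                            rs.getString("state"), rs.getTimestamp("xact_start"),
                            rs.getString("query"));
                }
            }
        }
    }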
We are trying to capture, in [near] real time, certain transactions occurring on the core database and replicate them into a remote database connected via VPN.
These transactions can be identified easily, but we are facing a challenge in deciding on the workflow and identifying which technology to use.
For example:
1.) Dumping a CSV file every x seconds.
From the core system we create a CSV file every x seconds with the required information. We then push/pull this file to the remote system and process it there.
2.) Web service
We will have two web services, one on the sender side and another on the receiver side.
Every x seconds the sender web service will execute a query, fetch records from the source database, and push the data to the receiver web service in batches of 'y' records (a rough sketch of such a poller follows the notes below).
The receiver will then process the records and send an acknowledgement for the 'y' records.
Notes:
1.) Ideally we would like the process to be real-time. Both of the above ideas are [near] real-time, not real-time.
2.) The source database system is not fixed. It can be Oracle, MS SQL Server, MySQL, Sybase, Informix, etc.
3.) The remote target database is Oracle.
Any ideas are most welcome, and the technology used can be flexible.
The main focus is on reducing the load this process places on the core database.
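To make option 2 more concrete, here is a rough sketch of what the sender-side poller could look like; the table/column names and the pushBatch() call are hypothetical, and the watermark assumes a monotonically increasing ID on the source table:

    // Sketch of the option-2 sender: poll every x seconds, fetch at most y new rows
    // past a watermark, push them, and advance the watermark only after the receiver
    // acknowledges. Table/column names and pushBatch() are hypothetical.
    import java.sql.*;
    import java.util.*;

    public class TransactionPoller {
        private long lastSeenId = 0;                 // persistent watermark in a real system
        private static final int BATCH_SIZE = 500;   // 'y'

        void pollOnce(Connection source) throws SQLException {
            String sql = "SELECT id, payload FROM core_transactions " +
                         "WHERE id > ? ORDER BY id LIMIT ?";   // LIMIT syntax varies per RDBMS
            try (PreparedStatement ps = source.prepareStatement(sql)) {
                ps.setLong(1, lastSeenId);
                ps.setInt(2, BATCH_SIZE);
                List<String> batch = new ArrayList<>();
                long maxId = lastSeenId;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        maxId = rs.getLong("id");
                        batch.add(rs.getString("payload"));
                    }
                }
                if (!batch.isEmpty() && pushBatch(batch)) {   // wait for the receiver's ack
                    lastSeenId = maxId;
                }
            }
        }

        boolean pushBatch(List<String> batch) {
            // Hypothetical call to the receiver web service; returns true on acknowledgement.
            return true;
        }
    }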
Edit:
It is becoming more and more clear to me that achieving actual real time with heterogeneous database systems will be nearly impossible, as trigger/notify mechanisms on insertion of records are RDBMS-specific.
I would like to shift the focus of the question towards better near-real-time ideas beyond the two approaches shared above.
Also, please note that we have little to no control over the source database or over the process/service that originally inserts the records into the database. We only have control over the records.
See this article for an example of how to listen for database changes (in this case via a database trigger) in PostgreSQL. Basically, you set up a trigger function that sends an event to all interested clients. Your application then listens for this event and can start the sync whenever the trigger fires. The example applies the trigger to new inserts on a specific table.
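Since this approach is PostgreSQL-specific, here is a minimal sketch of what the listening side could look like with the PostgreSQL JDBC driver; the channel name 'new_record' and the connection details are assumptions:

    // Sketch: LISTEN on a channel raised by the trigger function and react to NOTIFY events.
    import java.sql.*;
    import org.postgresql.PGConnection;
    import org.postgresql.PGNotification;

    public class ChangeListener {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://core-db.example.com/coredb", "sync", "secret")) {
                try (Statement st = conn.createStatement()) {
                    st.execute("LISTEN new_record");   // channel used by the trigger function
                }
                PGConnection pgConn = conn.unwrap(PGConnection.class);
                while (true) {
                    // Issue a trivial query so the driver processes pending server messages.
                    try (Statement st = conn.createStatement()) {
                        st.execute("SELECT 1");
                    }
                    PGNotification[] notifications = pgConn.getNotifications();
                    if (notifications != null) {
                        for (PGNotification n : notifications) {
                            // n.getParameter() carries the payload sent by pg_notify(...)
                            System.out.println("Change notified: " + n.getParameter());
                            // start the sync for this record here
                        }
                    }
                    Thread.sleep(1000);
                }
            }
        }
    }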
A lot of SHOW TRANSACTION ISOLATION LEVEL statements appear in the process list in Postgres 9.0.
What are the reasons for this and when does it appear? All of them are in the idle state.
How can I disable this?
I assume that by "process list" you mean the system view pg_stat_activity (accessible in pgAdmin III on the "Statistics" tab or via "Tools/Server Status").
Since you say that the connections are idle, the query column does not show an active query; it shows the last query that was issued on that database connection. I don't know which ORM or connection pooler you are using, but some piece of software in your stack must be issuing these statements routinely at the end of a database action.
I wouldn't worry about them; these statements are not resource-intensive and probably won't cause you any trouble.
If you really need to get rid of them, figure out which software in your stack causes them and investigate there.
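One way to narrow it down, sketched below under the assumption of PostgreSQL 9.2+ column names (on 9.0 the last statement lives in current_query instead of query), is to group the offending idle sessions by application name and client address:

    // Sketch: see which clients keep issuing SHOW TRANSACTION ISOLATION LEVEL.
    import java.sql.*;

    public class IsolationLevelQueryOrigin {
        public static void main(String[] args) throws Exception {
            String sql =
                "SELECT application_name, client_addr, count(*) AS sessions " +
                "FROM pg_stat_activity " +
                "WHERE state = 'idle' " +
                "  AND query = 'SHOW TRANSACTION ISOLATION LEVEL' " +
                "GROUP BY application_name, client_addr";
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://db.example.com/mydb", "postgres", "secret");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println(rs.getString("application_name") + " @ "
                            + rs.getString("client_addr") + ": "
                            + rs.getInt("sessions") + " idle sessions");
                }
            }
        }
    }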
We have a web application (Tomcat/Spring/Hibernate) running against a MySQL database. Every once in a while, the application runs a data-driven query that takes a huge amount of time to complete. Right now, we have no way to track these without logging ALL the queries, which would be a huge number (it's a very busy app). The only way we can identify such a query is if it actually times out; then we get an org.apache.tomcat.jdbc.pool.ConnectionPool abandon warning.
Is there some way in Tomcat, Spring or Hibernate to track only queries that take over a certain time to execute?
MySQL has a slow query log. Enable that if it isn't already.
http://dev.mysql.com/doc/refman/5.1/en/slow-query-log.html
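If it helps, here is a quick sketch of enabling it at runtime over JDBC, assuming MySQL 5.1+ and a user with the SUPER privilege; the 5-second threshold is just an example:

    // Sketch: turn on the slow query log at runtime and log anything slower than 5 seconds.
    import java.sql.*;

    public class EnableSlowQueryLog {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://db.example.com/appdb", "admin", "secret");
                 Statement st = conn.createStatement()) {
                st.execute("SET GLOBAL slow_query_log = 'ON'");
                st.execute("SET GLOBAL long_query_time = 5");   // seconds; pick your own threshold
                // Optionally check where the log is written:
                try (ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE 'slow_query_log_file'")) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1) + " = " + rs.getString(2));
                    }
                }
            }
        }
    }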
The session factory has a getStatistics() method that exposes all kinds of statistics; find out about it here. You may be interested in the stats.getQueryExecutionMaxTime() method.
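For illustration, here is a brief sketch of how that might be used, assuming statistics are enabled (hibernate.generate_statistics=true, or programmatically as below) and sessionFactory is your existing SessionFactory:

    // Sketch: report the slowest queries seen by Hibernate's statistics collector.
    import org.hibernate.SessionFactory;
    import org.hibernate.stat.Statistics;

    public class HibernateSlowQueryCheck {
        public static void report(SessionFactory sessionFactory) {
            Statistics stats = sessionFactory.getStatistics();
            stats.setStatisticsEnabled(true);   // no-op if already enabled via configuration

            // Slowest query observed so far and how long it took (milliseconds).
            System.out.println("Max query time (ms): " + stats.getQueryExecutionMaxTime());
            System.out.println("Slowest query: " + stats.getQueryExecutionMaxTimeQueryString());

            // Per-query breakdown, useful for spotting the data-driven outliers.
            for (String query : stats.getQueries()) {
                System.out.println(query + " -> max "
                        + stats.getQueryStatistics(query).getExecutionMaxTime() + " ms");
            }
        }
    }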
The underlying problem I want to solve is running a task that generates several temporary tables in MySQL, which need to stay around long enough to fetch results from Java after they are created. Because of the size of the data involved, the task must be completed in batches. Each batch is a call to a stored procedure called through JDBC. The entire process can take half an hour or more for a large data set.
To ensure access to the temporary tables, I run the entire task, start to finish, in a single Spring transaction with a TransactionCallbackWithoutResult. Otherwise, I could get a different connection that does not have access to the temporary tables (this would happen occasionally before I wrapped everything in a transaction).
This worked fine in my development environment. However, in production I got the following exception:
java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
This happened when a different task tried to access some of the same tables during the execution of my long-running transaction. What confuses me is that the long-running transaction only inserts into or updates temporary tables. All access to non-temporary tables is via selects only. From what documentation I can find, the default Spring transaction isolation level should not cause MySQL to block in this case.
So my first question: is this the right approach? Can I ensure that I repeatedly get the same connection through a Hibernate template without a long-running transaction?
If the long-running transaction approach is the correct one, what should I check in terms of isolation levels? Is my understanding correct that the default isolation level in Spring/MySQL transactions should not lock tables that are only accessed through selects? What can I do to debug which tables are causing the conflict, and how can I prevent the transaction from locking them?
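For context, here is a rough sketch of the setup described above; the helper methods and the timeout value are illustrative, not the actual code:

    // Sketch: run the whole batch job inside one programmatic Spring transaction so
    // every stored-procedure call uses the same connection (and sees the same
    // MySQL temporary tables).
    import org.springframework.transaction.PlatformTransactionManager;
    import org.springframework.transaction.TransactionStatus;
    import org.springframework.transaction.support.TransactionCallbackWithoutResult;
    import org.springframework.transaction.support.TransactionTemplate;

    public class TempTableBatchJob {
        private final TransactionTemplate txTemplate;

        public TempTableBatchJob(PlatformTransactionManager txManager) {
            this.txTemplate = new TransactionTemplate(txManager);
            this.txTemplate.setTimeout(3600);   // seconds; long enough for the whole run
        }

        public void run(final int batchCount) {
            txTemplate.execute(new TransactionCallbackWithoutResult() {
                @Override
                protected void doInTransactionWithoutResult(TransactionStatus status) {
                    for (int i = 0; i < batchCount; i++) {
                        // Each batch calls the stored procedure on the SAME connection,
                        // so the temporary tables stay visible between calls.
                        callStoredProcedureBatch(i);          // hypothetical helper
                    }
                    fetchResultsFromTemporaryTables();        // hypothetical helper
                }
            });
        }

        private void callStoredProcedureBatch(int batch) { /* ... */ }
        private void fetchResultsFromTemporaryTables() { /* ... */ }
    }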
I consider keeping a transaction open for an extended time evil. Over my career the definition of "extended" has shrunk from seconds to milliseconds.
It is an unending source of non-repeatable, head-scratching problems.
I would bite the bullet in this case and keep a 'work log' in software which you can replay in reverse to clean up if the batch fails.
When you say your table is temporary, is it transaction-scoped? That might lead to other transactions (perhaps on a different connection) not being able to see or access it. Perhaps a join involving a real table and a temporary table somehow locks the real table.
Root cause: Have you tried using the MySQL tools to determine what is locking the connection? It might be something like next-key row locking. I don't know the MySQL tools that well, but in Oracle you can see which connections are blocking other connections.
Transaction timeout: You should create a second connection pool/data source with a much longer timeout, and use that pool for your long-running task. I think your production environment is 'trying' to help you out by detecting stuck connections.
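Following up on the root-cause point, here is a sketch of one way to see who blocks whom, assuming MySQL 5.5-5.7 (in 8.0 these INFORMATION_SCHEMA views moved to performance_schema):

    // Sketch: list waiting vs. blocking InnoDB transactions while the lock wait is happening.
    import java.sql.*;

    public class InnodbLockWaits {
        public static void main(String[] args) throws Exception {
            String sql =
                "SELECT r.trx_mysql_thread_id AS waiting_thread, r.trx_query AS waiting_query, " +
                "       b.trx_mysql_thread_id AS blocking_thread, b.trx_query AS blocking_query " +
                "FROM information_schema.innodb_lock_waits w " +
                "JOIN information_schema.innodb_trx b ON b.trx_id = w.blocking_trx_id " +
                "JOIN information_schema.innodb_trx r ON r.trx_id = w.requesting_trx_id";
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://db.example.com/appdb", "admin", "secret");
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.println("thread " + rs.getLong("blocking_thread")
                            + " running [" + rs.getString("blocking_query")
                            + "] blocks thread " + rs.getLong("waiting_thread")
                            + " running [" + rs.getString("waiting_query") + "]");
                }
            }
        }
    }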
As mentioned by Justin regarding the transaction timeout, I recently faced a problem where the connection pool (in my case Tomcat DBCP in Tomcat 7) had settings that were supposed to mark long-running connections as abandoned and then close them. After tweaking those parameters I was able to avoid the issue.
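For illustration, here is a rough sketch of that kind of tuning with the Tomcat JDBC pool (commons-dbcp exposes similar removeAbandoned settings under slightly different names); the values are just examples:

    // Sketch: a pool that waits a full hour before abandoning a connection, so a
    // long-running batch transaction is not killed mid-flight.
    import org.apache.tomcat.jdbc.pool.DataSource;
    import org.apache.tomcat.jdbc.pool.PoolProperties;

    public class LongRunningPoolConfig {
        public static DataSource create() {
            PoolProperties p = new PoolProperties();
            p.setUrl("jdbc:mysql://db.example.com/appdb");
            p.setDriverClassName("com.mysql.jdbc.Driver");
            p.setUsername("app");
            p.setPassword("secret");

            // Treat a connection as abandoned only after an hour, and log when it happens.
            p.setRemoveAbandoned(true);
            p.setRemoveAbandonedTimeout(3600);   // seconds
            p.setLogAbandoned(true);

            DataSource ds = new DataSource();
            ds.setPoolProperties(p);
            return ds;
        }
    }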