How to find slow queries in MySQL (mostly MariaDB) from Java

I have a few questions:
1) I am a newbie to performance testing. As a starting assignment I have to investigate slow queries in MariaDB (version: 10.0.17-MariaDB). I tried these settings in /etc/my.cnf.d/server.cnf:
[mysqld]
long_query_time=1
log-slow-queries=/var/log/mysql/log-slow-queries.log
And after doing that I could not start the database. I just get a simple
starting MySQL.... [FAILED] message.
I came across the Slow query log overview for MariaDB, which made only a little sense to me :(
Can anyone point me to a tutorial on how this should be done?
2) In my application we already use Hibernate for the data layer. Does it even make sense to look for slow queries in the way mentioned above?
3) How can I achieve the same thing in MongoDB, i.e. list the most frequently used queries and the slow ones?
Any help would be appreciated.

Converting my comment to an answer:
When MySQL won't start, the first thing you should check is the MySQL error log (probably /var/log/mysql/mysqld.log or /var/log/mysqld.log) for the exact error.
In your case, "log-slow-queries" is an old startup option name (and a deprecated one, too); you should use slow_query_log with a boolean value and slow_query_log_file for the filename.
slow_query_log=1 means ENABLE logging
long_query_time=1 means IF ENABLED, log queries taking longer than 1 second
Then there is also:
- log_queries_not_using_indexes=0/1 which, if enabled, will log even queries faster than 1 s if they do not use indexes to locate rows
All of these and more can be found, with descriptions, in the MySQL manual.
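For example, a corrected server.cnf section could look like the following (just a sketch; the log path is an example and must be writable by the mysqld user):

[mysqld]
slow_query_log=1
slow_query_log_file=/var/log/mysql/log-slow-queries.log
long_query_time=1
log_queries_not_using_indexes=1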
For MongoDB there seems to be a profiler, which is described in the answers to this question: How to find queries not using indexes or slow in mongodb.
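If you prefer to switch the profiler on from Java, here is a minimal sketch using a current MongoDB Java driver (connection string and database name are placeholders): profiling level 1 records operations slower than the given slowms threshold into that database's system.profile collection.

import org.bson.Document;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoDatabase;

public class EnableMongoProfiler {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("mydb"); // placeholder database name
            // level 1 = profile operations slower than slowms (here 100 ms);
            // they can then be inspected in the system.profile collection
            Document result = db.runCommand(new Document("profile", 1).append("slowms", 100));
            System.out.println(result.toJson());
        }
    }
}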

Related

Does deleting from a table of an H2 database handled by Hibernate corrupt the table?

Here is a quick description of the system:
A Java 7 REST client receives JSONs and writes their parsed content into an H2 database via Hibernate.
Some Pentaho Kettle Spoon 4 ETLs connect directly to the same database to read and delete a lot of entries at once.
This solution worked fine in our test environment, but in production (where the traffic is really higher, because of course it is) the ETLs often fail with the following error:
Error inserting/updating row
General error: "java.lang.ArrayIndexOutOfBoundsException: -1"; SQL statement:
DELETE FROM TABLE_A
WHERE COLUMN_A < ? [50000-131]
and if I navigate the database I can indeed see that the table is not readable (apparently because it thinks its length is -1?). The error code 50000 just means "Generic", so it is of no use.
Apart from the trivial "maybe H2 is not good for an event handler", I've been thinking that the corruption could possibly be caused by a conflict between Kettle and Hibernate, or in other words that no one should delete from a Hibernate-managed database behind Hibernate's back.
My questions to those more experienced than me with Hibernate are:
Is my supposition correct?
Should I redesign my solution so that the deletes also go through the same RESTful Hibernate layer?
Should I give up on H2 for such a system?
Thanks for the help!
EDIT:
The database is created by a simple sh script that runs the following command, which basically uses the provided Shell tool to connect to a non-existent DB, which by default creates it:
$JAVA_HOME/bin/java -cp *thisIsAPath*/h2database/h2/main/h2-1.3.168-redhat-2.jar org.h2.tools.Shell -user $DB_USER -password $DB_PASSWORD -url jdbc:h2:$DB_FOLDER/Temp_SD_DS_EventAgent<<END
So all its parameters are set to version 1.3.168's defaults. Unfortunately, while I can find the current URL settings documented, I can't find where to look up that version's defaults and experimental options.
I also found the following:
According to the tutorial: When using Hibernate, try to use the H2Dialect if possible. Which I didn't.
The tutorial also says: Please note MVCC is enabled in version 1.4.x by default, when using the MVStore. Does that mean concurrency is disabled/unsupported by default in my older version, and could that be the problem?
The database is created with H2 version 1.3.168, but the consumer uses 1.4.197. Is this a big deal?
I cannot comment on the reliability of the H2 DB.
But from the application's perspective, I think you should use a locking mechanism: an optimistic or a pessimistic lock. This will avoid the conflict situations. Hope this answer helps point you in the right direction.
Article on Optimistic and Pessimistic locking
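As a minimal sketch of the optimistic variant with JPA/Hibernate (the entity and its columns are made up for illustration), a @Version field makes Hibernate add the version to the WHERE clause of its UPDATEs and DELETEs and fail fast when another writer got there first:

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
public class EventRecord { // hypothetical entity
    @Id
    private Long id;

    private String payload;

    // Hibernate checks and increments this column on every update;
    // a concurrent modification results in an OptimisticLockException.
    @Version
    private Long version;

    // getters and setters omitted for brevity
}

Note that optimistic locking only protects writers that honor the version column; an external tool like Kettle would have to check and bump it too, so for mixed access a database-level (pessimistic) lock may be the safer choice.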

JDBC - logging SQL statement execution times

I need to log the SQL execution times in my Java EE application (any further statistics would be an optional bonus).
Things are set up in a more or less standard way: a DataSource on the application server serving pooled JDBC connections.
The application uses a mix of the following for DB access:
Hibernate and
Spring JdbcTemplate
It runs on:
Glassfish OSE and
Oracle DBS
I know about "Anything better than P6Spy?", however those questions and answers are outdated from my point of view.
What I've found so far:
I could go for the pure Hibernate approach (hibernate show query execution time),
but it's not feasible due to the mixed DB access in my case
I could use one of the custom JDBC drivers:
p6spy - however, the project seems to have been dead for a couple of years (last commit 3 years ago: http://sourceforge.net/p/p6spy/code/23/tree/trunk/)
log4jdbc - however, no release for more than a year, and the source seems untouched for about 6 months (http://code.google.com/p/log4jdbc/source/list)
another two: log4jdbc-log4j2 and log4jdbc-remix - which seem alive, but I'm not sure about their stability and breadth of usage
Recommendations based on experiences are very welcome.
Please note, I'm interested in answers of the kind "We're using XYZ and this is our experience", rather than "I googled just now and feel like...".
If you want dead-accurate SQL execution times, then the best option is an SQL trace. But you want it inside your Java EE application, so presumably a reasonably accurate execution time is enough.
Here are the things I would suggest (I have implemented them in my own code):
If you just want it for logging purposes, then add appropriate Log4j debug messages, which will print the time along with each log entry.
I implemented a BatchLog table for my application which records the start and end time of an operation; in your case that would be the start and end time of your query. If it is just a single query, triggers might help here, or else you can update the log table just before and after running the query. Even better would be a stored procedure that takes care of the whole thing and gives more accurate data.
Measuring time and logging it sounds like a job for AOP. If you are using EJB, a simple interceptor should solve your problem (for example: http://www.javacodegeeks.com/2013/07/java-ee-ejb-interceptors-tutorial-and-example.html). If it's Spring (judging from the JdbcTemplate), try AspectJ.
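For the EJB case, a minimal sketch of such an interceptor could look like this (class and logger names are illustrative; attach it to a bean with @Interceptors(TimingInterceptor.class)):

import java.util.logging.Logger;
import javax.interceptor.AroundInvoke;
import javax.interceptor.InvocationContext;

public class TimingInterceptor {
    private static final Logger LOG = Logger.getLogger(TimingInterceptor.class.getName());

    @AroundInvoke
    public Object logExecutionTime(InvocationContext ctx) throws Exception {
        long start = System.nanoTime();
        try {
            return ctx.proceed(); // run the intercepted DAO/service method
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            LOG.info(ctx.getMethod().getName() + " took " + elapsedMs + " ms");
        }
    }
}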
OK, thanks a lot for your answers.
Finally I decided to go for P6Spy, with patches specific to our scenario.
Moreover, as I believe the gap P6Spy fills still exists these days, I decided to participate in its development (https://github.com/p6spy/p6spy). Feel free to report your issues or feature requests at https://github.com/p6spy/p6spy/issues to make it fit your needs.

A data migration issue in HSQL: the new database is missing a few tables

I am currently responsible for migrating our application's data for an upgrade to a new version. I am trying to migrate from HSQL to HSQL; later we will move on to other combinations.
So I have a standalone utility to do this. I am using MockServletContext to initialize my services (this migration has to be done without starting the servers).
The problem is that all the tables are migrated except for 2-3, the exact number depending on the size of the data migrated. On extensive debugging I found nothing wrong; all the data gets migrated when debugging via Eclipse, but on a normal run the last 3 tables fail to complete.
Any clue where to look?
In a normal run I have added loggers to see whether we are reading all the data from the source database, and the logs prove that we do.
The only place where I am unable to add logs is when a method in the driver is called.
In the last step we call the PreparedStatement object's executeBatch()/executeUpdate() methods (I tried both, with exactly the same result).
I am completely clueless about what to do and where to look. Any suggestions?
Thanks
In a normal run I have added loggers to see whether we are reading all the data from the source database, and the logs prove that we do. The only place where I am unable to add logs is when a method in the driver is called.
If you suspect something wrong there, try wrapping your driver in log4jdbc. It will show you the SQL issued to the DB. Good luck!
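A minimal sketch of that wrapping (URL, path and credentials are placeholders): log4jdbc ships a spy driver that delegates to the real one, and you just prefix the normal JDBC URL; the issued SQL and its timing then appear under the jdbc.sqlonly and jdbc.sqltiming log categories.

import java.sql.Connection;
import java.sql.DriverManager;

public class Log4jdbcExample {
    public static void main(String[] args) throws Exception {
        Class.forName("net.sf.log4jdbc.DriverSpy"); // spy driver wrapping the real HSQLDB driver
        try (Connection conn = DriverManager.getConnection(
                "jdbc:log4jdbc:hsqldb:file:/path/to/db", "sa", "")) {
            // run the migration statements as before; every statement is now logged
        }
    }
}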

How to diagnose performance problems with SQL Server Views and JDBC

I have a view defined in SQL Server 2008 that joins 4 tables together. Executing this view in SQL Server Management Studio takes roughly 3 seconds and returns about 45,000 records. My application is written in Java, using Hibernate to simply run a "from MyViewObject" query in HQL. When this runs, the execution time is consistently around 45 seconds. I have also tried simply using JDBC to run the query and got the same level of performance, so I've assumed it has nothing to do with Hibernate.
My question: what can I do to diagnose this problem? There is obviously something different between how Management Studio runs the query and how my application runs it, but I have not been able to come up with much.
The only thing I've come up with as a potentially viable explanation is an issue with the jTDS library that provides the SQL Server driver for Java.
Any guidance here would be greatly appreciated.
UPDATE
I went back to trying pure JDBC and tried adding the selectMethod and responseBuffering attributes to my connection string, but didn't get any improvement. I also took the JDBC code from my application and ran it in a test program containing nothing but that JDBC code, and it ran in the expected 3 seconds. So to me this seems environmental to the application.
My application is a Google Web Toolkit (GWT) based app, and the JDBC code runs in my primary RPC servlet. Essentially, the RPC method receives the call and immediately executes the JDBC code. Nothing in this setup gives me much indication of why the performance is terrible, though. I am going to try the JDBC 3.0 driver and see if that works any better, but it doesn't feel to me like that will fix the issue.
My goal for the moment is to get my query working live with JDBC and then switch it back over to Hibernate so I can keep the testing simple enough. Thanks for the help so far!
UPDATE 2
I'm finally starting to zero in on the source of the problem, though I still have no idea what the actual issue is. I opened the view in SQL Server, copied its (rather large) SQL statement exactly into my code, and executed it via JDBC instead of pulling the data from the view, and most of the performance issues are gone. It seems that some combination of GWT, SQL Server views and JDBC is not working properly here. I don't see keeping a very large hand-written query in my code as a long-term solution, but it does offer a bit more insight.
<property name="hibernate.show_sql">true</property>
Setting this will show you the SQL queries generated by Hibernate. Analyze the query and make sure you are not missing a relationship.
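If the raw output is hard to read, you can also turn on formatting (a sketch; both are standard Hibernate settings):

<property name="hibernate.show_sql">true</property>
<property name="hibernate.format_sql">true</property>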
Reply to updates 1 and 2:
As you mentioned, you ran the raw SQL query and it is fast. So another thing to remember about Hibernate is that it creates the objects returned by your query (this depends on whether you initialize objects lazily; it is called lazy loading). How many objects does your query return? You can also run a simple benchmark to find where the issue is.
For example, before running the query print out the current time, and then print the current time again afterwards. Do this for every place that you suspect is slowing your application down.
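A trivial timing sketch of that idea (the helper class and the HQL call in the usage comment are illustrative):

import java.util.function.Supplier;

public class Stopwatch {
    // wrap any suspect call and print how long it took
    public static <T> T timed(String label, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            System.out.println(label + " took "
                    + (System.nanoTime() - start) / 1_000_000 + " ms");
        }
    }
}

// usage, e.g.:
// List<MyViewObject> rows = Stopwatch.timed("HQL query",
//         () -> session.createQuery("from MyViewObject").list());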
To analyze the problem you should look in your manual for tools that display the query or execution plan. Maybe you're missing an index on a join column.

Large ResultSet on PostgreSQL query

I'm running a query against a table in a PostgreSQL database. The database is on a remote machine. The table has around 30 child tables using PostgreSQL's partitioning capability.
The query will return a large result set, something around 1.8 million rows.
In my code I use Spring JDBC support, specifically the JdbcTemplate.query method, but my RowCallbackHandler is not being called.
My best guess is that the PostgreSQL JDBC driver (I use version 8.3-603.jdbc4) is accumulating the entire result in memory before calling my code. I thought the fetchSize configuration could control this, but I tried it and nothing changed. I did this as the PostgreSQL manual recommends.
This query worked fine when I used Oracle XE, but I'm migrating to PostgreSQL because of the partitioning feature, which is not available in Oracle XE.
My environment:
PostgreSQL 8.3
Windows Server 2008 Enterprise 64-bit
JRE 1.6 64-bit
Spring 2.5.6
PostgreSQL JDBC Driver 8.3-603
In order to use a cursor to retrieve data you have to set the ResultSet type to ResultSet.TYPE_FORWARD_ONLY (the default) and set autocommit to false, in addition to setting a fetch size. That is referenced in the doc you linked to, but you didn't explicitly mention whether you did those steps.
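A minimal JDBC sketch of those three conditions together (URL, credentials and query are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CursorFetchExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://remotehost/mydb", "user", "secret")) {
            conn.setAutoCommit(false); // required, or the driver fetches everything at once
            try (Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
                stmt.setFetchSize(500); // rows per round trip instead of all 1.8M rows
                try (ResultSet rs = stmt.executeQuery("SELECT * FROM big_table")) {
                    while (rs.next()) {
                        // process each row as it streams in
                    }
                }
            }
        }
    }
}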
Be careful with PostgreSQL's partitioning scheme. It really does very horrible things to the optimizer and can cause massive performance issues where there should not be any (depending on the specifics of your data). In any case, is your table only 1.8M rows? There is no reason it would need to be partitioned based on size alone, given that it is appropriately indexed.
I'm betting that there's not a single client of your app that needs 1.8M rows all at the same time. You should think of a sensible way to chunk the results into smaller pieces and give users the chance to iterate through them.
That's what Google does. When you do a search there might be millions of hits, but they return 25 pages at a time with the idea that you'll find what you want in the first page.
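If you do chunk the results like that, here is a sketch of keyset-style paging (table and column names are made up) that never holds more than one chunk in memory:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class ChunkedReader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://remotehost/mydb", "user", "secret");
             PreparedStatement ps = conn.prepareStatement(
                     "SELECT id, value FROM big_table WHERE id > ? ORDER BY id LIMIT 1000")) {
            long lastId = 0;
            boolean more = true;
            while (more) {
                more = false;
                ps.setLong(1, lastId); // resume after the last row of the previous chunk
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getLong("id");
                        more = true;
                        // process one row of the current chunk
                    }
                }
            }
        }
    }
}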
If it's not a client, and the results are being massaged in some way, I'd recommend letting the database crunch all those rows and simply return the result. It makes no sense to return 1.8M rows just to do a calculation on the middle tier.
If neither of those apply, you've got a real problem. Time to rethink it.
After reading the later responses, it sounds to me like this is more of a reporting solution that ought to be crunched in batch, or calculated in real time and stored in tables that are not part of your transactional system. There is no way that bringing 1.8M rows to the middle tier for calculating moving averages can scale.
I'd recommend reorienting yourself - start thinking about it as a reporting solution.
The fetchSize property worked as described in the PostgreSQL manual.
My mistake was that I was setting autocommit = false on a connection from the connection pool that was not the connection being used by the prepared statement.
Thanks for all the feedback.
I did everything above, but I needed one last piece: make sure the call is wrapped in a transaction, and set the transaction to read-only, so that no rollback state is required.
I added this: @Transactional(readOnly = true)
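Putting it all together, a sketch of the final shape (the DAO name, query and fetch size are illustrative):

import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.transaction.annotation.Transactional;

public class BigTableReader {
    private final JdbcTemplate jdbcTemplate;

    public BigTableReader(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    // the transaction keeps autocommit off on the very connection the
    // statement uses, so the driver can stream rows with a cursor
    @Transactional(readOnly = true)
    public void processAll() {
        jdbcTemplate.setFetchSize(500); // must be > 0 for the driver to use a cursor
        jdbcTemplate.query("SELECT * FROM big_table",
                rs -> { /* RowCallbackHandler: handle one row at a time */ });
    }
}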
Cheers.
