I have a problem with my Oracle DB network speed.
First, the essence of the problem: there is a Java application on my computer and an Oracle DB on a remote server. The connection speed between them is about 2.5 MB/s. In my Java app I execute
a very simple query like "select id, name from table_name"; the result set contains ~60K rows (about 1.5 MB) and takes ~80 seconds to transfer to my app. According to the profiler, the application spends most of its time in the oracle.net.Packet.receive method.
For comparison, the same query executes in SQL Developer in 0.5-0.7 seconds for 5000 rows. Extrapolating to 60K rows gives about 6-8 seconds.
Running tcpdump against my application shows that the data is transferred in chunks of about 200 bytes. For SQL Developer, on the other hand, tcpdump shows packets larger than 2000 bytes.
The official Oracle documentation suggests increasing the SDU and TDU parameters. Unfortunately I can't change the database configuration, so I tried to set them on the client side like this:
jdbc:oracle:thin:@(DESCRIPTION=(SDU=11280)(TDU=11280)(ADDRESS=(PROTOCOL=tcp)(HOST=<host>)(PORT=1521)(SEND_BUF_SIZE=11784)(RECV_BUF_SIZE=11784))(CONNECT_DATA=(SERVICE_NAME=<db>)))
But this didn't bring any change. Can the database or the ojdbc driver ignore these parameters? Or am I on the wrong track?
As it turned out, the reason was the fetch size. Increasing its value reduced the execution time by a factor of ~100.
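For reference, here's a minimal sketch of the fix. The Oracle thin driver fetches only 10 rows per round trip by default, which is what produces the tiny packets seen above; the query and table name are the ones from the question, everything else is illustrative:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeDemo {

    // Rough number of network round trips needed to pull `rows` rows
    // when the driver fetches `fetchSize` rows per round trip.
    static int roundTrips(int rows, int fetchSize) {
        return (rows + fetchSize - 1) / fetchSize;
    }

    // Sketch only: read the whole result set with a larger fetch size.
    static void readAll(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.setFetchSize(1000); // default is 10 for the Oracle thin driver
            try (ResultSet rs = st.executeQuery("select id, name from table_name")) {
                while (rs.next()) {
                    // process rs.getLong("id"), rs.getString("name")
                }
            }
        }
    }

    public static void main(String[] args) {
        // 60K rows at the default fetch size of 10 means ~6000 round trips;
        // at 1000 it drops to 60, which matches the ~100x speedup observed.
        System.out.println(roundTrips(60_000, 10));
        System.out.println(roundTrips(60_000, 1000));
    }
}
```

The same setting can also be applied connection-wide via the `defaultRowPrefetch` connection property instead of per statement.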
First of all, I know it's odd to rely on a manual vacuum from the application layer, but this is how we decided to run it.
I have the following stack:
HikariCP
JDBC
Postgres 11 in AWS
Now here is the problem. When we start fresh with brand-new tables and autovacuum=off, the manual vacuum works fine: I can see the number of dead tuples growing up to the threshold, then going back to 0. The tables are updated heavily in parallel connections (HOT is being used as well). At some point, though, the number of dead rows sits at around 100K, jumps up to the threshold, and drops back only to 100K, and n_dead_tuples slowly creeps up.
Now the worst part of all: when you issue VACUUM from a pg console, ALL the dead tuples are cleaned. Oddly enough, when the application issues VACUUM it succeeds, but it only cleans roughly "the threshold amount" of records, not all of them.
Now I am pretty sure about the following:
Analyze is not running, and neither is autovacuum
There are no long running transactions
No replication is going on
These tables are "private"
What is the difference between issuing VACUUM from the console with auto-commit on versus from JDBC? Why does the VACUUM issued from the console clean ALL the tuples, whereas the VACUUM from JDBC cleans them only partially?
The JDBC vacuum is run on a fresh connection from the pool with the default isolation level. Yes, there are updates going on in parallel, but that is also the case when the vacuum is executed from the console.
Is the connection from the pool somehow corrupted and unable to see the updates? Is isolation the problem?
Visibility Map corruption?
Index referencing old tuples?
Side note: I have observed the same behavior with autovacuum on and the cost limit through the roof (4000-8000), with the default threshold + 5%. At first n_dead_tuples stays close to 0 for 4-5 hours... The next day the table is 86 GB with millions of dead tuples. All the other tables are vacuumed and OK...
PS: I will try to log a VACUUM VERBOSE from the JDBC side.
PS2: Since we are running in AWS, could a backup be causing it to stop cleaning?
PS3: By vacuum I mean plain VACUUM, not VACUUM FULL. We are not issuing VACUUM FULL.
The main problem was that the vacuum was being run as another user. The vacuuming I was seeing was actually the HOT updates plus the selects running over that data, which clean up pages on the fly.
Next: vacuuming is affected by long-running transactions ACROSS ALL schemas and tables. Yes, ALL schemas and tables. Changing to the correct user fixed the vacuum, but dead tuples will still be kept if there is an open transaction against any other schema.table.
maintenance_work_mem helps, but in the end, when the system is under heavy load, all vacuuming stalls.
So we upgraded the DB's resources a bit and added a monitor to alert us if there are any issues.
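For anyone else debugging this, here is a minimal sketch of issuing the manual VACUUM over JDBC while logging which user actually runs it; the identifier quoting and logging are illustrative, not our production code:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ManualVacuum {

    // Quote an identifier so a table name can be safely interpolated into
    // the VACUUM statement (VACUUM takes no bind parameters).
    static String quoteIdent(String ident) {
        return "\"" + ident.replace("\"", "\"\"") + "\"";
    }

    // VACUUM cannot run inside a transaction block, so autocommit must be on.
    static void vacuum(Connection con, String table) throws SQLException {
        con.setAutoCommit(true);
        try (Statement st = con.createStatement()) {
            // Log who is actually running the vacuum: a user that does not
            // own the table gets a "skipping ..." warning rather than an error.
            try (ResultSet rs = st.executeQuery("select current_user")) {
                rs.next();
                System.out.println("vacuuming as: " + rs.getString(1));
            }
            st.execute("VACUUM (VERBOSE) " + quoteIdent(table));
        }
    }
}
```

Running `select current_user` on the same connection makes the ownership problem visible immediately: PostgreSQL does not error out when a non-owner vacuums a table, it just emits a warning and skips it.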
I have a service which deletes and inserts records into tables from other tables based on changes in the data. When I run my service it starts transactions and seems to work fine; after some time it gets stuck, as if it were waiting for some resource, and after waiting for hours it throws a connection timeout exception. I checked with the DBAs: they defragmented the indexes on the tables, and I also reduced the number of records per transaction from 50K to 10K, but no luck with any of these changes. I am trying to process around 3.8 million records in total.
Note: it was working fine with 2 CPU cores, but it used to take hours to complete a run. So we added 2 more CPU cores; after increasing the cores it worked fine the first time, but since then it gives a connection timeout exception every time.
Please check the number of allowed active connections in SQL Server.
Also make sure you are closing your connection properly after every call.
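To illustrate both points, here is a sketch of processing the rows in small committed chunks, with try-with-resources guaranteeing cleanup; `target_table` and the chunk sizes are placeholders based on the numbers in the question:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BatchedCopy {

    // Number of commits needed to move `total` rows in chunks of `chunk`.
    static int chunks(int total, int chunk) {
        return (total + chunk - 1) / chunk;
    }

    // Sketch: insert one chunk of ids as a single batch and commit it.
    // try-with-resources closes the statement even if an exception escapes.
    static void copyChunk(Connection con, List<Long> ids) throws SQLException {
        String sql = "insert into target_table (id) values (?)";
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (Long id : ids) {
                ps.setLong(1, id);
                ps.addBatch();
            }
            ps.executeBatch();
            con.commit(); // commit per chunk keeps lock lifetimes short
        } catch (SQLException e) {
            con.rollback();
            throw e;
        }
    }
}
```

Smaller committed chunks keep the transaction log and lock lifetimes short, which is often what stops a multi-million-row job from stalling; 3.8M rows at 10K per chunk is 380 commits.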
I've been working on an application which requires regular writes and massive reads at once.
The application stores a few text columns, which are not very big, and a map, which is the biggest column in the table.
Working with Phantom-DSL in Scala (Datastax Java driver underneath), my application crashes when the data size increases.
Here is a log from my application.
[error] - com.websudos.phantom - All host(s) tried for query failed (tried: /127.0.0.1:9042 (com.datastax.driver.core.OperationTimedOutException: [/127.0.0.1:9042] Operation timed out))
And here are the cassandra logs.
I have posted the Cassandra logs in a pastebin because they were too large to embed here.
I can't seem to understand the reason for this crash. I have tried increasing the timeout and turning off row cache.
From what I understand, this is a basic problem and can be resolved by tuning cassandra for this special case.
My Cassandra data comes from different sources, so writes are not very frequent. But reads are big: over 300K rows may be required at once, which then need to be transferred over HTTP.
The logs show significant GC pressure (ParNew of 5 seconds).
When you say "reads are big in size in that over 300K rows may be required at once", do you mean you're pulling 300K rows in a single query? The Datastax driver supports native paging: set the fetch size significantly lower (500 or 1000) and let it page through those queries instead of trying to load all 300K rows in a single pass.
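The paging pattern looks like this in outline; the generic pager below is a plain-Java stand-in for what `statement.setFetchSize(n)` gives you for free in the Datastax driver, where iterating the ResultSet pulls one page per round trip:

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

public class PagedRead {

    // Consume a large result page by page. fetch(pageIndex, pageSize) returns
    // one page of rows; an empty page signals the end. Only one page is ever
    // held in memory, instead of all 300K rows at once.
    static <T> long consume(int pageSize,
                            BiFunction<Integer, Integer, List<T>> fetch,
                            Consumer<T> sink) {
        long total = 0;
        for (int page = 0; ; page++) {
            List<T> rows = fetch.apply(page, pageSize);
            if (rows.isEmpty()) {
                break;
            }
            for (T row : rows) {
                sink.accept(row);
                total++;
            }
        }
        return total;
    }
}
```

With the real driver the equivalent is calling `setFetchSize(1000)` on the statement before executing it; the coordinator then returns pages of rows as you iterate rather than materializing everything at once.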
Maps (and collections in general) can be very demanding on Cassandra's heap space. Changing your data model to replace the map with a separate table may solve your GC issues. But that's a lot of speculation, given the lack of further details on your Cassandra usage.
I've been browsing questions that may already have my answer, but they don't directly address my situation. I'd like to write a plugin for a game that collects statistics. One of the statistics could, with enough players on the server, easily generate about 200 queries within 3 seconds against a remote database, on the specifications shown below. I have two questions: first, is this going to cause noticeable network issues on a 100 Mbit port? Second, will all these queries add significant CPU usage on top of a highly CPU-intensive game engine?
Server Specifications
- i3 3420 4 Cores
- 16GB RAM
- 100Mbit Port
On a Side Note,
Would moving the database onto the game server itself reduce the load enough that it's the recommended setup?
Well, without knowing the amount of data being stored, it's hard to make any judgement calls on this one. However, a couple of things...
I doubt any database could handle 200 queries in 3 seconds on that kind of machine unless your tables only hold a few records.
The 100 Mbit port won't be a problem; you're not actually transporting the whole database across the wire, just the query ("SELECT ... FROM ...") and the results (which should be a single row for statistics).
However, you will bog down the server with such queries, causing hiccups and delays for your gamers. My recommendation would be to replicate the game database to a separate server (a simple master/slave setup) and run your statistics queries against the slave.
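A sketch of what that split can look like from the plugin side, assuming two configured DataSources; the class name and the crude routing rule are invented for illustration:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

public class StatsRouter {

    private final DataSource master; // game writes go here
    private final DataSource slave;  // statistics reads go here

    StatsRouter(DataSource master, DataSource slave) {
        this.master = master;
        this.slave = slave;
    }

    // Crude routing rule: read-only statistics queries go to the slave,
    // everything else to the master.
    static boolean useSlave(String sql) {
        return sql.trim().toLowerCase().startsWith("select");
    }

    Connection connectionFor(String sql) throws SQLException {
        return useSlave(sql) ? slave.getConnection() : master.getConnection();
    }
}
```

The point of the split is that statistics scans never contend with the game's writes on the master; replication lag is usually acceptable for statistics.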
I have a problem with my HTML scraper. The scraper is a multithreaded application written in Java using HtmlUnit; by default it runs with 128 threads. In short, it works as follows: it takes a site url from a big text file, pings the url, and if it is accessible, parses the site, finds specific html blocks, saves all url and block info (including the html code) into the corresponding tables in the database, and moves on to the next site. The database is MySQL 5.1; there are 4 InnoDB tables and 4 views. The tables have numeric indexes on the fields used in joins. I also have a web interface for browsing and searching the parsed data (for searching I use Sphinx with delta indexes), written in CodeIgniter.
Server configuration:
CPU: Type Xeon Quad Core X3440 2.53GHz
RAM: 4 GB
HDD: 1TB SATA
OS: Ubuntu Server 10.04
Some mysql config:
key_buffer = 256M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 128
max_connections = 400
table_cache = 64
query_cache_limit = 2M
query_cache_size = 128M
The Java VM runs with default parameters except for the following options: -Xms1024m -Xmx1536m -XX:-UseGCOverheadLimit -XX:NewSize=500m -XX:MaxNewSize=500m -XX:SurvivorRatio=6 -XX:PermSize=128M -XX:MaxPermSize=128m -XX:ErrorFile=/var/log/java/hs_err_pid_%p.log
When the database was empty, the scraper processed 18 urls per second and was stable enough. But after 2 weeks, when the urls table contained 384,929 records (~25% of all processed urls) and took up 8.2 GB, the Java application began working very slowly and crashing every 1-2 minutes. I guess the reason is MySQL, which cannot handle the growing load (the parser performs 2 + 4*BLOCK_NUMBER queries per processed url; Sphinx updates its delta indexes every 10 minutes; I don't count the web interface, because it's used by only one person). Maybe it rebuilds indexes very slowly? But the MySQL and scraper logs (which also contain all uncaught exceptions) are empty. What do you think?
I'd recommend running the following just to check a few status things; posting the output here would help as well:
dmesg
top (check resident vs virtual memory per process)
So the application becomes non-responsive? (That's not the same as a crash at all.) I would check that all your resources are free, e.g. run jstack to see whether any threads are tied up.
Check in MySQL that you have the expected number of connections. If you continuously create connections in Java and never clean them up, the database will run slower and slower.
Thank you all for your advice; MySQL was indeed the cause of the problem. After enabling the slow query log in my.cnf, I saw that one of the queries, executed on every iteration, took 300s (one field used for searching was not indexed).
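For completeness, a sketch of the two-step fix over JDBC: turn on the slow query log at runtime and index the offending field. The table and column names here are hypothetical; both server variables are dynamic in MySQL 5.1, so no restart is needed:

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class SlowQueryFix {

    // Build the CREATE INDEX statement for the unindexed search field.
    static String createIndexSql(String table, String column) {
        return "CREATE INDEX idx_" + table + "_" + column
             + " ON " + table + " (" + column + ")";
    }

    // Sketch: enable the slow query log at runtime, then add the index.
    static void fix(Connection con) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.execute("SET GLOBAL slow_query_log = 'ON'");
            st.execute("SET GLOBAL long_query_time = 1");
            // "urls" and "host" are placeholder names for the real table/field.
            st.execute(createIndexSql("urls", "host"));
        }
    }
}
```

With long_query_time set to 1, any statement running longer than one second lands in the log, which is how a 300s query per iteration shows up immediately.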