I have a performance problem with my program. I have a Java program that connects to a MySQL database and does some processing that depends on executing SELECT queries against the database.
The problem is that the process has to execute 130,000 SELECT queries on MySQL, and this takes a long time.
I have 10 minutes to complete the whole process.
Does anyone have an idea how to execute 130,000 SELECT queries in at most 10 minutes?
That's about 216 queries per second, which should be doable for simple queries. However, the simplest solution is probably to replace them with a single query that loads all the needed data (you're already assuming it fits in memory) and to do the processing in Java.
Since a HashMap lookup is orders of magnitude faster than any database round trip, the task becomes trivial and your machine gets bored.
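For illustration, a minimal JDBC sketch of that idea; the items table, its columns, and the connection details are placeholders, not from the question:

```java
import java.sql.*;
import java.util.HashMap;
import java.util.Map;

public class BulkLoadExample {
    public static void main(String[] args) throws SQLException {
        Map<Long, String> cache = new HashMap<>();

        // One SELECT instead of 130,000: load everything the processing step needs.
        // Requires the MySQL JDBC driver on the classpath.
        try (Connection con = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/mydb", "user", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT id, value FROM items")) {
            while (rs.next()) {
                cache.put(rs.getLong("id"), rs.getString("value"));
            }
        }

        // The 130,000 per-row queries now become in-memory lookups.
        String value = cache.get(42L);   // replaces: SELECT value FROM items WHERE id = 42
        System.out.println(value);
    }
}
```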
One approach is to use an in-memory database like H2 or HSQLDB. This solution only works if you can pre-load the data into memory, and other factors also need to be considered, such as the volume of data, how frequently the data changes in the database, etc.
If this is possible, then you can query directly in memory, which is always faster. The steps are (a sketch follows the list):
1. Identify the important data to be loaded into memory
2. Create the corresponding table structure in the in-memory DB (H2 or HSQLDB)
3. Load that data from your actual MySQL database and insert it into the in-memory DB
4. Run your queries against that DB
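A rough sketch of those steps, assuming the H2 driver is on the classpath; the items table, its columns, and the connection details are placeholders:

```java
import java.sql.*;

public class InMemoryCopyExample {
    public static void main(String[] args) throws SQLException {
        // In-memory H2 database plus the source MySQL connection.
        try (Connection h2 = DriverManager.getConnection("jdbc:h2:mem:cache;DB_CLOSE_DELAY=-1");
             Connection mysql = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/mydb", "user", "password")) {

            // Step 2: create the table structure in the in-memory DB.
            try (Statement st = h2.createStatement()) {
                st.execute("CREATE TABLE items (id BIGINT PRIMARY KEY, value VARCHAR(255))");
            }

            // Step 3: copy the important data from MySQL into the in-memory DB.
            try (Statement src = mysql.createStatement();
                 ResultSet rs = src.executeQuery("SELECT id, value FROM items");
                 PreparedStatement ins = h2.prepareStatement("INSERT INTO items VALUES (?, ?)")) {
                while (rs.next()) {
                    ins.setLong(1, rs.getLong(1));
                    ins.setString(2, rs.getString(2));
                    ins.addBatch();
                }
                ins.executeBatch();
            }

            // Step 4: run the many SELECTs against H2 instead of MySQL.
            try (PreparedStatement ps = h2.prepareStatement("SELECT value FROM items WHERE id = ?")) {
                ps.setLong(1, 42L);
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        System.out.println(rs.getString(1));
                    }
                }
            }
        }
    }
}
```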
I need to perform an automated housekeeping task, and this is my query:
delete from sample_table where id = '1'
This scheduled query gets executed from multiple service instances.
Will this have a significant performance impact? What would be an appropriate way of testing this?
Issuing multiple deletes for the same partition can have a significant impact on your cluster.
Remember that all writes in Cassandra (INSERT, UPDATE, DELETE) are inserts under the hood. Since Cassandra does not perform a read-before-write (with the exception of lightweight transactions), issuing a DELETE will insert a tombstone marker regardless of whether the data exists or has already been deleted.
Every single DELETE you issue counts as a write request so depending on how busy your cluster is, it may have a measurable impact on its performance. Cheers!
Erick's answer is pretty solid, but I'd just like to add that where you'll most likely see performance issues is at read time. That's because doing a:
SELECT * FROM sample_table WHERE id='1';
...will read ALL of the tombstones written by those DELETEs from the SSTable files. The default settings on a table keep deleted data around for 10 days (to ensure proper replication) before it can be picked up by compaction.
So figure out how many times that DELETE happens per key over a 10-day period, and that's roughly how many tombstones Cassandra will have to reconcile at read time.
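As a back-of-the-envelope check, you can compare that count against Cassandra's default tombstone_warn_threshold of 1,000 tombstones per read; the instance count and schedule below are made-up assumptions, not numbers from the question:

```java
public class TombstoneEstimate {
    public static void main(String[] args) {
        // Hypothetical numbers -- adjust to your actual deployment.
        int serviceInstances = 5;           // instances running the housekeeping job
        int deletesPerDayPerInstance = 24;  // e.g. the job runs hourly on each instance
        int gcGraceDays = 10;               // default gc_grace_seconds is 10 days

        long tombstonesPerKey = (long) serviceInstances * deletesPerDayPerInstance * gcGraceDays;
        System.out.println("Tombstones a read of that key may scan: " + tombstonesPerKey);

        // Cassandra's defaults: tombstone_warn_threshold = 1,000 per read,
        // tombstone_failure_threshold = 100,000 per read.
        if (tombstonesPerKey > 1_000) {
            System.out.println("Above the default tombstone_warn_threshold (1000)");
        }
    }
}
```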
I am trying to retrieve the new inserts from a table in my Java client application (Spring JMS), do some processing, and send them to a message broker. I do not have access to any CDC tool like GoldenGate, and I only need the new inserts, not the updates or deletes. I am having difficulty finding a way to do this. Is there a way? I read that there is an option to do this with triggers, but will that put a high load on the DB, given that this table gets a lot of inserts (approximately 50K records inserted in a day)?
Thanks in advance
50,000 rows per day is actually rather a low volume. Some data warehouse tables get 50 million rows a day, so an insert trigger is unlikely to make any appreciable difference to the loading job. Add a date column (e.g. LOAD_DATE) and have a before-insert trigger assign :new.LOAD_DATE := SYSDATE.
That being said, if you want to avoid a trigger, you can modify the loading job itself to populate such a date column with SYSDATE.
With either method, your retrieval is then simple: each day as you retrieve records, record the maximum LOAD_DATE value you retrieved. The next day, pull only records with a LOAD_DATE >= that value.
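A minimal JDBC sketch of that retrieval loop, assuming a hypothetical source_table with the LOAD_DATE column described above (the table and column names, and how the watermark is persisted, are placeholders):

```java
import java.sql.*;

public class NewInsertPoller {
    // Watermark: the highest LOAD_DATE seen so far (persist this between runs in practice).
    private Timestamp lastLoadDate = Timestamp.valueOf("1970-01-01 00:00:00");

    public void pollOnce(Connection con) throws SQLException {
        String sql = "SELECT id, payload, load_date FROM source_table "
                   + "WHERE load_date >= ? ORDER BY load_date";
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setTimestamp(1, lastLoadDate);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Process the row / hand it to the JMS producer here.
                    Timestamp loadDate = rs.getTimestamp("load_date");
                    if (loadDate.after(lastLoadDate)) {
                        lastLoadDate = loadDate;   // advance the watermark
                    }
                }
            }
        }
        // Note: with >= the boundary rows are re-read on the next poll, so downstream
        // processing should be idempotent or deduplicate by id.
    }
}
```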
I have to build a search functionality where the GUI will provide a search field to search objects in an Oracle database. There are currently 30K objects to search, but they will grow in number over time, by roughly 300-400 per month.
As part of the requirement, when the user types any text into the search field, for example "ABC", all objects in the DB that contain ABC should appear in a datatable, as if the system were predicting results based on what the user has typed.
The question is how to architect such a feature.
A simple way to do this is to load everything into a GUI JavaScript object and run the search on it. Since JS is ridiculously fast, performance won't be an issue.
Another way is to run a query against the database every time the user types text into the search field. This does not seem convenient, as it will put unnecessary load on the database.
Is there a better way to architect this feature? Please share your thoughts.
Premature optimization is seldom useful.
Growth of 300-400 objects per month on a 30K base is nothing at all for any DB to handle.
Loading all 30K objects at once in the browser is awful and may hurt performance, while querying the DB will not have this problem until you have lots and lots of users accessing it.
You should build the service using the database, and then if/when you reach a bottleneck you can think about optimization tricks such as caching frequent queries on the database.
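For illustration, a minimal sketch of the database-backed search; the objects table and its name column are placeholders, and Oracle's ROWNUM is used to cap the result size for the typeahead:

```java
import java.sql.*;
import java.util.ArrayList;
import java.util.List;

public class ObjectSearch {
    // Returns up to 10 object names containing the search term (case-insensitive).
    public List<String> search(Connection con, String term) throws SQLException {
        String sql = "SELECT name FROM objects "
                   + "WHERE UPPER(name) LIKE UPPER(?) AND ROWNUM <= 10";
        List<String> matches = new ArrayList<>();
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "%" + term + "%");   // bind variable avoids SQL injection
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    matches.add(rs.getString(1));
                }
            }
        }
        return matches;
    }
}
```

With only 30K rows, a full scan behind this query is cheap, which is the point of the answer: start with the database and optimize later if needed.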
In my web application I have an employee table with employee id, name, designation, salary, etc. as attributes, which may contain thousands of records. I want to search by employee name. Which will work faster: hitting the DB every time, or creating a list of employee names once in a Java bean and iterating over it for every search? Which one is better?
By far, even if you have millions of records, it is better to hit the database per request. To speed this up, you can add a key/index on the name field of your employee table and the requests will be faster.
If the data in your employee table doesn't vary that much, you have another option, which is using a cache for the employee table. With this, access to the data will be even faster, since it looks up the employee in the cache (usually RAM), but it comes with a more complex design: adding policies for cache retrieval and setting periods to refresh the cached data.
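A minimal sketch of that caching option, assuming a hypothetical employee table and an arbitrary refresh interval (the table, columns, and interval are placeholders):

```java
import java.sql.*;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

public class EmployeeNameCache {
    private final Map<Long, String> namesById = new ConcurrentHashMap<>();
    private volatile long lastRefreshMillis = 0;
    private static final long REFRESH_INTERVAL_MILLIS = 5 * 60 * 1000; // refresh every 5 minutes

    // Reloads the cache from the database when it is stale.
    public void refreshIfStale(Connection con) throws SQLException {
        if (System.currentTimeMillis() - lastRefreshMillis < REFRESH_INTERVAL_MILLIS) {
            return;
        }
        Map<Long, String> fresh = new HashMap<>();
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT employee_id, name FROM employee")) {
            while (rs.next()) {
                fresh.put(rs.getLong(1), rs.getString(2));
            }
        }
        namesById.clear();
        namesById.putAll(fresh);
        lastRefreshMillis = System.currentTimeMillis();
    }

    // Case-insensitive name search against the cached copy (no DB hit).
    public List<String> searchByName(String term) {
        List<String> matches = new ArrayList<>();
        for (String name : namesById.values()) {
            if (name.toLowerCase().contains(term.toLowerCase())) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```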
This depends on a few things.
Hitting the DB is an I/O action, so if you have a specific screen/process that does many lookups in one flow, it is of course better to load the list from the DB once and use it several times. This assumes you can be sure the employee list won't be changed in the DB by another process, or that it can change and that is not critical for you.
If the screen/process makes only a few hits to get employees, it should hit the DB.
Remember that hitting the DB many times can also load it and make it slow; it can't handle an infinite number of requests.
Hope that helps
I want to move millions of records from SQL Server to Oracle in Java, so the logic is:
1. Select all data from SQL Server for the specified date range,
2. Insert it into Oracle one by one,
3. Delete the data in SQL Server.
But as the data volume is very large, I'm afraid the process will take too much time, so I want to:
1. Use multiple threads to read data from SQL Server (split the specified date range into smaller ranges),
2. Use multiple threads to insert data into Oracle.
But I'm not sure multiple threads can solve the issue.
I wish to get some suggestions.
1) Dump the data into an intermediate file (CSV or fixed-width)
2) Use SQL*Loader (sqlldr) to import it
You will have to describe your dump file to sqlldr in a control file
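A rough sketch of step 1 in Java; the source table, columns, date range, JDBC URL, and the sqlldr invocation in the trailing comment are all placeholders/assumptions:

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;
import java.sql.*;

public class DumpToCsv {
    public static void main(String[] args) throws SQLException, IOException {
        String sql = "SELECT id, name, created_at FROM source_table "
                   + "WHERE created_at >= ? AND created_at < ?";
        try (Connection con = DriverManager.getConnection(
                     "jdbc:sqlserver://localhost:1433;databaseName=mydb", "user", "password");
             PreparedStatement ps = con.prepareStatement(sql);
             BufferedWriter out = new BufferedWriter(new FileWriter("dump.csv"))) {

            ps.setTimestamp(1, Timestamp.valueOf("2012-01-01 00:00:00"));
            ps.setTimestamp(2, Timestamp.valueOf("2012-02-01 00:00:00"));
            ps.setFetchSize(10_000);   // stream rows instead of holding them all in memory

            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Real data may need quoting/escaping if values can contain commas.
                    out.write(rs.getLong(1) + "," + rs.getString(2) + "," + rs.getTimestamp(3));
                    out.newLine();
                }
            }
        }
        // The resulting file can then be loaded on the Oracle side with something like:
        //   sqlldr userid=user/password@orcl control=load.ctl
        // where load.ctl describes the CSV layout for the target table.
    }
}
```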