Oracle from Java: does selecting all rows of a table with a BLOB column affect performance?

I'm developing a Java SE app for a vet.
I have a table named pets; each pet has a photo stored in a BLOB column. The question is: when selecting all pets, does the photo column affect the performance of the app?
In the app I map the query result to Pet objects. I'm thinking that with many rows, the photos loaded into memory will affect performance.

In general, it should not affect performance. Much. A blob is ultimately just a pointer to an IO stream. The performance hit comes in when you actually start loading the photos, which will start doing IO operations. The question is, when do you actually need the image data?
As Thorbjorn pointed out, you probably shouldn't put the blobs in the main table. Either the system has to maintain the table of blob pointers, or your object will have to load image data that it may not need yet. Better to have a separate table with an "imageID" column or somesuch. Then add a "Pet.loadImage()" method, or maybe an event trigger, that will load the image as needed.
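A minimal sketch of the "Pet.loadImage()" idea, assuming a hypothetical loader callback that stands in for the real database access. In an actual app the callback might run something like `SELECT photo FROM pet_photos WHERE pet_id = ?` over JDBC; that table and its column names are assumptions, not something from the question.

```java
import java.util.function.LongFunction;

// Hypothetical Pet class: the photo is fetched only on first request,
// so listing many pets never pulls image bytes into memory.
final class Pet {
    private final long id;
    private final String name;
    // Stand-in for the database access (assumption): maps a pet id to photo bytes.
    private final LongFunction<byte[]> photoLoader;
    private byte[] photo; // stays null until loadImage() returns data

    Pet(long id, String name, LongFunction<byte[]> photoLoader) {
        this.id = id;
        this.name = name;
        this.photoLoader = photoLoader;
    }

    long getId() { return id; }

    String getName() { return name; }

    boolean isImageLoaded() { return photo != null; }

    // Runs the loader on first use and keeps the result for later calls.
    byte[] loadImage() {
        if (photo == null) {
            photo = photoLoader.apply(id);
        }
        return photo;
    }
}
```

The query that lists pets can then select only the small columns (id, name and so on), and the IO cost of a photo is paid only for the pets whose image is actually displayed.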

Related

Dynamic Switching Of Column based on Size

We are not sure how large the image files uploaded by users will be.
For now, we plan to use a MEDIUMBLOB column, with MySQL as the database.
If a customer uploads more than 16MB, the column should automatically be changed to LONGBLOB to accommodate the file. This is to cover an edge case, and we are not sure it will ever happen.
What is the best way to support or achieve this?
Thanks.

The best way to achieve this is to define your database fields properly. You don't switch field types just because a file is larger than the datatype allows; that's bad design. Also, there are really no disadvantages to using a LONGBLOB from the start. If you want to limit the amount of data put into your database, you should enforce that limit on the Java side rather than rely on the database to do so.
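Enforcing the limit on the Java side could look roughly like this. It is only a sketch: the class name and the 16 MiB figure are assumptions chosen for illustration, and the real limit would come from your own requirements.

```java
// Hypothetical upload guard: reject oversized files in Java before they reach the database.
final class UploadLimit {
    // Assumed application limit of 16 MiB; pick whatever the business rule requires.
    static final long MAX_UPLOAD_BYTES = 16L * 1024 * 1024;

    private UploadLimit() { }

    // Returns true when the upload is non-empty and within the configured limit.
    static boolean isAcceptable(long sizeInBytes) {
        return sizeInBytes > 0 && sizeInBytes <= MAX_UPLOAD_BYTES;
    }

    // Throws if the upload should be rejected, so callers can fail fast.
    static void check(long sizeInBytes) {
        if (!isAcceptable(sizeInBytes)) {
            throw new IllegalArgumentException(
                "Upload of " + sizeInBytes + " bytes is outside the allowed range (1.."
                    + MAX_UPLOAD_BYTES + ")");
        }
    }
}
```

Calling `UploadLimit.check(file.length())` before the insert keeps the column type fixed and turns an oversized upload into an ordinary validation error.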

database or ObjectOutputStream, Object specific member or actual object for reference

I'm working on an application for a pharmacy. Basically, this application has a class "item" and another class "selling invoices" which logs sales.
So my question is: the pharmacy is expected to have about ten thousand products in stock, and I'm storing these products in a linked list of type Item, and storing the invoices in a linked list as well. On closing the app I save them using an ObjectOutputStream and reload them on start-up. Is this bad practice? Should I use a database instead?
My second question is: if I continue using a LinkedList and ObjectOutputStream, which is better for performance and memory: storing the actual item as a field in the invoice class, or storing just its ID and looking the item up by that ID when needed?
Thanks in advance.

It is a bad idea to use ObjectOutputStream like that.
Here are some of the reasons:
If your application crashes (or the power fails) before you "save", then all changes are lost.
Saving "all objects" is expensive.
Serialized objects are opaque. It is only practical to look at them from Java code.
Serialized objects are fragile. If your application classes change, you may find that old serialized objects can no longer be read. That's bad enough, but now consider what happens if your client wants to look at pharmacy records from 5 years ago ... from a backup tape.
Serialized objects provide no way of searching ... apart from reading all of the objects one at a time.
Designs which involve reading all objects into memory do not scale. You are liable to run out of memory. Or compromise on your requirements to avoid running out of memory.
By contrast:
A database won't lose any changes that have been committed. Databases are much more resilient to things like application errors and system-level failures.
Committing database changes is not as expensive, because you only write data that has changed.
Typical databases can be viewed, queried, and if necessary repaired using an off-the-shelf database tool.
Changing Java code doesn't break the database. And for some schema changes, there are ways to migrate the database schema and records to match an updated database.
Databases have indexes and query languages for implementing efficient search.
Databases scale because the primary copy of the data is on disk, not in memory.
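To illustrate the "no way of searching" point, here is a minimal sketch, assuming a hypothetical Item class and an in-memory byte array standing in for the save file. Finding a single item means deserializing the whole list first.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stock item, used only for this illustration.
final class Item implements Serializable {
    private static final long serialVersionUID = 1L;
    final int id;
    final String name;

    Item(int id, String name) {
        this.id = id;
        this.name = name;
    }
}

final class SerializedStock {
    private SerializedStock() { }

    // Writes the whole list in one go, as the question describes doing on shutdown.
    static byte[] save(List<Item> items) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new LinkedList<>(items));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Looking up one item still requires reading every object back into memory.
    @SuppressWarnings("unchecked")
    static Item findById(byte[] saved, int id) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(saved))) {
            List<Item> all = (List<Item>) in.readObject();
            for (Item item : all) {
                if (item.id == id) {
                    return item;
                }
            }
            return null; // not found, after scanning everything
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With a database the same lookup is a single indexed query, and nothing else has to be loaded.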

In populating an ObservableList, do I have to load ALL the records from my database?

So I'm porting my Swing Java database application to JavaFX (still a beginner here, I recently just learned the basics of FXML and the MVC pattern, so please bear with me).
I intend to load the data from my existing database to the "students" ObservableList so I can show it on a TableView, but on my original Swing code, I have a search TextField, and when the user clicks on a button or presses Enter, the program:
Executes an SQLite command that searches for specific records, and retrieves the RecordSet.
Creates a DefaultTableModel based on the RecordSet contents
And throws that TableModel to the JTable.
However, JavaFX is a completely different beast (or at least it seems so to me--don't get me wrong, I love JavaFX :D ) so I'm not sure what to do.
So, my question is, do I have to load ALL the students in the database, then use some Java code to filter out students that don't fit the search criteria (and display all students when the search text is blank), or do I still use SQLite in filtering and retrieving records (which means I need to clear the list then add students every time a search is performed, and maybe it will also mess up with the bindings? Maybe there will be a speed penalty on this method also? Besides that, it will also reset the currently selected record because I clear the list--basically, bad UI design and will negatively impact the usability)
Depending on the right approach, there is also a follow-up question (sorry, I really can't find the answer to these even after Googling):
If I get ALL students from database and implement a search feature in Java, won't it use up more RAM than it should, because I am storing ALL the database data in RAM, instead of just the ones searched for? I mean, sure, even my lowly laptop has 4GB RAM, but the feeling of using more memory than I should makes me feel somewhat guilty LOL
If I choose to just update the contents of the ObservableList every time a new search has been performed, will it mess up with the bindings? Do I have to set up bindings again? How do I clear the contents of the ObservableList before adding the new contents?
I also have the idea of just setting the selected table item to the first record that matches the search string but I think it will be difficult to use, since only one record can be highlighted per search. Even if we highlight multiple rows, it'd be difficult to browse all selected items.
Please give me the proper way, not the "easy" way. This is my first time implementing a pattern (MVC or am I actually doing MVP, I don't know) and I realized how unmaintainable and ugly my previous programs are because I used my own style. This is a relatively big project that I need to support and improve for several years so having clean code and doing stuff the right way should help in maintaining the functionality of this program.
Thank you very much in advance for your help, and I hope I don't come off as a "dumb person who can't even Google" in asking these questions. Please bear with me here.

Basic design tradeoffs
You can, of course, do this either of the ways you describe. The basic tradeoffs are:
If you load everything from the database, and filter the table in Java, you use more memory (though not as much as you might think, as explained below)
If you filter from the database and reload every time the user changes the filter, there will be a bigger latency (delay) in displaying the data, as a new query will be executed on the database, with (usually) network communication between the database and the application being the biggest bottleneck (though there are others).
Database access and concurrency
In general, you should perform database queries on a background thread (see Using threads to make database requests); if you are frequently making database queries (i.e. filtering via the database), this gets complex and involves frequently disabling controls in the UI while a background task is running.
TableView design and memory management
The JavaFX TableView is a virtualized control. This means that the visual components (cells) are created only for visible elements (plus, perhaps, a small amount of caching). These cells are then reused as the user scrolls around, displaying different "items" as required. The visual components are typically quite memory-consumptive (they have hundreds of properties - colors, font properties, dimensions, layout properties, etc etc - most of which have CSS representations), so limiting the number created saves a lot of memory, and the memory consumption of the visible part of the table view is essentially constant, no matter how many items are in the table's backing list.
General memory consumption computations
The items observable list that forms the table's backing list contains only the data: it is not hard to ballpark-estimate the amount of memory consumed by a list of a given size. Strings use up to 2 bytes per character (on Java 9 and later, strings containing only Latin-1 characters use 1 byte per character), plus a small fixed overhead; doubles use 8 bytes, ints use 4 bytes, etc. If you wrap the fields in JavaFX properties (which is recommended), there will be a few bytes of overhead for each; each object has an overhead of ~16 bytes, and references themselves typically use up to 8 bytes. So a typical Student object that stores a few string fields will usually consume on the order of a few hundred bytes in memory. (Of course, if each has an image associated with it, for example, it could be a lot more.) Thus if you load, say, 100,000 students from a database, you would use on the order of 10-100MB of RAM, which is pretty manageable on most personal computer systems.
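The arithmetic can be written down as a small helper. This is a back-of-the-envelope sketch only: the 40-byte fixed cost per String is an assumption for a 64-bit JVM, JavaFX property wrappers are ignored, and the class and method names are made up for illustration.

```java
// Back-of-the-envelope heap estimate for a list of simple objects.
// All constants are rough assumptions for a 64-bit JVM, not measured values.
final class StudentMemoryEstimate {
    static final long OBJECT_HEADER_BYTES = 16;   // per-object overhead
    static final long REFERENCE_BYTES = 8;        // upper bound for one reference
    static final long STRING_OVERHEAD_BYTES = 40; // assumed fixed cost of a String and its backing array
    static final long BYTES_PER_CHAR = 2;         // worst case; Latin-1 text may use 1 on Java 9+

    private StudentMemoryEstimate() { }

    // Estimated bytes for one object holding `stringFields` strings of `avgChars` characters each.
    static long bytesPerStudent(int stringFields, int avgChars) {
        long perString = REFERENCE_BYTES + STRING_OVERHEAD_BYTES + BYTES_PER_CHAR * avgChars;
        return OBJECT_HEADER_BYTES + stringFields * perString;
    }

    // Estimated bytes for `count` such objects, plus one list reference per element.
    static long totalBytes(long count, int stringFields, int avgChars) {
        return count * (bytesPerStudent(stringFields, avgChars) + REFERENCE_BYTES);
    }
}
```

With three string fields of 20 characters each, `bytesPerStudent(3, 20)` gives 280 bytes and `totalBytes(100_000, 3, 20)` gives 28,800,000 bytes, i.e. roughly 29MB, which sits inside the 10-100MB range.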
Rough general guidelines
So normally, for the kind of application you describe, I would recommend loading what's in your database and filtering it in memory. In my usual field of work (genomics), where we sometimes need 10s or 100s of millions of entities, this can't be done. (If your database contains, say, all registered students in public schools in the USA, you may run into similar issues.)
As a general rule of thumb, though, for a "normal" object (i.e. one that doesn't have large data objects such as images associated with it), your table size will be prohibitively large for the user to comfortably manage (even with filtering) before you seriously stretch the memory capacity of the user's machine.
Filtering a table in Java (all objects in memory)
Filtering in code is pretty straightforward. In brief, you load everything into an ObservableList, and wrap the ObservableList in a FilteredList. A FilteredList wraps a source list and a Predicate, which returns true if an item should pass the filter (be included) or false if it is excluded.
So the code snippets you would use might look like:
ObservableList<Student> allStudents = loadStudentsFromDatabase();
FilteredList<Student> filteredStudents = new FilteredList<>(allStudents);
studentTable.setItems(filteredStudents);
And then you can modify the predicate based on a text field with code like:
filterTextField.textProperty().addListener((obs, oldText, newText) -> {
    if (newText.isEmpty()) {
        // no filtering:
        filteredStudents.setPredicate(student -> true);
    } else {
        filteredStudents.setPredicate(student ->
            // whatever logic you need:
            student.getFirstName().contains(newText) || student.getLastName().contains(newText));
    }
});
This tutorial has a more thorough treatment of filtering (and sorting) tables.
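Since the predicate is an ordinary java.util.function.Predicate, the matching logic can also be factored out and exercised without any JavaFX classes. A minimal sketch, assuming a hypothetical cut-down Student class and a made-up StudentFilters helper; the case-insensitive matching is an added design choice, not part of the snippet above.

```java
import java.util.Locale;
import java.util.function.Predicate;

// Hypothetical minimal Student; the real class would likely expose JavaFX properties too.
final class Student {
    private final String firstName;
    private final String lastName;

    Student(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    String getFirstName() { return firstName; }

    String getLastName() { return lastName; }
}

final class StudentFilters {
    private StudentFilters() { }

    // Builds a case-insensitive "name contains text" predicate.
    // A null or blank search text matches every student (no filtering).
    static Predicate<Student> nameContains(String searchText) {
        if (searchText == null || searchText.trim().isEmpty()) {
            return student -> true;
        }
        String needle = searchText.trim().toLowerCase(Locale.ROOT);
        return student ->
            student.getFirstName().toLowerCase(Locale.ROOT).contains(needle)
                || student.getLastName().toLowerCase(Locale.ROOT).contains(needle);
    }
}
```

The listener body then reduces to `filteredStudents.setPredicate(StudentFilters.nameContains(newText));`.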
Comments on implementing "filtering via queries"
If you don't want to load everything from the database, then you skip the filtered list entirely. Querying the database will almost certainly not work fast enough to filter (using a new database query) as the user types, so you would need an "Update" button (or action listener on the text field) which recomputed the new filtered data. You would probably need to do this in a background thread too. You would not need to set new cellValueFactorys (or cellFactorys) on the table's columns, or reload the columns; you would just call studentTable.setItems(newListOfStudents); when the database query finished.

How to store database data with lots of attributes into cache?

Let's say that I have a table with columns TABLE_ID, CUSTOMER_ID, ACCOUNT_NUMBER, PURCHASE_DATE, PRODUCT_CATEGORY, PRODUCT_PRICE.
This table contains all purchases made in some store.
Please don't concentrate on changing the database model (there are obvious improvement possibilities) because this is a made-up example and I can't change the actual database model, which is far from perfect.
The only thing I can change is the code which uses the already existing database model.
Now, I don't want to access the database all the time, so I have to store the data into cache and then read it from there. The problem is, my program has to support all sorts of things:
What is the total value of purchases made by customer X on date Y?
What is the total value of purchases made for products from category X?
Give me a list of total amounts spent grouped by customer_id.
etc.
I have to be able to preserve this hierarchy in my cache.
One possible solution is to have a map inside a map inside a map... etc.
However, that gets messy very quickly, because I need an extra nesting level for every attribute in the table.
Is there a smarter way to do this?

Have you already established that you need a cache? Are you sure the performance of your application requires it? The database itself can optimize queries, have things in memory, etc.
If you're sure you need a cache, you also need to think about cache invalidation: is the data changing from beneath your feet, i.e. is another process changing the data in the database, or is the database data immutable, or is your application the only process modifying your data?
What do you want your cache to do? Just keep track of queries and results that have been requested, so that the second time a query is run you can return the result from the cache? Or do you want to aggressively precalculate some aggregates? Can the cache data fit into your app's memory, or do you want to use ReferenceMaps, for example, that shrink when memory gets tight?
For your actual question, why do you need maps inside maps? You probably should design something that's closer to your business model, and store objects that represent the data in a meaningful way. You could have each query (PurchasesByCustomer, PurchasesByCategory) represented as an object and store them in different maps so you get some type safety. Similarly don't use maps for the result but the actual objects you want.
Sorry, your question is quite vague, but hopefully I've given you some food for thought.
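One way to avoid nested maps is a flat map per kind of question, with a small key object where two attributes are combined. This is a sketch under assumptions: the Purchase, CustomerDay and PurchaseTotalsCache classes are hypothetical, prices are held in cents, and cache invalidation is left out entirely.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical row object for the purchases table; prices are in cents to avoid rounding issues.
final class Purchase {
    final long customerId;
    final LocalDate purchaseDate;
    final String productCategory;
    final long priceCents;

    Purchase(long customerId, LocalDate purchaseDate, String productCategory, long priceCents) {
        this.customerId = customerId;
        this.purchaseDate = purchaseDate;
        this.productCategory = productCategory;
        this.priceCents = priceCents;
    }
}

// Composite key that replaces a map-inside-a-map for "customer X on date Y".
final class CustomerDay {
    final long customerId;
    final LocalDate date;

    CustomerDay(long customerId, LocalDate date) {
        this.customerId = customerId;
        this.date = date;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof CustomerDay)) {
            return false;
        }
        CustomerDay that = (CustomerDay) other;
        return customerId == that.customerId && Objects.equals(date, that.date);
    }

    @Override
    public int hashCode() {
        return Objects.hash(customerId, date);
    }
}

// Precomputed totals: one flat map per kind of question.
final class PurchaseTotalsCache {
    private final Map<Long, Long> byCustomer = new HashMap<>();
    private final Map<String, Long> byCategory = new HashMap<>();
    private final Map<CustomerDay, Long> byCustomerDay = new HashMap<>();

    PurchaseTotalsCache(List<Purchase> purchases) {
        for (Purchase p : purchases) {
            byCustomer.merge(p.customerId, p.priceCents, Long::sum);
            byCategory.merge(p.productCategory, p.priceCents, Long::sum);
            byCustomerDay.merge(new CustomerDay(p.customerId, p.purchaseDate), p.priceCents, Long::sum);
        }
    }

    long totalForCustomer(long customerId) {
        return byCustomer.getOrDefault(customerId, 0L);
    }

    long totalForCategory(String category) {
        return byCategory.getOrDefault(category, 0L);
    }

    long totalForCustomerOn(long customerId, LocalDate date) {
        return byCustomerDay.getOrDefault(new CustomerDay(customerId, date), 0L);
    }
}
```

Adding another kind of question means adding another map and key type, rather than another nesting level.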

Should I always retrieve full object from a database?

This is a very simple question that applies to programming web interfaces with Java. Say I am not using an ORM (or even if I am using one), and let's say I've got this Car (id, name, color, type, blah, blah) entity in my app and I have a CAR table to represent this entity in the database. So, say I need to update only a subset of fields on a bunch of cars. I understand that the typical flow would be:
A DAO class (CarDAO) - getCarsForUpdate()
Iterate over all Car objects, update just the color to say green or something.
Another DAO call to updateCars(Cars cars).
Now, isn't this a little beating around the bush for what would be a simple select and update query? In the first step above, I would be retrieving the entire object data from the database: "select id,name,color,type,blah,blah.. from CAR where .." instead of "select id,color from CAR where ...". So why should I retrieve those extra fields when, after the DAO call, I would never use anything other than "color"? The same applies to step 3. Or, say I query just for the id and color (select id,color) and create a Car object with only id and color populated: that is perfectly OK, isn't it? The Car object is anemic anyway.
Doesn't all this (object oriented-ness) seem a little fake?

For one, if the RDBMS can handle your queries, I would prefer to let it. The reason is that you don't want your JVM to do all the work, especially when running an enterprise application (where you have many concurrent connections needing the same resource).
If you specifically want to update an object in the database (e.g. set the car colour to green), I would suggest SQL like
UPDATE CAR SET COLOR = 'GREEN';
(Notice I haven't used a WHERE clause.) This updates every row in the CAR table, and I didn't need to pull every Car object, call setColor("Green") and do an update.
In short, what I'm trying to say is: apply engineering knowledge. Your DAO should simply do fast selects, updates, etc. and let all the SQL "work" be handled by the RDBMS.
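From Java, a set-based update like this is a single JDBC call. The sketch below is hypothetical: the CarDao class is made up, and restricting the update by a TYPE column is an assumption added so that only a subset of cars changes; the CAR table and COLOR column follow the question.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical DAO: changes the colour in one statement instead of load-modify-save per object.
final class CarDao {
    // Table and column names follow the question; filtering by TYPE is an assumption.
    static final String UPDATE_COLOR_BY_TYPE = "UPDATE CAR SET COLOR = ? WHERE TYPE = ?";

    private final Connection connection;

    CarDao(Connection connection) {
        this.connection = connection;
    }

    // Returns the number of rows the database reports as updated.
    int updateColorForType(String newColor, String type) throws SQLException {
        try (PreparedStatement statement = connection.prepareStatement(UPDATE_COLOR_BY_TYPE)) {
            statement.setString(1, newColor);
            statement.setString(2, type);
            return statement.executeUpdate();
        }
    }
}
```

No Car objects are created at all; the database does the filtering and the update, and the DAO only reports how many rows changed.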

From my experience, what I can say is:
As long as you're not doing join operations, i.e. just querying columns from the same table, the number of columns you fetch has almost no effect on performance. What really affects performance is how many rows you get, and the where clause. Fetching 2 or 20 columns changes so little you won't see any difference.
The same goes for updating.

I think that in certain situations, it is useful to request a subset of the fields of an object. This can be a performance win if you have a large number of columns or if there are some large BLOB columns that would impact performance if they were hydrated. Although the database usually reads in an entire row of information whenever there is a match, it is typical to store BLOB and other large fields in different locations with non-trivial IO requirements.
It might also make sense if you are iterating across a large table and doing some sort of processing. Although the savings might be insignificant on a single row, it might be measurable across a large table.
Also, if you are only using fields that are in indexes, I believe that the row itself will never be read and it will use the fields from the index itself. Not sure in your example if color would be indexed however.
All this said, if you are only persisting objects that are relatively simple, without BLOB or other large database fields, then this could turn into premature optimization, since the query processing, row IO, JDBC overhead, and object creation are most likely going to take a lot more time than you would save by hydrating only a subset of the fields in the row. Converting database objects into the final Java class is typically a small portion of the load of each query.
