getting data from a database vs getting data from a hash map

In my Android project I have to keep track of the details of a certain number of products. All the data on these products is stored in a SQLite database. I can use SELECT and UPDATE in SQLite to keep track of the product objects, so I can store the product details whenever they change.
Alternatively, I can load all the products into a hash map or a similar data structure at startup and keep track of the product objects there.
What matters to me is which of the two approaches is more efficient and productive. Can someone help me? Thank you!

This depends on the number of products. A HashMap resides in RAM, a database resides on the disk.
This also depends on the number of queries per second and the nature of the queries. Database developers have put a lot of effort to support indexing and filtering; if you need that, reuse is better than re-inventing.
Whatever approach you choose, you must remember that an Android application process may be killed at any moment (e.g. when memory is needed for another process), and your code is guaranteed only to receive onPause() (and not onDestroy() or onStop(), see the activity life cycle). If necessary, the application will be restarted later, but all data kept in RAM (and not saved) will be of course lost. Therefore, in either onPause() or in onSaveInstanceState() (what bundles are for) you must save the application state. In your case this may mean having to save the HashMap and all auxiliary data structures.
OTOH, if the number of products is small (and is expected to remain small), a database can be overkill; if you just need to choose one of 10 items, writing a loop is faster than doing all the database support.
One more important note: an Activity is a Controller from the MVC viewpoint (as for the View, you usually create an XML layout and reuse the existing framework classes, although you could program custom controls). An Activity is re-created each time the device orientation changes and in some other cases. But your data belongs to the Model part. It must survive the orientation change, so it must not be owned by an Activity.

To answer:
a. A HashMap will no doubt beat SQLite for in-memory operations, so it will run faster. But it will not give you consistency. Mobile is a volatile environment: the OS may kill an application in the background, or even in the foreground if its memory requirements cannot be met. In such a scenario you might lose all the important updates that were made but not yet committed to permanent storage.
b. SQLite is slow compared to a Map, but it is reliable. It makes sure all commits to your data are saved properly; even if the app crashes, the database guarantees that the data you committed before the crash can be restored. A Map certainly lacks this functionality.
c. Consider a scenario in which you load the data into a Map for faster operations and sync with the database whenever you record a change to your data. This scenario is very plausible and can perform excellently if designed properly.
To conclude, you should go with the DB + Map approach. It will keep your app running smoothly if a lot of database operations are involved. Just make sure to keep the data and the app separate; to elaborate, don't make your app dependent on loading all the data up front.
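The DB + Map combination from point c can be sketched roughly as below. This is only an illustration under assumptions: ProductStore, ProductCache and FakeStore are made-up names, product details are reduced to a String, and on Android the store would wrap the real SQLite calls instead of the in-memory stand-in used here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical persistence boundary; on Android this would wrap SQLite calls.
interface ProductStore {
    Map<Integer, String> loadAll();
    void save(int id, String details);
}

// Keeps a HashMap for fast reads and writes every change through to the store,
// so a killed process loses nothing that was already updated.
class ProductCache {
    private final ProductStore store;
    private final Map<Integer, String> products = new HashMap<>();

    ProductCache(ProductStore store) {
        this.store = store;
        products.putAll(store.loadAll()); // one bulk read at startup
    }

    String get(int id) {
        return products.get(id); // served from RAM, no query
    }

    void update(int id, String details) {
        store.save(id, details);   // persist first...
        products.put(id, details); // ...then update the in-memory copy
    }
}

// In-memory stand-in for the database, only here to make the sketch runnable.
class FakeStore implements ProductStore {
    final Map<Integer, String> rows = new HashMap<>();

    public Map<Integer, String> loadAll() {
        return new HashMap<>(rows);
    }

    public void save(int id, String details) {
        rows.put(id, details);
    }
}
```

Writing to the store before touching the map means the map never holds an update that the database does not have, which is the consistency concern raised in point a.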

Related

Java : relational database vs static variable

I have a web application in which I maintain many static Maps to store my relevant information. The application is deployed on a server, and every hit to the server-side Java code uses these maps to match a key, get the appropriate result, and send it back to the client. My code contains a rank-and-retrieval feature, so I have to read the entire keySet of each of these Maps.
My questions are:
1. Is working with static variables better than storing this data in a local embedded DB like Apache Derby and querying it?
2. This data is used very frequently. If I use a database, will that be a faster approach? Since I read the full key set, the WHERE clause may not come in handy for many operations.
3. How is the server's memory affected by holding data in static variables?
The number of maps is fixed, but the size of the maps keeps increasing. Please suggest the better solution.

If you want the data to be saved regularly, an embedded database like H2 makes sense. You then also have snapshots of the data, and development and structural changes become a bit safer.
A real database also has incredible power behind it: concurrency, caching and so on. An embedded (file-based) database less so.
The problem with maps is that extracting data can take several indirections. It is more versatile to have SQL queries with joins on the tables.
So SQL is more abstract (it does not prescribe the actual query implementation) and easier to test. SQL, for instance, relieves the developer of programming reports by hand.
So go for a database, IMHO, when you are really doing hard work.

What you might want to consider is storing the data in a map at the moment it is searched for.
For instance, if a user searches for something specific, that something is stored in the map so that the next user who searches for that gets the data directly from the map rather than the database.
There are some downsides though, as you need to make sure that if the data is changed on the database, the hashmap/cache should be cleared or updated with the new data, as to prevent feeding outdated data to the user.
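A minimal sketch of this search cache, including the invalidation step, could look like the following. SearchCache and its loader function are illustrative names rather than part of any framework; the loader stands in for whatever query actually reaches the database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Read-through cache: the first lookup of a key calls the loader (the database),
// later lookups are answered from the map until the key is invalidated.
class SearchCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // hypothetical database query

    SearchCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // Runs the loader only if the key is not cached yet.
        return cache.computeIfAbsent(key, loader);
    }

    // Call after the row changes in the database so stale data is not served.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Whoever updates the database also has to call invalidate for the affected key; forgetting that call is exactly the outdated-data problem described above.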
As for the impact on the server's memory, it depends on the size of the data you're storing. It's hard to give you a precise answer, but you can however test that on your own:
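Runtime runtime = Runtime.getRuntime();
// used heap = total heap - free heap; freeMemory() alone misleads when the heap grows
long memoryBefore = runtime.totalMemory() - runtime.freeMemory();
// populate your map
long memoryAfter = runtime.totalMemory() - runtime.freeMemory();
System.out.println(memoryAfter - memoryBefore);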
That should give you the amount of bytes used (more or less, depending on the operations you run between memoryBefore and memoryAfter, as you may have instantiated other classes/variables unrelated to the hashmap)

database or ObjectOutputStream, Object specific member or actual object for reference

I'm working on an application for a pharmacy. Basically, the application has a class "item" and another class "selling invoices" which logs sales.
The pharmacy is expected to have about ten thousand products in stock. I store these products in a linked list of type Item, and I store the invoices in a linked list as well. On closing the app I save them using an ObjectOutputStream and reload them on startup. Is this bad practice? Do I have to use a database instead?
My second question: if I keep using the linked list and ObjectOutputStream, which is better for performance and memory, storing the actual item as a field in the invoice class, or storing just its ID and looking the item up by that ID when needed?
Thanks in advance.

It is a bad idea to use ObjectOutputStream like that.
Here are some of the reasons:
If your application crashes (or the power fails) before you "save", then all changes are lost.
Saving "all objects" is expensive.
Serialized objects are opaque. It is only practical to look at them from Java code.
Serialized objects are fragile. If your application classes change, you may find that old serialized objects can no longer be read. That's bad enough, but now consider what happens if your client wants to look at pharmacy records from 5 years ago ... from a backup tape.
Serialized objects provide no way of searching ... apart from reading all of the objects one at a time.
Designs which involve reading all objects into memory do not scale. You are liable to run out of memory. Or compromise on your requirements to avoid running out of memory.
By contrast:
A database won't lose any changes that have been committed. Databases are much more resilient to things like application errors and system-level failures.
Committing database changes is not as expensive, because you only write data that has changed.
Typical databases can be viewed, queried, and if necessary repaired using an off-the-shelf database tool.
Changing Java code doesn't break the database. And for some schema changes, there are ways to migrate the database schema and records to match an updated database.
Databases have indexes and query languages for implementing efficient search.
Databases scale because the primary copy of the data is on disk, not in memory.

Collection processing or database request ? which one is better

This is my first post on stackoverflow, so please be nice to me :-)
So let me explain the context. I'm developing a web service with the standard layers (resources, services, DAO layer...). I use JPA with the Hibernate implementation to map my object model to the database.
For a parent class A and a child class B, most of the time when I want to find a B object in the collection, I use the Stream API to filter the collection based on what I want. My question is more general: is it better to search for an object by querying the database (from my point of view this is going to cause a lot of calls to the database, but it will use less CPU), or to do the opposite and search the model objects by processing the collection (this will cause fewer database calls, but more CPU work)?

If you consider latency, the database will always be slower.
So you gotta ask yourself some questions:
how far away is the database (latency)?
how big is the dataset?
How do I process them ?
do I have any major runtime issues ?
"from my point of view this gonna cause a lot of calls to the database but it's gonna use less CPU), or do the opposite by searching over the model object and process over collection (this gonna cause less database calls, but more CPU process)"
If so, your program is probably not written with performance in mind. I suggest you check the Big-O complexity of your code if you have any major runtime problems.
Your Question is very broad, so it's hard to tell you, for your use-case, which might be the best.

Use the database to return the data you need, and Java to perform any processing on it that would be complicated to do in a JPQL/SQL query.
Databases are designed to perform queries more efficiently than Java (streams or not).
Besides, fetching a lot of data from a database only to keep a part of it is not efficient.

The database is usually faster since it is optimized for requesting specific data. Usually one would add indexes to speed up querying on certain fields.
TLDR: Filter your data in the database and process it in Java.
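To see what an index buys you, here is a small in-memory analogy. It is not JPA or SQL: Child and ChildLookup are made-up names, and the Map only plays the role that an index on the id column plays inside the database. The sketch also assumes ids are unique, since Collectors.toMap throws on duplicate keys.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical child entity, reduced to two fields.
class Child {
    final long id;
    final String name;

    Child(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class ChildLookup {
    // Linear scan: inspects elements one by one, O(n) per search.
    static Optional<Child> scan(List<Child> children, long id) {
        return children.stream().filter(c -> c.id == id).findFirst();
    }

    // Build the "index" once: O(n) up front, then O(1) expected per lookup.
    static Map<Long, Child> indexById(List<Child> children) {
        return children.stream()
                .collect(Collectors.toMap(c -> c.id, Function.identity()));
    }
}
```

A stream filter over a loaded collection behaves like scan; a database query on an indexed column behaves more like the map lookup, and it avoids loading the rows you would discard anyway.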

This isn't an easy question to answer, since there are many different factors that would influence my decision to go to the db or not. First, I think it's fair to say that, for almost every app I've worked on in the past 20 years, hitting the DB for information is the default strategy. More recently (say past 10 or so years) data access through web service calls has become common as well.
For me, the main question would be something along the lines of, "Are there any situations when I would not hit an external resource (DB, Service, or even file read) for data every time I need it?"
So, I'll outline some of the things I would consider.
Is the data search space very small?
If you are searching a data space of tens of different records, then this information might be a candidate for non-db storage. On the other hand, once you get past a fairly small set of records, this approach becomes increasingly untenable. Examples of these "small sets" might be something like salutations (Mr., Ms., Dr., Mrs., Lord). I look for small sets of data that rarely change, which I, as a lazy developer, wouldn't mind typing into a configuration file. Once I get past something like 50 different records (like US States, for example), I want to pull that info from a DB or service call.
Are the data cacheable?
If you have multiple requests that could legitimately use the exact same data, then leverage caching in your application. Examine the data and expected usage of your service for opportunities to leverage regularities in data and likely requests to cache data whenever possible. Remember to consider cache keys, how long items should be cached, and when cached items should be evicted.
In many web usage scenarios, it's not uncommon for each display to include a fairly large amount of cached information and a small amount of dynamic data. Menu and other navigation items are good candidates for caching. User-specific data, such as contract-specific pricing in an eCommerce app, is often a poor candidate.
Can you pre-load some data into cache?
Some items can be read once and cached for the entire duration of your application. A list of US States and/or Canadian Provinces is a good example here. These almost never change, so once read from the db, you would rarely need to read them again. Consider application components that can load such data on startup, and then hold this data in an appropriate collection.
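For the "how long should items be cached" consideration, a time-based cache might be sketched like this. ExpiringCache is an illustrative name, the loader is a stand-in for the real DB or service call, and the clock is passed in only so the behaviour can be tested without waiting; in production code you would typically pass System::currentTimeMillis or reach for an existing caching library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Caches loaded values for a fixed time-to-live, then reloads them on demand.
class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAt;

        Entry(T value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final Function<K, V> loader; // hypothetical DB or service call
    private final long ttlMillis;
    private final LongSupplier clock;    // e.g. System::currentTimeMillis

    ExpiringCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now >= entry.expiresAt) {
            // Missing or expired: reload and restart the time-to-live.
            entry = new Entry<>(loader.apply(key), now + ttlMillis);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

Data that almost never changes, such as the list of US States, can simply use a very long time-to-live, which amounts to the pre-load-once behaviour described above.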

Java Application / ArrayList versus direct database queries

In short, I want to know how effective it is to use ArrayLists in Java to hold objects with a lot of data in them. How large can an ArrayList grow, and are there any issues with using an ArrayList to hold 2000+ customer details (objects) at runtime? Does it hurt performance in any way? Or is there a better way to design an app which needs to access data quickly?
I am developing a new module (a customer lead tracker) for my small ERM application, which also handles payroll details for a company. So far the data has not been huge, but with this module I expect the database to grow fast, and I will have to load 2000+ customer details from the database to perform different data manipulations and updates.
I would like suggestions as to which approach would be better:
Querying the customer database (100+ columns) and getting the data to work with for each transaction (a lot of separate queries for each).
Loading each row into an object, saving the objects in an ArrayList at the beginning, using the list to work with each row when required, and saving the objects (rows) at the end of a transaction.
Sorry if I have asked a dumb question. I am an independent developer just starting out, so this may sound a bit awkward from an experienced developer's perspective.

It depends on how much memory you have. Querying the DB for each and every transaction is not a good approach either. A better approach would be to load as much data into memory as your memory size allows and, once you are done with it, remove it and fire the next set of DB queries. In this way you can optimize memory use as well as DB queries.
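The load-a-batch, process, load-the-next idea can be sketched as follows. BatchProcessor and fetchPage are hypothetical names: fetchPage stands in for a paged query (something along the lines of SELECT ... LIMIT ? OFFSET ?) and is assumed to return at most batchSize rows per call.

```java
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Consumer;

class BatchProcessor {
    // fetchPage.apply(offset, limit) is a hypothetical paged query that
    // returns at most `limit` rows starting at row `offset`.
    static <T> int processInBatches(BiFunction<Integer, Integer, List<T>> fetchPage,
                                    int batchSize,
                                    Consumer<T> handler) {
        int offset = 0;
        while (true) {
            List<T> page = fetchPage.apply(offset, batchSize);
            page.forEach(handler);         // work on this batch only
            offset += page.size();
            if (page.size() < batchSize) { // a short page means no more rows
                return offset;             // total number of rows processed
            }
        }
    }
}
```

Only one batch is referenced at a time, so the previous one becomes eligible for garbage collection as soon as the next query result replaces it.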

An ArrayList can hold no more than 2^31-1 elements, because its internal array is indexed by an int.
There is an approach called an in-memory DB, which means that you hold a lot of data in memory to gain fast access to it. But this approach also implies that:
a. you have a lot of memory available for holding all the necessary data (it could be several tens of gigabytes);
b. your DB implements a compact form of data storage. This means the DB does not contain ready-made Java objects, but fragments of byte-array data from which you construct objects on demand.
So you need to estimate how much memory you will need for all the data that you want to load into memory, and decide whether this approach is viable or not.

why don't almost all databases implement row cache?

I noticed that almost no databases implement a row cache internally. I ask this question because I found that someone wrote a row-cache patch for InnoDB. Beyond the performance gain, it has at least one advantage: it is transparent to the client.
Is there some difficult technical reason that prevents doing so, or is it just that a row cache is useful only for very specific access patterns?
Thanks

Quite frankly, if you're getting the same version of the same row multiple times, you're doing it wrong. Data should be cached, as a general rule, if it's unlikely to change and needs to be accessed multiple times. Given this rule, if caching a DB row on the server side ever helps, it means you're making too many round trips to the database for the data you're interested in. You should instead be caching it client-side to cut down on the round trips. If the data changes often and so you need to access it often, caching still won't help because the cached data is out of date and the query must be re-executed. Only getting the data that's different from the cached data doesn't help; you have to figure out what's different and you're still making a query of the DB.
On top of that, most databases are designed for high-concurrency performance. Caching one guy's massive result set is going to eat into resources available for the next guy's massive result set and so on. In a high-user-count scenario, building a cache would likely simply result in the cached data being thrown away to make room for more cached data; it wouldn't be able to stick around long enough to be of use.
