I want to save an object to MongoDB using Java. I found that Morphia, Jongo, and Spring Data provide frameworks for this.
To store images in MongoDB, I found GridFS.
My problems are:
1. I have an object that contains both data and an image. I have to store it and perform a lot of mathematical calculations on its fields, and I also want to search for a particular image when certain conditions are satisfied.
2. If I separate the image from the object, storing the image using GridFS and the data as BSON, how can I link the document to the image?
3. When I separate the data from the object, what if the data itself exceeds 16 MB? If I use GridFS for that too, it gets split into chunks, but I want to analyse it field by field.
4. Can I find the size of the object in Java before writing it to MongoDB?
Can anyone suggest how to overcome these problems (any link or idea), and which Java framework for MongoDB would handle these real-time scenarios efficiently?
More information about the data structure:
I want to store a complex business object. For example, a classroom object contains many students, and each student contains many photos. The classroom object has its own data, and each student has its own data plus a list of photos. I have to query and analyse the data efficiently, either classroom-wise or student-wise.
You can save the metadata for the image in a normal document which also includes the GridFS filename under which the binary data can be found.
Putting the metadata on GridFS would turn it into an opaque binary lump of data; you would then no longer have any way to query it except by its filename. So if your image metadata also risks exceeding the 16 MB limit, you should reconsider your database schema and separate it into multiple documents.
When you want to do data analysis at both the classroom and the student level, you should put each student in its own document and then either have the classrooms reference the students or the students reference the classrooms (or both).
I assume your students will add more and more images during the lifetime of the application. MongoDB does not like documents that grow over time, because growing documents force MongoDB to constantly reallocate their storage space, which is a performance killer for write operations. If that is the case, you should also have a separate document per image that references the student it belongs to. But if it is not (the list of images is created once and then rarely changed), you should rather embed the images as an array in the student document.
Regardless of whether you embed or reference the images, the image documents/objects should only contain the metadata, while the binary image data itself should be stored in GridFS and referenced by its filename from the image document.
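As an illustrative sketch (all field names here are hypothetical, not a prescribed schema), the metadata document for one image could look like this, with `gridfsFilename` pointing at the binary stored in GridFS:

```json
{
  "_id": "img_42",
  "studentId": "student_7",
  "width": 1920,
  "height": 1080,
  "tags": ["classroom", "portrait"],
  "gridfsFilename": "student_7/img_42.jpg"
}
```

Queries and analysis then run against these small metadata documents; only when you actually need the pixels do you fetch `student_7/img_42.jpg` from GridFS.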
Related
I need to persistently store a big vocabulary and associate some information with each word (and use it to search words efficiently).
Is it better to store it in a DB (in a simple table, letting the DBMS do the work of structuring the data by key), or is it better to build a
trie data structure and then serialize it to a file and deserialize it once the program starts? Or should I use an XML file instead of serialization?
Edit: the vocabulary would be on the order of 5 thousand to 10 thousand words, and the metadata for each word is an array of 10 integers. Word lookups are very frequent (this is why I thought of a trie, whose search time is roughly O(1) in the number of words, instead of a DB that uses a B-tree or something similar, where search is ~O(log n)).
p.s. using java.
Thanks!
Using a DB is better.
Many companies have migrated to a DB; for example, the ERP Divalto used serialization and moved to a DB to gain performance.
You have many choices of DBMS. If you want all the data in one file, the simple way is to use SQLite; its advantage is that it does not need any server DBMS running.
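If you do stay in memory instead, then for the scale described in the question (5,000 to 10,000 words, each with an array of 10 integers) a plain `HashMap` already gives average O(1) lookup and can be persisted with standard Java serialization. A minimal sketch (class and method names are illustrative):

```java
import java.io.*;
import java.util.HashMap;

public class Vocabulary {
    // word -> its 10-integer metadata array
    private final HashMap<String, int[]> words = new HashMap<>();

    public void put(String word, int[] metadata) {
        words.put(word, metadata);
    }

    // Average O(1) lookup, comparable to a trie's O(key length).
    public int[] lookup(String word) {
        return words.get(word);
    }

    // Persist the whole map with standard Java serialization.
    public void save(File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(words);
        }
    }

    @SuppressWarnings("unchecked")
    public void load(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            words.putAll((HashMap<String, int[]>) in.readObject());
        }
    }
}
```

At this size the whole map fits comfortably in memory; the DB route mainly wins once the vocabulary grows or needs concurrent access.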
I'm new to Vaadin and I'm developing my first application with Spring and Vaadin.
Now I'm trying to save an image in my database. I followed the description of the upload component in the Vaadin book (Upload Component).
What do I have to change if I want to store it in the database?
Can you give me an example?
The upload component writes the received data to a java.io.OutputStream, so you have plenty of freedom in how you process the uploaded content.
If you want to store it as a large object, you can write directly as the stream comes in. See large object support.
If you want to store it as bytea in a row, you must accumulate it in memory and then pass it to a parameterized query with setObject(parameterIndex, myDataBuffer, Types.BLOB). This will consume several times the object's size in memory, so bytea is really only well suited to smaller data.
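A minimal sketch of the bytea path (table and column names are hypothetical): accumulate the upload stream in memory, then bind the buffer to a parameterized insert. Only the accumulation step runs without a database, so the insert is shown in a comment:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class BlobUpload {

    // Accumulate the whole upload stream into a byte array in memory.
    public static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    /*
     * With an open java.sql.Connection, the buffer is then bound to a
     * parameterized insert (illustrative table/column names):
     *
     *   PreparedStatement ps = connection.prepareStatement(
     *       "INSERT INTO images (name, data) VALUES (?, ?)");
     *   ps.setString(1, fileName);
     *   ps.setBytes(2, imageBytes);  // or setObject(2, imageBytes, Types.BLOB)
     *   ps.executeUpdate();
     */
}
```

Note that `readFully` holds the entire file in memory at once, which is exactly why this approach only suits smaller uploads.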
I'm trying to write a java-based movie database program. It recursively searches a given list of folders, retrieves information about the movie files within, and allows you to search for movies using tags/actors/file resolution/etc, and will show a cover thumbnail of the movie and screenshots.
The program as I have it now stores information (at the moment only filename/size) in ArrayLists and generates an HTML page with the thumbnails. I would like to use a MySQL database instead, so it will store the movie information persistently and won't have to search through all the folders every time.
I can use a "Media" object for each movie to store all this info, but I'm not sure of the best way to store this in a database. I don't want to just serialize the Media objects, because then I would have to iterate through the whole DB to search, which would be slow. I assume I need to use MySQL queries to search.
I have little more than a basic knowledge of database design, although I do have an idea of how to use JDBC to create/access a MySQL database once I have decided on a layout.
Can anyone give me some pointers on what I need to learn and where to learn it in order to decide how to lay out/index and link the tables for my movie database?
Here's my current Media object:
import java.io.File;
import java.util.Date;

public class Media {
File file_location;
String filename;
Date date;
int hres;
int vres;
boolean has_tbn;
File tbnloc;
boolean has_ss;
File ssloc;
int length;
String[] actors;
String[] tags;
boolean HD;
long filesize;
String md5;
}
From what you said, I assume you already know how to setup/create a MySQL database, so I won't bother with that.
First, you will need to make a table for your media. Start by designing a column for each field in your Media class with the appropriate data type, and set a primary key, which should be unique; this is preferably a separate id number. Also be sure to specify which columns cannot be null or blank, to avoid accidentally writing incomplete data.
However, I also see you have arrays for actors and tags. As a designer, I would prefer these to be in separate tables, with another table joining them to the main media table for a many-to-many relationship. You could also make it simpler and just store them as one long string with a separator, which you would parse back into the array.
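A hedged sketch of that many-to-many layout (table and column names are illustrative, not a definitive schema): one `media` row per Media object, an `actor` table, and a link table joining the two. Tags would follow the same pattern as actors.

```sql
CREATE TABLE media (
    media_id   INT AUTO_INCREMENT PRIMARY KEY,
    filename   VARCHAR(255)  NOT NULL,
    file_path  VARCHAR(1024) NOT NULL,
    filesize   BIGINT        NOT NULL,
    hres       INT,
    vres       INT,
    length_sec INT,
    hd         BOOLEAN,
    md5        CHAR(32)
);

CREATE TABLE actor (
    actor_id INT AUTO_INCREMENT PRIMARY KEY,
    name     VARCHAR(255) NOT NULL UNIQUE
);

-- Link table: one row per (movie, actor) pair.
CREATE TABLE media_actor (
    media_id INT NOT NULL REFERENCES media(media_id),
    actor_id INT NOT NULL REFERENCES actor(actor_id),
    PRIMARY KEY (media_id, actor_id)
);
```

With this layout, "all movies with actor X" becomes a join through `media_actor` rather than a string parse.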
You may want to reconsider saving a file directly into MySQL as well. I prefer just saving the file location and keeping the actual file as is.
I think you should also take a look at the sample database the current version of MySQL has, called Sakila, which should also be available for install when you install the latest version of MySQL. It's a database for a video store, and thus has well-designed tables for storing film data and all its related factors.
I am developing an Android application for a website. It has a large number of users, around 100,000. I have to fetch these users into an ArrayList for a custom user search. Is it good practice to store this much data in an ArrayList (particularly on Android)? If not, I am planning to use an SQLite database; any suggestions?
You do not want to use a list of any type.
Databases are optimized to store and search through large amounts of data, if you store these usernames in an ArrayList, you would be responsible for ensuring that you efficiently search.
This seems like a poor idea in the first place. Why would you want a local copy of all 100,000+ usernames? This is a terrible waste of bandwidth! It would be better if the application could query the database for the usernames it is searching for. You could then store only the results on the client.
ex: SELECT * FROM `user` WHERE `name` LIKE '%david%'
Store only the results from the query. Not every user.
Make data classes, make them Serializable, and use file storage. If you use a database, getting and putting the data is a separate task; storing a serialized object in a file is simpler for data handling.
It seems that it is not a good idea to store the content in an ArrayList. Depending upon the data or your application, you may get an OutOfMemoryError. Try persisting the information to a SQLite database or a file.
On the other hand, I do not see the necessity of bulk-downloading the 100,000 users' data and storing it locally for search on the device. You could make your service do the search and return only the search results. If that is possible, then storing the results in an ArrayList is not bad. If the size of your ArrayList exceeds what the Dalvik VM tolerates, you could override the onLowMemory callback and empty the list contents. This way you can prevent your app from being killed.
We have a large table of approximately 1 million rows, and a data file with millions of rows. We need to regularly merge a subset of the data in the text file into a database table.
The main reason it is slow is that the data in the file has references to other JPA objects, meaning the other JPA objects need to be read back for each row in the file. I.e. imagine we have 100,000 people and 1,000,000 asset objects:
Person object --> Asset list
Our application currently uses pure JPA for all of its data manipulation requirements. Is there an efficient way to do this using JPA/ORM methodologies or am I going to need to revert back to pure SQL and vendor specific commands?
Why not use the age-old technique of divide and conquer? Split the file into small chunks and then have parallel processes work on these small files concurrently.
Also use the batch inserts/updates offered by JPA and Hibernate (more details here).
The ideal way, in my opinion, is to use the batch support provided by plain JDBC and then commit at regular intervals.
You might also want to look at Spring Batch, as it provides splitting/parallelization/iterating through files out of the box. I have used all of these successfully for an application of considerable size.
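A sketch of the split step, assuming nothing beyond the JDK (row type and SQL are illustrative; the JDBC batch part needs a live connection, so it is shown as a comment): partition the rows into fixed-size chunks that workers can process concurrently.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedImport {

    // Split a list of rows into chunks of at most chunkSize elements.
    public static <T> List<List<T>> partition(List<T> rows, int chunkSize) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            chunks.add(new ArrayList<>(rows.subList(i, Math.min(i + chunkSize, rows.size()))));
        }
        return chunks;
    }

    /*
     * Each worker then uses plain JDBC batching and commits once per chunk:
     *
     *   PreparedStatement ps = conn.prepareStatement(
     *       "INSERT INTO asset (person_id, value) VALUES (?, ?)");
     *   for (Row row : chunk) {
     *       ps.setLong(1, row.personId);
     *       ps.setString(2, row.value);
     *       ps.addBatch();
     *   }
     *   ps.executeBatch();
     *   conn.commit();   // one commit per chunk, not per row
     * }
     */
}
```

Committing per chunk rather than per row is what keeps the transaction log and round-trip overhead bounded.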
One possible answer, which is painfully slow, is the following:
For each line in the file:
Read data line
fetch reference object
check if data is attached to reference object
if not add data to reference object and persist
So slow it is not worth considering.
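A faster variant, sketched here with plain collections (the `Person` type and identifiers are hypothetical): load the reference objects once in a bulk read, index them by id in memory, and resolve each file row against the map instead of issuing a fetch per line.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BulkMerge {

    // Hypothetical reference object: a person keyed by id.
    public static class Person {
        final long id;
        public Person(long id) { this.id = id; }
    }

    // Build an in-memory index from one bulk read instead of one query per row.
    public static Map<Long, Person> index(List<Person> people) {
        Map<Long, Person> byId = new HashMap<>();
        for (Person p : people) {
            byId.put(p.id, p);
        }
        return byId;
    }

    // Resolve a file row's reference without touching the database.
    public static Person resolve(Map<Long, Person> byId, long personId) {
        return byId.get(personId); // null means the row references an unknown person
    }
}
```

This turns N database round trips (one per file row) into a single bulk read plus N in-memory lookups, which is where most of the speed-up comes from.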