What database architecture is a good choice for this application?


I have a servlet-based application that runs in a Tomcat 7 environment.
This application needs to manage users' files so that they can be accessed in many ways and through different classification methods (for instance, time-oriented classification and search, keywords, tags, author and so on).
So I have a multidimensional search space and I need to organize a database-based grouping system.
Let's focus on one specific aspect.
Any user can upload their own files, so I'll have a path in which these files will be saved.
I also need a place to store the information associated with the files.
I thought it would be good to separate the files from their associated information (title, ...) and then create a third entity, a small string that uniquely identifies both the info and the file.
This way, once I know the file id I can get both the information (which is stored in a specific file) and the file itself, and I can save this id in any classification table without copying anything heavy.
So if I have the file id (fid) I can get the file and the information, and when I need, for example, to associate an object with a file, I can simply associate that object with the fid.
Then each user must have their own table that collects the various fids of the files they uploaded.
Therefore I have one table for each user. Then for any other classification dimension I will have N tables (where N is the size of the dimension). So if, for instance, I want to classify files by keyword, I'll need N tables, one for each keyword. (It would be too inefficient to search through all the users' files each time I want the files associated with the keyword AGAA.)
So if I need to show the 50 most recent files associated with the keyword "AGAAA", I need a table for AGAAA, and so on.
This is crazy: as the number of users increases, I get exponentially more tables.
I have heard about a limit on the number of tables per database in MySQL.
Until now I have been using MySQL (MariaDB) with connection pooling.
I thought of splitting tables of different "nature" (i.e. those for the keywords, those for the time and so on) into different databases (also in order to organize the contents clearly). But with connection pooling I need to declare the database name in the resource definition, so for different databases I will need different pools.
Now the questions.
Using pooling, I must create a different pool resource for each different database access, mustn't I?
If yes, is it good practice to use the same database for all the different kinds of tables?
If no, how can I change the database at runtime?
I thought I could manage different tables with different database systems. For example, I could use SQLite to manage the classification tables, MySQL to manage user interaction and so on. Is this good practice?
Is SQLite in general faster than server-based databases in multi-user applications?
Can I use connection pooling with SQLite? I mean, what is an SQLite connection if SQLite has no server? And does it make sense to think about connection pooling?
What database architecture do you suggest for this kind of problem?
Thanks.

Why would each user or keyword need its own table? Tables can have many rows.
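To make that point concrete, here is a minimal sketch of a layout with one row per file and one row per (keyword, file) pair, instead of one table per user or per keyword. All table and column names (files, file_keywords, owner_id, storage_path, uploaded_at) are made up for illustration and are not taken from the question; the SQL is written with MySQL/MariaDB in mind.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical schema: every name below is an assumption made for this sketch.
final class FileCatalogSql {

    // One row per uploaded file; the owner is a column, not a separate table.
    static final String CREATE_FILES =
        "CREATE TABLE files ("
      + " fid CHAR(32) PRIMARY KEY,"
      + " owner_id BIGINT NOT NULL,"
      + " title VARCHAR(255) NOT NULL,"
      + " storage_path VARCHAR(1024) NOT NULL,"
      + " uploaded_at DATETIME NOT NULL,"
      + " INDEX idx_owner_time (owner_id, uploaded_at)"
      + ")";

    // One row per (keyword, file) pair; the primary key doubles as the keyword index.
    static final String CREATE_FILE_KEYWORDS =
        "CREATE TABLE file_keywords ("
      + " keyword VARCHAR(64) NOT NULL,"
      + " fid CHAR(32) NOT NULL,"
      + " PRIMARY KEY (keyword, fid),"
      + " FOREIGN KEY (fid) REFERENCES files (fid)"
      + ")";

    // The most recent files for one keyword, whoever uploaded them.
    static final String RECENT_BY_KEYWORD =
        "SELECT f.fid, f.title, f.storage_path, f.uploaded_at"
      + " FROM file_keywords k JOIN files f ON f.fid = k.fid"
      + " WHERE k.keyword = ?"
      + " ORDER BY f.uploaded_at DESC"
      + " LIMIT ?";

    private FileCatalogSql() {}

    // Returns the fids of the newest files tagged with the given keyword.
    static List<String> recentFids(Connection c, String keyword, int limit) throws SQLException {
        try (PreparedStatement ps = c.prepareStatement(RECENT_BY_KEYWORD)) {
            ps.setString(1, keyword);
            ps.setInt(2, limit);
            try (ResultSet rs = ps.executeQuery()) {
                List<String> fids = new ArrayList<>();
                while (rs.next()) {
                    fids.add(rs.getString("fid"));
                }
                return fids;
            }
        }
    }
}
```

With a layout like this, the 50 most recent files for "AGAAA" would be recentFids(connection, "AGAAA", 50), and a user's own files are a WHERE owner_id = ? query against the same files table. Adding a user or a keyword adds rows, not tables.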
Using pooling, I must create a different pool resource for each
different database access, mustn't I?
Your question has multiple meanings, but generally you create one pool for one application, and it manages itself.
If yes, is it good practice to use the same database for all the
different kinds of tables? If no, how can I change the database at runtime?
Generally one would use one database for an application.
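As a rough illustration of "one pool, one database" under Tomcat, the sketch below looks the pool up once through JNDI and lets every servlet borrow connections from it. The resource name jdbc/appdb and the class name AppDatabase are assumptions for this example; the name has to match whatever Resource is declared in the application's context configuration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

// Hypothetical helper: a single shared entry point to the container-managed pool.
final class AppDatabase {

    // Assumed JNDI name; it must match the Resource defined for the web application.
    static final String JNDI_NAME = "java:comp/env/jdbc/appdb";

    private static DataSource pool;

    private AppDatabase() {}

    // Looks the pool up on first use and reuses it afterwards.
    private static synchronized DataSource pool() throws NamingException {
        if (pool == null) {
            pool = (DataSource) new InitialContext().lookup(JNDI_NAME);
        }
        return pool;
    }

    // Borrows a connection; closing it hands it back to the pool.
    static Connection borrow() throws NamingException, SQLException {
        return pool().getConnection();
    }
}
```

Callers would typically wrap the connection in try-with-resources, for example try (Connection c = AppDatabase.borrow()) { ... }, so that it always goes back to the pool.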
I thought I could manage different tables with different database
systems. For example, I could use SQLite to manage the
classification tables, MySQL to manage user interaction and so on. Is
this good practice?
You could, but that would be insane.
Is SQLite in general faster than server-based databases in multi-user
applications?
Absolutely not. SQLite can only have one writer at a time, though it is fine for many readers.
Can I use connection pooling with SQLite? I mean, what is an SQLite
connection if SQLite has no server? And does it make sense to think
about connection pooling?
I don't know, but you shouldn't use SQLite if you expect multiple concurrent users writing / uploading to the database.
What database architecture do you suggest for this kind of
problem?
I would suggest you use a content repository like Apache Jackrabbit, or a search server like Apache Solr.

Related

Connecting to Multiple "Dynamic" Databases Along With Local "Static" Database in Spring Boot

I'm building an application using Java and Spring Boot where I want to query two foreign databases (they might have different schemas and data) every time I run. Therefore I'd like to query two different databases every time. After accessing those databases, I would then like to store the result (my business logic) on a local static database.
I originally wanted to store all the database data (user, pass, url) in the application.properties, but then realized that this might not be best practice, as the details for the two DBs I'm querying will be received as input from the user. Therefore, I'm not sure if it's the best idea to update and overwrite application.properties every time I receive a new request (please let me know if there's a better way to do this).
Assuming I have the DBs' info in application.properties, I've followed multiple tutorials for multiple DB connections in Spring, and they all followed something along the lines of making configuration files for each DB and calling a repository/DAO file for each DB, which references a model of said DB. That seems a bit problematic for me, as I don't know the schema of the databases beforehand, so I can't define a model class. And even if I did, this will probably change across databases, so I'm really not sure what to do.
Is there a more flexible/versatile way to query "foreign" databases with Spring or old school Java given that I don't know what their schemas might look like?
Any help is greatly appreciated!

As a best practice, the configuration for multiple databases should be maintained in application.properties or in a config class. See https://docs.spring.io/spring-boot/docs/current/reference/htmlsingle/#howto-two-datasources
You can have a POJO with DB properties that gets its values from user-provided input. Use that POJO in a DB config class to connect to the different databases.
Not knowing the schema is not a problem, as you can handle the data with Java collections.
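A minimal sketch of both ideas using plain JDBC, with no Spring involved: a small POJO carries the user-supplied connection details, and each result row is read into a Map keyed by column label, so no model class is needed. The class names DbTarget and GenericReader are invented for this example, and the matching JDBC driver is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical POJO holding connection details received from the user at request time.
final class DbTarget {
    final String url;
    final String user;
    final String password;

    DbTarget(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    Connection open() throws SQLException {
        return DriverManager.getConnection(url, user, password);
    }
}

// Reads rows of unknown shape into ordinary collections.
final class GenericReader {

    private GenericReader() {}

    // Converts every remaining row into a map of column label to value, in column order.
    static List<Map<String, Object>> toRows(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columns = meta.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= columns; i++) { // JDBC columns are 1-based
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }

    // Opens a connection to the given target, runs the query and returns all rows.
    static List<Map<String, Object>> query(DbTarget target, String sql) throws SQLException {
        try (Connection c = target.open();
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            return toRows(rs);
        }
    }
}
```

Opening a fresh connection per request keeps the sketch short; a real service would more likely build a pooled DataSource per target and cache it.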

How to synchronize data between MongoDB and OpenLDAP databases

I have two databases for one system. One is OpenLDAP and the other is MongoDB. To be specific, this OpenLDAP is used by Atlassian Crowd, which we use. I need to synchronize users in these two databases. That is,
1. If I create a user, it will be created in OpenLDAP by default and it has to be created in MongoDB as well.
2. In the past there were issues in handling this, and there may be users who are in OpenLDAP but not in MongoDB. I need to find these users as well.
3. If I delete or update a user in one, I need the delete or update operation to happen in both DBs.
I am going to have a cache copy of LDAP using Redis. What is the best way to synchronize data between these two databases to match the above expectations?
If it helps I am using Java in backend.

Two possible ways:
1. (Preferred) Design your code so that you can "plug in" database operators that handle the different databases, and access them through a facade that hides the underlying databases from the caller. Creating a user, for example, would look something like this:
createUser() -> foreach dbhandle do dbhandle->createUser() forend
The same applies to deleting or updating any data. This approach should also solve problem 2.
2. You can update just one database and have a script running in the background that updates the others. This approach lets you work with only one database while the script handles the rest, but it is far more expensive and less reliable (as you might access a database that has not yet been updated from the master database).
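The first (facade) approach could be sketched in Java roughly as follows. The interface and class names are invented for this example, and the in-memory store only stands in for real OpenLDAP- and MongoDB-backed implementations so that the sketch runs without any server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical contract: one implementation per backing store (e.g. OpenLDAP, MongoDB).
interface UserStore {
    void createUser(String username);
    void deleteUser(String username);
    Set<String> usernames();
}

// In-memory stand-in used here in place of a real LDAP or MongoDB client.
final class InMemoryUserStore implements UserStore {
    private final Set<String> users = new TreeSet<>();

    public void createUser(String username) { users.add(username); }
    public void deleteUser(String username) { users.remove(username); }
    public Set<String> usernames() { return new TreeSet<>(users); }
}

// Facade: callers talk to this class and never to an individual store.
final class UserDirectory {
    private final List<UserStore> stores;

    UserDirectory(List<UserStore> stores) {
        this.stores = new ArrayList<>(stores);
    }

    // Applies the operation to every registered store, in registration order.
    void createUser(String username) {
        for (UserStore store : stores) {
            store.createUser(username);
        }
    }

    void deleteUser(String username) {
        for (UserStore store : stores) {
            store.deleteUser(username);
        }
    }

    // Users present in source but absent from target (e.g. in OpenLDAP but not in MongoDB).
    static Set<String> missingIn(UserStore source, UserStore target) {
        Set<String> missing = new TreeSet<>(source.usernames());
        missing.removeAll(target.usernames());
        return missing;
    }
}
```

Note that the loop is not a distributed transaction: if one store fails halfway through, the stores diverge, so a periodic check along the lines of missingIn is still useful for finding and repairing stragglers.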

Accessing Data across two Databases

I have a large set of data (more than 1 TB). This will be accessed by more than 1000 people concurrently. Storing it in one database will make the application really slow, so I was planning to store it across different databases. Does MongoDB support routing between different databases? Or should this be handled in our application? I am developing in Java and use the Spring framework to interact with MongoDB.

Given that the reason for splitting your data into multiple databases is to improve performance, I would suggest sharding a single database rather than splitting across multiple databases. If location is granular enough and you would like to split load across servers, you could then use tag-aware sharding to pin specific locations or location ranges to a specific server. There is a good tutorial on this available here.
Before following this route, I would suggest performing load tests on your application with your database on the hardware you plan to use for your system. It is worth confirming that you really do need to shard/split the data and, if so, the number of servers you may need. If your database is going to be read-intensive rather than write-intensive, it could be that a non-sharded database would handle your load, provided your working set fits in memory.

Dynamic table name in Hibernate

I am developing an application in Java that uses Hibernate to connect to MySQL database.
My application manages students of different batches. If a student joined in 2010 then they are in the 2010 batch, so whenever the administrators of the application create a new batch, my application has to create new tables for that batch. The schema is much the same as that of the tables already in the database, but the table name changes. How do I accomplish this using Hibernate?
How do I create the XML files and the classes required dynamically?

If I understood your problem right, I think you want to check Hibernate Shards. Note that this is an advanced feature, unsupported and not really tested (nor maintained). So, use it at your own risk. You may want to pay special attention to the "Shard Selection Strategy" section:
http://docs.jboss.org/hibernate/stable/shards/reference/en/html_single/#shards-strategy-shardselection
From the documentation:
We expect many applications will want to implement attribute-based sharding, so for our example application that stores weather reports let's shard reports by the continents on which the reports originate
But as the others said: think twice before splitting your data. Do it only if you expect really large volumes of data. A couple million records are not really that much.

Alternatives for storing data other than databases like MySQL, SQL, etc.

I have completed my Address Book project in core Java, in which my data is stored in a database (MySQL).
The problem I am facing is that when I run my program on another computer, I have to create the whole database again.
So please tell me any alternative for storing my data without using database software like MySQL, SQL, etc.

You can use an embedded database (which can also run purely in memory) such as HSQLDB, Derby (a.k.a. JavaDB), H2, ...
All of those can run without any additional software installation and can be made to act like just another library.

I would suggest using an embeddable, lightweight database such as SQLite. Check it out.
From the features page (under the section Suggested Uses For SQLite):
Application File Format. Rather than using fopen() to write XML or some proprietary format into disk files used by your application, use an SQLite database instead. You'll avoid having to write and troubleshoot a parser, your data will be more easily accessible and cross-platform, and your updates will be transactional.

The whole point of StackOverflow was so that you would not have to email around questions/answers :)
You could store the data in the filesystem or in memory (using serialisation, etc.), which are simple alternatives to a DB. You can even use HSQLDB, which can be run completely in memory.
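As a small illustration of the serialisation route, the sketch below writes a list of contacts to a single file with standard Java serialisation and reads it back. The Contact and AddressBookFile classes, and the fields name and phone, are assumptions made for this example and are not taken from the question.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical address book entry; the fields are chosen for illustration only.
final class Contact implements Serializable {
    private static final long serialVersionUID = 1L;

    final String name;
    final String phone;

    Contact(String name, String phone) {
        this.name = name;
        this.phone = phone;
    }
}

// Saves and loads the whole address book as one serialised list.
final class AddressBookFile {

    private AddressBookFile() {}

    // Overwrites the file with the current list of contacts.
    static void save(Path file, List<Contact> contacts) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(new ArrayList<>(contacts));
        }
    }

    // Returns an empty list when the file does not exist yet (first run on a new machine).
    @SuppressWarnings("unchecked")
    static List<Contact> load(Path file) throws IOException, ClassNotFoundException {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (List<Contact>) in.readObject();
        }
    }
}
```

This keeps the program free of any database installation, at the cost of loading and rewriting the whole list each time and having no query language, so it only suits small data sets.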

If your data is not very big, you may use a simple text file and store everything in it, then load it into memory. But this will change the way you modify and query the data.

Database software like MySQL, SQL, etc. provides an abstraction that saves implementation effort. If you wish to avoid it, you can think of building your own database on XML or flat files. XML is the better choice of the two, as XML parsers and handlers are readily available. Still, keeping your data in a customised database or in flat files will not be manageable in the long run.
Why don't you explore SQLite? It is file-based, which means you don't need to install it separately, and you still have standard SQL to retrieve or interact with the data. I think SQLite will be a better choice.

Just use a prevayler (.org). Faster and simpler than using a database.

I assume from your question that you want some form of persistent storage on the local file system of the machine your application runs on. In addition to that, you need to decide how the data in your application is to be used, and the volume of it. Do you need a database? Are you going to be searching the data by different fields? Do you need a query language? Is the data small enough to fit into a simple data structure in memory? How resilient does it need to be? The answers to these types of questions will help lead to the correct choice of storage. It could be that all you need is a simple CSV file, XML or similar. There are a host of lightweight databases such as SQLite, Berkeley DB, JavaDB, etc., but whether or not you need the power of a database is up to your requirements.

A store that I'm using a lot these days is Neo4j. It's a graph database and is not only easy to use but also is completely in Java and is embedded. I much prefer it to a SQL alternative.

In addition to the other answers about embedded databases: I have been working on an object database that serializes Java objects directly, without the need for an ORM. Its name is Sofof and I use it in my projects. It has many features, which are described on its website.
