Adding custom fields in my application


I have a SaaS product built with Spring MVC and Hibernate. SaaS products generally allow users to customize the product, for example by adding extra fields to a table. I want to give users the flexibility to create custom fields in the tables for themselves. Please suggest viable ways to achieve this. Thank you so much for your help.

I'm guessing you're trying to back this with a relational database. The primary problem is that relational databases store things in tables, and tables don't handle free-form data well.
So one solution is to use a document structure that is flexible, like XML (and perhaps ditch the database) but databases have features which are nice, so let's also consider the database-using approaches.
You could create a "custom field" table which would have columns (composite primary key) for
ExtendedTable
ColumnName
but you'd also have to store the data somewhere
(ExtendedKey)
DataItem
And now we get into the really nasty bits. How would you apply constraints to this data? I mean, what would the type be of a DataItem? A general solution would be quite complex (being a type of free form database). Hopefully you could limit the solution to solve only the problems you require solved.
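One way to limit the problem is to store a declared type next to each custom field definition and validate every value against it before saving. The following is only a minimal in-memory sketch of that idea, not a definitive design: the class name `CustomFieldStore`, the `FieldType` enum, and the three supported types are assumptions made for illustration, and the two maps stand in for the definition and data tables described above.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the "custom field" definition and data tables (hypothetical names). */
final class CustomFieldStore {

    /** Hypothetical set of supported types; a real system would choose its own. */
    enum FieldType {
        TEXT, NUMBER, DATE;

        /** Converts the raw string to the declared type, throwing if it does not fit. */
        Object parse(String raw) {
            switch (this) {
                case NUMBER: return new BigDecimal(raw);   // throws NumberFormatException
                case DATE:   return LocalDate.parse(raw);  // expects ISO-8601, e.g. 2024-01-31
                default:     return raw;
            }
        }
    }

    // Definition table: (ExtendedTable, ColumnName) -> declared type
    private final Map<String, FieldType> definitions = new HashMap<>();
    // Data table: (ExtendedTable, ColumnName, ExtendedKey) -> DataItem
    private final Map<String, Object> values = new HashMap<>();

    void defineField(String table, String column, FieldType type) {
        definitions.put(table + "." + column, type);
    }

    /** Stores a value only if the field exists and the value matches its declared type. */
    void setValue(String table, String column, long rowKey, String raw) {
        FieldType type = definitions.get(table + "." + column);
        if (type == null) {
            throw new IllegalArgumentException("Unknown custom field: " + table + "." + column);
        }
        values.put(table + "." + column + "#" + rowKey, type.parse(raw));
    }

    Object getValue(String table, String column, long rowKey) {
        return values.get(table + "." + column + "#" + rowKey);
    }
}
```

With a field defined as `NUMBER`, storing "150" succeeds and storing "lots" is rejected. Anything beyond simple type checks (ranges, uniqueness, foreign keys) would need more machinery, which is the complexity warned about above.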
Another approach is to use a single "extra" column that contains an XML record which embeds its own "column and value" extensions, but if you wanted to display the table efficiently, you'd have to parse every XML document in every field, which is not ideal.
Neither one of these approaches will work well with the existing SQL query language, so you'll then start building your own query language.
I suggest you go back and look at real data requirements, instead of sweeping them under the table with a "and anything else one might want" set of columns on your table.

Your requirement is a use case best suited to NoSQL databases (like MongoDB).
Dynamically creating relational database tables and columns (modifying schemas) in response to user requests is not a best practice, because it involves DDL operations. These are very powerful, and if you don't handle them carefully, the whole application's database can end up in an inconsistent state.

Related

Hierarchical Data Model with JPA

Recently I came across a schema model like this.
The structure is exactly the same; I just renamed each entity to a name like Table (*).
From Table C to Table L, every table has close to 200 columns.
I am posting because I have never come across a structure like this before. If anyone has worked with something similar, or more complex, please share your ideas.
Is a structure like this good or bad, and why?
Assume we need an API to save data for a table structure like this:
How should the API be designed?
How do we manage transactions across all these tables?
In the service code, there are a few cases where we might need to read data from these tables and transfer it to an external system.
The catch is that the external system accepts requests in a flattened structure, not in the hierarchy described above. If this data needs to be transferred to the external system, how can we manage marshalling and unmarshalling?
Last but not least, the API that manages this data may be called at least 2K times a day.
What are your thoughts on this? I don't know exactly why we need it; it needs a detailed discussion, and we need to break things down.
If I use Spring Data JPA and Hibernate, what do I need to consider?
More importantly, the rows in all these tables are restricted by ownerId/tenantId, so the data needs to be consistent across all the tables.

I can not comment on the general aspect of the structure as that is pretty domain specific and one would need to know why this structure was chosen to be able to say if it's good or not. Either way, you probably can't change this anyway, so why bother asking if it's good or not?
Having said that, with such a model there are a few aspects that you should consider:
When updating data, it is pretty important to update only the columns that really changed, to avoid index thrashing and to allow the DB to use spare storage in pages. This is a performance concern that usually comes up when using Hibernate with such models, as Hibernate usually updates all "updatable" columns, not just the dirty ones. There is an option to do dynamic updates though. Without dynamic updates, you might produce a few more IOs per update and thus hold locks for a longer time, which affects overall scalability.
When reading data, it is very important not to use join fetching by default as that might result in a result set size explosion.
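In Hibernate, the dynamic update option is enabled per entity with the `@DynamicUpdate` annotation (`org.hibernate.annotations.DynamicUpdate`). To show what that changes, here is a rough plain-Java sketch of the underlying idea: compare the state loaded from the database with the current state and build an UPDATE that names only the changed columns. This is not Hibernate's implementation; the class `DirtyColumnUpdate` and the table and column names in the usage note are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Illustrates the idea behind dynamic updates: touch only the columns that changed. */
final class DirtyColumnUpdate {

    /** Returns the columns whose current value differs from the loaded snapshot. */
    static Map<String, Object> dirtyColumns(Map<String, Object> snapshot, Map<String, Object> current) {
        Map<String, Object> dirty = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : current.entrySet()) {
            if (!Objects.equals(snapshot.get(e.getKey()), e.getValue())) {
                dirty.put(e.getKey(), e.getValue());
            }
        }
        return dirty;
    }

    /** Builds a parameterized UPDATE for the dirty columns, or empty if nothing changed. */
    static Optional<String> updateSql(String table, String idColumn, Map<String, Object> dirty) {
        if (dirty.isEmpty()) {
            return Optional.empty();
        }
        List<String> assignments = new ArrayList<>();
        for (String column : dirty.keySet()) {
            assignments.add(column + " = ?");
        }
        return Optional.of("UPDATE " + table + " SET " + String.join(", ", assignments)
                + " WHERE " + idColumn + " = ?");
    }
}
```

For a 200-column row where only `city` changed, this produces `UPDATE table_c SET city = ? WHERE id = ?` instead of a statement that rewrites every column.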

dynamic sql generation design

Good Afternoon,
We are building a web application and, as part of it, a search feature. We have a design question about the search functionality.
The field names in the UI and in the DB are different, i.e. a field called "Number" in the UI is called Text10 in the DB. These are the two issues:
How do we generate the SQL when the user gives the UI field names? We have a table in the DB where we maintain the configuration (UI name to DB name).
The user selects the columns to search, for example the three fields "Number, Description, Price". Once the SQL is generated, how do we know which data corresponds to which column? Do we have to maintain an index capturing the position, or a bean?
What is the better way to gather the data from the result set?
Thanks

A solution that promotes commonality between UI and database column names would be nice but probably not feasible.
Some sort of mapping table that captures the following will work:
META-DB-TABLE-NAME
META-DB-COLUMN-NAME
META-UI-COLUMN-NAME
Personally I would prefer to keep this mapping meta-data as close to the database as possible.
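As a rough sketch of how such a mapping could drive SQL generation: load the UI-to-DB names into a map, translate the fields the user selected, and keep the columns in the order requested so that result column i corresponds to the i-th selected UI field. The class name `SearchSqlBuilder` is hypothetical, and apart from Number/Text10 the table and column names in the usage note are made up. Only names found in the mapping ever reach the SQL text, so user input is never concatenated into the query.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Translates UI field names to DB columns using a mapping loaded from a config table. */
final class SearchSqlBuilder {

    // UI name -> DB column, as it might be loaded from the mapping table.
    private final Map<String, String> uiToDb = new LinkedHashMap<>();
    private final String table;

    SearchSqlBuilder(String table) {
        this.table = table;
    }

    void map(String uiName, String dbColumn) {
        uiToDb.put(uiName, dbColumn);
    }

    /**
     * Builds the SELECT for the requested UI fields. Columns appear in the same
     * order as uiFields, so result column i (1-based) belongs to uiFields.get(i - 1).
     */
    String selectFor(List<String> uiFields) {
        List<String> columns = new ArrayList<>();
        for (String ui : uiFields) {
            String db = uiToDb.get(ui);
            if (db == null) {
                throw new IllegalArgumentException("Unmapped UI field: " + ui);
            }
            columns.add(db);
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}
```

With Number mapped to Text10, Description to Text11, and Price to Num01, selecting "Number, Description, Price" from a table called ITEMS yields `SELECT Text10, Text11, Num01 FROM ITEMS`. The WHERE clause would be built the same way, with values bound as parameters.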
User-defined meta data is nicely described here from an Oracle perspective:
http://docs.oracle.com/cd/B28359_01/appdev.111/b28369/xdb_repos_meta.htm
Do some research on this and keep us informed with what you find. Very interesting question!

In such a dynamic SQL scenario, query builders like jOOQ really shine. See for example the jOOQ manual section about dynamic SQL.
In your specific case, assuming you're using generated code in jOOQ (which isn't a must, but certainly recommended), you'll be maintaining some sort of lookup between UI fields and SQL fields, such as:
Map<UIField, Field<?>> lookup = ...
lookup.put(UI.NUMBER, TABLE.NUMBER);
lookup.put(UI.DESCRIPTION, TABLE.DESCRIPTION);
lookup.put(UI.PRICE, TABLE.PRICE);
You can then construct your query dynamically according to user needs:
List<UIField> userRequestedFields = ...
List<Field<?>> queryFields = userRequestedFields
    .stream()
    .map(lookup::get)
    .toList();
And then:
ctx.select(queryFields)
   .from(TABLE)
   .where(...)
   .fetch();
There are other query builders, even JPA has the criteria API for these purposes. You could also roll your own, though you'll be re-inventing a lot of wheels.
Disclaimer: I work for the company behind jOOQ.

Best way to avoid EAV model, but still allow for flexibility

I have a requirement to store CSV data in an Oracle database for later retrieval by dynamic query scripts. The data needs to be stored such that any column of the CSV data can be queried using SQL and performance is key (some CSV files are 100k+ lines).
The content of the CSV files (number of columns, headings, data types) is not known ahead of time and the system needs to be able to handle multiple file structures (which are added to a config file so the system knows how to read them, by people who don't know SQL).
My current solution, in order to avoid an EAV model, is to have my code create new tables every time a new CSV structure is added to the config file. I'm curious to know if there is a better way to achieve what I'm trying to do. I'm not particularly fond of having my code create new tables in production at run-time.
The system is written in groovy, in case it matters.

I am inclined to go with your current solution, which is a separate table for each type. Somehow, I'm most comfortable with storing data in well-defined tables with well-defined types.
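If tables are created from the config file at run time, one way to make that less risky is to generate the DDL from a small whitelist of types and to reject any name that is not a plain identifier. The sketch below is written in Java (it should carry over to Groovy with little change) and is only an illustration: the class `CsvTableDdl`, the config type names "string", "number" and "date", and the 30-character identifier limit are assumptions, not part of the original system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Generates CREATE TABLE statements for CSV structures described in a config file. */
final class CsvTableDdl {

    // Conservative identifier rule (assumed): letter first, then letters, digits, underscores.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]{0,29}");

    /** columns maps column name -> config type; iteration order becomes column order. */
    static String createTable(String tableName, Map<String, String> columns) {
        requireIdentifier(tableName);
        List<String> definitions = new ArrayList<>();
        for (Map.Entry<String, String> column : columns.entrySet()) {
            requireIdentifier(column.getKey());
            definitions.add(column.getKey() + " " + oracleType(column.getValue()));
        }
        return "CREATE TABLE " + tableName + " (" + String.join(", ", definitions) + ")";
    }

    /** Maps the hypothetical config types onto Oracle column types. */
    private static String oracleType(String configType) {
        switch (configType) {
            case "string": return "VARCHAR2(4000)";
            case "number": return "NUMBER";
            case "date":   return "DATE";
            default: throw new IllegalArgumentException("Unsupported type: " + configType);
        }
    }

    private static void requireIdentifier(String name) {
        if (!IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Unsafe identifier: " + name);
        }
    }
}
```

Because the config is edited by people who don't know SQL, validating names and types before any DDL runs keeps a typo in the config from turning into a malformed or dangerous statement.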
An EAV (entity-attribute-value) solution is also viable. With 100k rows of data, the EAV solution should perform pretty well, unless you have lots of tables. One downside is the types of the columns. Without a lot of extra work, you are pretty much limited to strings for all the values.
Oracle does offer another possibility, which is an XML solution. This can give you the flexibility of dynamic column names along with the "simplicity" of not having to define a separate table for each one. You can read more about it in the documentation here.

It comes down to what you want to model. If you need to handle ad hoc queries against any of the columns in the CSV file, then I guess you need to model them all as Oracle columns. If you only need to retrieve a whole line based on a particular key, then you could model it as two columns: the key and the line. If you need to model the individual columns, then such a thing would not be in first normal form.
When you create an EAV model, you are making a flexible system that allows for additional columns to be added/removed easily. Oracle is already a flexible system that allows for additional columns to be added/removed easily. They've just put more thought into locking, performance, scalability and tool support than your naive EAV model might have.
Overall, I think what you are probably doing is best. It's not an easy problem and it's not exactly what Oracle was designed for so you might have issues with statistics and which indexes to create and so on.

Exploring user specific data in webapps

I am practicing by designing a simple todo list webapp in which a user can authenticate and save todo list items. The user is also only able to view/edit the todo list items that they added.
This seems to be a general feature (authenticated user only views their own data) in most web applications (or applications in general).
To me what is important is having knowledge of the different options for accomplishing this. What I would like to achieve is a solution that can handle lots of users' data effectively. At the moment I am doing this using a Relational Database, but noSQL answers would be useful to me as well.
The following ideas came to mind:
Add a user_id column each time this "feature" is needed.
Add an association table (in the example above a user_todo_list_item table) that associates the data.
Design in such a way that you have a table per user per "feature" ... so you would have a todolist_userABC table. It's an option but I do not like it much since a thousand users means a thousand tables?!
Add row level security to the specific "feature". I am not familiar with how this works but it seems to be a valid option. I am also not sure whether this is database vendor specific.
Of my choices I went with the user_id column on the todolist_item table. Although it can do the job, I feel that a user_id column might be problematic when reading data if the data within the table gets large enough. One could add an index I guess but I am not sure of the index's effectiveness.
What I don't like about it is that I need to have a user_id for every table where I desire this type of feature which doesn't seem correct to me? It also seems that when I implement the database layer I would have to add this to my queries for every feature (unless I use some AOP)?
I had a look around (How does Trello store data in MongoDB? (Collection per board?)), but it does not speak about the techniques regarding user_id columns or things like that. I also tried reading about this in some security frameworks (Spring Security to be specific) but it seems that it only goes into privileges/permissions on a table level and not a row level?
So the question is whether my choice was appropriate and if there are better techniques to do this?

Your choice is the natural thing to do.
The table-per-user is a non-starter (anything that modifies the database structure in response to user action is usually suspect).
Row-level security isn't really an option for webapps - it requires each user session to have a separate, persistent connection to the database, which is rarely practical. And yes, it is vendor-specific.
How you index your tables depends entirely on your usage patterns and the types of queries you want to run. Is 'show all TODOs for a user' a query you want to support (seems like it would be)? Then an index on the user id is obviously needed.
Why does having a user_id column seem wrong to you? If you want to restrict access by user, you need to be able to identify which user the record belongs to. Doesn't actually mean that every table needs it - for example, if one record composes another (say, your TODOs have 'steps', each step belongs to a single TODO), only the root of the object graph needs the user id.
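To keep the user filter from being forgotten in individual queries, one option is to make the user id a required argument of every read method on the data access layer. The sketch below uses an in-memory list in place of the todolist_item table; the class `TodoRepository` and its method names are hypothetical, and the SQL in the comments shows roughly what each method would run against a real database.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Data access sketch in which every read is scoped to the owning user. */
final class TodoRepository {

    static final class TodoItem {
        final long id;
        final long userId;
        final String text;

        TodoItem(long id, long userId, String text) {
            this.id = id;
            this.userId = userId;
            this.text = text;
        }
    }

    private final List<TodoItem> rows = new ArrayList<>(); // stands in for todolist_item
    private long nextId = 1;

    TodoItem add(long userId, String text) {
        TodoItem item = new TodoItem(nextId++, userId, text);
        rows.add(item);
        return item;
    }

    // Roughly: SELECT * FROM todolist_item WHERE user_id = ?
    List<TodoItem> findAllForUser(long userId) {
        List<TodoItem> result = new ArrayList<>();
        for (TodoItem item : rows) {
            if (item.userId == userId) {
                result.add(item);
            }
        }
        return result;
    }

    // Roughly: SELECT * FROM todolist_item WHERE id = ? AND user_id = ?
    Optional<TodoItem> findForUser(long id, long userId) {
        for (TodoItem item : rows) {
            if (item.id == id && item.userId == userId) {
                return Optional.of(item);
            }
        }
        return Optional.empty();
    }
}
```

Looking up another user's item by id returns an empty result rather than the row, so guessing ids gives no access. Child records such as steps would be reached only through an item that has already passed this check.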

dynamic object relation mapping

I am trying to create an application in Java which pulls records out of the database and maps them to objects. It does that without knowing what the schema of the database looks like. All I want to do is fetch all rows from all tables and store them somewhere. There could be a thousand tables with thousands of records each. The application doesn't know the name of any table or attribute. It should map "on the fly". I looked at Hibernate but it doesn't give me what I want for this app. I don't want to create hard-coded XML files and classes for mapping. Any ideas how I can accomplish this?
Thanks

Oracle has a bunch of data dictionary views for metadata.
ALL_TABLES, ALL_TAB_COLUMNS would be first places to start. Then you'd build ad-hoc queries based on what you get out of there. Not sure whether you have to deal with all data types (dates, blobs, spatial, user-defined....).
Not sure what you mean by "store them somewhere". If you start thinking CSV or XML files, you'll need to escape various characters from VARCHAR2 columns.
If you are looking for some generic extract/unload routines, you should look at what is already available in the database or open-source/commercially.
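From Java, the same metadata-driven approach can be sketched with the standard JDBC metadata API instead of querying ALL_TABLES directly: `DatabaseMetaData.getTables` lists the tables, and `ResultSetMetaData` supplies the column names for each row. This is a minimal sketch under assumptions: the class `GenericTableReader` is hypothetical, each row becomes a generic column-name-to-value map rather than a typed object, and table names that are not plain identifiers are rejected rather than quoted.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Reads tables without compile-time knowledge of the schema, using JDBC metadata. */
final class GenericTableReader {

    /** Lists table names in the given schema (null means no schema filter). */
    static List<String> tableNames(Connection conn, String schema) throws SQLException {
        List<String> names = new ArrayList<>();
        DatabaseMetaData meta = conn.getMetaData();
        try (ResultSet rs = meta.getTables(null, schema, "%", new String[] {"TABLE"})) {
            while (rs.next()) {
                names.add(rs.getString("TABLE_NAME"));
            }
        }
        return names;
    }

    /** Reads every row of one table into column-label -> value maps. */
    static List<Map<String, Object>> readAll(Connection conn, String table) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(selectAllSql(table))) {
            ResultSetMetaData md = rs.getMetaData();
            List<Map<String, Object>> rows = new ArrayList<>();
            while (rs.next()) {
                Map<String, Object> row = new LinkedHashMap<>();
                for (int i = 1; i <= md.getColumnCount(); i++) {
                    row.put(md.getColumnLabel(i), rs.getObject(i));
                }
                rows.add(row);
            }
            return rows;
        }
    }

    /** Builds the query text; only plain identifiers are accepted. */
    static String selectAllSql(String table) {
        if (!table.matches("[A-Za-z_][A-Za-z0-9_$#]*")) {
            throw new IllegalArgumentException("Unsupported table name: " + table);
        }
        return "SELECT * FROM " + table;
    }
}
```

Calling `readAll` for each name returned by `tableNames` covers the "all rows from all tables" case, though `readAll` holds a whole table in memory, so large tables would need streaming instead. How `rs.getObject` represents dates, blobs and user-defined types depends on the driver, which is the data type concern raised above.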

MyBatis provides a pretty simple way to map data results to objects and back, maybe check that out?
http://code.google.com/p/mybatis/

Not to be flip, but for this task, you might want to check out Ruby on Rails and its ActiveRecord approach
