I am developing a Facebook-style application for my institute,
and I am stuck on the friends module, i.e. how to determine whether particular users are someone's friends.
I googled a lot but didn't find any satisfactory answers.
What I found is this: a person will have many friends, and storing users and their friends in a separate table will only increase redundancy and database size.
I thought of using a graph with users as vertices and connections as edges.
But how do I implement something like that in a database?
Or how does Facebook handle such a huge number of relationships?
Personally, I would have a dedicated table for it.
You could have a table with just two columns: userID and friendID.
Since the relationships between users in the database will be many-to-many, normalizing them requires a link table, which breaks the many-to-many relationship into two one-to-many relationships.
http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html#03
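A minimal sketch of such a link table, using Python's built-in sqlite3 module purely for illustration. The table and column names (users, friendships, user_id, friend_id) and the helper functions are assumptions, not something taken from the answers above. The sketch also assumes that friendship is mutual, so each pair is stored once with the smaller ID first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    -- Link table: one row per pair of friends (hypothetical names).
    CREATE TABLE friendships (
        user_id   INTEGER NOT NULL REFERENCES users (id),
        friend_id INTEGER NOT NULL REFERENCES users (id),
        PRIMARY KEY (user_id, friend_id),
        CHECK (user_id < friend_id)  -- store each pair once, smaller ID first
    );
""")


def add_friendship(a, b):
    """Record that users a and b are friends (order of arguments is irrelevant)."""
    low, high = sorted((a, b))
    conn.execute(
        "INSERT OR IGNORE INTO friendships (user_id, friend_id) VALUES (?, ?)",
        (low, high),
    )


def are_friends(a, b):
    """Return True if a row exists for this pair."""
    low, high = sorted((a, b))
    row = conn.execute(
        "SELECT 1 FROM friendships WHERE user_id = ? AND friend_id = ?",
        (low, high),
    ).fetchone()
    return row is not None


def friends_of(user):
    """Return the sorted IDs of all friends of the given user."""
    rows = conn.execute(
        "SELECT friend_id FROM friendships WHERE user_id = ? "
        "UNION SELECT user_id FROM friendships WHERE friend_id = ?",
        (user, user),
    ).fetchall()
    return sorted(r[0] for r in rows)


conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
add_friendship(1, 2)
add_friendship(3, 1)
print(are_friends(2, 1), are_friends(2, 3), friends_of(1))
```

The composite primary key doubles as the index for the "are these two users friends?" lookup, so the table stays small: two integers per friendship and no duplicated user data.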
This kind of problem is usually solved by using a different type of database. For a social network, a graph database makes sense, as nodes and relationships are first-class citizens in it. There's a social network example for the Neo4j graph database; the full source code of the example is included in the standard download package. I've also written a blog post on this theme, with another example as a starting point.
I am making an application for a campground that has a list of camping sites, each of which has a list of reservations. There can be any number of reservations for each camping site and any number of camping sites.
I started off by just storing this data in a CSV file and using that to handle everything. This is working really well because, after a few columns of identifying information for each site, I just have an arbitrary number of reservations that I can go through. It's also really easy to add a new site if I want to add that functionality to my application, since I can just add a new line and leave the reservation part blank if there aren't any reservations.
I'm thinking that I should instead use a database to store this information for the following reasons:
I need to be able to search and modify specific sites in addition to just iterating through them in order. With the CSV solution, iterating through them in order has been great when I need all of them, but I don't want to have to iterate through everything to find a specific reservation. Modifying the CSV isn't looking great either.
I am hoping to gain a better understanding of databases in general, and I don't believe using a CSV is the most professional way to do this, nor would it be best practice.
How exactly could I do this? If, for example, I had only one camping site, I would just add another table that represents the reservations for that site. But since I have at least 100 camping sites and can have an arbitrary number of them, this doesn't seem to make sense. Am I just not understanding something, or am I correct in coming to that conclusion?
Some more information:
Since I haven't said it yet, this is a Java application with an Apache Derby database.
This project is not for a class or a job, I'm just trying to practice my programming skills and help someone out who runs a campground at the same time.
We are migrating a whole application originally developed in Oracle Forms a few years back, to a Java (7) web based application with Hibernate (4.2.7.Final) and Hibernate Search (4.1.1.Final).
One of the requirements is that while users are using the new, migrated version, they are still able to use the Oracle Forms version, so the Hibernate Search indexes will get out of sync. Is it feasible to implement a servlet so that some PL/SQL code can call a URL that updates the local indexes on the application server (AS)?
I thought of implementing some sort of clustering mechanism for Hibernate, but as I read through the documentation I realised that, while clustering may be a good option for scalability and performance, it may be overkill for keeping legacy data in sync.
Does anyone have any idea how to implement a service, accessible via a servlet, that updates the local AS indexes for a given model entity with a given ID?
I don't know exactly what you mean by the clustering part, but anyway:
It seems like you are facing a problem similar to mine. I am currently creating a Hibernate Search adaptation for JPA providers other than Hibernate ORM (EclipseLink, TopLink, etc.), and at the moment I am working on an automatic reindexing feature. Since JPA doesn't have an event system suitable for reindexing with Hibernate Search, I came up with the idea of using database-level triggers to keep track of everything.
For a basic OneToOne relationship it's pretty straightforward. For other things, like relation tables or anything that is not stored in the main table of an entity, it gets a bit trickier, but once you have a system for OneToOne relationships it's not that hard to take that next step. Okay, let's start:
Imagine two entities, Place and Sorcerer, in the Lord of the Rings universe. To keep things simple, let's just say they are in a (quite restrictive :D) 1:1 relationship with each other. Normally you end up with two tables named SORCERER and PLACE.
Now you have to create three triggers (one for CREATE, one for DELETE and one for UPDATE) on each table (SORCERER and PLACE). These triggers store information about which entity has changed (only the ID; for mapping tables there are always multiple IDs) and how (CREATE, UPDATE, DELETE) in special UPDATE tables. Let's call these PLACE_UPDATES and SORCERER_UPDATES.
In addition to the ID of the original object that has changed and the event type, these tables need an ID field that is UNIQUE across all UPDATE tables. This is needed because when you feed information from the update tables to the Hibernate Search index, you have to make sure the events are applied in the right order or you will break your index. How to create such a UNIQUE ID on your database should be easy to find on the internet/Stack Overflow.
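A minimal sketch of the trigger setup described above, using SQLite through Python's sqlite3 module only so that it is runnable; the real database and syntax will differ. The column names (update_id, entity_id, event_type), the trigger names, and the single-row update_seq counter table are assumptions made for illustration, not the project's actual schema. The shared counter is one possible way to get an ID that is unique across all update tables and also increases over time; the "CREATE" event is logged by an AFTER INSERT trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sorcerer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE place    (id INTEGER PRIMARY KEY, name TEXT);

    -- Hypothetical single-row counter shared by every update table.
    CREATE TABLE update_seq (value INTEGER NOT NULL);
    INSERT INTO update_seq (value) VALUES (0);

    CREATE TABLE sorcerer_updates (
        update_id  INTEGER PRIMARY KEY,
        entity_id  INTEGER NOT NULL,
        event_type TEXT NOT NULL
    );
    CREATE TABLE place_updates (
        update_id  INTEGER PRIMARY KEY,
        entity_id  INTEGER NOT NULL,
        event_type TEXT NOT NULL
    );
""")

# SQL trigger event -> (logged event type, row alias that holds the id)
EVENTS = {
    "INSERT": ("CREATE", "NEW"),
    "UPDATE": ("UPDATE", "NEW"),
    "DELETE": ("DELETE", "OLD"),
}

# Three triggers per entity table, each writing one row to its update table.
for table in ("sorcerer", "place"):
    for sql_event, (event_type, row) in EVENTS.items():
        conn.execute(f"""
            CREATE TRIGGER {table}_{sql_event.lower()}_log
            AFTER {sql_event} ON {table}
            BEGIN
                UPDATE update_seq SET value = value + 1;
                INSERT INTO {table}_updates (update_id, entity_id, event_type)
                VALUES ((SELECT value FROM update_seq), {row}.id, '{event_type}');
            END
        """)

conn.execute("INSERT INTO sorcerer (id, name) VALUES (1, 'Gandalf')")
conn.execute("INSERT INTO place (id, name) VALUES (1, 'Isengard')")
conn.execute("UPDATE sorcerer SET name = 'Gandalf the White' WHERE id = 1")
conn.execute("DELETE FROM place WHERE id = 1")

query = "SELECT update_id, entity_id, event_type FROM {} ORDER BY update_id"
sorcerer_log = conn.execute(query.format("sorcerer_updates")).fetchall()
place_log = conn.execute(query.format("place_updates")).fetchall()
print(sorcerer_log)
print(place_log)
```

Because both tables draw their update_id from the same counter, the two logs can later be interleaved into the exact order in which the changes happened.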
Okay. Now that you have set up the triggers correctly, you just have to find a way to access all the UPDATES tables in a feasible fashion and then update the index. (I do this by querying multiple tables at once, sorting each query by our UNIQUE ID field, and then comparing the first result of each query with the others.)
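Comparing the first result of each sorted query with the others amounts to a k-way merge. A small sketch in Python, assuming each update table has already been read into a list sorted by its unique ID; the tuple layout (update_id, table, entity_id, event_type), the sample rows, and the function name are hypothetical and are not taken from the linked project.

```python
import heapq

# Rows as they might come back from each update table, each already
# sorted by the unique update_id: (update_id, table, entity_id, event_type).
sorcerer_updates = [
    (1, "SORCERER", 1, "CREATE"),
    (3, "SORCERER", 1, "UPDATE"),
]
place_updates = [
    (2, "PLACE", 1, "CREATE"),
    (4, "PLACE", 1, "DELETE"),
]


def events_in_order(*sorted_logs):
    """Merge several sorted logs into one stream ordered by update_id.

    heapq.merge repeatedly takes the smallest head element among the
    inputs, which is the "compare the first result of each query" step.
    """
    return list(heapq.merge(*sorted_logs, key=lambda event: event[0]))


ordered = events_in_order(sorcerer_updates, place_updates)
for update_id, table, entity_id, event_type in ordered:
    # A real indexer would apply the event to the search index here.
    print(update_id, table, entity_id, event_type)
```

Since heapq.merge is lazy, the same approach works with database cursors instead of lists, as long as each cursor yields rows in update_id order.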
This can be a bit tricky and you have to find the correct ways of dealing with the specific update event but it can be done (that's what I am currently working on).
If you're interested in that part, you can find it here:
https://github.com/Hotware/Hibernate-Search-JPA/blob/master/hibernate-search-db/src/main/java/com/github/hotware/hsearch/db/events/IndexUpdater.java
The link to the whole project is:
https://github.com/Hotware/Hibernate-Search-JPA/
This uses Hibernate-Search 5.0.0.
I hope this was of help (at least a little bit).
And about your remote indexing problem:
The update tables can easily be used as some kind of dump for events until you send them to the remote machine that is to be updated.
I am practicing by designing a simple todo list webapp in which a user can authenticate and save todo list items. The user is also only able to view/edit the todo list items that they added.
This seems to be a general feature (authenticated user only views their own data) in most web applications (or applications in general).
What is important to me is knowing the different options for accomplishing this. What I would like to achieve is a solution that can handle lots of users' data effectively. At the moment I am doing this with a relational database, but NoSQL answers would be useful to me as well.
The following ideas came to mind:
Add a user_id column each time this "feature" is needed.
Add an association table (in the example above a user_todo_list_item table) that associates the data.
Design in such a way that you have a table per user per "feature", so you would have a todolist_userABC table. It's an option, but I do not like it much since a thousand users means a thousand tables?!
Add row-level security to the specific "feature". I am not familiar with how this works, but it seems to be a valid option. I am also not sure whether this is database vendor specific.
Of my choices I went with the user_id column on the todolist_item table. Although it can do the job, I feel that a user_id column might be problematic when reading data if the data within the table gets large enough. One could add an index I guess but I am not sure of the index's effectiveness.
What I don't like about it is that I need to have a user_id for every table where I desire this type of feature which doesn't seem correct to me? It also seems that when I implement the database layer I would have to add this to my queries for every feature (unless I use some AOP)?
I had a look around (How does Trello store data in MongoDB? (Collection per board?)), but it does not speak about the techniques regarding user_id columns or things like that. I also tried reading about this in some security frameworks (Spring Security to be specific) but it seems that it only goes into privileges/permissions on a table level and not a row level?
So the question is whether my choice was appropriate and if there are better techniques to do this?
Your choice is the natural thing to do.
The table-per-user is a non-starter (anything that modifies the database structure in response to user action is usually suspect).
Row-level security isn't really an option for webapps - it requires each user session to have a separate, persistent connection to the database, which is rarely practical. And yes, it is vendor-specific.
How you index your tables depends entirely on your usage patterns and the types of queries you want to run. Is 'show all TODOs for a user' a query you want to support (seems like it would be)? Then an index on the user id is obviously needed.
Why does having a user_id column seem wrong to you? If you want to restrict access by user, you need to be able to identify which user the record belongs to. That doesn't actually mean that every table needs it. For example, if one record is composed of others (say, your TODOs have 'steps', and each step belongs to a single TODO), only the root of the object graph needs the user id.
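As a rough sketch of this layout (Python's sqlite3 is used only for illustration; the table names todo_item and todo_step, the index name, and the helper functions are hypothetical): the user_id lives only on the root table, is indexed, and every query for child rows goes through the root so that it is still scoped to the owner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE todo_item (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        title   TEXT NOT NULL
    );
    -- Supports the 'show all TODOs for a user' query.
    CREATE INDEX idx_todo_item_user ON todo_item (user_id);

    -- Steps belong to a TODO, so they carry no user_id of their own.
    CREATE TABLE todo_step (
        id          INTEGER PRIMARY KEY,
        todo_id     INTEGER NOT NULL REFERENCES todo_item (id),
        description TEXT NOT NULL
    );
""")


def todos_for(user_id):
    """All TODO items owned by the given user."""
    return conn.execute(
        "SELECT id, title FROM todo_item WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()


def steps_for(user_id, todo_id):
    """Steps of one TODO, returned only if that TODO belongs to the user."""
    rows = conn.execute(
        """
        SELECT s.description
        FROM todo_step AS s
        JOIN todo_item AS t ON t.id = s.todo_id
        WHERE t.user_id = ? AND t.id = ?
        ORDER BY s.id
        """,
        (user_id, todo_id),
    ).fetchall()
    return [r[0] for r in rows]


conn.execute("INSERT INTO todo_item (id, user_id, title) VALUES (1, 10, 'Plan trip')")
conn.execute("INSERT INTO todo_item (id, user_id, title) VALUES (2, 20, 'Buy milk')")
conn.execute("INSERT INTO todo_step (todo_id, description) VALUES (1, 'Book flights')")

print(todos_for(10))
print(steps_for(10, 1))
print(steps_for(20, 1))  # user 20 does not own TODO 1
```

Keeping the ownership check inside the data-access functions means callers cannot forget the user_id filter, which addresses the worry about repeating it in every query.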
Basically, I am wondering how you would do in CouchDB what you would do in MySQL: store the username and password in one table and link the user id as a foreign key in another table of tasks?
Should I just use MySQL for the user authentication part and CouchDB to store lots of user-submitted documents, and create a random unique token to link each user to their "documents" in CouchDB?
Also, I am looking to store Java objects in CouchDB and retrieve them to be used directly in my application. Which Java CouchDB library does this? Ektorp's example seems more complicated compared to couchdb4j.
I do not know Java very well, but I suggest using the simplest tool you can find. CouchDB is very simple, and it is usually most beneficial to access it with simple tools too.
Yes, if you will have many relationships in the data, MySQL will help. However, CouchDB can do some simple has-many queries.
First, there is view collation. You use map/reduce, and for every "child" document, you emit a key pointing to the parent document. When you query for ?key=parent then you get a long list of children. (The wiki explains it pretty well.)
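To make the idea concrete, here is a plain-Python simulation of what such a view does. It is not CouchDB code: in CouchDB the map function would be JavaScript stored in a design document and the query would be an HTTP request. The document fields (type, owner_id) and the helper names are assumptions chosen for illustration.

```python
def map_child(doc):
    """Stand-in for a CouchDB map function: yield (key, value) pairs.

    Each "task" document emits a key pointing at its parent user document.
    """
    if doc.get("type") == "task":
        yield doc["owner_id"], doc["_id"]


docs = [
    {"_id": "user:alice", "type": "user", "name": "alice"},
    {"_id": "task:1", "type": "task", "owner_id": "user:alice", "title": "Write report"},
    {"_id": "task:2", "type": "task", "owner_id": "user:bob", "title": "Fix bug"},
    {"_id": "task:3", "type": "task", "owner_id": "user:alice", "title": "Call client"},
]


def build_view(documents, map_fn):
    """Run the map function over every document and sort the rows by key."""
    rows = [row for doc in documents for row in map_fn(doc)]
    return sorted(rows)


def query(view, key):
    """Simulate querying the view with ?key=<parent id>."""
    return [value for k, value in view if k == key]


view = build_view(docs, map_child)
print(query(view, "user:alice"))
```

The view plays the role of the foreign-key index in MySQL: children are looked up by the parent's id without the parent document having to list them.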
Secondly, I suggest the article What's new in CouchDB 0.11 which shows how to use document _ids to link between two documents.
Good luck!
I am building an app that lets users sign up, and other users see data about those who have already registered. If I record the userID of everyone who signs up, how can I look up data about those userIDs later?
(I'm using the Java SDK.)
If you have signed-up users and data about them, having some sort of SignedUpUser Entity makes sense and is straightforward. From there it's a matter of arranging to construct indexes that support the types of lookups that you'll be doing (e.g., by name, by recency of activity). At this level, it's not much different from how you'd construct this on top of an RDBMS.