Really my question is: if I were to use a nested data structure in an OODB, would I be placing instances of classes within other instances in the db, or is some sort of relational mapping required?
I've been interested in OODBs (Object Oriented Databases) for a year or so now. I'm a web/application developer in essence, and for a while now I have been noticing severe limitations, both in terms of complexity and performance, of representing complex hierarchical structures such as a website hierarchy in relational systems such as MS SQL Server (T-SQL) and MySQL.
To present a quick Java (pseudo-code) example:
CLASS/DB TYPE:
public class PageObject {
    public String title = "";
    public String shortname = "";
    public boolean published = false;
    public PageObject[] pages = null;

    public PageObject() {}
}
So if we started with this class, which is capable of holding other instances of the same class in its pages array (or vector, or collection, or whatever), we could eventually end up with the possibility of a site layout such as:
Home
    First Home Child
        1
        2
        3
    Second Home Child
    Third Home Child
Looking at this we can see that the Home item would have 3 items stored in its pages collection, with the First Home Child item within this collection having a further 3 items in its own pages collection.
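For illustration, that layout can be built in plain Java; this is only a sketch mirroring the PageObject class above, with a varargs constructor added for convenience:

```java
public class SiteExample {
    // Minimal mirror of the PageObject class from the question
    static class PageObject {
        String title;
        PageObject[] pages = new PageObject[0];

        PageObject(String title, PageObject... pages) {
            this.title = title;
            this.pages = pages;
        }
    }

    static PageObject buildSite() {
        // First Home Child holds three pages of its own
        PageObject firstChild = new PageObject("First Home Child",
                new PageObject("1"), new PageObject("2"), new PageObject("3"));
        // Home holds three children in its pages collection
        return new PageObject("Home",
                firstChild,
                new PageObject("Second Home Child"),
                new PageObject("Third Home Child"));
    }

    public static void main(String[] args) {
        PageObject home = buildSite();
        System.out.println(home.pages.length);           // 3
        System.out.println(home.pages[0].pages.length);  // 3
    }
}
```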
If we were to then store this structure in DB4O (or any other OODB) would this present problems in terms of performance, in that any calls for top level objects such as the homepage would also return ALL items beneath them, assuming the database grows significantly?
This question might appear quite subjective, for which I apologise in advance, but I just can't seem to wrench my head out of the relational model so am having real problems even trying to plan out any kind of data model, before I progress into further work in code.
Any clarity anyone can shed on this would be absolutely appreciated at this stage! Cheers in advance for any thoughts!
This is precisely where OODBs are a good fit: when you're dealing with complex object hierarchies where tables and joins feel like overkill.
db4o (and other OODBs such as Versant's VOD) do not need joins (as in an RDBMS) and handle the relationships between objects transparently (as defined in your object model). In essence your object model ends up being your data model or schema. These OODBMS systems usually perform better than an RDBMS when dealing with nested structures and can even handle cyclic references.
To avoid loading/storing more objects than expected, OODBMSs can work with arbitrary levels of object activation (or update) depth (for example, in your case you can tell the db to only retrieve/update the first level of home children). Alternatively you can configure them to work in transparent persistence mode (as Sam suggests), where the db only retrieves or updates what you access on demand (i.e. as you navigate your object tree).
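The effect of a bounded activation depth can be illustrated in plain Java. This is only a sketch of the idea, not the db4o API: loading at depth 1 gives you the home page and its direct children, without pulling the rest of the tree.

```java
import java.util.Arrays;

public class ActivationSketch {
    static class Page {
        String title;
        Page[] pages = new Page[0];

        Page(String title, Page... pages) {
            this.title = title;
            this.pages = pages;
        }
    }

    // Copy the tree only down to 'depth' levels; deeper pages are left out,
    // which is roughly what an activation depth of 'depth' gives you in db4o.
    static Page loadToDepth(Page source, int depth) {
        Page copy = new Page(source.title);
        if (depth > 0) {
            copy.pages = Arrays.stream(source.pages)
                    .map(p -> loadToDepth(p, depth - 1))
                    .toArray(Page[]::new);
        }
        return copy;
    }

    public static void main(String[] args) {
        Page home = new Page("Home",
                new Page("First Home Child", new Page("1"), new Page("2"), new Page("3")),
                new Page("Second Home Child"),
                new Page("Third Home Child"));
        Page shallow = loadToDepth(home, 1);
        System.out.println(shallow.pages.length);          // 3 direct children
        System.out.println(shallow.pages[0].pages.length); // 0: grandchildren not "activated"
    }
}
```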
More info (db4o):
http://developer.db4o.com/Documentation/Reference/db4o-8.0/java/reference/Content/basics/activation.htm
HTH
Best!
German
If your hierarchy is truly a tree, wouldn't it be better to model this using a parent relationship (sorry, I can't bring myself to use a class named PageObject):
class Page {
    Page parent = null;
}
? You could then find the roots by searching for all Pages that have a null parent.
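Outside a database, that root query is a one-line predicate; a db4o native query would use the same condition. A minimal sketch:

```java
import java.util.ArrayList;
import java.util.List;

public class RootQuery {
    static class Page {
        String title;
        Page parent; // null means this page is a root

        Page(String title, Page parent) {
            this.title = title;
            this.parent = parent;
        }
    }

    // Roots are exactly the pages whose parent is null
    static List<Page> findRoots(List<Page> all) {
        List<Page> roots = new ArrayList<>();
        for (Page p : all) {
            if (p.parent == null) roots.add(p);
        }
        return roots;
    }

    public static void main(String[] args) {
        Page home = new Page("Home", null);
        Page child = new Page("First Home Child", home);
        List<Page> all = List.of(home, child, new Page("1", child));
        System.out.println(findRoots(all).size()); // 1
    }
}
```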
In general, you should also learn about transparent activation.
Another way that is 'semi-relational' is to define Page objects with no containment information and containment relationship objects:
class Page {}

class Contains {
    Page container;
    Page contained;
}
Here, pulling a Contains object out of the database at worst pulls out two Pages. You need to manage Page deletions carefully though.
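One way to handle that deletion concern (a hypothetical helper, not a db4o API) is to remove every Contains record that references a page before deleting the page itself, so no relationship object is left dangling:

```java
import java.util.ArrayList;
import java.util.List;

public class ContainsSketch {
    static class Page {
        String title;
        Page(String title) { this.title = title; }
    }

    static class Contains {
        Page container;
        Page contained;
        Contains(Page container, Page contained) {
            this.container = container;
            this.contained = contained;
        }
    }

    // Before deleting a page, drop every relationship record that mentions it,
    // so no Contains object is left pointing at a deleted Page.
    static void deletePage(Page page, List<Page> pages, List<Contains> links) {
        links.removeIf(c -> c.container == page || c.contained == page);
        pages.remove(page);
    }

    public static void main(String[] args) {
        Page home = new Page("Home");
        Page child = new Page("Child");
        List<Page> pages = new ArrayList<>(List.of(home, child));
        List<Contains> links = new ArrayList<>(List.of(new Contains(home, child)));
        deletePage(child, pages, links);
        System.out.println(links.size()); // 0
        System.out.println(pages.size()); // 1
    }
}
```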
PS: pardon my abbreviated Java, I'm getting too used to Scala.
I am currently working with an API where a POST request may create multiple resources, depending on the resource being passed. To give you an example, I have a Reservation resource, and two child resources, Ancillary and SpecialRequest. A reservation is uniquely identified by an alphanumeric string of 6 characters, and an Ancillary is identified by a unique ID within the reservation (i.e. ancillary IDs are only known to the parent reservation).
So, to create an Ancillary, my API endpoint looks like this:
POST /reservations/{reservationId}/ancillaries/
Usually REST states that the resource being created (Ancillary in this case) is the resource that should be returned. However, my use-case is somewhat more complicated than that, since the reservation system which my API is interfacing with is legacy, and is somewhat unpredictable.
There are certain ancillaries (bundles) which actually create multiple ancillaries. For example, an ancillary might be a package of two other ancillaries, which costs less than if you had to purchase the other two ancillaries. Moreover, an ancillary might also be linked automatically to a SpecialRequest.
I'm wondering what my options are ... so far I've come up with the following:
Return the entire Reservation, which is sure to include all sub-resources that were created and/or modified as a result of the Ancillary creation. This is the "safest" option, but in doing so I wouldn't be able to tell the user which Ancillary objects were created.
Return only the Ancillary which was created; however, this approach is likely to make my API less usable, since the API user is extremely likely to perform a GET /reservations/{reservationId}
Return an Ancillary[] regardless of the ancillary type, although this still leaves out the link between the Ancillary and the SpecialRequest.
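For what it's worth, the third option's payload could be sketched as a small response wrapper in Java. All names here are hypothetical, not from the real API:

```java
import java.util.List;

public class CreationResponseSketch {
    // Hypothetical DTO; field names are illustrative, not from the real API
    static class Ancillary {
        String id;
        Ancillary(String id) { this.id = id; }
    }

    // Option 3: return every Ancillary the POST ended up creating,
    // including the extra ones a bundle expands into
    static class AncillaryCreationResponse {
        List<Ancillary> created;
        AncillaryCreationResponse(List<Ancillary> created) { this.created = created; }
    }

    public static void main(String[] args) {
        // A bundle purchase that created two ancillaries
        AncillaryCreationResponse resp = new AncillaryCreationResponse(
                List.of(new Ancillary("A1"), new Ancillary("A2")));
        System.out.println(resp.created.size()); // 2
    }
}
```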
Thanks in advance for your thoughts and input
I'm trying to take the OOP approach in all my XPages. As expected I'm facing several issues, but I also get tons of advantages from doing so.
My question is related to Views (Repeat controls). I am loading a List<myCustomBean> for my repeat controls that contains all available objects of type myCustomBean, and I display each myCustomBean the way I want in a Bootstrap table row. That all works fine.
I'm able to sort my List with URL parameter sortedBy=MySortColumn with my own method. - Problem 1 solved.
How would I approach categorization in my Repeat Control? I could easily sort the beans by the category, but how would I display it, incl. expandable and collapsible twisties? Maybe there is a custom control that I can use? Or a control from the Extension Library?
Or do I have to build everything from scratch myself?
Any advice is much appreciated.
The Data View control is probably the best. Like the View Panel or Data View, it's an extension of the Repeat Control. But it has much more flexibility than the View Panel and allows a much more configurable layout than the Data View. It has a categoryColumn property, but that's designed for binding to a dominoView datasource. But there is also the categoryRow facet which can be used.
Essentially, using a dominoView component is already OOP programming. Your repeat uses List<myCustomBean>; dominoView returns List<DominoViewEntry>. Properties on the dominoView are used to interrogate the underlying View object within the database and return only those ViewEntry objects from the ViewNavigator or ViewEntryCollection that are required. It wraps the ViewEntry as a DominoViewEntry object for just a selection of those, based on the rows property of whatever uses the dominoView.
As someone who built a subset of that functionality for use from Vaadin (see my XPages to Web App blog series http://www.intec.co.uk/tag/xpages-to-web-app-tutorial/), within XPages I typically use the dominoView object unless I'm extracting a small subset of ViewEntries / Documents. When I use ViewEntryCollection / DocumentCollection, I rarely wrap, preferring to let XPages optimise retrieval rather than re-develop that optimisation myself.
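Whichever control ends up rendering it, the categorised data itself can be prepared in plain Java. A sketch grouping a bean list by category (the bean and its getCategory() getter are assumptions standing in for myCustomBean): the outer repeat would iterate the map keys as category rows, the inner repeat each key's list.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CategorizeSketch {
    // Stand-in for myCustomBean; getCategory() is an assumed property
    static class MyCustomBean {
        String category, name;
        MyCustomBean(String category, String name) {
            this.category = category;
            this.name = name;
        }
        String getCategory() { return category; }
    }

    // Group beans by category, keeping categories sorted via TreeMap
    static Map<String, List<MyCustomBean>> byCategory(List<MyCustomBean> beans) {
        return beans.stream().collect(Collectors.groupingBy(
                MyCustomBean::getCategory, TreeMap::new, Collectors.toList()));
    }

    public static void main(String[] args) {
        Map<String, List<MyCustomBean>> grouped = byCategory(List.of(
                new MyCustomBean("B", "two"),
                new MyCustomBean("A", "one"),
                new MyCustomBean("A", "three")));
        System.out.println(grouped.keySet());        // [A, B]
        System.out.println(grouped.get("A").size()); // 2
    }
}
```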
Coming from a Perl background and having done some simple OO in it, I am struggling to grasp the Android/Java way of interacting with databases.
In my experience I would create a file or class for each object, and an object would match/represent a table in the database.
Within that object would be a constructor, variables for the data, methods/functions to return individual variables but also the DB queries to read and write from the DB, doing all the necessary CRUD functions.
From what I read on the web, in Android I would create the objects similarly but without the DB interaction. This would happen in either a single class with all my DB functionality in it, or multiple DB classes, one per table.
My questions are all to do with best practices really.
Can I do my app how I would in Perl? If not, why not; and if so, what are the pros, cons and limitations?
What do the terms DAO, Adapter and POJO mean?
Should I make an application class and globally declare the DB there?
Should I create one handler in each activity or in the application class only?
I have read so many tutorials now my head is spinning, all with a diff way of doing things and mostly only with a single table and few showing actual objects directly representing tables.
I'm happy to hear opinion, be linked to tutorials or just have the odd term explained.
Thanks in advance
If I am reading you correctly, ORMLite may be your best bet. It uses reflection for database creation and maintenance which seems to be how Perl does it.
POJO stands for Plain Old Java Object, which means it is just a normal class.
An adapter would be the class that contains the CRUD stuff and manages the database itself. There are quite a few patterns around in the Android world, and talking about them could fill a book.
I prefer the pattern where I open the database once in my Application class and never close it (Android does that when it kills the app). A sample from a very old project of mine might show you the basic idea.
DAO is Data Access Object and could fill dozens of books. I would just start programming and see where you're heading...
The other poster is correct in putting out ORMLite as a great way to manage code relationships that mirror your database. If you're looking to do it on your own, however, there are a ton of ways to do things, and I wouldn't say the community has really gravitated toward one over the other. Personally, I tend to have my entities represented by Plain Old Java Objects (POJO - implies no special connectivity to other things, like databases), where the various attributes of the table correspond to field values. I then persist and retrieve those objects through a Data Access Object (DAO). The DAO's all have access to a shared, open, Database object - against which they execute queries according to need.
For example: if I had a table foo, I would have a corresponding entity class Foo, with attributes corresponding to columns. class FooDAO would have mechanisms to get a Foo:
public Foo getFooById(Integer id) {
    String[] selectionArgs = { id.toString() };
    String limit = "1"; // SQLiteDatabase.query expects the limit as a String
    Cursor c = mDatabase.query(FOO_TABLE, null, "id=?", selectionArgs, null, null, null, limit);
    // Create a new Foo from the returned cursor and close the cursor
    Foo foo = fooFromCursor(c); // fooFromCursor: your own row-to-object mapper
    c.close();
    return foo;
}
A second entity bar might have many foo. For that, we would, in Bar, reference the FooDAO to get all of bar's member foo:
public class Bar {
    public List<Foo> getFoo() {
        return mFooDAO.getFooByBar(this);
    }
}
etc... The scope of what you can do rolling your own ORM like this is pretty vast, so do as much or as little as you need. Or just use ORMLite and skip the whole thing :)
Also, the Android engineers frown on subclassing Application for globally accessible objects in favour of singletons (see hackbod's answer), but opinions vary.
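A minimal sketch of that singleton approach in plain Java (in a real Android app the held object would be your SQLiteOpenHelper, and getInstance would take a Context):

```java
public class DatabaseHolder {
    // Lazily created, process-wide single instance
    private static DatabaseHolder instance;

    private DatabaseHolder() {
        // in Android, this is where you'd create your SQLiteOpenHelper
    }

    public static synchronized DatabaseHolder getInstance() {
        if (instance == null) {
            instance = new DatabaseHolder();
        }
        return instance;
    }

    public static void main(String[] args) {
        // Every caller sees the same instance
        System.out.println(getInstance() == getInstance()); // true
    }
}
```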
I have three types of objects:
Components: identified by a package and a name
Models: identified by a package and a name
Functions: identified by a name
The relationships are defined as follows:
A Component can have zero or more Model(s) [0..*]
A Model can have zero or more Function(s) [0..*]
A Function can be in one or more Model(s) [1..*]
A Model can be in zero or more Component(s) [0..*]
The natural design would be to have Component hold a set of references to Model(s) and Model hold a set of references to Function(s). With this design I can easily navigate relationships in a top-down fashion (and easily answer queries like: "What Function(s) are contained in this Model?").
The problem is that I need more flexibility.
I would like to have something that's easily navigable to answer these kind of queries:
Given a Function name, in which Models and in turn in which Components is this Function referenced?
Given a Model package+name, in which Components is that Model referenced?
Are there Models not referenced by any Component?
I've thought of having Component, Model and Function as simple POJOs and keeping track of the references between them with multiple HashMaps (HashMap<Component, Model>, HashMap<Model, Component>, HashMap<Model, Function>, HashMap<Function, Model>), but this seems inefficient to me.
Can you suggest something better designed?
The logical data structure to represent your problem domain is a graph, but doing so literally will not provide you the "efficient" means of answering the queries you cite as examples. It would help to know whether these queries are merely examples you've thought of out of what you imagine to be a larger set, or whether they constitute a complete specification of the requirements.
I suspect you won't like this answer, but what I think you would benefit from most here is a relational database. You can embed one that holds all the data in memory if you prefer to avoid some of the normal complication of using such a database. SQLite is one relational database to consider, but there are many others available for use in Java.
I reach that conclusion based on your phrasing. You mention navigating the graph edges (or the aggregating relationships between the entities) in both directions. That's trivial to express in a relational model of the problem, but becomes very difficult when you use in-memory structures like the maps you propose; the former has no implied directionality to the foreign references among relations, whereas the latter can only represent unidirectional references from one entity to another.
In the relational model, you're able to express facts as follows:
There are entities called Components, that have these properties.
There are entities called Models, that have these other properties.
There are links between Components and Models, where every Model is "reachable from" any number of Components, and every Component "can reach" any number of Models. (Note that I did not write "contained within" or "owns" or any other relationship that suggests exclusivity.)
The relational model allows one to evaluate queries against these relations without any bias as to which way the links "point." Of course, the links are assertions with some meaning—likely directional, making the logical graph a directed rather than an undirected graph—specific to your problem domain, but it's your application that understands that meaning, not the database or the relational model that governs its operation.
Beyond the logical model, answering your queries as efficiently as possible will require you to specify that the database maintain some non-constraint-based indices. Some of the records can be looked up efficiently without you asking for anything special beyond integrity constraints, as the database will likely build indices on its own to aid in efficient enforcement of the stated constraints. But while it may be able to tell you quickly whether there are already any pairings between a given Component and a Model, it won't be ready to answer which Components, if any, reference a particular Model.
Requesting that the database maintain such indices is akin to you maintaining some of the in-memory maps that you proposed originally, but there's a difference in the design approaches: In your design, some of the potential queries that will emerge can't be answered at all, because the relationships won't be captured in a way that they can be navigated, even inefficiently. With the database, though, adapting to new queries is usually a matter of defining additional indices to help speed up queries over data that is already there. In other words, the database can still answer your nascent queries; it just may have to struggle in embarrassing ways to do so. Not until you define the proper indices will it then be ready to handle those queries as efficiently as the others you've anticipated.
I'll make one more point here. Using a relational database here may be overkill technically, but it is the right didactic choice. It's the kind of solution that your problem deserves. You can build something more narrow, more tailored, that provides a small subset of the database's capabilities and meets your needs, but in doing so, I think you're missing out on the larger design lesson here. You will have used your problem to learn something about how to implement a database, rather than learning about how to employ a database to model your problem. Making the latter both possible and easy is the reason the industry has made such database technology available.
The data structure that will allow you to track all of the relationships you describe in a single collection is a MultiMap. There is a discussion of Java MultiMaps in the Map Interface section of the Java Tutorials (you will have to scroll down to the MultiMaps section or follow the link and search the page for MultiMap; there is no direct anchor to that section of the tutorial). There are available implementations of MultiMap for Java:
The Apache Commons Collections: org.apache.commons.collections.MultiMap
The Google Collections Library: com.google.common.collect.Multimap
Using a MultiMap, you can create a mapping for your object types that can contain or aggregate the others:
//Associate multiple Models to one Component:
multiMap.put( componentD, modelN );
multiMap.put( componentD, modelO );
multiMap.put( componentD, modelP );
//Associate multiple Models to a Component (some different, some the same)
multiMap.put( componentE, modelQ );
multiMap.put( componentE, modelR );
multiMap.put( componentE, modelN ); //also associated with componentD
//And associate multiple Functions to one Model:
multiMap.put( modelQ, functionG );
multiMap.put( modelQ, functionH );
multiMap.put( modelQ, functionI );
You may later retrieve the Collection that is associated to any mapped key. Here is an example using the Apache Commons Collections MultiHashMap:
Collection modelFunctions = multiMap.get( modelQ );
This approach will make it easy to traverse top-down from Component to Model to Function. But you can also make it easy to perform bottom-up traversal if you add both ends of each relationship to the MultiMap. For example, to establish the relationship between a Model and a Function, you could:
multiMap.put( modelR, functionJ );
multiMap.put( functionJ, modelR );
Because both relationships have been mapped, you can easily retrieve all of the Functions contained within a Model (as in the example above) or just as easily retrieve all of the Models that contain a Function:
Collection functionModels = multiMap.get( functionJ );
Of course, this also means that if you want to break a relationship you must remember to remove both mappings from the MultiMap, but that is fairly straightforward. I hope this helps you.
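If you'd rather not pull in a library, the same bidirectional idea is a few lines over the standard HashMap. This is a sketch, not the Commons/Guava API; strings stand in for your Model and Function objects:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class MultiMapSketch {
    private final Map<Object, Set<Object>> map = new HashMap<>();

    // Store the relationship in both directions so it can be walked
    // top-down (model -> functions) and bottom-up (function -> models)
    public void relate(Object a, Object b) {
        map.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        map.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    public Set<Object> get(Object key) {
        return map.getOrDefault(key, Set.of());
    }

    public static void main(String[] args) {
        MultiMapSketch mm = new MultiMapSketch();
        String modelR = "modelR", functionJ = "functionJ";
        mm.relate(modelR, functionJ);
        System.out.println(mm.get(modelR));    // [functionJ]
        System.out.println(mm.get(functionJ)); // [modelR]
    }
}
```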
Another option would be to abstract away creation and querying, giving you the possibility to easily optimize or extend a particular implementation when it becomes a bottleneck in terms of performance or ease of use.
Such an interface would look like the following:
public interface IEntityManager {
    // methods for retrieving entities by plain queries
    // (note: 'package' is a reserved word in Java, hence 'pkg')
    Component getComponent(String pkg, String componentName);
    Model getModel(String pkg, String modelName);
    Function getFunction(String name);

    // more specific queries
    List<Component> getComponents(Function function);
    List<Model> getModels(Function function);

    // even more specific queries ...
}
In this way one can use this interface in the production code while providing arbitrary implementations with desired level of performance.
Now, concerning a concrete implementation, there is a slight difference depending on whether all Component, Model and Function instances and their relationships:
are created somewhere at the beginning of the application (e.g. at the start of the main method for a desktop app, or in ServletContextListener.contextInitialized for a web app) and are not changed during application execution
are created / removed / updated during application execution
Let's start with the 1st case because it's simpler: one should make sure that all Component, Model and Function instances (and the relationships between them) are known to the instance of IEntityManager which is shared by the logic using these entities. One of the most straightforward ways to achieve this is to put the entity classes into the same Java package as the IEntityManager implementation and make their constructors package-private, thus moving creation logic into the concrete IEntityManager implementation:
package com.company.entities;
public class Component {
    // 'package' is a reserved word in Java, so the field is called 'pkg'
    public final String pkg;
    public final String name;

    Component(String pkg, String name) {
        this.pkg = pkg;
        this.name = name;
    }
    // ...
}
// similar Model and Function class declarations
public class EntityManager implements IEntityManager {

    // Pair stands for any immutable (pkg, name) key type,
    // e.g. javafx.util.Pair or a small record of your own
    private final Map<Pair<String, String>, Component> components = new HashMap<>();
    private final Map<Pair<String, String>, Model> models = new HashMap<>();
    private final Map<String, Function> functions = new HashMap<>();

    // variation of a factory method
    public Component addComponent(String pkg, String name) {
        // only this EntityManager can create instances,
        // so one can be sure all instances are tracked by it
        final Component newComponent = new Component(pkg, name);
        components.put(new Pair<>(pkg, name), newComponent);
        return newComponent;
    }

    // ... addModel, addFunction methods

    public void addFunctionToModel(Function function, Model model) {
        // here one should store 'somehow' the information that function is related to model
    }

    public void addModelToComponent(Model model, Component component) {
        // here one should store 'somehow' the information that model is related to component
    }

    // ... other methods
}
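The 'somehow' in addFunctionToModel could, for example, be two maps held inside the EntityManager, one per direction of the relationship. A sketch, with strings standing in for the Function and Model entities:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RelationStore {
    // One map per direction, so both getModels(function) and
    // getFunctions(model) are single lookups
    private final Map<String, Set<String>> functionsByModel = new HashMap<>();
    private final Map<String, Set<String>> modelsByFunction = new HashMap<>();

    public void addFunctionToModel(String function, String model) {
        functionsByModel.computeIfAbsent(model, k -> new HashSet<>()).add(function);
        modelsByFunction.computeIfAbsent(function, k -> new HashSet<>()).add(model);
    }

    public Set<String> getModels(String function) {
        return modelsByFunction.getOrDefault(function, Set.of());
    }

    public Set<String> getFunctions(String model) {
        return functionsByModel.getOrDefault(model, Set.of());
    }

    public static void main(String[] args) {
        RelationStore store = new RelationStore();
        store.addFunctionToModel("f1", "m1");
        store.addFunctionToModel("f1", "m2");
        System.out.println(store.getModels("f1"));    // m1 and m2, in some order
        System.out.println(store.getFunctions("m1")); // [f1]
    }
}
```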
Please note that in the 1st case (all entities created at the beginning of the application) one can also use the Builder pattern to create the instance of the EntityManager class; this clearly abstracts creation logic from usage. In the 2nd case, however, one should have methods like addComponent and addModelToComponent in the class (for multi-threaded usage of a single EntityManager instance, one should consider making the methods that modify its state thread-safe).
And finally, concerning how exactly one should store the relationships between entities: there is no silver bullet for efficiently storing/retrieving entities with such relationships. I would say that if one has no more than 1000 entities with not too many relationships, a search in a HashMap<String, Function> will be quite fast and shouldn't be a bottleneck. And if it does become a bottleneck, one should inspect thoroughly which kinds of queries are used more often and which are used rarely, and based on those observations tune the EntityManager's inner implementation; hence the suggestion to abstract everything behind the IEntityManager interface.
Concerning the EntityManager instance in the application: obviously, there should be only one instance of it (unless one has a very special case). One can achieve this using the Singleton pattern (the less preferred solution, though it may work fine for some time), or by simply instantiating EntityManager at the beginning of the application and passing the explicit instance to the classes/methods which need it (the more preferred solution).
Hope this helps ...
I am implementing a game/application where the player's account/state is synced to the server. I am contemplating a general framework for communicating modifications of nested objects of an entity (the entity being the user's account). For purposes of discussing computation/reflection, let us assume that both the client and server are written in Java (in reality the client is in ActionScript, which can modify properties dynamically).
Take for instance Firebase. Modifications to any object of the root object (a Firebase object) are propagated with a request that probably looks something like:
Service: PersistenceService
Action: modifiedObjects
Body:
Objects [{"/full/Path/To/Object/1","newValue"},{"/full/Path/to/Object/2","newValue"}]
My request for your input is the following:
1) Please correct and/or augment the following thoughts on implementing this general framework for propagating modifications to a tree of objects.
On the sending side, it would appear that every object either:
1) Needs to store its full path from the root entity
2) Changes to properties of all nested objects need to be done reflectively
OR
A sync needs to be forced, comparing the entity's saved object tree from the last request to the current object tree for modifications.
On the server side, one can analyze the paths of the objects to cache objects that are accessed multiple times in one request so as not to access the tree by reference/search collections multiple times.
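Applying a {path, newValue} modification reflectively, as in option 2 above, can be sketched like this in Java. The classes, field names and the '/'-separated path format are assumptions for illustration:

```java
import java.lang.reflect.Field;

public class PathUpdateSketch {
    // Hypothetical nested entity: an account with a profile
    static class Profile { String nickname = "old"; }
    static class Account { Profile profile = new Profile(); }

    // Walk a '/'-separated field path from the root and set the final field.
    // Only simple object fields are handled here; arrays/collections would
    // need extra path syntax, e.g. an index segment.
    static void applyChange(Object root, String path, Object newValue) throws Exception {
        String[] parts = path.split("/");
        Object current = root;
        for (int i = 0; i < parts.length - 1; i++) {
            Field f = current.getClass().getDeclaredField(parts[i]);
            f.setAccessible(true);
            current = f.get(current);
        }
        Field last = current.getClass().getDeclaredField(parts[parts.length - 1]);
        last.setAccessible(true);
        last.set(current, newValue);
    }

    public static void main(String[] args) throws Exception {
        Account account = new Account();
        applyChange(account, "profile/nickname", "new");
        System.out.println(account.profile.nickname); // new
    }
}
```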
The answer I have come up with is actually very obvious, and obviously the best way to do it: mirror a database of tables. Assign each object an ID and store every object in an ArrayList (or assign each object a unique ID based on type, and store the object in its type's ArrayList, which is itself stored in a HashMap).
I call my interfaces ServiceObject and ServiceContainer.
Now the only thing I have to check works is how JSON and Protostuff serialize dual references to objects. Are they serialized as separate objects? If so, then any nested ServiceObjects need to be deserialized as references to the objects in the ArrayList.
Generally, the Observer pattern is the answer to the kind of requirement you have (from the wiki):
The observer pattern (aka. Dependents, publish/subscribe) is a software design pattern in which an object, called the subject, maintains a list of its dependents, called observers, and notifies them automatically of any state changes, usually by calling one of their methods. It is mainly used to implement distributed event handling systems.
You need an implementation that works across client and server, hence the example given on the wiki is not directly applicable; you might want to check this:
http://deepintojee.wordpress.com/2011/03/18/observer-pattern-applied-at-remote-level/