I have three types of objects:
Components: identified by a package and a name
Models: identified by a package and a name
Functions: identified by a name
The relationships are defined as follows:
A Component can have zero or more Model(s) [0..*]
A Model can have zero or more Function(s) [0..*]
A Function can be in one or more Model(s) [1..*]
A Model can be in zero or more Component(s) [0..*]
The natural design would be for Component to have a set of references to Model(s) and for Model to have a set of references to Function(s). With this design I can easily navigate relationships in a top-down fashion (and easily answer queries like: "What Function(s) are contained in this Model?").
The problem is that I need more flexibility.
I would like to have something that's easily navigable to answer these kind of queries:
Given a Function name, in which Models and in turn in which Components is this Function referenced?
Given a Model package+name, in which Components is that Model referenced?
Are there Models not referenced by any Component?
I've thought of having Component, Model and Function as simple POJOs and keeping track of references between them with multiple HashMaps (HashMap<Component, Model>, HashMap<Model, Component>, HashMap<Model, Function>, HashMap<Function, Model>), but this seems inefficient to me.
Can you suggest a better design?
The logical data structure to represent your problem domain is a graph, but doing so literally will not provide you the "efficient" means of answering the queries you cite as examples. It would help to know whether these queries are merely examples you've thought of out of what you imagine to be a larger set, or whether they constitute a complete specification of the requirements.
I suspect you won't like this answer, but what I think you would benefit from most here is a relational database. You can embed one that holds all the data in memory if you prefer to avoid some of the usual complications of running such a database. SQLite is one relational database to consider, but there are many others available for use in Java.
I reach that conclusion based on your phrasing. You mention navigating the graph edges (or the aggregating relationships between the entities) in both directions. That's trivial to express in a relational model of the problem, but becomes very difficult when you use in-memory structures like the maps you propose; the former has no implied directionality to the foreign references among relations, whereas the latter can only represent unidirectional references from one entity to another.
In the relational model, you're able to express facts as follows:
There are entities called Components, that have these properties.
There are entities called Models, that have these other properties.
There are links between Components and Models, where every Model is "reachable from" any number of Components, and every Component "can reach" any number of Models. (Note that I did not write "contained within" or "owns" or any other relationship that suggests exclusivity.)
The relational model allows one to evaluate queries against these relations without any bias as to which way the links "point." Of course, the links are assertions with some meaning—likely directional, making the logical graph a directed rather than an undirected graph—specific to your problem domain, but it's your application that understands that meaning, not the database or the relational model that governs its operation.
Beyond the logical model, answering your queries as efficiently as possible will require you to specify that the database maintain some non-constraint-based indices. Some of the records can be looked up efficiently without you asking for anything special beyond integrity constraints, as the database will likely build indices on its own to aid in efficient enforcement of the stated constraints. But while it may be able to tell you quickly whether there are already any pairings between a given Component and a Model, it won't be ready to answer which Components, if any, reference a particular Model.
Requesting that the database maintain such indices is akin to you maintaining some of the in-memory maps that you proposed originally, but there's a difference in the design approaches: In your design, some of the potential queries that will emerge can't be answered at all, because the relationships won't be captured in a way that they can be navigated, even inefficiently. With the database, though, adapting to new queries is usually a matter of defining additional indices to help speed up queries over data that is already there. In other words, the database can still answer your nascent queries; it just may have to struggle in embarrassing ways to do so. Not until you define the proper indices will it then be ready to handle those queries as efficiently as the others you've anticipated.
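To make that concrete, here is a minimal sketch of what the schema, the extra indices, and one of the "reverse" lookups could look like, using an in-memory SQLite database over plain JDBC. The table and column names, and the assumption that the Xerial sqlite-jdbc driver is on the classpath, are mine for illustration only, not something your domain prescribes:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RelationalSketch {
    public static void main(String[] args) throws SQLException {
        // In-memory SQLite database (assumes the Xerial sqlite-jdbc driver is available).
        try (Connection db = DriverManager.getConnection("jdbc:sqlite::memory:");
             Statement ddl = db.createStatement()) {

            // Entity tables plus the two many-to-many link tables.
            ddl.executeUpdate("CREATE TABLE component (id INTEGER PRIMARY KEY, pkg TEXT, name TEXT, UNIQUE (pkg, name))");
            ddl.executeUpdate("CREATE TABLE model (id INTEGER PRIMARY KEY, pkg TEXT, name TEXT, UNIQUE (pkg, name))");
            ddl.executeUpdate("CREATE TABLE func (id INTEGER PRIMARY KEY, name TEXT UNIQUE)");
            ddl.executeUpdate("CREATE TABLE component_model (component_id INTEGER, model_id INTEGER,"
                    + " PRIMARY KEY (component_id, model_id))");
            ddl.executeUpdate("CREATE TABLE model_func (model_id INTEGER, func_id INTEGER,"
                    + " PRIMARY KEY (model_id, func_id))");

            // Extra indices so the 'reverse' lookups (Model -> Component, Function -> Model) are fast too.
            ddl.executeUpdate("CREATE INDEX idx_cm_by_model ON component_model (model_id)");
            ddl.executeUpdate("CREATE INDEX idx_mf_by_func ON model_func (func_id)");

            // "Given a Function name, in which Models and in turn in which Components is it referenced?"
            String sql = "SELECT m.pkg, m.name, c.pkg, c.name"
                    + " FROM func f"
                    + " JOIN model_func mf ON mf.func_id = f.id"
                    + " JOIN model m ON m.id = mf.model_id"
                    + " LEFT JOIN component_model cm ON cm.model_id = m.id"
                    + " LEFT JOIN component c ON c.id = cm.component_id"
                    + " WHERE f.name = ?";
            try (PreparedStatement query = db.prepareStatement(sql)) {
                query.setString(1, "someFunction");
                try (ResultSet rs = query.executeQuery()) {
                    while (rs.next()) {
                        System.out.println("model " + rs.getString(1) + "." + rs.getString(2)
                                + " referenced by component " + rs.getString(3) + "." + rs.getString(4));
                    }
                }
            }
        }
    }
}

Your third example query ("Are there Models not referenced by any Component?") then becomes a similar LEFT JOIN with a WHERE cm.model_id IS NULL condition.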
I'll make one more point here. Using a relational database here may be overkill technically, but it is the right didactic choice. It's the kind of solution that your problem deserves. You can build something more narrow, more tailored, that provides a small subset of the database's capabilities and meets your needs, but in doing so, I think you're missing out on the larger design lesson here. You will have used your problem to learn something about how to implement a database, rather than learning about how to employ a database to model your problem. Making the latter both possible and easy is the reason the industry has made such database technology available.
The data structure that will allow you to track all of the relationships you describe in a single collection is a MultiMap. There is a discussion of Java MultiMaps in the Map Interface section of the Java Tutorials (you will have to scroll down to the MultiMaps section or follow the link and search the page for MultiMap; there is no direct anchor to that section of the tutorial). There are available implementations of MultiMap for Java:
The Apache Commons Collections: org.apache.commons.collections.MultiMap
The Google Collections Library: com.google.common.collect.Multimap
Using a MultiMap, you can create a mapping for your object types that can contain or aggregate the others:
//Associate multiple Models to one Component:
multiMap.put( componentD, modelN );
multiMap.put( componentD, modelO );
multiMap.put( componentD, modelP );
//Associate multiple Models to a Component (some different, some the same)
multiMap.put( componentE, modelQ );
multiMap.put( componentE, modelR );
multiMap.put( componentE, modelN ); //also associated with componentD
//And associate multiple Functions to one Model:
multiMap.put( modelQ, functionG );
multiMap.put( modelQ, functionH );
multiMap.put( modelQ, functionI );
You may later retrieve the Collection that is associated with any mapped key. Here is an example using the Apache Commons Collections MultiHashMap:
Collection modelFunctions = multiMap.get( modelQ );
This approach will make it easy to traverse top-down from Component to Model to Function. But you can also make it easy to perform bottom-up traversal if you add both ends of each relationship to the MultiMap. For example, to establish the relationship between a Model and a Function, you could:
multiMap.put( modelR, functionJ );
multiMap.put( functionJ, modelR );
Because both relationships have been mapped, you can easily retrieve all of the Functions contained within a Model (as in the example above) or just as easily retrieve all of the Models that contain a Function:
Collection functionModels = multiMap.get( functionJ );
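If it helps to see both directions wired together, here is a small self-contained sketch along those lines using Guava's HashMultimap (an arbitrary choice of implementation; the Apache Commons one works the same way). The entity values are plain Strings standing in for the real Component/Model/Function objects:

import java.util.Collection;

import com.google.common.collect.HashMultimap;
import com.google.common.collect.Multimap;

public class RelationshipIndex {

    // One multimap holds both the top-down and the bottom-up edges, as described above.
    private final Multimap<Object, Object> edges = HashMultimap.create();

    public void relate(Object parent, Object child) {
        edges.put(parent, child); // top-down navigation
        edges.put(child, parent); // bottom-up navigation
    }

    public void unrelate(Object parent, Object child) {
        edges.remove(parent, child);
        edges.remove(child, parent);
    }

    public Collection<Object> relatedTo(Object entity) {
        return edges.get(entity);
    }

    public static void main(String[] args) {
        RelationshipIndex index = new RelationshipIndex();
        index.relate("componentD", "modelN");
        index.relate("modelN", "functionJ");

        // "Given a Function, in which Models is it referenced?"
        System.out.println(index.relatedTo("functionJ")); // [modelN]
        // Note that a Model's result mixes its parent and its child:
        System.out.println(index.relatedTo("modelN"));    // componentD and functionJ, in some order
    }
}

Because a single multimap holds both directions, looking up a Model returns its parent Components and its child Functions together; with real entity classes you would either filter the result by type or keep one multimap per direction.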
Of course this also means that if you want to break a relationship you must remember to remove both mappings from the MultiMap, but that is fairly straightforward. I hope this helps.
Another option would be to abstract away creation and querying, which makes it possible to optimize or extend a particular implementation later if it becomes a bottleneck in terms of performance or ease of use.
Such an interface would look like the following:
public interface IEntityManager {
    // methods for retrieving entities by plain queries
    Component getComponent(String packageName, String componentName);
    Model getModel(String packageName, String modelName);
    Function getFunction(String name);

    // more specific queries
    List<Component> getComponents(Function function);
    List<Model> getModels(Function function);

    // even more specific queries ...
}
In this way one can use the interface in the production code while providing arbitrary implementations with the desired level of performance.
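For example, calling code only ever sees the interface (this is just a fragment; how you obtain and wire up the concrete implementation is your choice):

// Client code depends only on the interface, so the implementation can be swapped later.
void printComponentsUsing(IEntityManager entityManager, String functionName) {
    Function function = entityManager.getFunction(functionName);
    for (Component component : entityManager.getComponents(function)) {
        System.out.println(component.packageName + "." + component.name);
    }
}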
Now, concerning the concrete implementation - there is a slight difference depending on whether all Component, Model and Function instances and their relationships:
are created somewhere at the beginning of the application (e.g. at the beginning of the main method for a desktop app, or in ServletContextListener.contextInitialized for a web app) and are not changed during application execution
are created / removed / updated during application execution
Let's start with the first case because it's simpler: one should make sure that all Component, Model and Function instances (and the relationships between them) are known to the IEntityManager instance that is shared by the logic using these entities. One of the most straightforward ways to achieve this is to put the entity classes into the same Java package as the IEntityManager implementation and make their constructors package-private, thus moving the creation logic into the concrete IEntityManager implementation:
package com.company.entities;

public class Component {
    public final String packageName;
    public final String name;

    Component(String packageName, String name) {
        this.packageName = packageName;
        this.name = name;
    }
    // ...
}

// similar Model and Function class declarations

public class EntityManager implements IEntityManager {
    private final Map<Pair<String, String>, Component> components = new HashMap<Pair<String, String>, Component>();
    private final Map<Pair<String, String>, Model> models = new HashMap<Pair<String, String>, Model>();
    private final Map<String, Function> functions = new HashMap<String, Function>();

    // variation of a factory method
    public Component addComponent(String packageName, String name) {
        // only this EntityManager can create instances,
        // so one can be sure all instances are tracked by it
        final Component newComponent = new Component(packageName, name);
        components.put(new Pair<String, String>(packageName, name), newComponent);
        return newComponent;
    }

    // ... addModel, addFunction methods

    public void addFunctionToModel(Function function, Model model) {
        // here one should store 'somehow' the information that function is related to model
    }

    public void addModelToComponent(Model model, Component component) {
        // here one should store 'somehow' the information that model is related to component
    }

    // ... other methods
}
Please note that in the first case (all entities are created at the beginning of the application) one can also use the Builder pattern to create the EntityManager instance - this clearly separates creation logic from usage. In the second case, however, one should keep methods like addComponent and addModelToComponent in the class (and for multi-threaded usage of a single EntityManager instance, one should consider making the methods that modify its state thread-safe).
And finally, concerning how exactly one should store the relationships between entities: there is no silver bullet for efficiently storing and retrieving entities with such relationships. I would say that if one has no more than 1000 entities and not too many relationships, a lookup in a HashMap<String, Function> will be quite fast and shouldn't be a bottleneck. If it does become a bottleneck, one should inspect thoroughly which kinds of queries are used often and which rarely, and tune the inner implementation of EntityManager based on those observations - hence the suggestion to abstract everything behind the IEntityManager interface.
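As one possible way to fill in the 'somehow' above (just a sketch, assuming the plain in-memory approach is fast enough for your data sizes), EntityManager could keep a pair of maps per relationship, one for each direction:

// Inside EntityManager (requires java.util.*): both directions of the
// Model <-> Function relationship; the Component <-> Model case is analogous.
private final Map<Model, Set<Function>> functionsByModel = new HashMap<Model, Set<Function>>();
private final Map<Function, Set<Model>> modelsByFunction = new HashMap<Function, Set<Model>>();

public void addFunctionToModel(Function function, Model model) {
    Set<Function> functions = functionsByModel.get(model);
    if (functions == null) {
        functions = new HashSet<Function>();
        functionsByModel.put(model, functions);
    }
    functions.add(function);

    Set<Model> models = modelsByFunction.get(function);
    if (models == null) {
        models = new HashSet<Model>();
        modelsByFunction.put(function, models);
    }
    models.add(model);
}

public List<Model> getModels(Function function) {
    Set<Model> models = modelsByFunction.get(function);
    return models == null ? new ArrayList<Model>() : new ArrayList<Model>(models);
}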
Concerning the EntityManager instance in the application - obviously, there should be only one (unless one has a very special case). One can achieve this using the Singleton pattern (the less preferred solution, though it may work fine for some time), or by simply instantiating EntityManager at the beginning of the application and passing that instance explicitly to the classes / methods which need it (the more preferred solution).
Hope this helps ...
Related
I am working on a project where I need to add users to multiple systems (Active Directory, a database, and Sisense) based on data received from a spreadsheet. I've written code that can get the data input correctly into each system, but I am struggling to figure out how to organize it, in terms of what design pattern to use.
I have a model class for each component that contains the field each system needs:
ActiveDirectoryUser
SisenseUser
DatabaseUser
Then, I have what I call a worker class for each of these that actually creates the user in the system.
ActiveDirectoryWorker
SisenseWorker
DatabaseWorker
The basic flow of my code is:
Read in each line from the spreadsheet.
Validate that the input is valid.
Create an instance of each model class that contains the appropriate fields.
Call the individual worker classes that control how the user gets added to the respective system. The model instance will be passed into this class.
I've read up on some of the various design patterns, but none of the explanations are in "plain" English. Still learning the ropes here a bit, so I'd appreciate someone suggesting a model that fits my scenario.
It sounds as though you've defined three distinct data models, one for each storage system. That makes your job more difficult than it has to be. Instead, consider modelling the data based on what is in the spreadsheet. You could, for instance, define a class called SpreadsheetUser, which contains the valid data from a spreadsheet row.
Now define an interface, e.g. UserCreator:
interface UserCreator
{
void Create(SpreadsheetUser user);
}
Now loop through each row in your spreadsheet, validate the data and then call Create on a Composite, which could be defined like this:
class CompositeUserCreator : UserCreator
{
    private readonly UserCreator[] creators;

    public CompositeUserCreator(params UserCreator[] creators)
    {
        this.creators = creators;
    }

    public void Create(SpreadsheetUser user)
    {
        foreach (var creator in creators)
            creator.Create(user);
    }
}
You also define three concrete implementations of UserCreator, one for each storage system, and create the composite like this:
CompositeUserCreator creator =
new CompositeUserCreator(
new ActiveDirectoryUserCreator(/* perhaps some config values here... */),
new SisenseUserCreator(/* ... and here... */),
new DatabaseUserCreator(/* ... and here... */));
You'll still have the problem of dealing with failures. What should happen if you've already created a user in active directory, but then Sisense creation fails? That is, however, not a problem introduced by the Composite pattern, but a problem which is inherent in distributed computing.
Coming from a perl background and having done some simple OO in that, I am struggling to grasp the android/Java way of interacting with databases.
In my experience I would create a file or class for each object, and an object would match/represent a table in the database.
Within that object would be a constructor, variables for the data, methods/functions to return individual variables but also the DB queries to read and write from the DB, doing all the necessary CRUD functions.
From what I read on the web, in Android I would create the objects similarly but without the DB interaction. This would happen in either a single class with all my DB functionality in it, or multiple DB classes, one per table.
My questions are all to do with best practices really.
Can I structure my app the way I would in Perl? If not, why not, and if so, what are the pros, cons and limitations?
What do the terms DAO, Adapter and POJO mean?
Should I make an application class and globally declare the DB there?
Should I create one handler in each activity or in the application class only?
I have read so many tutorials now that my head is spinning, all with a different way of doing things, most of them covering only a single table, and few showing actual objects directly representing tables.
I'm happy to hear opinions, be linked to tutorials, or just have the odd term explained.
Thanks in advance
If I am reading you correctly, ORMLite may be your best bet. It uses reflection for database creation and maintenance which seems to be how Perl does it.
POJO stands for Plain Old Java Object, which means it is just a normal class.
An adapter would be the class that contains the CRUD operations and manages the database itself. There are quite a few patterns around in the Android world, and discussing them could fill a book.
I prefer the pattern where I open the database once in my Application class and never close it (Android does that when it kills the app). A sample from a very old project of mine might show you the basic idea.
DAO stands for Data Access Object and can also fill dozens of books. I would just start programming and see where you're heading...
The other poster is correct in putting forward ORMLite as a great way to manage code relationships that mirror your database. If you're looking to do it on your own, however, there are a ton of ways to do things, and I wouldn't say the community has really gravitated toward one over another. Personally, I tend to have my entities represented by Plain Old Java Objects (POJOs - implying no special connectivity to other things, like databases), where the various attributes of the table correspond to field values. I then persist and retrieve those objects through a Data Access Object (DAO). The DAOs all have access to a shared, open database object, against which they execute queries as needed.
For example: if I had a table foo, I would have a corresponding entity class Foo, with attributes corresponding to columns. class FooDAO would have mechanisms to get a Foo:
public Foo getFooById(Integer id) {
    String[] selectionArgs = {id.toString()};
    String limit = "1";
    Cursor c = mDatabase.query(FOO_TABLE, null, "id=?", selectionArgs, null, null, null, limit);
    // Create a new Foo from the returned cursor, close the cursor, and return the Foo
}
A second entity bar might have many foo. For that, we would, in Bar, reference the FooDAO to get all of bar's member foo:
public class Bar {
    public List<Foo> getFoo() {
        return mFooDAO.getFooByBar(this);
    }
}
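To round the example out, getFooByBar might look like the sketch below. It assumes, purely for illustration, that the foo table carries a bar_id column pointing at its owning bar row, and that Foo and Bar expose simple getters and setters:

public List<Foo> getFooByBar(Bar bar) {
    List<Foo> result = new ArrayList<Foo>();
    String[] selectionArgs = { String.valueOf(bar.getId()) };
    Cursor c = mDatabase.query(FOO_TABLE, null, "bar_id=?", selectionArgs,
            null, null, null, null);
    try {
        while (c.moveToNext()) {
            Foo foo = new Foo();
            foo.setId(c.getInt(c.getColumnIndexOrThrow("id")));
            // ... copy the remaining columns into fields ...
            result.add(foo);
        }
    } finally {
        c.close();
    }
    return result;
}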
etc... the scope of what one can do in rolling your own ORM like this is pretty vast, so do as much or as little as you need. Or just use ORMLite and skip the whole thing :)
Also, the Android engineers frown on subclassing Application for globally accessible objects, favoring singletons instead (see hackbod's answer), but opinions vary.
My original question was quite incorrect: I have classes (not POJOs) which have shortcut methods to business logic classes, to give the consumer of my API the ability to use it like this:
Connector connector = new ConnectorImpl();
Entity entity = new Entity(connector);
entity.createProperty("propertyName", propertyValue);
entity.close();
Instead of:
Connector connector = new ConnectorImpl();
Entity entity = new Entity();
connector.createEntityProperty(entity, "propertyName", propertyValue);
connector.closeEntity(entity);
Is it good practice to create such shortcut methods?
Old question
At the moment I am developing a small framework and have a pretty nice separation of the business logic into different classes (connectors, authentication tokens, etc.), but one thing still bothers me. I have methods which manipulate POJOs, like this:
public class BusinessLogicImpl implements BusinessLogic {
    public void closeEntity(Entity entity) {
        // Business Logic
    }
}
And POJO entities which also have a close method:
public class Entity {
    public void close() {
        businessLogic.closeEntity(this);
    }
}
Is it good practice to provide two ways to do the same thing? Or is it better to just remove all "proxy" methods from the POJOs for clarity's sake?
You should remove the methods from the "POJOs"... They aren't really POJOs if you encapsulate functionality like this. The reason for this comes from SOA design principles, which basically say you want loose coupling between the different layers of your application.
If you are familiar with Inversion of Control containers, like Google Guice or the Spring Framework, this separation is a requirement. For instance, let's say you have a CreditCard POJO, a CreditCardProcessor service, and a DebugCreditCardProcessor service that doesn't actually charge the card money (for testing).
@Inject
private CardProcessor processor;
...
CreditCard card = new CreditCard(...params...);
processor.process(card);
In my example, I am relying on an IoC container to provide me with a CardProcessor. Whether this is the debug one, or the real one... I don't really care and neither does the CreditCard object. The one that is provided is decided by your application configuration.
If you had coupling between the processor and credit card where I could say card.process(), you would always have to pass in the processor in the card constructor. CreditCards can be used for other things besides processing however. Perhaps you just want to load a CreditCard from the database and get the expiration date... It shouldn't need a processor to do this simple operation.
You may argue: "The credit card could get the processor from a static factory". While true, singletons are widely regarded as an anti-pattern because they require keeping global state in your application.
Keeping your business logic separate from your data model is always a good thing to do to reduce the coupling required. Loose coupling makes testing easier, and it makes your code easier to read.
I do not see your case as "two methods", because the logic of the implementation is kept in businessLogic. It would be akin to asking whether it is a good idea that java.lang.System has both a getProperties() method and a getProperty(String) method; rather than a different method, it is just a shortcut to the same functionality.
But, in general, no, it is not good practice. Mainly because:
a) if the way to do that thing changes in the future, you need to remember that you have to touch two implementations.
b) when reading your code, other programmers will wonder whether the two methods are there because they do something different.
Also, it does not fit very well with assigning responsibilities to a specific class for a given task, which is one of the tenets of OOP.
Of course, all absolute rules may have special cases where some consideration (mainly performance) may suggest breaking the rule. Think about whether you gain something by doing so, and document it heavily.
Really my question is: if I were to use a nested data structure in an OODB, would I be placing instances of classes within other instances in the DB, or would some sort of relational mapping be required?
I've been interested in OODBs (Object-Oriented Databases) for a year or so now. I'm a web/application developer in essence, and for a while now I have been noticing the severe limitations, both in terms of complexity and performance, of representing complex hierarchical structures such as a website hierarchy in relational systems such as MS T-SQL and MySQL.
To present a quick Java (pseudo-code) example:
CLASS/DB TYPE:
public class PageObject {
    public String title = "";
    public String shortname = "";
    public boolean published = false;
    public PageObject[] pages = null;

    public PageObject() {}
}
So if we started with this class, which is capable of holding other instances of the same class in its pages array (or vector, or collection, or whatever), we could eventually end up with a site layout such as this:
Home
  First Home Child
    1
    2
    3
  Second Home Child
  Third Home Child
Looking at this we can see that the Home item would have 3 items stored in its pages collection, with the First Home Child item within this collection having a further 3 items in its own pages collection.
If we were to then store this structure in db4o (or any other OODB), would this present problems in terms of performance, in that any call for a top-level object such as the homepage would also return ALL items beneath it, assuming the database grows significantly?
This question might appear quite subjective, for which I apologise in advance, but I just can't seem to wrench my head out of the relational model so am having real problems even trying to plan out any kind of data model, before I progress into further work in code.
Any clarity anyone can shed on this would be absolutely appreciated at this stage! Cheers in advance for any thoughts!
This is precisely where OODBs are a good fit: when you're dealing with complex object hierarchies where tables and joins feel like overkill.
db4o (and other OODBs such as Versant's VOD) does not need to use joins (as in an RDBMS) and deals with the relationships between objects transparently (as defined in your object model). In essence, your object model ends up being your data model or schema. These OODBMS systems usually perform better than an RDBMS when dealing with nested structures and can even handle cyclic references.
In order to avoid loading/storing more objects than expected, an OODBMS can work with arbitrary levels of object activation (or update) depth (for example, in your case you can tell the DB to only retrieve/update the first level of Home's children). Alternatively, you can configure it to work in transparent persistence mode (as Sam suggests), where the DB only retrieves or updates what you access on demand (i.e. as you navigate your object tree).
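For instance (a sketch from memory against the db4o embedded API - treat the exact configuration calls as an assumption to double-check against the reference linked below), capping the activation depth for PageObject so that loading Home does not drag the whole sub-tree into memory could look like this:

import com.db4o.Db4oEmbedded;
import com.db4o.ObjectContainer;
import com.db4o.ObjectSet;
import com.db4o.config.EmbeddedConfiguration;

public class ActivationSketch {
    public static void main(String[] args) {
        EmbeddedConfiguration config = Db4oEmbedded.newConfiguration();
        // Only activate one level of PageObject references by default, so a query
        // for "Home" returns it with its direct children referenced but not their
        // own sub-trees fully loaded.
        config.common().objectClass(PageObject.class).maximumActivationDepth(1);

        ObjectContainer db = Db4oEmbedded.openFile(config, "site.db4o");
        try {
            ObjectSet<PageObject> pages = db.query(PageObject.class);
            for (PageObject page : pages) {
                // Deeper levels are activated explicitly, only when actually needed.
                db.activate(page, 2);
            }
        } finally {
            db.close();
        }
    }
}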
More info (db4o):
http://developer.db4o.com/Documentation/Reference/db4o-8.0/java/reference/Content/basics/activation.htm
HTH
Best!
German
If your hierarchy is truly a tree, wouldn't it be better to model this using a parent relationship (sorry, I can't bring myself to use a class named PageObject):
class Page {
    Page parent = null;
}
? You could then find the roots by searching for all Pages that have a null parent.
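As a sketch of that lookup with a db4o native query (assuming the com.db4o.query.Predicate API; other stores would express the same idea differently):

import com.db4o.ObjectContainer;
import com.db4o.ObjectSet;
import com.db4o.query.Predicate;

public class RootFinder {
    // Returns every Page with no parent, i.e. the roots of the tree.
    static ObjectSet<Page> findRoots(ObjectContainer db) {
        return db.query(new Predicate<Page>() {
            @Override
            public boolean match(Page page) {
                return page.parent == null;
            }
        });
    }
}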
In general, you should also learn about transparent activation.
Another way that is 'semi-relational' is to define Page objects with no containment information and containment relationship objects:
class Page {
}

class Contains {
    Page container;
    Page contained;
}
Here, pulling a Contains object out of the database at worst pulls out two Pages. You need to manage Page deletions carefully though.
PS: pardon my abbreviated Java, I'm getting too used to Scala.
I'm writing an Android game in Java and I need a robust way to save and load application state quickly. The question seems to apply to most OO languages.
To understand what I need to save: I'm using a Strategy pattern to control my game entities. The idea is I have a very general Entity class which e.g. stores the location of a bullet/player/enemy and I then attach a Behaviour class that tells the entity how to act:
class Entity { float x; float y; Behavior b; }
abstract class Behavior { abstract void update(Entity e); }
// Move about at a constant speed
class MoveBehavior extends Behavior { float speed; void update ... }
// Chase after another entity
class ChaseBehavior extends Behavior { Entity target; void update ... }
// Perform two behaviours in sequence
class CombineBehavior extends Behavior { Behavior a, b; void update ... }
Essentially, Entity objects are easy to save but Behaviour objects can have a semi-complex graph of dependencies between other Entity objects and other Behaviour objects. I also have cases where a Behaviour object is shared between entities. I'm willing to change my design to make saving/loading state easier, but the above design works really well for structuring the game.
Anyway, the options I've considered are:
Use Java serialization. This is meant to be really slow in Android (I'll profile it sometime). I'm worried about robustness when changes are made between versions however.
Use something like JSON or XML. I'm not sure how I would cope with storing the dependencies between objects, however. Would I have to give each object a unique ID and then use these IDs on loading to link the right objects together? I thought I could, e.g., change the ChaseBehavior to store an ID of an entity, instead of a reference, that would be used to look up the Entity before performing the behaviour.
I'd rather avoid having to write lots of loading/saving code myself as I find it really easy to make mistakes (e.g. forgetting to save something, reading things out in the wrong order).
Can anyone give me any tips on good formats to save to or class designs that make saving state easier?
You should definitely check serialization before using it. I don't know how it stands on Android, but for Java code it's a known and efficient way to save object graphs. Anyway, you can also take a look at the replies to this question, which considers saving an object graph using XML.
You haven't said why you want to save/load state. If you want to protect against shutdown,
you might want to look at using Bundle and PathClassLoader along with onSaveInstanceState and onRestoreInstanceState/onCreate.
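For completeness, a minimal sketch of the Bundle route for the shutdown case (standard Activity callbacks; the fields and keys here are made up for illustration, and a Bundle only holds primitives, Strings, Parcelables and the like, so a complex Behaviour graph would still need to be flattened first):

import android.app.Activity;
import android.os.Bundle;

public class GameActivity extends Activity {
    private float playerX;
    private float playerY;

    @Override
    protected void onSaveInstanceState(Bundle outState) {
        super.onSaveInstanceState(outState);
        // Flatten the bits of state you need into values the Bundle understands.
        outState.putFloat("playerX", playerX);
        outState.putFloat("playerY", playerY);
    }

    @Override
    protected void onRestoreInstanceState(Bundle savedInstanceState) {
        super.onRestoreInstanceState(savedInstanceState);
        playerX = savedInstanceState.getFloat("playerX");
        playerY = savedInstanceState.getFloat("playerY");
    }
}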