Allowing maximal flexibly/extensibility using a factory

Allowing maximal flexibly/extensibility using a factory - java

I have a little design issue on which I would like to get some advice:
I have several classes that inherit from the same base class, each one can accept the same data and analyze it in a slightly different way.
Analyzer
|
˪_ AnalyzerA
|
˪_ AnalyzerB
...
I have an input file (I do not have control over the file's format) that defines which analyzers should be invoked and their parameters. Plus it defines data-extractors in the same way and other similar things too (in similar I mean that this is an action that can have several variations).
I have a module that iterates over different analyzers in the file and calls some factory that constructs the correct analyzer. I have a factory for each of the archetypes the input file can define and so far so good.
But what if I want to extend it and to add a new type of analyzer?
The solution I was thinking about is using a property file for each factory that will be named after the factories name and it will hold a mapping between the input file's definition of whatever it wants me to execute and the actual classes that I use to execute the action.
This way I could load that class at run-time -> verify that it's implementing the right interface and then execute it.
If some John Doe would like to create his own analyzer he'd just need to add a new property to the correct file (I'm not quite sure what would be the best strategy to allow this kind of property customization).
So in short:
Is my solution too flawed?
If no what would be the most user friendly/convenient way to allow customization of properties?
P.S
Unfortunately I'm confined to using only build in JDK classes as the existing solution, so I can't just drop in SF on them.
I hope this question is not out of line I'm just not used to having my wings clipped this way, not having SF or some other to help me implement an elegant solution.

Have a look at the way how the java.sql.DriverManager.getConnection(connectionString) method is implemented. The best way is to watch the source code.
Very rough summary of the idea (it is hidden inside a lot of private methods). It is more or less an implementation of chain of responsibility, although there is not linked list of drivers.
DriverManager manages a list of drivers.
Each driver must register itself to the DriverManager by calling its method registerDriver().
Upon request for a connection, the getConnection(connectionString) method sequentially calls the drivers passing them the connectionString.
Each driver KNOWS if the given connection string is within its competence. If yes, it creates the connection and returns it. Otherwise the control is passed to the next driver.
Analogy:
drivers = your concrete Analyzers
connection strings = types of your files to be analyzed
Advantages:
There is no need to explicitly bind the analyzers with their type of file they are meant for. Let the analyzer to decide itself if it is able to analyze the file. If not, null is returned (or an exception or whatever) to tell the AnalyzerManager that the next analyzer in the row should be asked.
Adding new analyzer just means adding a new call to the register() method. Complete decoupling.

Related

Is using one class to store data safe?

So I was searching about storing data in one class, and found this. However, that's not what I'm looking for. What I wanted to know about this was whether it's bad practice, can cause performance issues in an application, or if there's another way to do it, etc... (and I'm not doing this on an Android).
Let's say I have a class that stores a HashMap<Enum, Object> and is initialized when I create the class in main.
public Main() {
// Creates the HashMap, initialized in the constructor 'MemoryContainer'
MemoryContainer m = new MemoryContainer();
m.getTestHash().put(SomeEnum.TEST, "Test"); // Using created HashMap
}
Other than casting the value every time I use it, would this cause major issues? If so, is there an alternative?

There's nothing wrong with storing static values in a class, however this is not a good practice.
To store the constants you should create an interface as every field in an interface is already a constant (public static final).
A better approach will be to store these values in properties files, and load them as needed.
A properties file can be stored externally and a person who isn't aware of your source code would be able to modify this properties file if needed. For example you can store database connection details in properties files and if server support admin determines that database instance is down, he/she can edit the properties file to point the application to a new one.
Finally for most flexibility you shouldn't store the configuration inside application at all. It can be stored in a database like MySql or in a fast data structure storage like Redis. This will allow multiple instances of your application to share the configuration data and it will also allow you to modify configuration on the fly by modifying them in the database.
Sometimes a Git repository is also used to store this kind of data (like in case of microservices). Git repository in addition to being shared among all the instances, also maintains the history of modifications.

I would not look too much at performance issues (of course, I do not know what else your application does or wants to do and how it achieves it).
What you should look at first is Mutability - in your example, nothing would stop me from changing the configuration at Runtime by calling
m.getTestHash().put(SomeEnum.TEST, "NotATestAnymore") - this would immediately change the behaviour for every other use of that specific setting.
I am also not sure why you would not just use a configuration class that would directly provide (typed) getters and, if you know all configuration settings at the launch of the app, one constructor containing all the settings.
Do you plan to read the configuration from an outside source (e.g. file)?

NO,
It won't cause major issues.
Also, it is a good practice to keep those variables (HashMap in your case) in a different class away from your main class (which contains your app logic).

A better way to call static methods in user-submitted code?

I have a large data set. I am creating a system which allows users to submit java source files, which will then be applied to the data set. To be more specific, each submitted java source file must contain a static method with a specific name, let's say toBeInvoked(). toBeInvoked will take a row of the data set as an array parameter. I want to call the toBeInvoked method of each submitted source file on each row in the data set. I also need to implement security measures (so toBeInvoked() can't do I/O, can't call exit, etc.).
Currently, my implementation is this: I have a list of the names of the java source files. For each file, I create an instance of the custom secure ClassLoader which I coded, which compiles the source file and returns the compiled class. I use reflection to extract the static method toBeInvoked() (e.g. method = c.getMethod("toBeInvoked", double[].class)). Then, I iterate over the rows of the data set, and invoke the method on each row.
There are at least two problems with my approach:
it appears to be painfully slow (I've heard reflection tends to be slow)
the code is more complicated than I would like
Is there a better way to accomplish what I am trying to do?

There is no significantly better approach given the constraints that you have set yourself.
For what it is worth, what makes this "painfully slow" is compiling the source files to class files and loading them. That is many orders of magnitude slower than the use of reflection to call the methods.
(Use of a common interface rather than static methods is not going to make a measurable difference to speed, and the reduction in complexity is relatively small.)
If you really want to simplify this and speed it up, change your architecture so that the code is provided as a JAR file containing all of the compiled classes.

Assuming your #toBeInvoked() could be defined in an interface rather than being static (it should be!), you could just load the class and cast it to the interface:
Class<? extends YourInterface> c = Class.forName("name", true, classLoader).asSubclass(YourInterface.class);
YourInterface i = c.newInstance();
Afterwards invoke #toBeInvoked() directly.
Also have a look into java.util.ServiceLoader, which could be helpful for finding the right class to load in case you have more than one source file.

Personally, I would use an interface. This will allow you to have multiple instance with their own state (useful for multi-threading) but more importantly you can use an interface, first to define which methods must be implemented but also to call the methods.
Reflection is slow but this is only relative to other options such as a direct method call. If you are scanning a large data set, the fact you have to pulling data from main memory is likely to be much more expensive.

I would suggest following steps for your problem.
To check if the method contains any unwanted code, you need to have a check script which can do these checks at upload time.
Create an Interface having a method toBeInvoked() (not a static method).
All the classes which are uploaded must implement this interface and add the logic inside this method.
you can have your custom class loader scan a particular folder for new classes being added and load them accordingly.
When a file is uploaded and successfully validated, you can compile and copy the class file to the folder which class loader scans.
You processor class can lookup for new files and then call toBeInvoked() method on loaded class when required.
Hope this help. (Note that i have used a similar mechanism to load dynamically workflow step classes in Workflow Engine tool which was developed).

Where to keep options values, paths to important files, etc [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I'm creating a program that requires some options values to be set along with some paths to image files, path to a SQLite database, some information about the text on various buttons, information about which database to use (SQLite / MySQL for example), etc.
So far, I had a helper class in which I made everything static and I accessed it from everywhere in the program, but it was becoming a mess and I read that this is sort of bad practice since it doesn't follow object-oriented programming guidelines.
So what I did is make that class a singleton with its reference in my controller object. To access my options, I now call controller.getOptionName instead of calling Helper.getOptionName directly.
I have a feeling I'm approaching this the wrong way, especially because from what I've read, if many of my objects depend on a single class (the helper class), I'm not separating everything enough.
I don't know what I should be doing instead, is there a "standard" on where to keep all my options? I thought about using a XML file or something along those lines, but I'm going to end up having to access it from everywhere anyway, so it feels like this would create the same problem.

The Problem: Configuration Sprawl Over time, programs gain features and options. When they connect to external systems and services (e.g. databases, event brokers, cloud/web services), they must also keep a growing set of configs and credentials for those services handy.
The traditional places to store this information at runtime are global variables and OS environment variables. Both suck.
Config data logically is "global environment" or context in which the app is running, but you can't easily depend on either global or environment variables.
The other traditional mechanism, config files--whether XML, INI, .properties, or whatever--help store config data between runs, but do nothing to organize configs/options for the program code, or during its execution.
You can clean things up a bit by making options into properties of your application's classes. This is the traditional "next step." Unfortunately, it can take a lot of up-front "what goes where??" thinking. IMO, more than it's worth. Even if you get those choices right, it won't last. If you have a reasonably feature-ful app, the number of options and settings will become overwhelming over time. You'll spend a lot of time hand-coding defaults and options in object constructors and the arguments/code of other methods. Trying to perfectly partition configurations among classes not only expends effort, it can lead to highly interdependent classes. So much for clean encapsulation!
I've banged my head against this particular wall often, especially when aiming for code that has "reasonable" or "intelligent" defaults and behaviors for everything, that allows users to override defaults at any time, and that presents a simple interface that doesn't require understanding the complete interplay of app classes and components to use its services.
Solution: Delegate to a Config Object The best solution I've found is to encapsulate option/config data into its own designated object. A fancy way of describing this pattern: For configuration, settings, and option data, use delegation rather than inheritance or composition.
How you build a config mapping depends on the language you're working in. In many languages, constructing your own Config object gives you a nice "look":
if opts.verbose:
print "..."
which I find more readable than the more explicit "getter" opts.get("verbose") or "indexer" opts['verbose'] ways of accessing a property. But you usually don't have to make your own Config class, which basically is just a mapping.
● The Easy Way ● Use a generic mapping: e.g. in Python a dict, in Perl a %hash, in Java a Dictionary or HashMap. Even better, there are extensions of these designed for, or especially suited to, configuration data. In Python, e.g., I use stuf and TreeDict for their simple dot-access and other nice properties. In Java, Properties is a similar specific-for-configs extension. E.g.:
from stuf import stuf # stuf has attributes!
opts = stuf(
show_module=False, # comment explaining what show_module means
where=True, # ...
truncate=False, # ...
everything=False, # ...
allvars=False, # ...
allkeys=False, # ...
yaml=False, # ...
on=True, # ...
ret=None, # ...
)
if opts.truncate:
...
This way, all your config and option data is in one place, neatly accessible, and clearly delineated from all of the other program, class, instance, and function/method data it's used side-by-side with. This helps maintain clarity over time, as the program evolves. You can quickly determine "Is this part of the core data? Or is it related to the context in which the core data is being processed?"
To make things even better, if you pre-load config data from a config file, load or copy those values directly into your config object. And if you take arguments from the command line, load or copy those values directly into your config object. Now you have one unified source of all the "what does the user want me to do, with what options and settings?" information.
TL;DR - 90% of apps or services are just fine with a simple config/options mapping. Everything that follows is for advanced use cases. Because this was a design/patterns question, here's why this approach isn't a one-off, but extends to successively more sophisticated/intricate use cases.
● Per-Instance Config ● You can have multiple levels of config/option data. The most common use for this would be defaults set at a class or module level, then potentially different options for each instance. A server app might have an instance per user, with each user/instance needing its own customized settings. The config map is copied at instance creation/initialization, either automatically or explicitly.
●● Multiple Config Objects ●● You can partition config/option data into several config objects, if that makes sense. For example, you might partition options for data retrieval from those for data formatting. You can do this at the start of the design, but need not. You can start with one monolithic config object, then refactor over time (generally, as you start to refactor the underlying functions). Obviously you don't want to "go crazy" adding config objects, but you can have a few without adding much program complexity. If you partition config objects, you can proxy multiple config "domains" through a single API--giving you quality information decomposition internally, but a very simple outward appearance.
◆ Chain Gang ◆ More elegant than copying config data per instance: Use chainable or hierarchical mapping (e.g. in Python, ChainMap) that lets you "overlay" the values of one mapping with those of another (similar to "copy-on-write" schemes, or "union" and "translucent" file systems). Instance options then refer directly to class/default options--unless they are explicitly set, in which case they're instance-specific. Advantage: If class/default/global settings are changed during program execution, subsequent instance method invocations will "see" the changed defaults and use them (as long as they haven't been overridden at the instance level).
◆◆ Transient Config ◆◆ If you need configs/options changeable "on the fly"--say for the scope of a given method invocation--the method can extend the instance option chained mapping. In Python, that's what ChainMap.new_child() does. It sounds complicated, but as far as the method code is concerned, it's drop-dead-simple. There's still just a single config object to refer to, and whatever it says is the option, use it.
◆◆◆ Arbitrary Duration Overlay ◆◆◆ There's nothing magical about the temporal scope of method invocation. With the proper setup, any level of configuration can be transiently overlaid for as long as needed. For example, if there's some period during program run you'd like to turn on debugging, logging, or profiling, you can turn that on and off whenever you like--just for certain instances, or for all of them at once. This hors catégorie usage requires a Config object slightly beyond stock ChainMap--but not by much (basically just a handle to the chain mapping).
Happily, most code doesn't come close to needing these "black diamond" levels of config sophistication. But if you want to go three or four levels deep, delegating to separate config objects will take you there in a logical way that keeps code clean and orderly.

I would recommend placing your options into some sort of file (whether it be xml or a .properties) especially if the values can change (such as a database username, etc). I would also say that you can break up these configuration files by component. Just like you break up your code, the components that need database information likely do not need image paths. So you can have a file for database info, image path info, etc. Then have your components load the file that they need.
In the java-specific case, you can put these files on your classpath and have your components reference them. I like using .properties files because java has a nice class, Properties, for dealing with them.
So here's a tiny example for an image provider that gives you a BufferedImage, given the filename.
image.properties
icon.path=/path/to/my/icons
background.path=/path/to/my/backgrounds
Make sure this file is on your classpath. Then here's my provider class
public class ImageProvider {
private static final String PROP_FILE = "image.properties";
private static final String ICON_PATH = "icon.path";
private Properties properties;
public ImageProvider() {
properties = new Properties();
properties.load(getClass().getClassLoader().getResourceAsStream(PROP_FILE));
}
public BufferedImage getIcon(String icon) {
return ImageIO.read(properties.getProperty(ICON_PATH) + icon);
}
}

times when configuration was kept in static variables or when every component access them independently has ended long ago. since then mankind has invented IoC and DI. you take a DI library (e.g. spring), use it to load all the configuration (from files, jndi, environment, user, you name it) and inject it to every component that needs it. no manual resource loading, no static configuration - it's just evil

Loading Settings - Best Practices

I'm at the point in my first real application where I am adding in the user settings. I'm using Java and being very OO (and trying to keep it that way) so here are my ideas:
Load everything in the main() and
pass it all 'down the line' to the
required objects (array)
Same as above, but just pass the
object that contains the data down
the line
Load each individual setting as
needed within the various classes.
I understand some of the basic pros and cons to each method (i.e. time vs. size) but I'm looking for some outside input as to what practices they've successfully used in the past.

Someone should stand up for the purported Java standard, the Preferences API... and it's most recent incarnation in JDK6. Edited to add, since the author seems to savvy XML, this is more appropriate than before. Thought I believe you can work XML juju with Properties too, should the spirit take you.
Related on SO: Preferences API vs. Apache solution, Is a master preferences class a good idea?
(well, that's about all the standing up I'm willing to do.)

Use a SettingsManager class or something similar that is used to abstract getting all settings data. At each point in the code where you need a setting you query the SettingsManager class - something like:
int timeout = SettingsManager.GetSetting("TimeoutSetting");
You then delegate all of the logic for how settings are fetched to this single manager class, whose implementation you can change / optimize as needed. For instance, you could implement the SettingsManager to fetch settings from a config file, or a database, or some other data store, periodically refresh the settings, handle caching of settings that are expensive to retrieve, etc. The code using the settings remains blissfully unaware of all of these implementaton decisions.
For maximum flexibility you can use an interface instead of an actual class, and have different setting managers implement the interface: you can swap them in and out as needed at some central point without having to change the underlying code at all.
In .NET there is a fairly rich set of existing configuration classes (in the System.Configuration) namespace that provide this sort of thing, and it works out quite well.
I'm not sure of the Java equivalent, but it's a good pattern.

Since configuration / settings are typically loaded once (at startup; or maybe a few times during the program's runtime. In any way, we're not talking about a very frequent / time-consuming process), I would prefer simplicity over efficiency.
That rules out option number (3). Configuration-loading will be scattered all over the place.
I'm not entirely sure what the difference is between (1) and (2) in your list. Does (1) mean "passing discreet parameters" and (2) mean "passing an object containing the entire configuration"? If so, I'd prefer (2) over (1).
The rule of thumb here is that you should keep things simple and concentrated. The advantage of reading configuration in one place is that it gives you better control in case the source of the configuration changes at some point.

Here is a tutorial on the Properties class. From the Javadocs (Properties):
The Properties class represents a
persistent set of properties. The
Properties can be saved to a stream or
loaded from a stream. Each key and its
corresponding value in the property
list is a string.
A property list can contain another
property list as its "defaults"; this
second property list is searched if
the property key is not found in the
original property list.
The tutorial gives the following example instantiation for a typical usage:
. . .
// create and load default properties
Properties defaultProps = new Properties();
FileInputStream in = new FileInputStream("defaultProperties");
defaultProps.load(in);
in.close();
// create application properties with default
Properties applicationProps = new Properties(defaultProps);
// now load properties from last invocation
in = new FileInputStream("appProperties");
applicationProps.load(in);
in.close();
. . .
You could, of course, also roll your own system fairly directly using a file-based store and an XML or YAML parser. Good luck!

We have recently started using JSR-330 dependency injection (using Guice from SVN) and found that it was possible to read in a Properties file (or any other map) and bind it inside Guice in the module in the startup code so that the
#Inject #Named("key") String value
string was injected with the value corresponding to the key when that particular code was called. This is the most elegant way I have ever seen for solving this problem!
You do not have to haul configuration objects around your code or sprinkle all kinds of magic method calls in each and every corner of the code to get the values - you just mention to Guice you need it, and it is there.
Note: I've had a look at Guice, Weld (Seam-based) and Spring which all provide injection, because we want JSR-330 in our own code, and I like Guice the best currently. I think the reason is because Guice is the clearest in its bindings as opposed to the under-the-hood magic happening with Weld.

In Java, how can I construct a "proxy wrapper" around an object which invokes a method upon changing a property?

I'm looking for something similar to the Proxy pattern or the Dynamic Proxy Classes, only that I don't want to intercept method calls before they are invoked on the real object, but rather I'd like to intercept properties that are being changed. I'd like the proxy to be able to represent multiple objects with different sets of properties. Something like the Proxy class in Action Script 3 would be fine.
Here's what I want to achieve in general:
I have a thread running with an object that manages a list of values (numbers, strings, objects) which were handed over by other threads in the program, so the class can take care of creating regular persistent snapshots on disk for the purpose of checkpointing the application. This persistor object manages a "dirty" flag that signifies whether the list of values has changed since the last checkpoint and needs to lock the list while it's busy writing it to disk.
The persistor and the other components identify a particular item via a common name, so that when recovering from a crash, the other components can first check if the persistor has their latest copy saved and continue working where they left off.
During normal operation, in order to work with the objects they handed over to the persistor, I want them to receive a reference to a proxy object that looks as if it were the original one, but whenever they change some value on it, the persistor notices and acts accordingly, for example by marking the item or the list as dirty before actually setting the real value.
Edit: Alternatively, are there generic setters (like in PHP 5) in Java, that is, a method that gets called if a property doesn't exist? Or is there a type of object that I can add properties to at runtime?

If with "properties" you mean JavaBean properties, i.e. represented bay a getter and/or a setter method, then you can use a dynamic proxy to intercept the set method.
If you mean instance variables, then no can do - not on the Java level. Perhaps something could be done by manipulations on the byte code level though.
Actually, the easiest way to do it is probably by using AspectJ and defining a set() pointcut (which will intercept the field access on the byte code level).

The design pattern you are looking for is: Differential Execution. I do believe.
How does differential execution work?
Is a question I answered that deals with this.
However, may I suggest that you use a callback instead? You will have to read about this, but the general idea is that you can implement interfaces (often called listeners) that active upon "something interesting" happening. Such as having a data structure be changed.
Obligitory links:
Wiki Differential execution
Wiki Callback
Alright, here is the answer as I see it. Differential Execution is O(N) time. This is really reasonable, but if that doesn't work for ya Callbacks will. Callbacks basically work by passing a method by parameter to your class that is changing the array. This method will take the value changed and the location of the item, pass it back by parameter to the "storage class" and change the value approipriately. So, yes, you have to back each change with a method call.
I realize now this is not what you want. What it appears that you want is a way that you can supply some kind of listener on each variable in an array that would be called when that item is changed. The listener would then change the corresponding array in your "backup" to refect this change.
Natively I can't think of a way to do this. You can, of course, create your own listeners and events, using an interface. This is basically the same idea as the callbacks, though nicer to look at.
Then there is reflection... Java has reflection, and I am positive you can write something using it to do this. However, reflection is notoriously slow. Not to mention a pain to code (in my opinion).
Hope that helps...

I don't want to intercept method calls before they are invoked on the real object, but
rather I'd like to intercept properties that are being changed
So in fact, the objects you want to monitor are no convenient beans but a resurgence of C structs. The only way that comes to my mind to do that is with the Field Access call in JVMTI.

I wanted to do the same thing myself. My solution was to use dynamic proxy wrappers using Javassist. I would generate a class that implements the same interface as the class of my target object, wrap my proxy class around original class, and delegate all method calls on proxy to the original, except setters which would also fire the PropertyChangeEvent.
Anyway I posted the full explanation and the code on my blog here:
http://clockwork-fig.blogspot.com/2010/11/javabean-property-change-listener-with.html

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.