Use of helper/utility methods in a Factory class - java

I have a question about the usage of "utility/helper" methods in a factory class. Consider an example of an XML string that represents a document. I have a class that transforms it to an "object" (say PDF, Word, CSV, etc.). I have a factory class (let's call it DocumentFactory) that accepts this XML string and, based on certain rules, gives back the correct document object.
My question is whether, in terms of "best practices", it is OK for me to add "utility/helper" methods to the DocumentFactory class that aid in deciding which type of object will be returned. These helpers go beyond simple if/switch statements, but are not more than 15-20 lines each.
I am using one private static class as well in my code and there are about 4-5 helper methods (the helpers are public since I have tests written for these).
So is this setup a valid one for a factory class?

There is nothing intrinsically wrong with using helper methods in a Factory to help decide what sort of object to return. All the usual method-related warnings apply, but there are no Factory-specific reasons to avoid them.

This is perfectly fine. In fact, I'd say that using helper methods is the preferred way of doing it, since it is good practice to break your code up into small, reusable methods. You should probably make these helper methods private (and static, assuming the factory method itself is static).
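As a rough illustration of the setup described in the question, here is a minimal sketch. The Document types, the format="..." attribute rule, and the helper names are all assumptions made up for this example; the question does not show its real rules. The helpers are package-private rather than public, which keeps them out of the public API while still letting tests in the same package call them.

```java
import java.util.Locale;

// Hypothetical document types; the question does not show the real ones.
interface Document {
    String kind();
}

final class PdfDocument implements Document {
    public String kind() { return "pdf"; }
}

final class CsvDocument implements Document {
    public String kind() { return "csv"; }
}

final class DocumentFactory {
    private DocumentFactory() { } // static factory only, never instantiated

    // Factory method: the decision logic is delegated to small helpers.
    static Document create(String xml) {
        if (xml == null || xml.trim().isEmpty()) {
            throw new IllegalArgumentException("xml must not be empty");
        }
        String format = extractFormatAttribute(xml);
        return isTabular(format) ? new CsvDocument() : new PdfDocument();
    }

    // Helper: reads a format="..." attribute (an assumed rule, for illustration).
    static String extractFormatAttribute(String xml) {
        String marker = "format=\"";
        int start = xml.indexOf(marker);
        if (start < 0) {
            return "";
        }
        int from = start + marker.length();
        int end = xml.indexOf('"', from);
        return end < 0 ? "" : xml.substring(from, end).toLowerCase(Locale.ROOT);
    }

    // Helper: decides whether the format calls for a tabular document.
    static boolean isTabular(String format) {
        return format.equals("csv") || format.equals("tsv");
    }
}
```

Calling DocumentFactory.create("<doc format=\"CSV\"/>") would give back a CsvDocument in this sketch, and anything without a recognized format falls through to PdfDocument.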

Related

Is Object deserialization a proper way to implement Prototype pattern in Java?

TL;DR
Is Java serialization/deserialization (the Serializable interface, the ObjectOutputStream and ObjectInputStream classes, and possibly readObject and writeObject methods in the classes implementing Serializable) a valid implementation of the Prototype pattern?
Note
This question is not to discuss if using copy constructor is better than serialization/deserialization or not.
I'm aware of the Prototype Pattern concept (from Wikipedia, emphasis mine):
The prototype pattern is a creational design pattern in software development. It is used when the type of objects to create is determined by a prototypical instance, which is cloned to produce new objects. This pattern is used to:
avoid subclasses of an object creator in the client application, like the abstract factory pattern does.
avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword) when it is prohibitively expensive for a given application.
And from this Q/A: Examples of GoF Design Patterns in Java's core libraries, BalusC explains that prototype pattern in Java is implemented by Object#clone only if the class implements Cloneable interface (marker interface similar to Serializable to serialize/deserialize objects). The problem using this approach is noted in blog posts/related Q/As like these:
Copy Constructor versus Cloning
Java: recommended solution for deep cloning/copying an instance
So, another alternative is using a copy constructor to clone your objects (the DIY way), but this fails to implement the prototype pattern for the text I emphasized above:
avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword)
AFAIK the only way to create an object without invoking its constructor is by deserialization, as noted in the example of the accepted answer of this question: How are constructors called during serialization and deserialization?
So, I'm just asking if using object serialization and deserialization through ObjectOutputStream and ObjectInputStream (and knowing what you're doing, marking necessary fields as transient and understanding all the implications of this process) or a similar approach would be a proper implementation of the Prototype pattern.
Note: I don't think unmarshalling XML documents is a right implementation of this pattern, because it invokes the class constructor. This probably also happens when unmarshalling JSON content.
People would advise using a copy constructor, and I would keep that option in mind when working with simple objects. This question is more oriented to deep copying complex objects, where I may have 5 levels of objects to clone. For example:
//fields is an abbreviation for primitive type and String type fields
//that can vary between 1 and 20 (or more) declared fields in the class
//and all of them will be filled during application execution
class CustomerType {
//fields...
}
class Customer {
CustomerType customerType;
//fields
}
class Product {
//fields
}
class Order {
List<Product> productList;
Customer customer;
//fields
}
class InvoiceStatus {
//fields
}
class Invoice {
List<Order> orderList;
InvoiceStatus invoiceStatus;
//fields
}
//class to communicate invoice data for external systems
class InvoiceOutboundMessage {
List<Invoice> invoice;
//fields
}
Let's say I want/need to copy an instance of InvoiceOutboundMessage. I don't think a copy constructor would apply in this case. IMO having a lot of copy constructors doesn't seem like a good design here.
Using Java object serialization directly is not quite the Prototype pattern, but serialization can be used to implement the pattern.
The Prototype pattern puts the responsibility of copying on the object to be copied. If you use serialization directly, the client needs to provide the deserialization and serialization code. If you own, or plan to write, all of the classes that are to be copied, it is easy to move the responsibility to those classes:
define a Prototype interface which extends Serializable and adds an instance method copy
define a concrete class PrototypeUtility with a static method copy that implements the serialization and deserialization in one place
define an abstract class AbstractPrototype that implements Prototype. Make its copy method delegate to PrototypeUtility.copy.
A class which needs to be a Prototype can either implement Prototype itself and use PrototypeUtility to do the work, or can just extend AbstractPrototype. By doing so it also advertises that it is safely Serializable.
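The three pieces listed above could be sketched roughly as follows. This is only one way to write it: the names Prototype, PrototypeUtility and AbstractPrototype come from the description above, while SampleInvoice, the in-memory byte buffer, and the choice to wrap checked exceptions in unchecked ones are assumptions for the sake of a short example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// The copy responsibility lives on the object being copied.
interface Prototype extends Serializable {
    Prototype copy();
}

// One place that holds the serialization round trip.
final class PrototypeUtility {
    private PrototypeUtility() { }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Convenience base class: subclasses inherit a working copy().
abstract class AbstractPrototype implements Prototype {
    private static final long serialVersionUID = 1L;

    @Override
    public Prototype copy() {
        return PrototypeUtility.copy(this);
    }
}

// Hypothetical class to copy; every field must itself be serializable.
class SampleInvoice extends AbstractPrototype {
    private static final long serialVersionUID = 1L;
    final List<String> orderIds = new ArrayList<>();
}
```

A client holding only a Prototype reference can call copy() without knowing that the instance is a SampleInvoice, and the nested list is copied along with it.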
If you don't own the classes whose instances are to be copied, you can't follow the Prototype pattern exactly, because you can't move the responsibility for copying to those classes. However, if those classes implement Serializable, you can still get the job done by using serialization directly.
Regarding copy constructors, those are a fine way to copy Java objects whose classes you know, but they don't satisfy a requirement that the Prototype pattern does: the client should not need to know the class of the instance it is copying. A client which doesn't know an instance's class but wants to use its copy constructor would have to use reflection to find a constructor whose only argument has the same class as the class it belongs to. That's ugly, and the client couldn't be sure that the constructor it found was a copy constructor. Implementing an interface addresses those issues cleanly.
Wikipedia's comment that the Prototype pattern avoids the cost of creating a new object seems misguided to me. (I see nothing about that in the Gang of Four description.) Wikipedia's example of an object that is expensive to create is an object which lists the occurrences of a word in a text, which of course are expensive to find. But it would be foolish to design your program so that the only way to get an instance of WordOccurrences was to actually analyze a text, especially if you then needed to copy that instance for some reason. Just give it a constructor with parameters that describe the entire state of the instance and assigns them to its fields, or a copy constructor.
So unless you're working with a third-party library that hides its reasonable constructors, forget about that performance canard. The important points of Prototype are that
it allows the client to copy an object instance without knowing its class, and
it accomplishes that goal without creating a hierarchy of factories, as meeting the same goal with the AbstractFactory pattern would.
I'm puzzled by this part of your requirements:
Note: I don't think unmarshalling XML documents is a right
implementation of this pattern because invokes the class constructor.
Probably this also happens when unmarshalling JSON content as well.
I understand that you might not want to implement a copy constructor, but you will always have a regular constructor. If this constructor is invoked by a library then what does it matter? Furthermore object creation in Java is cheap. I've used Jackson for marshalling/unmarshalling Java objects with great success. It is performant and has a number of awesome features that might be very helpful in your case. You could implement a deep copier as follows:
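import java.io.IOException;
import com.fasterxml.jackson.databind.ObjectMapper;
public class MyCloner {
    private ObjectMapper cloner; // with getter and setter
    // Serializes the object to a JSON string, then reads it back as a new instance
    @SuppressWarnings("unchecked")
    public <T> T clone(T toClone) throws IOException {
        String stringCopy = cloner.writeValueAsString(toClone);
        T deepClone = (T) cloner.readValue(stringCopy, toClone.getClass());
        return deepClone;
    }
}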
Note that Jackson will work automatically with Beans (getter + setter pairs, no-arg constructor). For classes that break that pattern it needs additional configuration. One nice thing about this configuration is that it won't require you to edit your existing classes, so you can clone using JSON without any other part of your code knowing that JSON is being used.
Another reason I like this approach vs. serialization is it is more human debuggable (just look at the string to see what the data is). Additionally, there are tons of tools out there for working with JSON:
Online JSON formatter
View JSON as HTML based webpage
Whereas the tools for Java serialization aren't great.
One drawback to this approach is that, by default, duplicate references in the original object become separate objects in the copy. Here is an example:
import java.io.IOException;
import com.fasterxml.jackson.databind.ObjectMapper;
public class CloneTest {
    // Nested classes are static so the static method below can instantiate them
    public static class MyObject {
        public int id; // one property, so Jackson does not reject an empty bean
    }
    public static class MyObjectContainer {
        MyObject refA;
        MyObject refB;
        // Getters and Setters omitted
    }
    public static void runTest() throws IOException {
        MyCloner cloner = new MyCloner();
        cloner.setCloner(new ObjectMapper());
        MyObjectContainer container = new MyObjectContainer();
        MyObject duplicateReference = new MyObject();
        // Both fields point at the same object in the original
        container.setRefA(duplicateReference);
        container.setRefB(duplicateReference);
        MyObjectContainer cloned = cloner.clone(container);
        System.out.println(cloned.getRefA() == cloned.getRefB()); // Will print false
        System.out.println(container.getRefA() == container.getRefB()); // Will print true
    }
}
Given that there are several approaches to this problem each with their own pros and cons, I would claim there isn't a 'proper' way to implement the prototype pattern in Java. The right approach depends heavily on the environment you find yourself coding in. If you have constructors which do heavy computation (and can't circumvent them) then I suppose you don't have much option but to use Deserialization. Otherwise, I would prefer the JSON/XML approach. If external libraries weren't allowed and I could modify my beans, then I'd use Dave's approach.
Your question is really interesting, Luiggi (I voted for it because the idea is great); it's a pity you don't say what you are really concerned about. So I'll try to answer with what I know and let you pick out what you find debatable:
Advantages :
In terms of memory use, serialization is compact, since it stores your objects in a binary format (and not in text like JSON or, worse, XML). You may have to choose a strategy for keeping your "pattern" objects in memory as long as you need them, and for persisting them, such as "least used, first persisted" or "first used, first persisted".
Coding it is pretty direct. There are some rules to respect, but if you don't have many complex structures, it remains maintainable.
No need for external libraries, which is quite an advantage in institutions with strict security/legal rules (validation of each library used in a program).
If you don't need to maintain your objects between versions of the program or versions of the JVM, you can profit from each JVM update. Speed is a real concern for Java programs, and it is closely tied to I/O operations (JMX, memory reads/writes, NIO, etc.), so there is a good chance that new versions will have optimized I/O, memory usage and serialization algorithms, and you will find yourself writing/reading faster with no code change.
Disadvantages :
You lose all your prototypes if you change any object in the tree. Serialization works only with the same object definition.
You need to deserialize an object to see what is inside it, as opposed to the prototype pattern, which is "self documenting" if you take it from a Spring / Guice configuration file. The binary objects saved to disk are pretty opaque.
If you're planning to write a reusable library, you're imposing a pretty strict pattern on your library users (implementing Serializable on each object, or using transient for fields that are not serializable). In addition, these constraints cannot be checked by the compiler; you have to run the program to see if something is wrong (which might not be visible immediately if an object in the tree is null during the tests). Naturally, I'm comparing it to other prototyping technologies (Guice, for example, had the main feature of being compile-time checked; Spring added that later too).
I think that's all that comes to mind for now; I'll add a comment if any new aspect comes up :)
Naturally, I don't know how fast writing an object as bytes is compared to invoking a constructor. The answer to this should come from mass write/read tests.
But the question is worth thinking about.
There are cases where creating new object using copy constructor is different from creating new object "in a standard way". One example is explained in the Wikipedia link in your question. In that example, to create new WordOccurrences using the constructor WordOccurrences(text, word), we need to perform heavyweight computation. If we use copy constructor WordOccurrences(wordOccurences) instead, we can immediately get the result of that computation (in the Wikipedia, clone method is used, but the principle is the same).

AbstractClass.getInstance() method is this an anti-pattern

In some places where a class hierarchy is present and the top most base class is an abstract class there is a static getInstance() method in the abstract class. This will be responsible for creating the correct sub-class and returning it to the caller. For example consider the below code.
public abstract class Product {
    public static Product getInstance(String aCode) {
        if ("a".equals(aCode)) {
            return new ProductA();
        }
        return new ProductDefault();
    }
    // product behaviour methods
}
public class ProductA extends Product {}
public class ProductDefault extends Product {}
In Java, java.util.Calendar.getInstance() is one place where this pattern has been followed. However, it means that each time a new subclass is introduced, the base class has to be modified; in the above example, that is the Product class. This seems to violate the open/closed principle (OCP). The base class is also aware of subclass details, which is again questionable.
My question is...
Is the above pattern an anti-pattern?
What are the drawbacks of using the above pattern?
What alternatives can be followed instead?
The interface is not an anti-pattern. But the way you've implemented it is rather poor ... for the reason you identified. A better idea would be to have some mechanism for registering factory objects for each code:
The Java class libraries do this kind of thing using SPIs and code that looks reflectively for "provider" classes to be dynamically loaded.
A simpler approach is to have a "registry" object, and populate it using dependency injection, or static initializers in the factory object classes, or a startup method that reads class names from a properties file, etcetera.
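A minimal sketch of the "registry" idea, assuming a hypothetical ProductRegistry class that maps product codes to suppliers. The class and method names are made up for illustration, and the registry is a plain object (not a mutable static), so whoever wires up the application decides which products are registered. It uses a HashMap, so it is not thread-safe as written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

abstract class Product {
    abstract String name();
}

class ProductA extends Product {
    String name() { return "A"; }
}

class ProductDefault extends Product {
    String name() { return "default"; }
}

// Hypothetical registry: new subclasses are registered from outside,
// so neither Product nor the registry changes when one is added.
final class ProductRegistry {
    private final Map<String, Supplier<? extends Product>> suppliers = new HashMap<>();
    private final Supplier<? extends Product> fallback;

    ProductRegistry(Supplier<? extends Product> fallback) {
        this.fallback = fallback;
    }

    void register(String code, Supplier<? extends Product> supplier) {
        suppliers.put(code, supplier);
    }

    // Creates a fresh product for the code, or the fallback for unknown codes.
    Product create(String code) {
        return suppliers.getOrDefault(code, fallback).get();
    }
}
```

Usage would look like `ProductRegistry registry = new ProductRegistry(ProductDefault::new); registry.register("a", ProductA::new);`, after which `registry.create("a")` returns a new ProductA and any unregistered code yields a ProductDefault.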
No, it's not. It's more like the factory method pattern: http://en.wikipedia.org/wiki/Factory_method_pattern. Calendar.getInstance() is one example, and the JDK is full of others. It is also reminiscent of Effective Java, Item 1: Consider static factory methods instead of constructors.
There are a number of separate issues here.
getInstance is probably going to be a bad name. You explicitly want a new object you can play around with. "Create", "make", "new" or just leave that word out. "Instance" is also a pretty vacuous word in this context. If there is sufficient context from the class name leave it out, otherwise say what it is even if that is just a type name. If the method returns an immutable object, of is the convention (valueOf in olden times).
Putting it in an abstract base class (or in an interface if that were possible) is, as identified, not the best idea. In some cases an enumeration of all possible subtypes is appropriate - an enum obviously and really not that bad if you are going to use visitors anyway. Better to put it in a new file.
Anything to do with mutable statics is wrong. Whether it is reusing the same mutable instance, registration or doing something disgusting with the current thread. Don't do it or depend (direct or indirectly) on anything that does.
Based on the feedback, I introduced a new ProductFactory class that takes care of creating the correct Product. In my case, the creation of the correct product instance depends on an external context (I used the product code for the sake of simplicity; in the actual case it might be based on several parameters, and these could change over time). So a Product.getInstance() method is not well suited, for the reasons outlined in the question. Having a separate ProductFactory also means that, in the future, the Product class can become an interface if required. It just gives more extensibility.
I think that when the creation of the object doesn't depend on an external context, as in the case of Calendar.getInstance(), it's perfectly OK to have such a method. In these situations, the logic of finding the correct instance is internal to that particular module/class and doesn't depend on any externally provided information.

Using a function in two unrelated Java classes

I have two classes in my Java project that are not 'related' to each other (one inherits from Thread, and one is a custom object). However, they both need to use the same function, which takes two String arguments and does some file writing stuff. Where do I best put this function? Code duplication is ugly, but I also wouldn't want to create a whole new class just for this one function.
I have the feeling I am missing a very obvious way to do this here, but I can't think of an easy way.
[a function], which takes two String arguments and does some file writing stuff
As others have suggested, you can place that function in a separate class, which both your existing classes could then access. Others have suggested calling the class Utility or something similar. I recommend not naming the class in that manner. My objections are twofold.
One would expect that all the code in your program was useful. That is, it had utility, so such a name conveys no information about the class.
It might be argued that Utility is a suitable name because the class is utilized by others. But in that case the name describes how the class is used, not what it does. Classes should be named by what they do, rather than how they are used, because how they are used can change without what they do changing. Consider that Java has a String class, which can be used to hold a name, a description or a text fragment. The class does things with a "string of characters"; it might or might not be used for a name, so String was a good name for it, but Name would not have been.
So I'd suggest a different name for that class. Something that describes the kind of manipulation it does to the file, or describes the format of the file.
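For instance, if the shared function appends a line of text to a file, the class could be named after that. The question doesn't say what the two String arguments are, so this sketch simply assumes they are a file name and a line of text; LogLineAppender and appendLine are made-up names.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Named for what it does (appends lines to a file), not for how it is used.
final class LogLineAppender {
    private LogLineAppender() { } // only static methods, never instantiated

    // Appends one line to the file, creating the file if it does not exist yet.
    static void appendLine(String fileName, String line) throws IOException {
        Path path = Paths.get(fileName);
        byte[] bytes = (line + System.lineSeparator()).getBytes(StandardCharsets.UTF_8);
        Files.write(path, bytes, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}
```

Both the Thread subclass and the custom object can then call LogLineAppender.appendLine(...) without being related to each other.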
Create a Utility class and put all common utility methods in it.
Sounds like an ideal candidate for a FileUtils class that only has static functions. Take a look at SwingUtilities to see what I'm talking about.
You could make the function static in just one of the classes and then reference the static method in the other, assuming there aren't variables being used that require the object to have been instantiated already.
Alternatively, create another class to store all your static methods like that.
To answer the first part of your question - To the best of my knowledge it is impossible to have a function standalone in java; ergo - the function must go into a class.
The second part is more fun - A utility class is a good idea. A better idea may be to expand on what KitsuneYMG wrote: let your class take responsibility for its own reading/writing, then delegate the read/write operation to the utility class. This allows your read/write to be manipulated independently of the rest of the file operations.
Just my 2c (+:

Is there a rule of thumb for when to code a static method vs an instance method?

I'm learning Java (and OOP), and although it might be irrelevant for where I'm at right now, I was wondering if SO could share some common pitfalls or good design practices.
One important thing to remember is that static methods cannot be overridden by a subclass. References to a static method in your code essentially tie it to that implementation. When using instance methods, behavior can be varied based on the type of the instance. You can take advantage of polymorphism. Static methods are more suited to utilitarian types of operations where the behavior is set in stone. Things like base 64 encoding or calculating a checksum for instance.
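A small sketch of that difference, using made-up class names (Greeter and PoliteGreeter are not from the question). The subclass can declare a static method with the same signature, but that only hides the parent's method; a call written as Greeter.describe() is bound to Greeter at compile time.

```java
class Greeter {
    // Static: callers that write Greeter.describe() always get this version.
    static String describe() {
        return "plain greeter";
    }

    // Instance: the version that runs depends on the runtime type.
    String greet(String name) {
        return "Hello " + name;
    }
}

class PoliteGreeter extends Greeter {
    // This hides Greeter.describe(); it does not override it.
    static String describe() {
        return "polite greeter";
    }

    @Override
    String greet(String name) {
        return "Good day, " + name;
    }
}

final class GreeterDemo {
    private GreeterDemo() { }

    // Polymorphic call: behaviour varies with the instance passed in.
    static String viaInstance(Greeter greeter) {
        return greeter.greet("Ann");
    }

    // Static call: tied to Greeter's implementation regardless of subclasses.
    static String viaStatic() {
        return Greeter.describe();
    }
}
```

Passing a PoliteGreeter to viaInstance changes the result, while viaStatic keeps returning Greeter's description no matter which subclasses exist.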
I don't think any of the answers get to the heart of the OO reason of when to choose one or the other. Sure, use an instance method when you need to deal with instance members, but you could make all of your members public and then code a static method that takes in an instance of the class as an argument. Hello C.
You need to think about the messages the object you are designing responds to. Those will always be your instance methods. If you think about your objects this way, you'll almost never have static methods. Static members are ok in certain circumstances.
Notable exceptions that come to mind are the Factory Method and Singleton (use sparingly) patterns. Exercise caution when you are tempted to write a "helper" class, for from there, it is a slippery slope into procedural programming.
If the implementation of a method can be expressed completely in terms of the public interface (without downcasting) of your class, then it may be a good candidate for a static "utility" method. This allows you to maintain a minimal interface while still providing the convenience methods that clients of the code may use a lot. As Scott Meyers explains, this approach encourages encapsulation by minimizing the amount of code impacted by a change to the internal implementation of a class. Here's another interesting article by Herb Sutter picking apart std::basic_string deciding what methods should be members and what shouldn't.
In a language like Java or C++, I'll admit that the static methods make the code less elegant so there's still a tradeoff. In C#, extension methods can give you the best of both worlds.
If the operation will need to be overridden by a sub-class for some reason, then of course it must be an instance method in which case you'll need to think about all the factors that go into designing a class for inheritance.
My rule of thumb is: if the method performs anything related to a specific instance of a class, make it an instance method, regardless of whether it needs to use instance variables. If you can imagine a situation where you might need to use a certain method without necessarily referring to an instance of the class, then the method should definitely be static (class). If this method also happens to need instance variables in certain cases, then it is probably best to create a separate instance method that calls the static method and passes the instance variables. Performance-wise, I believe there is negligible difference (at least in .NET, though I would imagine it would be very similar for Java).
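The "instance method that calls the static method" arrangement might look like this. Temperature is a hypothetical class chosen only because the conversion is useful both with and without an instance.

```java
final class Temperature {
    private final double celsius;

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Static: usable without any Temperature instance.
    static double toFahrenheit(double celsius) {
        return celsius * 9.0 / 5.0 + 32.0;
    }

    // Instance: passes its own state to the static method.
    double toFahrenheit() {
        return toFahrenheit(celsius);
    }
}
```

Callers that only have a number use Temperature.toFahrenheit(100.0); callers that hold a Temperature call toFahrenheit() on it, and the formula exists in one place.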
If you keep state ( a value ) of an object and the method is used to access, or modify the state then you should use an instance method.
Even if the method does not alter the state (a utility function), I would recommend using an instance method, mostly because this way you can have a subclass that performs a different action.
For the rest, you could use a static method.
:)
This thread looks relevant: Method can be made static, but should it? The differences between C# and Java won't impact its relevance (I think).
Your default choice should be an instance method.
If it uses an instance variable it must be an instance method.
If not, it's up to you, but if you find yourself with a lot of static methods and/or static non-final variables, you probably want to extract all the static stuff into a new class instance. (A bunch of static methods and members is a singleton, but a really annoying one, having a real singleton object would be better--a regular object that there happens to be one of, the best!).
Basically, the rule of thumb is if it uses any data specific to the object, instance. So Math.max is static but BigInteger.bitCount() is instance. It obviously gets more complicated as your domain model does, and there are border-line cases, but the general idea is simple.
I would use an instance method by default. The advantage is that behavior can be overridden in a subclass or if you are coding against interfaces, an alternative implementation of the collaborator can be used. This is really useful for flexibility in testing code.
Static references are baked into your implementation and can't change. I find static useful for short utility methods. If the contents of your static method are very large, you may want to think about breaking responsibility into one or more separate objects and letting those collaborate with the client code as object instances.
IMHO, if you can make it a static method (without having to change its structure) then make it a static method. It is faster, and simpler.
If you know you will want to override the method, I suggest you write a unit test where you actually do this and so it is no longer appropriate to make it static. If that sounds like too much hard work, then don't make it an instance method.
Generally, You shouldn't add functionality as soon as you imagine a use one day (that way madness lies), you should only add functionality you know you actually need.
For a longer explanation...
http://en.wikipedia.org/wiki/You_Ain%27t_Gonna_Need_It
http://c2.com/xp/YouArentGonnaNeedIt.html
The issue with static methods is that you are breaking one of the core object-oriented principles, because you are coupled to an implementation. You want to support the open/closed principle: have your class implement an interface that describes the dependency (in a behavioral, abstract sense), and then have your classes depend on that interface. It is much easier to extend from that point on.
My static methods are always one of the following:
Private "helper" methods that evaluate a formula useful only to that class.
Factory methods (Foo.getInstance() etc.)
In a "utility" class that is final, has a private constructor and contains nothing other than public static methods (e.g. com.google.common.collect.Maps)
I will not make a method static just because it does not refer to any instance variables.

How much code should one put in a constructor?

How much code should one put in constructors in Java? Very often you write helper methods and invoke them in a constructor, but sometimes there is longer initialization work, for example in a program that reads from a file, in user interfaces, or in other programs where you do more than initialize the instance variables. In those cases the constructor may get longer (if you don't use helper methods). My understanding is that constructors should generally be short and concise, shouldn't they? Are there exceptions to this?
If you go by the SOLID principles, each class should have one reason to change (i.e. do one thing). Therefore a constructor would normally not be reading a file, but you would have a separate class that builds the objects from the file.
Take a look at this SO question. Even though the other one is for C++, the concepts are still very similar.
As little as is needed to complete the initialization of the object.
If you can talk about a portion (5 or so lines is my guideline) of your constructor as a chunk of logic or a specific process, it's probably best to split it into a separate method for clarity and organizational purposes.
But to each his own.
My customary practice is that if all the constructor has to do is set some fields on an object, it can be arbitrarily long. If it gets too long, it means that the class design is broken anyway, or data need to be packaged in some more complex structures.
If, on the other hand, the input data need some more complex processing before initializing the class fields, I tend to give the constructor the processed data and move the processing to a static factory method.
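A minimal sketch of that split, with a hypothetical WordList class: the static factory method does the parsing, and the constructor only receives data that is already processed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class WordList {
    private final List<String> words;

    // The constructor only assigns already-processed data.
    private WordList(List<String> words) {
        this.words = words;
    }

    // Static factory: parses the raw input, then hands the result to the constructor.
    static WordList fromCommaSeparated(String raw) {
        List<String> parsed = new ArrayList<>();
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                parsed.add(trimmed);
            }
        }
        return new WordList(Collections.unmodifiableList(parsed));
    }

    List<String> words() {
        return words;
    }
}
```

The factory method also gets a descriptive name, so a second input format could be added later as another static method without overloading the constructor.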
Constructors should be just long enough, but no longer =)
If you are defining multiple overloaded constructors, don't duplicate code; instead, consolidate functionality into one of them for improved clarity and ease of maintenance.
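In Java, the usual way to do that is constructor chaining with this(...). The Rectangle class below is a made-up example: the shorter constructors only supply defaults, and the validation and assignments live in one place.

```java
final class Rectangle {
    private final int width;
    private final int height;

    // Default: a 1 x 1 rectangle, delegating to the square constructor.
    Rectangle() {
        this(1);
    }

    // Square: delegates to the full constructor.
    Rectangle(int side) {
        this(side, side);
    }

    // The only constructor that validates and assigns fields.
    Rectangle(int width, int height) {
        if (width <= 0 || height <= 0) {
            throw new IllegalArgumentException("width and height must be positive");
        }
        this.width = width;
        this.height = height;
    }

    int area() {
        return width * height;
    }
}
```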
As Knuth said, "Premature optimization is the root of all evil."
How much should you put in the constructor? Everything you need to. This is the "eager" approach. When--and only when--performance becomes an issue do you consider optimizing it (to the "lazy" or "over-eager" approaches).
Constructors should create the most minimal, generic instance of your object. How generic? Choose the test cases that every instance or object that inherits from the class must pass to be valid - even if "valid" only means fails gracefully (programmatically generated exception).
Wikipedia has a good description :
http://en.wikipedia.org/wiki/Constructor_(computer_science)
A valid object is the goal of the constructor: valid, not necessarily useful. Making it useful can be done in an initialization method.
Your class may need to be initialized to a certain state, before any useful work can be done with it.
Consider this.
import java.util.Calendar;
import java.util.Date;
public class CustomerRecord
{
    private Date dateOfBirth;
    public CustomerRecord()
    {
        // Minimal initialization: getYearOfBirth() relies on this field being set
        dateOfBirth = new Date();
    }
    public int getYearOfBirth()
    {
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(dateOfBirth);
        return calendar.get(Calendar.YEAR);
    }
}
Now if you don't initialize the dateOfBirth member variable, any subsequent invocation of getYearOfBirth() will result in a NullPointerException.
So the bare minimum initialization which may involve
Assigning of values.
Invoking helper functions.
to ensure that the class behaves correctly when its members are invoked later on, is all that needs to be done.
A constructor is like an application setup wizard, where you do only configuration. If the instance is ready to take any (possible) action on itself, then the constructor has done its job.
