In Java, I would like to use hierarchies of immutable POJOs to express my domain model.
e.g.
final ServiceId id = new ServiceId(ServiceType.Foo, "my-foo-service")
final ServiceConfig cfg = new ServiceConfig("localhost", 8080, "abc", JvmConfig.DEFAULT)
final ServiceInfo info = new ServiceInfo(id, cfg)
All of these POJOs have public final fields with no getters or setters. (If you are a fan of getters, please pretend that the fields are private with getters.)
I would also like to serialize these objects using the MessagePack library in order to pass them around over the network, store them to ZooKeeper nodes, etc.
The problem is that MessagePack only supports serialization of public, non-final fields, so I cannot serialize the business objects as-is. Also MessagePack does not support enum, so I have to convert enum values to int or String for serialization. (Yes it does, if you add an annotation to your enums. See my comment below.)
To deal with this I have a hand-written corresponding hierarchy of "message" objects, with conversions between each business object and its corresponding message object. Obviously this is not ideal because it causes a large amount of duplicated code, and human error could result in missing fields, etc.
Are there any better solutions to this problem?
Code generation at compile time?
Some way to generate the appropriate serializable classes at runtime?
Give up on MessagePack?
Give up on immutability and enums in my business objects?
Is there some kind of generic wrapper library that can wrap a mutable object (the message object) into an immutable one (the business object)?
MessagePack also supports serialization of Java Beans (using the #MessagePackBeans annotation), so if I can automatically convert an immutable object to/from a Java Bean, that may get me closer to a solution.
Coincidentally, I recently created a project that does pretty much exactly what you are describing. The use of
immutable data models provides huge benefits, but many serialization technologies seem to approach
immutability as an afterthought. I wanted something that would fix this.
My project, Grains, uses code generation to create an immutable implementation
of a domain model. The implementation is generic enough that it can be adapted to different serialization frameworks.
MessagePack, Jackson, Kryo, and standard Java serialization are supported so far.
Just write a set of interfaces that describe your domain model. For example:
public interface ServiceId {
enum ServiceType {Foo, Bar}
String getName();
ServiceType getType();
}
public interface ServiceConfig {
enum JvmConfig {DEFAULT, SPECIAL}
String getHost();
int getPort();
String getUser();
JvmConfig getType();
}
public interface ServiceInfo {
ServiceId getId();
ServiceConfig getConfig();
}
The Grains Maven plugin then generates immutable implementations of these interfaces at compile time.
(The source it generates is designed to be read by humans.) You then create instances of your objects. This example
shows two construction patterns:
ServiceIdGrain id = ServiceIdFactory.defaultValue()
.withType(ServiceType.Foo)
.withName("my-foo-service");
ServiceConfigBuilder cfg = ServiceConfigFactory.newBuilder()
.setHost("localhost")
.setPort(8080)
.setUser("abc")
.setType(JvmConfig.DEFAULT);
ServiceInfoGrain info = ServiceInfoFactory.defaultValue()
.withId(id)
.withConfig(cfg.build());
Not as simple as your public final fields, I know, but inheritance and composition are not possible without getters
and setters. And, these objects are easily read and written with MessagePack:
MessagePack msgpack = MessagePackTools.newGrainsMessagePack();
byte[] data = msgpack.write(info);
ServiceInfoGrain unpacked = msgpack.read(data, ServiceInfoGrain.class);
If the Grains framework doesn't work for you, feel free to inspect its MessagePack templates.
You can write a generic TemplateBuilder that uses reflection to set the final fields of your hand-written domain model. The trick
is to create a custom TemplateRegistry that allows registration of your custom builder.
It sounds like you have merged, rather than separated, the read and write concerns of your application. You should probably consider CQRS at this point.
In my experience, immutable domain objects are almost always attached to an audit story (requirement), or it's lookup data (enums).
Your domain should probably be, mostly, mutable, but you still don't need getters and setters. Instead you should have verbs on your objects which result in a modified domain model, and which raise events when something interesting happens in the domain (interesting to the business -- business == someone paying for your time). It's probably the events that you're interested in passing over the wire, not the domain objects. Maybe it's even the commands (these are similar to events, but the source is an agent external to the bounded context in which your domain lives -- events are internal to the model's bounded context).
You can have a service to persist the events (and another one to persist commands), which is also your audit-log (fulfilling your audit stories).
You can have an event handler that pushes your events onto your bus. These events should contain either simple information or entity ID's. The services that respond to these events should perform their duties using the information provided, or they should query for the information they need using the given ID's.
You really shouldn't be exposing the internal state of your domain model. You're breaking encapsulation by doing that, and that's not really a desirable thing to do. If I were you I'd take a look at the Axon Framework. It's likely to get you further than MessagePack alone.
Related
Java 14 introduced records feature. Record creates getter with the same name as field, so one would write print(person.name()) for example. But old Java bean convention dictates that one should name this method as getName().
Using both styles in the same code base does not look very nice. Migrating everything to records is not possible, as they are too limited to replace all use-cases.
Is there any official or semi-official guidelines how to name getters and setters after Java 14 in new code?
Quote from JEP 359:
It is not a goal to declare "war on boilerplate"; in particular, it is not a goal to address the problems of mutable classes using the JavaBean naming conventions.
My understanding, based on the same document is that records are transparent holders for shallowly immutable data.
That being said:
Records are not the place to look for getters/setters syntactical sugar, as they are not meant to replace JavaBeans.
I strongly agree with you that JavaBeans are too verbose. Maybe an additional feature (called beans instead of records) could be implemented - very similar behavior with the records feature but that would permit mutability. In that case, records and beans would not be mutually exclusive.
As it has been mentioned, records are in preview mode. Let's see what the feedback from community would be.
All in all, IMHO they are a step forward... I wrote this example set where you can see a code reduction to ~15% LOC from standard JavaBeans.
Also, note that records behave like normal classes: they can be declared top level or nested, they can be generic, they can implement interfaces (from the same document). You can actually partly simulate JavaBeans (only getters would make sense, though) by extracting an interface containing the getters - however that would be a lot of work and not a really clean solution...
So, based on the logic above, to address your question, no - I didn't see any (semi)official guideline for getters and setters and I don't think that there is a motivation for it right now because, again, records are not a replacement for JavaBeans...
The record spec is now "final" as of Java 17 and this naming convention discrepancy has unfortunately not been addressed. I stumbled upon it when attempting to leverage Records as shallow holder classes to implement interfaces part of an existing domain model.
Whilst this isn't as neat a solution as I'd like, Records can have methods, so you could add "legacy" getters to your record, as in the following (contrived but simple) example.
public interface Nameable {
public String getName();
}
public record Person(String name) implements Nameable {
public String getName() {
return name; // or return name();
}
}
At least this allows client code to continue to use that tried and tested (over 20 years old) convention, which - let's face it - is used far more than in pure JavaBeans context.
You could say that the language designers have lived up to their remit of "not declaring war on boilerplate"
I stumbled up this when researching naming conventions for my project. Looking at the "recent" additions to the std lib (e.g. Path, FileSystem, HttpRequest, ...) the only more-or-less "pattern" I could detect was that .prop() implies direct, unmodified access to the field value, and thus existance of the field with that very type.
Whereas "getXXX" conveys that you cannot/should not assume the existence of a field. This property might be calculated, direct field access or read-only wrapped (e.g. List.copyOf) or converted.
So my conclusion is: if you want to communicate "structure" or enforce the precence of fields use .prop(). In all other cases stick to getXXX as it is more flexible (implementers can be entity classes, records or service classes.
Btw: I am aware that there are big offenders to this logic even in the jdk. e.g. BigDecimal that's why I focused on more recent additions.
In Java records, object fields must be private and final.
So there is just one kind of getter and one kind of setter possible.
In Java classes, object fields may be private or public.
In the latter type of field, one can get or set them simply by adding a period and the field name, e.g.
Employee emp = new Employee(); // Nullary constructor
emp.name = "John Schmidt"; // Setter
. . .
. . .
if (emp.name != "Buddy") // Getter
{
emp.bonus = 100.00;
}
Non-private fields are used a lot in Android apps to save memory and time extracting data. But there's no reason not to use them in Java where it's safe to do so.
Now, if you change away from the usual way in Java classes to something like that used in record types, e.g.
String name = emp.name(); // New getter convention for private field
you have a serious risk of confusion by code readers who might misinterpret this as a non-private object field.
And if you change the record getter to what is used in Java objects, i.e.
obj.getField()
then there is a risk of confusion by coder reviewers and possibly a compiler may treat it as a Java object, depending on execution decision criteria.
In short, it's a different type of object to the normal Java class or enum. Its accessors indicate this new type unambiguously.
That's how I see it anyhow.
Maybe someone on the Java development committee may be able to enlighten us further.
TL;DR
Can I use Java serialization/deserialization using Serializable interface, ObjectOutputStream and ObjectInputStream classes, and probably adding readObject and writeObject in the classes implementing Serializable as a valid implementation for Prototype pattern or not?
Note
This question is not to discuss if using copy constructor is better than serialization/deserialization or not.
I'm aware of the Prototype Pattern concept (from Wikipedia, emphasis mine):
The prototype pattern is a creational design pattern in software development. It is used when the type of objects to create is determined by a prototypical instance, which is cloned to produce new objects. This pattern is used to:
avoid subclasses of an object creator in the client application, like the abstract factory pattern does.
avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword) when it is prohibitively expensive for a given application.
And from this Q/A: Examples of GoF Design Patterns in Java's core libraries, BalusC explains that prototype pattern in Java is implemented by Object#clone only if the class implements Cloneable interface (marker interface similar to Serializable to serialize/deserialize objects). The problem using this approach is noted in blog posts/related Q/As like these:
Copy Constructor versus Cloning
Java: recommended solution for deep cloning/copying an instance
So, another alternative is using a copy constructor to clone your objects (the DIY way), but this fails to implement the prototype pattern for the text I emphasized above:
avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword)
AFAIK the only way to create an object without invoking its constructor is by deserialization, as noted in the example of the accepted answer of this question: How are constructors called during serialization and deserialization?
So, I'm just asking if using object deserialization through ObjectOutputStream (and knowing what you're doing, marking necessary fields as transient and understanding all the implications of this process) or a similar approach would be a proper implementation of Prototype Pattern.
Note: I don't think unmarshalling XML documents is a right implementation of this pattern because invokes the class constructor. Probably this also happens when unmarshalling JSON content as well.
People would advise using object constructor, and I would mind that option when working with simple objects. This question is more oriented to deep copying complex objects, where I may have 5 levels of objects to clone. For example:
//fields is an abbreviation for primitive type and String type fields
//that can vary between 1 and 20 (or more) declared fields in the class
//and all of them will be filled during application execution
class CustomerType {
//fields...
}
class Customer {
CustomerType customerType;
//fields
}
class Product {
//fields
}
class Order {
List<Product> productList;
Customer customer;
//fields
}
class InvoiceStatus {
//fields
}
class Invoice {
List<Order> orderList;
InvoiceStatus invoiceStatus;
//fields
}
//class to communicate invoice data for external systems
class InvoiceOutboundMessage {
List<Invoice> invoice;
//fields
}
Let's say, I want/need to copy a instance of InvoiceOutboundMessage. I don't think a copy constructor would apply in this case. IMO having a lot of copy constructors doesn't seem like a good design in this case.
Using Java object serialization directly is not quite the Prototype pattern, but serialization can be used to implement the pattern.
The Prototype pattern puts the responsibility of copying on the object to be copied. If you use serialization directly, the client needs to provide the deserialization and serialization code. If you own, or plan to write, all of the classes that are to be copied, it is easy to move the responsibility to those classes:
define a Prototype interface which extends Serializable and adds an instance method copy
define a concrete class PrototypeUtility with a static method copy that implements the serialization and deserialization in one place
define an abstract class AbstractPrototype that implements Prototype. Make its copy method delegate to PrototypeUtility.copy.
A class which needs to be a Prototype can either implement Prototype itself and use PrototypeUtility to do the work, or can just extend AbstractPrototype. By doing so it also advertises that it is safely Serializable.
If you don't own the classes whose instances are to be copied, you can't follow the Prototype pattern exactly, because you can't move the responsibility for copying to those classes. However, if those classes implement Serializable, you can still get the job done by using serialization directly.
Regarding copy constructors, those are a fine way to copy Java objects whose classes you know, but they don't meet the requirement that the Prototype pattern does that the client should not need to know the class of the object instance that it is copying. A client which doesn't know an instance's class but wants to use its copy constructor would have to use reflection to find a constructor whose only argument has the same class as the class it belongs to. That's ugly, and the client couldn't be sure that the constructor it found was a copy constructor. Implementing an interface addresses those issues cleanly.
Wikipedia's comment that the Prototype pattern avoids the cost of creating a new object seems misguided to me. (I see nothing about that in the Gang of Four description.) Wikipedia's example of an object that is expensive to create is an object which lists the occurrences of a word in a text, which of course are expensive to find. But it would be foolish to design your program so that the only way to get an instance of WordOccurrences was to actually analyze a text, especially if you then needed to copy that instance for some reason. Just give it a constructor with parameters that describe the entire state of the instance and assigns them to its fields, or a copy constructor.
So unless you're working with a third-party library that hides its reasonable constructors, forget about that performance canard. The important points of Prototype are that
it allows the client to copy an object instance without knowing its class, and
it accomplishes that goal without creating a hierarchy of factories, as meeting the same goal with the AbstractFactory pattern would.
I'm puzzled by this part of your requirements:
Note: I don't think unmarshalling XML documents is a right
implementation of this pattern because invokes the class constructor.
Probably this also happens when unmarshalling JSON content as well.
I understand that you might not want to implement a copy constructor, but you will always have a regular constructor. If this constructor is invoked by a library then what does it matter? Furthermore object creation in Java is cheap. I've used Jackson for marshalling/unmarshalling Java objects with great success. It is performant and has a number of awesome features that might be very helpful in your case. You could implement a deep copier as follows:
import com.fasterxml.jackson.databind.ObjectMapper;
public class MyCloner {
private ObjectMapper cloner; // with getter and setter
public <T> clone(T toClone){
String stringCopy = mapper.writeValueAsString(toClone);
T deepClone = mapper.readValue(stringCopy, toClone.getClass());
return deepClone;
}
}
Note that Jackson will work automatically with Beans (getter + setter pairs, no-arg constructor). For classes that break that pattern it needs additional configuration. One nice thing about this configuration is that it won't require you to edit your existing classes, so you can clone using JSON without any other part of your code knowing that JSON is being used.
Another reason I like this approach vs. serialization is it is more human debuggable (just look at the string to see what the data is). Additionally, there are tons of tools out there for working with JSON:
Online JSON formatter
Veiw JSON as HTML based webpage
Whereas tools for Java serialization isn't great.
One drawback to this approach is that by default duplicate references in the original object will be made unique in the copied object by default. Here is an example:
public class CloneTest {
public class MyObject { }
public class MyObjectContainer {
MyObject refA;
MyObject refB;
// Getters and Setters omitted
}
public static void runTest(){
MyCloner cloner = new MyCloner();
cloner.setCloner(new ObjectMapper());
MyObjectContainer container = new MyObjectContainer();
MyObject duplicateReference = new MyObject();
MyObjectContainer.setRefA(duplicateReference);
MyObjectContainer.setRefB(duplicateReference);
MyObjectContainer cloned = cloner.clone(container);
System.out.println(cloned.getRefA() == cloned.getRefB()); // Will print false
System.out.println(container.getRefA() == container.getRefB()); // Will print true
}
}
Given that there are several approaches to this problem each with their own pros and cons, I would claim there isn't a 'proper' way to implement the prototype pattern in Java. The right approach depends heavily on the environment you find yourself coding in. If you have constructors which do heavy computation (and can't circumvent them) then I suppose you don't have much option but to use Deserialization. Otherwise, I would prefer the JSON/XML approach. If external libraries weren't allowed and I could modify my beans, then I'd use Dave's approach.
Your question is really interesting Luiggi (I voted for it because the idea is great), it's a pitty you don't say what you are really concerned about. So I'll try to answer what I know and let you choose what you find arguable:
Advantages :
In terms of memory use, you will get a very good memory consumption by using serialization since it serializes your objects in binary format (and not in text as json or worse: xml). You may have to choose a strategy to keep your objects "pattern" in memory as long as you need it, and persist it in a "less used first persisted" strategy, or "first used first persisted"
Coding it is pretty direct. There are some rules to respect, but it you don't have many complex structures, this remains maintainable
No need for external libraries, this is pretty an advantage in institutions with strict security/legal rules (validations for each library to be used in a program)
If you don't need to maintain your objects between versions of the program/ versions of the JVM. You can profit from each JVM update as speed is a real concern for java programs, and it's very related to io operations (JMX, memory read/writes, nio, etc...). So there are big chances that new versions will have optimized io/memory usage/serialization algos and you will find you're writing/reading faster with no code change.
Disadvantages :
You loose all your prototypes if you change any object in the tree. Serialization works only with the same object definition
You need to deserialize an object to see what is inside it: as opposed to the prototype pattern that is 'self documenting' if you take it from a Spring / Guice configuration file. The binary objects saved to disk are pretty opaque
If you're planning to do a reusable library, you're imposing to your library users a pretty strict pattern (implementing Serializable on each object, or using transient for dields that are not serializable). In addition this constraints cannot be checked by the compiler, you have to run the program to see if there's something wrong (which might not be visible immediately if an object in the tree is null for the tests). Naturally, I'm comparing it to other prototyping technologies (Guice for example had the main feature of being compile time checked, Spring did it lately too)
I think it's all what comes to my mind for now, I'll add a comment if any new aspect raises suddenly :)
Naturally I don't know how fast is writing an object as bytes compared to invoking a constructor. The answer to this should be mass write/read tests
But the question is worth thinking.
There are cases where creating new object using copy constructor is different from creating new object "in a standard way". One example is explained in the Wikipedia link in your question. In that example, to create new WordOccurrences using the constructor WordOccurrences(text, word), we need to perform heavyweight computation. If we use copy constructor WordOccurrences(wordOccurences) instead, we can immediately get the result of that computation (in the Wikipedia, clone method is used, but the principle is the same).
I've been reading the book Clean Code: A Handbook of Agile Software Craftsmanship and in chapter six pages 95-98 it clarifies about the differences between objects and data structures:
Objects hide their data behind abstractions and expose functions that operate on that data. Data structures expose their data and have no meaningful functions.
Object expose behavior and hide data. This makes it easy to add new kinds of objects without changing existing behaviors. It also makes it hard to add new behaviors to existing objects.
Data structures expose data and have no significant behavior. This makes it easy to add new behaviors to existing data structures but makes it hard to add new data structures to existing functions.
I'm a tad bit confused whether some classes are objects or data structures. Say for example HashMaps in java.util, are they objects? (because of its methods like put(), get(), we dont know their inner workings) or are they data structures? (I've always thought of it as data structures because its a Map).
Strings as well, are they data structures or objects?
So far majority of the code I've been writing have been the so called "hybrid classes" which try to act as an object and a data structure as well. Any tips on how to avoid them as well?
The distinction between data structures and classes/objects is a harder to explain in Java than in C++. In C, there are no classes, only data structures, that are nothing more than "containers" of typed and named fields. C++ inherited these "structs", so you can have both "classic" data structures and "real objects".
In Java, you can "emulate" C-style data structures using classes that have no methods and only public fields:
public class VehicleStruct
{
public Engine engine;
public Wheel[] wheels;
}
A user of VehicleStruct knows about the parts a vehicle is made of, and can directly interact with these parts. Behavior, i.e. functions, have to be defined outside of the class. That's why it is easy to change behavior: Adding new functions won't require existing code to change. Changing data, on the other hand, requires changes in virtually every function interacting with VehicleStruct. It violates encapsulation!
The idea behind OOP is to hide the data and expose behavior instead. It focuses on what you can do with a vehicle without having to know if it has engine or how many wheels are installed:
public class Vehicle
{
private Details hidden;
public void startEngine() { ... }
public void shiftInto(int gear) { ... }
public void accelerate(double amount) { ... }
public void brake(double amount) { ... }
}
Notice how the Vehicle could be a motorcycle, a car, a truck, or a tank -- you don't need to know the details. Changing data is easy -- nobody outside the class knows about data so no user of the class needs to be changed. Changing behavior is difficult: All subclasses must be adjusted when a new (abstract) function is added to the class.
Now, following the "rules of encapsulation", you could understand hiding the data as simply making the fields private and adding accessor methods to VehicleStruct:
public class VehicleStruct
{
private Engine engine;
private Wheel[] wheels;
public Engine getEngine() { return engine; }
public Wheel[] getWheels() { return wheels; }
}
In his book, Uncle Bob argues that by doing this, you still have a data structure and not an object. You are still just modeling the vehicle as the sum of its parts, and expose these parts using methods. It is essentially the same as the version with public fields and a plain old C struct -- hence a data structure. Hiding data and exposing methods is not enough to create an object, you have to consider if the methods actually expose behavior or just the data!
When you mix the two approaches, e.g. exposing getEngine() along with startEngine(), you end up with a "hybrid". I don't have Martin's Book at hand, but I remember that he did not recommend hybrids at all, as you end up with the worst of both worlds: Objects where both data and behavior is hard to change.
Your questions concerning HashMaps and Strings are a bit tricky, as these are pretty low level and don't fit quite well in the kinds of classes you will be writing for your applications. Nevertheless, using the definitions given above, you should be able to answer them.
A HashMap is an object. It exposes its behavior to you and hides all the nasty hashing details. You tell it to put and get data, and don't care which hash function is used, how many "buckets" there are, and how collisions are handled. Actually, you are using HashMap solely through its Map interface, which is quite a good indication of abstraction and "real" objects.
Don't get confused that you can use instances of a Map as a replacement for a data structure!
// A data structure
public class Point {
public int x;
public int y;
}
// A Map _instance_ used instead of a data structure!
Map<String, Integer> data = new HashMap<>();
data.put("x", 1);
data.put("y", 2);
A String, on the other hand, is pretty much an array of characters, and does not try to hide this very much. I guess one could call it a data structure, but to be honest I am not sure if much is to be gained one way or the other.
This is what, I believe, Robert. C. Martin was trying to convey:
Data Structures are classes that simply act as containers of structured data. For example:
public class Point {
public double x;
public double y;
}
Objects, on the other hand, are used to create abstractions. An abstraction is understood as:
a simplification of something much more complicated that is going on under the covers The Law of Leaky Abstractions, Joel on Software
So, objects hide all their underpinnings and only let you manipulate the essence of their data in a simplified way. For instance:
public interface Point {
double getX();
double getY();
void setCartesian(double x, double y);
double getR();
double getTheta();
void setPolar(double r, double theta);
}
Where we don't know how the Point is implemented, but we do know how to consume it.
As I see it , what Robert Martin tries to convey, is that objects should not expose their data via getters and setters unless their sole purpose is to act as simple data containers. Good examples of such containers might be java beans, entity objects (from object mapping of DB entities), etc.
The Java Collection Framework classes, however, are not a good example of what he's referring to, since they don't really expose their internal data (which is in a lot of cases basic arrays). It provides abstraction that lets you retrieve objects that they contain. Thus (in my POV) they fit in the "Objects" category.
The reasons are stated by the quotes you added from the book, but there are more good reasons for refraining from exposing the internals. Classes that provide getters and setters invite breaches of the Law of Demeter, for instance. On top of that, knowing the structure of the state of some class (knowing which getters/setters it has) reduces the ability to abstract the implementation of that class. There are many more reasons of that sort.
An object is an instance of a class.
A class can model various things from the real world. It's an abstraction of something (car, socket, map, connection, student, teacher, you name it).
A data structure is a structure which organizes certain data in a certain way.
You can implement structures in ways different that by using classes (that's what you do in languages which don't support OOP e.g.; you can still implement a data structure in C let's say).
HashMap in java is a class which models a map data structure using hash-based implementation, that's why it's called HashMap.
Socket in java is a class which doesn't model a data structure but something else (a socket).
A data structure is only an abstraction, a special way of representing data. They are just human-made constructs, which help in reducing complexity at the high-level, i.e. to not work in the low-level. An object may seem to mean the same thing, but the major difference between objects and data structures is that an object might abstract anything. It also offers behaviour. A data structure does not have any behaviour because it is just data-holding memory.
The libraries classes such as Map, List,etc. are classes, which represent data structures. They implement and setup a data structure so that you can easily work with them in your programs by creating instances of them (i.e. objects).
Data structures(DS) are an abstract way of saying that a structure holds some data'. HashMap with some key value pairs is a data structure in Java. Associated arrays are similarly in PHP etc. Objects is a little lower than the DS level. Your hashmap is a data structure. now to use a hashmap you create an 'object' of it and add data to that object using put method. I can have my own class Employee which has data and is thus a DS for me. But to use this DS to do some operations like o see if the employee is a male or a female colleague i need an instance of an Employee and test its gender property.
Don't confuse objects with data structures.
An object is an instance of a class. A class can define a set of properties/fields that every instance/object of that class inherits. A data structure is a way to organize and store data. Technically a data structure is an object, but it's an object with the specific use for holding other objects (everything in Java is an object, even primitive types).
To answer your question a String is an object and a data structure. Every String object you create is an instance of the String class. A String, as Java represents it internally, is essentially a character array, and an array is a data structure.
Not all classes are blueprints for data structures, however all data structures are technically objects AKA instances of a class (that is specifically designed to store data), if that makes any sense.
Your question is tagged as Java, so I will reference only Java here.
Objects are the Eve class in Java; that is to say everything in Java extends Object and object is a class.
Therefor, all data structures are Objects, but not all Objects are data structures.
The key to the difference is the term Encapsulation.
When you make an object in Java, it is considered best practice to make all of your data members private. You do this to protect them from anyone using the class.
However, you want people to be able to access the data, sometimes change it. So, you provide public methods called accessors and mutators to allow them to do so, also called getters and setters. Additionally, you may want them to view the object as a whole in a format of your choosing, so you can define a toString method; this returns a string representing the object's data.
A structure is slightly different.
It is a class.
It is an Object.
But it is usually private within another class; As a Node is private within a tree and should not be directly accessible to the user of the tree. However, inside the tree object the nodes data members are publicly visible. The node itself does not need accessors and mutators, because these functions are trusted to and protected by the tree object.
Keywords to research: Encapsulation, Visibility Modifiers
(Disclaimer: Extreme oversimplification. The actual scenario is considerably more complex.)
Say I have two systems, Producer and Consumer. Their code is completely independent, aside from a single shared interface:
public interface Thing {
String getName();
String getDescription();
int getPrice();
}
The idea is that Producer creates a bunch of data and sends it as JSON over HTTP. Producer has a bunch of implementations of Thing, each with additional pieces of metadata and stuff required in the data generation process.
As it's undesirable for Producer to have any kind of knowledge of Jackson/serialization aside from a thin layer at the very top, serialization attributes should be kept out of the Thing implementations. Due to the amount of implementation being very likely to grow in the future, having mixins for all of them quickly becomes unsustainable. It was believed to be sufficient to apply annotations to the Thing interface itself.
The first simple approach was a #JsonSerialize annotation on the interface. At first, that seemed to work, but resulted in a problem. Some of the implementations of Thing are enums, resulting in Jackson simply serializing them as their name instead of the fields defined in the interface.
Some googling revealed the following annotation:
#JsonFormat(shape= JsonFormat.Shape.OBJECT)
While it did indeed solve the problem by serializing the fields instead of the name, it did it too well as it also began serializing the implementation-specific public fields not defined in the Thing interface, resulting not only in information leak, but also failed deserialization in Consumer due to the data containing unknown entries.
As further googling didn't yield any results, the only solution I can think of is marking all those fields as ignorable, something that is extremely undesirable due to the previously mentioned reasons.
Is there any way, simply by altering the interface itself and its annotations, to enforce that exactly those fields, no more, no less, should be serialized both when it comes to classes and enums?
I have this issue when I was working with Jackson. The deserialization fails because, during deserialization, Jackson is unable to find the polymorphic reference type.
You should be annotating your interface with #JsonTypeInfo.
Something like:
#JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "class")
There isn't much of code in your question and hence this answer.
Usually you should be able to force use of certain type with:
#JsonSerialize(as=Thing.class)
and similarly with #JsonDeserialize.
Does this not work with enums?
I have a custom INIFile class that I've written that read/write INI files containing fields under a header. I have several classes that I want to serialize using this class, but I'm kind of confused as to the best way to go about doing it. I've considered two possible approaches.
Method 1: Define an Interface like ObjectPersistent enforcing two methods like so:
public interface ObjectPersistent
{
public void save(INIFile ini);
public void load(INIFile ini);
}
Each class would then be responsible for using the INIFile class to output all properties out to the file.
Method 2: Expose all properties of the classes needing serialization via getters/setters so that saving can be handling in one centralized place like so:
public void savePlayer(Player p)
{
INIFile i = new INIFile(p.getName() + ".ini");
i.put("general", "name", p.getName());
i.put("stats", "str", p.getSTR());
// and so on
}
The best part of method 1 is that not all properties need to be exposed, so encapsulation is held firm. What's bad about method 1 is that saving isn't technically something that the player would "do". It also ties me down to flat files via the ini object passed into the method, so switching to a relational database later on would be a huge pain.
The best part of method 2 is that all I/O is centralized into one location, and the actual saving process is completely hidden from you. It could be saving to a flat file or database. What's bad about method 2 is that I have to completely expose the classes internal members so that the centralized serializer can get all the data from the class.
I want to keep this as simple as possible. I prefer to do this manually without use of a framework. I'm also definitely not interested in using the built in serialization provided in Java. Is there something I'm missing here? Any suggestions on what pattern would be best suited for this, I would be grateful. Thanks.
Since you don't want (for some reason) to use Java serialization, you can use XML serialization. The simplest way is via XStream:
XStream is a simple library to serialize objects to XML and back again.
If you are really sure you don't want to use any serialization framework, you can of course use reflection. Important points there are:
getClass().getDeclaredFields() returns all fields of the class - both public and private
field.setAccessible(true) - makes a private (or protected) field accessible via reflection
Modifier.isTransient(field.getModifiers()) tells you whether the field has been marked with the transient keyword - i.e. not eligible for serialization.
nested object structures may be represented by a dot notation - team.coach.name, for example.
All serialization libraries are using reflection (or introspection) to achieve their goals.
I would choose Method 1.
It might not be the most object oriented way, but in my experience it is simpler, less error-prone and easier to maintain than Method 2.
If you are conserned about providing multiple implementations for your own serialization, you can use interfaces for save and load methods.
public interface ObjectSerializer
{
public void writeInt(String key, int value);
...
}
public interface ObjectPersistent
{
public void save(ObjectSerializer serializer);
public void load(ObjectDeserializer deserializer);
}
You can improve these ObjectSerializer/Deserializer interfaces to have enough methods and parameters to cover both flat file and database cases.
This is a job for the Visitor pattern.