Inheritance in protocol buffers

How to handle inheritance in Google Protocol Buffers 3.0?
Java equivalent code:
public class Bar {
String name;
}
public class Foo extends Bar {
String id;
}
What would be Proto equivalent code?
message Bar {
string name = 1;
}
message Foo {
string id = 2;
}

Protocol Buffers does not support inheritance. Instead, consider using composition:
message Foo {
Bar bar = 1;
string id = 2;
}
However, that said, there is a trick you can use which is like inheritance -- but which is an ugly hack, so you should only use it with care. If you define your message types like:
message Bar {
string name = 1;
}
message Foo {
string name = 1;
string id = 2;
}
These two types are compatible, because Foo contains a superset of the fields of Bar. This means if you have an encoded message of one type, you can decode it as the other type. If you try to decode a Bar as type Foo, the field id will not be set (and will get its default value). If you decode a Foo as type Bar, the field id will be ignored. (Notice that these are the same rules that apply when adding new fields to a type over time.)
You can use this to implement something like inheritance, by having several types that each contain a copy of the fields of the "superclass". However, there are a couple of big problems with this approach:
To convert a message object of type Foo to type Bar, you have to serialize and re-parse; you can't just cast. This can be inefficient.
It's very hard to add new fields to the superclass, because you have to make sure to add the field to every subclass and have to make sure that this doesn't create any field number conflicts.
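The compatibility rule above follows from how fields are laid out on the wire: every field is written with its number, so a reader can pick out the numbers it knows and skip the rest. The following is a minimal hand-rolled sketch of that idea, not the protobuf library itself. The class and method names are hypothetical, and it assumes only string fields, field numbers below 16, and values shorter than 128 bytes, so that every varint fits in one byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the protobuf wire format, limited to short string fields.
class WireCompatSketch {

    // Writes one string field: tag byte (fieldNumber << 3 | wire type 2), length, UTF-8 bytes.
    // Assumption: fieldNumber < 16 and value < 128 bytes, so tag and length are single bytes.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        out.write((fieldNumber << 3) | 2);
        out.write(bytes.length);
        out.write(bytes, 0, bytes.length);
    }

    // Encodes a "Foo": name = 1, id = 2.
    static byte[] encodeFoo(String name, String id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, name);
        writeString(out, 2, id);
        return out.toByteArray();
    }

    // Encodes a "Bar": name = 1 only.
    static byte[] encodeBar(String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, name);
        return out.toByteArray();
    }

    // Reads every string field into a map keyed by field number.
    static Map<Integer, String> readFields(byte[] data) {
        Map<Integer, String> fields = new HashMap<>();
        int pos = 0;
        while (pos < data.length) {
            int fieldNumber = (data[pos++] & 0xFF) >>> 3;
            int length = data[pos++] & 0xFF;
            fields.put(fieldNumber, new String(data, pos, length, StandardCharsets.UTF_8));
            pos += length;
        }
        return fields;
    }

    // Decoding as Bar: only field 1 is known, anything else is ignored.
    static String decodeAsBarName(byte[] data) {
        return readFields(data).getOrDefault(1, "");
    }

    // Decoding as Foo: a missing field 2 falls back to the default (empty string).
    static String decodeAsFooId(byte[] data) {
        return readFields(data).getOrDefault(2, "");
    }

    public static void main(String[] args) {
        byte[] foo = encodeFoo("widget", "42");
        byte[] bar = encodeBar("widget");
        System.out.println(decodeAsBarName(foo));          // name survives, id is skipped
        System.out.println(decodeAsFooId(bar).isEmpty());  // id takes its default value
    }
}
```

Note that the conversion still goes through bytes in both directions, which is the inefficiency mentioned in the first problem above.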

See the Protocol Buffer Basics tutorial:
Don't go looking for facilities similar to class inheritance, though – protocol buffers don't do that.

Related

Object differ hasChanges where no changes should be detected

I'm using java-object-diff to get differences between two objects parsed from XML by JAXB. In the example below, I'm parsing the same string twice to check that no differences are reported; however, log.info("has changes: " + diff5.hasChanges()); logs true.
JAXBContext context1 = JAXBContext.newInstance(Item.class);
Unmarshaller m1 = context1.createUnmarshaller();
Item base = (Item) m1.unmarshal(new StringReader(s));
Item working = (Item) m1.unmarshal(new StringReader(s));
DiffNode diff5 = ObjectDifferBuilder
.buildDefault()
.compare(working, base);
log.info("has changes: " + diff5.hasChanges());
diff5.visit((node, visit) -> {
final Object baseValue = node.canonicalGet(base);
final Object workingValue = node.canonicalGet(working);
final String message = node.getPath() + " changed from " +
baseValue + " to " + workingValue;
System.out.println(message);
});
The message I get from System.out.println is always the same, saying it has changed from null to <the actual value> This happens for every property. E.g.
content changed from null to Mit dem Wasserinonisator
I have verified that both Items have exactly the same content and that neither of them is null.
Item is a pojo with many subclasses (all getters and setters are present), e.g.
public class Item {
@XmlElement(name = "ASIN", required = true)
protected String asin;
@XmlElement(name = "ParentASIN")
protected String parentASIN;
@XmlElement(name = "Errors")
protected Errors errors;
@XmlElement(name = "DetailPageURL")
protected String detailPageURL;
@XmlElement(name = "ItemLinks")
protected ItemLinks itemLinks;
@XmlElement(name = "SalesRank")
protected String salesRank;
@XmlElement(name = "SmallImage")
protected Image smallImage;
protected Image smallImage;
}
Is there any way to make java-object-diff work, to make it compare the values correctly?

After taking a closer look at your code, I know what's wrong. The first problem is that JAXB doesn't generate equals methods. For the most part that's not a problem, because the ObjectDiffer can establish the relationship between objects based on the hierarchy. Things get more complicated when ordered or unordered Collections are involved, because the ObjectDiffer needs some way to establish the relationship between the collection items in the base and working instance. By default it relies on the lookup mechanism of the underlying collection (which typically involves one or more of the methods hashCode, equals or compareTo).
In your case this relationship cannot be established, because none of your classes (but especially those contained in Lists and Sets) implement a proper equals method. This means that instances are only ever equal to themselves. This is further complicated by the fact that the responsible classes represent value objects and don't have any hard identifier that could be used to easily establish the relationship. Therefore the only option is to provide custom equals methods that simply compare all properties. The consequence is that the slightest change on those objects will cause the ObjectDiffer to mark the base version as REMOVED and the working version as ADDED. But it will also not mark them as CHANGED when they haven't actually changed. So that's something.
I'm not sure how easy it is to make JAXB generate custom equals methods, so here are some alternative solutions possible with java-object-diff:
Implement your own de.danielbechler.diff.identity.IdentityStrategy for the problematic types and provide them to the ObjectDifferBuilder, like so (example uses Java 8 Lambdas):
ObjectDifferBuilder
.startBuilding()
.identity()
.ofCollectionItems(ItemLinks.class, "itemLink").via((working, base) -> {
ItemLink workingItemLink = (ItemLink) working;
ItemLink baseItemLink = (ItemLink) base;
return StringUtils.equals(workingItemLink.getDescription(), baseItemLink.getDescription())
&& StringUtils.equals(workingItemLink.getURL(), baseItemLink.getURL());
})
// ...
.and().build();
Ignore problematic properties during comparison. Obviously this may not be what you want, but it's an easy solution in case you don't really care about the specific object.
ObjectDifferBuilder
.startBuilding()
.inclusion()
.exclude().type(Item.ImageSets.class)
.and().build();
A solution that causes JAXB to generate custom equals methods would be my preferred way to go. I found another post that claims it's possible, so maybe you want to give this a try first, so you don't have to customize your ObjectDiffer.
I hope this helps!
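To illustrate why a value-based equals matters for collection items, here is a minimal sketch using only the JDK. ItemLinkSketch is a hypothetical hand-written stand-in for the JAXB class (the real generated class may look different); it only assumes the description and URL properties used in the identity strategy above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical hand-written value class; a JAXB-generated ItemLink may differ.
class ItemLinkSketch {
    private final String description;
    private final String url;

    ItemLinkSketch(String description, String url) {
        this.description = description;
        this.url = url;
    }

    // Value-based equality: two instances with the same properties are the same item.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof ItemLinkSketch)) {
            return false;
        }
        ItemLinkSketch that = (ItemLinkSketch) other;
        return Objects.equals(description, that.description)
                && Objects.equals(url, that.url);
    }

    // Must agree with equals so hash-based collections can find the item.
    @Override
    public int hashCode() {
        return Objects.hash(description, url);
    }

    // Mimics the kind of lookup a differ relies on: is the working item present in the base list?
    static boolean foundInList(List<ItemLinkSketch> base, ItemLinkSketch working) {
        return base.contains(working);
    }

    public static void main(String[] args) {
        List<ItemLinkSketch> base = Arrays.asList(new ItemLinkSketch("Reviews", "http://example.com/r"));
        // A separately constructed but identical item is found only because equals is overridden.
        System.out.println(foundInList(base, new ItemLinkSketch("Reviews", "http://example.com/r")));
    }
}
```

Without the equals override, the lookup would fall back to reference identity, and two separately unmarshalled items would never match.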

Working with Protocol Buffers and internal data models

I have an existing internal data model for a Picture, as follows:
package test.model;
public class Picture {
private int height, width;
private Format format;
public enum Format {
JPEG, BMP, GIF
}
// Constructor, getters and setters, hashCode, equals, toString etc.
}
I now want to serialize it using protocol buffers. I've written a Picture.proto file that mirrors the fields of the Picture class and compiled the code under the test.model.protobuf package with a classname of PictureProtoBuf:
package test.model.protobuf;
option java_package = "test.model.protobuf";
option java_outer_classname = "PictureProtoBuf";
message Picture {
enum Format {
JPEG = 1;
BMP = 2;
GIF = 3;
}
required uint32 width = 1;
required uint32 height = 2;
required Format format = 3;
}
I am now assuming that if I have a Picture that I want to serialize and send somewhere, I have to create a PictureProtoBuf object and map all the fields across, like so:
Picture p = new Picture(100, 200, Picture.Format.JPEG);
PictureProtoBuf.Picture.Builder output = PictureProtoBuf.Picture.newBuilder();
output.setHeight(p.getHeight());
output.setWidth(p.getWidth());
I'm coming unstuck when I have an enumeration in my data model. The ugly way that I'm using right now is:
output.setFormat(PictureProtoBuf.Picture.Format.valueOf(p.getFormat().name()));
However, this is prone to breakage and relies on the enumeration name being consistent between my internal data model and the protocol buffer data model (which isn't a great assumption as enumeration names within .proto files need to be unique). I can see me having to hand-craft switch statements on enumerations if the .name() call from the internal model doesn't match the protobuf-generated enumeration name.
I guess my question is whether I'm going about this the right way? Am I supposed to scrap my internal data model (test.model.Picture) in favour of the protobuf-generated one (test.model.protobuf.PictureProtoBuf)? If so, how can I implement some of the niceties that I have done in my internal data model (e.g. hashCode(), equals(Object), toString(), etc.)?

Although the existing answers are good, I decided to go a bit further with Marc Gravell's suggestion to look into protostuff.
You can use the protostuff runtime module along with the dynamic ObjectSchema to create schemas at runtime for your internal data model.
My code now reduces to:
// Do this once
private static Schema<Picture> schema = RuntimeSchema.getSchema(Picture.class);
private static final LinkedBuffer buffer = LinkedBuffer.allocate(DEFAULT_BUFFER_SIZE);
// For each Picture you want to serialize...
Picture p = new Picture(100, 200, Picture.Format.JPEG);
byte[] result = ProtobufIOUtil.toByteArray(p, schema, buffer);
buffer.clear();
return result;
This is a great improvement over the Google protobuf library (see my question) when you have lots and lots of attributes in your internal data model. There is also no speed penalty that I can detect (with my use cases, anyway!)

If you have control over your internal data model, you could modify test.model.Picture so that the enum values know their corresponding protobuf equivalent, probably passing in the correspondence to your enum constructors.
For example, using Guava's BiMap (bidirectional map with unique values), we get something like
enum ProtoEnum { // we don't control this
ENUM1, ENUM2, ENUM3;
}
enum MyEnum {
ONE(ProtoEnum.ENUM1), TWO(ProtoEnum.ENUM2), THREE(ProtoEnum.ENUM3);
// Keyed by ProtoEnum so that CORRESPONDENCE.get(protoEnum) returns the MyEnum.
static final ImmutableBiMap<ProtoEnum, MyEnum> CORRESPONDENCE;
static {
ImmutableBiMap.Builder<ProtoEnum, MyEnum> builder = ImmutableBiMap.builder();
for (MyEnum x : MyEnum.values()) {
builder.put(x.corresponding, x);
}
CORRESPONDENCE = builder.build();
}
private final ProtoEnum corresponding;
private MyEnum(ProtoEnum corresponding) {
this.corresponding = corresponding;
}
public ProtoEnum getCorresponding() {
return corresponding;
}
}
and then if we want to look up the MyEnum corresponding to a ProtoEnum, we just do MyEnum.CORRESPONDENCE.get(protoEnum), and to go the other way, we just do MyEnum.CORRESPONDENCE.inverse().get(myEnum) or myEnum.getCorresponding().
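If Guava is not available, the same two-way correspondence can be sketched with a plain EnumMap. Everything below is illustrative: WireFormat is a hypothetical stand-in for the protobuf-generated enum (real generated code looks different), and its constant names deliberately differ from the internal ones to show that the mapping no longer depends on matching names.

```java
import java.util.EnumMap;
import java.util.Map;

class EnumMappingSketch {

    // Hypothetical stand-in for the protobuf-generated enum.
    enum WireFormat { FORMAT_JPEG, FORMAT_BMP, FORMAT_GIF }

    // Internal enum; each constant names its wire counterpart explicitly.
    enum Format {
        JPEG(WireFormat.FORMAT_JPEG),
        BMP(WireFormat.FORMAT_BMP),
        GIF(WireFormat.FORMAT_GIF);

        private final WireFormat wire;

        Format(WireFormat wire) {
            this.wire = wire;
        }

        // Internal -> wire direction is a simple field read.
        WireFormat toWire() {
            return wire;
        }

        // Wire -> internal direction uses a lookup table built once.
        private static final Map<WireFormat, Format> FROM_WIRE = new EnumMap<>(WireFormat.class);

        static {
            for (Format f : values()) {
                // Fail fast if two internal constants claim the same wire value.
                if (FROM_WIRE.put(f.wire, f) != null) {
                    throw new IllegalStateException("duplicate mapping for " + f.wire);
                }
            }
        }

        static Format fromWire(WireFormat wire) {
            Format f = FROM_WIRE.get(wire);
            if (f == null) {
                throw new IllegalArgumentException("unmapped wire value: " + wire);
            }
            return f;
        }
    }

    public static void main(String[] args) {
        System.out.println(Format.JPEG.toWire());
        System.out.println(Format.fromWire(WireFormat.FORMAT_GIF));
    }
}
```

The fail-fast checks trade a little verbosity for an early error when a new wire value is added without a matching internal constant.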

One way is to only keep the generated enum:
package test.model;
public class Picture {
private int height, width;
private PictureProtoBuf.Picture.Format format;
// Constructor, getters and setters, hashCode, equals, toString etc.
}
I've used this a few times; it may or may not make sense in your case. Using the protobuf-generated classes as your data model (or extending them to add functionality) is never recommended, though.

How do I parse delimited rows of text with differing field counts in to objects, while allowing for extension?

An example is as follows:
SEG1|asdasd|20111212|asdsad
SEG2|asdasd|asdasd
SEG3|sdfsdf|sdfsdf|sdfsdf|sdfsfsdf
SEG4|sdfsfs|
Basically, each SEG* line needs to be parsed into a corresponding object, defining what each of those fields are. Some, such as the third field in SEG1 will be parsed as a Date.
Each object will generally stay the same but there may be instances in which an additional field may be added, like so:
SEG1|asdasd|20111212|asdsad|12334455
At the moment, I'm thinking of using the following type of algorithm:
List<String> segments = Arrays.asList(string.split("\r")); // Will always be a CR.
List<String> fields;
String fieldName;
for (String segment : segments) {
fields = Arrays.asList(segment.split("\\|"));
fieldName = fields.get(0);
SEG1 seg1;
if (fieldName.compareTo("SEG1") == 0) {
seg1 = new SEG1();
seg1.setField1(fields.get(1));
seg1.setField2(fields.get(2));
seg1.setField3(fields.get(3));
} else if (fieldName.compareTo("SEG2") == 0) {
...
} else if (fieldName.compareTo("SEG3") == 0) {
...
} else {
// Erroneous/failure case.
}
}
Some fields may be optional as well, depending on the object being populated. My concern is that if I add a new field to a class, any checks that use the expected field count will also need to be updated. How could I go about parsing the rows, while allowing for new or modified field types in the class objects to populate?

If you can define a common interface for all the classes to be parsed, I would suggest the following:
interface Segment {}
class SEG1 implements Segment
{
void setField1(final String field){};
void setField2(final String field){};
void setField3(final String field){};
}
enum Parser {
SEGMENT1("SEG1") {
@Override
protected Segment parse(final String[] fields)
{
final SEG1 segment = new SEG1();
// fields excludes the segment name, so index 0 is the first data field
segment.setField1(fields[0]);
segment.setField2(fields[1]);
segment.setField3(fields[2]);
return segment;
}
},
...
;
private final String name;
private Parser(final String name)
{
this.name = name;
}
protected abstract Segment parse(String[] fields);
public static Segment parse(final String segment)
{
final int firstSeparator = segment.indexOf('|');
final String name = segment.substring(0, firstSeparator);
final String[] fields = segment.substring(firstSeparator + 1).split("\\|");
for (final Parser parser : values())
if (parser.name.equals(name))
return parser.parse(fields);
return null;
}
}
For each type of segment, add an element to the enum and handle the different kinds of fields in the parse(String[]) method.

You can use collections, e.g. ArrayList
You can use var-args
If you want to make it extensible, you may want to process each segment in a loop, instead of handling each occurrence.

I would add a header row to your file format with the names of the fields being stored in the file so it looks something more like this:
(1) field1|field2|field3|field4|field5
(2) SEG1|asdasd|20111212|asdsad|
(3) SEG2|asdasd||asdasd|
(4) SEG3|sdfsdf|sdfsdf|sdfsdf|sdfsfsdf
(5) SEG4|sdfsfs|||
This is common for CSV files. I've also added more delimiters so that each line has five 'values'. This way a null value can be specified by just entering two delimiters in a row (see the third row above for an example where a null value is not the last value).
Now your parsing code knows what fields need to be set and you can call the setters using reflection in a loop. Pseudo code:
get the field names from the first line in the file
for (every line in the file except the first one) {
for (every value in the line) {
if (the value is not empty) {
use reflection to get the setter for the field and invoke it with the
value
}
}
}
This allows you to extend the file with additional fields without having to change the code. It also means you can have meaningful field names. The reflection may get a bit complicated with different types (e.g. int, String, boolean), so I would have to say that if you can, follow @sethu's advice and use a ready-built, proven library that does this for you.
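The pseudo code above could be sketched in Java roughly as follows. This is an illustration under stated assumptions rather than a finished parser: the Seg1 class and the populate method are hypothetical, every setter is assumed to take a String, and the header is assumed to name only the values that follow the segment identifier.

```java
import java.lang.reflect.Method;

class HeaderDrivenParserSketch {

    // Hypothetical target class; all setters take a String to keep the sketch short.
    public static class Seg1 {
        private String field1;
        private String field2;

        public void setField1(String value) { this.field1 = value; }
        public void setField2(String value) { this.field2 = value; }
        public String getField1() { return field1; }
        public String getField2() { return field2; }
    }

    // Populates 'target' from one data line, using the header line to choose the setters.
    // Assumption: the first token of dataLine is the segment name and is skipped.
    static void populate(Object target, String headerLine, String dataLine) {
        String[] names = headerLine.split("\\|");
        String[] values = dataLine.split("\\|", -1); // -1 keeps trailing empty values
        for (int i = 0; i < names.length && i + 1 < values.length; i++) {
            String value = values[i + 1];
            if (value.isEmpty()) {
                continue; // empty means "not set"
            }
            String setterName = "set"
                    + Character.toUpperCase(names[i].charAt(0))
                    + names[i].substring(1);
            try {
                Method setter = target.getClass().getMethod(setterName, String.class);
                setter.invoke(target, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("no usable setter for " + names[i], e);
            }
        }
    }

    public static void main(String[] args) {
        Seg1 seg = new Seg1();
        populate(seg, "field1|field2", "SEG1|asdasd|");
        System.out.println(seg.getField1());
        System.out.println(seg.getField2());
    }
}
```

Adding a column then only requires a new header name and a matching setter; typed fields such as dates would need a conversion step before the setter is invoked.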

Is it necessary to use the same string with | as a delimiter? If the same classes are used to create the String, then it's an ideal case for XStream. XStream will convert your Java object into XML and back, and it will take care of the scenario where some fields are optional. You will not have to write any code that parses your text. Here's a link:
http://x-stream.github.io/

How best to specify a Protobuf for use with Netty (preferably using the built-in protobuf support)

I'm specifying a protocol in protocol buffers. The transport layer is harnessing Netty's Protocol Buffers support - the significance being that Netty's ProtobufDecoder accepts one, and only one, type of MessageLite.
Now, I want to send a variety of different message types down this channel, each subtype having structured information associated with it. Protocol-buffers doesn't have an inheritance mechanism, so I'm using a kind of composition. I'm not sure if I am going about it the correct way.
My approach has been to categorise my different events with an enum, and encapsulate their differences using optional members. See my .proto below, I've simplified it for the sake of clarity.
My issue here is that the receiving code needs to make the association between EventType.ERROR and ErrorEventDetail. This just feels a little clumsy.
Simplified Events.proto:
package events;
option java_package = "com.example";
option java_outer_classname = "EventProtocol";
message Event {
enum EventType {
START = 0;
DELEGATE = 1;
ERROR = 2;
STOP = 3;
}
required events.Event.EventType event_type = 1 [default = START];
required int32 id = 2;
required int64 when = 3;
optional StartEventDetail start_event_detail = 4;
optional DelegateEventDetail delegate_event_detail = 5;
optional ErrorEventDetail error_event_detail = 6;
optional StopEventDetail stop_event_detail = 7;
}
message StartEventDetail {
required string object_name = 1;
}
message DelegateEventDetail {
required int32 object_id = 2;
required string task = 3;
}
message ErrorEventDetail {
required string text = 1;
required int32 error_code = 2;
optional Event cause = 3;
}
message StopEventDetail {
required int32 object_id = 2;
}
Is this optimal?
Would I be better off using extends somehow, or perhaps some other use of enum?
Or even, should I be creating a whole new OneToOneDecoder which can identify a message type by some kind of header? I could do this, but I'd rather not...
Thanks

It seems you are pretty close to, or already using, one of Google's protobuf techniques, which is called Union Types.
The gist is you have a dedicated type field, that you would "switch" on to know which message to get:
message OneMessage {
enum Type { FOO = 1; BAR = 2; BAZ = 3; }
// Identifies which field is filled in.
required Type type = 1;
// One of the following will be filled in.
optional Foo foo = 2;
optional Bar bar = 3;
optional Baz baz = 4;
}
where Foo, Bar and Baz are/could be defined in other files as separate messages. And you can switch on the type to get the actual payload (it's Scala, but you can do the same thing with Java's switch):
OneMessage.getType match {
case OneMessage.Type.FOO =>
val foo = OneMessage.getFoo
// do the processing
true
case OneMessage.Type.BAR =>
val bar = OneMessage.getBar
// do the processing
true
case OneMessage.Type.BAZ =>
val baz = OneMessage.getBaz
// do the processing
true
}
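For comparison, the same dispatch in Java might look like the sketch below. The nested classes are hand-written, hypothetical stand-ins for what protoc would generate (real generated classes use builders and has/get accessors), reduced to two payload types so that the switch is the focus.

```java
class UnionDispatchSketch {

    // Hypothetical stand-ins for generated types.
    enum Type { FOO, BAR }

    static final class Foo {
        final String text;
        Foo(String text) { this.text = text; }
    }

    static final class Bar {
        final int count;
        Bar(int count) { this.count = count; }
    }

    // Union wrapper: 'type' identifies which payload field is filled in.
    static final class OneMessage {
        private final Type type;
        private final Foo foo;
        private final Bar bar;

        private OneMessage(Type type, Foo foo, Bar bar) {
            this.type = type;
            this.foo = foo;
            this.bar = bar;
        }

        static OneMessage of(Foo foo) { return new OneMessage(Type.FOO, foo, null); }
        static OneMessage of(Bar bar) { return new OneMessage(Type.BAR, null, bar); }

        Type getType() { return type; }
        Foo getFoo() { return foo; }
        Bar getBar() { return bar; }
    }

    // Switch on the type field, then read only the matching payload.
    static String handle(OneMessage message) {
        switch (message.getType()) {
            case FOO:
                return "foo: " + message.getFoo().text;
            case BAR:
                return "bar: " + message.getBar().count;
            default:
                throw new IllegalArgumentException("unhandled type " + message.getType());
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(OneMessage.of(new Foo("hello"))));
        System.out.println(handle(OneMessage.of(new Bar(3))));
    }
}
```

Keeping the default branch means that a type added to the enum later fails loudly instead of being silently dropped.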

I originally solved the same problem using the extension mechanism, which I document here
But I found the code in Java required to deal with extensions was horribly ugly and verbose, so I switched to the Union method as described. The code is much cleaner as the generated Java code provides a way to get and build each message in one go.
I use two mechanisms for deciding which optional message to extract. When performance matters, I use the switch method described in another answer. When performance is not an issue and I don't want to maintain a switch statement, I use a reflection method and just create a handle(Message) method for each message. An example of the reflection method is given below; in my case the Java wrapper is a class called Commands, and it is decoded by Netty for me. It first tries to find a handler that takes the specific message as a parameter; if that fails, it calls a method named after the camel-case form of the command. For this to work, each enum constant must be the uppercase underscore form of the corresponding message field name.
// Helper that stops me having to create a switch statement for every command
// Relies on the Cmd enum naming being uppercase version of the sub message field names
// Will call the appropriate handle(Message) method by reflection
// If it is a command with no arguments, therefore no sub message it
// constructs the method name from the camelcase of the command enum
private MessageLite invokeHandler(Commands.Command cmd) throws Exception {
Commands.Command.Cmd com= cmd.getCmd();
//String name= CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.LOWER_UNDERSCORE, com.name());
String name= com.name().toLowerCase();
jlog.debug("invokeHandler() - Looking up {} from {}", name, com.name());
FieldDescriptor field= Commands.Command.getDescriptor().findFieldByName(name);
if(field != null) {
// if we have a matching field then extract it and call the handle method with that as a parameter
Object c = cmd.getField(field);
jlog.debug("invokeHandler() - {}\n{}", c.getClass().getCanonicalName(), c);
Method m = getClass().getDeclaredMethod("handle", String.class, c.getClass());
return (MessageLite) m.invoke(this, cmd.getUser(), c);
}
// else we call a method with the camelcase name of the Cmd, this is for commands that take no arguments other than the user
String methodName= "handle"+CaseFormat.UPPER_UNDERSCORE.to(CaseFormat.UPPER_CAMEL, com.name());
jlog.debug("invokeHandler() - using method: {}", methodName);
Method m = getClass().getDeclaredMethod(methodName, String.class);
return (MessageLite) m.invoke(this, cmd.getUser());
}

Another approach is to use the extension mechanism that protobuf supports. I use this approach in situations where the union type is too large.

Java annotations

I've created simple annotation in Java
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
public @interface Column {
String columnName();
}
and class
public class Table {
@Column(columnName = "id")
private int colId;
@Column(columnName = "name")
private String colName;
private int noAnnotationHere;
public Table(int colId, String colName, int noAnnotationHere) {
this.colId = colId;
this.colName = colName;
this.noAnnotationHere = noAnnotationHere;
}
}
I need to iterate over all fields, that are annotated with Column and get name and value of field and annotation. But I've got problem with getting value of each field, since all of them are of different data type.
Is there anything that would return collection of fields that have certain annotation?
I managed to do it with this code, but I don't think that reflection is good way to solve it.
Table table = new Table(1, "test", 2);
for (Field field : table.getClass().getDeclaredFields()) {
Column col;
// check if field has annotation
if ((col = field.getAnnotation(Column.class)) != null) {
String log = "colname: " + col.columnName() + "\n";
log += "field name: " + field.getName() + "\n\n";
// here i don't know how to get value of field, since all get methods
// are type specific
System.out.println(log);
}
}
Do I have to wrap every field in an object that implements a method like getValue(), or is there some better way around this? Basically, all I need is a string representation of each annotated field.
edit: yep field.get(table) works, but only for public fields, is there any way how to do this even for private fields? Or do I have to make getter and somehow invoke it?

Every object has toString() defined. (And you can override this for each class to get a more meaningful representation.)
So where your "// here i don't know" comment is, you could have:
Object value = field.get(table);
// gets the value of this field for the instance 'table'
log += "value: " + value + "\n";
// implicitly uses toString for you
// or will put 'null' if the object is null

Reflection is exactly the way to solve it. Finding out things about types and their members at execution time is pretty much the definition of reflection! The way you've done it looks fine to me.
To find the value of the field, use field.get(table)

Reflection is exactly the way to look at annotations. They are a form of "metadata" attached to the class or method, and Java annotations were designed to be examined that way.

Reflection is one way to process the object (probably the only way if the fields are private and don't have any kind of accessor method). You'll need to look at Field.setAccessible and perhaps Field.getType.
Another approach is to generate another class for enumerating the annotated fields using a compile-time annotation processor. This requires a com.sun API in Java 5, but support is better in the Java 6 JDK (IDEs like Eclipse may require special project configuration).
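A minimal sketch of the Field.setAccessible route for the private fields from the question follows. The wrapper class and the readColumns helper are hypothetical names added for illustration; Column and Table are copied from the question and nested only to keep the example in one file.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

class AnnotatedFieldReaderSketch {

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Column {
        String columnName();
    }

    static class Table {
        @Column(columnName = "id")
        private int colId;
        @Column(columnName = "name")
        private String colName;
        private int noAnnotationHere;

        Table(int colId, String colName, int noAnnotationHere) {
            this.colId = colId;
            this.colName = colName;
            this.noAnnotationHere = noAnnotationHere;
        }
    }

    // Returns column name -> string value for every @Column field, private ones included.
    static Map<String, String> readColumns(Object target) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            Column col = field.getAnnotation(Column.class);
            if (col == null) {
                continue; // not annotated, skip
            }
            field.setAccessible(true); // lifts the private-access check for this Field object
            try {
                // field.get boxes primitives, so String.valueOf works for any field type
                result.put(col.columnName(), String.valueOf(field.get(target)));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + field.getName(), e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(readColumns(new Table(1, "test", 2)));
    }
}
```

Because field.get returns an Object, no type-specific getter is needed; a security manager or module boundary can still refuse setAccessible, which this sketch does not handle.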
