I'm trying to deserialize a JSON object coming from an application I can't control. Here is my JSON:
{"assembly":
{"name":"mm9",
"id":32,
"chromosomes":[
{"chromosome":
{"name":"MT"}
}]}}
My POJOs are:
class Assembly{
private String name;
private int id;
private ArrayList<Chromosome> chromosomes;
// getters & setters
}
class Chromosome {
private String name;
//getter/setters
}
But it doesn't work because of the extra wrapper fields "assembly" and "chromosome". With JSON like:
{"name":"mm9",
"id":32,
"chromosomes":[
{"name":"MT"}
]}
it simply works.
Is there a way to modify the configuration, or something similar, to achieve this without creating more complex POJOs?
The problem is that in the first JSON snippet, each element of chromosomes is a dictionary (Map), one of whose entries (chromosome) happens to correspond to your Chromosome object.
A more accurate direct mapping to a Java class would be
class Assembly{
...
private List<Map<String, Chromosome>> chromosomes;
}
Since you mention you can't control the format of the source JSON, you may want to look into using custom deserializers, or perhaps using the streaming support from Jackson rather than ObjectMapper for direct mapping, if you aren't happy changing your POJOs in this way.
By the way, it is best to refer to collections by their interface type (List) rather than a concrete type (ArrayList). It is very unlikely that code that refers to this class truly cares or needs to know that it is using an ArrayList. As a general principle, referring to just the List interface makes it a lot easier to swap in other implementations if needed.
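As a minimal sketch of that last point, assuming simplified versions of the Assembly and Chromosome classes from the question (ListInterfaceDemo and its method are made up for illustration), declaring the field as List lets callers hand in any implementation:

```java
import java.util.ArrayList;
import java.util.List;

class Chromosome {
    private String name;

    String getName() {
        return name;
    }

    void setName(String name) {
        this.name = name;
    }
}

class Assembly {
    // Declared against the interface: callers never depend on ArrayList.
    private List<Chromosome> chromosomes = new ArrayList<>();

    List<Chromosome> getChromosomes() {
        return chromosomes;
    }

    void setChromosomes(List<Chromosome> chromosomes) {
        this.chromosomes = chromosomes;
    }
}

// Hypothetical caller that works with whatever List backs the assembly.
class ListInterfaceDemo {
    static String firstName(Assembly assembly) {
        return assembly.getChromosomes().get(0).getName();
    }
}
```

The setter accepts an ArrayList, a LinkedList, or an immutable list alike, and firstName does not need to change.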
Related
I have a Java class that models data meant for writing to both BigQuery and Elasticsearch. It looks something like this:
@DefaultSchema(JavaBeanSchema.class)
// also lombok annotations for getters, setters, builder, constructors, etc.
public class DataClass implements Serializable {
String field1;
List<String> field2;
List<List<String>> field3; // this one gives the compiler error below
}
We try to always use JavaBeanSchema.class for its nice compatibility with org.apache.beam.sdk.values.Row, com.google.api.services.bigquery.model.TableRow, and the org.apache.beam.sdk.io.gcp.bigquery.ToTableRow format function with org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO. Combining these things means we can use BigQueryIO.Write without any explicit schema or coders, and with minimal boilerplate code.
Compiling the above class into a Beam template results in an IllegalArgumentException "Array of collection is not supported in BigQuery." OK - fine, but we still need this class to model the data for Elasticsearch where this schema works fine in JSON.
I am looking for the simplest way to do some version of what I need with minimal code. The solution I currently have is a separate class, DataClassForBigQuery, that is essentially a copy of this one except that field3 is just a List<String>, along with a PTransform<DataClass, DataClassForBigQuery> that serializes field3 as a JSON string when the objects are created. This is a relatively small amount of code and fairly isolated, but:
I don't love having two classes that model the same data; it just means more tests and maintenance.
BQ has a native JSON field type, with some support for querying the values. Since technically the BQ field type of field3 is STRING, there's no good way to query/access that data without converting it to something else first. It would be nice if there was a way to annotate "this is a java String type, but interpret it as JSON".
I tried modeling the POJO with more complex types, refactoring the nested Lists into a List of Objects where each Object contains a List of Strings. I think this works for BigQuery, but Beam wasn't able to build the template. I believe the error was a stack overflow while trying to build the schema.
If I could use the POJO class as-is, somehow modify the schema on the fly to change the type of field3, and provide a function to format it while still leveraging ToTableRow for the rest (the real class is large, 20+ fields), I'd try that. But I don't really want to extend org.apache.beam code to do this; that would end up being more work.
Any ideas?
One approach is to use a library such as Jackson or Gson to serialize the List<List<String>> field3 as a JSON string before writing it to BigQuery. You can create a custom serializer that handles the nested List and writes it as a JSON string. In Elasticsearch, field3 can remain as it is, since JSON already supports that shape.
To avoid duplicating the class, you can use inheritance or composition: create a subclass of the original class (or a wrapper around an instance of it) for BigQuery and override the serialization behavior for field3.
Another alternative is to flatten the nested List<List<String>> field3 into a single List before writing it to BigQuery and then reverse the process when reading from BigQuery. Reversing only works if you also record where each inner list ends, for example by storing the inner list lengths. This way you can use the same class for both BigQuery and Elasticsearch.
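The first suggestion can be sketched without any library. The helper below is hypothetical (NestedListEncoder is not part of Beam, BigQuery, Jackson or Gson) and assumes the inner lists contain no null values; in a real pipeline a JSON library would normally do the encoding. It turns each inner list into one JSON array string, so field3 becomes a plain List<String>:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: encodes each inner list as one JSON array string.
final class NestedListEncoder {
    private NestedListEncoder() {}

    static List<String> encodeRows(List<List<String>> nested) {
        List<String> rows = new ArrayList<>();
        for (List<String> inner : nested) {
            rows.add(toJsonArray(inner));
        }
        return rows;
    }

    // Assumes no null elements; each value is written as a JSON string.
    static String toJsonArray(List<String> values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append('"').append(escape(values.get(i))).append('"');
        }
        return sb.append(']').toString();
    }

    // Escapes quotes, backslashes and control characters for a JSON string.
    private static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }
}
```

Because each inner list keeps its own string, the list boundaries survive the round trip, which a plain flatten would lose.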
Consider avoiding a list of lists by introducing an intermediate type to hold the inner list:
@DefaultSchema(JavaBeanSchema.class)
public class MyIntermediateType implements Serializable {
List<String> innerField;
}
@DefaultSchema(JavaBeanSchema.class)
// also lombok annotations for getters, setters, builder, constructors, etc.
public class DataClass implements Serializable {
String field1;
List<String> field2;
List<MyIntermediateType> field3;
}
BigQuery will be able to represent this without further processing (field3 becomes a repeated STRUCT that has one field that is a repeated STRING). I hope that Elasticsearch would also be able to interpret this structure, but I don't know details of Elasticsearch ingestion.
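If existing code still produces a List<List<String>>, converting to and from the wrapped shape is mechanical. The sketch below uses a plain stand-in class, InnerRow, for the intermediate type and leaves out the Beam and Lombok annotations so that it compiles with the JDK alone; NestedListWrapper and its method names are made up for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Plain stand-in for the intermediate type; annotations omitted on purpose.
class InnerRow {
    private final List<String> innerField;

    InnerRow(List<String> innerField) {
        this.innerField = new ArrayList<>(innerField);
    }

    List<String> getInnerField() {
        return innerField;
    }
}

// Hypothetical helper for moving between the nested and wrapped shapes.
final class NestedListWrapper {
    private NestedListWrapper() {}

    static List<InnerRow> wrap(List<List<String>> nested) {
        List<InnerRow> rows = new ArrayList<>();
        for (List<String> inner : nested) {
            rows.add(new InnerRow(inner));
        }
        return rows;
    }

    static List<List<String>> unwrap(List<InnerRow> rows) {
        List<List<String>> nested = new ArrayList<>();
        for (InnerRow row : rows) {
            nested.add(new ArrayList<>(row.getInnerField()));
        }
        return nested;
    }
}
```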
Can we declare an attribute of type JSON object in a Java model class?
For example:
public class Sample {
private JSONobject data;
//getters and setters
}
Can we declare an attribute this way?
If so, do we need to add anything extra? I got an exception at runtime while populating the field.
I have yet to see anyone use JSONObject/JsonObject in a model.
Please refer to this for the difference between POJO (java model object) and DTO (objects like JSONObject) and this for when should you use JSONObject.
So the answer to your question is probably not.
This answer to serialization/deserialization should help you understand POJO and JSON better.
So far, I've only seen primitive values and lists in a Java model.
JSONObjects need to be deserialized to POJOs (and vice versa). (Please refer to this post on why to use POJOs over JSONObject.)
Your code won't compile as written: JSONobject is not a type the compiler knows, since it is neither a built-in Java type nor part of the Java collections. The class would have to come from a JSON library on the classpath, be spelled the way that library spells it (for example JSONObject), and be imported.
However, you could argue that you can have
public class Sample<JSONobject> {
private JSONobject data;
//getters and setters
}
But this would be an unrelated object inside another object, which has no logic in implementation. For best practice, it's better to follow the common usages as many code bases would be serializing/deserializing JSON.
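It may help to see why that generic version compiles at all: in class Sample<JSONobject>, JSONobject is only the name of a type parameter, not a reference to any JSON library class. A minimal sketch with the parameter renamed to T (the class name GenericSample is made up for illustration) behaves the same way:

```java
// T plays the role that the name JSONobject played in the snippet above:
// it is a placeholder that the caller fills in with a concrete type.
class GenericSample<T> {
    private T data;

    T getData() {
        return data;
    }

    void setData(T data) {
        this.data = data;
    }
}
```

A caller could write GenericSample<String> or GenericSample<Integer>; nothing here ties the field to JSON.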
We use JSON serialization with Jackson to expose the internal state of the system for debugging purposes.
By default, Jackson does not serialize transient fields, but I want to serialize them as well.
How can I serialize these fields?
One way I know of is to supply getters for these fields, but I don't want to do that, as I have some getX methods that I don't want to be invoked (for instance, there are some getters that change the object's state).
I know I could create an annotation, but I really want to avoid it.
So my question is:
Is there a way to set up Jackson to serialize all of an object's fields, including transient ones?
My solution with Jackson 2.4.3:
private static final ObjectMapper mapper =
new ObjectMapper(){{
Hibernate4Module module = new Hibernate4Module();
module.disable(Hibernate4Module.Feature.USE_TRANSIENT_ANNOTATION);
registerModule(module);
}};
I don't think Jackson supports any type of configuration to enable it to serialize a transient field. There's an open issue to add that feature, but it's old and hasn't been addressed (as far as I can tell): http://jira.codehaus.org/browse/JACKSON-623
So my question is: Is there a way to setup jackson to serialize all
the objects fields? include transient ones.
So to answer your question, no.
Some other Java JSON tools, such as GSON do support a configuration option to serialize transient fields. If you can use another tool, you might look into that (for GSON, see: https://sites.google.com/site/gson/gson-user-guide).
To expand a little, you might try a different approach.
First, you shouldn't try to serialize a transient field. After all, the definition of transient is "don't serialize this." Nevertheless, I can think of a few specific situations where it might be necessary, or at least convenient (such as when working with code you can't modify). Still, in 99% of cases, the answer is don't do that. Change the field so that it's not transient if you need to serialize it. If you have multiple contexts where you use the same field, and you want it serialized in one (JSON, for example) and not serialized in another (java.io, for example), then you should create a custom serializer for the case where you don't want it, rather than abuse the keyword.
Second, as to using a getter and having "some getters that change the objects state," you should try to avoid that too. That can lead to various unintended consequences. And, technically, that's not a getter, that's a setter. What I mean is, if it mutates state, you've got a mutator (setter) rather than accessor (getter), even if you name it following the "get" convention and return some stuff.
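To illustrate the first point, here is a minimal sketch of what the transient keyword means to java.io serialization. The Session class and its fields are made up for illustration; after a round trip through ObjectOutputStream and ObjectInputStream, the transient field comes back as null while the ordinary field survives:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class TransientDemo {
    private TransientDemo() {}

    // Hypothetical class with one ordinary and one transient field.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        String user;
        transient String token; // skipped by java.io serialization

        Session(String user, String token) {
            this.user = user;
            this.token = token;
        }
    }

    // Serializes the object to bytes in memory and reads it back.
    static Session roundTrip(Session original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("round trip failed", e);
        }
    }
}
```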
You can create a custom getter for that transient field and use the @XmlElement annotation. The name of that getter doesn't matter.
For example:
public class Person {
@XmlTransient private String lastname;
@XmlElement(name="lastname")
public String getAnyNameOfMethod(){
return lastname;
}
}
Another way to let Jackson serialize a property is to add the @JsonProperty annotation above it.
I think it's a better approach because you do not need to disable the default behaviour for all @Transient fields, as in Gere's answer.
This question is not about the exact specifics of how to serialize a Java object to a JSON representation, but rather about a scalable and testable pattern for serializing Java objects to JSON. The system I'm maintaining has the notion of varying levels of granularity with regard to the serialization of objects. For example, the system has a concept of a Project. A Project has the following fields:
Name
Description
Owner
List of tasks
Change history
Other metadata
When serializing a list of Projects, it's useful to only return the "summary" information:
Name
Description
Owner
The more detailed fields are omitted. However, when a single Project is requested, a "detailed" view is returned which includes everything. Most objects in the system have this notion of a summary and a detail view, but I'm finding that in most cases, I'm either returning too much or too little information.
To handle which attributes are returned for which view, I've simply annotated the class, and described a summary and a detail view:
@Json(summary = { "name", "description", "owner" },
detail = { "name", "description", "owner", "tasks", "changes", ... })
class Project {
private String name;
...
}
This works decently, but as I mentioned above, I find in most cases, I'm either returning too much or too little. I would be interested to see what kind of patterns exist out there for a flexible approach to getting the data I need. Am I doing it wrong if I'm finding that I'm needing to return different representations of my data? Should I pick a set number of object representations and stick with that? Thanks for your help.
You could use subclassing with an automatic serialisation framework. For example using JAXB (which supports both JSON and XML):
class DetailSummary {
public @XmlElement String name;
public @XmlElement String description;
public @XmlElement String owner;
}
class Detail extends DetailSummary {
public @XmlElement List<Task> tasks;
...
}
This approach allows multiple levels of detail but forces you to use your classes as simple records.
I'm thinking of using the XStream library but I have a couple of questions/concerns.
Say I have a complex object that I want to serialize into XML (or JSON) using XStream. Is XStream able to handle this without any extra work?
For example:
class Foo
{
private Bar bar;
private String name;
// Getters and Setters
}
class Bar
{
private Integer id;
private String name;
// getters and setters
}
Can XStream handle this correctly? Thanks!
Short answer: Yes, it can.
But it will do so with a lot of reflection overhead. I wouldn't ship such code in a production release.
Also, keep in mind that you have to watch out for bidirectional references, which can cause a runtime exception.
Yes, simple nested structures (references to other objects, lists and maps) are supported.
Things get hairy if you need to access fields from different levels (say, you need an attribute from <foo> in Bar).