Generating Avro classes in a specific package - Java

I have two .avsc files with matching namespaces and field names. When generating, the classes generated from the first schema are overwritten by the classes from the second schema. Is there a way to generate the classes in a specific directory, but only for one of the .avsc files?
If I change the namespace in the Avro schema everything generates fine, but the Kafka messages can no longer be read and I get the following error:
Could not find class com.package.TestClaim
Obviously, that's because after the change the class's fully qualified name is com.package.test_claim.TestClaim.
That is what gets generated after I appended *.test_claim to the namespace of one of the schemas.

"in a specific directory, but only for one of the .avsc files?"
That's what the namespace does. This isn't overridable elsewhere, so yes, two files will conflict if compiled separately.
It's unclear how you are compiling the Java classes, but if you use Avro IDL rather than .avsc files, you can declare and combine many record types under a single namespace on the protocol (which will write the classes to a single folder).
And if you need different or nested packaging, that is available too:
#namespace("org.apache.avro.firstNamespace")
protocol MyProto {
#namespace("org.apache.avro.someOtherNamespace")
record Foo {}
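Note that a record-level @namespace overrides the protocol's namespace, so Foo above is generated into org.apache.avro.someOtherNamespace, while records without their own annotation land in org.apache.avro.firstNamespace.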

Related

Shared Avro Schema files and version control

What is the best practice for sharing an Avro schema / generated sources?
I have two Java applications that communicate through Kafka.
My thought was to use Avro schemas for the events that flow between the applications.
So extracting the Avro schemas into a shared library seems like a good idea. But what is actually best practice here? Normally, generated files are not stored in source control, but is that also the case with Avro-generated Java classes? If not, then each consumer will have to generate their own classes at compile time. (But is that even possible if the schemas are in a Maven, Gradle, etc. dependency?)
Overall, version control is good, but you should ignore generated sources such as those that end up in the target folder in Maven.
The generated Java classes can go into a shared library published to Nexus/Artifactory, for example during mvn deploy, and from there they can be appropriately versioned for consumers to use.
Within the classes generated by the avro-maven-plugin, the schema is available as a static field, so you wouldn't need to copy the schema resources into the package.
Otherwise, assuming you are using the Confluent Schema Registry, you could use the GenericRecord type in your consumers and read fields by name, much like you would for a parsed JSON message, e.g. Object fieldName = message.value().get("fieldName"), while producers could still use a specific Avro class.
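A minimal sketch of such a GenericRecord consumer, assuming Confluent's KafkaAvroDeserializer, a locally running Schema Registry, and a hypothetical topic named claims:

import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GenericRecordConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "claims-reader");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("claims"));
            ConsumerRecords<String, GenericRecord> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, GenericRecord> record : records) {
                // Fields are read by name; no generated class is required.
                Object fieldName = record.value().get("fieldName");
                System.out.println(fieldName);
            }
        }
    }
}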

Parsing XML using JAXB with multiple schemas, each in a separate Maven module

I am currently working on having the Maven JAXB plugin generate the model for a really complex set of XML schemas. I started with all the schemas in one Maven module. The resulting module was able to parse any document in that language. Unfortunately, I had to split this one module up into several, because I needed to import that schema and model in another module, and there seem to be issues with episode files in multi-schema modules.
For example, I now have one schema that defines the general structure of the XML format and the basic types (let's call it A). There are several other schemas, each defining sub-types of those base types (let's call one of them B).
I created the modules so that B extends A. Unfortunately, it now seems that I am unable to parse documents anymore. I can see that in module B an ObjectFactory for A has been created containing only definitions for the types that B extends from A; all the others are no longer present.
As long as this ObjectFactory is present I am not able to parse anything, because the definition of the root element is missing. If I delete it, all the elements defined in B are missing from the resulting object tree.
Having compared the original ObjectFactory in the single module with all the schemas, I could see that the first version contained tons of "alternatives" that essentially told the parser which elements could possibly occur. In the split-up version these are missing, so I guess that if I delete the partial ObjectFactory in B, the one in A is used, and that one doesn't know about the elements in B.
So how can I have JAXB parse all elements in A and B? Is there some way to extend ObjectFactories? If so, how is this done?
I guess the next thing that could cause trouble is that I have several schemas extending A, so I have documents containing A, B, C, D and E, where B, C, D and E all extend A but are completely unrelated to each other. I guess extending wouldn't be an option in that case.
I ran into this situation a lot when working on the OGC Schemas and Tools Project. That project has tons of schemas that use one another.
Here are a few tips for you:
Separate individual schemas into individual Maven modules. One schema - one jar.
Generate episode files.
Use these episodes and separate schema compilation to avoid generating classes for the imported schema.
Still, XJC will generate things for the imported schema here and there - even if your episode file says you don't need anything. Mostly these things can be just removed. I was using the Ant plugin to delete these files during the build.
At runtime you just include all the JARs you need and build your JAXBContext for the context path com.acme.foo:com.acme.bar, as sketched below.
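A minimal sketch of that runtime step, using the javax.xml.bind API and hypothetical package and file names:

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;

public class MultiModuleUnmarshal {
    public static void main(String[] args) throws Exception {
        // Colon-separated context path spanning the generated packages from
        // both schema modules; each package must contain its ObjectFactory.
        JAXBContext context = JAXBContext.newInstance("com.acme.foo:com.acme.bar");
        Unmarshaller unmarshaller = context.createUnmarshaller();
        Object document = unmarshaller.unmarshal(new File("document.xml"));
        System.out.println(document.getClass());
    }
}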
You might want to check the OGC project I mentioned. This project has a huge number of interrelated schemas, with different coexisting versions.

Multiple related schemas using jaxb2

I'm using JAXB2 for a REST web service.
I need to use two schemas: my own schema, stored in the src/main/resources/schema folder, and an online schema, http://mypage.com/1/meta/schema.xsd. The problem is that both schemas have duplicated imports, so when I try to build the package, both executions fail, complaining that certain classes were already defined.
How can I fix this?
You could use separate schema compilation for that, i.e. a JAR is created out of each schema file, and downstream compilations reuse it via its episode file instead of regenerating the shared classes.

Dealing with shared namespaces across multiple WSDLs (XMLBeans)

I have five WSDLs that share namespaces, though not all of them do. I generate client code from them (data binding with XMLBeans). Separately, they compile fine. I create a JAR file from each generated client.
Once I try to use all the JAR files within one project, I get naming/compile conflicts.
I want to reuse as much as possible. Is there any smart way to deal with this (rather than giving each client its own node in the package structure)?
The XMLBeans (2.x) FAQ notes the limitations of xsdconfig namespace mapping:
Note: XMLBeans doesn’t support using two or more sets of java classes (in different packages) mapped to schema types/elements that have the same names and target namespaces, using all in the same class loader. Depending on the direction you are using for the java classes to schema types mapping, some features might not work correctly. This is because even though the package names for the java classes are different, the schema location for the schema metadata (.xsb files) is the same and contains the corresponding implementing java class, so the JVM will always pick up the first on the classpath. This can be avoided if multiple class loaders are used.
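If you do go the multiple-class-loader route, a rough sketch might look like the following (the JAR paths and generated class names here are hypothetical):

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;

public class IsolatedXmlBeansClients {
    public static void main(String[] args) throws Exception {
        ClassLoader parent = IsolatedXmlBeansClients.class.getClassLoader();

        // Give each generated client JAR its own class loader so their
        // identically located .xsb metadata files don't shadow one another.
        URLClassLoader clientALoader = new URLClassLoader(
                new URL[] { new File("lib/client-a.jar").toURI().toURL() }, parent);
        URLClassLoader clientBLoader = new URLClassLoader(
                new URL[] { new File("lib/client-b.jar").toURI().toURL() }, parent);

        // Load each generated document class through its own loader.
        Class<?> requestA = clientALoader.loadClass("com.example.a.RequestDocument");
        Class<?> requestB = clientBLoader.loadClass("com.example.b.RequestDocument");
        System.out.println(requestA + " / " + requestB);
    }
}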

Options for JAXB 2.1 Bindings Customization

I am working on generating Java objects from an XSD file using JAXB 2.1. The XSD file has several elements representing business model entities, with common names like Account, etc. The system that uses the generated files to unmarshal the XML has several conflicting class names in its domain model. Though we could use different package names to get around the class name conflicts, I think it will be more readable/maintainable to have objects with different names.
Because of this, I'd like to alter the XJC compilation so that it produces objects like DataTransferAccount.java, etc. instead of Account.java. Super, I'll use one of the two options JAXB provides when binding a schema (http://java.sun.com/webservices/docs/2.0/tutorial/doc/JAXBUsing4.html):
Inline Customizations - Annotate the XSD itself using the jaxb namespace to specify class names
External Bindings File - Provide an extra file to the XJC which has rules on how to map schema elements to java classes
Is there a good argument for using option 1, aside from the ease of use? Naively, I'm tempted to use it because it is easy, but down the road I can see maintenance headaches if we decide to move away from JAXB XML unmarshalling.
Your instincts are good - the only situation in which I'd consider adding inline annotations to the schema is if you or your developers were the ones responsible for maintaining the schema itself.
If the schema is someone else's, and there's any danger of it changing in future, then resist the temptation - use an external binding customization. Yes, it's a bit awkward to use, but it's worth the effort.
As for your original issue of name clashes, an XML schema is not allowed to re-use the same name either. The only way you should be getting name clashes in the generated Java is if you're compiling schemas from multiple namespaces into the same Java package. If you have multiple namespaces, I strongly suggest you put each namespace into its own package; it tends to make things clearer, and it also avoids these name clashes.
