Getting elements from failed XML - java

I have a large XML file that must be validated against a large XSD. The client asked me to populate a table with different values when a validation error occurs. For example, if the student ID is not valid, I should show the school district, region and student ID. In another section of the XML, if the state is not valid, I should show the school name, state and region. The data to show varies with the invalid data, but in each case two to four elements that are parents of the invalid child element need to be extracted.
How can I extract this data using XMLStreamReader and Validator?
I tried the following, but I can get only the invalid element, not the other data:
public class StaxReaderWithElementIdentification {
    private static final StreamSource XSD = new StreamSource("files\\InterchangeEducationOrganizationExension.xsd");
    private static final StreamSource XML = new StreamSource("files\\InterchangeEducationOrganizationExension.xml");

    public static void main(String[] args) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(XSD);
        XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(XML);
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new MyErrorHandler(reader));
        validator.validate(new StAXSource(reader));
    }
}
The handler is:
public class MyErrorHandler implements ErrorHandler {
    private XMLStreamReader reader;

    public MyErrorHandler(XMLStreamReader reader) {
        this.reader = reader;
    }

    @Override
    public void error(SAXParseException e) throws SAXException {
        warning(e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        warning(e);
    }

    @Override
    public void warning(SAXParseException e) throws SAXException {
        // Prints only the element the reader is positioned on when the error is reported
        //System.out.println(reader.getProperty(name));
        System.out.println(reader.getLocalName());
        System.out.println(reader.getNamespaceURI());
        e.printStackTrace(System.out);
    }
}
Can anyone help me extract the other data when a validation error occurs?

I'm not sure it is the best solution, but you might try using HTMLEditorKit and implementing a custom ParserCallback.
That way you could parse the document and react only to the tags you are interested in. It will chew through any XML/HTML, no matter how invalid it is.
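A different route that stays closer to the question's own StAX and Validator setup is to wrap the XMLStreamReader so that it remembers the text of the elements it has already passed. The error handler can then look up the surrounding values when an error is reported. The following is only a sketch under assumptions: ContextTrackingReader, ValidationContextDemo, and the tiny inline XSD and XML are hypothetical stand-ins for the real files, and only elements that appear before the failing one will have been recorded.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.util.StreamReaderDelegate;
import javax.xml.transform.stax.StAXSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class ValidationContextDemo {
    // Hypothetical sample schema and document standing in for the real files.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='school'><xs:complexType><xs:sequence>"
      + "<xs:element name='district' type='xs:string'/>"
      + "<xs:element name='region' type='xs:string'/>"
      + "<xs:element name='student'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:int'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    static final String XML =
        "<school><district>D1</district><region>North</region>"
      + "<student><id>abc</id></student></school>";

    // Returns one snapshot of the values read so far for every reported error.
    static List<Map<String, String>> validate(String xsd, String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(xsd)));
        ContextTrackingReader reader = new ContextTrackingReader(
                XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml)));
        List<Map<String, String>> failures = new ArrayList<>();
        Validator validator = schema.newValidator();
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { }
            @Override public void error(SAXParseException e) { failures.add(reader.snapshot()); }
            @Override public void fatalError(SAXParseException e) { failures.add(reader.snapshot()); }
        });
        validator.validate(new StAXSource(reader));
        return failures;
    }

    public static void main(String[] args) throws Exception {
        for (Map<String, String> context : validate(XSD, XML)) {
            System.out.println(context);
        }
    }
}

// Hypothetical helper: records the text of each element as the validator pulls events.
class ContextTrackingReader extends StreamReaderDelegate {
    private final Deque<String> path = new ArrayDeque<>();
    private final Map<String, String> seen = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();

    ContextTrackingReader(XMLStreamReader reader) { super(reader); }

    @Override public int next() throws XMLStreamException { return track(super.next()); }
    @Override public int nextTag() throws XMLStreamException { return track(super.nextTag()); }

    private int track(int event) {
        if (event == XMLStreamConstants.START_ELEMENT) {
            path.addLast(getLocalName());
            text.setLength(0);
        } else if (event == XMLStreamConstants.CHARACTERS || event == XMLStreamConstants.CDATA) {
            text.append(getText());
        } else if (event == XMLStreamConstants.END_ELEMENT) {
            String value = text.toString().trim();
            if (!value.isEmpty()) {
                seen.put(String.join("/", path), value); // key looks like "school/region"
            }
            text.setLength(0);
            path.pollLast();
        }
        return event;
    }

    // Copy of everything read so far, keyed by element path.
    Map<String, String> snapshot() { return new LinkedHashMap<>(seen); }
}
```

From each snapshot, the handler can pick whichever keys matter for that kind of error, for example the district and region paths when the student ID is invalid.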

How to generate custom tag names and namespaces in xml using apache camel

I'm trying to transform pipe-delimited string data to XML using Camel Bindy, but the generated tags carry the class name. I would also like to add a namespace to my tags.
I tried to use a Camel processor to generate custom tags, but it's not working.
ConverterRoute.java
public class ConverterRoute {
    private static final String SOURCE_INPUT_PATH = "file://inbox?fileName=3000.txt";
    private static final String SOURCE_OUTPUT_PATH = "file://outbox?fileName=itemfile.xml";

    public void addRoutesToCamelContext(CamelContext context) throws Exception {
        context.addRoutes(new RouteBuilder() {
            public void configure() {
                try {
                    DataFormat bindyFixed = new BindyCsvDataFormat(PartInboundIFD.class);
                    // Created but never attached to the route, so it has no effect on the output
                    NameSpace nameSpace = new NameSpace("PART_INB_IFD", "https://apache.org.com");
                    from(SOURCE_INPUT_PATH)
                        .unmarshal(bindyFixed)
                        .marshal()
                        .xstream()
                        .to(SOURCE_OUTPUT_PATH);
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        });
    }
}
Pojo.java
@CsvRecord(separator = "\\|", skipField = true)
public class Pojo {
    @Link
    private ControlSegment CONTROL_SEGMENT;
}
CamelComponent.java
public class CamelConfig extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        try {
            CamelContext context = new DefaultCamelContext();
            ConverterRoute route = new ConverterRoute();
            route.addRoutesToCamelContext(context);
            context.start();
            Thread.sleep(5000);
            context.stop();
        } catch (Exception exe) {
            exe.printStackTrace();
        }
    }
}
OUTPUT
Result.xml
<list>
  <com.abc.domain.Pojo>
    <CONTROL__SEGMENT/>
    <TRNNAM>PART_TRAN</TRNNAM>
    <TRNVER>9.0</TRNVER>
  </com.abc.domain.Pojo>
</list>
The output of the transformation is shown above. The first tag is printed with the full package and class name (e.g. com.abc.domain.Pojo). I'm also trying to generate a namespace, but it does not appear in my output.
Maybe you can add an additional XSLT route (https://camel.apache.org/components/latest/xslt-component.html).
Within the XSLT you can transform the XML to your liking and add the correct namespaces (see "How can I add namespaces to the root element of my XML using XSLT?").
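As a rough illustration of what such a stylesheet might do, the sketch below runs an XSLT 1.0 transform with the JDK's built-in javax.xml.transform API, outside of Camel. It renames the class-named element and puts every element into a namespace. The target name PART_INB_IFD and the namespace URI are taken from the question; the class name RenameAndNamespaceDemo and the stylesheet itself are assumptions, not something from the original route. In a Camel route the same stylesheet would be referenced from an xslt endpoint instead of being run by hand.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RenameAndNamespaceDemo {
    // Hypothetical stylesheet: moves all elements into one namespace and
    // renames the element that XStream named after the Java class.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='*'>"
      + "  <xsl:element name='{local-name()}' namespace='https://apache.org.com'>"
      + "    <xsl:apply-templates select='@*|node()'/>"
      + "  </xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match=\"*[local-name()='com.abc.domain.Pojo']\">"
      + "  <xsl:element name='PART_INB_IFD' namespace='https://apache.org.com'>"
      + "    <xsl:apply-templates select='@*|node()'/>"
      + "  </xsl:element>"
      + "</xsl:template>"
      + "<xsl:template match='@*|text()'><xsl:copy/></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the result as a string.
    static String transform(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<list><com.abc.domain.Pojo><CONTROL__SEGMENT/>"
                   + "<TRNNAM>PART_TRAN</TRNNAM><TRNVER>9.0</TRNVER>"
                   + "</com.abc.domain.Pojo></list>";
        System.out.println(transform(xml));
    }
}
```

The more specific template wins over the generic one because a pattern with a predicate has a higher default priority than a bare `*`.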

How to create schema from a Map and register to Schema Registry

Is there a way to create a Schema from a Map?
I have a map with key-value pairs and want to create a Schema from it.
I have seen the org.apache.avro.Schema class (from avro-tools-1.8.2.jar), which has APIs like the ones below to read JSON and create a Schema from it.
public Schema parse(File file) throws IOException {
    return parse(FACTORY.createJsonParser(file));
}
public Schema parse(InputStream in) throws IOException {
    return parse(FACTORY.createJsonParser(in).disable(
        JsonParser.Feature.AUTO_CLOSE_SOURCE));
}
public Schema parse(String s, String... more) {
    StringBuilder b = new StringBuilder(s);
    for (String part : more)
        b.append(part);
    return parse(b.toString());
}
public Schema parse(String s) {
    try {
        return parse(FACTORY.createJsonParser(new StringReader(s)));
    } catch (IOException e) {
        throw new SchemaParseException(e);
    }
}
Any pointers on how to create a Schema from a Map? After creating the schema I will register it with Confluent Schema Registry.
I'm not sure about parsing a Map<String, ?> but you can build the schema in code rather than parsing JSON.
Example
final Schema valueType = SchemaBuilder.builder().stringType();
Schema mapSchema = SchemaBuilder.map().values(valueType);
System.out.println(mapSchema);
// {"type":"map","values":"string"}
Schema recordSchemaWithMap = SchemaBuilder.builder("my.namespace.avro").record("MapData")
    .fields()
    .name("attributes").type(Schema.createMap(valueType)).noDefault()
    .endRecord();
System.out.println(recordSchemaWithMap);
// {"type":"record","name":"MapData","namespace":"my.namespace.avro","fields":[{"name":"attributes","type":{"type":"map","values":"string"}}]}
This could probably be extended by looping over the Map.Entry values and building up the Schema object.
Note: all Avro maps have string-typed keys.
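One possible way to loop over the entries without depending on Avro at all is to generate the record schema as JSON text, which is the input the parse(String) methods quoted in the question expect. The sketch below is only an illustration: the class MapToAvroSchemaJson, the choice of one record field per map key, and the Java-to-Avro type mapping are all assumptions. The resulting string could then be passed to Avro's new Schema.Parser().parse(json).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class MapToAvroSchemaJson {
    // Assumed mapping from Java value types to Avro primitive type names.
    static String avroType(Object value) {
        if (value instanceof String) return "string";
        if (value instanceof Integer) return "int";
        if (value instanceof Long) return "long";
        if (value instanceof Float) return "float";
        if (value instanceof Double) return "double";
        if (value instanceof Boolean) return "boolean";
        throw new IllegalArgumentException("No Avro type chosen for value: " + value);
    }

    // Builds a record schema with one field per map entry, in iteration order.
    static String toRecordSchemaJson(String namespace, String recordName, Map<String, ?> data) {
        StringJoiner fields = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, ?> entry : data.entrySet()) {
            String name = entry.getKey();
            // Avro names are restricted to this pattern, so no JSON escaping is needed.
            if (!name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("Not a valid Avro field name: " + name);
            }
            fields.add("{\"name\":\"" + name + "\",\"type\":\"" + avroType(entry.getValue()) + "\"}");
        }
        return "{\"type\":\"record\",\"name\":\"" + recordName + "\","
             + "\"namespace\":\"" + namespace + "\",\"fields\":" + fields + "}";
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<>();
        data.put("id", 42);
        data.put("name", "example");
        System.out.println(toRecordSchemaJson("my.namespace.avro", "MapData", data));
    }
}
```

Null values and nested maps are deliberately rejected here; handling them would need union or map types, which is where the SchemaBuilder API shown above becomes more convenient.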

How to internationalize SAXParseException messages while parsing an XML file?

I've got a problem similar to this question: SAXParseException localized
I'm trying to parse an XML file and get a list of parser errors (SAXParseException) in several languages, for example:
XmlImporter.importFile(params, "en") should return a list of errors in English, XmlImporter.importFile(params, "fr") should return a list of errors in French, and XmlImporter.importFile(params, "pl") should return a list of errors in Polish.
Each call of XmlImporter.importFile(params, "...") may use a different locale.
This is my validation method:
private void validate(String xmlFilePath, String schemaFilePath) throws Exception {
    SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    Schema schema = schemaFactory.newSchema(new File(schemaFilePath));
    Validator validator = schema.newValidator();
    XmlErrorHandler errorHandler = new XmlErrorHandler();
    validator.setErrorHandler(errorHandler);
    try (InputStream stream = new FileInputStream(new File(xmlFilePath))) {
        validator.validate(new StreamSource(stream));
    }
}
XmlErrorHandler:
public class XmlErrorHandler implements ErrorHandler {
    private List<String> errorsList = new ArrayList<>();

    public List<String> getErrorsList() {
        return errorsList;
    }

    @Override
    public void warning(SAXParseException exception) throws SAXException {
        errorsList.add(prepareExceptionDescription(exception));
    }

    @Override
    public void error(SAXParseException exception) throws SAXException {
        errorsList.add(prepareExceptionDescription(exception));
    }

    @Override
    public void fatalError(SAXParseException exception) throws SAXException {
        errorsList.add(prepareExceptionDescription(exception));
    }

    // Formats the position and the (localized) message of one parser error
    private String prepareExceptionDescription(SAXParseException exception) {
        return "Error: " +
            "colNumber: " + exception.getColumnNumber() +
            " line number: " + exception.getLineNumber() +
            " message: " + exception.getLocalizedMessage();
    }
}
I assume that I need to pass a java.util.Locale/String somewhere so that exception.getLocalizedMessage() returns a custom message (in en, fr or pl)?
By default, Xerces (the XML parser that produces these messages) ships message bundles for the following languages:
XMLSchemaMessages_de.properties XMLSchemaMessages_es.properties
XMLSchemaMessages_fr.properties XMLSchemaMessages_it.properties
XMLSchemaMessages_ja.properties XMLSchemaMessages_ko.properties
XMLSchemaMessages_pt_BR.properties XMLSchemaMessages_sv.properties
XMLSchemaMessages_zh_CN.properties XMLSchemaMessages_zh_TW.properties
To provide internationalization in another language:
Get the XMLSchemaMessages.properties file from Apache Xerces and copy it to a new file named XMLSchemaMessages_LANG.properties, where LANG is replaced by the code of the new language.
Translate the file's messages into the new language and place the file on the classpath (for example, in src\main\resources\com\sun\org\apache\xerces\internal\impl\msg).
Exception messages will then appear in the new language (they are taken from the XMLSchemaMessages_LANG.properties file).

How NOT to deserialise a nested XML document with XStream

I have a document structure like this:
<MyDocument>
  <MyChildDocument>
    <SubElement>
      ...
    </SubElement>
  </MyChildDocument>
</MyDocument>
I would like XStream to de-serialise this to the following object:
@XStreamAlias("MyDocument")
public class MyDocument {
    String myChildDocument;

    public String getMyChildDocument() {
        return myChildDocument;
    }

    public void setMyChildDocument(String str) {
        myChildDocument = str;
    }
}
The myChildDocument variable should contain the full child document as a string including the tags.
I also need to do the serialisation side of this, preventing XStream from entity-encoding the XML string contained in the myChildDocument variable.
I've been looking at converters to do this for me, but have not found a good way to do it. Any ideas?
I managed to create a solution for this using a custom converter. In simple terms, when marshalling, feed the XML string for MyChildDocument into an XML reader and use a copier to feed this back out to the writer that is creating the marshalled result. Reverse the process when unmarshalling incoming XML!
public class MyExchangeConverter implements Converter {
    protected static XmlPullParser pullParser;

    // Lazily creates a single shared pull parser
    protected static XmlPullParser getPullParser() {
        if (pullParser == null) {
            try {
                pullParser = XmlPullParserFactory.newInstance().newPullParser();
            }
            catch (XmlPullParserException e) { } // Ah nuts!
        }
        return pullParser;
    }

    @Override
    public boolean canConvert(@SuppressWarnings("rawtypes") Class type) {
        return MyDocument.class.equals(type);
    }

    @Override
    public void marshal(Object source, HierarchicalStreamWriter writer,
            MarshallingContext context) {
        MyDocument request = (MyDocument) source;
        if (request.getMyChildDocument() != null) {
            // Read the stored XML string and copy its structure straight into the output
            HierarchicalStreamReader reader;
            reader = new XppReader(new StringReader(request.getMyChildDocument()), getPullParser());
            HierarchicalStreamCopier copier = new HierarchicalStreamCopier();
            copier.copy(reader, writer);
        }
    }

    @Override
    public Object unmarshal(HierarchicalStreamReader reader,
            UnmarshallingContext context) {
        MyDocument response = new MyDocument();
        reader.moveDown();
        // Copy the child element, tags included, into a string
        Writer out = new StringWriter();
        HierarchicalStreamWriter writer = new CompactWriter(out);
        HierarchicalStreamCopier copier = new HierarchicalStreamCopier();
        copier.copy(reader, writer);
        response.setMyChildDocument(out.toString());
        reader.moveUp();
        return response;
    }
}
Some would (rightly) argue this opens up the system to XML injection attacks to a degree. True enough, but for my particular use case, this is not a risk I am concerned about. Just something to be aware of if anybody plans to use this for public facing interfaces with unknown remote parties or the risk of man-in-the-middle attacks. You have been warned!

How to set custom ValidationEventHandler on JAXB unmarshaller when using annotations

We’re using JAX-WS in combination with JAXB to receive and parse XML web service calls. It’s all annotation-based, i.e. we never get hold of the JAXBContext in our code. I need to set a custom ValidationEventHandler on the unmarshaller, so that if the date format for a particular field is not accepted, we can catch the error and report something nice back in the response. We have an XmlJavaTypeAdapter on the field in question, which does the parsing and throws an exception. I can’t see how to set a ValidationEventHandler on the unmarshaller using the annotation-based configuration that we have. Any ideas?
Note: same question as this comment which is currently unanswered.
I struggled with this issue for the last week and finally have a working solution. The trick is that JAXB looks for the methods beforeUnmarshal and afterUnmarshal in the object annotated with @XmlRootElement.
..
@XmlRootElement(name="MSEPObtenerPolizaFechaDTO")
@XmlAccessorType(XmlAccessType.FIELD)
public class MSEPObtenerPolizaFechaDTO implements Serializable {
..
    // Called by JAXB before this object is populated: install the schema and handler
    public void beforeUnmarshal(Unmarshaller unmarshaller, Object parent) throws JAXBException, IOException, SAXException {
        unmarshaller.setSchema(Utils.getSchemaFromContext(this.getClass()));
        unmarshaller.setEventHandler(new CustomEventHandler());
    }

    // Called by JAXB after this object is populated: restore the defaults
    public void afterUnmarshal(Unmarshaller unmarshaller, Object parent) throws JAXBException {
        unmarshaller.setSchema(null);
        unmarshaller.setEventHandler(null);
    }
Using this ValidationEventHandler:
public class CustomEventHandler implements ValidationEventHandler {
    @Override
    public boolean handleEvent(ValidationEvent event) {
        if (event.getSeverity() == ValidationEvent.ERROR ||
            event.getSeverity() == ValidationEvent.FATAL_ERROR)
        {
            ValidationEventLocator locator = event.getLocator();
            throw new RuntimeException(event.getMessage(), event.getLinkedException());
        }
        return true;
    }
}
}
And this is the method getSchemaFromContext created in your Utility class:
@SuppressWarnings("unchecked")
public static Schema getSchemaFromContext(Class clazz) throws JAXBException, IOException, SAXException {
    JAXBContext jc = JAXBContext.newInstance(clazz);
    final List<ByteArrayOutputStream> outs = new ArrayList<ByteArrayOutputStream>();
    // Generate the schema(s) for the class into in-memory buffers
    jc.generateSchema(new SchemaOutputResolver() {
        @Override
        public Result createOutput(String namespaceUri,
                String suggestedFileName) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            outs.add(out);
            StreamResult streamResult = new StreamResult(out);
            streamResult.setSystemId("");
            return streamResult;
        }
    });
    // Turn each generated schema document into a source for the SchemaFactory
    StreamSource[] sources = new StreamSource[outs.size()];
    for (int i = 0; i < outs.size(); i++) {
        ByteArrayOutputStream out = outs.get(i);
        sources[i] = new StreamSource(new ByteArrayInputStream(out.toByteArray()), "");
    }
    SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
    return sf.newSchema(sources);
}
