I'm creating a single XML file uploader in my Grails application. There are two types of files, Ap and ApWithVendor. I would like to auto-detect the file type and convert the XML to the correct object using SAXParser.
What I've been doing is throwing an exception from the endElement method when the SAX parser is unable to find a qName match within the first Ap object. I then catch the exception and try the ApWithVendor object.
My question: is there a better way to do this without using exceptions for my condition checking?
Code example
try {
System.out.println("ApBatch");
Batch<ApBatchEntry> batch = new ApBatchConverter().convertFromXML(new String(xmlDocument, StandardCharsets.UTF_8));
byte[] xml = new ApBatchConverter().convertToXML(batch, true);
String xmlString = new String(xml, StandardCharsets.UTF_8);
System.out.println(xmlString);
errors = client.validateApBatch(batch);
if (!errors.isEmpty()) {
throw new BatchValidationException(errors);
}
return;
} catch (BatchConverterException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
try {
System.out.println("ApVendorBatch");
Batch<ApWithVendorBatchEntry> batch = new ApWithVendorBatchConverter().convertFromXML(new String(xmlDocument, StandardCharsets.UTF_8));
byte[] xml = new ApWithVendorBatchConverter().convertToXML(batch, true);
String xmlString = new String(xml, StandardCharsets.UTF_8);
System.out.println(xmlString);
errors = client.validateApWithVendorBatch(batch);
if (!errors.isEmpty()) {
throw new BatchValidationException(errors);
}
return;
} catch (BatchConverterException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
You can always iterate over the nodes in the XML and base the decision on whether a specific node is missing, is present, or has a specific value (see the DocumentBuilder and Document classes).
Using exceptions for decision-making or flow control is considered bad practice in 99% of situations.
Try converting the XML string to an XML tree object first and use XPath to decide whether it's an ApWithVendor structure, i.e. check whether there is an element at a path like "/application/foo/vendor" in the structure.
Once you have decided, convert the XML tree object to the matching object.
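The XPath check suggested above could look roughly like this. It is only a sketch: the class name, the enum and the path "/batch/entry/vendor" are made up for illustration, so substitute an element that really occurs only in your ApWithVendor files.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class BatchTypeDetector {

    enum BatchType { AP, AP_WITH_VENDOR }

    // Hypothetical path: replace with an element that only ApWithVendor files contain.
    private static final String VENDOR_PATH = "/batch/entry/vendor";

    static BatchType detect(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Uploaded files are untrusted input, so refuse DOCTYPE declarations.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // With XPathConstants.BOOLEAN, a non-empty node set evaluates to true.
        Boolean hasVendor = (Boolean) xpath.evaluate(VENDOR_PATH, doc, XPathConstants.BOOLEAN);
        return hasVendor ? BatchType.AP_WITH_VENDOR : BatchType.AP;
    }
}
```

With the type known up front, you can call the matching converter once instead of relying on BatchConverterException to signal the wrong type.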
I am trying to parse an XML file using StAX, but I get this error:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[414,47]
Message: The reference to entity "R" must end with the ';' delimiter.
The parser gets stuck on line 414 of the XML file, which contains P&R. The code I use to parse it is:
public List<Vild> getVildData(File file){
XMLInputFactory factory = XMLInputFactory.newFactory();
try {
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(Files.readAllBytes(file.toPath()));
XMLStreamReader reader = factory.createXMLStreamReader(byteArrayInputStream, "iso8859-1");
List<Vild> vild = saveVild(reader);
reader.close();
return vild;
} catch (IOException e) {
e.printStackTrace();
} catch (XMLStreamException e) {
e.printStackTrace();
}
return Collections.emptyList();
}
private List<Vild> saveVild(XMLStreamReader streamReader) {
List<Vild> vildList = new ArrayList<>();
try{
Vild vild = new Vild();
while (streamReader.hasNext()) {
streamReader.next();
//Creating list with data
}
}catch(XMLStreamException | IllegalStateException ex) {
ex.printStackTrace();
}
return Collections.emptyList();
}
I read online that a bare & is invalid in XML, but I don't know how to change it before it throws this error inside the saveVild method. Does someone know how to do this efficiently?
Change the question: you're not trying to parse an XML file, you're trying to parse a non-XML file. For that, you need a non-XML parser, and to write such a parser you need to start with a specification of the language you are trying to parse, and you'll need to agree the specification of this language with the other partners to the data interchange.
How much work you could all save by conforming to standards!
Treat broken XML arriving in your shop the way you would treat any other broken goods coming from a supplier: return it to sender marked "unfit for purpose".
The problem here, as you mention, is that the parser finds the & and expects an entity reference terminated by ;
This is fixed by escaping the character, so that the parser finds &amp;amp; instead.
Take a look here for further reference
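If you really cannot get the sender to fix the file, one workaround is to escape the stray ampersands yourself before handing the text to the parser. This is a sketch rather than a complete solution: the class and method names are made up, and the regex does not know about CDATA sections or comments, where a bare & is legal and would be rewritten as well.

```java
import java.util.regex.Pattern;

class AmpersandEscaper {

    // Matches an '&' that does NOT start a named (&amp;), decimal (&#38;)
    // or hexadecimal (&#x26;) reference.
    private static final Pattern BARE_AMPERSAND = Pattern.compile(
            "&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String xml) {
        return BARE_AMPERSAND.matcher(xml).replaceAll("&amp;");
    }
}
```

In getVildData you would decode the bytes with the file's encoding, run the text through escapeBareAmpersands, and create the XMLStreamReader from a StringReader over the result.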
I went through this tutorial for Java NLP: https://www.tutorialspoint.com/opennlp/index.htm
I tried the code below on Android:
try {
File file = copyAssets();
// InputStream inputStream = new FileInputStream(file);
ParserModel model = new ParserModel(file);
// Creating a parser
Parser parser = ParserFactory.create(model);
// Parsing the sentence
String sentence = "Tutorialspoint is the largest tutorial library.";
Parse topParses[] = ParserTool.parseLine(sentence, parser,1);
for (Parse p : topParses) {
p.show();
}
} catch (Exception e) {
}
I downloaded the file en-parser-chunking.bin from the internet and placed it in the assets of my Android project, but the code stops on the third line, i.e. ParserModel model = new ParserModel(file);, without giving any exception. How can I make this work on Android? If it cannot work, is there any other NLP support for Android that does not consume any services?
The reason the code stalls/breaks at runtime is that you need to use an InputStream instead of a File to load the binary file resource. Most likely, the File instance is null when you "load" it the way indicated in line 2. In theory, this constructor of ParserModel should detect this and throw an IOException. Sadly, the JavaDoc of OpenNLP is not precise about this kind of situation, and you are not handling the exception properly in the catch block: it is empty, so any exception is silently swallowed.
Moreover, the code snippet you presented should be improved so that you know what actually went wrong.
Therefore, loading a ParserModel from within an Activity should be done differently. Here is a variant that takes care of both aspects:
AssetManager assetManager = getAssets();
InputStream in = null;
try {
    in = assetManager.open("en-parser-chunking.bin");
    if (in != null) {
        ParserModel model = new ParserModel(in);
        // From here, <model> is initialized and you can start playing with it...
        // Creating a parser
        Parser parser = ParserFactory.create(model);
        // Parsing the sentence
        String sentence = "Tutorialspoint is the largest tutorial library.";
        Parse[] topParses = ParserTool.parseLine(sentence, parser, 1);
        for (Parse p : topParses) {
            p.show();
        }
    }
    else {
        // resource file not found - whatever you want to do in this case
        Log.w("NLP", "OpenNLP binary model file could not be found in assets.");
    }
}
catch (Exception ex) {
    Log.e("NLP", "message: " + ex.getMessage(), ex);
    // proper exception handling here...
}
finally {
    if (in != null) {
        try {
            in.close();
        } catch (IOException closeEx) {
            // close() itself can throw; log it instead of hiding the original problem
            Log.w("NLP", "Could not close model stream.", closeEx);
        }
    }
}
This way, you're using an InputStream approach and at the same time you take care of proper exception and resource handling. Moreover, you can now use a debugger in case something remains unclear with the resource path references of your model files. For reference, see the official JavaDoc of AssetManager#open(String fileName).
Note well:
Loading OpenNLP's binary resources can consume quite a lot of memory. For this reason, your Android app's request to allocate the memory needed for this operation may not be granted by the actual runtime (i.e., smartphone) environment.
Therefore, carefully monitor the amount of requested/required RAM while model = new ParserModel(in); is invoked.
Hope it helps.
I am trying to use JAXB to marshal objects I create to XML. What I want is to create a list and print it to the file, then create a new list and print it to the same file, but every time I do, it overwrites the first. I want the final XML file to look as if I had only one big list of objects. I would build that one list, but there are so many objects that I quickly max out my heap size.
So, my main creates a bunch of threads, each of which iterates through a list of objects it receives and calls create_Log on each object. Once it is finished it calls printToFile, which is where it marshals the list to the file.
public class LogThread implements Runnable {
//private Thread myThread;
private Log_Message message = null;
private LinkedList<Log_Message> lmList = null;
LogServer Log = null;
private String Username = null;
public LogThread(LinkedList<Log_Message> lmList){
this.lmList = lmList;
}
public void run(){
//System.out.println("thread running");
LogServer Log = new LogServer();
//create iterator for list
final ListIterator<Log_Message> listIterator = lmList.listIterator();
while(listIterator.hasNext()){
message = listIterator.next();
CountTrans.addTransNumber(message.TransactionNumber);
Username = message.input[2];
Log.create_Log(message.input, message.TransactionNumber, message.Message, message.CMD);
}
Log.printToFile();
init_LogServer.threadCount--;
init_LogServer.doneList();
init_LogServer.doneUser();
System.out.println("Thread "+ Thread.currentThread().getId() +" Completed user: "+ Username+"... Number of Users Complete: " + init_LogServer.getUsersComplete());
//Thread.interrupt();
}
}
The above calls the function create_Log below to build a new object of a class I generated from the XSD I was given (SystemEventType, QuoteServerType, etc.). These objects are all added to an ArrayList using the function below and attached to the Root object. Once the LogThread loop is finished it calls printToFile, which takes the list from the Root object and marshals it to the file, overwriting what was already there. How can I add it to the same file without overwriting and without creating one master list in the heap?
public class LogServer {
public log Root = null;
public static String fileName = "LogFile.xml";
public static File XMLfile = new File(fileName);
public LogServer(){
this.Root = new log();
}
//output LogFile.xml
public synchronized void printToFile(){
System.out.println("Printing XML");
//write to xml file
try {
init_LogServer.marshaller.marshal(Root,XMLfile);
} catch (JAXBException e) {
e.printStackTrace();
}
System.out.println("Done Printing XML");
}
private BigDecimal ConvertStringtoBD(String input){
DecimalFormatSymbols symbols = new DecimalFormatSymbols();
symbols.setGroupingSeparator(',');
symbols.setDecimalSeparator('.');
String pattern = "#,##0.0#";
DecimalFormat decimalFormat = new DecimalFormat(pattern, symbols);
decimalFormat.setParseBigDecimal(true);
// parse the string
BigDecimal bigDecimal = new BigDecimal("0");
try {
bigDecimal = (BigDecimal) decimalFormat.parse(input);
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return bigDecimal;
}
public QuoteServerType Log_Quote(String[] input, int TransactionNumber){
BigDecimal quote = ConvertStringtoBD(input[4]);
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
BigInteger ServerTimeStamp = new BigInteger(input[6]);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
QuoteServerType quoteCall = factory.createQuoteServerType();
quoteCall.setTimestamp(timestamp);
quoteCall.setServer(input[8]);
quoteCall.setTransactionNum(TransNumber);
quoteCall.setPrice(quote);
quoteCall.setStockSymbol(input[3]);
quoteCall.setUsername(input[2]);
quoteCall.setQuoteServerTime(ServerTimeStamp);
quoteCall.setCryptokey(input[7]);
return quoteCall;
}
public SystemEventType Log_SystemEvent(String[] input, int TransactionNumber, CommandType CMD){
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
SystemEventType SysEvent = factory.createSystemEventType();
SysEvent.setTimestamp(timestamp);
SysEvent.setServer(input[8]);
SysEvent.setTransactionNum(TransNumber);
SysEvent.setCommand(CMD);
SysEvent.setFilename(fileName);
return SysEvent;
}
public void create_Log(String[] input, int TransactionNumber, String Message, CommandType Command){
switch(Command.toString()){
case "QUOTE": //Quote_Log
QuoteServerType quote_QuoteType = Log_Quote(input,TransactionNumber);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(quote_QuoteType);
break;
case "QUOTE_CACHED":
SystemEventType Quote_Cached_SysType = Log_SystemEvent(input, TransactionNumber, CommandType.QUOTE);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(Quote_Cached_SysType);
break;
}
}
EDIT: The below is code how the objects are added to the ArrayList
public List<Object> getUserCommandOrQuoteServerOrAccountTransaction() {
if (userCommandOrQuoteServerOrAccountTransaction == null) {
userCommandOrQuoteServerOrAccountTransaction = new ArrayList<Object>();
}
return this.userCommandOrQuoteServerOrAccountTransaction;
}
JAXB is about mapping a Java object tree to an XML document or vice versa. So in principle, you need the complete object model before you can save it to XML.
Of course that would not be possible for very large data, for example a DB dump, so JAXB allows marshalling an object tree in fragments, letting the user control the moment of object creation and marshalling. A typical use case is fetching records from a DB one by one and marshalling them one by one to a file, so there is no problem with the heap.
However, you are asking about appending one object tree to another (one fresh in memory, the second already represented in an XML file). That is not normally possible, as it is not really appending but creating a new object tree that contains the content of both (there is only one document root element, not two).
So what you could do is:
1. create a new XML representation with a manually initiated root element,
2. copy the existing XML content to the new XML, either using XMLStreamWriter/XMLStreamReader read/write operations or by unmarshalling the log objects and marshalling them one by one,
3. marshal your log objects into the same XML stream,
4. complete the XML with the root closing element.
Vaguely, something like this:
XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8.name());
// "manually" output the beginning of the XML document == its declaration and the root element
writer.writeStartDocument();
writer.writeStartElement("YOUR_ROOT_ELM");
Marshaller mar = ...
mar.setProperty(Marshaller.JAXB_FRAGMENT, true); // instructs JAXB to output only the objects, not the whole XML document
PartialUnmarshaler existing = ...; // allows reading the XML content of the existing file one object at a time
while (existing.hasNext()) {
    YourObject obj = existing.next();
    mar.marshal(obj, writer);
    writer.flush();
}
List<YourObject> toAppend = ...
for (YourObject obj : toAppend) {
    mar.marshal(obj, writer);
    writer.flush();
}
// finishing the document, closing the root element
writer.writeEndElement();
writer.writeEndDocument();
writer.close();
Reading the objects one by one from large xml file, and complete implementation of PartialUnmarshaler is described in this answer:
https://stackoverflow.com/a/9260039/4483840
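To make the "copy the existing content, then append" step concrete without any JAXB classes, here is a rough sketch using only the StAX event API from the JDK. Everything in it is illustrative: the class name, the method, and the idea that new entries are simple text elements are my assumptions. In your case the appended part would be the JAXB fragment marshalling shown above, and you would stream from and to files rather than strings.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class XmlAppendSketch {

    // Writes one new root, copies everything inside the old root, then appends new entries.
    static String appendEntries(String oldXml, String rootName, String entryName,
                                List<String> newTexts) throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance()
                .createXMLEventReader(new StringReader(oldXml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        writer.add(events.createStartDocument());
        writer.add(events.createStartElement("", "", rootName));

        int depth = 0;
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == 1) continue;   // skip the old root's start tag
            } else if (event.isEndElement()) {
                depth--;
                if (depth == 0) continue;   // skip the old root's end tag
            } else if (depth == 0) {
                continue;                   // skip the old declaration and epilog
            }
            writer.add(event);              // copy everything inside the old root
        }
        reader.close();

        for (String text : newTexts) {      // append the new entries under the same root
            writer.add(events.createStartElement("", "", entryName));
            writer.add(events.createCharacters(text));
            writer.add(events.createEndElement("", "", entryName));
        }

        writer.add(events.createEndElement("", "", rootName));
        writer.add(events.createEndDocument());
        writer.close();
        return out.toString();
    }
}
```

Only one event is held in memory at a time while copying, which is the point of the approach: the heap usage does not grow with the size of the existing log file.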
That is the 'elegant' solution.
Less elegant is to have your threads write their log lists to individual files and then append them yourself. You only need to read and copy the header of the first file, then copy all its content apart from the last closing tag, copy the content of the other files ignoring the document opening and closing tags, and output the closing tag.
If your marshaller is set to marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
each opening/closing tag will be on a separate line, so the ugly hack is to
copy all the lines from the 3rd to the one before last, then output the closing tag.
It is an ugly hack because it is sensitive to your output format (if you, for example, change your container root element). But it is faster to implement than the full JAXB solution.
I want to know whether it is possible to parse only some elements of an XML file into a Java object.
I don't want to create all the fields that are in the XML.
So, how can I do this?
For example, below there is an XML file, and I want only the data inside the tag.
<emit>
<CNPJ>1109</CNPJ>
<xNome>OESTE</xNome>
<xFant>ABATEDOURO</xFant>
<enderEmit>
<xLgr>RODOVIA</xLgr>
<nro>S/N</nro>
<xCpl>402</xCpl>
<xBairro>GOMES</xBairro>
<cMun>314</cMun>
<xMun>MINAS</xMun>
<UF>MG</UF>
<CEP>35661470</CEP>
<cPais>58</cPais>
<xPais>Brasil</xPais>
<fone>03</fone>
</enderEmit>
<IE>20659</IE>
<CRT>3</CRT>
For Java XML parsing where you don't have the XSD and don't want to create a complete object graph to represent the XML, JDOM is a great tool. It allows you to easily walk the XML tree and pick the elements you are interested in.
Here's some sample code that uses JDOM to pick arbitrary values from the XML doc:
// reading can be done using any of the two 'DOM' or 'SAX' parser
// we have used saxBuilder object here
// please note that this saxBuilder is not internal sax from jdk
SAXBuilder saxBuilder = new SAXBuilder();
// obtain file object
File file = new File("/tmp/emit.xml");
try {
// converted file to document object
Document document = saxBuilder.build(file);
//You don't need this or the ns parameters in getChild()
//if your XML document has no namespace
Namespace ns = Namespace.getNamespace("http://www.example.com/namespace");
// get root node from xml. emit in your sample doc?
Element rootNode = document.getRootElement();
//getChild() assumes one and only one enderEmit element. Use getChildren() and error
//checking as needed for your document
Element enderEmitElement = rootNode.getChild("enderEmit", ns);
//now we get two of the children of enderEmit
Element xCplElement = enderEmitElement.getChild("xCpl", ns);
//should be 402 in your example
String xCplValue = xCplElement.getText();
System.out.println("xCpl: " + xCplValue);
Element cMunElement = enderEmitElement.getChild("cMun", ns);
//should be 314 in your example
String cMunValue = cMunElement.getText();
System.out.println("cMun: " + cMunValue);
} catch (JDOMException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
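If you would rather not add a dependency, roughly the same selective read can be done with the XPath API that ships with the JDK. Note that this swaps JDOM for plain DOM plus XPath; the class and method names are made up for illustration, and the snippet assumes the document has no namespace.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EmitFieldReader {

    // Parses the XML once and returns the text of the first node matching each path.
    static String[] readFields(String xml, String... paths) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        String[] values = new String[paths.length];
        for (int i = 0; i < paths.length; i++) {
            // evaluate(String, Object) returns "" when nothing matches the path
            values[i] = xpath.evaluate(paths[i], doc);
        }
        return values;
    }
}
```

For the sample document, readFields(xml, "/emit/enderEmit/xCpl", "/emit/enderEmit/cMun") would pick out just those two values and ignore every other element.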
You can use JAXB to unmarshal the xml into Java object, with which you can read selective elements easily. With JAXB, the given XML can be represented in Java as follows :
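enderEmit element:
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class EnderEmit{
private String xLgr;
//Other elements. Here you can define properties for only those elements that you want to load
}
emit element (this represents your XML file):
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class Emit{
//the element name must be given when it differs from the field name
@XmlElement(name = "CNPJ")
private String cnpj;
@XmlElement(name = "xNome")
private String xNome;
private EnderEmit enderEmit;
//Add only the elements that you want to load
}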
Now, using the lines of code below, you can read your XML into an object:
String filePath="filePath";
File file = new File(filePath);
JAXBContext jaxbContext = JAXBContext.newInstance(Emit.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Emit emit = (Emit) jaxbUnmarshaller.unmarshal(file);
The last line gives you an Emit object for the given XML.
Try using StringUtils.substringBetween (presumably from Apache Commons Lang):
BufferedReader br = null; // declared outside the try so the finally block can close it
try
{
String input = "";
br = new BufferedReader(new FileReader(FILEPATH));
String result = null;
while ((input = br.readLine()) != null) // here we read the file line by line
{
result = StringUtils.substringBetween(input, ">", "<"); // text between the first '>' and the next '<' on this line
if(result != null) // the result is null for lines that have no text between tags
{
System.out.println(""+result);
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
if (br != null)
{
br.close();
}
}
catch (IOException ex)
{
ex.printStackTrace();
}
}
This is driving me crazy. I have used this bit of code for lots of different projects but this is the first time it's given me this type of error. This is the whole XML file:
<layers>
<layer name="Layer 1" h="400" w="272" z="0" y="98" x="268"/>
<layer name="Layer 0" h="355" w="600" z="0" y="287" x="631"/>
</layers>
Here is the operative bit of code in my homebrew Xml class which uses the DocumentBuilderFactory to parse the Xml fed into it:
public static Xml parse(String xmlString)
{
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
dbf.setNamespaceAware(true);
Document doc = null;
//System.out.print(xmlString);
try
{
doc = dbf.newDocumentBuilder().parse(
new InputSource(new StringReader(xmlString)));
// Get root element...
Node rootNode = (Element) doc.getDocumentElement();
return getXmlFromNode(rootNode);
} catch (ParserConfigurationException e)
{
System.out.println("ParserConfigurationException in Xml.parse");
e.printStackTrace();
} catch (SAXException e)
{
System.out.println("SAXException in Xml.parse ");
e.printStackTrace();
} catch (IOException e)
{
System.out.println("IOException in Xml.parse");
e.printStackTrace();
}
return null;
}
The context that I am using it is: school project to produce a Photoshop type image manipulation application. The file is being saved with the layers as .png and this xml file for the position, etc. of the layers in a .zip file. I don't know if the zipping is adding some mysterious extra characters or not.
I appreciate your feedback.
If you look at that file in an editor, you'll see content (perhaps whitespace) following the end element e.g.
</layers> <-- after here
It's worth dumping this out using a tool that will highlight whitespace chars e.g.
$ cat -v -e my.xml
will dump 'unprintable' characters.
Hopefully this can be helpful to someone at some point. The fix that worked was just to use lastIndexOf() with substring. Here's the code in situ:
public void loadFile(File m_imageFile)
{
try
{
ZipFile zipFile = new ZipFile(m_imageFile);
ZipEntry xmlZipFile = zipFile.getEntry("xml");
byte[] buffer = new byte[10000];
zipFile.getInputStream(xmlZipFile).read(buffer);
String xmlString = new String(buffer);
Xml xmlRoot = Xml.parse(xmlString.substring(0, xmlString.lastIndexOf('>')+1));
for(List<Xml> iter = xmlRoot.getNestedXml(); iter != null; iter = iter.next())
{
String layerName = iter.element().getAttributes().getValueByName("name");
m_view.getCanvasPanel().getLayers().add(
new Layer(ImageIO.read(zipFile.getInputStream(zipFile.getEntry(layerName))),
Integer.valueOf(iter.element().getAttributes().getValueByName("x")),
Integer.valueOf(iter.element().getAttributes().getValueByName("y")),
Integer.valueOf(iter.element().getAttributes().getValueByName("w")),
Integer.valueOf(iter.element().getAttributes().getValueByName("h")),
Integer.valueOf(iter.element().getAttributes().getValueByName("z")),
iter.element().getAttributes().getValueByName("name"))
);
}
zipFile.close();
} catch (FileNotFoundException e)
{
System.out.println("FileNotFoundException in MainController.loadFile()");
e.printStackTrace();
} catch (IOException e)
{
System.out.println("IOException in MainController.loadFile()");
e.printStackTrace();
}
}
Thanks for all the people that contributed. I suspect the error was either introduced by the zip process or by using the byte[] buffer. Any further feedback is appreciated.
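For what it's worth, the byte[] buffer is the more likely culprit: new String(buffer) decodes all 10000 bytes, including the unused zero bytes after the end of the document, and a single read() call is not guaranteed to fill the buffer or to return the whole entry. A sketch of reading exactly what the stream delivers (the class and method names are made up, and UTF-8 is an assumption about how the file was written):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamText {

    // Reads the stream to its end and decodes only the bytes that were actually read.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            collected.write(chunk, 0, n);   // keep only the n bytes read in this pass
        }
        // Assumption: the XML was saved as UTF-8.
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In loadFile this would replace the buffer handling with Xml.parse(StreamText.readAll(zipFile.getInputStream(xmlZipFile))), and the lastIndexOf workaround should no longer be needed.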
I had some extra characters at the end of the XML. Check the XML carefully, or run it through an online XML formatter, which will report an error if the XML is not well-formed.
I just had this error in Sterling Integrator. When I looked at the file in a hex editor,
it had about 5 extra lines of char(0), not spaces. No idea where they came from, but this was precisely the issue, especially as I was basically doing a unity transform, so the XSL engine (in this case Xalan) was obviously passing them through into the result. I removed the extra rows after the last closing angle bracket and the problem was solved.