Parse some elements from an XML file - Java

I want to know whether it is possible to parse only some elements of an XML file into a Java object.
I don't want to create a field for every element in the XML.
So, how can I do this?
For example, below is an XML file, and I want only the data inside one particular tag.
<emit>
<CNPJ>1109</CNPJ>
<xNome>OESTE</xNome>
<xFant>ABATEDOURO</xFant>
<enderEmit>
<xLgr>RODOVIA</xLgr>
<nro>S/N</nro>
<xCpl>402</xCpl>
<xBairro>GOMES</xBairro>
<cMun>314</cMun>
<xMun>MINAS</xMun>
<UF>MG</UF>
<CEP>35661470</CEP>
<cPais>58</cPais>
<xPais>Brasil</xPais>
<fone>03</fone>
</enderEmit>
<IE>20659</IE>
<CRT>3</CRT>

For Java XML parsing where you don't have the XSD and don't want to create a complete object graph to represent the XML, JDOM is a great tool. It allows you to easily walk the XML tree and pick the elements you are interested in.
Here's some sample code that uses JDOM to pick arbitrary values from the XML doc:
// reading can be done using any of the two 'DOM' or 'SAX' parser
// we have used saxBuilder object here
// please note that this saxBuilder is not internal sax from jdk
SAXBuilder saxBuilder = new SAXBuilder();
// obtain file object
File file = new File("/tmp/emit.xml");
try {
// converted file to document object
Document document = saxBuilder.build(file);
//You don't need this or the ns parameters in getChild()
//if your XML document has no namespace
Namespace ns = Namespace.getNamespace("http://www.example.com/namespace");
// get root node from xml. emit in your sample doc?
Element rootNode = document.getRootElement();
// getChild() assumes exactly one enderEmit element. Use getChildren() and add
// null checks as needed for your document
Element enderEmitElement = rootNode.getChild("enderEmit", ns);
// now we get two of the children of enderEmit
Element xCplElement = enderEmitElement.getChild("xCpl", ns);
//should be 402 in your example
String xCplValue = xCplElement.getText();
System.out.println("xCpl: " + xCplValue);
Element cMunElement = enderEmitElement.getChild("cMun", ns);
//should be 314 in your example
String cMunValue = cMunElement.getText();
System.out.println("cMun: " + cMunValue);
} catch (JDOMException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
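If adding JDOM is not an option, the same "pick only what you need" idea can be sketched with the XPath API that ships with the JDK. This is a minimal sketch, not a drop-in replacement for the code above: the class name EmitXPathDemo and the helper pick are made up for illustration, and it assumes the document has no namespace and that emit is the root element, as in the sample.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
final class EmitXPathDemo {

    // Evaluates an XPath expression against an XML string and returns the text
    // of the first match, or an empty string when nothing matches.
    static String pick(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the sample document, assumed to have no namespace.
        String xml = "<emit><CNPJ>1109</CNPJ><enderEmit><xCpl>402</xCpl>"
                + "<cMun>314</cMun></enderEmit><CRT>3</CRT></emit>";
        System.out.println("xCpl: " + pick(xml, "/emit/enderEmit/xCpl"));
        System.out.println("cMun: " + pick(xml, "/emit/enderEmit/cMun"));
    }
}
```

Each call re-parses the XML, which is fine for a handful of values; for many lookups it would make more sense to parse once into a Document and evaluate every expression against that.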

You can use JAXB to unmarshal the xml into Java object, with which you can read selective elements easily. With JAXB, the given XML can be represented in Java as follows :
enderEmit element :
@XmlRootElement
public class EnderEmit{
private String xLgr;
//Other elements.Here you can define properties for only those elements that you want to load
}
emit element (This represents your XML file):
@XmlRootElement
public class Emit{
private String cnpj;
private String xnom;
private EnderEmit enderEmit;
..
//Add elements that you want to load
}
Now by using the below lines of code, you can read your xml to an object :
String filePath="filePath";
File file = new File(filePath);
JAXBContext jaxbContext = JAXBContext.newInstance(Emit.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Emit emit = (Emit) jaxbUnmarshaller.unmarshal(file);
The last line gives you an Emit object for the given XML.

Try using StringUtils.substringBetween from Apache Commons Lang:
BufferedReader br = null; // declared outside the try so the finally block can close it
try
{
String input = "";
br = new BufferedReader(new FileReader(FILEPATH));
String result = null;
while ((input = br.readLine()) != null) // read the file line by line
{
result = StringUtils.substringBetween(input, ">", "<"); // text between the first '>' and the next '<'
if(result != null) // result is null for lines that have no such tag pair, so skip those
{
System.out.println(""+result);
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
if (br != null)
{
br.close();
}
}
catch (IOException ex)
{
ex.printStackTrace();
}
}

Related

Java stax: The reference to entity "R" must end with the ';' delimiter

I am trying to parse an XML file using StAX, but the error I get is:
javax.xml.stream.XMLStreamException: ParseError at [row,col]:[414,47]
Message: The reference to entity "R" must end with the ';' delimiter.
It fails on line 414, which contains P&R inside the XML file. The code I use to parse it is:
public List<Vild> getVildData(File file){
XMLInputFactory factory = XMLInputFactory.newFactory();
try {
ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(Files.readAllBytes(file.toPath()));
XMLStreamReader reader = factory.createXMLStreamReader(byteArrayInputStream, "iso8859-1");
List<Vild> vild = saveVild(reader);
reader.close();
return vild;
} catch (IOException e) {
e.printStackTrace();
} catch (XMLStreamException e) {
e.printStackTrace();
}
return Collections.emptyList();
}
private List<Vild> saveVild(XMLStreamReader streamReader) {
List<Vild> vildList = new ArrayList<>();
try{
Vild vild = new Vild();
while (streamReader.hasNext()) {
streamReader.next();
//Creating list with data
}
}catch(XMLStreamException | IllegalStateException ex) {
ex.printStackTrace();
}
return vildList;
}
I read online that the & is invalid xml code but I don't know how to change it before it throws this error inside the saveVild method. Does someone know how to do this efficiently?
Change the question: you're not trying to parse an XML file, you're trying to parse a non-XML file. For that, you need a non-XML parser, and to write such a parser you need to start with a specification of the language you are trying to parse, and you'll need to agree the specification of this language with the other partners to the data interchange.
How much work you could all save by conforming to standards!
Treat broken XML arriving in your shop the way you would treat any other broken goods coming from a supplier: return it to sender marked "unfit for purpose".
The problem here, as you mention, is that the parser finds the & and then expects an entity name terminated by ;
This is fixed by escaping the character, so that the parser finds &amp; instead.
Take a look here for further reference
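If the file really cannot be fixed at the source, the escaping can be sketched as a pre-processing step before the text reaches StAX. This is only a sketch under assumptions: the class and method names (AmpersandRepair, escapeBareAmpersands, readText) are invented for illustration, the whole file is assumed to fit in memory as a String, and the regular expression treats every & that is not already the start of a named or numeric reference as a bare ampersand.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named for illustration only.
final class AmpersandRepair {

    // Matches an '&' that does NOT start a named (&amp;), decimal (&#38;)
    // or hexadecimal (&#x26;) reference.
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!(?:[A-Za-z_][A-Za-z0-9._-]*|#[0-9]+|#x[0-9A-Fa-f]+);)");

    static String escapeBareAmpersands(String brokenXml) {
        return BARE_AMPERSAND.matcher(brokenXml).replaceAll("&amp;");
    }

    // Concatenates all character data, just to show that the repaired text parses.
    static String readText(String xml) {
        StringBuilder text = new StringBuilder();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newFactory().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Input is still not well-formed XML", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String broken = "<name>P&R</name>";
        System.out.println(readText(escapeBareAmpersands(broken)));
    }
}
```

Note that this blindly rewrites ampersands inside CDATA sections and comments too, where a bare & is actually legal, so it is a workaround for a known-broken feed rather than a general repair tool.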

Using java exception for condition checking

I'm creating a single XML file uploader in my Grails application. There are two types of files, Ap and ApWithVendor. I would like to auto-detect the file type and convert the XML to the correct object using SAXParser.
What I've been doing is throwing an exception when the SAX parser is unable to find a qName match within the first Ap object using the endElement method. I then catch the exception and try the ApWithVendor object.
My question: is there a better way to do this without using exceptions for my condition checking?
Code example
try {
System.out.println("ApBatch");
Batch<ApBatchEntry> batch = new ApBatchConverter().convertFromXML(new String(xmlDocument, StandardCharsets.UTF_8));
byte[] xml = new ApBatchConverter().convertToXML(batch, true);
String xmlString = new String(xml, StandardCharsets.UTF_8);
System.out.println(xmlString);
errors = client.validateApBatch(batch);
if (!errors.isEmpty()) {
throw new BatchValidationException(errors);
}
return;
} catch (BatchConverterException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
try {
System.out.println("ApVendorBatch");
Batch<ApWithVendorBatchEntry> batch = new ApWithVendorBatchConverter().convertFromXML(new String(xmlDocument, StandardCharsets.UTF_8));
byte[] xml = new ApWithVendorBatchConverter().convertToXML(batch, true);
String xmlString = new String(xml, StandardCharsets.UTF_8);
System.out.println(xmlString);
errors = client.validateApWithVendorBatch(batch);
if (!errors.isEmpty()) {
throw new BatchValidationException(errors);
}
return;
} catch (BatchConverterException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
You can always iterate over the nodes in the XML and base the decision on whether a specific node is missing, present, or has a specific value (see the DocumentBuilder and Document classes).
Using exceptions for decision-making or flow control is considered bad practice in 99% of situations.
Try converting the XML string to an XML tree object first, and use XPath to decide whether it is an ApWithVendor structure, i.e. check whether an element at a path like "/application/foo/vendor" exists in the structure.
Once you have decided, convert the XML tree object to the matching object.
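The XPath check could look roughly like the sketch below. The path /application/foo/vendor is only the placeholder from the answer, not the real layout of an Ap or ApWithVendor file, and the class name BatchTypeDetector and its enum are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only.
final class BatchTypeDetector {

    enum BatchType { AP, AP_WITH_VENDOR }

    // Parses the XML once and decides the type from the presence of a vendor element.
    // "/application/foo/vendor" is a placeholder path; substitute the real one.
    static BatchType detect(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Boolean hasVendor = (Boolean) XPathFactory.newInstance().newXPath()
                    .evaluate("boolean(/application/foo/vendor)", doc, XPathConstants.BOOLEAN);
            return hasVendor ? BatchType.AP_WITH_VENDOR : BatchType.AP;
        } catch (Exception e) {
            // A document that cannot be parsed at all is a genuine error, not a type signal.
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(detect("<application><foo><vendor/></foo></application>"));
        System.out.println(detect("<application><foo/></application>"));
    }
}
```

With the type known up front, the calling code can pick ApBatchConverter or ApWithVendorBatchConverter directly, and an exception is left to mean that the upload is actually broken.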

JaxB marshaler overwriting file contents

I am trying to use JAXB to marshal objects I create to an XML file. What I want is to create a list and print it to the file, then create a new list and print it to the same file, but every time I do, it overwrites the first. I want the final XML file to look as if I had only one big list of objects. I would do that, but there are so many objects that I quickly max out my heap size.
So, my main creates a bunch of threads, each of which iterates through a list of objects it receives and calls create_Log on each object. Once it is finished, it calls printToFile, which is where it marshals the list to the file.
public class LogThread implements Runnable {
//private Thread myThread;
private Log_Message message = null;
private LinkedList<Log_Message> lmList = null;
LogServer Log = null;
private String Username = null;
public LogThread(LinkedList<Log_Message> lmList){
this.lmList = lmList;
}
public void run(){
//System.out.println("thread running");
LogServer Log = new LogServer();
//create iterator for list
final ListIterator<Log_Message> listIterator = lmList.listIterator();
while(listIterator.hasNext()){
message = listIterator.next();
CountTrans.addTransNumber(message.TransactionNumber);
Username = message.input[2];
Log.create_Log(message.input, message.TransactionNumber, message.Message, message.CMD);
}
Log.printToFile();
init_LogServer.threadCount--;
init_LogServer.doneList();
init_LogServer.doneUser();
System.out.println("Thread "+ Thread.currentThread().getId() +" Completed user: "+ Username+"... Number of Users Complete: " + init_LogServer.getUsersComplete());
//Thread.interrupt();
}
}
The above calls the create_Log function below to build a new object generated from the XSD I was given (SystemEventType, QuoteServerType, etc.). These objects are all added to an ArrayList using the function below and attached to the Root object. Once the LogThread loop is finished, it calls printToFile, which takes the list from the Root object and marshals it to the file, overwriting what was already there. How can I add to the same file without overwriting it and without creating one master list in the heap?
public class LogServer {
public log Root = null;
public static String fileName = "LogFile.xml";
public static File XMLfile = new File(fileName);
public LogServer(){
this.Root = new log();
}
//output LogFile.xml
public synchronized void printToFile(){
System.out.println("Printing XML");
//write to xml file
try {
init_LogServer.marshaller.marshal(Root,XMLfile);
} catch (JAXBException e) {
e.printStackTrace();
}
System.out.println("Done Printing XML");
}
private BigDecimal ConvertStringtoBD(String input){
DecimalFormatSymbols symbols = new DecimalFormatSymbols();
symbols.setGroupingSeparator(',');
symbols.setDecimalSeparator('.');
String pattern = "#,##0.0#";
DecimalFormat decimalFormat = new DecimalFormat(pattern, symbols);
decimalFormat.setParseBigDecimal(true);
// parse the string
BigDecimal bigDecimal = new BigDecimal("0");
try {
bigDecimal = (BigDecimal) decimalFormat.parse(input);
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return bigDecimal;
}
public QuoteServerType Log_Quote(String[] input, int TransactionNumber){
BigDecimal quote = ConvertStringtoBD(input[4]);
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
BigInteger ServerTimeStamp = new BigInteger(input[6]);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
QuoteServerType quoteCall = factory.createQuoteServerType();
quoteCall.setTimestamp(timestamp);
quoteCall.setServer(input[8]);
quoteCall.setTransactionNum(TransNumber);
quoteCall.setPrice(quote);
quoteCall.setStockSymbol(input[3]);
quoteCall.setUsername(input[2]);
quoteCall.setQuoteServerTime(ServerTimeStamp);
quoteCall.setCryptokey(input[7]);
return quoteCall;
}
public SystemEventType Log_SystemEvent(String[] input, int TransactionNumber, CommandType CMD){
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
SystemEventType SysEvent = factory.createSystemEventType();
SysEvent.setTimestamp(timestamp);
SysEvent.setServer(input[8]);
SysEvent.setTransactionNum(TransNumber);
SysEvent.setCommand(CMD);
SysEvent.setFilename(fileName);
return SysEvent;
}
public void create_Log(String[] input, int TransactionNumber, String Message, CommandType Command){
switch(Command.toString()){
case "QUOTE": //Quote_Log
QuoteServerType quote_QuoteType = Log_Quote(input,TransactionNumber);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(quote_QuoteType);
break;
case "QUOTE_CACHED":
SystemEventType Quote_Cached_SysType = Log_SystemEvent(input, TransactionNumber, CommandType.QUOTE);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(Quote_Cached_SysType);
break;
}
}
EDIT: The below is code how the objects are added to the ArrayList
public List<Object> getUserCommandOrQuoteServerOrAccountTransaction() {
if (userCommandOrQuoteServerOrAccountTransaction == null) {
userCommandOrQuoteServerOrAccountTransaction = new ArrayList<Object>();
}
return this.userCommandOrQuoteServerOrAccountTransaction;
}
JAXB is about mapping a Java object tree to an XML document or vice versa. So in principle, you need the complete object model before you can save it to XML.
Of course that is not possible for very large data, for example a DB dump, so JAXB allows marshalling the object tree in fragments, letting the user control the moment of object creation and marshalling. A typical use case is fetching records from a DB one by one and marshalling them one by one to a file, so there is no problem with the heap.
However, you are asking about appending one object tree to another (one fresh in memory, the other already represented in an XML file). That is not normally possible, as it is not really appending but creating a new object tree that contains the content of both (there is only one document root element, not two).
So what you could do is:
1. create a new XML representation with a manually initiated root element,
2. copy the existing XML content to the new XML, either using XMLStreamWriter/XMLStreamReader read/write operations or by unmarshalling the log objects and marshalling them one by one,
3. marshal your log objects into the same XML stream,
4. complete the XML with the root closing element.
Vaguely, something like this:
XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8.name());
// "manually" output the beginning of the XML document: its declaration and the root element
writer.writeStartDocument();
writer.writeStartElement("YOUR_ROOT_ELM");
Marshaller mar = ...
mar.setProperty(Marshaller.JAXB_FRAGMENT, true); // instructs JAXB to output only the objects, not a whole XML document
PartialUnmarshaler existing = ...; // allows reading the XML content of the existing file one object at a time
while (existing.hasNext()) {
YourObject obj = existing.next();
mar.marshal(obj, writer);
writer.flush();
}
List<YourObject> toAppend = ...
for (YourObject obj : toAppend) {
mar.marshal(obj, writer);
writer.flush();
}
//finishing the document, closing the root element
writer.writeEndElement();
writer.writeEndDocument();
Reading the objects one by one from a large XML file, and a complete implementation of PartialUnmarshaler, are described in this answer:
https://stackoverflow.com/a/9260039/4483840
That is the 'elegant' solution.
Less elegant is to have your threads write their log lists to individual files and then append them yourself. You only need to read and copy the header of the first file, then copy all its content apart from the last closing tag, copy the content of the other files ignoring the document opening and closing tags, and output the closing tag.
If your marshaller is set to marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
each opening/closing tag will be on its own line, so the ugly hack is to
copy all the lines from the 3rd to the one before last, then output the closing tag.
It is an ugly hack, because it is sensitive to your output format (if you, for example, change your container root element). But it is faster to implement than the full JAXB solution.
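The line-copying hack can be sketched as follows. It is deliberately naive and rests on the assumption stated above: every per-thread file was written with JAXB_FORMATTED_OUTPUT, so that line 1 is the XML declaration, line 2 the root start tag and the last line the root end tag. The class name LogFileMerger and the sample content are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, named for illustration only.
final class LogFileMerger {

    // Merges the bodies of several formatted XML files (given as lists of lines)
    // under the declaration and root element of the first file.
    static List<String> merge(List<List<String>> files) {
        List<String> merged = new ArrayList<>();
        for (List<String> lines : files) {
            if (lines.size() < 3) {
                throw new IllegalArgumentException(
                        "Expected declaration, root start tag and root end tag on separate lines");
            }
            if (merged.isEmpty()) {
                merged.add(lines.get(0)); // XML declaration, copied once
                merged.add(lines.get(1)); // root start tag, copied once
            }
            // Everything from the 3rd line up to, but excluding, the last line.
            merged.addAll(lines.subList(2, lines.size() - 1));
        }
        if (!files.isEmpty()) {
            List<String> first = files.get(0);
            merged.add(first.get(first.size() - 1)); // root end tag
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("<?xml version=\"1.0\"?>", "<log>", "  <e>1</e>", "</log>");
        List<String> b = Arrays.asList("<?xml version=\"1.0\"?>", "<log>", "  <e>2</e>", "</log>");
        merge(Arrays.asList(a, b)).forEach(System.out::println);
    }
}
```

In a real run the line lists would come from the per-thread files (for example via Files.readAllLines), and for very large files the lines should be streamed to the output rather than collected in memory, otherwise the heap problem simply moves to the merge step.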

converting xml to java objects with xstream in java

I have an XML file and I want to turn it into objects. I am using XStream for this and have added the XStream jars to my classpath.
Below is my XML:
<Eurexflows xmlns:eur="http://www.eurexchange.com/EurexIRSFullInventoryReport" xmlns:fpml="http://www.fpml.org/FpML-5/confirmation">
<EurexMessageObject>
<CCPTradeId>109599</CCPTradeId>
<novDateTime>2012-02-15 10:59:00.0</novDateTime>
</EurexMessageObject>
<EurexMessageObject>
<CCPTradeId>122270</CCPTradeId>
<novDateTime>2012-06-29 18:59:00.0</novDateTime>
</EurexMessageObject>
</Eurexflows>
below is my pojo...
public class EurexMessageObject {
private Long CCPTradeId;
private String migratedDate;
public Long getCCPTradeId() {
return CCPTradeId;
}
public void setCCPTradeId(Long cCPTradeId) {
CCPTradeId = cCPTradeId;
}
public String getMigratedDate() {
return migratedDate;
}
public void setMigratedDate(String migratedDate) {
this.migratedDate = migratedDate;
}
}
and in my main class I have coded this way..
String xmlInputtra="C:\\Rahul\\InputXml\\Xmloutput.xml";
try
{
// get XStream instance and set required aliases
XStream xstream = new XStream();
xstream.alias("EurexMessageObject", com.rbos.gdspc.eurex.EurexMessageObject.class);
// prepare cash flow message from xslt output
EurexMessageObject eurexflowMsg = (EurexMessageObject) xstream.fromXML(xmlInputtra);
System.out.println(eurexflowMsg.toString());
}catch(Exception e)
{
e.printStackTrace();
}
Now, upon debugging, I am getting the following exception. Please advise how I can overcome this:
com.thoughtworks.xstream.io.StreamException: : only whitespace content allowed before start tag and not C (position: START_DOCUMENT seen C... #1:1)
Well, the thing that is overlooked here is how you are reading in the XML file. You are calling fromXML with a String, and that method expects the actual XML content, not the file name. So it tries to parse the path string itself (which starts with "C", not with an XML start tag), and that is exactly what the exception reports.
I suggest you use a FileReader/BufferedReader to read the contents of the XML file. Something like this should work:
XStream instream = new XStream();
BufferedReader br = new BufferedReader(new FileReader("Xmloutput.xml"));
StringBuffer buff = new StringBuffer();
String line;
while((line = br.readLine()) != null){
buff.append(line);
}
EurexMessageObject eurexflowMsg = (EurexMessageObject)instream.fromXML(buff.toString());
I hope it will help you, best regards.
Here the path to the XML file:
String xmlInputtra="C:\\Rahul\\InputXml\\Xmloutput.xml";
is treated as the XML content itself,
so you need to read the file and pass its content as a String instead.
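Reading the file into a String first can be sketched with plain JDK calls. The class name XmlFileLoader and the helper readXml are invented for illustration, and UTF-8 is an assumption about the file's encoding; the resulting String is what would then be handed to xstream.fromXML in the question's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, named for illustration only.
final class XmlFileLoader {

    // Returns the content of the file as a String (UTF-8 is assumed here).
    static String readXml(Path file) {
        try {
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: XmlFileLoader <path-to-xml>");
            return;
        }
        String xml = readXml(Paths.get(args[0]));
        // This content, not the path, is what an XML parser expects to receive.
        System.out.println(xml);
    }
}
```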

Java XML creates closing xml statement but no opening

I have a program that creates an XML document.
The file name is unimportant here, because the file does get created successfully.
The ArrayList of entries contains a unique identifier and a HashMap of
elements and values. The elements are: world, name, location, type and data.
All these values are strings, and the only one that would ever be blank/null is data.
My problem is that the XML file contains all the fields as required, with the exception
of the data field. It leaves me with an unopened <data/> node. Actual result:
<NPC>
<NPC:0>
<name>
the_name
</name>
<data/> <---- this line should have the string "null"
<loc>
2529.1294962948955:
69.0:
951.2612160649056
</loc>
<type>
Quest
</type>
<world>
world
</world>
</NPC:0>
</NPC>
My method for creating the xml file.
public void updateXML(String fileName, ArrayList<XMLEntry> entries)
{
File file = getFileByName(fileName);
try {
DocumentBuilderFactory bFac = DocumentBuilderFactory.newInstance();
DocumentBuilder b = bFac.newDocumentBuilder();
Document doc = b.parse(file);
for(int i = 0; i < entries.size(); i++)
{
XMLEntry entry = entries.get(i);
Node entry_node = doc.getElementsByTagName(entry.getName()).item(0);
if(entry_node == null)
{
Element node = doc.createElement(entry.getName());
doc.getFirstChild().appendChild(node);
entry_node = doc.getElementsByTagName(entry.getName()).item(0);
}
for (Map.Entry<String, String> attributes : entry.getAttributes().entrySet())
{
NamedNodeMap xml_attributes = entry_node.getAttributes();
Node attribute = xml_attributes.getNamedItem(attributes.getKey());
if(attribute == null)
{
if(attributes.getValue() != "" || attributes.getValue() != null)
{
Element new_xml_attribute = doc.createElement(attributes.getKey());
new_xml_attribute.appendChild(doc.createTextNode(attributes.getValue()));
entry_node.appendChild(new_xml_attribute);
} else {
Element new_xml_attribute = doc.createElement(attributes.getKey());
new_xml_attribute.appendChild(doc.createTextNode("null"));
entry_node.appendChild(new_xml_attribute);
}
} else {
attribute.setTextContent(attributes.getValue());
}
TransformerFactory tFac = TransformerFactory.newInstance();
Transformer ts = tFac.newTransformer();
DOMSource src = new DOMSource(doc);
StreamResult result = new StreamResult(file);
ts.transform(src, result);
}
}
} catch (ParserConfigurationException e) {
} catch (TransformerException e1) {
} catch (IOException e2) {
} catch (SAXException e3) {
}
}
<data/> <---- this line should have the string "null"
That isn't an XML close-element tag (which would be </data>). It's an XML empty-element tag, which combines open and close into a single piece of markup. It is semantically identical to <data></data>.
Despite your expectations, it would appear that the empty <data/> element is not being created by the path with the literal "null". Drop a printout into that code, or run it in the debugger, to confirm this. Then use the debugger, or drop in additional printouts as necessary, to figure out why.
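One candidate worth checking while debugging, judging only from the code posted in the question: the test attributes.getValue() != "" || attributes.getValue() != null is true for every possible value (a null value is != "", and any non-null value is != null), so the else branch that writes "null" can never run, and a null or empty value ends up as an empty text node. The sketch below shows a null-safe variant of that check; the class name NullTextDemo and its helpers are invented for illustration.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, named for illustration only.
final class NullTextDemo {

    // Null check first, then a content-based emptiness check (not != on references).
    static String textOrNullLiteral(String value) {
        return (value == null || value.isEmpty()) ? "null" : value;
    }

    // Builds a one-element document <data>...</data> and serializes it to a String.
    static String render(String dataValue) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element data = doc.createElement("data");
            data.appendChild(doc.createTextNode(textOrNullLiteral(dataValue)));
            doc.appendChild(data);
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not build or serialize the document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(null));
        System.out.println(render("some data"));
    }
}
```

Whether this is the whole story for the original program still needs the debugger session suggested above, since the empty catch blocks in the question would also hide any exception thrown along the way.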
