I am trying to use JAXB to marshal objects I create to an XML file. What I want is to create a list and print it to the file, then create a new list and print it to the same file, but every time I do, it overwrites the first. I want the final XML file to look as if I had only one big list of objects. I would build that one list, but there are so many objects that I quickly max out my heap size.
So, my main creates a bunch of threads, each of which iterates through a list of objects it receives and calls create_Log on each object. Once it is finished it calls printToFile, which is where it marshals the list to the file.
public class LogThread implements Runnable {
//private Thread myThread;
private Log_Message message = null;
private LinkedList<Log_Message> lmList = null;
LogServer Log = null;
private String Username = null;
public LogThread(LinkedList<Log_Message> lmList){
this.lmList = lmList;
}
public void run(){
//System.out.println("thread running");
LogServer Log = new LogServer();
//create iterator for list
final ListIterator<Log_Message> listIterator = lmList.listIterator();
while(listIterator.hasNext()){
message = listIterator.next();
CountTrans.addTransNumber(message.TransactionNumber);
Username = message.input[2];
Log.create_Log(message.input, message.TransactionNumber, message.Message, message.CMD);
}
Log.printToFile();
init_LogServer.threadCount--;
init_LogServer.doneList();
init_LogServer.doneUser();
System.out.println("Thread "+ Thread.currentThread().getId() +" Completed user: "+ Username+"... Number of Users Complete: " + init_LogServer.getUsersComplete());
//Thread.interrupt();
}
}
The above calls the function create_Log below to build a new object from the classes I generated from the XSD I was given (SystemEventType, QuoteServerType, etc.). These objects are all added to an ArrayList using the function below and attached to the Root object. Once the LogThread loop is finished it calls printToFile, which takes the list from the Root object and marshals it to the file, overwriting what was already there. How can I add to the same file without overwriting it and without creating one master list in the heap?
public class LogServer {
public log Root = null;
public static String fileName = "LogFile.xml";
public static File XMLfile = new File(fileName);
public LogServer(){
this.Root = new log();
}
//output LogFile.xml
public synchronized void printToFile(){
System.out.println("Printing XML");
//write to xml file
try {
init_LogServer.marshaller.marshal(Root,XMLfile);
} catch (JAXBException e) {
e.printStackTrace();
}
System.out.println("Done Printing XML");
}
private BigDecimal ConvertStringtoBD(String input){
DecimalFormatSymbols symbols = new DecimalFormatSymbols();
symbols.setGroupingSeparator(',');
symbols.setDecimalSeparator('.');
String pattern = "#,##0.0#";
DecimalFormat decimalFormat = new DecimalFormat(pattern, symbols);
decimalFormat.setParseBigDecimal(true);
// parse the string
BigDecimal bigDecimal = new BigDecimal("0");
try {
bigDecimal = (BigDecimal) decimalFormat.parse(input);
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return bigDecimal;
}
public QuoteServerType Log_Quote(String[] input, int TransactionNumber){
BigDecimal quote = ConvertStringtoBD(input[4]);
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
BigInteger ServerTimeStamp = new BigInteger(input[6]);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
QuoteServerType quoteCall = factory.createQuoteServerType();
quoteCall.setTimestamp(timestamp);
quoteCall.setServer(input[8]);
quoteCall.setTransactionNum(TransNumber);
quoteCall.setPrice(quote);
quoteCall.setStockSymbol(input[3]);
quoteCall.setUsername(input[2]);
quoteCall.setQuoteServerTime(ServerTimeStamp);
quoteCall.setCryptokey(input[7]);
return quoteCall;
}
public SystemEventType Log_SystemEvent(String[] input, int TransactionNumber, CommandType CMD){
BigInteger TransNumber = BigInteger.valueOf(TransactionNumber);
Date date = new Date();
long timestamp = date.getTime();
ObjectFactory factory = new ObjectFactory();
SystemEventType SysEvent = factory.createSystemEventType();
SysEvent.setTimestamp(timestamp);
SysEvent.setServer(input[8]);
SysEvent.setTransactionNum(TransNumber);
SysEvent.setCommand(CMD);
SysEvent.setFilename(fileName);
return SysEvent;
}
public void create_Log(String[] input, int TransactionNumber, String Message, CommandType Command){
switch(Command.toString()){
case "QUOTE": //Quote_Log
QuoteServerType quote_QuoteType = Log_Quote(input,TransactionNumber);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(quote_QuoteType);
break;
case "QUOTE_CACHED":
SystemEventType Quote_Cached_SysType = Log_SystemEvent(input, TransactionNumber, CommandType.QUOTE);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(Quote_Cached_SysType);
break;
}
}
EDIT: The code below shows how the objects are added to the ArrayList:
public List<Object> getUserCommandOrQuoteServerOrAccountTransaction() {
if (userCommandOrQuoteServerOrAccountTransaction == null) {
userCommandOrQuoteServerOrAccountTransaction = new ArrayList<Object>();
}
return this.userCommandOrQuoteServerOrAccountTransaction;
}
JAXB is about mapping a Java object tree to an XML document and vice versa. So in principle you need the complete object model before you can save it to XML.
Of course that is not possible for very large data, for example a DB dump, so JAXB allows marshalling the object tree in fragments, letting the user control the moment of object creation and marshalling. A typical use case is fetching records from a DB one by one and marshalling them one by one to a file, so there is no problem with the heap.
However, you are asking about appending one object tree to another (one fresh in memory, the other already represented in an XML file). That is not normally possible, because it is not really appending but creating a new object tree that contains the content of both (there is only one document root element, not two).
So what you could do is:
1. create a new XML representation with a manually initiated root element,
2. copy the existing XML content to the new XML, either using XMLStreamWriter/XMLStreamReader read/write operations or by unmarshalling the log objects and marshalling them one by one,
3. marshal your log objects into the same XML stream,
4. complete the XML with the root closing element.
Vaguely, something like this:
XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(new FileOutputStream(...), StandardCharsets.UTF_8.name());
// "manually" output the beginning of the XML document: its declaration and the root element
writer.writeStartDocument();
writer.writeStartElement("YOUR_ROOT_ELM");
Marshaller mar = ...
mar.setProperty(Marshaller.JAXB_FRAGMENT, true); // instructs JAXB to output only the objects, not a whole XML document
PartialUnmarshaler existing = ...; // allows reading the XML content of the existing file one object at a time
while (existing.hasNext()) {
    YourObject obj = existing.next();
    mar.marshal(obj, writer);
    writer.flush();
}
List<YourObject> toAppend = ...
for (YourObject obj : toAppend) {
    mar.marshal(obj, writer);
    writer.flush();
}
// finish the document by closing the root element
writer.writeEndElement();
writer.writeEndDocument();
writer.close();
Reading the objects one by one from a large XML file, together with a complete implementation of PartialUnmarshaler, is described in this answer:
https://stackoverflow.com/a/9260039/4483840
That is the 'elegant' solution.
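The same copy-then-append idea can also be sketched with nothing but the JDK's StAX event API. This is only a minimal sketch under stated assumptions, not the JAXB version: the element names `log` and `entry` and the class `XmlAppender` are hypothetical, and the new records are written by hand at the point where the real code would call `mar.marshal(obj, writer)` with `JAXB_FRAGMENT` set.

```java
import java.io.Reader;
import java.io.Writer;
import java.util.List;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: streams an existing document into a new one and
// appends extra child elements before the root is closed.
class XmlAppender {
    static void append(Reader existing, Writer out, String entryName, List<String> newEntries)
            throws XMLStreamException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(existing);
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        int depth = 0;
        XMLEvent rootEnd = null;
        // Copy every event up to, but not including, the root's closing tag.
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
            } else if (event.isEndElement() && --depth == 0) {
                rootEnd = event; // hold it back until the new entries are written
                break;
            }
            writer.add(event);
        }
        // Append the new records; one element with text content stands in for a marshalled object.
        for (String text : newEntries) {
            writer.add(events.createStartElement("", "", entryName));
            writer.add(events.createCharacters(text));
            writer.add(events.createEndElement("", "", entryName));
        }
        if (rootEnd != null) {
            writer.add(rootEnd);
        }
        writer.add(events.createEndDocument());
        writer.close();
        reader.close();
    }
}
```

Only one event is held in memory at a time, so the size of the existing file does not matter. In practice the output would go to a temporary file that replaces the original once the copy has finished.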
Less elegant is to have your threads write their log lists to individual files and then append them yourself. You only need to read and copy the header of the first file, then copy all its content apart from the last closing tag, copy the content of the other files ignoring the document opening and closing tags, and finally output the closing tag.
If your marshaller is set to marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
each opening/closing tag will be on a separate line, so the ugly hack is to
copy all the lines from the 3rd to the one before last, then output the closing tag.
It is an ugly hack because it is sensitive to your output format (for example, if you change your container root element). But it is faster to implement than the full JAXB solution.
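The line-based hack described above could look roughly like this. It is a sketch only, and it assumes that every part file has the XML declaration on line 1, the root opening tag alone on line 2 and the root closing tag alone on the last line; the class name `LineMerger` is made up for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: concatenates formatted XML part files line by line.
class LineMerger {
    static void merge(List<Path> parts, Path target) throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            String closingTag = null;
            for (int i = 0; i < parts.size(); i++) {
                try (BufferedReader in = Files.newBufferedReader(parts.get(i), StandardCharsets.UTF_8)) {
                    String held = null; // one-line lookahead so the last line is never written here
                    String line;
                    int lineNo = 0;
                    while ((line = in.readLine()) != null) {
                        lineNo++;
                        if (held != null) {
                            out.write(held);
                            out.newLine();
                        }
                        // Keep the two header lines of the first file only.
                        held = (i == 0 || lineNo > 2) ? line : null;
                    }
                    closingTag = held; // last line of the file: the root closing tag
                }
            }
            if (closingTag != null) {
                out.write(closingTag);
                out.newLine();
            }
        }
    }
}
```

Each file is streamed with a single line of lookahead, so nothing larger than one line is ever held on the heap.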
I am trying to take a very long file of strings and convert it to an XML file according to a schema I was given. I used JAXB to create classes from that schema. Since the file is very large I created a thread pool to improve performance, but since then each thread only processes one line of the file and marshals it to the XML file.
Below is my home class, where I read from the file. Each line is a record of a transaction. For every new user encountered, a list is made to store all of that user's transactions, and each list is put into a HashMap. I made it a ConcurrentHashMap because multiple threads will work on the map simultaneously. Is this the correct thing to do?
After the lists are created a thread is made for each user. Each thread runs the method ProcessCommands below and receives from home the list of transactions for its user.
public class home{
public static File XMLFile = new File("LogFile.xml");
Map<String,List<String>> UserMap= new ConcurrentHashMap<String,List<String>>();
String[] UserNames = new String[5000];
int numberOfUsers = 0;
try{
BufferedReader reader = new BufferedReader(new FileReader("test.txt"));
String line;
while ((line = reader.readLine()) != null)
{
parsed = line.split(",|\\s+");
if(!parsed[2].equals("./testLOG")){
if(Utilities.checkUserExists(parsed[2], UserNames) == false){ //User does not already exist
System.out.println("New User: " + parsed[2]);
UserMap.put(parsed[2],new ArrayList<String>()); //Create list of transactions for new user
UserMap.get(parsed[2]).add(line); //Add First Item to new list
UserNames[numberOfUsers] = parsed[2]; //Add new user
numberOfUsers++;
}
else{ //User Already Existed
UserMap.get(parsed[2]).add(line);
}
}
}
reader.close();
} catch (IOException x) {
System.err.println(x);
}
//get start time
long startTime = new Date().getTime();
tCount = numberOfUsers;
ExecutorService threadPool = Executors.newFixedThreadPool(tCount);
for(int i = 0; i < numberOfUsers; i++){
System.out.println("Starting Thread " + i + " for user " + UserNames[i]);
Runnable worker = new ProcessCommands(UserMap.get(UserNames[i]),UserNames[i], XMLfile);
threadPool.execute(worker);
}
threadPool.shutdown();
while(!threadPool.isTerminated()){
}
System.out.println("Finished all threads");
}
Here is the ProcessCommands class. The thread receives the list for its user and creates a marshaller. From what I understand, marshalling is not thread-safe, so it is best to create one marshaller per thread. Is this the best way to do that?
When I create the marshallers, I know that each one (one per thread) will want to access the created file, causing conflicts. I used synchronized; is that correct?
As the thread iterates through its list, each line calls for a certain case. There are a lot, so I just made pseudo-cases for clarity. Each case calls the function below.
public class ProcessCommands implements Runnable{
private static final boolean DEBUG = false;
private List<String> list = null;
private String threadName;
private File XMLfile = null;
public Thread myThread;
public ProcessCommands(List<String> list, String threadName, File XMLfile){
this.list = list;
this.threadName = threadName;
this.XMLfile = XMLfile;
}
public void run(){
Date start = null;
int transactionNumber = 0;
String[] parsed = new String[8];
String[] quoteParsed = null;
String[] universalFormatCommand = new String[9];
String userCommand = null;
Connection connection = null;
Statement stmt = null;
Map<String, UserObject> usersMap = null;
Map<String, Stack<BLO>> buyMap = null;
Map<String, Stack<SLO>> sellMap = null;
Map<String, QLO> stockCodeMap = null;
Map<String, BTO> buyTriggerMap = null;
Map<String, STO> sellTriggerMap = null;
Map<String, USO> usersStocksMap = null;
String SQL = null;
int amountToAdd = 0;
int tempDollars = 0;
UserObject tempUO = null;
BLO tempBLO = null;
SLO tempSLO = null;
Stack<BLO> tempStBLO = null;
Stack<SLO> tempStSLO = null;
BTO tempBTO = null;
STO tempSTO = null;
USO tempUSO = null;
QLO tempQLO = null;
String stockCode = null;
String quoteResponse = null;
int usersDollars = 0;
int dollarAmountToBuy = 0;
int dollarAmountToSell = 0;
int numberOfSharesToBuy = 0;
int numberOfSharesToSell = 0;
int quoteStockInDollars = 0;
int shares = 0;
Iterator<String> itr = null;
int transactionCount = list.size();
System.out.println("Starting "+threadName+" - listSize = "+transactionCount);
//UO dollars, reserved
usersMap = new HashMap<String, UserObject>(3); //userName -> UO
//USO shares
usersStocksMap = new HashMap<String, USO>(); //userName+stockCode -> shares
//BLO code, timestamp, dollarAmountToBuy, stockPriceInDollars
buyMap = new HashMap<String, Stack<BLO>>(); //userName -> Stack<BLO>
//SLO code, timestamp, dollarAmountToSell, stockPriceInDollars
sellMap = new HashMap<String, Stack<SLO>>(); //userName -> Stack<SLO>
//BTO code, timestamp, dollarAmountToBuy, stockPriceInDollars
buyTriggerMap = new ConcurrentHashMap<String, BTO>(); //userName+stockCode -> BTO
//STO code, timestamp, dollarAmountToBuy, stockPriceInDollars
sellTriggerMap = new HashMap<String, STO>(); //userName+stockCode -> STO
//QLO timestamp, stockPriceInDollars
stockCodeMap = new HashMap<String, QLO>(); //stockCode -> QLO
//create user object and initialize stacks
usersMap.put(threadName, new UserObject(0, 0));
buyMap.put(threadName, new Stack<BLO>());
sellMap.put(threadName, new Stack<SLO>());
try {
//Marshaller marshaller = getMarshaller();
synchronized (this){
Marshaller marshaller = init.jc.createMarshaller();
marshaller.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, true);
marshaller.setProperty(Marshaller.JAXB_FRAGMENT, true);
marshaller.marshal(LogServer.Root,XMLfile);
marshaller.marshal(LogServer.Root,System.out);
}
} catch (JAXBException M) {
M.printStackTrace();
}
Date timing = new Date();
//universalFormatCommand = new String[8];
parsed = new String[8];
//iterate through workload file
itr = this.list.iterator();
while(itr.hasNext()){
userCommand = (String) itr.next();
itr.remove();
parsed = userCommand.split(",|\\s+");
transactionNumber = Integer.parseInt(parsed[0].replaceAll("\\[", "").replaceAll("\\]", ""));
universalFormatCommand = Utilities.FormatCommand(parsed, parsed[0]);
if(transactionNumber % 100 == 0){
System.out.println(this.threadName + " - " +transactionNumber+ " - "+(new Date().getTime() - timing.getTime())/1000);
}
/*System.out.print("UserCommand " +transactionNumber + ": ");
for(int i = 0;i<8;i++)System.out.print(universalFormatCommand[i]+ " ");
System.out.print("\n");*/
//switch for user command
switch (parsed[1].toLowerCase()) {
case "one":
// ... do stuff ...
LogServer.create_Log(universalFormatCommand, transactionNumber, CommandType.ADD);
break;
case "two":
// ... do stuff ...
LogServer.create_Log(universalFormatCommand, transactionNumber, CommandType.ADD);
break;
}
}
}
The function create_Log has multiple cases, so as before I left just one for clarity. The case "QUOTE" only calls one object creation function, but other cases can create multiple objects. The type 'log' is a complex XML type that defines all the other object types, so I create a log instance called Root that create_Log adds to. The class 'log' generated by JAXB includes a function to create a list of objects. The statement:
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(quote_QuoteType);
takes the root element I created, creates the list if needed, and adds the newly created object 'quote_QuoteType' to that list. Before I added threading, this method successfully created a list of as many objects as I wanted and then marshalled them. So I'm pretty sure the code in class 'LogServer' is not the issue. It is something to do with the marshalling and synchronization in the ProcessCommands class above.
public class LogServer{
public static log Root = new log();
public static QuoteServerType Log_Quote(String[] input, int TransactionNumber){
ObjectFactory factory = new ObjectFactory();
QuoteServerType quoteCall = factory.createQuoteServerType();
**Populate the QuoteServerType object called quoteCall**
return quoteCall;
}
public static void create_Log(String[] input, int TransactionNumber, CommandType Command){
System.out.print("TRANSACTION "+TransactionNumber + " is " + Command + ": ");
for(int i = 0; i<input.length;i++) System.out.print(input[i] + " ");
System.out.print("\n");
switch(input[1]){
case "QUOTE":
System.out.print("QUOTE CASE");
QuoteServerType quote_QuoteType = Log_Quote(input,TransactionNumber);
Root.getUserCommandOrQuoteServerOrAccountTransaction().add(quote_QuoteType);
break;
}
}
So you wrote a lot of code, but have you tried whether it actually works? After a quick look I doubt it. You should test your code logic part by part rather than going all the way to the end. It seems you are just starting with Java; I would recommend practising first on simple single-threaded applications. Sorry if I sound harsh, but I will try to be constructive as well:
By convention, class names start with a capital letter and variable names with a lowercase letter; you do it the other way round.
You should put the code in a method of your home (Home) class rather than placing it all in a static block.
You are reading the whole file into memory; you do not process it line by line. After Home is initialized, literally the whole content of the file will be held in the UserMap variable. If the file is really large you will run out of heap memory. If you expect a large file then you cannot do this, and you have to redesign your app to store partial results somewhere. If your file is smaller than memory you could keep it like that (but you said it is large).
No need for UserNames; UserMap.containsKey will do the job.
Your thread pool size should be in the range of your core count, not the number of users, or you will get thread thrashing (if you have blocking operations in your code make tCount = 2*processors; if not, keep it at the number of processors). Once one ProcessCommands finishes, the executor will start another until all are done, and you will be using all your processor cores efficiently.
DO NOT use while(!threadPool.isTerminated()); this loop will completely consume one processor as it checks constantly. Call awaitTermination instead.
Your ProcessCommands has a few map variables which will only ever have one entry because, as you said, each instance processes data from one user.
The synchronized(this) in ProcessCommands will not work, as each thread synchronizes on a different object (a different instance of ProcessCommands).
I believe creating a marshaller is thread-safe (check it), so there is no need for synchronization at all.
You save your log (whatever it is) before you do the actual processing of the transaction lists.
The marshalling will overwrite the content of the file with the current state of LogServer.Root. If it is shared between your ProcessCommands instances (it seems so), what is the point of saving it in each thread? Do it once, when you are finished.
You don't need itr.remove();
The log class (for the Root variable!) needs to be thread-safe, as all the threads will call operations on it (so the list inside the log class must be a concurrent list, etc.).
And so on...
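Several of the points above (no UserNames array, a pool sized to the core count, awaitTermination instead of a busy loop, a thread-safe shared list) can be sketched together. This is only an illustration under assumptions: the class `UserBatches`, the line format `[n],COMMAND,user,...` and the string written to the shared list are made up, and the lambda body stands in for the real create_Log call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the grouping and thread-pool advice.
class UserBatches {
    /** Groups lines by the user name in the third token; the map itself tracks which users exist. */
    static Map<String, List<String>> groupByUser(List<String> lines) {
        Map<String, List<String>> byUser = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parsed = line.split(",|\\s+");
            byUser.computeIfAbsent(parsed[2], user -> new ArrayList<>()).add(line);
        }
        return byUser;
    }

    /** Runs one task per user on a pool sized to the CPU count and returns the shared log. */
    static List<String> processAll(Map<String, List<String>> byUser) throws InterruptedException {
        // Shared by every worker, so it has to be thread-safe.
        List<String> sharedLog = Collections.synchronizedList(new ArrayList<>());
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        for (Map.Entry<String, List<String>> entry : byUser.entrySet()) {
            pool.execute(() -> {
                for (String line : entry.getValue()) {
                    sharedLog.add(entry.getKey() + ": " + line); // stand-in for create_Log
                }
            });
        }
        pool.shutdown();
        // Blocks until the tasks finish instead of spinning on isTerminated().
        if (!pool.awaitTermination(1, TimeUnit.HOURS)) {
            pool.shutdownNow();
        }
        return sharedLog; // marshal once, here, after all workers are done
    }
}
```

The grouping here is done on a single thread before any worker starts, so a plain map is enough for that step; only the list that the workers write to needs to be synchronized.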
I would recommend that you:
Start with a simple single-threaded version that actually works.
Deal with processing line by line (store the results for each user in a different file; you can keep a cache of transactions for recently used users so as not to write to the disk all the time, see Guava cache).
Process each user's transactions into your user log objects on multiple threads (again, if there are a lot you have to save them to the disk rather than keep everything in memory).
Write code that combines the logs from different users into one (again, you may want to do it multithreaded, though it will be mostly IO operations, so there is not much gain and it is trickier to do).
Good luck
I have an XML file and I want to convert it to objects. I am using XStream for this and I have added the XStream jars to my classpath.
Below is my XML:
<Eurexflows xmlns:eur="http://www.eurexchange.com/EurexIRSFullInventoryReport" xmlns:fpml="http://www.fpml.org/FpML-5/confirmation">
<EurexMessageObject>
<CCPTradeId>109599</CCPTradeId>
<novDateTime>2012-02-15 10:59:00.0</novDateTime>
</EurexMessageObject>
<EurexMessageObject>
<CCPTradeId>122270</CCPTradeId>
<novDateTime>2012-06-29 18:59:00.0</novDateTime>
</EurexMessageObject>
</Eurexflows>
below is my pojo...
public class EurexMessageObject {
private Long CCPTradeId;
private String migratedDate;
public Long getCCPTradeId() {
return CCPTradeId;
}
public void setCCPTradeId(Long cCPTradeId) {
CCPTradeId = cCPTradeId;
}
public String getMigratedDate() {
return migratedDate;
}
public void setMigratedDate(String migratedDate) {
this.migratedDate = migratedDate;
}
}
and in my main class I have coded this way..
String xmlInputtra="C:\\Rahul\\InputXml\\Xmloutput.xml";
try
{
// get XStream instance and set required aliases
XStream xstream = new XStream();
xstream.alias("EurexMessageObject", com.rbos.gdspc.eurex.EurexMessageObject.class);
// prepare cash flow message from xslt output
EurexMessageObject eurexflowMsg = (EurexMessageObject) xstream.fromXML(xmlInputtra);
System.out.println(eurexflowMsg.toString());
}catch(Exception e)
{
e.printStackTrace();
}
Now, upon debugging, I am getting the following exception. Please advise how I can overcome this:
com.thoughtworks.xstream.io.StreamException: : only whitespace content allowed before start tag and not C (position: START_DOCUMENT seen C... #1:1)
Well, the thing that is overlooked here is how you are reading in the XML file. You are using the method fromXML, which expects the actual XML content and not the file name. So it tries to parse the path string itself (which starts with "C:", hence "not C" in the error message) rather than the contents of the file.
I suggest you use a FileReader/BufferedReader to get the contents of the XML file first. Something like this should work:
XStream instream = new XStream();
BufferedReader br = new BufferedReader(new FileReader("Xmloutput.xml"));
StringBuffer buff = new StringBuffer();
String line;
while((line = br.readLine()) != null){
buff.append(line);
}
EurexMessageObject eurexflowMsg = (EurexMessageObject)instream.fromXML(buff.toString());
I hope it will help you, best regards.
Here the path to the XML file:
String xmlInputtra="C:\\Rahul\\InputXml\\Xmloutput.xml";
is treated as the XML content itself,
so you need to read the file and pass its contents as a String to fromXML instead.
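To make the difference between the path and the content concrete, here is a small JDK-only sketch. The helper class `XmlFileText` is hypothetical and the file is assumed to be UTF-8; the returned string is what would then be handed to `xstream.fromXML(...)`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: turns a file path into the XML text stored in that file.
class XmlFileText {
    static String read(Path xmlFile) throws IOException {
        // The path itself ("C:\\Rahul\\...") starts with 'C', which is what the parser complained about.
        // The file content starts with '<', which is what the parser expects.
        return new String(Files.readAllBytes(xmlFile), StandardCharsets.UTF_8);
    }
}
```

Reading the whole file into one string is fine for small documents like the one in the question; for large files a Reader or InputStream passed straight to the parser avoids the extra copy.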
I want to know if it is possible to parse only some elements from an XML file into a Java object.
I don't want to create all the fields that are in the XML.
So, how can I do this?
For example, below is an XML file, and I want only the data inside the tag .
<emit>
<CNPJ>1109</CNPJ>
<xNome>OESTE</xNome>
<xFant>ABATEDOURO</xFant>
<enderEmit>
<xLgr>RODOVIA</xLgr>
<nro>S/N</nro>
<xCpl>402</xCpl>
<xBairro>GOMES</xBairro>
<cMun>314</cMun>
<xMun>MINAS</xMun>
<UF>MG</UF>
<CEP>35661470</CEP>
<cPais>58</cPais>
<xPais>Brasil</xPais>
<fone>03</fone>
</enderEmit>
<IE>20659</IE>
<CRT>3</CRT>
For Java XML parsing where you don't have the XSD and don't want to create a complete object graph to represent the XML, JDOM is a great tool. It allows you to easily walk the XML tree and pick the elements you are interested in.
Here's some sample code that uses JDOM to pick arbitrary values from the XML doc:
// reading can be done using any of the two 'DOM' or 'SAX' parser
// we have used saxBuilder object here
// please note that this saxBuilder is not internal sax from jdk
SAXBuilder saxBuilder = new SAXBuilder();
// obtain file object
File file = new File("/tmp/emit.xml");
try {
// converted file to document object
Document document = saxBuilder.build(file);
//You don't need this or the ns parameters in getChild()
//if your XML document has no namespace
Namespace ns = Namespace.getNamespace("http://www.example.com/namespace");
// get root node from xml. emit in your sample doc?
Element rootNode = document.getRootElement();
//getChild() assumes one and only one enderEmit element. Add error
//checking as needed for your document
Element enderEmitElement = rootNode.getChild("enderEmit", ns);
//now we get two of its children
Element xCplElement = enderEmitElement.getChild("xCpl", ns);
//should be 402 in your example
String xCplValue = xCplElement.getText();
System.out.println("xCpl: " + xCplValue);
Element cMunElement = enderEmitElement.getChild("cMun", ns);
//should be 314 in your example
String cMunValue = cMunElement.getText();
System.out.println("cMun: " + cMunValue);
} catch (JDOMException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
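If adding JDOM to the project is not an option, the same selective read can be sketched with the XPath support that ships with the JDK (javax.xml.xpath). Note that this swaps JDOM for a different API; the class name `EmitReader` is hypothetical and the document is assumed to have no namespace.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates one XPath expression against an XML string.
class EmitReader {
    static String valueOf(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Returns the text of the first matching node, or "" when nothing matches.
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }
}
```

With a trimmed copy of the sample document, `EmitReader.valueOf(xml, "/emit/enderEmit/xCpl")` picks out just that one value, and the elements you are not interested in never need a field of their own.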
You can use JAXB to unmarshal the XML into a Java object, from which you can read selected elements easily. With JAXB, the given XML can be represented in Java as follows:
enderEmit element:
@XmlRootElement
@XmlAccessorType(XmlAccessType.FIELD)
public class EnderEmit {
private String xLgr;
// Other elements. Define fields only for the elements that you want to load
}
emit element (this represents your XML file):
@XmlRootElement(name = "emit")
@XmlAccessorType(XmlAccessType.FIELD)
public class Emit {
@XmlElement(name = "CNPJ")
private String cnpj;
@XmlElement(name = "xNome")
private String xnom;
private EnderEmit enderEmit;
// Add the other elements that you want to load
}
Now, using the lines of code below, you can read your XML into an object:
String filePath="filePath";
File file = new File(filePath);
JAXBContext jaxbContext = JAXBContext.newInstance(Emit.class);
Unmarshaller jaxbUnmarshaller = jaxbContext.createUnmarshaller();
Emit emit = (Emit) jaxbUnmarshaller.unmarshal(file);
The last line gives you an Emit object for the given XML.
Try StringUtils.substringBetween (from Apache Commons Lang):
BufferedReader br = null;
try
{
String input = "";
br = new BufferedReader(new FileReader(FILEPATH));
String result = null;
while ((input = br.readLine()) != null) // read the file line by line
{
result = StringUtils.substringBetween(input, ">", "<"); // text between the opening and closing tag on this line
if(result != null) // null when the line has no such tag pair, so skip it
{
System.out.println(""+result);
}
}
}
catch (IOException e)
{
e.printStackTrace();
}
finally
{
try
{
if (br != null)
{
br.close();
}
}
catch (IOException ex)
{
ex.printStackTrace();
}
}
I have the following XML file:
<?xml version="1.0" encoding="UTF-8"?>
<BatchOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<BatchHeader>
<ServiceProvider>123456789</ServiceProvider>
</BatchHeader>
<OrderDetails>
<MessageType>HelloWorld</MessageType>
<IssueDateTime>22/01/2012 00:00:00</IssueDateTime>
<receivedDateTime>22/01/2012 00:00:00</receivedDateTime>
<Status>TestStatus</Status>
</OrderDetails>
</BatchOrders>
I want to read in the contents and set them on fields I have created. So I have the code below (note that some is omitted; I have just included what I think I need to show). It is in a test class which I have created. I also have a writer as part of this class that writes an XML file to disk as I expect. The problem I am facing is reading the file above and, just for now, displaying the contents on the console.
File myFileRead = null;
FileReader myFileReader = null;
try {
myFileRead = new File("C:/Path/myfile.xml");
myRecord = new myRecord();
myFileReader = new FileReader(myFileRead);
myXPathReader reader = new myXPathReader(myFileReader);
while (reader.hasNext())
{
record = reader.next();
//then prints the record out to the console
}
So from above I have the myRecord class where I have the getters/setters for e.g ServiceProvider, etc. I also then have a class for myXpathReader which does the following:
private Document document;
private List batchorders;
private Iterator iterator;
public myXPathReader (Reader myFileReader)
throws Exception
{
SAXBuilder builder = new SAXBuilder();
document = builder.build(myFileReader);
batchorders = new JDOMXPath("//BatchOrders").selectNodes(document);
iterator = batchorders.iterator();
}
public int getSize() { return batchorders.size(); }
public boolean hasNext() { return iterator.hasNext(); }
public myRecord next()
throws Exception {
Element element = (Element) iterator.next();
myRecord record = new myRecord();
record.setServiceProvider((new JDOMXPath("./ServiceProvider").stringValueOf(element)));
//Some more sets, then close the class etc...
Now if I debug the code, after the element is assigned from iterator.next() I can see the file contents have been read in correctly. But on my console the ServiceProvider value, and in fact all the values, are set to the empty string "". Am I doing something incorrect with the JDOMXPath in order to pull the value from the XML?
In your example XML ServiceProvider is not a child of BatchOrders, there's another level (BatchHeader) in between. So your second XPath expression should probably be
BatchHeader/ServiceProvider
instead of ./ServiceProvider
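The effect of the missing level can be reproduced with the XPath implementation built into the JDK. This sketch uses javax.xml.xpath rather than JDOMXPath, so it is an analogy and not the asker's exact setup; the class name `RelativePathDemo` is hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo: evaluates a relative path with BatchOrders as the context node.
class RelativePathDemo {
    static String fromBatchOrders(String xml, String relativePath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        Node batchOrders = (Node) xpath.evaluate("//BatchOrders", doc, XPathConstants.NODE);
        // A relative path is resolved against the context node, so it only sees direct children
        // unless intermediate steps are spelled out.
        return xpath.evaluate(relativePath, batchOrders);
    }
}
```

Against the sample document, "./ServiceProvider" selects nothing and yields an empty string, while "BatchHeader/ServiceProvider" yields 123456789.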
I have a 200 MB XML file of the following form:
<school name = "some school">
<class standard = "2A">
<student>
.....
</student>
<student>
.....
</student>
<student>
.....
</student>
</class>
</school>
I need to split this xml into several files using StAX such that n students come under each xml file and the structure is preserved as <school> then <class> and <students> under them. The attributes of School and class also must be preserved in the resultant xmls.
Here is the code I am using:
XMLInputFactory inputFactory = XMLInputFactory.newInstance();
String xmlFile = "input.XML";
XMLEventReader reader = inputFactory.createXMLEventReader(new FileReader(xmlFile));
XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
outputFactory.setProperty("javax.xml.stream.isRepairingNamespaces", Boolean.TRUE);
XMLEventWriter writer = null;
int count = 0;
QName name = new QName(null, "student");
try {
while (true) {
XMLEvent event = reader.nextEvent();
if (event.isStartElement()) {
StartElement element = event.asStartElement();
if (element.getName().equals(name)) {
String filename = "input"+ count + ".xml";
writer = outputFactory.createXMLEventWriter(new FileWriter(filename));
writeToFile(reader, event, writer);
writer.close();
count++;
}
}
if (event.isEndDocument())
break;
}
} catch (XMLStreamException e) {
throw e;
} catch (IOException e) {
e.printStackTrace();
} finally {
reader.close();
}
private static void writeToFile(XMLEventReader reader, XMLEvent startEvent, XMLEventWriter writer) throws XMLStreamException, IOException {
StartElement element = startEvent.asStartElement();
QName name = element.getName();
int stack = 1;
writer.add(element);
while (true) {
XMLEvent event = reader.nextEvent();
if (event.isStartElement() && event.asStartElement().getName().equals(name))
stack++;
if (event.isEndElement()) {
EndElement end = event.asEndElement();
if (end.getName().equals(name)) {
stack--;
if (stack == 0) {
writer.add(event);
break;
}
}
}
writer.add(event);
}
}
Please check the function call writeToFile(reader, event, writer) in the try block. At that point the reader only delivers the student tag. I need each output to contain the school, the class, and then n students, so that every generated file has the same structure as the original, only with fewer children per file.
Thanks in advance.
I think you can keep track of list of parent events prior to the "student" start element event and pass it to the writeToFile() method. Then in the writeToFile() method you can use that list to simulate the "school" and "class" events.
You have code for determining when to start a new file which I haven't examined closely, but the process of finishing one file and starting the next is definitely incomplete.
On reaching a point where you want to end a file, you have to generate end events for the enclosing <class> and <school> tags and for the document before closing it. When you start your new file, you need to generate start events for the same after opening it and before starting again to copy student events.
In order to generate the start events properly, you will have to retain the corresponding events from the input.
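A minimal sketch of that approach follows. It retains the start events of the enclosing elements on a stack, replays them at the top of every output file, and generates matching end events before closing it. The class `StudentSplitter` and its `Sink` interface are hypothetical, and the sketch assumes that `student` elements are the only content worth copying below `class`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

class StudentSplitter {
    /** Hypothetical callback that opens the output for file number 'index'. */
    interface Sink { Writer open(int index) throws IOException; }

    private final XMLEventFactory events = XMLEventFactory.newInstance();
    private final Deque<StartElement> ancestors = new ArrayDeque<>(); // innermost first
    private final Sink sink;
    private final int perFile;
    private XMLEventWriter writer;
    private Writer out;
    private int files;
    private int inCurrent;

    StudentSplitter(Sink sink, int perFile) { this.sink = sink; this.perFile = perFile; }

    /** Splits the input and returns the number of files written. */
    int split(Reader input) throws XMLStreamException, IOException {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(input);
        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                StartElement start = event.asStartElement();
                if (start.getName().getLocalPart().equals("student")) {
                    copyStudent(reader, start);
                } else {
                    ancestors.push(start); // retained so it can be replayed, attributes included
                }
            } else if (event.isEndElement()) {
                finishFile(); // an enclosing element ends, so the current file must end too
                ancestors.pop();
            }
        }
        reader.close();
        return files;
    }

    private void copyStudent(XMLEventReader reader, StartElement start)
            throws XMLStreamException, IOException {
        if (writer == null) startFile();
        writer.add(start);
        for (int depth = 1; depth > 0; ) { // copy the whole student subtree
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) depth++;
            else if (event.isEndElement()) depth--;
            writer.add(event);
        }
        if (++inCurrent == perFile) finishFile();
    }

    private void startFile() throws XMLStreamException, IOException {
        out = sink.open(files++);
        writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        inCurrent = 0;
        writer.add(events.createStartDocument());
        Iterator<StartElement> outermostFirst = ancestors.descendingIterator();
        while (outermostFirst.hasNext()) writer.add(outermostFirst.next());
    }

    private void finishFile() throws XMLStreamException, IOException {
        if (writer == null) return;
        for (StartElement open : ancestors) { // innermost first
            writer.add(events.createEndElement(open.getName().getPrefix(),
                    open.getName().getNamespaceURI(), open.getName().getLocalPart()));
        }
        writer.add(events.createEndDocument());
        writer.close();
        out.close(); // the StAX writer is not relied on to close the underlying Writer
        writer = null;
    }
}
```

For files on disk, the sink could be something like `i -> new FileWriter("input" + i + ".xml")`. Whitespace between students is dropped, so the output is not pretty-printed.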
Save yourself trouble and time and use the flat xml file structure you currently have, and then create POJO Objects which will represent each object as you've stated; Student, School and Class. And then using Jaxb bind the objects with different part of the Structure. You can then effectively unmarshal the xml and access the various elements as if you're dealing with SQL objects.
Use this link as a starting point: XML parsing with JAXB
One issue with doing it this way is memory consumption. For design flexibility and memory management, I would suggest using SQL to handle this.