I have created my own XML file on my Android phone, which looks similar to this:
<?xml version="1.0" encoding="utf-8" ?>
<backlogs>
<issue id="1">
<backlog id="0" name="Linux" swid="100" />
<backlog id="0" name="Project Management" swid="101" />
</issue>
<issue id="2">
<backlog id="0" name="Tests" swid="110" />
<backlog id="0" name="Online test" swid="111" />
<backlog id="0" name="Test build" swid="112" />
<backlog id="0" name="Update" swid="113" />
</issue>
</backlogs>
I then converted it into a String so that I could replace part of it using a regular expression, but I have a problem with the expression. The one I just created looks like this:
([\n\r]*)<(.*)issue(.*)1(.*)([\n\r]*)(.*)([\n\r]*)(.*)([\n\r]*)(.*)<(.*)/(.*)issue(.*)
I need to replace a specific issue tag (located by its ID) with another issue tag held in another String.
The regular expression works fine for the tag with ID 1, but not for ID 2, because that issue contains a different number of backlog tags. Is there any way to avoid depending on the number of tags?
I hope you understand my question.
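I finally found a solution to my question, which is:
([\n\r]*)<(.*)issue(.*)1[\S\s]*?<(.*)/(.*)issue(.*)
The lazy [\S\s]*? matches any characters, including line breaks, up to the first closing issue tag, so the number of backlog tags no longer matters.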
Please do not use regex. Use an XML parser.
Do you know what the highest-voted SO answer is?
Use a SAX (or StAX) parser and writer at the same time.
As you read each event, decide whether to write it to the writer unmodified or to modify it first, depending on the state you are currently in, for example by swapping an element name or attribute value. This handles an unlimited number of elements; it costs some CPU time, but in general it is pretty lightweight.
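A minimal sketch of that read-and-write loop, using the StAX event API from the JDK rather than SAX. The class name IssueReplacer and the method replaceIssue are made up for illustration, and the sketch assumes the replacement is a well-formed fragment with a single root element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.namespace.QName;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Attribute;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper, not part of any library.
final class IssueReplacer {

    /** Copies xml event by event, swapping the issue with the given id for replacementXml. */
    static String replaceIssue(String xml, String id, String replacementXml) {
        try {
            XMLInputFactory inputFactory = XMLInputFactory.newInstance();
            XMLEventReader reader = inputFactory.createXMLEventReader(new StringReader(xml));
            StringWriter out = new StringWriter();
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
            int skipDepth = 0; // greater than 0 while inside the issue being replaced
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (skipDepth > 0) {
                    // Drop every event of the old issue, tracking nesting depth.
                    if (event.isStartElement()) skipDepth++;
                    else if (event.isEndElement()) skipDepth--;
                    continue;
                }
                if (event.isStartElement() && isTargetIssue(event.asStartElement(), id)) {
                    skipDepth = 1;
                    copyFragment(inputFactory, replacementXml, writer);
                    continue;
                }
                writer.add(event); // everything else passes through unchanged
            }
            writer.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    private static boolean isTargetIssue(StartElement start, String id) {
        Attribute attr = start.getAttributeByName(new QName("id"));
        return "issue".equals(start.getName().getLocalPart())
                && attr != null && id.equals(attr.getValue());
    }

    /** Writes the events of the replacement fragment, minus its document start/end. */
    private static void copyFragment(XMLInputFactory factory, String fragment, XMLEventWriter writer)
            throws XMLStreamException {
        XMLEventReader frag = factory.createXMLEventReader(new StringReader(fragment));
        while (frag.hasNext()) {
            XMLEvent e = frag.nextEvent();
            if (!e.isStartDocument() && !e.isEndDocument()) writer.add(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<backlogs><issue id='1'><backlog id='0' name='Linux' swid='100'/></issue>"
                + "<issue id='2'><backlog id='0' name='Tests' swid='110'/></issue></backlogs>";
        String replacement = "<issue id='1'><backlog id='0' name='Replaced' swid='999'/></issue>";
        System.out.println(replaceIssue(xml, "1", replacement));
    }
}
```

Because the id is compared as an attribute value, an issue with id 10 or 21 is not matched by accident, and the number of backlog elements inside the issue does not matter.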
I have an issue where I need to find and replace an int within an XML file.
Here is an example file:
<?xml version="1.0" encoding="UTF-8"?>
<Data>
<Element time="0.00000" num="10723465" />
<Element time="7.98000" num="10028736" />
<Element time="8.40000" num="94123576" />
</Data>
I want to find and replace the "num" attribute. I have been able to do this with the DOM factory, but it doesn't keep the order of the attributes. There must be a simpler way to find and replace the num. Any help would be great :)
Advice 1: use an XML library to parse the file and nothing else, or it will be a pain.
Information 2: attribute order does not matter in XML, so you can forget about this problem.
See this question for much more detail:
Order of XML attributes after DOM processing
Alternative: use a regex (it can work for very simple XML). There is an example in the link above.
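As a rough sketch of Advice 1, the num attribute can be changed with the DOM API that ships with the JDK. NumReplacer and replaceNum are hypothetical names. The serializer may write the attributes in a different order than the input, which per Information 2 does not change the meaning of the document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any library.
final class NumReplacer {

    /** Sets num to newNum on every Element whose num currently equals oldNum. */
    static String replaceNum(String xml, String oldNum, String newNum) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList elements = doc.getElementsByTagName("Element");
            for (int i = 0; i < elements.getLength(); i++) {
                Element element = (Element) elements.item(i);
                if (oldNum.equals(element.getAttribute("num"))) {
                    element.setAttribute("num", newNum);
                }
            }
            // Serialize the modified tree back to a string.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Data><Element time='0.00000' num='10723465'/>"
                + "<Element time='7.98000' num='10028736'/></Data>";
        System.out.println(replaceNum(xml, "10723465", "42"));
    }
}
```

If the attribute order really has to survive, a streaming copy with StAX is an alternative, since it passes each start tag through as it was read.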
I've read a lot about XML and SAX lately, but I'm not sure how to do this, and I haven't seen it in any examples or seen it discussed. I have an XML file and I want to get the information from it.
<?xml version='1.0' encoding='UTF-8'?>
<eveapi version="2">
<currentTime>2015-04-30 04:22:22</currentTime>
<result>
<rowset name="characters" key="characterID" columns="name,characterID,corporationName,corporationID,allianceID,allianceName,factionID,factionName">
<row name="Grasume Crendraven" characterID="92916469" corporationName="Hyperion Guard" corporationID="98372642" allianceID="99005157" allianceName="Winmatar Republic" factionID="0" factionName="" />
<row name="Susan Snow" characterID="95325415" corporationName="Federal Navy Academy" corporationID="1000168" allianceID="0" allianceName="" factionID="0" factionName="" />
<row name="Grasume" characterID="95528725" corporationName="School of Applied Knowledge" corporationID="1000044" allianceID="0" allianceName="" factionID="0" factionName="" />
</rowset>
</result>
<cachedUntil>2015-04-30 05:02:33</cachedUntil>
</eveapi>
What I'm trying to do is get the information in each row element and then pull the info I need from that string. I assume the best thing for me to do is to pull out the XML section I need and then set up a text parser to get the specific information that I need for the rest of my program.
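For what it is worth, a second text parser may not be needed: SAX passes each row's attributes to the handler already split into names and values. A small sketch under that assumption follows; RowReader and readCharacters are invented names, and only two of the attributes are read.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class RowReader {

    /** Returns "name (characterID)" for every row element in the document. */
    static List<String> readCharacters(String xml) {
        List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("row".equals(qName)) {
                    // Attributes are looked up by name, so no string splitting is required.
                    result.add(attrs.getValue("name") + " (" + attrs.getValue("characterID") + ")");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<eveapi version='2'><result><rowset name='characters'>"
                + "<row name='Susan Snow' characterID='95325415'/>"
                + "</rowset></result></eveapi>";
        System.out.println(readCharacters(xml));
    }
}
```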
Let's say I want to use JAXB to generate two XML files -- one containing a list of items, the other containing detailed definitions of those items. For instance, something like this:
Garage.xml:
<garage>
<car ref="1" />
<car ref="2" />
</garage>
Cars.xml:
<cars>
<car id="1" color="blue" model="Impreza" />
<car id="2" color="plaid" model="Taurus" />
</cars>
Is there anything clever I can do to define a single JAXB Car.java object that will allow me to use the same object for both files? If not, is there an accepted best practice anyone would recommend beyond the obvious solution of creating two separate Car classes? (One with the ref attribute, the other with the id, color, and model attributes.)
I have an XML file roughly like this:
<batch>
<header>
<headerStuff />
</header>
<contents>
<timestamp />
<invoices>
<invoice>
<invoiceStuff />
</invoice>
<!-- Insert 1000 invoice elements here -->
</invoices>
</contents>
</batch>
I would like to split that file into 1000 files, each with the same headerStuff and only one invoice. The Smooks documentation is very proud of its transformation capabilities, but unfortunately I don't want to transform anything.
The only way I've figured out to do this is to repeat the whole structure in FreeMarker. But that feels like repeating the structure unnecessarily. The header has about 30 different tags, so there would also be a lot of work involved.
What I currently have is this:
<?xml version="1.0" encoding="UTF-8"?>
<smooks-resource-list xmlns="http://www.milyn.org/xsd/smooks-1.1.xsd"
xmlns:calc="http://www.milyn.org/xsd/smooks/calc-1.1.xsd"
xmlns:frag="http://www.milyn.org/xsd/smooks/fragment-routing-1.2.xsd"
xmlns:file="http://www.milyn.org/xsd/smooks/file-routing-1.1.xsd">
<params>
<param name="stream.filter.type">SAX</param>
</params>
<frag:serialize fragment="INVOICE" bindTo="invoiceBean" />
<calc:counter countOnElement="INVOICE" beanId="split_calc" start="1" />
<file:outputStream openOnElement="INVOICE" resourceName="invoiceSplitStream">
<file:fileNamePattern>invoice-${split_calc}.xml</file:fileNamePattern>
<file:destinationDirectoryPattern>target/invoices</file:destinationDirectoryPattern>
<file:highWaterMark mark="10"/>
</file:outputStream>
<resource-config selector="INVOICE">
<resource>org.milyn.routing.io.OutputStreamRouter</resource>
<param name="beanId">invoiceBean</param>
<param name="resourceName">invoiceSplitStream</param>
<param name="visitAfter">true</param>
</resource-config>
</smooks-resource-list>
That creates a file for each invoice tag, but I don't know how to continue from there to also get the header into each file.
EDIT:
The solution has to use Smooks. We use it in an application as a generic splitter and just create different smooks configuration files for different types of input files.
I just started with Smooks myself. However... your problem sounds identical to this: http://www.smooks.org/mediawiki/index.php?title=V1.5:Smooks_v1.5_User_Guide#Routing_to_File
You will have to provide the output FTL format in whole; that's the downside of using a general-purpose tool, I guess. Data mapping often includes a lot of what feels like redundancy. One way around this is to leverage convention, but that has to be built into the framework.
I don't know Smooks, but the simplest solution (with poor performance) would be, to create the Nth file:
• copy the whole XML structure
• delete all the invoice tags but the Nth one
I don't know how to do that in Smooks; that is only an idea. In this case you don't need to duplicate the structure of the XML in a FreeMarker template.
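Outside Smooks, the copy-and-delete idea could be sketched with plain DOM as below. This is not a Smooks configuration, InvoiceSplitter and split are hypothetical names, and it has the poor performance mentioned above, because the whole document is cloned once per invoice.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Smooks or any other library.
final class InvoiceSplitter {

    /** Returns one XML string per invoice; each keeps the full batch structure and one invoice. */
    static List<String> split(String xml) {
        try {
            Document original = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            int count = original.getElementsByTagName("invoice").getLength();
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            List<String> files = new ArrayList<>();
            for (int n = 0; n < count; n++) {
                // Step 1: copy the whole XML structure.
                Document copy = (Document) original.cloneNode(true);
                // Step 2: delete every invoice but the Nth one. The NodeList is live,
                // so walk it backwards while removing.
                NodeList invoices = copy.getElementsByTagName("invoice");
                for (int i = invoices.getLength() - 1; i >= 0; i--) {
                    if (i != n) {
                        Node invoice = invoices.item(i);
                        invoice.getParentNode().removeChild(invoice);
                    }
                }
                StringWriter out = new StringWriter();
                transformer.transform(new DOMSource(copy), new StreamResult(out));
                files.add(out.toString());
            }
            return files;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<batch><header><headerStuff/></header><contents><timestamp/><invoices>"
                + "<invoice><first/></invoice><invoice><second/></invoice>"
                + "</invoices></contents></batch>";
        split(xml).forEach(System.out::println);
    }
}
```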
<playingTestCodeDetails classCode="ENT" determinerCode="INSTANCE" >
<realmCode code="QD" />
<id assigningAuthorityName="PRMORDCODE" extension="16494" />
<id assigningAuthorityName="TESTNUMINBOOK" extension="16494" />
<code code="16494" codeSystemName="QTIM" displayName="SureSwab Candidiasis" />
<name use=""></name>
<asSeeAlsoCode classCode="ROL" > <!-- Have repeated Seealsocode section for multiple see also codes and stripped names -->
<realmCode code="QD" />
<code code="7600" displayName="Sample See Also Name" ></code>
</asSeeAlsoCode>
<asSeeAlsoCode classCode="ROL" >
<realmCode code="QD" />
<code code="6496" displayName="Sample See Also Name" ></code>
</asSeeAlsoCode>
</playingTestCodeDetails>
<subjectOf typeCode="SBJ">
<realmCode code="QD" />
<order classCode="OBS" moodCode="EVN" >
<realmCode code="QD" />
<performer nullFlavor="" typeCode="PRF"><!-- Have added this to accomodate the UnitCode-->
<performingLocatedEntity classCode="LOCE" nullFlavor="">
<locatedPerformingSite classCode="ORG" determinerCode="INSTANCE">
<id assigningAuthorityName="ASORDERED" extension="16494" />
</locatedPerformingSite>
</performingLocatedEntity>
</performer>
<origin nullFlavor="" typeCode="ORG"> <!-- Have added this to accomodate the Ordering Lab Code-->
<orderingLocatedEntity classCode="LOCE" >
<locatedOrderingSite classCode="ORG" determinerCode="INSTANCE">
<id assigningAuthorityName="PRMORDCODE" extension="16494"/>
<code code="SJC" codeSystemName="QTIM" codeSystem="ORDERINGLABCODE"/>
</locatedOrderingSite>
</orderingLocatedEntity>
</origin>
<pertinentInformation1 typeCode="PERT">
<realmCode code="QD" />
<clinicalInfo classCode="ACT" moodCode="EVN">
<realmCode code="QD" />
<title>Specialitysample1</title>
<text>Conditionsample1</text>
</clinicalInfo>
</pertinentInformation1>
<subjectOf typeCode="SUBJ">
<realmCode code="QD" />
<annotation classCode="ACT" moodCode="EVN" >
<realmCode code="QD" />
<code code="DOSCATNAME"></code>
<text><![CDATA[SureSwab<sup>®</sup>, <em>Candidiasis</em>, PCR]]></text>
</annotation>
</subjectOf>
</order>
</subjectOf>
I have XML that looks like the above and I want to parse it. What is the best way: DOM or SAX? (I have heard of JAXB and XSLT, but I'm not sure about those two.) Can DOM and SAX be combined to parse one XML document?
One simple scenario is to get a tag's value by looking at the "code" attribute:
when code=DOSCATNAME in the code tag, we need to take the data from the corresponding text tag.
The other scenario is to walk the hierarchy to the id tag and read its extension attribute when the assigningAuthorityName attribute has the value PRMORDCODE.
Can both scenarios be achieved with a parser?
I am a newbie, so please bear with me and suggest an approach. Thanks in advance.
Use JAXB. Create a class model and annotate your classes appropriately. The framework will do the rest.
For example, you should create a class PlayingTestCodeDetails with the properties classCode, determinerCode, etc.
What's more, you can ask JAXB to generate the classes for you. Start learning from this article: http://www.roseindia.net/jaxb/r/jaxb.shtml
It will take a couple of hours to get started, but then you will be done in 15 minutes. With DOM you can get started after 15 minutes of learning, but then the coding to parse your XML takes a couple of days.
Which one to use depends on your needs.
Both SAX and DOM are used to parse XML documents. Both have advantages and disadvantages, and either can be used depending on the situation.
SAX
• Parses node by node
• Doesn’t store the XML in memory
• We can’t insert or delete a node
• SAX is an event-based parser
• SAX is the Simple API for XML
• Doesn’t report comments by default (a LexicalHandler is needed to see them)
• SAX generally runs a little faster than DOM
DOM
• Stores the entire XML document into memory before processing
• Occupies more memory
• We can insert or delete nodes
• Traverse in any direction.
• DOM is a tree model parser
• Document Object Model (DOM) API
• Preserves comments
• DOM generally runs a little slower than SAX
If we only need to find a node and don’t need to insert or delete anything, we can go with SAX; otherwise we should use DOM, provided we have enough memory.
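As an illustration of the find-only case, the first scenario from the question (take the text that belongs to code=DOSCATNAME) could be handled with a SAX handler that keeps a little state. This is a sketch, not the only way to do it: DoscatnameFinder and findText are invented names, and it assumes the text element follows the code element inside the same annotation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any library.
final class DoscatnameFinder {

    /** Returns the content of the text element that follows a code element with code=DOSCATNAME. */
    static String findText(String xml) {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean codeSeen; // a matching code element was seen in this annotation
            private boolean inText;   // currently inside the text element that follows it

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("code".equals(qName) && "DOSCATNAME".equals(attrs.getValue("code"))) {
                    codeSeen = true;
                } else if ("text".equals(qName) && codeSeen) {
                    inText = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // CDATA content is delivered here as ordinary characters.
                if (inText) text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("text".equals(qName)) inText = false;
                if ("text".equals(qName) || "annotation".equals(qName)) codeSeen = false;
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String xml = "<subjectOf><annotation><code code='DOSCATNAME'></code>"
                + "<text><![CDATA[SureSwab<sup>x</sup>, PCR]]></text></annotation></subjectOf>";
        System.out.println(findText(xml));
    }
}
```

The state flags are the price of SAX: nothing is kept in memory, so the handler has to remember by itself where it is in the document.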
These are a few parser libraries:
• Woodstox
• dom4j
In addition to SAX and DOM, there is StAX parsing, available through XMLStreamReader, which is an XML pull parser.
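A short sketch of pull parsing with XMLStreamReader, applied to the second scenario from the question (read the extension attribute of id elements whose assigningAuthorityName is PRMORDCODE). ExtensionFinder and findExtensions are hypothetical names chosen for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of any library.
final class ExtensionFinder {

    /** Collects the extension of every id element with the given assigningAuthorityName. */
    static List<String> findExtensions(String xml, String authority) {
        List<String> result = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            // The caller pulls events one at a time instead of receiving callbacks.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "id".equals(reader.getLocalName())
                        && authority.equals(reader.getAttributeValue(null, "assigningAuthorityName"))) {
                    result.add(reader.getAttributeValue(null, "extension"));
                }
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<playingTestCodeDetails>"
                + "<id assigningAuthorityName='PRMORDCODE' extension='16494'/>"
                + "<id assigningAuthorityName='TESTNUMINBOOK' extension='16494'/>"
                + "</playingTestCodeDetails>";
        System.out.println(findExtensions(xml, "PRMORDCODE"));
    }
}
```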