I am trying to remove a node from a large XML file. With this code, the tags of other elements are altered as well. I was hoping someone could explain why, or how to fix it.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
Document document = dbf.newDocumentBuilder().parse(new File(filePath)); //filePath - source file
/*while (document.getElementsByTagName("IMFile").getLength() != 0){
//Loop until all matching elements are removed
Element element = (Element) document.getElementsByTagName("IMFile").item(0);
element.getParentNode().removeChild(element);
}*/
//Test for first appearance
Element element = (Element) document.getElementsByTagName("IMFile").item(0);
element.getParentNode().removeChild(element);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = tf.newTransformer();
t.transform(new DOMSource(document), new StreamResult(new File(filePath+"_New"))); //destination
It changes the order of attributes, for example:
<Attribute id="7" value="1920" name="width"/> to <Attribute id="7" name="width" value="1920"/>
It also collapses empty elements into self-closing tags:
<PowerPointFilename></PowerPointFilename> to <PowerPointFilename/>
You can use a SAX transformer to modify an XML document while preserving attribute order:
public static void main(String[] args) throws IOException, TransformerException, SAXException {
XMLReader reader = XMLReaderFactory.createXMLReader();
TransformerFactory tf = TransformerFactory.newInstance();
// Load the transformer definition from the file strip.xsl:
Transformer t = tf.newTransformer(new SAXSource(reader, new InputSource(new FileInputStream("strip.xsl"))));
// Transform the file test.xml to stdout:
t.transform(new SAXSource(reader, new InputSource(new FileInputStream("test.xml"))), new StreamResult(System.out));
}
Here's an XSL transform to strip IMFile elements:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Copy -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<!-- Strip IMFile elements -->
<xsl:template match="IMFile"/>
</xsl:stylesheet>
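As a rough, self-contained sketch of how the stylesheet above could be applied, the following keeps both the stylesheet and a small sample document in strings. The class name StripImFileDemo and the sample document are assumptions made for illustration. Whether attribute order and empty-element syntax survive depends on the XSLT processor in use, so the output is worth checking against your real data.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class; not part of the original answer.
final class StripImFileDemo {
    // The identity transform plus an empty template that drops IMFile elements.
    static final String STRIP_XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"node()|@*\">"
      + "<xsl:copy><xsl:apply-templates select=\"node()|@*\"/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"IMFile\"/>"
      + "</xsl:stylesheet>";

    // Runs the stylesheet over an XML string and returns the serialized result.
    static String strip(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STRIP_XSL)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample input with one IMFile element to remove.
        String xml = "<Project><IMFile>a.png</IMFile>"
                   + "<Attribute id=\"7\" value=\"1920\" name=\"width\"/></Project>";
        System.out.println(strip(xml));
    }
}
```

Reading the input through a StreamSource, rather than building a DOM first, means the document never passes through the DOM attribute map, which does not record attribute order.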
I am using the W3C DOM to write an XML file.
Creating the first child node works without trouble.
But when I append a new node to an existing file, unwanted blank lines appear between the previous nodes, and their number keeps growing every time I insert a new node.
Here is my code:
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("D:\\TestXml.xml"));
Element rootElement = doc.getDocumentElement();
Element supercar = doc.createElement("supercars");
rootElement.appendChild(supercar);
Element carname = doc.createElement("carname");
carname.appendChild(doc.createTextNode("Ferrari 103"));
supercar.appendChild(carname);
Element carname1 = doc.createElement("carname");
carname1.appendChild(doc.createTextNode("Ferrari 204"));
supercar.appendChild(carname1);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("D:\\TestXml.xml"));
transformer.transform(source, result);
And here is the generated file:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cars>
<supercars>
<carname>Ferrari 101</carname>
<carname>Ferrari 202</carname>
</supercars>
<supercars>
<carname>Ferrari 103</carname>
<carname>Ferrari 204</carname>
</supercars>
</cars>
The code above appends the second node. After the first run, the generated file has no extra blank lines.
But after adding 10 new nodes, the file contains so many unnecessary blank lines that it grows to more than 300 lines.
The file size increases as well.
I cannot work out why this is happening.
The problem occurs on every new node insertion.
Any suggestions would be really helpful.
Consider running an identity-transform XSLT whose <xsl:strip-space> removes such line breaks and spaces between nodes. You can easily incorporate the XSLT into your existing code:
XSLT (save the stylesheet below as an .xsl file; it copies the entire document as is)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Java
import javax.xml.transform.stream.StreamSource;
...
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(new File("D:\\TestXml.xml"));
Element rootElement = doc.getDocumentElement();
Element supercar = doc.createElement("supercars");
rootElement.appendChild(supercar);
Element carname = doc.createElement("carname");
carname.appendChild(doc.createTextNode("Ferrari 103"));
supercar.appendChild(carname);
Element carname1 = doc.createElement("carname");
carname1.appendChild(doc.createTextNode("Ferrari 204"));
supercar.appendChild(carname1);
Source xslt = new StreamSource(new File("C:\\Path\\To\\Style.xsl")); // LOAD STYLESHEET
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer(xslt); // APPLY XSLT
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("D:\\TestXml.xml"));
transformer.transform(source, result);
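If you would rather not maintain a separate stylesheet, a different technique (not the XSLT approach above) is to delete the whitespace-only text nodes from the DOM before serializing, so that each save starts from a clean tree. The sketch below assumes the standard JDK XPath implementation; the class name WhitespaceStripper and the sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of the original answer.
final class WhitespaceStripper {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Removes every text node that holds only whitespace and
    // returns how many nodes were removed.
    static int stripWhitespace(Document doc) {
        try {
            NodeList blanks = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("//text()[normalize-space(.) = '']", doc, XPathConstants.NODESET);
            for (int i = 0; i < blanks.getLength(); i++) {
                Node blank = blanks.item(i);
                blank.getParentNode().removeChild(blank);
            }
            return blanks.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up input that mimics a previously indented file.
        Document doc = parse("<cars>\n  <supercars>\n    <carname>Ferrari 101</carname>\n"
            + "  </supercars>\n</cars>");
        System.out.println("removed: " + stripWhitespace(doc));
        System.out.println("removed on second pass: " + stripWhitespace(doc));
    }
}
```

Calling stripWhitespace(doc) right after dBuilder.parse(...) and before appending the new elements would leave the transformer with only its own indentation to add.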
I have an XML file like the following:
<?xml version='1.0' encoding='UTF-8'?>
<env:Envelope
xmlns:env="http://*******************">
<env:Body>
<wd:Get_Organizations_Response
xmlns:wd="**********/****" wd:version="*****">
<wd:Organization>
<wd:Organization_Reference>
<wd:ID wd:type="WID">***************************</wd:ID>
<wd:ID wd:type="Organization_Reference_ID">Saibal Bhaduri</wd:ID>
</wd:Organization_Reference>
<wd:Organization_Data>
<wd:Reference_ID>Saibal Bhaduri</wd:Reference_ID>
<wd:Organization_Type_Reference>
<wd:ID wd:type="WID">************************</wd:ID>
<wd:ID wd:type="Organization_Type_ID">saibal</wd:ID>
</wd:Organization_Type_Reference>
<wd:Top-Level_Organization_Reference>
<wd:ID wd:type="WID">19f47283ee6c436da3ecd5dda2902747</wd:ID>
<wd:ID wd:type="Organization_Reference_ID">HRIS_matrix</wd:ID>
</wd:Top-Level_Organization_Reference>
</wd:Hierarchy_Data>
</wd:Organization_Data>
</wd:Organization>
</wd:Organization_Data>
</wd:Organization>
</wd:Response_Data>
</wd:Get_Organizations_Response>
</env:Body>
</env:Envelope>
I need to extract the second wd:ID of wd:Organization_Reference, the value of wd:Include_Manager_in_Name, and the second wd:ID of wd:Organization_Type_Reference.
My Java code is below:
public static void main(String args[]) throws Exception {
//new XMLtoCSV().convertXMLtoCSV("");
System.out.println("Start Creation of CSV");
File stylesheetHumanResources = new File("config/HumanResourcesXSLT.xsl");
File xmlSource = new File("config/Human_Resources_ResponseXML.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(xmlSource);
StreamSource stylesource = new StreamSource(stylesheetHumanResources);
Transformer transformer = TransformerFactory.newInstance().newTransformer(stylesource);
Source source = new DOMSource(document);
Result outputTarget = new StreamResult(new File("config/Human_Resources_CSV.csv"));
transformer.transform(source, outputTarget);
System.out.println("Complete Creation of CSV");
}
My XSL code is below:
wd:Organization_Reference,wd:Reference_ID,wd:Name,wd:Include_Manager_in_Name,wd:Include_Organization_Code_in_Name,wd:Organization_Type_Reference,wd:Organization_Subtype_Reference,wd:Availibility_Date,wd:Organization_Visibility_Reference
<xsl:for-each select="//wd:Organization">
<xsl:variable name="mktDef" select="concat(wd:Organization_Reference/@wd:ID,',',wd:Organization_Data/@wd:Include_Manager_in_Name,',',wd:Organization_Data/wd:Organization_Type_Reference/@wd:ID,'&#xA;')"/>
</xsl:for-each>
</xsl:template>
How can I parse this XML? What is the proper XSLT syntax?
Thank you
I need to delete all object tags in an XML file using Java. I can delete the object tags when I hard-code the parent tag name ("span") in the source code, but then only the object tags inside span tags are removed. I need to delete every object tag in the XML, whatever its parent tag is, without hard-coding the parent tag in the source code. In the sample XML file, that means the object tags inside both the span tags and the score tag. The sample XML file is shown below, after the Java program.
Java Program
public class XmlObject {
public static void main(String[] args) {
String filePath = "/Users/myXml/Sample.xml";
File xmlFile = new File(filePath);
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
try {
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(xmlFile);
doc.getDocumentElement().normalize();
deleteElement(doc);
doc.getDocumentElement().normalize();
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StreamResult result = new StreamResult(new File("/Users/myXml/Sample_ObjDelete.xml"));
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(source, result);
System.out.println("XML file updated successfully");
} catch (SAXException | ParserConfigurationException | IOException | TransformerException e1) {
e1.printStackTrace();
}
}
private static void deleteElement(Document doc) {
NodeList RootElement = doc.getElementsByTagName("assessmentItem");
int getRootElementLength = RootElement.getLength();
System.out.println("getRootElementLength "+getRootElementLength);
for(int k = 0; k < getRootElementLength; k++){
System.out.println("2");
Node nNode = RootElement.item(0);
Element eElement = (Element) nNode;
NodeList object = eElement.getElementsByTagName("span");
Element obj = null;
for(int i=0; i<object.getLength();i++){
obj = (Element) object.item(i);
int leng = obj.getElementsByTagName("object").getLength();
System.out.println("object:" +leng);
for(int j=0; j<leng;j++){
Node objectNode = obj.getElementsByTagName("object").item(k);
(obj).removeChild(objectNode);
}
}
}
}
}
<qualityTest>
<responseDeclaration>
<correctResponse>
<value>QualityTest</value>
</correctResponse>
</responseDeclaration>
<itemBody>
<sampleTest>
<p>Who is president of Uganda?</p>
<span>
<object>
Yoweri Museveni</object>
</span>
<span>
<object>
Raúl Castro
</object>
</span>
</sampleTest>
</itemBody>
<score>
<object>
Yingluck Shinawatra
</object>
</score>
</qualityTest>
You should walk the XML tree recursively and remove all occurrences of any object element:
private static void deleteElement(Node someNode) {
NodeList children = someNode.getChildNodes();
for (int i = 0; i < children.getLength();) {
Node child = children.item(i);
if (child.getNodeType() == Node.ELEMENT_NODE) {
if (child.getNodeName().equalsIgnoreCase("object")) {
child.getParentNode().removeChild(child);
continue;
} else {
deleteElement(child);
}
}
i++;
}
}
This small snippet removes every element named "object" (compared case-insensitively) at any depth of the tree.
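To try the recursive approach in isolation, here is a small self-contained sketch that parses a cut-down version of the sample from the question and reports what is left. The class name RemoveObjectRecursive, the strip helper, and the shortened sample are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class wrapping the recursive removal.
final class RemoveObjectRecursive {
    // getChildNodes() returns a live list, so the index only advances
    // when the current child is kept.
    static void deleteElement(Node someNode) {
        NodeList children = someNode.getChildNodes();
        for (int i = 0; i < children.getLength();) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                if (child.getNodeName().equalsIgnoreCase("object")) {
                    someNode.removeChild(child);
                    continue;
                }
                deleteElement(child);
            }
            i++;
        }
    }

    // Parses the XML string, removes all object elements, returns the document.
    static Document strip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            deleteElement(doc);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = strip("<qualityTest><itemBody><span><object>A</object></span>"
            + "<span><object>B</object></span></itemBody>"
            + "<score><object>C</object></score></qualityTest>");
        System.out.println("object elements left: "
            + doc.getElementsByTagName("object").getLength());
        System.out.println("span elements left: "
            + doc.getElementsByTagName("span").getLength());
    }
}
```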
You can use XPath and XPathExpression to reach the span and score tags:
XPath xPath = XPathFactory.newInstance().newXPath();
String expression = "//span";
NodeList spanNodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
spanNodeList gives you all span nodes, so you can iterate over it and delete each span element, as in the code below:
for (int i = 0; i < spanNodeList.getLength(); i++) {
Node spanItem = spanNodeList.item(i);
Node parentNode = spanItem.getParentNode();
parentNode.removeChild(spanItem);
}
The same approach works for the score tag.
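If the aim is to remove the object elements themselves, wherever they sit, the same XPath technique can target them directly with an expression such as //object. The following is a minimal sketch under that assumption; the class RemoveObjectXPath and its helper methods are hypothetical names, and the input is a shortened version of the sample from the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class for XPath-based removal.
final class RemoveObjectXPath {
    // Parses an XML string into a DOM document.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // Detaches every node selected by the expression from its parent and
    // returns the number of nodes the expression matched.
    static int removeAll(Document doc, String expression) {
        try {
            NodeList matches = (NodeList) XPathFactory.newInstance().newXPath()
                .compile(expression).evaluate(doc, XPathConstants.NODESET);
            for (int i = 0; i < matches.getLength(); i++) {
                Node match = matches.item(i);
                Node parent = match.getParentNode();
                if (parent != null) {
                    parent.removeChild(match);
                }
            }
            return matches.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<qualityTest><itemBody><span><object>A</object></span>"
            + "<span><object>B</object></span></itemBody>"
            + "<score><object>C</object></score></qualityTest>");
        System.out.println("matched: " + removeAll(doc, "//object"));
        System.out.println("object elements left: "
            + doc.getElementsByTagName("object").getLength());
    }
}
```

To restrict the removal to particular parents, an expression such as //span/object | //score/object could be passed instead.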
You can use an XPath to select all elements except <object> elements. For instance, you can put this in a file named strip-object.xsl:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" omit-xml-declaration="yes"/>
<xsl:template match="//object"/>
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Notice the first template rule, which does nothing with object nodes, effectively discarding them. The second template rule, which takes effect for all other nodes, copies them exactly.
To make use of it, initialize your Transformer with the .xsl file:
Transformer transformer = transformerFactory.newTransformer(
new StreamSource(new File("/Users/myXml/strip-object.xsl")));
If you only want to strip out object elements which are children of span and score elements, you can change the XPath expression:
<xsl:template match="//span/object|//score/object"/>
For some reason my below code gives the exception: javax.xml.transform.TransformerConfigurationException: Could not compile stylesheet
public String removePrettyPrint(String xml) throws TransformerException, TransformerFactoryConfigurationError {
String result = "";
TransformerFactory factory = TransformerFactory.newInstance();
String source = "<?xml version=\"1.0\"?><xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"> <xsl:output indent=\"no\" /> <xsl:template match=\"@*|node()\"> <xsl:copy> <xsl:apply-templates select=\"@*|node()\"/> </xsl:copy> </xsl:template></xsl:stylesheet>";
Source xslt = new StreamSource(source);
Transformer transformer = factory.newTransformer(xslt);
Source text = new StreamSource(xml);
transformer.transform(text, new StreamResult(result));
return result;
}
What is wrong with it?
I think the problem is that when you pass a string as a parameter to the StreamSource, it expects it to be a URL of an XML document, not the actual XML string itself.
You probably need to use a StringReader here:
String source = "...XSL Here...";
StringReader xsltReader = new StringReader(source);
Source xslt = new StreamSource(xsltReader);
Transformer transformer = factory.newTransformer(xslt);
You'll probably have to do the same for the XML, assuming you are passing in XML, and not the URL to an XML document.
StringReader xmlReader = new StringReader(xml);
Source text = new StreamSource(xmlReader);
And for the transform itself, you may need to make use of a StringWriter:
StringWriter writer = new StringWriter();
transformer.transform(text, new StreamResult(writer));
result = writer.toString();
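Putting those three pieces together, the method from the question might look roughly like the sketch below. The wrapping class IdentityTransform is a made-up name, and checked exceptions are wrapped in an unchecked one only to keep the example short. Note that this stylesheet copies existing whitespace text nodes unchanged, so indentation already present in the input stays in the output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical wrapper class for the corrected method.
final class IdentityTransform {
    // The identity stylesheet from the question, with indentation disabled.
    static final String IDENTITY_XSL =
        "<?xml version=\"1.0\"?>"
      + "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output indent=\"no\"/>"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String removePrettyPrint(String xml) {
        try {
            // StringReader makes StreamSource read the string's content
            // instead of treating the string as a URL.
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(IDENTITY_XSL)));
            // StringWriter collects the result in memory.
            StringWriter writer = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(removePrettyPrint("<a><b>x</b></a>"));
    }
}
```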
The problem was caused by encoding. When you read a file using the File class, the encoding is handled internally, but when you load XML from a string you have to handle it manually.
Observe the usage of .getBytes() and ByteArrayInputStream(bytes):
String source = "<?xml version=\"1.0\"?><xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"> <xsl:output indent=\"no\" /> <xsl:template match=\"@*|node()\"> <xsl:copy> <xsl:apply-templates select=\"@*|node()\"/> </xsl:copy> </xsl:template></xsl:stylesheet>";
byte[] bytes = source.getBytes("UTF-16");
Source xslsource = new StreamSource(new ByteArrayInputStream(bytes));
Transformer transformer = factory.newTransformer(xslsource);
The alternative solution would be to use StringReader:
String source = "<?xml version=\"1.0\"?><xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\"> <xsl:output indent=\"no\" /> <xsl:template match=\"@*|node()\"> <xsl:copy> <xsl:apply-templates select=\"@*|node()\"/> </xsl:copy> </xsl:template></xsl:stylesheet>";
StringReader xslReader = new StringReader(source);
Source xslsource = new StreamSource(xslReader);
which takes care of the encoding automatically.
I have read a lot of posts and tried a lot of things, but I still can't get the XSL to find values in the parameter. I started with Java's built-in (Sun) Xalan and never got it working, so I switched to Saxon, to no avail. I want to combine two XML documents into one with XSL. The documents are never on a file system; this is for a web app that builds XML strings/documents. I have tried passing a DTMAxisIterator, a DOMSource, a Document converted to a node-set in XSL, and a string. It worked fine in Notepad++ with an XSL document() call, but I don't want to save the XML on the system.
XSL
<xsl:param name="RsXml" select="/"/>
<xsl:template match="/policy/vehicles">
<vehicle type="DP" type_code="DP"/>
<xsl:for-each select="$RsXml/InsuranceSvcRs/com.csc_PolicyOrderCurrentCarrierInqRs/PersVeh">
<vin>
<xsl:value-of select="VehIdentificationNumber"/>
</vin>
<veh_year>
<xsl:value-of select="ModelYear"/>
</veh_year>
<make>
<xsl:value-of select="Manufacturer"/>
</make>
<model>
<xsl:value-of select="Model"/>
</model>
<costnew>
<xsl:value-of select="CostNewAmt/Amt"/>
</costnew>
<symbol>
<xsl:value-of select="VehSymbolCd"/>
</symbol>
<wheregaraged></wheregaraged>
<liabilityonly></liabilityonly>
<collision></collision>
<comprehensive></comprehensive>
<rentalreimbursement></rentalreimbursement>
<towing></towing>
<altered></altered>
<title></title>
<enginesize>
<xsl:value-of select="NumCylinders"/>
</enginesize>
<trailertype/>
<trtonnage/>
<mctype/>
<mcenginecc/>
<vehicleuse></vehicleuse>
<mhawnings></mhawnings>
<vseat15></vseat15>
<vseat15text/>
<extraequipment></extraequipment>
<mcsidecar></mcsidecar>
<atvwheels/>
<damage/>
<endorsements/>
<avtotal/>
<v_underwriting>
<altered></altered>
<alteredlist/>
<alteredexplain/>
<businessuse></businessuse>
<haulstudents></haulstudents>
<pulltrailers></pulltrailers>
<trailerendorsement/>
</v_underwriting>
<driverid></driverid>
<gen_classcode></gen_classcode>
<classcode></classcode>
<primary_veh></primary_veh>
<rates>
<bi></bi>
<pd></pd>
<med></med>
<ubi></ubi>
<upd></upd>
<comp></comp>
<coll></coll>
<comm></comm>
<rr></rr>
<tl></tl>
</rates>
<xferdis></xferdis>
<atv_young_dr></atv_young_dr>
<mrcd_date/>
<hasdamage></hasdamage>
<comp_symbol></comp_symbol>
<str_legal></str_legal>
<addresses/>
</xsl:for-each>
<xsl:apply-templates/>
XML One
<?xml version="1.0" encoding="UTF-8"?>
<policy id="1735">
<vehicles>
</vehicles>
</policy>
XML Two
<ACORD>
<InsuranceSvcRs>
<com.csc_PolicyOrderCurrentCarrierInqRs>
<PersVeh id="001">
<ItemIdInfo>
<InsurerId>001</InsurerId>
</ItemIdInfo>
<Manufacturer>FORD</Manufacturer>
<Model>WINDSTAR</Model>
<ModelYear>1999</ModelYear>
<VehBodyTypeCd>ES</VehBodyTypeCd>
<CostNewAmt>
<Amt>23660</Amt>
</CostNewAmt>
<NumDaysDrivenPerWeek />
<EstimatedAnnualDistance>
<NumUnits />
<UnitMeasurementCd />
</EstimatedAnnualDistance>
<FullTermAmt>
<Amt />
</FullTermAmt>
<TerritoryCd />
<VehIdentificationNumber>1</VehIdentificationNumber>
<NumCylinders>6</NumCylinders>
<VehSymbolCd />
<AntiLockBrakeCd>4-WHEEL STD</AntiLockBrakeCd>
<DaytimeRunningLightInd />
<DistanceOneWay>
<NumUnits />
<UnitMeasurementCd>MI</UnitMeasurementCd>
</DistanceOneWay>
<AntiTheftDeviceCd>PASS-KEY</AntiTheftDeviceCd>
<VehPerformanceCd />
<VehUseCd />
<AirBagTypeCd>BOTH</AirBagTypeCd>
<com.csc_VehBodyTypeFreeformInd />
</PersVeh>
</com.csc_PolicyOrderCurrentCarrierInqRs>
</InsuranceSvcRs>
</ACORD>
Class
public String transformResultXML(String xmlSource, Templates xsl,String policyXml ) {
String result = "";
try {
StringWriter writer = new StringWriter();
StringReader reader2 = new StringReader(policyXml);
XmlHelper xh = new XmlHelper();
Document xmlSrc = xh.loadDoc(xmlSource);
DOMSource source = new DOMSource(xmlSrc);
ByteArrayInputStream byteStream = new ByteArrayInputStream(xmlSource.getBytes());
StringReader reader = new StringReader(xmlSource);
SAXSource source2 = new SAXSource(new XMLFilterImpl(), new InputSource(reader));
TransformerFactory transFact = new com.icl.saxon.TransformerFactoryImpl();
Transformer transformer = transFact.newTransformer();
transformer.setParameter("RsXml",source2);
// transformer.setParameter("RsXml",xmlSrc);
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new javax.xml.transform.stream.StreamSource(reader2),
new javax.xml.transform.stream.StreamResult(writer));
result = writer.toString();
System.out.println(result);
} catch( Exception e ) {
e.printStackTrace();
}
return result;
}
I was able to get it to work with Saxon; see the code below. I think the key was passing document.getDocumentElement() as the parameter.
public String transformResultXML(String xmlSource, Templates xsl,String policyXml ) {
String result = "";
try {
StringWriter writer = new StringWriter();
StringReader reader2 = new StringReader(policyXml);
DocumentBuilderFactory dfactory =
DocumentBuilderFactory.newInstance( "com.icl.saxon.om.DocumentBuilderFactoryImpl",null);
dfactory.setNamespaceAware(true);
DocumentBuilder docBuilder = dfactory.newDocumentBuilder();
org.w3c.dom.Document document = docBuilder.parse(new InputSource(new StringReader(xmlSource)));
Transformer transformer = xsl.newTransformer();
transformer.setParameter("RsXml", document.getDocumentElement());
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.transform(new javax.xml.transform.stream.StreamSource(reader2),
new javax.xml.transform.stream.StreamResult(writer));
result = writer.toString();
System.out.println(result);
} catch( Exception e ) {
e.printStackTrace();
}
return result;
}
XSL snippet
<xsl:param name="RsXml" />
<xsl:template match="/policy/vehicles">
<xsl:for-each select="$RsXml/InsuranceSvcRs/com.csc_PolicyOrderCurrentCarrierInqRs/PersVeh">
When you use the JAXP interface, the values you can supply for parameters are not defined in the API specification, and Saxon's support may differ from Xalan's. Generally I think you will find that Saxon's s9api interface is much easier to use. Certainly its methods for supplying parameters are strongly typed and it's much clearer what you can supply. If you want to supply a node, it should be an instance of XdmNode, and you can create an XdmNode by parsing lexical XML using a s9api DocumentBuilder, or by wrapping a DOM Node.
When you pass a parameter to an XSL stylesheet, the value is a string and is not parsed into a DOM. This is not currently possible in standard XSLT. There may be an extension that will do this but I'm not aware of one.
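For the plain string case this answer describes, a minimal JAXP sketch could look like the following. The parameter name vehicleType, the class StringParamDemo, and the tiny stylesheet are assumptions made for illustration; the stylesheet simply writes the parameter value out as text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class for passing a string parameter.
final class StringParamDemo {
    // Declares a top-level parameter and outputs its string value.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:param name=\"vehicleType\"/>"
      + "<xsl:template match=\"/\"><xsl:value-of select=\"$vehicleType\"/></xsl:template>"
      + "</xsl:stylesheet>";

    static String render(String xml, String vehicleType) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            // The name must match the xsl:param declared in the stylesheet.
            transformer.setParameter("vehicleType", vehicleType);
            StringWriter writer = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(writer));
            return writer.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("<policy/>", "DP"));
    }
}
```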