DOM Parser not able to parse large xml file - java

I am trying to parse a large XML file (60,000 lines) using the DOM parser and XPath, but my code seems to break because of the file's size. When I print the XML, the output starts from the middle of the document. Any ideas how I can avoid this?
Regards
FileInputStream file = new FileInputStream(new File(filePath));
DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = builderFactory.newDocumentBuilder();
Document xmlDocument = builder.parse(file);
XPath xPath = XPathFactory.newInstance().newXPath();
// XPath steps are always separated by "/", not by the platform-specific File.separator
disclaimer = xPath.compile(disclaimerPath + "/title").evaluate(xmlDocument);
TransformerFactory tf = TransformerFactory.newInstance();
Transformer transformer = tf.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter writer = new StringWriter();
transformer.transform(new DOMSource(xmlDocument), new StreamResult(writer));
System.out.println(writer.getBuffer().toString().replaceAll("\n|\r", ""));

Related

Prevent transformer.transform( source, result ) from escaping special character

I'm updating the nodes and text content of an XML document using the DOM parser. To save the resulting DOM I'm using the transformer.transform method.
Below is the sample code.
String xmlText = "<uc>abcd><name>mine</name>efgh\netg<tag>sd</tag></uc>";
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
InputSource inStream = new InputSource();
inStream.setCharacterStream(new StringReader(xmlText));
Document document = documentBuilder.parse(inStream);
Node node = document.getDocumentElement();
node.normalize();
NodeList childNodes = node.getChildNodes();
for(int i=0; i<childNodes.getLength(); i++) {
if(childNodes.item(i).getNodeType() == Node.TEXT_NODE) {
System.out.println(childNodes.item(i).getTextContent());
childNodes.item(i).setTextContent("123>");
}
}
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "US-ASCII");
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource( document );
OutputStream xml = new ByteArrayOutputStream();
StreamResult result = new StreamResult( xml );
transformer.transform( source, result );
String formattedXml = xml.toString();
System.out.println(formattedXml);
Since my updated document has text content like ">", the transformer.transform method changes it to &gt;.
Is there a way to get the output without escaping special characters?
I can't use another parser because of project constraints.
I can't use StringEscapeUtils.unescapeXml() either: the XML can already contain &gt;, and this utility method would also change any &gt; that was originally present in the XML.
So I want a mechanism that will not escape any special characters.
The transformer you create with
Transformer transformer = tFactory.newTransformer();
is initialized with a default stylesheet that implements the identity transformation. That means it simply serializes your DOM to a well-formed XML document, and output escaping is applied automatically where necessary.
If you want more control over the output, and possibly want to generate something that does not adhere to XML document structure, you can use a custom stylesheet that switches the output method to text. This gives you more control over the structure, but also more room to make mistakes in the XML.
More information at
https://docs.oracle.com/en/java/javase/11/docs/api/java.xml/javax/xml/transform/TransformerFactory.html#newTransformer()
https://www.w3.org/TR/xslt20/#element-output
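As a rough illustration of the text-output approach described above, here is a minimal sketch. The class name, method name, and the stylesheet itself are assumptions made for this example, not part of the original answer. The stylesheet writes element tags by hand and lets text nodes through unescaped; it deliberately ignores attributes, comments, and processing instructions, and the result is no longer guaranteed to be well-formed XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class TextOutputSketch {

    // Stylesheet (an assumption for this sketch): output method "text",
    // so nothing written to the result is escaped.
    private static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='*'>"
        // Emit the start tag as plain text
        + "<xsl:text>&lt;</xsl:text><xsl:value-of select='name()'/><xsl:text>&gt;</xsl:text>"
        // Children: text nodes fall through to the built-in template and are copied as-is
        + "<xsl:apply-templates/>"
        // Emit the end tag as plain text
        + "<xsl:text>&lt;/</xsl:text><xsl:value-of select='name()'/><xsl:text>&gt;</xsl:text>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String serializeUnescaped(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serializeUnescaped("<uc>123&gt;<name>mine</name></uc>"));
    }
}
```

Because the stylesheet treats every "&gt;" in the DOM the same way, an entity that was escaped in the input is also written out as a raw ">"; whether that is acceptable depends on what consumes the output.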

XML file saved from Swing application

I'm developing a Java Swing application, and I want to create objects and save them in an XML file, using the information that a user enters in some text fields.
How can I save that data into an XML file to form those objects?
You can write your own XML writer to write objects/text to an XML file. For example, using DOM:
// Needs imports from javax.xml.parsers, javax.xml.transform (plus .dom and .stream), org.w3c.dom and java.io.File.
public boolean writeCommonSettingsFromGUI() throws ParserConfigurationException, TransformerException
{
    // Build an empty document with a root element and one child element
    DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
    Document doc = docBuilder.newDocument();
    Element rootElement = doc.createElement("NAME_OF_A_ELEMENT");
    doc.appendChild(rootElement);
    Element xmlInfo = doc.createElement("NAME_OF_ANOTHER_ELEMENT");
    xmlInfo.setTextContent("YOUR_CONTENT_TO_SET_FOR_THIS_ELEMENT");
    rootElement.appendChild(xmlInfo);

    // Serialize the document to a file
    TransformerFactory transformerFactory = TransformerFactory.newInstance();
    Transformer transformer = transformerFactory.newTransformer();
    transformer.setOutputProperty(OutputKeys.INDENT, "yes");
    // OutputPropertiesFactory comes from Xalan (org.apache.xml.serializer), not from the standard javax.xml API
    transformer.setOutputProperty(OutputPropertiesFactory.S_KEY_INDENT_AMOUNT, "5");
    transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
    DOMSource source = new DOMSource(doc);
    StreamResult result = new StreamResult(new File("FILE_PATH_WHERE_TO_SAVE_YOUR_XML"));
    transformer.transform(source, result);
    return true;
}
Use the Castor framework; it lets you map your Java classes to an XML file and vice versa.

Editing xml content in java and passing it as string, using node preferably

I have an XML document that will be used as a template:
<?xml version="1.0" encoding="UTF-8" standalone="no"?><entry xmlns="http://www.w3.org/2005/Atom" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"><content type="application/xml"><m:properties><d:AccountEnabled>true</d:AccountEnabled><d:DisplayName>SampleAppTestj5</d:DisplayName><d:MailNickname>saTestj5</d:MailNickname><d:Password>Qwerty1234</d:Password><d:UserPrincipalName>saTestj5#identropy.us</d:UserPrincipalName></m:properties></content></entry>
I'm loading it in Java using this code, where payLoadXML.xml has the above content:
InputStream is = getClass().getClassLoader().getResourceAsStream("/payLoadXML.xml");
Now I'm trying to edit the tag values, for example changing a value from "saTestj5" to "saTestj6", and then converting the entire XML and storing it as a string. Can anyone tell me how I can achieve this? I was told this can be done using "Node"; is that possible?
Use JAXB or a SAX parser to convert the XML into an object, change the object through its getter methods, and convert it back to XML.
Try this:
DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
// ClassLoader.getResourceAsStream expects a path without a leading "/"
InputStream is = getClass().getClassLoader().getResourceAsStream("payLoadXML.xml");
Document doc = docBuilder.parse(is);
// Locate <m:properties> and append a new <d:GivenName>abc</d:GivenName> child
Node staff = doc.getElementsByTagName("m:properties").item(0);
Text givenNameValue = doc.createTextNode("abc");
Element givenName = doc.createElement("d:GivenName");
givenName.appendChild(givenNameValue);
staff.appendChild(givenName);
// Serialize the modified document into a string
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
DOMSource source = new DOMSource(doc);
StringWriter writer = new StringWriter();
StreamResult result = new StreamResult(writer);
transformer.transform(source, result);
String xml = writer.toString();
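The snippet above appends a new element, whereas the question asks about changing an existing value. A minimal sketch of that variant follows; the class name, the method name, and its parameters are made up for illustration, and it assumes the element to change occurs at least once and that editing the first match is what is wanted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class EditNodeSketch {

    // Replaces the text of the first element with the given qualified tag name
    // and returns the whole document as a string.
    static String replaceText(String xml, String tagName, String newValue) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Node target = doc.getElementsByTagName(tagName).item(0);
        if (target == null) {
            throw new IllegalArgumentException("No element named " + tagName);
        }
        target.setTextContent(newValue);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String template = "<m:properties xmlns:m='urn:example:m' xmlns:d='urn:example:d'>"
                + "<d:MailNickname>saTestj5</d:MailNickname></m:properties>";
        System.out.println(replaceText(template, "d:MailNickname", "saTestj6"));
    }
}
```

The namespace URIs in main are placeholders; with the real template, the same call would be made on the text read from payLoadXML.xml.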

How to set UTF-16 encoding format for Xml?

I need to create XML as a string to pass to a server. I have managed to convert the data into XML, but the encoding is set to UTF-8 by default. I need to set it to UTF-16, but I have no idea how to do that.
private void XmlCreation(int size,List<DataItem> item) throws ParserConfigurationException, TransformerException
{
DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
Document document = documentBuilder.newDocument();
Element rootElement = document.createElement("ArrayOfDataItem");
document.appendChild(rootElement);
for (DataItem in: item)
{
Element subroot = document.createElement("DataItem");
rootElement.appendChild(subroot);
Element em = document.createElement(in.getKey());
em.appendChild(document.createTextNode(in.getValue()));
subroot.appendChild(em);
}
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
java.io.StringWriter sw = new java.io.StringWriter();
DOMSource source = new DOMSource(document);
// Write into the StringWriter (not System.out) so that sw.toString() below actually holds the XML
StreamResult result = new StreamResult(sw);
transformer.transform(source, result);
String xml = sw.toString();
System.out.println(xml);
}
}
Thanks guys
I haven't tested, but that should do the trick:
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
This article might help you. Basically, you call setOutputProperty with OutputKeys.ENCODING as key and the desired encoding ("UTF-16") as value.
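To make the suggestion concrete, here is a small sketch that sets the encoding and writes to a byte stream. The class and method names are invented for this example. One caveat, stated as my understanding rather than a guarantee: a Java String has no byte encoding of its own, so when the result is a StringWriter the property mainly changes the encoding named in the XML declaration; the bytes only become UTF-16 when the output goes to an OutputStream, as below, or when the string is encoded explicitly before sending.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper class, for illustration only.
class Utf16Sketch {

    // Serializes the document to UTF-16 encoded bytes.
    static byte[] toUtf16Bytes(Document document) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-16");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toByteArray();
    }

    // Builds a tiny stand-in for the ArrayOfDataItem document from the question.
    static Document sampleDocument() throws Exception {
        Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        document.appendChild(document.createElement("ArrayOfDataItem"));
        return document;
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = toUtf16Bytes(sampleDocument());
        // Decode again only to show the result on the console
        System.out.println(new String(bytes, StandardCharsets.UTF_16));
    }
}
```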

We are upgrading our application to java6 and an xsl transform that worked with java 5 now returns an empty document

Has anybody seen anything like this before? I will post the xsl and xml if I have to but I would have to take sensitive data out of it.
The code used to handle the XSL transformation:
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer(new DOMSource( xslDoc));
DOMResult domresult = new DOMResult();
transformer.transform(new DOMSource(xmlDoc), domresult);
Node node = domresult.getNode();
resultDoc = (Document) node;
I have never seen it go blank. For Java 6 (also compatible with 1.5), I have the following code that works; the difference seems to be in the TransformerFactory used.
private DocumentBuilderFactory factory;
private DocumentBuilder builder;
private Transformer xformer;
//presetup - needs to be done just once
factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
builder = factory.newDocumentBuilder();
xformer = TransformerFactory.newInstance().newTransformer();
//Transform the file
Source source = new DOMSource(doc);
String oFileName = "output.xml";
File oFile = new File(outputDirectory + "/" + oFileName);
Result result = new StreamResult(oFile);
xformer.transform(source, result);
Does this correct your issue?
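The setup above calls setNamespaceAware(true), which matters when a stylesheet is handed to the TransformerFactory as a DOMSource, as in the question: the xsl: elements can only be recognized if the DOM carries namespace information. Whether that is the actual cause of the empty result is not confirmed in this thread. The following is only a sketch under that assumption, with invented class and method names, parsing both documents from strings.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class NamespaceAwareTransformSketch {

    static Document transform(String xsl, String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Parse both the stylesheet and the input with namespace support enabled
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document xslDoc = builder.parse(new InputSource(new StringReader(xsl)));
        Document xmlDoc = builder.parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new DOMSource(xslDoc));
        DOMResult result = new DOMResult();
        transformer.transform(new DOMSource(xmlDoc), result);
        // With an empty DOMResult, the transformer creates a new Document as the result node
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        String xsl = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                + "<xsl:template match='/'><out><xsl:value-of select='/in'/></out></xsl:template>"
                + "</xsl:stylesheet>";
        Document result = transform(xsl, "<in>hi</in>");
        System.out.println(result.getDocumentElement().getNodeName());
    }
}
```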
