Read XML from InputStreamReader and write it to file - java

My code receives an XML String from an InputStreamReader (it's actually the output of a REST request to another server), and then the String is written to a file (the file contains more than just XML).
The problem is that the String is received as one line of XML, so it's stored as one huge line in the file (no indentation, tabs, formatting etc.).
Can I receive this XML stream and format it while writing it to the file?
Note: I can't use DOM here; it must be implemented without loading the XML into memory.

You can do it with a Transformer fed from a SAX source, if you are allowed to use those:
public static void prettyPrintXmlToFile(String sourceXml, File targetFile) throws Exception {
    // No stylesheet is given, so this is an identity transformer used purely as a serializer
    Transformer serializer = SAXTransformerFactory.newInstance().newTransformer();
    serializer.setOutputProperty(OutputKeys.INDENT, "yes");
    // Xalan-specific key (not part of OutputKeys); it sets the indentation width
    serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
    Source xmlSource = new SAXSource(new InputSource(new StringReader(sourceXml)));
    StreamResult res = new StreamResult(targetFile);
    serializer.transform(xmlSource, res);
}
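Since the question starts from a Reader rather than a String, the same idea can be applied to the stream directly. The following is only a minimal sketch under a few assumptions: the class and method names (StreamingPrettyPrinter, prettyPrint) are made up for illustration, the indent-amount key is specific to Xalan-derived transformers such as the one bundled with the JDK, and whether the identity transform really avoids building an in-memory tree depends on the transformer implementation in use.

```java
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named for illustration only
class StreamingPrettyPrinter {

    /** Copies XML from 'in' to 'out', asking the serializer to indent it. */
    static void prettyPrint(Reader in, Writer out) throws TransformerException {
        // newTransformer() without a stylesheet yields an identity transform
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.INDENT, "yes");
        // The target file holds more than XML, so leave out the <?xml ...?> declaration
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        // Implementation-specific key; transformers that do not know it should ignore it
        serializer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
        serializer.transform(new StreamSource(in), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        StringWriter out = new StringWriter();
        prettyPrint(new StringReader("<a><b>x</b><c>y</c></a>"), out);
        System.out.println(out);
    }
}
```

In the real code, 'in' would be the InputStreamReader from the REST call and 'out' the Writer that is already open on the target file.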

Related

Processing xml file (Java)

I have to read an XML file, make some changes, and copy it to another location. I also have to keep the German special characters and keep the empty tags as they are (prevent them from becoming self-closing tags). To prevent the self-closing tags, I used the Xerces library, as in the link:
preventing empty xml elements are converted to self closing elements
Leaving out my changes to the XML, the code in my application looks like this:
public static void main(String args[]) throws Exception {
    InputStream inputStream = new FileInputStream(new File("D:\\qwe.xml"));
    Reader reader = new InputStreamReader(inputStream, "ISO-8859-1");
    InputSource is = new InputSource(reader);
    is.setEncoding("ISO-8859-1");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder;
    dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(is);
    doc.setXmlStandalone(true);
    File file = new File("D:\\qwerty.xml");
    XMLStreamWriter writer = XMLOutputFactory.newFactory().createXMLStreamWriter(new FileOutputStream(file));
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
    transformer.transform(new DOMSource(doc), new StAXResult(writer));
}
The first row in the source file is
<?xml version="1.0" encoding="UTF-8"?>
The problem is in the destination file, qwerty.xml, where encoding="UTF-8" is removed. Although the source file declares UTF-8, I had to read it as "ISO-8859-1" because of the German characters. I want to keep the first row as in the original, keep the empty tags as they are (not self-closing), and keep the German characters. My code manages only the second and third of these.
The call
Transformer.setOutputProperty(OutputKeys.ENCODING, "ISO-8859-1");
has no effect unless the transformer is producing serialized output.
In your case the transformer is not producing serialized output, because you are sending the output to a StAXResult. I'm not sure why you are using the XMLStreamWriter to produce output, but if you want to do it that way, it's the XMLStreamWriter that decides on the encoding, not the Transformer.
I would have thought it was simpler to send the Transformer output to a StreamResult.
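As a rough illustration of the StreamResult route, here is a minimal sketch. The class and method names (EncodingSketch, write) are hypothetical, and it only demonstrates that the ENCODING property takes effect once the Transformer itself serializes to a byte stream. It does not address keeping empty elements from being collapsed into self-closing tags.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only
class EncodingSketch {

    /** Serializes 'doc' to 'out' using the given encoding. */
    static void write(Document doc, OutputStream out, String encoding) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);
        // The result is a byte stream, so the Transformer does the serialization
        // and the encoding property is applied to the XML declaration and the bytes
        transformer.transform(new DOMSource(doc), new StreamResult(out));
    }

    public static void main(String[] args) throws Exception {
        // \u00fc and \u00df are the German letters u-umlaut and sharp s
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<gruss>Gr\u00fc\u00dfe</gruss>")));
        doc.setXmlStandalone(true);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        write(doc, out, "ISO-8859-1");
        System.out.println(out.toString("ISO-8859-1"));
    }
}
```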

Convert org.w3c.dom.Document to File file

I have an XML file as an object in Java, an org.w3c.dom.Document doc, and I want to convert it into a File file. How can I convert the type Document to File?
thanks
I want to add metadata elements in an existing xml file (standard dita) with type File.
I know a way to add elements to the file, but then I have to convert the file to a org.w3c.dom.Document. I did that with the method loadXML:
private Document loadXML(File f) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    return builder.parse(f);
}
After that I change the org.w3c.dom.Document. Then I want to continue with the flow of the program, so I have to convert the Document doc back to a File file.
What is an efficient way to do that? Or is there a better way to get some elements into an XML File without converting it?
You can use the Transformer class to output the entire XML content to a File, as shown below:
Document doc =...
// write the content into xml file
DOMSource source = new DOMSource(doc);
FileWriter writer = new FileWriter(new File("/tmp/output.xml"));
StreamResult result = new StreamResult(writer);
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
transformer.transform(source, result);
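A variant worth considering is to hand the Transformer a byte stream instead of a character Writer, so that the bytes on disk match the encoding named in the XML declaration, and to close the stream when done. This is only a sketch: DomToFileSketch, save, and the metadata element are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only
class DomToFileSketch {

    /** Writes 'doc' to 'target' as UTF-8 and closes the stream afterwards. */
    static void save(Document doc, Path target) throws IOException, TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        // try-with-resources makes sure the file is flushed and closed
        try (OutputStream out = Files.newOutputStream(target)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<topic><title>Demo</title></topic>")));
        // Example change to the document: append a (hypothetical) metadata element
        doc.getDocumentElement().appendChild(doc.createElement("metadata"));
        Path target = Files.createTempFile("topic", ".xml");
        save(doc, target);
        System.out.println(new String(Files.readAllBytes(target), "UTF-8"));
    }
}
```

If the rest of the program needs a java.io.File, target.toFile() gives one for the same path.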
With JDK 1.8.0, a short way is to use the built-in XMLSerializer (an internal class that the JDK bundles as a repackaged copy of Apache Xerces):
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
Document doc = //use your method loadXML(File f)
//change Document
java.io.Writer writer = new java.io.FileWriter("MyOutput.xml");
XMLSerializer xml = new XMLSerializer(writer, null);
xml.serialize(doc);
Use an object of type OutputFormat to configure output, for example like this:
OutputFormat format = new OutputFormat(Method.XML, StandardCharsets.UTF_8.toString(), true);
format.setIndent(4);
format.setLineWidth(80);
format.setPreserveEmptyAttributes(true);
format.setPreserveSpace(true);
XMLSerializer xml = new XMLSerializer(writer, format);
Note that these classes come from a com.sun.* package, which is internal and undocumented, so this is generally not seen as the preferred way of doing things. However, the standard javax.xml.transform.OutputKeys do not let you specify the amount of indentation or the line width, for example. So if that matters to you, this solution should help.
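If depending on an internal package is a concern, the documented DOM Level 3 Load and Save API (org.w3c.dom.ls) can also serialize a Document. The sketch below assumes the DOM implementation supports the "LS" feature, as the one bundled with the JDK does. The class name LsSerializerSketch is hypothetical. Note that this API offers a pretty-print switch but, as far as I know, no setting for indentation width or line width, so it is not a full replacement for OutputFormat.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSOutput;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only
class LsSerializerSketch {

    /** Serializes 'doc' to a String using the standard Load and Save API. */
    static String serialize(Document doc) {
        // Assumption: the DOM implementation supports the "LS" 3.0 feature
        DOMImplementationLS ls =
                (DOMImplementationLS) doc.getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = ls.createLSSerializer();
        // Pretty-printing is an optional parameter, so check before setting it
        if (serializer.getDomConfig().canSetParameter("format-pretty-print", Boolean.TRUE)) {
            serializer.getDomConfig().setParameter("format-pretty-print", Boolean.TRUE);
        }
        LSOutput output = ls.createLSOutput();
        output.setEncoding("UTF-8");
        StringWriter writer = new StringWriter();
        output.setCharacterStream(writer);
        serializer.write(doc, output);
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<a><b>x</b></a>")));
        System.out.println(serialize(doc));
    }
}
```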

How to use Java API to parse xml string with XSLT and generate output in memory only?

I need to transform the internal XML (from a response) with a predefined XSLT and send the result back to the client as HTML. The following example, which I found, reads and writes local files. How can I avoid creating files with the Java API? I want to replace source.xml with a String and generate the HTML output on the fly.
TransformerFactory tFactory = TransformerFactory.newInstance();
Transformer transformer = tFactory.newTransformer(new javax.xml.transform.stream.StreamSource("searchresult.xslt"));
transformer.transform(new javax.xml.transform.stream.StreamSource("source.xml"),
        new javax.xml.transform.stream.StreamResult(new FileOutputStream("result.html")));
StreamSource has a constructor taking a Reader as argument. You can thus pass a StringReader, which will read the XML from a String, as argument.
Similarly, the StreamResult constructor the example uses takes an OutputStream as argument. You can thus pass any kind of OutputStream (like the HTTP response output stream, or a ByteArrayOutputStream, or a socket output stream) to send the result to wherever you like.
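Put together, the two suggestions might look like the sketch below. It is an illustration rather than a definitive implementation: the class name InMemoryXslt and the tiny inline stylesheet are made up so the example is self-contained, whereas in the original code the stylesheet would still be loaded from searchresult.xslt. The output goes to a StringWriter here; an OutputStream such as the HTTP response stream would work the same way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, named for illustration only
class InMemoryXslt {

    /** Applies the stylesheet text 'xslt' to the document text 'xml', entirely in memory. */
    static String transform(String xml, String xslt) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter html = new StringWriter();
        // StringReader in, StringWriter out: no files are touched
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        // Made-up stylesheet: wraps the text of <msg> in an HTML paragraph
        String xslt =
              "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='html'/>"
            + "<xsl:template match='/'><p><xsl:value-of select='/msg'/></p></xsl:template>"
            + "</xsl:stylesheet>";
        System.out.println(transform("<msg>hi</msg>", xslt));
    }
}
```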

How to convert a String to an XML object in Java

I get a SOAP message from a web service, and I can convert the response string to an XML file using the below code. This works fine. But my requirement is not to write the SOAP message to a file. I just need to keep this XML document object in memory, and extract some elements to be used in further processing. However, if I just try to access the document object below, it comes as empty.
Can somebody please tell me how I can convert a String to an in-memory XML object (without having to write to a file)?
String xmlString = new String(data);
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try
{
    builder = factory.newDocumentBuilder();
    // Use String reader
    Document document = builder.parse( new InputSource(
            new StringReader( xmlString ) ) );
    TransformerFactory tranFactory = TransformerFactory.newInstance();
    Transformer aTransformer = tranFactory.newTransformer();
    Source src = new DOMSource( document );
    Result dest = new StreamResult( new File( "xmlFileName.xml" ) );
    aTransformer.transform( src, dest );
}
Remove the last five lines of code inside the try block (everything from the TransformerFactory onward), and you'll just have the DOM document in memory. Store this document in a field rather than in a local variable.
If that isn't sufficient, then please explain, with code, what you mean by "if I just try to access the document object below, it comes as empty".
JB Nizet is right: the first steps create a DOM out of xmlString. That will load your xmlString (or SOAP message) into an in-memory Document. What the following steps do (everything related to the Transformer) is serialize the DOM to a file (xmlFileName.xml), which is not what you want to do, right?
When you said that your DOM is empty, I think you tried to print out the content of your DOM with document.toString(), and it returned something like "[document: null]". This doesn't mean your DOM is empty. Your DOM actually contains data. You now need to use the DOM API to get access to the nodes inside your document. Try something like document.getChildNodes(), document.getElementsByTagName(), etc.
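To make that concrete, here is a small sketch of reading a value out of an in-memory Document. The class name SoapDomSketch, the helper firstText, and the price element with its urn:example namespace are all hypothetical. The factory is made namespace-aware on the assumption that a SOAP message uses prefixed elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only
class SoapDomSketch {

    /** Parses the XML text into an in-memory DOM; nothing is written to disk. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // SOAP elements carry namespace prefixes
        return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    /** Returns the text of the first element with this local name, or null if there is none. */
    static String firstText(Document doc, String localName) {
        // "*" matches elements in any namespace
        NodeList nodes = doc.getElementsByTagNameNS("*", localName);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String soap =
              "<soap:Envelope xmlns:soap='http://schemas.xmlsoap.org/soap/envelope/'>"
            + "<soap:Body><m:price xmlns:m='urn:example'>42</m:price></soap:Body>"
            + "</soap:Envelope>";
        Document doc = parse(soap);
        System.out.println(doc);                     // not the content, just an object summary
        System.out.println(firstText(doc, "price")); // the actual data
    }
}
```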

How to do a search/replace on a file on the fly?

My java application loads an XML file and then parses the XML.
What I would like to do is a search/replace on the file before I create the SAXBuilder. How can I do this in memory (without having to write to the file)?
Here's my code, and where I envision doing the search/replace :
private String xmlFile = "D:\\mycomputer\\extract.xml";
File myXMLFile = new File(xmlFile);
// TODO
// REPLACE ALL "<content>" in xmlFile with "<content><![CDATA["
// REPLACE ALL "</content>" with "]]></content>"
SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser");
document = builder.build(myXMLFile);
Read the file into memory, do the search/replace, and pass a StringReader to SAXBuilder.build(Reader).
You can first read the file into a String with Apache Commons IO and then change the input source for the SAXBuilder, as in the following code snippet:
String fileStr = FileUtils.readFileToString(myXMLFile);
// replace() does a literal replacement; replaceAll() would treat the first argument as a regex
fileStr = fileStr.replace("<content>", "<content><![CDATA[");
fileStr = fileStr.replace("</content>", "]]></content>");
SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser");
document = builder.build(new ByteArrayInputStream(fileStr.getBytes()));
You answered the question yourself: read the whole file into a StringBuilder, perform the replacement there, and then hand the result to the parser.
The string can be passed to SAXBuilder using StringReader:
StringBuilder sb = new StringBuilder();
// loadFileContent is a helper you write yourself; it reads the file into sb
loadFileContent(filePath, sb);
document = builder.build(new StringReader(sb.toString()));
P.S.: follow up to theglauber's answer:
If the file is really big (~100 MB), it's impractical to read it fully into memory, and just as impractical to parse it into a DOM tree. In that case you should consider using a SAXParser and doing the replacement while the file is being parsed.
Depending on how large these files are, either read the file into a String, do your replacements in memory and build the XML from the String, or spawn a new thread to read the file, do the replacements and output, then build the XML from the output of that thread.
(I would suggest parsing and modifying the XML tree or using an XML filter, but I suspect you want to do this string-based replacement because the current content of your files is not correct XML.)
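The thread-based option could be sketched roughly as follows, using a PipedWriter/PipedReader pair so that the parser consumes the rewritten text while a worker thread is still producing it. Several assumptions apply: the class and method names are hypothetical, the JDK's DocumentBuilder stands in for JDOM's SAXBuilder to keep the example free of external libraries, and the replacement works line by line, so it only saves memory when the file actually contains line breaks.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.Writer;
import java.util.concurrent.atomic.AtomicReference;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, named for illustration only
class PipedReplaceSketch {

    /** Wraps the body of every <content> element in CDATA while the parser is reading. */
    static Document parseWithReplacements(Reader raw) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        PipedWriter pipeIn = new PipedWriter();
        PipedReader pipeOut = new PipedReader(pipeIn);
        AtomicReference<IOException> failure = new AtomicReference<>();

        // Worker: reads the raw text line by line and pushes the rewritten lines into the pipe
        Thread worker = new Thread(() -> {
            try (BufferedReader in = new BufferedReader(raw); Writer out = pipeIn) {
                String line;
                while ((line = in.readLine()) != null) {
                    out.write(line.replace("<content>", "<content><![CDATA[")
                                  .replace("</content>", "]]></content>"));
                    out.write('\n');
                }
            } catch (IOException e) {
                failure.set(e); // reported to the caller after parsing
            }
        });
        worker.start();

        Document doc;
        try {
            doc = builder.parse(new InputSource(pipeOut));
        } finally {
            pipeOut.close(); // unblocks the worker if parsing stopped early
            worker.join();
        }
        if (failure.get() != null) {
            throw failure.get();
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        String raw = "<root>\n<content>a < b & c</content>\n</root>";
        Document doc = parseWithReplacements(new StringReader(raw));
        System.out.println(doc.getElementsByTagName("content").item(0).getTextContent());
    }
}
```

For a real file, 'raw' would be a Reader opened on the file with the right charset. A parser that builds a full tree still holds the whole document in memory, so for very large inputs a SAX handler would be the better consumer of the pipe.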
