XSLT for an XML document inside a workflow in Java

I'm building an integration solution in Java that filters and modifies huge XML files. The XML arrives as a payload document flowing through the solution, and I want to use an XSLT stylesheet to filter out the parts that interest me.
My difficulty is that the standard Java approach (see "XSLT processing with Java?") does not work for me: the XML is already in the workflow as an in-memory document, so I do not want to read it from the file system, and I need the output to stay in the workflow as well.
Element production = docX2.createElement("PRODUCTION");
try {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource("slimmer.xslt");
Transformer transformer = factory.newTransformer(xslt);
Source text = new StreamSource((InputStream) docX1);
transformer.transform(text, new StreamResult((OutputStream) production));
} catch (Exception ex) {
Logger.getLogger(IntProcess.class.getName()).log(Level.SEVERE, null, ex);
}
root.appendChild(production);
docX1 is the XML input document flowing through the solution, and docX2 is the output document (both are instances of the Java Document class). production is an element of docX2.

I solved it with the help of "Transform XML with XSLT in Java using DOM".
My solution is:
Element production = docX2.createElement("PRODUCTION");
try {
TransformerFactory factory = TransformerFactory.newInstance();
Source xslt = new StreamSource("slimmer.xslt");
Transformer transformer = factory.newTransformer(xslt);
Source text = new DOMSource(docX1);
transformer.transform(text, new DOMResult(production));
} catch (Exception ex) {
Logger.getLogger(IntProcess.class.getName()).log(Level.SEVERE, null, ex);
}
root.appendChild(production);
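The problem was that I tried to use stream sources and results (StreamSource, StreamResult) where DOM ones (DOMSource, DOMResult) were needed: a Document cannot be cast to an InputStream, nor an Element to an OutputStream.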
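The DOM-to-DOM approach can be tried in isolation with the following minimal sketch. It assumes an inline stylesheet (a hypothetical stand-in for slimmer.xslt) and a tiny sample input; the class name DomToDomTransform, the helpers parse, newDocument and transformInto, and the item/keep names are illustrative and not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class DomToDomTransform {

    // Hypothetical stylesheet: copies only <item> elements whose keep attribute is "yes".
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'>"
      + "<kept><xsl:copy-of select='//item[@keep=\"yes\"]'/></kept>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static DocumentBuilderFactory factory() {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        return dbf;
    }

    // Parses an XML string into an in-memory DOM document.
    static Document parse(String xml) throws Exception {
        return factory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
    }

    // Creates an empty output document.
    static Document newDocument() throws Exception {
        return factory().newDocumentBuilder().newDocument();
    }

    // Transforms the input DOM and writes the result as children of a new
    // element owned by the output document; nothing touches the file system.
    static Element transformInto(Document input, Document output, String tagName) throws Exception {
        Element target = output.createElement(tagName);
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.transform(new DOMSource(input), new DOMResult(target));
        return target;
    }

    public static void main(String[] args) throws Exception {
        Document docX1 = parse("<items><item keep='yes'>a</item><item keep='no'>b</item></items>");
        Document docX2 = newDocument();
        Element root = docX2.createElement("ROOT");
        docX2.appendChild(root);
        Element production = transformInto(docX1, docX2, "PRODUCTION");
        root.appendChild(production);
        System.out.println("items kept: " + production.getElementsByTagName("item").getLength());
    }
}
```

Only the stylesheet is read through a stream here; both the XML input and the transformation result stay in memory as DOM nodes.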

Related

How to perform mail merge functionality in java using dot/dotx and doc/docx format document template

I want to implement mail merge in Java using dot/dotx and doc/docx documents as templates. I tried docx4j, but it strips much of the rich-text indentation from the documents.
I also tried extracting the content of the Word document as HTML, but I was not able to paste it back into a Word document.
public static void readDocxFile1(String fileName) {
// this.file = file;
try {
File file = new File(fileName);
FileInputStream finStream=new FileInputStream(file.getAbsolutePath());
HWPFDocument doc=new HWPFDocument(finStream);
WordExtractor wordExtract=new WordExtractor(doc);
Document newDocument = DocumentBuilderFactory.newInstance() .newDocumentBuilder().newDocument();
WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(newDocument) ;
wordToHtmlConverter.processDocument(doc);
StringWriter stringWriter = new StringWriter();
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
transformer.setOutputProperty(OutputKeys.METHOD, "html");
transformer.transform(new DOMSource( wordToHtmlConverter.getDocument()), new StreamResult( stringWriter ) );
String html = stringWriter.toString();
System.out.println("html>>>>>>"+html);
}
catch(Exception e)
{
e.printStackTrace();
}
}
My requirement is to (1) read a dot/dotx or doc/docx template and then, for each person in a list, (2) replace the keywords and (3) paste the result into a new document.
Please suggest how I can implement this.
Also, please let me know whether the Aspose.Words for Java API can do this for me.
Yes, you can meet these requirements using the Aspose.Words for Java API. Please refer to the following sections of the documentation:
Loading, Saving and Converting
Mail Merge and Reporting
Find and Replace Overview
I work with Aspose as Developer Evangelist.

How to reduce the file size of a xml file that is created using java?

I have to convert a text file of coordinates into an XML file. The point of the conversion is to make the file smaller. How can I reduce the size of my file?
public void writeXML() throws Exception
{
ArrayList<Frame> frameList = readFile();
DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
DocumentBuilder dBuilder;
try
{
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.newDocument();
// append stuff
TransformerFactory transformerFactory = TransformerFactory.newInstance();
Transformer transformer = transformerFactory.newTransformer();
//transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(doc);
StreamResult console = new StreamResult(System.out);
StreamResult file = new StreamResult(new File("file.xml"));
//transformer.transform(source, console);
transformer.transform(source, file);
System.out.println("DONE");
}
catch (Exception e)
{
e.printStackTrace();
}
}
"But the point of converting the text file into a xml file is so that the file size to be smaller."
That is probably not achievable. XML is less dense than a typical text representation because properly designed XML adds a significant amount of "markup" to the file.
A file consisting of coordinates in an appropriately designed text form (e.g. CSV) will take less space than the same coordinates expressed in XML.
If you want "denser" files than a custom text format:
consider compressing the text file, or
consider using a binary representation instead of text.
If you are fixed on the idea of using XML, then the best way to reduce file size will be to compress it. Given that XML has a lot of redundancy in it (e.g. the markup), you should get significant compression.
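The compression suggestion can be sketched as follows, by wrapping the output stream in a GZIPOutputStream before handing it to the Transformer. This is a minimal sketch under stated assumptions: the frames/frame element names and the x/y attributes are hypothetical stand-ins for the coordinate data, and GzipXmlWriter, buildSample and serialize are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class GzipXmlWriter {

    // Builds a sample document with made-up coordinate frames (hypothetical structure).
    static Document buildSample(int frames) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("frames");
        doc.appendChild(root);
        for (int i = 0; i < frames; i++) {
            Element frame = doc.createElement("frame");
            frame.setAttribute("x", String.valueOf(i));
            frame.setAttribute("y", String.valueOf(i * 2));
            root.appendChild(frame);
        }
        return doc;
    }

    // Serializes the document to bytes, optionally gzip-compressing the output.
    static byte[] serialize(Document doc, boolean compress) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        OutputStream out = bytes;
        if (compress) {
            out = new GZIPOutputStream(bytes);
        }
        try {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        } finally {
            out.close(); // closing the gzip stream writes the trailer
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildSample(1000);
        System.out.println("plain bytes:   " + serialize(doc, false).length);
        System.out.println("gzipped bytes: " + serialize(doc, true).length);
    }
}
```

To write to disk instead, the ByteArrayOutputStream could be replaced by a FileOutputStream for a name such as file.xml.gz; a reader would then wrap its input in a GZIPInputStream before parsing.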

convert XML to CSV without XSLT in java

This is my first time using Stack Overflow. I'm a student working on an analytics project. The company stores all its records as XML, and I have to convert them and write a program that sends an automated report by email.
I'm parsing the XML in Java and am currently trying Apache Commons Digester, because the other parsers I found need XSLT for the conversion. I want a program that doesn't depend on XSLT: the company wants a system that sends a summary report roughly every 5 minutes, and from some of the answers I saw here, XSLT may be quite slow. How can I do the conversion using Digester or another method? Please show some example code if possible.
Here is the sample code I have built with Digester:
public void run() throws IOException, SAXException {
Digester digester = new Digester();
// This method pushes this (SampleDigester) class to the Digesters
// object stack making its methods available to processing rules.
digester.push(this);
// This set of rules calls the addDataSource method and passes
// in five parameters to the method.
digester.addCallMethod("datasources/datasource", "addDataSource", 5);
digester.addCallParam("datasources/datasource/name", 0);
digester.addCallParam("datasources/datasource/driver", 1);
digester.addCallParam("datasources/datasource/url", 2);
digester.addCallParam("datasources/datasource/username", 3);
digester.addCallParam("datasources/datasource/password", 4);
File file = new File("C:\\Users\\1206432E\\Desktop\\datasource.xml");
// This method starts the parsing of the document.
//digester.parse("file:\\Users\\1206432E\\Desktop\\datasource.xml");
digester.parse(file);
}
Here is another version, built with DOM, that converts XML to CSV but still relies on an XSLT file:
public static void main(String args[]) throws Exception {
File stylesheet = new File("C:\\Users\\1206432E\\Desktop\\Style.xsl");
File xmlSource = new File("C:\\Users\\1206432E\\Desktop\\data.xml");
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(xmlSource);
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = TransformerFactory.newInstance()
.newTransformer(stylesource);
Source source = new DOMSource(document);
Result outputTarget = new StreamResult(
new File("C:\\Users\\1206432E\\Desktop\\temp.csv"));
transformer.transform(source, outputTarget);
}
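For comparison, a conversion that avoids XSLT altogether can walk the DOM directly and build the CSV text by hand. The following is a minimal sketch, assuming the datasources/datasource structure and the five child elements named in the Digester rules above; XmlToCsv, toCsv, escape and parse are hypothetical names, and the quoting rule (double quotes around fields containing commas, quotes or line breaks) is a common CSV convention rather than anything stated in the question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class XmlToCsv {

    // Column order, taken from the element names used in the Digester rules.
    static final String[] FIELDS = {"name", "driver", "url", "username", "password"};

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Quotes a field if it contains a comma, a double quote or a line break.
    static String escape(String value) {
        if (value.contains(",") || value.contains("\"") || value.contains("\n")) {
            return "\"" + value.replace("\"", "\"\"") + "\"";
        }
        return value;
    }

    // Emits a header line followed by one CSV line per <datasource> element.
    static String toCsv(Document doc) {
        StringBuilder csv = new StringBuilder(String.join(",", FIELDS)).append('\n');
        NodeList rows = doc.getElementsByTagName("datasource");
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            for (int j = 0; j < FIELDS.length; j++) {
                NodeList cells = row.getElementsByTagName(FIELDS[j]);
                String value = cells.getLength() > 0 ? cells.item(0).getTextContent().trim() : "";
                if (j > 0) {
                    csv.append(',');
                }
                csv.append(escape(value));
            }
            csv.append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<datasources><datasource><name>db1</name><driver>drv</driver>"
                   + "<url>jdbc:x,y</url><username>user</username><password>pw</password>"
                   + "</datasource></datasources>";
        System.out.print(toCsv(parse(xml)));
    }
}
```

DOM loads the whole document into memory; for very large inputs a streaming parser such as SAX or StAX would follow the same idea while reading one record at a time.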

get node raw text

How can I get a node's value together with its child nodes? For example, I have the following node parsed into a DOM Document instance:
<root>
<ch1>That is a text with <value name="val1">value contents</value></ch1>
</root>
I select the ch1 node using XPath. Now I need to get its contents, namely everything between <ch1> and </ch1>, i.e. That is a text with <value name="val1">value contents</value>.
How can I do it?
I have found the following code snippet, which uses a transformation and gives almost exactly what I want. The result can be tuned by changing the output method.
public static String serializeDoc(Node doc) {
StringWriter outText = new StringWriter();
StreamResult sr = new StreamResult(outText);
Properties oprops = new Properties();
oprops.put(OutputKeys.METHOD, "xml");
TransformerFactory tf = TransformerFactory.newInstance();
Transformer t = null;
try {
t = tf.newTransformer();
t.setOutputProperties(oprops);
t.transform(new DOMSource(doc), sr);
} catch (Exception e) {
System.out.println(e);
}
return outText.toString();
}
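The snippet above serializes the node itself, so the result includes the node's own tags and an XML declaration. One way to get only the inner content is to serialize each child in turn and omit the declaration. The following is a minimal sketch, assuming the JDK's built-in transformer; InnerXml, innerXml and parse are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class InnerXml {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes every child of the given node (text and elements alike),
    // leaving out the node's own start and end tags.
    static String innerXml(Node parent) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            transformer.transform(new DOMSource(child), new StreamResult(out));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<root><ch1>That is a text with <value name=\"val1\">value contents</value></ch1></root>");
        Node ch1 = doc.getElementsByTagName("ch1").item(0);
        System.out.println(innerXml(ch1));
    }
}
```

Because the text nodes go through the serializer as well, characters such as & and < in the text are written in their escaped form.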
If this is server-side Java (i.e. you do not need to worry about it running on other JVMs) and you are using the Sun/Oracle JDK, you can do the following:
import com.sun.org.apache.xml.internal.serialize.OutputFormat;
import com.sun.org.apache.xml.internal.serialize.XMLSerializer;
...
Node n = ...;
OutputFormat outputFormat = new OutputFormat();
outputFormat.setOmitXMLDeclaration(true);
ByteArrayOutputStream baos = new ByteArrayOutputStream();
XMLSerializer ser = new XMLSerializer(baos, outputFormat);
ser.serialize(n);
System.out.println(new String(baos.toByteArray()));
Remember that the final conversion to a string may need an explicit encoding parameter: if the parsed XML DOM has its text nodes in a different encoding than your platform's default one, you'll get garbage for the unusual characters.
You could use jOOX to wrap your DOM objects and get many utility functions from it, such as the one you need. In your case, this will produce the result you need (using CSS-style selectors to find <ch1/>):
String xml = $(document).find("ch1").content();
Or with XPath as you did:
String xml = $(document).xpath("//ch1").content();
Internally, jOOX will use a transformer to generate that output, as others have mentioned.
As far as I know, there is no equivalent of innerHTML in Document. DOM is meant to hide the details of the markup from you.
You can probably get the effect you want by going through the children of that node. Suppose for example that you want to copy out the text, but replace each "value" tag with a programmatically supplied value:
HashMap<String, String> values = ...;
StringBuilder str = new StringBuilder();
for (Node child = ch1.getFirstChild(); child != null; child = child.getNextSibling()) {
if(child.getNodeType() == Node.TEXT_NODE) {
str.append(child.getTextContent());
} else if(child.getNodeName().equals("value")) {
str.append(values.get(child.getAttributes().getNamedItem("name").getTextContent()));
}
}
String output = str.toString();

java DOM xml file create - Have no tabs or whitespaces in output file

I have already looked through the posts on Stack Overflow, but nothing seems to help.
Here is what I have:
// write the content into xml file
TransformerFactory transformerFactory = TransformerFactory.newInstance();
transformerFactory.setAttribute("indent-number", 2);
Transformer transformer = transformerFactory.newTransformer();
transformer.setOutputProperty(OutputKeys.INDENT, "yes");
DOMSource source = new DOMSource(xmlDoc);
StreamResult result = new StreamResult(new File("C:\\testing.xml"));
transformer.transform(source, result);
and this is what I get as output:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Satellite SatelliteName="" XmlFileVersion="">
<test0>
<test1>
<test2>
<test3>
<test4>
<test5>
<test6>
<test7>
<test8>
<test9/>
</test8>
</test7>
</test6>
</test5>
</test4>
</test3>
</test2>
</test1>
</test0>
</Satellite>
There are no tabs or spaces.
I set indent-number because of a possible Java bug, and I enabled OutputKeys.INDENT.
Any other ideas?
Edit 1 (after adarshr's fix):
I now have whitespace. Only the first Satellite element is placed on the first line, where it shouldn't be.
<?xml version="1.0" encoding="UTF-8"?><Satellite SatelliteName="" XmlFileVersion="">
  <test0>
    <test1>
      <test2>
        <test3>
          <test4>
            <test5>
              <test6>
                <test7>
                  <test8>
                    <test9>blah</test9>
                  </test8>
                </test7>
              </test6>
            </test5>
          </test4>
        </test3>
      </test2>
    </test1>
  </test0>
  <sdjklhewlkr/>
</Satellite>
Edit 2:
So the current state is that I now have whitespace, but there is no line feed after the XML declaration. How can I fix this?
Try setting the indent amount like this:
transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "2");
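Putting the indent property together with a workaround for the missing line feed gives the following minimal sketch. It assumes the JDK's built-in (Xalan-derived) transformer, which honors the indent-amount property; the workaround shown (omitting the generated declaration and writing one by hand, followed by a newline) is an assumption of this sketch, not something proposed in the thread. PrettyPrint, sample and prettyPrint are hypothetical names.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class PrettyPrint {

    // Builds a small nested sample document: <a><b><c/></b></a>.
    static Document sample() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element a = doc.createElement("a");
        Element b = doc.createElement("b");
        Element c = doc.createElement("c");
        doc.appendChild(a);
        a.appendChild(b);
        b.appendChild(c);
        return doc;
    }

    // Indents the output and puts the XML declaration on its own line by
    // suppressing the generated declaration and writing one manually.
    static String prettyPrint(Document doc, int indent) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount",
                String.valueOf(indent));
        StringWriter out = new StringWriter();
        out.write("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(prettyPrint(sample(), 2));
    }
}
```

When the string is later written to a file, the bytes need to be encoded as UTF-8 so that they match the hand-written declaration.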
I've played with Transformer but never got it to work. I used the Xerces (Apache) library instead, which has always worked like a charm for me. Try something like:
OutputFormat format = new OutputFormat(document);
format.setLineWidth(65);
format.setIndenting(true);
format.setIndent(2);
Writer outxml = new FileWriter(new File("out.xml"));
XMLSerializer serializer = new XMLSerializer(outxml, format);
serializer.serialize(document);
I faced the same problem some time back. The issue was that the implementation of the TransformerFactory or Transformer classes that got loaded was different from the one Java intends it to be.
There was also a system property that we had to set in order to solve it. I will try and get that for you in a moment.
EDIT: Try this:
System.setProperty("javax.xml.transform.TransformerFactory", "org.apache.xalan.xsltc.trax.TransformerFactoryImpl");
I can give you two pieces of advice.
First, you can use an XSL file for pretty output.
Second, I found an interesting library, ode-utils-XXX.jar, with which you can simply write:
String result = "";
try {
result = DOMUtils.prettyPrint(doc);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println(result);
