How do you edit values on xml that has been appended to a stringbuilder?
We have an XML file that looks like the following, which we eventually read in Java:
<?xml version="1.0" encoding="UTF-8"?>
<urn:receive
xmlns:urn="urn:xxx"
xmlns:ns="xxx"
xmlns:ns1="xxx"
xmlns:urn1="urn:xxx">
<urn:give>
<urn:giveNumber>
<ns1:number>12345678</ns1:number>
</urn:giveNumber>
<urn:giveDates>
<urn1:dateFrom>2021-07-01</urn1:dateFrom>
<urn1:dateTo>2021-09-30</urn1:dateTo>
</urn:giveDates>
</urn:give>
</urn:receive>
The following is a snippet of the code we use to read an XML file: it appends each line to a StringBuilder and eventually saves the result to a String with .toString(). Note that there is an int for number and a String each for startDate and endDate. These values must be inserted into the XML, replacing the existing number and dates. Keep in mind that we are not allowed to edit the XML file itself.
public class test {
// Logger to print output in commandprompt
private static final Logger LOGGER = Logger.getLogger(test.class.getName());
public void changeDate() {
int number = 44444444;
String startDate = "2021-01-01";
String endDate = "2021-03-31";
try {
// the XML file for this example
File xmlFile = new File("requests/dates.xml");
Reader fileReader = new FileReader(xmlFile);
BufferedReader bufReader = new BufferedReader(fileReader);
StringBuilder sb = new StringBuilder();
String line = bufReader.readLine();
while( line != null ) {
sb.append(line).append("\n");
line = bufReader.readLine();
}
String request = sb.toString();
LOGGER.info("Request" + request);
} catch (Exception e) {
e.printStackTrace();
}
}
}
How do we replace the number and dates in the xml with number, startDate and endDate, but without editing the xml file?
LOGGER.info("Request" + request); should print the following:
<?xml version="1.0" encoding="UTF-8"?>
<urn:receive
xmlns:urn="urn:xxx"
xmlns:ns="xxx"
xmlns:ns1="xxx"
xmlns:urn1="urn:xxx">
<urn:give>
<urn:giveNumber>
<ns1:number>44444444</ns1:number>
</urn:giveNumber>
<urn:giveDates>
<urn1:dateFrom>2021-01-01</urn1:dateFrom>
<urn1:dateTo>2021-03-31</urn1:dateTo>
</urn:giveDates>
</urn:give>
</urn:receive>
Simple answer: you don't.
You need to parse the XML, and parsing the XML can be done perfectly easily by supplying the parser with the file name; reading the XML into a StringBuilder first is pointless effort.
The easiest way to make a small change to an XML document is to use XSLT, which can be easily invoked from Java. Java comes with an XSLT 1.0 processor built in. XSLT 1.0 is getting rather ancient and you might prefer to use XSLT 3.0 which is much more powerful but requires a third-party library; but for a simple job like this, 1.0 is quite adequate. The stylesheet needed consists of a general rule that copies things unchanged:
<xsl:template match="*">
<xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
and then a couple of rules for changing the things you want to change:
<xsl:param name="number"/>
<xsl:param name="startDate"/>
<xsl:param name="endDate"/>
<xsl:template match="ns1:number/text()" xmlns:ns1="xxx">
<xsl:value-of select="$number"/>
</xsl:template>
<xsl:template match="urn1:dateFrom/text()" xmlns:urn1="urn:xxx">
<xsl:value-of select="$startDate"/>
</xsl:template>
<xsl:template match="urn1:dateTo/text()" xmlns:urn1="urn:xxx">
<xsl:value-of select="$endDate"/>
</xsl:template>
and then you just run the transformation from Java as described at https://docs.oracle.com/javase/tutorial/jaxp/xslt/transformingXML.html, supplying values for the parameters.
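As a minimal sketch of that last step, the following uses only the JAXP classes shipped with the JDK. The class name RequestRewriter and the method rewrite are hypothetical, the stylesheet is inlined as a string for brevity, and the parameters are passed as strings; this is one way to wire it up under those assumptions, not the only one.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies the stylesheet described above with the JDK's built-in XSLT 1.0 processor.
class RequestRewriter {

    // Copy rule plus three override rules; namespace URIs are the placeholders from the sample request.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:ns1='xxx' xmlns:urn1='urn:xxx'>"
        + "<xsl:param name='number'/>"
        + "<xsl:param name='startDate'/>"
        + "<xsl:param name='endDate'/>"
        + "<xsl:template match='*'><xsl:copy><xsl:apply-templates/></xsl:copy></xsl:template>"
        + "<xsl:template match='ns1:number/text()'><xsl:value-of select='$number'/></xsl:template>"
        + "<xsl:template match='urn1:dateFrom/text()'><xsl:value-of select='$startDate'/></xsl:template>"
        + "<xsl:template match='urn1:dateTo/text()'><xsl:value-of select='$endDate'/></xsl:template>"
        + "</xsl:stylesheet>";

    // Returns the transformed request; the input XML itself is never modified.
    static String rewrite(String xml, String number, String startDate, String endDate) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Parameter names must match the xsl:param names in the stylesheet.
            transformer.setParameter("number", number);
            transformer.setParameter("startDate", startDate);
            transformer.setParameter("endDate", endDate);
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }
}
```

To read the request straight from disk, the input could instead be new StreamSource(new File("requests/dates.xml")), which makes the StringBuilder loop unnecessary.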
Related
I have a program to perform XML mapping using XSLT. I'm using Saxon-HE-9.7 library for this. I'm also using reflexive extension functions in XSLT.
The XSLT calls a java function that returns ArrayList<HashMap<String, String>>
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0" xmlns:SQLService="com.db.SQLService" xmlns:ArrayList="java:java.util.ArrayList" xmlns:HashMap="java.util.HashMap" >
<xsl:output method="xml" indent="yes" />
<xsl:variable name="city">Texas</xsl:variable>
<xsl:variable name="query" select="'Select name, emp_id from employee where city = ?'" />
<xsl:variable name="list" select="SQLService:executeQueryMultiResult($query, $city)" />
<xsl:template match="/">
<test>
<xsl:for-each select="abc/company[@type='product']">
<employee>
<xsl:attribute name="details">
<xsl:value-of select="$list" />
</xsl:attribute>
</employee>
</xsl:for-each>
</test>
</xsl:template>
</xsl:stylesheet>
I'm getting only a single record in the list, which is the last record of the list returned by executeQueryMultiResult.
How can I store and iterate over all the elements of the list?
Firstly, I'm a bit surprised that when you iterate over abc/company[@type='product'], the body of the xsl:for-each doesn't depend in any way on the currently selected company. This means that each iteration of this loop will produce exactly the same output.
Under the default Java-to-XPath conversions, the ArrayList should be converted to an XPath sequence, but the java Maps will not be converted to XPath maps; they need to be accessed as external objects.
See what count($list) returns and check that it matches your expectations.
LATER
I am unable to reproduce the problem. I tested it like this:
public void testListOfMaps() {
try {
Processor p = new Processor(true);
XsltCompiler c = p.newXsltCompiler();
String s = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
"<xsl:stylesheet version=\"3.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" xmlns:xs=\"http://www.w3.org/2001/XMLSchema\"\n" +
" xmlns:jf=\"java:s9apitest.TestReflexionHelper\">\n" +
" <xsl:output method=\"text\" />\n" +
" <xsl:template name='main'>\n" +
" <xsl:variable name=\"theList\" select=\"jf:getListOfMaps('Weds', 'Wednesday')\" />\n" +
" <xsl:value-of select=\"count($theList)\" />\n" +
" <xsl:value-of select=\"Q{java:java.util.Map}get($theList[1], 'Weds')\" />\n" +
" </xsl:template>\n" +
"</xsl:stylesheet>";
XsltTransformer t = c.compile(new StreamSource(new StringReader(s))).load();
StringWriter out = new StringWriter();
Serializer ser = p.newSerializer(out);
t.setDestination(ser);
t.setInitialTemplate(new QName("main"));
t.transform();
assertTrue(out.toString().equals("2Wednesday"));
} catch (SaxonApiException e1) {
fail(e1.getMessage());
}
}
where the extension function jf:getListOfMaps() is:
public static List<Map<String, String>> getListOfMaps(String x, String y) {
Map<String, String> m = new HashMap<>();
m.put("Mon", "Monday");
m.put("Tues", "Tuesday");
m.put(x, y);
Map<String, String> n = new HashMap<>();
n.put("Jan", "January");
n.put("Feb", "February");
List<Map<String, String>> list = new ArrayList<>();
list.add(m);
list.add(n);
return list;
}
The test demonstrates that Saxon is behaving according to the spec: the Java List of Maps is converted to an XPath sequence of external objects, where the external object is a wrapper around the Java Map that allows use of the underlying Java methods.
I ran this on Saxon 9.9 (9.7 is no longer supported).
I suggest you try and produce a repro that simplifies the problem by replacing your extension function with a dummy stub with the same signature that anyone can run for testing.
I also suggest you tell us exactly what your environment is. I'm a bit puzzled that you say you are using Saxon-HE, because Saxon-HE doesn't support reflexive extension functions.
How can I use the Saxon XSLT transformation library to convert an XML file that contains many nodes to a plain CSV string? That is, I want Saxon to concatenate each employee entry as a CSV line and put them all together.
This is the Saxon setup, but I don't know how to transform an input XML file with it:
//false = does not require a feature from a licensed version of Saxon.
Processor processor = new Processor(false);
XsltCompiler compiler = processor.newXsltCompiler();
compiler.compile(new StreamSource("transformation.xslt"));
Serializer serializer = processor.newSerializer();
serializer.setOutputProperty(Serializer.Property.OMIT_XML_DECLARATION, "yes");
//TODO
//String result = serializer...serializeNodeToString();
I want to transform the following xml:
<employees stage="test">
<employee>
<details>
<name>Joe</name>
<age>34</age>
</details>
<address>
<street>test</street>
<nr>12</nr>
</address>
</employee>
<employee>
<address>....</address>
<details>
<!-- note the changed order of elements! -->
<age>24</age>
<name>Sam</name>
</details>
</employee>
</employees>
The result string should contain the following (one big string with linebreak separated csv lines):
test,Joe,34,test,12\n
test,Sam,24,...\n
Xslt might be similar to:
<xsl:transform version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>
<xsl:template match="employee">
<xsl:value-of select="name"/>
<xsl:value-of select="age"/>
</xsl:template>
</xsl:transform>
The XSLT template can be modified as below. The elements can be selected according to the required sequence.
<xsl:template match="employee">
<xsl:value-of select="ancestor::employees/@stage, details/name, details/age, address/street, address/nr" separator=", " />
<xsl:text>
</xsl:text>
</xsl:template>
After replacing the ... with some dummy values in the <address> element the following output is generated using the above template.
test, Joe, 34, test, 12
test, Sam, 24, test1, 123
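For comparison, here is a minimal sketch that produces the same lines without Saxon, using the XPath API bundled with the JDK. This is an alternative to the template above, not what the answer itself uses; the class name EmployeeCsv and the method toCsv are hypothetical, and the field paths are copied from the select expression in the template.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: one CSV line per employee element, columns in a fixed order.
class EmployeeCsv {

    // Same paths, in the same order, as the select expression in the template.
    static final String[] FIELDS = {
        "ancestor::employees/@stage", "details/name", "details/age",
        "address/street", "address/nr"
    };

    static String toCsv(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList employees = (NodeList) xpath.evaluate(
                    "//employee", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            StringBuilder csv = new StringBuilder();
            for (int i = 0; i < employees.getLength(); i++) {
                List<String> values = new ArrayList<>();
                for (String field : FIELDS) {
                    // Evaluated relative to the current employee, so element order in the input does not matter.
                    values.add(xpath.evaluate(field, employees.item(i)));
                }
                csv.append(String.join(", ", values)).append('\n');
            }
            return csv.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }
}
```

Because each column is looked up by path, the swapped order of name and age in the second employee has no effect on the output. Values containing commas or quotes would still need CSV escaping, which this sketch does not attempt.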
For transforming the XML (using XSLT) in Java, I use the following code snippet most of the time. This method returns the transformed output as a String. The required library is saxon9he.jar. You may have to upgrade the library version to use it with XSLT 3.0.
public static String transformXML(String xml, InputStream xslFile) {
String outputXML = null;
try {
System.setProperty("javax.xml.transform.TransformerFactory", "net.sf.saxon.TransformerFactoryImpl");
TransformerFactory factory = TransformerFactory.newInstance();
Transformer transformer = factory.newTransformer(new StreamSource(xslFile));
Source xmlStream = new StreamSource(new StringReader(xml));
StringWriter writer = new StringWriter();
Result result = new StreamResult(writer);
transformer.transform(xmlStream, result);
outputXML = writer.getBuffer().toString();
} catch (TransformerConfigurationException tce) {
tce.printStackTrace();
} catch (TransformerException te) {
te.printStackTrace();
}
return outputXML;
}
I would like to create an enum class using a file.
I hope to make maintenance easier.
txt example:
//name of the enum instance, followed by the parameter values, divided by a '-':
JOHN-23
ANNA-19
xml example:
<friends>
<friend name="JOHN">
<age>23</age>
</friend>
<friend name="ANNA">
<age>19</age>
</friend>
</friends>
I would like to have an enum acting like this one:
enum Friends {
JOHN(23),
ANNA(19);
private int age;
Friends(int age) {
this.age = age;
}
}
You can do it with an XSLT transformation and call out to SAXON via a task in your build system.
e.g. applying this to your example XML will result in your example enum code
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="text" indent="no"/>
<xsl:variable name="classname"><xsl:sequence select="concat(upper-case(substring(/*/local-name(),1,1)), substring(/*/local-name(), 2), ' '[not(last())])"/> </xsl:variable>
<xsl:template match="/*">
enum <xsl:value-of select="$classname"/>
{<xsl:for-each select="*"><xsl:if test="position()!=1">,</xsl:if><xsl:text>
</xsl:text><xsl:value-of select="@name"/>(<xsl:for-each select="*"><xsl:if test="position()!=1">, </xsl:if><xsl:value-of select="text()"/></xsl:for-each>)</xsl:for-each>;
<xsl:for-each select="*[1]/*"> private int <xsl:value-of select="local-name()"/>;
</xsl:for-each><xsl:text>
</xsl:text><xsl:value-of select="$classname"/>(<xsl:for-each select="*[1]/*"><xsl:if test="position()!=1">, </xsl:if>int <xsl:value-of select="local-name()"/></xsl:for-each>)
{
<xsl:for-each select="*[1]/*"> this.<xsl:value-of select="local-name()"/> = <xsl:value-of select="local-name()"/>;
</xsl:for-each> }
}
</xsl:template>
</xsl:stylesheet>
However,
it would break if the XML didn't have the same number of parameters for each enum value.
your input is encoding type names and field names as element names, whereas it's easier for metamodels to encode them as attributes
it's easier to write transforms for explicit rather than implicit information (i.e. say that you have an int age parameter rather than just happening to have age elements whose content is a string of decimal digits)
if you move on to anything a bit more complicated, such as generating hierarchies of classes, the queries to resolve overloads and inheritance rapidly go past simple XSLT
Here is an example of how you can generate the enum with StAX in Java:
package codegen;
import java.io.FileWriter;
import java.io.IOException;
import java.net.URL;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
public class GenCode {
public static void main(String[] args) throws XMLStreamException, IOException {
URL xmlFile = GenCode.class.getResource("Friends.xml");
XMLInputFactory inFactory = XMLInputFactory.newFactory();
XMLStreamReader reader = inFactory.createXMLStreamReader(xmlFile.openStream());
try (FileWriter out = new FileWriter("generated/codegen/Friends.java")) {
out.write("package codegen;\n");
out.write("\n");
out.write("public enum Friends {\n");
String friendName = null;
boolean inAge = false;
String sep = "\t";
while (reader.hasNext()) {
switch (reader.next()) {
case XMLStreamReader.START_ELEMENT:
if (reader.getLocalName().equals("friend"))
friendName = reader.getAttributeValue(null, "name");
if (reader.getLocalName().equals("age"))
inAge = true;
break;
case XMLStreamReader.CHARACTERS:
if (inAge) {
out.write(sep + friendName + "_" + reader.getText());
sep = ",\n\t";
}
break;
case XMLStreamReader.END_ELEMENT:
if (reader.getLocalName().equals("age"))
inAge = false;
break;
}
}
out.write("\n}");
}
}
}
You might need to change some paths.
You have to compile this file, invoke it with java, which will create the Friends.java with the enum and then compile the rest.
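The txt format from the question can be handled in the same generate-then-compile style. The following is a minimal sketch, assuming each non-comment line has the form NAME-age as in the example; the class name EnumSourceGenerator and the method generate are hypothetical, and writing the returned source to a file is left to the build step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator: turns "NAME-age" lines into the source text of an enum.
class EnumSourceGenerator {

    static String generate(String enumName, List<String> lines) {
        List<String> constants = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            // Skip blank lines and the "//" comment line shown in the txt example.
            if (trimmed.isEmpty() || trimmed.startsWith("//")) {
                continue;
            }
            String[] parts = trimmed.split("-", 2);
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected NAME-age but got: " + line);
            }
            // Parsing the age here surfaces bad input at generation time, not at compile time.
            int age = Integer.parseInt(parts[1].trim());
            constants.add("    " + parts[0].trim() + "(" + age + ")");
        }
        return "enum " + enumName + " {\n"
             + String.join(",\n", constants) + ";\n\n"
             + "    private final int age;\n\n"
             + "    " + enumName + "(int age) {\n"
             + "        this.age = age;\n"
             + "    }\n"
             + "}\n";
    }
}
```

The generated text would then be written to Friends.java and compiled together with the rest of the sources, as described above.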
I'm trying to programmatically convert a text file with multiple columns of info into an XML file with this format:
<ExampleDataSet>
<Example ExID="AA" exampleCode="AA" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="BB" exampleCode="BB" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="CC" exampleCode="CCC" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="DDD" exampleCode="DD" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="EEEE" exampleCode="EE" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
</ExampleDataSet>
I've found other examples that do similar conversions, but on a simpler level. Could anyone point me in the right direction?
You can manually create an XML document using the below. This example creates an XML document with 1 element and the attributes required.
First, create the xml document itself and append the top level element collection header.
XmlDocument doc = new XmlDocument();
XmlNode node = doc.CreateElement("ExampleDataSet");
doc.AppendChild(node);
Now create a new element row (you would need a loop here, with one iteration per CSV row).
XmlNode eg1 = doc.CreateElement("Example");
Then create each of the attributes of the element and append.
XmlAttribute att1 = doc.CreateAttribute("ExID");
att1.Value = "AA";
XmlAttribute att2 = doc.CreateAttribute("exampleCode");
att2.Value = "AA";
XmlAttribute att3 = doc.CreateAttribute("exampleDescription");
att3.Value = "THIS IS AN EXAMPLE DESCRIPTION";
eg1.Attributes.Append(att3);
eg1.Attributes.Append(att2);
eg1.Attributes.Append(att1);
Finally, append to the parent node.
node.AppendChild(eg1);
You can get the XML string like this if you need it.
string xml = doc.OuterXml;
Or you can save it directly to a file.
doc.Save("C:\\test.xml");
Hope that helps you on your way.
Thanks
In XSLT 3.0 you can write this as, for example:
<xsl:variable name="columns" select="'exId', 'exCode', 'exDesc'"/>
<xsl:template name="xsl:initial-template">
<DataSet>
<xsl:for-each select="unparsed-text-lines('input.csv')">
<xsl:variable name="tokens" select="tokenize(., '\t')"/>
<Example>
<xsl:for-each select="1 to count($tokens)">
<xsl:variable name="i" select="."/>
<xsl:attribute name="{$columns[$i]}" select="$tokens[$i]"/>
</xsl:for-each>
</Example>
</xsl:for-each>
</DataSet>
</xsl:template>
I'm not sure why you tagged the question "Java" and "C#" but you can run this using Saxon-HE called from Java or C# or from the command line.
Using xml linq and assuming first row of the file are the column headers
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.txt";
static void Main(string[] args)
{
XDocument doc = new XDocument();
doc.Add(new XElement("ExampleDataSet"));
XElement root = doc.Root;
StreamReader reader = new StreamReader(FILENAME);
int rowCount = 1;
string line = "";
string[] headers = null;
while((line = reader.ReadLine()) != null)
{
if (rowCount++ == 1)
{
headers = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
}
else
{
string[] arrayStr = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
XElement newRow = new XElement("Example");
root.Add(newRow);
for (int i = 0; i < arrayStr.Count(); i++)
{
newRow.Add(new XAttribute(headers[i], arrayStr[i]));
}
}
}
}
}
}
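Since the question was also tagged Java, here is a minimal sketch of the same idea with the JDK's DOM and Transformer classes. It assumes, like the C# version, that the first line holds tab-separated column headers; the class name TextToXml and the method convert are hypothetical.

```java
import java.io.StringWriter;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical converter: tab-separated lines in, ExampleDataSet XML out.
class TextToXml {

    static String convert(List<String> lines) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("ExampleDataSet");
            doc.appendChild(root);

            // Assumption: the first line names the attributes (for example ExID, exampleCode, ...).
            String[] headers = lines.get(0).split("\t");
            for (String line : lines.subList(1, lines.size())) {
                if (line.isEmpty()) {
                    continue;
                }
                String[] values = line.split("\t");
                Element row = doc.createElement("Example");
                // Extra values beyond the known headers are ignored.
                for (int i = 0; i < values.length && i < headers.length; i++) {
                    row.setAttribute(headers[i], values[i]);
                }
                root.appendChild(row);
            }

            // An identity Transformer serializes the DOM tree to text.
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build the XML document", e);
        }
    }
}
```

The lines could come from Files.readAllLines on the input file. Attribute order in the serialized output is decided by the DOM implementation and may differ from the column order, which XML treats as insignificant.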
I have the following code
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
doc_.getDocumentElement().normalize();
Then I can do
doc_.getDocumentElement();
and get my first element, but the problem is that instead of being job, the element is tns:job.
I know about and have tried to use:
dbFactory_.setNamespaceAware(true);
but that is just not what I'm looking for, I need something to completely get rid of namespaces.
Any help would be appreciated,
Thanks,
Josh
You can use regular expressions to strip the namespace declarations and prefixes from the XML string:
public static String removeXmlStringNamespaceAndPreamble(String xmlString) {
return xmlString.replaceAll("(<\\?[^<]*\\?>)?", ""). /* remove preamble */
replaceAll("xmlns.*?(\"|\').*?(\"|\')", "") /* remove xmlns declaration */
.replaceAll("(<)(\\w+:)(.*?>)", "$1$3") /* remove opening tag prefix */
.replaceAll("(</)(\\w+:)(.*?>)", "$1$3"); /* remove closing tags prefix */
}
For Element and Attribute nodes:
Node node = ...;
String name = node.getLocalName();
will give you the local part of the node's name.
See Node.getLocalName()
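One detail worth spelling out: getLocalName() only returns a name when the document was parsed by a namespace-aware parser. With the default (non-namespace-aware) DocumentBuilderFactory settings, the JDK's parser returns null from getLocalName(). A minimal sketch, in which the class name LocalNameDemo, the method parseRoot, and the namespace URI used in the usage note are all hypothetical:

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo: parse an XML string and return its root element.
class LocalNameDemo {

    static Element parseRoot(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // getLocalName() is only populated when this is true.
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }
}
```

For an input such as <tns:job xmlns:tns="urn:example"/>, parseRoot(xml, true).getLocalName() gives job while getNodeName() still gives tns:job, so the prefix can be ignored without rewriting the document.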
You can pre-process XML to remove all namespaces, if you absolutely must do so. I'd recommend against it, as removing namespaces from an XML document is in essence comparable to removing namespaces from a programming framework or library - you risk name clashes and lose the ability to differentiate between once-distinct elements. However, it's your funeral. ;-)
This XSLT transformation removes all namespaces from any XML document.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Apply it to your XML document. Java examples for doing such a thing should be plenty, even on this site. The resulting document will be exactly of the same structure and layout, just without namespaces.
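As one such example, here is a minimal sketch using the JDK's built-in XSLT processor. The inlined stylesheet is a variant of the one above rather than a verbatim copy: it matches elements, attributes, and the remaining node kinds with separate non-overlapping patterns, and it writes each attribute's value with xsl:value-of. The class name NamespaceStripper and the method strip are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: rebuilds every element and attribute under its local name only.
class NamespaceStripper {

    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='*'>"
        + "<xsl:element name='{local-name()}'><xsl:apply-templates select='@*|node()'/></xsl:element>"
        + "</xsl:template>"
        + "<xsl:template match='@*'>"
        + "<xsl:attribute name='{local-name()}'><xsl:value-of select='.'/></xsl:attribute>"
        + "</xsl:template>"
        + "<xsl:template match='text()|comment()|processing-instruction()'><xsl:copy/></xsl:template>"
        + "</xsl:stylesheet>";

    static String strip(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Namespace stripping failed", e);
        }
    }
}
```

The result can then be passed to the DocumentBuilder as before, at which point getDocumentElement() reports job rather than tns:job. Two attributes that differ only in their prefix would collide after stripping, which is one of the name clashes mentioned above.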
Rather than
dbFactory_.setNamespaceAware(true);
Use
dbFactory_.setNamespaceAware(false);
Although I agree with Tomalak: in general, namespaces are more helpful than harmful. Why don't you want to use them?
Edit: this answer doesn't answer the OP's question, which was how to get rid of namespace prefixes. RD01 provided the correct answer to that.
Tomalak, one fix of your XSLT (in 3rd template):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node() | @*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<!-- Here! -->
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
public static void wipeRootNamespaces(Document xml) {
Node root = xml.getDocumentElement();
NodeList rootchildren = root.getChildNodes();
Element newroot = xml.createElement(root.getNodeName());
for (int i=0;i<rootchildren.getLength();i++) {
newroot.appendChild(rootchildren.item(i).cloneNode(true));
}
xml.replaceChild(newroot, root);
}
The size of the input XML also needs to be considered when choosing a solution. For large documents (around ~100k in size, which is possible if your input comes from a web service), you also need to consider the garbage collection implications of manipulating a large string. We used String.replaceAll before, and it caused frequent OOM errors in production with a 1.5G heap size because of the way replaceAll is implemented.
You can reference http://app-inf.blogspot.com/2013/04/pitfalls-of-handling-large-string.html for our findings.
I am not sure how XSLT deals with large String objects, but we ended up parsing the string manually to remove prefixes in one pass, to avoid creating additional large Java objects.
public static String removePrefixes(String input1) {
String ret = null;
int strStart = 0;
boolean finished = false;
if (input1 != null) {
//BE CAREFUL : allocate enough size for StringBuffer to avoid expansion
StringBuffer sb = new StringBuffer(input1.length());
while (!finished) {
int start = input1.indexOf('<', strStart);
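// Search for '>' from the '<' position so a '>' in preceding text is not mistaken for the tag end
int end = (start == -1) ? -1 : input1.indexOf('>', start);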
if (start != -1 && end != -1) {
// Appending anything before '<', including '<'
sb.append(input1, strStart, start + 1);
String tag = input1.substring(start + 1, end);
if (tag.charAt(0) == '/') {
// Appending '/' if it is "</"
sb.append('/');
tag = tag.substring(1);
}
int colon = tag.indexOf(':');
int space = tag.indexOf(' ');
if (colon != -1 && (space == -1 || colon < space)) {
tag = tag.substring(colon + 1);
}
// Appending tag with prefix removed, and ">"
sb.append(tag).append('>');
strStart = end + 1;
} else {
finished = true;
}
}
//BE CAREFUL : use new String(sb) instead of sb.toString for large Strings
ret = new String(sb);
}
return ret;
}
Instead of using TransformerFactory and then calling transform on it (which was injecting the empty namespace), I serialized the document as follows:
OutputStream outputStream = new FileOutputStream(new File(xMLFilePath));
OutputFormat outputFormat = new OutputFormat(doc, "UTF-8", true);
outputFormat.setOmitComments(true);
outputFormat.setLineWidth(0);
XMLSerializer serializer = new XMLSerializer(outputStream, outputFormat);
serializer.serialize(doc);
outputStream.close();
I also faced the namespace issue and was unable to read the XML file in Java. Below is the solution:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false); // important: this disables namespace processing in the parser
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("XML/"+ fileName);