I'm trying to programmatically convert a text file with multiple columns of info into an XML file with this format:
<ExampleDataSet>
<Example ExID="AA" exampleCode="AA" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="BB" exampleCode="BB" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="CC" exampleCode="CCC" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="DDD" exampleCode="DD" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
<Example ExID="EEEE" exampleCode="EE" exampleDescription="THIS IS AN EXAMPLE DESCRIPTION"/>
</ExampleDataSet>
I've found other examples that do similar conversions, but on a simpler level. Could anyone point me in the right direction?
You can manually create an XML document using the below. This example creates an XML document with 1 element and the attributes required.
First, create the xml document itself and append the top level element collection header.
XmlDocument doc = new XmlDocument();
XmlNode node = doc.CreateElement("ExampleDataSet");
doc.AppendChild(node);
Now create a new element for the row (you would need a loop here, one iteration per CSV row).
XmlNode eg1 = doc.CreateElement("Example");
Then create each of the attributes of the element and append.
XmlAttribute att1 = doc.CreateAttribute("ExID");
att1.Value = "AA";
XmlAttribute att2 = doc.CreateAttribute("exampleCode");
att2.Value = "AA";
XmlAttribute att3 = doc.CreateAttribute("exampleDescription");
att3.Value = "THIS IS AN EXAMPLE DESCRIPTION";
eg1.Attributes.Append(att1);
eg1.Attributes.Append(att2);
eg1.Attributes.Append(att3);
Finally, append to the parent node.
node.AppendChild(eg1);
You can get the XML string like this if you need it.
string xml = doc.OuterXml;
Or you can save it directly to a file.
doc.Save("C:\\test.xml");
Hope that helps you on your way.
Thanks
In XSLT 3.0 you can write this as, for example:
<xsl:variable name="columns" select="'exId', 'exCode', 'exDesc'"/>
<xsl:template name="xsl:initial-template">
<DataSet>
<xsl:for-each select="unparsed-text-lines('input.csv')">
<xsl:variable name="tokens" select="tokenize(., '\t')"/>
<Example>
<xsl:for-each select="1 to count($tokens)">
<xsl:variable name="i" select="."/>
<xsl:attribute name="{$columns[$i]}" select="$tokens[$i]"/>
</xsl:for-each>
</Example>
</xsl:for-each>
</DataSet>
</xsl:template>
I'm not sure why you tagged the question "Java" and "C#" but you can run this using Saxon-HE called from Java or C# or from the command line.
Using LINQ to XML, and assuming the first row of the file contains the column headers:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;
using System.IO;
namespace ConsoleApplication1
{
class Program
{
const string FILENAME = @"c:\temp\test.txt";
static void Main(string[] args)
{
XDocument doc = new XDocument();
doc.Add(new XElement("ExampleDataSet"));
XElement root = doc.Root;
StreamReader reader = new StreamReader(FILENAME);
int rowCount = 1;
string line = "";
string[] headers = null;
while((line = reader.ReadLine()) != null)
{
if (rowCount++ == 1)
{
headers = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
}
else
{
string[] arrayStr = line.Split(new char[] { '\t' }, StringSplitOptions.RemoveEmptyEntries);
XElement newRow = new XElement("Example");
root.Add(newRow);
for (int i = 0; i < arrayStr.Count(); i++)
{
newRow.Add(new XAttribute(headers[i], arrayStr[i]));
}
}
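}
reader.Close();
// Write the result; the output path here is only a placeholder
doc.Save(@"c:\temp\test.xml");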
}
}
}
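Since the question is also tagged Java, the same header-driven idea can be sketched with the JDK's DOM API. This is only a minimal sketch: the class name `TsvToXml` and the method `convert` are made up for illustration, and it assumes tab-separated input whose first line holds column names that are valid XML attribute names.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class TsvToXml {

    // Converts tab-separated lines (first line = column headers) to an XML string.
    // Checked exceptions are wrapped to keep the sketch short.
    static String convert(List<String> lines) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("ExampleDataSet");
            doc.appendChild(root);

            String[] headers = lines.get(0).split("\t", -1);
            for (String line : lines.subList(1, lines.size())) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] fields = line.split("\t", -1); // -1 keeps empty fields
                Element row = doc.createElement("Example");
                for (int i = 0; i < fields.length && i < headers.length; i++) {
                    row.setAttribute(headers[i], fields[i]);
                }
                root.appendChild(row);
            }

            // Serialize the DOM tree with an identity transformation
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("conversion failed", e);
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "ExID\texampleCode\texampleDescription",
                "AA\tAA\tTHIS IS AN EXAMPLE DESCRIPTION");
        System.out.println(convert(lines));
    }
}
```

For a real file you would pass something like `Files.readAllLines(path)` instead of the hard-coded list. Note that attribute order is not significant in XML, so the serializer may not emit the attributes in column order.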
How do you edit values on xml that has been appended to a stringbuilder?
We have an XML file that looks like the following, which we eventually read in Java:
<?xml version="1.0" encoding="UTF-8"?>
<urn:receive
xmlns:urn="urn:xxx"
xmlns:ns="xxx"
xmlns:ns1="xxx"
xmlns:urn1="urn:xxx">
<urn:give>
<urn:giveNumber>
<ns1:number>12345678</ns1:number>
</urn:giveNumber>
<urn:giveDates>
<urn1:dateFrom>2021-07-01</urn1:dateFrom>
<urn1:dateTo>2021-09-30</urn1:dateTo>
</urn:giveDates>
</urn:give>
</urn:receive>
The following is a snippet of code that we use to read an xml file by appending to a stringbuilder and eventually saving it to a string with .toString(). Do notice that there is an int for number and string for startDate and for endDate. These values must be inserted into the xml, and replace the number and dates. Keep in mind that we are not allowed to edit the xml file.
public class test {
// Logger to print output in commandprompt
private static final Logger LOGGER = Logger.getLogger(test.class.getName());
public void changeDate() {
int number = 44444444;
String startDate = "2021-01-01";
String endDate = "2021-03-31";
try {
// the XML file for this example
File xmlFile = new File("requests/dates.xml");
Reader fileReader = new FileReader(xmlFile);
BufferedReader bufReader = new BufferedReader(fileReader);
StringBuilder sb = new StringBuilder();
String line = bufReader.readLine();
while( line != null ) {
sb.append(line).append("\n");
line = bufReader.readLine();
}
String request = sb.toString();
LOGGER.info("Request" + request);
} catch (Exception e) {
e.printStackTrace();
}
}
}
How do we replace the number and dates in the xml with number, startDate and endDate, but without editing the xml file?
LOGGER.info("Request" + request); should print the following:
<?xml version="1.0" encoding="UTF-8"?>
<urn:receive
xmlns:urn="urn:xxx"
xmlns:ns="xxx"
xmlns:ns1="xxx"
xmlns:urn1="urn:xxx">
<urn:give>
<urn:giveNumber>
<ns1:number>44444444</ns1:number>
</urn:giveNumber>
<urn:giveDates>
<urn1:dateFrom>2021-01-01</urn1:dateFrom>
<urn1:dateTo>2021-03-31</urn1:dateTo>
</urn:giveDates>
</urn:give>
</urn:receive>
Simple answer: you don't.
You need to parse the XML, and parsing the XML can be done perfectly easily by supplying the parser with the file name; reading the XML into a StringBuilder first is pointless effort.
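As a minimal sketch of the "parse it first" idea, here is one way to do the replacement with the JDK's DOM parser and then serialize the tree to a String, leaving the source untouched. The class and method names (`DomEdit`, `replaceValues`) are illustrative, and the namespace URIs `xxx` and `urn:xxx` are simply the placeholders from the question. The sketch parses from a String to stay self-contained; `DocumentBuilder.parse(File)` would read the file directly.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class DomEdit {

    // Replaces the text of the first element with this namespace URI and local name.
    // Assumes the element exists; item(0) is null otherwise.
    private static void setText(Document doc, String ns, String localName, String value) {
        doc.getElementsByTagNameNS(ns, localName).item(0).setTextContent(value);
    }

    // Returns the edited XML as a String; the source itself is never written to.
    static String replaceValues(String xml, int number, String startDate, String endDate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for getElementsByTagNameNS
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            setText(doc, "xxx", "number", String.valueOf(number));
            setText(doc, "urn:xxx", "dateFrom", startDate);
            setText(doc, "urn:xxx", "dateTo", endDate);

            // Serialize the modified tree with an identity transformation
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not edit XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<urn:receive xmlns:urn='urn:xxx' xmlns:ns1='xxx' xmlns:urn1='urn:xxx'>"
                + "<urn:give><urn:giveNumber><ns1:number>12345678</ns1:number></urn:giveNumber>"
                + "<urn:giveDates><urn1:dateFrom>2021-07-01</urn1:dateFrom>"
                + "<urn1:dateTo>2021-09-30</urn1:dateTo></urn:giveDates></urn:give></urn:receive>";
        System.out.println(replaceValues(xml, 44444444, "2021-01-01", "2021-03-31"));
    }
}
```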
The easiest way to make a small change to an XML document is to use XSLT, which can be easily invoked from Java. Java comes with an XSLT 1.0 processor built in. XSLT 1.0 is getting rather ancient and you might prefer to use XSLT 3.0 which is much more powerful but requires a third-party library; but for a simple job like this, 1.0 is quite adequate. The stylesheet needed consists of a general rule that copies things unchanged:
<xsl:template match="*">
<xsl:copy><xsl:apply-templates/></xsl:copy>
</xsl:template>
and then a couple of rules for changing the things you want to change:
<xsl:param name="number"/>
<xsl:param name="startDate"/>
<xsl:param name="endDate"/>
<xsl:template match="ns1:number/text()" xmlns:ns1="xxx">
<xsl:value-of select="$number"/>
</xsl:template>
<xsl:template match="urn1:dateFrom/text()" xmlns:urn1="urn:xxx">
<xsl:value-of select="$startDate"/>
</xsl:template>
<xsl:template match="urn1:dateTo/text()" xmlns:urn1="urn:xxx">
<xsl:value-of select="$endDate"/>
</xsl:template>
and then you just run the transformation from Java as described at https://docs.oracle.com/javase/tutorial/jaxp/xslt/transformingXML.html, supplying values for the parameters.
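A minimal sketch of that last step, using only the JDK's built-in `javax.xml.transform`. The class name `XsltParams` and the method `applyParams` are illustrative. The stylesheet is held in a String and matches `ns1:number`, `urn1:dateFrom` and `urn1:dateTo` against the placeholder namespaces from the question; in practice both stylesheet and input could be `StreamSource`s over files.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XsltParams {

    // Copy rule plus three rules that swap in the parameter values.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:ns1='xxx' xmlns:urn1='urn:xxx'>"
            + "<xsl:param name='number'/>"
            + "<xsl:param name='startDate'/>"
            + "<xsl:param name='endDate'/>"
            + "<xsl:template match='*'>"
            + "<xsl:copy><xsl:apply-templates/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='ns1:number/text()'>"
            + "<xsl:value-of select='$number'/>"
            + "</xsl:template>"
            + "<xsl:template match='urn1:dateFrom/text()'>"
            + "<xsl:value-of select='$startDate'/>"
            + "</xsl:template>"
            + "<xsl:template match='urn1:dateTo/text()'>"
            + "<xsl:value-of select='$endDate'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Runs the transformation and returns the result as a String.
    // Checked exceptions are wrapped to keep the sketch short.
    static String applyParams(String xml, int number, String startDate, String endDate) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // Parameter names must match the xsl:param declarations
            t.setParameter("number", String.valueOf(number));
            t.setParameter("startDate", startDate);
            t.setParameter("endDate", endDate);

            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<urn:receive xmlns:urn='urn:xxx' xmlns:ns1='xxx' xmlns:urn1='urn:xxx'>"
                + "<urn:give><urn:giveNumber><ns1:number>12345678</ns1:number></urn:giveNumber>"
                + "<urn:giveDates><urn1:dateFrom>2021-07-01</urn1:dateFrom>"
                + "<urn1:dateTo>2021-09-30</urn1:dateTo></urn:giveDates></urn:give></urn:receive>";
        System.out.println(applyParams(xml, 44444444, "2021-01-01", "2021-03-31"));
    }
}
```

The copy rule here only handles elements and text, which is enough for the sample document; a document with attributes would need the copy rule extended to cover them.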
In MS-Word 2010 there is an option under File -> Information to check the document for problems before sharing it. This makes it possible to handle tracked changes (accepting them to get the newest version) and remove all comments and annotations from the document at once.
Is this possibility available in docx4j as well, or do I need to investigate the corresponding JAXB objects and write a traversal finder?
Doing that manually could be a lot of work since I would have to add the RunIns (w:ins) to the R (w:r) and remove the RunDel (w:del). I also saw a w:del once inside a w:ins. In this case I don't know if this also appears vice versa or in deeper nestings.
Further research brought this XSLT up:
https://github.com/plutext/docx4all/blob/master/docx4all/src/main/java/org/docx4all/util/ApplyRemoteChanges.xslt
I was not able to run this within docx4j, but I could apply it by manually unzipping the docx and extracting document.xml. After applying the XSLT to the plain document.xml, I wrapped it in the docx container again to open it with MS-Word. The result was not the same as accepting the revisions in MS-Word itself. More concretely: the XSLT removed the text marked as deleted (in a table), but not the list bullet before the text. This appears quite often in my document.
If this request is not possible to solve in an easy manner, I will change the constraints. It is sufficient for me to have a method for getting all the text of a ContentAccessor as a String. The ContentAccessor could be a P or Tc. The String will be inside an R there, or inside a RunIns (with an R inside of that). For this I have a half solution below. The interesting part starts at the line else if (child instanceof RunIns) {. But as mentioned above, I'm not sure how nested del/ins statements might appear and whether this will handle them well. And the results are still not the same as if I had prepared the document with MS-Word beforehand.
//Similar to:
//http://www.docx4java.org/forums/docx-java-f6/how-to-get-all-text-element-of-a-paragraph-with-docx4j-t2028.html
private String getAllTextfromParagraph(ContentAccessor ca) {
String result = "";
List<Object> children = ca.getContent();
for (Object child : children) {
child = XmlUtils.unwrap(child);
if (child instanceof Text) {
Text text = (Text) child;
result += text.getValue();
} else if (child instanceof R) {
R run = (R) child;
result += getTextFromRun(run);
}
else if (child instanceof RunIns) {
RunIns ins = (RunIns) child;
for (Object obj : ins.getCustomXmlOrSmartTagOrSdt()) {
if (obj instanceof R) {
result += getTextFromRun((R) obj);
}
}
}
}
return result.trim();
}
private String getTextFromRun(R run) {
String result = "";
for (Object o : run.getContent()) {
o = XmlUtils.unwrap(o);
if (o instanceof R.Tab) {
Text text = new Text();
text.setValue("\t");
result += text.getValue();
}
if (o instanceof R.SoftHyphen) {
Text text = new Text();
text.setValue("\u00AD");
result += text.getValue();
}
if (o instanceof Br) {
Text text = new Text();
text.setValue(" ");
result += text.getValue();
}
if (o instanceof Text) {
result += ((Text) o).getValue();
}
}
return result;
}
https://github.com/plutext/docx4j/commit/309a8e4008553452ebe675e81def30aab97542a2?w=1 adds a method for transforming just one Part, and sample code to use it to accept changes.
The XSLT is just what you found (relicensed as Apache 2):
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:WX="http://schemas.microsoft.com/office/word/2003/auxHint"
xmlns:aml="http://schemas.microsoft.com/aml/2001/core"
xmlns:w10="urn:schemas-microsoft-com:office:word"
xmlns:pkg="http://schemas.microsoft.com/office/2006/xmlPackage"
xmlns:msxsl="urn:schemas-microsoft-com:xslt"
xmlns:ext="http://www.xmllab.net/wordml2html/ext"
xmlns:java="http://xml.apache.org/xalan/java"
xmlns:xml="http://www.w3.org/XML/1998/namespace"
version="1.0"
exclude-result-prefixes="java msxsl ext o v WX aml w10">
<xsl:output method="xml" encoding="utf-8" omit-xml-declaration="no" indent="yes" />
<xsl:template match="/ | @*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="w:del" />
<xsl:template match="w:ins" >
<xsl:apply-templates select="*"/>
</xsl:template>
</xsl:stylesheet>
You'll need to add support for the other elements identified in the MSDN link. If you do that, I'd be happy to get a pull request.
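To see what these three rules do with nested revisions (the w:del inside w:ins case from the question), the following sketch applies them to a small hand-written fragment using the JDK's built-in XSLT processor. This is not the docx4j API: the class name `AcceptChangesSketch` and the sample fragment are made up for illustration, and the stylesheet is a trimmed copy of the one above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class AcceptChangesSketch {

    static final String W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Identity rule, drop w:del, unwrap w:ins (same three rules as the stylesheet above).
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0'"
            + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
            + " xmlns:w='" + W_NS + "'>"
            + "<xsl:template match='@*|node()'>"
            + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
            + "</xsl:template>"
            + "<xsl:template match='w:del'/>"
            + "<xsl:template match='w:ins'>"
            + "<xsl:apply-templates select='*'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the result.
    static String acceptChanges(String documentXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(documentXml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String fragment = "<w:document xmlns:w='" + W_NS + "'><w:body><w:p>"
                + "<w:r><w:t>kept</w:t></w:r>"
                + "<w:ins><w:r><w:t>added</w:t></w:r>"
                + "<w:del><w:r><w:delText>nested</w:delText></w:r></w:del></w:ins>"
                + "<w:del><w:r><w:delText>removed</w:delText></w:r></w:del>"
                + "</w:p></w:body></w:document>";
        System.out.println(acceptChanges(fragment));
    }
}
```

Because the w:ins rule re-applies templates to its children, a w:del nested inside a w:ins is still dropped. Revision markup that is not a w:ins or w:del element is left alone, which is the gap mentioned above.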
I use a SolrJ-based client to query Solr and I have been trying to construct HTTP requests where facet name/value pairs are excluded. The web interface I am working with has a refine further functionality, which allows excluding one or more facet values. I have 3 facet fields: domain, content type and author and I would like to be able to handle faceting by exclusion on each of them. For example, q = Dickens AND fq=-author:Dickens, Janet will construct the following HTTP request:
/solr/solrbase/select?q=Dickens&fq=-author:Dickens%2c+Janet&wt=json&indent=true
Whereas the XML dump will look like:
<facets>
<facet name="author">
<facetEntry count="20">Dickens, Charles</facetEntry>
<facetEntry count="10">Dickens, Sarah</facetEntry>
</facet>
</facets>
So far, the Java implementation I am working with does not seem to handle filter query exclusion:
private HttpSolrServer solrServer;
solrServer = new HttpSolrServer("http://localhost:8983/solr/");
private static final String CONFIG_SOLR_FACET_FIELD = "facet_field";
private String[] _facetFields = new String[] {"author"};
private static final String CONFIG_SOLR_FACETS = "facets";
Element el = myParams.getChild(CONFIG_SOLR_FACETS);
_facetUse = el.getAttributeValue("useFacets", "true");
_facetMinCount = el.getAttributeValue("minCount", String.valueOf(1));
_facetLimit = el.getAttributeValue("limit", String.valueOf(20));
List vals = el.getChildren(CONFIG_SOLR_FACET_FIELD);
if (vals.size() > 0) {
_facetFields = new String[vals.size()];
for (int i=0; i < vals.size(); i++) {
_facetFields[i] = ((Element)vals.get(i)).getTextTrim();
}
}
SolrQuery query = new SolrQuery();
query.setQuery(qs);
List facetList = doc.getRootElement().getChildren("facet");
Iterator<?> it = facetList.iterator();
while (it.hasNext()) {
Element el = (Element)it.next(); //
String name = el.getAttributeValue("name");
String value = el.getTextTrim();
if (name != null && value != null) {
facets.add(name+":"+value);
}
}
query.setQuery(qs).
setFacet(Boolean.parseBoolean(_facetUse)).
setFacetMinCount(Integer.parseInt(_facetMinCount)).
setFacetLimit(Integer.parseInt(_facetLimit));
for (int i=0; i<_facetFields.length; i++) {
query.addFacetField(_facetFields[i]);
};
for (int i=0; i<facets.size(); i++) {
query.addFilterQuery(facets.get(i));
};
return query;
}
I was recommended to use something along these lines:
SolrQuery solrQuery = new SolrQuery();
solrQuery.set(CommonParams.FQ, "-author:Dickens,Janet");
However, this seems to be a hardcoded approach and it cannot be easily applied across all 3 facets and all facet values. I have looked at this, but still it is not clear to me how I should include the exclusion variant in my current code. Can you help with this?
Thanks indeed,
I.
EDIT 1
I have attached the code to construct/prepare the Solr Query, but I should have also included how the Solr instance is actually queried:
private QueryResponse execQuery(SolrQuery query) throws SolrServerException {
QueryResponse rsp = solrServer.query( query );
return rsp;
}
Moreover, it would be helpful to post the code that converts the Solr query response for the facets into something that can be understood by the web application:
Element elfacets = new Element("facets");
List<FacetField> facets = rsp.getFacetFields();
if (facets != null) {
int i = 0;
for (FacetField facet : facets) {
Element sfacet = new Element("facet");
sfacet.setAttribute("name", facet.getName());
List<Count> facetEntries = facet.getValues();
for(FacetField.Count fcount : facetEntries) {
Element facetEntry = new Element("facetEntry");
facetEntry.setText(fcount.getName());
facetEntry.setAttribute("count", String.valueOf(fcount.getCount()));
sfacet.addContent(facetEntry);
}
elfacets.addContent(sfacet);
}
root.addContent(elfacets);
}
doc.addContent(root);
return doc;
}
"facets" is nothing more than the XSLT, which includes rules on how to map Solr facets with the facets as handled by the web application.
EDIT 2
I attach the "facets" template, which is called by the code as presented in EDIT 1:
<xsl:template name="facets">
<xsl:param name="q" />
<xsl:analyze-string select="$q" regex='AND facet_(.*?):\(("?.*?"?)\)'>
<xsl:matching-substring>
<xsl:choose>
<xsl:when test="regex-group(1) = 'author'">
<facet name="author"><xsl:value-of select="regex-group(2)" /></facet>
</xsl:when>
</xsl:choose>
</xsl:matching-substring>
<xsl:non-matching-substring>
<!--<xsl:analyze-string select="$q" regex='AND NOT facet_(.*?):\(("?.*?"?)\)'>
<xsl:matching-substring>
<xsl:choose>
<xsl:when test="regex-group(1) = 'author'">
<facet name="author"><xsl:value-of select="regex-group(2)" /></facet>
</xsl:when>
</xsl:choose>
</xsl:matching-substring>
</xsl:analyze-string>-->
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>
</xsl:stylesheet>
The template only features the author facet, but I have 3 facets in total. It should be noted that my web application has the following syntax for excluding facets:
AND NOT facet_author:("Dickens, Janet")
I'm sure you have the following lines inside some method. Instead of hard-coding the fq part, use a variable there.
SolrQuery solrQuery = new SolrQuery();
solrQuery.set(CommonParams.FQ, "-author:Dickens,Janet");
If you need to use the fq, pass the proper parameter (e.g. "-author:Dickens,Janet"). Otherwise pass an empty string, so your query will look like
/solr/solrbase/select?q=Dickens&fq=&wt=json&indent=true
Then add the faceting part of the query. Even though the query contains an empty fq=, it won't throw an error; the fq part simply has no effect, and the rest of the query works fine.
Hope this will help.
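One way to avoid hard-coding is to derive the filter strings from the web application's own syntax. The sketch below is plain Java with no SolrJ dependency. The class `FacetFilters` and its method are hypothetical names, and it assumes facet field names are single words and that values contain no closing parenthesis. Each returned string could then be handed to `query.addFilterQuery(...)`, as the question's code already does for inclusions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FacetFilters {

    // Matches both  AND facet_<field>:(<value>)  and  AND NOT facet_<field>:(<value>)
    private static final Pattern FACET =
            Pattern.compile("AND (NOT )?facet_(\\w+):\\((.*?)\\)");

    // Turns the web application's facet clauses into Solr fq strings;
    // excluded facets get a leading '-'.
    static List<String> toFilterQueries(String q) {
        List<String> filterQueries = new ArrayList<>();
        Matcher m = FACET.matcher(q);
        while (m.find()) {
            String prefix = (m.group(1) != null) ? "-" : "";
            filterQueries.add(prefix + m.group(2) + ":" + m.group(3));
        }
        return filterQueries;
    }

    public static void main(String[] args) {
        String q = "Dickens AND NOT facet_author:(\"Dickens, Janet\")"
                + " AND facet_domain:(\"fiction\")";
        // prints [-author:"Dickens, Janet", domain:"fiction"]
        System.out.println(toFilterQueries(q));
    }
}
```

Because the field name is captured rather than matched literally, the same code covers all three facets without a branch per field.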
I have the following code
DocumentBuilderFactory dbFactory_ = DocumentBuilderFactory.newInstance();
Document doc_;
DocumentBuilder dBuilder = dbFactory_.newDocumentBuilder();
StringReader reader = new StringReader(s);
InputSource inputSource = new InputSource(reader);
doc_ = dBuilder.parse(inputSource);
doc_.getDocumentElement().normalize();
Then I can do
doc_.getDocumentElement();
and get my first element but the problem is instead of being job the element is tns:job.
I know about and have tried to use:
dbFactory_.setNamespaceAware(true);
but that is just not what I'm looking for, I need something to completely get rid of namespaces.
Any help would be appreciated,
Thanks,
Josh
Use a regular expression. This will solve the issue:
public static String removeXmlStringNamespaceAndPreamble(String xmlString) {
return xmlString.replaceAll("(<\\?[^<]*\\?>)?", ""). /* remove preamble */
replaceAll("xmlns.*?(\"|\').*?(\"|\')", "") /* remove xmlns declaration */
.replaceAll("(<)(\\w+:)(.*?>)", "$1$3") /* remove opening tag prefix */
.replaceAll("(</)(\\w+:)(.*?>)", "$1$3"); /* remove closing tags prefix */
}
For Element and Attribute nodes:
Node node = ...;
String name = node.getLocalName();
will give you the local part of the node's name.
See Node.getLocalName()
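A small sketch of the difference between the two names, assuming a namespace-aware parser (as far as I know, getLocalName() returns null when the document was parsed without namespace awareness). The class name and the sample namespace URI are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class LocalNameDemo {

    // Parses the XML with namespace awareness switched on and returns the root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<tns:job xmlns:tns='http://example.com/jobs'/>");
        // prints tns:job -> job
        System.out.println(root.getNodeName() + " -> " + root.getLocalName());
    }
}
```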
You can pre-process XML to remove all namespaces, if you absolutely must do so. I'd recommend against it, as removing namespaces from an XML document is in essence comparable to removing namespaces from a programming framework or library - you risk name clashes and lose the ability to differentiate between once-distinct elements. However, it's your funeral. ;-)
This XSLT transformation removes all namespaces from any XML document.
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<xsl:attribute name="{local-name()}">
<xsl:apply-templates select="node()|@*" />
</xsl:attribute>
</xsl:template>
</xsl:stylesheet>
Apply it to your XML document. Java examples for doing such a thing should be plenty, even on this site. The resulting document will be exactly of the same structure and layout, just without namespaces.
Rather than
dbFactory_.setNamespaceAware(true);
Use
dbFactory_.setNamespaceAware(false);
Although I agree with Tomalak: in general, namespaces are more helpful than harmful. Why don't you want to use them?
Edit: this answer doesn't answer the OP's question, which was how to get rid of namespace prefixes. RD01 provided the correct answer to that.
Tomalak, one fix of your XSLT (in 3rd template):
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="node()">
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
<xsl:template match="*">
<xsl:element name="{local-name()}">
<xsl:apply-templates select="node() | @*" />
</xsl:element>
</xsl:template>
<xsl:template match="@*">
<!-- Here! -->
<xsl:copy>
<xsl:apply-templates select="node() | @*" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
public static void wipeRootNamespaces(Document xml) {
Node root = xml.getDocumentElement();
NodeList rootchildren = root.getChildNodes();
Element newroot = xml.createElement(root.getNodeName());
for (int i=0;i<rootchildren.getLength();i++) {
newroot.appendChild(rootchildren.item(i).cloneNode(true));
}
xml.replaceChild(newroot, root);
}
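The method above only rebuilds the root element. A recursive variant, sketched below under the assumption that the source was parsed with a namespace-aware parser, copies the whole tree into a fresh document using local names and skips the xmlns declarations. The class and method names are illustrative, and name clashes between formerly distinct elements are not handled.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class StripNamespaces {

    // Namespace-aware parse, so that getLocalName() is populated.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Builds a new document whose elements and attributes carry no prefixes.
    static Document strip(Document source) {
        try {
            Document target = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            target.appendChild(copy(target, source.getDocumentElement()));
            return target;
        } catch (Exception e) {
            throw new IllegalStateException("could not create document", e);
        }
    }

    // Falls back to the node name if the parser was not namespace-aware.
    private static String localName(Node n) {
        return n.getLocalName() != null ? n.getLocalName() : n.getNodeName();
    }

    private static Element copy(Document target, Element src) {
        Element out = target.createElement(localName(src));
        NamedNodeMap attrs = src.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            String name = attr.getNodeName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                continue; // drop namespace declarations
            }
            out.setAttribute(localName(attr), attr.getNodeValue());
        }
        for (Node child = src.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                out.appendChild(copy(target, (Element) child));
            } else {
                out.appendChild(target.importNode(child, true)); // text, comments, etc.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Document stripped = strip(parse(
                "<tns:job xmlns:tns='http://example.com/jobs' tns:id='7'>"
                + "<tns:name>build</tns:name></tns:job>"));
        System.out.println(stripped.getDocumentElement().getNodeName());
    }
}
```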
The size of the input XML also needs to be considered when choosing a solution. For large documents (around ~100k in size, which is plausible if your input comes from a web service), you also need to consider the garbage-collection implications of manipulating a large string. We used String.replaceAll before, and it caused frequent OOM errors in production with a 1.5G heap because of the way replaceAll is implemented.
You can reference http://app-inf.blogspot.com/2013/04/pitfalls-of-handling-large-string.html for our findings.
I am not sure how XSLT deals with large String objects, but we ended up parsing the string manually to remove prefixes in a single pass, to avoid creating additional large Java objects.
public static String removePrefixes(String input1) {
String ret = null;
int strStart = 0;
boolean finished = false;
if (input1 != null) {
//BE CAREFUL : allocate enough size for StringBuffer to avoid expansion
StringBuffer sb = new StringBuffer(input1.length());
while (!finished) {
int start = input1.indexOf('<', strStart);
// Search for '>' from the '<' position, so a '>' in preceding text is not mistaken for the tag end
int end = input1.indexOf('>', start);
if (start != -1 && end != -1) {
// Appending anything before '<', including '<'
sb.append(input1, strStart, start + 1);
String tag = input1.substring(start + 1, end);
if (tag.charAt(0) == '/') {
// Appending '/' if it is "</"
sb.append('/');
tag = tag.substring(1);
}
int colon = tag.indexOf(':');
int space = tag.indexOf(' ');
if (colon != -1 && (space == -1 || colon < space)) {
tag = tag.substring(colon + 1);
}
// Appending tag with prefix removed, and ">"
sb.append(tag).append('>');
strStart = end + 1;
} else {
// No more tags: keep any trailing text, then stop
sb.append(input1, strStart, input1.length());
finished = true;
}
}
//BE CAREFUL : use new String(sb) instead of sb.toString for large Strings
ret = new String(sb);
}
return ret;
}
Instead of using TransformerFactory and then calling transform on it (which was injecting the empty namespace), I transformed as follows:
OutputStream outputStream = new FileOutputStream(new File(xMLFilePath));
OutputFormat outputFormat = new OutputFormat(doc, "UTF-8", true);
outputFormat.setOmitComments(true);
outputFormat.setLineWidth(0);
XMLSerializer serializer = new XMLSerializer(outputStream, outputFormat);
serializer.serialize(doc);
outputStream.close();
I also faced the namespace issue and was unable to read the XML file in Java. Below is the solution:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(false); // the important line: turns off namespace processing in the parser
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("XML/"+ fileName);
My goal is to take an XML string and parse it with XMLBeans XmlObject and add a few child nodes.
Here's an example document (xmlString),
<?xml version="1.0"?>
<rootNode>
<person>
<emailAddress>joefoo@example.com</emailAddress>
</person>
</rootNode>
Here's the way I'd like the XML document to be after adding some nodes,
<?xml version="1.0"?>
<rootNode>
<person>
<emailAddress>joefoo@example.com</emailAddress>
<phoneNumbers>
<home>555-555-5555</home>
<work>555-555-5555</work>
</phoneNumbers>
</person>
</rootNode>
Basically, just adding the <phoneNumbers/> node with two child nodes <home/> and <work/>.
This is as far as I've gotten,
XmlObject xml = XmlObject.Factory.parse(xmlString);
Thank you
Here is an example of using the XmlCursor to insert new elements. You can also get a DOM Node for an XmlObject and use the DOM APIs.
import org.apache.xmlbeans.*;
/**
* Adding nodes to xml using XmlCursor.
* @see http://xmlbeans.apache.org/docs/2.4.0/guide/conNavigatingXMLwithCursors.html
* @see http://xmlbeans.apache.org/docs/2.4.0/reference/org/apache/xmlbeans/XmlCursor.html
*/
public class AddNodes
{
public static final String xml =
"<rootNode>\n" +
" <person>\n" +
" <emailAddress>joefoo@example.com</emailAddress>\n" +
" </person>\n" +
"</rootNode>\n";
public static XmlOptions saveOptions = new XmlOptions().setSavePrettyPrint().setSavePrettyPrintIndent(2);
public static void main(String[] args) throws XmlException
{
XmlObject xobj = XmlObject.Factory.parse(xml);
XmlCursor cur = null;
try
{
cur = xobj.newCursor();
// We could use the convenient xobj.selectPath() or cur.selectPath()
// to position the cursor on the <person> element, but let's use the
// cursor's toChild() instead.
cur.toChild("rootNode");
cur.toChild("person");
// Move to </person> end element.
cur.toEndToken();
// Start a new <phoneNumbers> element
cur.beginElement("phoneNumbers");
// Start a new <work> element
cur.beginElement("work");
cur.insertChars("555-555-5555");
// Move past the </work> end element
cur.toNextToken();
// Or insert a new element the easy way in one step...
cur.insertElementWithText("home", "555-555-5555");
}
finally
{
if (cur != null) cur.dispose();
}
System.out.println(xobj.xmlText(saveOptions));
}
}
XMLBeans seems like a hassle, here's a solution using XOM:
import java.io.StringReader;
import nu.xom.*;
// inputXml holds the XML document as a String
Builder builder = new Builder();
Document doc = builder.build(new StringReader(inputXml));
// query() takes an XPath expression; select the <person> element
Nodes nodes = doc.query("/rootNode/person");
Element homePhone = new Element("home");
homePhone.appendChild("555-555-5555");
Element workPhone = new Element("work");
workPhone.appendChild("555-555-5555");
Element phoneNumbers = new Element("phoneNumbers");
phoneNumbers.appendChild(homePhone);
phoneNumbers.appendChild(workPhone);
((Element) nodes.get(0)).appendChild(phoneNumbers);
System.out.println(doc.toXML()); // should print modified xml
It may be a little difficult to manipulate the objects using just the XmlObject interface. Have you considered generating the XMLBEANS java objects from this xml?
If you don't have XSD for this schema you can generate it using XMLSPY or some such tools.
If you just want XML manipulation (i.e, adding nodes) you could try some other APIs like jdom or xstream or some such thing.
The getDomNode() method gives you access to the underlying W3C DOM Node. You can then append children using the W3C DOM API.
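The DOM side of that could look roughly like the sketch below. To stay self-contained it parses the sample document with the JDK parser instead of XMLBeans, so the person node here stands in for whatever you reach from getDomNode(). The class and helper names are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AppendPhoneNumbers {

    // Stand-in for obtaining the DOM tree; with XMLBeans the Node would come from getDomNode().
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Creates <name>text</name> owned by the given document.
    private static Element textElement(Document doc, String name, String text) {
        Element element = doc.createElement(name);
        element.setTextContent(text);
        return element;
    }

    // Appends <phoneNumbers><home/><work/></phoneNumbers> to the given <person> node.
    static void addPhoneNumbers(Node person, String home, String work) {
        Document doc = person.getOwnerDocument();
        Element phoneNumbers = doc.createElement("phoneNumbers");
        phoneNumbers.appendChild(textElement(doc, "home", home));
        phoneNumbers.appendChild(textElement(doc, "work", work));
        person.appendChild(phoneNumbers);
    }

    public static void main(String[] args) {
        Document doc = parse("<rootNode><person>"
                + "<emailAddress>joefoo@example.com</emailAddress>"
                + "</person></rootNode>");
        Node person = doc.getElementsByTagName("person").item(0);
        addPhoneNumbers(person, "555-555-5555", "555-555-5555");
        System.out.println(person.getLastChild().getNodeName());
    }
}
```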