Deleting table from word document using Docx4j - java

My Word document has two tables, and I am trying to delete the last one with the following code:
public static void removeTable() throws Docx4JException, JAXBException {
File doc = new File("D:\\Hello.docx");
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(doc);
MainDocumentPart mainDocumentPart = wordMLPackage.getMainDocumentPart();
String xpath = "//w:tbl";
List<Object> list = mainDocumentPart.getJAXBNodesViaXPath(xpath, false);
if(list.size()==2){
Tbl tbl = (Tbl) XmlUtils.unwrap(list.get(list.size()-1));
mainDocumentPart.getContent().remove(tbl.getParent());
wordMLPackage.save(new java.io.File("D:\\Hello.docx"));
System.out.println(list.size());
}
}
But nothing is happening to my document. Can anybody help me in this regard? Thanks

I used this code as base.
A working solution:
public class RemoveLastTable {
public static void main(String[] args) throws Docx4JException {
File doc = new File("d:\\tmp\\tables.docx");
WordprocessingMLPackage pkg = WordprocessingMLPackage.load(doc);
removeLastTable(pkg, "d:\\tmp\\tables_updated.docx");
}
public static void removeLastTable(WordprocessingMLPackage wordMLPackage, String outFile) throws Docx4JException {
Body body = wordMLPackage.getMainDocumentPart().getContents().getBody();
List<Object> tables = getAllElementFromObject(body, Tbl.class);
int indexTableToRemove = tables.size() - 1;
Tbl tableToRemove = (Tbl) tables.get(indexTableToRemove);
body.getContent().remove(tableToRemove.getParent());
wordMLPackage.save(new File(outFile));
}
private static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
List<Object> result = new ArrayList<>();
if (obj instanceof JAXBElement) {
obj = ((JAXBElement<?>) obj).getValue();
}
if (obj.getClass().equals(toSearch)) {
result.add(obj);
}
if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
}
However, saving the updated document is not perfect: my Word 2016 (Office 365) could open the result only after running its recovery.

First, specify the item you want to delete (in the list of objects your XPath returned).
Object deleteMe = list.get(1);
Use the code:
Object parent = getParent(deleteMe);
if (parent instanceof ContentAccessor) {
boolean result = ((ContentAccessor)parent).getContent().remove(deleteMe);
System.out.println("Deleted? " + result);
} else {
System.out.println("TODO: get content list from " + parent.getClass().getName());
}
with a little helper method:
private Object getParent(Object o) {
return ((Child)XmlUtils.unwrap(o)).getParent();
}
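A plain-Java analogy, as a sketch only: List.remove(Object) compares by equals, so it removes nothing and returns false when it is handed an object that is not itself an element of the list, such as a parent or an unwrapped value. That is why checking the boolean result, as above, is useful. The Wrapper class and the removeStored method below are hypothetical; Wrapper merely stands in for a wrapper such as JAXBElement, and no docx4j types are used.

```java
import java.util.ArrayList;
import java.util.List;

class RemoveFromContentSketch {
    // Removes 'stored' from 'content' and reports whether the list actually changed.
    static boolean removeStored(List<Object> content, Object stored) {
        return content.remove(stored);
    }

    public static void main(String[] args) {
        Object table = new Object();           // stands in for the unwrapped table
        Wrapper wrapped = new Wrapper(table);  // what the content list really holds
        List<Object> content = new ArrayList<>();
        content.add("paragraph");
        content.add(wrapped);

        // The unwrapped value is not an element of the list, so nothing is removed.
        System.out.println("Deleted? " + removeStored(content, table));   // Deleted? false
        // The element exactly as stored in the list is removed.
        System.out.println("Deleted? " + removeStored(content, wrapped)); // Deleted? true
    }
}

// Hypothetical stand-in for a wrapper element such as JAXBElement (not a docx4j class).
final class Wrapper {
    final Object value;
    Wrapper(Object value) { this.value = value; }
}
```

If the removal reports false, the object passed in is probably not the one the content list stores.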

Related

Recursive call works only for super parent and not for children

I am trying to get the list of strings from a REST response like below:
private List<String> recursiveRestCall(String folderId) throws JsonMappingException, JsonProcessingException {
List<String> fileList = new ArrayList<>();
Mono<String> mono = webClient.get().uri("/some/api/" + folderId + "/endpoint/goes/here").retrieve()
.bodyToMono(String.class);
final ObjectNode node = new ObjectMapper().readValue(mono.block(), ObjectNode.class);
if (node.get("parent").get("children").isArray()) {
for (JsonNode jsonNode : node.get("parent").get("children")) {
if (jsonNode.get("child").get("isFile").asBoolean()) {
fileList.add(jsonNode.toString());
}
if (jsonNode.get("child").get("isFolder").asBoolean()) {
recursiveRestCall(jsonNode.get("child").get("id").toString().replaceAll("\"", "")); // This is not working and no error whatsoever.
}
}
return fileList;
}
return null;
}
As the comment in the snippet above highlights, only the if (jsonNode.get("child").get("isFile").asBoolean()) { condition takes effect, and I get those list items. I know there are a few subfolders with files in them. So effectively only the top-level parent's children are retrieved, but not those of a child that is itself a parent (a subfolder).
What am I missing here?
Save the result of the recursive call, like this:
List<String> tempList = recursiveRestCall(jsonNode.get("child").get("id").toString().replaceAll("\"", ""));
if (tempList != null)
fileList.addAll(tempList);
Based on the comments, you can edit the method like below:
private List<String> recursiveRestCall(String folderId) throws JsonMappingException, JsonProcessingException {
List<String> fileList = new ArrayList<>();
Mono<String> mono = webClient.get().uri("/some/api/" + folderId + "/endpoint/goes/here").retrieve()
.bodyToMono(String.class);
final ObjectNode node = new ObjectMapper().readValue(mono.block(), ObjectNode.class);
if (node.get("parent").get("children").isArray()) {
for (JsonNode jsonNode : node.get("parent").get("children")) {
if (jsonNode.get("child").get("isFile").asBoolean()) {
fileList.add(jsonNode.toString());
}
if (jsonNode.get("child").get("isFolder").asBoolean()) {
fileList.addAll(recursiveRestCall(jsonNode.get("child").get("id").toString().replaceAll("\"", ""))); // merge the subfolder's files into this list
}
}
return fileList;
}
return fileList;
}
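The fix comes down to merging the list returned by each recursive call into the caller's list. Here is a minimal, self-contained sketch of that pattern, assuming an in-memory folder tree instead of the REST call. The Node class and the collectFiles name are hypothetical and not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

class RecursiveCollectSketch {
    // Hypothetical in-memory stand-in for one entry of the REST response.
    static final class Node {
        final String name;
        final boolean isFolder;
        final List<Node> children = new ArrayList<>();

        Node(String name, boolean isFolder) {
            this.name = name;
            this.isFolder = isFolder;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Returns the names of all files below 'folder', at any depth.
    static List<String> collectFiles(Node folder) {
        List<String> files = new ArrayList<>();
        for (Node child : folder.children) {
            if (child.isFolder) {
                // Keep the result of the recursive call; ignoring it loses the nested files.
                files.addAll(collectFiles(child));
            } else {
                files.add(child.name);
            }
        }
        return files; // never null, so callers can always call addAll on it
    }

    public static void main(String[] args) {
        Node root = new Node("root", true)
                .add(new Node("a.txt", false))
                .add(new Node("sub", true)
                        .add(new Node("b.txt", false))
                        .add(new Node("deeper", true).add(new Node("c.txt", false))));
        System.out.println(collectFiles(root)); // [a.txt, b.txt, c.txt]
    }
}
```

Returning an empty list rather than null keeps the addAll call safe at every level of the recursion.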

Collect all elements in a JSON file into a single list

I am using Gson 2.8.1+ (I can upgrade if needed).
If I have the JsonObject:
{
"config" : {
"option_one" : {
"option_two" : "result_one"
}
}
}
... how can I convert this efficiently to the form:
"config.option_one.option_two" : "result_one"
Simple example:
public static void main(String[] args) {
String str = """
{
"config" : {
"option_one" : {
"option_two" : "result_one"
}
}
}""";
var obj = JsonParser.parseString(str).getAsJsonObject();
System.out.println(flatten(obj)); // {"config.option_one.option_two":"result_one"}
}
public static JsonObject flatten(JsonObject toFlatten) {
var flattened = new JsonObject();
flatten0("", toFlatten, flattened);
return flattened;
}
private static void flatten0(String prefix, JsonObject toFlatten, JsonObject toMutate) {
for (var entry : toFlatten.entrySet()) {
var keyWithPrefix = prefix + entry.getKey();
if (entry.getValue() instanceof JsonObject child) {
flatten0(keyWithPrefix + ".", child, toMutate);
} else {
toMutate.add(keyWithPrefix, entry.getValue());
}
}
}
Algorithm
The simplest algorithm you can come up with is recursive folding. You first dive recursively to the bottom of the structure, then ask whether there is only one element in the map (you have to parse the JSON with some framework to get a Map<String, Object> structure). If there is, you join the parent field's name with the property's name and set the parent's value to that property's value. Then you move up and repeat the process until you are at the root. Of course, if a map has multiple fields, you move on to the parent and try again.
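The folding described above can be sketched as follows. This is only an illustration under stated assumptions: the JSON is assumed to be already parsed into nested java.util.Map instances (no Gson types are used), and the FoldSketch class and fold method are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FoldSketch {
    // Folds single-entry maps into their parent key, working bottom-up.
    static Map<String, Object> fold(Map<String, Object> map) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : map.entrySet()) {
            String key = entry.getKey();
            Object value = entry.getValue();
            if (value instanceof Map) {
                // Dive to the bottom first, so the child is already folded.
                @SuppressWarnings("unchecked")
                Map<String, Object> child = fold((Map<String, Object>) value);
                if (child.size() == 1) {
                    // Exactly one property: join the names and lift its value.
                    Map.Entry<String, Object> only = child.entrySet().iterator().next();
                    key = key + "." + only.getKey();
                    value = only.getValue();
                } else {
                    value = child; // several (or no) properties: keep the nested map
                }
            }
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> optionOne = new LinkedHashMap<>();
        optionOne.put("option_two", "result_one");
        Map<String, Object> config = new LinkedHashMap<>();
        config.put("option_one", optionOne);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("config", config);
        System.out.println(fold(root)); // {config.option_one.option_two=result_one}
    }
}
```

Unlike the flatten method shown earlier, this sketch leaves maps with several fields nested, as the description says.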
Gson does not have anything like that built in, but it provides enough to build it on top: you can walk JSON streams (JsonReader) with an explicit stack, and JSON trees (JsonElement, but not wrapped into JsonReader) either with a stack or recursively (streams may save a lot of memory).
I would create a generic tree-walking method to adapt it for further purposes.
public static void walk(final JsonElement jsonElement, final BiConsumer<? super Collection<?>, ? super JsonElement> consumer) {
final Deque<Object> parents = new ArrayDeque<>();
parents.push("$");
walk(jsonElement, consumer, parents);
}
private static void walk(final JsonElement jsonElement, final BiConsumer<? super Collection<?>, ? super JsonElement> consumer, final Deque<Object> path) {
if ( jsonElement.isJsonNull() ) {
consumer.accept(path, jsonElement);
} else if ( jsonElement.isJsonPrimitive() ) {
consumer.accept(path, jsonElement);
} else if ( jsonElement.isJsonObject() ) {
for ( final Map.Entry<String, JsonElement> e : jsonElement.getAsJsonObject().entrySet() ) {
path.addLast(e.getKey());
walk(e.getValue(), consumer, path);
path.removeLast();
}
} else if ( jsonElement.isJsonArray() ) {
int i = 0;
for ( final JsonElement e : jsonElement.getAsJsonArray() ) {
path.addLast(i++);
walk(e, consumer, path);
path.removeLast();
}
} else {
throw new AssertionError(jsonElement);
}
}
Note that the method above also supports arrays. The walk method is push-semantics-driven: it uses callbacks to provide the walk progress. Making it lazy by returning an iterator or a stream would probably be cheaper and get the pull semantics applied. Also, CharSequence view elements would probably save on creating many strings.
public static String toJsonPath(final Iterable<?> path) {
final StringBuilder stringBuilder = new StringBuilder();
final Iterator<?> iterator = path.iterator();
if ( iterator.hasNext() ) {
final Object next = iterator.next();
stringBuilder.append(next);
}
while ( iterator.hasNext() ) {
final Object next = iterator.next();
if ( next instanceof Number ) {
stringBuilder.append('[').append(next).append(']');
} else if ( next instanceof CharSequence ) {
stringBuilder.append('.').append(next);
} else {
throw new UnsupportedOperationException("Unsupported: " + next);
}
}
return stringBuilder.toString();
}
Test:
final JsonElement jsonElement = Streams.parse(jsonReader);
final Collection<String> paths = new ArrayList<>();
JsonPaths.walk(jsonElement, (path, element) -> paths.add(JsonPaths.toJsonPath(path)));
for ( final String path : paths ) {
System.out.println(path);
}
Assertions.assertIterableEquals(
ImmutableList.of(
"$.nothing",
"$.number",
"$.object.subobject.number",
"$.array[0].string",
"$.array[1].string",
"$.array[2][0][0][0]"
),
paths
);
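As a rough illustration of the pull-style alternative mentioned in the note on the walk method, here is a sketch of a lazy iterator driven by an explicit stack. To stay runnable without Gson, it assumes the tree is made of plain java.util.Map and java.util.List values rather than JsonElement. LazyPathsSketch, Frame, and leafPaths are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

class LazyPathsSketch {
    // One pending node: the path leading to it and the node itself.
    private static final class Frame {
        final String path;
        final Object node;
        Frame(String path, Object node) { this.path = path; this.node = node; }
    }

    // Lazily yields the path of every leaf (any value that is neither a Map nor a List).
    static Iterator<String> leafPaths(Object root) {
        final Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame("$", root));
        return new Iterator<String>() {
            private String pending; // next leaf path, or null if none is buffered

            @Override
            public boolean hasNext() {
                // Expand containers only until the next leaf surfaces.
                while (pending == null && !stack.isEmpty()) {
                    Frame frame = stack.pop();
                    List<Frame> children = new ArrayList<>();
                    if (frame.node instanceof Map) {
                        for (Map.Entry<?, ?> e : ((Map<?, ?>) frame.node).entrySet()) {
                            children.add(new Frame(frame.path + "." + e.getKey(), e.getValue()));
                        }
                    } else if (frame.node instanceof List) {
                        List<?> items = (List<?>) frame.node;
                        for (int i = 0; i < items.size(); i++) {
                            children.add(new Frame(frame.path + "[" + i + "]", items.get(i)));
                        }
                    } else {
                        pending = frame.path;
                    }
                    // Push in reverse so the first child is popped first.
                    for (int i = children.size() - 1; i >= 0; i--) {
                        stack.push(children.get(i));
                    }
                }
                return pending != null;
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                String result = pending;
                pending = null;
                return result;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> sub = new LinkedHashMap<>();
        sub.put("number", 2);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("number", 1);
        root.put("object", sub);
        root.put("array", List.of("a", List.of("b")));
        // Prints: $.number, $.object.number, $.array[0], $.array[1][0] (one per line)
        leafPaths(root).forEachRemaining(System.out::println);
    }
}
```

The caller decides how far to iterate, so a consumer that needs only the first few paths never expands the rest of the tree.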

Error: Could not find or load main class thredds.catalog.dl.ADNWriter is coming

public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
List<Object> result = new ArrayList<Object>();
if (obj instanceof JAXBElement)
obj = ((JAXBElement<?>) obj).getValue();
if (obj.getClass().equals(toSearch))
result.add(obj);
else if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
public static void main(String[] args) throws Docx4JException {
String inputfilepath = "C:\\Users\\sugreev.sharma\\Desktop\\BMS\\test\\Ireland CTAg - Authority and CTI_July 2018.docx";
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new File(inputfilepath));
MainDocumentPart mainDocumentPart = wordMLPackage.getMainDocumentPart();
List<Object> paragraphs = getAllElementFromObject(mainDocumentPart, P.class);
for (Object par : paragraphs) {
P p = (P) par;
// Get all the runs in the paragraph
List<Object> allRuns = p.getContent();
for (Object run : allRuns) {
R r = (R) run;
// Get the Text in the Run
List<Object> allText = r.getContent();
for (Object text : allText) {
Text txt = (Text) text;
System.out.println("--> " + txt.getValue());
}
}
}
}
This is the code I am using in my project.
I am using docx4j for docx processing.

Facing some strange issue while bookmarking the paragraph

public class BookmarkAdd extends AbstractSample {
public static JAXBContext context = org.docx4j.jaxb.Context.jc;
/**
* @param args
*/
@SuppressWarnings("deprecation")
public static void main(String[] args) throws Exception {
String inputfilepath = "Chapter_3.docx";
File file = new java.io.File(inputfilepath);
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(new java.io.File(inputfilepath));
MainDocumentPart documentPart = wordMLPackage.getMainDocumentPart();
String outputfilepath = System.getProperty("user.dir")+"/ 1.docx";
ClassFinder finder = new ClassFinder(P.class); // <----- change this to suit
new TraversalUtil(documentPart.getContent(), finder);
int Counter = 0;
System.out.println(finder.results.size());
for (Object o : finder.results)
{
P para =(P)o;
String name = "para" + Counter;
bookmarkPara(para, 0, para.getParagraphContent().size(), name, Counter);
Counter++;
SaveToZipFile saver = new SaveToZipFile(wordMLPackage);
saver.save(outputfilepath);
// wordMLPackage.save(new java.io.File(inputfilepath));
}
}
/**
* Surround the specified r in the specified p
* with a bookmark (with specified name and id)
* @param p
* @param r
* @param name
* @param id
*/
public static void bookmarkPara(P p, int StartIndex,int EndIndex, String name, int id) {
ObjectFactory factory = Context.getWmlObjectFactory();
BigInteger ID = BigInteger.valueOf(id);
// Add bookmark end first
CTMarkupRange mr = factory.createCTMarkupRange();
mr.setId(ID);
JAXBElement<CTMarkupRange> bmEnd = factory.createBodyBookmarkEnd(mr);
p.getParagraphContent().add(EndIndex, bmEnd); // from 2.7.0, use getContent()
// Next, bookmark start
CTBookmark bm = factory.createCTBookmark();
bm.setId(ID);
bm.setName(name);
JAXBElement<CTBookmark> bmStart = factory.createBodyBookmarkStart(bm);
p.getParagraphContent().add(StartIndex, bmStart);
}
public static List<Object> getAllElementFromObject(Object obj, Class<?> toSearch) {
List<Object> result = new ArrayList<Object>();
if (obj instanceof JAXBElement)
obj = ((JAXBElement<?>) obj).getValue();
if (obj.getClass().equals(toSearch))
result.add(obj);
else if (obj instanceof ContentAccessor) {
List<?> children = ((ContentAccessor) obj).getContent();
for (Object child : children) {
result.addAll(getAllElementFromObject(child, toSearch));
}
}
return result;
}
}
Using this code, I bookmark each paragraph as para0 to paraN, and it works fine for most documents. But for two of my docx files the bookmarking fails, and I don't know why. It shows the following error:
java.lang.IllegalArgumentException: obj parameter must not be null
at javax.xml.bind.helpers.AbstractMarshallerImpl.checkNotNull(Unknown Source)
at javax.xml.bind.helpers.AbstractMarshallerImpl.marshal(Unknown Source)
at org.docx4j.openpackaging.parts.JaxbXmlPart.marshal(JaxbXmlPart.java:361)
at org.docx4j.openpackaging.parts.JaxbXmlPart.marshal(JaxbXmlPart.java:330)
at org.docx4j.openpackaging.io.SaveToZipFile.saveRawXmlPart(SaveToZipFile.java:249)
at org.docx4j.openpackaging.io.SaveToZipFile.saveRawXmlPart(SaveToZipFile.java:198)
at org.docx4j.openpackaging.io.SaveToZipFile.savePart(SaveToZipFile.java:424)
at org.docx4j.openpackaging.io.SaveToZipFile.addPartsFromRelationships(SaveToZipFile.java:387)
at org.docx4j.openpackaging.io.SaveToZipFile.savePart(SaveToZipFile.java:442)
at org.docx4j.openpackaging.io.SaveToZipFile.addPartsFromRelationships(SaveToZipFile.java:387)
at org.docx4j.openpackaging.io.SaveToZipFile.save(SaveToZipFile.java:168)
at org.docx4j.openpackaging.io.SaveToZipFile.save(SaveToZipFile.java:97)
at Backup.BookmarkAdd.main(BookmarkAdd.java:64)
.....

jackson xml lists deserialization recognized as duplicate keys

I'm trying to convert xml into json using jackson-2.5.1 and jackson-dataformat-xml-2.5.1
The XML structure is received from a web server and is not known in advance, so I can't have a Java class to represent the object; I'm trying to convert it directly into a TreeNode using ObjectMapper.readTree.
My problem is that Jackson fails to parse lists: it takes only the last item of the list.
code:
String xml = "<root><name>john</name><list><item>val1</item>val2<item>val3</item></list></root>";
XmlMapper xmlMapper = new XmlMapper();
JsonNode jsonResult = xmlMapper.readTree(xml);
The json result:
{"name":"john","list":{"item":"val3"}}
If I enable failure on duplicate keys xmlMapper.enable(DeserializationFeature.FAIL_ON_READING_DUP_TREE_KEY), exception is thrown:
com.fasterxml.jackson.databind.JsonMappingException: Duplicate field 'item' for ObjectNode: not allowed when FAIL_ON_READING_DUP_TREE_KEY enabled
Is there any feature that fixes this problem? Is there a way to write a custom deserializer that turns duplicate keys into an array?
I use this approach:
Plug a custom deserializer into XmlMapper that collects values in a Guava Multimap. This puts everything into lists.
Write out the JSON using SerializationFeature.WRITE_SINGLE_ELEM_ARRAYS_UNWRAPPED. This unwraps all lists with size == 1.
Here is my code:
@Test
public void xmlToJson() {
String xml = "<root><name>john</name><list><item>val1</item>val2<item>val3</item></list></root>";
Map<String, Object> jsonResult = readXmlToMap(xml);
String jsonString = toString(jsonResult);
System.out.println(jsonString);
}
private Map<String, Object> readXmlToMap(String xml) {
try {
ObjectMapper xmlMapper = new XmlMapper();
xmlMapper.registerModule(new SimpleModule().addDeserializer(Object.class, new UntypedObjectDeserializer() {
@SuppressWarnings({ "unchecked", "rawtypes" })
@Override
protected Map<String, Object> mapObject(JsonParser jp, DeserializationContext ctxt) throws IOException {
JsonToken t = jp.getCurrentToken();
Multimap<String, Object> result = ArrayListMultimap.create();
if (t == JsonToken.START_OBJECT) {
t = jp.nextToken();
}
if (t == JsonToken.END_OBJECT) {
return (Map) result.asMap();
}
do {
String fieldName = jp.getCurrentName();
jp.nextToken();
result.put(fieldName, deserialize(jp, ctxt));
} while (jp.nextToken() != JsonToken.END_OBJECT);
return (Map) result.asMap();
}
}));
return (Map) xmlMapper.readValue(xml, Object.class);
} catch (Exception e) {
throw new RuntimeException(e);
}
}
static public String toString(Object obj) {
try {
ObjectMapper jsonMapper = new ObjectMapper().configure(SerializationFeature.INDENT_OUTPUT, true)
.configure(SerializationFeature.WRITE_SINGLE_ELEM_ARRAYS_UNWRAPPED, true);
StringWriter w = new StringWriter();
jsonMapper.writeValue(w, obj);
return w.toString();
} catch (Exception e) {
throw new RuntimeException(e);
}
}
It prints
{
"list" : {
"item" : [ "val1", "val3" ]
},
"name" : "john"
}
Altogether, it is a variant of this approach, which works without the Guava Multimap:
https://github.com/DinoChiesa/deserialize-xml-arrays-jackson
Same approach is used here:
Jackson: XML to Map with List deserialization
I ran into the same problem and decided to roll my own using straightforward DOM. The main problem is that XML does not really lend itself to a Map-List-Object type mapping like JSON does. However, with some assumptions, it is still possible:
Text is stored under the null key, either as a single String or as a List.
Empty elements are modeled with an empty map.
Here's the class in the hope that it might just help someone else:
public class DeXML {
public DeXML() {}
public Map<String, Object> toMap(InputStream is) {
return toMap(new InputSource(is));
}
public Map<String, Object> toMap(String xml) {
return toMap(new InputSource(new StringReader(xml)));
}
private Map<String, Object> toMap(InputSource input) {
try {
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document document = builder.parse(input);
document.getDocumentElement().normalize();
Element root = document.getDocumentElement();
return visitChildNode(root);
} catch (ParserConfigurationException | SAXException | IOException e) {
throw new RuntimeException(e);
}
}
// Check if node type is TEXT or CDATA and contains actual text (i.e. ignore
// white space).
private boolean isText(Node node) {
return ((node.getNodeType() == Element.TEXT_NODE || node.getNodeType() == Element.CDATA_SECTION_NODE)
&& node.getNodeValue() != null && !node.getNodeValue().trim().isEmpty());
}
private Map<String, Object> visitChildNode(Node node) {
Map<String, Object> map = new HashMap<>();
// Add the plain attributes to the map - fortunately, no duplicate attributes are allowed.
if (node.hasAttributes()) {
NamedNodeMap nodeMap = node.getAttributes();
for (int j = 0; j < nodeMap.getLength(); j++) {
Node attribute = nodeMap.item(j);
map.put(attribute.getNodeName(), attribute.getNodeValue());
}
}
NodeList nodeList = node.getChildNodes();
// Any text children to add to the map?
List<Object> list = new ArrayList<>();
for (int i = 0; i < node.getChildNodes().getLength(); i++) {
Node child = node.getChildNodes().item(i);
if (isText(child)) {
list.add(child.getNodeValue());
}
}
if (!list.isEmpty()) {
if (list.size() > 1) {
map.put(null, list);
} else {
map.put(null, list.get(0));
}
}
// Process the element children.
for (int i = 0; i < node.getChildNodes().getLength(); i++) {
// Ignore anything but element nodes.
Node child = nodeList.item(i);
if (child.getNodeType() != Element.ELEMENT_NODE) {
continue;
}
// Get the subtree.
Map<String, Object> childsMap = visitChildNode(child);
// Now, check if this is key already exists in the map. If it does
// and is not a List yet (if it is already a List, simply add the
// new structure to it), create a new List, add it to the map and
// put both elements in it.
if (map.containsKey(child.getNodeName())) {
Object value = map.get(child.getNodeName());
List<Object> multiple = null;
if (value instanceof List) {
multiple = (List<Object>)value;
} else {
map.put(child.getNodeName(), multiple = new ArrayList<>());
multiple.add(value);
}
multiple.add(childsMap);
} else {
map.put(child.getNodeName(), childsMap);
}
}
return map;
}
}
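The duplicate-key handling in visitChildNode can be looked at in isolation. The following is a minimal sketch using only java.util; the DuplicateKeySketch class and the putOrAppend method are hypothetical names and not part of the class above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateKeySketch {
    // Adds 'value' under 'key'; a second value for the same key promotes the entry to a List.
    static void putOrAppend(Map<String, Object> map, String key, Object value) {
        if (!map.containsKey(key)) {
            map.put(key, value);
            return;
        }
        Object existing = map.get(key);
        if (existing instanceof List) {
            // Already promoted: simply append.
            @SuppressWarnings("unchecked")
            List<Object> list = (List<Object>) existing;
            list.add(value);
        } else {
            // First duplicate: wrap the old and the new value in a fresh list.
            List<Object> list = new ArrayList<>();
            list.add(existing);
            list.add(value);
            map.put(key, list);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> list = new LinkedHashMap<>();
        putOrAppend(list, "item", "val1");
        putOrAppend(list, "item", "val3");
        putOrAppend(list, "name", "john");
        System.out.println(list); // {item=[val1, val3], name=john}
    }
}
```

One caveat of this scheme: a value that is legitimately a List cannot be told apart from a promoted one. That is not an issue when every stored value is a Map, as with the element children in the class above.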
You can catch that exception and do something like:
List<MyClass> myObjects = mapper.readValue(input, new TypeReference<List<MyClass>>(){});
(taken from: How to use Jackson to deserialise an array of objects)
It's a hackish approach and you will have to figure out how to resume from there.
Underscore-java library supports this XML.
String xml = "<root><name>john</name><list><item>val1</item>val2<item>val3</item></list></root>";
String json = U.xmlToJson(xml);
System.out.println(json);
Output JSON:
{
"root": {
"name": "john",
"list": {
"item": [
"val1",
{
"#item": {
"#text": "val2"
}
},
"val3"
]
}
},
"#omit-xml-declaration": "yes"
}
