Enterprise Architect scripting with java - create and modify linked document - java

My question: How can I create a new linked document and attach it to an element (in my case a Note element of an activity diagram)?
The Element class supports these three methods:
GetLinkedDocument ()
LoadLinkedDocument (string Filename)
SaveLinkedDocument (string Filename)
I am missing a function like
CreateLinkedDocument (string Filename)
My goal: I create an activity diagram programmatically, and some notes are too big to display nicely in the diagram. So I want to put this text into a linked document instead of directly in the activity diagram.
Regards
EDIT
Many thanks to Uffe for solving my problem. Here is my solution code:
public void addLinkedDocumentToElement(Element element, String noteText) throws IOException {
    String filePath = "C:\\rtfNote.rtf";
    // convert the string to EA's RTF format
    String rtfText = repository.GetFormatFromField("RTF", noteText);
    // create the file on disk and write the content; try-with-resources closes the writer
    try (PrintWriter writer = new PrintWriter(filePath, "UTF-8")) {
        writer.write(rtfText);
    }
    // create the linked document by loading the RTF file created above
    element.LoadLinkedDocument(filePath);
    element.Update();
}
EDIT EDIT
It is also possible to work with a temporary file:
File f = File.createTempFile("rtfdoc", ".rtf");
FileOutputStream fos = new FileOutputStream(f);
String rtfText = repository.GetFormatFromField("RTF", noteText);
fos.write(rtfText.getBytes());
fos.flush();
fos.close();
element.LoadLinkedDocument(f.getAbsolutePath());
element.Update();

First up, let's separate the linked document, which is stored in the EA project and displayed in EA's built-in RTF viewer, from an RTF file, which is stored on disk.
Element.LoadLinkedDocument() is the only way to create a linked document. It reads an RTF file and stores its contents as the element's linked document. An element can only have one linked document, and I think it is overwritten if the method is called again but I'm not absolutely sure (you could get an error instead, but the EA API tends not to work that way).
In order to specify the contents of your linked document, you must create the file and then load it. The only other way would be to go hacking around in EA's internal and undocumented database, which people sometimes do but which I strongly advise against.
In .NET you can create RTF documents using Microsoft's Word API, but to my knowledge there is no corresponding API for Java. A quick search turns up jRTF, an open-source RTF library for Java. I haven't tested it but it looks as if it'll do the trick.
You can also use EA's API to create RTF data. You would then create your intended content in EA's internal display format and use Repository.GetFormatFromField() to convert it to RTF, which you would then save in the file.
If you need to, you can use Repository.GetFieldFromFormat() to convert plain-text or HTML-formatted text to EA's internal format.
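The "create the file, then load it" step can be isolated into a small helper. This is only a sketch using the JDK's `java.nio.file` API; the class name `RtfTempFile` is hypothetical, and UTF-8 is assumed because the question's first snippet uses it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the EA API.
class RtfTempFile {
    /** Writes RTF text to a new temporary .rtf file and returns its path. */
    static Path write(String rtfText) throws IOException {
        Path file = Files.createTempFile("rtfdoc", ".rtf");
        file.toFile().deleteOnExit(); // clean up when the JVM exits
        Files.write(file, rtfText.getBytes(StandardCharsets.UTF_8));
        return file;
    }
}
```

With EA's Java API on the classpath, the returned path would then be passed to `element.LoadLinkedDocument(...)` followed by `element.Update()`, as in the question's edits.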

Related

Is it possible to embed images in exported html

I'm trying to use the JasperHtmlExporterBuilder to generate an HTML version of a report that has images. The two options that I seem to have are:
1. Use JasperHtmlExporterBuilder and .setImagesURI("image?image="). This method relies on the code living in some kind of web container (like Tomcat) and generates IMG tags that fetch the images from the server.
2. Use the setOutputImagesToDir option of JasperHtmlExporterBuilder and force the images to be written separately to a local directory on disk.
I was wondering whether there might be a 3rd option where the images are base64 encoded and put directly into the HTML that's generated.
This would be ideal for me as I'd really like to return one complete result that's entirely self-contained.
One way I can "hack" it would be to use option #2 from above, then iterate over the images that get outputted, read them in, convert to base64 and manually replace the src part of the generated HTML.
Update: Below is my actual implementation based on the "hack" I describe above. It would be nice to do this better, but the code below does what I need (though it is not very memory friendly).
public String toHtmlString() throws IOException, DRException {
File tempFile = Files.createTempFile("tempInvoiceHTML", "").toFile();
Path tempDir = Files.createTempDirectory("");
FileOutputStream fileOutputStream = new FileOutputStream(tempFile);
JasperHtmlExporterBuilder htmlExporter = export.htmlExporter(fileOutputStream).setImagesURI("");
htmlExporter.setOutputImagesToDir(true);
htmlExporter.setImagesDirName(tempDir.toUri().getPath());
htmlExporter.setUsingImagesToAlign(false);
reportBuilder.toHtml(htmlExporter);
String html = new String(Files.readAllBytes(Paths.get(tempFile.toURI())));
for (Path path : Files.list(Paths.get(tempDir.toUri().getPath())).collect(Collectors.toList())) {
String fileName = path.getFileName().toString();
byte[] encode = Base64.encode(FileUtils.readFileToByteArray(path.toFile()));
// replace() matches the file name literally; replaceAll() would treat it as a regex
html = html.replace(fileName, "data:image/png;base64," + new String(encode));
}
return html;
}
Is there a better way to do this?
Thanks!
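A slightly tidier variant of the workaround above, as a sketch only: it uses just the JDK (`java.util.Base64` instead of the third-party `Base64.encode` in the question), closes the directory stream, and guesses the MIME type from the file extension instead of hard-coding PNG. The class name `InlineImages` is hypothetical, and it assumes, as in the question, that the exported HTML refers to each image by its bare file name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; not part of DynamicReports or JasperReports.
class InlineImages {
    /** Replaces each image file name found in html with a base64 data URI. */
    static String inline(String html, Path imageDir) throws IOException {
        List<Path> images;
        try (Stream<Path> files = Files.list(imageDir)) { // closes the directory handle
            images = files.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path image : images) {
            String name = image.getFileName().toString();
            String data = Base64.getEncoder().encodeToString(Files.readAllBytes(image));
            // replace() is literal, so dots in the file name are not regex wildcards
            html = html.replace(name, "data:" + mimeType(name) + ";base64," + data);
        }
        return html;
    }

    /** Guesses a MIME type from the extension; falls back to PNG. */
    static String mimeType(String fileName) {
        String lower = fileName.toLowerCase();
        if (lower.endsWith(".jpg") || lower.endsWith(".jpeg")) return "image/jpeg";
        if (lower.endsWith(".gif")) return "image/gif";
        return "image/png";
    }
}
```

This still loads every image fully into memory, so it shares the memory cost mentioned in the update.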

Importing file to Alfresco programmatically (through java backed webscript)

I am having a problem importing a document (PDF) into the Alfresco repository from a Java-backed webscript. I am using the writer from ContentService.
If I use
ContentWriter writer = ContentService.getWriter(nodeRef, ContentModel.PROP_CONTENT, true);
writer.setEncoding("UTF-8");
writer.setMimetype("application/pdf");
writer.putContent(new String(byte []) );
or
writer.putContent(new String(byte [], "UTF-8") );
my document is not previewable (I see a blank PDF file; I tried a few small PDF files and don't know what would happen with other or larger files).
But if I use the other putContent method, which takes a File as its argument, the document is imported successfully.
writer.setEncoding("UTF-8");
writer.setMimetype("application/pdf");
writer.putContent(File);
I don't want to import the file from disk, since I receive it as a Base64-encoded String, but I don't know what I am missing.
You could pass an InputStream to ContentWriter::putContent. That avoids the String-to-byte-array (and back) conversions, which are what cause the encoding problems.
writer.putContent(new ByteArrayInputStream(Base64.decodeBase64("yourBase64EncodedString")));
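To make the difference concrete, here is a minimal sketch that uses the JDK's own `java.util.Base64` (Java 8+) rather than the commons-codec call in the answer. The class and method names are hypothetical. `toStream` keeps the decoded bytes untouched, while `viaString` mimics the lossy bytes-to-String-to-bytes round trip from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; not part of the Alfresco API.
class Base64Content {
    /** Decodes a Base64 string straight to a byte stream, with no String in between. */
    static InputStream toStream(String base64) {
        return new ByteArrayInputStream(Base64.getDecoder().decode(base64));
    }

    /** Mimics the lossy path from the question: bytes -> String -> bytes. */
    static byte[] viaString(byte[] raw) {
        // Byte sequences that are not valid UTF-8 are replaced during decoding,
        // so binary content such as a PDF does not survive this round trip.
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }
}
```

In the webscript, the stream would then go to the writer, for example `writer.putContent(Base64Content.toStream(encoded))`, where `encoded` stands for the Base64 string received.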

how to use PDDocument.loadNonSeq, large pdf stripper/parsing text technique

I have some questions about parsing PDFs:
What is the purpose of using
the PDDocument.loadNonSeq method that includes a scratch/temporary file?
I have a big PDF and I need to parse it and get its text content. I use PDDocument.load() and then PDFTextStripper to extract the data page by page (PDFTextStripper has setStartPage(n) and setEndPage(n), where n=n+1 on every loop iteration). Is loadNonSeq more memory-efficient than load?
For example
File pdfFile = new File("mypdf.pdf");
File tmp_file = new File("result.tmp");
PDDocument doc = PDDocument.loadNonSeq(pdfFile, new RandomAccessFile(tmp_file, READ_WRITE));
int numpages = doc.getNumberOfPages();
// index is declared once, in the loop header
for (int index = 1; index <= numpages; index++) {
    PDFTextStripper stripper = new PDFTextStripper();
    Writer destination = new StringWriter();
    String xml = "";
    // restrict extraction to the current page
    stripper.setStartPage(index);
    stripper.setEndPage(index);
    stripper.writeText(doc, destination);
    .... //filtering text and then convert it in xml
}
// release the document when done
doc.close();
Is the code above a correct use of loadNonSeq, and is it good practice to read a PDF page by page to avoid wasting memory?
I read page by page because I need to write the text to XML using DOM in memory (with this stripping technique, I decided to produce one XML document per page).
What is the purpose of using the PDDocument.loadNonSeq method that includes a scratch/temporary file?
PDFBox implements two ways to read a PDF file.
loadNonSeq is the way documents should be loaded
load is the way documents should not be loaded, but one might try to repair files with broken cross references this way
In the 2.0.0 development branch, the algorithm formerly used for loadNonSeq is now used for load and the algorithm formerly used for load is not used anymore.
I have a big PDF and I need to parse it and get its text content. I use PDDocument.load() and then PDFTextStripper to extract the data page by page (PDFTextStripper has setStartPage(n) and setEndPage(n), where n=n+1 on every loop iteration). Is loadNonSeq more memory-efficient than load?
Using loadNonSeq instead of load may improve memory usage for multi-revision PDFs, because it only reads objects still referenced from the reference table, while load can keep more in memory.
I don't know, though, whether using a scratch file makes a big difference.
Is it good practice to read a PDF page by page to avoid wasting memory?
Internally PDFBox parses the given range page after page, too. Thus, if you process the stripper output page-by-page, it certainly is ok to parse it page by page.

How to extract data from a .docx file including image, table, formula etc?

I am working on a task in which I have to extract data from a Word document, mainly images, tables, and special text (formulas etc.).
I am able to save images from a Word file downloaded from the web, but when I apply the same code to my .docx file, it gives an error.
The code is:
//create file inputstream to read from a binary file
FileInputStream fs=new FileInputStream(filename);
//create office word 2007+ document object to wrap the word file
XWPFDocument docx=new XWPFDocument(fs);
//get all images from the document and store them in the list piclist
List<XWPFPictureData> piclist=docx.getAllPictures();
//traverse through the list and write each image to a file
Iterator<XWPFPictureData> iterator=piclist.iterator();
System.out.println(piclist.size());
while(iterator.hasNext()){
XWPFPictureData pic=iterator.next();
byte[] bytepic=pic.getData();
int i=0;
BufferedImage imag=ImageIO.read(new ByteArrayInputStream(bytepic));
//captureimage(imag,i,flag,j);
if(imag != null)
{
ImageIO.write(imag, "jpg", new File("D:/imagefromword"+i+".jpg"));
}else{
System.out.println("imag is empty");
}
It gives an incorrect format error, but I cannot change the doc file.
Secondly, with the above code, if the document has more than one image, the same image is saved every time. Suppose we have 3 images: it saves 3 images, but all three are the last one.
Any help will be appreciated.
Without the actual error one can only guess.
But there are two POI implementations, HWPF and XWPF, depending on which version of Word document you read: the old binary .doc or the new XML-based .docx. Typically the format error occurs when you try to open the document with the wrong one.
Also you need the full poi-ooxml-schemas jar to read more complicated documents.
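On the second problem in the question: the posted loop declares `int i=0` inside the loop body and never increments it, so every image is written to the same file name and only the last one survives. The sketch below shows the counter pattern with plain JDK calls; the class name `ImageSaver` and its parameters are hypothetical, and it assumes the byte arrays come from something like `pic.getData()` in the question's code. Writing the raw bytes also sidesteps `ImageIO.read` returning null for formats it cannot decode.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Apache POI.
class ImageSaver {
    /** Writes each image's raw bytes to its own numbered file and returns the paths. */
    static List<Path> saveAll(List<byte[]> images, Path dir, String extension) throws IOException {
        List<Path> written = new ArrayList<>();
        int i = 0; // declared once, outside the loop
        for (byte[] data : images) {
            Path target = dir.resolve("imagefromword" + i + "." + extension);
            Files.write(target, data); // raw bytes, no re-encoding
            written.add(target);
            i++; // the next image gets a new file name
        }
        return written;
    }
}
```

A single extension for all images is a simplification; a document can mix formats, in which case the extension would have to be chosen per picture.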

convert large csv to xml and print xml data to a GUI text area

I am developing a Java application that reads a .CSV file, displays its content in a GUI text area, and converts this content to XML data (also printed in a text area); the XML data is then transformed using XSLT.
My application accepts a .CSV file, but converting the comma-separated values to XML has been a challenge for me. I have read loads of material on it and I still haven't grasped the concept. Can anyone point me to how I can do this?
You should make a Java class that implements Serializable. Then, as you read the CSV file in, populate each field in that class. You can then use Java's XMLEncoder to write it to an XML file like this:
XMLEncoder encoder = null;
MyClass data = new MyClass();
data.setField1("field 1 from csv");
try {
encoder = new XMLEncoder(new BufferedOutputStream(new FileOutputStream("c:/myfile.xml")));
encoder.writeObject(data);
} catch (final IOException e) {
logger.error(e.getMessage());
} finally {
if (encoder != null) {
encoder.close();
}
}
From your question I gather that you are already able to process the CSV files and that your XML schema is already defined (you mentioned an XSLT that operates on the result of the CSV-to-XML transformation).
I'd recommend using a small xml library like dom4j to create the xml document. The quick start guide for dom4j has a short example that shows the steps for Creating a new XML document and Converting to and from Strings.
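For illustration, here is a minimal sketch of the same idea using the JDK's built-in DOM and Transformer APIs instead of dom4j, so it needs no extra dependency. It assumes a very simple CSV: a header row first, no quoted fields or embedded commas, and headers that are valid XML element names. The element names `rows` and `row` and the class name `CsvToXml` are made up for the example.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper for illustration only.
class CsvToXml {
    /** Converts simple CSV text (header row first, no quoted fields) to an XML string. */
    static String convert(String csv) throws Exception {
        String[] lines = csv.trim().split("\\r?\\n");
        String[] headers = lines[0].split(",");

        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("rows"); // assumed root element name
        doc.appendChild(root);

        for (int r = 1; r < lines.length; r++) {
            String[] cells = lines[r].split(",", -1); // -1 keeps trailing empty cells
            Element row = doc.createElement("row");
            root.appendChild(row);
            for (int c = 0; c < headers.length && c < cells.length; c++) {
                // each header becomes the element name for its column
                Element cell = doc.createElement(headers[c].trim());
                cell.setTextContent(cells[c].trim());
                row.appendChild(cell);
            }
        }

        // serialize the DOM tree to a String
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

The returned string could be shown in a Swing text area with `setText(...)`, and the same `Transformer` mechanism can apply an XSLT stylesheet if one is passed to `newTransformer(...)`. Real-world CSV with quoting would need a proper CSV parser in place of `split(",")`.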
