I'm using the Lucene library to create an index from a number of documents. For example, the first document is named file1.txt and contains the following text:
.T (title of the document) .A (author of the document) .S (summary of the document)
If I want to define the whole contents of the document as a Field, I write this:
doc.add(new TextField("contents", new BufferedReader(
new InputStreamReader(fis, "UTF-8"))));
What if I want to specify only the summary of the document as a Field? I'm new to Java and I can't find a way.
You need to read the file manually until you reach the summary, collect the summary text in a String (for example with a StringBuilder), and then add it as a TextField as you listed.
For reading files line by line you could use Scanner (http://docs.oracle.com/javase/1.5.0/docs/api/java/util/Scanner.html); for String concatenation you could use StringBuilder (http://docs.oracle.com/javase/7/docs/api/java/lang/StringBuilder.html).
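The approach described above can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes each marker (.T, .A, .S) starts its own line and that a section runs until the next marker line. The class and method names (SummaryExtractor, extractSummary) are made up for illustration.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

public class SummaryExtractor {

    /**
     * Returns the text of the ".S" section, or an empty string if there is none.
     * Assumption: markers start a line, and a section ends at the next marker line.
     */
    public static String extractSummary(Reader source) {
        StringBuilder summary = new StringBuilder();
        boolean inSummary = false;
        try (Scanner scanner = new Scanner(source)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (line.startsWith(".S")) {
                    inSummary = true;
                    line = line.substring(2);   // keep any text that follows the marker on the same line
                } else if (line.startsWith(".T") || line.startsWith(".A")) {
                    inSummary = false;          // a different section begins
                }
                String text = line.trim();
                if (inSummary && !text.isEmpty()) {
                    if (summary.length() > 0) {
                        summary.append(' ');    // join multi-line summaries with single spaces
                    }
                    summary.append(text);
                }
            }
        }
        return summary.toString();
    }

    public static void main(String[] args) {
        String sample = ".T\nA title\n.A\nAn author\n.S\nFirst summary line.\nSecond summary line.\n";
        System.out.println(extractSummary(new StringReader(sample)));
    }
}
```

The returned String could then be indexed on its own, for example with doc.add(new TextField("summary", summary, Field.Store.NO)); the field name "summary" is just an example.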
I am having an issue with formatting an XML response into CSV, which can later be opened as an XLS showing the entire response in a single cell. I know, it's not how I would do it either, but they get what they ask for.
So far I have tried using a StringBuilder. This successfully formats the response into a single-line string; I tested this by writing it to a text file and copying it into Eclipse, where placing single quotes around the XML turns it into a string.
When I take this same single-line response and put it into a CSV file, the file breaks on the commas in the XML string and spreads the response across several dozen cells.
// read the response and collapse it into a single-line string
BufferedReader br = new BufferedReader(new FileReader(new File('responseXml.txt')))
String l
StringBuilder sb = new StringBuilder()
while ((l = br.readLine()) != null) {
    sb.append(l.trim())
}
br.close()
File respfile = new File("outresp.txt")
respfile.text = sb.toString()
println respfile.text
// verified single line string
def respContents = respfile.text
File file = new File('outXML.csv')
file.append(respContents)
println file.text
// open csv still broke across many cells
What I would like is a single XML string in a single XLS cell.
To fit a value into a single column in CSV (Excel) you have two choices:
1. Remove all newlines (\r\n) and commas (,).
2. Replace each double quote (") with two ("") and wrap the whole value in double quotes.
The second variant lets you keep the original (multiline) string format in one Excel cell.
Here is the code for the second variant:
//assume you have whole xml in responseXml variable
def responseXml = '''<?xml version="1.0"?>
<aaa text="hey, you">
<hello name="world"/>
</aaa>
'''
//take xml string in double quotes and escape all doublequotes
responseXml = '"'+ responseXml.replaceAll(/"/,'""') + '"'
def csv = new File('/11/1.csv')
csv.setText("col1,col2,xml\nfoo,bar") // emulate some existing file
csv.append(",${responseXml}\n")
As a result, the whole XML value ends up in a single cell of the xml column when the file is opened in Excel.
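For readers working in plain Java rather than Groovy, the same quoting rule could be sketched like this. The class and method names (CsvCell, quote) are made up for illustration; only the rule itself (double every double quote, then wrap the value in double quotes) comes from the answer.

```java
public class CsvCell {

    /** Quotes a value so that commas, quotes and line breaks stay inside one CSV field. */
    public static String quote(String value) {
        // String.replace is a literal replacement, so no regex escaping is needed
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String xml = "<aaa text=\"hey, you\">\n<hello name=\"world\"/>\n</aaa>";
        // one data row with three columns; the XML occupies the third one
        System.out.println("foo,bar," + quote(xml));
    }
}
```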
I am trying to automate a docx report generation process using Java and docx4j. I have a template document containing only a single page. I would like to copy that page, modify it, and save it in another docx document. The output report consists of multiple similar pages, each a modified copy of the template. How do I go about it?
PS : java and docx4j are my first choice but I am open to solutions apart from java and docx4j.
Leaving it up to you to modify the template, here is how you could add one document to the end of another document. Suppose base.docx contains "This is the base document." and template.docx contains "The time is:", then after executing this code:
WordprocessingMLPackage doc = Docx4J.load(new File("base.docx"));
WordprocessingMLPackage template = Docx4J.load(new File("template.docx"));
MainDocumentPart main = doc.getMainDocumentPart();
Br pageBreak = Context.getWmlObjectFactory().createBr();
pageBreak.setType(STBrType.PAGE);
main.addObject(pageBreak);
for (Object obj : template.getMainDocumentPart().getContent()) {
main.addObject(obj);
}
main.addParagraphOfText(LocalDateTime.now().toString());
doc.save(new File("result.docx"));
Then result.docx will contain something like:
This is the base document.
^L
The time is:
2018-04-16T17:37:13.541984200
(Where ^L represents a page break.)
To be more precise my original template is containing only header and some styling component.
This kind of information can be stored in a Word stylesheet (.dotx file).
PS : java and docx4j are my first choice but I am open to solutions apart from java and docx4j.
A good tool would be pxDoc: you can specify a dedicated stylesheet in your document generator, or use "variable styles" and specify the stylesheet only when you launch the document generation.
My question: how can I create a new linked document and attach it to an element (in my case a Note element of an activity diagram)?
The Element class supports these three methods:
GetLinkedDocument ()
LoadLinkedDocument (string Filename)
SaveLinkedDocument (string Filename)
I am missing a function like
CreateLinkedDocument (string Filename)
My goal: I create an activity diagram programmatically, and some notes are too big to display nicely in the diagram. So I want to put this text into a linked document instead of directly in the activity diagram.
Regards
EDIT
Thank you very much to Uffe for the solution of my problem. Here is my solution code:
public void addLinkedDocumentToElement(Element element, String noteText) throws IOException {
    String filePath = "C:\\rtfNote.rtf";
    // convert the string to EA's RTF format
    String rtfText = repository.GetFormatFromField("RTF", noteText);
    // create the file on disk and write the content; try-with-resources closes the writer
    try (PrintWriter writer = new PrintWriter(filePath, "UTF-8")) {
        writer.write(rtfText);
    }
    // create the linked document by loading the RTF file created above
    element.LoadLinkedDocument(filePath);
    element.Update();
}
EDIT EDIT
It is also possible to work with a temporary file:
File f = File.createTempFile("rtfdoc", ".rtf");
FileOutputStream fos = new FileOutputStream(f);
String rtfText = repository.GetFormatFromField("RTF", noteText);
fos.write(rtfText.getBytes());
fos.flush();
fos.close();
element.LoadLinkedDocument(f.getAbsolutePath());
element.Update();
First up, let's separate the linked document, which is stored in the EA project and displayed in EA's built-in RTF viewer, from an RTF file, which is stored on disk.
Element.LoadLinkedDocument() is the only way to create a linked document. It reads an RTF file and stores its contents as the element's linked document. An element can only have one linked document, and I think it is overwritten if the method is called again but I'm not absolutely sure (you could get an error instead, but the EA API tends not to work that way).
In order to specify the contents of your linked document, you must create the file and then load it. The only other way would be to go hacking around in EA's internal and undocumented database, which people sometimes do but which I strongly advise against.
In .NET you can create RTF documents using Microsoft's Word API, but to my knowledge there is no corresponding API for Java. A quick search turns up jRTF, an open-source RTF library for Java. I haven't tested it but it looks as if it'll do the trick.
You can also use EA's API to create RTF data. You would then create your intended content in EA's internal display format and use Repository.GetFormatFromField() to convert it to RTF, which you would then save in the file.
If you need to, you can use Repository.GetFieldFromFormat() to convert plain-text or HTML-formatted text to EA's internal format.
My java application loads an XML file and then parses the XML.
What I would like to do is a search/replace on the file contents before I create the SAXBuilder. How can I do this in memory (without having to write to the file)?
Here's my code, and where I envision doing the search/replace :
private String xmlFile = "D:\\mycomputer\\extract.xml";
File myXMLFile = new File(xmlFile);
// TODO
// REPLACE ALL "<content>" in xmlFile with "<content><![CDATA["
// REPLACE ALL "</content>" with "]]></content>"
SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser");
document = builder.build(myXMLFile);
Read the file into memory, do the search/replace, and use the SAXBuilder(StringReader) method.
You can first read the file into a String with Apache Commons IO and then change the input source for the SAXBuilder, as in the following code snippet:
String fileStr = FileUtils.readFileToString(myXMLFile);
fileStr = fileStr.replaceAll("<content>","<content><![CDATA[");
fileStr = fileStr.replaceAll("</content>","]]></content>");
SAXBuilder builder = new SAXBuilder("org.apache.xerces.parsers.SAXParser");
document = builder.build(new ByteArrayInputStream(fileStr.getBytes()));
You answered the question yourself: read the whole file into a StringBuilder, perform the replacement in it, and then call SAXParser.
The string can be passed to SAXBuilder using StringReader:
StringBuilder sb = new StringBuilder ();
loadFileContent (filePath, sb);
document = builder.build (new StringReader (sb.toString ()));
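The file-loading helper in the snippet above is not shown, so here is a self-contained sketch of the whole in-memory flow. To keep it runnable without extra libraries it swaps in the JDK's built-in DOM parser in place of JDOM's SAXBuilder. The class and method names (InMemoryReplace, wrapContentInCdata, parse) are made up for illustration. With JDOM, the replaced string would be handed to builder.build(new StringReader(...)) as above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class InMemoryReplace {

    /** Wraps the body of every <content> element in a CDATA section. */
    public static String wrapContentInCdata(String xml) {
        // literal (non-regex) replacement, so the brackets need no escaping
        return xml.replace("<content>", "<content><![CDATA[")
                  .replace("</content>", "]]></content>");
    }

    /** Parses XML held in a String; nothing is written back to disk. */
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // In practice the string would come from the file, e.g.
        // new String(Files.readAllBytes(path), StandardCharsets.UTF_8)
        String raw = "<root><content>a < b & c</content></root>";
        Document doc = parse(wrapContentInCdata(raw));
        System.out.println(doc.getElementsByTagName("content").item(0).getTextContent());
    }
}
```

Without the CDATA wrapping, the sample input is not well-formed XML and the parser rejects it, which is the situation the question describes.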
P.S.: follow up to theglauber's answer:
If the file is really big (~100 MB), it's impractical to read it fully into memory, and equally impractical to parse it into a DOM tree. In that case you should consider using SAXParser and doing the replacement while the file is being parsed.
Depending on how large these files are, either read the file into a String, do your replacements in memory and build the XML from the String, or spawn a new thread to read the file, do the replacements and output, then build the XML from the output of that thread.
(I would suggest parsing and modifying the XML tree or using an XML filter, but I suspect you want to do this string-based replacement because the current content of your files is not correct XML.)
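The second option, a separate thread that reads the file, does the replacements and feeds the parser, might look like the sketch below. It is only one way to do it and rests on assumptions: the file is UTF-8, the replacement is done line by line (which works here because neither tag contains a line break), and the JDK's DOM parser stands in for JDOM's SAXBuilder. The class name PipedReplace is made up for illustration. Note that this only avoids the intermediate String; the parsed document itself still lives in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PipedReplace {

    /** Parses the file while a background thread rewrites it line by line. */
    public static Document parse(Path xmlFile) throws Exception {
        PipedWriter out = new PipedWriter();
        PipedReader in = new PipedReader(out);

        // Producer: reads the file, applies the replacements, writes into the pipe.
        Thread producer = new Thread(() -> {
            try (BufferedReader reader = Files.newBufferedReader(xmlFile, StandardCharsets.UTF_8);
                 PipedWriter writer = out) {
                String line;
                while ((line = reader.readLine()) != null) {
                    writer.write(line.replace("<content>", "<content><![CDATA[")
                                     .replace("</content>", "]]></content>"));
                    writer.write('\n');
                }
            } catch (IOException e) {
                // Reading failed or the consumer closed the pipe: stop producing.
                // The writer is closed by try-with-resources, which ends the stream.
            }
        });
        producer.start();

        // Consumer: the parser pulls characters from the pipe as they arrive.
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(in));
        } finally {
            in.close();       // unblocks the producer if parsing stopped early
            producer.join();
        }
    }
}
```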
I have a ms-word document (MS-Office 2003; non-xml). Within this
document there is a string associated with a bookmark. Furthermore,
the word document contains word-macros. My goal is to read the
document with java, replace the string associated with the bookmark,
and save the document back to word format.
My first approach was using Apache POI HWPF:
HWPFDocument doc = new HWPFDocument(new FileInputStream("Test.doc"));
doc.write(new FileOutputStream("Test_generated.doc"));
The problem with this solution is that the generated file does not
contain the macro anymore (File size of the original document: 32k;
file size of the generated document 19k).
Does anybody know if it's possible to retain all the original info
using POI/HWPF?
I never found a solution. The customer had to either pay for an Aspose license (expensive) or refrain from using macros.