I am using the OpenOffice API with Java UNO. I need to get the size of selected text in the document content (for example, embedded pictures have their own size in mm, but text inserted via the XText.insertString(...) method doesn't have any size).
In other words: I want to get the size (preferably in mm) of the box that surrounds part of the text (it can be a whole paragraph or text selected via some type of cursor). Is there any way to achieve that?
After searching, I think there is no way to achieve this at the moment. For my purposes I wrote a small method that gets the height of a paragraph in 1/100 mm.
Here is how this method works:
Get the XTextViewCursor of the XTextDocument controller, for moving left/right.
Go to the paragraph to measure.
Loop through the paragraph character by character. For each character: check its height (the CharHeight property); get an XLineCursor from the XTextViewCursor and check whether the cursor is at the end of the line; if it is, add the largest character height found in that line to the result.
This is a temporary solution (I am still waiting for something better) and it has a number of bugs (for example, line spacing other than single is not handled, and the paragraph should contain only text), but maybe it will be helpful to someone.
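The accumulation step described above could be sketched roughly as follows. This is only an illustration of the summing logic, not the original method: the CharInfo type is a hypothetical stand-in for the values read through the UNO cursors (the character's CharHeight and whether the line cursor reports the end of the line), and the conversion assumes CharHeight is given in points.

```java
import java.util.List;

public class ParagraphHeightSketch {

    /** Hypothetical holder for what is read from the cursors at one character. */
    public static final class CharInfo {
        public final float charHeightPt;  // CharHeight, assumed to be in points
        public final boolean endOfLine;   // true if the line cursor is at the end of the line

        public CharInfo(float charHeightPt, boolean endOfLine) {
            this.charHeightPt = charHeightPt;
            this.endOfLine = endOfLine;
        }
    }

    // 1 pt = 1/72 inch = 25.4 / 72 mm, i.e. 2540 / 72 hundredths of a millimetre.
    private static final double HUNDREDTH_MM_PER_POINT = 2540.0 / 72.0;

    /** Sums the tallest character of each line and returns the total in 1/100 mm. */
    public static long paragraphHeight(List<CharInfo> chars) {
        double totalPt = 0;
        float tallestInLine = 0;
        for (CharInfo c : chars) {
            tallestInLine = Math.max(tallestInLine, c.charHeightPt);
            if (c.endOfLine) {
                totalPt += tallestInLine;  // close the current line
                tallestInLine = 0;
            }
        }
        // Count a trailing line that was never flagged as ended.
        totalPt += tallestInLine;
        return Math.round(totalPt * HUNDREDTH_MM_PER_POINT);
    }
}
```

Like the method it illustrates, this ignores line spacing and non-text content, so it only approximates the real paragraph height.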
I am using Apache PDFBox 2.0 to parse a PDF file. Relying on some fixed strings, I was able to create a system based on:
A fixed text, as a starting point
The next cell/text position, or null
The bottom area, to determine the height of the rectangle.
Using the starting point, I compute the x and y coordinates (see the picture below for the PDF structure in PDFBox):
Using the "next" text block (which is another fixed value, for example a field or a table header), I am determining the width of the desired region, using formula:
width = second.x - first.x
or something similar. So in a table, for example, knowing the header names in advance makes it easy to detect the columns. What I am trying to do (and have so far failed to do accurately) is to determine the rows of a PDF table. The table sometimes has missing values in some columns, and some rows/columns contain multi-line values. I have extended my "system" (first, next, bottom) to work dynamically with table rows, and this works great on "normalized" tables (i.e. no empty cells, or at least no multi-line values). But it does not work with real-world data, because so far I have not found a way to determine the location (x, y, width, height) of a multi-line value. Is this even possible with PDFBox? Some people suggested converting the PDF to HTML first and then parsing the HTML instead. Is this a viable option? Has anyone worked with this library? I will try to use this next.
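For illustration, the "first, next, bottom" region computation described above might look like the sketch below. The Anchor type, the region method and the pageWidth fallback for a missing "next" block are all hypothetical, and the code assumes coordinates with the origin at the top-left and y growing downward.

```java
import java.awt.geom.Rectangle2D;

public class RegionSketch {

    /** Hypothetical position of a fixed text block (origin top-left, y grows downward). */
    public static final class Anchor {
        public final double x;
        public final double y;

        public Anchor(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Builds the rectangle that starts at 'first', ends horizontally at 'next'
     * (or at pageWidth when next is null) and vertically at 'bottom'.
     */
    public static Rectangle2D.Double region(Anchor first, Anchor next, Anchor bottom, double pageWidth) {
        double right = (next != null) ? next.x : pageWidth;
        double width = right - first.x;    // width = second.x - first.x
        double height = bottom.y - first.y;
        return new Rectangle2D.Double(first.x, first.y, width, height);
    }
}
```

A rectangle built this way could then be handed to something like PDFBox's PDFTextStripperByArea to extract the text inside it. It does not solve the multi-line problem by itself, since the "bottom" anchor still has to be found.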
Like I said in my previous comments, I have found a partial solution for my issue. This is based on two things:
First, I assume that one column for each table contains only distinct values which never occupy more than 1 row.
Next, since I also have some fixed texts in the document, I determine the coordinates of these texts and use them as delimiters of the area that contains the text I want to extract. For example, the "current, next, bottom" system (as I call it) can contain: "Column name A", "Column name B", "Fixed text C" (or the second row of the same table, determined from the unique single-row values).
It is not perfect, and problems may arise if the fixed texts occur more than once in the document. Of course, improvements can be made by filtering for the correct occurrence using the vertical coordinates and so on, but for the moment I will close this question, as it seems that this problem has no standard answer and there is currently no open source library able to extract tabular data from PDFs.
I need to get the absolute coordinates of a paragraph that I have already added to the document and place an image next to it.
In general, my problem is as follows:
I have a checklist with an image (checked/unchecked) before each line. I have already done that, but if a checklist item takes, for example, 2 lines, the second line starts at the beginning of the page. What I want is for the second line to start at the position where the first line starts, as if the second line had a margin.
Thanks in advance!
I think your question is wrong. Allow me to explain: you have a specific requirement: you want to start a line with an image (representing a checked/unchecked check mark) that acts as a bullet. More specifically: you want the text that follows the bullet to be aligned correctly. That is a valid requirement.
However, in your question, you're asking about a specific implementation. You want to juggle with Y positions (check if a paragraph takes one or more lines) and X positions (start the second line using a specific indentation).
While it would probably be possible to achieve what you want using page events (asking a paragraph for its start and end position), I think you are actually asking for functionality that is available out of the box: why not use a List with an image chunk as bullet?
I've written some sample code, ListWithImageAsBullet, where I use a light bulb as bullet (in your case, you'd use a checkbox image). I've added three items to the List and the second item takes more than one line. As you can see, the second line is indented correctly (you can augment the indentation using different methods available in the List class).
Please take a look at the resulting PDF. Is that what you're looking for?
If so, this is how it's done:
// Load the bullet image and scale it to 12 x 12 pt
Image image = Image.getInstance(IMG);
image.scaleAbsolute(12, 12);
// Disable automatic scaling of the image
image.setScaleToFitHeight(false);
// Use the image, wrapped in a Chunk, as the list symbol
List list = new List();
list.setListSymbol(new Chunk(Image.getInstance(image), 0, 0));
list.add("Hello World");
list.add("This is a list item with a lot of text. It will certainly take more than one line. This shows that the list item is indented and that the image is used as bullet.");
list.add("This is a test");
document.add(list);
Note that I scaled the image to 12 by 12 pt, because 12pt is the default font size. Also don't forget to disable the automatic scaling of the image (otherwise, you'll end up with really tiny images as bullets).
I have a few records of data (fewer than 10). Each record consists of a few lines of text.
I want to present the records to the user in a kind of grid, where the user can select one of the records.
I was thinking about the List component or JTable, but I couldn't make them display more than one line of text. What component should I use, or how should I approach this?
In the subject I suggested AWT because size matters: I want to use this functionality in an applet and would like to avoid any extra libraries.
Thanks in advance
Thanks to maksimov's link I found examples of how to tackle this issue, and also a very interesting link that I had somehow missed: http://docs.oracle.com/javase/tutorial/uiswing/components/html.html
To specify that a component's text has HTML formatting, just put the
<html> tag at the beginning of the text, then use any valid HTML in
the remainder. Here is an example of using HTML in a button's text:
button = new JButton("<html><b><u>T</u>wo</b><br>lines</html>");
In my case it was enough to set the row height and add the <html> tag just before the string data to be displayed. HTML tagging also lets me use extra formatting, colors, etc.
Brilliant.
Thank you, maksimov.
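A minimal sketch of that approach with a JTable, assuming each record is a plain string with "\n" between its lines. The class name, the toHtml and buildTable helpers and the linesPerRecord parameter are made up for illustration.

```java
import javax.swing.JTable;

public class MultiLineCellSketch {

    /** Escapes the text, wraps it in <html> and turns line breaks into <br>. */
    public static String toHtml(String text) {
        String escaped = text.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        return "<html>" + escaped.replace("\n", "<br>") + "</html>";
    }

    /** Builds a one-column table whose rows are tall enough for linesPerRecord lines. */
    public static JTable buildTable(String[] records, int linesPerRecord) {
        Object[][] rows = new Object[records.length][1];
        for (int i = 0; i < records.length; i++) {
            rows[i][0] = toHtml(records[i]);
        }
        JTable table = new JTable(rows, new Object[] {"Record"});
        // The default row height fits one line, so scale it by the line count.
        table.setRowHeight(table.getRowHeight() * linesPerRecord);
        return table;
    }
}
```

The escaping step is there so that record text containing characters such as < is shown literally instead of being interpreted as markup.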
I am trying to make an unusual text editor for people with reading problems, using NetBeans. You load the text you like and the editor starts highlighting it word by word in bold. Switching a word from plain to bold changes its dimensions and shifts the line. One solution was a monospaced font, but I would like to make a few more fonts available for the user to choose from. Is there any way to do this with Arial, for example, by configuring the JTextPane?
You can manually split the String with <br/> by counting characters and splitting at the right spot to keep each line under your desired width in characters. Leave some leeway so that a long word doesn't still spill onto the next line.
Alternatively, you could use a JList to display your lines (instead of using <br/>). That way, there's no way the line would split to the next line. However, if you do it that way, the user will click on the list like a list and not be able to select text like in a normal text pane.
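The character-counting split could be sketched like this. It is a simple greedy word wrap, and the wrap method and maxChars parameter are illustrative names only. Passing a maxChars somewhat smaller than the real limit provides the leeway mentioned above.

```java
public class BreakWrapSketch {

    /** Inserts <br/> between words so that no line exceeds maxChars, where possible. */
    public static String wrap(String text, int maxChars) {
        StringBuilder out = new StringBuilder();
        int lineLength = 0;
        for (String word : text.trim().split("\\s+")) {
            // Start a new line if this word would push the current one past the limit.
            if (lineLength > 0 && lineLength + 1 + word.length() > maxChars) {
                out.append("<br/>");
                lineLength = 0;
            }
            if (lineLength > 0) {
                out.append(' ');
                lineLength++;
            }
            out.append(word);
            lineLength += word.length();
        }
        return out.toString();
    }
}
```

A single word longer than maxChars is kept whole on its own line rather than being split.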
My report contains 3 parts: 2 parts are quite straightforward table reports, and one part is a contract agreement of about 10 pages of static formatted text (bold headings). It is a typical agreement consisting of about 12 parts, where each part consists of a heading and text, e.g.:
1. Part. Blab la bla
1.1 Some long long long text
1.2. Some more text here
…
1.5 Artart
2. Part some heading
2.1 Asdasdasd asdf adfas
and so on...
I thought that it would be quite simple to do, but…
I tried to add this as static text elements, but that causes a few problems:
Static text elements don’t expand! This means I would need very long static text elements, which doesn't work either, because there is a limit on the height of the Detail band to which I add the elements.
It’s hard to style the text if it’s all in one element.
I tried a text field element, as these elements do expand. But in that case it’s quite difficult to change the text in the element, as all the text is in quotes and every new line has to be written as “\n” or <BR>…
Now I am trying a solution where I just create a simple report with JasperReports and append the contract agreement PDF to the report PDF.
As I am quite new to JasperReports and iReport, I assume that I just don’t understand something, as this seems to me a fairly “easy” feature. So what is the correct way of doing this in iReport? Maybe there is a way to “link” or embed such a long text (as HTML, RTF or whatever) into the report?
Thank You for Your time!
Don't use a static text element; use a text field, which can expand as the text grows.
Check the Stretch with Overflow checkbox in the Text Field tab of the properties window.
Also, read this topic.
You have to use "Shift + Enter" to break a line in a static text element.
Source