Building and printing complex table layouts with Java

A customer asked me to write software, and one of its requirements is to build a form and fill it with data collected from a database.
This form is currently being created in Excel. It uses cells to build the form, some cells have blank background, others blank background with black bottom border (to look like a line where text is typed), others have gray background with white text, and there's also a logo image. In Excel, some cells are merged to become bigger than other cells. They fill the text in another spreadsheet and the required cells in the form take that text and format it.
I've looked at many report frameworks in Java; some are very complex and some look like Excel's graph builders, but I saw none that can make a complex 2D form like this.
The data filled in is simple (a name, a quantity, some numbers), but the fields have different lengths, requiring, for example, that the name's cell be merged to cover a full horizontal line, and some fields use a smaller font size. There's no repeated data that would require sorting, and I have no problem gathering the data.
In the end, the filled form must also be printed, so I can't use normal Swing table or grid. It will be used in Windows now, but it'd be nice to support Linux printing too.
Any suggestion of a Java component that builds a 2D layout like this and fills it with strings will be very much appreciated. I even thought of taking a screenshot of their current form and just use 2D Graphics to print the text, but I'd not be able to print it.
This is an example of the kind of form I must build, it's somewhat like that but some areas have gray background with white text:
No, it's not a duplicate, but it is a good example of the layout.

Related

How to count color pages in a PDF/Word doc using Java

I am looking to develop a desktop application using Java to count the number of colored pages in a PDF or Word file. This will be used as part of an overall system to help calculate the cost of printing a document in terms of how many pages there are (color/B&W).
Ideally, the user of the application would use a file dialog to select the desired PDF/Word file; the application could then count and output the number of colored pages, allowing the system to automatically calculate the document cost accordingly.
For example:
if A4 colored pages cost 50c per page to print,
and B&W cost 10c per page,
calculate the total cost of the document per colored/B&W pages.
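The cost arithmetic itself is the easy part and can be sketched as follows. This is only an illustration: the class and method names are made up, and prices are held in cents so the sum stays in integer arithmetic.

```java
// Hypothetical helper for illustration; names are not from any existing system.
final class PrintCost {
    private PrintCost() {}

    /** Total cost in cents, given page counts and per-page prices in cents. */
    static long totalCents(int colorPages, int bwPages,
                           int colorCentsPerPage, int bwCentsPerPage) {
        if (colorPages < 0 || bwPages < 0) {
            throw new IllegalArgumentException("page counts must be non-negative");
        }
        // Widen to long before multiplying so large documents cannot overflow int.
        return (long) colorPages * colorCentsPerPage
             + (long) bwPages * bwCentsPerPage;
    }
}
```

With the prices above, a document with 3 colored pages and 7 B&W pages would be `PrintCost.totalCents(3, 7, 50, 10)`, i.e. 220 cents.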
I am aware of the existing software Rapid PDF Count http://www.traction-software.co.uk/rapidpdfcount/, but it would be unsuitable for integration into a new system. I have also tried using GhostScript/Python as per this solution: http://root42.blogspot.de/2012/10/counting-color-pages-in-pdf-files.html; however, this takes too long (5 minutes to count a 100-page PDF) and would be difficult to integrate into a desktop app.
Is there any method of counting the number of colored pages in a PDF or Word file using Java (or an alternative language)?
Thanks
Although it might sound easy, the task is rather complicated.
One option would be to use a program such as iText to walk every single token in the PDF, look for tokens that support color and compare that to your definition of "black". However, this will only get you basic text and drawing commands. Images are a completely different beast so you'll probably need to find an image parser or grab a copy of each spec and then walk each of those.
One of the downsides of token walking is you need to properly handle tokens that reference other things and further walk those tokens.
Another downside is that things can overlap each other, so you'd probably want to be aware of their coordinates, z-index, transparency and such.
There will be many more bumps in the road but that's a good start. What's most interesting is that if you accomplish this, you'll actually have found that you've partially built a PDF renderer!
Next, you'll need to define "black". Off the top of my head there's RGB black, CMYK black, Grey black and maybe Lab black along with some Pantones. That shouldn't be too hard, but if I were to build this I'd want to know "black ink usage", which could also be shades of grey. There's also "rich black" that you might need to deal with, too!
So, all that said, I think that the GhostScript option you found is really the best bet. It literally renders the PDF and calculates the ink coverage from an RGB standpoint. You still should handle greys, too, but that shouldn't be too hard; here's a good starting point.
Wanting to know what the click-charge is going to be is a pretty common problem, but it's not easy to solve at all, as the answer Chris Haas gave already indicates. I want to put another spin on it, though.
First of all, you have to wonder whether you really want to support both Word and PDF documents. Analysing Word files is less useful than you might think because that Word file is probably going to be converted into something else before it's going to be printed. And because of the fact that you're starting from Word, the chance that your nice RGB black text in Word gets converted to less-than-perfect 4 color black in PDF is very high. In other words, even though you might count a page of black text in Word as a 'cheap' page, it might turn into an expensive color page after conversion from Word to something that can be printed.
Let's consider the PDF case then. PDF supports a whole host of color spaces (gray, RGB, CMYK, the same with an ICC Profile attached, spot color and a few multi-spot color variants, CalGray and CalRGB and Lab). Besides that, there is a whole range of very tricky features such as transparency, overprint, shades, images, masks... that you all have to take into account. The only truly good way to calculate what you need is to do essentially the same work as your printer will do: convert the PDF into one image per page and examine the pixels.
Because of what you want to do, the best way to progress would be to:
1) Convert any Word files into PDF
2) Convert any PDF files into CMYK
3) Render each page of that CMYK file into an image.
Once you've done that you can examine the image and see whether you have any colors left. There are a number of potential technologies you can use for this. GhostScript is definitely one, but there are commercial solutions too that would certainly be more expensive but potentially faster.
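The "examine the image" step can be sketched in plain Java once some renderer (not shown here) has produced one image per page. The sketch below is an assumption-laden simplification: it works on an RGB `BufferedImage` rather than on CMYK separations, and it treats a pixel as colored when its red, green and blue channels differ by more than a chosen tolerance. The class name and the tolerance parameter are illustrative only.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper; assumes the page was already rendered to an RGB image.
final class ColorPageCheck {
    private ColorPageCheck() {}

    /**
     * Returns true if any pixel is not (nearly) neutral gray.
     * A pixel counts as gray when max(R,G,B) - min(R,G,B) <= tolerance.
     */
    static boolean isColorPage(BufferedImage page, int tolerance) {
        for (int y = 0; y < page.getHeight(); y++) {
            for (int x = 0; x < page.getWidth(); x++) {
                int rgb = page.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                int max = Math.max(r, Math.max(g, b));
                int min = Math.min(r, Math.min(g, b));
                if (max - min > tolerance) {
                    return true; // one colored pixel is enough to bill the page as color
                }
            }
        }
        return false;
    }
}
```

A small non-zero tolerance may help with anti-aliasing or slightly tinted grays, but it will not catch the four-color black problem described above; that needs the CMYK rendering.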

Is there always a natural "space" between cells in a table?

I created a table with "buttons" aligned onto my screen for my game I am using to learn libgdx. I want the buttons to squeeze together perfectly to form a seamless menu, however there seems to be a natural padding of 1 to 2 pixels between every cell.
Is there a way to remove that padding?
The code seems unnecessary but this is the contents of my table anyways:
table.add(buttonTyce).size(150,60).expandX().expandY().bottom().left().row();
table.add(buttonGrokk).size(150,60).bottom().left().row();
table.add(buttonCeleste).size(150,60).bottom().left().row();
table.add(buttonDaem).size(150,60).bottom().left().row();
table.add(buttonRisp).size(150,60).bottom().left().padBottom(80).row();
table.setFillParent(true);
stage.addActor(table);
Thank you for any help.
I don't think there is default space. Are you sure that the texture you are using for your buttons doesn't contain any transparent edges? Also, there's table.debug() (or something like that, check the docs) to draw lines around table cells for debugging such issues.

Java image library - turn grid image into array

If I have an image of a table of boxes, with some coloured in, is there an image processing library that can help me turn this into an array?
Thanks
You can use a thresholding function to binarize the image into dark/light pixels so dark pixels are 0 and light ones are 1.
Then you would want to remove image artifacts using dilation and erosion functions to remove noise (all these are well defined on Wikipedia).
Finally, if you know where the boxes are, you can just get the value in the center of each box to determine the array value, or possibly use an area near the center and take the prevailing value (i.e. more 0's is a filled-in square, more 1's is an empty square).
If you are scanning these boxes and there is a lot of variation in the position of the boxes, you will have to perform some level of image registration using known points, or fiducials.
As far as what tools to use to do this, I'd recommend first trying this manually using a tool like ImageJ, which has a UI and can also be used programmatically since it is written entirely in Java.
Other good libraries for this include OpenCV and the Java Advanced Imaging API.
Your results will definitely vary depending on the input images and how consistently lit and positioned they are.
The best way to see how it will do for your data is to try applying these processing steps manually to see where your threshold value should be, how much dilating/eroding you need to get consistent results.
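As a rough illustration of the thresholding and center-sampling steps, here is a minimal sketch using only `java.awt.image.BufferedImage`. It assumes the image contains nothing but an evenly spaced grid with a known number of rows and columns, and it skips the dilation/erosion and registration steps entirely. The class name, the threshold parameter and the luminance weights are choices made for this example, not something from a particular library.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper; assumes an evenly spaced grid that fills the whole image.
final class GridReader {
    private GridReader() {}

    /** Returns 0 for a dark (filled) cell and 1 for a light (empty) cell. */
    static int[][] readGrid(BufferedImage img, int rows, int cols, int threshold) {
        int cellW = img.getWidth() / cols;
        int cellH = img.getHeight() / rows;
        if (cellW < 1 || cellH < 1) {
            throw new IllegalArgumentException("image is smaller than the grid");
        }
        int[][] cells = new int[rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Sample the central third of the cell to stay away from grid lines.
                int x0 = c * cellW + cellW / 3;
                int x1 = Math.max(x0 + 1, c * cellW + 2 * cellW / 3);
                int y0 = r * cellH + cellH / 3;
                int y1 = Math.max(y0 + 1, r * cellH + 2 * cellH / 3);
                int dark = 0;
                int light = 0;
                for (int y = y0; y < y1; y++) {
                    for (int x = x0; x < x1; x++) {
                        if (luminance(img.getRGB(x, y)) < threshold) dark++; else light++;
                    }
                }
                // Prevailing value wins: more dark samples means a filled square.
                cells[r][c] = dark > light ? 0 : 1;
            }
        }
        return cells;
    }

    /** Integer approximation of perceived brightness, 0..255. */
    private static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r * 299 + g * 587 + b * 114) / 1000;
    }
}
```

For scanned input, the threshold would have to be tuned by hand as described above, and the fixed grid geometry would need to be replaced by positions obtained from registration.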

Finding bounding box of text within JPG image

My question is similar to this one, but is more specific in scope.
In my card game application, I would like for users to be able to click on words located in a scanned jpeg image. Please see this sample Pokemon trading card.
In this case, the user should be able to hover his mouse over the text "Scratch", upon which a pulsing rectangular border will appear around the text, indicating that it is clickable. The problem is how to detect the border of the text. There will be an array of words KNOWN BEFOREHAND that the user may click on (these will be retrieved from a database on a card-by-card basis). To continue our example, the array in this case will be ["Scratch", "Live Coal"]. Once the user clicks on "Scratch", the application must know via a call-back that "Scratch" was chosen instead of "Live Coal".
I was thinking of using optical character recognition libraries to solve this problem, but the open-source options for this are poor in quality (e.g. GOCR) and/or not well-tested on multiple platforms (e.g. Tesseract). I only care about Windows and Mac compatibility. Am I missing an obvious/simpler solution/algorithm that does not require OCR? I cannot simply hand-code in bounding boxes for each card, as there will be thousands of scanned cards in my database. The user may also upload his own custom card scans with an accompanying array of clickable text.
Text color is not always black. See this panorama of different card and text styles that will be permitted. The black cards have white text, and the third-to-last card (Zekrom) has black text with a white outline.
Solutions in any programming language are appreciated. However, please note that I am looking for open-source algorithms and/or libraries. If there is a solution in Ruby or Java, even better, as my code is primarily in these two languages.
EDIT: I forgot to mention that the order of the words/phrases in the array will be the same as on the card. Thus, the array will be ["Scratch", "Live Coal"] instead of ["Live Coal", "Scratch"]. I am mentioning this because it can potentially simplify the task. Thus, for this example, I can simply look for black pixels (though I have to watch out for the black star in the white circle). However, there will be more difficult cases where there is descriptive text under the attack name in a smaller font (again, see the panorama for examples).
I would just write a program that allows you to visually draw a bounding box around your text for simplicity, but you could do this by detecting differences in pixel color. Since the text is black, you could see where the upper-left-most black pixel is without large indents and within the bottom half of the card.
When the cursor is stationary, check whether there is a black pixel either underneath the cursor or within 4 pixels of it. If there is, look for the first three consecutive non-black pixels (three, because there may still be a non-black pixel between the letters) to the left of the cursor, to the right, above, and below. If you find them, use these locations to draw a box. You can use OpenCV.
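One way to read the cursor-based idea is as a box that grows outward from the hovered pixel until it is surrounded by a margin of non-dark pixels. The sketch below is a simplified variant of that, not the exact procedure described: it only handles dark text on a light background, the `gap` and `darkThreshold` parameters are assumptions to be tuned per card style, and all names are made up for illustration.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

// Hypothetical helper; handles dark-on-light text only.
final class TextBox {
    private TextBox() {}

    /**
     * Grows a box from (px, py) until no dark pixel lies within `gap`
     * pixels outside of it. Larger gaps bridge the space between letters
     * and words; too large a gap will also swallow neighboring lines.
     */
    static Rectangle growAround(BufferedImage img, int px, int py, int gap, int darkThreshold) {
        Rectangle box = new Rectangle(px, py, 1, 1);
        while (true) {
            // Search window: the current box extended by `gap`, clamped to the image.
            int x0 = Math.max(0, box.x - gap);
            int y0 = Math.max(0, box.y - gap);
            int x1 = Math.min(img.getWidth() - 1, box.x + box.width - 1 + gap);
            int y1 = Math.min(img.getHeight() - 1, box.y + box.height - 1 + gap);
            int minX = box.x, minY = box.y;
            int maxX = box.x + box.width - 1, maxY = box.y + box.height - 1;
            for (int y = y0; y <= y1; y++) {
                for (int x = x0; x <= x1; x++) {
                    if (isDark(img.getRGB(x, y), darkThreshold)) {
                        minX = Math.min(minX, x);
                        maxX = Math.max(maxX, x);
                        minY = Math.min(minY, y);
                        maxY = Math.max(maxY, y);
                    }
                }
            }
            Rectangle next = new Rectangle(minX, minY, maxX - minX + 1, maxY - minY + 1);
            if (next.equals(box)) {
                return box; // nothing dark within the margin: the box is stable
            }
            box = next;
        }
    }

    private static boolean isDark(int rgb, int threshold) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (r * 299 + g * 587 + b * 114) / 1000 < threshold;
    }
}
```

Because the expected words are known beforehand and appear in card order, the resulting boxes could be matched to the array by their vertical position. Cards with white or outlined text would need a different darkness test.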

Finding position of an image onscreen via image matching

I'm currently trying to write a program in Java which will perform some macros; however, I need the macros to be capable of finding certain pictures located on my screen.
There is a function in Macro Scheduler which is pretty much FindImagePos(myimage.bmp, tolerance(0-255)) and locates the image in about 0.2s at 1080p, but the program would be a pain to code in Macro Scheduler.
The images are not distorted in any way except for slight color variations which is where tolerance comes in,
ex. finding this: http://min.us/idivEC.png in every one of these bars min.us/iR64a.png
I don't think I could code such a function efficiently and briefly skimmed ImageJ and Neuroph which seemed complicated and overkill and would like to know if there was something simpler.
The only idea I have for coding this is to take a screenshot, convert it into a 2D RGB array, search the rows for an occurrence of the first row of the sample image with some leniency (aka tolerance), and if it matches up somewhat, then check underneath the row for the second row, and so on.
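That idea can be sketched as a brute-force search in plain Java. This is a minimal, unoptimized illustration: the class and method names are made up, the tolerance is applied per color channel (an assumption about what "tolerance" means here), and it makes no claim of reaching the 0.2s figure mentioned above.

```java
import java.awt.Point;
import java.awt.image.BufferedImage;

// Hypothetical helper for illustration; not an existing library API.
final class ImageFinder {
    private ImageFinder() {}

    /** Returns the top-left corner of the first match, or null if none is found. */
    static Point find(BufferedImage screen, BufferedImage needle, int tolerance) {
        int nw = needle.getWidth();
        int nh = needle.getHeight();
        for (int y = 0; y + nh <= screen.getHeight(); y++) {
            for (int x = 0; x + nw <= screen.getWidth(); x++) {
                if (matchesAt(screen, needle, x, y, tolerance)) {
                    return new Point(x, y);
                }
            }
        }
        return null;
    }

    // Compares row by row and gives up at the first pixel that is too different.
    private static boolean matchesAt(BufferedImage screen, BufferedImage needle,
                                     int ox, int oy, int tolerance) {
        for (int ny = 0; ny < needle.getHeight(); ny++) {
            for (int nx = 0; nx < needle.getWidth(); nx++) {
                if (!closeEnough(screen.getRGB(ox + nx, oy + ny), needle.getRGB(nx, ny), tolerance)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Each of R, G and B may differ by at most `tolerance` (0-255).
    private static boolean closeEnough(int a, int b, int tolerance) {
        for (int shift = 0; shift <= 16; shift += 8) {
            int ca = (a >> shift) & 0xFF;
            int cb = (b >> shift) & 0xFF;
            if (Math.abs(ca - cb) > tolerance) {
                return false;
            }
        }
        return true;
    }
}
```

The screenshot itself can be taken with `java.awt.Robot.createScreenCapture(Rectangle)` and the sample loaded with `javax.imageio.ImageIO.read`. If this turns out too slow at full-screen size, copying both images into `int[]` arrays once instead of calling `getRGB` per pixel is a likely first optimization.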
