create a graph in html using data in csv - java

I need to show a graph (pie graph and XY graph) in an HTML file.
I used a free tool to create an image, and I am trying to show it in the HTML.
However, the image has to be placed in a shared folder or on a server so that the HTML can access it. Our client is not satisfied with either approach.
Can someone please let me know whether there is any way to pass the data directly to the HTML file? The data will be in a CSV file and may contain several thousand rows.
Thanks,

There is a JavaScript framework that renders really pretty charts: http://www.highcharts.com/
You can use one of many JavaScript CSV parsers: Javascript code to parse CSV data
If you then write some JavaScript code to extract your CSV data and pass it to Highcharts, you've got a very nice interactive chart.
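Since the question is tagged Java, here is a minimal sketch of a variation on this idea: instead of parsing the CSV in the browser, it parses it in Java and inlines the rows into the page as a JavaScript array, so the HTML file needs no external data file. It assumes a two-column CSV (label, numeric value) with no header row and no quoted commas. The names CsvToJsArray, chartData and the div id are made up for illustration, and the call that hands chartData to Highcharts is deliberately left out; take that part from the Highcharts documentation.

```java
import java.util.ArrayList;
import java.util.List;

final class CsvToJsArray {

    /** Turns "label,value" lines into a JavaScript array literal like [["a",1.0],["b",2.5]]. */
    static String toJsArray(List<String> csvLines) {
        List<String> points = new ArrayList<>();
        for (String line : csvLines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] cols = line.split(",", -1);
            // Escape backslashes and double quotes so the label stays a valid JS string.
            String label = cols[0].trim().replace("\\", "\\\\").replace("\"", "\\\"");
            double value = Double.parseDouble(cols[1].trim());
            points.add("[\"" + label + "\"," + value + "]");
        }
        return "[" + String.join(",", points) + "]";
    }

    /**
     * Wraps the data in a bare HTML page. Assumption: labels never contain
     * "</script>", which would end the script element early.
     */
    static String toHtml(List<String> csvLines) {
        return "<html><body><div id=\"chart\"></div>\n"
                + "<script>var chartData = " + toJsArray(csvLines) + ";</script>\n"
                + "</body></html>";
    }
}
```

The lines could come from something like Files.readAllLines(Paths.get("data.csv")), where data.csv is a placeholder file name. A few thousand rows inline this way without trouble.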
The alternative, if you want to keep using your existing images, is to encode the images as base64 directly in the HTML file: http://webcodertools.com/imagetobase64converter/
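The base64 route can also be done from Java instead of an online converter. A minimal sketch, assuming the chart image is a PNG whose bytes you already have; the class name InlineImage and the alt text are made up for illustration:

```java
import java.util.Base64;

final class InlineImage {

    /** Builds an <img> tag that carries the PNG bytes inline as a data URI. */
    static String toImgTag(byte[] pngBytes) {
        String encoded = Base64.getEncoder().encodeToString(pngBytes);
        return "<img src=\"data:image/png;base64," + encoded + "\" alt=\"chart\">";
    }
}
```

The bytes could be read with Files.readAllBytes(Paths.get("chart.png")), where chart.png is a placeholder name. Note that base64 makes the embedded image roughly a third larger than the original file.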

As an alternative, you can also look at the Dojo Toolkit (http://dojotoolkit.org/). It's a JavaScript toolkit with some really nice features, including charting.

Related

rich PDF generation framework in java

I have hundreds of rich PDFs that need to be generated from my application; they have images and colorful content. I am looking to build a framework that supports a template and a data model and takes care of the rest, so that adding a new PDF would just mean adding a new template. In the past I have used FreeMarker to generate HTML and then printed that HTML to PDF. Are there any better, more recent solutions to this problem?
There are various things you could do:
Generate XML data, apply an XSLT transformation to style it, and convert the resulting HTML document to a PDF.
Code a small class that converts whatever data format you have to a PDF document (you would need to do all the layout in code).
Create a template PDF document (using whatever design program you want), insert form fields, and have iText fill the form (several of our customers go for this approach).
Keep in mind that JasperReports uses a proprietary format, whereas the approaches I suggested use only open and well-established formats.
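For the first option, the XML-plus-XSLT step needs nothing beyond the JDK. A minimal sketch, assuming both the data and the stylesheet are available as strings; the class name XmlToHtml is made up for illustration, and the final HTML-to-PDF step is left to whichever converter you choose:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XmlToHtml {

    /** Applies an XSLT stylesheet to XML data and returns the transformed output as a string. */
    static String transform(String xml, String xslt) {
        try {
            // Compile the stylesheet (the "template") into a transformer.
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            // Run the data model through the stylesheet.
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transformation failed", e);
        }
    }
}
```

With this split, adding a new document type means adding a new stylesheet, while the Java code stays the same.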
Take a look at JasperReports.

extracting text AND Images from PDF file

I have been banging my head against the wall with this one; I have researched and tried pretty much every library suggested to me. I am currently trying to write a program in Java that will extract text AND images from a PDF file and allow me to write the extracted content to a Word file. I have managed to extract the content using the ICEpdf library. The problem is that I need to be able to write the content in the exact same order as it was read. So, to clarify, I need a library that will help me keep track of where exactly on the page the text and images are situated, so I can put them in the same place in my Word file.
A PDF to Word converter is a horribly complex proposition.
Your best bet will probably be to use OpenOffice to do it for you and not even try to handle the intermediate steps.
http://www.openoffice.org/api/
Look at this: Advanced PDF parser for Java
Off-topic: to my knowledge, there is also a Python parser that more or less converts the PDF to HTML (that way you can keep track of the ordering of the objects within the PDF). I know it's not Java, but you might be able to use the output.
http://www.unixuser.org/~euske/python/pdfminer/index.html

Java api for pdf

Which APIs in Java help in extracting table metadata from a PDF and presenting that table in a web page?
The result should be that when the source of the page is viewed, it shows the HTML code of that table.
iText is useful in this context:
http://itextpdf.com/
I assume you need a PDF library for Java.
PDFBox is one of the popular libraries created for PDF manipulation, and I think it is worth looking at.
Try the Metadata Extract Tool, which extracts metadata from specific file types, including PDF. Then you can parse the XML output with any Java XML parser. Once you're able to parse it, the elements can easily be laid out in your view page.

How to generate a PDF with auto flow

I wrote a web app that generates PDFs by filling data into a pre-saved PDF template. The template was edited in Acrobat and has some text fields. But the content of those text fields seems to sit in a different layer and cannot affect the other existing words in the template.
... But I want it to affect the existing words and make them flow based on how much data is filled into the text fields.
The solution may be to generate the whole PDF programmatically instead of using a template. But the template changes really often in my case, and I don't want to waste a lot of time adjusting positions and formatting in code...
Does anyone know how to use text fields with auto flow in a PDF template, just like in a Word document?
PDF doesn't work like that. You need to generate the whole PDF.
Ah... but from what?
There are quite a few HTML->PDF converters out there. You could fill in your template HTML, and convert it that way.
You could develop your own input format (for your template), and write an app that reads it and builds a PDF.
The latter is similar enough to HTML->PDF that, unless you can't find a converter that handles some PDF feature you need, I'd just go the HTML->PDF route. There are LOTS of HTML->PDF apps out there. You can search SO, Google, whatever. Lots.

What's the best way to extract table content from a group of HTML files?

After cleaning a folder full of HTML files with TIDY, how can the tables' content be extracted for further processing?
I've used BeautifulSoup for such things in the past with great success.
Depends on what sort of processing you want to do. You can tell Tidy to generate XHTML, which is a type of XML, which means you can use all the usual XML tools like XSLT and XQuery on the results.
If you want to process them in Microsoft Excel, then you should be able to slice the table out of the HTML and put it in a file, then open that file in Excel: it will happily convert an HTML table into a spreadsheet page. You could then save it as CSV or as an Excel workbook etc. (You can even use this on a web server: return an HTML table but set the Content-Type header to application/vnd.ms-excel, and Excel will open and import the table and turn it into a spreadsheet.)
If you want CSV to feed into a database, then you could go via Excel as before, or, if you want to automate the process, you could write a program that uses the XML-navigating API of your choice to iterate over the table rows and save them as CSV. Python's ElementTree and csv modules would make this pretty easy.
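The automated route can be sketched in Java as well, using the JDK's DOM parser rather than the Python modules mentioned. This is a minimal sketch under a few assumptions: the input is well-formed XHTML from Tidy, passed in as a string, with no DOCTYPE line (otherwise the parser may try to fetch the DTD) and no named entities such as &nbsp;. The class name XhtmlTableToCsv is made up for illustration, and every tr element in the document becomes one CSV row.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XhtmlTableToCsv {

    /** Converts every table row (tr) in the XHTML into one CSV line. */
    static String toCsv(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            StringBuilder csv = new StringBuilder();
            NodeList rows = doc.getElementsByTagName("tr");
            for (int r = 0; r < rows.getLength(); r++) {
                List<String> cells = new ArrayList<>();
                NodeList children = rows.item(r).getChildNodes();
                for (int c = 0; c < children.getLength(); c++) {
                    String name = children.item(c).getNodeName();
                    // Only direct td/th children count as cells; whitespace text nodes are skipped.
                    if (name.equals("td") || name.equals("th")) {
                        cells.add(quote(children.item(c).getTextContent().trim()));
                    }
                }
                csv.append(String.join(",", cells)).append("\n");
            }
            return csv.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
    }

    /** Quotes a field if it contains a comma, a double quote, or a line break. */
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }
}
```

For a folder of files, the same method could be called once per file and the results written out with Files.write.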
After reviewing the suggestions, I wound up using HtmlUnit.
With HtmlUnit, I was able to customize the Java code to open each HTML file in the folder, navigate to the TABLE tag, query each column's content and extract the data I needed to create a CSV file.
In .NET you could use HTMLAgilityPack.
See this previous question on StackOverflow for more information.
If you want to extract the content from the HTML markup, you should use some type of HTML parser. There are plenty out there, and here are two that might suit your needs:
http://jtidy.sourceforge.net/
http://htmlparser.sourceforge.net/
Iterate through the text and use regular expressions :)
http://www.knowledgehouse.sg
