I've been trying to use Flying Saucer to take HTML content and render it to an SVG. I've easily achieved this for JPEG and PNG output, but SVG seems to be a bit more complicated, and I've been struggling to figure out the process (mainly because I'm a Java noob).
I've been trying to follow this, but the examples don't quite work, and I struggled with the bundled demos as well. Most examples also try to get a browser involved, which I don't really care about. All I want is to take the HTML and convert it to an SVG so I can use it as a resource in a PDF. This is what I have for PNG conversion:
BufferedImage buff = Graphics2DRenderer.renderToImage("content.html", 1000, 1000);
File outputfile = new File("test.png");
ImageIO.write(buff, "PNG", outputfile);
Related
I have a file that fails to convert with the code below, yet an online converter handles the same file without problems. What could be the cause of this?
FileSeekableStream fss = new FileSeekableStream(tifFile);
ImageDecoder decoder = ImageCodec.createImageDecoder("tiff", fss, null);
RenderedImage image = decoder.decodeAsRenderedImage();
ImageIO.write(image, "png", new File(imageFolder + "/" + baseName + ".png"));
Edit:
To be clear about the question: what may cause some TIFF files to convert and others not? What can I check in a TIFF file to see why it will not convert, and what can I change before converting a TIFF to a PNG?
This is the image
I am not sure what you mean by "cannot convert". I ran into an issue a couple of years ago converting tiff to png - the images converted, but the colors were way off and looked horrible.
The cause was actually that the input image (tifFile) was CMYK and the output file was RGB.
I know it's not really an answer, but I am unable to comment here at this point...
I found out that the file that was not converting had a completely different color model from the images that were converting. After I changed the color model of that image, the conversion worked. The color was off a little, but I made progress. Thanks for the GIMP suggestion.
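The color-model change described above can also be done in code rather than in GIMP. The following is a minimal sketch using only the JDK; the class name `RgbNormalizer` is made up for illustration. It assumes the image could already be decoded into a `BufferedImage`, so it does not help with files the decoder rejects outright, and the colors may still shift for exotic source color spaces.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper: normalizes any decoded image to plain RGB.
final class RgbNormalizer {

    // Redraws the source into a fresh TYPE_INT_RGB image so that the
    // result no longer depends on the source's color model.
    static BufferedImage toRgb(BufferedImage source) {
        BufferedImage rgb = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = rgb.createGraphics();
        try {
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return rgb;
    }
}
```

The result of `RgbNormalizer.toRgb(image)` can then be passed to `ImageIO.write(..., "png", file)` in place of the original image.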
I need to get the images from one specific page of a PDF, and I need to convert each image from the ITEXT PDXObjectImage format to a BufferedImage. Is this possible, and how can I do it?
I have seen several examples of extraction, but I need the image in memory rather than in a file, because I want to use the ZXING library to read a barcode.
Can somebody help me please?
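One way to stay in memory, sketched with the JDK only: if the extraction step can hand over the encoded image bytes (an assumption; how to obtain them from the PDF library is not shown here), `ImageIO` can decode them from a `ByteArrayInputStream` without touching the disk. The class and method names below are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical helper for converting between encoded bytes and BufferedImage.
final class InMemoryImages {

    // Decodes encoded image bytes (PNG, JPEG, ...) without writing a file.
    static BufferedImage decode(byte[] encoded) {
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(encoded));
            if (image == null) {
                throw new IllegalArgumentException("No ImageIO reader understands these bytes");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Encodes an image as PNG into a byte array; handy for round-trip checks.
    static byte[] encodePng(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("No PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

The `BufferedImage` returned by `decode` can then be passed to the barcode reader directly.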
I am going to start a project soon, and I will have to draw on an image (.bmp, .jpg, or .ps; any of these) in a JFrame.
The pictures will show maps, and I will have to draw dots and similar markers on them. How do I do this in Java? More generally, how do I draw on a picture in Java?
Take a look at the ImageIO API; it has out-of-the-box support for JPEG, PNG, BMP, WBMP, and GIF. You can get TIFF support from the Java Advanced Imaging (JAI) API.
PostScript support is a little trickier, but some PDF renderers can handle it (I used this approach to convert Illustrator image formats).
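As a rough sketch of the drawing part, using only the JDK: once the map has been loaded as a `BufferedImage` (for example with `ImageIO.read`), dots can be painted through its `Graphics2D`. The class name `MapMarkers` and the choice to draw on a copy instead of the original are assumptions made for this example.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Point;
import java.awt.image.BufferedImage;
import java.util.List;

// Hypothetical helper that paints filled dots on top of a map image.
final class MapMarkers {

    // Returns a copy of the map with one filled circle per point.
    static BufferedImage drawDots(BufferedImage map, List<Point> points,
                                  int radius, Color color) {
        BufferedImage copy = new BufferedImage(
                map.getWidth(), map.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(map, 0, 0, null);   // paint the original map first
            g.setColor(color);
            for (Point p : points) {
                // fillOval takes the top-left corner of the bounding box
                g.fillOval(p.x - radius, p.y - radius, 2 * radius, 2 * radius);
            }
        } finally {
            g.dispose();
        }
        return copy;
    }
}
```

To show the result in a JFrame, one option is to wrap it in `new JLabel(new ImageIcon(image))` and add that label to the frame.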
There are some tutorials online. Take a look at Listing 14.17 on this site:
Java ist auch eine Insel
The code should be readable even though the text is in German. I hope it helps you take a first step into Swing :)
About bitmaps: I wrote my own Bitmap class to convert a PNG to a BMP, because BitmapFactory and co. didn't work for me. If you need it and nobody has a better "Java-API"-like solution, you can write to me.
I am generating lots of images in Java and saving them with the ImageIO.write method like this:
final BufferedImage img = createSomeImage();
ImageIO.write(img, "png", new File("/some/file.png"));
I was happy with the results until Google's Firefox add-on 'Page Speed' told me that I could save up to 60% of the size by optimizing the images. The images are QR codes; each is around 900 B, and the versions optimized by the Firefox plugin are around 300 B.
I'd like to save such optimized 300 B images directly from Java.
So here is my question again: how do I save optimized PNG images with Java's ImageIO?
Use PngEncoderB to convert your BufferedImage into a PNG encoded byte array.
You can apply a filter to it, which helps prepare the image for better optimization. This is what OptiPNG does, only OptiPNG calculates which filter will get you the best compression.
You might have to try each filter to see which one is consistently better for you. With 2-bit color, I think the only filter that might help is "up", so I'm guessing that's the one to use.
Once you get the image to a PNG encoded byte array, you can write that directly to a file.
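A different technique from the PngEncoderB and filter approach above, and one that needs only the JDK: since a QR code is pure black and white, the image can be redrawn into a 1-bit `TYPE_BYTE_BINARY` image before it is handed to `ImageIO`, which then writes far less pixel data than for a 24-bit RGB image. This is only a sketch; `CompactPng` and `samplePattern` are made-up names, and the random pattern is a stand-in for a real QR code. The actual savings depend on the image and were not measured against the 300 B figure from the question.

```java
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import javax.imageio.ImageIO;

// Hypothetical helper: shrinks black-and-white PNGs by lowering the bit depth.
final class CompactPng {

    // Redraws the source into a 1-bit-per-pixel black/white image.
    static BufferedImage toOneBit(BufferedImage source) {
        BufferedImage binary = new BufferedImage(
                source.getWidth(), source.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        Graphics2D g = binary.createGraphics();
        try {
            g.drawImage(source, 0, 0, null);
        } finally {
            g.dispose();
        }
        return binary;
    }

    // Encodes the image as PNG into a byte array.
    static byte[] encodePng(BufferedImage image) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            if (!ImageIO.write(image, "png", out)) {
                throw new IllegalStateException("No PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Stand-in for a QR code: a grid of random black/white modules (fixed seed).
    static BufferedImage samplePattern(int modules, int scale) {
        int size = modules * scale;
        BufferedImage image = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        Random random = new Random(42);
        for (int my = 0; my < modules; my++) {
            for (int mx = 0; mx < modules; mx++) {
                int rgb = random.nextBoolean() ? 0x000000 : 0xFFFFFF;
                for (int y = 0; y < scale; y++) {
                    for (int x = 0; x < scale; x++) {
                        image.setRGB(mx * scale + x, my * scale + y, rgb);
                    }
                }
            }
        }
        return image;
    }
}
```

The byte array returned by `encodePng(toOneBit(img))` can be written to a file as-is, for example with `Files.write`.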
I am trying to read a PDF document in a J2EE application.
For a web application I have to store PDF documents on disk. To make searching easy, I want to build a reverse index of the text inside each document, if it has OCR text.
With the PDFBox library it's possible to create a pdfDocument object which contains an entire PDF file. However, to conserve memory and improve overall performance, I'd rather handle the document as a stream and read one page at a time into a buffer.
I wonder if it is possible to read a file stream containing a PDF page by page, or even one line at a time.
For a given generic PDF document you have no way of knowing where one page ends and another one starts, using PDFBox at least.
If your concern is the use of resources, I suggest you parse the PDF document into a COSDocument and extract the parsed objects from the COSDocument using .getObjects(), which will give you a java.util.List. This should be easy to fit into whatever scarce resources you have.
Note that you can easily convert your parsed pdf documents into Lucene indexes through the PDFBox API.
Also, before venturing into the land of optimisations, be sure that you really need them. PDFBox is able to make an in-memory representation of quite large PDF documents without much effort.
For parsing the PDF document from an InputStream, look at the COSDocument class
For writing lucene indexes, look at LucenePDFDocument class
For in-memory representations of COSDocuments, look at FDFDocument
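To illustrate what the reverse index from the question amounts to, here is a minimal plain-Java sketch. It is not the Lucene route suggested above, and it assumes the text of each page has already been extracted by some other means; `ReverseIndex` and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical in-memory reverse index: term -> page numbers containing it.
final class ReverseIndex {

    private final Map<String, SortedSet<Integer>> pagesByTerm = new TreeMap<>();

    // Splits the page text on anything that is not a letter or digit
    // and records the page number under each lower-cased term.
    void addPage(int pageNumber, String pageText) {
        String[] tokens = pageText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
        for (String token : tokens) {
            if (!token.isEmpty()) {
                pagesByTerm.computeIfAbsent(token, k -> new TreeSet<>()).add(pageNumber);
            }
        }
    }

    // Returns the pages containing the term, or an empty set if there are none.
    SortedSet<Integer> pagesContaining(String term) {
        return pagesByTerm.getOrDefault(
                term.toLowerCase(Locale.ROOT), Collections.emptySortedSet());
    }
}
```

Because pages are added one at a time, only the current page's text has to be held in memory while indexing.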
In the 2.0.* versions, open the PDF like this:
PDDocument doc = PDDocument.load(file, MemoryUsageSetting.setupTempFileOnly());
This sets up buffering so that only temporary file(s) are used (no main memory), with no size restriction.
This was answered here.
Take a look at the PDF Renderer Java library. I have tried it myself and it seems much faster than PDFBox. I haven't tried getting the OCR text, however.
Here is an example copied from the link above which shows how to draw a PDF page into an image:
File file = new File("test.pdf");
RandomAccessFile raf = new RandomAccessFile(file, "r");
FileChannel channel = raf.getChannel();
ByteBuffer buf = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
PDFFile pdffile = new PDFFile(buf);
// draw the first page to an image
PDFPage page = pdffile.getPage(0);
// get the width and height for the doc at the default zoom
Rectangle rect = new Rectangle(0, 0,
        (int) page.getBBox().getWidth(),
        (int) page.getBBox().getHeight());
// generate the image
Image img = page.getImage(
        rect.width, rect.height, // width & height
        rect, // clip rect
        null, // null for the ImageObserver
        true, // fill background with white
        true  // block until drawing is done
);
I'd imagine you can read through the file byte by byte looking for page breaks. Line by line is more difficult because of possible PDF formatting issues.