Java. Broken image from URL

Java. Broken image from URL - java

I am trying to load this image from URL, but receive image like this.
Code:
#Override
protected void paintComponent(Graphics g) {
super.paintComponent(g);
Graphics2D g2 = (Graphics2D)g;
ByteArrayOutputStream out = new ByteArrayOutputStream();
try {
URL url = new URL("http://s.developers.org.ua/img/announces/java_1.jpg");
BufferedInputStream in = new BufferedInputStream(url.openStream());
byte[] b = new byte[512];
while (in.read(b)!=-1)
out.write(b);
Image img = ImageIO.read(new ByteArrayInputStream(out.toByteArray()));
g2.drawImage(img, 0, 0, getWidth(), getHeight(), null);
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

Don't read images inside the paintComponent method, it will make your application appear sluggish, as the method is executed on the event dispatcher thread (EDT). Also, it will be re-read, whenever your component is re-painted, meaning you'll download the image over and over. Instead, read it up front, or in a separate thread (ie. use a SwingWorker), and only invoke g.drawImage(...) from inside the paintComponent method.
The reason for the broken image is your byte copying code, where you don't pay attention to how many bytes are read (as long as the value isn't -1), but instead unconditionally copy 512 bytes. However, you don't need to do that here, you can simply pass the stream to ImageIO.read, like this, making the code simpler and more readable:
URL url = new URL("http://s.developers.org.ua/img/announces/java_1.jpg");
try (BufferedInputStream in = new BufferedInputStream(url.openStream())) {
BufferedImage img = ImageIO.read(in);
}
Adding the extra try (try-with-resources) block makes sure your stream is also properly closed to avoid resource leaks.
For completeness, to fix the byte copying code, the correct version would be:
// ... as above ...
byte[] b = new byte[512];
int bytesRead; // Keep track of the number of bytes read into 'b'
while ((bytesRead = in.read(b)) != -1)
out.write(b, 0, bytesRead);

I don't know if this is the only problem, but you might write more than you get.
I suggest that you change your writing code to:
int len;
while ((len=in.read(b))!=-1)
out.write(b, 0, len);
Otherwise, if the last buffer is not exactly 512 bytes long, you'll write too much

I have some code copy file from URL to Local.. So far the result is same like actual source. Just do some modification maybe can help to solved it.
import java.awt.Image;
import java.awt.image.BufferedImage;
import java.io.File;
import java.net.URL;
import java.util.ArrayList;
import org.apache.commons.io.FilenameUtils;
import javax.imageio.ImageIO;
public class ImagesUrlToImagesLocal {
public ArrayList<String> getIt(ArrayList<String> urlFile)
{
ArrayList<String> strResult = new ArrayList<String>();
Image imagesUrl = null;
String baseName = null;
String extension = null;
File outputfile = null;
try {
for (int i = 0; i < urlFile.size(); i++)
{
URL url = new URL(urlFile.get(i));
baseName = FilenameUtils.getBaseName(urlFile.get(i));
extension = FilenameUtils.getExtension(urlFile.get(i));
imagesUrl = ImageIO.read(url);
BufferedImage image = (BufferedImage) imagesUrl;
outputfile = new File("temp_images/" + baseName + "." + extension);
ImageIO.write(image, extension, outputfile);
strResult.add("temp_images/" + baseName + "." + extension);
}
} catch (Exception e) {
e.printStackTrace();
}
return strResult;
}
}

Related

How to disable compression when writing JPEG in java?

I want to compress JPEG to fixed file size (20480 bytes). Here is my code:
package io.github.baijifeilong.jpeg;
import lombok.SneakyThrows;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.plugins.jpeg.JPEGImageWriteParam;
import javax.imageio.stream.FileImageOutputStream;
import java.awt.*;
import java.awt.image.BufferedImage;
import java.io.File;
/**
* Created by BaiJiFeiLong#gmail.com at 2019/10/9 上午11:26
*/
public class JpegApp {
#SneakyThrows
public static void main(String[] args) {
BufferedImage inImage = ImageIO.read(new File("demo.jpg"));
BufferedImage outImage = new BufferedImage(143, 143, BufferedImage.TYPE_INT_RGB);
outImage.getGraphics().drawImage(inImage.getScaledInstance(143, 143, Image.SCALE_SMOOTH), 0, 0, null);
JPEGImageWriteParam jpegParams = new JPEGImageWriteParam(null);
jpegParams.setCompressionMode(ImageWriteParam.MODE_DISABLED);
ImageWriter imageWriter = ImageIO.getImageWritersByFormatName("jpg").next();
imageWriter.setOutput(new FileImageOutputStream(new File("demo-xxx.jpg")));
imageWriter.write(null, new IIOImage(outImage, null, null), jpegParams);
}
}
And the error occured:
Exception in thread "main" javax.imageio.IIOException: JPEG compression cannot be disabled
at com.sun.imageio.plugins.jpeg.JPEGImageWriter.writeOnThread(JPEGImageWriter.java:580)
at com.sun.imageio.plugins.jpeg.JPEGImageWriter.write(JPEGImageWriter.java:363)
at io.github.baijifeilong.jpeg.JpegApp.main(JpegApp.java:30)
Process finished with exit code 1
So how to disable JPEG compression? Or there be any method that can compress any image to a fixed file size with any compression?

As for the initial question, how to create non-compressed jpegs: one can't, for fundamental reasons. While I initially assumed that it is possible to write a non-compressing jpg encoder producing output that can be decoded with any existing decoder by manipulating the Huffman tree involved, I had to dismiss it. The Huffman encoding is just the last step of quite a pipeline of transformations, that can not be skipped. Custom Huffman trees may also break less sophisticated decoders.
For an answer that takes into consideration the requirement change made in comments (resize and compress any way you like, just give me the desired file size) one could reason this way:
The jpeg file specification defines an End of Image marker. So chances are, that patching zeros (or just anything perhaps) afterwards make no difference. An experiment patching some images up to a specific size showed that gimp, chrome, firefox and your JpegApp swallowed such an inflated file without complaint.
It would be rather complicated to create a compression that for any image compresses precisely to your size requirement (kind of: for image A you need a compression ratio of 0.7143, for Image B 0.9356633, for C 12.445 ...). There are attempts to predict image compression ratios based on raw image data, though.
So I'd propose just to resize/compress to any size < 20480 and then patch it:
calculate the scaling ratio based on the original jpg size and the
desired size, including a safety margin to account for the
inherently vague nature of the issue
resize the image with that ratio
patch missing bytes to match exactly the desired size
As outlined here
private static void scaleAndcompress(String fileNameIn, String fileNameOut, Long desiredSize) {
try {
long size = getSize(fileNameIn);
// calculate desired ratio for conversion to stay within size limit, including a safte maring (of 10%)
// to account for the vague nature of the procedure. note, that this will also scale up if needed
double desiredRatio = (desiredSize.floatValue() / size) * (1 - SAFETY_MARGIN);
BufferedImage inImg = ImageIO.read(new File(fileNameIn));
int widthOut = round(inImg.getWidth() * desiredRatio);
int heigthOut = round(inImg.getHeight() * desiredRatio);
BufferedImage outImg = new BufferedImage(widthOut, heigthOut, BufferedImage.TYPE_INT_RGB);
outImg.getGraphics().drawImage(inImg.getScaledInstance(widthOut, heigthOut, Image.SCALE_SMOOTH), 0, 0, null);
JPEGImageWriter imageWriter = (JPEGImageWriter) ImageIO.getImageWritersByFormatName("jpg").next();
ByteArrayOutputStream outBytes = new ByteArrayOutputStream();
imageWriter.setOutput(new MemoryCacheImageOutputStream(outBytes));
imageWriter.write(null, new IIOImage(outImg, null, null), new JPEGImageWriteParam(null));
if (outBytes.size() > desiredSize) {
throw new IllegalStateException(String.format("Excess output data size %d for image %s", outBytes.size(), fileNameIn));
}
System.out.println(String.format("patching %d bytes to %s", desiredSize - outBytes.size(), fileNameOut));
patch(desiredSize, outBytes);
try (FileOutputStream outFileStream = new FileOutputStream(new File(fileNameOut))) {
outBytes.writeTo(outFileStream);
}
} catch (IOException e) {
e.printStackTrace();
}
}
private static void patch(Long desiredSize, ByteArrayOutputStream bytesOut) {
long patchSize = desiredSize - bytesOut.size();
for (long i = 0; i < patchSize; i++) {
bytesOut.write(0);
}
}
private static long getSize(String fileName) {
return (new File(fileName)).length();
}
private static int round(double f) {
return Math.toIntExact(Math.round(f));
}

A solution using Magick (sudo apt install imagemagick on Debian), maybe not work for some images. Thanks to #curiosa-g.
package io.github.baijifeilong.jpeg;
import lombok.SneakyThrows;
import org.apache.commons.io.FileUtils;
import org.apache.commons.io.IOUtils;
import java.io.*;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Scanner;
/**
* Created by BaiJiFeiLong#gmail.com at 2019/10/9 上午11:26
*/
public class JpegApp {
#SneakyThrows
private static InputStream compressJpeg(InputStream inputStream, int fileSize) {
File tmpFile = new File(String.format("tmp-%d.jpg", Thread.currentThread().getId()));
FileUtils.copyInputStreamToFile(inputStream, tmpFile);
Process process = Runtime.getRuntime().exec(String.format("mogrify -strip -resize 512 -define jpeg:extent=%d %s", fileSize, tmpFile.getName()));
try (Scanner scanner = new Scanner(process.getErrorStream()).useDelimiter("\\A")) {
if (process.waitFor() > 0) throw new RuntimeException(String.format("Mogrify Error \n### %s###", scanner.hasNext() ? scanner.next() : "Unknown"));
}
try (FileInputStream fileInputStream = new FileInputStream(tmpFile)) {
byte[] bytes = IOUtils.toByteArray(fileInputStream);
assert bytes.length <= fileSize;
byte[] newBytes = new byte[fileSize];
System.arraycopy(bytes, 0, newBytes, 0, bytes.length);
Files.delete(Paths.get(tmpFile.getPath()));
return new ByteArrayInputStream(newBytes);
}
}
#SneakyThrows
public static void main(String[] args) {
InputStream inputStream = compressJpeg(new FileInputStream("big.jpg"), 40 * 1024);
IOUtils.copy(inputStream, new FileOutputStream("40KB.jpg"));
System.out.println(40 * 1024);
System.out.println(new File("40KB.jpg").length());
}
}
And the output:
40960
40960

How to convert pdf file in to multipagetiff images using pdfBox? [duplicate]

I'm trying to convert PDFs as represented by the org.apache.pdfbox.pdmodel.PDDocument class and the icafe library (https://github.com/dragon66/icafe/) to a multipage tiff with group 4 compression and 300 dpi. The sample code works for me for 288 dpi but strangely NOT for 300 dpi, the exported tiff remains just white. Has anybody an idea what the issue is here?
The sample pdf which I use in the example is located here: http://www.bergophil.ch/a.pdf
import java.awt.image.BufferedImage;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import cafe.image.ImageColorType;
import cafe.image.ImageParam;
import cafe.image.options.TIFFOptions;
import cafe.image.tiff.TIFFTweaker;
import cafe.image.tiff.TiffFieldEnum.Compression;
import cafe.io.FileCacheRandomAccessOutputStream;
import cafe.io.RandomAccessOutputStream;
public class Pdf2TiffConverter {
public static void main(String[] args) {
String pdf = "a.pdf";
PDDocument pddoc = null;
try {
pddoc = PDDocument.load(pdf);
} catch (IOException e) {
}
try {
savePdfAsTiff(pddoc);
} catch (IOException e) {
}
}
private static void savePdfAsTiff(PDDocument pdf) throws IOException {
BufferedImage[] images = new BufferedImage[pdf.getNumberOfPages()];
for (int i = 0; i < images.length; i++) {
PDPage page = (PDPage) pdf.getDocumentCatalog().getAllPages()
.get(i);
BufferedImage image;
try {
// image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 288); //works
image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 300); // does not work
images[i] = image;
} catch (IOException e) {
e.printStackTrace();
}
}
FileOutputStream fos = new FileOutputStream("a.tiff");
RandomAccessOutputStream rout = new FileCacheRandomAccessOutputStream(
fos);
ImageParam.ImageParamBuilder builder = ImageParam.getBuilder();
ImageParam[] param = new ImageParam[1];
TIFFOptions tiffOptions = new TIFFOptions();
tiffOptions.setTiffCompression(Compression.CCITTFAX4);
builder.imageOptions(tiffOptions);
builder.colorType(ImageColorType.BILEVEL);
param[0] = builder.build();
TIFFTweaker.writeMultipageTIFF(rout, param, images);
rout.close();
fos.close();
}
}
Or is there another library to write multi-page TIFFs?
EDIT:
Thanks to dragon66 the bug in icafe is now fixed. In the meantime I experimented with other libraries and also with invoking ghostscript. As I think ghostscript is very reliable as id is a widely used tool, on the other hand I have to rely that the user of my code has an ghostscript-installation, something like this:
/**
* Converts a given pdf as specified by its path to an tiff using group 4 compression
*
* #param pdfFilePath The absolute path of the pdf
* #param tiffFilePath The absolute path of the tiff to be created
* #param dpi The resolution of the tiff
* #throws MyException If the conversion fails
*/
private static void convertPdfToTiffGhostscript(String pdfFilePath, String tiffFilePath, int dpi) throws MyException {
// location of gswin64c.exe
String ghostscriptLoc = context.getGhostscriptLoc();
// enclose src and dest. with quotes to avoid problems if the paths contain whitespaces
pdfFilePath = "\"" + pdfFilePath + "\"";
tiffFilePath = "\"" + tiffFilePath + "\"";
logger.debug("invoking ghostscript to convert {} to {}", pdfFilePath, tiffFilePath);
String cmd = ghostscriptLoc + " -dQUIET -dBATCH -o " + tiffFilePath + " -r" + dpi + " -sDEVICE=tiffg4 " + pdfFilePath;
logger.debug("The following command will be invoked: {}", cmd);
int exitVal = 0;
try {
exitVal = Runtime.getRuntime().exec(cmd).waitFor();
} catch (Exception e) {
logger.error("error while converting to tiff using ghostscript", e);
throw new MyException(ErrorMessages.GHOSTSTSCRIPT_ERROR, e);
}
if (exitVal != 0) {
logger.error("error while converting to tiff using ghostscript, exitval is {}", exitVal);
throw new MyException(ErrorMessages.GHOSTSTSCRIPT_ERROR);
}
}
I found that the produced tif from ghostscript strongly differs in quality from the tiff produced by icafe (the group 4 tiff from ghostscript looks greyscale-like)

It's been a while since the question was asked and I finally find time and a wonderful ordered dither matrix which allows me to give some details on how "icafe" can be used to get similar or better results than calling external ghostscript executable. Some new features were added to "icafe" recently such as better quantization and ordered dither algorithms which is used in the following example code.
Here the sample pdf I am going to use is princeCatalogue. Most of the following code is from the OP with some changes due to package name change and more ImageParam control settings.
import java.awt.image.BufferedImage;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import com.icafe4j.image.ImageColorType;
import com.icafe4j.image.ImageParam;
import com.icafe4j.image.options.TIFFOptions;
import com.icafe4j.image.quant.DitherMethod;
import com.icafe4j.image.quant.DitherMatrix;
import com.icafe4j.image.tiff.TIFFTweaker;
import com.icafe4j.image.tiff.TiffFieldEnum.Compression;
import com.icafe4j.io.FileCacheRandomAccessOutputStream;
import com.icafe4j.io.RandomAccessOutputStream;
public class Pdf2TiffConverter {
public static void main(String[] args) {
String pdf = "princecatalogue.pdf";
PDDocument pddoc = null;
try {
pddoc = PDDocument.load(pdf);
} catch (IOException e) {
}
try {
savePdfAsTiff(pddoc);
} catch (IOException e) {
}
}
private static void savePdfAsTiff(PDDocument pdf) throws IOException {
BufferedImage[] images = new BufferedImage[pdf.getNumberOfPages()];
for (int i = 0; i < images.length; i++) {
PDPage page = (PDPage) pdf.getDocumentCatalog().getAllPages()
.get(i);
BufferedImage image;
try {
// image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 288); //works
image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 300); // does not work
images[i] = image;
} catch (IOException e) {
e.printStackTrace();
}
}
FileOutputStream fos = new FileOutputStream("a.tiff");
RandomAccessOutputStream rout = new FileCacheRandomAccessOutputStream(
fos);
ImageParam.ImageParamBuilder builder = ImageParam.getBuilder();
ImageParam[] param = new ImageParam[1];
TIFFOptions tiffOptions = new TIFFOptions();
tiffOptions.setTiffCompression(Compression.CCITTFAX4);
builder.imageOptions(tiffOptions);
builder.colorType(ImageColorType.BILEVEL).ditherMatrix(DitherMatrix.getBayer8x8Diag()).applyDither(true).ditherMethod(DitherMethod.BAYER);
param[0] = builder.build();
TIFFTweaker.writeMultipageTIFF(rout, param, images);
rout.close();
fos.close();
}
}
For ghostscript, I used command line directly with the same parameters provided by the OP. The screenshots for the first page of the resulted TIFF images are showing below:
The lefthand side shows the output of "ghostscript" and the righthand side the output of "icafe". It can be seen, at least in this case, the output from "icafe" is better than the output from "ghostscript".
Using CCITTFAX4 compression, the file size from "ghostscript" is 2.22M and the file size from "icafe" is 2.08M. Both are not so good given the fact dither is used while creating the black and white output. In fact, a different compression algorithm will create way smaller file size. For example, using LZW, the same output from "icafe" is only 634K and if using DEFLATE compression the output file size went down to 582K.

Here's some code to save in a multipage tiff which I use with PDFBox. It requires the TIFFUtil class from PDFBox (it isn't public, so you have to make a copy).
void saveAsMultipageTIFF(ArrayList<BufferedImage> bimTab, String filename, int dpi) throws IOException
{
Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
ImageWriter imageWriter = writers.next();
ImageOutputStream ios = ImageIO.createImageOutputStream(new File(filename));
imageWriter.setOutput(ios);
imageWriter.prepareWriteSequence(null);
for (BufferedImage image : bimTab)
{
ImageWriteParam param = imageWriter.getDefaultWriteParam();
IIOMetadata metadata = imageWriter.getDefaultImageMetadata(new ImageTypeSpecifier(image), param);
param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
TIFFUtil.setCompressionType(param, image);
TIFFUtil.updateMetadata(metadata, image, dpi);
imageWriter.writeToSequence(new IIOImage(image, null, metadata), param);
}
imageWriter.endWriteSequence();
imageWriter.dispose();
ios.flush();
ios.close();
}
I experimented on this for myself some time ago by using this code:
https://www.java.net/node/670205 (I used solution 2)
However...
If you create an array with lots of images, your memory consumption
really goes up. So it would probably be better to render an image, then
add it to the tiff file, then render the next page and lose the
reference of the previous one so that the gc can get the space if needed.

Refer to my github code for an implementation with PDFBox.

Since some dependencies used by solutions for this problem looks not maintained. I got a solution by using latest version (2.0.16) pdfbox:
ByteArrayOutputStream imageBaos = new ByteArrayOutputStream();
ImageOutputStream output = ImageIO.createImageOutputStream(imageBaos);
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try (final PDDocument document = PDDocument.load(new File("/tmp/tmp.pdf"))) {
PDFRenderer pdfRenderer = new PDFRenderer(document);
int pageCount = document.getNumberOfPages();
BufferedImage[] images = new BufferedImage[pageCount];
// ByteArrayOutputStream[] baosArray = new ByteArrayOutputStream[pageCount];
writer.setOutput(output);
ImageWriteParam params = writer.getDefaultWriteParam();
params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
// Compression: None, PackBits, ZLib, Deflate, LZW, JPEG and CCITT
// variants allowed
params.setCompressionType("Deflate");
writer.prepareWriteSequence(null);
for (int page = 0; page < pageCount; page++) {
BufferedImage image = pdfRenderer.renderImageWithDPI(page, DPI, ImageType.RGB);
images[page] = image;
IIOMetadata metadata = writer.getDefaultImageMetadata(new ImageTypeSpecifier(image), params);
writer.writeToSequence(new IIOImage(image, null, metadata), params);
// ImageIO.write(image, "tiff", baosArray[page]);
}
System.out.println("imageBaos size: " + imageBaos.size());
// Finished write to output
writer.endWriteSequence();
document.close();
} catch (IOException e) {
e.printStackTrace();
throw new Exception(e);
} finally {
// avoid memory leaks
writer.dispose();
}
Then you may using imageBaos write to your local file. But if you want to pass your image to ByteArrayOutputStream and return to privious method like me. Then we need other steps.
After processing is done, the image bytes would be available in the ImageOutputStream
output object. We need to position the offset to the beginning of the output object and then read the butes to write to new ByteArrayOutputStream, a concise way like this:
ByteArrayOutputStream bos = new ByteArrayOutputStream();
long counter = 0;
while (true) {
try {
bos.write(ios.readByte());
counter++;
} catch (EOFException e) {
System.out.println("End of Image Stream");
break;
} catch (IOException e) {
System.out.println("Error processing the Image Stream");
break;
}
}
return bos
Or you can just ImageOutputStream.flush() at end to get your imageBaos Byte then return.

Inspired by Yusaku answer,
I made my own version,
This can convert multiple pdf pages to a byte array.
I Used pdfbox 2.0.16 in combination with imageio-tiff 3.4.2
//PDF converter to tiff toolbox method.
private byte[] bytesToTIFF(#Nonnull byte[] in) {
int dpi = 300;
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try(ByteArrayOutputStream imageBaos = new ByteArrayOutputStream(255)){
writer.setOutput(ImageIO.createImageOutputStream(imageBaos));
writer.prepareWriteSequence(null);
PDDocument document = PDDocument.load(in);
PDFRenderer pdfRenderer = new PDFRenderer(document);
ImageWriteParam params = writer.getDefaultWriteParam();
for (int page = 0; page < document.getNumberOfPages(); page++) {
BufferedImage image = pdfRenderer.renderImageWithDPI(page, dpi, ImageType.RGB);
IIOMetadata metadata = writer.getDefaultImageMetadata(new ImageTypeSpecifier(image), params);
writer.writeToSequence(new IIOImage(image, null, metadata), params);
}
LOG.trace("size found: {}", imageBaos.size());
writer.endWriteSequence();
writer.reset();
return imageBaos.toByteArray();
} catch (Exception ex) {
LOG.warn("can't instantiate the bytesToTiff method with: PDF", ex);
} finally {
writer.dispose();
}
}

Converting PDF to multipage tiff (Group 4)

I'm trying to convert PDFs as represented by the org.apache.pdfbox.pdmodel.PDDocument class and the icafe library (https://github.com/dragon66/icafe/) to a multipage tiff with group 4 compression and 300 dpi. The sample code works for me for 288 dpi but strangely NOT for 300 dpi, the exported tiff remains just white. Has anybody an idea what the issue is here?
The sample pdf which I use in the example is located here: http://www.bergophil.ch/a.pdf
import java.awt.image.BufferedImage;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import cafe.image.ImageColorType;
import cafe.image.ImageParam;
import cafe.image.options.TIFFOptions;
import cafe.image.tiff.TIFFTweaker;
import cafe.image.tiff.TiffFieldEnum.Compression;
import cafe.io.FileCacheRandomAccessOutputStream;
import cafe.io.RandomAccessOutputStream;
public class Pdf2TiffConverter {
public static void main(String[] args) {
String pdf = "a.pdf";
PDDocument pddoc = null;
try {
pddoc = PDDocument.load(pdf);
} catch (IOException e) {
}
try {
savePdfAsTiff(pddoc);
} catch (IOException e) {
}
}
private static void savePdfAsTiff(PDDocument pdf) throws IOException {
BufferedImage[] images = new BufferedImage[pdf.getNumberOfPages()];
for (int i = 0; i < images.length; i++) {
PDPage page = (PDPage) pdf.getDocumentCatalog().getAllPages()
.get(i);
BufferedImage image;
try {
// image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 288); //works
image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 300); // does not work
images[i] = image;
} catch (IOException e) {
e.printStackTrace();
}
}
FileOutputStream fos = new FileOutputStream("a.tiff");
RandomAccessOutputStream rout = new FileCacheRandomAccessOutputStream(
fos);
ImageParam.ImageParamBuilder builder = ImageParam.getBuilder();
ImageParam[] param = new ImageParam[1];
TIFFOptions tiffOptions = new TIFFOptions();
tiffOptions.setTiffCompression(Compression.CCITTFAX4);
builder.imageOptions(tiffOptions);
builder.colorType(ImageColorType.BILEVEL);
param[0] = builder.build();
TIFFTweaker.writeMultipageTIFF(rout, param, images);
rout.close();
fos.close();
}
}
Or is there another library to write multi-page TIFFs?
EDIT:
Thanks to dragon66 the bug in icafe is now fixed. In the meantime I experimented with other libraries and also with invoking ghostscript. As I think ghostscript is very reliable as id is a widely used tool, on the other hand I have to rely that the user of my code has an ghostscript-installation, something like this:
/**
* Converts a given pdf as specified by its path to an tiff using group 4 compression
*
* #param pdfFilePath The absolute path of the pdf
* #param tiffFilePath The absolute path of the tiff to be created
* #param dpi The resolution of the tiff
* #throws MyException If the conversion fails
*/
private static void convertPdfToTiffGhostscript(String pdfFilePath, String tiffFilePath, int dpi) throws MyException {
// location of gswin64c.exe
String ghostscriptLoc = context.getGhostscriptLoc();
// enclose src and dest. with quotes to avoid problems if the paths contain whitespaces
pdfFilePath = "\"" + pdfFilePath + "\"";
tiffFilePath = "\"" + tiffFilePath + "\"";
logger.debug("invoking ghostscript to convert {} to {}", pdfFilePath, tiffFilePath);
String cmd = ghostscriptLoc + " -dQUIET -dBATCH -o " + tiffFilePath + " -r" + dpi + " -sDEVICE=tiffg4 " + pdfFilePath;
logger.debug("The following command will be invoked: {}", cmd);
int exitVal = 0;
try {
exitVal = Runtime.getRuntime().exec(cmd).waitFor();
} catch (Exception e) {
logger.error("error while converting to tiff using ghostscript", e);
throw new MyException(ErrorMessages.GHOSTSTSCRIPT_ERROR, e);
}
if (exitVal != 0) {
logger.error("error while converting to tiff using ghostscript, exitval is {}", exitVal);
throw new MyException(ErrorMessages.GHOSTSTSCRIPT_ERROR);
}
}
I found that the produced tif from ghostscript strongly differs in quality from the tiff produced by icafe (the group 4 tiff from ghostscript looks greyscale-like)

It's been a while since the question was asked and I finally find time and a wonderful ordered dither matrix which allows me to give some details on how "icafe" can be used to get similar or better results than calling external ghostscript executable. Some new features were added to "icafe" recently such as better quantization and ordered dither algorithms which is used in the following example code.
Here the sample pdf I am going to use is princeCatalogue. Most of the following code is from the OP with some changes due to package name change and more ImageParam control settings.
import java.awt.image.BufferedImage;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import com.icafe4j.image.ImageColorType;
import com.icafe4j.image.ImageParam;
import com.icafe4j.image.options.TIFFOptions;
import com.icafe4j.image.quant.DitherMethod;
import com.icafe4j.image.quant.DitherMatrix;
import com.icafe4j.image.tiff.TIFFTweaker;
import com.icafe4j.image.tiff.TiffFieldEnum.Compression;
import com.icafe4j.io.FileCacheRandomAccessOutputStream;
import com.icafe4j.io.RandomAccessOutputStream;
public class Pdf2TiffConverter {
public static void main(String[] args) {
String pdf = "princecatalogue.pdf";
PDDocument pddoc = null;
try {
pddoc = PDDocument.load(pdf);
} catch (IOException e) {
}
try {
savePdfAsTiff(pddoc);
} catch (IOException e) {
}
}
private static void savePdfAsTiff(PDDocument pdf) throws IOException {
BufferedImage[] images = new BufferedImage[pdf.getNumberOfPages()];
for (int i = 0; i < images.length; i++) {
PDPage page = (PDPage) pdf.getDocumentCatalog().getAllPages()
.get(i);
BufferedImage image;
try {
// image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 288); //works
image = page.convertToImage(BufferedImage.TYPE_INT_RGB, 300); // does not work
images[i] = image;
} catch (IOException e) {
e.printStackTrace();
}
}
FileOutputStream fos = new FileOutputStream("a.tiff");
RandomAccessOutputStream rout = new FileCacheRandomAccessOutputStream(
fos);
ImageParam.ImageParamBuilder builder = ImageParam.getBuilder();
ImageParam[] param = new ImageParam[1];
TIFFOptions tiffOptions = new TIFFOptions();
tiffOptions.setTiffCompression(Compression.CCITTFAX4);
builder.imageOptions(tiffOptions);
builder.colorType(ImageColorType.BILEVEL).ditherMatrix(DitherMatrix.getBayer8x8Diag()).applyDither(true).ditherMethod(DitherMethod.BAYER);
param[0] = builder.build();
TIFFTweaker.writeMultipageTIFF(rout, param, images);
rout.close();
fos.close();
}
}
For ghostscript, I used command line directly with the same parameters provided by the OP. The screenshots for the first page of the resulted TIFF images are showing below:
The lefthand side shows the output of "ghostscript" and the righthand side the output of "icafe". It can be seen, at least in this case, the output from "icafe" is better than the output from "ghostscript".
Using CCITTFAX4 compression, the file size from "ghostscript" is 2.22M and the file size from "icafe" is 2.08M. Both are not so good given the fact dither is used while creating the black and white output. In fact, a different compression algorithm will create way smaller file size. For example, using LZW, the same output from "icafe" is only 634K and if using DEFLATE compression the output file size went down to 582K.

Here's some code to save in a multipage tiff which I use with PDFBox. It requires the TIFFUtil class from PDFBox (it isn't public, so you have to make a copy).
void saveAsMultipageTIFF(ArrayList<BufferedImage> bimTab, String filename, int dpi) throws IOException
{
Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("tiff");
ImageWriter imageWriter = writers.next();
ImageOutputStream ios = ImageIO.createImageOutputStream(new File(filename));
imageWriter.setOutput(ios);
imageWriter.prepareWriteSequence(null);
for (BufferedImage image : bimTab)
{
ImageWriteParam param = imageWriter.getDefaultWriteParam();
IIOMetadata metadata = imageWriter.getDefaultImageMetadata(new ImageTypeSpecifier(image), param);
param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
TIFFUtil.setCompressionType(param, image);
TIFFUtil.updateMetadata(metadata, image, dpi);
imageWriter.writeToSequence(new IIOImage(image, null, metadata), param);
}
imageWriter.endWriteSequence();
imageWriter.dispose();
ios.flush();
ios.close();
}
I experimented on this for myself some time ago by using this code:
https://www.java.net/node/670205 (I used solution 2)
However...
If you create an array with lots of images, your memory consumption
really goes up. So it would probably be better to render an image, then
add it to the tiff file, then render the next page and lose the
reference of the previous one so that the gc can get the space if needed.

Refer to my github code for an implementation with PDFBox.

Since some dependencies used by solutions for this problem looks not maintained. I got a solution by using latest version (2.0.16) pdfbox:
ByteArrayOutputStream imageBaos = new ByteArrayOutputStream();
ImageOutputStream output = ImageIO.createImageOutputStream(imageBaos);
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try (final PDDocument document = PDDocument.load(new File("/tmp/tmp.pdf"))) {
PDFRenderer pdfRenderer = new PDFRenderer(document);
int pageCount = document.getNumberOfPages();
BufferedImage[] images = new BufferedImage[pageCount];
// ByteArrayOutputStream[] baosArray = new ByteArrayOutputStream[pageCount];
writer.setOutput(output);
ImageWriteParam params = writer.getDefaultWriteParam();
params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
// Compression: None, PackBits, ZLib, Deflate, LZW, JPEG and CCITT
// variants allowed
params.setCompressionType("Deflate");
writer.prepareWriteSequence(null);
for (int page = 0; page < pageCount; page++) {
BufferedImage image = pdfRenderer.renderImageWithDPI(page, DPI, ImageType.RGB);
images[page] = image;
IIOMetadata metadata = writer.getDefaultImageMetadata(new ImageTypeSpecifier(image), params);
writer.writeToSequence(new IIOImage(image, null, metadata), params);
// ImageIO.write(image, "tiff", baosArray[page]);
}
System.out.println("imageBaos size: " + imageBaos.size());
// Finished write to output
writer.endWriteSequence();
document.close();
} catch (IOException e) {
e.printStackTrace();
throw new Exception(e);
} finally {
// avoid memory leaks
writer.dispose();
}
Then you may using imageBaos write to your local file. But if you want to pass your image to ByteArrayOutputStream and return to privious method like me. Then we need other steps.
After processing is done, the image bytes would be available in the ImageOutputStream
output object. We need to position the offset to the beginning of the output object and then read the butes to write to new ByteArrayOutputStream, a concise way like this:
ByteArrayOutputStream bos = new ByteArrayOutputStream();
long counter = 0;
while (true) {
try {
bos.write(ios.readByte());
counter++;
} catch (EOFException e) {
System.out.println("End of Image Stream");
break;
} catch (IOException e) {
System.out.println("Error processing the Image Stream");
break;
}
}
return bos
Or you can just ImageOutputStream.flush() at end to get your imageBaos Byte then return.

Inspired by Yusaku answer,
I made my own version,
This can convert multiple pdf pages to a byte array.
I Used pdfbox 2.0.16 in combination with imageio-tiff 3.4.2
//PDF converter to tiff toolbox method.
private byte[] bytesToTIFF(#Nonnull byte[] in) {
int dpi = 300;
ImageWriter writer = ImageIO.getImageWritersByFormatName("TIFF").next();
try(ByteArrayOutputStream imageBaos = new ByteArrayOutputStream(255)){
writer.setOutput(ImageIO.createImageOutputStream(imageBaos));
writer.prepareWriteSequence(null);
PDDocument document = PDDocument.load(in);
PDFRenderer pdfRenderer = new PDFRenderer(document);
ImageWriteParam params = writer.getDefaultWriteParam();
for (int page = 0; page < document.getNumberOfPages(); page++) {
BufferedImage image = pdfRenderer.renderImageWithDPI(page, dpi, ImageType.RGB);
IIOMetadata metadata = writer.getDefaultImageMetadata(new ImageTypeSpecifier(image), params);
writer.writeToSequence(new IIOImage(image, null, metadata), params);
}
LOG.trace("size found: {}", imageBaos.size());
writer.endWriteSequence();
writer.reset();
return imageBaos.toByteArray();
} catch (Exception ex) {
LOG.warn("can't instantiate the bytesToTiff method with: PDF", ex);
} finally {
writer.dispose();
}
}

Writing a byte array to a bmp java

As the title suggests, I'm attempting to write a byte array to a bmp file in Java. Currently, my program successfully writes data to the file location, however it appears to be missing data and cannot be opened because of it. Of the two functions listed below the goal is:
fromImage takes a grayscale bmp image byte data and converts each integer to a binary string, which is then stored in a LinkedList node. toImage takes that LinkedList, converts the binary strings back to integers and then writes the new byte array back to another file.
public static LinkedList<String> fromImage(BufferedImage img) {
LinkedList<String> new_buff = new LinkedList<String>();
//try{
//img = ImageIO.read(new File("img/lena.bmp"));
byte[] byte_buffer = ((DataBufferByte) img.getRaster().getDataBuffer()).getData();
for(byte b : byte_buffer){
String buffer;
buffer = Integer.toBinaryString(b & 255 | 256).substring(1);
new_buff.addLast(buffer);
//System.out.println(buffer);
}
//}catch(IOException e){}
System.out.println("Exiting fromImage");
return new_buff;
}
// Save a binary number as a BMP image
// Image input hardcoded atm
public static BufferedImage toImage(LinkedList<String> bi) {
BufferedImage img = null;
int b;
byte[] bytes = new byte[bi.size()];
for(int i = 0; i < bi.size(); i++){
String temp = bi.get(i);
b = Integer.parseInt(temp);
bytes[i] = (byte) b;
//System.out.println(i);
}
System.out.println("Exiting For loop");
try{
Files.write(Paths.get("img/encrypted.bmp"), bytes);
//img = ImageIO.read(new File("img/lena.bmp"));
//ImageIO.write(img, "bmp", new File("img/encrypted.bmp"));
//img = ImageIO.read(new File("img/encrypted.bmp"));
}catch(IOException e){}
System.out.println("Exiting toImage");
return img;
}
So ultimately, my question is - Where is the data I'm missing, why am I missing it, and what can I do to fix it?

BMP has a file structure.
Here, you are writing to a file named "encrypted.bmp", so I suppose your bytes are the encryption of something, and thus do not represent a valid bmp file.
You will have to comply to the BMP file structure, adding a header and footer so that your bytes are eg. the pixel part of the BMP file.
The easiest way to do that is by writing your image to a BufferedImage img and then use ImageIO.write(img, "BMP", new File("encrypted.bmp")).

I just want to turn Ekleog's answer into a usable code:
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;
import java.awt.image.Raster;
import java.io.File;
import java.io.IOException;
public static boolean arrayToBMP(byte[] pixelData, int width, int height, File outputFile) throws IOException {
BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
img.setData(Raster.createRaster(img.getSampleModel(), new DataBufferByte(pixelData, pixelData.length), null));
return javax.imageio.ImageIO.write(img, "bmp", outputFile);
}

Output image becomes black after conversion from FileInputStream to Image

I am trying to merge two TIFF images which are in form of FileInputStream into a single Tiff image. Although the image is getting merged the output file is coming up as Black. While comparing the original image and the converted image I could see that the bit depth of the converted image changes to 1. Could anybody provide a solution to this?
The code that I am using is:
public class MergerTiffUsingBuffer {
public static void main(String[] args) {
File imageFile1 = new File("D:/Software/pdfbox-1.3.1.jar/tiff/FLAG_T24.TIF");
File imageFile2 = new File("D:/Software/pdfbox-1.3.1.jar/tiff/CCITT_3.TIF");
try {
FileInputStream fis1 = new FileInputStream(imageFile1);
FileInputStream fis2 = new FileInputStream(imageFile2);
List<BufferedImage> bufferedImages=new ArrayList<>();
List<FileInputStream> inputStreams=new ArrayList<>();
inputStreams.add(fis1);
inputStreams.add(fis2);
Iterator<?> readers = ImageIO.getImageReadersByFormatName("tiff");
ImageReader reader = (ImageReader) readers.next();
for(FileInputStream inputStream:inputStreams){
ImageInputStream iis = ImageIO.createImageInputStream(inputStream);
reader.setInput(iis);
ImageReadParam param = reader.getDefaultReadParam();
Image image = reader.read(0, param);
BufferedImage bufferedImage = new BufferedImage(image.getWidth(null), image.getHeight(null), BufferedImage.TYPE_INT_RGB);
OutputStream out = new FileOutputStream("D:/Software/pdfbox-1.3.1.jar/tiff/MergedTiff.TIF");
BufferedImage binarized = new BufferedImage(bufferedImage.getWidth(), bufferedImage.getHeight(),BufferedImage.TYPE_BYTE_BINARY);
ImageIO.write(binarized, "tiff", out);
bufferedImages.add(bufferedImage);
}
System.out.println(bufferedImages.size());
} catch (IOException e2) {
e2.printStackTrace();
}
}
}

You seem to be a little confused about how to copy image data. Simply creating a new, blank image, by passing the dimensions of another image, will not copy it... So a fully black image is what I would expect after running your code.
Replace your for loop with something like this:
for (FileInputStream inputStream : inputStreams) {
ImageInputStream iis = ImageIO.createImageInputStream(inputStream);
reader.setInput(iis);
BufferedImage image = reader.read(0, null); // a) BufferedImage is returned! b) null param is fine!
BufferedImage binarized = new BufferedImage(image.getWidth(), image.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
// The following 7 lines is the important part you were missing:
Graphics2D g = binarized.createGraphics();
try {
g.drawImage(image, 0, 0, null);
}
finally {
g.dispose();
}
OutputStream out = new FileOutputStream("D:/Software/pdfbox-1.3.1.jar/tiff/MergedTiff.TIF");
ImageIO.write(binarized, "tiff", out); // You probably want to check return value (true/false)!
bufferedImages.add(image);
}

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.

Java. Broken image from URL - java

I don't know if this is the only problem, but you might write more than you get. I suggest that you change your writing code to: int len; while ((len=in.read(b))!=-1) out.write(b, 0, len); Otherwise, if the last buffer is not exactly 512 bytes long, you'll write too much

Related

How to disable compression when writing JPEG in java?

How to convert pdf file in to multipagetiff images using pdfBox? [duplicate]

Converting PDF to multipage tiff (Group 4)

Writing a byte array to a bmp java

Output image becomes black after conversion from FileInputStream to Image

Categories

Resources