I have a set of images, and I want to detect whether each image is a document or a picture.
My first approach is to look at the pixel colors: if the image contains only black and white, it's a document, otherwise it's a picture. This is not very reliable, though, because the set may contain clip art or other black-and-white images.
So I'm searching for a white paper or algorithm that can detect whether an image is a document or a picture.
Thanks in advance.
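For reference, the first approach described above could be sketched roughly as follows. This is only an illustration of the pixel-color idea, not a real document classifier: the class name `MonochromeRatio`, the method `blackWhiteFraction`, and the `tolerance` parameter are all made up for this sketch, and it has exactly the weakness mentioned above (clip art scores just as high as a scanned page).

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: measures how much of an image is near-black or near-white.
final class MonochromeRatio {

    // Returns the fraction (0.0 to 1.0) of pixels whose R, G and B channels are
    // all within `tolerance` of 0 (black) or all within `tolerance` of 255 (white).
    static double blackWhiteFraction(BufferedImage img, int tolerance) {
        long hits = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                boolean nearBlack = r <= tolerance && g <= tolerance && b <= tolerance;
                boolean nearWhite = r >= 255 - tolerance
                        && g >= 255 - tolerance
                        && b >= 255 - tolerance;
                if (nearBlack || nearWhite) {
                    hits++;
                }
            }
        }
        return (double) hits / ((long) img.getWidth() * img.getHeight());
    }
}
```

A caller would compare the returned fraction against some threshold chosen by experiment; that threshold is an assumption and would need tuning on the actual image set.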
I want to add the following PNG image to my PDF:
I use the following code to do it:
Image img = PngImage.getImage(filename);
img.setBorder(Image.NO_BORDER);
img.setAlignment(Element.ALIGN_CENTER);
img.scaleAbsolute(width,height);
document.add(img);
The image contains a bar graph which has no outer border. When I add the image to my pdf, it shows an outer border, but only for the bottom, left and top sides:
I want to remove the border in the PDF, but the above code does not make that happen.
I am using iText-2.1.5.
In the comments, I claimed that your original image does have a border. You claim it doesn't have a border. Now that you've shared the image, we can check the facts to see who is right.
As it turns out, I was right. When I open the image in GIMP, I clearly see a transparent border:
Maybe you don't see it, because you are looking at the image in Paint or maybe you consider "transparent" and "white" to be the same color. Obviously that assumption is wrong.
I created a PDF containing the image you shared and when I open this PDF using iText RUPS, I see something like this:
PNG is not supported in ISO-32000-1 (aka the PDF specification), hence software that wants to introduce a PNG into a PDF file needs to convert that PNG to another format. In the case of iText, "normal" PNGs are converted to a bitmap with filter /FlateDecode.
In your case, you have a PNG with transparency. In ISO-32000-1, transparent images are always stored as two images: you have the opaque image (in my screen shot, /Img1 with object number 2) and the image mask (in my screen shot, /Img0 with object number 1).
If you look closely at the image mask (the image that makes the opaque image transparent), you see that it's a black and white image that shows a very small border. This image is shown in the lower-right panel where it says "Stream" (this is where the image stream is rendered). This very small border is the transparent border we can also see in the GIMP (or other image viewers that support transparent images).
If this border is transparent, then why do you see it in a PDF viewer? Well, this border is treated as a line with zero width. In PDF viewers, a line with zero width is shown using the smallest width that can be shown on the device that is being used to view the PDF. If you zoom into the PDF, you'll notice that the width of the line remains constant.
Summarized: you claimed that your image didn't have any border, and that a border was added by iText. I have proven you wrong: the image does have a transparent border and iText correctly introduces this transparent border as a mask. The PDF viewer shows this border as a zero-width line in accordance with ISO-32000-1.
You can solve your problem by removing the transparent border in the original image. For example: I flattened the image using the GIMP. The result is this image:
This image no longer has a transparent border and when you introduce it into a PDF, no border is shown, and no mask is added to the PDF:
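If you would rather flatten the image in code than in the GIMP, something along these lines might work. It is a minimal sketch using only the JDK's Java2D classes, not iText; the class name `FlattenAlpha` and the choice of background color are assumptions of this sketch.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

// Hypothetical helper: paints an image with transparency onto an opaque background.
final class FlattenAlpha {

    // Returns a copy of `src` without an alpha channel; transparent areas
    // take the given background color.
    static BufferedImage flatten(BufferedImage src, Color background) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setColor(background);
            g.fillRect(0, 0, out.getWidth(), out.getHeight());
            g.drawImage(src, 0, 0, null); // source is composited over the background
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The flattened result could then be written back to disk with `ImageIO.write(flattened, "png", file)` and loaded exactly as in the question. Since it no longer has an alpha channel, there should be nothing left to turn into a mask.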
In short, I want to erase part of the content of a PDF page by covering it with the background color, without changing the page size. Here are more details:
Say the PDF page size is A4, the content can be text or images, and the area to erase is a 1 cm margin around the page (the blue part).
Is there any way to do this?
Update: my attempt with a clipping path
// render text and image
//...
// then erase
PdfContentByte clipCB = pdfWriter.getDirectContent();
clipCB.saveState();
clipCB.setColorStroke(Color.WHITE);
clipCB.rectangle(100,100,600, 600);
clipCB.clip();
clipCB.newPath();
clipCB.restoreState();
There are a few options. The easiest thing to do would be to draw 4 rectangles in the background color around the edges. A more elegant approach would be to set a clipping path prior to rendering the page's contents.
Here's an example using a clipping path: http://www.java2s.com/Tutorial/Java/0419__PDF/Cliparegion.htm
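For the four-rectangles option, the geometry can be worked out as below. This sketch only computes the rectangles in PDF user space (origin at the lower-left corner, 72 points per inch); the class `MarginBands` and its method are hypothetical names, and it deliberately does not call iText.

```java
import java.awt.geom.Rectangle2D;
import java.util.List;

// Hypothetical helper: computes four non-overlapping bands that cover a page margin.
final class MarginBands {

    // 1 cm expressed in PDF points (72 points per inch, 2.54 cm per inch).
    static final float POINTS_PER_CM = 72f / 2.54f;

    // Returns bottom, top, left and right bands, in that order.
    // The left and right bands stop short of the corners so nothing overlaps.
    static List<Rectangle2D.Float> bands(float pageWidth, float pageHeight, float margin) {
        float innerHeight = pageHeight - 2 * margin;
        return List.of(
                new Rectangle2D.Float(0, 0, pageWidth, margin),
                new Rectangle2D.Float(0, pageHeight - margin, pageWidth, margin),
                new Rectangle2D.Float(0, margin, margin, innerHeight),
                new Rectangle2D.Float(pageWidth - margin, margin, margin, innerHeight));
    }
}
```

For an A4 page that would be something like `MarginBands.bands(595f, 842f, MarginBands.POINTS_PER_CM)`. With iText, each rectangle would presumably be passed to `PdfContentByte.rectangle(x, y, w, h)` and filled after setting the fill color, once the page content has been drawn.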
I'm uploading an image, and from that image I want to extract the white rectangular portion that contains some numbers. Even just the location (the pixel coordinates of that whole rectangular portion) would be sufficient for me.
I want to send that block of the image to Tesseract for image processing.
Can anyone tell me how I can get it?
Note: I do not want to use OpenCV.
Thanks in advance!
When you zoom in on a PDF, the text is not distorted, but the images are.
How can I make an image enlarge without distortion, without converting it to a vector graphic?
The problem you are facing is about your image's DPI. Most PDF viewers draw the image at a fixed dpi regardless of the image's dpi. My guess is that you are using 150 dpi or so. Try playing with the dpi value to find a good fit between the area you are trying to draw your image into and the size of the original image.
Also make sure your original image is big enough for the area you are trying to cover.
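To put numbers on "big enough", the arithmetic can be sketched like this. It assumes the default PDF user space of 72 points per inch; the class `EffectiveDpi` and both method names are made up for this example.

```java
// Hypothetical helper: relates image pixel size to the size it is drawn at in a PDF.
final class EffectiveDpi {

    // Default PDF user space: 72 points per inch.
    static final double POINTS_PER_INCH = 72.0;

    // Pixels per inch that an image actually delivers when drawn at this width.
    static double effectiveDpi(int pixelWidth, double drawnWidthPoints) {
        return pixelWidth / (drawnWidthPoints / POINTS_PER_INCH);
    }

    // Minimum pixel width needed to reach `targetDpi` at this drawn width.
    static int pixelsNeeded(double drawnWidthPoints, double targetDpi) {
        return (int) Math.ceil(drawnWidthPoints / POINTS_PER_INCH * targetDpi);
    }
}
```

For example, a 600-pixel-wide image drawn 288 points (4 inches) wide gives 150 dpi; reaching 300 dpi at the same drawn width would take 1200 pixels.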
This is what I need to do. I need to create square thumbnails from a standard photo. The photos are usually in portrait layout, so I need to shave off the bottom part of the photo so that the height is equal to the width.
If the image is in landscape layout, then I need to shave off an equal number of pixels from the left and right to make it square.
Any ideas how to do this?
My image is already a BufferedImage object.
You can use getSubimage to retrieve a cropped version of the original image.
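A minimal sketch of that, following the rules in the question (portrait: cut the bottom; landscape: cut equally from left and right). The class and method names are just placeholders.

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: crops a BufferedImage to a square as described in the question.
final class SquareCrop {

    static BufferedImage cropToSquare(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        if (h >= w) {
            // Portrait (or already square): keep the top w rows, drop the bottom.
            return src.getSubimage(0, 0, w, w);
        }
        // Landscape: drop an equal number of columns from the left and right.
        // If (w - h) is odd, the extra column comes off the right side.
        int x = (w - h) / 2;
        return src.getSubimage(x, 0, h, h);
    }
}
```

Note that `getSubimage` returns a view that shares pixel data with the original, so changes to one show up in the other. If you need an independent thumbnail, draw the result into a new BufferedImage.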