I want to remove the image background with OpenCV in Android. The code is working fine, but the output quality is not what I expect. I followed the Java documentation for reference:
https://opencv-java-tutorials.readthedocs.io/en/latest/07-image-segmentation.html
Thanks
original Image
My output
Expected output
My code snippet in Android:
private fun doBackgroundRemoval(frame: Mat): Mat? {
    // init
    val hsvImg = Mat()
    val hsvPlanes: List<Mat> = ArrayList()
    val thresholdImg = Mat()
    var thresh_type = Imgproc.THRESH_BINARY_INV
    thresh_type = Imgproc.THRESH_BINARY

    // threshold the image with the average hue value
    hsvImg.create(frame.size(), CvType.CV_8U)
    Imgproc.cvtColor(frame, hsvImg, Imgproc.COLOR_BGR2HSV)
    Core.split(hsvImg, hsvPlanes)

    // get the average hue value of the image
    val threshValue: Double = getHistAverage(hsvImg, hsvPlanes[0])
    threshold(hsvPlanes[0], thresholdImg, threshValue, 78.0, thresh_type)
    Imgproc.blur(thresholdImg, thresholdImg, Size(1.toDouble(), 1.toDouble()))

    val kernel1 =
        Imgproc.getStructuringElement(Imgproc.MORPH_ELLIPSE, Size(11.toDouble(), 11.toDouble()))
    val kernel2 = Mat.ones(3, 3, CvType.CV_8U)

    // dilate to fill gaps, erode to smooth edges
    Imgproc.dilate(thresholdImg, thresholdImg, kernel1, Point(-1.toDouble(), -1.toDouble()), 1)
    Imgproc.erode(thresholdImg, thresholdImg, kernel2, Point(-1.toDouble(), -1.toDouble()), 7)
    threshold(thresholdImg, thresholdImg, threshValue, 255.0, Imgproc.THRESH_BINARY_INV)

    // create the new image
    val foreground = Mat(
        frame.size(), CvType.CV_8UC3, Scalar(
            255.toDouble(),
            255.toDouble(),
            255.toDouble()
        )
    )
    frame.copyTo(foreground, thresholdImg)

    val img_bitmap =
        Bitmap.createBitmap(foreground.cols(), foreground.rows(), Bitmap.Config.ARGB_8888)
    Utils.matToBitmap(foreground, img_bitmap)
    imageView.setImageBitmap(img_bitmap)
    return foreground
}
The task, as you have seen, is not trivial at all. OpenCV has a segmentation algorithm called "GrabCut" that tries to solve this particular problem. The algorithm is pretty good at classifying background and foreground pixels; however, it needs very specific information to work. It can operate in two modes:
1st Mode (Mask Mode): Using a binary mask (same size as the original input) where 100% definite background pixels are marked, as well as 100% definite foreground pixels. You don't have to mark every pixel on the image, just a region where you are sure the algorithm will find either class of pixels.
2nd Mode (Foreground ROI): Using a bounding box that encloses 100% definite foreground pixels.
Now, I use the notation "100% definite" to label those pixels you are 100% sure correspond to either the background or the foreground. The algorithm classifies the pixels into four possible classes: "Definite Background", "Probable Background", "Definite Foreground" and "Probable Foreground". It will predict both Probable Background and Probable Foreground pixels, but it needs a priori information on where to find at least "Definite Foreground" pixels.
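For reference, this is roughly what a call in the 1st (Mask) mode looks like in Python. It is only a minimal sketch: the image name and the marked regions are made-up assumptions for illustration, not values taken from the example below.
import cv2
import numpy as np

# Hypothetical input image:
image = cv2.imread("input.png")

# In mask mode every pixel is pre-labeled with one of the four GrabCut classes.
# Start with "Probable Background" everywhere:
mask = np.full(image.shape[:2], cv2.GC_PR_BGD, np.uint8)

# Mark regions you are 100% sure about (coordinates are illustrative only):
mask[0:50, 0:50] = cv2.GC_BGD        # definite background patch
mask[100:200, 100:200] = cv2.GC_FGD  # definite foreground patch

# Internal models used by the algorithm:
bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)

# In mask mode the rectangle argument is ignored, so None is passed:
mask, bgModel, fgModel = cv2.grabCut(image, mask, None, bgModel, fgModel, 5, mode=cv2.GC_INIT_WITH_MASK)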
With that said, we can use GrabCut in its 2nd mode (Rectangle ROI) to try and segment the input image. We can try and get a first, rough, binary mask of the input. This will mark where we are sure the algorithm can find foreground pixels. We will feed this rough mask to the algorithm and check out the results. Now, the method is not easy and its automation is not straightforward; there's some manual information we will set that works particularly well for this input image. I don't know the Java implementation of OpenCV, so I'm giving you the solution for Python. Hopefully you will be able to port it. This is the general outline of the algorithm:
Get a first rough mask of the foreground object via thresholding
Detect contours on the rough mask to retrieve a bounding rectangle
The bounding rectangle will serve as input ROI for the GrabCut algorithm
Set the parameters needed for the GrabCut algorithm
Clean the segmentation mask obtained by GrabCut
Use the segmentation mask to finally segment the foreground object
This is the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "backgroundTest.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# (Optional) Deep copy for results:
inputImageCopy = inputImage.copy()
# Convert RGB to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Adaptive Thresholding
windowSize = 31
windowConstant = 11
binaryImage = cv2.adaptiveThreshold(grayscaleImage, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY_INV, windowSize, windowConstant)
The first step is to get the rough foreground mask using Adaptive Thresholding. Here, I've used the ADAPTIVE_THRESH_MEAN_C method, where the (local) threshold value is the mean of a neighborhood area on the input image. This yields the following image:
It's pretty rough, right? We can clean this up a little bit using some morphology. I use a Closing with a rectangular kernel of size 3 x 3 and 10 iterations to join the big blobs of white pixels. I've wrapped the OpenCV functions inside custom functions that save me the typing of some lines. These helper functions are presented at the end of this post. For now, this step is as follows:
# Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 10 iterations
binaryImage = morphoOperation(binaryImage, 3, 10, "Closing")
This is the rough mask after filtering:
A little bit better. Ok, we can now search for the bounding box of the biggest contour. A search for the outer contours via cv2.RETR_EXTERNAL will suffice for this example, as we can safely ignore child contours, like this:
# Find the EXTERNAL contours on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# This list will store the target bounding box
maskRect = []
Additionally, let's get a list ready where we will store the target bounding rectangle. Let's now search through the detected contours. I've also implemented an area filter in case some noise is present, so blobs below a certain area threshold are ignored:
# Look for the outer bounding boxes (no children):
for i, c in enumerate(contours):

    # Get blob area:
    currentArea = cv2.contourArea(c)

    # Get the bounding rectangle:
    boundRect = cv2.boundingRect(c)

    # Set a minimum area
    minArea = 1000

    # Look for the target contour:
    if currentArea > minArea:

        # Found the target bounding rectangle:
        maskRect = boundRect

        # (Optional) Draw the rectangle on the input image:
        # Get the dimensions of the bounding rect:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]

        # (Optional) Set color and draw:
        color = (0, 0, 255)
        cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                      (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)

# (Optional) Show image:
cv2.imshow("Bounding Rectangle", inputImageCopy)
cv2.waitKey(0)
Optionally you can draw the bounding box found by the algorithm. This is the resulting image:
It is looking good. Note that some obvious background pixels are also enclosed by the ROI. GrabCut will try to re-classify these pixels into their proper class, i.e., "Definite Background". Alright, let's prepare the data for GrabCut:
# Create mask for Grab n Cut,
# The mask is a uint8 type, same dimensions as
# original input:
mask = np.zeros(inputImage.shape[:2], np.uint8)
# Grab n Cut needs two empty matrices of
# Float type (64 bits) and size 1 (rows) x 65 (columns):
bgModel = np.zeros((1, 65), np.float64)
fgModel = np.zeros((1, 65), np.float64)
We need to prepare three matrices/numpy arrays/whatever data type is used to represent images in Java. The first is where the segmentation mask obtained by GrabCut will be stored. This mask will have values from 0 to 3 to denote the class of each pixel on the original input. The bgModel and fgModel matrices are used internally by the algorithm to store the statistical model of the foreground and background. Be aware that both of these matrices are float matrices. Lastly, GrabCut is an iterative algorithm. It will run for n iterations. Ok, let's run GrabCut:
# Run Grab n Cut on INIT_WITH_RECT mode:
grabCutIterations = 5
mask, bgModel, fgModel = cv2.grabCut(inputImage, mask, maskRect, bgModel, fgModel, grabCutIterations, mode=cv2.GC_INIT_WITH_RECT)
Ok, the classification is done. You can try and convert mask to a visible (image) type to check out the labels of each pixel. This is optional, but should you wish to do so, you'd get 4 matrices, one for each class (a small sketch for generating them is shown after these examples). For example, for the "Definite Background" class, GrabCut found these are the pixels belonging to such class (in white):
The pixels belonging to the "Probable Background" class are these:
That's pretty good, huh? Here are the pixels belonging to the "Probable Foreground" class:
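In case you want to reproduce these per-class visualizations, a small sketch (reusing the mask variable obtained above) could look like this; each class is simply rendered as white pixels:
# (Optional) Visualize each GrabCut class as a white-on-black mask:
classLabels = {
    "Definite Background": cv2.GC_BGD,
    "Definite Foreground": cv2.GC_FGD,
    "Probable Background": cv2.GC_PR_BGD,
    "Probable Foreground": cv2.GC_PR_FGD,
}

for className, classValue in classLabels.items():
    # Pixels of this class become 255 (white), everything else 0 (black):
    classMask = np.where(mask == classValue, 255, 0).astype("uint8")
    cv2.imshow(className, classMask)
    cv2.waitKey(0)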
Very nice. Let's create the final segmentation mask, because mask is not an image, it is just an array containing labels for each pixel. We will use the Definite Background and Probable Background pixels to set the final mask; we can then "normalize" the data range and convert it to uint8 to obtain an actual image:
# Set all definite background (0) and probable background pixels (2)
# to 0 while definite foreground and probable foreground pixels are
# set to 1
outputMask = np.where((mask == cv2.GC_BGD) | (mask == cv2.GC_PR_BGD), 0, 1)
# Scale the mask from the range [0, 1] to [0, 255]
outputMask = (outputMask * 255).astype("uint8")
This is the actual segmentation mask:
Alright, we can clean this image up a little bit, because there are some small holes produced by misclassifying foreground pixels as background pixels. Let's apply another morphological closing, this time using 5 iterations:
# (Optional) Apply a morphological closing with:
# Rectangular SE size 3 x 3 and 5 iterations:
outputMask = morphoOperation(outputMask, 3, 5, "Closing")
Finally, use this outputMask in an AND with the original image to produce the final segmented result:
# Apply a bitwise AND to the image using our mask generated by
# GrabCut to generate the final output image:
segmentedImage = cv2.bitwise_and(inputImage, inputImage, mask=outputMask)
cv2.imshow("Segmented Image", segmentedImage)
cv2.waitKey(0)
This is the final result:
If you need transparency on this image, it is very straightforward to use outputMask as the alpha channel (see the small sketch after the helper function below). This is the helper function I used earlier:
# Applies a morpho operation:
def morphoOperation(binaryImage, kernelSize, opIterations, opString):
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))

    # Perform Operation:
    if opString == "Closing":
        op = cv2.MORPH_CLOSE
    else:
        print("Morpho Operation not defined!")
        return None

    outImage = cv2.morphologyEx(binaryImage, op, morphKernel, None, None, opIterations,
                                cv2.BORDER_REFLECT101)

    return outImage
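As mentioned above, if you need transparency you can merge outputMask in as the alpha channel and save the result as a PNG. This is just a minimal sketch; the output file name is an assumption:
# Split the BGR channels and append the mask as the alpha channel:
b, g, r = cv2.split(inputImage)
transparentImage = cv2.merge([b, g, r, outputMask])

# PNG supports the extra alpha channel:
cv2.imwrite(path + "segmentedTransparent.png", transparentImage)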
The Code
import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank

img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))

cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
The Output
The Explanation
Import the necessary libraries:
import cv2
import numpy as np
Define a function to process an image to make it fit for proper contour detection. In the function, first convert the image to grayscale, and then detect its edges using the Canny edge detector. With the edges detected, we can dilate and erode them once to give the edges more body:
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_canny = cv2.Canny(img_gray, 10, 20)
    kernel = np.ones((13, 13))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=1)
    return cv2.erode(img_dilate, kernel, iterations=1)
Define a function to generate a mask for the image. After finding the contours of the image, define a grayscale blank image with the shape of the image, and draw every contour (of area greater than 500 to filter out noise) filled in onto the blank image. I also approximated the contours to smooth things out a bit:
def get_mask(img):
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    blank = np.zeros(img.shape[:2]).astype('uint8')
    for cnt in contours:
        if cv2.contourArea(cnt) > 500:
            peri = cv2.arcLength(cnt, True)
            approx = cv2.approxPolyDP(cnt, peri * 0.004, True)
            cv2.drawContours(blank, [approx], -1, 255, -1)
    return blank
Finally, read in the image and mask it using the cv2.bitwise_and method together with the get_mask function (which in turn uses the process function). Show the masked image at the end:
img = cv2.imread("crystal.jpg")
img_masked = cv2.bitwise_and(img, img, mask=get_mask(img))
cv2.imshow("Masked", img_masked)
cv2.waitKey(0)
Transparent Background
Instead of the cv2.bitwise_and method, you can use the cv2.merge method:
img = cv2.imread("crystal.jpg")
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
cv2.imwrite("masked_crystal.png", img_masked)
Resulting image (screenshot):
Explanation:
Keeping in mind that we have already imported the cv2 module and the numpy module as np, and have defined the process and get_mask functions, we can read in the image:
img = cv2.imread("crystal.jpg")
The cv2.split method takes in an image array and returns a list of every individual channel present in the image. In our case, we only have 3 channels, and in order to make the image transparent, we need a fourth channel: the alpha channel. The cv2.merge method does the opposite of cv2.split; it takes in a list of individual channels and returns an image array with the channels. So next we get the BGR channels of the image in a list, and concatenate the mask of the image as the alpha channel:
img_masked = cv2.merge(cv2.split(img) + [get_mask(img)])
Lastly, we can write the four-channel image to a file:
cv2.imwrite("masked_crystal.png", img_masked)
Here are some more examples of the cv2.merge method: Python cv2.merge() Examples
Related
I'm finding it difficult to find an adaptive image thresholding technique for mazes that will return either a high or low value to make sure that all the paths are the same color.
So far I have tried a fixed threshold, which obviously didn't work, and Otsu's method, which returns a value around the middle, which meant that some pixels were not converted properly.
original image - https://imgur.com/DqaUYfW
otsu's method - https://imgur.com/a/V5t6rqZ
desired output - https://imgur.com/a/yvXuAqC
Sorry, I don't have Java, so I just tried out some methods in Python and was able to get the desired output that you want. Hope it will help you.
import cv2
import numpy as np
image = cv2.imread("1.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_,thresh = cv2.threshold(gray,100,255,cv2.THRESH_BINARY)
cv2.imshow("thresh",thresh)
blur = cv2.GaussianBlur(gray,(5,5),0)
ret3,otsu = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
cv2.imshow("otsu",otsu)
adaptive_thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 29, 30)
cv2.imshow("adaptive_thresh",adaptive_thresh)
cv2.imshow("img",image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Otsu method
Fixed binary threshold
Adaptive threshold
I am trying to find the coordinates of a shape (or shapes) in a black-and-white image. I am using the findContours method for contour finding and approxPolyDP for approximating them to a polygon. The shape in the input image (see below) is processed text, and I need to find a 4-corner polygon for each field which fits around this shape using as little outside space as possible. The approxPolyDP function rarely gives me 4 corners (despite changing parameters), which I need in order to apply a perspective transform on the original image, skip the deskewing algorithm, and crop out the text. How can I find the best-fitting 4-corner polygons for each field (not rectangles)? I could not find any proper tutorial on how to do that; is it really hard? Below I present my current code in Java, the desired result, the input, and the current output.
NOTE: I would highly appreciate it if you could give me a method where HoughLines are not involved, as that method is slow (for mobile phones; that's why I am asking this question), but if it is the only possibility you know of to get the result I need, please post it; it would be appreciated.
Code for finding current shape(s):
Mat mask = new Mat(src.size(), CvType.CV_8UC3, new Scalar(0, 0, 0));
List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
Imgproc.findContours(src, contours, new Mat(), Imgproc.RETR_CCOMP, Imgproc.CHAIN_APPROX_SIMPLE);

for (int i = 0; i < contours.size(); i++)
{
    int contourSize = (int) contours.get(i).total();
    MatOfPoint2f curContour2f = new MatOfPoint2f(contours.get(i).toArray());
    Imgproc.approxPolyDP(curContour2f, curContour2f, 0.04 * Imgproc.arcLength(curContour2f, true), true);
    contours.set(i, new MatOfPoint(curContour2f.toArray()));
    Imgproc.drawContours(mask, contours, i, new Scalar(0, 255, 0), 3);
}
Average input:
Desired result (it's not a rectangle, corners do not have to be 90 degrees, but here must be 4 of them):
Average current output:
Other output example: the input picture here was more detailed (with some gaps), so the output is much worse than what I want it to be. Polygons inside other polygons are not a problem, but the main shape of the whole block has too many corners:
Thank you in advance.
I am making a small indoor navigation application for my project. The main idea behind my application is that I will be given a .pdf file or AutoCAD file (floor plan) for some area. I have to parse or get data from that image to find the open paths in the floor plan.
To determine open paths from an image, I also have to map the image content or data into some data structure, so that I can apply some pathfinding algorithms on it.
My problem is that I don't know how I can break my image into pixels or any other form to get data from it in my initial phase. Do I need to apply some image processing using MATLAB, or can it be achieved with Java or Python libraries?
This is a rather broad question, so I can only give hints on some of the relevant points.
How to read single pixels from an image in Java:
BufferedImage bi = ImageIO.read(new File("pathToYourImage"));
bi.getRGB(0 , 0);
This way you can load an image in Java and get the values of a single pixel ((0,0) in the example).
Data structure: The most common way of representing a floor plan or any other kind of collection of paths is a graph. There are several good libraries on the net for graphs, or you can implement it on your own.
Image processing: Since the image won't be b/w (I guess), you'll have to transform it before converting it into a graph, though the conversion to a graph wouldn't even be strictly necessary. The most common way would be to simply convert the image into a b/w image where black pixels are walls. Since the color of the pixels representing the floor might not be exactly equal everywhere, I added some imprecision (delta) to the comparison:
//comparison function
boolean isMatch(Color inp, Color toMatch)
{
    final int delta = 25;
    return (Math.abs(inp.getRed() - toMatch.getRed()) <= delta &&
            Math.abs(inp.getBlue() - toMatch.getBlue()) <= delta &&
            Math.abs(inp.getGreen() - toMatch.getGreen()) <= delta);
}

//color of pixels that don't represent obstacles
Color floor = getFloorColor();

//create a copy of the image for the transformation
BufferedImage floorPlan = new BufferedImage(getFloorPlan().getWidth(),
        getFloorPlan().getHeight(), BufferedImage.TYPE_INT_RGB);
floorPlan.getGraphics().drawImage(getFloorPlan(), 0, 0, null);

//color pixels that aren't walls or other obstacles white and obstacles/walls black
for (int i = 0; i < floorPlan.getWidth(); i++)
    for (int j = 0; j < floorPlan.getHeight(); j++)
        if (isMatch(new Color(floorPlan.getRGB(i, j)), floor))
            floorPlan.setRGB(i, j, Color.WHITE.getRGB());
        else
            floorPlan.setRGB(i, j, Color.BLACK.getRGB());
This image can now easily be either transformed into a graph, or used directly as representation of the graph.
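Purely as an illustration of that last step (the code above is Java, but the idea is language-agnostic), here is a minimal sketch in Python of turning such a black-and-white grid into a 4-connected graph; floor_plan is an assumed 2D array containing 0 for walls and 1 for floor:
# Build a 4-connected grid graph from a binary floor plan.
# floor_plan is a 2D array: 0 = wall/obstacle, 1 = walkable floor.
def build_graph(floor_plan):
    height, width = len(floor_plan), len(floor_plan[0])
    graph = {}
    for y in range(height):
        for x in range(width):
            if floor_plan[y][x] == 0:
                continue  # walls are not nodes
            neighbours = []
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height and floor_plan[ny][nx] == 1:
                    neighbours.append((nx, ny))
            graph[(x, y)] = neighbours
    return graph
Any standard pathfinding algorithm (BFS, Dijkstra, A*) can then be run on the resulting adjacency list.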
I have to prepare a training set for my Machine Learning course, in which, for a given face image, it gives you an answer representing the direction of the head (straight, left, right, up).
For this purpose I need to read a .pgm image file in Java and store its pixels in one row of a matrix X, and then store the appropriate answer for this image in a y vector. Finally, I will save these two arrays in a .mat file.
The problem is that when I try to read the pixel values from a (P2 .pgm) image and print them to the console, they are not identical to the values in the MATLAB matrix viewer. What could be the problem?
This is my code:
try {
    InputStream f = Main.class.getResourceAsStream("an2i_left_angry_open.pgm");
    BufferedReader d = new BufferedReader(new InputStreamReader(f));
    String magic = d.readLine(); // first line contains P2 or P5
    String line = d.readLine(); // second line contains height and width
    while (line.startsWith("#")) { // ignoring comment lines
        line = d.readLine();
    }
    Scanner s = new Scanner(line);
    int width = s.nextInt();
    int height = s.nextInt();
    line = d.readLine(); // third line contains maxVal
    s = new Scanner(line);
    int maxVal = s.nextInt();
    for (int i = 0; i < 30; i++) /* printing first 30 values from the image including spaces */
        System.out.println((byte) d.read());
} catch (EOFException eof) {
    eof.printStackTrace(System.out);
}
These are the values I get:
50
49
32
50
32
49
32
48
32
50
32
49
56
32
53
57
while this is what is actually in the image according to the MATLAB viewer:
(sorry, I can't post images because of lack of reputation)
and this is what you find when you open the .pgm file via Notepad++
Take a look at this post in particular. I've experienced similar issues with imread and with Java's ImageIO class and for the longest time, I could not find this link as proof that other people have experienced the same thing... until now. Similarly, someone experienced related issues in this post, but it isn't quite the same as what you're experiencing.
Essentially, the reason why images loaded in both Java and MATLAB are different is due to enhancement purposes. MATLAB scales the intensities so the image isn't mostly black. Essentially, the maximum intensity in your PGM gets scaled to 255 while the other intensities are linearly scaled to suit the dynamic range of [0,255]. So for example, if your image had a dynamic range from [0-100] in your PGM file before loading it in with imread, this would get scaled to [0-255] and not be the original scale of [0-100]. As such, you would have to know the maximum intensity value of the image before you loaded it in (by scanning through the file yourself). That is very easily done by reading the third line of the file. In your case, this would be 156. Once you find this, you would need to scale every value in your image so that it is rescaled to what it originally was before you read it in.
To confirm that this is the case, take a look at the first pixel in your image, which has intensity 21 in the original PGM file. MATLAB would thus scale the intensities such that:
scaled = round(val*(255/156));
val would be the input intensity and scaled is the output intensity. As such, if val = 21, then scaled would be:
scaled = round(21*(255/156)) = 34
This matches up with the first pixel when reading it out in MATLAB. Similarly, for the sixth pixel in the first row, the original value is 18. MATLAB would scale it such that:
scaled = round(18*(255/156)) = 29
This again matches up with what you see in MATLAB. Starting to see the pattern now? Basically, to undo the scaling, you would need to multiply by the reciprocal of the scaling factor. As such, given that A is the image you loaded in, you need to do:
A_scaled = uint8(double(A)*(max_value/255));
A_scaled is the output image and max_value is the maximum intensity found in your PGM file before you loaded it in with imread. This undoes the scaling, as MATLAB scales the images from [0-255]. Note that I need to cast the image to double first, do the multiplication with the scaling factor as this will most likely produce floating point values, then re-cast back to uint8. Therefore, to bring it back to [0-max_value], you would have to scale in the opposite way.
Specifically in your case, you would need to do:
A_scaled = uint8(double(A)*(156/255));
The disadvantage here is that you need to know what the maximum value is prior to working with your image, which can get annoying. One possibility is to use MATLAB and actually open up the file with file pointers and get the value of the third line yourself. This is also an annoying step, but I have an alternative for you.
Alternative... probably better for you
Alternatively, here are two links to functions written in MATLAB that read and write PGM files without doing that unnecessary scaling, and it'll provide the results that you are expecting (unscaled).
Reading: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_read.m
Writing: http://people.sc.fsu.edu/~jburkardt/m_src/pgma_io/pgma_write.m
How the read function works is that it opens up the image using file pointers and manually parses in the data and stores the values into a matrix. You probably want to use this function instead of relying on imread. To save the images, file pointers are again used and the values are written such that the PGM standard is maintained and again, your intensities are unscaled.
Your Java implementation is printing the ASCII values of the text bytes "21 2 1" etc.:
50->2
49->1
32->SPACE
50->2
32->SPACE
49->1
etc.
Some PGM files use a text header, but binary representation for the pixels themselves. These are marked with a different magic string at the beginning. It looks like the java code is reading the file as if it had binary pixels.
Instead, your PGM file has ASCII-coded pixels, where you want to scan a whitespace-separated value for each pixel. You do this the same way you read the width and height.
The debug code might look like this:
line = d.readLine(); // first image line
s = new Scanner(line);
for (int i = 0; i < 30; i++) /* printing the first 30 pixel values from the image */
    System.out.println((byte) s.nextInt());
I have a bunch of images, too many to do by hand, that are in 16-color 8-bit PNG format and that I need in 16-color 4-bit format; they all have the same palette.
I am scouring Google for the best library to use, but I am not finding much on this specific problem so I am coming here for hopefully some more targeted solutions.
I am trying to use PIL based on other answers I have found here, but not having any luck.
img = Image.open('DownArrow_focused.png')
img = img.point(lambda i: i * 16, "L")
img.save('DownArrow_focused.png', 'PNG')
but this gives me a grayscale image, not what I want.
PIL won't work, so I'm trying PyPNG. GIMP does this, but I have hundreds of these things and I need to batch process them. And I keep getting batches of these to convert, so it isn't a one-time thing.
A Java-based solution would be acceptable as well; pretty much anything I can run from the command line on a Linux/OSX machine will work.
In PNG the palette is always stored in RGB8 (3 bytes for each index=color), with an arbitrary (up to 256) number of entries. If you currently have an 8-bit image with a 16-color palette (16 total entries), you don't need to alter the palette, only to repack the pixel bytes (two indexes per byte). If so, I think you could do it with PNGJ with this code (untested):
public static void reencode(String orig, String dest) {
    PngReader png1 = FileHelper.createPngReader(new File(orig));
    ImageInfo pnginfo1 = png1.imgInfo;
    ImageInfo pnginfo2 = new ImageInfo(pnginfo1.cols, pnginfo1.rows, 4, false, false, true);
    PngWriter png2 = FileHelper.createPngWriter(new File(dest), pnginfo2, false);
    png2.copyChunksFirst(png1, ChunksToWrite.COPY_ALL);
    ImageLine l2 = new ImageLine(pnginfo2);
    for (int row = 0; row < pnginfo1.rows; row++) {
        ImageLine l1 = png1.readRow(row);
        l2.tf_pack(l1.scanline, false);
        l2.setRown(row);
        png2.writeRow(l2);
    }
    png1.end();
    png2.copyChunksLast(png1, ChunksToWrite.COPY_ALL);
    png2.end();
    System.out.println("Done");
}
Otherwise, if your current palette has 16 "used" colors (but its length is greater because it includes unused colors), you need to do some work modifying the palette chunk (but that can also be done).
Call Netpbm programs
http://netpbm.sourceforge.net/
from a Python script using the following commands:
$ pngtopnm test.png | pnmquant 16 | pnmtopng > test16.png
$ file test16.png
test16.png: PNG image data, 700 x 303, 4-bit colormap, non-interlaced
And GIMP reports test16.png as having Color space: Indexed color (16 colors), which I guess is what you want.
This is not a pure Python solution but PIL is also not pure Python and has dependencies on shared libraries too. I think you cannot avoid a dependency on some external image software.
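Since there are hundreds of files, a minimal batch wrapper in Python could look like the sketch below. It assumes the Netpbm tools are on the PATH and that the PNGs sit in the current directory; the output naming scheme is just an example:
import glob
import subprocess

# Convert every PNG in the current directory to a 16-color (4-bit) PNG
# by piping it through pngtopnm | pnmquant 16 | pnmtopng:
for name in glob.glob("*.png"):
    outName = name.replace(".png", "_16.png")
    with open(outName, "wb") as out:
        p1 = subprocess.Popen(["pngtopnm", name], stdout=subprocess.PIPE)
        p2 = subprocess.Popen(["pnmquant", "16"], stdin=p1.stdout, stdout=subprocess.PIPE)
        p3 = subprocess.Popen(["pnmtopng"], stdin=p2.stdout, stdout=out)
        p1.stdout.close()
        p2.stdout.close()
        p3.wait()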