I need to do some image manipulation in java. I am porting python code, which uses numpy arrays with dimensions cols, rows, channels; these are floating point. I know how to get RGB out of a BufferedImage and how to put it back; this question is about how to lay out the resulting float image.
Here are some of the options:
Option 1 (direct translation):
float[][][] img = new float[cols][rows][channels];
Option 2 (channels first):
float[][][] img = new float[channels][cols][rows];
Option 3 (combined indexes):
float[] img = new float[rows*cols*channels];
img[i * cols * channels + j * channels + k] = ...;  // i = row, j = col, k = channel
Option 1 has the advantage that it reads the same as the original code; but it seems non-idiomatic for Java, and probably not fast.
Option 2 should be faster, if I understand how Java N-dimensional arrays work under the hood; at the cost of looking slightly odd. It seems this allocates channels*cols arrays of size rows, as opposed to option 1 which allocates rows*cols arrays of size channels (a very large number of tiny arrays = large overhead).
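To make that concrete, option 1's allocation effectively expands to the following, since Java N-dimensional arrays are arrays of references:

// new float[cols][rows][channels] expands to roughly this:
float[][][] img = new float[cols][][];          // cols references
for (int c = 0; c < cols; c++) {
    img[c] = new float[rows][];                 // rows references per column
    for (int r = 0; r < rows; r++) {
        img[c][r] = new float[channels];        // one tiny array per pixel
    }
}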
Option 3 seems to be closest to what the AWT and other Java code does; but it requires passing around the dimensions (they are not built into the array) and it is very easy to get the indexing wrong (especially when there is other index arithmetic going on).
Which of these is better and why? What are some of the other pros and cons? Is there an even better way?
UPDATE
I benchmarked options 1 and 2, on a non-trivial example of image processing which runs four different algorithms (in a 10x loop, so the VM gets to warm up). This is on OpenJDK 7 on Ubuntu, Intel i5 CPU. Surprisingly, there isn't much of a speed difference: option 2 is about 6% slower than option 1. There is a pretty large difference in the amount of memory garbage-collected (using java -verbose:gc): option 1 collects 1.32 GB of memory during the entire run, while option 2 collects only 0.87 GB (not quite half, but then again not all images used are color). I wonder how much difference there will be in Dalvik?
BoofCV has float image types, and the raw pixel data can be manipulated directly. See the tutorial.
BoofCV provides several routines for quickly converting a BufferedImage into its different image types, and these conversions to/from BufferedImage are very fast.
Convert a BufferedImage to a multispectral float type image with BoofCV:
MultiSpectral<ImageFloat32> image =
    ConvertBufferedImage.convertFromMulti(bufferedImage, null, true, ImageFloat32.class);
Access pixel value from the float image array:
float value = image.getBand(i).data[ image.startIndex + y*image.stride + x];
Another way to get and set the pixel value:
float f = image.getBand(i).get(x, y);
...
image.getBand(i).set(x, y, f);
Where i represents the index of the color channel.
Convert a BoofCV image back to BufferedImage:
BufferedImage bufferedImage =
    new BufferedImage(image.width, image.height, BufferedImage.TYPE_4BYTE_ABGR);
bufferedImage = ConvertBufferedImage.convertTo(image, bufferedImage, true);
You are right, option 3 has a smaller memory footprint.
As for which performs better, you'd have to profile and/or benchmark the options.
Given your statement that row and column counts are large, I'd go with option 3, but wrap the array in a class that knows the dimensions, e.g. called Image.
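A minimal sketch of such a wrapper (all names here are illustrative):

// Wraps the flat float[] of option 3 and owns the dimensions, so callers
// never have to repeat the index arithmetic.
final class Image {
    final int rows, cols, channels;
    final float[] data;

    Image(int rows, int cols, int channels) {
        this.rows = rows;
        this.cols = cols;
        this.channels = channels;
        this.data = new float[rows * cols * channels];
    }

    float get(int row, int col, int channel) {
        return data[(row * cols + col) * channels + channel];
    }

    void set(int row, int col, int channel, float value) {
        data[(row * cols + col) * channels + channel] = value;
    }
}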
Option 3 is what BufferedImage uses in Java. It's good for memory, as Andreas said, but not optimal for image processing, where you want each channel's data contiguous.
The most practical would be:
float[][] img = new float[channels][cols*rows];
This way the channels are separated and can thus be processed independently. This representation would also be optimal if you want to call native code.
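For example (a sketch; the variable names are illustrative), pixel (x, y) of channel c is then addressed with a single multiply and add:

float[][] img = new float[channels][cols * rows];
// Row-major layout inside each channel plane:
float v = img[c][y * cols + x];
img[c][y * cols + x] = v;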
Java seems to scale the splash screen passed as a JVM switch, e.g.
java -splash:splash_file.png
depending on the size of the monitor.
In the source I can see a reference to a natively calculated scale factor. Does anybody know how this scale factor is calculated?
I would assume it's calculated in the standard way for graphics, which takes an image of a given size in an unbounded "world" (world coordinates), transforms it to a normalized device (think unit square), and then transforms it again to the screen coordinates. The transformations consist of a translation and a scaling of the points.
Given the splash screen's window in the world (the way it should appear without translation or scaling), with (xw_min, yw_min) and (xw_max, yw_max) as its corners, the normalized (x, y) values are obtained as follows:

xn = (x - xw_min) * 1 / (xw_max - xw_min)
yn = (y - yw_min) * 1 / (yw_max - yw_min)

The first part is the translation and the second the scale factor. This reduces the image to be contained in a 1 x 1 square, so all the (x, y) values are fractional.
To go from the normalized to the screen coordinate system, with (xv_min, yv_min) and (xv_max, yv_max) as the corners of the viewport on screen, the values are calculated as follows:

xs = xv_min + xn * (xv_max - xv_min)
ys = yv_min + yn * (yv_max - yv_min)
These operations are typically done efficiently with the help of translation and scaling matrix multiplications. Rotation can also be applied.
This is actually a low-level view of how you could take images, shapes, etc. drawn any way you like and present them consistently across any sized screen. I'm not sure exactly how it's done in the example you give, but it would likely be some variation of this. See the beginning of this presentation for a visual representation.
This value is actually both jdk implementation-dependent and architecture-dependent.
I browsed the OpenJDK code, and for a lot of architectures, it's simply hardcoded to 1.
For example, in the windows bindings, you'll find:
SPLASHEXPORT char*
SplashGetScaledImageName(const char* jarName, const char* fileName,
                         float *scaleFactor)
{
    *scaleFactor = 1;
    return NULL;
}
The scaleFactor value is then stored into a struct that is accessed via JNI through the _GetScaleFactor method.
Is there a simple way to get an RGBA int[] from an ARGB BufferedImage? I need it converted for OpenGL, but I don't want to have to iterate through the pixel array and convert it myself.
OpenGL 1.2+ supports a GL_BGRA pixel format and reversed packed pixels.
On the surface BGRA does not sound like what you want, but let me explain.
Calls like glTexImage2D (...) do what is known as pixel transfer, which involves packing and unpacking image data. During the process of pixel transfer, data conversion may be performed, special alignment rules may be followed, etc. The data conversion step is what we are particularly interested in here; you can transfer pixels in a number of different layouts besides the obvious RGBA component order.
If you reverse the byte order (e.g. data type = GL_UNSIGNED_INT_8_8_8_8_REV) together with a GL_BGRA format, you will effectively transfer ARGB pixels without any real effort on your part.
Example glTexImage2D (...) call:
glTexImage2D (..., GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, image);
The usual use-case for _REV packed data types is handling endian differences between different processors, but it also comes in handy when you want to reverse the order of components in an image (since there is no such thing as GL_ARGB).
Do not convert things for OpenGL - it is perfectly capable of doing this by itself.
In order to transition between ARGB and RGBA you can rotate the channel bytes, which converts back and forth in a fast and concise way. Java has no rotate operator, but Integer.rotateLeft and Integer.rotateRight express it directly:

argb = Integer.rotateRight(rgba, 8);   // 0xRRGGBBAA -> 0xAARRGGBB
rgba = Integer.rotateLeft(argb, 8);    // 0xAARRGGBB -> 0xRRGGBBAA
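Applied to the whole pixel array of a BufferedImage, that looks roughly like this (the helper name is hypothetical):

// Hypothetical helper: convert a BufferedImage's ARGB pixels to RGBA order.
static int[] argbToRgba(BufferedImage img) {
    int w = img.getWidth(), h = img.getHeight();
    int[] pixels = img.getRGB(0, 0, w, h, null, 0, w);
    for (int i = 0; i < pixels.length; i++) {
        pixels[i] = Integer.rotateLeft(pixels[i], 8); // 0xAARRGGBB -> 0xRRGGBBAA
    }
    return pixels;
}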
If you have any further questions, this topic should give you a more in-depth answer on converting between RGBA and ARGB.
Also, if you'd like to learn more about Java's bitwise operators, check out this link.
I am using a BufferedImage to hold a 10 by 10 sample of an image. With this Image I would like to find an approximate average color (as a Color object) that represents this image. Currently I have two ideas on how to implement this feature:
Make a scaled instance of the image down to a 1 by 1 image and take that single pixel's color as the average.
Use two nested for loops: the inner one averages each line pixel by pixel, and the outer one averages the per-line results.
I really like the idea of the first solution, but I am not sure how accurate it would be. The second solution would be as accurate as they come, but it seems incredibly tedious. I also believe the getColor call is processor-intensive at a scale like this (I am performing this averaging roughly 640 to 1920 times per second), please correct me if I am wrong. Since this method will be very CPU-intensive, I would like to use a fairly efficient algorithm.
It depends on what you mean by average. If you have half the pixels red and half the pixels blue, would the average be purple? In that case, you can try adding all the values up and dividing by how many pixels you have.
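A minimal sketch of that sum-and-divide approach:

// Average each channel over the sample image (assumes an opaque RGB image).
static Color averageColor(BufferedImage img) {
    long r = 0, g = 0, b = 0;
    int n = img.getWidth() * img.getHeight();
    for (int y = 0; y < img.getHeight(); y++) {
        for (int x = 0; x < img.getWidth(); x++) {
            int rgb = img.getRGB(x, y);
            r += (rgb >> 16) & 0xFF;
            g += (rgb >> 8) & 0xFF;
            b += rgb & 0xFF;
        }
    }
    return new Color((int) (r / n), (int) (g / n), (int) (b / n));
}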
However, I suspect that rather than the average, you want the dominant colour?
In that case one alternative could be to discretise the colours into 'buckets' (say at intervals of 100, or in the extreme case just 3: one for red, one for green and one for blue), and build a histogram (a simple array of counts). You would then take the bucket with the highest count.
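A rough sketch of the bucket idea, cutting each channel into four levels (64 buckets total):

// Count pixels per coarse colour bucket and return the fullest bucket's centre.
static Color dominantColor(BufferedImage img) {
    int[] counts = new int[4 * 4 * 4];
    for (int y = 0; y < img.getHeight(); y++) {
        for (int x = 0; x < img.getWidth(); x++) {
            int rgb = img.getRGB(x, y);
            int r = ((rgb >> 16) & 0xFF) >> 6;   // 0..3
            int g = ((rgb >> 8) & 0xFF) >> 6;
            int b = (rgb & 0xFF) >> 6;
            counts[(r << 4) | (g << 2) | b]++;
        }
    }
    int best = 0;
    for (int i = 1; i < counts.length; i++) {
        if (counts[i] > counts[best]) best = i;
    }
    // Map the winning bucket back to a representative colour (its centre).
    return new Color(((best >> 4) & 3) * 64 + 32,
                     ((best >> 2) & 3) * 64 + 32,
                     (best & 3) * 64 + 32);
}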
Be careful with idea 1. Remember that scaling often takes place by sampling. Since you have a very small image, you have already lost a lot of information. Scaling down further will probably just sample a few pixels and not really average all of them. Better check what algorithm your scaling process is using.
I have a Java program that reads a JPEG file from the hard drive and uses it as the background image for various other things. The image itself is stored in a BufferedImage object like so:
BufferedImage background = ImageIO.read(file);
This works great - the problem is that the BufferedImage object itself is enormous. For example, a 215k JPEG file becomes a BufferedImage object that's 4 megs and change. The app in question can have some fairly large background images loaded, but whereas the JPEGs are never more than a meg or two, the memory used to store the BufferedImages can quickly exceed hundreds of megabytes.
I assume all this is because the image is being stored in RAM as raw RGB data, not compressed or optimized in any way.
Is there a way to have it store the image in RAM in a smaller format? I'm in a situation where I have more slack on the CPU side than RAM, so a slight performance hit to get the image object's size back down toward the JPEG's compressed size would be well worth it.
In one of my projects, I just down-sample the image as it is being read from an ImageInputStream, on the fly. The down-sampling reduces the dimensions of the image to a required width and height without expensive resizing computations or modifying the image on disk.
Because I down-sample the image to a smaller size, it also significantly reduces the processing power and RAM required to display it. For extra optimization, I render the buffered image in tiles as well... but that's a bit outside the scope of this discussion. Try the following:
public static BufferedImage subsampleImage(
        ImageInputStream inputStream,
        int x,
        int y,
        IIOReadProgressListener progressListener) throws IOException {

    BufferedImage resampledImage = null;
    Iterator<ImageReader> readers = ImageIO.getImageReaders(inputStream);
    if (!readers.hasNext()) {
        throw new IOException("No reader available for supplied image stream.");
    }
    ImageReader reader = readers.next();
    ImageReadParam imageReaderParams = reader.getDefaultReadParam();
    reader.setInput(inputStream);

    Dimension d1 = new Dimension(reader.getWidth(0), reader.getHeight(0));
    Dimension d2 = new Dimension(x, y);
    int subsampling = (int) scaleSubsamplingMaintainAspectRatio(d1, d2);
    imageReaderParams.setSourceSubsampling(subsampling, subsampling, 0, 0);

    reader.addIIOReadProgressListener(progressListener);
    resampledImage = reader.read(0, imageReaderParams);
    reader.removeAllIIOReadProgressListeners();
    return resampledImage;
}

public static long scaleSubsamplingMaintainAspectRatio(Dimension d1, Dimension d2) {
    long subsampling = 1;
    if (d1.getWidth() > d2.getWidth()) {
        subsampling = Math.round(d1.getWidth() / d2.getWidth());
    } else if (d1.getHeight() > d2.getHeight()) {
        subsampling = Math.round(d1.getHeight() / d2.getHeight());
    }
    return subsampling;
}
To get the ImageInputStream from a File, use:
ImageIO.createImageInputStream(new File("C:\\image.jpeg"));
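Putting it together (a sketch; the target size here is arbitrary):

ImageInputStream in = ImageIO.createImageInputStream(new File("C:\\image.jpeg"));
BufferedImage img = subsampleImage(in, 1920, 1200, null); // null: no progress listener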
As you can see, this implementation respects the image's original aspect ratio as well. You can optionally register an IIOReadProgressListener so that you can keep track of how much of the image has been read so far. This is useful for showing a progress bar if the image is being read over a network, for instance... It's not required, though; you can just specify null.
Why is this of particular relevance to your situation? Because it never reads the entire image into memory, only as much as you need so that it can be displayed at the desired resolution. It works really well for huge images, even those that are tens of MB on disk.
I assume all this is because the image is being stored in ram as raw RGB data, not compressed or optimized in any way.
Exactly... Say a 1920x1200 JPEG fits in around 300 KB on disk; in memory, in a (typical) RGB + alpha representation with 8 bits per component (hence 32 bits per pixel), it shall occupy:
1920 x 1200 x 32 / 8 = 9 216 000 bytes
So your 300 KB file becomes a picture needing nearly 9 MB of RAM (note that depending on the type of images you're using from Java, and depending on the JVM and OS, this may sometimes be graphics-card RAM).
If you want to use a picture as the background of a 1920x1200 desktop, you probably don't need a picture bigger than that in memory (unless you want some special effect, like sub-RGB decimation / color anti-aliasing / etc.).
So you have two choices:
make your files less wide and less tall (in pixels) on disk
reduce the image size on the fly
I typically go with number 2, because reducing the size on disk means losing detail (a 1920x1200 picture is less detailed than the "same" one at 3940x2400: you'd be "losing information" by downscaling it).
Now, Java kinda sucks big time at manipulating pictures that big (from a performance point of view, a memory-usage point of view, and a quality point of view [*]). Back in the day I'd call ImageMagick from Java to resize the picture on disk first, and then load the resized image (say, one fitting my screen's size).
Nowadays there are Java bridges / APIs to interface directly with ImageMagick.
[*] There is NO WAY you're downsizing an image using Java's built-in API as fast and with as good a quality as ImageMagick, for a start.
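As for those bridges: with im4java, for example, the resize-on-disk step looks roughly like this (a sketch; it assumes im4java and an ImageMagick installation, and the file names are hypothetical):

// Have ImageMagick resize the picture on disk, then load the smaller version.
IMOperation op = new IMOperation();
op.addImage("background.jpg");           // source (hypothetical path)
op.resize(1920, 1200);                   // fit the screen size
op.addImage("background_small.jpg");     // destination (hypothetical path)
new ConvertCmd().run(op);
BufferedImage background = ImageIO.read(new File("background_small.jpg"));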
Do you have to use BufferedImage? Could you write your own Image implementation that stores the JPEG bytes in memory and converts to a BufferedImage as necessary, then discards it?
This, combined with some display-aware logic (rescale the image using JAI before storing it in your byte array as JPEG), will make it faster than decoding the large JPEG every time, with a smaller footprint than what you currently have (processing memory requirements excepted).
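A rough sketch of that idea (the class and its members are hypothetical):

// Keeps only the compressed JPEG bytes resident; decodes on demand, and the
// caller discards the BufferedImage when done with it.
final class CompressedImage {
    private final byte[] jpegBytes;

    CompressedImage(byte[] jpegBytes) {
        this.jpegBytes = jpegBytes;
    }

    BufferedImage decode() throws IOException {
        return ImageIO.read(new ByteArrayInputStream(jpegBytes));
    }
}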
Use imgscalr:
http://www.thebuzzmedia.com/software/imgscalr-java-image-scaling-library/
Why?
Follows best practices
Stupid simple
Interpolation, Anti-aliasing support
So you aren't rolling your own scaling library
Code:
BufferedImage thumbnail = Scalr.resize(image, 150);
or
BufferedImage thumbnail = Scalr.resize(image, Scalr.Method.SPEED, Scalr.Mode.FIT_TO_WIDTH, 150, 100, Scalr.OP_ANTIALIAS);
Also, use image.flush() on your larger image after conversion to help with the memory utilization.
File size of the JPG on disk is completely irrelevant.
The pixel dimensions of the file are what matter. If your image is 15 megapixels, expect it to require a crapload of RAM to hold the raw, uncompressed version.
Resize your image dimensions to just what you need, and that is the best you can do without going to a less rich colorspace representation.
You could copy the pixels of the image to another buffer and see if that occupies less memory than the BufferedImage object. Probably something like this:

BufferedImage background = new BufferedImage(
    width,
    height,
    BufferedImage.TYPE_INT_RGB
);

int[] pixels = background.getRaster().getPixels(
    0,
    0,
    background.getWidth(),
    background.getHeight(),
    (int[]) null
);
I have an Eclipse RCP application that displays a lot (10k+) of small images next to each other, like a film strip. For each image, I am using an SWT Image object. This uses an excessive amount of memory and resources. I am looking for a more efficient way. I thought of taking all of these images and concatenating them by creating an ImageData object of the proper total, concatenated width (with a constant height) and using setPixel() for the rest of the pixels. However, I can't figure out the palette to pass to the ImageData constructor.
I also searched for SWT tiling or mosaic functionality to create one image from a group of images, but found nothing.
Any ideas how I can display thousands of small images next to each other efficiently? Please note that once the images are displayed, they are not manipulated, so this is a one-time cost.
You can draw directly on the GC (graphics context) of a new (big) image. Having one big Image should result in much less resource usage than thousands of smaller ones (each Image in SWT holds a handle to an OS graphics object).
What you can try is something like this:
final List<Image> images = ...; // the small images to combine
final Image bigImage = new Image(Display.getCurrent(), combinedWidth, height);
final GC gc = new GC(bigImage);

// Loop through all the images while increasing x as necessary:
int x = 0;
int y = 0;
for (Image curImage : images) {
    gc.drawImage(curImage, x, y);
    x += curImage.getBounds().width;
}

// Very important to dispose the GC!
gc.dispose();

// Now you can use bigImage.
Presumably not every image is visible on screen at any one time? Perhaps a better solution would be to load images only when they become (or are about to become) visible, disposing of them once they have scrolled off screen. Obviously you'd want to keep a few in memory on either side of the current viewport to make transitions smooth for the user.
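A rough sketch of that windowing idea (the map, the pathFor helper, and the index-based visibility test are all hypothetical):

// Keep only images in or near the viewport loaded; dispose the rest.
final Map<Integer, Image> loaded = new HashMap<Integer, Image>();

void updateVisible(int first, int last, int margin) {
    // Dispose images that have scrolled far off screen.
    for (Iterator<Map.Entry<Integer, Image>> it = loaded.entrySet().iterator(); it.hasNext();) {
        Map.Entry<Integer, Image> e = it.next();
        if (e.getKey() < first - margin || e.getKey() > last + margin) {
            e.getValue().dispose();   // release the OS handle
            it.remove();
        }
    }
    // Load images in or near the viewport on demand.
    for (int i = Math.max(0, first - margin); i <= last + margin; i++) {
        if (!loaded.containsKey(i)) {
            loaded.put(i, new Image(Display.getCurrent(), pathFor(i)));
        }
    }
}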
I previously worked on a Java application to create photomosaics and found it very difficult to achieve adequate performance and memory usage using the Java imaging (JAI) libraries and SWT. Although we weren't using nearly as many images as you mention, one route was to rely on utilities outside of Java. In particular, you could use ImageMagick command-line utilities to stitch together your mosaic, and then load the completed image from disk. If you want to get fancy, there is also a C++ API for ImageMagick, which is very efficient with memory.