I have a fixed size frame buffer (1200x720 RGBA efficiently converted from YUV) in a java byte array.
I would like to set a certain shade of a color (white in my case, regardless of its alpha value) to fully transparent.
Currently I am doing this via CPU by traversing the byte array and zero'ing the pixel if RGB > 0xC8. This somewhat works but is obviously extremely slow (>1sec/frame) for doing so on a live stream.
I've been researching methods to do this via GPU/OpenGL on Android and I see mentioning of Alpha test, blending, and color keying. It seems the alpha test is not useful here since it relies on the alpha information rather than RGB's values.
Any idea how to do this on Android using OpenGL/java?
It seems the alpha test is not useful here
The logic for an alpha-test is implemented in the fragment shader, so rather than testing alpha just change the test to implement a check on the RGB value. The technique here is generic and 100% flexible. The underlying operation you are looking for is fragment shaders which trigger the discard operation when the color key matches.
Alternatively you can use the same conditional check but rather than calling discard just set the output color to vec4(0.0) and use blending to avoid modifying the framebuffer for that fragment. On the whole I would expect this to be more efficient; discard tends to have odd perfomance side-effects.
You should create a custom renderscript script to convert those pixels, you can also use it to convert from yuv so you only process the pixels in the buffer once
Related
I'm working on an Image Processing project for a while, which consists in a way to measure and classify some types of sugar in the production line by its color. Until now, my biggest concern was searching and implementing the appropriate mathematical techniques to calculate distance between two colors (a reference color and the color being analysed), and then, turn this value into something more meaningful, as an industry standard measure.
From this, I'm trying to figure out how should I reliably extract the average color value from an image, once the frame captured by a video camera may contain noises or dirt in the sugar (most likely almost black dots).
Language: Java with OpenCV library.
Current solution: Before taking average image value, I'm applying the fastNlMeansDenoisingColored function, provided by OpenCV. It removes some white dots, at cost of more defined details. Couldn't remove black dots with it (not shown in the following images).
From there, I'm using the org.opencv.core.Core.mean function to computate the mean value of array elements independently for each channel, so that I can have a scalar value to use in my calculations.
I tried to use some kinds of image thresholding filters to get rid of black and white dots, and then calculate the mean with a mask, It kinda works too. Also, I tried to find any weighted average function which could return scalar values as well, but without success.
I don't know If those are robust enough pre-processing techniques to such application, mean values can vary easily. Am I in the right way? Would you suggest a better way to get reliable value that will represent my sugar's color?
Is there a simple way to get an rgba int[] from an argb BufferedImage? I need it to be converted for opengl, but I don't want to have to iterate through the pixel array and convert it myself.
OpenGL 1.2+ supports a GL_BGRA pixel format and reversed packed pixels.
On the surface BGRA does not sound like what you want, but let me explain.
Calls like glTexImage2D (...) do what is known as pixel transfer, which involves packing and unpacking image data. During the process of pixel transfer, data conversion may be performed, special alignment rules may be followed, etc. The data conversion step is what we are particularly interested in here; you can transfer pixels in a number of different layouts besides the obvious RGBA component order.
If you reverse the byte order (e.g. data type = GL_UNSIGNED_INT_8_8_8_8_REV) together with a GL_BGRA format, you will effectively transfer ARGB pixels without any real effort on your part.
Example glTexImage2D (...) call:
glTexImage2D (..., GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, image);
The usual use-case for _REV packed data types is handling endian differences between different processors, but it also comes in handy when you want to reverse the order of components in an image (since there is no such thing as GL_ARGB).
Do not convert things for OpenGL - it is perfectly capable of doing this by itself.
In order to transition between argb and rgba you can apply "bit-wise shifts" in order to convert them back and forth in a fast and concise format.
argb = rgba <<< 8
rgba = argb <<< 24
If you have any further questions, this topic might should give you a more in-depth answer on converting between rgba and argb.
Also, if you'd like to learn more about java's bitwise operators check out this link
I'm trying to optimise a rendering engine in Java to not draw object's which are covered up by 'solid' child objects drawn in front of them, i.e. the parent is occluded by its children.
I'm wanting to know if an arbitrary BufferedImage I load in from a file contains any transparent pixels - as this affects my occlusion testing.
I've found I can use BufferedImage.getColorModel().hasAlpha() to find if the image supports alpha, but in the case that it does, it doesn't tell me if it definitely contains non-opaque pixels.
I know I could loop over the pixel data & test each one's alpha value & return as soon as I discover a non-opaque pixel, but I was wondering if there's already something native I could use, a flag that is set internally perhaps? Or something a little less intensive than iterating through pixels.
Any input appreciated, thanks.
Unfortunately, you will have to loop through each pixel (until you find a transparent pixel) to be sure.
If you don't need to be 100% sure, you could of course test only some pixels, where you think transparency is most likely to occur.
By looking at various images, I think you'll find that most images that has transparent parts contains transparency along the edges. This optimization will help in many common cases.
Unfortunately, I don't think that there's an optimization that can be done in one of the most common cases, the one where the color model allows transparency, but there really are no transparent pixels... You really need to test every pixel in this case, to know for sure.
Accessing the alpha values in its "native representation" (through the Raster/DataBuffer/SampleModel classes) is going to be faster than using BufferedImage.getRGB(x, y) and mask out the alpha values.
I'm pretty sure you'll need to loop through each pixel and check for an Alpha value.
The best alternative I can offer is to write a custom method for reading the pixel data - ie your own Raster. Within this class, as you're reading the pixel data from the source file into the data buffer, you can check for the alpha values as you go. Of course, this isn't much help if you're using a built-in image reading class, and involves a lot more effort.
I am attempting to create a "hole in the fog" effect. I have a background grid image, overlapped onto that I have a "fog" texture that I use to show that certain areas are not in view. I am attempting to cut a chunk out of "fog" that will show the area that is currently in view. I am trying to "mask" a part of the fog off the screen.
I made some images to help explain what I am after:
Background:
"Mask Image" (The full transparency has to be on the inside and not the outer rim for what I am going to use it for):
Fog (Sorry, hard to see.. Mostly Transparent):
What I want as a final Product:
I have tried:
Stencil-Buffer: I got this fully working except for one fact... I wasn't able to figure out how to retain the fading transparency of the "mask" image.
glBlendFunc: I have tried many different version of the parameters and many other methods with it (glColorMask, glBlendEquation, glBlendFuncSeparate) I started by using some parameter that I found on this website: here. I used the "glBlendFunc(GL_ZERO, GL_ONE_MINUS_SRC_ALPHA);" as this seemed to be what I was looking for but... This is what ended up happening as a result: (Its hard to tell what is happening here but... There is fog covering the grid in the background. Though, the mask is just ending up as an fully opaque black blob when its supposed to be a transparent part in the fog.
Some previous code:
glEnable(GL_BLEND); // This is not really called here... It is called on the init function of the program as it is needed all the way through the rendering cycle.
renderFogTexture(delta, 0.55f); // This renders the fog texture over the background the 0.55f is the transparency of the image.
glBlendFunc(GL_ZERO, GL11.GL_ONE_MINUS_SRC_ALPHA); // This is the one I tried from one of the many website I have been to today.
renderFogCircles(delta); // This just draws one (or more) of the mask images to remove the fog in key places.
(I would have posted more code but after I tried many things I started removing some old code as it was getting very cluttered (I "backed them up" in block comments))
This is doable, provided that you're not doing anything with the alpha of the framebuffer currently.
Step 1: Make sure that the alpha of the framebuffer is cleared to zero. So your glClearColor call needs to set the alpha to zero. Then call glClear as normal.
Step 2: Draw the mask image before you draw the "fog". As Tim said, once you blend with your fog, you can't undo that. So you need the mask data there first.
However, you also need to render the mask specially. You only want the mask to modify the framebuffer's alpha. You don't want it to mess with the RGB color. To do that, use this function: glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_TRUE). This turns off writes to the RGB part of the color; thus, only the alpha will be modified.
Your mask texture seems to have zero where it is visible and one where it isn't. However, the algorithm needs the opposite, so you should either fix your texture or use a glTexEnv mode that will effectively flip the alpha.
After this step, your framebuffer should have an alpha of 0 where we want to see the fog, and an alpha of 1 where we don't.
Also, don't forget to undo the glColorMask call after rendering the mask. You need to get those colors back.
Step 3: Render the fog. That's easy enough; to make the masking work, you need a special blend mode. Like this one:
glEnable(GL_BLEND);
glBlendEquation(GL_FUNC_ADD);
glBlendFuncSeparate(GL_ONE_MINUS_DST_ALPHA, GL_DST_ALPHA, GL_ZERO, GL_ONE);
The separation between the RGB and A blend portions is important. You don't want to change the framebuffer's alpha (just in case you want to render more than one layer of fog).
And you're done.
The approach you're currently taking will not work, as once you draw the fog over the whole screen, there's no way to 'erase' it.
If you're using fixed pipeline:
You can use multitexturing (glTexEnv) with fixed pipeline to combine the fog and circle textures in a single pass. This function is probably kind of confusing if you haven't used it before, you'll probably have to spend some time studying the man page. You'll do something like bind fog to glActiveTexture 0, and mask to glActiveTexture 1, enable multitexturing, and then combine them with glTexEnv. I don't remember exactly the right parameters for this.
If you're using shaders:
Use a multitexturing shader where you multiply the fog alpha with the circle texture (to zero out the alpha in the circle region), and then blend this combined texture into the background in a single pass. This is probably a more conceptually easy approach, but not sure if you're using shaders.
I'm not sure there's a way you can do this where you draw the fog and the mask in separate passes, as they both have their own alpha values, it will be difficult to combine them to get the right color result.
I have written a program that takes a 'photo' and for every pixel it chooses to insert an image from a range of other photos. The image chosen is the photo of which the average colour is closest to the original pixel from the photograph.
I have done this by firstly averaging the rgb values from every pixel in 'stock' image and then converting it to CIE LAB so i could calculate the how 'close' it is to the pixel in question in terms of human perception of the colour.
I have then compiled an image where each pixel in the original 'photo' image has been replaced with the 'closest' stock image.
It works nicely and the effect is good however the stock image size is 300 by 300 pixels and even with the virtual machine flags of "-Xms2048m -Xmx2048m", which yes I know is ridiculus, on 555px by 540px image I can only replace the stock images scaled down to 50 px before I get an out of memory error.
So basically I am trying to think of solutions. Firstly I think the image effect itself may be improved by averaging every 4 pixels (2x2 square) of the original image into a single pixel and then replacing this pixel with the image, as this way the small photos will be more visible in the individual print. This should also allow me to draw the stock images at a greater size. Does anyone have any experience in this sort of image manipulation? If so what tricks have you discovered to produce a nice image.
Ultimately I think the way to reduce the memory errors would be to repeatedly save the image to disk and append the next line of images to the file whilst continually removing the old set of rendered images from memory. How can this be done? Is it similar to appending a normal file.
Any help in this last matter would be greatly appreciated.
Thanks,
Alex
I suggest looking into the Java Advanced Imaging (JAI) API. You're probably using BufferedImage right now, which does keep everything in memory: source images as well as output images. This is known as "immediate mode" processing. When you call a method to resize the image, it happens immediately. As a result, you're still keeping the stock images in memory.
With JAI, there are two benefits you can take advantage of.
Deferred mode processing.
Tile computation.
Deferred mode means that the output images are not computed right when you call methods on the images. Instead, a call to resize an image creates a small "operator" object that can do the resizing later. This lets you construct chains, trees, or pipelines of operations. So, your work would build a tree of operations like "crop, resize, composite" for each stock image. The nice part is that the operations are just command objects so you aren't consuming all the memory while you build up your commands.
This API is pull-based. It defers computation until some output action pulls pixels from the operators. This quickly helps save time and memory by avoiding needless pixel operations.
For example, suppose you need an output image that is 2048 x 2048 pixels, scaled up from a 512x512 crop out of a source image that's 1600x512 pixels. Obviously, it doesn't make sense to scale up the entire 1600x512 source image, just to throw away 2/3 of the pixels. Instead, the scaling operator will have a "region of interest" (ROI) based on it's output dimensions. The scaling operator projects the ROI onto the source image and only computes those pixels.
The commands must eventually get evaluated. This happens in a few situations, mostly relating to output of the final image. So, asking for a BufferedImage to display the output on the screen will force all the commands to evaluate. Similarly, writing the output image to disk will force evaluation.
In some cases, you can keep the second benefit of JAI, which is tile based rendering. Whereas BufferedImage does all its work right away, across all pixels, tile rendering just operates on rectangular sections of the image at a time.
Using the example from before, the 2048x2048 output image will get broken into tiles. Suppose these are 256x256, then the entire image gets broken into 64 tiles. The JAI operator objects know how to work a tile at a tile. So, scaling the 512x512 section of the source image really happens 64 times on 64x64 source pixels at a time.
Computing a tile at a time means looping across the tiles, which would seem to take more time. However, two things work in your favor when doing tile computation. First, tiles can be evaluated on multiple threads concurrently. Second, the transient memory usage is much, much lower than immediate mode computation.
All of which is a long-winded explanation for why you want to use JAI for this type of image processing.
A couple of notes and caveats:
You can defeat tile based rendering without realizing it. Anywhere you've got a BufferedImage in the workstream, it cannot act as a tile source or sink.
If you render to disk using the JAI or JAI Image I/O operators for JPEG, then you're in good shape. If you try to use the JDK's built-in image classes, you'll need all the memory. (Basically, avoid mixing the two types of image manipulation. Immediate mode and deferred mode don't mix well.)
All the fancy stuff with ROIs, tiles, and deferred mode are transparent to the program. You just make API call on the JAI class. You only deal with the machinery if you need more control over things like tile sizes, caching, and concurrency.
Here's a suggestion that might be useful;
Try segregating the two main tasks into individual programs. Your first task is to decide which images go where, and that can be a simple mapping from coordinates to filenames, which can be represented as lines of text:
0,0,image123.jpg
0,1,image542.jpg
.....
After that task is done (and it sounds like you have it well handled), then you can have a separate program handle the compilation.
This compilation could be done by appending to an image, but you probably don't want to mess around with file formats yourself. It's better to let your programming environment do it by using a Java Image object of some sort. The biggest one you can fit in memory pixelwise will be 2GB leading to sqrt(2x10^9) maximum height and width. From this number and dividing by the number of images you have for height and width, you will get the overall pixels per subimage allowed., and can paint them into the appropriate places.
Every time you 'append' are you perhaps implicitly creating a new object with one more pixel to replace the old one (ie, a parallel to the classic problem of repeatedly appending to a String instead of using a StringBuilder) ?
If you post the portion of your code that does the storing and appending, someone will probably help you find an efficient way of recoding it.