I'm relatively new to doing image compression at the byte level, and I'm currently working on a Java image preprocessor that will take a BMP image, convert it to an 8-bit unsigned grayscale image, then stack its bytes according to high and low before exporting and compressing it. After some extensive research and testing of various methods of byte extraction, I'm still not seeing the results I need. Before I continue, it should be noted that all of these images are originally in DICOM format, and I'm using the ij.plugin.DICOM package to extract the pixel data as a BMP image.
The process described next is shown in the code below. Currently, I'm reading in the original image as a BufferedImage, converting it to grayscale, then getting the image bytes from the Raster. Then I take those bytes and, using some other code I found on Stack Overflow, "convert" them to a String representation of the binary bits. I then send that string to a character array. The next step might be extraneous, but I wanted to get your input before I removed it (since I'm new at this). I make a BitSet and iterate through the "binary" character array; if the character value is '1', I set that position in the BitSet to true. Then I send the BitSet to another byte array.
Then I make two new byte arrays, one for the high and one for the low byte. Using a for loop, I'm iterating over the "bit" array and storing every 4 "bits" in the high or low byte, depending on where we are in the array.
Lastly, I take the DICOM tag data, make a byte array from it, and then stack the tag array, the high byte array, and the low byte array together. My intended result is to have the image matrix be "split" with the top half containing all the high bytes and the bottom half containing all of the low bytes. I've been told that the tag bytes will be so small, they shouldn't affect the final outcome (I've tested the image without them, just to be sure, and there was no visible difference).
Below is the code. Please let me know if you have any questions, and I will modify my post accordingly. I've tried to include all relevant data. Let me know if you need more.
BufferedImage originalImage = getGrayScale(img.getBufferedImage());//returns an 8-bit unsigned grayscale conversion of the original image
byte[] imageInByte = ((DataBufferByte) originalImage.getRaster().getDataBuffer()).getData();
String binary = toBinary(imageInByte); //converts to a String representation of the binary bits
char[] binCharArray = binary.toCharArray();
BitSet bits = new BitSet(binCharArray.length);
for (int i = 0; i < binCharArray.length; i++) {
    if (binCharArray[i] == '1') {
        bits.set(i);
    }
}
imageInByte = bits.toByteArray();
byte[] high = new byte[(int) imageInByte.length/2];
byte[] low = new byte[(int) imageInByte.length/2];
int highC = 0;
int lowC = 0;
boolean change = false; //start out storing in the high bit
//change will = true on very first run. While true, load in the high byte array. Else low byte
for(int i = 0; i < imageInByte.length; i++){
    if(i % 4 == 0){
        change = !change;
    }
    if(change){
        high[highC] = imageInByte[i];
        highC++;
    } else {
        low[lowC] = imageInByte[i];
        lowC++;
    }
}
//old code from a previous attempt.
// for (int j = 0; j < imageInByte.length; j++) {
// byte h = (byte) (imageInByte[j] & 0xFF);
// byte l = (byte) ((imageInByte[j] >> 8) & 0xFF);
// high[j] = h;
// low[j] = l;
// }
OutputStream out = null;
//add this array to the image array. It goes at the beginning.
byte[] tagBytes = dicomTags.getBytes();
currProcessingImageTagLength = tagBytes.length;
imageInByte = new byte[high.length + low.length + tagBytes.length];
System.arraycopy(tagBytes, 0, imageInByte, 0, tagBytes.length);
System.arraycopy(high, 0, imageInByte, tagBytes.length, high.length);
System.arraycopy(low, 0, imageInByte, tagBytes.length + high.length, low.length);
BufferedImage bImageFromConvert = new BufferedImage(dimWidth, dimHeight, BufferedImage.TYPE_BYTE_GRAY);//dimWidth and dimHeight are the image dimensions, stored much earlier in this function
byte[] bufferHolder = ((DataBufferByte) bImageFromConvert.getRaster().getDataBuffer()).getData();
System.arraycopy(imageInByte, 0, bufferHolder, 0, bufferHolder.length);
//This is where I try and write the final image before sending it off to an image compressor
ImageIO.write(bImageFromConvert, "bmp", new File(
directory + fileName + "_Compressed.bmp"));
return new File(directory + fileName + "_Compressed.bmp");
And below is the toBinary function in case you were interested:
private static String toBinary(byte[] bytes) {
    StringBuilder sb = new StringBuilder(bytes.length * Byte.SIZE);
    for (int i = 0; i < Byte.SIZE * bytes.length; i++) {
        sb.append((bytes[i / Byte.SIZE] << i % Byte.SIZE & 0x80) == 0 ? '0' : '1');
    }
    return sb.toString();
}
Thank you so much for your help! I've spent nearly 20 hours now trying to solve this one problem. It's been a huge headache, and any insight you have would be appreciated.
EDIT: Here's the getGrayScale function:
public static BufferedImage getGrayScale(BufferedImage inputImage) {
    BufferedImage img = new BufferedImage(inputImage.getWidth(), inputImage.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
    Graphics g = img.getGraphics();
    g.drawImage(inputImage, 0, 0, null);
    g.dispose();
    return img;
}
EDIT 2: I've added some images upon request.
Current output: [image of the current result, not reproduced here]
Note, I can't post the images with the "expected" high byte and low byte outcome due to my reputation being lower than 10.
This flips change every 4 bytes; that's not what you intend:
for(int i = 0; i < imageInByte.length; i++){
    if(i % 4 == 0){
        change = !change;
    }
    if(change){
        high[highC] = imageInByte[i];
        highC++;
    } else {
        low[lowC] = imageInByte[i];
        lowC++;
    }
}
I would replace it with this, based on your earlier attempt (note the casts to byte and the masking before the shift, so the sign bit doesn't leak into the result):
for (int j = 0; j < imageInByte.length; j += 2) {
    byte h = (byte) (imageInByte[j] & 0xF0);
    byte h2 = (byte) (imageInByte[j + 1] & 0xF0);
    byte l = (byte) (imageInByte[j] & 0x0F);
    byte l2 = (byte) (imageInByte[j + 1] & 0x0F);
    high[j / 2] = (byte) (h | ((h2 & 0xFF) >> 4)); // mask to 0..255 before shifting to avoid sign extension
    low[j / 2] = (byte) ((l << 4) | l2);
}
My steganography program takes a string and a BMP image from the user and creates a new BMP image of the same photo, slightly altered to hide the string in the LSBs of the image's bytes.
However, when I write the BMP image it does not open. When I just read the bytes and write them back, the image opens fine, but when I alter the LSBs of a few of its bytes it does not. I can see from testing that the byte values are either unchanged or changed by one (for example 72 instead of 73), yet that still causes the image not to open for some reason. This is the code in question.
public String hideString(String payload, String cover_filename) throws IOException
{
    /** reads source image */
    FileInputStream fis = new FileInputStream(cover_filename);
    /** reads how many bytes are in the image */
    int size = fis.available();
    /** loops through all bytes in the bmp image and saves them in List list */
    for (int i = 0; i < size; i++)
    {
        list.add(fis.read());
    }
    fis.close();
    /** turns payload into an array of integers - each integer is either 0 or 1 */
    int[] binValues = toBinary(payload);
    /** loops through the payload and hides it in each byte */
    for (int i = 0; i < binValues.length; i++)
    {
        /** swapLsb takes a 1 or 0 from the payload, puts it in the LSB of the image byte and returns a binary integer of varying length (no leading zeros) */
        String s = "" + swapLsb(binValues[i], list.get(i));
        /** converts the binary integer into an integer and adds it to List listAltered */
        listAltered.add(Integer.parseInt(s, 2));
    }
    /** writes the stego image */
    FileOutputStream fos = new FileOutputStream("stego.bmp");
    /** writes altered bytes into the bmp image */
    for (int i = 0; i < binValues.length; i++)
    {
        fos.write(listAltered.get(i));
        count++;
    }
    /** writes the rest of the bytes into the bmp image */
    for (int i = binValues.length; i < size; i++)
        fos.write(list.get(i));
    fos.close(); // close writing object
    return "Success! new image is called 'stego.bmp'";
}
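(An aside on the swapLsb helper, which isn't shown in the post, so this is only a guess at its intent: overwriting the least-significant bit of a cover byte usually comes down to a one-liner like the sketch below, which returns the altered byte value directly and avoids the binary-string round trip.)
// Hypothetical helper (the real swapLsb may differ): clear bit 0, then set it to the payload bit.
static int setLsb(int coverByte, int payloadBit) {
    return (coverByte & 0xFE) | (payloadBit & 1);
}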
I am trying to figure out a way of taking data from a file, and I want to store every 4 bytes as a BitSet(32). I really have no idea how to do this. I have played around with storing each byte from the file in an array and then tried to convert every 4 bytes to a BitSet, but I really cannot wrap my head around using BitSets. Any ideas on how to go about this?
FileInputStream data = null;
try
{
    data = new FileInputStream(myFile);
}
catch (FileNotFoundException e)
{
    e.printStackTrace();
}
ByteArrayOutputStream bos = new ByteArrayOutputStream();
byte[] b = new byte[1024];
int bytesRead;
while ((bytesRead = data.read(b)) != -1)
{
    bos.write(b, 0, bytesRead);
}
byte[] bytes = bos.toByteArray();
OK, you've got your byte array. Now you have to convert every 4 bytes to a BitSet.
//Is the number of bytes divisible by 4?
boolean divisibleByFour = bytes.length % 4 == 0;
//Initialize the BitSet array
BitSet[] bitSetArray = new BitSet[bytes.length / 4 + (divisibleByFour ? 0 : 1)];
//Here you convert each 4 bytes to a BitSet
//You will handle the last BitSet later.
int i;
for (i = 0; i < bitSetArray.length - 1; i++) {
    int bi = i * 4;
    bitSetArray[i] = BitSet.valueOf(new byte[] { bytes[bi], bytes[bi + 1], bytes[bi + 2], bytes[bi + 3] });
}
//Now handle the last BitSet.
//You do it here because there may remain fewer than 4 bytes for the last BitSet.
byte[] lastBitSet = new byte[bytes.length - i * 4];
for (int j = 0; j < lastBitSet.length; j++) {
    lastBitSet[j] = bytes[i * 4 + j];
}
//Put the last BitSet into your bitSetArray
bitSetArray[i] = BitSet.valueOf(lastBitSet);
I hope this works for you; I wrote it quickly and did not check that it compiles. But it should give you the basic idea, which was my intention from the beginning.
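If it helps, the same slicing can also be written with java.util.Arrays.copyOfRange, which handles the short final chunk for you. A quick sketch under the same assumptions (note that BitSet.valueOf reads the bytes in little-endian bit order):
BitSet[] bitSetArray = new BitSet[(bytes.length + 3) / 4]; // rounds up so a short final chunk gets its own BitSet
for (int k = 0; k < bitSetArray.length; k++) {
    int from = k * 4;
    int to = Math.min(from + 4, bytes.length);
    bitSetArray[k] = BitSet.valueOf(Arrays.copyOfRange(bytes, from, to));
}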
I have an int and a float array, each of length 220 million (fixed). Now I want to store those arrays to disk and load them back into memory. Currently, I am using Java NIO's FileChannel and MappedByteBuffer to do this. It works fine, but it takes about 5 seconds (wall clock time) to store/load an array to/from disk. I want to make it faster.
Here, I should mention that most of those array elements are 0 (nearly 52%), like:
int arr1[] = {0, 0, 6, 7, 1, 0, 0, ...}
Can anybody help me: is there a nice way to improve the speed by not storing or loading those 0's? On the loading side, the zeroes can be restored with Arrays.fill(array, 0).
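(For context, the FileChannel/MappedByteBuffer write path described above typically looks something like the sketch below; this is not the poster's actual code, and the method name and file path are illustrative only.)
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

static void dump(int[] arr, String path) throws Exception {
    try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
         FileChannel ch = raf.getChannel()) {
        MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, (long) arr.length * 4L);
        map.asIntBuffer().put(arr); // bulk-copy the whole array into the mapped region
        map.force();                // flush the mapping to disk
    }
}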
The following approach requires n / 8 + nz * 4 bytes on disk, where n is the size of the array, and nz the number of non-zero entries. For 52% zero entries, you'd reduce storage size by 52% - 3% = 49%.
You could do:
void write(int[] array) {
    BitSet zeroes = new BitSet();
    for (int i = 0; i < array.length; i++)
        zeroes.set(i, array[i] == 0);
    write(zeroes); // one bit per index
    for (int i = 0; i < array.length; i++)
        if (array[i] != 0)
            write(array[i]);
}

int[] read() {
    BitSet zeroes = readBitSet();
    int[] array = new int[zeroes.length()]; // in practice, also store the array length explicitly; BitSet.length() only counts up to the highest set bit
    for (int i = 0; i < array.length; i++) {
        if (zeroes.get(i)) {
            // nothing to do (array[i] was initialized to 0)
        } else {
            array[i] = readInt();
        }
    }
    return array;
}
Edit: That you say this is slightly slower implies that the disk is not the bottleneck. You could tune the above approach by writing the bitset as you construct it, so you don't have to write the bitset to memory before writing it to disk. Also, by writing the bitset word by word interspersed with the actual data we can do only a single pass over the array, reducing cache misses:
void write(int[] array) {
    writeInt(array.length);
    int ni;
    for (int i = 0; i < array.length; i = ni) {
        ni = i + 32;
        int zeroesMap = 0;
        for (int j = i + 31; j >= i; j--) {
            zeroesMap <<= 1;
            if (array[j] == 0) {
                zeroesMap |= 1;
            }
        }
        writeInt(zeroesMap);
        for (int j = i; j < ni; j++) {
            if (array[j] != 0) {
                writeInt(array[j]);
            }
        }
    }
}
int[] read() {
    int[] array = new int[readInt()];
    int ni;
    for (int i = 0; i < array.length; i = ni) {
        ni = i + 32;
        int zeroesMap = readInt();
        for (int j = i; j < ni; j++) {
            if ((zeroesMap & 1) == 1) {
                // nothing to do (array[j] was initialized to 0)
            } else {
                array[j] = readInt();
            }
            zeroesMap >>= 1;
        }
    }
    return array;
}
(The preceding code assumes array.length is a multiple of 32. If not, write the last slice of the array in whatever way you like.)
If that doesn't reduce processing time either, compression is not the way to go (I don't think any general-purpose compression algorithm will be faster than the above).
Depending upon the distribution, consider Run-length Encoding:
Run-length encoding (RLE) is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. This is most useful on data that contains many such runs.
It is simple ... which is good, and possibly bad, here ;-)
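As an illustration only (not code from the question or the answers above), a zero-run RLE writer for an int[] could look like this sketch, using java.io.DataOutputStream; the assumed format is that a literal 0 acts as a marker and is followed by the length of the run:
static void writeRle(int[] array, DataOutputStream out) throws IOException {
    out.writeInt(array.length);
    int i = 0;
    while (i < array.length) {
        if (array[i] == 0) {
            int run = 0;
            while (i < array.length && array[i] == 0) { run++; i++; }
            out.writeInt(0);    // marker: a zero...
            out.writeInt(run);  // ...followed by how many zeroes it stands for
        } else {
            out.writeInt(array[i++]); // non-zero values are written as-is
        }
    }
}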
In case you are willing to write the serialization/deserialization code yourself, instead of storing all the zeroes you can store a series of ranges that indicate where those zeroes are (with a special marker), together with the actual non-zero data.
So the array in your example: { 0 , 0 , 6 , 7 , 1, 0 , 0 ...}
can be stored as:
%0-1, 6, 7, 1 %5-6
When reading this data back, if you hit a % it means you have a range in front of you: you read the start and the end and fill in zeroes. Then you continue; anything that is not a marker is an actual value.
In a sparse array with long runs of consecutive zeroes, this will yield great compression.
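Sketching the reading side of such a format with java.io.DataInputStream (here a hypothetical sentinel int stands in for the '%' marker above, assuming that value never occurs in the real data):
static final int RANGE_MARKER = Integer.MIN_VALUE; // assumed never to appear in the data

static int[] readWithRanges(DataInputStream in) throws IOException {
    int[] array = new int[in.readInt()]; // Java zero-fills the array for us
    int i = 0;
    while (i < array.length) {
        int value = in.readInt();
        if (value == RANGE_MARKER) {
            int start = in.readInt(); // start of the zero range (positions are explicit in this format)
            int end = in.readInt();   // inclusive end of the zero range
            i = end + 1;              // nothing to write; those slots are already zero
        } else {
            array[i++] = value;
        }
    }
    return array;
}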
There is a standard compression utility in Java: java.util.zip. It's a general-purpose library, but due to its sheer availability it's an OK solution. Specialized compression/encoding should be researched if the need arises; I rarely recommend zip as the solution of choice.
Here is a sample of how to handle zip via Deflater/Inflater.
Most people know ZipInputStream/ZipOutputStream (and especially GZIP). All of them have downsides in handling the copy from memory to zlib, and GZIP in particular is a total disaster, since its CRC32 calls into native code (calling native code removes the ability to optimize and introduces some more performance hits).
A few important notes: do not set the zip compression level high; that will kill any performance whatsoever. Of course, one can experiment and find the best ratio between CPU and disk activity.
The code also demonstrates one of the real shortcomings of java.util.zip: it doesn't support direct buffers. The support would be trivial to add, yet no one has bothered to do it. Direct buffers would save a few memory copies and reduce the memory footprint.
Last note: there is a Java version of zlib (JZlib) and it beats the native implementation on compression quite nicely.
package t1;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Random;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
public class ZInt {
private static final int bucketSize = 1<<17;//in real world should not be const, but we bored horribly
static final int zipLevel = 2;//feel free to experiment, higher compression (5+) is likely to be a total waste
static void write(int[] a, File file, boolean sync) throws IOException{
byte[] bucket = new byte[Math.min(bucketSize, Math.max(1<<13, Integer.highestOneBit(a.length >>3)))];//128KB bucket
byte[] zipOut = new byte[bucket.length];
final FileOutputStream fout = new FileOutputStream(file);
FileChannel channel = fout.getChannel();
try{
ByteBuffer buf = ByteBuffer.wrap(bucket);
//unfortunately java.util.zip doesn't support Direct Buffer - that would be the perfect fit
ByteBuffer out = ByteBuffer.wrap(zipOut);
out.putInt(a.length);//write length aka header
if (a.length==0){
doWrite(channel, out, 0);
return;
}
Deflater deflater = new Deflater(zipLevel, false);
try{
for (int i=0;i<a.length;){
i = put(a, buf, i);
buf.flip();
deflater.setInput(bucket, buf.position(), buf.limit());
if (i==a.length)
deflater.finish();
//hacking and using bucket here is tempting since it's copied twice but well
for (int n; (n= deflater.deflate(zipOut, out.position(), out.remaining()))>0;){
doWrite(channel, out, n);
}
buf.clear();
}
}finally{
deflater.end();
}
}finally{
if (sync)
fout.getFD().sync();
channel.close();
}
}
static int[] read(File file) throws IOException, DataFormatException{
FileChannel channel = new FileInputStream(file).getChannel();
try{
byte[] in = new byte[(int)Math.min(bucketSize, channel.size())];
ByteBuffer buf = ByteBuffer.wrap(in);
channel.read(buf);
buf.flip();
int[] a = new int[buf.getInt()];
if (a.length==0)
return a;
int i=0;
byte[] inflated = new byte[Math.min(1<<17, a.length*4)];
ByteBuffer intBuffer = ByteBuffer.wrap(inflated);
Inflater inflater = new Inflater(false);
try{
do{
if (!buf.hasRemaining()){
buf.clear();
channel.read(buf);
buf.flip();
}
inflater.setInput(in, buf.position(), buf.remaining());
buf.position(buf.position()+buf.remaining());//simulate all read
for (;;){
int n = inflater.inflate(inflated,intBuffer.position(), intBuffer.remaining());
if (n==0)
break;
intBuffer.position(intBuffer.position()+n).flip();
for (;intBuffer.remaining()>3 && i<a.length;i++){//need at least 4 bytes to form an int
a[i] = intBuffer.getInt();
}
intBuffer.compact();
}
}while (channel.position()<channel.size() && i<a.length);
}finally{
inflater.end();
}
// System.out.printf("read ints: %d - channel.position:%d %n", i, channel.position());
return a;
}finally{
channel.close();
}
}
private static void doWrite(FileChannel channel, ByteBuffer out, int n) throws IOException {
out.position(out.position()+n).flip();
while (out.hasRemaining())
channel.write(out);
out.clear();
}
private static int put(int[] a, ByteBuffer buf, int i) {
for (;buf.hasRemaining() && i<a.length;){
buf.putInt(a[i++]);
}
return i;
}
private static int[] generateRandom(int len){
Random r = new Random(17);
int[] n = new int[len];
for (int i=0;i<len;i++){
n[i]= r.nextBoolean()?0: r.nextInt(1<<23);//limit bounds to have any sensible compression
}
return n;
}
public static void main(String[] args) throws Throwable{
File file = new File("xxx.xxx");
int[] n = generateRandom(3000000); //{0,2,4,1,2,3};
long start = System.nanoTime();
write(n, file, false);
long elapsed = System.nanoTime() - start;//elapsed will be fairer if the sync is true
System.out.printf("File length: %d, for %d ints, ratio %.2f in %.2fms %n", file.length(), n.length, ((double)file.length())/4/n.length, java.math.BigDecimal.valueOf(elapsed, 6) );
int[] m = read(file);
//compare, Arrays.equals doesn't return position, so it sucks/kinda
for (int i=0; i<n.length; i++){
if (m[i]!=n[i]){
System.err.printf("Failed at %d%n",i);
break;
}
}
System.out.printf("All done!");
};
}
Please note, the code is not a proper benchmark!
The delayed reply comes from the fact that it was quite boring to code yet another zip example, sorry.
Recently, I've been experimenting with mixing AudioInputStreams together. After reading this post, or more importantly Jason Olson's answer, I came up with this code:
private static AudioInputStream mixAudio(ArrayList audio) throws IOException{
    ArrayList<byte[]> byteArrays = new ArrayList();
    long size = 0;
    int pos = 0;
    for(int i = 0; i < audio.size(); i++){
        AudioInputStream temp = (AudioInputStream) audio.get(i);
        byteArrays.add(convertStream(temp));
        if(size < temp.getFrameLength()){
            size = temp.getFrameLength();
            pos = i;
        }
    }
    byte[] compiledStream = new byte[byteArrays.get(pos).length];
    for(int i = 0; i < compiledStream.length; i++){
        int byteSum = 0;
        for(int j = 0; j < byteArrays.size(); j++){
            try{
                byteSum += byteArrays.get(j)[i];
            }catch(Exception e){
                byteArrays.remove(j);
            }
        }
        compiledStream[i] = (byte) (byteSum / byteArrays.size());
    }
    return new AudioInputStream(new ByteArrayInputStream(compiledStream), ((AudioInputStream)audio.get(pos)).getFormat(), ((AudioInputStream)audio.get(pos)).getFrameLength());
}

private static byte[] convertStream(AudioInputStream stream) throws IOException{
    ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    int numRead;
    while((numRead = stream.read(buffer)) != -1){
        byteStream.write(buffer, 0, numRead);
    }
    return byteStream.toByteArray();
}
This code works very well for mixing audio files. However, it seems that the more audio files are mixed, the more white noise appears in the returned AudioInputStream. All of the files being combined are identical when it comes to formatting. If anyone has any suggestions/advice, thanks in advance.
I could be wrong, but I think your problem has to do with the fact that you are manipulating the bytes instead of what the bytes mean. For instance, if you are working with 16-bit samples, 2 bytes together form the number that corresponds to the amplitude, rather than each byte on its own. So, by adding and averaging raw bytes you end up with something close, but not quite right.
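For example, with signed 16-bit little-endian PCM you would reassemble each two-byte sample, average the samples, and split the result back into bytes. A rough sketch (assuming both inputs share that exact format), not a drop-in replacement for the code above:
// Mix two equal-format, signed 16-bit little-endian PCM byte arrays sample-by-sample.
static byte[] mix16BitLE(byte[] a, byte[] b) {
    byte[] out = new byte[Math.min(a.length, b.length)];
    for (int i = 0; i + 1 < out.length; i += 2) {
        int sampleA = (short) ((a[i] & 0xFF) | (a[i + 1] << 8)); // low byte + high byte
        int sampleB = (short) ((b[i] & 0xFF) | (b[i + 1] << 8));
        int mixed = (sampleA + sampleB) / 2;        // average of two samples cannot overflow 16 bits
        out[i]     = (byte) (mixed & 0xFF);         // low byte back
        out[i + 1] = (byte) ((mixed >> 8) & 0xFF);  // high byte back
    }
    return out;
}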