ZLib decompression fails on large byte array - java

When experimenting with ZLib compression, I ran across a strange problem. Decompressing a zlib-compressed byte array with random data fails reproducibly if the source array is at least 32752 bytes long. Here's a little program that reproduces the problem; you can see it in action on IDEOne. The compression and decompression methods are standard code picked off tutorials.
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
public class ZlibMain {
private static byte[] compress(final byte[] data) {
final Deflater deflater = new Deflater();
deflater.setInput(data);
deflater.finish();
final byte[] bytesCompressed = new byte[Short.MAX_VALUE];
final int numberOfBytesAfterCompression = deflater.deflate(bytesCompressed);
final byte[] returnValues = new byte[numberOfBytesAfterCompression];
System.arraycopy(bytesCompressed, 0, returnValues, 0, numberOfBytesAfterCompression);
return returnValues;
}
private static byte[] decompress(final byte[] data) {
final Inflater inflater = new Inflater();
inflater.setInput(data);
try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length)) {
final byte[] buffer = new byte[Math.max(1024, data.length / 10)];
while (!inflater.finished()) {
final int count = inflater.inflate(buffer);
outputStream.write(buffer, 0, count);
}
outputStream.close();
final byte[] output = outputStream.toByteArray();
return output;
} catch (DataFormatException | IOException e) {
throw new RuntimeException(e);
}
}
public static void main(final String[] args) {
roundTrip(100);
roundTrip(1000);
roundTrip(10000);
roundTrip(20000);
roundTrip(30000);
roundTrip(32000);
for (int i = 32700; i < 33000; i++) {
if(!roundTrip(i))break;
}
}
private static boolean roundTrip(final int i) {
System.out.printf("Starting round trip with size %d: ", i);
final byte[] data = new byte[i];
for (int j = 0; j < data.length; j++) {
data[j]= (byte) j;
}
shuffleArray(data);
final byte[] compressed = compress(data);
try {
final byte[] decompressed = CompletableFuture.supplyAsync(() -> decompress(compressed))
.get(2, TimeUnit.SECONDS);
System.out.printf("Success (%s)%n", Arrays.equals(data, decompressed) ? "matching" : "non-matching");
return true;
} catch (InterruptedException | ExecutionException | TimeoutException e) {
System.out.println("Failure!");
return false;
}
}
// Implementing Fisher–Yates shuffle
// source: https://stackoverflow.com/a/1520212/342852
static void shuffleArray(byte[] ar) {
Random rnd = ThreadLocalRandom.current();
for (int i = ar.length - 1; i > 0; i--) {
int index = rnd.nextInt(i + 1);
// Simple swap
byte a = ar[index];
ar[index] = ar[i];
ar[i] = a;
}
}
}
Is this a known bug in ZLib? Or do I have an error in my compress / decompress routines?

It is an error in the logic of the compress / decompress methods. I am not that deep into the implementation, but with debugging I found the following:
When the buffer of 32752 bytes is compressed, the deflater.deflate() method returns a value of 32767, which is the size to which you initialized the buffer in the line:
final byte[] bytesCompressed = new byte[Short.MAX_VALUE];
If you increase the buffer size, for example to
final byte[] bytesCompressed = new byte[4 * Short.MAX_VALUE];
then you will see that the input of 32752 bytes is actually deflated to 32768 bytes. So in your code, the compressed data does not contain all the data that should be in there.
When you then try to decompress, the inflater.inflate() method returns zero, which indicates that more input data is needed. But as you only check for inflater.finished(), you end up in an endless loop.
So you can either increase the buffer size when compressing, which probably just moves the problem to bigger files, or, better, rewrite the compress/decompress logic to process your data in chunks.
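The endless loop described above can be avoided by checking why inflate() returned zero. The following is a minimal sketch, not the original poster's code: the class name SafeZlib and the choice to throw IllegalArgumentException are assumptions made for illustration. compress() drains the Deflater in chunks, and decompress() stops with an error when the Inflater asks for more input than it was given.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class; the name is an assumption for this sketch.
class SafeZlib {

    // Compresses in 1024-byte chunks so the output may grow beyond any fixed buffer.
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int count = deflater.deflate(buffer);
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    // Decompresses, but fails fast instead of looping when the input is incomplete.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                // Zero bytes produced and more input (or a dictionary) required:
                // no progress is possible, so stop rather than spin forever.
                if (count == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("Compressed data is truncated or needs a dictionary");
                }
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
    }
}
```

With this guard, a truncated input like the one produced by the fixed-size buffer in the question fails immediately with an exception instead of timing out.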

Apparently the compress() method was faulty.
This one works:
public static byte[] compress(final byte[] data) {
try (final ByteArrayOutputStream outputStream =
new ByteArrayOutputStream(data.length);) {
final Deflater deflater = new Deflater();
deflater.setInput(data);
deflater.finish();
final byte[] buffer = new byte[1024];
while (!deflater.finished()) {
final int count = deflater.deflate(buffer);
outputStream.write(buffer, 0, count);
}
final byte[] output = outputStream.toByteArray();
return output;
} catch (IOException e) {
throw new IllegalStateException(e);
}
}
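As an alternative to driving Deflater and Inflater by hand, the stream wrappers in java.util.zip do the chunking internally. This is only a sketch under the assumption that the whole input fits in memory; the class name ZlibStreams is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class; the name is an assumption for this sketch.
class ZlibStreams {

    static byte[] compress(byte[] data) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the DeflaterOutputStream finishes the zlib stream.
        try (DeflaterOutputStream out = new DeflaterOutputStream(bytes)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int count;
            // read() returns -1 once the end of the compressed stream is reached.
            while ((count = in.read(buffer)) != -1) {
                bytes.write(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The design trade-off is less control over buffers in exchange for not having to reason about finished() and needsInput() at all.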

Related

Compressing Base64 String is not of less size

I'm trying to compress a Base64 String using the java.util.zip.GZIPOutputStream and Deflater classes. My problem is that in both cases the result is not smaller after compression. In the first case, with GZIPOutputStream, the size is bigger, and in the second case, with the Deflater class, the size is almost the same.
The output of my code is:
Original String Size: 8799
CompressedGZip String Size: 8828
UncompressedGZip String Size: 8799
Original_String_Length=8799
Compressed_String_Length Deflater=8812, Compression_Ratio=-0.147%
Decompressed_String_Length Deflater=8799 == Original_String_Length (8799)
Original_String == Decompressed_String=True
As you can see, in both cases the compressed string is not smaller. I need to compress the input Base64 String because in some cases it is too long. Is there any way to achieve this?
This is my code:
private static String compressFileGZip(String data) {
try {
// Create an output stream, and a gzip stream to wrap over.
ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
GZIPOutputStream gzip = new GZIPOutputStream(bos);
// Compress the input string
gzip.write(data.getBytes());
gzip.close();
byte[] compressed = bos.toByteArray();
bos.close();
// Convert to base64
compressed = Base64.getEncoder().encode(compressed);
// return the newly created string
return new String(compressed);
} catch(IOException e) {
return null;
}
}
private static String decompressFileGZip(String compressedText) throws IOException {
ByteArrayOutputStream stream = new ByteArrayOutputStream();
// get the bytes for the compressed string
byte[] compressed = compressedText.getBytes("UTF8");
// convert the bytes from base64 to normal string
Base64.Decoder d = Base64.getDecoder();
compressed = d.decode(compressed);
// decode.
final int BUFFER_SIZE = 32;
ByteArrayInputStream is = new ByteArrayInputStream(compressed);
GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
StringBuilder string = new StringBuilder();
byte[] data = new byte[BUFFER_SIZE];
int bytesRead;
while ((bytesRead = gis.read(data)) != -1)
{
string.append(new String(data, 0, bytesRead));
}
gis.close();
is.close();
return string.toString();
}
public static void main(String args[]) {
String input = "";
String compressedGZip = compressFileGZip(input);
String compressedDeflater = null;
String uncompressedGZip = null;
String decompressed = null;
try {
compressedDeflater = compress(input);
uncompressedGZip = decompressFileGZip(compressedGZip);
decompressed = decompress(decodeBase64(compressedDeflater));
} catch (IOException e) {
e.printStackTrace();
} catch (Exception e) {
e.printStackTrace();
}
System.out.println("Original String Size: " + input.length());
System.out.println("CompressedGZip String Size: " + compressedGZip.length());
System.out.println("UncompressedGZip String Size: " + uncompressedGZip.length());
Integer savedLength = input.length() - compressedDeflater.length();
Double saveRatio = (new Double(savedLength) * 100) / input.length();
String ratioString = saveRatio.toString() + "00000000";
ratioString = ratioString.substring(0, ratioString.indexOf(".") + 4);
println("Original_String_Length=" + input.length());
println("Compressed_String_Length Deflater=" + compressedDeflater.length() + ", Compression_Ratio=" + ratioString + "%");
println("Decompressed_String_Length Deflater=" + decompressed.length() + " == Original_String_Length (" + input.length() + ")");
println("Original_String == Decompressed_String=" + (input.equals(decompressed) ? "True" : "False"));
// end
}
public static String compress(String str) throws Exception {
return compress(str.getBytes("UTF-8"));
}
public static String compress(byte[] bytes) throws Exception {
Deflater deflater = new Deflater();
deflater.setInput(bytes);
deflater.finish();
//deflater.deflate(bytes, 2, bytes.length);
ByteArrayOutputStream bos = new ByteArrayOutputStream(bytes.length);
byte[] buffer = new byte[1024];
while(!deflater.finished()) {
int count = deflater.deflate(buffer);
bos.write(buffer, 0, count);
}
bos.close();
byte[] output = bos.toByteArray();
return encodeBase64(output);
}
public static String decompress(byte[] bytes) throws Exception {
Inflater inflater = new Inflater();
inflater.setInput(bytes);
ByteArrayOutputStream bos = new ByteArrayOutputStream(bytes.length);
byte[] buffer = new byte[1024];
while (!inflater.finished()) {
int count = inflater.inflate(buffer);
bos.write(buffer, 0, count);
}
bos.close();
byte[] output = bos.toByteArray();
return new String(output);
}
public static String encodeBase64(byte[] bytes) throws Exception {
BASE64Encoder base64Encoder = new BASE64Encoder();
return base64Encoder.encodeBuffer(bytes).replace("\r\n", "").replace("\n", "");
}
public static byte[] decodeBase64(String str) throws Exception {
BASE64Decoder base64Decoder = new BASE64Decoder();
return base64Decoder.decodeBuffer(str);
}
public static void println(Object o) {
System.out.println("" + o);
}

how to do framing for audio signal in java

I want to split my audio file (.wav format) into frames of 32 milliseconds each. Sampling frequency: 16 kHz; number of channels: 1 (mono); PCM signal; sample size = 93638.
After reading the data as bytes, I convert the byte array holding the WAV file data to a double array, because I need to pass it to a method that accepts a double array. I am using the following code; can someone tell me how to proceed?
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.nio.ByteBuffer;
public class AudioFiles
{
public static void main(String[] args)
{
String file = "D:/p.wav";
AudioFiles afiles = new AudioFiles();
byte[] data1 = afiles.readAudioFileData(file);
byte[] data2 = afiles.readWAVAudioFileData(file);
System.out.format("data len1: %d\n", data1.length);
System.out.format("data len2: %d\n", data2.length);
/* for(int i=0;i<data2.length;i++)
{
System.out.format("\t"+data2[i]);
}*/
System.out.println();
/* for(int j=0;j<data1.length;j++)
{
System.out.format("\t"+data1[j]);
}*/
System.out.format("diff len: %d\n", data2.length - data1.length);
double[] d = new double[data1.length];
d = toDoubleArray(data1);
for (int j = 0; j < data1.length; j++)
{
System.out.format("\t" + d[j]);
}
daub a = new daub();
a.daubTrans(d);
}
public static double[] toDoubleArray(byte[] byteArray)
{
int times = Double.SIZE / Byte.SIZE;
double[] doubles = new double[byteArray.length / times];
for (int i = 0; i < doubles.length; i++)
{
doubles[i] = ByteBuffer.wrap(byteArray, i * times, times).getDouble();
}
return doubles;
}
public byte[] readAudioFileData(final String filePath)
{
byte[] data = null;
try
{
final ByteArrayOutputStream baout = new ByteArrayOutputStream();
final File file = new File(filePath);
final AudioInputStream audioInputStream = AudioSystem
.getAudioInputStream(file);
byte[] buffer = new byte[4096];
int c;
while ((c = audioInputStream.read(buffer, 0, buffer.length)) != -1)
{
baout.write(buffer, 0, c);
}
audioInputStream.close();
baout.close();
data = baout.toByteArray();
}
catch (Exception e)
{
e.printStackTrace();
}
return data;
}
public byte[] readWAVAudioFileData(final String filePath)
{
byte[] data = null;
try
{
final ByteArrayOutputStream baout = new ByteArrayOutputStream();
final AudioInputStream audioInputStream = AudioSystem.getAudioInputStream(new File(filePath));
AudioSystem.write(audioInputStream, AudioFileFormat.Type.WAVE, baout);
audioInputStream.close();
baout.close();
data = baout.toByteArray();
}
catch (Exception e)
{
e.printStackTrace();
}
return data;
}
}
I want to pass the double array d, in frames of 32 milliseconds, to a method performing a wavelet transform, since that method accepts a double array.
In my previous question I was given a reply that:
At 16kHz sample rate you'll have 16 samples per millisecond. Therefore, each 32ms frame would be 32*16=512 mono samples. Multiply by the number of bytes-per-sample (typically 2 or 4) and that will be the number of bytes per frame.
I want to know whether my frame size changes when I convert my array from byte format to double format, or whether it remains the same.
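The arithmetic in the quoted reply can be sketched in code. This is only an illustration under stated assumptions: the samples are taken to be 16-bit signed little-endian PCM (the question does not state the sample width), the byte array is assumed to hold audio data only, and the class and method names are hypothetical. Under these assumptions a frame is always 512 samples; only its size in bytes depends on the representation (1024 bytes as 16-bit PCM, 512 elements as double[]).

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical helper class; names are assumptions for this sketch.
class PcmFraming {

    // Converts 16-bit signed little-endian PCM bytes to doubles in [-1.0, 1.0).
    static double[] toSamples(byte[] pcm) {
        ByteBuffer bb = ByteBuffer.wrap(pcm).order(ByteOrder.LITTLE_ENDIAN);
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = bb.getShort() / 32768.0;
        }
        return samples;
    }

    // Splits samples into frames of frameMillis each; a trailing partial frame is dropped.
    static double[][] split(double[] samples, int sampleRateHz, int frameMillis) {
        int frameLength = (int) ((long) sampleRateHz * frameMillis / 1000); // 16000 * 32 / 1000 = 512
        int count = samples.length / frameLength;
        double[][] frames = new double[count][];
        for (int f = 0; f < count; f++) {
            frames[f] = Arrays.copyOfRange(samples, f * frameLength, (f + 1) * frameLength);
        }
        return frames;
    }
}
```

Each row of the result from split(samples, 16000, 32) could then be handed to a method that accepts a double array.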
My Previous Question.

DES Decryption: Given final block not properly padded

I'm trying to decrypt the content of a file bigger than 1k for a "RETR" action of an FTP Client and I'm encountering this kind of exception.
javax.crypto.BadPaddingException: Given final block not properly padded
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:811)
at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:676)
at com.sun.crypto.provider.DESCipher.engineDoFinal(DESCipher.java:314)
at javax.crypto.Cipher.doFinal(Cipher.java:2145)
This is the code that is causing the problem:
byte[] encontent = new byte[0];
byte[] buff = new byte[1024];
int k = -1;
while((k = bis.read(buff, 0, buff.length)) > -1) {
byte[] tbuff = new byte[encontent.length + k]; // temp buffer size = bytes already read + bytes last read
System.arraycopy(encontent, 0, tbuff, 0, encontent.length); // copy previous bytes
System.arraycopy(buff, 0, tbuff, encontent.length, k); // copy current lot
encontent = tbuff; // call the temp buffer as your result buff
}
System.out.println(encontent.length + " bytes read.");
byte [] plain = dcipher.doFinal(encontent, 0,encontent.length);
The length of the byte array encontent is always a multiple of 8 bytes, because it is the result of a previous encryption.
Here is the code that starts the operation on the server side:
public void download (String pathfile)
{
Socket DataSock = null;
try {
DataSock = new Socket (clientAddr, TRANSMISSION_PORT);
if (DataSock.isConnected())
{
BufferedOutputStream bos = new BufferedOutputStream (DataSock.getOutputStream());
int size=0;
int blocks=0;
int resto=0;
if (pathfile.endsWith(".txt"))
{
String text = readTxtFile (pathfile);
byte [] encontent = ecipher.doFinal(text.getBytes("UTF8"));
sendFile (bos,encontent);
} else {
byte [] content = readFile (pathfile);
byte [] encontent = ecipher.doFinal(content);
sendFile (bos, content);
}
}
} catch (Exception e)
{
e.printStackTrace();
} finally {
try {
DataSock.close();
} catch (Exception e)
{
e.printStackTrace();
}
}
}
The final block must contain 8 bytes. If it does not, one has to pad it until it is 8 bytes wide. Your assumption is wrong.
Have a look at https://stackoverflow.com/a/10427679/867816
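To make the block-size point concrete, here is a minimal sketch of a DES round trip. It assumes the transformation DES/ECB/PKCS5Padding, which the question does not show, and the class and method names are hypothetical. It illustrates that properly encrypted data has a length that is a multiple of 8 and decrypts cleanly, while bytes that never went through the cipher are rejected. Note also that the else branch of the posted server code passes content rather than encontent to sendFile, which may be worth checking.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper class; the transformation string is an assumption.
class DesRoundTrip {
    static final String TRANSFORMATION = "DES/ECB/PKCS5Padding";

    static SecretKey newKey() {
        try {
            return KeyGenerator.getInstance("DES").generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Runs the cipher in the given mode (Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE).
    static byte[] run(int mode, SecretKey key, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, key);
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            // Covers BadPaddingException and IllegalBlockSizeException.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] plain = new byte[1500];
        byte[] encrypted = run(Cipher.ENCRYPT_MODE, key, plain);
        System.out.println(encrypted.length % 8 == 0);
        byte[] decrypted = run(Cipher.DECRYPT_MODE, key, encrypted);
        System.out.println(Arrays.equals(plain, decrypted));
    }
}
```

The same key and the same transformation must be used on both sides, and the receiver has to see every byte of the ciphertext before calling doFinal.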

Assign a String to byte and specify string length at the start on Java

I would like to copy String data into a byte array and put the 4-byte length of the String data at the start. What is the best way to accomplish this? I need this for transmitting the byte data over a socket connection. The server side reads as many bytes as the length at the start says.
Is there a better way of doing this?
private byte[] getDataSendBytes(String data) {
int numberOfDataBytes = data.getBytes().length;
ByteBuffer bb = ByteBuffer.allocate(HEADER_LENGTH_BYTES);
bb.putInt(numberOfDataBytes);
byte[] headerBytes = bb.array();
byte[] dataBytes = data.getBytes();
// create a Datagram packet
byte[] sendDataBytes = new byte[HEADER_LENGTH_BYTES + dataBytes.length];
System.arraycopy(headerBytes, 0, sendDataBytes, 0, headerBytes.length);
System.arraycopy(dataBytes, 0, sendDataBytes, headerBytes.length,
dataBytes.length);
return sendDataBytes;
}
I would use either DataOutputStream
public byte[] getDataSendBytes(String text) {
ByteArrayOutputStream baos = new ByteArrayOutputStream();
try {
new DataOutputStream(baos).writeUTF(text);
} catch (IOException e) {
throw new AssertionError(e);
}
return baos.toByteArray();
}
or ByteBuffer for control of the length type and endianness.
public byte[] getDataSendBytes(String text) {
try {
byte[] bytes = text.getBytes("UTF-8");
ByteBuffer bb = ByteBuffer.allocate(4 + bytes.length).order(ByteOrder.LITTLE_ENDIAN);
bb.putInt(bytes.length);
bb.put(bytes);
return bb.array();
} catch (UnsupportedEncodingException e) {
throw new AssertionError(e);
}
}
or, for performance, reuse the ByteBuffer and assume an ISO-8859-1 character encoding
// GC-less method.
public void writeAsciiText(ByteBuffer bb, String text) {
assert text.length() < (1 << 16);
bb.putShort((short) text.length());
for(int i=0;i<text.length();i++)
bb.put((byte) text.charAt(i));
}
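For completeness, here is a sketch of both ends of a 4-byte length prefix, since the question asks for one and DataOutputStream.writeUTF writes a 2-byte length with modified UTF-8 instead. The class and method names are hypothetical. The sketch uses big-endian order, which is the ByteBuffer default and what DataInputStream.readInt expects; a little-endian prefix, as in the ByteBuffer variant above, would need a matching reader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; names are assumptions for this sketch.
class LengthPrefixed {

    // Builds [4-byte big-endian length][UTF-8 payload].
    static byte[] encode(String text) {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length)
                .put(payload)
                .array();
    }

    // Reads one message: the length first, then exactly that many bytes.
    static String decode(InputStream in) {
        try {
            DataInputStream data = new DataInputStream(in);
            int length = data.readInt();
            byte[] payload = new byte[length];
            data.readFully(payload); // blocks until all bytes arrive or EOF
            return new String(payload, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The length counts bytes, not characters, so the payload is encoded once and its byte length is what goes into the header.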

Limit size byte[] Java android

I have to fill a byte[] in my Android application. Sometimes it is bigger than 4 KB.
I initialize my byte[] like this:
int size = ReadTools.getPacketSize(ptr.dataInputStream);
byte[] myByteArray = new byte[size];
Here, size = 22625. I then fill my byte[] like this:
while (i != size) {
myByteArray[i] = ptr.dataInputStream.readByte();
i++;
}
But when I print the content of my byte[], I get a byte[] with size = 4060.
Does Java split my byte[] if it is bigger than 4060? And if so, how can I get a byte[] larger than 4060?
Here is my full code:
public class ReadSocket extends Thread{
DataInputStream inputStream;
BufferedReader reader;
GlobalContent ptr;
public ReadSocket(DataInputStream inputStream, GlobalContent ptr)
{
this.inputStream = inputStream;
this.ptr = ptr;
}
public void run() {
int i = 0;
int j = 0;
try {
ptr.StatusThreadReadSocket = 1;
while(ptr.dataInputStream.available() == 0)
{
if(ptr.StatusThreadReadSocket == 0)
{
ptr.dataInputStream.close();
break;
}
}
if(ptr.StatusThreadReadSocket == 1)
{
int end = ReadTools.getPacketSize(ptr.dataInputStream);
byte[] buffer = new byte[end];
while (i != end) {
buffer[j] = ptr.dataInputStream.readByte();
i++;
j++;
}
ptr.StatusThreadReadSocket = 0;
}
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
...
}
Java doesn't split anything. You should post the minimal code which reproduces your error, and tell where ReadTools comes from.
There are two options here:
ReadTools.getPacketSize() returns 4096
You inadvertently reassign myByteArray to another array
You should really post your full code and say which library you use. Likely, it will have a method like
read(byte[] buffer, int offset, int length);
which will save you some typing and also give better performance if all you need is to bulk-read the content of the input into memory.
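A bulk read of that kind could look like the following sketch. It assumes only a plain java.io.InputStream and a known packet size; the class and method names are hypothetical and unrelated to the ReadTools class from the question. The loop is needed because a single read() call may return fewer bytes than requested.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper class; names are assumptions for this sketch.
class BulkRead {

    // Reads exactly size bytes, or fails if the stream ends early.
    static byte[] readPacket(InputStream in, int size) {
        byte[] buffer = new byte[size];
        int offset = 0;
        try {
            while (offset < size) {
                // read() may deliver only part of what was asked for.
                int n = in.read(buffer, offset, size - offset);
                if (n == -1) {
                    throw new EOFException("Stream ended after " + offset + " of " + size + " bytes");
                }
                offset += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer;
    }
}
```

When the stream is already a DataInputStream, its readFully(byte[]) method performs the same loop.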
