Compressing a Base64 String does not make it smaller - Java
I'm trying to compress a Base64 String using the java.util.zip.GZIPOutputStream and Deflater classes. My problem is that the result is not smaller in either case. In the first case, with GZIPOutputStream, the result is bigger, and in the second case, with the Deflater class, the size is almost the same.
The output of my code is:
Original String Size: 8799
CompressedGZip String Size: 8828
UncompressedGZip String Size: 8799
Original_String_Length=8799
Compressed_String_Length Deflater=8812, Compression_Ratio=-0.147%
Decompressed_String_Length Deflater=8799 == Original_String_Length (8799)
Original_String == Decompressed_String=True
As you can see, in both cases the compressed string is not smaller. I need to compress the input Base64 String because in some cases it is too long. Is there any way to achieve this?
This is my code:
private static String compressFileGZip(String data) {
    try {
        // Create an output stream, and a gzip stream to wrap over.
        ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
        GZIPOutputStream gzip = new GZIPOutputStream(bos);
        // Compress the input string
        gzip.write(data.getBytes());
        gzip.close();
        byte[] compressed = bos.toByteArray();
        bos.close();
        // Convert to base64
        compressed = Base64.getEncoder().encode(compressed);
        // return the newly created string
        return new String(compressed);
    } catch(IOException e) {
        return null;
    }
}
private static String decompressFileGZip(String compressedText) throws IOException {
    ByteArrayOutputStream stream = new ByteArrayOutputStream();
    // get the bytes for the compressed string
    byte[] compressed = compressedText.getBytes("UTF8");
    // convert the bytes from base64 to normal string
    Base64.Decoder d = Base64.getDecoder();
    compressed = d.decode(compressed);
    // decode.
    final int BUFFER_SIZE = 32;
    ByteArrayInputStream is = new ByteArrayInputStream(compressed);
    GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
    StringBuilder string = new StringBuilder();
    byte[] data = new byte[BUFFER_SIZE];
    int bytesRead;
    while ((bytesRead = gis.read(data)) != -1)
    {
        string.append(new String(data, 0, bytesRead));
    }
    gis.close();
    is.close();
    return string.toString();
}
public static void main(String args[]) {
    String input = "";
    String compressedGZip = compressFileGZip(input);
    String compressedDeflater = null;
    String uncompressedGZip = null;
    String decompressed = null;
    try {
        compressedDeflater = compress(input);
        uncompressedGZip = decompressFileGZip(compressedGZip);
        decompressed = decompress(decodeBase64(compressedDeflater));
    } catch (IOException e) {
        e.printStackTrace();
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("Original String Size: " + input.length());
    System.out.println("CompressedGZip String Size: " + compressedGZip.length());
    System.out.println("UncompressedGZip String Size: " + uncompressedGZip.length());
    Integer savedLength = input.length() - compressedDeflater.length();
    Double saveRatio = (new Double(savedLength) * 100) / input.length();
    String ratioString = saveRatio.toString() + "00000000";
    ratioString = ratioString.substring(0, ratioString.indexOf(".") + 4);
    println("Original_String_Length=" + input.length());
    println("Compressed_String_Length Deflater=" + compressedDeflater.length() + ", Compression_Ratio=" + ratioString + "%");
    println("Decompressed_String_Length Deflater=" + decompressed.length() + " == Original_String_Length (" + input.length() + ")");
    println("Original_String == Decompressed_String=" + (input.equals(decompressed) ? "True" : "False"));
    // end
}
public static String compress(String str) throws Exception {
    return compress(str.getBytes("UTF-8"));
}
public static String compress(byte[] bytes) throws Exception {
    Deflater deflater = new Deflater();
    deflater.setInput(bytes);
    deflater.finish();
    //deflater.deflate(bytes, 2, bytes.length);
    ByteArrayOutputStream bos = new ByteArrayOutputStream(bytes.length);
    byte[] buffer = new byte[1024];
    while(!deflater.finished()) {
        int count = deflater.deflate(buffer);
        bos.write(buffer, 0, count);
    }
    bos.close();
    byte[] output = bos.toByteArray();
    return encodeBase64(output);
}
public static String decompress(byte[] bytes) throws Exception {
    Inflater inflater = new Inflater();
    inflater.setInput(bytes);
    ByteArrayOutputStream bos = new ByteArrayOutputStream(bytes.length);
    byte[] buffer = new byte[1024];
    while (!inflater.finished()) {
        int count = inflater.inflate(buffer);
        bos.write(buffer, 0, count);
    }
    bos.close();
    byte[] output = bos.toByteArray();
    return new String(output);
}
public static String encodeBase64(byte[] bytes) throws Exception {
    BASE64Encoder base64Encoder = new BASE64Encoder();
    return base64Encoder.encodeBuffer(bytes).replace("\r\n", "").replace("\n", "");
}
public static byte[] decodeBase64(String str) throws Exception {
    BASE64Decoder base64Decoder = new BASE64Decoder();
    return base64Decoder.decodeBuffer(str);
}
public static void println(Object o) {
    System.out.println("" + o);
}
Related
ZLib decompression fails on large byte array
When experimenting with ZLib compression, I have run across a strange problem. Decompressing a zlib-compressed byte array with random data fails reproducibly if the source array is at least 32752 bytes long. Here's a little program that reproduces the problem; you can see it in action on IDEOne. The compression and decompression methods are standard code picked off tutorials.
public class ZlibMain {
    private static byte[] compress(final byte[] data) {
        final Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        final byte[] bytesCompressed = new byte[Short.MAX_VALUE];
        final int numberOfBytesAfterCompression = deflater.deflate(bytesCompressed);
        final byte[] returnValues = new byte[numberOfBytesAfterCompression];
        System.arraycopy(bytesCompressed, 0, returnValues, 0, numberOfBytesAfterCompression);
        return returnValues;
    }
    private static byte[] decompress(final byte[] data) {
        final Inflater inflater = new Inflater();
        inflater.setInput(data);
        try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length)) {
            final byte[] buffer = new byte[Math.max(1024, data.length / 10)];
            while (!inflater.finished()) {
                final int count = inflater.inflate(buffer);
                outputStream.write(buffer, 0, count);
            }
            outputStream.close();
            final byte[] output = outputStream.toByteArray();
            return output;
        } catch (DataFormatException | IOException e) {
            throw new RuntimeException(e);
        }
    }
    public static void main(final String[] args) {
        roundTrip(100);
        roundTrip(1000);
        roundTrip(10000);
        roundTrip(20000);
        roundTrip(30000);
        roundTrip(32000);
        for (int i = 32700; i < 33000; i++) {
            if(!roundTrip(i))break;
        }
    }
    private static boolean roundTrip(final int i) {
        System.out.printf("Starting round trip with size %d: ", i);
        final byte[] data = new byte[i];
        for (int j = 0; j < data.length; j++) {
            data[j]= (byte) j;
        }
        shuffleArray(data);
        final byte[] compressed = compress(data);
        try {
            final byte[] decompressed = CompletableFuture.supplyAsync(() -> decompress(compressed))
                    .get(2, TimeUnit.SECONDS);
            System.out.printf("Success (%s)%n", Arrays.equals(data, decompressed) ? "matching" : "non-matching");
            return true;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            System.out.println("Failure!");
            return false;
        }
    }
    // Implementing Fisher–Yates shuffle
    // source: https://stackoverflow.com/a/1520212/342852
    static void shuffleArray(byte[] ar) {
        Random rnd = ThreadLocalRandom.current();
        for (int i = ar.length - 1; i > 0; i--) {
            int index = rnd.nextInt(i + 1);
            // Simple swap
            byte a = ar[index];
            ar[index] = ar[i];
            ar[i] = a;
        }
    }
}
Is this a known bug in ZLib? Or do I have an error in my compress / decompress routines?
It is an error in the logic of the compress / decompress methods. I am not that deep into the implementation, but with debugging I found the following. When the buffer of 32752 bytes is compressed, the deflater.deflate() method returns a value of 32767, which is the size to which you initialized the buffer in the line:
    final byte[] bytesCompressed = new byte[Short.MAX_VALUE];
If you increase the buffer size, for example to
    final byte[] bytesCompressed = new byte[4 * Short.MAX_VALUE];
then you will see that the input of 32752 bytes is actually deflated to 32768 bytes. So in your code, the compressed data does not contain all the data that should be in there. When you then try to decompress, the inflater.inflate() method returns zero, which indicates that more input data is needed. But as you only check for inflater.finished(), you end up in an endless loop. So you can either increase the buffer size when compressing, but that probably just means having the same problem with bigger files, or, better, rewrite the compress/decompress logic to process your data in chunks.
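The chunked approach and the endless-loop hazard described above can be sketched as follows. This is a minimal illustration, not the original poster's code: the class name ChunkedZlib and the decision to throw IllegalStateException on incomplete input are assumptions made for the example.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, named only for this sketch.
final class ChunkedZlib {

    // Compress in 1 KiB chunks so the output size never depends on one fixed buffer.
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int count = deflater.deflate(buffer);
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib memory
        }
    }

    // Decompress in chunks and fail fast instead of spinning on truncated input.
    static byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                // Zero bytes produced, stream not finished, and no input left:
                // the compressed data is incomplete, so looping again cannot help.
                if (count == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("incomplete zlib stream");
                }
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }
}
```

With this guard, feeding decompress() a cut-off array raises an exception rather than hanging, which would also have made the original failure easier to diagnose.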
Apparently the compress() method was faulty. This one works:
public static byte[] compress(final byte[] data) {
    try (final ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);) {
        final Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        final byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            final int count = deflater.deflate(buffer);
            outputStream.write(buffer, 0, count);
        }
        final byte[] output = outputStream.toByteArray();
        return output;
    } catch (IOException e) {
        throw new IllegalStateException(e);
    }
}
How to decompress zlib
I am hoping to decompress a zlib file. Below is a picture of my original: When byte number 4 is not equal to 0, the data after the 9th byte needs to be decompressed. When I try to decompress, I receive the following error: "java.util.zip.DataFormatException: incorrect header check". My code is below:
private static void writeToRawInFile(byte[] bData) {
    try {
        int index = 4;
        byteBuffer = null;
        byteBuffer = ByteBuffer.allocate(bData.length);
        byteBuffer.put(bData);
        byteBuffer.flip();
        byte[] convertByte = byteBuffer.array();
        System.out.println("Byte:" + convertByte.toString());
        if (byteBuffer.get(index) == 0) {
            System.out.println("index 0");
            byteBuffer = null;
            byteBuffer = ByteBuffer.allocate(bData.length);
            byteBuffer.put(bData);
            byteBuffer.flip();
            //factoryType.fileServer.fc.write(byteBuffer);
        } else {
            //bData = Arrays.copyOfRange(bData, 1,9);
            bData = decompressByteArray(Arrays.copyOfRange(bData, 0,9));
            byteBuffer = ByteBuffer.allocate(bData.length);
            byteBuffer.put(bData);
            byteBuffer.flip();
            System.out.println("data ="+ byteBuffer);
            //factoryType.fileServer.fc.write(byteBuffer);
        }
        //factoryType.fileServer.fc.write(byteBuffer);
    } catch (Exception e) {
        System.out.println("expMsg .writeToRawInFile()=" + e.getMessage());
        e.printStackTrace();
    }
    if (byteBuffer != null) {
        byteBuffer.clear();
    }
}
public static byte[] decompressByteArray(byte[] bytes) throws IOException, DataFormatException {
    Inflater inflater = new Inflater();
    System.out.println("Original: " + bytes.length);
    inflater.setInput(bytes);
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream(bytes.length);
    byte[] buffer = new byte[1024];
    while (!inflater.finished()) {
        int count = inflater.inflate(buffer);
        outputStream.write(buffer, 0, count);
    }
    outputStream.close();
    byte[] output = outputStream.toByteArray();
    System.out.println("Compressed: " + output.length);
    return output;
}
Is there any compression method in Java to reduce the number of characters in a string?
I am currently facing a problem while compressing a string to fewer characters in java. I have a huge string which is about 751396 characters and there is a requirement of compressing the string into a 1500 characters. I have tried GZIP Compressor, Inflater & Deflater but these libraries return byte arrays Then I tried LZ-String compressor in which I was able to get satisfactory results using UTF16 encoding and base64 encoding, But these compression return some characters which are neither alphanumeric nor are they included in the symbols list provided. N.B. The list for the Symbols is [+,-,*,/,!,#,#] is there any other technique of compressing the string into another string with fewer characters and providing at least 30% of compression ratio. The codes which I am using for GZip compression is as follows:- import java.io.BufferedReader; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; import java.io.InputStreamReader; import java.util.zip.GZIPInputStream; import java.util.zip.GZIPOutputStream; public class GZIPCompression { public static byte[] compress(final String str) throws IOException { if ((str == null) || (str.length() == 0)) { return null; } ByteArrayOutputStream obj = new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(obj); gzip.write(str.getBytes("UTF-8")); gzip.close(); return obj.toByteArray(); } public static String decompress(final byte[] compressed) throws IOException { String outStr = ""; if ((compressed == null) || (compressed.length == 0)) { return ""; } if (isCompressed(compressed)) { GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed)); BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8")); String line; while ((line = bufferedReader.readLine()) != null) { outStr += line; } } else { outStr = new String(compressed); } return outStr; } public static boolean isCompressed(final byte[] compressed) { return 
(compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)); } } The code for the Inflater & Deflater program is as follows:- import java.io.ByteArrayOutputStream; import java.io.IOException; import java.util.Arrays; import java.util.zip.DataFormatException; import java.util.zip.Deflater; import java.util.zip.Inflater; public class Apple { public static void main(String[] args) { String sr = " [120,-100,-19,89,91,79,-21,56,16,-2,43,40,-49,104,55,113,-18,-68,-91,-23,21,104,90,-38,-62,10,-83,120,48,-83,91,34,-46,-92,-21,-92,8,-124,-8,-17,103,-100,-92,77,-22,-38,-25,112,86,27,-119,106,65,84,-86,103,-58,-10,124,-98,-15,101,-66,-66,43,25,126,-99,-112,116,-109,-60,41,81,46,-34,-107,-57,36,121,14,-29,-43,-20,109,3,77,101,-94,-100,43,120,-79,-115,50,63,-39,-58,25,8,52,16,-52,-97,-62,104,-79,19,-88,32,8,-29,37,-114,-77,-70,-92,71,-78,89,125,-36,-65,-33,-107,5,-50,-40,-120,-86,-11,39,82,-31,95,-77,-40,-48,-21,-94,15,82,13,11,-58,-115,112,-102,-6,-55,-126,-103,-7,126,-65,53,-22,-113,-64,-58,123,-63,97,52,37,-85,53,97,-106,-17,74,55,10,87,79,-39,96,-63,-100,65,76,31,-46,40,-116,73,-39,111,-38,3,81,97,18,108,-41,-113,-124,-126,-52,-48,-100,-62,-50,-89,120,-103,-107,-56,108,-99,9,71,52,92,-123,49,52,91,-41,-109,125,19,44,55,9,-51,102,-124,-82,-61,24,71,-96,5,85,-101,-92,25,-76,-78,48,-55,-51,71,-61,67,-103,-92,-49,6,-45,108,75,73,27,-80,-49,-62,53,-101,23,-64,-25,75,-96,89,103,72,-67,48,-44,11,-107,-83,-105,71,105,-8,-126,35,-119,29,-70,-48,74,-69,-10,-106,-18,92,-48,-98,104,122,-90,-85,48,93,10,-118,2,108,-78,-100,102,-55,38,85,46,-44,115,-27,46,-60,-123,23,-2,-106,82,18,-49,-33,-54,21,26,4,-109,-35,-86,-114,-15,107,23,-125,119,36,-125,70,-102,71,-55,23,-58,96,47,5,-60,-13,-61,-24,-80,-28,96,97,105,-31,52,-100,123,101,60,53,-61,112,33,12,48,54,19,-61,-56,74,-112,116,41,-127,-42,74,41,-28,-69,-4,34,-53,109,-68,-64,-113,17,1,-7,-3,77,-18,-8,44,-55,112,4,-39,-77,27,12,68,61,-102,-92,-23,126,112,-45,48,-64,-91,100,-67,14,-45,
-76,88,11,-45,-2,-61,-75,-108,-113,-113,-13,67,0,-99,-114,12,64,-91,17,3,-128,-124,108,14,0,-46,28,-99,3,-32,104,66,0,-35,-82,12,64,-91,-111,0,48,-35,6,1,-40,-74,-54,1,-48,84,93,-120,-96,-33,-105,33,-88,52,98,4,-70,-34,96,8,116,-45,114,121,4,-70,24,-63,-27,-91,12,65,-91,-111,32,112,27,-116,-127,-127,44,-115,71,96,-70,66,4,87,87,50,4,-107,70,-116,-64,112,26,-116,-127,-87,89,54,-113,-64,21,-57,-32,-6,90,-122,-96,-46,-120,17,-104,77,34,-80,-112,-50,111,100,36,-55,-94,-31,80,-122,-96,-46,-120,17,-40,77,30,69,-74,-87,-15,89,-124,36,103,81,16,-56,16,84,26,49,2,71,111,112,31,56,-82,-55,-97,69,-70,110,10,17,-116,70,50,4,-107,70,-116,-64,117,26,68,-96,-87,-90,-31,-16,16,92,49,-124,-101,27,25,-124,74,35,-71,-110,-31,-38,108,16,3,-46,85,126,51,27,-106,56,-111,38,19,25,-122,74,35,-63,-48,-24,-99,-96,25,8,-103,-4,-61,66,-78,-99,-89,83,25,-122,74,35,-63,96,-94,38,115,-55,-46,85,-2,72,-78,52,113,28,102,51,25,-122,74,35,-63,96,55,-71,-93,53,-57,52,-8,45,109,-19,-10,-61,-61,-57,-71,-78,43,43,-90,25,-50,-74,48,57,-100,96,-73,41,-95,51,-118,-25,-49,121,93,48,25,-34,-113,6,-82,-83,-62,-3,91,124,116,-37,-123,67,-56,50,12,3,-23,-102,-19,-72,-106,-125,92,56,-25,-64,39,-16,99,-76,-51,54,-37,18,-28,106,-123,87,-60,-117,23,67,-126,-93,-76,27,1,-100,-36,-29,-68,-108,41,-86,-118,-78,18,41,94,-53,71,-75,8,91,46,-80,-50,-21,20,-74,58,-108,-32,-25,-37,-51,-2,-127,93,-82,-93,-16,69,46,20,10,94,-43,-99,127,-74,-84,84,0,31,-40,12,59,41,76,90,127,-58,-77,64,-46,112,83,-106,10,-29,105,-105,57,-73,-49,64,-107,-91,-62,-95,-55,109,-69,110,-94,-101,-24,-40,-92,55,-99,-43,76,28,-29,-40,-30,-22,-54,-81,15,-14,-15,112,28,108,-45,97,-50,82,28,-89,120,-50,122,117,9,-55,-105,120,74,-24,75,56,39,-2,19,-90,43,-112,56,-4,-117,51,47,16,-123,111,126,54,-53,-124,-64,-94,80,-78,-24,-122,36,90,-28,-21,-100,71,2,-60,53,-55,42,-65,-59,-64,-63,-63,98,76,-109,100,89,-74,-58,-112,-5,-84,116,-37,-105,-117,65,94,-65,-58,-117,-68,113,-49,-118,46,-88,-54,70,-53,86,72,-77,39,-82,79,-25,117,19,-46,-73,118,81,-39,-42,21,-125,52,-3
5,66,21,-99,-105,-60,-12,-83,84,6,121,-23,-122,-93,48,43,36,-112,-54,62,43,-91,-65,-66,-101,-125,-68,-64,111,-60,-49,-5,-1,-50,79,112,52,-33,-73,-5,-8,-105,45,-40,15,-20,91,-71,-79,-18,122,67,118,-78,49,-55,97,-10,6,-55,89,-47,-95,80,-18,-113,39,-106,-25,-121,-3,-81,-123,-3,107,-118,-86,80,50,-71,-34,99,-65,-27,11,123,-113,59,94,112,59,-101,-98,121,65,-5,-84,-43,-71,-13,38,94,-81,-61,-115,-90,25,-68,47,-63,-99,-60,-105,-102,66,-18,75,32,-13,37,-16,-4,-2,-24,55,93,-71,12,36,-82,92,122,-125,-32,108,-40,-15,120,127,116,-109,31,-94,-35,-110,12,-47,30,120,-83,-50,108,-32,127,110,24,95,6,-53,-9,-90,-3,-65,58,-97,89,-27,-121,-113,-4,-74,-84,81,86,-58,17,101,5,23,-54,33,101,85,-29,20,126,66,89,-63,-33,39,73,43,-3,-41,-92,85,-50,66,125,-98,-76,-54,57,-82,127,69,90,25,123,50,74,117,-10,100,-108,-128,-76,-86,-20,-64,72,64,90,33,70,90,53,-55,89,125,85,-54,7,-23,-4,-3,117,98,-108,-113,-125,-8,119,-27,-87,81,62,22,66,39,78,-7,-24,26,79,124,-98,26,-27,-125,-48,17,113,120,106,-108,-113,-61,111,-28,-109,-93,124,44,62,-117,78,-116,-14,113,-43,-93,26,-9,-28,40,31,75,-27,121,-73,19,-92,124,44,126,51,-97,32,-27,99,-13,-44,-37,9,82,62,38,127,36,-99,32,-27,-29,30,-47,86,95,-97,-14,41,-33,-14,-115,-110,62,-59,-77,-108,39,125,10,-23,79,73,-97,-39,-92,-50,-24,-120,56,-97,-33,-90,-123,12,83,-1,21,45,-92,33,1,115,116,-56,11,25,34,94,-56,-74,63,-59,11,-79,95,43,-72,119,-87,-50,-1,-112,-73,-69,-52,-66,-119,-95,-26,-35,-4,38,-122,-66,-119,-95,-1,27,49,4,-97,31,15,0,-88,84]"; byte[] data = sr.getBytes(); try { String x = new String(decompress(compress(data))); System.out.println("decompressed " + x); } catch (IOException | DataFormatException e) { e.printStackTrace(); } } public static byte[] compress(byte[] data) throws IOException { Deflater deflater = new Deflater(); deflater.setInput(data); ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length); deflater.finish(); byte[] buffer = new byte[1024]; while (!deflater.finished()) { int count = deflater.deflate(buffer); 
outputStream.write(buffer, 0, count); } outputStream.close(); byte[] output = outputStream.toByteArray(); System.out.println("Original: " + data.length); System.out.println("Compressed: " + output.length); return output; } public static byte[] decompress(byte[] data) throws IOException, DataFormatException { Inflater inflater = new Inflater(); inflater.setInput(data); ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length); byte[] buffer = new byte[1024]; while (!inflater.finished()) { int count = inflater.inflate(buffer); outputStream.write(buffer, 0, count); } outputStream.close(); byte[] output = outputStream.toByteArray(); System.out.println(); return output; } } A sample of how the data will look like:- "120,-100,-19,89,91,79,-21,56,16,-2,43,40,-49,104,55,113,-18,-68,-91,-23,21,104,90,-38,-62,10,-83,120,48,-83,91,34,-46,-92,-21,-92,8,-124,-8,-17,103,-100,-92,77,-22,-38,-25,112,86,27,-119,106,65,84,-86,103,-58,-10,124,-98,-15,101,-66,-66,43,25,126,-99,-112,116,-109,-60,41,81,46,-34,-107,-57,36,121,14,-29,-43,-20,109,3,77,101,-94,-100,43,120,-79,-115,50,63,-39,-58,25,8,52,16,-52,-97,-62,104,-79,19,-88,32,8,-29,37,-114,-77,-70,-92,71,-78,89,125,-36,-65,-33,-107,5,-50,-40,-120,-86,-11,39,82,-31,95,-77,-40,-48,-21,-94,15,82,13,11,-58,-115,112,-102,-6,-55,-126,-103,-7,126,-65,53,-22,-113,-64,-58,123,-63,97,52,37,-85,53,97,-106,-17,74,55,10,87,79,-39,96,-63,-100,65,76,31,-46,40,-116,73,-39,111,-38,3,81,97,18,108,-41,-113,-124,-126,-52,-48,-100,-62,-50,-89,120,-103,-107,-56,108,-99,9,71,52,92,-123,49,52,91,-41,-109,125,19,44,55,9,-51,102,-124,-82,-61,24,71,-96,5,85,-101,-92,25,-76,-78,48,-55,-51,71,-61,67,-103,-92,-49,6,-45,108,75,73,27,-80,-49,-62,53,-101,23,-64,-25,75,-96,89,103,72,-67,48,-44,11,-107,-83,-105,71,105,-8,-126,35,-119,29,-70,-48,74,-69,-10,-106,-18,92,-48,-98,104,122,-90,-85,48,93,10,-118,2,108,-78,-100,102,-55,38,85,46,-44,115,-27,46,-60,-123,23,-2,-106,82,18,-49,-33,-54,21,26,4,-109,-35,-86,-114,-15,107,23,-125,119,36,-125,70,-102,
71,-55,23,-58,96,47,5,-60,-13,-61,-24,-80,-28,96,97,105,-31,52,-100,123,101,60,53,-61,112,33,12,48,54,19,-61,-56,74,-112,116,41,-127,-42,74,41,-28,-69,-4,34,-53,109,-68,-64,-113,17,1,-7,-3,77,-18,-8,44,-55,112,4,-39,-77,27,12,68,61,-102,-92,-23,126,112,-45,48,-64,-91,100,-67,14,-45,-76,88,11,-45,-2,-61,-75,-108,-113,-113,-13,67,0,-99,-114,12,64,-91,17,3,-128,-124,108,14,0,-46,28,-99,3,-32,104,66,0,-35,-82,12,64,-91,-111,0,48,-35,6,1,-40,-74,-54,1,-48,84,93,-120,-96,-33,-105,33,-88,52,98,4,-70,-34,96,8,116,-45,114,121,4,-70,24,-63,-27,-91,12,65,-91,-111,32,112,27,-116,-127,-127,44,-115,71,96,-70,66,4,87,87,50,4,-107,70,-116,-64,112,26,-116,-127,-87,89,54,-113,-64,21,-57,-32,-6,90,-122,-96,-46,-120,17,-104,77,34,-80,-112,-50,111,100,36,-55,-94,-31,80,-122,-96,-46,-120,17,-40,77,30,69,-74,-87,-15,89,-124,36,103,81,16,-56,16,84,26,49,2,71,111,112,31,56,-82,-55,-97,69,-70,110,10,17,-116,70,50,4,-107,70,-116,-64,117,26,68,-96,-87,-90,-31,-16,16,92,49,-124,-101,27,25,-124,74,35,-71,-110,-31,-38,108,16,3,-46,85,126,51,27,-106,56,-111,38,19,25,-122,74,35,-63,-48,-24,-99,-96,25,8,-103,-4,-61,66,-78,-99,-89,83,25,-122,74,35,-63,96,-94,38,115,-55,-46,85,-2,72,-78,52,113,28,102,51,25,-122,74,35,-63,96,55,-71,-93,53,-57,52,-8,45,109,-19,-10,-61,-61,-57,-71,-78,43,43,-90,25,-50,-74,48,57,-100,96,-73,41,-95,51,-118,-25,-49,121,93,48,25,-34,-113,6,-82,-83,-62,-3,91,124,116,-37,-123,67,-56,50,12,3,-23,-102,-19,-72,-106,-125,92,56,-25,-64,39,-16,99,-76,-51,54,-37,18,-28,106,-123,87,-60,-117,23,67,-126,-93,-76,27,1,-100,-36,-29,-68,-108,41,-86,-118,-78,18,41,94,-53,71,-75,8,91,46,-80,-50,-21,20,-74,58,-108,-32,-25,-37,-51,-2,-127,93,-82,-93,-16,69,46,20,10,94,-43,-99,127,-74,-84,84,0,31,-40,12,59,41,76,90,127,-58,-77,64,-46,112,83,-106,10,-29,105,-105,57,-73,-49,64,-107,-91,-62,-95,-55,109,-69,110,-94,-101,-24,-40,-92,55,-99,-43,76,28,-29,-40,-30,-22,-54,-81,15,-14,-15,112,28,108,-45,97,-50,82,28,-89,120,-50,122,117,9,-55,-105,120,74,-24,75,56,39,-2,19,-90,43,-112,56,-4,-117,51,47,16,-
123,111,126,54,-53,-124,-64,-94,80,-78,-24,-122,36,90,-28,-21,-100,71,2,-60,53,-55,42,-65,-59,-64,-63,-63,98,76,-109,100,89,-74,-58,-112,-5,-84,116,-37,-105,-117,65,94,-65,-58,-117,-68,113,-49,-118,46,-88,-54,70,-53,86,72,-77,39,-82,79,-25,117,19,-46,-73,118,81,-39,-42,21,-125,52,-35,66,21,-99,-105,-60,-12,-83,84,6,121,-23,-122,-93,48,43,36,-112,-54,62,43,-91,-65,-66,-101,-125,-68,-64,111,-60,-49,-5,-1,-50,79,112,52,-33,-73,-5,-8,-105,45,-40,15,-20,91,-71,-79,-18,122,67,118,-78,49,-55,97,-10,6,-55,89,-47,-95,80,-18,-113,39,-106,-25,-121,-3,-81,-123,-3,107,-118,-86,80,50,-71,-34,99,-65,-27,11,123,-113,59,94,112,59,-101,-98,121,65,-5,-84,-43,-71,-13,38,94,-81,-61,-115,-90,25,-68,47,-63,-99,-60,-105,-102,66,-18,75,32,-13,37,-16,-4,-2,-24,55,93,-71,12,36,-82,92,122,-125,-32,108,-40,-15,120,127,116,-109,31,-94,-35,-110,12,-47,30,120,-83,-50,108,-32,127,110,24,95,6,-53,-9,-90,-3,-65,58,-97,89,-27,-121,-113,-4,-74,-84,81,86,-58,17,101,5,23,-54,33,101,85,-29,20,126,66,89,-63,-33,39,73,43,-3,-41,-92,85,-50,66,125,-98,-76,-54,57,-82,127,69,90,25,123,50,74,117,-10,100,-108,-128,-76,-86,-20,-64,72,64,90,33,70,90,53,-55,89,125,85,-54,7,-23,-4,-3,117,98,-108,-113,-125,-8,119,-27,-87,81,62,22,66,39,78,-7,-24,26,79,124,-98,26,-27,-125,-48,17,113,120,106,-108,-113,-61,111,-28,-109,-93,124,44,62,-117,78,-116,-14,113,-43,-93,26,-9,-28,40,31,75,-27,121,-73,19,-92,124,44,126,51,-97,32,-27,99,-13,-44,-37,9,82,62,38,127,36,-99,32,-27,-29,30,-47,86,95,-97,-14,41,-33,-14,-115,-110,62,-59,-77,-108,39,125,10,-23,79,73,-97,-39,-92,-50,-24,-120,56,-97,-33,-90,-123,12,83,-1,21,45,-92,33,1,115,116,-56,11,25,34,94,-56,-74,63,-59,11,-79,95,43,-72,119,-87,-50,-1,-112,-73,-69,-52,-66,-119,-95,-26,-35,-4,38,-122,-66,-119,-95,-1,27,49,4,-97,31,15,0,-88,84" Is there a better Option for reducing the number of characters in a string without converting it to byte array and unwanted characters? Thanks in advance,
You can compress to a byte[] and then encode the result in Base64. This uses only alphanumeric characters and a few symbols that are safe to transfer as text; Base64 is widely used for this purpose.
public static void main(String[] args) {
    StringBuilder sb = new StringBuilder();
    while (sb.length() < 751396)
        sb.append("Size: ").append(sb.length()).append("\n");
    String s = sb.toString();
    String s2 = deflateBase64(s);
    System.out.println("Uncompressed size = " + s.length() + ", compressed size=" + s2.length());
    String s3 = inflateBase64(s2);
    System.out.println("Same after inflating is " + s3.equals(s));
}
public static String deflateBase64(String text) {
    try {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(new DeflaterOutputStream(baos))) {
            writer.write(text);
        }
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    } catch (IOException e) {
        throw new AssertionError(e);
    }
}
public static String inflateBase64(String base64) {
    try (Reader reader = new InputStreamReader(
            new InflaterInputStream(
                    new ByteArrayInputStream(
                            Base64.getDecoder().decode(base64))))) {
        StringWriter sw = new StringWriter();
        char[] chars = new char[1024];
        for (int len; (len = reader.read(chars)) > 0; )
            sw.write(chars, 0, len);
        return sw.toString();
    } catch (IOException e) {
        throw new AssertionError(e);
    }
}
prints
Uncompressed size = 751400, compressed size=219564
Same after inflating is true
You can tune the Deflater a little more:
public static byte[] compress(byte[] data) throws IOException {
    new Deflater(Deflater.BEST_COMPRESSION, true);
    //...
}
This gives you the strongest compression level and skips the zlib header and checksum. It is the best you can do with the built-in algorithms.
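To make the effect of the second constructor argument concrete, here is a small sketch. The class and method names (RawDeflateSketch, deflate, inflate) are hypothetical and not part of the answer above. It relies on two facts about java.util.zip: passing nowrap = true produces raw DEFLATE data without the zlib wrapper (a 2-byte header plus a 4-byte Adler-32 checksum), and the Inflater must then be created with the same nowrap value, otherwise decoding fails with an "incorrect header check" style error.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, named only for this sketch.
final class RawDeflateSketch {

    // Compress at the highest level; nowrap=true omits the zlib header and checksum.
    static byte[] deflate(byte[] data, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION, nowrap);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    // The nowrap flag here must match the one used for compression.
    static byte[] inflate(byte[] data, boolean nowrap) {
        Inflater inflater = new Inflater(nowrap);
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int count = inflater.inflate(buffer);
                // Guard against spinning forever on incomplete input.
                if (count == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("incomplete deflate stream");
                }
                out.write(buffer, 0, count);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }
}
```

The saving from nowrap is a fixed handful of bytes, so it matters for short strings and is negligible for large ones; the compression level is what affects bigger inputs.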
Compression and Encoding giving Wrong results in Strings
I'm trying to compress a string. I'm using Base64 encoding and decoding to convert the String to bytes and vice versa.
import org.apache.axis.encoding.Base64;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
public class UtilTesting {
    public static void main(String[] args) {
        try {
            String original = "I am the god";
            System.out.println("Starting Zlib");
            System.out.println("==================================");
            String zcompressed = compressString(original);
            String zdecompressed = decompressString(zcompressed);
            System.out.println("Original String: "+original);
            System.out.println("Compressed String: "+zcompressed);
            System.out.println("Decompressed String: "+zdecompressed);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    public static String compressString(String uncompressedString){
        String compressedString = null;
        byte[] bytes = Base64.decode(uncompressedString);
        try {
            bytes = compressBytes(bytes);
            compressedString = Base64.encode(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return compressedString;
    }
    public static String decompressString(String compressedString){
        String decompressedString = null;
        byte[] bytes = Base64.decode(compressedString);
        try {
            bytes = decompressBytes(bytes);
            decompressedString = Base64.encode(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        } catch (DataFormatException e) {
            e.printStackTrace();
        }
        return decompressedString;
    }
    public static byte[] compressBytes(byte[] data) throws IOException {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
        deflater.finish();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int count = deflater.deflate(buffer); // returns the generated code... index
            outputStream.write(buffer, 0, count);
        }
        outputStream.close();
        byte[] output = outputStream.toByteArray();
        return output;
    }
    public static byte[] decompressBytes(byte[] data) throws IOException, DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length);
        byte[] buffer = new byte[1024];
        while (!inflater.finished()) {
            int count = inflater.inflate(buffer);
            outputStream.write(buffer, 0, count);
        }
        outputStream.close();
        byte[] output = outputStream.toByteArray();
        return output;
    }
}
This is giving the result:
Starting Zlib
==================================
Original String: I am the god
Compressed String: eJxTXLm29YUGAApUAw0=
Decompressed String: Iamthego
As you can see, it is missing the white-spaces and it even lost the final letter of the given String. Can someone please suggest what is wrong with this code? I'm following these steps: decode, compress, encode, save, retrieve, decode, decompress, encode. Please help. Thank you.
In compressString, replace:
    Base64.decode(uncompressedString)
with
    uncompressedString.getBytes(StandardCharsets.UTF_8)
You're not passing in a base64-encoded string; you simply want the bytes of the input string. Note that spaces never appear in base64 encoding, so they are likely treated as redundant and discarded. Similarly in decompressString, replace:
    Base64.encode(bytes)
with
    new String(bytes, StandardCharsets.UTF_8)
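Put together, the two replacements give a round trip along these lines. This is a sketch under assumptions: it uses the JDK's java.util.Base64 (Java 8+) instead of the Axis Base64 class from the question, swaps the manual Deflater/Inflater loops for DeflaterOutputStream and InflaterInputStream, and the class name StringCompressionSketch is made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper class, named only for this sketch.
final class StringCompressionSketch {

    // text -> UTF-8 bytes -> zlib -> Base64 text
    static String compressString(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buffer)) {
            // Plain text is turned into bytes with a charset, not with Base64.decode.
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Base64 is applied only to the binary, compressed bytes.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    // Base64 text -> zlib bytes -> UTF-8 bytes -> text
    static String decompressString(String base64) {
        byte[] compressed = Base64.getDecoder().decode(base64);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                     new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Decompressed bytes are the original UTF-8 text, so decode with the same charset.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Calling decompressString(compressString("I am the god")) should return the original string with its spaces and final letter intact, because Base64 is now used only on the compressed bytes.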
Deflate and Inflate Java String in Memory Zip Exception Error
I am writing code to deflate and inflate a string with Base64 encoding, but I am getting the following error:
Exception in thread "main" java.util.zip.ZipException: incorrect header check
    at java.util.zip.InflaterOutputStream.write(InflaterOutputStream.java:284)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:108)
    at serializer.test.SerializerTest.main(SerializerTest.java:43)
My code is:
XsltObject Xslt = new XsltObject();
Xslt.setXslt(readFile("C:\\codebase\\OverallSystem\\EBE_TEMPERED_XMLS\\bank_timestamp-0.xml"));
System.out.println("Original String Length: "+ Xslt.getXslt().length());
//JSONObject jsonObj = new JSONObject( Xslt );
// System.out.println( jsonObj );
//System.out.println( "Json Length:" + jsonObj);
DeflaterOutputStream outputStream;
for ( int i = 1; i <= 9; ++i ) {
    ByteArrayOutputStream arrayOutputStream = new ByteArrayOutputStream();
    outputStream = new DeflaterOutputStream(arrayOutputStream, new Deflater( i, true ));
    outputStream.write(Xslt.getXslt().getBytes());
    outputStream.close();
    //System.out.println("Deflate (lvl=" + i + ");" + arrayOutputStream.toString("ISO-8859-1"));
    System.out.println("Deflate (lvl=" + i + ");" + arrayOutputStream.toString("ISO-8859-1").length());
    String temp = DatatypeConverter.printBase64Binary(arrayOutputStream.toString("UTF-8").getBytes());
    System.out.println(temp);
    System.out.println("Base 64 len: " + temp.length());
    byte[] data =DatatypeConverter.parseBase64Binary(temp);
    ByteArrayOutputStream inflateArrayOutputStream = new ByteArrayOutputStream();
    InflaterOutputStream iis = new InflaterOutputStream(inflateArrayOutputStream, new Inflater());
    iis.write(data);
    iis.close();
    System.out.println("Inflate (lvl=" + i + ");" + inflateArrayOutputStream.toString("ISO-8859-1"));
    System.out.println("Inflate (lvl=" + i + ");" + inflateArrayOutputStream.toString("ISO-8859-1").length());
What am I doing wrong?
This fixed all my issues, and it uses only the JDK:
package serializer.test;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.UnsupportedEncodingException;
import java.util.Arrays;
import java.util.zip.*;
import javax.xml.bind.DatatypeConverter;

public class DeflationApp {

    // Deflates the UTF-8 bytes of the string and returns them Base64-encoded.
    private String compressBase64(String stringToCompress, int level)
            throws UnsupportedEncodingException {
        // Fixed-size buffer: compressed output beyond 1024 bytes is truncated.
        byte[] compressedData = new byte[1024];
        byte[] stringAsBytes = stringToCompress.getBytes("UTF-8");
        // nowrap = false writes the zlib header that new Inflater() expects.
        Deflater compressor = new Deflater(level, false);
        compressor.setInput(stringAsBytes);
        compressor.finish();
        int compressedDataLength = compressor.deflate(compressedData);
        compressor.end(); // release the native zlib memory
        byte[] bytes = Arrays.copyOf(compressedData, compressedDataLength);
        return DatatypeConverter.printBase64Binary(bytes);
    }

    // Base64-decodes the input, inflates it, and rebuilds the UTF-8 string.
    private String decompressToStringBase64(String base64String)
            throws UnsupportedEncodingException, DataFormatException {
        byte[] compressedData = DatatypeConverter.parseBase64Binary(base64String);
        Inflater deCompressor = new Inflater();
        deCompressor.setInput(compressedData, 0, compressedData.length);
        // Fixed-size buffer: only the first 100000 decompressed bytes are returned.
        byte[] output = new byte[100000];
        int decompressedDataLength = deCompressor.inflate(output);
        deCompressor.end();
        return new String(output, 0, decompressedDataLength, "UTF-8");
    }

    public static void main(String[] args) throws DataFormatException, IOException {
        DeflationApp m = new DeflationApp();
        String strToBeCompressed = readFile(
                "C:\\codebase\\OverallSystem\\MappingMapToEBECommon.xslt").trim();

        for (int i = 1; i <= 9; ++i) {
            String compressedData = m.compressBase64(strToBeCompressed, i);
            String deCompressedString = m.decompressToStringBase64(compressedData);
            System.out.println("Base 64:");
            System.out.println("Original Length with level(" + i + "): " + strToBeCompressed.length());
            System.out.println("Compressed with level(" + i + "): " + compressedData);
            System.out.println("Compressed with level(" + i + ") Length: " + compressedData.length());
            System.out.println("Decompressed with level(" + i + ") Length: " + deCompressedString.length());
            System.out.println("Decompressed with level(" + i + "): " + deCompressedString);
        }

        for (int i = 1; i <= 9; ++i) {
            byte[] compressedData = m.compress(strToBeCompressed, i);
            String deCompressedString = m.decompressToString(compressedData);
            System.out.println("Without Base 64:");
            System.out.println("Original Length with level(" + i + "): " + strToBeCompressed.length());
            System.out.println("Compressed with level(" + i + "): " + new String(compressedData));
            // Report the byte count; the length of new String(compressedData) depends on the platform charset.
            System.out.println("Compressed with level(" + i + ") Length: " + compressedData.length);
            System.out.println("Decompressed with level(" + i + ") Length: " + deCompressedString.length());
            System.out.println("Decompressed with level(" + i + "): " + deCompressedString);
        }
    }

    // Same as compressBase64, but returns the raw compressed bytes.
    private byte[] compress(String stringToCompress, int level)
            throws UnsupportedEncodingException {
        byte[] compressedData = new byte[1024]; // same 1024-byte limit as above
        byte[] stringAsBytes = stringToCompress.getBytes("UTF-8");
        Deflater compressor = new Deflater(level, false);
        compressor.setInput(stringAsBytes);
        compressor.finish();
        int compressedDataLength = compressor.deflate(compressedData);
        compressor.end();
        return Arrays.copyOf(compressedData, compressedDataLength);
    }

    // Same as decompressToStringBase64, but takes the raw compressed bytes.
    private String decompressToString(byte[] compressedData)
            throws UnsupportedEncodingException, DataFormatException {
        Inflater deCompressor = new Inflater();
        deCompressor.setInput(compressedData, 0, compressedData.length);
        byte[] output = new byte[100000]; // same 100000-byte limit as above
        int decompressedDataLength = deCompressor.inflate(output);
        deCompressor.end();
        return new String(output, 0, decompressedDataLength, "UTF-8");
    }

    // Reads the whole file into a string, normalizing line endings to the platform separator.
    public static String readFile(String file) throws IOException {
        BufferedReader reader = new BufferedReader(new FileReader(file));
        String line = null;
        StringBuilder stringBuilder = new StringBuilder();
        String ls = System.getProperty("line.separator");
        try {
            while ((line = reader.readLine()) != null) {
                stringBuilder.append(line);
                stringBuilder.append(ls);
            }
            return stringBuilder.toString();
        } finally {
            reader.close();
        }
    }
}
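For context on why this works where the question's code failed: the question deflates with new Deflater(i, true), which omits the zlib header, but inflates with new Inflater(), which expects one, hence "incorrect header check". It also passes the compressed bytes through toString("UTF-8").getBytes(), which can alter binary data. The sketch below keeps both sides on the same format and Base64-encodes the raw bytes directly. It is only an illustration under a few assumptions: the class and method names are hypothetical, java.util.Base64 (Java 8+) is swapped in for DatatypeConverter, and the fixed 1024 and 100000 byte arrays are replaced by loops so that input of any size should round-trip.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not part of the original answer.
class DeflateRoundTrip {

    // Deflates with the zlib wrapper (nowrap = false) and Base64-encodes the bytes.
    static String compressToBase64(String text, int level) {
        Deflater deflater = new Deflater(level, false);
        try {
            deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            // Keep draining until the deflater reports that all output was produced.
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return Base64.getEncoder().encodeToString(out.toByteArray());
        } finally {
            deflater.end(); // free native memory
        }
    }

    // Base64-decodes and inflates; nowrap = false must match the deflater above.
    static String decompressFromBase64(String base64) {
        Inflater inflater = new Inflater(false);
        try {
            inflater.setInput(Base64.getDecoder().decode(base64));
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress is possible: the data is truncated or needs a preset dictionary.
                    throw new IllegalArgumentException("truncated or unsupported input");
                }
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            // Wrapped as unchecked only to keep the sketch short.
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
    }
}
```

If raw deflate output without the header is wanted, the same structure should apply as long as both constructors receive the same nowrap value.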
I too had memory problems with DeflaterOutputStream. It works if you let it use the default constructor. This works fine:
for (Entry<String, String> entry : valueMap.entrySet()) {
    String key = entry.getKey();
    String value = entry.getValue();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    DeflaterOutputStream dos = new DeflaterOutputStream(baos);
    try {
        dos.write(value.getBytes());
        dos.flush();
        dos.close();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    byte[] zipData = baos.toByteArray();
    zipValueMap.put(key, zipData);
}
But change that to:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
Deflater deflater = new Deflater(Deflater.BEST_SPEED);
DeflaterOutputStream dos = new DeflaterOutputStream(baos, deflater);
and I get a memory leak in the JVM's native C code that takes up 80 GB and crashes my Mint system.
So why would the default constructor work, yet fail so badly when I pass in my own Deflater? Looking at the source of DeflaterOutputStream (Java 1.8_40), I found some special code in the close method:
public void close() throws IOException {
    if (!closed) {
        finish();
        if (usesDefaultDeflater)
            def.end();
        out.close();
        closed = true;
    }
}
So close() only calls end() on a Deflater that the stream created itself; a Deflater you pass in is never ended, and its native memory is not released by close(). The best solution was to call end() explicitly in the loop:
try {
    dos.write(value.getBytes());
    dos.flush();
    dos.close();
    deflater.end();
}
And no more memory leak. It's a bad leak as well: because it is on the C side, it never threw a JVM error. It just chewed up all 40 GB of RAM I had and then started on the swap space. I had to ssh into the box and kill the process.
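The explicit cleanup described above can be sketched so that end() still runs when a write fails. This is only an illustration: the class and method names are hypothetical, and the IOException is wrapped as unchecked in the same spirit as the RuntimeException in the answer's loop.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper class, not part of the original answer.
class DeflaterCleanup {

    static byte[] deflate(String value) {
        // A caller-supplied Deflater is not ended by DeflaterOutputStream.close().
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            ByteArrayOutputStream baos = new ByteArrayOutputStream();
            // try-with-resources closes the stream first, which finishes the compressed data.
            try (DeflaterOutputStream dos = new DeflaterOutputStream(baos, deflater)) {
                dos.write(value.getBytes(StandardCharsets.UTF_8));
            }
            return baos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Runs after the stream is closed, even on failure: releases the native zlib memory.
            deflater.end();
        }
    }
}
```

The ordering matters here: the stream has to be closed before end() is called, because close() still needs the Deflater to finish writing the compressed data.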