Compress a string to gzip in Java - java
public static String compressString(String str) throws IOException{
if (str == null || str.length() == 0) {
return str;
}
ByteArrayOutputStream out = new ByteArrayOutputStream();
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
Gdx.files.local("gziptest.gzip").writeString(out.toString(), false);
return out.toString();
}
When I save that string to a file, and run gunzip -d file.txt in unix, it complains:
gzip: gzip.gz: not in gzip format
Try to use BufferedWriter
public static String compressString(String str) throws IOException{
if (str == null || str.length() == 0) {
return str;
}
BufferedWriter writer = null;
try{
File file = new File("your.gzip")
GZIPOutputStream zip = new GZIPOutputStream(new FileOutputStream(file));
writer = new BufferedWriter(new OutputStreamWriter(zip, "UTF-8"));
writer.append(str);
}
finally{
if(writer != null){
writer.close();
}
}
}
About your code example try:
public static String compressString(String str) throws IOException{
if (str == null || str.length() == 0) {
return str;
}
ByteArrayOutputStream out = new ByteArrayOutputStream(str.length());
GZIPOutputStream gzip = new GZIPOutputStream(out);
gzip.write(str.getBytes());
gzip.close();
byte[] compressedBytes = out.toByteArray();
Gdx.files.local("gziptest.gzip").writeBytes(compressedBytes, false);
out.close();
return out.toString(); // I would return compressedBytes instead String
}
Try that :
//...
String string = "string";
FileOutputStream fos = new FileOutputStream("filename.zip");
GZIPOutputStream gzos = new GZIPOutputStream(fos);
gzos.write(string.getBytes());
gzos.finish();
//...
Save bytes from out with FileOutputStream
FileOutputStream fos = new FileOutputStream("gziptest.gz");
fos.write(out.toByteArray());
fos.close();
out.toString() seems suspicious, the result will be unreadable, if you dont care then why not to return byte[], if you do care it would look better as hex or base64 string.
Related
Compress and Decompress String in Java
I'm trying to compress and decompress a string from producer and consumer environment (which accepts only string as params). So after I compress a string, I'm converting compressed byte array to string and then passing it to producer. Then in consumer part, I'm taking the string back , converting into byte array and then decompressing the string from bytes. Instead of converting into string, if I used byte[], then it is working fine. But what I need is to convert into string and viceversa. Here is my code : public class Compression { public static void main(String[] args) throws Exception{ // TODO Auto-generated method stub String strToCompress = "Helloo!! "; byte[] compressedBytes = compress(strToCompress); String compressedStr = new String(compressedBytes, StandardCharsets.UTF_8); byte[] bytesToDecompress = compressedStr.getBytes(StandardCharsets.UTF_8); String decompressedStr = decompress(bytesToDecompress); System.out.println("Compressed Bytes : "+Arrays.toString(compressedBytes)); System.out.println("Decompressed String : "+decompressedStr); } public static byte[] compress(final String str) throws IOException { if ((str == null) || (str.length() == 0)) { return null; } ByteArrayOutputStream obj = new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(obj); gzip.write(str.getBytes("UTF-8")); gzip.flush(); gzip.close(); return obj.toByteArray(); } public static String decompress(final byte[] compressed) throws IOException { final StringBuilder outStr = new StringBuilder(); if ((compressed == null) || (compressed.length == 0)) { return ""; } if (isCompressed(compressed)) { //It is not going into this if part final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed)); final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8")); String line; while ((line = bufferedReader.readLine()) != null) { outStr.append(line); } } else { outStr.append(compressed); } return outStr.toString(); } public static boolean isCompressed(final byte[] compressed) { return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)); } }
You can't assume a compressed string can be treated as UTF-8 as many possible byte combinations are not valid UTF-8. I suggest trying ISO-8859-1 which keeps all 8-bit values untranslated. Also note that while large pieces of text should get smaller, small strings can get larger. Note: This loop will strip any newline characters String line; while ((line = bufferedReader.readLine()) != null) { outStr.append(line); } I suggest instead copying using a char[] which won't drop any characters. char[] chars = new char[512]; for(int len; (len = reader.read(chars)) > 0;) outStr.append(chars, 0, len);
Is there any compression method in java to reduce the number of charaters in a string?
I am currently facing a problem while compressing a string to fewer characters in java. I have a huge string which is about 751396 characters and there is a requirement of compressing the string into a 1500 characters. I have tried GZIP Compressor, Inflater & Deflater but these libraries return byte arrays Then I tried LZ-String compressor in which I was able to get satisfactory results using UTF16 encoding and base64 encoding, But these compression return some characters which are neither alphanumeric nor are they included in the symbols list provided. N.B. The list for the Symbols is [+,-,*,/,!,#,#] is there any other technique of compressing the string into another string with fewer characters and providing at least 30% of compression ratio. The codes which I am using for GZip compression is as follows:- import java.io.BufferedReader; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; import java.io.InputStreamReader; import java.util.zip.GZIPInputStream; import java.util.zip.GZIPOutputStream; public class GZIPCompression { public static byte[] compress(final String str) throws IOException { if ((str == null) || (str.length() == 0)) { return null; } ByteArrayOutputStream obj = new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(obj); gzip.write(str.getBytes("UTF-8")); gzip.close(); return obj.toByteArray(); } public static String decompress(final byte[] compressed) throws IOException { String outStr = ""; if ((compressed == null) || (compressed.length == 0)) { return ""; } if (isCompressed(compressed)) { GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed)); BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8")); String line; while ((line = bufferedReader.readLine()) != null) { outStr += line; } } else { outStr = new String(compressed); } return outStr; } public static boolean isCompressed(final byte[] compressed) { return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)); } } The code for the Inflater & Deflater program is as follows:- import java.io.ByteArrayOutputStream; import java.io.IOException; import java.util.Arrays; import java.util.zip.DataFormatException; import java.util.zip.Deflater; import java.util.zip.Inflater; public class Apple { public static void main(String[] args) { String sr = " [120,-100,-19,89,91,79,-21,56,16,-2,43,40,-49,104,55,113,-18,-68,-91,-23,21,104,90,-38,-62,10,-83,120,48,-83,91,34,-46,-92,-21,-92,8,-124,-8,-17,103,-100,-92,77,-22,-38,-25,112,86,27,-119,106,65,84,-86,103,-58,-10,124,-98,-15,101,-66,-66,43,25,126,-99,-112,116,-109,-60,41,81,46,-34,-107,-57,36,121,14,-29,-43,-20,109,3,77,101,-94,-100,43,120,-79,-115,50,63,-39,-58,25,8,52,16,-52,-97,-62,104,-79,19,-88,32,8,-29,37,-114,-77,-70,-92,71,-78,89,125,-36,-65,-33,-107,5,-50,-40,-120,-86,-11,39,82,-31,95,-77,-40,-48,-21,-94,15,82,13,11,-58,-115,112,-102,-6,-55,-126,-103,-7,126,-65,53,-22,-113,-64,-58,123,-63,97,52,37,-85,53,97,-106,-17,74,55,10,87,79,-39,96,-63,-100,65,76,31,-46,40,-116,73,-39,111,-38,3,81,97,18,108,-41,-113,-124,-126,-52,-48,-100,-62,-50,-89,120,-103,-107,-56,108,-99,9,71,52,92,-123,49,52,91,-41,-109,125,19,44,55,9,-51,102,-124,-82,-61,24,71,-96,5,85,-101,-92,25,-76,-78,48,-55,-51,71,-61,67,-103,-92,-49,6,-45,108,75,73,27,-80,-49,-62,53,-101,23,-64,-25,75,-96,89,103,72,-67,48,-44,11,-107,-83,-105,71,105,-8,-126,35,-119,29,-70,-48,74,-69,-10,-106,-18,92,-48,-98,104,122,-90,-85,48,93,10,-118,2,108,-78,-100,102,-55,38,85,46,-44,115,-27,46,-60,-123,23,-2,-106,82,18,-49,-33,-54,21,26,4,-109,-35,-86,-114,-15,107,23,-125,119,36,-125,70,-102,71,-55,23,-58,96,47,5,-60,-13,-61,-24,-80,-28,96,97,105,-31,52,-100,123,101,60,53,-61,112,33,12,48,54,19,-61,-56,74,-112,116,41,-127,-42,74,41,-28,-69,-4,34,-53,109,-68,-64,-113,17,1,-7,-3,77,-18,-8,44,-55,112,4,-39,-77,27,12,68,61,-102,-92,-23,126,112,-45,48,-64,-91,100,-67,14,-45,-76,88,11,-45,-2,-61,-75,-108,-113,-113,-13,67,0,-99,-114,12,64,-91,17,3,-128,-124,108,14,0,-46,28,-99,3,-32,104,66,0,-35,-82,12,64,-91,-111,0,48,-35,6,1,-40,-74,-54,1,-48,84,93,-120,-96,-33,-105,33,-88,52,98,4,-70,-34,96,8,116,-45,114,121,4,-70,24,-63,-27,-91,12,65,-91,-111,32,112,27,-116,-127,-127,44,-115,71,96,-70,66,4,87,87,50,4,-107,70,-116,-64,112,26,-116,-127,-87,89,54,-113,-64,21,-57,-32,-6,90,-122,-96,-46,-120,17,-104,77,34,-80,-112,-50,111,100,36,-55,-94,-31,80,-122,-96,-46,-120,17,-40,77,30,69,-74,-87,-15,89,-124,36,103,81,16,-56,16,84,26,49,2,71,111,112,31,56,-82,-55,-97,69,-70,110,10,17,-116,70,50,4,-107,70,-116,-64,117,26,68,-96,-87,-90,-31,-16,16,92,49,-124,-101,27,25,-124,74,35,-71,-110,-31,-38,108,16,3,-46,85,126,51,27,-106,56,-111,38,19,25,-122,74,35,-63,-48,-24,-99,-96,25,8,-103,-4,-61,66,-78,-99,-89,83,25,-122,74,35,-63,96,-94,38,115,-55,-46,85,-2,72,-78,52,113,28,102,51,25,-122,74,35,-63,96,55,-71,-93,53,-57,52,-8,45,109,-19,-10,-61,-61,-57,-71,-78,43,43,-90,25,-50,-74,48,57,-100,96,-73,41,-95,51,-118,-25,-49,121,93,48,25,-34,-113,6,-82,-83,-62,-3,91,124,116,-37,-123,67,-56,50,12,3,-23,-102,-19,-72,-106,-125,92,56,-25,-64,39,-16,99,-76,-51,54,-37,18,-28,106,-123,87,-60,-117,23,67,-126,-93,-76,27,1,-100,-36,-29,-68,-108,41,-86,-118,-78,18,41,94,-53,71,-75,8,91,46,-80,-50,-21,20,-74,58,-108,-32,-25,-37,-51,-2,-127,93,-82,-93,-16,69,46,20,10,94,-43,-99,127,-74,-84,84,0,31,-40,12,59,41,76,90,127,-58,-77,64,-46,112,83,-106,10,-29,105,-105,57,-73,-49,64,-107,-91,-62,-95,-55,109,-69,110,-94,-101,-24,-40,-92,55,-99,-43,76,28,-29,-40,-30,-22,-54,-81,15,-14,-15,112,28,108,-45,97,-50,82,28,-89,120,-50,122,117,9,-55,-105,120,74,-24,75,56,39,-2,19,-90,43,-112,56,-4,-117,51,47,16,-123,111,126,54,-53,-124,-64,-94,80,-78,-24,-122,36,90,-28,-21,-100,71,2,-60,53,-55,42,-65,-59,-64,-63,-63,98,76,-109,100,89,-74,-58,-112,-5,-84,116,-37,-105,-117,65,94,-65,-58,-117,-68,113,-49,-118,46,-88,-54,70,-53,86,72,-77,39,-82,79,-25,117,19,-46,-73,118,81,-39,-42,21,-125,52,-35,66,21,-99,-105,-60,-12,-83,84,6,121,-23,-122,-93,48,43,36,-112,-54,62,43,-91,-65,-66,-101,-125,-68,-64,111,-60,-49,-5,-1,-50,79,112,52,-33,-73,-5,-8,-105,45,-40,15,-20,91,-71,-79,-18,122,67,118,-78,49,-55,97,-10,6,-55,89,-47,-95,80,-18,-113,39,-106,-25,-121,-3,-81,-123,-3,107,-118,-86,80,50,-71,-34,99,-65,-27,11,123,-113,59,94,112,59,-101,-98,121,65,-5,-84,-43,-71,-13,38,94,-81,-61,-115,-90,25,-68,47,-63,-99,-60,-105,-102,66,-18,75,32,-13,37,-16,-4,-2,-24,55,93,-71,12,36,-82,92,122,-125,-32,108,-40,-15,120,127,116,-109,31,-94,-35,-110,12,-47,30,120,-83,-50,108,-32,127,110,24,95,6,-53,-9,-90,-3,-65,58,-97,89,-27,-121,-113,-4,-74,-84,81,86,-58,17,101,5,23,-54,33,101,85,-29,20,126,66,89,-63,-33,39,73,43,-3,-41,-92,85,-50,66,125,-98,-76,-54,57,-82,127,69,90,25,123,50,74,117,-10,100,-108,-128,-76,-86,-20,-64,72,64,90,33,70,90,53,-55,89,125,85,-54,7,-23,-4,-3,117,98,-108,-113,-125,-8,119,-27,-87,81,62,22,66,39,78,-7,-24,26,79,124,-98,26,-27,-125,-48,17,113,120,106,-108,-113,-61,111,-28,-109,-93,124,44,62,-117,78,-116,-14,113,-43,-93,26,-9,-28,40,31,75,-27,121,-73,19,-92,124,44,126,51,-97,32,-27,99,-13,-44,-37,9,82,62,38,127,36,-99,32,-27,-29,30,-47,86,95,-97,-14,41,-33,-14,-115,-110,62,-59,-77,-108,39,125,10,-23,79,73,-97,-39,-92,-50,-24,-120,56,-97,-33,-90,-123,12,83,-1,21,45,-92,33,1,115,116,-56,11,25,34,94,-56,-74,63,-59,11,-79,95,43,-72,119,-87,-50,-1,-112,-73,-69,-52,-66,-119,-95,-26,-35,-4,38,-122,-66,-119,-95,-1,27,49,4,-97,31,15,0,-88,84]"; byte[] data = sr.getBytes(); try { String x = new String(decompress(compress(data))); System.out.println("decompressed " + x); } catch (IOException | DataFormatException e) { e.printStackTrace(); } } public static byte[] compress(byte[] data) throws IOException { Deflater deflater = new Deflater(); deflater.setInput(data); ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length); deflater.finish(); byte[] buffer = new byte[1024]; while (!deflater.finished()) { int count = deflater.deflate(buffer); outputStream.write(buffer, 0, count); } outputStream.close(); byte[] output = outputStream.toByteArray(); System.out.println("Original: " + data.length); System.out.println("Compressed: " + output.length); return output; } public static byte[] decompress(byte[] data) throws IOException, DataFormatException { Inflater inflater = new Inflater(); inflater.setInput(data); ByteArrayOutputStream outputStream = new ByteArrayOutputStream(data.length); byte[] buffer = new byte[1024]; while (!inflater.finished()) { int count = inflater.inflate(buffer); outputStream.write(buffer, 0, count); } outputStream.close(); byte[] output = outputStream.toByteArray(); System.out.println(); return output; } } A sample of how the data will look like:- "120,-100,-19,89,91,79,-21,56,16,-2,43,40,-49,104,55,113,-18,-68,-91,-23,21,104,90,-38,-62,10,-83,120,48,-83,91,34,-46,-92,-21,-92,8,-124,-8,-17,103,-100,-92,77,-22,-38,-25,112,86,27,-119,106,65,84,-86,103,-58,-10,124,-98,-15,101,-66,-66,43,25,126,-99,-112,116,-109,-60,41,81,46,-34,-107,-57,36,121,14,-29,-43,-20,109,3,77,101,-94,-100,43,120,-79,-115,50,63,-39,-58,25,8,52,16,-52,-97,-62,104,-79,19,-88,32,8,-29,37,-114,-77,-70,-92,71,-78,89,125,-36,-65,-33,-107,5,-50,-40,-120,-86,-11,39,82,-31,95,-77,-40,-48,-21,-94,15,82,13,11,-58,-115,112,-102,-6,-55,-126,-103,-7,126,-65,53,-22,-113,-64,-58,123,-63,97,52,37,-85,53,97,-106,-17,74,55,10,87,79,-39,96,-63,-100,65,76,31,-46,40,-116,73,-39,111,-38,3,81,97,18,108,-41,-113,-124,-126,-52,-48,-100,-62,-50,-89,120,-103,-107,-56,108,-99,9,71,52,92,-123,49,52,91,-41,-109,125,19,44,55,9,-51,102,-124,-82,-61,24,71,-96,5,85,-101,-92,25,-76,-78,48,-55,-51,71,-61,67,-103,-92,-49,6,-45,108,75,73,27,-80,-49,-62,53,-101,23,-64,-25,75,-96,89,103,72,-67,48,-44,11,-107,-83,-105,71,105,-8,-126,35,-119,29,-70,-48,74,-69,-10,-106,-18,92,-48,-98,104,122,-90,-85,48,93,10,-118,2,108,-78,-100,102,-55,38,85,46,-44,115,-27,46,-60,-123,23,-2,-106,82,18,-49,-33,-54,21,26,4,-109,-35,-86,-114,-15,107,23,-125,119,36,-125,70,-102,71,-55,23,-58,96,47,5,-60,-13,-61,-24,-80,-28,96,97,105,-31,52,-100,123,101,60,53,-61,112,33,12,48,54,19,-61,-56,74,-112,116,41,-127,-42,74,41,-28,-69,-4,34,-53,109,-68,-64,-113,17,1,-7,-3,77,-18,-8,44,-55,112,4,-39,-77,27,12,68,61,-102,-92,-23,126,112,-45,48,-64,-91,100,-67,14,-45,-76,88,11,-45,-2,-61,-75,-108,-113,-113,-13,67,0,-99,-114,12,64,-91,17,3,-128,-124,108,14,0,-46,28,-99,3,-32,104,66,0,-35,-82,12,64,-91,-111,0,48,-35,6,1,-40,-74,-54,1,-48,84,93,-120,-96,-33,-105,33,-88,52,98,4,-70,-34,96,8,116,-45,114,121,4,-70,24,-63,-27,-91,12,65,-91,-111,32,112,27,-116,-127,-127,44,-115,71,96,-70,66,4,87,87,50,4,-107,70,-116,-64,112,26,-116,-127,-87,89,54,-113,-64,21,-57,-32,-6,90,-122,-96,-46,-120,17,-104,77,34,-80,-112,-50,111,100,36,-55,-94,-31,80,-122,-96,-46,-120,17,-40,77,30,69,-74,-87,-15,89,-124,36,103,81,16,-56,16,84,26,49,2,71,111,112,31,56,-82,-55,-97,69,-70,110,10,17,-116,70,50,4,-107,70,-116,-64,117,26,68,-96,-87,-90,-31,-16,16,92,49,-124,-101,27,25,-124,74,35,-71,-110,-31,-38,108,16,3,-46,85,126,51,27,-106,56,-111,38,19,25,-122,74,35,-63,-48,-24,-99,-96,25,8,-103,-4,-61,66,-78,-99,-89,83,25,-122,74,35,-63,96,-94,38,115,-55,-46,85,-2,72,-78,52,113,28,102,51,25,-122,74,35,-63,96,55,-71,-93,53,-57,52,-8,45,109,-19,-10,-61,-61,-57,-71,-78,43,43,-90,25,-50,-74,48,57,-100,96,-73,41,-95,51,-118,-25,-49,121,93,48,25,-34,-113,6,-82,-83,-62,-3,91,124,116,-37,-123,67,-56,50,12,3,-23,-102,-19,-72,-106,-125,92,56,-25,-64,39,-16,99,-76,-51,54,-37,18,-28,106,-123,87,-60,-117,23,67,-126,-93,-76,27,1,-100,-36,-29,-68,-108,41,-86,-118,-78,18,41,94,-53,71,-75,8,91,46,-80,-50,-21,20,-74,58,-108,-32,-25,-37,-51,-2,-127,93,-82,-93,-16,69,46,20,10,94,-43,-99,127,-74,-84,84,0,31,-40,12,59,41,76,90,127,-58,-77,64,-46,112,83,-106,10,-29,105,-105,57,-73,-49,64,-107,-91,-62,-95,-55,109,-69,110,-94,-101,-24,-40,-92,55,-99,-43,76,28,-29,-40,-30,-22,-54,-81,15,-14,-15,112,28,108,-45,97,-50,82,28,-89,120,-50,122,117,9,-55,-105,120,74,-24,75,56,39,-2,19,-90,43,-112,56,-4,-117,51,47,16,-123,111,126,54,-53,-124,-64,-94,80,-78,-24,-122,36,90,-28,-21,-100,71,2,-60,53,-55,42,-65,-59,-64,-63,-63,98,76,-109,100,89,-74,-58,-112,-5,-84,116,-37,-105,-117,65,94,-65,-58,-117,-68,113,-49,-118,46,-88,-54,70,-53,86,72,-77,39,-82,79,-25,117,19,-46,-73,118,81,-39,-42,21,-125,52,-35,66,21,-99,-105,-60,-12,-83,84,6,121,-23,-122,-93,48,43,36,-112,-54,62,43,-91,-65,-66,-101,-125,-68,-64,111,-60,-49,-5,-1,-50,79,112,52,-33,-73,-5,-8,-105,45,-40,15,-20,91,-71,-79,-18,122,67,118,-78,49,-55,97,-10,6,-55,89,-47,-95,80,-18,-113,39,-106,-25,-121,-3,-81,-123,-3,107,-118,-86,80,50,-71,-34,99,-65,-27,11,123,-113,59,94,112,59,-101,-98,121,65,-5,-84,-43,-71,-13,38,94,-81,-61,-115,-90,25,-68,47,-63,-99,-60,-105,-102,66,-18,75,32,-13,37,-16,-4,-2,-24,55,93,-71,12,36,-82,92,122,-125,-32,108,-40,-15,120,127,116,-109,31,-94,-35,-110,12,-47,30,120,-83,-50,108,-32,127,110,24,95,6,-53,-9,-90,-3,-65,58,-97,89,-27,-121,-113,-4,-74,-84,81,86,-58,17,101,5,23,-54,33,101,85,-29,20,126,66,89,-63,-33,39,73,43,-3,-41,-92,85,-50,66,125,-98,-76,-54,57,-82,127,69,90,25,123,50,74,117,-10,100,-108,-128,-76,-86,-20,-64,72,64,90,33,70,90,53,-55,89,125,85,-54,7,-23,-4,-3,117,98,-108,-113,-125,-8,119,-27,-87,81,62,22,66,39,78,-7,-24,26,79,124,-98,26,-27,-125,-48,17,113,120,106,-108,-113,-61,111,-28,-109,-93,124,44,62,-117,78,-116,-14,113,-43,-93,26,-9,-28,40,31,75,-27,121,-73,19,-92,124,44,126,51,-97,32,-27,99,-13,-44,-37,9,82,62,38,127,36,-99,32,-27,-29,30,-47,86,95,-97,-14,41,-33,-14,-115,-110,62,-59,-77,-108,39,125,10,-23,79,73,-97,-39,-92,-50,-24,-120,56,-97,-33,-90,-123,12,83,-1,21,45,-92,33,1,115,116,-56,11,25,34,94,-56,-74,63,-59,11,-79,95,43,-72,119,-87,-50,-1,-112,-73,-69,-52,-66,-119,-95,-26,-35,-4,38,-122,-66,-119,-95,-1,27,49,4,-97,31,15,0,-88,84" Is there a better Option for reducing the number of characters in a string without converting it to byte array and unwanted characters? Thanks in advance,
You can compress to a byte[] and then encode the result in Base64. This will only use alphanumeric and fewer symbols which are safe for transfering as text. i.e. it is widely used for this. public static void main(String[] args) { StringBuilder sb = new StringBuilder(); while (sb.length() < 751396) sb.append("Size: ").append(sb.length()).append("\n"); String s = sb.toString(); String s2 = deflateBase64(s); System.out.println("Uncompressed size = " + s.length() + ", compressed size=" + s2.length()); String s3 = inflateBase64(s2); System.out.println("Same after inflating is " + s3.equals(s)); } public static String deflateBase64(String text) { try { ByteArrayOutputStream baos = new ByteArrayOutputStream(); try (Writer writer = new OutputStreamWriter(new DeflaterOutputStream(baos))) { writer.write(text); } return Base64.getEncoder().encodeToString(baos.toByteArray()); } catch (IOException e) { throw new AssertionError(e); } } public static String inflateBase64(String base64) { try (Reader reader = new InputStreamReader( new InflaterInputStream( new ByteArrayInputStream( Base64.getDecoder().decode(base64))))) { StringWriter sw = new StringWriter(); char[] chars = new char[1024]; for (int len; (len = reader.read(chars)) > 0; ) sw.write(chars, 0, len); return sw.toString(); } catch (IOException e) { throw new AssertionError(e); } } prints Uncompressed size = 751400, compressed size=219564 Same after inflating is true
You can use the Deflater a little more: public static byte[] compress(byte[] data) throws IOException { new Deflater(Deflater.BEST_COMPRESSION, true); //... } So you'll have the strongest compression and you'll skip some of the header data. This is the best you can do with the builtin algorithms.
md5 checksum on file in android doesn't match the md5 after i email it to myself
The functions I'm using to convert the file to a string and then to an mdf are below. I'm outputting the file paths and file names to make sure everything is cool. Is there anything I'm not considering that could change the file's (a video mp4 actually) fingerprint? I'm checking it against md5sum on ubuntu. private static String readFileToString(String filePath) throws java.io.IOException{ StringBuffer fileData = new StringBuffer(1000); BufferedReader reader = new BufferedReader( new FileReader(filePath)); char[] buf = new char[1024]; int numRead=0; while((numRead=reader.read(buf)) != -1){ String readData = String.valueOf(buf, 0, numRead); fileData.append(readData); buf = new char[1024]; } reader.close(); System.out.println(fileData.toString()); return fileData.toString(); } public static String getMD5EncryptedString(String encTarget){ MessageDigest mdEnc = null; try { mdEnc = MessageDigest.getInstance("MD5"); } catch (NoSuchAlgorithmException e) { System.out.println("Exception while encrypting to md5"); e.printStackTrace(); } // Encryption algorithm mdEnc.update(encTarget.getBytes(), 0, encTarget.length()); String md5 = new BigInteger(1, mdEnc.digest()).toString(16) ; return md5; }
String isn't a container for binary data. Lose the two conversions between byte array and String. You should be reading the file as bytes and computing the MD5 directly in the bytes. You can do that while you're reading it: you don't need to read the entire file first. And MD5 isn't an encryption: it's a secure hash.
Found this answer here: How to generate an MD5 checksum for a file in Android? public static String fileToMD5(String filePath) { InputStream inputStream = null; try { inputStream = new FileInputStream(filePath); byte[] buffer = new byte[1024]; MessageDigest digest = MessageDigest.getInstance("MD5"); int numRead = 0; while (numRead != -1) { numRead = inputStream.read(buffer); if (numRead > 0) digest.update(buffer, 0, numRead); } byte [] md5Bytes = digest.digest(); return convertHashToString(md5Bytes); } catch (Exception e) { return null; } finally { if (inputStream != null) { try { inputStream.close(); } catch (Exception e) { } } } } private static String convertHashToString(byte[] md5Bytes) { String returnVal = ""; for (int i = 0; i < md5Bytes.length; i++) { returnVal += Integer.toString(( md5Bytes[i] & 0xff ) + 0x100, 16).substring(1); } return returnVal; }
compression and decompression of string data in java
I am using the following code to compress and decompress string data, but the problem which I am facing is, it is easily getting compressed without error, but the decompress method throws the following error. Exception in thread "main" java.io.IOException: Not in GZIP format public static void main(String[] args) throws Exception { String string = "I am what I am hhhhhhhhhhhhhhhhhhhhhhhhhhhhh" + "bjggujhhhhhhhhh" + "rggggggggggggggggggggggggg" + "esfffffffffffffffffffffffffffffff" + "esffffffffffffffffffffffffffffffff" + "esfekfgy enter code here`etd`enter code here wdd" + "heljwidgutwdbwdq8d" + "skdfgysrdsdnjsvfyekbdsgcu" + "jbujsbjvugsduddbdj"; System.out.println("after compress:"); String compressed = compress(string); System.out.println(compressed); System.out.println("after decompress:"); String decomp = decompress(compressed); System.out.println(decomp); } public static String compress(String str) throws Exception { if (str == null || str.length() == 0) { return str; } System.out.println("String length : " + str.length()); ByteArrayOutputStream obj=new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(obj); gzip.write(str.getBytes("UTF-8")); gzip.close(); String outStr = obj.toString("UTF-8"); System.out.println("Output String length : " + outStr.length()); return outStr; } public static String decompress(String str) throws Exception { if (str == null || str.length() == 0) { return str; } System.out.println("Input String length : " + str.length()); GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(str.getBytes("UTF-8"))); BufferedReader bf = new BufferedReader(new InputStreamReader(gis, "UTF-8")); String outStr = ""; String line; while ((line=bf.readLine())!=null) { outStr += line; } System.out.println("Output String lenght : " + outStr.length()); return outStr; } Still couldn't figure out how to fix this issue!
This is because of String outStr = obj.toString("UTF-8"); Send the byte[] which you can get from your ByteArrayOutputStream and use it as such in your ByteArrayInputStream to construct your GZIPInputStream. Following are the changes which need to be done in your code. byte[] compressed = compress(string); //In the main method public static byte[] compress(String str) throws Exception { ... ... return obj.toByteArray(); } public static String decompress(byte[] bytes) throws Exception { ... GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(bytes)); ... }
The above Answer solves our problem but in addition to that. if we are trying to decompress a uncompressed("not a zip format") byte[] . we will get "Not in GZIP format" exception message. For solving that we can add addition code in our Class. public static boolean isCompressed(final byte[] compressed) { return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)); } My Complete Compression Class with compress/decompress would look like: import java.io.BufferedReader; import java.io.ByteArrayInputStream; import java.io.ByteArrayOutputStream; import java.io.IOException; import java.io.InputStreamReader; import java.util.zip.GZIPInputStream; import java.util.zip.GZIPOutputStream; public class GZIPCompression { public static byte[] compress(final String str) throws IOException { if ((str == null) || (str.length() == 0)) { return null; } ByteArrayOutputStream obj = new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(obj); gzip.write(str.getBytes("UTF-8")); gzip.flush(); gzip.close(); return obj.toByteArray(); } public static String decompress(final byte[] compressed) throws IOException { final StringBuilder outStr = new StringBuilder(); if ((compressed == null) || (compressed.length == 0)) { return ""; } if (isCompressed(compressed)) { final GZIPInputStream gis = new GZIPInputStream(new ByteArrayInputStream(compressed)); final BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(gis, "UTF-8")); String line; while ((line = bufferedReader.readLine()) != null) { outStr.append(line); } } else { outStr.append(compressed); } return outStr.toString(); } public static boolean isCompressed(final byte[] compressed) { return (compressed[0] == (byte) (GZIPInputStream.GZIP_MAGIC)) && (compressed[1] == (byte) (GZIPInputStream.GZIP_MAGIC >> 8)); } }
If you ever need to transfer the zipped content via network or store it as text, you have to use Base64 encoder(such as apache commons codec Base64) to convert the byte array to a Base64 String, and decode the string back to byte array at remote client. Found an example at Use Zip Stream and Base64 Encoder to Compress Large String Data!
Another example of correct compression and decompression: #Slf4j public class GZIPCompression { public static byte[] compress(final String stringToCompress) { if (isNull(stringToCompress) || stringToCompress.length() == 0) { return null; } try (final ByteArrayOutputStream baos = new ByteArrayOutputStream(); final GZIPOutputStream gzipOutput = new GZIPOutputStream(baos)) { gzipOutput.write(stringToCompress.getBytes(UTF_8)); gzipOutput.finish(); return baos.toByteArray(); } catch (IOException e) { throw new UncheckedIOException("Error while compression!", e); } } public static String decompress(final byte[] compressed) { if (isNull(compressed) || compressed.length == 0) { return null; } try (final GZIPInputStream gzipInput = new GZIPInputStream(new ByteArrayInputStream(compressed)); final StringWriter stringWriter = new StringWriter()) { IOUtils.copy(gzipInput, stringWriter, UTF_8); return stringWriter.toString(); } catch (IOException e) { throw new UncheckedIOException("Error while decompression!", e); } } }
The problem is this line: String outStr = obj.toString("UTF-8"); The byte array obj contains arbitrary binary data. You can't "decode" arbitrary binary data as if it was UTF-8. If you try you will get a String that cannot then be "encoded" back to bytes. Or at least, the bytes you get will be different to what you started with ... to the extent that they are no longer a valid GZIP stream. The fix is to store or transmit the contents of the byte array as-is. Don't try to convert it into a String. It is binary data, not text.
Client send some messages need be compressed, server (kafka) decompress the string meesage Below is my sample: compress: public static String compress(String str, String inEncoding) { if (str == null || str.length() == 0) { return str; } try { ByteArrayOutputStream out = new ByteArrayOutputStream(); GZIPOutputStream gzip = new GZIPOutputStream(out); gzip.write(str.getBytes(inEncoding)); gzip.close(); return URLEncoder.encode(out.toString("ISO-8859-1"), "UTF-8"); } catch (IOException e) { e.printStackTrace(); } return null; } decompress: public static String decompress(String str, String outEncoding) { if (str == null || str.length() == 0) { return str; } try { String decode = URLDecoder.decode(str, "UTF-8"); ByteArrayOutputStream out = new ByteArrayOutputStream(); ByteArrayInputStream in = new ByteArrayInputStream(decode.getBytes("ISO-8859-1")); GZIPInputStream gunzip = new GZIPInputStream(in); byte[] buffer = new byte[256]; int n; while ((n = gunzip.read(buffer)) >= 0) { out.write(buffer, 0, n); } return out.toString(outEncoding); } catch (IOException e) { e.printStackTrace(); } return null; }
You can't convert binary data to String. As a solution you can encode binary data and then convert to String. For example, look at this How do you convert binary data to Strings and back in Java?
How to convert FileInputStream into string in java?
In my java project, I'm passing FileInputStream to a function, I need to convert (typecast FileInputStream to string), How to do it.?? public static void checkfor(FileInputStream fis) { String a=new String; a=fis //how to do convert fileInputStream into string print string here }
You can't directly convert it to string. You should implement something like this Add this code to your method //Commented this out because this is not the efficient way to achieve that //StringBuilder builder = new StringBuilder(); //int ch; //while((ch = fis.read()) != -1){ // builder.append((char)ch); //} // //System.out.println(builder.toString()); Use Aubin's solution: public static String getFileContent( FileInputStream fis, String encoding ) throws IOException { try( BufferedReader br = new BufferedReader( new InputStreamReader(fis, encoding ))) { StringBuilder sb = new StringBuilder(); String line; while(( line = br.readLine()) != null ) { sb.append( line ); sb.append( '\n' ); } return sb.toString(); } }
public static String getFileContent( FileInputStream fis, String encoding ) throws IOException { try( BufferedReader br = new BufferedReader( new InputStreamReader(fis, encoding ))) { StringBuilder sb = new StringBuilder(); String line; while(( line = br.readLine()) != null ) { sb.append( line ); sb.append( '\n' ); } return sb.toString(); } }
Using Apache commons IOUtils function import org.apache.commons.io.IOUtils; InputStream inStream = new FileInputStream("filename.txt"); String body = IOUtils.toString(inStream, StandardCharsets.UTF_8.name());
Don't make the mistake of relying upon or needlessly converting/losing endline characters. Do it character by character. Don't forget to use the proper character encoding to interpres the stream. public String getFileContent( FileInputStream fis ) { StringBuilder sb = new StringBuilder(); Reader r = new InputStreamReader(fis, "UTF-8"); //or whatever encoding int ch = r.read(); while(ch >= 0) { sb.append(ch); ch = r.read(); } return sb.toString(); } If you want to make this a little more efficient, you can use arrays of characters instead, but to be honest, looping over the characters can be still quite fast. public String getFileContent( FileInputStream fis ) { StringBuilder sb = new StringBuilder(); Reader r = new InputStreamReader(fis, "UTF-8"); //or whatever encoding char[] buf = new char[1024]; int amt = r.read(buf); while(amt > 0) { sb.append(buf, 0, amt); amt = r.read(buf); } return sb.toString(); }
From an answer I edited here: static String convertStreamToString(java.io.InputStream is) { if (is == null) { return ""; } java.util.Scanner s = new java.util.Scanner(is); s.useDelimiter("\\A"); String streamString = s.hasNext() ? s.next() : ""; s.close(); return streamString; } This avoids all errors and works well.
Use following code ----> try { FileInputStream fis=new FileInputStream("filename.txt"); int i=0; while((i = fis.read()) !=-1 ) { // to reach until the laste bytecode -1 System.out.print((char)i); /* For converting each bytecode into character */ } fis.close(); } catch(Exception ex) { System.out.println(ex); }