I am looking for string length compression to avoid lengthy filenames like the one below. The string contains UTF-8 characters as well.
"dt=20200623_isValid=valid_module_name=A&B&C_data_source=internet_part-00001-1234-9d12-1234-123d-1234567890a1.b001.json"
I tried the Huffman compression from GitHub here; it reduces the size, but not by much in terms of string length.
Size before compression: 944
Size after compression: 569
Compressed string:
01011111001111100011101000111011101011001000111110001101000011011001000110001111010001010111111001010110001111010001010001101101010000101101110001110000000110101011010110100000111111001101011111100111101111110100000010101011011110011000010011001000101110010011101001000001111101001010111110000001001101010000111100001110101001100100111110001011101110111011101001001010011000111110111000101100000101100110000010100110001111101110001010011000111110101001010011000111110111011010111011001101100110110111000011100110100111000111011101110111010011100011101111001100100010101
How can I achieve length compression in Java? (The decompressed filename value is needed for further processing.)
You should try ZLIB/GZIP compression. You can find a GZIP compression snippet here: compression and decompression of string data in java.
A ZLIB compression implementation is also fairly easy. You can use the code below as a starting point and improve on it.
For a detailed explanation of the formats, see: How are zlib, gzip and zip related? What do they have in common and how are they different?
Read about Deflater strategies before proceeding: Java Deflater strategies - DEFAULT_STRATEGY, FILTERED and HUFFMAN_ONLY
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.DeflaterOutputStream;

public void compressFile(String originalFileName, String compressedFileName) {
    try (FileInputStream fileInputStream = new FileInputStream(originalFileName);
         FileOutputStream fileOutputStream = new FileOutputStream(compressedFileName);
         DeflaterOutputStream deflaterOutputStream = new DeflaterOutputStream(fileOutputStream)) {
        int data;
        while ((data = fileInputStream.read()) != -1) {
            deflaterOutputStream.write(data);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
You can decompress using InflaterInputStream.
import java.util.zip.InflaterInputStream;

public void decompressFile(String fileToBeDecompressed, String outputFile) {
    try (FileInputStream fileInputStream = new FileInputStream(fileToBeDecompressed);
         FileOutputStream fileOutputStream = new FileOutputStream(outputFile);
         InflaterInputStream inflaterInputStream = new InflaterInputStream(fileInputStream)) {
        int data;
        while ((data = inflaterInputStream.read()) != -1) {
            fileOutputStream.write(data);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Refer: http://cr.openjdk.java.net/~iris/se/11/latestSpec/api/java.base/java/util/zip/Deflater.html
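Since the question is about a String rather than a file, a more direct route is to run Deflater/Inflater in memory and encode the result with URL-safe Base64, so the output stays usable as a filename. A minimal sketch (class and method names are illustrative, and for strings this short the deflate overhead may eat much of the gain):

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class StringCompressor {

    // Deflate the UTF-8 bytes and encode them URL-safe, so the result
    // contains only filename-friendly characters.
    public static String compress(String input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return Base64.getUrlEncoder().withoutPadding().encodeToString(out.toByteArray());
    }

    public static String decompress(String compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(Base64.getUrlDecoder().decode(compressed));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!inflater.finished()) {
            int n = inflater.inflate(buffer);
            out.write(buffer, 0, n);
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}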
Of course using one character per binary digit is going to use up a lot of space. That library is using 16 bits (the size of a char) to represent a single bit, so it is literally making its result 16 times larger than it needs to be.
A far more compact way to represent binary data is by converting it to hexadecimal.
byte[] compressedBytes = new BigInteger(compressedString, 2).toByteArray();

Formatter formatter = new Formatter();
for (byte b : compressedBytes) {
    formatter.format("%02x", b);
}
String hex = formatter.toString();
Then the result is 142 characters:
BE7C7477591F1A1B231E8AFCAC7A28DA85B8E0356B41F9AFCF7E8156F30991727483E95F026A1E1D4C9F17777494C7DC582CC14C7DC531F5298FBB5D9B36E1CD38EEEE9C779915
You could even go a step further and Base64-encode it, reducing the result to 96 characters:
String s = Base64.getEncoder().encodeToString(compressedBytes);
Result:
AL58dHdZHxobIx6K/Kx6KNqFuOA1a0H5r89+gVbzCZFydIPpXwJqHh1Mnxd3dJTH3FgswUx9xTH1KY+7XZs24c047u6cd5kV
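To recover the original bit string, reverse the steps. One caveat: BigInteger drops leading zero bits, so the original bit count has to be stored or known separately (originalBitLength below is a placeholder for that value):

byte[] decoded = Base64.getDecoder().decode(s);
StringBuilder bits = new StringBuilder(new BigInteger(1, decoded).toString(2));
// BigInteger strips leading zeros, so pad back to the original length.
while (bits.length() < originalBitLength) {
    bits.insert(0, '0');
}
String compressedString = bits.toString();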
I'm trying to use the Java DeflaterOutputStream and InflaterOutputStream classes to compress a byte array, but neither appears to be working correctly. I assume I'm implementing them incorrectly.
public static byte[] compress(byte[] in) {
    try {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DeflaterOutputStream defl = new DeflaterOutputStream(out);
        defl.write(in);
        defl.flush();
        defl.close();
        return out.toByteArray();
    } catch (Exception e) {
        e.printStackTrace();
        System.exit(150);
        return null;
    }
}

public static byte[] decompress(byte[] in) {
    try {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        InflaterOutputStream infl = new InflaterOutputStream(out);
        infl.write(in);
        infl.flush();
        infl.close();
        return out.toByteArray();
    } catch (Exception e) {
        e.printStackTrace();
        System.exit(150);
        return null;
    }
}
Here are the two methods I'm using to compress and decompress the byte array. Most implementations I've seen online use a fixed-size buffer array for the decompression portion, but I'd prefer to avoid that if possible, because I'd need to give that buffer array a size of one if I want any significant compression.
If anyone can explain what I'm doing wrong, it would be appreciated. Also, to explain why I know these methods aren't working correctly: the "compressed" byte array they output is always larger than the uncompressed one, no matter what size byte array I provide.
This will depend on the data you are compressing. For example, if we take an array of zero bytes, it compresses well:
byte[] plain = new byte[10000];
byte[] compressed = compress(plain);
System.out.println(compressed.length); // 33
byte[] result = decompress(compressed);
System.out.println(result.length); // 10000
Compression always adds some overhead to allow for future decompression. If the compression produces no reduction in length (because the data is random or already compressed), then the output can be longer than the input.
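The opposite case is just as easy to demonstrate: feeding the same compress method random, incompressible bytes yields output slightly larger than the input (a sketch; the exact overhead varies):

byte[] random = new byte[10000];
new java.util.Random().nextBytes(random);
byte[] compressed = compress(random);
System.out.println(compressed.length); // slightly more than 10000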
What I have done so far: I read a text file (file1), XORed the bytes with a key, and wrote the result to another file (file2).
My problem: I read, for example, 'H' from file1; its byte value is 72.
72 XOR -32 = -88
Now I write -88 to file2.
When I read file2, I should get -88 as the first byte, but I get -3.
public byte[] readInput(String file) throws IOException {
    Path path = Paths.get(file);
    byte[] data = Files.readAllBytes(path);
    byte[] x = new byte[data.length];
    FileInputStream fis = new FileInputStream(file);
    InputStreamReader isr = new InputStreamReader(fis); // utf8
    Reader in = new BufferedReader(isr);
    int ch;
    int s = 0;
    while ((ch = in.read()) > -1) { // read till EOF
        x[s++] = (byte) ch;
    }
    in.close();
    return x;
}
public void writeOutput(byte[] encrypted, String file) {
    try {
        FileOutputStream fos = new FileOutputStream(file);
        Writer out = new OutputStreamWriter(fos, "UTF-8"); // utf8
        String s = new String(encrypted, "UTF-8");
        out.write(s);
        out.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
public byte[] DNcryption(byte[] key, byte[] mssg) {
    if (mssg.length == key.length) {
        byte[] encryptedBytes = new byte[key.length];
        for (int i = 0; i < key.length; i++) {
            encryptedBytes[i] = (byte) (mssg[i] ^ key[i]); // XOR
        }
        return encryptedBytes;
    } else {
        return null;
    }
}
You're not reading the file as bytes - you're reading it as characters. The encrypted data isn't valid UTF-8-encoded text, so you shouldn't try to read it as such.
Likewise, you shouldn't be writing arbitrary byte arrays as if they're UTF-8-encoded text.
Basically, your methods have signatures accepting or returning arbitrary binary data - don't use Writer or Reader classes at all. Just write the data straight to the stream. (And don't swallow the exception, either - do you really want to continue if you've failed to write important data?)
I would actually remove both your readInput and writeOutput methods entirely. Instead, use Files.readAllBytes and Files.write.
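A minimal sketch of what those two methods then reduce to (same signatures as before, but letting IOException propagate rather than swallowing it):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public byte[] readInput(String file) throws IOException {
    return Files.readAllBytes(Paths.get(file)); // raw bytes, no charset decoding
}

public void writeOutput(byte[] encrypted, String file) throws IOException {
    Files.write(Paths.get(file), encrypted);    // raw bytes, no charset encoding
}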
In the writeOutput method you convert the encrypted byte array into a UTF-8 String, which changes the actual bytes you later write to the file. Try this code snippet to see what happens when you convert a byte array with negative values to a UTF-8 String:
final String s = new String(new byte[]{-1}, "UTF-8");
System.out.println(Arrays.toString(s.getBytes("UTF-8")));
It will print something like [-17, -65, -67] (the UTF-8 encoding of the replacement character that the invalid byte gets decoded to). Use an OutputStream to write the bytes to the file instead, and close it when done:
try (FileOutputStream fos = new FileOutputStream(file)) {
    fos.write(encrypted);
}
I have a little problem: I decompress a byte array, and everything is fine with the following code, but sometimes with some data it throws a DataFormatException with "incorrect data check". Any ideas?
private byte[] decompress(byte[] compressed) throws DecoderException {
    Inflater decompressor = new Inflater();
    decompressor.setInput(compressed);
    ByteArrayOutputStream outPutStream = new ByteArrayOutputStream(compressed.length);
    byte[] temp = new byte[8196];
    while (!decompressor.finished()) {
        try {
            int count = decompressor.inflate(temp);
            logger.info("count = " + count);
            outPutStream.write(temp, 0, count);
        } catch (DataFormatException e) {
            logger.info(e.getMessage());
            throw new DecoderException("Wrong format", e);
        }
    }
    try {
        outPutStream.close();
    } catch (IOException e) {
        throw new DecoderException("Can't close outPutStream", e);
    }
    return outPutStream.toByteArray();
}
Try a different compression level, or the nowrap option.
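The nowrap flag has to match on both sides: data produced with new Deflater(level, true) (raw deflate, without the zlib header and checksum) must be read with new Inflater(true). A mismatch here is a common source of DataFormatException. A sketch, assuming raw deflate data:

// Both sides must agree on nowrap: raw deflate has no zlib header/checksum.
Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
// ... deflate as usual ...
Inflater inflater = new Inflater(true);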
1. Some warnings: do you use the same algorithm on both sides? Do you use bytes (not String)? Do your arrays have the right sizes?
2. I suggest you check step by step, catching exceptions, checking sizes and nulls, and comparing bytes, like this: Using Java Deflater/Inflater with custom dictionary causes IllegalArgumentException
   - Take your input
   - Compress it
   - Copy your bytes
   - Decompress them
   - Compare the output with the input
3. If you can't find the problem, take another example that works and modify it step by step.
Hope it helps.
I found out why it's happening:
byte[] temp = new byte[8196];
It's too big; it must be exactly the size of the decompressed array, because it was Base64-encoded earlier. How can I get this size before decompressing?
I'm looking for a way to read the binary data of a file into a string.
I've found one that reads the bytes directly and converts each byte to binary; the only problem is that it takes up a significant amount of RAM.
Here's the code I'm currently using:
try {
    byte[] fileData = new byte[(int) sellect.length()];
    FileInputStream in = new FileInputStream(sellect);
    in.read(fileData);
    in.close();

    getBinary(fileData[0]);
    getBinary(fileData[1]);
    getBinary(fileData[2]);
} catch (IOException e) {
    e.printStackTrace();
}
And the getBinary() method:
public String getBinary(byte bite) {
    String output = String.format("%8s", Integer.toBinaryString(bite & 0xFF)).replace(' ', '0');
    System.out.println(output); // 10000001
    return output;
}
Can you do something like this:
int bufferSize = 1000;
byte[] fileData = new byte[bufferSize];
int numBytesRead;
String string;

while ((numBytesRead = in.read(fileData)) != -1) {
    // Convert only the bytes actually read; see the array version of getBinary below.
    string = getBinary(fileData, numBytesRead);
    out.write(string);
}
This way, you never hold more than one buffer's worth of bytes (plus its string form) in RAM. The file is read 1000 bytes at a time, translated to a string, and then written to the new file. read() returns how many bytes it actually read, so a short final read is handled too.
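A possible array version of getBinary for the loop above (a sketch; it simply applies the single-byte conversion to each byte and concatenates with a StringBuilder):

public String getBinary(byte[] bytes, int length) {
    StringBuilder sb = new StringBuilder(length * 8);
    for (int i = 0; i < length; i++) {
        // Same conversion as the single-byte version, one byte at a time.
        sb.append(String.format("%8s", Integer.toBinaryString(bytes[i] & 0xFF)).replace(' ', '0'));
    }
    return sb.toString();
}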
This link can help you :
File to byte[] in Java
public static byte[] toByteArray(InputStream input) throws IOException
Gets the contents of an InputStream as a byte[]. This method buffers the input internally, so there is no need to use a BufferedInputStream.
Parameters: input - the InputStream to read from
Returns: the requested byte array
Throws: NullPointerException - if the input is null; IOException - if an I/O error occurs
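That method is IOUtils.toByteArray from Apache Commons IO, so it assumes the commons-io dependency is on the classpath. Usage is a one-liner (the file name is a placeholder):

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.commons.io.IOUtils;

try (InputStream in = new FileInputStream("somefile.bin")) { // placeholder path
    byte[] data = IOUtils.toByteArray(in);
}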
Hi, I would like to know if there is any way in Java to reduce the size of an image. My front end is iOS; it sends Base64-encoded data, which I decode and store in a byte array. Now I want to compress the PNG image in Java. My method looks something like this:
public String processFile(String strImageBase64, String strImageName, String donorId) {
    FileOutputStream fos = null;
    File savedFile = null;
    try {
        String FileItemRefPath = propsFPCConfig.getProperty("fileCreationReferencePath");
        String imageURLReferncePath = propsFPCConfig.getProperty("imageURLReferncePath");
        File f = new File(FileItemRefPath + "\\" + "productimages" + "\\" + donorId);
        String strException = "Actual File " + f.getName();
        if (!f.exists()) {
            boolean isdirCreationStatus = f.mkdir();
        }
        String strDateTobeAppended = new SimpleDateFormat("yyyyMMddhhmm").format(new Date(0));
        String fileName = strImageName + strDateTobeAppended;
        savedFile = new File(f.getAbsolutePath() + "\\" + fileName);
        strException = strException + " savedFile " + savedFile.getName();
        Base64 decoder = new Base64();
        byte[] decodedBytes = decoder.decode(strImageBase64);
        if ((decodedBytes != null) && (decodedBytes.length != 0)) {
            System.out.println("Decoded bytes length: " + decodedBytes.length);
            fos = new FileOutputStream(savedFile);
            System.out.println(new String(decodedBytes) + "\n");
            fos.write(decodedBytes, 0, decodedBytes.length);
            fos.flush();
        }
        // System.out.println(savedFile.getCanonicalPath() + " savedFile.getCanonicalPath() ");
        if (fos != null) {
            fos.close();
            return savedFile.getAbsolutePath();
        } else {
            return null;
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        try {
            if (fos != null) {
                fos.close();
            } else {
                savedFile = null;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    return savedFile.getName();
}
I'm storing this decoded data under the image name; now I want to store the compressed image at another URL.
I don't think this is worth the effort.
PNGs already have a very high level of compression. It is hard to reduce their size significantly by means of additional compression.
If you are really sending the image or the response Base64-encoded to the client, there are of course ways to improve transfer rates: enable gzip compression on your server so that HTTP responses are gzip-compressed. This reduces the actual number of bytes to transfer quite a bit when the original data is Base64-encoded (which basically means you are only using 6 of the 8 bits per byte). Enabling gzip compression is transparent to your server code and is just a configuration switch away for most web servers.
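To see the effect on Base64 data specifically, a quick in-memory GZIP round-trip makes the point (a sketch reusing strImageBase64 from the question; the exact ratio depends on the payload):

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

byte[] base64Bytes = strImageBase64.getBytes(StandardCharsets.US_ASCII);
ByteArrayOutputStream bos = new ByteArrayOutputStream();
try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
    gzip.write(base64Bytes); // Base64 text uses only 64 symbols, so it gzips well
}
System.out.println(base64Bytes.length + " -> " + bos.size());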