Write/read variable-byte-encoded string representation to/from a file in Java

Hi everyone! I recently learned about variable byte encoding.
For example, suppose a file contains this sequence of numbers: 824 5 214577.
With variable byte encoding, this sequence is encoded as 000001101011100010000101000011010000110010110001.
Now I want to know how to write that to another file so that it is a compressed version of the original, and likewise how to read it back. I'm using Java.
I have tried this:
LinkedList<Integer> numbers = new LinkedList<Integer>();
numbers.add(824);
numbers.add(5);
numbers.add(214577);
String code = VBEncoder.encodeToString(numbers);//returns 000001101011100010000101000011010000110010110001 into code
File file = new File("test.compressed");
DataOutputStream out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)));
out.writeBytes(code);
out.flush();
This just writes the characters of the binary representation into the file, which is not what I want.
I have also tried this:
LinkedList<Byte> code = VBEncoder.encode(numbers); // returns a linked list of Byte (described below)
File file = new File("test.compressed");
DataOutputStream out = new DataOutputStream(new BufferedOutputStream(new FileOutputStream(file)));
for(Byte b:code){
out.write(b.toInt());
System.out.println(b.toInt());
}
out.flush();
// here goes the description of the class Byte
class Byte {
int[] abyte;
Byte() {
abyte = new int[8];
}
public void readInt(int n) {
String bin = Integer.toBinaryString(n);
for (int i = 0; i < (8 - bin.length()); i++) {
abyte[i] = 0;
}
for (int i = 0; i < bin.length(); i++) {
abyte[i + (8 - bin.length())] = bin.charAt(i) - 48;
}
}
public void switchFirst() {
abyte[0] = 1;
}
public int toInt() {
int res = 0;
for (int i = 0; i < 8; i++) {
res += abyte[i] * Math.pow(2, (7 - i));
}
return res;
}
public static Byte fromString(String codestring) {
Byte b = new Byte();
for(int i=0; i < 8; i++)
b.abyte[i] = (codestring.charAt(i)=='0')?0:1;
return b;
}
public String toString() {
String res = "";
for (int i = 0; i < 8; i++) {
res += abyte[i];
}
return res;
}
}
It prints this in the console:
6
184
133
13
12
177
This second attempt seems to work: the output file is 6 bytes, while the first attempt produced 48 bytes.
The problem with the second attempt is that I can't read the file back correctly.
InputStreamReader inStream = new InputStreamReader(new FileInputStream(file));
int c = -1;
while((c = inStream.read()) != -1){
System.out.println( c );
}
I get this:
6
184
8230
13
12
177
So maybe I'm doing it the wrong way. I'd appreciate any advice. Thanks!

It is solved; I was just not reading the file the right way. Below is the right way:
DataInputStream inStream = null;
inStream = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
int c = -1;
while((c = inStream.read()) != -1){
Byte b = new Byte();
b.readInt(c);
System.out.println( c +":" + b.toString());
}
Now I get this as the result:
6:00000110
184:10111000
133:10000101
13:00001101
12:00001100
177:10110001
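A plausible explanation for the 8230 seen in the earlier read attempt: InputStreamReader decodes bytes into characters using a charset, so it does not return raw byte values. The sketch below assumes the platform default charset was windows-1252 (the post does not say), in which byte 133 (0x85) decodes to the character U+2026, whose code is 8230. The class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

final class ReaderVsStreamSketch {
    // Reads the first value through a Reader, which decodes bytes into characters.
    static int firstViaReader(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            return reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the first value through a plain byte stream, which returns the raw byte as 0..255.
    static int firstViaStream(byte[] data) {
        return new ByteArrayInputStream(data).read();
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 133};
        // Assumption: windows-1252 is available; it is not one of the charsets every JVM must support.
        Charset cp1252 = Charset.forName("windows-1252");
        System.out.println(firstViaReader(data, cp1252)); // character code after decoding
        System.out.println(firstViaStream(data));         // raw byte value
    }
}
```

Under that assumption the first line printed is 8230 and the second is 133, which matches the two outputs in the post.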
The point of writing the original sequence of integers as variable-byte-encoded bytes is that it reduces the file size: written as plain 4-byte ints, the sequence would take 12 bytes (3 * 4 bytes), but now it takes just 6 bytes.
To decode the bytes back into the original numbers:
int c = -1;
LinkedList<Byte> bytestream = new LinkedList<Byte>();
while((c = inStream.read()) != -1){
Byte b = new Byte();
b.readInt(c);
bytestream.add(b);
}
LinkedList<Integer> numbers = VBEncoder.decode(bytestream);
for(Integer number:numbers) System.out.println(number);
// here goes the code of VBEncoder.decode
public static LinkedList<Integer> decode(LinkedList<Byte> code) {
LinkedList<Integer> numbers = new LinkedList<Integer>();
int n = 0;
for (int i = 0; !(code.isEmpty()); i++) {
Byte b = code.poll();
int bi = b.toInt();
if (bi < 128) {
n = 128 * n + bi;
} else {
n = 128 * n + (bi - 128);
numbers.add(n);
n = 0;
}
}
return numbers;
}
I get back the sequence:
824
5
214577
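The round trip above can also be sketched with plain byte arrays instead of the custom Byte class. This is a minimal sketch, not the poster's VBEncoder (whose encode method is not shown): the class name VarByteSketch and its encode method are assumptions, and only non-negative numbers are considered. The decode logic follows the same rule as the decode method above: a byte below 128 continues a number, and a byte of 128 or more ends it.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

final class VarByteSketch {
    // Encodes each number as 7-bit groups, most significant group first.
    // The last byte of each number has its high bit set (value + 128).
    static byte[] encode(int[] numbers) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int n : numbers) {
            int[] groups = new int[5]; // a 32-bit value needs at most 5 groups of 7 bits
            int count = 0;
            do {
                groups[count++] = n & 0x7F; // collect groups from least significant upward
                n >>>= 7;
            } while (n != 0);
            for (int i = count - 1; i >= 0; i--) {
                // Emit the most significant group first; mark the final byte with the high bit.
                out.write(i == 0 ? groups[i] | 0x80 : groups[i]);
            }
        }
        return out.toByteArray();
    }

    // Decodes a byte array produced by encode().
    static List<Integer> decode(byte[] code) {
        List<Integer> numbers = new ArrayList<>();
        int n = 0;
        for (byte b : code) {
            int bi = b & 0xFF; // treat the byte as unsigned, 0..255
            if (bi < 128) {
                n = 128 * n + bi;
            } else {
                n = 128 * n + (bi - 128);
                numbers.add(n);
                n = 0;
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        byte[] code = encode(new int[] {824, 5, 214577});
        for (byte b : code) System.out.print((b & 0xFF) + " "); // 6 184 133 13 12 177
        System.out.println();
        System.out.println(decode(code)); // [824, 5, 214577]
    }
}
```

The resulting byte array can be written with any OutputStream and read back with any InputStream, as long as no Reader or Writer is involved.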

Related

How to use XOR to develop an OTPInputStream in Java

I want to develop an OTPInputStream in Java that extends InputStream, takes another input stream of key data, and provides an encrypting/decrypting input stream. I need to develop a test program that shows the use of OTPInputStream with XOR and arbitrary data.
I tried the code below, but I get this error:
java.io.FileInputStream cannot be cast to java.lang.CharSequence
What should I do here?
public class Bitwise_Encryption {
static String file = "" ;
static String key = "VFGHTrbg";
private static int[] encrypt(FileInputStream file, String key) {
int[] output = new int[((CharSequence) file).length()];
for(int i = 0; i < ((CharSequence) file).length(); i++) {
int o = (Integer.valueOf(((CharSequence) file).charAt(i)) ^ Integer.valueOf(key.charAt(i % (key.length() - 1)))) + '0';
output[i] = o;
}
return output;
}
private static String decrypt(int[] input, String key) {
String output = "";
for(int i = 0; i < input.length; i++) {
output += (char) ((input[i] - 48) ^ (int) key.charAt(i % (key.length() - 1)));
}
return output;
}
public static void main(String args[]) throws FileNotFoundException {
FileInputStream file = new FileInputStream("directory");
encrypt(file,key);
//decrypt();
int[] encrypted = encrypt(file,key);
System.out.println("Encrypted Data is :");
for(int i = 0; i < encrypted.length; i++)
System.out.printf("%d,", encrypted[i]);
System.out.println("");
System.out.println("---------------------------------------------------");
System.out.println("Decrypted Data is :");
System.out.println(decrypt(encrypted,key));
}
}

I think what you want is just file.read() to read one byte at a time and file.getChannel().size() to get the size of the file.
Try something like this:
private static int[] encrypt(FileInputStream file, String key) throws IOException {
// size() returns a long; the cast assumes the file is smaller than 2 GB
int fileSize = (int) file.getChannel().size();
int[] output = new int[fileSize];
for(int i = 0; i < output.length; i++) {
char char1 = (char) file.read();
int o = (char1 ^ Integer.valueOf(key.charAt(i % (key.length() - 1)))) + '0';
output[i] = o;
}
return output;
}
You will have to do some error handling, because file.read() returns -1 once the end of the file has been reached. Also, as pointed out, reading one byte at a time means a lot of I/O operations and can slow things down. You can read the data into a buffer instead, like this:
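private static int[] encrypt(FileInputStream file, String key) throws IOException {
// size() returns a long; the cast assumes the file is smaller than 2 GB
int fileSize = (int) file.getChannel().size();
int[] output = new int[fileSize];
int read = 0;
int offset = 0;
byte[] buffer = new byte[1024];
while((read = file.read(buffer)) > 0) {
for(int i = 0; i < read; i++) {
// mask to 0..255 so the value matches what file.read() would return
int byte1 = buffer[i] & 0xFF;
// index the key by the position in the whole file, not in the current buffer
int o = (byte1 ^ key.charAt((i + offset) % (key.length() - 1))) + '0';
output[i + offset] = o;
}
offset += read;
}
return output;
}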
This reads up to 1024 bytes at a time from the file into the buffer; you can then loop through the buffer to do your logic. The offset value tracks the current position in the output. You will also have to make sure that i + offset doesn't exceed the array size.
UPDATE
After working with it, I decided to switch to Base64 encoding/decoding to remove non-printable characters:
private static String encrypt(InputStream file, String key) throws Exception {
int read = 0;
byte[] buffer = new byte[1024];
try(ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
while((read = file.read(buffer)) > 0) {
baos.write(buffer, 0, read);
}
return base64Encode(xorWithKey(baos.toByteArray(), key.getBytes()));
}
}
private static String decrypt(String input, String key) {
byte[] decoded = base64Decode(input);
return new String(xorWithKey(decoded, key.getBytes()));
}
private static byte[] xorWithKey(byte[] a, byte[] key) {
byte[] out = new byte[a.length];
for (int i = 0; i < a.length; i++) {
out[i] = (byte) (a[i] ^ key[i%key.length]);
}
return out;
}
private static byte[] base64Decode(String s) {
return Base64.getDecoder().decode(s.trim());
}
private static String base64Encode(byte[] bytes) {
return Base64.getEncoder().encodeToString(bytes);
}
This method is cleaner and doesn't require knowing the size of your InputStream or doing any character conversions. It reads your InputStream into a ByteArrayOutputStream, XORs the bytes with the key, and Base64-encodes the result to remove non-printable characters.
I have tested this and it works for both encrypting and decrypting.
I got the idea from this answer:
XOR operation with two strings in java
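Since the question asked for a class that extends InputStream and takes a second stream of key data, here is a minimal sketch of that shape. It is an illustration under assumptions, not a vetted cipher: the class name OtpInputStream, the helper xorAll, and the choice to fail when the key stream runs out are all made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stream that XORs each data byte with the next byte of a key stream.
final class OtpInputStream extends InputStream {
    private final InputStream data;
    private final InputStream key;

    OtpInputStream(InputStream data, InputStream key) {
        this.data = data;
        this.key = key;
    }

    @Override
    public int read() throws IOException {
        int d = data.read();
        if (d == -1) return -1; // end of data
        int k = key.read();
        if (k == -1) throw new IOException("key stream is shorter than the data");
        return (d ^ k) & 0xFF;
    }

    @Override
    public void close() throws IOException {
        try {
            data.close();
        } finally {
            key.close();
        }
    }

    // Demo helper: runs every data byte through the stream and collects the result.
    static byte[] xorAll(byte[] data, byte[] key) {
        try (InputStream in = new OtpInputStream(new ByteArrayInputStream(data),
                                                 new ByteArrayInputStream(key))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int b;
            while ((b = in.read()) != -1) out.write(b);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary demo key; a real one-time pad is random, as long as the data, and never reused.
        byte[] key = {0x13, 0x57, (byte) 0x9B, (byte) 0xDF, 0x24};
        byte[] plain = "hello".getBytes(StandardCharsets.UTF_8);
        byte[] cipher = xorAll(plain, key);
        byte[] back = xorAll(cipher, key); // XOR with the same key restores the input
        System.out.println(new String(back, StandardCharsets.UTF_8)); // hello
    }
}
```

Because XOR is its own inverse, the same class serves for both encryption and decryption; only the wrapped data stream differs.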

FileChannel and ByteBuffer writing extra data

I am creating a method that will take in a file and split it into shardCount pieces and generate a parity file.
When I run this method, it appears that I am writing out extra data into my parity file. This is my first time using FileChannel and ByteBuffers, so I'm not certain I completely understand how to use them despite staring at the documentation for about 8 hours.
This code is a simplified version of the parity section.
public static void splitAndGenerateParityFile(File file, int shardCount, String fileID) throws IOException {
RandomAccessFile rin = new RandomAccessFile(file, "r");
FileChannel fcin = rin.getChannel();
//Create parity files
File parity = new File(fileID + "_parity");
if (parity.exists()) throw new FileAlreadyExistsException("Could not create parity file! File already exists!");
RandomAccessFile parityRAF = new RandomAccessFile(parity, "rw");
FileChannel parityOut = parityRAF.getChannel();
long bytesPerFile = (long) Math.ceil(rin.length() / shardCount);
//Make buffers for each section of the file we will be reading from
for (int i = 0; i < shardCount; i++) {
ByteBuffer bb = ByteBuffer.allocate(1024);
shardBuffers.add(bb);
}
ByteBuffer parityBuffer = ByteBuffer.allocate(1024);
//Generate parity
boolean isParityBufferEmpty = true;
for (long i = 0; i < bytesPerFile; i++) {
isParityBufferEmpty = false;
int pos = (int) (i % 1024);
byte p = 0;
if (pos == 0) {
//Read chunk of file into each buffer
for (int j = 0; j < shardCount; j++) {
ByteBuffer bb = shardBuffers.get(j);
bb.clear();
fcin.read(bb, bytesPerFile * j + i);
bb.rewind();
}
//Dump parity buffer
if (i > 0) {
parityBuffer.rewind();
parityOut.write(parityBuffer);
parityBuffer.clear();
isParityBufferEmpty = true;
}
}
//Get parity
for (ByteBuffer bb : shardBuffers) {
if (pos >= bb.limit()) break;
p ^= bb.get(pos);
}
//Put parity in buffer
parityBuffer.put(pos, p);
}
if (!isParityBufferEmpty) {
parityBuffer.rewind();
parityOut.write(parityBuffer);
parityBuffer.clear();
}
fcin.close();
rin.close();
parityOut.close();
parityRAF.close();
}
Please let me know if there is anything wrong with either the parity algorithm or the file IO, or if there's anything I can do to optimize this. I'm happy to hear about other (better) ways of doing file IO.

Here is the solution I found (though it may need more tuning):
public static void splitAndGenerateParityFile(File file, int shardCount, String fileID) throws IOException {
int BUFFER_SIZE = 4 * 1024 * 1024;
RandomAccessFile rin = new RandomAccessFile(file, "r");
FileChannel fcin = rin.getChannel();
//Create parity files
File parity = new File(fileID + "_parity");
if (parity.exists()) throw new FileAlreadyExistsException("Could not create parity file! File already exists!");
RandomAccessFile parityRAF = new RandomAccessFile(parity, "rw");
FileChannel parityOut = parityRAF.getChannel();
//Create shard files
ArrayList<File> shards = new ArrayList<>(shardCount);
for (int i = 0; i < shardCount; i++) {
File f = new File(fileID + "_part_" + i);
if (f.exists()) throw new FileAlreadyExistsException("Could not create shard file! File already exists!");
shards.add(f);
}
long bytesPerFile = (long) Math.ceil(rin.length() / shardCount);
ArrayList<ByteBuffer> shardBuffers = new ArrayList<>(shardCount);
//Make buffers for each section of the file we will be reading from
for (int i = 0; i < shardCount; i++) {
ByteBuffer bb = ByteBuffer.allocate(BUFFER_SIZE);
shardBuffers.add(bb);
}
ByteBuffer parityBuffer = ByteBuffer.allocate(BUFFER_SIZE);
//Generate parity
boolean isParityBufferEmpty = true;
for (long i = 0; i < bytesPerFile; i++) {
isParityBufferEmpty = false;
int pos = (int) (i % BUFFER_SIZE);
byte p = 0;
if (pos == 0) {
//Read chunk of file into each buffer
for (int j = 0; j < shardCount; j++) {
ByteBuffer bb = shardBuffers.get(j);
bb.clear();
fcin.position(bytesPerFile * j + i);
fcin.read(bb);
bb.flip();
}
//Dump parity buffer
if (i > 0) {
parityBuffer.flip();
while (parityBuffer.hasRemaining()) {
parityOut.write(parityBuffer);
}
parityBuffer.clear();
isParityBufferEmpty = true;
}
}
//Get parity
for (ByteBuffer bb : shardBuffers) {
if (!bb.hasRemaining()) break;
p ^= bb.get();
}
//Put parity in buffer
parityBuffer.put(p);
}
if (!isParityBufferEmpty) {
parityBuffer.flip();
parityOut.write(parityBuffer);
parityBuffer.clear();
}
fcin.close();
rin.close();
parityOut.close();
parityRAF.close();
}
As suggested by VGR, I replaced rewind() with flip(). I also switched to relative operations instead of absolute ones. The absolute methods don't adjust the buffer's position or limit, so that was likely the cause of the error. I also changed the buffer size to 4 MB, as I am interested in generating parity for large files.
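The difference between rewind() and flip(), and between relative and absolute put, can be seen from how many bytes remain for a subsequent channel write. The following is a small sketch with made-up names (FlipVsRewindSketch and its methods); it touches only ByteBuffer and no files.

```java
import java.nio.ByteBuffer;

final class FlipVsRewindSketch {
    // Relative puts followed by rewind(): position goes to 0 but the limit stays at capacity.
    static int remainingAfterRewind(int capacity, int bytesPut) {
        ByteBuffer bb = ByteBuffer.allocate(capacity);
        for (int i = 0; i < bytesPut; i++) bb.put((byte) i);
        bb.rewind();
        return bb.remaining();
    }

    // Relative puts followed by flip(): the limit becomes the old position, then position goes to 0.
    static int remainingAfterFlip(int capacity, int bytesPut) {
        ByteBuffer bb = ByteBuffer.allocate(capacity);
        for (int i = 0; i < bytesPut; i++) bb.put((byte) i);
        bb.flip();
        return bb.remaining();
    }

    // Absolute puts never move the position, so flip() sets the limit to 0.
    static int remainingAfterAbsolutePutAndFlip(int capacity, int bytesPut) {
        ByteBuffer bb = ByteBuffer.allocate(capacity);
        for (int i = 0; i < bytesPut; i++) bb.put(i, (byte) i);
        bb.flip();
        return bb.remaining();
    }

    public static void main(String[] args) {
        System.out.println(remainingAfterRewind(1024, 3));             // 1024: a write would emit the whole buffer
        System.out.println(remainingAfterFlip(1024, 3));               // 3: only the bytes that were put
        System.out.println(remainingAfterAbsolutePutAndFlip(1024, 3)); // 0: nothing would be written
    }
}
```

This is consistent with the symptom in the question: after absolute puts and rewind(), a write of a partially filled buffer sends all 1024 bytes, including the unused tail.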

Why is my Datagram Packet response incomplete?

I'm trying to build an Android app to perform UDP requests. However, whenever I try to receive the response, the last four characters in the response string are missing. The response should be 38 bytes long.
I've tried specifying what encoding to use and it didn't make much difference.
private void updateState() {
final byte[] msg = hexStringToBytes("24000034...");
new Thread(new Runnable() {
public void run() {
try {
InetAddress bulbAddress = InetAddress.getByAddress(ipAddr);
if (!socket.getBroadcast()) socket.setBroadcast(true);
DatagramPacket packet = new DatagramPacket(msg, msg.length, bulbAddress, 56700);
socket.send(packet);
DatagramPacket packet1 = new DatagramPacket(msg, msg.length, bulbAddress, 56700);
socket.receive(packet1);
TextView textView = (TextView) findViewById(R.id.state);
String out = new String(packet1.getData(), packet1.getOffset(), packet1.getLength());
textView.setText(toHex(out));
} catch (Exception ex) {
ex.printStackTrace();
}
}
}).start();
}
private static byte[] hexStringToBytes(String input) {
input = input.toLowerCase(Locale.US);
int n = input.length() / 2;
byte[] output = new byte[n];
int l = 0;
for (int k = 0; k < n; k++) {
char c = input.charAt(l++);
byte b = (byte) ((c >= 'a' ? (c - 'a' + 10) : (c - '0')) << 4);
c = input.charAt(l++);
b |= (byte) (c >= 'a' ? (c - 'a' + 10) : (c - '0'));
output[k] = b;
}
return output;
}
public String toHex(String arg) {
return String.format("%040x", new BigInteger(1, arg.getBytes()));
}
I expect the last four characters to be present and either FF FF or 00 00.

I think I've found the problem: I needed to pass packet1 a new, empty byte array with the response length of 28 bytes instead of reusing the request array, which has only 26 bytes. At some point I completely blanked on the fact that the first packet is two bytes smaller than the response.
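The truncation can be reproduced over the loopback interface. In this sketch the class name ReceiveBufferSketch and the sizes are illustrative only: 38 is the response length stated in the question, and 36 is an assumed size for a receive buffer that is two bytes too small. It assumes loopback UDP is available on the machine running it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

final class ReceiveBufferSketch {
    // Sends payloadSize bytes over loopback and returns how many bytes
    // arrive in a receive buffer of bufferSize bytes.
    static int receivedLength(int payloadSize, int bufferSize) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket(0, loopback)) {
            receiver.setSoTimeout(2000); // do not block forever if the packet is lost
            byte[] payload = new byte[payloadSize];
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));
            // The receive packet gets its own buffer; anything beyond its length is dropped.
            byte[] buffer = new byte[bufferSize];
            DatagramPacket response = new DatagramPacket(buffer, buffer.length);
            receiver.receive(response);
            return response.getLength();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(receivedLength(38, 36)); // buffer too small: the datagram is truncated
        System.out.println(receivedLength(38, 64)); // buffer large enough: the full datagram arrives
    }
}
```

A common approach is to allocate the receive buffer at least as large as the largest response expected, and then use getLength() to see how much was actually received.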

Huffman Code writing bits to a file for compression

I was asked to use Huffman coding to compress an input file and write the result to an output file. I have finished implementing the Huffman tree structure and generating the Huffman codes, but I don't know how to write those codes to a file so that it is smaller than the original file.
Right now I have the codes as strings (e.g. the Huffman code for 'c' is "0100"). Could someone please help me write those bits to a file?

Here is a possible implementation for writing a stream of bits (the output of Huffman coding) to a file.
class BitOutputStream {
private OutputStream out;
private boolean[] buffer = new boolean[8];
private int count = 0;
public BitOutputStream(OutputStream out) {
this.out = out;
}
public void write(boolean x) throws IOException {
// Store bits in arrival order so the first bit written becomes the most significant bit.
this.buffer[this.count] = x;
this.count++;
if (this.count == 8){
flushByte();
}
}
private void flushByte() throws IOException {
int num = 0;
for (int index = 0; index < 8; index++){
num = 2*num + (this.buffer[index] ? 1 : 0);
// Clear the slot so a later partial byte is padded with zero bits.
this.buffer[index] = false;
}
// OutputStream.write(int) writes the low 8 bits of num, so no offset is needed.
this.out.write(num);
this.count = 0;
}
public void close() throws IOException {
// Write a final, zero-padded byte only if some bits are still pending.
if (this.count > 0){
flushByte();
}
this.out.close();
}
}
By calling the write method you will be able to write bit by bit to a file (OutputStream).
Edit
For your specific problem, to save each character's Huffman code you can simply use this if you don't want to use some other fancy class:
String huffmanCode = "0100"; // lets say its huffman coding output for c
BitSet huffmanCodeBit = new BitSet(huffmanCode.length());
for (int i = 0; i < huffmanCode.length(); i++) {
if(huffmanCode.charAt(i) == '1')
huffmanCodeBit.set(i);
}
String path = Resources.getResource("myfile.out").getPath();
ObjectOutputStream outputStream = null;
try {
outputStream = new ObjectOutputStream(new FileOutputStream(path));
outputStream.writeObject(huffmanCodeBit);
} catch (IOException e) {
e.printStackTrace();
}
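One caveat with the ObjectOutputStream approach is that Java serialization adds a stream header and class metadata, so the file can end up larger than the bits themselves. As an alternative, the code strings can be packed directly into bytes. The following is a minimal sketch with made-up names (BitPackingSketch, pack, unpack); it assumes the number of valid bits is stored or known separately, since the last byte is padded with zeros.

```java
final class BitPackingSketch {
    // Packs a string of '0'/'1' characters into bytes; the first character becomes
    // the most significant bit of the first byte. The final byte is zero-padded.
    static byte[] pack(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                out[i / 8] |= (byte) (0x80 >>> (i % 8));
            }
        }
        return out;
    }

    // Reverses pack(); bitCount says how many bits are real, the rest is padding.
    static String unpack(byte[] bytes, int bitCount) {
        StringBuilder sb = new StringBuilder(bitCount);
        for (int i = 0; i < bitCount; i++) {
            boolean set = (bytes[i / 8] & (0x80 >>> (i % 8))) != 0;
            sb.append(set ? '1' : '0');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two copies of the example code "0100" followed by a hypothetical code "11".
        String bits = "0100" + "0100" + "11";
        byte[] packed = pack(bits);
        System.out.println(packed.length);                  // 2 bytes for 10 bits
        System.out.println(unpack(packed, bits.length()));  // 0100010011
    }
}
```

The packed array can be written with a plain FileOutputStream; a decoder then needs the bit count (or an end-of-data symbol) to ignore the padding.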

Hex to ASCII showing different result to correct PHP implementation

I needed a method that would convert hex to ascii, and most seem to be a variation of the following:
public String hexToAscii(String hex) {
StringBuilder sb = new StringBuilder();
StringBuilder temp = new StringBuilder();
for(int i = 0; i < hex.length() - 1; i += 2){
String output = hex.substring(i, (i + 2));
int decimal = Integer.parseInt(output, 16);
sb.append((char)decimal);
temp.append(decimal);
}
return sb.toString();
}
The idea is to look at
hexToAscii("51d37bdd871c9e1f4d5541be67a6ab625e32028744d7d4609d0c37747b40cd2d");
If I print the result out, I get
-Í#{t7?`Ô×D?2^b«¦g¾AUM??Ý{ÓQ.
This is not the result I am needing though. A friend got the correct result in PHP which was the string reverse of the following:
QÓ{Ý‡žMUA¾g¦«b^2‡D×Ô`7t{#Í-
There are clearly characters that his hexToAscii function is encoding whereas mine is not.
Not really sure why this is the case, but how can I implement this version in Java?

Assuming your input string is in, I would use a method like this
public static byte[] decode(String in) {
if (in != null) {
in = in.trim();
List<Byte> bytes = new ArrayList<Byte>();
char[] chArr = in.toCharArray();
int t = 0;
while (t + 1 < chArr.length) {
String token = "" + chArr[t] + chArr[t + 1];
// Parse the two hex digits (0..255); the cast below wraps values above 127
// into Java's signed byte range without changing the bits.
int b = Integer.parseInt(token, 16);
bytes.add((byte) b);
t += 2;
}
byte[] out = new byte[bytes.size()];
for (int i = 0; i < bytes.size(); ++i) {
out[i] = bytes.get(i);
}
return out;
}
return new byte[] {};
}
And then you could use it like this
new String(decode("51d37bdd871c9e1f4d5541be67a6ab625e"
+"32028744d7d4609d0c37747b40cd2d"))

How about trying like this:
public static void main(String[] args) {
String hex = "51d37bdd871c9e1f4d5541be67a6ab625e32028744d7d4609d0c37747b40cd2d";
StringBuilder output = new StringBuilder();
for (int i = 0; i < hex.length(); i+=2) {
String str = hex.substring(i, i+2);
output.append((char)Integer.parseInt(str, 16));
}
System.out.println(output);
}
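One possible reading of the '?' characters in the Java output is that the decoded values were fine but the console's charset could not represent some of them. The sketch below is an illustration under that assumption, with made-up names (HexToLatin1Sketch and its methods): it decodes the hex into bytes, maps each byte to the character with the same code via ISO-8859-1, and shows that re-encoding into a charset lacking those characters (US-ASCII here) substitutes '?'.

```java
import java.nio.charset.StandardCharsets;

final class HexToLatin1Sketch {
    // Converts pairs of hex digits into raw bytes.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) throw new IllegalArgumentException("odd number of hex digits");
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // ISO-8859-1 maps every byte value 0..255 to the character with the same code.
    static String hexToLatin1(String hex) {
        return new String(hexToBytes(hex), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = hexToLatin1("51d37bdd");
        for (char c : s.toCharArray()) System.out.print((int) c + " "); // 81 211 123 221
        System.out.println();
        // Encoding into a charset without these characters replaces them with '?' (63).
        for (byte b : s.getBytes(StandardCharsets.US_ASCII)) System.out.print(b + " "); // 81 63 123 63
        System.out.println();
    }
}
```

If the goal is to match the PHP output byte for byte, comparing the raw bytes from hexToBytes is more reliable than comparing printed strings, because printing depends on the console's encoding.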

Develop Reference

Java is a programming language and computing platform first released by Sun Microsystems in 1995.
