How do I convert a large binary String to byte array java? - java

I have a large binary string "101101110...", and I am trying to store it into a byte array. what is the best way of doing it?
Lets say I have largeString = "0100111010111011011000000001000110101"
Result that I'm looking for:
[78, 187, 96, 17, 21]
01001110 10111011 01100000 00010001 10101
What i've tried:
byte[] b= new BigInteger(largeString,2).toByteArray();
however it did not give me the result I'm looking for...

You can easily build an ArrayList on which you can call toArray if you want an actual array;
List<Integer> list = new ArrayList<>();
for(String str : largeString.split("(?<=\\G.{8})"))
list.add(Integer.parseInt(str, 2));
System.out.println(list); // Outputs [78, 187, 96, 17, 21]

Assuming that your binary string module 8 equals 0 binString.lenght()%8==0
/**
* Get an byte array by binary string
* #param binaryString the string representing a byte
* #return an byte array
*/
public static byte[] getByteByString(String binaryString) {
int splitSize = 8;
if(binaryString.length() % splitSize == 0){
int index = 0;
int position = 0;
byte[] resultByteArray = new byte[binaryString.length()/splitSize];
StringBuilder text = new StringBuilder(binaryString);
while (index < text.length()) {
String binaryStringChunk = text.substring(index, Math.min(index + splitSize, text.length()));
Integer byteAsInt = Integer.parseInt(binaryStringChunk, 2);
resultByteArray[position] = byteAsInt.byteValue();
index += splitSize;
position ++;
}
return resultByteArray;
}
else{
System.out.println("Cannot convert binary string to byte[], because of the input length. '" +binaryString+"' % 8 != 0");
return null;
}
}

Do it in a loop. Split the string at 8-character chunks and convert them separately. In "pseudocode" it's something like:
byte[] result = new byte[subs.size()];
int i = 0;
int j = 0;
while(i+8 <= s.length){
result[j] = new Byte.valueOf(largeString.substring(i, i+8), 2);
i+=8;
j++;
}
result[j] = new Byte.valueOf(largeString.substring(i, largeString.length));

Related

How to encode chars in 2-bits? in java

DNA molecules are denoted by one of four values: A, C, G, or T. I need to convert a string of characters from A, C, G, and T to an array of bytes, encoding each of the characters
with two bits.A with bits 00, C with bits 01, G with 10, and T with 11. I don't understand how to convert characters to 2 bits. I was trying to shift and mask, but got wrong result.
At the very beginning, I check if there are characters in the line. Then i convert each character into a bit value and insert it into an array. When i insert ACGT, in the output i got 0 1 3 2. And here I have a problem, because I don’t understand how to convert the value to 2 bits.
Scanner text = new Scanner(System.in);
String str = text.nextLine();
if (str.contains("A") && str.contains("C") && str.contains("G") && str.contains("T")){
System.out.println("");
}
else
{
System.out.println("wrong command format");
}
byte mas[] = str.getBytes();
System.out.println("String in byte array : " + Arrays.toString(mas));
for (int i = 0; i < mas.length; i++){
byte mask = 3;
byte number = mas[i];
byte result = (byte)((number >> 1) & mask);
System.out.println(result);
}
}
}
It seems that you want to save the bits in a byte. The following example might give some ideas.
public class Main
{
private static final int A = 0x00; // b00
private static final int C = 0x01; // b01
private static final int G = 0x02; // b10
private static final int T = 0x03; // b11
public static void main(String[] args) throws Exception
{
byte store = 0;
store = setByte(store, 0, A);
store = setByte(store, 1, C);
store = setByte(store, 2, G);
store = setByte(store, 3, T);
System.out.println(Integer.toBinaryString(store));
//11111111111111111111111111100100
System.out.println(getByte(store, 0)); //0
System.out.println(getByte(store, 1)); //1
System.out.println(getByte(store, 2)); //2
System.out.println(getByte(store, 3)); //3
}
//Behavior :: Store "value" into "store".
//Reminder :: Valid index 0 - 3. Valid value 0 - 3.
private static byte setByte(byte store, int index, int value)
{
store = (byte)(store & ~(0x3 << (2 * index)));
return store |= (value & 0x3) << (2 * index);
}
private static byte getByte(byte store, int index)
{
return (byte)((store >> (2 * index)) & 0x3);
}
}
I haven't tested this, but it may help you.
byte test = 69;
byte insert = 0b01;
byte index = 2;
final byte ones = 0b00000011;
//Clear out the data at specified index
test = (byte) (test & ~(ones << index));
//Insert data
test |= (byte) (insert << index);
It works as follows:
Clear the 2 bits at the index in the byte (using bitwise AND).
Insert the 2 data bits at the index in the byte using bitwise OR).
You can "convert" the chars ACGT to 0, 1, 2, 3 using bit arithmetic.
byte[] bytes = str.getBytes();
for (int i = 0; i < bytes.length; i++) {
bytes[i] = (byte)(bytes[i] >> 1 & 3 ^ bytes[i] >> 2 & 1);
}
I suspect your initial check should be:
if (!str.matches("[ACGT]+") {
System.out.println("wrong command format");
return;
}

Convert Binary to Hexadecimal in Java without methods

I am currently a beginner in programming and I am trying to write a program in java to convert binary in hexadecimal numbers.
I know that the program will have to divide the number in groups of 4 and convert them to hexadecimal.
Ex: 11101111 (b2) --> E + F --- EF
However, since I used ints to do the conversion of the numbers, I'm stuck when I need to print a letter because it is a String.
Can someone point me to the right way? What am I doing wrong? I've also tried another version with an auxiliary array to store each group of 4 digits but I can't manage to insert a proper dimension to the array.
Unfortunately I am not allowed to use any function other than Scanner and Math, the method lenght and charAt and the basic stuff. I can't modify the public static line either.
EDIT: So after your inputs and so many tries, I managed to get this code. However it gives me an error if I insert too many numbers, eg: 0111011010101111. I've tried to change int to double but that didn't fix the problem.
import java.util.Scanner;
public class Bin2HexString {
public static void main(String[] args) {
Scanner keyb = new Scanner(System.in);
System.out.println("Valor?");
int vlr = keyb.nextInt();
String num = "";
int aux = vlr;
// Hexadecimal numbers
String arr[] = {"0","1","2","3","4","5","6","7","8","9","A", "B", "C", "D", "E", "F"};
String bits[] = {"0000","0001","0010","0011","0100","0101","0110","0111","1000","1001","1010","1011","1100","1101","1110","1111"};
String letters = "";
//Divide in groups of 4
int r;
for (; aux > 0; ) {
r = aux % 10000;
aux = aux / 10000;
num = "" + r;
for (;num.length() < 4;) { //add missing zeros
String zero = "0";
num = zero + num;
}
int charint = 0,bitint = 0;
for (int i = 0; i < arr.length;i++) {
String aux2 = bits[i];
String aux3 = arr[i];
for (int j = 0; j < num.length();j++) { // compare each group with arr[i]
char charvl = num.charAt(j);
char bitsvl = aux2.charAt(j);
charint = ((int) (charvl)-'0');
bitint = ((int) (bitsvl) - '0');
if (bitint != charint)
break;
}
if (bitint == charint)
letters = aux3 + "" + letters;
}
}
System.out.println(letters);
}
}
Having thought about this for a while to determine the most effective and useful way to do this is to write methods which convert a string from any base between 2 and 16 to an int and back to a string again.
This way you have useful methods for other things. And note that they methods can be easily changed and names to simply hard code the desired radix into the method to limit it to binary and hex methods.
The indexOf utility method was written to avoid using the builtin String method.
final static String hex = "0123456789ABCDEF";
static int stringToInt(String str, int radix) {
if (radix < 2 || radix > 16) {
System.out.println("Base must be between 2 and 16 inclusive");
return -1;
}
int v = 0;
for (int i = 0; i < str.length(); i++) {
char c = str.charAt(i);
int idx = indexOf(hex, c);
if (idx < 0 || idx > radix) {
System.out.println("Illegal character in string (" + c + ")");
}
v = v * radix + idx;
}
return v;
}
static String intToBase(int v, int radix) {
if (radix < 2 || radix > 16) {
System.out.println("Base must be between 2 and 16 inclusive");
return null;
}
String s = "";
while (v > 0) {
int idx = v % radix;
s = hex.charAt(idx) + s;
v /= radix;
}
return s;
}
static int indexOf(String str, char c) {
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == c) {
return i;
}
}
return -1;
}
And here is an example of their use.
// generate some test data
Random r = new Random(23);
String[] bitStrings =
r.ints(20, 20, 4000).mapToObj(Integer::toBinaryString).toArray(
String[]::new);
for (String bitstr : bitStrings) {
int v = baseToInt(bitstr, 2);
String hex = intToBase(v, 16);
System.out.printf("%12s = %s%n", bitstr, hex);
}
Which prints the following:
101110000011 = B83
111001111100 = E7C
10001110111 = 477
100110001111 = 98F
111001010 = 1CA
111001001111 = E4F
111000011010 = E1A
100001010010 = 852
11011001101 = 6CD
111010010111 = E97
Just some quick notes:
First this is wrong:
//Divide in groups of 4
for (; aux > 0; ) {
r = aux % 10000;
aux = aux / 10000;
Not at all what you want to do. Try it by hand and see what happens. Take a simple number that you know the answer to, and try it. You won't get the right answer. A good test is 17, which is 11 hex.
Try this instead: convert directly to the base you want. Hex is base 16 (its radix is 16), so you use 16 instead.
//Divide in groups of 4
for (; aux > 0; ) {
r = aux % 16;
aux = aux / 16;
Try those numbers with the test case, which is 17, and see what you get. That will get you much closer.
I'm assuming by "without methods" in the title, you are attempting to write your own integer parsing method instead of using Scanner.nextInt(int radix). In that case, my first advice would be work with a string instead of an integer - you'll be able to handle larger numbers and you can simply make an array of substrings (length 4) to convert to letters.
So, if you use the string approach - first scan in a string, not an int. Then I'd recommend a hash table with the 4-bit strings as keys and the hexadecimal equivalents as values. That should make calculation quite fast.
e.g.
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
public class HashMapBin2Hex
{
public static void main(String[] args)
{
//Read the string in
Scanner sc = new Scanner(System.in);
System.out.println("Binary number?");
String bin = sc.nextLine();
//Pad the bitstring with leading zeros to make a multiple of four
String zeros = "";
int i;
if (bin.length() % 4 != 0)
{
for (i = 0; i < 4 - (bin.length() % 4); i++)
{
zeros += "0";
}
}
bin = zeros + bin;
//Split the padded string into 4-bit chunks
String[] chunks = new String[bin.length() / 4];
for (i = 0; (i * 4) < bin.length() - 1; i++)
{
chunks[i] = bin.substring(i * 4, (i * 4) + 4);
}
//Convert the chunks to hexadecimal
String hex = "";
Map<String, String> bin2hex = new HashMap<>();
bin2hex.put("0000", "0");
bin2hex.put("0001", "1");
bin2hex.put("0010", "2");
bin2hex.put("0011", "3");
bin2hex.put("0100", "4");
bin2hex.put("0101", "5");
bin2hex.put("0110", "6");
bin2hex.put("0111", "7");
bin2hex.put("1000", "8");
bin2hex.put("1001", "9");
bin2hex.put("1010", "A");
bin2hex.put("1011", "B");
bin2hex.put("1100", "C");
bin2hex.put("1101", "D");
bin2hex.put("1110", "E");
bin2hex.put("1111", "F");
for (String s : chunks)
{
hex += bin2hex.get(s);
}
System.out.println("Hexadecimal: " + hex);
sc.close();
}
}
Further iterations could have some error checking to prevent catastrophic failure in the case of characters other than 0 or 1.
And of course, if you're fine with the other way (builtins), the following is far easier and more robust (ie will throw an exception if the string contains anything other than 0s and 1s):
import java.util.Scanner;
public class BuiltinBin2Hex
{
public static void main(String[] args)
{
//Read the binary number in
Scanner sc = new Scanner(System.in);
System.out.println("Binary number?");
int bin = sc.nextInt(2);
//And print as hexadecimal
System.out.println("Hexadecimal: " + Integer.toString(bin, 16));
sc.close();
}
}

Java - Convert Big-Endian to Little-Endian

I have the following hex string:
00000000000008a3a41b85b8b29ad444def299fee21793cd8b9e567eab02cd81
but I want it to look like this:
81cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000 (Big
endian)
I think I have to reverse and swap the string, but something like this doesn't give me right result:
String hex = "00000000000008a3a41b85b8b29ad444def299fee21793cd8b9e567eab02cd81";
hex = new StringBuilder(hex).reverse().toString();
Result:
81dc20bae765e9b8dc39712eef992fed444da92b8b58b14a3a80000000000000
(wrong)
81cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000
(should be)
The swapping:
public static String hexSwap(String origHex) {
// make a number from the hex
BigInteger orig = new BigInteger(origHex,16);
// get the bytes to swap
byte[] origBytes = orig.toByteArray();
int i = 0;
while(origBytes[i] == 0) i++;
// swap the bytes
byte[] swapBytes = new byte[origBytes.length];
for(/**/; i < origBytes.length; i++) {
swapBytes[i] = origBytes[origBytes.length - i - 1];
}
BigInteger swap = new BigInteger(swapBytes);
return swap.toString(10);
}
hex = hexSwap(hex);
Result:
026053973026883595670517176393898043396144045912271014791797784
(wrong)
81cd02ab7e569e8bcd9317e2fe99f2de44d49ab2b8851ba4a308000000000000
(should be)
Can anyone give me a example of how to accomplish this?
Thank you a lot :)
You need to swap each pair of characters, as you're reversing the order of the bytes, not the nybbles. So something like:
public static String reverseHex(String originalHex) {
// TODO: Validation that the length is even
int lengthInBytes = originalHex.length() / 2;
char[] chars = new char[lengthInBytes * 2];
for (int index = 0; index < lengthInBytes; index++) {
int reversedIndex = lengthInBytes - 1 - index;
chars[reversedIndex * 2] = originalHex.charAt(index * 2);
chars[reversedIndex * 2 + 1] = originalHex.charAt(index * 2 + 1);
}
return new String(chars);
}

Java: convert byte[] to base36 String

I'm a bit lost. For a project, I need to convert the output of a hash-function (SHA256) - which is a byte array - to a String using base 36.
So In the end, I want to convert the (Hex-String representation of the) Hash, which is
43A718774C572BD8A25ADBEB1BFCD5C0256AE11CECF9F9C3F925D0E52BEAF89
to base36, so the example String from above would be:
3SKVHQTXPXTEINB0AT1P0G45M4KI8U0HR8PGB96DVXSTDJKI1
For the actual conversion to base36, I found some piece of code here on StackOverflow:
public static String toBase36(byte[] bytes) {
//can provide a (byte[], offset, length) method too
StringBuffer sb = new StringBuffer();
int bitsUsed = 0; //will point how many bits from the int are to be encoded
int temp = 0;
int tempBits = 0;
long swap;
int position = 0;
while((position < bytes.length) || (bitsUsed != 0)) {
swap = 0;
if(tempBits > 0) {
//there are bits left over from previous iteration
swap = temp;
bitsUsed = tempBits;
tempBits = 0;
}
//fill some bytes
while((position < bytes.length) && (bitsUsed < 36)) {
swap <<= 8;
swap |= bytes[position++];
bitsUsed += 8;
}
if(bitsUsed > 36) {
tempBits = bitsUsed - 36; //this is always 4
temp = (int)(swap & ((1 << tempBits) - 1)); //get low bits
swap >>= tempBits; //remove low bits
bitsUsed = 36;
}
sb.append(Long.toString(swap, 36));
bitsUsed = 0;
}
return sb.toString();
}
Now I'm doing this:
// this creates my hash, being a 256-bit byte array
byte[] hash = PBKDF2.deriveKey(key.getBytes(), salt.getBytes(), 2, 256);
System.out.println(hash.length); // outputs "256"
System.out.println(toBase36(hash)); // outputs total crap
the "total crap" is something like
-7-14-8-1q-5se81u0e-3-2v-24obre-73664-7-5-5cor1o9s-6h-4k6hr-5-4-rt2z0-30-8-2u-8-onz-4a2j-6-8-18-8trzza3-3-2x-6-4153to-4e3l01me-6-azz-2-k-4ckq-nav-gu-irqpxx-el-1j-6-rmf8hs-1bb5ax-3z25u-2-2r-t5-22-6-6w1v-1p
so it's not even close to what I want. I tried to find a solution now, but it seems I'm a bit lost here. How do I get the base36-encoded String representation of the Hash that I need?
Try using BigInteger:
String hash = "43A718774C572BD8A25ADBEB1BFCD5C0256AE11CECF9F9C3F925D0E52BEAF89";
//use a radix of 16, default would be 10
String base36 = new BigInteger( hash, 16 ).toString( 36 ).toUpperCase();
This might work:
BigInteger big = new BigInteger(your_byte_array_to_hex_string, 16);
big.toString(36);

Store binary sequence in byte array?

I need to store a couple binary sequences that are 16 bits in length into a byte array (of length 2). The one or two binary numbers don't change, so a function that does conversion might be overkill. Say for example the 16 bit binary sequence is 1111000011110001. How do I store that in a byte array of length two?
String val = "1111000011110001";
byte[] bval = new BigInteger(val, 2).toByteArray();
There are other options, but I found it best to use BigInteger class, that has conversion to byte array, for this kind of problems. I prefer if, because I can instantiate class from String, that can represent various bases like 8, 16, etc. and also output it as such.
Edit: Mondays ... :P
public static byte[] getRoger(String val) throws NumberFormatException,
NullPointerException {
byte[] result = new byte[2];
byte[] holder = new BigInteger(val, 2).toByteArray();
if (holder.length == 1) result[0] = holder[0];
else if (holder.length > 1) {
result[1] = holder[holder.length - 2];
result[0] = holder[holder.length - 1];
}
return result;
}
Example:
int bitarray = 12321;
String val = Integer.toString(bitarray, 2);
System.out.println(new StringBuilder().append(bitarray).append(':').append(val)
.append(':').append(Arrays.toString(getRoger(val))).append('\n'));
I have been disappointed with all of the solutions I have found to converting strings of bits to byte arrays and vice versa -- all have been buggy (even the BigInteger solution above), and very few are as efficient as they should be.
I realize the OP was only concerned with a bit string to an array of two bytes, which the BitInteger approach seems to work fine for. However, since this post is currently the first search result when searching "bit string to byte array java" in Google, I am going to post my general solution here for people dealing with huge strings and/or huge byte arrays.
Note that my solution below is the only solution I have ran that passes all of my test cases -- many online solutions to this relatively simple problem simply do not work.
Code
/**
* Zips (compresses) bit strings to byte arrays and unzips (decompresses)
* byte arrays to bit strings.
*
* #author ryan
*
*/
public class BitZip {
private static final byte[] BIT_MASKS = new byte[] {1, 2, 4, 8, 16, 32, 64, -128};
private static final int BITS_PER_BYTE = 8;
private static final int MAX_BIT_INDEX_IN_BYTE = BITS_PER_BYTE - 1;
/**
* Decompress the specified byte array to a string.
* <p>
* This function does not pad with zeros for any bit-string result
* with a length indivisible by 8.
*
* #param bytes The bytes to convert into a string of bits, with byte[0]
* consisting of the least significant bits in the byte array.
* #return The string of bits representing the byte array.
*/
public static final String unzip(final byte[] bytes) {
int byteCount = bytes.length;
int bitCount = byteCount * BITS_PER_BYTE;
char[] bits = new char[bitCount];
{
int bytesIndex = 0;
int iLeft = Math.max(bitCount - BITS_PER_BYTE, 0);
while (bytesIndex < byteCount) {
byte value = bytes[bytesIndex];
for (int b = MAX_BIT_INDEX_IN_BYTE; b >= 0; --b) {
bits[iLeft + b] = ((value % 2) == 0 ? '0' : '1');
value >>= 1;
}
iLeft = Math.max(iLeft - BITS_PER_BYTE, 0);
++bytesIndex;
}
}
return new String(bits).replaceFirst("^0+(?!$)", "");
}
/**
* Compresses the specified bit string to a byte array, ignoring trailing
* zeros past the most significant set bit.
*
* #param bits The string of bits (composed strictly of '0' and '1' characters)
* to convert into an array of bytes.
* #return The bits, as a byte array with byte[0] containing the least
* significant bits.
*/
public static final byte[] zip(final String bits) {
if ((bits == null) || bits.isEmpty()) {
// No observations -- return nothing.
return new byte[0];
}
char[] bitChars = bits.toCharArray();
int bitCount = bitChars.length;
int left;
for (left = 0; left < bitCount; ++left) {
// Ignore leading zeros.
if (bitChars[left] == '1') {
break;
}
}
if (bitCount == left) {
// Only '0's in the string.
return new byte[] {0};
}
int cBits = bitCount - left;
byte[] bytes = new byte[((cBits) / BITS_PER_BYTE) + (((cBits % BITS_PER_BYTE) > 0) ? 1 : 0)];
{
int iRight = bitCount - 1;
int iLeft = Math.max(bitCount - BITS_PER_BYTE, left);
int bytesIndex = 0;
byte _byte = 0;
while (bytesIndex < bytes.length) {
while (iLeft <= iRight) {
if (bitChars[iLeft] == '1') {
_byte |= BIT_MASKS[iRight - iLeft];
}
++iLeft;
}
bytes[bytesIndex++] = _byte;
iRight = Math.max(iRight - BITS_PER_BYTE, left);
iLeft = Math.max((1 + iRight) - BITS_PER_BYTE, left);
_byte = 0;
}
}
return bytes;
}
}
Performance
I was bored at work so I did some performance testing comparing against the accepted answer here for when N is large. (Pretending to ignore the fact that the BigInteger approach posted above doesn't even work properly as a general approach.)
This is running with a random bit string of size 5M and a random byte array of size 1M:
String -> byte[] -- BigInteger result: 39098ms
String -> byte[] -- BitZip result: 29ms
byte[] -> String -- Integer result: 138ms
byte[] -> String -- BitZip result: 71ms
And the code:
public static void main(String[] argv) {
int testByteLength = 1000000;
int testStringLength = 5000000;
// Independently random.
final byte[] randomBytes = new byte[testByteLength];
final String randomBitString;
{
StringBuilder sb = new StringBuilder();
Random rand = new Random();
for (int i = 0; i < testStringLength; ++i) {
int value = rand.nextInt(1 + i);
sb.append((value % 2) == 0 ? '0' : '1');
randomBytes[i % testByteLength] = (byte) value;
}
randomBitString = sb.toString();
}
byte[] resultCompress;
String resultDecompress;
{
Stopwatch s = new Stopwatch();
TimeUnit ms = TimeUnit.MILLISECONDS;
{
s.start();
{
resultCompress = compressFromBigIntegerToByteArray(randomBitString);
}
s.stop();
{
System.out.println("String -> byte[] -- BigInteger result: " + s.elapsed(ms) + "ms");
}
s.reset();
}
{
s.start();
{
resultCompress = zip(randomBitString);
}
s.stop();
{
System.out.println("String -> byte[] -- BitZip result: " + s.elapsed(ms) + "ms");
}
s.reset();
}
{
s.start();
{
resultDecompress = decompressFromIntegerParseInt(randomBytes);
}
s.stop();
{
System.out.println("byte[] -> String -- Integer result: " + s.elapsed(ms) + "ms");
}
s.reset();
}
{
s.start();
{
resultDecompress = unzip(randomBytes);
}
s.stop();
{
System.out.println("byte[] -> String -- BitZip result: " + s.elapsed(ms) + "ms");
}
s.reset();
}
}
}

Categories

Resources