Convert byte array to escaped string - java

I need some help converting a Java byte array to a 7-bit ASCII string. However, I am getting 8-bit sequences and need to replace any unreadable character with its escape sequence. Is there a simple solution for this, or do I need to build my own?
Seeing that the range of readable characters in 7-bit ASCII is contiguous, right now I am thinking of the following:
for( int i = 0; i < buffer.length; i++ ) {
int codePoint = ( (int) buffer[ i ] ) & 255;
if( 0x20 <= codePoint && codePoint <= 0x7e ) {
res = res + String.valueOf( (char) codePoint );
} else {
String c = Integer.toHexString( codePoint );
if( c.length() < 2 ) {
c = "0" + c;
}
res = res + "\\0x" + c;
}
}
However this seems like an awful lot of work for such a simple conversion. Is there a better way?
Also I might need to do the same to data that has been converted from the byte array to strings. Is there a simpler solution in this case?

public static String escape(byte[] data) {
StringBuilder cbuf = new StringBuilder();
for (byte b : data) {
if (b >= 0x20 && b <= 0x7e) {
cbuf.append((char) b);
} else {
cbuf.append(String.format("\\0x%02x", b & 0xFF));
}
}
return cbuf.toString();
}
You can use the format method to pare back the verbiage.
Note that this method is only safe because the ASCII range matches the lower range of the UTF-16 encoding used by Java Strings.
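For the follow-up case, where the data has already been converted to a String, the same loop can run over chars instead of bytes. The following is a minimal sketch rather than a standard API: the class name AsciiEscaper and the decision to write chars above 0xff in \uXXXX form are assumptions made for illustration.

```java
final class AsciiEscaper {
    // Hypothetical helper: keeps printable ASCII (0x20..0x7e) and escapes everything else.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c <= 0x7e) {
                out.append(c);
            } else if (c <= 0xff) {
                // same \0xNN form as the byte[] version above
                out.append(String.format("\\0x%02x", (int) c));
            } else {
                // assumption: chars that do not fit in one byte get a four-digit hex escape
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("tab\there")); // prints tab\0x09here
    }
}
```

For long inputs, appending hex digits by hand may be cheaper than calling String.format per character, but the format call keeps the sketch short.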

If Base64 does not fit your needs, a second standard option is java.net.URLEncoder.

Related

Too many characters in character literal error

I'm creating a stylish text app but on some places I'm getting an error ("Too many characters in character literal"). I am writing only one letter but when I paste it converts into many letters like this: "\uD83C\uDD89" and the original letter is "🆉".
Please tell me how to write this in a correct way.
for (int charOne = 0; charOne <= strBld.length() - 1; charOne++) {
char a = strBld.charAt(charOne);
char newCh = getSpecialCharEighth(a);
strBld.setCharAt(charOne, newCh);
}
private char getSpecialCharEighth(char a) {
char ch = a;
if (ch == 'Z' || ch == 'z') {
ch = '\uD83C\uDD89';
}
return ch;
}
A Java char stores a 16-bit value, i.e. can store 65536 different values. There are currently 137929 characters in Unicode (12.1).
To handle this, Java strings are stored in UTF-16, which is a 16-bit encoding. Most Unicode characters, known as code points, are stored in a single 16-bit value. Some are stored in a pair of 16-bit values, known as surrogate pairs.
This means that a Unicode character may be stored as 2 char "characters" in Java, which means that if you want your code to have full Unicode character support, you can't store a Unicode character in a single char value.
They can be stored in an int variable, where the value is then referred to as a code point in Java. It is however often easier to store them as a String.
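As a small illustration of the difference between char values and code points, here is a sketch using the character from the question (U+1F189). The class and method names are made up for the example; the calls themselves are standard java.lang APIs.

```java
final class CodePointDemo {
    // Builds a String from a single code point; works for BMP and supplementary characters.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String z = fromCodePoint(0x1F189);                     // the character from the question
        System.out.println(z.length());                        // 2 (two char values, a surrogate pair)
        System.out.println(z.codePointCount(0, z.length()));   // 1 (one Unicode character)
        System.out.println(Integer.toHexString(z.charAt(0)));  // d83c
        System.out.println(Integer.toHexString(z.charAt(1)));  // dd89
    }
}
```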
In your case, you seem to be replacing Unicode characters, so a regex replacement call might be better, e.g.
s = s.replaceAll("[Zz]", "\uD83C\uDD89");
// Or like this if source file is UTF-8
s = s.replaceAll("[Zz]", "🆉");
UPDATE
If you want to keep a method for determining the replacement value, you could do this:
s = Pattern.compile(".").matcher(s).replaceAll(mr -> Matcher.quoteReplacement(getSpecialCharEighth(mr.group())));
private static String getSpecialCharEighth(String s) {
    int cp = s.codePointAt(0);
    if (cp >= 'A' && cp <= 'Z')
        return Character.toString(cp - 'A' + 0x1f170); // "🅰" - "🆉"
    if (cp >= 'a' && cp <= 'z')
        return Character.toString(cp - 'a' + 0x1f170); // "🅰" - "🆉"
    return s;
}
Note: replaceAll(replacer) is Java 9+ and Character.toString(codePoint) is Java 11+. Matcher.quoteReplacement keeps a '$' or '\' in the input from being interpreted as a group reference.
UPDATE 2
Since the question is tagged android, Java 9 and Java 11 APIs are not available, so here is a Java 7+ solution.
StringBuffer buf = new StringBuffer(s.length() + 16);
Matcher m = Pattern.compile(".").matcher(s);
while (m.find())
    m.appendReplacement(buf, Matcher.quoteReplacement(getSpecialCharEighth(m.group())));
s = m.appendTail(buf).toString();
private static String getSpecialCharEighth(String s) {
int cp = s.codePointAt(0);
if (cp >= 'A' && cp <= 'Z')
return new String(new int[] { cp - 'A' + 0x1f170 }, 0, 1);
if (cp >= 'a' && cp <= 'z')
return new String(new int[] { cp - 'a' + 0x1f170 }, 0, 1);
return s;
}
Result with s = "Hello World!"
🅷🅴🅻🅻🅾 🆆🅾🆁🅻🅳!
You can't do that with char data type. Use String instead.
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html
char: The char data type is a single 16-bit Unicode character. It has a minimum value of '\u0000' (or 0) and a maximum value of '\uffff' (or 65,535 inclusive).

Convert a given decimal string into a binary string(even number of binary digits) in Java [duplicate]

for example, for 1, 2, 128, 256 the output can be (16 digits):
0000000000000001
0000000000000010
0000000010000000
0000000100000000
I tried
String.format("%16s", Integer.toBinaryString(1));
it puts spaces for left-padding:
'               1'
How to put 0s for padding. I couldn't find it in Formatter. Is there another way to do it?
P.S. this post describes how to format integers with left 0-padding, but it is not for the binary representation.
I think this is a suboptimal solution, but you could do
String.format("%16s", Integer.toBinaryString(1)).replace(' ', '0')
There is no binary conversion built into java.util.Formatter. I would advise you to either use String.replace to replace the space characters with zeros, as in:
String.format("%16s", Integer.toBinaryString(1)).replace(" ", "0")
Or implement your own logic to convert integers to a binary representation with left padding, along the lines given in this SO post.
Or if you really need to pass numbers to format, you can convert your binary representation to BigInteger and then format that with leading zeros, but this is very costly at runtime, as in:
String.format("%016d", new BigInteger(Integer.toBinaryString(1)))
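For the "implement your own logic" option, one possible sketch reads the bits one at a time and fills a fixed-width char array, so no padding step is needed afterwards. BinaryPad and toBinary are hypothetical names, and the method assumes a width between 1 and 32.

```java
final class BinaryPad {
    // Returns the lowest `width` bits of value as a string of '0' and '1' (assumes 1 <= width <= 32).
    static String toBinary(int value, int width) {
        char[] out = new char[width];
        for (int i = 0; i < width; i++) {
            // position i in the output corresponds to bit (width - 1 - i), counted from the least significant bit
            int bit = (value >>> (width - 1 - i)) & 1;
            out[i] = bit == 1 ? '1' : '0';
        }
        return new String(out);
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 2, 128, 256}) {
            System.out.println(toBinary(v, 16));
        }
    }
}
```

Because the loop only looks at the lowest `width` bits, negative values come out as their two's-complement bit pattern truncated to that width.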
Here a new answer for an old post.
To pad a binary value with leading zeros to a specific length, try this:
Integer.toBinaryString( (1 << len) | val ).substring( 1 )
If len = 4 and val = 1,
Integer.toBinaryString( (1 << len) | val )
returns the string "10001", then
"10001".substring( 1 )
discards the very first character. So we obtain what we want:
"0001"
If val is likely to be negative, rather try:
Integer.toBinaryString( (1 << len) | (val & ((1 << len) - 1)) ).substring( 1 )
You can use Apache Commons StringUtils. It offers methods for padding strings:
StringUtils.leftPad(Integer.toBinaryString(1), 16, '0');
I was trying all sorts of method calls that I haven't really used before to make this work, they worked with moderate success, until I thought of something that is so simple it just might work, and it did!
I'm sure it's been thought of before, not sure if it's any good for long string of binary codes but it works fine for 16Bit strings. Hope it helps!! (Note second piece of code is improved)
String binString = Integer.toBinaryString(256);
while (binString.length() < 16) { //pad with leading 0's up to 16 digits
binString = "0" + binString;
}
Thanks to Will for helping improve this answer so that it works without a loop.
This may be a little clumsy but it works; please improve and comment back if you can....
binString = Integer.toBinaryString(256);
int length = 16 - binString.length();
char[] padArray = new char[length];
Arrays.fill(padArray, '0');
String padString = new String(padArray);
binString = padString + binString;
A simpler version of user3608934's idea "This is an old trick, create a string with 16 0's then append the trimmed binary string you got ":
private String toBinaryString32(int i) {
String binaryWithOutLeading0 = Integer.toBinaryString(i);
return "00000000000000000000000000000000"
.substring(binaryWithOutLeading0.length())
+ binaryWithOutLeading0;
}
I do not know the "right" solution, but I can suggest a quick patch.
String.format("%16s", Integer.toBinaryString(1)).replace(" ", "0");
I have just tried it and saw that it works fine.
Starting with Java 11, you can use the repeat(...) method (this 16-digit form assumes 0 <= i <= 65535):
"0".repeat(Integer.numberOfLeadingZeros(i != 0 ? i : 1) - 16) + Integer.toBinaryString(i)
Or, if you need 32-bit representation of any integer:
"0".repeat(Integer.numberOfLeadingZeros(i != 0 ? i : 1)) + Integer.toBinaryString(i)
try...
String.format("%016d\n", Integer.parseInt(Integer.toBinaryString(256)));
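I don't think this is the "correct" way of doing this... but it works :) (only for values up to 1023, though, because the binary digits of larger numbers no longer fit in an int for Integer.parseInt)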
I would write my own util class with the method like below
public class NumberFormatUtils {
public static String longToBinString(long val) {
char[] buffer = new char[64];
Arrays.fill(buffer, '0');
for (int i = 0; i < 64; ++i) {
long mask = 1L << i;
if ((val & mask) == mask) {
buffer[63 - i] = '1';
}
}
return new String(buffer);
}
public static void main(String... args) {
long value = 0b0000000000000000000000000000000000000000000000000000000000000101L;
System.out.println(value);
System.out.println(Long.toBinaryString(value));
System.out.println(NumberFormatUtils.longToBinString(value));
}
}
Output:
5
101
0000000000000000000000000000000000000000000000000000000000000101
The same approach could be applied to any integral types. Pay attention to the type of mask
long mask = 1L << i;
A naive solution that works would be
String temp = Integer.toBinaryString(5);
while (temp.length() < Integer.SIZE) temp = "0"+temp; //pad leading zeros
temp = temp.substring(Integer.SIZE - Short.SIZE); //remove excess
One other method would be
String temp = Integer.toBinaryString((5 | 0x80000000));
temp = temp.substring(Integer.SIZE - Short.SIZE);
This will produce a 16 bit string of the integer 5
// Below will handle proper sizes
public static String binaryString(int i) {
return String.format("%" + Integer.SIZE + "s", Integer.toBinaryString(i)).replace(' ', '0');
}
public static String binaryString(long i) {
return String.format("%" + Long.SIZE + "s", Long.toBinaryString(i)).replace(' ', '0');
}
This is an old trick, create a string with 16 0's then append the trimmed binary string you got from String.format("%s", Integer.toBinaryString(1)) and use the right-most 16 characters, lopping off any leading 0's. Better yet, make a function that lets you specify how long of a binary string you want. Of course there are probably a bazillion other ways to accomplish this including libraries, but I'm adding this post to help out a friend :)
public class BinaryPrinter {
public static void main(String[] args) {
System.out.format("%d in binary is %s\n", 1, binaryString(1, 4));
System.out.format("%d in binary is %s\n", 128, binaryString(128, 8));
System.out.format("%d in binary is %s\n", 256, binaryString(256, 16));
}
public static String binaryString( final int number, final int binaryDigits ) {
final String pattern = String.format( "%%0%dd", binaryDigits );
final String padding = String.format( pattern, 0 );
final String response = String.format( "%s%s", padding, Integer.toBinaryString(number) );
System.out.format( "\npattern = '%s'\npadding = '%s'\nresponse = '%s'\n\n", pattern, padding, response );
return response.substring( response.length() - binaryDigits );
}
}
This method converts an int to a String, length=bits. Either padded with 0s or with the most significant bits truncated.
static String toBitString( int x, int bits ){
String bitString = Integer.toBinaryString(x);
int size = bitString.length();
StringBuilder sb = new StringBuilder( bits );
if( bits > size ){
for( int i=0; i<bits-size; i++ )
sb.append('0');
sb.append( bitString );
}else
sb = sb.append( bitString.substring(size-bits, size) );
return sb.toString();
}
You can use the library https://github.com/kssource/BitSequence. It accepts a number and returns a binary string, padded and/or grouped.
String s = new BitSequence(2, 16).toBynaryString(ALIGN.RIGHT, GROUP.CONTINOUSLY);
returns
0000000000000010
Other examples:
[10, -20, 30]->00001010 11101100 00011110
i=-10->00000000000000000000000000001010
bi=10->1010
sh=10->00 0000 0000 1010
l=10->00000001 010
by=-10->1010
i=-10->bc->11111111 11111111 11111111 11110110
for(int i=0;i<n;i++)
{
for(int j=str[i].length();j<4;j++)
str[i]="0".concat(str[i]);
}
str[i].length() is the length of the binary number; for example, 2 in binary is 10, which has length 2.
Change 4 to the desired maximum length of the number. This can be optimized to O(n) by using continue.
import java.util.Scanner;
public class Q3{
    public static void main(String[] args) {
        Scanner scn=new Scanner(System.in);
        System.out.println("Enter a number:");
        int num=scn.nextInt();
        if(num>=1 && num<=255){
            //convert only after the range check: the digits of larger values would overflow Integer.parseInt
            int numB=Integer.parseInt(Integer.toBinaryString(num));
            String strB=String.format("%08d",numB);//makes an 8 character code
            System.out.println(strB);
        }else{
            System.out.println("Number should be in range between 1 and 255");
        }
    }
}

Convert string to array of uint8_t

For my project I need to read some single line text from a SD card, then get the hex or dec value of each character within the string and group those values in an array.
There are no whitespaces in the text and the lines end with \n
I'm using this code to read all the content into a single string!
String line = "";
while (dataFile.available() != 0)
{
line = dataFile.readStringUntil('\n');
if (line == "")
break;
}
For later use I need to calculate the hex values of each character, this code should iterate over the String and group it in an array.
int lineSize = line.length();
uint8_t data[lineSize];
for (int i = 0; i < lineSize; i++)
{
data[i] = line.charAt(i);
}
I really don't know whether this works or not, but I doubt that I will get the actual hex values...
The values are somewhere but I really don't know how to access them!
The result should look like this:
uint8_t data[] = {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07}
Think of hexadecimal as just another format to display any data type (uint8_t, or char or int...) stored in memory. In memory, it's all binary, or hexadecimal; it just depends on how you want to look at it.
For example: the following statements:
long int A = 34;
uint8_t B = 34;
char C = 34;
int D = 34;
printf("0x%02x\n", 'A'); // surrounded with '' gives ASCII value of 65, then displayed in Hex
printf("0x%02lx\n", A); // a long needs the l length modifier
printf("0x%02x\n", B);
printf("0x%02x\n", C);
printf("0x%02x\n", D);
Results in:
0x41
0x22
0x22
0x22
0x22
Breaking any string into its fundamental elements, (char, or uint8_t) and printing them as shown above will yield similar results for you.
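The same idea can be sketched in Java (used here only for consistency with the other examples on this page; the question itself is about Arduino C++). The byte values are taken straight from the string, and hexadecimal only enters when they are formatted for display. HexView and toHex are illustrative names, and the text is assumed to be plain ASCII.

```java
import java.nio.charset.StandardCharsets;

final class HexView {
    // Formats each byte of an ASCII string as 0xNN; the stored values themselves are unchanged.
    static String toHex(String text) {
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(String.format("0x%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("AB")); // prints 0x41, 0x42
    }
}
```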
Edit:
For this input file (call it in.txt, in the executable directory):
lkjahldfkjghlskjhlskjhlakdjgglsjkahlkj4hl5k6jh67=83kjhlkjshdf8f7s698s7dfgbslfkjbg
And using this code:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

int main(void)
{
    FILE *fp;
    char filename[]=".\\in.txt";
    int c; // int rather than uint8_t, so EOF can be told apart from a 0xFF byte
    int length=0, i=0;
    uint8_t *array;
    //Get number of entries in file:
    fp=fopen(filename, "r");
    if(fp==NULL) return 1;
    while((c=fgetc(fp))!=EOF)
    {
        length++;
    }
    fclose(fp);
    //give array sufficient space
    array = malloc(sizeof(uint8_t)*length);
    if(array==NULL) return 1;
    fp=fopen(filename, "r");
    if(fp==NULL) { free(array); return 1; }
    //read file into array, and display as hexadecimal
    while((c=fgetc(fp))!=EOF)
    {
        array[i++]= (uint8_t)c;
        printf("0x%02x\n", c);
    }
    fclose(fp);
    free(array);
    getchar();//stop execution to view the output (press Enter to exit)
    return 0;
}
You should see this output: (only first 20 or so values shown...)
Strings in C/C++ are already arrays (even when abstracted by a higher level class like std::string). The array elements are the character values of each character. Are you sure you can't just grab the string's .c_str() (or .data()) and use that, possibly with a cast?
Well, if by getting the hex code you mean getting the hex representation of those characters, you can do this -
const char hexDigits[] = "0123456789abcdef";
char hex[3] = ""; // two digits plus the terminating '\0'
hex[0] = hexDigits[ (uint8_t)line.charAt(i) / 16 ];
hex[1] = hexDigits[ (uint8_t)line.charAt(i) % 16 ];
For example, if line.charAt(i) is 'A', then hex will be "41".

Bitwise not operation on a string of bits

I have a string with the value 0111000000. How can I perform a bitwise not operation on this string?
If I convert it to an integer, use the ~ operator and convert it back to a binary string, the resulting string has extra bits. I want the output to be exactly 1000111111.
The following code works fine, but it's not a good method. Is there another better way of doing this?
String bstr="";
while(m!=str.length())
{
char a=str.charAt(m);
if(a=='1')
{
a='0';
bstr=bstr+a;
m++;
}
else
{
a='1';
bstr=bstr+a;
m++;
}
}
try this
char[] a = s.toCharArray();
for(int i = 0; i < a.length; i++) {
a[i] = a[i]=='0' ? '1' : '0';
}
s = new String(a);
this also works fine
int i = ~Integer.parseInt(s, 2);
String tmp = Integer.toBinaryString(i);
s = tmp.substring(tmp.length()- s.length());
Keep track of how many bits there are in your bit-string. After converting to an integer and using a ~value operation to flip the bits, use a bit-mask to remove the unwanted 1 high-end bits.
Say for example your bit-string has a fixed 10 bits. Then you can mask off the unwanted high-end bits with: value & 0x3ff.
If the number of bits in the bit-string is variable:
value & ((1 << nBits) - 1)
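Put together, the parse, invert, mask steps could look like the sketch below. BitStringNot is a hypothetical name, the input is assumed to contain 1 to 31 characters that are each '0' or '1', and the final loop restores any leading zeros that Integer.toBinaryString drops.

```java
final class BitStringNot {
    // Assumes 1 to 31 characters, each '0' or '1'.
    static String not(String bits) {
        int nBits = bits.length();
        int value = Integer.parseInt(bits, 2);
        // flip every bit, then keep only the lowest nBits of the result
        int flipped = ~value & ((1 << nBits) - 1);
        String raw = Integer.toBinaryString(flipped);
        StringBuilder sb = new StringBuilder(nBits);
        for (int i = raw.length(); i < nBits; i++) {
            sb.append('0'); // left-pad back to the original width
        }
        return sb.append(raw).toString();
    }

    public static void main(String[] args) {
        System.out.println(not("0111000000")); // prints 1000111111
    }
}
```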
StringUtils.replaceChars from Apache Commons Lang might help here:
StringUtils.replaceChars("0111000000", "01", "10");
You should use a StringBuilder so that you can change individual bits without creating lots of garbage strings. You can also flip single bits using XOR:
b ^= 1;
Which works on both binary and ASCII values of digits.
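A minimal sketch of that StringBuilder plus XOR approach, assuming the input contains only the characters '0' and '1' (any other character would be altered in its lowest bit as well). BitFlipper is an illustrative name.

```java
final class BitFlipper {
    // Flips each digit in place; no length limit, since nothing is parsed as a number.
    static String flip(String bits) {
        StringBuilder sb = new StringBuilder(bits);
        for (int i = 0; i < sb.length(); i++) {
            // '0' is 0x30 and '1' is 0x31, so XOR with 1 swaps one for the other
            sb.setCharAt(i, (char) (sb.charAt(i) ^ 1));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(flip("0111000000")); // prints 1000111111
    }
}
```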
maybe this will work:
String result = Integer.toBinaryString(~(Integer.parseInt("0111000000",2)));
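This converts the binary string to an int, uses the bitwise not operator to invert it, then converts it back to a binary string. Note that the result is 32 characters long, so the extra leading 1s still have to be trimmed or masked off to get exactly 1000111111.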
You can do this using XOR operation
public String xorOperation(String value) {
String str1 = value;
long l = Long.parseLong(str1, 2);
String str2 = "";
for (int i = 0; i < str1.length(); i++) {
str2 = str2 + "1";
}
long n = Long.parseLong(str2, 2);
long num = l ^ n;
String bininaryString = Long.toBinaryString(num);
System.out.println(bininaryString);
return bininaryString;
}

Bit manipulation and output in Java

If you have binary strings (literally String objects that contain only 1's and 0's), how would you output them as bits into a file?
This is for a text compressor I was working on; it's still bugging me, and it'd be nice to finally get it working. Thanks!
Easiest is to simply take 8 consecutive characters, turn them into a byte and output that byte. Pad with zeros at the end if you can recognize the end-of-stream, or add a header with length (in bits) at the beginning of the file.
The inner loop would look something like:
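byte[] buffer = new byte[ ( string.length() + 7 ) / 8 ];
for ( int i = 0; i < buffer.length; ++i ) {
    byte current = 0;
    for ( int j = 7; j >= 0; --j ) {
        int index = i * 8 + j;
        // positions past the end of the string count as zero padding
        if ( index < string.length() && string.charAt( index ) == '1' )
            current |= 1 << j;
    }
    output( current ); // output() stands for whatever writes the byte to your file
}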
You'll need to make some adjustments, but that's the general idea.
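The "header with length" variant could be sketched as follows. This is one possible layout, not the only one: it assumes a 4-byte big-endian bit count at the start and packs the first character of the string into the most significant bit of each byte (the opposite bit order from the loop above). BitPacker and pack are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

final class BitPacker {
    // Packs a string of '0'/'1' characters into bytes, preceded by a 4-byte bit count.
    static byte[] pack(String bits) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int n = bits.length();
        // header: number of valid bits, big-endian
        out.write(n >>> 24);
        out.write(n >>> 16);
        out.write(n >>> 8);
        out.write(n);
        int current = 0;
        int filled = 0;
        for (int i = 0; i < n; i++) {
            current = (current << 1) | (bits.charAt(i) == '1' ? 1 : 0);
            if (++filled == 8) {
                out.write(current);
                current = 0;
                filled = 0;
            }
        }
        if (filled > 0) {
            out.write(current << (8 - filled)); // zero-pad the last partial byte
        }
        return out.toByteArray();
    }
}
```

The resulting array can be handed to a FileOutputStream in one write call; a reader uses the header to know how many bits of the last byte are real data.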
If you're lucky, java.math.BigInteger may do everything for you.
String s = "11001010001010101110101001001110";
byte[] bytes = (new java.math.BigInteger(s, 2)).toByteArray();
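This does depend on the byte order (big-endian) and right-aligning (if the number of bits is not a multiple of 8) being what you want, but it may be simpler to modify the array afterwards than to do the character conversion yourself. Note also that toByteArray() returns a two's-complement representation: leading zero bits in the string are dropped, and an extra leading zero byte is added when the top bit is set (the example above yields 5 bytes, not 4).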
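One way to "modify the array afterwards" is to copy the BigInteger bytes, right-aligned, into an array of exactly the expected size. This is a sketch under the assumption that the string is non-empty and contains only '0' and '1'; BigIntegerBits and toBytes are made-up names.

```java
import java.math.BigInteger;

final class BigIntegerBits {
    // Returns exactly ceil(bits.length() / 8) bytes, big-endian and right-aligned.
    static byte[] toBytes(String bits) {
        int byteCount = (bits.length() + 7) / 8;
        byte[] raw = new BigInteger(bits, 2).toByteArray();
        byte[] out = new byte[byteCount];
        // raw may be one byte longer (sign byte) or shorter (leading zeros dropped) than needed
        int copy = Math.min(raw.length, byteCount);
        System.arraycopy(raw, raw.length - copy, out, byteCount - copy, copy);
        return out;
    }
}
```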
public class BitOutputStream extends FilterOutputStream
{
private int buffer = 0;
private int bitCount = 0;
public BitOutputStream(OutputStream out)
{
super(out);
}
public void writeBits(int value, int numBits) throws IOException
{
while(numBits>0)
{
numBits--;
int mix = ((value&1)<<bitCount++);
buffer|=mix;
value>>=1;
if(bitCount==8)
align8();
}
}
@Override
public void close() throws IOException
{
align8(); /* Flush any remaining partial bytes */
super.close();
}
public void align8() throws IOException
{
if(bitCount > 0)
{
bitCount=0;
write(buffer);
buffer=0;
}
}
}
And then...
if (nextChar == '0')
{
bos.writeBits(0, 1);
}
else
{
bos.writeBits(1, 1);
}
Assuming the String has a multiple of eight bits (you can pad it otherwise), take advantage of Java's built-in parsing in the Integer.parseInt method to do something like this:
String s = "11001010001010101110101001001110";
byte[] data = new byte[s.length() / 8];
for (int i = 0; i < data.length; i++) {
data[i] = (byte) Integer.parseInt(s.substring(i * 8, (i + 1) * 8), 2);
}
Then you should be able to write the bytes to a FileOutputStream pretty simply.
On the other hand, if you are looking for efficiency, you should consider not using a String to store the bits to begin with, but build up the bytes directly in your compressor.
