I'm trying to read a binary shortcode file, which can be found here.
The method I'm using to print the contents of this file:
public void read3RegularGraphs( String pathFile ) throws IOException {
byte [] fileBytes = Files.readAllBytes(new File(pathFile).toPath());
char singleChar;
for(byte b : fileBytes) {
singleChar = (char) b;
System.out.print(singleChar);
}
}
Unfortunately, the output is wrong: I'm getting symbols in place of characters.
How can I convert the binary content to character format?
Thank you
You need to specify a Charset to decode the bytes. A char and a byte are two different things:
List<String> stringList = Files.readAllLines(new File(pathFile).toPath(), Charset.forName("UTF-8"));
When you convert between chars, Strings and byte arrays, declare the Charset explicitly:
byte[] byteArray= stringTest.getBytes(Charset.forName("UTF-8"));
String stringTest = new String(byteArray, Charset.forName("UTF-8"));
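To illustrate why the cast in the question's loop goes wrong, here is a minimal sketch. The class and method names (`DecodeDemo`, `castEachByte`, `decodeUtf8`) are made up for this example, and it assumes the data really is UTF-8 text, which may not hold for an arbitrary binary file.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Mimics the loop from the question: every byte is cast to a char on its own.
    static String castEachByte(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append((char) b); // negative bytes turn into unrelated chars
        }
        return sb.toString();
    }

    // Decodes the whole array at once with an explicit charset.
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "h\u00e9llo" contains one non-ASCII letter, which UTF-8 encodes as two bytes.
        byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        System.out.println(castEachByte(data).length()); // 6 chars, one per byte
        System.out.println(decodeUtf8(data).length());   // 5 chars, the original text
    }
}
```

If the file is not text at all, no charset will make it readable; decoding only helps when the bytes were produced by encoding text in the first place.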
I receive JSON data that contains binary data like the following, and I would like to convert that data to a byte[] in Java, but I don't know how.
"payload": "7V1bcxs3ln6frfdcfvfbghfdX8HSw9Zu1QzzartyhblfdcvberCObjvJpkiJUpmhRI1pKXYeXHRsZLSrCy
5dElN5tfvQaO72TdSoiOS3TH8Yxdffgtg754679513qdfrgvlslsqdeqaepdccngrdzedrtghBD+d++e7v//p80/v96v7h+u72
+z1gfK/39x/+9t391cPTzeP88aE/++Fvvd53n+8+Xd1c/fBm/unqAf+7
N7v65en++vGP3vx2fvPHw/XDdwfpHf5mevhq/vQDcnAAwD+gEPwDF+bDxTv+3UF61d/4eesrfP356uFx"
Based on the observation that the "binary" string consists of ASCII letters, digits and "+" and "/", I am fairly confident that it is actually Base64 encoded data.
To decode Base64 to a byte[] you can do something like this:
String s = "7V1bcxs3ln6...";
byte [] bytes = java.util.Base64.getDecoder().decode(s);
The decode call will throw IllegalArgumentException if the input string is not properly Base64 encoded.
When I decoded that particular string using an online Base64 decoder, the result is unintelligible. But that is what I would expect for an arbitrary "blob" of binary data.
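As a sketch of how the decode call and its exception might be wrapped, the following uses only `java.util.Base64`. The class and method names (`PayloadDecoder`, `tryDecode`, `decodeWrapped`) are hypothetical. The second method assumes the payload may contain line breaks, as the one quoted in the question appears to; `Base64.getMimeDecoder()` ignores characters outside the Base64 alphabet, while the basic decoder rejects them.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

public class PayloadDecoder {
    // Returns the decoded bytes, or an empty Optional if the text is not valid Base64.
    static Optional<byte[]> tryDecode(String text) {
        try {
            return Optional.of(Base64.getDecoder().decode(text));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // The MIME decoder skips line separators and other characters outside the Base64 alphabet.
    static byte[] decodeWrapped(String text) {
        return Base64.getMimeDecoder().decode(text);
    }

    public static void main(String[] args) {
        String wrapped = "aGVs\nbG8="; // "hello", split across two lines
        System.out.println(tryDecode(wrapped).isPresent()); // false: '\n' is rejected
        System.out.println(new String(decodeWrapped(wrapped), StandardCharsets.UTF_8)); // hello
    }
}
```

The lenient decoder is convenient for wrapped input, but it will also silently skip genuinely corrupt characters, so the strict decoder is the safer default when the payload should be a single unbroken string.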
In general, if you have a String in some object that holds the JSON payload, you can get its bytes like this:
String s = "7V1bcxs3ln6...";
byte [] bytes = s.getBytes();
Beyond that, if the payload has to be decoded somehow, additional code will be required.
In my case I had to convert a payload that I knew was text, something like:
{"payload":"eyJ1c2VyX2lkIjo0LCJ1c2VybmFtZSI6IngiLCJjaXR5IjoiaGVyZSJ9"}
This is the difference between java.util.Base64.getDecoder() and getBytes():
String s = "eyJ1c2VyX2lkIjo0LCJ1c2VybmFtZSI6IngiLCJjaXR5IjoiaGVyZSJ9";
byte [] bytes = s.getBytes();
byte [] bytes_base64 = java.util.Base64.getDecoder().decode(s);
String bytesToStr = new String(bytes, StandardCharsets.UTF_8);
String bytesBase64Tostr = new String(bytes_base64, StandardCharsets.UTF_8);
System.out.println("bytesToStr="+bytesToStr);
System.out.println("bytesBase64Tostr="+bytesBase64Tostr);
Output:
bytesToStr=eyJ1c2VyX2lkIjo0LCJ1c2VybmFtZSI6IngiLCJjaXR5IjoiaGVyZSJ9
bytesBase64Tostr={"user_id":4,"username":"x","city":"here"}
java.util.Base64.getDecoder() worked in my case.
I have a byte array file that I am trying to convert into a human-readable form. I tried the following approaches:
public static void main(String args[]) throws IOException
{
//System.out.println("Platform Encoding : " + System.getProperty("file.encoding"));
FileInputStream fis = new FileInputStream("<Path>");
// Using Apache Commons IOUtils to read file into byte array
byte[] filedata = IOUtils.toByteArray(fis);
String str = new String(filedata, "UTF-8");
System.out.println(str);
}
Another approach :
public static void main(String[] args) {
File file = new File("<Path>");
readContentIntoByteArray(file);
}
private static byte[] readContentIntoByteArray(File file) {
FileInputStream fileInputStream = null;
byte[] bFile = new byte[(int) file.length()];
try {
fileInputStream = new FileInputStream(file);
fileInputStream.read(bFile);
fileInputStream.close();
for (int i = 0; i < bFile.length; i++) {
System.out.print((char) bFile[i]);
}
} catch (Exception e) {
e.printStackTrace();
}
return bFile;
}
Both snippets compile, but neither produces human-readable output. Excuse me if this is a repeated or basic question.
Could someone please point out where I am going wrong?
Your code (from the first snippet) for decoding a byte file into UTF-8 text looks correct to me (assuming FileInputStream fis = new FileInputStream("Path") is yielding the correct fileInputStream).
If you're expecting a text file but are not sure which encoding it is in (perhaps it's not UTF-8), you can use a library like the one below to find out.
https://code.google.com/archive/p/juniversalchardet/
Or just try some of the different Charsets available in the Charset class and see what each one produces in your String initialization line:
new String(byteArray, Charset.defaultCharset()) // try other Charsets here.
The second method you show has pitfalls in its byte-to-char conversion, depending on the characters involved, as discussed here (Byte and char conversion in Java).
Chances are, if you cannot find a valid encoding for this file, it is not human readable to begin with, before byte conversion, or the byte array file being passed to you lost something that makes it decodeable along the way.
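One way to try candidate charsets without eyeballing the output is to decode strictly and see whether the decoder reports an error. This is only a sketch: `CharsetProbe` and `decodesCleanly` are names invented here, and a clean decode does not prove the guess is right, since single-byte charsets such as ISO-8859-1 accept every byte sequence.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CharsetProbe {
    // Returns true if the bytes decode in the given charset without any
    // malformed or unmappable sequences.
    static boolean decodesCleanly(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "caf\u00e9" encoded as ISO-8859-1: the final byte 0xE9 is not valid UTF-8 on its own.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));      // false
        System.out.println(decodesCleanly(latin1, StandardCharsets.ISO_8859_1)); // true
    }
}
```

A strict check like this is most useful for ruling out multi-byte encodings such as UTF-8; it cannot choose between several single-byte candidates.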
I want to convert a String to a byte[] with the same content. Example:
I have:
String str = "abc";
byte[] bytes;
//I want to convert "str" to "bytes" that they have same content:
(code here)
//after, print bytes -> "abc".
With a little effort, you'd reach this.
Use the getBytes method with an explicit encoding:
byte[] convertToBytes= stuff.getBytes("UTF-8");
String newString = new String(convertToBytes, "UTF-8");
source
Converting a set of strings to a byte[] array
Also study up on the String API page
String str = "abc";
byte bytes[] = str.getBytes(); // Get the byte array
for (byte b : bytes) {
System.out.println("Byte is "+b); //Iterate and print
}
str = new String(bytes); // Create String from byte array
System.out.println("String is "+str);
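A compact round-trip sketch that names the charset on both sides, so the result does not depend on the platform default. `RoundTrip`, `toBytes` and `fromBytes` are illustrative names, and UTF-8 is an assumed choice rather than something the question specifies.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RoundTrip {
    // Encodes text to bytes using UTF-8.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back to text.
    static String fromBytes(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes("abc");
        System.out.println(Arrays.toString(bytes)); // [97, 98, 99]
        System.out.println(fromBytes(bytes));       // abc
    }
}
```

Using `StandardCharsets.UTF_8` instead of the string "UTF-8" also avoids the checked UnsupportedEncodingException.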
I have a flat file pulled from a Db2 table. The flat file contains records in both character format and packed decimal format. How can I convert the packed data to a Java String? Is there any way to convert the entire flat file to ASCII format?
EBCDIC is a family of encodings. You'll need to know in more detail which EBCDIC encoding you're dealing with.
Java has a number of supported encodings, including:
IBM500/Cp500 - EBCDIC 500V1
x-IBM834/Cp834 - IBM EBCDIC DBCS-only Korean (double-byte)
IBM1047/Cp1047 - Latin-1 character set for EBCDIC hosts
Try those and see what you get. Something like:
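// readLine() lives on BufferedReader, so wrap the InputStreamReader in one
BufferedReader rdr = new BufferedReader(new InputStreamReader(new FileInputStream(<your file>), java.nio.charset.Charset.forName("ibm500")));
String line;
while ((line = rdr.readLine()) != null) {
System.out.println(line);
}
rdr.close();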
Read the file as a String, write it as EBCDIC. Use OutputStreamWriter and InputStreamReader, and give the encoding in the constructor.
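The reader/writer pairing can be sketched as below, shown in the EBCDIC-to-ASCII direction the question asks about. `Transcoder` and `transcode` are hypothetical names, IBM1047 is just one of the candidate encodings listed above, and the sketch assumes the runtime ships the extended charsets. It is only appropriate for character data; packed decimal fields are binary and would be corrupted by a charset conversion.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Transcoder {
    // Copies text from 'in' (encoded as 'from') to 'out' (encoded as 'to').
    // Both streams are closed when the copy finishes.
    static void transcode(InputStream in, Charset from, OutputStream out, Charset to) throws IOException {
        try (Reader reader = new InputStreamReader(in, from);
             Writer writer = new OutputStreamWriter(out, to)) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] ebcdic = {(byte) 0x81, (byte) 0x82, (byte) 0x83}; // "abc" in IBM1047
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(ebcdic), Charset.forName("IBM1047"), out, StandardCharsets.US_ASCII);
        System.out.println(out.toString("US-ASCII")); // abc
    }
}
```

For a real file, the two byte-array streams would be replaced by a FileInputStream and a FileOutputStream.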
Following from PAP, CP037 is US EBCDIC encoding.
Also have a look at JRecord Project. It allows you to read a file with either a Cobol or Xml description and will handle EBCDIC and Comp-3.
Finally, here is a routine to convert packed decimal bytes to a String: see the method getMainframePackedDecimal in Conversion.
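For readers who want to see the idea without pulling in a library, here is a rough sketch of unpacking a packed decimal (COMP-3) field. It is not the JRecord implementation; `PackedDecimal` and `unpack` are invented names, and the sketch assumes the usual layout of two decimal digits per byte with the sign in the last nibble (0xD or 0xB for negative, other sign values for positive or unsigned). The scale has to come from the record layout, since it is not stored in the data.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class PackedDecimal {
    // Unpacks a COMP-3 field: each byte holds two BCD digits, except the last,
    // whose low nibble is the sign. 'scale' is the number of implied decimal places.
    static BigDecimal unpack(byte[] field, int scale) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < field.length; i++) {
            appendDigit(digits, (field[i] & 0xF0) >>> 4);
            if (i < field.length - 1) {
                appendDigit(digits, field[i] & 0x0F);
            }
        }
        int sign = field[field.length - 1] & 0x0F;
        if (sign < 0x0A) {
            throw new IllegalArgumentException("last nibble is not a sign: " + sign);
        }
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0D || sign == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    private static void appendDigit(StringBuilder sb, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("not a BCD digit: " + nibble);
        }
        sb.append(nibble);
    }

    public static void main(String[] args) {
        System.out.println(unpack(new byte[] {0x12, 0x34, 0x5C}, 0)); // 12345
        System.out.println(unpack(new byte[] {0x12, 0x3D}, 1));       // -12.3
    }
}
```

To get a String, call toString() or toPlainString() on the result. The field boundaries within each record still have to come from the record layout (for example a Cobol copybook).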
Here is some sample code for your reference:
package mypackage;
import java.io.UnsupportedEncodingException;
import java.math.BigInteger;
public class EtoA {
public static void main(String[] args) throws UnsupportedEncodingException {
System.out.println("########");
String edata = "/ÂÄÀ"; //Some EBCDIC string ==> here the OP can provide the content of flat file which the OP pulled from DB2 table
System.out.println("ebcdic source to ascii:");
System.out.println("ebcdic: " + edata);
String ebcdic_encoding = "IBM-1047"; //Setting the encoding in which the source was encoded
byte[] result = edata.getBytes(ebcdic_encoding); //Getting the raw bytes of the EBCDIC string by mentioning its encoding
String output = asHex(result); //Converting the raw bytes into hexadecimal format
byte[] b = new BigInteger(output, 16).toByteArray(); //Now its easy to convert it into another byte array (mentioning that this is of base16 since it is hexadecimal)
String ascii = new String(b, "ISO-8859-1"); //Now convert the modified byte array to normal ASCII string using its encoding "ISO-8859-1"
System.out.println("ascii: " + ascii); //This is the ASCII string which we can use universally in JAVA or wherever
//Inter conversions of similar type (ASCII to EBCDIC) are given below:
System.out.println("########");
String adata = "abcd";
System.out.println("ascii source to ebcdic:");
System.out.println("ascii: " + adata);
String ascii_encoding = "ISO-8859-1";
byte[] res = adata.getBytes(ascii_encoding);
String out = asHex(res);
byte[] bytebuff = new BigInteger(out, 16).toByteArray();
String ebcdic = new String(bytebuff, "IBM-1047");
System.out.println("ebcdic: " + ebcdic);
//Converting from hexadecimal string to EBCDIC if needed
System.out.println("########");
System.out.println("hexcode to ebcdic");
String hexinput = "81828384"; //Hexadecimal which we are converting to EBCDIC
System.out.println("hexinput: " + hexinput);
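byte[] raw = new BigInteger(hexinput, 16).toByteArray();
//toByteArray() prepends a 0x00 sign byte when the top bit is set (as with 0x81 here), so strip it before decoding
byte[] buffer = (raw.length > 1 && raw[0] == 0) ? java.util.Arrays.copyOfRange(raw, 1, raw.length) : raw;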
String eout = new String(buffer, "IBM-1047");
System.out.println("ebcdic out:" + eout);
//Converting from hexadecimal string to ASCII if needed
System.out.println("########");
System.out.println("hexcode to ascii");
String hexin = "61626364";
System.out.println("hexin: " + hexin);
byte[] buff = new BigInteger(hexin, 16).toByteArray();
String asciiout = new String(buff, "ISO-8859-1");
System.out.println("ascii out:" + asciiout);
}
//This asHex method converts the given byte array to a String of Hexadecimal equivalent
public static String asHex(byte[] buf) {
char[] HEX_CHARS = "0123456789abcdef".toCharArray();
char[] chars = new char[2 * buf.length];
for (int i = 0; i < buf.length; ++i) {
chars[2 * i] = HEX_CHARS[(buf[i] & 0xF0) >>> 4];
chars[2 * i + 1] = HEX_CHARS[buf[i] & 0x0F];
}
return new String(chars);
}
}
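The hex and BigInteger steps in the listing above are not needed for the conversion itself; decoding with one charset and encoding with another is enough. A shorter sketch of the same idea follows. `DirectConvert` and its methods are invented names, and it assumes the runtime provides the IBM1047 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DirectConvert {
    private static final Charset EBCDIC = Charset.forName("IBM1047");

    // Decodes EBCDIC bytes to a String, then encodes that String as ISO-8859-1.
    static byte[] ebcdicToLatin1(byte[] ebcdicBytes) {
        String text = new String(ebcdicBytes, EBCDIC);
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // The reverse direction: ISO-8859-1 bytes to EBCDIC bytes.
    static byte[] latin1ToEbcdic(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(EBCDIC);
    }

    public static void main(String[] args) {
        byte[] ebcdic = {(byte) 0x81, (byte) 0x82, (byte) 0x83, (byte) 0x84};
        System.out.println(new String(ebcdicToLatin1(ebcdic), StandardCharsets.ISO_8859_1)); // abcd
    }
}
```

Working on the raw bytes read from the file, rather than on a String literal holding EBCDIC characters, avoids depending on how the source file or console is encoded.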
I'm trying to decode a char and get back the same char.
Following is my simple test.
I'm not sure whether I have to encode or decode. I tried both, and both print the same result.
Any suggestions are greatly helpful.
char inpData = '†';
String str = Character.toString((char) inpData);
byte b[] = str.getBytes(Charset.forName("MacRoman"));
System.out.println(b[0]); // prints -96
String decData = Integer.toString(b[0]);
CharsetDecoder decoder = Charset.forName("MacRoman").newDecoder();
ByteBuffer inBuffer = ByteBuffer.wrap(decData.getBytes());
CharBuffer result = decoder.decode(inBuffer);
System.out.println(result.toString()); // prints -96, expecting to print †
CharsetEncoder encoder = Charset.forName("MacRoman").newEncoder();
ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(decData));
result = decoder.decode(bbuf);
System.out.println(result.toString());// prints -96, expecting to print †
Thank you.
When you do String decData = Integer.toString(b[0]);, you create the string "-96" and that is the string you're encoding/decoding. Not the original char.
You have to convert your String back to a byte first.
To get your character back as a char from the byte -96, do this:
String string = new String(b, "MacRoman");
char specialChar = string.charAt(0);
With this you're reversing your first transformation (char -> String -> byte[0]) by doing byte[0] -> String -> char[0].
If you only have the String "-96", you must first convert it into a byte with:
byte b = Byte.parseByte("-96");
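Putting both steps together, a minimal round-trip sketch might look like this. `MacRomanRoundTrip`, `encode`, `decode` and `decodeFromText` are names made up for the example; the charset name "MacRoman" is the one used in the question.

```java
import java.nio.charset.Charset;

public class MacRomanRoundTrip {
    private static final Charset MAC_ROMAN = Charset.forName("MacRoman");

    // char -> single MacRoman byte
    static byte encode(char c) {
        return String.valueOf(c).getBytes(MAC_ROMAN)[0];
    }

    // single MacRoman byte -> char
    static char decode(byte b) {
        return new String(new byte[] {b}, MAC_ROMAN).charAt(0);
    }

    // Decimal text such as "-96" -> byte -> char
    static char decodeFromText(String signedByteText) {
        return decode(Byte.parseByte(signedByteText));
    }

    public static void main(String[] args) {
        char dagger = '\u2020';                                  // the dagger character from the question
        byte b = encode(dagger);
        System.out.println(b);                                   // -96
        System.out.println(decode(b) == dagger);                 // true
        System.out.println(decodeFromText("-96") == dagger);     // true
    }
}
```

The key point is that the decoder must be given the byte value itself (0xA0), not the characters '-', '9' and '6' that spell out its decimal form.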
String decData = Integer.toString(b[0]);
This is probably what produces the "-96" output in the last two examples. Try:
String decData = new String(b, "MacRoman");
Apart from that, keep in mind that System.out.println uses your system's default charset to print Strings anyway. For a better test, consider writing your Strings to a file using your specific charset, with something like:
FileOutputStream fos = new FileOutputStream("test.txt");
OutputStreamWriter writer = new OutputStreamWriter(fos, "MacRoman");
writer.write(result.toString());
writer.close();
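To check the result without relying on the console, the written bytes can be read back raw and inspected. The sketch below uses a temporary file; `CharsetFileCheck` and `writeAndReadRaw` are hypothetical names introduced for the example.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

public class CharsetFileCheck {
    // Writes the text to a temporary file in the given charset and returns the raw bytes on disk.
    static byte[] writeAndReadRaw(String text, Charset charset) throws IOException {
        Path tmp = Files.createTempFile("charset-check", ".txt");
        try {
            try (Writer writer = Files.newBufferedWriter(tmp, charset)) {
                writer.write(text);
            }
            return Files.readAllBytes(tmp);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = writeAndReadRaw("\u2020", Charset.forName("MacRoman"));
        System.out.println(raw.length); // 1
        System.out.println(raw[0]);     // -96
    }
}
```

Seeing the single byte -96 (0xA0) in the file confirms the encoding step worked, independently of how the terminal renders the character.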