I need to be able to read the bytes from a file in android.
Everywhere I look, it seems that FileInputStream should be used to read bytes from a file but that is not what I want to do.
I want to be able to read a text file that contains a textual representation of byte-wide numeric values, which I want to save to an array.
An example of the text file I want to have converted to a byte array follows:
0x04 0xF2 0x33 0x21 0xAA
The final file will be much longer. Using FileInputStream gives me the values of the individual characters, whereas I want an array of length five holding the values listed above.
I want the array to be processed like:
ExampleArray[0] = (byte) 0x04;
ExampleArray[1] = (byte) 0xF2;
ExampleArray[2] = (byte) 0x33;
ExampleArray[3] = (byte) 0x21;
ExampleArray[4] = (byte) 0xAA;
Using FileInputStream on a text file returns the ASCII values of the characters and not the values I need written to the array.
The simplest solution is to use the FileInputStream.read(byte[] a) method, which transfers the bytes from the file into a byte array.
Edit: It seems I've misread the requirements. So the file contains the text representation of bytes.
Scanner scanner = new Scanner(new FileInputStream(FILENAME));
String input;
while (scanner.hasNext()) {
    input = scanner.next();
    long number = Long.decode(input);
    // do something with the value
}
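If you then need the parsed values in an actual byte[] (like the ExampleArray in the question), one way to collect them is sketched below; parseHexTextFile is just an illustrative name, and it assumes the file contains nothing but whitespace-separated, 0x-prefixed tokens (it also needs the java.util.List/ArrayList, Scanner and FileInputStream imports):
static byte[] parseHexTextFile(String filename) throws IOException {
    List<Byte> values = new ArrayList<>();
    Scanner scanner = new Scanner(new FileInputStream(filename));
    while (scanner.hasNext()) {
        // Long.decode understands the "0x" prefix; the cast narrows values
        // such as 0xF2 or 0xAA into the signed byte range
        values.add((byte) Long.decode(scanner.next()).longValue());
    }
    scanner.close();

    byte[] result = new byte[values.size()];
    for (int i = 0; i < result.length; i++) {
        result[i] = values.get(i);
    }
    return result;
}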
Old answer (obviously wrong for this case, but I'll leave it for posterity):
Use a FileInputStream's read(byte[]) method.
FileInputStream in = new FileInputStream(filename);
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = in.read(buffer, 0, buffer.length);
You just don't store bytes as text. Never!
A value like 0x00 can be written to a file as a single byte, or as a string, which in this (hex) case takes up four times as much space.
If you're required to do this, push back on how awful that decision is!
I will edit my answer if you can provide a sensible reason, though.
You would only save data as actual text if:
It is easier (not the case)
It adds value (if more than quadrupling the file size, spaces included, adds value, then yes)
Users should be able to edit the file (then you would omit the "0x"...)
You can write bytes like this:
public static void writeBytes(byte[] in, File file, boolean append) throws IOException {
    FileOutputStream fos = null;
    try {
        fos = new FileOutputStream(file, append);
        fos.write(in);
    } finally {
        if (fos != null)
            fos.close();
    }
}
and read like this:
public static byte[] readBytes(File file) throws IOException {
    return readBytes(file, (int) file.length());
}

public static byte[] readBytes(File file, int length) throws IOException {
    byte[] content = new byte[length];
    FileInputStream fis = null;
    try {
        fis = new FileInputStream(file);
        int offset = 0;
        while (offset < length) {
            int read = fis.read(content, offset, length - offset);
            if (read == -1)
                break; // end of file reached before 'length' bytes were read
            offset += read;
        }
    } finally {
        if (fis != null)
            fis.close();
    }
    return content;
}
and therefore have:
public static void writeString(String in, File file, String charset, boolean append)
        throws IOException {
    writeBytes(in.getBytes(charset), file, append);
}

public static String readString(File file, String charset) throws IOException {
    return new String(readBytes(file), charset);
}
to write and read strings.
Note that I don't use the try-with-resource construct because Android's current Java source level is too low for that. :(
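For example, usage of the helpers above could look like this (a sketch only; the file names and payload are placeholders, getFilesDir() assumes the calls run inside an Android Context, and they need to run where an IOException can be handled or declared):
File file = new File(getFilesDir(), "data.bin");
byte[] payload = {(byte) 0x04, (byte) 0xF2, (byte) 0x33, (byte) 0x21, (byte) 0xAA};

writeBytes(payload, file, false);        // overwrite the file with the raw bytes
byte[] roundTripped = readBytes(file);   // read the same five bytes back

File textFile = new File(getFilesDir(), "text.txt");
writeString("hello", textFile, "UTF-8", false);
String text = readString(textFile, "UTF-8");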
Related
I want to calculate the CRC32 checksum of a given InputStream and then use it to get the string out of it. Here's what I've tried so far:
private long calculateChecksum(InputStream stream) throws IOException {
    CRC32 crc = new CRC32();
    byte[] buffer = new byte[8192];
    int length;
    while ((length = stream.read(buffer)) > 0) {
        crc.update(buffer, 0, length);
    }
    return crc.getValue();
}
and then
String text = IOUtils.toString(inputStream, UTF_8);
I also tried reversing the order: first read it as a string, then calculate the checksum. But that didn't work either.
What seems to be my issue is that the stream position goes to the end while calculating the checksum and then doesn't reset. Any idea how to use the InputStream after calculating the checksum?
As others said, a stream can be consumed only once. But you can consume it and calculate the CRC value at the same time by wrapping your InputStream with a java.util.zip.CheckedInputStream.
Here is a complete example, assuming the text file "test.txt" is in the current directory and contains only this one line: These are german umlauts: äöüÄÖÜß
import org.apache.commons.io.IOUtils;
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
public class App {
    private static final String INPUT_FILE = "test.txt";

    public static void main(String[] args) {
        final CRC32 crc32 = new CRC32();
        try (InputStream in = new CheckedInputStream(new BufferedInputStream(
                new FileInputStream(INPUT_FILE)), crc32)) {
            final String text = IOUtils.toString(in, StandardCharsets.UTF_8);
            System.out.println(text);
            System.out.println(String.format("CRC32: %x", crc32.getValue()));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Output:
These are german umlauts: äöüÄÖÜß
CRC32: 84bcd851
Yes, an InputStream is consumed. You have a few options:
mark
mark() / reset() are optional methods on input streams: mark sets a mark (by itself this does nothing), and reset 'rewinds' to the mark, replaying everything that was read since the last time you called mark().
However, your average input stream either does not support it, or, if it does, supports it by keeping in memory all the bytes received since the mark was set. Meaning, if you do this to an input stream that contains a few GB worth of data, you're going to get an OutOfMemoryError.
If there isn't a lot of data, just use mark and reset. Wrap the stream in a BufferedInputStream, which is specced to support mark/reset:
private void example(InputStream in) throws IOException {
    BufferedInputStream buffered = new BufferedInputStream(in);
    buffered.mark(Integer.MAX_VALUE); // the read limit must cover everything read before reset()
    long crc = calculateChecksum(buffered);
    buffered.reset();
    String text = IOUtils.toString(buffered, UTF_8);
}
Duplicate
Your second option is to duplicate the inputstream, sending each retrieved byte both to IOUtils as well as to the CRC algorithm.
This is complicated and not recommended.
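For the record, a rough sketch of the duplication idea using commons-io's TeeInputStream and a CheckedOutputStream (assuming commons-io is available; in practice the CheckedInputStream approach shown earlier is the simpler way to get the same effect):
CRC32 crc = new CRC32();
OutputStream discard = new OutputStream() {
    @Override public void write(int b) { /* drop the byte; we only want the checksum side effect */ }
};
// every byte IOUtils pulls from 'teed' is also copied into the checksum branch
InputStream teed = new TeeInputStream(in, new CheckedOutputStream(discard, crc));
String text = IOUtils.toString(teed, UTF_8);
long checksum = crc.getValue();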
Checksum the string instead.
You already have a string of data. Just checksum that:
private void example(InputStream in) throws IOException {
    String text = IOUtils.toString(in, UTF_8);
    CRC32 crc = new CRC32();
    crc.update(text.getBytes(UTF_8));
    long checksum = crc.getValue();
}
Or, ditching IOUtils:
private void example(InputStream in) throws IOException {
    byte[] data = in.readAllBytes();
    CRC32 crc = new CRC32();
    crc.update(data);
    long checksum = crc.getValue();
    String text = new String(data, UTF_8);
}
InputStream is a read-once stream. Once you've read it, you can't go back to start again. This is because InputStream is general-purpose: it could be the stream of bytes read from a keyboard, for example, or read from a real-time data feed.
If your input stream is in fact a FileInputStream, then you could use
inputStream.getChannel().position(0);
to reset it to the start of the file.
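A short sketch of that, reusing calculateChecksum from the question (this only works when the stream really is a FileInputStream):
FileInputStream fis = new FileInputStream("test.txt");
long checksum = calculateChecksum(fis);    // consumes the stream to EOF
fis.getChannel().position(0);              // rewind the underlying channel to the start
String text = IOUtils.toString(fis, StandardCharsets.UTF_8);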
If it's a ByteArrayInputStream, then you already have a byte array so you might as well just use that instead.
If you want to write a general-purpose function that doesn't know what kind of InputStream it is given, then you can wrap it in a BufferedInputStream and use its mark() method. This will use extra memory to buffer the whole of the stream.
I have a file containing a byte array which I am trying to convert into human-readable text. I tried the approaches below:
public static void main(String args[]) throws IOException {
    //System.out.println("Platform Encoding : " + System.getProperty("file.encoding"));
    FileInputStream fis = new FileInputStream("<Path>");

    // Using Apache Commons IOUtils to read file into byte array
    byte[] filedata = IOUtils.toByteArray(fis);
    String str = new String(filedata, "UTF-8");
    System.out.println(str);
}
Another approach:
public static void main(String[] args) {
    File file = new File("<Path>");
    readContentIntoByteArray(file);
}

private static byte[] readContentIntoByteArray(File file) {
    FileInputStream fileInputStream = null;
    byte[] bFile = new byte[(int) file.length()];
    try {
        fileInputStream = new FileInputStream(file);
        fileInputStream.read(bFile);
        fileInputStream.close();
        for (int i = 0; i < bFile.length; i++) {
            System.out.print((char) bFile[i]);
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
    return bFile;
}
This code compiles, but it is not yielding human-readable output. Excuse me if this is a repeated or basic question.
Could someone please point out where I am going wrong?
Your code (from the first snippet) for decoding a byte file into UTF-8 text looks correct to me (assuming FileInputStream fis = new FileInputStream("Path") is yielding the correct FileInputStream).
If you're expecting a text file format but are not sure which encoding the file is in (perhaps it's not UTF-8), you can use a library like the one below to find out.
https://code.google.com/archive/p/juniversalchardet/
or just explore some of the different Charsets in the Charset library and see what they produce in your String initialization line:
new String(byteArray, Charset.defaultCharset()) // try other Charsets here.
The second method you show has pitfalls associated with byte-to-char conversion, depending on the characters, as discussed here (Byte and char conversion in Java).
Chances are, if you cannot find a valid encoding for this file, it was not human-readable to begin with, or the byte array file being passed to you lost something along the way that would make it decodable.
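If you go the detection route, usage of juniversalchardet looks roughly like the sketch below (API usage taken from the project's documentation; treat the details as an assumption, note that getDetectedCharset() can return null, and the detector class is org.mozilla.universalchardet.UniversalDetector):
UniversalDetector detector = new UniversalDetector(null);
try (FileInputStream fis = new FileInputStream("<Path>")) {
    byte[] buf = new byte[4096];
    int nread;
    while ((nread = fis.read(buf)) > 0 && !detector.isDone()) {
        detector.handleData(buf, 0, nread);   // feed raw bytes to the detector
    }
}
detector.dataEnd();
String encoding = detector.getDetectedCharset();  // e.g. "UTF-8", "WINDOWS-1252", or null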
What I did so far:
I read a file1 containing text, XORed the bytes with a key, and wrote the result to another file2.
My problem: I read, for example, 'H' from file1; the byte value is 72.
72 XOR -32 = -88
Now I wrote -88 into file2.
When I read file2, I should get -88 as the first byte, but I get -3.
public byte[] readInput(String File) throws IOException {
    Path path = Paths.get(File);
    byte[] data = Files.readAllBytes(path);
    byte[] x = new byte[data.length];
    FileInputStream fis = new FileInputStream(File);
    InputStreamReader isr = new InputStreamReader(fis); // utf8
    Reader in = new BufferedReader(isr);
    int ch;
    int s = 0;
    while ((ch = in.read()) > -1) { // read till EOF
        x[s] = (byte) (ch);
    }
    in.close();
    return x;
}
public void writeOutput(byte encrypted[], String file) {
    try {
        FileOutputStream fos = new FileOutputStream(file);
        Writer out = new OutputStreamWriter(fos, "UTF-8"); // utf8
        String s = new String(encrypted, "UTF-8");
        out.write(s);
        out.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}
public byte[] DNcryption(byte[] key, byte[] mssg) {
    if (mssg.length == key.length) {
        byte[] encryptedBytes = new byte[key.length];
        for (int i = 0; i < key.length; i++) {
            encryptedBytes[i] = Byte.valueOf((byte) (mssg[i] ^ key[i])); // XOR
        }
        return encryptedBytes;
    } else {
        return null;
    }
}
You're not reading the file as bytes - you're reading it as characters. The encrypted data isn't valid UTF-8-encoded text, so you shouldn't try to read it as such.
Likewise, you shouldn't be writing arbitrary byte arrays as if they're UTF-8-encoded text.
Basically, your methods have signatures accepting or returning arbitrary binary data - don't use Writer or Reader classes at all. Just write the data straight to the stream. (And don't swallow the exception, either - do you really want to continue if you've failed to write important data?)
I would actually remove both your readInput and writeOutput methods entirely. Instead, use Files.readAllBytes and Files.write.
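A minimal sketch of that (xorFile is just an illustrative name; it reuses DNcryption from the question, so key and message must have the same length, and it needs the java.nio.file.Files and Paths imports):
void xorFile(String inFile, String outFile, byte[] key) throws IOException {
    byte[] data = Files.readAllBytes(Paths.get(inFile));    // raw bytes, no charset involved
    byte[] result = DNcryption(key, data);                  // XOR from the question
    Files.write(Paths.get(outFile), result);                // raw bytes back out
}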
In the writeOutput method you convert the encrypted byte array into a UTF-8 String, which changes the actual bytes you later write to the file. Try this code snippet to see what happens when you try to convert a byte array with negative values to a UTF-8 String:
final String s = new String(new byte[]{-1}, "UTF-8");
System.out.println(Arrays.toString(s.getBytes("UTF-8")));
It will print something like [-17, -65, -67]. Try using OutputStream to write bytes to the file.
new FileOutputStream(file).write(encrypted);
I'm looking for a way to read the binary data of a file into a string.
I've found one approach that reads the bytes directly and converts them to binary; the only problem is that it takes up a significant amount of RAM.
Here's the code I'm currently using:
try {
    byte[] fileData = new byte[(int) sellect.length()];
    FileInputStream in = new FileInputStream(sellect);
    in.read(fileData);
    in.close();

    getBinary(fileData[0]);
    getBinary(fileData[1]);
    getBinary(fileData[2]);
} catch (IOException e) {
    e.printStackTrace();
}
And the getBinary() method
public String getBinary(byte bite) {
    String output = String.format("%8s", Integer.toBinaryString(bite & 0xFF)).replace(' ', '0');
    System.out.println(output); // 10000001
    return output;
}
Can you do something like this:
int buffersize = 1000;
byte[] fileData = new byte[buffersize];
int numBytesRead;
String string;
while ((numBytesRead = in.read(fileData, 0, buffersize)) != -1) {
    string = getBinary(fileData); // adjust this so it can work with a whole array of bytes at once
    out.write(string);
}
This way, you never hold more in RAM than one buffer's worth of bytes and the corresponding string. The file is read 1000 bytes at a time, translated to a string, and then written to the new file as a string. read() returns how many bytes it actually read.
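The per-array version of getBinary hinted at in the comment could look something like this (a sketch; it only formats the first length bytes of the buffer, so pass numBytesRead):
public String getBinary(byte[] bytes, int length) {
    StringBuilder output = new StringBuilder(length * 8);
    for (int i = 0; i < length; i++) {
        // same 8-bit, zero-padded formatting as the single-byte version
        output.append(String.format("%8s", Integer.toBinaryString(bytes[i] & 0xFF)).replace(' ', '0'));
    }
    return output.toString();
}
The call inside the loop would then become string = getBinary(fileData, numBytesRead);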
This link can help you: File to byte[] in Java
public static byte[] toByteArray(InputStream input) throws IOException
Gets the contents of an InputStream as a byte[]. This method buffers the input internally, so there is no need to use a BufferedInputStream.
Parameters: input - the InputStream to read from
Returns: the requested byte array
Throws: NullPointerException - if the input is null
IOException - if an I/O error occurs
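Usage is essentially a one-liner; a minimal sketch, with the path as a placeholder and commons-io's IOUtils on the classpath:
try (FileInputStream in = new FileInputStream("<Path>")) {
    byte[] fileData = IOUtils.toByteArray(in);
    // ... decode or inspect fileData here ...
}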
We are really stuck on this topic. This is the only code we have, which converts a file into hex, but we need to open a file and then have the Java code read the hex and extract certain bytes (e.g. the first 4 bytes for the file extension):
import java.io.*;

public class FileInHexadecimal
{
    public static void main(String[] args) throws Exception
    {
        FileInputStream fis = new FileInputStream("H://Sample_Word.docx");
        int i = 0;
        while ((i = fis.read()) != -1) {
            if (i != -1) {
                System.out.printf("%02X\n ", i);
            }
        }
        fis.close();
    }
}
Do not confuse internal and external representation - converting to hex only creates a different representation of the same bytes.
There is no need to convert to hex if you just want to read some bytes from the file - just read them. For example, to read the first four bytes, you can use something like
byte[] buffer = new byte[4];
FileInputStream fis = new FileInputStream("H://Sample_Word.docx");
int read = fis.read(buffer);
if (read != buffer.length) {
    System.out.println("Short file!");
}
If you need to read data from an arbitrary position within the file, you might want to check RandomAccessFile instead of using a stream. RandomAccessFile allows you to set the position from which to start reading.
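For example, a short sketch that reads 4 bytes starting at an arbitrary offset (the offset of 128 is just an illustration):
try (RandomAccessFile raf = new RandomAccessFile("H://Sample_Word.docx", "r")) {
    byte[] buffer = new byte[4];
    raf.seek(128);         // jump to the position where reading should start
    raf.readFully(buffer); // throws EOFException if fewer than 4 bytes remain
}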