I am trying to use a FileInputStream to essentially read in a text file, and then output it in a different text file. However, I always get very strange characters when I do this. I'm sure it's some simple mistake I'm making, thanks for any help or pointing me in the right direction. Here's what I've got so far.
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
while(fileIn.read(buf) > 0) {
System.out.println(buf);
}
The file it is reading from is just a big text file of regular ASCII characters. Whenever I call System.out.println, however, I get output like [B@a422ede. Any ideas on how to make this work? Thanks
This happens because you are printing a byte array object itself, rather than printing its content. You should construct a String from the buffer and a length, and print that String instead. The constructor to use for this is
String s = new String(buf, 0, len, charsetName);
Above, len should be the value returned by the call of the read() method. The charsetName should represent the encoding used by the underlying file.
If you're copying from one file to another, you shouldn't convert the bytes to a String at all; just write the bytes you read into the other file.
If your intention is to convert a text file from one encoding to another, read from a new InputStreamReader(in, sourceEncoding), and write to a new OutputStreamWriter(out, targetEncoding).
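A minimal sketch of the encoding conversion described above, assuming an ISO-8859-1 source and a UTF-8 target. The class name TranscodeDemo and the method transcode are illustrative names, not part of the original answer, and in-memory streams stand in for the two files.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeDemo {
    // Hypothetical helper: decodes 'in' with 'source' and re-encodes it to 'out' with 'target'.
    static void transcode(InputStream in, Charset source, OutputStream out, Charset target)
            throws IOException {
        Reader reader = new InputStreamReader(in, source);
        Writer writer = new OutputStreamWriter(out, target);
        char[] buf = new char[1024];
        int len;
        // read() returns the number of chars actually read, or -1 at end of stream.
        while ((len = reader.read(buf)) != -1) {
            writer.write(buf, 0, len);
        }
        // Push any chars still buffered in the writer down to 'out'.
        writer.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transcode(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1,
                out, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " bytes in, " + out.size() + " bytes out");
    }
}
```

Decoding through a Reader, rather than calling new String on each chunk, avoids splitting a multi-byte character across two buffers.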
That's because printing buf prints the reference to the byte array, not the bytes themselves as a String as you would expect. You need to call new String(buf) to convert the byte array into a String (or new String(buf, 0, len), so that only the bytes actually read are converted).
Also consider using BufferedReader rather than creating your own buffer. With it you can just do
String line = new BufferedReader(new FileReader("filename.txt")).readLine();
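The one-liner above reads only the first line and never closes the reader. A minimal sketch of reading every line, assuming the goal is to collect them in a list; ReadLinesDemo and readLines are illustrative names, and a StringReader stands in for new FileReader("filename.txt").

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLinesDemo {
    // Hypothetical helper: reads all lines from 'source' and closes it afterwards.
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            // readLine() returns null once the end of the input is reached.
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // In real code, pass new FileReader("filename.txt") instead of the StringReader.
        for (String line : readLines(new StringReader("first\nsecond\n"))) {
            System.out.println(line);
        }
    }
}
```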
Your loop should look like this:
int len;
while((len = fileIn.read(buf)) > 0) {
System.out.write(buf, 0, len);
}
You are (a) using the wrong method and (b) ignoring the length returned by read(), other than checking that it is > 0. So you are printing junk at the end of each buffer.
An object's default toString method returns the object's type and an identifier (its hash code), not its contents.
byte buf[] is an object.
You can print its contents like this:
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
while(fileIn.read(buf) > 0) {
System.out.println(Arrays.toString(buf));
}
or
File sendFile = new File(fileName);
FileInputStream fileIn = new FileInputStream(sendFile);
byte buf[] = new byte[1024];
int len=0;
while((len=fileIn.read(buf)) > 0) {
for(int i=0;i<len;i++){
System.out.print(buf[i]);
}
System.out.println();
}
Related
I am new to Java I/O, so please help.
I am trying to process a large file (e.g. a 50 MB PDF file) using the Apache Commons libraries.
At first I tried:
byte[] bytes = FileUtils.readFileToByteArray(file);
String encodeBase64String = Base64.encodeBase64String(bytes);
byte[] decoded = Base64.decodeBase64(encodeBase64String);
But knowing that FileUtils.readFileToByteArray in org.apache.commons.io will load the whole file into memory, I tried to use a BufferedInputStream to read the file piece by piece:
BufferedInputStream bis = new BufferedInputStream(inputStream);
StringBuilder pdfStringBuilder = new StringBuilder();
int byteArraySize = 10;
byte[] tempByteArray = new byte[byteArraySize];
while (bis.available() > 0) {
if (bis.available() < byteArraySize) { // reaching the end of file
tempByteArray = new byte[bis.available()];
}
int len = Math.min(bis.available(), byteArraySize);
read = bis.read(tempByteArray, 0, len);
if (read != -1) {
pdfStringBuilder.append(Base64.encodeBase64String(tempByteArray));
} else {
System.err.println("End of file reached.");
}
}
byte[] bytes = Base64.decodeBase64(pdfStringBuilder.toString());
However, the two decoded byte arrays don't look the same. In fact, the second one contains only 10 bytes, which is my temp array size.
Can anyone please help:
What am I doing wrong when reading the file piece by piece?
Why does the decoded byte array contain only 10 bytes in the second solution?
Thanks in advance:)
After some digging, it turns out that the temp array's size has to be a multiple of 3, so that no chunk other than the last one is encoded with padding. With a temp array size that is a multiple of 3, the program works.
I simply changed
int byteArraySize = 10;
to be
int byteArraySize = 1024 * 3;
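A minimal sketch of the chunked encoding with a chunk size that is a multiple of 3. It assumes java.util.Base64 (Java 8 and later) instead of the Commons Codec Base64 class used in the question, and ChunkedBase64Demo and encode are illustrative names. Each chunk is filled completely before it is encoded, because a single read() may return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.Base64;

class ChunkedBase64Demo {
    // Hypothetical helper: Base64-encodes a stream chunk by chunk.
    static String encode(InputStream in, int chunkSize) throws IOException {
        if (chunkSize <= 0 || chunkSize % 3 != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 3");
        }
        Base64.Encoder encoder = Base64.getEncoder();
        StringBuilder sb = new StringBuilder();
        byte[] chunk = new byte[chunkSize];
        while (true) {
            // Fill the chunk completely unless the stream ends first.
            int filled = 0;
            while (filled < chunkSize) {
                int n = in.read(chunk, filled, chunkSize - filled);
                if (n == -1) {
                    break;
                }
                filled += n;
            }
            if (filled == 0) {
                break; // nothing left to encode
            }
            // Encode only the bytes that were read, never the stale tail of the array.
            sb.append(encoder.encodeToString(Arrays.copyOf(chunk, filled)));
            if (filled < chunkSize) {
                break; // short chunk means end of stream; only this one may be padded
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i % 251);
        }
        String chunked = encode(new ByteArrayInputStream(data), 3 * 1024);
        byte[] decoded = Base64.getDecoder().decode(chunked);
        System.out.println(Arrays.equals(data, decoded)); // prints true
    }
}
```

Note that the encoded text still accumulates in memory here; for very large files, writing each encoded chunk to an output stream would avoid that.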
I have 2 java classes. Let them be class A and class B.
Class A gets String input from the user and stores it as bytes in the file; Class B should then read the file and display the bytes as a String.
CLASS A:
File file = new File("C:\\FILE.txt");
file.createNewFile();
FileOutputStream fos = new FileOutputStream(file);
String fwrite = user_input_1+"\n"+user_input_2;
fos.write(fwrite.getBytes());
fos.flush();
fos.close();
In CLASS B, I wrote the code to read the file, but I don't know how to read the file content as bytes.
CLASS B:
fr = new FileReader(file);
br = new BufferedReader(fr);
arr = new ArrayList<String>();
int i = 0;
while((getF = br.readLine()) != null){
arr.add(getF);
}
String[] sarr = (String[]) arr.toArray(new String[0]);
The FILE.txt has the following lines
[B@3ce76a1
[B@36245605
I want both these lines to be converted into their respective string values and then display it. How to do it?
Are you required to save the data as the String representation of a byte[]? Take a look at object serialization (Object Serialization Tutorial); with it, you don't have to worry about any low-level line-by-line read or write methods.
Since you are writing a byte array through the FileOutputStream, the opposite operation would be to read the file using the FileInputStream, and construct the String from the byte array:
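File file = new File("C:\\FILE.txt");
// Assumes the file is smaller than 2 GB, so that its length fits in an int.
byte[] bytes = new byte[(int) file.length()];
try (FileInputStream fis = new FileInputStream(file)) {
    // A single read() may return fewer bytes than requested; check the return value for large files.
    fis.read(bytes);
}
String result = new String(bytes);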
However, there are better ways of writing the String to a file.
You could write it using the FileWriter, and read using FileReader (possibly wrapping them by the corresponding BufferedReader/Writer), this will avoid creating intermediate byte array. Or better yet, use Apache Commons' IOUtils or Google's Guava libraries.
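A minimal sketch of the Writer/Reader approach, assuming UTF-8 text and one user input per line. It uses Files.newBufferedWriter and Files.newBufferedReader rather than FileWriter and FileReader so that the charset is explicit; TextFileRoundTripDemo and its methods are illustrative names, and a temporary file stands in for C:\FILE.txt.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TextFileRoundTripDemo {
    // Hypothetical helper for "class A": writes each string on its own line.
    static void writeLines(Path file, List<String> lines) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String line : lines) {
                writer.write(line);
                writer.newLine();
            }
        }
    }

    // Hypothetical helper for "class B": reads the lines back as strings.
    static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("FILE", ".txt");
        try {
            writeLines(file, Arrays.asList("first input", "second input"));
            System.out.println(readLines(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```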
What I did so far:
I read file1 containing text, XORed the bytes with a key, and wrote the result to another file, file2.
My problem: I read, for example, 'H' from file1; the byte value is 72.
72 XOR -32 = -88
Now I write -88 into file2.
When I read file2 I should get -88 as the first byte, but I get -3.
public byte[] readInput(String File) throws IOException {
Path path = Paths.get(File);
byte[] data = Files.readAllBytes(path);
byte[]x=new byte[data.length ];
FileInputStream fis = new FileInputStream(File);
InputStreamReader isr = new InputStreamReader(fis);//utf8
Reader in = new BufferedReader(isr);
int ch;
int s = 0;
while ((ch = in.read()) > -1) {// read till EOF
x[s] = (byte) (ch);
}
in.close();
return x;
}
public void writeOutput(byte encrypted [],String file) {
try {
FileOutputStream fos = new FileOutputStream(file);
Writer out = new OutputStreamWriter(fos,"UTF-8");//utf8
String s = new String(encrypted, "UTF-8");
out.write(s);
out.close();
}
catch (IOException e) {
e.printStackTrace();
}
}
public byte[]DNcryption(byte[]key,byte[] mssg){
if(mssg.length==key.length)
{
byte[] encryptedBytes= new byte[key.length];
for(int i=0;i<key.length;i++)
{
encryptedBytes[i]=Byte.valueOf((byte)(mssg[i]^key[i]));//XOR
}
return encryptedBytes;
}
else
{
return null;
}
}
You're not reading the file as bytes - you're reading it as characters. The encrypted data isn't valid UTF-8-encoded text, so you shouldn't try to read it as such.
Likewise, you shouldn't be writing arbitrary byte arrays as if they're UTF-8-encoded text.
Basically, your methods have signatures accepting or returning arbitrary binary data - don't use Writer or Reader classes at all. Just write the data straight to the stream. (And don't swallow the exception, either - do you really want to continue if you've failed to write important data?)
I would actually remove both your readInput and writeOutput methods entirely. Instead, use Files.readAllBytes and Files.write.
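A minimal sketch of that suggestion, assuming the key and the message have the same length as in the question. XorFileDemo and xor are illustrative names, the key bytes are made up, and a temporary file stands in for file2. No Reader, Writer, or String conversion touches the encrypted bytes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class XorFileDemo {
    // XORs each message byte with the key byte at the same position.
    static byte[] xor(byte[] key, byte[] message) {
        if (key.length != message.length) {
            throw new IllegalArgumentException("key and message must have the same length");
        }
        byte[] result = new byte[message.length];
        for (int i = 0; i < message.length; i++) {
            result[i] = (byte) (message[i] ^ key[i]);
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] message = "Hello".getBytes(StandardCharsets.US_ASCII);
        byte[] key = {-32, 17, 42, -5, 99}; // made-up key for illustration
        Path encrypted = Files.createTempFile("encrypted", ".bin");
        try {
            // Write and read the raw bytes directly.
            Files.write(encrypted, xor(key, message));
            byte[] readBack = Files.readAllBytes(encrypted);
            System.out.println(readBack[0]); // prints -88 (72 XOR -32)
            System.out.println(new String(xor(key, readBack), StandardCharsets.US_ASCII)); // prints Hello
        } finally {
            Files.deleteIfExists(encrypted);
        }
    }
}
```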
In the writeOutput method you convert the encrypted byte array into a UTF-8 String, which changes the actual bytes that are later written to the file. Try this code snippet to see what happens when you convert a byte array with negative values to a UTF-8 String:
final String s = new String(new byte[]{-1}, "UTF-8");
System.out.println(Arrays.toString(s.getBytes("UTF-8")));
It will print something like [-17, -65, -67]. Try using an OutputStream to write the bytes to the file:
new FileOutputStream(file).write(encrypted);
I am trying to read a UTF-8 file from a zip file and it's turning out to be a major challenge.
Here I zip the String to a byte array to persist to my db.
ByteArrayOutputStream bos = new ByteArrayOutputStream();
ZipOutputStream zo = new ZipOutputStream( bos );
zo.setLevel(9);
BufferedWriter writer = new BufferedWriter(
new OutputStreamWriter(bos, Charset.forName("utf-8"))
);
ZipEntry ze = new ZipEntry("data");
zo.putNextEntry(ze);
zo.write( s.getBytes() );
zo.close();
writer.close();
return bos.toByteArray();
And this is how I read the String back:
ZipInputStream zis = new ZipInputStream( new ByteArrayInputStream(bytes) );
ZipEntry entry = zis.getNextEntry();
byte[] buffer = new byte[2048];
ByteArrayOutputStream bos = new ByteArrayOutputStream();
int size;
while ((size = zis.read(buffer, 0, buffer.length)) != -1) {
bos.write(buffer, 0, size);
}
BufferedReader r = new BufferedReader( new InputStreamReader( new ByteArrayInputStream( bos.toByteArray() ), Charset.forName("utf-8") ) );
StringBuilder b = new StringBuilder();
while (r.ready()) {
b.append( r.readLine() ).append(" ");
}
The String that I get back here has lost the UTF-8 characters!
UPDATE 1:
I changed the code around so that I compared the byte array of the original String with the byte array I read back from the zip file, and they freaking match! So it's probably how I'm building the string after I have the bytes.
Arrays.equals(converted, orgi)
Your problem is in the writing. Presuming s is a String, you have:
zo.write( s.getBytes() );
But that will convert s to bytes using whatever the default encoding is. You'll want to use UTF-8 for that conversion:
zo.write( s.getBytes("utf-8") );
Your observation that the original bytes are the same as the uncompressed bytes makes sense, because the original written data is the source of the problem.
Note that you have the writer stream declared but you never actually use it for anything (nor should you, in this context, since writing to it will just write uncompressed string data to the same stream bos that your ZipOutputStream writes to). It looks like you may have confused yourself by trying a few different things at once here; you should just get rid of writer.
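A minimal sketch of the round trip with the encoding made explicit on both sides and the unused writer removed. ZipUtf8Demo, zip, and unzip are illustrative names; the entry name "data" and compression level 9 are taken from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class ZipUtf8Demo {
    // Compresses the UTF-8 bytes of 's' into a single-entry zip archive.
    static byte[] zip(String s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipOutputStream zo = new ZipOutputStream(bos)) {
            zo.setLevel(9);
            zo.putNextEntry(new ZipEntry("data"));
            // Encode explicitly as UTF-8 instead of the platform default.
            zo.write(s.getBytes(StandardCharsets.UTF_8));
            zo.closeEntry();
        }
        return bos.toByteArray();
    }

    // Reads the first entry back and decodes it with the same charset.
    static String unzip(byte[] bytes) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(bytes))) {
            if (zis.getNextEntry() == null) {
                throw new IOException("archive contains no entries");
            }
            byte[] buffer = new byte[2048];
            int size;
            while ((size = zis.read(buffer, 0, buffer.length)) != -1) {
                bos.write(buffer, 0, size);
            }
        }
        return new String(bos.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String original = "h\u00e9llo \u4e16\u754c";
        System.out.println(original.equals(unzip(zip(original)))); // prints true
    }
}
```

Decoding the whole byte array at once also sidesteps the ready() and readLine() loop, so no separators are added to the text.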
For one, BufferedReader#ready() is not a good indicator for reading input. Here are a number of reasons why:
Does BufferedReader.ready() method ensure that readLine() method does not return NULL?
BufferedReader not stating 'ready' when it should
Second, you are using
b.append( r.readLine() ).append(" ");
which is always adding a " " on every iteration. The resulting String value is bound to be different than the original just because of this.
Third, shout out to Jason C about your BufferedWriter not doing anything.
The documentation says that one should not use available() method to determine the size of an InputStream. How can I read the whole content of an InputStream into a byte array?
InputStream in; //assuming already present
byte[] data = new byte[in.available()];
in.read(data);//now data is filled with the whole content of the InputStream
I could read multiple times into a buffer of a fixed size, but then, I will have to combine the data I read into a single byte array, which is a problem for me.
The simplest approach IMO is to use Guava and its ByteStreams class:
byte[] bytes = ByteStreams.toByteArray(in);
Or for a file:
byte[] bytes = Files.toByteArray(file);
Alternatively (if you didn't want to use Guava), you could create a ByteArrayOutputStream, and repeatedly read into a byte array and write into the ByteArrayOutputStream (letting that handle resizing), then call ByteArrayOutputStream.toByteArray().
Note that this approach works whether you can tell the length of your input or not - assuming you have enough memory, of course.
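A minimal sketch of the ByteArrayOutputStream alternative described above. ReadAllDemo and readAll are illustrative names, and the 8192-byte buffer size is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadAllDemo {
    // Hypothetical helper: reads the stream to its end and returns all bytes.
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int len;
        // read() returns how many bytes were placed in the buffer, or -1 at end of stream.
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len); // the output stream grows as needed
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] source = new byte[20_000];
        byte[] copy = readAll(new ByteArrayInputStream(source));
        System.out.println(copy.length); // prints 20000
    }
}
```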
Please keep in mind that the answers here assume that the length of the file is less than or equal to Integer.MAX_VALUE (2147483647).
If you are reading in from a file, you can do something like this:
File file = new File("myFile");
byte[] fileData = new byte[(int) file.length()];
DataInputStream dis = new DataInputStream(new FileInputStream(file));
dis.readFully(fileData);
dis.close();
UPDATE (May 31, 2014):
Java 7 adds some new features in the java.nio.file package that can be used to make this example a few lines shorter. See the readAllBytes() method in the java.nio.file.Files class. Here is a short example:
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
// ...
Path p = FileSystems.getDefault().getPath("", "myFile");
byte [] fileData = Files.readAllBytes(p);
Android has support for this starting in API level 26 (8.0.0, Oreo).
You can use Apache commons-io for this task:
Refer to this method in the FileUtils class:
public static byte[] readFileToByteArray(File file) throws IOException
Update:
Java 7 way:
byte[] bytes = Files.readAllBytes(Paths.get(filename));
and if it is a text file and you want to convert it to String (change encoding as needed):
StandardCharsets.UTF_8.decode(ByteBuffer.wrap(bytes)).toString()
You can read it by chunks (byte buffer[] = new byte[2048]) and write the chunks to a ByteArrayOutputStream. From the ByteArrayOutputStream you can retrieve the contents as a byte[], without needing to determine its size beforehand.
I believe the buffer length needs to be specified, as memory is finite and you may run out of it.
Example:
File fileFileName = new File(strFileName); // strFileName holds the path of the file to read
InputStream in = new FileInputStream(fileFileName);
long length = fileFileName.length();
if (length > Integer.MAX_VALUE) {
throw new IOException("File is too large!");
}
byte[] bytes = new byte[(int) length];
int offset = 0;
int numRead = 0;
while (offset < bytes.length && (numRead = in.read(bytes, offset, bytes.length - offset)) >= 0) {
offset += numRead;
}
if (offset < bytes.length) {
throw new IOException("Could not completely read file " + fileFileName.getName());
}
in.close();
The maximum array length is bounded by Integer.MAX_VALUE, which is around 2 GB (2^31 - 1 = 2 147 483 647).
Your input stream can be bigger than 2 GB, so you have to process the data in chunks, sorry.
InputStream is; // assumed to be opened elsewhere
final byte[] buffer = new byte[512 * 1024 * 1024]; // 512 MB
while(true) {
final int read = is.read(buffer);
if ( read < 0 ) {
break;
}
// do processing
}
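A minimal sketch of what the "do processing" step could look like, assuming the goal is something that can be computed incrementally; here a CRC32 checksum is used purely as an illustration. ChunkedChecksumDemo and checksum are illustrative names, and the 64 KB buffer is an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

class ChunkedChecksumDemo {
    // Hypothetical helper: computes a CRC32 over a stream of any length
    // without ever holding more than one buffer of data in memory.
    static long checksum(InputStream is) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buffer = new byte[64 * 1024];
        int read;
        while ((read = is.read(buffer)) >= 0) {
            // Process only the bytes that were actually read in this pass.
            crc.update(buffer, 0, read);
        }
        return crc.getValue();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[200_000];
        System.out.println(Long.toHexString(checksum(new ByteArrayInputStream(data))));
    }
}
```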