When learning Java IO, I found that FileInputStream has an available() method, whose return value can equal the file size when reading a local file. So if the file size is known up front and I need to read the entire file, is it still necessary to decorate the stream with BufferedInputStream?
Like this:
FileInputStream fileInputStream=new FileInputStream("F:\\test.txt");
byte[] data=new byte[fileInputStream.available()];
if (fileInputStream.read(data)!=-1) {
System.out.println(new String(data));
}
or
BufferedReader bufferedReader=new BufferedReader(new
FileReader("F:\\test.txt"));
StringBuilder stringBuilder=new StringBuilder();
for (String line;(line=bufferedReader.readLine())!=null;){
stringBuilder.append(line);
}
System.out.println(stringBuilder.toString());
or
BufferedInputStream bufferedInputStream=new BufferedInputStream(new FileInputStream("F:\\test.txt"));
byte[] data=new byte[bufferedInputStream.available()];
if (bufferedInputStream.read(data)!=-1) {
System.out.println(new String(data));
}
What are the pros and cons of these methods? Which one is better?
Thanks.
You are wrong about the meaning of available(). It returns an estimate of the number of bytes that can be read without blocking, not the total size of the stream. From the documentation:
Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.
So, if you want to convert a stream to a byte array, you should use a library designed for that, such as IOUtils from Apache Commons IO:
byte[] out = IOUtils.toByteArray(stream);
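If adding a library is not an option, the JDK alone can do the same job. The following is a minimal sketch, assuming Java 7 or later; the class name ReadWholeFile and the helper readAll are made up for illustration. It shows Files.readAllBytes for files and a plain read loop for any InputStream, neither of which depends on available().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadWholeFile {

    // Hypothetical helper: reads every byte of the stream without relying on available().
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() may return fewer bytes than requested, so loop until end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for F:\test.txt from the question.
        Path path = Files.createTempFile("test", ".txt");
        Files.write(path, "hello, stream".getBytes(StandardCharsets.UTF_8));

        // Option 1: let the JDK read the whole file.
        byte[] viaFiles = Files.readAllBytes(path);

        // Option 2: loop over any InputStream.
        byte[] viaLoop;
        try (InputStream in = Files.newInputStream(path)) {
            viaLoop = readAll(in);
        }

        System.out.println(new String(viaFiles, StandardCharsets.UTF_8));
        System.out.println(new String(viaLoop, StandardCharsets.UTF_8));
        Files.delete(path);
    }
}
```

Because the loop already reads in 8 KB chunks, wrapping the stream in a BufferedInputStream would add little here; buffering mostly pays off when you make many small reads.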
I made an InputStream Object from a file and a InputStreamReader from that.
InputStream ips = new FileInputStream("c:\\data\\input.txt");
InputStreamReader isr = new InputStreamReader(ips);
I will basically read data as bytes into a buffer, but when the time comes to read chars I will 'switch mode' and read with the InputStreamReader:
byte[] bbuffer = new byte[20];
char[] cbuffer = new char[20];
while(ips.read(bbuffer, 0, 20)!=-1){
doSomethingWithbBuffer(bbuffer);
// check every 20th byte and if it is 0 start reading as char
if(bbuffer[19] == 0){
while(isr.read(cbuffer, 0, 20)!=-1){
doSomethingWithcBuffer(cbuffer);
// check every 20th char if its # return to reading as byte
if(cbuffer[19] == '#'){
break;
}
}
}
}
Is this a safe way to read files that have mixed char and byte data?
No, this is not safe. The InputStreamReader may read "too much" data from the underlying stream (it uses internal buffers) and corrupt your attempt to read from the underlying byte stream. You can use something like DataInputStream if you want to mix reading characters and bytes.
Alternately, just read the data as bytes and use the correct character encoding to convert those bytes to characters/Strings.
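As a rough illustration of the "read bytes, then decode" approach, here is a minimal sketch. The record layout (a 4-byte int, a 2-byte length, then that many UTF-8 bytes) and the names MixedRecordReader and readRecord are assumptions invented for the example, not the asker's actual file format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class MixedRecordReader {

    // Assumed layout: 4-byte big-endian int, 2-byte unsigned length, then UTF-8 text bytes.
    public static String readRecord(DataInputStream in) throws IOException {
        int id = in.readInt();                      // binary part
        int textLength = in.readUnsignedShort();    // how many text bytes follow
        byte[] textBytes = new byte[textLength];
        in.readFully(textBytes);                    // blocks until all bytes have arrived
        // Decode the bytes explicitly instead of layering a Reader on the stream.
        return id + ":" + new String(textBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Build a sample record in memory.
        byte[] text = "hello".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            out.writeInt(42);
            out.writeShort(text.length);
            out.write(text);
        }

        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(sink.toByteArray()))) {
            System.out.println(readRecord(in));
        }
    }
}
```

Every read goes through the same DataInputStream, so no second wrapper can buffer ahead and swallow bytes that belong to the binary part.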
I'm having a problem: in my server, after I send a file with X bytes, I send a string saying that this file is over and another file is coming, like
FILE: a SIZE: Y\r\n
send Y bytes
FILE a FINISHED\r\n
FILE b SIZE: Z\r\n
send Z bytes
FILE b FINISHED\r\n
FILES FINISHED\r\n
My client does not receive this properly.
I use readLine() to get the command lines after reading Y or Z bytes from the socket.
With one file it works fine; with multiple files it rarely works (yeah, I don't know how it worked once or twice).
Here is some of the code I use to transfer the binary data:
public static void readInputStreamToFile(InputStream is, FileOutputStream fout,
long size, int bufferSize) throws Exception
{
byte[] buffer = new byte[bufferSize];
long curRead = 0;
long totalRead = 0;
long sizeToRead = size;
while(totalRead < sizeToRead)
{
if(totalRead + buffer.length <= sizeToRead)
{
curRead = is.read(buffer);
}
else
{
curRead = is.read(buffer, 0, (int)(sizeToRead - totalRead));
}
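// read() returns -1 at end of stream; stop instead of passing a negative length to write()
if(curRead == -1)
throw new java.io.EOFException("Stream ended after " + totalRead + " of " + sizeToRead + " bytes");
totalRead = totalRead + curRead;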
fout.write(buffer, 0, (int) curRead);
}
}
public static void writeFileInputStreamToOutputStream(FileInputStream in, OutputStream out, int bufferSize) throws Exception
{
byte[] buffer = new byte[bufferSize];
int count = 0;
while((count = in.read(buffer)) != -1)
{
out.write(buffer, 0, count);
}
}
Just for the record, I was able to solve it by replacing readLine() with this code:
ByteArrayOutputStream ba = new ByteArrayOutputStream();
int ch;
while(true)
{
ch = is.read();
if(ch == -1)
throw new IOException("Connection closed");
if(ch == 13)
{
ch = is.read();
if(ch == 10)
return new String(ba.toByteArray(), "ISO-8859-1");
else
ba.write(13);
}
ba.write(ch);
}
PS: "is" is my input stream from the socket: socket.getInputStream();
Still, I don't know if it's the best implementation; I'm trying to figure that out.
There are no readLine() calls in the code here, but to answer your question: yes, calling BufferedReader.readLine() might very well leave data behind in its internal buffer. It's buffering the input.
If you wrap one of your InputStream in a BufferedReader, you can't really get much sane behavior if you read from the BufferedReader and then later on read from the InputStream.
You could read bytes from your InputStream and parse out a text line from that by looking for a pair of \r\n bytes. When you get a line saying "FILE: a SIZE: Y\r\n", you go on as usual, except that the buffer you used to parse lines might contain the first few bytes of your file, so write those bytes out first.
Or you use the idea of FTP and use one TCP stream for commands and one TCP stream for the actual transfer, reading from the command stream with a BufferedReader.readLine(), and reading the data as you already do with an InputStream.
Yes, the main point of a BufferedReader is to buffer the data. It is reading input from its underlying Reader in bigger chunks to avoid having multiple small reads.
That it has a readLine() method is just a nice bonus which is made easily possible by the buffering.
You may want to use a DataInputStream (on top of a BufferedInputStream) and its readLine() method, if you really have to mix text and binary data over the same connection. In that case, read the binary data from the same DataInputStream as well. (But take care with the encoding here.)
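The common thread in these answers is that every read, text or binary, has to go through one and the same stream object. Below is a minimal sketch of that idea, assuming the command format from the question; the names FramedTransferReader, readLine and readExactly are hypothetical, and ISO-8859-1 is assumed for the command lines.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FramedTransferReader {

    // Reads one CRLF-terminated line from the same stream that carries the file data.
    public static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int previous = -1;
        int current;
        while ((current = in.read()) != -1) {
            if (previous == '\r' && current == '\n') {
                byte[] bytes = line.toByteArray();
                // The '\r' was already stored, so leave it out of the result.
                return new String(bytes, 0, bytes.length - 1, StandardCharsets.ISO_8859_1);
            }
            line.write(current);
            previous = current;
        }
        throw new EOFException("Connection closed before end of line");
    }

    // Reads exactly size bytes, looping because read() may return fewer than requested.
    public static byte[] readExactly(InputStream in, int size) throws IOException {
        byte[] data = new byte[size];
        int total = 0;
        while (total < size) {
            int n = in.read(data, total, size - total);
            if (n == -1) {
                throw new EOFException("Connection closed after " + total + " of " + size + " bytes");
            }
            total += n;
        }
        return data;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stand-in for socket.getInputStream().
        byte[] wire = "FILE: a SIZE: 3\r\nabcFILE a FINISHED\r\nFILES FINISHED\r\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        // Wrap once; all later reads use this wrapper, never the raw stream.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(wire));

        String header = readLine(in);
        int size = Integer.parseInt(header.substring(header.lastIndexOf(' ') + 1));
        byte[] payload = readExactly(in, size);

        System.out.println(header);
        System.out.println(payload.length + " payload bytes");
        System.out.println(readLine(in));
        System.out.println(readLine(in));
    }
}
```

The BufferedInputStream keeps the single-byte reads in readLine cheap, and since nothing else touches the underlying stream, any bytes it buffers ahead are still available to readExactly.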
Call flush() on the OutputStream after you've written data that you want to be certain has been sent. So essentially at the end of each file call flush().
I guess you must flush your output stream to make sure any buffered bytes are actually sent down the stream. Closing the stream flushes it as well.
The Javadocs for flush say:
Flushes this output stream and forces any buffered output bytes to be written out. The general contract of flush is that calling it is an indication that, if any bytes previously written have been buffered by the implementation of the output stream, such bytes should immediately be written to their intended destination.
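To make the effect of flush() visible, here is a small sketch that uses a ByteArrayOutputStream as a stand-in for the socket's output stream. The class name FlushDemo and the helper visibleBytes are made up for the example, and it assumes the payload is smaller than the 8192-byte buffer.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

public class FlushDemo {

    // Returns {bytes that reached the destination before flush, bytes after flush}.
    public static int[] visibleBytes(byte[] payload) throws IOException {
        ByteArrayOutputStream destination = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(destination, 8192);

        out.write(payload);              // small writes stay in the buffer
        int before = destination.size();

        out.flush();                     // pushes the buffered bytes to the destination
        int after = destination.size();

        return new int[] {before, after};
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = visibleBytes(new byte[] {1, 2, 3, 4, 5});
        System.out.println(sizes[0] + " " + sizes[1]); // prints "0 5"
    }
}
```

On a real socket the same thing happens: until flush() or close() is called, the receiver may not see the bytes that are still sitting in the sender's buffer.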
I have some corrupted Gzip log files that I'm trying to restore. The files were transferred to our servers through a Java-backed web page. The files have always been sent as plain text, but we recently started to receive log files that were Gzipped. These Gzipped files appear to be corrupted and cannot be unzipped, and the originals have been deleted. I believe this is caused by the character encoding in the method below.
Is there any way to reverse the process and restore the files to their original zipped format? I have the resulting String's byte array data in a database blob.
Thanks for any help you can give!
private String convertStreamToString(InputStream is) throws IOException {
/*
* To convert the InputStream to String we use the
* Reader.read(char[] buffer) method. We iterate until the
* Reader return -1 which means there's no more data to
* read. We use the StringWriter class to produce the string.
*/
if (is != null) {
Writer writer = new StringWriter();
char[] buffer = new char[1024];
try {
Reader reader = new BufferedReader(
new InputStreamReader(is, "UTF-8"));
int n;
while ((n = reader.read(buffer)) != -1) {
writer.write(buffer, 0, n);
}
} finally {
is.close();
}
return writer.toString();
} else {
return "";
}
}
If this is the method that was used to convert the InputStream to a String, then your data is almost certainly lost.
The problem is that UTF-8 has quite a few byte sequences that are simply not legal (i.e. they don't represent any value). These sequences will be replaced with the Unicode replacement character.
That character is the same no matter which invalid byte sequence was decoded. Therefore the specific information in those bytes is lost.
If that's the code you have, you should never have converted to a Reader (or, in fact, a String). Only keeping the data as a stream (or byte array) would have avoided corrupting binary files. Once it has been read into a String, illegal sequences (and there are many in UTF-8) WILL be discarded.
So no, unless you are quite lucky, there is no way to recover the info. You'll have to provide another process that handles the pure byte stream and inserts it as a pure BLOB, not a CLOB.
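The loss is easy to reproduce. The sketch below decodes the first four bytes of a typical gzip header as UTF-8 and encodes the result again; the class name Utf8RoundTrip and the helper roundTrip are invented for the illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8RoundTrip {

    // Decodes the bytes as UTF-8 and encodes the resulting String back to bytes.
    public static byte[] roundTrip(byte[] original) {
        String decoded = new String(original, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Gzip magic number (0x1f 0x8b), deflate method (0x08), no flags (0x00).
        byte[] gzipHeader = {0x1f, (byte) 0x8b, 0x08, 0x00};
        byte[] afterRoundTrip = roundTrip(gzipHeader);

        System.out.println(Arrays.toString(gzipHeader));     // [31, -117, 8, 0]
        System.out.println(Arrays.toString(afterRoundTrip)); // [31, -17, -65, -67, 8, 0]
    }
}
```

The byte 0x8b is not valid on its own in UTF-8, so it comes back as the three bytes of the replacement character U+FFFD. Any other invalid byte would come back as exactly the same three bytes, which is why the original value cannot be reconstructed.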
Can someone explain to me what InputStream and OutputStream are?
I am confused about the use cases for both InputStream and OutputStream.
If you could also include a snippet of code to go along with your explanation, that would be great. Thanks!
The goal of InputStream and OutputStream is to abstract different ways to input and output: whether the stream is a file, a web page, or the screen shouldn't matter. All that matters is that you receive information from the stream (or send information into that stream.)
InputStream is used for many things that you read from.
OutputStream is used for many things that you write to.
Here's some sample code. It assumes the InputStream instr and OutputStream osstr have already been created:
int i;
while ((i = instr.read()) != -1) {
osstr.write(i);
}
instr.close();
osstr.close();
InputStream is used for reading, OutputStream for writing. They are connected as decorators to one another such that you can read/write all different types of data from all different types of sources.
For example, you can write primitive data to a file:
File file = new File("C:/text.bin");
file.createNewFile();
DataOutputStream stream = new DataOutputStream(new FileOutputStream(file));
stream.writeBoolean(true);
stream.writeInt(1234);
stream.close();
To read the written contents:
File file = new File("C:/text.bin");
DataInputStream stream = new DataInputStream(new FileInputStream(file));
boolean isTrue = stream.readBoolean();
int value = stream.readInt();
stream.close();
System.out.println(isTrue + " " + value);
You can use other types of streams to enhance the reading/writing. For example, you can introduce a buffer for efficiency:
DataInputStream stream = new DataInputStream(
new BufferedInputStream(new FileInputStream(file)));
You can write other data such as objects:
MyClass myObject = new MyClass(); // MyClass must implement Serializable
ObjectOutputStream stream = new ObjectOutputStream(
new FileOutputStream("C:/text.obj"));
stream.writeObject(myObject);
stream.close();
You can read from other different input sources:
byte[] test = new byte[] {0, 0, 1, 0, 0, 0, 1, 1, 8, 9};
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(test));
int value0 = stream.readInt();
int value1 = stream.readInt();
byte value2 = stream.readByte();
byte value3 = stream.readByte();
stream.close();
System.out.println(value0 + " " + value1 + " " + value2 + " " + value3);
For most input streams there is a corresponding output stream. You can define your own streams for reading/writing special things, and there are complex streams for reading complex things (for example, there are streams for reading/writing the ZIP format).
From the Java Tutorial:
A stream is a sequence of data.
A program uses an input stream to read data from a source, one item at a time:
A program uses an output stream to write data to a destination, one item at a time:
The data source and data destination pictured above can be anything that holds, generates, or consumes data. Obviously this includes disk files, but a source or destination can also be another program, a peripheral device, a network socket, or an array.
Sample code from oracle tutorial:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
This program uses byte streams to copy the xanadu.txt file to outagain.txt, writing one byte at a time.
Have a look at this SE question for more details about character streams, which are wrappers on top of byte streams: "byte stream and character stream".
You read from an InputStream and write to an OutputStream.
For example, say you want to copy a file. You would create a FileInputStream to read from the source file and a FileOutputStream to write to the new file.
If your data is a character stream, you could use a FileReader instead of an InputStream and a FileWriter instead of an OutputStream if you prefer.
InputStream input = ... // many different types
OutputStream output = ... // many different types
byte[] buffer = new byte[1024];
int n = 0;
while ((n = input.read(buffer)) != -1)
output.write(buffer, 0, n);
input.close();
output.close();
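The same loop can be combined with try-with-resources (Java 7 or later) so that both streams are closed even if a read or write throws. This is a minimal sketch; the class name StreamCopy and the in-memory streams are stand-ins chosen for the example, not part of the original answer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class StreamCopy {

    // Copies everything from input to output and returns the number of bytes copied.
    public static long copy(InputStream input, OutputStream output) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int n;
        while ((n = input.read(buffer)) != -1) {
            output.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "copy me".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        // Both streams are closed automatically, in reverse order, when the block exits.
        try (InputStream input = new ByteArrayInputStream(source);
             OutputStream output = sink) {
            long copied = copy(input, output);
            System.out.println(copied + " bytes copied");
        }
        System.out.println(new String(sink.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Because copy only depends on the abstract InputStream and OutputStream types, the in-memory streams can be swapped for file or socket streams without changing the loop.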
OutputStream is an abstract class that represents writing output. There are many different OutputStream classes, and they write out to particular things (the screen, files, byte arrays, network connections, and so on). InputStream classes access the same things, but they read data in from them.
Here is a good basic example of using FileOutputStream and FileInputStream to write data to a file, then read it back in.
A stream is a continuous flow of liquid, air, or gas.
A Java stream is a flow of data from a source into a destination. The source or destination can be a disk, memory, a socket, or another program. The data can be bytes, characters, or objects. The same applies to C# or C++ streams. A good metaphor for Java streams is water flowing from a tap into a bathtub and later down a drain.
The data represents the static part of the stream; the read and write methods the dynamic part of the stream.
InputStream represents a flow of data from the source, the OutputStream represents a flow of data into the destination.
Finally, InputStream and OutputStream are abstractions over low-level access to data, such as C file pointers.
Stream: In layman's terms, a stream is data; the most generic stream is a binary representation of data.
Input stream: If you are reading data from a file or any other source, the stream used is an input stream. In simpler terms, an input stream acts as a channel to read data.
Output stream: If you want to save data after reading and processing it from a source (a file, etc.), the means to store that data is an output stream.
An output stream is generally related to some data destination, like a file or a network. In Java, an output stream is a destination where data is eventually written and where it ends up.
import java.io.PrintStream;
class PPrint {
static PPrintStream oout = new PPrintStream();
}
class PPrintStream {
void print(String str) {
System.out.println(str);
}
}
class outputstreamDemo {
public static void main(String args[]) {
// System.out is a PrintStream, which is itself an OutputStream
System.out.println("hello world");
System.out.println("this is output stream demo");
}
}
For one kind of InputStream, you can think of it as a "representation" of a data source, like a file.
For example:
FileInputStream fileInputStream = new FileInputStream("/path/to/file/abc.txt");
fileInputStream represents the data at this path, and you can use its read method to read bytes from the file.
The other kind of InputStream takes in another InputStream and does further processing, like decompression.
For example:
GZIPInputStream gzipInputStream = new GZIPInputStream(fileInputStream);
gzipInputStream will treat the fileInputStream as a compressed data source. When you use the read(buffer, 0, buffer.length) method, it will decompress part of the gzip file into the buffer you provide.
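A self-contained way to try this out is to compress some bytes in memory and read them back through a GZIPInputStream. This is only a sketch; the class name GzipDecoratorDemo and the in-memory byte arrays are stand-ins for the file from the example above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipDecoratorDemo {

    // Wraps the sink in a GZIPOutputStream so that written bytes are compressed on the way out.
    public static byte[] compress(byte[] plain) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream gzip = new GZIPOutputStream(sink)) {
            gzip.write(plain);
        } // closing finishes the gzip trailer
        return sink.toByteArray();
    }

    // Wraps the source in a GZIPInputStream so that read() hands back decompressed bytes.
    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[1024];
            int n;
            // Each call decompresses only as much as fits into the buffer.
            while ((n = gzip.read(buffer, 0, buffer.length)) != -1) {
                plain.write(buffer, 0, n);
            }
        }
        return plain.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "stream me, stream me, stream me".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The decompress loop never needs the whole compressed input or the whole output in one buffer at a time, which is the point made about large sources.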
The reason we use InputStream is that as the data in the source grows larger and larger (say we have 500 GB of data in the source file), we don't want to hold everything in memory (that needs an expensive machine and puts pressure on the garbage collector), and we want to get some results sooner (reading the whole file may take a long time).
The same thing for OutputStream. We can start moving some result to the destination without waiting for the whole thing to finish, plus less memory consumption.
If you want more explanations and examples, you can check these summaries: InputStream, OutputStream, How To Use InputStream, How To Use OutputStream.
To add to the other great answers, in my own simple words:
Stream: as @Sher Mohammad mentioned, it is data.
Input stream: used, for example, to get input (data) from a file. The typical case is when I have a file (the user uploads a file, which is the input) and I want to read what it contains.
Output stream: the reverse. For example, you are generating an Excel file and writing it out to some place.
How the file gets written is defined by the sender (the Excel workbook class), not by the file output stream.
See this example in that context:
try (OutputStream fileOut = new FileOutputStream("xssf-align.xlsx")) {
wb.write(fileOut);
}
wb.close();