Write string to output stream - java

I have several output listeners that are implementing OutputStream.
It can be either a PrintStream writing to stdout or to a File, or it can be writing to memory or any other output destination; therefore, I specified OutputStream as (an) argument in the method.
Now, I have received the String. What is the best way to write to streams here?
Should I just use Writer.write(message.getBytes())? I can give it bytes, but if the destination stream is a character stream then will it convert automatically?
Do I need to use some bridge streams here instead?

Streams (InputStream and OutputStream) transfer binary data. If you want to write a string to a stream, you must first convert it to bytes, or in other words encode it. You can do that manually (as you suggest) using the String.getBytes(Charset) method, but you should avoid the String.getBytes() method, because that uses the default encoding of the JVM, which can't be reliably predicted in a portable way.
The usual way to write character data to a stream, though, is to wrap the stream in a Writer, (often a PrintWriter), that does the conversion for you when you call its write(String) (or print(String)) method. The corresponding wrapper for InputStreams is a Reader.
PrintStream is a special OutputStream implementation in the sense that it also contain methods that automatically encode strings (it uses a writer internally). But it is still a stream. You can safely wrap your stream with a writer no matter if it is a PrintStream or some other stream implementation. There is no danger of double encoding.
Example of PrintWriter with OutputStream:
try (PrintWriter p = new PrintWriter(new FileOutputStream("output-text.txt", true))) {
p.println("Hello");
} catch (FileNotFoundException e1) {
e1.printStackTrace();
}

OutputStream writes bytes, String provides chars. You need to define Charset to encode string to byte[]:
outputStream.write(string.getBytes(Charset.forName("UTF-8")));
Change UTF-8 to a charset of your choice.

You can create a PrintStream wrapping around your OutputStream and then just call it's print(String):
final OutputStream os = new FileOutputStream("/tmp/out");
final PrintStream printStream = new PrintStream(os);
printStream.print("String");
printStream.close();

By design it is to be done this way:
OutputStream out = ...;
try (Writer w = new OutputStreamWriter(out, "UTF-8")) {
w.write("Hello, World!");
} // or w.close(); //close will auto-flush

Wrap your OutputStream with a PrintWriter and use the print methods on that class. They take in a String and do the work for you.

You may use Apache Commons IO:
try (OutputStream outputStream = ...) {
IOUtils.write("data", outputStream, "UTF-8");
}
IOUtils.write(String data, OutputStream output, String encoding)

Related

How to reduce writing between InputStreams and OutputStreams?

I have the following use case:
read from service InputStream
go through the InputStream and replace some stuff, result is stored in OutputStream
now I need to go on working with an InputStream created from the OutputStream
This is the code I use right now:
InputStream resourceStream = service.getStream();
ByteArrayOutputStream output = new ByteArrayOutputStream();
replace(resourceStream, output, ...);
resourceStream.close();
resourceStream = new ByteArrayInputStream(out.toByteArray());
A lot of shifting and converting streams, so I was wondering if there is a cleaner solution. Maybe some OutputStream which can be used as InputStream, or an OutputStream which contains an InputStream where the content is written to.
No matter how good your idea of having a single object to write and read data from, the implementations of InputStream and OutputStream have been made as "Classes" and not "Interfaces".
Also, due to the fact that, in Java, a single subclass cannot extend multiple Super Classes, the dream of having both Input and Output stream operations in a single class remains just a dream.
That said, the only other option left for programmers would be to create a class that has both InputStream and OutputStream exposed to its clients. Something similar to what as java.net.Socket does. It exposes a getInputStream and a getOutputStream which at a logical level reads from and writes to the same "socket".
So, you can do something like this:
public class IOStreamWrapper {
byte[] streamData;
public InputStream getInputStream() {
// return an inputstream that reads from streamData[]
}
public OutputStream getOutputStream() {
// return an outputstream that writes to streamData[]
}
}
References:
Java InputStream
Java OutputStream
Hope this helps!

When do I need to specify the encoding while writing the file to the disk?

I have a sample method which copies one file to another using InputStream and OutputStream. In this case, the source file is encoded in 'UTF-8'. Even if I don't specify the encoding while writing to the disk, the destination file has the correct encoding. But, if I have to write a java.lang.String to a file, I need to specify the encoding. Why is that ?
public static void copyFile() {
String sourceFilePath = "C://my_encoded.txt";
InputStream inStream = null;
OutputStream outStream = null;
try{
String targetFilePath = "C://my_target.txt";
File sourcefile =new File(sourceFilePath);
outStream = new FileOutputStream(targetFilePath);
inStream = new FileInputStream(sourcefile);
byte[] buffer = new byte[1024];
int length;
//copy the file content in bytes
while ((length = inStream.read(buffer)) > 0){
outStream.write(buffer, 0, length);
}
inStream.close();
outStream.close();
System.out.println("File "+targetFilePath+" is copied successful!");
}catch(IOException e){
e.printStackTrace();
}
}
My guess is that since the source file has thee correct encoding and since we read and write one byte at a time, it works fine. And java.lang.String is 'UTF-16' by default and if we write it to the file, it reads one byte at a time instead of 2 bytes and hence garbage values. Is that correct or am I completely wrong in my understanding ?
You are copying the file byte per byte, so you don't need to care about character encoding.
As a rule of thumb:
Use the various InputStream and OutputStream implementations for byte-wise processing (like file copy).
There are some convenience methods to handle text directly like PrintStream.println(). Be careful because most of them use the default platform specific encoding.
Use the various Reader and Writer implemenations for reading and writing text.
If you need to convert between byte-wise and text processing use InputStreamReader and OutputStreamWriter with explicit file encoding.
Do not rely on the default encoding. The default character encoding is platform specific (e.g. Windows-ANSI aka Cp1252 for Windows, usually UTF-8 on Linux).
Example: If you need to read a UTF-8 text file:
BufferedReader reader =
new BufferedReader(new InputStreamReader(new FileInputStream(inFile), "UTF-8"));
Avoid using a FileReader because a FileReader uses always the default encoding.
A special case: If you need random access to a file you should use RandomAccessFile. With it you can read and write data blocks at arbitrary positions. You can read and write raw byte blocks or you can use convenience methods to read and write text. But you should read the documentation carefully. E.g. the methods readUTF() and writeUTF() use a modified UTF-8 encoding.
InputStream, OutputStream, Reader, Writer and RandomAccessFile form the basic IO functionality, enough for most use cases. For advanced IO (e.g. memory mapped files, ...) have a look at package java.nio.
Just read your code! (For the copy part at least ;-) )
When you copy the two files, you copy it byte by byte. There is no conversion to String, thus.
When you write a String into a file, you need to convert it (indirectly sometimes) in an array of byte (byte[]). There you need to specify your encoding.
When you read a file to get a String, you need to know its encoding in order to do it properly. Java doesn't 'skip' any byte but you need to make a conversion once again : from a byte[] to a String.

Java reading a file different methods

It seems that there are many, many ways to read text files in Java (BufferedReader, DataInputStream etc.) My personal favorite is Scanner with a File in the constructor (it's just simpler, works with mathy data processing better, and has familiar syntax).
Boris the Spider also mentioned Channel and RandomAccessFile.
Can someone explain the pros and cons of each of these methods? To be specific, when would I want to use each?
(edit) I think I should be specific and add that I have a strong preference for the Scanner method. So the real question is, when wouldn't I want to use it?
Lets start at the beginning. The question is what do you want to do?
It's important to understand what a file actually is. A file is a collection of bytes on a disc, these bytes are your data. There are various levels of abstraction above that that Java provides:
File(Input|Output)Stream - read these bytes as a stream of byte.
File(Reader|Writer) - read from a stream of bytes as a stream of char.
Scanner - read from a stream of char and tokenise it.
RandomAccessFile - read these bytes as a searchable byte[].
FileChannel - read these bytes in a safe multithreaded way.
On top of each of those there are the Decorators, for example you can add buffering with BufferedXXX. You could add linebreak awareness to a FileWriter with PrintWriter. You could turn an InputStream into a Reader with an InputStreamReader (currently the only way to specify character encoding for a Reader).
So - when wouldn't I want to use it [a Scanner]?.
You would not use a Scanner if you wanted to, (these are some examples):
Read in data as bytes
Read in a serialized Java object
Copy bytes from one file to another, maybe with some filtering.
It is also worth nothing that the Scanner(File file) constructor takes the File and opens a FileInputStream with the platform default encoding - this is almost always a bad idea. It is generally recognised that you should specify the encoding explicitly to avoid nasty encoding based bugs. Further the stream isn't buffered.
So you may be better off with
try (final Scanner scanner = new Scanner(new BufferedInputStream(new FileInputStream())), "UTF-8") {
//do stuff
}
Ugly, I know.
It's worth noting that Java 7 Provides a further layer of abstraction to remove the need to loop over files - these are in the Files class:
byte[] Files.readAllBytes(Path path)
List<String> Files.readAllLines(Path path, Charset cs)
Both these methods read the entire file into memory, which might not be appropriate. In Java 8 this is further improved by adding support for the new Stream API:
Stream<String> Files.lines(Path path, Charset cs)
Stream<Path> Files.list(Path dir)
For example to get a Stream of words from a Path you can do:
final Stream<String> words = Files.lines(Paths.get("myFile.txt")).
flatMap((in) -> Arrays.stream(in.split("\\b")));
SCANNER:
can parse primitive types and strings using regular expressions.
A Scanner breaks its input into tokens using a delimiter pattern, which by default matches whitespace. The resulting tokens may then be converted into values of different types.more can be read at http://docs.oracle.com/javase/7/docs/api/java/util/Scanner.html
DATA INPUT STREAM:
Lets an application read primitive Java data types from an underlying input stream in a machine-independent way. An application uses a data output stream to write data that can later be read by a data input stream.DataInputStream is not necessarily safe for multithreaded access. Thread safety is optional and is the responsibility of users of methods in this class. More can be read at http://docs.oracle.com/javase/7/docs/api/java/io/DataInputStream.html
BufferedReader:
Reads text from a character-input stream, buffering characters so as to provide for the efficient reading of characters, arrays, and lines.The buffer size may be specified, or the default size may be used. The default is large enough for most purposes.In general, each read request made of a Reader causes a corresponding read request to be made of the underlying character or byte stream. It is therefore advisable to wrap a BufferedReader around any Reader whose read() operations may be costly, such as FileReaders and InputStreamReaders. For example,
BufferedReader in = new BufferedReader(new FileReader("foo.in"));
will buffer the input from the specified file. Without buffering, each invocation of read() or readLine() could cause bytes to be read from the file, converted into characters, and then returned, which can be very inefficient.Programs that use DataInputStreams for textual input can be localized by replacing each DataInputStream with an appropriate BufferedReader.More detail are at http://docs.oracle.com/javase/7/docs/api/java/io/BufferedReader.html
NOTE: This approach is outdated. As Boris points out in his comment. I will leave it here for history, but you should use methods available in JDK.
It depends on what kind of operation you are doing and the size of the file you are reading.
In most of the cases, I recommend using commons-io for small files.
byte[] data = FileUtils.readFileToByteArray(new File("myfile"));
You can read it as string or character array...
Now, you are handing big files, or changing parts of a file directly on the filesystem, then the best it to use a RandomAccessFile and potentially even a FileChannel to do "nio" style.
Using BufferedReader
BufferedReader reader;
char[] buffer = new char[10];
reader = new BufferedReader(new FileReader("FILE_PATH"));
//or
reader = Files.newBufferedReader(Path.get("FILE_PATH"));
while (reader.read(buffer) != -1) {
System.out.print(new String(buffer));
buffer = new char[10];
}
//or
while (buffReader.ready()) {
System.out.println(
buffReader.readLine());
}
reader.close();
Using FileInputStream-Read Binary Files to Bytes
FileInputStream fis;
byte[] buffer = new byte[10];
fis = new FileInputStream("FILE_PATH");
//or
fis=Files.newInoutSream(Paths.get("FILE_PATH"))
while (fis.read(buffer) != -1) {
System.out.print(new String(buffer));
buffer = new byte[10];
}
fis.close();
Using Files– Read Small File to List of Strings
List<String> allLines = Files.readAllLines(Paths.get("FILE_PATH"));
for (String line : allLines) {
System.out.println(line);
}
Using Scanner – Read Text File as Iterator
Scanner scanner = new Scanner(new File("FILE_PATH"));
while (scanner.hasNextLine()) {
System.out.println(scanner.nextLine());
}
scanner.close();
Using RandomAccessFile-Reading Files in Read-Only Mode
RandomAccessFile file = new RandomAccessFile("FILE_PATH", "r");
String str;
while ((str = file.readLine()) != null) {
System.out.println(str);
}
file.close();
Using Files.lines-Reading lines as stream
Stream<String> lines = Files.lines(Paths.get("FILE_PATH") .forEach(s -> System.out.println(s));
Using FileChannel-for increasing performance by using off-heap memory furthermore using MappedByteBuffer
FileInputStream i = new FileInputStream(("FILE_PATH");
ReadableByteChannel r = i.getChannel();
ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024);
while (r.read(buffer) != -1) {
buffer.flip();
while (buffer.hasRemaining()) {
System.out.print((char) buffer.get());
}
buffer.clear();
}

writing to file: the difference between stream and writer

Hi I have a bit of confusion about the stream to use to write in a text file
I had seen some example:
one use the PrintWriter stream
PrintWriter out = new PrintWriter(new BufferedWriter(new FileWriter(fname)));
out.println(/*something to write*/);
out.close();
this instead use:
PrintStream out = new PrintStream(new FileOutputStream(fname));
out.println(/*something to write*/)
but which is the difference?both write in a file with the same result?
PrintWriter is new as of Java 1.1; it is more capable than the PrintStream class. You should use PrintWriter instead of PrintStream because it uses the default encoding scheme to convert characters to bytes for an underlying OutputStream. The constructors for PrintStream are deprecated in Java 1.1. In fact, the whole class probably would have been deprecated, except that it would have generated a lot of compilation warnings for code that uses System.out and System.err.
PrintWriter is for writing text, whereas PrintStream is for writing data - raw bytes. PrintWriter may change the encoding of the bytes to make handling text easier, so it might corrupt your data.
PrintWriter extends the class Writer, a class thinked to write characters, while PrintStream implements OutputStream, an interface for more generic outputs.

PrintWriter vs FileWriter in Java

Are PrintWriter and FileWriter in Java the same and no matter which one to use? So far I have used both because their results are the same. Is there some special cases where it makes sense to prefer one over the other?
public static void main(String[] args) {
File fpw = new File("printwriter.txt");
File fwp = new File("filewriter.txt");
try {
PrintWriter pw = new PrintWriter(fpw);
FileWriter fw = new FileWriter(fwp);
pw.write("printwriter text\r\n");
fw.write("filewriter text\r\n");
pw.close();
fw.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
According to coderanch.com, if we combine the answers we get:
FileWriter is the character representation of IO. That means it can be used to write characters. Internally FileWriter would use the default character set of the underlying OS and convert the characters to bytes and write it to the disk.
PrintWriter & FileWriter.
Similarities
Both extend from Writer.
Both are character representation classes, that means they work with characters and convert them to bytes using default charset.
Differences
FileWriter throws IOException in case of any IO failure, this is a checked exception.
None of the PrintWriter methods throw IOExceptions, instead they set a boolean flag which can be obtained using checkError().
PrintWriter has an optional constructor you may use to enable auto-flushing when specific methods are called. No such option exists in FileWriter.
When writing to files, FileWriter has an optional constructor which allows it to append to the existing file when the "write()" method is called.
Difference between PrintStream and OutputStream: Similar to the explanation above, just replace character with byte.
PrintWriter has following methods :
close()
flush()
format()
printf()
print()
println()
write()
and constructors are :
File (as of Java 5)
String (as of Java 5)
OutputStream
Writer
while FileWriter having following methods :
close()
flush()
write()
and constructors are :
File
String
Link: http://www.coderanch.com/t/418148/java-programmer-SCJP/certification/Information-PrintWriter-FileWriter
Both of these use a FileOutputStream internally:
public PrintWriter(File file) throws FileNotFoundException {
this(new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file))),
false);
}
public FileWriter(File file) throws IOException {
super(new FileOutputStream(file));
}
but the main difference is that PrintWriter offers special methods:
Prints formatted representations of
objects to a text-output stream. This
class implements all of the print
methods found in PrintStream. It does
not contain methods for writing raw
bytes, for which a program should use
unencoded byte streams.
Unlike the PrintStream class, if
automatic flushing is enabled it will
be done only when one of the println,
printf, or format methods is invoked,
rather than whenever a newline
character happens to be output. These
methods use the platform's own notion
of line separator rather than the
newline character.
A PrintWriter has a different concept of error handling. You need to call checkError() instead of using try/catch blocks.
PrintWriter doesn't throw IOException.You should call checkError() method for this purpose.
The java.io.PrintWriter in Java5+ allowed for a convenience method/constructor that writes to file.
From the Javadoc;
Creates a new PrintWriter, without automatic line flushing, with the specified file. This convenience constructor creates the necessary intermediate OutputStreamWriter, which will encode characters using the default charset for this instance of the Java virtual machine.
just to provide more info related to FLUSH and Close menthod related to FileOutputStream
flush() ---just makes sure that any buffered data is written to disk flushed compltely and ready to write again to the stream (or writer) afterwards.
close() ----flushes the data and closes any file handles, sockets or whatever.Now connection has been lost and you can't write anything to outputStream.
The fact that java.io.FileWriter relies on the default character encoding of the platform makes it rather useless to me. You should never assume something about the environment where your app will be deployed.

Categories

Resources