Is it bad style to keep the references to streams "further down" a filter chain, and use those lower level streams again, or even to swap one type of stream for another? For example:
OutputStream os = new FileOutputStream("file");
PrintWriter pw = new PrintWriter(os);
pw.print("print writer stream");
pw.flush();
pw = null;
DataOutputStream dos = new DataOutputStream(os);
dos.writeBytes("dos writer stream");
dos.flush();
dos = null;
os.close();
If so, what are the alternatives if I need to use the functionality of both streams, e.g. if I want to write a few lines of text to a stream, followed by binary data, or vice versa?
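This can be done in some cases, but it's error-prone. You need to be careful about buffering (each wrapper may still hold data that has not been flushed to the underlying stream) and about anything a wrapper writes on its own, such as the stream header of ObjectOutputStream.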
"if I want to write a few lines of text to a stream, followed by binary data, or vice versa?"
For this, all you need to know is that you can convert text to binary data and back but always need to specify an encoding. However, it is also error-prone because people tend to use the API methods that use the platform default encoding, and of course you're basically implementing a parser for a custom binary file format - lots of things can go wrong there.
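As a rough sketch of that idea (not a recommended file format), the text can be encoded explicitly as UTF-8 and length-prefixed, so that a single DataOutputStream carries both the text and the binary value. The class name MixedRecordDemo and the record layout are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical layout, for illustration only: [int length][UTF-8 text bytes][long value]
class MixedRecordDemo {

    // Encode the text explicitly as UTF-8 and length-prefix it, so the reader
    // knows where the text ends and the binary data begins.
    static byte[] write(String text, long value) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(sink)) {
            byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
            out.writeInt(encoded.length);
            out.write(encoded);
            out.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Read the record back in the same order, decoding with the same charset.
    static Map.Entry<String, Long> read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            byte[] encoded = new byte[in.readInt()];
            in.readFully(encoded);
            String text = new String(encoded, StandardCharsets.UTF_8);
            long value = in.readLong();
            return Map.entry(text, value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because only one wrapper ever touches the underlying stream, there is no second buffer that could hold back or swallow data.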
All in all, if you're creating a file format, especially one that mixes text and binary data, it's best to use an existing framework such as Google Protocol Buffers.
If you have to do it, then you have to do it. So if you're dealing with an external dependency that you don't have control over, you just have to do it.
I think the bad style is the fact that you would need to do it. If you had to send binary data across sometimes, and text across at others, it would probably be best to have some kind of message object and send the object itself over the wire with Serialization. The data overhead isn't too much if structured properly.
I don't see why not. I mean, the implementations of the various stream classes should protect you from writing invalid data. So long as you're reading it back the same way, and your code is otherwise understandable, I don't see why that would be a problem.
Style doesn't always mean you have to do it the way you've seen others do it. So long as it's logical, and someone reading the code would see what (and why) you're doing it without you needing to write a bunch of comments, then I don't see what the issue is.
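Since you're flushing between the writes, it's probably fine. But it might be cleaner to use one OutputStream and write the strings with os.write(string.getBytes(StandardCharsets.UTF_8)); passing the charset explicitly avoids depending on the platform default encoding.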
Related
Good afternoon everyone,
First of all, this is only for personal use: small projects to improve my Java knowledge. My aim is to better understand how developers work with sockets and bytes, since that will help with my future ideas.
I'm currently writing a lightweight HTTP server in Java to understand how it works. I've been reading the documentation but still have trouble understanding parts of it. The main problem I'm facing, and I'd like to know whether the two are related, is that the Content-Length value is larger than the length of what I get from the BufferedReader. I suspect the cause is the way the BufferedReader decodes bytes into chars, which leaves it with fewer chars than there were bytes. If so, I should probably treat that part as binary and read the bytes from the InputStream directly, but that is where the real problem starts.
A Reader reads a chunk of bytes ahead and keeps it as its buffer, so that data has already been consumed from the InputStream and is no longer available on the stream; calling read() on the stream then returns -1 because there are no more bytes to read. A multipart body is divided into parts separated by a boundary, and within each part a newline delimits the header information from the content. I still need the header information as a String to process it, but the content should be handled as binary data. Without changing the buffer length, which would require knowing in advance exactly how many bytes the header information occupies, the content will most likely end up in the BufferedReader's buffer. Is it possible to recover the binary content from the data the BufferedReader has already processed, or should I find a way to read that content as binary before it is processed?
As I said, I'm new to working with sockets and services, so I don't know exactly what the options are or whether this is a different kind of issue. Any help would be appreciated. Thank you in advance.
The following answer from Remy Lebeau, found in the comments, was useful to me:
Since multipart data is both textual and binary, you are going to have to do your own buffering of the socket data so you have more control and know where the data switches back and forth. At the very least, since you can read binary data directly from a BufferedInputStream, and access its internal buffer, you can let it handle the actual buffering for you, and it is not difficult to write a custom readLine() method that can read a line of text from a BufferedInputStream without using BufferedReader.
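A minimal sketch of such a readLine(), assuming header lines end in LF or CRLF and can be decoded as ISO-8859-1 (both are assumptions for this example, as are the names ByteLineReader and readBody). It reads the text lines and the binary body from the same stream, so no Reader buffer can swallow the body bytes.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ByteLineReader {

    // Reads bytes up to the next '\n' and returns them decoded as ISO-8859-1,
    // without the trailing LF or CRLF. Returns null at end of stream.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                break;
            }
            line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            return null;
        }
        String text = new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
        return text.endsWith("\r") ? text.substring(0, text.length() - 1) : text;
    }

    // Hypothetical demo: header lines, a blank line, then a binary body whose
    // size is taken from a Content-Length header.
    static byte[] readBody(byte[] message) {
        String name = "content-length:";
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(message))) {
            int length = 0;
            String line;
            while ((line = readLine(in)) != null && !line.isEmpty()) {
                if (line.regionMatches(true, 0, name, 0, name.length())) {
                    length = Integer.parseInt(line.substring(name.length()).trim());
                }
            }
            // The body is read as raw bytes from the same stream the lines came from.
            byte[] body = new byte[length];
            int read = in.readNBytes(body, 0, length);
            return Arrays.copyOf(body, read);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a real socket, the ByteArrayInputStream would be replaced by the socket's input stream; the BufferedInputStream keeps the byte-at-a-time reads cheap.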
It seems to me that InputStream and OutputStream are ambiguous names for I/O.
InputStream can be thought of as "to input into a stream", and OutputStream can be thought of as "get output of a stream".
After all, we read from an "input" stream, but shouldn't you be reading from an "output"?
What was the rationale behind choosing these two names and what is a good way to remember Input/Output stream without confusing one for the other?
The streams are named not for how you use them inside your code but for what they accomplish. An InputStream accomplishes reading input from somewhere outside your program (the console, a file, etc.), whereas an OutputStream accomplishes writing an output to somewhere else (again, console, file, etc.). Your Java code is only the intermediary in this scenario: In order to make use of the input, you have to read it from the stream, and in order to produce an output, you first have to write something to the stream.
The naming is only confusing because every stream, by design, has something going in at one end and something coming out at the other. All you have to remember is that streams are named for the more important task they perform: interacting with something outside your code.
Think of your program/code as the Actor.
When the Actor wants to read something in, it obtains a handle to an InputStream, because that is the stream that provides the input. Hence you read from it.
When the Actor wants to write something out, it obtains a handle to an OutputStream and writes to that handle, which does the rest. Likewise, you write to it.
I hope this answers the question. I visualize my code as the classic stick-figure Actor, and InputStream and OutputStream as the entities it interacts with.
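To make the direction concrete, here is a small sketch in which the program sits in the middle: it reads from an InputStream and writes to an OutputStream. The class and method names are made up for the example, and the in-memory streams stand in for a file, console, or socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class StreamDirectionDemo {

    // Data flows INTO the program from the InputStream (we read from it)
    // and OUT of the program to the OutputStream (we write to it).
    static long copy(InputStream source, OutputStream target) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = source.read(buffer)) != -1) {
            target.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // In-memory stand-ins for "somewhere outside the program".
    static byte[] roundTrip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new ByteArrayInputStream(input)) {
            copy(in, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```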
While reading the Java Tutorials, I saw that the Basic I/O topic says to use InputStreamReader and OutputStreamWriter when there are no prepackaged character stream classes.
1) What are prepackaged character stream classes?
Does it mean that a file already has some text in it?
The term is quite vague and doesn't really seem to be defined anywhere, so good question.
As best I understand it, it means things like FileReader, FileWriter, StringReader and so on: classes that have wrapped up a particular kind of source or destination for you and provide the functionality required to work with it as characters.
Note that these classes work with characters, not bytes, and that is generally what you want in Java for dealing with String data in files. If, on the other hand, your source only gives you a byte stream (an InputStream), then the data will come in as bytes and you can use InputStreamReader to convert those bytes to characters.
So a prepackaged stream reader is one that already provides you the data pre-packaged in the form that you want it.
I believe it to mean classes which inherit Reader or Writer. Such classes "wrap" byte streams so as to convert them automatically to character streams. Example: FileReader, FileWriter; they can read text from files directly.
If no such classes exist for your particular stream needs but you know what you get out of it/put into it is text, then you must use these two wrapper classes.
Classical example: HTML. It is text, but what you get from sockets is byte streams; if you want to read it as HTML, use a Reader (with the correct encoding!) over the socket stream (but of course, many APIs today don't require you to do that).
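A minimal sketch of that wrapping, with a ByteArrayInputStream standing in for the socket's stream so the example is self-contained; BridgeDemo and firstLine are names invented for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

class BridgeDemo {

    // Wraps any byte source (for example a socket's input stream) in a Reader
    // that decodes the bytes with an explicitly chosen charset.
    static String firstLine(InputStream bytes, Charset charset) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(bytes, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The same pattern works in the other direction with new OutputStreamWriter(outputStream, charset).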
While reading about Java file I/O, I learned that there is more than one alternative for input and output operations.
These are:
BufferedReader and BufferedWriter
FileReader and FileWriter
FileInputStream and FileOutputStream
InputStreamReader and OutputStreamWriter
Scanner class
Which of these is the best alternative for handling text files? Which is the best alternative for serialization? What does Java NIO have to say about it?
Two kinds of data
Generally speaking there are two "worlds":
binary data
text data
When it's a file (or a socket, or a BLOB in a DB, or ...), then it's always binary data first.
Some of that binary data can be treated as text data (which involves something called an "encoding" or "character encoding").
Binary Data
Whenever you want to handle the binary data then you need to use the InputStream/OutputStream classes (generally, everything that contains Stream in its name).
That's why there's a FileInputStream and a FileOutputStream: those read from and write to files and they handle binary data.
Text Data
Whenever you want to handle text data, then you need to use the Reader/Writer classes.
Whenever you need to convert binary data to text (or vice versa), then you need some kind of encoding (common ones are UTF-8, UTF-16, ISO-8859-1 (and related ones) and the good old US-ASCII). "Luckily" the Java platform also has something called the "default platform encoding" which it will use whenever it needs one but the code doesn't specify one.
The platform default encoding is a double-edged sword, however:
it makes writing code easier, because you don't have to specify an encoding for each operation but
it might not match the data you have: If the platform-default encoding is ISO-8859-1 and the file you read is actually UTF-8, then you will get a scrambled output!
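The mismatch described above can be reproduced without any file by encoding a string with one charset and decoding the bytes with another. This is only an illustration; the class name is made up.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {

    // Encodes the text with one charset and decodes the resulting bytes with another,
    // which is what happens when a file is read with the wrong encoding.
    static String decode(String original, Charset writtenAs, Charset readAs) {
        byte[] bytes = original.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // the single character e-acute
        // Same charset on both sides: the text survives.
        System.out.println(text.equals(decode(text, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));
        // UTF-8 bytes read as ISO-8859-1: one character turns into two wrong ones.
        System.out.println(decode(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1).length());
    }
}
```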
For reading, we should also mention the BufferedReader which can be wrapped around any other Reader and adds the ability to handle whole lines at once.
Scanner is a special class that's meant to parse text input into tokens. It's most useful for structured text but often used on System.in to provide a very simple way to read data from stdin (i.e. from what the user inputs on the keyboard).
Bridging the gap
Now, confusingly enough there are classes that make the bridge between those worlds, which generally have both parts in their names:
an InputStreamReader consumes an InputStream and is itself a Reader.
an OutputStreamWriter is a Writer and writes to an OutputStream.
And then there are "shortcut classes" that basically combine two other classes that are often combined.
a FileReader is basically a combination of a FileInputStream with an InputStreamReader
a FileWriter is basically a combination of a FileOutputStream with an OutputStreamWriter
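Note that FileReader and FileWriter have a major drawback compared to their more complicated "hand-built" alternative: before Java 11 they always used the platform default encoding, and they still do unless you pass a Charset to one of the constructors added in Java 11, which might not be what you're trying to do!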
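For comparison, a sketch of the "hand-built" combination with an explicit encoding. UTF-8 is just the choice made for this example, and the class name and temp-file handling are illustrative.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class ExplicitEncodingFileDemo {

    // FileOutputStream handles the bytes, OutputStreamWriter does the encoding.
    static void write(File file, String text) throws IOException {
        try (Writer writer = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            writer.write(text);
        }
    }

    // FileInputStream handles the bytes, InputStreamReader does the decoding.
    static String readFirstLine(File file) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    // Writes the text to a temporary file and reads it back with the same charset.
    static String roundTrip(String text) {
        try {
            File file = File.createTempFile("encoding-demo", ".txt");
            try {
                write(file, text);
                return readFirstLine(file);
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```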
What about serialization?
ObjectOutputStream and ObjectInputStream are special streams used for serialization.
As the name of the classes implies serializing involves only binary data (even if serializing String objects), so you'll want to use *Stream classes exclusively. As long as you avoid any Reader/Writer classes, you should be fine.
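A small sketch of serialization using only *Stream classes, writing to a byte array instead of a file to keep it self-contained. The Score class is a made-up example type; any Serializable class would do.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationDemo {

    // Hypothetical example type, for illustration only.
    static class Score implements Serializable {
        private static final long serialVersionUID = 1L;
        final String player;
        final int points;

        Score(String player, int points) {
            this.player = player;
            this.points = points;
        }
    }

    // Serializes the object into binary data; no Reader/Writer is involved.
    static byte[] serialize(Serializable object) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    // Rebuilds the object from the binary data produced by serialize().
    static Object deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Even though Score contains a String, the result is binary: the data starts with the serialization stream header rather than readable text.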
Further resources
the Basic I/O trail.
Joel's old-ish article on Unicode (good introduction, slightly light on technical detail)
On the evils of platform default encoding (also this)
System.out.println("Java is awesome!");
Pardon my enthusiasm; I just can't believe how powerful Java is, what with its ability to not only save objects (and load them), but also with its main purpose, to send them over a network. This is exactly what I must do, for I am conducting a beta-test. In this beta-test, I have given the testers a version of the game that saves the data as Objects in a location most people don't know about (we are the enlightened ones hahaha). This would work fine and dandy, except that it isn't meant for long-term persistence. But, I could collect their record.ser and counter.bin files (the latter tells me how many Objects are in record.ser) via some client/server interaction with sockets (which I know nothing about, until I started reading about it, but I still feel clueless). Most of the examples I have seen online (this one for example: http://uisurumadushanka89.blogspot.com/2010/08/send-file-via-sockets-in-java.html ) were sending the File as a stream of bytes, namely some ObjectOutputStream and ObjectInputStream. This is exactly what my current version of the game is using to save/load GameData.
Sorry for this long-winded intro, but do you know what I would have to do (step by step, so I can UNDERSTAND) to actually send the whole file? Would I have to reconstruct the file byte-by-byte (or Object-by-Object)?
It's pretty simple, actually. Just make your objects serializable, and create an ObjectOutputStream and an ObjectInputStream that are connected to whatever underlying stream you have, say a FileOutputStream and a FileInputStream. Then just call writeObject() with whatever object you want to send and call readObject() on the other side.
Here's an example for you.
For sockets it will be something like
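// Server side: connection is the Socket returned by ServerSocket.accept() (a ServerSocket itself has no streams).
ObjectOutputStream objectOut = new ObjectOutputStream(connection.getOutputStream());
// Client side: clientSocket is the Socket that connected to the server.
ObjectInputStream objectIn = new ObjectInputStream(clientSocket.getInputStream());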
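To see the whole round trip in one place, here is a rough sketch that runs both ends in the same JVM over the loopback interface. The class and method names are invented for the example; a real client and server would be separate programs and would need error handling and a protocol for more than one object.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CompletableFuture;

class ObjectSocketDemo {

    // Starts a receiver on a free loopback port, sends one object to it from a
    // client socket in the same JVM, and returns what the receiver read.
    static Object sendAndReceive(Serializable payload) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            CompletableFuture<Object> received = CompletableFuture.supplyAsync(() -> {
                // accept() yields the Socket for this connection; the streams come from it.
                try (Socket connection = server.accept();
                     ObjectInputStream in = new ObjectInputStream(connection.getInputStream())) {
                    return in.readObject();
                } catch (IOException | ClassNotFoundException e) {
                    throw new IllegalStateException(e);
                }
            });
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 ObjectOutputStream out = new ObjectOutputStream(client.getOutputStream())) {
                out.writeObject(payload);
                out.flush(); // make sure the bytes leave the stream's buffer
            }
            return received.join();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Here the client writes and the server reads; the direction can just as well be reversed, as long as each ObjectOutputStream is paired with an ObjectInputStream on the other end of the same connection.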
Java Serialization is an immensely powerful protocol. java.io.ObjectOutputStream and java.io.ObjectInputStream are the higher-level classes, which of course wrap lower-level streams such as FileOutputStream and FileInputStream. My question is: why do you wish to read the file byte by byte when the entire file can be read as Objects?
Here is a good explanation of the procedure.
http://www.tutorialspoint.com/java/java_serialization.html