In short I need to do two things with one stream.
I need to pass a stream through a method to see if the bytes of that stream are of a particular type.
I need to create a new class using that stream once that check is completed.
I'm very new to streams and I know that they are "one way streets." So I think I have a bad design in my code or something if I find myself needing to reuse a stream.
Here is a snippet of the logic:
byte[] header = new byte[1024];
bis.mark(header.length);
// read() fills at most header.length bytes and may return fewer, even before EOF
bis.read(header);
if (isFileType(header)) {
    bis.reset();
    _data.put(fileName, new MyClass(bis)); // Stream is now closed...
    methodForFinalBytes(bis);
} else {
    // Do other stuff;
}
It depends entirely on whether the InputStream implementation supports mark(). See http://docs.oracle.com/javase/6/docs/api/java/io/InputStream.html#markSupported(). Calling reset() on a stream that doesn't support mark() may throw an exception.
BufferedInputStream and ByteArrayInputStream support mark(), but many others (FileInputStream, for example) do not.
Generally, you can't reset an InputStream to get back to the start. There are, however, the mark() / reset() methods: mark() makes the stream remember the current position, and reset() rewinds the stream to that marked position.
The problem is that they are optional and may not be supported by the particular stream class in use. BufferedInputStream does support mark() / reset(), up to the readlimit you pass to mark(). You can wrap your InputStream in a BufferedInputStream, immediately call mark(), and then run your detection code (but make sure it does not read further than that readlimit; the stream enlarges its internal buffer up to the limit if needed, and you can also set the initial buffer size in the BufferedInputStream constructor). Then call reset() and really read the stream.
EDIT: If you use ByteArrayInputStream anyway, that one supports mark/reset over its entire length (naturally).
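The wrap, mark, detect, reset sequence described above might look roughly like the sketch below. The class name HeaderSniffDemo, the sniff() helper and the "%PDF" check inside isFileType() are all made up for illustration; only mark(), read() and reset() come from the JDK.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class HeaderSniffDemo {
    static final int HEADER_SIZE = 1024;

    // Hypothetical detection rule: the data starts with the ASCII bytes "%PDF".
    static boolean isFileType(byte[] header, int length) {
        return length >= 4 && header[0] == '%' && header[1] == 'P'
                && header[2] == 'D' && header[3] == 'F';
    }

    // Peeks at up to HEADER_SIZE bytes, then rewinds so the caller can read from the start.
    static boolean sniff(BufferedInputStream bis) throws IOException {
        byte[] header = new byte[HEADER_SIZE];
        bis.mark(HEADER_SIZE); // we promise not to read more than this before reset()
        int total = 0;
        while (total < header.length) { // a single read() may return fewer bytes than asked
            int n = bis.read(header, total, header.length - total);
            if (n < 0) {
                break; // EOF before the header buffer was full
            }
            total += n;
        }
        bis.reset(); // back to the marked position
        return isFileType(header, total);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.7 body".getBytes(StandardCharsets.US_ASCII);
        BufferedInputStream bis = new BufferedInputStream(new ByteArrayInputStream(data));
        System.out.println("matches: " + sniff(bis));
        System.out.println("first byte after reset: " + (char) bis.read());
    }
}
```

After sniff() returns, the same BufferedInputStream can be handed on to whatever consumes the full content, as long as nothing has closed it in between.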
public void mark(int readAheadLimit)
throws IOException
The FilterReader class in the java.io package has a mark method that marks the current position. What is the use of the parameter in that method?
JavaDocs - Limit on the number of characters that may be read while still preserving the mark. After reading this many characters, attempting to reset the stream may fail.
What does that mean? Examples and explanations of a failing reset, and of what the parameter does, would be appreciated!
There is a better explanation in InputStream.mark. A reader that supports mark() should mirror this behavior, e.g. by delegating it to the underlying InputStream:
Marks the current position in this input stream. A subsequent call to the reset method repositions this stream at the last marked position so that subsequent reads re-read the same bytes.
The readlimit arguments tells this input stream to allow that many bytes to be read before the mark position gets invalidated.
The general contract of mark is that, if the method markSupported returns true, the stream somehow remembers all the bytes read after the call to mark and stands ready to supply those same bytes again if and whenever the method reset is called. However, the stream is not required to remember any data at all if more than readlimit bytes are read from the stream before reset is called.
Marking a closed stream should not have any effect on the stream.
The mark method of InputStream does nothing.
So, the parameter tells the mark() method how large the buffer for remembering elements needs to be. This allows it to allocate a buffer of appropriate size, if needed.
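To see a reset fail, here is a small sketch using BufferedInputStream with a deliberately tiny buffer. The class and method names are made up. The contract only says reset() may fail once more than readlimit bytes have been read, so the exact point of failure is an implementation detail; with the JDK's BufferedInputStream, the mark is dropped when the buffer is full and is already at least readlimit bytes long.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

class ReadLimitDemo {
    // Marks the stream, reads bytesToRead bytes one at a time, then tries to reset.
    // Returns true if reset() succeeded and false if it threw an IOException.
    static boolean resetSucceedsAfterReading(int bufferSize, int readLimit, int bytesToRead)
            throws IOException {
        byte[] data = new byte[64];
        try (BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(data), bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < bytesToRead; i++) {
                in.read();
            }
            try {
                in.reset();
                return true;
            } catch (IOException markInvalidated) {
                return false; // the stream no longer remembers the marked position
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stays within the limit: the mark is still valid.
        System.out.println(resetSucceedsAfterReading(4, 4, 3));
        // Reads far past the limit: the mark has been discarded.
        System.out.println(resetSucceedsAfterReading(4, 4, 20));
    }
}
```

A ByteArrayInputStream, by contrast, keeps all of its data in memory anyway and ignores the readlimit argument entirely.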
I'm using an ObjectInputStream to call readObject for reading in serialized Objects. I would like to avoid having this method block, so I'm looking to use something like InputStream.available().
InputStream.available() will tell you there are bytes available and that read() will not block. Is there an equivalent method for serialization that will tell you if there are Objects available and readObject will not block?
No. Although you could use the ObjectInputStream in another thread and check to see whether that has an object available. Generally polling isn't a great idea, particularly with the poor guarantees of InputStream.available.
The Java serialization API was not designed to support an available() function. If you implement your own object reader/writer functions, you can read any amount of data off the stream you like, and there is no reporting method.
So readObject() does not know how much data it will read, so it does not know how many objects are available.
As the other post suggested, your best bet is to move the reading into a separate thread.
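A minimal sketch of the separate-thread approach, assuming the objects are non-null (a BlockingQueue cannot hold null). BackgroundObjectReader, start(), poll(), awaitEnd() and sampleStream() are names invented for this example; the reader thread is the only place that blocks in readObject(), and the caller polls a queue instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BackgroundObjectReader {
    private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    private BackgroundObjectReader(ObjectInputStream in) {
        worker = new Thread(() -> drain(in), "object-reader");
        worker.setDaemon(true);
    }

    static BackgroundObjectReader start(ObjectInputStream in) {
        BackgroundObjectReader reader = new BackgroundObjectReader(in);
        reader.worker.start();
        return reader;
    }

    // Runs on the worker thread: blocks in readObject() so the caller never has to.
    private void drain(ObjectInputStream in) {
        try {
            while (true) {
                queue.put(in.readObject());
            }
        } catch (EOFException endOfStream) {
            // normal end of the stream, nothing more to read
        } catch (IOException | ClassNotFoundException e) {
            e.printStackTrace();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Non-blocking: returns the next object, or null if none has arrived yet.
    Object poll() {
        return queue.poll();
    }

    // Waits until the worker has reached the end of the stream (used by the demo only).
    void awaitEnd() throws InterruptedException {
        worker.join();
    }

    // Demo helper: serializes the given objects into memory and opens a stream over them.
    static ObjectInputStream sampleStream(Object... objects) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        }
        return new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    }

    public static void main(String[] args) throws Exception {
        BackgroundObjectReader reader = start(sampleStream("first", "second"));
        reader.awaitEnd();
        Object o;
        while ((o = reader.poll()) != null) {
            System.out.println(o);
        }
    }
}
```

With a socket instead of an in-memory stream, the caller would simply call poll() whenever convenient and carry on if it returns null.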
I have an idea that by adding another InputStream into the chain one can make availability information readable by the client:
HACK!
InputStream is = ... // where we actually read the data
BufferedInputStream bis = new BufferedInputStream(is);
ObjectInputStream ois = new ObjectInputStream(bis);
if (bis.available() > N) {
    Object o = ois.readObject();
}
The tricky point is the value of N. It should be big enough to cover both the serialization header and the object data. If those vary wildly, no luck.
The BufferedInputStream works for me. Why not just check if (bis.available() > 0) instead of comparing against N? That works perfectly for me.
I think ObjectInputStream.readObject blocks (that is, waits) when there is no input to be read. So if there is any input at all in the stream, i.e. if (bis.available() > 0), ObjectInputStream.readObject will not block. Keep in mind that ObjectInputStream.readObject might throw a ClassNotFoundException, but that isn't a problem at all for me.
I want to write ONLY the values of the data members of an object into a file, so I can't use serialization here, since it writes a whole lot of other information which I don't need. Here is what I have implemented in two ways: one using ByteBuffer and the other without it.
Without using ByteBuffer:
1st method
public class DemoSecond {
    byte characterData;
    byte shortData;
    byte[] integerData;
    byte[] stringData;

    public DemoSecond(byte characterData, byte shortData, byte[] integerData,
            byte[] stringData) {
        super();
        this.characterData = characterData;
        this.shortData = shortData;
        this.integerData = integerData;
        this.stringData = stringData;
    }

    public static void main(String[] args) {
        DemoSecond dClass = new DemoSecond((byte) 'c', (byte) 0x7, new byte[]{3, 4},
                new byte[]{(byte) 'p', (byte) 'e', (byte) 'n'});
        File checking = new File("c:/objectByteArray.dat");
        try {
            if (!checking.exists()) {
                checking.createNewFile();
            }
            // POINT A
            FileOutputStream bo = new FileOutputStream(checking);
            bo.write(dClass.characterData);
            bo.write(dClass.shortData);
            bo.write(dClass.integerData);
            bo.write(dClass.stringData);
            // POINT B
            bo.close();
        } catch (FileNotFoundException e) {
            System.out.println("FNF");
            e.printStackTrace();
        } catch (IOException e) {
            System.out.println("IOE");
            e.printStackTrace();
        }
    }
}
Using ByteBuffer: One more thing is that the size of the data members will always remain fixed, i.e. characterData = 1 byte, shortData = 1 byte, integerData = 2 bytes and stringData = 3 bytes. So the total size of this class is ALWAYS 7 bytes.
2nd method
// POINT A
FileOutputStream bo = new FileOutputStream(checking);
ByteBuffer buff= ByteBuffer.allocate(7);
buff.put(dClass.characterData);
buff.put(dClass.shortData);
buff.put(dClass.integerData);
buff.put(dClass.stringData);
bo.write(buff.array());
// POINT B
I want to know which of the two methods is more optimized. Kindly give the reason as well.
The above class DemoSecond is just a sample class.
My original classes will be 5 to 50 bytes in size, so I don't think size will be the issue here.
But each of my classes has a fixed size, like DemoSecond.
Also, there are a great many items of this type that I am going to write into the binary file.
PS
If I use serialization, it also writes the names "characterData", "shortData", "integerData" and "stringData", plus other information which I don't want in the file. What I am concerned with here is THEIR VALUES ONLY. In this example those are: 'c', 7, 3, 4, 'p', 'e', 'n'. I want to write only these 7 bytes into the file, NOT the other information, which is USELESS to me.
As you are doing file I/O, you should bear in mind that the I/O operations are likely to be very much slower than any work done by the CPU in your output code. To a first approximation, the cost of I/O is an amount proportional to the amount of data you are writing, plus a fixed cost for each operating system call made to do the I/O.
So in your case you want to minimise the number of operating system calls needed to do the writing. This is done by buffering data in the application, so that the application performs fewer but larger operating system calls.
Using a byte buffer, as you have done, is one way of doing this, so your ByteBuffer code will be more efficient than your FileOutputStream code.
But there are other considerations. Your example is not performing many writes, so it is likely to be very fast anyway. Any optimisation is likely to be a premature optimisation. Optimisations tend to make code more complicated and harder to understand. To understand your ByteBuffer code a reader needs to understand how a ByteBuffer works in addition to everything they need to understand for the FileOutputStream code. And if you ever change the file format, you are more likely to introduce a bug with the ByteBuffer code (for example, by having too small a buffer).
Buffering of output is commonly done. So it should not surprise you that Java already provides code to help you. That code will have been written by experts, tested and debugged. Unless you have special requirements you should always use such code rather than writing your own. The code I am referring to is the BufferedOutputStream class.
To use it simply adapt your code that does not use the ByteBuffer, by changing the line of your code that opens the file to
OutputStream bo = new BufferedOutputStream(new FileOutputStream(checking));
The two methods differ only in the byte buffer allocated.
If you are concerned about unnecessary write calls to the file, there is already a BufferedOutputStream you can use, which allocates its buffer internally. If you are writing to the same output stream multiple times, it is definitely more efficient than allocating a buffer manually every time.
It would be simplest to use a DataOutputStream around a BufferedOutputStream around the FileOutputStream.
NB You can't squeeze 'shortData' into a byte. Use the various primitives of DataOutputStream, and use the corresponding ones of DataInputStream when reading them back.
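A rough sketch of that layering follows. The record layout here (1-byte character, 2-byte short, 4-byte int, 3 raw text bytes, 10 bytes in total) is an assumption chosen for illustration and is not the asker's 7-byte layout; FixedRecordDemo and writeRecord() are made-up names. A ByteArrayOutputStream stands in for the FileOutputStream so the example has no file-system dependency.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class FixedRecordDemo {
    // Writes one record: 1 byte (char) + 2 bytes (short) + 4 bytes (int) + the raw text bytes.
    static void writeRecord(OutputStream sink, char c, short s, int i, byte[] text)
            throws IOException {
        DataOutputStream out = new DataOutputStream(new BufferedOutputStream(sink));
        out.writeByte(c);  // only the low 8 bits are written
        out.writeShort(s); // big-endian, 2 bytes
        out.writeInt(i);   // big-endian, 4 bytes
        out.write(text);   // raw bytes, no length prefix
        out.flush();       // push buffered bytes to the sink; the caller closes the sink
    }

    public static void main(String[] args) throws IOException {
        // In real code this would be a FileOutputStream.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        writeRecord(sink, 'c', (short) 7, 772, "pen".getBytes(StandardCharsets.US_ASCII));
        byte[] bytes = sink.toByteArray();
        System.out.println(bytes.length + " bytes written");

        // Read the values back with the matching DataInputStream calls.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        char c = (char) in.readUnsignedByte();
        short s = in.readShort();
        int i = in.readInt();
        byte[] text = new byte[3];
        in.readFully(text);
        System.out.println(c + " " + s + " " + i + " "
                + new String(text, StandardCharsets.US_ASCII));
    }
}
```

Only the field values end up in the output; no class or field names are written.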
I've been trying to figure out why a method I wrote to read objects from a file didn't work, and realized that the available() method of ObjectInputStream returned 0 even though the file hadn't been fully read.
The method did work once I used the FileInputStream available() method instead to detect EOF!
Why doesn't the method work for ObjectInputStream when it works for FileInputStream?
Here's the code:
public static void getArrFromFile() throws IOException, ClassNotFoundException {
    Product p;
    FileInputStream in = new FileInputStream(fName);
    ObjectInputStream input = new ObjectInputStream(in);
    while (in.available() > 0) {
        p = (Product) input.readObject();
        if (p.getPrice() > 3000)
            System.out.println(p);
    }
    input.close();
}
P.S-
I've read that I should use the EOF exception instead of available() for this, but I just wanna know why this doesn't work.
Thanks a lot!!!
Because, as the Javadoc says, available() returns an estimate of the number of bytes that can be read without blocking. The base InputStream implementation always returns 0, because that is a valid estimate. But whatever it returns, a result of 0 doesn't mean that there is nothing left to read, only that the stream can't guarantee that at least one byte can be read without blocking.
Although this is not documented clearly I have realized from experience that it has to do with dynamic data. If your class only contains statically typed data, then available() is able to estimate the size. If there are dynamic data in your object, like lists etc, then it is not possible to make that estimation.
The available() method just tells how many bytes can be read without blocking. It's not very useful in regular code, but people see the name and erroneously think it does something else.
So in short: don't use available(), it's not the right method to use. Streams indicate ending differently, such as returning -1 or in ObjectInputStream's case, throwing an EOFException.
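For the ObjectInputStream case, a loop that ends on EOFException could look like this sketch. ReadAllObjects, readAll() and writeAll() are names invented for the example, and it assumes the stream contains nothing but objects written with writeObject().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class ReadAllObjects {
    // Reads objects until the stream signals its end with an EOFException.
    static List<Object> readAll(InputStream raw) throws IOException, ClassNotFoundException {
        List<Object> result = new ArrayList<>();
        try (ObjectInputStream in = new ObjectInputStream(raw)) {
            while (true) {
                result.add(in.readObject());
            }
        } catch (EOFException endOfStream) {
            // expected: there are no more objects to read
        }
        return result;
    }

    // Demo helper: serializes the given objects into a byte array.
    static byte[] writeAll(Object... objects) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (Object o : objects) {
                out.writeObject(o);
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] data = writeAll("a", 2, "c");
        for (Object o : readAll(new ByteArrayInputStream(data))) {
            System.out.println(o);
        }
    }
}
```

Passing a FileInputStream to readAll() works the same way; no call to available() is needed.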
I found a trick! If you still want to use .available() to read all objects to the end, you can write an integer (e.g. out.writeInt(0)) before each object (e.g. out.writeObject(obj)) when you write the file, and read that integer back before reading each object. That way .available() can see the bytes left in the file and the loop won't crash. Hope it helps!
Use the available() method of the underlying InputStream instead of the ObjectInputStream's. Then, if there is any data, read it as an object.
Something like:
if (inputStreamObject.available() > 0) {
    Object anyName = objectInputStreamObject.readObject();
}
You can get the inputStreamObject directly from the Socket.
I used it and it solved the problem.
I’ve been reading on InputStream, FileInputStream, ByteArrayInputStream and how their use seems quite clear (output streams too).
What I’m struggling with is understanding the use of FilterInputStream & FilterOutputStream:
What is the advantage of using it compared to the other stream classes?
When should I use it?
Please provide a theoretical explanation and a basic example.
FilterInputStream is an example of the Decorator pattern.
This class must be extended, since its constructor is protected. The derived class would add additional capabilities, but still expose the basic interface of an InputStream.
For example, a BufferedInputStream provides buffering of an underlying input stream to make reading data faster, and a DigestInputStream computes a cryptographic hash of data as it's consumed.
You would use this to add functionality to existing code that depends on the InputStream or OutputStream API. For example, suppose that you use some library that saves data to an OutputStream. The data are growing too large, so you want to add compression. Instead of modifying the data persistence library, you can modify your application so that it "decorates" the stream that it currently creates with a ZipOutputStream. The library will use the stream just as it used the old version that lacked compression.
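As a sketch of that idea, the example below decorates a stream with GZIPOutputStream, which is swapped in here instead of ZipOutputStream because it needs no archive entries. saveReport() is a made-up stand-in for the library method that only knows about OutputStream; the other names are illustrative too.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CompressionDecoratorDemo {
    // Stand-in for the unmodified "library": it just writes to whatever stream it is given.
    static void saveReport(OutputStream out, String text) throws IOException {
        out.write(text.getBytes(StandardCharsets.UTF_8));
    }

    // The application decorates the real sink; the library code is untouched.
    static byte[] saveCompressed(String text) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream gz = new GZIPOutputStream(sink)) {
            saveReport(gz, text);
        } // closing the GZIP stream writes the trailer
        return sink.toByteArray();
    }

    // Reads the data back through the matching input decorator.
    static String load(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("same line again\n");
        }
        byte[] packed = saveCompressed(sb.toString());
        System.out.println(sb.length() + " chars stored in " + packed.length + " bytes");
        System.out.println("round trip ok: " + sb.toString().equals(load(packed)));
    }
}
```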
You use them when you want to decorate the stream of data.
Remember that these stream class instances wrap themselves around another stream instance (whether another subclass of one of these or not) and add some feature, add some processing, make some changes to the data as it passes through.
For example, you might want to remove all the multiple spaces from some stream. You make your own subclass of FilterInputStream and override the read() method. I'm not going to bother with all the details, but here's some sorta-Java for the method in the subclass:
private boolean lastWasBlank = false;

@Override
public int read() throws IOException {
    int chr = super.read();
    // Skip every space that directly follows another space.
    while (chr == ' ' && lastWasBlank) {
        chr = super.read();
    }
    lastWasBlank = (chr == ' ');
    return chr;
}
In real life, you would probably mess with the other two read() methods too.
Other uses:
Log everything flowing through the stream
Duplicate the 'tee' utility so the stream being read is handled two ways.
Convert line endings between Windows, Mac and Unix/Linux formats
Add delays to simulate slow transmission methods like modems or serial ports or wireless network connections.
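To make one of those uses concrete, here is a minimal sketch of a 'tee'-style filter. TeeInputStream is a name made up for this example, not a JDK class. It copies every byte it hands out into a second stream, and assumes the caller does not use mark()/reset(), which would make re-read bytes show up twice in the copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class TeeInputStream extends FilterInputStream {
    private final OutputStream copy;

    TeeInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b); // mirror the single byte
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates here, so overriding this covers both.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n); // mirror exactly the bytes that were read
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream log = new ByteArrayOutputStream();
        try (InputStream in = new TeeInputStream(new ByteArrayInputStream(data), log)) {
            byte[] buf = new byte[4];
            while (in.read(buf) != -1) {
                // the "real" consumer would process buf here
            }
        }
        System.out.println(new String(log.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Code that receives the TeeInputStream still sees an ordinary InputStream, which is the point of the decorator arrangement.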
FilterInputStream and FilterOutputStream are there to ease the job of developers who wish to implement their own input/output streams. Implementations such as BufferedInputStream may add their own decorations around the basic InputStream API while delegating to the superclass (FilterInputStream in this case) the methods they don't need to override.
Neither FilterInputStream nor FilterOutputStream is designed for end users to use directly.