I have to parse xml from the content of a Jetty buffer using SAX.
From my ContentExchange I can call getRequestContent, and then I get a Buffer
I need an InputStream, an InputSource, a String, or a File in order to parse it with SAX. How can I convert the Buffer to one of those, and which way is the most efficient?
It looks like something obvious, but I cannot find any information in the documentation.
Apologies for answering an old question, but someone (such as myself) may stumble upon this in the future.
Jetty's Buffer class implements a writeTo(OutputStream) method. A simple solution would be to do the following:
PipedInputStream is = new PipedInputStream();
PipedOutputStream os = new PipedOutputStream(is);
Then for each Buffer received, do:
void processBuffer(Buffer buf) throws IOException {
buf.writeTo(os);
}
This way you can stream responses without need for caching them.
EDIT:
Of course, make sure that processBuffer() and readers of the PipedInputStream are running in separate threads to avoid potential deadlock.
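A minimal sketch of that two-thread arrangement, assuming plain byte[] chunks as a stand-in for Jetty's Buffer (the class and method names here are hypothetical). The writer thread fills the pipe and closes it when done, while SAX reads from the other end on the calling thread:

```java
import java.io.IOException;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class PipedSaxSketch {
    /** Feeds the chunks through a pipe and returns how many XML elements SAX saw. */
    static int countElements(byte[][] chunks) throws Exception {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        // Writer thread: stands in for the code that receives each Buffer.
        Thread writer = new Thread(() -> {
            try (PipedOutputStream o = out) {   // closing signals end of stream to the reader
                for (byte[] chunk : chunks) {
                    o.write(chunk);             // with Jetty this would be buf.writeTo(o)
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        });
        writer.start();

        // Reader: SAX pulls from the pipe on the calling thread.
        AtomicInteger count = new AtomicInteger();
        SAXParserFactory.newInstance().newSAXParser().parse(in, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count.incrementAndGet();
            }
        });
        writer.join();
        return count.get();
    }
}
```

Closing the PipedOutputStream matters: without it the parser never sees end of stream and keeps waiting.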
Perhaps you could wrap the buffer in your own custom (anonymous?) InputStream since you only need to implement the read() method. For example:
public InputStream forBuffer(final Buffer buf) {
return new InputStream() {
@Override
public int read() /* throws IOException */ {
return buf.get();
}
};
}
From the Jetty docs it's hard to tell what happens when the Buffer#get() method hits the end but some simple testing should reveal it (and if it happens to return -1 then this example is complete!).
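For comparison, here is the same idea sketched against java.nio.ByteBuffer, used purely as a stand-in because its end-of-data behaviour is documented (get() throws BufferUnderflowException when nothing remains). Whether Jetty's Buffer behaves the same way is not confirmed here. The sketch checks for remaining data explicitly and masks the byte, since read() must return a value in 0..255 or -1:

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

class BufferStreams {
    /** Wraps a ByteBuffer (a stand-in for Jetty's Buffer) as an InputStream. */
    static InputStream forBuffer(final ByteBuffer buf) {
        return new InputStream() {
            @Override
            public int read() {
                if (!buf.hasRemaining()) {
                    return -1;            // explicit end-of-stream signal
                }
                return buf.get() & 0xFF;  // mask so bytes >= 0x80 do not come out negative
            }
        };
    }
}
```

Without the mask, a byte such as 0xFF would be returned as -1 and the caller would mistake it for end of stream.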
Related
Is there any way to convert an OutputStream into an InputStream?
So the following would work
InputStream convertOStoIS(OutputStream os) {
}
I do not want to use any libraries, I read that there are some who are able to accomplish this with bytecode manipulation.
Edit
I want to be able to intercept a sink, to analyze the data or redirect the output. I want to place another OutputStream under the one given by some function and redirect the data into another input stream.
The related topics had a ByteArrayOutputStream or a PipedStream which is not the case in my question.
Related:
How to convert OutputStream to InputStream?
Most efficient way to create InputStream from OutputStream
Use a java.io.FilterOutputStream to wrap the existing OutputStream. By overriding the write() method you can intercept output and do whatever you want with it, either send it somewhere else, modify it, or discard it completely.
As to your second question, you cannot change the sink of an OutputStream after the fact, i.e. cause previously written data to "move" somewhere else, but using a FilterOutputStream you can intercept and redirect any data written after you wrap the original OutputStream.
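A minimal sketch of such a wrapper, assuming you want every byte to reach the original sink and also a second stream for analysis (CopyingOutputStream is a hypothetical name):

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class CopyingOutputStream extends FilterOutputStream {
    private final OutputStream copy;

    CopyingOutputStream(OutputStream original, OutputStream copy) {
        super(original);
        this.copy = copy;
    }

    @Override
    public void write(int b) throws IOException {
        out.write(b);   // pass the byte on to the original sink
        copy.write(b);  // and record it in the second stream
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        out.write(b, off, len);  // bulk path, avoids the byte-at-a-time default
        copy.write(b, off, len);
    }
}
```

Hand the wrapper to whatever code expected the original OutputStream. Only data written through the wrapper is seen by the second stream.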
To answer my own question, yes you can build a redirect like this:
class OutInInputRedirect {
public final transient InputStream is;
public final transient OutputStream os;
public OutInInputRedirect() throws IOException {
this(1024);
}
public OutInInputRedirect(int size) throws IOException {
PipedInputStream is = new PipedInputStream(size);
PipedOutputStream os = new PipedOutputStream(is);
this.is = is;
this.os = os;
}
}
Just use the OutputStream as a replacement and the InputStream in those places where you need it. Be aware that closing the OutputStream also ends the InputStream!
It is quite easy and works as expected. Either way you cannot change an already connected stream (without reflection).
I've been reading up a lot on Iteratees & Enumerators in order to implement a new module in my application.
I'm now at a point where I'm integrating with a 3rd party Java library, and am stuck at working with this method:
public Email addAttachment(String name, InputStream file) throws IOException {
this.attachments.put(name, file);
return this;
}
What I have in my API is the body returned from a WS HTTP call that is an Enumerator[Array[Byte]].
I am wondering now how to write an Iteratee that would process the chunks of Array[Byte] and create an InputStream to use in this method.
(Side bar): There are other versions of the addAttachment method that take java.io.File however I want to avoid writing to the disk in this operation, and would rather deal with streams.
I attempted to start by writing something like this:
Iteratee.foreach[Array[Byte]] { bytes =>
???
}
However I'm not sure how to interact with the java InputStream here. I found something called a ByteArrayInputStream however that takes the entire Array[Byte] in its constructor, which I'm not sure would work in this scenario as I'm working with chunks ?
I probably need some Java help here!
Thanks for any help in advance.
If I'm following you, I think you want to work with PipedInputStream and PipedOutputStream:
https://docs.oracle.com/javase/8/docs/api/java/io/PipedInputStream.html
You always use them in pairs. You can construct the pair like so:
PipedInputStream in = new PipedInputStream(); //can also specify a buffer size
PipedOutputStream out = new PipedOutputStream(in);
Pass the input stream to the API, and in your own code iterate through your chunks and write your bytes.
The only caveat is that you need to read and write in separate threads. In your case, it's probably best to do your iterating and writing in a separate thread. I'm sure you can handle it in Scala better than I can; in Java it would be something like:
PipedInputStream in = new PipedInputStream(); //can also specify a buffer size
PipedOutputStream out = new PipedOutputStream(in);
new Thread(() -> {
// do your looping in here, write to 'out'
out.close();
}).start(); // start(), not run(): run() would execute on the calling thread
email.addAttachment(name, in);
email.send();
in.close();
(Leaving out exception handling & resource handling for clarity)
I have this code on the client side :
DataInputStream dis = new DataInputStream(socketChannel.socket().getInputStream());
while(dis.available() > 0){
SomeOtherClass.method(dis);
}
But available() keeps returning 0, although there is readable data in the stream. So after the actual data to be read is finished, empty data is passed to the other class to be read and this causes corruption.
After a little searching, I found that available() is not reliable when used with sockets, and that I should read the first few bytes from the stream to actually see whether data is available to parse.
But in my case; I have to pass the DataInputStream reference I get from the socket to some other class that I cannot change.
Is it possible to read a few bytes from DataInputStream without corrupting it, or any other suggestions ?
Putting a PushbackInputStream in between allows you to read some bytes without corrupting the data.
EDIT: Untested code example below. This is from memory.
static class MyWrapper extends PushbackInputStream {
MyWrapper(InputStream in) {
super(in);
}
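@Override
public int available() throws IOException {
int b = super.read(); // blocks until a byte or end of stream arrives
if (b == -1) {
return 0; // end of stream: nothing to push back
}
// do something specific?
super.unread(b);
return super.available();
}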
}
public static void main(String... args) {
InputStream originalSocketStream = null;
DataInputStream dis = new DataInputStream(new MyWrapper(originalSocketStream));
}
This should work:
PushbackInputStream pbi = new PushbackInputStream(socketChannel.socket().getInputStream(), 1);
int singleByte;
DataInputStream dis = new DataInputStream(pbi);
while((singleByte = pbi.read()) != -1) {
pbi.unread(singleByte);
SomeOtherClass.method(dis);
}
But please note that this code behaves differently from the example with available() (if available() worked), because available() does not block but read() may block.
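The peek-and-unread loop can be tried without a socket. A minimal sketch, assuming the consumer reads one 4-byte int per call (readInt() stands in for SomeOtherClass.method(dis), and the class name is hypothetical):

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.ArrayList;
import java.util.List;

class PeekLoopSketch {
    /** Reads 4-byte ints until the peeked byte signals end of stream. */
    static List<Integer> readAll(InputStream source) throws IOException {
        PushbackInputStream pbi = new PushbackInputStream(source, 1);
        DataInputStream dis = new DataInputStream(pbi);
        List<Integer> values = new ArrayList<>();
        int peeked;
        while ((peeked = pbi.read()) != -1) { // blocks until a byte or end of stream arrives
            pbi.unread(peeked);               // put it back so the consumer sees it
            values.add(dis.readInt());        // stands in for SomeOtherClass.method(dis)
        }
        return values;
    }
}
```

The DataInputStream must wrap the PushbackInputStream, not the raw socket stream, otherwise the pushed-back byte is lost.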
But available() keeps returning 0, although there is readable data in the stream
If available() returns zero, either:
The input stream you are using doesn't support available() and so it just returns zero. That isn't the case here, as you are using a DataInputStream wrapped directly around the socket's input stream, and that configuration does support available(), OR ...
There is no readable data in the stream. That appears to be the case here. In fact the only possible way you can know there is readable data in the stream without actually reading it is to call available() and get a positive result. There is no other way of telling.
There are few correct uses of available(), and this isn't one of them. Why should you fall out of that loop just because there isn't any data in the socket receive buffer? The only way you should get out of that loop is by getting an end of stream condition.
I should be reading first few bytes from stream to actually see if data is available to parse.
That doesn't even make sense. If you can read anything from the stream, there is data available, and if you can't, there isn't.
Just read, block, and react correctly to EOS, in its various manifestations.
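A minimal sketch of that pattern, assuming the stream carries 4-byte ints (the class name and record layout are hypothetical). DataInputStream signals end of stream from its readXxx methods by throwing EOFException, so that exception is the loop's exit:

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadUntilEos {
    /** Sums 4-byte ints from the stream, stopping only at end of stream. */
    static long sumInts(InputStream source) throws IOException {
        DataInputStream dis = new DataInputStream(source);
        long sum = 0;
        try {
            while (true) {
                sum += dis.readInt(); // blocks until four bytes arrive
            }
        } catch (EOFException endOfStream) {
            // DataInputStream reports end of stream by throwing EOFException
        }
        return sum;
    }
}
```

Other read methods report end of stream differently: read() returns -1 and BufferedReader.readLine() returns null.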
I have a file that contains bytes, chars, and an object, all of which need to be written then read. What would be the best way to utilize Java's different IO streams for writing and reading these data types? More specifically, is there a proper way to add delimiters and recognize those delimiters, then triggering what stream should be used? I believe I need some clarification on using multiple streams in the same file, something I have never studied before. A thorough explanation would be a sufficient answer. Thanks!
As EJP already suggested, use ObjectOutputStream and ObjectInputStream and wrap your other elements in one or more objects. I'm posting this as an answer so I can show an example (it's hard to do in a comment). EJP, if you want to embed it in your question, please do and I'll delete the answer.
class MyWrappedData implements Serializable {
private String string1;
private String string2;
private char char1;
// constructors
// getters setters
}
Write to file:
ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(fileName));
out.writeObject(myWrappedDataInstance);
out.flush();
Read from file
ObjectInputStream in = new ObjectInputStream(new FileInputStream(fileName));
Object obj = in.readObject();
MyWrappedData wrapped = null;
if (obj instanceof MyWrappedData) // instanceof is already false for null
wrapped = (MyWrappedData) obj;
// get the specific elements from the wraped object
see very clear example here: Read and Write
Redesign the file. There is no sensible way of implementing it as presently designed. For example the object presupposes an ObjectOutputStream, which has a header - where's that going to go? And how are you going to know where to switch from bytes to chars?
I would probably use an ObjectOutputStream for the whole thing and write everything as objects. Then Serialization solves all those problems for you. After all you don't actually care what's in the file, only how to read and write it.
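A minimal sketch of writing everything as objects, assuming the bytes travel as a byte[] and the chars as a String (the class name is hypothetical, and an in-memory stream stands in for the file). The reader recovers the values simply by calling readObject() in the same order they were written, so no delimiters are needed:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class MixedRecordSketch {
    /** Writes bytes, chars, and an object through one stream, then reads them back in order. */
    static List<Object> roundTrip(byte[] bytes, String chars, Object obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream file = new ByteArrayOutputStream(); // stands in for a FileOutputStream
        try (ObjectOutputStream out = new ObjectOutputStream(file)) {
            out.writeObject(bytes);  // a byte[] is serializable
            out.writeObject(chars);  // so is a String
            out.writeObject(obj);    // obj must implement Serializable
        }
        List<Object> result = new ArrayList<>();
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(file.toByteArray()))) {
            result.add(in.readObject()); // comes back as byte[]
            result.add(in.readObject()); // comes back as String
            result.add(in.readObject()); // comes back as the original object's class
        }
        return result;
    }
}
```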
Can you change the structure of the file? It is unclear because the first sentence of your question contradicts being able to add delineators. If you can change the file structure you could output the different data types into separate files. I would consider this the 'proper' way to delineate the data streams.
If you are stuck with the file the way it is then you will need to write an interface to the file's structure which in practice is a shopping list of read operations and a lot of exception handling. A hackish way to program because it will require a hex editor and a lot of trial and error but it works in certain cases.
Why not write the file as XML, possibly with a nice simple library like XStream? If you are concerned about space, wrap it in gzip compression.
If you have control over the file format, and it's not an exceptionally large file (i.e. < 1 GiB), have you thought about using Google's Protocol Buffers?
They generate code that parses (and serializes) file/byte[] content. Protocol buffers use a tagging approach on every value that includes (1) field number and (2) a type, so they have nice properties such as forward/backward compatibility with optional fields etc. They are fairly well optimized for both speed and file size, adding only ~2 bytes of overhead for a short byte[], with ~2-4 additional bytes to encode the length on larger byte[] fields (VarInt encoded lengths).
This could be overkill, but if you have a bunch of different fields & types, protobuf is really helpful. See: http://code.google.com/p/protobuf/.
An alternative is Thrift by Facebook, with support for a few more languages although possibly less use in the wild last I checked.
If the structure of your file is not fixed, consider using a wrapper per type. First you need to create the interface of your wrapper classes….
interface MyWrapper extends Serializable {
void accept(MyWrapperVisitor visitor);
}
Then you create the MyWrapperVisitor interface…
interface MyWrapperVisitor {
void visit(MyString wrapper);
void visit(MyChar wrapper);
void visit(MyLong wrapper);
void visit(MyCustomObject wrapper);
}
Then you create your wrapper classes…
class MyString implements MyWrapper {
public final String value;
public MyString(String value) {
super();
this.value = value;
}
@Override
public void accept(MyWrapperVisitor visitor) {
visitor.visit(this);
}
}
.
.
.
And finally you read your objects…
final InputStream in = new FileInputStream(myfile);
final ObjectInputStream objIn = new ObjectInputStream(in);
final MyWrapperVisitor visitor = new MyWrapperVisitor() {
@Override
public void visit(MyString wrapper) {
//your logic here
}
.
.
.
};
//loop over all your objects here
final MyWrapper wrapper = (MyWrapper) objIn.readObject();
wrapper.accept(visitor);
What I want to do is log the output from an InputStream that I got using
org.apache.http.HttpEntity entity = response.getEntity();
InputStream content = entity.getContent(); // getContent() returns an InputStream
//Print the result to the screen for debugging
//purposes
if(Logging.DEBUG) {
int i;
StringBuilder b = new StringBuilder();
while( (i=content.read()) != -1 ) {
b.append((char)i);
}
Log.d(TAG, b.toString());
}
Now after I have finished logging, I want to pass the exact same stream to an XML parser. The problem is that it tells me that the stream has already been used.
I tried to use the mark() and reset() calls before and after debugging, but it didn't work.
It depends on whether the InputStream that is returned supports it. The default implementation in the InputStream class does nothing, as described in the API. So you can't be sure whether the returned stream actually supports it. To be sure, you should wrap it in a BufferedInputStream, which does support these methods.
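A minimal sketch of the BufferedInputStream approach, assuming you only need to log a bounded prefix of the content (the class name, method name, and readLimit parameter are hypothetical). The mark stays valid only while no more than readLimit bytes have been read since mark() was called:

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class MarkResetSketch {
    /** Reads up to readLimit bytes for logging, then rewinds the stream. */
    static String peekForLogging(BufferedInputStream in, int readLimit) throws IOException {
        in.mark(readLimit);               // remember the current position
        byte[] buf = new byte[readLimit];
        int total = 0;
        int n;
        while (total < readLimit && (n = in.read(buf, total, readLimit - total)) != -1) {
            total += n;
        }
        in.reset();                       // rewind so the next reader starts from the mark
        // ISO-8859-1 maps each byte to one char, like the (char) cast in the question
        return new String(buf, 0, total, StandardCharsets.ISO_8859_1);
    }
}
```

Create the BufferedInputStream once, for example new BufferedInputStream(entity.getContent()), and pass that same instance to the parser afterwards. The readLimit bytes are held in memory, so this suits bounded content rather than arbitrarily large responses.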
In general mark() and reset() won't work on an arbitrary InputStream. They only work on subclasses whose markSupported() returns true, such as BufferedInputStream or ByteArrayInputStream, which keep the data they need to replay in memory.
For something like a SocketInputStream or a console InputStream, your only option will be to read and buffer the entire stream contents somewhere; e.g. in memory or by writing it to a temporary file.
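A minimal sketch of the in-memory variant, assuming the content is small enough to hold in a byte array (ReplayableContent is a hypothetical name). The source is drained once, and each consumer then gets its own fresh stream over the copy:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class ReplayableContent {
    private final byte[] bytes;

    /** Drains the stream into memory; the source is fully consumed afterwards. */
    ReplayableContent(InputStream source) throws IOException {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = source.read(chunk)) != -1) {
            copy.write(chunk, 0, n);
        }
        this.bytes = copy.toByteArray();
    }

    /** Returns a fresh stream over the buffered bytes; call as often as needed. */
    InputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }
}
```

One stream from newStream() can feed the logger and a second one the XML parser, so neither depends on the other's read position.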