What is the reason for the JVM handling SIGPIPE the way it does?
I would've expected for
java Foo | head -10
with
import java.util.stream.Stream;

public class Foo {
    public static void main(String[] args) {
        Stream.iterate(0, n -> n + 1).forEach(System.out::println);
    }
}
to cause the process to be killed when writing the 11th line, however that is not the case. Instead, it seems that only an error flag is set on the PrintStream, which can be checked through System.out.checkError().
What happens is that the JVM ignores the SIGPIPE signal, so the failing write to the broken pipe results in an IOException instead.
For most OutputStream and Writer classes, this exception propagates through the "write" method, and has to be handled by the caller.
However, when you are writing to System.out, you are using a PrintStream, and that class by design takes care of the IOException for you. As the javadoc says:
A PrintStream adds functionality to another output stream, namely the ability to print representations of various data values conveniently. Two other features are provided as well. Unlike other output streams, a PrintStream never throws an IOException; instead, exceptional situations merely set an internal flag that can be tested via the checkError method.
What is the reason for the JVM handling SIGPIPE the way it does?
The above explains what is happening. The "why" is ... I guess ... that the designers wanted to make PrintStream easy to use for typical use cases of System.out where the caller doesn't want to deal with a possible IOException on every call.
Unfortunately, there is no elegant solution to this:
You could just call checkError ...
You should be able to get hold of the FileDescriptor.out object, and wrap it in a new FileOutputStream object ... and use that instead of System.out.
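A minimal sketch of both options (illustrative only; how quickly the broken pipe is noticed depends on buffering and on the platform):

    import java.io.BufferedWriter;
    import java.io.FileDescriptor;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.OutputStreamWriter;

    public class Foo {
        public static void main(String[] args) {
            // Option 1: keep using System.out and poll the error flag:
            //     System.out.println(n);
            //     if (System.out.checkError()) System.exit(1);  // downstream has gone away

            // Option 2: bypass PrintStream so the broken pipe surfaces as an IOException.
            try (BufferedWriter out = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(FileDescriptor.out)))) {
                for (int n = 0; ; n++) {
                    out.write(Integer.toString(n));
                    out.newLine();
                    out.flush(); // flush each line so the failed write is noticed promptly
                }
            } catch (IOException e) {
                // The EPIPE from the underlying write ends up here once head exits.
                System.exit(1);
            }
        }
    }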
Note that there are no strong guarantees that the Java app will only write 10 lines of output in java Foo | head -10. It is quite possible for the app to write ahead many lines, and to only "see" the pipe closed after head has gotten around to reading the first 10 of them. This applies both with System.out (and checkError) and if you wrap FileDescriptor.out.
Related
I have seen many times that using the functional API in Java is really verbose and error-prone when we have to deal with checked exceptions.
E.g. it's really convenient to write (and easier to read) code like
var obj = Objects.requireNonNullElseGet(something, Other::get);
Indeed, it also avoids improper multiple invocations of getters, like when you do
var obj = something.get() != null ? something.get() : other.get();
// ^^^^ first ^^^^ ^^^^ second ^^^^
BUT everything becomes a jungle when you have to deal with checked exceptions, and I have sometimes seen this really ugly code style:
try {
    Objects.requireNonNullElseGet(obj, () -> {
        try {
            return invokeMethodWhichThrows();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    });
} catch (RuntimeException r) {
    Throwable cause = r.getCause();
    if (cause == null)
        throw r;
    else
        throw cause;
}
whose only intent is to handle checked exceptions the way you would when writing code without lambdas. Now, I know that those cases can be better expressed with the ternary operator and a variable to hold the result of something.get(), but that's also the case for Objects.requireNonNullElse(a, b), which is there, in the java.util package of the JDK.
The same can be said for logging frameworks' methods which take Suppliers as parameters and evaluate them only if needed, BUT if you need to handle checked exceptions in those suppliers you have to invoke the code yourself and explicitly check the log level:
if (LOGGER.isDebugEnabled())
    LOGGER.debug("request from " + resolveIPOrThrow());
A similar reasoning can be made for Futures as well, but let me move on.
My question is: why does the functional API in Java not handle checked exceptions?
For example having something like a ThrowingSupplier interface, like the one below, can potentially fit the need of dealing with checked exceptions, guarantee type consistency and better code readability.
interface ThrowingSupplier<O, T extends Exception> {
    O get() throws T;
}
Then we would need to duplicate the methods that use Suppliers with overloads that take ThrowingSuppliers and declare the exception. But we as Java developers are used to this kind of duplication (like with Stream, IntStream, LongStream, or methods with overloads to handle int[], char[], long[], byte[], ...), so it's nothing too strange for us.
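For illustration only, such an overload could look roughly like this (a hypothetical sketch, not something in the JDK):

    public static <T, E extends Exception> T requireNonNullElseGet(T obj, ThrowingSupplier<? extends T, E> supplier) throws E {
        // hypothetical: same contract as Objects.requireNonNullElseGet, but the supplier may throw E
        return (obj != null) ? obj : Objects.requireNonNull(supplier.get(), "supplier.get()");
    }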
I would really appreciate it if someone with deep knowledge of the JDK could explain why checked exceptions have been excluded from the functional API, and whether there was a way to incorporate them.
This question can be interpreted as 'why did those who made this decision decide it this way', which is asking: "Please summarize 5 years of serious debate - specifically what Brian Goetz and co thought about it", which is impossible, unless your name is Brian Goetz. He does not answer questions on SO as far as I know. You can go spelunking in the archives of the lambda-dev mailing list if you want.
One could make an informed guess, though.
In-scope vs Beyond-scope
There are 3 transparencies that lambdas do not have:
Control flow.
Checked exceptions.
Mutable local variables.
Control flow transparency
Take this code, as an example:
private Map<String, PhoneNumber> phonebook = ...;

public PhoneNumber findPhoneNumberOf(String personName) {
    phonebook.entrySet().stream().forEach(entry -> {
        if (entry.getKey().equals(personName)) return entry.getValue();
    });
    return null;
}
This code is silly (why not just do a .get, or if we must stream through the thing, why not use .filter and .findFirst?), but if you look past that, it doesn't even work: you cannot return from the method from within that lambda. That return statement returns from the lambda (and thus is a compiler error: the lambda you pass to forEach returns void). You can't continue or break a loop that is outside the lambda from inside it, either.
Contrast to a for loop that can do it just fine:
for (var entry : phonebook.entrySet()) {
    if (entry.getKey().equals(personName)) return entry.getValue();
}
return null;
does exactly what you think, and works fine.
Checked exception transparency
This is the one you are complaining about. This doesn't compile:
public void printFiles(Path... files) throws IOException {
    Arrays.stream(files).forEach(p -> System.out.println(Files.readString(p)));
}
The fact that the context allows you to throw IOExceptions doesn't help: The above does not compile, because 'can throw IOExceptions' as a status doesn't 'transfer' to the inside of the lambda.
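For comparison, the plain-loop version compiles fine, because the loop body is directly covered by the method's throws clause:

    public void printFiles(Path... files) throws IOException {
        for (Path p : files) {
            System.out.println(Files.readString(p));
        }
    }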
There's a theme here: Rewrite it to a normal for loop and it compiles and works precisely the way you want to. So why, exactly, can't we make lambdas work the same way?
Mutable local variable transparency
This doesn't work:
int x = 0;
someList.stream().forEach(k -> x++);
System.out.println("Count: " + x);
You can neither modify local variables declared outside the lambda, nor even read them unless they are (effectively) final. Why not?
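For completeness, the usual workarounds are to let the stream produce the value, or to use a mutable holder object; a quick sketch (AtomicInteger is from java.util.concurrent.atomic):

    // let the stream compute the result instead of mutating a local
    long count = someList.stream().count();

    // or, when mutation is genuinely needed, use a holder such as AtomicInteger
    AtomicInteger x = new AtomicInteger();
    someList.stream().forEach(k -> x.incrementAndGet());
    System.out.println("Count: " + x.get());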
These are all GOOD things... depending on scope layering
So far it seems really stupid that lambdas aren't transparent in these 3 regards. But it turns into a good thing in a slightly different context. Imagine instead of .stream().forEach something a little bit different:
class DoubleNullException extends Exception {} // checked!

public class Example {
    private TreeSet<String> words;

    public Example() throws DoubleNullException {
        int comparisonCount = 0;
        this.words = new TreeSet<String>((a, b) -> {
            comparisonCount++;
            if (a == null && b == null) throw new DoubleNullException();
            return a.compareTo(b);
        });
        System.out.println("Comparisons performed: " + comparisonCount);
    }
}
Let's imagine the 3 transparencies did work. The above code makes use of two of them (tries to mutate comparisonCount, and tries to throw DoubleNullException from inside to outside).
The above code makes absolutely no sense. The compiler errors are very much desired. That comparator is not going to run until perhaps next week, in a completely different thread. It runs whenever you add the second element to the set, which is a field, so who knows who is going to do that and which thread would do it. The constructor has long since ceased running - local vars are 'on the stack' and thus the local var has disappeared. Never mind that the printing would always print 'Comparisons performed: 0' here; the statement 'comparisonCount++' would be trying to increment a memory position that no longer holds that variable at all.
Even if we 'fix' this (the compiler realizes that a local is used in a lambda and hoists it onto heap, this is what most other languages do), the code still makes no sense as a concept: That print statement wouldn't print. Also, that comparator can be called from multiple threads so... do we now allow volatile on our local vars? Quite the can of worms! In current java, a local variable cannot possibly suffer from thread concurrency synchronization issues because it is not possible to share the variable (you can share the object the variable points at, not the variable itself) with another thread.
The reason you ARE allowed to mess with (effectively) final locals is because you can just make a copy, and that's what the compiler does for you. Copies are fine - if nobody changes anything.
The exception similarly doesn't work: it's the code that calls thatSet.add(someElement) that would get the DoubleNullException. Suppose somebody wrote:
Example ex;
try {
    ex = new Example();
} catch (DoubleNullException e) {
    throw new WrappedEx(e);
}
ex.add(null);
ex.add(null); // BOOM
The line with the remark (BOOM) would throw the DoubleNullException. It 'breaks' the checked exception rules: that line would compile (set.add doesn't declare DoubleNullException), but it isn't in a context where throwing DoubleNullException is allowed. The catch block in the above snippet cannot ever run.
See how it all falls apart, and nothing makes sense?
The key clue is: What happens to the lambda? Is it 'transported'?
For some situations, you hand a lambda straight to a method, and that method has a 'use it and lose it' mentality: That method you handed the lambda to will run it 0, 1, or many times, but the key is: It runs it right then and there and once the method you handed the lambda to returns, that lambda is gone. The thing you handed the lambda to did not store it in a field or hand it to other code that stores it in a field, nor did that method transport the lambda to another thread.
In such cases (the method is use-it-then-lose-it), the transparencies would certainly be handy and wouldn't "break" anything.
But when the method you hand the lambda to does transport it to a field (such as the constructor of TreeSet which stores the passed comparator in a field, so that future .add calls can call it), the transparencies break down and make no sense.
Lambdas in java are for both and therefore the lack of transparency (in all 3 regards) actually makes sense. It's just annoying when you have a use-it-then-lose-it situation.
POTENTIAL FUTURE JAVA FIX: I've championed it before but so far it has fallen on mostly deaf ears. Next time I see Brian I might bring it up again. Imagine an annotation or other marker you can stick on the parameter of a method that says: "I shall use it or lose it". The compiler will then ensure you do not transport it (the only thing the compiler will let you do with that param is call .invoke() on it; you can't call anything else, nor can you assign it or hand it to anything else, unless you hand it to a method that also marked that parameter as @UseItOrLoseIt). Then the compiler can make the transparency happen with some tactical wrapping for control flow, and for checked exception flow just by not complaining (checked exceptions are a figment of javac's imagination; the runtime does not have checked exceptions, which is why Scala, Kotlin, and other runs-on-the-JVM languages can do it).
Actually THEY CAN!
As your question ends with - you can actually write O get() throws T. So why do the various functional interfaces, such as Supplier, not do this?
Mostly because it's a pain. I'm honestly not sure why e.g. list's forEach is not defined as:
public <T extends Exception> void forEach(ThrowingConsumer<? super E, ? extends T> consumer) throws T {
    for (E elem : this) consumer.consume(elem);
}
Which would work fine and compile (with ThrowingConsumer having the obvious declaration, sketched below). Or even that Consumer as we have it had been declared with the <O, T extends Exception> part.
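The "obvious" declaration referenced above would be something along these lines (a sketch; there is no such type in the JDK):

    @FunctionalInterface
    interface ThrowingConsumer<E, T extends Exception> {
        void consume(E element) throws T;
    }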
It's a bit of a hassle. The way lambdas 'work' is that the compiler has to infer from context which functional interface you are implementing, which notably includes having to bind all the generics out. Adding exception binding to this mix makes it even harder. IDEs tend to get a little confused if you're in the middle of writing code in a 'throwing lambda' and start red-underlining rather a lot, and auto-complete and the like are no help, because the IDE can't be useful in that context until it knows which interface you are implementing.
Lambdas as a system were also designed to backwards-compatibly replace any existing usages of the concept, such as Swing's ActionListener. Such listeners couldn't throw either, so having the interfaces in the java.util.function package be similar would be more familiar and slightly more Java-idiomatic, possibly.
The throws T solution would help but isn't a panacea. It solves, to an extent, the lack of checked exception transparency, but does nothing to solve either mutable local var transparency or control flow transparency. Perhaps the conclusion is simply: The benefits of doing it are more limited than you think, the costs are higher than you think. The cost/benefit analysis says: Bad idea, so it wasn't done.
With Java 11, I could initialize an InputStream as:
InputStream inputStream = InputStream.nullInputStream();
But I am unable to understand a potential use case of InputStream.nullInputStream or a similar API for OutputStream
i.e. OutputStream.nullOutputStream.
From the API Javadocs, I could figure out that it
Returns a new InputStream that reads no bytes. The returned stream is
initially open. The stream is closed by calling the close() method.
Subsequent calls to close() have no effect. While the stream is open,
the available(), read(), read(byte[]), ...
skip(long), and transferTo() methods all behave as if end of stream
has been reached.
I went through the detailed release notes further, which state:
There are various times where I would like to use methods that require as a parameter a target OutputStream/Writer for sending output, but would like to execute those methods silently for their other effects. This corresponds to the ability in Unix to redirect command output to /dev/null, or in DOS to append command output to NUL.
Yet I fail to understand what those methods are in the statement "... execute those methods silently for their other effects" (blame my lack of hands-on experience with the APIs).
Can someone help me understand what is the usefulness of having such an input or output stream with a help of an example if possible?
Edit: A similar implementation I could find while browsing further is Apache Commons' NullInputStream, which does justify the testing use case much better.
Sometimes you want to have a parameter of InputStream type, but also to be able to choose not to feed your code with any data. In tests it's probably easier to mock it, but in production you may choose to bind null input instead of scattering your code with ifs and flags.
compare:
class ComposableReprinter {
    void reprint(InputStream is) throws IOException {
        System.out.println(is.read());
    }

    void bla() {
        reprint(InputStream.nullInputStream());
    }
}
with this:
class ControllableReprinter {
    void reprint(InputStream is, boolean forReal) throws IOException {
        if (forReal) {
            System.out.println(is.read());
        }
    }

    void bla() {
        reprint(new ByteArrayInputStream(new byte[0]), false);
    }
}
or this:
class NullableReprinter {
    void reprint(InputStream is) throws IOException {
        if (is != null) {
            System.out.println(is.read());
        }
    }

    void bla() {
        reprint(null);
    }
}
It makes more sense with output IMHO. Input is probably more for consistency.
This approach is called Null Object: https://en.wikipedia.org/wiki/Null_object_pattern
I see it as a safer (1) and more expressive (2) alternative to initialising a stream variable with null.
1. No worries about NPEs.
2. [Output|Input]Stream is an abstraction. In order to return a null/empty/mock stream, you had to deviate from the core concept down to a specific implementation.
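A tiny sketch of the difference:

    // Instead of:
    //     InputStream in = null;                      // every caller has to null-check
    // prefer:
    InputStream in = InputStream.nullInputStream();    // safe to pass around and read from
    System.out.println(in.read());                     // prints -1, i.e. end of stream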
I think nullOutputStream is very easy and clear: just to discard output (similar to > /dev/null) and/or for testing (no need to invent an OutputStream).
An (obviously basic) example:
OutputStream out = ... // an easy way to either print it to System.out or just discard all prints, setting it basically to the nullOutputStream
out.println("yeah... or not");
exporter.exportTo(out); // discard or real export?
Regarding nullInputStream, it's probably more for testing (I don't like mocks), and for APIs which require an input stream, or (more probably) which deliver an input stream that does not contain any data and where null is not a viable option:
importer.importDocument("name", /* input stream... */);
InputStream inputStream = content.getInputStream(); // better having no data to read than getting a null
When you test that importer, you can just use a nullInputStream there, again instead of inventing your own InputStream or instead of using a mock. Other use cases here rather look like a workaround or misuse of the API ;-)
Regarding the return of an InputStream: that rather makes sense. If you don't have any data, you may want to return that nullInputStream instead of null, so that callers do not have to deal with null and can just read as they would if there were data.
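A sketch of that idea (hasContent and openContent are made-up names):

    InputStream getContentStream() {
        // never hand out null; an empty stream behaves sanely for every caller
        return hasContent() ? openContent() : InputStream.nullInputStream();
    }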
Finally, these are just convenience methods to make our lives easier without adding another dependency ;-) and, as others already stated (comments/answers), it's basically an implementation of the null object pattern.
Using the null*Stream might also have the benefit that tests execute faster: if you stream real data (depending on size, of course) you may just slow down your tests unnecessarily, and we all want tests to complete fast, right? (Some will put in mocks here... well...)
This is a very well discussed topic. Out of many links, I did find the below two useful.
1,
2
My question is this:
I am writing multiple methods, some for general purposes, say, writing into a file, reading from a file, splitting a string using some regex.
I am calling these methods from different parts of my application; say class1, class2 and class3 call the file-writing method, but with different file names and data.
Where should I include try catch and throws?
1. Have one set of try/catch in the outermost method, say main, from where I am calling the class1, class2 and class3 methods, and use throws in the innermost functions.
2. Or, have a try/catch in the innermost methods and none in the outer ones.
Which is a better option and why? Are there any other ways of dealing with this?
Thanks
The heart of this matter is how to decide logically where your exceptions need to be handled.
Let's take an example. A method that takes a File object and performs some operation.
public void performFileOperation(File f) throws FileNotFoundException {
    // Perform logic.
}
In this case, chances are that you want the class/method calling this method to handle the Exception. If you had a try/catch block in the performFileOperation() method, then the method calling the performFileOperation() method would never know if the operation failed (and therefore can't know what to do about it).
It is really a matter of where it makes sense to handle the Exception that is thrown. You need to work out which sections of your code need to know that an exception has occurred, and use try/catch blocks there. If the next level up needs to know as well, then the catch block can throw the Exception so the method further up the stack can know about and handle it as well.
Another good way to look at it is that most exceptions need to be handled; something needs to be done about them. If your method is not in a position to do something significant to handle that exception, then it needs to throw the Exception until it makes its way to somewhere that can meaningfully handle (read: do something about) it.
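A small sketch of that idea, reusing the performFileOperation() method from above (loadConfig and reportToUser are made-up names):

    void loadConfig(File f) throws FileNotFoundException {
        performFileOperation(f);               // nothing useful to do about the failure here, so just declare throws
    }

    void onOpenButtonClicked(File f) {
        try {
            loadConfig(f);
        } catch (FileNotFoundException e) {
            reportToUser("Could not open " + f);   // this layer can meaningfully react to the failure
        }
    }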
I hope this helps.
I have some functionality that I want to mock out that is called from main (static: I've read about that too - jmock mocking a static method). I recently read that JMock doesn't support the mocking of static functions. Well, the associated code (that's giving me a problem) must be called from main, and must be in the class with main...
Sample source
Test code
Right now, I want to ensure that my main has a test to make sure that the file exists before it proceeds. Problem is, my program gets user input from the console, so I don't know how to mock that out. Do I just go down to that level of granularity, specifying at every point along the way what happens, so that I can write about only one operation in a function that returns the user's input? I know that to write the tests well, when the tests are run, they should not ask for the user input; I should be specifying it in my tests somehow.
I think it has to do with the following:
How to use JMock to test mocked methods inside a mocked method
I'm not that good with JMock...
If the readInput() method does something, like, say:
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
return in.readLine();
Then you might be able to get away with a test that goes something like:
InputStream oldSystemIn = System.in;
InputStream mockSystemIn = context.mock(InputStream.class);
System.setIn(mockSystemIn);
context.checking(new Expectations() {{
    // mock expected method calls and return values
}});
// execute
// verify
System.setIn(oldSystemIn);
You can use System Rules instead of mocking System.out and System.in.
public class MyTest {
    @Rule
    public TextFromStandardInputStream systemInMock = emptyStandardInputStream();

    @Test
    public void readTextFromStandardInputStream() {
        systemInMock.provideText("your file name");
        // your code that reads "your file name" from System.in
    }
}
Stefan Birkner's answer gave me the direction that I need to be able to solve this. I have posted the code that I used to solve this below.
Solved tests: Birkner's version (recommended)
Solved tests: piped version
Changed source:
WHY: What happens is, with Birkner's library, you can only ever read as much input as you instantiate with the rule originally. If you want to iteratively write to the endpoint, you can do this with a pipe hack, but it doesn't make much of a difference: you can't write to the input over the pipe while the function is actually running, so you might as well use Birkner's version; his @Rule is more concise.
Explanation: In both the pipe hack and with Birkner's code, multiple calls in the client under test to create any object that reads from System.in will cause a blocking problem: once the first object has opened a connection to the pipe or to System.in, the others cannot. I don't know exactly why this is for Birkner's code, but with the pipe I think it's because you can only ever open one stream to the object. Notice that if you call close on the first BufferedReader and then try to reopen System.in in your client code after having called it from the test, the second attempt to open will fail, because the pipe on the writer's side has been closed as well.
Solution: The easy way to solve this, and probably not the best because it requires modifying the source of the actual project (but not in a horrendous way, yet), is this: instead of creating multiple BufferedReaders in the source of the actual project, create one buffered reader and pass the same reader reference around, or make it a private variable of the class. Remember that if you have to declare it static, you should not initialize it in a static context, because if you do, when the tests run, System.setIn will get called AFTER the reader has been initialized in your client. So it will poll on all readLine/whatever calls, just as it will if you try to create multiple objects from System.in.
Notice that to have your reads segregated between calls from your reader, in this case a BufferedReader, you can use newlines to separate them in the original setup. This way, each call in the client under test returns what you want.
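A sketch of what that looks like (the reads happen inside the code under test, through the single shared reader):

    systemInMock.provideText("first answer\nsecond answer\n");
    // in the client, the shared BufferedReader then returns:
    //   reader.readLine()  ->  "first answer"
    //   reader.readLine()  ->  "second answer"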
I have a general socket implementation consisting of an OutputStream and an InputStream.
After I do some work, I am closing the OutputStream.
When this is done, my InputStream's read() method returns -1 for an infinite amount of time, instead of throwing an exception like I had anticipated.
I am now unsure of the safest route to take, so I have a few questions:
Am I safe to assume that -1 is only returned when the stream is closed?
Is there no way to recreate the IO exception that occurs when the connection is forcefully broken?
Should I send a packet that will tell my InputStream that it should close instead of the previous two methods?
Thanks!
The -1 is the expected behavior at the end of a stream. See InputStream.read():
Reads the next byte of data from the input stream. The value byte is returned as an int in the range 0 to 255. If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.
You should still catch IOException for unexpected events of course.
Am I safe to assume that -1 is only returned when the stream is closed?
Yes.
But you should not assume things like this. You should read the javadoc and implement according to how the API is specified to behave, especially if you want your code to be robust (or "safe" as you put it).
Having said that, this is more or less what the javadoc says in this case. (One could quibble that EOF and "stream has been closed" don't necessarily mean the same thing ... and that closing the stream by calling InputStream.close() or Socket.close() locally will have a different effect. However, neither of these are directly relevant to your use-case.)
Is there no way to recreate the IO exception that occurs when the connection is forcefully broken?
No. For a start, no exception is normally thrown in the first place, so there is typically nothing to "recreate". Second, the information in the original exception (if there ever was one) is gone.
Should I send a packet that will tell my InputStream that it should close instead of the previous two methods?
No. The best method is to test the result of the read call. You need to test it anyway, since you cannot assume that the read(byte[]) method (or whatever) will have returned the number of bytes you actually asked for.
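For example, a typical read loop treats -1 as "the other side is done" rather than as an error (process() here is a placeholder for your own handling):

    byte[] buf = new byte[8192];
    int n;
    while ((n = in.read(buf)) != -1) {
        process(buf, n);        // handle the n bytes that were actually read
    }
    // reaching this point means end of stream: the peer has closed its output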
I suppose that throwing an application specific exception would be OK under some circumstances.
But remember the general principle that exceptions should not be used for normal flow control.
One of the other answers suggests creating a proxy InputStream that throws some exception instead of returning -1.
IMO, that is a bad idea. You end up with a proxy class that claims to be an InputStream, but violates the contract of the read methods. That could lead to trouble if the proxy was passed to something that expected a properly implemented InputStream.
Second, InputStream is an abstract class not an interface, so Java's dynamic proxy mechanism won't work. (For example, the newProxyInstance method requires a list of interfaces, not classes.)
According to the InputStream javadoc, read() returns:
the next byte of data, or -1 if the end of the stream is reached.
So you are safe to assume that and it's better to use what's specified in the API than try and recreate an exception because exceptions thrown could be implementation-dependent.
Also, closing the OutputStream of a socket closes the socket itself.
This is what the JavaDoc for Socket says:
public OutputStream getOutputStream() throws IOException
Returns an output stream for this socket.
If this socket has an associated channel then the resulting output stream delegates all of its operations to the channel. If the channel is in non-blocking mode then the output stream's write operations will throw an IllegalBlockingModeException.
Closing the returned OutputStream will close the associated socket.
Returns: an output stream for writing bytes to this socket.
Throws: IOException - if an I/O error occurs when creating the output stream or if the socket is not connected.
Not sure that this is what you actually want to do.
Is there no way to recreate the IO exception that occurs when the connection is forcefully broken?
I'll answer this one. InputStream is only an interface. If you really want implementation to throw an exception on EOF, provide your own small wrapper, override read()s and throw an exception on -1 result.
The easiest (least coding) way would be to use a Dynamic Proxy:
InputStream pxy = (InputStream) java.lang.reflect.Proxy.newProxyInstance(
obj.getClass().getClassLoader(),
new Class[]{ InputStream.class },
new ThrowOnEOFProxy(obj));
where ThrowOnEOFProxy would check the method name, call it and if result is -1, throw IOException("EOF").