Suppose we are writing a Java library that provides some I/O utility functions, for example a convenience method to read text files as Strings:
public class StringReader {
private static final Logger log = LoggerFactory.getLog(StringReader.class);
/**
* Returns the contents of file <b>fileName</b> as String.
* @param fileName file name to read
* @return null on IO error
*/
public static String readString(String fileName) {
FileInputStream fis = null;
try {
fis = new FileInputStream(fileName);
byte[] data = new byte[fis.available()];
fis.read(data);
return new String(data, "ISO-8859-1"); // may throw UnsupportedEncodingException!
} catch (UnsupportedEncodingException e) { // subclass of IOException, so it must be caught first
log.fatal("JRE does not support ISO-8859-1!", e);
// ???
} catch (IOException e) {
log.error("unable to read file", e);
} finally {
closeQuiet(fis);
}
return null;
}
}
This code reads a text file into a String using ISO-8859-1 encoding and returns the String to the caller.
The String(byte[], String) constructor throws an UnsupportedEncodingException when the specified encoding is not supported. But, as we know, ISO-8859-1 must be supported by every JRE, as stated in the Charset documentation (see the Standard charsets section).
Hence, we expect the block
catch (UnsupportedEncodingException e) {
log.fatal("encoding is unsupported", e);
// ???
}
is never reached if the JRE distribution conforms to the standard.
But what if it doesn't? What is the most correct way to handle this exception?
In other words, how should such an error be reported?
The suggestions are:
Throw some kind of RuntimeException.
Do not disable the logger in production code; write the exception details to the log and ignore it.
Put an assert false here, so that it produces an AssertionError if the user launched the VM with -ea.
Throw an AssertionError manually.
Add UnsupportedEncodingException to the method declaration and let the user choose. Not very convenient, I think.
Call System.exit(1).
Thanks.
But what if it doesn't?
Then you're in a really bad situation, and you should probably get out of it as quickly as possible. When a JRE is violating its own promises, what would you want to depend on?
I'd feel happy using AssertionError in this case.
It's important to note that not all unchecked exceptions are treated equally - it's not unusual for code to catch Exception at the top level of the stack, log an error and then keep going... if you just throw RuntimeException, that will be caught by such a scheme. AssertionError would only be caught if the catch block specified Throwable (or specifically Error or AssertionError, but that's much rarer to see). Given how impossible this should be, I think it's reasonable to abort really hard.
Also note that in Java 7, you can use StandardCharsets.ISO_8859_1 instead of the string name, which is cleaner and removes the problem.
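A minimal sketch of both options, assuming a hypothetical helper class named Latin1 (it is not part of the question's code). The first method keeps the charset name and aborts with an AssertionError that carries the original exception as its cause (the two-argument AssertionError constructor needs Java 7). The second uses the StandardCharsets constant, so no checked exception is left to handle.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class Latin1 {
    private Latin1() {}

    // The charset is looked up by name, so the checked exception must be
    // handled even though it should never occur.
    static String decodeByName(byte[] data) {
        try {
            return new String(data, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // Every conforming JRE supports ISO-8859-1; abort hard if it does not.
            throw new AssertionError("JRE does not support ISO-8859-1", e);
        }
    }

    // Java 7+: a Charset constant cannot be "unsupported".
    static String decode(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }
}
```

Both methods return the same String for the same bytes; the difference is only in how much error handling the caller has to read past.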
There are other things I'd change about your code, by the way:
I would avoid using available() as far as possible. That tells you how many bytes are available right now - it doesn't tell you how long the file is, necessarily.
I would definitely not assume that read() will read the whole file in one go. Call read() in a loop, ideally until it says there's no more data.
I would personally accept a Charset as a parameter, rather than hard-coding ISO-8859-1.
I would let IOException bubble up from the method rather than just returning null. After all, unless you're really going to check the return value of every call for nullity, you're just going to end up with a NullPointerException instead, which is harder to diagnose than the original IOException.
Alternatively, just use Guava's Files.toString(File, Charset) to start with :) (If you're not already using Guava, now is a good time to start...)
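If Guava is not an option, the points above could be combined roughly as follows. This is only a sketch: the class name TextFiles is hypothetical, and it relies on Files.readAllBytes from java.nio.file (Java 7) instead of a hand-written read() loop, which is a substitution rather than something the answer itself proposes.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical utility class, for illustration only.
final class TextFiles {
    private TextFiles() {}

    /**
     * Reads the whole file and decodes it with the given charset.
     * IOException propagates to the caller instead of being turned into null.
     */
    static String readString(String fileName, Charset charset) throws IOException {
        // readAllBytes reads until end of file, so neither available()
        // nor a single read() call is relied upon.
        byte[] data = Files.readAllBytes(Paths.get(fileName));
        return new String(data, charset);
    }
}
```

A caller would write something like TextFiles.readString("notes.txt", StandardCharsets.ISO_8859_1) and decide for itself how to react to an IOException.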
This is a rather common occurrence in code.
Unchecked exceptions are made for this. They shouldn't happen (which is why they are unchecked), but if they do, there is still an exception.
So, throw a RuntimeException that has the original Exception as the cause.
catch (UnsupportedEncodingException e) {
throw new RuntimeException(e); //should not happen
}
assert(false); also throws an unchecked AssertionError, but assertions can be turned off, so I would recommend RuntimeException.
Related
I've come across the UnsupportedEncodingException while using the URLEncoder with the UTF-8 encoding and it forces me to write my code like this:
String foo = ... // contains unsafe characters
try {
foo = URLEncoder.encode(foo, "UTF-8");
} catch (UnsupportedEncodingException e) {
// do something with the exception that should never occur
}
Instead of just this:
String foo = ... // contains unsafe characters
foo = URLEncoder.encode(foo, "UTF-8");
The documentation of URLEncoder discourages the use of any encoding other than UTF-8:
Note: The World Wide Web Consortium Recommendation states that UTF-8 should be used. Not doing so may introduce incompatibilities.
And the UTF-8 encoding should always be available, at least according to the Supported Encodings page from the documentation.
The accepted answer to the question on how to handle the UnsupportedEncodingException and if it can even occur is "It cannot happen, unless there is something fundamentally broken in your JVM".
So I'm wondering, why does the UnsupportedEncodingException class not extend the RuntimeException class, which would allow me to use the second code snippet? Is it just because it exists as it is right now and it would be hard to change that?
If this were changed, some existing code could break. For example:
try {
... do something that could throw UnsupportedEncodingException
} catch (IOException e) {
... handle the exception
}
If UnsupportedEncodingException is no longer an IOException it would not be handled any more.
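Since the exception has to stay checked, one common workaround is to hide the try/catch in a single small helper. The sketch below assumes a hypothetical class UrlEncoding; the only library call it uses is URLEncoder.encode(String, String).

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, for illustration only.
final class UrlEncoding {
    private UrlEncoding() {}

    // Encodes with UTF-8 and converts the "impossible" checked exception
    // into an unchecked error, so callers need no try/catch.
    static String encodeUtf8(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is not supported by this JVM", e);
        }
    }
}
```

With this in place the calling code shrinks to the one-liner the question asks for: foo = UrlEncoding.encodeUtf8(foo);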
I have just started learning Java and have been struggling with the following problem for hours. I want to use PrintWriter to produce a simple text file.
I do not get any runtime exception, but the file does not appear in the specified directory.
public class Main {
public static void main(String[] args) {
try (final PrintWriter writer = new PrintWriter(
new File("c:\test\new\notes.txt"))) {
writer.write("Test note");
} catch (IOException e) {
throw new RuntimeException(e);
}
}
}
What am I doing wrong?
\ represents an escape character, so it needs to be escaped itself for literal backslash characters. You can also use / and Java will resolve the correct separator character for the platform:
try (final PrintWriter writer = new PrintWriter("c:\\test\\new\\notes.txt")) {
Add writer.flush() after writer.write("Test note"), and use double backslashes for Windows paths (as other answers are suggesting).
As Reimeus already said, \ is an escape character in Java.
That means that a string literal containing "\n" or "\t" does not represent the literal text \n or \t!
'\n' represents the newline character and '\t' represents the TAB character!
For better understanding, the following code:
System.out.println("c:\test\new\notes.txt");
would not print c:\test\new\notes.txt to the console, it would print
c: est
ew
otes.txt
to the console!
To be able to write the backslash in a string you'll need to use '\\'!
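A small sketch that makes the difference measurable. The class name EscapeDemo is hypothetical; the three constants are the question's path written unescaped, escaped, and with forward slashes.

```java
// Hypothetical demo class, for illustration only.
final class EscapeDemo {
    private EscapeDemo() {}

    // Unescaped: the compiler turns \t into a TAB and \n into a newline.
    static final String UNESCAPED = "c:\test\new\notes.txt";

    // Escaped: each doubled backslash stands for one literal backslash.
    static final String ESCAPED = "c:\\test\\new\\notes.txt";

    // Forward slashes need no escaping at all.
    static final String FORWARD = "c:/test/new/notes.txt";

    public static void main(String[] args) {
        System.out.println(UNESCAPED.length()); // 18: three escape pairs became single characters
        System.out.println(ESCAPED.length());   // 21: the path as intended
        System.out.println(ESCAPED);            // c:\test\new\notes.txt
    }
}
```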
I see your question as having 2 parts:
Why doesn't the code work?
Why was no exception thrown?
The first question has already been answered, but I think the answer to the second question is at least as important because your current code will still fail silently if there is any problem writing to the file.
From the documentation of PrintWriter (http://docs.oracle.com/javase/7/docs/api/java/io/PrintWriter.html):
Methods in this class never throw I/O exceptions, although some of its
constructors may. The client may inquire as to whether any errors have
occurred by invoking checkError().
Therefore it is essential that you call checkError() after every call to a PrintWriter method, or your code will not be reliable.
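To illustrate the silent failure, here is a sketch with a deliberately broken stream. CheckErrorDemo and FailingStream are hypothetical names invented for this example; PrintWriter.checkError() is the real API quoted above, and it flushes the writer before reporting.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;

// Hypothetical demo class, for illustration only.
final class CheckErrorDemo {
    private CheckErrorDemo() {}

    // Simulates a destination that cannot be written to.
    static final class FailingStream extends OutputStream {
        @Override
        public void write(int b) throws IOException {
            throw new IOException("simulated write failure");
        }
    }

    // Writes the note and reports whether it reached the stream.
    // The caller keeps ownership of the stream and is expected to close it.
    static boolean writeNote(OutputStream out, String note) {
        PrintWriter writer = new PrintWriter(out);
        writer.write(note);          // never throws, even if the stream is broken
        return !writer.checkError(); // flushes, then reports any swallowed IOException
    }
}
```

Calling writeNote(new CheckErrorDemo.FailingStream(), "Test note") returns false without any exception being thrown, which is exactly the situation the question ran into.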
For example, I have this file that should be formatted like this:
PersonName
2,5,6,7,8,9
The first line only has a name, and from the second line onwards it's all comma-separated values.
The file ends in a new line. (For some reason I can't show a new empty line in code format.)
Now let's say that those comma-separated values are copied into an ArrayList and then I check whether the ArrayList is empty. If it is empty I have to throw an exception. My question is, what exception should I throw here?
Something like (in pseudo code):
FileReader fr = new FileReader(f);
BufferedReader buffReader = new BufferedReader(fr);
ArrayList list = new ArrayList();
while(iterator.hasCSV) {
// copy values to list
Object obj = iterator.next();
list.add(obj);
}
if(list.isEmpty()) {
// What exception to throw here?
// It means there was a name in the file (first line) but then no csv values
}
I hope I made myself clear otherwise just let me know and I'll try to explain it better.
As paxdiablo said, you can create your own exception. Here is an example:
class CheckListException extends Exception
{
//Parameterless Constructor
public CheckListException() {}
//Constructor that accepts a message
public CheckListException(String message)
{
super(message);
}
}
to use the exception
try
{
if(list.isEmpty()) {
// What exception to throw here?
// It means there was a name in the file (first line) but then no csv values
throw new CheckListException();
}
}
catch(CheckListException ex)
{
//Process message however you would like
}
Create your own exception.
if (list.isEmpty()) { // your condition
// throw a generic exception with a custom message
throw new Exception("My custom exception");
}
But this is a one-off exception.
To reuse it, you have to create a new class for your custom exception.
There's no requirement to use the standard ones defined in Java since you can create whatever ones you need.
While it looks like the closest one in the standard set is probably DataFormatException, that's targeted at ZIP files.
Perhaps a better match would be ParseException. If you wanted to use one of the more targeted standard ones, that would be the one I'd be looking to use (if you don't wish to create your own), or inherit from (if you do).
But keep in mind you can always inherit from the top-level Exception class if you're not overly concerned about where it is in the hierarchy.
It would be normal to place all the code responsible for parsing the content of the file in its own method. Most parsers interleave reading from the input stream and examining what it has read. Such a method must declare that it throws IOException, because the parts of the method that read from the input stream can throw an IOException. The parsing method will be easier to use if parse errors are also indicated by throwing an IOException: a caller of the method can use one catch to handle all problems, if they are not interested in the details of the problems.
A caller of the method might however want to be able to distinguish between different kinds of problems. The caller might want to report different error messages for a missing file and for an incorrectly formatted file. To support this, you could create your own InvalidFormatException that extends IOException and throw that when there is a parse error.
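A rough sketch of that design for the file format in the question (a name line followed by comma-separated values). InvalidFormatException is the name suggested above; PersonFileParser and parseValues are hypothetical names, and treating the values as integers is an assumption based on the sample file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Parse errors are a kind of IOException, so one catch can handle both.
class InvalidFormatException extends IOException {
    InvalidFormatException(String message) {
        super(message);
    }
}

// Hypothetical parser, for illustration only.
final class PersonFileParser {
    private PersonFileParser() {}

    // Reads the name line, then collects all comma-separated integers.
    static List<Integer> parseValues(Reader in) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        String name = reader.readLine();
        if (name == null || name.trim().isEmpty()) {
            throw new InvalidFormatException("missing name on first line");
        }
        List<Integer> values = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.trim().isEmpty()) {
                continue; // tolerate the trailing empty line
            }
            for (String field : line.split(",")) {
                try {
                    values.add(Integer.parseInt(field.trim()));
                } catch (NumberFormatException e) {
                    throw new InvalidFormatException("not a number: '" + field + "'");
                }
            }
        }
        if (values.isEmpty()) {
            throw new InvalidFormatException("no values after name '" + name + "'");
        }
        return values;
    }
}
```

A caller that does not care about the distinction catches IOException only; one that does can catch InvalidFormatException first and report a format problem separately from a missing file.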
I'm currently using
try {
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(new FileOutputStream("open_sites_20,txt"), "UTF-8"));
writer.write(String.format("%4d%4d%n", i, j));
writer.close();
} catch (IOException e) {
System.err.print("error: " + e.toString() + "\n");
};
where i, j are integers.
FindBugs reports that the above has the following bad practice
Reliance on default encoding
Found a call to a method which will perform a byte to String (or String to byte) conversion, and will assume that the default platform encoding is suitable. This will cause the application behaviour to vary between platforms. Use an alternative API and specify a charset name or Charset object explicitly.
Any suggestion how this can be improved?
Platform: IntelliJ IDEA 13.1.1 + FindBugs-IDEA 0.9.992.
In this case, FindBugs seems to be wrong. Please keep in mind that its rules are not carved into stone, so it is necessary to apply your own judgment.
As for improving things: there are a few ways you can improve this code. First, let's deal with character encoding. Since Java 1.4, OutputStreamWriter has had a constructor with the following signature:
OutputStreamWriter(OutputStream stream, Charset charEncoding)
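It's better to use this instead of passing the encoding name as a string. Why? Starting with Java 7, you can use the constants of the StandardCharsets class (it is a class of constants, not an enum) to obtain a Charset instance, so there is no name to mistype and no UnsupportedEncodingException to handle. Therefore you can write: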
new OutputStreamWriter(new FileOutputStream("open_sites_20,txt"), StandardCharsets.UTF_8)
I don't think FindBugs would argue about that.
Another issue I see here is the way you close the writer. If write() throws an exception, close() is never called. The best way to deal with it (if you are using Java 7 or above) is to use try-with-resources and let Java close the streams/writers:
try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("open_sites_20,txt"), StandardCharsets.UTF_8))) {
writer.write(String.format("%4d%4d%n", i, j));
// Flushing before closing is optional here: close() flushes as well
writer.flush();
} catch (IOException e) {
System.err.print("error: " + e.toString() + "\n");
}
Now, all you do is write a single value to a file. If I were you, I would try to avoid all the overhead of creating streams, wrapping them in writers and so on. It's complicated. Fortunately, Java 7 has one fantastic class to help you write things to text files: Files.
The class has two methods that come in handy when you need to write something to a text file:
write(Path path, byte[] bytes, OpenOption... options)
write(Path path, Iterable<? extends CharSequence> lines, Charset cs, OpenOption... options)
The first one can also be used to write binary files. The second can be used to write a collection of strings (any Iterable, such as a list or a set). Your program could be rewritten as:
try {
Path outputPath = Paths.get("open_sites_20,txt");
String nmbrStr = String.format("%4d%4d%n", i, j);
byte[] outputBytes = nmbrStr.getBytes(StandardCharsets.UTF_8);
Files.write(outputPath, outputBytes, StandardOpenOption.CREATE);
} catch (IOException ioe) {
LOGGER.log(Level.SEVERE, "unable to write file", ioe);
}
That's it!
Simple and elegant. What I used here:
Paths
String.getBytes(Charset charEncoding) - I can't guarantee that FindBugs won't complain about this one
StandardOpenOption
Java Logging API - instead of writing exceptions to System.err
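The second write overload mentioned above is not shown in the rewritten program, so here is a sketch of it. The class OpenSitesWriter and its method names are hypothetical; the format string is the question's, minus the trailing %n, because Files.write appends a line separator after each element.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
final class OpenSitesWriter {
    private OpenSitesWriter() {}

    // Formats one pair of integers in two right-aligned columns of width 4.
    static String formatPair(int i, int j) {
        return String.format("%4d%4d", i, j);
    }

    // Writes one line per pair, encoded as UTF-8.
    static void writePairs(Path output, int[][] pairs) throws IOException {
        List<String> lines = new ArrayList<>();
        for (int[] pair : pairs) {
            lines.add(formatPair(pair[0], pair[1]));
        }
        // With no OpenOption given, the file is created or truncated.
        Files.write(output, lines, StandardCharsets.UTF_8);
    }
}
```

Because the charset is passed explicitly, nothing here depends on the platform default encoding.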
Today I ran into a very interesting problem when trying to rewrite an XML file.
I have 3 ways to do this, and I want to know the best way and the reason for the problem.
I.
File file = new File(REAL_XML_PATH);
try {
FileWriter fileWriter = new FileWriter(file);
XMLOutputter xmlOutput = new XMLOutputter();
xmlOutput.output(document, System.out);
xmlOutput.output(document, fileWriter);
fileWriter.close();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
In this case I have a big problem with my app. After writing to the file in my own language I can't read anything. The file's encoding was changed to ANSI, and I get:
javax.servlet.ServletException: javax.servlet.jsp.JspException: Invalid argument looking up property: "document.rootElement.children[0].children"
II.
File file = new File(REAL_XML_PATH);
XMLOutputter output=new XMLOutputter();
try {
output.output(document, new FileOutputStream(file));
} catch (FileNotFoundException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
In this case I have no problems. The encoding wasn't changed. No problem with reading and writing.
I also found this article: http://tripoverit.blogspot.com/2007/04/javas-utf-8-and-unicode-writing-is.html
So I want to know the best way and the reason for the problem.
Well, this looks like the problem:
FileWriter fileWriter = new FileWriter(file);
That will always use the platform default encoding, which is rarely what you want. Suppose your default encoding is ISO-8859-1. If your document declares itself to be encoded in UTF-8, but you actually write everything in ISO-8859-1, then your file will be invalid if you have any non-ASCII characters - you'll end up writing them out with the ISO-8859-1 single byte representation, which isn't valid UTF-8.
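The mismatch can be reproduced without any XML library. In this sketch (EncodingMismatchDemo is a hypothetical name) a non-ASCII character is encoded as ISO-8859-1 and then decoded as UTF-8, which is what a reader trusting a UTF-8 declaration would do.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class EncodingMismatchDemo {
    private EncodingMismatchDemo() {}

    // What a writer using an ISO-8859-1 default encoding would put in the file.
    static byte[] latin1Bytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // What a reader that trusts a UTF-8 declaration would see.
    static String readAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

For the single character U+00E9 (e with acute accent), latin1Bytes produces one byte, 0xE9, where UTF-8 would need two. That lone byte is not a valid UTF-8 sequence, so readAsUtf8 returns the replacement character U+FFFD instead of the original text.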
I would actually provide a stream to XMLOutputter rather than a Writer. That way there's no room for conflict between the encoding declared by the file and the encoding used by the writer. So just change your code to:
FileOutputStream fileOutput = new FileOutputStream(file);
...
xmlOutput.output(document, fileOutput);
... as I now see you've done in your second bit of code. So yes, this is the preferred approach. Here, the stream makes no assumptions about the encoding to use, because it's just going to handle binary data. The XML writing code gets to decide what that binary data will be, and it can make sure that the character encoding it really uses matches the declaration at the start of the file.
You should also clean up your exception handling - don't just print a stack trace and continue on failure, and call close in a finally block instead of at the end of the try block. If you can't genuinely handle an exception, either let it propagate up the stack directly (potentially adding throws clauses to your method) or catch it, log it and then rethrow either the exception or a more appropriate one wrapping the cause.
If I remember correctly, you can force your XMLOutputter to use a "pretty" format with
new XMLOutputter(Format.getPrettyFormat()), so it should work with approach I too.
The documentation describes the pretty format as:
Returns a new Format object that performs whitespace beautification
with 2-space indents, uses the UTF-8 encoding, doesn't expand empty
elements, includes the declaration and encoding, and uses the default
entity escape strategy. Tweaks can be made to the returned Format
instance without affecting other instances.