This is a total beginner question. I've spent the past hour searching both Stack Overflow and Google without finding what I'm looking for, so hopefully someone here can point me in the right direction.
I'm trying to write a string to an OutputStream, which I will then use to write data to a MySQL database. I've successfully retrieved data from a MySQL database (via a .php script, using JSON and REST), so I think I have some idea of what I'm doing. I'm creating a method that takes a string and returns an output stream, but I'm having trouble writing to the stream: when I try to initialize one, I end up with an anonymous inner class with the write(int oneByte) method. That's not what I want.
private static OutputStream convertStringtoStream(String string) {
byte[] stringByte = string.getBytes();
OutputStream os = new OutputStream() {
@Override
public void write(int oneByte) throws IOException {
/** I'd rather this method be something like
public void write(byte[] bytes), but it requires int oneByte*/
}
};
//return os here
}
As you can see, I want to write to my OutputStream with a buffer, not a single byte at a time. I'm sure this is a simple question, but I've not been able to find an answer, or even sample code that does what I want. If someone could point me in the right direction I'd really appreciate it. Thanks.
Your method could look like this, but I'm not sure what it would accomplish. How would you use the returned OutputStream?
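private static OutputStream convertStringtoStream(String string) {
    byte[] stringByte = string.getBytes();
    // size the buffer by the encoded byte count, not the character count
    ByteArrayOutputStream bos = new ByteArrayOutputStream(stringByte.length);
    // write(byte[], int, int) declares no IOException on ByteArrayOutputStream
    bos.write(stringByte, 0, stringByte.length);
    return bos;
}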
Also, note that using String.getBytes() might get you into trouble in the long run because it uses the system's default encoding. It's better to choose an explicit encoding and use the String.getBytes(Charset) method.
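A minimal sketch of the explicit-encoding variant, assuming UTF-8 is the encoding the consumer expects. The class and method names (StringToStream, toStream) are illustrative and not from the original post.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class StringToStream {
    // Assumption: UTF-8 is the desired encoding; swap in whatever charset the consumer expects.
    static ByteArrayOutputStream toStream(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream bos = new ByteArrayOutputStream(bytes.length);
        // The three-argument write on ByteArrayOutputStream declares no checked exception.
        bos.write(bytes, 0, bytes.length);
        return bos;
    }
}
```

With an explicit Charset the result no longer depends on the platform default, so the same string yields the same bytes on every machine.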
Instead of using the abstract OutputStream class, you might want to use ByteArrayOutputStream, which lets you write a whole buffer. Another option is ObjectOutputStream, which lets you write a String directly since String is Serializable (note that it writes Java's serialization format rather than the raw characters). Hope that helps.
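On the original concern: an anonymous OutputStream subclass only has to implement write(int). The write(byte[]) and write(byte[], int, int) methods are inherited, and by default they call write(int) once per byte. A minimal sketch, where the StringBuilder sink and the ASCII-only input are assumptions made purely for illustration:

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class InheritedWriteDemo {
    static String roundTrip(String asciiText) throws IOException {
        StringBuilder sink = new StringBuilder();
        OutputStream os = new OutputStream() {
            @Override
            public void write(int oneByte) {
                // Only the low 8 bits carry data; this demo treats each byte as one ASCII char.
                sink.append((char) (oneByte & 0xFF));
            }
        };
        // Inherited from OutputStream: loops over the array and calls write(int) for each byte.
        os.write(asciiText.getBytes(StandardCharsets.US_ASCII));
        os.close();
        return sink.toString();
    }
}
```

So callers can pass a whole buffer even though only the single-byte method was overridden. Overriding write(byte[], int, int) as well is an optimization, not a requirement.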
I have a reporting web application which generates reports. The application gets data from a database and stores data into a StringWriter object. I have to get this data in a byte array format to create a csv file and send it to browser.
Below is the code snippet
return new FileTransfer(fileName, reportType.getMimeType(),
new ByteArrayInputStream(generateCSV(reportType, grid, new DataList(), params).toString().getBytes("UTF-8")));
where generateCSV returns a StringWriter object, then to convert it into byte array I am calling toString and then getBytes() method. Below is what the generateCSV method looks like
StringWriter generateCSV(ReportType reportType, GridConfig grid, DataList dataList, String params) {......}
The problem is that when my report has huge records (more than 1 million), the getBytes() method fails with
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
The whole report, once converted to a String, contains a huge number of characters (billions of them). The .getBytes("UTF-8") call then encodes it into a single byte array with at least one byte per character, and for 1 million records that array exceeds the maximum JVM array size (https://plumbr.io/outofmemoryerror/requested-array-size-exceeds-vm-limit).
Now how can I avoid use of toString().getBytes("UTF-8") to avoid OOM error? Is there any better approach to convert to byte array from StringWriter?
It’s strange to receive the result of generateCSV as a StringWriter; the preferred solution would be to let the method write to a target while generating, so you don’t have the entire contents in memory.
In either case, you should resort to the FileTransfer(String, String mimeType, OutputStreamLoader) constructor, to receive the target OutputStream when it is time to write the actual data.
When you can’t avoid the intermediate StringWriter, you should at least avoid calling toString on it, as constructing a String implies creating a copy of the entire buffer.
So a solution could look like:
return new FileTransfer(fileName, reportType.getMimeType(), new OutputStreamLoader() {
public void close() {}
public void load(OutputStream out) throws IOException {
// the best would be to let generateCSV write to out directly
// otherwise use:
StringBuffer sb = generateCSV(reportType, grid, new DataList(), params).getBuffer();
Writer w = new OutputStreamWriter(out, "UTF-8");
final int bufSize = 8192;
for(int s = 0, e; s < sb.length(); s = e) {
e = Math.min(sb.length(), s + bufSize);
w.write(sb.substring(s, e));
}
w.flush(); // let the caller close the OutputStream
}
});
An alternative to StringWriter would be CharArrayWriter, which has a writeTo(Writer out), which eliminates the need to implement a manual copying loop and might be even more efficient. But, as said, refactoring generateCSV to write directly to a target would be even better.
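A minimal sketch of the CharArrayWriter variant. It assumes generateCSV can be changed to return a CharArrayWriter; the generateCsv stand-in below and its sample rows are hypothetical.

```java
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class CharArrayWriterDemo {
    // Hypothetical stand-in for generateCSV, returning a CharArrayWriter instead of a StringWriter.
    static CharArrayWriter generateCsv() {
        CharArrayWriter buffer = new CharArrayWriter();
        buffer.append("id,name\n").append("1,Alice\n");
        return buffer;
    }

    // Mirrors the body of load(OutputStream) from the answer above.
    static void load(OutputStream out) throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        // writeTo hands the internal char buffer to the target writer without building a String first.
        generateCsv().writeTo(w);
        w.flush(); // let the caller close the OutputStream
    }
}
```

This still keeps the whole report in memory once, as characters; it only avoids the extra String and byte[] copies.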
The StringWriter holds its entire content in memory, so it's not a good approach for large files.
You should try to stream the data in chunks directly to the InputStream, without the StringWriter in the middle.
What about writing your own InputStream implementation that reads the data and converts it to CSV on the fly?
Can you show us the generateCSV method?
I need to return the byte array from a ByteArrayOutputStream that is built in a called method. I see two ways to achieve this: call baos.toByteArray() inside the method and return the byte array, or return the ByteArrayOutputStream itself and let the caller use toByteArray().
Which one should I use?
To illustrate by example:
Method 1
void parentMethod(){
    byte[] result = process();
}
byte[] process(){
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    .....
    .....
    .....
    return baos.toByteArray();
}
Method 2
void parentMethod(){
    ByteArrayOutputStream baos = process();
}
ByteArrayOutputStream process(){
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    .....
    .....
    .....
    return baos;
}
There's another alternative: return an InputStream. The idea is presumably that you're returning the data resulting from the operation. As such, returning an output stream seems very odd to me. To return data, you'd normally either return the raw byte[], or an InputStream wrapping it - the latter is more flexible in that it could be reading from a file or something similar, but does require the caller to close the stream afterwards.
It partly depends on what callers want to do with the data, too - there are some operations which are easier to perform if you've already got a stream; there are others which are easier with a byte array. I'd let that influence the decision quite a lot.
If you do want to return a stream, that's easy:
return new ByteArrayInputStream(baos.toByteArray());
So to summarize:
Don't return ByteArrayOutputStream. The use of that class in coming up with the data is an implementation detail, and it's not really the logical result of the operation.
Consider returning an InputStream if callers are likely to find that easier to use, or if you may later want to read the data from a file (or network connection, or whatever); ByteArrayInputStream is suitable for the current implementation.
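A minimal sketch of that recommendation, assuming the method builds its data in a ByteArrayOutputStream internally. The class name and the two sample bytes are placeholders for the real processing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;

final class ProcessResult {
    // The ByteArrayOutputStream stays an implementation detail; callers only see an InputStream.
    static InputStream process() {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        baos.write('o'); // placeholder for the real work that fills the buffer
        baos.write('k');
        return new ByteArrayInputStream(baos.toByteArray());
    }
}
```

If the data later comes from a file or a socket instead, the return type stays the same and callers do not need to change.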
Consider a generic byte reader implementing the following simple API to read an unspecified number of bytes from a data structure that is otherwise inaccessible:
public interface ByteReader
{
public byte[] read() throws IOException; // Returns null only at EOF
}
How could the above be efficiently converted to a standard Java InputStream, so that an application using any of the methods defined by the InputStream class works as expected?
A simple solution would be subclassing InputStream to
Call the read() method of the ByteReader as much as needed by the read(...) methods of the InputStream
Buffer the bytes retrieved in a byte[] array
Return part of the byte array as expected, e.g., 1 byte at a time whenever the InputStream read() method is called.
However, this requires more work to be efficient (e.g., for avoiding multiple byte array allocations). Also, for the application to scale to large input sizes, reading everything into memory and then processing is not an option.
Any ideas or open source implementations that could be used?
Create multiple ByteArrayInputStream instances around the returned arrays and use them in a stream that provides for concatenation. You could for instance use SequenceInputStream for this.
The trick is to implement an Enumeration<ByteArrayInputStream> that uses the ByteReader class.
EDIT: I've implemented this answer, but it is probably better to create your own InputStream instance instead. Unfortunately, this solution does not let you handle IOException gracefully.
final Enumeration<ByteArrayInputStream> basEnum = new Enumeration<ByteArrayInputStream>() {
ByteArrayInputStream baos;
boolean ended;
@Override
public boolean hasMoreElements() {
if (ended) {
return false;
}
if (baos == null) {
getNextBA();
if (ended) {
return false;
}
}
return true;
}
@Override
public ByteArrayInputStream nextElement() {
if (ended) {
throw new NoSuchElementException();
}
if (baos.available() != 0) {
return baos;
}
getNextBA();
return baos;
}
private void getNextBA() {
byte[] next;
try {
next = byteReader.read();
} catch (IOException e) {
throw new IllegalStateException("Issues reading byte arrays", e);
}
if (next == null) {
ended = true;
return;
}
this.baos = new ByteArrayInputStream(next);
}
};
SequenceInputStream sis = new SequenceInputStream(basEnum);
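For comparison, here is a sketch of the "create your own InputStream" route mentioned in the edit above. It is one possible shape, not a tuned implementation: the class name ByteReaderInputStream is made up, and ByteReader is redeclared only so the example is self-contained. Unlike the Enumeration approach, IOException from the ByteReader propagates to the caller unchanged.

```java
import java.io.IOException;
import java.io.InputStream;

interface ByteReader {
    byte[] read() throws IOException; // returns null only at EOF
}

final class ByteReaderInputStream extends InputStream {
    private final ByteReader reader;
    private byte[] chunk = new byte[0]; // current array handed out by the ByteReader
    private int pos;                    // index of the next unread byte in chunk
    private boolean eof;

    ByteReaderInputStream(ByteReader reader) {
        this.reader = reader;
    }

    // Makes sure chunk has unread bytes, skipping empty arrays; returns false at EOF.
    private boolean fill() throws IOException {
        while (!eof && pos >= chunk.length) {
            byte[] next = reader.read();
            if (next == null) {
                eof = true;
            } else {
                chunk = next;
                pos = 0;
            }
        }
        return !eof;
    }

    @Override
    public int read() throws IOException {
        return fill() ? (chunk[pos++] & 0xFF) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        if (!fill()) {
            return -1;
        }
        // Copy straight out of the current chunk; no extra buffer is allocated.
        int n = Math.min(len, chunk.length - pos);
        System.arraycopy(chunk, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public int available() {
        return chunk.length - pos;
    }
}
```

Only one chunk is held at a time, so memory use is bounded by the largest array the ByteReader returns.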
I assume, by your use of "convert", that a replacement is acceptable.
The easiest way to do this is to just use a ByteArrayInputStream, which already provides all the features you are looking for (but must wrap an existing array), or to use any of the other already provided InputStream for reading data from various sources.
It seems like you may be running the risk of reinventing wheels here. If possible, I would consider scrapping your ByteReader interface entirely, and instead going with one of these options:
Replace with ByteArrayInputStream.
Use the various other InputStream classes (depending on the source of the data).
Extend InputStream with your custom implementation.
I'd stick to the existing InputStream class everywhere. I have no idea how your code is structured but you could, for example, add a getInputStream() method to your current data sources, and have them return an appropriate already-existing InputStream (or a custom subclass if necessary).
By the way, I recommend avoiding the term Reader in your own IO classes, as Reader is already heavily used in the Java SDK to indicate stream readers that operate on encoded character data (as opposed to InputStream which generally operates on raw byte data).
Alright, I am very new to Java and am trying to develop an application to teach myself how to use the language.
I have been copying and pasting the same few lines of code all over, and I know that there is a way to consolidate this into a function, but cannot quite figure it out.
FileOutputStream fout4 = openFileOutput("building1hourly.txt", MODE_WORLD_READABLE);
OutputStreamWriter osw4 = new OutputStreamWriter(fout4);
osw4.write("" +iHourlyAfter);
osw4.flush();
osw4.close();
Isn't there some way I could do something like this?
public void writerFunction("What to write to file", "name stream", "name writer", "MODE"){insert above code here}
Yes absolutely:
public void writeToFile(String fileName, String contents, int mode) throws IOException {
FileOutputStream fout = openFileOutput(fileName, mode);
OutputStreamWriter osw = new OutputStreamWriter(fout);
osw.write(contents);
osw.flush();
osw.close();
}
First of all, great job so far. Learning programming is just like learning math (except more fun): you can read about it all you want in a book, but you don't really understand the concepts until you DO them. You're going about this the right way.
Now, to answer your question: Yes, you can encapsulate the process of writing to a file in a function. Let's call it writeToFile. You want to "call" this function by sending it arguments. The arguments are the information that the function needs to do its work.
There are two sides to a function: the declaration, and the invocation. Just like in math, you can define a function f(x), where f does something. For example: say I have the function f(x) = 2x - 4. That equation is what we call the function declaration, in that we are defining what f does, and you are defining the parameters that it accepts, namely a single value x. Then you want to apply that function on a certain value x, so you might do something like: f(4). This is the function invocation. You are invoking, or calling the function, and sending 4 as the argument. The code that invokes a function is called the caller.
Let's start with the declaration of the function that you want to build:
public void writeToFile (String data, String fileName) throws IOException
This function defines two parameters in its signature: it expects a String containing the data you will write to the file, and the fileName of the file to which we will write the data. The void means that this function does not return any data back to the caller, and throws IOException tells the caller that opening or writing the file may fail.
The complete function, the body of which you provided in your post:
public void writeToFile (String data, String fileName) throws IOException {
    FileOutputStream fout = openFileOutput(fileName, MODE_WORLD_READABLE);
    OutputStreamWriter osw = new OutputStreamWriter(fout);
    osw.write(data); // write the parameter, not the caller's iHourlyAfter variable
    osw.flush();
    osw.close();
}
Now you will want to call, or invoke this function from somewhere else in your code. You can do this like so:
writeToFile("stuff I want to write to a file", "myFile.txt");
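One possible refinement, sketched with plain java.io rather than Android's openFileOutput so that it runs anywhere: try-with-resources closes the writer even when write throws. The class name FileWriting, the path parameter, and the choice of UTF-8 are assumptions for this sketch.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class FileWriting {
    // Assumption: a plain file path replaces openFileOutput(fileName, mode) from the post.
    static void writeToFile(String data, String path) throws IOException {
        try (Writer w = new OutputStreamWriter(new FileOutputStream(path), StandardCharsets.UTF_8)) {
            w.write(data);
        } // the writer is flushed and closed here, even if write() threw
    }
}
```

A call then looks like FileWriting.writeToFile("" + iHourlyAfter, "building1hourly.txt"), with one such line per file instead of five copied lines.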
I'm calling a method from an external library with a (simplified) signature like this:
public class Alien
{
// ...
public void munge(Reader in, Writer out) { ... }
}
The method basically reads a String from one stream and writes its results to the other. I have several strings which I need processed by this method, but none of them exist in the file system. The strings can get quite long (ca 300KB each). Ideally, I would like to call munge() as a filter:
public void myMethod (ArrayList<String> strings)
{
for (String s : strings) {
String result = alienObj.mungeString(s);
// do something with result
}
}
Unfortunately, the Alien class doesn't provide a mungeString() method, and wasn't designed to be inherited from. Is there a way I can avoid creating two temporary files every time I need to process a list of strings? Like, pipe my input to the Reader stream and read it back from the Writer stream, without actually touching the file system?
I'm new to Java, please forgive me if the answer is obvious to professionals.
You can easily avoid temporary files by using any/all of these:
CharArrayReader / CharArrayWriter
StringReader / StringWriter
PipedReader / PipedWriter
A sample mungeString() method could look like this:
public String mungeString(String input) {
StringWriter writer = new StringWriter();
alienObj.munge(new StringReader(input), writer);
return writer.toString();
}
If you are willing to work with in-memory binary arrays, as you would in C#, then I think PipedWriter and PipedReader are the most convenient way to do so. Check this:
Is it possible to avoid temp files when a Java method expects Reader/Writer arguments?
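A rough sketch of the PipedReader/PipedWriter route, for cases where the input should be streamed rather than held as one String. The munge method below is a hypothetical stand-in for Alien.munge (it just upper-cases its input), and the helper name mungeViaPipe is made up. Pipes need a second thread, because the writer blocks once the pipe's small internal buffer is full.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.Reader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

final class PipedMungeDemo {
    // Hypothetical stand-in for Alien.munge(Reader, Writer): upper-cases everything it reads.
    static void munge(Reader in, Writer out) throws IOException {
        int c;
        while ((c = in.read()) != -1) {
            out.write(Character.toUpperCase(c));
        }
    }

    static String mungeViaPipe(String input) throws IOException, InterruptedException {
        PipedWriter source = new PipedWriter();
        PipedReader in = new PipedReader(source);
        // Producer thread feeds the pipe; closing the writer signals EOF to the reader.
        Thread producer = new Thread(() -> {
            try (Writer w = source) {
                w.write(input);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        producer.start();
        StringWriter result = new StringWriter();
        try {
            munge(in, result);
        } finally {
            in.close(); // unblocks the producer if munge stopped early
        }
        producer.join();
        return result.toString();
    }
}
```

When each input already fits in memory as a String, the StringReader/StringWriter version is simpler and needs no extra thread.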