How to reverse an InputStream in Java?

Suppose I have an input stream and want to reverse it.
I found a similar question, How to get the content of an input stream in reverse order?, but it is about 8 years old and not exactly what I want.
Suppose I have an InputStream like this:
FileInputStream in = new FileInputStream(new File("some_random_file.bin"));
Note that this is not specifically for text files but for binaries.
Now, I have a way of reversing it:
public static InputStream reverseStream(InputStream in) throws Exception{
byte bytes[] = in.readAllBytes();
byte bytesRev[] = new byte[bytes.length];
for(int i=bytes.length - 1, j = 0;i >= 0; i--, j++)
bytesRev[i] = bytes[j];
return new ByteArrayInputStream(bytesRev);
}
But I am not sure this is the most efficient way to do it.
I want an efficient way to achieve this, even for large files.

If you're willing to read the entire file into memory, then your solution is pretty good. The memory footprint can be improved by reversing the contents in place rather than allocating a second array to store the reversed contents:
public static InputStream reverseStream(InputStream in) throws Exception{
byte bytes[] = in.readAllBytes();
for(int i=bytes.length - 1, j = 0;i >j; i--, j++) {
byte tmp = bytes[i];
bytes[i] = bytes[j];
bytes[j] = tmp;
}
return new ByteArrayInputStream(bytes);
}
If the file is so large that you don't want to load it all at once, then you will need to use class java.io.RandomAccessFile to read the file in reverse order. You will need to use some sort of internal buffering to avoid horrible performance. You can wrap this up in your own implementation of InputStream that reads backwards through the buffer, loading a new buffer-full on the fly as necessary.
Here's my stab at a class that does this. This code is completely untested (although it compiles).
/**
* An input stream that reads a file in reverse. (UNTESTED)
*
* @author Ted Hopp
*/
class ReverseFileInputStream extends InputStream {
private final RandomAccessFile file;
/** Internal buffer for reading chunks of the file in reverse. */
private final byte[] buffer;
/** Position of the start of the buffer in the file. */
private long bufferPosition;
/** Position of the next byte to be read from the buffer. */
private int bufferCursor;
public ReverseFileInputStream(File file, int bufferSize) throws IOException {
this.file = new RandomAccessFile(file, "r");
buffer = new byte[bufferSize];
bufferPosition = this.file.length();
bufferCursor = -1;
}
@Override public int read() throws IOException {
if (bufferCursor < 0) {
fillBuffer();
}
return bufferCursor < 0 ? -1 : (buffer[bufferCursor--] & 0xff);
}
@Override public void close() throws IOException {
file.close();
}
private void fillBuffer() throws IOException {
if (bufferPosition > 0) {
long newBufferPosition = Math.max(0L, bufferPosition - buffer.length);
bufferCursor = (int) (bufferPosition - newBufferPosition);
file.seek(newBufferPosition);
file.readFully(buffer, 0, bufferCursor--);
bufferPosition = newBufferPosition;
}
}
}
Note that if you try to wrap a Reader around this, the result will likely be nonsense unless the text encoding of the underlying file is one byte per character. Likewise with DataInputStream, etc.

Related

Does RandomAccessFile in java read entire file in memory?

I need to read the last n lines from a large file (say 2GB). The file is UTF-8 encoded.
I would like to know the most efficient way of doing it. I read about RandomAccessFile in Java, but does the seek() method read the entire file into memory? It uses a native implementation, so I wasn't able to refer to the source code.
RandomAccessFile.seek just sets the file-pointer current position, no bytes are read into memory.
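As a small illustration of that point, here is a sketch (the class and method names are made up for this example) that fetches only the last few bytes of a file. The seek call just repositions the file pointer; the only bytes that reach memory are the ones passed to readFully.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class SeekDemo {
    /** Returns the last {@code count} bytes of the file (fewer if the file is shorter). */
    static byte[] lastBytes(String path, int count) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long start = Math.max(0L, f.length() - count);
            f.seek(start);                                  // moves the file pointer, reads nothing
            byte[] tail = new byte[(int) (f.length() - start)];
            f.readFully(tail);                              // the only bytes actually read
            return tail;
        }
    }
}
```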
Since your file is UTF-8 encoded, it is a text file. For reading text files we typically use BufferedReader; Java 7 even added a convenience method, Files.newBufferedReader, to create a BufferedReader that reads text from a file. It may be inefficient for reading the last n lines, but it is easy to implement.
To be efficient we need RandomAccessFile and read file backwards starting from the end. Here is a basic example
public static void main(String[] args) throws Exception {
int n = 3;
List<String> lines = new ArrayList<>();
try (RandomAccessFile f = new RandomAccessFile("test", "r")) {
ByteArrayOutputStream bout = new ByteArrayOutputStream();
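for (long length = f.length(), p = length - 1; p >= 0 && lines.size() < n; p--) {
f.seek(p);
int b = f.read();
if (b == 10) {
if (p < length - 1) {
lines.add(0, getLine(bout));
bout.reset();
}
} else if (b != 13) {
bout.write(b);
}
}
// the first line of the file has no LF in front of it, so add what is left over
if (lines.size() < n && bout.size() > 0) {
lines.add(0, getLine(bout));
}
}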
System.out.println(lines);
}
static String getLine(ByteArrayOutputStream bout) {
byte[] a = bout.toByteArray();
// reverse bytes
for (int i = 0, j = a.length - 1; j > i; i++, j--) {
byte tmp = a[j];
a[j] = a[i];
a[i] = tmp;
}
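// the file is UTF-8, so do not rely on the platform default charset
return new String(a, java.nio.charset.StandardCharsets.UTF_8);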
}
It reads the file byte after byte starting from tail to ByteArrayOutputStream, when LF is reached it reverses the bytes and creates a line.
Two things need to be improved:
buffering
EOL recognition
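A rough sketch of how those two improvements might look: the file is read backwards one block at a time instead of one seek per byte, and both LF and CRLF line ends are accepted. The class name, method name and blockSize parameter are invented for this sketch, and it assumes UTF-8 content and a positive blockSize.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class BufferedTail {
    /** Returns up to n last lines, reading the file backwards in blocks of blockSize bytes. */
    static List<String> lastLines(String path, int n, int blockSize) throws IOException {
        List<String> lines = new ArrayList<>();
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            byte[] block = new byte[blockSize];
            // bytes of the line currently being collected, last byte first
            ByteArrayOutputStream reversed = new ByteArrayOutputStream();
            boolean lastByteOfFile = true;
            long pos = f.length();
            while (pos > 0 && lines.size() < n) {
                int len = (int) Math.min(blockSize, pos);
                pos -= len;
                f.seek(pos);
                f.readFully(block, 0, len);              // one disk read per block
                for (int i = len - 1; i >= 0 && lines.size() < n; i--) {
                    byte b = block[i];
                    if (b == '\n') {
                        if (!lastByteOfFile) {           // a trailing newline does not start a new line
                            lines.add(0, decode(reversed));
                        }
                        reversed.reset();
                    } else if (b != '\r') {              // drop CR so CRLF files work too
                        reversed.write(b);
                    }
                    lastByteOfFile = false;
                }
            }
            // reached the start of the file: the first line has no LF in front of it
            if (f.length() > 0 && pos == 0 && lines.size() < n) {
                lines.add(0, decode(reversed));
            }
        }
        return lines;
    }

    private static String decode(ByteArrayOutputStream reversed) {
        byte[] a = reversed.toByteArray();
        for (int i = 0, j = a.length - 1; i < j; i++, j--) {
            byte tmp = a[i];
            a[i] = a[j];
            a[j] = tmp;
        }
        return new String(a, StandardCharsets.UTF_8);
    }
}
```

Collecting raw bytes and decoding only whole lines means a multi-byte UTF-8 character that straddles two blocks is still decoded correctly.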
If you need random access, you need RandomAccessFile. You can convert the bytes you get from this into UTF-8 if you know what you are doing.
If you use BufferedReader, you can use skip(n) to skip a number of characters, which means it has to read the whole file.
A way to combine the two is to use FileInputStream with skip(), find where you want to read from by reading back N newlines, and then wrap the stream in a BufferedReader to read the lines with UTF-8 encoding.
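A minimal sketch of the second half of that idea, assuming the byte offset of a line start has already been found by some other means (the names below are made up for illustration). It skips to the offset on the raw stream and only then wraps it in a UTF-8 BufferedReader.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SkipAndRead {
    /** Reads all lines that start at or after the given byte offset (assumed to be a line start). */
    static List<String> readLinesFrom(String path, long offset) throws IOException {
        try (FileInputStream in = new FileInputStream(path)) {
            long remaining = offset;
            while (remaining > 0) {
                long skipped = in.skip(remaining);   // skip() may skip fewer bytes than asked
                if (skipped <= 0) {
                    break;                           // nothing more to skip
                }
                remaining -= skipped;
            }
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            List<String> lines = new ArrayList<>();
            for (String line; (line = reader.readLine()) != null; ) {
                lines.add(line);
            }
            return lines;
        }
    }
}
```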

Binary Search using Java on a UTF-8 encoded text file where line size is not fixed

I have a tab separated UTF-8 file, where the records are sorted on one field. But, the line size is not fixed, so cannot jump into a particular position directly. How can I perform binary search on this?
Example:
line 1: Alfred Brendel /m/011hww /m/0crsgs6,/m/0crvt9h,/m/0cs5n_1,/m/0crtj4t,/m/0crwpnw,/m/0cr_n2s,/m/0crsgyh
line 2: Rupert Sheldrake /m/011ybj /m/0crtszs
You know the number of bytes your whole file contains. Let's call it n.
-> search interval [l, r] with l=0, r=n.
Estimate the middle of your search interval, m = l + (r-l)/2. From this location, move left byte by byte (right would also work) until you find a tab character (byte==9; 9 is the ASCII and UTF-8 code for a tab) [let's name this position mReal] and decode the one line starting at that tab.
Determine whether you have to take the first 'half' (=> new search interval is [l, mReal]) or the second 'half' (=> new search interval is [mReal, r]) for the next search step.
public class YourTokenizer {
public static final String EPF_EOL = "\t";
public static final int READ_SIZE = 4 * 1024 ;
/** The EPF stream buffer. */
private StringBuilder buffer = new StringBuilder();
/** The EPF stream. */
private InputStream stream = null;
public YourTokenizer(final InputStream stream) {
this.stream = stream;
}
private String getNextLine() throws IOException {
int pos = buffer.indexOf(EPF_EOL);
if (pos == -1) {
// eof-of-line sequence isn't available yet, read more of the file
final byte[] bytes = new byte[READ_SIZE];
final int readSize = stream.read(bytes, 0, READ_SIZE);
if (readSize == -1) {
// we have reached the end of the stream and what we're looking for still can't be found
throw new IOException("Premature end of stream");
}
// append only the bytes that were actually read
// (a multi-byte character split across two reads is not handled here)
buffer.append(new String(bytes, 0, readSize, java.nio.charset.StandardCharsets.UTF_8));
pos = buffer.indexOf(EPF_EOL);
if (pos == -1) {
return getNextLine();
}
}
final String data = buffer.substring(0, pos);
pos += EPF_EOL.length();
buffer = buffer.delete(0, pos);
return data;
}
}
and in main:
final InputStream stream = new FileInputStream(file);
final YourTokenizer tokenizer = new YourTokenizer(stream);
String line = tokenizer.getNextLine();
while(line != null) {
//do something
line = tokenizer.getNextLine();
}
You can jump to the middle of bytes. From there you can find the end of that line and you can read the next line from that point. If you need to search back, take a one quarter point, or three quarters and find the line each time. Eventually you will narrow it down to one line.
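One way this could be sketched, under the assumptions that lines end in LF, that the sort key is the first tab-separated field, and that the file is sorted in the same order String.compareTo uses. All names are hypothetical. The search runs over byte offsets; for each probe it reads the first complete line that starts at or after the offset.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

final class SortedFileSearch {
    /** Returns the first line whose first tab-separated field equals key, or null if absent. */
    static String find(String path, String key) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long lo = 0;
            long hi = f.length();
            // find the smallest offset whose following line has a key >= the wanted key
            while (lo < hi) {
                long mid = lo + (hi - lo) / 2;
                String line = lineAtOrAfter(f, mid);
                if (line == null || keyOf(line).compareTo(key) >= 0) {
                    hi = mid;
                } else {
                    lo = mid + 1;
                }
            }
            String candidate = lineAtOrAfter(f, lo);
            return candidate != null && keyOf(candidate).equals(key) ? candidate : null;
        }
    }

    private static String keyOf(String line) {
        int tab = line.indexOf('\t');
        return tab < 0 ? line : line.substring(0, tab);
    }

    /** Reads the first line that starts at a byte offset >= pos, or null if there is none. */
    private static String lineAtOrAfter(RandomAccessFile f, long pos) throws IOException {
        if (pos > 0) {
            f.seek(pos - 1);
            int skipped;
            while ((skipped = f.read()) != -1 && skipped != '\n') {
                // skip the rest of the line we landed in
            }
            if (skipped == -1) {
                return null;
            }
        } else {
            f.seek(0);
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        int b;
        while ((b = f.read()) != -1 && b != '\n') {
            if (b != '\r') {
                buf.write(b);
            }
        }
        if (b == -1 && buf.size() == 0) {
            return null;                       // nothing left after the last newline
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The byte-at-a-time reads keep the sketch short; a real implementation would read through a buffer.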
I think you can guess the line length from the file size.
If you can't even guess the length of the lines, I think it would be better to pick the positions by generating a random number.

Java : Read last n lines of a HUGE file

I want to read the last n lines of a very big file without reading the whole file into any buffer/memory area using Java.
I looked around the JDK APIs and Apache Commons I/O and am not able to locate one which is suitable for this purpose.
I was thinking of the way tail or less does it in UNIX. I don't think they load the entire file and then show the last few lines of the file. There should be similar way to do the same in Java too.
I found the simplest way is to use ReversedLinesFileReader from the Apache Commons IO API.
It gives you the lines of a file from bottom to top, and you can set n_lines to the number of lines you want.
import org.apache.commons.io.input.ReversedLinesFileReader;
File file = new File("D:\\file_name.xml");
int n_lines = 10;
int counter = 0;
ReversedLinesFileReader object = new ReversedLinesFileReader(file);
while(counter < n_lines) {
System.out.println(object.readLine());
counter++;
}
If you use a RandomAccessFile, you can use length and seek to get to a specific point near the end of the file and then read forward from there.
If you find there weren't enough lines, back up from that point and try again. Once you've figured out where the Nth last line begins, you can seek to there and just read-and-print.
An initial best-guess assumption can be made based on your data properties. For example, if it's a text file, it's possible the line lengths won't exceed an average of 132 so, to get the last five lines, start 660 characters before the end. Then, if you were wrong, try again at 1320 (you can even use what you learned from the last 660 characters to adjust that - example: if those 660 characters were just three lines, the next try could be 660 / 3 * 5, plus maybe a bit extra just in case).
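Here is a sketch of that guess-and-retry loop, with made-up names and a simple doubling rule in place of the smarter adjustment described above. It assumes UTF-8 text, n >= 0, and that the final window fits in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

final class GuessingTail {
    /** Returns up to n last lines, starting firstGuessBytes before the end and widening as needed. */
    static List<String> lastLines(String path, int n, int firstGuessBytes) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "r")) {
            long length = f.length();
            long window = Math.max(1, firstGuessBytes);
            while (true) {
                long start = Math.max(0L, length - window);
                byte[] chunk = new byte[(int) (length - start)];
                f.seek(start);
                f.readFully(chunk);
                int from = 0;
                if (start > 0) {
                    // we probably landed mid-line: drop everything up to the first LF
                    while (from < chunk.length && chunk[from] != '\n') {
                        from++;
                    }
                    from = Math.min(chunk.length, from + 1);
                }
                String text = new String(chunk, from, chunk.length - from, StandardCharsets.UTF_8);
                List<String> lines = new BufferedReader(new StringReader(text))
                        .lines().collect(Collectors.toList());
                if (lines.size() >= n || start == 0) {
                    return lines.subList(Math.max(0, lines.size() - n), lines.size());
                }
                window *= 2;   // not enough lines: back up further and try again
            }
        }
    }
}
```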
RandomAccessFile is a good place to start, as described by the other answers. There is one important caveat though.
If your file is not encoded with a one-byte-per-character encoding, the readLine() method is not going to work for you. And readUTF() won't work in any circumstances. (It reads a string preceded by a character count ...)
Instead, you will need to make sure that you look for end-of-line markers in a way that respects the encoding's character boundaries. For fixed length encodings (e.g. flavors of UTF-16 or UTF-32) you need to extract characters starting from byte positions that are divisible by the character size in bytes. For variable length encodings (e.g. UTF-8), you need to search for a byte that must be the first byte of a character.
In the case of UTF-8, the first byte of a character will be 0xxxxxxx or 110xxxxx or 1110xxxx or 11110xxx. Anything else is either a second / third byte, or an illegal UTF-8 sequence. See The Unicode Standard, Version 5.2, Chapter 3.9, Table 3-7. This means, as the comment discussion points out, that any 0x0A and 0x0D bytes in a properly encoded UTF-8 stream will represent a LF or CR character. Thus, simply counting the 0x0A and 0x0D bytes is a valid implementation strategy (for UTF-8) if we can assume that the other kinds of Unicode line separator (0x2028, 0x2029 and 0x0085) are not used. If you can't assume that, then the code would be more complicated.
Having identified a proper character boundary, you can then just call new String(...) passing the byte array, offset, count and encoding, and then repeatedly call String.lastIndexOf(...) to count end-of-lines.
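To make the boundary check concrete, here is a small sketch with hypothetical helper names. It assumes well-formed UTF-8, so any byte that is not a 10xxxxxx continuation byte is treated as the first byte of a character.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Boundary {
    /** True unless b is a 10xxxxxx continuation byte (assumes well-formed UTF-8). */
    static boolean isLeadByte(byte b) {
        return (b & 0xC0) != 0x80;
    }

    /** Moves pos backwards until it points at the first byte of a character. */
    static int alignToCharStart(byte[] data, int pos) {
        while (pos > 0 && !isLeadByte(data[pos])) {
            pos--;
        }
        return pos;
    }

    /** Decodes the bytes from the character boundary at or before pos to the end of the array. */
    static String decodeFrom(byte[] data, int pos) {
        int start = alignToCharStart(data, pos);
        return new String(data, start, data.length - start, StandardCharsets.UTF_8);
    }
}
```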
The ReversedLinesFileReader can be found in the Apache Commons IO java library.
int n_lines = 1000;
ReversedLinesFileReader object = new ReversedLinesFileReader(new File(path));
String result="";
for(int i=0;i<n_lines;i++){
String line=object.readLine();
if(line==null)
break;
result+=line;
}
return result;
I found RandomAccessFile and other buffered reader classes too slow for me. Nothing can be faster than a tail -<#lines>, so this was the best solution for me.
public String getLastNLogLines(File file, int nLines) {
StringBuilder s = new StringBuilder();
try {
Process p = Runtime.getRuntime().exec("tail -"+nLines+" "+file);
java.io.BufferedReader input = new java.io.BufferedReader(new java.io.InputStreamReader(p.getInputStream()));
String line = null;
//Here we first read the next line into the variable
//line and then check for the EOF condition, which
//is the return value of null
while((line = input.readLine()) != null){
s.append(line+'\n');
}
} catch (java.io.IOException e) {
e.printStackTrace();
}
return s.toString();
}
CircularFifoBuffer from Apache Commons: see the answer to a similar question at How to read last 5 lines of a .txt file into java
Note that in Apache Commons Collections 4 this class seems to have been renamed to CircularFifoQueue
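If you would rather not add the dependency, the same sliding-window idea can be sketched with java.util.ArrayDeque. Note that this swaps in a JDK class for the Commons one, and the class and method names are invented here. It still reads the whole file once, but keeps only the last n lines in memory.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class DequeTail {
    /** Returns up to n last lines of a UTF-8 text file. */
    static List<String> lastLines(Path file, int n) throws IOException {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Deque<String> window = new ArrayDeque<>(n);
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            for (String line; (line = reader.readLine()) != null; ) {
                if (window.size() == n) {
                    window.removeFirst();      // forget the oldest line
                }
                window.addLast(line);
            }
        }
        return new ArrayList<>(window);
    }
}
```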
package com.uday;
import java.io.File;
import java.io.RandomAccessFile;
public class TailN {
public static void main(String[] args) throws Exception {
long startTime = System.currentTimeMillis();
TailN tailN = new TailN();
File file = new File("/Users/udakkuma/Documents/workspace/uday_cancel_feature/TestOOPS/src/file.txt");
tailN.readFromLast(file);
System.out.println("Execution Time : " + (System.currentTimeMillis() - startTime));
}
public void readFromLast(File file) throws Exception {
int lines = 3;
int readLines = 0;
StringBuilder builder = new StringBuilder();
try (RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r")) {
long fileLength = file.length() - 1;
// Set the pointer at the last of the file
randomAccessFile.seek(fileLength);
for (long pointer = fileLength; pointer >= 0; pointer--) {
randomAccessFile.seek(pointer);
char c;
// read from the last, one char at the time
c = (char) randomAccessFile.read();
// break when end of the line
if (c == '\n') {
readLines++;
if (readLines == lines)
break;
}
builder.append(c);
fileLength = fileLength - pointer;
}
// Since line is read from the last so it is in reverse order. Use reverse
// method to make it correct order
builder.reverse();
System.out.println(builder.toString());
}
}
}
A RandomAccessFile allows for seeking (http://download.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html). The File.length method will return the size of the file. The problem is determining number of lines. For this, you can seek to the end of the file and read backwards until you have hit the right number of lines.
I had a similar problem, but I didn't understand the other solutions.
I used this. I hope it is simple code.
// String filePathName = (directory and file name).
File f = new File(filePathName);
long fileLength = f.length(); // Size of the file [bytes].
long fileLength_toRead = 0;
if (fileLength > 2000) {
// My file content is a table; I know one row has about e.g. 100 bytes / characters.
// I start reading 1000 bytes before the end of the file.
// If you don't know the line length, use @paxdiablo's advice.
fileLength_toRead = fileLength - 1000;
}
try (RandomAccessFile raf = new RandomAccessFile(filePathName, "r")) { // try-with-resources opens and closes the file.
raf.seek(fileLength_toRead); // Reading will begin at this byte.
String rowInFile = raf.readLine(); // The first line read is usually incomplete, so I discard it.
rowInFile = raf.readLine();
while (rowInFile != null) {
// Here I can add the lines read (rowInFile) to a String[] array or an ArrayList<String>.
// Later I can work with the rows from the array - the last row is sometimes empty, etc.
rowInFile = raf.readLine();
}
}
catch (IOException e) {
//
}
Here is working code for this.
private static void printLastNLines(String filePath, int n) {
File file = new File(filePath);
StringBuilder builder = new StringBuilder();
try {
RandomAccessFile randomAccessFile = new RandomAccessFile(filePath, "r");
long pos = file.length() - 1;
randomAccessFile.seek(pos);
for (long i = pos - 1; i >= 0; i--) {
randomAccessFile.seek(i);
char c = (char) randomAccessFile.read();
if (c == '\n') {
n--;
if (n == 0) {
break;
}
}
builder.append(c);
}
builder.reverse();
System.out.println(builder.toString());
} catch (FileNotFoundException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
}
Here is the best way I've found to do it. Simple and pretty fast and memory efficient.
public static void tail(File src, OutputStream out, int maxLines) throws FileNotFoundException, IOException {
String[] lines = new String[maxLines];
long count = 0; // total number of lines read so far
try (BufferedReader reader = new BufferedReader(new FileReader(src))) {
for (String line=reader.readLine(); line != null; line=reader.readLine()) {
// ring buffer: overwrite the oldest slot
lines[(int) (count++ % maxLines)] = line;
}
}
OutputStreamWriter writer = new OutputStreamWriter(out);
// the file may hold fewer than maxLines lines
long kept = Math.min(count, maxLines);
for (long ndx = count - kept; ndx < count; ndx++) {
writer.write(lines[(int) (ndx % maxLines)]);
writer.write("\n");
}
writer.flush();
}
(See comment)
public String readFromLast(File file, int howMany) throws IOException {
int numLinesRead = 0;
StringBuilder builder = new StringBuilder();
try (RandomAccessFile randomAccessFile = new RandomAccessFile(file, "r")) {
try (ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
long fileLength = file.length() - 1;
/*
* Set the pointer at the end of the file. If the file is empty, an IOException
* will be thrown
*/
randomAccessFile.seek(fileLength);
for (long pointer = fileLength; pointer >= 0; pointer--) {
randomAccessFile.seek(pointer);
byte b = (byte) randomAccessFile.read();
if (b == '\n') {
numLinesRead++;
// (Last line often terminated with a line separator)
if (numLinesRead == (howMany + 1))
break;
}
baos.write(b);
fileLength = fileLength - pointer;
}
/*
* Since line is read from the last so it is in reverse order. Use reverse
* method to make it ordered correctly
*/
byte[] a = baos.toByteArray();
int start = 0;
int mid = a.length / 2;
int end = a.length - 1;
while (start < mid) {
byte temp = a[end];
a[end] = a[start];
a[start] = temp;
start++;
end--;
}// End while
return new String(a).trim();
} // End inner try-with-resources
} // End outer try-with-resources
} // End method
I tried RandomAccessFile first and it was tedious to read the file backwards, repositioning the file pointer upon every read operation. So, I tried @Luca's solution and got the last few lines of the file as a string with just two lines of code, in a few minutes.
InputStream inputStream = Runtime.getRuntime().exec("tail " + path.toFile()).getInputStream();
String tail = new BufferedReader(new InputStreamReader(inputStream)).lines().collect(Collectors.joining(System.lineSeparator()));
Code is 2 lines only
// Please specify correct Charset
ReversedLinesFileReader rlf = new ReversedLinesFileReader(file, StandardCharsets.UTF_8);
// read last 2 lines
System.out.println(rlf.toString(2));
Gradle:
implementation group: 'commons-io', name: 'commons-io', version: '2.11.0'
Maven:
<dependency>
<groupId>commons-io</groupId><artifactId>commons-io</artifactId><version>2.11.0</version>
</dependency>

Modify a .txt file in Java

I have a text file that I want to edit using Java. It has many thousands of lines. I basically want to iterate through the lines and change/edit/delete some text. This will need to happen quite often.
From the solutions I saw on other sites, the general approach seems to be:
Open the existing file using a BufferedReader
Read each line, make modifications to each line, and add it to a StringBuilder
Once all the text has been read and modified, write the contents of the StringBuilder to a new file
Replace the old file with the new file
This solution seems slightly "hacky" to me, especially if I have thousands of lines in my text file.
Anybody know of a better solution?
I haven't done this in Java recently, but writing an entire file into memory seems like a bad idea.
The best idea that I can come up with is open a temporary file in writing mode at the same time, and for each line, read it, modify if necessary, then write into the temporary file. At the end, delete the original and rename the temporary file.
If you have modify permissions on the file system, you probably also have deleting and renaming permissions.
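A minimal sketch of that temporary-file approach using java.nio.file. The class name and the convention that the edit function returns null to delete a line are assumptions made for this example. Only one line is held in memory at a time.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.UnaryOperator;

final class RewriteViaTempFile {
    /** Applies edit to every line; a null result drops the line. */
    static void rewrite(Path file, UnaryOperator<String> edit) throws IOException {
        // create the temp file next to the original so the final move stays on one file system
        Path temp = Files.createTempFile(file.toAbsolutePath().getParent(), "rewrite", ".tmp");
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            for (String line; (line = in.readLine()) != null; ) {
                String edited = edit.apply(line);
                if (edited != null) {
                    out.write(edited);
                    out.newLine();
                }
            }
        }
        // replace the original with the rewritten copy
        Files.move(temp, file, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For example, rewrite(path, l -> l.isEmpty() ? null : l.trim()) would delete blank lines and trim the rest. If an exception is thrown halfway, the temporary file is left behind in this sketch.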
If the file is just a few thousand lines, you should be able to read the entire file in one read and convert that to a String.
You can use Apache IOUtils, which has methods like the following.
public static String readFile(String filename) throws IOException {
File file = new File(filename);
byte[] bytes = new byte[(int) file.length()];
// try-with-resources closes the stream on success and on failure
try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
// do not put the read inside an assert: asserts are disabled by default,
// and a plain read() may return fewer bytes than requested
in.readFully(bytes);
}
return new String(bytes, "UTF-8");
}
public static void writeFile(String filename, String text) throws IOException {
try (FileOutputStream fos = new FileOutputStream(filename)) {
fos.write(text.getBytes("UTF-8"));
}
}
public static void close(Closeable closeable) {
if (closeable == null) {
return;
}
try {
closeable.close();
} catch(IOException ignored) {
}
}
You can use RandomAccessFile in Java to modify the file on one condition:
The size of each line has to be fixed; otherwise, when the new string is written back, it might overwrite the string in the next line.
Therefore, in my example, I set the line length to 100 and pad with spaces when creating the file and writing back to the file.
So in order to allow updates, you need to set the line length a little larger than the longest line in this file.
public class RandomAccessFileUtil {
public static final long RECORD_LENGTH = 100;
public static final String EMPTY_STRING = " ";
public static final String CRLF = "\n";
public static final String PATHNAME = "/home/mjiang/JM/mahtew.txt";
/**
* one two three
Text to be appended with
five six seven
eight nine ten
*
*
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException
{
String starPrefix = "Text to be appended with";
String replacedString = "new text has been appended";
RandomAccessFile file = new RandomAccessFile(new File(PATHNAME), "rw");
String line = "";
while((line = file.readLine()) != null)
{
if(line.startsWith(starPrefix))
{
file.seek(file.getFilePointer() - RECORD_LENGTH - 1);
file.writeBytes(replacedString);
}
}
}
public static void createFile() throws IOException
{
RandomAccessFile file = new RandomAccessFile(new File(PATHNAME), "rw");
String line1 = "one two three";
String line2 = "Text to be appended with";
String line3 = "five six seven";
String line4 = "eight nine ten";
file.writeBytes(paddingRight(line1));
file.writeBytes(CRLF);
file.writeBytes(paddingRight(line2));
file.writeBytes(CRLF);
file.writeBytes(paddingRight(line3));
file.writeBytes(CRLF);
file.writeBytes(paddingRight(line4));
file.writeBytes(CRLF);
file.close();
System.out.println(String.format("File is created in [%s]", PATHNAME));
}
public static String paddingRight(String source)
{
StringBuilder result = new StringBuilder(100);
if(source != null)
{
result.append(source);
for (int i = 0; i < RECORD_LENGTH - source.length(); i++)
{
result.append(EMPTY_STRING);
}
}
return result.toString();
}
}
If the file is large, you might want to use a FileStream for output, but that seems pretty much like it is the simplest process to do what you're asking (and without more specificity i.e. on what types of changes / edits / deletions you're trying to do, it's impossible to determine what more complicated way might work).
No reason to buffer the entire file.
Simply write each line as your read it, insert lines when necessary, delete lines when necessary, replace lines when necessary.
Fundamentally, you will not get around having to recreate the file wholesale, especially if it's just a text file.
What kind of data is it? Do you control the format of the file?
If the file contains name/value pairs (or similar), you could have some luck with Properties, or perhaps cobbling together something using a flat file JDBC driver.
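For the name/value case, a sketch with java.util.Properties might look like this (the class and method names are made up). Bear in mind that store() rewrites the whole file, adds a timestamp comment, and does not keep the original ordering or comments.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class PropertiesEdit {
    /** Loads the file, sets one key, and writes the file back. */
    static void set(Path file, String key, String value) throws IOException {
        Properties props = new Properties();
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            props.load(in);
        }
        props.setProperty(key, value);
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            props.store(out, null);   // null: no extra header comment
        }
    }
}
```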
Alternatively, have you considered not writing the data so often? Operating on an in-memory copy of your file should be relatively trivial. If there are no external resources which need real time updates of the file, then there is no need to go to disk every time you want to make a modification. You can run a scheduled task to write periodic updates to disk if you are worried about data backup.
In general you cannot edit the file in place; it's simply a very long sequence of characters, which happens to include newline characters. You could edit in place if your changes don't change the number of characters in each line.
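As a sketch of the one case where in-place editing does work, the helper below (a hypothetical name) overwrites a byte range with a replacement of exactly the same length, so nothing after it has to move. For multi-byte encodings the lengths must match in bytes, not just in characters.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

final class InPlaceEdit {
    /** Overwrites replacement.length bytes starting at offset; the file size does not change. */
    static void overwrite(String path, long offset, byte[] replacement) throws IOException {
        try (RandomAccessFile f = new RandomAccessFile(path, "rw")) {
            if (offset < 0 || offset + replacement.length > f.length()) {
                throw new IllegalArgumentException("replacement would change the file size");
            }
            f.seek(offset);
            f.write(replacement);
        }
    }
}
```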
Can't you use regular expressions, if you know what you want to change ? Jakarta Regexp should probably do the trick.
Although this question was posted some time ago, I think it is good to put my answer here.
I think the best approach in this scenario is to use FileChannel from the java.nio.channels package, but only if you need good performance. You would need to get a FileChannel via a RandomAccessFile, like this:
java.nio.channels.FileChannel channel = new java.io.RandomAccessFile("/my/fyle/path", "rw").getChannel();
After this, you need to create a ByteBuffer into which you will read from the FileChannel.
It looks something like this:
java.nio.ByteBuffer inBuffer = java.nio.ByteBuffer.allocate(100);
long pos = 0;
int aux = 0;
StringBuilder sb = new StringBuilder();
// read(buffer, position) returns the number of bytes read, or -1 at end of file
while ((aux = channel.read(inBuffer, pos)) != -1) {
pos += aux;
byte[] b = inBuffer.array();
sb.delete(0, sb.length());
// only the first aux bytes of the array were filled by this read
for (int i = 0; i < aux; ++i) {
sb.append((char)b[i]);
}
//here you can do your stuff on sb
inBuffer.clear();
}
Hope that my answer will help you!
I think FileOutputStream.getChannel() will help a lot; see the FileChannel API
http://java.sun.com/javase/6/docs/api/java/nio/channels/FileChannel.html
private static void modifyFile(String filePath, String oldString, String newString) {
File fileToBeModified = new File(filePath);
StringBuilder oldContent = new StringBuilder();
try (BufferedReader reader = new BufferedReader(new FileReader(fileToBeModified))) {
String line = reader.readLine();
while (line != null) {
oldContent.append(line).append(System.lineSeparator());
line = reader.readLine();
}
String content = oldContent.toString();
String newContent = content.replaceAll(oldString, newString);
try (FileWriter writer = new FileWriter(fileToBeModified)) {
writer.write(newContent);
}
} catch (IOException e) {
e.printStackTrace();
}
}
You can change the txt file to java by saving on clicking "Save As" and saving *.java extension.

How can I get a java.io.InputStream from a java.lang.String?

I have a String that I want to use as an InputStream. In Java 1.0, you could use java.io.StringBufferInputStream, but that has been @Deprecated (with good reason--you cannot specify the character set encoding):
This class does not properly convert
characters into bytes. As of JDK 1.1,
the preferred way to create a stream
from a string is via the StringReader
class.
You can create a java.io.Reader with java.io.StringReader, but there are no adapters to take a Reader and create an InputStream.
I found an ancient bug asking for a suitable replacement, but no such thing exists--as far as I can tell.
The oft-suggested workaround is to use java.lang.String.getBytes() as input to java.io.ByteArrayInputStream:
public InputStream createInputStream(String s, String charset)
throws java.io.UnsupportedEncodingException {
return new ByteArrayInputStream(s.getBytes(charset));
}
but that means materializing the entire String in memory as an array of bytes, and defeats the purpose of a stream. In most cases this is not a big deal, but I was looking for something that would preserve the intent of a stream--that as little of the data as possible is (re)materialized in memory.
Update: This answer is precisely what the OP doesn't want. Please read the other answers.
For those cases when we don't care about the data being re-materialized in memory, please use:
new ByteArrayInputStream(str.getBytes("UTF-8"))
If you don't mind a dependency on the commons-io package, then you could use the IOUtils.toInputStream(String text) method.
There is an adapter from Apache Commons-IO which adapts from Reader to InputStream, which is named ReaderInputStream.
Example code:
@Test
public void testReaderInputStream() throws IOException {
InputStream inputStream = new ReaderInputStream(new StringReader("largeString"), StandardCharsets.UTF_8);
Assert.assertEquals("largeString", IOUtils.toString(inputStream, StandardCharsets.UTF_8));
}
Reference: https://stackoverflow.com/a/27909221/5658642
To my mind, the easiest way to do this is by pushing the data through a Writer:
public class StringEmitter {
public static void main(String[] args) throws IOException {
class DataHandler extends OutputStream {
@Override
public void write(final int b) throws IOException {
write(new byte[] { (byte) b });
}
@Override
public void write(byte[] b) throws IOException {
write(b, 0, b.length);
}
@Override
public void write(byte[] b, int off, int len)
throws IOException {
System.out.println("bytecount=" + len);
}
}
StringBuilder sample = new StringBuilder();
while (sample.length() < 100 * 1000) {
sample.append("sample");
}
Writer writer = new OutputStreamWriter(
new DataHandler(), "UTF-16");
writer.write(sample.toString());
writer.close();
}
}
The JVM implementation I'm using pushed data through in 8K chunks, but you could have some effect on the buffer size by reducing the number of characters written at one time and calling flush.
An alternative to writing your own CharsetEncoder wrapper is to use a Writer to encode the data, though it is something of a pain to do right. This should be a reliable (if inefficient) implementation:
/** Inefficient string stream implementation */
public class StringInputStream extends InputStream {
/* # of characters to buffer - must be >=2 to handle surrogate pairs */
private static final int CHAR_CAP = 8;
private final Queue<Byte> buffer = new LinkedList<Byte>();
private final Writer encoder;
private final String data;
private int index;
public StringInputStream(String sequence, Charset charset) {
data = sequence;
encoder = new OutputStreamWriter(
new OutputStreamBuffer(), charset);
}
private int buffer() throws IOException {
if (index >= data.length()) {
return -1;
}
int rlen = index + CHAR_CAP;
if (rlen > data.length()) {
rlen = data.length();
}
for (; index < rlen; index++) {
char ch = data.charAt(index);
encoder.append(ch);
// ensure data enters buffer
encoder.flush();
}
if (index >= data.length()) {
encoder.close();
}
return buffer.size();
}
@Override
public int read() throws IOException {
if (buffer.size() == 0) {
int r = buffer();
if (r == -1) {
return -1;
}
}
return 0xFF & buffer.remove();
}
private class OutputStreamBuffer extends OutputStream {
@Override
public void write(int i) throws IOException {
byte b = (byte) i;
buffer.add(b);
}
}
}
Well, one possible way is to:
Create a PipedOutputStream
Pipe it to a PipedInputStream
Wrap an OutputStreamWriter around the PipedOutputStream (you can specify the encoding in the constructor)
Et voilà, anything you write to the OutputStreamWriter can be read from the PipedInputStream!
Of course, this seems like a rather hackish way to do it, but at least it is a way.
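The steps above could be sketched roughly as follows. The class name is made up, and the background thread is an addition of this sketch: piped streams are meant to be written and read from different threads, because the pipe buffer is small and a single thread doing both can block.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;
import java.nio.charset.Charset;

final class PipedStringStream {
    /** Returns a stream of the encoded text, produced lazily by a background writer thread. */
    static InputStream open(String text, Charset charset) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread writerThread = new Thread(() -> {
            // closing the writer closes the pipe, which signals end of stream to the reader
            try (Writer writer = new OutputStreamWriter(out, charset)) {
                writer.write(text);
            } catch (IOException ignored) {
                // the reader closed its end early; nothing left to do
            }
        });
        writerThread.setDaemon(true);
        writerThread.start();
        return in;
    }
}
```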
A solution is to roll your own, creating an InputStream implementation that likely would use java.nio.charset.CharsetEncoder to encode each char or chunk of chars to an array of bytes for the InputStream as necessary.
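A rough sketch of such a class, under a few assumptions of mine: the class name is invented, the byte buffer is fixed at 64 bytes, and malformed or unmappable characters are replaced rather than reported. Only that small buffer of encoded bytes exists at any time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

final class EncodingInputStream extends InputStream {
    private final CharsetEncoder encoder;
    private final CharBuffer chars;
    private final ByteBuffer bytes = ByteBuffer.allocate(64);
    private boolean flushed;

    EncodingInputStream(CharSequence text, Charset charset) {
        encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        chars = CharBuffer.wrap(text);   // a view of the text, not a copy
        bytes.flip();                    // start with an empty buffer
    }

    @Override
    public int read() throws IOException {
        while (!bytes.hasRemaining()) {
            if (flushed) {
                return -1;
            }
            bytes.clear();
            // encode as many characters as fit into the small byte buffer
            CoderResult result = encoder.encode(chars, bytes, true);
            if (result.isUnderflow()) {
                // all characters consumed: let the encoder write any trailing bytes
                result = encoder.flush(bytes);
                if (result.isUnderflow()) {
                    flushed = true;
                }
            }
            bytes.flip();
        }
        return bytes.get() & 0xFF;
    }
}
```

Overriding read(byte[], int, int) as well would make bulk reads much faster; it is left out to keep the sketch short.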
You can take help of org.hsqldb.lib library.
public StringInputStream(String paramString)
{
this.str = paramString;
this.available = (paramString.length() * 2);
}
I know this is an old question but I had the same problem myself today, and this was my solution:
public static InputStream getStream(final CharSequence charSequence) {
return new InputStream() {
int index = 0;
int length = charSequence.length();
@Override public int read() throws IOException {
return index>=length ? -1 : charSequence.charAt(index++);
}
};
}
