I'm looking for a way to switch between reading bytes (as byte[]) and reading lines of Strings from a file. I know that a byte[] can be obtained from a file through a FileInputStream, and a String can be obtained through a BufferedReader, but using both of them at the same time is proving problematic. I know how long each section of bytes is, and the String encoding can be kept constant from when I write the file. The file type is a custom one that is still in development, so I can change how I write data to it.
How can I read Strings and byte[]s from the same file in java?
Read as bytes. When you have read a sequence of bytes that you know should be a string, place those bytes in an array, put the array inside a ByteArrayInputStream and use that as the underlying InputStream for a Reader to get the bytes as characters, then read those characters to produce a String.
For the later parts of this process see the related SO question on how to create a String from an InputStream.
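For example, here is a rough sketch of that idea (exception handling omitted; the file name and the 4-byte / 16-byte section lengths are made up for illustration):
// Read a binary section, then a string section, from the same stream.
DataInputStream in = new DataInputStream(
        new BufferedInputStream(new FileInputStream("custom.dat")));

byte[] rawSection = new byte[4];
in.readFully(rawSection);                       // a section you keep as raw bytes

byte[] stringSection = new byte[16];
in.readFully(stringSection);                    // a section you know holds text

Reader reader = new InputStreamReader(
        new ByteArrayInputStream(stringSection), "UTF-8");
StringBuilder sb = new StringBuilder();
int c;
while ((c = reader.read()) != -1) {
    sb.append((char) c);
}
String text = sb.toString();                    // the decoded string section

in.close();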
Read the file as Strings using a BufferedReader then use String.getBytes().
Why not try this:
BufferedReader bufferedReader = null;
try {
    bufferedReader = new BufferedReader(new FileReader("testing.txt"));
    String line = bufferedReader.readLine();
    while (line != null) {
        byte[] b = line.getBytes();   // do something with the bytes of this line
        line = bufferedReader.readLine();
    }
} finally {
    if (bufferedReader != null) {
        bufferedReader.close();
    }
}
or
FileInputStream in = null;
BufferedReader bufferedReader = null;
try {
    bufferedReader = new BufferedReader(new FileReader("xanadu.txt"));
    String line = bufferedReader.readLine();
    while (line != null) {
        // read your line
        line = bufferedReader.readLine();
    }
    in = new FileInputStream("xanadu.txt");
    int c;
    while ((c = in.read()) != -1) {
        // read your bytes (c)
    }
} finally {
    if (in != null) {
        in.close();
    }
    if (bufferedReader != null) {
        bufferedReader.close();
    }
}
Read everything as bytes from the buffered input stream, and convert the string sections into Strings using the constructor that accepts a byte array:
String string = new String(bytes, offset, length, "US-ASCII");
Depending on how the data are actually encoded, you may need to use "UTF-8" or something else as the name of the charset.
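For example, a minimal sketch of that approach (exception handling omitted; the file name, offset, and length are placeholder values for whatever your format defines):
// Read the whole file into a byte array, then decode one known section.
DataInputStream in = new DataInputStream(
        new BufferedInputStream(new FileInputStream("custom.dat")));
byte[] all = new byte[(int) new File("custom.dat").length()];
in.readFully(all);
in.close();

int offset = 4;    // where a string section starts (placeholder)
int length = 16;   // how long that section is (placeholder)
String section = new String(all, offset, length, "US-ASCII");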
Related
So, I am developing an Android application that reads a JSON text file containing some data. I have a 300 KB (307,312 bytes) JSON text file (here). I also develop a desktop application (C++) to generate, load and parse the JSON text file.
When I try to open and read it using ifstream in C++, I get the string length correctly (307,312), and I even parse it successfully.
Here is my code in C++:
std::string json = "";
std::string line;
std::ifstream myfile("textfile.txt");
if (myfile.is_open()) {
    while (std::getline(myfile, line)) {
        json += line;
        json.push_back('\n');
    }
    json.pop_back(); // pop back the last '\n'
    myfile.close();
} else {
    std::cout << "Unable to open file";
}
In my Android application, I put the JSON text file in the res/raw folder. When I try to open and read it using an InputStream, the length of the string is only 291,896, and I can't parse it (I parse it using JNI with the same C++ code; maybe that is not important).
InputStream is = getResources().openRawResource(R.raw.textfile);
byte[] b = new byte[is.available()];
is.read(b);
in_str = new String(b);
UPDATE:
I have also tried it this way:
InputStream is = getResources().openRawResource(R.raw.textfile);
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
String line = reader.readLine();
while (line != null) {
    in_str += line;
    in_str += '\n';
    line = reader.readLine();
}
if (in_str != null && in_str.length() > 0) {
    in_str = in_str.substring(0, in_str.length() - 1);
}
I even tried moving it from the res/raw folder to the assets folder in the Java Android project, and of course I changed the InputStream line to InputStream is = getAssets().open("textfile.txt"). Still not working.
Okay, I found the solution. It is an ASCII vs. UTF-8 problem.
From here:
UTF-8: variable-length encoding, 1 to 4 bytes per code point. ASCII values are encoded as ASCII using 1 byte.
ASCII: single-byte encoding
My file size is 307,312 bytes and basically I need one character per byte, so I need to read the file as ASCII.
When I use the C++ ifstream, the string size is 307,312 (the same as the number of characters when ASCII encoding is used).
Meanwhile, when I use the Java InputStream, the string size is only 291,896. I assume that happens because the reader is using UTF-8 encoding instead.
So, how do I get ASCII encoding in Java?
Through this thread and this article, we can use an InputStreamReader in Java and set it to ASCII. Here is my complete code:
String in_str = "";
try {
    InputStream is = getResources().openRawResource(R.raw.textfile);
    BufferedReader reader = new BufferedReader(new InputStreamReader(is, "ASCII"));
    String line = reader.readLine();
    while (line != null) {
        in_str += line;
        in_str += '\n';
        line = reader.readLine();
    }
    if (in_str != null && in_str.length() > 0) {
        in_str = in_str.substring(0, in_str.length() - 1);
    }
} catch (Exception e) {
    e.printStackTrace();
}
If you have the same problem, hope this helps. Cheers.
I'm trying to read a file (the extension doesn't matter) and write it back out afterwards, but when I do, the output file is different from the input.
My code is the following:
OutputStream outputStream = null;
FileReader fr = new FileReader("rute\\inputfile.PNG");
BufferedReader br = new BufferedReader(fr);
String content = "";
String line;
while ((line = br.readLine()) != null) {
    content += line;
}
byte[] toBytes = content.getBytes();
InputStream inputStream = new ByteArrayInputStream(toBytes);
try {
    outputStream = new FileOutputStream(new File("rute\\output.PNG"));
    int read = 0;
    byte[] bytes = new byte[1024];
    while ((read = inputStream.read(bytes)) != -1) {
        outputStream.write(bytes, 0, read);
    }
    outputStream.close();
} catch (Exception e) {
    e.printStackTrace();
}
inputStream.close();
If you ask why I convert to bytes and write the file this way: it's because I need to do something with the data, and I need this conversion.
And if you want to tell me that I can't load an image into a String: yes, I know I could do something like this:
File fil = ~~~~;
FileInputStream fis = null;
fis = new FileInputStream(fil);
byte[] bytess = IOUtils.toByteArray(fis);
But I don't want to do it that way, because for big files the heap size is not enough, and that problem could be solved by the line-by-line read.
Thanks for your answers.
I recommend reading this question first. Since you are reading binary data into a String, you are changing the encoding of that data, so the output will be different.
The best approach is to read binary files as byte arrays, though it depends on what kind of transformation/editing/changes you need to apply to them.
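For example, a minimal sketch (using the paths from the question) that copies the file through byte buffers without ever converting to a String:
// Copy the file as raw bytes; no Reader, no String, so nothing gets re-encoded.
InputStream in = null;
OutputStream out = null;
try {
    in = new BufferedInputStream(new FileInputStream("rute\\inputfile.PNG"));
    out = new BufferedOutputStream(new FileOutputStream("rute\\output.PNG"));
    byte[] buffer = new byte[8192];
    int read;
    while ((read = in.read(buffer)) != -1) {
        // this is where you could inspect or modify the bytes in "buffer"
        out.write(buffer, 0, read);
    }
} finally {
    if (out != null) out.close();
    if (in != null) in.close();
}
The fixed-size buffer keeps memory use constant, so large files are not a problem for the heap.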
UPDATE
And, of course, you are editing your content before writing:
while ((line = br.readLine()) != null) {
    content += line + "\n";
}
readLine() strips the original line terminators, so your output file will always be different.
UPDATE 2
Since the question/problem is how to read a big binary file, Google is usually your friend.
Or you can check this other question.
I am reading in a file that is being sent through a socket and then trying to split it on newlines (\n). When I read in the file I use a byte[], and I convert the byte array to a String so that I can split it.
public String getUserFileData()
{
    try
    {
        byte[] mybytearray = new byte[1024];
        InputStream is = clientSocket.getInputStream();
        int bytesRead = is.read(mybytearray, 0, mybytearray.length);
        is.close();
        return new String(mybytearray);
    }
    catch (IOException e)
    {
    }
    return "";
}
Here is the code used to attempt to split the String:
public void readUserFile(String userData, Log logger)
{
    String[] data;
    String companyName;
    data = userData.split("\n");
    username = data[0];
    password = data[1].toCharArray();
    companyName = data[2];
    quota = Float.parseFloat(data[3]);
    company = new Company();
    company.readCompanyFile("C:\\Users\\Chris\\Documents\\NetBeansProjects\\ArFile\\ArFile Clients\\" + companyName + "\\"
            + companyName + ".cmp");
    cloudFiles = new CloudFiles();
    cloudFiles.readCloudFiles(this, logger);
}
It causes this error
Exception in thread "AWT-EventQueue-1" java.lang.ArrayIndexOutOfBoundsException
You can use the readLine method of the BufferedReader class.
Wrap the InputStream in an InputStreamReader, and wrap that in a BufferedReader:
InputStream is = clientSocket.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(is));
Please also check the encoding of the stream - you might need to specify the encoding in the constructor of InputStreamReader.
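For example (assuming the data was sent as UTF-8; substitute whatever charset the sender actually used):
// Decode the socket's byte stream with an explicit charset instead of the platform default.
BufferedReader reader = new BufferedReader(
        new InputStreamReader(clientSocket.getInputStream(), "UTF-8"));
String line = reader.readLine();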
As stated in comments, using a BufferedReader would be best - you should be using an InputStreamReader anyway in order to convert from binary to text.
// Or use a different encoding - whatever's appropriate
BufferedReader reader = new BufferedReader(
        new InputStreamReader(clientSocket.getInputStream(), "UTF-8"));
try {
    String line;
    // I'm assuming you want to read every incoming line
    while ((line = reader.readLine()) != null) {
        processLine(line);
    }
} finally {
    reader.close();
}
Note that it's important to state which encoding you want to use - otherwise it'll use the platform's default encoding, which will vary from machine to machine, whereas presumably the data is in one specific encoding. If you don't know which encoding that is yet, you need to find out. Until then, you simply can't reliably understand the data.
(I hope your real code doesn't have an empty catch block, by the way.)
My input is an InputStream which contains an XML document. The encoding used in the XML is unknown; it is declared in the first line of the XML document.
From this InputStream, I want to get the whole document into a String.
To do this, I use a BufferedInputStream to mark the beginning of the file and start reading the first line. I read this first line to get the encoding, and then I use an InputStreamReader to generate a String with the correct encoding.
It seems this is not the best way to achieve this goal, because it produces an OutOfMemoryError.
Any idea how to do it?
public static String streamToString(final InputStream is) {
String result = null;
if (is != null) {
BufferedInputStream bis = new BufferedInputStream(is);
bis.mark(Integer.MAX_VALUE);
final StringBuilder stringBuilder = new StringBuilder();
try {
// stream reader that handle encoding
final InputStreamReader readerForEncoding = new InputStreamReader(bis, "UTF-8");
final BufferedReader bufferedReaderForEncoding = new BufferedReader(readerForEncoding);
String encoding = extractEncodingFromStream(bufferedReaderForEncoding);
if (encoding == null) {
encoding = DEFAULT_ENCODING;
}
// stream reader that handle encoding
bis.reset();
final InputStreamReader readerForContent = new InputStreamReader(bis, encoding);
final BufferedReader bufferedReaderForContent = new BufferedReader(readerForContent);
String line = bufferedReaderForContent.readLine();
while (line != null) {
stringBuilder.append(line);
line = bufferedReaderForContent.readLine();
}
bufferedReaderForContent.close();
bufferedReaderForEncoding.close();
} catch (IOException e) {
// reset string builder
stringBuilder.delete(0, stringBuilder.length());
}
result = stringBuilder.toString();
}else {
result = null;
}
return result;
}
The call to mark(Integer.MAX_VALUE) is causing the OutOfMemoryError, since it's trying to allocate 2GB of memory.
You can solve this by using an iterative approach. Set the mark readLimit to a reasonable value, say 8K. In 99% of cases this will work, but in pathological cases, e.g. 16K of spaces between the attributes in the declaration, you will need to try again. Thus, have a loop that tries to find the encoding; if it doesn't find it within the given mark region, it tries again, doubling the requested mark readLimit.
To be sure you don't advance the input stream past the mark limit, you should read the InputStream yourself, up to the mark limit, into a byte array. Then wrap the byte array in a ByteArrayInputStream and pass that to the constructor of the InputStreamReader assigned to readerForEncoding.
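Here is a rough sketch of that loop. It assumes extractEncodingFromStream and DEFAULT_ENCODING exist as in the question; the window sizes and variable names are illustrative:
// Detect the declared encoding within a bounded window, growing the window on failure.
public static String streamToString(final InputStream is) throws IOException {
    BufferedInputStream bis = new BufferedInputStream(is);
    int readLimit = 8 * 1024;                       // start with an 8K window
    String encoding = null;
    while (encoding == null) {
        bis.mark(readLimit);
        byte[] header = new byte[readLimit];
        int filled = 0, n;
        while (filled < readLimit && (n = bis.read(header, filled, readLimit - filled)) != -1) {
            filled += n;                            // never read past the mark limit
        }
        BufferedReader headerReader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(header, 0, filled), "UTF-8"));
        encoding = extractEncodingFromStream(headerReader);
        bis.reset();                                // safe: we stayed within readLimit
        if (encoding == null) {
            if (filled < readLimit) {
                encoding = DEFAULT_ENCODING;        // whole stream scanned, nothing declared
            } else {
                readLimit *= 2;                     // declaration may lie further in; retry
            }
        }
    }
    // Decode the whole stream with the detected encoding.
    StringBuilder sb = new StringBuilder();
    Reader reader = new InputStreamReader(bis, encoding);
    char[] buf = new char[8192];
    int len;
    while ((len = reader.read(buf)) != -1) {
        sb.append(buf, 0, len);
    }
    return sb.toString();
}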
You can use this method to convert an InputStream to a String. This might help you:
private String convertStreamToString(InputStream input) throws Exception {
    BufferedReader reader = new BufferedReader(new InputStreamReader(input));
    StringBuilder sb = new StringBuilder();
    String line = null;
    while ((line = reader.readLine()) != null) {
        sb.append(line);
    }
    input.close();
    return sb.toString();
}
I want to send an image from a J2ME client to a Servlet.
I am able to get a byte array of the image and send it using HTTP POST.
conn = (HttpConnection) Connector.open(url, Connector.READ_WRITE, true);
conn.setRequestMethod(HttpConnection.POST);
conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
OutputStream os = conn.openOutputStream();
os.write(bytes, 0, bytes.length); // bytes = byte array of image
This is the Servlet code:
String line;
BufferedReader r1 = new BufferedReader(new InputStreamReader(in));
while ((line = r1.readLine()) != null) {
    System.out.println("line=" + line);
    buf.append(line);
}
String s = buf.toString();
byte[] img_byte = s.getBytes();
But the problem I found is that when I send bytes from the J2ME client, some bytes are lost: the ones with values 0A and 0D hex, i.e. exactly the carriage return and line feed.
Thus, either the POST method or readLine() is not able to handle the 0A and 0D values.
Does anyone have an idea how to do this, or how to use another method?
That's because you're using a BufferedReader to read the binary stream line by line. readLine() basically splits the content on CRLF, and those individual lines don't contain the CRLF anymore.
Don't use a BufferedReader for binary streams; it doesn't make sense. Just write the obtained InputStream to an OutputStream of any flavor, e.g. a FileOutputStream, the usual Java IO way.
InputStream input = null;
OutputStream output = null;
try {
    input = request.getInputStream();
    output = new FileOutputStream("/path/to/file.ext");
    byte[] buffer = new byte[10240];
    for (int length = 0; (length = input.read(buffer)) > 0;) {
        output.write(buffer, 0, length);
    }
} finally {
    if (output != null) output.close();
    if (input != null) input.close();
}
That said, the Content-Type you're using is technically wrong. You aren't sending a WWW-form-URL-encoded value in the request body; you are sending a binary stream. It should be application/octet-stream or perhaps an appropriate image type. This is not the cause of this problem, but it is still just plain wrong.
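On the J2ME side that could look something like this (a sketch based on the question's own code, with the content type swapped and the output stream opened explicitly):
// Send the image bytes as a raw binary body with a binary content type.
HttpConnection conn = (HttpConnection) Connector.open(url, Connector.READ_WRITE, true);
conn.setRequestMethod(HttpConnection.POST);
conn.setRequestProperty("Content-Type", "application/octet-stream");
OutputStream os = conn.openOutputStream();
os.write(bytes, 0, bytes.length);   // bytes = byte array of image
os.close();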