Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
We added a new feature to our web application that contains the following code, which decompresses the input stream and creates a new String with UTF-8 encoding:
....
// is is an instance of java.util.zip.GZIPInputStream
byte[] payloadBuf = org.apache.commons.compress.utils.IOUtils.toByteArray(is);
String plainPayload = new String(payloadBuf, CharEncoding.UTF_8);
...
When we run an intensive load test that triggers this path many times, we see an abnormal increase in non-heap memory in the JVM. Can anyone give a hint on how to interpret this? Even better, is there a way to avoid it? Thanks a lot.
There is nothing abnormal about your results:
If you call this code in a tight loop, you are creating lots and lots of short-lived objects: 3 byte[] instances (all Objects) as well as a ByteArrayOutputStream for every call, and apparently for no reason.
So you are creating and copying a bunch of byte[] instances around and then the String constructor creates at least one more byte[] and copies that as well, all for nothing.
You are not accomplishing what you think you are:
You are not creating a new String with UTF-8 encoding; you are creating a new String by decoding the byte[] as UTF-8.
Java exposes every String as a sequence of UTF-16 code units, so the result is not "UTF-8 encoded", whatever bytes it was decoded from.
Solution:
You should just read the stream into a String to begin with and be done with it; the intermediate byte[] serves no purpose.
Here are a couple of examples using Guava:
final String text = CharStreams.toString(new InputStreamReader(is,Charsets.UTF_8));
or
final ByteSource source ...
final String text = source.asCharSource(Charsets.UTF_8).read();
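If adding Guava is not an option, the same idea can be sketched with the JDK alone: decode characters directly from the decompressing stream rather than collecting a byte[] first. The class and method names below (GzipText, readGzipUtf8, gzip) are made up for illustration, and unlike the question's code, which already holds a GZIPInputStream, this sketch wraps the raw compressed stream itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipText {
    // Decompresses the stream and decodes it as UTF-8 in one pass,
    // without building an intermediate byte[] of the whole payload.
    static String readGzipUtf8(InputStream compressed) throws IOException {
        try (Reader reader = new InputStreamReader(
                new GZIPInputStream(compressed), StandardCharsets.UTF_8)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                text.append(buffer, 0, n);
            }
            return text.toString();
        }
    }

    // Demo helper (not part of the original question): gzip a String as UTF-8.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("h\u00e9llo w\u00f6rld");
        System.out.println(readGzipUtf8(new ByteArrayInputStream(compressed)));
    }
}
```

The try-with-resources block also closes the GZIPInputStream, which lets it release the native buffers of the Inflater it created. That may be worth checking when non-heap memory grows under load.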
Opinion:
That org.apache.commons stuff is crap with all the cancerous dependencies, and it is not doing anything special to begin with, yet it still makes you deal with a checked exception on top of it all!
public static byte[] toByteArray(final InputStream input) throws IOException {
    final ByteArrayOutputStream output = new ByteArrayOutputStream();
    copy(input, output);
    return output.toByteArray();
}
If you follow the rabbit hole, you will find that one call to .toByteArray() creates at least 3 byte[] instances and a couple of ByteArrayOutputStream objects that all end up as garbage just to get to a String.
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 7 months ago.
I'm using a specific library (unfortunately can't be avoided) that writes some information from a class to a file using a utility function that receives a DataOutputStream as the input.
I would like to get the resulting file's content as a String without actually creating a file and writing into it as the writing can be pretty "taxing" (1000+ lines).
Is this possible to do by using a dummy DataOutputStream or some other method and without resorting to creating a temporary file and reading the result from there?
P.S: the final method that actually writes to the DataOutputStream changes from time to time so I would prefer not actually copy-paste it and redo it every time.
As java.io.DataOutputStream wraps around just any other java.io.OutputStream (you have to specify an instance in the constructor) I would recommend that you use a java.io.ByteArrayOutputStream to collect the data in-memory and get the String of that later with the .toString() method.
Example:
ByteArrayOutputStream inMemoryOutput = new ByteArrayOutputStream();
DataOutputStream dataOutputStream = new DataOutputStream(inMemoryOutput);
// use dataOutputStream here as intended
// and then get the String data
System.out.println(inMemoryOutput.toString());
If the encoding of the collected bytes does not match the system default encoding, you might have to specify a Charset as parameter of the toString.
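A minimal sketch of the same approach with an explicit charset is shown below. The writeReport method is a hypothetical stand-in for the library's utility function, and the sketch assumes the library writes plain UTF-8 text. If it uses calls such as writeInt or writeUTF, the buffer will also contain binary data that does not decode cleanly.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

final class InMemoryCapture {
    // Hypothetical stand-in for the library method that writes to the stream.
    static void writeReport(DataOutputStream out) throws IOException {
        out.write("line 1\nline 2\n".getBytes(StandardCharsets.UTF_8));
    }

    // Runs the writer against an in-memory buffer and decodes the result.
    static String capture() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(buffer)) {
            writeReport(data);
        } // closing the DataOutputStream flushes it into the buffer
        // Name the charset explicitly instead of relying on the platform default.
        return buffer.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws IOException {
        System.out.print(capture());
    }
}
```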
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
Closed 2 years ago.
Why am I getting a
Null pointer access: The variable versionFromInputStream can only be null at this location
Error?
Is that because the IDE doesn't know about the read method?
byte[] versionFromInputStream = null;
if (input.read(versionFromInputStream, 0, 3) != 3)
{
throw new NetworkException();
}
double version = Double.parseDouble(versionFromInputStream.toString());
The read method of a stream expects an existing byte array with enough space to be passed. Also, the conversion of bytes to String ought to be done via the String(byte[]) constructor.
In this case, you are reading three bytes, so the following ought to suffice:
byte[] versionFromInputStream = new byte[3];
if (input.read(versionFromInputStream, 0, 3) != 3)
{
throw new NetworkException();
}
double version = Double.parseDouble(new String(versionFromInputStream));
From a design standpoint, you may want to avoid sending strings over a network as it's inefficient. As long as you have control over both the sender and the receiver, a DataInputStream/DataOutputStream will let you natively read and write integers to the stream, without the overhead of reading bytes and converting them to strings to be parsed. As a quick example showing the receive side (with an integer version):
DataInputStream dataInput = new DataInputStream(input);
int version = dataInput.readInt();
You'd need to adapt the sender to use a DataOutputStream accordingly.
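As a rough illustration of both sides together, the sketch below writes an integer version with DataOutputStream and reads it back with DataInputStream. VersionCodec and its method names are made up for this example, and an in-memory buffer stands in for the network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class VersionCodec {
    // Sender side: writes the version as a 4-byte big-endian int.
    static byte[] encodeVersion(int version) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(version);
        }
        return buffer.toByteArray();
    }

    // Receiver side: reads the same 4 bytes back into an int.
    static int decodeVersion(InputStream input) throws IOException {
        return new DataInputStream(input).readInt();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = encodeVersion(42);
        System.out.println(decodeVersion(new ByteArrayInputStream(wire)));
    }
}
```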
So you've got a couple of things going on here that aren't right:
byte[] versionFromInputStream = null; // you should initialize this like = new byte[2048]; because..
if (input.read(versionFromInputStream, 0, 3) != 3) // because here you are trying to read into this byte array. And because it hasn't been initialized, you are getting the exception
{
throw new NetworkException();
}
double version = Double.parseDouble(versionFromInputStream.toString()); // this isn't going to work either. byte[].toString is the same as Object.toString - it just returns the type name and the object's hash code (something like [B@1b6d3586), not the contents, which isn't what you want
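Putting those points together, one possible sketch is shown below. It assumes the version arrives as exactly three ASCII characters such as "1.5". VersionReader is a made-up name, and an EOFException takes the place of the question's NetworkException. It uses DataInputStream.readFully because a single read call is allowed to return fewer bytes than requested even when more are on the way.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class VersionReader {
    // Reads exactly three bytes and parses them as a decimal number.
    static double readVersion(InputStream input) throws IOException {
        byte[] raw = new byte[3]; // the array must exist before reading into it
        // readFully blocks until all 3 bytes arrive, or throws EOFException.
        new DataInputStream(input).readFully(raw);
        // Decode the bytes explicitly; raw.toString() would not give the contents.
        return Double.parseDouble(new String(raw, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "1.5".getBytes(StandardCharsets.US_ASCII);
        System.out.println(readVersion(new ByteArrayInputStream(wire)));
    }
}
```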
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
These are the steps I'm supposed to go through. I tried writing the files, but I think it's wrong.
Creates 10 binary files in a folder called “binaryfiles”. The files must be named “temp0.dat”, “temp1.dat”, etc. to “temp9.dat”.
In each file, write 20 random doubles between 0 and 500. Inclusivity doesn’t matter.
Once the files are written, open each one in sequence from “temp0.dat” to “temp9.dat” and read them one character at a time. As you read the files, print the characters to the output window. Most of the characters will look like Chinese characters.
public class Homework7 {
public static void main(String[] args) throws IOException {
File file = new File("binaryfiles");
file.listFiles();
System.out.println("We have a file" + file);
System.out.println("Does it exist" + file.exists());
System.out.println("?" + file.isDirectory());
Random random = new Random(20);
random.setSeed(500);
double num = random.nextDouble();
OutputStream outStream = new FileOutputStream(file);
outStream.write((int) num);
outStream.close();
}
All files are binary at the lowest level; we talk about different types of files only because we choose to interpret the bytes as something else at a higher level. You create one with an OutputStream and then write to it either directly through the stream or through something that writes to the stream for you.
I'm not going to solve this for you since it sounds like a learning assignment, so instead I suggest you look closer at FileOutputStream and DataOutputStream.
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm getting the Md5 of a file using Apache DigestUtils as follows:
public static String getMd5(File file) throws Exception
{
FileInputStream fis = null;
String md5 = "";
fis = new FileInputStream(file);
md5 = DigestUtils.md5Hex(fis);
IOUtils.closeQuietly(fis);
return md5;
}
This Md5 is being used as a key. I am doing a check for uniqueness (because of possible collisions), however, if it is not unique, how do I make it unique?
Thanks in advance!
Actually, there is nothing you can do to make a hash function unique; that follows from the fact that it maps a large input space onto a much smaller output space. For MD5, collisions don't happen by chance for a reasonable number of files, but someone who wants to break your program can construct files with the same MD5 hash (see for example http://www.mathstat.dal.ca/~selinger/md5collision/). If you want to avoid this, I would suggest a hash function that is considered more secure, like SHA-256. If you really have to deal with a hash function that produces collisions, the data structure that uses the hash as a key needs a mechanism to handle them (e.g. secondary hashing, or lists that store the items sharing a hash).
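For reference, a SHA-256 version of the question's helper could look roughly like the sketch below, using only the JDK's MessageDigest. FileDigest and sha256Hex are names chosen for this example, not part of any library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileDigest {
    // Streams the input through SHA-256 and returns the digest as lowercase hex.
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            digest.update(buffer, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Convenience overload that opens and closes the file itself.
    static String sha256Hex(Path file) throws IOException, NoSuchAlgorithmException {
        try (InputStream in = Files.newInputStream(file)) {
            return sha256Hex(in);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(sha256Hex(Paths.get(args[0])));
    }
}
```

A stronger hash only makes collisions impractical to construct. If the key must be strictly unique, the collision handling described above is still needed.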
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
So I am having problems reading from a serialized file.
More specifically, I have serialized an object to a file written in a hexadecimal format. The problem occurs when I want to read one line at a time from this file. For example, the file can look like this:
aced 0005 7372 0005 5465 7374 41f2 13c1
215c 9734 6b02 0000 7870
However, the code underneath reads the whole file (instead of just the first line). Also, it automatically converts the hexadecimal data into something more readable: ¬ísrTestAòÁ
....
try (BufferedReader file = new BufferedReader(new FileReader(fileName))) {
read(file);
} catch ...
....
public static void read(BufferedReader in) throws IOException{
String line = in.readLine();
System.out.println(line); // PROBLEM: This prints every line
}
}
This code works perfectly fine with a normal text file containing some random words: it only prints the first line. My guess is that the problem lies in the serialization format. I read somewhere (probably the API) that the file is supposed to be in binary (even though my file is in hexadecimal??).
What should I do to be able to read one line at a time from this file?
EDIT: I have gotten quite a few answers, which I am thankful for. I never wanted to deserialize the object, only to read every hexadecimal line (one at a time) so I could analyze the serialized object. I am sorry if the question was unclear.
Now I have realized that the file is actually not written in hexadecimal but in binary. Further, it is not even divided into lines. The problem I am facing now is how to read every byte and convert it into hexadecimal. Basically, I want the data to look like the hexadecimal data above.
UPDATE:
immibis's comments helped me solve this:
"Use FileInputStream (or a BufferedInputStream wrapping one) and call read() repeatedly - each call returns one byte (from 0 to 255) or -1 if there are no more bytes in the file. This is the simplest, but not the most efficient, way (reading an array is usually faster)"
The file does not contain hexadecimal text and is not separated into lines.
Whatever program you are using to edit the file is "helpfully" converting it into hexadecimal for you, since it would be gibberish if displayed directly.
If you are writing the file using ObjectOutputStream and FileOutputStream, then you need to read it using ObjectInputStream and FileInputStream.
Your question doesn't make any sense. Serialized data is binary. It doesn't contain lines. You can't read lines from it. You should either read bytes, with an InputStream, or objects, with an ObjectInputStream.
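Reading the bytes and printing them as hexadecimal, as the question's update describes, might be sketched as follows. HexDump and toHex are illustrative names, and the layout (two bytes per group, sixteen bytes per line) is only an assumption modeled on the sample in the question.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

final class HexDump {
    // Reads the stream one byte at a time and renders it as hex text.
    static String toHex(InputStream in) throws IOException {
        StringBuilder out = new StringBuilder();
        long count = 0;
        int b;
        while ((b = in.read()) != -1) { // read() returns 0..255, or -1 at end of stream
            if (count > 0) {
                if (count % 16 == 0) {
                    out.append('\n'); // start a new line every 16 bytes
                } else if (count % 2 == 0) {
                    out.append(' ');  // separate groups of 2 bytes
                }
            }
            out.append(String.format("%02x", b));
            count++;
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Buffering keeps the byte-by-byte reads from hitting the disk each time.
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]))) {
            System.out.println(toHex(in));
        }
    }
}
```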