I use the "get" method from java drive api, and I can get the inputstream. but I cannt open the file when I use the inputstream to creat it. It likes the file is broken.
private static String fileurl = "C:\\googletest\\drive\\";
public static void newFile(String filetitle, InputStream stream) throws IOException {
String filepath = fileurl + filetitle;
BufferedInputStream bufferedInputStream=new BufferedInputStream(stream);
byte[] buffer = new byte[bufferedInputStream.available()];
File file = new File(filepath);
if (!file.exists()) {
file.getParentFile().mkdirs();
BufferedOutputStream bufferedOutputStream = new BufferedOutputStream(new FileOutputStream(filepath));
while( bufferedInputStream.read(buffer) != -1) {
bufferedOutputStream.write(buffer);
}
bufferedOutputStream.flush();
bufferedOutputStream.close();
}
}
Firstly, C:\googletest\drive\ is not a URL. It is a file system pathname.
Next, the following probably does not do what you think it does:
byte[] buffer = new byte[bufferedInputStream.available()];
The problem is that the available() call can return zero ... for a non-empty stream. The value returned by available() is an estimate of how many bytes that are currently available to read ... right now. That is not necessarily the stream length ... or anything related to it. And indeed the device drivers for some devices consistently return zero, even when there is data to be read.
Finally, this is wrong:
while( bufferedInputStream.read(buffer) != -1) {
bufferedOutputStream.write(buffer);
You are assuming that read returning -1 means that it filled the buffer. That is not so. Any one of the read calls could return with a partly full buffer. But then you write the entire buffer contents to the output stream ... including "junk" from previous reads.
Either or both of the 2nd and 3rd problems could lead to file corruption. In fact, the third one is likely to.
Related
I need to serve a binary file through a web service implemented in Python/Django. The problem is, that when I compare the original file with the transferred file with vbindiff I see trailing bytes on the transferred file, sadly rendering it useless.
The Binary File is accessed saved by a client in Java with:
HttpURLConnection userdataConnection = null;
URL userdataUrl = null;
try {
userdataUrl = new URL("http://localhost:8000/app/vuforia/10");
userdataConnection = (HttpURLConnection) userdataUrl.openConnection();
userdataConnection.setRequestMethod("GET");
userdataConnection.setRequestProperty("Content-Type", "application/octet-stream");
userdataConnection.connect();
InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream());
try (ByteArrayOutputStream fileStream = new ByteArrayOutputStream()) {
byte[] buffer = new byte[4094];
while (userdataStream.read(buffer) != -1) {
fileStream.write(buffer);
}
byte[] fileBytes = fileStream.toByteArray();
try (FileOutputStream fos = new FileOutputStream("./test.dat")) {
fos.write(fileBytes);
}
}
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I think that HttpURLConnection.getInputStream only reads the body of the response, or not?
This code serves the data in the backend
in views.py:
if request.method == "GET":
all_data = VuforiaDatabase.objects.all()
data = all_data.get(id=version)
return FileResponse(data.get_dat_bytes())
in models.py:
def get_dat_bytes(self):
return self.dat_upload.open()
How do I go about transferring the binary data 1:1?
You’re ignoring the return value of InputStream.read.
From the documentation:
Returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
Your code is assuming that the buffer is filled with every call to userdataStream.read(buffer), instead of checking how many bytes were actually read into buffer.
You don’t need to read from an InputStream at all. Just use Files.copy:
Path file = Paths.get("./test.dat");
try (InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream())) {
Files.copy(userdataStream, file, StandardCopyOption.REPLACE_EXISTING);
}
You always write a multiple the 4094 bytes, no matter how many bytes you actually read.
Don't do .write(buffer); write the amount you actually read. This is what userdataStream.read returns you. It can return a number smaller than the buffer size, but still positive.
If you project is using Apache Commons already, you can just use copyInputStreamToFile.
Note: 4K = 4096, not 4094, and it's a ridiculously small buffer, unless you operate something like a smartcard. On a PC, use something like a few hundred kb at least.
I am trying to understand what is the correct way to read a file in Java Servlet program. I need to read a file from a fixed path on my machine using my servlet code. Now I can read the file in multiple ways and one of the way which I am planning to use is to read the information in bytes as shown in below code:
private static void readFile(HttpServletRequest req, HttpServletResponse resp, String path)
throws IOException
{
File file = new File("C:\\temp\", path);
if (!file.isFile()) {
resp.sendError(404, "File not found: " + file);
return;
}
InputStream in = null;
ServletOutputStream out = null;
try {
resp.setContentLength(Long.valueOf(file.length()).intValue());
resp.resetBuffer();
out = resp.getOutputStream();
in = new BufferedInputStream(new FileInputStream(file));
readFile(in, out);
}
finally {
//Code for closing the input & output steams
}
}
}
public static void readFile(InputStream in, OutputStream out) throws IOException
{
byte[] buf = new byte[4096];
int data;
while ((data = in.read(buf, 0, buf.length)) != -1)
out.write(buf, 0, data);
}
I don't have issues with this logic and it is working fine.
Now I came across the post How To Read File In Java – BufferedReader in mykyong site and here the example uses BufferedReader.
Can someone please tell me which is the efficient way of reading a file in servlet code? when we need to prefer using BufferedReader in comparison to reading data in bytes.
There's almost no reason to read a file manually anymore since Java 7's NIO.
Just use Files.readAllBytes(Path) to read the full byte[]. Or if you want to stream directly to an OutputStream, Files.copy(Path, OutputStream).
Can someone please tell me which is the efficient way of reading a
file in servlet code? when we need to prefer using BufferedReader in
comparison to reading data in bytes.
Any buffered method will work. Here, BufferedReader allows you to read streams as String values. As the javadoc says
Reads text from a character-input stream, buffering characters so as
to provide for the efficient reading of characters, arrays, and lines.
I was trying to read a file into an array by using FileInputStream, and an ~800KB file took about 3 seconds to read into memory. I then tried the same code except with the FileInputStream wrapped into a BufferedInputStream and it took about 76 milliseconds. Why is reading a file byte by byte done so much faster with a BufferedInputStream even though I'm still reading it byte by byte? Here's the code (the rest of the code is entirely irrelevant). Note that this is the "fast" code. You can just remove the BufferedInputStream if you want the "slow" code:
InputStream is = null;
try {
is = new BufferedInputStream(new FileInputStream(file));
int[] fileArr = new int[(int) file.length()];
for (int i = 0, temp = 0; (temp = is.read()) != -1; i++) {
fileArr[i] = temp;
}
BufferedInputStream is over 30 times faster. Far more than that. So, why is this, and is it possible to make this code more efficient (without using any external libraries)?
In FileInputStream, the method read() reads a single byte. From the source code:
/**
* Reads a byte of data from this input stream. This method blocks
* if no input is yet available.
*
* #return the next byte of data, or <code>-1</code> if the end of the
* file is reached.
* #exception IOException if an I/O error occurs.
*/
public native int read() throws IOException;
This is a native call to the OS which uses the disk to read the single byte. This is a heavy operation.
With a BufferedInputStream, the method delegates to an overloaded read() method that reads 8192 amount of bytes and buffers them until they are needed. It still returns only the single byte (but keeps the others in reserve). This way the BufferedInputStream makes less native calls to the OS to read from the file.
For example, your file is 32768 bytes long. To get all the bytes in memory with a FileInputStream, you will require 32768 native calls to the OS. With a BufferedInputStream, you will only require 4, regardless of the number of read() calls you will do (still 32768).
As to how to make it faster, you might want to consider Java 7's NIO FileChannel class, but I have no evidence to support this.
Note: if you used FileInputStream's read(byte[], int, int) method directly instead, with a byte[>8192] you wouldn't need a BufferedInputStream wrapping it.
A BufferedInputStream wrapped around a FileInputStream, will request data from the FileInputStream in big chunks (512 bytes or so by default, I think.) Thus if you read 1000 characters one at a time, the FileInputStream will only have to go to the disk twice. This will be much faster!
It is because of the cost of disk access. Lets assume you will have a file which size is 8kb. 8*1024 times access disk will be needed to read this file without BufferedInputStream.
At this point, BufferedStream comes to the scene and acts as a middle man between FileInputStream and the file to be read.
In one shot, will get chunks of bytes default is 8kb to memory and then FileInputStream will read bytes from this middle man.
This will decrease the time of the operation.
private void exercise1WithBufferedStream() {
long start= System.currentTimeMillis();
try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
BufferedInputStream bufferedInputStream = new BufferedInputStream(myFile);
boolean eof = false;
while (!eof) {
int inByteValue = bufferedInputStream.read();
if (inByteValue == -1) eof = true;
}
} catch (IOException e) {
System.out.println("Could not read the stream...");
e.printStackTrace();
}
System.out.println("time passed with buffered:" + (System.currentTimeMillis()-start));
}
private void exercise1() {
long start= System.currentTimeMillis();
try (FileInputStream myFile = new FileInputStream("anyFile.txt")) {
boolean eof = false;
while (!eof) {
int inByteValue = myFile.read();
if (inByteValue == -1) eof = true;
}
} catch (IOException e) {
System.out.println("Could not read the stream...");
e.printStackTrace();
}
System.out.println("time passed without buffered:" + (System.currentTimeMillis()-start));
}
Someone explain to me what InputStream and OutputStream are?
I am confused about the use cases for both InputStream and OutputStream.
If you could also include a snippet of code to go along with your explanation, that would be great. Thanks!
The goal of InputStream and OutputStream is to abstract different ways to input and output: whether the stream is a file, a web page, or the screen shouldn't matter. All that matters is that you receive information from the stream (or send information into that stream.)
InputStream is used for many things that you read from.
OutputStream is used for many things that you write to.
Here's some sample code. It assumes the InputStream instr and OutputStream osstr have already been created:
int i;
while ((i = instr.read()) != -1) {
osstr.write(i);
}
instr.close();
osstr.close();
InputStream is used for reading, OutputStream for writing. They are connected as decorators to one another such that you can read/write all different types of data from all different types of sources.
For example, you can write primitive data to a file:
File file = new File("C:/text.bin");
file.createNewFile();
DataOutputStream stream = new DataOutputStream(new FileOutputStream(file));
stream.writeBoolean(true);
stream.writeInt(1234);
stream.close();
To read the written contents:
File file = new File("C:/text.bin");
DataInputStream stream = new DataInputStream(new FileInputStream(file));
boolean isTrue = stream.readBoolean();
int value = stream.readInt();
stream.close();
System.out.printlin(isTrue + " " + value);
You can use other types of streams to enhance the reading/writing. For example, you can introduce a buffer for efficiency:
DataInputStream stream = new DataInputStream(
new BufferedInputStream(new FileInputStream(file)));
You can write other data such as objects:
MyClass myObject = new MyClass(); // MyClass have to implement Serializable
ObjectOutputStream stream = new ObjectOutputStream(
new FileOutputStream("C:/text.obj"));
stream.writeObject(myObject);
stream.close();
You can read from other different input sources:
byte[] test = new byte[] {0, 0, 1, 0, 0, 0, 1, 1, 8, 9};
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(test));
int value0 = stream.readInt();
int value1 = stream.readInt();
byte value2 = stream.readByte();
byte value3 = stream.readByte();
stream.close();
System.out.println(value0 + " " + value1 + " " + value2 + " " + value3);
For most input streams there is an output stream, also. You can define your own streams to reading/writing special things and there are complex streams for reading complex things (for example there are Streams for reading/writing ZIP format).
From the Java Tutorial:
A stream is a sequence of data.
A program uses an input stream to read data from a source, one item at a time:
A program uses an output stream to write data to a destination, one item at time:
The data source and data destination pictured above can be anything that holds, generates, or consumes data. Obviously this includes disk files, but a source or destination can also be another program, a peripheral device, a network socket, or an array.
Sample code from oracle tutorial:
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
public class CopyBytes {
public static void main(String[] args) throws IOException {
FileInputStream in = null;
FileOutputStream out = null;
try {
in = new FileInputStream("xanadu.txt");
out = new FileOutputStream("outagain.txt");
int c;
while ((c = in.read()) != -1) {
out.write(c);
}
} finally {
if (in != null) {
in.close();
}
if (out != null) {
out.close();
}
}
}
}
This program uses byte streams to copy xanadu.txt file to outagain.txt , by writing one byte at a time
Have a look at this SE question to know more details about advanced Character streams, which are wrappers on top of Byte Streams :
byte stream and character stream
you read from an InputStream and write to an OutputStream.
for example, say you want to copy a file. You would create a FileInputStream to read from the source file and a FileOutputStream to write to the new file.
If your data is a character stream, you could use a FileReader instead of an InputStream and a FileWriter instead of an OutputStream if you prefer.
InputStream input = ... // many different types
OutputStream output = ... // many different types
byte[] buffer = new byte[1024];
int n = 0;
while ((n = input.read(buffer)) != -1)
output.write(buffer, 0, n);
input.close();
output.close();
OutputStream is an abstract class that represents writing output. There are many different OutputStream classes, and they write out to certain things (like the screen, or Files, or byte arrays, or network connections, or etc). InputStream classes access the same things, but they read data in from them.
Here is a good basic example of using FileOutputStream and FileInputStream to write data to a file, then read it back in.
A stream is a continuous flow of liquid, air, or gas.
Java stream is a flow of data from a source into a destination. The source or destination can be a disk, memory, socket, or other programs. The data can be bytes, characters, or objects. The same applies for C# or C++ streams. A good metaphor for Java streams is water flowing from a tap into a bathtub and later into a drainage.
The data represents the static part of the stream; the read and write methods the dynamic part of the stream.
InputStream represents a flow of data from the source, the OutputStream represents a flow of data into the destination.
Finally, InputStream and OutputStream are abstractions over low-level access to data, such as C file pointers.
Stream: In laymen terms stream is data , most generic stream is binary representation of data.
Input Stream : If you are reading data from a file or any other source , stream used is input stream. In a simpler terms input stream acts as a channel to read data.
Output Stream : If you want to read and process data from a source (file etc) you first need to save the data , the mean to store data is output stream .
An output stream is generally related to some data destination like a file or a network etc.In java output stream is a destination where data is eventually written and it ends
import java.io.printstream;
class PPrint {
static PPrintStream oout = new PPrintStream();
}
class PPrintStream {
void print(String str) {
System.out.println(str)
}
}
class outputstreamDemo {
public static void main(String args[]) {
System.out.println("hello world");
System.out.prinln("this is output stream demo");
}
}
For one kind of InputStream, you can think of it as a "representation" of a data source, like a file.
For example:
FileInputStream fileInputStream = new FileInputStream("/path/to/file/abc.txt");
fileInputStream represents the data in this path, which you can use read method to read bytes from the file.
For the other kind of InputStream, they take in another inputStream and do further processing, like decompression.
For example:
GZIPInputStream gzipInputStream = new GZIPInputStream(fileInputStream);
gzipInputStream will treat the fileInputStream as a compressed data source. When you use the read(buffer, 0, buffer.length) method, it will decompress part of the gzip file into the buffer you provide.
The reason why we use InputStream because as the data in the source becomes larger and larger, say we have 500GB data in the source file, we don't want to hold everything in the memory (expensive machine; not friendly for GC allocation), and we want to get some result faster (reading the whole file may take a long time).
The same thing for OutputStream. We can start moving some result to the destination without waiting for the whole thing to finish, plus less memory consumption.
If you want more explanations and examples, you have check these summaries: InputStream, OutputStream, How To Use InputStream, How To Use OutputStream
In continue to the great other answers, in my simple words:
Stream - like mentioned #Sher Mohammad is data.
Input stream - for example is to get input – data – from the file. The case is when I have a file (the user upload a file – input) – and I want to read what we have there.
Output Stream – is the vice versa. For example – you are generating an excel file, and output it to some place.
The “how to write” to the file, is defined at the sender (the excel workbook class) not at the file output stream.
See here example in this context.
try (OutputStream fileOut = new FileOutputStream("xssf-align.xlsx")) {
wb.write(fileOut);
}
wb.close();
I am trying to read a single file from a java.util.zip.ZipInputStream, and copy it into a java.io.ByteArrayOutputStream (so that I can then create a java.io.ByteArrayInputStream and hand that to a 3rd party library that will end up closing the stream, and I don't want my ZipInputStream getting closed).
I'm probably missing something basic here, but I never enter the while loop here:
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
try {
while ((bytesRead = zipStream.read(tempBuffer)) != -1) {
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
// ...
}
What am I missing that will allow me to copy the stream?
Edit:
I should have mentioned earlier that this ZipInputStream is not coming from a file, so I don't think I can use a ZipFile. It is coming from a file uploaded through a servlet.
Also, I have already called getNextEntry() on the ZipInputStream before getting to this snippet of code. If I don't try copying the file into another InputStream (via the OutputStream mentioned above), and just pass the ZipInputStream to my 3rd party library, the library closes the stream, and I can't do anything more, like dealing with the remaining files in the stream.
Your loop looks valid - what does the following code (just on it's own) return?
zipStream.read(tempBuffer)
if it's returning -1, then the zipStream is closed before you get it, and all bets are off. It's time to use your debugger and make sure what's being passed to you is actually valid.
When you call getNextEntry(), does it return a value, and is the data in the entry meaningful (i.e. does getCompressedSize() return a valid value)? IF you are just reading a Zip file that doesn't have read-ahead zip entries embedded, then ZipInputStream isn't going to work for you.
Some useful tidbits about the Zip format:
Each file embedded in a zip file has a header. This header can contain useful information (such as the compressed length of the stream, it's offset in the file, CRC) - or it can contain some magic values that basically say 'The information isn't in the stream header, you have to check the Zip post-amble'.
Each zip file then has a table that is attached to the end of the file that contains all of the zip entries, along with the real data. The table at the end is mandatory, and the values in it must be correct. In contrast, the values embedded in the stream do not have to be provided.
If you use ZipFile, it reads the table at the end of the zip. If you use ZipInputStream, I suspect that getNextEntry() attempts to use the entries embedded in the stream. If those values aren't specified, then ZipInputStream has no idea how long the stream might be. The inflate algorithm is self terminating (you actually don't need to know the uncompressed length of the output stream in order to fully recover the output), but it's possible that the Java version of this reader doesn't handle this situation very well.
I will say that it's fairly unusual to have a servlet returning a ZipInputStream (it's much more common to receive an inflatorInputStream if you are going to be receiving compressed content.
You probably tried reading from a FileInputStream like this:
ZipInputStream in = new ZipInputStream(new FileInputStream(...));
This won’t work since a zip archive can contain multiple files and you need to specify which file to read.
You could use java.util.zip.ZipFile and a library such as IOUtils from Apache Commons IO or ByteStreams from Guava that assist you in copying the stream.
Example:
ByteArrayOutputStream out = new ByteArrayOutputStream();
try (ZipFile zipFile = new ZipFile("foo.zip")) {
ZipEntry zipEntry = zipFile.getEntry("fileInTheZip.txt");
try (InputStream in = zipFile.getInputStream(zipEntry)) {
IOUtils.copy(in, out);
}
}
I'd use IOUtils from the commons io project.
IOUtils.copy(zipStream, byteArrayOutputStream);
You're missing call
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
to position the first byte decompressed of the first entry.
ByteArrayOutputStream streamBuilder = new ByteArrayOutputStream();
int bytesRead;
byte[] tempBuffer = new byte[8192*2];
ZipEntry entry = (ZipEntry) zipStream.getNextEntry();
try {
while ( (bytesRead = zipStream.read(tempBuffer)) != -1 ){
streamBuilder.write(tempBuffer, 0, bytesRead);
}
} catch (IOException e) {
...
}
You could implement your own wrapper around the ZipInputStream that ignores close() and hand that off to the third-party library.
thirdPartyLib.handleZipData(new CloseIgnoringInputStream(zipStream));
class CloseIgnoringInputStream extends InputStream
{
private ZipInputStream stream;
public CloseIgnoringInputStream(ZipInputStream inStream)
{
stream = inStream;
}
public int read() throws IOException {
return stream.read();
}
public void close()
{
//ignore
}
public void reallyClose() throws IOException
{
stream.close();
}
}
I would call getNextEntry() on the ZipInputStream until it is at the entry you want (use ZipEntry.getName() etc.). Calling getNextEntry() will advance the "cursor" to the beginning of the entry that it returns. Then, use ZipEntry.getSize() to determine how many bytes you should read using zipInputStream.read().
It is unclear how you got the zipStream. It should work when you get it like this:
zipStream = zipFile.getInputStream(zipEntry)
t is unclear how you got the zipStream. It should work when you get it like this:
zipStream = zipFile.getInputStream(zipEntry)
If you are obtaining the ZipInputStream from a ZipFile you can get one stream for the 3d party library, let it use it, and you obtain another input stream using the code before.
Remember, an inputstream is a cursor. If you have the entire data (like a ZipFile) you can ask for N cursors over it.
A diferent case is if you only have an "GZip" inputstream, only an zipped byte stream. In that case you ByteArrayOutputStream buffer makes all sense.
Please try code bellow
private static byte[] getZipArchiveContent(File zipName) throws WorkflowServiceBusinessException {
BufferedInputStream buffer = null;
FileInputStream fileStream = null;
ByteArrayOutputStream byteOut = null;
byte data[] = new byte[BUFFER];
try {
try {
fileStream = new FileInputStream(zipName);
buffer = new BufferedInputStream(fileStream);
byteOut = new ByteArrayOutputStream();
int count;
while((count = buffer.read(data, 0, BUFFER)) != -1) {
byteOut.write(data, 0, count);
}
} catch(Exception e) {
throw new WorkflowServiceBusinessException(e.getMessage(), e);
} finally {
if(null != fileStream) {
fileStream.close();
}
if(null != buffer) {
buffer.close();
}
if(null != byteOut) {
byteOut.close();
}
}
} catch(Exception e) {
throw new WorkflowServiceBusinessException(e.getMessage(), e);
}
return byteOut.toByteArray();
}
Check if the input stream is positioned in the begging.
Otherwise, as implementation: I do not think that you need to write to the result stream while you are reading, unless you process this exact stream in another thread.
Just create a byte array, read the input stream, then create the output stream.