Read first bytes of a file


I need a very simple function that reads the first 1 KB of a file over FTP. I want to use it from MATLAB to read the first lines and, depending on some parameters, download only the files I really need. I found some examples online, but unfortunately they do not work. Here is the sample code with which I'm trying to download a single file (I'm using the Apache libraries).
FTPClient client = new FTPClient();
FileOutputStream fos = null;
try {
client.connect("data.site.org");
// filename to be downloaded.
String filename = "filename.Z";
fos = new FileOutputStream(filename);
// Download file from FTP server
InputStream stream = client.retrieveFileStream("/pub/obs/2008/021/ab120210.08d.Z");
byte[] b = new byte[1024];
stream.read(b);
fos.write(b);
} catch (IOException e) {
e.printStackTrace();
} finally {
try {
if (fos != null) {
fos.close();
}
client.disconnect();
} catch (IOException e) {
e.printStackTrace();
}
}
The error is that stream comes back empty. I know I'm passing the folder name the wrong way, but I can't figure out how it should be done. I've tried many ways.
I've also tried with the URL's Java classes as:
URL url;
url = new URL("ftp://data.site.org/pub/obs/2008/021/ab120210.08d.Z");
URLConnection con = url.openConnection();
BufferedInputStream in =
new BufferedInputStream(con.getInputStream());
FileOutputStream out =
new FileOutputStream("C:\\filename.Z");
int i;
byte[] bytesIn = new byte[1024];
if ((i = in.read(bytesIn)) >= 0) {
out.write(bytesIn);
}
out.close();
in.close();
but it throws an error when I close the InputStream in!
I'm completely stuck. Any comments would be very useful!

Try this test
InputStream is = new URL("ftp://test:test@ftp.secureftp-test.com/bookstore.xml").openStream();
byte[] a = new byte[1000];
int n = is.read(a);
is.close();
System.out.println(new String(a, 0, n));
it definitely works

In my experience, when you read bytes from a stream obtained from ftpClient.retrieveFileStream, a single read is not guaranteed to fill your byte buffer. Either check the return value of stream.read(b) and call it in a loop until you have enough bytes, or use a library helper to fill the 1024-byte buffer:
InputStream stream = null;
try {
    // Download file from FTP server
    stream = client.retrieveFileStream("/pub/obs/2008/021/ab120210.08d.Z");
    byte[] b = new byte[1024];
    // Commons IO: keeps calling stream.read() until the buffer is full or end-of-file is reached
    int n = IOUtils.read(stream, b);
    // write only the bytes that were actually read
    fos.write(b, 0, n);
} catch (IOException e) {
    e.printStackTrace();
} finally {
    IOUtils.closeQuietly(stream);
}
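The "loop on the return value" option can also be sketched without any extra library. This is only a minimal illustration, not the answerer's code: the class name FirstBytes, the helper readUpTo and the one-byte-at-a-time TrickleStream (standing in for a slow FTP data connection) are all made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class FirstBytes {

    // Hypothetical helper: reads until 'limit' bytes have arrived or the stream ends.
    static byte[] readUpTo(InputStream in, int limit) throws IOException {
        byte[] buffer = new byte[limit];
        int total = 0;
        while (total < limit) {
            int n = in.read(buffer, total, limit - total);
            if (n == -1) {
                break; // end of stream before the buffer was full
            }
            total += n;
        }
        // Trim so the caller never sees unused trailing zeros.
        return Arrays.copyOf(buffer, total);
    }

    // Test double (an assumption for this sketch): hands out at most one byte
    // per read call, the way a slow network stream might.
    static class TrickleStream extends InputStream {
        private final InputStream source;

        TrickleStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() throws IOException {
            return source.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            if (len == 0) {
                return 0;
            }
            return source.read(b, off, 1);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] file = new byte[5000];
        byte[] single = new byte[1024];
        int n = new TrickleStream(file).read(single);
        System.out.println("single read returned " + n);                                        // 1
        System.out.println("loop returned " + readUpTo(new TrickleStream(file), 1024).length); // 1024
    }
}
```

A single read() on such a stream returns far fewer bytes than the buffer holds, while the loop keeps going until it has the first 1024 bytes or reaches the end of the file.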

I cannot understand why my original code doesn't work. I found this link where they use the Apache library to read 4096 bytes at a time. Reading only the first 1024 bytes works in the end; the only problem is that if completePendingCommand() is used, the program hangs forever. So I removed it and everything works fine.

Related

Why is my binary data bigger after getting it from the webserver?

I need to serve a binary file through a web service implemented in Python/Django. The problem is that when I compare the original file with the transferred file using vbindiff, I see trailing bytes on the transferred file, which sadly render it useless.
The binary file is fetched and saved by a Java client with:
HttpURLConnection userdataConnection = null;
URL userdataUrl = null;
try {
userdataUrl = new URL("http://localhost:8000/app/vuforia/10");
userdataConnection = (HttpURLConnection) userdataUrl.openConnection();
userdataConnection.setRequestMethod("GET");
userdataConnection.setRequestProperty("Content-Type", "application/octet-stream");
userdataConnection.connect();
InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream());
try (ByteArrayOutputStream fileStream = new ByteArrayOutputStream()) {
byte[] buffer = new byte[4094];
while (userdataStream.read(buffer) != -1) {
fileStream.write(buffer);
}
byte[] fileBytes = fileStream.toByteArray();
try (FileOutputStream fos = new FileOutputStream("./test.dat")) {
fos.write(fileBytes);
}
}
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
I think that HttpURLConnection.getInputStream only reads the body of the response, or not?
This code serves the data in the backend
in views.py:
if request.method == "GET":
all_data = VuforiaDatabase.objects.all()
data = all_data.get(id=version)
return FileResponse(data.get_dat_bytes())
in models.py:
def get_dat_bytes(self):
return self.dat_upload.open()
How do I go about transferring the binary data 1:1?

You’re ignoring the return value of InputStream.read.
From the documentation:
Returns:
the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.
Your code is assuming that the buffer is filled with every call to userdataStream.read(buffer), instead of checking how many bytes were actually read into buffer.
You don’t need to read from an InputStream at all. Just use Files.copy:
Path file = Paths.get("./test.dat");
try (InputStream userdataStream = new BufferedInputStream(userdataConnection.getInputStream())) {
Files.copy(userdataStream, file, StandardCopyOption.REPLACE_EXISTING);
}

You always write a multiple of 4094 bytes, no matter how many bytes you actually read.
Don't do .write(buffer); write the amount you actually read. This is what userdataStream.read returns. It can return a number smaller than the buffer size, but still positive.
If your project is using Apache Commons IO already, you can just use FileUtils.copyInputStreamToFile.
Note: 4K = 4096, not 4094, and it's a ridiculously small buffer, unless you operate something like a smartcard. On a PC, use something like a few hundred KB at least.
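To make the size difference concrete, here is a small sketch that runs both loops over an in-memory stream instead of an HTTP connection. The class and method names (CopyDemo, copyIgnoringCount, copyUsingCount) are invented for illustration, and the 10000-byte payload is an arbitrary stand-in for the real file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class CopyDemo {

    // Same pattern as the question: the whole buffer is written on every pass,
    // so the last, partially filled pass appends stale bytes.
    static byte[] copyIgnoringCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) != -1) {
            out.write(buffer);
        }
        return out.toByteArray();
    }

    // Corrected pattern: only the bytes reported by read() are written.
    static byte[] copyUsingCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] original = new byte[10000];
        // Three passes of 4094 bytes each are written: 12282
        System.out.println(copyIgnoringCount(new ByteArrayInputStream(original), 4094).length);
        // Exactly what was read: 10000
        System.out.println(copyUsingCount(new ByteArrayInputStream(original), 4094).length);
    }
}
```

With a real network stream the damage can be worse than a padded tail, because any read in the middle of the transfer may come back short.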

Download files with Android Webview

I am making an Android app that uses a WebView to access a webpage. To handle downloads I am using an AsyncTask in the onDownloadStart method of the WebView's DownloadListener. However, the downloaded files are blank (although the filename and extension are correct). My Java code is this:
protected String doInBackground(String... url) {
try {
URL url = new URL(url[0]);
//Creating directory if not exists
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("GET");
connection.setDoOutput(true);
connection.connect();
//Obtaining filename
File outputFile = new File(directory, filename);
InputStream input = new BufferedInputStream(connection.getInputStream());
OutputStream output = new FileOutputStream(outputFile);
byte data[] = new byte[1024];
int count = 0;
Log.e(null, "input.read(data) = "+input.read(data), null);
while ((count = input.read(data)) != -1) {
output.write(data, 0, count);
}
connection.disconnect();
output.flush();
output.close();
input.close();
} catch (MalformedURLException e) {
e.printStackTrace();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
The Log.e line gives -1 for input.read(data).
PHP code of download page is this (works in all platforms). Files are stored in non-public directories of my HTML server.
<?php
$guid = $_GET['id'];
$file = get_file($guid);
if (isset($file['path'])) {
$mime = $file['MIMEType'];
if (!$mime) {
$mime = "application/octet-stream";
}
header("Pragma: public");
header("Content-type: $mime");
header("Content-Disposition: attachment; filename=\"{$file['filename']}\"");
header('Content-Transfer-Encoding: binary');
ob_clean();
flush();
readfile($file['path']);
exit();
}
?>
I've noticed that if I write some text after "?>" of PHP file, this text is written in the file downloaded.

In your code, you are using ob_clean(), which just erases the output buffer. Your subsequent call to flush() therefore doesn't send anything, because the output buffer has already been erased.
Instead of ob_clean() and flush(), use ob_end_flush(). This will stop output buffering and it will send all the output it withheld.
ob_end_flush — Flush (send) the output buffer and turn off output buffering
If you want to stop output buffering without outputting whatever is saved, you can use ob_end_clean(). Anything after this command will be output again, but anything between ob_start() and ob_end_clean() will be "swallowed."
ob_end_clean — Clean (erase) the output buffer and turn off output buffering
What are the benefits of output buffering in the first place? If you are doing ob_start() and then using flush() on everything you might as well output everything directly.

Download file via HTTP with unknown length with Java

I want to download a HTTP query with java, but the file I download has an undetermined length when downloading.
I thought this would be quite standard, so I searched and found a code snippet for it: http://snipplr.com/view/33805/
But it has a problem with the contentLength variable. As the length is unknown, I get -1 back. This creates an error. When I omit the entire check about contentLength, that means I always have to use the maximum buffer.
But the problem is that the file is not ready yet. So the buffer gets only partially filled, and parts of the file get lost.
If you try downloading a link like http://overpass-api.de/api/interpreter?data=area%5Bname%3D%22Hoogstade%22%5D%3B%0A%28%0A++node%28area%29%3B%0A++%3C%3B%0A%29+%3B%0Aout+meta+qt%3B with that snippet, you'll notice the error, and when you always download the maximum buffer to omit the error, you end up with a corrupt XML file.
Is there some way to only download the ready part of the file? I would like if this could download big files (up to a few GB).

This should work; I tested it and it works for me:
void downloadFromUrl(URL url, String localFilename) throws IOException {
InputStream is = null;
FileOutputStream fos = null;
try {
URLConnection urlConn = url.openConnection();//connect
is = urlConn.getInputStream(); //get connection inputstream
fos = new FileOutputStream(localFilename); //open outputstream to local file
byte[] buffer = new byte[4096]; //declare 4KB buffer
int len;
//while we have available data, continue downloading and storing to local file
while ((len = is.read(buffer)) > 0) {
fos.write(buffer, 0, len);
}
} finally {
try {
if (is != null) {
is.close();
}
} finally {
if (fos != null) {
fos.close();
}
}
}
}
If you want this to run in background, simply call it in a Thread:
Thread download = new Thread() {
    @Override
    public void run() {
        try {
            URL url = new URL("http://overpass-api.de/api/interpreter?data=area%5Bname%3D%22Hoogstade%22%5D%3B%0A%28%0A++node%28area%29%3B%0A++%3C%3B%0A%29+%3B%0Aout+meta+qt%3B");
            String localFilename = "mylocalfile"; // needs to be replaced with local file path
            downloadFromUrl(url, localFilename);
        } catch (IOException e) {
            // run() cannot throw checked exceptions, so handle the failure here
            e.printStackTrace();
        }
    }
};
download.start(); // start the thread

Weird stuff while viewing a file which was transferred with sockets in java

I am trying to transfer a file using sockets in Java.
Here is the code:
Client Code
try{
// get streams
DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
DataInputStream din = new DataInputStream (socket.getInputStream());
dos.writeUTF(fileName);
dos.flush();
boolean isOk = din.readBoolean();
if(!isOk){
throw new StocFileNotFound("Fisierul: " + fileName +" was not found on:" + address.toString());
} else {
baos = new ByteArrayOutputStream();
byte biti [] = new byte[1024];
while(din.read(biti,0,1024) != -1){
baos.write(biti,0,biti.length);
}
}
}
catch(IOException e){}
finally {
try{ socket.close(); } catch (IOException e){}
}
and then I return baos.toByteArray() and write it to a file with the OutputStream's write method.
Server code
try{
DataOutputStream dos = new DataOutputStream(socket.getOutputStream());
DataInputStream din = new DataInputStream (socket.getInputStream());
// check if it is really a file or if it is an existing file
File file = new File(din.readUTF());
// write false
if ( !file.exists() || !file.isFile() ){
dos.writeBoolean(false);
dos.flush();
}
// write true and write the file
else {
byte biti[] = new byte[1024];
dos.writeBoolean(true);
FileInputStream fis = new FileInputStream(file);
while(fis.read(biti,0,1024) != -1){
dos.write(biti,0,biti.length);
}
dos.flush();
try{ fis.close(); } catch (IOException e){}
}
} catch (IOException e){}
finally {
try{socket.close();}catch(IOException e){}
}
The problem
When I transfer a .txt file and view it in gedit, it shows the text followed by multiple \00\00\00, but when I open it in Notepad (under Wine) it shows only the text. Viewing images and .doc files also works. So is the problem in gedit or in my program?
Edit
I was sending something like "hi, hope it works!"

This is the problem (or at least a problem):
while(fis.read(biti,0,1024) != -1)
{
dos.write(biti,0,biti.length);
}
You're always writing out the whole buffer, however many bytes were actually read. You should have:
int bytesRead;
while ((bytesRead = fis.read(biti, 0, 1024)) != -1)
{
dos.write(biti, 0, bytesRead);
}
(You've got the same problem in both bits of code.)
You might want to look at Guava which has various utility methods to relieve you of a lot of the tedium (and possible error) of writing this kind of code over and over again.

The read method will return the actual number of bytes read from the stream. You should use that as a parameter to your write method, or else you will be writing garbage to it.

java.net.URL read stream to byte[]

I'm trying to read an image from a URL (with the Java package java.net.URL) to a byte[]. "Everything" works fine, except that the content isn't being entirely read from the stream (the image is corrupt, it doesn't contain all the image data)... The byte array is being persisted in a database (BLOB). I really don't know what the correct approach is, maybe you can give me a tip. :)
This is my first approach (code formatted, removed unnecessary information...):
URL u = new URL("http://localhost:8080/images/anImage.jpg");
int contentLength = u.openConnection().getContentLength();
InputStream openStream = u.openStream();
byte[] binaryData = new byte[contentLength];
openStream.read(binaryData);
openStream.close();
My second approach was this one (as you'll see the contentlength is being fetched another way):
URL u = new URL(content);
openStream = u.openStream();
int contentLength = openStream.available();
byte[] binaryData = new byte[contentLength];
openStream.read(binaryData);
openStream.close();
Both of the code result in a corrupted image...
I already read this post from Stack Overflow.

There's no guarantee that the content length you're provided is actually correct. Try something akin to the following:
ByteArrayOutputStream baos = new ByteArrayOutputStream();
InputStream is = null;
try {
is = url.openStream ();
byte[] byteChunk = new byte[4096]; // Or whatever size you want to read in at a time.
int n;
while ( (n = is.read(byteChunk)) > 0 ) {
baos.write(byteChunk, 0, n);
}
}
catch (IOException e) {
System.err.printf ("Failed while reading bytes from %s: %s", url.toExternalForm(), e.getMessage());
e.printStackTrace ();
// Perform any other exception handling that's appropriate.
}
finally {
if (is != null) { is.close(); }
}
You'll then have the image data in baos, from which you can get a byte array by calling baos.toByteArray().
This code is untested (I just wrote it in the answer box), but it's a reasonably close approximation to what I think you're after.

Just extending Barnards's answer with commons-io. Separate answer because I can not format code in comments.
InputStream is = null;
try {
is = url.openStream ();
byte[] imageBytes = IOUtils.toByteArray(is);
}
catch (IOException e) {
System.err.printf ("Failed while reading bytes from %s: %s", url.toExternalForm(), e.getMessage());
e.printStackTrace ();
// Perform any other exception handling that's appropriate.
}
finally {
if (is != null) { is.close(); }
}
http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html#toByteArray(java.io.InputStream)

Here's a clean solution:
private byte[] downloadUrl(URL toDownload) {
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    // try-with-resources closes the stream even if a read fails
    try (InputStream stream = toDownload.openStream()) {
        byte[] chunk = new byte[4096];
        int bytesRead;
        // read() returns -1 at end of stream
        while ((bytesRead = stream.read(chunk)) != -1) {
            outputStream.write(chunk, 0, bytesRead);
        }
    } catch (IOException e) {
        e.printStackTrace();
        return null;
    }
    return outputStream.toByteArray();
}

I am very surprised that nobody here has mentioned the problem of connection and read timeout. It could happen (especially on Android and/or with some crappy network connectivity) that the request will hang and wait forever.
The following code (which also uses Apache Commons IO) takes this into account: it waits at most 5 seconds for the connection and 5 seconds for each read before it fails:
public static byte[] downloadFile(URL url)
{
    try {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        conn.connect();
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        // close the connection's stream once it has been copied
        try (InputStream in = conn.getInputStream()) {
            IOUtils.copy(in, baos);
        }
        return baos.toByteArray();
    }
    catch (IOException e)
    {
        // Log the error, then return null (or some default, or throw a runtime exception)
        return null;
    }
}

byte[] b = IOUtils.toByteArray((new URL( )).openStream()); //idiom
Note, however, that the stream is not closed in the above example.
If you want the result as Base64 in 76-character chunks (using Commons Codec):
byte[] b = Base64.encodeBase64(IOUtils.toByteArray((new URL( )).openStream()), true);

Use commons-io IOUtils.toByteArray(URL):
String url = "http://localhost:8080/images/anImage.jpg";
byte[] fileContent = IOUtils.toByteArray(new URL(url));
Maven dependency:
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.6</version>
</dependency>

The content length is just an HTTP header. You cannot trust it. Just read everything you can from the stream.
Using available() is definitely wrong. It only returns the number of bytes that can be read without blocking.
Another issue is your resource handling. The stream has to be closed in every case; try/catch/finally will do that.
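The point about available() can be seen without any network at all. The sketch below is illustrative only: OpaqueStream is a made-up stand-in for a network stream that inherits the default InputStream.available(), which returns 0, and readAllBytes() assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class AvailableDemo {

    // Hypothetical stand-in for a network stream: it holds data, but makes no
    // promise about how much can be read without blocking, so it keeps the
    // default available() from InputStream.
    static class OpaqueStream extends InputStream {
        private final ByteArrayInputStream source;

        OpaqueStream(byte[] data) {
            this.source = new ByteArrayInputStream(data);
        }

        @Override
        public int read() {
            return source.read();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] image = new byte[20000];
        // try-with-resources closes the stream in every case
        try (InputStream in = new OpaqueStream(image)) {
            System.out.println("available() = " + in.available()); // 0
            byte[] all = in.readAllBytes(); // reads until end of stream (Java 9+)
            System.out.println("bytes read  = " + all.length);     // 20000
        }
    }
}
```

Sizing a buffer from available() here would allocate an empty array, while reading to end of stream returns the whole payload.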

It's important to specify timeouts, especially when the server takes a long time to respond. With pure Java, without any dependencies:
public static byte[] copyURLToByteArray(final String urlStr,
final int connectionTimeout, final int readTimeout)
throws IOException {
final URL url = new URL(urlStr);
final URLConnection connection = url.openConnection();
connection.setConnectTimeout(connectionTimeout);
connection.setReadTimeout(readTimeout);
try (InputStream input = connection.getInputStream();
ByteArrayOutputStream output = new ByteArrayOutputStream()) {
final byte[] buffer = new byte[8192];
for (int count; (count = input.read(buffer)) > 0;) {
output.write(buffer, 0, count);
}
return output.toByteArray();
}
}
Using dependencies, e.g., HC Fluent:
public byte[] copyURLToByteArray(final String urlStr,
final int connectionTimeout, final int readTimeout)
throws IOException {
return Request.Get(urlStr)
.connectTimeout(connectionTimeout)
.socketTimeout(readTimeout)
.execute()
.returnContent()
.asBytes();
}
