I'm currently developing an app that should measure (fairly precisely) the size of web pages.
The thing I'm struggling with now is that I need to know the sizes of particular files on the website. I have an array of URLs and I try to fetch their headers to get Content-Length; however, some files return -1 because they are sent chunked. If they return -1, I try to download them to get their size.
And here lies the problem: I found out that I always get the uncompressed version of the file.
Example file -
http://www.google-analytics.com/analytics.js
When I open it in Chrome, the headers say this:
However, when I download it using HttpURLConnection, it has a size of 25421 bytes, and when I check the Content-Encoding header, it's always null.
connection = (HttpURLConnection)(new URL(url)).openConnection();
connection.setRequestProperty("Accept-Encoding", "gzip");
connection.connect();
int contentLength = connection.getContentLength();
if (contentLength == -1 && connection != null) {
    InputStream input = connection.getInputStream();
    byte[] buffer = new byte[4096];
    int count = 0, len;
    while ((len = input.read(buffer)) > 0) {
        count += len;
    }
    contentLength = count;
}
So the problem is that I download a web page with my application and it reports (let's say) 400 kB. But when I download it using a tool like http://tools.pingdom.com/fpt/ , the size is much smaller, around 100 kB, because most of the scripts are gzipped, which means less data is transferred.
I know 300 kB is not that much, but on a mobile connection every kB counts, and I want my app to be precise.
Could you point out where I'm making a mistake, or how I could solve this?
Thank you
Your HttpURLConnection setup code looks correct to me. You could try setting the User-Agent to a standard browser one; perhaps the server is trying to be more intelligent than it ought to be. Failing that, run your traffic through a debugging proxy like Fiddler or Burp to see what's going on at the network level.
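To make the size difference from the question concrete, here is a minimal sketch that stays off the network: it gzips a buffer in memory and counts the bytes both before and after decoding. SizeCheck, countBytes and gzip are hypothetical names, and the zero-filled buffer only stands in for a real response body.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class SizeCheck {
    /** Reads the stream to the end and returns the number of bytes seen. */
    static long countBytes(InputStream in) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int len;
        // read() returns -1 at end of stream
        while ((len = in.read(buffer)) != -1) {
            total += len;
        }
        return total;
    }

    /** Gzips a byte array in memory (a stand-in for a compressed HTTP body). */
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = new byte[25421];   // placeholder content, not the real analytics.js
        byte[] wire = gzip(raw);
        long onWire = countBytes(new ByteArrayInputStream(wire));
        long decoded = countBytes(new GZIPInputStream(new ByteArrayInputStream(wire)));
        System.out.println("on the wire: " + onWire + " bytes, decoded: " + decoded + " bytes");
    }
}
```

Which of the two numbers a given HTTP client reports depends on whether it decodes the body before handing it over, so a debugging proxy remains the reliable way to see what was actually transferred.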
If you are using iJetty, you have to enable gzip compression first.
You have to enable the GzipFilter to make Jetty return compressed content. Have a look here on how to do that: http://blog.max.berger.name/2010/01/jetty-7-gzip-filter.html
You can also use the gzip init parameter to make Jetty search for compressed content. That means if the file file.txt is requested, Jetty will look for a file named file.txt.gz and return that.
I'm sending a zip file over an FTP connection, so to fetch the file size I have used:
URLConnection conn = imageURL.openConnection();
long l = conn.getContentLengthLong();
But it returns -1.
For files sent over an HTTP request, I get the correct file size.
How do I get the correct file size over an FTP connection in this case?
for files sent over Http request , I get correct file size.
MAYBE. URLConnection.getContentLength[Long] returns specifically the content-length header. HTTP (and HTTPS) supports several different ways of delimiting bodies, and depending on the HTTP options and versions the server implements, it might use a content-length header or it might not.
Somewhat similarly, an FTP server may provide the size of a 'retrieved' file at the beginning of the operation, or it may not. But it never uses a content-length header to do so, so getContentLength[Long] doesn't get it. However, the implementation code does store it internally if the server provides it, and it can be extracted by the following quite ugly hack:
URL url = new URL("ftp://anonymous:dummy@192.168.56.2/pub/test");
URLConnection conn = url.openConnection();
try (InputStream is = conn.getInputStream()) {
    if (!conn.getClass().getName().equals("sun.net.www.protocol.ftp.FtpURLConnection"))
        throw new Exception("conn wrong");
    // Field is java.lang.reflect.Field
    Field fld1 = conn.getClass().getDeclaredField("ftp");
    fld1.setAccessible(true);
    Object ftp = fld1.get(conn);
    if (!ftp.getClass().getName().equals("sun.net.ftp.impl.FtpClient"))
        throw new Exception("ftp wrong");
    Field fld2 = ftp.getClass().getDeclaredField("lastTransSize");
    fld2.setAccessible(true);
    long size = fld2.getLong(ftp);
    System.out.println(size);
}
Hacking undocumented internals may fail at any time, and versions of Java above 8 progressively discourage it: 9 through 15 give a warning message about illegal access unless you use --add-opens to permit it and 16 makes it an error (i.e. fatal). Unless you really need the size before reading the data (which implicitly gives you the size) I would not recommend this.
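If the size is only needed once the data has been read, a sketch along these lines avoids reflection entirely. Files.copy returns the number of bytes it copied, so saving the stream also measures it. SizeByReading and saveAndMeasure are hypothetical names, and the stream is assumed to come from conn.getInputStream().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SizeByReading {
    /** Saves the stream to target and returns how many bytes were written. */
    static long saveAndMeasure(InputStream in, Path target) throws IOException {
        // Files.copy reads the stream to its end and reports the byte count.
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

For the FTP case this would be called as saveAndMeasure(conn.getInputStream(), somePath), where somePath is whatever local file the download should land in.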
I'm trying to download zip files from the internet using the following code:
public void getFile(String updateURL) throws Exception {
    URL url = new URL(updateURL);
    HttpURLConnection httpsConn = (HttpURLConnection) url.openConnection();
    httpsConn.setRequestMethod("GET");
    TrustModifier.relaxHostChecking(httpsConn);
    int responseCode = httpsConn.getResponseCode();
    if (responseCode == HttpsURLConnection.HTTP_OK) {
        String fileName = "fileFromNet";
        try (FileOutputStream outputStream = new FileOutputStream(fileName)) {
            ReadableByteChannel rbc = Channels.newChannel(httpsConn.getInputStream());
            outputStream.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        }
    }
    httpsConn.disconnect();
}
TrustModifier is a class used to solve the "trust issue": http://www.obsidianscheduler.com/blog/ignoring-self-signed-certificates-in-java/
The code above works well for zip files available via plain http, or for non-compressed files exposed via https, but if I try to download a zip file exposed via an https endpoint, only a small fragment of the original file is downloaded. I have tested with different download links from the internet and always got the same result.
Does anybody have an idea what I've been doing wrong here?
Thank you.
transferFrom() must be called in a loop until the transfer is complete, and in this case the only way you can know that is by adding up the return values of transferFrom() until they equal the Content-length of the HTTP response.
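A minimal sketch of such a loop follows. ChannelCopy and transferFully are hypothetical names, and expected is assumed to be the Content-Length reported by the response.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;

final class ChannelCopy {
    /**
     * Calls transferFrom() repeatedly until expected bytes have been written.
     * Returns the number of bytes actually transferred.
     */
    static long transferFully(ReadableByteChannel src, FileChannel dst, long expected)
            throws IOException {
        long position = 0;
        while (position < expected) {
            // Each call may move fewer bytes than requested.
            long n = dst.transferFrom(src, position, expected - position);
            if (n == 0) {
                // Assumption: the source is blocking, so 0 means it ended early.
                break;
            }
            position += n;
        }
        return position;
    }
}
```

The caller can compare the return value with expected to detect a truncated download.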
Actually the problem was in the TrustModifier class I was using to switch off the server certificate check. Once I removed it because I didn't need it any longer (I took the certificate from the server and put it in a local trust store), my problem was solved.
I wrote a simple download manager and I'm trying to add resume support for all downloads. After googling how to do that, I know I must call setRequestProperty on the connection, but my code does not work and I get this error:
FATAL EXCEPTION: Thread-882
java.lang.IllegalStateException: Cannot set request property after connection is made
at libcore.net.http.HttpURLConnectionImpl.setRequestProperty(HttpURLConnectionImpl.java:510)
My code is:
URL url = new URL(downloadPath);
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
final int fileSize = connection.getContentLength();
File file = new File(filepath);
if (file.exists() && fileSize == file.length()) {
    return;
} else if (file.exists()) {
    connection.setRequestProperty("Range", "bytes=" + file.length() + "-");
} else {
    connection.setRequestProperty("Range", "bytes=" + downloadedSize + "-");
}
connection.setRequestMethod("GET");
connection.setDoInput(true);
connection.setDoOutput(true);
connection.connect();
How can I resolve this problem and correctly call setRequestProperty on the connection?
The problem is that you're calling connection.getContentLength() before you're calling setRequestProperty(). The content length is only available after you've made a request, at which point you can't set the request property...
It's not entirely clear what you're trying to do, but one option is to use a HEAD request just to get the content length, and then make a separate request if you need to get just a portion of the data. Be aware that it's possible that the content length will change between requests, of course.
However, I would actually suggest keeping more metadata somewhere in your download manager - so that when you first start downloading the data, you keep a record of the total size, so that you don't need to make the HEAD request when resuming - you can tell just from the local information whether or not you've already downloaded a file. (This has the same problem in terms of content changing, but that's a different matter.)
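The ordering described in this answer could look roughly like the sketch below: one connection is used only for a HEAD request, and a second one gets its Range header before anything is sent. ResumeRequest, prepare and totalSize are hypothetical names, and getContentLengthLong() is assumed to be available on the target platform.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class ResumeRequest {
    /** Builds (but does not send) a GET asking for the bytes from offset onward. */
    static HttpURLConnection prepare(URL url, long offset) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        if (offset > 0) {
            // Must happen before any call that triggers the request.
            conn.setRequestProperty("Range", "bytes=" + offset + "-");
        }
        return conn; // nothing has been sent yet
    }

    /** Sends a separate HEAD request; returns -1 if the server reports no length. */
    static long totalSize(URL url) throws IOException {
        HttpURLConnection head = (HttpURLConnection) url.openConnection();
        head.setRequestMethod("HEAD");
        try {
            return head.getContentLengthLong(); // this call performs the request
        } finally {
            head.disconnect();
        }
    }
}
```

Keeping the two requests on separate connection objects means the Range header is never set on a connection that has already been used.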
I had the same error as the OP.
WHY
The problem is that when you set request parameters to resume the download, the connection must not already be established.
The moment you invoke connection.getContentLength(), the connection is opened implicitly (as if connection.connect() had been called), so if you then try to set properties on the connection you get the error mentioned.
FIX
What I did was close the HTTP connection after invoking long totalFileSize = connection.getContentLength();
connection.disconnect(); // Disconnect from HTTP
After that you can set the parameters you want on the connection and invoke connection.connect() whenever needed.
TIP
In my particular case I was trying to download a file and needed to support resumable downloads, so what I did was:
Check if the file exists.
If the file exists, get the length of the file:
long bytesDownloaded = file.length();
Use this length to set the Range on the connection so it can resume from exactly the byte where it was paused.
Write the bytes to the end of the file.
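The last step of the list above can be sketched as follows. ResumeWriter and appendTo are hypothetical names, and body is assumed to be the input stream of a response to a ranged request.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class ResumeWriter {
    /** Appends the remaining bytes to the partial file and returns its new length. */
    static long appendTo(File partial, InputStream body) throws IOException {
        // The second constructor argument (true) opens the file in append mode.
        try (OutputStream out = new FileOutputStream(partial, true)) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = body.read(buffer)) != -1) {
                out.write(buffer, 0, len);
            }
        }
        return partial.length();
    }
}
```

Appending is only safe when the response code is HttpURLConnection.HTTP_PARTIAL (206). A plain 200 means the server sent the whole file again, and appending it would corrupt the download.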
You should set properties before calling getContentLength().
If you set the range start equal to the existing file's length, getContentLength() returns the number of remaining bytes, so a content length of 0 means the file was downloaded completely.
But if you want to build a download manager, @Jon Skeet's method is reasonable.
Edit:
public abstract long getContentLength ()
Added in API level 1
Tells the length of the content, if known.
Returns the number of bytes of the content, or a negative number if unknown. If the content length is known but exceeds Long.MAX_VALUE, a negative number is returned.
I want to stream a radio station with Java. My approach is to download the playlist file (.pls), extract one of the URLs given in that file and, finally, stream it with Java. However, I cannot find a way to do it. I tried with JMF, but I get java.io.IOException: Invalid Http response every time I run the code.
Here is what I tried:
Player player = Manager.createPlayer(new URL("http://50.7.98.106:8398"));
player.start();
The .pls file:
[playlist]
NumberOfEntries=1
File1=http://50.7.98.106:8398/
In the piece of code above I'm setting the URL by hand, just for testing, but I've successfully written the .pls downloading code and it's working. That leads to another question: is it a better approach to simply play the .pls file locally? Can it be done?
You are connecting to an Icecast server, not a web server. That address/port is not sending back HTTP responses, it's sending back Icecast responses.
The HTTP specification states that the response line must start with the HTTP version of the response. Icecast responses don't do that, so they are not valid HTTP responses.
I don't know anything about implementing an Icecast client, but I suspect such clients interpret an http: URL in a .pls file as being just a host and port specification, rather than a true HTTP URL.
You can't use the URL class to download your stream, because it (rightly) rejects invalid HTTP responses, so you'll need to read the data yourself. Fortunately, that part is fairly easy:
Socket connection = new Socket("50.7.98.106", 8398);
// HTTP-style request lines end with CRLF.
String request = "GET / HTTP/1.1\r\n\r\n";
OutputStream out = connection.getOutputStream();
out.write(request.getBytes(StandardCharsets.US_ASCII));
out.flush();
InputStream response = connection.getInputStream();
// Skip headers until we read a blank line.
int lineLength;
do {
    lineLength = 0;
    for (int b = response.read();
         b >= 0 && b != '\n';
         b = response.read()) {
        // Ignore the '\r' of a CRLF so "\r\n" still counts as blank.
        if (b != '\r') {
            lineLength++;
        }
    }
} while (lineLength > 0);
// rest of stream is audio data.
// ...
You still will need to find something to play the audio. Java Sound can't play MP3s (without a plugin). JMF and JavaFX require a URL, not just an InputStream.
I see a lot of recommendations on Stack Overflow for JLayer, whose Player class accepts an InputStream. Using that, the rest of the code is:
Player player = new Player(response);
player.play();
I am using the official Dropbox API for Java.
So far, everything works smoothly. Authentication via oauth works and so do other functions (like directory listings).
Now, I tried to upload a file like this:
InputStream is = getInputStream();
byte[] bytes = is2Bytes(is); // Gets all bytes "behind" the stream
int len = bytes.length;
api.putFileOverwrite(path, is, len, null);
Now, when I do this call, my application hangs for about 15 seconds and then I get an exception thrown that Dropbox server did not respond.
So, first I asked Dropbox support if there was something wrong with their server. There isn't.
Then, I played around with the parameters of the putFileOverwrite method and I found out that if I set len=0 manually, the server responds and creates a 0 byte file with the correct file name.
As another test, I manually entered the value len=100 (the original file has 250KB so that should be ok). Again, the server does NOT respond.
So, what's wrong?
That is not weird at all. Since you use your self-made method is2Bytes, the stream is empty, because you read all the bytes to count them. The proper way of doing this would be either knowing how many bytes you are going to send or using the built-in method for sending a file.
public HttpResponse putFile(String root, String dbPath, File localFile)
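The point about the stream being consumed can be checked with a small sketch. StreamOnce and drain are hypothetical names, with drain standing in for a helper like is2Bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

final class StreamOnce {
    /** Reads every remaining byte from the stream (a stand-in for is2Bytes). */
    static byte[] drain(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        InputStream source = new ByteArrayInputStream(new byte[250]);
        byte[] bytes = drain(source);
        // The original stream is now at its end; a further read yields -1.
        System.out.println("bytes read: " + bytes.length + ", next read: " + source.read());
        // A fresh stream over the same bytes can be handed to an upload call.
        InputStream again = new ByteArrayInputStream(bytes);
        System.out.println("available in new stream: " + again.available());
    }
}
```

An upload call that is told to expect len bytes but receives an already drained stream has nothing to send, which would be consistent with the timeout described in the question.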
Very weird. I was able to work around this by re-creating a new InputStream from the byte array and sending that to Dropbox:
InputStream is = getInputStream();
byte[] bytes = is2Bytes(is); // Gets all bytes "behind" the stream
int len = bytes.length;
ByteArrayInputStream bis = new ByteArrayInputStream(bytes);
api.putFileOverwrite(path, bis, len, null);