Downloading from dropbox url ignores range - java

When I download a file from a Dropbox URL, my HTTP Range header is ignored:
httpRequest = new HttpGet(url.toURI());
httpRequest.addHeader("Range", "bytes=" + startPos + "-" + dwnInfo.getStopRange());
httpRequest.addHeader("Accept-Encoding", "");
So instead of downloading the file in x chunks of, for example, 5 MB each, the connection ignores the specified range and downloads x chunks of y MB, where y is the full size of the file.
When downloading from an Amazon storage link I don't have any problems.
Has anyone else encountered this? It only started happening a few days ago; it wasn't an issue before.
I looked at the Dropbox developer pages but didn't see anything saying they had removed range support for these URLs.

The link you gave is to an HTML page (total size ~46KB), so even if range retrieval worked there, it wouldn't be very useful.
Per https://www.dropbox.com/help/201/en, you can turn a share link into a direct link to the file by changing the domain to dl.dropboxusercontent.com, so your link becomes https://dl.dropboxusercontent.com/s/5c7atlfmacjf3qn/02%20Armin%20Van%20Buuren%20-%20A%20State%20Of%20Trance%20Year%20Mix%202013%20%28Cd%202%29.mp3, and range retrieval works for that URL.
(Here I'm using httpie.)
$ http get https://dl.dropboxusercontent.com/s/5c7atlfmacjf3qn/02%20Armin%20Van%20Buuren%20-%20A%20State%20Of%20Trance%20Year%20Mix%202013%20%28Cd%202%29.mp3 range:bytes=0-0
HTTP/1.1 206 PARTIAL CONTENT
Connection: keep-alive
Content-Length: 1
Content-Type: audio/mpeg
Date: Wed, 18 Jun 2014 14:53:32 GMT
Server: nginx
accept-ranges: bytes
cache-control: max-age=0
content-range: bytes 0-0/146014047
etag: 346n
pragma: public
set-cookie: uc_session=2cqmevWxG8lmGt743KMXebc23dRC5iuZEfm8Etx6V2VShWk60jmnUJajFnH1wRG4; Domain=dropboxusercontent.com; Path=/; secure; httponly
x-dropbox-request-id: 2f0c5986a62cf2f0b06af1704ece5bd7
x-server-response-time: 535
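On the Java side, one way to notice an ignored range is to check for a 206 status and a Content-Range header before treating the body as a chunk. The following is only a sketch: the class and method names (RangeCheck, rangeHeader, rangeHonored, totalLength) are made up for illustration, and the status code and header value would come from whatever HTTP client you use.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any HTTP library.
class RangeCheck {
    // Matches e.g. "bytes 0-0/146014047"; the total may be "*" if unknown.
    private static final Pattern CONTENT_RANGE =
            Pattern.compile("bytes (\\d+)-(\\d+)/(\\d+|\\*)");

    /** Builds the value of a Range request header for an inclusive byte span. */
    static String rangeHeader(long start, long endInclusive) {
        return "bytes=" + start + "-" + endInclusive;
    }

    /** True only if the server answered 206 with a parseable Content-Range. */
    static boolean rangeHonored(int status, String contentRange) {
        return status == 206
                && contentRange != null
                && CONTENT_RANGE.matcher(contentRange.trim()).matches();
    }

    /** Returns the total size from Content-Range, or -1 if absent or unknown. */
    static long totalLength(String contentRange) {
        if (contentRange == null) {
            return -1;
        }
        Matcher m = CONTENT_RANGE.matcher(contentRange.trim());
        if (!m.matches() || "*".equals(m.group(3))) {
            return -1;
        }
        return Long.parseLong(m.group(3));
    }

    public static void main(String[] args) {
        // Values taken from the response shown above.
        System.out.println(rangeHeader(0, 0));
        System.out.println(rangeHonored(206, "bytes 0-0/146014047"));
        System.out.println(totalLength("bytes 0-0/146014047"));
    }
}
```

A 200 response with no Content-Range means the server sent the whole resource, which matches the behavior described in the question for the share-page URL.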

Related

Apache HTTP Client throws NoHttpResponseException When Nginx Ingress Reloaded for POST

When we reload the Nginx Ingress config, we get a NoHttpResponseException for some of our POST requests. This does not occur with either the OkHttp client or plain ab -c 100 -n 1000 https://...
We are using 4.5.7, the latest version, and disabled gzip compression for visibility. We put a breakpoint in DefaultHttpResponseParser at:
@Override
protected HttpResponse parseHead(
final SessionInputBuffer sessionBuffer) throws IOException, HttpException {
//read out the HTTP status string
int count = 0;
ParserCursor cursor = null;
do {
// clear the buffer
this.lineBuf.clear();
final int i = sessionBuffer.readLine(this.lineBuf);
if (i == -1 && count == 0) {
// The server just dropped connection on us
throw new NoHttpResponseException("The target server failed to respond");
}
When an error occurs, we observe the buffer has the following contents:
0
1.1 200 OK
Server: nginx/1.15.5
Date: Tue, 19 Mar 2019 08:51:27 GMT
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Strict-Transport-Security: max-age=15724800; includeSubDomains
10
{"success":true}
But for the regular requests, it has the following contents, which makes more sense:
HTTP/1.1 200 OK
Server: nginx/1.15.5
Date: Tue, 19 Mar 2019 08:52:30 GMT
Content-Type: application/json;charset=UTF-8
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Strict-Transport-Security: max-age=15724800; includeSubDomains
10
{"success":true}
Now, I am not sure what is wrong, because both OkHttp and ab work correctly. I tried many versions, but the problem remains.

okhttp content-length is -1 with big files

I am downloading a file with okhttp and things work fine - now I want to show the progress and hit a road-bump. The returned content-length is -1.
It comes back correctly from the server:
$ curl -i http://ipfs.io/ipfs/QmRMHb4Vhv8LtYqw8RkDgkdZYxJHfrfFeQaHbNUqJYmdF2
HTTP/1.1 200 OK
Date: Tue, 14 Jun 2016 11:38:16 GMT
Content-Type: application/octet-stream
Content-Length: 27865948
I traced the problem down to OkHeaders.java here:
public static long contentLength(Headers headers) {
return stringToLong(headers.get("Content-Length"));
}
I see all the other headers here in headers, but not Content-Length, so headers.get("Content-Length") returns null. Does anyone have a clue how this can get lost?
Interestingly, if I change the URL to "http://google.com" I get a content length from okhttp, yet with curl both responses look the same as far as Content-Length goes. This really confuses me.
Update: it seems to correlate with the size of the file. If I request smaller content from the same server, I get a Content-Length with okhttp. The problem only happens when the file is big.
It looks like, above a certain size, the server switches to chunked transfer encoding, and in that case you won't get a content length:
HTTP/1.1 200 OK
Date: Tue, 14 Jun 2016 14:30:07 GMT
Content-Type: application/octet-stream
Transfer-Encoding: chunked
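For the progress display, that means the code has to cope with an unknown total. Below is a minimal sketch of one way to do it, independent of okhttp: the names DownloadProgress, Listener, percent and copy are hypothetical, and the contentLength argument is assumed to be whatever your client reports (-1 when the header is missing). With an unknown length it reports -1 as the percentage, so the UI can fall back to an indeterminate indicator or a plain byte counter.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; not part of okhttp.
class DownloadProgress {
    /** Receives the bytes copied so far and a percentage, or -1 if the total is unknown. */
    interface Listener {
        void onProgress(long bytesRead, int percent);
    }

    /** Returns 0..100, or -1 when the content length is unknown (e.g. chunked encoding). */
    static int percent(long bytesRead, long contentLength) {
        if (contentLength <= 0) {
            return -1;
        }
        return (int) Math.min(100, bytesRead * 100 / contentLength);
    }

    /** Copies the stream, notifying the listener after every buffer; returns bytes copied. */
    static long copy(InputStream in, OutputStream out, long contentLength, Listener listener)
            throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
            listener.onProgress(total, percent(total, contentLength));
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Simulates a body whose length was not announced by the server.
        byte[] data = new byte[20000];
        copy(new ByteArrayInputStream(data), new ByteArrayOutputStream(), -1,
                (read, pct) -> System.out.println(read + " bytes, percent=" + pct));
    }
}
```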

App-Engine directing endpoint to previous class name, now 404 not found

I've been using App-Engine as the backend for an Android and iOS application. It's been working without problem with both the local development server (over http) and actual app-engine (over https).
Then I noticed that, while renaming endpoints, I accidentally duplicated a word in the class name of an endpoint: RegionRegionIconsEndpoint instead of simply RegionIconsEndpoint. It was a 1-line fix.
public class RegionRegionIconsEndpoint {
@ApiMethod(name = "getRegionIcons", path="regionIcons", httpMethod = HttpMethod.POST)
public RegionInfoVersion.RegionIcons getRegionIcons(User user, @Named("id") String id)
throws OAuthRequestException {
...
}
}
became
public class RegionIconsEndpoint {
@ApiMethod(name = "getRegionIcons", path="regionIcons", httpMethod = HttpMethod.POST)
public RegionInfoVersion.RegionIcons getRegionIcons(User user, @Named("id") String id)
throws OAuthRequestException {
...
}
}
I generated new cloud-endpoint libraries and continued development using the local development server. All good.
When I deployed it to the real App-Engine service, however, a problem arose. When my app starts, there are a series of calls to other endpoints defined just as the one shown above; these always work fine. Then there are calls to this endpoint. A typical call looks like this:
POST https://my-app.appspot.com/_ah/api/client/v1/regionIcons?id=foo
Authorization is also provided and the expected result comes back most of the time... say 80%. The AE logs look like this:
2014-05-02 21:36:30.551 /_ah/spi/com.example.app.endpoints.RegionIconsEndpoint.getRegionIcons 200 48ms 0kb Google-HTTP-Java-Client/1.16.0-rc (gzip) module=default version=1
70.80.59.221 - - [02/May/2014:18:36:30 -0700] "POST /_ah/spi/com.example.app.endpoints.RegionIconsEndpoint.getRegionIcons HTTP/1.1" 200 149 - "Google-HTTP-Java-Client/1.16.0-rc (gzip)" "my-app.appspot.com" ms=49 cpu_ms=41 cpm_usd=0.000017 app_engine_release=1.9.4 instance=006c1b117c1b2d35341e0f407ae5785a825b65e5
The remaining times, I get a 404 Not Found response and the AE logs have this:
2014-05-02 21:36:30.852 /_ah/spi/BackendService.logMessages 204 16ms 0kb module=default version=1
10.1.0.41 - - [02/May/2014:18:36:30 -0700] "POST /_ah/spi/BackendService.logMessages HTTP/1.1" 204 0 - - "my-app.appspot.com" ms=16 cpu_ms=0 app_engine_release=1.9.4 instance=006c1b117c1b2d35341e0f407ae5785a825b65e5
E 2014-05-02 21:36:30.851
Request URL: https://my-app.appspot.com/_ah/api/client/v1/regionIcons?id=foo
Method: client.getRegionIcons
Error Code: 404
Reason: notFound
Message: service 'com.example.app.endpoints.RegionRegionIconsEndpoint' not found
2014-05-02 21:36:30.802 /_ah/spi/com.example.app.endpoints.RegionRegionIconsEndpoint.getRegionIcons 404 16ms 0kb Google-HTTP-Java-Client/1.16.0-rc (gzip) module=default version=1
70.80.59.221 - - [02/May/2014:18:36:30 -0700] "POST /_ah/spi/com.example.app.endpoints.RegionRegionIconsEndpoint.getRegionIcons HTTP/1.1" 404 166 - "Google-HTTP-Java-Client/1.16.0-rc (gzip)" "my-app.appspot.com" ms=16 cpu_ms=0 cpm_usd=0.000019 app_engine_release=1.9.4 instance=006c1b117c1b2d35341e0f407ae5785a825b65e5
You can see on the Message line that, sometimes, AE is still trying to process the call using the old class name with the duplicated word! I've done searches over my entire code-base and the generated files and I cannot find the string "RegionRegion" anywhere. I've checked the web.xml file a dozen times and it has only the new "RegionIconsEndpoint" class name.
Wondering if somehow Google's servers were keeping old information around, I deployed the new version of my app as 2-dot-my-app.appspot.com. The behavior remains exactly the same except that there are no AE log messages for the requests that fail with 404 on this version. Successful request logs are as before.
Both my Android and iPad apps are experiencing this. In addition, I've managed to reproduce it using the web and Google's API explorer on my-app.appspot.com. In this last case, a successful request shows this:
200 OK
cache-control: no-cache, no-store, max-age=0, must-revalidate
content-encoding: gzip
content-length: 171
content-type: application/json; charset=UTF-8
date: Sat, 03 May 2014 03:07:05 GMT
etag: "G170GGjYGsLnxTffzUEJmTttHzU/LUWzmydK3mjH7IeRbEc_n9J6cDQ"
expires: Fri, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
server: GSE
{
"iconsVid": "foo",
"iconsVersion": 3,
"kind": "client#resourcesItem",
"etag": "\"G170GGjYGsLnxTffzUEJmTttHzU/LUWzmydK3mjH7IeRbEc_n9J6cDQ\""
}
and a failed request shows this:
404 Not Found
cache-control: private, max-age=0
content-encoding: gzip
content-length: 169
content-type: application/json; charset=UTF-8
date: Sat, 03 May 2014 03:08:34 GMT
expires: Sat, 03 May 2014 03:08:34 GMT
server: GSE
{
"error": {
"errors": [
{
"domain": "global",
"reason": "notFound",
"message": "service 'com.example.app.endpoints.RegionRegionIconsEndpoint' not found"
}
],
"code": 404,
"message": "service 'com.example.app.endpoints.RegionRegionIconsEndpoint' not found"
}
}
again clearly showing an access to the old class name. When I try the same thing against the v2 version that I deployed (2-dot-my-app.appspot.com), the result is different. A successful request ends like this:
200 OK
cache-control: no-cache, no-store, max-age=0, must-revalidate
content-encoding: gzip
content-length: 171
content-type: application/json; charset=UTF-8
date: Sat, 03 May 2014 03:12:08 GMT
etag: "EP5CWx59se1v4KdDnkfEx7cTkis/LUWzmydK3mjH7IeRbEc_n9J6cDQ"
expires: Fri, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
server: GSE
{
"iconsVid": "foo",
"iconsVersion": 3,
"kind": "client#resourcesItem",
"etag": "\"EP5CWx59se1v4KdDnkfEx7cTkis/LUWzmydK3mjH7IeRbEc_n9J6cDQ\""
}
and a failed request ends like this:
404 Not Found
cache-control: no-cache, no-store, max-age=0, must-revalidate
content-encoding: gzip
content-length: 29
content-type: text/html; charset=UTF-8
date: Sat, 03 May 2014 03:06:10 GMT
expires: Fri, 01 Jan 1990 00:00:00 GMT
pragma: no-cache
server: GSE
Not Found
I don't know what else to try. To me, it looks like a bug in App-Engine.
So... any ideas what is going on here and how to fix or work around it?
2014-05-04: I tried changing the method from POST to GET: exact same behavior. I tried changing the path from regionIcons to regionIconsFoo: exact same behavior. I tried changing the @Api version from v1 to v2: exact same behavior.
Finally, I tried changing the name of the class back to the previous (with the duplicated word): I get fewer failures (maybe 5% instead of 20%) but they still occur with the failing requests trying to access the now non-existent class name without the duplicated word.
Restoring the correct name resumes the originally described behavior with the original failure rate.
I've been struggling with a similar problem. Check the logs on the appengine.com project site. You should see a log entry about the update; if it contains additional information about an error, look into it.
Sometimes AE works well on the local machine but the deployment process reveals bugs.
Edit:
1. Rename the class back to the old "double" name, upload it to AE and check whether all requests work without errors. If they do, rename the class again. (If it's an App Engine bug, this should fix it.)
2. Create the simplest possible API and deploy it in place of your project. Upload it to AE and check with the API explorer that everything is OK without the methods from your main project. If it is, swap the "test" project back for your real one and upload it to AE again.
This isn't an answer because it doesn't address the cause, but it is my solution.
I duplicated the working class back into the old class name.
public class RegionRegionIconsEndpoint {
@ApiMethod(name = "getRegionIconsOld", path="regionIconsOld", httpMethod = HttpMethod.POST)
public RegionInfoVersion.RegionIcons getRegionIcons(User user, @Named("id") String id)
throws OAuthRequestException {
...
}
}
Now in the log, even though I'm only ever calling getRegionIcons, I see indications that both classes are being called with the "old" version handling about 20% of the requests. It's a hack and I don't like it, but it works and the clients are happy with it.
If you can't beat 'em, join 'em.

Why is my Java app only fetching an old version of an online file?

I have a file online with information about some Minecraft blocks. When I first made this test file, I gave it three rows and a header expiration date of next Sunday (whenever that may be). My Java app fetched this no problem!
However, now I have inserted three more rows into this small database and changed the expiration date to last week, but my Java app still displays the original 3! When I visit the page in a browser, it gives me the full, current table. How come the Java app is still only fetching the old version?
The key code:
InputStream in;
URLConnection urlc = url.openConnection(); // url is a valid java.net.URL object
urlc.setAllowUserInteraction(false);
urlc.setDoInput(true);
urlc.setDoOutput(false);
urlc.setRequestProperty("User-Agent", "BHMI/3.0.0 (+http://prog.BHStudios.org/BHMI) Java/" + System.getProperty("java.version") + "(" + System.getProperty("java.vm.name") + ")"); // GoDaddy blocks Java clients, so we must have a custom user agent string
urlc.setDefaultUseCaches(false);
urlc.setUseCaches(false);
urlc.connect();
System.out.println("Connection successful! Database expires " + new Date(urlc.getExpiration()));
in = urlc.getInputStream();
int data;
StringBuilder sb = new StringBuilder();
while ((data = in.read()) != -1)
sb.append((char) data);
System.out.println("RAW DATA:\r\n"+sb);
Sample output:
Connection successful! Database expires Tue Nov 26 00:09:05 EST 2013
RAW DATA:
minecraft:air,Air,0,0,,
minecraft:stone,Stone,1,0,2,
minecraft:grass,Grass,2,0,,
I cleared the Java network cache through Windows control panel, and all caches and temporary files on my local machine with CCleaner, but this still happens. Heck, it happens across machines, so it can't be that. I've cleared all edge caches from my server, so it also can't be that.
I've even tried downloading the file after telling my browser to use my Java app's User-Agent string, and it fetched all 5 lines.
Request Headers
From my Java app:
GET /http/bhstudios/v2/prog/bhmi/database/get HTTP/1.1
User-Agent: BHMI/3.0.0 (+http://prog.BHStudios.org/BHMI) Java/1.7.0_45(Java HotSpot(TM) 64-Bit Server VM)
Cache-Control: no-cache, must-revalidate, max-age=0, no-store
Pragma: no-cache
Expires: Thu, 01 Jan 1970 00:00:00 GMT
Connection: close
Host: BHStudios.org
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
From Chrome, spoofing the same User-Agent string:
GET /prog/bhmi/database/get/ HTTP/1.1
Host: prog.bhstudios.org
Connection: keep-alive
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
User-Agent: BHMI/3.0.0 (+http://prog.BHStudios.org/BHMI) Java/1.7.0_45(Java HotSpot(TM) 64-Bit Server VM)
Accept-Encoding: gzip,deflate,sdch
Accept-Language: en-US,en;q=0.8
Cookie: __cfduid=dc9d0394ed55ebb1214fcbb5fc825626b1385426208553; visitorId=5293ed2b758cb1b5620000b0
Response Headers
From my Java app:
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Tue, 26 Nov 2013 02:17:39 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Set-Cookie: __cfduid=d4432e3d81cf9e5b9393f2cca483e4b2d1385432256651; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.bhstudios.org; HttpOnly
X-Powered-By: ASP.NET
X-UA-Compatible: chrome=IE8
CF-RAY: d33155416660862
Note that suspicious cookie expiration expires=Mon, 23-Dec-2019 23:50:00 GMT. Could this be the cause?
I also note that, when fetching from Chrome and using the same User-Agent string as my app, the header is:
HTTP/1.1 200 OK
Server: cloudflare-nginx
Date: Wed, 27 Nov 2013 17:30:01 GMT
Content-Type: text/html
Transfer-Encoding: chunked
Connection: keep-alive
Cache-Control: no-cache, must-revalidate, max-age=0, no-store
Pragma: no-cache
Expires: Mon, 18 Nov 2013 10:30:01 America/Phoenix
Content-Description: File Transfer
Content-Disposition: attachment; filename=BHMI_Items_Vanilla_172.csv
Content-Transfer-Encoding: base64
X-Powered-By: ASP.NET
X-UA-Compatible: chrome=IE8
CF-RAY: d408b3c56320098
Content-Encoding: gzip
which is the intended header, with an expiration date of last week.
You have to state in your request headers that you do not want cached data:
urlc.setRequestProperty("Cache-Control","no-cache, must-revalidate"); //HTTP 1.1
urlc.setRequestProperty("Pragma","no-cache"); //HTTP 1.0
... I was requesting the wrong file.
Sorry for wasting your time >.<
As the header shows, I was addressing an old URL scheme, /http/bhstudios/v2/prog/bhmi/database/get, when I wanted /prog/bhmi/database/get

Generating HttpResponse

When creating the HTTP response manually, how can one get the values for Server and ETag?
* HTTP/1.1 200 OK
* Date: Mon, 23 Apr 2012 23:44:52 GMT
* Server: Apache/2.2.3 (Red Hat) <-----
* Last-Modified: Fri, 16 Sep 2005 18:08:50 GMT
* ETag: "421142-2f-400e77c517080" <-----
* Accept-Ranges: bytes
* Content-Length: 47
* Content-Type: text/plain
* Connection: close
"Server" is whatever your HTTP server wants to name or identify itself as, for example "Zumgto Surver 4.5".
"ETag" identifies the "version" of a particular item, so as long as your server can reasonably say "this ETag corresponds to the current version", you can send pretty much anything: for example "v3345", or a hash of the item. It is entirely optional if you don't support the "If-None-Match" header in requests.
Neither is required. You can make up your own server tag using the same format as above. Omit the ETag or just generate your own. You could use the current timestamp or a constant. The following formats should work.
Server: Program/version (O/S)
ETag: "Timestamp"
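As an illustration of the "hash of the item" option, here is a small sketch that derives an ETag from the response body and assembles the header block by hand. Everything in it is an assumption for the example: the class name EtagExample, the choice of SHA-256 truncated to 8 bytes, and the server name "Example/1.0" are all arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical example class; the hashing scheme is one arbitrary choice among many.
class EtagExample {
    /** Returns a quoted ETag built from the first 8 bytes of the body's SHA-256 hash. */
    static String etagFor(byte[] body) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(body);
            StringBuilder tag = new StringBuilder("\"");
            for (int i = 0; i < 8; i++) {
                // Mask to 0..255 so negative bytes still print as two hex digits.
                tag.append(String.format("%02x", hash[i] & 0xff));
            }
            return tag.append('"').toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Builds the status line and headers for a plain-text 200 response. */
    static String responseHead(byte[] body, String serverName) {
        return "HTTP/1.1 200 OK\r\n"
                + "Server: " + serverName + "\r\n"
                + "ETag: " + etagFor(body) + "\r\n"
                + "Content-Length: " + body.length + "\r\n"
                + "Content-Type: text/plain\r\n"
                + "Connection: close\r\n"
                + "\r\n";
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.print(responseHead(body, "Example/1.0"));
    }
}
```

Because the tag depends only on the body, it stays the same until the content changes, which is all a validator needs.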
