URL Size in Google App Engine


I am calling a servlet in my app hosted on GAE. My request URL is longer than 2048 characters and I am getting a 400 Bad Request error. However, the referenced documentation mentions that we can make a request with 10MB of data. So how can we send a request with 10MB of data? I am currently using the free quota. A similar question was asked long ago, but it has not been answered yet.

AppEngine limits aside, it doesn't make much sense to put 10MB of data in a URL.
When you take a look at the HTTP protocol, a GET request looks like this:
GET /path/to/resource?possibleParam=value HTTP/1.1
Host: www.example.com
and a POST request looks like this:
POST /path/to/resource?possibleParam=value HTTP/1.1
Host: www.example.com
Content-Type: */*; charset=utf-8
Content-Length: 4242
here come the actual data with a length of 4242 bytes
So if you allow large amounts of data in the URI of a GET request, the server doesn't know how much memory it has to allocate in order to receive the whole URI. For better performance, it comes quite naturally to restrict the length of GET requests and force you to use a POST request instead, where the Content-Length must be made known before the bulk of the data is actually sent.
Let's take a look at the comments from other Stack Overflow users:
tx802 said:
POST your data?
Alex Martelli, referring to the maximum allowed URL length, said:
it will never be extended to 10 MB -- that obviously calls for a POST
or PUT (where data goes in the body, not the URL!)
That should make sense now, because protocol-wise it doesn't make much sense to push megabytes of data as a URI.
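To make the difference concrete, here is a minimal sketch of moving the data from the URL into a POST body. It assumes Java 11 or later (for java.net.http); the class name PostRequestSketch, the helper buildPost and the example.com endpoint are made up for illustration and are not part of the original question.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class PostRequestSketch {

    // Builds a POST request whose payload travels in the body, not in the URL.
    // The endpoint passed in is a hypothetical placeholder.
    static HttpRequest buildPost(String endpoint, byte[] payload) {
        return HttpRequest.newBuilder(URI.create(endpoint))
                .header("Content-Type", "application/octet-stream")
                // ofByteArray knows the payload size up front, so the length
                // of the body is available before any data is sent.
                .POST(HttpRequest.BodyPublishers.ofByteArray(payload))
                .build();
    }

    public static void main(String[] args) {
        // 5000 bytes would not fit in a 2048-character URL, but is fine as a body.
        byte[] payload = "x".repeat(5000).getBytes(StandardCharsets.UTF_8);
        HttpRequest request = buildPost("https://example.com/path/to/resource", payload);
        System.out.println(request.method() + " " + request.uri());
        System.out.println("body bytes: "
                + request.bodyPublisher().orElseThrow().contentLength());
    }
}
```

Actually sending it would be a call along the lines of HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); the URL stays short no matter how large the payload is.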

Sending megabytes of data in the request would rather warrant POST or PUT as the request method. This way you can send a request totaling up to 10 megabytes, as you've noticed in the referenced article.
The reason you're getting the 400 error is outlined in the urlfetch errors module API documentation: the maximum URL length allowed is 2048 characters.
There is currently an existing feature request for increasing this length, although it's unlikely that this will change in the near future. You can 'star' the issue to get further updates and/or provide your use case in the comments.

Related

BufferedReader stuck in readLine() [closed]

I'm trying to read the HTTP request from Google Chrome to get its data. For that I use readLine() from BufferedReader, but for some reason I think it gets stuck at the last line because the buffer stays open and it keeps waiting for more input. Here is the code that I use in the while loop:
String line;
ArrayList<String> request = new ArrayList<String>();
while ((line = inFromClient.readLine()) != null) {
request.add(line);
}
If I forcibly break out of the loop it works. Basically, I'm trying to read all lines efficiently, but without the inconsistencies of ready().

HTTP seems like a crazy simple protocol but it is not; you should use an HTTP client library such as the built-in java.net.http client.
The problem is that the concept of 'give me my data, then close it down' is HTTP/1.0, and that's a few decades out of date. HTTP/2.0 and HTTP/3.0 are binary protocols, and HTTP/1.1 tends to leave the connection open. In general, 'read lines', and even 'use Reader' (as in, read characters instead of bytes) is the wrong way to go about it, as HTTP is not a textual protocol. I know. It looks like one. It's not.
Here is a highly oversimplified overview of how e.g. a browser reads HTTP/1.1 responses:
Use raw byte processing because HTTP body content is raw (or can be), therefore wrapping the whole thing into e.g. an InputStreamReader or BufferedReader is a non-starter.
Keep reading until an 0x0A byte (in ASCII, the newline symbol), or X bytes have been read and your buffer for this is full, where X is not extraordinarily large. Wouldn't want a badly behaving server or a misunderstanding where you connect to a different (non-HTTP) service to cause a memory issue! Parse this first line as an HTTP/1.1 response.
Keep doing this loop to pick up all headers. Use the same 'my buffer has limits' trick to avoid memory issues.
Then check the response code in order to figure out if a body will be forthcoming. It's HTTP/1.1, so you can't just go: "Well, if the connection is closed, I guess no body is forthcoming". Whether one will be coming or not depends primarily on the response code.
Assuming a body exists, read the double-newline that separates headers from the body.
If the content is transferred as chunked encoding (common), start blitting data into a buffer, but check if you read the entire chunk. Reading chunked encoding is its own game, really.
Alternatively, HTTP/1.1 DEMANDS that Content-Length is present if chunked encoding isn't used. Use this header to know precisely how many bytes to read.
Neither 'a newline' nor 'close connection' can ever serve as a meaningful marker of 'end of data' in HTTP/1.1, so, don't.
Then either pass the content+headers+returncode verbatim to the requesting code, or dress it up a bit. For example, if the Content-Type header is present and has the value text/html; charset=UTF-8, you can consider taking the body data and turning it into a string via UTF-8 (new String(byteArray, StandardCharsets.UTF_8)).
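The line-and-header steps above can be sketched roughly as follows. This is a simplified illustration only, not a substitute for a real HTTP library: it handles just the Content-Length case, keeps only the last value of a repeated header, and the class name HttpHeadSketch and the 8192-byte line cap are assumptions chosen for the example. The same approach applies to reading a request, where the loop has to stop at the blank line after the headers instead of waiting for readLine() to return null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class HttpHeadSketch {
    // Assumed cap per line; protects against a peer that never sends a newline.
    static final int MAX_LINE_BYTES = 8192;

    // Reads raw bytes up to the 0x0A byte and returns the line without CR/LF.
    // For a real socket, wrap the stream in a BufferedInputStream first.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        while (true) {
            int b = in.read();
            if (b == -1) throw new IOException("stream ended before end of line");
            if (b == '\n') break;
            if (buffer.size() == MAX_LINE_BYTES) throw new IOException("line too long");
            buffer.write(b);
        }
        String line = new String(buffer.toByteArray(), StandardCharsets.ISO_8859_1);
        return line.endsWith("\r") ? line.substring(0, line.length() - 1) : line;
    }

    // Reads header lines until the empty line that separates head and body.
    static Map<String, String> readHeaders(InputStream in) throws IOException {
        Map<String, String> headers = new LinkedHashMap<>();
        String line;
        while (!(line = readLine(in)).isEmpty()) {
            int colon = line.indexOf(':');
            if (colon <= 0) throw new IOException("malformed header: " + line);
            headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                    line.substring(colon + 1).trim());
        }
        return headers;
    }

    // Reads exactly Content-Length bytes; chunked bodies are not handled here.
    static byte[] readBody(InputStream in, Map<String, String> headers) throws IOException {
        String declared = headers.get("content-length");
        if (declared == null) throw new IOException("no Content-Length header");
        int length = Integer.parseInt(declared);
        byte[] body = in.readNBytes(length);
        if (body.length != length) throw new IOException("body shorter than Content-Length");
        return body;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = ("HTTP/1.1 200 OK\r\n"
                + "Content-Type: text/plain; charset=UTF-8\r\n"
                + "Content-Length: 5\r\n"
                + "\r\n"
                + "hello").getBytes(StandardCharsets.ISO_8859_1);
        InputStream in = new ByteArrayInputStream(wire);
        String firstLine = readLine(in);
        Map<String, String> headers = readHeaders(in);
        byte[] body = readBody(in, headers);
        System.out.println(firstLine + " / " + headers);
        System.out.println(new String(body, StandardCharsets.UTF_8));
    }
}
```

Because the body is read by length rather than "until the stream ends", any bytes after it stay in the stream for the next message on the same connection.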
Note that I've passed right over some bizarre behaviour that servers do because in ye olden days some dumb browser did weird things and it's now the status quo (for example, range requests are quite bizarre) and there's of course HTTP2 and HTTP3 which are completely different protocols.
Also, of course, HTTP servers are rare these days; HTTPS is where it's at, and that's quite different too.
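For the chunked case mentioned earlier, a rough sketch of the decoding loop looks like this. It ignores chunk extensions, discards trailers, and caps the body at an arbitrary 1 MiB; the class name ChunkedBodySketch and both limits are assumptions made for the example, so treat it as an illustration of the framing rather than a complete implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedBodySketch {
    static final int MAX_BODY_BYTES = 1 << 20; // assumed cap of 1 MiB
    static final int MAX_LINE_CHARS = 256;     // assumed cap for size lines

    // Reads an ASCII line ending in LF and returns it without CR/LF.
    static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != '\n') {
            if (b == -1) throw new IOException("stream ended mid-line");
            if (line.length() >= MAX_LINE_CHARS) throw new IOException("line too long");
            if (b != '\r') line.append((char) b);
        }
        return line.toString();
    }

    // Decodes a chunked body: hex size line, chunk data, CRLF, repeated until size 0.
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            int semicolon = sizeLine.indexOf(';'); // chunk extensions are ignored
            String hex = (semicolon >= 0 ? sizeLine.substring(0, semicolon) : sizeLine).trim();
            int size = Integer.parseInt(hex, 16);
            if (size == 0) break; // last chunk
            if (size < 0 || size > MAX_BODY_BYTES - body.size()) {
                throw new IOException("chunk size out of range");
            }
            byte[] chunk = in.readNBytes(size);
            if (chunk.length != size) throw new IOException("stream ended mid-chunk");
            body.write(chunk);
            if (!readLine(in).isEmpty()) throw new IOException("missing CRLF after chunk");
        }
        // Skip optional trailer fields up to the final empty line.
        while (!readLine(in).isEmpty()) {
            // trailers are discarded in this sketch
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] body = readChunkedBody(new ByteArrayInputStream(wire));
        System.out.println(new String(body, StandardCharsets.US_ASCII));
    }
}
```

Note that the end of the body is signalled by the zero-size chunk, not by a newline or by the connection closing.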

Jetty: Way to get request and response byte size current request processing?

In a Servlet context using Jetty, I would like to know the number of bytes a request was, and the number of bytes the response was (not only the content) - this so that I can log and do stats on this in a Filter upon exiting out.
So far, I've found this:
For response content, I've found that the HttpServletResponse object is a HttpOutput, on which there is a getWritten() returning the number of bytes written - and also, there is a getHttpChannel() returning a HttpChannel, which again has getBytesWritten(). However, both of these only return the size of the content, evidently not including headers - easily seen by a 302 redirect having size 0.
I have also found that from HttpChannel, you can invoke getHttpTransport(), which is a HttpConnection. This has nice "bytesIn" and "bytesOut" LongAdders, which evidently do include all bytes - however, this is for the Connection, and thus with keep-alive, this includes the bytes for all request/responses that this Connection has performed, thus increasing for each request/response cycle that Connection is a part of. (Also, on HttpChannel, there is a getRequests(), which returns the number of requests served with this instance, some kind of average could seemingly be obtained).
Thus: Is there a way to get the total request and response byte sizes for the current request? Bonus for size of content of request too. (I realize that there are two "sizes" to take into account: The one over the wire, which can be compressed, and the actual uncompressed size).

Trusting "Content-Type" on File Uploads

If I'm supporting the upload of content (mostly images and video) by my REST API's users, is it safe to trust the Content-Type they declare in (multipart) uploads? Or should I, instead, run some kind of "media type detection" on the content (using, for example, Apache Tika) to ensure that the declared media type corresponds to the detected, actual one? Am I being over-zealous by introducing this media type detection step?

You certainly shouldn't blindly trust the Content-type header, or any other header. These things should be used to inform your decisions about how to process the request. So, Content-type: application/json should allow you to interpret the message body as a json object - that sort of request might then be passed to a JSON deserialiser to bind it to an object.
It would be wrong to ignore the Content-type header just because the request body contains data which looks like something else. If the request is internally inconsistent then it should be rejected. It's one thing not to send a Content-type header but quite another for the header to be wrong.
So, the only situation where you might want to use some sort of automatic detection should be where you have no reasonable information about the content - either Content-Type is very generic (such as "*/*") or not present at all. In that situation it's worth deciding whether some kind of autodetection is possible or valuable.

Never trust the input which you get from the user. Always run a check in your server side code be it type of file, size of file, etc. Use the REST API or Javascript to make the experience of the user smoother and faster.

You should definitely reject all the requests that are missing Content-Type header (and Content-Length as well) or have it set incorrectly.
It's definitely not about being over-zealous, rather about securing the system. If you have suspicions about the content just check it. But remember to validate the size before checking the content. If you have a proxy server (e.g. nginx) it has appropriate modules to reject requests that are too big.
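As a rough illustration of "validate the size first, then check the content", here is a small sketch that compares the declared Content-Type with the file's leading bytes. It only knows the PNG, JPEG and GIF signatures, so anything else (including video) is rejected; the class name MediaTypeCheckSketch and the 10 MB limit are assumptions for the example, and a real service would more likely delegate detection to a library such as Apache Tika.

```java
import java.util.Locale;
import java.util.Optional;

final class MediaTypeCheckSketch {
    static final long MAX_UPLOAD_BYTES = 10L * 1024 * 1024; // assumed limit

    // True if data begins with the given byte values (each 0..255).
    static boolean startsWith(byte[] data, int... signature) {
        if (data.length < signature.length) return false;
        for (int i = 0; i < signature.length; i++) {
            if ((data[i] & 0xFF) != signature[i]) return false;
        }
        return true;
    }

    // Detects a few image formats from their well-known leading bytes.
    static Optional<String> sniff(byte[] head) {
        if (startsWith(head, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) {
            return Optional.of("image/png");
        }
        if (startsWith(head, 0xFF, 0xD8, 0xFF)) {
            return Optional.of("image/jpeg");
        }
        if (startsWith(head, 'G', 'I', 'F', '8', '7', 'a')
                || startsWith(head, 'G', 'I', 'F', '8', '9', 'a')) {
            return Optional.of("image/gif");
        }
        return Optional.empty();
    }

    // Rejects missing headers and oversized uploads before looking at content,
    // then requires the declared type to match the detected one.
    static boolean accept(String declaredType, long declaredLength, byte[] head) {
        if (declaredType == null) return false;
        if (declaredLength < 0 || declaredLength > MAX_UPLOAD_BYTES) return false;
        // Drop parameters such as "; charset=..." before comparing.
        String mediaType = declaredType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return sniff(head).map(mediaType::equals).orElse(false);
    }

    public static void main(String[] args) {
        byte[] pngHead = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 0, 0};
        System.out.println(accept("image/png", pngHead.length, pngHead));
        System.out.println(accept("image/jpeg", pngHead.length, pngHead));
    }
}
```

The inconsistent case (declared image/jpeg, actual PNG bytes) is rejected rather than silently reinterpreted, which matches the advice to reject requests that are internally inconsistent.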

How to increase received JSON data size in Java?

I have connected to a REST server using the Java RestTemplate class. The server responds with a large payload, but my program cannot receive JSON longer than 10000 characters. Please suggest how to increase the size of the JSON data that can be received.

The problem is likely that you are sending the data in a GET request, so it's sent in the URL. Different browsers have different limits for the URL, where IE has the lowest limit of about 2 kB. To be safe, you should never send more data than about a kilobyte in a GET request.
To send that much data, you have to send it in a POST request instead. The browser has no hard limit on the size of a post, but the server has a limit on how large a request can be. IIS for example has a default limit of 4 MB, but it's possible to adjust the limit if you would ever need to send more data than that.

Rate Limit Exceeded - Custom Twitter app

I am working with a Java Twitter app (using the Twitter4J API). I have created the app and can view the current user's timeline, users' profiles, etc.
However, when using the app it seems to quite quickly exceed the 150 requests an hour rate limit set on Twitter clients (I know developers can increase this to 350 on given accounts, but that would not resolve it for other users).
Surely this is not affecting all clients, so any ideas as to how to get around this?
Does anyone know what counts as a request? For example, when I view a user's profile, I load the User object (twitter4j) and then get the screen name, username, user description, user status, etc. to put into a JSON object - would this be a single call to get the object, or would it be several to include all the user.get... calls?
Thanks in advance

You really do need to keep track of what your current request count is when dealing with Twitter.
However, Twitter does not seem to count 304 Not Modified responses against the limit (at least it didn't the last time I dealt with it), so make sure there isn't something breaking your normal use of HTTP caching, and your practical requests per hour go up.
Note that Twitter suffers from a bug in mod_gzip on Apache where the ETag is malformed when it is changed to reflect that the content-encoding differs from that of the non-gzipped entity (this is the Right Thing to do, there's just a bug in the implementation). Because of this, accepting gzipped content from Twitter means it'll never send a 304, which increases your request count, and in many cases undermines the efficiency gains of using gzip.
Hence, if you are accepting gzip (your web library may do so by default; see what you can see with a tool like Fiddler. I'm a .NET guy with only a little Java knowledge, answering at the level of how Twitter deals with HTTP, so I don't know the details of Java web libraries), try turning that off, and see if it improves things.
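In Java terms, the idea could look roughly like the sketch below: send the ETag from the previous response in If-None-Match, and ask for an uncompressed ("identity") response. It assumes Java 11+ and the plain java.net.http client rather than Twitter4J, whose internals are not covered here; the class name ConditionalGetSketch, the helper conditionalGet and the api.example.com URL are placeholders.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ConditionalGetSketch {

    // Builds a GET that lets the server answer 304 Not Modified when the
    // cached copy identified by cachedEtag is still current.
    static HttpRequest conditionalGet(String url, String cachedEtag) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(url))
                // Ask for an uncompressed response, per the gzip/ETag issue above.
                .header("Accept-Encoding", "identity")
                .GET();
        if (cachedEtag != null) {
            builder.header("If-None-Match", cachedEtag);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        HttpRequest request =
                conditionalGet("https://api.example.com/timeline.json", "\"abc123\"");
        System.out.println(request.headers().map());
    }
}
```

When the response status is 304, the body is empty and the previously cached payload should be reused.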

Almost every type of read from Twitter's servers (i.e. anything that calls HTTP GET) counts as a request. Getting user timelines, retweets, direct messages, getting user data all count as 1 request each. Pretty much the only Twitter API call that reads from the server without counting against your API limit is checking to see the rate limit status.
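Since nearly every read counts, one way to keep track of the request count locally is a simple fixed-window counter, sketched below. It is only an approximation under stated assumptions: the real window boundaries are decided by Twitter's servers (the rate limit status call is the authoritative source), and the class name RequestBudgetSketch and its methods are invented for this example.

```java
final class RequestBudgetSketch {
    private final int limit;         // e.g. 150 requests
    private final long windowMillis; // e.g. one hour
    private long windowStart;
    private int used;

    RequestBudgetSketch(int limit, long windowMillis, long nowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.windowStart = nowMillis;
    }

    // Starts a fresh window once the current one has fully elapsed.
    private void rollWindow(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) {
            windowStart = nowMillis;
            used = 0;
        }
    }

    // Counts one request and returns true, or returns false if the budget is spent.
    synchronized boolean tryAcquire(long nowMillis) {
        rollWindow(nowMillis);
        if (used >= limit) return false;
        used++;
        return true;
    }

    synchronized int remaining(long nowMillis) {
        rollWindow(nowMillis);
        return limit - used;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        RequestBudgetSketch budget = new RequestBudgetSketch(150, 60L * 60 * 1000, now);
        if (budget.tryAcquire(now)) {
            System.out.println("request allowed, remaining: " + budget.remaining(now));
        }
    }
}
```

Calling tryAcquire before each GET, and serving cached data when it returns false, keeps the app from running into the limit unexpectedly.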
