Using Spring MVC, for each incoming request I'd like to set the status code and headers.
Once they are set, I need to pad the response body so that the entire response, including all headers and content (the actual data sent over the wire to the client), is exactly X bytes (e.g. 300 bytes). The response size will vary per request, but all responses will have to be padded.
There are no limitations regarding the manipulation of the response.
Using HttpServletResponse I can set the status code and headers, and maybe also get the response size. But I couldn't find a way to set the body content/length in accordance with the required size.
If I use a ResponseEntity I can set the body but can't tell the size of the response.
How can I pad the response to the required size while setting the fields above?
First of all, as JB Nizet said, this requirement is fully outside of the HTTP protocol and you'd better fix the client side to only control the size of the body part (what the content-length header is made for).
Now assuming you really need to do that, I can imagine 2 ways to fulfil this requirement (both of them seem equally ugly...):
Use a dedicated proxy to post-process the HTTP response
Consistently add a custom header to the response that indicates the total size the response should have. Put a dedicated proxy between the clients and the server. That proxy should listen on its own port and forward everything to the server. For the response part, it should:
read the header part line by line (delimited with \r\n), store the required total size without transmitting it, and forward all other headers while counting the number of bytes sent
once the header part is over (empty line), read the body part and truncate or pad it to the correct size before transmitting it.
This would be a low-level program that should use the socket interfaces directly, or that could be written directly in C. This really looks like plumbing, but it should be usable even with different servlet containers.
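The truncate-or-pad step of such a proxy can be sketched in plain Java. This is only an illustration under assumptions: the class and method names are hypothetical, the padding byte is a space, and the number of header bytes already forwarded is assumed to be known.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BodyFitter {
    /**
     * Returns a copy of body that is exactly (totalSize - headerBytes) bytes long:
     * truncated if it is too long, padded with spaces if it is too short.
     */
    static byte[] fitBody(byte[] body, int headerBytes, int totalSize) {
        int target = totalSize - headerBytes;
        if (target < 0) {
            throw new IllegalArgumentException("headers alone exceed the total size");
        }
        byte[] out = Arrays.copyOf(body, target);   // truncates, or extends with zero bytes
        if (target > body.length) {
            // Replace the zero bytes added by copyOf with the padding character.
            Arrays.fill(out, body.length, target, (byte) ' ');
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.US_ASCII);
        // 290 header bytes already sent, 300 bytes required in total: 10 bytes of body.
        byte[] fitted = fitBody(body, 290, 300);
        System.out.println("[" + new String(fitted, StandardCharsets.US_ASCII) + "]");
    }
}
```

Whether spaces are an acceptable filler depends on the content type; for a binary body the client would need some other way to tell data from padding.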
Compute the body part size and guess the header one
Control how your servlet container processes headers. The protocol requires that they are written as Name: value\r\n and that the header part is followed by an empty line (\r\n). But you should double-check whether the container adds its own headers, or whether it automatically adds some headers if you do not provide them. That should allow you to compute the header size from the headers you added to the response, but it is clearly coupled to a single servlet container, and only as long as it is used the same way.
Alternatively, you could try to ask the response what headers it contains. Normally (at least Tomcat does it), it actually computes the header part when it commits the response (*). So you could:
set the status and the headers you need
commit the response by a call to flushBuffer()
get the generated headers through:
for (String name : resp.getHeaderNames()) {
    for (String value : resp.getHeaders(name)) {
        ... // each pair is written on the wire as "name: value\r\n"
    }
}
If you take care of the status line (it should be HTTP/1.1 200 OK\r\n, but here again double-check its actual size), that should be enough to compute the total header size. Provided you have enough control over the body to know what you want to write, you should be able to compute how much padding you need.
This avoids a dedicated proxy, but it can only be guaranteed to work on a (version of a) particular servlet container.
(*) Beware: when you commit the response manually, the servlet container can add a Transfer-Encoding: chunked header because it cannot guess the body size. Whether that is acceptable for your use case or not, I cannot know...
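The arithmetic behind the second approach can be sketched as follows. This is a rough illustration, not container code: the class and method names are hypothetical, the headers are passed in as a plain map rather than read from a real HttpServletResponse, and it assumes the container writes exactly these headers as "Name: value\r\n" with nothing added.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class HeaderSize {
    /** Bytes used by the status line, every header line, and the empty line ending the header part. */
    static int headerPartSize(String statusLine, Map<String, String> headers) {
        int size = byteLength(statusLine) + 2;                  // status line + \r\n
        for (Map.Entry<String, String> h : headers.entrySet()) {
            // "Name: value\r\n" -> name + ": " + value + "\r\n"
            size += byteLength(h.getKey()) + 2 + byteLength(h.getValue()) + 2;
        }
        return size + 2;                                        // empty line (\r\n)
    }

    /** Number of padding bytes to append to the body to reach totalSize. */
    static int paddingNeeded(int totalSize, int headerPartSize, int bodySize) {
        int padding = totalSize - headerPartSize - bodySize;
        if (padding < 0) {
            throw new IllegalArgumentException("response is already larger than " + totalSize);
        }
        return padding;
    }

    private static int byteLength(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", "text/plain");
        int headerPart = headerPartSize("HTTP/1.1 200 OK", headers);
        System.out.println("header part: " + headerPart + " bytes");
        System.out.println("padding for a 5-byte body: " + paddingNeeded(300, headerPart, 5));
    }
}
```

Note that a Content-Length header changes the header size itself, so its digits have to be included in the computation before the final padding is fixed.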
Considering the comment discussion above, you could create an HttpResponse Wrapper.
Using your wrapper you can then intercept the response and override the actual size.
More info: your response wrapper writes response content to an internal byte array. It is not actually written to the "real" http output stream. In the filter you make sure 'write' and 'flush' called on the wrapper don't exceed your preferred max length.
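The buffering part of that idea can be sketched without any servlet classes. The following is a hypothetical stand-in for the stream such a wrapper would return from getOutputStream(): it collects bytes in memory and silently drops anything beyond a maximum length. The class name and the drop-instead-of-fail policy are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class CappedBuffer extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final int maxLength;

    CappedBuffer(int maxLength) {
        this.maxLength = maxLength;
    }

    @Override
    public void write(int b) {
        if (buffer.size() < maxLength) {
            buffer.write(b);
        }
    }

    @Override
    public void write(byte[] data, int off, int len) {
        // Keep only as many bytes as still fit under the cap.
        int room = maxLength - buffer.size();
        buffer.write(data, off, Math.max(0, Math.min(len, room)));
    }

    /** The buffered content; a filter would copy this to the real response stream. */
    byte[] toByteArray() {
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        CappedBuffer capped = new CappedBuffer(8);
        byte[] data = "hello world".getBytes(StandardCharsets.US_ASCII);
        capped.write(data, 0, data.length);
        System.out.println(new String(capped.toByteArray(), StandardCharsets.US_ASCII)); // prints "hello wo"
    }
}
```

In a real filter the buffered bytes would be padded if necessary and then written to the underlying response once the chain has finished.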
Having said this, I agree with JB Nizet that this is not common and looks like a design flaw.
Related
Is there a specific scenario where we use a POST instead of GET, to implement the functionality of get operation ?
GET is supposed to get :) and POST is mainly used to add something new, and sometimes for updates as well (although PUT is recommended in such scenarios). There is no specific scenario where we use a POST instead of a GET. If you need this, you are probably doing it wrong: nothing stops you from doing it, but it is bad design and you should take a step back and plan your API carefully.
There are two commonly cited advantages of POST: its parameters travel in the request body rather than in the URL, so they do not end up in browser history or server logs, and it can carry large amounts of data. Even so, I would not recommend using POST to simulate GET behaviour.
Let's understand the usage of GET and POST:
What is GET Method?
It appends form-data to the URL in name/ value pairs. The length of the URL is limited by 2048 characters. This method must not be used if you have a password or some sensitive information to be sent to the server. It is used for submitting the form where the user can bookmark the result. It is better for data that is not secure. It cannot be used for sending binary data like images or word documents. It also provides $_GET associative array to access all the sent information using the GET method.
What is POST Method?
It appends form-data to the body of the HTTP request in such a way that data is not shown in the URL. This method does not have any restrictions on data size to be sent. Submissions by form with POST cannot be bookmarked. This method can be used to send ASCII as well as binary data like image and word documents. Data sent by the POST method goes through HTTP header so security depends on the HTTP protocol. You have to know that your information is secure by using secure HTTP. This method is a little safer than GET because the parameters are not stored in browser history or in web server logs. It also provides $_POST associative array to access all the sent information using the POST method.
Source: https://www.edureka.co/blog/get-and-post-method/
So both the methods have their specific usage.
POST method is used to send data to a server to create or update a resource.
GET method is used to request data from a specified resource.
If you want to fetch some data you can use the GET method. But if you want to update an existing resource or create any new resource you should use POST. GET will not help you to create/update resources. So exposing the api should be specific to your needs.
UPDATE
So your main question is in what scenario we can use POST to implement the functionality of GET.
To answer that: as you know, a GET request only fetches a resource, while a POST request creates or updates a resource, and its response can carry a body in the same request/response cycle. So suppose you are creating a new resource and want to see that same resource. Making a POST call first and then a GET call to fetch it costs an extra round trip. You can skip the GET call and return the desired representation in the POST response itself. This is the scenario where you can use POST instead of making an extra GET call.
In a Servlet context using Jetty, I would like to know the number of bytes a request was, and the number of bytes the response was (not only the content) - this so that I can log and do stats on this in a Filter upon exiting out.
So far, I've found this:
For response content, I've found that the HttpServletResponse object is a HttpOutput, on which there is a getWritten() returning the number of bytes written - and also, there is a getHttpChannel() returning a HttpChannel, which again has getBytesWritten(). However, both of these only return the size of the content, evidently not including headers - easily seen by a 302 redirect having size 0.
I have also found that from HttpChannel, you can invoke getHttpTransport(), which is a HttpConnection. This has nice "bytesIn" and "bytesOut" LongAdders, which evidently do include all bytes - however, this is for the Connection, and thus with keep-alive, this includes the bytes for all request/responses that this Connection has performed, thus increasing for each request/response cycle that Connection is a part of. (Also, on HttpChannel, there is a getRequests(), which returns the number of requests served with this instance, some kind of average could seemingly be obtained).
Thus: Is there a way to get the total request and response byte sizes for the current request? Bonus for size of content of request too. (I realize that there are two "sizes" to take into account: The one over the wire, which can be compressed, and the actual uncompressed size).
If I'm supporting the upload of content (mostly images and video) by my REST API's users, is it safe to trust the Content-Type they declare in (multipart) uploads? Or should I, instead, run some kind of "media type detection" on the content (using, for example, Apache Tika) to ensure that the declared media type corresponds to the detected, actual one? Am I being over-zealous by introducing this media type detection step?
You certainly shouldn't blindly trust the Content-Type header, or any other header. These things should be used to inform your decisions about how to process the request. So, Content-Type: application/json should allow you to interpret the message body as a JSON object; that sort of request might then be passed to a JSON deserialiser to bind it to an object.
It would be wrong to ignore the Content-type header just because the request body contains data which looks like something else. If the request is internally inconsistent then it should be rejected. It's one thing not to send a Content-type header but quite another for the header to be wrong.
So, the only situation where you might want to use some sort of automatic detection should be where you have no reasonable information about the content: either Content-Type is very generic (such as "*/*") or not present at all. In that situation it's worth deciding whether some kind of autodetection is possible or valuable.
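As a rough illustration of checking a declared media type against the content, here is a small sketch that looks at the leading "magic" bytes of an upload. It is not Apache Tika and only knows three image signatures; the class and method names are hypothetical, the declared type is assumed to carry no parameters, and treating an unrecognised format as inconsistent is a policy choice, not a rule.

```java
class MediaTypeCheck {
    /** Returns the media type implied by the leading bytes, or null if it is not recognised. */
    static String sniff(byte[] data) {
        if (startsWith(data, 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A)) return "image/png";
        if (startsWith(data, 0xFF, 0xD8, 0xFF)) return "image/jpeg";
        if (startsWith(data, 'G', 'I', 'F', '8')) return "image/gif";
        return null;
    }

    private static boolean startsWith(byte[] data, int... magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;   // compare as unsigned bytes
        }
        return true;
    }

    /** True only when the content is recognised and matches the declared Content-Type. */
    static boolean consistent(String declaredType, byte[] data) {
        String detected = sniff(data);
        return detected != null && detected.equalsIgnoreCase(declaredType);
    }

    public static void main(String[] args) {
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A, 0, 0};
        System.out.println(consistent("image/png", png));    // declared type matches the signature
        System.out.println(consistent("image/jpeg", png));   // internally inconsistent upload
    }
}
```

A request for which consistent(...) returns false would be rejected rather than silently reinterpreted.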
Never trust the input which you get from the user. Always run a check in your server side code be it type of file, size of file, etc. Use the REST API or Javascript to make the experience of the user smoother and faster.
You should definitely reject all the requests that are missing Content-Type header (and Content-Length as well) or have it set incorrectly.
It's definitely not about being over-zealous, rather about securing the system. If you have suspicions about the content just check it. But remember to validate the size before checking the content. If you have a proxy server (e.g. nginx) it has appropriate modules to reject requests that are too big.
I saw this description in the Oracle website:
"Since TCP by its nature is a stream based protocol, in order to reuse an existing connection, the HTTP protocol has to have a way to indicate the end of the previous response and the beginning of the next one. Thus, it is required that all messages on the connection MUST have a self-defined message length (i.e., one not defined by closure of the connection). Self demarcation is achieved by either setting the Content-Length header, or in the case of chunked transfer encoded entity body, each chunk starts with a size, and the response body ends with a special last chunk."
See Oracle doc
I don't know how to implement this. Can someone give me an example of a Java implementation?
If you are trying to implement "self-demarcation" in the same way as HTTP does it:
the HTTP 1.1 specification defines how it works,
the source code of (say) the Apache HTTP libraries is an example of its implementation.
In fact, it is advisable NOT to try and implement this (HTTP) yourself from scratch. Use an existing implementation.
On the other hand, if you simply want to implement your own ad-hoc self-demarcation scheme, it is really easy to do.
The sender figures out the size of the message, in bytes or characters or some other unit that makes sense.
The sender sends the message size, followed by the message itself.
At the other end:
The receiver reads the message size, and then reads the requisite number of bytes or characters to form the message body.
An alternative is for the sender to send the message followed by a special end-of-message marker. To make this work, either you need to guarantee that no message will contain the end-of-message marker, or you need to use some sort of escaping mechanism.
Implementing these schemes is simple Java programming.
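For instance, the size-prefix scheme could look roughly like this. It is a minimal sketch under assumptions: the size is a 4-byte big-endian int counted in bytes, the class and method names are made up for the example, and an in-memory stream stands in for the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixed {
    /** Sender: write the message size, then the message itself. */
    static void send(DataOutputStream out, byte[] message) throws IOException {
        out.writeInt(message.length);
        out.write(message);
        out.flush();
    }

    /** Receiver: read the size, then exactly that many bytes. */
    static byte[] receive(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] message = new byte[length];
        in.readFully(message);            // blocks until all 'length' bytes have arrived
        return message;
    }

    public static void main(String[] args) throws IOException {
        // Two messages written back to back on the same "connection".
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        send(out, "first".getBytes(StandardCharsets.UTF_8));
        send(out, "second message".getBytes(StandardCharsets.UTF_8));

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(new String(receive(in), StandardCharsets.UTF_8)); // prints "first"
        System.out.println(new String(receive(in), StandardCharsets.UTF_8)); // prints "second message"
    }
}
```

Because the receiver always knows where one message ends, the same stream can carry any number of messages, which is the property the quoted text asks for. With a real socket, the two streams would wrap socket.getOutputStream() and socket.getInputStream().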
What makes a connection reusable
That is answered by the text that you quoted in your Question.
I have written code to send an HTTP request through a socket in Java. Now I want to get the HTTP response sent back by the server to which I sent the request.
It's not totally clear what you're asking for. Assuming you've written the request to the socket, the next thing you'll want to do is:
Call shutdownOutput() on the socket to tell the server that the request is done (not necessary if you've sent the content length)
Read the response from the socket's input stream, parsing according to the HTTP spec.
This is a bunch of work, so I'd suggest that rather than rolling your own HTTP request logic, use URLConnection which is built-in to Java and includes methods for retrieving the content of a response as well as any headers set by the server.
As Jon said, read the HTTP spec. However, Internet protocols are generally line oriented, so you can read the response a line at a time. The first line will be the status line. Following this will be the headers, one per line (unless there's a continuation). If there's a body, one of the headers will be Content-Type, which tells you what the content is. If you want to read the content you will need to understand the different ways the content can be sent. There may be a Content-Length header (or not), or the content may be chunked (signalled by Transfer-Encoding: chunked). And of course the content may be binary rather than text.
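A line-by-line reader along those lines might look like the sketch below. It is deliberately incomplete: it handles a status line, simple headers and a body delimited by Content-Length (or by end of stream), and it does not handle chunked encoding or header continuations. The class name and fields are hypothetical, and an in-memory stream stands in for socket.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class HttpResponseReader {
    String statusLine;
    final Map<String, String> headers = new LinkedHashMap<>();  // header names lower-cased
    byte[] body;

    static HttpResponseReader read(InputStream in) throws IOException {
        HttpResponseReader r = new HttpResponseReader();
        r.statusLine = readLine(in);
        String line;
        while (!(line = readLine(in)).isEmpty()) {               // empty line ends the headers
            int colon = line.indexOf(':');
            if (colon > 0) {
                r.headers.put(line.substring(0, colon).trim().toLowerCase(Locale.ROOT),
                              line.substring(colon + 1).trim());
            }
        }
        String length = r.headers.get("content-length");
        if (length != null) {
            r.body = new byte[Integer.parseInt(length)];
            new DataInputStream(in).readFully(r.body);           // read exactly that many bytes
        } else {
            ByteArrayOutputStream rest = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) rest.write(buf, 0, n);  // until the server closes
            r.body = rest.toByteArray();
        }
        return r;
    }

    /** Reads bytes up to \n and drops the \r, without buffering past the line. */
    private static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1 && b != '\n') {
            if (b != '\r') line.write(b);
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        String raw = "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello";
        HttpResponseReader r = read(new ByteArrayInputStream(raw.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(r.statusLine);
        System.out.println(r.headers);
        System.out.println(new String(r.body, StandardCharsets.ISO_8859_1));
    }
}
```

Reading the header lines byte by byte keeps the reader from consuming body bytes early, which matters because the body may be binary.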
Yup, that's right!
The response should be read from the input stream in chunks of bytes, which can then be translated into a readable format.
But that also takes more time... :(