I'm using Tomcat's WebDAV servlet and it seems that I can't get the header "Content-Length" when I'm issuing a PUT request.
How do I get the content length of the file I'm "putting"?
I'm assuming you mean that you're writing code which is part of the server-side PUT operation, i.e. you're extending the WebDAV servlet or something similar. If the client has sent a file via PUT and there is no Content-Length header, then you need to buffer the bytes (probably to disk) and use the resulting buffered data to give you the length.
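It's perfectly legal for clients to send a file without a Content-Length header. In that case they normally use chunked transfer encoding (Transfer-Encoding: chunked), where a final zero-length chunk marks the end of the body, so the total length is not known until the last byte has arrived.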
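A minimal sketch of the buffering approach described above, using only the JDK. The class and method names (PutBodyBuffer, spoolToTempFile) are made up for illustration; inside a servlet the stream would come from request.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Tomcat or its WebDAV servlet.
class PutBodyBuffer {

    /** Spools the whole body to a temp file; the file size is then the content length. */
    public static Path spoolToTempFile(InputStream body) throws IOException {
        Path tmp = Files.createTempFile("put-body-", ".tmp");
        // createTempFile already created the file, so it must be replaced.
        Files.copy(body, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for request.getInputStream() in a real servlet.
        InputStream body = new ByteArrayInputStream(
                "hello webdav".getBytes(StandardCharsets.UTF_8));
        Path tmp = spoolToTempFile(body);
        try {
            System.out.println("content length = " + Files.size(tmp));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Once the length is known, the temp file can be reopened with Files.newInputStream and handed to whatever stores the resource.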
Note that if you are extending the Tomcat WebDAV servlet, you might also want to consider using milton.io. It's a WebDAV servlet intended to allow a pluggable backend. It also ships with a filesystem implementation equivalent to Tomcat's WebDAV servlet.
Related
In my web application I have a link which, when clicked, invokes an external web service to retrieve a download URL for a file.
I need to send back to client the file which is beyond this URL, instead of the download URL retrieved from the web service. If possible, I would also like to do it without having to download the file on my server beforehand.
I've found this question about a similar task, but which used PHP with the readfile() function.
Is there a similar way to do this in Java 8?
If you don't want to handle the file at all, you should answer the request with a redirect (e.g. HTTP 301 or 302). If you do want to handle the file, you should read it through a byte buffer and send it on to the client, which makes the transfer slower.
Without seeing your implementation so far, this is my best suggestion.
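For the second option, a rough Java 8 sketch of relaying a stream through a small buffer without storing the file on the server. StreamRelay and relay are illustrative names; in a servlet the source would be something like new URL(downloadUrl).openStream() and the sink response.getOutputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper; only JDK classes are used.
class StreamRelay {

    /** Copies source to sink in 8 KiB chunks and returns the number of bytes relayed. */
    public static long relay(InputStream source, OutputStream sink) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = source.read(buffer)) != -1) {
            sink.write(buffer, 0, n);
            total += n;
        }
        sink.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for the remote URL and the servlet response.
        InputStream remote = new ByteArrayInputStream(new byte[20000]);
        ByteArrayOutputStream client = new ByteArrayOutputStream();
        System.out.println("relayed " + relay(remote, client) + " bytes");
    }
}
```

Only one buffer's worth of data is held in memory at a time, so the file size does not matter to the server's heap.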
I have a question regarding file upload, which is more related to how it works rather than a code issue. I looked on the internet, but I couldn't find a proper answer.
I have a web application running on Tomcat, which handles file uploads (through a servlet). Let's say I now want to upload huge files (> 1 GB). My understanding was that the multipart content of the HTTP request is available in my servlet once the whole file has actually been transferred.
My question is: where is the content of the request actually stored? When one calls HttpServletRequest.getParts(), an InputStream is available on the Part object. However, where is the stream reading from? Does Tomcat store it somewhere?
I guess this might not be clear enough, so I'll update the post according to your comments, if any.
Thanks
Tomcat stores Parts in the "X:\some\path\Tomcat 7.0\temp" (/some/path/apache-tomcat-7.0.x/temp) directory.
When a multipart request is parsed, if the size of a single part exceeds a threshold, a temporary file is created for that part.
Your servlet/JSP will be invoked when the transfer of all parts has been completed.
When the request is destroyed, all temporary files are deleted as well.
If you are interested in the multipart parse phase, take a look at Apache commons-fileupload (specifically ServletFileUpload.parseRequest()); Tomcat is based on a variant of that.
UPDATE
You can configure it as a Java argument, e.g. on Windows:
The InputStream will typically read from a temporary file which is created by the multipart framework during the request. The temp file is normally stored in the application server's temporary area - as specified by the servlet context attribute javax.servlet.context.tempdir. In Tomcat this is somewhere beneath $CATALINA_HOME/work. The file will be deleted once the request completes.
For small file sizes, the multipart framework may keep the whole upload in memory - in which case the InputStream will be reading directly from memory.
If you're using Spring's CommonsMultipartResolver then you can set the maximum upload size allowed in memory via the maxInMemorySize property. If an upload is bigger than this, then it will be stored as a temp file on disk.
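To make the threshold behaviour concrete, here is a small standalone sketch of a buffer that stays in memory until a size limit is crossed and then spills to a temp file. It is only an illustration of the idea: SpillingBuffer is a made-up class, not Tomcat's, commons-fileupload's or Spring's actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of "memory below the threshold, temp file above it".
class SpillingBuffer extends OutputStream {
    private final int threshold;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private OutputStream fileOut;
    private Path file;
    private long written;

    public SpillingBuffer(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void write(int b) throws IOException {
        write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (fileOut == null && written + len > threshold) {
            // Threshold crossed: move what we have so far to disk.
            file = Files.createTempFile("upload-part-", ".tmp");
            fileOut = Files.newOutputStream(file);
            memory.writeTo(fileOut);
            memory = null;
        }
        if (fileOut != null) {
            fileOut.write(b, off, len);
        } else {
            memory.write(b, off, len);
        }
        written += len;
    }

    @Override
    public void close() throws IOException {
        if (fileOut != null) {
            fileOut.close();
        }
    }

    public boolean isInMemory() {
        return fileOut == null;
    }

    /** Reads the data back from memory or from the temp file; call after close(). */
    public InputStream openStream() throws IOException {
        return isInMemory()
                ? new ByteArrayInputStream(memory.toByteArray())
                : Files.newInputStream(file);
    }

    /** Removes the temp file, as a container would when the request ends. */
    public void delete() throws IOException {
        if (file != null) {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        SpillingBuffer part = new SpillingBuffer(1024);
        part.write(new byte[4096]);
        part.close();
        System.out.println("in memory? " + part.isInMemory());
        part.delete();
    }
}
```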
I think we should step back for a moment and give some thought to the web infrastructure. First of all, HTTP transmits text data, so binary information is encoded in base64 so that the data won't get messed up. This ends up producing large amounts of data, and this gives birth to the multipart form, which breaks the data into parts of encoded text with special markers that allow the server to assemble everything together. But to use this data we have to decode it first, and to do that we have to use the multiple parts of the form.
[a break so we can breathe]
Continuing: the browser needs to send lots of data (1 GB as you mentioned in your example). This data is encoded with base64 and then separated into pieces (the multipart form) with their markers. Then the browser starts to send the pieces to the server, but the server only returns the HTTP response once it has finished receiving and processing the HTTP request (or if a timeout occurs, which results in an error on the browser screen).
What we can assume here is that Tomcat could (I didn't check the internals) start decoding each part of the multipart request that has already arrived (either from the temp file or from memory) and pass the input stream to the user. Since reading from the input stream is a blocking operation, the server would wait for the next piece of data to pass to Tomcat, which in turn would pass it to the program that is processing the data.
Once all data has reached the server the program would prepare the response that Tomcat would return to the browser completing the HTTP Request-Response cycle and closing the connection (since HTTP is a connectionless protocol).
Hope it helps :)
Tomcat follows the Servlet 3.0 specification which allows you to specify things such as how large of a multipart "part" can be before it gets stored (temporarily) on the disk, where temporary files will be written, what the maximum size of a file is, and what the maximum size of the whole request can be. You can find all kinds of good information about configuring multipart uploads (in Tomcat or any other spec-3.0-compliant server) here and here.
Tomcat's implementation specifics aren't terribly relevant: it adheres to the spec. If the file to be uploaded is smaller than the threshold set, then you should be able to read the bytes of the file from memory (i.e. no disk involved). If the file is larger, then it will be written to disk, first (in its entirety) and then you can get the bytes from the container.
So if you want to receive a 1GiB file and don't have that kind of memory available (I wouldn't recommend allowing clients to fill-up your heap with 1GiB of uploaded data for each upload... easy DoS if you just start several simultaneous 1GiB uploads and you are toast), then Tomcat (or whatever container you are using) will read the file (again, in its entirety) onto the disk, and when your servlet gets control, you can read the bytes back from that file.
Note that the container must process the entire multipart request before any of your code really runs. That's to prevent you from breaking anything by partially-reading the request's InputStream or anything like that. Processing multipart requests is non-trivial, and it's easy to break things.
If you want to be able to stream large files for processing (e.g. huge XML files that can be processed serially), then you are going to want to handle the multipart parsing yourself. That way, you don't need a huge amount of heap to buffer the file and you don't need to store the file on the disk before you start processing it. (If this is your use-case, I suggest using HTTP PUT or HTTP POST and not using multipart requests.)
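As a rough sketch of serial processing, the following reads a request body record by record as it arrives, without buffering the whole upload. It uses newline-delimited records as a simpler stand-in for the XML case, and StreamingRecordCounter is a hypothetical name; with a plain PUT or POST the stream would be request.getInputStream().

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical example: processes the body serially, one line at a time.
class StreamingRecordCounter {

    /** Counts non-blank lines while reading; only one line is held in memory at once. */
    public static long countRecords(InputStream body) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(body, StandardCharsets.UTF_8));
        long records = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.trim().isEmpty()) {
                records++; // a real handler would parse and act on the record here
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the body of a PUT request.
        InputStream body = new ByteArrayInputStream(
                "first\nsecond\n\nthird\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("records: " + countRecords(body));
    }
}
```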
(It's worth mentioning that base64 encoding is not even mentioned in any specification for multipart processing. A few folks have mentioned base64 here, but I've never seen a standard web client use base64 for uploading a file using multipart/form-data. HTTP handles binary uploads just fine, thanks.)
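To illustrate the point about binary data, here is a sketch that builds a single-file multipart/form-data body by hand. The file bytes are written as-is between the part headers and the closing boundary, with no base64 step. The class name, boundary, field name and file name are all arbitrary values chosen for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of the multipart/form-data wire layout for one file part.
class MultipartBodyDemo {

    public static byte[] buildBody(String boundary, String fieldName,
                                   String fileName, byte[] content) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        String head = "--" + boundary + "\r\n"
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"\r\n"
                + "Content-Type: application/octet-stream\r\n"
                + "\r\n";
        out.write(head.getBytes(StandardCharsets.US_ASCII));
        out.write(content); // raw bytes, not base64
        out.write(("\r\n--" + boundary + "--\r\n").getBytes(StandardCharsets.US_ASCII));
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = {0x00, (byte) 0xFF, 0x10, (byte) 0x80};
        byte[] body = buildBody("demoBoundary", "file", "a.bin", raw);
        System.out.println("body is " + body.length + " bytes for "
                + raw.length + " bytes of file content");
    }
}
```

The overhead is just the headers and boundary lines, so a 1 GiB file produces a request body only slightly larger than 1 GiB.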
Here it is:
1. The user's browser composes an HTTP multipart request.
2. The TCP/IP stack of the user's OS slices it into packets.
3. Routers across the internet pass those packets to your server.
4. The TCP/IP stack of your server's OS reassembles the payloads and passes them to the TCP port listener.
5. The Tomcat HTTP connector decodes the HTTP POST request from the TCP data (source code is https://github.com/apache/tomcat/tree/trunk/java/org/apache/coyote ).
6. The Tomcat HTTP connector wraps it in an HTTP request object and eventually forwards it to your servlet (https://github.com/apache/tomcat/blob/trunk/java/org/apache/catalina/connector/Request.java).
7. Before and while your code reads the content of the HTTP request, Tomcat buffers the request body internally.
8. Tomcat will not parse the multipart body before you call request.getParts() (https://github.com/apache/tomcat/blob/trunk/java/org/apache/catalina/connector/Request.java#L2561), so there are no temp files for parts before that call.
9. Tomcat stores uploaded files in the location given by the @MultipartConfig annotation in your servlet code, unless your code doesn't provide it and allowCasualMultipartParsing is set (http://tomcat.apache.org/tomcat-7.0-doc/config/context.html#Common_Attributes).
Considering that allowCasualMultipartParsing is false by default, you should not worry about where Tomcat stores the files, though it is easy to dig out.
I mention 1~5 because it is important to understand the stream returned by request.getInputStream(), which is what you had to use before the Servlet 3.x request.getParts() feature. Typically, Tomcat will deliver the request to the web app very soon; it does not need to wait for the client side to finish uploading, so Tomcat need not buffer a lot of data. I left Java server-side development some years ago, before JSR-000315 was approved :-)
I have the following scenario:
JSP -> Servlet -> ServiceAPI -> Service Servlet
I enter some Cyrillic symbols in the JSP page, which is the start of the scenario. In the next step, the Servlet, I read the data from the JSP in UTF-8. So far, so good. Everything is OK.
Then I pass the data to a ServiceAPI, which sends it to a Service Servlet. Here comes the problem. The data in the Service Servlet is read as '??????'. So, I guess the problem is in the ServiceAPI, which does not send the data correctly. The ServiceAPI implementation uses Apache Http Client to send the data to the Service Servlet.
As I read in the Apache Http Client documentation (http://hc.apache.org/httpclient-3.x/preference-api.html#HTTP_method_parameters), there is a way to set a character encoding in the request. But I am not able to apply this because of the following error: "Access restriction: The method setParameter(String, Object) from the type HttpParams is not accessible due to restriction on required library ...". So I am kind of stuck. Do you have any idea whether the problem is really in Apache Http Client and how I can fix it?
Thanks in advance.
Correctly implementing character encoding in web apps consists of 4 steps:
1. First you have to configure your web server.
2. Then you have to force your web app to use UTF-8 encoding for all requests/responses.
3. Third you have to use JSP page encoding.
4. And last you must use HTML meta tags.
In your case the problem lies most probably on step 2 IMO
Here is the perfect article for you, "How to get UTF-8 working in Java webapps?", which describes how to do all of this extensively.
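The '??????' symptom can be reproduced with the JDK alone. The sketch below assumes that some hop in the chain encodes the text with ISO-8859-1 instead of UTF-8 (a common default for HTTP bodies, though I have not verified which hop does it in your setup); EncodingDemo and roundTrip are illustrative names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: what happens to Cyrillic text under different charsets.
class EncodingDemo {

    /** Encodes the text to bytes and decodes it again with the same charset. */
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Six Cyrillic letters, written as escapes so the source file encoding does not matter.
        String cyrillic = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // ISO-8859-1 cannot represent these characters, so each one is replaced by '?'.
        System.out.println(roundTrip(cyrillic, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(cyrillic, StandardCharsets.UTF_8).equals(cyrillic));
    }
}
```

The loss happens at encoding time, so nothing on the receiving side can recover the text; the sender has to use UTF-8 and declare it in the request.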
I need to read a binary file from an intranet HTTP server and make it available for download to the public.
SCHEMA
intranet file server(Apache)=1 <-->Public http server(Apache Tomcat)=2<-->internet authorized user=3
How can I implement this without saving the file to the filesystem on server 2?
Thanks for your answers. I am new to Java.
Sorry for my English.
Use java.net.URL (or another http client) to read from 1 and then print it out (in response to 3).
(In Apache Http Server or Nginx this can be achieved using reverse proxy.)
I can only think of two ways in this situation:
1. Redirect the internet request to the intranet.
In a JSP page use:
<% response.sendRedirect("http://intranet_address");%>
or
<c:redirect url="http://intranet_address"/> using the standard taglib.
In a Servlet use:
response.setStatus(302);
response.setHeader("Location", "http://intranet_address"); or just
response.sendRedirect("http://intranet_address");
2. Use a kind of proxy on server 2 to read from server 1 and send directly to the internet user without saving to server 2.
I have never tried the first approach on an intranet, but I don't think it would work given the fact the intranet address won't be valid to the internet user.
Now we are only left with the second approach - using a proxy layer. The proxy function could be implemented in many ways: a simple one might be just a bean behind the Servlet to open URL to the file server 1, read file and send it through Servlet response stream to the user or maybe you can use some kind of embedded HTTPClient.
Edit: Since you are going to download a binary file, JSP is not a good choice. It's meant to handle textual data. You need a Servlet to stream binary data. You can set things like the following on your HttpServletResponse:
resp.setContentType("application/octet-stream");
resp.setContentLength(length);
resp.setHeader("Content-Disposition", "attachment; filename=\"" + filename + "\"" );
so the content will be sent as an attachment with the name you set.
I am trying to put some logging to capture the raw http request coming to my application. My Java code is inside a SpringMVC controller. I have access to the "HttpServletRequest" object. But I could not find a way to get the raw http request stream out of it. There is a reader but only reads the post content. What I want is the whole shebang, the url, the headers, the body. Is there an easy way to do this?
Thanks in advance.
No.
The servlet provides no such API, and it would be hard to implement because (basically) you cannot read the same data twice from a Socket. It is not difficult to get header information, but raw headers are impossible to capture within a servlet container. To get the bodies you need to capture them yourself as your application reads/writes the relevant streams.
Your alternatives are:
Write your own server-side implementation of the HTTP protocol. (Probably not right for your application.)
You may be able to get the header information you need with filters, though they don't show the raw requests.
Some servlet containers have request header logging; e.g. with Tomcat there's a beast called the RequestDumperValve that you can configure in your "server.xml" file.
Implement a proxy server that sits between the client and your "real" server.
Packet sniffing.
Which is best depends on what you are really trying to achieve.
FOLLOWUP:
If the "badness" is in the headers, the RequestDumperValve approach is probably the best for debugging. Go to the "$CATALINA_HOME/conf/server.xml" file, search for "RequestDumperValve" and uncomment the element. Then restart Tomcat. You can also do the equivalent in your webapp's "context.xml" file. The dumped requests and responses end up in "logs/catalina.out" by default. Note that this will give a LOT of output, so you don't want to do this in production ... except as a last resort.
If the badness is in the content of a POST or PUT request, you'll need to modify your application to save a copy of the content as it reads it from the input stream. I'm not aware of any shortcuts for this.
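One way to save such a copy is a small wrapper stream that writes every byte it hands to the application into a second stream as well. This is a sketch using only the JDK; CopyingInputStream is a made-up name, and in a webapp the wrapped stream would be request.getInputStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical wrapper: bytes read by the application are also written to "copy".
class CopyingInputStream extends FilterInputStream {
    private final OutputStream copy;

    public CopyingInputStream(InputStream in, OutputStream copy) {
        super(in);
        this.copy = copy;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            copy.write(b);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            copy.write(buf, off, n);
        }
        return n;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream log = new ByteArrayOutputStream();
        InputStream body = new CopyingInputStream(
                new ByteArrayInputStream("name=value".getBytes("UTF-8")), log);
        // The application consumes the body as usual.
        while (body.read() != -1) {
            // normal processing would happen here
        }
        System.out.println("captured: " + log.toString("UTF-8"));
    }
}
```

Bytes that the application skips or never reads are not captured, so the copy reflects what the application saw rather than everything on the wire.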
Also, if you want to leave logging on for long periods, you'll probably need to solve the problem yourself by calling the HttpServletRequest API and logging headers, etc. The RequestDumperValve generates too much output, and dumps ALL requests not just the bad ones.
No, servlets provide no api to get at the raw request - you might need a sniffer like wireshark for that.
You can get at the parsed request headers and uri though:
getHeaderNames()
getRequestURI()
etc.
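As a sketch of what logging the parsed request could look like, the helper below formats a method, URI and header map into one log entry. It is kept free of the servlet API so it runs on its own; RequestLogFormatter is a hypothetical name, and in a servlet or filter the arguments would be filled from request.getMethod(), request.getRequestURI(), request.getHeaderNames() and request.getHeaders(name).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical formatter for the parsed (not raw) request line and headers.
class RequestLogFormatter {

    public static String format(String method, String uri,
                                Map<String, List<String>> headers) {
        StringBuilder sb = new StringBuilder();
        sb.append(method).append(' ').append(uri).append('\n');
        for (Map.Entry<String, List<String>> header : headers.entrySet()) {
            // A header may occur more than once, so print one line per value.
            for (String value : header.getValue()) {
                sb.append(header.getKey()).append(": ").append(value).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> headers = new LinkedHashMap<String, List<String>>();
        headers.put("Host", Arrays.asList("example.org"));
        headers.put("Accept", Arrays.asList("text/html", "*/*"));
        System.out.print(format("GET", "/some/path", headers));
    }
}
```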
I managed to read my raw request in my web application deployed on Tomcat 5.5.
All I had to do was read the HttpServletRequest in my servlet/Spring controller
using request.getInputStream() only.
It must be the first API call made on the request, before any filter or other code starts to mess with the request and causes the web server to read it completely.
What's the problem with that approach?