I am talking to a file upload service that accepts raw POST data, not form data. By default, Java's HttpURLConnection sets the Content-Type header to application/x-www-form-urlencoded. This is obviously wrong if I'm posting pure data.
I (the client) don't know the content type, and I don't want the Content-Type header set at all. The service has a feature where it will guess the content type (based on the file name, reading some data from the file, etc.).
How do I unset a header? There's no method to remove a header, setting it to null doesn't change the value, and setting it to the empty string results in the header being sent with no value.
I haven't tested this approach but you can try this:
Extend HttpURLConnection and try overriding its getContentHandler() and setContentHandler(...) methods. This should most probably work, as you will see if you take a look at the code of getContentHandler().
Use Apache HttpClient instead of URLConnection
Use fluent Request to generate your request
Use removeHeader()
What do you mean by "I don't want the Content-Type header set at all"?
The browser (or other http client) sends your post request to the server, so it has to inform the server which way it encoded the parameters.
If the Content-Type header is not set, on the server side you (= your server) won't be able to understand how to parse the received data.
If you didn't set Content-Type, the default value will be used.
Your browser (or other HTTP client) MUST do two things:
Send key/value pairs.
Inform the server how the key/value pairs were encoded.
So, it is impossible to completely get rid of this header.
I just accomplished this by setting the header to null.
connection.setRequestProperty(MY_HEADER, null);
I am calling an API, that blacklists certain HttpHeaders including Content-Length which seems to be preset by the HttpClient underneath spring-openfeign.
To properly receive an API response, I'd need to remove the Content-Length header.
The following workarounds had been tried:
I tried to set the header to null or an empty String using the available annotations @Headers and @RequestHeader.
I implemented a RequestInterceptor that creates a copy of the available (immutable) header map, deletes the blacklisted header, and sets the map via requestTemplate.headers(newHeaders). But new headers can only be added, and the existing ones cannot be modified (they seem to be really immutable ;)).
I researched on overriding the used HttpClient but wasn't successful until now.
Experienced errors/ issues:
The API I am calling returns a 400 based on their header schema validation.
Code:
In case any code-snippets are needed, I am happy to provide them but to me the issue does not seem to be related to any code issue as I am not running into any exceptions.
Thanks in advance!!
The Apache Http Client included in feign-httpclient will always set the Content-Length header if a request body is present. One way to address this is to configure the Apache client directly and provide it to Feign via the builder:
This custom client can have an Apache Http Client interceptor applied that allows you to modify the request after it leaves Feign and before Apache sends it. Review their javadoc for more information.
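import feign.Feign;
import feign.httpclient.ApacheHttpClient;
import org.apache.http.client.HttpClient;
import org.apache.http.impl.client.HttpClients;

public class Example {
    public static void main(String[] args) {
        // Build the Apache client yourself so it can be customized (e.g. with interceptors).
        HttpClient httpClient = HttpClients.custom().build();
        // GitHub is the Feign API interface for the target service (not shown here).
        GitHub github = Feign.builder()
            .client(new ApacheHttpClient(httpClient))
            .target(GitHub.class, "https://api.github.com");
    }
}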
FeignClient will preset Content-Length in the request header. On a keep-alive connection, either the Content-Length or the Transfer-Encoding header field must be set to signal the presence of a message body, so you can set Transfer-Encoding: chunked, and Content-Length will then be ignored by the server.
You can refer to rfc7230#section-3.3.1
"The presence of a message body in a request is signaled by a
Content-Length or Transfer-Encoding header field. Request message
framing is independent of method semantics, even if the method does
not define any use for a message body."
"In order to remain persistent, all messages on a connection need
to have a self-defined message length (i.e., one not defined by
closure of the connection), as described in Section 3.3. A server
MUST read the entire request message body or close the connection
after sending its response, since otherwise the remaining data on a
persistent connection would be misinterpreted as the next request.
Likewise, a client MUST read the entire response message body if it
intends to reuse the same connection for a subsequent request."
And from here, you can read:
"All HTTP/1.1 applications that receive entities MUST accept the
"chunked" transfer-coding (section 3.6), thus allowing this mechanism
to be used for messages when the message length cannot be determined
in advance.
Messages MUST NOT include both a Content-Length header field and a
non-identity transfer-coding. If the message does include a non-
identity transfer-coding, the Content-Length MUST be ignored.
When a Content-Length is given in a message where a message-body is
allowed, its field value MUST exactly match the number of OCTETs in
the message-body. HTTP/1.1 user agents MUST notify the user when an
invalid length is received and detected."
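To make the chunked framing described in the quotes above concrete, here is a minimal sketch of how a body is framed when Transfer-Encoding: chunked is used instead of Content-Length. The class name ChunkedEncoder and the chunk size parameter are hypothetical, chosen only for illustration; real HTTP clients do this framing for you.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ChunkedEncoder {
    /**
     * Frames a body as HTTP/1.1 chunked transfer coding:
     * each chunk is "<size in hex>CRLF<data>CRLF", terminated by a zero-size chunk.
     */
    static byte[] encode(byte[] body, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int offset = 0; offset < body.length; offset += chunkSize) {
            int length = Math.min(chunkSize, body.length - offset);
            byte[] sizeLine = (Integer.toHexString(length) + "\r\n")
                    .getBytes(StandardCharsets.US_ASCII);
            out.write(sizeLine, 0, sizeLine.length);
            out.write(body, offset, length);
            out.write('\r');
            out.write('\n');
        }
        // The last chunk has size 0 and is followed by an empty line (no trailers).
        byte[] last = "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        out.write(last, 0, last.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] framed = encode("hello world".getBytes(StandardCharsets.US_ASCII), 5);
        // Show the CR and LF bytes as visible escapes.
        System.out.println(new String(framed, StandardCharsets.US_ASCII)
                .replace("\r", "\\r").replace("\n", "\\n"));
    }
}
```

Because the receiver learns the length of each chunk from the framing itself, the sender does not need to know the total body length in advance.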
I have a server that handles a POST request with JSON. It also looks for and decodes query parameters from the URI. My Java client currently uses HttpPost to send the JSON with Content-Type application/json.
I wonder whether the format method of URLEncodedUtils would be able to accomplish this. However, the documentation mentions:
suitable for use as an application/x-www-form-urlencoded list of
parameters in an HTTP PUT or HTTP POST.
So my question is
1. Would this work with Content-Type set to application/json?
2. Is there another way to accomplish what the server requires, i.e. have JSON as well as query parameters encoded in the URI?
There are two official ways of submitting form data per the HTML spec. The pertinent encoding is application/x-www-form-urlencoded, which encodes the data as name/value pairs. With the GET method, the pairs are appended to the URL after a ?; with the POST method, they are sent in the request body, after the headers.
Everything we do with HTTP in REST web services is valid HTTP, even where it goes beyond what HTML forms allow. So a request with Content-Type application/json can combine application/x-www-form-urlencoded style parameters in the URL with a JSON payload in the body.
The HTTP request will look something like this:
POST /blog/posts?myparam=Something%20Good&token=donotdothis HTTP/1.1
Accept: application/json
Content-Type: application/json
Content-Length: 57

{"title":"Hello World!","body":"This is my first post!"}
Also spelled out here: http://www.jsonrpc.org/historical/json-rpc-over-http.html
It's the ? that marks the beginning of the extra parameters. So while that is technically legal, it does raise the question of why everything you need to post can't be part of your JSON. The downside of this approach is that the query parameters all end up in your HTTP logs and are very visible. You definitely should not use this approach with passwords or any other personally identifiable information. Depending on the privacy laws in your country, you want to minimize unnecessary records to make compliance much easier.
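As a rough illustration of the request line shown above, the following sketch percent-encodes query parameters and appends them to a path using only the JDK. The class name QueryStringBuilder is hypothetical. It assumes Java 10 or later for URLEncoder.encode(String, Charset); since URLEncoder produces form encoding (a space becomes +), the sketch rewrites + to %20 afterwards.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

final class QueryStringBuilder {
    /** Percent-encodes one name or value; a literal '+' is already escaped as %2B by URLEncoder. */
    static String encode(String raw) {
        return URLEncoder.encode(raw, StandardCharsets.UTF_8).replace("+", "%20");
    }

    /** Appends the parameters to the path as ?name=value&name=value, in map iteration order. */
    static String build(String path, Map<String, String> params) {
        if (params.isEmpty()) {
            return path;
        }
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(encode(entry.getKey()) + "=" + encode(entry.getValue()));
        }
        return path + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("myparam", "Something Good");
        params.put("token", "donotdothis");
        System.out.println(build("/blog/posts", params));
    }
}
```

The resulting string would be used as the request target, while the JSON document is still sent as the body with Content-Type: application/json; the two do not interfere with each other.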
I am trying to "spoof" a Firefox HTTP POST request in Java using java.net.HttpURLConnection.
I use Wireshark to check the HTTP headers being sent, so I have (hopefully) reliable source of information, why the Java result doesn't match the ideal situation (using Firefox).
I have set all header fields exactly to the values that Firefox sends via HTTP and noticed, that the sequence of the header fields is not the same.
The output for Firefox is like:
POST ...
**Host**
User-Agent
Accept
Accept-Language
Accept-Encoding
Referer
Connection
Content-Type
Content-Length
When I let wireshark tap off my implementation in Java, it gives me a slightly different sequence of fields:
POST...
**User-Agent**
Accept
Accept-Language
Accept-Encoding
Referer
Content-Type
Host
Connection
Content-Length
So basically, I have all the fields, just in a different order.
I have also noticed that the Host field is sent with a different value:
www.thewebsite.com (Firefox) <---> thewebsite.com (Java HttpURLConnection), although I pass on the String to httpUrlConnection.setRequestProperty with the "www."
I have not yet analyzed the byte output of Wireshark, but I know that the server is not returning the same Location in the header fields of my response.
My questions are:
(1) Is it possible to control the sequence of the header fields in the request, and if yes, is it possible to do so using HttpURLConnection? If not, is it possible to directly control the bytes in the HTTP header using Java? [I don't own the server, so my only hope of getting the POST method working is through my application pretending to be Firefox. The server is not really verbose; my only info is: Apache with PHP.]
(2) Is there a way to fix the setRequestProperty() problem ("www") as described above?
(3) What else could matter? (Do I need to concern the underlying layers, TCP....?)
Thanks for any comments.
PS. I am trying to model a situation without cookies being sent, so that I can ignore the effect.
First, the order of the headers is irrelevant.
Second, in order to manually override the host header you need to set sun.net.http.allowRestrictedHeaders=true either in code
System.setProperty("sun.net.http.allowRestrictedHeaders", "true");
or at JVM start
-Dsun.net.http.allowRestrictedHeaders=true
This is a security precaution introduced by Oracle a while ago. That's because according to RFC
The Host request-header field specifies the Internet host and port
number of the resource being requested, as obtained from the original
URI given by the user or referring resource (generally an HTTP URL).
The header order is not important; the headers received by the server are also unordered. You cannot control the header order with HttpURLConnection. But if you write your own TCP client, you can control the header order, like this:
// Write the request bytes yourself; headers go out in exactly the order given.
Socket clientSocket = new Socket(serverHost, serverPort);
OutputStream os = clientSocket.getOutputStream();
String send = "GET /?id=y2y HTTP/1.1\r\nConnection: keep-alive\r\nKeep-Alive: timeout=15, max=200\r\nHost: chillyc.info\r\n\r\nGET /?id=y2y HTTP/1.1\r\nConnection: keep-alive\r\nKeep-Alive: timeout=15, max=200\r\nHost: chillyc.info\r\n\r\n";
os.write(send.getBytes(StandardCharsets.US_ASCII));
os.flush();
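If you go the raw-socket route, a small helper can keep the header order explicit instead of hand-writing one long string. The sketch below is only an illustration: the class name RawRequestBuilder and the header values are hypothetical, and it builds the request text without opening a connection. A LinkedHashMap preserves insertion order, so the headers are emitted exactly in the order they were added, and nothing (such as Content-Type) is added implicitly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class RawRequestBuilder {
    /** Builds an HTTP/1.1 request; headers are written in the map's iteration order. */
    static String build(String method, String target, Map<String, String> headers, String body) {
        StringBuilder request = new StringBuilder();
        request.append(method).append(' ').append(target).append(" HTTP/1.1\r\n");
        for (Map.Entry<String, String> header : headers.entrySet()) {
            request.append(header.getKey()).append(": ").append(header.getValue()).append("\r\n");
        }
        // An empty line separates the header section from the body.
        request.append("\r\n").append(body);
        return request.toString();
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", "www.thewebsite.com");
        headers.put("User-Agent", "example-agent");
        headers.put("Content-Length", "3"); // must match the body length in bytes
        System.out.print(build("POST", "/login", headers, "a=1"));
    }
}
```

The returned string can then be written to the socket's OutputStream as in the snippet above. Note that with this approach you are also responsible for reading and parsing the response yourself.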
The Second question is answered by Marcel Stör in the first answer.
I got lucky with Apache Http Components, my guess is that the "Host" header's missing "www." made the difference, which can be set exactly as intended using Apache's HttpPost:
httpPost.setHeader("Host", "www.thewebsite.com");
The Wireshark output confirmed my suspicion. Also this time the TCP communication prior to my HTTP post looks different (client ---> server, server ---> client, client ---> server) instead of (client ---> server, server ---> client, client ---> server, client---> server).
Now I get the desired Location header value and the server is also setting the cookies. :)
For the most part, this question is resolved.
Actually, I wanted to use the lightweight HttpURLConnection because that's what the Android Developers blog suggests. System.setProperty("sun.net.http.allowRestrictedHeaders", "true") might work as well, if it allows the "www." in the Host value.
I have a web page with links pointing to downloadable files. For example:
http://www.mysite.com/download.php?FILE=downloads/programming/various/ebook.pdf
But it can also have navigation links as follows:
http://www.mysite.com/index.php
http://www.mysite.com/index.php?category=programming
http://www.mysite.com/index.php?section=programming&category=various
How can I determine whether a URL points to a file, as in the first link? Or, conversely, filter out URLs that don't fit?
Going with your edited question: if you want to filter out files,
screen the Content-Type header.
Here is an informal list of common mime-types
You can inspect the response headers to determine whether the response will conform, e.g. to application/pdf. But you cannot make this determination from the URL / URI alone.
In fact, I could construct a web application that would respond to the URL http://myapp.com/test.pdf with header Content-Type: image/jpeg and data of a JPG.
Also, I could really break things by sending a header Content-Type: image/jpeg with the data of a PDF.
Presuming that it wasn't intentionally-broken (as I mentioned above) then you can rely on the response.
Be aware if the content itself deviates from the Content-Type header then you could have an exploit happen. This is how the iPhone was jailbroken: through acting on malformed PDF data.
Look for a file name-like parameter?
Any URL could respond with a file when requested.
You have no way of knowing what a URL will respond with until you request it.
In HTTP, URLs don't point to files, ever; they identify resources, for which you get a representation when you "dereference" that URL (i.e. make a GET request).
Whether the user-agent chooses to store that representation as a file is its own choice. What to do with a representation is guided by the content-type.
You may obtain the content-type using a HEAD request. PDF documents should be using application/pdf but there are a number of other types. Most browsers tend to save application/octet-stream as files, by default. (There are also subtleties about content-type negotiation.)
In Java, you could make a HEAD request using something like this:
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestMethod("HEAD");
// Check connection.getContentType();
I want to check if a URL's mimetype is not a webpage. Can I do this in Java? I want to check if the file is a rar or mp3 or mp4 or mpeg or whatever, just not a webpage.
You can issue an HTTP HEAD request and check for Content-Type response headers. You can use the HttpURLConnection.setRequestMethod("HEAD") before you issue the request. Then issue the request with URLConnection.connect() and then use URLConnection.getContentType() which reads the HTTP headers.
The bonus of using a HEAD request is that the actual resource is never transmitted/generated. You can also use a GET request and inspect the resulting stream using URLConnection.guessContentTypeFromStream() which will inspect the actual bytes and try to guess what the stream represents. I think that it looks for magic numbers or other patterns in the stream.
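As a small sketch of the stream-sniffing approach, the following feeds the first bytes of a payload to URLConnection.guessContentTypeFromStream. The class name MagicNumberSniffer is hypothetical, and a ByteArrayInputStream stands in for the response stream; with a real connection the stream must support mark/reset (for example by wrapping it in a BufferedInputStream), otherwise the method returns null.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class MagicNumberSniffer {
    /** Returns the JDK's guess for the content type, or null if it cannot tell. */
    static String sniff(byte[] leadingBytes) {
        // ByteArrayInputStream supports mark/reset, which guessContentTypeFromStream needs.
        try (InputStream in = new ByteArrayInputStream(leadingBytes)) {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A GIF file starts with the ASCII signature "GIF8"; pad to 16 bytes for the sniffer.
        byte[] gifHeader = Arrays.copyOf("GIF89a".getBytes(StandardCharsets.US_ASCII), 16);
        System.out.println(sniff(gifHeader));
        System.out.println(sniff("<!DOCTYPE html><html>".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The JDK only recognizes a limited set of signatures, so a null result means "unknown" rather than "not a file".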
There's nothing inherent in a URL which will tell you what you will receive when you request it. You have to actually request the resource, and then inspect the content-type header. At that point, it's still not clear what you should do - some content types will (almost) always be handled by the browser, e.g. text/html. Some types should be handled by a browser, e.g. application/xhtml+xml. Some types may be handled by the browser, e.g. application/pdf.
Which, if any, of these you consider to be "webpage" is still not clear - you'll need to decide for yourself.
You can inspect the content-type header once you've requested the resource, using, for example, the HttpURLConnection class.
A Content-Type of text/html represents a web page.
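In practice the header value often carries parameters (for example a charset) and may differ in letter case, so a plain string comparison against text/html can miss. Here is a minimal sketch of the check; the class name ContentTypes is hypothetical, and treating application/xhtml+xml as a web page is an assumption you may want to change.

```java
import java.util.Locale;
import java.util.Set;

final class ContentTypes {
    // Assumption: these two media types count as "web pages"; adjust to your own definition.
    private static final Set<String> WEB_PAGE_TYPES = Set.of("text/html", "application/xhtml+xml");

    /** Strips parameters such as "; charset=UTF-8" and normalizes case. */
    static String mediaType(String contentTypeHeader) {
        if (contentTypeHeader == null) {
            return "";
        }
        int semicolon = contentTypeHeader.indexOf(';');
        String bare = semicolon >= 0 ? contentTypeHeader.substring(0, semicolon) : contentTypeHeader;
        return bare.trim().toLowerCase(Locale.ROOT);
    }

    static boolean isWebPage(String contentTypeHeader) {
        return WEB_PAGE_TYPES.contains(mediaType(contentTypeHeader));
    }

    public static void main(String[] args) {
        System.out.println(isWebPage("text/html; charset=UTF-8"));
        System.out.println(isWebPage("application/pdf"));
    }
}
```

The value passed in would typically come from connection.getContentType() after a HEAD or GET request; a missing header is treated here as "not a web page".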