Using owasp.esapi to filter incoming request parameters and headers, I'm stumbling on an issue where apparently the Referer header contains a value that is considered to be using "multiple encoding".
An example:
http://123.abc.xx/xyz/input.xhtml?server=http%3A%2F%2F123.abc.xx%3A7016%2Fxyz&o=1&language=en&t=a074faf3
To me, though, that URL seems to be correctly encoded, and decoding it results in a perfectly readable and correct URL.
So, can anyone explain the issue here, and how to handle this?
ESAPI reports the error when running this method on the header value:
value = ESAPI.encoder().canonicalize(value);
Output:
SEVERE: [SECURITY FAILURE] INTRUSION - Mixed encoding (2x) detected
As a matter of fact, yes. I fixed this bug in the upcoming release of ESAPI, but it will require an API change, and based on your data here the fix itself may still have a bug.
In short, prior to my fix, ESAPI just ran a regex against the URI. The problem, and the slew of bug reports about it, is that URIs are not a regular language; they are a language in their own right. So a URI whose parameters contained data that happened to align with known HTML entities would be mangled: in &param=foo, for example, the &para portion would be interpreted as the entity ¶ (paragraph), turning the parameter into ¶m=foo. There were also some issues regarding ASCII vs. Unicode (non-BMP encodings).
At any rate, there will be a new method to use in the release candidate of our next release: Encoder.getCanonicalizedURI().
This will be safe to regex against, as the URI will be broken down into its components and checked for mixed/multiple encoding. The method you're currently using is now deprecated.
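For illustration, here is a minimal sketch of how the new method could be applied to the Referer value from the question, assuming it ships on the Encoder interface as Encoder.getCanonicalizedURI(java.net.URI) (the exact signature may differ in the release candidate):

import java.net.URI;
import org.owasp.esapi.ESAPI;

public class RefererCheck {
    public static void main(String[] args) throws Exception {
        // The Referer value from the question, still percent-encoded.
        URI referer = new URI("http://123.abc.xx/xyz/input.xhtml"
                + "?server=http%3A%2F%2F123.abc.xx%3A7016%2Fxyz&o=1&language=en&t=a074faf3");

        // Unlike canonicalize(String), this breaks the URI into components
        // first, so a percent-encoded query value is no longer misread as
        // mixed/multiple encoding of the string as a whole.
        String canonical = ESAPI.encoder().getCanonicalizedURI(referer);
        System.out.println(canonical);
    }
}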
Related
Recently, after scanning our project, we came across Veracode warnings on improper CRLF neutralization. Please find my flagged code below.
Cookie[] c = request.getCookies();
c[i].setValue("");
c[i].setMaxAge(0);
The issue is reported on the line below:
response.addCookie(c[i]);
Solutions Tried:
1.setValue("") tried replacing with \r or \n
2. used Encode.forJava(String)
3. Used ESAPI, but our project is running on Java 1.6. No suitable ESAPI jar was found.
Any recommendations here? Am I missing anything? Am I going in the wrong direction? Can anyone help me with this?
I don't think output encoding is the right approach here. Unless you are rendering the cookie name and/or value, the issue is not XSS, but rather HTTP Response Splitting.
Strict allow-listing is the best approach here, but if you find that impossible (because you are not sure what the allowed values are supposed to be, which might be the case if you were writing an HTTP library or getting values from downstream processes, etc.), then go with block-list data validation. For the block-list approach, I recommend outright rejecting any cookie containing ':', '=', '\r', or '\n', logging an appropriate error, and redirecting the user to an appropriate error page. Alternatively, if you detect anything in the block list, you could simply ignore those values by silently stripping them out (although you may want to log them).
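A minimal sketch of that block-list approach (class and method names are mine, not from any library, and it sticks to APIs available on older servlet containers):

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public final class CookieSanitizer {

    // Characters to reject outright, per the recommendation above.
    private static final char[] BLOCKED = { ':', '=', '\r', '\n' };

    // Returns true if the value contains none of the blocked characters.
    public static boolean isSafe(String value) {
        if (value == null) {
            return true;
        }
        for (char blocked : BLOCKED) {
            if (value.indexOf(blocked) >= 0) {
                return false;
            }
        }
        return true;
    }

    // Adds the cookie only if both name and value pass the check;
    // otherwise logs and skips it (or redirect to an error page here).
    public static void addCookieSafely(HttpServletResponse response, Cookie cookie) {
        if (isSafe(cookie.getName()) && isSafe(cookie.getValue())) {
            response.addCookie(cookie);
        } else {
            System.err.println("Rejected cookie with blocked characters: " + cookie.getName());
        }
    }
}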
I have built an API where you can register a callback URL.
The URLs are validated using the Apache UrlValidator class.
I now have to add a feature that allows adding placeholders to the configured URLs.
https:/foo.com/${placeholder1}/bar/${placeholder2}
These placeholders will be dynamically replaced using the Apache StrSubstitutor or something similar.
Now my issue: how do I validate the URLs with the placeholders?
I have thought of a solution:
I replace the expected placeholders with an example value
Then I validate the URL using the Apache UrlValidator
My issue with this solution is that the Apache UrlValidator only returns a boolean, so the error message will be quite ambiguous.
Is there another solution besides creating my own regex?
Update, following discussions in the comments:
There is a finite number of allowed placeholders.
The format of the Strings that will replace the placeholders is also known.
The first objective is to be able to check that a given URL, which may contain placeholders, is valid at the time it is configured.
The second objective is, if the URL is not valid, to return an intelligible error message.
There are multiple error cases:
A placeholder used in the URL is not in the allowed placeholder list
The URL is not valid, independently of the placeholders
For a minimal URL validation, you could use the java.net.URL constructor (it will work with your https:/foo.com/${placeholder1}/bar/${placeholder2} example).
According to the docs, it throws:
MalformedURLException - if no protocol is specified, or an unknown protocol is found, or spec is null.
You can then leverage the URL methods as a bonus, to get parts of it such as path, protocol, etc.
I would definitely advise against re-inventing the wheel with regex for URL validation.
Note that java.net.URI has a much stricter validation and would fail your example with placeholders as is.
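For instance, a quick sketch of that minimal check; the comments show what I would expect the getters to return (note that with the single slash there is no authority, so everything lands in the path):

import java.net.MalformedURLException;
import java.net.URL;

public class MinimalUrlCheck {
    public static void main(String[] args) {
        try {
            URL url = new URL("https:/foo.com/${placeholder1}/bar/${placeholder2}");
            System.out.println(url.getProtocol()); // https
            System.out.println(url.getPath());     // /foo.com/${placeholder1}/bar/${placeholder2}
        } catch (MalformedURLException e) {
            // Thrown for a missing or unknown protocol.
            System.out.println("Invalid URL: " + e.getMessage());
        }
    }
}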
Edit
As discussed, since you need to validate the placeholders as well, you probably want to actually try to fill them first, fail fast if something's wrong, then proceed to validate the populated URL against java.net.URI for strict validation.
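A sketch of that flow, assuming a fixed set of allowed placeholders with known sample values (the names and sample values here are illustrative):

import java.net.URI;
import java.net.URISyntaxException;
import java.util.HashMap;
import java.util.Map;
import org.apache.commons.lang3.text.StrSubstitutor;

public final class CallbackUrlValidator {

    // One representative sample value per allowed placeholder.
    private static final Map<String, String> ALLOWED_SAMPLES = new HashMap<String, String>();
    static {
        ALLOWED_SAMPLES.put("placeholder1", "sample1");
        ALLOWED_SAMPLES.put("placeholder2", "sample2");
    }

    public static void validate(String template) throws URISyntaxException {
        // Fill the known placeholders with sample values.
        String populated = new StrSubstitutor(ALLOWED_SAMPLES).replace(template);

        // Anything still looking like ${...} used an unknown placeholder:
        // the first error case, reported with a precise message.
        if (populated.contains("${")) {
            throw new IllegalArgumentException("Unknown placeholder in: " + template);
        }

        // Second error case: java.net.URI validates strictly, and its
        // URISyntaxException carries the offending index and a reason.
        new URI(populated);
    }
}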
General caveat
You might also want to make your life easier and leverage an existing framework that would allow you to use annotated path variables in the first place (e.g. Spring, etc.), but that's quite a broad discussion.
I just realized that my base64-encoded "Authentication" header can't be read with request.getHeader("Authentication").
I found this post saying that it's a security feature in URLConnection:
getRequestProperty("Authorization") always returns null
I don't know why, but it seems to be true for request.getHeader as well.
How can I still get this header if I don't want to switch to other libraries?
I was searching through https://fossies.org/dox/apache-tomcat-6.0.45-src/catalina_2connector_2Request_8java_source.html#l01947 and found a section where restricted headers will be used if Globals.IS_SECURITY_ENABLED is set.
Since I'm working on a reverse proxy and only need to pass requests/responses through, I simply set System.setSecurityManager(null);. For my case that might be a valid solution, but if you actually want to use authentication there is no reason to use this workaround.
My bad, it does work with https now.
The accepted solution did not work for me; it may have something to do with different runtime environments.
However, I've managed to come up with a working snippet that accesses the underlying MessageHeader collection via reflection and extracts the "Authorization" header value.
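I can't reproduce the exact snippet, but the general shape of that approach looks like the sketch below. Be warned that it depends on JDK-internal details (the private requests field of sun.net.www.protocol.http.HttpURLConnection and MessageHeader.findValue), so it can break between runtimes, and for https the connection class wraps a delegate, which may explain the differing results:

import java.lang.reflect.Field;
import java.net.HttpURLConnection;

public final class AuthHeaderReader {

    // Extracts the "Authorization" request header from the connection's
    // internal header collection via reflection.
    public static String readAuthorization(HttpURLConnection connection) throws Exception {
        // Private field declared on sun.net.www.protocol.http.HttpURLConnection.
        Field requestsField = connection.getClass().getDeclaredField("requests");
        requestsField.setAccessible(true);
        Object messageHeader = requestsField.get(connection);

        // sun.net.www.MessageHeader#findValue(String) looks a header up by name.
        return (String) messageHeader.getClass()
                .getMethod("findValue", String.class)
                .invoke(messageHeader, "Authorization");
    }
}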
I understand what ESAPI is used for, but I see the line below repeated in a lot of ESAPI examples. Can someone please explain what exactly it does?
ESAPI.encoder().canonicalize(inputUrl,false,false);
See the docs:
Canonicalization is simply the operation of reducing a possibly encoded string down to its simplest form. This is important, because attackers frequently use encoding to change their input in a way that will bypass validation filters, but still be interpreted properly by the target of the attack. Note that data encoded more than once is not something that a normal user would generate and should be regarded as an attack.
The two additional parameters, which are both set to false in your example, indicate whether to restrict multiple encoding and mixed encoding (see the docs for their exact meaning), respectively.
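To make the flags concrete, here is a small sketch assuming ESAPI 2.x (IntrusionException is unchecked; the try/catch is only there to demonstrate the strict behavior):

import org.owasp.esapi.ESAPI;
import org.owasp.esapi.errors.IntrusionException;

public class CanonicalizeDemo {
    public static void main(String[] args) {
        // Single URL encoding: %3C and %3E reduce to '<' and '>'.
        System.out.println(ESAPI.encoder().canonicalize("%3Cscript%3E", false, false));
        // prints: <script>

        // Double URL encoding: %253C reduces to %3C, then to '<'. With both
        // flags false the input is still reduced and only a warning is logged.
        System.out.println(ESAPI.encoder().canonicalize("%253Cscript%253E", false, false));

        try {
            // With restrictMultiple=true, the same input is treated as an attack.
            ESAPI.encoder().canonicalize("%253Cscript%253E", true, true);
        } catch (IntrusionException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}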
I have experienced different JSON-encoded values for the same string depending on the language used. Since the APIs were used in a closed environment (no 3rd parties allowed), we made a compromise, and all our Java applications manually encode Unicode characters. LinkedIn's API is returning "corrupted" values, basically the same as our Java applications. I've already posted a question on their forum; the reason I am asking here as well is quite simple: sharing is caring :) This question is therefore partially connected with LinkedIn, but mostly about finding an answer to the general encoding problem described below.
As you can see, my last name contains the letter ž, which should be \u017e, but Java (or LinkedIn's API, for that matter) returns \u009e in the JSON response and nothing in the XML response. PHP's json_decode() ignores it and my last name becomes Kurida.
After an investigation, I've found that ž apparently has two representations, 9E and 17E. What exactly is going on here? Is there a solution for this problem?
U+009E is a usually-invisible control character and not an acceptable alternative representation for ž.
The byte 0x9E represents the character ž in Windows code page 1252. That byte, if decoded using ISO-8859-1, would turn into U+009E.
(The confusion comes from the fact that if you write the character reference &#x9E; in an HTML page, the browser doesn't actually give you character U+009E, as you might expect, but converts it to U+017E. The same is true of all the character references 0080–009F: they get changed as if the numbers referred to cp1252 bytes instead of Unicode characters. This is utterly bizarre and wrong behaviour, but all the major browsers do it, so we're stuck with it now. Except in proper XHTML served as XML, since that has to follow the more sensible XML rules.)
Looking at the forum page, the JSON reading is clearly not wrong: your name is registered as being “David Kurid[U+009E]a”. How that data got into their system, however, is what needs looking at.
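The cp1252/ISO-8859-1 mix-up is easy to reproduce in Java; something like the following has likely happened somewhere upstream (a demonstration, not the actual LinkedIn pipeline):

import java.nio.charset.Charset;

public class EncodingDemo {
    public static void main(String[] args) {
        byte[] bytes = { (byte) 0x9E };

        // Decoded as windows-1252, byte 0x9E is the letter 'ž' (U+017E).
        System.out.println(new String(bytes, Charset.forName("windows-1252")));

        // Decoded as ISO-8859-1, the same byte becomes the invisible
        // control character U+009E instead.
        String wrong = new String(bytes, Charset.forName("ISO-8859-1"));
        System.out.printf("U+%04X%n", (int) wrong.charAt(0)); // U+009E
    }
}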