I worked on a website in China last year. The website has been running fine, but I just received an email from the Network Security Bureau in China.
They sent me an attachment along with a few URLs. One of them is:
http://mydomain.com/servlets/pdf?var=xss<audio src=pksovf onerror=pksovf(qrx)>
They say that the above URL poses a risk. I checked the URL and it always returns a 500 error because the URL format is incorrect.
If a page returns a 500 error, is there still a risk? I don't think so, but I just want to confirm.
A 500 is roughly analogous to a crash in a piece of desktop software. Crashes are considered security risks because they can be exploited by malicious inputs, and the same could be said of the web server.
A 500 means "something we weren't expecting happened, and we failed to deal with it." That means there's something there that could be exploited - there's no guarantee of that, just like there's no guarantee that a crash in a piece of desktop software can be exploited, but it's correct to treat both of those cases as potential security holes.
Potentially yes. It depends on where the error occurred.
If your back end inserted this string into a database before crashing, it may be queried later and rendered in one of your views. That could be very dangerous: an attacker could insert arbitrary HTML or JavaScript into a web page and, for example, redirect a user to a harmful website.
Also, this could easily result in an SQL injection if a database was queried with this string at some point without the string being filtered.
To summarize, it depends on what you did with this string and where your code crashed.
In any case, you should not have this kind of issue, and you should always check the validity of user input. A 500 HTTP status code is exceptional and means that something went wrong on the server side. You have to fix this regardless.
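As a rough illustration of checking user input before it reaches any other code, here is a minimal sketch in Java. The class name `ParamValidator`, the whitelist pattern, and the idea that a valid `var` value is a short alphanumeric token are all assumptions made for the example; the real rules depend on what the servlet expects.

```java
import java.util.regex.Pattern;

/** Hypothetical whitelist check for the "var" query parameter. */
final class ParamValidator {
    // Assumption: a legitimate value is 1-64 letters, digits, '_' or '-'.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    /** Returns true only when the whole value matches the whitelist. */
    static boolean isValid(String value) {
        return value != null && SAFE.matcher(value).matches();
    }

    /** Picks the HTTP status a handler could return for the given input. */
    static int statusFor(String value) {
        return isValid(value) ? 200 : 400;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("report42"));
        System.out.println(statusFor("xss<audio src=pksovf onerror=pksovf(qrx)>"));
    }
}
```

Rejecting the value up front with a 400 means the malformed string never reaches the code that currently fails with a 500.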
We're currently running into an interesting problem regarding the sanitization of error logs being printed into our server logs. We have proper global error handling set up and have custom error messages that are sent back as responses from our OSGi java servlets.
We use dockerized containers as server instances that are autoscaled, so we're thinking about setting up a log aggregator and storing our exceptions within a DB in the cloud, that way we can also track metrics about our exceptions and pinpoint how we could improve our development process to reduce certain types of errors, etc.
I did a bit of research about how that should be done and found the OWASP Logging Cheat Sheet. It mentions that passwords should never be logged, among a few other things. That brings us to my question:
How do I go about properly sanitizing my logs without using some janky text processing or manually covering up all the potential cases?
Example stacktrace:
pkg.exceptions.CustomException: some registration error
ERROR: duplicate key value violates unique constraint "x_username_org_id_key"
Detail: Key (username, org_id)=(SOME EMAIL, 1) already exists.
Query: with A as (some query) insert into someTable (..values...) Parameters: [X, X, X, X, X, SOME_EMAIL, THE_PASSWORD]
at somepkg.etc
This is a pretty common error with registration systems that happens due to username collisions. Sure, this specific case can be avoided by checking that the username isn't taken before the insertion is attempted and handling that case separately, but that's just a single case among many others.
After looking around, there doesn't seem to be an obvious way to solve the problem, and I'm wondering whether everyone out there has simply implemented their own version of a log sanitizer. We could simply purge the stacktrace if some troublesome strings are present, but that's not the best solution. Any suggestions?
If you only store and pass around password hashes, you won't need to sanitize the logs for passwords. In cases where a password must be held temporarily in code, use char[] rather than String: a char[] can be overwritten once you are done with it, whereas a String is immutable. This is a more secure approach in general and is considered a best practice. Standard library APIs that handle passwords, such as JPasswordField.getPassword() and PBEKeySpec, use character arrays.
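A minimal sketch of that approach, assuming PBKDF2 from the standard library is an acceptable hash for your system. The class name `PasswordHasher`, the iteration count, and the key length are illustrative choices, not recommendations taken from the question.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

/** Hypothetical helper that turns a char[] password into a hash and wipes the input. */
final class PasswordHasher {
    private static final int ITERATIONS = 100_000; // assumption: tune for your hardware
    private static final int KEY_BITS = 256;

    /** Generates a random 16-byte salt to store next to the hash. */
    static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);
        return salt;
    }

    /** Hashes the password, then overwrites the caller's array with zeros. */
    static byte[] hash(char[] password, byte[] salt) {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 is not available", e);
        } finally {
            spec.clearPassword();        // wipe the copy held by the key spec
            Arrays.fill(password, '\0'); // wipe the caller's copy
        }
    }
}
```

If only the resulting byte[] is passed to the query, a failed insert can log its parameters without exposing the plain-text password.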
For most of the errors in Mulesoft there is no error code defined. If it doesn't know, Mule flatly prints MULE_ERROR--2. Instead of this I want to put in my own error code which will be fetched from DB and include it in the exception payload. After this, the exception payload should be sent to a handler flow for re-submission based on error code. Hence in error handling part of the flow I need to have more than one component.
I tried a Custom Exception Strategy, a Catch Exception Strategy, a Java component, and flow-refs, but none of them worked.
I also built a dummy flow for this (without fetching the error code) that sets my own custom error message. What I noticed is that the same error is thrown twice: once by default, and again when I set my error message and throw the error. To suppress this I put
<AsyncLogger name="org.mule.exception.CatchMessagingExceptionStrategy" level="FATAL"/>
in log4j2.xml.
Will this cause any issues?
You can always define your own errors and customise them. See the link below for further information:
http://blogs.mulesoft.com/dev/api-dev/api-best-practices-response-handling/
The content of that page is reproduced below:
Use HTTP Status Codes
One of the most commonly misused HTTP Status Codes is 200 – ok or the request was successful. Surprisingly, you’ll find that a lot of APIs use 200 when creating an object (status code 201), or even when the response fails:
[Image: a response that returns status 200 although the request failed]
In the above case, if the developer is solely relying on the status code to see if the request was successful, the program will continue on not realizing that the request failed, and that it did something wrong. This is especially important if there are dependencies within the program on that record existing. Instead, the correct status code to use would have been 400 to indicate a “Bad Request.”
By using the correct status codes, developers can quickly see what is happening with the application and do a “quick check” for errors without having to rely on the body’s response.
You can find a full list of status codes in the HTTP/1.1 RFC, but just for a quick reference, here are some of the most commonly used Status Codes for RESTful APIs:
200 OK
201 Created
304 Not Modified
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
405 Method Not Allowed
415 Unsupported Media Type
500 Internal Server Error
Of course, if you feel like being really creative, you can always take advantage of status code:
418 I’m a Teapot
It’s important to note that Twitter’s famed 420 status code – Enhance Your Calm, is not really a standardized response, and you should probably just stick to status code 429 for too many requests instead.
Use Descriptive Error Messages
Again, status codes help developers quickly identify the result of their call, allowing for quick success and failure checks. But in the event of a failure, it’s also important to make sure the developer understands WHY the call failed. This is especially crucial to the initial integration of your API (remember, the easier your API is to integrate, the more likely people are to use it), as well as general maintenance when bugs or other issues come up.
You’ll want your error body to be well formed, and descriptive. This means telling the developer what happened, why it happened, and most importantly – how to fix it. You should avoid using generic or non-descriptive error messages such as:
- Your request could not be completed
- An error occurred
- Invalid request
Generic error messages are one of the biggest hindrances to API integration, as developers may struggle for hours trying to figure out why the call is failing, even misinterpreting the intent of the error message altogether. And eventually, if they can’t figure it out, they may stop trying altogether.
For example, I struggled for about 30 minutes with one API trying to figure out why I was getting a “This call is not allowed” error response. After repeatedly reformatting my request and trying different approaches, I finally called support (in an extremely frustrated mood) only to find out it was referring to my access token, which just so happened to be one letter off due to my inability to copy and paste such things.
Just the same, an “Invalid Access Token” response would have saved me a ton of hassle, and from feeling like a complete idiot while on the line with support. It would have also saved them valuable time working on real bugs, instead of trying to troubleshoot the most basic of steps (btw – whenever I get an error the key and token are the first things I check now).
Here are some more examples of descriptive error messages:
- Your API Key is Invalid, Generate a Valid API Key at http://…
- A User ID is required for this action. Read more at http://…
- Your JSON was not properly formed. See example JSON here: http://…
But you can go even further. Remember, you’ll want to tell the developer what happened, why it happened, and how to fix it. One of the best ways to do that is by responding with a standardized error format that returns a code (for support reference), the description of what happened, and a link to the appropriate documentation so that they can learn more and fix it:
{
    "error" : {
        "code" : "e3526",
        "message" : "Missing UserID",
        "description" : "A UserID is required to edit a user.",
        "link" : "http://docs.mysite.com/errors/e3526/"
    }
}
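One way to produce a body in this shape is sketched below in plain Java. The class `ApiError` and its hand-rolled `quote` helper are hypothetical and exist only to keep the example dependency-free; in a real project a JSON library would normally do the serialization.

```java
/** Hypothetical value class for the standardized error format shown above. */
final class ApiError {
    private final String code;
    private final String message;
    private final String description;
    private final String link;

    ApiError(String code, String message, String description, String link) {
        this.code = code;
        this.message = message;
        this.description = description;
        this.link = link;
    }

    /** Renders the error as a compact JSON object with a single "error" member. */
    String toJson() {
        return "{\"error\":{"
                + "\"code\":" + quote(code) + ","
                + "\"message\":" + quote(message) + ","
                + "\"description\":" + quote(description) + ","
                + "\"link\":" + quote(link)
                + "}}";
    }

    /** Wraps a string in double quotes and escapes characters JSON does not allow raw. */
    static String quote(String s) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.append('"').toString();
    }
}
```

Keeping the four fields in one class makes it harder for individual endpoints to drift away from the agreed format.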
On the support and development side, doing this also lets you track the hits to these pages to see which areas tend to be more troublesome for your users, allowing you to provide even better documentation and build a better API.
I have read that one of the main differences between POST and GET is that POST is not cached but GET is.
Could you explain what "cache" means here?
Also, whether I use POST or GET, the server sends me a response. Is there any difference? In both cases I have request data and a response, don't I?
Thanks
To cache (in the context of HTTP) means to store a page/response either on the client or on some intermediate host, perhaps in a content distribution network. When the client requests a page, the page can then be served from the client's cache (if the client requested it before) or from the intermediate host. This is faster and requires fewer resources than getting the page from the server that generated it.
One downside is that if the request changes some state on the server, that change won't happen if the page is served from a cache. This is why POST requests are usually not served from a cache.
Another downside to caching is that the cached copy may be out of date. The HTTP caching mechanisms try to prevent this.
The basic idea behind the GET and POST methods is that a GET message only retrieves information but never changes the state of the server. (Hence the name). As a result, just about any caching system will assume that you can remember the last GET response returned, and that the next one will look the same.
A POST, on the other hand, is a request that sends new information to the server. So not only can these not be cached (because there's no guarantee that the next POST won't modify things even more; think of +1 or like buttons, for example), but they actually have to invalidate parts of the cache because they might modify pages.
As a result, your browser for example will warn you when you try to refresh a page to which you POSTed information, because you might make changes you did not want made by doing so. When GETting a page, it will not do so because you cannot change anything on the site by doing so.
(Or rather; it's your job as a programmer to make sure that nothing changes when GETting a page.)
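The behaviour described above can be sketched as a toy cache in Java. This is only an illustration: the class `TinyHttpCache` and its `origin` function are hypothetical stand-ins, and real HTTP caches also honour freshness and validation headers, which are left out here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy cache: GET responses are remembered, any other method goes to the origin. */
final class TinyHttpCache {
    private final Map<String, String> store = new HashMap<>();
    // Hypothetical origin server: receives "METHOD url" and returns a response body.
    private final Function<String, String> origin;

    TinyHttpCache(Function<String, String> origin) {
        this.origin = origin;
    }

    String request(String method, String url) {
        if ("GET".equals(method)) {
            // Served from the cache when present, fetched and stored otherwise.
            return store.computeIfAbsent(url, u -> origin.apply("GET " + u));
        }
        // POST and friends may change state: drop the cached copy and always forward.
        store.remove(url);
        return origin.apply(method + " " + url);
    }
}
```

Two GETs for the same URL reach the origin once, while a POST always reaches it and forces the next GET to fetch a fresh copy.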
GET is supposed to return the same result from the server and not change anything on the server side, and is therefore safe and idempotent.
POST, by contrast, can modify something on the server (make an entry in the DB, delete something, etc.) and is therefore not idempotent.
As for caching, that has been addressed nicely here:
http://www.ebaytechblog.com/2012/08/20/caching-http-post-requests-and-responses/#.VGy9ovmUeeQ
I am working on a Java Twitter app (using the Twitter4J API). I have created the app and can view the current user's timeline, users' profiles, etc.
However, when using the app it seems to exceed the 150 requests an hour rate limit set on Twitter clients quite quickly (I know developers can increase this to 350 on given accounts, but that would not resolve it for other users).
Surely this is not affecting all clients. Any ideas as to how to get around this?
Does anyone know what counts as a request? For example, when I view a user's profile, I load the User object (Twitter4J) and then get the screen name, username, user description, user status, etc. to put into a JSON object. Would this be a single call to get the object, or would it be several, to include all the user.get... calls?
Thanks in advance
You really do need to keep track of what your current request count is when dealing with Twitter.
However, Twitter does not seem to decrement the count for 304 Not Modified (at least it didn't the last time I dealt with it), so make sure nothing is breaking your normal use of HTTP caching, and your practical requests-per-hour figure goes up.
Note that Twitter suffers from a bug in mod_gzip on Apache where the ETag is malformed when it is changed to reflect that the content-encoding differs from that of the non-gzipped entity (changing it is the Right Thing to do; there's just a bug in the implementation). Because of this, accepting gzipped content from Twitter means it'll never send a 304, which increases your request count and in many cases undermines the efficiency gains of using gzip.
Hence, if you are accepting gzip (your web library may do so by default; check with a tool like Fiddler. I'm a .NET guy with only a little Java knowledge, answering at the level of how Twitter deals with HTTP, so I don't know the details of Java web libraries), try turning that off and see if it improves things.
Almost every type of read from Twitter's servers (i.e. anything that calls HTTP GET) counts as a request. Getting user timelines, retweets, direct messages, getting user data all count as 1 request each. Pretty much the only Twitter API call that reads from the server without counting against your API limit is checking to see the rate limit status.
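Keeping track of the request count on the client side, as suggested above, could look roughly like the following Java sketch. The class `RequestBudget` is hypothetical and assumes a simple fixed window that starts when the object is created; Twitter reports its own reset time, so treat this as an approximation rather than a mirror of the server's accounting.

```java
import java.util.function.LongSupplier;

/** Hypothetical client-side counter for a fixed-window request limit. */
final class RequestBudget {
    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // injectable so the window can be tested
    private long windowStart;
    private int used;

    RequestBudget(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Starts a fresh window once the current one has elapsed. */
    private void rollWindowIfNeeded() {
        long now = clock.getAsLong();
        if (now - windowStart >= windowMillis) {
            windowStart = now;
            used = 0;
        }
    }

    /** Returns true and counts one request if the budget allows it. */
    synchronized boolean tryAcquire() {
        rollWindowIfNeeded();
        if (used >= limit) {
            return false;
        }
        used++;
        return true;
    }

    /** Number of requests still available in the current window. */
    synchronized int remaining() {
        rollWindowIfNeeded();
        return limit - used;
    }
}
```

A client could create `new RequestBudget(150, 3_600_000L, System::currentTimeMillis)` and call `tryAcquire()` before each API read, falling back to locally cached data when it returns false.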
I'm developing a web application, and facing some security problems.
In my app users can send messages and see others' messages (a bulletin-board-like app). I'm validating all the form fields that users can send to my app.
There are some very easy fields, like "nick name", which can be 6-10 alphabetical characters, or the message sending time, which is sent to the users as a string and which I then parse with SimpleDateFormat (when users ask for messages that are "younger" or "older" than a date). I'm developing in Java, but my question is not specific to Java.
The big problem is the message field. I can't restrict it to only alphabetical characters (upper or lowercase), because I have to deal with some frequently used characters like ",',/,{,} etc. (users would not be satisfied if the system didn't allow them to use these).
According to this http://ha.ckers.org/xss.html, there are a lot of ways people can "hack" my site. But I'm wondering, is there anything I can do to prevent that? Not everything, because there is no 100% protection, but I'd like a solution that can protect my site.
I'm using servlets on the server side, and jQuery, on the client side. My app is "full" AJAX, so users open 1 JSP, then all the data is downloaded and rendered by jQuery using JSON. (yeah, I know it's not "users-without-javascript" friendly, but it's 2010, right? :-) )
I know front end validation is not enough. I'd like to use 3 layer validation:
- 1. front end, javascript validate the data, then send to the server
- 2. server side, the same validation, if there is anything, that shouldn't be there (because of client side javascript), I BAN the user
- 3. if there is anything that I wasn't able to catch earlier, the rendering process handle and render appropriately
Is there any "out of the box" solution, especially for java? Or other solution that I can use?
To minimize XSS attacks, the important thing is to encode any field data before putting it back on the page, for example changing > to &gt; and so on. This prevents malicious code from executing when it is added to the page.
I think you are doing a lot of things right by whitelisting the data you expect for the different fields. Beyond that, for fields that must allow other, potentially problematic characters, encoding would fix the issue for you.
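The encoding step described above might look like this in Java. This is a minimal sketch with a hypothetical class name, `HtmlEscaper`, and it only covers text placed in HTML element content or quoted attribute values; other contexts such as JavaScript or URLs need their own encoding.

```java
/** Hypothetical helper that HTML-encodes text before it is written into a page. */
final class HtmlEscaper {
    /** Replaces the five characters that are significant in HTML markup. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;"); break;
                case '<':  out.append("&lt;"); break;
                case '>':  out.append("&gt;"); break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Because the ampersand is encoded too, text that already contains something like `&lt;` is displayed literally instead of being interpreted as markup.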
Further since you are using Ajax it gives you some protection as people cannot override values in URL parameters etc.
Look at the AntiSamy library. It allows you to define rulesets for your application, then run your user input through AntiSamy to clean it per your rules.
The easiest way is to do a simple replacement for the following:
< with &lt;
> with &gt;
' with \'
That will solve most database vulnerabilities.