regex to disallow access to parent directories - java

I need a regex, to be used on my server, that makes sure every file a user requests access to is under a specific directory. Let's name that directory UserFiles and assume that it is at the path /Server/Users/Bob/UserFiles.
When a client sends a request to read a file, I want to validate that the requested path is under /Bob/UserFiles/.
My idea is to make sure that the path always begins with /UserFiles/ and that there is no .. in the path (which would also protect me from restricted access like /UserFiles/../../noAccess.txt).
Examples of disallowed input:
C:/UserFiles/
../../Alice/txt.txt
/UserFiles/../../noAccess.txt
Examples of allowed input:
/UserFiles/UserFiles/Alice/txt.txt
/UserFiles/txt.txt
/UserFiles/Bob/Bob/txt.txt
I cannot think of any case where this wouldn't work. I also tried to build the regex, but it is not quite right because it allows inputs like /UserFiles//txt.txt (and it might allow other inputs it shouldn't that I am not aware of).
So is my idea complete, or are there other cases I haven't thought of? If my idea is complete, could you please help me fix my regex?
(?!\.\.)^\/UserFiles\/[/\w,\s-]+\.[A-Za-z]{3}$

How about resolving the path and checking only afterwards (note, the behaviour is OS-dependent):
new File(input).getCanonicalPath().startsWith("/UserFiles/")
Or, depending on how to interpret your question:
new File(input).getCanonicalPath().startsWith("/Server/Users/Bob/UserFiles/")
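A minimal sketch of the same "resolve first, check afterwards" idea using java.nio.file, assuming the request is resolved against the base directory. The class and method names (PathGuard, isUnder) are hypothetical. Note that normalize() only collapses "." and ".." textually and does not follow symbolic links, so this is not a substitute for getCanonicalPath() or toRealPath() where symlinks matter.

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class PathGuard {
    // Hypothetical helper: true if 'requested', resolved against 'baseDir',
    // still lies inside 'baseDir' once "." and ".." segments are collapsed.
    static boolean isUnder(String baseDir, String requested) {
        try {
            Path base = Paths.get(baseDir).toAbsolutePath().normalize();
            // An absolute 'requested' replaces the base entirely, so it only
            // passes if it already points inside the base directory.
            Path target = base.resolve(requested).normalize();
            // Path.startsWith compares whole name elements, so
            // "/Server/Users/Bob/UserFilesEvil" is not treated as inside.
            return target.startsWith(base);
        } catch (InvalidPathException e) {
            // Input that cannot be parsed as a path is rejected.
            return false;
        }
    }
}
```

For example, PathGuard.isUnder("/Server/Users/Bob/UserFiles", "../../Alice/txt.txt") rejects the request, while a plain "txt.txt" is accepted.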

Related

How to validate a filename in JAVA to resolve CWE ID 73(External Control of File Name or Path) using ESAPI?

I am facing this security flaw at multiple places in my project. I don't have a whitelist to check against at every occurrence of the flaw. I want to use an ESAPI call to perform a basic blacklist check on the file name. I have read that we can use ESAPI's SafeFile object but cannot figure out how and where.
Below are the options I came up with. Which one would work?
ESAPI.validator().getValidInput() or ESAPI.validator().getValidFileName()

Blacklists are a no-win scenario. This can only protect you against known threats. Any code scanning tool you use here will continue to report the vulnerability... because a blacklist is a vulnerability. See this note from OWASP:
This strategy, also known as "negative" or "blacklist" validation is a
weak alternative to positive validation. Essentially, if you don't
expect to see characters such as %3f or JavaScript or similar, reject
strings containing them. This is a dangerous strategy, because the set
of possible bad data is potentially infinite. Adopting this strategy
means that you will have to maintain the list of "known bad"
characters and patterns forever, and you will by definition have
incomplete protection.
Character encoding and the operating system also make this a problem. Let's say we accept an upload of a *.docx file. Here are the corner cases to consider, and this would be for every application in your portfolio:
1. Is the accepting application running on a Linux platform or an NT platform? (File separators are \ in Windows and / in Linux.)
   a. Spaces are also treated differently in file/directory paths across systems.
2. Does the application already account for URL-encoding?
3. Is the file being sent stored in a database or on the system itself?
4. Is the file you're receiving executable or not? For example, if I rename netcat.exe to foo.docx, does your application actually check whether the uploaded file contains the magic numbers of an exe file?
I can go on. But I won't. I could write an encyclopedia.
If this spans multiple applications in your company's portfolio, it is your ethical duty to state this clearly, and then your company needs to come up with an app-by-app whitelist.
As far as ESAPI is concerned, you would use Validator.getValidInput() with a regex that is an OR of all the files you want to reject, i.e. in validation.properties you'd do something like: Validator.blackListsAreABadIdea=regex1|regex2|regex3|regex4
Note that the parsing penalty for blacklists is higher too... every input string will have to be run against EVERY regex in your blacklist, which as OWASP points out, can be infinite.
So again, the correct solution is to have every application team in your portfolio construct a whitelist for their application. If this is really impossible (and I doubt that) then you need to make sure that you've stated the risks cited here clearly to management and you refuse to proceed with the blacklist approach until you have written documentation that the company chooses to accept the risk. This will protect you from legal liability when the blacklist fails and you're taken to court.
[EDIT]
The method you're looking for was called HTTPUtilities.safeFileUpload(), listed here as acceptance criteria, but it was most likely never implemented due to the difficulties I posted above. Blacklists are extremely custom to the application. The best you'll get is the method HTTPUtilities.getFileUploads(), which uses a list defined in ESAPI.properties under the key HttpUtilities.ApprovedUploadExtensions
However, the default list needs to be customized, as I doubt you want your users uploading .class and .dll files to your system.
Also note: This solution is a whitelist and NOT a blacklist.
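To illustrate the whitelist idea in plain Java (this is not ESAPI code), here is a minimal sketch. The class name UploadWhitelist and the approved extensions are assumptions made for the example; each application would define its own list. It checks only the name, not the file content, so the magic-number concern above still applies.

```java
import java.util.Locale;
import java.util.Set;

final class UploadWhitelist {
    // Assumed, application-specific list; not ESAPI's default configuration.
    private static final Set<String> APPROVED = Set.of("docx", "pdf", "txt");

    // Hypothetical helper: accepts a bare file name whose extension is approved.
    static boolean isApproved(String fileName) {
        if (fileName == null) {
            return false;
        }
        // Reject anything that carries a path component.
        if (fileName.contains("/") || fileName.contains("\\")) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // Require a non-empty base name and a non-empty extension.
        if (dot <= 0 || dot == fileName.length() - 1) {
            return false;
        }
        String extension = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return APPROVED.contains(extension);
    }
}
```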

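The following code snippet gets past CWE ID 73 when the directory path is static and only the filename is externally controlled:
// WildcardFileFilter comes from Apache Commons IO.
import java.io.File;
import java.io.FileFilter;
import org.apache.commons.io.filefilter.WildcardFileFilter;

// 'DIRECTORY_PATH' is the fixed directory of the file
// 'filename' holds the externally supplied name of the file
// 'myFile' holds the reference to the file object
File dir = new File(DIRECTORY_PATH);
FileFilter fileFilter = new WildcardFileFilter(filename);
File[] files = dir.listFiles(fileFilter); // null if 'dir' cannot be listed
File myFile = null;
// Accept the file only if exactly one entry in the directory matches.
if (files != null && files.length == 1) {
    myFile = files[0];
}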

Java find where a class is used in the code - programmatically

I have a List of classes which I can iterate through. Using Java is there a way of finding out where these classes are used so that I can write it out to a report?
I know that I can find out using 'References' in Eclipse but there are too many to be able to do this manually. So I need to be able to do this programmatically. Can anyone give me any pointers please?
Edit:
This is static analysis and part of creating a bigger traceability report for non-technical people. I have comprehensive Javadocs, but they are not 'friendly' and they work in the opposite direction to how I need the report. Javadocs start from the package and work downwards, whereas I need to start at the variable level and work upwards, if that makes any sense.

You could try to add a stacktrace dump somewhere in the class that isolates the specific case you are looking for.
public void someMethodInMyClass()
{
    if (conditions_are_met_to_identify)
    {
        Thread.dumpStack();
    }
    // ... original code here
}

You may have to scan all the sources and check the import statements. (Take care of the * imports: you have to set up your scanner for both the fully qualified class name and its packagename.*)
EDIT: It would be great to use the eclipse search engine for this. Perhaps here is the answer

Still another approach (probably not complete):
Search Google for 'java recursively list directories and files' and get source code that will recursively list all the *.java file path/names in a project.
For each file in the list:
1: See if the file path/name is in the list of fully qualified file names you are interested in. If so, record its path/name as a match.
2: Regardless of whether it's a match or not, open the file and copy its content to a List collection. Iterate through the content list and see if the class name is present. If found, determine its path by checking whether it's in the same package as the file you are currently examining. If so, you have a match. If not, you need to extract the paths from the import statements, prepend them to the class name, and see if the result exists in your recursive list of file path/names. If it is still not found, add it to a 'not found' list (including the line number it was found on) so you can manually check why it was not identified.
3: Add all matches to a 'found match' list. Examine the list to ensure it looks correct.
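The scanning part of the steps above can be sketched as follows. This is a simplified, purely textual pass under stated assumptions: the class and method names (UsageScanner, findMentions) are hypothetical, it matches the simple class name as a whole word, and it does not resolve packages or imports, so it will also report mentions in comments and string literals.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class UsageScanner {
    // Returns "relative/path/File.java:lineNumber" for every line under 'root'
    // that mentions 'simpleClassName' as a whole word.
    static List<String> findMentions(Path root, String simpleClassName) throws IOException {
        Pattern word = Pattern.compile("\\b" + Pattern.quote(simpleClassName) + "\\b");

        // Recursively collect all *.java files, in a stable order.
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk.filter(Files::isRegularFile)
                          .filter(p -> p.toString().endsWith(".java"))
                          .sorted()
                          .collect(Collectors.toList());
        }

        List<String> hits = new ArrayList<>();
        for (Path source : sources) {
            // Assumes the sources are UTF-8 encoded.
            List<String> lines = Files.readAllLines(source, StandardCharsets.UTF_8);
            for (int i = 0; i < lines.size(); i++) {
                if (word.matcher(lines.get(i)).find()) {
                    hits.add(root.relativize(source) + ":" + (i + 1));
                }
            }
        }
        return hits;
    }
}
```

The resulting list can be written straight into a report; the import and package checks from step 2 would be layered on top to weed out false matches.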

Not sure what you are trying to do, but in case you want to analyse code at runtime, I would use an out-of-the-box profiler that shows you what is loaded and what is allocated.
#Open source profilers: Open Source Java Profilers
On the other hand, if you want to do this yourself (During runtime) you can write your own custom profiler:
How to write a profiler?
You might also find this one useful (Although not exactly what you want):
How can I list all classes loaded in a specific class loader
http://docs.oracle.com/javase/7/docs/api/java/lang/instrument/Instrumentation.html
If what you are looking for is just a way to examine your code base, there are really good tools out there as well.
#see http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis

Can I automatically refactor an entire java project and rename uppercase method parameters to lowercase?

I'm working in a java project where a big part of the code was written with a formatting style that I don't like (and is also non standard), namely all method parameters are in uppercase (and also all local variables).
In IntelliJ I am able to use "Analyze -> Inspect Code" and actually find all occurrences of uppercase method parameters (over 1000).
To fix one occurrence I can do "refactor > rename parameter" and it works fine (let's assume there is no overlapping).
Is there a way to do this refactoring automatically (e.g. rename every method parameter starting with an uppercase letter to the same name starting with a lowercase letter)?

Use a Source Parser
I think what you need to do is use a source code parser like javaparser to do this.
For every java source file, parse it to a CompilationUnit, create a Visitor, probably using ModifierVisitorAdapter as base class, and override (at least) visit(MethodDeclaration, arg). Then write the changed CompilationUnit to a new File and do a diff afterwards.
I would advise against changing the original source files, but creating a shadow file tree may be a good idea (e.g. old file: src/main/java/com/mycompany/MyClass.java, new file: src/main/refactored/com/mycompany/MyClass.java; that way you can diff the entire directories).

I'd advise that you think about a few things before you do anything:
If this is a team effort, inform your team.
If this is for an employer, inform your boss.
If this is checked into a version control system, realize that you'll have diffs coming out the wazoo.
If it's not checked into a version control system, check it in.
Take a backup before you make any changes.
See if you have some tests to check before & after behavior hasn't changed.
This is a dangerous refactoring. Be careful.

I am not aware of any out-of-the-box IDE support for this kind of bulk refactoring, even though most IDEs support the single rename refactoring (which is used regularly). You may need to write an IDE plugin that walks the source code (AST) and invokes the rename refactoring behind the scenes for every parameter name matching that format.

I have done a lot of such refactorings on a rather large scale of files, using TextPad or WildPad, and a bunch of reg-ex replace-all. Always worked for me!
I'm confident that if the code is first formatted using an IDE like Eclipse (if it is not already properly formatted), and a reg-ex covering the method signature (scope, return type, name, bracket, argument list, bracket) can be devised, your job will be done in seconds with these tools. You might need more than one set of replace-all reg-exes.
The only time-consuming activity would be coming up with such a set of reg-exes.
Hope this helps!

What is the best open source java package to filter/match URLs?

I have a high-performance application which deals with URLs. For every URL it needs to retrieve the appropriate settings from a predefined pool. Every settings object is associated with a URL pattern which indicates which URLs should use these settings. The matching rules are as follows:
"google.com" match pattern should match all URLs pointing to the google domain (thus, maps.google.com and www.google.com/match are matched).
"*.google.com" should match all URLs pointing to a subdomain of google.com (thus, maps.google.com matches, but google.com and www.google.com don't).
"maps.google.com" should match all URLs pointing to this specific subdomain.
Apart from the above rules, every match rule can contain a path, which means that the path part of the URL should start with the match rule path. So: "*.google.com/maps" matches "maps.google.com/maps" but not "maps.google.com/advanced".
As you can see the rules above are overlapping. In the case two rules exist which match the same URL the most specific should apply. The list above is ranked from least specific to most specific.
This seems to be such a standard problem that I was hoping to use a ready-made library rather than write it myself. Google reveals a couple of options, but without a clear way to choose between them. What would you recommend as a good library for this task?
Thanks,
Boaz

I don't think you need a specific library to solve this; the standard Java API has all that you need to write the code without too much work.
Take a look at java.util.regex.Pattern and work out the regular expressions you need to match each of your rules. You might also want to use java.net.URL to parse out the different fields from the URL.
You already said you have a priority scheme to handle scenarios where multiple patterns match the URL, so that should be the last piece for this puzzle.
It looks like a pretty straight-forward task.
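As a rough sketch of how little code this takes, the class below implements the rules from the question with java.net.URI and plain string comparison rather than regular expressions. The class name UrlRule and the specificity scoring are assumptions made for the example (more host labels win, an exact host beats a wildcard, a longer path breaks ties), and the leading "www." is stripped because the question treats www.google.com as the bare domain. The path check is a simple prefix test, so "/maps" also matches "/mapsfoo".

```java
import java.net.URI;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class UrlRule {
    final String host;      // host part of the rule, without any "*." prefix
    final boolean wildcard; // true for rules of the form "*.example.com"
    final String path;      // "" when the rule has no path

    UrlRule(String rule) {
        int slash = rule.indexOf('/');
        String hostPart = slash < 0 ? rule : rule.substring(0, slash);
        this.path = slash < 0 ? "" : rule.substring(slash);
        this.wildcard = hostPart.startsWith("*.");
        this.host = (wildcard ? hostPart.substring(2) : hostPart).toLowerCase(Locale.ROOT);
    }

    boolean matches(String url) {
        // Add a scheme when missing so that URI can split host and path.
        URI uri = URI.create(url.contains("://") ? url : "http://" + url);
        if (uri.getHost() == null) {
            return false;
        }
        String h = uri.getHost().toLowerCase(Locale.ROOT);
        if (h.startsWith("www.")) {
            h = h.substring(4); // the question treats www as the bare domain
        }
        boolean isSubdomain = h.endsWith("." + host);
        boolean hostOk = wildcard ? isSubdomain : (isSubdomain || h.equals(host));
        String p = uri.getPath() == null ? "" : uri.getPath();
        return hostOk && p.startsWith(path);
    }

    // Assumed ranking: more host labels win, exact beats wildcard,
    // and a longer path breaks ties.
    int specificity() {
        int labels = host.split("\\.").length + (wildcard ? 1 : 0);
        return (labels * 2 - (wildcard ? 1 : 0)) * 1000 + path.length();
    }

    // Picks the most specific rule that matches the URL, if any.
    static Optional<UrlRule> best(List<UrlRule> rules, String url) {
        return rules.stream()
                .filter(r -> r.matches(url))
                .max(Comparator.comparingInt(UrlRule::specificity));
    }
}
```

A linear scan like this is fine for a small pool of rules; for a large pool, indexing the rules by host suffix would avoid testing every rule against every URL.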

How to retrieve forbidden characters for filenames, in Java?

There are some restricted characters (and even full filenames, in Windows), for file and directory names. This other question covers them already.
Is there a way, in Java, to retrieve this list of forbidden characters, which would be depending on the system (a bit like retrieving the line breaker chars)? Or can I only put the list myself, checking for the system?
Edit: More background on my particular situation, aside from the general question.
I use a default name, coming from some data (no real control over their content), and this name is given to a JFileChooser, as a default file name to use (with setSelectedFile()). However, this one truncates anything prior to the last invalid character.
These default names occasionally end with dates in a "mm/dd/yy" format, which leaves only the "yy", in the default name, because "/" are forbidden. As such, checking for Exceptions is not really an option there, because the file itself is not even created yet.
Edit bis: Hmm, that makes me think, if JFileChooser is truncating the name, it probably has access to a list of such characters, can be interesting to check that further.
Edit ter: Ok, checking sources from JFileChooser shows something completely simple. For the text field, it uses file.getName(). It doesn't actually check for invalid characters, it's simply that it takes the "/" as a path separator, and keeps only the end, the "actual filename". Other forbidden characters actually go through.

When it comes to dealing with "forbidden" characters I'd rather be overcautious and ban/replace all "special" characters that may cause a problem on any filesystem.
Even if technically allowed, sometimes those characters can cause weirdness.
For example, we had an issue where PDF files were being written (successfully) to a SAN, but when served via a web server from that location, some of the characters would cause issues when we embedded the PDF in an HTML page rendered in Firefox. It was fine if the PDF was accessed directly, and it was fine in other browsers. It was some weird error in how Firefox and Adobe Reader interact.
Summary: "Special" characters in file names -> weird errors waiting to happen
Ultimately, the only way to be sure is to use a white-list.
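A minimal sketch of such a white-list for the default-name case from the question. The class name FileNameSanitizer, the allowed character set (ASCII letters, digits, dot, underscore, hyphen) and the "_" replacement are assumptions made for the example, not a list provided by Java or by any particular file system.

```java
final class FileNameSanitizer {
    // Replaces every character outside the assumed white-list with '_'.
    static String sanitize(String name) {
        String cleaned = name.replaceAll("[^A-Za-z0-9._-]", "_");
        // Avoid returning an empty name or one made only of dots ("." or "..").
        if (cleaned.matches("\\.*")) {
            return "_";
        }
        return cleaned;
    }
}
```

With this, a default name ending in a "mm/dd/yy" date such as "report 01/02/13.pdf" becomes "report_01_02_13.pdf" before it is handed to setSelectedFile().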

Having certain "forbidden characters" is just one of many things that can go wrong when creating a file (others are access rights and file and path name lengths).
It doesn't really make sense to try and catch some of these early when there are others you can't catch until you actually try to create the file. Just handle the exceptions properly.

Have you tried using File.getCanonicalPath and comparing it to the original file name (or whatever is retrieved from getAbsolutePath)?
This will not give you the actual characters, but it may help you in determining whether this is a valid filename in the OS you're running on.

Have a look at this link for some info on how to get the OS the application is running on. Basically you need to use System.getProperty("os.name") and do an equals() or contains() to find out the operating system.
Something to be wary of, though, is that knowing the OS does not necessarily tell you the underlying file system being used; for example, a Mac can read and write to a FAT32 file system.
source: http://www.mkyong.com/java/how-to-detect-os-in-java-systemgetpropertyosname/
