how to protect against Null Byte Injection in a java webapp


How can null byte injection be done on a Java webapp, or rather, how does one protect against it?
Should I look at each byte of the request parameter and check whether its value is 0? I can't imagine a 0 byte sneaking into a request parameter... can it?
My main aim is to make sure the filename used for saving the file is safe enough. For now, I am not looking for answers that recommend (for example) replacing ALL non-word characters with underscores.

Allowing the user to store files with arbitrary names is dangerous. What happens if the user provides "../../../WINDOWS/explorer.exe"? You should restrict filenames to only contain characters known to be harmless.
'\0' is not known to be harmless. As far as Java is concerned, '\0' is a character like any other. However, the operating system is likely to interpret '\0' as the end of a string. If a string is passed from Java to the operating system, that different interpretation could result in exploitable bugs. Consider:
if (filename.endsWith(".txt")) {
    store(filename, data);
}
where filename is "C:\Windows\explorer.exe\0.txt", which ends with ".txt" to Java, but with ".exe" to the operating system.
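A minimal sketch of one way to guard against both problems described above (path traversal and an embedded '\0'), assuming uploads are meant to land inside a single base directory. The class name UploadPaths and the method resolveInside are made up for illustration, and the check does not follow symbolic links, so treat it as a starting point rather than a complete defence:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class UploadPaths {
    /**
     * Resolves a user-supplied filename against baseDir and rejects it if it
     * contains a NUL character or would end up outside baseDir.
     * (Hypothetical helper, not part of any library.)
     */
    static Path resolveInside(Path baseDir, String filename) {
        if (filename == null || filename.isEmpty()) {
            throw new IllegalArgumentException("empty filename");
        }
        // Explicit NUL check, so the result does not depend on the JDK version.
        if (filename.indexOf('\0') >= 0) {
            throw new IllegalArgumentException("NUL character in filename");
        }
        Path base = baseDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." so the containment test is meaningful.
        Path target = base.resolve(filename).normalize();
        if (target.equals(base) || !target.startsWith(base)) {
            throw new IllegalArgumentException("filename escapes the upload directory");
        }
        return target;
    }

    public static void main(String[] args) {
        Path base = Paths.get("uploads");
        System.out.println(resolveInside(base, "report.txt"));
    }
}
```

This keeps the original filename where it is acceptable instead of rewriting characters, which seems closer to what the question asks for. An allow-list of characters, as recommended above, can be layered on top.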

I'm not sure why you're concerned with null byte injection. Java isn't like C/C++, where strings are null-terminated character arrays.
You ought to bind and validate parameters and values coming in from the web tier. How do you define "safe enough"?

You have 2 choices:
1 Scan the string for null characters (for example, convert it to a char array first and check each char).
2 Upgrade to Java 8 or Java 7u40 and you are protected. (Yes, I tested it, and it works!)
In May 2013 Oracle fixed the problem: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8014846
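If you want to confirm that the JDK you deploy on already contains the fix, a small probe like the following may help. It is only a sketch: the class and method names are made up, and it relies on the post-fix behaviour as I understand it, where java.io.File refuses to canonicalize a path containing '\0' and java.nio.file.Paths.get rejects such a path outright. Neither call touches the file system in a way that creates anything.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

final class NulPathProbe {
    /** True if java.io.File refuses to canonicalize the given name. */
    static boolean fileRejects(String name) {
        try {
            new File(name).getCanonicalPath();
            return false; // the name was accepted
        } catch (IOException e) {
            return true;  // fixed JDKs report "Invalid file path" here
        }
    }

    /** True if java.nio.file refuses to build a Path from the given name. */
    static boolean nioRejects(String name) {
        try {
            Paths.get(name);
            return false;
        } catch (InvalidPathException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String suspicious = "explorer.exe\0.txt";
        System.out.println("java.io rejects NUL:  " + fileRejects(suspicious));
        System.out.println("java.nio rejects NUL: " + nioRejects(suspicious));
    }
}
```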

Null byte injection in filenames was fixed in Java 7 update 40 (released around Sept. 2013). So it's been fixed for a while now, but it WAS a problem for over a decade and it was a NASTY vulnerability in Java. The fix is documented here: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8014846
-Dave Wichers

Related

Why are the ModuleEntry32 text values from WinAPI displaying as Chinese characters?

I have been using the JNA (Java Native Access) library to access the memory of processes. I have been writing some code to enumerate through all modules of a process, and the struct MODULEENTRY32 is obtained properly - I am getting their handles and base addresses properly. However, the "String" values szModule and szExePath (which are char arrays) that are returned give me random Chinese characters.
JNA provides helper classes for structs such as MODULEENTRY32 (they call it MODULEENTRY32W) for using functions such as Module32First and Module32Next, which I've been using. They have sort of their own toString method for szModule and szExePath, and those return the random Chinese chars as well. I have tried to encode/decode it myself, and would get close to the "right" values (encoding to UTF-16, then decoding to ISO) but it still is a bit off - as in I can't use equals/equalsIgnoreCase to compare it with another String.
Below is roughly an example of what I am getting when printing out szModule and szExePath in the format szModule:szExePath returned from the Module32First/Module32Next calls:
瑮汤⹬汤l: 瑮汤⹬汤l
䕋乒䱅㈳䐮䱌: 䕋乒䱅㈳䐮䱌
䕋乒䱅䅂䕓搮汬: 䕋乒䱅䅂䕓搮汬
档潲敭敟晬搮汬: 档潲敭敟晬搮汬
䕖卒佉⹎汤l: 䕖卒佉⹎汤l
獭捶瑲搮汬: 獭捶瑲搮汬
And here is roughly how I am enumerating:
// hSnapshot is valid, and I already called "Module32First" - this loops through any other modules
while (this.moduleBaseAddr == null && this.moduleHandle == null) {
    Tlhelp32.MODULEENTRY32W currentModuleEntry32 = new Tlhelp32.MODULEENTRY32W();
    if (this.kernel32.Module32Next(hSnapshot, currentModuleEntry32)) {
        currentModuleEntry32.read();
        String currentModuleName = currentModuleEntry32.szModule();
        System.out.println(currentModuleName + ": " + currentModuleEntry32.szModule());
        if (currentModuleName.equals(MODULE_NAME)) {
            this.moduleBaseAddr = currentModuleEntry32.modBaseAddr;
            this.moduleHandle = currentModuleEntry32.hModule.getPointer();
            break;
        }
    } else {
        break;
    }
}
Does anyone have any insight on solving this issue?

You are mixing ANSI function mappings and Unicode structure mappings.
Most Windows functions have two versions of the function, one ending in A and one in W, with comments in the documentation. For example, CreateProcess has two versions, CreateProcessA and CreateProcessW, where the documentation states:
The processthreadsapi.h header defines CreateProcess as an alias which automatically selects the ANSI or Unicode version of this function based on the definition of the UNICODE preprocessor constant. Mixing usage of the encoding-neutral alias with code that not encoding-neutral can lead to mismatches that result in compilation or runtime errors. For more information, see Conventions for Function Prototypes.
That link states:
New Windows applications should use Unicode to avoid the inconsistencies of varied code pages and for ease of localization.
Unfortunately, Module32First and Module32Next do not follow the usual SDK naming convention. There is no -A suffixed version of these functions; the unsuffixed names are the ANSI versions, so the mapping you have created is ANSI (really ASCII). The bytes behind the first line of the output in your question are 6e74646c6c2e646c6c3a206e74646c6c2e646c6c, which in ASCII or UTF-8 decode to ntdll.dll: ntdll.dll. Because you are using the MODULEENTRY32W (Unicode) structure mapping, these bytes are interpreted as UTF-16, resulting in the characters you are seeing in your output.
The Unicode mappings are Module32FirstW and Module32NextW, and those are the functions you should be using. These are mapped in JNA's Kernel32 class. I highly recommend you use the JNA mappings rather than reinventing the wheel.
Incidentally, JNA's Kernel32Util class already handles all of this and offers a List<Tlhelp32.MODULEENTRY32W> getModules(int processID) method using the correct mappings, that you may find useful.
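To see the mismatch without any native code, here is a small pure-Java sketch that reproduces it. It assumes the native side writes single-byte ASCII into a zero-filled 256-byte buffer (the buffer size and the class and method names are illustrative, not taken from JNA), and then reads that buffer as little-endian UTF-16, which is effectively what the W structure mapping does:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class WideNarrowMismatch {
    /** Reads ASCII bytes as if they were UTF-16LE, stopping at the first NUL char. */
    static String misreadAsUtf16(String ansiText) {
        // Zero-filled buffer, like a char array field in a native struct.
        byte[] buffer = Arrays.copyOf(ansiText.getBytes(StandardCharsets.US_ASCII), 256);
        String wide = new String(buffer, StandardCharsets.UTF_16LE);
        int end = wide.indexOf('\0');
        return end >= 0 ? wide.substring(0, end) : wide;
    }

    /** Reverses the damage: back to raw bytes, then decode as ASCII. */
    static String recover(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.UTF_16LE);
        String text = new String(raw, StandardCharsets.US_ASCII);
        int end = text.indexOf('\0');
        return end >= 0 ? text.substring(0, end) : text;
    }

    public static void main(String[] args) {
        String garbled = misreadAsUtf16("ntdll.dll");
        System.out.println(garbled + " -> " + recover(garbled));
    }
}
```

Recovering the text this way only demonstrates the cause. Note that it uses UTF-16LE explicitly; plain "UTF-16" in Java encodes big-endian with a byte order mark, which would explain why the manual re-encoding attempt came out "a bit off". The real fix is still to call the W functions with the W structure.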

EBCDIC unpacking comp-3 data returns 40404** in Java

I have used the unpacking logic for Java given in the question linked below:
How to unpack COMP-3 digits using Java?
But for null data in the source, the Java unpack code returns a value like 404040404. I understand that 0x40 is a space in EBCDIC, but how do I handle or avoid these spaces when unpacking?

There are two problems that we have to deal with. First, is the data valid comp-3 data and second, is the data considered “valid” by older language implementations like COBOL since Comp-3 was mentioned.
If the offsets are not misaligned, it would appear that spaces are being interpreted by existing programs as 0 instead of spaces. This would be incorrect but could be an artifact of older programs that were engineered to tolerate this bad behaviour.
The approach I would take in a legacy shop (assuming no misalignment) is to consider “spaces” (which are sequences of 0x404040404040) as being zero. This would be a legacy check that compares the field with spaces and, on a match, assumes 0x00000000000f as the actual default. This is something an individual shop would have to determine and is not recognized as a general programming approach.
In terms of Java, one has to remember that bytes are “signed”, so comparisons can be tricky depending on how the code is written. The only “unsigned” data type I recall in Java is char, which is really two bytes (basically a uint16).
This is less of a programming problem than it is recognizing historical tolerance and remediation.
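As a sketch of that legacy check in Java: the unpacker below treats a field consisting entirely of 0x40 bytes as zero and otherwise decodes the packed digits strictly. The class name, the method names, and the choice to throw on invalid nibbles are my own assumptions, not something from the linked unpack code, and whether "all spaces means zero" is acceptable is a decision for the individual shop. Masking with 0xFF avoids the signed-byte pitfall mentioned above.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

final class Comp3Unpacker {
    private static final int EBCDIC_SPACE = 0x40;

    /** True if every byte of the field is an EBCDIC space (0x40). */
    static boolean isAllSpaces(byte[] field) {
        if (field.length == 0) {
            return false;
        }
        for (byte b : field) {
            if ((b & 0xFF) != EBCDIC_SPACE) { // mask: Java bytes are signed
                return false;
            }
        }
        return true;
    }

    /** Unpacks a COMP-3 field; an all-spaces field is treated as zero (shop-specific rule). */
    static BigDecimal unpack(byte[] field, int scale) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty field");
        }
        if (isAllSpaces(field)) {
            return BigDecimal.ZERO.setScale(scale);
        }
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < field.length - 1; i++) {
            appendDigit(digits, (field[i] & 0xF0) >>> 4);
            appendDigit(digits, field[i] & 0x0F);
        }
        int last = field[field.length - 1] & 0xFF;
        appendDigit(digits, last >>> 4);
        int sign = last & 0x0F; // the low nibble of the last byte is the sign
        if (sign < 0x0A) {
            throw new IllegalArgumentException("invalid sign nibble: " + sign);
        }
        BigDecimal value = new BigDecimal(new BigInteger(digits.toString()), scale);
        return (sign == 0x0D || sign == 0x0B) ? value.negate() : value;
    }

    private static void appendDigit(StringBuilder sb, int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("invalid digit nibble: " + nibble);
        }
        sb.append((char) ('0' + nibble));
    }
}
```

Without the all-spaces check, a lenient unpacker reads the nibbles of 0x40 0x40 0x40 as the digits 4, 0, 4, 0, 4, which is where the 40404... values come from.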

how to create our own O(1) substring function in java as it was in jdk 6.

How can we create our own O(1) substring function in Java, as it was in JDK 6? Is there any way to use the JDK 6 substring() on later versions of the JDK?

Substring was O(1) because the underlying character array of the string could be shared between objects. Hence substring simply required creating an object with a pointer to the original string along with an offset and length. There was no copying of the actual data itself, which had the annoying effect that taking a small substring of a huge string, then deleting the huge one, didn't actually free up memory. This led to code such as:
String newstr = new String(oldStr.substring(5,9));
rather than the more sensible-looking:
String newstr = oldStr.substring(5,9);
Since strings no longer share data (Update 6 of Java 7 is where I think this happened), that's not possible so, if you want to get back that O(1) performance, you'll basically have to construct your own string class to do it.
Just be aware that you may be worrying about something that's not so important. Unless your strings are very large, the extra cost (in space and time) of copying the data for them may be inconsequential.
And the extra effort in converting your O1String into String for every function that needs the latter, as well as the less than perfect integration with literal strings, may well make it even worse.
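For what it's worth, here is a minimal sketch of what such a class could look like. It is a read-only view implementing CharSequence rather than a full string replacement, and the name StringView is made up. Creating a view or a sub-view is O(1); toString() copies, which is exactly the conversion cost mentioned above. Like the old JDK 6 behaviour, a small view keeps the whole source string alive.

```java
final class StringView implements CharSequence {
    private final String source; // shared, never copied
    private final int offset;
    private final int length;

    StringView(String source, int start, int end) {
        if (start < 0 || end > source.length() || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        this.source = source;
        this.offset = start;
        this.length = end - start;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        return source.charAt(offset + index);
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start=" + start + ", end=" + end);
        }
        // O(1): a new view over the same source string.
        return new StringView(source, offset + start, offset + end);
    }

    @Override
    public String toString() {
        // This is the only place where characters are copied.
        return source.substring(offset, offset + length);
    }

    public static void main(String[] args) {
        CharSequence view = new StringView("hello world", 6, 11);
        System.out.println(view + " / " + view.subSequence(1, 3));
    }
}
```

Many APIs accept a CharSequence directly (for example StringBuilder.append and java.util.regex.Pattern.matcher), so a view like this can sometimes be used without converting back to String.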

Here you can view how it was implemented in Java 6
Open JDK

Internal character encoding of Java 7

So far as I know, when the JRE executes a Java application,
the string will be seen as a UCS-2 byte array internally.
In Wikipedia, the following content can be found.
Java originally used UCS-2, and added UTF-16 supplementary character support in J2SE 5.0.
With the new release version of Java (Java 7),
what is its internal character encoding?
Is there any possibility that Java will start to use UCS-4 internally?

Java 7 still uses UTF-16 internally (read the last section of the Charset Javadoc), and it's very unlikely that this will change to UCS-4. I'll give you two reasons for that:
Changing from UCS-2 to UCS-4 would most likely mean that they would have to change the char primitive from a 16-bit type to a 32-bit type. Given how highly Sun/Oracle have valued backwards compatibility in the past, a change like this is very unlikely.
A UCS-4 string takes a lot more memory than a UTF-16 encoded string for most use cases.

Q: So far as I know, when the JRE executes a Java application, the string
will be seen as a (16-bit Unicode) byte array
A: Yes
Q: With the new release version of Java (Java 7), what is its
internal character encoding?
A: Same
Q: Is there any possibility that Java will start to use UCS-4 internally?
A: I haven't heard anything of the kind
However, you can use "code points" to work with full 32-bit Unicode characters (including supplementary characters) in Java 5 and higher:
http://www.ibm.com/developerworks/java/library/j-unicode/
http://jcp.org/en/jsr/detail?id=204
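A short sketch of what the code point API looks like in practice, using only methods that have been on String and Character since Java 5. The class and method names are made up for illustration. A supplementary character such as U+1F600 occupies two char values (a surrogate pair) in the UTF-16 string but counts as one code point:

```java
final class CodePointDemo {
    /** Returns the Unicode code points of s, combining surrogate pairs. */
    static int[] codePointsOf(String s) {
        int[] result = new int[s.codePointCount(0, s.length())];
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            result[n++] = cp;
            i += Character.charCount(cp); // 1 for BMP characters, 2 for supplementary ones
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "a" + new String(Character.toChars(0x1F600));
        System.out.println("char count:       " + s.length());
        System.out.println("code point count: " + codePointsOf(s).length);
        System.out.println("second code point: U+"
                + Integer.toHexString(codePointsOf(s)[1]).toUpperCase());
    }
}
```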

Null Byte Injection check in Java app or What is a Null Byte

How does one protect against null byte injection?
See: https://security.stackexchange.com/q/378
If a request parameter is going to be used as a filename, should we look at each byte of the request parameter and check whether its value is 0?

You have 3 choices:
1 Scan the Java string for null characters (for example, convert it to a char array first and check each char).
2 Sanitize user input (i.e. check for '%00', which is a URL-encoded null byte). But beware! Attackers use different encodings, so #1 is safer!
3 Upgrade to Java 8 or Java 7u40 and you are protected. (Yes, I tested it, and it works!)
In May 2013 Oracle fixed the problem: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=8014846
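To illustrate why choice 1 is more robust than choice 2, here is a small sketch (the class and method names are made up). Once the parameter has been URL-decoded, '%00' becomes an ordinary '\0' character inside the Java string, and a plain indexOf on the decoded value finds it however it was encoded on the wire. In a servlet container the value returned by getParameter is normally already decoded, so the explicit URLDecoder call here only stands in for that step.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class NulScan {
    /** True if the string contains a NUL character anywhere. */
    static boolean containsNul(String s) {
        return s.indexOf('\0') >= 0; // no need to convert to a char array
    }

    /** URL-decodes a raw parameter value as UTF-8. */
    static String decode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw = "explorer.exe%00.txt";
        String decoded = decode(raw);
        System.out.println("ends with .txt: " + decoded.endsWith(".txt"));
        System.out.println("contains NUL:   " + containsNul(decoded));
    }
}
```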
