Customising log4j logging for sensitive data - java

I have a class which contains sensitive information (Credit card info, phone numbers etc).
I want to be able to pass this class to log4j, but have it obscure certain information.
If I have a class UserInformation which has getPhoneNumber and getCreditCardNumber methods, how would I customise log4j or this class so that it obscures the numbers correctly?
I want the credit card number to be output as xxxx-xxxx-xxxx-1234 and the phone number to be output as xxxx-xxx-xxx, given that these would be 1234-1234-1234-1234 and 1234-567-890.
Thanks

You could try to implement this by writing a custom log record formatter that obscures those patterns. But I think that is a bit dodgy ... because someone could accidentally or deliberately circumvent this by tweaking the logger configuration files, etc.
I think it would be a better idea to do one of the following, depending on how you are assembling the log messages:
Change the logger calls in your code to assemble the log messages using alternative getter methods on UserInformation that obscure the sensitive fields.
Change the toString method on UserInformation to obscure the details.
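As a minimal sketch of the second option, assuming the numbers are held as dash-separated strings in the formats given in the question, toString can do the masking itself. The mask helper and the field layout are hypothetical and are not part of log4j:

```java
/** Hypothetical version of the question's UserInformation with a log-safe toString. */
final class UserInformation {
    private final String creditCardNumber;
    private final String phoneNumber;

    UserInformation(String creditCardNumber, String phoneNumber) {
        this.creditCardNumber = creditCardNumber;
        this.phoneNumber = phoneNumber;
    }

    String getCreditCardNumber() { return creditCardNumber; }

    String getPhoneNumber() { return phoneNumber; }

    /** Replaces digits with 'x', optionally keeping the last dash-separated group readable. */
    static String mask(String value, boolean keepLastGroup) {
        int dash = value.lastIndexOf('-');
        // Without a dash there is no "last group", so everything is masked.
        int cut = (keepLastGroup && dash >= 0) ? dash + 1 : value.length();
        return value.substring(0, cut).replaceAll("\\d", "x") + value.substring(cut);
    }

    @Override
    public String toString() {
        return "UserInformation[card=" + mask(creditCardNumber, true)
                + ", phone=" + mask(phoneNumber, false) + "]";
    }
}
```

Any logger call that is handed the object, or that concatenates it into a message, then only ever sees the masked form, regardless of how the logger is configured.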

I'd write an obfuscating formatter for those fields and use that to write to the log file.
I'd also ask why you would continue to use plain Strings instead of objects that could encapsulate the appropriate behavior.

Update: The best option is probably to wrap your real objects in an Obfuscated-ClassName wrapper that implements the same interface but returns obfuscated versions (by delegating to the real object and obfuscating the result) and hand those to the logging system. This only works if you are actually passing in these objects yourself, and not if they are part of an object tree - that might make the whole situation a bit more complex.
old:
Maybe you should just add getPhoneNumberForLogging()/getObfuscatedPhoneNumber() type functions? (Of course, if you hand an object containing this data to another object or process, you cannot control access to the 'normal' functions, so technically you don't shield the data at all. It might be possible, though, to make the methods that expose sensitive data package-private.)
You could also investigate the call stack on every call and try to figure out if you want to return the full data or the obfuscated version - this will add quite a bit of overhead and might be very tricky to debug.
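The wrapper idea from the update could look roughly like this. It is only a sketch: the UserDetails interface and the class names are assumptions made for illustration, since the question does not say whether UserInformation implements an interface.

```java
/** Hypothetical interface shared by the real object and its log-safe wrapper. */
interface UserDetails {
    String getPhoneNumber();
    String getCreditCardNumber();
}

/** Wrapper that delegates to the real object and obfuscates each result. */
final class ObfuscatedUserDetails implements UserDetails {
    private final UserDetails delegate;

    ObfuscatedUserDetails(UserDetails delegate) {
        this.delegate = delegate;
    }

    @Override
    public String getPhoneNumber() {
        // Hide every digit of the phone number.
        return delegate.getPhoneNumber().replaceAll("\\d", "x");
    }

    @Override
    public String getCreditCardNumber() {
        // Hide each digit that still has at least four digits after it,
        // which leaves only the last four digits readable.
        return delegate.getCreditCardNumber().replaceAll("\\d(?=(?:\\D*\\d){4})", "x");
    }

    @Override
    public String toString() {
        return "phone=" + getPhoneNumber() + ", card=" + getCreditCardNumber();
    }
}
```

The calling code would pass new ObfuscatedUserDetails(user) to the logger instead of user, so the real object never reaches the logging system.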

Related

Know when value of any variable defined inside the class is changed

I have defined a class which acts like a model/POJO. The class has many keys/variables. I have implemented a custom solution for storing the POJO on disk for future use. Now, whenever any value in the class/POJO is changed, I want to call a method which syncs the fresh changes with the file on disk.
I know I can define setter for each variable. But it's quite tedious to do for 100s of direct and sub fields, and even if I define setter for each field, I have to call sync function from all the setters.
What I need is single proxy setter or interceptor for all change pushes to variables in class.
I am using this in an android application, so whenever the user enters new details in his/her account I have to store those details at that specific instance of time for preventing the data loss. I am using GSON for serialising and de-serialising.
Sorry for using vague terminologies, never been to college :|.
The easiest solution is indeed to use a setter. You only have to create one for each field you want to monitor, and most IDEs generate them for you, or you can use something like Lombok, so it being tedious isn't really an argument.
A proxy class or reflection would also be possible, but that is pretty hacky. Another way would be an asynchronous watcher/worker that checks for changes in your POJO instances, but even that seems unnecessarily complicated.
Apart from that you might need to rethink your POJOs structure if it has that many fields.
The problem with persisting an entity (in your case, writing it to disk) on each property update is that most updates modify more than one property. So if you have code like this:
entity.setA(avalue);
entity.setB(bvalue);
entity.setC(cvalue);
You would write it to the disk 3 times, which is probably not the best way, as it takes more resources and 2 out of 3 writes are unnecessary.
There are several ways to deal with this. Imagine you have a service for saving this data to disk; let's call it entityRepository. One option is to call entityRepository manually each time you want to save or update your entity. This seems less convenient than saving automatically on every setter call, but it shows clearly when and why the entity is persisted. With automatic saving that is unclear, and it can lead to problems later: for example, if you ever need to update a property without immediately persisting it, you will need two setters, one that saves and one that does not.
Another way is to add a version property and call entityRepository.save(this) inside its setter, so the entity is saved only when the version is set.
A third way is to look at AOP. Either way, I don't recommend persisting the entity on every change without having control over it.
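A rough sketch of the explicit-save option follows. EntityRepository here is a hypothetical stand-in that only records what it was asked to save; a real one would serialise the entity (for example with Gson) and write the file.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain entity: setters only change state and never touch the disk. */
final class Entity {
    private String a;
    private String b;
    private String c;

    void setA(String a) { this.a = a; }
    void setB(String b) { this.b = b; }
    void setC(String c) { this.c = c; }

    String getA() { return a; }
    String getB() { return b; }
    String getC() { return c; }
}

/** Hypothetical repository; it records saves instead of writing a file. */
final class EntityRepository {
    final List<String> saved = new ArrayList<>();

    void save(Entity entity) {
        // A real implementation would serialise the entity and write it to disk here.
        saved.add(entity.getA() + "," + entity.getB() + "," + entity.getC());
    }
}
```

Calling setA, setB and setC and then entityRepository.save(entity) once produces a single write for the three changes, and the call site shows exactly when persistence happens.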
You are talking about data binding. There is no built-in way to do that, so you do indeed have to sync it yourself. Look into How to Write a Property Change Listener. There are lots of other approaches too, but, as said, none is built in.
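For illustration, a property change listener can be wired up with java.beans.PropertyChangeSupport roughly as below. The Account class and its fields are made up for the example, and every setter still has to fire the event itself:

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

/** Hypothetical model class that notifies listeners whenever a field changes. */
final class Account {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String name;
    private String email;

    void addPropertyChangeListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    void setName(String newName) {
        String oldName = this.name;
        this.name = newName;
        // No event is fired when old and new values are equal and non-null.
        changes.firePropertyChange("name", oldName, newName);
    }

    void setEmail(String newEmail) {
        String oldEmail = this.email;
        this.email = newEmail;
        changes.firePropertyChange("email", oldEmail, newEmail);
    }

    String getName() { return name; }

    String getEmail() { return email; }
}
```

A single listener registered with addPropertyChangeListener can then call the sync method, so the persistence logic lives in one place rather than in every setter.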

Is it possible to prevent a class from using a method in java?

Suppose I have a class called Foo. This class will be modified by many people, and WILL print information to the console. To this effect, we have the following method:
private void print(String message){ ... }
which prints out to the screen in the format we want.
However, while reviewing code from other devs I see that they constantly call System.out.println(...)
instead, which results in barely-readable printouts.
My question is the following: is it possible to prevent any and every use of System.out.println() in Foo.java? If so, how?
I've tried looking this up, but all I found had to do with inheritance, which is not related to my question.
Thanks a lot!
N.S.
EDIT: I know that whatever I have to do to prevent the use of a method could be removed by a dev, but we have as a policy never to remove code marked //IMPORTANT so it could still be used as a deterrent.
EDIT2: I know I can simply tell the devs not to do it or use code reviews to filter the "errors" out but 1) I'm already doing it and it costs a lot of time and 2) the question is whether this is possible or not, NOT how to deal with my devs.
public methods are just that - public. There is no way to restrict access to them.
This kind of problem is usually "solved" by setting up some code-checker like PMD or checkstyle and integrating it into the continuous integration build. So violations of these rules will be emailed to someone with a big hammer :-)
Although communicating that developers should not use System.out directly would be preferred, you could set System.out to another PrintStream, then use the alternative PrintStream in the private method. That way, when people use System.out.println they won't output anything but you'll still be able to use the alternative PrintStream... something like they do here: http://halyph.blogspot.com/2011/07/how-to-disable-systemout.html
Pre-commit hooks for your revision control system (SVN, Git, Mercurial) can grep for uses of System.{err,out} and prevent commit if they occur.
http://stuporglue.org/svn-pre-commit-hook-which-can-syntax-check-all-files/ is an example that takes an action for different changed files based on file extension for SVN. You should be able to modify that example to take an action based on some subset of Java files and reject the commit if something like the following is true
egrep -q '\bSystem\.(err|out)\b'
You can redirect System.out calls to a stream that ignores the output or that redirects it to your logging system.
System.setOut(printStream);
You can also kill those using System.out.println in a production environment.
You can replace the OutputStream of System with your own implementation that would either throw an exception, or redirect the call to your own print implementation (which you would need to make public).
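The System.setOut approach mentioned in several answers above can be sketched as follows. ConsoleGuard and the "[Foo]" prefix are invented for the example; the only real API used is System.setOut with a PrintStream that discards its input.

```java
import java.io.OutputStream;
import java.io.PrintStream;

/** Hypothetical helper that silences System.out while keeping the real console for print(). */
final class ConsoleGuard {
    private static PrintStream realOut;

    private ConsoleGuard() {}

    /** Replaces System.out with a stream that discards everything; returns the original stream. */
    static synchronized PrintStream install() {
        if (realOut == null) {
            realOut = System.out;
            System.setOut(new PrintStream(new OutputStream() {
                @Override
                public void write(int b) {
                    // Intentionally empty: stray System.out output is dropped.
                }
            }));
        }
        return realOut;
    }

    /** The one sanctioned way to print: formats the message and writes to the real console. */
    static synchronized void print(String message) {
        PrintStream target = (realOut != null) ? realOut : System.out;
        target.println("[Foo] " + message);
    }
}
```

Note that this affects the whole JVM, not just Foo.java, and a developer can still undo it with another System.setOut call, so it is a deterrent rather than a guarantee.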
No, it's not possible to 100% prevent a class from ever using a specific method in Java.
Having said that...
My suggestion would be to add code analysis to your build process and fail the build on any occurrence of System.out.println. A good place to start if you're interested in going this route would be to check out PMD.
Also... have some constructive discussions with your developers and talk about why they're doing what they're doing. Good luck.

Better way to store data in java?

I am currently working on a video game, and I want the user to be able to save their character to a new file. I know how to use file I/O (for the most part), but I have been using serialization to write a whole object (containing all the variables for the character) to a file. The problem is that I am constantly updating the object's class and making changes to it, so when I try to load an old character with the new class, it errors and crashes. The same goes for levels (an object holding a few 2D arrays of variables).
There must be a better way to do this so it is compatible with future versions. If there is a way, would anybody please offer some source code and/or a link to a nice tutorial? All help is appreciated, thanks!!!
Use XML or an embedded database (fast and lightweight) such as Derby or H2. You could even use a plain old properties file.
In fact, see if the properties file will work for you. And only if that won't work, try XML or the embedded database approach.
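A properties-based save might look like the sketch below, which uses java.util.Properties. The CharacterSave class and its fields are hypothetical; the point is that a key missing from an older file simply falls back to a default instead of breaking the load.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Properties;

/** Hypothetical character record saved as key=value pairs rather than a serialized object. */
final class CharacterSave {
    String name = "unnamed";
    int level = 1;
    int gold = 0; // imagined as a field added in a later version of the game

    void writeTo(Writer out) {
        Properties props = new Properties();
        props.setProperty("name", name);
        props.setProperty("level", Integer.toString(level));
        props.setProperty("gold", Integer.toString(gold));
        try {
            props.store(out, "character save");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static CharacterSave readFrom(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        CharacterSave save = new CharacterSave();
        // Keys missing from an older file fall back to the field defaults.
        save.name = props.getProperty("name", save.name);
        save.level = Integer.parseInt(props.getProperty("level", Integer.toString(save.level)));
        save.gold = Integer.parseInt(props.getProperty("gold", Integer.toString(save.gold)));
        return save;
    }
}
```

In the game itself the Writer and Reader would wrap a file; they are left abstract here so the sketch does not assume any particular save location.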
If you are looking for Java serializers, here is a benchmark for you: https://github.com/eishay/jvm-serializers/wiki/
Apache Avro seems to perform well.
Another way is to store the values in a persistent store like HSQLDB or H2, load them into memory at startup, and persist when needed. You can also use SQLite (for driver check this)
You can implement Externalizable instead of Serializable, and in the readExternal() and writeExternal() methods you can put the logic to read/write the object. This way you have full control of serialization/deserialization and can make changes fairly easily. Alternatively you can use JSON serialization by using Gson. I would not recommend XML, but if you want to you can check out xstream for the same thing.
If you are extending your objects in backwards-compatible ways, i.e. adding fields and not removing them, make sure that you have declared a serialVersionUID as described in the Serializable Javadoc.
http://download.oracle.com/javase/1.5.0/docs/api/java/io/Serializable.html
One additional option to consider since you're already using serialization, you could implement Externalizable instead of Serializable. The code you use to serialize objects would remain the same. However in your class you would specify exactly how you want it serialized by overriding readExternal() and writeExternal(). E.g.:
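import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;

public class MyClass implements Externalizable {
    private int foo;
    private String bar;
    // Externalizable requires a public no-arg constructor.
    public MyClass() {}
    @Override
    public void readExternal(ObjectInput in) throws IOException {
        foo = in.readInt(); // read fields in the order they were written
        bar = in.readUTF();
    }
    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(foo);
        out.writeUTF(bar);
    }
}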
Just be sure to keep the order the same when reading and writing. Try to only add fields, however if you need to remove a field leave a gap to account for old versions.
Ultimately though, if you're making a lot of changes, it might be best to switch to a properties or XML file as LES2 suggested. It'll be more portable and readable that way.
This game uses java.util.prefs.Preferences for cross-platform convenience. Because keys are stored individually, new additions rarely interfere with existing entries.

Save object in debug and then use it as stub in tests

My application connects to a DB and gets a tree of categories from there. In debug mode I can see this big tree object, and I thought of saving this object somewhere on disk to use in test stubs. Like this:
mockedDao = mock(MyDao.class);
when(mockedDao.getCategoryTree()).thenReturn(mySavedObject);
Assume mySavedObject is huge, so I don't want to generate it manually or write special generation code. I just want to be able to serialize and save it somewhere during a debug session, then deserialize it and pass it to thenReturn in tests.
Is there a standard way to do so? If not, what is the best way to implement such an approach?
I do love your idea, it's awesome!
I am not aware of a library that offers that feature out of the box. You can try using ObjectOutputStream and ObjectInputStream (i.e. standard Java serialization) if your objects all implement Serializable. Typically they do not. In that case, you might have more luck using XStream or one of its friends.
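If the objects in the tree do implement Serializable, the standard-serialization route can be sketched like this. StubFiles is a hypothetical helper name; only the java.io and java.nio.file calls are real API.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for dumping an object during a debug session and reloading it in tests. */
final class StubFiles {
    private StubFiles() {}

    /** Writes the object graph to the given file using standard Java serialization. */
    static void save(Serializable object, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back an object previously written by save(). */
    static Object load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Class of the saved object is not on the classpath", e);
        }
    }
}
```

During the debug session, save can be invoked from the debugger's expression evaluator; the test then casts the result of load to the tree type and passes it to thenReturn. The saved file becomes unreadable if the serialized classes change incompatibly, so it may need regenerating from time to time.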
We usually mock the entire DB in such scenarios, reusing (and implicitly testing) the code to load the categories from the DB.
Specifically, our unit tests run against an in-memory database (hsqldb), which we initialize prior to each test run by importing test data.
Have a look at Dynamic Managed Beans - this offers a way to change values of a running Java application. Maybe there's a way to define an MBean that holds your tree, read the tree, store it somewhere and inject it again later.
I've run into this same problem and considered possible solutions. A few months ago I wrote custom code to print a large binary object as hex encoded strings. My toJava() method returns a String which is source code for a field definition of the object required. This wasn't hard to implement. I put log statements in to print the result to the log file, and then cut and paste from the log file to a test class. New unit tests reference that file, giving me the ability to dig into operations on an object that would be very hard to build another way.
This has been extremely useful but I quickly hit the limit on the size of bytecode in a compilation unit.

Writing long test method names to describe tests vs using in code documentation

For writing unit tests, I know it's very popular to write test methods that look like
public void Can_User_Authenticate_With_Bad_Password()
{
...
}
While this makes it easy to see what the test is testing for, I think it looks ugly and it doesn't display well in auto-generated documentation (like sandcastle or javadoc).
I'm interested to see what people think about using a naming scheme that consists of the method being tested, then _Test, then the test number, and then using the XML code documentation (.NET) or the Javadoc comments to describe what is being tested.
/// <summary>
/// Tests for user authentication with a bad password.
/// </summary>
public void AuthenticateUser_Test1()
{
...
}
By doing this I can easily group my tests by the methods they are testing, I can see how many tests I have for a given method, and I still have a full description of what is being tested.
We have some regression tests that run against a data source (an XML file). These files may be updated by someone without access to the source code (QA monkey), who needs to be able to read what is being tested and where in order to update the data sources.
I prefer the "long names" version - although only to describe what happens. If the test needs a description of why it happens, I'll put that in a comment (with a bug number if appropriate).
With the long name, it's much clearer what's gone wrong when you get a mail (or whatever) telling you which tests have failed.
I would write it in terms of what it should do though:
LogInSucceedsWithValidCredentials
LogInFailsWithIncorrectPassword
LogInFailsForUnknownUser
I don't buy the argument that it looks bad in autogenerated documentation - why are you running JavaDoc over the tests in the first place? I can't say I've ever done that, or wanted generated documentation. Given that test methods typically have no parameters and don't return anything, if the method name can describe them reasonably that's all the information you need. The test runner should be capable of listing the tests it runs, or the IDE can show you what's available. I find that more convenient than navigating via HTML - the browser doesn't have a "Find Type" which lets me type just the first letters of each word of the name, for example...
Does the documentation show up in your test runner? If not that's a good reason for using long, descriptive names instead.
Personally I prefer long names and rarely see the need to add comments to tests.
I've done my dissertation on a related topic, so here are my two cents: Any time you rely on documentation to convey something that is not in your method signature, you are taking the huge risk that nobody would read the documentation.
When developers are looking for something specific (e.g., scanning a long list of methods in a class to see if what they're looking for is already there), most of them are not going to bother to read the documentation. They want to deal with one type of information that they can easily see and compare (e.g., names), rather than have to start redirecting to other materials (e.g., hover long enough to see the JavaDocs).
I would strongly recommend conveying everything relevant in your signature.
Personally I prefer using the long method names. Note you can also have the method name inside the expression, as:
Can_AuthenticateUser_With_Bad_Password()
I suggest smaller, more focussed (test) classes.
Why would you want to javadoc tests?
What about changing
Can_User_Authenticate_With_Bad_Password
to
AuthenticateDenieTest
AuthenticateAcceptTest
and name the suite something like User
As a Group how do we feel about doing a hybrid Naming schema like this
/// <summary>
/// Tests for user authentication with a bad password.
/// </summary>
public void AuthenticateUser_Test1_With_Bad_Password()
{
...
}
and we get the best of both.
