Check Android Assets Integrity


In my folder assets/data, there are a lot of XML files containing static data for my app.
It's really easy for someone to retrieve an APK, modify a part of it and install on a device.
I would like to prevent users from altering my static data by checking the integrity of my assets/data folder.
Initially I was considering an MD5 checksum, but it will probably be too slow for the number of files I'm going to have (50-100).
Do you have any suggestions?
Edit:
This app is a game with an XML file describing each level.

I'll describe how you can effectively protect against modification and repackaging, not how you can protect the assets on their own, although you could ultimately apply the same technique to encrypting them. It's imperfect, but you can make modification significantly more difficult.
You sign the application with a certificate. Although they can remove yours, no one else can produce the same certificate when putting it back together. You can therefore check the signature of the application at runtime, to make sure it's what you expect.
Here's some cheap and nasty code to do this:
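try {
    PackageManager pm = context.getPackageManager();
    PackageInfo info = pm.getPackageInfo(context.getPackageName(), PackageManager.GET_SIGNATURES);
    if (info.signatures[0].toCharsString().equals(YOUR_SIGNATURE)) {
        // signature is OK
    }
} catch (PackageManager.NameNotFoundException e) {
    // our own package should always be found; treat this as a failed check
}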
where YOUR_SIGNATURE is a constant, obtained from running this code on the signed app.
Now, there are two remaining problems that you have already hinted at:
(1) How can you stop someone just modifying the constant in the source code to match their certificate, then repackaging and re-signing the app?
(2) How can you stop someone finding the check method and removing it?
Answer to both: you can't, not absolutely, but you can do a pretty good job through obfuscation. The free ProGuard, but more usefully the commercial DexGuard, are tools for doing this. You may baulk at the current €350 cost of the latter; on the other hand, I have tried to reverse engineer apps that are protected like this, and unless the stakes are very high, it isn't worth the trouble.
To an extent, you could also do the obfuscation for (1) yourself; have the signature 'constant' assembled at runtime through some complicated programmatic method that makes it difficult to find and replace.
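As a rough illustration of assembling the 'constant' at runtime, here is a minimal sketch, assuming the expected signature is stored as two XOR-masked byte fragments. The class name, the mask value and the two-fragment split are made up for this example. XOR masking is trivial to undo once someone finds it; the only thing it buys you is that the plain string no longer shows up in a search of the binary.

```java
import java.nio.charset.StandardCharsets;

final class SignatureConstant {
    // Hypothetical mask; not a secret, it only keeps the plain literal out of the constant pool.
    private static final byte MASK = 0x5A;

    // Masks a string into bytes; run once at build time to produce the fragments to embed.
    static byte[] mask(String plain) {
        byte[] bytes = plain.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= MASK;
        }
        return bytes;
    }

    // Rebuilds the original string at runtime from two separately stored fragments.
    static String assemble(byte[] firstPart, byte[] secondPart) {
        byte[] joined = new byte[firstPart.length + secondPart.length];
        System.arraycopy(firstPart, 0, joined, 0, firstPart.length);
        System.arraycopy(secondPart, 0, joined, firstPart.length, secondPart.length);
        for (int i = 0; i < joined.length; i++) {
            joined[i] ^= MASK;
        }
        return new String(joined, StandardCharsets.UTF_8);
    }
}
```

In a real app you would keep the fragments in different classes and compare against the result of assemble(...) instead of a literal YOUR_SIGNATURE.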
(2) is really a software design issue; making it sufficiently complicated or annoying to remove the check. Obfuscation just makes it more difficult to find in the first place.
As a further note, you might want to look at whether stuff like Google Licensing gives you any protection in this area. I don't have any experience of it though, so you're on your own there.

Sort of an answer although it is in the negative.
If the person has your apk and has decoded it, then even if you used a checksum, they can just update the code portion with the new checksum. I don't think you can win this one. You can put a great deal of effort into protecting it but if you assume somebody can obtain and modify the apk, then they can also undo the protection. On my commercial stuff, I just try to make the decoding non-obvious but not bullet proof. I know anything more is not worth the effort or even possible.

Perhaps you could zip up the xml files and put it in the assets/data folder; and then do a checksum on that .zip. On the first run, you could unzip the files to get the .xml layouts. See Unzip file from zip archive of multiple files using ZipFile class for unzipping an archive.
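For what it's worth, hashing a single archive is one pass over one stream. Below is a minimal sketch, assuming you swap MD5 for SHA-256 (my choice, not something the question requires) and compare the hex result against a value you recorded at build time. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class StreamChecksum {
    // Reads the stream to the end and returns its SHA-256 digest as lowercase hex.
    static String sha256Hex(InputStream in) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

On Android you would presumably pass in the stream returned by context.getAssets().open(...) for the zip; the asset path is whatever you chose when packaging. As the other answers point out, the expected hash still lives in the APK, so this only raises the bar.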

Probably the most reliable way would be for the level XML data to be downloaded from a server when the app is started with a check of the time stamp and sizes of the level files. That also lets you provide updates to level data over time. Of course this means you have the added expense of a server to host which may be another problem.

Related

Java - prevent code modification techniques

I recently heard of a software security company that makes your code hack-proof in terms of reverse engineering and code modification. Their technique is this:
They insert checksums in multiple check points in the code that secure the code between them. As the code flow is executed at every checkpoint the checksum is checked and if the code has been tampered with then the checksum fails and you know there has been code modification. If a checkpoint is removed then the next checkpoint will also fail because a checkpoint has been removed.
To buy their services would be completely out of budget for my project (an Android application) however I would like to implement that technique on my own.
Could someone offer some insight on how something like this could be implemented? Also, if there are other methods that one could use to prevent code modification, please share.
(Just to clarify: I'm aware of obfuscation, weird misleading code logic, and writing fake methods to make the code more difficult to read, and I will apply these methods too.)

prevent code modification techniques
There is not any trick for complete avoidance of reverse engineering.
You basically can't protect your application from being modified. And any protection you put in there can be disabled/removed.
If you have the option of including shared libraries, you can include the needed code in C++ to verify file sizes, integrity, etc.

Hack-proof is a very loosely defined term. Even if you implement checksums on various portions of your code, there are many other exploits you need to be aware of that don't involve modifying the source code, such as injection, authentication weaknesses, etc. My recommendation is to worry less about how to prevent someone from modifying your code and focus more on protecting the areas that would be vulnerable if they did modify it, including hashed and salted passwords, encrypted data transfer, etc.

Bytecode injection countermeasures in jar files

I have a JNLP application that loads and executes a jar file (the client) on a user's computer. The user uses this jar to communicate with a server that provides a service. I've seen users using Javassist and JavaSnoop to alter the functionality of the client. Is there any way I can remotely detect changes created by the previously mentioned utilities? For example, can I checksum the classes locally and send the result to the server (which knows the correct checksum of each class)?

There is no way in general to prevent the client from running any code they wish to. The security of your system should never rely on assuming that clients are running specific code or are not aware of specific information contained in the jars you send them.
Furthermore, attempts to impose DRM tend to cause problems for legitimate users and alienate your customers while doing little to prevent people who actually do want to hack the system.

You can, for example, create a checksum of your jar file and have your application calculate the checksum at runtime and send it to the server for verification. The simplest checksum is a hash of the whole jar.
The only question is why? And who is the super user that takes your jar and performs instrumentation on it? And why is he doing this? And even if he has reasons, who cares? If you are afraid that somebody is going to hack your server, make it secure enough and do not worry about the client (IMHO).

You can open a class file named p1.p2.ClName with ClassLoader.getResourceAsStream("p1/p2/ClName.class") (no leading slash when going through a ClassLoader; the leading slash form is for Class.getResourceAsStream), read it, and compute its checksum.
But since the user can change the functionality, he can also remove that checksum check.
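A minimal sketch of that idea, assuming the classes are loaded from ordinary .class resources (as in a jar) and that CRC32 is good enough for a demo; a cryptographic hash would be the obvious swap. The class and method names are made up.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.CRC32;

final class ClassChecksum {
    // Returns the CRC32 of the .class resource backing the given class, or -1 if it cannot be found.
    static long crc32Of(Class<?> type) throws IOException {
        // ClassLoader resource names use '/' separators and no leading slash.
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        if (loader == null) {
            // Classes from the bootstrap loader report null; fall back to the system loader.
            loader = ClassLoader.getSystemClassLoader();
        }
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                return -1;
            }
            CRC32 crc = new CRC32();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                crc.update(buffer, 0, read);
            }
            return crc.getValue();
        }
    }
}
```

Keep in mind this hashes the bytes on disk or in the jar, so a tool that rewrites the class in memory at load time would not necessarily change the result.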

Is it possible to attack a server by uploading / through uploaded files?

The title actually states the issue. And before you get me wrong, I DO NOT want to know how this can be done, but how I can prevent it.
I want to write a file uploader (in Java with JPA and a MySQL database). Since I'm not yet 100% sure about the internal management, there is the possibility that at some point the file could be executed/opened internally.
So I'd be glad to know what an attacker can do to harm, infect or manipulate my system by uploading whatever type of file, be it a media file, a binary or whatever.
For instance:
What about special characters in the file name?
What about manipulating meta data like EXIF?
What about "embedded viruses" like in an MP3 file?
I hope this is not too vague and I'd be glad to read your tips and hints.
Best regards,
Stacky

It's really very application specific. If you're using a particular web app like phpBB, there are completely different security needs than if you're running a news group. If you want tailored security recommendations, you'll need to search for them based on the context of what you're doing. It could range from sanitizing input to limiting upload size and format.
For example, an MP3 file virus probably only works on a few specific MP3 players. Not on all of them.
At any rate, if you want broad coverage from viruses, then scan the files with a virus scanner, but that probably won't protect you from things like script injection.

If your server doesn't do something inherently stupid, there should be no problem. But...
Since I'm not yet 100% sure about the internal management, there is the possibility that at some point the file could be executed/opened internally.
... this qualifies as inherently stupid. You have to make sure you don't accidentally execute uploaded files (permissions on the upload directory are a starting point, limit the upload to specific directories, etc.).
Aside from executing, if the server attempts any file type specific processing (e.g. make thumbnails of images) there is always the possibility that the processing can be attacked through buffer overflow exploits (these are specific for each type of software/library though).
A pure file server (e.g. FTP) that just stores/serves files is safe (when there are no other holes).
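On the "special characters in the file name" and "limit the upload to specific directories" points, here is a minimal sketch, assuming you store uploads under one fixed directory and are happy to replace anything outside a conservative character whitelist. The class name, method name and whitelist are my own choices, not a standard API.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class UploadPaths {
    // Maps a client-supplied file name to a path inside uploadDir, or throws if the name is unusable.
    static Path resolveSafely(Path uploadDir, String clientFileName) {
        // Keep only the last path segment, whichever separator the client used.
        String base = clientFileName.replace('\\', '/');
        base = base.substring(base.lastIndexOf('/') + 1);
        // Replace every character outside a conservative whitelist.
        String cleaned = base.replaceAll("[^A-Za-z0-9._-]", "_");
        if (cleaned.isEmpty() || cleaned.equals(".") || cleaned.equals("..")) {
            throw new IllegalArgumentException("Unusable file name");
        }
        Path root = uploadDir.toAbsolutePath().normalize();
        Path target = root.resolve(cleaned).normalize();
        // Defensive second check: the final path must still be inside the upload directory.
        if (!target.startsWith(root)) {
            throw new IllegalArgumentException("Path escapes upload directory");
        }
        return target;
    }
}
```

This only covers the file name. The upload directory should still be non-executable and outside anything the server interprets as scripts, and many people store the file under a generated name and keep the original name only in the database.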

Detecting newly created files through Java in realtime

Using JDK 7 I've had success in watching specific directories for new file creations, deletions and modifications using java.nio.file.StandardWatchEventKinds.*
I'm hoping someone may know a way to get Java to detect new file creations regardless of their path.
I am wanting to do this so I can calculate an MD5 sum for each newly written file.
Thanks for any advice you can offer.

OK, the short answer is that I don't think Java can do that out of the box. You'd have to either intercept calls to the operating system, which would require something closer to the bare metal, or do as suggested in another answer and register listeners on every folder from the root down, not to mention other drives in the case of Windows machines.
The first approach would need custom JNI, which assumes the OS has such a hook and allows user code access.
The second approach would work but could consume a large amount of memory to track all the listeners. In Windows, right-click on C:\ and select Properties to see just how many folders we're talking about.

One possibility - not a convenient one, but a possibility - is to walk the directory tree for the directories you want to watch, registering each in a WatchService. That's not a very nice way to go about it, and it could be a problem depending on how large the actual directory tree is.
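A minimal sketch of that walk, assuming JDK 7's Files.walkFileTree and that you only care about ENTRY_CREATE events. The class and method names are made up for the example.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchService;
import java.nio.file.attribute.BasicFileAttributes;

final class RecursiveWatch {
    // Registers root and every directory below it with the watcher; returns how many were registered.
    static int registerTree(Path root, final WatchService watcher) throws IOException {
        final int[] count = {0};
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs)
                    throws IOException {
                dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
                count[0]++;
                return FileVisitResult.CONTINUE;
            }
        });
        return count[0];
    }
}
```

Directories created after the walk are not covered automatically; when an ENTRY_CREATE event turns out to be a directory, you would have to call registerTree on it as well. Walking from a filesystem root will also hit directories you cannot read, so a real version would need a visitFileFailed override.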

I do not know StandardWatchEvents (although it sounds convenient).
One way to do what you want is to use a native Windows API such as ReadDirectoryChangesW (or volume changes). It's painful, but works (been there, done that, wish I had another option at the time).

Java content APIs for a large number of files

Does anyone know of any Java libraries (open source) that provide features for handling a large number of files (write/read) on disk? I am talking about 2-4 million files (most of them PDFs and MS Office documents). It is not a good idea to store all files in a single directory. Instead of re-inventing the wheel, I am hoping that this has been done by many people already.
Features I am looking for:
1) Able to write/read files from disk
2) Able to create random directories/sub-directories for new files
3) Provide version/audit (optional)
I was looking at the JCR API and it looks promising, but it starts with a workspace and I'm not sure what the performance will be when there are many nodes.

Edit: JCR does look pretty good. I'd suggest trying it out to see how it actually performs for your use case.
If you're running your system on Windows and noticed a horrible n^2 performance hit at some point, you're probably running up against the performance hit incurred by automatic 8.3 filename generation. Of course, you can disable 8.3 filename generation, but as you pointed out, it would still not be a good idea to store large numbers of files in a single directory.
One common strategy I've seen for handling large numbers of files is to create directories for the first n letters of the filename. For example, document.pdf would be stored in d/o/c/u/m/document.pdf. I don't recall ever seeing a library to do this in Java, but it seems pretty straightforward. If necessary, you can create a database to store the lookup table (mapping keys to the uniformly-distributed random filenames), so you won't have to rebuild your index every time you start up. If you want to get the benefit of automatic deduplication, you could hash each file's content and use that checksum as the filename (but you would also want to add a check so you don't accidentally discard a file whose checksum matches an existing file even though the contents are actually different).
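The directory-per-letter scheme could be sketched roughly like this, assuming a fixed depth and that you stop nesting at the first character that isn't a letter or digit (so a '.' never becomes a directory name). The names are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ShardedPaths {
    // Builds root/d/o/c/u/m/document.pdf style paths from the first `depth` characters of the name.
    static Path shard(Path root, String fileName, int depth) {
        Path dir = root;
        int levels = Math.min(depth, fileName.length());
        for (int i = 0; i < levels; i++) {
            char c = fileName.charAt(i);
            if (!Character.isLetterOrDigit(c)) {
                break; // avoid directories named "." or other odd characters
            }
            dir = dir.resolve(String.valueOf(c));
        }
        return dir.resolve(fileName);
    }
}
```

With real file names the letters are far from uniformly distributed, which is why the random or hash-based names mentioned above tend to spread files more evenly; the same method works unchanged on a hex string.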
Depending on the sizes of the files, you might also consider storing the files themselves in a database--if you do this, it would be trivial to add versioning, and you wouldn't necessarily have to create random filenames because you could reference them using an auto-generated primary key.

Combine the functionality in the java.io package with your own custom solution.
The java.io package can write and read files from disk and create arbitrary directories or sub-directories for new files. There is no external API required.
The versioning or auditing would have to be provided with your own custom solution. There are many ways to handle this, and you probably have a specific need that needs to be filled. Especially if you're concerned about the performance of an open-source API, it's likely that you will get the best result by simply coding a solution that specifically fits your needs.
It sounds like your module should scan all the files on startup and form an index of everything that's available. Based on the method used for sharing and indexing these files, it can rescan the files every so often or you can code it to receive a message from some central server when a new file or version is available. When someone requests a file or provides a new file, your module will know exactly how it is organized and exactly where to get or put the file within the directory tree.
It seems that it would be far easier to just engineer a solution specific to your needs.
