Java decompiling and JNI

This is a little like the question "How to lock compiled Java classes to prevent decompilation?". I am well aware of how to decompile an application and try to understand it even if it is obfuscated, but one thing I'm not too sure about is how the same process would work if the application loaded C libraries (.so files) using JNI.
For example, say there was a calculator. If this calculator was built in pure Java, it would be possible to go in and mess up the square button so that when you passed in 2 it would give back 2^3 rather than 2^2.
Now if this application used JNI to do all its math (so it passed the 2 to a native method), how would you be able to go into the C and change it so that it returns 2^3 and not 2^2?

Just figure out the C function signature and compile your own object file that implements that signature.
Years ago, working in a mainframe shop, my boss made his own version of the system date function and re-linked a commercial app we were using so he didn't have to renew the time-limited license. It was illegal as hell, but it worked.
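To make the "figure out the signature" step concrete, here is a minimal sketch of what the Java side of such a calculator might look like. The class name NativeCalc, the method nativeSquare, and the library name "democalc" are all made up for illustration. By the JNI naming convention, a class in the default package binds this method to an exported C symbol called Java_NativeCalc_nativeSquare, so any library that exports that symbol can stand in for the original.

```java
// Hypothetical calculator front end; class, method and library names are illustrative.
class NativeCalc {
    // True only if a library named "democalc" was found on java.library.path.
    private static final boolean NATIVE_LOADED = tryLoad("democalc");

    // Implemented in C. For a class in the default package the JVM looks for
    // an exported symbol named Java_NativeCalc_nativeSquare.
    private static native int nativeSquare(int x);

    private static boolean tryLoad(String name) {
        try {
            // Resolves to libdemocalc.so, democalc.dll or libdemocalc.dylib.
            System.loadLibrary(name);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // library not present, fall back to pure Java
        }
    }

    static int square(int x) {
        // Whichever library was loaded decides the result. The Java side cannot
        // tell the original from a replacement exporting the same symbol.
        return NATIVE_LOADED ? nativeSquare(x) : x * x;
    }

    public static void main(String[] args) {
        System.out.println(square(2));
    }
}
```

Without the library present this falls back to the Java implementation. With a replacement libdemocalc on the library path, the same call would return whatever the replacement computes.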

Decompilation is older than bytecode. Pretty much everything can be decompiled. It's definitely harder (both to decompile and to understand/modify the result) with mangled, optimized machine code with zero metadata preserved, but nonetheless possible. Of course you'd need a different decompiler, and - as hinted before - it would be a bit harder, but the fact (which makes all DRM tools imperfect, by the way) "if their CPU runs it, they can modify it", holds for native code as much as for any bytecode.

One option is to use a disassembler. A simpler option is to replace the library with your own library. I do this for test purposes almost every day.

You could use a debugger to step into the C code.
You could disassemble it. IDA (Interactive Disassembler) was (is?) a great example, and could produce high quality disassembled code (cross-references, documentation, name of system/lib functions in calls, ...).
It is then possible to patch the binary (which could be protected in some way).
If your concern is that you don't want the people who use your app to see the code or even change it, could you consider running it as a web or client/server application, where the user doesn't have access to the server? That would sidestep the problem.

Related

How can I protect Java/Javafx code from being seen by final user?

I have been working on a project alone for more than two years for a company. The project is a really big one using rxtx to communicate with a hardware device. I used Java 8 and JavaFX for the UI. Now it is almost finished and I am starting to look into how to deliver the end-user application that the company will distribute to its clients.
The problem is that the company I am working with wants the code to be unreachable once the software is in the final clients' hands, because the Java code contains some extremely sensitive information that could have very bad consequences for the company if final clients came to know it. The clients could literally perform actions they don't have the right to perform.
So after searching (a lot) and thinking about my case, I understood that shipping an obfuscated JAR isn't the solution. I then tried to generate a JAR and transform it into an EXE, but all I managed was to wrap the JAR in an EXE, which does not prevent someone from extracting the JAR and then easily seeing all the code. Finally, I found that I should use AOT compilation, such as the GCJ compiler, to produce a native binary EXE from my Java code, but here I am stuck: after watching videos and reading articles, I didn't manage to find a clear way to produce the native binary.
I am now confused since I don’t know if I am on the right path and good direction or if I am totally wrong and there is another way of protecting the code (at least from non professional hackers, I understand that it is not possible to make it 100% safe but I am just searching for a reasonable and good way). How should I manage this final step of my work?

I currently work for a company that has code that we don't want anyone to have access to for the security of our clients and-- less important-- for legal reasons. ;-)
One possible solution you could look into would be to rewrite the code you deem most sensitive into a C/C++ library. It would be possible to compile this into a .so/.dll/.dylib file for the respective OSs and it would make it difficult, not entirely impossible, but difficult to decompile.
The trouble would come from learning how to access native code from Java, as much of the documentation is unhelpful or simply nonexistent. This would use the Java Native Interface (JNI), which allows Java to, well, interface with the native (compiled C/C++) code. This would make it possible to create a JAR file that would effectively become a Java library for you to access throughout the rest of your project. The native code, however, will still need to be loaded at runtime, but that's a part of learning how JNI works. A helpful link I found for JNI is http://jnicookbook.owsiak.org/ (for as long as it's still a functional link).
One of our clients here where I work has a project written in Java and needed to implement our code that is unfortunately all written in C. So we needed a way to access this C/C++ code from Java. This is the way we went about solving this issue without rewriting our code in Java. But we had the benefit (?) of having already written our code in C.
This solution, writing a bunch of extra code at the last minute in another language that you may or may not be familiar with, doesn't sound like a particularly fun time.
I would be curious to learn what possible problems others might see with this solution.

How do Java libraries work?

I have been programming in Java for a while now. However during all this time there was a concept I never understood and finally now I would like to close this knowledge gap:
A Java class may consist of several parts like methods, member variables, comments and maybe other stuff. I think of these as mere tools for pushing around numbers, strings etc.
However, knowing that libraries exist, one finds that one can do a lot more with one's code: for example, reading from or writing to files on the local hard drive, recording audio data, getting the current system time etc.
But how does that work?
Java classes and stuff that needs hardware (a microphone for example) are completely separate things! As far as I know the Java libraries I import in my code also include only Java classes, stuff to help pushing around integers, strings etc.
Where is the "exit" point, when one "leaves" the class and works with stuff, that is not somewhere inside the JVM?
EDIT: Found my answers, posted here below:
In short: https://stackoverflow.com/a/557610/5152565
In a bit more detail: https://stackoverflow.com/a/30636097/5152565

A native method call from your Java library might qualify as an exit point to your Java code. Beyond this point the native code will have to work with the operating system libraries to execute the task.
eg : Java native code to read a File
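One way to see such an exit point for yourself is to ask reflection whether a library method is declared native. The sketch below is only an illustration (the helper class NativeExitPoint is made up). It relies on System.currentTimeMillis being declared native, which is the case in the OpenJDK sources I am aware of but is an implementation detail rather than a guarantee.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper: reports whether a no-argument method is declared native.
class NativeExitPoint {
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Getting the system time leaves Java and asks the operating system.
        System.out.println("System.currentTimeMillis native? "
                + isNative(System.class, "currentTimeMillis"));
        // String.length only reads a field, so it stays inside Java.
        System.out.println("String.length native? "
                + isNative(String.class, "length"));
    }
}
```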

Decompiling a jar file and modifying the source to hack an application. How to prevent this? [duplicate]

How can I package my Java application into an executable JAR that cannot be decompiled (for example, by Jadclipse)?

You can't. If the JRE can run it, an application can de-compile it.
The best you can hope for is to make it very hard to read (replace all symbols with combinations of 'l' and '1' and 'O' and '0', put in lots of useless code and so on). You'd be surprised how unreadable you can make code, even with a relatively dumb translation tool.
This is called obfuscation and, while not perfect, it's sometimes adequate.
Remember, you can't stop the determined hacker any more than the determined burglar. What you're trying to do is make things very hard for the casual attacker. When presented with the symbols O001l1ll10O, O001llll10O, OO01l1ll10O, O0Ol11ll10O and O001l1ll1OO, and code that doesn't seem to do anything useful, most people will just give up.
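To give a feel for what that renaming does, here is a small hand-made sketch: the same method before and after a hypothetical renaming pass, with one useless statement thrown in. The method applyDiscount is invented for the example, and a real obfuscator would do this on the bytecode rather than on the source.

```java
class ObfuscationDemo {
    // Readable original: price minus a percentage discount.
    static int applyDiscount(int priceInCents, int percent) {
        return priceInCents - priceInCents * percent / 100;
    }

    // Same logic after renaming, plus a junk statement that never affects the result.
    static int O001l1ll10O(int O001llll10O, int OO01l1ll10O) {
        int O0Ol11ll10O = O001llll10O ^ OO01l1ll10O; // dead code, only there to distract
        return O001llll10O - O001llll10O * OO01l1ll10O / 100;
    }

    public static void main(String[] args) {
        // Both versions compute the same value for the same input.
        System.out.println(applyDiscount(2000, 15) == O001l1ll10O(2000, 15));
    }
}
```

The behavior is identical, which is the point: a patient reader can still work out what the second method does, it just takes longer.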

First you can't avoid people reverse engineering your code. The JVM bytecode has to be plain to be executed and there are several programs to reverse engineer it (same applies to .NET CLR). You can only make it more and more difficult to raise the barrier (i.e. cost) to see and understand your code.
Usual way is to obfuscate the source with some tool. Classes, methods and fields are renamed throughout the codebase, even with invalid identifiers if you choose to, making the code next to impossible to comprehend. I had good results with JODE in the past. After obfuscating use a decompiler to see what your code looks like...
Next to obfuscation you can encrypt your class files (all but a small starter class) with some method and use a custom class loader to decrypt them. Unfortunately the class loader class can't be encrypted itself, so people might figure out the decryption algorithm by reading the decompiled code of your class loader. But the window to attack your code got smaller. Again this does not prevent people from seeing your code, just makes it harder for the casual attacker.
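A rough sketch of that decrypting class loader idea follows. Everything here is illustrative: the class name DecryptingClassLoader is made up, single-byte XOR stands in for a real cipher, and the encrypted bytes are registered in memory where a real application would read them from its JAR.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical loader that decrypts class bytes just before defining the class.
class DecryptingClassLoader extends ClassLoader {
    private final byte key;
    private final Map<String, byte[]> encryptedClasses = new HashMap<>();

    DecryptingClassLoader(ClassLoader parent, byte key) {
        super(parent);
        this.key = key;
    }

    // XOR is symmetric, so the same call encrypts and decrypts.
    // It is a placeholder for a real cipher, not a recommendation.
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    // Stores the encrypted bytes of one class under its binary name.
    void register(String className, byte[] encryptedBytes) {
        encryptedClasses.put(className, encryptedBytes);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] encrypted = encryptedClasses.get(name);
        if (encrypted == null) {
            throw new ClassNotFoundException(name);
        }
        byte[] plain = xor(encrypted, key); // plain bytecode exists in memory from here on
        return defineClass(name, plain, 0, plain.length);
    }

    public static void main(String[] args) {
        byte[] original = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE};
        byte[] roundTrip = xor(xor(original, (byte) 0x5A), (byte) 0x5A);
        System.out.println(Arrays.equals(original, roundTrip));
    }
}
```

Note that both the key and the xor method sit in this unencrypted class, which is exactly the weakness described above.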
You could also try to convert the Java application to some windows EXE which would hide the clue that it's Java at all (to some degree) or really compile into machine code, depending on your need of JVM features. (I did not try this.)

GCJ is a free tool that can compile to either bytecode or native code. Keep in mind that this does sort of defeat the purpose of Java.

A little late I know, but the answer is no.
Even if you write in C and compile to native code, there are disassemblers / debuggers which will allow people to step through your code. Granted, debugging optimized code without symbolic information is a pain, but it can be done; I've had to do it on occasion.
There are steps that you can take to make this harder - e.g. on windows you can call the IsDebuggerPresent API in a loop to see if somebody is debugging your process, and if yes and it is a release build - terminate the process. Of course a sufficiently determined attacker could intercept your call to IsDebuggerPresent and always return false.
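IsDebuggerPresent is a Windows API for native processes. A loosely comparable check on the JVM side, which is a different technique and not what the answer describes, is to look for the JDWP debug agent among the JVM's startup arguments. The sketch below assumes the agent was requested with -agentlib:jdwp or the older -Xrunjdwp flag, and the class name DebugAgentCheck is made up. It is just as easy to defeat as the native check.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical check for a debug agent requested on the JVM command line.
class DebugAgentCheck {
    // True if any JVM argument asks for the JDWP debug agent.
    static boolean requestsDebugAgent(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith("-agentlib:jdwp") || arg.startsWith("-Xrunjdwp")) {
                return true;
            }
        }
        return false;
    }

    // Applies the check to the arguments this JVM was started with.
    static boolean debuggerLikelyAttached() {
        return requestsDebugAgent(ManagementFactory.getRuntimeMXBean().getInputArguments());
    }

    public static void main(String[] args) {
        if (debuggerLikelyAttached()) {
            System.err.println("Debug agent detected, exiting.");
            System.exit(1);
        }
        System.out.println("No debug agent flag found.");
    }
}
```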
There are a whole variety of techniques that have cropped up - people who want to protect something and people who are out to crack it wide open, it is a veritable arms race! Once you go down this path - you will have to constantly keep updating/upgrading your defenses, there is no stopping.

This is not my own practical solution, but I think this is a good collection of resources and tutorials for taking it as far as it can go.
A suggestion from this website (Oracle community):
1. (Clean way) Obfuscate your code. There are many open source and free obfuscator tools; here is a simple list of them: [Open source obfuscators list]. These tools make your code unreadable (though you can still decompile it) by changing names. This is the most common way to protect your code.
2. (Not so clean way) If you have a specific target platform (like Windows), or you can ship different versions for different platforms, you can write a sophisticated part of your algorithms in a low-level language like C (which is very hard to decompile and understand) and use it as a native library in your Java application. It is not clean, because many of us use Java for its cross-platform abilities, and this method diminishes that ability.
And this one below is a step-by-step guide:
ProtectYourJavaCode
Enjoy!
Keep adding your solutions; we need more of this.

Combining Java and C without gcj -- move C to Java or Java to C?

First, I have no experience doing this. But like the beginning of any good program, I have a problem that I need to fix, so I'm willing to learn.
So many of you are probably already familiar with pdftk, the handy utility for handling various pdf-related tasks. So far as I can tell, most of these features are available in much newer, lighter libraries/extensions, except the one I need (and probably the only reason it still exists): merging form data files (fdf and xfdf) with a form PDF and getting a new file as the output.
The problem is that my server doesn't have gcj, which is fundamental to build/compile pdftk. I don't know if it's because I'm on Solaris or if it's for some other sysadmin-level reason, but I'm not getting gcj anytime soon. And there are no pre-compiled binaries for Solaris as far as I can find.
So I'm thinking that the makefile and C code can be rewritten to import the Java library (a very ancient version of iText) directly, via javac.
But I'm not sure where to really start. All I know is:
I want a binary when I'm done, so that there won't be a need for a Java VM on every use.
The current app uses GCJ.
So my first thought was "Oh this is easy, I can probably just call the classes with some other C-based method", but instead of finding a simple method for doing this, I'm finding tons of lengthy posts on the various angles that this can be approached, etc.
Then I found a page on Sun's site on how to call other languages (like C) in a Java class. But the problems with that approach are:
I'd have to write a wrapper for the wrapper
I'd probably be better off skipping that part and writing the whole thing in Java
I ain't ready for that just yet if I can just import the classes with what is already there
I'm not clear on if I can compile and get a binary at the end or if I'm trapped in Java being needed every time.
Again, I apologize for my ignorance. I just need some advice and examples of how one would replace GCJ dependent C code with something that works directly with Java.
And of course if I'm asking one of those "if we could do that, we'd be rich already" type questions, let me know.

I'm not sure what you are looking for exactly, so I provided several answers.
If you have java code that needs to run, you must:
Run it in a jvm. You can start that vm within your own custom c-code, but it is still using a jvm
Rewrite it in another language.
Compile with an ahead-of-time compiler (eg gcj)
Incidentally, you could compile a copy of gcj in your home folder and use that. I believe the magic switch is --enable-languages=java,c (see: here for more)
If you have c-code you want to call from java, you have four options:
Java Native Interface (JNI). It seems you found this
Java Native Access (JNA). This is slower than JNI, but requires less coding and no wrapper c-code. It does require a jar and a library
Create a CLI utility and use Runtime.exec(...) to call it.
Use some sort of Inter Process Communication to have the Java code ask the c-code to perform the operation and return the result.
Additional platform dependent options
Use JACOB (win32 only: com access)
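Of the options above, the CLI route is the easiest to show in a few lines. This is a minimal sketch using ProcessBuilder, the more flexible sibling of Runtime.exec, and it assumes Java 9 or later for readAllBytes. The class CliBridge is made up, and because no real C utility is available here, the example runs the JVM's own launcher with -version as a stand-in command.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical bridge that calls an external command-line tool.
class CliBridge {
    // Runs the command and returns everything it printed (stdout and stderr merged).
    static String runAndCapture(List<String> command) {
        try {
            Process p = new ProcessBuilder(command).redirectErrorStream(true).start();
            byte[] out = p.getInputStream().readAllBytes(); // read until the tool closes its output
            int exit = p.waitFor();
            if (exit != 0) {
                throw new IllegalStateException("exit code " + exit);
            }
            return new String(out, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Stand-in for the path of a native utility: the launcher of the running JVM.
    static String javaBinary() {
        return Paths.get(System.getProperty("java.home"), "bin", "java").toString();
    }

    public static void main(String[] args) {
        System.out.print(runAndCapture(Arrays.asList(javaBinary(), "-version")));
    }
}
```

In a real setup the command list would name the compiled C utility and its arguments instead, and the Java side would parse whatever that tool prints.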

I am not sure if I understand what you are looking for.
If you are looking to incorporate the C code into Java to make a native binary without the gcj, I think you are out of luck. You can include the C in Java, but it would be a primarily Java program meaning you would need the JVM on each run. Is there anything stopping you from compiling the gcj yourself?

Creating non-reverse-engineerable Java programs

Is there a way to deploy a Java program in a format that is not reverse-engineerable?
I know how to convert my application into an executable JAR file, but I want to make sure that the code cannot be reverse engineered, or at least, not easily.
Obfuscation of the source code doesn't count... it makes it harder to understand the code, but does not hide it.
A related question is How to lock compiled Java classes to prevent decompilation?
Once I've completed the program, I would still have access to the original source, so maintaining the application would not be the problem. If the application is distributed, I would not want any of the users to be able to decompile it. Obfuscation does not achieve this as the users would still be able to decompile it, and while they would have difficulty following the action flows, they would be able to see the code, and potentially take information out of it.
What I'm concerned about is if there is any information in the code relating to remote access. There is a host to which the application connects using a user-id and password provided by the user. Is there a way to hide the host's address from the user, if that address is located inside the source code?

The short answer is "No, it does not exist".
Reverse engineering is a process that does not necessarily involve looking at the code at all. It's basically trying to understand the underlying mechanisms and then mimic them. For example, that's how JScript came out of MS labs: by copying the behavior of Netscape's JavaScript, without having access to the code. The copy was so perfect that even the bugs were copied.

You could obfuscate your JAR file with YGuard. It doesn't obfuscate your source code, but the compiled classes, so there is no problem about maintaining the code later.
If you want to hide some string, you could encrypt it, making it harder to get it through looking at the source code (it is even better if you obfuscate the JAR file).

If you know which platforms you are targeting, get something that compiles your Java into native code, such as Excelsior JET or GCJ.
Short of that, you're never going to be able to hide the source code, since the user always has your bytecode and can Jad it.

You're writing in a language that has introspection as part of the core language. It generates .class files whose specifications are widely known (thus enabling other vendors to produce clean-room implementations of Java compilers and interpreters).
This means there are publicly-available decompilers. All it takes is a few Google searches, and you have some Java code that does the same thing as yours. Just without the comments, and some of the variable names (but the function names stay the same).
Really, obfuscation is about all you can get (though the decompiled code will already be slightly obfuscated) without going to C or some other fully-compiled language, anyway.

Don't use an interpreted language? What are you trying to protect anyway? If it's valuable enough, anything can be reverse engineered. The chances of someone caring enough to reverse engineer most projects is minimal. Obfuscation provides at least a minimal hurdle.
Ensure that your intellectual property (IP) is protected via other mechanisms. Particularly for security code, it's important that people be able to inspect implementations, so that the security is in the algorithm, not in the source.

I'm tempted to ask why you'd want to do this, but I'll leave that alone...
The problem I see is that the JVM, like the CLR, needs to be able to interpret your code in order to JIT compile and run it. You can make it more "complex", but given that the spec for bytecode is rather well documented, and exists at a much higher level than something like the x86 assembler spec, it's unlikely you can "hide" the process flow, since it's got to be there for the program to work in the first place.

Make it into a web service. Then you are the only one that can see the source code.

It can't be done.
Anything that can be compiled can be de-compiled. The very best you can do is obfuscate the hell out of it.
That being said, there is some interesting stuff happening in Quantum Cryptography. Essentially, any attempt to read the message changes it. I don't know if this could be applied to source code or not.

Even if you compile the code into native machine language, there are all sorts of programs that let you essentially decompile it into assembly language and follow the process flow (OllyDbg, IDA Pro).

It cannot be done. This is not a Java problem. Any language that can be compiled can be decompiled; for Java, it's just easier.
You are trying to show somebody a picture without actually showing it to them. It is not possible. You also cannot hide your host, even if you hide it at the application level. Someone can still grab it via Wireshark or any other network sniffer.

As someone said above, your executable can always be decompiled through reverse engineering. The only way to protect your source code (or algorithm) is not to distribute your executable.
Separate your application into server code and a client app, hide the important part of your algorithm in the server code and run it on a cloud server, and distribute only the client code, which works purely as a data getter and sender.
That way, even if your client code is decompiled, you are not losing anything.
But for sure this will decrease performance and user convenience.
I think this may not be the answer you are looking for, but I wanted to raise a different idea for protecting source code.

With anything interpreted, at some point it has to be processed "in the clear". The string would show up clear as day once the code is run through JAD. You could deploy an encryption key with your app or do a basic Caesar cipher to encrypt the host connection info and decrypt it at runtime...
But at some point during processing the host connection information must be put in the clear in order for your app to connect to the host...
So you could statically hide it, but you can't hide it during runtime if they are running a debugger.
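Here is a small sketch of that Caesar cipher idea, mostly to show how thin the protection is. The host name db.example.com, the shift of 3 and the class name HostObfuscation are all made up for the example.

```java
// Hypothetical example of hiding a host name from a plain string search.
class HostObfuscation {
    // Shifts ASCII letters by `shift` positions; all other characters pass through.
    static String caesar(String text, int shift) {
        int s = ((shift % 26) + 26) % 26; // normalize so negative shifts work too
        StringBuilder sb = new StringBuilder(text.length());
        for (char c : text.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                sb.append((char) ('a' + (c - 'a' + s) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                sb.append((char) ('A' + (c - 'A' + s) % 26));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Result of caesar("db.example.com", 3); this is what a decompiler shows.
    private static final String HIDDEN_HOST = "ge.hadpsoh.frp";

    // The clear-text host exists in memory as soon as this returns.
    static String host() {
        return caesar(HIDDEN_HOST, -3);
    }

    public static void main(String[] args) {
        System.out.println(host());
    }
}
```

A search for the host name in the decompiled output finds nothing, but a breakpoint on host(), or a network sniffer, reveals it immediately.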

This is impossible. The CPU will have to execute your program, i.e. your program must be in a format that a CPU can understand. CPUs are much dumber than humans. Ergo, if a CPU can understand your program, a human can.

Having concerns about concealing the code, I'd run ProGuard anyway.
