At a recent interview, I was asked:
Open source web app (say built on Struts/Spring) is more prone to hacking since anyone can access the source code and change it. How do you prevent it?
My response was:
The java source code is not directly accessible. It is compiled into class files, which are then bundled in a war file and deployed within a secure container like Weblogic app server.
The app server sits behind a corporate firewall and is not directly accessible.
At that time - I did not mention anything about XSS and SQL injection which can affect a COTS-based web app similar to an open source one.
My questions:
a) Is my response to the question correct?
b) What additional points can I add to the answer?
thanks in advance.
EDIT:
While I digest your replies - let me also point out the question was also meant towards frameworks such as Liferay and Apache OFBiz.
The question is a veiled argument towards Security through obscurity. I suggest you read up the usual arguments for and against and see how that fits:
Security through obscurity ( Wikipedia )
Hardening Wordpress
SSH server security (Putty)
My personal opinion is that obscurity is at best the weakest layer of defence against atack. It might help filter out automated attacks by uninformed attackers, but it does not help much against a determined assault.
a) Is my response to the question correct?
The part about the source not being accessible (to change it) because it is compiled and deployed where it cannot be touched is not a good answer. The same applies to non-open-source software. The point that was being made against an open source stack is that the source is accessible to read, which would make it easier to find vulnerabilities that can be exploited against the installed app (compiled or not).
The point about the firewall is good (even though it does not concern the open- or closedness of the software, either).
b) What additional points can I add to the answer?
The main counterargument against security through obscurity (which was the argument being made here) is that with open source software, many more people will be looking at the source in order to find and fix these problems.
since anyone can access the source code and change it.
Are you sure that is what they said? Change it? Not "study it"?
I don't see how anyone can just change the source code for Struts...
A popular open-source web framework/CMS/library is less likely to have horrible bugs in it for long, since there are lots of people looking at the code, finding the bugs, and fixing them. (Note, in order for this to matter, you'll need to keep your stuff up to date.)
Now, your friend does have a tiny point -- anyone who can fix the bugs could also introduce them, if the project is run by a bunch of idiots. If they take patches from any random schmuck without looking the patches over, or don't know what they're doing in the first place, it's possible to introduce bugs into the framework. (This doesn't matter unless you update regularly.) So it's important to use one that's decently maintained by people who have a clue.
Note, all of the problems with open-source frameworks/apps apply to COTS ones as well. You just won't know about bugs in the latter til after bugtraq and other such lists publish them, as big companies like to pretend there aren't any bugs in their software til forced to react.
a) Yes. Open source doesn't mean open binaries :) The sentence "anyone can change the source code" is simply incorrect (you can change your copy of the code, but can't edit Apache Struts code)
b) Maybe the fact that the source code is visible makes it easier to somebody to see the posible flaws it can have and exploit them. But, the same argument functions the other way: as a lot of people review the code the flaws are found faster so the code is more robust at the end.
Related
I have been working on a project alone for more than two years for a company. The project is a really big one using rxtx to communicate with a hardware device. I used Java 8 and JAVAFX for the UI. Now it is almost finished and I am starting to search how to deliver the end user application that the company will distribute over its clients.
The problem is that the company I am working with wants the code to be non reachable when the software is between final clients hands because the Java code contains some extremely sensitive information that could have very bad consequences for the company if final clients happened to know them. The clients can literally perform actions they don’t have the right to perform.
So after searching (a lot) and thinking relatively to my case, I understood that giving a JAR obfuscated isn’t the solution. I then tried to generate a JAR and then transform it to an EXE but all I succeeded on was wrapping the JAR into EXE which does not prevent extracting the JAR and then seeing all the code easily. Finally, I found that I should use AoT compilation like GCJ compiler to produce native binary exe from my Java code but here I am stuck because after watching videos and reading articles etc I didn’t manage to find a clear way to produce the native binary exe.
I am now confused since I don’t know if I am on the right path and good direction or if I am totally wrong and there is another way of protecting the code (at least from non professional hackers, I understand that it is not possible to make it 100% safe but I am just searching for a reasonable and good way). How should I manage this final step of my work?
I currently work for a company that has code that we don't want anyone to have access to for the security of our clients and-- less important-- for legal reasons. ;-)
One possible solution you could look into would be to rewrite the code you deem most sensitive into a C/C++ library. It would be possible to compile this into a .so/.dll/.dylib file for the respective OSs and it would make it difficult, not entirely impossible, but difficult to decompile.
The trouble would come from learning how to access native code from Java as much of the documentation is not helpful or just simply nonexistent. This would utilize the Java Native Interface (JNI) which allows Java to, well, interface with the native (compiled C/C++) code. This would make it possible to create a Jar file that would effectively become a Java library for you to access throughout the rest of your project. The native code, however will still need to be loaded at runtime, but that's apart of learning how JNI works. A helpful link I found for JNI is http://jnicookbook.owsiak.org/ (for as long as it's still a functional link).
One of our clients here where I work has a project written in Java and needed to implement our code that is unfortunately all written in C. So we needed a way to access this C/C++ code from Java. This is the way we went about solving this issue without rewriting our code in Java. But we had the benefit (?) of having already written our code in C.
This solution to write a bunch of extra code last minute in another language that I may or may not be familiar with doesn't sound like particularly fun time.
I would be curious to learn what possible problems others might see with this solution.
I have just created a mid-sized web-application using Java, a custom MVC framework, javascript. My code will be reviewed before it's put in the productions servers (internal use).
The primary objective of building this app was to solve a small problem for internal use and understand the custom made MVC framework used by my employer. So, my app has gone through MANY iterations, feature changes and additions.
So, bottom line, the code is very very dirty and this is my first "product level" Java app.
What are your suggestions, what are some basic checks/refractoring I should do before the code review?
I am thinking about:
Java best practices (conventions)
Make the code simple to understand for the developer who will maintain it. (won't be me)
I noticed, I have created some unnecessary objects and used hashmaps/arraylists where could have easily used some other Data structure and achieved the solution. So, is that worth changing?
Update
Your Code Sucks and I Hate You: The Social Dynamics of Code Reviews
If you did not already, (assuming you use an IDE like eclipse)
get plugins checkstyle and findbugs
go through their configuration and tune to your style
run them on your code
resolve all issues reported
you can also tune the compiler warning setting of eclipse itself and possibly make them more strict in what is reported.
Look at code structure:
get plugin jdepend
investigate your package structure
Code against interfaces (Map, List, Set) instead of implementation classes (HashMap, ArrayList, TreeSet)
Complete your Javadoc and make check it is up to date after all refactorings.
Add JUnit tests; if you have no time left to test the whole application, at least create a test for every bug you find and solve from now on. This helps "growing" a test set as you go.
Next time design and build your application with the end goal in sight. Always assume that the next guy having to maintain your code will know how to find you :-)
Unit tests, and they should be automated as part of your build. You should already have these, but if not, do it now. It will definitely make the refactoring easier, as well improving your general confidence in the code (and the guy who will be maintaining it).
Logging.
One of the more overlooked things is the importance of logging. You need to have a decent logging methodology put in place. Even though this is an internal app, make sure that the basic logs can help regular users find issues and provide more detailed logging so that you (the developer) would know where to go.
Comment your code, explain why it's doing what it's doing and what assumptions have been made.
Try to reduce the amount of mutating state.
Try to remove any singletons you may have.
All right, I've hit a bug that has simply confused the bejeebus out of me. I'm looking for ideas about what it could be that I can investigate, because right now, I got nothing. It goes something like this:
I have a standalone Java application that occasionally needs to twiddle the Line-In volume of the computer it's running on (a WinXP machine). It does this by calling a pair of external executables (written in VB6*) that can get and set various component volumes. (They can handle Line-In, Mic, Wave, CD, and the master volume control.)
There are several hundred units in the field, running on hardware (Dell machines) that my company provided and controls. At least several dozen clients are using this feature, and it works perfectly -- except for one instance.
For this one troublemaking machine, it simply doesn't work. I watch the volume sliders when the app is running, and when the volume is supposed to drop, they stay put. When I check the app's log file, it throws no errors, and appears to be executing the code that drops the volume. When I run the executables from the command line, they work perfectly.
I can't vouch for this machine being 100% identical to all the ones that are behaving properly, but we've been buying the same line of Dells for quite some time now; at a bare minimum, it's very, very similar.
So, turning my confusion into a bullet list:
If I'm doing something stupid in the Java code (i.e., not clearing my STDOUT/STDERR buffers), why is it only an issue on this machine?
If there's something broken in the VB6 executables, why do they work on every other machine and on this machine from the command line?
If there's some sort of hardware oddity on this machine, what sort of oddity could cause the volume control executables to fail only when called from within a Java application?
I am very confused. I do not like being confused. Anybody have any suggestions that may lead to my enlightenment?**
-* -- I know, I know, VB6, 1998 called and they want their obsolescent proprietary bug generator back, etc. Wasn't my decision. But the code works. Usually.
-** -- Insert Buddhism joke here.
Update Edit: Customer service may have stumbled onto something; it may be something to do with client configuration settings in the database. New evidence suggests that either something's misconfigured for that client or my software is doing something stupid in response to a specific configuration. And the problem may be more widespread than we thought, due to this particular feature not being as commonly used as I thought.
Responding to the comments:
Debugger: Theoretically possible, but looks like a massive headache given our setup.
High Verbosity Logging, Java: Good idea this, particularly given than the problem may be more widespread than I originally believed. Time to start revisiting some assumptions. And possibly clubbing them. Like baby seals.
High Verbosity Logging, VB6: A possibility; will need to be rolled-into the high-verbosity Java logging to trap the output, since my VB6-fu is so pitiably weak I don't know how to output text to a file. But, yeah, knowing whether or not the script is even getting called would be valuable.
Window Event Viewer: Not familiar with this tool. May have to correct that.
PATH problem: Doesn't feel likely; the Java code constructs a relative path to the executable that doesn't look like it's relying on any environment variables.
My thanks for the suggestions people have provided; at the very least, you've gotten my brain moving in directions that feel promising.
Solution Edit: And the winner is ... That's Not A Bug, That's A Feature! A feature gone horribly, horribly wrong. A feature that will now be neutered so as to stop bothering us.
A batch of invalid assumptions kept me from seeing it sooner, not the least of which was "I don't need to tool the code with more debug statements -- the statements already in there are telling me all I need to know!" DaDaDom, if you'd like to turn your comment into an answer, there's a shiny checkmark in it for you.
Thanks to everybody who chimed in with a suggestion. Now if you'll excuse me, my head is late for a meeting with my desk.
Here goes an answer:
Can you create a version of the software with verbose logging or could you even debug the code? At least then you can tell if it's in the java or the VB part.
Hmmmm. I've been told that executing programs from Java is either easy or hard. The easy part is starting them up. The hard part is dealing with the I/O streams (see my earlier question on using Runtime.exec()). Maybe the VB program is doing or expecting something weird on these particular machines that the Java code isn't working with properly.
edit: I also found a link to Jakarta Commons Exec:
Rationale
Executing external processes from Java is a well-known problem area. It is inheriently platform dependent and requires the developer to know and test for platform specific behaviors, for example using cmd.exe on Windows or limited buffer sizes causing deadlocks. The JRE support for this is very limited, albeit better with the new Java SE 1.5 ProcessBuilder class.
Reliably executing external processes can also require knowledge of the environment variables before or after the command is executed. In J2SE 1.1-1.4 there is not support for this, since the method, System.getenv(), for retriving environment variables is deprecated.
There are currently several different libraries that for their own purposes have implemented frameworks around Runtime.exec() to handle the various issues outlined above. The proposed project should aim at coordinating and learning from these initatives to create and maintain a simple, reusable and well-tested package. Since some of the more problematic platforms are not readily available, it is my hope that the broad Apache community can be a great help.
Have you considered the possibility that the authenticated user may not have permission to edit volume settings on the workstation? Does the program run correctly if you run as an 'Administrator'?
Is there a way to deploy a Java program in a format that is not reverse-engineerable?
I know how to convert my application into an executable JAR file, but I want to make sure that the code cannot be reverse engineered, or at least, not easily.
Obfuscation of the source code doesn't count... it makes it harder to understand the code, but does not hide it.
A related question is How to lock compiled Java classes to prevent decompilation?
Once I've completed the program, I would still have access to the original source, so maintaining the application would not be the problem. If the application is distributed, I would not want any of the users to be able to decompile it. Obfuscation does not achieve this as the users would still be able to decompile it, and while they would have difficulty following the action flows, they would be able to see the code, and potentially take information out of it.
What I'm concerned about is if there is any information in the code relating to remote access. There is a host to which the application connects using a user-id and password provided by the user. Is there a way to hide the host's address from the user, if that address is located inside the source code?
The short answer is "No, it does not exist".
Reverse engineering is a process that does not imply to look at the code at all. It's basically trying to understand the underlying mechanisms and then mimic them. For example, that's how JScript appears from MS labs, by copying Netscape's JavaScript behavior, without having access to the code. The copy was so perfect that even the bugs were copied.
You could obfuscate your JAR file with YGuard. It doesn't obfuscate your source code, but the compiled classes, so there is no problem about maintaining the code later.
If you want to hide some string, you could encrypt it, making it harder to get it through looking at the source code (it is even better if you obfuscate the JAR file).
If you know which platforms you are targeting, get something that compiles your Java into native code, such as Excelsior JET or GCJ.
Short of that, you're never going to be able to hide the source code, since the user always has your bytecode and can Jad it.
You're writing in a language that has introspection as part of the core language. It generates .class files whose specifications are widely known (thus enabling other vendors to produce clean-room implementations of Java compilers and interpreters).
This means there are publicly-available decompilers. All it takes is a few Google searches, and you have some Java code that does the same thing as yours. Just without the comments, and some of the variable names (but the function names stay the same).
Really, obfuscation is about all you can get (though the decompiled code will already be slightly obfuscated) without going to C or some other fully-compiled language, anyway.
Don't use an interpreted language? What are you trying to protect anyway? If it's valuable enough, anything can be reverse engineered. The chances of someone caring enough to reverse engineer most projects is minimal. Obfuscation provides at least a minimal hurdle.
Ensure that your intellectual property (IP) is protected via other mechanisms. Particularly for security code, it's important that people be able to inspect implementations, so that the security is in the algorithm, not in the source.
I'm tempted to ask why you'd want to do this, but I'll leave that alone...
The problem I see is that the JVM, like the CLR, needs to be able to intrepert you code in order to JIT compile and run it. You can make it more "complex" but given that the spec for bytecode is rather well documented, and exists at a much higher level than something like the x86 assembler spec, it's unlikely you can "hide" the process-flow, since it's got to be there for the program to work in the first place.
Make it into a web service. Then you are the only one that can see the source code.
It can't be done.
Anything that can be compiled can be de-compiled. The very best you can do is obfuscate the hell out of it.
That being said, there is some interesting stuff happening in Quantum Cryptography. Essentially, any attempt to read the message changes it. I don't know if this could be applied to source code or not.
Even if you compile the code into native machine language, there are all sorts of programs that let you essentially decompile it into assembly language and follow the process flow (OlyDbg, IDA Pro).
It can not be done. This is not a Java problem. Any language that can be compiled can be decompiled for Java, it's just easier.
You are trying to show somebody a picture without actually showing them. It is not possible. You also can not hide your host even if you hide at the application level. Someone can still grap it via Wireshark or any other network sniffer.
As someone said above, reverse engineering could always decompile your executable. The only way to protect your source code(or algorithm) is not to distribute your executable.
separate your application into a server code and a client app, hide the important part of your algorithm in your server code and run it in a cloud server, just distribute the client code which works only as a data getter and senter.
By this even your client code is decompiled. You are not losing anything.
But for sure this will decrease the performance and user convenience.
I think this may not be the answer you are looking for, but just to raise different idea of protecting source code.
With anything interpreted at some point it has to be processed "in the clear". The string would show up clear as day once the code is run through JAD. You could deploy an encryption key with your app or do a basic ceasar cipher to encrypt the host connect info and decrypt at runtime...
But at some point during processing the host connection information must be put in the clear in order for your app to connect to the host...
So you could statically hide it, but you can't hide it during runtime if they running a debugger
This is impossible. The CPU will have to execute your program, i.e. your program must be in a format that a CPU can understand. CPUs are much dumber than humans. Ergo, if a CPU can understand your program, a human can.
Having concerns about concealing the code, I'd run ProGuard anyway.
A co worker of mine asked me to review some of my code and he sent me a diff file. I'm not new to diffs or version control in general but the diff file was very difficult to read because of the changes he made. Specifically, he used the "extract method" feature and reordered some methods. Conceptually, very easy to understand but looking at the diff, it was very hard to tell what he had done. It was much easier for me to checkout the previous revision and use Eclipse's "compare" feature, but it was still quite clunky.
Is there any version control system that stores metadata related to refactoring. Of course, it would be IDE and Programming Language specific, but we all use Eclipse and Java! Perhaps there might be some standard on which IDEs and version control implementations can play nicely?
Eclipse can export refactoring history (see 3.2 release notes as well). You could then view the refactoring changes via preview in Eclipse.
I don't know of compare tools that do a good job when the file has been rearranged. In general, this is a bad idea because of this type of problem. All too often people do it to simply meet their own style, which is a bad, bad reason to change code. It can effectively destroy the history, just like reformatting the entire file, and should never be done unless necessary (i.e. it is already a mess and unreadable).
The other problem is that working code will likely get broken because of someones style preferences. If it ain't broken, don't fix it!
I asked a similar question a while ago and never did get a satisfactory answer. I'll be watching your question to see what people come up with.
For your particular situation, it might be best to review the latest version of the file, using the diff as a guide. That's what I have been doing in my situation too.
The Refactoring History feature is new to me, but I like the way it sounds. For a less tool-specific method, I like sending patch files. The person reviewing just applies the patch and reviews the results, and then they can revert to the version in version control when they're done.