Tools to detect duplicated code (Java) [closed] - java

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 8 years ago.
I am on a project where previous programmers copy-pasted code all over the place. These snippets are identical (or very similar) and could have been refactored into one.
I have spent countless hours refactoring this code manually, but I think there must be a better way. Some are trivial static methods that could have been moved into an ancestor class (but instead were copy-pasted all over by previous junior programmers).
Is there a code analysis tool that can detect this and provide reports/recommendations? I prefer a free/open source tool if possible.

I use the following tools:
PMD/CPD (BSD-style License).
Checkstyle (LGPL License) - support was removed, see details.
Both tools have code duplication detection support. But both of them lack the ability to advise you how to refactor your code.
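To give a feel for what duplication detection involves, here is a minimal sketch that reports repeated windows of consecutive lines after collapsing whitespace. It is not how CPD or Checkstyle work internally (real tools typically compare token streams, not raw lines); the class name `NaiveDuplicateFinder` and the `window` parameter are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of PMD/CPD or Checkstyle.
final class NaiveDuplicateFinder {

    /**
     * Returns every window of `window` consecutive non-blank lines that occurs
     * more than once, mapped to the 1-based line numbers where it starts.
     */
    static Map<String, List<Integer>> find(String source, int window) {
        String[] raw = source.split("\n");
        List<String> lines = new ArrayList<>();
        List<Integer> lineNumbers = new ArrayList<>();
        for (int i = 0; i < raw.length; i++) {
            // Collapse whitespace so that formatting differences do not hide a copy.
            String normalized = raw[i].trim().replaceAll("\\s+", " ");
            if (!normalized.isEmpty()) {
                lines.add(normalized);
                lineNumbers.add(i + 1);
            }
        }
        Map<String, List<Integer>> seen = new LinkedHashMap<>();
        for (int i = 0; i + window <= lines.size(); i++) {
            String key = String.join("\n", lines.subList(i, i + window));
            seen.computeIfAbsent(key, k -> new ArrayList<>()).add(lineNumbers.get(i));
        }
        // Keep only the windows that were seen at least twice.
        seen.values().removeIf(starts -> starts.size() < 2);
        return seen;
    }
}
```

A larger `window` cuts down on noise from short coincidental matches; the dedicated tools expose a comparable minimum-size threshold.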
JetBrains IntelliJ IDEA Ultimate has good static code analysis with code duplication support, but it is not free.

Most of the tools listed on the Wikipedia article on Duplicate Code Tools will detect duplicates in many different languages, including Java.

SonarQube can detect duplicated code but does not give recommendations on eliminating it. It is free, although with the default setup it can only detect lexically identical clones.

Either Simian or PMD's CPD. The former supports a wider set of languages but is not free for commercial projects.

http://checkstyle.sourceforge.net/ has support for finding duplicates

See our SD Java CloneDR, a tool for detecting exact and near-miss duplicate code in large Java systems.
The CloneDR will find code clones in spite of whitespace changes, line breaks, comment insertions or deletions, modification of constants or identifiers, and in a number of cases, even replacement of one statement by another or by a block of statements.
It shows where each set of clones is found, each individual clone, an abstraction capturing what the clones have in common, and a parameterization of that abstraction showing how each clone instance can be derived from it.
It finds 10-20% clones in most Java systems.
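The idea of a "near-miss" clone (same structure, different identifiers or constants) can be illustrated with a rough sketch. This is not CloneDR's algorithm; it is a naive token normalization under my own assumptions, with a hypothetical `CloneNormalizer` class and a deliberately small keyword list. It strips comments, replaces every identifier with `ID` and every number with `NUM`, and then compares the results.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of near-miss clone matching; not CloneDR's method.
final class CloneNormalizer {

    // Deliberately incomplete keyword list, enough for small snippets.
    private static final Set<String> KEYWORDS = Set.of(
            "int", "long", "double", "boolean", "void", "for", "while",
            "if", "else", "return", "new", "static", "final", "class");

    // An identifier, a number, or any other single non-space character.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z_0-9]*|\\d+(?:\\.\\d+)?|\\S");

    static String normalize(String code) {
        // Naive comment stripping: ignores comment markers inside string literals.
        String noComments = code
                .replaceAll("(?s)/\\*.*?\\*/", " ")
                .replaceAll("//[^\\n]*", " ");
        StringBuilder out = new StringBuilder();
        Matcher m = TOKEN.matcher(noComments);
        while (m.find()) {
            String token = m.group();
            char first = token.charAt(0);
            if (Character.isDigit(first)) {
                out.append("NUM");
            } else if (Character.isJavaIdentifierStart(first) && !KEYWORDS.contains(token)) {
                out.append("ID");
            } else {
                out.append(token);
            }
            out.append(' ');
        }
        return out.toString().trim();
    }

    static boolean isNearMissClone(String a, String b) {
        return normalize(a).equals(normalize(b));
    }
}
```

Because every identifier collapses to the same placeholder, this sketch cannot tell consistent renaming from arbitrary renaming, and it cannot detect replaced statements at all; those cases need a tool that works on the syntax tree.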

Related

Java source code parsers/generators [closed]

Closed 8 years ago.
I need tools to:
Conveniently parse Java source code and easily access given elements.
Easily generate source code files, to easily transform data structures into code
Any good tips, libraries, frameworks, tools? Thank you for help.
If you need to parse existing source code, use JavaParser. It gives you visitor-based access to the AST. You can write new code, but many things are a pain (e.g. referencing other classes)
If you need to generate source code use CodeModel. It lets you programmatically create classes, packages, methods etc, and it's very easy to use. However, I don't think it can import existing code.
Both are pretty awesome in their respective domains.
Since Java 6, the compiler has an API included in the JDK. Through it you can access the results of the Java parser through the javax.lang.model APIs. The same functionality was present with JDK5 in the form of the Mirror API. There's a good introductory article here.
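As a rough sketch of the compiler API route, the following parses a source string with the JDK's own compiler and lists the method names it finds. Note the assumptions: it uses the `com.sun.source` tree API that ships with the JDK (`JavacTask`, `TreeScanner`) rather than `javax.lang.model` directly, it needs a full JDK at runtime, and the names `MethodLister` and `StringSource` are made up for the example.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; requires a JDK (not a bare JRE) at runtime.
final class MethodLister {

    /** In-memory source file, so the example needs nothing on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the given source (no type checking) and returns its method names. */
    static List<String> methodNames(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        List<JavaFileObject> units = List.of(new StringSource(className, code));
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, units);
        List<String> names = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                // Visit every method declaration in the parsed tree.
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethod(MethodTree node, Void unused) {
                        names.add(node.getName().toString());
                        return super.visitMethod(node, unused);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

The same scanner can override other `visit...` methods (classes, variables, method invocations) to reach other elements of the tree.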
The best code generation tool I've seen is CodeModel. It has a very simple API and can generate multiple Java source files at once.
Our DMS Software Reengineering Toolkit and its Java Front End can do this. They are designed to enable the construction of custom analyzers and code generators.
DMS provides generic parsing, abstract-syntax tree (with comments) and symbol table building, tree navigation/inspection/modification facilities, and the ability to regenerate the complete source code from the modified tree. Additional facilities include source-to-source transformation rules ("if you see this syntax, replace it with that syntax"), patterns (used to build or recognize subtrees), attribute grammar evaluators, control and data flow analysis, and call-graph construction. The Java Front End specializes DMS to do all of this for Java 1.4-1.6 with 1.7 nearby.
(EDIT May 2016: Now handles Java 1.8)
DMS is also designed to handle scale: it is often used to process many compilation-units (source files) at the same time, enabling analysis and transformations that cross file boundaries. It can also handle multiple languages at the same time; DMS has front ends for a wide variety of languages.
Check out Antlr. One of its examples is a Java grammar.

Generic java bug-estimator [closed]

Closed 9 years ago.
I heard there is software available that you can feed Java code and it gives you back an indication of where your code is likely to produce unexpected behaviour. However, I cannot find such software, nor anything resembling this. Does anyone have an idea on where to look? (I don't know where I heard this, so no, I cannot go back to the source and ask :( )
For example:
-it might check whether the same code is found at multiple places (when editing you are likely to forget to update one of the two, hence this could give problems in the future)
-it might check whether you might be returning null somewhere where the null stays untreated,
-it might check for 'magic numbers' (numbers that are used in code without being assigned to a variable), especially when these numbers appear at various places in the code.
-etc, etc.
(What it doesn't need to check is whether the code can be compiled. That is, of course, something we already have many other tools for, like Eclipse.)
I do not know for certain whether the described software exists and what it looks like, but any help in this direction would be great!
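To make the "magic numbers" check from the list above concrete, here is a naive sketch of the idea: count numeric literals in a piece of source text and report the ones that appear more than once. This is only an illustration under my own assumptions (the class name `MagicNumberFinder` is invented, 0 and 1 are ignored by choice, and a regular expression stands in for real parsing); actual static analysis tools work on the parsed program.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, regex-based illustration; real analyzers inspect the syntax tree.
final class MagicNumberFinder {

    // A bare integer or decimal literal that is not part of an identifier.
    private static final Pattern NUMBER =
            Pattern.compile("(?<![\\w.])\\d+(?:\\.\\d+)?(?![\\w.])");

    /** Returns each numeric literal (other than 0 and 1) that occurs at least twice. */
    static Map<String, Integer> repeatedLiterals(String source) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = NUMBER.matcher(source);
        while (m.find()) {
            String literal = m.group();
            if (!literal.equals("0") && !literal.equals("1")) {
                counts.merge(literal, 1, Integer::sum);
            }
        }
        // A literal used only once is less likely to drift out of sync.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }
}
```

For the input `int a = x * 42; int b = y + 42; int c = 7;` this reports that `42` occurs twice, which is the kind of value one would normally pull into a named constant.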
Sounds like the code inspections you find in IntelliJ Idea.
There is FindBugs, which does a static code analysis to find "bug patterns". That is, code that falls into one of these categories:
Difficult language features
Misunderstood API methods
Misunderstood invariants when code is modified during maintenance
Garden variety mistakes: typos, use of the wrong boolean operator
It's nice to use and tends to find some bugs. However, it cannot do everything you want (e.g. "returned null" checks).
There are a few tools like this. I've found PMD and Findbugs particularly useful.
findbugs is one such tool. More generally, what you're talking about is called static analysis.
If you work with Eclipse, you could use CodePro.
Use Sonar. It is a web application that has all of this functionality and a nice UI. Visit the site (http://www.sonarsource.org/) and see nemo, the live instance where all Jakarta projects are hosted: http://nemo.sonarsource.org/

Java obfuscation - ProGuard/yGuard/other? [closed]

Closed 9 years ago.
This is along similar lines as these recent questions:
Best Java Obfuscation Application For Size Reduction
Creating non-reverse-engineerable Java programs
However, one ends up recommending yGuard and the other ProGuard, but neither mentions both. I wonder if we could get a comparison of each one and hear people's experiences from both sides of the fence. Looking at this comparison chart on the ProGuard website, it's clearly angled towards ProGuard. But what about real-world experience of each: which one produces smaller output? Which one is harder to decompile from? What Java versions are supported by each?
Personally I'm particularly interested from a J2ME point of view but please don't limit the discussion to that.
Results for my project.
Obfuscation - both fine.
Optimisation - ProGuard produced 20% faster code (for the measured app bottleneck).
Compactness - ProGuard about 5% smaller.
Configuration / Ant - YGuard is much easier to configure.
So, I'd advise ProGuard - but configuration and ant integration could definitely be improved.
Proguard is a better product; especially if you take the time to go through the settings for J2ME.
Specifically for J2ME there is a far better (commercial) product called mBooster
I've been getting around 25% improvement in size on my application after it's been through ProGuard. This is mainly to do with the better Zip compression on the Jar file and comprehensive support for class merging and preverification.
My opinion is that ProGuard is better. Output is a bit smaller. Optimizing is better and much faster.
Decompiling is simple in both cases. I mean, if you know Java well and really know the business logic of what you're decompiling, it is no problem to get from obfuscated classes back to sources.
So, my opinion is ProGuard is better.

Java obfuscators [closed]

Closed 9 years ago.
I'm looking for a good Java obfuscator.
I've done initial research into the following Java obfuscators: proguard, yguard, retroguard, dasho, allatori, jshrink, smokescreen, jobfuscate, marvin, jbco, jode, javaguard, jarg, joga, cafebabe, donquixote, mwobfu, bbmug, zelix klassmaster, sandmark, jcloak, thicket, blufuscator, and java code protector.
I tried proguard and it has a really nice GUI, seems really stable, and seems to be the most popular, but it seemed to not like some enumeration on a referenced jar file (not within the code I was trying to obfuscate) which was weird. Yguard seems to require some interaction with ant, which I didn't know too much about.
What is a good java obfuscator? It doesn't need to be free, it just needs to work well and be easy to use.
I use ProGuard heavily for all my release builds and I have found it is excellent. I can't recommend it enough!
I have encountered obscure bugs caused by its optimizations on several occasions and I now disable optimizations across the board; I haven't had a problem caused by ProGuard since. Though, to be fair, these were all quite some versions ago. YMMV.
I used to use the GUI only to get a config started, and then I resort to editing the text config myself, which is really very simple to do. These days I do the config by hand.
I have quite complex projects all of which involve dynamic loading and reflection. I also heavily use reflection for a callback implementation. ProGuard has coped with these very well.
EDIT: We also use DashO Pro for one of our products. I looked into it for packaging the products I am responsible for and concluded that its configuration was too convoluted and complex; also, integrating it into the build script seemed like a bit of a pain. But again, to be fair, this was circa 2001... so it might be better in current versions.
A good collection of links to free and commercial tools is given in this article
"Protect Your Java Code - Through Obfuscators And Beyond"
The author also discusses the strong and weak points of bytecode obfuscation
What is the issue with ProGuard ? (which is recommended both by this question and this one).
There is a troubleshooting section about enumerations, but they seem to be handled just fine.
However, obfuscation breaks some attempts at reflection, even though modern obfuscators can detect and to some extent adjust usages of reflection in the code they're obfuscating.
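Why reflection is fragile under obfuscation can be shown with a small sketch. The classes `Greeter` and `ReflectiveCall` are invented for illustration. The lookup works as long as the method is really called `greet`; once an obfuscator renames it (say, to `a`) without also updating the string, the lookup fails at runtime. The usual remedy, as far as I know, is to exclude such members from renaming, for example with a ProGuard `-keep` rule.

```java
import java.lang.reflect.Method;

// Hypothetical class whose method is looked up by name at runtime.
final class Greeter {
    public String greet() {
        return "hello";
    }
}

final class ReflectiveCall {

    /**
     * Invokes a public no-argument method by its name.
     * The name is just a string, so a renaming obfuscator cannot always
     * tell that it refers to a method it is about to rename.
     */
    static Object invokeByName(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // After renaming, getMethod throws NoSuchMethodException and we end up here.
            throw new IllegalStateException("Reflective lookup failed: " + methodName, e);
        }
    }
}
```

Calling `ReflectiveCall.invokeByName(new Greeter(), "greet")` succeeds on the original classes, while asking for a name that no longer exists raises the `IllegalStateException`.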
I used Zelix Klassmaster in a commercial application for several years and found it to be excellent. I threw quite a few resources at the obfuscated code, and was not able to "break" it. It's pricey, but good.
I only stopped using it when my version got old enough that the upgrade cost was significant. My needs had changed and I didn't really need to obfuscate the classes anymore. However, if the need arises again, I'd pay for it and use it in a flash.
Cheers,
-Richard
We have been using Zelix Klassmaster for a couple of years and I can recommend it.
I use and suggest Zelix - 100% - very solid and robust protection

Tools for converting non-Java into Java source [closed]

Closed 9 years ago.
Are there any good tools out there for automatically converting non-Java source code into Java source?
I'm not expecting something perfect, just to get the worst of the grunt work out of the way.
I guess there is a sliding scale of difficulty. C# should be relatively easy (so long as you ignore all the libraries). (Well-written) C++ is not so bad. C requires making it a little OO. (Statically typed) functional languages may be easy to grok. Dynamic OO languages may require non-local analysis.
One thing you can try is find a Java bytecode compiler for the language you're talking about (there are JVM compilers for all kinds of languages) and then decompile the bytecode back into Java using a decompiler like Jad.
This is fraught with peril. The regenerated code will suck and will probably be unreadable.
Source-to-source migrations fall under the umbrella of Program Transformation. Program-Transformation.org tracks a bunch of tools that are useful for language recognition, analysis, and transformation. Here are a few that are capable of source-to-source migrations:
ASF+SDF Meta-Environment - As noted, there is no new development on this tool. Instead, the developers are focusing on Rascal.
Rascal Meta Programming Language
Stratego /XT
TXL
DMS® Software Reengineering Toolkit (commercial)
If you spend any time with one of the open source tools, you'll notice that even though they include source-to-source migration as a feature, it's hard to find working examples. I imagine this is because there's no such thing as a one-size-fits-all migration. Each project/team makes unique use of a language and can vary by libraries used, type complexity, idioms, style, etc. It makes sense to define some transformations per migration. This means a project must reach some critical mass before automatic migration is worth the effort.
A few related documents:
An introduction to Rascal - includes a migration between the toy language Pico and Assembly starting at page 94.
Cracking the 500 Language Problem
An Experiment in Automatic Conversion of Legacy Java Programs to C# (gated) - uses TXL
Google: ANTLR
The language conversion is fairly simple, but you will find the libraries are different.
This is likely to be most of your work.
If you just want to use some legacy C/Pascal code, you could also use JNI to call it from Java.
If you want to run it in a Java applet or similar constrained environment, and it does not have to be very efficient, you can use NestedVM (which is a MIPS to Java bytecode converter) in conjunction with a gcc cross-compiler that compiles to MIPS. But don't expect to get readable Java code from that.
Any of those tools might help only if your non-Java code base is not too large.
If it is a huge non-Java code base and you seriously want to translate it to Java, a few things need to be thought through. It is not just hundreds of lines of code: there is a design beneath it, and people made decisions beneath the code that solved certain problems and keep things working. Investing time in finding a good translator won't be worth it, as one won't exist; this is not just syntax translation from one language to another.
If the code base is not so large, it is better to rewrite it in Java. Since Java has so many API packages out of the box, it might not be a big deal; hiring a few interns for this might also help.
ADA to Java can be done with a find-and-replace!
