Related
Is there any tool that allows you to "query" (rather than simply search) the JDK documentation? For example:
Show me all deprecated methods.
Show me all subclasses of a particular class (rather than only the direct subclasses, which the documentation provides).
Show me all methods that return (say) a Thread.
Show me all instances of a given method name, regardless of signature.
What prompted this question is that I was muddling up two completely unrelated methods that I only use occasionally: Pattern.matches() and Matcher.matches(). And then I found that there is also PathMatcher.matches(), which also has a completely unrelated purpose. And that got me wondering how many other "matches()" methods there are in the JDK. And then I thought that there may be other useful queries that could also be run against the JDK documentation.
The only motivation for having such a tool is to help me improve my own knowledge of Java with information that is interesting or useful (to me at least), but is not otherwise easy to obtain. This question is similar, but I am looking for something more sophisticated than a simple search.
ETA: Marcel's suggestion below of using the Doclet API provides a great solution, without too much effort.
ETA2: Re determining deprecated methods, I've just found out that Oracle already address this in the JavaDoc API here
Could it be that you're approaching this from the wrong angle? Rather than parsing the docs, which is an already transformed representation of the source, why not parse source code or byte code of the JDK directly?
parsing byte code
parsing source code
hook into the Javadoc tool (i.e. let Javadoc parse the code for you) by using the Doclet API
Depending on your needs you might also want to take the really easy road and have your classpath scanned by the reflections library.
Reflections reflections = new Reflections("some.package");
Set<Method> voidMethods = reflections.getMethodsReturn(Thread.class);
That having said don't forget that any good IDE can dig up a lot of the info you seem to be looking for (e.g. searching for methods called matches).
Whenever I have the need to design an API in Java, I normally start off by opening up my IDE, and creating the packages, classes and interfaces. The method implementations are all dummy, but the javadocs are detailed.
Is this the best way to go about things? I am beginning to feel that the API documentation should be the first to be churned out - even before the first .java file is written up. This has few advantages:
The API designer can complete the design & specification and then split up the implementation among several implementors.
More flexible - change in design does not require one to bounce around among java files looking for the place to edit the javadoc comment.
Are there others who share this opinion? And if so, how do you go about starting off with the API design?
Further, are there any tools out there which might help? Probably even some sort of annotation-based tool which generates documentation and then the skeleton source (kind of like model-to-code generators)? I came across Eclipse PDE API tooling - but this is specific to Eclipse plugin projects. I did not find anything more generic.
For an API (and for many types of problems IMO), a top-down approach for problem partitioning and analysis is the way to go.
However (and this is just my 2c based on my own personal experience, so take it with a grain of salt), focusing on the Javadoc part of it is a good thing to do, but that is still not sufficient, and cannot reliably be the starting point. In fact, that is very implementation oriented. So what happened to the design, the modeling and reasoning that should take place before that (however brief that might be)?
You have to do some sort of modeling to identify the entities (the nouns, roles and verbs) that make up your API. And no matter how "agile" one would like to be, such things cannot be prototyped without having a clear picture of the problem statement (even if it is just a 10K foot view of it.)
The best starting point is to specify what you are trying to implement, or more precisely, what type of problems your API is trying to address. BDD might be of help (more of that below). That is, what is it that your API will provide (datum elements), and to whom, performing what actions (the verbs) and under what conditions (the context). That leads then to identify what entities provide these things and under what roles (interfaces, specifically interfaces with a single, clear role or function, not as catch-all bags of methods). That leads to an analysis on how they are orchestrated together (inheritance, composition, delegation, etc.)
Once you have that, then you might be in a good position to start doing some preliminary Javadoc. Then you can start working on the implementation of those interfaces, of those roles. More Javadoc follows (in addition to other documentation that might not fall within Javadoc .ie. tutorials, how-tos, etc.)
You start your implementation with use cases and verifiable requirements and behavioral descriptions of what each thing should do alone or in collaboration. BDD would be extremely helpful here.
As you work on, you continuously refactor, hopefully by taking some metrics (cyclomatic complexity and some variant of LCOM). These two tell you where you should refactor.
A development of an API should not be inherently different from the development of an application. After all, an API is a utilitarian application for a user (who happens to have a development role.)
As a result, you should not treat API engineering any diferently from general software-intensive application engineering. Use the same practices, tune them according to your needs (which every one who works with software should), and you'll do fine.
Google has been uploading its "Google Tech Talk" video lecture series on youtube for quite some time. One of them is an hour long lecture titled "How To Design A Good API and Why it Matters". You might want to check it out also.
Some links for you that might help:
Google Tech Talk's "Beyond Test Driven Development: Behaviour Driven Development" : http://www.youtube.com/watch?v=XOkHh8zF33o
Behavior Driven Development : http://behaviour-driven.org/
Website Companion to the book "Practical API Design" : http://wiki.apidesign.org/wiki/Main_Page
Going back to the Basics - Structured Design#Cohesion and Coupling : http://en.wikipedia.org/wiki/Structured_Design#Structured_Design
Defining the interface first is the programming-by-contract style of declaring preconditions, postconditions and invariants. I find it combines well with Test-Driven-Development (TDD), because the invariants and postconditions you write first are the behaviours that your tests can check for.
As an aside, it seems the Behaviour-Driven-Development elaboration of TDD seems to have come about because of programmers who did not habitually think of the interface first.
As for my self, I always prefer starting with writing the interfaces along with their documentation and only then start with the implementation.
In the past I took another approach which was starting with the UML and then using the automatic code generation.
The best tool I encountered for this matter was Rational Rose which is not free but I'm sure there are plenty of free plugins and utils.
The advantage of Rational Rose over other designers I bumped into was that you can "attach" the design to your code and then modify on either code or design and the other will update.
I jump right in with the coding with a prototype. Any required interfaces soon pop out at you and you can mould your proto into a final product. Get feedback along the way from whomever is going to be using your API if you can.
There is no 'best way' of approaching API design, do whatever works for you. Domain knowledge also has a large part to play
I'm a great fan of programming to the interface. It forms a contract between the implementors and the users of your code.
Rather than diving straight into code, I usually start with a basic model of my system (UML diagrams etc, depending on the complexity). Not only does this serve as good documentation, it provides a visual clarification of the system structure. Having this makes the coding part much easier to do. This kind of design documentation also makes it easier to understand the system when you come back to it in 6 months, or try to fix bugs :)
Prototyping also has its merits, but be prepared to throw it away and start again.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
Improve this question
I have an application written in Java. In is stored in several files. It uses different classes with different methods. The code is big and complicated. I think it would be easier to understand the code if I have a graphical model of the code (some kind of directed graph). Are there some standard methods for visualization of code. I am thinking about usage of UML (not sure it is a correct choice). Can anybody recommend me something?
ADDED:
I consider two possibilities:
Creating the graph by hands (explicitly).
Creating graph in an automatic way. For example to use some tools that read the available code and generate some graph describing the structure of the code.
ADDED 2:
It would be nice to have something for free.
I tried using a number of UML tools and found that the reverse-engineering capabilities in most UML tools were not helpful for understanding code. They focus on designing needs and reverse-engineering capabilities often just ends up showing huge pictures of lots of useless information. When I was working on the Microsoft Office codebase, I found using a pen-and-paper more helpful that the typical design/modelling tools.
You typically want to think about doing this in a number of ways:
Use your brain: Someone else mentioned it - there is no substitute to actually trying to understand a code base. You might need to take notes down and refer back to it later. Can tools help? Definitely. But don't expect them to do most of the work for you.
Find documentation and talk to co-workers: There is no better way than having some source describe the main concepts in a codebase. If you can find someone to help you, take a pen and paper, go to him and take lots of notes. How much to bug the other person? In the beginning - as much as is practical for your work, but no amount is too little.
Think about tools: If you are new to a part of a project - you are going to be spending a significant amount of time understanding the code, so see how much help you can get automatically. There are good tools and bad tools. Try to figure out which tools have capabilities that might be helpful for you first. As I mentioned above, the average UML tool focuses more on modeling and does not seem to not be the right fit for you.
Time vs Cost: Sure, free is great. But if a free tool is not being used by many people - it might be that the tool does not work. There are many tools that were create just as an exploration of what could be done, but are not really helpful and therefore just made available for free in hopes that someone else will adopt it. Another way to think about it, decide how much your time is worth - it might make sense to spend a day or two to get a tool to work for you.
Once there, keep these in mind when going trying to understand the project:
The Mile High View: A layered architectural diagram can be really helpful to know how the main concepts in a project are related to one another. Tools like Lattix and Architexa can be really helpful here.
The Core: Try to figure out how the code works with regards to the main concepts. Class diagrams are exceptionally useful here. Pen-and-paper works often enough here, but tools can not only speed up the process but also help you save and share such diagrams. I think AgileJ and Architexa are your best bets here, but your average UML tool can often be good enough.
Key Use Cases: I would suggest tracing atleast one key use case for your app. You likely can get the most important use cases from anyone on your team, and stepping through it will be really helpful. Most IDE's are really helpful here. If you try drawing them, then sequence diagrams arethe most appropriate. For tools here I think MaintainJ, JDeveloper and Architexa are your best bets here.
Note: I am the founder of Architexa - we build tools to help you understand and document Java code, but I have tried to be unbiased above. My intention is to suggest tools and options given that this is what I focused on as part of my PhD.
The most important tool you should use is your brain, and it's free.
There's no reason why you have to use any sort of standard method of visualization, and you can use whatever media you like. Paper, whiteboard, photoshop, visio, powerpoint, notepad: all of these can be effective. Draw a diagram of classes, objects, methods, properties, variables - whatever you think is useful to see in order to understand the application. The audience is not only other members of your team, but also yourself. Create diagrams that are useful for you to look at and quickly understand. Post them around your workspace and look at them regularly to remind yourself of the overall system architecture as you build it.
UML and other code documentation standards are good guidelines for the types of diagrams you can do and the information you should consider including. However, it is overkill for most applications and basically exists for people who can't take personal responsibility for documenting without standards. If you follow UML to the letter, you'll end up spending way too much time on your documentation instead of creating your application.
It is stored in several files. It uses different classes with different methods. The code is big and complicated.
All Java code written outside the school is like that, particularly for a new developer starting on a project.
This is an old question, but as this is coming up in Google searches, I am adding my response here so that it could be useful to the future visitors. Let me also disclose that I am the author of MaintainJ.
Don't try to understand the whole application
Let me ask you this - why do you want to understand the code? Most probably you are fixing a bug or enhancing a feature of the application. The first thing you should not try to do is to understand the whole application. Trying to understand the entire architecture while starting afresh on a project will just overwhelm you.
Believe me when I say this - developers with 10+ years of solid coding experience may not understand how certain parts of the application work even after working on the same project for more than a year (assuming they are not the original developers). They may not understand how the authentication works or how the transaction management works in the application. I am talking about typical enterprise applications with 1000 to 2000 classes and using different frameworks.
Two important skills required to maintain large applications
Then how do they survive and are paid big bucks? Experienced developers usually understand what they are doing; meaning, if they are to fix a bug, they will find the location of the bug, then fix it and make sure that it does not break the rest of the app. If they need to enhance a feature or add a new feature, most of the time, they just have to imitate an existing feature that does a similar thing.
There are two important skills that help them to do this.
They are able to analyze the impact of the change(s) they do while fixing a bug. First they locate the problem, change the code and test it to make sure that it works. Then, because they know the Java language well and the frameworks 'well enough', they can tell if it will break any other parts of the app. If not, they are done.
I said that they simply need to imitate to enhance the application. To imitate effectively, one needs to know Java well and understand the frameworks 'well enough'. For example, when they are adding a new Struts Action class and adding to the configuration xml, they will first find a similar feature, try to follow the flow of that feature and understand how it works. They may have to tweak a bit of the configuration (like the 'form' data being in 'request' than in 'session' scope). But if they know the frameworks 'well enough', they can easily do this.
The bottom line is, you don't need to understand what all the 2000 classes are doing to fix a bug or enhance the app. Just understand what's needed.
Focus on delivering immediate value
So am I discouraging you from understanding the architecture? No, not at all. All I am asking you is to deliver. Once you start on a project and once you have set up the development environment on your PC, you should not take more than a week to deliver something, however small it may be. If you are an experienced programmer and don't deliver anything after 2 weeks, how can a manager know if you really working or reading sports news?
So, to make life easier for everyone, deliver something. Don't go with the attitude that you need to understand the whole application to deliver something valuable. It's completely false. Adding a small and localized Javascript validation may be very valuable to the business and when you deliver it, the manager feels relieved that he has got some value for his money. Moreover, it gives you the time to read the sports news.
As time passes by and after you deliver 5 small fixes, you would start to slowly understand the architecture. Do not underestimate the time needed to understand each aspect of the app. Give 3-4 days to understand the authentication. May be 2-3 days to understand the transaction management. It really depends on the application and your prior experience on similar applications, but I am just giving the ballpark estimates. Steal the time in between fixing the defects. Do not ask for that time.
When you understand something, write notes or draw the class/sequence/data model diagram.
Diagrams
Haaa...it took me so long to mention diagrams :). I started with the disclosure that I am the author of MaintainJ, the tool that generates runtime sequence diagrams. Let me tell you how it can help you.
The big part of maintenance is to locate the source of a problem or to understand how a feature works.
MaintainJ generated sequence diagrams show the call flow and data flow for a single use case. So, in a simple sequence diagram, you can see which methods are called for a use case. So, if you are fixing a bug, the bug is most probably in one of those methods. Just fix it, ensure that it does not break anything else and get out.
If you need to enhance a feature, understand the call flow of that feature using the sequence diagram and then enhance it. The enhancement may be like adding an extra field or adding a new validation, etc. Usually, adding new code is less risky.
If you need to add a new feature, find some other feature similar to what you need to develop, understand the call flow of that feature using MaintainJ and then imitate it.
Sounds simple? It is actually simple, but there will be cases where you will be doing larger enhancements like building an entirely new feature or something that affects the fundamental design of the application. By the time you are attempting something like that, you should be familiar with the application and understand the architecture of the app reasonably well.
Two caveats to my argument above
I mentioned that adding code is less risky than changing existing code. Because you want to avoid changing, you may be tempted to simply copy an existing method and add to it rather than changing the existing code. Resist this temptation. All applications have certain structure or 'uniformity'. Do not ruin it by bad practices like code duplication. You should know when you are deviating from the 'uniformity'. Ask a senior developer on the project to review the changes. If you must do something that does not follow the conventions, at least make sure that it's local to a small class (a private method in a 200 line class would not ruin the application's esthetics).
If you follow the approach outlined above, though you can survive for years in the industry, you run the risk of not understanding the application architectures, which is not good in the long run. This can be avoided by working on bigger changes or by just less Facebook time. Spend time to understand the architecture when you are a little free and document it for other developers.
Conclusion
Focus on immediate value and use the tools that deliver that, but don't be lazy. Tools and diagrams help, but you can do without them too. You can follow my advice by just taking some time of a senior developer on the project.
Some plugins I know for Eclipse:
Architexa
http://www.architexa.com/
nWire
http://www.nwiresoftware.com/
If you want to reverse engineer the code, you should try Enterprise Architect
have you tried Google CodePro Analytix ?
it can for example display dependencies and is free (screenshot from cod.google.com):
Here is a non UML Tool which has very nice visualization features.
You can mapping the lines of code per class / method to colors / side lenght of rectangles.
You can also show the dependencies between the classes.
http://www.moosetechnology.org/
The nice thing is, you can use Smalltalk scripting for displaying what you need:
http://www.moosetechnology.org/docs/faq/JavaModelManipulation
Here you can see how such a visualization looks like:
http://www.moosetechnology.org/tools/moosejee/casestudy
JUDE Community UML used to be able to import Java, but it is no longer the case. It is a good, free tool.
If your app is really complex I think that diagrams won't carry you very far. When diagrams become very complex they become hard to read and lose their power. Some well chosen diagrams, even if generated by hand, might be enough.
You don't need every method, parameter, and return value spelled out. Usually it's just the relationships and interactions between objects or packages that you need.
Here is a another tool that could do the trick:
http://xplrarc.massey.ac.nz/
You can use JArchitect tool, a pretty complete tool to visualize your code structure using the dependency graph, and browse you source code like a database using CQlinq.
JArchitect is free for open source contributors
Some great tools I use -
StarUML (allows code to diagram conversion)
MS Visio
XMind (very very useful for overview of the system)
Pen and Paper!
In most teams there is a rule that says that the #author and #since keywords have to be used for all documented classes, sometimes even methods.
In an attempt to focus on what matters, I don't use these keywords and instead rely on the fact that I can use the source control management system to determine who the author of a class is and since when it exists.
I believe that #author and #since come from a time where version control was not yet common and I think that they are rather redundant by now. How do you think about this? Should modern Java projects use them?
I think the #author tag actually confuses things. First of all, if it isn't updated judiciously, it becomes wrong. Also, what if you (not being the original author) change half a class? Do you update the #author? Do you add one? And what if you change only a few lines in the class?
I also think it promotes code-ownership, which I don't think is a good thing. Anybody should be allowed to change a file. If there's an #author tag, people will tend to let this author make all the changes instead of doing it themselves and maybe learning something in the process.
Finally, as you said, the same information, in much more detail, is available from your VCS. Anything you add in Javadoc is duplication. And duplication is bad, right?
EDIT: Other answers mention the fact that you may not have access to the VCS, and in such cases the #author tag is useful. I humbly disagree. In such cases, you're most likely dealing with a 3rd party library, or perhaps an artifact from a different team inside your company. If that's the case, does it really matter who the individual was who created a certain class? Most likely, you only need to know the entity who created the library, and talk to their contact person.
Well for one thing Javadoc visibility typically transcends source control visibility. I can view the Javadocs for Java 1.1's library but can't to my knowledge freely peruse the Sun's version history from back then.
You're talking as if your Javadocs are completely isolated to you (the developer), and not distributed to others as part of an API, etc. That's not always the case. Usually Javadocs and VCS information serve completely different purposes.
For me, even if I have free access to the version history of a file, I like being able to see it right there in the source, for the same reason that I like comments explaining odd code in the file instead of having to go to the commit description for a certain code block. It's quicker.
I know that we have used them, and they are really nice when simply perusing the source code. I have had more than one situation where the #since has been really handy to have in there, as it would have taken a bit of work to determine what version something was added in (by comparing dates, etc).
Just my experience however. I think the #author has been less useful, but since we can autogenerate both pieces of data when creating new classes, it doesn't seem like a waste to just let the system do it to me.
I think documentation rules should be enforced only if you need them. If it's redundant for you to put those in Java Docs, then don't enforce the rule. A case where it would matter is if anyone ever needs to see that information and doesn't have access to your version control
No. The audience of the javadoc pages may not have access to your source control, so this information is relevant.
The #since is important as the documentation may be consulted for older versions of the software. When you can see when a feature was introduced you know 1) that it is unavailable to you and 2) that there is a good reason for upgrading.
You might, however, use an email address for author to contact your team for the #author tag.
One of most demanding tasks for any programmer, architect is understanding other's code.
For example, I am contractor, hired to rescue some project very quickly. Fix bugs, plan global refactoring and therefore I need most efficient way to understand the code. What is the list of concepts, their priority and best tools for this?
Of what I know: reverse code engineering to create object models (creating of diagram per package is not so convenient), create sequence diagrams (the tool connects in debug mode to the system and generates diagrams from runtime). Some visualizing techniques, using some tools to work not just with .java but also with e.g. JPA implementors like Hibernate. Generate diagram for not all the codebase, but add some class and then classes used by it.
Is Sparx Enterprise Architect state of the art in reverse engineering or far from that? Any other better tools? Ideally would be that tool makes me understand the code as if I wrote it myself :)
The book Object-Oriented Reengineering Patterns deals with this in detail. Unfortunately there is no silver bullet attached :-)
However, it lists a lot of useful techniques for taking over legacy code. In brief
interview at least some of the original developers (if they are still around) about
development history: phases, releases
current state of affairs
team social structure, politics, dynamics: when and why did people join and leave
bugs: typical, easiest, hardest
code quality: cleanest / ugliest parts
configuration data: form, content and usage
unit / integration / manual / ... test cases and data
SCM branch structure and usage
documentation: what is documented where, is it up to date
contact persons for external interfaces
Watch developers / users during demo to find
main features
typical use cases
usage anecdotes
good / bad, missing / superfluous functionality
"read all the code in one hour"
get high level view of class hierarchies, interfaces
take multiple sessions if needed
identify large structures (these often contain important functionality)
look for design patterns
check comments (they can reveal a lot, but may be also misleading)
skim documentation (if there is any)
just record the availability of specific types of docs e.g. specification, UML diagram, Wiki, Javadoc etc.
is it useful and why (not)
is it up to date
By far the most important tools are your ears, your tongue and your larynx. Ask the people who are familiar with the code - they'll be able to help you understand its general architecture much better than any software tools.
Automatically reverse-engineered complete UML models are generally nearly useless because they cannot distinguish between important abstractions and implementation details - which is the whole point of such models.
Software tools are more useful to answer very specific questions when you are investigating details, such as "where is this method called from?" or "what classes implement this interface" - any good IDE will be able to do that. Debuggers can help too - placing breakpoints at keypoints of the code and looking at the call stack when they're hit is often very enlightening.
Just to elaborate on Michaels mentioning of good IDE's which can help you:
I use the following Eclipse facilities a lot:
Shift-F2 when the cursor is placed in an identifier brings up the Javadoc for that identifier, if any. Good for navigating.
Hovering the mouse over an identifier brings up a box with the Javadoc in it, if any. Good for reminding when writing e.g. a method call.
The Declaration view shows the source where the keyword where the cursor is placed, is defined. This is updated when the cursor moved.
F3 goes to the definition of the current identifier.
Ctrl-T on an identifier shows all subclasses and implementations in a popup. Very useful when working on interfaces.
F4 on an identifier brings up the implementation hierarchy of that identifier in a panel, which can be navigated. Very useful to learn how things are connected. This includes both classes and interfaces.
EclipseUML Omondo is the best Java reverse engineering tool. It reverse all the java code, all packages and even class interaction with interface if not in the same package. Just amazing.
You can also reverse:
- .class
- hibernate annotations
- JPA annotations
What I like with this tool is that my code is clean because all the model information is saved into an xmi format and not as tag in my code. You can also create small documentation inside each existing package using diagrams as a view of the model. Just marvelous and respecting the official uml 2.2 specification.
The only problem is that it is really too expensive so the price is a stop for me !!
Doesn't extract high level architectures, but does make it much easier to climb around your Java code: our Java Source Code Browser. This reads source code (and supporting class files) and produces Javadoc style documentation plus source text bi-directionally hyperlinked to the Javadoc information.
(I'm one of the principals behind it).
I use Enterprise Architect for whole UML (including reverse engineering with Java) and it works perfectly.