How to write "AstRoot" object to a file including comments using Rhino?

I've parsed JavaScript source with Rhino and reconstructed it successfully.
When I call astroot.toSource(), it prints the reconstructed source correctly,
but toSource() does not print comments: all the comments from my JavaScript source disappear.
How can I get the full source, including comments?
My goal is to write the AstRoot object (which contains the source) to a new JavaScript file with all comments intact.
I'm using Rhino 1.7R4.

In general, this is difficult because comments can appear in the middle of any declaration, statement, or expression. How should the various AST objects represent that? It could be done, but it is very messy for the parser and the AST objects it creates.
If you restrict yourself to comments on statement boundaries, there are some possible solutions.
One way would be to write your own JavaScript tokenizer and inspect the token stream while reading the file. You would then need to figure out how to track the comments. One hackish way is to transform each comment into a declaration such as 'var somexXXxx = "comment";' and use a naming convention to transform it back after the ast.toSource() call. That maps your comments into the AST node structure.
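The placeholder trick above can be sketched in plain Java, without a tokenizer. This is only a rough illustration under several assumptions: it handles whole-line // comments only, the __comment_ prefix and the class name are made up for the example, and it assumes the printer emits each placeholder as a single double-quoted var declaration. A line starting with // inside a block comment or a multi-line string would be rewritten by mistake.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommentPlaceholders {
    // Hypothetical naming convention for the placeholder variables.
    private static final String PREFIX = "__comment_";

    // A line that contains nothing but a // comment (a statement boundary).
    private static final Pattern LINE_COMMENT =
            Pattern.compile("(?m)^([ \\t]*)//(.*)$");

    // A placeholder declaration, tolerating extra whitespace from a printer.
    private static final Pattern PLACEHOLDER = Pattern.compile(
            "(?m)^([ \\t]*)var\\s+" + PREFIX
            + "\\d+\\s*=\\s*\"((?:[^\"\\\\]|\\\\.)*)\"\\s*;");

    // Run before parsing: turn each whole-line comment into a var declaration.
    static String encode(String source) {
        Matcher m = LINE_COMMENT.matcher(source);
        StringBuffer out = new StringBuffer();
        int n = 0;
        while (m.find()) {
            // Escape backslashes first, then quotes, so the text is a valid string literal.
            String text = m.group(2).replace("\\", "\\\\").replace("\"", "\\\"");
            String decl = m.group(1) + "var " + PREFIX + (n++) + " = \"" + text + "\";";
            m.appendReplacement(out, Matcher.quoteReplacement(decl));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Run on the printed source: turn each placeholder back into a comment.
    static String decode(String printed) {
        Matcher m = PLACEHOLDER.matcher(printed);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Undo the escaping in the reverse order of encode().
            String text = m.group(2).replace("\\\"", "\"").replace("\\\\", "\\");
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1) + "//" + text));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The idea would be to call encode() on the file contents before handing them to the parser, and decode() on the string returned by toSource(). If the printer rewrites string literals (for example, by changing escapes), decode() would need to be adapted.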

Related

Java add attribute to HTML tags without changing formatting

I have a task to make a Maven plugin that takes HTML files in a certain location and adds a service attribute to each tag that doesn't have it. This is done on the source code, which means my colleagues and I will have to edit those files further.
As a first solution I turned to Jsoup which seems to be doing the job but has one small yet annoying problem: if we have a tag with multiple long attributes (we often do as this HTML code is a source for further processing) we wrap the lines like this:
<ui:grid id="category_search" title="${handler.getMessage( 'title' )}"
class="is-small is-outlined is-hoverable is-foldable"
filterListener="onApplyFilter" paginationListener="onPagination" ds="${handler.ds}"
filterFragment="grid_filter" contentFragment="grid_contents"/>
However, Jsoup turns this into one very long line:
<ui:grid id="category_search" title="${handler.getMessage( 'title' )}" class="is-small is-outlined is-hoverable is-foldable" filterListener="onApplyFilter" paginationListener="onPagination" ds="${handler.ds}" filterFragment="grid_filter" contentFragment="grid_contents"/>
This is bad practice and a real pain to read and edit.
So is there another, not too convoluted way to add this attribute without parsing and recomposing the HTML code, or a way to preserve the line breaks inside the tag?

Unfortunately JSoup's main use case is not to create HTML that is read or edited by humans. Specifically JSoup's API is very closely modeled after DOM which has no way to store or model line breaks inside tags, so it has no way to preserve them.
I can think of only two solutions:
Find (or write) an alternative HTML parser library, that has an API that preserves formatting inside tags. I'd be surprised if such a thing already exists.
Run the generated code through a formatter that supports wrapping inside tags. This won't preserve the original line breaks, but at least the attributes won't be all on one line. I wasn't able to find a Java library that does that, so you may need to consider using an external program.

It seems there is no good way to preserve line breaks inside tags while parsing them into POJOs (or I haven't found one), so I wrote a simple tokenizer that splits the incoming HTML string into parts, roughly like this:
String[] parts = html.split( "((?=<)|(?<=>))" );
This uses regex lookarounds (a lookahead and a lookbehind) to split before each < and after each >. Then I just iterate over the parts and decide whether to insert the attribute or not.
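A minimal sketch of how the loop over the parts might look. The class and method names are hypothetical, and the sketch assumes that no attribute value contains a literal < or >, since that would break the split. Closing tags, comments, and text are passed through untouched, so line breaks inside tags survive.

```java
import java.util.regex.Pattern;

final class AttributeInserter {
    // Adds name="value" to every opening tag that does not already carry the attribute.
    static String addAttribute(String html, String name, String value) {
        // Zero-width split: before each '<' and after each '>'.
        String[] parts = html.split("((?=<)|(?<=>))");
        StringBuilder out = new StringBuilder();
        for (String part : parts) {
            if (isOpeningTag(part) && !hasAttribute(part, name)) {
                out.append(insert(part, name, value));
            } else {
                out.append(part); // text, closing tags, comments: unchanged
            }
        }
        return out.toString();
    }

    // "<x ...>" or "<x .../>", but not "</x>" or "<!-- -->".
    private static boolean isOpeningTag(String part) {
        return part.length() > 2 && part.charAt(0) == '<' && part.endsWith(">")
                && Character.isLetter(part.charAt(1));
    }

    // Rough check: whitespace, the attribute name, then '='.
    private static boolean hasAttribute(String tag, String name) {
        return Pattern.compile("\\s" + Pattern.quote(name) + "\\s*=").matcher(tag).find();
    }

    // Inserts the attribute just before the closing ">" or "/>".
    private static String insert(String tag, String name, String value) {
        int end = tag.endsWith("/>") ? tag.length() - 2 : tag.length() - 1;
        return tag.substring(0, end) + " " + name + "=\"" + value + "\"" + tag.substring(end);
    }
}
```

The attribute check is deliberately naive: an attribute value that happens to contain the text ` name=` would be mistaken for the attribute itself.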

GherkinDocument to Gherkin Raw Text

I would like to store all the Gherkin feature files created by a user on the front end as GherkinDocuments on the back end, using the Gherkin parser. Once saved, I would also like to be able to display the raw Gherkin document on the front end. I have read through the documentation and cannot find anything built in that converts a GherkinDocument back to raw text. The toString() method is also not overridden to print it out. Is there a way to convert a GherkinDocument object to raw text within the Gherkin parser?
I want to be able to keep as much of the original formatting as possible. Normally I would just write my own utility to perform this, however the structure of the GherkinDocument object renders it tedious. I would prefer to use existing capabilities if they exist.

I talked to Aslak, a Cucumber developer, on the Cucumber help Gitter. He told me:
Hi @tramstheman have you considered storing it as text instead of serialising the GherkinDocument AST? It is very quick to parse that text back into an AST when you need to.
There isn't currently a renderer/prettifier that will turn an AST back to source as @mattwynne suggested. The tests don't do roundtrips, they just perform approval testing on various outputs (parser tokens, ASTs as JSON, pickles as JSON)
What I have done instead is extend the GherkinDocument object so that it stores the raw text inside it, similar to what Aslak suggested.

What about reading the feature files as they are and displaying them? They are available on your test class path. Move them to your production class path and they can be read from any class, test or production. This lets you open a stream for each file and display it without any modification.

Transforming a .mm file into a form readable by Java

I am developing a multi-mode resource-constrained project scheduling solver in Java. I was looking for test instances, but this is all I found. It is a .mm file, which is an extension for a C++ compiler. Is there any way to transform this data into something easily readable by Java, like XML or JSON?

As suggested, you could of course parse the file as a text file. Alternatively, the two other main approaches would be:
Use clang/llvm's abstract syntax tree (AST) to interpret the data in the file.
Use an Objective-C++ grammar for a compiler generator like yacc or, since you're using Java, JavaCC. This will also yield a syntax tree that you can then walk and extract information from.

Static code parser for Java source code to extract methods / comments

I'm looking for a parser that can extract methods from a java class (static source code -> .java file) and method signature, comments / documentation, variables of each of the methods. Preferably in Java programming language.
Could someone please advise?
Thanks.

You can use ASTParser from Eclipse. It's super simple to use.
Find a quick standalone example here.

Here is what I do to extract the method signatures from one or more Java files:
I open the file I want to get the signatures from in Sublime Text 2 and do a find (Ctrl+F) in regular expression mode with the following regex, which I wrote myself (I tested it on my code and it works; I hope it will work for you too):
((synchronized +)?(public|private|protected) +(static [a-zA-Z\[\]]+|[a-zA-Z\[\]]+) [a-zA-Z]+\([a-zA-Z ,\[\]]*\)\n?[a-zA-Z ,\t\n]*\{)
After Sublime Text 2 highlights the results, I click "Find All", copy with Ctrl+C, open a new tab with Ctrl+N, and paste with Ctrl+V.
You will then see all your method signatures.
I hope this helps.

If all you want is the exact text of each method, and the exact text of the variables inside methods, you could get by with a parser that produces a CST, walking the CST to find the right nodes, and then prettyprinting the found subtrees. ANTLR has a Java parser that would work for this. I don't know if it will capture comments. I think the main distribution of ANTLR is coded in Java.
You can likely do this more hackily, in Java, with a lexer for Java, implementing what amounts to a bad island parser that looks for the key phrases. ("After 'class', find '{' and print out everything you find up to the matching '}'" would give you all the methods and fields).
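A very rough sketch of that "bad island parser" idea, simplified even further: it scans raw characters instead of using a real Java lexer, so braces inside string literals, character literals, or comments will confuse it, as will the word "class" appearing in a comment before the declaration. The class and method names are made up for the example.

```java
final class ClassBodyExtractor {
    // Returns the text between the '{' that follows the first "class" keyword
    // and its matching '}', or null if no balanced body is found.
    static String extractClassBody(String source) {
        int keyword = source.indexOf("class ");
        if (keyword < 0) {
            return null;
        }
        int open = source.indexOf('{', keyword);
        if (open < 0) {
            return null;
        }
        int depth = 0;
        for (int i = open; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                // Back at the nesting level of the class: this is the matching brace.
                return source.substring(open + 1, i);
            }
        }
        return null; // unbalanced braces
    }
}
```

Feeding it the contents of a .java file would return everything inside the first class body, methods and fields included. Splitting that text into individual methods would take the same depth counting one level down.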
If you want more precise detail (e.g., you want to know the actual type of an argument rather than just its name, or where the type is actually defined), you'll need a parser with a full front end and name resolution. (ANTLR won't do this.) The Eclipse JDT certainly builds trees; it likely does name resolution. Our DMS Software Reengineering Toolkit with its Java Front End can provide everything necessary for this task, including comment capture and extraction. DMS isn't coded in Java.
You objected to Javadoc as being inadequate, because it doesn't give you the content of methods. Perhaps our Java Source Browser, which does give you that code, would serve better. It integrates name resolution data from our DMS/Java Front End to hyperlink JavaDoc-type information into browsable source text; all fields as well as local variables are explicitly indexed. The Source Browser isn't coded in Java, but then presumably you simply want to run it and scrape your result. Such scraping might be harder than it appears staring at the screen; there's a lot of HTML behind such a display.

Generate HTML from plain text using Java

I have to convert a .log file into a nice and pretty HTML file with tables. Right now I just want to get the HTML header down. My current method is to println every single line of the HTML file to the output file. For example:
p.println("<html>");
p.println("<script>");
etc. There has to be a simpler way, right?

How about using a JSP scriptlet and JSTL? You could create a custom object that holds all the important information and display it formatted using the Expression Language.

Printing raw HTML text as strings is probably the "easiest" (most straightforward) way to do what you're asking but it has its drawbacks (e.g. properly escaping the content text).
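To illustrate the escaping drawback, here is a minimal sketch of the string-printing approach with the content text escaped before it is written. The class and method names are hypothetical, and the one-row-per-log-line layout is just an assumption about what the table might look like.

```java
import java.util.List;

final class LogHtml {
    // Escapes the characters that have special meaning in HTML text and attribute values.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds a one-column table with one row per log line.
    static String toTable(List<String> lines) {
        StringBuilder sb = new StringBuilder("<html>\n<body>\n<table>\n");
        for (String line : lines) {
            sb.append("<tr><td>").append(escape(line)).append("</td></tr>\n");
        }
        return sb.append("</table>\n</body>\n</html>\n").toString();
    }
}
```

Without the escape() call, a log line such as `value < limit` would be emitted as broken markup.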
You could use the DOM (e.g. Document et al) interface provided by Java but that would hardly be "easy". Perhaps there are "DOM builder" type tools/libraries for Java that would simplify this task for you; I suggest looking at dom4j.

Look at this Java HTML Generator library (easy to use). It should make generating the actual HTML muuuch clearer. There are complications when creating HTML with Java Strings (what happens if you want to change something like a rowspan?) that can be avoided with this library. Especially when dealing with tables.

There are many templating engines available. Have a look at https://stackoverflow.com/questions/174204/suggestions-for-a-java-based-templating-engine
This way you can define a template in a text file and have the Java code fill in the variables.
