How to remove all comments from a string without affecting a URL in Java

I need to remove all types of comments from my string without affecting the URL contained in that string. When I tried removing the comments with a regular expression, part of the URL was removed as well.
I tried the following regex, but the same issue occurs.
String sourceCode= "/*\n"
+ " * Multi-line comment\n"
+ " * Creates a new Object.\n"
+ " */\n"
+ "public Object someFunction() {\n"
+ " // single line comment\n"
+ " Object obj = new Object();\n"
+ " return obj; /* single-line comment */\n"
+ "}"
+ "\n"
+ "https://stackoverflow.com/questions/18040431/remove-comments-in-a-string";
sourceCode=sourceCode.replaceAll("//.*|/\\*((.|\\n)(?!=*/))+\\*/", "");
System.out.println(sourceCode);
The comments are removed, but the output looks like this:
public Object someFunction() {
Object obj = new Object();
return obj;
}
https:
Please help me find a solution for this.

[^:]//.*|/\\*((.|\\n)(?!=*/))+\\*/
The change is in the first few characters: [^:]. It means that the character before // must not be :. Note that [^:] also consumes that preceding character, so it is removed together with the comment, and a // at the very start of the input is not matched.
I usually use regex101.com to work with regular expressions. Select the Python flavor for your case (languages use slightly different escaping).
This regexp is quite hard for a human to read, so another solution may be to use several simple expressions and process the incoming text in multiple passes, like:
Remove one-line comments
Remove multiline comments
Process some special cases
Note: Regex processing is fairly expensive. If performance matters, consider another approach, such as your own processor or a third-party library.
EDITED
As suggested by @Wiktor, the expression [^:]//.*|/\\*((?!=*/)(?s:.))+\\*/ is a faster solution, at least 2-3 times faster.
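Because [^:] matches a real character, the character in front of // is deleted along with the comment. One possible variation, which is not the pattern from this answer, swaps it for the negative lookbehind (?<!:) and uses a reluctant .*? for block comments. The sketch below assumes that comment markers never appear inside string literals; the names CommentStripper and stripComments are made up for illustration.

```java
import java.util.regex.Pattern;

public class CommentStripper {
    // First alternative: block comments, matched reluctantly with DOTALL
    // enabled only inside the (?s:...) group.
    // Second alternative: line comments whose "//" is not directly preceded
    // by ':' so that "https://" is left alone.
    private static final Pattern COMMENTS =
            Pattern.compile("/\\*(?s:.*?)\\*/|(?<!:)//.*");

    public static String stripComments(String source) {
        return COMMENTS.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String sourceCode = "/*\n"
                + " * Multi-line comment\n"
                + " */\n"
                + "public Object someFunction() {\n"
                + "    // single line comment\n"
                + "    return obj; /* single-line comment */\n"
                + "}\n"
                + "https://stackoverflow.com/questions/18040431/remove-comments-in-a-string";
        System.out.println(stripComments(sourceCode));
    }
}
```

The lookbehind is zero-width, so only the comment itself is replaced. A URL such as https://... inside a line comment is still removed with the rest of that comment.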

You can split your String by "\n" and check each line. Here is the tested code:
String sourceCode= "/*\n"
+ " * Multi-line comment\n"
+ " * Creates a new Object.\n"
+ " */\n"
+ "public Object someFunction() {\n"
+ " // single line comment\n"
+ " Object obj = new Object();\n"
+ " return obj; /* single-line comment */\n"
+ "}"
+ "\n"
+ "https://stackoverflow.com/questions/18040431/remove-comments-in-a-string";
String [] parts = sourceCode.split("\n");
System.out.println(getUrlFromText(parts));
Here is the fetching method:
private static String getUrlFromText(String []parts) {
for (String part : parts) {
if(part.startsWith("http")) {
return part;
}
}
return null;
}

To be more specific, this expression should be used:
.*[^:]//.*|/\\*((.|\\n)(?!=*/))*\\*/
The pattern you provided cannot remove an empty /**/ comment if one is present, because its + requires at least one character between the delimiters (if that is a deliberate requirement, then it is fine).
For a better understanding, try the expression .*[^:]\/\/.*|\/\*((.|\n)(?!=*\/))*\*\/ in a regex visualizer, which will show you a graph for it.

Related

What code would I use to put parentheses around all the occurrences of a term/substring?

I know how to do it... but the way I'm thinking is complicated and has a lot of room for errors. I'm still learning Java, but I have learned that Java has ways of doing just about anything. Is there some way I can put parentheses around each occurrence of a substring? (see an example below)
Original String: "abcabcabcd"
Search For: "abc"
Final Output: "(abc)(abc)(abc)d"
The easiest way here is to use String::replaceAll
String str = "abcabcabcd";
String sub = "abc";
System.out.println(str.replaceAll(sub, "(" + sub + ")"));
As pointed out by @Jacob G., String::replace may be preferred here because no regex features are needed.
Output:
(abc)(abc)(abc)d
If you are interested in another solution, there is one with recursion, just for educational purposes:
private static String parentheses(String input, String template) {
int start = input.indexOf(template);
if (start == -1) {
return input;
}
return input.substring(0, start) +
"(" + input.substring(start, start + template.length()) + ")" +
parentheses(input.substring(start + template.length()), template);
}
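The difference between String::replace and String::replaceAll matters once the search term contains regex metacharacters such as . or $. The sketch below shows both routes; the class and method names are hypothetical, and the second method only illustrates how to stay with the regex API by quoting both the pattern and the replacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParenthesesWrapper {
    // Literal replacement: no regex interpretation of the term at all.
    public static String wrapLiteral(String input, String term) {
        return input.replace(term, "(" + term + ")");
    }

    // Regex API with the term and the replacement both quoted, so that
    // characters like '.' or '$' are treated as plain text.
    public static String wrapQuoted(String input, String term) {
        return input.replaceAll(Pattern.quote(term),
                Matcher.quoteReplacement("(" + term + ")"));
    }

    public static void main(String[] args) {
        System.out.println(wrapLiteral("abcabcabcd", "abc")); // (abc)(abc)(abc)d
        System.out.println(wrapLiteral("a.b axb", "a.b"));    // (a.b) axb
        System.out.println(wrapQuoted("cost $5 $5", "$5"));   // cost ($5) ($5)
    }
}
```

With an unquoted replaceAll, the term a.b would also match axb, and a $ in the replacement would be read as a group reference.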

Java Regular expression with multi variable and arrayList of string

I created this Java method:
public String isInTheList(List<String> listOfStrings)
{
/*
* Iterates through the list, and if the list contains the input of the user,
* it will be returned.
*/
for(String string : listOfStrings)
{
if(this.answer.matches("(?i).*" + string + ".*"))
{
return string;
}
}
return null;
}
I use this method in a while block in order to validate user input. I want to check if that input matches the concatenation of two different predefined ArrayLists of Strings.
The format of the input must be like this:
(elementOfThefirstList + " " + elementOfTheSecondList)
where the Strings elementOfThefirstList and elementOfTheSecondList are both elements from their respective list.
for(int i = 0; i < firstListOfString.size(); i++)
{
if(userInput.contains(firstListOfString.get(i) + " " + userInput.isInTheList(secondListOfString)))
{
isValid = true;//condition for exit from the while block
}
}
It works if the user input is like this:
elementOfThefirstList + " " + elementOfTheSecondList
However, it will also work if the user input is like this:
elementOfThefirstList + " " + elementOfTheSecondList + " " + anotherElementOfTheFirstList
How can I modify my regular expression, as well as my method, in order to have exactly one repetition of elements in both lists concatenated with a space between them?
I tried with another regular expression and I think that I will use this: "{1}". However, I am not able to do that with a variable.
With the information you provide as to how you are getting this issue, there is little that can be said about how to fix it. I strongly encourage you to look at this quantifiers tutorial before moving forward.
Let's look at some solutions.
For example, let's look at this line:
if(this.answer.matches("(?i).*" + string + ".*"))
What you are trying to do is to see if this.answer contains string, ignoring case (I doubt you need the last .*). But you are using a greedy quantifier to compare them. If the issue arises from an input error in this comparison, I would consider looking at the linked tutorial for reluctant quantifiers.
Okay, so it wasn't a quantifier issue. The other possible fix may be this block of code:
for(int i = 0; i < firstListOfString.size(); i++)
{
if(userInput.contains(firstListOfString.get(i) + " " + userInput.isInTheList(secondListOfString)))
{
isValid = true;//condition for exit from the while block
}
}
I don't know how you got userInput to have the contains method, but I assume that you used containment to call the String method. If this is the case, there could be a solution to the issue. You would only have to state that the input is valid if and only if it is equal to an element from the first list followed by a space and an element from the second list.
The final solution I have for you is simple. If there are no other spaces present within the list elements, you could split the concatenated String on a space and check how many elements the resulting array contains. If it is greater than two, then you have an invalid concatenation.
Hopefully this helps!
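The split-and-count idea can be sketched as follows. This is only an illustration under the stated assumption that no list element contains a space; PairValidator and isValidPair are hypothetical names, and the case-insensitive comparison mirrors the (?i) flag used in the question.

```java
import java.util.List;

public class PairValidator {
    // Valid only if the input is exactly "<element of first> <element of second>".
    public static boolean isValidPair(String input, List<String> first, List<String> second) {
        String[] parts = input.trim().split("\\s+");
        if (parts.length != 2) {
            return false; // too few or too many words
        }
        return containsIgnoreCase(first, parts[0]) && containsIgnoreCase(second, parts[1]);
    }

    private static boolean containsIgnoreCase(List<String> list, String value) {
        for (String element : list) {
            if (element.equalsIgnoreCase(value)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> colors = List.of("red", "blue");
        List<String> vehicles = List.of("car", "bike");
        System.out.println(isValidPair("red car", colors, vehicles));      // true
        System.out.println(isValidPair("red car blue", colors, vehicles)); // false
    }
}
```

Comparing whole words with equalsIgnoreCase also avoids the partial matches that .* around the element would allow.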

Remove comments in a string

private static String filterString(String code) {
String partialFiltered = code.replaceAll("/\\*.*\\*/", "");
String fullFiltered = partialFiltered.replaceAll("//.*(?=\\n)", "");
return fullFiltered;
}
I tried above code to remove all comments in a string but it isn't working - please help.
Works with both // single and multi-line /* comments */.
String sourceCode =
"/*\n"
+ " * Multi-line comment\n"
+ " * Creates a new Object.\n"
+ " */\n"
+ "public Object someFunction() {\n"
+ " // single line comment\n"
+ " Object obj = new Object();\n"
+ " return obj; /* single-line comment */\n"
+ "}";
System.out.println(sourceCode.replaceAll(
"//.*|/\\*((.|\\n)(?!=*/))+\\*/", ""));
Input :
/*
* Multi-line comment
* Creates a new Object.
*/
public Object someFunction() {
// single line comment
Object obj = new Object();
return obj; /* single-line comment */
}
Output :
public Object someFunction() {
Object obj = new Object();
return obj;
}
How about....
private static String filterString(String code) {
return code.replace("//", "").replace("/*", "").replace("*/", "");
}
Replace the code below:
partialFiltered.replaceAll("//.*(?=\\n)", "");
with:
partialFiltered.replaceAll("//.*?\n","\n");
You need to use (?s) at the start of your partialFiltered regex to allow for comments spanning multiple lines (e.g. see Pattern.DOTALL with String.replaceAll).
But then the .* in the middle of /\\*.*\\*/ uses a greedy match so I'd expect it to replace the whole lot between two separate comment blocks. E.g., given the following:
/* Comment #1 */
for (i = 0; i < 10; i++)
{
i++
}
/* Comment #2 */
Haven't tested this so am risking egg on my face but would expect it to remove the whole lot including the code in the middle rather than just the two comments. One way to prevent would be to use .*? to make the inner matching non-greedy, i.e. to match as little as possible:
String partialFiltered = code.replaceAll("(?s)/\\*.*?\\*/", "");
Since the fullFiltered regex doesn't begin with (?s), it should work without the (?=\\n) (since the replaceAll regex doesn't span multiple lines by default) - so you should be able to change it to:
String fullFiltered = partialFiltered.replaceAll("//.*", "");
There are also possible issues with looking for the characters denoting a comment, e.g. if they appear within a string or regular expression pattern but I'm assuming these aren't important for your application - if they are it's probably the end of the road for using simple regular expressions and you may need a parser instead...
Maybe this can help someone:
return code.replaceAll(
"((['\"])(?:(?!\\2|\\\\).|\\\\.)*\\2)|\\/\\/[^\\n]*|\\/\\*(?:[^*]|\\*(?!\\/))*\\*\\/", "$1");
To test it in an online regex tester, use this unescaped form: ((['"])(?:(?!\2|\\).|\\.)*\2)|\/\/[^\n]*|\/\*(?:[^*]|\*(?!\/))*\*\/
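When comment markers can appear inside string or character literals, a small hand-written scanner is one alternative to a regex. The sketch below is a minimal state machine, not a full Java lexer: it assumes ordinary "..." and '...' literals with backslash escapes and does not handle text blocks or unicode escapes. The names CommentScanner and stripComments are made up for illustration.

```java
public class CommentScanner {
    private enum State { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR }

    public static String stripComments(String src) {
        StringBuilder out = new StringBuilder(src.length());
        State state = State.CODE;
        for (int i = 0; i < src.length(); i++) {
            char c = src.charAt(i);
            char next = i + 1 < src.length() ? src.charAt(i + 1) : '\0';
            switch (state) {
                case CODE:
                    if (c == '/' && next == '/') {
                        state = State.LINE_COMMENT;
                        i++; // skip the second '/'
                    } else if (c == '/' && next == '*') {
                        state = State.BLOCK_COMMENT;
                        i++; // skip the '*'
                    } else {
                        if (c == '"') state = State.STRING;
                        else if (c == '\'') state = State.CHAR;
                        out.append(c);
                    }
                    break;
                case LINE_COMMENT:
                    if (c == '\n') { // the newline itself is kept
                        state = State.CODE;
                        out.append(c);
                    }
                    break;
                case BLOCK_COMMENT:
                    if (c == '*' && next == '/') {
                        state = State.CODE;
                        i++; // skip the closing '/'
                    }
                    break;
                case STRING:
                case CHAR:
                    out.append(c);
                    if (c == '\\' && i + 1 < src.length()) {
                        out.append(next); // copy the escaped character verbatim
                        i++;
                    } else if ((state == State.STRING && c == '"')
                            || (state == State.CHAR && c == '\'')) {
                        state = State.CODE;
                    }
                    break;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("String s = \"http://x\"; // comment"));
    }
}
```

Each character is visited once, so there is no backtracking, and a // inside a string literal is copied through unchanged.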

How do you parse non-standard form function?

A standard form function like A*B+A*B' is easy to parse (splitting by + and then splitting by *). How do you parse a function if it doesn't take a standard form?
Example: a function can take the following forms:
A*B+A(A+B')
A*B+(A+B')A
A*B+A*B(A+B)
Any ideas?
P.S: I would like to parse the function in Java.
A standard form function like A*B+A*B' is easy to parse (splitting by + and then splitting by *).
Good. Now all that's left is to deal with those pesky parentheses. First, we will remove them with String.split, and then we will add the necessary logic to carry out the multiplications.
Once you have split the string A(A+B')C, you will end up with an array of three strings: A, A+B', and C. Notice that with this method the odd-indexed strings are ALWAYS the ones inside the parentheses. So all we have to do is check whether the last character of the string before each group and the first character of the string after it are letters (A, B, C) or operators (*, +).
String firstString = "A*B+A*B(A+B)+A*B+A*B(A+B)";
// Splitting on either parenthesis leaves the parenthesized groups at odd indexes.
String[] masterArray = firstString.split("[()]");
// Step two elements at a time while a full left/inside/right triple remains.
for (int i = 0; i + 2 < masterArray.length; i += 2) {
    StringBuilder leftOfParenthesis = new StringBuilder(masterArray[i]);
    String insideParenthesis = masterArray[i + 1];
    String rightOfParenthesis = masterArray[i + 2];
    // Assumes both neighbours are non-empty (no "((" or ")(" in the input).
    char last = leftOfParenthesis.charAt(leftOfParenthesis.length() - 1);
    char first = rightOfParenthesis.charAt(0);
    if (Character.isLetter(last) && Character.isLetter(first)) {
        leftOfParenthesis.append("*" + insideParenthesis + "*" +
                last + "+" + last + "*" + insideParenthesis + "*" + first);
        // Replace the first character of the right-hand part with last.
        masterArray[i + 2] = last + rightOfParenthesis.substring(1);
    }
    else if (Character.isLetter(last)) {
        leftOfParenthesis.append("*" + insideParenthesis + "*" + last);
    }
    else if (Character.isLetter(first)) {
        leftOfParenthesis.append("+" + first + "*" +
                insideParenthesis + "*");
    }
    // Store the expanded left-hand part back into the array.
    masterArray[i] = leftOfParenthesis.toString();
}
That's the basic logic. The loop stops as soon as fewer than three parts remain, so a trailing group with nothing to its right (such as the final (A+B) in the example input) is skipped, and you will have to add some if statements to handle it. This also isn't fully general: if you have parentheses inside parentheses, or more than two terms inside a pair of parentheses, you will have to add special logic to deal with that.
Rather than trying to parse with ad hoc methods (which always ends badly), you are better off:
writing a BNF grammar for your expression forms, in all variants
coding a recursive descent parser (see https://stackoverflow.com/a/2336769/120163)
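As a rough illustration of that advice, here is a minimal recursive descent sketch. The grammar is an assumption made for this example, not something given in the question: expr := term ('+' term)*, term := factor ('*'? factor)*, factor := primary (')* with ' as negation, primary := LETTER | '(' expr ')'. Treating + as OR, * (or juxtaposition) as AND, and evaluating against a map of variable values are also assumptions; BoolExprParser and evaluate are hypothetical names.

```java
import java.util.Map;

public class BoolExprParser {
    private final String src;
    private final Map<Character, Boolean> vars;
    private int pos = 0;

    private BoolExprParser(String src, Map<Character, Boolean> vars) {
        this.src = src.replace(" ", "");
        this.vars = vars;
    }

    public static boolean evaluate(String expression, Map<Character, Boolean> vars) {
        BoolExprParser parser = new BoolExprParser(expression, vars);
        boolean value = parser.expr();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected input at " + parser.pos);
        }
        return value;
    }

    // expr := term ('+' term)*
    private boolean expr() {
        boolean value = term();
        while (peek() == '+') {
            pos++;
            boolean rhs = term(); // always parse the operand, then combine
            value = value || rhs;
        }
        return value;
    }

    // term := factor ('*'? factor)*   (juxtaposition is implicit multiplication)
    private boolean term() {
        boolean value = factor();
        while (true) {
            char c = peek();
            if (c == '*') {
                pos++;
            } else if (!(Character.isLetter(c) || c == '(')) {
                break;
            }
            boolean rhs = factor();
            value = value && rhs;
        }
        return value;
    }

    // factor := primary (')*   (each trailing apostrophe negates once)
    private boolean factor() {
        boolean value = primary();
        while (peek() == '\'') {
            pos++;
            value = !value;
        }
        return value;
    }

    // primary := LETTER | '(' expr ')'
    private boolean primary() {
        char c = peek();
        if (c == '(') {
            pos++;
            boolean value = expr();
            if (peek() != ')') {
                throw new IllegalArgumentException("Expected ')' at " + pos);
            }
            pos++;
            return value;
        }
        if (Character.isLetter(c)) {
            pos++;
            Boolean value = vars.get(c);
            if (value == null) {
                throw new IllegalArgumentException("No value for variable " + c);
            }
            return value;
        }
        throw new IllegalArgumentException("Unexpected input at " + pos);
    }

    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    public static void main(String[] args) {
        Map<Character, Boolean> vars = Map.of('A', true, 'B', false);
        System.out.println(evaluate("A*B+A(A+B')", vars));  // true
        System.out.println(evaluate("A*B+A*B(A+B)", vars)); // false
    }
}
```

Each grammar rule becomes one method, so nested parentheses and implicit multiplication need no special cases. Building a syntax tree instead of evaluating directly would follow the same structure.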

Stringtemplate compare strings does not work

Can someone explain why this does not work ?
StringTemplate query = new StringTemplate("hello " +
"$if(param==\"val1\")$" +
" it works! " +
"$endif$ " +
"world");
query.setAttribute("param", "val1");
System.out.println("result: "+query.toString());
It throws
eval tree parse error
:0:0: unexpected end of subtree
at org.antlr.stringtemplate.language.ActionEvaluator.ifCondition(ActionEvaluator.java:815)
at org.antlr.stringtemplate.language.ConditionalExpr.write(ConditionalExpr.java:99)
ST doesn't allow computation in the templates. That would make it part of the model.
You can't compare strings inside stringtemplate, unfortunately, but you can send a result of such a comparison into template as a parameter:
StringTemplate query = new StringTemplate("hello " +
"$if(paramEquals)$" +
" it works! " +
"$endif$ " +
"world");
query.setAttribute("paramEquals", param.equals("val1"));
System.out.println("result: "+query.toString());
It might not be what you're looking for, since every time you need to add a comparison you have to pass an extra parameter, and for loops it's even worse. But this is one workaround that may work for simple cases.
