I've used the following regex to try to remove parentheses and everything within them in a string called name.
name.replaceAll("\\(.*\\)", "");
For some reason, this is leaving name unchanged. What am I doing wrong?
Strings are immutable. You have to do this:
name = name.replaceAll("\\(.*\\)", "");
Edit: Also, since the .* is greedy, it will kill as much as it can. So "(abc)something(def)" will be turned into "".
As mentioned by Jelvis, ".*" is greedy, so it turns "(ab) ok (cd)" into "".
The version below handles that case by matching everything except a closing parenthesis inside each group. It replaces each group and its surrounding whitespace with a single space, so "(ab) ok (cd)" becomes " ok "; call trim() on the result to get "ok".
test = test.replaceAll("\\s*\\([^\\)]*\\)\\s*", " ");
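To make the difference concrete, here is a minimal sketch comparing the greedy pattern with the negated-class one. The class and method names (ParenStripper, stripParens) are made up for illustration, and the trim() call is my addition. It does not handle nested parentheses.

```java
final class ParenStripper {
    // Hypothetical helper: replaces each "(...)" group and the whitespace
    // around it with one space, then trims the ends.
    static String stripParens(String s) {
        return s.replaceAll("\\s*\\([^)]*\\)\\s*", " ").trim();
    }

    public static void main(String[] args) {
        String name = "(ab) ok (cd)";
        // Greedy: .* runs from the first '(' to the last ')'.
        System.out.println("[" + name.replaceAll("\\(.*\\)", "") + "]");
        // Negated class: each group is matched on its own.
        System.out.println("[" + stripParens(name) + "]");
    }
}
```

The first line printed is an empty pair of brackets; the second keeps the text between the two groups.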
String.replaceAll() doesn't edit the original string, but returns the new one. So you need to do:
name = name.replaceAll("\\(.*\\)", "");
If you read the Javadoc for String.replaceAll(), you'll notice that it specifies that the resulting string is the return value.
More generally, Strings are immutable in Java; they never change value.
I'm using this function:
public static String remove_parenthesis(String input_string, String parenthesis_symbol){
// removing parenthesis and everything inside them, works for (),[] and {}
if(parenthesis_symbol.contains("[]")){
return input_string.replaceAll("\\s*\\[[^\\]]*\\]\\s*", " ");
}else if(parenthesis_symbol.contains("{}")){
return input_string.replaceAll("\\s*\\{[^\\}]*\\}\\s*", " ");
}else{
return input_string.replaceAll("\\s*\\([^\\)]*\\)\\s*", " ");
}
}
You can call it like this:
remove_parenthesis(g, "[]");
remove_parenthesis(g, "{}");
remove_parenthesis(g, "()");
To stop .* from removing everything between two sets of parentheses, make the quantifier lazy:
name = name.replaceAll("\\(.*?\\)", "");
In Kotlin, replace() with a String pattern does a literal replacement, so convert the pattern with toRegex():
val newName = name.replace("\\(.*?\\)".toRegex(), "")
I'm trying to find a word in a string. However, because of a trailing period, one occurrence is not recognized. I'm trying to remove the punctuation, but it seems to have no effect. Am I missing something here? This is the line of code I am using: s.replaceAll("([a-z] +) [?:!.,;]*","$1");
String test = "This is a line about testing tests. Tests are used to examine stuff";
String key = "tests";
int counter = 0;
String[] testArray = test.toLowerCase().split(" ");
for(String s : testArray)
{
s.replaceAll("([a-z] +) [?:!.,;]*","$1");
System.out.println(s);
if(s.equals(key))
{
System.out.println(key + " FOUND");
counter++;
}
}
System.out.println(key + " has been found " + counter + " times.");
}
I managed to find a solution (though it may not be ideal) by using s = s.replaceAll("\\W",""); Thanks for everyone's guidance on how to solve this problem.
You could also take advantage of the regex in the split operation. Try this:
String[] testArray = test.toLowerCase().split("\\W+");
This will split on apostrophe, so you may need to tweak it a bit with a specific list of characters.
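As a rough sketch of that idea applied to the counting loop from the question (the class and method names here are made up for illustration):

```java
final class WordCounter {
    // Hypothetical helper: counts how often key occurs as a whole word,
    // ignoring case and any punctuation attached to the word.
    static int count(String text, String key) {
        int counter = 0;
        // Splitting on runs of non-word characters drops '.', ',' and so on.
        for (String word : text.toLowerCase().split("\\W+")) {
            if (word.equals(key)) {
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        String test = "This is a line about testing tests. Tests are used to examine stuff";
        System.out.println("tests has been found " + count(test, "tests") + " times.");
    }
}
```

With the sample sentence this finds "tests" twice, because "tests." loses its period during the split and "Tests" is lowercased.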
Strings are immutable. You would need to assign the result of replaceAll back to the String:
s = s.replaceAll("([a-z] +)*[?:!.,;]*", "$1");
^
Also, your regex requires a space between the word and the punctuation. In the case of tests., there isn't one. You can adjust your regex with an optional (zero or more) quantifier to account for this.
Your regex doesn't seem to work the way you want.
If you want to find a word that is followed by punctuation, this will work:
([a-z]*) [?(:!.,;)*]
It returns "tests." when run on your given string.
Also,
[?(:!.,;)*]
matches just the punctuation, which can then be replaced.
However, I am not sure why you are not using the substring() function.
Pretty basic question for someone who knows.
Instead of getting from
"This is my text.
And here is a new line"
To:
"This is my text. And here is a new line"
I get:
"This is my text.And here is a new line"
Any idea why?
L.replaceAll("[\\\t|\\\n|\\\r]","\\\s");
I think I found the culprit.
On the next line I do the following:
L.replaceAll( "[^a-zA-Z0-9|^!|^?|^.|^\\s]", "");
And this seems to be causing my issue.
Any idea why?
I am obviously trying to do the following: remove all non-chars, and remove all new lines.
\s is a shorthand for whitespace characters in a regex pattern. It does not have that meaning in the replacement string, so you can't use it there. Put exactly the character(s) you want to insert; for a space, just use " " as the replacement.
The other thing is: why do you use three backslashes as the escape sequence? Two are enough in Java. And you don't need a | (alternation operator) inside a character class; there it is just a literal pipe.
L.replaceAll("[\\t\\n\\r]+"," ");
Remark
L is not changed. If you want to have a result you need to do
String result = L.replaceAll("[\\t\\n\\r]+"," ");
Test code:
String in = "This is my text.\n\nAnd here is a new line";
System.out.println(in);
String out = in.replaceAll("[\\t\\n\\r]+"," ");
System.out.println(out);
The new line separator is different for different OS-es - '\r\n' for Windows and '\n' for Linux.
To be safe, you can use regex pattern \R - the linebreak matcher introduced with Java 8:
String inlinedText = text.replaceAll("\\R", " ");
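A small sketch of the \R approach, assuming Java 8 or later. The names LineJoiner and joinLines are made up, and the + quantifier is my addition so that a blank line does not produce a double space:

```java
final class LineJoiner {
    // Hypothetical helper: replaces every run of line breaks
    // (\n, \r\n, \r, ...) with a single space.
    static String joinLines(String text) {
        return text.replaceAll("\\R+", " ");
    }

    public static void main(String[] args) {
        String windows = "This is my text.\r\n\r\nAnd here is a new line";
        String unix = "This is my text.\nAnd here is a new line";
        System.out.println(joinLines(windows));
        System.out.println(joinLines(unix));
    }
}
```

Both inputs come out as one line with a single space after "text.". Note that \R cannot be used inside a character class.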
Try
L.replaceAll("(\\t|\\r?\\n)+", " ");
Depending on the system, a line break is either \r\n or just \n.
I found this.
String newString = string.replaceAll("\n", " ");
Although, as you have a double line, you will get a double space. I guess you could then do another replace all to replace double spaces with a single one.
If that doesn't work try doing:
string.replaceAll(System.getProperty("line.separator"), " ");
If I create lines in "string" by using "\n" I had to use "\n" in the regex. If I used System.getProperty() I had to use that.
Your regex is good, although I would replace the matches with the empty string:
String resultString = subjectString.replaceAll("[\t\n\r]", "");
You expect a space between "text." and "And" right?
I get that space when I try the regex by copying your sample
"This is my text. "
So all is well here. Maybe if you just replace it with the empty string it will work. I don't know why you replace it with \s. And the alternation | is not necessary in a character class.
You can also split on the line breaks first and then rejoin the parts with a single space:
String[] parts = L.split("[\\r\\n]+");
L = String.join(" ", parts);
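This should take care of spaces, tabs and newlines. Use + rather than *, because * also matches the empty string and would insert a space between every character:
data = data.replaceAll("[ \t\n\r]+", " ");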
I need to process Java source files, and I am using String's trim() method to remove leading and trailing whitespace. However, for code that is nested inside some scope, for example:
if(name.equals("joe")){
System.out.println(name);
}
the whitespace before the print statement is not removed completely. Is there a way to remove this whitespace as well?
Thanks
EDIT: I did use a new variable:
String n = statements.get(i).toString().trim();
System.out.println(n);
however the output still looks like this:
System.out.println("NAME:" + m.getName());
BlockStmt bs = m.getBody();
List<Statement> statements = bs.getStmts();
for (int i = 0; i < statements.size(); i++) {
if ((statements.get(i).toString().trim().contains(needed)) & (statements.get(i).toString().trim().length() == needed.length())) {
System.out.println("HEREEEEEEEEEEEEEEEEEEEEEEEEEE");
}
}
Some of the strings still contain the leading spaces.
You are mistaken. The String.trim() method does remove leading and trailing whitespace entirely.
However, I suspect that your real problem is that you don't know what this really means. Java strings are immutable, so trim() obviously doesn't modify the target String object. Instead, it returns a new String instance with the whitespace removed. So you need to use it as follows:
String trimmed = someString.trim();
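One more thing to check, and this is only a guess about the question: if a string spans several lines (for example a block statement printed together with its body), trim() only touches the very start and the very end, so the indentation of the inner lines stays. A sketch that strips the indentation of every line, with made-up names:

```java
final class IndentStripper {
    // Hypothetical helper: removes leading spaces and tabs from every line.
    // (?m) makes ^ match at the start of each line, not only at the start of the input.
    static String stripLineIndent(String code) {
        return code.replaceAll("(?m)^[ \\t]+", "");
    }

    public static void main(String[] args) {
        String code = "  if(name.equals(\"joe\")){\n      System.out.println(name);\n  }  ";
        System.out.println(code.trim());                  // inner lines keep their indentation
        System.out.println(stripLineIndent(code.trim())); // every line starts in column 0
    }
}
```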
You have to assign the result of trim() back to the variable (String objects are immutable).
name=name.trim();
if(name.equals("joe")){
System.out.println(name);
}
As @home mentioned:
if(name.equals("joe")){
String newName = name.trim();
System.out.println(newName);
}
Should work
EDIT: I guess that you want to use trim before the condition. My mistake.
String newName = name.trim();
if(newName.equals("joe")){
System.out.println(newName);
}
I am getting a response for some images in JSON format within this tag:
"xmlImageIds":"57948916||57948917||57948918||57948919||57948920||57948921||57948922||57948923||57948924||57948925||57948926||5794892"
What I want to do is separate each image id using the String class's .split("||"), then append the image id to a URL and display it.
I have tried .replace("\"|\"|","\"|"); but it's not working for me. Please help.
EDIT: Shabbir, I tried to update your question according to your comments below. Please edit it again, if I didn't get it right.
Use
.replace("||", "|");
| is not a special character for replace(), which takes its arguments literally.
However, if you are using split() or replaceAll instead of replace(), beware that you need to escape the pipe symbol as \\|, because these methods take a regex as parameter.
For example:
public static void main(String[] args) {
String in = "\"xmlImageIds\":\"57948916||57948917||57948918||57948919||57948920||57948921||57948922||57948923||57948924||57948925||57948926||5794892\"".replace("||", "|");
String[] q = in.split("\"");
String[] ids = q[3].split("\\|");
for (String id : ids) {
System.out.println("http://test/" + id);
}
}
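If the goal is only to split on the double pipe, the replace step can probably be skipped by escaping both pipes in the split pattern. A minimal sketch with made-up names (PipeSplit, splitIds) and a shortened id list:

```java
import java.util.Arrays;

final class PipeSplit {
    // Hypothetical helper: splits on the literal two-character separator "||".
    // Pattern.quote("||") would be an equivalent way to build the pattern.
    static String[] splitIds(String ids) {
        return ids.split("\\|\\|");
    }

    public static void main(String[] args) {
        String ids = "57948916||57948917||57948918";
        // Unescaped, "||" is a regex of empty alternatives that matches the
        // empty string, so the input is cut between every character.
        System.out.println(Arrays.toString(ids.split("||")));
        // Escaped, the separator is treated literally.
        System.out.println(Arrays.toString(splitIds(ids)));
    }
}
```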
I think I know what your problem is. You need to assign the result of replace(), not just call it.
String s = "foo||bar||baz";
s = s.replace("||", "|");
System.out.println(s);
I tested it, and just calling s.replace("||", "|"); doesn't seem to modify the string; you have to assign that result back to s.
Edit: The Java 6 spec says "Returns a new string resulting from replacing all occurrences of oldChar in this string with newChar." (the emphasis is mine).
According to http://docs.oracle.com/javase/6/docs/api/java/lang/String.html, replace() takes chars instead of Strings. Perhaps you should try replaceAll(String, String) instead? Either that, or try changing your String ("") quotation marks into char ('') quotation marks.
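Edit: I just noticed the overload of replace() that takes a CharSequence, so replace("||", "|") works as written. If you try replaceAll() instead, remember that its first argument is a regex, so the pipes must be escaped: replaceAll("\\|\\|", "|").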
String pipe="pipes||";
System.out.println("Old Pipe:::"+pipe);
System.out.println("Updated Pipe:::"+pipe.replace("||", "|"));
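I don't remember exactly how that method works, but you can write your own:
String withTwoPipes = "helloTwo||pipes";
StringBuilder sb = new StringBuilder();
for (int i = 0; i < withTwoPipes.length(); i++) {
    char a = withTwoPipes.charAt(i);
    // Strings are immutable, so build the result instead of assigning to charAt().
    // Skip a pipe that is directly followed by another pipe.
    if (a == '|' && i + 1 < withTwoPipes.length() && withTwoPipes.charAt(i + 1) == '|') {
        continue;
    }
    sb.append(a);
}
String withOnePipe = sb.toString(); // "helloTwo|pipes"
I think code like this should work. It's not a perfect answer (it collapses any run of pipes to a single one), but it can help.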
I need to build a regular expression that finds the word "int" only if it's not part of some string.
I want to find whether int is used in the code. (not in some string, only in regular code)
Example:
int i; // the regex should find this one.
String example = "int i"; // the regex should ignore this line.
logger.i("int"); // the regex should ignore this line.
logger.i("int") + int.toString(); // the regex should find this one (because of the second int)
thanks!
It's not going to be bullet-proof, but this works for all your test cases:
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
It does a look behind and look ahead to assert that there's either none or two preceding/following quotes "
Here's the code in java with the output:
String regex = "(?<=^([^\"]*|[^\"]*\"[^\"]*\"[^\"]*))\\bint\\b(?=([^\"]*|[^\"]*\"[^\"]*\"[^\"]*)$)";
System.out.println(regex);
String[] tests = new String[] {
"int i;",
"String example = \"int i\";",
"logger.i(\"int\");",
"logger.i(\"int\") + int.toString();" };
for (String test : tests) {
System.out.println(test.matches("^.*" + regex + ".*$") + ": " + test);
}
Output (included regex so you can read it without all those \ escapes):
(?<=^([^"]*|[^"]*"[^"]*"[^"]*))\bint\b(?=([^"]*|[^"]*"[^"]*"[^"]*)$)
true: int i;
false: String example = "int i";
false: logger.i("int");
true: logger.i("int") + int.toString();
Using a regex is never going to be 100% accurate - you need a language parser. Consider escaped quotes in Strings "foo\"bar", in-line comments /* foo " bar */, etc.
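Short of a real parser, one middle ground is to blank out the string literals first and only then search for the word. The sketch below uses made-up names (IntFinder, containsIntOutsideStrings) and assumes one statement per line. It handles escaped quotes but not comments or char literals such as '"'.

```java
import java.util.regex.Pattern;

final class IntFinder {
    private static final Pattern INT_WORD = Pattern.compile("\\bint\\b");

    // Hypothetical helper: true if "int" occurs as a whole word outside
    // double-quoted string literals on the given line.
    static boolean containsIntOutsideStrings(String line) {
        StringBuilder code = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++;                 // skip the escaped character, e.g. \"
                } else if (c == '"') {
                    inString = false;    // closing quote
                    code.append(c);
                }
                // other characters inside the literal are dropped
            } else {
                if (c == '"') {
                    inString = true;     // opening quote
                }
                code.append(c);
            }
        }
        return INT_WORD.matcher(code).find();
    }

    public static void main(String[] args) {
        String[] tests = {
            "int i;",
            "String example = \"int i\";",
            "logger.i(\"int\");",
            "logger.i(\"int\") + int.toString();" };
        for (String test : tests) {
            System.out.println(containsIntOutsideStrings(test) + ": " + test);
        }
    }
}
```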
Not exactly sure what your complete requirements are but
^\s*\bint\b
perhaps
Assuming input will be each line,
^int\s[\$_a-zA-Z\;]*$
it follows basic variable naming rules :)
If you want to parse code and search for the isolated word int, this works:
(^int|[\(\ \;,]int)
It finds an int that is either the first word of the line or preceded by a space, comma, ";" or left parenthesis.
You can try it here and enhance it http://www.regextester.com/
PS: this works in all your test cases.
^[^"]*\bint\b
should work. I can't think of a situation where you can use a valid int identifier after the character '"'.
Of course this only applies if the code is limited to one statement per line.