I have a string from which I'd like to remove the end-of-line characters, but only at the very end of the string, using Java:
"foo\r\nbar\r\nhello\r\nworld\r\n"
which I'd like to become
"foo\r\nbar\r\nhello\r\nworld"
(This question is similar to, but not the same as question 593671)
You can use s = s.replaceAll("[\r\n]+$", "");. This trims the \r and \n characters at the end of the string
The regex is explained as follows:
[\r\n] is a character class containing \r and \n
+ is one-or-more repetition of the preceding element
$ is the end-of-string anchor
References
regular-expressions.info/Anchors, Character Class, Repetition
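As a minimal sketch of the pattern explained above (the class name TrailingNewlines and the method name are made up for illustration), the following shows that only the trailing line terminators are removed, whereas trim() would also strip other whitespace at both ends:

```java
class TrailingNewlines {
    // Removes only \r and \n characters at the very end of the string.
    static String stripTrailingNewlines(String s) {
        return s.replaceAll("[\r\n]+$", "");
    }

    public static void main(String[] args) {
        String s = "  foo\r\nbar\r\n\r\n";
        // Interior line breaks and leading spaces are preserved.
        System.out.println(stripTrailingNewlines(s).equals("  foo\r\nbar")); // true
        // trim() also removes the leading spaces.
        System.out.println(s.trim().equals("foo\r\nbar")); // true
    }
}
```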
Related topics
You can also use String.trim() to trim any whitespace characters from the beginning and end of the string:
s = s.trim();
If you need to check if a String contains nothing but whitespace characters, you can check if it isEmpty() after trim():
if (s.trim().isEmpty()) {
    //...
}
Alternatively you can also see if it matches("\\s*"), i.e. zero-or-more of whitespace characters. Note that in Java, the regex matches tries to match the whole string. In flavors that can match a substring, you need to anchor the pattern, so it's ^\s*$.
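To illustrate the difference between whole-string matching and substring matching, here is a small sketch. The class and method names are hypothetical; it relies only on String.matches and java.util.regex.Pattern:

```java
import java.util.regex.Pattern;

class BlankCheck {
    // String.matches must match the entire input, so no anchors are needed.
    static boolean isBlankWithMatches(String s) {
        return s.matches("\\s*");
    }

    // Matcher.find looks for a match anywhere, so the pattern is anchored explicitly.
    static boolean isBlankWithFind(String s) {
        return Pattern.compile("^\\s*$").matcher(s).find();
    }

    // Without anchors, find succeeds on any input because \s* can match the empty string.
    static boolean unanchoredFind(String s) {
        return Pattern.compile("\\s*").matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isBlankWithMatches(" \t\r\n")); // true
        System.out.println(isBlankWithMatches(" a "));     // false
        System.out.println(isBlankWithFind(" a "));        // false
        System.out.println(unanchoredFind(" a "));         // true
    }
}
```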
Related questions
regex, check if a line is blank or not
how to replace 2 or more spaces with single space in string and delete leading spaces only
Wouldn't String.trim do the trick here?
That is, you'd call .trim() on your string, and it returns a copy of that string minus any leading or trailing whitespace.
The Apache Commons Lang StringUtils.stripEnd(String str, String stripChars) will do the trick; e.g.
String trimmed = StringUtils.stripEnd(someString, "\n\r");
If you want to remove all whitespace at the end of the String:
String trimmed = StringUtils.stripEnd(someString, null);
Well, everyone gave some way to do it with regex, so I'll give a much faster, regex-free way instead:
public String replace(String val) {
    for (int i = val.length() - 1; i >= 0; i--) {
        char c = val.charAt(i);
        if (c != '\n' && c != '\r') {
            return val.substring(0, i + 1);
        }
    }
    return "";
}
In my benchmark it ran roughly 45 times faster than the regex solutions.
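The timing claim is not reproduced here. As a sanity check only, this sketch (class and method names are hypothetical) confirms that the backwards loop and the regex approach give the same result on a few sample inputs:

```java
class TrimCompare {
    // Loop-based version: scan backwards past \r and \n.
    static String stripWithLoop(String val) {
        for (int i = val.length() - 1; i >= 0; i--) {
            char c = val.charAt(i);
            if (c != '\n' && c != '\r') {
                return val.substring(0, i + 1);
            }
        }
        return "";
    }

    // Regex-based version for comparison.
    static String stripWithRegex(String val) {
        return val.replaceAll("[\r\n]+$", "");
    }

    public static void main(String[] args) {
        String[] samples = { "foo\r\nbar\r\n", "no newline", "\n\n", "" };
        for (String s : samples) {
            // Both approaches should agree on every sample.
            System.out.println(stripWithLoop(s).equals(stripWithRegex(s)));
        }
    }
}
```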
If you have Google's Guava libraries in your project (if not, you arguably should!) you'd do this with a CharMatcher:
String result = CharMatcher.anyOf("\r\n").trimTrailingFrom(input);
String text = "foo\r\nbar\r\nhello\r\nworld\r\n";
String result = text.replaceAll("[\r\n]+$", "");
"foo\r\nbar\r\nhello\r\nworld\r\n".replaceAll("\\s+$", "")
or
"foo\r\nbar\r\nhello\r\nworld\r\n".replaceAll("[\r\n]+$", "")
Related
I want a regex to split a string on the literal characters \r (a backslash followed by the letter r), not on a carriage return or a newline.
Below is the sample string I have.
MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2\rPID|1|833944|21796920320|8276975
I want this to be split into
MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2
PID|1|833944|21796920320|8276975
Currently I have something like this
StringUtils.split(testStr, "\\r");
but it is splitting into
MSH|^~
&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2
PID|1|833944|21796920320|8276975
You can just use String#split:
final String str = "MSH|^~\\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2\\rPID|1|833944|21796920320|8276975";
final String[] substrs = str.split("\\\\r");
System.out.println(Arrays.toString(substrs));
// Outputs [MSH|^~\&|1100|CB|CERASP|TESTSB8F|202008041554||ORU|1361|P|2.2, PID|1|833944|21796920320|8276975]
You can use
import java.util.regex.*;
//...
String[] results = text.split(Pattern.quote("\\r"));
Pattern.quote lets you pass plain text to String.split, which otherwise treats its argument as a regular expression. Here, \ is a special character that would need to be escaped both for the Java string literal and for the regex engine.
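A minimal sketch of the quoting approach, using a shortened version of the sample message (the class and method names are hypothetical):

```java
import java.util.regex.Pattern;

class LiteralSplit {
    // Splits on the two-character sequence backslash + 'r', treated as plain text.
    static String[] splitOnLiteralBackslashR(String text) {
        return text.split(Pattern.quote("\\r"));
    }

    public static void main(String[] args) {
        // "\\r" in Java source is a backslash followed by the letter r, not a carriage return.
        String text = "MSH|^~\\&|P|2.2\\rPID|1|833944";
        String[] parts = splitOnLiteralBackslashR(text);
        System.out.println(parts.length); // 2
        System.out.println(parts[0]);     // MSH|^~\&|P|2.2
        System.out.println(parts[1]);     // PID|1|833944
        // Pattern.quote("\\r") and the hand-escaped regex "\\\\r" behave the same here.
        System.out.println(text.split("\\\\r").length); // 2
    }
}
```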
The method being called treats each character of the delimiter string as a separate delimiter; it does not match the entire sequence. Here is the line from StringUtils that performs the delimiter check (str is the input string being split):
if (separatorChars.indexOf(str.charAt(i)) >= 0) {
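The effect of that per-character check can be imitated in plain Java. The helper below is hypothetical and is not the library's code; it assumes that empty tokens are dropped, which is consistent with the output shown in the question:

```java
import java.util.ArrayList;
import java.util.List;

class AnyCharSplit {
    // Hypothetical helper: treats every character of separatorChars as its own
    // delimiter and skips empty tokens.
    static List<String> splitOnAnyChar(String str, String separatorChars) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (separatorChars.indexOf(c) >= 0) {
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "MSH|^~\\&|P|2.2\\rPID|1";
        // The separator string "\\r" holds two characters: a backslash and 'r'.
        System.out.println(splitOnAnyChar(text, "\\r")); // [MSH|^~, &|P|2.2, PID|1]
    }
}
```

This is why the input is cut at the backslash before & as well as at the intended \r sequence.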
As @enzo mentioned, java.lang.String.split() will do the job - just make sure to quote the separator. Pattern.quote() can help.
I have a string that can contain commas, but it must not consist only of commas.
For example:
"," is wrong
"hello, my dear" is right
",,,," is wrong
",,hello" is right
For this reason I don't think I can use regex. How could I test for this while avoiding a chain of simple comparisons like this one?
myString.equals(",") || myString.equals(",,") || ....
The easiest solution, IMHO, would be to stream the characters, and check that they are all ',':
boolean onlyCommas = myString.chars().allMatch(c -> c == ',');
A regex is actually what you're looking for:
boolean result = myString.matches("^,+$");
^ represents the beginning of the string, $ represents the end of the string, and ,+ matches one or more comma characters. This way you match any string that consists only of commas.
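The stream check and the regex check differ on the empty string, which may matter depending on how empty input should be treated. A small sketch (class and method names are hypothetical); since String.matches already matches the whole input, the anchors are left out:

```java
class OnlyCommas {
    // Stream-based check: true when every character is a comma (also true for "").
    static boolean allCommasStream(String s) {
        return s.chars().allMatch(c -> c == ',');
    }

    // Regex-based check: requires at least one comma, so "" is rejected.
    static boolean allCommasRegex(String s) {
        return s.matches(",+");
    }

    public static void main(String[] args) {
        System.out.println(allCommasRegex(",,,,"));    // true
        System.out.println(allCommasRegex(",,hello")); // false
        System.out.println(allCommasStream(""));       // true
        System.out.println(allCommasRegex(""));        // false
    }
}
```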
String replacedString = someString.replace(",", "");
If it was made up of just commas the string will be empty afterwards.
My problem is to replace only the last occurrence of a character in the string with another character. When I used the String.replace(char1, char2), it replaces all the occurrences of the character in the string.
For example, I have an address string like
String str = "Addressline1,Addressline2,City,State,Country,";.
I need to replace the occurrence of ',' at the end of the string with '.'.
My code to replace the character is
str = str.replace(str.charAt(str.lastIndexOf(",")),'.');
After replacing, the string looks like:
Addressline1.Addressline2.City.State.Country.
Is this a problem in the Java SDK? If so, how can I resolve it?
You should use String.replaceAll, which takes a regex:
str = str.replaceAll(",$", ".");
The $ means the end of the string.
The Java replace function has a method declaration of:
public String replace(char oldChar, char newChar)
According to the docs replace will:
Return a new string resulting from replacing all occurrences of oldChar in this string with newChar.
So your code:
str.charAt(str.lastIndexOf(","))
will clearly return the character ','. replace then replaces all instances of the oldChar ',' with the newChar '.'. This explains the behavior you were seeing.
The solution that @ScaryWombat beat me to is your best option:
str = str.replaceAll(",$", ".");
Since, in regular expression terms, $ denotes the end of a String.
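As a rough sketch contrasting the approaches (class and method names are hypothetical), the second helper is an assumed regex-free alternative for the case where the last comma is not necessarily the final character:

```java
class ReplaceLast {
    // Replaces a comma at the end of the string with a period.
    static String replaceTrailingComma(String s) {
        return s.replaceAll(",$", ".");
    }

    // Hypothetical alternative: replaces the last comma wherever it occurs, without regex.
    static String replaceLastComma(String s) {
        int idx = s.lastIndexOf(',');
        if (idx < 0) {
            return s; // no comma present
        }
        return s.substring(0, idx) + '.' + s.substring(idx + 1);
    }

    public static void main(String[] args) {
        String str = "Addressline1,Addressline2,City,State,Country,";
        System.out.println(replaceTrailingComma(str)); // Addressline1,Addressline2,City,State,Country.
        System.out.println(replaceLastComma("a,b,c")); // a,b.c
        System.out.println(str.replace(',', '.'));     // Addressline1.Addressline2.City.State.Country.
    }
}
```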
Hope this helps!
I am trying to solve a codingbat problem using regular expressions, regardless of whether that works on the website.
So far, I have the following code which does not add a * between the two consecutive equal characters. Instead, it just bulldozes over them and replaces them with a set string.
public String pairStar(String str) {
    Pattern pattern = Pattern.compile("([a-z])\\1", Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(str);
    if (matcher.find())
        matcher.replaceAll(str); // this is where I don't know what to do
    return str;
}
I want to know how I could keep using regex and replace the whole string. If needed, I think a recursive system could help.
This works:
while (str.matches(".*(.)\\1.*")) {
    str = str.replaceAll("(.)\\1", "$1*$1");
}
return str;
Explanation of the regex:
The search regex (.)\\1:
(.) means "any character" (the .) and the brackets create a group - group 1 (the first left bracket)
\\1, which in regex is \1 (a java literal String must escape a backslash with another backslash) means "the first group" - this kind of term is called a "back reference"
So together (.)\1 means "any repeated character"
The replacement string $1*$1:
The $1 term means "the content captured as group 1"
Recursive solution:
Technically, the solution called for on that site is a recursive one, so here is a recursive implementation:
public String pairStar(String str) {
    if (!str.matches(".*(.)\\1.*")) return str;
    return pairStar(str.replaceAll("(.)\\1", "$1*$1"));
}
FWIW, here's a non-recursive solution:
public String pairStar(String str) {
    int len = str.length();
    StringBuilder sb = new StringBuilder(len * 2);
    char last = '\0';
    for (int i = 0; i < len; ++i) {
        char c = str.charAt(i);
        if (c == last) sb.append('*');
        sb.append(c);
        last = c;
    }
    return sb.toString();
}
I don't know Java, but I believe it has a string replace function that works with regular expressions. Your match string would be
([a-z])\\1
And the replace string would be
$1*$1
After some searching I think you are looking for this,
str.replaceAll("([a-z])\\1", "$1*$1").replaceAll("([a-z])\\1", "$1*$1");
These are my own solutions.
Recursive solution (which is probably more or less the solution that the problem is designed for)
public String pairStar(String str) {
    if (str.length() <= 1) return str;
    else return str.charAt(0) +
        (str.charAt(0) == str.charAt(1) ? "*" : "") +
        pairStar(str.substring(1));
}
If you want to complain about substring, then you can write a helper function pairStar(String str, int index) which does the actual recursion work.
Regex one-liner one-function-call solution
public String pairStar(String str) {
    return str.replaceAll("(.)(?=\\1)", "$1*");
}
Both solutions have the same spirit. They both check whether the current character is the same as the next character. If so, a * is inserted between the two identical characters. Then we move on to check the next character. This produces the expected output a*a*a*a from the input aaaa.
The normal regex solution of "(.)\\1" has a problem: it consumes 2 characters per match. As a result, we fail to compare the 2nd character of the pair with the character after it. The look-ahead resolves this problem: it compares against the next character without consuming it.
This is similar to the recursive solution, where we compare the next character with str.charAt(0) == str.charAt(1), while calling the function recursively on the substring with only the current character removed, pairStar(str.substring(1)).
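The difference between the consuming pattern and the look-ahead pattern shows up in a single pass over aaaa. A small sketch (the class and method names are hypothetical):

```java
class PairStarDemo {
    // Consuming pattern: each match swallows both characters of the pair.
    static String singlePassConsuming(String str) {
        return str.replaceAll("(.)\\1", "$1*$1");
    }

    // Look-ahead pattern: matches one character and only peeks at the next one.
    static String singlePassLookahead(String str) {
        return str.replaceAll("(.)(?=\\1)", "$1*");
    }

    public static void main(String[] args) {
        System.out.println(singlePassConsuming("aaaa"));  // a*aa*a
        System.out.println(singlePassLookahead("aaaa"));  // a*a*a*a
        System.out.println(singlePassLookahead("hello")); // hel*lo
    }
}
```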
I have a java string such as this:
String string = "I <strong>really</strong> want to get rid of the strong-tags!";
And I want to remove the tags. I have some other strings where the tags are way longer, so I'd like to find a way to remove everything between "<>" characters, including those characters.
One way would be to use the built-in string method that compares the string to a regEx, but I have no idea how to write those.
Caution is advised when using regex to parse HTML (due to its allowable complexity). However, for "simple" HTML and simple text (text without literal < or > in it) this will work:
String stripped = html.replaceAll("<.*?>", "");
To avoid Regex:
String toRemove = StringUtils.substringBetween(string, "<", ">");
String result = StringUtils.remove(string, "<" + toRemove + ">");
For multiple instances:
String[] allToRemove = StringUtils.substringsBetween(string, "<", ">");
String result = string;
// substringsBetween returns null when nothing matches, so guard the loop
if (allToRemove != null) {
    for (String toRemove : allToRemove) {
        result = StringUtils.remove(result, "<" + toRemove + ">");
    }
}
Apache StringUtils functions are null-, empty-, and no-match-safe.
You should use
String stripped = html.replaceAll("<[^>]*>", "");
or
String stripped = html.replaceAll("<[^<>]*>", "");
where <[^>]*> matches substrings starting with <, then zero or more chars other than > (or the chars other than < and > if you choose the second version) and then a > char.
Note that <.*?>
is less efficient than a negated character class (see Which would be better non-greedy regex or negated character class?)
does not find substrings spanning across multiple lines (see How do I match any character across multiple lines in a regular expression?), but it can be solved with (?s)<.*?>, <(?s:.)*?>, <[\w\W]*?>, and many other not-so-efficient variations.
See the regex demo.
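The multi-line difference can be seen with a tag that contains a line break. This is only a sketch with hypothetical class and method names, and the same caution about parsing real HTML with regex applies:

```java
class TagStrip {
    // Lazy dot: '.' does not match line terminators by default.
    static String stripLazy(String html) {
        return html.replaceAll("<.*?>", "");
    }

    // Negated character class: matches anything except '>', including line breaks.
    static String stripNegated(String html) {
        return html.replaceAll("<[^>]*>", "");
    }

    // Lazy dot with the DOTALL flag (?s), so '.' also matches line terminators.
    static String stripLazyDotAll(String html) {
        return html.replaceAll("(?s)<.*?>", "");
    }

    public static void main(String[] args) {
        String html = "I <strong\nclass=\"x\">really</strong> want";
        // The opening tag spans two lines, so the plain lazy dot leaves it in place.
        System.out.println(stripLazy(html).equals("I <strong\nclass=\"x\">really want")); // true
        System.out.println(stripNegated(html).equals("I really want"));                   // true
        System.out.println(stripLazyDotAll(html).equals("I really want"));                // true
    }
}
```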