How can we remove a "\" backslash character from a string in java? - java

I've been trying to figure out how we can remove a special character along with its preceding letters within a string.
Suppose there is a string "ABC\n000111". In this case we have to remove the "ABC\" part from the string, so the result would be n000111.
Can someone help me find an efficient way of doing this?

The Java string literal "ABC\n000111" doesn't contain a backslash: \n is a special character sequence, meaning a single character for a (unix) newline.
If you want to replace \n with n, you can do so:
System.out.println("ABC\n000111".replace('\n', 'n'));
If you want to replace everything up to and including the \n with n, you can do so:
System.out.println("ABC\n000111".replaceAll("^.*\n", "n"));
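To make the difference concrete, here is a minimal sketch. The class and method names (BackslashDemo, stripThroughBackslash) are made up for illustration, and it assumes the goal is to drop everything up to and including the first real backslash when the string actually contains one.

```java
class BackslashDemo {
    // Hypothetical helper: removes everything up to and including the first backslash.
    // Returns the input unchanged if it contains no backslash.
    static String stripThroughBackslash(String s) {
        int i = s.indexOf('\\');               // '\\' is the char literal for one backslash
        return i < 0 ? s : s.substring(i + 1);
    }

    public static void main(String[] args) {
        String withNewline = "ABC\n000111";    // \n is a single newline character
        String withBackslash = "ABC\\n000111"; // a real backslash followed by 'n'

        System.out.println(withNewline.length());                 // 10
        System.out.println(withBackslash.length());               // 11
        System.out.println(withNewline.indexOf('\\'));            // -1
        System.out.println(stripThroughBackslash(withBackslash)); // n000111
    }
}
```

A plain indexOf/substring is used here instead of a regex because only one fixed character is being searched for.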

Related

How to get index of Escape Character in a String?

How to get index of Escape Character in a String?
String test="1234\567890";
System.out.println("Result : "+test.lastIndexOf("\\"));
Result I get:
-1
Result I need: 4
Your original String doesn't contain \, which means you are searching for something that does not exist. In order to add a \ to your string, you have to escape it:
String test="1234\\567890";
System.out.println("Result : "+test.lastIndexOf("\\"));
Should work.
In your case look at the last line in the table.
I don't think you can get that, because an escape character tells Java to interpret the following character in a special way. In other words, the escape character and the next character you see in the source are really one entity from the point of view of the program being executed.
When you search for "\\", you are searching for the literal character '\', not the escape character.
Here you can see the difference: java fiddle
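In "1234\567890", Java reads \56 as a single octal escape, i.e. the character 0x2E ('.'), so the string is really "1234.7890". A lone \5 would be the character with code 0x5 (ENQ), whereas the digit 5 is the character 0x35. See the table.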
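A short sketch of what the compiler actually stores for the two literals. It relies on the octal escape rule as I understand it from the Java Language Specification: an escape starting with a digit from 4 to 7 takes at most two octal digits, so the escape here is \56. The class and method names are illustrative only.

```java
class EscapeIndexDemo {
    // Hypothetical helper: index of the last real backslash in s, or -1 if there is none.
    static int lastBackslash(String s) {
        return s.lastIndexOf('\\');
    }

    public static void main(String[] args) {
        String unescaped = "1234\567890"; // \56 is an octal escape for '.', no backslash is stored
        String escaped = "1234\\567890";  // \\ stores one literal backslash

        System.out.println(unescaped);                // 1234.7890
        System.out.println(lastBackslash(unescaped)); // -1
        System.out.println(lastBackslash(escaped));   // 4
        System.out.println(escaped.length());         // 11
    }
}
```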

Java Replacing Characters

This is pretty simple but how would I create a regex to strip anything but
letters a-Z,
numbers 0-9
and commas?
I think the regex expression for the first two is [^a-zA-Z_0-9] but how could I add commas to it.
Also, would it be the following?
"string".replaceAll("expression", null);
First of all, you cannot use null as the replacement value; it will throw a java.lang.NullPointerException. You must pass a string there, for example the empty string "" instead of null.
About the regex: if you need to add anything to your character class [], just add it there. For example [^a-z,*.]
Furthermore, your a-zA-Z_0-9 can be replaced with \\w
[^\\w,]
You can simply add comma to your negated character class
[^a-zA-Z0-9,]
(the comma before the closing ] is the addition)
Also, Strings are immutable, so replaceAll will not affect the original string but will create a new one with the characters replaced. You need to store the result somewhere (maybe in the reference to the original String).
The last thing is that you need to pass the empty string "" as the replacement, not null.
So try
yourString = yourString.replaceAll("[^a-zA-Z0-9,]","");
Another thing is that the regex you are currently using also prevents _ from being removed. If that was intentional, then instead of a-zA-Z_0-9 you can simply use the predefined character class \w (which in a Java string literal needs to be written as "\\w" because \ needs to be escaped), so your code can look like
yourString = yourString.replaceAll("[^\\w,]","");
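No, you should do:
value = "string".replaceAll("[\\W_&&[^,]]", "");
My pattern uses the predefined class \W (any non-word character) plus _, and excludes the comma with a character class intersection, so commas are kept. Putting the comma directly into the class, as in [\\W_,], would remove commas as well.
You should replace with the empty string and not null, and you have to assign the result back to your variable, as strings are immutable.
You can simplify your regex to mine.
Otherwise, just add , to your negated character class.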
[\w,]+ is the regex which matches alphanumeric, underscore and comma.
Here \w is equivalent to [A-Za-z0-9_]
[^\w,]+ is the regex which matches everything except alphanumeric characters, underscore and comma. (Note that [\W,]+ is not the same thing: it matches non-word characters and commas.)
Here \W matches any character that is not a word character (alphanumeric or underscore), which is equivalent to [^A-Za-z0-9_]
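The variants discussed in this thread can be compared side by side. This is only a sketch: the class name, method names and sample input are invented for illustration, and it assumes ASCII input (by default, \w in Java covers only [a-zA-Z_0-9]).

```java
class StripDemo {
    // Keeps ASCII letters, digits and commas; removes everything else, including '_'.
    static String keepAlnumAndCommas(String s) {
        return s.replaceAll("[^a-zA-Z0-9,]", "");
    }

    // Same idea with \w, which additionally keeps the underscore.
    static String keepWordCharsAndCommas(String s) {
        return s.replaceAll("[^\\w,]", "");
    }

    // Variant without an outer negation: (non-word characters or '_') intersected with "not a comma".
    static String keepAlnumAndCommasViaIntersection(String s) {
        return s.replaceAll("[\\W_&&[^,]]", "");
    }

    public static void main(String[] args) {
        String input = "a_b, c-1!";
        System.out.println(keepAlnumAndCommas(input));                // ab,c1
        System.out.println(keepWordCharsAndCommas(input));            // a_b,c1
        System.out.println(keepAlnumAndCommasViaIntersection(input)); // ab,c1
    }
}
```

In each case the result has to be assigned or returned, since replaceAll never modifies the string it is called on.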

Check string contains whitespace along with some other char sequence using regex in java

I am using a regular expression to check whether a string contains whitespace.
My regex is: ^\\s+$
For example, if my string is "my name", the regex match should return true.
But it returns true only if my string contains nothing but spaces.
How can I check whether a string contains a space, tab or carriage return character anywhere in it (start, middle or end)?
^(.*\s+.*)+$ seems to work for me. It accepts anything as long as there is at least one whitespace character in the string. This will match the entire string.
If you only want to check for the presence of whitespace, you can just use \s without any begin or end markers (with Matcher.find() rather than matches()). The difference is that this will only match the individual whitespace characters.
Your regex is not correct.
That's a string representing a regular expression. (as tchrist pointed out correctly)
The corresponding pattern that you get when using Pattern.compile() matches only strings containing one or more whitespace characters, starting from the beginning until the end. Thus, the matching string only consists of whitespace characters.
Try this string instead for Pattern.compile():
"\\s+"
The difference is that without the anchors "^" and "$" there may be other characters around the whitespace character. The whitespace character(s) may be anywhere in the string.
Using this pattern-string the whitespace character(s) must be at the beginning:
"^\\s+"
And here the sequence of whitespace characters has to be at the end:
"\\s+$"
Use org.apache.commons.lang3.StringUtils.containsAny(). See http://commons.apache.org/lang/api-3.1/org/apache/commons/lang3/StringUtils.html.
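A minimal sketch of the distinction using only java.util.regex. The class and method names are hypothetical; the key point is that Matcher.find() looks for the pattern anywhere in the input, whereas String.matches() requires the whole input to match.

```java
import java.util.regex.Pattern;

class WhitespaceDemo {
    private static final Pattern ANY_WHITESPACE = Pattern.compile("\\s");

    // True if s contains at least one whitespace character anywhere.
    static boolean containsWhitespace(String s) {
        return ANY_WHITESPACE.matcher(s).find();
    }

    // True only if s is non-empty and consists entirely of whitespace.
    static boolean isOnlyWhitespace(String s) {
        return s.matches("\\s+"); // matches() already requires the whole string to match
    }

    public static void main(String[] args) {
        System.out.println(containsWhitespace("my name")); // true
        System.out.println(isOnlyWhitespace("my name"));   // false
        System.out.println(containsWhitespace("myname"));  // false
        System.out.println(isOnlyWhitespace(" \t\r\n"));   // true
    }
}
```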

What's the difference between these similar Java regexes?

Are these three related Java regexes just different syntaxes for doing the same thing?
String resultString = subjectString.replaceAll("(?m)^\\d+\\.\\s*", "");
String resultString = subjectString.replace("^[0-9]+\\. *", "");
String resultString = subjectString.replaceAll('\\d+\.\\s+', '');
No, they are different:
(?m)^\\d+\\.\\s* matches
one or more digits at the begin of a line (note m modifier in (?m)), followed by
a literal ., followed by
zero or more whitespace characters (equivalent to [ \t\n\x0B\f\r]);
^[0-9]+\\. * matches
one or more digits at the begin of the string, followed by
a literal ., followed by
zero or more spaces;
\\d+\.\\s+ matches
one or more digits at any position, followed by
a literal ., followed by
one or more whitespace characters.
Besides that, as Adrian Smith has noted, replace does not expect a regular expression but a single char or a CharSequence (String implements that interface).
replace doesn't accept a regexp; it accepts a literal string (i.e. will really search for exactly those characters). replaceAll accepts a regexp.
The third one isn't valid Java because single quotes are used. Single quotes delimit a single char; double quotes delimit a String. In addition, \. is not a valid escape sequence in a Java literal, so it would have to be written as \\.
Close: each replaces a number followed by a period followed by white-space, e.g. "11. ". But each one has a slight difference:
The first requires that the digits be at the beginning of a line, and the white-space can be any white-space character, e.g. a tab (or none at all).
The second isn't valid as written, as noted, but if it used replaceAll() the digits would have to be at the beginning of the string and the white-space could only be the space character.
The third doesn't have to be at the beginning of the line and accepts any white-space characters, like the first one, but requires at least one.
The other differences are simply syntax.
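The differences are easy to see on a small input. In this sketch the class name, method names and sample text are made up, and the third pattern is rewritten with double quotes and \\. so that it compiles.

```java
class NumberedListDemo {
    // (?m) lets ^ match at the start of every line; \s* accepts any whitespace, or none.
    static String stripPerLine(String s) {
        return s.replaceAll("(?m)^\\d+\\.\\s*", "");
    }

    // Without (?m), ^ matches only at the start of the input; " *" accepts spaces only.
    static String stripAtStart(String s) {
        return s.replaceAll("^[0-9]+\\. *", "");
    }

    // replace() treats its first argument as literal text, not as a regex.
    static String stripLiteral(String s) {
        return s.replace("^[0-9]+\\. *", "");
    }

    // No anchor: matches anywhere, but needs at least one whitespace character.
    static String stripAnywhere(String s) {
        return s.replaceAll("\\d+\\.\\s+", "");
    }

    public static void main(String[] args) {
        String text = "1. first\n22.\tsecond";
        System.out.println(stripPerLine(text).equals("first\nsecond"));      // true
        System.out.println(stripAtStart(text).equals("first\n22.\tsecond")); // true
        System.out.println(stripLiteral(text).equals(text));                 // true
        System.out.println(stripAnywhere("v1. 2. x"));                       // vx
    }
}
```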

How to escape a square bracket for Pattern compilation?

I have comma separated list of regular expressions:
.{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]
I have done a split on the comma. Now I'm trying to match these regexes against a generated password. The problem is that Pattern.compile does not like square brackets that are not escaped.
Can someone please give me a simple function that takes a string like [0-9] and returns the escaped string \[0-9\]?
For some reason, the above answer didn't work for me. For those like me who come after, here is what I found.
I was expecting a single backslash to escape the bracket, however, you must use two if you have the pattern stored in a string. The first backslash escapes the second one into the string, so that what regex sees is \]. Since regex just sees one backslash, it uses it to escape the square bracket.
\\]
In regex, that will match a single closing square bracket.
If you're trying to match a newline, though, you only need a single backslash: "\n" uses the string escape to insert a newline character into the string, so the regex engine doesn't see \n, it sees the newline character and matches that. The bracket needs two backslashes because \] is not a string escape sequence, it's a regex escape sequence.
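The two escaping levels can be checked directly. This is a small sketch with an invented class name and a thin hypothetical wrapper (fullMatch) around Pattern.matches.

```java
import java.util.regex.Pattern;

class EscapeLevelsDemo {
    // Hypothetical wrapper: true if the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // The Java literal "\\]" is the two-character string \] ; the regex engine reads it as a literal ']'.
        System.out.println("\\]".length());          // 2
        System.out.println(fullMatch("\\]", "]"));   // true

        // The Java literal "\n" is already a newline character; the regex matches it as-is.
        System.out.println(fullMatch("\n", "\n"));   // true

        // "\\n" reaches the regex engine as \n, which the engine itself interprets as a newline.
        System.out.println(fullMatch("\\n", "\n"));  // true
    }
}
```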
You can use Pattern.quote(String).
From the docs:
public static String quote(String s)
Returns a literal pattern String for the specified String.
This method produces a String that can be used to create a Pattern that would match the string s as if it were a literal pattern.
Metacharacters or escape sequences in the input sequence will be given no special meaning.
You can use the \Q and \E special characters...anything between \Q and \E is automatically escaped.
\Q[0-9]\E
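A brief sketch of what quoting does in practice. The class and helper names are hypothetical; Pattern.quote and Pattern.matches are the standard java.util.regex methods.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Hypothetical helper: true if input is exactly the given text, every character taken literally.
    static boolean equalsLiterally(String text, String input) {
        return Pattern.compile(Pattern.quote(text)).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(Pattern.quote("[0-9]"));            // \Q[0-9]\E
        System.out.println(equalsLiterally("[0-9]", "[0-9]")); // true
        System.out.println(equalsLiterally("[0-9]", "5"));     // false
        System.out.println(Pattern.matches("[0-9]", "5"));     // true: unquoted, it is a character class
    }
}
```

Quoting is only what you want when the text itself, brackets included, should be matched; it turns off the character class meaning entirely.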
Pattern.compile() likes square brackets just fine. If you take the string
".{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]"
and split it on commas, you end up with five perfectly valid regexes: the first one matches eight non-line-separator characters, the second matches an ASCII digit, and so on. Unless you really want to match strings like ".{8}" and "[0-9]", I don't see why you would need to escape anything.
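As a sketch of that point, the rule list can be split and compiled without any escaping. The class and method names are invented, and using find() (each rule must occur somewhere in the password) is an assumption about the intended check, not something stated in the question.

```java
import java.util.regex.Pattern;

class PasswordRulesDemo {
    // The rule list from the question; each comma-separated piece is compiled unchanged.
    static final String RULES = ".{8},[0-9],[^0-9A-Za-z ],[A-Z],[a-z]";

    // True if every rule is found somewhere in the candidate password.
    static boolean satisfiesAll(String password) {
        for (String rule : RULES.split(",")) {
            if (!Pattern.compile(rule).matcher(password).find()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(satisfiesAll("Abcdef1!")); // true
        System.out.println(satisfiesAll("abcdefgh")); // false: no digit, symbol or upper-case letter
        System.out.println(satisfiesAll("Ab1!"));     // false: shorter than 8 characters
    }
}
```

Splitting on commas only works while no rule contains a comma itself; a quantifier such as {8,} would be cut in two.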
