How to properly replace a character in a string using java regex? - java

In my Java app, I have the following character sequence: b"2 (any single character, followed by a double quote, followed by a single digit).
I need to replace the double quote with a single quote character.
I'm trying this:
Pattern p = Pattern.compile(".\"d");
Matcher m = p.matcher(initialOutput);
String replacement = m.replaceAll(".'d");
This does not seem to do anything.
What is the right way of doing this?

First off, d represents a literal character. You're looking for \d, which represents a numeric digit.
The other issue is that you're replacing variable characters with the string literal ".'d". One solution is to capture the variable portions and reference them in the replacement:
String replacement = initialOutput.replaceAll("(.)\"(\\d)", "$1'$2");
Another approach is to use lookarounds to check the surrounding characters without actually matching them for replacement:
String replacement = initialOutput.replaceAll("(?<=.)\"(?=\\d)", "'");
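To compare the two approaches side by side, here is a minimal sketch. The class and method names (QuoteFix, withGroups, withLookarounds) are made up for illustration; only the two patterns come from the answer above.

```java
public class QuoteFix {
    // Capture the characters around the quote and put them back via $1 and $2.
    static String withGroups(String input) {
        return input.replaceAll("(.)\"(\\d)", "$1'$2");
    }

    // Match only the quote itself; the lookarounds check its neighbours
    // without consuming them.
    static String withLookarounds(String input) {
        return input.replaceAll("(?<=.)\"(?=\\d)", "'");
    }

    public static void main(String[] args) {
        System.out.println(withGroups("b\"2"));            // b'2
        System.out.println(withLookarounds("b\"2"));       // b'2

        // The two differ when matches would overlap: the capturing version
        // consumes the digit, so it cannot serve as the "any character"
        // in front of the next quote.
        System.out.println(withGroups("a\"1\"2"));         // a'1"2
        System.out.println(withLookarounds("a\"1\"2"));    // a'1'2
    }
}
```

If quotes can appear back to back between digits, as in the last two lines, the lookaround version is probably the one you want.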

Related

Using NOT in Regex in replaceAll

I have this string:
String a = "$$bar$55^$$";
I want to remove all symbols. I wrote this regex:
String b = a.replaceAll("(?<=[^[\\p{Alpha}][\\p{Digit}]])", "");
But, I get:
$$bar$55^$$
But I want to get this string:
bar55
What am I doing wrong? How can I filter out all characters except letters and numbers?
In Oracle this works for me:
select regexp_replace('$$bar$55^$$','[^[:alpha:][:digit:]]*') from dual;
You are using a lookaround that is a non-consuming pattern, i.e. the match value will always be empty since only a location inside a string will be matched. Use
String b = a.replaceAll("\\P{Alnum}+", "");
The \\P{Alnum}+ pattern matches one or more chars other than ASCII alphanumeric chars. Also, see Predefined Character classes.
Alternatively, you may use
String b = a.replaceAll("[^\\p{L}\\p{N}]+", "");
This will remove chunks of 1 or more chars other than Unicode letters and numbers.
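A small sketch of the difference between an ASCII-only and a Unicode-aware filter. The class and method names are hypothetical, and the character class [^\p{L}\p{N}] in the second method is this sketch's own choice for "keep Unicode letters and numbers", not something taken from the question.

```java
public class StripSymbols {
    // ASCII only: by default \P{Alnum} is anything other than [A-Za-z0-9].
    static String keepAsciiAlnum(String s) {
        return s.replaceAll("\\P{Alnum}+", "");
    }

    // Unicode-aware (assumption of this sketch): keep any letter (\p{L})
    // or number (\p{N}) and drop everything else.
    static String keepUnicodeAlnum(String s) {
        return s.replaceAll("[^\\p{L}\\p{N}]+", "");
    }

    public static void main(String[] args) {
        System.out.println(keepAsciiAlnum("$$bar$55^$$"));        // bar55
        System.out.println(keepUnicodeAlnum("$$bar$55^$$"));      // bar55

        // \u00e4 is a lowercase a with diaeresis: dropped by the ASCII
        // version, kept by the Unicode version.
        System.out.println(keepAsciiAlnum("$b\u00e4r$55"));
        System.out.println(keepUnicodeAlnum("$b\u00e4r$55"));
    }
}
```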

How do i check if string contains char sequence and backslash "\"?

I'm trying to get true in the following test. I have a string with a backslash that, for some reason, isn't recognized.
String s = "Good news\\ everyone!";
Boolean test = s.matches("(.*)news\\.");
System.out.println(test);
I've tried a lot of variants, but only (.*)news(.*) works. That, however, allows any characters after news; I need it to match only when news is followed by \.
How can I do that?
Group the elements at the end: (.*)news\\(.*)
You can use this instead:
Boolean test = s.matches("(.*)news\\\\(.*)");
Try something like:
Boolean test = s.matches(".*news\\\\.*");
Here the first .* matches any number of characters, news matches the literal text, \\\\ in the Java string literal becomes the regex \\, which matches a single literal backslash, and the final .* matches any number of characters after that (possibly zero).
Your regex, by contrast, means:
(.*) - any number of characters
news - the literal text news
\\. - in the Java literal this is the regex \., an escaped dot, so it matches a literal "." and not a backslash
That does not match the string in your program, "Good news\ everyone!", which has a backslash after news and no dot.
You are testing for an escaped occurrence of a literal dot: ".".
Refactor your pattern as follows (inferring the last part as you need it for a full match):
String s = "Good news\\ everyone!";
System.out.println(s.matches("(.*)news\\\\.*"));
Output
true
Explanation
The backslash is used to escape characters, including the backslash itself, in Java string literals.
In Java Pattern representations, you need to double-escape your backslashes to represent a literal backslash ("\\\\"), because double backslashes are already used to introduce special constructs (e.g. \\p{Punct}) or to escape metacharacters (e.g. the literal dot \\.).
String.matches attempts to match the whole String against your pattern, so you need the terminal part of the pattern that I've added.
You can try this:
String s = "Good news\\ everyone!";
Boolean test = s.matches("(.*)news\\\\(.*)");
System.out.println(test);
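The escaping layers discussed in these answers can be sketched as follows. The class and method names are hypothetical. Pattern.quote and String.contains are standard library calls, shown here as alternatives that avoid counting backslashes by hand.

```java
import java.util.regex.Pattern;

public class BackslashMatch {
    // Regex wanted: .*news\\.*
    // Each regex backslash is doubled again in the Java literal, hence four.
    static boolean viaEscapedRegex(String s) {
        return s.matches(".*news\\\\.*");
    }

    // Pattern.quote turns the literal text news\ into a quoted regex,
    // so only the Java-level escaping is needed.
    static boolean viaQuote(String s) {
        return s.matches(".*" + Pattern.quote("news\\") + ".*");
    }

    // No regex at all: a plain substring test.
    static boolean viaContains(String s) {
        return s.contains("news\\");
    }

    public static void main(String[] args) {
        String s = "Good news\\ everyone!";   // runtime value: Good news\ everyone!
        System.out.println(viaEscapedRegex(s));                      // true
        System.out.println(viaQuote(s));                             // true
        System.out.println(viaContains(s));                          // true
        System.out.println(viaEscapedRegex("Good news everyone!"));  // false
    }
}
```

When the goal is only to test for a fixed substring, the contains version is probably the easiest to read.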

Java regex negative lookahead to replace non-triple characters

I'm trying to take a number, convert it into a string and replace all characters that are not a triple.
E.g., if I pass in 1222331, my replace method should return 222. I can find that this pattern exists, but I need to get the value and save it into a string for additional logic. I don't want to write a for loop to iterate through this string.
I have the following code:
String first = Integer.toString(num1);
String x = first.replaceAll("^((?!([0-9])\\3{2})).*$","");
But it's replacing the triple digits also. I only need it to replace the rest of the characters. Is my approach wrong?
You can use
first = first.replaceAll("((\\d)\\2{2})|\\d", "$1");
The regex - ((\d)\2{2})|\d - matches either a digit that repeats thrice (and captures it into Group 1), or just matches any other digit. $1 just restores the captured text in the resulting string while removing all others.
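A quick way to check this behaviour is the sketch below. KeepTriples and keepTriples are made-up names; the pattern is the one from the answer. When the second alternative matches, Group 1 did not participate, and Java's replaceAll substitutes an empty string for $1, which is why the lone digits disappear.

```java
public class KeepTriples {
    // Either match three identical digits (kept via Group 1) or any
    // single digit (Group 1 is unset, so it is replaced by nothing).
    static String keepTriples(String digits) {
        return digits.replaceAll("((\\d)\\2{2})|\\d", "$1");
    }

    public static void main(String[] args) {
        String first = Integer.toString(1222331);
        System.out.println(keepTriples(first));      // 222
        System.out.println(keepTriples("555666"));   // 555666
        System.out.println(keepTriples("1111"));     // 111 (the fourth 1 is dropped)
    }
}
```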

Regular expression for a complex string

I am new to regexp. I need to validate a string, but when I use my current attempt it is always returning false.
Rules:
A text matcher like "polygon(( ))"
number matcher like X Y, where x and y can be any double numbers
as many X Y pairs, separated by comma.
eg:
PolyGoN((
-74.0075783459999 40.710775696,
-74.007375926 40.710655064,
-74.0074640719999 40.7108592490001,
-74.0075783459999 40.710775696))
Here is the code that I used:
String inputString = "POLYGON((-74.0075783459999 40.710775696, -74.007375926 40.710655064, -74.0072836009999 40.710720973, -74.0075783459999 40.710775696))";
String regexp = "polygon[\\((][(\\-?\\d+(\\.\\d+)?)\\s*(\\-?\\d+(\\.\\d+)?)]*[\\))]";
Pattern pattern = Pattern.compile(regexp, Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(inputString);
boolean result = matcher.matches();
[\\((] is an incorrect way of specifying that you need ( twice. Inside a character class [], a character counts only once no matter how many times you repeat it. Since it's the same character repeating, you don't need a character class at all, just the escaped character \\( with a quantifier {2} that says how many times it should repeat. So you need \\({2} at the start and \\){2} at the end.
Another problem with your use of [] is that you used it to denote a group of double pairs that repeats (using *). Always use () to group part of a match; [] denotes a character class only. You did group the doubles themselves and their pairs correctly, so this looks like a slip.
Next, you've forgotten to match the commas separating the double pairs. I've included that as (,\\s*) in my regex. The hyphen - (the negative sign here) doesn't need to be escaped, since it's not inside a character class [] and so the regex parser knows you're not using it to specify a character range.
The corrected regex is (indented for clarity):
polygon\({2}\s*(
(-?\d+(\.\d+)?)\s*(-?\d+(\.\d+)?)(,\s*)
)*(-?\d+(\.\d+)?)\s*(-?\d+(\.\d+)?)
\s*\){2}
m|Polygon\(\(((\s*-?\d+\.\d+\s*){2},)*(\s*-?\d+\.\d+\s*){2}\)\)|i
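For use from Java, the corrected regex has to be written as a string literal with doubled backslashes. The sketch below does that with hypothetical names (PolygonValidator, NUMBER, PAIR, isValid) and makes one deliberate change, flagged in a comment: it requires at least one whitespace character between X and Y, whereas the regex above allows none.

```java
import java.util.regex.Pattern;

public class PolygonValidator {
    // One signed decimal number, e.g. -74.0075783459999 or 40
    private static final String NUMBER = "-?\\d+(?:\\.\\d+)?";

    // Change from the regex above (assumption of this sketch): \s+ here,
    // so a single number such as 1234 cannot be read as the pair 123 and 4.
    private static final String PAIR = NUMBER + "\\s+" + NUMBER;

    private static final Pattern POLYGON = Pattern.compile(
            "polygon\\({2}\\s*(?:" + PAIR + "\\s*,\\s*)*" + PAIR + "\\s*\\){2}",
            Pattern.CASE_INSENSITIVE);

    static boolean isValid(String input) {
        // matches() requires the whole input to fit the pattern.
        return POLYGON.matcher(input).matches();
    }

    public static void main(String[] args) {
        String input = "POLYGON((-74.0075783459999 40.710775696, "
                + "-74.007375926 40.710655064, -74.0072836009999 40.710720973, "
                + "-74.0075783459999 40.710775696))";
        System.out.println(isValid(input));                // true
        System.out.println(isValid("polygon((1 2,))"));    // false: dangling comma
        System.out.println(isValid("polygon((1234))"));    // false: only one number
    }
}
```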

how to check all character in a string is lowercase using java

I tried it like this, but it outputs false. Please help me.
String inputString1 = "dfgh";// but not dFgH
String regex = "[a-z]";
boolean result;
Pattern pattern1 = Pattern.compile(regex);
Matcher matcher1 = pattern1.matcher(inputString1);
result = matcher1.matches();
System.out.println(result);
Your solution is nearly correct. The regex must be "[a-z]+", with a quantifier, so that you match one or more lowercase characters and not just a single one. Note that the uber-correct solution, which matches any lowercase char in Unicode and not only those from the English alphabet, is this:
"\\p{javaLowerCase}+"
Additionally note that you can achieve this with much less code:
System.out.println(input.matches("\\p{javaLowerCase}*"));
(here I am alternatively using the * quantifier, which means zero or more. Choose according to the desired semantics.)
You are almost there, except that you are only checking for one character.
String regex = "[a-z]+";
With matches(), the regex above checks that the whole input string consists of one or more characters from a to z.
Read about how to use quantifiers in regex.
Use this pattern:
String regex = "[a-z]*";
Your current pattern only works if the tested string is exactly one char long.
Note that it does exactly what it looks like: it doesn't really test whether the string is in lowercase, but whether it contains no chars outside [a-z]. This means it returns false for lowercase strings like "àbcd". A correct solution in a Unicode world would be to use the Character.isLowerCase() function and loop over the string.
It should be
^[a-z]+$
^ is the start of the string
$ is the end of the string
[a-z]+ matches one or more lowercase characters
You need to use quantifiers such as *, which matches zero or more, or +, which matches one or more, occurrences of the preceding character or range.
Why bother with a regular expression?
String inputString1 = "dfgh";// but not dFgH
boolean result = inputString1.toLowerCase().equals( inputString1 );
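The approaches suggested in these answers do not all agree on edge cases. The sketch below puts them side by side; the class and method names are hypothetical, and treating the empty string as "not lowercase" in the loop version is an assumption of this sketch.

```java
public class LowercaseCheck {
    // ASCII only: the whole string must be one or more of a-z.
    static boolean asciiRegex(String s) {
        return s.matches("[a-z]+");
    }

    // Unicode-aware regex: \p{javaLowerCase} follows Character.isLowerCase.
    static boolean unicodeRegex(String s) {
        return s.matches("\\p{javaLowerCase}+");
    }

    // Loop-style check over code points; empty input is rejected here
    // (an assumption, adjust to taste).
    static boolean unicodeLoop(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isLowerCase);
    }

    // The no-regex comparison: true whenever lowercasing changes nothing,
    // which includes strings with digits or no letters at all.
    static boolean unchangedByToLowerCase(String s) {
        return s.toLowerCase().equals(s);
    }

    public static void main(String[] args) {
        System.out.println(asciiRegex("dfgh"));                // true
        System.out.println(asciiRegex("dFgH"));                // false
        System.out.println(asciiRegex("\u00e0bcd"));           // false: \u00e0 is outside a-z
        System.out.println(unicodeRegex("\u00e0bcd"));         // true
        System.out.println(unicodeLoop("abc1"));               // false: 1 is not a lowercase letter
        System.out.println(unchangedByToLowerCase("abc1"));    // true
    }
}
```

Which version fits depends on whether digits, punctuation and the empty string should count as lowercase for your purposes.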
