finding out if the characters of a string exist in another string with the same order or not using regex in java - java

I want to write a program in Java using regex that reads two strings from the input (the first one is shorter than the second one). If the characters of the first string appear in the second string in the same order, though not necessarily next to each other (so it is not a substring), it outputs "true"; if not, it outputs "false". Here's an example:
example1:
input:
phantom
pphvnbajknzxcvbnatopopoim
output:
true
In the above example we can see the word "phantom" in the second string (the characters are in the same order).
example2:
input:
apple
fgayiypvbnltsrgte
output:
false
As you can see, "apple" does not exist in the second string under the conditions I mentioned earlier, so it outputs false.
import java.util.Scanner;
public class Main {
public static void main(String[] args) {
Scanner input = new Scanner(System.in);
String word1 = input.next();
String word2 = input.next();
String pattern = "";
int n = word1.length();
char[] word1CharArr = word1.toCharArray();
for ( int i = 0 ; i < n ; i++) {
pattern += "[:alnum:]" +word1CharArr[i]+"[:alnum:]";
// pattern += ".*\\b|\\B" +word1CharArr[i]+"\\b|\\B";
}
pattern = "^" + pattern + "$";
// pattern = "(?s)" + pattern + ".*";
// System.out.println(pattern);
System.out.println(word2.matches(pattern));
}
}
Here is what I did: I broke my first string into its characters and want to put regex before and after each character to build the pattern. I have searched a lot about regex and how to use it, but I still have a problem here. The part I have commented out comes from one of my searches, but it did not work.
I emphasize that I want to solve it with regex, not any other way.

[:alnum:] isn't a thing in Java regex: it is read as a character class containing the characters :, a, l, n, u and m. Even if it were a thing, it would match exactly one character, not 'any number, from 0 to infinitely many of them'.
You just want phantom with .* in the middle: ^.*p.*h.*a.*n.*t.*o.*m.*$ is all you need. After all, phantom 'fits', and so does paahaanaataaoaamaa. This:
String pattern = word1.chars()
.mapToObj(c -> ".*" + (char) c)
.collect(Collectors.joining()) + ".*";
should get the job done.
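As a hedged variation on the snippet above (not part of the original answer): each character could be passed through Pattern.quote so that input such as a.c or a*b is matched literally rather than as regex syntax. The class name SubsequenceRegex and the method name containsInOrder are made up for illustration.

```java
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class SubsequenceRegex {
    // Hypothetical helper: true if the characters of word occur in text in the same order.
    static boolean containsInOrder(String word, String text) {
        String pattern = word.codePoints()
                // quote each character so regex metacharacters are matched literally
                .mapToObj(cp -> Pattern.quote(new String(Character.toChars(cp))))
                // ".*" before, between and after the quoted characters
                .collect(Collectors.joining(".*", ".*", ".*"));
        // DOTALL lets ".*" skip over line breaks as well
        return Pattern.compile(pattern, Pattern.DOTALL).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsInOrder("phantom", "pphvnbajknzxcvbnatopopoim")); // true
        System.out.println(containsInOrder("apple", "fgayiypvbnltsrgte"));           // false
    }
}
```

Since matches() already requires the whole input to match, the ^ and $ anchors are not needed in this form.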

Related

Java regex: Replace all characters with `+` except instances of a given string

I have the following problem which states
Replace all characters in a string with + symbol except instances of the given string in the method
so for example if the string given was abc123efg and they want me to replace every character except every instance of 123 then it would become +++123+++.
I figured a regular expression is probably the best for this and I came up with this.
str.replaceAll("[^str]","+")
where str is a variable, but it's not letting me use the method without putting it in quotation marks. If I just want to use the variable string str, how can I do that? I ran it with the string typed in manually and it worked, but can I just pass in a variable?
As of right now I believe it's looking for the string "str" and not the variable string.
Here is the output; it's right for many cases except for two :(
List of open test cases:
plusOut("12xy34", "xy") → "++xy++"
plusOut("12xy34", "1") → "1+++++"
plusOut("12xy34xyabcxy", "xy") → "++xy++xy+++xy"
plusOut("abXYabcXYZ", "ab") → "ab++ab++++"
plusOut("abXYabcXYZ", "abc") → "++++abc+++"
plusOut("abXYabcXYZ", "XY") → "++XY+++XY+"
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
plusOut("--++ab", "++") → "++++++"
plusOut("aaxxxxbb", "xx") → "++xxxx++"
plusOut("123123", "3") → "++3++3"
Looks like this is the plusOut problem on CodingBat.
I had 3 solutions to this problem, and wrote a new streaming solution just for fun.
Solution 1: Loop and check
Create a StringBuilder out of the input string, and check for the word at every position. Replace the character if the word doesn't match there, and skip the length of the word if it does.
public String plusOut(String str, String word) {
StringBuilder out = new StringBuilder(str);
for (int i = 0; i < out.length(); ) {
if (!str.startsWith(word, i))
out.setCharAt(i++, '+');
else
i += word.length();
}
return out.toString();
}
This is probably the expected answer for a beginner programmer, though there is an assumption that the string doesn't contain any astral plane character, which would be represented by 2 char instead of 1.
Solution 2: Replace the word with a marker, replace the rest, then restore the word
public String plusOut(String str, String word) {
return str.replaceAll(java.util.regex.Pattern.quote(word), "#").replaceAll("[^#]", "+").replaceAll("#", word);
}
Not a proper solution, since it assumes that a certain character or sequence of characters doesn't appear in the string.
Note the use of Pattern.quote to prevent the word being interpreted as regex syntax by replaceAll method.
Solution 3: Regex with \G
public String plusOut(String str, String word) {
word = java.util.regex.Pattern.quote(word);
return str.replaceAll("\\G((?:" + word + ")*+).", "$1+");
}
Construct regex \G((?:word)*+)., which does more or less what solution 1 is doing:
\G makes sure the match starts from where the previous match leaves off
((?:word)*+) picks out 0 or more instances of word, if any, so that we can keep them in the replacement with $1. The key here is the possessive quantifier *+, which forces the regex to keep any instance of the word it finds. Otherwise, the regex will not work correctly when the word appears at the end of the string, as the regex backtracks to match .
. will not be part of any word, since the previous part already picks out all consecutive appearances of word and disallows backtracking. We will replace this with +
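To make the role of the possessive quantifier concrete, here is a small sketch that runs the regex from Solution 3 next to a variant using a plain greedy *. The class name and the plusOutGreedy method are illustrative additions, not part of the original answer.

```java
import java.util.regex.Pattern;

class PlusOutPossessive {
    // Solution 3 as given above: possessive quantifier *+
    static String plusOut(String str, String word) {
        String w = Pattern.quote(word);
        return str.replaceAll("\\G((?:" + w + ")*+).", "$1+");
    }

    // Same regex with a greedy *, only to show what goes wrong at the end of the string
    static String plusOutGreedy(String str, String word) {
        String w = Pattern.quote(word);
        return str.replaceAll("\\G((?:" + w + ")*).", "$1+");
    }

    public static void main(String[] args) {
        System.out.println(plusOut("abXYxyzXYZ", "XYZ"));       // +++++++XYZ
        System.out.println(plusOutGreedy("abXYxyzXYZ", "XYZ")); // ++++++++++
    }
}
```

In the greedy variant the trailing XYZ is first consumed by the group, the final . finds nothing to match, and the engine backtracks by giving the word back, so X, Y and Z are each replaced individually.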
Solution 4: Streaming
public String plusOut(String str, String word) {
return String.join(word,
Arrays.stream(str.split(java.util.regex.Pattern.quote(word), -1))
.map((String s) -> s.replaceAll("(?s:.)", "+"))
.collect(Collectors.toList()));
}
The idea is to split the string by word, do the replacement on the rest, and join them back with word using String.join method.
Same as above, we need Pattern.quote to avoid split interpreting the word as regex. Since split by default removes empty string at the end of the array, we need to use -1 in the second parameter to make split leave those empty strings alone.
Then we create a stream out of the array and replace each piece with a string of +. In Java 11, we can use s -> "+".repeat(s.length()) instead.
The rest is just converting the Stream to an Iterable (List in this case) and joining them for the result.
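A sketch of the Java 11 variant mentioned above, assuming String.repeat is available (Java 11 or later). It also joins directly with Collectors.joining instead of going through a List; the class name is made up. Note that s.length() counts UTF-16 chars, so unlike the (?s:.) replacement it would emit two + for a character outside the Basic Multilingual Plane.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class PlusOutStream {
    static String plusOut(String str, String word) {
        // limit -1 keeps trailing empty strings, as explained above
        return Arrays.stream(str.split(Pattern.quote(word), -1))
                // each piece between occurrences of word becomes a run of '+'
                .map(s -> "+".repeat(s.length()))
                // put the word back between the pieces
                .collect(Collectors.joining(word));
    }

    public static void main(String[] args) {
        System.out.println(plusOut("12xy34", "xy"));   // ++xy++
        System.out.println(plusOut("123123", "3"));    // ++3++3
    }
}
```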
This is a bit trickier than you might initially think because you don't just need to match characters, but the absence of specific phrase - a negated character set is not enough. If the string is 123, you would need:
(?<=^|123)(?!123).*?(?=123|$)
https://regex101.com/r/EZWMqM/1/
That is - lookbehind for the start of the string or "123", make sure the current position is not followed by 123, then lazy-repeat any character until lookahead matches "123" or the end of the string. This will match all characters which are not in a "123" substring. Then, you need to replace each character with a +, after which you can use appendReplacement and a StringBuffer to create the result string:
String inputPhrase = "123";
String inputStr = "abc123efg123123hij";
StringBuffer resultString = new StringBuffer();
Pattern regex = Pattern.compile("(?<=^|" + inputPhrase + ")(?!" + inputPhrase + ").*?(?=" + inputPhrase + "|$)");
Matcher m = regex.matcher(inputStr);
while (m.find()) {
String replacement = m.group(0).replaceAll(".", "+");
m.appendReplacement(resultString, replacement);
}
m.appendTail(resultString);
System.out.println(resultString.toString());
Output:
+++123+++123123+++
Note that if inputPhrase can contain characters with a special meaning in a regular expression, you'll have to escape them first (for example with Pattern.quote) before concatenating them into the pattern.
You can do it in one line:
input = input.replaceAll("((?:" + str + ")+)?(?!" + str + ").((?:" + str + ")+)?", "$1+$2");
This optionally captures "123" on either side of each character and puts them back (a blank if there's no "123").
So instead of coming up with a regular expression that matches the absence of a string, we might as well just match the selected phrase and append one + for every skipped character.
StringBuilder sb = new StringBuilder();
Matcher m = Pattern.compile(Pattern.quote(str)).matcher(input);
while (m.find()) {
// sb is as long as the part of input handled so far, so pad from there up to this match
for (int i = sb.length(); i < m.start(); i++) sb.append('+');
sb.append(str);
}
// pad whatever follows the last match
int remaining = input.length() - sb.length();
for (int i = 0; i < remaining; i++) {
sb.append('+');
}
Absolutely just for the fun of it, a solution using CharBuffer (unexpectedly it took a lot more than I initially hoped for):
private static String plusOutCharBuffer(String input, String match) {
int size = match.length();
CharBuffer cb = CharBuffer.wrap(input.toCharArray());
CharBuffer word = CharBuffer.wrap(match);
int x = 0;
for (; cb.remaining() > 0;) {
if (!cb.subSequence(0, size < cb.remaining() ? size : cb.remaining()).equals(word)) {
cb.put(x, '+');
cb.clear().position(++x);
} else {
cb.clear().position(x = x + size);
}
}
return cb.clear().toString();
}
To make this work you need a beast of a pattern. Let's say you are operating on the following test case as an example:
plusOut("abXYxyzXYZ", "XYZ") → "+++++++XYZ"
What you need to do is build a series of clauses in your pattern to match a single character at a time:
Any character that is NOT "X", "Y" or "Z" -- [^XYZ]
Any "X" not followed by "YZ" -- X(?!YZ)
Any "Y" not preceded by "X" -- (?<!X)Y
Any "Y" not followed by "Z" -- Y(?!Z)
Any "Z" not preceded by "XY" -- (?<!XY)Z
An example of this replacement can be found here: https://regex101.com/r/jK5wU3/4
Here is an example of how this might work (most certainly not optimized, but it works):
import java.util.regex.Pattern;
public class Test {
public static void plusOut(String text, String exclude) {
StringBuilder pattern = new StringBuilder("");
for (int i=0; i<exclude.length(); i++) {
Character target = exclude.charAt(i);
String prefix = (i > 0) ? exclude.substring(0, i) : "";
String postfix = (i < exclude.length() - 1) ? exclude.substring(i+1) : "";
// add the look-behind (?<!X)Y
if (!prefix.isEmpty()) {
pattern.append("(?<!").append(Pattern.quote(prefix)).append(")")
.append(Pattern.quote(target.toString())).append("|");
}
// add the look-ahead X(?!YZ)
if (!postfix.isEmpty()) {
pattern.append(Pattern.quote(target.toString()))
.append("(?!").append(Pattern.quote(postfix)).append(")|");
}
}
// add in the other character exclusion
pattern.append("[^" + Pattern.quote(exclude) + "]");
System.out.println(text.replaceAll(pattern.toString(), "+"));
}
public static void main(String [] args) {
plusOut("12xy34", "xy");
plusOut("12xy34", "1");
plusOut("12xy34xyabcxy", "xy");
plusOut("abXYabcXYZ", "ab");
plusOut("abXYabcXYZ", "abc");
plusOut("abXYabcXYZ", "XY");
plusOut("abXYxyzXYZ", "XYZ");
plusOut("--++ab", "++");
plusOut("aaxxxxbb", "xx");
plusOut("123123", "3");
}
}
UPDATE: Even this doesn't quite work because it can't deal with exclusions that are just repeated characters, like "xx". Regular expressions are most definitely not the right tool for this, but I thought it might be possible. After poking around, I'm not so sure a pattern even exists that might make this work.
The problem in your solution is that str.replaceAll("[^str]","+") uses a negated character set: it excludes each individual character of str from the replacement rather than the sequence as a whole, so it will not solve your problem.
Example: str.replaceAll("[^XYZ]","+") leaves every X, Y and Z untouched wherever they occur, so for "abXYxyzXYZ" you will get "++XY+++XYZ".
What you actually need to exclude in str.replaceAll is a sequence of characters.
You can group the sequence as (XYZ) and use a negative lookahead to match a string which does not contain that sequence: ^((?!XYZ).)*$
Check this solution for more info about this problem, but be aware that it may be complicated to find a regular expression that does this directly.
I have found two simple solutions for this problem :
Solution 1:
You can implement a method to replace all characters with '+' except the instance of given string:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
for(int i = 0; i < str.length(); i++){
// exclude any instance string of exWord from replacing process in str
if(str.substring(i, str.length()).indexOf(exWord) + i == i){
i = i + exWord.length()-1;
}
else{
str = str.substring(0,i) + "+" + str.substring(i+1);//replace each character with '+' symbol
}
}
Note: the condition str.substring(i, str.length()).indexOf(exWord) + i == i is true exactly when exWord starts at position i, so every instance of exWord is skipped by the replacing process.
Output:
+++++++XYZ
Solution 2:
You can try this approach using the replaceAll method; it doesn't need any complex regular expression:
String exWord = "XYZ";
String str = "abXYxyzXYZ";
str = str.replaceAll(exWord,"*"); // replace instance string with * symbol
str = str.replaceAll("[^*]","+"); // replace all characters with + symbol except *
str = str.replaceAll("\\*",exWord); // replace * symbol with instance string
Note : This solution will work only if your input string str doesn't contain any * symbol.
Also, you should escape any characters with a special meaning in a regular expression that occur in exWord, for example when exWord = "++".

Replace word with special characters from string in Java

I am writing a method which should replace all words that match ones from a list with '****' characters. So far I have code which works, but all special characters are ignored.
I have tried "\\W" in my expression, but it looks like I didn't use it well, so I could use some help.
Here's the code I have so far:
for(int i = 0; i < badWords.size(); i++) {
if (StringUtils.containsIgnoreCase(stringToCheck, badWords.get(i))) {
stringToCheck = stringToCheck.replaceAll("(?i)\\b" + badWords.get(i) + "\\b", "****");
}
}
E.g. I have list of words ['bad', '#$$'].
If I have a string: "This is bad string with #$$" I am expecting this method to return "This is **** string with ****"
Note that the method should be case-insensitive, e.g. TesT and test should be handled the same.
I'm not sure why you use the StringUtils you can just directly replace words that match the bad words. This code works for me:
public static void main(String[] args) {
ArrayList<String> badWords = new ArrayList<String>();
badWords.add("test");
badWords.add("BadTest");
badWords.add("\\$\\$");
String test = "This is a TeSt and a $$ with Badtest.";
for(int i = 0; i < badWords.size(); i++) {
test = test.replaceAll("(?i)" + badWords.get(i), "****");
}
test = test.replaceAll("\\w*\\*{4}", "****");
System.out.println(test);
}
Output:
This is a **** and a **** with ****.
The problem is that these special characters e.g. $ are regex control characters and not literal characters. You'll need to escape any occurrence of the following characters in the bad word using two backslashes:
{}()\[].+*?^$|
My guess is that your list of bad words contains special characters that have particular meanings when interpreted in a regular expression (which is what the replaceAll method does). $, for example, typically matches the end of the string/line. So I'd recommend a combination of things:
Don't use containsIgnoreCase to identify whether a replacement needs to be done. Just let the replaceAll run each time - if there is no match against the bad word list, nothing will be done to the string.
Characters like $ that have special meanings in regular expressions should be escaped when they are added to the bad word list. For example, badWords.add("#\\$\\$");
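As an alternative to escaping each entry by hand, Pattern.quote can do the escaping when the pattern is built. The sketch below makes one assumption beyond the answers here: since \b does not behave as a boundary next to characters like # or $, it uses the lookarounds (?<!\w) and (?!\w) to approximate "whole word". The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class BadWordFilter {
    static String censor(String text, List<String> badWords) {
        for (String word : badWords) {
            // (?i) ignores case; Pattern.quote makes $, +, * etc. literal;
            // the lookarounds require that no word character touches the match
            String regex = "(?i)(?<!\\w)" + Pattern.quote(word) + "(?!\\w)";
            text = text.replaceAll(regex, "****");
        }
        return text;
    }

    public static void main(String[] args) {
        List<String> badWords = Arrays.asList("bad", "#$$");
        // prints: This is **** string with ****
        System.out.println(censor("This is bad string with #$$", badWords));
    }
}
```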
Try something like this:
String stringToCheck = "This is b!d string with #$$";
List<String> badWords = asList("b!d","#$$");
for(int i = 0; i < badWords.size(); i++) {
if (StringUtils.containsIgnoreCase(stringToCheck,badWords.get(i))) {
stringToCheck = stringToCheck.replaceAll("["+badWords.get(i)+"]+","****");
}
}
System.out.println(stringToCheck);
Another solution: bad words matched with word boundaries (and case-insensitive).
public static void main(String[] args) {
Pattern badWords = Pattern.compile("\\b(a|b|ĉĉĉ|dddd)\\b",
Pattern.UNICODE_CASE | Pattern.CASE_INSENSITIVE);
String text = "adfsa a dfs bb addfdsaf ĉĉĉ adsfs dddd asdfaf a";
Matcher m = badWords.matcher(text);
StringBuffer sb = new StringBuffer(text.length());
while (m.find()) {
m.appendReplacement(sb, stars(m.group(1)));
}
m.appendTail(sb);
String cleanText = sb.toString();
System.out.println(text);
System.out.println(cleanText);
}
private static String stars(String s) {
return s.replaceAll("(?su).", "*");
/*
int cpLength = s.codePointCount(0, s.length());
final String stars = "******************************";
return cpLength >= stars.length() ? stars : stars.substring(0, cpLength);
*/
}
And then (in the comment) the stars with the correct count: one star per Unicode code point, even when that code point is stored as a surrogate pair (two UTF-16 chars).

Removing duplicate same characters in a row

I am trying to create a method which will either remove all duplicates from a string or only keep the same 2 characters in a row based on a parameter.
For example:
helllllllo -> helo
or
helllllllo -> hello - This keeps double letters
Currently I remove duplicates by doing:
private String removeDuplicates(String word) {
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < word.length(); i++) {
char letter = word.charAt(i);
if (buffer.length() == 0 || letter != buffer.charAt(buffer.length() - 1)) {
buffer.append(letter);
}
}
return buffer.toString();
}
If I want to keep double letters I was thinking of having a method like private String removeDuplicates(String word, boolean doubleLetter)
When doubleLetter is true it will return hello not helo
I'm not sure of the most efficient way to do this without duplicating a lot of code.
why not just use a regex?
public class RemoveDuplicates {
public static void main(String[] args) {
System.out.println(new RemoveDuplicates().result("hellllo", false)); //helo
System.out.println(new RemoveDuplicates().result("hellllo", true)); //hello
}
public String result(String input, boolean doubleLetter){
String pattern = null;
if(doubleLetter) pattern = "(.)(?=\\1{2})";
else pattern = "(.)(?=\\1)";
return input.replaceAll(pattern, "");
}
}
(.) --> matches any character and puts in group 1.
?= --> this is called a positive lookahead.
?=\\1 --> positive lookahead for the first group
So overall, this regex looks for any character that is followed (positive lookahead) by itself. For example aa or bb, etc. It is important to note that only the first character is part of the match actually, so in the word 'hello', only the first l is matched (the part (?=\1) is NOT PART of the match). So the first l is replaced by an empty String and we are left with helo, which does not match the regex
The second pattern is the same thing, but this time we look ahead for TWO occurrences of the first group, for example helllo. On the other hand 'hello' will not be matched.
Look here for a lot more: Regex
P.S. Feel free to accept the answer if it helped.
try
String s = "helllllllo";
System.out.println(s.replaceAll("(\\w)\\1+", "$1"));
output
helo
Taking this previous SO example as a starting point, I came up with this:
String str1= "Heelllllllllllooooooooooo";
String removedRepeated = str1.replaceAll("(\\w)\\1+", "$1");
System.out.println(removedRepeated);
String keepDouble = str1.replaceAll("(\\w)\\1{2,}", "$1");
System.out.println(keepDouble);
It yields:
Helo
Heelo
What it does:
(\\w)\\1+ matches any word character (letter, digit or underscore) and places it in a regex capture group. This group is then referenced through \\1+, meaning that the pattern matches one or more repetitions of that character.
(\\w)\\1{2,} is the same as above, the only difference being that it only looks for characters which occur three or more times in a row. This leaves double characters untouched.
EDIT:
Re-read the question and it seems that you want to replace multiple characters by doubles. To do that, simply use this line:
String keepDouble = str1.replaceAll("(\\w)\\1+", "$1$1");
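Putting the two replacements together, the method signature from the question could be sketched as follows. This version uses (.) rather than (\\w) so that runs of any character are collapsed; that choice, and the class name, are assumptions made for illustration.

```java
class RepeatCollapser {
    // doubleLetter == false: every run collapses to one character
    // doubleLetter == true:  every run of two or more collapses to exactly two
    static String removeDuplicates(String word, boolean doubleLetter) {
        return word.replaceAll("(.)\\1+", doubleLetter ? "$1$1" : "$1");
    }

    public static void main(String[] args) {
        System.out.println(removeDuplicates("helllllllo", false)); // helo
        System.out.println(removeDuplicates("helllllllo", true));  // hello
    }
}
```

Only the replacement string depends on the flag, so no code has to be duplicated.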
Try this; it should be the most efficient way [edited after comment]:
public static String removeDuplicates(String str) {
int checker = 0;
StringBuffer buffer = new StringBuffer();
for (int i = 0; i < str.length(); ++i) {
int val = str.charAt(i) - 'a';
if ((checker & (1 << val)) == 0)
buffer.append(str.charAt(i));
checker |= (1 << val);
}
return buffer.toString();
}
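I am using bits to identify uniqueness.
EDIT:
The whole logic is that once a character has been seen, its corresponding bit is set; the next time that character comes up it is not added to the StringBuffer, because the corresponding bit is already set. Note that this relies on the input consisting only of the letters a to z (one bit of the int per letter), and that it drops every repeated letter, not only consecutive ones.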
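For comparison, a plain loop that only looks at consecutive repeats and works for any characters might look like the sketch below. The maxRun parameter (1 for "helo", 2 for "hello") and the names are hypothetical and stand in for the boolean from the question.

```java
class RunLimiter {
    // Keeps at most maxRun copies of each run of identical consecutive characters.
    static String collapseRuns(String word, int maxRun) {
        StringBuilder out = new StringBuilder();
        int run = 0; // length of the current run, including the current character
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            run = (i > 0 && c == word.charAt(i - 1)) ? run + 1 : 1;
            if (run <= maxRun) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapseRuns("helllllllo", 1)); // helo
        System.out.println(collapseRuns("helllllllo", 2)); // hello
    }
}
```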

Iterating through String with .find() in Java regex

I'm currently trying to solve a problem from codingbat.com with regular expressions.
I'm new to this, so step-by-step explanations would be appreciated. I could solve this with String methods relatively easily, but I am trying to use regular expressions.
Here is the prompt:
Given a string and a non-empty word string, return a string made of each char just before and just after every appearance of the word in the string. Ignore cases where there is no char before or after the word, and a char may be included twice if it is between two words.
wordEnds("abcXY123XYijk", "XY") → "c13i"
wordEnds("XY123XY", "XY") → "13"
wordEnds("XY1XY", "XY") → "11"
etc
My code thus far:
String regex = ".?" + word+ ".?";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(str);
String newStr = "";
while(m.find())
newStr += m.group().replace(word, "");
return newStr;
The problem is that when there are multiple instances of word in a row, the program misses the character preceding the word because m.find() progresses beyond it.
For example: wordEnds("abc1xyz1i1j", "1") should return "cxziij", but my method returns "cxzij", not repeating the "i"
I would appreciate a non-messy solution with an explanation I can apply to other general regex problems.
This is a one-liner solution:
String wordEnds = input.replaceAll(".*?(.)" + word + "(?:(?=(.)" + word + ")|(.).*?(?=$|." + word + "))", "$1$2$3");
This matches your edge case as a look ahead within a non-capturing group, then matches the usual (consuming) case.
Note that your requirements don't require iteration, only your question title assumes it's necessary, which it isn't.
Note also that to be absolutely safe, you should escape all characters in word in case any of them are special "regex" characters, so if you can't guarantee that, you need to use Pattern.quote(word) instead of word.
Here's a test of the usual case and the edge case, showing it works:
public static String wordEnds(String input, String word) {
word = Pattern.quote(word); // add this line to be 100% safe
return input.replaceAll(".*?(.)" + word + "(?:(?=(.)" + word + ")|(.).*?(?=$|." + word + "))", "$1$2$3");
}
public static void main(String[] args) {
System.out.println(wordEnds("abcXY123XYijk", "XY"));
System.out.println(wordEnds("abc1xyz1i1j", "1"));
}
Output:
c13i
cxziij
Use positive lookbehind and positive lookahead, which are zero-width assertions:
(?<=(.)|^)1(?=(.)|$)
(?<=(.)|^) looks for a character just before 1 and captures it in group 1. This is a zero-width assertion that doesn't move forward to match; it is just a test, and thus allows us to capture the value.
1 matches 1; you can replace it with any word.
(?=(.)|$) looks for a character after 1 and captures it in group 2.
$1 and $2 contain your values. Go on finding till the end.
So this should be like
String s1 = "abcXY123XYiXYjk";
String s2 = java.util.regex.Pattern.quote("XY");
String s3 = "";
String r = "(?<=(.)|^)"+s2+"(?=(.)|$)";
Pattern p = Pattern.compile(r);
Matcher m = p.matcher(s1);
while(m.find()) s3 += m.group(1)+m.group(2);
//s3 now contains c13iij
works here
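One detail worth sketching: when the word sits at the very start or end of the string, the ^ or $ branch matches instead and the corresponding group is null, so concatenating the groups directly would append the text "null". A hedged variant that skips unset groups (class name made up for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WordEndsLookaround {
    static String wordEnds(String str, String word) {
        Pattern p = Pattern.compile("(?<=(.)|^)" + Pattern.quote(word) + "(?=(.)|$)");
        Matcher m = p.matcher(str);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            // group 1 is null when the word starts the string, group 2 when it ends it
            if (m.group(1) != null) out.append(m.group(1));
            if (m.group(2) != null) out.append(m.group(2));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(wordEnds("abcXY123XYijk", "XY")); // c13i
        System.out.println(wordEnds("XY1XY", "XY"));         // 11
    }
}
```

Because the lookarounds consume nothing, the character between two adjacent words is still available to both matches.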
Use regex as follows (a is the input string, b is the word, c collects the result):
String c = "";
Matcher m = Pattern.compile("(.|)" + Pattern.quote(b) + "(?=(.?))").matcher(a);
while (m.find()) c += m.group(1) + m.group(2);
Check this demo.

Regex to replace part of the string with spaces

It seems simple, but I can't get it to work.
I have a string which looks like 'NNDDDDDAAAA', where 'N' is a non-digit, 'D' is a digit, and 'A' is anything. I need to replace each A with a space character. The number of 'N's, 'D's, and 'A's in an input string is always different.
I know how to do it with two expressions. I can split the string into two, and then replace everything in the second group with spaces. Like this:
Pattern pattern = Pattern.compile("(\\D+\\d+)(.+)");
Matcher matcher = pattern.matcher(input);
if (matcher.matches()) {
return matcher.group(1) + matcher.group(2).replaceAll(".", " ");
}
But I was wondering if it is possible with a single regular expression.
Given your description, I'm assuming that after the NNDDDDD portion, the first A will actually be a N rather than an A, since otherwise there's no solid boundary between the DDDDD and AAAA portions.
So, your string actually looks like NNDDDDDNAAA, and you want to replace the NAAA portion with spaces. Given this, the regex can be rewritten as such: (\\D+\\d+)(\\D.+)
Lookbehind in Java requires a pattern whose maximum length is bounded, so you can't use the + or * quantifiers there. You can instead use curly braces and specify a maximum length. For instance, you can use {1,9} in place of each +, and it will match between 1 and 9 characters: (?<=\\D{1,9}\\d{1,9})(\\D.+)
The only problem here is you're matching the NAAA sequence as a single match, so using "NNNDDDDNAAA".replaceAll("(?<=\\D{1,9}\\d{1,9})(\\D.+)", " ") will result in replacing the entire NAAA sequence with a single space, rather than multiple spaces.
You could take the beginning delimiter of the match, and the string length, and use that to append the correct number of spaces, but I don't see the point. I think you're better off with your original solution; it's simple and easy to follow.
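For what it's worth, a single replaceAll does seem possible if \G (the end of the previous match) is used instead of lookbehind. This is only a sketch and not part of the answer above: it assumes the A part starts at the first non-digit after the digits, and the class and method names are made up.

```java
class TailToSpaces {
    static String blankTail(String input) {
        // First match: the N and D prefix (kept via $1) plus the first A character.
        // Later matches: \G continues right where the previous match ended; (?!^)
        // stops \G from matching at the very start of the input.
        // \d++ is possessive so that a digit is never handed over to the final "."
        return input.replaceAll("(^\\D+\\d++|\\G(?!^)).", "$1 ");
    }

    public static void main(String[] args) {
        System.out.println("'" + blankTail("ab12345x9y!") + "'"); // 'ab12345    '
        System.out.println("'" + blankTail("ab123") + "'");       // 'ab123'
    }
}
```

Whether this is easier to follow than the two-step version is debatable; the two-step version states the intent more directly.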
If you're looking for a little extra speed, you could compile your Pattern outside the function, and use StringBuilder or StringBuffer to create your output. If you're building a large String out of all these NNDDDDDAAAAA elements, work entirely in StringBuilder until you're done appending.
class Test {
public static Pattern p = Pattern.compile("(\\D+\\d+)(\\D.+)");
public static StringBuffer replace( String input ) {
StringBuffer output = new StringBuffer();
Matcher m = Test.p.matcher(input);
if( m.matches() )
output.append( m.group(1) ).append( m.group(2).replaceAll("."," ") );
return output;
}
public static void main( String[] args ) {
String input = args[0];
long startTime;
StringBuffer tests = new StringBuffer();
startTime = System.currentTimeMillis();
for( int i = 0; i < 50; i++)
{
tests.append( "Input -> Output: '" );
tests.append( input );
tests.append( "' -> '" );
tests.append( Test.replace( input ) );
tests.append( "'\n" );
}
System.out.println( tests.toString() );
System.out.println( "\n" + (System.currentTimeMillis()-startTime));
}
}
Update:
I wrote a quick iterative solution, and ran some random data through both. The iterative solution is around 4-5x faster.
public static StringBuffer replace( String input )
{
StringBuffer output = new StringBuffer();
boolean second = false, third = false;
for( int i = 0; i < input.length(); i++ )
{
if( !second && Character.isDigit(input.charAt(i)) )
second = true;
if( second && !third && Character.isLetter(input.charAt(i)) )
third = true;
if( second && third )
output.append( ' ' );
else
output.append( input.charAt(i) );
}
return output;
}
What do you mean by non-digit vs. anything?
[^a-zA-Z0-9]
matches everything that is not an ASCII letter or digit.
You would want to replace anything that gets matched by the above regex with a space.
Is this what you were talking about?
You want to use positive lookbehind to match the N's and D's, then use a normal match for the A's.
Not sure of the positive lookbehind syntax in Java, but see some article on Java regex with lookbehind.
I know you asked for a regex, but why do you even need a regex for this? How about:
StringBuilder sb = new StringBuilder(inputString);
for (int i = sb.length() - 1; i >= 0; i--) {
if (Character.isDigit(sb.charAt(i)))
break;
sb.setCharAt(i, ' ');
}
String output = sb.toString();
You might find this post interesting. Of course, the above code assumes there will be at least one digit in the string - all characters following the last digit are converted to spaces. If there are no digits, every character is converted to a space.
