Java StringBuilder.replace - java

Consider the following inputs:
String[] input = {"a9", "aa9", "a9a9", "99a99a"};
What would be the most efficient way, using a StringBuilder, to replace any letter directly before a nine with the next letter in the alphabet?
After processing these inputs the output should be:
String[] output = {"b9", "ab9", "b9b9", "99b99a"};
I've been scratching my head for a while, and StringBuilder.setCharAt was the best method I could think of.
Any advice or suggestions would be appreciated.

Since you have to look at every character, you'll never perform better than linear in the size of the buffer. So you can just do something like
for (int i = 1; i < buffer.length(); ++i) // Note this starts at 1
    if (buffer.charAt(i) == '9' && Character.isLetter(buffer.charAt(i - 1)))
        buffer.setCharAt(i - 1, (char) (buffer.charAt(i - 1) + 1)); // does not wrap 'z' to 'a'
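A minimal runnable sketch of the linear scan above. It assumes that only ASCII letters should be advanced and that 'z' wraps around to 'a'; neither is stated in the question. The class and method names are hypothetical.

```java
class BumpBeforeNine {
    // Returns the letter after c, wrapping z->a and Z->A (the wrap is an assumption).
    // Anything that is not an ASCII letter is returned unchanged.
    static char nextLetter(char c) {
        if (c == 'z') return 'a';
        if (c == 'Z') return 'A';
        boolean letter = (c >= 'a' && c < 'z') || (c >= 'A' && c < 'Z');
        return letter ? (char) (c + 1) : c;
    }

    // Edits the buffer in place: every letter directly before a '9' is advanced by one.
    static void bump(StringBuilder buffer) {
        for (int i = 1; i < buffer.length(); i++) {
            if (buffer.charAt(i) == '9') {
                buffer.setCharAt(i - 1, nextLetter(buffer.charAt(i - 1)));
            }
        }
    }

    // Convenience wrapper for String input.
    static String bump(String s) {
        StringBuilder sb = new StringBuilder(s);
        bump(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] input = {"a9", "aa9", "a9a9", "99a99a"};
        for (String s : input) {
            System.out.println(bump(s)); // b9, ab9, b9b9, 99b99a
        }
    }
}
```

The scan touches each character once and never reallocates, since setCharAt overwrites in place.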

You can use the following code:
String[] input = {"a9", "aa9", "a9a9", "99a99a", "z9", "aZ9"};
String[] output = new String[input.length];
Pattern pt = Pattern.compile("([a-z])(?=9)", Pattern.CASE_INSENSITIVE);
for (int i=0; i<input.length; i++) {
Matcher mt = pt.matcher(input[i]);
StringBuffer sb = new StringBuffer();
while (mt.find()) {
char ch = mt.group(1).charAt(0);
if (ch == 'z') ch = 'a';
else if (ch == 'Z') ch = 'A';
else ch++;
mt.appendReplacement(sb, String.valueOf(ch));
}
mt.appendTail(sb);
output[i] = sb.toString();
}
System.out.println(Arrays.toString(output));
OUTPUT:
[b9, ab9, b9b9, 99b99a, a9, aA9]

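You want to use a very simple state machine. As you loop over the characters of the input string, keep track of a boolean. If the character is a 9, set the boolean to true. If the character is a letter and the boolean is true, add one to the letter; in either case set the boolean to false. Then add the character to the output StringBuilder. Note that the boolean can only tell you about a 9 you have already seen, so for this problem either the scan has to run from the end of the string towards the start, or each character has to be held back until the one after it is known.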
For input you use a Reader. For output use a StringBuilder.
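One way to read the Reader-plus-StringBuilder idea above is sketched here. It swaps the boolean for a one-character "pending" slot, because whether a letter changes depends on the character that follows it. The class name, the helper, and the wrap from 'z' to 'a' are assumptions, not part of the answer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class DelayedBump {
    // Next letter with wrap-around; non-letters are returned unchanged (assumption).
    static char bumpIfLetter(char c) {
        if (c == 'z') return 'a';
        if (c == 'Z') return 'A';
        boolean letter = (c >= 'a' && c < 'z') || (c >= 'A' && c < 'Z');
        return letter ? (char) (c + 1) : c;
    }

    static String process(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        int pending = -1; // character read but not yet written; -1 means "none"
        int c;
        while ((c = in.read()) != -1) {
            if (pending != -1) {
                // The held-back character can be written now that its successor is known.
                out.append(c == '9' ? bumpIfLetter((char) pending) : (char) pending);
            }
            pending = c;
        }
        if (pending != -1) {
            out.append((char) pending); // the last character has nothing after it
        }
        return out.toString();
    }

    // Convenience wrapper; a StringReader does not actually throw IOException here.
    static String process(String s) {
        try {
            return process(new StringReader(s));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because only one character is held back, this works on any Reader without loading the whole input first.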

Use a one-token lookahead parser technique. Here is some pseudo-ish code:
for (int index = 0; index < buffer.length(); ++index)
{
if (index < buffer.length() - 1)
{
if (buffer.charAt(index + 1) == '9')
{
char current = (char) (buffer.charAt(index) + 1); // the cast is required; this does not check for letters or wrap 'z'
buffer.setCharAt(index, current);
}
}
}

Another solution is, for example, to use
StringUtils.indexOf(String str, char searchChar, int startPos)
in the way Ernest Friedman-Hill pointed out. Take this as an experimental example, not the most performant one.
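A rough sketch of the "jump from nine to nine" idea. It uses the standard StringBuilder.indexOf(String, int) in place of StringUtils.indexOf so that no extra library is needed; that swap, the class name, and the 'z' to 'a' wrap are assumptions.

```java
class JumpToNines {
    // Next letter with wrap-around; non-letters are returned unchanged (assumption).
    static char nextLetter(char c) {
        if (c == 'z') return 'a';
        if (c == 'Z') return 'A';
        boolean letter = (c >= 'a' && c < 'z') || (c >= 'A' && c < 'Z');
        return letter ? (char) (c + 1) : c;
    }

    static String bump(String s) {
        StringBuilder sb = new StringBuilder(s);
        // Start searching at index 1: a '9' at index 0 has no character before it.
        int pos = sb.indexOf("9", 1);
        while (pos != -1) {
            sb.setCharAt(pos - 1, nextLetter(sb.charAt(pos - 1)));
            pos = sb.indexOf("9", pos + 1); // continue after the nine just handled
        }
        return sb.toString();
    }
}
```

The search still inspects every character internally, so this is linear as well; it mainly moves the loop into indexOf.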

Related

How do I reverse each character in a string?

class Solution {
public String reverseWords(String s) {
int count = 0;
int current = 0;
StringBuilder build = new StringBuilder();
for(int i = 0; i< s.length(); i++){
count++;
current = count;
if(s.charAt(i) == ' '){
while(current > 0){
build.append(s.charAt(current - 1));
current--;
}
build.append(s.charAt(i));
}
}
return build.toString();
}
}
I am having trouble understanding why this doesn't work. I went through the
entire code a couple of times but there seems to be an issue.
Input : "Let's take LeetCode contest"
my answer: " s'teL ekat s'teL edoCteeL ekat s'teL "
correct answer: "s'teL ekat edoCteeL tsetnoc"
What's going on?
There are several problems:
You set current to the current position and then iterate down to 0, appending characters. Instead of going down to 0, you should iterate only down to the beginning of the last word. Alternatively, iterate until you reach the previous ' ' character.
You only append something after seeing a ' ' character. What will happen at the end of the sentence? As i goes over the letters in the last word, there will be no more ' ' characters, and so the last word will never get appended. To handle this case, you would need to add some logic after the for loop to check whether there is an unwritten word, and append it reversed.
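The two fixes above can be sketched as follows. The variable name wordStart and the trick of treating the end of the string like a space are illustrative choices, not the only way to do it.

```java
class ReverseWordsFixed {
    static String reverseWords(String s) {
        StringBuilder build = new StringBuilder();
        int wordStart = 0; // index of the first character of the current word
        for (int i = 0; i <= s.length(); i++) {
            // Treat the end of the string like a space so the last word is flushed too.
            if (i == s.length() || s.charAt(i) == ' ') {
                // Walk back only to the start of this word, not to index 0.
                for (int j = i - 1; j >= wordStart; j--) {
                    build.append(s.charAt(j));
                }
                if (i < s.length()) {
                    build.append(' ');
                }
                wordStart = i + 1;
            }
        }
        return build.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseWords("Let's take LeetCode contest"));
        // prints: s'teL ekat edoCteeL tsetnoc
    }
}
```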
A simpler approach is to take advantage of the StringBuilder's ability to insert characters at some position.
You could track the beginning position of the current word,
and as you iterate over the characters,
insert if it's not ' ',
or else append a ' ' and reset the insertion position.
StringBuilder build = new StringBuilder();
int current = 0;
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if (c == ' ') {
build.append(' ');
current = i + 1;
} else {
build.insert(current, c);
}
}
return build.toString();
Use a StringBuilder to reverse each word in your String :
String input = "Let's take LeetCode contest";
String[] split = input.split(" +");
StringBuilder output = new StringBuilder();
for (int i = 0; i < split.length; i++) {
output.append(new StringBuilder(split[i]).reverse());
if (i < split.length - 1)
output.append(" ");
}
System.out.println(output);
First, you skip over index 0. Put these lines at the end of your method to avoid this:
count++;
current = count;
You might need a variable to track where the current word began. For example, declare one along with count and current like this:
int wordStart = 0;
Then, when you finish processing a word, set wordStart to point to the first character of the next word. I would put that after the while loop, here:
build.append(s.charAt(i));
wordStart = count + 1;
You will also need to change this: while(current > 0){ to this: while(current >= wordStart)
Also: you don't need count. The variable i is the exact same thing.
You can use the Stream API for this:
Stream.of(str.split(" "))
.map(s -> new StringBuilder(s).reverse().toString())
.reduce((s1, s2) -> s1 + " " + s2)
.orElse(null);
This is the easy way:
return Arrays.stream(s.split(" " ))
.map(StringBuilder::new)
.map(StringBuilder::reverse)
.map(StringBuilder::toString)
.collect(Collectors.joining(" "));

How can i replace a char in a String using chars from another string (Caesar Cypher)(JAVA)

I'm doing a caesar-cypher. Trying to replace all characters from a string to a certain character from the shifted alphabet.
Here is my code so far
public static String caesarify(String str, int key){
String alphabetNormal = shiftAlphabet(0);
String alphabetShifted = shiftAlphabet(key);
for (int i =0; i < str.length();i++){
for (int c =0; c < alphabetNormal.length(); c++) {
if (str.charAt(i) == alphabetNormal.charAt(c)) {
char replacement = alphabetShifted.charAt(c);
str.replace(str.charAt(i), replacement);
}
}
}
return str;
}
public static String shiftAlphabet(int shift) {
int start =0;
if (shift < 0) {
start = (int) 'Z' + shift + 1;
} else {
start = 'A' + shift;
}
String result = "";
char currChar = (char) start;
for(; currChar <= 'Z'; ++currChar) {
result = result + currChar;
}
if(result.length() < 26) {
for(currChar = 'A'; result.length() < 26; ++currChar) {
result = result + currChar;
}
}
return result;
}
I don't know why the string for example "ILIKEDONUTS" doesn't change to "JMJLFEPOVUT" when it's caesarified.
Don't use replace(), or any replace method, to replace a character at a given index in a String. It doesn't work. You're hoping that
str.replace(str.charAt(i), replacement);
will replace the i'th character of str. As pointed out in the other (now deleted) answer, str.replace doesn't change str itself, so you'd need to write
str = str.replace(str.charAt(i), replacement);
But that doesn't work. The replace() method doesn't know what your index i is. It only knows what character to replace. And, as the javadoc for replace() says, it replaces all characters in the string. So suppose that str.charAt(i) is 'a', and you want to replace it with 'd'. This code would replace all a characters with d, including (1) those that you already replaced with a earlier in the loop, so that this will defeat the work you've already done; and (2) a characters that come after this one, which you want to replace with d, but this will fail because later in the loop you will see d and replace it with g.
So you can't use replace(). There are a number of ways to replace the i'th character of a string, including using substring():
str = str.substring(0, i) + replacement + str.substring(i + 1);
But there are better ways, if you are going to replace every character. One is to create a StringBuilder from str, use StringBuilder's setCharAt method to change characters at specified indexes, and then convert the StringBuilder back to a String at the end. You should be able to look at the javadoc to find out what methods to use.
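A minimal sketch of the StringBuilder approach described above. For brevity it builds the shifted alphabet with substring instead of calling the question's shiftAlphabet; that substitution and the class name are assumptions.

```java
class CaesarSketch {
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    static String caesarify(String str, int key) {
        int shift = Math.floorMod(key, 26); // also maps negative keys into 0..25
        String shifted = ALPHABET.substring(shift) + ALPHABET.substring(0, shift);
        StringBuilder sb = new StringBuilder(str);
        for (int i = 0; i < sb.length(); i++) {
            int pos = ALPHABET.indexOf(sb.charAt(i));
            if (pos >= 0) {
                // Only index i changes, so earlier replacements are never revisited.
                sb.setCharAt(i, shifted.charAt(pos));
            }
            // Characters outside A-Z are left untouched.
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(caesarify("ILIKEDONUTS", 1)); // JMJLFEPOVUT
    }
}
```

Each position is looked up once in the original alphabet and written once, which avoids both problems described for replace().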
More: After looking into this more, I see why it was returning all A's. This inner loop has an error:
for (int c =0; c < alphabetNormal.length(); c++) {
if (str.charAt(i) == alphabetNormal.charAt(c)) {
char replacement = alphabetShifted.charAt(c);
str.replace(str.charAt(i), replacement);
}
}
Suppose key is 1, and the current character is 'C'. Your inner loop will eventually find C in alphabetNormal; it finds the corresponding character in alphabetShifted, which is D, and replaces C with D.
But then it loops back. Since the next character in alphabetNormal is D, it now matches the new str.charAt(i), which is now D, and therefore changes it again, to E. Then it loops back, and ... you get the picture.
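The cascade described above can be reproduced with a small demo. It applies only the "str = str.replace(...)" fix to the question's loop and builds the shifted alphabet with substring; the class and method names are hypothetical, and the key is assumed to be between 0 and 25.

```java
class CascadeDemo {
    private static final String NORMAL = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // The question's nested loop with the assignment added, and nothing else changed.
    static String buggyCaesarify(String str, int key) {
        String shifted = NORMAL.substring(key) + NORMAL.substring(0, key); // assumes 0 <= key < 26
        for (int i = 0; i < str.length(); i++) {
            for (int c = 0; c < NORMAL.length(); c++) {
                if (str.charAt(i) == NORMAL.charAt(c)) {
                    // Replaces every occurrence, and the new character matches
                    // again on the next pass of the inner loop.
                    str = str.replace(str.charAt(i), shifted.charAt(c));
                }
            }
        }
        return str;
    }

    public static void main(String[] args) {
        System.out.println(buggyCaesarify("ILIKEDONUTS", 1)); // AAAAAAAAAAA
    }
}
```

With key 1 every letter is pushed along the alphabet until it wraps from Z to A, which is why the result is all A's.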
Replace the line
str.replace(str.charAt(i), replacement);
with
str = str.replace(str.charAt(i), replacement);
Or you can make a char array and replace the character in that, then create a new String from the array at the end and return it.
A better version of caesarify():
public static String caesarify(String str, int key) {
    String alphabetNormal = shiftAlphabet(0);
    String alphabetShifted = shiftAlphabet(key);
    // create a char array
    char[] arr = str.toCharArray();
    // declared once, outside the loops
    char replacement;
    for (int i = 0; i < arr.length; i++) {
        for (int c = 0; c < alphabetNormal.length(); c++) {
            if (arr[i] == alphabetNormal.charAt(c)) {
                replacement = alphabetShifted.charAt(c);
                // replace the char at this position in the array
                arr[i] = replacement;
                // stop here, otherwise the new char would match a later alphabet entry
                break;
            }
        }
    }
    // return the array as a String
    return new String(arr);
}

Replacing an unknown pattern size with a character

Is there a way to replace a specific repetitive character using regular expressions?
Example:
str = "Anne has nnnn things"
The solution would be:
"Ane has n things"
If a string has two or more instances of one character next to each other, the regular expression should replace them all with just one.
It is possible:
inputString.replaceAll("(.)\\1+", "$1")
Match one character, capture it, repeat it once or more, replace with only the capture.
However, this may not be the fastest solution. Such a thing is also doable with a simple loop:
public String removeRepetitions(final String input)
{
if (input.isEmpty())
return input;
final int len = input.length();
final StringBuilder sb = new StringBuilder(len);
char current = input.charAt(0);
char c;
sb.append(current);
for (int i = 1; i < len; i++) {
c = input.charAt(i);
if (c != current) {
sb.append(c);
current = c;
}
}
return sb.toString();
}
This should match n that repeats 2 or more times:
/n{2,}/
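A short sketch comparing the two patterns mentioned above on the question's example. The class and method names are hypothetical; the second method hard-codes the letter n, as the pattern /n{2,}/ does.

```java
class CollapseRepeats {
    // Collapses a run of any repeated character to a single occurrence.
    static String collapseAny(String input) {
        return input.replaceAll("(.)\\1+", "$1");
    }

    // Collapses only runs of two or more 'n' characters.
    static String collapseN(String input) {
        return input.replaceAll("n{2,}", "n");
    }

    public static void main(String[] args) {
        String str = "Anne has nnnn things";
        System.out.println(collapseAny(str)); // Ane has n things
        System.out.println(collapseN(str));   // Ane has n things
    }
}
```

The two differ on other input: collapseAny also shortens "aa" or a double space, while collapseN leaves everything except runs of n alone.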

How to remove surrogate characters in Java?

I am facing a situation where I get surrogate characters in text that I am saving to MySQL 5.1. As UTF-16 is not supported there, I want to remove these surrogate pairs manually with a Java method before saving the text to the database.
I have written the following method for now and I am curious to know if there is a direct and optimal way to handle this.
Thanks in advance for your help.
public static String removeSurrogates(String query) {
StringBuffer sb = new StringBuffer();
for (int i = 0; i < query.length() - 1; i++) {
char firstChar = query.charAt(i);
char nextChar = query.charAt(i+1);
if (Character.isSurrogatePair(firstChar, nextChar) == false) {
sb.append(firstChar);
} else {
i++;
}
}
if (Character.isHighSurrogate(query.charAt(query.length() - 1)) == false
&& Character.isLowSurrogate(query.charAt(query.length() - 1)) == false) {
sb.append(query.charAt(query.length() - 1));
}
return sb.toString();
}
Here's a couple things:
Character.isSurrogate(char c):
A char value is a surrogate code unit if and only if it is either a low-surrogate code unit or a high-surrogate code unit.
Checking for pairs seems pointless, why not just remove all surrogates?
x == false is equivalent to !x
StringBuilder is better in cases where you don't need synchronization (like a variable that never leaves local scope).
I suggest this:
public static String removeSurrogates(String query) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < query.length(); i++) {
char c = query.charAt(i);
// !isSurrogate(c) in Java 7
if (!(Character.isHighSurrogate(c) || Character.isLowSurrogate(c))) {
sb.append(c);
}
}
return sb.toString();
}
Breaking down the if statement
You asked about this statement:
if (!(Character.isHighSurrogate(c) || Character.isLowSurrogate(c))) {
sb.append(c);
}
One way to understand it is to break each operation into its own function, so you can see that the combination does what you'd expect:
static boolean isSurrogate(char c) {
return Character.isHighSurrogate(c) || Character.isLowSurrogate(c);
}
static boolean isNotSurrogate(char c) {
return !isSurrogate(c);
}
...
if (isNotSurrogate(c)) {
sb.append(c);
}
Java strings are stored as sequences of 16-bit chars, but what they represent is sequences of Unicode characters. In Unicode terminology, they are stored as code units, but model code points. Thus, it's somewhat meaningless to talk about removing surrogates, which don't exist in the character / code point representation (unless you have rogue single surrogates, in which case you have other problems).
Rather, what you want to do is to remove any characters which will require surrogates when encoded. That means any character which lies beyond the basic multilingual plane. You can do that with a simple regular expression:
return query.replaceAll("[^\u0000-\uffff]", "");
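A small sketch of the code-point view described above, showing the regular expression next to an equivalent loop over code points. The class and method names are hypothetical; the regex is written with escaped \\u sequences so that the regex engine, not the compiler, interprets them.

```java
class StripSupplementary {
    // Removes every code point outside the Basic Multilingual Plane.
    static String viaRegex(String text) {
        return text.replaceAll("[^\\u0000-\\uFFFF]", "");
    }

    // Same idea without a regex: keep only code points that fit in one char.
    static String viaCodePoints(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
            .filter(Character::isBmpCodePoint)
            .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00b"; // 'a', one supplementary character, 'b'
        System.out.println(viaRegex(text));      // ab
        System.out.println(viaCodePoints(text)); // ab
    }
}
```

Both versions leave a lone, unpaired surrogate in place, because on its own it counts as a BMP code point.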
Why not simply
for (int i = 0; i < query.length(); i++) {
    char c = query.charAt(i);
    if (!Character.isHighSurrogate(c) && !Character.isLowSurrogate(c))
        sb.append(c);
}
You should probably replace them with "?" instead of erasing them outright.
Just curious: if a char is a high surrogate, is there any need to check the next one? It is supposed to be a low surrogate. The modified version would be:
public static String removeSurrogates(String query) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < query.length(); i++) {
char ch = query.charAt(i);
if (Character.isHighSurrogate(ch))
i++; // skip the next char, as it's supposed to be a low surrogate
else
sb.append(ch);
}
return sb.toString();
}
If you want to remove them, all these solutions are useful.
But if you want to replace them, the code below is better:
StringBuffer sb = new StringBuffer();
for (int i = 0; i < s.length(); i++) {
char c = s.charAt(i);
if(Character.isHighSurrogate(c)){
sb.append('*');
}else if(!Character.isLowSurrogate(c)){
sb.append(c);
}
}
return sb.toString();

Given a string find the first embedded occurrence of an integer

This was asked in an interview:
Given any string, get me the first occurrence of an integer.
For example
Str98 then it should return 98
Str87uyuy232 -- it should return 87
I gave the answer as loop through the string and compared it with numeric characters, as in
if ((c >= '0') && (c <= '9'))
Then I got the index of the number, parsed it and returned it. Somehow he was not convinced.
Can anyone share the best possible solution?
With a regex, it's pretty simple:
String s = new String("Str87uyuy232");
Matcher matcher = Pattern.compile("\\d+").matcher(s);
matcher.find();
int i = Integer.valueOf(matcher.group());
(Thanks to Eric Mariacher)
Using java.util.Scanner :
int res = new Scanner("Str87uyuy232").useDelimiter("\\D+").nextInt();
The purpose of a Scanner is to extract tokens from an input (here, a String). Tokens are sequences of characters separated by delimiters. By default, the delimiter of a Scanner is the whitespace, and the tokens are thus whitespace-delimited words.
Here, I use the delimiter \D+, which means "anything that is not a digit". The tokens that our Scanner can read in our string are "87" and "232". The nextInt() method will read the first one.
nextInt() throws java.util.NoSuchElementException if there is no token to read. Call the method hasNextInt() before calling nextInt(), to check that there is something to read.
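The hasNextInt() check mentioned above might look like this. Returning an OptionalInt for the "no integer found" case is an illustrative choice, and the class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.Scanner;

class FirstIntScanner {
    // Returns the first run of digits as an int, or empty if there is none
    // (or if the first run does not fit into an int).
    static OptionalInt firstInt(String s) {
        try (Scanner scanner = new Scanner(s).useDelimiter("\\D+")) {
            return scanner.hasNextInt()
                    ? OptionalInt.of(scanner.nextInt())
                    : OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(firstInt("Str87uyuy232")); // OptionalInt[87]
        System.out.println(firstInt("no digits"));    // OptionalInt.empty
    }
}
```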
There are two issues with this solution.
Consider the test cases - there are 2 characters '8' and '7', and they both form the integer 87 that you should be returning. (This is the main issue)
This is somewhat pedantic: in Java the chars '0' through '9' are guaranteed to be contiguous, so the range check does work, but I imagine interviewers like to see the intent stated explicitly. A clearer solution would be
if (Character.isDigit(c)) { ... }
There are plenty of different ways to do this. My first thought would be:
int i = 0;
while (i < string.length() && !Character.isDigit(string.charAt(i))) i++;
int j = i;
while (j < string.length() && Character.isDigit(string.charAt(j))) j++;
return Integer.parseInt(string.substring(i, j)); // throws NumberFormatException if the string has no digits
Of course, as mentioned in the comments, using the regex functionality in Java is likely the best way to do this. But of course many interviewers ask you to do things like this without libraries, etc...
String input = "Str87uyuy232";
Matcher m = Pattern.compile("[^0-9]*([0-9]+).*").matcher(input);
if (m.matches()) {
System.out.println(m.group(1));
}
Just in case you want a solution without regex or other utilities, here you go:
public static Integer returnInteger(String s)
{
if(s== null)
return null;
else
{
char[] characters = s.toCharArray();
Integer value = null;
boolean isPrevDigit = false;
for(int i =0;i<characters.length;i++)
{
if(isPrevDigit == false)
{
if(Character.isDigit(characters[i]))
{
isPrevDigit = true;
value = Character.getNumericValue(characters[i]);
}
}
else
{
if(Character.isDigit(characters[i]))
{
value = (value*10)+ Character.getNumericValue(characters[i]);
}
else
{
break;
}
}
}
return value;
}
}
You could go to a lower level too. A quick look at ASCII values reveals that alphabetical characters start at 65 and digits go from 48 to 57. With that being the case, you can simply 'and' each character against 127 and see whether that value falls in the digit range, 48 to 57.
char[] s = "Str87uyuy232".toCharArray();
String fint = "";
Boolean foundNum = false;
for (int i = 0; i < s.length; i++)
{
int test = s[i] & 127;
if (test < 58 && test > 47)
{
fint += s[i];
foundNum = true;
}
else if (foundNum)
break;
}
System.out.println(fint);
Doing this wouldn't be good for the real world (different character sets), but as a puzzle solution is fun.
