Replace nested string with some rules - java

The string follows 3 rules:
1) It contains words or groups (enclosed in parentheses), and groups can be nested.
2) If there is a space between words or groups, each of those words or groups should be prefixed with "+".
For example:
"a b" needs to be "+a +b"
"a (b c)" needs to be "+a +(+b +c)"
3) If there is a | between words or groups, those words or groups should be surrounded with parentheses.
For example:
"a|b" needs to be "(a b)"
"a|b|c" needs to be "(a b c)"
Considering all the rules, here is another example:
"aa|bb|(cc|(ff gg)) hh" needs to be "+(aa bb (cc (+ff +gg))) +hh"
I have tried to use regex, stack and recursive descent parser logic, but still cannot fully solve the problem.
Could anyone please share the logic or pseudo code on this problem?
Edit:
One more important rule: the vertical bar has higher precedence than the space.
For example:
aa|bb hh cc|dd (a|b) needs to be +(aa bb) +hh +(cc dd) +((a b))
(aa dd)|bb|cc (ee ff)|(gg hh) needs to be +((+aa +dd) bb cc) +((+ee +ff) (+gg +hh))
Edit 2:
To solve the precedence problem, I found a way: add the parentheses before calling Sunil Dabburi's methods.
For example:
aa|bb hh cc|dd (a|b) will be (aa|bb) hh (cc|dd) (a|b)
(aa dd)|bb|cc (ee ff)|(gg hh) will be ((aa dd)|bb|cc) ((ee ff)|(gg hh))
Since performance is not a big concern for my application, this approach at least makes it work for me. I guess the JavaCC tool may solve this problem beautifully. I hope someone else can continue to discuss and contribute to this problem.

Here is my attempt. Based on your examples and a few that I came up with, I believe it is correct under the rules. I solved this by breaking the problem up into 2 parts:
1) Solving the case where the string contains only words, or is a group containing only words.
2) Solving words and groups by substituting child groups out, applying part 1), and repeating part 2) on the child groups.
private String transformString(String input) {
Stack<Pair<Integer, String>> childParams = new Stack<>();
String parsedInput = input;
int nextInt = Integer.MAX_VALUE;
Pattern pattern = Pattern.compile("\\((\\w|\\|| )+\\)");
Matcher matcher = pattern.matcher(parsedInput);
while (matcher.find()) {
nextInt--;
parsedInput = matcher.replaceFirst(String.valueOf(nextInt));
String childParam = matcher.group();
childParams.add(Pair.of(nextInt, childParam));
matcher = pattern.matcher(parsedInput);
}
parsedInput = transformBasic(parsedInput);
while (!childParams.empty()) {
Pair<Integer, String> childGroup = childParams.pop();
parsedInput = parsedInput.replace(childGroup.fst.toString(), transformBasic(childGroup.snd));
}
return parsedInput;
}
// Transform basic only handles strings that contain words. This allows us to simplify the problem
// and not have to worry about child groups or nested groups.
private String transformBasic(String input) {
String transformedBasic = input;
if (input.startsWith("(")) {
transformedBasic = input.substring(1, input.length() - 1);
}
// Append + in front of each word if there are multiple words.
if (transformedBasic.contains(" ")) {
transformedBasic = transformedBasic.replaceAll("( )|^", "$1+");
}
// Surround all words containing | with parenthesis.
transformedBasic = transformedBasic.replaceAll("([\\w]+\\|[\\w|]*[\\w]+)", "($1)");
// Replace pipes with spaces.
transformedBasic = transformedBasic.replace("|", " ");
if (input.startsWith("(") && !transformedBasic.startsWith("(")) {
transformedBasic = "(" + transformedBasic + ")";
}
return transformedBasic;
}
Verified with the following test cases:
@ParameterizedTest
@CsvSource({
"a b,+a +b",
"a (b c),+a +(+b +c)",
"a|b,(a b)",
"a|b|c,(a b c)",
"aa|bb|(cc|(ff gg)) hh,+(aa bb (cc (+ff +gg))) +hh",
"(aa(bb(cc|ee)|ff) gg),(+aa(bb(cc ee) ff) +gg)",
"(a b),(+a +b)",
"(a(c|d) b),(+a(c d) +b)",
"bb(cc|ee),bb(cc ee)",
"((a|b) (a b)|b (c|d)|e),(+(a b) +((+a +b) b) +((c d) e))"
})
void testTransformString(String input, String output) {
Assertions.assertEquals(output, transformString(input));
}
@ParameterizedTest
@CsvSource({
"a b,+a +b",
"a b c,+a +b +c",
"a|b,(a b)",
"(a b),(+a +b)",
"(a|b),(a b)",
"a|b|c,(a b c)",
"(aa|bb cc|dd),(+(aa bb) +(cc dd))",
"(aa|bb|ee cc|dd),(+(aa bb ee) +(cc dd))",
"aa|bb|cc|ff gg hh,+(aa bb cc ff) +gg +hh"
})
void testTransformBasic(String input, String output) {
Assertions.assertEquals(output, transformBasic(input));
}

I tried to solve the problem. I am not sure it works in all cases, but I verified it with the inputs given in the question and it worked fine.
We need to format the pipes first. That adds the necessary parentheses and spacing.
The spaces generated as part of pipe processing can interfere with the actual spaces in the expression, so I used the $ symbol to mask them.
Processing spaces is tricky, as parentheses need to be processed individually. So the approach I am following is to find a set of parentheses starting from the outside and going inward.
So typically we have <left_part><parentheses_code><right_part>. Now left_part can be empty; similarly, right_part can be empty. We need to handle such cases.
Also, if the right_part starts with a space, we need to add '+' to left_part as per the space requirement.
NOTE: I am not sure what's expected of (a|b): whether the result should be ((a b)) or (a b). I am going with ((a b)) purely by the definition of it.
Now here is the working code:
public class Test {
public static void main(String[] args) {
String input = "aa|bb hh cc|dd (a|b)";
String result = formatSpaces(formatPipes(input)).replaceAll("\\$", " ");
System.out.println(result);
}
private static String formatPipes(String input) {
while (true) {
char[] chars = input.toCharArray();
int pIndex = input.indexOf("|");
if (pIndex == -1) {
return input;
}
input = input.substring(0, pIndex) + '$' + input.substring(pIndex + 1);
int first = pIndex - 1;
int closeParenthesesCount = 0;
while (first >= 0) {
if (chars[first] == ')') {
closeParenthesesCount++;
}
if (chars[first] == '(') {
if (closeParenthesesCount > 0) {
closeParenthesesCount--;
}
}
if (chars[first] == ' ') {
if (closeParenthesesCount == 0) {
break;
}
}
first--;
}
String result;
if (first > 0) {
result = input.substring(0, first + 1) + "(";
} else {
result = "(";
}
int last = pIndex + 1;
int openParenthesesCount = 0;
while (last <= input.length() - 1) {
if (chars[last] == '(') {
openParenthesesCount++;
}
if (chars[last] == ')') {
if (openParenthesesCount > 0) {
openParenthesesCount--;
}
}
if (chars[last] == ' ') {
if (openParenthesesCount == 0) {
break;
}
}
last++;
}
if (last >= input.length() - 1) {
result = result + input.substring(first + 1) + ")";
} else {
result = result + input.substring(first + 1, last) + ")" + input.substring(last);
}
input = result;
}
}
private static String formatSpaces(String input) {
if (input.isEmpty()) {
return "";
}
int startIndex = input.indexOf("(");
if (startIndex == -1) {
if (input.contains(" ")) {
String result = input.replaceAll(" ", " +");
if (!result.trim().startsWith("+")) {
result = '+' + result;
}
return result;
} else {
return input;
}
}
int endIndex = startIndex + matchingCloseParenthesesIndex(input.substring(startIndex));
if (endIndex == -1) {
System.out.println("Invalid input!!!");
return "";
}
String first = "";
String last = "";
if (startIndex > 0) {
first = input.substring(0, startIndex);
}
if (endIndex < input.length() - 1) {
last = input.substring(endIndex + 1);
}
String result = formatSpaces(first);
String parenthesesStr = input.substring(startIndex + 1, endIndex);
if (last.startsWith(" ") && first.isEmpty()) {
result = result + "+";
}
result = result + "("
+ formatSpaces(parenthesesStr)
+ ")"
+ formatSpaces(last);
return result;
}
private static int matchingCloseParenthesesIndex(String input) {
int counter = 1;
char[] chars = input.toCharArray();
for (int i = 1; i < chars.length; i++) {
char ch = chars[i];
if (ch == '(') {
counter++;
} else if (ch == ')') {
counter--;
}
if (counter == 0) {
return i;
}
}
return -1;
}
}
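Since the question mentions trying a recursive descent parser, here is a minimal sketch of that approach. The class name QueryRewriter and the grammar in the comments are my own reading of the rules, not something given in the question. It assumes single spaces between items and words made only of letters and digits. Note that a parenthesized alternation such as (a|b) keeps both pairs of parentheses and becomes ((a b)), which follows the later precedence examples in the question; the earlier aa|bb|(cc|(ff gg)) hh example would need one of those pairs collapsed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the name and grammar are assumptions made for illustration.
final class QueryRewriter {
    private final String src;
    private int pos = 0;

    private QueryRewriter(String src) {
        this.src = src;
    }

    static String rewrite(String input) {
        QueryRewriter parser = new QueryRewriter(input.trim());
        String out = parser.sequence();
        if (parser.pos != parser.src.length()) {
            throw new IllegalArgumentException("Unexpected character at index " + parser.pos);
        }
        return out;
    }

    // sequence := alternation (' ' alternation)*   (space binds loosest)
    private String sequence() {
        List<String> parts = new ArrayList<>();
        parts.add(alternation());
        while (pos < src.length() && src.charAt(pos) == ' ') {
            pos++;
            parts.add(alternation());
        }
        if (parts.size() == 1) {
            return parts.get(0); // a single item gets no "+"
        }
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('+').append(part);
        }
        return sb.toString();
    }

    // alternation := atom ('|' atom)*   (the bar binds tighter than the space)
    private String alternation() {
        List<String> parts = new ArrayList<>();
        parts.add(atom());
        while (pos < src.length() && src.charAt(pos) == '|') {
            pos++;
            parts.add(atom());
        }
        return parts.size() == 1 ? parts.get(0) : "(" + String.join(" ", parts) + ")";
    }

    // atom := word | '(' sequence ')'
    private String atom() {
        if (pos < src.length() && src.charAt(pos) == '(') {
            pos++;
            String inner = sequence();
            if (pos >= src.length() || src.charAt(pos) != ')') {
                throw new IllegalArgumentException("Missing ')' at index " + pos);
            }
            pos++;
            return "(" + inner + ")";
        }
        int start = pos;
        while (pos < src.length() && Character.isLetterOrDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Expected a word at index " + pos);
        }
        return src.substring(start, pos);
    }
}
```

Each grammar rule maps to one method, so the precedence rule is expressed purely by which method calls which: sequence() splits on spaces and delegates to alternation(), which splits on bars.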

Related

How to remove repeating code in this solution?

I have this code which compresses characters in the given string and replaces repeated adjacent characters with their count.
Consider the following example:
Input:
aaabbccdsa
Expecting output:
a3b2c2dsa
My code is working properly, but I think the repeated if condition can be removed.
public class Solution {
public static String getCompressedString(String str) {
String result = "";
char anch = str.charAt(0);
int count = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == anch) {
count++;
} else {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
}
if (i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
}
}
return result;
}
}
In this solution, the code below is repeated twice:
if (count == 1) {
result += anch;
} else {
result += anch + Integer.toString(count);
}
Please note that I don't want to use a separate method for the repeated logic.
You could do away with the if statements.
public static String getCompressedString(String str) {
char[] a = str.toCharArray();
StringBuilder sb = new StringBuilder();
for(int i=0,j=0; i<a.length; i=j){
for(j=i+1;j < a.length && a[i] == a[j]; j++);
sb.append(a[i]).append(j-i==1?"":j-i);
}
return sb.toString();
}
You can do something like this:
public static String getCompressedString(String str) {
String result = "";
int count = 1;
for (int i = 0; i < str.length(); i++) {
if (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
count++;
} else {
if (count == 1) {
result += str.charAt(i);
} else {
result += str.charAt(i) + "" + count;
count = 1;
}
}
}
return result;
}
I got rid of the repeated code, and it works as intended.
You can use this approach as explained below:
Code:
public class Test {
public static void main(String[] args) {
String s = "aaabbccdsaccbbaaadsa";
char[] strArray = s.toCharArray();
char ch0 = strArray[0];
int counter = 0;
StringBuilder sb = new StringBuilder();
for(int i=0;i<strArray.length;i++){
if(ch0 == strArray[i]){//check for consecutive characters and increment the counter
counter++;
} else { // when character changes while iterating
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
counter = 1; // reset the counter to 1
ch0 = strArray[i]; // reset the ch0 with the current character
}
if(i == strArray.length-1){// case for last element of the string
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
}
}
System.out.println(sb);
}
}
Sample Input/Output:
Input:: aaabbccdsaccbbaaadsa
Output:: a3b2c2dsac2b2a3dsa
Input:: abcdaaaaa
Output:: abcda5
Since the body of the else branch and the body of the second if are the same, we can merge them by updating the condition. The updated body of the function will be:
String result = "";
char anch = str.charAt(0);
int count = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
boolean changed = ch != anch; // remember the comparison before anch is updated
if (!changed) {
count++;
}
// flush the current run if the character changed, or if we are at the end of the string
if (changed || i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
// a changed last character starts a run of one that no later iteration will flush
if (changed && i == str.length() - 1) {
result += ch;
}
}
}
return result;
Test Cases:
I have tested the solution on the following inputs:
aaabbccdsa -> a3b2c2dsa
aaab -> a3b
aaa -> a3
ab -> ab
aabbc -> a2b2c
Optional
If you want to make it shorter, you can update these 2 conditions.
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
as
result += anch;
if (count != 1) { // from here
result += count;// no need to convert (implicit conversion)
} // to here
Here's a single-statement solution using Stream API and regular expressions:
public static final Pattern GROUP_OF_ONE_OR_MORE = Pattern.compile("(.)\\1*");
public static String getCompressedString(String str) {
return GROUP_OF_ONE_OR_MORE.matcher(str).results()
.map(MatchResult::group)
.map(s -> s.charAt(0) + (s.length() == 1 ? "" : String.valueOf(s.length())))
.collect(Collectors.joining());
}
main()
public static void main(String[] args) {
System.out.println(getCompressedString("aaabbccdsa"));
System.out.println(getCompressedString("awswwwhhhp"));
}
Output:
a3b2c2dsa // "aaabbccdsa"
awsw3h3p // "awswwwhhhp"
How does it work
A regular expression "(.)\\1*" is capturing a group (.) of identical characters of length 1 or greater. Where . - denotes any symbol, and \\1 is a back reference to the group.
Method Matcher.results() "returns a stream of match results for each subsequence of the input sequence that matches the pattern".
The only thing left is to evaluate the length of each group and transform it accordingly before collecting into the resulting String.
Links:
A quick tutorial on Regular Expressions.
Official tutorials on lambda expressions and streams
You can use a function which has the following 3 parameters: result, count, anch.
Something of this sort:
private static String extractedFunction(String result,int count, char anch) {
return count ==1 ? (result + anch) : (result +anch+Integer.toString(count) );
}
make a function call from those two points like this :
result = extractedFunction(result,count,anch);
Try this.
static final Pattern PAT = Pattern.compile("(.)\\1*");
static String getCompressedString(String str) {
return PAT.matcher(str)
.replaceAll(m -> m.group(1)
+ (m.group().length() == 1 ? "" : m.group().length()));
}
Test cases:
@Test
public void testGetCompressedString() {
assertEquals("", getCompressedString(""));
assertEquals("a", getCompressedString("a"));
assertEquals("abc", getCompressedString("abc"));
assertEquals("abc3", getCompressedString("abccc"));
assertEquals("a3b2c2dsa", getCompressedString("aaabbccdsa"));
}
The regular expression "(.)\\1*" used here matches any sequence of identical characters. .replaceAll() takes a lambda expression as an argument, evaluates the lambda expression each time the pattern matches, and replaces the original string with the result.
The lambda expression is passed a MatchResult object containing the results of the match. Here we are receiving this object in the variable m. m.group() returns the entire matched substring, m.group(1) returns its first character.
If the input string is "aaabbccdsa", it will be processed as follows.
m.group(1)   m.group()   returned by lambda
a            aaa         a3
b            bb          b2
c            cc          c2
d            d           d
s            s           s
a            a           a

Java language conversion

This is the code I am using to translate text from English to Pig Latin and vice versa. However, the fromPig method does not seem to be working correctly.
Expected output: "java is a wonderful programming language, and object oriented programming is the best thing after sliced bread."
Got output: "avajavajay isyisyay ayay onderfulwonderfulway ogrammingprogrammingpray..."
So you can see that the words are in there, but I need to get rid of the extra parts at the ends of each word. If you can, please provide code that fixes my mistake.
public class PigLatin {
public String fromPig(String pigLatin) {
String res="";
String[] data=pigLatin.split(" ");
for(String word : data)
res += toEnglishWord(word) + " ";
return res;
}
private String toEnglishWord(String word) {
char punc=0;
for(int i=0;i<word.length();i++) {
if(word.charAt(i)=='.'||word.charAt(i)==','||word.charAt(i)==';'||word.charAt(i)=='!') {
punc=word.charAt(i);
break;
}
}
word=word.replace(punc + "","");
String[] data=word.split("-");
String firstPart=data[0];
String lastPart=data[0];
if(lastPart.equals("yay"))
return firstPart + punc ;
else {
lastPart=lastPart.replace("ay","");
return(lastPart+firstPart+punc);
}
}
}
This is the class that needs to execute the sentences.
public class Convert {
public static void main(String args []) {
PigLatin demo=new PigLatin();
String inEnglish="Now is the winter of our discontent " +
"Made glorious summer by this sun of York; " +
"And all the clouds that lour'd upon our house " +
"In the deep bosom of the ocean buried.";
String inPigLatin="avajay isyay ayay onderfulway ogrammingpray " +
"anguagelay, andyay objectyay orientedyay ogrammingpray " +
"isyay hetay estbay ingthay afteryay icedslay eadbray.";
System.out.println(demo.toPig(inEnglish));
System.out.println(demo.fromPig(inPigLatin));
}
}
Basically, the English sentence needs to be converted to Pig Latin, and the Pig Latin sentence needs to be converted to English.
English to Pig Latin is being done correctly, but Pig Latin to English is not.
Address the following problems:
Convert the text to a single case (e.g. lowercase) because you are comparing with only lowercase vowels.
The code inside your toEnglishWord is not correct. Change it as follows:
private String toEnglishWord(String word) {
char punc = 0;
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) == '.' || word.charAt(i) == ',' || word.charAt(i) == ';' || word.charAt(i) == '!') {
punc = word.charAt(i);
break;
}
}
// Trim the word, and remove all punctuation chars
word = word.trim().replaceAll("[\\.,;!]", "");
String english = "";
// If the word ends with 'yay', remove 'yay' from its end. Otherwise, if the
// word ends with 'ay', form the word as (3rd last letter + characters from
// beginning till the 4th last character). Also, add the punctuation at the end
if (word.length() > 2 && word.substring(word.length() - 3).equals("yay")) {
english = word.substring(0, word.length() - 3) + String.valueOf(punc);
} else if (word.length() > 3 && word.substring(word.length() - 2).equals("ay")) {
english = word.substring(word.length() - 3, word.length() - 2) + word.substring(0, word.length() - 3)
+ String.valueOf(punc);
}
return english;
}
Demo:
class PigLatin {
public String toPig(String english) {
String res = "";
String[] data = english.toLowerCase().split(" ");
for (String word : data)
res += toPigWord(word) + " ";
return res;
}
public String fromPig(String pigLatin) {
String res = "";
String[] data = pigLatin.toLowerCase().split(" ");
for (String word : data)
res += toEnglishWord(word) + " ";
return res;
}
private String toPigWord(String word) {
char punc = 0;
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) == '.' || word.charAt(i) == ',' || word.charAt(i) == ';' || word.charAt(i) == '!') {
punc = word.charAt(i);
break;
}
}
word = word.replace(punc + "", "");
if (isFirstLetterVowel(word))
return (word + "yay" + punc);
else {
int indexVowel = indexOfFirstVowel(word);
String after = word.substring(indexVowel);
String before = word.substring(0, indexVowel);
return (after + before + "ay" + punc);
}
}
private String toEnglishWord(String word) {
char punc = 0;
for (int i = 0; i < word.length(); i++) {
if (word.charAt(i) == '.' || word.charAt(i) == ',' || word.charAt(i) == ';' || word.charAt(i) == '!') {
punc = word.charAt(i);
break;
}
}
// Trim the word, and remove all punctuation chars
word = word.trim().replaceAll("[\\.,;!]", "");
String english = "";
// If the word ends with 'yay', remove 'yay' from its end. Otherwise, if the
// word ends with 'ay', form the word as (3rd last letter + characters from
// beginning till the 4th last character). Also, add the punctuation at the end
if (word.length() > 2 && word.substring(word.length() - 3).equals("yay")) {
english = word.substring(0, word.length() - 3) + String.valueOf(punc);
} else if (word.length() > 3 && word.substring(word.length() - 2).equals("ay")) {
english = word.substring(word.length() - 3, word.length() - 2) + word.substring(0, word.length() - 3)
+ String.valueOf(punc);
}
return english;
}
private boolean isFirstLetterVowel(String word) {
String temp = word.toLowerCase();
return (temp.charAt(0) == 'a' || temp.charAt(0) == 'e' || temp.charAt(0) == 'i' || temp.charAt(0) == 'o'
|| temp.charAt(0) == 'u');
}
private int indexOfFirstVowel(String word) {
int index = 0;
String temp = word.toLowerCase();
for (int i = 0; i < temp.length(); i++) {
if (temp.charAt(i) == 'a' || temp.charAt(i) == 'e' || temp.charAt(i) == 'i' || temp.charAt(i) == 'o'
|| temp.charAt(i) == 'u') {
index = i;
break;
}
}
return index;
}
}
class Main {
public static void main(String[] args) {
PigLatin pigLatin = new PigLatin();
String str = "hello world! good morning! honesty is a good policy.";
String strToPigLatin = pigLatin.toPig(str);
System.out.println(strToPigLatin);
System.out.println(pigLatin.fromPig(strToPigLatin));
}
}
Output:
ellohay orldway! oodgay orningmay! onestyhay isyay ayay oodgay olicypay.
hello world! good morning! honesty is a good policy.
Note: With the current logic, there is no way to convert a word like ogrammingpray (which is the piglatin of programming) back to programming.
It's impossible to convert Pig Latin to English just like that: you need a human, a program with access to a database of words, or a neural net or something that's been trained on a database of words. Without that, you will only be able to convert words back to English if they originally started with a vowel, or if their first letter was a consonant and their second letter was a vowel. Otherwise, there is literally no way for a machine to tell where the word originally ended.
For that reason, you need to make a method that outputs a List<String> like this:
public static List<String> allPossibleSentences(String inPigLatin) {
List<String> pigWords = Arrays.asList(inPigLatin.split("\\s"));
//You can also use a method reference here
List<List<String>> possSentences = cartesianProduct(pigWords.stream().map(word -> possibleEnglishWords(word)).collect(Collectors.toList()));
return possSentences.stream().map(words -> String.join(" ", words)).collect(Collectors.toList());
}
But since you need a fromPig method in your code, you can write it like this:
public static String fromPig(String inPigLatin) {
return allPossibleSentences(inPigLatin).get(0);
}
This is overpowered, compared to the answer by Arvind Kumar Avinash, since it generates all permutations, but I feel it will be more useful if you have the most likely case and all possible sentences.
Example main method
public static void main(String[] args) {
String inPigLatin="avajay isyay ayay onderfulway ogrammingpray " +
"anguagelay andyay objectyay orientedyay ogrammingpray " +
"isyay hetay estbay ingthay afteryay icedslay eadbray";
System.out.println(fromPig(inPigLatin));
inPigLatin = "icedslay eadbray";
System.out.println("\nExpected = sliced bread, gotten = " + fromPig(inPigLatin));
System.out.println(String.join("\n", allPossibleSentences(inPigLatin)));
}
Example output
java is a wonderful rogrammingp language and object oriented rogrammingp is the best hingt after liceds readb
Expected = sliced bread, gotten = liceds readb
liceds readb
liceds bread
liceds dbrea
sliced readb
sliced bread //Here you have the right answer, but your method has no way to confirm that
sliced dbrea
dslice readb
dslice bread
dslice dbrea
Your PigLatin class now
import java.util.Arrays;
import java.util.List;
import java.util.ArrayList;
import java.util.stream.Collectors;
class PigLatin {
//Put your toPig and toPigWord methods here (I couldn't find them)
public static String fromPig(String inPigLatin) {
return allPossibleSentences(inPigLatin).get(0);
}
/* Methods that above code relies on */
private static List<String> possibleEnglishWords(String word) {
List<String> possibilities = new ArrayList<>();
if (word.matches(".*yay$")) {
possibilities.add(word.substring(0, word.length() - 3));
return possibilities;
}
//Remove the pig latin part
word = word.substring(0, word.length() - 2);
for (int i = word.length() - 1; i >= 0; i --) {
if (isVowel(word.charAt(i))) break;
possibilities.add(word.substring(i) + word.substring(0, i));
}
return possibilities;
}
//Put all the words together
public static List<List<String>> cartesianProduct(List<List<String>> possWordArr) {
// The recursive helper already handles a single word correctly; special-casing it
// would return all candidates of that word as if they were one sentence.
return _cartesianProduct(0, possWordArr);
}
//Helper method
private static List<List<String>> _cartesianProduct(int index, List<List<String>> possWordArr) {
List<List<String>> ret = new ArrayList<>();
if (index == possWordArr.size()) {
ret.add(new ArrayList<>());
} else {
for (String word : possWordArr.get(index)) {
for (List<String> words : _cartesianProduct(index + 1, possWordArr)) {
words.add(0, word);
ret.add(words);
}
}
}
return ret;
}
private static boolean isVowel(char c) {
c = toUppercase(c);
switch (c) {
case 'A':
case 'E':
case 'I':
case 'O':
case 'U':
return true;
default:
return false;
}
}
private static char toUppercase(char c) {
if (c >= 'a') return (char) (((char) (c - 'a')) + 'A');
else return c;
}
}
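As noted above, picking the right candidate needs some kind of word list. Here is a minimal sketch of that idea, assuming a tiny hard-coded dictionary. The class name, the DICTIONARY contents and the decode method are all made up for illustration; a real program would load a proper word list. It generates candidates the same way as possibleEnglishWords above and returns the first one found in the dictionary, falling back to the first candidate otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of the PigLatin class above.
final class PigLatinDictionaryDemo {
    // Assumed, tiny word list used only for this example (Set.of needs Java 9+).
    private static final Set<String> DICTIONARY =
            Set.of("sliced", "bread", "programming", "language");

    private static boolean isVowel(char c) {
        return "aeiou".indexOf(Character.toLowerCase(c)) >= 0;
    }

    // All English words that could have produced this Pig Latin word.
    static List<String> candidates(String pigWord) {
        List<String> result = new ArrayList<>();
        if (pigWord.endsWith("yay")) {
            result.add(pigWord.substring(0, pigWord.length() - 3));
            return result;
        }
        String stem = pigWord.substring(0, pigWord.length() - 2); // drop the trailing "ay"
        // Move 1, 2, 3... trailing consonants back to the front, stopping at a vowel.
        for (int i = stem.length() - 1; i >= 0 && !isVowel(stem.charAt(i)); i--) {
            result.add(stem.substring(i) + stem.substring(0, i));
        }
        return result;
    }

    // Prefer a candidate that is a known word; otherwise fall back to the first one.
    static String decode(String pigWord) {
        List<String> options = candidates(pigWord);
        for (String option : options) {
            if (DICTIONARY.contains(option)) {
                return option;
            }
        }
        return options.isEmpty() ? pigWord : options.get(0);
    }
}
```

With this dictionary, decode("icedslay") returns "sliced" and decode("eadbray") returns "bread", while a word that is not in the list simply gets its first candidate.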

Shortening the representation of a string by adding the number of consecutive characters

Given a random character string not including (0-9), I need to shorten the representation of that string by adding the number of consecutive characters. For example, ggee will result in g2e2 being displayed.
I managed to implement the program and tested it (works correctly) through various inputs. I have run into the issue where I cannot seem to understand how the character "e" is displayed given the input above.
I have traced my code multiple times but I don't see when/how "e" is displayed when "i" is 2/3.
String input = new String("ggee");
char position = input.charAt(0);
int accumulator = 1;
for (int i = 1; i < input.length(); i++)
{
// Correction. Was boolean lastIndexOfString = input.charAt(i) == (input.charAt(input.length() - 1));
boolean lastIndexOfString = i == (input.length() - 1);
if (position == input.charAt(i))
{
accumulator++;
if (lastIndexOfString)
System.out.print(accumulator); // In my mind, I should be printing
// (input.charAt(i) + "" + accumulator); here
}
else //(position != input.charAt(i))
{
if (accumulator > 1)
{
System.out.print(position + "" + accumulator);
}
else
{
System.out.print(position + "");
}
position = input.charAt(i);
accumulator = 1;
if (lastIndexOfString)
System.out.print(input.charAt(i)); // This is always printing when
// I am at the last index of my string,
// even ignoring my condition of
// (position == input.charAt(i))
}
}
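The loop above prints from several places, which is part of what makes it hard to trace. For comparison, here is a minimal sketch with a single output point: the loop runs one step past the end of the string, so the last run is flushed by the same code as every other run. The class and method names are made up for this example, and it builds a String instead of printing.

```java
// Hypothetical sketch; names are illustrative only.
final class RunLengthSketch {
    static String shorten(String input) {
        if (input.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        char current = input.charAt(0);
        int run = 1;
        // i == input.length() is a deliberate extra step that flushes the final run.
        for (int i = 1; i <= input.length(); i++) {
            if (i < input.length() && input.charAt(i) == current) {
                run++;
                continue;
            }
            // The run ended (character changed or end of input): emit it once, here.
            out.append(current);
            if (run > 1) {
                out.append(run);
            }
            if (i < input.length()) {
                current = input.charAt(i);
                run = 1;
            }
        }
        return out.toString();
    }
}
```

For the input ggee this returns g2e2, and a single character such as a comes back unchanged.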
In Java 9+, using regular expression to find consecutive characters, the following will do it:
static String shorten(String input) {
return Pattern.compile("(.)\\1+").matcher(input)
.replaceAll(r -> r.group(1) + r.group().length());
}
Test
System.out.println(shorten("ggggeecaaaaaaaaaaaa"));
System.out.println(shorten("ggggee😀😀😀😁😁😁😁"));
Output
g4e2ca12
g4e2😀6😁8
However, as you can see, that code doesn't work if input string contains Unicode characters from the supplemental planes, such as Emoji characters.
Small modification will fix that:
static String shorten(String input) {
return Pattern.compile("(.)\\1+").matcher(input)
.replaceAll(r -> r.group(1) + r.group().codePointCount(0, r.group().length()));
}
Or:
static String shorten(String input) {
return Pattern.compile("(.)\\1+").matcher(input)
.replaceAll(r -> r.group(1) + input.codePointCount(r.start(), r.end()));
}
Output
g4e2ca12
g4e2😀3😁4
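To see where the doubled counts in the first output come from, here is a small sketch comparing the two ways of measuring a string. The class and method names are made up for the example; length() and codePointCount() are standard String methods. The emoji is written with escapes so the source does not depend on file encoding.

```java
// Hypothetical demo class; names are illustrative only.
final class CodePointDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points, which matches what a reader sees as characters here.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Three copies of U+1F600, each stored as a surrogate pair.
        String grins = "\uD83D\uDE00\uD83D\uDE00\uD83D\uDE00";
        System.out.println(utf16Units(grins)); // 6
        System.out.println(codePoints(grins)); // 3
    }
}
```

Characters outside the Basic Multilingual Plane take two char values each, so length() counts every emoji twice.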
Basically, you want each character followed by its number of repeats.
public class Main {
public static void main(String[] args) {
String s = "ggggggeee";
StringBuilder s1 = new StringBuilder();
for (int i = 0; i < s.length(); i++) {
int count = 0, j;
// count how many of the following characters repeat s.charAt(i)
for (j = i + 1; j < s.length(); j++) {
if (s.charAt(i) == s.charAt(j)) {
count++;
} else {
break;
}
}
i = j - 1; // jump to the last character of the current run
s1.append(s.charAt(i)).append(count + 1);
}
System.out.print(s1);
}
}
Output
g6e3

Recursively decompressing a String

I have this assignment that needs me to decompress a previously compressed string.
Examples of this would be
i4a --> iaaaa
q3w2ai2b --> qwwwaaibb
3a --> aaa
Here's what I've written so far:
public static String decompress(String compressedText)
{
char c;
char let;
int num;
String done = "";
String toBeDone = "";
String toBeDone2 = "";
if(compressedText.length() <= 1)
{
return compressedText;
}
if (Character.isLetter(compressedText.charAt(0)))
{
done = compressedText.substring(0,1);
toBeDone = compressedText.substring(1);
return done + decompress(toBeDone);
}
else
{
c = compressedText.charAt(0);
num = Character.getNumericValue(c);
let = compressedText.charAt(1);
if (num > 0)
{
num--;
toBeDone = num + Character.toString(let);
toBeDone2 = compressedText.substring(2);
return Character.toString(let) + decompress(toBeDone) + decompress(toBeDone2);
}
else
{
toBeDone2 = compressedText.substring(2);
return Character.toString(let) + decompress(toBeDone2);
}
}
}
My return values are absolutely horrendous.
"ab" yields "babb" somehow.
"a" or any 1-letter string yields the right result
"2a" yields "aaaaaaaaaaa"
"2a3b" gives me "aaaabbbbbbbbbbbbbbbbbbbbbbbbbbaaabbbbaaaabbbbbbbbbbbbbbbbbbbbbbbbbb"
The only place I can see a mistake in would probably be the last else section, since I wasn't entirely sure on what to do once the number reaches 0 and I have to stop using recursion on the letter after it. Other than that, I can't really see a problem that gives such horrifying outputs.
I reckon something like this would work:
public static String decompress(String compressedText) {
if (compressedText.length() <= 1) {
return compressedText;
}
char c = compressedText.charAt(0);
if (Character.isDigit(c)) {
return String.join("", Collections.nCopies(Character.digit(c, 10), compressedText.substring(1, 2))) + decompress(compressedText.substring(2));
}
return compressedText.charAt(0) + decompress(compressedText.substring(1));
}
As you can see, the base case is when the compressed String has a length less than or equal to 1 (as you have it in your program).
Then, we check if the first character is a digit. If so, we substitute in the correct amount of characters, and continue with the recursive process until we reach the base case.
If the first character is not a digit, then we simply append it and continue.
Keep in mind that this will only work with numbers from 1 to 9; if you require higher values, let me know!
EDIT 1: If the Collections#nCopies method is too complex, here is an equivalent method:
if (Character.isDigit(c)) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < Character.digit(c, 10); i++) {
sb.append(compressedText.charAt(1));
}
return sb.toString() + decompress(compressedText.substring(2));
}
EDIT 2: Here is a method that uses a recursive helper-method to repeat a String:
public static String decompress(String compressedText) {
if (compressedText.length() <= 1) {
return compressedText;
}
char c = compressedText.charAt(0);
if (Character.isDigit(c)) {
return repeatCharacter(compressedText.charAt(1), Character.digit(c, 10)) + decompress(compressedText.substring(2));
}
return compressedText.charAt(0) + decompress(compressedText.substring(1));
}
public static String repeatCharacter(char character, int counter) {
if (counter == 1) {
return Character.toString(character);
}
return character + repeatCharacter(character, counter - 1);
}
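For counts above 9, here is a possible sketch that stays recursive. It is not from the original answer: the class name is made up, and it assumes that a run of digits is always the repeat count for the single character that follows it. Digits at the very end of the input, with nothing to repeat, are returned unchanged, which is also an assumption.

```java
import java.util.Collections;

// Hypothetical variant; the name and the handling of trailing digits are assumptions.
final class MultiDigitDecompress {
    static String decompress(String text) {
        if (text.isEmpty()) {
            return "";
        }
        // Find where the leading run of digits ends.
        int i = 0;
        while (i < text.length() && Character.isDigit(text.charAt(i))) {
            i++;
        }
        if (i == 0) {
            // No count: copy the character and continue with the rest.
            return text.charAt(0) + decompress(text.substring(1));
        }
        if (i == text.length()) {
            // Only digits left, nothing to repeat: return them as they are.
            return text;
        }
        int count = Integer.parseInt(text.substring(0, i));
        String unit = text.substring(i, i + 1);
        return String.join("", Collections.nCopies(count, unit))
                + decompress(text.substring(i + 1));
    }
}
```

With this version, 12x expands to twelve x characters, and the single-digit examples from the question give the same results as before.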

Extract the difference between two strings in Java

Hi, I have two strings:
String hear = "Hi My name is Deepak"
+ "\n"
+ "How are you ?"
+ "\n"
+ "\n"
+ "How is everyone";
String dear = "Hi My name is Deepak"
+ "\n"
+ "How are you ?"
+ "\n"
+ "Hey there \n"
+ "How is everyone";
I want to get what is not present in the hear string, that is, "Hey there \n". I found a method, but it fails for this case:
static String strDiffChop(String s1, String s2) {
if (s1.length() > s2.length()) {
return s1.substring(s2.length() - 1);
} else if (s2.length() > s1.length()) {
return s2.substring(s1.length() - 1);
} else {
return "";
}
}
Can anyone help?
google-diff-match-patch
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.
Diff: Compare two blocks of plain text and efficiently return a list of differences.
Match: Given a search string, find its best fuzzy match in a block of plain text. Weighted for both accuracy and location.
Patch: Apply a list of patches onto plain text. Use best-effort to apply patch even when the underlying text doesn't match.
Currently available in Java, JavaScript, Dart, C++, C#, Objective C, Lua and Python. Regardless of language, each library features the same API and the same functionality. All versions also have comprehensive test harnesses.
There is a Line or word diffs wiki page which describes how to do line-by-line diffs.
One can use the StringUtils from Apache Commons. Here is the StringUtils API.
public static String difference(String str1, String str2) {
if (str1 == null) {
return str2;
}
if (str2 == null) {
return str1;
}
int at = indexOfDifference(str1, str2);
if (at == -1) {
return EMPTY;
}
return str2.substring(at);
}
public static int indexOfDifference(String str1, String str2) {
if (str1 == str2) {
return -1;
}
if (str1 == null || str2 == null) {
return 0;
}
int i;
for (i = 0; i < str1.length() && i < str2.length(); ++i) {
if (str1.charAt(i) != str2.charAt(i)) {
break;
}
}
if (i < str2.length() || i < str1.length()) {
return i;
}
return -1;
}
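To see what this method returns for the strings in the question, here is a minimal self-contained sketch. The class name DifferenceDemo is hypothetical, and the method body is a condensed re-implementation of the logic quoted above, not the library source itself. Note that the result is the remainder of the second string starting at the first differing position, so it contains more than just the inserted line.

```java
// Hypothetical demo class; condensed from the logic quoted above.
class DifferenceDemo {

    // Returns the part of str2 that starts at the first index where it differs from str1.
    static String difference(String str1, String str2) {
        if (str1 == null) {
            return str2;
        }
        if (str2 == null) {
            return str1;
        }
        int i = 0;
        // Skip the common prefix of both strings.
        while (i < str1.length() && i < str2.length() && str1.charAt(i) == str2.charAt(i)) {
            i++;
        }
        if (i == str1.length() && i == str2.length()) {
            return ""; // the strings are identical
        }
        return str2.substring(i);
    }

    public static void main(String[] args) {
        String hear = "Hi My name is Deepak\nHow are you ?\n\nHow is everyone";
        String dear = "Hi My name is Deepak\nHow are you ?\nHey there \nHow is everyone";
        // Prints "Hey there ", a line break, and then "How is everyone".
        System.out.println(difference(hear, dear));
    }
}
```

So this approach is a good fit when only the tail of the string changes, and less so when text is inserted in the middle.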
I used StringTokenizer to find the solution. Below is the code snippet:
public static List<String> findNotMatching(String sourceStr, String anotherStr){
StringTokenizer at = new StringTokenizer(sourceStr, " ");
StringTokenizer bt = null;
int i = 0, token_count = 0;
String token = null;
boolean flag = false;
List<String> missingWords = new ArrayList<String>();
while (at.hasMoreTokens()) {
token = at.nextToken();
bt = new StringTokenizer(anotherStr, " ");
token_count = bt.countTokens();
while (i < token_count) {
String s = bt.nextToken();
if (token.equals(s)) {
flag = true;
break;
} else {
flag = false;
}
i++;
}
i = 0;
if (flag == false)
missingWords.add(token);
}
return missingWords;
}
Convert the strings to lists and then use the method described in "How to remove common values from two array list" to get the result.
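A minimal sketch of that idea, assuming the strings are split into lines and the common lines are removed with List.removeAll. The class and method names (LineDifference, linesOnlyIn) are hypothetical and not taken from the linked question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; names are illustrative only.
class LineDifference {

    // Returns the lines of 'candidate' that do not occur anywhere in 'reference'.
    // Line order and duplicate counts in 'reference' are not taken into account.
    static List<String> linesOnlyIn(String candidate, String reference) {
        // Copy into an ArrayList because Arrays.asList returns a fixed-size list.
        List<String> result = new ArrayList<>(Arrays.asList(candidate.split("\n")));
        result.removeAll(Arrays.asList(reference.split("\n")));
        return result;
    }

    public static void main(String[] args) {
        String hear = "Hi My name is Deepak\nHow are you ?\n\nHow is everyone";
        String dear = "Hi My name is Deepak\nHow are you ?\nHey there \nHow is everyone";
        System.out.println(linesOnlyIn(dear, hear)); // prints [Hey there ]
    }
}
```

Because the comparison is done line by line, the trailing "\n" itself is not part of the result; only the line content "Hey there " is.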
If you prefer not to use an external library, you can use the following Java snippet to efficiently compute the difference:
/**
* Returns an array of size 2. The entries contain a minimal set of characters
* that have to be removed from the corresponding input strings in order to
* make the strings equal.
*/
public String[] difference(String a, String b) {
return diffHelper(a, b, new HashMap<>());
}
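private String[] diffHelper(String a, String b, Map<Long, String[]> lookup) {
// The memo key packs the two remaining lengths into one long.
// Explicit get/put is used because a recursive HashMap.computeIfAbsent
// can throw ConcurrentModificationException on Java 9 and later.
long key = ((long) a.length()) << 32 | b.length();
String[] cached = lookup.get(key);
if (cached != null) {
return cached;
}
String[] result;
if (a.isEmpty() || b.isEmpty()) {
result = new String[]{a, b};
} else if (a.charAt(0) == b.charAt(0)) {
result = diffHelper(a.substring(1), b.substring(1), lookup);
} else {
String[] aa = diffHelper(a.substring(1), b, lookup);
String[] bb = diffHelper(a, b.substring(1), lookup);
if (aa[0].length() + aa[1].length() < bb[0].length() + bb[1].length()) {
result = new String[]{a.charAt(0) + aa[0], aa[1]};
} else {
result = new String[]{bb[0], b.charAt(0) + bb[1]};
}
}
lookup.put(key, result);
return result;
}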
This approach uses dynamic programming. It tries all combinations in a brute-force way but memoizes the results for pairs of suffixes it has already computed, so it only has to solve O(n^2) subproblems.
Examples:
String hear = "Hi My name is Deepak"
+ "\n"
+ "How are you ?"
+ "\n"
+ "\n"
+ "How is everyone";
String dear = "Hi My name is Deepak"
+ "\n"
+ "How are you ?"
+ "\n"
+ "Hey there \n"
+ "How is everyone";
difference(hear, dear); // returns {"","Hey there "}
difference("Honda", "Hyundai"); // returns {"o","yui"}
difference("Toyota", "Coyote"); // returns {"Ta","Ce"}
I was looking for a solution but couldn't find the one I needed, so I created a utility class for comparing two versions of a text (new and old) and producing a result text with the changes marked between [added] and [deleted] tags. The tags can easily be replaced with a highlighter of your choice, for example an HTML tag. string-version-comparison
Any comments will be appreciated.
*It might not work well with long texts because of the higher probability of finding the same phrases as deleted.
You should use StringUtils from Apache Commons
String diff = StringUtils.difference( "Word", "World" );
System.out.println( "Difference: " + diff );
Difference: ld
Source: https://www.oreilly.com/library/view/jakarta-commons-cookbook/059600706X/ch02s15.html
My solution is for simple strings.
You can extend it by tokenizing lines from a paragraph.
It uses the minimum edit distance (recursive approach). You can use DP if you would like.
import java.util.concurrent.atomic.AtomicInteger;
// A naive recursive Java program to find the minimum number of
// operations to convert str1 to str2
class JoveoTest {
static int min(int x, int y, int z)
{
if (x <= y && x <= z)
return x;
if (y <= x && y <= z)
return y;
else
return z;
}
static int editDist(String str1, String str2, int m,
int n,StringBuilder str)
{
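if (m == 0) {
// str1 is exhausted: the remaining n characters of str2 are insertions.
// Write into str itself; reassigning the parameter would not reach the caller.
str.setLength(0);
for (int i = n - 1; i >= 0; i--) {
str.append(str2.charAt(i)).append('+');
}
return n;
}
if (n == 0){
// str2 is exhausted: the remaining m characters of str1 are deletions.
str.setLength(0);
for (int i = m - 1; i >= 0; i--) {
str.append(str1.charAt(i)).append('-');
}
return m;
}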
if (str1.charAt(m - 1) == str2.charAt(n - 1))
return editDist(str1, str2, m - 1, n - 1,str);
StringBuilder myStr1=new StringBuilder();
StringBuilder myStr2=new StringBuilder();
StringBuilder myStr3=new StringBuilder();
int insert= editDist(str1, str2, m, n - 1,myStr1);
int remove=editDist(str1, str2, m - 1, n,myStr2);
int replace=editDist(str1, str2, m - 1, n-1,myStr3);
if(insert<remove&&insert<replace){
myStr1.insert(0,str2.charAt(n-1)+"+");
str.setLength(0);
str.append(myStr1);
}
else if(remove<insert&&remove<replace){
// A removal deletes the last considered character of str1, so record str1.charAt(m-1).
myStr2.insert(0,str1.charAt(m-1)+"-");
str.setLength(0);
str.append(myStr2);
}
else{
myStr3.insert(0,str2.charAt(n-1)+"+"+str1.charAt(m-1)+"-");
str.setLength(0);
str.append(myStr3);
}
return 1+min(insert,remove,replace);
}
// Driver Code
public static void main(String args[])
{
String str1 = "sunday";
String str2 = "saturday";
StringBuilder ans=new StringBuilder();
System.out.println(editDist(
str1, str2, str1.length(), str2.length(),ans ));
System.out.println(ans.reverse().toString());
}
}
3
+a+t-n+r
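The answer mentions that DP can be used instead of plain recursion. A minimal bottom-up sketch of that variant is shown below; it only computes the distance, not the list of operations, and the class name EditDistanceDp is hypothetical.

```java
// Hypothetical class name; a bottom-up version of the recursion above.
class EditDistanceDp {

    // Returns the minimum number of insertions, removals and replacements
    // needed to convert a into b.
    static int editDistance(String a, String b) {
        int m = a.length();
        int n = b.length();
        // dp[i][j] = edit distance between the first i chars of a and the first j chars of b.
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 0; i <= m; i++) {
            dp[i][0] = i; // remove all i characters
        }
        for (int j = 0; j <= n; j++) {
            dp[0][j] = j; // insert all j characters
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    dp[i][j] = dp[i - 1][j - 1]; // last characters match, no cost
                } else {
                    int insert = dp[i][j - 1];
                    int remove = dp[i - 1][j];
                    int replace = dp[i - 1][j - 1];
                    dp[i][j] = 1 + Math.min(insert, Math.min(remove, replace));
                }
            }
        }
        return dp[m][n];
    }

    public static void main(String[] args) {
        System.out.println(editDistance("sunday", "saturday")); // prints 3
    }
}
```

The table has (m + 1) x (n + 1) entries and each is filled once, so this avoids the exponential blow-up of the naive recursion.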
What about this snippet?
public static void strDiff(String hear, String dear){
String[] hr = dear.split("\n");
for (String h : hr) {
if (!hear.contains(h)) {
System.err.println(h);
}
}
}
