Finding a repeated pattern in a string - java

How can I find a repeated pattern in a string? For example, if the input file were
AAAAAAAAA
ABABAB
ABCAB
ABAb
it would output:
A
AB
ABCAB
ABAb

If you use regex, you only need one line:
String repeated = str.replaceAll("(.+?)\\1+", "$1");
Breaking down the regex (.+?)\1+:
(.+?) means "at least one character, but as few as possible, captured as group 1"
\1+ means "the same character(s) as group 1, repeated one or more times"
Here's some test code:
String[] strs = {"AAAAAAAAA", "ABABAB", "ABCAB", "ABAb"};
for (String str : strs) {
String repeated = str.replaceAll("(.+?)\\1+", "$1");
System.out.println(repeated);
}
Output:
A
AB
ABCAB
ABAb

This outputs what you ask for - the regex can probably be improved to avoid the loop but I can't manage to fix it...
public static void main(String[] args) {
List<String> inputs = Arrays.asList("AAAAAAAAA", "ABABAB", "ABCAB", "ABAb");
for (String s : inputs) System.out.println(findPattern(s));
}
private static String findPattern(String s) {
String output = s;
String temp;
while (true) {
temp = output.replaceAll("(.+)\\1", "$1");
if (temp.equals(output)) break;
output = temp;
}
return output;
}

Written in C#, but translation should be trivial.
public static string FindPattern(string s)
{
for (int length = 1; length <= s.Length / 2; length++)
{
string pattern = s.Substring(0, length);
if(MatchesPattern(s, pattern))
{
return pattern;
}
}
return s;
}
public static bool MatchesPattern(string s, string pattern)
{
for (int i = 0; i < s.Length; i++)
{
if(!s[i].Equals(pattern[i%pattern.Length]))
{
return false;
}
}
return true;
}

If you might have spaces between the repeated segment:
(.+?)(\\ ?\\1)+
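As a rough illustration of how that variant behaves, here is a minimal sketch that plugs it into the same replaceAll call used in the first answer. The class name SpacedRepeats and the method name collapse are made up for this example.

```java
class SpacedRepeats {
    // Collapses a segment that repeats back to back, with or without
    // a single space between the copies, down to one copy.
    static String collapse(String str) {
        return str.replaceAll("(.+?)(\\ ?\\1)+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("AB AB AB")); // copies separated by spaces
        System.out.println(collapse("ABABAB"));   // copies with no separator
        System.out.println(collapse("ABCAB"));    // no full repeat, left unchanged
    }
}
```

The optional space sits inside the repeated group, so each extra copy may bring its own leading space.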

You don't need a regexp to find the pattern. The Knuth-Morris-Pratt (KMP) algorithm can do this much faster.
func getPattern(s string) string {
res := make([]int, len(s)+1)
i := 0
j := -1
res[0] = -1
var patternLength int
for i < len(s) {
if j == -1 || s[i] == s[j] {
i++
j++
res[i] = j
if res[i] == 0 {
patternLength++
} else {
break
}
} else {
j = res[j]
}
}
if patternLength == len(s) {
patternLength = 0
}
return s[:patternLength]
}
Unit Tests
func Test_getPattern(t *testing.T) {
testCases := []struct {
str1 string
expected string
}{
{"AAAAAAAAA", "A"},
{"ABCABC", "ABC"},
{"ABABAB", "AB"},
{"LEET", ""},
}
for _, tc := range testCases {
actual := getPattern(tc.str1)
if tc.expected != actual {
t.Errorf("Source: s1:%s\n Expected:%s\n Actual: %s",
tc.str1,
tc.expected,
actual)
}
}
}
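Since the question is about Java, here is a hedged sketch of the same idea in that language. It is not a line-by-line port of the Go code: it builds the textbook KMP failure table and derives the shortest period from it, and it assumes you want the whole string back when nothing repeats (as in the question's expected output). The names RepeatedUnit and smallestRepeatingUnit are invented for this example.

```java
class RepeatedUnit {
    // Returns the shortest prefix that, repeated a whole number of times,
    // rebuilds s; returns s itself when no such shorter prefix exists.
    static String smallestRepeatingUnit(String s) {
        int n = s.length();
        if (n == 0) {
            return s;
        }
        // fail[i] = length of the longest proper prefix of s[0..i]
        // that is also a suffix of s[0..i]
        int[] fail = new int[n];
        for (int i = 1, k = 0; i < n; i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = fail[k - 1];
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        int period = n - fail[n - 1];
        // the prefix only counts if it tiles the string exactly
        return (period < n && n % period == 0) ? s.substring(0, period) : s;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"AAAAAAAAA", "ABABAB", "ABCAB", "ABAb"}) {
            System.out.println(smallestRepeatingUnit(s));
        }
    }
}
```

The table is built in linear time, so this avoids the backtracking a regex engine may do on long inputs.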

Related

How to remove repeating code in this solution?

I have this code which compresses characters in the given string and replaces repeated adjacent characters with their count.
Consider the following example:
Input:
aaabbccdsa
Expecting output:
a3b2c2dsa
My code works properly, but I think the repeated if condition can be removed.
public class Solution {
public static String getCompressedString(String str) {
String result = "";
char anch = str.charAt(0);
int count = 0;
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
if (ch == anch) {
count++;
} else {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
}
if (i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
}
}
return result;
}
}
In this solution, the code below is repeated twice:
if (count == 1) {
result += anch;
} else {
result += anch + Integer.toString(count);
}
Please, note, I don't want to use a separate method for repeating logic.
You could do away with the if statements.
public static String getCompressedString(String str) {
char[] a = str.toCharArray();
StringBuilder sb = new StringBuilder();
for(int i=0,j=0; i<a.length; i=j){
for(j=i+1;j < a.length && a[i] == a[j]; j++);
sb.append(a[i]).append(j-i==1?"":j-i);
}
return sb.toString();
}
You can do something like this:
public static String getCompressedString(String str) {
String result = "";
int count = 1;
for (int i = 0; i < str.length(); i++) {
if (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
count++;
} else {
if (count == 1) {
result += str.charAt(i);
} else {
result += str.charAt(i) + "" + count;
count = 1;
}
}
}
return result;
}
I got rid of the repeated code, and it does what is intended.
You can use this approach as explained below:
Code:
public class Test {
public static void main(String[] args) {
String s = "aaabbccdsaccbbaaadsa";
char[] strArray = s.toCharArray();
char ch0 = strArray[0];
int counter = 0;
StringBuilder sb = new StringBuilder();
for(int i=0;i<strArray.length;i++){
if(ch0 == strArray[i]){//check for consecutive characters and increment the counter
counter++;
} else { // when character changes while iterating
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
counter = 1; // reset the counter to 1
ch0 = strArray[i]; // reset the ch0 with the current character
}
if(i == strArray.length-1){// case for last element of the string
sb.append(ch0 + "" + (counter > 1 ? counter : ""));
}
}
System.out.println(sb);
}
}
Sample Input/Output:
Input:: aaabbccdsaccbbaaadsa
Output:: a3b2c2dsac2b2a3dsa
Input:: abcdaaaaa
Output:: abcda5
Since the body of the else and the body of the second if are the same, we can merge them by updating the condition. The updated body of the function will be:
String result = "";
char anch = str.charAt(0);
int count = 0;
char ch = str.charAt(0); // declare ch outside the loop, and initialize to avoid error
for (int i = 0; i < str.length(); i++) {
ch = str.charAt(i);
if (ch == anch) {
count++;
}
// check if the second condition is false, or if we are at the end of the string
if (ch != anch || i == str.length() - 1) {
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
anch = ch;
count = 1;
}
}
// the loop has already flushed every run except in one case:
// when the last character starts a new run of its own,
// that single character still has to be appended
int len = str.length();
if (len > 1 && str.charAt(len - 1) != str.charAt(len - 2)) {
result += ch;
}
return result;
Test Cases:
I have tested the solution on the following inputs:
aaabbccdsa -> a3b2c2dsa
aaab -> a3b
aaa -> a3
ab -> ab
aabbc -> a2b2c
Optional
If you want to make it shorter, you can update these 2 conditions.
if (count == 1) { // from here
result += anch;
} else {
result += anch + Integer.toString(count);
} // to here
as
result += anch;
if (count != 1) { // from here
result += count;// no need to convert (implicit conversion)
} // to here
Here's a single-statement solution using Stream API and regular expressions:
public static final Pattern GROUP_OF_ONE_OR_MORE = Pattern.compile("(.)\\1*");
public static String getCompressedString(String str) {
return GROUP_OF_ONE_OR_MORE.matcher(str).results()
.map(MatchResult::group)
.map(s -> s.charAt(0) + (s.length() == 1 ? "" : String.valueOf(s.length())))
.collect(Collectors.joining());
}
main()
public static void main(String[] args) {
System.out.println(getCompressedString("aaabbccdsa"));
System.out.println(getCompressedString("awswwwhhhp"));
}
Output:
a3b2c2dsa // "aaabbccdsa"
awsw3h3p // "awswwwhhhp"
How does it work
A regular expression "(.)\\1*" is capturing a group (.) of identical characters of length 1 or greater. Where . - denotes any symbol, and \\1 is a back reference to the group.
Method Matcher.results() "returns a stream of match results for each subsequence of the input sequence that matches the pattern".
The only thing left is to evaluate the length of each group and transform it accordingly before collecting into the resulting String.
Links:
A quick tutorial on Regular Expressions.
Official tutorials on lambda expressions and streams
You can use a function which has the following 3 parameters : result, anch, count .
something of this sort:
private static String extractedFunction(String result,int count, char anch) {
return count ==1 ? (result + anch) : (result +anch+Integer.toString(count) );
}
make a function call from those two points like this :
result = extractedFunction(result,count,anch);
Try this.
static final Pattern PAT = Pattern.compile("(.)\\1*");
static String getCompressedString(String str) {
return PAT.matcher(str)
.replaceAll(m -> m.group(1)
+ (m.group().length() == 1 ? "" : m.group().length()));
}
Test cases:
@Test
public void testGetCompressedString() {
assertEquals("", getCompressedString(""));
assertEquals("a", getCompressedString("a"));
assertEquals("abc", getCompressedString("abc"));
assertEquals("abc3", getCompressedString("abccc"));
assertEquals("a3b2c2dsa", getCompressedString("aaabbccdsa"));
}
The regular expression "(.)\\1*" used here matches any sequence of identical characters. .replaceAll() takes a lambda expression as an argument, evaluates the lambda expression each time the pattern matches, and replaces the original string with the result.
The lambda expression is passed a MatchResult object containing the results of the match. Here we receive this object in the variable m. m.group() returns the entire matched substring, and m.group(1) returns its first character.
If the input string is "aaabbccdsa", it will be processed as follows.
m.group(1)  m.group()  returned by lambda
a           aaa        a3
b           bb         b2
c           cc         c2
d           d          d
s           s          s
a           a          a

parsing/converting task with characters and numbers within

It is necessary to repeat the character, as many times as the number behind it.
They are positive integer numbers.
case #1
input: "abc3leson11"
output: "abccclesonnnnnnnnnnn"
I have already finished it in the following way:
String a = "abbc2kd3ijkl40ggg2H5uu";
String s = a + "*";
String numS = "";
int cnt = 0;
for (int i = 0; i < s.length(); i++) {
char ch = s.charAt(i);
if (Character.isDigit(ch)) {
numS = numS + ch;
cnt++;
} else {
cnt++;
try {
for (int j = 0; j < Integer.parseInt(numS); j++) {
System.out.print(s.charAt(i - cnt));
}
if (i != s.length() - 1 && !Character.isDigit(s.charAt(i + 1))) {
System.out.print(s.charAt(i));
}
} catch (Exception e) {
if (i != s.length() - 1 && !Character.isDigit(s.charAt(i + 1))) {
System.out.print(s.charAt(i));
}
}
cnt = 0;
numS = "";
}
}
But I wonder is there some better solution with less and cleaner code?
Could you take a look at the code below? I'm using StringUtils from Apache Commons Lang to repeat the character:
public class MicsTest {
public static void main(String[] args) {
String input = "abc3leson11";
String output = input;
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(input);
while (m.find()) {
int number = Integer.valueOf(m.group());
char repeatedChar = input.charAt(m.start()-1);
output = output.replaceFirst(m.group(), StringUtils.repeat(repeatedChar, number - 1)); // the character is already present once
}
System.out.println(output);
}
}
In case you don't want to use StringUtils, you can use the custom method below to achieve the same effect:
public static String repeat(char c, int times) {
char[] chars = new char[times];
Arrays.fill(chars, c);
return new String(chars);
}
Using Java's basic regex support makes it terser, as follows:
public class He1 {
private static final Pattern pattern = Pattern.compile("[a-zA-Z]+(\\d+).*");
// match the number between or the last using regx;
public static void main(String... args) {
String s = "abc3leson11";
System.out.println(parse(s));
s = "abbc2kd3ijkl40ggg2H5uu";
System.out.println(parse(s));
}
private static String parse(String s) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
int num = Integer.valueOf(matcher.group(1));
char prev = s.charAt(s.indexOf(String.valueOf(num)) - 1);
// locate the char before the number;
String repeated = new String(new char[num-1]).replace('\0', prev);
// since the prev is not deleted, we have to decrement the repeating number by 1;
s = s.replaceFirst(String.valueOf(num), repeated);
matcher = pattern.matcher(s);
}
return s;
}
}
And the output should be:
abccclesonnnnnnnnnnn
abbcckdddijkllllllllllllllllllllllllllllllllllllllllggggHHHHHuu
String g(String a){
String result = "";
String[] array = a.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
//System.out.println(java.util.Arrays.toString(array));
for(int i=0; i<array.length; i++){
String part = array[i];
result += part;
if(++i == array.length){
break;
}
char charToRepeat = part.charAt(part.length() - 1);
result += repeat(charToRepeat+"", Integer.parseInt(array[i]) - 1);
}
return result;
}
// In Java 11 this could be removed and replaced with the builtin `str.repeat(amount)`
String repeat(String str, int amount){
return new String(new char[amount]).replace("\0", str);
}
Try it online.
Explanation:
The split will split the letters and numbers:
abbc2kd3ijkl40ggg2H5uu would become ["abbc", "2", "kd", "3", "ijkl", "40", "ggg", "2", "H", "5", "uu"]
We then loop over the parts and add any strings as is to the result.
We then increase i by 1 first and if we're done (after the "uu") in the array above, it will break the loop.
If not the increase of i will put us at a number. So it will repeat the last character of the part x amount of times, where x is the number we found minus 1.
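Picking up the note about Java 11's built-in repeat, the whole task can also be sketched with Matcher.replaceAll and a lambda (available since Java 9). This is only an illustration under the assumption that every number directly follows the character it applies to; the names RepeatByCount, CHAR_AND_COUNT and expand are made up here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatByCount {
    // group 1: one non-digit character, group 2: the count that follows it
    private static final Pattern CHAR_AND_COUNT = Pattern.compile("(\\D)(\\d+)");

    static String expand(String input) {
        return CHAR_AND_COUNT.matcher(input).replaceAll(m ->
                // quoteReplacement keeps '$' and '\' in the input from being
                // interpreted as replacement syntax
                Matcher.quoteReplacement(
                        m.group(1).repeat(Integer.parseInt(m.group(2)))));
    }

    public static void main(String[] args) {
        System.out.println(expand("abc3leson11"));
        System.out.println(expand("abbc2kd3ijkl40ggg2H5uu"));
    }
}
```

Because the match consumes both the character and its count, the replacement repeats the character the full number of times rather than the number minus 1.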
Here is another solution:
String str = "abbc2kd3ijkl40ggg2H5uu";
String[] part = str.split("(?<=\\d)(?=\\D)|(?=\\d)(?<=\\D)");
String res = "";
for(int i=0; i < part.length; i++){
if(i%2 == 0){
res = res + part[i];
}else {
res = res + StringUtils.repeat(part[i-1].charAt(part[i-1].length()-1),Integer.parseInt(part[i])-1);
}
}
System.out.println(res);
Yet another solution :
public static String getCustomizedString(String input) {
ArrayList<String > letters = new ArrayList<>(Arrays.asList(input.split("(\\d)")));
letters.removeAll(Arrays.asList(""));
ArrayList<String > digits = new ArrayList<>(Arrays.asList(input.split("(\\D)")));
digits.removeAll(Arrays.asList(""));
for(int i=0; i< digits.size(); i++) {
int iteration = Integer.valueOf(digits.get(i));
String letter = letters.get(i);
char c = letter.charAt(letter.length()-1);
for (int j = 0; j<iteration -1 ; j++) {
letters.set(i,letters.get(i).concat(String.valueOf(c)));
}
}
String finalResult = "";
for (String str : letters) {
finalResult += str;
}
return finalResult;
}
The usage:
public static void main(String[] args) {
String testString1 = "abbc2kd3ijkl40ggg2H5uu";
String testString2 = "abc3leson11";
System.out.println(getCustomizedString(testString1));
System.out.println(getCustomizedString(testString2));
}
And the result:
abbcckdddijkllllllllllllllllllllllllllllllllllllllllggggHHHHHuu
abccclesonnnnnnnnnnn

How to write a Loop that replaces occurrence of characters from a String?

This is something I have been trying to do since morning but no luck so far.
Without the use of "regex" or replace() of String, but only a loop, write a method that replaces each occurrence of a string in parentString with something else.
I was able to implement a version where replaceWith is a char, but had no luck when replaceWith is a String, as in the template below.
public String replaceWith(String parentString, String occurrence, String replaceWith){
String newString; //Initialize
//loop through "parentString",
//find and replace "occurence" with "replaceWith"
return newString;
}
Use a string searching algorithm that checks the characters of the occurrence against the characters of the parent string, up to the length of the occurrence. Something like this naive search, which is the baseline that the Rabin-Karp algorithm improves on:
function NaiveSearch(string s[1..n], string pattern[1..m])
for i from 1 to n-m+1
for j from 1 to m
if s[i+j-1] ≠ pattern[j]
jump to next iteration of outer loop
return i
return not found
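The pseudocode above translates fairly directly to Java. The following is a sketch, not a definitive implementation: it uses 0-based indexes, adds a start offset so the search can resume after each hit, and treats an empty occurrence as "nothing to replace" (an assumption the question does not address). The class name LoopReplace and the helper naiveSearch are invented for this example; it uses neither regex nor String.replace().

```java
class LoopReplace {
    // Index of the first occurrence of pattern in s at or after 'from', or -1.
    static int naiveSearch(String s, String pattern, int from) {
        int n = s.length();
        int m = pattern.length();
        outer:
        for (int i = from; i <= n - m; i++) {
            for (int j = 0; j < m; j++) {
                if (s.charAt(i + j) != pattern.charAt(j)) {
                    continue outer; // jump to next iteration of the outer loop
                }
            }
            return i;
        }
        return -1;
    }

    static String replaceWith(String parentString, String occurrence, String replaceWith) {
        if (occurrence.isEmpty()) {
            return parentString; // assumption: nothing to replace
        }
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        int hit;
        while ((hit = naiveSearch(parentString, occurrence, pos)) != -1) {
            sb.append(parentString, pos, hit); // text before the match
            sb.append(replaceWith);
            pos = hit + occurrence.length();   // continue after the match
        }
        sb.append(parentString, pos, parentString.length()); // remaining tail
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceWith("I replace banana, banana and some more banana", "banana", "apple"));
    }
}
```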
Something like that:
public String replace(String source, String target, String replacement) {
int targetLength = target.length();
if (targetLength == 0 || source.length() < targetLength) {
return source;
}
String result = source;
int i = 0;
// result changes length as we replace, so re-check its length on every pass
while (i <= result.length() - targetLength) {
String before = result.substring(0, i);
String substring = result.substring(i, i + targetLength); // end index is exclusive
String after = result.substring(i + targetLength);
if (substring.equals(target)) {
result = before.concat(replacement).concat(after);
i += replacement.length(); // skip past the inserted text
} else {
i++;
}
}
return result;
}
My quick solution was this. No guarantees all output is correct.
public static String replaceWith(String s, String find, String replace) {
StringBuilder sb = new StringBuilder();
int findLength = find.length();
int sourceLength = s.length();
for (int i = 0; i < sourceLength; i++) {
String nextSubstring;
if (i + findLength >= sourceLength) {
nextSubstring = s.substring(i);
} else {
nextSubstring = s.substring(i, i + findLength);
}
if (nextSubstring.equals(find)) {
sb.append(replace);
i += findLength - 1;
} else {
sb.append(s.charAt(i));
}
}
return sb.toString();
}
Sample test
replaceWith("Hello World", "Hello", "World") => "World World"
replaceWith("HelloHelloWorld", "Hello", "World") => "WorldWorldWorld"
replaceWith("I replace banana, banana and some more banana", "banana", "apple") => "I replace apple, apple and some more apple"
I wrote this fast solution:
import java.util.*;
import java.lang.*;
import java.io.*;
class Ideone
{
public static String replaceWith(String parentString, String occurrence, String replaceWith){
String newString = "";
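for(int i = 0; i < parentString.length(); ++i) {
// near the end there may be too few characters left for a match; keep them as they are
boolean add = i + occurrence.length() > parentString.length();
for(int j = 0; !add && j < occurrence.length(); ++j) {
if(parentString.charAt(i+j) != occurrence.charAt(j)) add = true;
}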
if(add) {
newString += parentString.charAt(i);
} else {
i += occurrence.length()-1;
newString += replaceWith;
}
}
return newString;
}
public static void main (String[] args) throws java.lang.Exception
{
System.out.println(replaceWith("I replace banana, banana and some more banana", "banana", "apple"));
}
}
You can see this working here: https://ideone.com/GwB7Ba
It outputs:
I replace apple, apple and some more apple

Convert alternate char to uppercase

I am new to java programming. I want to print a string with alternate characters in UpperCase.
String x=jTextField1.getText();
x=x.toLowerCase();
int y=x.length();
for(int i=1;i<=y;i++)
{}
I don't know how to proceed further. I want to do this question with the help of looping and continue function.
Help would be appreciated. Thanks.
@Test
public void alternateUppercase(){
String testString = "This is a !!!!! test - of the emergency? broadcast System.";
char[] arr = testString.toLowerCase().toCharArray();
boolean makeUppercase = true;
for (int i=0; i<arr.length; i++) {
if(makeUppercase && Character.isLetter(arr[i])) {
arr[i] = Character.toUpperCase(arr[i]);
makeUppercase = false;
} else if (!makeUppercase && Character.isLetter(arr[i])) {
makeUppercase = true;
}
}
String convertedString = String.valueOf(arr);
System.out.println(convertedString);
}
First, Java indexes start at 0 (not 1). I think you are asking for something as simple as alternating calls to Character.toLowerCase(char) and Character.toUpperCase(char) based on the result of the index modulo 2 (the remainder of division by 2).
String x = jTextField1.getText();
for (int i = 0, len = x.length(); i < len; i++) {
char ch = x.charAt(i);
if (i % 2 == 0) {
System.out.print(Character.toLowerCase(ch));
} else {
System.out.print(Character.toUpperCase(ch));
}
}
System.out.println();
Strings start at index 0 and finish at index x.length()-1
To look up a character by index you can use String.charAt(i)
To convert a character to upper case you can do Character.toUpperCase(ch);
I suggest you build a StringBuilder from these characters which you can toString() when you are done.
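Put together, those hints could look like the sketch below. It assumes the characters at even indexes should be uppercase and the rest lowercase; the class name AlternateCase and the method name alternate are made up for this example.

```java
class AlternateCase {
    static String alternate(String x) {
        StringBuilder sb = new StringBuilder(x.length());
        // valid indexes run from 0 to x.length() - 1
        for (int i = 0; i < x.length(); i++) {
            char ch = x.charAt(i);
            // even index: uppercase, odd index: lowercase
            sb.append(i % 2 == 0 ? Character.toUpperCase(ch) : Character.toLowerCase(ch));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(alternate("hello world")); // prints "HeLlO WoRlD"
    }
}
```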
You can do it using the distance of 32 between lowercase and uppercase letters (a is 97, A is 65) in the Unicode table, like:
String str = "abbfFDdshFSDjdFDSsfsSdoi".toLowerCase();
char c;
boolean state = false;
String newStr = "";
for (int i=0; i<str.length(); i++){
c = str.charAt(i);
if (state){
newStr += c;
}
else {
newStr += (char) (c - 32); // only valid for the letters a-z
}
state = !state;
}
I'm sure there is a slicker way to do this, but this will work for a 2 minute-answer:
public String homeWork(){
String x = "Hello World";
StringBuilder sb = new StringBuilder();
for(int i=0;i<x.length();i++){
char c = x.charAt(i);
if(i%2==0){
sb.append(String.valueOf(c).toUpperCase());
} else {
sb.append(String.valueOf(c).toLowerCase());
}
}
return sb.toString();
}
To explain i%2==0, if the remainder of i divided by 2 is equal to zero (even numbered) return true
public class PrintingStringInAlternativeCase {
public static void main(String s[])
{
String testString = "TESTSTRING";
String output = "";
for (int i = 0; i < testString.length(); i++) {
if(i%2 == 0)
{
output += Character.toUpperCase(testString.toCharArray()[i]);
}else
{
output += Character.toLowerCase(testString.toCharArray()[i]);
}
}
System.out.println("Newly generated string is as follow: "+ output);
}
}
Using as much of your code as I could, here's what I got. First I made a string called build that will help you build your resulting string. Also, I changed the index to go from [0,size-1], not [1,size]. Using modulo division by 2 helps with the "every other" bit.
String build ="";
String x=jTextField1.getText();
x=x.toLowerCase();
int y=x.length();
for(int i=0;i<y;i++)
{
if(i%2==0){
build+=Character.toUpperCase(x.charAt(i));
}else{
build+=x.charAt(i);
}
}
x=build; //not necessary, you could just use build.
Happy coding! Leave a comment if you have any questions.
public static void main(String[] args)
{
Scanner sc=new Scanner(System.in);
System.out.println("Enter String");
String str=sc.nextLine();
for(int i=0;i<str.length();i++)
{
if(i%2==0)
{
System.out.print(Character.toLowerCase(str.charAt(i)));
}
else
{
System.out.print(Character.toUpperCase(str.charAt(i)));
}
}
sc.close();
}
Java 8 Solution:
static String getMixedCase(String str) {
char[] chars = str.toCharArray();
return IntStream.range(0, str.length())
.mapToObj(i -> String.valueOf(i % 2 == 1 ? chars[i] : Character.toUpperCase(chars[i])))
.collect(Collectors.joining());
}
public class ClassC {
public static void main(String[] args) {
String str = "Hello";
StringBuffer strNew = new StringBuffer();
for (int i = 0; i < str.length(); i++) {
if (i % 2 == 0) {
strNew.append(Character.toLowerCase(str.charAt(i)));
} else {
strNew.append(Character.toUpperCase(str.charAt(i)));
}
}
System.out.println(strNew);
}
}

Substring before numeric

I have :
String word = "It cost me 500 box";
What I want to do is to display this sentence like this :
It cost me
500 box
I need a general method, not only for this example.
Can you help me, please?
As suggested, you can use a regular expression to do this job. Below is a code snippet that can do the trick for you:
String word = "It cost me 500 box";
Pattern p = Pattern.compile("(.* )([0-9].*)");
Matcher m = p.matcher(word);
if(m.matches()) {
System.out.println(m.group(1));
System.out.println(m.group(2));
}
Hope this helps.
Not the optimum way but an easy one:
On top of your Activity:
String finalText="";
public static boolean isNumber(String string)
{
try
{
double d = Double.parseDouble(string);
}
catch(NumberFormatException e)
{
return false;
}
return true;
}
In your code:
String word = "It cost me 500 box";
for (int i=0 ; i<word.length() ; i++){
String a = Character.toString(word.charAt(i));
if (isNumber(a)){
finalText+="\n";
for (int j=i ; j<word.length() ; j++){
String b = Character.toString(word.charAt(j));
finalText+=b;
}
i = word.length();
}
else{
finalText+=a;
}
}
textView.setText(finalText);
I don't know of any built-in function for this, but one way of doing it would be:
void match(String string) {
int numIndex = -1;
int charIndex = 0;
if (string.length() > 0) {
while (numIndex == -1 && charIndex < string.length()) {
if (Character.isDigit(string.charAt(charIndex)))
numIndex = charIndex;
charIndex++;
}
}
if (numIndex != -1) {
System.out.println(string.substring(0, numIndex));
System.out.println(string.substring(numIndex));
}
}
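A shorter alternative, offered as a sketch rather than something taken from the answers above, is String.split with a lookahead and a limit of 2, so the string is cut once, right before the first digit. The class name SplitBeforeNumber and the method name splitAtFirstDigit are made up for this example.

```java
class SplitBeforeNumber {
    // Splits once, immediately before the first digit. If the string
    // contains no digit, the result has a single element.
    static String[] splitAtFirstDigit(String word) {
        return word.split("(?=\\d)", 2);
    }

    public static void main(String[] args) {
        for (String part : splitAtFirstDigit("It cost me 500 box")) {
            System.out.println(part);
        }
    }
}
```

The lookahead matches a position rather than a character, so nothing is removed from the text; the first part keeps its trailing space.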
