How to check if a pair already exists? - java

I have a string, say "abab", and I'm splitting it into pairs (i.e. ab, ab). If a pair already exists, I don't want it to be generated again. How do I do that?
Here's the code for what I've tried:
String r="abab";
String pair[] = new String[r.length()/2];
for( int i = 0; i <pair.length; i++ )
{
pair[i] = r.substring(i*2,(i*2)+2);
}

Before adding a pair to the array, you could check whether it already exists with java.util.Arrays.asList(pair).contains(...). If the pair already exists, don't add it. For example, here the second ab and the second fe will not be added (their slots in the array stay null):
String r="ababtefedefe";
String pair[] = new String[r.length()/2];
String currentPair = "";
for( int i = 0; i <pair.length; i++ )
{
currentPair = r.substring(i*2,(i*2)+2);
if(!java.util.Arrays.asList(pair).contains(currentPair))
pair[i] = currentPair;
System.out.println(pair[i]);
}
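The array-based snippet above leaves null in the slots of skipped pairs. As a minimal sketch of the same contains-check idea, the pairs could be collected in a List so that no gaps appear. The class and method names (UniquePairs, uniquePairs) are made up for illustration, and a trailing odd character is assumed to be ignored.

```java
import java.util.ArrayList;
import java.util.List;

class UniquePairs {
    // Hypothetical helper: collects each two-character pair once, in first-seen order.
    static List<String> uniquePairs(String r) {
        List<String> pairs = new ArrayList<>();
        // i + 1 < length: only complete pairs are read, a trailing odd character is ignored
        for (int i = 0; i + 1 < r.length(); i += 2) {
            String current = r.substring(i, i + 2);
            if (!pairs.contains(current)) {
                pairs.add(current);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(uniquePairs("ababtefedefe")); // prints [ab, te, fe, de]
    }
}
```

List.contains is a linear scan, so for long inputs a Set is probably the better fit.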

I would use a Set to help me out.
private String[] retrieveUniquePair(String input) {
int dim = input.length() / 2;
Set<String> pairs = new LinkedHashSet<>(dim);
for (int i = 0; i + 1 < input.length(); i += 2) { // visit every complete pair
String currentPair = input.substring(i, i + 2);
pairs.add(currentPair);
}
return pairs.toArray(new String[] {});
}
Edit:
Here is the solution I propose, together with a TestNG test:
public class PairTest {
@DataProvider(name = "input")
public static Object[][] input() {
return new Object[][] {
{"abcd", Arrays.asList("ab", "cd")},
{"abcde", Arrays.asList("ab", "cd")},
{"abcdab", Arrays.asList("ab", "cd")},
{"ababcdcd", Arrays.asList("ab", "cd")},
{"ababtefedefe", Arrays.asList("ab", "te", "fe", "de")},
};
}
@Test(dataProvider = "input")
public void test(String input, List<String> expectedOutput) {
String[] output = retrieveUniquePair(input);
Assert.assertNotNull(output);
Assert.assertEquals(output.length, expectedOutput.size());
for (String pair : output) {
Assert.assertTrue(expectedOutput.contains(pair));
}
}
private String[] retrieveUniquePair(String input) {
int pairNumber = input.length() / 2;
Set<String> pairs = new LinkedHashSet<>(pairNumber);
int endIteration = input.length();
if (input.length() % 2 != 0) { // odd number
endIteration--; // ignore last character
}
for (int i = 0; i < endIteration; i += 2) {
String currentPair = input.substring(i, i + 2);
pairs.add(currentPair);
}
return pairs.toArray(new String[0]); // size() - 1 would be -1 for an empty set and throw
}
}

Related

Separate text in k-shingles without Scanner.class in Java

I am trying to separate a text into k-shingles; sadly I cannot use Scanner. If the last shingle is too short, I want to fill it up with "_". I have come this far:
public class Projektarbeit {
public static void main(String[] args) {
testKShingling(7, "ddssggeezzfff");
}
public static void testKShingling(int k, String source) {
//first eliminate whitespace and then fill up with whitespaces to match target.length%shingle.length() == 0
String txt = source.replaceAll("\\s", "");
//get shingles
ArrayList<String> shingles = new ArrayList<String>();
int i;
int l = txt.length();
String shingle = "";
if (k == 1) {
for(i = 0; i < l; i++){
shingle = txt.substring(i, i + k);
shingles.add(shingle);
};
}
else {
for(i = 0; i < l; i += k - 1){
try {
shingle = txt.substring(i, i + k);
shingles.add(shingle);
}
catch(Exception e) {
txt = txt.concat("_");
i -= k - 1;
};
};
}
System.out.println(shingles);
}
}
Output: [ddssgge, eezzfff, f______]
It almost works, but with the parameters given in the example the last shingle is not necessary (it should be [ddssgge, eezzfff]).
Any idea how to do this more elegantly?
To make the posted code work, you only need to add a break at the end of the catch block:
catch(Exception e) {
txt = txt.concat("_");
i -= k - 1;
break;
};
Having said that, I wouldn't use an exception to control program flow. Exceptions are just that: they should be reserved for run-time errors.
Avoid the StringIndexOutOfBoundsException by controlling the loop parameters:
public static void main(String[] args) {
testKShingling(3, "ddssggeezzfff");
}
public static void testKShingling(int substringLength, String source) {
//todo validate input
String txt = source.replaceAll("\\s", "");
//get shingles
ArrayList<String> shingles = new ArrayList<>();
int stringLength = txt.length();
if (substringLength == 1) {
for(int index = 0; index < stringLength; index++){
String shingle = txt.substring(index, index + substringLength);
shingles.add(shingle);
};
}
else {
for(int index = 0; index < stringLength -1 ; index += substringLength - 1){
int endIndex = Math.min(index + substringLength, stringLength);
String shingle = txt.substring(index, endIndex);
if(shingle.length() < substringLength){
shingle = extend(shingle, substringLength);
}
shingles.add(shingle);
};
}
System.out.println(shingles);
}
private static String extend(String shingle, int toLength) {
String s = shingle;
for(int index = 0; index < toLength - shingle.length(); index ++){
s = s.concat("_");
}
return s;
}
An alternative implementation of testKShingling:
public static void testKShingling(int substringLength, String source) {
//todo validate input
String txt = source.replaceAll("\\s", "");
ArrayList<String> shingles = new ArrayList<>();
if (substringLength == 1) {
for(char c : txt.toCharArray()){
shingles.add(Character.toString(c));
};
}
else {
while(txt.length() > substringLength) {
String shingle = txt.substring(0, substringLength);
shingles.add(shingle);
txt = txt.substring(substringLength - 1); //remove first substringLength - 1 chars
}
if(txt.length() < substringLength){ //check the length of what's left
txt = extend(txt, substringLength);
}
shingles.add(txt); //add what's left
}
System.out.println(shingles);
}
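As a sketch only, the stepping used in the loop-based answer above (advance by k - 1, pad the last shingle with "_") can be wrapped in a method that returns the shingles instead of printing them, which makes it easier to check. The names Shingler and shingles are hypothetical, and the input validation is an assumption, not something the posted code does.

```java
import java.util.ArrayList;
import java.util.List;

class Shingler {
    // Hypothetical helper: returns overlapping shingles of length k,
    // padding the last one with '_' when it is too short.
    static List<String> shingles(int k, String source) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1"); // assumed validation
        }
        String txt = source.replaceAll("\\s", "");
        List<String> result = new ArrayList<>();
        // For k > 1 consecutive shingles share one character, so step by k - 1
        int step = (k == 1) ? 1 : k - 1;
        int limit = (k == 1) ? txt.length() : txt.length() - 1;
        for (int start = 0; start < limit; start += step) {
            int end = Math.min(start + k, txt.length());
            StringBuilder shingle = new StringBuilder(txt.substring(start, end));
            while (shingle.length() < k) {
                shingle.append('_');
            }
            result.add(shingle.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(shingles(7, "ddssggeezzfff")); // prints [ddssgge, eezzfff]
        System.out.println(shingles(3, "abcd"));          // prints [abc, cd_]
    }
}
```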

parsing/converting task with characters and numbers within

Each character must be repeated as many times as the number that follows it.
The numbers are positive integers.
case #1
input: "abc3leson11"
output: "abccclesonnnnnnnnnnn"
I have already solved it in the following way:
String a = "abbc2kd3ijkl40ggg2H5uu";
String s = a + "*";
String numS = "";
int cnt = 0;
for (int i = 0; i < s.length(); i++) {
char ch = s.charAt(i);
if (Character.isDigit(ch)) {
numS = numS + ch;
cnt++;
} else {
cnt++;
try {
for (int j = 0; j < Integer.parseInt(numS); j++) {
System.out.print(s.charAt(i - cnt));
}
if (i != s.length() - 1 && !Character.isDigit(s.charAt(i + 1))) {
System.out.print(s.charAt(i));
}
} catch (Exception e) {
if (i != s.length() - 1 && !Character.isDigit(s.charAt(i + 1))) {
System.out.print(s.charAt(i));
}
}
cnt = 0;
numS = "";
}
}
But I wonder: is there a better solution with less and cleaner code?
Could you take a look at the code below? It uses StringUtils from Apache Commons Lang to repeat the character:
public class MicsTest {
public static void main(String[] args) {
String input = "abc3leson11";
String output = input;
Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher(input);
while (m.find()) {
int number = Integer.valueOf(m.group());
char repeatedChar = input.charAt(m.start()-1);
output = output.replaceFirst(m.group(), StringUtils.repeat(repeatedChar, number - 1)); // the original character stays, so add number - 1 copies
}
System.out.println(output);
}
}
In case you don't want to use StringUtils, you can use the custom method below to achieve the same effect:
public static String repeat(char c, int times) {
char[] chars = new char[times];
Arrays.fill(chars, c);
return new String(chars);
}
Using plain Java string regex should make it more terse, as follows:
public class He1 {
private static final Pattern pattern = Pattern.compile("[a-zA-Z]+(\\d+).*");
// match the number in between or at the end using regex;
public static void main(String... args) {
String s = "abc3leson11";
System.out.println(parse(s));
s = "abbc2kd3ijkl40ggg2H5uu";
System.out.println(parse(s));
}
private static String parse(String s) {
Matcher matcher = pattern.matcher(s);
while (matcher.find()) {
int num = Integer.valueOf(matcher.group(1));
char prev = s.charAt(s.indexOf(String.valueOf(num)) - 1);
// locate the char before the number;
String repeated = new String(new char[num-1]).replace('\0', prev);
// since the prev is not deleted, we have to decrement the repeating number by 1;
s = s.replaceFirst(String.valueOf(num), repeated);
matcher = pattern.matcher(s);
}
return s;
}
}
And the output should be:
abccclesonnnnnnnnnnn
abbcckdddijkllllllllllllllllllllllllllllllllllllllllggggHHHHHuu
String g(String a){
String result = "";
String[] array = a.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");
//System.out.println(java.util.Arrays.toString(array));
for(int i=0; i<array.length; i++){
String part = array[i];
result += part;
if(++i == array.length){
break;
}
char charToRepeat = part.charAt(part.length() - 1);
result += repeat(charToRepeat+"", Integer.parseInt(array[i]) - 1);
}
return result;
}
// In Java 11 this could be removed and replaced with the builtin `str.repeat(amount)`
String repeat(String str, int amount){
return new String(new char[amount]).replace("\0", str);
}
Explanation:
The split separates the letters from the numbers:
abbc2kd3ijkl40ggg2H5uu would become ["abbc", "2", "kd", "3", "ijkl", "40", "ggg", "2", "H", "5", "uu"]
We then loop over the parts and append each letter part to the result as is.
Next we increase i by 1; if that takes us past the end of the array (after the "uu" above), we break out of the loop.
If not, the increase of i puts us at a number, so we repeat the last character of the preceding part x times, where x is the number we found minus 1.
Here is another solution:
String str = "abbc2kd3ijkl40ggg2H5uu";
String[] part = str.split("(?<=\\d)(?=\\D)|(?=\\d)(?<=\\D)");
String res = "";
for(int i=0; i < part.length; i++){
if(i%2 == 0){
res = res + part[i];
}else {
res = res + StringUtils.repeat(part[i-1].charAt(part[i-1].length()-1),Integer.parseInt(part[i])-1);
}
}
System.out.println(res);
Yet another solution:
public static String getCustomizedString(String input) {
ArrayList<String > letters = new ArrayList<>(Arrays.asList(input.split("(\\d)")));
letters.removeAll(Arrays.asList(""));
ArrayList<String > digits = new ArrayList<>(Arrays.asList(input.split("(\\D)")));
digits.removeAll(Arrays.asList(""));
for(int i=0; i< digits.size(); i++) {
int iteration = Integer.valueOf(digits.get(i));
String letter = letters.get(i);
char c = letter.charAt(letter.length()-1);
for (int j = 0; j<iteration -1 ; j++) {
letters.set(i,letters.get(i).concat(String.valueOf(c)));
}
}
String finalResult = "";
for (String str : letters) {
finalResult += str;
}
return finalResult;
}
The usage:
public static void main(String[] args) {
String testString1 = "abbc2kd3ijkl40ggg2H5uu";
String testString2 = "abc3leson11";
System.out.println(getCustomizedString(testString1));
System.out.println(getCustomizedString(testString2));
}
And the result:
abbcckdddijkllllllllllllllllllllllllllllllllllllllllggggHHHHHuu
abccclesonnnnnnnnnnn
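For comparison, here is a rough single-pass sketch that avoids both regex and repeated string replacement by appending to a StringBuilder. The names RepeatExpander and expand are invented for this example. It assumes ASCII digits, treats a count of 0 or 1 as "leave the character once", and ignores a number that appears before any character.

```java
class RepeatExpander {
    // Hypothetical helper: repeats the character before each number so that
    // it appears that many times in total.
    static String expand(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char ch = input.charAt(i);
            if (ch < '0' || ch > '9') {
                out.append(ch); // ordinary character: copy as is
                i++;
                continue;
            }
            // Read the whole run of digits as one number
            int count = 0;
            while (i < input.length() && input.charAt(i) >= '0' && input.charAt(i) <= '9') {
                count = count * 10 + (input.charAt(i) - '0');
                i++;
            }
            if (out.length() > 0) {
                char prev = out.charAt(out.length() - 1);
                // prev is already present once, so add count - 1 more copies
                for (int n = 1; n < count; n++) {
                    out.append(prev);
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("abc3leson11"));
        System.out.println(expand("abbc2kd3ijkl40ggg2H5uu"));
    }
}
```

Because the output is built left to right, each input character is visited once, whereas the replaceFirst-based versions rescan the string for every number.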

Get largest Group of anagrams in an array

For an assignment I have been asked to find the largest group of anagrams in a list. I believe I need an accumulation loop inside another loop that keeps track of the largest number of items. The problem is that I don't know how to count how many members each anagram group has. I have been able to sort the array so that anagrams are grouped together: indices 1-3 hold one group, 4-10 another, and so on. How do I walk through the array, count the size of each group, and compare each count to the previous largest?
Sample of the code:
public static String[] getLargestAnagramGroup(String[] inputArray) {
ArrayList<String> largestGroupArrayList = new ArrayList<String>();
if (inputArray == null || inputArray.length == 0) { // null check must come first
return new String[0];
}
insertionSort(inputArray, new AnagramComparator());
String[] largestGroupArray = new String[largestGroupArrayList.size()];
largestGroupArrayList.toArray(inputArray);
System.out.println(largestGroupArray);
return largestGroupArray;
}
UPDATE: This is how we solved it. Is there a more efficient way?
public static String[] getLargestAnagramGroup(String[] inputArray) {
int numberOfAnagrams = 0;
int temporary = 1;
int position = -1;
int index = 0;
if (inputArray == null || inputArray.length == 0) {
return new String[0];
}
insertionSort(inputArray, new AnagramComparator());
for (index = 0; index < inputArray.length - 1; index++) {
if (areAnagrams(inputArray[index], inputArray[index + 1])) {
temporary++;
} else {
if (temporary > numberOfAnagrams) {
numberOfAnagrams = temporary;
position = index;
}
temporary = 1; // a new group starts, whatever the size of the previous one
}
}
if (temporary > numberOfAnagrams) {
position = index;
numberOfAnagrams = temporary;
}
String[] largestArray = new String[numberOfAnagrams];
for (int startIndex = position - numberOfAnagrams + 1, i = 0; startIndex <= position; startIndex++, i++) {
largestArray[i] = inputArray[startIndex];
}
return largestArray;
}
Here is a piece of code to help you out.
public class AnagramTest {
public static void main(String[] args) {
String[] input = {"test", "ttes", "abcd", "dcba", "dbac"};
for (String string : getLargestAnagramGroup(input)) {
System.out.println(string);
}
}
/**
* Returns the largest group of strings that are anagrams of each other.
*
* @param inputArray the words to group
* @return the largest anagram group
*/
public static String[] getLargestAnagramGroup(String[] inputArray) {
// Creating a linked hash map to maintain the order
Map<String, List<String>> map = new LinkedHashMap<String, List<String>>();
for (String string : inputArray) {
char[] charArray = string.toCharArray();
Arrays.sort(charArray);
String sortedStr = new String(charArray);
List<String> anagrams = map.get(sortedStr);
if (anagrams == null) {
anagrams = new ArrayList<String>();
}
anagrams.add(string);
map.put(sortedStr, anagrams);
}
Set<Entry<String, List<String>>> entrySet = map.entrySet();
List<String> l = new ArrayList<String>();
int highestAnagrams = -1;
for (Entry<String, List<String>> entry : entrySet) {
List<String> value = entry.getValue();
if (value.size() > highestAnagrams) {
highestAnagrams = value.size();
l = value;
}
}
return l.toArray(new String[l.size()]);
}
}
The idea is to first find the anagrams. I do that by sorting each string's character array and using the sorted string as a key in a LinkedHashMap.
The original strings are stored in the list under that key, so they can be printed or reused as the result.
You then have to keep track of how many times each anagram key occurs; the key with the largest list gives the answer.
This is my solution in C#.
public static string[] LargestAnagramsSet(string[] words)
{
var maxSize = 0;
var maxKey = string.Empty;
Dictionary<string, List<string>> set = new Dictionary<string, List<string>>();
for (int i = 0; i < words.Length; i++)
{
char[] temp = words[i].ToCharArray();
Array.Sort(temp);
var key = new string(temp);
if (set.ContainsKey(key))
{
set[key].Add(words[i]);
}
else
{
var anagrams = new List<string>
{
words[i]
};
set.Add(key, anagrams);
}
if (set[key].Count() > maxSize)
{
maxSize = set[key].Count();
maxKey = key;
}
}
return string.IsNullOrEmpty(maxKey) ? words : set[maxKey].ToArray();
}
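The C# version above tracks the largest group while it fills the dictionary. A comparable Java sketch, using the same sorted-letters key, might look like this. The names AnagramGroups and largestGroup are illustrative only, and on a tie the group that reached the winning size first is returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AnagramGroups {
    // Hypothetical helper: groups words by their sorted letters and
    // remembers the largest group seen so far in the same pass.
    static List<String> largestGroup(String[] words) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        List<String> best = new ArrayList<>();
        for (String word : words) {
            char[] letters = word.toCharArray();
            Arrays.sort(letters); // anagrams share the same sorted letters
            List<String> group =
                    groups.computeIfAbsent(new String(letters), k -> new ArrayList<>());
            group.add(word);
            if (group.size() > best.size()) {
                best = group;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        String[] input = {"test", "ttes", "abcd", "dcba", "dbac"};
        System.out.println(largestGroup(input)); // prints [abcd, dcba, dbac]
    }
}
```

Sorting each word costs O(m log m) for a word of length m, so the whole pass avoids sorting the array itself with a comparator.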

Getting common string pairs in two string list

Hi, I am counting the common elements of two lists.
Here is my code:
public static int getMatchCount(List<String> listOne, List<String> listTwo) {
String valueOne = "";
String valueTwo = "";
int matchCount = 0;
boolean isMatchedOnce=false;
for (int i = 0; i < listOne.size(); i++) {
valueOne = listOne.get(i);
isMatchedOnce=false;
if (StringUtils.isBlank(valueOne))
continue;
for (int j = 0; j < listTwo.size(); j++) {
valueTwo = listTwo.get(j);
if (StringUtils.isBlank(valueTwo))
continue;
if (valueTwo.equals(valueOne) && (!isMatchedOnce)) {
matchCount++;
listOne.set(i, "");
listTwo.set(j, "");
isMatchedOnce=true;
}
}
}
return matchCount;
}
For example:
listOne listTwo
A A
A B
B
Here the result is 2, not 3, because only two common pairs can be taken out.
But the method is very slow. Is there any improvement to the above method that makes it quicker?
This should be an easier workaround:
List<String> listOne = new ArrayList<String>();
//add elements
List<String> listTwo= new ArrayList<String>();
//add elements
List<String> commonList = new ArrayList<String>(listTwo);
commonList.retainAll(listOne);
int commonListSize = commonList.size();
Use an interim Collection and addAll(), retainAll():
Set<String> set = new HashSet<String>();
set.addAll(list1);
set.retainAll(list2);
int count = set.size();
Maybe you can try this:
public static int getMatchCount(List<String> listOne, List<String> listTwo) {
String valueOne;
String valueTwo;
int matchCount = 0;
boolean isMatchedOnce;
//for (int i = 0; i < listOne.size(); i++) {
for(String i : listOne){
valueOne = i;
isMatchedOnce = false;
if (StringUtils.isBlank(valueOne)) {
continue;
}
for (String j : listTwo) {
valueTwo = j;
if (StringUtils.isBlank(valueTwo)) {
continue;
}
if (valueTwo.equals(valueOne) && (!isMatchedOnce)) {
matchCount++;
listOne.set(listOne.indexOf(i), "");
listTwo.set(listTwo.indexOf(j), ""); // look the index up in listTwo, not listOne
isMatchedOnce = true;
}
}
}
return matchCount;
}
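The nested loops in the question are quadratic, and the retainAll-based suggestions do not pair elements one to one when the lists contain duplicates. One possible sketch that keeps the pairing semantics in linear time counts occurrences in a HashMap. The names MatchCounter and matchCount are made up here, the blank check only approximates StringUtils.isBlank, and unlike the original method the input lists are not modified.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class MatchCounter {
    // Hypothetical helper: counts how many one-to-one pairs of equal,
    // non-blank strings can be formed between the two lists.
    static int matchCount(List<String> listOne, List<String> listTwo) {
        Map<String, Integer> remaining = new HashMap<>();
        for (String value : listOne) {
            if (value == null || value.trim().isEmpty()) {
                continue; // skip blank entries, roughly like StringUtils.isBlank
            }
            remaining.merge(value, 1, Integer::sum);
        }
        int matches = 0;
        for (String value : listTwo) {
            Integer left = remaining.get(value);
            if (left != null && left > 0) {
                remaining.put(value, left - 1); // use up one occurrence from listOne
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(matchCount(Arrays.asList("A", "A", "B"), Arrays.asList("A", "B"))); // prints 2
    }
}
```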

Java: How to split a string by a number of characters?

I searched online for a solution to this question but didn't find anything.
I wrote the following pseudo-code to explain what I'm asking:
String text = "how are you?";
String[] textArray= text.splitByNumber(4); //this method is what I'm asking
textArray[0]; //it contains "how "
textArray[1]; //it contains "are "
textArray[2]; //it contains "you?"
The method splitByNumber splits the string "text" every 4 characters. How can I create this method?
Many thanks
I think that what he wants is to have a string split into substrings of size 4. Then I would do this in a loop:
List<String> strings = new ArrayList<String>();
int index = 0;
while (index < text.length()) {
strings.add(text.substring(index, Math.min(index + 4,text.length())));
index += 4;
}
Using Guava:
Iterable<String> result = Splitter.fixedLength(4).split("how are you?");
String[] parts = Iterables.toArray(result, String.class);
What about a regexp?
public static String[] splitByNumber(String str, int size) {
return (size<1 || str==null) ? null : str.split("(?<=\\G.{"+size+"})");
}
See Split string to equal length substrings in Java
Try this
String text = "how are you?";
String array[] = text.split(" ");
Or you can use the loop below:
List<String> list= new ArrayList<String>();
int index = 0;
while (index<text.length()) {
list.add(text.substring(index, Math.min(index+4,text.length())));
index=index+4;
}
Quick Hack
private String[] splitByNumber(String s, int size) {
if(s == null || size <= 0)
return null;
int chunks = s.length() / size + ((s.length() % size > 0) ? 1 : 0);
String[] arr = new String[chunks];
for(int i = 0, j = 0, l = s.length(); i < l; i += size, j++)
arr[j] = s.substring(i, Math.min(l, i + size));
return arr;
}
Using simple java primitives and loops.
private static String[] splitByNumber(String text, int number) {
int inLength = text.length();
int arLength = inLength / number;
int left=inLength%number;
if(left>0){++arLength;}
String ar[] = new String[arLength];
String tempText=text;
for (int x = 0; x < arLength; ++x) {
if(tempText.length()>number){
ar[x]=tempText.substring(0, number);
tempText=tempText.substring(number);
}else{
ar[x]=tempText;
}
}
return ar;
}
Usage : String ar[]=splitByNumber("nalaka", 2);
I don't think there's an out-of-the-box solution, but I'd do something like this:
private String[] splitByNumber(String s, int chunkSize){
int chunkCount = (s.length() / chunkSize) + (s.length() % chunkSize == 0 ? 0 : 1);
String[] returnVal = new String[chunkCount];
for(int i=0;i<chunkCount;i++){
returnVal[i] = s.substring(i*chunkSize, Math.min((i+1)*chunkSize, s.length())); // end index is exclusive
}
return returnVal;
}
Usage would be:
String[] textArray = splitByNumber(text, 4);
EDIT: the substring actually shouldn't surpass the string length.
This is the simplest solution I could think of. Try this:
public static String[] splitString(String str) {
if(str == null) return null;
List<String> list = new ArrayList<String>();
for(int i=0;i < str.length();i=i+4){
int endindex = Math.min(i+4,str.length());
list.add(str.substring(i, endindex));
}
return list.toArray(new String[list.size()]);
}
Here's a succinct implementation using Java8 streams:
String text = "how are you?";
final AtomicInteger counter = new AtomicInteger(0);
Collection<String> strings = text.chars()
.mapToObj(i -> String.valueOf((char)i) )
.collect(Collectors.groupingBy(it -> counter.getAndIncrement() / 4
,Collectors.joining()))
.values();
Output:
[how , are , you?]
Try this solution,
public static String[]chunkStringByLength(String inputString, int numOfChar) {
if (inputString == null || numOfChar <= 0)
return null;
else if (inputString.length() == numOfChar)
return new String[]{
inputString
};
int chunkLen = (inputString.length() + numOfChar - 1) / numOfChar; // ceiling division; Math.ceil on an int quotient has no effect
String[]chunks = new String[chunkLen];
for (int i = 0; i < chunkLen; i++) {
int start = i * numOfChar;
int end = Math.min(start + numOfChar, inputString.length()); // the last chunk may be shorter
chunks[i] = inputString.substring(start, end); // substring works on chars, so multi-byte text is not corrupted
}
return chunks;
}
My application uses text to speech!
Here is my algorithm: split by "dot" and concatenate the sentences as long as the string length is less than the limit:
String[] text = sentence.split("\\.");
ArrayList<String> realText = sentenceSplitterWithCount(text);
Function sentenceSplitterWithCount (I concatenate strings while the result is shorter than 100 characters; the limit is up to you):
private ArrayList<String> sentenceSplitterWithCount(String[] splittedWithDot){
ArrayList<String> newArticleArray = new ArrayList<>();
String item = "";
for(String sentence : splittedWithDot){
item += DataManager.setFirstCharCapitalize(sentence)+".";
if(item.length() > 100){
newArticleArray.add(item);
item = "";
}
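}
if (!item.isEmpty()) { // keep the trailing sentences that never reached the limit
newArticleArray.add(item);
}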
for (String a : newArticleArray){
Log.d("tts", a);
}
return newArticleArray;
}
The function setFirstCharCapitalize just capitalizes the first letter; I don't think you need it, but here it is anyway:
public static String setFirstCharCapitalize(String input) {
if(input.length()>2) {
String k = checkStringStartWithSpace(input);
input = k.substring(0, 1).toUpperCase() + k.substring(1).toLowerCase();
}
return input;
}
