Count Occurrences of Needle String in Haystack String, most optimally?

The problem is simple: count how many times "ABC" occurs in "ABCDSGDABCSAGAABCCCCAAABAABC" without using String.split("ABC").
Here is the solution I propose. I'm looking for any solutions that might be better than this one.
public static void main(String[] args) {
String haystack = "ABCDSGDABCSAGAABCCCCAAABAABC";
String needle = "ABC";
char [] needl = needle.toCharArray();
int needleLen = needle.length();
int found=0;
char hay[] = haystack.toCharArray();
int index =0;
int chMatched =0;
for (int i=0; i<hay.length; i++){
if (index >= needleLen || chMatched==0)
index=0;
System.out.print("\nchar-->"+hay[i] + ", with->"+needl[index]);
if(hay[i] == needl[index]){
chMatched++;
System.out.println(", matched");
}else {
chMatched=0;
index=0;
if(hay[i] == needl[index]){
chMatched++;
System.out.print("\nchar->"+hay[i] + ", with->"+needl[index]);
System.out.print(", matched");
}else
continue;
}
if(chMatched == needleLen){
found++;
System.out.println("found. Total ->"+found);
}
index++;
}
System.out.println("Result Found-->"+found);
}
It took me a while to create this one. Can someone suggest a better solution (if any)?
P.S. Drop the sysouts if they look messy to you.

How about:
boolean found = haystack.indexOf("ABC") >= 0;
Edit: The question asks for the number of occurrences, so here's a modified version of the above:
public static void main(String[] args)
{
String needle = "ABC";
String haystack = "ABCDSGDABCSAGAABCCCCAAABAABC";
int numberOfOccurences = 0;
int index = haystack.indexOf(needle);
while (index != -1)
{
numberOfOccurences++;
haystack = haystack.substring(index+needle.length());
index = haystack.indexOf(needle);
}
System.out.println("" + numberOfOccurences);
}
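A variation on the loop above, as a minimal sketch: String.indexOf(String, int) accepts a starting index, so the haystack does not have to be re-copied with substring on every hit. The class and method names here are made up for illustration, and the sketch counts non-overlapping matches, like the substring version.

```java
final class IndexOfCounter {
    // Counts non-overlapping occurrences of needle in haystack.
    static int countOccurrences(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // assumption: an empty needle is treated as "no matches"
        }
        int count = 0;
        int from = 0;
        // indexOf(needle, from) resumes the search at position 'from'
        while ((from = haystack.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length(); // jump past the match just found
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("ABCDSGDABCSAGAABCCCCAAABAABC", "ABC"));
    }
}
```

To count overlapping matches instead, advancing with from++ rather than from += needle.length() should be enough.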

If you're looking for an algorithm, search for "Boyer-Moore". You can do this in sub-linear time.
Edit to clarify, and hopefully make all the purists happy: the worst-case time bound on Boyer-Moore is, formally speaking, linear. However, the effective performance is often such that you do far fewer comparisons than you would with a simpler approach, and in particular you can often skip through the "haystack" string without having to check each character.
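To illustrate the skipping idea, here is a rough sketch of Boyer-Moore-Horspool, a simplified variant that keeps only the bad-character shift table. It is not the full Boyer-Moore algorithm, and the class and method names are invented for this example. Unlike the indexOf loops elsewhere in this thread, it also counts overlapping matches.

```java
import java.util.HashMap;
import java.util.Map;

final class HorspoolCounter {
    // Counts occurrences (overlapping ones included) of needle in haystack.
    static int count(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        if (m == 0 || m > n) {
            return 0;
        }
        // For every needle character except the last one, remember how far
        // its last occurrence is from the end of the needle.
        Map<Character, Integer> shift = new HashMap<>();
        for (int k = 0; k < m - 1; k++) {
            shift.put(needle.charAt(k), m - 1 - k);
        }
        int count = 0;
        int pos = 0;
        while (pos <= n - m) {
            // Compare the current window right to left.
            int j = m - 1;
            while (j >= 0 && haystack.charAt(pos + j) == needle.charAt(j)) {
                j--;
            }
            if (j < 0) {
                count++;
            }
            // Shift by the table entry for the window's last character;
            // characters that are not in the table allow a full jump of m.
            char last = haystack.charAt(pos + m - 1);
            pos += shift.getOrDefault(last, m);
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("ABCDSGDABCSAGAABCCCCAAABAABC", "ABC"));
    }
}
```

The jump of m positions on an unseen character is what lets the search skip parts of the haystack without inspecting every character.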

You say your challenge is to find ABC within a string. If all you need is to know if ABC exists within the string, a simple indexOf() test will suffice.
If you need to know the number of occurrences, as your posted code tries to find, a simple approach would be to use a regex:
// requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public static int countOccurrences(String haystack, String regexToFind) {
Pattern p = Pattern.compile(regexToFind);
Matcher m = p.matcher(haystack); // get a matcher object
int count = 0;
while(m.find()) {
count++;
}
return count;
}

Have a look at http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm
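For reference, a compact sketch of how Knuth-Morris-Pratt could be used to count matches. The names are hypothetical, and this is one common formulation of the failure table, not code taken from the linked article. It counts overlapping occurrences and never moves backwards in the haystack.

```java
final class KmpCounter {
    // fail[i] = length of the longest proper prefix of needle[0..i]
    // that is also a suffix of needle[0..i].
    static int[] failureTable(String needle) {
        int[] fail = new int[needle.length()];
        int k = 0;
        for (int i = 1; i < needle.length(); i++) {
            while (k > 0 && needle.charAt(i) != needle.charAt(k)) {
                k = fail[k - 1];
            }
            if (needle.charAt(i) == needle.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    static int count(String haystack, String needle) {
        if (needle.isEmpty()) {
            return 0; // assumption: an empty needle never matches
        }
        int[] fail = failureTable(needle);
        int count = 0;
        int k = 0; // number of needle characters currently matched
        for (int i = 0; i < haystack.length(); i++) {
            while (k > 0 && haystack.charAt(i) != needle.charAt(k)) {
                k = fail[k - 1]; // fall back without re-reading the haystack
            }
            if (haystack.charAt(i) == needle.charAt(k)) {
                k++;
            }
            if (k == needle.length()) {
                count++;
                k = fail[k - 1]; // keep going so overlapping matches are seen
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("ABCDSGDABCSAGAABCCCCAAABAABC", "ABC"));
    }
}
```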

public class NeedleCount
{
public static void main(String[] args)
{
String s="AVBVDABCHJHDFABCJKHKHF",ned="ABC";
int nedIndex=-1,count=0,totalNed=0;
for(int i=0;i<s.length();i++)
{
if(i>ned.length()-1)
nedIndex++;
else
nedIndex=i;
if(s.charAt(i)==ned.charAt(nedIndex))
count++;
else
{
nedIndex=0;
count=0;
if(s.charAt(i)==ned.charAt(nedIndex))
count++;
else
nedIndex=-1;
}
if(count==ned.length())
{
nedIndex=-1;
count=0;
totalNed++;
System.out.println(totalNed+" needle found at index="+(i-(ned.length()-1)));
}
}
System.out.print("Total Ned="+totalNed);
}
}

As others have asked: better in what sense? A regexp-based solution will be the most concise and readable (:-) ). Boyer-Moore (http://en.wikipedia.org/wiki/Boyer–Moore_string_search_algorithm) will be the most efficient in terms of time (O(N)).

If you don't mind implementing a new data structure as a replacement for strings, have a look at tries: http://c2.com/cgi/wiki?StringTrie or http://en.wikipedia.org/wiki/Trie
If you are looking for an exact match rather than a regular expression, they should provide the fastest solution (proportional to the length of the search string).

public class FindNeedleInHaystack {
String hayStack="ASDVKDBGKBCDGFLBJADLBCNFVKVBCDXKBXCVJXBCVKFALDKBJAFFXBCD";
String needle="BCD";
boolean flag=false;
public void findNeedle() {
//The for loop below walks the haystack character by character; it stops where the needle can no longer fit, so charAt(i+j) below stays in bounds
for(int i=0;i<=hayStack.length()-needle.length();i++) {
//When i=n (0,1,2... ) we are at the nth character of hayStack. Start by comparing the nth char of hayStack with the first char of needle
if(hayStack.charAt(i)==needle.charAt(0)) {
//if the condition is true, we reach the for loop which iterates over needle by length.
//Now the first char of needle (BCD) is 'B' and the nth char of hayStack is 'B'. Compare the remaining characters of needle with hayStack using the loop below.
for(int j=0;j<needle.length();j++) {
//for example at i=9 is 'B', i+j is i+0,i+1,i+2...
//if condition return true, loop continues or else it will break and goes to i+1
if(hayStack.charAt(i+j)==needle.charAt(j)) {
flag=true;
} else {
flag=false;
break;
}
}
if(flag) {
System.out.print(i+" ");
}
}
}
}
}

The code below runs in O(n) time because tail never moves backwards through the haystack. Be aware that skipping past a partial match can miss an occurrence that starts inside it (for example, it does not find "AAB" in "AAAB"). If you want to capture the start and end indices of each match, uncomment the commented code below. The solution works on characters only and uses no Java String functions (pattern matching, indexOf, substring, etc.), as they may add extra space/time complexity.
char[] needleArray = needle.toCharArray();
char[] hayStackArray = hayStack.toCharArray();
//java.util.LinkedList<Pair<Integer,Integer>> indexList = new LinkedList<>();
int head;
int tail = 0;
int needleCount = 0;
while(tail<hayStackArray.length){
head = tail;
boolean proceed = false;
for(int j=0;j<needleArray.length;j++){
if(head+j<hayStackArray.length && hayStackArray[head+j]==needleArray[j]){
tail = head+j;
proceed = true;
}else{
proceed = false;
break;
}
}
if(proceed){
// indexList.add(new Pair<>(head,tail));
needleCount++;
}
++tail;
}
System.out.println(needleCount);
//System.out.println(indexList);

Related

Efficient way for finding Shortest Palindrome from Arraylist of strings

I am trying to solve this problem in java. I have an arraylist of palindromic strings. I have to find the shortest palindrome string out of the given array list. I have solved the question but was looking at getting feedback on my code and also how I can try to make the code more efficient/better.
Here is the code what I have tried.
In this case, size would be 3, since that is the length of the smallest palindromic string.
import java.util.ArrayList;
class ShortestPalindrome {
public static int isShortestPalindrome(ArrayList<String> list) {
int smallest = list.get(0).length();
boolean ret = false;
for (String element : list) {
ret = isPalindrome(element);
if (ret) {
if (element.length() < smallest)
smallest = element.length();
}
}
return smallest;
}
private static boolean isPalindrome(String input) {
String str = "";
boolean result = false;
if (input.length() == 1 || input.length() == 0)
return true;
if (input.charAt(0) != input.charAt(input.length() - 1))
return false;
StringBuilder sb = new StringBuilder(input.toLowerCase());
str = sb.reverse().toString();
if (input.equals(str)) {
result = true;
}
return result;
}
public static void main(String[] args) {
ArrayList<String> array = new ArrayList<String>();
array.add("malayam");
array.add("aba");
array.add("abcdeyugugi");
array.add("nitin");
int size = isShortestPalindrome(array);
System.out.println("Shortest length of string in list:" + size);
}
}

I have a couple of comments regarding your code:
In general, if you break your problem into smaller parts, there are efficient solutions all around.
As @AKSW mentioned in his comment, since we have to check each string's length anyway, it's better to do it at the beginning, so we don't run the relatively expensive isPalindrome() method on irrelevant strings. (Note that I overwrite the given list with the sorted one, even though initializing a new sorted list is trivial.)
The main improvement that I made is in the isPalindrome() method:
Reversing a string of length n takes n time and n additional space. Comparing the two also takes n time. Overall: 2n time, n space.
Comparing each pair of matching characters (from the beginning and from the end) takes 2 additional units of space (for the integers) and approximately n/2 time. Overall: n/2 time, 2 space.
Obviously, in asymptotic terms both time complexities are the same, O(n), but the second solution is still about 4 times cheaper and costs a negligible amount of space.
Therefore I believe this is the most efficient way to achieve your goal:
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
class ShortestPalindrome {
public static int isShortestPalindrome(ArrayList<String> list) {
// Sorts the given ArrayList by length
Collections.sort(list, Comparator.comparingInt(String::length));
for (String element : list) {
if(isPalindrome(element)) {
return element.length();
}
}
return -1; // If there is no palindrome in the given array
}
private static boolean isPalindrome(String input) {
String lowerCased = input.toLowerCase();
int pre = 0;
int end = lowerCased.length() - 1;
while (end > pre) {
if (lowerCased.charAt(pre) != lowerCased.charAt(end))
return false;
pre ++;
end --;
}
return true;
}
public static void main(String[] args) {
ArrayList<String> array = new ArrayList<>(Arrays.asList("malayam", "aba", "abcdeyugugi", "nitin"));
int size = isShortestPalindrome(array);
System.out.println("Shortest length of string in list: " + size);
}
}
Edit: I've tested this algorithm with the following list. Sorting the list before checking for palindromes reduced the run time by 50%.
"malayam", "aba", "abcdeyugugi", "nitin", "sadjsaudifjksdfjds", "sadjsaudifjksdfjdssadjsaudifjksdfjds", "sadjsaudifjksdfjdssadjsaudifjksdfjdssadjsaudifjksdfjds", "a"

The simplest improvement to your code is to check whether a string is a palindrome only if its length is smaller than smallest.
By the way, the initialization int smallest = list.get(0).length(); is not correct: imagine the first element is not a palindrome but is the shortest of all strings. You should use int smallest = Integer.MAX_VALUE; instead.
Also the check
if (input.charAt(0) != input.charAt(input.length() - 1))
return false;
is incorrect, as you don't convert the characters to lower case (as you do later), so "ajA" would not be recognized as a palindrome.
Further improvements to your code are possible:
You could replace the copy-and-reverse palindrome check with this:
for (int i = 0; i < input.length() / 2; ++i)
if (Character.toLowerCase(input.charAt(i)) != Character.toLowerCase(input.charAt(input.length() - 1 - i)))
return false;
No copy is necessary here, and in the average case it might be faster (as it can terminate early).
Also, as AKSW mentioned, it might be faster to sort the strings by length first; then you can stop as soon as you find a palindrome.
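Putting the points above together, a minimal sketch might look like the following. The class and method names are illustrative only, and returning -1 when no palindrome exists is an assumption borrowed from another answer, not something the question specifies.

```java
import java.util.Arrays;
import java.util.List;

final class ShortestPalindromeSketch {
    // Case-insensitive two-pointer check; no reversed copy is created.
    static boolean isPalindrome(String input) {
        for (int i = 0, j = input.length() - 1; i < j; i++, j--) {
            if (Character.toLowerCase(input.charAt(i))
                    != Character.toLowerCase(input.charAt(j))) {
                return false;
            }
        }
        return true;
    }

    static int shortestPalindromeLength(List<String> list) {
        int smallest = Integer.MAX_VALUE;
        for (String s : list) {
            // Cheap length test first, palindrome test only if it could win.
            if (s.length() < smallest && isPalindrome(s)) {
                smallest = s.length();
            }
        }
        return smallest == Integer.MAX_VALUE ? -1 : smallest; // -1: none found
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("malayam", "aba", "abcdeyugugi", "nitin");
        System.out.println(shortestPalindromeLength(words));
    }
}
```

Note that "malayam" from the question's list is not actually a palindrome ("malayalam" would be), so only "aba" and "nitin" qualify there.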

The simple code below should work for you.
First check the length, and then check whether the string is actually a palindrome.
If it is, store its length in smallest.
public static int isShortestPalindrome(ArrayList<String> list) {
Integer smallest = null;
for(String s:list){
if ( (smallest == null || s.length()< smallest) && new StringBuilder(s).reverse().toString().equalsIgnoreCase(s) ){
smallest = s.length();
}
}
return smallest == null ? 0 :smallest;
}

Here is a stream version:
OptionalInt minimalLenOfPalindrome
= list.parallelStream()
.filter(st -> {
StringBuilder sb = new StringBuilder(st);
String reversedSt = sb.reverse().toString();
return st.equalsIgnoreCase(reversedSt);
})
.mapToInt(String::length)
.min();
Thanks to @yassin's answer, I am replacing the code above with:
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalInt;
public class SOFlow {
private static boolean isPalindrome(String input) {
for (int i = 0; i < input.length() / 2; ++i) {
if (Character.toLowerCase(input.charAt(i)) != Character.toLowerCase(input.charAt(input.length() - 1 - i))) {
return false;
}
}
return true;
}
public static void main(String args[]) {
List<String> list = new ArrayList<>();
list.add("cAcc");
list.add("a;;;;a");
list.add("aJA");
list.add("vrrtrrr");
list.add("cAccccccccc");
OptionalInt minimalLenOfPalindrome
= list.parallelStream()
.filter(SOFlow::isPalindrome)
.mapToInt(String::length)
.min();
System.out.println(minimalLenOfPalindrome);
}
}

Using Java8 streams, and considering sorting first, for the same reasons as others:
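static boolean isPalindrome (String input) {
String lower = input.toLowerCase();
// reverse() mutates and returns the same StringBuilder, so compare contents, not references
return new StringBuilder(lower).reverse().toString().equals(lower);
}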
public static int isShortestPalindrome(ArrayList<String> list) {
return (list.stream().sorted ((s1, s2) -> {
return s1.length () - s2.length ();
})
.filter (s-> isPalindrome (s))
.findFirst ()
.map (s -> s.length ())
.orElse (-1));
}
If you have many strings of equal, minimal length that are very long and differ from a palindrome only in the very middle, you might spend a lot of time in isPalindrome and prefer something like isPalindrome1 instead.
If we assume a million strings with lengths evenly distributed from 1000 to 2000 characters, we end up concentrating on about 1000 strings on average. If most of them are equal except for a few characters close to the middle, then fine-tuning that comparison might be relevant. But finding a palindrome terminates our search early, so the percentage of palindromes heavily influences performance too.
private static boolean isPalindrome1 (String s) {
String input = s.toLowerCase ();
int len = input.length ();
for (int i = 0, j = len - 1; i < j; ++i, --j)
if (input.charAt(i) != input.charAt (j))
return false;
return true;
}
The result of the stream sorting and filtering is an Optional, which is a great opportunity to signal that nothing was found. I stuck to your interface of returning an int and return -1 if nothing is found, which of course has to be handled properly by the caller.

Substring alternative

So I'm creating a program that will output the first character of a string and then the first character of another string. Then the second character of the first string and the second character of the second string, and so on.
I created what is below, I was just wondering if there is an alternative to this using a loop or something rather than substring
public class Whatever
{
public static void main(String[] args)
{
System.out.println (interleave ("abcdefg", "1234"));
}
public static String interleave(String you, String me)
{
if (you.length() == 0) return me;
else if (me.length() == 0) return you;
return you.substring(0,1) + interleave(me, you.substring(1));
}
}
OUTPUT: a1b2c3d4efg

Well, if you really don't want to use substrings, you can use String's toCharArray() method, then you can use a StringBuilder to append the chars. With this you can loop through each of the array's indices.
Doing so, this would be the outcome:
public static String interleave(String you, String me) {
char[] a = you.toCharArray();
char[] b = me.toCharArray();
StringBuilder out = new StringBuilder();
int maxLength = Math.max(a.length, b.length);
for( int i = 0; i < maxLength; i++ ) {
if( i < a.length ) out.append(a[i]);
if( i < b.length ) out.append(b[i]);
}
return out.toString();
}
Your code is efficient enough as it is, though. This can be an alternative, if you really want to avoid substrings.

This is a loop implementation (not handling null value, just to show the logic):
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
for (int i = 0 ; i < Math.max(you.length(), me.length()) ; i++) {
if (i < you.length()) {
result.append(you.charAt(i)); }
if (i < me.length()) {
result.append(me.charAt(i));
}
}
return result.toString();
}

The solution I am proposing is based on the expected output. In your particular case, consider using the split method of String, since you are interleaving one character at a time.
So do something like this:
String[] xs = "abcdefg".split("");
String[] ys = "1234".split("");
Now loop over the larger array and interleave, making sure you perform a length check on the smaller one before accessing it.
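The loop described above could be sketched roughly as follows. The class name is invented for the example. On Java 8 and later, split("") yields one single-character string per element; older versions may add a leading empty string, which is harmless here because appending "" changes nothing.

```java
final class SplitInterleave {
    static String interleave(String you, String me) {
        String[] xs = you.split("");
        String[] ys = me.split("");
        StringBuilder out = new StringBuilder();
        int longer = Math.max(xs.length, ys.length);
        for (int i = 0; i < longer; i++) {
            // Length checks guard against running off the shorter array.
            if (i < xs.length) {
                out.append(xs[i]);
            }
            if (i < ys.length) {
                out.append(ys[i]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(interleave("abcdefg", "1234"));
    }
}
```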

To implement this as a loop, you would have to maintain the current position and keep adding characters until one string finishes, then tack the rest on. Any larger strings should use a StringBuilder. Something like this (untested):
int i = 0;
String result = "";
while(i < you.length() && i < me.length())
{
result += "" + you.charAt(i) + me.charAt(i); // the leading "" forces string concatenation instead of adding the char codes
i++;
}
if(i == you.length())
result += me.substring(i);
else
result += you.substring(i);

An improved (in some sense) version of @BenjaminBoutier's answer.
StringBuilder is the most efficient way to concatenate Strings.
public static String interleave(String you, String me) {
StringBuilder result = new StringBuilder();
int min = Math.min(you.length(), me.length());
String longest = you.length() > me.length() ? you : me;
int i = 0;
while (i < min) { // mix characters
result.append(you.charAt(i));
result.append(me.charAt(i));
i++;
}
while (i < longest.length()) { // add the remaining characters of longest
result.append(longest.charAt(i));
i++;
}
return result.toString();
}

Return word specified by the integer

I know I'm missing some things and that's what I really need help with. The code doesn't work in all cases and am looking for help improving/fixing it.
Assignment:
The code I have so far:
public String word(int num, String words)
{
int l = words.indexOf(" ");
int r = words.indexOf(" ", l+1);
for(int i = 3; i <= num; i++){
l = r;
r = words.indexOf(" ", l+1);
//if(i != num)
// l = r;
}
String theword = words.substring(l,r);
return theword;
}
}

As this is clearly homework, I will give you text only.
Your approach may work eventually, but it is laborious and overly complicated, so it's hard to debug and hard to get right.
make use of String's API by using the split() method
after splitting the sentence into an array of word Strings, return the element at num less one (arrays are indexed starting at zero)
check the length of the array first, in case there are less words than num, and take whatever action you think is appropriate in that case
For part 2, a solution in a simple form may be:
create a new blank string for the result
iterate over the characters of the given string adding the character to the front of the result string
make use of String's toUpperCase() method

Since this is homework and you have shown some effort, this is how you can do part 1 of your question. The code is pretty self-explanatory.
1) I am returning null if the number is greater than the number of words in the string, as we don't want the user to enter 5 when there are only 2 words in the string.
2) I am splitting the string by spaces and returning the array element at the position the user asked for.
There are more conditions which you must figure out yourself, such as telling the user to enter a number within the number of words (since otherwise there would be no result), and taking input from a Scanner instead of hard-coding it in the method.
public static String word(int num, String words)
{
String wordsArr[] = words.split(" ");
if(num <= 0 || num > wordsArr.length) return null;
return (wordsArr[num-1]);
}
the second part of your question must be attempted by you.

Well... not often you see people coming here with homework AND showing effort at the same time so bravo :).
This is an example of how you can split the string and return the element at position [x] from that string:
public class SO {
public static void main(String[] args) throws Exception {
int number = 3;
String word = "Hello this is sample code";
SO words = new SO();
words.returnWord(number, word);
}
private void returnWord(int number, String word) throws Exception {
String[] words = word.split("\\s+");
int numberOfWords = words.length;
if(numberOfWords >= number) {
System.out.println(words[number-1]);
} else {
throw new Exception("Not enough words!!!");
}
}
}
Yes, it is a working example, but do not just copy and paste it for your homework: one simple question from the teacher ("What is this doing?" or "How does this work?") and you're out :)! So understand the code, and try to modify it so that you know what each part is doing. It's also worth getting a Java book; I recommend Head First Java from O'Reilly, a very good beginner book!
If you have any questions, please do ask! Note that this answer is not 100% what the textbook is asking for, so you can modify this code accordingly.
As for part 2: what Bohemian said will also work, but there is a much quicker solution.
Look at StringBuilder; there is a method on it that will be of interest to you.
To convert the String so that all letters are upper case, you can use the .toUpperCase() method on the reversed string :)

You can try:
public class trial {
public static void main(String[] args)
{
System.out.println(specificword(0, "yours faithfully kyobe"));
System.out.println(reverseString("derrick"));}
public static String specificword(int number, String word){
//split by space
String [] parts = word.split("\\ ");
if(number >= 0 && number < parts.length){ // valid indices run from 0 to parts.length - 1
return parts[number];
}
else{
return "null String";
}
}
public static String reverseString(String n){
String c ="";
for(int i = n.length()-1; i>=0; i--){
char m = n.charAt(i);
c = c + m;
}
String m = c.toUpperCase();
return m;
}
}

For the first problem, I'll give you two approaches (1. is recommended):
1. Use the String.split method to split the words up into an array of words, where each element is a word. Instead of one string containing all of the words, such as "hello my name is Michael", it will create an array of the words, like so: [hello, my, name, is, Michael]. That way you can use the array to access the words. Very easy:
public static String word(int num, String words)
{
// split words string into array by the spaces
String[] wordArray = words.split(" "); // or = words.split("\\s+");
// if the number is within the range
if (num > 0 && num <= wordArray.length) {
return wordArray[num - 1]; // return the word from the word array
} else { // the number is not within the range of words
return null;
}
}
2. Only use this if you cannot use arrays! Loop through the string until you have found enough spaces to reach the word you want to find:
public static String word(int num, String words)
{
for (int i = 0; i < words.length(); i++) { // every character in words
if (words.substring(i, i+1).equals(" ")) { // if word is a space
num = num - 1; // you've found the next word, so subtract 1 (number of words left is remaining)
}
if (num == 1) { // found all words
// return this word
int lastIndex = i+1;
while (lastIndex < words.length()) { // until end of words string
if (words.substring(lastIndex, lastIndex+1).equals(" ")) {
break;
}
lastIndex = lastIndex + 1; // not a space so keep moving along the word
}
/*
// or you could use this to find the last index:
int lastIndex = words.indexOf(" ", i + 1); // next space after i+1
if (lastIndex == -1) { // couldn't find another space
lastIndex = words.length(); // so just make it the last letter in words
}*/
if (words.substring(i, i+1).equals(" ")) { // not the first word
return words.substring(i+1, lastIndex);
} else {
return words.substring(i, lastIndex);
}
}
}
return null; // didn't find word
}
As for the second problem, just iterate backwards through the string and add each letter to a new string. You add each letter from the original string to a new string, but just back to front. And you can use String.toUpperCase() to convert the string to upper case. Something like this:
public static String reverse(String str) {
String reversedString = ""; // this will be the reversed string
// for every character started at the END of the string
for (int i = str.length() - 1; i > -1; i--) {
// add it to the reverse string
reversedString += str.substring(i, i+1);
}
return reversedString.toUpperCase(); // return it in upper case
}

Find longest strings

I have a large string like "wall hall to wall hall fall be", and I want to print the longest strings. Then I want to know how many times each of the longest strings is repeated.
For example, the longest strings are:
wall Is repeated 2
hall Is repeated 2
fall Is repeated 1
This is my code:
public void bigesttstring(String str){
String[] wordsArray=str.split(" ");
int n= str.trim().split("\\s+").length;
int maxsize=0;
String maxWord="";
for(int i=0;i<wordsArray.length;i++){
if(wordsArray[i].length()>maxsize){
maxWord=wordsArray[i];
maxsize=wordsArray[i].length();
}
}
System.out.println("Max sized word is "+maxWord+" with size "+maxsize);
}
But this code only prints "wall".
To count how many times a string (I mean "maxWord") is repeated, I wrote this code:
int count=0;
for(int i=0;i<wordsArray.length;i++){
if(maxWord.equals(wordsArray[i])){
count++;
}
}
And to display the other longest strings I have this code:
int k=0;
for(int i=0;i<wordsArray.length;i++){
if(maxWord.equals(wordsArray[i])){
continue;
}
if(maxsize==wordsArray[i].length()){
k++;
}
}
String[] other=new String[k];
int o=0;
for(int i=0;i<wordsArray.length;i++){
if(maxWord.equals(wordsArray[i])){
continue;
}
if(maxsize==wordsArray[i].length()){
other[o]=wordsArray[i];
o++;
}
}
I am allowed to use these functions:
char charAt(int i);
int compareTo(String anotherString);
boolean endsWith(String suffix);
int indexOf();
int indexOf(String str);
String substring();
char[] toCharArray();
String toLowerCase();
And I want similar code for the shortest strings.

You have written
if(wordsArray[i].length()>maxsize)
For wall, hall and fall, this is only true for the first wall. That's why you are getting wall and size 4.
Here you are not considering that several different strings may share the longest length. You will have to store the longest strings in an array, and the if condition should be
if(wordsArray[i].length()>=maxsize)
You will need to handle the = and > cases separately, since in the > case you have to delete all the strings already stored in the array.
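The two cases described above might be handled like this. It is only a sketch: it uses an ArrayList for brevity (an assumption, since the question restricts which methods are allowed), and the names are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class LongestWords {
    // Returns the distinct longest words, in order of first appearance.
    static List<String> longest(String str) {
        List<String> result = new ArrayList<>();
        int maxSize = 0;
        for (String word : str.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // blank input produces a single empty token
            }
            if (word.length() > maxSize) {
                // '>' case: a new maximum invalidates everything stored so far
                maxSize = word.length();
                result.clear();
                result.add(word);
            } else if (word.length() == maxSize && !result.contains(word)) {
                // '=' case: another word that ties with the current maximum
                result.add(word);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(longest("wall hall to wall hall fall be"));
    }
}
```

For the shortest words, the same structure should work with the comparison flipped and maxSize started at Integer.MAX_VALUE.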

You need to handle the equal case as well, because currently a word with the same length as the current largest word is ignored. Also, if you want all of the biggest words, you need to store them in an array. I implemented it here.
package OtherPeoplesCode;
public class string {
public static void main(String[] args) {
bigeststring("wall hall to wall hall fall be");
}
public static void bigeststring(String str){
String[] wordsArray=str.split(" ");
String[] biggestWordsArray = new String[wordsArray.length];
int x = 0;
int n= str.trim().split("\\s+").length;
int maxsize=0;
String maxWord="";
for(int i=0;i<wordsArray.length;i++){
if(wordsArray[i].length()>maxsize){
maxWord=wordsArray[i];
maxsize=wordsArray[i].length();
// New maximum: discard earlier ties and start over with this word
for(int y = 0; y <= biggestWordsArray.length -1; y++){
biggestWordsArray[y] = "";
}
biggestWordsArray[0] = maxWord;
x = 1;
}
else if(maxsize==wordsArray[i].length()){
biggestWordsArray[x] = wordsArray[i];
x++;
}
}
// x is the number of stored words that share the maximum length
if(x == 1){
System.out.println("Max sized word is "+maxWord+" with size "+maxsize);
}
else if(x > 1){
System.out.println("TIE!");
for(int y = 0; y <= biggestWordsArray.length -1; y++){
if(!(biggestWordsArray[y].equals(""))){
System.out.print("Word #" + y + " is ");
System.out.println(biggestWordsArray[y]);
}
}
}
}
}
EDIT: This is the working code, sorry about the delay.

Using a Map is possibly the most straightforward and easy way to do it. However, if your teacher doesn't allow that, can you tell us what is allowed? That way we don't waste time suggesting different methods, none of which turns out to be acceptable.
The most brute-force way that I can suggest (there is a lot of room for optimization, but I think you want the easiest way) is:
1. Loop through the list of words, and find the length of the longest word and the number of words with that length.
2. Create a new array with the "number of words" you found in 1. Loop through the original word list again; for each word with length == maxWordLength, put it in the new array IF it does not already exist there (a simple check by a loop).
3. Now you have a list that contains all DISTINCT words that are "longest", with some possible nulls at the end. In order to display them in a format like "word : numOfOccurrence", you can do something like
4. Loop through the result array until you hit null. For each word in the result array, loop through the original word list to count its occurrences. Then you can print out the message as you want.
In pseudocode:
String[] wordList = ....;
int maxLen = 0;
int maxLenOccurence = 0;
foreach word in wordList {
if word is longer then maxLen {
maxLen = word's length
maxLenOccurence = 1;
}
else if word's length is equals to maxLen {
maxLenOccurence ++
}
}
// 2,3
String[] maxLenWordList = new String[maxLenOccurence];
foreach word in wordList {
if word's length is equals to maxLen {
for i = 0 to maxLenWordList length {
if (maxLenWordList[i] == word)
break
if (maxLenWordList[i] == null) {
maxLenWordList[i] = word
break
}
}
}
}
//4
foreach maxLenWord in maxLenWordList {
if maxLenWord is null
break
count = 0
foreach word in wordList {
if maxLenWord == word
count ++
}
display "Max sized word is " + maxLenWord + ", repeated " + count
}
Another way that doesn't involve another data structure is:
Have the word list
Sort the word list first by length, then by literal value
The first element of the resulting list is the longest one, and strings with the same value become adjacent. You can loop over them and print each match and its count (do some thinking by yourself here; it shouldn't be that hard)
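The brute-force steps above can be condensed into a small Java sketch that uses only arrays. The class and method names are hypothetical, and the output format is modelled on the example in the question. As a small deviation from the pseudocode, the counts are collected in the same pass that gathers the distinct words.

```java
final class LongestWordCounts {
    // Builds one "<word> is repeated <n>" line per distinct longest word.
    static String report(String str) {
        String[] words = str.trim().split("\\s+");
        // Step 1: find the maximum word length.
        int maxLen = 0;
        for (String w : words) {
            if (w.length() > maxLen) {
                maxLen = w.length();
            }
        }
        if (maxLen == 0) {
            return ""; // blank input: nothing to report
        }
        // Steps 2-4: collect distinct longest words and count each of them.
        String[] distinct = new String[words.length];
        int[] counts = new int[words.length];
        int used = 0;
        for (String w : words) {
            if (w.length() != maxLen) {
                continue;
            }
            int k = 0;
            while (k < used && !distinct[k].equals(w)) {
                k++; // linear search for an earlier copy of this word
            }
            if (k == used) {
                distinct[used++] = w; // first time this word is seen
            }
            counts[k]++;
        }
        StringBuilder sb = new StringBuilder();
        for (int k = 0; k < used; k++) {
            sb.append(distinct[k]).append(" is repeated ").append(counts[k]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report("wall hall to wall hall fall be"));
    }
}
```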

Also you can use this;
String[] allLongestStrings(String[] inputArray) {
List<String> list = new ArrayList<String>();
int max = 0;
for (int i = 0; i < inputArray.length; i++) {
StringBuilder s = new StringBuilder(inputArray[i]);
int n = s.length();
if (n > max) {
max = n;
}
}
for (int i = 0; i < inputArray.length; i++) {
StringBuilder s = new StringBuilder(inputArray[i]);
int n = s.length();
if (n == max) {
list.add(s.toString());
}
}
return list.toArray(new String[list.size()]);
}

Algorithm for duplicated but overlapping strings

I need to write a method where I'm given a string s and I need to return the shortest string which contains s as a contiguous substring twice.
However two occurrences of s may overlap. For example,
aba returns ababa
xxxxx returns xxxxxx
abracadabra returns abracadabracadabra
My code so far is this:
import java.util.Scanner;
public class TwiceString {
public static String getShortest(String s) {
int index = -1, i, j = s.length() - 1;
char[] arr = s.toCharArray();
String res = s;
for (i = 0; i < j; i++, j--) {
if (arr[i] == arr[j]) {
index = i;
} else {
break;
}
}
if (index != -1) {
for (i = index + 1; i <= j; i++) {
String tmp = new String(arr, i, i);
res = res + tmp;
}
} else {
res = res + res;
}
return res;
}
public static void main(String args[]) {
Scanner inp = new Scanner(System.in);
System.out.println("Enter the string: ");
String word = inp.next();
System.out.println("The requires shortest string is " + getShortest(word));
}
}
I know I'm probably wrong at the algorithmic level rather than at the coding level. What should be my algorithm?

Use a suffix tree. In particular, after you've constructed the tree for s, go to the leaf representing the whole string and walk up until you see another end-of-string marker. This will be the leaf of the longest suffix that is also a prefix of s.

As @phs already said, part of the problem can be translated to "find the longest proper prefix of s that is also a suffix of s", and a solution without a tree may be this:
public static String getShortest(String s) {
int i = s.length();
while(i > 0 && !s.endsWith(s.substring(0, --i)))
;
return s + s.substring(i);
}

Once you've found your index, and even if it's -1, you just need to append to the original string the substring going from index + 1 (since index is the last matching character index) to the end of the string. There's a method in String to get this substring.

I think you should have a look at the Knuth-Morris-Pratt algorithm; the partial match table it uses is pretty much what you need (and by the way, it's a very nice algorithm ;)
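As a sketch of that idea: the last entry of the partial match table is the length of the longest proper prefix of s that is also a suffix, which is exactly the overlap to reuse. The class name is invented for illustration, and the table construction is one common formulation of KMP's preprocessing step.

```java
final class OverlapDoubler {
    static String getShortest(String s) {
        int n = s.length();
        if (n == 0) {
            return s;
        }
        // border[i] = length of the longest proper prefix of s[0..i]
        // that is also a suffix of s[0..i].
        int[] border = new int[n];
        int k = 0;
        for (int i = 1; i < n; i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = border[k - 1];
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            border[i] = k;
        }
        // The second copy of s can reuse the last border[n - 1] characters.
        return s + s.substring(border[n - 1]);
    }

    public static void main(String[] args) {
        System.out.println(getShortest("abracadabra"));
    }
}
```

This runs in time linear in the length of s, whereas the endsWith loop in another answer can take quadratic time in the worst case.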

If your input string s is, say, "abcde" you can easily build a regex like the following (notice that the last character "e" is missing!):
a(b(c(d)?)?)?$
and run it on the string s. This will return the starting position of the trailing repeated substring. You would then just append the missing part (i.e. the last N-M characters of s, where N is the length of s and M is the length of the match), e.g.
aba
^ match "a"; append the missing "ba"
xxxxx
^ match "xxxx"; append the missing "x"
abracadabra
^ match "abra"; append the missing "cadabra"
nooverlap
--> no match; append "nooverlap"
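A rough Java sketch of this regex approach, under some assumptions: the helper name is made up, each character is escaped with Pattern.quote so that regex metacharacters in s are matched literally, and strings shorter than two characters are simply doubled. The nesting depth grows with the length of s, so this is probably only reasonable for short inputs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexOverlap {
    static String getShortest(String s) {
        if (s.length() < 2) {
            return s + s; // no proper overlap is possible
        }
        // Build a(b(c(d)?)?)?$ from all characters except the last one.
        int last = s.length() - 1;
        StringBuilder regex = new StringBuilder();
        regex.append(Pattern.quote(String.valueOf(s.charAt(0))));
        for (int i = 1; i < last; i++) {
            regex.append('(').append(Pattern.quote(String.valueOf(s.charAt(i))));
        }
        for (int i = 1; i < last; i++) {
            regex.append(")?");
        }
        regex.append('$');
        Matcher m = Pattern.compile(regex.toString()).matcher(s);
        // The leftmost match is the longest suffix of s that is also a prefix.
        int overlap = m.find() ? m.end() - m.start() : 0;
        return s + s.substring(overlap);
    }

    public static void main(String[] args) {
        System.out.println(getShortest("nooverlap"));
    }
}
```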

From my understanding you want to do this:
input: dog
output: dogdog
--------------
input: racecar
output: racecaracecar
So this is how i would do that:
public String change(String input)
{
StringBuilder outputBuilder = new StringBuilder(input);
int patternLocation = input.length();
for(int x = 1;x < input.length();x++)
{
StringBuilder check = new StringBuilder(input);
for(int y = 0; y < x;y++)
check.deleteCharAt(check.length() - 1);
if(input.endsWith(check.toString()))
{
patternLocation = x;
break;
}
}
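outputBuilder.delete(0, input.length() - patternLocation);
// outputBuilder now holds only the part that is missing after the overlap, so append it to the input
return input + outputBuilder.toString();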
}
Hope this helped!
