Count occurrences of a character in a given string - Java

I ended up with the following code for counting occurrences of a character in a string:
public static void main(String[] args) {
String source = "hello low how ale you";
Scanner in = new Scanner(System.in);
String temp = in.nextLine();
char test = temp.toCharArray()[0];
int fromIndex = 0;
int occurences =0;
while(fromIndex>-1)
{
fromIndex = source.indexOf(test, fromIndex);
System.out.println("found at"+fromIndex);
//if(fromIndex!=-1) occurences++;
}
System.out.println(occurences);
}
The loop runs forever if the "if(fromIndex!=-1)" line is commented out!
The loop terminates properly if the same line is uncommented.
It's strange, because the loop's termination depends on the variable fromIndex and not on the variable occurences, which is only updated inside the if statement.
Any guesses as to why this is happening?

The fromIndex value does not change in subsequent iterations, and that is the reason for the endless loop. indexOf returns the exact index of the character, so searching again from that same index finds the same character again. Search from fromIndex+1 on the next iteration and that solves the problem. Do the first search from index 0 before the loop, otherwise a match at index 0 would be skipped.
int fromIndex = source.indexOf(test);
while(fromIndex>-1)
{
System.out.println("found at"+fromIndex);
occurences++;
fromIndex = source.indexOf(test, fromIndex+1);
}
Hope this helps.
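A minimal sketch of the corrected loop as a reusable method. The class name CharCounter and the method countChar are made up for illustration; the first search starts at index 0 and every later search resumes one position past the previous match, so each occurrence is counted exactly once.

```java
// Hypothetical helper for illustration; not part of the original post.
class CharCounter {
    // Counts how many times target appears in source using String.indexOf.
    static int countChar(String source, char target) {
        int count = 0;
        int index = source.indexOf(target);            // first match, or -1
        while (index != -1) {
            count++;
            index = source.indexOf(target, index + 1); // resume just past the last match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countChar("hello low how ale you", 'o')); // prints 4
    }
}
```

Because the loop condition tests the same value that indexOf just returned, the loop ends as soon as no further match exists.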

If you are not tied to the approach in your code, I suggest using a regex. The code is shorter and cleaner, too. Try the snippet below; let's say you have the character ch:
char ch = 'o';
String input = "Hi there boys! Come on!";
// Pattern.quote keeps characters such as '.' or '*' from being treated as regex syntax
String regex = Pattern.quote(String.valueOf(ch));
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(input);
ArrayList<String> matches = new ArrayList<String>();
while (m.find())
matches.add(m.group());
System.out.println(matches.size());
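The same regex idea can be sketched without collecting the matches in a list. The class name RegexCharCounter is hypothetical; Pattern.quote is used on the assumption that the character should always be matched literally, even when it is a regex metacharacter such as '.'.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original post.
class RegexCharCounter {
    // Counts literal occurrences of ch in input using java.util.regex.
    static int countChar(String input, char ch) {
        // Quote the character so that '.', '*', '(' and friends match themselves.
        Pattern p = Pattern.compile(Pattern.quote(String.valueOf(ch)));
        Matcher m = p.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }
}
```

Without the quoting step, a character like '.' would match every character of the input and inflate the count.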

I think you are trying to do something like below.
public class Solution {
public static void main(String[] args) {
String source = "hello low how ale you";
char test = 'u';
int fromIndex = 0;
int occurrences = 0;
int length = source.length();
while (fromIndex < length) {
int index = source.indexOf(test, fromIndex);
if (index != -1) {
occurrences++;
fromIndex = index;
}
fromIndex++;
}
System.out.println(occurrences);
}
}
Some explanation:
Suppose you need to find 'h' in the given string. You start from 0, so fromIndex is initialized to 0.
You search from the initial position to the end of the string. That's why the while loop condition is fromIndex < length.
Now, try to find the next occurrence of test. If the test character is present, you get its index; otherwise you get -1. Store it in index.
If index is not -1, count the occurrence and assign index to fromIndex, because the next search should start from this position, not from 0 again.
Increment fromIndex. This is done regardless of the value of index, so the next search starts just past the last match.
Now analyze your code for errors.

Related

Java Replace a substring after a certain number of repeats of that substring

I'm working on an assignment that asks you to write a method to replace a substring with a new substring, but only if the original substring is repeated within its string a given number of times, and only replace the substring at that repeat.
We are given:
public class Statement
{
private String remark;
public Statement (String a){
remark = a;
}
/**Returns the index in the statement of the kth str;
*returns -1 if kth time does not exist.
*Precondition: str.length() > 0 and k > 0
*Postcondition: the current statement is not modified.
*/
public int locateKth (String str, int k)
{ /*implementation not shown*/ }
/**Modifies the current statement by replacing the kth time of str with newStr.
*If the kth time does not exist, the current statement is unchanged.
*Precondition: str.length() > 0 and k > 0
*/
public void changeKth (int k, String str, String newStr)
We are then asked to write the method changeKth, and are given examples of how it works:
Statement ex1 = new Statement("The cat in the hat knows a lot about that.");
ex1.changeKth(1, "at", "ow");
System.out.println(ex1);
Returns: The cow in the hat knows a lot about that.
I know I will have to find the index of the kth instance of str, but I am not sure where to go from there to replace only that instance of str. I've seen people replace only the first instance of a substring, but never only a later instance. How would I do that?

I would use the indexOf(String str, int fromIndex) method from the String class to find all the places str appears in the sentence, each time searching from just past the previous occurrence until there are no more. Such as:
String sentence;
String str;
String newStr;
int k;
List<Integer> indexes = new ArrayList<Integer>();
int lastIndex = 0;
//LOOP UNTIL WE'VE GONE THROUGH THE WHOLE SENTENCE
while(lastIndex < sentence.length()){
//GET THE FIRST PLACE WHERE str APPEARS AT OR AFTER THE LAST INDEX WE'VE ALREADY LOOKED AT
int index = sentence.indexOf(str, lastIndex);
//STOP WHEN str DOES NOT APPEAR AGAIN
if(index == -1) break;
//KEEP TRACK OF THE INDEXES str APPEARS AT IN THE SENTENCE
indexes.add(index);
//MOVE PAST THIS OCCURRENCE SO THE NEXT SEARCH DOES NOT FIND IT AGAIN
lastIndex = index + 1;
}
//IF THERE IS NO KTH OCCURRENCE, LEAVE THE SENTENCE UNCHANGED
if(k > indexes.size()) return sentence;
//GET KTH INDEX (k STARTS AT 1, THE LIST STARTS AT 0)
int kthIndex = indexes.get(k - 1);
//GET THE SENTENCE BEFORE str APPEARS AT THE kthIndex
String result = sentence.substring(0, kthIndex);
//ADD newStr (REPLACE str WITH newStr)
result += newStr;
//ADD THE LAST PART OF THE SENTENCE AFTER str APPEARS AT THE kthIndex
result += sentence.substring(kthIndex + str.length());
//THIS IS THE RESULT
return result;
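Since the assignment already provides locateKth, changeKth can lean on it directly. The sketch below is one way that might look. The body of locateKth is an assumption (the assignment says "implementation not shown"), including the choice to count overlapping occurrences, and toString is added only so the statement can be printed.

```java
// Sketch only: locateKth is "implementation not shown" in the assignment,
// so the version here is an assumption about how it behaves.
class Statement {
    private String remark;

    Statement(String a) {
        remark = a;
    }

    // Assumed implementation: index of the kth occurrence of str, or -1.
    // Overlapping occurrences are counted (the search resumes one position later).
    int locateKth(String str, int k) {
        int index = -1;
        for (int found = 0; found < k; found++) {
            index = remark.indexOf(str, index + 1);
            if (index == -1) {
                return -1;
            }
        }
        return index;
    }

    // Replaces only the kth occurrence of str with newStr;
    // leaves the statement unchanged if there is no kth occurrence.
    void changeKth(int k, String str, String newStr) {
        int index = locateKth(str, k);
        if (index == -1) {
            return;
        }
        remark = remark.substring(0, index) + newStr + remark.substring(index + str.length());
    }

    @Override
    public String toString() {
        return remark;
    }
}
```

The replacement is just three pieces glued together: everything before the kth occurrence, the new text, and everything after the old text.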

Counting unique characters in a string

So I have been trying to write code that counts the number of words in a string, which was pretty easy. I'm running into problems when I try to make it count the number of unique characters in a string. The program compiles and runs, but it doesn't display the number of unique characters. Adding a System.out.println(countOfUniqueChars); below return doesn't work.
Here's the code:
public class Uniquechar{
public static void main(String[] args) {
String s = "Jag vet inte vad jag heter idag";
String[] parts = s.split(" ");
int wordcount = parts.length;
System.out.println("The number of words is" + wordcount);
countUniqueCharacters(s);
}
public static int countUniqueCharacters(String s) {
String lowerCase = s.toLowerCase();
char characters[] = lowerCase.toCharArray();
int countOfUniqueChars = s.length();
for (int i = 0; i < characters.length; i++) {
if (i != lowerCase.indexOf(characters[i])) {
countOfUniqueChars--;
}
}
return countOfUniqueChars;
}
}

Try this:
s = s.replace(" ", ""); // If you don't want to count space
char[] chars = s.toCharArray();
Set<Character> uniqueChars = new HashSet<>();
for (char c : chars) {
uniqueChars.add(c);
}
System.out.println(uniqueChars.size());

Just print the method call, it prints the result.
System.out.println(countUniqueCharacters(s));
Adding a System.out.println(countOfUniqueChars); below return doesn't work.
It won't work because code after a return statement is unreachable, and the Java compiler rejects it. You can print just before the return instead.
System.out.println(countOfUniqueChars);
return countOfUniqueChars;

You can do System.out.println(countUniqueCharacters(s)); in the main method to output the return value of your method. You cannot add more code after a return. I ran it and the output is 12, so there also seems to be something wrong with your algorithm.
int uniqueCharsCount = countUniqueCharacters(s);
System.out.println("The number of unique chars is " + uniqueCharsCount);
Output: 12
Your algorithm:
You are checking, for every char, whether the same char occurs earlier in the string. But you should also check whether the char occurs anywhere in the string after the current index. You can fix it by changing your if condition to if (i != lowerCase.indexOf(characters[i]) || i != lowerCase.lastIndexOf(characters[i]))
Output of the fixed version: 3 (n, h, r)
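The two outputs above (12 and 3) answer two different questions, which the following sketch tries to make explicit. The class and method names are made up for illustration: countDistinct counts how many different characters occur at all, while countOccurringOnce counts the characters that occur exactly once.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of the original post.
class UniqueChars {
    // Characters that appear exactly once: first and last index are the same.
    static int countOccurringOnce(String s) {
        String lower = s.toLowerCase();
        int count = 0;
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            if (lower.indexOf(c) == lower.lastIndexOf(c)) {
                count++;
            }
        }
        return count;
    }

    // Number of different characters, however often each one occurs.
    static int countDistinct(String s) {
        Set<Character> seen = new HashSet<>();
        for (char c : s.toLowerCase().toCharArray()) {
            seen.add(c);
        }
        return seen.size();
    }
}
```

For "Jag vet inte vad jag heter idag", countDistinct gives 12 (the space included) and countOccurringOnce gives 3 (n, h, r).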

I would recommend using a Set to retain only uniques, then count its size, instead of iterating:
public static int countUniqueCharacters(String s) {
String lowerCase = s.toLowerCase();
char characters[] = lowerCase.toCharArray();
Set<Character> uniques = new HashSet<Character>();
for (char c: characters) {
uniques.add(c);
}
return uniques.size();
}

if (i != lowerCase.indexOf(characters[i])) {
countOfUniqueChars--;
}
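This would be wrong if characters came from the original string s: lowerCase is all lowercase, so an uppercase letter in characters[i] would have an index of -1 in lowerCase and would be counted as a non-unique character. In the posted code characters is built from lowerCase.toCharArray(), so the lookup is consistent; using indexOf(lowerCase.charAt(i)) makes that explicit.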

A good way to count the number of unique characters is to eliminate repetitions. The idea is to take the first character, find its later occurrences and replace them with nothing; once you do that, you can count the unique characters.
public static int countUniqueCharacters(String s) {
String lowerCase = s.toLowerCase();
///Base case: an empty string has no characters left to count
if (lowerCase.isEmpty()) return 0;
///Get the first char of lowerCase
String firstChar = lowerCase.substring(0,1);
//Take off the first char
String subS = lowerCase.substring(1);
///replace all chars equal to the first char
String replacedSubS = subS.replace(firstChar, "");
/// Now, call the method again to count the rest
/// of the string, with the first char
// removed everywhere
return 1+countUniqueCharacters(replacedSubS);
}
This method worked for me, take a look. You could do it in two lines, but I thought it better to be detailed here.

Adding a System.out.println(countOfUniqueChars); below return doesn't work.
That is expected behavior, because return means that the flow of control goes back from the method to the place where the method was invoked. Code after return will not be executed, so in a situation like
return countOfUniqueChars;
System.out.println(countOfUniqueChars);
System.out.println(countOfUniqueChars); would be dead code.
You could try printing value before you return it like
System.out.println(countOfUniqueChars);
return countOfUniqueChars;
or simply print returned value in main method like
int count = countUniqueCharacters(s);
System.out.println(count);
or using this one-liner
System.out.println(countUniqueCharacters(s));
BTW since Java 8 your code can look like
s.toLowerCase().chars().distinct().summaryStatistics().getCount()
or if you want to skip spaces you can add
s.toLowerCase().replace(" ","").chars().distinct().summaryStatistics().getCount()
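A slightly shorter variant of the stream approach, sketched here with hypothetical method names: IntStream has a count() method, so the summaryStatistics() step is not strictly needed, and filter can drop the spaces without building a second string.

```java
// Hypothetical helper for illustration; not part of the original post.
class DistinctWithStreams {
    // Number of different characters, ignoring case.
    static long countDistinct(String s) {
        return s.toLowerCase().chars().distinct().count();
    }

    // Same, but spaces are left out before counting.
    static long countDistinctIgnoringSpaces(String s) {
        return s.toLowerCase().chars().filter(c -> c != ' ').distinct().count();
    }
}
```

Both methods return a long, because that is what IntStream.count() yields.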

public static int countUniqueCharacters(String s) {
char [] input=s.toCharArray();
Set<Character> charset=new HashSet<>();
for (int i = 0; i < input.length; i++) {
charset.add(input[i]);
}
return charset.size();
}

Algorithm for duplicated but overlapping strings

I need to write a method where I'm given a string s and I need to return the shortest string which contains s as a contiguous substring twice.
However two occurrences of s may overlap. For example,
aba returns ababa
xxxxx returns xxxxxx
abracadabra returns abracadabracadabra
My code so far is this:
import java.util.Scanner;
public class TwiceString {
public static String getShortest(String s) {
int index = -1, i, j = s.length() - 1;
char[] arr = s.toCharArray();
String res = s;
for (i = 0; i < j; i++, j--) {
if (arr[i] == arr[j]) {
index = i;
} else {
break;
}
}
if (index != -1) {
for (i = index + 1; i <= j; i++) {
String tmp = new String(arr, i, i);
res = res + tmp;
}
} else {
res = res + res;
}
return res;
}
public static void main(String args[]) {
Scanner inp = new Scanner(System.in);
System.out.println("Enter the string: ");
String word = inp.next();
System.out.println("The requires shortest string is " + getShortest(word));
}
}
I know I'm probably wrong at the algorithmic level rather than at the coding level. What should be my algorithm?

Use a suffix tree. In particular, after you've constructed the tree for s, go to the leaf representing the whole string and walk up until you see another end-of-string marker. This will be the leaf of the longest suffix that is also a prefix of s.

As @phs already said, part of the problem can be translated to "find the longest prefix of s that is also a suffix of s" and a solution without a tree may be this:
public static String getShortest(String s) {
int i = s.length();
while(i > 0 && !s.endsWith(s.substring(0, --i)))
;
return s + s.substring(i);
}

Once you've found your index, and even if it's -1, you just need to append to the original string the substring going from index + 1 (since index is the last matching character index) to the end of the string. There's a method in String to get this substring.

I think you should have a look at the Knuth-Morris-Pratt algorithm; the partial match table it uses is pretty much what you need (and by the way it's a very nice algorithm ;)
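To make the Knuth-Morris-Pratt suggestion concrete, here is a sketch that builds the partial match (failure) table and reads the overlap off its last entry. The class and method names are invented for this example; only the table construction is standard KMP.

```java
// Hypothetical helper for illustration; not part of the original post.
class ShortestDouble {
    // fail[i] is the length of the longest proper prefix of s[0..i]
    // that is also a suffix of s[0..i].
    static int[] failureTable(String s) {
        int[] fail = new int[s.length()];
        int k = 0; // length of the currently matched prefix
        for (int i = 1; i < s.length(); i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = fail[k - 1]; // fall back to the next shorter candidate
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            fail[i] = k;
        }
        return fail;
    }

    // Shortest string that contains s twice as a contiguous substring.
    static String getShortest(String s) {
        if (s.isEmpty()) {
            return s;
        }
        int overlap = failureTable(s)[s.length() - 1];
        return s + s.substring(overlap);
    }
}
```

The last table entry is the length of the longest prefix of s that is also a suffix, so only the characters after that prefix need to be appended. The table is built in time linear in the length of s.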

If your input string s is, say, "abcde" you can easily build a regex like the following (notice that the last character "e" is missing!):
a(b(c(d)?)?)?$
and run it on the string s. This will return the starting position of the trailing repeated substring. You would then just append the missing part (i.e. the last N-M characters of s, where N is the length of s and M is the length of the match), e.g.
aba
  ^ match "a"; append the missing "ba"
xxxxx
 ^ match "xxxx"; append the missing "x"
abracadabra
       ^ match "abra"; append the missing "cadabra"
nooverlap
--> no match; append "nooverlap"
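The regex idea above can be sketched in code as follows. The class and method names are hypothetical, each character is wrapped in Pattern.quote on the assumption that the input may contain regex metacharacters, and the input is assumed not to end in a line terminator (which would change what $ matches).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original post.
class RegexOverlap {
    // For "abcde" this builds the quoted equivalent of a(b(c(d)?)?)?$ :
    // every character except the last, nested and optional from the second one on.
    static String buildRegex(String s) {
        StringBuilder regex = new StringBuilder();
        int groups = 0;
        for (int i = 0; i < s.length() - 1; i++) {
            if (i > 0) {
                regex.append('(');
                groups++;
            }
            regex.append(Pattern.quote(String.valueOf(s.charAt(i))));
        }
        for (int i = 0; i < groups; i++) {
            regex.append(")?");
        }
        return regex.append('$').toString();
    }

    static String getShortest(String s) {
        if (s.length() < 2) {
            return s + s; // too short for a proper overlap
        }
        Matcher m = Pattern.compile(buildRegex(s)).matcher(s);
        // The leftmost match that reaches the end of s is the longest overlap.
        int matched = m.find() ? m.end() - m.start() : 0;
        return s + s.substring(matched);
    }
}
```

The nesting depth grows with the length of s, so this is better suited to short inputs than to very long strings.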

From my understanding you want to do this:
input: dog
output: dogdog
--------------
input: racecar
output: racecaracecar
So this is how i would do that:
public String change(String input)
{
StringBuilder outputBuilder = new StringBuilder(input);
int patternLocation = input.length();
for(int x = 1;x < input.length();x++)
{
StringBuilder check = new StringBuilder(input);
for(int y = 0; y < x;y++)
check.deleteCharAt(check.length() - 1);
if(input.endsWith(check.toString()))
{
patternLocation = x;
break;
}
}
// Keep only the part of the input that still has to be appended after the overlap
outputBuilder.delete(0, input.length() - patternLocation);
return input + outputBuilder.toString();
}
Hope this helped!

# of times a single word in a sentence

How can I get the number of times a single word occurs in a given sentence? String.split cannot be used. I don't really need the code; I just need an idea to get started.
package exam2;
import java.util.Arrays;
public class Problem2 {
/**
* @param args
*/
public static String input, word, a, b, c;
public static int index;
public static void Input() {
System.out.println("Enter Sentence: ");
input = IO.readString();
System.out.println("Enter Word: ");
word = IO.readString();
}
public static void Calc() {
Input();
index = input.indexOf(word);
int[] data = new int[input.length()];
data[0] = index;
while (index >= 0) {
System.out.println("Index : " + index);
for (int i = 1; i < data.length; i++) {
data[i] = index;
}
index = input.indexOf(word, index + word.length());
}
int count = 0;
for (int i = 0; i < data.length; i++) {
for (int j = 0; j < data.length; j++) {
if (data[i] == data[j]) {
a = input.substring(data[i], data[i] + word.length());
}
else if (data[i] != data[j]) {
b = input.substring(data[i], data[i] + word.length());
c = input.substring(data[j], data[j] + word.length());
}
}
}
if (a.equalsIgnoreCase(word)) {
count++;
}
System.out.println(count);
}
public static void main(String[] args) {
Calc();
}
}
Using a while loop, I find each index of the user's word in the user's sentence and store those indexes in the array. For some reason that is not working, so I tried another way: if the indexes stored in the array all equal each other, then the word only exists once. I have got this to work, but if the word exists more than once, that creates the problem.

Get the first letter of the word given by the user. Next, look through the sentence and find that letter. Then check the second letter of the word and compare it with the next letter in the sentence. If they are the same, continue comparing. If not, start again from the next letter. Each time you match all the letters of the word followed by a space (or the end of the sentence), add 1 to a counter. I think that will work.

Take a look at String.indexOf(String, int). This finds the next position of the parameter String, starting at the parameter int.

Split can't be used? That seems rather odd. I'll bite though, and say simply that we don't have enough information to give a correct answer.
What is a word exactly?
How should symbols/letters be considered?
When is a sentence completed?
Should hyphenated words be special?
Do we have 1 sentence and we are testing many words?
Do we have 1 word and many sentences?
What about substring matches (can vs canteen)?
Given what I can guess, you should loop through the "sentence" and tokenize the input by building up "words" until you hit a word boundary. Put the found words into a HashMap (keyed on the word) and increment the value for each word as you find it.
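A rough sketch of that tokenize-and-count approach, without String.split. It has to pick answers to the questions above, so it assumes that a word is a maximal run of letters or digits and that case does not matter; the class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original post.
class WordFrequency {
    // Assumption: a "word" is a maximal run of letters or digits; case is ignored.
    static Map<String, Integer> countWords(String sentence) {
        Map<String, Integer> counts = new HashMap<>();
        StringBuilder current = new StringBuilder();
        // Run one step past the end so the final word is flushed too.
        for (int i = 0; i <= sentence.length(); i++) {
            boolean inWord = i < sentence.length()
                    && Character.isLetterOrDigit(sentence.charAt(i));
            if (inWord) {
                current.append(Character.toLowerCase(sentence.charAt(i)));
            } else if (current.length() > 0) {
                // Word boundary reached: record the word and start a new one.
                counts.merge(current.toString(), 1, Integer::sum);
                current.setLength(0);
            }
        }
        return counts;
    }
}
```

Looking up a single word is then counts.getOrDefault(word.toLowerCase(), 0), and "can" is not confused with "canteen" because only whole words are stored.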

You need to call 'input.indexOf(word, fromIndex)' in a loop to find the string. Each time you call this function and it returns something other than -1, increment your count. When it returns -1 or you reach the end of the string, stop. fromIndex starts at 0, and each time you find the string it should be set to the index just found plus the length of the string.
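The loop described above might look like the sketch below. The names are hypothetical, and the method counts non-overlapping substring matches, so "can" is also found inside "canteen"; checking word boundaries would be an extra step.

```java
// Hypothetical helper for illustration; not part of the original post.
class SubstringCounter {
    // Counts non-overlapping occurrences of word in input using indexOf.
    static int countOccurrences(String input, String word) {
        if (word.isEmpty()) {
            return 0; // an empty word would otherwise match at every position
        }
        int count = 0;
        int fromIndex = 0;
        while (true) {
            int index = input.indexOf(word, fromIndex);
            if (index == -1) {
                break; // no further occurrence
            }
            count++;
            fromIndex = index + word.length(); // continue after this occurrence
        }
        return count;
    }
}
```

Setting fromIndex to index + 1 instead would count overlapping occurrences as well.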

Count Occurence of Needle String in Haystack String, most optimally?

The problem is simple: find "ABC" in "ABCDSGDABCSAGAABCCCCAAABAABC" without using String.split("ABC").
Here is the solution I propose. I'm looking for any solutions that might be better than this one.
public static void main(String[] args) {
String haystack = "ABCDSGDABCSAGAABCCCCAAABAABC";
String needle = "ABC";
char [] needl = needle.toCharArray();
int needleLen = needle.length();
int found=0;
char hay[] = haystack.toCharArray();
int index =0;
int chMatched =0;
for (int i=0; i<hay.length; i++){
if (index >= needleLen || chMatched==0)
index=0;
System.out.print("\nchar-->"+hay[i] + ", with->"+needl[index]);
if(hay[i] == needl[index]){
chMatched++;
System.out.println(", matched");
}else {
chMatched=0;
index=0;
if(hay[i] == needl[index]){
chMatched++;
System.out.print("\nchar->"+hay[i] + ", with->"+needl[index]);
System.out.print(", matched");
}else
continue;
}
if(chMatched == needleLen){
found++;
System.out.println("found. Total ->"+found);
}
index++;
}
System.out.println("Result Found-->"+found);
}
It took me a while creating this one. Can someone suggest a better solution (if any)
P.S. Drop the sysouts if they look messy to you.

How about:
boolean found = haystack.indexOf("ABC") >= 0;
Edit: The question asks for the number of occurrences, so here's a modified version of the above:
public static void main(String[] args)
{
String needle = "ABC";
String haystack = "ABCDSGDABCSAGAABCCCCAAABAABC";
int numberOfOccurences = 0;
int index = haystack.indexOf(needle);
while (index != -1)
{
numberOfOccurences++;
haystack = haystack.substring(index+needle.length());
index = haystack.indexOf(needle);
}
System.out.println("" + numberOfOccurences);
}

If you're looking for an algorithm, google for "Boyer-Moore". You can do this in sub-linear time.
edit to clarify and hopefully make all the purists happy: the time bound on Boyer-Moore is, formally speaking, linear. However the effective performance is often such that you do many fewer comparisons than you would with a simpler approach, and in particular you can often skip through the "haystack" string without having to check each character.
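To illustrate the skipping behavior, here is a sketch of the Boyer-Moore-Horspool simplification. It is not full Boyer-Moore: it uses only a bad-character shift table. The class name and the choice of a HashMap for the table are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; Horspool variant, not full Boyer-Moore.
class HorspoolCounter {
    // Counts occurrences of needle in haystack, overlapping ones included.
    static int count(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        if (m == 0 || m > n) {
            return 0;
        }
        // Shift table: for each char of the needle except the last, the distance
        // from its last position to the end of the needle. Other chars shift by m.
        Map<Character, Integer> shift = new HashMap<>();
        for (int i = 0; i < m - 1; i++) {
            shift.put(needle.charAt(i), m - 1 - i);
        }
        int count = 0;
        int pos = 0; // start of the current window in the haystack
        while (pos <= n - m) {
            int j = m - 1;
            // Compare the window with the needle from right to left.
            while (j >= 0 && haystack.charAt(pos + j) == needle.charAt(j)) {
                j--;
            }
            if (j < 0) {
                count++;
            }
            // Skip ahead based on the haystack char under the needle's last position.
            pos += shift.getOrDefault(haystack.charAt(pos + m - 1), m);
        }
        return count;
    }
}
```

For the haystack from the question and the needle "ABC" this returns 4. Whenever the character under the window's last position does not occur in the needle, the window jumps by the whole needle length, which is where the savings come from.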

You say your challenge is to find ABC within a string. If all you need is to know if ABC exists within the string, a simple indexOf() test will suffice.
If you need to know the number of occurrences, as your posted code tries to find, a simple approach would be to use a regex:
public static int countOccurrences(String haystack, String regexToFind) {
Pattern p = Pattern.compile(regexToFind);
Matcher m = p.matcher(haystack); // get a matcher object
int count = 0;
while(m.find()) {
count++;
}
return count;
}

Have a look at http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm

public class NeedleCount
{
public static void main(String[] args)
{
String s="AVBVDABCHJHDFABCJKHKHF",ned="ABC";
int nedIndex=-1,count=0,totalNed=0;
for(int i=0;i<s.length();i++)
{
if(i>ned.length()-1)
nedIndex++;
else
nedIndex=i;
if(s.charAt(i)==ned.charAt(nedIndex))
count++;
else
{
nedIndex=0;
count=0;
if(s.charAt(i)==ned.charAt(nedIndex))
count++;
else
nedIndex=-1;
}
if(count==ned.length())
{
nedIndex=-1;
count=0;
totalNed++;
System.out.println(totalNed+" needle found at index="+(i-(ned.length()-1)));
}
}
System.out.print("Total Ned="+totalNed);
}
}

As others have asked, better in what sense? A regexp-based solution will be the most concise and readable (:-) ). Boyer-Moore (http://en.wikipedia.org/wiki/Boyer–Moore_string_search_algorithm) will be the most efficient in terms of time (O(N)).

If you don't mind implementing a new datastructure as replacement for strings, have a look at Tries: http://c2.com/cgi/wiki?StringTrie or http://en.wikipedia.org/wiki/Trie
If you don't look for a regular expression but an exact match they should provide the fastest solution (proportional to length of search string).

public class FindNeedleInHaystack {
String hayStack="ASDVKDBGKBCDGFLBJADLBCNFVKVBCDXKBXCVJXBCVKFALDKBJAFFXBCD";
String needle="BCD";
boolean flag=false;
public void findNeedle() {
//The loop below tries every start position at which the whole needle still fits in hayStack
for(int i=0;i<=hayStack.length()-needle.length();i++) {
//When i=n (0,1,2... ) then we are at nth character of hayStack. Let's start comparing nth char of hayStach with first char of needle
if(hayStack.charAt(i)==needle.charAt(0)) {
//if condition return true, we reach forloop which iterates needle by lenghth.
//Now needle(BCD) first char is 'B' and nth char of hayStack is 'B'. Then let's compare remaining characters of needle with haystack using below loop.
for(int j=0;j<needle.length();j++) {
//for example at i=9 is 'B', i+j is i+0,i+1,i+2...
//if condition return true, loop continues or else it will break and goes to i+1
if(hayStack.charAt(i+j)==needle.charAt(j)) {
flag=true;
} else {
flag=false;
break;
}
}
if(flag) {
System.out.print(i+" ");
}
}
}
}
}

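The code below runs in O(n) time, because tail never moves backwards while scanning the n chars of the haystack. If you want to capture the start and end indexes of the needle, uncomment the commented-out code. The solution works directly on characters and uses no Java String functions (pattern matching, indexOf, substring, etc.), as they may bring extra space/time overhead. Note that because tail skips ahead after a partial match, a match that begins inside the partially matched region is missed (for example, needle "AAB" in haystack "AAAB").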
char[] needleArray = needle.toCharArray();
char[] hayStackArray = hayStack.toCharArray();
//java.util.LinkedList<Pair<Integer,Integer>> indexList = new LinkedList<>();
int head;
int tail = 0;
int needleCount = 0;
while(tail<hayStackArray.length){
head = tail;
boolean proceed = false;
for(int j=0;j<needleArray.length;j++){
if(head+j<hayStackArray.length && hayStackArray[head+j]==needleArray[j]){
tail = head+j;
proceed = true;
}else{
proceed = false;
break;
}
}
if(proceed){
// indexList.add(new Pair<>(head,tail));
needleCount++;
}
++tail;
}
System.out.println(needleCount);
//System.out.println(indexList);
