Having trouble with a frequency table in Java

I have to create a frequency table for how many times a word appears in a sentence. I was trying to accomplish this with 2 arrays but every time I trace it, the words will not go into the frequency tables.
boolean found = false;
for (int y = 0; y < numWordsInArray; y++)
{
found = arrayOfWords[y].equals(word);
if(found)
{
numTimesAppeared[y]++;
}
if (!found) //it's not already found
{
//add the word to the array of words
arrayOfWords[numWordsInArray] = word;
numWordsInArray++;
}
}
and when I run this loop:
for(int x = 0; x < 10; x++)
{
System.out.println(arrayOfWords[x]);
}
to trace the array, I get output of 10 spaces.
link to the whole program : http://pastebin.com/F4t6yCkD

Use a HashMap instead of the array numTimesAppeared.
Sample code:
private Map<String, Integer> freqMap = new HashMap<String, Integer>();
public void statFreq(String[] words) {
for(String word : words) {
Integer freq = freqMap.get(word);
if(freq == null) {
freqMap.put(word, 1);
} else {
freqMap.put(word, freq + 1);
}
}
}
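The two-array approach from the question can also be made to work. Judging only by the posted snippet, the word is added inside the loop over the existing words, so while numWordsInArray is still 0 the loop body never runs and nothing is ever stored. A minimal sketch of one possible fix, keeping the question's variable names (the class name, the method name, and the array capacity of 10 are assumptions for illustration):

```java
class TwoArrayFrequency {
    // Fills the two parallel arrays and returns the number of distinct words.
    // Assumes both arrays are large enough to hold every distinct word.
    static int countWords(String[] sentenceWords, String[] arrayOfWords, int[] numTimesAppeared) {
        int numWordsInArray = 0;
        for (String word : sentenceWords) {
            boolean found = false;
            for (int y = 0; y < numWordsInArray; y++) {
                if (arrayOfWords[y].equals(word)) {
                    numTimesAppeared[y]++;
                    found = true;
                    break;
                }
            }
            // Decide only after the whole search has finished.
            if (!found) {
                arrayOfWords[numWordsInArray] = word;
                numTimesAppeared[numWordsInArray] = 1;
                numWordsInArray++;
            }
        }
        return numWordsInArray;
    }

    public static void main(String[] args) {
        String[] arrayOfWords = new String[10];
        int[] numTimesAppeared = new int[10];
        int n = countWords("the cat and the dog".split(" "), arrayOfWords, numTimesAppeared);
        for (int x = 0; x < n; x++) {
            System.out.println(arrayOfWords[x] + ": " + numTimesAppeared[x]);
        }
    }
}
```

Printing only the first n slots also avoids tracing unused array entries.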

Try using a Hashtable instead:
Hashtable<String, Integer> words = new Hashtable<String, Integer>();
Look up the current count with
Integer count = words.get(word);
and then store the updated value:
if (count == null) {
words.put(word, 1);
} else {
words.put(word, count + 1);
}

Related

Find duplicate characters in string

I am trying to print how many times each character occurs in a string. For example, if the input is "aabacdceefeg", the output should look like a-->3, b-->1, c-->2, e-->3. The code below does not work for this logic. Can someone suggest a fix, please?
public class Test {
public static void main(String[] args) {
String string1 = "Great responsibility";
char string[] = string1.toCharArray();
HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
for (int i = 0; i < string.length; i++) {
for (int j = i + 1; j < string.length; j++) {
if (string[i] == string[j]) {
Integer value = hashMap.get(string[i]);
hashMap.put(string[i], value+1);
} else {
hashMap.put(string[i], 1);
}
}
}
System.out.println(hashMap);
}
}
There are many answers how to optimise the solution but there are none to show how the original O(N^2) time complexity solution could be fixed.
Here are the things to fix in the original solution (besides the obvious inefficiency):
A character's count must start at 1, since finding it at all is its first occurrence. The original calls hashMap.get before any value is stored, so value + 1 can throw a NullPointerException (as it does for "aabacdceefeg").
If the current char doesn't equal another char, its count must stay unchanged instead of being reset to 1.
A character that was already counted from an earlier position must be skipped, otherwise its later occurrences are counted again.
The fixed code is below:
public static void main(String[] args) {
String string1 = "Great responsibility";
char string[] = string1.toCharArray();
HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
for (int i = 0; i < string.length; i++) {
if (hashMap.containsKey(string[i])) {
continue; // already counted from its first occurrence
}
int count = 1; // the occurrence at index i itself
for (int j = i + 1; j < string.length; j++) {
if (string[i] == string[j]) {
count++;
}
}
hashMap.put(string[i], count);
}
System.out.println(hashMap);
}
You can simplify it by just using the hashMap directly and only using one loop
String string1 = "Great responsibility";
HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
for (Character c : string1.toCharArray()) {
if (hashMap.containsKey(c)) {
int val = hashMap.get(c);
hashMap.put(c, val + 1);
}
else {
hashMap.put(c, 1);
}
}
System.out.println(hashMap);
output
{ =1, a=1, b=1, e=2, G=1, i=3, l=1, n=1, o=1, p=1, r=2, s=2, t=2, y=1}
You only need one level of loops:
Map<Character, Integer> map = new HashMap<Character, Integer>();
for (int i = 0; i < string.length; i++) {
Integer count = map.get(string[i]);
if (count == null) {
map.put(string[i], 1);
} else {
map.put(string[i], count + 1);
}
}
First create a list of the distinct characters, then count each of them in the string.
List<Character> charList = new ArrayList<>();
for (int i = 0; i < string.length; i++) {
if (!charList.contains(string[i])) {
charList.add(string[i]);
}
}
HashMap<Character, Integer> hashMap = new HashMap<Character, Integer>();
for (char ch : charList) {
int count = 0;
for (int j = 0; j < string.length; j++) {
if (ch == string[j]) {
count++;
}
}
hashMap.put(ch, count); // store the final count once per character
}
System.out.println(hashMap);
String string1 = "Great responsibility";
char[] chars = string1.toCharArray();
Map<Character, Integer> map = new HashMap<>();
for(char c : chars)
{
if(map.containsKey(c)) {
int counter = map.get(c);
map.put(c, ++counter);
} else {
map.put(c, 1);
}
}
I hope it will help you..
You can also use Java 8 streams to solve the problem by collecting the characters of the String directly into a count Map.
String string1 = "aabacdceefeg";
Map<Character,Long> countMap = string1.chars().mapToObj(i -> (char)i).collect(
Collectors.groupingBy(Function.identity(), Collectors.counting())
);
System.out.println(countMap);
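If only the characters that occur more than once are wanted, the counting step can be followed by a filter. The sketch below is one way to do that with Map.merge; the class and method names are made up for illustration, and a LinkedHashMap is used only so the output keeps first-seen order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DuplicateChars {
    // Counts every character, then drops the ones seen only once.
    static Map<Character, Integer> duplicates(String input) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (char c : input.toCharArray()) {
            counts.merge(c, 1, Integer::sum); // insert 1 or add 1 to the existing count
        }
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(duplicates("aabacdceefeg")); // {a=3, c=2, e=3}
    }
}
```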

find matching characters in two strings at different indexes

I am a C++ programmer. Out of interest, I am developing a java application.
I have two strings in java:
String word1 = "Fold";
String word2 = "Flow";
Now I need a function to get the count of matching characters in both strings but those that are at different indexes. The strings could be of any length but always both words will be of the same length.
Added:
The count for a character should go up once for each occurrence that both words have in common. Ex: abcd and xyaa should return 1, but abca and xaay should return 2. Hope it is clear now.
For example, the count for the two words above should be 2: only the letters 'o' and 'l' are considered. Although the letter 'f' is present in both words, it is not counted since it is at the same index in both strings.
My method was to create two Map variables and initialize them with 0 for all characters. Then I count how many times each letter occurs in each string (skipping positions where the letters match) and finally check which characters have a count greater than zero in both maps.
Ex:
Map<Character, Integer> word_count_1 = createMap(); // initialize with a:0, b:0, c:0,...z:0
Map<Character, Integer> word_count_2 = createMap(); // initialize with a:0, b:0, c:0,...z:0
int count, value;
for (int i=0; i<word1.length(); i++)
{
if (word1.charAt(i) != word2.charAt(i))
{
value = word_count_1.get(word1.charAt(i));
word_count_1.put(word1.charAt(i), ++value);
value= word_count_2.get(word2.charAt(i));
word_count_2.put(word2.charAt(i), ++value);
}
}
Set set = word_count_2.entrySet();
Iterator i = set.iterator();
Map.Entry<Character, Integer> iter;
while(i.hasNext())
{
iter = (Map.Entry)i.next();
if ( (iter.getValue() > 0) && (word_count_1.get(iter.getKey())) > 0 )
{
count++; // This line has a bug. We shall ignore it for now
}
}
Is there a better way to get the count than what I am trying to do? I just don't have a good feeling about what I have done.
Edited:
The line count++ (that I mentioned having a bug) should be changed to following to give correct result:
int letterCount1 = word_count_1.get(iter.getKey());
int letterCount2 = iter.getValue();
if ( (letterCount1 > 0) && (letterCount2 > 0) )
{
int minVal = letterCount1;
if (minVal > letterCount2)
minVal = letterCount2;
count+= minVal;
}
Java 8 Solution
public int duplicates(String wordOne, String wordTwo ){
Set<Character> charSet = new HashSet<>(109);
wordOne.chars().mapToObj(i -> (char)i).forEach(letter->charSet.add(letter));
int count = 0;
for(int i = 0; i < wordTwo.length(); i++)
if( charSet.contains(wordTwo.charAt(i)) && wordTwo.charAt(i) != wordOne.charAt(i) )
count++;
return count;
}
duplicates("Fold", "Flow"); // -> 2
There's nicer syntax to iterate over the set (see example below) but the actual counting looks fine.
Map<Character, Integer> word_count_1 = createMap(); // initialize with a:0, b:0, c:0,...z:0
Map<Character, Integer> word_count_2 = createMap(); // initialize with a:0, b:0, c:0,...z:0
int count, value;
for (int i=0; i<word1.length(); i++)
{
if (word1.charAt(i) != word2.charAt(i))
{
value = word_count_1.get(word1.charAt(i));
word_count_1.put(word1.charAt(i), ++value);
value= word_count_2.get(word2.charAt(i));
word_count_2.put(word2.charAt(i), ++value);
}
}
Set<Map.Entry<Character, Integer>> set = word_count_2.entrySet();
for (Map.Entry<Character, Integer> iter : set)
{
if ( (iter.getValue() > 0) && (word_count_1.get(iter.getKey())) > 0 )
{
count++; // This line has a bug. We shall ignore it for now
}
}
//Create set which contains word1's unique chars
Set<Character> word1Chars = new HashSet<>();
for(int i = 0; i< word1.length(); i++)
{
char ch = word1.charAt(i);
word1Chars.add(ch);
}
// Count how many chars in word2 are contained in word1 but in another position
int count = 0;
for(int i = 0; i < word2.length(); i++)
{
char ch = word2.charAt(i);
if(ch != word1.charAt(i) && word1Chars.contains(ch))
{
count++;
}
}
EDIT: You have to take into consideration that you may get a different counting depending on which word you iterate. E.g: "abc" and "daa"; "abc" has 1 but "daa" has 2.
If you want the total of correspondences in both words you need to modify this code accordingly.
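The asymmetry described in the EDIT can be checked by wrapping the two loops in a helper and calling it in both directions. This is only a sketch of the set-based approach above; the class name, method name, and parameter names are hypothetical, and equal word lengths are assumed as in the question:

```java
import java.util.HashSet;
import java.util.Set;

class MisplacedMatches {
    // Counts positions in 'scanned' whose character occurs somewhere in
    // 'reference' but differs from the character at the same index.
    static int count(String reference, String scanned) {
        Set<Character> referenceChars = new HashSet<>();
        for (char ch : reference.toCharArray()) {
            referenceChars.add(ch);
        }
        int count = 0;
        for (int i = 0; i < scanned.length(); i++) {
            char ch = scanned.charAt(i);
            if (ch != reference.charAt(i) && referenceChars.contains(ch)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("abc", "daa")); // 2: both 'a's of "daa" occur in "abc"
        System.out.println(count("daa", "abc")); // 1: only the 'a' of "abc" occurs in "daa"
    }
}
```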
You do not need to initialize maps for all the characters.
public static int matchCharCountInDifferentIndex(String word1, String word2) {
Map<Character, Integer> word_count_1 = new HashMap<>();
Map<Character, Integer> word_count_2 = new HashMap<>();
for (int i=0; i<word1.length(); i++)
{
if (word1.charAt(i) != word2.charAt(i))
{
word_count_1.compute(word1.charAt(i), (k, v) -> v == null ? 1 : v + 1);
word_count_2.compute(word2.charAt(i), (k, v) -> v == null ? 1 : v + 1);
}
}
int count = 0;
for (Map.Entry<Character, Integer> e : word_count_2.entrySet())
{
count += Math.min(e.getValue(), word_count_1.getOrDefault(e.getKey(), 0));
}
System.out.printf("word1=%s word2=%s result=%d%n", word_count_1, word_count_2, count);
return count;
}
Tests are
matchCharCountInDifferentIndex("Fold", "Flow"); // -> word1={d=1, l=1, o=1} word2={w=1, l=1, o=1} result=2
matchCharCountInDifferentIndex("abca", "xaay"); // -> word1={a=2, b=1, c=1} word2={a=2, x=1, y=1} result=2
In this code
map.compute(key, (k, v) -> v == null ? 1 : v + 1);
is equivalent to
map.put(key, map.getOrDefault(key, 0) + 1);
And
map.getOrDefault(key, 0)
is equivalent to
map.containsKey(key) ? map.get(key) : 0;

Want to count occurrences of Strings in Java

So I have a .txt file which I am calling using
String[] data = loadStrings("data/data.txt");
The file is already sorted and essentially looks like:
Animal
Animal
Cat
Cat
Cat
Dog
I am looking to create an algorithm to count the sorted list in java, without using any libraries like Multisets or without the use of Maps/HashMaps. I have managed so far to get it print out the top occurring word like so:
ArrayList<String> words = new ArrayList();
int[] occurrence = new int[2000];
Arrays.sort(data);
for (int i = 0; i < data.length; i ++ ) {
words.add(data[i]); //Put each word into the words ArrayList
}
for(int i =0; i<data.length; i++) {
occurrence[i] =0;
for(int j=i+1; j<data.length; j++) {
if(data[i].equals(data[j])) {
occurrence[i] = occurrence[i]+1;
}
}
}
int max = 0;
String most_talked ="";
for(int i =0;i<data.length;i++) {
if(occurrence[i]>max) {
max = occurrence[i];
most_talked = data[i];
}
}
println("The most talked keyword is " + most_talked + " occuring " + max + " times.");
I want rather than just to get the highest occurring word perhaps the top 5 or top 10.
Hope that was clear enough. Thanks for reading
Since you said you don't want to use a Map-like data structure, I think you can do something like this, although it is not very efficient.
I usually prefer to store an index rather than a value.
ArrayList<String> words = new ArrayList();
int[] occurrence = new int[2000];
Arrays.sort(data);
int nwords = 0;
occurrence[nwords]=1;
words.add(data[0]);
for (int i = 1; i < data.length; i ++ ) {
if(!data[i].equals(data[i-1])){ //if a new word is found
words.add(data[i]); //put it into the words ArrayList
nwords++; //increment the index
occurrence[nwords]=0; //initialize its occurrence counter
}
occurrence[nwords]++; //increment the occurrence counter
}
int max;
for(int k=0; k<5; k++){ //loop to find 5 times the most talked word
max = 0; //index of the most talked word
for(int i = 1; i<words.size(); i++) { //for every word
if(occurrence[i]>occurrence[max]) { //if it is more talked than max
max = i; //than it is the new most talked
}
}
println("The most talked keyword is " + words.get(max) + " occurring " + occurrence[max] + " times.");
occurrence[max]=0;
}
Every time I find the word with the highest occurrence count, I set its counter to 0 and scan the array again; this is repeated 5 times.
If you cannot use Guava's Multiset, then you can implement an equivalent yourself. Basically, you just need to create a Map<String, Integer>, which keeps track of counts (value) per each word (key). This means changing this
ArrayList<String> words = new ArrayList<String>();
// ...
for (int i = 0; i < data.length; i ++ ) {
words.add(data[i]); //Put each word into the words ArrayList
}
into this:
Map<String, Integer> words = new HashMap<String, Integer>();
// ...
for (String word : data) {
Integer count = words.get(word);
words.put(word, (count != null ? count.intValue() + 1 : 1));
}
After you've filled the map, just sort it by the values.
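A HashMap has no order of its own, so "sorting by the values" in practice means copying the entries into a list and sorting that. A minimal sketch, assuming Java 8 or later (the class and method names are illustrative only):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SortByCount {
    // Returns the map's entries ordered from highest to lowest count.
    static List<Map.Entry<String, Integer>> sortedByCount(Map<String, Integer> words) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(words.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> words = new HashMap<>();
        words.put("Animal", 2);
        words.put("Cat", 3);
        words.put("Dog", 1);
        List<Map.Entry<String, Integer>> sorted = sortedByCount(words);
        // Print at most the top 5 entries.
        for (int i = 0; i < Math.min(5, sorted.size()); i++) {
            System.out.println(sorted.get(i).getKey() + ": " + sorted.get(i).getValue());
        }
    }
}
```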
If you cannot use a Map either, you can do the following:
First, create a wrapper class for your word counts:
public class WordCount implements Comparable<WordCount> {
private String word;
private int count;
public WordCount(String w, int c) {
this.word = w;
this.count = c;
}
public String getWord() {
return word;
}
public int getCount() {
return count;
}
public void incrementCount() {
count++;
}
@Override
public int compareTo(WordCount other) {
return Integer.compare(this.count, other.count);
}
}
Then, change your code to store WordCount instances in your list (instead of Strings):
ArrayList<WordCount> words = new ArrayList<WordCount>();
// ...
for (String word : data) {
WordCount wc = new WordCount(word, 1);
boolean wordFound = false;
for (WordCount existing : words) {
if (existing.getWord().equals(wc.getWord())) {
existing.incrementCount();
wordFound = true;
break;
}
}
if (!wordFound) {
words.add(wc);
}
}
Finally, after populating the List, simply sort it using Collections.sort(). This is easy because the value objects implement Comparable:
Collections.sort(words, Collections.reverseOrder());
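Because the data in the question is already sorted, equal words sit next to each other, so the inner search over the list can be skipped: it is enough to compare each word with the last one recorded. The following sketch combines that idea with a sort by count and uses no Map; WordRun, topN, and the sample data are assumptions for illustration, not part of the original code:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TopWords {
    // Small holder for a word and the length of its run in the sorted data.
    static final class WordRun {
        final String word;
        int count;
        WordRun(String word) { this.word = word; this.count = 1; }
    }

    static List<WordRun> topN(String[] data, int n) {
        String[] sorted = data.clone();
        Arrays.sort(sorted); // harmless if the input is already sorted
        List<WordRun> runs = new ArrayList<>();
        for (String s : sorted) {
            WordRun last = runs.isEmpty() ? null : runs.get(runs.size() - 1);
            if (last != null && last.word.equals(s)) {
                last.count++;             // same word as the previous one
            } else {
                runs.add(new WordRun(s)); // a new word starts a new run
            }
        }
        // Highest count first; the sort is stable, so ties stay alphabetical.
        runs.sort((a, b) -> Integer.compare(b.count, a.count));
        return new ArrayList<>(runs.subList(0, Math.min(n, runs.size())));
    }

    public static void main(String[] args) {
        String[] data = {"Animal", "Animal", "Cat", "Cat", "Cat", "Dog"};
        for (WordRun r : topN(data, 5)) {
            System.out.println(r.word + " occurs " + r.count + " times");
        }
    }
}
```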
You could try something simple like this..
for( int i = 0; i < words.size(); i++ ){
int count = 0; // reset the counter for each word
System.out.printf("%s: ", words.get( i ));
for( int j = 0; j < words.size(); j++ ) {
if( words.get( i ).equals( words.get( j ) ) )
count++;
}
System.out.printf( "%d\n", count );
}

Java count occurrence of each item in a sorted array

I have an Array of Strings and want to count the occurrences of any single String.
I have already sorted it. (It's a long Array and I wanted to get rid of the O(n²)-loop)
Here is my code. It obviously ends in an index-out-of-bounds exception; the reason is clear, but I don't know how to solve it.
for (int i = 0; i < patternsTest.length-1; i++) {
int occ=1;
String temp=patternsTest[i];
while(temp.equals(patternsTest[i+1])){
i++;
occ++;
}
}
This would be a good place for a HashMap, the key would be the Word, and the value the Number of times it occurs. The Map.containsKey and Map.get methods are constant time lookups which are very fast.
Map<String,Integer> map = new HashMap<String,Integer>();
for (int i = 0; i < patternsTest.length; i++) {
String word=patternsTest[i];
if (!map.containsKey(word)){
map.put(word,1);
} else {
map.put(word, map.get(word) +1);
}
}
As a side benefit you don't even need to sort beforehand!
You can use Java HashMap:
Map<String, Integer> occurrenceOfStrings = new HashMap<String, Integer>();
for(String str: patternsTest)
{
Integer currentValue = occurrenceOfStrings.get(str);
if(currentValue == null)
occurrenceOfStrings.put(str, 1);
else
occurrenceOfStrings.put(str, currentValue + 1);
}
This does not have index out of bounds:
String[] patternsTest = {"a", "b"};
for (int i = 0; i < patternsTest.length-1; i++) {
int occ=1;
String temp=patternsTest[i];
while(temp.equals(patternsTest[i+1])){
i++;
occ++;
}
}
You can cause an Index Out of Bounds by changing the data to:
String[] patternsTest = {"a", "a"};
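The exception comes from the inner while loop reading patternsTest[i+1] without checking that i+1 is still inside the array. A sketch of the same loop with that check added is shown below; the outer loop also runs to the last element so that a trailing unique word is not skipped. The class name, method name, and the LinkedHashMap used to return the results are assumptions for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SortedRunCounter {
    // Expects a sorted array; returns each word with the length of its run.
    static Map<String, Integer> countRuns(String[] patternsTest) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (int i = 0; i < patternsTest.length; i++) {
            int occ = 1;
            String temp = patternsTest[i];
            // Check the bound before looking at the next element.
            while (i + 1 < patternsTest.length && temp.equals(patternsTest[i + 1])) {
                i++;
                occ++;
            }
            result.put(temp, occ);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(countRuns(new String[] {"a", "a"}));           // {a=2}
        System.out.println(countRuns(new String[] {"a", "b", "b", "c"})); // {a=1, b=2, c=1}
    }
}
```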
You could try a map and only one loop:
Map<String, Integer> occurrences = new HashMap<String, Integer>();
String currentString = patternsTest[0];
Integer count = 1;
for (int i = 1; i < patternsTest.length; i++) {
if(currentString.equals(patternsTest[i])) {
count++;
} else {
occurrences.put(currentString, count);
currentString = patternsTest[i];
count = 1;
}
}
occurrences.put(currentString, count);
Guava Multiset solution (two lines of code):
Multiset<String> multiset = HashMultiset.create();
multiset.addAll(Arrays.asList(patternsTest));
//Then you could do...
multiset.count("hello");//Return count the number of occurrences of "hello".
It works with both sorted and unsorted arrays, and the code is easy to maintain.
My solution is:
public int cantOccurences(String pattern, String[] values){
int count = 0;
for (String s : values) {
count += (s.replaceAll("[^".concat(pattern).concat("]"), "").length());
}
return count;
}

How can I keep track of multiple counter variables

I have written some code that counts the number of "if" statements in an unknown number of files. How can I keep a separate count for each file as well as a total of "if" statements across all files?
code:
import java.io.*;
public class ifCounter4
{
public static void main(String[] args) throws IOException
{
// variable to keep track of number of if's
int ifCount = 0;
for (int c = 0; c < args.length; c++)
{
// parameter the TA will pass in
String fileName = args[c];
// create a new BufferReader
BufferedReader reader = new BufferedReader( new FileReader (fileName));
String line = null;
StringBuilder stringBuilder = new StringBuilder();
String ls = System.getProperty("line.separator");
// read from the text file
while (( line = reader.readLine()) != null)
{
stringBuilder.append(line);
stringBuilder.append(ls);
}
// create a new string with stringBuilder data
String tempString = stringBuilder.toString();
// create one last string to look for our valid if(s) in
// with ALL whitespace removed
String compareString = tempString.replaceAll("\\s","");
// check for valid if(s)
for (int i = 0; i < compareString.length(); i++)
{
if (compareString.charAt(i) == ';' || compareString.charAt(i) == '}' || compareString.charAt(i) == '{') // added opening "{" for nested ifs :)
{
i++;
if (compareString.charAt(i) == 'i')
{
i++;
if (compareString.charAt(i) == 'f')
{
i++;
if (compareString.charAt(i) == '(')
ifCount++;
} // end if
} // end if
} // end if
} // end for
// print the number of valid "if(s) with a new line after"
System.out.println(ifCount + " " + args[c]); // <-- this keeps running total
// but not count for each file
}
System.out.println();
} // end main
} // end class
You can create a Map that stores the file names as keys and the count as values.
Map<String, Integer> count = new HashMap<String, Integer>();
After each file,
count.put(fileName, ifCount);
ifCount = 0;
Walk the value set to get the total.
How about a Map which uses the file name as key and keeps the count of ifs as value? For overall count, store it in its own int, or just calculate it when needed by adding up all the values in the Map.
Map<String, Integer> ifsByFileName = new HashMap<String, Integer>();
int totalIfs = 0;
for each if in "file" {
totalIfs++;
Integer currentCount = ifsByFileName.get(file);
if (currentCount == null) {
currentCount = 0;
}
ifsByFileName.put(file, currentCount + 1);
}
// total from the map:
int totalIfsFromMap = 0;
for (Integer fileCount : ifsByFileName.values()) {
totalIfsFromMap += fileCount;
}
Using an array would solve this problem.
int[] ifCount = new int[args.length];
and then in your loop ifCount[c]++;
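Put together, the array approach might look like the sketch below: one slot per input, plus a total computed by summing the slots. To keep the example self-contained it works on in-memory strings instead of files, and countIfs is a hypothetical regex-based stand-in for the question's character scanner, not a replacement for it:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IfCounts {
    // Assumption for illustration: an "if" is the word "if" followed by "(".
    private static final Pattern IF_PATTERN = Pattern.compile("\\bif\\s*\\(");

    static int countIfs(String source) {
        Matcher m = IF_PATTERN.matcher(source);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // One counter per source, stored at the same index as the source.
    static int[] countPerSource(String[] sources) {
        int[] ifCount = new int[sources.length];
        for (int c = 0; c < sources.length; c++) {
            ifCount[c] = countIfs(sources[c]);
        }
        return ifCount;
    }

    static int total(int[] ifCount) {
        int sum = 0;
        for (int count : ifCount) {
            sum += count;
        }
        return sum;
    }

    public static void main(String[] args) {
        String[] sources = {"if (a) { if(b) {} }", "int x = 1;", "while (x) { if (y) {} }"};
        int[] ifCount = countPerSource(sources);
        for (int c = 0; c < sources.length; c++) {
            System.out.println(ifCount[c] + " in source " + c);
        }
        System.out.println(total(ifCount) + " in total");
    }
}
```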
This becomes problematic when many threads want to increment the same set of counters.
Operations such as ifCount[c]++; and ifsByFileName.put(file, currentCount + 1); are not thread safe.
Combining a ConcurrentMap with AtomicLong values also needs care, since the initial value of 0 must be inserted atomically (for example with putIfAbsent) before it can be incremented.
The Google Guava project provides a convenient out-of-the-box solution: AtomicLongMap
With this class you can write:
AtomicLongMap<String> cnts = AtomicLongMap.create();
cnts.incrementAndGet("foo");
cnts.incrementAndGet("bar");
cnts.incrementAndGet("foo");
for (Entry<String, Long> entry : cnts.asMap().entrySet()) {
System.out.println(entry);
}
which prints:
foo=2
bar=1
And is completely thread safe.
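If adding Guava is not an option, a similar counter can be sketched with the standard library alone, assuming Java 8 or later. ConcurrentHashMap.computeIfAbsent creates the per-key counter atomically, and LongAdder handles the concurrent increments. The class below and its method names are made up for illustration:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

class ConcurrentCounters {
    private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();

    void increment(String key) {
        // The LongAdder for a key is created at most once, even under contention.
        counts.computeIfAbsent(key, k -> new LongAdder()).increment();
    }

    long get(String key) {
        LongAdder adder = counts.get(key);
        return adder == null ? 0 : adder.sum();
    }

    // Starts 'threads' threads that each increment "foo" 'perThread' times.
    static long runDemo(int threads, int perThread) {
        ConcurrentCounters counters = new ConcurrentCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.increment("foo");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait so that the final sum is complete
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counters.get("foo");
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // 4000
    }
}
```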
This is a counter that runs from 1 to 100 and puts a * after every multiple of n; change the value of n to mark different multiples.
public class SmashtonCounter_multiples {
public static void main(String[] args) {
int n = 3; // change this variable to mark multiples of a different number
for (int count = 1; count <= 100; count++) {
System.out.print(count);
if ((count % n) == 0) {
System.out.print("*"); // mark multiples of n
}
if (count < 100) {
System.out.print(","); // separator after every number except the last
}
}
}
}
