Using HashMap to count instances - java

I have the following code to count the instances of different strings in an array:
String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
Integer i = wordCounts.get(w);
if(i == null) wordCounts.put(w, 1);
else wordCounts.put(w, i + 1);
}
Is this a correct way of doing it? It seems a bit long-winded for a simple task. The HashMap result is useful to me because I will be indexing it by the string.
I am worried that the line
else wordCounts.put(w, i + 1);
could be inserting a second key-value pair due to the fact that
new Integer(i).equals(new Integer(i + 1));
would be false, so two Integers would end up under the same String key bucket, right? Or have I just over-thought myself into a corner?

Your code will work - but it would be simpler to use HashMultiset from Guava.
// Note: prefer the below over "String words[]"
String[] words = {"the","cat","in","the","hat"};
Multiset<String> set = HashMultiset.create(Arrays.asList(words));
// Write out the counts...
for (Multiset.Entry<String> entry : set.entrySet()) {
System.out.println(entry.getElement() + ": " + entry.getCount());
}

Yes, you are doing it the correct way. HashMap replaces the value if the same key is provided again.
From the Javadoc of HashMap#put:
Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.
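To see that replacement in action, here is a minimal sketch (the class name PutReplacesDemo and its run method are made up for illustration). put returns the previous value for the key, and the map does not grow when the key already exists:

```java
import java.util.HashMap;
import java.util.Map;

final class PutReplacesDemo {
    // Returns {map size after two puts on one key, value returned by the second put, final value}.
    static int[] run() {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 1);                    // first occurrence: no previous mapping
        Integer previous = counts.put("the", 2); // same key: old value is replaced and returned
        return new int[] { counts.size(), previous, counts.get("the") };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("size=" + r[0] + ", previous=" + r[1] + ", current=" + r[2]);
        // prints size=1, previous=1, current=2
    }
}
```

So there is still exactly one entry under "the" after the update.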

Your code is perfectly fine. You map strings to integers. Nothing is duplicated.

HashMap doesn't allow duplicate keys, so there is no way to have more than one key-value pair with the same key in your map.

Here is a String-specific counter that wraps the problem in an object, since I couldn't find anything similar. It should still be genericized and given a sort-by-value option for toString():
package com.phogit.util;
import java.util.Map;
import java.util.HashMap;
import java.util.TreeMap;
public class HashCount {
private final Map<String, Integer> map = new HashMap<>();
public void add(String s) {
if (s == null) {
return;
}
Integer i = map.get(s);
if (i == null) {
map.put(s, 1);
} else {
map.put(s, i+1);
}
}
public int getCount(String s) {
if (s == null) {
return -1;
}
Integer i = map.get(s);
if (i == null) {
return -1;
}
return i;
}
public String toString() {
if (map.size() == 0) {
return null;
}
StringBuilder sb = new StringBuilder();
// sort by key for now
Map<String, Integer> m = new TreeMap<String, Integer>(map);
for (Map.Entry<String, Integer> pair : m.entrySet()) {
sb.append("\t")
.append(pair.getKey())
.append(": ")
.append(pair.getValue())
.append("\n");
}
return sb.toString();
}
public void clear() {
map.clear();
}
}
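A generic variant with a sort-by-count option could look roughly like the sketch below. The name Counter and its methods are hypothetical, not an existing library class, and unlike HashCount it returns 0 rather than -1 for unknown items:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical generic variant of HashCount; names are illustrative only.
final class Counter<T> {
    private final Map<T, Integer> counts = new HashMap<>();

    // Increments the count for item; null items are ignored, as in HashCount.add.
    void add(T item) {
        if (item != null) {
            counts.merge(item, 1, Integer::sum);
        }
    }

    // Returns 0 for items that were never added.
    int getCount(T item) {
        return counts.getOrDefault(item, 0);
    }

    // Returns a copy of the entries ordered by count, highest first.
    List<Map.Entry<T, Integer>> entriesByCountDescending() {
        List<Map.Entry<T, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        return entries;
    }
}
```

Usage would be along the lines of new Counter<String>(), calling add for each word, then iterating entriesByCountDescending() to print.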

Your code looks fine to me and there is no issue with it. Thanks to Java 8 features it can be simplified to:
String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
wordCounts.merge(w, 1, (a, b) -> a + b);
}
The following code
System.out.println("HASH MAP DUMP: " + wordCounts.toString());
would print something like this (HashMap iteration order is not guaranteed):
HASH MAP DUMP: {cat=1, hat=1, in=1, the=2}
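If a predictable print order matters, one option is to count into a TreeMap, which keeps its keys sorted. A small sketch, with SortedWordCounts and count as made-up names:

```java
import java.util.Map;
import java.util.TreeMap;

final class SortedWordCounts {
    // Counts each word with merge; the TreeMap keeps keys in alphabetical order.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // inserts 1 or adds 1 to the existing count
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"the", "cat", "in", "the", "hat"};
        System.out.println(count(words)); // prints {cat=1, hat=1, in=1, the=2}
    }
}
```

Lookups by string still work the same way; the trade-off is O(log n) access instead of the expected O(1) of a HashMap.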

Related

Using a Hashmap to detect duplicates and count of duplicates in a list

I'm trying to use a HashMap to detect duplicates in a given list. If a string is a duplicate, I want to append "1" to it to indicate the duplication. If it occurs 3 times, the third one would get "3" appended.
I can't figure out how to keep track of the number of duplicates. My code only appends 1 to every duplicate, no matter whether it's the 2nd, 3rd, or 4th occurrence.
This is what I have:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<String>();
HashMap<String, Integer> hashmap = new HashMap<String, Integer>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
result.add(current+"1");
} else {
hashmap.put(current,i);
result.add(current);
}
}
return result;
}
I want to include the values that only occur once as well, as is (no concatenation).
Sample Input: ["mixer", "toaster", "mixer", "mixer", "bowl"]
Sample Output: ["mixer", "toaster", "mixer1", "mixer2", "bowl"]
public static List<String> duplicates(List<String> given) {
final Map<String, Integer> count = new HashMap<>();
return given.stream().map(s -> {
int n = count.merge(s, 1, Integer::sum) - 1;
return s + (n < 1 ? "" : n);
}).collect(toList());
}
I renamed final to output as the first one is a keyword that cannot be used as a variable name.
if (hashmap.containsKey(current)) {
output.add(current + hashmap.get(current)); // append the counter to the string
hashmap.put(current, hashmap.get(current)+1); // increment the counter for this item
} else {
hashmap.put(current,1); // set a counter of 1 for this item in the hashmap
output.add(current);
}
You always add the hard-coded string "1" instead of using the count saved in the map:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<>(given.size());
Map<String, Integer> hashmap = new HashMap<>();
for (String current : given) {
if (hashmap.containsKey(current)) {
int count = hashmap.get(current) + 1;
result.add(current + count);
hashmap.put(current, count);
} else {
hashmap.put(current, 0);
result.add(current);
}
}
return result;
}
List<String> finallist = new ArrayList<>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
hashmap.put(current,hashmap.get(current)+1);
} else {
hashmap.put(current,1);
}
String num = hashmap.get(current) == 1 ? "" :Integer.toString(hashmap.get(current));
finallist.add(current+num);
}
System.out.println(finallist);
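For comparison, here is a plain-loop sketch that reproduces the sample output from the question (first occurrence unchanged, then suffixes 1, 2, ...). It relies on Map.merge returning the updated count; the class name DuplicateSuffixes is just for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class DuplicateSuffixes {
    // First occurrence is kept as is; the n-th repeat gets the suffix n.
    static List<String> duplicates(List<String> given) {
        Map<String, Integer> seen = new HashMap<>();
        List<String> result = new ArrayList<>(given.size());
        for (String current : given) {
            // merge returns the new count, so subtracting 1 gives the number of earlier sightings
            int repeats = seen.merge(current, 1, Integer::sum) - 1;
            result.add(repeats == 0 ? current : current + repeats);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(duplicates(Arrays.asList("mixer", "toaster", "mixer", "mixer", "bowl")));
        // prints [mixer, toaster, mixer1, mixer2, bowl]
    }
}
```

Keeping the map update in an ordinary loop avoids mutating shared state from inside a stream lambda.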

print only repeated words in java

I want to display only the words that appear more than once in a string; a word that appears only once should not be printed. I also want to print only words longer than 2 characters (to eliminate is, was, the, etc.).
The code I tried prints all the words along with their occurrence counts.
Code:
public static void main(String args[])
{
Map<String, Integer> wordcheck = new TreeMap<String, Integer>();
String string1="world world is new world of kingdom of palace of kings palace";
String string2[]=string1.split(" ");
for (int i=0; i<string2.length; i++)
{
String string=string2[i];
wordcheck.put(string,(wordcheck.get(string) == null?1: (wordcheck.get(string)+1)));
}
System.out.println(wordcheck);
}
Output:
{is=1, kingdom=1, kings=1, new=1, of=3, palace=2, world=3}
Words that appear only once should not be printed.
I also want to print only words longer than 2 characters (to eliminate is, was, the, etc.).
Use this:
for (String key : wordcheck.keySet()) {
if(wordcheck.get(key)>1)
System.out.println(key + " " + wordcheck.get(key));
}
Keeping track of the number of occurrences in a map will allow you to do this.
import java.util.HashMap;
import java.util.Map.Entry;
import java.util.Set;
public class Test1
{
public static void main(String[] args)
{
String string1="world world is new world of kingdom of palace of kings palace";
String string2[]=string1.split(" ");
HashMap<String, Integer> uniques = new HashMap<String, Integer>();
for (String word : string2)
{
// ignore words 2 or less characters long
if (word.length() <= 2)
{
continue;
}
// add or update the word occurrence count
Integer existingCount = uniques.get(word);
uniques.put(word, (existingCount == null ? 1 : (existingCount + 1)));
}
Set<Entry<String, Integer>> uniqueSet = uniques.entrySet();
boolean first = true;
for (Entry<String, Integer> entry : uniqueSet)
{
if (entry.getValue() > 1)
{
System.out.print((first ? "" : ", ") + entry.getKey() + "=" + entry.getValue());
first = false;
}
}
}
}
To get only the words occurring more than once, you have to filter your map.
Depending on your Java version you can use either this:
List<String> wordsOccuringMultipleTimes = new LinkedList<String>();
for (Map.Entry<String, Integer> singleWord : wordcheck.entrySet()) {
if (singleWord.getValue() > 1) {
wordsOccuringMultipleTimes.add(singleWord.getKey());
}
}
or, starting with Java 8, this equivalent stream pipeline with lambda expressions:
List<String> wordsOccuringMultipleTimes = wordcheck.entrySet().stream()
.filter((entry) -> entry.getValue() > 1)
.map((entry) -> entry.getKey())
.collect(Collectors.toList());
Regarding the nice printing, you have to do something similar while iterating over your result.
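As a rough sketch of how the filtering and the printing could be combined, including the length check from the question (RepeatedWords and describe are made-up names, and the output format is only one possibility):

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class RepeatedWords {
    // Counts the words, then keeps only those longer than 2 characters that occur more than once.
    static String describe(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : text.split("\\s+")) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts.entrySet().stream()
                .filter(e -> e.getKey().length() > 2 && e.getValue() > 1)
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(describe("world world is new world of kingdom of palace of kings palace"));
        // prints palace=2, world=3
    }
}
```

Here "of" occurs three times but is dropped by the length check, and the single-occurrence words are dropped by the count check.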
public static void main(String args[])
{
Map<String, Integer> wordcheck = new TreeMap<String, Integer>();
String string1="world world is new world of kingdom of palace of kings palace";
String string2[]=string1.split(" ");
HashSet<String> set = new HashSet<String>();
for (int i=0; i<string2.length; i++)
{
String data=string2[i];
for(int j=0;j<string2.length;j++)
{
if(i != j)
{
if(data.equalsIgnoreCase(string2[j]))
{
set.add(data);
}
}
}
}
System.out.println("Duplicate word size :"+set.size());
System.out.println("Duplicate words :"+set);
}
TreeMap.toString() is inherited from AbstractMap and the documentation states that
Returns a string representation of this map. The string representation consists of a list of key-value mappings in the order returned by the map's entrySet view's iterator, enclosed in braces ("{}"). Adjacent mappings are separated by the characters ", " (comma and space). Each key-value mapping is rendered as the key followed by an equals sign ("=") followed by the associated value. Keys and values are converted to strings as by String.valueOf(Object).
So it is better to write your own method that prints the TreeMap the way you want.

Split string into key-value pairs

I have a string like this:
pet:cat::car:honda::location:Japan::food:sushi
Here, : separates a key from its value, while :: separates the pairs.
I want to add the key-value pairs to a map.
I can achieve this using:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
String[] test1 = test.split("::");
for (String s : test1) {
String[] t = s.split(":");
map.put(t[0], t[1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
But is there an efficient way of doing this?
I feel the code is inefficient because I have used 2 String[] objects and called the split function twice.
Also, I am using t[0] and t[1] which might throw an ArrayIndexOutOfBoundsException if there are no values.
You could do a single call to split() and a single pass on the String using the following code. But it of course assumes the String is valid in the first place:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
// split on ':' and on '::'
String[] parts = test.split("::?");
for (int i = 0; i < parts.length; i += 2) {
map.put(parts[i], parts[i + 1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
The above is probably a little bit more efficient than your solution, but if you find your code clearer, then keep it, because there is almost zero chance such an optimization has a significant impact on performance, unless you do that millions of times. Anyway, if it's so important, then you should measure and compare.
EDIT:
For those who wonder what ::? means in the above code: String.split() takes a regular expression as argument. A separator is a substring that matches the regular expression. ::? is a regular expression which means: 1 colon, followed by 0 or 1 colon. It thus allows considering :: and : as separators.
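To make the effect of ::? visible, and to cover the ArrayIndexOutOfBoundsException concern from the question, here is a small sketch. ColonSplitDemo and parse are illustrative names, and rejecting an odd token count is just one possible way to handle malformed input:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

final class ColonSplitDemo {
    // Splits on ':' or '::' and pairs up the tokens; rejects input with an unpaired key.
    static Map<String, String> parse(String input) {
        String[] parts = input.split("::?");
        if (parts.length % 2 != 0) {
            throw new IllegalArgumentException("Unpaired key in: " + input);
        }
        Map<String, String> map = new LinkedHashMap<>(); // keeps insertion order for printing
        for (int i = 0; i < parts.length; i += 2) {
            map.put(parts[i], parts[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        // The regex treats both separators alike, so keys and values alternate in one flat array.
        System.out.println(Arrays.toString("pet:cat::car:honda".split("::?"))); // prints [pet, cat, car, honda]
        System.out.println(parse("pet:cat::car:honda"));                       // prints {pet=cat, car=honda}
    }
}
```

Note that this check only catches a missing value at the end or an odd number of tokens overall; it cannot tell which separator was actually used between two tokens.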
Using Guava library it's a one-liner:
String test = "pet:cat::car:honda::location:Japan::food:sushi";
Map<String, String> map = Splitter.on( "::" ).withKeyValueSeparator( ':' ).split( test );
System.out.println(map);
The output:
{pet=cat, car=honda, location=Japan, food=sushi}
This also might work faster than JDK String.split as it does not create a regexp for "::".
Update: it even handles the corner case from the comments correctly:
String test = "pet:cat::car:honda::location:Japan::food:sushi:::cool";
Map<String, String> map = Splitter.on( "::" ).withKeyValueSeparator( ':' ).split( test );
System.out.println(map);
The output is:
{pet=cat, car=honda, location=Japan, food=sushi, =cool}
Your solution is indeed somewhat inefficient.
The person who gave you the string to parse is also somewhat of a clown. There are industry standard serialization formats, like JSON or XML, for which fast, efficient parsers exist. Inventing the square wheel is never a good idea.
First question: Do you care? Is it slow enough that it hinders performance of your application? It's likely not to, but there is only one way to find out. Benchmark your code.
That said, more efficient solutions exist. Below is an example
public static void main (String[] args) throws java.lang.Exception
{
String test = "pet:cat::car:honda::location:Japan::food:sushi";
boolean stateiskey = true;
Map<String, String> map = new HashMap<>();
int keystart = 0;
int keyend = 0;
int valuestart = 0;
int valueend = 0;
for(int i = 0; i < test.length(); i++){
char nextchar = test.charAt(i);
if (stateiskey) {
if (nextchar == ':') {
keyend = i;
stateiskey = false;
valuestart = i + 1;
}
} else {
if (i == test.length() - 1 || (nextchar == ':' && test.charAt(i + 1) == ':')) {
valueend = i;
if (i + 1 == test.length()) valueend += 1; //compensate one for the end of the string
String key = test.substring(keystart, keyend);
String value = test.substring(valuestart, valueend);
keystart = i + 2;
map.put(key, value);
i++;
stateiskey = true;
}
}
}
System.out.println(map);
}
This solution is a finite state machine with only two states. It looks at every character only twice, once when it tests it for a boundary, and once when it copies it to the new string in your map. This is the minimum amount.
It doesn't create objects that are not needed, like stringbuilders, strings or arrays, this keeps collection pressure low.
It maintains good locality. The next character probably always is in cache, so the lookup is cheap.
It comes at a grave cost that is probably not worth it though:
It's far more complicated and less obvious
There are all sorts of moving parts
It's harder to debug when your string is in an unexpected format
Your coworkers will hate you
You will hate you when you have to debug something
Worth it? Maybe. How fast do you need that string parsed exactly?
A quick and dirty benchmark at https://ideone.com/8T7twy tells me that for this string, this method is approximately 4 times faster. For longer strings the difference is likely somewhat greater.
But your version still takes only 415 milliseconds for 100,000 repetitions, whereas this one takes 99 milliseconds.
Try this code - see the comments for an explanation:
HashMap<String,String> hmap = new HashMap<>();
String str="abc:1::xyz:2::jkl:3";
String straraay[]= str.split("::?");
for(int i=0;i<straraay.length;i+=2) {
hmap.put(straraay[i],straraay[i+1]);
}
// print once; looping over the array here would repeat the same output for every token
System.out.println(hmap.values()); // values only
System.out.println(hmap.keySet()); // keys only
I don't know whether this is the best approach, but I think it is another way of doing the same thing without calling split twice:
Map<String, String> map = new HashMap<String, String>();
String test = "pet:cat::car:honda::location:Japan::food:sushi";
String[] test1 = test.replaceAll("::",":").split(":");
for(int i=0;i<test1.length;i=i+2)
{
map.put(test1[i], test1[i+1]);
}
for (String s : map.keySet()) {
System.out.println(s + " is " + map.get(s));
}
Hope it will help :)
This might be useful for a query-string style input:
String referrerString = "utm_source=test_source&utm_medium=test_medium&utm_term=test_term&"
+ "utm_content=test_content&utm_campaign=test_name&referral_code=DASDASDAS";
String[] str = referrerString.split("&");
HashMap<String,String> stringStringHashMap = new HashMap<>();
for (String s : str) {
// each element has the form key=value
String[] strkey = s.split("=");
stringStringHashMap.put(strkey[0], strkey[1]);
}
for (String s : stringStringHashMap.keySet()) {
System.out.println(s + " is " + stringStringHashMap.get(s));
}
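Indexing strkey[1] directly fails when a parameter has no value. A more defensive sketch, assuming a missing value should map to the empty string (QueryStringDemo and parse are made-up names, and no URL decoding is attempted):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class QueryStringDemo {
    // Parses key=value pairs separated by '&'; a missing value becomes the empty string.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // tolerate "a=1&&b=2" and a leading '&'
            }
            String[] kv = pair.split("=", 2); // limit 2: the value may itself contain '='
            params.put(kv[0], kv.length == 2 ? kv[1] : "");
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("utm_source=test_source&referral_code="));
        // prints {utm_source=test_source, referral_code=}
    }
}
```

The limit argument of 2 keeps the trailing empty string for "referral_code=", so the array always has a usable second element when an equals sign is present.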
Your program is absolutely fine.
Since you asked for more optimal code, here is a version that reduces memory use by keeping a few variables instead of storing the pieces in arrays.
Look at your string: it follows a pattern.
key : value :: key : value :: ...
What can we do with this?
Read the key until a single : is reached, then read the value until :: is reached.
package qwerty7;
import java.util.HashMap;
public class Demo {
public static void main(String ar[])
{
StringBuilder s = new StringBuilder("pet:cat::car:honda::location:Japan::food:sushi");
boolean isKey = true;
String key = "", value = "";
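HashMap<String, String> hm = new HashMap<>();
for(int i = 0; i < s.length(); i++)
{
char ch = s.charAt(i);
// guard the look-ahead so the last character does not read past the end of the input
char nextChar = (i + 1 < s.length()) ? s.charAt(i + 1) : '\0';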
if(ch == ':' && nextChar != ':')
{
isKey = false;
continue;
}
else if(ch == ':' && nextChar == ':')
{
hm.put(key, value);
isKey = true;
key = "";
value = "";
i+=1;
continue;
}
if(isKey)
{
key += ch;
}
else
{
value += ch;
}
if(i == s.length() - 1)
{
hm.put(key, value);
}
}
for (String x : hm.keySet()) {
System.out.println(x + " is " + hm.get(x));
}
}
}
This makes a single pass over the input instead of splitting it repeatedly.
It doesn't take up much memory.
Time complexity: O(n)
Output:
car is honda
location is Japan
pet is cat
food is sushi

Add words frequency to Hashtable

I'm trying to write a program that reads words from a file and puts them into a Hashtable. Then I must compute the frequency of each word and output it like this: word, number of appearances.
I know my add method is messed up, but I don't know how to fix it. I'm new to Java.
public class Hash {
private Hashtable<String, Integer> table = new Hashtable<String, Integer>();
public void readFile() {
File file = new File("file.txt");
try {
Scanner sc = new Scanner(file);
String words;
while (sc.hasNext()) {
words = sc.next();
words = words.toLowerCase();
if (words.length() >= 2) {
table.put(words, 1);
add(words);
}
}
sc.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
public void add(String words) {
Set<String> keys = table.keySet();
for (String count : keys) {
if (table.containsKey(count)) {
table.put(count, table.get(count) + 1);
} else {
table.put(count, 1);
}
}
}
public void show() {
for (Entry<String, Integer> entry : table.entrySet()) {
System.out.println(entry.getKey() + "\t" + entry.getValue());
}
}
public static void main(String args[]) {
Hash abc = new Hash();
abc.readFile();
abc.show();
}
}
This is my file.txt
one one
two
three
two
Output :
two , 2
one , 5
three , 3
Set<String> keys = table.keySet();
for (String count : keys) {
if (table.containsKey(count)) {
table.put(count, table.get(count) + 1);
} else {
table.put(count, 1);
}
}
Right now, you're incrementing every key that is already in the map. I don't think you want to loop over anything; you just want the increment-or-insert condition applied to words, which actually represents a single word.
if (table.containsKey(words)) {
table.put(words, table.get(words) + 1);
} else {
table.put(words, 1);
}
You can drop the add function. You attempt to increment after you have already set the value to 1. Instead I would write:
try (Scanner sc = new Scanner(file)) {
while (sc.hasNext()) {
String word = sc.next().toLowerCase();
if (word.length() >= 2) {
Integer count = table.get(word);
table.put(word, count == null ? 1 : (count+1));
}
}
}
Note: in Java 8 you can do all this in a single statement, processing the lines in parallel.
Map<String, Long> wordCount = Files.lines(path).parallel()
.flatMap(line -> Arrays.asList(line.split("\\b")).stream())
.collect(groupingByConcurrent(w -> w, counting()));
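One caveat with the snippet above: split("\\b") also returns the whitespace and punctuation between words as tokens, so those get counted too. A possible variation is sketched here, assuming that words are runs of letters, digits, or underscores (separator \W+) and using an in-memory list of lines instead of Files.lines so that it runs without a file. StreamWordCount and count are illustrative names:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class StreamWordCount {
    // Assumption: anything that is not a letter, digit, or underscore separates words.
    private static final Pattern NON_WORD = Pattern.compile("\\W+");

    // Counts lower-cased words across all lines, skipping empty tokens.
    static Map<String, Long> count(List<String> lines) {
        return lines.stream()
                .flatMap(NON_WORD::splitAsStream)
                .filter(token -> !token.isEmpty()) // a line starting with punctuation yields a leading ""
                .map(String::toLowerCase)
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("one one", "two", "three", "Two!")));
        // prints {one=2, three=1, two=2}
    }
}
```

This version is sequential; whether a parallel stream pays off depends on the input size and is worth measuring rather than assuming.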
Note that the Java 8 versions
map.merge(word, 1, (c, inc) -> c + inc);
or
map.compute(word, (k, c) -> c != null ? c + 1 : 1);
are shorter and likely to be more efficient than
if (table.containsKey(words)) {
table.put(words, table.get(words) + 1);
} else {
table.put(words, 1);
}
and
Integer count = table.get(word);
table.put(word, count == null ? 1 : (count+1));
as suggested by other people in this thread.

Sorting words in order of frequency? (least to greatest)

Does anyone have any idea how to sort a list of words in order of their frequency (least to greatest) using the built-in Collections.sort and the Comparator<String> interface?
I already have a method that gets the count of a certain word in the text file. Now I just need to create a method that compares the counts of the words and puts them in a list sorted from least to greatest frequency.
Any ideas and tips would be very much appreciated. I'm having trouble getting started on this particular method.
public class Parser implements Comparator<String> {
public Map<String, Integer> wordCount;
void parse(String filename) throws IOException {
File file = new File(filename);
Scanner scanner = new Scanner(file);
//mapping of string -> integer (word -> frequency)
Map<String, Integer> wordCount = new HashMap<String, Integer>();
//iterates through each word in the text file
while(scanner.hasNext()) {
String word = scanner.next();
if (scanner.next()==null) {
wordCount.put(word, 1);
}
else {
wordCount.put(word, wordCount.get(word) + 1);;
}
}
scanner.next().replaceAll("[^A-Za-z0-9]"," ");
scanner.next().toLowerCase();
}
public int getCount(String word) {
return wordCount.get(word);
}
public int compare(String w1, String w2) {
return getCount(w1) - getCount(w2);
}
//this method should return a list of words in order of frequency from least to greatest
public List<String> getWordsInOrderOfFrequency() {
List<Integer> wordsByCount = new ArrayList<Integer>(wordCount.values());
//this part is unfinished.. the part i'm having trouble sorting the word frequencies
List<String> result = new ArrayList<String>();
}
}
First of all your usage of scanner.next() seems incorrect. next() will return the next word and move onto next one every time you call it, therefore the following code:
if(scanner.next() == null){ ... }
and also
scanner.next().replaceAll("[^A-Za-z0-9]"," ");
scanner.next().toLowerCase();
will consume and then just throw away words. What you probably want to do is:
String word = scanner.next().replaceAll("[^A-Za-z0-9]"," ").toLowerCase();
at the beginning of your while loop, so that the changes to your word are saved in the word variable, and not just thrown away.
Secondly, the usage of the wordCount map is slightly broken. What you want to do is to check if the word is already in the map to decide what word count to set. To do this, instead of checking for scanner.next() == null you should look in the map, for example:
if(!wordCount.containsKey(word)){
//no count registered for the word yet
wordCount.put(word, 1);
}else{
wordCount.put(word, wordCount.get(word) + 1);
}
alternatively you can do this:
Integer count = wordCount.get(word);
if(count == null){
//no count registered for the word yet
wordCount.put(word, 1);
}else{
wordCount.put(word, count+1);
}
I would prefer this approach, because it's a bit cleaner, and does only one map look-up per word, whereas the first approach sometimes does two look-ups.
Now, to get a list of words in descending order of frequencies, you can convert your map to a list first, then apply Collections.sort() as was suggested in this post. Below is a simplified version suited to your needs:
static List<String> getWordInDescendingFreqOrder(Map<String, Integer> wordCount) {
// Convert map to list of <String,Integer> entries
List<Map.Entry<String, Integer>> list =
new ArrayList<Map.Entry<String, Integer>>(wordCount.entrySet());
// Sort list by integer values
Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
// compare o2 to o1, instead of o1 to o2, to get descending freq. order
return (o2.getValue()).compareTo(o1.getValue());
}
});
// Populate the result into a list
List<String> result = new ArrayList<String>();
for (Map.Entry<String, Integer> entry : list) {
result.add(entry.getKey());
}
return result;
}
Hope this helps.
Edit:
Changed the comparison function as suggested by @dragon66. Thanks.
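Since the question asked for least to greatest, the same idea with the comparison reversed might look like the sketch below, written with a Java 8 lambda. FrequencyOrder and leastToGreatest are made-up names, and breaking ties alphabetically is an extra assumption to make the result deterministic:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class FrequencyOrder {
    // Returns the words ordered from least to most frequent; ties are broken alphabetically.
    static List<String> leastToGreatest(Map<String, Integer> wordCount) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(wordCount.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(a.getValue(), b.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : entries) {
            result.add(entry.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("the", 3);
        counts.put("cat", 1);
        counts.put("hat", 2);
        counts.put("bat", 1);
        System.out.println(leastToGreatest(counts)); // prints [bat, cat, hat, the]
    }
}
```

Integer.compare is used instead of subtracting the counts, which sidesteps overflow for extreme values.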
You can compare and extract ideas from the following:
public class FrequencyCount {
public static void main(String[] args) {
// read in the words as an array
String s = StdIn.readAll();
// s = s.toLowerCase();
// s = s.replaceAll("[\",!.:;?()']", "");
String[] words = s.split("\\s+");
// sort the words
Merge.sort(words);
// tabulate frequencies of each word
Counter[] zipf = new Counter[words.length];
int M = 0; // number of distinct words
for (int i = 0; i < words.length; i++) {
if (i == 0 || !words[i].equals(words[i-1])) // short-circuiting OR
zipf[M++] = new Counter(words[i], words.length);
zipf[M-1].increment();
}
// sort by frequency and print
Merge.sort(zipf, 0, M); // sorting a subarray
for (int j = M-1; j >= 0; j--) {
StdOut.println(zipf[j]);
}
}
}
A solution, close to your original posting with corrections and the sorting as suggested by Torious in the comments:
import java.util.*;
public class Parser implements Comparator <String> {
public Map<String, Integer> wordCount;
void parse ()
{
Scanner scanner = new Scanner (System.in);
// don't redeclare it here - your attribute wordCount will else be shadowed
wordCount = new HashMap<String, Integer> ();
//iterates through each word in the text file
while (scanner.hasNext ()) {
String word = scanner.next ();
// operate on the word, not on next and next of next word from Scanner
word = word.replaceAll ("[^A-Za-z0-9]", " ");
word = word.toLowerCase ();
// look into your map:
if (! wordCount.containsKey (word))
wordCount.put (word, 1);
else
wordCount.put (word, wordCount.get (word) + 1);
}
}
public int getCount (String word) {
return wordCount.get (word);
}
public int compare (String w1, String w2) {
return getCount (w1) - getCount (w2);
}
public List<String> getWordsInOrderOfFrequency () {
List<String> justWords = new ArrayList<String> (wordCount.keySet());
Collections.sort (justWords, this);
return justWords;
}
public static void main (String args []) {
Parser p = new Parser ();
p.parse ();
List<String> ls = p.getWordsInOrderOfFrequency ();
for (String s: ls)
System.out.println (s);
}
}
rodion's solution is a kind of generics hell, but I don't have a simpler one, just a different one.
In the end, his solution is shorter and better.
At first glance, a TreeMap might seem appropriate, but it sorts by key and is of no help for sorting by value, and we can't swap key and value, because we look entries up by the key.
So the next idea is to build a HashMap and use Collections.sort, but that sorts Lists, not Maps. A Map does offer entrySet, but that produces a Set, not a List. That was the point where I took another direction:
I implemented an Iterator: I iterate over the entrySet and only return entries whose value is 1. Entries with a higher value are buffered for later use. When the iterator is exhausted, I look into the buffer, and if it isn't empty, I continue with the buffer's iterator, increment the minimum value I look for, and create a new buffer.
The advantage of an Iterator/Iterable pair is that the values can be obtained with the enhanced for loop.
import java.util.*;
// a short little declaration :)
public class WordFreq implements Iterator <Map.Entry <String, Integer>>, Iterable <Map.Entry <String, Integer>>
{
private Map <String, Integer> counter;
private Iterator <Map.Entry <String, Integer>> it;
private Set <Map.Entry <String, Integer>> buf;
private int maxCount = 1;
public Iterator <Map.Entry <String, Integer>> iterator () {
return this;
}
// The iterator interface expects a "remove ()" - nobody knows why
public void remove ()
{
if (hasNext ())
next ();
}
public boolean hasNext ()
{
return it.hasNext () || ! buf.isEmpty ();
}
public Map.Entry <String, Integer> next ()
{
while (it.hasNext ()) {
Map.Entry <String, Integer> mesi = it.next ();
if (mesi.getValue () == maxCount)
return mesi;
else
buf.add (mesi);
}
if (buf.isEmpty ())
return null;
++maxCount;
it = buf.iterator ();
buf = new HashSet <Map.Entry <String, Integer>> ();
return next ();
}
public WordFreq ()
{
it = fill ();
buf = new HashSet <Map.Entry <String, Integer>> ();
// The "this" here has to be an Iterable to make the foreach work
for (Map.Entry <String, Integer> mesi : this)
{
System.out.println (mesi.getValue () + ":\t" + mesi.getKey ());
}
}
public Iterator <Map.Entry <String, Integer>> fill ()
{
counter = new HashMap <String, Integer> ();
Scanner sc = new Scanner (System.in);
while (sc.hasNext ())
{
push (sc.next ());
}
Set <Map.Entry <String, Integer>> set = counter.entrySet ();
return set.iterator ();
}
public void push (String word)
{
Integer i = counter.get (word);
int n = 1 + ((i != null) ? i : 0);
counter.put (word, n);
}
public static void main (String args[])
{
new WordFreq ();
}
}
Since my solution reads from stdin, you invoke it with:
cat WordFreq.java | java WordFreq
