Add words frequency to Hashtable - java

I'm trying to write a program that takes words from a file and puts them into a Hashtable. Then I must compute the frequency of each word and output it like this: word , number of appearances.
I know my add method is messed up, but I don't know how to fix it. I'm new to Java.
public class Hash {
private Hashtable<String, Integer> table = new Hashtable<String, Integer>();
public void readFile() {
File file = new File("file.txt");
try {
Scanner sc = new Scanner(file);
String words;
while (sc.hasNext()) {
words = sc.next();
words = words.toLowerCase();
if (words.length() >= 2) {
table.put(words, 1);
add(words);
}
}
sc.close();
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
public void add(String words) {
Set<String> keys = table.keySet();
for (String count : keys) {
if (table.containsKey(count)) {
table.put(count, table.get(count) + 1);
} else {
table.put(count, 1);
}
}
}
public void show() {
for (Entry<String, Integer> entry : table.entrySet()) {
System.out.println(entry.getKey() + "\t" + entry.getValue());
}
}
public static void main(String args[]) {
Hash abc = new Hash();
abc.readFile();
abc.show();
}
}
This is my file.txt
one one
two
three
two
Output :
two , 2
one , 5
three , 3

Set<String> keys = table.keySet();
for (String count : keys) {
if (table.containsKey(count)) {
table.put(count, table.get(count) + 1);
} else {
table.put(count, 1);
}
}
Right now, you're incrementing every key that is already in the map, once per word read. I don't think you want to loop over anything; you just want the increment-or-insert check for words, which actually holds only a single word.
if (table.containsKey(words)) {
table.put(words, table.get(words) + 1);
} else {
table.put(words, 1);
}
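Putting that check into the question's read loop, a minimal sketch might look like the following. It assumes the text comes from an in-memory string rather than file.txt so that it runs on its own, and the class name WordFrequencySketch and the count helper are made up for illustration. The unconditional table.put(words, 1) from readFile is left out, since it would reset the count every time a word is read.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.Scanner;

class WordFrequencySketch {
    // Counts words of length >= 2, case-insensitively, as in the question.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> table = new Hashtable<>();
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNext()) {
                String word = sc.next().toLowerCase();
                if (word.length() >= 2) {
                    // Update only the word just read; no loop over the keys.
                    if (table.containsKey(word)) {
                        table.put(word, table.get(word) + 1);
                    } else {
                        table.put(word, 1);
                    }
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Same content as file.txt in the question, held in memory (assumption).
        Map<String, Integer> counts = count("one one\ntwo\nthree\ntwo");
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " , " + entry.getValue());
        }
    }
}
```

With the question's input this should give one = 2, two = 2 and three = 1; a Hashtable does not promise any particular iteration order.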

You can drop the add function. You attempt to increment after you have already set the value to 1. Instead I would write
try (Scanner sc = new Scanner(file)) {
while (sc.hasNext()) {
String word = sc.next().toLowerCase();
if (word.length() >= 2) {
Integer count = table.get(word);
table.put(word, count == null ? 1 : (count+1));
}
}
}
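Note: in Java 8 you can do all this in a single statement, processing the lines in parallel.
// assumes static imports of groupingByConcurrent and counting from java.util.stream.Collectors
Map<String, Long> wordCount = Files.lines(path).parallel()
.flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
.filter(w -> w.length() >= 2)
.collect(groupingByConcurrent(w -> w, counting()));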

Note that the
map.merge(word, 1, (c, inc) -> c + inc);
or
map.compute(word, (k, c) -> c != null ? c + 1 : 1);
versions are shorter and likely to be more efficient than
if (table.containsKey(words)) {
table.put(words, table.get(words) + 1);
} else {
table.put(words, 1);
}
And
Integer count = table.get(word);
table.put(word, count == null ? 1 : (count+1));
Suggested by people in this thread.
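As a rough illustration, here are both calls applied to the same input. The class and method names are hypothetical. Note that compute passes both the key and the current value (null when the key is absent) to its function.

```java
import java.util.HashMap;
import java.util.Map;

class MergeCountSketch {
    static Map<String, Integer> countWithMerge(String[] words) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            // Inserts 1 for a new key, otherwise adds 1 to the existing value.
            map.merge(word, 1, Integer::sum);
        }
        return map;
    }

    static Map<String, Integer> countWithCompute(String[] words) {
        Map<String, Integer> map = new HashMap<>();
        for (String word : words) {
            // c is the current value, or null if the key is not in the map yet.
            map.compute(word, (key, c) -> c == null ? 1 : c + 1);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] words = {"one", "one", "two", "three", "two"};
        System.out.println(countWithMerge(words));
        System.out.println(countWithCompute(words));
    }
}
```

Both methods should produce the same map; Hashtable supports merge and compute as well, so the same calls work on the table from the question.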

Related

Using a Hashmap to detect duplicates and count of duplicates in a list

I'm trying to use a HashMap to detect any duplicates in a given list, and if there are any, I want to add "1" to that String to indicate its duplication. If it occurs 3 times, the third one would add "3" after that string.
I can't figure out how to keep track of the number of duplicates. My code only adds 1 to the duplicates, no matter whether it's the 2nd, 3rd, or 4th occurrence.
This is what I have:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<String>();
HashMap<String, Integer> hashmap = new HashMap<String, Integer>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
result.add(current+"1");
} else {
hashmap.put(current,i);
result.add(current);
}
}
return result;
}
I want to include the values that only occur once as well, as is (no concatenation).
Sample Input: ["mixer", "toaster", "mixer", "mixer", "bowl"]
Sample Output: ["mixer", "toaster", "mixer1", "mixer2", "bowl"]
public static List<String> duplicates(List<String> given) {
final Map<String, Integer> count = new HashMap<>();
return given.stream().map(s -> {
int n = count.merge(s, 1, Integer::sum) - 1;
return s + (n < 1 ? "" : n);
}).collect(toList());
}
I renamed final to output as the first one is a keyword that cannot be used as a variable name.
if (hashmap.containsKey(current)) {
output.add(current + hashmap.get(current)); // append the counter to the string
hashmap.put(current, hashmap.get(current)+1); // increment the counter for this item
} else {
hashmap.put(current,1); // set a counter of 1 for this item in the hashmap
output.add(current);
}
You always add the hard-coded string "1" instead of using the count saved in the map:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<>(given.size());
Map<String, Integer> hashmap = new HashMap<>();
for (String current : given) {
if (hashmap.containsKey(current)) {
int count = hashmap.get(current) + 1;
result.add(current + count);
hashmap.put(current, count);
} else {
hashmap.put(current, 0);
result.add(current);
}
}
return result;
}
List<String> finallist = new ArrayList<String>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
hashmap.put(current,hashmap.get(current)+1);
} else {
hashmap.put(current,1);
}
String num = hashmap.get(current) == 1 ? "" : Integer.toString(hashmap.get(current) - 1); // first repeat gets "1", matching the sample output
finallist.add(current+num);
}
System.out.println(finallist);
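To check any of these variants against the sample input from the question, a small self-contained sketch could look like this. The class name is made up, and merge is used here only as one possible way to keep the per-string counter; it is not taken from the loop-based answers above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class DuplicateSuffixSketch {
    static List<String> duplicates(List<String> given) {
        List<String> result = new ArrayList<>(given.size());
        Map<String, Integer> seen = new HashMap<>();
        for (String current : given) {
            // merge returns the updated count: 1 on first sight, 2 on the first repeat, ...
            int occurrences = seen.merge(current, 1, Integer::sum);
            // The first occurrence stays as is; the n-th repeat gets the suffix n.
            result.add(occurrences == 1 ? current : current + (occurrences - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("mixer", "toaster", "mixer", "mixer", "bowl");
        System.out.println(duplicates(input)); // [mixer, toaster, mixer1, mixer2, bowl]
    }
}
```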

Why is it returning the wrong amount of repetitions in the string?

I want to return the characters that are being repeated as well as the number of times it occurs but my output isn't consistent with what I'm expecting as the output.
It's outputting e 6 times when it should be 4 times as well as outputting j 1 time when it should be 2 times. I'm aware I'm returning it the wrong way as well.
What am I doing wrong and how can I fix it?
public static String solution(String s) {
int i, j, count = 0;
for(i = 0; i < s.length(); i++) {
for(j = i + 1; j < s.length(); j++) {
if(s.charAt(i) == s.charAt(j)) {
System.out.print(s.charAt(i) + " ");
count++;
}
}
}
System.out.println();
System.out.println("no duplicates");
System.out.println("There are " + count + " repetitions");
return s;
}
public static void main(String args[]) {
String s = "eeejiofewnj";
solution(s);
}
output:
e e e e e e j
no duplicates
There are 7 repetitions
So what you are doing wrong is counting for each letter in the string, how many other letters after this one match it.
So for the first e your loop finds 3 matches, for the second e your loop finds 2 matches etc. and adds these all up.
What you want to do is count how many instances of a char there are in a String and then only display the ones that are higher than 1. The way I'd do it is with a map... like this:
public static String solution(String s) {
Map<Character, Integer> counts = new HashMap<Character, Integer>();
// Go through each char and make a map of char to their counts.
for (char c : s.toCharArray()) {
// See if the char is already in the map
Integer count = counts.get(c);
// if it wasn't, start from 0 so that the increment below makes it 1
if (count == null) {
count = 0;
}
count++;
// update the count
counts.put(c, count);
}
// now go through the map and print out any chars if their counts are higher than 1 (meaning there's a duplicate)
for (Entry<Character, Integer> entry : counts.entrySet()) {
if (entry.getValue() > 1) {
System.out.println(MessageFormat.format("there are {0} {1}s",
entry.getValue(), entry.getKey()));
}
}
return s;
}
public static void main(String args[]) {
String s = "eeejiofewnj";
solution(s);
}
Another alternative with Regular Expressions (discussed in more detail here).
public static void solutioniseThis(final String str)
{
Matcher repeatedMatcher = Pattern.compile("(\\w)\\1+").matcher(str);
while (repeatedMatcher.find())
{
int count = 0;
Matcher countMatcher = Pattern.compile(Pattern.quote(repeatedMatcher.group(1))).matcher(str);
while (countMatcher.find())
{
count++;
}
System.out.println(MessageFormat.format("Repeated Character \"{0}\" - found {2} repetitions, {1} sequentially", repeatedMatcher.group(1),
repeatedMatcher.group(0).length(), count));
}
}
public static void main(String args[])
{
solutioniseThis("eeejiofewnj");
}
Produces an output of:
Repeated Character "e" - found 4 repetitions, 3 sequentially
You are counting each matching combination. For e (pseudocode):
CharAt(0) == CharAt(1)
CharAt(0) == CharAt(2)
CharAt(0) == CharAt(7)
CharAt(1) == CharAt(2)
CharAt(1) == CharAt(7)
CharAt(2) == CharAt(7)
For j there is only one:
CharAt(3) == CharAt(10)
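The same arithmetic can be checked in code: a character that occurs n times produces n * (n - 1) / 2 matching pairs, so e (4 occurrences) gives 6 and j (2 occurrences) gives 1, for a total of 7. The sketch below compares that formula with the nested loop from the question; the class and method names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

class PairCountSketch {
    // The nested loop from the question: counts every pair (i, j), i < j, with equal characters.
    static int bruteForcePairs(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            for (int j = i + 1; j < s.length(); j++) {
                if (s.charAt(i) == s.charAt(j)) {
                    count++;
                }
            }
        }
        return count;
    }

    // The same number derived from per-character counts: n occurrences give n * (n - 1) / 2 pairs.
    static int pairsFromCounts(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        int pairs = 0;
        for (int n : counts.values()) {
            pairs += n * (n - 1) / 2;
        }
        return pairs;
    }

    public static void main(String[] args) {
        String s = "eeejiofewnj";
        System.out.println(bruteForcePairs(s)); // 7
        System.out.println(pairsFromCounts(s)); // 7
    }
}
```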
Hello, this simple code also works:
public static void solution(String s) {
int[] repetitons = new int[128];
for (int i=0; i<s.length(); i++){
repetitons[(int)s.charAt(i)]++;
}
int count = 0;
for (int i=0; i<128; i++){
if (repetitons[i]>1){
count+=repetitons[i];
for (int j=0; j<repetitons[i]; j++){
System.out.print((char)i+" ");
}
}
}
System.out.println();
if (count == 0){
System.out.println("no duplicates");
} else {
System.out.println("There are " + count + " repetitions");
}
}
public static void main(String args[]) {
solution("eeejiofewnj");
}
Another solution using recursion.
public Map<Character, Integer> countRecursive(final String s)
{
final Map<Character, Integer> counts = new HashMap<Character, Integer>();
if(!s.isEmpty())
{
counts.putAll(countRecursive(s.substring(1)));
final char c = s.charAt(0);
if(counts.containsKey(c))
{
counts.put(c, counts.get(c) + 1);
}
else
{
counts.put(c, 1);
}
}
return counts;
}
public static void main(String args[])
{
final String s = "eeejiofewnj";
final Map<Character, Integer> counts = new CountCharacters().countRecursive(s);
for(Map.Entry<Character, Integer> count : counts.entrySet())
{
if (count.getValue() > 1)
{
System.out.println(MessageFormat.format("There are {0} {1}s",
count.getValue(), count.getKey()));
}
}
}
Another alternative with Java 8 and Apache Commons Lang.
final String s = "eeejiofewnj";
// StringUtils and Pair come from Apache Commons Lang 3
new HashSet<>(s.chars().mapToObj(e -> (char) e).collect(Collectors.toList())).stream()
.map(c -> Pair.of(c, StringUtils.countMatches(s, String.valueOf(c))))
.filter(count -> count.getRight() > 1)
.forEach(count -> System.out.println("There are " + count.getRight() + " repetitions of " + count.getLeft()));

How to calculate the frequency of each word in a String [duplicate]

This question already has answers here:
How do I compare strings in Java?
(23 answers)
Closed 5 years ago.
This is my code. Can anybody tell me why the output is not correct?
class FrequencyOfWord
{
public static void main(String dt[])
{
String str="Hello World Hello";
int i=0,j=0,space=0,count=0;
for(i=0;i<str.length()-1;i++)
{
if(str.charAt(i)==' ')
{
space++;
}
}
String arr[]=new String[space+1];
String cstr="";
for(i=0;i<str.length();i++)
{
if(str.charAt(i)==' ')
{
arr[j]=cstr;
cstr="";
j++;
}
else
{
cstr=cstr+str.charAt(i);
}
arr[j]=cstr;
}
//System.out.println(str);
for(i=0;i<arr.length;i++)
{
System.out.print(arr[i]);
}
for(i=0;i<arr.length;i++)
{
count=0;
int flag=0;
for(j=0;j<arr.length;j++)
{
if(arr[i]==arr[j])
{
count++;
}
if((arr[i]==arr[j])&&(i>j))
{
flag=1;
}
}
if((count!=0)&&(flag==0))
{
System.out.println(arr[i]+"\t\t"+count);
}
}
}
}
The count comes out as 1 for each word. Can anyone tell me what the error is? The flag variable is used so that the frequency of each word is printed only once.
Your code is way way too complex - try using split and a HashMap
String str="Hello World Hello";
HashMap<String, Integer> res = new HashMap<String, Integer>();
String el [] = str.split("\\s+");
for (String s : el) {
int count = 0;
if (res.containsKey(s)) {
count = res.get(s);
}
res.put(s, count + 1);
}
// output
for (String keys : res.keySet()) {
System.out.printf("%s : %d%n", keys, res.get(keys));
}
output
World : 1
Hello : 2
EDIT: Or using Java 8 you can do
Map<String, Long> freq =
Stream.of(str.trim().split("\\s+"))
.collect(Collectors.groupingBy(w -> w, Collectors.counting()));
freq.forEach((k, v) -> System.out.println(k + ": " + v));
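As for why the original code reports 1 for every word: arr[i]==arr[j] compares object references, not contents, which is what the linked duplicate about comparing strings covers. The words in arr are built at run time by concatenation, so two "Hello" entries are distinct objects and only the i == j comparison matches. A small sketch of the difference, with hypothetical class and helper names:

```java
class StringEqualitySketch {
    // Builds a copy of src one character at a time, like cstr in the question.
    static String build(String src) {
        String out = "";
        for (int i = 0; i < src.length(); i++) {
            out = out + src.charAt(i);
        }
        return out;
    }

    static boolean sameReference(String a, String b) {
        return a == b; // true only if both variables point to the same object
    }

    static boolean sameContent(String a, String b) {
        return a.equals(b); // true if the character sequences match
    }

    public static void main(String[] args) {
        String first = build("Hello");
        String second = build("Hello");
        System.out.println(sameReference(first, second)); // false
        System.out.println(sameContent(first, second));   // true
    }
}
```

Replacing == with equals in the two comparisons of the original loop should be enough to make its counts come out right.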

First Non Repeating Character using hashmap in one loop?

Recently an interviewer asked me to find the first non-repeating character in a string. I implemented it with a HashMap using two separate loops. Although the time complexity is O(n) + O(n), he asked me to solve it in a single loop. Can someone tell me how to do that?
Below is my implementation:
import java.util.HashMap;
import java.util.Map;
public class firstnonrepeating {
public static void main(String[] args) {
String non = "nnjkljklhihis";
Map<Character, Integer> m = new HashMap<Character, Integer>();
for (int i = 0; i < non.length(); i++) {
if (m.get(non.charAt(i)) != null) {
m.put(non.charAt(i), m.get(non.charAt(i)) + 1);
} else {
m.put(non.charAt(i), 1);
}
}
for (int i = 0; i < non.length(); i++) {
if (m.get(non.charAt(i)) == 1) {
System.out.println("First Non Repeating Character is "
+ non.charAt(i));
break;
} else {
if (i == non.length() - 1)
System.out.println("No non repeating Character");
}
}
}
}
String non = "nnnjkljklhihis";
Map<String,LinkedHashSet<Character>> m = new HashMap<String,LinkedHashSet<Character>>() ;
m.put("one", new LinkedHashSet<Character>());
m.put("else", new LinkedHashSet<Character>());
m.put("all", new LinkedHashSet<Character>());
for (int i = 0; i < non.length(); i++) {
if (m.get("all").contains(non.charAt(i))) {
m.get("one").remove(non.charAt(i));
m.get("else").add(non.charAt(i));
} else {
m.get("one").add(non.charAt(i));
m.get("all").add(non.charAt(i));
}
}
if(m.get("one").size()>0){
System.out.println("first non-repeating : "+m.get("one").iterator().next());
}
Here is how I would do it:
import java.util.HashMap;
import java.util.Map;
public class Main
{
public static void main(String[] args)
{
String characters = "nnjkljklhihis";
Character firstNonRepeatingChar = getFirstNonRepeatingCharacter(characters);
if(firstNonRepeatingChar == null)
{
System.out.println("No non repeating characters in " + characters);
}
else
{
System.out.println("The first non repeating character is " + firstNonRepeatingChar);
}
}
private static Character getFirstNonRepeatingCharacter(String characters)
{
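Map<Character, Integer> firstIndex = new HashMap<Character, Integer>();
for(int i = 0; i < characters.length(); i++)
{
Character currentChar = characters.charAt(i);
//Remember where a character first appears; mark it -1 once it repeats
firstIndex.put(currentChar, firstIndex.containsKey(currentChar) ? -1 : i);
}
int best = -1;
for(int index : firstIndex.values())
{
if(index >= 0 && (best < 0 || index < best))
{
best = index;//Earliest character that never repeated
}
}
return best < 0 ? null : characters.charAt(best);//null: no non repeating character found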
}
}
This is the same answer as Osama, re-written in a more modern way.
public static Optional<Character> getFirstNonRepeatingCharacter(String characters) {
HashMap<Character, Consumer<Character>> map = new HashMap<>();
LinkedHashSet<Character> set = new LinkedHashSet<>();
for(char c: characters.toCharArray()) {
map.merge(c, set::add, (_1, _2) -> set::remove).accept(c);
}
return set.stream().findFirst();
}
One more possible solution to this:
public class FirstNonRepeatingCharacterInString {
public static void main(String[] args) {
Character character = firstNonRepeatingCharacter("nnjkljklhihis");
System.out.println("First Non repeating character : " + (character != null ? character : null));
}
private static Character firstNonRepeatingCharacter(String arg) {
char[] characters = arg.toCharArray();
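Map<Character, Character> set = new LinkedHashMap<>();
Set<Character> seen = new HashSet<>();
// cost of the operation is O(n)
for (char c : characters) {
if (seen.add(c)) {
set.put(c, c); // first occurrence: keep it as a candidate
} else {
set.remove(c); // repeated: no longer a candidate, even on a third occurrence
}
}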
//here we are just getting the first value from collection
// not iterating the whole collection and the cost of this operation is O(1)
Iterator<Character> iterator = set.keySet().iterator();
if (iterator.hasNext()) {
return iterator.next();
} else {
return null;
}
}
}
Given a string, find its first non-repeating character:
public class Test5 {
public static void main(String[] args) {
String a = "GiniSoudiptaGinaProtijayi";
Map<Character, Long> map = a.chars().mapToObj(
ch -> Character.valueOf((char)ch)
).collect(Collectors.groupingBy(Function.identity(),
LinkedHashMap:: new,
Collectors.counting()
));
System.out.println(map);
//List<Character> list = map.entrySet().stream().filter( entry -> entry.getValue() == 1 )
//.map(entry -> entry.getKey()).collect(Collectors.toList());
//System.out.println(list);
Character ch = map.entrySet().stream()
.filter( entry -> entry.getValue() == 1L )
.map(entry -> entry.getKey()).findFirst().orElse(null); // null when every character repeats
System.out.println(ch);
}
}

Using HashMap to count instances

I have the following code to count the instances of different strings in an array;
String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
Integer i = wordCounts.get(w);
if(i == null) wordCounts.put(w, 1);
else wordCounts.put(w, i + 1);
}
Is this a correct way of doing it? It seems a bit long-winded for a simple task. The HashMap result is useful to me because I will be indexing it by the string.
I am worried that the line
else wordCounts.put(w, i + 1);
could be inserting a second key-value pair due to the fact that
new Integer(i).equals(new Integer(i + 1));
would be false, so two Integers would end up under the same String key bucket, right? Or have I just over-thought myself into a corner?
Your code will work - but it would be simpler to use HashMultiset from Guava.
// Note: prefer the below over "String words[]"
String[] words = {"the","cat","in","the","hat"};
Multiset<String> set = HashMultiset.create(Arrays.asList(words));
// Write out the counts...
for (Multiset.Entry<String> entry : set.entrySet()) {
System.out.println(entry.getElement() + ": " + entry.getCount());
}
Yes, you are doing it the correct way. HashMap replaces the value if the same key is provided.
From Java doc of HashMap#put
Associates the specified value with the specified key in this map. If the map previously contained a mapping for the key, the old value is replaced.
Your code is perfectly fine. You map strings to integers. Nothing is duplicated.
HashMap doesn't allow duplicate keys, so there is no way to have more than one key-value pair with the same key in your map.
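A quick way to convince yourself is to look at what put returns and at the size of the map afterwards. Only the key's hashCode and equals decide where an entry lives; the value plays no part, so a different Integer simply overwrites the old one. A minimal sketch, with a made-up class and method name:

```java
import java.util.HashMap;
import java.util.Map;

class PutReplacesSketch {
    // Runs the question's update for one key that is already present and returns the map.
    static Map<String, Integer> countTwice(String word) {
        Map<String, Integer> wordCounts = new HashMap<>();
        wordCounts.put(word, 1);
        Integer i = wordCounts.get(word);
        // put returns the value it replaced (1 here); the key is not duplicated.
        Integer replaced = wordCounts.put(word, i + 1);
        System.out.println("replaced value: " + replaced);
        return wordCounts;
    }

    public static void main(String[] args) {
        Map<String, Integer> wordCounts = countTwice("the");
        System.out.println(wordCounts.size() + " entry: " + wordCounts); // 1 entry: {the=2}
    }
}
```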
Here is a String-specific counter that should be genericized and have a sort by value option for toString(), but is an object-oriented wrapper to the problem, since I can't find anything similar:
package com.phogit.util;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
public class HashCount {
private final Map<String, Integer> map = new HashMap<>();
public void add(String s) {
if (s == null) {
return;
}
Integer i = map.get(s);
if (i == null) {
map.put(s, 1);
} else {
map.put(s, i+1);
}
}
public int getCount(String s) {
if (s == null) {
return -1;
}
Integer i = map.get(s);
if (i == null) {
return -1;
}
return i;
}
public String toString() {
if (map.size() == 0) {
return null;
}
StringBuilder sb = new StringBuilder();
// sort by key for now
Map<String, Integer> m = new TreeMap<String, Integer>(map);
for (Map.Entry<String, Integer> pair : m.entrySet()) {
sb.append("\t")
.append(pair.getKey())
.append(": ")
.append(pair.getValue())
.append("\n");
}
return sb.toString();
}
public void clear() {
map.clear();
}
}
Your code looks fine to me and there is no issue with it. Thanks to Java 8 features it can be simplified to:
String words[] = {"the","cat","in","the","hat"};
HashMap<String,Integer> wordCounts = new HashMap<String,Integer>(50,10);
for(String w : words) {
wordCounts.merge(w, 1, (a, b) -> a + b);
}
The following code
System.out.println("HASH MAP DUMP: " + wordCounts.toString());
would print:
HASH MAP DUMP: {cat=1, hat=1, in=1, the=2}
