How to do a counter of specific words in Java?

Hey guys, I am developing a project with 4 questions that someone can rate as great, good, regular, or poor. Afterwards I need to check how many people voted great, good, regular, and poor for each of the 4 questions. So I would like to read the .txt file and count how many times each word (great, good, regular, and poor) appears in it. I was trying to do it like in Python, where you only need a dictionary (or a counter) and simply do something like:
dict["great"] += 1
However, it isn't possible to do so in Java. Does anyone know a method that would be similar to this one in Java, or another simple way to do it (without having to create a lot of variables to store each question's answers)?
Thank you very much for your help.

In Java 8, the compute method was added to the Map interface. It may be a bit more complicated than in Python, but it's probably the closest you can get to the Python code:
Map<String, Integer> map = new HashMap<>();
String rating = ...
map.compute(rating, (key, oldValue) -> ((oldValue == null) ? 1 : oldValue+1));
The lambda expression passed as the second argument to compute receives the key as its first parameter and, as its second parameter, the old value the key was mapped to, or null if there was no mapping.
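As a minimal sketch of the same idea, Map.merge (also added in Java 8) shortens the null check: it stores the given value for a new key and otherwise combines the old and new values with the supplied function. The class name RatingCounter and the sample votes are made up for illustration; reading the .txt file is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RatingCounter {
    // Counts how often each word occurs. merge() stores 1 for a key that has
    // no mapping yet, and otherwise adds 1 to the existing value.
    static Map<String, Integer> count(List<String> ratings) {
        Map<String, Integer> counts = new HashMap<>();
        for (String rating : ratings) {
            counts.merge(rating, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample votes standing in for the words read from the file.
        List<String> votes = Arrays.asList("great", "good", "great", "poor");
        Map<String, Integer> counts = count(votes);
        System.out.println("great: " + counts.get("great"));                 // prints "great: 2"
        System.out.println("regular: " + counts.getOrDefault("regular", 0)); // prints "regular: 0"
    }
}
```

getOrDefault is used for "regular" because a word that never occurred has no entry in the map.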

This is 100% possible in Java.
Use a HashMap to store the values.
For example:
HashMap<String, Integer> counts = new HashMap<>();
counts.put("great", 0);
counts.put("good", 0);
counts.put("regular", 0);
counts.put("poor", 0);
Now, suppose you read in a string input.
To increase the counter, do:
counts.put(input, counts.get(input) + 1);
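This will increase the counter for that key by 1. It assumes input is one of the four keys added above; for any other string, counts.get(input) returns null and the addition throws a NullPointerException.
Use counts.get(input) to read the counter for the input string.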
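Since the question needs separate counts for each of the 4 questions, the same pre-filled map can be kept once per question. The following is only a sketch under assumptions: the class SurveyTally, its methods, and the idea of addressing questions by a zero-based index are all made up here, and unknown words are rejected instead of being counted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SurveyTally {
    static final String[] RATINGS = {"great", "good", "regular", "poor"};

    // One counter map per question, each pre-filled with zeros.
    private final List<Map<String, Integer>> perQuestion = new ArrayList<>();

    SurveyTally(int questionCount) {
        for (int q = 0; q < questionCount; q++) {
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String rating : RATINGS) {
                counts.put(rating, 0);
            }
            perQuestion.add(counts);
        }
    }

    // Records one vote. Returns false for a word that is not a known rating,
    // which avoids the NullPointerException from unboxing a missing value.
    boolean vote(int question, String rating) {
        Map<String, Integer> counts = perQuestion.get(question);
        if (!counts.containsKey(rating)) {
            return false;
        }
        counts.put(rating, counts.get(rating) + 1);
        return true;
    }

    int count(int question, String rating) {
        return perQuestion.get(question).get(rating);
    }

    public static void main(String[] args) {
        SurveyTally tally = new SurveyTally(4);
        tally.vote(0, "great");
        tally.vote(0, "great");
        tally.vote(3, "poor");
        System.out.println(tally.count(0, "great")); // prints "2"
        System.out.println(tally.count(3, "poor"));  // prints "1"
    }
}
```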

Related

how to use associateArray as a key in other associativeArray in PHP (just like using HashMap as key in another HashMap in JAVA)

In Java, we can use HashMap as a key in other HashMap. I'm using an associative array as a map in PHP. Now there is a need to store an associative array as a key in another associative array.
I asked ChatGPT and it presented a lengthy solution:
Suppose $map is an array that I want to use as a key:
ksort($map);
$key = serialize($map);
if(!isset($main[$key])){
$main[$key] = 0;
}
$main[$key]++;
I'm running the above code in a loop where $map is:
on the first iteration: [a=>1, b=>2, c=>3]
on the second iteration: [b=>2, a=>1, c=>3]
After two iterations, $main looks like:
$main["serialized---key"] -> 2
Yes, I need to use ksort because the next $map could contain the same array but keys could be in a different order.
The above solution works, but it slows down drastically on large inputs. I need a better way that doesn't require ksort and serialization.
I also tried spl_object_hash instead of serialize, but it didn't work. Please suggest an optimal approach, just like HashMap in Java.
I also tried the SplObjectStorage class by type-casting the array to an object, but it gives incorrect results.
Detailed problem:
What I'm actually trying to do?
I'm solving the following problem:
Given an array of strings strs, group the anagrams together. You can return the answer in any order.
An Anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.
Example 1:
Input: strs =
["eat","tea","tan","ate","nat","bat"]
Output:
[["bat"],["nat","tan"],["ate","eat","tea"]]
My working code:
In short, I'm just grouping the strings based on their frequency map.
function groupAnagrams(array $arr){
$main = [];
$ret = [];
for($i=0; $i<count($arr); $i++){
$el = $arr[$i];
$map = [];
for($j=0; $j<strlen($el); $j++){
if(!isset($map[$el[$j]])){
$map[$el[$j]]=0;
}
$map[$el[$j]]++;
}
ksort($map);
$key = serialize($map);
if(!isset($main[$key])){
$main[$key] = [];
}
$main[$key][] = $el;
}
//return $main;
foreach($main as $key=>$val){
$ret[] = $val;
}
return $ret;
}
Here is the problem link: https://leetcode.com/problems/group-anagrams/
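For comparison, the Java behaviour the question refers to can be sketched as follows. This is an illustration in Java only, not a PHP solution. java.util.Map defines equals and hashCode in terms of its entries, so two frequency maps with the same entries act as the same key regardless of insertion order, and no sorting or serialization is needed. The names AnagramGroups and group are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AnagramGroups {
    // Groups words by their character-frequency map, using that map as the key.
    static Collection<List<String>> group(String[] words) {
        Map<Map<Character, Integer>, List<String>> groups = new HashMap<>();
        for (String word : words) {
            Map<Character, Integer> freq = new HashMap<>();
            for (char ch : word.toCharArray()) {
                freq.merge(ch, 1, Integer::sum);
            }
            // The key map is not modified after this point, which is required
            // for it to remain a valid HashMap key.
            groups.computeIfAbsent(freq, k -> new ArrayList<>()).add(word);
        }
        return groups.values();
    }

    public static void main(String[] args) {
        String[] strs = {"eat", "tea", "tan", "ate", "nat", "bat"};
        // Three groups are produced; their order is unspecified.
        System.out.println(group(strs));
    }
}
```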

Understanding JavaPairRDD.reduceByKey function

I came across the following Apache Spark code snippet:
JavaRDD<String> lines = new JavaSparkContext(sparkSession.sparkContext()).textFile("src\\main\\resources\\data.txt");
JavaPairRDD<String, Integer> pairs = lines.mapToPair(s -> new Tuple2<>(s, 1));
System.out.println(pairs.collect());
JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);
System.out.println("Reduced data: " + counts.collect());
My data.txt is as follows:
Mahesh
Mahesh
Ganesh
Ashok
Abnave
Ganesh
Mahesh
The output is:
[(Mahesh,1), (Mahesh,1), (Ganesh,1), (Ashok,1), (Abnave,1), (Ganesh,1), (Mahesh,1)]
Reduced data: [(Ganesh,2), (Abnave,1), (Mahesh,3), (Ashok,1)]
While I understand how the first line of output is obtained, I don't understand how the second line is obtained, that is, how JavaPairRDD<String, Integer> counts is formed by reduceByKey.
I found that the signature of reduceByKey() is as follows:
public JavaPairRDD<K,V> reduceByKey(Function2<V,V,V> func)
The [signature](http://spark.apache.org/docs/1.2.0/api/java/org/apache/spark/api/java/function/Function2.html#call(T1, T2)) of Function2.call() is as follows:
R call(T1 v1, T2 v2) throws Exception
The explanation of reduceByKey() reads as follows:
Merge the values for each key using an associative reduce function. This will also perform the merging locally on each mapper before sending results to a reducer, similarly to a "combiner" in MapReduce. Output will be hash-partitioned with the existing partitioner/ parallelism level.
Now this explanation sounds somewhat confusing to me. Maybe there is something more to the functionality of reduceByKey(). Looking at the input and output of reduceByKey() and Function2.call(), I feel that reduceByKey() somehow sends the values of the same key to call() in pairs. But that is not entirely clear to me. Can anyone explain precisely how reduceByKey() and Function2.call() work together?

As its name implies, reduceByKey() reduces data based on the lambda function you pass to it.
In your example, this function is a simple adder: for a and b, return a + b.
The best way to understand how the result is formed is to imagine what happens internally. The ByKey() part groups your records based on their key values. In your example, you'll have 4 different sets of pairs:
Set 1: ((Mahesh, 1), (Mahesh, 1), (Mahesh, 1))
Set 2: ((Ganesh, 1), (Ganesh, 1))
Set 3: ((Ashok, 1))
Set 4: ((Abnave, 1))
Now, the reduce part will try to reduce the previous 4 sets using the lambda function (the adder):
For Set 1: (Mahesh, 1 + 1 + 1) -> (Mahesh, 3)
For Set 2: (Ganesh, 1 + 1) -> (Ganesh, 2)
For Set 3: (Ashok , 1) -> (Ashok, 1) (nothing to add)
For Set 4: (Abnave, 1) -> (Abnave, 1) (nothing to add)
Function signatures can sometimes be confusing because they tend to be generic.
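The pairwise calls can be imitated in plain Java without Spark. The sketch below is not Spark's implementation; it only mimics the observable behaviour on a single machine: the first value seen for a key is stored as is, and every later value for that key is combined with the running result by calling the two-argument function. ReduceByKeySketch and its reduceByKey method are hypothetical names.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class ReduceByKeySketch {
    static <K, V> Map<K, V> reduceByKey(List<Map.Entry<K, V>> pairs, BinaryOperator<V> func) {
        Map<K, V> result = new LinkedHashMap<>();
        for (Map.Entry<K, V> pair : pairs) {
            // New key: store the value. Existing key: func(running, next).
            result.merge(pair.getKey(), pair.getValue(), func);
        }
        return result;
    }

    public static void main(String[] args) {
        String[] lines = {"Mahesh", "Mahesh", "Ganesh", "Ashok", "Abnave", "Ganesh", "Mahesh"};
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(new SimpleEntry<>(line, 1)); // same role as new Tuple2<>(s, 1)
        }
        // For Mahesh the function is called twice: (1, 1) -> 2, then (2, 1) -> 3.
        System.out.println(reduceByKey(pairs, (a, b) -> a + b));
        // prints "{Mahesh=3, Ganesh=2, Ashok=1, Abnave=1}"
    }
}
```

The output order here follows first occurrence because a LinkedHashMap is used; Spark makes no such ordering promise, as the question's own output shows.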

I'm thinking that you probably understand groupByKey? groupByKey groups all values for a certain key into a list (or iterable) so that you can do something with that - like, say, sum (or count) the values. Basically, what sum does is to reduce a list of many values into a single value. It does so by iteratively adding two values to yield one value and that is what Function2 needs to do when you write your own. It needs to take in two values and return one value.
ReduceByKey does the same as a groupByKey, BUT it does what is called a "map-side reduce" before shuffling data around. Because Spark distributes data across many different machines to allow for parallel processing, there is no guarantee that data with the same key is placed on the same machine. Spark thus has to shuffle data around, and the more data that needs to be shuffled the longer our computations will take, so it's a good idea to shuffle as little data as needed.
In a map-side reduce, Spark will first sum all the values for a given key locally on the executors before it sends (shuffles) the result around for the final sum to be computed. This means that much less data (a single value per key instead of a list of values) needs to be sent between the different machines in the cluster, and for this reason reduceByKey is most often preferable to groupByKey.
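The two phases can be imitated in plain Java. This is a single-machine sketch, not Spark code: the split of data.txt into two partitions is an assumption made for illustration, and MapSideCombineSketch, combineLocally, and mergePartials are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapSideCombineSketch {
    // Phase 1: each partition sums its own values per key ("map-side reduce").
    static Map<String, Integer> combineLocally(List<String> partition) {
        Map<String, Integer> partial = new HashMap<>();
        for (String key : partition) {
            partial.merge(key, 1, Integer::sum);
        }
        return partial;
    }

    // Phase 2: only the partial sums are exchanged and merged into the final result.
    static Map<String, Integer> mergePartials(List<Map<String, Integer>> partials) {
        Map<String, Integer> total = new TreeMap<>();
        for (Map<String, Integer> partial : partials) {
            partial.forEach((key, value) -> total.merge(key, value, Integer::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed split of the seven input lines into two partitions.
        List<String> partition1 = Arrays.asList("Mahesh", "Mahesh", "Ganesh", "Ashok");
        List<String> partition2 = Arrays.asList("Abnave", "Ganesh", "Mahesh");

        List<Map<String, Integer>> partials = new ArrayList<>();
        partials.add(combineLocally(partition1)); // 3 partial records instead of 4 pairs
        partials.add(combineLocally(partition2)); // 3 partial records

        System.out.println(mergePartials(partials));
        // prints "{Abnave=1, Ashok=1, Ganesh=2, Mahesh=3}"
    }
}
```

With this tiny input the saving is small (6 partial records instead of 7 pairs), but it grows with the number of repeated keys per partition.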
For a more detailed description, I can recommend this article :)

Count how many list entries have a string property that ends with a particular char

I have an array list with some names inside it (first and last names). What I have to do is go through each "first name" and see how many times a character (which the user specifies) shows up at the end of every first name in the array list, and then print out the number of times that character showed up.
public int countFirstName(char c) {
int i = 0;
for (Name n : list) {
if (n.getFirstName().length() - 1 == c) {
i++;
}
}
return i;
}
That is the code I have. The problem is that the counter (i) doesn't add 1 even if there is a character that matches the end of the first name.

You're comparing the index of last character in the string to the required character, instead of the last character itself, which you can access with charAt:
String firstName = n.getFirstName();
if (firstName.charAt(firstName.length() - 1) == c) {
i++;
}
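A small sketch of why the original comparison compiles but never matches: in length() - 1 == c the char is promoted to its numeric code, so the index is compared with a number such as 97 for 'a'. The helper endsWithChar and the sample name are made up here, and the empty-name guard is an addition, since charAt throws StringIndexOutOfBoundsException on an empty string.

```java
class LastCharCheck {
    // True when name ends with c; empty names never match.
    static boolean endsWithChar(String name, char c) {
        return !name.isEmpty() && name.charAt(name.length() - 1) == c;
    }

    public static void main(String[] args) {
        String firstName = "Anna"; // hypothetical sample value
        char c = 'a';

        // Original check: compares the index 3 with the code of 'a' (97).
        System.out.println(firstName.length() - 1 == c); // prints "false"

        // Corrected check: compares the last character itself.
        System.out.println(endsWithChar(firstName, c));  // prints "true"
    }
}
```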

When you're setting out to learn to code, there is great value in using pencil and paper, or describing your algorithm ahead of time in the language you think in. Most people who learn a foreign language start out by assembling a sentence in their native language, translating it into the foreign language, then speaking it. Few, if any, learners of a foreign language are able to think in it natively.
Coding is no different; all your life you've been speaking English and thinking in it. Now you're aiming to learn a different pattern of thinking, syntax, key words. This task will go a lot easier if you:
work out in high level natural language what you want to do first
write down the steps in clear and simple language, like a recipe
don't try to do too much at once
Had I been a tutor marking your program, I'd have been looking for something like this:
//method to count the number of list entries ending with a particular character
public int countFirstNamesEndingWith(char lookFor) {
//declare a variable to hold the count
int cnt = 0;
//iterate the list
for (Name n : list) {
//get the first name
String fn = n.getFirstName();
//get the last char of it
char lc = fn.charAt(fn.length() - 1);
//compare
if (lc == lookFor) {
cnt++;
}
}
return cnt;
}
Taking the bullet points in turn:
The comments serve as a high level description of what must be done. We write them all first, before writing even a single line of code. My course penalised uncommented code, and writing them first was a handy way of getting the requirement out of the way (they're a chore, right? Not always, but..). It is also really easy to write an algorithm in high level language and then translate the steps into the language you're learning. I definitely think that if you'd taken this approach you wouldn't have made the error you did, as it would have been clear that the code you wrote didn't implement the algorithm you'd described earlier.
Don't try to do too much in one line. Yes, I'm sure plenty of coders think it looks cool, or trick, or shows off what impressive coding smarts they have to pack a good 10 line algorithm into a single line of code that uses some obscure language features, but one day it's highly likely that someone else is going to have to come along to maintain that code, improve it or change part of what it does. At that moment it's no longer cool, and it was never really a smart thing to do.
Aominee, in their comment, actually gives us something like an example of this:
return (int) list.stream().filter(e -> e.charAt(e.length() - 1) == c).count();
It's a one line implementation of a solution to your problem. Cool huh? Well, it has a bug* (for a start) but it's not the main thrust of my argument. At a more basic level: have you got any idea what it's doing? can you look at it and in 2 seconds tell me how it works?
It's quite an advanced language feature, it's trick for sure, but it might be a very poor solution because it's hard to understand, hard to maintain as a result, and does a lot while looking like a little- it only really makes sense if you're well versed in the language. This one line bundles up a facility that loops over your list, a feature that effectively has a tiny sub method that is called for every item in the list, and whose job is to calculate if the name ends with the sought char
It's a brilliant feature, a cute example, and it surely has its place in production Java, but its place is probably not here, in your learning exercise.
Similarly, I'd go as far as to say that this line of yours:
if (n.getFirstName().length() - 1 == c) {
Is approaching "doing too much" - I say this because it's where your logic broke down; you didn't write enough code to effectively implement the algorithm. You'd actually have to write even more code to implement this way:
if (n.getFirstName().charAt(n.getFirstName().length() - 1) == c) {
This is a right eyeful to load into your brain and understand. The accepted answer broke it down a bit by first getting the name into a temporary variable. That's a sensible optimisation. I broke it out another step by getting the last char into a temp variable. In a production system I probably wouldn't go that far, but this is your learning phase - try to minimise the number of operations each of your lines does. It will aid your understanding of your own code a great deal
If you ever do get a penchant for writing as much code as possible in as few chars, look at some of the code golf games here on the Stack Exchange network; the game is to abuse as many language features as possible to make really short, trick code. Pretty much every winner stands as a testament to code that should never, ever be put into a production system maintained by normal coders who value their sanity.
*the bug is it doesn't get the first name out of the Name object
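For reference, a version of the one-liner with that bug fixed might look like the sketch below. The Name class is a hypothetical stand-in for the asker's class, the list is passed as a parameter to keep the example self-contained, and the empty-name guard is an addition.

```java
import java.util.Arrays;
import java.util.List;

class StreamNameCount {
    // Hypothetical stand-in for the asker's Name class.
    static class Name {
        private final String firstName;

        Name(String firstName) {
            this.firstName = firstName;
        }

        String getFirstName() {
            return firstName;
        }
    }

    static int countFirstNamesEndingWith(List<Name> list, char c) {
        return (int) list.stream()
                .map(Name::getFirstName)                                        // get the first name out of the Name object
                .filter(fn -> !fn.isEmpty() && fn.charAt(fn.length() - 1) == c) // keep names whose last char matches
                .count();
    }

    public static void main(String[] args) {
        List<Name> names = Arrays.asList(new Name("Anna"), new Name("Maria"), new Name("John"));
        System.out.println(countFirstNamesEndingWith(names, 'a')); // prints "2"
    }
}
```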

HashMap- java program issues

Rather than explaining some big problem, I'll skip all that and list the small loop I am struggling with. Anyways, I have to print the key of a map, so I am using a special way to print the key by switching the value and the key around.
for (int i = 0; i < elementData.length; i++){
System.out.print("[" + i + "]");
for (Entry<HashEntry<E>, Integer> entry : foob.entrySet()){
if (entry.getValue().equals(i)){
System.out.print(entry.getKey().toString());
}
}
}
This is my goal: Print [0][1][2][3] like that all the way to 20. Along with that, 9 numbers will go in between those numbers in parens randomly, based on my program.
Here is my result:
[0][1]HashSet$HashEntry@7d4991ad[2][3][4]HashSet$HashEntry@4554617cHashSet$HashEntry@28d93b30[5][6][7][8][9]HashSet$HashEntry@232204a1[10][11]
So there's just some trick to make it not print all this machine language looking stuff. Anyways, what do I have to do? Looks like 1 thing was supposed to come after [1], 2 things after [4], something after [9], and so on.
Thanks!

So there's just some trick to make it not print all this machine language looking stuff?
Yea.
Don't try to print an instance of a class that doesn't override Object.toString(). That "machine language looking stuff" is simply the output of Object.toString().
However, I suspect that your real code is doing this:
if (entry.getValue().equals(i)){
System.out.print(entry.toString());
}
because "HashSet$HashEntry@7d4991ad" looks like the output you would get if you printed a HashSet.HashEntry object. (The other possibility is that you have used HashSet.HashEntry objects as keys in your Map.)
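A minimal sketch of the difference an override makes. Both entry classes and their data field are invented for illustration, since the asker's HashEntry class is not shown; the "(7)" format is just one way to get the value printed in parens.

```java
class ToStringDemo {
    // No toString() override: printing falls back to Object.toString(),
    // which yields the class name, '@', and the hash code in hex.
    static class PlainEntry {
        final int data;

        PlainEntry(int data) {
            this.data = data;
        }
    }

    // With an override, printing shows the stored value instead.
    static class PrintableEntry {
        final int data;

        PrintableEntry(int data) {
            this.data = data;
        }

        @Override
        public String toString() {
            return "(" + data + ")";
        }
    }

    public static void main(String[] args) {
        System.out.println(new PlainEntry(7));     // something like ToStringDemo$PlainEntry@1b6d3586 (hash varies)
        System.out.println(new PrintableEntry(7)); // prints "(7)"
    }
}
```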

In Java 8, how do I apply this filter?

First of all I apologize for the title of the question, I couldn't think of a better way to phrase it, please let me know if I should fix it.
Basically I'm a Java programmer who is way too used to imperative programming, discovering (and playing with) the new features of Java 8.
The method I am writing is quite simple, and I have it working fine, I'm just wondering if there's a more "functional" way of solving this.
Basically I receive a list of User, and need to return the percentage of those users that have Status = INVALID.
So here's what I did so far:
public static double getFailedPercentage(List<User> users){
Long failedCount = users.stream().filter(user -> User.Status.INVALID.equals(user.getStatus())).collect(counting());
return (failedCount * 100) / users.size();
}
I would like to keep this as a one liner if possible, I know this might be overthinking things, but I like to know the limits and the possibilities of the language I use.
Thanks in advance for your time.

The following should work:
return users.stream()
.mapToInt(user -> User.Status.INVALID.equals(user.getStatus()) ? 100 : 0)
.average()
.getAsDouble();
It maps each user to either 100 or 0 and then takes the average. That means every INVALID user contributes 100 and every other user contributes 0, so the average is exactly the percentage you asked for.
The 100 is the same hundred you multiplied by. If you want a decimal representation of the percentage, replace it with a 1.
https://docs.oracle.com/javase/tutorial/collections/streams/
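A self-contained sketch of that approach, with a hypothetical User class standing in for the asker's. One deliberate change is labeled in the code: average() returns an OptionalDouble, and getAsDouble() throws NoSuchElementException for an empty list, so orElse(0.0) is used here on the assumption that an empty list should report 0 percent.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FailedPercentage {
    // Hypothetical stand-in for the asker's User class.
    static class User {
        enum Status { VALID, INVALID }

        private final Status status;

        User(Status status) {
            this.status = status;
        }

        Status getStatus() {
            return status;
        }
    }

    static double getFailedPercentage(List<User> users) {
        return users.stream()
                .mapToInt(user -> User.Status.INVALID.equals(user.getStatus()) ? 100 : 0)
                .average()
                .orElse(0.0); // assumption: an empty list counts as 0 percent
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User(User.Status.INVALID),
                new User(User.Status.VALID),
                new User(User.Status.VALID),
                new User(User.Status.VALID));
        System.out.println(getFailedPercentage(users));                          // prints "25.0"
        System.out.println(getFailedPercentage(Collections.<User>emptyList())); // prints "0.0"
    }
}
```

Because average() computes in double, a result such as one INVALID user out of three is not truncated to a whole number, unlike the long division in the question's version.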
