I have an Array which is structured like this :
String Array = {"1","2","3","41","56","41","72","72","72","78","99"}
and I want to partition this array into several arrays, none of which contains duplicate values, like this:
String Array1 = {"1","2","3","41","56","72","78","99"}
String Array2 = {"41","72"}
String Array3 = {"72"}
Is there a straightforward way to do this in Java, or do I have to do it with ugly loops (just kidding!)?
Thanks !
UPDATE
I'm going to make the question a bit harder. Now I have a Map structured like this:
Map<String,String> map = new HashMap<String,String>(){{
put("1##96","10");
put("2##100","5");
put("3##23","100");
put("41##34","14");
put("56##22","25");
put("41##12","100");
put("72##10","100");
put("72##100","120");
put("72##21","0");
put("78##22","7");
}};
Note that the values are not important, but the keys are.
How can I partition this map into submaps like these:
Map map1 = {"1##96" => "10"
"2##100" => "5"
"3##23" => "100"
"41##34" => "14"
"56##22" => "25"
"72##10" => "100"
"78##22" => "7"
}
Map map2 = {
"41##12" => "100"
"72##100" => "120"
}
Map map3 = {
"72##21" => "0"
}
As before, the first part of each key (before '##') is the ID on which I want uniqueness to be based. This is just like the array example, but a bit harder and more complex.
Sorry for changing the question midway...
Probably nothing in libs (seems not generic enough) but some ideas:
O(n) time and O(n) space complexity. Here you just count how many times each number occurs and then put them in that many resulting arrays.
Edit: as @mpkorstanje pointed out, if you change the input from numbers to strings or other objects, in the very worst case this degrades to O(n^2). But in that case you should revise your hashing for the data you're working with, imho, as it's not well distributed.
public List<List<Integer>> split(int[] input) {
Map<Integer, Integer> occurrences = new HashMap<>();
int maxOcc = 0;
for (int val : input) {
int occ = 0;
if (occurrences.containsKey(val)) {
occ = occurrences.get(val);
}
if (occ + 1 > maxOcc) {
maxOcc = occ + 1;
}
occurrences.put(val, occ + 1);
}
List<List<Integer>> result = new ArrayList<>(maxOcc);
for (int i = 0; i < maxOcc; i++) {
result.add(new LinkedList<>());
}
for (Map.Entry<Integer, Integer> entry : occurrences.entrySet()) {
for (int i = 0; i < entry.getValue(); i++) {
result.get(i).add(entry.getKey());
}
}
return result;
}
O(n log n) time and O(1) extra space (not counting the resulting lists), but it doesn't retain the input order and it "destroys" (sorts) the input array. Once the array is sorted, duplicates sit next to each other, so you can walk over it once and add each element to the appropriate result list depending on whether it is a duplicate of the previous element or a "new" entry.
public List<List<Integer>> split(int[] input) {
Arrays.sort(input);
int maxDup = getMaxDuplicateNumber(input);
List<List<Integer>> result = new ArrayList<>(maxDup);
for(int i = 0; i < maxDup; i++) {
result.add(new LinkedList<>());
}
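// guard: an empty input has no first element to seed the loop with
if (input.length == 0) {
return result;
}
int count = 0;
result.get(0).add(input[0]);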
for(int i = 1; i < input.length; i++) {
if(input[i] == input[i-1]) {
count++;
} else {
count = 0;
}
result.get(count).add(input[i]);
}
return result;
}
private int getMaxDuplicateNumber(int[] input) {
int maxDups = 1;
int currentDupCount = 1;
for(int i = 1; i < input.length; i++) {
if(input[i] == input[i - 1]) {
currentDupCount++;
} else {
currentDupCount = 1;
}
if(currentDupCount > maxDups) {
maxDups = currentDupCount;
}
}
return maxDups;
}
You can't do this without loops. But you can use a set to remove some loops. You can add data structure trappings to your own liking.
I'm assuming here that the order of elements in the bins must be consistent with the order of the elements in the input array. If not this can be done more efficiently.
public static void main(String[] args) {
String[] array = { "1", "2", "3", "41", "56", "41", "72", "72", "72",
"78", "99" };
List<Set<String>> bins = new ArrayList<>();
for (String s : array) {
findOrCreateBin(bins, s).add(s);
}
System.out.println(bins); // Prints [[1, 2, 3, 41, 56, 72, 78, 99], [41, 72], [72]]
}
private static Set<String> findOrCreateBin(List<Set<String>> bins, String s) {
for (Set<String> bin : bins) {
if (!bin.contains(s)) {
return bin;
}
}
Set<String> bin = new LinkedHashSet<>();
bins.add(bin);
return bin;
}
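For the Map from the question's update, the same idea can be applied to the part of each key before "##". What follows is only a minimal sketch, assuming Java 8 and an input map with a predictable iteration order (for example a LinkedHashMap; with a plain HashMap the grouping still works, but which entry lands in which submap is arbitrary). The names MapPartition, idOf and partition are made up for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MapPartition {
    // Assumption: the ID is everything before the first "##" in the key.
    static String idOf(String key) {
        int sep = key.indexOf("##");
        return sep < 0 ? key : key.substring(0, sep);
    }

    static List<Map<String, String>> partition(Map<String, String> input) {
        List<Map<String, String>> bins = new ArrayList<>();
        Map<String, Integer> seen = new HashMap<>(); // ID -> occurrences so far
        for (Map.Entry<String, String> e : input.entrySet()) {
            String id = idOf(e.getKey());
            // the n-th occurrence of an ID (counting from 0) goes into bin n
            int n = seen.merge(id, 1, Integer::sum) - 1;
            if (n == bins.size()) {
                bins.add(new LinkedHashMap<>());
            }
            bins.get(n).put(e.getKey(), e.getValue());
        }
        return bins;
    }
}
```

With the keys from the question inserted in the order listed, this should give three maps: the first with one entry per ID, the second with 41##12 and 72##100, and the third with 72##21.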
Problem link: https://codingbat.com/prob/p238573
Requirement:
Write a function that replaces the words in raw with the words in code_words such that the first occurrence of each word in raw is assigned the first unassigned word in code_words.
encoder(["a"], ["1", "2", "3", "4"]) → ["1"]
encoder(["a", "b"], ["1", "2", "3", "4"]) → ["1", "2"]
encoder(["a", "b", "a"], ["1", "2", "3", "4"]) → ["1", "2", "1"]
I tried two different solutions but it still shows that my function doesn't work on "other tests"
First:
public String[] encoder(String[] raw, String[] code_words) {
HashMap<String, String> hm = new HashMap<String, String>();
for (int i=raw.length - 1; i >= 0; i--) {
hm.put(raw[i], code_words[i]);
}
String [] finalarray = new String[raw.length];
for (int i=0; i < raw.length; i++) {
String x = hm.get(raw[i]);
finalarray[i] = x;
}
return finalarray;
}
All the listed tests passed, but the "other tests" failed,
so I thought it was because of this line in the requirements:
the first occurrence of each word in raw is assigned the first unassigned word in code_words
so I updated the code to this:
public String[] encoder(String[] raw, String[] code_words) {
HashMap<String, String> hm = new HashMap<String, String>();
for (int i=0; i < raw.length; i++) {
String word = raw[i];
String value = code_words[i];
if (!hm.containsKey(word)) {
if (hm.containsValue(value)) {
for (int i1=0; i1 < code_words.length; i1++) {
value = code_words[i1];
if (!hm.containsValue(value)) {
hm.put(word, value);
break;
}
}
}
else {
hm.put(word, value);
}
}
}
String[] finalarray = new String[raw.length];
for (int i=0; i < raw.length; i++) {
String x = hm.get(raw[i]);
finalarray[i] = x;
}
return finalarray;
}
But it still failed, and I don't know why.
EDIT:
The problem with my second version was this:
if we assume raw = {"a", "a", "b", "d"}
and code_words = {"1", "2", "3", "4"},
my code would assign "a" to "1", "b" to "3" and "d" to "4".
That leaves "2" unassigned even though it was the first unassigned code word.
The code I provided can work with a few adjustments:
public String[] encoder(String[] raw, String[] code_words) {
HashMap<String, String> hm = new HashMap<String, String>();
for (int i=0; i < raw.length; i++) {
String word = raw[i];
int assigned = 0;
String value = code_words[assigned];
if (!hm.containsKey(word)) {
if (hm.containsValue(value)) {
for (int i1=0; i1 < code_words.length; i1++) {
value = code_words[i1];
if (!hm.containsValue(value)) {
hm.put(word, value);
assigned++;
break;
}
}
}
else {
hm.put(word, value);
assigned++;
}
}
}
String[] finalarray = new String[raw.length];
for (int i=0; i < raw.length; i++) {
String x = hm.get(raw[i]);
finalarray[i] = x;
}
return finalarray;
}
But it's definitely more efficient to use the code provided in the answers below. Thanks to the contributors!
The problem
Your first idea wasn't all that bad. The problem is that you should replace all occurrences of a word in raw with the code word that was the first unassigned one in code_words when that word first appeared.
How to fix
Let's first analyse how to fix your first code. Your idea of using a HashMap is pretty good. Clearly, if a word of raw already exists in the HashMap, you don't want to add it a second time, so you just skip it in your first iteration.
Now, if the ith word in raw has no assigned value in your HashMap, you should map it to the first unassigned word of code_words, which may have a different index than i, so we give it its own index, let's say j. After that, the jth word has been assigned and the first unassigned word has index j+1.
After iterating over raw once like that, every word has an assigned code in your HashMap, and you can iterate over raw one more time and look up the values.
The Code
Your final code will look something like this:
public String[] encoder(String[] raw, String[] code_words) {
HashMap<String, String> dictionary = new HashMap<>();
String[] coded = new String[raw.length];
int j = 0;
for(int i = 0; i < raw.length; i++) {
if(!dictionary.containsKey(raw[i])) { //if it has no assigned value
dictionary.put(raw[i], code_words[j]); //add to hashmap
j++; //set index to next unassigned
}
//do nothing if already found before
}
for(int i = 0; i < raw.length; i++) {
coded[i] = dictionary.get(raw[i]); //get coded word and set in final array
}
return coded;
}
We can write this somewhat more compactly, which some may prefer and others might find more confusing, so it's up to you.
public String[] encoder(String[] raw, String[] code_words) {
HashMap<String, String> dictionary = new HashMap<>();
String[] coded = new String[raw.length];
int j = 0;
for(int i = 0; i < raw.length; i++) {
if(!dictionary.containsKey(raw[i])) { //if it has no assigned value
dictionary.put(raw[i], code_words[j++]); //add to hashmap and also increment index of code_words
}
coded[i] = dictionary.get(raw[i]);
}
return coded;
}
This last code passed all tests.
You're making it a lot more complex than it is.
Yes, you need the hm map, and yes, you only add to it if the raw word isn't already a key in the map.
But to keep track of the next unassigned code_word, all you need is an index into the code_words array.
Map<String, String> hm = new HashMap<>();
int unassigned = 0;
for (String word : raw) {
if (! hm.containsKey(word)) {
hm.put(word, code_words[unassigned]);
unassigned++;
}
}
The code of the entire method can be compacted to:
public String[] encoder(String[] raw, String[] code_words) {
String[] encoded = new String[raw.length];
Map<String, String> hm = new HashMap<>();
for (int i = 0, unassigned = 0; i < raw.length; i++)
if ((encoded[i] = hm.get(raw[i])) == null)
hm.put(raw[i], encoded[i] = code_words[unassigned++]);
return encoded;
}
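If Java 8 is available, the "index of the next unassigned code word" can also be expressed with an iterator and Map.computeIfAbsent. This is only a sketch of the same idea, not what the site expects verbatim; the class name EncoderSketch is made up, and it assumes code_words has at least as many entries as there are distinct words in raw:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

class EncoderSketch {
    static String[] encoder(String[] raw, String[] codeWords) {
        Map<String, String> dictionary = new HashMap<>();
        // the iterator always points at the first code word not handed out yet
        Iterator<String> unassigned = Arrays.asList(codeWords).iterator();
        String[] coded = new String[raw.length];
        for (int i = 0; i < raw.length; i++) {
            // only a word seen for the first time consumes a new code word
            coded[i] = dictionary.computeIfAbsent(raw[i], w -> unassigned.next());
        }
        return coded;
    }
}
```

For raw = {"a", "a", "b", "d"} and code words {"1", "2", "3", "4"} (the case from the question's edit), this yields {"1", "1", "2", "3"}, so "2" is no longer skipped.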
Just update one line
hm.put(raw[i], code_words[raw[i].charAt(0)-'a']);
I've included a picture of the problem below that explains it in more detail. The goal is to just find the k highest occurrences in a dictionary of words. My approach is getting the frequency in a HashMap and then using a Priority Queue to store the max k elements. I then add the max k elements to my return list and return it.
For the given input in the picture, my code returns the correct output:
["i","love"]. The problem is with inputs like the one below:
input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"]
output: ["day","sunny","is","the"]
expected: ["the","is","sunny","day"]
The correct answer would just be the reverse of my current list; however, if I reverse the list before returning it, the original input (the one in the picture) no longer works.
I think this has something to do with how the values are stored in the priority queue when their frequency is the same, but I'm not sure how to check for that.
Any thoughts on how I could fix this?
class Solution {
public List<String> topKFrequent(String[] words, int k) {
HashMap<String, Integer> map = new HashMap<>();
List<String> mostFrequent = new ArrayList<>();
for(int i = 0; i < words.length; i++) {
if(map.containsKey(words[i])) {
map.put(words[i], map.get(words[i]) + 1);
}
else {
map.put(words[i], 1);
}
}
PriorityQueue<String> pq = new PriorityQueue<String>((a,b) -> map.get(a) - map.get(b));
for(String s : map.keySet()) {
pq.add(s);
if(pq.size() > k) {
pq.remove();
}
}
for(String s : pq) {
mostFrequent.add(s);
}
//Collections.reverse(mostFrequent);
return mostFrequent;
}
}
One way to achieve this is to change your code as below:
class Solution {
public static List<String> topKFrequent(String[] words, int k) {
HashMap<String, Integer> map = new HashMap<>();
List<String> mostFrequent = new ArrayList<>();
for(int i = 0; i < words.length; i++) {
if(map.containsKey(words[i])) {
map.put(words[i], map.get(words[i]) + 1);
}
else {
map.put(words[i], 1);
}
}
//Below I am sorting the map entries by frequency (descending), then alphabetically on ties.
List<Map.Entry<String,Integer>> sorted = new ArrayList<>(map.entrySet());
Collections.sort(sorted, (x, y) -> x.getValue().equals(y.getValue()) ? x.getKey().compareTo(y.getKey()) : y.getValue().compareTo(x.getValue()));
//Keep only the k most frequent words.
for(int i = 0; i < k && i < sorted.size(); i++) {
mostFrequent.add(sorted.get(i).getKey());
}
return mostFrequent;
}
}
Here, after creating the frequency map I am sorting them based on frequency and creating one new ArrayList.
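The sort-based idea can also be written with streams. A minimal sketch, assuming Java 8; the class name TopKSorted is made up for illustration. It sorts by count descending, breaks ties alphabetically, and keeps the first k words:

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class TopKSorted {
    static List<String> topKFrequent(String[] words, int k) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum); // count occurrences
        }
        // higher count first; on equal counts, alphabetical order
        Comparator<Map.Entry<String, Integer>> order = (x, y) -> {
            int c = Integer.compare(y.getValue(), x.getValue());
            return c != 0 ? c : x.getKey().compareTo(y.getKey());
        };
        return counts.entrySet().stream()
                .sorted(order)
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

This costs O(n log n) for the sort rather than the O(n log k) of a bounded heap, which only matters when k is much smaller than the number of distinct words.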
You almost did it, but you have a few bugs.
First, your solution is incomplete. The original task asks not only for ordering by frequency but also, when frequencies are equal, for the words to be output in alphabetical order.
To achieve that you can use the following comparator for the PriorityQueue:
PriorityQueue<String> pq = new PriorityQueue<String>((a, b) -> {
int countComparison = Integer.compare(map.get(a), map.get(b));
if (countComparison == 0)
return b.compareTo(a);
return countComparison;
});
The next mistake in your solution is iterating over the PriorityQueue. The PriorityQueue iterator does not guarantee any particular order. From the Javadocs:
The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order.
Because of this, you need to poll the elements from the queue. The part responsible for that:
while(!pq.isEmpty()) {
String s = pq.poll();
mostFrequent.add(s);
}
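The difference between iterating and polling can be checked in isolation. A small sketch (the class and method names are made up for illustration): draining with poll() always yields the elements in comparator order, whereas copying or looping over the queue walks its internal heap layout:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

class PollOrderDemo {
    // Empties the queue via poll(), which always returns the current smallest element.
    static List<Integer> drain(PriorityQueue<Integer> pq) {
        List<Integer> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            out.add(pq.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>(Arrays.asList(5, 1, 4, 2, 3));
        // the order printed here is unspecified and may differ from sorted order
        System.out.println("iteration order: " + new ArrayList<>(pq));
        System.out.println("poll order: " + drain(pq)); // [1, 2, 3, 4, 5]
    }
}
```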
And the final part: because elements come out of the queue from lowest to highest, you need to reverse the output list:
Collections.reverse(mostFrequent);
The final solution will look like this:
public List<String> topKFrequent(String[] words, int k) {
HashMap<String, Integer> map = new HashMap<>();
List<String> mostFrequent = new ArrayList<>();
for(int i = 0; i < words.length; i++) {
if(map.containsKey(words[i])) {
map.put(words[i], map.get(words[i]) + 1);
}
else {
map.put(words[i], 1);
}
}
PriorityQueue<String> pq = new PriorityQueue<String>((a, b) -> {
int countComparison = Integer.compare(map.get(a), map.get(b));
if (countComparison == 0)
return b.compareTo(a);
return countComparison;
});
for(String s : map.keySet()) {
pq.add(s);
if(pq.size() > k) {
pq.remove();
}
}
while(!pq.isEmpty()) {
String s = pq.poll();
mostFrequent.add(s);
}
Collections.reverse(mostFrequent);
return mostFrequent;
}
Two comma separated strings exist.
The first string is essentially the keys and the second is the linked values,
The first string needs to be in ascending order while retaining any duplicate values, and the second string needs to follow suit so that each value stays paired with its key.
I have looked at hashmaps and tuples, without success so far.
System is Java 6
String A = "3, 4, 1, 2, 3"
String B = "19, 24, 32, 68, 50"
Result Output needed
String A = "1, 2, 3, 3, 4"
String B = "32, 68, 19, 50, 24"
There are many ways to meet your requirements. Here are two examples (both implemented side by side in the example code):
You can use a Map<Integer, List<Integer>> that holds each key together with a List of all its values.
You can create a POJO class that holds exactly one key and one value, but you need to make it sortable/comparable and use a suitable data structure.
See this example and pay attention to the code comments:
public class StackoverflowMain {
public static void main(String[] args) {
String a = "3, 4, 1, 2, 3";
String b = "19, 24, 32, 68, 50";
// Map-approach: use a map that maps a key to a list of values
Map<Integer, List<Integer>> ab = new TreeMap<>();
// Pair-approach: make a sortable POJO that holds a key and a value only
// and create a data structure that holds them sorted
SortedSet<Pair> pairList = new TreeSet<Pair>();
// split both Strings by comma
String[] aSplit = a.split(",");
String[] bSplit = b.split(",");
// check if the length of the resulting arrays is the same
if (aSplit.length == bSplit.length) {
// if yes, go through the arrays of numbers
for (int i = 0; i < aSplit.length; i++) {
int key = Integer.parseInt(aSplit[i].trim());
int value = Integer.parseInt(bSplit[i].trim());
// in the Pair-approach, you just have to create a Pair with the value found
Pair pair = new Pair(key, value);
// and add it to the set of pairs
pairList.add(pair);
// the following check is only needed for the Map-solution
if (ab.containsKey(key)) {
// if the key is already present,
// just add the new value to its value list
ab.get(key).add(value);
// sort the value list each time a new value has been added
ab.get(key).sort(Comparator.naturalOrder());
} else {
// if the key is not present in the Map so far,
// create a new List for the value
List<Integer> valueList = new ArrayList<>();
// add the value to that list
valueList.add(value);
// and put both into the Map
ab.put(key, valueList);
}
}
} else {
System.err.println("The Strings have different amounts of elements!");
}
// print what's in the Map
System.out.println("Map-approach:");
for (int key : ab.keySet()) {
List<Integer> value = ab.get(key);
for (int val : value) {
System.out.println(key + " : " + val);
}
}
System.out.println("————————————————");
System.out.println("Pairs-approach:");
for (Pair pair : pairList) {
System.out.println(pair.key + " : " + pair.val);
}
}
/**
* This class is needed for the Pair-approach.
* It is comparable (and by that, sortable) and will be sorted by key
* and if the keys are equal, it will sort by value.
*/
static class Pair implements Comparable<Pair> {
int key;
int val;
Pair(int key, int value) {
this.key = key;
this.val = value;
}
@Override
public int compareTo(Pair otherPair) {
if (key == otherPair.key) {
if (val == otherPair.val) {
return 0;
} else if (val < otherPair.val) { // compare value with value, not with the key
return -1;
} else {
return 1;
}
} else if (key < otherPair.key) {
return -1;
} else {
return 1;
}
}
}
}
This code produces the following output:
Map-approach:
1 : 32
2 : 68
3 : 19
3 : 50
4 : 24
————————————————
Pairs-approach:
1 : 32
2 : 68
3 : 19
3 : 50
4 : 24
EDIT
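Since the Pair-approach is easy to get wrong (compareTo has to compare val with val, and a TreeSet silently drops pairs that compare as equal), I came up with this Map-only approach: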
public class StackoverflowMain {
public static void main(String[] args) {
String a = "3, 4, 1, 3, 3, 2, 3";
String b = "5, 24, 35, 99, 32, 68, 19";
// Map-approach: use a map that maps a key to a list of values
Map<Integer, List<Integer>> ab = new TreeMap<>();
// split both Strings by comma
String[] aSplit = a.split(",");
String[] bSplit = b.split(",");
// check if the length of the resulting arrays is the same
if (aSplit.length == bSplit.length) {
// if yes, go through the arrays of numbers
for (int i = 0; i < aSplit.length; i++) {
int key = Integer.parseInt(aSplit[i].trim());
int value = Integer.parseInt(bSplit[i].trim());
// the following check is only needed for the Map-solution
if (ab.containsKey(key)) {
// if the key is already present, just add the new value to its value list
ab.get(key).add(value);
// sort the value list each time a new value has been added
ab.get(key).sort(Comparator.naturalOrder());
} else {
// if the key is not present in the Map so far, create a new List for the value
List<Integer> valueList = new ArrayList<>();
// add the value to that list
valueList.add(value);
// and put both into the Map
ab.put(key, valueList);
}
}
} else {
System.err.println("The Strings have different amounts of elements!");
}
// print what's in the Map
System.out.println("Map-approach:");
for (int key : ab.keySet()) {
List<Integer> value = ab.get(key);
for (int val : value) {
System.out.println(key + " : " + val);
}
}
}
}
It is shorter and uses a Map<Integer, List<Integer>> and sorts the List<Integer> every time a new value gets added (apart from the first value, which doesn't need a sort). That needed another loop in the output code, but you don't have to create a new class.
It produces the following output:
Map-approach:
1 : 35
2 : 68
3 : 5
3 : 19
3 : 32
3 : 99
4 : 24
First: Java 6 is too old; use at least Java 8 (lambdas, expressive Streams). Also start variable and method names with a lowercase letter, as this is a very firm community convention.
In Java 6 a poor solution would be an array of packed longs:
Sorting must be done on pairs of A and B values, so here each pair is packed into one long.
Wanting things ordered implies a logarithmic access time for a get,
hence a binary search on a sorted array is feasible.
The consequence of using arrays is the fixed array size, which makes insert/delete cumbersome;
a List would be better, and a Map best.
First, turn the strings into an array of packed elements:
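import java.util.Arrays;

public class MapIntToInts {
long[] ab;
public MapIntToInts (String A, String B) {
String[] a = A.split(", ");
String[] b = B.split(", ");
ab = new long[a.length];
for (int i = 0; i < ab.length; ++i) {
// widen to long before shifting; an int shifted by 32 is left unchanged
long key = ((long) Integer.parseInt(a[i])) << 32;
// keep only the low 32 bits so a negative value cannot overwrite the key
long value = Integer.parseInt(b[i]) & 0xFFFFFFFFL;
ab[i] = key | value;
}
Arrays.sort(ab);
}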
Getting the values of one A key can be done by a binary search in O(log N) time:
public int[] get(int key) {
long abKey = ((long) key) << 32; // smallest packed entry possible for this key
int firstI = Arrays.binarySearch(ab, abKey);
if (firstI < 0) { // Not found
firstI = ~firstI ; // Insert position
}
// binarySearch may hit any of several equal entries; step back to the first one
while (firstI > 0 && ab[firstI - 1] >> 32 == key) {
--firstI;
}
int i = firstI;
while (i < ab.length && ab[i] >> 32 == key) {
++i;
}
int n = i - firstI;
int[] result = new int[n];
for (int j = 0; j < n; ++j) {
result[j] = (int)ab[firstI + j];
}
Arrays.sort(result); // For mixed negative positive values
return result;
}
}
Usage:
String A = "3, 4, 1, 2, 3";
String B = "19, 24, 32, 68, 50";
MapIntToInts map = new MapIntToInts(A, B);
int[] results = map.get(3);
Improvements:
Replace long[] ab; by int[] a; int[] b; or better:
Replace long[] ab; by int[] uniqueKeys; int[][] values;.
You should invest more in decomposing a problem into smaller chunks. For example, converting a string into an array of integers is a far simpler operation than the one declared in the title. Let's assume it can be dealt with separately.
Integer[] A = {3, 4, 1, 2, 3};
Integer[] B = {19, 24, 32, 68, 50};
So, when you have two integer arrays at your disposal, you can benefit from natural ordering—it will allow you to skip Comparator implementation should you choose a SortedMap solution.
SortedMap<Integer, List<Integer>> map = new TreeMap<Integer, List<Integer>>();
for (int i = 0; i < A.length; i++) {
if (map.get(A[i]) == null) {
List<Integer> list = new ArrayList<Integer>();
list.add(B[i]);
map.put(A[i], list);
} else {
map.get(A[i]).add(B[i]);
}
}
With the above in place, the most complex thing left is getting rid of the extra separator comma when creating the output (here it ends up at the front). But with a small trick it shouldn't be a problem.
StringBuilder sb = new StringBuilder();
for (List<Integer> list : map.values()) {
for (Integer integer : list) {
sb.append(", ").append(integer);
}
}
String output = sb.toString().substring(2);
If you want to verify the answer, you can enable assertions by passing -ea argument to JVM on application start (or just write a unit test).
assert output.equals("32, 68, 19, 50, 24");
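Since the question mentions Java 6, here is a sketch that avoids diamonds and lambdas. It relies on the documented fact that Arrays.sort on an object array is stable, so rows with the same key keep their original relative order. The class and method names (PairedSort, sortTogether) are made up for illustration, and the input is assumed to contain only comma-separated integers:

```java
import java.util.Arrays;
import java.util.Comparator;

class PairedSort {
    // Returns { sortedA, sortedB }; values travel with their keys.
    static String[] sortTogether(String a, String b) {
        String[] keys = a.trim().split("\\s*,\\s*");
        String[] vals = b.trim().split("\\s*,\\s*");
        if (keys.length != vals.length) {
            throw new IllegalArgumentException("Both strings need the same number of elements");
        }
        int[][] pairs = new int[keys.length][];
        for (int i = 0; i < keys.length; i++) {
            pairs[i] = new int[] { Integer.parseInt(keys[i]), Integer.parseInt(vals[i]) };
        }
        // compare on the key only; the stable sort keeps duplicates in input order
        Arrays.sort(pairs, new Comparator<int[]>() {
            public int compare(int[] x, int[] y) {
                return x[0] < y[0] ? -1 : (x[0] == y[0] ? 0 : 1);
            }
        });
        StringBuilder outA = new StringBuilder();
        StringBuilder outB = new StringBuilder();
        for (int i = 0; i < pairs.length; i++) {
            if (i > 0) {
                outA.append(", ");
                outB.append(", ");
            }
            outA.append(pairs[i][0]);
            outB.append(pairs[i][1]);
        }
        return new String[] { outA.toString(), outB.toString() };
    }
}
```

Called with the two strings from the question, it returns "1, 2, 3, 3, 4" and "32, 68, 19, 50, 24".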
I have a question from a quiz :
If input data of randomList are 4 5 1 2 3 4
Results are:
pick(4) -> 4 4
pick(1) -> 1
pick(2) -> 2
pick(6) -> there is no value
This is the given code, and we're free to add code anywhere:
public static void main(String[] args){
List<Integer> randomList = new ArrayList<>();
for(int i = 0; i < 100000000; i++) {
randomList.add(new Random().nextInt());
}
.....
System.out.println("result = " + pick(new Random().nextInt()));
The question is: what is the most efficient implementation of pick(), one that is better than O(n)?
This is my version of O(n) :
static List<Integer> list2 = new ArrayList<>();
public static void main(String[] args){
List<Integer> randomList = new ArrayList<>();
for(int i = 0; i < 10; i++) {
randomList.add(new Random().nextInt(5)+1);
}
list2 = randomList;
System.out.println("result = " + pick(new Random().nextInt(5)+1));
}
public static String pick(int rand) {
String result = "";
System.out.println("search = " + rand);
for(Integer s : list2) {
if(s == rand) {
result = result + " " + rand;
}
}
return result;
}
Given your constraints, there is no better searching algorithm besides O(n). The reason for this:
Your data contains 100,000,000 "randomized" values in no particular order
You want to collect all values which match a given number (in your example, 4)
You have no ability to sort the list (which would incur an additional O(n*log(n)) overhead)
The only way this could get better is if you could move your data set to a different data structure, such as a Map. Then, you would incur an O(n) penalty for loading the data, but you'd be able to find the values in constant time after that.
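A sorted copy of the data is another such structure: after a one-time O(n log n) sort, each lookup needs only two binary searches. This is a sketch under the assumption that the data fits in an int[]; the class name SortedPick and its methods are made up for illustration:

```java
import java.util.Arrays;

class SortedPick {
    private final int[] sorted;

    SortedPick(int[] data) {
        sorted = data.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);   // one-time O(n log n) cost
    }

    // Index of the first element that is >= target (sorted.length if none).
    private int lowerBound(long target) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] < target) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of occurrences of value, found in O(log n).
    int count(int value) {
        // target is a long so that value + 1 cannot overflow
        return lowerBound((long) value + 1) - lowerBound(value);
    }
}
```

pick() could then repeat the value count(value) times, so the total cost of a lookup is O(log n) plus the size of the output.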
If you use a Map in which the key is your input value and the value is its frequency, then the Map will find a key in O(1) time. Building the result string will be proportional to the frequency of the key, though. So, the code could be as follows:
static Map<Integer, Integer> mapList = new HashMap<>();
public static void main(String[] args){
for(int i = 0; i < 10; i++) {
int key = new Random().nextInt(5)+1;
if (mapList.containsKey(key)) {
mapList.put(key, mapList.get(key) + 1);
} else {
mapList.put(key, 1);
}
}
System.out.println("result = " + pick(new Random().nextInt(5)+1));
}
public static String pick(int rand) {
Integer count = mapList.get(rand);
if (count == null) {
return "";
}
StringJoiner sj = new StringJoiner(" ");
for (int i = 0; i < count; i++) {
sj.add(String.valueOf(rand));
}
return sj.toString();
}
Edit
As suggested by @Pshemo, StringJoiner is used instead of StringBuilder as it's more compact and doesn't add a redundant space after the last element.
I do not fully understand how to return a 2D object. I wrote a method that takes a document as input and has to return a list of all unique words in it and their number of occurrences, sorted by number of occurrences in descending order. A requirement I cannot control is that this be returned as a 2-dimensional array of String.
So here is what I have so far:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
Map<String, String> map = new HashMap<>();
String[] allWords = document.toLowerCase().split("[^a-zA-Z]+");
for (String s : allWords) {
if (map.containsKey(s)) {
int newVersion = (Integer.parseInt(map.get(s).substring(1, map.get(s).length())) + 1);
String sb = Integer.toString(newVersion);
map.put(s, sb);
} else {
map.put(s, "1");
}
}
String[][] array = new String[map.size()][2];
int count = 0;
for (Map.Entry<String, String> entry : map.entrySet()) {
array[count][0] = entry.getKey();
array[count][1] = entry.getValue();
count++;
}
return array;
}
I'm trying to use a HashMap to store the words and their occurrences. What is the best way to store key --> value pairs from a table into a String[][]?
If the input is:
input: document = "Practice makes perfect. you'll only
get Perfect by practice. just practice!"
The output should be:
output: [ ["practice", "3"], ["perfect", "2"],
["by", "1"], ["get", "1"], ["just", "1"],
["makes", "1"], ["only", "1"], ["youll", "1"] ]
How do I store data like this in a 2D array?
String[][] simply is the wrong data structure for this task.
You should use a Map<String, Integer> map instead of <String, String> during the method run and simply return exactly that map.
This has multiple reasons:
You store integers as strings, and even do calculations by parsing the String to an integer, calculating, and then converting back to a String: a bad idea.
The returned array does not guarantee its dimensions; there is no way to enforce that each sub-array has exactly two elements.
Note regarding your comment: if (for some reason) you need to convert the map to a String[][] you can certainly do that, but that conversion logic should be separated from the code generating the map itself. That way the code for wordCountEngine remains clean and easily maintainable.
Just because you need to return a particular typed data-structure does not mean you need to create similarly typed map inside your method. Nothing prevents you from using Map<String, Integer> and then converting it to String[][]:
Here is the code that does not use Java 8 streams:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
Map<String, Integer> map = new HashMap<>();
for ( String s : document.toLowerCase().split("[^a-zA-Z]+") ){
Integer c = map.get(s);
map.put(s, c != null ? c + 1: 1);
}
String[][] result = new String[ map.size() ][ 2 ];
int count = 0;
for ( Map.Entry<String, Integer> e : map.entrySet() ){
result[count][0] = e.getKey();
result[count][1] = e.getValue().toString();
count += 1;
}
return result;
}
And for fun a Java8 version:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
return Arrays
//convert words into map with word and count
.stream( document.toLowerCase().split("[^a-zA-Z]+") )
.collect( Collectors.groupingBy( s -> s, Collectors.summingInt(s -> 1) ) )
//convert the above map to String[][]
.entrySet()
.stream().map( (e) -> new String[]{ e.getKey(), e.getValue().toString() } )
.toArray( String[][]::new );
}
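To also match the sample output in the question ("youll" as one word, highest count first), the words have to be cleaned per token and the entries sorted before the conversion. A minimal sketch, assuming Java 8; the alphabetical tie-break is an assumption read off the sample output, and the class name WordCount is made up for illustration:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class WordCount {
    static String[][] wordCountEngine(String document) {
        if (document == null) {
            return new String[0][];
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String token : document.toLowerCase(Locale.ROOT).split("\\s+")) {
            // drop punctuation inside the token, so "you'll" becomes "youll"
            String word = token.replaceAll("[^a-z]", "");
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // count descending; alphabetical on equal counts (assumed tie-break)
        entries.sort((x, y) -> {
            int c = Integer.compare(y.getValue(), x.getValue());
            return c != 0 ? c : x.getKey().compareTo(y.getKey());
        });
        String[][] result = new String[entries.size()][2];
        for (int i = 0; i < result.length; i++) {
            result[i][0] = entries.get(i).getKey();
            result[i][1] = String.valueOf(entries.get(i).getValue());
        }
        return result;
    }
}
```

Keeping the counts in a Map<String, Integer> until the very last loop means the String[][] is only a presentation format, as suggested above.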
This is my solution to Pramp's question. It is in C#, but I think the idea is the same.
[TestMethod]
public void PrampWordCountEngineTest()
{
string document = "Practice makes perfect. you'll only get Perfect by practice. just practice!";
string[,] result = WordCountEngine(document);
string[,] expected =
{
{"practice", "3"}, {"perfect", "2"},
{"makes", "1"}, {"youll", "1"}, {"only", "1"},
{"get", "1"}, {"by", "1"}, {"just", "1"}
};
CollectionAssert.AreEqual(expected,result);
}
public string[,] WordCountEngine(string document)
{
Dictionary<string, int> wordMap = new Dictionary<string, int>();
string[] wordList = document.Split(' ');
int largestCount = 0;
foreach (string word in wordList)
{
string lowerWord = word.ToLower(); // can't assign to the foreach variable
//remove special/punctuation characters
var sb = new StringBuilder();
foreach (var c in lowerWord)
{
if (c >= 'a' && c <= 'z')
{
sb.Append(c);
}
}
string cleanWord = sb.ToString();
if (cleanWord.Length < 1)
{
continue;
}
int count = 0;
if (wordMap.ContainsKey(cleanWord))
{
count = wordMap[cleanWord];
count++;
}
else
{
count = 1;
}
if (count > largestCount)
{
largestCount = count;
}
wordMap[cleanWord] = count;
}
// each cell of the big list holds all of the words that share the same count
List<List<string>> counterList = new List<List<string>>();
for (int i = 0; i < largestCount + 1; i++)
{
counterList.Add(new List<string>());
}
foreach (var word in wordMap.Keys)
{
int counter = wordMap[word];
counterList[counter].Add(word);
}
string[,] result = new string[wordMap.Keys.Count,2];
int key = 0;
// for each list of words with the same count, insert the words and that count into the 2D array
for (var index = counterList.Count-1; index > 0; index--)
{
var list = counterList[index];
List<string> wordListCounter = list;
if (wordListCounter == null)
{
continue;
}
foreach (var word in wordListCounter)
{
result[key, 0] = word;
result[key, 1] = index.ToString();
key++;
}
}
return result;
}