How to return a 2D String

I do not fully understand how to return a 2D array. I wrote a method that takes a document as input and has to return a list of all unique words in it and their number of occurrences, sorted by the number of occurrences in descending order. It is a requirement outside my control that this be returned as a 2-dimensional array of String.
Here is what I have so far:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
Map<String, String> map = new HashMap<>();
String[] allWords = document.toLowerCase().split("[^a-zA-Z]+");
for (String s : allWords) {
if (map.containsKey(s)) {
int newVersion = Integer.parseInt(map.get(s)) + 1;
String sb = Integer.toString(newVersion);
map.put(s, sb);
} else {
map.put(s, "1");
}
}
String[][] array = new String[map.size()][2];
int count = 0;
for (Map.Entry<String, String> entry : map.entrySet()) {
array[count][0] = entry.getKey();
array[count][1] = entry.getValue();
count++;
}
return array;
}
I'm trying to use a HashMap to store the words and their occurrences. What is the best way to store key --> value pairs from a table into a String[][]?
If the input is:
input: document = "Practice makes perfect. you'll only
get Perfect by practice. just practice!"
The output should be:
output: [ ["practice", "3"], ["perfect", "2"],
["by", "1"], ["get", "1"], ["just", "1"],
["makes", "1"], ["only", "1"], ["youll", "1"] ]
How do I store data like this in a 2D array?

String[][] is simply the wrong data structure for this task.
You should use a Map<String, Integer> instead of a Map<String, String> inside the method and simply return that map.
There are several reasons:
You store integers as strings, and do arithmetic by parsing the String to an integer, calculating, and then converting back to a String. That is a bad idea.
The returned array does not guarantee its dimensions: there is no way to enforce that each sub-array has exactly two elements.
Note regarding your comment: if (for some reason) you need to convert the map to a String[][], you can certainly do that, but the conversion logic should be separated from the code that generates the map itself. That way the code for wordCountEngine remains clean and easily maintainable.

Just because you need to return a particular data structure does not mean you need to use a similarly typed map inside your method. Nothing prevents you from using a Map<String, Integer> and then converting it to a String[][].
Here is the code that does not use Java 8 streams:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
Map<String, Integer> map = new HashMap<>();
for ( String s : document.toLowerCase().split("[^a-zA-Z]+") ){
Integer c = map.get(s);
map.put(s, c != null ? c + 1: 1);
}
String[][] result = new String[ map.size() ][ 2 ];
int count = 0;
for ( Map.Entry<String, Integer> e : map.entrySet() ){
result[count][0] = e.getKey();
result[count][1] = e.getValue().toString();
count += 1;
}
return result;
}
And for fun a Java8 version:
static String[][] wordCountEngine(String document) {
// your code goes here
if (document == null || document.length() == 0)
return null;
return Arrays
//convert words into map with word and count
.stream( document.toLowerCase().split("[^a-zA-Z]+") )
.collect( Collectors.groupingBy( s -> s, Collectors.summingInt(s -> 1) ) )
//convert the above map to String[][]
.entrySet()
.stream().map( (e) -> new String[]{ e.getKey(), e.getValue().toString() } )
.toArray( String[][]::new );
}
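Neither version above sorts the result, which the question asks for. The following is a minimal sketch of one way to add that step, not a definitive implementation. The class and method names are made up for illustration. It assumes that apostrophes should be dropped (so "you'll" becomes "youll", as in the expected output) and that ties are broken alphabetically, which is inferred from the sample output rather than stated in the question. It also returns an empty array instead of null for empty input, which is a design choice of this sketch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordCountSorted {
    // Hypothetical variant of wordCountEngine: count words, sort, then convert.
    static String[][] wordCountEngineSorted(String document) {
        if (document == null || document.isEmpty()) {
            return new String[0][2];
        }
        Map<String, Integer> counts = new HashMap<>();
        // Assumption: apostrophes are removed so that "you'll" is counted as "youll".
        String cleaned = document.toLowerCase().replace("'", "");
        for (String word : cleaned.split("[^a-z]+")) {
            if (!word.isEmpty()) { // split can yield a leading empty token
                counts.merge(word, 1, Integer::sum);
            }
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Highest count first; ties are broken alphabetically (an assumption).
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        String[][] result = new String[entries.size()][2];
        for (int i = 0; i < entries.size(); i++) {
            result[i][0] = entries.get(i).getKey();
            result[i][1] = String.valueOf(entries.get(i).getValue());
        }
        return result;
    }
}
```

Keeping the counts as Integer until the very last step means the sort compares numbers rather than strings, so "10" would correctly rank above "9".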

This is my solution to Pramp's question. It is in C#, but I think the idea is the same.
[TestMethod]
public void PrampWordCountEngineTest()
{
string document = "Practice makes perfect. you'll only get Perfect by practice. just practice!";
string[,] result = WordCountEngine(document);
string[,] expected =
{
{"practice", "3"}, {"perfect", "2"},
{"makes", "1"}, {"youll", "1"}, {"only", "1"},
{"get", "1"}, {"by", "1"}, {"just", "1"}
};
CollectionAssert.AreEqual(expected,result);
}
public string[,] WordCountEngine(string document)
{
Dictionary<string, int> wordMap = new Dictionary<string, int>();
string[] wordList = document.Split(' ');
int largestCount = 0;
foreach (string word in wordList)
{
string lowerWord = word.ToLower(); // can't assign to the foreach variable itself
//remove special/punctuation characters
var sb = new StringBuilder();
foreach (var c in lowerWord)
{
if (c >= 'a' && c <= 'z')
{
sb.Append(c);
}
}
string cleanWord = sb.ToString();
if (cleanWord.Length < 1)
{
continue;
}
int count = 0;
if (wordMap.ContainsKey(cleanWord))
{
count = wordMap[cleanWord];
count++;
}
else
{
count = 1;
}
if (count > largestCount)
{
largestCount = count;
}
wordMap[cleanWord] = count;
}
// counterList[n] holds all of the words that occur exactly n times
List<List<string>> counterList = new List<List<string>>();
for (int i = 0; i < largestCount + 1; i++)
{
counterList.Add(new List<string>());
}
foreach (var word in wordMap.Keys)
{
int counter = wordMap[word];
counterList[counter].Add(word);
}
string[,] result = new string[wordMap.Keys.Count,2];
int key = 0;
// walk the lists from the highest count down and write each word with its count into the 2D array
for (var index = counterList.Count-1; index > 0; index--)
{
var list = counterList[index];
List<string> wordListCounter = list;
if (wordListCounter == null)
{
continue;
}
foreach (var word in wordListCounter)
{
result[key, 0] = word;
result[key, 1] = index.ToString();
key++;
}
}
return result;
}

Related

How to mark and edit duplicate words in list java?

Hi, I have this list:
String[] list = {"I","think","she","think","he","think","she","loves"};
I want to produce the list below, where each repeated word gets an incrementing suffix.
["I","think","she","think(1)","he","think(2)","she(1)","loves"];
I've tried to work out the logic, but I find it hard to add the increment number to my list. So far I'm only able to detect the repeated words. How can I add the number to the repeated words?

You can traverse the array and store each word with its number of occurrences in a Map. If, as you traverse, you find a word that is already present in the map, it simply means the word is a duplicate.
For example:
Map<String, Integer> map = new HashMap<>();
String[] result = new String[list.length];
int i = 0;
for (String val : list) {
int count = map.getOrDefault(val, 0); // if map does not contain the key then the default occurrence is 0
result[i] = count > 0 ? val + "(" + count + ")" : val;
count++;
map.put(val, count);
i++;
}
Edit:
As mentioned by @Holger in the comments, a simplified for-loop:
for(String val : list) {
int count = map.merge(val, 1, Integer::sum) - 1;
result[i++] = count > 0 ? val + "(" + count + ")" : val;
}

Here is one possible solution:
String[] list = {"I", "think", "she", "think", "he", "think", "she", "loves"};
List<String> modified = new ArrayList<>();
Map<String, Integer> words = new HashMap<>();
for (String item : list) {
int val = words.getOrDefault(item, 0);
words.put(item, val + 1);
}
for (Map.Entry<String, Integer> entry : words.entrySet()) {
int val = entry.getValue();
String key = entry.getKey();
while (val > 0) {
if (val - 1 > 0) {
String newKey = key + "(" + (val - 1) + ")";
modified.add(newKey);
} else {
modified.add(key);
}
val--;
}
}
String[] result = modified.toArray(new String[0]);
System.out.println(Arrays.toString(result));
If you run this code, you should see output like this (the order depends on the HashMap's iteration order and does not follow the input order):
[think(2), think(1), think, she(1), she, loves, I, he]

class Test {
public static void main(String[] args) {
String[] list = { "I", "think", "she", "think", "he", "think", "she", "loves" };
List<String> finalList = new ArrayList<>();
for (int i = 0; i < list.length; i++) {
String elem = list[i];
long matchedCount = Arrays.asList(list).subList(0, i).parallelStream().filter(a -> a.equals(elem)).count();
if (matchedCount > 0) {
finalList.add(elem + "(" + matchedCount + ")");
} else {
finalList.add(elem);
}
}
System.out.println(finalList);
}
}
Output: [I, think, she, think(1), he, think(2), she(1), loves]

Using a Hashmap to detect duplicates and count of duplicates in a list

I'm trying to use hashmaps to detect any duplicates in a given list, and if there is, I want to add "1" to that String to indicate its duplication. If it occurs 3 times, the third one would add "3" after that string.
I can't seem to figure that out, keeping track of the number of duplicates. It only adds 1 to the duplicates, no matter if it's the 2nd or 3rd or 4th,..etc duplicate.
This is what I have:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<String>();
HashMap<String, Integer> hashmap = new HashMap<String, Integer>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
result.add(current+"1");
} else {
hashmap.put(current,i);
result.add(current);
}
}
return result;
}
I want to include the values that only occur once as well, as is (no concatenation).
Sample Input: ["mixer", "toaster", "mixer", "mixer", "bowl"]
Sample Output: ["mixer", "toaster", "mixer1", "mixer2", "bowl"]

public static List<String> duplicates(List<String> given) {
final Map<String, Integer> count = new HashMap<>();
return given.stream().map(s -> {
int n = count.merge(s, 1, Integer::sum) - 1;
return s + (n < 1 ? "" : n);
}).collect(toList());
}

I renamed final to output, because final is a keyword and cannot be used as a variable name.
if (hashmap.containsKey(current)) {
output.add(current + hashmap.get(current)); // append the counter to the string
hashmap.put(current, hashmap.get(current)+1); // increment the counter for this item
} else {
hashmap.put(current,1); // set a counter of 1 for this item in the hashmap
output.add(current);
}

You always add the hard-coded string "1" instead of using the count saved in the map:
public static List<String> duplicates(List<String> given) {
List<String> result = new ArrayList<>(given.size());
Map<String, Integer> hashmap = new HashMap<>();
for (String current : given) {
if (hashmap.containsKey(current)) {
int count = hashmap.get(current) + 1;
result.add(current + count);
hashmap.put(current, count);
} else {
hashmap.put(current, 0);
result.add(current);
}
}
return result;
}

List<String> finallist = new ArrayList<>();
for (int i=0; i<given.size(); i++) {
String current = given.get(i);
if (hashmap.containsKey(current)) {
hashmap.put(current,hashmap.get(current)+1);
} else {
hashmap.put(current,1);
}
String num = hashmap.get(current) == 1 ? "" :Integer.toString(hashmap.get(current));
finallist.add(current+num);
}
System.out.println(finallist);

Get largest Group of anagrams in an array

For an assignment I have been asked to find the largest group of anagrams in a list. I believe I would have to have an accumulation loop inside of another loop that keeps track of the largest number of items. The problem is that I don't know how to count how many of each anagram I have. I have been able to sort the array into groups based on their anagrams. So from the index 1-3 is one anagram, 4-10 is another, etc. How do I search through and count how many of each anagram I have? Then compare each one to the previous count.
Sample of the code:
public static String[] getLargestAnagramGroup(String[] inputArray) {
ArrayList<String> largestGroupArrayList = new ArrayList<String>();
if (inputArray == null || inputArray.length == 0) {
return new String[0];
}
insertionSort(inputArray, new AnagramComparator());
String[] largestGroupArray = new String[largestGroupArrayList.size()];
largestGroupArrayList.toArray(inputArray);
System.out.println(largestGroupArray);
return largestGroupArray;
}
UPDATE: This is how we solved it. Is there a more efficient way?
public static String[] getLargestAnagramGroup(String[] inputArray) {
int numberOfAnagrams = 0;
int temporary = 1;
int position = -1;
int index = 0;
if (inputArray == null || inputArray.length == 0) {
return new String[0];
}
insertionSort(inputArray, new AnagramComparator());
for (index = 0; index < inputArray.length - 1; index++) {
if (areAnagrams(inputArray[index], inputArray[index + 1])) {
temporary++;
} else {
if (temporary > numberOfAnagrams) {
numberOfAnagrams = temporary;
position = index;
}
// Start counting the next group, even when the two sizes were equal.
temporary = 1;
}
}
if (temporary > numberOfAnagrams) {
position = index;
numberOfAnagrams = temporary;
}
String[] largestArray = new String[numberOfAnagrams];
for (int startIndex = position - numberOfAnagrams + 1, i = 0; startIndex <= position; startIndex++, i++) {
largestArray[i] = inputArray[startIndex];
}
return largestArray;
}

Here is a piece of code to help you out.
public class AnagramTest {
public static void main(String[] args) {
String[] input = {"test", "ttes", "abcd", "dcba", "dbac"};
for (String string : getLargestAnagramGroup(input)) {
System.out.println(string);
}
}
/**
* Returns the largest group of strings that are anagrams of each other.
*
* @param inputArray the strings to group
* @return the largest anagram group, or an empty array if the input is empty
*/
public static String[] getLargestAnagramGroup(String[] inputArray) {
// Creating a linked hash map to maintain the order
Map<String, List<String>> map = new LinkedHashMap<String, List<String>>();
for (String string : inputArray) {
char[] charArray = string.toCharArray();
Arrays.sort(charArray);
String sortedStr = new String(charArray);
List<String> anagrams = map.get(sortedStr);
if (anagrams == null) {
anagrams = new ArrayList<String>();
}
anagrams.add(string);
map.put(sortedStr, anagrams);
}
Set<Entry<String, List<String>>> entrySet = map.entrySet();
List<String> l = new ArrayList<String>();
int highestAnagrams = -1;
for (Entry<String, List<String>> entry : entrySet) {
List<String> value = entry.getValue();
if (value.size() > highestAnagrams) {
highestAnagrams = value.size();
l = value;
}
}
return l.toArray(new String[l.size()]);
}
}
The idea is to first find the anagrams. I do that by sorting each string's character array and using the sorted string as the key in a LinkedHashMap.
Then I store the original strings in a list under that key, which can be printed or reused as the result.
The size of each list is the number of times that anagram occurs, and that value can be used to solve your problem.

This is my solution in C#.
public static string[] LargestAnagramsSet(string[] words)
{
var maxSize = 0;
var maxKey = string.Empty;
Dictionary<string, List<string>> set = new Dictionary<string, List<string>>();
for (int i = 0; i < words.Length; i++)
{
char[] temp = words[i].ToCharArray();
Array.Sort(temp);
var key = new string(temp);
if (set.ContainsKey(key))
{
set[key].Add(words[i]);
}
else
{
var anagrams = new List<string>
{
words[i]
};
set.Add(key, anagrams);
}
if (set[key].Count() > maxSize)
{
maxSize = set[key].Count();
maxKey = key;
}
}
return string.IsNullOrEmpty(maxKey) ? words : set[maxKey].ToArray();
}

Partition an Array with duplicate elements into arrays with unique elements

I have an array which is structured like this:
String[] Array = {"1","2","3","41","56","41","72","72","72","78","99"};
and I want to partition this array into a number of arrays that each contain no duplicate values, like this:
String[] Array1 = {"1","2","3","41","56","72","78","99"};
String[] Array2 = {"41","72"};
String[] Array3 = {"72"};
Is there any straightforward way to do this in Java, or do I have to do it with ugly loops (just kidding!)?
Thanks!
UPDATE
I'm going to make the question a bit harder. Now I have a Map whose structure is like this:
Map<String,String> map = new HashMap<String,String>(){{
put("1##96","10");
put("2##100","5");
put("3##23","100");
put("41##34","14");
put("56##22","25");
put("41##12","100");
put("72##10","100");
put("72##100","120");
put("72##21","0");
put("78##22","7");
}};
Note that the values are not important, but the keys are.
What can I do to partition this map into submaps like these:
Map map1 = {"1##96" => "10"
"2##100" => "5"
"3##23" => "100"
"41##34" => "14"
"56##22" => "25"
"72##10" => "100"
"78##22" => "7"
}
Map map2 = {
"41##12" => "100"
"72##100" => "120"
}
Map map3 = {
"72##100" => "120"
}
As before, the first part of each key (before '##') is the ID on which I want uniqueness to be based. This is just like the array example, but a bit more complex.
Sorry for changing the question midway.

There is probably nothing in the libraries for this (it does not seem generic enough), but here are some ideas:
O(n) time and O(n) space complexity. Here you just count how many times each number occurs and then put each number into that many resulting lists.
Edit: as @mpkorstanje pointed out, if you change the input from numbers to strings or any other objects, then in the very worst case this degrades to O(n^2). In that case, though, you should revisit the hashing for the data you are working with, in my opinion, since it is not well distributed.
public List<List<Integer>> split(int[] input) {
Map<Integer, Integer> occurrences = new HashMap<>();
int maxOcc = 0;
for (int val : input) {
int occ = 0;
if (occurrences.containsKey(val)) {
occ = occurrences.get(val);
}
if (occ + 1 > maxOcc) {
maxOcc = occ + 1;
}
occurrences.put(val, occ + 1);
}
List<List<Integer>> result = new ArrayList<>(maxOcc);
for (int i = 0; i < maxOcc; i++) {
result.add(new LinkedList<>());
}
for (Map.Entry<Integer, Integer> entry : occurrences.entrySet()) {
for (int i = 0; i < entry.getValue(); i++) {
result.get(i).add(entry.getKey());
}
}
return result;
}
O(n log n) time and O(1) extra space (not counting the resulting lists), but it doesn't retain order and "destroys" the input array by sorting it. Once the array is sorted, you can just go over it and keep adding each element to the appropriate resulting list, depending on whether you're looking at a duplicate or a "new" entry.
public List<List<Integer>> split(int[] input) {
Arrays.sort(input);
int maxDup = getMaxDuplicateNumber(input);
List<List<Integer>> result = new ArrayList<>(maxDup);
for(int i = 0; i < maxDup; i++) {
result.add(new LinkedList<>());
}
int count = 0;
result.get(0).add(input[0]);
for(int i = 1; i < input.length; i++) {
if(input[i] == input[i-1]) {
count++;
} else {
count = 0;
}
result.get(count).add(input[i]);
}
return result;
}
private int getMaxDuplicateNumber(int[] input) {
int maxDups = 1;
int currentDupCount = 1;
for(int i = 1; i < input.length; i++) {
if(input[i] == input[i - 1]) {
currentDupCount++;
} else {
currentDupCount = 1;
}
if(currentDupCount > maxDups) {
maxDups = currentDupCount;
}
}
return maxDups;
}

You can't do this without loops. But you can use a set to remove some loops. You can add data structure trappings to your own liking.
I'm assuming here that the order of elements in the bins must be consistent with the order of the elements in the input array. If not this can be done more efficiently.
public static void main(String[] args) {
String[] array = { "1", "2", "3", "41", "56", "41", "72", "72", "72",
"78", "99" };
List<Set<String>> bins = new ArrayList<>();
for (String s : array) {
findOrCreateBin(bins, s).add(s);
}
System.out.println(bins); // Prints [[1, 2, 3, 41, 56, 72, 78, 99], [41, 72], [72]]
}
private static Set<String> findOrCreateBin(List<Set<String>> bins, String s) {
for (Set<String> bin : bins) {
if (!bin.contains(s)) {
return bin;
}
}
Set<String> bin = new LinkedHashSet<>();
bins.add(bin);
return bin;
}
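The same bin idea can be carried over to the map from the question's update. The following is only a sketch under stated assumptions: the ID is taken to be the part of each key before the first "##", the class and method names are hypothetical, and the order of entries inside each sub-map follows the iteration order of the input map (so pass a LinkedHashMap if order matters).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MapPartitioner {
    // Hypothetical helper: splits a map into sub-maps so that no sub-map
    // holds two keys with the same ID (the part of the key before "##").
    static List<Map<String, String>> partitionById(Map<String, String> input) {
        List<Map<String, String>> bins = new ArrayList<>();
        List<Set<String>> idsPerBin = new ArrayList<>(); // IDs already used in each bin
        for (Map.Entry<String, String> entry : input.entrySet()) {
            String id = entry.getKey().split("##", 2)[0];
            int target = 0;
            // Find the first bin that does not contain this ID yet.
            while (target < bins.size() && idsPerBin.get(target).contains(id)) {
                target++;
            }
            if (target == bins.size()) { // every existing bin already has this ID
                bins.add(new LinkedHashMap<>());
                idsPerBin.add(new HashSet<>());
            }
            bins.get(target).put(entry.getKey(), entry.getValue());
            idsPerBin.get(target).add(id);
        }
        return bins;
    }
}
```

Tracking the used IDs in a separate set per bin avoids re-parsing every key of a bin each time a new entry is placed.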

Representing an Array( String[] ) as String CSV with Ranges

I have an array of Strings that contain numbers (unsigned integers) padded with an arbitrary number of zeros, for example:
[ 0001, 0002, 0003, 0005, 0007, 0010, 0011, 0012, 0013, 0014, 0015 ]
I want to convert the array into a string representation that aggregates adjacent values into ranges (e.g. 0000-0003) and lists non-adjacent values separated by commas, so the array above should be represented as:
0001-0003, 0005, 0007, 0010-0015
What is the best/simplest/most readable way to do it (without writing tons of code :-) )?
Thanks.

If I understood the requirements correctly, then the following code should work for you (I hope it is not really tons of code :-)):
String[] arr = new String[] {"0001", "0020", "0002", "0003", "0019", "0005", "0007",
"0010", "0018", "0011", "0012", "0013", "0014", "0015"};
Map<Integer, String> m = new TreeMap<Integer, String>();
for (String s : arr)
m.put(Integer.valueOf(s), s);
Iterator<Entry<Integer, String>> it;
Integer prev = -1;
StringBuffer sb = new StringBuffer();
boolean isCont = false;
for (it=m.entrySet().iterator(); it.hasNext();) {
Entry<Integer, String> entry = it.next();
if (prev == -1)
sb.append(entry.getValue());
else if (entry.getKey() == (prev+1))
isCont = true;
else if (entry.getKey() > (prev+1)) {
if (isCont)
sb.append('-').append(m.get(prev)).append(", ");
else
sb.append(", ");
sb.append(entry.getValue());
isCont = false;
}
prev = entry.getKey();
}
if (isCont)
sb.append('-').append(m.get(prev));
System.out.println(sb);
OUTPUT:
0001-0003, 0005, 0007, 0010-0015, 0018-0020

Here is my answer, of course everyone has different taste.
String[] a = { "0001", "0002", "0003", "0005", "0010" , "0011" , "0012" , "0013" , "0014", "0015", "0017" };
String out = new String();
String curStart = null;
String curEnd = null;
for (int i=0; i<a.length; i++) {
if (curStart == null) curStart = a [i];
if ( a.length != i+1
&& Integer.parseInt(a[i])+1 == Integer.parseInt(a[i+1])) {
curEnd = a[i+1];
} else {
if (!out.equals("")) out+=", ";
out+=""+curStart;
if (curEnd != null) out+="-"+curEnd;
curStart = null;
curEnd = null;
}
}
System.out.println(out);

I would do it by treating every string as its own range, unioning adjacent ones together, and specializing my Range.toString() implementation for the case of a single element on its own. Something like:
class Range {
int low;
int high;
public Range(int elem) { this.low = elem; this.high = elem;}
private Range(int low, int high) { this.low=low; this.high=high;}
public Range tryMerge(Range other) {
if(high + 1 == other.low) {
return new Range(low, other.high);
} else {
return null;
}
}
public String toString() {
return (low == high) ? Integer.toString(low) : low + "-" + high;
}
}
The zero padding would need some extra handling on top of this, since the int fields do not keep it.
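As a rough sketch of one way to keep the padding, the run-merging step can work on the original strings and use the parsed integers only to test adjacency. This is a simplified loop rather than the Range class above. The class and method names are hypothetical, and the sketch assumes the input is already sorted in ascending order and contains only digit strings.

```java
import java.util.ArrayList;
import java.util.List;

class RangeFormatter {
    // Hypothetical helper: assumes the input is sorted ascending and holds only digit strings.
    static String format(String[] sorted) {
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < sorted.length) {
            int j = i;
            // Extend the run while the next value is exactly one larger.
            while (j + 1 < sorted.length
                    && Integer.parseInt(sorted[j + 1]) == Integer.parseInt(sorted[j]) + 1) {
                j++;
            }
            // Reuse the original strings so the zero padding is preserved.
            parts.add(i == j ? sorted[i] : sorted[i] + "-" + sorted[j]);
            i = j + 1;
        }
        return String.join(", ", parts);
    }
}
```

For the array in the question, RangeFormatter.format would return "0001-0003, 0005, 0007, 0010-0015".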
