Logic for Grouping number - java

I have the numbers 1-10 spread across different groups A, B and C.
For example: A-1,A-2,A-3,B-4,C-5,B-6,A-7,C-8,A-9,A-10
I want to list each group separately as A, B and C, with consecutive numbers merged:
A
1-3,
7,
9-10
B
4,
6
C
5,
8
Can anyone help me with the logic?

Guava will help you create the data structure you need:
public static void main(final String[] args) {
String input = "A-1,A-2,A-3,B-4,C-5,B-6,A-7,C-8,A-9,A-10";
// create multimap
Map<String, Collection<Integer>> map=Maps.newTreeMap();
SortedSetMultimap<String, Integer> multimap = Multimaps.newSortedSetMultimap(
map, new Supplier<SortedSet<Integer>>() {
public SortedSet<Integer> get() {
return new TreeSet<Integer>();
}
});
//add data
Splitter entrySplitter = Splitter.on(',');
Splitter keyValueSplitter = Splitter.on('-'); // group and number are separated by a dash
for (String entry : entrySplitter.split(input)) {
Iterator<String> tokens = keyValueSplitter.split(entry).iterator();
multimap.put(tokens.next(), Integer.valueOf(tokens.next()));
}
// read data
for (Entry<String, Collection<Integer>> entry : map.entrySet()) {
System.out.println(entry.getKey()+":");
printMergedValues(entry.getValue());
}
}
private static void printMergedValues(Collection<Integer> value) {
// TODO implement this yourself
}
The only thing I've left for you is to merge consecutive numbers into ranges (printMergedValues).
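One way to fill in that missing step is sketched below, using only the JDK. The class name RangeMerger and the helper mergeConsecutive are made up for this illustration and are not part of Guava; the sketch assumes a run of consecutive numbers should be written as "first-last" and a single number as itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

class RangeMerger {

    // Hypothetical helper (not part of Guava): collapses runs of consecutive
    // numbers into "first-last" ranges and returns one string per run.
    static List<String> mergeConsecutive(Collection<Integer> values) {
        List<String> ranges = new ArrayList<>();
        Integer start = null;
        Integer previous = null;
        // TreeSet sorts the values and drops duplicates
        for (Integer current : new TreeSet<>(values)) {
            if (start == null) {
                start = current;
            } else if (current != previous + 1) {
                // The run ended at 'previous', so close it and start a new one
                ranges.add(format(start, previous));
                start = current;
            }
            previous = current;
        }
        if (start != null) {
            ranges.add(format(start, previous));
        }
        return ranges;
    }

    private static String format(int start, int end) {
        return start == end ? String.valueOf(start) : start + "-" + end;
    }

    public static void main(String[] args) {
        System.out.println(mergeConsecutive(Arrays.asList(1, 2, 3, 7, 9, 10)));
        // prints [1-3, 7, 9-10]
    }
}
```

printMergedValues could then simply print the returned list for each group.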

Here is something to get you started:
String[] items = new String[] {
"A-1", "B-2", "A-5"
};
// This is the data structure that will receive the final data. The map key is the
// group name (e.g. "A" for item "A-15") and the map value is a list of numbers that
// have been found for that group. TreeMap is chosen because the groups will be sorted
// alphabetically. If you don't need that, you could also use HashMap.
Map<String, List<Integer>> groups = new TreeMap<String, List<Integer>>();
for (String item : items) {
// Split the item into the group and the number
String group = item.substring(0, 1);
int number = Integer.parseInt(item.substring(2));
// See if this group is already registered in our Map
List<Integer> groupData = groups.get(group);
if (groupData==null) {
groupData = new ArrayList<Integer>();
groups.put(group, groupData);
}
// Add the number to the data
groupData.add(number);
}
I assume here that your items are always in the form: one letter, a dash, then a number. If the format is more complicated than that, have a look at regular expressions (see the Java class Pattern). This is not tested, so I'll let you test it and handle the special cases.
For { "A-1", "A-2", "A-3", "B-2", "A-5" }, this code will produce:
A -> {1, 2, 3, 5}
B -> {2}
You'll need to process the resulting number lists if you want to merge consecutive numbers, but that should not be too difficult if you think about it a little while.
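If the items do not fit the one-letter form, the Pattern class mentioned above could be used roughly as follows. This is only a sketch: the class name ItemParser, the method group, and the assumed format (one or more letters, a dash, one or more digits) are illustrative choices, and malformed items are silently skipped here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ItemParser {

    // Assumed item format: one or more letters, a dash, one or more digits
    private static final Pattern ITEM = Pattern.compile("([A-Za-z]+)-(\\d+)");

    static Map<String, List<Integer>> group(String[] items) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (String item : items) {
            Matcher m = ITEM.matcher(item.trim());
            if (!m.matches()) {
                continue; // skip malformed items; throwing would be another option
            }
            // Create the list for this group on first use, then add the number
            groups.computeIfAbsent(m.group(1), k -> new ArrayList<>())
                  .add(Integer.parseInt(m.group(2)));
        }
        return groups;
    }

    public static void main(String[] args) {
        String[] items = { "A-1", "AB-12", "A-5", "oops" };
        System.out.println(group(items)); // prints {A=[1, 5], AB=[12]}
    }
}
```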

I'd go this way:
String[] input = {"A-1","A-2","A-3","B-4","C-5","B-6","A-7","C-8","A-9"};
Map<String, Set<Integer>> result = new HashMap<String, Set<Integer>>();
String[] inputSplit;
String group;
Integer groupNumber;
for (String item : input)
{
inputSplit = item.split("-");
group = inputSplit[0];
groupNumber = Integer.valueOf( inputSplit[1] );
if ( result.get(group) == null ) { result.put(group, new HashSet<Integer>()); }
result.get(group).add(groupNumber);
}
for (Map.Entry<String, Set<Integer>> entry : result.entrySet())
{
System.out.println( entry.getKey() + ":" + entry.getValue() );
}

Related

How to add digits to a created stopwords list in Java?

I have a method which creates a stopword list with the 10% of most frequent words from the lemmas key in my JSON file – which looks like this:
{..
,"lemmas":{
"doc41":"the dynamically expand when there too many collision i e have distinct hash code but fall into same slot modulo size expect average effect"
,"doc40":"retrieval operation include get generally do block so may overlap update operation include put remove retrieval reflect result any non null k new longadder increment"
,"doc42":"a set projection"..
}
}
private static List<String> StopWordsFile(ConcurrentHashMap<String, String> lemmas) {
// ConcurrentHashMap stores each word and its frequency
ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<String, Integer>();
// Array List for all the individual words
ArrayList<String> corpus = new ArrayList<String>();
for (Entry<String, String> entry : lemmas.entrySet()) {
String line = entry.getValue().toLowerCase();
line = line.replaceAll("\\p{Punct}", " ");
line = line.replaceAll("\\d+"," ");
line = line.replaceAll("\\s+", " ");
line = line.trim();
String[] value = line.split(" ");
List<String> words = new ArrayList<String>(Arrays.asList(value));
corpus.addAll(words);
}
// count all the words in the corpus and store the words with each frequency in
// the counts
for (String word : corpus) {
if (counts.keySet().contains(word)) {
counts.put(word, counts.get(word) + 1);
} else {
counts.put(word, 1);
}
}
// Create a list to store all the words with their frequency and sort it by values.
List<Entry<String, Integer>> list = new ArrayList<>(counts.entrySet());
list.sort((e2, e1) -> e1.getValue().compareTo(e2.getValue()));
List<Entry<String, Integer>> stopwordslist = new ArrayList<>(list.subList(0, (int) (0.10 * list.size())));
// Create the stopwords list with the 10% most frequent words
List<String> stopwords = new ArrayList<>();
// for (Map.Entry<String, Integer> e : sublist) {
for (ConcurrentHashMap.Entry<String, Integer> e : stopwordslist) {
stopwords.add(e.getKey());
}
System.out.println(stopwords);
return stopwords;
}
It outputs these words:
[the, of, value, v, key, to, given, a, k, map, in, for, this, returns, if, is, super, null, ... that, none]
I want to add single digits to it, such as '1,2,3,4,5,6,7,8,9', and/or the contents of another stopwords.txt file containing digits.
How can I do that?
Also, how can I output this stopwords list to a CSV file? Can someone point me in the right direction?
I'm new to Java.
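A possible starting point, as a sketch only: merge the extra entries into the list with a LinkedHashSet (which keeps order and drops duplicates) and write the result as one comma-separated line. The class and method names below (StopWordsExport, withExtras, writeCsv) are made up for this example, and it assumes the words contain no commas or quotes, so no CSV escaping is done.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class StopWordsExport {

    // Adds the digits 1-9 plus any extra words, keeping order and dropping duplicates
    static List<String> withExtras(List<String> stopwords, List<String> extras) {
        Set<String> merged = new LinkedHashSet<>(stopwords);
        for (int d = 1; d <= 9; d++) {
            merged.add(String.valueOf(d));
        }
        merged.addAll(extras);
        return new ArrayList<>(merged);
    }

    // Writes all words as a single comma-separated line (no escaping)
    static void writeCsv(List<String> words, Path target) throws IOException {
        String line = String.join(",", words);
        Files.write(target, Collections.singletonList(line), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        List<String> stopwords = Arrays.asList("the", "of", "value");
        // Extra words could instead come from Files.readAllLines(...) on a stopwords.txt file
        List<String> all = withExtras(stopwords, Arrays.asList("0"));
        // A temp file keeps the sketch free of side effects; use your own path in practice
        Path target = Files.createTempFile("stopwords", ".csv");
        writeCsv(all, target);
        System.out.println(all + " written to " + target);
    }
}
```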

Is there a way of using a for loop to sum integer that have a string associated with them?

I have read a CSV file into an ArrayList, but I need to use a for loop to sum all the values that share a specific name, and then return the top strings (in this case letters) in a String array. For example:
"A", 2
"B", 3
"C", 4
"A", 1
"B", 3
I have a class which reads the CSV into objects, so I have getters if that is of any help.
The result would give back a String [] that would have, in order, [B, C, A] as B totals 6, C totals 4 and A totals 3. Thank you.
Code I have so far,
public ArrayList<String> getTopRooms(int n){
ArrayList<String> roomNames = new ArrayList<>();
for (int i =0; i<recordList.size();i++){
if(!roomNames.contains(recordList.get(i).getRoomName()))
roomNames.add(recordList.get(i).getRoomName());
}
recordList contains the data from the CSV file; in this case I am trying to get the top rooms that have been booked. Every room has a length of time, shown as an int; for example, kitchen would have the length 2.
Just use a map to keep track of the tallies for each letter/key.
Map<String, Integer> map = new HashMap<>();
for (String line : yourList) {
String[] parts = line.split(",\\s*");
String key = parts[0];
Integer value = Integer.parseInt(parts[1]);
Integer currValue = map.get(key);
map.put(key, Objects.isNull(currValue) ? value : currValue + value);
}
map.entrySet().stream().forEach(e -> System.out.println(e));
I am assuming here that your flat file actually looks like:
A, 2
B, 3
C, 4
A, 1
B, 3
and that each entry in your record list would be one CSV tuple.
Create a plain old Java object (POJO) from each String and Integer pair, then store those objects in a list. We then group the objects by their identifier and sum the values of all POJOs that share that identifier.
class Pojo {
final String identifier;
final int value;
Pojo(String identifier, int value) {
this.identifier = identifier;
this.value = value;
}
public String getIdentifier() {
return identifier;
}
public int getValue() {
return value;
}
}
List<Pojo> pojos = new ArrayList<>(
Arrays.asList(
new Pojo("A", 2),
new Pojo("B", 3),
new Pojo("C", 4),
new Pojo("A", 1),
new Pojo("B", 3)));
Map<String, Integer> map =
pojos.stream().collect(Collectors.groupingBy(
Pojo::getIdentifier, Collectors.summingInt(Pojo::getValue)));
Output
{A=3, B=6, C=4}
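To get from such a map of totals to the ordered String[] the question asks for, the entries can be sorted by value in descending order. The following is a minimal sketch; the class name TopTotals and the method topKeys are made up for the example, and n plays the role of the parameter of getTopRooms.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class TopTotals {

    // Returns up to n keys, ordered by their total, highest first
    static String[] topKeys(Map<String, Integer> totals, int n) {
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        totals.put("A", 3);
        totals.put("B", 6);
        totals.put("C", 4);
        System.out.println(Arrays.toString(topKeys(totals, 3))); // prints [B, C, A]
    }
}
```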

Count and remove similar elements in a list while iterating through it

I used many references in the site to build up my program but I'm kind of stuck right now. I think using iterator will do the job. Sadly even though I went through questions which had iterator, I couldn't get the way of using it properly to implement it on my code.
I want to:
1. remove the duplicate elements found in the list fname
2. count the occurrences of each element found in fname and add that count to counter.
Please help me do the above using iterator or with any other method. Following is my code,
List<String> fname = new ArrayList<>(Arrays.asList(fullname.split(""))); //Assigning the string to a list//
int count = 1;
ArrayList<Integer> counter = new ArrayList<>();
List<String> holder = new ArrayList<>();
for(int element=0; element<=fname.size; element++)
{
for(int run=(element+1); run<=fname.size; run++)
{
if((fname.get(element)).equals(fname.get(run)))
{
count++;
holder.add(fname.get(run));
}
counter.add(count);
}
holder.add(fname.get(element));
fname.removeAll(holder);
}
System.out.println(fname);
System.out.println(counter);
Thanks.
From your questions, you basically want to:
1. Eliminate duplicates from given String List
You can simply convert your List to HashSet (it doesn't allow duplicates) and then convert it back to list (if you want the end result to be a List so you can do something else with it...)
2. Count all occurrences of unique words in your list
The quickest way to code this is with Java 8 Streams (code borrowed from here: How to count the number of occurrences of an element in a List)
Complete code
public static void main(String[] args) {
String fullname = "a b c d a b c"; //something
List<String> fname = new ArrayList<>(Arrays.asList(fullname.split(" ")));
// Convert input to Set, and then back to List (your program output)
Set<String> uniqueNames = new HashSet<>(fname);
List<String> uniqueNamesInList = new ArrayList<>(uniqueNames);
System.out.println(uniqueNamesInList);
// Collects (reduces) your list
Map<String, Long> counts = fname.stream().collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
System.out.println(counts);
}
I do not think that you need iterators here. There are many other possible solutions you could use, such as recursion. Nevertheless, I have just modified your code as follows:
final List<String> fname = new ArrayList<String>(Arrays.asList(fullname.split("")));
// defining a list that will hold the unique elements.
final List<String> resultList = new ArrayList<>();
// defining a list that will hold the number of replication for every item in the fname list; the order here is same to the order in resultList
final ArrayList<Integer> counter = new ArrayList<>();
for (int element = 0; element < fname.size(); element++) {
int count = 1;
for (int run = (element + 1); run < fname.size(); run++) {
if ((fname.get(element)).equals(fname.get(run))) {
count++;
// we remove the element that has been already counted and return the index one step back to start counting over.
fname.remove(run--);
}
}
// we add the element to the resulted list and counter of that element
counter.add(count);
resultList.add(fname.get(element));
}
// here we print out both lists.
System.out.println(resultList);
System.out.println(counter);
Assuming String fullname = "StringOfSomeStaff"; the output will be as follows:
[S, t, r, i, n, g, O, f, o, m, e, a]
[3, 2, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1]
You can try something like this:
Set<String> mySet = new HashSet<>();
mySet.addAll( fname ); // Now you have unique values
for(String s : mySet) {
count = 0;
for(String x : fname) {
if( s.equals(x) ) { count++; }
}
counter.add( count );
}
This way we don't have a specific order. But I hope it helps.
In Java 8, there's a one-liner:
List<Integer> result = fname
.stream()
.collect(Collectors.groupingBy(s -> s))
.entrySet()
.stream()
.map(e -> e.getValue().size())
.collect(Collectors.toList());
I am using a LinkedHashMap to preserve the order of the elements. The for loop I am using also implicitly uses an Iterator. The code example uses the Map.merge method, which has been available since Java 8.
List<String> fname = new ArrayList<>(Arrays.asList(fullname.split("")));
/*
Create a Map which will contain key=value pairs
(in this case key is a name and value is the counter).
Here we are using LinkedHashMap (instead of common HashMap)
to preserve order in which name occurs first time in the list.
*/
Map<String, Integer> countByName = new LinkedHashMap<>();
for (String name : fname) {
/*
The 'merge' method puts the key into the map (first parameter 'name').
The second parameter is the value which we want to associate with the key.
The last (3rd) parameter is a function which will merge the two values
(new and old) if the map already contains this key
*/
countByName.merge(name, 1, Integer::sum);
}
System.out.println(fname); // original list [a, d, e, a, a, f, t, d]
System.out.println(countByName.values()); // counts [3, 2, 1, 1, 1]
System.out.println(countByName.keySet()); // unique names [a, d, e, f, t]
The same can also be done using the Stream API, but that would probably be hard to understand if you are not familiar with Streams.
Map<String, Long> countByName = fname.stream()
.collect(Collectors.groupingBy(Function.identity(), LinkedHashMap::new, Collectors.counting()));

How can I split an ArrayList into two new ArrayLists?

I have one ArrayList which contains data like this: 13-ITEM,14-ITEM,15-ITEMGROUP (with a hyphen (-) as the separator).
I want to split this list into two new ArrayLists:
ArrayList-1 containing the ids: [13,14,15..]
ArrayList-2 containing the Strings: [ITEM,ITEM,ITEMGROUP...]
I am new to Java. Thanks in advance.
You can use String#indexOf(char) to find the index of the separator in the String, then use String#substring to extract the substrings, as follows:
List<String> list = Arrays.asList("13-ITEM","14-ITEM","15-ITEMGROUP");
List<String> list1 = new ArrayList<>(list.size());
List<String> list2 = new ArrayList<>(list.size());
for (String s : list) {
int index = s.indexOf('-');
// Add what we have before the separator in list1
list1.add(s.substring(0, index));
// Add what we have after the separator in list2
list2.add(s.substring(index + 1));
}
System.out.printf("List 1 = %s, List 2 = %s%n", list1, list2);
Output:
List 1 = [13, 14, 15], List 2 = [ITEM, ITEM, ITEMGROUP]
Split each entry and add the parts to the different lists. If the texts contain more -s, then use substring.
ArrayList<String> input = ...
List<String> output1 = new ArrayList<>(input.size());
List<String> output2 = new ArrayList<>(input.size());
for(String item:input){
String[] splitted = item.split("-");
output1.add(splitted[0]);
output2.add(splitted[1]);
}
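As an alternative to substring for texts that contain more than one '-', String.split also accepts a limit argument, so only the first separator is used. A small sketch follows; the class and method names are made up for the example, and the input "15-ITEM-GROUP" is an invented item with a second dash.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitOnFirstDash {

    // A limit of 2 means: at most two parts, so everything after the first '-' stays together
    static String[] splitOnFirstDash(String item) {
        return item.split("-", 2);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("13-ITEM", "15-ITEM-GROUP");
        List<String> ids = new ArrayList<>(input.size());
        List<String> names = new ArrayList<>(input.size());
        for (String item : input) {
            String[] parts = splitOnFirstDash(item);
            ids.add(parts[0]);
            // Guard against items without any separator
            names.add(parts.length > 1 ? parts[1] : "");
        }
        System.out.println(ids);   // prints [13, 15]
        System.out.println(names); // prints [ITEM, ITEM-GROUP]
    }
}
```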
You can use the following code
List<String> list = Arrays.asList("13-ITEM", "14-ITEM", "15-ITEMGROUP");
list.stream().map(p -> p.substring(0, p.indexOf('-'))).forEach(System.out::println);
list.stream().map(p -> p.substring(p.indexOf('-') + 1)).forEach(System.out::println);
If you separate your concerns like this (each list is created using different logic), you can encapsulate the code further. For example, you can add some exception handling.
private static Function<String, String> getFunction() {
return new Function<String, String>() {
@Override
public String apply(String p) {
return p.substring(0, p.indexOf('-'));
}
};
}
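What such exception handling might look like is sketched below, assuming java.util.function.Function and Java 8 lambdas. The names IdExtractor and idPart are invented for the example, and throwing IllegalArgumentException for a missing separator is just one possible policy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

class IdExtractor {

    // Returns the part before the first '-', or fails with a descriptive message
    static Function<String, String> idPart() {
        return s -> {
            int index = s.indexOf('-');
            if (index < 0) {
                throw new IllegalArgumentException("No '-' separator in: " + s);
            }
            return s.substring(0, index);
        };
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("13-ITEM", "14-ITEM", "15-ITEMGROUP");
        // Collect into a list instead of only printing each element
        List<String> ids = list.stream().map(idPart()).collect(Collectors.toList());
        System.out.println(ids); // prints [13, 14, 15]
    }
}
```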

Descending sort of substrings by occurence - Java

Let's say I have this string:
String test= "AA BB CC BB BB CC BB";
What I would like to do is create String array like this:
String[]{"BB", "CC", "AA"}
This is because BB occurred 4 times, CC 2 times and AA only once.
What would a solution to this problem look like?
String test = "AA BB CC BB BB CC BB";
System.out.println(Arrays.deepToString(sort(test)));
Output: [BB, CC, AA]
Code:
public static String[] sort(String test) {
String[] strings = test.split(" ");
HashMap<String,Integer> map = new HashMap<String,Integer>();
for (String s : strings) {
Integer i = map.get(s);
if (i != null) {
map.put(s, i+1);
} else {
map.put(s, 1);
}
}
// Caution: the count is the map key here, so two words with the same count overwrite each other
TreeMap<Integer,String> sort = new TreeMap<Integer,String>(Collections.reverseOrder());
for (Entry<String,Integer> e : map.entrySet()) {
sort.put(e.getValue(), e.getKey());
}
return sort.values().toArray(new String[0]);
}
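One caveat with keying a TreeMap by the count is that two words with the same count collide, and only one of them survives. A variant that sorts the map entries instead avoids this. The following is a sketch rather than a drop-in replacement: the class name OccurrenceSort and the method sortByCount are made up, and ties are resolved by first appearance in the input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OccurrenceSort {

    static String[] sortByCount(String text) {
        // LinkedHashMap keeps first-seen order, which becomes the tie-breaker below
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : text.trim().split("\\s+")) {
            counts.merge(word, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so words with equal counts keep their first-seen order
        entries.sort((a, b) -> b.getValue().compareTo(a.getValue()));
        String[] result = new String[entries.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = entries.get(i).getKey();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortByCount("AA BB CC BB BB CC BB")));
        // prints [BB, CC, AA]
        System.out.println(Arrays.toString(sortByCount("AA BB CC BB CC")));
        // prints [BB, CC, AA] (BB and CC both occur twice and are both kept)
    }
}
```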
What you could do is something like this (rough code):
String[] myOccurences = test.split(" ");
Then:
HashMap<String,Integer> occurencesMap = new HashMap<String,Integer>();
for( String s : myOccurences ){
if( occurencesMap.get( s ) == null ){
occurencesMap.put(s, 1);
} else {
occurencesMap.put(s, occurencesMap.get(s) + 1 );
}
}
Edit: The actual sorting (again rough code and unchecked):
List<String> mapKeys = new ArrayList<String>(occurencesMap.keySet()); // Keys
// Sort the keys by their counts, highest count first (the lambda needs Java 8)
Collections.sort(mapKeys, (a, b) -> occurencesMap.get(b).compareTo(occurencesMap.get(a)));
String[] occurencesSortedByDescendingArray = mapKeys.toArray(new String[0]);
Feel free to comment.
If you want to use Guava:
Lists.transform(
Ordering
.natural()
.onResultOf(new Function<Multiset.Entry<String>, Integer>() {
public Integer apply(Multiset.Entry<String> entry) {
return entry.getCount();
}
})
.reverse()
.sortedCopy(
ImmutableMultiset.copyOf( Splitter.onPattern("\\s+").split(test) ).entrySet()
),
new Function<Multiset.Entry<String>, String>() {
public String apply(Multiset.Entry<String> entry) {
return entry.getElement();
}
}
);
I am not sure if a method exists for this exact purpose.
However, you could use the String.split() method to split the single string into an array of strings. From there, you could locate the unique strings (either by checking manually or by adding them all to a set, which rejects duplicates). Keep a counter for each unique string and increment it every time that string occurs. Then create an array that is sorted based on these counts.
A map would be ideal for holding the String/count, as it would maintain the set of unique Strings as keys, and the count for each String as the value.
