I am facing a situation in my project similar to the one described below, and I have been unable to implement the code for it.
I have a POJO class:
public class TranObject {
public String loadId;
public String vDate;
public String dDate;
public String pDate;
public TranObject(String loadId, String vDate, String dDate, String pDate) {
super();
this.loadId = loadId;
this.vDate = vDate;
this.dDate = dDate;
this.pDate = pDate;
}
//Getter and Setters
//toString()
}
Now I have another processor class where I want to compare the TranObject instances that I receive through a data service call and collect them into another collection.
The implementation logic is given in the comments below.
import java.util.Arrays;
import java.util.List;
public class DemoClass {
public static void main(String[] args) {
List<TranObject> listObj = Arrays.asList(
new TranObject("LOAD1", "20180102", "20180202", null),
new TranObject("LOAD2", "20180402", "20180403", null),
new TranObject("LOAD3", "20180102", "20180202", "20190302"),
new TranObject("LOAD4", "20180402", "20180403", null),
new TranObject("LOAD5", "20200202", "20200203", null)
);
/*
IF (obj1, obj3 vDate and dDate are equal)
IF(pDate == null for obj1 or obj3)
THEN obj1 and obj3 are equal/duplicates, and we collect them.
ELSE IF(pDate != null for obj1 and obj3)
IF(pDate is same for obj1 and obj3)
THEN obj1 and obj3 are duplicates, and we collect them.
ELSE
THEN obj1 and obj3 are unique.
*/
}
}
My end result should be a collection, such as a List, containing the duplicate TranObject instances for further update.
I searched the internet for a way to solve this using the Stream (lambda) API.
I tried using groupingBy, first with vDate and then with dDate, but then I could not compare the groups for pDate equality.
Can anyone help me solve this issue? Any help would be appreciated, as I am stuck here.
UPDATE:
After some reading, I am trying to implement this by overriding the equals method in the POJO class, as shown below:
@Override
public boolean equals(Object obj) {
boolean isEqual=false;
if(obj!=null) {
TranObject tran = (TranObject) obj;
isEqual=(this.vDate.equals(tran.getvDate()) && this.dDate.equals(tran.getdDate()));
if(isEqual && this.pDate != null && tran.getpDate()!= null) {
isEqual = (this.pDate.equals(tran.getpDate()));
}
}
return isEqual;
}
It's still not working as expected. Can anyone please explain why?
The closest to your requirement would be grouping in a nested manner and then filtering the inner Map for varied conditions while being interested only in values eventually.
Stream<Map<String, List<TranObject>>> groupedNestedStream = listObj.stream()
.collect(Collectors.groupingBy(a -> Arrays.asList(a.vDate, a.dDate)
, Collectors.groupingBy(t -> t.pDate == null ? "default" : t.pDate)))
.values().stream();
From these groupings, a group's values (from the map) are eligible when either of the following holds:
they all have the same pDate, in which case the inner Map has just one entry keyed by the common pDate (m.size() == 1)
exactly one of the values in the group has a null pDate (meaning m.containsKey("default") && m.get("default").size() == 1)
List<TranObject> tranObjects = groupedNestedStream
.filter(m -> m.size() == 1 || (m.containsKey("default") && m.get("default").size() == 1))
.flatMap(m -> m.values().stream().flatMap(List::stream))
.collect(Collectors.toList());
Note the use of the "default" string constant to avoid failures (or poor practice) when collecting into a Map with null keys or values.
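As a cross-check for any grouping-based solution, the rule from the question can also be applied pairwise. The following is only a minimal sketch: the class PairwiseDuplicates, the stand-in class Tran, and the methods isDuplicate and duplicateIds are hypothetical names, not part of the original code, and the nested loop is quadratic, so it suits small lists or tests rather than large inputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class PairwiseDuplicates {
    // Hypothetical stand-in for the question's TranObject.
    static final class Tran {
        final String loadId, vDate, dDate, pDate;
        Tran(String loadId, String vDate, String dDate, String pDate) {
            this.loadId = loadId;
            this.vDate = vDate;
            this.dDate = dDate;
            this.pDate = pDate;
        }
    }

    // The rule from the question: same vDate and dDate, and pDate either
    // missing on at least one side or identical on both sides.
    static boolean isDuplicate(Tran a, Tran b) {
        return Objects.equals(a.vDate, b.vDate)
            && Objects.equals(a.dDate, b.dDate)
            && (a.pDate == null || b.pDate == null || a.pDate.equals(b.pDate));
    }

    // Returns the loadId of every element that has at least one duplicate
    // elsewhere in the list, in list order.
    static List<String> duplicateIds(List<Tran> list) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < list.size(); i++) {
            for (int j = 0; j < list.size(); j++) {
                if (i != j && isDuplicate(list.get(i), list.get(j))) {
                    ids.add(list.get(i).loadId);
                    break;
                }
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Tran> list = Arrays.asList(
            new Tran("LOAD1", "20180102", "20180202", null),
            new Tran("LOAD2", "20180402", "20180403", null),
            new Tran("LOAD3", "20180102", "20180202", "20190302"),
            new Tran("LOAD4", "20180402", "20180403", null),
            new Tran("LOAD5", "20200202", "20200203", null));
        System.out.println(duplicateIds(list)); // prints [LOAD1, LOAD2, LOAD3, LOAD4]
    }
}
```

LOAD5 is left out because no other element shares its vDate and dDate.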
Sounds like TranObject needs an equals and hashCode method.
@Override
public boolean equals(Object obj) {
//self comparison and type check
if (this == obj) return true;
if (!(obj instanceof TranObject)) return false;
TranObject other = (TranObject) obj;
if(this.vDate.equals(other.vDate) && this.dDate.equals(other.dDate)) {
//if pDate is not given then consider them duplicate
if(this.pDate == null || other.pDate == null)
return true;
//if pDate are the same then they are duplicate, otherwise they are unique
return this.pDate.equals(other.pDate);
}
return false;
}
//based on the hashCode generated by Eclipse; pDate is left out because equals
//treats a missing pDate as a match, and equal objects must share a hash code
@Override
public int hashCode() {
final int prime = 31;
int result = 1;
result = prime * result + ((dDate == null) ? 0 : dDate.hashCode());
result = prime * result + ((vDate == null) ? 0 : vDate.hashCode());
return result;
}
Now that you have an equals method to determine if two TranObjects are considered equal (based on the rules you specified), just collect the elements that occur in the list more than once:
private static List<TranObject> collectDuplicates(List<TranObject> list) {
List<TranObject> result = new ArrayList<TranObject>();
for(TranObject element : list) {
if(Collections.frequency(list, element) > 1)
result.add(element);
}
return result;
}
This will return all elements that have a duplicate.
Note: collectDuplicates does not return a unique list of the elements that are duplicated. Instead, it returns a list of each duplicated element (as required by OP's question).
Problem
I have the requirement to sort a list by a certain property of each object in that list. This is a standard action supported in most languages.
However, there is an additional requirement that certain items may depend on others and, as such, must not appear in the sorted list until the items they depend on have appeared first, even if this requires going against the normal sort order. Any such 'blocked' item should appear in the list the moment the items 'blocking' it have been added to the output list.
An Example
If I have items:
[{'a',6},{'b',1},{'c',5},{'d',15},{'e',12},{'f',20},{'g',14},{'h',7}]
Sorting these normally by the numeric value will get:
[{'b',1},{'c',5},{'a',6},{'h',7},{'e',12},{'g',14},{'d',15},{'f',20}]
However, if the following constraints are enforced:
a depends on e
g depends on d
c depends on b
Then this result is invalid. Instead, the result should be:
[{'b',1},{'c',5},{'h',7},{'e',12},{'a',6},{'d',15},{'g',14},{'f',20}]
Where b, c, d, e, f and h have been sorted in correct order b, c, h, e, d and f; both a and g got delayed until e and d respectively had been output; and c did not need delaying, as the value it depended on, b, had already been output.
What I have already tried
Initially I investigated if this was possible using basic Java comparators, where the comparator implementation was something like:
private Map<MyObj,Set<MyObj>> dependencies; // parent to set of children
public int compare(MyObj x, MyObj y) {
if (dependencies.get(x).contains(y)) {
return 1;
} else if (dependencies.get(y).contains(x)) {
return -1;
} else if (x.getValue() < y.getValue()) {
return -1;
} else if (x.getValue() > y.getValue()) {
return 1;
} else {
return 0;
}
}
However, this breaks the requirement that Java comparators be transitive. From the Java documentation:
((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
However, in the above example
a(6) < h(7) : true
h(7) < e(12) : true
a(6) < e(12) : false
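The violation can be reproduced in a few lines. This is a sketch only: the class IntransitiveComparatorDemo and its Item type are hypothetical stand-ins for MyObj, and getOrDefault is used so that items without registered dependencies do not cause a NullPointerException.

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class IntransitiveComparatorDemo {
    // Hypothetical stand-in for MyObj: a name plus the numeric sort value.
    static final class Item {
        final String name;
        final int value;
        Item(String name, int value) {
            this.name = name;
            this.value = value;
        }
    }

    // Dependency-aware comparator: an item sorts after anything it depends on,
    // otherwise items are ordered by value.
    static Comparator<Item> comparator(Map<Item, Set<Item>> dependencies) {
        return (x, y) -> {
            if (dependencies.getOrDefault(x, Collections.emptySet()).contains(y)) return 1;
            if (dependencies.getOrDefault(y, Collections.emptySet()).contains(x)) return -1;
            return Integer.compare(x.value, y.value);
        };
    }

    public static void main(String[] args) {
        Item a = new Item("a", 6);
        Item h = new Item("h", 7);
        Item e = new Item("e", 12);
        Map<Item, Set<Item>> deps = new HashMap<>();
        deps.put(a, Collections.singleton(e)); // a depends on e
        Comparator<Item> cmp = comparator(deps);
        System.out.println(cmp.compare(a, h)); // -1: a sorts before h
        System.out.println(cmp.compare(h, e)); // -1: h sorts before e
        System.out.println(cmp.compare(a, e)); //  1: yet a sorts after e
    }
}
```

Because a < h and h < e do not imply a < e here, sort routines that rely on the Comparator contract may produce an arbitrary order.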
Instead, I have come up with the code below, which works but seems massively oversized and overly complex for what seems like a simple problem. (Note: This is a slightly cut down version of the class. It can also be viewed and run at https://ideone.com/XrhSeA)
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;
import java.util.Objects;
import java.util.PriorityQueue;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
public final class ListManager<ValueType extends Comparable<ValueType>> {
private static final class ParentChildrenWrapper<ValueType> {
private final ValueType parent;
private final Set<ValueType> childrenByReference;
public ParentChildrenWrapper(ValueType parent, Set<ValueType> childrenByReference) {
this.parent = parent;
this.childrenByReference = childrenByReference;
}
public ValueType getParent() {
return this.parent;
}
public Set<ValueType> getChildrenByReference() {
return this.childrenByReference;
}
}
private static final class QueuedItem<ValueType> implements Comparable<QueuedItem<ValueType>> {
private final ValueType item;
private final int index;
public QueuedItem(ValueType item, int index) {
this.item = item;
this.index = index;
}
public ValueType getItem() {
return this.item;
}
public int getIndex() {
return this.index;
}
@Override
public int compareTo(QueuedItem<ValueType> other) {
if (this.index < other.index) {
return -1;
} else if (this.index > other.index) {
return 1;
} else {
return 0;
}
}
}
private final Set<ValueType> unsortedItems;
private final Map<ValueType, Set<ValueType>> dependentsOfParents;
public ListManager() {
this.unsortedItems = new HashSet<>();
this.dependentsOfParents = new HashMap<>();
}
public void addItem(ValueType value) {
this.unsortedItems.add(value);
}
public final void registerDependency(ValueType parent, ValueType child) {
if (!this.unsortedItems.contains(parent)) {
throw new IllegalArgumentException("Unrecognized parent");
} else if (!this.unsortedItems.contains(child)) {
throw new IllegalArgumentException("Unrecognized child");
} else if (Objects.equals(parent,child)) {
throw new IllegalArgumentException("Parent and child are the same");
} else {
this.dependentsOfParents.computeIfAbsent(parent, __ -> new HashSet<>()).add(child);
}
}
public List<ValueType> createSortedList() {
// Create a copy of dependentsOfParents where the sets of children can be modified without impacting the original.
// These sets will represent the set of children for each parent that are yet to be dealt with, and such sets will shrink as more items are processed.
Map<ValueType, Set<ValueType>> blockingDependentsOfParents = new HashMap<>(this.dependentsOfParents.size());
for (Map.Entry<ValueType, Set<ValueType>> parentEntry : this.dependentsOfParents.entrySet()) {
Set<ValueType> childrenOfParent = parentEntry.getValue();
if (childrenOfParent != null && !childrenOfParent.isEmpty()) {
blockingDependentsOfParents.put(parentEntry.getKey(), new HashSet<>(childrenOfParent));
}
}
// Compute a list of which children impact which parents, alongside the set of children belonging to each parent.
// This will allow a child to remove itself from all of its parents' lists of blocking children.
Map<ValueType,List<ParentChildrenWrapper<ValueType>>> childImpacts = new HashMap<>();
for (Map.Entry<ValueType, Set<ValueType>> entry : blockingDependentsOfParents.entrySet()) {
ValueType parent = entry.getKey();
Set<ValueType> childrenForParent = entry.getValue();
ParentChildrenWrapper<ValueType> childrenForParentWrapped = new ParentChildrenWrapper<>(parent,childrenForParent);
for (ValueType child : childrenForParent) {
childImpacts.computeIfAbsent(child, __ -> new LinkedList<>()).add(childrenForParentWrapped);
}
}
// If there are no relationships, the remaining code can be massively optimised.
boolean hasNoRelationships = blockingDependentsOfParents.isEmpty();
// Create a pre-sorted stream of items.
Stream<ValueType> rankedItemStream = this.unsortedItems.stream().sorted();
List<ValueType> outputList;
if (hasNoRelationships) {
// There are no relationships, and as such, the stream is already in a perfectly fine order.
outputList = rankedItemStream.collect(Collectors.toList());
} else {
Iterator<ValueType> rankedIterator = rankedItemStream.iterator();
int queueIndex = 0;
outputList = new ArrayList<>(this.unsortedItems.size());
// A collection of items that have been visited but are blocked by children, stored in map form for easy deletion.
Map<ValueType,QueuedItem<ValueType>> lockedItems = new HashMap<>();
// A list of items that have been freed from their blocking children, but have yet to be processed, ordered by order originally encountered.
PriorityQueue<QueuedItem<ValueType>> freedItems = new PriorityQueue<>();
while (true) {
// Grab the earliest-seen item which was once locked but has now been freed. Otherwise, grab the next unseen item.
ValueType item;
boolean mustBeUnblocked;
QueuedItem<ValueType> queuedItem = freedItems.poll();
if (queuedItem == null) {
if (rankedIterator.hasNext()) {
item = rankedIterator.next();
mustBeUnblocked = false;
} else {
break;
}
} else {
item = queuedItem.getItem();
mustBeUnblocked = true;
}
// See if this item has any children that are blocking it from being added to the output list.
Set<ValueType> childrenWaitingUpon = blockingDependentsOfParents.get(item);
if (childrenWaitingUpon == null || childrenWaitingUpon.isEmpty()) {
// There are no children blocking this item, so start removing it from all blocking lists.
// Get a list of all parents that this item was blocking, if there are any.
List<ParentChildrenWrapper<ValueType>> childImpact = childImpacts.get(item);
if (childImpact != null) {
// Iterate over all those parents
ListIterator<ParentChildrenWrapper<ValueType>> childImpactIterator = childImpact.listIterator();
while (childImpactIterator.hasNext()) {
// Remove this item from that parent's blocking children.
ParentChildrenWrapper<ValueType> wrappedParentImpactedByChild = childImpactIterator.next();
Set<ValueType> childrenOfParentImpactedByChild = wrappedParentImpactedByChild.getChildrenByReference();
childrenOfParentImpactedByChild.remove(item);
// Does this parent no longer have any children blocking it?
if (childrenOfParentImpactedByChild.isEmpty()) {
// Remove it from the children impacts map, to prevent unnecessary processing of a now empty set in future iterations.
childImpactIterator.remove();
// If this parent was locked, mark it as now freed.
QueuedItem<ValueType> freedQueuedItem = lockedItems.remove(wrappedParentImpactedByChild.getParent());
if (freedQueuedItem != null) {
freedItems.add(freedQueuedItem);
}
}
}
// If there are no longer any parents at all being blocked by this child, remove it from the map.
if (childImpact.isEmpty()) {
childImpacts.remove(item);
}
}
outputList.add(item);
} else if (mustBeUnblocked) {
throw new IllegalStateException("Freed item is still blocked. This should not happen.");
} else {
// Mark the item as locked.
lockedItems.put(item,new QueuedItem<>(item,queueIndex++));
}
}
// Check that all items were processed successfully. Given there is only one path that will add an item to the output list without an exception, we can just compare sizes.
if (outputList.size() != this.unsortedItems.size()) {
throw new IllegalStateException("Could not complete ordering. Are there recursive chains of items?");
}
}
return outputList;
}
}
My question
Is there an already existing algorithm, or an algorithm significantly shorter than the above, that will allow this to be done?
While the language I am developing in is Java, and the code above is in Java, language-independent answers that I could implement in Java are also fine.
This is called topological sorting. You can model "blocking" as edges of a directed graph. This should work if there are no circular "blockings".
I've done this in <100 lines of C# code (with comments). This implementation seems a little complicated.
Here is the outline of the algorithm
1. Create a priority queue keyed by the value that you want to sort by.
2. Insert all the items that have no incoming "blocking" connections.
3. While there are elements in the queue:
   - Take an element off the queue and put it in your resulting list.
   - If any elements were directly blocked by this element and were not visited previously, put them into the queue (an element can have more than one blocking element, so check that all of them have been output).
4. The list of unprocessed elements should be empty at the end; otherwise you had a cycle in your dependencies.
This is essentially a topological sort with a built-in priority for nodes. Keep in mind that the result can be quite surprising depending on the number of connections in your graph (e.g. it is possible to get elements that are in reverse order).
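The outline can be sketched in Java roughly as follows. This is an illustration rather than the answerer's C# code: the class PriorityTopoSort, its sort method, and the representation of items as names mapped to values are all assumptions made for the example. It also assumes that every dependency refers to an item present in values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

class PriorityTopoSort {
    // values: item name -> sort key.
    // dependsOn: item name -> names that must be output before it.
    static List<String> sort(Map<String, Integer> values, Map<String, Set<String>> dependsOn) {
        Map<String, Integer> remaining = new HashMap<>();   // unmet dependencies per item
        Map<String, List<String>> blocks = new HashMap<>(); // blocker -> items waiting on it
        for (String item : values.keySet()) {
            Set<String> deps = dependsOn.getOrDefault(item, Collections.emptySet());
            remaining.put(item, deps.size());
            for (String dep : deps) {
                blocks.computeIfAbsent(dep, k -> new ArrayList<>()).add(item);
            }
        }
        // Queue of items whose dependencies have all been output, smallest value first.
        PriorityQueue<String> ready =
            new PriorityQueue<>((x, y) -> Integer.compare(values.get(x), values.get(y)));
        for (String item : values.keySet()) {
            if (remaining.get(item) == 0) ready.add(item);
        }
        List<String> out = new ArrayList<>();
        while (!ready.isEmpty()) {
            String item = ready.poll();
            out.add(item);
            // Release every item that was waiting only on this one.
            for (String waiting : blocks.getOrDefault(item, Collections.emptyList())) {
                if (remaining.merge(waiting, -1, Integer::sum) == 0) ready.add(waiting);
            }
        }
        if (out.size() != values.size()) {
            throw new IllegalStateException("Cycle in dependencies");
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> values = new HashMap<>();
        values.put("a", 6);  values.put("b", 1);  values.put("c", 5);  values.put("d", 15);
        values.put("e", 12); values.put("f", 20); values.put("g", 14); values.put("h", 7);
        Map<String, Set<String>> dependsOn = new HashMap<>();
        dependsOn.put("a", Collections.singleton("e"));
        dependsOn.put("g", Collections.singleton("d"));
        dependsOn.put("c", Collections.singleton("b"));
        System.out.println(sort(values, dependsOn)); // prints [b, c, h, e, a, d, g, f]
    }
}
```

For the question's data this matches the requested order. Items with equal values come out in an unspecified relative order, since PriorityQueue does not break ties in any particular way.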
As Pratik Deoghare stated in their answer, you can use topological sorting. You can view your "dependencies" as arcs of a Directed Acyclic Graph (DAG). The restriction that the dependencies on the objects are acyclic is important as topological sorting is only possible "if and only if the graph has no directed cycles." The dependencies also of course don't make sense otherwise (i.e. a depends on b and b depends on a doesn't make sense because this is a cyclic dependency).
Once you do topological sorting, the graph can be interpreted as having "layers". To finish the solution, you need to sort within these layers. If there are no dependencies in the objects, this leads to there being just one layer where all the nodes in the DAG are on the same layer and then they are sorted based on their value.
The overall running time is still O(n log n), because topological sorting is linear in the number of items plus dependencies and sorting within the layers is O(n log n). See the topological sorting wiki for a full running time analysis.
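One possible reading of the "layers" idea is sketched below, assuming a layer is the length of the longest dependency chain leading to an item and that the dependencies are acyclic (a cycle would recurse without end). The class LayeredSort and its methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LayeredSort {
    // Layer 0 holds items without dependencies; otherwise the layer is
    // one more than the deepest layer among the item's dependencies.
    static int layer(String item, Map<String, Set<String>> dependsOn, Map<String, Integer> memo) {
        Integer known = memo.get(item);
        if (known != null) return known;
        int depth = 0;
        for (String dep : dependsOn.getOrDefault(item, Collections.emptySet())) {
            depth = Math.max(depth, 1 + layer(dep, dependsOn, memo));
        }
        memo.put(item, depth);
        return depth;
    }

    // Orders items by layer first and by value within each layer.
    static List<String> sort(Map<String, Integer> values, Map<String, Set<String>> dependsOn) {
        Map<String, Integer> layers = new HashMap<>();
        List<String> out = new ArrayList<>(values.keySet());
        for (String item : out) {
            layer(item, dependsOn, layers);
        }
        out.sort((x, y) -> {
            int byLayer = Integer.compare(layers.get(x), layers.get(y));
            return byLayer != 0 ? byLayer : Integer.compare(values.get(x), values.get(y));
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> values = new HashMap<>();
        values.put("a", 6);  values.put("b", 1);  values.put("c", 5);  values.put("d", 15);
        values.put("e", 12); values.put("f", 20); values.put("g", 14); values.put("h", 7);
        Map<String, Set<String>> dependsOn = new HashMap<>();
        dependsOn.put("a", Collections.singleton("e"));
        dependsOn.put("g", Collections.singleton("d"));
        dependsOn.put("c", Collections.singleton("b"));
        System.out.println(sort(values, dependsOn)); // prints [b, h, e, d, f, c, a, g]
    }
}
```

Under this reading, every blocked item waits for the whole previous layer, so the result respects the dependencies but differs from the order requested in the question, where a blocked item appears as soon as its blocker has been output.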
Since you said any language that could be converted to Java, I've done a combination of [what I think is] your algorithm and ghord's in C.
A lot of the code is boilerplate to handle arrays, searches, and array/list insertions that I believe can be reduced by using standard Java primitives. Thus, the amount of actual algorithm code is fairly small.
The algorithm I came up with is:
Given: A raw list of all elements and a dependency list
Copy elements that depend on another element to a "hold" list. Otherwise, copy them to a "sort" list.
Note: an alternative is to only use the sort list and just remove the nodes that depend on another to the hold list.
Sort the "sort" list.
For all elements in the dependency list, find the corresponding nodes in the sort list and the hold list. Insert the hold element into the sort list after the corresponding sort element.
Here's the code:
#include <stdio.h>
#include <stdlib.h>
// sort node definition
typedef struct {
int key;
int val;
} Node;
// dependency definition
typedef struct {
int keybef; // key of node that keyaft depends on
int keyaft; // key of node to insert
} Dep;
// raw list of all nodes
Node rawlist[] = {
{'a',6}, // depends on e
{'b',1},
{'c',5}, // depends on b
{'d',15},
{'e',12},
{'f',20},
{'g',14}, // depends on d
{'h',7}
};
// dependency list
Dep deplist[] = {
{'e','a'},
{'b','c'},
{'d','g'},
{0,0}
};
#define MAXLIST (sizeof(rawlist) / sizeof(rawlist[0]))
// hold list -- all nodes that depend on another
int holdcnt;
Node holdlist[MAXLIST];
// sort list -- all nodes that do _not_ depend on another
int sortcnt;
Node sortlist[MAXLIST];
// prtlist -- print all nodes in a list
void
prtlist(Node *node,int nodecnt,const char *tag)
{
printf("%s:\n",tag);
for (; nodecnt > 0; --nodecnt, ++node)
printf(" %c:%d\n",node->key,node->val);
}
// placenode -- put node into hold list or sort list
void
placenode(Node *node)
{
Dep *dep;
int holdflg;
holdflg = 0;
// decide if node depends on another
for (dep = deplist; dep->keybef != 0; ++dep) {
holdflg = (node->key == dep->keyaft);
if (holdflg)
break;
}
if (holdflg)
holdlist[holdcnt++] = *node;
else
sortlist[sortcnt++] = *node;
}
// sortcmp -- qsort compare function
int
sortcmp(const void *vlhs,const void *vrhs)
{
const Node *lhs = vlhs;
const Node *rhs = vrhs;
int cmpflg;
cmpflg = lhs->val - rhs->val;
return cmpflg;
}
// findnode -- find node in list that matches the given key
Node *
findnode(Node *node,int nodecnt,int key)
{
for (; nodecnt > 0; --nodecnt, ++node) {
if (node->key == key)
break;
}
return node;
}
// insert -- insert hold node into sorted list at correct spot
void
insert(Node *sort,Node *hold)
{
Node prev;
Node next;
int sortidx;
prev = *sort;
*sort = *hold;
++sortcnt;
for (; sort < &sortlist[sortcnt]; ++sort) {
next = *sort;
*sort = prev;
prev = next;
}
}
int
main(void)
{
Node *node;
Node *sort;
Node *hold;
Dep *dep;
prtlist(rawlist,MAXLIST,"RAW");
printf("DEP:\n");
for (dep = deplist; dep->keybef != 0; ++dep)
printf(" %c depends on %c\n",dep->keyaft,dep->keybef);
// place nodes into hold list or sort list
for (node = rawlist; node < &rawlist[MAXLIST]; ++node)
placenode(node);
prtlist(sortlist,sortcnt,"SORT");
prtlist(holdlist,holdcnt,"HOLD");
// sort the "sort" list
qsort(sortlist,sortcnt,sizeof(Node),sortcmp);
prtlist(sortlist,sortcnt,"SORT");
// add nodes from hold list to sort list
for (dep = deplist; dep->keybef != 0; ++dep) {
printf("inserting %c after %c\n",dep->keyaft,dep->keybef);
sort = findnode(sortlist,sortcnt,dep->keybef);
hold = findnode(holdlist,holdcnt,dep->keyaft);
insert(sort,hold);
prtlist(sortlist,sortcnt,"POST");
}
return 0;
}
Here's the program output:
RAW:
a:6
b:1
c:5
d:15
e:12
f:20
g:14
h:7
DEP:
a depends on e
c depends on b
g depends on d
SORT:
b:1
d:15
e:12
f:20
h:7
HOLD:
a:6
c:5
g:14
SORT:
b:1
h:7
e:12
d:15
f:20
inserting a after e
POST:
b:1
h:7
e:12
a:6
d:15
f:20
inserting c after b
POST:
b:1
c:5
h:7
e:12
a:6
d:15
f:20
inserting g after d
POST:
b:1
c:5
h:7
e:12
a:6
d:15
g:14
f:20
I think you are generally on the right track, and the core concept behind your solution is similar to the one I will post below. The general algorithm is as follows:
1. Create a map that associates each item to the items that depend upon it.
2. Insert elements with no dependencies into a heap.
3. Remove the top element from the heap.
4. Subtract 1 from the dependency count of each dependent of the element.
5. Add any elements with a dependency count of zero to the heap.
6. Repeat from step 3 until the heap is empty.
For simplicity I have replaced your ValueType with a String, but the same concepts apply.
The BlockedItem class:
import java.util.ArrayList;
import java.util.List;
public class BlockedItem implements Comparable<BlockedItem> {
private String value;
private int index;
private List<BlockedItem> dependentUpon;
private int dependencies;
public BlockedItem(String value, int index){
this.value = value;
this.index = index;
this.dependentUpon = new ArrayList<>();
this.dependencies = 0;
}
public String getValue() {
return value;
}
public List<BlockedItem> getDependentUpon() {
return dependentUpon;
}
public void addDependency(BlockedItem dependentUpon) {
this.dependentUpon.add(dependentUpon);
this.dependencies++;
}
@Override
public int compareTo(BlockedItem other){
return this.index - other.index;
}
public int countDependencies() {
return dependencies;
}
public int subtractDependent(){
return --this.dependencies;
}
@Override
public String toString(){
return "{'" + this.value + "', " + this.index + "}";
}
}
The BlockedItemHeapSort class:
import java.util.*;
public class BlockedItemHeapSort {
//maps all blockedItems to the blockItems which depend on them
private static Map<String, Set<BlockedItem>> generateBlockedMap(List<BlockedItem> unsortedList){
Map<String, Set<BlockedItem>> blockedMap = new HashMap<>();
//initialize a set for each element
unsortedList.stream().forEach(item -> {
Set<BlockedItem> dependents = new HashSet<>();
blockedMap.put(item.getValue(), dependents);
});
//place each element in the sets corresponding to its dependencies
unsortedList.stream().forEach(item -> {
if(item.countDependencies() > 0){
item.getDependentUpon().stream().forEach(dependency -> blockedMap.get(dependency.getValue()).add(item));
}
});
return blockedMap;
}
public static List<BlockedItem> sortBlockedItems(List<BlockedItem> unsortedList){
List<BlockedItem> sorted = new ArrayList<>();
Map<String, Set<BlockedItem>> blockedMap = generateBlockedMap(unsortedList);
PriorityQueue<BlockedItem> itemHeap = new PriorityQueue<>();
//put elements with no dependencies in the heap
unsortedList.stream().forEach(item -> {
if(item.countDependencies() == 0) itemHeap.add(item);
});
while(itemHeap.size() > 0){
//get the top element
BlockedItem item = itemHeap.poll();
sorted.add(item);
//for each element that depends upon item, decrease its dependency count
//if it has a zero dependency count after subtraction, add it to the heap
if(!blockedMap.get(item.getValue()).isEmpty()){
blockedMap.get(item.getValue()).stream().forEach(dependent -> {
if(dependent.subtractDependent() == 0) itemHeap.add(dependent);
});
}
}
return sorted;
}
}
You can modify this to more closely fit your use-case.
Java Code for topological sort:
static List<ValueType> topoSort(List<ValueType> vertices) {
List<ValueType> result = new ArrayList<>();
List<ValueType> todo = new LinkedList<>();
Collections.sort(vertices);
for (ValueType v : vertices){
todo.add(v);
}
outer:
while (!todo.isEmpty()) {
for (ValueType r : todo) {
if (!hasDependency(r, todo)) {
todo.remove(r);
result.add(r);
// no need to worry about concurrent modification
continue outer;
}
}
}
return result;
}
static boolean hasDependency(ValueType r, List<ValueType> todo) {
for (ValueType c : todo) {
if (r.getDependencies().contains(c))
return true;
}
return false;
}
ValueType is described like below:
class ValueType implements Comparable<ValueType> {
private Integer index;
private String value;
private List<ValueType> dependencies;
public ValueType(int index, String value, ValueType...dependencies){
this.index = index;
this.value = value;
this.dependencies = dependencies==null?null:Arrays.asList(dependencies);
}
public List<ValueType> getDependencies() {
return dependencies;
}
public void setDependencies(List<ValueType> dependencies) {
this.dependencies = dependencies;
}
@Override
public int compareTo(@NotNull ValueType o) {
return this.index.compareTo(o.index);
}
@Override
public String toString() {
return value +"(" + index +")";
}
}
And tested with these values:
public static void main(String[] args) {
//[{'a',6},{'b',1},{'c',5},{'d',15},{'e',12},{'f',20},{'g',14},{'h',7}]
//a depends on e
//g depends on d
//c depends on b
ValueType b = new ValueType(1,"b");
ValueType c = new ValueType(5,"c", b);
ValueType d = new ValueType(15,"d");
ValueType e = new ValueType(12,"e");
ValueType a = new ValueType(6,"a", e);
ValueType f = new ValueType(20,"f");
ValueType g = new ValueType(14,"g", d);
ValueType h = new ValueType(7,"h");
List<ValueType> valueTypes = Arrays.asList(a,b,c,d,e,f,g,h);
List<ValueType> r = topoSort(valueTypes);
for(ValueType v: r){
System.out.println(v);
}
}
I'm developing a Java application that reads a lot of string data like this:
1 cat (first read)
2 dog
3 fish
4 dog
5 fish
6 dog
7 dog
8 cat
9 horse
...(last read)
I need a way to keep all [string, occurrences] pairs in order from last read to first read.
string occurrences
horse 1 (first print)
cat 2
dog 4
fish 2 (last print)
Currently I use two lists:
1) List<String> input; where I add all the data
In my example:
input.add("cat");
input.add("dog");
input.add("fish");
...
2) List<String> possibilities; where I insert each string once, in this way:
if(possibilities.contains("cat")){
possibilities.remove("cat");
}
possibilities.add("cat");
In this way I get an ordered list of all the possibilities.
I use it like this:
int occurrence;
for(String possible:possibilities){
occurrence = Collections.frequency(input, possible);
System.out.println(possible + " " + occurrence);
}
That trick works well, but it's too slow (I have millions of inputs). Any help?
(English isn’t my first language, so please excuse any mistakes.)
Use a Map<String, Integer>, as @radoslaw pointed out. To keep insertion order, use a LinkedHashMap rather than a TreeMap, as described here:
LinkedHashMap keeps the keys in the order they were inserted, while a TreeMap is kept sorted via a Comparator or the natural Comparable ordering of the elements.
Imagine you have all the strings in some array, call it listOfAllStrings. Iterate over this array and use each string as a key in your map: if it does not exist yet, put it in the map with a count of 1; if it exists, add 1 to the current count.
Map<String, Integer> results = new LinkedHashMap<String, Integer>();
for (String s : listOfAllStrings) {
if (results.get(s) != null) {
results.put(s, results.get(s) + 1);
} else {
results.put(s, 1);
}
}
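If the output has to run from the most recently read string to the least recently read one, as in the question's example, one option is to remove and re-insert the key on every sighting so that the map's insertion order tracks the last sighting. The sketch below is an assumption-laden illustration, not part of the answer above: LastSeenCounter and countByLastSeen are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LastSeenCounter {
    // Counts occurrences and returns the entries ordered from the most
    // recently read string to the least recently read one.
    static List<Map.Entry<String, Integer>> countByLastSeen(Iterable<String> input) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String s : input) {
            Integer old = counts.remove(s);            // drop the old position, if any
            counts.put(s, old == null ? 1 : old + 1);  // re-insert at the end
        }
        List<Map.Entry<String, Integer>> result = new ArrayList<>(counts.entrySet());
        Collections.reverse(result);                   // last read first
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
            "cat", "dog", "fish", "dog", "fish", "dog", "dog", "cat", "horse");
        for (Map.Entry<String, Integer> e : countByLastSeen(input)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
        // prints:
        // horse 1
        // cat 2
        // dog 4
        // fish 2
    }
}
```

Each string costs one hash removal and one insertion, so the whole pass is linear in the number of inputs, unlike repeated calls to Collections.frequency.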
Make use of a TreeMap, which keeps its keys ordered as specified by the compare method of a MyStringComparator class. The comparator handles a MyString class that wraps String and adds an insertion index, like this:
// this better be immutable
class MyString {
private MyString() {}
public static MyString valueOf(String s, Long l) { ... }
private String string;
private Long index;
public Long getIndex() { return index; }
@Override
public int hashCode() { return string.hashCode(); }
// equality relies on string.equals()
@Override
public boolean equals(Object o) { return o instanceof MyString && string.equals(((MyString) o).string); }
}
class MyStringComparator implements Comparator<MyString> {
public int compare(MyString s1, MyString s2) {
return -s1.getIndex().compareTo(s2.getIndex());
}
}
Pass the comparator while constructing the map:
Map<MyString,Integer> map = new TreeMap<>(new MyStringComparator());
Then, while parsing your input, do
long counter = 0;
while (...) {
MyString item = MyString.valueOf(readString, counter++);
if (map.containsKey(item)) {
map.put(item, map.get(item) + 1);
} else {
map.put(item, 1);
}
}
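There will be a lot of instantiation because of the immutable class, and the comparator will not be consistent with equals. Be aware that a TreeMap locates keys with the comparator rather than with equals, so check that lookups behave as you expect.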
Disclaimer: this is untested code just to show what I'd do, I'll come back and recheck it when I get my hands on a compiler.
Here is a complete solution for your problem:
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class DataDto implements Comparable<DataDto> {
// the first sighting already counts as one occurrence
public int count = 1;
public String string;
public long lastSeenTime;
public DataDto(String string) {
this.string = string;
// same clock as the updates in main(), so the values are comparable
this.lastSeenTime = System.nanoTime();
}
@Override
public boolean equals(Object object) {
if (object instanceof DataDto) {
DataDto temp = (DataDto) object;
return temp.string != null && temp.string.equals(this.string);
}
return false;
}
@Override
public int hashCode() {
return string.hashCode();
}
// orders the most recently seen entry first
@Override
public int compareTo(DataDto o) {
return Long.compare(o.lastSeenTime, this.lastSeenTime);
}
@Override
public String toString() {
return this.string + " : " + this.count;
}
public static void main(String[] args) {
String[] listOfAllStrings = {"horse", "cat", "dog", "fish", "cat", "fish", "dog", "cat", "horse", "fish"};
Map<String, DataDto> results = new HashMap<String, DataDto>();
for (String s : listOfAllStrings) {
DataDto dataDto = results.get(s);
if (dataDto != null) {
dataDto.count = dataDto.count + 1;
dataDto.lastSeenTime = System.nanoTime();
} else {
dataDto = new DataDto(s);
results.put(s, dataDto);
}
}
List<DataDto> finalResults = new ArrayList<DataDto>(results.values());
System.out.println(finalResults);
Collections.sort(finalResults);
System.out.println(finalResults);
}
}
Output:
[horse : 2, cat : 3, fish : 3, dog : 2]
[fish : 3, horse : 2, cat : 3, dog : 2]
I think this solution will be suitable for your requirement.
If you know that your data is not going to exceed your memory capacity when you read it all into memory, then the solution is simple: use a LinkedList and a LinkedHashMap.
For example, if you use a linked list:
LinkedList<String> input = new LinkedList<>();
You then proceed to use input.add() as you did originally. But once the input list is complete, you basically use Jordi Castilla's solution, feeding it the entries of the linked list in reverse order. To do that:
Iterator<String> iter = input.descendingIterator();
LinkedHashMap<String,Integer> map = new LinkedHashMap<>();
while (iter.hasNext()) {
String s = iter.next();
if ( map.containsKey(s)) {
map.put( s, map.get(s) + 1);
} else {
map.put(s, 1);
}
}
Now, the only real difference between his solution and mine is that I'm using descendingIterator(), which is a method of LinkedList that gives you the entries in backwards order, from "horse" to "cat".
The LinkedHashMap will keep the proper order - whatever was entered first will be printed first, and because we entered things in reverse order, then whatever was read last will be printed first. So if you print your map the result will be:
{horse=1, cat=2, dog=4, fish=2}
If you have a very long file, and you can't load the entire list of strings into memory, you had better keep just the map of frequencies. In this case, in order to keep the order of entry, we'll use an object such as this:
private static class Entry implements Comparable<Entry> {
private static long nextOrder = Long.MIN_VALUE;
private String str;
private int frequency = 1;
private long order = nextOrder++;
public Entry(String str) {
this.str = str;
}
public String getString() {
return str;
}
public int getFrequency() {
return frequency;
}
public void updateEntry() {
frequency++;
order = nextOrder++;
}
@Override
public int compareTo(Entry e) {
if ( order > e.order )
return -1;
if ( order < e.order )
return 1;
return 0;
}
@Override
public String toString() {
return String.format( "%s: %d", str, frequency );
}
}
The trick here is that every time you update the entry (add one to the frequency), it also updates the order. But the compareTo() method orders Entry objects from high order (updated/inserted later) to low order (updated/inserted earlier).
Now you can use a simple HashMap<String,Entry> to store the information as you read it (I'm assuming you are reading from some sort of scanner):
Map<String,Entry> m = new HashMap<>();
while ( scanner.hasNextLine() ) {
String str = scanner.nextLine();
Entry entry = m.get(str);
if ( entry == null ) {
entry = new Entry(str);
m.put(str, entry);
} else {
entry.updateEntry();
}
}
scanner.close();
Now you can sort the values of the entries:
List<Entry> orderedList = new ArrayList<Entry>(m.values());
m = null;
Collections.sort(orderedList);
Running System.out.println(orderedList) will give you:
[horse: 1, cat: 2, dog: 4, fish: 2]
In principle, you could use a TreeMap whose keys contain the "order" information, rather than a plain HashMap followed by sorting, but I prefer to have neither mutable keys in a map nor keys that change constantly. Here we only change the values as we fill the map, and each key is inserted into the map only once.
What you could do:
Reverse the order of the list using Collections.reverse(input). This runs in linear time, O(n).
Create a Set from the input list. A Set guarantees uniqueness. To preserve insertion order, you'll need a LinkedHashSet.
Iterate over this set, just as you did above.
Code:
/* I don't know what logic you use to create the input list,
* so I'm using your input example. */
List<String> input = Arrays.asList("cat", "dog", "fish", "dog",
"fish", "dog", "dog", "cat", "horse");
/* by the way, this changes the input list!
* Copy it in case you need to preserve the original input. */
Collections.reverse(input);
Set<String> possibilities = new LinkedHashSet<String>(input);
for (String s : possibilities) {
System.out.println(s + " " + Collections.frequency(input, s));
}
Output:
horse 1
cat 2
dog 4
fish 2
I'm trying to iterate over an associative array and tally up how many instances of each combination there are (for use in determining the conditional probability of A given B).
For example, in PHP I can iterate over the indexed array $Data[i] given input (A, ~B) and get a result of 2.
$Data[0] = array("A", "~B");
$Data[1] = array("~A", "B");
$Data[2] = array("A", "~B");
$Data[3] = array("A", "B");
I tried replicating this in Java with maps, but a map allows only one value per key, so the following wouldn't work because key A is used for three entries.
map.put("A", "~B");
map.put("~A", "B");
map.put("A", "~B");
map.put("A", "B");
Is there something else I can use?
Thanks!
You can use a Map<T,List<U>> (in your case, Map<String,List<String>>), or you can use a Multimap<String,String> from a library such as Guava (or the Apache Commons version of it, MultiMap).
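As a rough sketch of the first option using only the JDK: each key maps to a list, so "A" can hold several values. The class name MultiValueDemo and the helper put are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MultiValueDemo {

    // Appends value to the list stored under key, creating the list on first use.
    // (Hypothetical helper for illustration.)
    static void put(Map<String, List<String>> map, String key, String value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> map = new HashMap<>();
        put(map, "A", "~B");
        put(map, "~A", "B");
        put(map, "A", "~B");
        put(map, "A", "B");

        // How many times was "A" paired with "~B"?
        int count = Collections.frequency(map.get("A"), "~B");
        System.out.println(count); // 2
    }
}
```

Note that map.get(key) returns null for a key that was never added, so a real lookup would probably want getOrDefault with an empty list.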
If iteration of the structure is your primary goal, a List<ConditionResult> would seem to be the most appropriate choice for your situation, where ConditionResult is given below.
If maintaining a count of the combinations is the sole goal, then a Map<ConditionResult,Integer> would also work well.
public class ConditionResult
{
// Assuming strings for the data types,
// but an enum might be more appropriate.
private String condition;
private String result;
public ConditionResult(String condition, String result)
{
this.condition = condition;
this.result = result;
}
public String getCondition() { return condition; }
public String getResult() { return result; }
@Override
public boolean equals(Object object)
{
if (this == object) return true;
if (object == null) return false;
if (getClass() != object.getClass()) return false;
ConditionResult other = (ConditionResult) object;
if (condition == null)
{
if (other.condition != null) return false;
} else if (!condition.equals(other.condition)) return false;
if (result == null)
{
if (other.result != null) return false;
} else if (!result.equals(other.result)) return false;
return true;
}
// Need to implement hashCode as well, for equals consistency...
}
Iteration and counting could be done as:
/**
* Count the instances of condition to result in the supplied results list
*/
public int countInstances(List<ConditionResult> results, String condition, String result)
{
int count = 0;
ConditionResult match = new ConditionResult(condition,result);
// The loop variable must not reuse the name of the 'result' parameter.
for (ConditionResult candidate : results)
{
if (match.equals(candidate)) count++;
}
return count;
}
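For the Map<ConditionResult,Integer> variant, hashCode has to agree with equals, otherwise equal keys can land in different buckets. Below is a minimal sketch under that assumption. Pair is a hypothetical stand-in for ConditionResult so the example compiles on its own, and tally is an invented helper name.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class CombinationCounts {

    // Hypothetical stand-in for ConditionResult, with equals and hashCode kept consistent.
    static final class Pair {
        final String condition;
        final String result;

        Pair(String condition, String result) {
            this.condition = condition;
            this.result = result;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Pair)) return false;
            Pair other = (Pair) o;
            // Objects.equals handles null fields on either side
            return Objects.equals(condition, other.condition)
                    && Objects.equals(result, other.result);
        }

        @Override
        public int hashCode() {
            // Built from the same fields that equals compares
            return Objects.hash(condition, result);
        }
    }

    // Tallies how often each (condition, result) combination occurs.
    static Map<Pair, Integer> tally(String[][] data) {
        Map<Pair, Integer> counts = new HashMap<>();
        for (String[] row : data) {
            counts.merge(new Pair(row[0], row[1]), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] data = { {"A", "~B"}, {"~A", "B"}, {"A", "~B"}, {"A", "B"} };
        Map<Pair, Integer> counts = tally(data);
        System.out.println(counts.get(new Pair("A", "~B"))); // 2
    }
}
```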