Organizing a set of sets by size in Java

I'm writing a simple program to recursively find all subsets of some larger set. I've got it working, but I wanted to order all of the sets in order by size.
I posted my working code below.
import java.util.*;
public class AllSubsets {
public static void main(String[] args) {
// Change contents of this array to easily change contents of set.
Integer[] setContents = {3, 6, 8, 9, 10, 22};
// create initial unused set by dumping all of the array into a set.
Set<Integer> unused = new HashSet<Integer>(Arrays.asList(setContents));
// create initial empty set for used set.
Set<Integer> used = new HashSet<Integer>();
// create output set of sets.
Set<Set<Integer>> allSets = new HashSet<Set<Integer>>();
allSets.add(used);
// find all sets recursively
findAllSets(used, unused, allSets);
// print out results
System.out.println(allSets);
}
public static void findAllSets(Set<Integer> used, Set<Integer> unused,
Set<Set<Integer>> allSets) {
if (unused != null) {
Set<Integer> copyOfUnused = new HashSet<Integer>(unused);
for (Integer val : copyOfUnused) {
unused.remove(val);
used.add(val);
allSets.add(new HashSet<Integer>(used));
findAllSets(used, unused, allSets);
used.remove(val);
unused.add(val);
}
}
}
}
I was wondering what the best way to order these sets by size would be. I tried a TreeSet that holds multiple HashSet objects, with its comparator overridden. This compiled but didn't store the values correctly. The code is very similar to the code above, so I'll show only the main difference:
Set<Set<Integer>> allSets =
new TreeSet<Set<Integer>>(new Comparator<Set<Integer>>() {
public int compare(Set<Integer> a, Set<Integer> b) {
return a.size() - b.size();
}
});
In this version the code compiles, but the objects are not stored correctly. The proper sets are computed and added to "allSets" in the recursive method (verified with println), but it only ever holds one set at a time. I suspect it's because I supplied a custom Comparator for Set while using HashSets. Is there a better way to organize my sets, or is this just a small bug in my code?
Thanks!!

A TreeSet<Set<Integer>> with that comparator will store only one Set of any given size, because it treats two different sets of the same size as equal: TreeSet takes compare(a, b) == 0 to mean that a and b are the same element.
If you want to get all of the sets and then print them in order of size, gather them in a regular (Hash)Set and then sort the entries:
List<Set<Integer>> listOfSets = new ArrayList<>(allSets);
Collections.sort(listOfSets, <your comparator above>);
System.out.println(listOfSets);
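To make the difference concrete, here is a minimal sketch (the class name SubsetsBySize and the sample sets are made up for illustration). It puts the same four sets into a TreeSet ordered by size only, and into a list that is sorted by size afterwards. The TreeSet keeps one set per size, while the sorted list keeps all of them.

```java
import java.util.*;

public class SubsetsBySize {
    // Orders sets by size only; two sets of equal size compare as 0.
    static final Comparator<Set<Integer>> BY_SIZE = Comparator.comparingInt(Set::size);

    // Copies the sets into a list and sorts the list by size; nothing is dropped.
    static List<Set<Integer>> sortedBySize(Collection<Set<Integer>> sets) {
        List<Set<Integer>> result = new ArrayList<>(sets);
        result.sort(BY_SIZE);
        return result;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> allSets = new HashSet<>();
        allSets.add(new HashSet<>());                      // {}
        allSets.add(new HashSet<>(Arrays.asList(3)));      // {3}
        allSets.add(new HashSet<>(Arrays.asList(6)));      // {6}
        allSets.add(new HashSet<>(Arrays.asList(3, 6)));   // {3, 6}

        // The TreeSet treats {3} and {6} as the same element (same size).
        Set<Set<Integer>> tree = new TreeSet<>(BY_SIZE);
        tree.addAll(allSets);
        System.out.println(tree.size());                   // 3

        // Sorting a list keeps every set.
        System.out.println(sortedBySize(allSets).size());  // 4
    }
}
```

The relative order of sets with the same size in the sorted list depends on the HashSet iteration order, so only the grouping by size is predictable.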

Related

How Set checks for duplicates? Java HashSet

The first code below outputs "1" and the second outputs "2". I don't understand why this happens. Is it because I am adding the same object? How can I get the desired output of 2?
import java.util.*;
public class maptest {
public static void main(String[] args) {
Set<Integer[]> set = new HashSet<Integer[]>();
Integer[] t = new Integer[2];
t[0] = t[1] = 1;
set.add(t);
Integer[] t1 = new Integer[2];
t[0] = t[1] = 0;
set.add(t);
System.out.println(set.size());
}
}
Second Code:
import java.util.*;
public class maptest {
public static void main(String[] args) {
Set<Integer[]> set = new HashSet<Integer[]>();
Integer[] t = new Integer[2];
t[0] = t[1] = 1;
set.add(t);
Integer[] t1 = new Integer[2];
t1[0] = t1[1] = 1;
set.add(t1);
System.out.println(set.size());
}
}
HashSet calls t.hashCode(), and since arrays don't override Object.hashCode(), the same array object always has the same hash code. Changing the array's contents therefore does not affect its hash code. To get a hash code based on an array's contents, call Arrays.hashCode.
You shouldn't really put mutable things inside sets anyway, so I would suggest putting immutable lists into the set instead. If you want to stick with arrays, just create a new array, as you did with t1, and put that into the set.
EDIT:
In the second code, t and t1 are two different arrays, so their hash codes will almost certainly differ, and equals() treats them as different objects. Again, this is because arrays do not override hashCode (or equals). The array's contents don't affect the hash code, whether or not they are the same.
A Set contains only distinct elements (that is its nature). The basic implementation, HashSet, uses hashCode() to find the bucket that may contain the value and then equals(Object) to check whether an equal element is already there.
Arrays are simple: their hashCode() is the default inherited from Object, and therefore depends on the reference. Their equals(Object) is also the one from Object: it checks only identity, that is, the references must be the same.
In Java terms:
public boolean equals(Object other) {
return other == this;
}
If you want to store arrays as distinct values by content, you will have to either try your luck with a TreeSet and a proper Comparator implementation, or wrap your array in a List or another Set:
Set<List<Integer>> set = new HashSet<>();
Integer[] t = new Integer[]{1, 1};
set.add(Arrays.asList(t));
Integer[] t1 = new Integer[]{1, 1};
set.add(Arrays.asList(t1));
System.out.println(set.size());
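For the TreeSet route mentioned above, a possible sketch is shown here. It assumes Java 9 or later for Arrays.compare, and the class and method names are hypothetical. The comparator orders arrays lexicographically by content, so two arrays with the same elements count as one entry.

```java
import java.util.*;

public class ArraySetByContent {
    // Compares arrays element by element instead of by reference.
    static final Comparator<Integer[]> BY_CONTENT = (a, b) -> Arrays.compare(a, b);

    // Counts how many arrays are distinct by content.
    static int countDistinct(List<Integer[]> arrays) {
        Set<Integer[]> set = new TreeSet<>(BY_CONTENT);
        set.addAll(arrays);
        return set.size();
    }

    public static void main(String[] args) {
        Integer[] t = {1, 1};
        Integer[] t1 = {1, 1};   // same content as t, different object
        Integer[] t2 = {0, 1};   // different content
        System.out.println(countDistinct(Arrays.asList(t, t1, t2)));   // 2
    }
}
```

The same caveat about mutability applies: changing an array after it has been added would silently break the ordering that the TreeSet relies on.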
As for the mutability of objects used in a Set or as a Map key:
fields used by boolean equals(Object) should not be mutated, because the mutated object could then be equal to another element. The Set would no longer contain distinct values.
fields used by int hashCode() should not be mutated for hash-based collections (HashSet, HashMap) because, as said above, they operate by putting items in buckets. If the hashCode() changes, the bucket the object belongs in will likely change as well: the Set can no longer find the object, and could then end up containing the same reference twice.
fields used by int compareTo(T) or Comparator::compare(T,T) should not be mutated for the same reason as equals: the SortedSet would not know there was a change.
If the need arises, first remove the item from the set, then mutate it, then re-add it.
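The hashCode() point can be reproduced with a small sketch. The class and method names are invented for illustration, and a mutable ArrayList stands in for any element whose hash code depends on its contents. The first method mutates the element while it is in the set; the second follows the remove, mutate, re-add sequence.

```java
import java.util.*;

public class MutatedElementDemo {
    // Mutates an element while it is inside the set, then looks it up again.
    static boolean foundAfterInPlaceMutation() {
        Set<List<Integer>> set = new HashSet<>();
        List<Integer> element = new ArrayList<>(Arrays.asList(1, 1));
        set.add(element);
        element.set(0, 0);              // equals() and hashCode() of the list change
        return set.contains(element);   // the lookup uses the new hash code
    }

    // Removes the element first, mutates it, then adds it back.
    static boolean foundAfterRemoveMutateAdd() {
        Set<List<Integer>> set = new HashSet<>();
        List<Integer> element = new ArrayList<>(Arrays.asList(1, 1));
        set.add(element);
        set.remove(element);
        element.set(0, 0);
        set.add(element);
        return set.contains(element);
    }

    public static void main(String[] args) {
        // Typically false: the set still files the element under its old hash code.
        System.out.println(foundAfterInPlaceMutation());
        // true: the element was re-added under its new hash code.
        System.out.println(foundAfterRemoveMutateAdd());
    }
}
```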
You're adding the Object to a Set, which
contains no duplicate elements.
You are only ever adding one Object to the Set; you only change the values of its contents. To see what I mean, try adding System.out.println(set.add(t));.
As documented, the add() method:
Returns true if this set did not already contain the specified element
Also, your t1 is completely irrelevant in your first code snippet, as you never use it.
Your second code snippet outputs 2 because you are adding two different Integer[] objects to the Set.
Try printing out the hash codes of the objects to see how this works:
Integer[] t = new Integer[2];
t[0] = t[1] = 1;
//Before we change the values
System.out.println(t.hashCode());
//Change the values of t in place
t[0] = t[1] = 0;
Integer[] t1 = new Integer[2];
t1[0] = t1[1] = 1;
//After we change the values of t
System.out.println(t.hashCode());
//Hashcode of the second object
System.out.println(t1.hashCode());
Output:
//Hashcode for t is the same before and after modifying data
366712642
366712642
//Hashcode for t1 is different from t; different object
1829164700
How java.util.Set implementations check for duplicate objects depends on the implementation, but per the documentation of Set, the appropriate meaning of "duplicate" is that o1.equals(o2).
Since HashSet in particular is based on a hash table, it will go about looking for a duplicate by computing the hashCode() of the object presented to it, and then going through all the objects, if any, in the corresponding hash bucket.
Arrays do not override hashCode() or equals(), so they implement instance identity, not value identity. Thus, regardless of the values of its elements, a given array always has the same hash code, and always equals() itself and only itself. Your first code adds the same array object to a set two times. Regardless of the values of its elements, it is still the same array. The second code adds two different array objects to a set. Regardless of the values of their elements, they are different objects.
Note, too, that if you have mutable objects that implement value identity, such that their equality and hash codes depend on the values of their members, then modifying such an object while it is a member of a Set very likely breaks the Set. This is documented on a per-implementation basis.
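The difference between instance identity and value identity for arrays can be checked directly. The following is a small sketch with a made-up class name; Arrays.equals and Arrays.hashCode are the standard content-based counterparts of the inherited Object methods.

```java
import java.util.*;

public class ArrayEqualityDemo {
    // Identity-based: true only if both references point to the same array.
    static boolean sameByIdentity(Integer[] a, Integer[] b) {
        return a.equals(b);
    }

    // Content-based: true if both arrays hold equal elements in the same order.
    static boolean sameByContent(Integer[] a, Integer[] b) {
        return Arrays.equals(a, b) && Arrays.hashCode(a) == Arrays.hashCode(b);
    }

    public static void main(String[] args) {
        Integer[] a = {1, 1};
        Integer[] b = {1, 1};
        System.out.println(sameByIdentity(a, b));   // false: two different objects
        System.out.println(sameByIdentity(a, a));   // true: the very same object
        System.out.println(sameByContent(a, b));    // true: equal contents
    }
}
```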

Proving mergesort is stable

I have written a mergesort algorithm. When I run the following test:
public static void main(String[] args){
Integer[] arr = {3,7,9,11,0,-5,2,5,8,8,1};
List<Integer> list = new ArrayList<>();
list.addAll(Arrays.asList(arr)); // asList() returns fixed size list, so can't pass to mergesort()
List<Integer> result = mergesort(list);
System.out.println(result);
}
I get [-5, 0, 1, 2, 3, 5, 7, 8, 8, 9, 11], which is correct. However, I know that mergesort is a stable sort, so how can I write a test that can prove that the two 8's are in the order they originally were?
EDIT: Since I used the Integer class, rather than primitive ints, I figured I could just get the hashCode() since Integer extends the base Object class.
However, when I tried
Integer[] arr = {3,7,9,11,0,-5,2,5,8,8,1};
System.out.println(arr[8].hashCode());
System.out.println(arr[9].hashCode());
I only get:
8
8
The best way I can think of is to rely on the identity of the Integer wrapper objects. If you create two distinct Integer objects with the same value, you can tell them apart by reference:
Integer eight = new Integer(8); // deprecated since Java 9, but guarantees a distinct object
Integer anotherEight = new Integer(8);
System.out.println(eight == anotherEight); // false
System.out.println(eight.equals(anotherEight)); // true
Otherwise, as suggested in the comments, you can add an extra field to your class for comparison.
EDIT: To answer your edit, the Integer.hashCode() documentation states that the hash code is
equal to the primitive int value represented by this Integer object.
I would use a simple key-value structure, such as a pair or tuple class.
Using this (to keep it short):
public class Tuple<X, Y> {
public final X x;
public final Y y;
public Tuple(X x, Y y) {
this.x = x;
this.y = y;
}
}
you could do:
public static void main(String[] args)
{
// Each Tuple holds (value, original index)
List<Tuple<Integer,Integer>> list = new ArrayList<>(Arrays.asList(new Tuple<>(3,0), new Tuple<>(7,1), new Tuple<>(9,2), new Tuple<>(11,3), new Tuple<>(0,4), new Tuple<>(-5,5), new Tuple<>(2,6), new Tuple<>(5,7), new Tuple<>(8,8), new Tuple<>(8,9), new Tuple<>(1,10)));
// mergesort must be adapted to take a List<Tuple<Integer,Integer>> and compare only x
List<Tuple<Integer,Integer>> result = mergesort(list);
// add a toString() to Tuple to get readable output
System.out.println(result);
}
And in the merge sort code, compare only the first item (x) and carry the second (y) along unchanged. If the sort is stable, you should see (8,8) first and then (8,9).
Proof of stability would involve analyzing the merge sort algorithm, not creating test cases where stability tests did not fail. It might be simpler to analyze bottom up merge sort first. First an array of n elements is treated as an array of n runs of length 1, and then those runs are repeatedly merged until a single sorted run is produced. In a k-way merge sort, if the current elements from the k runs are equal, the element from the left most run is moved to the output array, preserving stability.
For top down merge sort, which is normally 2 way, the same rule applies, when equal elements are encountered, the element from the left run is moved, preserving stability.
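As a sketch of how the left-run rule can be checked on a concrete input: the merge sort below is a stand-in written for this example, not the asker's implementation, and the class and helper names are hypothetical. Each element is an int pair {key, originalIndex}; only the key is compared. A passing check is evidence for one input, not a proof of stability.

```java
import java.util.*;

public class StabilityCheck {
    // Top-down merge sort on {key, originalIndex} pairs, comparing keys only.
    static List<int[]> mergeSort(List<int[]> list) {
        if (list.size() <= 1) return new ArrayList<>(list);
        int mid = list.size() / 2;
        List<int[]> left = mergeSort(list.subList(0, mid));
        List<int[]> right = mergeSort(list.subList(mid, list.size()));
        List<int[]> merged = new ArrayList<>(list.size());
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            // "<=" takes the element from the left run on ties, which keeps the sort stable.
            if (left.get(i)[0] <= right.get(j)[0]) merged.add(left.get(i++));
            else merged.add(right.get(j++));
        }
        while (i < left.size()) merged.add(left.get(i++));
        while (j < right.size()) merged.add(right.get(j++));
        return merged;
    }

    // Tags each key with its position in the input.
    static List<int[]> tag(int... keys) {
        List<int[]> tagged = new ArrayList<>();
        for (int k = 0; k < keys.length; k++) tagged.add(new int[] {keys[k], k});
        return tagged;
    }

    // True if neighbours with equal keys appear in their original relative order.
    static boolean isStable(List<int[]> sorted) {
        for (int k = 1; k < sorted.size(); k++) {
            int[] prev = sorted.get(k - 1), cur = sorted.get(k);
            if (prev[0] == cur[0] && prev[1] > cur[1]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<int[]> sorted = mergeSort(tag(3, 7, 9, 11, 0, -5, 2, 5, 8, 8, 1));
        System.out.println(isStable(sorted));   // true
    }
}
```

Changing "<=" to "<" in the merge step makes the right run win ties, and the same check then reports false for this input.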

Java - Set not printed out in order

I just started learning about sets, and it was mentioned that it did not care about order, unlike lists.
However, when I typed this piece of code:
public class test {
public static void main(String[] args) {
Set<Integer> nums = new HashSet<Integer>();
nums.add(0);
nums.add(1);
nums.add(2);
nums.add(3);
for (Integer num : nums)
System.out.println(num);
}
}
Based on the first line, the output should have been random, but instead it gave ordered output:
0
1
2
3
I have tried scrambling the order at which the numbers are being added, like this:
public class test {
public static void main(String[] args) {
Set<Integer> nums = new HashSet<Integer>();
nums.add(1);
nums.add(0);
nums.add(3);
nums.add(2);
for (Integer num : nums)
System.out.println(num);
}
}
Oddly though, the output was still ordered!
Is there anything that somehow sorts the set before I print its elements out?
Or is HashSet not meant for creating unordered sets?
HashSet doesn't provide any order guarantees. That doesn't mean that order can't emerge, for some data sets, as a by-product of how it is implemented. Just that you cannot rely on that, and it may change from implementation to implementation, etc.
HashSet is unordered by design. You are adding only a few small numbers, and an Integer's hash code is its own value, so the elements happen to land in the buckets in ascending order. That's why they print in order. Use the code below to print the hash codes and see for yourself:
for (Integer num : nums){
System.out.println(num + " - hashcode = " +num.hashCode());
}
Add a few larger numbers to see the unordered nature in action.
Example:
nums.add(29000);
nums.add(199201);
This is just a coincidence (or rather, a consequence of how HashSet works internally, but don't worry about that for now). Try adding a few more values, then removing some and adding others, and you will see that it no longer prints in sorted order. HashSet is unordered. Sets in general are unordered unless stated otherwise.
A HashSet is indeed an unsorted set. This means you can't assume anything about the order in which it is iterated (and printed): just as you can't assume it will be ordered, you also can't assume it won't be. The order is entirely up to the internal implementation.
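When a particular order matters, the usual choice is a Set implementation that documents one. The sketch below uses an invented class name and the sample values from the answers above. It contrasts HashSet with LinkedHashSet, which iterates in insertion order, and TreeSet, which iterates in sorted order.

```java
import java.util.*;

public class SetOrderDemo {
    // LinkedHashSet iterates in the order elements were first inserted.
    static List<Integer> inInsertionOrder(List<Integer> values) {
        return new ArrayList<Integer>(new LinkedHashSet<Integer>(values));
    }

    // TreeSet iterates in ascending (natural) order.
    static List<Integer> inSortedOrder(List<Integer> values) {
        return new ArrayList<Integer>(new TreeSet<Integer>(values));
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(1, 0, 29000, 3, 199201, 2);
        // HashSet: whatever order the implementation produces; do not rely on it.
        System.out.println(new HashSet<Integer>(values));
        System.out.println(inInsertionOrder(values));   // [1, 0, 29000, 3, 199201, 2]
        System.out.println(inSortedOrder(values));      // [0, 1, 2, 3, 29000, 199201]
    }
}
```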

Remove duplicates in an array without changing order of elements

I have a list, say a List<Integer> containing 139, 127, 127, 139, 130.
How can I remove the duplicates while keeping the order unchanged, i.e. get 139, 127, 130?
Use an instance of java.util.LinkedHashSet.
Set<Integer> set = new LinkedHashSet<>(list);
With this one-liner:
yourList = new ArrayList<Integer>(new LinkedHashSet<Integer>(yourList))
Without LinkedHashSet overhead (uses HashSet for seen elements instead which is slightly faster):
List<Integer> noDuplicates = list
.stream()
.distinct()
.collect(Collectors.toList());
Note that the order is guaranteed by the Stream.distinct() contract:
For ordered streams, the selection of distinct elements is stable (for
duplicated elements, the element appearing first in the encounter
order is preserved.)
Construct Set from your list - "A collection that contains no duplicate elements":
Set<Integer> yourSet = new HashSet<Integer>(yourList);
And convert it back to whatever you want.
Note: If you want to preserve order, use LinkedHashSet instead.
Use LinkedHashSet to remove duplicate and maintain order.
As far as I can tell, you need to preserve insertion order. To complete what @Maroun Maroun wrote: use a set, but a specialized implementation such as LinkedHashSet<E>, which does exactly what you need.
Iterate through the array (via an iterator, not foreach) and remove duplicates, using a set to detect them.
OR
Iterate through the array and add all elements to a LinkedHashSet, which doesn't allow duplicates and keeps the order of elements.
Then clear the array, iterate through the set, and add each element back to the array.
Although converting the ArrayList to a HashSet effectively removes duplicates, if you need to preserve insertion order I'd suggest this variant instead:
// list is some List of Strings
Set<String> s = new LinkedHashSet<String>(list);
Then, if you need a List reference again, you can use the conversion constructor once more.
There are 2 ways:
create a new list with unique ints only
(the same as Maroun Maroun's answer)
you can do it with 2 nested for loops, roughly O(n^2/2) comparisons, like this (C++-style pseudocode):
List<int> src,dst;
// src is input list
// dst is output list
dst.allocate(src.num); // prepare size to avoid slowdowns by reallocations
dst.num=0; // start from empty list
for (int i=0;i<src.num;i++)
{
int e=1;
for (int j=0;j<dst.num;j++)
if (src[i]==dst[j]) { e=0; break; }
if (e) dst.add(src[i]);
}
You can select duplicate items and delete them in about O(2n) with the flagged delete
this is much faster, but you need a memory table covering the whole int range
if you use numbers <0,10000> then it will take BYTE cnt[10001]
if you use numbers <-10000,10000> then it will take BYTE cnt[20001]
for small ranges like this it is OK, but if you have to cover the full 32-bit range it will take 4GB !!!
with bit packing you can have 2 bits per value, so it will be just 1GB, but that is still too much for my taste
ok, now how to check for duplicates ...
List<WORD> src; // src is input list
BYTE cnt[65536]; // count usage for all used numbers
int i;
for (i=0;i<65536;i++) cnt[i]=0; // clear the count for all numbers
for (i=0;i<src.num;i++) // compute the count for used numbers in the list
if (cnt[src[i]]!=255)
cnt[src[i]]++;
after this, any number i is a duplicate if (cnt[i]>1)
so now we want to delete the duplicate items (all except one)
to do that, change cnt[] like this
for (i=0;i<65536;i++) if (cnt[i]>1) cnt[i]=1; else cnt[i]=0;
ok now comes the delete part:
for (i=0;i<src.num;i++)
if (cnt[src[i]]==1) cnt[src[i]]=2; // do not delete the first time
else if (cnt[src[i]]==2) // but all the others yes
{
src.del(i);
i--; // indexes in src changed after delete so recheck for the same index again
}
you can combine both approaches
deleting an item from a list is slow because the following items have to be shifted
but it can be sped up by adding a delete flag to the items
instead of deleting, just set the flag
and after all items to delete are flagged, simply remove them all at once in O(n)
PS. Sorry for the non-standard list usage, but I think the code is understandable enough; if not, leave a comment and I will respond
PPS. for use with signed values, do not forget to shift the index by half the range !!!
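The lookup-table idea can be expressed in Java with a BitSet, which needs one bit per possible value. This is a simplified sketch rather than a translation of the code above: it builds a new list instead of deleting in place, and the class name, method name, and the min/max parameters are assumptions made for the example. Values are expected to lie in [min, max].

```java
import java.util.*;

public class BoundedRangeDedup {
    // Removes duplicates while keeping first occurrences, for values in [min, max].
    static List<Integer> dedupe(List<Integer> src, int min, int max) {
        BitSet seen = new BitSet(max - min + 1);    // one bit per possible value
        List<Integer> dst = new ArrayList<>(src.size());
        for (int value : src) {
            int slot = value - min;                 // shift so that min maps to bit 0
            if (!seen.get(slot)) {                  // first time this value appears
                seen.set(slot);
                dst.add(value);
            }
        }
        return dst;
    }

    public static void main(String[] args) {
        System.out.println(dedupe(Arrays.asList(139, 127, 127, 139, 130), 0, 255));
        // [139, 127, 130]
    }
}
```

A single pass is enough here because each value is copied to the output only when its bit is still clear. A value below min would produce a negative slot and an IndexOutOfBoundsException, so the range has to be chosen to cover the data.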
Below is a sample that implements a generic function to remove duplicates from an ArrayList while maintaining the order.
import java.util.*;
public class Main {
//Generic function to remove duplicates in list and maintain order
private static <E> List<E> removeDuplicate(List<E> list) {
Set<E> array = new LinkedHashSet<E>();
array.addAll(list);
return new ArrayList<>(array);
}
public static void main(String[] args) {
//Print [2, 3, 5, 4]
System.out.println(removeDuplicate(Arrays.asList(2,2,3,5, 3, 4)));
//Print [AB, BC, CD]
System.out.println(removeDuplicate(Arrays.asList("AB","BC","CD","AB")));
}
}
Method 1: In Python, using a set and a list comprehension (preserves order)
a= [139, 127, 127, 139, 130]
print(a)
seen =set()
aa = [ch for ch in a if ch not in seen and not seen.add(ch)]
print(aa)
Method 2 (note: set() does not preserve the original order):
aa = list(set(a))
print(aa)
In Java: using a LinkedHashSet and making a new ArrayList
import java.util.*;
class t1 {
public static void main(String[] args) {
int[] a = {139, 127, 127, 139, 130};
// LinkedHashSet ignores duplicates and remembers insertion order
Set<Integer> set = new LinkedHashSet<Integer>();
for( int ch : a) {
set.add(ch);
}//for
List<Integer> list1 = new ArrayList<>(set);
System.out.println(list1);
}
}
Here is an answer in C++, but remember that it has O(n^2) time complexity.
vector<int> sol(int arr[],int n){
vector<int> dummy;
for(int i=0;i<n-1;i++){
for(int j=i+1;j<n;j++){
if(arr[i]==arr[j]){
dummy.push_back(j);
}
}
}
vector<int> ans;
for(int i=0;i<n;i++){
bool check=true;
for(int j=0;j<dummy.size();j++){
if(dummy[j]==i){
check=false;
}
}
if(check==false)
continue;
ans.push_back(arr[i]);
}
return ans;
}

How to keep List index fixed in Java

I want to keep the indices of the items in a Java List fixed.
Example code:
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
ArrayList<Double> a = new ArrayList<Double>();
a.add(12.3);
a.add(15.3);
a.add(17.3);
a.remove(1);
System.out.println(a.get(1));
}
}
This will output 17.3. The problem is that 17.3 was on index 2 and now it's on index 1!
Is there any way to preserve the indices of other elements when removing an element? Or is there another class more suitable for this purpose?
Note: I don't want a fixed size Collection.
You might want to use java.util.SortedMap with int keys:
import java.util.*;
public class Test {
public static void main(String[] args)
{
SortedMap<Integer, Double> a = new TreeMap<Integer, Double>();
a.put(0, 12.3);
a.put(1, 15.3);
a.put(2, 17.3);
System.out.println(a.get(1)); // prints 15.3
System.out.println(a.get(2)); // prints 17.3
a.remove(1);
System.out.println(a.get(1)); // prints null
System.out.println(a.get(2)); // prints 17.3
}
}
SortedMap is a variable-size container
It stores values mapped to an ordered set of keys (similar to a List's indices)
No implementation of java.util.List#remove(int) can preserve the indices, since the specification reads:
Removes the element at the specified position in this list (optional operation). Shifts any subsequent elements to the left (subtracts one from their indices). Returns the element that was removed from the list.
Instead of calling a.remove(1) you could call a.set(1, null). This keeps all elements in the same place while still "removing" the value at index one.
If the relationship between index and value should always stay the same, use a java.util.Map.
Instead of removing the element with a call to remove, set the element to null:
i.e.:
import java.util.ArrayList;
public class Test
{
public static void main(String[] args)
{
ArrayList<Double> a = new ArrayList<Double>();
a.add(12.3);
a.add(15.3);
a.add(17.3);
a.set(1, null);
System.out.println(a.get(1));
}
}
You could use a HashMap<Integer, Double>. You could add items using
myMap.put(currentMaximumIndex++, myDoubleValue);
This way, indices would be unique; if you need sparse storage you'd be reasonably okay, and removing a value wouldn't affect the existing ones.
In addition to the above answer, it is suggested that you use a LinkedHashMap<Integer,Double> instead of a regular HashMap.
It will preserve the order in which you insert the elements.
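Putting the map-based suggestions together, one possible shape is a small wrapper that hands out indices from a counter and never reuses them. The class StableIndexStore and its methods are hypothetical names for this sketch, not an existing library type.

```java
import java.util.*;

public class StableIndexStore {
    // LinkedHashMap keeps insertion order when iterating over the entries.
    private final Map<Integer, Double> items = new LinkedHashMap<>();
    private int nextIndex = 0;   // only ever grows, so indices are never reused

    // Stores the value and returns the index assigned to it.
    public int add(double value) {
        items.put(nextIndex, value);
        return nextIndex++;
    }

    // Returns the value at the index, or null if it was removed or never set.
    public Double get(int index) {
        return items.get(index);
    }

    // Removes the value; the indices of all other values stay the same.
    public void remove(int index) {
        items.remove(index);
    }

    public static void main(String[] args) {
        StableIndexStore a = new StableIndexStore();
        a.add(12.3);   // index 0
        a.add(15.3);   // index 1
        a.add(17.3);   // index 2
        a.remove(1);
        System.out.println(a.get(1));   // null
        System.out.println(a.get(2));   // 17.3
    }
}
```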
