Proving mergesort is stable - java

I have written a mergesort algorithm. When I run the following test:
public static void main(String[] args){
Integer[] arr = {3,7,9,11,0,-5,2,5,8,8,1};
List<Integer> list = new ArrayList<>();
list.addAll(Arrays.asList(arr)); // asList() returns fixed size list, so can't pass to mergesort()
List<Integer> result = mergesort(list);
System.out.println(result);
}
I get [-5, 0, 1, 2, 3, 5, 7, 8, 8, 9, 11], which is correct. However, I know that mergesort is a stable sort, so how can I write a test that can prove that the two 8's are in the order they originally were?
EDIT: Since I used the Integer class, rather than primitive ints, I figured I could just get the hashCode() since Integer extends the base Object class.
However, when I tried
Integer[] arr = {3,7,9,11,0,-5,2,5,8,8,1};
System.out.println(arr[8].hashCode());
System.out.println(arr[9].hashCode());
I only get:
8
8

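The best way I can think of is to wrap the numbers in the Integer wrapper class and compare references rather than values. If you do the following:
Integer eight = new Integer(8);
Integer anotherEight = new Integer(8);
System.out.println(eight == anotherEight); // prints false: two distinct objects
System.out.println(eight.equals(anotherEight)); // prints true: same value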
Alternatively, as suggested in the comments, you can add an extra field to your class for comparison.
EDIT: To answer your edit, the Integer.hashCode() documentation states that the hash code is
equal to the primitive int value represented by this Integer object.

I would use a simple key-value structure, such as a tuple.
Using this (to keep it short):
public class Tuple<X, Y> {
public final X x;
public final Y y;
public Tuple(X x, Y y) {
this.x = x;
this.y = y;
}
}
you could do:
public static void main(String[] args)
{
    // Pair each value with its original index so equal values can be told apart.
    int[] arr = {3,7,9,11,0,-5,2,5,8,8,1};
    List<Tuple<Integer,Integer>> list = new ArrayList<>();
    for (int i = 0; i < arr.length; i++) {
        list.add(new Tuple<>(arr[i], i));
    }
    List<Tuple<Integer,Integer>> result = mergesort(list);
    for (Tuple<Integer,Integer> t : result) {
        System.out.println("(" + t.x + "," + t.y + ")");
    }
}
In the merge sort code, compare only the first item and carry the second along unchanged. If the sort is stable, you should see (8,8) first and then (8,9).

Proving stability means analyzing the merge sort algorithm itself; test cases that happen to pass do not prove it. It might be simpler to analyze bottom-up merge sort first. An array of n elements is first treated as an array of n runs of length 1, and those runs are then repeatedly merged until a single sorted run is produced. In a k-way merge sort, if the current elements from the k runs are equal, the element from the leftmost run is moved to the output array, preserving stability.
For top-down merge sort, which is normally 2-way, the same rule applies: when equal elements are encountered, the element from the left run is moved, preserving stability.
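A test cannot prove stability, but it can still catch a merge that breaks ties the wrong way. The following is a minimal sketch, not the asker's implementation: `Item`, `mergeSort`, and `isSortedAndStable` are hypothetical names introduced here, and each element carries a tag recording its original position so that equal keys can be told apart.

```java
import java.util.ArrayList;
import java.util.List;

public class StableMergeSortDemo {
    // Hypothetical element type: sorted by key only; tag records the original position.
    static final class Item {
        final int key;
        final int tag;
        Item(int key, int tag) { this.key = key; this.tag = tag; }
        @Override public String toString() { return "(" + key + "," + tag + ")"; }
    }

    // Illustrative top-down, 2-way merge sort.
    static List<Item> mergeSort(List<Item> input) {
        if (input.size() <= 1) {
            return new ArrayList<>(input);
        }
        int mid = input.size() / 2;
        List<Item> left = mergeSort(input.subList(0, mid));
        List<Item> right = mergeSort(input.subList(mid, input.size()));
        List<Item> merged = new ArrayList<>(input.size());
        int i = 0, j = 0;
        while (i < left.size() && j < right.size()) {
            // "<=" takes the element from the left run on ties; this is what keeps the sort stable.
            if (left.get(i).key <= right.get(j).key) {
                merged.add(left.get(i++));
            } else {
                merged.add(right.get(j++));
            }
        }
        while (i < left.size()) merged.add(left.get(i++));
        while (j < right.size()) merged.add(right.get(j++));
        return merged;
    }

    // True if keys are non-decreasing and equal keys keep their original relative order.
    static boolean isSortedAndStable(List<Item> sorted) {
        for (int k = 1; k < sorted.size(); k++) {
            Item prev = sorted.get(k - 1);
            Item cur = sorted.get(k);
            if (prev.key > cur.key) return false;
            if (prev.key == cur.key && prev.tag > cur.tag) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] keys = {3, 7, 9, 11, 0, -5, 2, 5, 8, 8, 1};
        List<Item> items = new ArrayList<>();
        for (int idx = 0; idx < keys.length; idx++) {
            items.add(new Item(keys[idx], idx));
        }
        List<Item> sorted = mergeSort(items);
        System.out.println(sorted);
        System.out.println(isSortedAndStable(sorted));
    }
}
```

Changing the comparison in the merge loop from `<=` to `<` makes the right run win ties, and the check then reports the two 8s in the wrong order.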

Related

Seeming Discrepancy in Arrays.copyOf

Why this question is not a duplicate of "How Arrays.asList(int[]) can return List<int[]>?":
That question doesn't really answer my particular situation as I am trying to figure out if there is a discrepancy in my use of Arrays.copyOf.
CASE 1: Supposed deep copy of the array
// Create an integer array and populate its values
int[] src = new int[2];
src[0] = 2;
src[1] = 3;
// Create a copy of the array
int [] dst= Arrays.copyOf(src,src.length);
Assert.assertArrayEquals(src, dst);
// Now change one element in the copy
dst[0] = 4;
// The following line throws an AssertionError (which is expected) if the copy is a deep one
Assert.assertArrayEquals(src, dst);
CASE 2:
Here is where things seem to be weird:
What I am trying to do with the below method (lifted verbatim from a book) is to create an immutable list view of a copy of the input array arguments. That way, if the input array changes, the contents of the returned list don't change.
@SafeVarargs
public static <T> List<T> list(T... t) {
return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(Arrays.copyOf(t, t.length))));
}
int[] arr2 = new int[2];
arr2[0] = 2;
arr2[1] = 3;
// Create an unmodifiable list
List<int[]> list2 = list(arr2);
list2.stream().forEach(s -> System.out.println(Arrays.toString(s)));
// Prints [2, 3] as expected
arr2[0] = 3;
list2.stream().forEach(s -> System.out.println(Arrays.toString(s)));
// Prints [3, 3] which doesn't make sense to me... I would have thought it would print [2, 3] and not be affected by my changing the value of the element.
The contradiction that I see is that in one case (Case 1), Arrays.copyOf seems to be a deep copy, whereas in the other case (Case 2), it seems like a shallow one. The changes to the original array seem to have written through to the list, even though I have copied the array in creating my unmodifiable list.
Would someone be able to help me resolve this discrepancy?
First of all, your list method performs an unnecessary step: you don't need the copyOf operation. Here is a simpler version:
@SafeVarargs
public static <T> List<T> list(T... t) {
return Collections.unmodifiableList(
new ArrayList<>(Arrays.asList(t))
);
}
The ArrayList constructor already copies the incoming list, so you're safe there.
Next, when you call your list() method with an int[], that array is treated as a single element of type int[], because the type erasure of your T... is Object..., and int is a primitive. There is no way to make your method do a deep copy inside the list without either changing the parameter types or doing an instanceof check and performing the copy manually inside the method. I'd say the wisest thing to do is probably to move the Arrays.copyOf() call outside the method:
List<int[]> list2 = list(Arrays.copyOf(arr2, arr2.length));
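The difference can be seen side by side in a small sketch. The class name `VarargsCopyDemo` and the variable names `shared` and `isolated` are made up for illustration; the `list` method is the simplified version from this answer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class VarargsCopyDemo {
    @SafeVarargs
    public static <T> List<T> list(T... t) {
        return Collections.unmodifiableList(new ArrayList<>(Arrays.asList(t)));
    }

    public static void main(String[] args) {
        int[] arr2 = {2, 3};

        // T is inferred as int[], so the list holds one element: a reference to arr2 itself.
        List<int[]> shared = list(arr2);
        // Copying before the call gives the list its own array.
        List<int[]> isolated = list(Arrays.copyOf(arr2, arr2.length));

        arr2[0] = 3;

        System.out.println(shared.size());                     // 1
        System.out.println(Arrays.toString(shared.get(0)));    // [3, 3]
        System.out.println(Arrays.toString(isolated.get(0)));  // [2, 3]
    }
}
```

Copying the varargs array inside the method only duplicates the outer one-element array, which is why the write to arr2 still shows through in the first list.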

Organizing a set of sets by size in Java

I'm writing a simple program to recursively find all subsets of some larger set. I've got it working, but I wanted to order all of the sets in order by size.
I posted my working code below.
import java.util.*;
public class AllSubsets {
public static void main(String[] args) {
// Change contents of this array to easily change contents of set.
Integer[] setContents = {3, 6, 8, 9, 10, 22};
// create initial unused set by dumping all of the array into a set.
Set<Integer> unused = new HashSet<Integer>(Arrays.asList(setContents));
// create initial empty set for used set.
Set<Integer> used = new HashSet<Integer>();
// create output set of sets.
Set<Set<Integer>> allSets = new HashSet<Set<Integer>>();
allSets.add(used);
// find all sets recursively
findAllSets(used, unused, allSets);
// print out results
System.out.println(allSets);
}
public static void findAllSets(Set<Integer> used, Set<Integer> unused,
Set<Set<Integer>> allSets) {
if (unused != null) {
Set<Integer> copyOfUnused = new HashSet<Integer>(unused);
for (Integer val : copyOfUnused) {
unused.remove(val);
used.add(val);
allSets.add(new HashSet<Integer>(used));
findAllSets(used, unused, allSets);
used.remove(val);
unused.add(val);
}
}
}
}
I was wondering what the best way would be to order these sets by size. I tried to create a TreeSet that holds multiple HashSet objects, with its comparator method overridden. This compiled but didn't store the values correctly. The code I wrote for this is very similar to the code above, so I will only show the main difference below:
Set<Set<Integer>> allSets =
new TreeSet<Set<Integer>>(new Comparator<Set<Integer>>() {
public int compare(Set<Integer> a, Set<Integer> b) {
return a.size() - b.size();
}
});
This version of the code compiles, but the objects are not stored correctly. The proper sets are being computed and added to "allSets" in the recursive method (tested using println), but it only ever holds one set of each size. I have a feeling it's mostly because I overrode the Comparator for Set but am using HashSets. Is there a better way to organize my sets, or is this just a small bug in my code?
Thanks!!
TreeSet<Set<Integer>> will only store one Set element with a given size, because it considers two different sets with the same size to be "equal": it takes compare(a, b) == 0 to mean that a and b are the same element.
If you want to get all of the sets and then print them in order of size, gather all of the sets in a regular (Hash)Set, and then sort the entries:
List<Set<Integer>> listOfSets = new ArrayList<>(allSets);
Collections.sort(listOfSets, <your comparator above>);
System.out.println(listOfSets);
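As a self-contained sketch of that approach (the class name `SortSetsBySize` and the helper `sortedBySize` are hypothetical, and `Comparator.comparingInt` stands in for the anonymous comparator from the question):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SortSetsBySize {
    // Copies the sets into a list and orders them by size; sets of equal size are all kept.
    static List<Set<Integer>> sortedBySize(Set<Set<Integer>> allSets) {
        List<Set<Integer>> listOfSets = new ArrayList<>(allSets);
        listOfSets.sort(Comparator.comparingInt(Set::size));
        return listOfSets;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> allSets = new HashSet<>();
        allSets.add(new HashSet<>(Arrays.asList(3, 6)));
        allSets.add(new HashSet<>(Arrays.asList(8)));
        allSets.add(new HashSet<>());
        allSets.add(new HashSet<>(Arrays.asList(9, 10)));

        // Smallest sets come first; the relative order of same-size sets is not specified.
        System.out.println(sortedBySize(allSets));
    }
}
```

Because the list is sorted rather than de-duplicated by the comparator, two subsets of the same size no longer displace each other.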

Define an array of arbitrary dimension [duplicate]

I am trying to create an array of arrays of arrays etc..., except I don't know how many nested levels deep it needs to be until runtime.
Depending on the input, I might need either int[], int[][], int[][][][][][], or anything else. (For context, I am trying to construct an N-dimensional grid for a cellular automaton, where N is passed as a parameter.)
I don't have any code for you because I have no idea how to go about this; I suspect it is not possible at all using just arrays. Any help, or alternative solutions, would be appreciated.
You could do this with an Object[], limiting its members to either Object[] or int[].
For example, here's an array that goes three levels deep in one part, and two levels deep in another:
Object[] myarray = new Object[] {
new Object[] { new int[] { 1, 2 },
new int[] { 3, 4 }},
new int[] { 5, 6 }
};
After you've created it, you may want to access members. In your case, you know the depth N up front, so you know at what depth to expect an Object[] and at what depth to expect an int[].
However, if you didn't know the depth, you could use instanceof to determine whether a member is another Object[] level or a leaf int[].
if ( myarray[0] instanceof Object[] ) {
System.out.println("This should print true.");
}
EDIT:
Here's a sketch [untested so far, sorry] of a method that accesses a member of an array of known depth, given an array of indices. The m_root member can be an Object[] or an int[]. (You could relax this further to support scalars.)
public class Grid {
private int m_depth;
private Object m_root;
...
public int get( int ... indices ) {
assert( indices.length == m_depth );
Object level = m_root;
for ( int i = 0; i + 1 < m_depth; ++i ) {
level = ((Object[]) level)[ indices[i] ];
}
int[] row = (int[]) level;
return row[ indices[m_depth - 1] ];
}
}
This should be achievable using Object[], since arrays are objects:
int[] arr = {1,2,3};
int[] arr2 = {1,2,3};
int[] arr3 = {1,2,3};
int[] arr4 = {1,2,3};
Object[] arr5 = {arr, arr2}; // basically an int[][]
Object[] arr6 = {arr3, arr4}; // basically an int[][]
Object[] arr7 = {arr5, arr6}; // basically an int[][][]
// etc.
Note that one array doesn't have to contain arrays of the same dimensions:
Object[] arr7 = {arr5, arr};
To prevent this (and to allow for easier access to the data), I suggest writing a class which has an Object member (which will be your int[] or Object[]) and a depth variable and some nice functions to give you access to what you want.
ArrayLists will also work:
ArrayList array = new ArrayList();
array.add(new ArrayList());
array.add(new ArrayList());
((ArrayList)array.get(0)).add(new ArrayList());
// etc.
As your N increases going with nested arrays becomes less and less advantageous, especially when you have a grid structure. Memory usage goes up exponentially in N with this approach and the code becomes complex.
If your grid is sparsely populated (a lot of cells with the same value) you can instead have a collection of Cell objects where each of these holds a coordinate vector and the integer value of the cell. Every cell that is not in the collection is assumed to have a default value, which is your most common value.
For faster access you can use for example a k-d tree (https://en.wikipedia.org/wiki/K-d_tree) but that depends a bit on your actual use-case.
@Andy Thomas explains how to do this using Object[] for the higher levels of the multidimensional array. Unfortunately, this means that the types are not correct to allow indexing, or indeed to allow element access without typecasts.
You can't do this:
Object[] array = ...
int i = array[1][2][3][4];
To get types that allow you to do the above, you need to create an object whose real type is (for example) int[][][][].
But the flip side is that it is not really practical to use that style of indexing for N-dimensional arrays where N is a variable. You can't write Java source code to do that unless you place a bound on N (e.g. up to 5) and treat the different cases individually. That becomes unmanageable very quickly.
You can use Java reflection, as arrays are objects.
public static void main(String[] args) throws InstantiationException,
IllegalAccessException, ClassNotFoundException {
Class<?> intClass = int.class;
Class<?> oneDimensionalArrayClass = Class.forName("[I");
Object oneDimensionalIntArray1 = Array.newInstance(intClass, 1);
Array.set(oneDimensionalIntArray1, 0, 1);
Object oneDimensionalIntArray2 = Array.newInstance(intClass, 1);
Array.set(oneDimensionalIntArray2, 0, 2);
Object oneDimensionalIntArray3 = Array.newInstance(intClass, 1);
Array.set(oneDimensionalIntArray3, 0, 3);
Object twoDimensionalIntArray = Array.newInstance(oneDimensionalArrayClass, 3);
Array.set(twoDimensionalIntArray, 0, oneDimensionalIntArray1);
Array.set(twoDimensionalIntArray, 1, oneDimensionalIntArray2);
Array.set(twoDimensionalIntArray, 2, oneDimensionalIntArray3);
System.out.println(Array.get(Array.get(twoDimensionalIntArray, 1), 0));
}
The Array class and its static methods give access to the elements, while the number of leading "[" characters in the class name specifies the dimension of the array.
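java.lang.reflect.Array also has a newInstance overload that takes the size of every dimension, which avoids building class names by hand when N is only known at runtime. A minimal sketch, where the class `ReflectiveGrid` and its `create`, `get`, and `set` helpers are hypothetical names chosen for illustration:

```java
import java.lang.reflect.Array;

public class ReflectiveGrid {
    // Creates an int array with dims.length dimensions, e.g. (3, 4, 5) yields an int[3][4][5].
    static Object create(int... dims) {
        return Array.newInstance(int.class, dims);
    }

    // Walks down all but the last index and returns the innermost int[] (as an Object).
    private static Object innermost(Object grid, int... indices) {
        Object level = grid;
        for (int i = 0; i < indices.length - 1; i++) {
            level = Array.get(level, indices[i]);
        }
        return level;
    }

    static int get(Object grid, int... indices) {
        return Array.getInt(innermost(grid, indices), indices[indices.length - 1]);
    }

    static void set(Object grid, int value, int... indices) {
        Array.setInt(innermost(grid, indices), indices[indices.length - 1], value);
    }

    public static void main(String[] args) {
        Object grid = create(3, 4, 5);
        set(grid, 42, 1, 2, 3);
        System.out.println(get(grid, 1, 2, 3));          // 42
        System.out.println(grid instanceof int[][][]);   // true
    }
}
```

The result is a real int[][][] (or deeper), so code that does know N at compile time can cast it and index it normally.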
The whole construct of multi-dimensional arrays is just the compiler doing some work for you on a big block of memory (although, as some have commented, in Java this is actually multiple blocks of memory). One way to deal with the problem you face is to use nested ArrayLists at runtime. Another (more performant) way is to allocate a single-dimensional array of the size you need and do the indexing yourself. You could then hide the indexing code in a method that is passed all the details, much like an array dereference.
private int[] doAllocate(int[] dimensions)
{
int totalElements = dimensions[0];
for (int i=1; i< dimensions.length; i++)
{
totalElements *= dimensions[i];
}
int[] bigOne = new int[totalElements];
return bigOne;
}
private int deReference(int[] dimensions, int[] indices, int[] bigOne)
{
int index = 0;
// Row-major order: the last index varies fastest, so the dimensions may differ in size.
for (int i=0; i<dimensions.length; i++)
{
index = index * dimensions[i] + indices[i];
}
return bigOne[index];
}
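The flat-array idea can be packaged as a small class. This is a sketch under stated assumptions: `FlatGrid` and its methods are hypothetical names, and the layout is row-major (the last index varies fastest), which is one common choice rather than the only one.

```java
public class FlatGrid {
    private final int[] dimensions;
    private final int[] cells;

    public FlatGrid(int... dimensions) {
        int total = 1;
        for (int d : dimensions) {
            total *= d;
        }
        this.dimensions = dimensions.clone();
        this.cells = new int[total];
    }

    // Row-major offset: ((i0 * d1 + i1) * d2 + i2) and so on for further dimensions.
    private int offset(int... indices) {
        if (indices.length != dimensions.length) {
            throw new IllegalArgumentException("expected " + dimensions.length + " indices");
        }
        int index = 0;
        for (int i = 0; i < dimensions.length; i++) {
            if (indices[i] < 0 || indices[i] >= dimensions[i]) {
                throw new IndexOutOfBoundsException("index " + i + " out of range");
            }
            index = index * dimensions[i] + indices[i];
        }
        return index;
    }

    public int get(int... indices) {
        return cells[offset(indices)];
    }

    public void set(int value, int... indices) {
        cells[offset(indices)] = value;
    }

    public int size() {
        return cells.length;
    }

    public static void main(String[] args) {
        FlatGrid grid = new FlatGrid(2, 3, 4);   // N = 3 here, but any N works
        grid.set(7, 1, 2, 3);
        System.out.println(grid.get(1, 2, 3));   // 7
        System.out.println(grid.size());         // 24
    }
}
```

Because the number of dimensions is just the length of an int[], the same class serves any N passed in at runtime.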
Arrays like the ones you wrote above are checked and created by the compiler. If you want a dynamic data structure at runtime, you could create your own. Search for the Composite pattern. A small snippet should show you how it works:
interface IGrid {
void insert(IGrid subgrid);
void insert(int[] values);
}
class Grid implements IGrid {
private IGrid subgrid;
// Interface methods are implicitly public, so the implementations must be public too.
public void insert(IGrid subgrid) {this.subgrid = subgrid;}
public void insert(int[] values) {/* Do nothing */}
}
class SubGrid implements IGrid {
private int[] values;
public void insert(IGrid subgrid) {/* Do nothing */}
public void insert(int[] values) {this.values = values;}
}
You could simply create a SubGrid for an int[], or a Grid containing a SubGrid for an int[][]. It's only a rudimentary solution; you would have to write some code for working with your automaton's levels and values. This is how I would do it. I hope it helps, and I look forward to seeing more solutions.

How do I filter this list with Java 8 streams and functional interfaces?

If I have a list of arrays like this (pseudo-Java code):
Note that the list valsSorted will always be sorted by x[0] in ascending order and then by x[1] in descending order.
List valsSorted = {[1 5][1 4][1 3][2 1][3 2][3 1][4 2][4 1][5 1][6 2][6 1]};
How do I filter this list with Java 8 streams and lambdas so that I get:
result = {[1 5][2 1][3 2][4 2][5 1][6 2]}
The first item of the array (x[0]) is the ID and the second is a version number. So the rule is: return each distinct ID with its highest version.
If I used a for loop, the following code would be fine:
ArrayList<int[]> result = new ArrayList<>();
int keep = -1;
for (int[] x : valsSorted) {
int id = x[0];
int version = x[1];
if(keep == id) continue;
keep = id;
result.add(x);
}
Your use of the word "distinct" suggests using the distinct() stream operation. Unfortunately that operation is hardwired to use the equals() method of the stream elements, which isn't useful for arrays. One approach for dealing with this would be to wrap the arrays in a wrapper object that has the semantics of equality that you're looking for:
class Wrapper {
final int[] array;
Wrapper(int[] array) { this.array = array; }
int[] getArray() { return array; }
@Override
public boolean equals(Object other) {
if (! (other instanceof Wrapper))
return false;
else
return this.array[0] == ((Wrapper)other).array[0];
}
@Override
public int hashCode() { ... }
}
Then wrap up your object before distinct() and unwrap it after:
List<int[]> valsDistinct =
valsSorted.stream()
.map(Wrapper::new)
.distinct()
.map(Wrapper::getArray)
.collect(toList());
This makes one pass over the data but it generates a garbage object per value. This also relies on the stream elements being processed in-order since you want the first one.
Another approach would be to use some kind of stateful collector, but that will end up storing the entire result list before any subsequent processing begins, which you said you wanted to avoid.
It might be worth considering making the data elements be actual classes instead of two-element arrays. This way you can provide a reasonable notion of equality, and you can also make the values comparable so that you can sort them easily.
(Credit: technique stolen from this answer.)
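For comparison, here is a minimal sketch of the collector-based alternative mentioned above, assuming it is acceptable to build the whole result before any further processing. `FirstPerId` and `firstPerId` are hypothetical names; the approach keeps the first array seen for each ID and relies on the input already being sorted with the highest version first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FirstPerId {
    // Keeps the first array seen for each ID (x[0]); LinkedHashMap preserves encounter order.
    static List<int[]> firstPerId(List<int[]> valsSorted) {
        Map<Integer, int[]> firstById = valsSorted.stream()
                .collect(Collectors.toMap(
                        x -> x[0],                 // key: the ID
                        x -> x,                    // value: the whole array
                        (first, later) -> first,   // on a duplicate ID, keep the earlier entry
                        LinkedHashMap::new));
        return new ArrayList<>(firstById.values());
    }

    public static void main(String[] args) {
        List<int[]> valsSorted = Arrays.asList(
                new int[]{1, 5}, new int[]{1, 4}, new int[]{1, 3}, new int[]{2, 1},
                new int[]{3, 2}, new int[]{3, 1}, new int[]{4, 2}, new int[]{4, 1},
                new int[]{5, 1}, new int[]{6, 2}, new int[]{6, 1});
        for (int[] x : firstPerId(valsSorted)) {
            System.out.println(Arrays.toString(x));
        }
    }
}
```

This avoids the wrapper class, at the cost of holding the full map in memory before anything downstream can run.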
class Test{
List<Point> valsSorted = Arrays.asList(new Point(1,5),
new Point(1,4),
new Point(1,3),
new Point(2,1),
new Point(3,2),
new Point(3,1),
new Point(4,2),
new Point(4,1),
new Point(5,1),
new Point(6,2),
new Point(6,1));
public Test(){
List<Point> c = valsSorted.stream()
.collect(Collectors.groupingBy(Point::getX))
.values()
.stream()
.map(j -> j.get(0))
.collect(Collectors.toList());
for(int i=0; i < c.size(); i++){
System.out.println(c.get(i));
}
}
public static void main(String []args){
Test t = new Test();
}
}
I decided to use the Point class, representing the ID field as x and the version number as y. Create a stream and group the points by ID. The values() method then returns a Collection<List<Point>>. You can stream that collection and take the first value from each list, which, given your ordering (descending version number), is the highest version. From there, all you have to do is collect the results into a list, an array, or whatever you need.
The only problem here is that they are printed out of order. That should be an easy fix though.
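One possible fix for the ordering, sketched under the assumption that encounter order is what is wanted: pass a LinkedHashMap factory to groupingBy so the groups come back in the order their IDs first appear. The nested `Point` class below is a hypothetical stand-in with int fields, and `OrderedGrouping` and `highestVersionPerId` are illustrative names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class OrderedGrouping {
    // Hypothetical stand-in for the Point class used above: x is the ID, y the version.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        int getX() { return x; }
        int getY() { return y; }
        @Override public String toString() { return "[" + x + " " + y + "]"; }
    }

    static List<Point> highestVersionPerId(List<Point> valsSorted) {
        // LinkedHashMap keeps the groups in the order in which each ID is first encountered.
        Map<Integer, List<Point>> byId = valsSorted.stream()
                .collect(Collectors.groupingBy(Point::getX, LinkedHashMap::new, Collectors.toList()));
        return byId.values().stream()
                .map(group -> group.get(0))   // first entry per ID, i.e. the highest version
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Point> valsSorted = Arrays.asList(
                new Point(1, 5), new Point(1, 4), new Point(2, 1),
                new Point(3, 2), new Point(3, 1));
        System.out.println(highestVersionPerId(valsSorted));
    }
}
```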

Java containsAll does not return true when given lists

I want to check whether an array is a subset of another array.
The program prints false, but I expect true. Why isn't containsAll returning true?
int[] subset;
subset = new int[3];
subset[0]=10;
subset[1]=20;
subset[2]=30;
int[] superset;
superset = new int[5];
superset[0]=10;
superset[1]=20;
superset[2]=30;
superset[3]=40;
superset[4]=60;
HashSet sublist = new HashSet(Arrays.asList(subset));
HashSet suplist = new HashSet(Arrays.asList(superset));
boolean isSubset = sublist.containsAll(Arrays.asList(suplist));
System.out.println(isSubset);
There is a subtle bug in:
new HashSet(Arrays.asList(subset));
The above line does not create a set of integers as you might have expected. Instead, it creates a HashSet<int[]> with a single element, the subset array.
This has to do with the fact that generics don't support primitive types.
Your compiler would have told you about the mistake if you declared sublist and suplist as HashSet<Integer>.
On top of that, you got suplist and sublist the wrong way round in the containsAll() call.
The following works as expected:
Integer[] subset = new Integer[]{10, 20, 30};
Integer[] superset = new Integer[]{10, 20, 30, 40, 60};
HashSet<Integer> sublist = new HashSet<Integer>(Arrays.asList(subset));
HashSet<Integer> suplist = new HashSet<Integer>(Arrays.asList(superset));
boolean isSubset = suplist.containsAll(sublist);
System.out.println(isSubset);
One key change is that this is using Integer[] in place of int[].
Leaving aside your initialisation issues (as identified by NPE), you've mixed up your two sets. Since sublist is already a set, you actually want:
boolean isSubset = suplist.containsAll(sublist);
i.e. does {10,20,30,40,60} contain {10,20,30} ? (which, of course, it does)
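If the data has to stay in int[] arrays, the values can be boxed on the way into the sets. A minimal sketch, where `SubsetCheck`, `toSet`, and `isSubset` are hypothetical helper names:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

public class SubsetCheck {
    // Boxes each int so the result is a Set<Integer> rather than a set holding one int[].
    static Set<Integer> toSet(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toSet());
    }

    // True if every value in subset also occurs in superset.
    static boolean isSubset(int[] subset, int[] superset) {
        return toSet(superset).containsAll(toSet(subset));
    }

    public static void main(String[] args) {
        int[] subset = {10, 20, 30};
        int[] superset = {10, 20, 30, 40, 60};
        System.out.println(isSubset(subset, superset));  // true
        System.out.println(isSubset(superset, subset));  // false
    }
}
```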
