Can you have collections without storing the values in Java? - java

I have a question about Java collections such as Set or List, or more generally about objects that you can use in a for-each loop. Is there any requirement that their elements actually be stored somewhere in a data structure, or can they be described by some rule and calculated on the fly when you need them? It feels like this should be possible, but I don't see any of the standard Java collection classes doing anything like this. Am I breaking any sort of contract here?
The thing I'm thinking about using these for is mainly mathematics. Say, for example, I want to have a set representing all prime numbers under 1 000 000. It might not be a good idea to keep these in memory; instead, a method could check whether a particular number is in the collection or not.
I'm also not at all an expert at Java streams, but I feel like these should be usable in Java 8 streams since the objects have very minimal state (the objects in the collection don't even exist until you try to iterate over them or check whether a particular object exists in the collection).
Is it possible to have Collections or Iterators with virtually infinitely many elements, for example "all numbers of the form 6*k+1", "all primes above 10" or "all vectors spanned by this basis"? One other thing I'm thinking about is combining two sets, for example taking the intersection of all primes below 1 000 000 and all integers of the form 2^n-1 to list the Mersenne primes below 1 000 000. I feel like it would be easier to reason about certain mathematical objects if it was done this way and the elements weren't created explicitly until they are actually needed. Maybe I'm wrong.
Here are two mock-up classes I wrote to try to illustrate what I want to do. They don't act exactly as I would expect (see output), which makes me think I am breaking some kind of contract of the Iterable interface or implementing it wrong. Feel free to point out what I'm doing wrong here if you see it, or whether this kind of code is even allowed under the collections framework.
import java.util.AbstractSet;
import java.util.Iterator;
public class PrimesBelow extends AbstractSet<Integer>{
int max;
int size;
public PrimesBelow(int max) {
this.max = max;
}
@Override
public Iterator<Integer> iterator() {
return new SetIterator<Integer>(this);
}
@Override
public int size() {
if(this.size == -1){
System.out.println("Calculating size");
size = calculateSize();
}else{
System.out.println("Accessing calculated size");
}
return size;
}
private int calculateSize() {
int c = 0;
for(Integer p: this)
c++;
return c;
}
public static void main(String[] args){
PrimesBelow primesBelow10 = new PrimesBelow(10);
for(int i: primesBelow10)
System.out.println(i);
System.out.println(primesBelow10);
}
}
.
import java.util.Iterator;
import java.util.NoSuchElementException;
public class SetIterator<T> implements Iterator<Integer> {
int max;
int current;
public SetIterator(PrimesBelow pb) {
this.max= pb.max;
current = 1;
}
@Override
public boolean hasNext() {
if(current < max) return true;
else return false;
}
@Override
public Integer next() {
while(hasNext()){
current++;
if(isPrime(current)){
System.out.println("returning "+current);
return current;
}
}
throw new NoSuchElementException();
}
private boolean isPrime(int a) {
if(a<2) return false;
for(int i = 2; i < a; i++) if((a%i)==0) return false;
return true;
}
}
Main function gives the output
returning 2
2
returning 3
3
returning 5
5
returning 7
7
Exception in thread "main" java.util.NoSuchElementException
at SetIterator.next(SetIterator.java:27)
at SetIterator.next(SetIterator.java:1)
at PrimesBelow.main(PrimesBelow.java:38)
edit: spotted an error in the next() method. Corrected it and changed the output to the new one.

Well, as you see with your (now fixed) example, you can easily do it with Iterables/Iterators. Instead of having a backing collection, the example would've been nicer with just an Iterable that takes the max number you wish to calculate primes to. You just need to make sure that you handle the hasNext() method properly so you don't have to throw an exception unnecessarily from next().
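To make the point about hasNext() concrete, here is a minimal sketch of such an Iterable. The class name LazyPrimes and the look-ahead field candidate are hypothetical names chosen for illustration, not something from the question. The idea is that the iterator always computes the next prime in advance, so hasNext() can answer truthfully and next() only throws when the caller ignores hasNext().

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class name; it stores only the upper bound, never the primes themselves.
class LazyPrimes implements Iterable<Integer> {
    private final int max; // exclusive upper bound

    LazyPrimes(int max) {
        this.max = max;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            // Always holds the next prime to return, or a value >= max when exhausted.
            private int candidate = advance(2);

            // Returns the first prime >= from, or a value >= max if there is none below max.
            private int advance(int from) {
                int c = from;
                while (c < max && !isPrime(c)) {
                    c++;
                }
                return c;
            }

            @Override
            public boolean hasNext() {
                return candidate < max;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                int result = candidate;
                candidate = advance(result + 1);
                return result;
            }
        };
    }

    private static boolean isPrime(int a) {
        if (a < 2) {
            return false;
        }
        for (int i = 2; (long) i * i <= a; i++) {
            if (a % i == 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (int p : new LazyPrimes(10)) {
            System.out.println(p); // prints 2, 3, 5, 7 on separate lines
        }
    }
}
```

Because the look-ahead happens in advance(), a for-each loop over this object ends cleanly instead of running into a NoSuchElementException.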
Java 8 streams make this kind of thing easier nowadays, but there's no reason you can't have a "virtual collection" that's just an Iterable. If you start implementing Collection it becomes harder, but even then it wouldn't be completely impossible, depending on the use cases: e.g. you could implement a contains() that checks for primes, but it would have to compute the answer each time and would be slow for large numbers.
A (somewhat convoluted) example of a semi-infinite set of odd numbers that is immutable and stores no values.
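public class OddSet implements Set<Integer> {
@Override
public boolean contains(Object o) {
// Set.contains takes an Object; only positive odd Integers are members
return o instanceof Integer && ((Integer) o) % 2 == 1;
}
@Override
public int size() {
// Conceptually unbounded, so report the largest value size() can return
return Integer.MAX_VALUE;
}
@Override
public boolean add(Integer i) {
throw new UnsupportedOperationException();
}
@Override
public boolean equals(Object o) {
return o instanceof OddSet;
}
// etc. etc.
}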

As DwB stated, the standard classes in Java's Collections API don't do this: they keep every element in memory. However, there is an alternative: this is exactly the kind of problem Java's Stream API is suited for!
Streams allow you to iterate across an infinite number of objects that are not stored in memory unless you explicitly collect them into a Collection.
From the documentation of IntStream#iterate:
Returns an infinite sequential ordered IntStream produced by iterative application of a function f to an initial element seed, producing a Stream consisting of seed, f(seed), f(f(seed)), etc.
The first element (position 0) in the IntStream will be the provided seed. For n > 0, the element at position n, will be the result of applying the function f to the element at position n - 1.
Here are some examples that you proposed in your question:
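import java.util.stream.IntStream;
public class Test {
public static void main(String[] args) {
// Streams are lazy: nothing is computed until a terminal operation is applied.
// All numbers of the form 6*k+1: 1, 7, 13, ...
IntStream.iterate(1, n -> n + 6);
// All primes above 10
IntStream.iterate(10, i -> i + 1).filter(Test::isPrime);
// All numbers of the form 2^n-1 below 1 000 000 (takeWhile requires Java 9+)
IntStream.iterate(1, n -> 2 * n + 1).takeWhile(i -> i < 1_000_000);
}
private static boolean isPrime(int a) {
if (a < 2) {
return false;
}
for (int i = 2; i < a; i++) {
if ((a % i) == 0) {
return false;
}
}
return true;
}
}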
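As a sketch of how such a stream can actually be consumed, the following combines two of the ideas from the question: numbers of the form 2^n-1 that are also prime, below a limit. The class and method names (MersenneDemo, mersennePrimesBelow) are made up for this illustration, and takeWhile assumes Java 9 or later.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class; nothing is stored until collect() is called.
class MersenneDemo {
    static boolean isPrime(int a) {
        if (a < 2) {
            return false;
        }
        for (int i = 2; (long) i * i <= a; i++) {
            if (a % i == 0) {
                return false;
            }
        }
        return true;
    }

    static List<Integer> mersennePrimesBelow(int limit) {
        return IntStream.iterate(1, n -> 2 * n + 1)   // 1, 3, 7, 15, ... i.e. 2^k - 1
                .takeWhile(n -> n < limit)            // Java 9+: cuts off the infinite stream
                .filter(MersenneDemo::isPrime)        // keep only the primes
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // prints [3, 7, 31, 127, 8191, 131071, 524287]
        System.out.println(mersennePrimesBelow(1_000_000));
    }
}
```

Note that filter alone would not end an infinite stream; it is takeWhile (or limit) that bounds it before the terminal operation runs.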

Related

What is the fastest way to fill an ArrayList with null in java?

I want a List of n Sets of Integers and initially this list should be filled with null.
A lot of the Sets will be initialised later, and some will remain null.
I have tried different methods to implement this, some of them are included here:
List<HashSet<Integer>> List_of_Sets = Arrays.asList(new HashSet[n]);
ArrayList<HashSet<Integer>> List_of_Sets = new ArrayList<>(n);
while(n-- > 0) List_of_Sets.add(null);
Is there a faster way to do this?
To clarify with an example for arrays: Arrays.fill() used to be slower than:
/*
* initialize a smaller piece of the array and use the System.arraycopy
* call to fill in the rest of the array in an expanding binary fashion
*/
public static void bytefill(byte[] array, byte value) {
int len = array.length;
if (len > 0){
array[0] = value;
}
//Value of i will be [1, 2, 4, 8, 16, 32, ..., len]
for (int i = 1; i < len; i += i) {
System.arraycopy(array, 0, array, i, ((len - i) < i) ? (len - i) : i);
}
}
^above code is from Ross Drew's answer to Fastest way to set all values of an array?
Is there a faster way to do this?
As far as I am aware, no. Certainly, there is no easy way that is faster.
Based on how it works, I think (but I have not tested) that the Arrays.asList(new HashSet[n]) should be the fastest solution.
It would be possible to implement a custom List implementation that is like an ArrayList but is pre-initialized to N null values. But under the hood the initialization will be pretty much identical with what happens in the List implementation that asList returns. So I doubt that any performance improvements would be significant ... or worth the effort.
If you want to be sure of this, you could write a benchmark of the various options. However, I don't think this is the right approach in this case.
Instead I would recommend benchmarking and profiling your entire application to determine if operations on this list are a real performance hotspot.
If it is not a hotspot, my recommendation would be to just use the Arrays.asList approach and spend your time on something more important.
If it is a hotspot, you should consider replacing the List with an array. From your earlier description it seemed you are going to use the List like an array; i.e. using positional get and set operations, and no operations that change the list size. If that is the case, then using a real array should be more efficient. It saves memory, and avoids a level of indirection and (possibly) some bounds checking.
One reason not to do this would be if you needed to pass the array to some other code that requires a List.
If resizing is not important to you then implementing your own list might be fast. It might also be buggy. It would at least be interesting to benchmark compared to Java's lists. One strange effect that you might see is that standard lists might be optimised by the JIT sooner, as they could be used internally by Java's standard library.
Here is my attempt, although I suggest you don't use it. Use a standard list implementation instead.
import java.util.*;
public class FastListOfNullsDemo {
public static void main(String[] args) {
Set<Integer>[] arr = new Set[100_000]; // all set to null by default.
List<Set<Integer>> myList = new ArrayBackedList<>(arr);
myList.set(3, new TreeSet<Integer>());
myList.get(3).add(5);
myList.get(3).add(4);
myList.get(3).add(3);
myList.get(3).add(2);
myList.get(3).add(1);
// Let's just print some because 100,000 is a lot!
for (int i = 0; i < 10; i++) {
System.out.println(myList.get(i));
}
}
}
class ArrayBackedList<T> extends AbstractList<T> {
private final T[] arr;
ArrayBackedList(T[] arr) {
this.arr = arr;
}
@Override
public T get(int index) {
return arr[index];
}
@Override
public int size() {
return arr.length;
}
@Override
public T set(int index, T value) {
T result = arr[index];
arr[index] = value;
return result;
}
}
Another possibility would be implementing an always-null, fixed-size list. Use that to initialise the ArrayList. I won't promise that it is fast but you could try it out.
import java.util.*;
public class FastListOfNullsDemo {
public static void main(String[] args) {
List<Set<Integer>> allNull = new NullList<>(100_000);
List<Set<Integer>> myList = new ArrayList<>(allNull);
myList.set(3, new TreeSet<Integer>());
myList.get(3).add(5);
myList.get(3).add(4);
myList.get(3).add(3);
myList.get(3).add(2);
myList.get(3).add(1);
System.out.println(myList.size());
// Let's just print some because 100,000 is a lot!
for (int i = 0; i < 10; i++) {
System.out.println(myList.get(i));
}
}
}
class NullList<T> extends AbstractList<T> {
private int count;
NullList(int count) {
this.count = count;
}
@Override
public T get(int index) {
return null;
}
@Override
public int size() {
return count;
}
}
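For comparison, the standard library already offers an immutable "n copies of one value" list through Collections.nCopies, which plays much the same role as the hand-written always-null list. The sketch below copies it into an ArrayList; the class and method names are hypothetical, and no claim is made here about its speed relative to the other options.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class using only java.util.
class NCopiesDemo {
    static List<Set<Integer>> nullFilledList(int n) {
        // nCopies returns an immutable list that holds a single reference;
        // the ArrayList constructor copies it into a mutable, pre-sized list.
        return new ArrayList<>(Collections.<Set<Integer>>nCopies(n, null));
    }

    public static void main(String[] args) {
        List<Set<Integer>> myList = nullFilledList(100_000);
        myList.set(3, new TreeSet<Integer>());
        myList.get(3).add(5);
        System.out.println(myList.size()); // prints 100000
        System.out.println(myList.get(0)); // prints null
        System.out.println(myList.get(3)); // prints [5]
    }
}
```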

Best big set data structure in Java

I need to find gaps in a big set of Integers that is populated by a read loop over files, and I want to know whether something already exists for this purpose, to avoid using a plain Set object with the risk of exhausting the heap.
To better explain my question, I have to tell you how my Java ticketing software works.
Every ticket has a global progressive number stored in a daily log file along with other information. I have to write a check procedure to verify whether there are gaps in the numbers inside the daily log files.
The first idea was to create a read loop over all log files, read each line, get the ticket number, store it in a TreeSet of Integers, and then find the gaps in this Set.
The problem is that the ticket numbers can be very high and could exhaust the heap space, and I want a solution that also works if I have to switch to Long objects.
The Set solution wastes a lot of memory: if I find that there are no gaps in the first 100 numbers, it makes no sense to keep them in the Set.
How can I solve this? Is there an existing data structure for this purpose?
I'm assuming that (A) the gaps you are looking for are the exception and not the rule and (B) the log files you are processing are mostly sorted by ticket number (though some out-of-sequence entries are OK).
If so, then I'd think about rolling your own data structure for this. Here's a quick example of what I mean (with a lot left to the reader).
Basically what it does is implement Set but actually store it as a Map, with each entry representing a range of contiguous values in the set.
The add method is overridden to maintain the backing Map appropriately. E.g., if you add 5 to the set and already have a range containing 4, then it just extends that range instead of adding a new entry.
Note that the reason for the "mostly sorted" assumption is that, for totally unsorted data, this approach will still use a lot of memory: the backing map will grow large (as unsorted entries get added all over the place) before growing smaller (as additional entries fill in the gaps, allowing contiguous entries to be combined).
Here's the code:
package com.matt.tester;
import java.util.Collection;
import java.util.Comparator;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
public class SE {
public class RangeSet<T extends Long> implements SortedSet<T> {
private final TreeMap<T, T> backingMap = new TreeMap<T,T>();
@Override
public int size() {
// TODO Auto-generated method stub
return 0;
}
@Override
public boolean isEmpty() {
// TODO Auto-generated method stub
return false;
}
@Override
public boolean contains(Object o) {
if ( ! ( o instanceof Number ) ) {
throw new IllegalArgumentException();
}
T n = (T) o;
// Find the backingMap entry with the greatest key less than or equal to n
Map.Entry<T,T> floorEntry = backingMap.floorEntry(n);
if ( floorEntry == null ) {
return false;
}
final Long endOfRange = floorEntry.getValue();
if ( endOfRange >= n) {
return true;
}
return false;
}
@Override
public Iterator<T> iterator() {
throw new IllegalAccessError("Method not implemented. Left for the reader. (You'd need a custom Iterator class, I think)");
}
@Override
public Object[] toArray() {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public <T> T[] toArray(T[] a) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public boolean add(T e) {
if ( (Long) e < 1L ) {
throw new IllegalArgumentException("This example only supports counting numbers, mainly because it simplifies printGaps() later on");
}
if ( this.contains(e) ) {
// Already in set: return early so the ranges below are not modified
return false;
}
final Long previousEntryKey;
final T eMinusOne = (T) (Long) (e-1L);
final T nextEntryKey = (T) (Long) (e+1L);
if ( this.contains(eMinusOne ) ) {
// Find the greatest backingSet entry less than e
Map.Entry<T,T> floorEntry = backingMap.floorEntry(e);
final T startOfPrecedingRange;
startOfPrecedingRange = floorEntry.getKey();
if ( this.contains(nextEntryKey) ) {
// This addition will join two previously separated ranges
T endOfRange = backingMap.get(nextEntryKey);
backingMap.remove(nextEntryKey);
// Extend the prior entry to include the whole range
backingMap.put(startOfPrecedingRange, endOfRange);
return true;
} else {
// This addition will extend the range immediately preceding
backingMap.put(startOfPrecedingRange, e);
return true;
}
} else if ( this.backingMap.containsKey(nextEntryKey) ) {
// This addition will extend the range immediately following
T endOfRange = backingMap.get(nextEntryKey);
backingMap.remove(nextEntryKey);
// Extend the prior entry to include the whole range
backingMap.put(e, endOfRange);
return true;
} else {
// This addition is a new range, it doesn't touch any others
backingMap.put(e,e);
return true;
}
}
@Override
public boolean remove(Object o) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public boolean containsAll(Collection<?> c) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public boolean addAll(Collection<? extends T> c) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public boolean retainAll(Collection<?> c) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public boolean removeAll(Collection<?> c) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public void clear() {
this.backingMap.clear();
}
@Override
public Comparator<? super T> comparator() {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public SortedSet<T> subSet(T fromElement, T toElement) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public SortedSet<T> headSet(T toElement) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public SortedSet<T> tailSet(T fromElement) {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public T first() {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
@Override
public T last() {
throw new IllegalAccessError("Method not implemented. Left for the reader.");
}
public void printGaps() {
Long lastContiguousNumber = 0L;
for ( Map.Entry<T, T> entry : backingMap.entrySet() ) {
Long startOfNextRange = (Long) entry.getKey();
Long endOfNextRange = (Long) entry.getValue();
if ( startOfNextRange > lastContiguousNumber + 1 ) {
System.out.println( String.valueOf(lastContiguousNumber+1) + ".." + String.valueOf(startOfNextRange - 1) );
}
lastContiguousNumber = endOfNextRange;
}
System.out.println( String.valueOf(lastContiguousNumber+1) + "..infinity");
System.out.println("Backing map size is " + this.backingMap.size());
System.out.println(backingMap.toString());
}
}
public static void main(String[] args) {
SE se = new SE();
RangeSet<Long> testRangeSet = se.new RangeSet<Long>();
// Start by putting 1,000,000 entries into the map with a few, pre-determined, hardcoded gaps
for ( long i = 1; i <= 1000000; i++ ) {
// Our pre-defined gaps...
if ( i == 58349 || ( i >= 87333 && i <= 87777 ) || i == 303998 ) {
// Do not put these numbers in the set
} else {
testRangeSet.add(i);
}
}
testRangeSet.printGaps();
}
}
And the output is:
58349..58349
87333..87777
303998..303998
1000001..infinity
Backing map size is 4
{1=58348, 58350=87332, 87778=303997, 303999=1000000}
I believe it's a perfect moment to get familiar with the Bloom filter. It's a wonderful probabilistic data structure which can be used for immediate proof that an element isn't in the set.
How does it work? The idea is pretty simple, the tuning is more complicated, and an implementation can be found in Guava.
The idea
Initialize a filter, which is an array of bits long enough to hold the maximum value of the hash function used. When adding an element to the set, calculate its hash. Determine which bits of the hash are 1s and make sure that all of them are switched to 1 in the filter (array). When you want to check whether an element is in the set, simply calculate its hash and then check whether all bits that are 1s in the hash are 1s in the filter. If any of those bits is 0 in the filter, the element definitely isn't in the set. If all of them are set to 1, the element might be in the set, so you have to loop through all of the elements to be sure.
The Boost
A simple probabilistic model tells you how big the filter (and the range of the hash function) should be to give an acceptable chance of a false positive, which is the situation where all the bits are 1s but the element isn't in the set.
Implementation
The Guava implementation provides the following factory method for the Bloom filter: create(Funnel funnel, int expectedInsertions, double falsePositiveProbability). You can configure the filter yourself through expectedInsertions and falsePositiveProbability.
False positive
Some people are wary of Bloom filters because of the possibility of false positives. A Bloom filter can be used in a way that doesn't rely on the "might be in the filter" answer alone: if an element might be in the set, you loop through all the elements and check one by one whether it really is.
Possible usage
In your case, I'd create the filter for the set, then after all tickets are added simply loop through all the numbers (as you have to loop anyway) and check whether the filter reports that they might be in the set (mightContain in Guava). If you set falsePositiveProbability to 3%, you'll achieve complexity around O(n^2-0.03m*n), where m stands for the number of gaps. Correct me if I'm wrong with the complexity estimation.
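To illustrate the mechanics without pulling in Guava, here is a very small Bloom filter sketch built on java.util.BitSet. It is an assumption-laden toy, not Guava's implementation: the class name TinyBloomFilter, the bit-mixing function and the way the probe positions are derived are all choices made for this example. It follows the common textbook variant in which each element sets a few bit positions derived from its hash.

```java
import java.util.BitSet;

// Toy Bloom filter for long values (illustrative only, not Guava's algorithm).
class TinyBloomFilter {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // number of bit positions probed per element

    TinyBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Bit-mixing function in the style of the SplitMix64 finalizer.
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    // i-th probe position, derived from two base hashes (double hashing).
    private int index(long value, int i) {
        long h1 = mix(value);
        long h2 = mix(value ^ 0x9E3779B97F4A7C15L);
        return (int) Math.floorMod(h1 + i * h2, (long) size);
    }

    void add(long value) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(value, i));
        }
    }

    // false means "definitely not added"; true means "possibly added".
    boolean mightContain(long value) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Every value that was added is always reported as possibly present; a value that was never added is usually, but not always, reported as absent, which is the false-positive behaviour described above.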
Well, either you store everything in memory and risk overflowing the heap, or you don't store it in memory and need to do a lot of computing.
I would suggest something in between: store only the minimum information needed during processing. You could store the endpoints of each known gap-free sequence in a class with two Long fields, and keep all these sequence objects in a sorted list. When you find a new number, iterate through the list to see whether it is adjacent to one of the endpoints. If so, change that endpoint to the new number and check whether you can merge the adjacent sequence objects (and hence remove one of them). If not, create a new sequence object at the properly sorted position.
This will end up being O(n) in memory usage and O(n) in CPU usage. But any data structure that stores information about all numbers will also be O(n) in memory usage, and O(n*lookuptime) in CPU if the lookup is not done in constant time.
Read as many ticket numbers as you can fit into available memory.
Sort them, and write the sorted list to a temporary file. Depending on the expected number of gaps, it might save time and space to use a run-length–encoding scheme when writing the sorted numbers.
After all the ticket numbers have been sorted into temporary files, you can merge them into a single, sorted stream of ticket numbers, looking for gaps.
If this would result in too many temporary files to open at once for merging, groups of files can be merged into intermediate files, and so on, maintaining the total number below a workable limit. However, this extra copying can slow the process significantly.
The old tape-drive algorithms are still relevant.
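The final "looking for gaps" step is simple once the numbers arrive in sorted order. Here is a minimal sketch of that step only, assuming the merged input is available as a sorted LongStream; the class and method names are hypothetical, and the sorting and merging of temporary files is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PrimitiveIterator;
import java.util.stream.LongStream;

// Scans an already sorted sequence once, keeping only the previous value in memory.
class SortedGapScan {
    // Returns each gap as {firstMissing, lastMissing}. Duplicates are tolerated.
    static List<long[]> findGaps(LongStream sorted) {
        List<long[]> gaps = new ArrayList<>();
        PrimitiveIterator.OfLong it = sorted.iterator();
        if (!it.hasNext()) {
            return gaps;
        }
        long prev = it.nextLong();
        while (it.hasNext()) {
            long cur = it.nextLong();
            if (cur > prev + 1) {
                gaps.add(new long[] {prev + 1, cur - 1});
            }
            prev = cur;
        }
        return gaps;
    }

    public static void main(String[] args) {
        for (long[] g : findGaps(LongStream.of(1, 2, 3, 5, 6, 10))) {
            System.out.println(g[0] + ".." + g[1]); // prints 4..4 then 7..9
        }
    }
}
```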
Here is an idea: if you know in advance the range of your numbers, then
1. Pre-calculate the sum of all the numbers that you expect to be there.
2. Then keep reading your numbers and produce the sum of all read numbers as well as the number of your numbers.
3. If the sum you come up with is the same as pre-calculated one, then there are no gaps.
4. If the sum is different and the number of your numbers is short just by one of the expected number then pre-calculated sum - actual sum will give you your missing number.
5. If the number of your numbers is short by more than one, then you will know how many numbers are missing and what their sum is.
The best part is that you will not need to store the collection of your numbers in memory.
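The steps above can be sketched as follows. This is a rough illustration under two assumptions that the answer does not state explicitly: every number read lies inside the expected range, and no number appears twice (a duplicate could otherwise hide a gap). The names are hypothetical, and for very large ranges the sums could overflow a long.

```java
// Hypothetical helper: compares the expected count and sum with what was actually read.
class SumCheck {
    /** Returns {missingCount, missingSum} for the expected inclusive range [lo, hi]. */
    static long[] missingCountAndSum(long lo, long hi, long[] seen) {
        long expectedCount = hi - lo + 1;
        long expectedSum = (lo + hi) * expectedCount / 2; // arithmetic series sum
        long actualSum = 0;
        for (long v : seen) {
            actualSum += v; // only a running total is kept, not the numbers themselves
        }
        return new long[] {expectedCount - seen.length, expectedSum - actualSum};
    }

    public static void main(String[] args) {
        long[] r = missingCountAndSum(1, 10, new long[] {1, 2, 3, 4, 6, 7, 8, 9, 10});
        // prints "missing 1 number(s), sum 5", so the single missing number is 5
        System.out.println("missing " + r[0] + " number(s), sum " + r[1]);
    }
}
```

In a real run the numbers would come from the log files one at a time rather than from an array; the array is used here only to keep the example self-contained.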

ArrayList add(index, object) vs add(object) complexity

I was given the following little multiple-choice question in my APCS class concerning adding elements to ArrayLists, and although one particular answer (choice B) seems intuitively correct to me, I'm not entirely sure whether it's indeed right or what's actually going on behind the scenes performance-wise:
//Consider the following methods
public static List<Integer> process1(int n) {
List<Integer> someList = new ArrayList<Integer>();
for (int k = 0; k < n; k++) {
someList.add(new Integer(k));
}
return someList;
}
public static List<Integer> process2(int n) {
List<Integer> someList = new ArrayList<Integer>();
for (int k = 0; k < n; k++) {
someList.add(k, new Integer(k));
}
return someList;
}
//Which of the following best describes the behavior of process1 and process2?
//(A) Both methods produce the same result and take the same amount of time
//(B) Both methods produce the same result and process1 is faster than process2
//(C) The two methods produce different results and process1 is faster than process2
//(D) The two methods produce different results and process2 is faster than process1
Note: I did test both methods on my computer using large enough parameters, and both are quite close in running time, but process1 seems to be slightly faster. Also, this isn't a homework problem to be turned in or anything, so no need to feel worried about providing me with answers :)
From the JDK source (reproduced in @ScaryWombat's answer), it appears that the first will be slightly faster.
In this context, System.arraycopy won't actually move anything (index equals size, so the number of elements copied is 0), but the call will still be made. Otherwise, they are essentially identical. The second has the extra calls (rangeCheckForAdd and System.arraycopy), so it will likely be a tiny bit slower (magnified by large n).
public boolean add(E e) {
ensureCapacity(size + 1); // Increments modCount!!
elementData[size++] = e;
return true;
}
vs
public void add(int index, E element) {
rangeCheckForAdd(index);
ensureCapacity(size+1); // Increments modCount!!
System.arraycopy(elementData, index, elementData, index + 1,
size - index);
elementData[index] = element;
size++;
}
So it looks like the second method, add(int index, E element), has more code to run in addition to the common code shared by both.
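The "same result" half of the question is easy to check directly. A small sketch (the class name AddDemo is hypothetical, and Integer.valueOf is used in place of the deprecated new Integer constructor); it deliberately makes no timing claims:

```java
import java.util.ArrayList;
import java.util.List;

// Compares the lists built by appending versus inserting at index k == size.
class AddDemo {
    static List<Integer> process1(int n) {
        List<Integer> someList = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            someList.add(Integer.valueOf(k));    // add(E): append at the end
        }
        return someList;
    }

    static List<Integer> process2(int n) {
        List<Integer> someList = new ArrayList<>();
        for (int k = 0; k < n; k++) {
            someList.add(k, Integer.valueOf(k)); // add(int, E): index equals current size
        }
        return someList;
    }

    public static void main(String[] args) {
        System.out.println(process1(5));                     // prints [0, 1, 2, 3, 4]
        System.out.println(process1(5).equals(process2(5))); // prints true
    }
}
```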

Returning multiple values from a recursive function

I have this problem where I have to convert a decimal number to binary and then store the bits in a linked list where the head node is the most significant bit and the last node is the least significant bit. Solving the problem itself is actually easy as you only need to keep taking the modulo of 2 recursively and add the result in the list until the decimal number becomes 0.
Where I'm stuck is that I have to write the function such that it returns a pair of numbers (whether as an array or a list): the most significant bit and the least significant bit.
i.e: Inputting 14 in the function would return (1, 0), since 14 is 1110 in binary.
I do have access to the MSB and LSB easily(getFirst(), getLast()).
The function can only take one argument which is the decimal number.
Currently I have this code:
public static void encodeBin(int n) {
if(n == 0) return; //Base case
else {
if(n % 2 == 0)
theList.addFirst(0);
else
theList.addFirst(1);
encodeBin(n / 2);
}
// return?
}
The problem is I can't figure out how to return the two values. Having a return value means I can't call encodeBin() by itself.
Moreover, where should I create the list? If I put something like List<Integer> = new LinkedList<Integer>() at the very beginning of the function, then each time the function calls itself, it creates a new list and adds the bits in THAT new list not the original right?(The list created from when the function is called the first time)
Anybody knows how to solve this?
You cannot return 2 values. You are going to have to return some object that contains the 2 values: either an array or some new object, depending on your homework requirements and where this function is going to be used.
For the linkedlist creation, what you need is a recursive helper method. Your public method will be used to initialize your objects, start the recursion, and return your result. This allows your actual recursive function to have more than 1 argument.
public static SOME_TYPE encodeBin(int n) {
LinkedList result = new LinkedList();
encodeBin_helper(result,n);
// return the MSB and LSB
}
public static void encodeBin_helper(LinkedList theList, int n) {
if(n == 0) return; //Base case
else {
if(n % 2 == 0)
theList.addFirst(0);
else
theList.addFirst(1);
encodeBin_helper(theList, n/2);
}
}
You can't return two values separately. You can, however, return an array containing the first bit and the last bit or create your own class to hold this data, and return an instance of that class.
And about the list, I see two options:
Make it a static class variable
Make it an argument of the function (although I see you said you couldn't do this).
The first method would look like this:
public class MyClass {
private static LinkedList<Integer> theList = new LinkedList<Integer>(); // LinkedList, so that addFirst is available
// `encodeBin` method as you have it
}
The second method would look like this:
// The parameter is a LinkedList because addFirst is not declared on List before Java 21
public static void encodeBin(int n, LinkedList<Integer> theList) {
if(n == 0) return; //Base case
else {
if(n % 2 == 0)
theList.addFirst(0);
else
theList.addFirst(1);
encodeBin(n / 2, theList);
}
}
You could then do something along the lines of
LinkedList<Integer> theList = new LinkedList<Integer>();
encodeBin(14, theList);
and theList would hold the appropriate bits as desired.
As a note, you might want to consider making this a list of booleans instead of integers, with true representing 1 and false representing 0.
I suggest declaring two methods:
(1) public static int[] encodeBin(int n)
and
(2) private static void encodeBin(LinkedList, int n)
The public method merely creates an empty list and then calls the private version, passing both the empty list and the original input n as the parameters,
something like this:
public static int[] encodeBin(int n) {
LinkedList<Integer> aList = new LinkedList<Integer>();
encodeBin(aList , n);
int MSB = aList.getFirst();
int LSB = aList.getLast();
return new int[] {MSB, LSB};
}
private static void encodeBin(LinkedList<Integer> list, int n) {
//your recursive version here
}
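Putting the two methods together, a complete version might look like the sketch below. The class name EncodeBinDemo is hypothetical, the input is assumed to be non-negative, and treating 0 as the single bit 0 is an assumption of this example rather than something the question specifies.

```java
import java.util.LinkedList;

// Public entry point plus a private recursive helper that carries the list.
class EncodeBinDemo {
    // Returns {MSB, LSB} of the binary representation of n (assumes n >= 0).
    static int[] encodeBin(int n) {
        LinkedList<Integer> bits = new LinkedList<>();
        encodeBin(bits, n);
        if (bits.isEmpty()) {
            bits.add(0); // assumption: represent 0 as the single bit 0
        }
        return new int[] {bits.getFirst(), bits.getLast()};
    }

    private static void encodeBin(LinkedList<Integer> bits, int n) {
        if (n == 0) {
            return; // base case
        }
        bits.addFirst(n % 2); // the low bit goes in front of the bits added so far
        encodeBin(bits, n / 2);
    }

    public static void main(String[] args) {
        int[] r = encodeBin(14); // 14 is 1110 in binary
        System.out.println("(" + r[0] + ", " + r[1] + ")"); // prints (1, 0)
    }
}
```

The list is created exactly once, in the public method, so the recursive calls all add to the same list.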

Refactor this recursive method?

I'm pretty new to the idea of recursion and this is actually my first attempt at writing a recursive method.
I tried to implement a recursive function Max that passes an array, along with a variable that holds the array's size in order to print the largest element.
It works, but it just doesn't feel right!
I have also noticed that I seem to use the static modifier much more than my classmates in general...
Can anybody please provide any general tips as well as feedback as to how I can improve my code?
public class RecursiveTry{
static int[] n = new int[] {1,2,4,3,3,32,100};
static int current = 0;
static int maxValue = 0;
static int SIZE = n.length;
public static void main(String[] args){
System.out.println(Max(n, SIZE));
}
public static int Max(int[] n, int SIZE) {
if(current <= SIZE - 1){
if (maxValue <= n[current]) {
maxValue = n[current];
current++;
Max(n, SIZE);
}
else {
current++;
Max(n, SIZE);
}
}
return maxValue;
}
}
Your use of static variables for holding state outside the function will be a source of difficulty.
An example of a recursive implementation of a max() function in pseudocode might be:
function Max(data, size) {
assert(size > 0)
if (size == 1) {
return data[0]
}
maxtail = Max(data[1..size], size-1)
if (data[0] > maxtail) {
return data[0]
} else {
return maxtail
}
}
The key here is the recursive call to Max(), where you pass everything except the first element, and one less than the size. The general idea is this function says "the maximum value in this data is either the first element, or the maximum of the values in the rest of the array, whichever is larger".
This implementation requires no static data outside the function definition.
One of the hallmarks of recursive implementations is a so-called "termination condition" which prevents the recursion from going on forever (or, until you get a stack overflow). In the above case, the test for size == 1 is the termination condition.
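In Java, the pseudocode above could be sketched roughly like this. Instead of slicing the array, this version passes a start index (the parameter name from is an assumption of this example), which expresses the same "first element versus maximum of the rest" idea without copying.

```java
// Recursive maximum with no static state outside the methods.
class RecursiveMax {
    static int max(int[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        return max(data, 0);
    }

    // Maximum of data[from..end]; the last element is the termination condition.
    private static int max(int[] data, int from) {
        if (from == data.length - 1) {
            return data[from];
        }
        int maxTail = max(data, from + 1);
        return Math.max(data[from], maxTail);
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {1, 2, 4, 3, 3, 32, 100})); // prints 100
    }
}
```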
Making your function dependent on static variables is not a good idea. Here is a possible implementation of a recursive Max function:
int Max(int[] array, int currentPos, int maxValue) {
// Ouch!
if (currentPos < 0) {
throw new IllegalArgumentException("currentPos must not be negative");
}
// We reached the end of the array, return latest maxValue
if (currentPos >= array.length) {
return maxValue;
}
// Is current value greater than latest maxValue ?
int currentValue = array[currentPos];
if (currentValue > maxValue) {
// currentValue is a new maxValue
return Max(array, currentPos + 1, currentValue);
} else {
// maxValue is still a max value
return Max(array, currentPos + 1, maxValue);
}
}
...
int[] array = new int[] {...};
int currentPos = 0;
int maxValue = array[currentPos] or minimum int value;
maxValue = Max(array, currentPos, maxValue);
A "max" function is the wrong type of thing to write a recursive function for -- and the fact you're using static values for "current" and "maxValue" makes your function not really a recursive function.
Why not do something a little more amenable to a recursive algorithm, like factorial?
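As a rough sketch of that suggestion, a recursive factorial might look like this in Java. The class name and the choice of long as the result type (which overflows beyond 20!) are assumptions made for the example.

```java
// Hypothetical class name, used only for this sketch.
class FactorialSketch {
    // n! = n * (n-1)!, with 0! = 1 as the termination condition.
    static long factorial(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("n must be non-negative");
        }
        if (n == 0) {
            return 1;
        }
        return n * factorial(n - 1);
    }

    public static void main(String[] args) {
        System.out.println(factorial(5)); // prints 120
    }
}
```

Here the recursion mirrors the mathematical definition directly, which is why factorial is often used as a first example.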
"not-homework"?
Anyway. First things first. The
static int[] n = new int[] {1,2,4,3,3,32,100};
static int SIZE = n.length;
have nothing to do with the parameters of Max() with which they share their names. Move these over to main and lose the "static" specifiers. They are used only once, when calling the first instance of Max() from inside main(). Their scope shouldn't extend beyond main().
There is no reason for all invocations of Max() to share a single "current" index. "current" should be local to Max(). But then how would successive invocations of Max() know what value of "current" to use? (Hint: each Max() already passes some data to the next Max() down the line. Add "current" to this data.)
The same thing goes for maxValue, though the situation here is a bit more complex. Not only do you need to pass a current "maxValue" down the line, but when the recursion finishes, you have to pass it back up all the way to the first Max() function, which will return it to main(). You may need to look at some other examples of recursion and spend some time with this one.
Finally, Max() itself is static. Once you've eliminated the need to refer to external data (the static variables), however, it doesn't really matter. It just means that you can call Max() without having to instantiate an object.
As others have observed, there is no need for recursion to implement a Max function, but it can be instructive to use a familiar algorithm to experiment with a new concept. So, here is the simplified code, with an explanation below:
public class RecursiveTry
{
    public static void main(String[] args)
    {
        System.out.println(Max(new int[] {1,2,4,3,3,32,100}, 0, 0));
    }
    public static int Max(int[] n, int current, int maxValue)
    {
        if (current < n.length)
        {
            // The first element always replaces the initial maxValue
            if (maxValue <= n[current] || current == 0)
            {
                return Max(n, current+1, n[current]);
            }
            return Max(n, current+1, maxValue);
        }
        return maxValue;
    }
}
All of the static state is gone as unnecessary; instead, everything is passed on the stack. The internal logic of the Max function is streamlined, and we recurse in two different ways just for fun.
Here's a Java version for you.
public class Recursion {
    public static void main(String[] args) {
        int[] data = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        System.out.println("Max: " + max(0, data));
    }
    public static int max(int i, int[] arr) {
        if(i == arr.length-1) {
            return arr[i];
        }
        int memo = max(i+1, arr);
        if(arr[i] > memo) {
            return arr[i];
        }
        return memo;
    }
}
The recurrence relation is that the maximum element of an array is either the first element, or the maximum of the rest of the array. The stop condition is reached when you reach the end of the array. Note that the result of the recursive call is stored in a local variable (memo), so each level makes the recursive call only once instead of repeating it for the comparison and again for the return.
You are essentially writing an iterative version but using tail recursion for the looping. Also, by making so many variables static, you are essentially using global variables instead of objects. Here is an attempt at something closer to a typical recursive implementation. Of course, in real life if you were using a language like Java that doesn't optimize tail calls, you would implement a "Max" function using a loop.
public class RecursiveTry{
    // Instance field rather than static: each object holds its own data
    private final int[] n;
    public static void main(String[] args){
        RecursiveTry t = new RecursiveTry(new int[] {1,2,4,3,3,32,100});
        System.out.println(t.Max());
    }
    RecursiveTry(int[] arg) {
        n = arg;
    }
    public int Max() {
        return MaxHelper(0);
    }
    private int MaxHelper(int index) {
        if(index == n.length-1) {
            return n[index];
        } else {
            int maxrest = MaxHelper(index+1);
            int current = n[index];
            if(current > maxrest)
                return current;
            else
                return maxrest;
        }
    }
}
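For comparison, the loop-based version mentioned above might look like the following sketch. The class and method names are made up for the example, and throwing on an empty array is just one possible choice.

```java
// Hypothetical class name, used only for this sketch.
class LoopMax {
    // Iterative maximum: a single pass over the array, constant stack depth.
    static int max(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int best = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] > best) {
                best = values[i];
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {1, 2, 4, 3, 3, 32, 100})); // prints 100
    }
}
```

Unlike the recursive versions, this cannot overflow the stack on a very long array.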
In Scheme this can be written very concisely:
(define (max l)
  (if (= (length l) 1)
      (first l)
      (local ([define maxRest (max (rest l))])
        (if (> (first l) maxRest)
            (first l)
            maxRest))))
Granted, this uses linked lists and not arrays, which is why I didn't pass it a size element, but I feel this distills the problem to its essence. This is the pseudocode definition:
define max of a list as:
if the list has one element, return that element
otherwise, the max of the list will be the max between the first element and the max of the rest of the list
A nicer way of getting the max value of an array recursively would be to implement quicksort (which is a nice, recursive sorting algorithm), and then just return the last value (or the first, if you sort in descending order).
Here is some Java code for quicksort.
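A minimal sketch of that idea, assuming an ascending sort so that the maximum ends up in the last position. The class and method names are illustrative, and the Lomuto partition scheme used here is just one common way to write quicksort.

```java
// Hypothetical class name, used only for this sketch.
class QuickSortMax {
    private static void swap(int[] arr, int a, int b) {
        int tmp = arr[a];
        arr[a] = arr[b];
        arr[b] = tmp;
    }

    // Sorts arr[lo..hi] in ascending order (Lomuto partition scheme).
    static void quickSort(int[] arr, int lo, int hi) {
        if (lo >= hi) {
            return; // zero or one element: already sorted
        }
        int pivot = arr[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (arr[j] < pivot) {
                swap(arr, i, j);
                i++;
            }
        }
        swap(arr, i, hi);          // pivot lands in its final position
        quickSort(arr, lo, i - 1); // sort the smaller elements
        quickSort(arr, i + 1, hi); // sort the larger elements
    }

    // Sorts a copy (so the caller's array is untouched) and returns the last element.
    static int max(int[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int[] copy = values.clone();
        quickSort(copy, 0, copy.length - 1);
        return copy[copy.length - 1];
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {1, 2, 4, 3, 3, 32, 100})); // prints 100
    }
}
```

Keep in mind that sorting does more work than a single pass over the array, so this is mostly interesting as a recursion exercise.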
Smallest codesize I could get:
public class RecursiveTry {
    public static void main(String[] args) {
        int[] x = new int[] {1,2,4,3,3,32,100};
        System.out.println(Max(x, 0));
    }
    public static int Max(int[] arr, int currPos) {
        if (arr.length == 0) return -1;
        if (currPos == arr.length) return arr[0];
        int len = Max(arr, currPos + 1);
        if (len < arr[currPos]) return arr[currPos];
        return len;
    }
}
A few things:
1/ If the array is zero-size, it returns a max of -1 (you could use another marker value, say, Integer.MIN_VALUE, or throw an exception). For code clarity, I've assumed that all values are zero or more. Otherwise I would have peppered the code with all sorts of stuff that is unnecessary for answering the question.
2/ Most recursions are 'cleaner' in my opinion if the terminating case is no-data rather than last-data, hence I return a value guaranteed to be less than or equal to the max when we've finished the array. Others may differ in their opinion but it wouldn't be the first or last time that they've been wrong :-).
3/ The recursive call just gets the max of the rest of the list and compares it to the current element, returning the maximum of the two.
4/ The 'ideal' solution would have been to pass a modified array on each recursive call so that you're only comparing the first element with the rest of the list, removing the need for currPos. But that would have been inefficient and would have brought down the wrath of SO.
5/ This may not necessarily be the best solution. It may be that my gray matter has been compromised from too much use of LISP with its CAR, CDR and those interminable parentheses.
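Point 1/ mentions throwing an exception as an alternative to the -1 marker. A hedged sketch of that variant follows; it keeps the "no data left" terminating case from point 2/ but returns Integer.MIN_VALUE there, so negative values also work. The class and method names are made up for the example.

```java
// Hypothetical class name, used only for this sketch.
class MaxOrThrow {
    // Public entry point: rejects empty input instead of returning a marker value.
    static int max(int[] arr) {
        if (arr.length == 0) {
            throw new IllegalArgumentException("cannot take the max of an empty array");
        }
        return maxFrom(arr, 0);
    }

    // Terminating case is "no data left": Integer.MIN_VALUE is <= every int,
    // so it never wins the comparison against a real element.
    private static int maxFrom(int[] arr, int currPos) {
        if (currPos == arr.length) {
            return Integer.MIN_VALUE;
        }
        int rest = maxFrom(arr, currPos + 1);
        return Math.max(arr[currPos], rest);
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] {-3, -7, -1})); // prints -1
    }
}
```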
First, let's take care of the static scope issue ... Your class is defining an object, but never actually instantiating one. Since main is statically scoped, the first thing to do is get an object, then execute its methods like this:
public class RecursiveTry{
    private int[] n = {1,2,4,3,3,32,100};
    public static void main(String[] args){
        RecursiveTry maxObject = new RecursiveTry();
        System.out.println(maxObject.Max(maxObject.n, 0));
    }
    public int Max(int[] n, int start) {
        if(start == n.length - 1) {
            return n[start];
        } else {
            int maxRest = Max(n, start + 1);
            if(n[start] > maxRest) {
                return n[start];
            }
            return maxRest;
        }
    }
}
So now we have a RecursiveTry object named maxObject that does not require the static scope. I'm not sure that finding a maximum is effective using recursion as the number of iterations in the traditional looping method is roughly equivalent, but the amount of stack used is larger using recursion. But for this example, I'd pare it down a lot.
One of the advantages of recursion is that your state doesn't generally need to be persisted during the repeated tests like it does in iteration. Here, I've conceded to the use of a variable to hold the starting point, because it's less CPU intensive than passing a new int[] that contains all the items except for the first one.
