How to index an array of BigIntegers by BigIntegers - java

I'm trying to build an array of BigIntegers, but it seems like the array needs to be indexed by ints themselves (and if true, that seems extremely stupid to me, but I'm hoping I'm just misunderstanding something). What I'm trying is essentially the following:
BigInteger totalChoiceFunctions = BigInteger.valueOf(50031545098999704);
BigInteger[][] choiceFunctions = new BigInteger[totalChoiceFunctions][36];
But this causes the error "type mismatch: cannot convert from BigInteger to int". In order to remedy this I tried:
BigInteger[][] choiceFunctions = new BigInteger[totalChoiceFunctions.intValue()][36];
however this doesn't seem to help. When I compile and run I get the runtime error of:
Exception in thread "main" java.lang.NegativeArraySizeException
Confused, I looked at the Oracle documentation for the intValue() method of BigInteger and found that "if this BigInteger is too big to fit in an int, only the low-order 32 bits are returned. Note that this conversion can lose information about the overall magnitude of the BigInteger value as well as return a result with the opposite sign". I suspect this is what's going on, considering that 50031545098999704 is certainly too big for an int (and it is why I turned towards an array of BigIntegers, since I want my array to be indexed by the numbers from 1 to 50031545098999704).
If my understanding is correct, then:
BigInteger[][] choiceFunctions = new BigInteger[totalChoiceFunctions][36];
creates an array which stores BigIntegers but is still indexed by ints. How can I make an array that both stores and is indexed by BigIntegers? Is it possible? Note that the code I'm using this for might work in this case if I could index by longs rather than ints, but I want it to scale to a size where I'll be forced to index by BigIntegers. Am I missing something obvious?

Arrays in Java are not sparse, so your array would need about 200,000 terabytes for the outer array alone (not including the referenced inner arrays/BigIntegers). So no, this is currently not possible. There are some plans to support long array indexes, maybe in Java 10 (certainly not Java 9).
I guess you actually want a sparse data structure; a Map<BigInteger, BigInteger>, or since you have a nested array, a map keyed on the (BigInteger, Integer) pair (Java has no built-in Tuple type, so use a nested map or a small key class), should work for you.
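For illustration, here is a minimal sketch of that sparse approach (the class and method names are my own, not from the question): a nested map standing in for the huge BigInteger[...][36], where only the cells you actually write consume memory.

import java.math.BigInteger;
import java.util.HashMap;
import java.util.Map;

// Sparse stand-in for a huge BigInteger[rows][36]: absent cells cost nothing.
final class SparseBigTable {
    private final Map<BigInteger, Map<Integer, BigInteger>> rows = new HashMap<>();

    void set(BigInteger row, int col, BigInteger value) {
        rows.computeIfAbsent(row, r -> new HashMap<>()).put(col, value);
    }

    // Returns null for cells that were never written.
    BigInteger get(BigInteger row, int col) {
        Map<Integer, BigInteger> cols = rows.get(row);
        return cols == null ? null : cols.get(col);
    }
}

This indexes by arbitrary BigIntegers, but of course it only helps if the number of cells you actually touch fits in memory.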

No, it is not possible. In Java, all arrays are indexed by ints only.

In theory (though it is not practical), you can create such a class.
An ordinary array can't increase its size dynamically, right? But an ArrayList can! How does it do that? By creating a new, larger array when the capacity is full.
You can kind of apply the same logic here.
An ordinary array cannot hold 50031545098999704 items. But you can create multiple arrays to hold them!
So in your class, you should have a matrix:
public class BigIntegerArray<T> {
private T[][] innerMatrix;
}
The constructor is going to accept a number as the array length, right? Using the array length, you know how many arrays you need. For example, if the array size is N where Integer.MAX_VALUE < N <= Integer.MAX_VALUE * 2, you know that you should initialize the inner matrix like this:
innerMatrix = (T[][]) new Object[2][Integer.MAX_VALUE]; // needs an unchecked cast (generic array creation is illegal), and most JVMs cap array length slightly below Integer.MAX_VALUE
Just use some maths!
And you would want to implement a get and set method. If the index is larger than Integer.MAX_VALUE but smaller than or equal to Integer.MAX_VALUE * 2, access it like this:
innerMatrix[1][index - Integer.MAX_VALUE]; // not real code, just illustrating the point
You get what I mean? It's basically simple maths.
EDIT:
Maybe I didn't explain this well enough. The algorithm for creating the inner matrix is like this (pseudocode):
if arraySize <= Integer.MAX_VALUE then
    create a single array of size arraySize
    return
rowCount = ceil(arraySize / Integer.MAX_VALUE)
create rowCount inner arrays: each one Integer.MAX_VALUE long,
except the last, which holds the remainder
The accessing part is just the reverse of that maths:
outer = index / Integer.MAX_VALUE
inner = index % Integer.MAX_VALUE
access [outer][inner]
(for index < Integer.MAX_VALUE this is simply [0][index])
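To make the idea concrete, here is a minimal compilable sketch of such a chunked array (the names are mine; it stores BigIntegers directly to sidestep Java's ban on generic array creation). Note it only extends the index range to long; it does nothing about the fact that 50031545098999704 elements cannot fit in memory anyway.

import java.math.BigInteger;

// A long-indexed array built from int-indexed chunks: index / CHUNK picks the
// inner array, index % CHUNK picks the slot within it. CHUNK stays well below
// Integer.MAX_VALUE because most JVMs refuse arrays of exactly that length.
final class BigArray {
    private static final int CHUNK = 1 << 30;
    private final BigInteger[][] chunks;

    BigArray(long length) {
        int outer = (int) ((length + CHUNK - 1) / CHUNK); // ceil(length / CHUNK)
        chunks = new BigInteger[outer][];
        for (int i = 0; i < outer; i++) {
            long remaining = length - (long) i * CHUNK;
            chunks[i] = new BigInteger[(int) Math.min(CHUNK, remaining)];
        }
    }

    BigInteger get(long index) {
        return chunks[(int) (index / CHUNK)][(int) (index % CHUNK)];
    }

    void set(long index, BigInteger value) {
        chunks[(int) (index / CHUNK)][(int) (index % CHUNK)] = value;
    }
}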

Java will only let you index arrays by int, so the maximum index you can reach is Integer.MAX_VALUE, which is 2147483647 (2^31 - 1). Your array cannot have more elements, or a higher index, than this.
You should be using a data structure that scales beyond the int limits, e.g. a map. A map behaves like a sparse 2D array without that limit; here you can store the values as:
Map<BigInteger, BigInteger> m = new HashMap<>();
Insertion and retrieval will be a little harder than with a simple 2D array, but considering that array indexes are limited to int, we have to go with another approach.

Related

Accessing an array element if using long datatype in Java

I am trying to solve competitive programming questions using Java. In questions involving the use of arrays, I have to write my programs such that the array can hold a large number of values (beyond the range of int) and each array element is also beyond the int range.
So I am using long for this. But if I try accessing say
a[i++] = b[j++];
where i, j, a and b are all of long type, I get the following error:
error: possible loss of precision
for the index access.
How do I access such elements? I tried typecasting the long index values to int, and the error goes away, but won't that lose precision too?
variable declarations are as follows:
long numTest = in.nextLong();
long [] sizes = new long[numTest];
The Java Language Specification says that you can't use longs as array indexes:
Arrays must be indexed by int values... An attempt to access an array component with a long index value results in a compile-time error.
So you could expect the maximum size to be Integer.MAX_VALUE, but the actual limit depends on your JVM, which is discussed here.
Unfortunately Java can't handle arrays with more than 2^31 - 1 elements, the maximum value of a signed 32-bit integer. Such a big array probably wouldn't fit in memory anyway. Consider this case for example:
Array with 2^32 long elements
Size is 2^32 * 8 bytes = 2^35 bytes = 32 GB (!)
In this example the array size is just slightly bigger than the maximum integer value, but we have already reached 32 gigabytes of used memory.
I think you should find an alternative solution, such as memory-mapped files or dynamically loading parts of the data as needed.
I'd also like to link to this existing answer: https://stackoverflow.com/a/10787175
Java arrays must be indexed by int values; that is a rule stated in the language specification. So if you use a long as an index, the compiler refuses the implicit narrowing conversion rather than silently truncating your value - that is why you get the "possible loss of precision" message, and why an explicit cast to int makes it go away (the index must genuinely fit in an int).
To have an array addressed by a long number of elements, you will need a custom array-like implementation.
Java only allows you to create arrays up to the int range, and hence the indexes must be within the int range too.
To achieve long-range indexing, I suggest you use the concept of an "arraylet": create a two-dimensional array where the top-level index lets you cross the int range, for example:
long[][] array = new long[x][Integer.MAX_VALUE];
where x could be any number that defines how many multiples of Integer.MAX_VALUE elements your program requires. (In practice, pick the inner size a bit smaller, since most JVMs cap array length slightly below Integer.MAX_VALUE.)
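A hedged sketch of the access arithmetic (helper names are mine): the top-level index is index / chunk and the inner index is index % chunk, the same chunking trick sketched for the previous question.

// Hypothetical arraylet accessors; CHUNK is the inner array length, kept
// slightly below Integer.MAX_VALUE so each inner array can be allocated.
static final int CHUNK = Integer.MAX_VALUE - 8;

static long get(long[][] arraylets, long index) {
    return arraylets[(int) (index / CHUNK)][(int) (index % CHUNK)];
}

static void set(long[][] arraylets, long index, long value) {
    arraylets[(int) (index / CHUNK)][(int) (index % CHUNK)] = value;
}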

How to optimize checking if one of the given arrays is contained in yet another array

I have an array of integers which is updated every set interval of time with a new value (let's call it data). When that happens I want to check if that array contains any other array of integers from specified set (let's call that collection).
I do it like this:
separate a sub-array from the end of data of length X (arrays in the collection have a set max length of X);
iterate through the collection and check if any array in it is contained in the separated data chunk;
It works, though it doesn't seem optimal. Every other idea I have involves creating more collections (e.g. create a collection of all the arrays from the original collection that end with the same integer as data, repeat). And that seems even more complex (on the other hand, it looks like the only way to deal with collections of arrays without a set max length).
Are there any standard algorithms to deal with such a problem? If not, are there any worthwhile optimizations I can apply to my approach?
EDIT:
To be precise, I:
separate a sub-array from the end of data of length X (arrays in the collection have a set max length of X, and if they don't, X is just the length of the longest one in the collection);
iterate through the collection and for every array in it:
separate a sub-array from the previous sub-array, with its length matching the current array in the collection;
use Java's List.equals to compare the arrays;
EDIT 2:
Thanks for all the replies, surely they'll come in handy some day. In this case I decided to drop the last steps and just compare the arrays in my own loop. That eliminates creating yet another sub-array, and it's already O(N), so in this specific case it will do.
Take a look at the KMP algorithm. It was designed with String matching in mind, but it really comes down to matching a given pattern against contiguous runs of array elements. Since the algorithm has linear complexity (O(n)), it can be said that it's pretty optimal. It's also a basic staple among standard algorithms.
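As a concrete reference, here is a minimal KMP sketch for int arrays (the helper names are mine, not from the answer). The failure table is what lets the scan run in O(data length + pattern length) without re-examining matched elements.

// Builds the KMP failure table: fail[i] is the length of the longest proper
// prefix of pattern[0..i] that is also a suffix of it.
static int[] failureTable(int[] pattern) {
    int[] fail = new int[pattern.length];
    int k = 0;
    for (int i = 1; i < pattern.length; i++) {
        while (k > 0 && pattern[k] != pattern[i]) k = fail[k - 1];
        if (pattern[k] == pattern[i]) k++;
        fail[i] = k;
    }
    return fail;
}

// Returns the start index of the first occurrence of pattern in data, or -1.
static int indexOf(int[] data, int[] pattern) {
    if (pattern.length == 0) return 0;
    int[] fail = failureTable(pattern);
    int k = 0; // number of pattern elements currently matched
    for (int i = 0; i < data.length; i++) {
        while (k > 0 && pattern[k] != data[i]) k = fail[k - 1];
        if (pattern[k] == data[i]) k++;
        if (k == pattern.length) return i - k + 1;
    }
    return -1;
}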
dfens' proposal is smart in that it incurs no significant extra cost if you keep the running product along with the main array, so it can be checked in O(1), but it is also quite fragile and produces many false positives. Just imagine a target array [1, 1, ..., 1], whose product (1) divides everything, so it tests positive against every non-trivial main array. It also breaks down when one bucket contains a 0. That means a successful divisibility test is a necessary condition for a hit (0s aside) but never a sufficient one; with that method alone, you can never be sure of a match.
look at the rsync algorithm... if I understand it correctly, you could go about it like this:
you've got an immense array of data [length L]
at the end of that data, you've got N bytes of data, and you want to know whether those N bytes ever appeared before.
precalculate:
for every offset in the array, calculate the checksum over the next N data elements.
Hold those checksums in a separate array.
Using a rolling checksum like rsync does, you can do this step in time linear in L.
Whenever new data arrives:
Calculate the checksum over the last N elements. Using a rolling checksum, this is O(1).
Check that checksum against all the precalculated checksums. If it matches, check the subarrays (subslices, whatever...) for equality. If that matches too, you've got a match.
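Here is a minimal rolling-hash sketch of that scheme (my own names; it uses a polynomial hash with natural long overflow rather than rsync's actual checksum). After the O(N) constructor, each slide of the window is O(1).

// Polynomial rolling hash over a window of N ints; arithmetic wraps mod 2^64.
final class RollingHash {
    private static final long BASE = 1_000_003L;
    private final long topPower; // BASE^(N-1), used to drop the outgoing element
    private long hash;

    RollingHash(int[] window) {
        long p = 1;
        for (int i = 1; i < window.length; i++) p *= BASE;
        topPower = p;
        for (int v : window) hash = hash * BASE + v;
    }

    // Slide the window one element to the right in O(1).
    void roll(int outgoing, int incoming) {
        hash = (hash - outgoing * topPower) * BASE + incoming;
    }

    long value() { return hash; }
}

Hash every length-N array in the collection once up front; on each new data element, roll the window's hash and fall back to an element-by-element comparison only when a hash matches.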
I think, in essence, this is the same as dfens' approach with the product of all numbers.
I think you can keep a running product of the array for immediate rejections.
So if your array is [n_1, n_2, n_3, ...], you can say that it is not a subarray of [m_1, m_2, m_3, ...] if the product m_1*m_2*... = M is not divisible by the product n_1*n_2*... = N.
Example
Let's say you have array
[6,7]
And comparing with:
[6,10,12]
[4,6,7]
The product of your array is 6*7 = 42
6 * 10 * 12 = 720, which is not divisible by 42, so you can reject the first array immediately
4 * 6 * 7 = 168, which is divisible by 42, so you cannot reject it (but it is not a guaranteed hit either - the divisibility can come from other multipliers)
In each interval of time you can just multiply the product by the new number to avoid recomputing the whole product every time.
Note that you don't have to allocate anything if you simulate List's equals yourself. Just one more loop.
Similar to dfens' answer, I'd offer other criteria:
As the product quickly becomes too big to be handled efficiently, compute the GCD instead. It produces many more false positives, but surely fits in a long or an int or whatever your original datatype is.
Compute the total number of trailing zeros, i.e., the "product" ignoring all factors but powers of 2. Also pretty weak, but fast. Use this criterion before the more time-consuming ones.
But... this is a special case of DThought's answer. Use rolling hash.

Java quickly access array item

I'm doing a calculation that returns a long, for example 4611686018427387904. I need to first convert it to hex and then check an array of size 16 depending on the bits set.
So the above number gets converted to 0x4000000000000000L, which corresponds to the first index in the array. If the number is 0x0004000000000000L it corresponds to the 3rd index in the array.
My questions are:
Is there a quick way to convert a decimal to hex?
Is there a quick way to access an array depending on the bits set of the value (instead of using loops)?
If the number is in the long range, use Long.highestOneBit(). Alternatively, BigInteger has a bitLength() method.
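A small sketch of how those methods map a value onto the question's indexes (the from-the-left numbering is my reading of the question, with the leftmost hex digit as index 0):

long v = 4611686018427387904L;               // == 2^62 == 0x4000000000000000L
int bit = 63 - Long.numberOfLeadingZeros(v); // 62: position of the highest set bit
int nibbleFromRight = bit / 4;               // 15: which 4-bit hex digit holds that bit
int indexFromLeft = 15 - nibbleFromRight;    // 0: the "first index" of the question

(For 0x0004000000000000L the same arithmetic yields indexFromLeft == 3, matching the question's "3rd index".)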
First, a guess: you have a long that you are viewing as 16 4-bit numbers. You want to use each 4-bit number as an index into a 16-element array.
I think the fastest way to do this is going to be with a mask and some bit-shifting. Both are fast operations. If you mask out the bottom bits, you don't have to shift the result after you mask.
There are 16 4-bit groupings in a long, so:
long l = 1234;
int[] results = new int[16];
for (int i = 15; i >= 0; i--)
{
    int index = (int) l & 0xF; // low-order 4 bits; the mask also kills sign extension
    results[i] = index;
    l = l >> 4;                // move the next 4-bit group into place
}
This gets your 16 indices from the right (lower-order) bits of your long, and you say the left (higher-order) bits are the first indices. So this gets them all in reverse order and stores them accordingly. Adjust for shorts or whatever you need, and this will be fast.
Warning: I have not run this code. I mean it as a kind of Java pseudo-code.
IN CASE YOU ARE NOT FAMILIAR WITH THE '&' OPERATOR: it keeps only the bit positions that are 1 in both operands, so & with 0xF gives you exactly the low-order 4 bits.
Remember that internally all primitives in Java are stored as binary. The decimal number you see when you pass one to System.out.println() is just a view of that binary value. So a value held in a primitive (double, float, long, int, short, byte) needs no conversion to hex - it is already bits. If you want a hex string for display purposes, use Integer.toHexString(int) (or Long.toHexString(long) for longs).
Java has an upper limit on array sizes: indexes are ints, so at most about 2^31 slots. Anything over that and you'll have to use two or more structures and combine them, which I suppose is sort of what you are doing. In that case, maybe use a long, and Long.highestOneBit() could give a first index into an array of arrays, where each slot is itself an array of up to about 2^31 slots. That isn't a particularly efficient structure for memory usage, but it would give you a view that is potentially the size of a long. You probably don't have enough memory for that, but who knows.

How to get table size 2^32 in java

I have to create, in Java: protected final static int[] SIEVE = new int[1 << 32];
But I can't force Java to do that.
The max sieve I can get is 2^26, and I need 2^32 to finish my homework. I tried with a mask, but I need to have SIEVE[n] = k where k = min{k : k | n, k >= 2}.
EDIT
I need to factor numbers from 2 to 2^63 - 1 using a sieve, and the sieve must hold the information P[n] = the smallest prime that divides n. I know that with a sieve I can factorise numbers up to 2^52. But how do I do this exercise while keeping all that content in memory?
EDIT x2 problem solved
You can't. A Java array can have at most 2^31 - 1 elements because the size of an array has to fit in a signed 32-bit integer.
This applies whether you run on a 32 bit or 64 bit JVM.
I suspect that you are missing something in your homework. Is the requirement to be able to find all primes less than 2^32 or something? If that is the case, they expect you to treat each int of the int[] as an array of 32 bits. You need an array of only 2^27 ints to do that (2^32 bits / 32 bits per int).
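A minimal sketch of that bit-packing (names are mine; it marks composites rather than storing smallest prime factors, and for limit = 1L << 32 the int[] alone is 512 MB, so run with a large heap, e.g. -Xmx2g):

// Sieve of Eratosthenes with one bit per number, packed 32 flags per int.
static int[] sieve(long limit) {
    int[] composite = new int[(int) ((limit + 31) >>> 5)];
    for (long p = 2; p * p < limit; p++) {
        if ((composite[(int) (p >>> 5)] & (1 << (int) (p & 31))) == 0) { // p is prime
            for (long m = p * p; m < limit; m += p) {
                composite[(int) (m >>> 5)] |= 1 << (int) (m & 31);       // mark multiples
            }
        }
    }
    return composite;
}

static boolean isPrime(int[] composite, long n) {
    return n >= 2 && (composite[(int) (n >>> 5)] & (1 << (int) (n & 31))) == 0;
}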
A BitSet is another good alternative (note that a single BitSet is also indexed by int, so it tops out just below 2^31 bits).
A LinkedList<Integer> is a poor alternative. It uses roughly 8 times the memory that an array of the same size would, and the performance of get(int) is going to be horribly slow for a long list ... assuming that you use it in the obvious fashion.
If you want something that can efficiently use as much memory as you can configure your JVM to use, then you should use an int[][] i.e. an array of arrays of integers, with the int[] instances being as large as you can make them.
I need to factor numbers from 2 to 2^63 - 1 using a sieve, and the sieve must hold the information P[n] = the smallest prime that divides n. I know that with a sieve I can factorise numbers up to 2^52. But how do I do this exercise while keeping all that content in memory?
I'm not really sure I understand you. To factorize a number in the region of 2^64, you only need prime numbers up to 2^32 ... not 2^52. (The square root of 2^64 is 2^32 and a non-prime number must have a prime factor that is less than or equal to its square root.)
It sounds like you are trying to sieve more numbers than you need to.
If you really need to store that much data in memory, try using the java.util.LinkedList collection instead.
However, there's a fundamental flaw in your algorithm if it needs to store 16 GB of data in memory.
If you're talking about Sieve of Eratosthenes and you need to store all primes < 2^32 in an array, you still wouldn't need an array of size 2^32. I'd suggest you use java.util.BitSet to find the primes and either iterate and print or store them in a LinkedList as required.

Memory footprint of int[] and Integer[] arrays

I'm trying to create an array of Integers (I tried with my own object too, and the same thing happened), with a size of 30 million. I keep getting "OutOfMemoryError: Java heap space".
Integer [] index = new Integer[30000000];
for (int i = 0 ; i < 30000000 ; i++){
index[i] = i;
}
I checked the total heap space using Runtime.getRuntime().totalMemory() and maxMemory()
and saw that I start with 64 MB and the max is 900+ MB, and during the run I reach 900+ MB on the heap and crash.
Now, I know that an Integer takes 4 bytes, so even if I multiply 30,000,000 * 4 I should still only get about 120 MB.
If I try with a primitive type, like int, it works.
How could I fix it?
Java's int primitive takes up 4 bytes, but if you use a wrapper object like Integer it's going to take up much more space: depending on your JVM, a reference alone takes 32 or 64 bits, on top of the object that wraps the primitive.
You should probably just use primitive ints if space is an issue. Here is a very good SO answer that explains this topic in more detail.
Let's assume that we are talking about a 32-bit OpenJDK-based JVM.
Each Integer object has 1 int field, occupying 4 bytes.
Each Integer object has 2 header words, occupying 8 bytes.
The granularity of allocation is (I believe) 2 words, adding 4 bytes of padding.
The Integer[] has 1 reference for each array element/position: 4 bytes.
So the total is 20 bytes per array element. 20 x 30,000,000 = 600,000,000 bytes, i.e. about 600 MB. Now add the fact that the generational collector will allocate at least 3 object spaces of various sizes, and that could easily add up to 900+ MB.
How could I fix it?
Use an int[] instead of an Integer[].
If the Integer values mostly represent numbers in the range -128 to + 127, allocate them with Integer.valueOf(int). The JLS guarantees that Integer objects created that way will be shared. (Note that when an Integer is created by auto-boxing, then JLS stipulates that valueOf is used. So, in fact, this "fix" has already been applied in your example.)
If your Integer values mostly come from a larger but still small domain, consider implementing your own cache for sharing Integer objects.
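A hedged sketch of such a cache (the class and method names are mine): intern each distinct value once and hand out the shared box from then on.

import java.util.HashMap;
import java.util.Map;

// Flyweight cache for Integer boxes drawn from a small domain of values.
final class IntegerCache {
    private final Map<Integer, Integer> cache = new HashMap<>();

    Integer intern(int value) {
        return cache.computeIfAbsent(value, k -> k); // reuse the boxed key itself
    }
}

This only pays off when the same values recur often enough that sharing the boxes outweighs the cost of the map itself.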
My question was about Integer as an example; in my program I use my own object that only holds an array of bytes (max size of 4). When I create it, it takes a lot more than 4 bytes of memory.
Yes, it will do.
Let's assume your class is defined like this:
public class MyInt {
private byte[] bytes = new byte[4];
}
Each MyInt will occupy:
MyInt header words - 8 bytes
MyInt.bytes field - 4 bytes
Padding - 4 bytes
Header words for the byte array - 12 bytes
Array content - 4 bytes
Now add the space taken by the MyInt reference:
Reference to each MyInt - 4 bytes
Grand total - 36 bytes per MyInt element of a MyInt[].
Compare that with 20 bytes per Integer element of an Integer[] or 4 bytes per int element of an int[].
How to fix that case?
Well, an array of 4 bytes contains 32 bits of data, and that can be encoded in a single int. So the fix is the same as before: use an int[] instead of a MyInt[], or (possibly) adapt one of the other ideas discussed above.
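For example, a sketch of that encoding (the big-endian byte order is my choice):

// Pack four bytes into one int; the (b & 0xFF) masks undo sign extension.
static int pack(byte b0, byte b1, byte b2, byte b3) {
    return ((b0 & 0xFF) << 24) | ((b1 & 0xFF) << 16) | ((b2 & 0xFF) << 8) | (b3 & 0xFF);
}

// position 0 is the high byte.
static byte unpack(int packed, int position) {
    return (byte) (packed >>> (24 - 8 * position));
}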
Alternatively, make the heap larger, or use a database or something like that so that the data doesn't need to be held in RAM.
Integer is an object which takes more than 4 bytes; how much more is implementation dependent. Do you really need Integer? The only benefit is that it can be null. Perhaps you could use a "sentinel value" instead; say, -1 or Integer.MIN_VALUE.
Perhaps you should be using a database rather than a huge array, but if you must use a huge array of objects, have you tried increasing the maximum heap size with the -Xmx command line argument when running the Java application launcher?
This is not what you are looking for, but in this simple example the optimal solution is to use a function instead of an array.
static int index(int num) {
return num;
}
If you have a more realistic example, there may be other optimisations you can use.
