Arbitrary precision multiplication, Knuth 4.3.1 leading zero elimination - java

I am working with the basic Knuth 4.3.1 Algorithm M to do arbitrary precision multiplication on the natural numbers. My implementation in Java is below. The problem is that it is generating leading zeroes, seemingly as a side effect of the algorithm not knowing whether a given result has two places or one. For example, 2 x 3 = 6 (one digit), but 4 x 7 = 28 (two digits). The algorithm seems to always reserve two digits which results in leading zeroes.
My question is two-fold: (1) Is my algorithm a correct implementation of M, or am I doing something wrong which is unnecessarily creating leading zeroes, and (2) If it is an unavoidable side effect of M that it produces leading zeroes, then how can we adjust or use an improved algorithm to avoid leading zeroes.
// Knuth M algorithm 4.3.1
final public static void multiplyDecimals( int[] decimalM1, int[] decimalN1, int[] result, int radix ){
Arrays.fill( result, 0 );
int lenM = decimalM1[0];
int lenN = decimalN1[0];
result[0] = lenM + lenN;
int iStepM = lenM;
while( iStepM > 0 ){
int iStepN = lenN;
int iCarry = 0;
while( iStepN > 0 ){
int iPartial = decimalM1[iStepM] * decimalN1[iStepN] + result[iStepM + iStepN] + iCarry;
result[iStepM + iStepN] = iPartial % radix;
iCarry = iPartial / radix;
iStepN--;
}
result[iStepM] = iCarry;
iStepM--;
}
return;
}
Output of the algorithm showing factorials being generated which shows the leading zeroes.
1 01
2 002
3 0006
4 00024
5 000120
6 0000720
7 00005040
8 000040320
9 0000362880
10 000003628800
11 00000039916800
12 0000000479001600
13 000000006227020800
14 00000000087178291200
15 0000000001307674368000
16 000000000020922789888000
17 00000000000355687428096000
18 0000000000006402373705728000
19 000000000000121645100408832000
20 00000000000002432902008176640000

The algorithm isn't allocating any leading zeros at all. You are. You're providing the output array, and filling it with zeros too. Knuth Algorithm M doesn't do that.
In addition:
You should certainly skip all the leading zeros in both numbers. This can have a massive effect on performance, as it's an O(MN) algorithm. The sum of the final M and N is nearly the correct number of output digits; the final step after multiplication is to remove possibly one leading zero.
You can also skip the inner loop if the current M digit is zero. This is Knuth's step M2. Note that a zero digit occurs more frequently in numbers in nature than 1/10: there's a law about this (Benford's law, which strictly speaking concerns leading digits) that says each digit 1,2,3,4,5,6,7,8,9 is successively less likely.
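The points above can be sketched roughly as follows. This is only an illustration, not the asker's code: it assumes digits are stored least-significant first with no length prefix, and the names AlgorithmMSketch, multiply and significantLength are made up for the example.

```java
import java.util.Arrays;

final class AlgorithmMSketch {
    // Digits are stored least-significant first, one digit per int, 0 <= digit < radix.
    // (Assumed layout for this sketch; the question uses a length-prefixed array.)
    static int[] multiply(int[] u, int[] v, int radix) {
        int m = significantLength(u);   // ignore high-order zeros in both inputs
        int n = significantLength(v);
        if (m == 0 || n == 0) return new int[] {0};
        int[] w = new int[m + n];       // worst case: m + n digits, already zeroed
        for (int j = 0; j < n; j++) {
            if (v[j] == 0) continue;    // step M2: a zero multiplier digit adds nothing
            int carry = 0;
            for (int i = 0; i < m; i++) {
                int t = u[i] * v[j] + w[i + j] + carry;
                w[i + j] = t % radix;
                carry = t / radix;
            }
            w[j + m] = carry;           // this position has not been written before
        }
        // With no leading zeros in the inputs, at most one leading zero remains.
        return w[m + n - 1] == 0 ? Arrays.copyOf(w, m + n - 1) : w;
    }

    // Number of digits once high-order zeros are dropped.
    static int significantLength(int[] digits) {
        int len = digits.length;
        while (len > 0 && digits[len - 1] == 0) len--;
        return len;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(multiply(new int[] {4}, new int[] {7}, 10))); // [8, 2]
        System.out.println(Arrays.toString(multiply(new int[] {2}, new int[] {3}, 10))); // [6]
    }
}
```

Trimming once at the end keeps the zeros from cascading when the result is fed back in as the next factor, as happens in the factorial loop.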

Each individual multiplication allocates enough space for the worst case input. Allocating for the worst case is the right thing to do here, because in general you won't know for sure if your result has a leading zero until you've finished doing your multiplication!
To prevent the cascading effect of redundant leading zeros in your question, check for leading zeroes after you have performed the multiplication, and reduce the length accordingly. Note that, if neither of your inputs has any leading zeroes, the result of their multiplication should have no more than one. However, this is not true for, say, subtraction (which can obviously generate as many leading zeroes as you like!).

I figured out how to solve the problem. The program needs to be modified as follows:
final public static int multiplyDecimals( int[] decimalM1, int[] decimalN1, int[] result, int radix ){
Arrays.fill( result, 0 );
int lenM = decimalM1[0];
int lenN = decimalN1[0];
result[0] = lenM + lenN;
int iStepM = lenM;
while( iStepM > 0 ){
int iStepN = lenN;
int iCarry = 0;
while( iStepN > 0 ){
int iPartial = decimalM1[iStepM] * decimalN1[iStepN] + result[iStepM + iStepN] + iCarry;
result[iStepM + iStepN] = iPartial % radix;
iCarry = iPartial / radix;
iStepN--;
}
result[iStepM] = iCarry;
iStepM--;
}
int xFirstDigit = 1;
while( xFirstDigit < result[0] && result[xFirstDigit] == 0 ) xFirstDigit++; // stop at the last digit so a zero product keeps one digit
if( xFirstDigit > 1 ){
int ctDigits = result[0] - xFirstDigit + 1;
for( int xDigit = 1; xDigit <= ctDigits; xDigit++ ) result[xDigit] = result[xDigit + xFirstDigit - 1];
result[0] = ctDigits;
}
return result[0];
}

Related

calculate Check Number in Java

Not sure if anyone can explain this to me or help me.
I have a 15 Digit Number of which I want to multiply each even number by 2 unless the even number is greater than 9. If it is this needs to be subtracted by 9 to give me an integer that again I can multiply by 2. Once I have all the even numbers multiplied by 2 i need to add them all together with the odd numbers.
Does that make sense.
UPDATE ***
so i have a number say 49209999856459. for that number I am looking to get the even integer so for example the first even one would be 4 then the second would be 2 and so on.
If one of those even numbers are multiplied by 2 then it might be above 9 so I want to subtract 9 to then use the remainder as the even number in its place.
SO !!!
Multiply by 2 the value of each even digit starting from index 0 and then each even index. In each case, if the resulting value is greater than 9, subtract 9 from it (which reduces larger values to a single digit). Leave the values of the digits at the odd indexes unchanged.
public String calculateCheckNumber()
String firstFifteen = longNumber.substring(0,15) ;
int i, checkSum = 0, totalSum = 0;
for (i = 0; i<firstFifteen.length(); i += 2) {
while (i < 9)
i *= 2;
if (i > 9)
i -= 9 ;
}
That was one option I was trying, but honestly I can't seem to get my head around it.
Any Help would be greatly appreciated.
Well, here is one approach. This uses the ternary (?:) operator to condense the operations. Edited based on clarification from the OP. The example you gave is actually a 14-digit string, but the following will work with any number of digits if they start out in a string. If you have a long value, then you can create the character array using:
long v = 49209999856459L;
char[] d = Long.toString(v).toCharArray();
Here is the main algorithm.
String s = "49209999856459";
int sum = 0;
char[] d = s.toCharArray();
for (int i = 0; i < d.length; i++) {
int v = d[i] - '0';
// The even digit will only be greater than 9 after
// doubling if it is >= 5 before.
sum += ((i % 2) == 1) ? v : (v >= 5) ? v+v-9 : v+v;
}
System.out.println(sum);
Prints
86

Why is this solution to Reverse Integer (Leet Code) O(log10(n))?

The problem in question asks to reverse a 32-bit signed integer. Here's the given solution in Java:
public int reverse(int x) {
int rev = 0;
while (x != 0) {
int pop = x % 10;
x /= 10;
if (rev > Integer.MAX_VALUE/10 || (rev == Integer.MAX_VALUE / 10 && pop > 7)) return 0;
if (rev < Integer.MIN_VALUE/10 || (rev == Integer.MIN_VALUE / 10 && pop < -8)) return 0;
rev = rev * 10 + pop;
}
return rev;
}
According to the solution's explanation, its time complexity is O(log10(n)) because there are roughly log10(x) digits in x. Intuitively, there seem to be n-1 iterations of the while loop, where n is the number of digits (i.e., a 7-digit number requires 6 iterations). However, the solution and given complexity imply that n is the integer itself and not the number of digits. Can anyone help me gain an intuitive understanding of why the above solution is O(log10(n))?
If x is a positive integer, then floor(log10(x)) + 1 is equal to the number of digits in x.
Let log(10)x = some number y. Then 10^y = x.
For example,
log(10) 10 = 1
log(10) 100 = 2
log(10) 1000 = 3
...
When x is not a perfect power of 10:
floor( log(10) 213 ) = 2, so 213 has 2 + 1 = 3 digits
Let me know if this doesn't answer your question.
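The relationship can be checked with a small sketch. The class and method names below are made up for illustration: one method counts digits by repeated division, the same way the loop in reverse() consumes them, and the other uses the logarithm formula.

```java
final class DigitCount {
    // Counts decimal digits by repeated division by 10, one loop pass per digit.
    static int byDivision(int x) {
        int count = 0;
        while (x != 0) {
            x /= 10;
            count++;
        }
        return count;
    }

    // Same count from the formula floor(log10(x)) + 1, valid for x > 0.
    static int byLog(int x) {
        return (int) Math.floor(Math.log10(x)) + 1;
    }

    public static void main(String[] args) {
        int[] samples = {7, 10, 213, 1000, 1234567};
        for (int x : samples) {
            System.out.println(x + ": " + byDivision(x) + " passes, formula gives " + byLog(x));
        }
    }
}
```

Both methods agree for positive inputs, which is why the number of loop passes is described as roughly log10(x).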
Let's say the x = 123.
int rev = 0;
rev = rev * 10 + x % 10; // rev = 3, 1st iteration.
x = x/10; // x = 12
rev = rev * 10 + x % 10; // rev = 3 * 10 + 2 = 32, 2nd iteration
x = x/10; // x = 1
rev = rev * 10 + x % 10; // rev = 32 * 10 + 1 = 321, 3rd iteration.
x = 0 so the loop terminates after 3 iterations for 3 digits.
The conditionals within the loop check to see if the reversed values would exceed what a 32 bit number could hold.
So it is log10(n) exactly for the reason you stated in your question. The log of a number n to a given base is the exponent required to raise the base back to the number n. And the exponent is an approximation of the number of digits in the number.
Based on your comment, it could also have been stated that "For any number n, where m is the number of digits in n, the time complexity is O(m)."
The given reverse algorithm requires, in the worst case, about log_10(x) iterations. In other words, if the given input x consists of k decimal digits, it requires k iterations. But stating that this algorithm is O(log_10(x)) is misleading: this is not a logarithmic algorithm. When the input size is not intuitive (for example, when testing whether a given integer is prime), we need to rigorously apply the correct definition of input size. In Big O analysis, the input size is defined as the number of characters it takes to write the input. Since we normally encode integers in binary digits, the input size n of this algorithm is approximately log_2(x). Therefore, x is roughly 2^n. The worst-case complexity is W(x) = log_10(x) = log_10(2^n) = n log_10(2). Therefore, the big O of the reverse algorithm is O(n).

Improve performance of string to binary number conversion

This is one of the questions that I faced in competitive programming.
Ques) You have an input String in binary format, e.g. 11100, and you need to count the number of steps it takes to reduce the number to zero. If the number is odd -> subtract 1, if even -> divide it by 2.
For example
28 -> 28/2
14 -> 14/2
7 -> 7-1
6 -> 6/2
3 -> 3-1
2 -> 2/2
1-> 1-1
0 -> STOP
Number of steps =7
I came up with the following solutions
public int solution(String S) {
// write your code in Java SE 8
String parsableString = cleanString(S);
int integer = Integer.parseInt(S, 2);
return stepCounter(integer);
}
private static String cleanString(String S){
int i = 0;
while (i < S.length() && S.charAt(i) == '0')
i++;
StringBuffer sb = new StringBuffer(S);
sb.replace(0,i,"");
return sb.toString();
}
private static int stepCounter(int integer) {
int counter = 0;
while (integer > 0) {
if (integer == 0)
break;
else {
counter++;
if (integer % 2 == 0)
integer = integer / 2;
else
integer--;
}
}
return counter;
}
The solution to this question looks quite simple and straightforward, however the performance evaluation of this code got me a big ZERO. My initial impressions were that converting the string to int was a bottleneck but failed to find a better solution for this. Can anybody please point out to me the bottlenecks of this code and where it can be significantly improved ?
If a binary number is odd, the last (least significant) digit must be 1, so subtracting 1 is just changing the last digit from 1 to 0 (which, importantly, makes the number even).
If a binary number is even, the last digit must be 0, and dividing by 2 can be accomplished by simply removing that last 0 entirely. (Just like in base ten, the number 10 can be divided by ten by taking away the last 0, leaving 1.)
So the number of steps is two steps for every 1 digit, and one step for every 0 digit -- minus 1, because when you get to the last 0, you don't divide by 2 any more, you just stop.
Here's a simple JavaScript (instead of Java) solution:
let n = '11100';
n.length + n.replace(/0/g, '').length - 1;
With just a little more work, this can deal with leading zeros '0011100' properly too, if that were needed.
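Since the question is about Java, the same counting idea might look like the sketch below. The class and method names are made up for illustration, and leading zeros are skipped before counting. It never converts the string to an int, so the input length is not limited to 32 bits.

```java
final class BinarySteps {
    // Steps to reduce a binary string to zero: every significant digit except
    // the last remaining one costs a division, and every '1' costs a subtraction.
    static int countSteps(String s) {
        int first = s.indexOf('1');
        if (first < 0) return 0;               // all zeros (or empty): already zero
        int ones = 0;
        for (int i = first; i < s.length(); i++) {
            if (s.charAt(i) == '1') ones++;
        }
        int significantDigits = s.length() - first;
        return significantDigits + ones - 1;
    }

    public static void main(String[] args) {
        System.out.println(countSteps("11100"));   // 7
        System.out.println(countSteps("0011100")); // 7
    }
}
```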
Number of times you need to subtract is the number of one bits which is Integer.bitCount(). Number of times you need to divide is the position of most-significant bit which is Integer.SIZE (32, total number of bits in integer) minus Integer.numberOfLeadingZeros() minus one (you don't need to divide 1). For zero input I assume, the result should be zero. So we have
int numberOfOperations = integer == 0 ? 0 : Integer.bitCount(integer) +
Integer.SIZE - Integer.numberOfLeadingZeros(integer) - 1;
Per the given condition, we divide the number by 2 if it is even, which is equivalent to removing the LSB, and we subtract 1 if it is odd, which makes it even and is equivalent to unsetting the set bit (changing 1 to 0). So the total number of steps is the number of bits, floor(log2(n)) + 1, plus the number of set bits, minus 1 (the final 0 need not be removed).
C++ code (for n > 0; the cast truncates log2(n) to its integer part):
result = __builtin_popcount(n) + (int)log2(n) + 1 - 1;
result = __builtin_popcount(n) + (int)log2(n);

How to find the next lower integer (with the same number of 1s)

How to find the next lower binary number for an integer (same number of 1s)? For example: if given input number n = 10 (1010), the function should return 9 (1001), or n = 14 (1110) then return 13 (1101), or n = 22 (10110) then return 21 (10101), n = 25 (11001) then return 22 (10110)... etc.
You can do this.
static int nextLower(int n) {
int bc = Integer.bitCount(n);
for (int i = n - 1; i > 0; i--)
if (Integer.bitCount(i) == bc)
return i;
throw new RuntimeException(n+" is the lowest with a bit count of "+bc);
}
Of course if this is homework you are going to have trouble convincing someone you wrote this ;)
For the sake of clarity, in this answer I will use the term 'cardinality' to indicate the number of 1s in the binary representation of a number.
One (obvious) way is to run a downwards loop, and seek for the first number with the same cardinality as your input (just like Peter Lawrey suggested).
I don't think this is inefficient, because I guess the output number is always pretty close to the input. More precisely, all you have to do is to find the rightmost '10' bit sequence, and change it to '01'. Then replace the right part with a number having all 1s at its left, as many as you can, without breaking the postcondition. This brings us to another solution, which consists in converting the number to a binary string (like user2573153 showed you), performing the replacement (with a regular expression, maybe), and then converting back to int.
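The string-based variant described above could be sketched like this. It is an illustration that uses String.lastIndexOf rather than a regular expression, and the class name is made up. The bits to the right of the rightmost "10" are rearranged so that all of their 1s come first.

```java
final class NextLowerByString {
    static int nextLower(int n) {
        if (n <= 0) throw new IllegalArgumentException("n must be positive");
        String s = Integer.toBinaryString(n);
        int p = s.lastIndexOf("10");            // rightmost '10' pair
        if (p < 0) {
            throw new IllegalArgumentException(n + " is the lowest number with its cardinality");
        }
        String tail = s.substring(p + 2);       // everything right of the pair
        int ones = tail.length() - tail.replace("1", "").length();
        StringBuilder sb = new StringBuilder(s.substring(0, p)).append("01");
        for (int i = 0; i < tail.length(); i++) {
            sb.append(i < ones ? '1' : '0');    // 1s pushed left, 0s right
        }
        return Integer.parseInt(sb.toString(), 2);
    }

    public static void main(String[] args) {
        System.out.println(nextLower(10)); // 9
        System.out.println(nextLower(25)); // 22
    }
}
```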
A slightly faster version of Peter's algorithm should be the following, which performs on integers the manipulation I proposed for strings:
static int nextLower(int n) {
int fixPart = 0;
int shiftCount = 0;
while ((n & 3) != 2) {
if (n == 0) {
throw new IllegalArgumentException(
fixPart + " is the lowest number with its cardinality");
}
fixPart |= (n & 1) << shiftCount;
shiftCount += 1;
n /= 2;
}
int fixZeros = shiftCount - Integer.bitCount(fixPart);
return ((n ^ 3) << shiftCount) | (((1 << shiftCount) - 1) & ~((1 << fixZeros) - 1));
}
which is O(log n) rather than O(n), but it's definitely harder to understand, and may also be practically slower, due to its complexity. Anyway, you could only notice a difference if you try with some huge difficult number.
EDIT I tried a little benchmark, and found that this code is 67% faster than Peter Lawrey's when applied consecutively to all numbers from 2 to 100,000,000. I don't think this is enough to justify the increased code complexity.
I like this kind of binary task. To find the next lower number, find the rightmost 1 that is followed by a 0 and exchange them. UPDATE: you also need to "reorder" the remaining part of the number, with 1s at the left and 0s at the right.
10 1010 ->
9 1001
14 1110 ->
13 1101
25 11001 ->
22 10110
here is sample code:
int originalValue = 25;
int maskToCheck = 2; // in binary 10b
int clearingMask = 1;
int settingMask = 0;
int zeroCount = 0;
while (maskToCheck > 0)
{
if ( (originalValue&(maskToCheck|(maskToCheck>>1))) == maskToCheck ) // we found such
{
int newValue = originalValue&(~maskToCheck); // set 1 with 0
newValue = newValue&(~clearingMask)|(settingMask<<zeroCount); // clear all the rest bits, and set most valuable ones
newValue = newValue|(maskToCheck>>1); // set 0 with 1
System.out.println("for " + originalValue + " we found " + newValue);
break;
}
else
{
if ( (originalValue&(maskToCheck>>1)) > 0) // we have 1 bit in cleared part
settingMask = (settingMask<<1) | 1;
else
zeroCount++;
maskToCheck = maskToCheck<<1; // try next left bits
clearingMask = (clearingMask<<1)|1;
}
}

Calculating Extremely Large Powers of 2

I have made a program in Java that calculates powers of two, but it seems very inefficient. For smaller powers (2^4000, say), it does it in less than a second. However, I am looking at calculating 2^43112609, which is one greater than the largest known prime number. With over 12 million digits, it will take a very long time to run. Here's my code so far:
import java.io.*;
public class Power
{
private static byte x = 2;
private static int y = 43112609;
private static byte[] a = {x};
private static byte[] b = {1};
private static byte[] product;
private static int size = 2;
private static int prev = 1;
private static int count = 0;
private static int delay = 0;
public static void main(String[] args) throws IOException
{
File f = new File("number.txt");
FileOutputStream output = new FileOutputStream(f);
for (int z = 0; z < y; z++)
{
product = new byte[size];
for (int i = 0; i < a.length; i++)
{
for (int j = 0; j < b.length; j++)
{
product[i+j] += (byte) (a[i] * b[j]);
checkPlaceValue(i + j);
}
}
b = product;
for (int i = product.length - 1; i > product.length - 2; i--)
{
if (product[i] != 0)
{
size++;
if (delay >= 500)
{
delay = 0;
System.out.print(".");
}
delay++;
}
}
}
String str = "";
for (int i = (product[product.length-1] == 0) ?
product.length - 2 : product.length - 1; i >= 0; i--)
{
System.out.print(product[i]);
str += product[i];
}
output.write(str.getBytes());
output.flush();
output.close();
System.out.println();
}
public static void checkPlaceValue(int placeValue)
{
if (product[placeValue] > 9)
{
byte remainder = (byte) (product[placeValue] / 10);
product[placeValue] -= 10 * remainder;
product[placeValue + 1] += remainder;
checkPlaceValue(placeValue + 1);
}
}
}
This isn't for a school project or anything; just for the fun of it. Any help as to how to make this more efficient would be appreciated! Thanks!
Kyle
P.S. I failed to mention that the output should be in base-10, not binary.
The key here is to notice that:
2^2 = 4
2^4 = (2^2)*(2^2)
2^8 = (2^4)*(2^4)
2^16 = (2^8)*(2^8)
2^32 = (2^16)*(2^16)
2^64 = (2^32)*(2^32)
2^128 = (2^64)*(2^64)
... and in total of 25 steps ...
2^33554432 = (2^16777216)*(2^16777216)
Then since:
2^43112609 = (2^33554432) * (2^9558177)
you can find the remaining (2^9558177) using the same method, and since (2^9558177 = 2^8388608 * 2^1169569), you can find 2^1169569 using the same method, and since (2^1169569 = 2^1048576 * 2^120993), you can find 2^120993 using the same method, and so on...
EDIT: previously there was a mistake in this section, now it's fixed:
Also, further simplification and optimization by noticing that:
2^43112609 = 2^(0b10100100011101100010100001)
2^43112609 =
(2^(1*33554432))
* (2^(0*16777216))
* (2^(1*8388608))
* (2^(0*4194304))
* (2^(0*2097152))
* (2^(1*1048576))
* (2^(0*524288))
* (2^(0*262144))
* (2^(0*131072))
* (2^(1*65536))
* (2^(1*32768))
* (2^(1*16384))
* (2^(0*8192))
* (2^(1*4096))
* (2^(1*2048))
* (2^(0*1024))
* (2^(0*512))
* (2^(0*256))
* (2^(1*128))
* (2^(0*64))
* (2^(1*32))
* (2^(0*16))
* (2^(0*8))
* (2^(0*4))
* (2^(0*2))
* (2^(1*1))
Also note that 2^(0*n) = 2^0 = 1
Using this algorithm, you can calculate the table of 2^1, 2^2, 2^4, 2^8, 2^16 ... 2^33554432 in 25 multiplications. Then you can convert 43112609 into its binary representation, and easily find 2^43112609 using less than 25 multiplications. In total, you need to use less than 50 multiplications to find any 2^n where n is between 0 and 67108864.
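A minimal sketch of this idea, assuming java.math.BigInteger does the actual multiplications (the answer does not say which big-number type to use, and the class name is made up). Instead of storing the whole table, it keeps only the current square and multiplies it into the result whenever the matching bit of the exponent is 1.

```java
import java.math.BigInteger;

final class PowerBySquaring {
    // Computes base^exponent by walking the bits of the exponent from low to high.
    static BigInteger pow(BigInteger base, int exponent) {
        if (exponent < 0) throw new IllegalArgumentException("negative exponent");
        BigInteger result = BigInteger.ONE;
        BigInteger square = base;               // base^(2^k) for the current bit k
        while (exponent > 0) {
            if ((exponent & 1) == 1) {
                result = result.multiply(square);   // this bit is 1: include base^(2^k)
            }
            exponent >>= 1;
            if (exponent > 0) {
                square = square.multiply(square);   // move on to base^(2^(k+1))
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pow(BigInteger.valueOf(2), 64)); // 18446744073709551616
    }
}
```

For a base of 2 this is only a demonstration of the technique, since a single shift produces the same binary value directly.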
Displaying it in binary is easy and fast - as quickly as you can write to disk! 100000...... :D
Let n = 43112609.
Assumption: You want to print 2^n in decimal.
While filling a bit vector that represents 2^n in binary is trivial, converting that number to decimal notation will take a while. For instance, the implementation of java.math.BigInteger.toString takes O(n^2) operations. And that's probably why
BigInteger.ONE.shiftLeft(43112609).toString()
still hasn't terminated after an hour of execution time ...
Let's start with an asymptotic analysis of your algorithm. Your outer loop will execute n times. For each iteration, you'll do another O(n^2) operations. That is, your algorithm is O(n^3), so poor scalability is expected.
You can reduce this to O(n^2 log n) by making use of
x^64 = x^(2*2*2*2*2*2) = ((((((x^2)^2)^2)^2)^2)^2
(which requires only 6 multiplications) rather than the 63 multiplications of
x^64 = x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x*x
(Generalizing to arbitrary exponents is left as exercise for you. Hint: Write the exponent as binary number - or look at Lie Ryan's answer).
For speeding up multiplication, you might employ the Karatsuba Algorithm, reducing the overall runtime to O(n^((log 3)/(log 2)) log n).
As mentioned, powers of two correspond to binary digits. Binary is base 2, so each digit is double the value of the previous one.
For example:
1 = 2^0 = b1
2 = 2^1 = b10
4 = 2^2 = b100
8 = 2^3 = b1000
...
The shift operator ('<<' in most languages) shifts each binary digit to the left, and each shift is equivalent to multiplying by two.
For example:
1 << 6 = 2^6 = 64
Being such a simple binary operation, most processors can do this extremely quickly for numbers which can fit in a register (8 - 64 bits, depending on the processor). Doing it with larger numbers requires some type of abstraction (Bignum for example), but it still should be an extremely quick operation. Nevertheless, doing it to 43112609 bits will take a little work.
To give you a little context, 2 << 4311260 (missing the last digit) is 1297819 digits long. Make sure you have enough RAM to handle the output number; if you don't, your computer will be swapping to disk, which will cripple your execution speed.
Since the program is so simple, also consider switching to a language which compiles directly into assembly, such as C.
In truth, generating the value is trivial (we already know the answer, a one followed by 43112609 zeros). It will take quite a bit longer to convert it into decimal.
As @John SMith suggests, you can try it, e.g. for 2^4000:
System.out.println(new BigInteger("1").shiftLeft(4000));
EDIT: Turning a binary into a decimal is an O(n^2) problem. When you double the number of bits, you double the length of each operation and you double the number of digits produced.
2^100,000 takes 0.166 s
2^1,000,000 takes 11.7 s
2^10,000,000 should take 1200 seconds.
NOTE: The time taken is entirely in the toString(), not the shiftLeft, which takes < 1 ms even for 10 million.
The other key to notice is that your CPU is much faster at multiplying ints and longs than you are by doing long multiplication in Java. Get that number split up into long (64-bit) chunks, and multiply and carry the chunks instead of individual digits. Coupled with the previous answer (using squaring instead of sequential multiplication by 2), this will probably speed it up by a factor of 100x or more.
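As a rough illustration of the chunking idea only (it does not include the squaring part), the sketch below keeps nine decimal digits per int and uses a long for the intermediate product. The base of 10^9, the 29-bit step and all names are assumptions chosen for the example.

```java
import java.util.Arrays;

final class ChunkedDecimal {
    static final int BASE = 1_000_000_000;      // nine decimal digits per limb

    // Multiplies a little-endian base-10^9 number by a small positive factor.
    static int[] multiplySmall(int[] limbs, int factor) {
        int[] out = new int[limbs.length + 1];
        long carry = 0;
        for (int i = 0; i < limbs.length; i++) {
            long t = (long) limbs[i] * factor + carry;  // fits easily in a long
            out[i] = (int) (t % BASE);
            carry = t / BASE;
        }
        out[limbs.length] = (int) carry;
        return carry == 0 ? Arrays.copyOf(out, limbs.length) : out;
    }

    // Top limb is printed as-is, lower limbs are zero-padded to nine digits.
    static String toDecimalString(int[] limbs) {
        StringBuilder sb = new StringBuilder(Integer.toString(limbs[limbs.length - 1]));
        for (int i = limbs.length - 2; i >= 0; i--) {
            sb.append(String.format("%09d", limbs[i]));
        }
        return sb.toString();
    }

    // Builds 2^exponent by multiplying in up to 29 bits at a time.
    static String powerOfTwo(int exponent) {
        if (exponent < 0) throw new IllegalArgumentException("negative exponent");
        int[] value = {1};
        int remaining = exponent;
        for (; remaining >= 29; remaining -= 29) {
            value = multiplySmall(value, 1 << 29);
        }
        value = multiplySmall(value, 1 << remaining);
        return toDecimalString(value);
    }

    public static void main(String[] args) {
        System.out.println(powerOfTwo(64)); // 18446744073709551616
    }
}
```

Each pass here handles nine digits and 29 doublings at once, but the overall work is still quadratic in the exponent.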
Edit
I attempted to write a chunking and squaring method and it runs slightly slower than BigInteger (13.5 seconds vs 11.5 seconds to calculate 2^524288). After doing some timings and experiments, the fastest method seems to be repeated squaring with the BigInteger class:
public static String pow3(int n) {
BigInteger bigint = new BigInteger("2");
while (n > 1) {
bigint = bigint.pow(2);
n /= 2;
}
return bigint.toString();
}
Some timing results for power of 2 exponents (2^(2^n) for some n)
131072 - 0.83 seconds
262144 - 3.02 seconds
524288 - 11.75 seconds
1048576 - 49.66 seconds
At this rate of growth, it would take approximately 77 hours to calculate 2^33554432, let alone the time storing and adding all the powers together to make the final result of 2^43112609.
Edit 2
Actually, for really large exponents, the BigInteger.shiftLeft method is the fastest. I estimate that for 2^33554432 with shiftLeft, it would take approximately 28-30 hours. I wonder how fast a C or Assembly version would be...
Because one actually wants all the digits of the result (unlike, e.g. RSA, where one is only interested in the residue mod a number that's much smaller than the numbers we have here) I think the best approach is probably to extract nine decimal digits at once using long division implemented using multiplication. Start with residue equal zero, and apply the following to each 32 bits in turn (MSB first)
residue = (residue SHL 32)+data
result = 0
temp = (residue >> 30)
temp += (temp*316718722) >> 32
result += temp;
residue -= temp * 1000000000;
while (residue >= 1000000000) /* I don't think this loop ever runs more than twice */
{
result ++;
residue -= 1000000000;
}
Then store the result in the 32 bits just read, and loop through each lower word. The residue after the last step will be the nine bottom decimal digits of the result. Since the computation of a power of two in binary will be quick and easy, I think dividing out to convert to decimal may be the best approach.
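In Java the same word-by-word division could be sketched as below. Note the swap: this version uses plain long division (/ and %) instead of the multiply-and-correct trick in the pseudocode above, because the intermediate value always fits in a signed 64-bit long. The class name and the most-significant-first word order are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class BinaryToDecimal {
    static final long CHUNK = 1_000_000_000L;   // nine decimal digits at a time

    // words: unsigned 32-bit words, most significant first.
    static String toDecimal(int[] words) {
        int[] w = words.clone();                // the division overwrites the words
        List<Long> chunks = new ArrayList<>();  // nine-digit groups, lowest first
        int start = 0;
        while (true) {
            while (start < w.length && w[start] == 0) start++;  // skip words that are now zero
            if (start == w.length) break;
            long residue = 0;
            for (int i = start; i < w.length; i++) {
                long cur = (residue << 32) | (w[i] & 0xFFFFFFFFL); // below 10^9 * 2^32
                w[i] = (int) (cur / CHUNK);     // quotient word, always below 2^32
                residue = cur % CHUNK;
            }
            chunks.add(residue);                // remainder of the whole number mod 10^9
        }
        if (chunks.isEmpty()) return "0";
        StringBuilder sb = new StringBuilder(Long.toString(chunks.get(chunks.size() - 1)));
        for (int i = chunks.size() - 2; i >= 0; i--) {
            sb.append(String.format("%09d", chunks.get(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toDecimal(new int[] {1, 0})); // 4294967296, i.e. 2^32
    }
}
```

Each pass over the words yields the next nine decimal digits, so the total work is still quadratic in the length of the number.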
BTW, this computes 2^640000 in about 15 seconds in vb.net, so 2^43112609 should take about five hours to compute all 12,978,189 digits.
