Spigot Algorithms
Introduction
Recently I was talking with my girlfriend about how her unit calculator tool, Rink, calculates $\pi$. This reminded me of some algorithms that are not very well known but are very interesting, so I decided to look for more information about them, but I couldn't find much apart from a section in a Wikipedia article and a few research papers. So I'm writing this post to condense all the information I could find or derive myself, explaining everything as thoroughly as is reasonable in the hope it will be interesting.
There are many widely known methods for approximating $\pi$ and other important irrational numbers with infinite series, continued fractions or other procedures involving limits. The following alternating series was independently discovered by Madhava of Sangamagrama in the 14th century, James Gregory in 1671 and Gottfried Leibniz in 1673 and is known as the Leibniz formula for $\pi$:

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots = \sum_{k=0}^{\infty} \frac{(-1)^{k}}{2k+1}$$
How good is this formula for approximating $\pi$? Sadly, not very good, as it needs five thousand terms just to get four decimal places and literally billions of terms for ten decimal places. In contrast, Ramanujan's formula for $\pi$ converges extremely fast:

$$\frac{1}{\pi} = \frac{2\sqrt{2}}{9801} \sum_{k=0}^{\infty} \frac{(4k)!\,(1103 + 26390k)}{(k!)^{4}\,396^{4k}}$$
Indeed, it is so fast that it only needs one term to calculate $\pi$ up to six decimal places and two terms for fifteen. This formula is an example of a Ramanujan–Sato series, which are the basis of the Chudnovsky algorithm, an extraordinarily fast algorithm for calculating $\pi$ that has been used for many record-breaking computations, up to 202 trillion digits in 2024 using y-cruncher, a very well optimized program that computes $\pi$ and other constants.
Although these series converge really fast, they all suffer from two problems:
- They require floating point numbers. Floating point operations suffer from precision issues and it can be hard to predict how much precision we need to get a specific number of digits;
- You need to compute all the earlier digits before computing some later digit. It is not possible to extract the billionth digit of $\pi$ using a traditional algorithm without computing and storing the 999,999,999 preceding digits.
The second problem may sound very esoteric: you may never have considered that it is even possible to extract an arbitrary digit of an irrational number without calculating the previous digits, but I will soon show how this is, very surprisingly, actually possible. As for the first problem, there is a way to compute $\pi$ digit by digit without any floating point operations, but it won't be faster than an algorithm like Chudnovsky's due to slower convergence.
We call algorithms that approximate a number digit by digit spigot algorithms, and if the algorithm also allows a digit to be calculated without first calculating the earlier digits it is a digit extraction algorithm. Before presenting a spigot algorithm for $\pi$ I will first use Euler's number as an example of how we can construct such algorithms, since it is easier than the algorithm for $\pi$.
The Idea
Euler's number is defined as
Euler proved that

$$e = \sum_{n=0}^{\infty} \frac{1}{n!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots$$
and we can rewrite this as

$$e = 2 + \frac{1}{2}\left(1 + \frac{1}{3}\left(1 + \frac{1}{4}\left(1 + \frac{1}{5}\left(1 + \cdots\right)\right)\right)\right)$$
We know that $2$ is the integer part of $e$, so of course the remaining expression is its fractional part, which means we can multiply it by $10$ to shift the next digit into the integer part. For this example it is enough to use only six terms:
Recall that for any integers $a$ and $m > 0$ we can write $\frac{a}{m} = q + \frac{r}{m}$, where $q$ is the quotient and $r$ the remainder of dividing $a$ by $m$, so we can apply this to all the fractions from right to left like this:
and, indeed, $7$ is the next digit of $e$. You can repeat this procedure as many times as you want and it will keep producing more digits as long as you use enough terms. In general, if we have a number defined by a recursive sum of the form
we can apply this method to produce its digits one by one: the leading term is always its integer part and the rest of the recursive sum is its fractional part. So, by induction, if we apply the procedure above we will get the next digit, because the integer part of the resulting expression is that digit and the rest of the expression is the fractional part, whose first digit is of course the second digit of the original number. So we can keep applying this procedure to generate the number digit by digit. Hopefully you can see this pattern by looking at the example. And as it turns out, computers are very good at patterns!
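To see the carrying in action, here is the six-term computation written out (my own rendering of the step above): every digit is $1$ with radices $2, 3, 4, 5, 6, 7$; we multiply each digit by $10$ and reduce from right to left, passing each quotient to the left as a carry.

$$
\begin{aligned}
10 &= 1 \cdot 7 + 3 &&\Rightarrow \text{digit } 3,\ \text{carry } 1\\
10 + 1 &= 1 \cdot 6 + 5 &&\Rightarrow \text{digit } 5,\ \text{carry } 1\\
10 + 1 &= 2 \cdot 5 + 1 &&\Rightarrow \text{digit } 1,\ \text{carry } 2\\
10 + 2 &= 3 \cdot 4 + 0 &&\Rightarrow \text{digit } 0,\ \text{carry } 3\\
10 + 3 &= 4 \cdot 3 + 1 &&\Rightarrow \text{digit } 1,\ \text{carry } 4\\
10 + 4 &= 7 \cdot 2 + 0 &&\Rightarrow \text{digit } 0,\ \text{carry } 7
\end{aligned}
$$

The final carry, $7$, is the digit that pops out.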
However, this is not necessarily true for the most general case when the recursive sum is not of the above form. In a sum with different fractions it is possible that the sum after the leading term is not always the fractional part. For example, a similar formula for $\pi$ is

$$\pi = 2 + \frac{1}{3}\left(2 + \frac{2}{5}\left(2 + \frac{3}{7}\left(2 + \frac{4}{9}\left(2 + \cdots\right)\right)\right)\right)$$

and of course $2$ is not the integer part of $\pi$, so the rest of the sum still contributes to the integer part. In fact, even after we have produced some number of digits, the remaining sum can still carry into the last digit we produced. This is the complication I mentioned in the Introduction section and we will see how to deal with it later.
In the next section I will formalize the idea presented above and prove that it works as expected for sums of the form (*).
Generalizing Number Bases
We are all used to numbers written in base 10, and I believe many of you are also fluent in base 2 and base 16. In general, for an integer $b \geq 2$ we have a positional numeral system of base $b$ where a string of digits represents the number
and when every digit is smaller than $b$ we say that the representation is regular. This representation does not need to be unique. To see this, let's write $1$ as
which follows from the definition given above. We also have the inequality
Since , by the squeeze theorem
Therefore $0.999\ldots = 1$. And the same is true for any other base: $0.111\ldots = 1$ in binary and $0.\mathrm{FFF}\ldots = 1$ in hexadecimal.
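If you prefer a one-line computation (my own addition), the geometric series gives, for any base $b$,

$$\sum_{k=1}^{\infty} \frac{b - 1}{b^{k}} = (b - 1) \cdot \frac{1/b}{1 - 1/b} = 1.$$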
Rewriting (4) as
gives us a way to generalize numeral systems. Instead of building the base from a single fixed $b$, we can use a possibly different number for each position in the above expression, giving us
where the radices are all possibly different numbers, maybe even fractions. We write this base as the sequence of its radices, and we're going to omit the integer part when it is in decimal
So we can use the base , that is, the radices are all the natural numbers greater than , to write (2) as . In particular, if we set to and to we can write Euler's number in base as . We have created a numeral system where an irrational number can be periodic! We could even write Euler's number as if the context is clear.
Georg Cantor was the first to formalize and generalize mixed-radix numeral systems as we just defined them, publishing his work in Über einfache Zahlensysteme in the Zeitschrift für Math journal. Remember that these numeral systems are nothing but a way to write a number, and as we saw earlier numbers don't even need to have a unique representation, so this is a perfectly valid way of representing a number.
The same way the representation of a number in some fixed natural radix is regular when all the digits are smaller than the base, in mixed radix a representation is regular when each digit is smaller than the corresponding numerator. So if we now look at (1) we can see that multiplication by an integer in mixed radix works the same way as in a fixed base: we multiply the right-most digit, keep the remainder modulo that numerator as the new digit, carry the quotient, add it to the next digit and repeat for the remaining digits.
For example, this is just how multiplication with carrying works in base 10: we multiply the right-most digit, write down the remainder modulo 10, carry the quotient to the next digit, and keep doing this until we reach the integer part, leaving a new fractional part. So the manipulations we did in (1) are equivalent to multiplying by $10$ in base ! This becomes even clearer when we write the computation in base 10 as
I'm not going to develop mixed-radix systems any further because we already have all we need to continue.
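Still, as a small illustration (my own sketch, not taken from any of the papers discussed here), this is what the multiply-and-carry step looks like in C for a mixed radix whose carries are plain integers, the way base ! behaves; the function name and array layout are just for this example:

#include <stdio.h>

// Multiply a regular mixed-radix fraction by the integer x, in place.
// digits[i] is the digit whose radix is radices[i]; position 0 is the
// left-most (most significant) fractional digit. The returned value is
// the carry that overflows past the radix point (for base ! with x = 10
// this is exactly the next decimal digit).
static int mul_mixed_radix(int *digits, const int *radices, int len, int x)
{
    int carry = 0;
    for (int i = len - 1; i >= 0; --i) {
        const int d = digits[i] * x + carry;
        digits[i] = d % radices[i];
        carry = d / radices[i];
    }
    return carry;
}

int main(void)
{
    // The fractional part of e in base !, truncated to six digits.
    int digits[] = { 1, 1, 1, 1, 1, 1 };
    const int radices[] = { 2, 3, 4, 5, 6, 7 };
    printf("%d\n", mul_mixed_radix(digits, radices, 6, 10)); // prints 7
    return 0;
}

For a base whose radices are fractions, like the base we will use for $\pi$, the carry additionally has to be scaled by the numerator of the radix before it is added to the next digit, which is exactly what the $\pi$ code later in the post does.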
The Algorithm for e
As we discussed, the only problem with our main idea is that the rest of the recursive sum might not actually be the fractional part of the number in question: it might be the case that the sum after the leading term in the first expression of (2) is greater than $1$. What this means in terms of the base is that a number might not be less than $1$ in decimal even though it has integer part $0$ in that base.
So we want to know when this can happen. This is a special case of the following lemma.
Lemma 1. If then
Proof. This is equivalent to showing that
so we let them all take their maximum values and we get
Since
the series is absolutely convergent by the ratio test and we are allowed to rearrange it:
The fact that completes the proof.
Corollary 1.1. if and only if for all , and in this case the integer part of is , otherwise the integer part is . The fractional part is never greater than .
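For base ! specifically, the relevant bound comes down to a telescoping sum (my own way of phrasing it): with every fractional digit at its maximum value, the number represented is

$$\sum_{i=2}^{\infty} \frac{i - 1}{i!} = \sum_{i=2}^{\infty}\left(\frac{1}{(i - 1)!} - \frac{1}{i!}\right) = 1,$$

so a regular fractional part in base ! never exceeds $1$, and it only reaches $1$ when every digit is maximal.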
The above corollary implies that if we choose some amount of precision to compute $e$ like we did in the introduction section, which is equivalent to setting all digits past some position to $0$, and apply the procedure, we will never get an integer greater than $9$. The only question left is how many digits in base ! we actually need. It turns out that $n + 2$ digits are enough, and the proof of this fact is in the proof of the correctness of the following algorithm.
Algorithm A. (Generate the digits of Euler's number in base 10.) Given the number of desired digits $n$, this algorithm outputs the first $n$ digits of $e$.
- Step 1: Set 2.
- Step 2: Set .
- Step 3: Set to 1 for .
- Step 4: Set to .
- Step 5: Output .
- Step 6: Set for
- Step 7: If the algorithm terminates.
- Step 8: Set .
- Step 9: If set and go to step 5.
- Step 10: Divide by and let be the quotient and the remainder.
- Step 11: Set .
- Step 12: Set
- Step 13: Set .
- Step 14: Go to step 7.
Theorem 1. With input $n$, Algorithm A produces $n$ digits of $e$ in $O(n^2)$ time and $O(n)$ space.
Proof. The algorithm maintains an array of the digits of the fractional part in base ! and does exactly what we did in (2). Lemma 1 shows that the value in Step 4 will never be greater than $9$. So all we have to do now is show that $n + 2$ digits in base ! are enough precision to get $n$ digits in base 10.
First we will prove that for large enough $n$, $n + 2$ digits of precision are enough. So we need to show that the error from truncating the series after that many terms is small enough:
Stirling's approximation says that
so
By taking the natural logarithm of
we get
and taking the exponential we get
So the bound holds for every sufficiently large $n$. This means that $n + 2$ digits in base ! are enough for $n$ digits in decimal when $n$ is large, and we can manually verify that this is also enough for the remaining small values of $n$. We have to keep in mind that we are counting the integer part, which is already in decimal, so we just print it directly in the algorithm. This is why we use an array of length $n + 2$ to store the digits in base !.
The fact that the algorithm uses $O(n)$ space is trivial given that we always use an array of length $n + 2$. The initialization steps take $O(n)$ time and each pass through Steps 5 to 14 takes $O(n)$ time, so the total running time is $O(n^2)$.
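To make the precision estimate concrete, here is one way to bound the truncation error (a sketch of my own, not verbatim from the proof above): dropping every term after $1/m!$ loses at most

$$\sum_{k=m+1}^{\infty} \frac{1}{k!} \;<\; \frac{1}{m!} \sum_{j=1}^{\infty} \frac{1}{(m+1)^{j}} \;=\; \frac{1}{m!\,m},$$

so with $m$ on the order of $n$ the error is far below $10^{-n}$ as soon as $m!\,m > 10^{n}$, which is exactly the kind of inequality Stirling's approximation settles for all but small $n$.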
Here is an implementation of this algorithm in C and Haskell:
#include <stdio.h>
#include <stdlib.h>

void generate_digits_of_e(int n)
{
    int q = 2;
    int len = n + 2;
    int *a = malloc(len * sizeof(int));
    if (!a) {
        fprintf(stderr, "Out of memory.\n");
        exit(1);
    }
    for (int i = 0; i < len; ++i)
        a[i] = 10;
    printf("%d.", q);
    for (int i = 0; i < n - 1; ++i) {
        for (int j = len - 1; j >= 0; --j) {
            const int d = a[j];
            a[j] = d % (j + 2);
            q = d / (j + 2);
            // Carry into the digit to the left; the carry out of the
            // left-most digit is the next decimal digit.
            if (j > 0)
                a[j - 1] += q;
        }
        printf("%d", q);
        fflush(stdout);
        for (int j = 0; j < len; ++j)
            a[j] *= 10;
    }
    free(a);
}
-- The same algorithm in Haskell. The fractional digits are kept
-- least-significant first, so the carry of interest is the last
-- quotient produced by the scan.
digits :: Int -> [Int]
digits n = 2 : digits' (n - 1) (replicate (n + 2) 1)

digits' :: Int -> [Int] -> [Int]
digits' n a@(h:t) = let (qs, rs) = unzip $ scanl f (f (0, 0) (h * 10, initB)) t'
                    in last qs : if n == 1 then [] else digits' (n - 1) rs
  where initB = length a + 1
        t' = zip (map (*10) t) [initB - 1, initB - 2 ..]
        f (q, _) (m, b) = divMod (m + q) b
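If you want to try the C version, a minimal driver could look like this (my own addition, to be appended to the same file; pick any digit count you like):

int main(void)
{
    generate_digits_of_e(30); // should print 2.71828182845904523536028747135
    putchar('\n');
    return 0;
}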
This algorithm was first published by A. H. J. Sale in his 1967 paper “The Calculation of e to Many Significant Digits”.
The Algorithm for π
In 1995 Stanley Rabinowitz and Stan Wagon published a similar algorithm for $\pi$. We defined a base for representing $e$ and constructed the algorithm for $e$ by converting this base to decimal; likewise, we will use the base $c$ based on (3) to represent $\pi$. Base $c$ is, unfortunately, not as nice as base !. For example, this base with maximal digits and no integer part (meaning every digit in the fractional part is one less than the denominator of its radix) is the number
where $!!$ denotes the double factorial; see this Wikipedia article for what it means (it is not just the factorial applied twice). Using the identity
we get
Here I used the fact that
which can be proven using integration by parts, and
This shows that numbers of this form can approach $2$ and, in contrast with base !, the remaining sum can contribute $1$ to the integer part with limited precision, and we may get an integer as large as $10$ after the multiply-reduce-carry stage of the algorithm.
This is not hard to fix. The algorithm is almost the same, but instead of printing the digit right away we hold it for as long as the next digits are 9's. If we then get an integer greater than 9 we add 1 to the pre-digit we first got and print it, then we print 0's instead of the 9's; otherwise we print the pre-digit and all the 9's, and set the new pre-digit to the integer modulo $10$. There is nothing complicated going on here, just multiplication with carry. Of course, we only print as many digits as needed, so we may not print all the 9's or all the 0's.
So this algorithm has a very serious flaw compared to the algorithm for Euler's number: at the beginning we need to specify a number of digits to be printed, but we may reach a sequence of 9's that goes beyond this limit, in which case the algorithm would terminate without fixing the pre-digit, possibly printing one less than the correct digit followed by a bunch of 9's. We can't fix this by simply allocating more memory for the array and continuing as if we had started with more precision, since this would mess up the array's state. It is also not possible (or is it?) to predict in advance where the sequences of 9's are and how long they run; coming up with such a method could even lead to a proof that $\pi$ is or is not normal, an open problem in mathematics. So if you get a sequence of 9's, you need to call the algorithm again with more digits to be sure the digits are correct.
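To make the pre-digit rule concrete, here is a made-up example of my own: if the held pre-digit is 3 and the next quotients come out as 9, 9 and then 10, we print 4 followed by two 0's and hold 0 as the new pre-digit; if the quotients had instead been 9, 9 and then 2, we would print 3 followed by two 9's and hold 2.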
That being said, the authors of the original paper say that we need $\lfloor 10n/3 \rfloor$ base-$c$ digits but, as you can check on small cases, this is wrong and we actually need $\lfloor 10n/3 \rfloor + 1$ digits. Here is the code in C:
#include <stdio.h>
#include <stdlib.h>

void generate_digits_of_pi(int n)
{
    const int len = (10 * n) / 3 + 1;
    int *const a = malloc(len * sizeof(int));
    if (!a) {
        fprintf(stderr, "Out of memory.\n");
        exit(1);
    }
    for (int i = 0; i < len; ++i)
        a[i] = 2;
    int nines = 0;
    int predigit = -1;
    while (n >= 0) {
        for (int j = 0; j < len; ++j)
            a[j] *= 10;
        int q;
        for (int j = len - 1; j >= 1; --j) {
            const int d = a[j];
            a[j] = d % (2 * j + 1);
            q = d / (2 * j + 1);
            a[j - 1] += q * j;
        }
        // Remember that the integer part of base c uses base 10.
        const int d = a[0];
        a[0] = d % 10;
        q = d / 10;
        if (q == 9) {
            ++nines;
        } else {
            // This prints predigit + 1 if the quotient is greater
            // than 9 and predigit otherwise.
            if (predigit >= 0)
                printf("%d", predigit + q / 10);
            if (n < nines)
                nines = n;
            // Decrease n by the number of digits we are about to
            // print plus the digit we printed.
            n -= nines + 1;
            while (--nines >= 0)
                printf("%d", q == 10 ? 0 : 9);
            nines = 0;
            predigit = q == 10 ? 0 : q;
        }
        fflush(stdout);
    }
    free(a);
}
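And a similar driver for this version (again my own addition, assuming the function above is in the same file); note that the function prints no decimal point:

int main(void)
{
    generate_digits_of_pi(10); // should print 3141592653
    putchar('\n');
    return 0;
}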
I'm not going to go over the correctness of this algorithm since the idea is basically the same as for the algorithm for $e$; if you are interested, check the paper by Stanley Rabinowitz and Stan Wagon. Jeremy Gibbons published an algorithm that doesn't have the problems I mentioned, in the paper “Unbounded Spigot Algorithms for the Digits of Pi”. The purpose of the last sections was to present the idea of spigot algorithms and how they can be constructed; if you really are interested in this kind of thing I recommend reading this paper, as I'm about to move on to something a lot more interesting.
Digit Extraction Algorithms
In 1995 Simon Plouffe discovered the remarkable formula

$$\pi = \sum_{k=0}^{\infty} \frac{1}{16^{k}}\left(\frac{4}{8k+1} - \frac{2}{8k+4} - \frac{1}{8k+5} - \frac{1}{8k+6}\right)$$

This is the BBP formula, named after David Bailey, Peter Borwein and Plouffe. This formula is special because we can use it to construct a spigot algorithm that is a lot more interesting than the previous algorithm: not only can we generate $\pi$ digit by digit, but we can also calculate the nth hexadecimal digit of $\pi$ without calculating the preceding digits!
To see how this is possible first let
and notice that the digit of $\pi$ in position $n$, counting from 1 and including the integer part, is the first digit in the fractional part of
where $\{x\}$ means the fractional part of $x$. We are just shifting digits and removing the integer part. From this definition of the fractional part we can see that if is an integer then
which gives us
We can also show that the same way. Now we can write
Let's isolate :
The second sum didn't have an integer part in the first place so we don't need to take the fractional part. To see this, just consider the geometric series
So only the first sum has terms with integer part. Since we are taking the fractional part we can remove the integer part of every term by reducing the numerator modulo the denominator (the remainder of divided by is equal to the fractional part of ). So we now have
Putting everything together:
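In symbols, writing $S_j$ for the sum whose denominators are $8k + j$ (the shorthand is mine, the content is just the steps above), everything combines into

$$\{16^{n}\pi\} = \Big\{4\,\{16^{n}S_1\} - 2\,\{16^{n}S_4\} - \{16^{n}S_5\} - \{16^{n}S_6\}\Big\}$$

$$\{16^{n}S_j\} = \left\{\sum_{k=0}^{n}\frac{16^{\,n-k}\bmod (8k+j)}{8k+j} + \sum_{k=n+1}^{\infty}\frac{16^{\,n-k}}{8k+j}\right\}$$

and the hexadecimal digit just after the radix point of $16^{n}\pi$ is $\lfloor 16\,\{16^{n}\pi\}\rfloor$ (in the code below the exponent is $n - 2$ because positions are counted from the integer part).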
Of course, this is exactly the digit we want, and now we have a great way to calculate it. The right sum is very small and for large $n$ it quickly gets smaller than what our floating-point precision of choice can represent. We can also use binary modular exponentiation to compute the powers of 16 modulo $8k + j$ efficiently; this is the full source code:
#include <math.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Returns 16 raised to n, modulo m.
 */
static double expm2(long n, double m)
{
    if (m == 1)
        return 0;
    double result = 1.;
    double base = fmod(16, m);
    while (n > 0) {
        if (n % 2 == 1)
            result = fmod(result * base, m);
        n /= 2;
        base = fmod(base * base, m);
    }
    return result;
}

/*
 * We need the modulo to be always non-negative since although the digits in
 * the fractional part are all correct the ones in the integer part are not,
 * which means we might end up with a negative number in the last stage of the
 * algorithm where we use modulo 1 to take the fractional part.
 */
static double mod(double x, long m)
{
    x = fmod(x, (double)m);
    return x < 0 ? x + m : x;
}

static double sigma(long i, long n)
{
    double s = 0;
    #pragma omp parallel for reduction(+:s)
    for (long k = 0; k <= n; ++k) {
        const double m = 8 * k + i;
        const double a = expm2(n - k, m) / m;
        s += mod(a, 1);
    }
    // A few terms of the tail are enough: 16^(n-k) drops below the
    // precision of a double after about fifteen of them.
    for (long k = n + 1; k < n + 20; ++k)
        s += pow(16, n - k) / (8 * k + i);
    return mod(s, 1);
}

int get_digit(long n)
{
    n -= 2;
    double acc = 0;
    acc += 4 * sigma(1, n);
    acc -= 2 * sigma(4, n);
    acc -= sigma(5, n);
    acc -= sigma(6, n);
    return (int)floor(16 * mod(acc, 1));
}
Since this algorithm is embarrassingly parallel I took the liberty of adding an OpenMP pragma to speed things up. The amount of precision required depends on how far out the digit is; at some point using 64-bit floating point numbers will not be enough and you would have to use a library like MPFR with some numerical analysis to estimate how much precision you need, but a 64-bit float was enough for me to calculate the 10,000,000th digit (it is !) in 2 seconds. Also, by computing the digit we want we inevitably compute the next few digits, so with enough floating-point precision and enough terms in the sums we can get not only the nth digit but also a few digits from that position by multiplying by a higher power of $16$ instead of just $16$. It is important to understand that the reason why this algorithm works is the division by $16^k$ in the BBP formula, which allows us to multiply by $16^n$ and split each sum in two; this is why this formula is special.
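As a quick sanity check, a driver like the following (my own addition, to be appended to the file above) should print 243F6A8885, the first hexadecimal digits of $\pi$ after the radix point:

int main(void)
{
    // Positions 2 through 11, i.e. the first ten hexadecimal digits
    // after the radix point of pi = 3.243F6A8885...
    for (long i = 2; i <= 11; ++i)
        printf("%X", (unsigned)get_digit(i));
    putchar('\n');
    return 0;
}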
In 1997 Fabrice Bellard discovered the formula
which allows one to calculate the hexadecimal or binary digits of $\pi$ 43% faster than the BBP formula. Using 20 computers in parallel Bellard calculated 152 binary digits of $\pi$ starting from the trillionth position. Colin Percival calculated the quadrillionth binary digit in his PiHex project using a distributed network of 1,734 computers in 56 different countries.
There are many known BBP-type formulas for other constants:
In the one is an integer. In general, this kind of formula has the form
$$\alpha = \sum_{k=0}^{\infty} \frac{1}{b^{k}}\,\frac{p(k)}{q(k)}$$

where $p$ and $q$ are polynomials with integer coefficients and $b$ is an integer greater than one (this is also the base for which we can construct a digit extraction algorithm).
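For example, one of the simplest formulas of this shape (not necessarily one of the ones listed above) is

$$\ln 2 = \sum_{k=1}^{\infty} \frac{1}{k\,2^{k}},$$

which has this form with $b = 2$, $p(k) = 1$ and $q(k) = k$ (summing from $k = 1$), and therefore admits a binary digit extraction algorithm by the same modular-exponentiation trick.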