Doc/Articles/Play133
At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter
13. Extended Integers
. By Eugene McDonnell. First published in Vector, 13, 3, (January 1997), 127-135.
J has recently had added to it a new class of number, called extended integer. An extended integer can be produced by applying the extend verb x: to an ordinary integer:
x: 1234 1234
An extended integer may also be written directly by putting an x at the end of an ordinary integer:
1234x 1234
If one or more of the integers in a list is extended, they are all extended:
1 2x 3 1 2 3
Various primitives produce extended integer results if the argument is extended. For example, very large exact factorials are possible:
! 30x 265252859812191058636308480000000 */ x: >: i. 30 265252859812191058636308480000000
Some verbs f signal domain errors on some extended arguments because the result is not integral; however, <.@f and >.@f will work on extended arguments. I think you'll get the idea from a few examples:
1234 % 5x |domain error | 1234 %5x 1234 <.@% 5x 246 1234 % 2x 617 1234 <.@% 2x 617 ] r =. <.@%: 2*10^38x 14142135623730950488 *: r + ,. _1 0 1 199999999999999999971238085416445537169 199999999999999999999522356663907438144 200000000000000000027806627911369339121
What if you want to turn an extended integer into an ordinary integer? Those of you conversant with J will guess that the way to do this is to apply the inverse of x:, like this:
x:^:_1 [ 1234x NB. or _1 x: 1234x 1234
THE APPLICATION
I was excited when I heard about extended integers, because I had an immediate use for them. Jeffrey Shallit gave a paper at APL83 in which he discussed the problem of determining how many times the random number generator had been used, given a value of the random link. In order to give an APL solution he had to include in the paper a portion of an extended arithmetic package. This was because the numbers needed to solve the problem were very large integers. Stating the problem as simply as possible, given the equation:
y = M|g^x NB. (A)
the problem is to find x, where y, M, and g are known. This is the basis of the random number generator we shall be discussing. The value g is also known as the generator, and this can lead to confusion when we talk about the generator of the generator. I offer my apology in advance. The problem is sometimes known as the logarithm problem, since (forgetting the M-residue for a moment), if we have:
y = g ^ x
and know y and g, we can find x by taking the base-g logarithm of y:
x = g ^. y
In solving (A) for the particular problem of APL\360 and its descendants (including J), x can be as large as 2,147,483,646. For this value of x, g^x has over 9,000,000,000 decimal digits, and would take several hundred large volumes to print out. There are tactics one can employ to cut the size of the problem down, but extended-precision arithmetic will still be required. In J's implementation, the phrase M&|@^ is recognized and can be computed efficiently both in time and space.
The technique discussed by Shallit to solve the problem is due to Pohlig and Hellman. You'll have to look up the references if you are interested in the mathematical background, since I shall focus on the problem's algorithmic aspects.
THE GORY DETAILS
Several global constants are needed. The modulus used in random number generators of the APL\360 kind must be a prime. The largest prime that can be stored as a 4-byte integer is in fact also the largest integer that can be stored, that is, one less than 2^31. This prime was discovered by Euler and for over a hundred years was the largest prime known. It is the Mersenne prime M31, too, for those of you interested in the Euclidian perfect numbers.
M =. x: <: 2^31
It is convenient to have the value of the integer one less than M handy:
L =. <: M
For the random number generator to have maximum period, the generator g must be a primitive root of the modulus. A primitive root of a prime has the property that its powers, mod the prime, are distinct. For example, the prime 7 has 3 and 5 as primitive roots, because their powers, mod 7, are distinct:
7|3 5^/>:i.6 3 2 6 4 5 1 5 4 6 2 3 1
but the other positive integers less than 7 have repeated elements:
7|1 2 4 6^/>:i.6 1 1 1 1 1 1 2 4 1 2 4 1 4 2 1 4 2 1 6 1 6 1 6 1
Dr. Bryant Tuckerman, of the IBM Watson Research Laboratory in Yorktown Heights, New York, gave the APL\360 implementors the primitive root 7^5, or 16807. A decade or so later people began exploring random number generators extensively, and were surprised to find that there was no better generator than this for the modulus M:
g =. x: 7^5
A prime-power factor is an integer all of whose factors are the same. For example, 32, 9, 125, 49, and 11 are all prime-power factors. The prime-power factors of L play a key role in the algorithm. The primitive q: in J yields the prime factors of a number, but these may be repeated. For example, q: 12 is 2 2 3. The algorithm requires that repeated primes be replaced by their product. The verb h, to be defined later, factors numbers and replaces repeated items by their product:
f =. h L NB. f is 2 9 7 11 31 151 331
Certain powers of the generator g are needed. Those needed are the quotients of dividing L by f, and multiplying this quotient by the integers less than the items of f. For example, for the factor 7 we get:
,.B =. (L%7)*i.7 0 306783378 613566756 920350134 1227133512 1533916890 1840700268
and similarly for the other factors.
The verb p, to be defined later, raises the generator g to to any integer power, mod M. We use it to raise the generator to the powers B:
,.C =. p (L%7) * i.7 1 1600955193 894255406 1205362885 1752599774 1537170743 1599590586
Such a list is made for each prime-power factor. These are boxed and joined together, forming q, a list of lists, containing 542 numbers altogether (+/2 9 7 11 31 151 331), and too large to display here.
q =. <@p@j"0 f
You should be warned that the formation of q takes a minute or so to execute, depending on the speed of your computer. I find it convenient to comment out this line in the script, and insert the value of q directly.
We need the quotient of L with its factors as a separate global noun:
,. e =. L % f 1073741823 238609294 306783378 195225786 69273666 14221746 6487866
Those are all of the global nouns. Now we have to deploy a number of utility verbs. The verb w:
w =. ~. ^ #/.~
raises each item of its argument's nub to its tally:
w 2 2 2 3 3 5 7 8 9 5 7
The verb h:
h =.w @ q:
factors its argument and produces the prime power factors from it:
h L 2 9 7 11 31 151 331
The verb s:
s=. M&|@^
raises its left argument x to the power of its right argument, mod M:
3 s 2 9 16807 s 2000 75099568
The verb p:
p =. g&s
raises g to the power of its argument:
p 2000 75099568
The verb j:
j =. L&% * i.
divides its argument into L, and multiplies this quotient by the nonnegative integers less than it.
,. j 7 0 306783378 613566756 920350134 1227133512 1533916890 1840700268
THE MAIN PROBLEM
With all this behind us, we're ready to discuss the main problem. Suppose we find that the value of y, the random link, is 1209311799:
y =. 1209311799
We define a verb t:
t =. f ,. q i.&> ] s e"_
The phrase
] s e"_
can be replaced by
,.D =. y s e NB. let this be (D) 2147483646 473297587 1537170743 353622995 4096 709324280 667991092
The heart of the matter is that each distinct value that y may take yields a different list D. We look for the index of each item of D in q, finding:
] E=. q i.&> D 1 1 5 4 23 142 268
These values are the residues we seek. We form a table F by stitching f and E:
] F =. f ,. E 2 1 9 1 7 5 11 4 31 23 151 142 331 268
and this is also the result of applying verb t to y:
t y 2 1 9 1 7 5 11 4 31 23 151 142 331 268
If you refer to Hui's article, you'll see that F is similar to the table shown on page 64, beneath the expression c mr q. That is, F is a table of moduli and residues. For example, the items in list D correspond to the factors in f. In particular, the value 1537170743x corresponds to the factor 7. We gave all the possible values that may be taken on in this position in list C back a bit. In C we find that the value 1537170743x is in position 5, and so in F we find the value 5 next to the 7 in the first column. It is a remarkable fact that the expression y s e produces values which must occur also in the corresponding item of q, and that its index in the list in q is the residue we want for the next step.
The verb r:
r =. {: @ (cr1/ @ t)
inserts Hui's verb cr1 between each of the items of its argument, and yields a two-item list, with the first item necessarily equal to L, and the second item the power of g yielding y, mod M.
x:z =. cr1/F 2147483646 1234567 L 2147483646
We take the tail of z as our desired x:
] x =. {: z 1234567
We can verify that x is indeed the desired value by applying p to it:
p x 1209311799 y 1209311799
Are we happy? We shouldn't be yet, because this wasn't precisely the problem we wanted to solve, which was, how many times had the random number generator been used to arrive at the given random link. This part is easy, because we know that the initial value of the random link is 16807, which corresponds to exponent 1. All we have to do to get the value we want is to decrease x by 1. This gives us at last the verb ner:
ner =. <:@r NB. number of executions of roll x:ner y 1234566
Reference
Hui, Roger K. W., The Ball Clock Problem, Vector 12 2, October 1995, 55-66
Pohlig, Stephen C., and Martin E. Hellman, An Improved Algorithm for Computing Logarithms Over GF(p) and Its Cryptographic Significance, IEEE Trans. Info. Theory IT-24 (1978) 106-110
Shallit, J. O., Merrily We Roll Along: Some Aspects of ?, APL83 Conference Proceedings, APL Quote-Quad 13 3, March 1983, 243-248
Appendix
NB. Script for finding x in y=m|g^x, knowing y, M, and g, NB. particularized for use in analyzing the behavior of NB. the linear congruential random number generator found NB. in APL\360 and its descendants. NB. UTILITY VERBS w =: ~. ^ #/.~ h =:w@q: NB. GLOBAL NOUNS M =: x: <: 2^31 L =: <: M g =: x: 7^5 f =: h L NB. f is 2 9 7 11 31 151 331 s=: M&|@^ p =: g&s j =: L&% * i. q =: <@p@j"0 f e =: L % f NB. HUI's CHINESE REMAINDER VERBS FROM VECTOR 12 2, P 66 NB. INCLUDING GCD AS A LINEAR COMBINATION NB. Chinese Remainder ab =: |.@(gcd/ * [ % +./)@(,&{.) cr1 =: [: |/\ *.&{. , ,&{: +/ .* ab chkc =: [: assert ,&{: -: ,&{. | {:@cr1 cr =: cr1 [ chkc NB. GCD as a Linear Combination g0 =: , ,. =@i.@2: it =: {: ,: {. - {: * <.@%&{./ gcd =: (}.@{.)@(it^:(*@{.@{:)^:_)@g0 assert=: 13!:8@(12"_)^:-. NB. MAIN VERBS t =: f ,. q i.&> ] s e"_ r =: {: @ (cr1/ @ t) ner =: <:@r NB. number of executions of roll
This article has been updated to reflect the current version of J.
Endnote[1]
The examples showing domain errors where the result is not integral, now return extended rational, which was added since this article was written.