Essays/Combination Index
The Problem
comb is from the Combinations page; m comb n computes all size m combinations of i.n .
comb=: 4 : 0 k=. i.>:d=.y-x z=. (d$<i.0 0),<i.1 0 for. i.x do. z=. k ,.&.> ,&.>/\. >:&.> z end. ; z ) 4 comb 6 0 1 2 3 0 1 2 4 0 1 2 5 0 1 3 4 0 1 3 5 0 1 4 5 0 2 3 4 0 2 3 5 0 2 4 5 0 3 4 5 1 2 3 4 1 2 3 5 1 2 4 5 1 3 4 5 2 3 4 5
In this note, we discuss the problem of determining the index given m and n and a particular combination, and the combination given m and n and the index. That is, devise verbs ic and ci such that:
((m,n) ic c) -: c i.~ m comb n ((m,n) ci i) -: i { m comb n
Index from Combination
ic can be computed as follows:
ic=: 4 : 0 " 1 'm n'=. x: x -/ +/ ((-i.)m) ! n - (|.!.'' 1+y),.y ) 4 6 ic 1 2 3 5 11 4 6 ic 4 comb 6 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 23 10000 ic 9999-i.-23 3771461434429626616429953717057906562371311072521139025732118617119999 <: 23!10000x 3771461434429626616429953717057906562371311072521139025732118617119999
ic is derived from a recursive solution written years ago:
ic1=: 4 : 0 " 1 'm n'=. x: x if. 1>:m do. {.y,0 return. end. k=. {.y (+/(m-1)!(n-k)+i.k) + (x-1,1+k) ic1 (}.y-1+k) )
k is the leading term of the combination in question. The sum +/(m-1)!(n-k)+i.k accounts for the combinations whose leading terms are less than k . Using the sum directly as in ic1 is not satisfactory for an enormous combination such as 23 10000 ic1 9999-i.-23 (with k=9977).
What is this sum? To use a particular example, the terms of the sum are
m=:3 [ n=:9 [ k=:5 (m-1)!(n-k)+i.k 6 10 15 21 28
and are adjacent entries in Pascal's triangle:
1 1 1 1 2 1 1 3 3 1 1 4 6 4 1 1 5 10 10 5 1 1 6 15 20 15 6 1 1 7 21 35 35 21 7 1 1 8 28 56 70 56 28 8 1 1 9 36 84 126 126 84 36 9 1
If we add the extra term 4 in Pascal's triangle to the right of 6 , the sum collapses into a single binomial coefficient:
6 4 10 10 10 15 15 15 20 21 21 21 21 35 28 28 28 28 28 56 84 +/6 10 15 21 28 80 3!9 84 (3!9)-3!9-5 80
which is m!n , from which the original extra term m!n-k must be subtracted. "Pascal's ladder" or "Pascal's telescoping sum" would be good descriptions of this. Of course, to make the derivation rigorous and respectable, one would have to employ a proof by (say) induction. The details of such a proof are left as an exercise for the reader.
The following benchmark illustrates the improvement from ic1 to ic .
mn=: 23 10000 c=: 9999-i.-23 ts=: 6!:2 , 7!:2@] NB. time and space ts 'mn ic1 c' 5.98758 3.88736e6 ts 'mn ic c' 0.00465897 155392
Combination from Index
(m,n) ci i produces the i-th size m combination of i.n .
ci=: 4 : 0 " 1 0 'm n'=. x: x z=. 0$q=. n for_p. (-i.)m do. k=. (p,q) lead y y=. y-(p!q)-p!q-k q=. q-1+k z=. z,k end. (i.m) + +/\z ) lead=: 4 : 0 'm n'=. x: x a=. m!n p=. n-1 q=. m-1 while. p>:q do. j=. q+<.-:p-q s=. (a - m!j) - y if. 0 > s do. p=. j-1 elseif. 0 < s do. q=. j+1 elseif. 1 do. n-j return. end. end. (n-1)-p ) 4 6 ci 11 1 2 3 5 (4 comb 6) -: 4 6 ci i.4!6 1 23 10000 ci <:23!10000x 9977 9978 9979 9980 9981 9982 9983 9984 9985 9986 9987 9988 ... 9998 9999
ci generates the combination one term at a time, starting with the leftmost term. It derives the leading term using the sub-function lead , which is equivalent to
((m!n) - m!(m-1)+i.m-1+n) (>i.1:) y
but finds the answer using a binary search. The left argument in the above expression has 1+n-m elements, which can be so large that it would be very expensive or impossible to create the actual vector.
As in the case of ic , ci is derived from a function written years ago:
ci1=: 4 : 0 " 1 0 'm n'=. x: x if. 0=m do. i.0 return. end. v=. +/\ (m-1)!(1-m)}.i.-n k=. v (>i.1:) y k , (1+k) + (x-1,1+k) ci1 (y-k{0,v) )
A term (m-1)!(n-1)-j of the sum +/\(m-1)!(1-m)}.i.-n accounts for the number of combinations whose leading term is j . This sum is seen to be equivalent to (m!n)-m!(m-1)+i.m-1+n using reasoning similar to the "Pascal's Ladder" derivation for ic . The latter expression is amenable to binary search.
Checks
iccheck and cicheck have the same arguments and results as ic and ci , but incorporate checks:
iccheck=: 4 : 0 " 1 'm n'=. x: x assert. (0<:m)*.m<:n assert. m = #y assert. (-: /:~) y assert. (-: ~. ) y assert. y e. i.n z=. -/ +/ ((-i.)m) ! n - (|.!.'' 1+y),.y assert. (0<:z)*.z<m!n ) cicheck=: 4 : 0 " 1 0 'm n'=. x: x assert. (0<:m)*.m<:n assert. (0<:y)*.y<m!n z=. 0$q=. n for_p. (-i.)m do. k=. (p,q) lead y y=. y-(p!q)-p!q-k q=. q-1+k z=. z,k end. z=. z + (i.#z) + |.!.'' +/\z assert. m = #z assert. (-: /:~) z assert. (-: ~. ) z assert. z e. i.n )
See also
Contributed by Roger Hui.