Doc/Articles/Play152

From J Wiki
Jump to navigation Jump to search

At Play With J ... Table of Contents ... Previous Chapter ... Next Chapter

18. Maximum Infix Sums

. By Eugene McDonnell. First published in Vector, 15, 2, (October 1998), 100-103.

In his book Programming Pearls (Addison-Wesley 1986), Jon Bentley collected a baker's dozen of his columns of the same name that had appeared in Communications of ACM. Column 7 therein is called "Algorithm Design Techniques", and the problem discussed in there is the subject of this column.

Here's the problem:

Given a vector x of real numbers (positive, negative, or zero), define a function f that yields the maximum of the sums of all the infixes. For example, if

   x
31 _41 59 26 _53 58 97 _93 _23 84
 then f x is 187, the sum of 59 26 _53 58 97. If all the numbers are positive the maximum infix is x, with sum +/x. When all inputs are negative the maximum infix is the empty vector, with sum 0.

How many infixes are there in a vector? An infix can start at any point and end at any point. The number of infixes starting at the very beginning is #x, one for each possible ending point. The number of infixes beginning at the next position is (#x)-1; at the next is (#x)-2, and so on; the total is thus the triangular number -:(#x)*(1+#x), or, expressed functionally, -:@(*>:), one of the series 1 3 6 10 15. The number of infixes for #x where x is a power of 10 are:

     #x  -:@(*>:) #x
      1            1
     10           55
    100         5050
   1000       500500
  10000     50005000
 100000   5000050000
1000000 500000500000

So even for a vector of length 1000, the number of infixes to consider is greater than half a million. Bentley gives as a bad example an algorithm (I think written in Pascal) that tests the sum of each infix, and shows that a computer that takes an hour for #x 1000 will take 39 days for #x 10000.

MaxSoFar := 0.0
for L := 1 to N do
    for U := L to N do
        Sum := 0.0
        for I := L to U do
            Sum := Sum + X[I]
        /* Sum now contains the sum of X[L..U] */
        MaxSoFar := max(MaxSoFar, Sum)

He goes on to discuss faster algorithms, but which still take quadratic time. He describes then a subquadratic algorithm which uses O(N log N) time. He tells how he and a colleague thought this was probably the best possible. They described this at a meeting at Carnegie Mellon University, and someone in the audience designed a much improved linear time algorithm in less than a minute. Several APL colleagues of mine have voiced their suspicions that the designer knew APL.

The algorithm (again I think written in Pascal) for the linear-time algorithm is:

MaxSoFar := 0.0
MaxEndingHere := 0.0
for I := 1 to N do
    /*  Invariant: MaxEndingHere and MaxSoFar
                   are accurate for X[1..I-1] */
    MaxEndingHere := max(MaxEndingHere+X[I], 0.0)
    MaxSoFar := max(MaxSoFar, MaxEndingHere)
 This can be translated into a J verb:
   mis =: 3 : '>./(0>.+)/\.y,0'  NB. maximum infix sum

We append 0 to the end of the list argument (y,0), because the last value on the list may be negative, then do a suffix scan (/\.) using the verb 'the maximum of zero and the sum so far' (0>.+). This produces a list of sums formed according to the rule that if a negative sum is encountered it is replaced by zero. Finally, the maximum of all the sums is yielded (>./). To see how this works we use mis0, which is like mis but without the maximum over, on a short list:

   y =: _100 2 3 4 _100 5 6 _7 8 9 _100

   mis0 y
0 9 7 4 0 21 16 10 17 9 0 0

   mis y
21

The maximum infix is 5 6 _7 8 9, with sum 21.

The need for appending a zero may be demonstrated by using mis1, which is like mis0 but without the appended zero:

   mis1=: 3 : '(0>.+)/\.y'

   mis1 y
0 9 7 4 0 12 7 1 8 0 _100

Jon Bentley showed the advantage of linear over quadratic algorithms by writing his slow algorithm in fine-tuned Fortran and running it on a Cray-1 supercomputer, and the linear algorithm on a Basic interpreter running on a Radio Shack TRS-80 Model III microcomputer. The runtime for an argument of length N was 3.0N^3 nanoseconds for the slow algorithm on the Cray, and 19.5N milliseconds (19,500,000N nanoseconds) for the fast algorithm on the TRS-80. The linear algorithm on the slow machine caught up with the slow algorithm on the fast machine at N = 2500, and for N = 1,000,000, the slow algorithm/fast machine took 95 years (estimated) and the fast algorithm/slow machine took just 5.4 hours.

The moral of the story is something like, "It's foolish to improve performance by getting a faster machine; one's money is better spent on finding a faster algorithm." This isn't always possible, of course. Machine designers are always finding ways to build faster machines, but it takes a very rare skill to devise faster algorithms.

This brings me to a confession. The J version of the algorithm wasn't translated from Pascal (which is a language I don't understand), but from Arthur Whitney's K. Many of you will know of the existence of two different APL successors, J and K; some of you will know of their progenitors: Ken Iverson and Roger Hui for J, and Arthur Whitney for K. There are connections among these three. All are from Alberta, in Canada (Roger Hui by way of Hong Kong); all worked for I. P. Sharp Associates in Toronto (Whitney also worked for IPSA in Australia, Hong Kong, and Singapore), and all are good friends. I might also say that all have first-rate brains. That they have taken divergent paths in pushing the APL idea into the next millenium might lead you to think that their relationship might be strained, and they do indeed have a rivalry in that regard; but it is a friendly, though determined one. I use both J and K, and believe that each language is excellent, but also that each has features that I wish the other had.

The K algorithm, which I learned from Arthur Whitney (we live just a mile or so apart in Palo Alto, California) is:

  f : {|/0(0|+)\x}

In K the vertical bar (|) signifies the dyad 'larger'. So |/ is 'the largest of all, or maximum'. This is applied to a scan using the hook 0|+, where 0| is the monad 'larger of 0 and the argument', and + is the dyad 'plus', of course. In K the scan (like over) begins at the left. The 0 preceding (0|+) is the initial value of the scan, and serves the same purpose as the appended 0 at the end of the J function j.

Since I have both J and K on my machine (K can be downloaded from the website kx.com) I was motivated to time mis and f on similar length arguments. The results are interesting.

The arguments are named xn, where 10^n is the length of the list.

            J         K
   x1    0.00017   0.000386
   x2    0.0005    0.0024
   x3    0.0077    0.0212
   x4    0.082     0.089
   x5    0.87      0.532
   x6      --      5.091

J is faster for arguments up to 1000 items. and there is a tie for arguments 10000 items long, but K is faster for larger arguments, and, indeed, can handle much larger arguments than J can. I've reported this difficulty with J and perhaps by the time you read this it will have been remedied.


Editor's Note:

The J timing problem has now been fixed, and performance is an order of magnitude better. K is faster for smaller arguments, and slower for larger. The following updated timings are in milliseconds:

             J         K
   x1     0.0129    0.0052
   x2     0.0482    0.0441
   x3     0.392     0.419
   x4     3.77      4.19
   x5    37.96     42.7
   x6   393       476

Vector Vol.15 No.2