18. Maximum Infix Sums
By Eugene McDonnell. First published in Vector, 15, 2 (October 1998), 100-103.
In his book Programming Pearls (Addison-Wesley, 1986), Jon Bentley collected a baker's dozen of his columns of the same name that had appeared in Communications of the ACM. Column 7, "Algorithm Design Techniques", discusses the problem that is the subject of this column.
Here's the problem:
Given a vector x of real numbers (positive, negative, or zero), define a function f that yields the maximum of the sums of all the infixes. For example, if
   x =: 31 _41 59 26 _53 58 97 _93 _23 84
then f x is 187, the sum of 59 26 _53 58 97. If all the numbers are positive the maximum infix is x, with sum +/x. When all inputs are negative the maximum infix is the empty vector, with sum 0.
How many infixes are there in a vector? An infix can start at any point and end at any point. The number of infixes starting at the very beginning is #x, one for each possible ending point. The number beginning at the next position is (#x)-1; at the next, (#x)-2; and so on. The total is thus the triangular number -:(#x)*(1+#x), or, expressed functionally, -:@(*>:), a member of the series 1 3 6 10 15 ... The numbers of infixes for vectors whose lengths are powers of 10 are:
        #x   -:@(*>:) #x
         1             1
        10            55
       100          5050
      1000        500500
     10000      50005000
    100000    5000050000
   1000000  500000500000
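As a check, the infixes can be enumerated directly. The helper allinf below is my own sketch, not part of the original column; it boxes every infix of its argument, one box per infix:

   allinf =: 3 : '; (1 + i.#y) <@(<\)"0 _ y'  NB. all infixes of y, boxed (name is mine)
   # allinf x
55

A 10-item vector yields 55 infixes, agreeing with the table.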
So even for a vector of length 1000, the number of infixes to consider is greater than half a million. Bentley gives as a bad example an algorithm (I think written in Pascal) that tests the sum of each infix, and shows that a computer that takes an hour when #x is 1000 will take 39 days when #x is 10000.
MaxSoFar := 0.0
for L := 1 to N do
    for U := L to N do
        Sum := 0.0
        for I := L to U do
            Sum := Sum + X[I]
        /* Sum now contains the sum of X[L..U] */
        MaxSoFar := max(MaxSoFar, Sum)
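For comparison, the same brute force is one line of J: sum every infix explicitly and take the largest. The helper name sis is mine, not Bentley's or the article's:

   sis =: 3 : '; (1 + i.#y) <@(+/\)"0 _ y'  NB. sums of all infixes (my sketch)
   >./ 0 , sis x
187

The 0 prepended before taking the maximum accounts for the empty infix.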
He goes on to discuss faster algorithms that still take quadratic time, and then describes a subquadratic algorithm that uses O(N log N) time. He tells how he and a colleague thought this was probably the best possible. When they described it at a meeting at Carnegie Mellon University, someone in the audience designed a much improved linear-time algorithm in less than a minute. Several APL colleagues of mine have voiced their suspicions that the designer knew APL.
The linear-time algorithm (again, I think, written in Pascal) is:
MaxSoFar := 0.0
MaxEndingHere := 0.0
for I := 1 to N do
    /* Invariant: MaxEndingHere and MaxSoFar are accurate for X[1..I-1] */
    MaxEndingHere := max(MaxEndingHere + X[I], 0.0)
    MaxSoFar := max(MaxSoFar, MaxEndingHere)
This can be translated into a J verb:
   mis =: 3 : '>./(0>.+)/\.y,0'  NB. maximum infix sum
We append 0 to the end of the list argument (y,0) to give the fold its starting value; without it, a negative last item would never be replaced by zero. We then do a suffix scan (/\.) using the verb 'the larger of zero and the sum so far' (0>.+). This produces a list of sums formed according to the rule that any negative running sum is replaced by zero. Finally, the maximum of all the sums is yielded (>./). To see how this works we use mis0, which is like mis but without the maximum over, on a short list:
   mis0 =: 3 : '(0>.+)/\.y,0'  NB. mis without the final >./
   y =: _100 2 3 4 _100 5 6 _7 8 9 _100
   mis0 y
0 9 7 4 0 21 16 10 17 9 0 0
   mis y
21
The maximum infix is 5 6 _7 8 9, with sum 21.
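The two boundary cases noted at the outset behave as claimed:

   mis 1 2 3      NB. all positive: the whole vector, sum +/x
6
   mis _1 _2 _3   NB. all negative: the empty infix, sum 0
0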
The need for appending a zero may be demonstrated by using mis1, which is like mis0 but without the appended zero:
   mis1 =: 3 : '(0>.+)/\.y'
   mis1 y
0 9 7 4 0 12 7 1 8 0 _100

Without the trailing zero the final _100 stands alone as the fold's starting value; the 9 before it folds against _100 and is zeroed, the sums ending at the tail are lost, and the maximum would wrongly be 12 rather than 21.
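For those who prefer tacit J, here is an equivalent formulation; the name misT and the arrangement are mine, not the article's:

   misT =: [: >./ (0 >. +)/\.@(, 0:)  NB. tacit sketch equivalent to mis
   misT y
21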
Jon Bentley showed the advantage of linear over quadratic algorithms by writing his slow algorithm in fine-tuned Fortran and running it on a Cray-1 supercomputer, and the linear algorithm on a Basic interpreter running on a Radio Shack TRS-80 Model III microcomputer. The runtime for an argument of length N was 3.0N^3 nanoseconds for the slow algorithm on the Cray, and 19.5N milliseconds (19,500,000N nanoseconds) for the fast algorithm on the TRS-80. The linear algorithm on the slow machine caught up with the slow algorithm on the fast machine at N = 2500, and for N = 1,000,000, the slow algorithm/fast machine took 95 years (estimated) and the fast algorithm/slow machine took just 5.4 hours.
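Bentley's figures are easy to verify; this little J session is mine, checking the crossover and the two extremes:

   3e_9 * 2500 ^ 3                      NB. cubic on the Cray-1 at N=2500, in seconds
46.875
   0.0195 * 2500                        NB. linear on the TRS-80 at N=2500, in seconds
48.75
   (3e_9 * 1e6 ^ 3) % 3600 * 24 * 365   NB. cubic at N=1e6, in years
95.1294
   (0.0195 * 1e6) % 3600                NB. linear at N=1e6, in hours
5.41667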
The moral of the story is something like, "It's foolish to improve performance by getting a faster machine; one's money is better spent on finding a faster algorithm." This isn't always possible, of course. Machine designers are always finding ways to build faster machines, but it takes a very rare skill to devise faster algorithms.
This brings me to a confession. The J version of the algorithm wasn't translated from Pascal (a language I don't understand), but from Arthur Whitney's K. Many of you will know of the existence of two different APL successors, J and K; some of you will know of their progenitors: Ken Iverson and Roger Hui for J, and Arthur Whitney for K. There are connections among these three. All are from Alberta, in Canada (Roger Hui by way of Hong Kong); all worked for I. P. Sharp Associates in Toronto (Whitney also worked for IPSA in Australia, Hong Kong, and Singapore); and all are good friends. I might also say that all have first-rate brains. That they have taken divergent paths in pushing the APL idea into the next millennium might lead you to think their relationship is strained, and they do indeed have a rivalry in that regard; but it is a friendly, though determined, one. I use both J and K, and believe that each language is excellent, but also that each has features I wish the other had.
The K algorithm, which I learned from Arthur Whitney (we live just a mile or so apart in Palo Alto, California), is:
   f:{|/0(0|+)\x}
In K the vertical bar (|) signifies the dyad 'larger', so |/ is 'the largest of all', or maximum. This is applied to a scan using the hook 0|+, where 0| is the monad 'larger of 0 and the argument' and + is the dyad 'plus', of course. In K the scan (like over) begins at the left. The 0 preceding (0|+) is the initial value of the scan, and serves the same purpose as the 0 appended to the argument in the J verb mis.
Since I have both J and K on my machine (K can be downloaded from the website kx.com), I was motivated to time mis and f on arguments of the same lengths. The results are interesting.
The arguments are named xn, where 10^n is the length of the list.
            J         K
   x1  0.00017  0.000386
   x2  0.0005   0.0024
   x3  0.0077   0.0212
   x4  0.082    0.089
   x5  0.87     0.532
   x6  --       5.091
J is faster for arguments up to 1000 items, the two are nearly tied at 10000 items, and K is faster for larger arguments; indeed, K can handle much larger arguments than J can. I've reported this difficulty with J, and perhaps by the time you read this it will have been remedied.
Editor's Note:
The J timing problem has now been fixed, and performance is an order of magnitude better. K is faster for smaller arguments, and slower for larger. The following updated timings are in milliseconds:
            J        K
   x1   0.0129   0.0052
   x2   0.0482   0.0441
   x3   0.392    0.419
   x4   3.77     4.19
   x5  37.96    42.7
   x6 393      476