Essays/NearerLargerNeighbor
Nearer Larger Neighbor: J Programs Compared
Howard A. Peelle
University of Massachusetts Amherst
hapeelle@educ.umass.edu
The Nearer Larger Neighbor problem is introduced. J programs for solving it are defined and compared by speed, space, and spread.
Introduction
The “Nearer Larger Neighbor” problem is:
For each number in a list, find the nearer number that is larger than it.
If the larger numbers to the left and right are equally near, pick the larger one.
If there is no larger number, give a suitable default number.
Variants of this problem arise in several areas of mathematics and computer science, notably computational geometry for choosing Euclidean distances between points. See references below. Fundamentally, the problem entails bidirectional search yet generalizes naturally to higher dimensional arrays. It is instructional to develop programs with different approaches for a list first.
Asano et al [1] posed the “All Nearest Larger Neighbors” problem and presented several 1-D algorithms. Although their problem is slightly different, the following programs begin with my attempts to code theirs in J for comparisons here.
Asano’s Algorithms
A1
Asano’s Algorithm 1 is a right-and-left stack scan where the output is a list of associated indexes (converted to origin 0), and ties are resolved to the leftmost index:
A1 =: 3 : 0 n =. #y s =. i.0 nln =. n#0 lnln =. n#0 rnln =. n#0 for_i. i.n do. while. ((#s)~:0)*.(i{y)>:({.s){y do. s =. }.s end. if. (#s)=0 do. lnln =.(-n) i} lnln else. lnln =. ({.s) i} lnln end. s =. i,s end. s =. i.0 for_i. |.i.n do. while. ((#s)~:0)*.(i{y)>:({.s){y do. s =. }.s end. if. (#s)=0 do. rnln =. (3*n) i} rnln else. rnln =. ({.s) i} lnln end. s =. i,s end. for_i. i.n do. if. (i-i{lnln)<:(i{rnln)-i do. nln =. (i{lnln) i} nln else. nln =. (i{rnln) i} nln end. end. nln )
Example from [1]:
A1 87 32 12 54 28 35 14 61 18 53 _10 0 1 0 3 3 5 0 7 7 NB. indexes with left preference
Notice that _10 is the default index for the largest item. Another example:
A1 4 5 3 8 6 3 2 4 5 1 1 _10 1 _10 3 4 5 4 4 8
A1 incorrectly returns -n for the second item (5) when there is no larger item to the left even though there is a larger to the right and returns the wrong index for the eighth item (4) which has a nearer larger item (5) on its right. This program can be corrected and significantly simplified in J style programming:
A1j =: 3 : 0 n =. #y nln =. i.0 for_i. i.n do. item =. i{y left =. i while. left>:0 do. if. item<left{y do. break. end. left =. <:left end. left =. (left<0){left,-n right =. i while. right<n do. if. item<right{y do. break. end. right =. >:right end. right =. (right=n){right,3*n larger =. (i-left)<:right-i nln =. nln,larger{right,left end. ) A1j 4 5 3 8 6 3 2 4 5 1 1 3 1 _10 3 4 5 8 4 8
This program is much shorter and increasingly faster than A1 and uses much less space.
It is desirable to change default index (-n) to i-n (and 3*n to i+n) so that the output can index actual items directly. So, use a negative default index for the largest item and eliminate the inner loops:
A1j =: 3 : 0 n =. #y nln =. i.0 for_i. i.n do. item =. i{y left =. |.item<i{.y left =. (+./left){n,>:left i. 1 right =. item<}.i}.y right =. (+./right){n,>:right i. 1 nln =. nln,(left<:right){i+right,-left end. )
Example:
,i =. A1j v =. 4 5 3 8 6 3 2 4 5 1 1 3 1 _7 3 4 5 8 4 8
Then, to index items, simply enter:
i{v 5 8 5 8 8 6 3 5 6 5 NB. default is largest item
Now change Asano’s problem to return items instead of indexes and to not prefer the left item in case of ties – that is, to solve the Nearer Larger Neighbor problem stated above.
A1a =: 3 : 0 n =. #y all =. i.0 for_i. i.n do. left =. right =. i while. look=.left>:0 do. large =. left{y if. large>i{y do. break. end. left =. <:left end. left =. left%look large =. large*look while. look=.right<n do. larger =. right{y if. larger>i{y do. break. end. right =. >:right end. right =. right%look larger =. larger*look nearer =. (i-left)(>,>:)right-i all =. all,>./nearer{large,larger end. )
Example:
A1a 4 5 3 8 6 3 2 4 5 1 5 8 8 0 8 6 4 5 6 5 NB. items (with 0 default)
This program is much faster and much slimmer than A1 despite doing more work to find the larger of two items at equal distance.
A2
Asano’s Algorithm 2 is a bidirectional scan by increasing distance, where the output is indexes with ties resolved to the leftmost index and -n is the default index (like A1).
A2 =: 3 : 0 n =. #y nln =. n#0 for_i. 0 To n-1 do. nln =. (-n) i} nln for_k. 1 To Larger(i,n-i+1) do. if. (i-k)>:0 do. if. (i{y)<(i-k){y do. nln =. (i-k) i} nln break. end. end. if. (i+k)<n do. if. (i{y)<(i+k){y do. nln =. (i+k) i} nln break. end. end. end. end. nln ) To =: [ + i.@>:@-~ Larger =: >./ A2 4 5 3 8 6 3 2 4 5 1 1 3 1 _10 3 4 5 8 4 8
Again, change Asano’s problem to return items instead of indexes and to return the larger item at equal distance for the Nearer Larger Neighbor problem:
A2a =: 3 : 0 n =. #y nln =. i.0 for_i. i.n do. for_d. >:i.i>.n->:i do. left =. right =. 0 if. i>:d do. left=.(i-d){y end. if. i<n-d do. right=.(i+d){y end. larger =. left>.right if. larger>i{y do. break. end. larger =. 0 end. nln =. nln,larger end. )
Same example:
A2a 4 5 3 8 6 3 2 4 5 1 5 8 8 0 8 6 4 5 6 5 NB. items with 0 default
This program is about as inefficient as A2, mainly because of the algorithm. (Incidentally, using a while loop instead of a for loop is slower but saves a lot of space.)
Proposed A2 Algorithm
I propose an algorithm that searches in two stages -- a minimum distance (near) left and right, then the remaining distance left or right:
A2a =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. d =. 1 near =. i <. n-i+1 while. d<:near do. left =. (i-d){y right =. (i+d){y larger =. left >. right if. larger>i{y do. goto_next. end. d =. d+1 end. near =. n-1+2*near if. i<n%2 do. d =. n-near else. d =. near-n+1 end. while. near>0 do. larger =. d{y if. larger>i{y do. goto_next. end. d =. d+*d near =. near-1 end. larger =. 0 NB. default label_next. i =. i+1 all =. all,larger end. )
This program is much more efficient than A2 even though it returns items instead of indexes and does more work by computing the larger of two items at the same distance instead of exiting after finding a larger item on the left. Indeed, it is 10+ times faster than A2 and uses only 37% space for n=1e5 and increasingly faster for higher n. It can be improved further by using a two-item index (d) for distances left and right:
A2a =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. d =. i + _1 1 while. *./d~:_1,n do. larger =. >./ d{y if. larger>i{y do. goto_next. end. d =. d+_1 1 end. direction =. i<-:n d =. direction{d direction =. direction{_1 1 while. *./d~:_1,n do. larger =. d{y if. larger>i{y do. goto_next. end. d =. d+direction end. larger =. 0 label_next. i =. >:i all =. all,larger end. )
This provides a 20% speed-up in slightly less space. This is the slimmest program here.
Try combining the two inner while loops and removing goto, using a default of –infinity:
A2a =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. d =. i whilst. (larger<:i{y)*.+./OK do. d =. d + _1 1 * OK=.d~:0,n-1 larger =. >./OK#d{y NB. default __ end. i =. i+1 all =. all,larger end. )
This is much shorter but slower, using slightly more space.
A better idea is to find the largest item first and exit the inner loop:
A2a =: 3 : 0 n =. #y max =. >./y i =. 0 all =. i.0 while. i<n do. it =. i{y if. it=max do. larger=.0 goto_next. end. d =. i while. it>:larger=.>./d{y do. d =. d + _1 1 * d~:0,n-1 end. label_next. i =. i+1 all =. all,larger end. )
This is 40% faster in the same space.
Lastly, try using a for loop and replace 0 defaults with each larger item found.
A2b =: 3 : 0 n =. #y all =. y<>./y for_i. all#i.n do. d =. i while. (i{y)>:larger=.>./d{y do. d =. d+_1 1*d~:0,n-1 end. all =. larger i} all end. )
Nicely shorter and almost as speedy, although 40% fatter.
New Algorithms
Now consider my algorithms, presented in J. All programs allow any list of positive numbers input and return the same number of selected numbers output, with 0 as a default for no larger neighbor (unless noted otherwise).
Parallel Search by Position
Begin with a straightforward vector processing approach. Program NLN0 computes the nearer larger neighbor for each index position by creating a table of Neighbors from lists Lefts and Rights, finding the Largers in parallel, then selecting the Nearer item whose First indexed item is less than the input:
EACH =: "0 _ NLN0 =: All NLN EACH ] All =: i.@# NLN =: { Nearer Larger@Neighbors Neighbors =: Lefts ,: Rights Lefts =: |.@{. Rights =: }.@}. Larger =: >./ Nearer =: < First ] First =: {.@#
For example:
NLN0 4 5 3 8 6 3 2 4 5 1 5 8 8 0 8 6 4 5 6 5
This could be written as a one-liner: NLN0 =: i.@# ({ (< {.@# ]) |.@{. >./@:,: }.@}.)"0 _ ]
Obviously the shortest program here, this runs a little faster in a little more space. It could also be defined explicitly:
NLN0x =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. x =. i{y lefts =. |.i{.y rights =. }.i}.y i =. i+1 all =. all , x Nearer Larger lefts,:rights end. )
3% slower in 10% less space for n=1e5.
A slightly different tacit definition is:
NLN0a =: All Nearer@Larger@Neighbors EACH ] All =: i.@# Neighbors =: Lefts,: Rights Lefts =: 0: , |.@{. Rights =: }. Larger =: >./ Nearer =: (> {.) First ] First =: {.@# Larger =: >./ Larger =: >./
This has about the same efficiency as NLN0.
Also, ^: can be used to drop the Largers iteratively until the first is larger:
NLN0b =: All {.@Neibors EACH ] All =: i.@# Neibors =: { Near ^: While ^: _ Lefts Larger@,: Rights Near =: }.@] While =: >: {. Lefts =: |.@{. Rights =: }.@}. Larger =: >./
Short, but slow and fat.
Parallel by Distance
A different approach computes each distance in parallel instead of each position:
EACH1 =: "0 1 FLIP =: &. |. NLN1 =: ] Nearer EACH1 |:@LN1 Nearer =: < First ] First =: {.@# LN1 =: All1 Neighbors1 EACH ] All1 =: }.@i.@# NB. distances Neighbors1 =: Lefts1 >. Rights1 Rights1 =: #@] {. }. Lefts1 =: Rights1 FLIP
This program is 3 times slower than NLN0 and runs out of memory at n=1e5. It is possible to use Transpose, but it also requires excessive time and even more space:
T =: &. |: NLN1 =: ] Nearer EACH1 T All1 Neighbors1 EACH ]
Shifting
Try shifting in Lefts and Rights:
NLNs =: All NLN EACH ] All =: i.@# NLN =: { Nearer Neighbors Neighbors =: Lefts >. Rights Lefts =: #@] {. |.@{. Rights =: #@] {. }.@}. Nearer =: < First ] First =: {.@#
This uses 30% less space than NLN0.
Indexing
A different approach indexes each distance left and right. Explicitly:
NLN2 =: 3 : 0 all =. i.0 while. all <&# y do. all =. all , (#all) NLNx y end. ) NLNx =: 3 : 0 : item =. x{y lefts =. (-#y){.x{.y rights =. (#y){.x}.y d =. 1 while. d<#y do. larger =. (d{rights)>.(-d){lefts if. larger>item do. larger return. end. d =. d+1 end. 0 )
This is much more efficient than NLN0 in time and space.
Masking
Another approach is to mask indexes progressively and store found largers for positions at distances d=1, 2, etc.
NLN3 =: 3 : 0 n =. #y all =. n#0 alli =. i.n d =. 1 while. d<n do. y =. y,0 lefts =. (alli-d){y rights =. (alli+d){y largers =. lefts >. rights mask =. largers > alli{y i =. mask # alli largers =. mask # largers all =. largers i} all alli =. alli -. i d =. d+1 end. all )
Very fast but fat.
Better yet, remove the index of the largest item before searching.
NLN3 =: 3 : 0 n =. #y all =. y < >./y alli =. all # i.n d =. 1 while. 0<#alli do. y =. y,0 largers =. (alli-d){y largers =. largers >. (alli+d){y all =. largers alli} all largers =. largers <: alli{y alli =. largers#alli d =. d+1 end. all )
Faster and much less space for high n.
A variant using a +/ table:
NLN3a =: 3 : 0 n =. #y all =. y < >./y alli =. all # i.n d =. 0 while. (#alli)>0 do. y =. y,0 d =. d + 1 _1 largers =. >./ (d +/ alli){y all =. largers alli} all largers =. largers <: alli{y alli =. largers#alli end. all )
3% faster in 16% more space.
Try another masking definition:
NLN3c =: 3 : 0 n =. #x =. y y =. __ * y = >./y NB. __ default d =. 1 whilst. 0 e. y do. lefts =. (-n){.(-d)}.x rights =. n {. d }. x largers =. lefts >. rights largers =. largers * largers>x y =. y + largers * y=0 d =. d+1 end. y )
This conserves space but takes much more time.
Matrix
Dare a matrix approach (without looping):
NLN4 =: 3 : 0 ns =. ->:i.#y right =. }:ns {."0 1 y left =. }:|."1 ns {."0 1 |.y larger =. left >. right mask =. larger >"1 y new =. larger *&{: mask first1 =. |. </\ |.mask *"1 -.{:mask old =. +/ larger * first1 new + old )
Slow and very fat.
Boolean
Now try a Boolean approach within an inner loop:
NLN5 =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. x =. i{y d =. 1 whilst. {./:~neighbors<:x do. neighbors =. d < (i+1),(n-i) neighbors =. neighbors # i + d * _1 1 neighbors =. neighbors { y d =. d+1 end. i =. i+1 all =. all,>./neighbors NB. __ default end. )
Slim, but not so fast.
Spider
Use Infix \ (scan) to search like a spider stepping simultaneously to both sides in increasingly larger steps:
NLN6 =: 3 : 0 n =. #y all =. n#0 smaller =. y < >./y half =. -:n d =. 1 while. d<half do. dd =. >:+:d largers =. dd Spider\ y largers =. (d{.d}.y),largers,(-d){.(-d)}.y larger =. smaller*.largers > y all =. all+larger*largers smaller =. larger~:smaller d =. d+1 end. while. d>0 do. largers =. (-n){.d{.y largers =. largers >. n{.(-d){.y larger =. smaller*.largers > y all =. all+larger*largers smaller =. larger~:smaller d =. d-1 end. all ) Spider =: {. >. {:
Very slow in lots of space.
Recursion
The Nearer Larger Neighbor problem can be solved recursively:
EACH =: “0 _ ELSE =: ` WHEN =: @. NLN7 =: All {.@NLNr EACH ] All =: i.@# NLNr =: Leftr Nearerr Rightr Leftr =: |.@{. , { NB. item on end Rightr =: 1: |. }. Nearerr =: Nearerr&}. ELSE 0: WHEN Done ELSE Largerr WHEN (Largerr>N) Largerr =: >.&{. N =: >.&{: Done =: +.&# = 0:
Not efficient, of course.
Comparisons
Now compare the most competitive programs by ratios of time, space, and length -- where 1.00 is the best – for the same n=1e6 non-distinct random positive integers. Finally, sum the ratios for a composite of overall program efficiency. See Appendix below for benchmark details and copies of these programs.
Time Space Length Overall A1a 32.9 1.67 1.21 35.8 A2a 10.1 1.00 1.40 12.5 A2b 10.4 1.42 1.00 12.8 NLN3 1.03 4.08 1.25 6.37 NLN3a 1.00 4.75 1.17 6.92 NLN5 19.2 1.00 1.40 21.6
So, the winner is NLN3, using the Masking approach.
Readers may wish to use a higher n, average over several samples, and weight the three criteria.
Conclusion
The Nearer Larger Neighbor problem was introduced and programmed in J. Definitions of programs were compared, and the best was determined. (It was not the fastest, nor slimmest, nor shortest.) To follow up, see Essays/NearestLargerNeighbor for programs for 2-D, 3-D, and n-D arrays.
References
[1] T. Asano, S. Bereg, and D. Kirkpatrick, "Finding nearest larger neighbors: a case study in algorithm design and analysis", Lecture Notes in Computer Science, Efficient Algorithms, pp. 249-260, Albers, Alt & Naeher (eds.) Springer, 2009
[2] Weisstein, Eric W. “Nearest Neighbor Problem”, http://mathworld.wolfram.com
[3] Lifshits, Y. “The Homepage of Nearest Neighbors and Similarity Search”, http://simsearch.yury.name
Appendix
Benchmarks for time and space were obtained on a Dell Latitude E6410 laptop (64-bit OS M620 at 2.67 GHz with 4GB RAM) running J6.02.
Length = total number of characters in a program definition body, counting only 1 for each name.
Utility programs used:
Time =: 6 !: 2 Space =: 7 !: 2 Odom =: #~ #: i.@^
Example to match all lists of 6 integers:
*./(NLN3a -: NLN3)"1 Odom~6 1
Example plots:
load 'plot' plot n ; (Time,10&^.@Space)"1 'NLN0 ?#~',"1 ":,.n=.1000*>:i.10 NB. Note log space for common scale.
Special cases:
For an empty input, all programs return an empty output. For a list of constants, all programs return defaults.
Programs compared:
A1a =: 3 : 0 n =. #y all =. i.0 for_i. i.n do. left =. right =. i while. look=.left>:0 do. large =. left{y if. large>i{y do. break. end. left =. <:left end. left =. left%look large =. large*look while. look=.right<n do. larger =. right{y if. larger>i{y do. break. end. right =. >:right end. right =. right%look larger =. larger*look nearer =. (i-left)(>,>:)right-i all =. all,>./nearer{large,larger end. ) A2a =: 3 : 0 n =. #y max =. >./y i =. 0 all =. i.0 while. i<n do. it =. i{y if. it=max do. larger=.0 goto_next. end. d =. i while. it>:larger=.>./d{y do. d =. d + _1 1 * d~:0,n-1 end. label_next. i =. i+1 all =. all,larger end. ) A2b =: 3 : 0 n =. #y all =. y<>./y for_i. all#i.n do. d =. i while. (i{y)>:larger=.>./d{y do. d =. d+_1 1*d~:0,n-1 end. all =. larger i} all end. ) NLN3 =: 3 : 0 n =. #y all =. y < >./y alli =. all # i.n d =. 1 while. 0<#alli do. y =. y,0 largers =. (alli-d){y largers =. largers >. (alli+d){y all =. largers alli} all largers =. largers <: alli{y alli =. largers#alli d =. d+1 end. all ) NLN3a =: 3 : 0 n =. #y all =. y < >./y alli =. all # i.n d =. 0 while. (#alli)>0 do. y =. y,0 d =. d + 1 _1 largers =. >./ (d +/ alli){y all =. largers alli} all largers =. largers <: alli{y alli =. largers#alli end. all ) NLN5 =: 3 : 0 n =. #y i =. 0 all =. i.0 while. i<n do. x =. i{y d =. 1 whilst. {./:~neighbors<:x do. neighbors =. d < (i+1),(n-i) neighbors =. neighbors # i + d * _1 1 neighbors =. neighbors { y d =. d+1 end. i =. i+1 all =. all,>./ neighbors NB. __ default end. )
Contributed by Howard Peelle