NYCJUG/2013-01-08
program readability, presenting code, spline, distance between points, analyze sentence, function composition, IDE versus command-line
Location:: BEST
Meeting Agenda for NYCJUG 20130108
1. Beginner's regatta: code readability and flexibility - an example of finding the distance between two points: see "Distance Between Two Points in J.pdf". 2. Show-and-tell: the usefulness of examples - see "Example Code for Spline-fitting.doc". Dissect: A graphical, 2-D J sentence analyzer; see "Introduction to Dissect - 2-D J Sentence Analyzer". 3. Advanced topics: See "The Benefits of Function Composition". Using QT as the GUI for J? Discuss. Also, see "sketching a user interface with Denim.doc". 4. Learning, teaching and promoting J, et al.: who needs an IDE? See "The IDE As a Bad Programming Language Smell.doc".
Beginner's regatta
Code Readability: Distance Between Two Points in emacs Lisp
[From http://www.mitchellsoftwareengineering.com/ProgrammingWithGNUEmacsLisp.pdf]
"Consider a function linelen that computes the distance between two points represented as dotted-pairs:
ELISP> (linelen '(0 . 0) '(1 . 1)) 1.4142135623730951
Definition:
(defun linelen (p1 p2) (setq x1 (car p1)) (setq y1 (cdr p1)) (setq x2 (car p2)) (setq y2 (cdr p2)) (setq xdiff (- x2 x1)) (setq ydiff (- y2 y1)) (sqrt (+ (* xdiff xdiff) (* ydiff ydiff))) ; return value )
How would you rate the readability of the above code?"
Distance Between Two Points in J
Since we have arrays in J, there's no need to add to the complexity of the code by treating the elements of an (X,Y) pair as separate, unrelated scalars as is done above. Also, at the end of our exercise, we'll see how much more general and flexible the resulting array code can be.
Say we have two named point-sets like this:
pts0=. 0 0 pts1=. 1 1
The difference between the Xs and the Ys in the respective sets is simply
pts0-pts1 _1 _1
Squaring each of these:
*:pts0-pts1 1 1
Summing the squares:
+/*:pts0-pts1 2
Taking the square root of the sum of the squares:
%:+/*:pts0-pts1 1.41421
So, if we want to encapsulate this logic in a dyadic, explicit verb:
linelen=: 4 : '%:+/*:x-y'
Checking our work:
pts0 linelen pts1 1.41421
How would you rate the readability of the above code?
Variations on a Theme
But, wait, is this the best way to code this? Consider first that "square" (*:) and "square-root" (%:) are inverses applied “under” summation (+/) and we have an adverb that does just this: "under" (&.). Also consider that we may want a tacit form of this verb and we may want it to be monadic. Taking these first two together and using the tacit translator “13 : “ as a crutch, we might try this:
13 : '+/&.*:x-y' [: +/&.*: - pts0 ([: +/&.*: -) pts1 1 1
It doesn’t work – why? Based on the shape of the result, we might suspect a problem with rank. Unfortunately, we can’t specify the rank of an adverb:
pts0 ([: +/(&."_)*: -) pts1 |syntax error | pts0([:+/( &."_)*:-)pts1
Abandoning this idea for the time being, let’s look instead at representing this tacitly as a monadic verb. An advantage of a monadic verb is that we can take a single argument – a 2 x n array of point pairs.
]pp=: pts0,:pts1 0 0 1 1 ([:%:[:+/[:*:[:-/]) pp 1.41421
Notice how this requires us to use minus reduction (-/) instead of subtraction. Let’s extend this to more point pairs to see how it fares.
]pts=. (0 0,1 1,:2 2) ,: 1 1,2 2,:0 0 0 0 1 1 2 2 1 1 2 2 0 0
So we have three sets of point pairs in a 2 x 3 x 2 array. The pairs between which we'll find the distance are ((0,0),(1,1)), ((1,1),(2,2)), and ((2,2),(0,0)).
([:%:[:+/[:*:[:-/]) pts 2.44949 2.44949
This is obviously wrong because it's not the three answers we want – what went wrong? Let’s apply the verbs in an increasing train to see where we go astray.
pts; (-/pts) ; (*:-/pts) ; +/*:-/pts +---+-----+---+---+ |0 0|_1 _1|1 1|6 6| |1 1|_1 _1|1 1| | |2 2| 2 2|4 4| | | | | | | |1 1| | | | |2 2| | | | |0 0| | | | +---+-----+---+---+
Clearly we go wrong because the summation is down the first dimension. Fixing this gives us what we’re looking for:
([:%:[: (+/"1) [:*:[:-/]) pts 1.41421 1.41421 2.82843
We could move the reduction (/) outside the body of the function and make the minus (-) dyadic to make the resulting verb dyadic and apply it as a reduction:
([: %: [: +/"1 [: *: -) / pts 1.41421 1.41421 2.82843
Back to Basics and Beyond
Interestingly, when we search the J wiki for a solution to this problem, we find this (at http://www.jsoftware.com/jwiki/Phrases/Geometry#Distance_between_2_points) to calculate the distance between 2 points:
NB.*dPP v distance between 2 points dPP=: [: +/ &.: *: - dPP/pp 1.41421
Notice that this succeeds with our early idea about applying summation "under" (&.:) a verb (*:), then applying that verb's inverse (%:) to the result. The key is this variant of "under" which is applied with infinite rank - &.: - instead of the more familiar version &..
However, this version suffers the same flaw as one of our earlier attempts when we attempt to extend it to multiple point pairs:
dPP/pts 2.44949 2.44949
Fortunately, we can apply the same fix that worked before: specify the summation across 1-dimensional elements of the array:
dPP=: [: (+/"1) &.: *: -
Applying this dyadic verb between the pair of points in pp:
dPP/pts 1.41421 1.41421 2.82843
This also clues us in that the dyadic formulation may be more useful. In fact, it allows us the flexibility for all these useful variants and extensions:
dPP"1 / / pts NB. All point pairs 1.41421 2.82843 0 0 1.41421 1.41421 1.41421 0 2.82843 2 dPP"1 / \ 0 0,1 1,2 2,:10 10 NB. Adjacent point pairs 1.41421 1.41421 11.3137 dPP/0 0 0,:1 1 1 NB. 3-D points 1.73205 dPP/0 0 0 0,:1 1 1 1 NB. 4-D points 2 dPP/0 0 0 0 0,:1 1 1 1 1 NB. 5-D points 2.23607
How would you rate the generality and flexibility of this code?
Show-and-Tell
Code for Spline-fitting
This code, posted by J. Patrick Harrington, fits a curve to some points using spline methods. It’s slightly edited: the major change being to show the examples as “multi-line comments” using “0 : 0” instead of prefixing each comment line with “NB.” as this eases cut-and-paste of the examples.
NB.* ls_spline.ijs: find cubic spline fit by least squares. NB. From http://www.astro.umd.edu/~jph/ls_spline.ijs by NB. J. Patrick Harrington (jph@astro.umd.edu). NB. Ref.: S K Lucas, Gazette Aust. Math. Soc., 30, 207 (2003). NB. t= boundaries of cubic segments, (xi,yi) = data points. NB. Usage: 't z zp'=. OUT=. t ls_spline xi;yi NB. z and zp are values of spline fit f(x) and df/dx at t's. NB. (z & zp are sufficient to evaluate the fit f(x) at any x.) cos=: 2&o. ls_spline=: dyad define 'xi yi'=. /:~&.|: >y NB. sort xi into an ascending sequence idx=. x IdX xi NB. idx is t interval in which each xi resides NB. if first or last interval(s) are empty of data, delete them t=. ({. idx)}.((2-#x)+{:idx)}. x Np1=. 2+ Nm1=. <: N=. #h=. (}.-}:)t bx=. <"1 |: >(i. n=. #xi); t IdX xi v=. _1+ u=. bx{ h%"1~ xi -/ }:t uu=. *: u=. u bx}0$~n,N vv=. *: v=. v bx}0$~n,N aa=. +/ *:al=. (1+2*u)*vv ab=. +/ al*bt=. uu*(1-2*v) ag=. +/ al*gm=. h*"1 u*vv ad=. +/ al*dl=. h*"1 uu*v bd=. +/ bt*dl [bg=. +/ bt*gm [bb=. +/ *:bt dd=. +/ *:dl [gd=. +/ gm*dl [gg=. +/ *:gm D=. +:(0,~ +/ yi*al)+(0, +/ yi*bt) G=. +: (0,~ +/ yi*gm)+(0, +/ yi*dl) bx11=. >: &.> bx00=. ;/ ,.~i. N bx01=. (0 1)&+ &.> bx00 bx10=. (1 0)&+ &.> bx00 A=. (bb bx11}E) + aa bx00}E=. 0$~,~Np1 A=. +: (ab,ab) (bx01,bx10)}A B=. (ag bx00}E) + bd bx11}E B=. +: (bg,ad) (bx01,bx10)}B E=. (gg bx00}E) + dd bx11}E E=. +: (gd,gd) (bx01,bx10)}E bx00=. }: bx00 [ bx01=. }: bx01 bx02=. (0 2)&+ &.> bx00 hip=. }.h [hi=. }:h C=. (3*hip%hi) bx00}F=. 0$~Nm1,Np1 C=. (3*(hi%hip)-hip%hi) bx01}C C=. (_3*hi%hip) bx02}C F=. hip bx00}F F=. (+: hi+hip) bx01}F F=. hi bx02}F NB. set up matrix of linear system to be solved: M=. (A,.(|:B),.|:C),(B,.E,.|:F),C,.F RHS=. (D,G,Nm1#0) NB. right hand side of system ZZ=. RHS %. M NB. solve for f(x_i) & [df/dx]_i z=. (i. Np1){ZZ >t; z; zp=. (Np1+i. Np1){ZZ )
We won't go over this code in any detail. We'll just note that it was copied, with little alteration, from a scalar language. In the highly-repetitive, adjacent lines above, we can hear the cries of an array struggling to get out.
The code continues with some elaboration.
NB. Thanks to Arie Groeneveld for improvements to this code. NB. Note: equation (2) of Lucas should be u_i(x)=(x-t_i)/h_i NB. and the 2nd B in the pseudocode matrix should not be B^T NB. but just B as in equation (7). NB. t IdX x0 --> index of where x0 fits in t-array: NB. 0 1 2 3 4 5 6 <- t values NB. | | | | | | | NB. <-- 0 | 1 | 2 | 3 | 4 | 5 ---> IdX values IdX=: [: <: (1"_ >. [: +/"1 ] >/ [) <. [: <: [: # [ NB. Least squares spline evaluation using output "OUT" NB. of ls_spline. Usage: OUT eval_spline y --> f(y) NB. The argument y may be an array of x-values. eval_spline=: dyad define 't z zp'=. x NB. Array t holds the boundaries of the spline pieces, z is NB. the array of the values of the cubic splines at the NB. t points, and zp (z') the derivatives at those points. N=. # h=. (}.-}:) t v=. _1+ u=. h%~ |: y -/ }:t a=. (1+2*u)*v*v* }:z b=. u*u*(1-2*v)* }.z c=. h*u*v*v* }:zp d=. h*u*u*v* }.zp box=. <"1 |: >(i. #y); t IdX y box { ((#y),N)$,|: a+b+c+d )
Since this code will be fitting a line to some points, the author defines a customized plot verb to show both the points and the line in the same graph.
NB. plot dots & line: (dots x;y) dot_and_line (line x;y) require'plot' dot_and_line=: dyad define pd 'output cairo 600 300' [ pd'reset' NB. Size 300x600 pixels pd'type dot;pensize 4;color red' NB. Distinguish dots as red pd x pd'type line;pensize 3; color blue' pd y pd'show' )
I changed this slightly, adding the "output" statement to specify the plot size and distinguishing the lines from the dots by color differences.
The Unreasonable Effectiveness of Examples
Here's where J can really shine: if we're thoughtful enough to provide specific examples of using our code, it becomes trivially easy to do basic sanity tests and ensure that the code is doing what we expect.
Here we see an example formatted as a J noun (using "0 : 0") in order to keep the lines free of clutter. This simplifies cutting and pasting this example into a J session to watch it perform.
spline0_eguse_=: 0 : 0 t=. 0, 0.2 0.5 0.7 1 xx=. rand 35 yy=. (cos 1p1*xx)+_0.05+0.1*rand 35 'type symbol;pensize 3' plot xx;yy load 'ls_spline.ijs' ]OUT=. t ls_spline xx;yy uu=. int01 300 vv=. OUT eval_spline uu (xx;yy) dot_and_line uu;vv )
If we run this, we'll see a graph like this:
We see the curve fitted to the points (show as dots).
A "Classic" Example
Here, the author defines two utility verbs and provides some data that apparently has appeared frequently in the exposition of this algorithm.
rand=: [: ? ] # 0"_ int01=: i.@>:%] NB. int01 n --> divides [0 ... 1] into n intervals. NB. Test using classic "titanium heat data": splineTi_eguse_=: 0 : 0 ti=: 0.644 0.622 0.638 0.649 0.652 0.639 0.646 0.657 0.652 0.655 ti=: ti,0.644 0.663 0.663 0.668 0.676 0.676 0.686 0.679 0.678 0.683 ti=: ti,0.694 0.699 0.710 0.730 0.763 0.812 0.907 1.044 1.336 1.881 ti=: ti,2.169 2.075 1.598 1.211 0.916 0.746 0.672 0.627 0.615 0.607 ti=: ti,0.606 0.609 0.603 0.601 0.603 0.601 0.611 0.601 0.608 xx=. 595+ 10*i.#ti NB. 8 interval points "judiciously" chosen (for best fit!): t=. 595 820 850 870 900 915 980 1075 OUT=. t ls_spline xx;ti uu=. 595+(1075-595)*int01 400 vv=. OUT eval_spline uu (xx;ti) dot_and_line uu;vv NB. this is the plot shown as "Fit ..." )
If we run this example, we'll see the following:
An Alternative Method
The author presents an alternative algorithm to the above: a least-squares fit of points to "not quite a cubic spline".
NB. (see Ferguson and Staley, Comm. ACM 16, 380 (1973)). NB. This routine is simpler than ls_spline above, since it doesn't enforce NB. continuity of the second derivatives. The results usually look similar NB. but are not so smooth. Given for comparison, but I don't recommend it.
Also useful in any code are explanations of the major inputs and outputs. We might also consider slightly more meaningful names, though there's no need to fall into the excess verbosity favored by some languages.
NB. t= frets, xi and yi data points. NB. Usage: 't z zp'=. OUT2=. t ls_spline2 xi;yi NB. z and zp are values of spline fit f(x) and df/dx at t's. ls_spline2=: dyad define 'xi yi'=. /:~&.|: >y NB. sort xi into an ascending sequence idx=. x IdX xi NB. if first or last interval(s) are empty of data, delete them t=. ({. idx)}.((2-#x)+{:idx)}. x Np1=. >: N=. #h=. (}.-}:)t bx=. <"1 |: >(i. n=. #xi); t IdX xi v=. _1+ u=. bx{ h%"1~ xi -/ }:t uu=. *: u=. u bx}0$~n,N vv=. *: v=. v bx}0$~n,N aa=. +/ *:al=. (1+2*u)*vv ab=. +/ al*bt=. uu*(1-2*v) ag=. +/ al*gm=. h*"1 u*vv ad=. +/ al*dl=. h*"1 uu*v bd=. +/ bt*dl [bg=. +/ bt*gm [bb=. +/ *:bt dd=. +/ *:dl [gd=. +/ gm*dl [gg=. +/ *:gm D=. +:(0,~ +/ yi*al)+(0, +/ yi*bt) G=. +: (0,~ +/ yi*gm)+(0, +/ yi*dl) bx11=. >: &.> bx00=. ;/ ,.~i. N bx01=. (0 1)&+ &.> bx00 bx10=. (1 0)&+ &.> bx00 A=. (bb bx11}E) + aa bx00}E=. 0$~,~Np1 A=. +: (ab,ab) (bx01,bx10)}A B=. (ag bx00}E) + bd bx11}E B=. +: (bg,ad) (bx01,bx10)}B E=. (gg bx00}E) + dd bx11}E E=. +: (gd,gd) (bx01,bx10)}E M=. (A,.|:B),(B,.E) ZZ=. (D,G)%. M NB. solve for f(x_i) & [df/dx]_i z=. (i. Np1){ZZ >t; z; zp=. (Np1+i. Np1){ZZ )
Again, an example or two is worth pages of exposition or code.
NB. Example: spline2_eguse_=: 0 : 0 t=. 0, 0.2 0.5 0.7 1 xx=. rand 35 yy=. (cos 1.5p1*xx)+_0.05+0.1*rand 35 'type symbol;pensize 3' plot xx;yy load'ls_spline2.ijs' ]OUT2=. t ls_spline2 xx;yy uu=. int01 300 vv2=. OUT2 eval_spline uu (xx;yy) dot_and_line uu;vv2 NB. Compare to ls_spline from above: (xx;yy) dot_and_line uu;>vv;vv2 )
The results of the example of the alternative implementation look like this:
Comparing this alternative with the original version (by putting the data for both lines (>vv;vv2) in the last cell of the right argument) shows us that it's hard to distinguish the resulting lines. The alternative version overlays the original version nearly exactly except for the visible difference in the upper-left corner of this graph.
"Dissect" Tool for Analyzing J Sentences
See this page for information on this tool that breaks down a J expression into pieces as they are executed. We can take a look to see how this could have helped us with our exercise above where we were developing the code to find the distance between point pairs.
Advanced Topics
Qt
We talked a little about the recent switch by the J developers to using Qt as the basis for J graphics instead of Gtk. Thomas said, based on his experiences with both environments, that Qt seemed more comprehensive and more well-designed than Gtk. Based on messages on the forums, there still seems to be some teething pains with getting the Qt-based versions of the J GUI to work but development is proceeding rapidly.
Higher-level Functions
Unlike most other functional languages, J does not simply, vaguely refer to "higher-level" functions: we have a specific vocabulary for distinguishing between different kinds of "higher-level" functions. There are adverbs for combining verbs and nouns, and there are conjunctions for combining verbs. We also use the more general term "function composition" to refer to these different things together, as well as "hooks" and "forks" for different modes of composition.
Graham Parkhouse had this to say on the J-Programming Forum about the benefits of function composition in J.
The Benefits of Function Composition
from: Graham Parkhouse <graham.parkhouse@ntlworld.com> to: programming@jsoftware.com date: Thu, Jan 3, 2013 at 3:25 AM subject: Re: [Jprogramming] Atop continues to puzzle me
Subject: Re: Atop continues to puzzle me
This post was initially titled 'The benefits of function composition' - see my P.S. There are things you can achieve with function composition that cannot be achieved so elegantly any other way:
3 4 5<@#"0 i.3 +-----+-------+---------+ |0 0 0|1 1 1 1|2 2 2 2 2| +-----+-------+---------+
This is what I want - a set of 3 0s, a set of 4 1s and a set of 5 2s.
This doesn't give me what I want:
3 4 5#"0 i.3 0 0 0 0 0 1 1 1 1 0 2 2 2 2 2
Nor does this:
3 4 5 ([: <"1 #"0) i.3 +---------+---------+---------+ |0 0 0 0 0|1 1 1 1 0|2 2 2 2 2| +---------+---------+---------+
But this does:
each +--+-+ |&.|>| +--+-+ 3 4 5#"0 each i.3 +-----+-------+---------+ |0 0 0|1 1 1 1|2 2 2 2 2| +-----+-------+---------+
... but then &. is function composition.
The unavoidable problem with not using function composition in such instances is that J turns the result from every execution of a verb into a rectangular array. When a series of verbs is executed in succession without composition, J has to produce a succession of 'properly finished' results along the way. With composition, this finishing process is confined to just once at the end of the process.
P.S. I wrote this before reading Linda's recent post on 'atop continues to puzzle me' (Wed, 02 Jan 2013 03:02:54 -0500):
[At NYCJUG, Thomas pointed out that the "rank 0" ("0) in the last example above is unnecessary because "each" already works at this rank.]
Learning, teaching, and problem-solving
We looked at an essay about problems with IDEs (Interactive Development Environments) - The IDE As a Bad Programming Language Smell - compared with simpler, editor-based methods often preferred by more experienced programmers. Sometimes it seems as though IDEs exist to address common shortcomings in a language - think of the advantage of auto-complete in a verbose language like Java - but they potentially provide benefit for organizing large bodies of code.
Materials
- File:Distance Between Two Points in J.pdf
- File:Example Code for Spline-fitting.pdf
- File:Introduction to Dissect - 2-D J Sentence Analyzer.pdf
- File:The Benefits of Function Composition.pdf
- File:Ls spline.ijs
- File:ProgrammingWithGNUEmacsLisp.pdf
- File:Getting Started Programming with Qt Widgets.pdf
- File:Getting Started with QT.pdf
- File:Sketching a user interface with Denim.pdf
- File:The IDE As a Bad Programming Language Smell.pdf
-- Devon McCormick <<DateTime(2013-01-15T14:22:46-0200)>>