NYCJUG/2021-06-08
Meeting Agenda for NYCJUG 20210608
Beginner's regatta
The "copy" verb, dyadic "#", illustrates how the values of left argument can be taken from increasingly large domains to extend the functionality of the verb in a logical fashion.
Copy is perhaps most commonly used with a Boolean left argument to retain items corresponding to ones and remove those corresponding to zeros.
   1 0 1 # 'ABC'
AC
   1 0 1 0 0 1 # 1 22 333 4444 55555 666666
1 333 666666
If we think of the left argument as specifying the number of copies to make of the compatible list of items on the right, it's easy to see how extending the left argument from Boolean to integer makes sense.
   1 2 3 # 'ABC'
ABBCCC
   (i.5) # i.5
1 2 2 3 3 3 4 4 4 4
Perhaps not as intuitive is the extension of the left argument to complex numbers where the real part specifies the number of copies and the imaginary part specifies the number of fill elements.
   1+j. i.5                 NB. Complex left argument specifying increasingly more fill elements.
1 1j1 1j2 1j3 1j4
   (1+j. i.5) # i.5         NB. Zero fill elements for zero, one for one, etc.
0 1 0 2 0 0 3 0 0 0 4 0 0 0 0
   (1+j. i.5) # 'ABCDE'
AB C  D   E
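As a brief aside (not part of the original examples), the fill atom used by copy can itself be specified with the fit conjunction !. :

   (1+j. i.5) #!.'*' 'ABCDE'    NB. Aside: same copy counts, but '*' as the fill atom
AB*C**D***E****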
Show-and-tell
Where we see that benchmarks are highly sensitive to many different factors.
Benchmarks
Following are a range of different benchmarks, from those more or less appropriate to J to those of questionable value which attempt to compare performance between languages.
Basic Factors
When looking into some benchmarks proposed on the J forum, I found a set of basic benchmarks I had created some years ago, some results of which are here and code here. These were intended to test a few of the most basic factors affecting overall performance: floating-point and integer arithmetic, and file reads and writes. For example, on a machine running a 2.80 GHz Intel i9-9000 processor, here we see the minimum, maximum, mean and standard deviations of a number of timings.
Starting: 2021 6 7 23 19 2.408
Floating-point arithmetic: min, max, mean, SD: 0.101 0.261955 0.180248 0.0784597
Integer arithmetic: min, max, mean, SD: 0.12477 2.61215 1.35106 1.25349
File writes: min, max, mean, SD: 0.018953 0.0354831 0.0241326 0.00569215
File reads: min, max, mean, SD: 7.06e_5 0.206807 0.0168366 0.0316145
Random File reads: min, max, mean, SD: 0.0186341 0.135364 0.0421362 0.035625
Done: 2021 6 7 23 19 43.083
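The general shape of these tests can be sketched as follows. This is a minimal reconstruction of my own, not the code linked above; the verb names (mean, sd, stats, timeFP) and the floating-point workload are made up for illustration.

   mean=: +/ % #
   sd=: %:@(+/@:*:@(- mean) % #)         NB. (population) standard deviation
   stats=: <./ , >./ , mean , sd         NB. min, max, mean, SD
   timeFP=: 6!:2@('+/ %: ? 1e6 $ 0'"_)   NB. time summing square roots of 1e6 random floats
   stats timeFP"0 i. 10                  NB. statistics over ten timed runs (output is machine-dependent)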
These benchmarks would be useful for comparing the performance of different versions of J on the same machine or different machines running the same version of J. For instance, here are these timings run on a 1.80 GHz AMD A10-8700P Radeon R6 for the same version of J as the above:
Starting: 2021 6 7 23 29 17.877
Floating-point arithmetic: min, max, mean, SD: 0.694389 1.90298 1.28435 0.544883
Integer arithmetic: min, max, mean, SD: 0.636503 12.7672 6.50783 5.75177
File writes: min, max, mean, SD: 0.0608219 0.16584 0.105032 0.0279952
File reads: min, max, mean, SD: 0.000154 0.323715 0.0413655 0.0740718
Random File reads: min, max, mean, SD: 0.0548537 0.311063 0.153417 0.112474
Done: 2021 6 7 23 32 21.216
This tells us that the first machine is 2.5 to 7 times as fast as the other for these basic operations.
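For instance, dividing the second machine's means by the first machine's, benchmark by benchmark (a quick check using the mean values reported above):

   1.28435 6.50783 0.105032 0.0413655 0.153417 % 0.180248 1.35106 0.0241326 0.0168366 0.0421362
7.12546 4.81683 4.35229 2.45688 3.64098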
A much older machine, running a 2.40 GHz Intel i5 M520, gives these results:
Starting: 2021 6 7 22 14 41.609
Floating-point arithmetic: min, max, mean, SD: 0.737734 1.9486 1.28483 0.532203
Integer arithmetic: min, max, mean, SD: 0.618851 7.85335 4.17601 3.64397
File writes: min, max, mean, SD: 0.596799 24.4731 2.97664 4.58737
File reads: min, max, mean, SD: 8.08273e_5 1.68123 0.061683 0.12634
Random File reads: min, max, mean, SD: 0.0185441 0.0989287 0.0572687 0.0342837
Done: 2021 6 7 22 19 31.083
This machine is substantially worse on the file-writing benchmark but otherwise is not much worse than the 2nd 1.80 GHz machine (comparing means):
   1.28483 4.17601 2.97664 0.061683 0.0562687 % 1.28435 6.50783 0.105032 0.0413655 0.153417
1.00037 0.64169 28.3403 1.49117 0.36677
Both of these latter two machines use SSD drives. One would think this would make quite a difference on the file benchmarks but it does not always do so. Here are the results from running on the first machine but using its SSD drive:
Starting: 2021 6 7 23 50 29.969
Floating-point arithmetic: min, max, mean, SD: 0.0986838 0.251052 0.173908 0.0760136
Integer arithmetic: min, max, mean, SD: 0.123842 2.57652 1.32949 1.23506
File writes: min, max, mean, SD: 0.0211734 0.0348024 0.0259054 0.00524834
File reads: min, max, mean, SD: 7.41e_5 0.111038 0.0224532 0.0402293
Random File reads: min, max, mean, SD: 0.0196967 0.0549903 0.036197 0.015408
Done: 2021 6 7 23 51 12.924
The difference in the file operations:
   0.0241326 0.168366 0.0421362 % 0.0259054 0.0224532 0.036197
0.931566 7.49853 1.16408
So the sequential file read is substantially faster but the others are about the same.
Furthermore, these timings are very sensitive to whatever else is running on the machine, as one would expect.
Attempt at Cross-Language Benchmarks
A correspondent on the J forum proposed some benchmarks in an attempt to do cross-language comparisons but this raises a number of difficulties, not the least of which is that J pays a penalty for code written in an unnecessarily looping fashion. We use
ts =: 6!:2 , 7!:2@]
to find the amount of time and space a given expression uses.
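For example (a usage sketch; the numbers returned depend entirely on the machine, so none are shown here):

   ts '+/ i. 1e6'     NB. two-item result: seconds elapsed, bytes of space used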
The benchmarker tried to compensate for this by including both code that mimics the method of scalar languages, for adding up numbers in a loop, and a more J-like version of the same code. So we have the loopy version:
NB. sumloop
NB. loop to sum by incrementing in a loop
NB. done in non-J way
sumloop =: 3 : 0
 sum =. 0
 for_i. i. y do.
  sum =. sum + 1
 end.
 sum
)
And the more J-like version:
NB. sumj
NB. sum consecutive integers as in J
sumj =: 13 : '+/ y$1'
As one might expect, the loopy version is slower:
   (10) 6!:2 'sumloop 1e6'
0.19529
   (10) 6!:2 'sumj 1e6'
0.00053342
   0.19529%0.00053342
366.109
However, I would argue that an even more J-like version would be this one:
   sumj2=: <:@:>:^:#       NB. Loop using the power conjunction
   (10) 6!:2 'sumj2 1e6'
1.82e_6
   (10) 6!:2 'sumj 1e7'
0.00322712
   (10) 6!:2 'sumj2 1e7'
1.66e_6
   (10) 6!:2 'sumj 1e8'
0.0372194
   (10) 6!:2 'sumj2 1e8'
1.73e_6
This one seems to take the same amount of time regardless of its argument and is enormously faster:
   0.0372194%1.73e_6
21514.1
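The reason sumj2 is flat, as I read the definition, is that u^:v applies u (v y) times, and the tally of a scalar argument is 1, so the increment-then-decrement runs exactly once no matter how large y is; it gives the same answer as sumj only because the sum of y ones is just y.

   # 1e8           NB. the tally of a scalar is 1 ...
1
   <:@:>: 1e8      NB. ... so sumj2 does one increment and one decrement, whatever the value of y
1e8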
Adjusting a Benchmark Using Inside Knowledge
Henry Rich responded to this thread by noting that "[b]enchmarks such as these are hard to make meaningful and easy to misuse, perhaps maliciously."
He gives an example of what he calls "counterbenchmarks: problems to show off the really beautiful algorithms in JE" (the J engine):
1. grade 1e4 integers. I pick 1e4 because it fits in cache. If the arguments are too long the CPU is waiting for memory.

   CLOCKSPEED =: 2.5e9      NB. My CPU's clock frequency
   r =: 1e4 ?@$ 2e4
   (#r) %~ CLOCKSPEED * 100000 (6!:2) '/: r'
12.2257                     NB. 12 cycles per atom!
   r =: 1e4 ?@$ 1e9
   (#r) %~ CLOCKSPEED * 100000 (6!:2) '/: r'
54.1968
The caching effect is noticeable. Here we see that increasing the size of the argument by a factor of 10 takes 14 times as long:
   r =: 1e4 ?@$ 2e4         NB. 1e4 fits in cache
   100000 (6!:2) '/: r'
4.74625e_5
   r =: 1e5 ?@$ 2e4         NB. 1e5 exceeds cache
   100000 (6!:2) '/: r'
0.000691088
   0.000691088 % 4.74625e_5
14.5607
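Normalizing by the number of atoms makes the point more directly: per atom, the out-of-cache run is only about 1.46 times slower, the rest of the factor of 14 coming simply from the longer argument:

   (0.000691088 % 1e5) % 4.74625e_5 % 1e4
1.45607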
A Classic Array-language Benchmark Through the Years
Finally, no exposition of J benchmarks would be complete without referring to one that has been used in the APL world for quite a few years: matrix inversion of a 50 by 50 matrix of integers, i.e. 6!:2 '%.?50 50$1000' in J. Looking here, we see that the time for this has come down from over 2800 seconds in 1990 to 5.6e_5 seconds recently, a speedup of about 50 million times.
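The arithmetic behind that figure:

   2800 % 5.6e_5      NB. roughly a fifty-million-fold speedup
5e7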
Learning and Teaching J
Dan is teaching himself Dyalog and would like any help anyone can offer. He has been putting together a Rosetta Stone page between APL and J here.
This reminded me of a talk a while back about someone going in the opposite direction.
An APLer Learning J
Back in 2009, Paul L. Jackson, a long-time APLer, gave a talk on his rewarding experience learning J as someone coming from an APL background.
Paul cannot pin down exactly when he started learning J because his Microsoft Exchange emails are unavailable due to his unwillingness to use MS software in the past 10 years.
He showed us a cheat sheet to map APL to J equivalents. He uses J as a "broken key" APL, i.e. with work-arounds for parts he does not yet understand. Here is an example similar to Dan's Rosetta page where he aligns equivalent APL and J primitives.
He has been working on writing a version of APL in .NET which reflects an effort that STSC had begun.
While working on his own JavaScript implementation of APL, he met someone else working on an APL in JavaScript: Joe Bogner, who was on the call. Apparently he had trouble implementing adverbs and had a philosophical objection to quad-CT, so Paul added it to the implementation. Paul was favorably impressed by the performance of APL written in JavaScript.
He notes J's prejudice toward the first dimension. There had been talk of defaulting to first dimension at least back as far as Roland Pesch working on SAX(?), so Ken had been thinking of this for some time before Arthur Whitney got approval of this for A+. Arthur came back from a weekend with Ken and declared "Dad said it was OK". This seems obvious in retrospect when looking at concatenation since it's much easier to add a new row than a new column.
There was also dispute about how encode and decode worked - Larry Breed wanted compatibility between decode and encode with inner product.
Brackets seem truly foreign even in old APL. He had a point about parenthesis usage:
   this=: |.>:i.3
   this; 2 3; 'fun'        NB. This works
NB. but
   this; i. 2 3; 'fun'     NB. This fails
   this; (i. 2 3) ; 'fun'  NB. We need to do this because of right-to-left execution.
This sort of thing trips up novices even now.
He also had a comment about internal event-handling: how it changed to more of an object model, basing the locus of execution on which object has focus.
Looking at J's z vs. j Namespaces and Paul's Utilities
All names that begin with uppercase are for public consumption; those beginning with lowercase are internal, as seen here.
If you work with a language that keeps its code in text files rather than in workspaces, you don't want something like ")SAVE" with no argument.
His set of names is now publicly available; they have what he sees as handy default behavior, e.g. Load re-loads the last thing loaded as seen in the examples here.
Adverbs and Conjunctions
He had inadvertently left an argument ("y", or "x" and "y") out of a definition and found that it returned something other than a function: an adverb or conjunction applied this way returns a verb. Leaving off "x" or "y" does this; leaving off "u" or "v" instead gives a value error on "y".
"Del" is his most-used one-liner ever.
"Dr" is a verb; "Rn" is a conjunction: <directory> Rn View
So, "'lessons' Rn View" looks at what has been done already and picks up where it left off. He uses the abbreviation "LRC" for "Load, Run, Copy".
Here is a list of some of his utilities:
His utilities often work with the clipboard, taking implicit arguments from there.
His "Use" verb does different things based on the file extension but load, run, and copy know on which extension it should work.
All his publicly-available work is here or thereabouts.
Joe Bogner asked about Paul's use of Android: has he used J on a laptop or the like? Paul seems geared toward a simple interface, based on his extensive use of beta releases. He feels more comfortable looking down, as when reading a book, rather than up at a laptop screen.
"plj" is publicly available namespace; "pj" is the internal namespace. The convention of starting a public namespace with "p" is documented somewhere?
Podcasts
Adam Gordon Bell interviewed software evangelist Gabriel Gonzalez on the Corecursive podcast. Among other things, they talked about how a language becomes popular.
The second Array Cast podcast is now available.