NYCJUG/2021-06-08
Meeting Agenda for NYCJUG 20210608
Beginner's regatta
The "copy" verb, dyadic "#", illustrates how the values of left argument can be taken from increasingly large domains to extend the functionality of the verb in a logical fashion.
Copy is perhaps most commonly used with a Boolean left argument to retain items corresponding to ones and remove those corresponding to zeros.
   1 0 1 # 'ABC'
AC
   1 0 1 0 0 1 # 1 22 333 4444 55555 666666
1 333 666666
If we think of the left argument as specifying the number of copies to make of the compatible list of items on the right, it's easy to see how extending the left argument from Boolean to integer makes sense.
   1 2 3 # 'ABC'
ABBCCC
   (i.5) # i.5
1 2 2 3 3 3 4 4 4 4
Perhaps not as intuitive is the extension of the left argument to complex numbers where the real part specifies the number of copies and the imaginary part specifies the number of fill elements.
   1+j. i.5                 NB. Complex left argument specifying increasingly more fill elements.
1 1j1 1j2 1j3 1j4
   (1+j. i.5) # i.5         NB. Zero fill elements for zero, one for one, etc.
0 1 0 2 0 0 3 0 0 0 4 0 0 0 0
   (1+j. i.5) # 'ABCDE'
AB C  D   E
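As a brief aside (not part of the original examples), the fill atom used by copy can itself be specified with the fit conjunction !. :

   (1+j. i.5) #!.'*' 'ABCDE'    NB. Aside: same copy counts, but '*' as the fill atom
AB*C**D***E****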
Show-and-tell
Where we see that benchmarks are highly sensitive to many different factors.
Benchmarks
Following are a range of different benchmarks, from those more or less appropriate to J to those of questionable value which attempt to compare performance between languages.
Basic Factors
When looking into some benchmarks proposed on the J forum, I found a set of basic benchmarks I had created some years ago, some results of which are here and code here. These were intended to test a few of the most basic factors affecting overall performance: floating-point and integer arithmetic, and file reads and writes. For example, on a machine running a 2.80 GHz Intel i9-9000 processor, here we see the minimum, maximum, mean and standard deviations of a number of timings.
Starting: 2021 6 7 23 19 2.408
Floating-point arithmetic: min, max, mean, SD: 0.101 0.261955 0.180248 0.0784597
Integer arithmetic: min, max, mean, SD: 0.12477 2.61215 1.35106 1.25349
File writes: min, max, mean, SD: 0.018953 0.0354831 0.0241326 0.00569215
File reads: min, max, mean, SD: 7.06e_5 0.206807 0.0168366 0.0316145
Random File reads: min, max, mean, SD: 0.0186341 0.135364 0.0421362 0.035625
Done: 2021 6 7 23 19 43.083
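The general shape of these tests can be sketched as follows. This is a minimal reconstruction of my own, not the code linked above; the verb names (mean, sd, stats, timeFP) and the floating-point workload are made up for illustration.

   mean=: +/ % #
   sd=: %:@(+/@:*:@(- mean) % #)         NB. (population) standard deviation
   stats=: <./ , >./ , mean , sd         NB. min, max, mean, SD
   timeFP=: 6!:2@('+/ %: ? 1e6 $ 0'"_)   NB. time summing square roots of 1e6 random floats
   stats timeFP"0 i. 10                  NB. statistics over ten timed runs (output is machine-dependent)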
These benchmarks would be useful for comparing the performance of different versions of J on the same machine or different machines running the same version of J. For instance, here are these timings run on a 1.80 GHz AMD A10-8700P Radeon R6 for the same version of J as the above:
Starting: 2021 6 7 23 29 17.877
Floating-point arithmetic: min, max, mean, SD: 0.694389 1.90298 1.28435 0.544883
Integer arithmetic: min, max, mean, SD: 0.636503 12.7672 6.50783 5.75177
File writes: min, max, mean, SD: 0.0608219 0.16584 0.105032 0.0279952
File reads: min, max, mean, SD: 0.000154 0.323715 0.0413655 0.0740718
Random File reads: min, max, mean, SD: 0.0548537 0.311063 0.153417 0.112474
Done: 2021 6 7 23 32 21.216
This tells us that the first machine is 2.5 to 7 times as fast as the other for these basic operations.
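For instance, dividing the second machine's means by the first machine's, benchmark by benchmark (a quick check using the mean values reported above):

   1.28435 6.50783 0.105032 0.0413655 0.153417 % 0.180248 1.35106 0.0241326 0.0168366 0.0421362
7.12546 4.81683 4.35229 2.45688 3.64098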
A much older machine, running a 2.40 GHz Intel i5 M520, gives these results:
Starting: 2021 6 7 22 14 41.609
Floating-point arithmetic: min, max, mean, SD: 0.737734 1.9486 1.28483 0.532203
Integer arithmetic: min, max, mean, SD: 0.618851 7.85335 4.17601 3.64397
File writes: min, max, mean, SD: 0.596799 24.4731 2.97664 4.58737
File reads: min, max, mean, SD: 8.08273e_5 1.68123 0.061683 0.12634
Random File reads: min, max, mean, SD: 0.0185441 0.0989287 0.0572687 0.0342837
Done: 2021 6 7 22 19 31.083
This machine is substantially worse on the file-writing benchmark but otherwise is not much worse than the 2nd 1.80 GHz machine (comparing means):
   1.28483 4.17601 2.97664 0.061683 0.0562687 % 1.28435 6.50783 0.105032 0.0413655 0.153417
1.00037 0.64169 28.3403 1.49117 0.36677
Both of these latter two machines use SSD drives. One would think this would make quite a difference on the file benchmarks but it does not always do so. Here are the results from running on the first machine but using its SSD drive:
Starting: 2021 6 7 23 50 29.969
Floating-point arithmetic: min, max, mean, SD: 0.0986838 0.251052 0.173908 0.0760136
Integer arithmetic: min, max, mean, SD: 0.123842 2.57652 1.32949 1.23506
File writes: min, max, mean, SD: 0.0211734 0.0348024 0.0259054 0.00524834
File reads: min, max, mean, SD: 7.41e_5 0.111038 0.0224532 0.0402293
Random File reads: min, max, mean, SD: 0.0196967 0.0549903 0.036197 0.015408
Done: 2021 6 7 23 51 12.924
The difference in the file operations:
   0.0241326 0.168366 0.0421362 % 0.0259054 0.0224532 0.036197
0.931566 7.49853 1.16408
So the sequential file read is substantially faster but the others are about the same.
Furthermore, these timings are very sensitive to whatever else is running on the machine, as one would expect.
Attempt at Cross-Language Benchmarks
A correspondent on the J forum proposed some benchmarks in an attempt to do cross-language comparisons but this raises a number of difficulties, not the least of which is that J pays a penalty for code written in an unnecessarily looping fashion. We use
ts =: 6!:2 , 7!:2@]
to find the amount of time and space a given expression uses.
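For example (a usage sketch; the numbers returned depend entirely on the machine, so none are shown here):

   ts '+/ i. 1e6'     NB. two-item result: seconds elapsed, bytes of space used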
The benchmarker tried to compensate for this by including both code that mimics the method of scalar languages, for adding up numbers in a loop, and a more J-like version of the same code. So we have the loopy version:
NB. sumloop
NB. loop to sum by incrementing in a loop
NB. done in non-J way
sumloop =: 3 : 0
 sum =. 0
 for_i. i. y do.
  sum =. sum + 1
 end.
 sum
)
And the more J-like version:
NB. sumj
NB. sum consecutive integers as in J
sumj =: 13 : '+/ y$1'
As one might expect, the loopy version is slower:
   (10) 6!:2 'sumloop 1e6'
0.19529
   (10) 6!:2 'sumj 1e6'
0.00053342
   0.19529%0.00053342
366.109
However, I would argue that an even more J-like version would be this one:
   sumj2=: <:@:>:^:#       NB. Loop using the power conjunction
   (10) 6!:2 'sumj2 1e6'
1.82e_6
   (10) 6!:2 'sumj 1e7'
0.00322712
   (10) 6!:2 'sumj2 1e7'
1.66e_6
   (10) 6!:2 'sumj 1e8'
0.0372194
   (10) 6!:2 'sumj2 1e8'
1.73e_6
This one seems to take the same amount of time regardless of its argument and is enormously faster:
   0.0372194%1.73e_6
21514.1
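The reason sumj2 is flat, as I read the definition, is that u^:v applies u (v y) times, and the tally of a scalar argument is 1, so the increment-then-decrement runs exactly once no matter how large y is; it gives the same answer as sumj only because the sum of y ones is just y.

   # 1e8           NB. the tally of a scalar is 1 ...
1
   <:@:>: 1e8      NB. ... so sumj2 does one increment and one decrement, whatever the value of y
1e8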
Adjusting a Benchmark Using Inside Knowledge
Henry Rich responded to this thread by noting that "[b]enchmarks such as these are hard to make meaningful and easy to misuse, perhaps maliciously."
He gives an example of what he calls "counterbenchmarks: problems to show off the really beautiful algorithms in JE" (the J engine):
1. grade 1e4 integers. I pick 1e4 because it fits in cache. If the arguments are too long the CPU is waiting for memory.

   CLOCKSPEED =: 2.5e9      NB. My CPU's clock frequency
   r =: 1e4 ?@$ 2e4
   (#r) %~ CLOCKSPEED * 100000 (6!:2) '/: r'
12.2257                     NB. 12 cycles per atom!
   r =: 1e4 ?@$ 1e9
   (#r) %~ CLOCKSPEED * 100000 (6!:2) '/: r'
54.1968
The caching effect is noticeable. Here we see that increasing the size of the argument by a factor of 10 takes 14 times as long:
   r =: 1e4 ?@$ 2e4         NB. 1e4 fits in cache
   100000 (6!:2) '/: r'
4.74625e_5
   r =: 1e5 ?@$ 2e4         NB. 1e5 exceeds cache
   100000 (6!:2) '/: r'
0.000691088
   0.000691088 % 4.74625e_5
14.5607
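Normalizing by the number of atoms makes the point more directly: per atom, the out-of-cache run is only about 1.46 times slower, the rest of the factor of 14 coming simply from the longer argument:

   (0.000691088 % 1e5) % 4.74625e_5 % 1e4
1.45607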
A Classic Array-language Benchmark Through the Years
Finally, no exposition of J benchmarks would be complete without referring to one that has been used in the APL world for quite a few years: matrix inversion of a 50 by 50 matrix of integers, i.e. 6!:2 '%.?50 50$1000' in J. Looking here, we see that the time for this has come down from over 2800 seconds in 1990 to 5.6e_5 seconds recently, a speedup of about 50 million times.
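The arithmetic behind that figure:

   2800 % 5.6e_5      NB. roughly a fifty-million-fold speedup
5e7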
Learning and Teaching J
Dan is teaching himself Dyalog and would like any help anyone can offer. He has been putting together a Rosetta Stone page between APL and J here.
This reminded me of a talk a while back about someone going in the opposite direction.
An APLer Learning J
Back in 2009, Paul L. Jackson, a long-time APLer, gave a talk on his rewarding experience learning J as someone coming from an APL background.
Paul cannot pin down exactly when he started learning J because his Microsoft Exchange emails are unavailable due to his unwillingness to use MS software in the past 10 years.
He showed us a cheat sheet to map APL to J equivalents. He uses J as a "broken key" APL, i.e. with work-arounds for parts he does not yet understand. Here is an example similar to Dan's Rosetta page where he aligns equivalent APL and J primitives.
He has been working on writing a version of APL in .NET which reflects an effort that STSC had begun.
While working on his own JavaScript implementation of APL, he met someone else working on an APL in JavaScript: Joe Bogner, who was on the call. Apparently he had trouble implementing adverbs and had a philosophical objection to quad-CT, so Paul added it to the implementation. Paul was favorably impressed by the performance of APL written in JavaScript.
He notes J's prejudice toward the first dimension. There had been talk of defaulting to first dimension at least back as far as Roland Pesch working on SAX(?), so Ken had been thinking of this for some time before Arthur Whitney got approval of this for A+. Arthur came back from a weekend with Ken and declared "Dad said it was OK". This seems obvious in retrospect when looking at concatenation since it's much easier to add a new row than a new column.
There was also dispute about how encode and decode worked - Larry Breed wanted compatibility between decode and encode with inner product.
Brackets seem truly foreign even in old APL. He had a point about parenthesis usage:
   this=: |.>:i.3
   this; 2 3; 'fun'        NB. This works
NB. but
   this; i. 2 3; 'fun'     NB. This fails
   this; (i. 2 3) ; 'fun'  NB. We need to do this because of right-to-left execution.
This sort of thing trips up novices even now.
He also had a comment about internal event-handling: how it changed to more of an object model, basing the locus of execution on which object has focus.
Looking at J's z vs. j Namespaces and Paul's Utilities
All names that begin with uppercase are for public consumption; those beginning with lowercase are internal, as seen here.
If you work with a language that keeps its code in text files rather than in workspaces, you don't want something like ")SAVE" with no argument.
His set of names is now publicly available; they have what he sees as handy default behavior, e.g. Load re-loads the last thing loaded as seen in the examples here.
Adverbs and Conjunctions
He had inadvertently left an argument ("y", or "x" and "y") out of a definition and found that it returned something other than a function: an adverb or conjunction applied this way returns a verb. Leaving off "x" or "y" does this; leaving off "u" or "v" instead gives a value error on "y".
"Del" is his most-used one-liner ever.
"Dr" is a verb; "Rn" is a conjunction: <directory> Rn View
So, "'lessons' Rn View" looks at what has been done already and picks up where it left off. He uses the abbreviation "LRC" for "Load, Run, Copy".
Here is a list of some of his utilities:
His utilities often work with the clipboard, taking implicit arguments from there.
His "Use" verb does different things based on the file extension but load, run, and copy know on which extension it should work.
All his publicly-available work is here or thereabouts.
Joe Bogner asked about Paul's use of Android: has he used J on a laptop or the like? Paul seems geared toward a simple interface, based on his extensive use of beta releases. He feels more comfortable looking down, as when reading a book, rather than up at a laptop screen.
"plj" is publicly available namespace; "pj" is the internal namespace. The convention of starting a public namespace with "p" is documented somewhere?
Podcasts
Adam Gordon Bell interviewed software evangelist Gabriel Gonzalez on the Corecursive podcast. Among other things, they talked about how a language becomes popular.
The second Array Cast podcast is now available.