NYCJUG/2023-11-13

Beginner's regatta

We look at an approximation of some of the terms of the Fibonacci sequence and a little-known feature of indexing in J called complementary indexing.

Fibonacci-esque

Here is an amusing trick to generate the first 11 Fibonacci numbers.

   _2]\2}.0j22":100x%9899
01
01
02
03
05
08
13
21
34
55
90

Well, ok, the last one is off by one.

It relies on this fraction containing the series as pairs of digits:

   0j22":100x%9899
0.0101020305081321345590

It can be extended like this:

   }:_3]\2}.0j51":1000x%998999
001
001
002
003
005
008
013
021
034
055
089
144
233
377
610
988

Again, the last one is off by one. However, it's not clear how this works or how to extend it further.

Complementary Indexing

Normally we supply indexes specifying a position to extract items from an array.

   ]mat=. i.4 5
 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
   
   1 2{mat                    NB. Select rows
 5  6  7  8  9
10 11 12 13 14

   1 2{"1 mat                 NB. Select columns
 1  2
 6  7
11 12
16 17

   (0 0;1 1;2 2;3 3){mat      NB. Select items
0 6 12 18

However, J's indexing allows us to exclude rather than select sections of an array. This is done by triple-boxing the indexes like this:

   (<<<1 2){"1 mat            NB. Exclude columns
 0  3  4
 5  8  9
10 13 14
15 18 19

   (<<<1 2){mat               NB. Exclude rows
 0  1  2  3  4
15 16 17 18 19

This does not extend to items as selection does:

   (<0 0;1 1;2 2;3 3){mat     NB. Exclude items?
|length error, executing dyad {
|selector is overlong; has length 4 but rank of y is only 2
|   (<0 0;1 1;2 2;3 3)    {mat

We can also make several exclusions at once, each applied independently to give its own result:

   (<^:3&>1 2){mat            NB. Exclude each row independently
 0  1  2  3  4
10 11 12 13 14
15 16 17 18 19

 0  1  2  3  4
 5  6  7  8  9
15 16 17 18 19
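
If we do want to exclude arbitrary items, one workaround (an illustrative sketch, not a feature of { itself; the names excl and mask are introduced only for this example) is to build a boolean mask over all row-column pairs and use it to copy the items we keep:

   excl=. 0 0,1 1,2 2,:3 3                    NB. item indexes to exclude
   mask=. -. (($mat) #: i.*/$mat) e. excl     NB. 1 marks items to keep
   mask # ,mat                                NB. all items except the diagonal
1 2 3 4 5 7 8 9 10 11 13 14 15 16 17 18 19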

More information about this can be found here.

Show-and-tell

We look into implementing the Taylor series in J and an apparently paradoxical betting problem.

Taylor Series

One of our members wants to port the old code for the Taylor series primitives into a library for use in modern J. Taylor series support used to be available in the J primitives t. and T., but these symbols were re-purposed to handle multi-threading, as that seemed like a more useful feature. The C code which implemented the old primitives is still available and could be used to create an addon providing Taylor series.

A Taylor series can be used to approximate a function with a polynomial which may be easier to evaluate than the function itself as shown here (from Wikipedia):

[Image: ExampleTaylorSeriesApproximation.JPG]
[Image: 1280px-Taylorsine.svg.png (Taylor polynomial approximations of sine, from Wikipedia)]

J Example of Taylor Approximation

We can easily implement the series above and verify the Wikipedia information. The first thing to notice is that the first term in the sine approximation above has the same form as the subsequent terms, i.e. x is the same as (x^1)%!1, so we can represent the series like this in J:

   {{ -/(y^>:+:i.4)%!>:+:i.4 }}

Here -/ gives us the alternating sum, thanks to J's right-to-left evaluation.
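
For example, inserting - between the items of a list gives the alternating sum:

   -/ 1 2 3 4              NB. 1-(2-(3-4)) = 1-2+3-4
_2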

Rather than use >:+:i.4 twice, we can use a tacit expression

    13 : '(y^x)%!x'
^~ % [: ! [

giving us taylorSine=: {{ -/(>:+:i.4) (^~%[:![) y }}"0.

Testing this for a few points around zero, with a row of sine values for comparison:

   (1&o. ,: taylorSine) _3 _2 _1 _0.5 0 0.5 1 2 3
  _0.14112 _0.909297 _0.841471 _0.479426 0 0.479426 0.841471 0.909297   0.14112
_0.0910714 _0.907937 _0.841468 _0.479426 0 0.479426 0.841468 0.907937 0.0910714

This gives us a fairly close approximation. Notice that the accuracy of the approximation increases as we lengthen the series, so we could pull out the hard-coded "4" and make it a parameter.

   taylorSine=: {{ -/(>:+:i.x) (^~%[:![) y }}"0

   (1&o. ,: 6&taylorSine) _3 _2 _1 _0.5 0 0.5 1 2 3
 _0.14112 _0.909297 _0.841471 _0.479426 0 0.479426 0.841471 0.909297  0.14112
_0.140875 _0.909296 _0.841471 _0.479426 0 0.479426 0.841471 0.909296 0.140875

This gives us a better approximation by adding two more terms to the series.
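
The pattern is not specific to sine. As an illustrative sketch (not part of the original meeting notes; taylorExp is a name introduced here), the exponential function can be approximated the same way, since its Taylor series sums the terms (y^k)%!k:

   taylorExp=: {{ +/ (y^i.x) % !i.x }}"0   NB. x terms of the Taylor series for ^y

   (^ ,: 10&taylorExp) 0 0.5 1 2
1 1.64872 2.71828 7.38906
1 1.64872 2.71828 7.38871

As with taylorSine, adding terms brings the approximation closer to ^.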

An Apparent Paradox

This problem, as seen on Reddit, seems paradoxical. It goes like this:

You are offered the chance to play a game. Each round, there are two possible outcomes:

Outcome A (75% chance):

You double your money. You must now choose whether to continue playing or take the money you have earned and stop.

Outcome B (25% chance):

You lose all of the money that you have bet and can no longer continue playing.

So the question is, at what point should you stop and take your money?

The paradox arises because a simple calculation of the odds at any point in the game seems to favor continuing to play, even though this means we will eventually lose all our money. A number of respondents to the original query pointed out the similarity of this problem to the St. Petersburg paradox, but it is a different problem.

The odds can be calculated like this:

   (2*0.75)+_1*0.25
1.25

The 2 represents doubling our bet and _1 represents losing our bet; the corresponding chances of each possibility are 0.75 and 0.25.

We see that our expectation is greater than one, so the odds tell us we should take the bet. The same calculation is valid each time we play, but it's easy to see that the probability of ruin approaches one the more times we play.

A more sophisticated way to write this in J is as follows:

   2 _1 +/ . * (],-.) 0.75    NB. 75% chance of winning, 100%-75% chance of losing
1.25 

This makes use of monadic "not" (-.) which, when applied to an argument in the closed interval [0,1], returns the complementary probability; for a probability p, -.p returns 1-p.
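
For example, applied to our 75% chance of winning:

   -. 0.75                 NB. complementary probability
0.25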

Evaluating Chances

The fixed numbers in this problem make it amenable to an analytic evaluation. For instance, our chance of ruin increases like this for each additional play:

   -.(-.0.25)^i.10   NB. Chance of ruin for 0 to 9 plays
0 0.25 0.4375 0.578125 0.683594 0.762695 0.822021 0.866516 0.899887 0.924915  

So, we have zero chance of ruin if we don't play, 25% if we play once, and so on.

However, not all problems are this tractable, and it's fun to write a simulation to get results we can evaluate as we work toward an optimal strategy. This also lets us explore some different ways of writing the same thing and evaluate the efficiency of the different approaches, so that's what we will do.

Simulations

We develop a few simulations in different styles, using different J constructs.

Simulation 0

Our first simulation will be one that looks more like a conventional approach than a more J-like one. Here we count how many times we play before we bust.

playRuns0=: 3 : 0"0
   ct=. 1                               NB. Play at least once
   while. 0.25<?0 do. ct=. >:ct end.    NB. 25% chance of failure
   ct
)

Notice that the argument to this function is not used at all; it's merely a placeholder.

We implement the 25% chance of ruin by picking a random number between 0 and 1 (?0) and checking if it is less than 0.25.

Running this 10 times gives us something like this:

   playRuns0 i.10
12 5 11 1 2 1 15 4 8 7 

If we do one million simulations and compare the cumulative empirical probability of ruin to our calculated values for the first ten run lengths, we see a close match, which gives us confidence in the correctness of our simulation.

   ft=. frtab playRuns0 i.1e6    NB. frequency table of results
   (-.(-.0.25)^>:i.10),.~10{.(] ,. [: +/\ [: (+/ %~ ]) 0 { "1 ]) (]/:1&{"1)ft
249996  1 0.249996     0.25
188096  2 0.438092   0.4375
140639  3 0.578731 0.578125
105797  4 0.684528 0.683594
 79092  5  0.76362 0.762695
 58715  6 0.822335 0.822021
 44321  7 0.866656 0.866516
 33498  8 0.900154 0.899887
 25047  9 0.925201 0.924915
 18582 10 0.943783 0.943686

The first column shows the number of times the value in the second column occurred; the third and fourth columns show the empirical and calculated cumulative probabilities of ruin. We use frtab, which is defined as follows:

frtab=: 3 : 0
   cts=. #/.~ y          NB. Count # of each unique item.
   if. -.isNum y do.     NB. Special case enclosed, text mat, text vec,
       if. (L.=0:) y do. (<"0 cts),.<"0 ~.y else. 
           if. 2=#$y do. (<"0 cts),.<"1 ~.y else. (<"0 cts),.<"0 ~.y end.
       end.
   else. cts,.~.y end.   NB. and simple numeric vec.
NB.EG (1 2 3 2 1,. 11 22 33 222 99) -: frtab 11 22 22 33 33 33 222 222 99
)

Simulation 1

Since the form of this simulation is to continue until a condition is met, we can use J's power conjunction. However, the conditional nature of our stopping condition requires this double power form:

playRuns1=: (>:^:(0.25<[:?0*])^:_)"0

The verb being powered, >:, simply increments the counter; the first power conjunction applies it only when the condition (0.25<[:?0*]) is true, and the outer ^:_ repeats this until the result stops changing, i.e. until the condition fails, allowing an unbounded number of plays. Notice that, unlike our initial simulation, we use the argument to initialize the counter, so our argument should be a vector of ones.
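
As a small deterministic illustration of this double power form (not from the original notes), we can keep incrementing while the count is below a fixed threshold:

   (>:^:(10 > ])^:_) 1    NB. increment while less than 10, then stop
10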

   ft=. frtab playRuns1 1e6$1
   (-.(-.0.25)^>:i.10),.~10{.(] ,. [: +/\ [: (+/ %~ ]) 0 { "1 ]) (]/:1&{"1)ft
249476  1 0.249476     0.25
187002  2 0.436478   0.4375
140500  3 0.576978 0.578125
105677  4 0.682655 0.683594
 79580  5 0.762235 0.762695
 59509  6 0.821744 0.822021
 44599  7 0.866343 0.866516
 33261  8 0.899604 0.899887
 25009  9 0.924613 0.924915
 18753 10 0.943366 0.943686

Again, the simulation results agree with our calculated percentages to about three or four digits.

Relative Performance

If we compare the performance of these two methods, we see that the more J-like one performs better.

   6!:2 'playRuns0 1e6$1'
0.890378
   6!:2 'playRuns1 1e6$1'
0.561831
   6!:2 'playRuns0 1e7$1'
8.97768
   6!:2 'playRuns1 1e7$1'
5.68311

We also see that the performance improvement appears to be a constant ratio regardless of the number of simulations.
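
As a quick check (a calculation added here for illustration), dividing the playRuns0 timings by the corresponding playRuns1 timings gives nearly the same ratio at both sizes:

   0.890378 8.97768 % 0.561831 5.68311
1.58478 1.57971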

Can we do better?

Simulation 2

Notice that the more J-like playRuns1 is still an iterative, scalar solution. If we evaluate its results to get an idea of the maximum run lengths, we see this:

   >./playRuns1 1e7$1
55
   >./playRuns1 1e8$1
67

This hints at a more efficient array-oriented solution. What if we simply generate a fixed number of plays at once and find the first one where we face ruin?

playRuns2=: (0 i.~ 0.25 < 0 ?@$~ ])"0  NB. Return point of 1st ruin
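
As an illustrative aside (not in the original notes), the 0 i.~ phrase returns the position of the first 0 in the boolean vector of surviving plays, which marks the first ruin:

   0 i.~ 1 1 1 0 1 1      NB. first failure occurs at index 3
3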

For this version, the argument is the maximum number of plays we want to consider in a single run. Since we've seen that even for 100 million simulations our maximum was 67, let's assume we'll never need more than 100 plays. Since we know how to calculate the probability of ruin, how likely is it we could reach 100?

   0j13":-.(-.1r4)^>:100x   NB. Probability of ruin for 100 plays
0.9999999999998

We resort to rational numbers to calculate the very low probability of success; the complementary probability of ruin is, as we see, extremely high.

Looking at performance, we see that this last version is only about as efficient as our previous attempt.

   6!:2 'playRuns2 1e6$100'
0.609262
   6!:2 'playRuns2 1e7$100'
5.96439

We can improve this a little by shortening the maximum run, relying on the very low chance of surviving even 70 plays.

   0j10":-.(-.1r4)^>:70x    NB. Probability of ruin for 70 plays
0.9999999987

   6!:2 'playRuns2 1e6$70'
0.46969
   6!:2 'playRuns2 1e7$70'
4.72017

Advanced topics

We leave J to take a look at large language models (LLMs) we can run ourselves. The simplest of these appears to be oobabooga, but the one I have installed so far is a different one called Kobold.

Kobold

Taking a look here, we see an LLM front-end called KoboldCPP, which I installed on my local machine. Instructions for installing it can be found here.

There are several pre-trained models of varying size to use with this front-end.

Name (link)    RAM Required
Guanaco        7 GB
Nous Hermes    12.3 GB
WizardLM       27 GB

Here are some tips for configuring KoboldCPP.

How it Works

Once we run koboldcpp.exe on our local machine, pointing it to one of the pre-trained models above, we connect to it using a browser-based front end.

For the smallest of the models mentioned above, here is what I found for a simple question:

Instruction:

  What is the APL programming language?

Output:

  APL (Automated Placement Language) is an imperative computer programming language developed by Ken Iverson at IBM's Watson Research Center in 1962. It was designed to be a general-purpose language suitable for all types of tasks, ranging from scientific programming to business applications and games.
  APL has a unique syntax that resembles mathematical notation, with symbols such as +, -, *, and ^ representing the basic arithmetic operators. It also features the use of curly braces for grouping statements and the use of punctuation marks for...

The output was truncated as shown, and even what is shown is fairly inaccurate: the expansion of the acronym is wrong (APL stands for "A Programming Language"), the categorization as imperative is wrong, and the commonly used arithmetic symbols cited are not the APL symbols for times and power.

When I re-ran this using the Nous Hermes model as mentioned above, this is what I got.

Instruction:

 What is the APL programming language?

Output:

 As mentioned earlier, APL (Automated Placement Language) is an imperative computer programming language developed by Ken Iverson at IBM's Watson Research Center in 1962.

It not only remembers what I asked before, it gets a little snippy.

I tried to get more with a follow-up question.

Instruction:

 What are the features of APL?

Output:

 APL has several distinctive features that set it apart from other programming languages:
 1. Vectoral Notation: APL uses vectorized operations, which means that it can perform multiple calculations simultaneously. This makes it particularly efficient for matrix operations and other numerical computations.
 2. One-Dimensional Arrays: In APL, arrays are represented as one-dimensional structures, which simplifies the notation for manipulating large datasets.
 3. Unique Syntax: APL has a unique syntax that resembles mathematical notation, making it easy to read and write. The language also supports built-in functions for common operations like sorting, searching, and mapping.
 4. Interactivity: APL is highly interactive, allowing programmers to evaluate expressions and see the results immediately. This makes it well-suited for iterative development and exploratory data analysis.
 5. Extensive Libraries: APL has a rich library of functions and built-in procedures, covering areas such as linear algebra, statistics, graphics, and more.

We see that the answers are not particularly accurate, with some correct information mixed in with incorrect or misleading statements.

Other Local LLMs

Here are some other LLMs that can be downloaded and run locally.

Name (link)        Features
privateGPT         Can be trained on one's own documents
Alpaca Electron    Supposedly easy to install and requires less memory
GPT4All            Privacy aware, can run without internet or GPU
BLOOM              Large and powerful

Learning and Teaching J

We look at the interplay of language and perception with the corollary that language influences thinking.

Color Words

According to a recent study of a remote Amazonian tribe that had no distinct words for "blue" or "green", when members of the tribe learned a language (Spanish) that had these words, they repurposed words in their own language to correspond to these "new" colors.

From the article:

Among members of the Tsimane’ society, who live in a remote part of the Bolivian Amazon rainforest, the researchers found that those who had learned Spanish as a second language began to classify colors into more words, making color distinctions that are not commonly used by Tsimane’ who are monolingual.

In the most striking finding, Tsimane’ who were bilingual began using two different words to describe blue and green, which monolingual Tsimane’ speakers do not typically do. And, instead of borrowing Spanish words for blue and green, they repurposed words from their own language to describe those colors.

Also, the researchers noted that exposure to another language's concepts affected the bilingual Tsimane’ in this way:

The researchers also found that the bilingual Tsimane’ became more precise in describing colors such as yellow and red, which monolingual speakers tend to use to encompass many shades beyond what a Spanish or English speaker would include.