NYCJUG/2023-11-13
Beginner's regatta
We look at an approximation of some of the terms of the Fibonacci sequence and a little-known feature of indexing in J called complementary indexing.
Fibonacci-esque
Here is an amusing trick to generate the first 11 Fibonacci numbers.
   _2]\2}.0j22":100x%9899
01
01
02
03
05
08
13
21
34
55
90
Well, ok, the last one is off by one.
It relies on this fraction containing the series as pairs of digits:
0j22":100x%9899 0.0101020305081321345590
It can be extended like this:
   }:_3]\2}.0j51":1000x%998999
001
001
002
003
005
008
013
021
034
055
089
144
233
377
610
988
Again, the last one is off by one. However, it's not clear how this works or how to extend it further.
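One way to see where these fractions come from (an observation added here, not part of the original discussion) is the Fibonacci generating function: the sum of F(n)*x^n over n is x%(1-x-x^2). Setting x to 10^-k turns the fraction into N/(N^2 - N - 1) with N=10^k, so the decimal expansion carries one Fibonacci number per k-digit group until the numbers outgrow k digits and the groups start to overlap, which is why the last group shown comes out one too high. It also suggests how to extend the trick:

   (*: - >:) 100 1000 10000x      NB. N^2-N-1 reproduces the denominators used above
9899 998999 99989999
   0j40":10000x%99989999          NB. N=10000: Fibonacci numbers in 4-digit groups
0.0001000100020003000500080013002100340055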
Complementary Indexing
Normally we supply indexes specifying a position to extract items from an array.
   ]mat=. i.4 5
 0  1  2  3  4
 5  6  7  8  9
10 11 12 13 14
15 16 17 18 19
   1 2{mat                    NB. Select rows
 5  6  7  8  9
10 11 12 13 14
   1 2{"1 mat                 NB. Select columns
 1  2
 6  7
11 12
16 17
   (0 0;1 1;2 2;3 3){mat      NB. Select items
0 6 12 18
However, J's indexing allows us to exclude rather than select sections of an array. This is done by triple-boxing the indexes like this:
   (<<<1 2){"1 mat   NB. Exclude columns
 0  3  4
 5  8  9
10 13 14
15 18 19
   (<<<1 2){mat      NB. Exclude rows
 0  1  2  3  4
15 16 17 18 19
This does not extend to items as selection does:
   (<0 0;1 1;2 2;3 3){mat   NB. Exclude items?
|length error, executing dyad {
|selector is overlong; has length 4 but rank of y is only 2
|       (<0 0;1 1;2 2;3 3)    {mat
This also easily extends to excluding multiple selections:
   (<^:3&>1 2){mat   NB. Exclude each row independently
 0  1  2  3  4
10 11 12 13 14
15 16 17 18 19

 0  1  2  3  4
 5  6  7  8  9
15 16 17 18 19
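The two forms also compose; as a further illustration (added here, not from the original write-up), we can exclude rows and columns in a single expression:

   (<<<1 2) { (<<<3 4) {"1 mat   NB. drop rows 1 2 and columns 3 4
 0  1  2
15 16 17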
More information about this can be found here.
Show-and-tell
We look into implementing the Taylor series in J and an apparently paradoxical betting problem.
Taylor Series
One of our members wants to port the old code for the Taylor series primitives into a library for use in modern J. These operators used to be available as the J primitives t. and T., but those symbols were re-purposed to handle multi-threading, which seemed like a more useful feature. The C code which implemented the old primitives is still available and could be used to create an addon for working with Taylor series.
A Taylor series can be used to approximate a function with a polynomial which may be easier to evaluate than the function itself as shown here (from Wikipedia):
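For reference, the four-term sine approximation we implement below is

   sin(x) ≈ x - x^3/3! + x^5/5! - x^7/7!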
J Example of Taylor Approximation
We can easily implement the series above and verify the Wikipedia information. The first thing to notice is that the first term in the sine approximation above is in the same form as the subsequent terms, i.e. x is the same as (x^1)%!1, so we can represent the series like this in J:
{{ -/(y^>:+:i.4)%!>:+:i.4 }}
Where -/ gives us the alternating sum, thanks to right-to-left evaluation.
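As a minimal check of these two points (added for illustration), - inserted right-to-left gives the alternating sum, and the four-term verb already matches sin 1 closely:

   -/1 2 3 4                          NB. 1-(2-(3-4)) = 1-2+3-4
_2
   {{ -/(y^>:+:i.4)%!>:+:i.4 }} 1     NB. versus sin 1 = 0.841471
0.841468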
Rather than use >:+:i.4 twice, we can use a tacit expression
   13 : '(y^x)%!x'
^~ % [: ! [
giving us
   taylorSine=: {{ -/(>:+:i.4) (^~%[:![) y }}"0
Testing this for a few points around zero, with a row of values from J's sine verb (1&o.) for comparison:
   (1&o. ,: taylorSine) _3 _2 _1 _0.5 0 0.5 1 2 3
  _0.14112 _0.909297 _0.841471 _0.479426 0 0.479426 0.841471 0.909297   0.14112
_0.0910714 _0.907937 _0.841468 _0.479426 0 0.479426 0.841468 0.907937 0.0910714
This gives us a fairly close approximation. Notice that the accuracy of the approximation increases as we lengthen the series, so we could pull out the hard-coded "4" and make it a parameter.
   taylorSine=: {{ -/(>:+:i.x) (^~%[:![) y }}"0
   (1&o. ,: 6&taylorSine) _3 _2 _1 _0.5 0 0.5 1 2 3
 _0.14112 _0.909297 _0.841471 _0.479426 0 0.479426 0.841471 0.909297  0.14112
_0.140875 _0.909296 _0.841471 _0.479426 0 0.479426 0.841471 0.909296 0.140875
This gives us a better approximation by adding two more terms to the series.
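One way to quantify the improvement is to take the largest absolute difference from 1&o. over the sample points; this is a sketch added here, with pts and maxErr being names of my own choosing. From the two tables above, the worst-case error at these points drops from about 0.05 with four terms to about 0.00025 with six.

pts=. _3 _2 _1 _0.5 0 0.5 1 2 3
maxErr=. {{ >./ | (1&o. - y&taylorSine) pts }}"0   NB. worst error over pts for y terms
maxErr 4 6                                         NB. about 0.05 and 0.00025 here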
An Apparent Paradox
This problem, as seen on Reddit, seems paradoxical. It goes like this:
You are offered the chance to play a game. Each round, there are two possible outcomes:
Outcome A (75% chance):
You double your money. You must now choose whether to continue playing or take the money you have earned and stop.
Outcome B (25% chance):
You lose all of the money that you have bet and can no longer continue playing.
So the question is, at what point should you stop and take your money?
The paradox arises because a simple calculation of the odds at any point in the game seems to favor continuing to play, even though this means we will eventually lose all our money. A number of respondents to the original query pointed out the similarity of this to the St. Petersburg paradox, but it is a different problem.
The odds can be calculated like this:
   (2*0.75)+_1*0.25
1.25
The 2 represents doubling our bet and _1 represents losing our bet; the corresponding chances of each possibility are 0.75 and 0.25.
We see that our expectation is greater than one, so the odds tell us we should take the bet. The same calculation is valid each time we play, but it's easy to see that the probability of ruin approaches one the more times we play.
A more sophisticated way to write this in J is as follows:
   2 _1 +/ . * (],-.) 0.75   NB. 75% chance of winning, 100%-75% chance of losing
1.25
This makes use of monadic "not" (-.) which, when applied to an argument in the closed interval [0,1], returns the complementary probability; for a probability p, -.p returns 1-p.
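To see the pieces separately (a small illustration added here):

   (],-.) 0.75              NB. the probability and its complement
0.75 0.25
   +/ 2 _1 * 0.75 0.25      NB. the same expectation without the dot product
1.25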
Evaluating Chances
The fixed numbers in this problem make it amenable to an analytic evaluation. For instance, our chance of ruin increases like this for each additional play:
   -.(-.0.25)^i.10   NB. Chance of ruin for 0 to 9 plays
0 0.25 0.4375 0.578125 0.683594 0.762695 0.822021 0.866516 0.899887 0.924915
So, we have zero chance of ruin if we don't play, 25% if we play once, and so on.
However, not all problems are this tractable, and it's fun to write a simulation to generate results we can evaluate as we work toward an optimal strategy. Also, this lets us explore some different ways of writing the same thing and evaluate the efficiency of the different approaches, so that's what we will do.
Simulations
We develop a few simulations in different styles, using different J constructs.
Simulation 0
Our first simulation will be one that looks more like a conventional approach than a more J-like one. Here we count how many times we play before we bust.
playRuns0=: 3 : 0"0 ct=. 1 NB. Play at least once while. 0.25<?0 do. ct=. >:ct end. NB. 25% chance of failure ct )
Notice that the argument to this function is not used at all; it's merely a placeholder.
We implement the 25% chance of ruin by picking a random number between 0 and 1 (?0) and checking if it is less than 0.25.
Running this 10 times gives us something like this:
   playRuns0 i.10
12 5 11 1 2 1 15 4 8 7
If we do one million simulations and compare the cumulative probability of ruin to our calculated value for the first ten, we see a close match which gives us confidence in the correctness of our simulation.
   ft=. frtab playRuns0 i.1e6   NB. frequency table of results
   (-.(-.0.25)^>:i.10),.~10{.(] ,. [: +/\ [: (+/ %~ ]) 0 { "1 ]) (]/:1&{"1)ft
249996  1 0.249996     0.25
188096  2 0.438092   0.4375
140639  3 0.578731 0.578125
105797  4 0.684528 0.683594
 79092  5  0.76362 0.762695
 58715  6 0.822335 0.822021
 44321  7 0.866656 0.866516
 33498  8 0.900154 0.899887
 25047  9 0.925201 0.924915
 18582 10 0.943783 0.943686
The first column shows how many times the value in the second column occurred; the third and fourth columns show the empirical and calculated cumulative probabilities of ruin. We use frtab, which is defined as follows:
frtab=: 3 : 0
   NB. isNum: numeric-type test, assumed defined elsewhere in the author's utilities
   cts=. #/.~ y                NB. Count # of each unique item.
   if. -.isNum y do.           NB. Special case enclosed, text mat, text vec,
       if. (L.=0:) y do. (<"0 cts),.<"0 ~.y
       else.
           if. 2=#$y do. (<"0 cts),.<"1 ~.y
           else. (<"0 cts),.<"0 ~.y
           end.
       end.
   else. cts,.~.y end.         NB. and simple numeric vec.
NB.EG (1 2 3 2 1,. 11 22 33 222 99) -: frtab 11 22 22 33 33 33 222 222 99
)
Simulation 1
Since the form of this simulation is to continue until a condition is met, we can use J's power conjunction. However, the conditional nature of our stopping condition requires this double power form:
playRuns1=: (>:^:(0.25<[:?0*])^:_)"0
We see that the verb for our first power simply increments the counter. The second one - (0.25<[:?0*])^:_ - embeds our stopping condition for an infinity of possible runs. Notice that, unlike our initial simulation, we use the argument to initialize the counter so our argument should be a vector of ones.
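As an aside (my example, not from the meeting), the u^:v^:_ idiom is easier to see with a deterministic rule, for instance halving while the result is even:

   (-:^:(0 = 2&|)^:_) 48   NB. 48 24 12 6 3: stop at the first odd value
3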
   ft=. frtab playRuns1 1e6$1
   (-.(-.0.25)^>:i.10),.~10{.(] ,. [: +/\ [: (+/ %~ ]) 0 { "1 ]) (]/:1&{"1)ft
249476  1 0.249476     0.25
187002  2 0.436478   0.4375
140500  3 0.576978 0.578125
105677  4 0.682655 0.683594
 79580  5 0.762235 0.762695
 59509  6 0.821744 0.822021
 44599  7 0.866343 0.866516
 33261  8 0.899604 0.899887
 25009  9 0.924613 0.924915
 18753 10 0.943366 0.943686
Again, the simulation results agree with our calculated percentages to about three or four digits.
Relative Performance
If we compare the performance of these two methods, we see that the more J-like one performs better.
   6!:2 'playRuns0 1e6$1'
0.890378
   6!:2 'playRuns1 1e6$1'
0.561831
   6!:2 'playRuns0 1e7$1'
8.97768
   6!:2 'playRuns1 1e7$1'
5.68311
We also see that the performance improvement appears to be a constant ratio regardless of the number of simulations.
Can we do better?
Simulation 2
Notice that the more J-like version, playRuns1, is still an iterative, scalar solution. If we evaluate its results to get an idea of the possible maximums, we see this:
   >./playRuns1 1e7$1
55
   >./playRuns1 1e8$1
67
This hints at a more efficient array-oriented solution. What if we simply generate a fixed number of simulations and find the first one where we face ruin?
playRuns2=: (0 i.~ 0.25 < 0 ?@$~ ])"0 NB. Return point of 1st ruin
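To see the pieces (an illustration added here): 0.25 < ... produces a vector of survive (1) and ruin (0) flags, and 0 i.~ finds the position of the first ruin:

   0 i.~ 1 1 0 1 1 0 1   NB. index of the first 0
2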
For this version, the argument is the maximum number of plays we want to consider. Since we've seen that even for 100 million simulations our maximum was 67, let's assume we'll never need more than 100 plays. Since we know how to calculate the probability of ruin, how likely is it we could reach 100?
0j13":-.(-.1r4)^>:100x NB. Probability of ruin for 100 plays 0.9999999999998
We had to resort to rational numbers to calculate the very low probability of success; the corresponding probability of ruin is, as we see, extremely high.
Looking at performance, we see that this last version is only about as efficient as our previous attempt.
   6!:2 'playRuns2 1e6$100'
0.609262
   6!:2 'playRuns2 1e7$100'
5.96439
We can improve this a little by relying on the very low chance of success for fewer than 100 plays.
0j10":-.(-.1r4)^>:70x NB. Probability of ruin for 70 plays 0.9999999987 6!:2 'playRuns2 1e6$70' 0.46969 6!:2 'playRuns2 1e7$70' 4.72017
Advanced topics
We leave J to take a look at large language models (LLMs) we can run ourselves. The simplest of these appears to be oobabooga but the one I have installed so far is a different one called Kobold.
Kobold
Taking a look here, we see an LLM called KoboldCPP which I installed on my local machine. Instructions for installing it can be found here.
There are several pre-trained models of varying size to use with this front-end.
Name (link) | RAM Required |
---|---|
Guanaco | 7 GB |
Nous Hermes | 12.3 GB |
WizardLM | 27 GB |
Here are some tips for configuring KoboldCPP.
How it Works
Once we run the koboldcpp.exe on our local machine, pointing it to one of the pre-trained models above, we connect to it using a browser-based front end.
For the smallest of the models mentioned above, here is what I found for a simple question:
Instruction:
What is the APL programming language?
Output:
APL (Automated Placement Language) is an imperative computer programming language developed by Ken Iverson at IBM's Watson Research Center in 1962. It was designed to be a general-purpose language suitable for all types of tasks, ranging from scientific programming to business applications and games. APL has a unique syntax that resembles mathematical notation, with symbols such as +, -, *, and ^ representing the basic arithmetic operators. It also features the use of curly braces for grouping statements and the use of punctuation marks for...
The output was truncated as shown, and what is shown is fairly inaccurate: the expansion of the acronym is wrong, the categorization as imperative is wrong, and the commonly used arithmetic symbols listed are not the APL symbols for times and power.
When I re-ran this using the Nous Hermes model as mentioned above, this is what I got.
Instruction:
What is the APL programming language?
Output:
As mentioned earlier, APL (Automated Placement Language) is an imperative computer programming language developed by Ken Iverson at IBM's Watson Research Center in 1962.
It not only remembers what I asked before, it gets a little snippy.
I tried to get more with a follow-up question.
Instruction:
What are the features of APL?
Output:
APL has several distinctive features that set it apart from other programming languages: 1. Vectoral Notation: APL uses vectorized operations, which means that it can perform multiple calculations simultaneously. This makes it particularly efficient for matrix operations and other numerical computations.
2. One-Dimensional Arrays: In APL, arrays are represented as one-dimensional structures, which simplifies the notation for manipulating large datasets.
3. Unique Syntax: APL has a unique syntax that resembles mathematical notation, making it easy to read and write. The language also supports built-in functions for common operations like sorting, searching, and mapping.
4. Interactivity: APL is highly interactive, allowing programmers to evaluate expressions and see the results immediately. This makes it well-suited for iterative development and exploratory data analysis.
5. Extensive Libraries: APL has a rich library of functions and built-in procedures, covering areas such as linear algebra, statistics, graphics, and more.
We see that the answers are not particularly accurate, with some correct information mixed in with incorrect or misleading statements.
Other Local LLMs
Here are some other LLMs that can be downloaded and run locally.
Name (link) | Features |
---|---|
privateGPT | Can be trained on one's own documents |
Alpaca Electron | Supposedly easy to install and requires less memory |
GPT4All | Privacy aware, can run without internet or GPU |
BLOOM | Large and powerful |
Learning and Teaching J
We look at the interplay of language and perception with the corollary that language influences thinking.
Color Words
According to a recent study of a remote Amazonian tribe that had no distinct words for "blue" or "green", when members of the tribe learned a language (Spanish) that had these words, they invented words in their own language corresponding to these "new" colors.
From the article:
Among members of the Tsimane’ society, who live in a remote part of the Bolivian Amazon rainforest, the researchers found that those who had learned Spanish as a second language began to classify colors into more words, making color distinctions that are not commonly used by Tsimane’ who are monolingual.
In the most striking finding, Tsimane’ who were bilingual began using two different words to describe blue and green, which monolingual Tsimane’ speakers do not typically do. And, instead of borrowing Spanish words for blue and green, they repurposed words from their own language to describe those colors.
Also, the researchers noted that exposure to another language's concepts affected the bilingual Tsimane’ in this way:
The researchers also found that the bilingual Tsimane’ became more precise in describing colors such as yellow and red, which monolingual speakers tend to use to encompass many shades beyond what a Spanish or English speaker would include.