NYCJUG/2012-05-08
beginner's example, process data from website, input prompt, BAPLA APL Moot, traveling-salesman problem set, graphics in J7, teaching eigenvectors, Kinect, 3-D point cloud, language and thought, tryAPL online
Location:: Heartland
Agenda
Meeting Agenda for NYCJUG 20120508 ---------------------------------- 1. Beginner's regatta: see "Gathering and Using Data.pdf" for a detailed examination of a few lines of J. See "Prompt Rides Again.pdf". 2. Show-and-tell: APL Moot report - see "2012 BAPLA Moot.pdf". Showcasing J: see "MergeSort Examples.pdf" and "Prim's Algorithm.pdf". See "Graphics Play in J7.pdf". 3. Advanced topics: introduction to eigenvectors - see "Explaining the E word.pdf". An intriguing platform possibility? See "Introduction to Kinect.pdf" and "Kinect SDK1 - A 3D Point Cloud.pdf". 4. Learning, teaching and promoting J, et al.: experiments to test how language shapes thought: "How does Our Language Shape the Way We Think.pdf" versus "Which comes first - language or thought.pdf". Rant against code that's "too big" - see "Rant against code that is too big by Yegge.pdf". TryAPL online: see "TryAPL Online.pdf".
Beginner's regatta
We examined, in detail, how to get external data into a neat table in J as well as the often-revisited question on how to prompt a user for input from a command-line.
Gathering and Using Data
We can start by getting some interesting data from the Web, like this:
Then we can cut and paste the data into a J table, starting like this:
'ontit ontap'=: split <;._1&>TAB,&.><;._2 ] 0 : 0 Beer Name Served In ABV Price Allagash White 16oz. Draft 5.5 $7.00 Bear Republic Roggenbier 14oz. Draft 4.5 $7.00 Brooklyn Sorachi Ace 14oz. Draft 6.5 $8.00 Captain Lawrence Freshchester Pale 16oz. Draft 5.6 $7.00 Cigar City Maduro Brown Ale 16oz. Draft 5.5 $8.00 De Ranke XX Bitter 14oz. Draft 6.2 $8.00 Dogfish Head 90 Minute 14oz. Draft 9.0 $8.00 …
How does this work? Let’s look at the code in detail.
Notice that we start by assigning a pair of names “ontit” and “ontap” as globals ( =: ). This is because we’re doing this within a script “beerInfo.ijs”. If we’d assigned them as locals ( =. ), the variables would disappear after the script is loaded because they are local to it. Let’s look at their values.
1!:44 'C:\amisc\J\NYCJUG\201205\' NB. Change to directory where script is. load 'beerInfo.ijs' NB. Loading it runs definitions. $ontit NB. Check the shape first, 4 ontit NB. then the contents. +-----------------+----------+----+-----+ |On Tap Beer Name |Served In |ABV |Price| +-----------------+----------+----+-----+ $ontap NB. This one’s larger, so only 39 4 3{.ontap NB. look at the first three rows. +-------------------------+------------+----+-----+ |Allagash White |16oz. Draft |5.5 |$7.00| +-------------------------+------------+----+-----+ |Bear Republic Roggenbier |14oz. Draft |4.5 |$7.00| +-------------------------+------------+----+-----+ |Brooklyn Sorachi Ace |14oz. Draft |6.5 |$8.00| +-------------------------+------------+----+-----+
We can see that “ontit” contains the column titles and “ontap” is a boxed array of data from the web page. How did this data get there? The last thing before the assignment was the application of the “split” verb to whatever input it receives from the expression to its right. Let’s take a look at this verb:
split {. ,&< }.
There’s not much to this verb: it’s a basic fork, the ends of which are “take” ( {. ) and “drop” ( }. ); the middle of the fork, applied last, is “catenate” ( , ) composed with ( & ) “box” ( < ). OK, maybe that seems like a lot, but looking at the pieces, we see that the ends split something into two pieces: its first item and all but its first item, then the middle tine of the fork joins them back together as boxed items.
A few examples should make this clearer.
split 'foo' +-+--+ |f|oo| +-+--+ split 212 438 2443 +---+--------+ |212|438 2443| +---+--------+ ]mat=. 'Hello,',:'world!' Hello, world! split mat +------+------+ |Hello,|world!| +------+------+ split i. 3 4 +-------+---------+ |0 1 2 3|4 5 6 7| | |8 9 10 11| +-------+---------+
So, what about the rest of the phrase which contains only plain J without any pre-defined verbs? To remind us, that phrase is
'ontit ontap'=: split <;._1 &> TAB,&.> <;._2 ] 0 : 0
(with some extra spaces inserted to emphasize the logical pieces we'll talk about below).
The next piece is <;._1 . This is the very handy “cut” conjunction ( ;. ) with its left verb argument of “box” ( < ), which we’ve already seen, and a noun right argument of negative one: the “1” means to use the leading element as the partition indicator and “_” means to drop that element. So, this splits its argument, which we'll look at next, into boxed items according to the leading element.
The next conjunction, “compose” ( & ), we've already seen in the definition of “split”. Here, it's combined with “unbox” ( > ) to apply the cut on its left to the TAB-delimited items to the right. These items have the “TAB” character ( 9{a. ) concatenated ( , ) to each ( &. ) unboxed ( > ) item on the right. So this puts a tab character at the beginning of each tab-delimited line to set up the “box cut in each box” ( <;._1 &> ) to give us a row of cells in a matrix for each tab-delimited row.
The argument to this preceding “tablify tab-delimited rows” is the result of “cut” ( ;. ) applied, using “box” ( < ) as we've already seen but with a noun right argument of negative two ( _2 ) which specifies using the final element (indicated by the “2”) as the partition indicator and eliminating it (indicated by the negative sign) from the result.
The thing being cut is the right argument ( ] ) which is a noun defined by the following lines ( 0 : 0 ) until the first solitary right parenthesis.
Working from Right to Left
To show these last two steps by example, we’ll look at the two major steps in the order in which they are run, from right to left. First, we turn the character vector returned by “0 : 0” into a vector of lines based on the final (LF) delimiter:
<;._2 ] 0 : 0 Beer Name Served In ABV Price Allagash White 16oz. Draft 5.5 $7.00 ) +--------------------------------+---------------------------------------+ |Beer Name Served In ABV Price|Allagash White 16oz. Draft 5.5 $7.00| +--------------------------------+---------------------------------------+
Then we break each of these lines into a series of boxes based on the tab delimiter:
<;._1&>TAB,&.><;._2 ] 0 : 0 Beer Name Served In ABV Price Allagash White 16oz. Draft 5.5 $7.00 ) +---------------+------------+----+-----+ |Beer Name |Served In |ABV |Price| +---------------+------------+----+-----+ |Allagash White |16oz. Draft |5.5 |$7.00| +---------------+------------+----+-----+
This gives us the table we split into the vector of column names and table of data.
Using the Data
Now that we know how we created this vector of column titles and its corresponding matrix of data, what might we do with it? It looks like we could perform some analysis on the numeric portion of the table but we would first need to extract the pure numbers from each of the three final columns. We could do this by eliminating all but the characters which comprise a number here – '0123456789.' - except that there's a period in the “Served In” column which is not part of a number. Also, this column has trailing characters we'd like to eliminate whereas the “Price” column has a leading character, so there doesn't seem to be a single, simple way to convert all three numeric columns to numbers.
Should we process each column with its own numeric extraction verb or is there a general verb we can apply to each column with some sort of modifier for each of the three cases?
For an example of how we might use the results of this phrase as applied to another tab-delimited table, take a look at the follow-up to this explanation in next month's meeting.
“Prompt” Rides Again
We get an inquiry like this with some regularity as new people attempt J:
from: James Bragg jsbragg@att.net to: general@jsoftware.com date: Mon, May 7, 2012 at 10:20 AM subject: [Jgeneral] How do I get user input? First, I'm really enjoying learning J. Is there a prompt verb? Would "get input" be a prompt verb? Is it possible to get input from a user in the J Term gtk window or J console and assign it to a variable? Thanks in advance, -james
Raul responds with pretty much the party line.
from: [[User:Raul Miller|Raul Miller]] rauldmiller@gmail.com date: Mon, May 7, 2012 at 10:58 AM > First, I'm really enjoying learning J. That is good to hear. > Is there a prompt verb? Would "get input" be a prompt verb? Is it possible > to get input from a user in the J Term gtk window or J console and assign it > to a variable? There is, but it's ... it's not exactly deprecated, but "neglected" might be accurate. For most purposes, the J command line is the best "prompt verb". For many other purposes, you should be building a "UI" rather than a command line interface. If you are going to stick to a command line interface, the J command line is usually the best tool for the job. This leaves a neglected niche for the spurious cases where you want a prompt. In j6.02, this worked in the GUI and I think worked in jconsole on linux/mac (but not in jconsole on windows): require'misc' prompt '$ ' $ 1 2 3 1 2 3 In J7.01, with its various front ends for jconsole (including the browser, via jhs) I do not think that this mechanism is supported robustly. (But I have not kept up to date on everything that's going on in J7.01 so maybe that aspect has been built out.) FYI,
I also pointed the inquirer to an evolving FAQ on the subject.
from: Devon McCormick devonmcc@gmail.com date: Mon, May 7, 2012 at 11:17 AM Hi – this is a recurring question but the answer keeps changing as Raul indicated. The extensive answer here - http://www.jsoftware.com/jwiki/RicSherlock/Temp/InteractivePrompt - is out of date with J7, so I attempted to figure out how to update it. It looks like you can do this from the J7 GTK interface: load '~addons/gui/gtkwd/jinput.ijs' input_result=: 3 : 'RESPONSE=: y' create_jinput_ 'Hi" to create a window with the prompt "Hi" and return the result in the global "RESPONSE". Unfortunately, this breaks silently on the "try." line in "a_ok_button_jinput_" because "COCREATOR" is not defined; this is supposed to be a callback to the user-defined "input_result" verb (as in the example shown above) in the namespace that created the prompt. My klugy fix for it is to re-define "a_ok_button_jinput_" as follows - by hard-coding the "base" namespace - then ensuring that you run from this (which is the default) namespace: a_ok_button_jinput_=: 3 : 0 wd 'pclose' try. input_result_base_ edit catch. end. destroy'' ) So, after loading "jinput.ijs", re-defining as shown and creating "input_result" as shown, when you run create_jinput_ 'Hi' you'll get a (character vector) variable "RESPONSE" with the user's input.
Bill Lam weighed in with where to look to address the “COCREATOR” defect.
from: bill lam bbill.lam@gmail.com date: Mon, May 7, 2012 at 6:39 PM that jinput is a copy of what is there in j602. The correct way of using it as demonstrated in intest.ijs and it should be the same as that appeared inside Ric's webpage. However I have never used jinput myself. require 'jinput' ABC=: 0 : 0 pc abc; xywh 136 8 34 12;cc ok button;cn "OK"; xywh 136 23 34 12;cc cancel button;cn "Cancel"; xywh 21 16 97 11;cc edit edit ws_border es_autohscroll; xywh 51 32 34 12;cc input button; pas 6 6;pcenter; rem form end; )…
Show-and-Tell
A Brief Overview of the 2012 BAPLA APL Moot
The BAPLA general meeting opened the conference. Topics discussed included: Allowing many posts to be filled by representatives from the same company (Optima), whether or not to continue to attempt to retrieve the funds owed BAPLA by BCS (about £14,000), opening nominations for the Outstanding Achievement Award to a wider community (to include members not only of APL but J, K, Q, etc.), changing the frequency at which Vector is published.
We agreed to allow a disproportionate representation from a single company as the posts are hard to fill. Chris Hogan volunteered to prod the BCS another time for the money owed us. The nominations for the Outstanding Achievement Award can be made by any member of BAPLA, so there is really no restriction on the field of nominees. We agreed that Vector need not be published four times a year.
We were introduced to the area surrounding the conference venue - the Lee Valley Park - by a park ranger who told us about the history and ongoing efforts to re-vitalize this 26-mile long nature preserve. Following this, Kai Jaeger presented work he’s been doing on “FIRE” – a find-and-replace utility for APL workspaces. It provides a slick, GUI-based functionality for doing this in an APL-oriented way. We finished the first day with a barbecue. The second day began with general discussion and was oriented toward Dyalog as we had two of their main developers in for the day. Kai filled their ears with suggestions based on his intensive use of their product. Dick Bowman demonstrated the use of the Windows WPF facility under Dyalog. They make use of XAML to define layouts and callbacks much as Glade does with XML. We heard from a few J people. I told about NYCJUG and gave an extended example of the sort of topics we cover by re-capping a discussion from the previous meeting on “The Fat Middle” problem – see allAtOnceVsApplyAcross.pdf at http://www.jsoftware.com/jwiki/NYCJUG/2012-04-11 . Graeme Robertson showed us some interesting work he does using J to track down features of large structures by analyzing the sounds they make. He showed us how he analyzes a large dataset of sound samples, using J to generate graphical representations and focusing on areas of particular interest in greater detail as needed. David Alis demonstrated a tool he uses to help break down tacit expressions in J in order to better understand them. We had a well-organized working environment where we sat around an open, central area where speakers could set up and use he projector. There were plenty of power strips available, as well as a steady supply of beverages and snacks. There were a number of other demonstrations, mostly of facilities of Dyalog APL, including uses of its “dynamic function” facility. Stephen Taylor used the opportunity to work on beginning to learn K on a laptop he’d re-purposed to be a Linux machine. He had very exciting news to share about what’s happening in the K world. Phil Last deserves a lot of credit for putting this together and ensuring that it ran so smoothly. There are plans for participants to post follow-ups to what we talked about at the moot on the moot wiki at http://moot.aplwiki.com/ . More pictures are available here: http://www.flickr.com/photos/photonatic/sets/72157629960791987/
Showcasing J: Merge-Sort Examples
Following are some examples of the famous "Merge-Sort" algorithm implemented in different languages. This arose out of a Meetup here in New York about alogrithm implementation: the group meets each week to outline and discuss algorithms. This discussion followed an on-line university course about this well-known sorting algorithm.
J Implementation
NB.* mergeSort.ijs: basic implementation of merge sort. NB. Can't handle > 9998 integers -> stack error if we use "merge0" which NB. is a simple, recursive implementation; "merge" is much faster and NB. handles larger arguments. mergeSort=: 3 : 0 if. 2>#y do. y else. (<.-:#y)(([: mergeSort {.) merge [: mergeSort }.) y end. ) merge=: [: /:~, merge0=: 4 : 0 if. 0=#x do. y return. end. if. 0=#y do. x return. end. 'x0 y0'=. {.&.>x;<y if. x0<y0 do. x0,(}.x) merge y else. y0,x merge }.y end. )
As one might expect, the J implementation clearly illustrates the algorithm more succinctly than do the following two examples in Java and Python. However, note that the latter implementation of "merge" illustrates the common problem of over-specification in algorithm explanation. If we implement "merge" in order to follow the overly-detailed method outlined in the course, we fail to take advantage of J's strengths and end up with a very fragile and poorly-performing implementation.
Of course, J usually deals with sorting at a much more sophisticated level than is found in other languages by having "grade" as its primary verb, but it may be a while before the rest of them catch up to us, especially if they remain stuck at the finely over-wrought level exemplified by the latter example of "merge" shown above.
Java Implementation
[from Java: [from https://github.com/AlgorithmsNYC/AlgorithmsNYC/blob/master/ALGO101/UNIT_01/MergeSort.java]]
/* MergeSort.java: example of Merge-Sort algorithm. */ import java.util.ArrayList; import java.util.Arrays; import java.util.List; /** * @author kalyan */ public class MergeSort { public static void main(String args[]) { ArrayList<Integer> sortedList = new ArrayList<Integer>(); List<Integer> intList = new ArrayList<Integer>(); int[] input = new int[]{9,4,7,2,6,0,3,1,8,5}; for(int i=0;i<input.length;i++) intList.add(i, new Integer(input[i])); System.out.println("Unsorted list: "); for(int i = 0;i<intList.size()-1;i++) System.out.printf( "%s, ", intList.get(i)); System.out.println(intList.get(intList.size()-1)+"."); sortedList = mSort((ArrayList<Integer>) intList); System.out.println("Sorted list: "); for(int i = 0;i<sortedList.size()-1;i++) System.out.printf( "%s, ", sortedList.get(i)); System.out.println(intList.get(sortedList.size()-1)+"."); } public static ArrayList<Integer> mSort(ArrayList<Integer> listToBeSorted) { ArrayList<Integer> left = new ArrayList<Integer>(); ArrayList<Integer> right = new ArrayList<Integer>(); int mediumLen = 0; // if list size is negative or zero if(listToBeSorted != null && listToBeSorted.size()<=1) { return listToBeSorted; }else { //determine mid point mediumLen = listToBeSorted.size()/2; System.out.println("medium length:"+mediumLen); //add anything less than medium index to left array for(int i=0;i<mediumLen;i++) { System.out.println("add to left array iteration:"+i+"value being added:"+listToBeSorted.get(i)); left.add(listToBeSorted.get(i)); } //add anything greater than or equal to right array for(int i=mediumLen;i<listToBeSorted.size();i++) { System.out.println("add to right array iteration:"+i+"value being added"+listToBeSorted.get(i)); right.add(listToBeSorted.get(i)); } System.out.println("left array size:"+left.size()); System.out.println("right array size:"+right.size()); left = mSort(left); right = mSort(right); } return merge(left,right); } public static ArrayList<Integer> merge(ArrayList<Integer> leftArrList ,ArrayList<Integer> rightArrList) { ArrayList<Integer> result = new ArrayList<Integer>(); while(leftArrList.size()>0 || rightArrList.size()>0) { if(leftArrList.size()>0 && rightArrList.size()>0) { if(leftArrList.get(0)<=rightArrList.get(0)) { result.add(leftArrList.get(0)); //append first of left to result leftArrList.remove(0); //append rest to left } else { result.add(rightArrList.get(0)); //append first of right to result rightArrList.remove(0); //append rest to right } } else if(leftArrList.size()>0) { result.add(leftArrList.get(0)); //append first(left) to result leftArrList.remove(0); //left = rest(left) } else if(rightArrList.size()>0) { result.add(rightArrList.get(0)); //append first(right) to result rightArrList.remove(0); //right = rest(right) } } return result; } }
Python Implementation
#TopDownMergeSortInPython: #A Python implementation of the pseudocode at #http://en.wikipedia.org/wiki/Merge_sort def merge_sort(m): if len(m) <=1: return m left = [] right = [] middle = int(len(m) / 2) for x in range(0, middle): left.append(m[x]) for x in range(middle, len(m)): right.append(m[x]) left = merge_sort(left) right = merge_sort(right) return merge(left, right) def merge(left, right): result = [] while len(left) > 0 or len(right) > 0: if len(left) > 0 and len(right) > 0: if left[0] <= right[0]: result.append(left[0]) left = left[1:len(left)] else: result.append(right[0]) right = right[1:len(right)] elif len(left) > 0: result.append(left[0]) left = left[1:len(left)] elif len(right) > 0: result.append(right[0]) right = right[1:len(right)] return result input = [5, 4, 7, 8, 2, 3, 1, 6, 9] print merge_sort(input)
Graphics Play in J7
There’s a “Traveling Salesman” problem challenge here - http://www.tsp.gatech.edu/data/ml/monalisa.html, which starts like this.
In February 2009, Robert Bosch created a 100,000-city instance of the traveling salesman problem (TSP) that provides a representation of Leonardo da Vinci's Mona Lisa as a continuous-line drawing. Techniques for developing such point sets have evolved over the past several years through work of Bosch and Craig Kaplan. An optimal solution to the 100,000-city Mona Lisa instance would set a new world record for the TSP. If you have ideas for producing good tours or for showing that a solution is optimal, this is a very nice challenge problem! I would be happy to report any computational results you obtain on this example. The data set in TSPLIB format is given in the file “mona-lisa100K.tsp”. The following papers describe various aspects of the mathematics behind the selection of city locations for the Mona Lisa TSP…. New! $1000 Prize Offered
So here’s what I did to play with the dataset.
1!:44 'C:\amisc\J\' mltsp=. <;._2 CR-.~fread 'mona-lisa100K.tsp' nn=. _".&>}."1 <;._1&>' ',&.>_1}.6}.mltsp $nn 100000 2 load 'plot' 6!:2 '''type point;pensize 2;output cairo 1000 1000'' plot j./"1 ] nn' 3.35216
. This gives us the following plot:.
Of course, the more interesting and much harder problem of solving the TSP for this set remains to be implemented in J.
Advanced Topics: Explaining Eigenvectors
Following an essay about teaching the "scary-sounding" concept of "eigenvalues", there's this comment with an illustration of the use of eigenvectors and eigenvalues:
-- Andrew said...
Suppose that x people in your state live in the city, and y people live in the country. Each year, 10% of the people who lived in the country last year will move to the city this year; and each year, 5% of the people who lived in the city last year will move to the country. So if x_n and y_n are the new city and country populations, and x_o and y_o the old ones, then
x_n = (0.95)x_o + (0.10)y_o y_n = (0.05)x_o + (0.90)y_o
which you could write as a matrix. Now, start with any city and country population you like, and multiply by the matrix to find the population next year, and the next, and the next... You will find the populations approaching an equilibrium. When the populations are at equilibrium, multiplying by the matrix to get next year's population will give the same population as this year. That is, the equilibrium populations are an *eigenvector* of the matrix, with eigenvalue 1. You can solve algebraically for this fixed vector by using the fact that for the equilibrium values x_e and y_e, you have x_e=x_o=x_n, i.e.
x_e = (0.95)x_e + (0.10)y_e y_e = (0.05)x_e + (0.90)y_e
(You'll end up with two copies of the same equation: this only gives the *ratio* of x_e and y_e. To find the values you need to know the total population you started with.) You can find more examples like this (and more realistic ones) by looking up "Markov chains". Google uses a similar sort of reasoning to do its ranking of webpages. If you look up "Google eigenvector" you can find two or three good expository articles on this. . November 29, 2010 8:26 PM
-- Sue VanHattum said...
@Andrew, oh yeah, I remember Markov chains. So anything that can be described that way, if it has an equilibrium, will have an eigenvalue of 1 and an eigenvector representing the equilibrium state. I agree that that's a good way to start.
. December 1, 2010 5:31 PM
--
We can look at the population migration scenario in J with a simple simulation:
pop=. 1e6 1e5 ]popchg=. 0.95 0.10,:0.05 0.90 0.95 0.1 0.05 0.9 popchg (+/ . *) pop NB. Check result for one year. 960000 140000 NB. 9600000 = (0.95*1e6) + 0.10*1e5 and NB. 140000 = (0.05*1e6) + 0.90*1e5: OK popchg (+/ . *)^:(i.3) pop NB. Show for first 3 years: 0 through 2 1e6 100000 960000 140000 926000 174000 popchg (+/ . *)^:(_) pop NB. Iterate until stable. 733333 366667
However, I'm not sure how to relate this scenario-based approach to the one using eigenvalues and eigenvectors as mentioned in these comments. Using one of the simple J definitions of eigenvalue mentioned on the wiki, I get a result like this:
rheval=:+/ .*&>/@|.@(128!:0) NB. Roger Hui rh=: (<0 1) |: rheval^:_ rh popchg 1 0.85
I see that perhaps the initial result of "1" tells me there's a stable equilibrium to this problem but I'm not sure what to make of the "0.85". Perhaps someone more well-versed in this could elaborate?
Using the code from Donald MacIntyre's article on using Jacobi's method to calculate eigenvectors, I get this result:
1e_5 clean 1e_6 jacobi popchg 1 0 _0.05 0.85 0.707107 _0.707107 0.707107 0.707107
However, the reconciliation between this and the result of my simulation eludes me.
Learning, teaching, and problem-solving
Materials
"Gathering and Using Data.pdf"
"How does Our Language Shape the Way We Think.pdf"
"Kinect SDK1 - A 3D Point Cloud.pdf"
"Rant against code that is too big by Yegge.pdf"
"Which comes first - language or thought.pdf"
. -- Devon McCormick <<DateTime>>