NYCJUG/2006-07-11
Proceedings
We talked about improving the J FAQs, regular expressions, John's "Rough Guide to Reading J", potential niches for J, J compared to other languages like Perl and how we might emulate some of this language's success.
Beginner's regatta
Dan says he has a template for an FAQ (Frequently Asked Questions) page. We should put this up on the J Wiki (there are now two: Language FAQ and General FAQ).
Niches for J
What are potential niches for J?
John thinks that some MATLAB territory is claimable by J. This will only work if a domain expert can use the program immediately: the argument that you first have to spend a year or so getting up to speed on J is not a strong selling point.
Are there any such areas we can identify?
Advanced topics
Lessons from Perl
Perl is a relatively popular niche language with a community reminiscent of the J community. What can we learn from this language's strengths?
Learning and Teaching J
A Rough Introduction to Reading J
Here we have a brief but thorough essay from John Randall on how to approach reading J phrases. Also available as a PDF File:Readingj.pdf which looks much nicer than the wiki.
1. Introduction
Much has been written about writing in J: we concentrate here on reading. Introductory materials focus on writing sentences that result in a noun and writing explicit verbs. A J learner is then shocked by reading J phrases [1] and postings on the J forums [2] to find that experienced J programmers write in what looks like another language. For example, frequently used code for the odometer verb is given by
odometer=:#:i.@(*/)
The aim of this article is to provide the intermediate J learner with a way to reading such code in J601 [though it remains widely applicable through at least version 9.04].
J is often criticised as a write-only language. It is quite possible to write unreadable code in any language, for example in the International Obfuscated C Code Contest [3]. Most beginners in C have struggled with the code [The C Programming Language, p. 106] [4]
while(*s++=*t++);
for copying strings.
Many languages require a certain amount of redundancy. For example, in constructing an object, Java requires duplication of the class identifier:
String s=new String("Hello, world!");
Yes, s is a String. This can aid comprehension while inflating the size of programs.
Much J code is not deliberately obfuscated, but has little or no redundancy, and careful attention must be paid to details. In addition, J has an immediate execution model: sentences are executed and their value is whatever it happens to be. Even the part of speech of a name cannot be inferred from examination of the program text alone.
2. Parsing and execution
Most J programs do not use the full generality of the language, and you can get by with some rules of thumb that at least allow comprehension of the code. As with all useful fictions, these rules will sometimes fail. The precise rules used in parsing and execution, are given on a single page, Section II.E of the J Dictionary[5]. As with all specifications, not all of the consequences are drawn out. We give some useful pointers here. Henry Rich’s book[6] has a more detailed explanation, including useful parse examples.
2.1. Lexical Analysis
The rhematic rules of J, whereby characters are formed into words, are largely unremarkable, but have a few quirks in J: the use of _ instead of - in number formation (so -1 0 1 is _1 0 _1 and not _1 0 1), and the fact that a list of numbers is a word resulting in a noun. It is easy to go wrong if you use numbers as arguments. For example
(0) 1 } i.3 NB. amend item 1 to 0 0 0 2 0 1 } i.3 |rank error
In the second example, 0 1 is taken to be a single word. On the other hand, a list of nouns is a syntax error: there are no strands (sequences of nouns interpreted as lists) in J.
a=:1 b=:2 a b NB. syntax error a,b NB. probably what was meant.
Locales are another lexical issue: we completely ignore them in this article by never using _ in names.
2.2. Syntax analysis
Syntax analysis in J is significantly different from a compiled language, where the syntactic meaning of everything in the program text is known after compilation and before execution: in J, the part of speech of an executable fragment may not be known until after execution.
2.3. Types of nouns
J has no declarations: the types of nouns are determined dynamically. The sentences
x=:42 NB. integer x=:’asdf’ NB. literal string x=:1 2 3 NB. list
respectively assign an integer, a string and a numerical list to x. While this is a change from statically typed languages, it is easy to grasp.
2.4. Parts of speech
J also determines part of speech dynamically: the part of speech to which a name refers cannot be inferred from examination of the program text alone. This is harder to grasp, and imposes obvious limits to such things as syntax highlighting of the program text. The sentences
x=:+ NB. verb x=:/ NB. adverb x=:@ NB. conjunction
assign different parts of speech to the same variable x. The sentence
x=:m :n
can assign a noun, adverb, conjunction or verb to x depending on the value of m. Ignoring the mnemonic value of labelling syntactic entities by parts of speech, we can give some general properties. The only ambivalent objects in J are verbs.
All other parts of speech have a fixed number of arguments:
valence | part of speech | example |
---|---|---|
0 | noun | 0 1 2 |
1,2 | verb | + |
1 | adverb | / |
2 | conjunction | & |
2 | copula | =: |
Verbs take nouns as arguments and result in nouns. Adverbs and conjunctions (collectively known as modifiers) have arguments that are nouns or verbs, and can result in any part of speech. This extreme generality is dealt with quite conservatively by the J primitives: in most cases adverbs and conjunctions result in verbs or nouns, and the nouns that are produced are generally gerunds. Notable exceptions are ~ (evoke), : (explicit definition) and f. (fix). User-defined modifiers that do not produce verbs are beyond our scope.
2.5. Rules of thumb for parsing and execution
We first give some simple rules (these are expansions of the “simplified rules” in Section II.E of [JDictionary]).
- Execution generally proceeds from right to left, except for parentheses, which are executed when encountered. There is no precedence of verbs.
1-2+3 _4 (1-2)+3 2 1*2+3 5
The rule on parentheses is almost the same as “parentheses are done first”, but it also specifies the order in which they are done. For example
(2 f 3) [ (f=.+) 0 5
works because (f=.+) (which defines f) is executed before (2 f 3).
Unlike many languages, J has no rules as to the order of evaluation of arguments to dyadic verbs: this is determined by context. [****Example****]
- Adjectives and conjunctions are executed before verbs, and they have long left scope. [***Examples****]
- +/\ means (+/)\
- f^:g^:_ means (f^:g)^:_ not f^:(g^:_)
- f+g"1 means (f+g)"1 not f+(g"1)
- A verb is applied dyadically if possible, that is if its right argument is a noun and its left argument is a noun that is not the right argument of a
conjunction.
*1-3 NB. - (minus) is applied dyadically *&1-3 NB. - (negate) is applied monadically
Even if you have defined a verb to be a monad, it will still be applied dyadically if it has appropriate arguments
f=:3 : ’+’ NB. define f to be monad + (conjugate) 2 f 3 |domain error: f | 2 f 3
This gives a domain error, the result of applying a monad dyadically, rather than a syntax error from the juxtaposition of two nouns. A common error is to assume that bonding a noun to a verb produces a monad:
plus2=:2&+ NB. both a monad and a dyad plus2 3 NB. monadic application 5 4 plus2 3 NB. dyadic application: means plus2^:4 (3) 11
- Certain isolated sequences yield trains, that are interpreted in special ways. We discuss examples of these below. Let us note for now that
+/%# 1 2 3
and
(+/%#) 1 2 3
have completely different results: the first example can be resolved by prior rules, the second is a train.
Tacit definitions
3.1. Introduction
The J website contain a wealth of useful constructions: however, they are difficult for a novice to read because many are tacit verb definitions, and completely unlike the explicit style of verb definitions given in introductions to J.
In addition to providing the ability to formally manipulate definitions, the major reasons for tacit definitions are conciseness and performance, particularly the latter: under most circumstances a tacit verb will execute faster than an explicit equivalent. This is especially true when the verb is applied with small rank.
While trains may be used casually in explicit style programming, they are central to tacit programming, and many advanced verb definitions are tacit. We begin with some generalities, and then describe some important examples in detail.
3.2. Trains
Tacit definitions are parsed in exactly the same way as the rest of J, but there are unexpected consequences.
Certain sequences of parts of speech are identified as trains (hooks and forks), and interpreted in a special (see [2, Section II.F]). They must be “isolated” in that they are either parenthesized or occur in a tacit verb definition: you do not get this behavior when you apply a sequence of verbs to noun arguments. We will deal here only with trains of verbs: the hook (g h), fork (f g h), and capped fork ([: g h). A longer train is resolved into forks from the right with possibly a final leftmost hook. Thus the verb train (a b c d e) means a b (c d e) and (a b c d e f) means a (b c (d e f)).
When we talk about a verb being monadic or dyadic, we are talking about its intended use: any verb will be applied according to the number of arguments supplied.
Monadic hook: (g h) y means y g h y. Note that g is dyadic and h is monadic.
Example: odometer (monadic hook)
odometer=: #: i.@(*/) NB. equivalent to 3 : ’y #: (i. */ y)’ odometer 2 3 0 0 0 1 0 2 1 0 1 1 1 2
You can get the tree display
5!:2 <'odometer' +--+------------+ |#:|+--+-+-----+| | ||i.|@|+-+-+|| | || | ||*|/||| | || | |+-+-+|| | |+--+-+-----+| +--+------------+
This is a hook g h, where g is #: and h is i.@(*/) . Note the parenthesization of the right argument to the conjuction @, which is frequently required, because of
the ”long left scope” rule. (If not, i.@*/ is parsed as (i.@*)/ . This is an easy mistake to make if you forget that frequently used verb-adverb combinations such as +/ are not primitives). The sentence
odometer 2 3
invokes the hook (g h) monadically. In this case h 2 3 is just i.6. The original argument 2 3 is reused as the left argument to g to give 2 3 #: (i.6) which when executed yields the answer above.
Example: isint (monadic hook)
isint=:=<. NB. is y an integer? Equivalent to 3 : ’y=<.y’ isint 1.5 1 3j2 0 1 1
Dyadic hook: x (g h) y means x g h y. Once again, g is dyadic and h is monadic. Whether the hook is dyadic or monadic does not affect the valence of g or h, only whether x or y is used as the left argument to g.
Example: reshape (dyadic hook)
NB. Reshape ravelled atoms of y by shape x NB. Equivalent to 4 : ’x $ (,y)’ reshape=:$, 3 1 reshape 3 $ i.3 0 1 2 2 3 reshape (|: 3 2 $ i.6) 0 2 4 1 3 5
Monadic fork: (f g h)y means (f y) g (h y). f and h are applied monadically to the fork argument, while g is applied dyadically to their result.
Example: mean (monadic fork)
+/%# 1 2 3 NB. not a fork: equivalent to +/(% (# 1 2 3)) 0.333333 mean=:+/%# NB. monadic fork NB. equivalent to 3 : ’(+/ y) % (# y)’ mean 1 2 3 2
The adverb / is executed first, so that mean is equivalent to (+/)%#, and is a fork (f g h). The sentence mean 1 2 3 executes the fork monadically, so it is equivalent to (+/ 1 2 3)%(# 1 2 3).
Dyadic fork: x(f g h)y means (x f y) g (x h y). g is applied dyadically, and f and h are applied dyadically to the original fork arguments.
Example: membership in open interval (dyadic fork)
NB. x inopen (a,b) tests if x is in the open interval (a,b), NB. equivalent to 4 : ’(x>{.y)*.(x<{:y)’ inopen=:([>{.@:])*.([<{:@:]) 1 inopen 2 3 0 2 inopen 1 3 1
This is a fork (f g h), and f and h are each forks themselves. Because @ is a conjunction, [>{.@] means [>({.@]). Note the use of [ and ] to select left and right arguments. They are not quite the same as x and y in explicit definitions: in particular ] is a verb while y is a noun.
Capped fork: If the verb f in the fork (f g h) is [: , the fork is called a capped fork, and has a different effect. In ([: g h) the verb [: serves merely as a marker and it is not executed. Instead g is executed monadically. Arbitrarily long compositions can be written as trains: [: a [: b [: c [: d e is equivalent to a @: b @: c @: d @: e. Only the rightmost verb of such a train or composition can be applied dyadically: all the other verbs must be applied monadically.
Monadic capped fork: ([: g h) y means g h y. This is the same as g@:h, the composition of g with h.
Example: number of divisors (composition)
ndiv=:*/@:>:@:{:@:(__&q:) NB. written as composition ndiv=:[: */ [: >: [: {: [: (__&q:) NB. written as train ndiv 120 16
This has the form of a composition f@:g@:h@:k . The first verb gives
(__&q:) 120 2 3 5 3 1 1
This is a table of primes and exponents in the factorization of 120. The number of divisors is obtained by selecting the last row, incrementing it, and inserting product.
Dyadic capped fork. x ([: g h) y means g (x h y): only h is executed dyadically.
Example: distance as the norm of the difference of two vectors
norm=:+/&.(*:"_)"1 NB. Euclidean norm of a vector distance=:[: norm - NB. Euclidean distance between vectors NB. equivalent to norm@:- or 4 : ’norm x-y’ 1 2 distance 4 6 5
Hooks v forks: A hook is not the same as a capped fork. In (g h), h is monadic and g is dyadic. In ([: g h), h has the valence of the fork and g is monadic. [***Example***]
3.3. Control structures
J’s control structures, such as for., while. and if., can only be used in explicit definitions. It is therefore not obvious to an intermediate J programmer how these apparently essential structures can be dispensed with in tacit programming.
Since our emphasis is on reading rather than writing J, we will confine ourselves to pointing out some of the more common ways in which control structures may be implemented tacitly.
Definite iteration. Every time a verb is executed, some implicit looping may occur.
This mechanism in itself obviates the need for many loops. Certain patterns of use are supported by J primitives: outer product by f is the dyadic form of f/, and inner products can be represented as f . g for suitable f and g (for example matrix multiplication or vector dot product is +/ . * and determinant of a matrix is -/ . * ). J also has a number of adverbs which apply a verb to parts of an array, for example / (insert), \ (prefix) and \. (suffix). These and other primitives cover many of the cases of traditional definite iteration.
Functional iteration is done using the power conjunction (^:). In its simplest form, u^:n y returns the n-fold iteration of u on y, so for example
>:^:10 (1) 11
Indefinite iteration. Indefinite iteration (corresponding to while.) is usually done tacitly with functional iteration.
Applying a verb infinitely often corresponds to fixed point iteration, so u^:_ y will iterate u until y-:u y.
The canonical example of fixed-point iteration is Newton’s method. Since J501, most adverbs can only be written explicitly.
newton=:1 : ’] - u % u d. 1’ NB. Newton’s method adverb (*:-2:) newton^:_ (3) 1.41421
In mathematical terms, newton applied to a function f returns the Newton iterating function g(x) = x − f(x) / f'(x) . Infinite iteration of g means iteration until g(x) is tolerantly equal to x.
The construction f^:g^:_ y is equivalent to while. g y do. y=.f y end. except that the loop will also terminate at a fixed point of f. For example, greatest common divisor can be calculated (inefficiently) as
gcd=:(<./ , >./ - <./)^:(~:/)^:_ gcd 21 49 7 7 gcd 23 121 1 1
Writing
sub=:<./ , >./ - <./ neq=:~:/
gcd has the form sub^:neq^:_ . It will execute while its argument consists of distinct items (while neq is true), and then sub returns the smallest and the largest minus the smallest of its arguments.
Conditionals. Conditionals tend to be avoided by J programmers. However, there are some tacit forms that are used.
The first is using # to selecting items from an array. The verb selectodd selects the odd numbers from a numeric list.
selectodd=:#~ 2&| NB. equivalent to 3 : ’(2&| y) # y’ selectodd i.20 1 3 5 7 9 11 13 15 17 19
The if. statement can be done using the power conjunction: f^:g (the one-shot version of the infinite iteration discussed above) corresponds to if. g y do. f y end. [***Example***]
The if...else statement requires a restricted form of the agenda conjunction (@.), which is a functional form of the computed goto. The verb f‘t@.c has the same effect when applied monadically as if. c y do. t y else. f y end. [***Example***]
3.4. Recursion
3.5. Variables
3.6. Monad/Dyad
3.7. Amend
3.8. Asymmetry of dyadic verb arguments
3.9. Desperate measures
- When rank fails, box everything and apply f&.> .
- Where tacit programming fails.
References
- ↑ Chris Burke, Roger K.W. Hui, Kenneth E. Iverson, Eugene E. McDonnell, and Donald B. MacIntyre. J Phrases. Jsoftware, 1991–2005.
- ↑ J forums. http://jsoftware.com/forums.htm.
- ↑ International Obfuscated C Code Contest. http://www.ioccc.org/.
- ↑ Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language. Prentice Hall, second edition, 1988.
- ↑ Roger K.W. Hui and Kenneth E. Iverson. J Dictionary. Jsoftware, 1991–2006.
- ↑ Henry Rich. J for C Programmers. Jsoftware, 2006
Scan of Meeting Notes
The only man I know who behaves sensibly is my tailor; he takes my measurements anew each time he sees me. The rest go on with their old measurements and expect me to fit them. - George Bernard Shaw