User:John Randall/ReadingJ
The Rough Guide to Reading J
Note: This document contains uncorrected errors. Please report any more.
Introduction
Much has been written about writing in J: we concentrate here on reading. Introductory materials focus on writing sentences that result in a noun and writing explicit verbs. A J learner is then shocked, on reading J Phrases and postings on the J Forums, to find that experienced J programmers write in what looks like another language. For example, frequently used code for the odometer verb is given by
odometer=:#:i.@(*/)
The aim of this article is to provide the intermediate J learner with a way to read such code in J602.
J is often criticised as a write-only language. It is quite possible to write unreadable code in any language: see, for example, the Obfuscated C Contest. Most beginners in C have struggled with
while(*s++=*t++);
for copying strings.
Many languages require a certain amount of redundancy. For example, in constructing an object, Java requires duplication of the class identifier:
String s=new String("Hello, world!");
Yes, s is a String. This can aid comprehension while inflating the size of programs.
Much J code is not deliberately obfuscated, but has little or no redundancy, and careful attention must be paid to details. In addition, J has an immediate execution model: sentences are executed and their value is whatever it happens to be. Even the part of speech of a name cannot be inferred from examination of the program text alone.
Parsing and execution
Most J programs do not use the full generality of the language, and you can get by with some rules of thumb that at least allow comprehension of the code. As with all useful fictions, these rules will sometimes fail. The precise rules used in parsing and execution are given on a single page, Section II.E of the J Dictionary. As with all specifications, not all of the consequences are drawn out. We give some useful pointers here. Henry Rich's book has a more detailed explanation, including useful parse examples.
Lexical analysis
The rhematic rules of J, whereby characters are formed into words, are largely unremarkable, but have a few quirks in J: the use of _ instead of - in number formation (so -1 0 1 is _1 0 _1 and not _1 0 1), and the fact that a list of numbers is a word resulting in a noun. It is easy to go wrong if you use numbers as arguments. For example
   (0) 1 } i.3   NB. amend item 1 to 0
0 0 2
   0 1 } i.3
|rank error
In the second example, 0 1 is taken to be a single word. On the other hand, a list of nouns is a syntax error: there are no strands (sequences of nouns interpreted as lists) in J.
   a=:1
   b=:2
   a b   NB. syntax error
   a,b   NB. probably what was meant.
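One way to check how a sentence is split into words is the primitive ;: (word formation), which boxes the words of a character string. The following is a minimal sketch applied to the amend examples above; it counts the words rather than displaying the boxes.

```j
   # ;: '0 1 } i.3'       NB. four words: 0 1 is a single word
4
   # ;: '(0) 1 } i.3'     NB. seven words: 0 and 1 are now separate
7
   > {. ;: '0 1 } i.3'    NB. the first word, as a character string
0 1
```

Displaying ;: 'sentence' directly shows each word in its own box, which is often the quickest way to spot an unintended numeric list.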
Locales are another lexical issue: we completely ignore them in this article by never using _ in names.
Syntax analysis
Syntax analysis in J is significantly different from a compiled language, where the syntactic meaning of everything in the program text is known after compilation and before execution: in J, the part of speech of an executable fragment may not be known until after execution.
Types of nouns
J has no declarations: the types of nouns are determined dynamically. The sentences
   x=:42       NB. integer
   x=:'asdf'   NB. literal string
   x=:1 2 3    NB. list
respectively assign an integer, a string and a numerical list to x. While this is a change from statically typed languages, it is easy to grasp.
Parts of speech
J also determines part of speech dynamically: the part of speech to which a name refers cannot be inferred from examination of the program text alone. This is harder to grasp, and imposes obvious limits to such things as syntax highlighting of the program text. The sentences
   x=:+   NB. verb
   x=:/   NB. adverb
   x=:@   NB. conjunction
assign different parts of speech to the same variable x. The sentence
x=:m :n
can assign a noun, adverb, conjunction or verb to x depending on the value of m.
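The part of speech currently attached to a name can be queried with the foreign 4!:0 (name class), which gives 0 for a noun, 1 for an adverb, 2 for a conjunction and 3 for a verb. The sketch below repeats the assignments above and then uses m : n with a noun k; the name k and the definition bodies are illustrative assumptions, not taken from the article.

```j
   x=: +
   4!:0 <'x'          NB. 3 means verb
3
   x=: /
   4!:0 <'x'          NB. 1 means adverb
1
   x=: @
   4!:0 <'x'          NB. 2 means conjunction
2
   k=: 3              NB. k is a hypothetical name holding the left argument of :
   x=: k : '2*y'
   4!:0 <'x'          NB. k is 3, so x is a verb
3
   k=: 1
   x=: k : 'u/'
   4!:0 <'x'          NB. k is 1, so x is an adverb
1
```

The text of the sentence x=: k : '...' does not reveal what x will be: that depends on the value k has when the sentence is executed.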
Ignoring the mnemonic value of labelling syntactic entities by parts of speech, we can give some general properties. The only ambivalent objects in J are verbs. All other parts of speech have a fixed number of arguments:
valence   part of speech   example
0         noun             0 1 2
1,2       verb             +
1         adverb           /
2         conjunction      &
2         copula           =:
Verbs take nouns as arguments and result in nouns. Adverbs and conjunctions (collectively known as modifiers) have arguments that are nouns or verbs, and can result in any part of speech. This extreme generality is dealt with quite conservatively by the J primitives: in most cases adverbs and conjunctions result in verbs or nouns, and the nouns that are produced are generally gerunds. Notable exceptions are ~ (evoke), : (explicit definition) and f. (fix). User-defined modifiers that do not produce verbs are beyond our scope.
Rules of thumb for parsing and execution
We first give some simple rules (these are expansions of the "simplified rules" in Section II.E of the Dictionary).
1. Execution generally proceeds from right to left, except for parentheses, which are executed when encountered. There is no precedence of verbs.
   1-2+3
_4
   (1-2)+3
2
   1*2+3
5
The rule on parentheses is almost the same as "parentheses are done first", but it also specifies the order in which they are done. For example
   (2 f 3) [ (f=.+) 0
5
works because (f=.+) (which defines f) is executed before (2 f 3).
Unlike many languages, J has no rules as to the order of evaluation of arguments to dyadic verbs: this is determined by context.
2. Adverbs and conjunctions are executed before verbs, and they have long left scope.
- +/\ means (+/)\
- f^:g^:_ means (f^:g)^:_ not f^:(g^:_)
- f=:4 : 'x+y' "1 means (4 : 'x+y')"1
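The effect of long left scope can be seen by comparing a sentence with and without parentheses around the right argument of a conjunction. This is a small sketch using *: (square) and +/ (sum), chosen only for illustration.

```j
   *:@+/ 1 2 3        NB. parsed as (*:@+)/ : 1 (*:@+) (2 (*:@+) 3)
676
   *:@(+/) 1 2 3      NB. square of the sum
36
   (+/\ 1 2 3 4) -: (+/)\ 1 2 3 4
1
```

In the first sentence the conjunction @ takes everything to its left and the single verb + to its right, and only then does the adverb / apply to the result, so 2 (*:@+) 3 is 25 and 1 (*:@+) 25 is 676.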
3. A verb is applied dyadically if possible, that is if its right argument is a noun and its left argument is a noun that is not the right argument of a conjunction.
   *1-3     NB. - (minus) is applied dyadically
   *&1-3    NB. - (negate) is applied monadically
All verbs are ambivalent. However if a verb has only the monadic form defined, calling it dyadically will result in a domain error, and vice versa.
   f=:3 : '+y'   NB. define f to be monad + (conjugate)
   2 f 3
|domain error: f
|   2 f 3
This gives a domain error, the result of applying a monad dyadically, rather than a syntax error from the juxtaposition of two nouns.
A common error is to assume that bonding a noun to a verb produces a monad:
   plus2=:2&+    NB. both a monad and a dyad
   plus2 3       NB. monadic application
5
   4 plus2 3     NB. dyadic application: means plus2^:4 (3)
11
4. Certain isolated sequences yield trains, which are interpreted in special ways.
We discuss examples of these below. Let us note for now that
+/%# 1 2 3
and
(+/%#) 1 2 3
have completely different results: the first example can be resolved by prior rules, the second is a train.
Tacit definitions
Introduction
The J website contains a wealth of useful constructions: however, they are difficult for a novice to read because many are tacit verb definitions, and completely unlike the explicit style of verb definitions given in introductions to J.
In addition to providing the ability to formally manipulate definitions, the major reasons for tacit definitions are conciseness and performance, particularly the latter: under most circumstances a tacit verb will execute faster than an explicit equivalent. This is especially true when the verb is applied with small rank.
While trains may be used casually in explicit style programming, they are central to tacit programming, and many advanced verb definitions are tacit. We begin with some generalities, and then describe some important examples in detail.
Trains
Tacit definitions are parsed in exactly the same way as the rest of J, but there are unexpected consequences.
Certain sequences of parts of speech are identified as trains (hooks and forks), and interpreted in a special way (see Section II.F of the Dictionary). They must be "isolated" in that they are either parenthesized or occur in a tacit verb definition: you do not get this behavior when you apply a sequence of verbs to noun arguments.
We will deal here only with trains of verbs: the hook (g h), fork (f g h), and capped fork ([: g h). A longer train is resolved into forks from the right with possibly a final leftmost hook. Thus the verb train (a b c d e) means a b (c d e) and (a b c d e f) means a (b c (d e f)).
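The resolution of longer trains can be checked by comparing a train with its explicitly parenthesized form. The sketch below uses a five-verb train and a six-verb train built from # (tally), , (append), +/ (sum), >./ (maximum) and - (minus); the choice of verbs is illustrative only.

```j
   (# , +/ , >./) 3 1 2            NB. five verbs: a fork whose right tine is a fork
3 6 3
   (# , (+/ , >./)) 3 1 2
3 6 3
   (- # , +/ , >./) 3 1 2          NB. six verbs: a hook whose right verb is the fork above
0 _5 _1
   (- (# , (+/ , >./))) 3 1 2
0 _5 _1
```

Note that +/ and >./ each count as one verb in the train, because adverbs are applied before the train is formed.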
When we talk about a verb being monadic or dyadic, we are talking about its intended use: any verb will be applied according to the number of arguments supplied.
Monadic hook: (g h) y means y g h y. Note that g is dyadic and h is monadic.
Example: odometer (monadic hook)
   odometer=: #: i.@(*/)   NB. equivalent to 3 : 'y #: (i. */ y)'
   odometer 2 3
0 0
0 1
0 2
1 0
1 1
1 2
You can get the boxed display
   5!:2 <'odometer'
+--+------------+
|#:|+--+-+-----+|
|  ||i.|@|+-+-+||
|  ||  | ||*|/|||
|  ||  | |+-+-+||
|  |+--+-+-----+|
+--+------------+
This is a hook g h, where g is #: and h is i.@(*/) . Note the parenthesization of the right argument to the conjunction @, which is frequently required, because of the "long left scope" rule. (If not, i.@*/ is parsed as (i.@*)/ . This is an easy mistake to make if you forget that frequently used verb-adverb combinations such as +/ are not primitives). The sentence
odometer 2 3
invokes the hook (g h) monadically. In this case h 2 3 is just i.6. The original argument 2 3 is reused as the left argument to g to give
2 3 #: (i.6)
which when executed yields the answer above.
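The intermediate values in this walk-through can be reproduced one sentence at a time. The following sketch uses only the primitives already appearing in odometer.

```j
   */ 2 3              NB. number of rows in the result
6
   i.@(*/) 2 3         NB. h y: the integers to be represented
0 1 2 3 4 5
   2 3 #: 4            NB. y g (one atom of h y): 4 in mixed base 2 3
1 1
```

The last result is the row of the odometer table at index 4, since 4 is 1 times 3 plus 1.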
Example: isint (monadic hook)
   isint=:=<.   NB. is y an integer? Equivalent to 3 : 'y=<.y'
   isint 1.5 1 3j2
0 1 1
Dyadic hook: x (g h) y means x g h y. Once again, g is dyadic and h is monadic. Whether the hook is dyadic or monadic does not affect the valence of g or h, only whether x or y is used as the left argument to g.
Example: reshape (dyadic hook)
   NB. Reshape ravelled atoms of y by shape x
   NB. Equivalent to 4 : 'x $ (,y)'
   reshape=:$,
   3 1 reshape 3 $ i.3
0
1
2
   2 3 reshape (|: 3 2 $ i.6)
0 2 4
1 3 5
Monadic fork: (f g h)y means (f y) g (h y). f and h are applied monadically to the fork argument, while g is applied dyadically to their results.
Example: mean (monadic fork)
   +/%# 1 2 3   NB. not a fork: equivalent to +/(% (# 1 2 3))
0.333333
   mean=:+/%#   NB. monadic fork
   NB. equivalent to 3 : '(+/ y) % (# y)'
   mean 1 2 3
2
The adverb / is executed first, so that mean is equivalent to (+/)%#, and is a fork (f g h) . The sentence
mean 1 2 3
executes the fork monadically, so it is equivalent to (+/ 1 2 3)%(# 1 2 3) .
Dyadic fork: x(f g h)y means (x f y) g (x h y). g is applied dyadically, and f and h are applied dyadically to the original fork arguments.
Example: membership in open interval (dyadic fork)
   NB. x inopen (a,b) tests if x is in the open interval (a,b),
   NB. equivalent to 4 : '(x>{.y)*.(x<{:y)'
   inopen=:([>{.@:])*.([<{:@:])
   1 inopen 2 3
0
   2 inopen 1 3
1
This is a fork (f g h), and f and h are each forks themselves. Because @: is a conjunction, [>{.@:] means [>({.@:]) . Note the use of [ and ] to select left and right arguments. They are not quite the same as x and y in explicit definitions: in particular ] is a verb while y is a noun.
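The behavior of [ and ] as verbs, and the agreement between the tacit definition and its explicit equivalent, can be checked directly. In this sketch the name inopenx is a hypothetical name introduced for the explicit version; it does not appear in the original.

```j
   3 [ 5                                NB. [ is a verb: it returns its left argument
3
   3 ] 5                                NB. ] returns its right argument
5
   inopen=: ([>{.@:])*.([<{:@:])
   inopenx=: 4 : '(x>{.y)*.(x<{:y)'     NB. hypothetical name for the explicit form
   (2 inopen 1 3) , 2 inopenx 1 3
1 1
   (1 inopen 2 3) , 1 inopenx 2 3
0 0
```

Because [ and ] are verbs, they can occupy a tine of a fork, which is what lets the tacit definition refer to its arguments without naming them.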