J for the APL Programmer

Though J shares many concepts with APL, in many respects it is radically different, and almost all APL constructions that are in J differ in some way. These differences can be a stumbling block to the newcomer who thinks that J is simply an ASCII version of APL, prompting questions such as:

  • How do I save my workspace?
  • Why do J functions work along the columns and not the rows?
  • Can I have a version of J with ⎕IO←1 ?
  • Can I have a version of J with real APL characters?

In this article we will compare and contrast the two languages. The term APL is used generically to refer to facilities found in most commercial implementations of APL.


Overview

Why is J so different from APL? Would it not have been easier just to produce an ASCII version of APL, and otherwise preserve compatibility between APL and J? To answer this, you should understand the main benefits of the changes:

  • The syntax of J is at once simpler, more powerful and more consistent than APL.
  • Anomalous constructions in APL such as bracket indexing are removed.
  • There are many new constructions, such as function rank, that could not be added to APL without major changes to the language.
  • The programming environment for J is like those of other languages and unlike that for APL, permitting the use of standard editors and code management systems.

Terminology and Spelling

J uses terms from English grammar, thus a function such as addition is a verb (because it performs an action), and an entity that modifies a verb is called an adverb. Many APL primitives that are also in J have been renamed to better describe their behavior in J. Thus APL reduce becomes insert, APL scan becomes prefix, and the new reverse scan is suffix.

J terminology should be used whenever clarity of expression is important. Other terms commonly used in programming such as data, variable, function, operator, can then be used in specific contexts, e.g. "functions" in Calculus.

J uses the 7-bit ASCII alphabet, which is a subset of UTF-8 (i.e. byte values 0 through 127 of UTF-8). J scripts are assumed to be in UTF-8 format, and so may contain arbitrary Unicode character strings. J also has a 2-byte Unicode data type, and verbs to convert between this and UTF-8. Using ASCII/UTF-8 avoids the many problems associated with using APL symbols. It allows J to be used on a variety of machines without special hardware or software, and permits easy communication between J and other systems.

Language primitives typically consist of a single ASCII symbol, optionally followed by either a dot or colon, e.g. the symbols # , #. and #: are all valid J primitives. The J spelling scheme is quite mnemonic; the vocabulary is a careful piece of design and bears study. Consider:

 + conjugate • plus       +. real/imag • GCD (or)           +: double • nor
 * signum • times         *. length/angle • LCM (and)       *: square • nand
 - negate • minus         -. not • less                     -: halve • match
 % reciprocal • divide    %. matrix inverse • matrix div    %: square root • root

The choice of +. for "or" and *. for "and" reflects the analogy of "or" with logical addition and "and" with logical multiplication. +. is extended to the entire numeric domain as the GCD function, and likewise *. as LCM. The monad -. is logical negation ("not"), and is extended to the entire numeric domain as the function 1&- (probability complement). The dyad -. is set difference.
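
As a brief illustration, the following session sketch applies these primitives first to Boolean and then to general numeric arguments. The argument values are arbitrary and chosen only for the example:

   0 0 1 1 +. 0 1 0 1              NB. or
0 1 1 1
   0 0 1 1 *. 0 1 0 1              NB. and
0 0 0 1
   12 +. 18                        NB. GCD
6
   12 *. 18                        NB. LCM
36
   -. 0 1 0.25                     NB. not, extended to 1-y
1 0 0.75
   1 2 3 4 5 -. 2 4                NB. less (set difference)
1 3 5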

There is no 1-1 mapping between J and APL symbols:

  • Some functions are assigned the same symbols. For example, + for conjugate and plus, - for negate and minus, | for magnitude and residue, ? for roll and deal.
  • Some functions are assigned symbols having different meanings in APL. For example, ^ for exponential and power.
  • Some functions are assigned new symbols. For example, % for reciprocal and divide, o. for pi times and the trigonometric, hyperbolic, and other related functions, ^. for natural log and log.
  • Meanings are assigned to some uninterpreted APL cases. For example, <: for decrement and less than or equal, >: for increment and larger or equal, {. for head and take, f/ for insertion and outer product. In particular, the monads of the symbols #. , binary value and base value, and #: , binary representation and representation, resurrect meanings implemented in the original APL but later removed for lack of space.
  • Some monad/dyad pairs have different partners. For example, /: for grade up and sort.

Language

The J language is essentially a superset of APL. All the standard APL functions are included in one form or another. Extensions include:

  • Functions and operators that enhance the possibilities for tacit definition (operator expressions). For example, constant functions 0: 1: 2: a"_ ; composition operators @ @: & &: ; "trivial" functions like [ and ] (left and right argument), >: <: (increment and decrement), +: -: *: %: (double, halve, square, square root).
  • Operators that treat their function arguments symbolically, providing a calculus of functions and operators. For example, the power operator ^: ( f^:_1 is the inverse of f ), f&.g (f under g , g^:_1@f&g ), the hypergeometric operator H. , and f/x when x is empty.
  • The systematic use of function rank. A function typically operates on the leading axis of its arguments, and operation along other axes is easily effected using the rank operator.
  • Certain sequences (trains) of two or three words are assigned meanings (hooks and forks). These are compatible changes as they replace what would be errors in APL; put another way, one could add hooks and forks to APL.
  • Several families of new functions and operators. For example:
    • Calculus:  D. D: t. t: T. (derivatives and integrals, secant slope, Taylor's series)
    • Complex numbers:  +. *. j. r. (rectangular and polar, etc.)
    • Linear algebra:  p. -/ .* (polynomial roots and polynomial evaluation, determinant)
    • Combinatorics:  A. C. { +/ .* (atomic and cycle representations of permutations, catalog, permanent)
    • Numbers:  p: q: x: (primes and prime factors, extended integers)
    • Box arrays:  ; L. L: (raze/link, level)
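
A few of these extensions can be sketched in a short session. The names and argument values below are illustrative only:

   mean=. +/ % #                   NB. fork: (+/ y) % (# y)
   mean 1 2 3 4
2.5
   (+ %) 4                         NB. hook: 4 + % 4
4.25
   +:^:_1 ] 10                     NB. inverse of double
5
   +/&.:*: 3 4                     NB. sum under square: square, sum, then square root
5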

Arrays

J numeric and character arrays are the same as in APL. J also supports complex numbers, extended (arbitrary-precision) integers and rationals, sparse arrays, symbols and unicode.

Boxed arrays are the same as in SHARP APL. They are similar to the enclosed arrays in other APL systems, except that the box of a scalar is not the same as the scalar.

Boxed arrays must be explicitly created as such; J does not have strand notation. The following example uses the verb ; link to link items together into a boxed array. Note that J boxes output automatically:

   1 2 3;10 20                     J
┌─────┬─────┐
│1 2 3│10 20│
└─────┴─────┘

   (1 2 3) (10 20)                 APL
 1 2 3  10 20

Infinity, negative infinity, and indeterminate ( _ __ _. ) have been added. Numeric constants can be specified in several more ways, akin to 4e2 for 400. Thus 3j4 is the number with real part 3 and imaginary part 4, 7ad90 is the number with magnitude 7 and angle 90 degrees, 1.5ar_1 is the number with magnitude 1.5 and angle minus one radian, 1r3 is the fraction one-third, etc.
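
The following sketch shows some of these constants in use, with values chosen only for illustration:

   3j4 * 0j1                       NB. complex multiplication
_4j3
   | 3j4                           NB. magnitude
5
   1r3 + 1r6                       NB. exact rational arithmetic
1r2
   2 ^ 100x                        NB. extended-precision integer
1267650600228229401496703205376
   _ > 1e300                       NB. infinity exceeds any finite number
1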

J does not support heterogeneous arrays.

Verbs

Verbs correspond to APL's functions.

J makes no distinction between primitive verbs and user-defined verbs, unlike APL, where some facilities (such as the axis operator) are only available for primitive functions.

There are no niladic verbs. Consequently, a verb named without an argument can be assigned to another name. Function assignment is also available in APLW, but there it works only for functions that are not niladic. The J equivalent of an APL niladic function is a monadic verb that ignores its argument, which nevertheless must be given.

If you enter the name of a verb alone, its definition is displayed (this is true of any J object).

All verbs are ambivalent, though the monadic or dyadic domain may be empty. If you invoke a verb with an empty domain, the result is domain error and not SYNTAX ERROR.

The left and right arguments of an explicitly defined verb are x and y respectively.

All verbs return a result, which is the result of the last statement executed (other than test statements in control structures). Statements in defined verbs that are not assignments do not display their results in the session.

There is no equivalent to APL's list of local names in the function header. Indeed, J definitions do not have a header, since there is no need for one.
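
A minimal sketch of these points follows. The names hyp and greet are hypothetical and are used only for this example:

   hyp=: 4 : '%: (*: x) + *: y'    NB. dyadic definition; x and y are the arguments
   3 hyp 4
5
   greet=: 3 : '''hello'''         NB. stands in for a niladic function; y is ignored
   greet ''
hello

Since hyp is defined only for the dyadic case, its monadic domain is empty, and a call such as hyp 4 signals a domain error.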

Adverbs and Conjunctions

Adverbs and conjunctions correspond to APL's monadic and dyadic operators. The separate names for the two classes of operators reflect the importance of operators in J.

The result of an adverb is typically a verb; the result of a conjunction given one argument is typically an adverb, and given both arguments is typically a verb. However, adverbs and conjunctions may return any type of object.

Since a verb can have one or two arguments, and an operator argument can be a noun or a verb, then an adverb can have up to 4 cases, and a conjunction up to 8 cases:

   n adv y         m conj n y         u conj n y
 x n adv y       x m conj n y       x u conj n y
   v adv y         m conj v y         u conj v y
 x v adv y       x m conj v y       x u conj v y

All these possibilities are in APL as well as J, but are exploited more systematically and more extensively in J.
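
User-defined adverbs and conjunctions can be sketched as follows. The names twice, then and times are hypothetical. In an explicit definition, u and v denote verb arguments and m denotes a noun argument:

   twice=: 1 : 'u@:u'              NB. adverb with a verb argument
   *: twice 3                      NB. square, applied twice
81
   times=: 1 : 'm&*'               NB. adverb with a noun argument
   3 times 4
12
   then=: 2 : 'v@:u'               NB. conjunction: apply u first, then v
   +: then *: 3                    NB. double, then square
36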

Execution Control

Execution control is provided by control words, if. else. while. etc. There is no equivalent to APL's computed goto → .

J control words differ in two major respects from those added to APL, in that they are stream and block oriented: control words can occur in any part of a line of code, and they group the code into blocks. For example, compare:

 :if statement-is-true             APL
        ...
 :else
        ...
 :end

 if. block-is-true do. ...  else. ... end.           J

where the APL statement is a single statement which must return 0 or 1, whereas the J block can be any number of J sentences (which may themselves contain control words). A J block is true if the first element of the result is non-zero.
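
As a small sketch, the verbs below (hypothetical names parity and sumto) use control words both within a single line and across several lines:

   parity=: 3 : 'if. 2 | y do. ''odd'' else. ''even'' end.'
   parity 7
odd
   parity 10
even

   sumto=: 3 : 0
 s=. 0
 while. y > 0 do.
   s=. s + y
   y=. y - 1
 end.
 s
)
   sumto 4
10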

The power conjunction ^: provides another form of execution control. In the phrase v^:n y , the verb v is applied n times to the array y . Here, n may be an array of powers; infinite powers specify the limit of application of v . For example:

   newton=. -:@(+ 2e4&%)

   newton newton newton 1
2502.62

   newton ^:3 [ 1
2502.62

   newton ^:(i.3 5) [ 1
      1 10000.5 5001.25 2502.62 1255.31
 635.62 333.543 196.753 149.202 141.624
141.422 141.421 141.421 141.421 141.421

   newton ^:_ [ 1
141.421

The agenda conjunction @. provides for if-then-else and case statements.
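
For instance, the following sketch uses agenda to choose between two verbs according to a test. The name collatz is illustrative:

   collatz=: -:`(1 + 3&*)@.(2&|)   NB. halve if even, otherwise 1 + 3 * y
   collatz 6
3
   collatz 7
22

The result of 2&| (0 or 1) selects the verb to apply from the gerund on the left.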

Diamond Separator

There is no equivalent of APL's ⋄ separator used to enter several statements on a single line. However, you can obtain similar behavior using the dyad [ (left), which returns its left argument. Statements joined this way are evaluated right to left.
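
A short sketch, with arbitrary names a and b:

   a=. 3 [ b=. 4                   NB. b is assigned first, then a
   a , b
3 4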

Index Origin

Index origin is 0 throughout. There is no equivalent to APL's ⎕IO . (APL expressions used here assume  ⎕IO←0 .)

The main reason for the change is to avoid the ambiguity of ⎕IO , a feature more defended against than used. The choice of origin 0 is that used in most other languages, with the benefit that in general, code written in origin 0 is simpler than in origin 1.

Indexing

The anomalous use of brackets for indexing in APL is replaced with from { and amend } for indexing and indexed replacement.

Since this change represents a common difficulty for APL'ers, it is worth examining in detail. To see the problem with APL bracket indexing, consider the expression to select row 2 of a matrix:

   M[2;]

Indexing is a function of two arguments (data and indices), yet this APL expression does not have the form of other APL dyadic functions (i.e. a primitive with arguments to left and right). Moreover you cannot use the APL expression as a function. For example if A is an enclosed array, each item of which is a matrix, you could not use bracket indexing to select row 2 of each item:

   A ¨ [2;]
SYNTAX ERROR

To accomplish this, you could define a cover function to index the matrix:

   ∇ r←ndx rowndx mat
[1]  r←mat[ndx;]
   ∇

You could then select row 2 of M as:

   2 rowndx M

and row 2 of each element of A as:

   2 rowndx¨ A

The corresponding J expressions are:

   2 { M
   2 {each A

It may be seen that the verb { is simply a functional form of APL bracket indexing.

From and amend provide all the facilities of APL bracket indexing and indexed replacement, and more.
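
Some further uses of from are sketched below, on an arbitrary matrix:

   M=. i.3 4
   1 { M                           NB. row 1
4 5 6 7
   _1 { M                          NB. negative indices count from the end
8 9 10 11
   (<1 2) { M                      NB. row 1, column 2
6
   2 {"1 M                         NB. column 2, using rank
2 6 10
   2 {each (i.3 4);'abcdef'        NB. item 2 of each box
┌─────────┬─┐
│8 9 10 11│c│
└─────────┴─┘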

Assignment

The assignment symbols are =. for local assignment and =: for global assignment.

All definitions in J are created by assignment—there is no edit mode, as in APL. The result of an expression that defines a verb is the verb itself, not the name of the verb; similarly for adverbs and conjunctions. Thus a verb is the result of some expression, not a side-effect as in APL, and this is very useful in functional programming.

The left hand side of an assignment must be a name or list of names; there is no strand assignment or indexed assignment or selective assignment.

Strand assignment can be effected by specifying a list of names. For example:

   'first second third'=. 1 2 3;(i.3 4);'abc'

defines first as 1 2 3 , second as i.3 4 and third as 'abc' .

Indexed or selective assignment can be effected by amend, which has no side-effects and is syntactically an adverb like any other. Indexed assignment A[i]←B can be effected by A=. B i}A . Selective assignment (F A)←B can be effected by A=. B (F A)}A . In the latter phrase, F can be any verb, primitive, derived or user-defined, that produces valid indices; there is no distinction of a sub-class of "selection functions".
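
A brief sketch of amend, using an arbitrary list A:

   A=. 10 20 30 40 50
   A=. 99 (1)} A                   NB. corresponds to A[1]←99
   A
10 99 30 40 50
   A=. 0 (I. A > 45)} A            NB. replace the elements selected by a test
   A
10 0 30 40 0

The parentheses around the index keep it from being read as part of the list 99 1 .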

Local and global assignments differ only inside an explicit definition, otherwise they are the same. Unlike APL, local names are strictly local. If u and v are explicit definitions, and if x is local to u , then x is not visible to v . This is much cleaner than APL's dynamic scoping, where a local name is also visible to all subroutines. If you wanted to make a set of names visible to some functions without their names cluttering up the workspace, then you could store them in a locale.
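
The scoping rule can be sketched as follows. The names t, u and v are chosen to match the discussion above:

   t=: 1                           NB. global
   v=: 3 : 't + y'
   u=: 3 : 0
 t=. 100                         NB. local to u only
 v y
)
   u 5                             NB. v sees the global t, not the t local to u
6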

Tacit vs. Explicit

It is helpful to distinguish tacit and explicit definitions—the former do not refer to their arguments, while the latter do so. For example:

   sum=. +/                        NB. tacit
   sum=. 3 : '+/y'                NB. explicit (y is the right argument)

There are no hard and fast rules as to whether to use tacit or explicit definition. In general, for simpler expressions, tacit definitions are shorter and easier to read, as in the example above; while the explicit form is most often used for multi-line definitions and when using control words.

Tacit and explicit definitions can be used within one another, for example, as in the use of the tacit +/ in the explicit sum above.
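
As a further sketch, here is the same computation written both ways. The names range and range2 are hypothetical:

   range=. >./ - <./               NB. tacit (a fork)
   range 3 1 4 1 5
4
   range2=. 3 : '(>./ y) - <./ y'  NB. explicit equivalent
   range2 3 1 4 1 5
4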

Items, Axes, Rank

In general, non-scalar functions in APL apply either to an array as a whole, or to the last axis. You can modify this behaviour to some extent with the axis operator, though the axis operator is not always applicable (for example, it is not permitted with user-defined functions), and even where applicable cannot specify left and right axes separately.

In contrast, non-scalar verbs in J typically apply to the array as a whole, or to the leading axis. You can modify this behaviour using the rank conjunction " , which applies to all verbs, and which can specify left and right ranks separately. Rank is an extension and generalization of scalar extension.

This change is in line with J's emphasis on items, which for an array of rank n, are the subarrays formed from the last n-1 axes, for example, the rows of a matrix. Some J verbs apply specifically to items:

   [M=. i.3 4
0 1  2  3
4 5  6  7
8 9 10 11

   #M                              NB. # counts number of items (here, the rows)
3

   2$M                             NB. $ reshapes the items
0 1 2 3
4 5 6 7

Dyadic index x i. y and member of x e. y look for items instead of scalars, and are much more useful for that. For example, when x and y are character matrices, x i. y produces the row indices of y in x , and y e. x indicates which rows of y are in x .

The actual behaviour of a verb is specified by the verb's rank. Each verb is assigned a rank, and a rank may be otherwise specified by the rank conjunction. Rank determines the behavior of a verb on certain subarrays of its arguments, which for a rank-k verb are referred to as the k-cells. These k-cells are the subarrays formed from the last k axes. The rest of the shape vector is called the frame.
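
The effect of rank on cells and frames can be sketched with a few expressions on arbitrary arrays:

   M=. i.3 4
   +/ M                            NB. default: insert + between the items (rows)
12 15 18 21
   +/"1 M                          NB. rank 1: sum each row
6 22 38
   $ <"1 i.2 3 4                   NB. boxing the 1-cells leaves the frame 2 3
2 3
   $ <"2 i.2 3 4                   NB. boxing the 2-cells leaves the frame 2
2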

The shape of a result is the frame catenated with the shape produced by applying the verb to the individual cells. Usually these individual shapes agree, but if not, they are first brought to a common rank by introducing leading unit axes to any of lower rank, and are then brought to a common shape by padding with an appropriate fill element. This allows J to assemble results which in APL would signal rank or length errors:

   ⊃ (1 2 3) (2 2⍴10 11 12 13)     APL
RANK ERROR

   >1 2 3;2 2$10 11 12 13          J
 1  2 3
 0  0 0

10 11 0
12 13 0

J provides the verb ,: for lamination.
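
For comparison with append, a short sketch:

   1 2 3 , 4 5 6                   NB. append
1 2 3 4 5 6
   1 2 3 ,: 4 5 6                  NB. laminate
1 2 3
4 5 6
   $ 1 2 3 ,: 4 5 6
2 3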

Agreement

In APL, two arguments agree if both shapes are the same, or if one argument is a scalar (known as scalar extension). APL also permits the anomalous singleton extension.

J generalizes scalar extension: two arguments agree if both shapes are the same, or if the frame of one is a prefix of the frame of the other (prefix agreement). Thus:

   100 200 + i.2 3
100 101 102
203 204 205

The combination of J's more general agreement rules, permissive assembly, and the rank conjunction permits many more calculations to be expressed in J without looping or reshaping.
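
Where the shapes do not agree at the default rank, the rank conjunction can specify how the arguments are to be paired. A sketch with arbitrary values:

   10 20 30 +"1 i.2 3              NB. add the list to each row
10 21 32
13 24 35
   100 200 +"0 1 ] 1 2 3           NB. each left atom with the whole right list
101 102 103
201 202 203

Without the rank conjunction, 10 20 30 + i.2 3 signals a length error, since the frame 3 is not a prefix of the frame 2 3 .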

Rank vs. Each

J's rank conjunction " and APL's each operator ¨ are notationally similar, and are sometimes confused. To understand the difference, note that the each operator is used for two essentially different expressions, for which J has two quite different solutions.

APL each can be used to apply a function to each element of an array, typically an enclosed array. The corresponding expression in J is also called each , which is a standard utility defined as &.> . In such cases, the following are the same:

   f ¨ A                           APL

   f each A                        J

APL each can also be used in the following situation. A function is to be applied to some data but does not work on it directly; instead, three steps are required: the data is split up into enclosed items, the function is applied to each item in turn, and the results are then reassembled. For example, a familiar idiom for catenating a vector to each row of a matrix in APL is:

   ⊃ (⊂vec) ,¨ ⊂[1]mat

J's rank conjunction handles such cases directly. The corresponding expression is:

    vec ,"1 mat

The J expression is not only more concise, it is more efficient, since many verbs have rank support built-in.
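
Both uses can be sketched with concrete, arbitrary values:

   mat=. i.2 3
   9 9 ,"1 mat                     NB. rank: catenate the list to each row
9 9 0 1 2
9 9 3 4 5
   # each 'ab';'cde';''            NB. each: count the contents of each box
┌─┬─┬─┐
│2│3│0│
└─┴─┴─┘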

Implicit Reduction

Some APL expressions apply reduction implicitly. For example, APL scan applies its left argument reduction to successive prefixes of the right argument, while inner product applies the left argument reduction to the result of the right argument. The use of reduction ensured that computations such as sum scan can be effected by primitive function arguments to the operator, and that the overall result could be properly assembled in APL\360.

With J's more permissive assembly rules (and the use of boxed arrays), reduction is no longer necessary, and in many cases, inappropriate. Therefore in J, if a reduction is required, it must be specified. Thus +/\ in J computes sum scan and +/\. computes suffix sum scan. The following J example does not use reduction, and is therefore not readily expressed using APL scan:

   <\'abcdef'
┌─┬──┬───┬────┬─────┬──────┐
│a│ab│abc│abcd│abcde│abcdef│
└─┴──┴───┴────┴─────┴──────┘
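
A few more prefix and suffix expressions, with and without an explicit reduction, are sketched here:

   +/\ 1 2 3 4                     NB. sum scan
1 3 6 10
   +/\. 1 2 3 4                    NB. suffix sum scan
10 9 7 4
   #\ 'abcd'                       NB. no reduction: the length of each prefix
1 2 3 4
   <\. 'abc'                       NB. box each suffix
┌───┬──┬─┐
│abc│bc│c│
└───┴──┴─┘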

Pervasion vs Level

J does not support automatic pervasive application of a verb to boxed arrays, and has no subclass of "pervasive functions". Instead the level conjunction L: provides for controlled application of a verb at any level. The phrases  v L: d y and  x v L: d y specify that v should be applied to level(s) d of the arguments x and y . Positive levels count up from the bottom; negative ones count down from the top.
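
A minimal sketch of level, using arbitrary boxed arrays:

   L. 1;2;<3;4                     NB. depth of boxing
2
   |. L: 0 ('abc';'de')            NB. reverse at level 0 (the leaves)
┌───┬──┐
│cba│ed│
└───┴──┘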

Programming Environment

Workspaces and Scripts

J has a completely different way of storing programs and data than APL. Instead of the workspace, J uses script files—which are ordinary text files that can be accessed with any editor. Running an application consists of reading in the required script files and running the main function.

There is no direct equivalent to the APL workspace that can be saved and loaded. You can think of the active session much as you think of the active workspace in APL, but there are no built-in commands such as )save to save the active workspace, or )load to clear out the active workspace and replace it with another. J sessions are allocated dynamically and can be as large as the virtual space on the machine.

The J session consists of windows that represent execution and script sessions. You use the execution sessions for experimentation, and the script sessions for development. An execution session is much like the APL session—you can enter J expressions and have them evaluated immediately. A script session is where you develop code, and is analogous to the APL editor. Both types of session represent ordinary text files; in particular, script sessions represent script files.

It is important to understand that script files are saved as you work on them, each time they are loaded. Thus, work in progress is always being saved, and therefore there is no need to explicitly save your work as in APL.

Why change from workspaces to scripts? It brings J into line with other programming languages, which also use ordinary text files for program development. The focus on script files also eliminates the many problems associated with workspaces, such as the inability to browse through workspaces or to update one workspace with values from another.

Locales

The flat APL workspace is replaced by locales which allow you to partition the names in use. Thus, name conflicts are avoided when loading two applications at the same time, and commonly-used utility functions can be made available without their names cluttering up listings of work in progress. The idea is similar to the namespaces of Dyalog APL, though the implementation differs in many respects. The base (or no-name) locale is the default locale. The z locale is the parent locale. Definitions stored in either of these locales can be referenced directly, without specifying the locale name. Utility functions are typically stored in the z locale.
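
A small sketch of names in locales follows. The locale names app1 and app2, and the names total and double, are hypothetical:

   total_app1_=: 10                NB. the name total in locale app1
   total_app2_=: 20                NB. a different total, in locale app2
   total_app1_ + total_app2_
30
   double_z_=: +:                  NB. defined in z, so visible from other locales
   double 4
8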

External Interfaces

System Commands, Functions and Variables

The functionality of APL's System Functions, System Variables and System Commands is provided by the foreign conjunction !: . This conjunction is defined for a variety of numeric arguments, grouped by type. For example, 1!:1 reads a file, and 1!:2 writes to a file.

The foreign conjunction builds ordinary J verbs; !: is a conjunction syntactically like any other, and there is nothing special about the syntax of the resultant verbs, unlike the special syntax used in APL for System Commands, and for some System Functions. You can assign a name to any J verb derived from the foreign conjunction, but you could not, for example, provide a cover function for ⎕FMT , because of its special syntax.

Names and cover functions for most foreign conjunctions are provided by the J utility scripts. For example:

   nc 'each';'sort'                NB. name class
1 3

   datatype 1j1                    NB. datatype
complex

   fread '\config.sys';0 30        NB. read first 30 bytes from file
device=c:\windows\setver.exe
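
You can also define such cover names yourself, since the derived verbs are ordinary verbs. In the sketch below the names read, write and nameclass, and the file name temp.txt, are illustrative only:

   read=: 1!:1
   write=: 1!:2
   'hello' write <'temp.txt'
   read <'temp.txt'
hello
   nameclass=: 4!:0
   nameclass <'read'
3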

Files

J provides native and component file functions much like APL*PLUS III and APLW. Unlike APL, you can use filenames directly, without having to worry about tie numbers. Also, you can read and write several components at a time. For example, read 50 components starting from component 100, from file "mydata":

   # jread 'mydata';100+i.50
50

Access to Other Facilities

APL implementors have traditionally tried to provide a complete application development environment within the language. Such an approach was quite feasible and necessary in the early days of APL, but is no longer practical or desirable. However, the tradition dies hard; for example, even now APL takes the "complete package" approach, yet provides no easy way for an external program to call an APL subroutine. In contrast, the development in J has focused on the core language, without attempting to duplicate facilities that are better provided elsewhere. Consequently, the J language has developed more rapidly than APL, while access to other facilities is easier.

References

Discussions on the differences between APL and J (and the process that led from APL to J) can be found in the following publications by K.E. Iverson:

Appendix D, APL2 versus a Comparable Subset.

1987-09. See especially Section III, Dialects.

IBM Systems Journal, Volume 30, Number 4, 1991-12.



Contributed by Chris Burke and Roger Hui. Substantially the same text previously appeared in APL Quote-Quad, Volume 27, Number 1, September 1996.